Commun. Math. Phys. 225, 1 – 32 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
On Convergence to Equilibrium Distribution, I. The Klein–Gordon Equation with Mixing

T. V. Dudnikova^1, A. I. Komech^2, E. A. Kopylova^3, Yu. M. Suhov^4

1 Mathematics Department, Elektrostal Polytechnical Institute, Elektrostal, 144000 Russia. E-mail: [email protected]
2 Mechanics and Mathematics Department, Moscow State University, Moscow, 119899 Russia. E-mail: [email protected]
3 Physics and Applied Mathematics Department, Vladimir State University, Vladimir, Russia. E-mail: [email protected]
4 Statistical Laboratory, Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, Cambridge, UK. E-mail: [email protected]

Received: 4 January 2001 / Accepted: 2 July 2001
Dedicated to M. I. Vishik on the occasion of his 80th anniversary

Supported partly by research grants of DFG (436 RUS 113/615/0-1) and RFBR (01-01-04002).
Supported partly by the Institute of Physics and Mathematics of Michoacan in Morelia, the Max-Planck Institute for the Mathematics in Sciences (Leipzig) and by research grant of DFG (436 RUS 113/615/0-1).
Supported partly by research grant of RFBR (01-01-04002).

Abstract: Consider the Klein–Gordon equation (KGE) in $\mathbb{R}^n$, $n \ge 2$, with constant or variable coefficients. We study the distribution $\mu_t$ of the random solution at time $t \in \mathbb{R}$. We assume that the initial probability measure $\mu_0$ has zero mean, a translation-invariant covariance, and a finite mean energy density. We also assume that $\mu_0$ satisfies a Rosenblatt- or Ibragimov–Linnik-type mixing condition. The main result is the convergence of $\mu_t$ to a Gaussian probability measure as $t \to \infty$, which gives a Central Limit Theorem for the KGE. The proof for the case of constant coefficients is based on an analysis of long-time asymptotics of the solution in the Fourier representation and Bernstein’s “room-corridor” argument. The case of variable coefficients is treated by using an “averaged” version of the scattering theory for infinite energy solutions, based on Vainberg’s results on local energy decay.

1. Introduction

The aim of this paper is to underline a special role of equilibrium distributions in statistical mechanics of systems governed by hyperbolic partial differential equations (for parabolic equations see [6, 27]). Important examples arise when one discusses the role of a canonical Gibbs distribution (CGD) in the Planck theory of spectral density of the black-body emission and in the Einstein–Debye quantum theory of solid state (see, e.g. [31]). [The word “canonical” is used in this paper to emphasize the fact that the probability distribution under consideration is formally related to the “Hamiltonian”, or the energy functional, of the corresponding equation by the Gibbs exponential formula. Owing to the linearity of our equations, there are plenty of other first integrals which lead to other stationary measures.] Historically, the emission law was established
at a heuristic level by Kirchhoff in 1859 (see [34]) and stated formally by Planck in 1900 (see [25]). The law concerns the correspondence between the temperature and the colour of an emitting body (e.g., burning carbon, or an incandescent wire in an electric bulb). Furthermore, it provides fundamental information on the interaction between the Maxwell field and “matter”. Planck’s formula specifies a “radiation intensity” $I_T(\omega)$ of the electromagnetic field at a fixed temperature $T > 0$, as a function of the frequency $\omega > 0$. It is convenient to treat $I_T(\cdot)$ as the spectral correlation function of a stationary random process. Then if $g_T$ denotes an equilibrium distribution of this process, the Kirchhoff–Planck law suggests the long-time convergence

$\mu_t \rightharpoonup g_T, \quad t \to \infty.$ (1.1)
Here $\mu_t$ is the distribution at time $t$ of a nonstationary random solution. The resulting equilibrium temperature $T$ is determined by an initial distribution $\mu_0$. Convergence to equilibrium (1.1) is also expected in a system of Maxwell’s equations coupled to an equation of evolution of “matter”. For example, both (1.1) and the Kirchhoff–Planck law should hold for the coupled Maxwell–Dirac equations [5], or for their second-quantised modifications. However, the rigorous proof here is still an open problem. Previously, the convergence of type (1.1) to a CGD $g_T$ has been established for an ideal gas with infinitely many particles by Sinai (see, e.g., [7]). Similar results were later obtained for other infinite-dimensional systems (see [2, 10] and a survey [9]). For nonlinear wave problems, the first such result has been established by Jaksic and Pillet in [18]: they consider a system of a classical particle coupled to a wave field in a smooth nonlocal fashion. For all these models, the CGD $g_T$ is well-defined, although the convergence is highly non-trivial. On the other hand, for the local coupling such as in the Maxwell–Dirac equations, the problem of “ultraviolet divergence” arises: the CGDs cannot be defined directly as the local energy is formally infinite almost surely. This is a serious technical difficulty that suggests that, to begin with, one should analyse convergence to non-canonical stationary measures $\mu_\infty$, with finite mean local energy:

$\mu_t \rightharpoonup \mu_\infty, \quad t \to \infty.$ (1.2)
In fact, most of the above-mentioned papers establish the convergence to both CGDs and non-canonical stationary measures, by using the same methods. In our situation, the aforementioned ultraviolet divergence makes the difference between (1.1) and (1.2). In this paper we prove convergence (1.2) for the Klein–Gordon equation (KGE) in $\mathbb{R}^n$, $n \ge 2$:

$\ddot u(x,t) = \sum_{j=1}^{n} (\partial_j - iA_j(x))^2 u(x,t) - m^2 u(x,t), \quad x \in \mathbb{R}^n; \qquad u|_{t=0} = u_0(x), \quad \dot u|_{t=0} = v_0(x).$ (1.3)

Here $\partial_j \equiv \partial/\partial x_j$, $x \in \mathbb{R}^n$, $t \in \mathbb{R}$, $m > 0$ is a fixed constant and $(A_1(x), \dots, A_n(x))$ is a vector potential of an external magnetic field; we assume that the functions $A_j(x)$ vanish outside a bounded domain. The solution $u(x,t)$ is considered as a complex-valued classical function. It is important to identify a natural property of the initial measure $\mu_0$ guaranteeing convergence (1.2). We follow an idea of Dobrushin and Suhov [10] and use a “space”-mixing condition of Rosenblatt- or Ibragimov–Linnik-type. Such a condition is natural from a physical point of view. It replaces a “quasiergodic hypothesis” and allows us to avoid introducing a “thermostat” with a prescribed time-behaviour. Similar conditions
have been used in [2, 3, 33, 32]. In this paper, mixing is defined and applied in the context of the KGE. Thus we prove convergence (1.2) for a class of initial measures $\mu_0$ on a classical function space, with a finite mean local energy and satisfying a mixing condition. The limiting measure $\mu_\infty$ is stationary and turns out to be a Gaussian probability measure (GPM). Hence, this result is a form of the Central Limit Theorem for the KGE. Another important question we discuss below is the relation of the limiting measure $\mu_\infty$ to the CGD $g_T$. The (formal) Klein–Gordon Hamiltonian is given by a quadratic form and so the CGDs $g_T$ are also GPMs, albeit generalised (i.e. living in generalised function spaces). As our limiting measures $\mu_\infty$ are “classical” GPMs, they do not include CGDs. However, in the case of constant coefficients, a CGD can be obtained as a limit of measures $\mu_\infty$ as the “correlation radius” $r$ figuring in the mixing conditions imposed on $\mu_0$ tends to zero. More precisely, we assume that for a fixed $T > 0$,

$\frac{1}{2} E\big[ v_0(x)v_0(y) + \nabla u_0(x)\cdot\nabla u_0(y) + m^2 u_0(x)u_0(y) \big] \to T\,\delta(x-y), \quad r \to 0,$ (1.4)

where $E$ denotes the expectation. Then the covariance functions (CFs) of the corresponding limit GPM $\mu_\infty$ converge to the covariance functions of the CGD $g_T$. In turn, this implies the convergence

$\mu_t \rightharpoonup \mu_\infty \sim g_T, \quad r \to 0.$ (1.5)
See Sect. 4. It should be noted that the existence of a “massive” (in a sense, infinite-dimensional) set of the limiting measures µ∞ that are different from CGD’s is related to the fact that KGE (1.3) is degenerate and admits infinitely many “additive” first integrals. Like the Klein–Gordon Hamiltonian, these integrals are quadratic forms; hence they generate GPMs via Gibbs exponential formulas. Convergence (1.2) has been obtained in [19–21] for translation-invariant initial measures µ0 . However, the original proofs were too long and used a specific apparatus of Bessel’s functions applicable exclusively in the case of the KGE. They have not been published in detail because of the lack of a unifying argument that could show the limits of the method and its forthcoming developments. To clarify the mechanism behind the results, one needed some new and robust ideas. The current work provides a modern approach applicable to a wide class of linear hyperbolic equations with a nondegenerate “dispersion relation”, see Eq. (7.20) below. We also weaken considerably the mixing condition on measure µ0 . Moreover, our approach yields much shorter proofs and is applicable to non-translation invariant initial measures. The last fact is important in relation to the two-temperature problem [3,12,33] and the hydrodynamic limit [8]. Such progress became possible in large part owing to the systematic use of a Fourier transform (FT) and a duality argument of Lemma 7.1. [The importance of the Fourier transform was demonstrated in earlier works [3, 32, 33].] Similar results, for the wave equation (WE) in Rn with odd n ≥ 3, are established in [11] which develops the results [26]. The KGE shares some common features with the WE (which is formally obtained by setting m = 0 in (1.3)), and the exposition in [20, 21] followed the structure of the earlier work [26]. On the other hand, the KGE and WE also have serious differences, see below. 
It is worth mentioning that possible extensions of our methods include, on the one hand, Dirac’s and other relativistic-invariant linear hyperbolic equations and on the other
hand harmonic lattices, as well as “coupled” systems of both types. We intend to return to these problems elsewhere. We now pass to a detailed description of the results. Formal definitions and statements are given in Sect. 2. Set $Y(t) = (Y^0(t), Y^1(t)) \equiv (u(\cdot,t), \dot u(\cdot,t))$, $Y_0 = (Y_0^0, Y_0^1) \equiv (u_0, v_0)$. Then (1.3) takes the form of an evolution equation

$\dot Y(t) = \mathcal{A}\,Y(t), \quad t \in \mathbb{R}; \qquad Y(0) = Y_0.$ (1.6)

Here,

$\mathcal{A} = \begin{pmatrix} 0 & 1 \\ A & 0 \end{pmatrix},$ (1.7)

where $A = \sum_{j=1}^{n} (\partial_j - iA_j(x))^2 - m^2$. We assume that the initial datum $Y_0$ is a random element of a complex functional space $\mathcal{H}$ corresponding to states with a finite local energy, see Definition 2.1 below. The distribution of $Y_0$ is a probability measure $\mu_0$ of mean zero satisfying some additional assumptions, see Conditions S1–S3 below. Given $t \in \mathbb{R}$, denote by $\mu_t$ the measure that gives the distribution of $Y(t)$, the random solution to (1.6). We study the asymptotics of $\mu_t$ as $t \to \pm\infty$. We identify $\mathbb{C} \equiv \mathbb{R}^2$ and denote by $\otimes$ the tensor product of real vectors. The CFs of the initial measure are supposed to be translation-invariant:

$Q_0^{ij}(x,y) := E\big( Y_0^i(x) \otimes Y_0^j(y) \big) = q_0^{ij}(x-y), \quad x, y \in \mathbb{R}^n, \quad i, j = 0, 1$ (1.8)
(in fact our methods require a weaker assumption, but to simplify the exposition, we will not discuss it here). We also assume that the initial mean energy density is finite:

$e_0 := E\big[ |v_0(x)|^2 + |\nabla u_0(x)|^2 + m^2 |u_0(x)|^2 \big] = q_0^{11}(0) - \Delta q_0^{00}(0) + m^2 q_0^{00}(0) < \infty, \quad x \in \mathbb{R}^n.$ (1.9)
Finally, we assume that measure $\mu_0$ satisfies a mixing condition of a Rosenblatt- or Ibragimov–Linnik type, which means that

$Y_0(x)$ and $Y_0(y)$ are asymptotically independent as $|x-y| \to \infty.$ (1.10)
As was said before, our main result gives the (weak) convergence (1.2) of $\mu_t$ to a limiting measure $\mu_\infty$ which is a stationary GPM on $\mathcal{H}$. A similar convergence holds for $t \to -\infty$. Explicit formulas are then given for the CFs of $\mu_\infty$. The strategy of the proof is as follows. First, we prove (1.2) for the equation with constant coefficients ($A_k(x) \equiv 0$), in three steps.

I. We check that the family of measures $\mu_t$, $t \ge 0$, is weakly compact.

II. We check that the CFs converge to a limit: for $i, j = 0, 1$,

$Q_t^{ij}(x,y) = \int Y^i(x) \otimes Y^j(y)\, \mu_t(dY) \to Q_\infty^{ij}(x,y), \quad t \to \infty.$ (1.11)
III. Finally, we check that the characteristic functionals converge to a Gaussian one:

$\hat\mu_t(\Psi) := \int \exp\{ i\langle Y, \Psi\rangle \}\, \mu_t(dY) \to \exp\{ -\tfrac{1}{2} Q_\infty(\Psi,\Psi) \}, \quad t \to \infty.$ (1.12)

Here $\Psi$ is an arbitrary element of the dual space and $Q_\infty$ the quadratic form with the integral kernel $(Q_\infty^{ij}(x,y))_{i,j=0,1}$; $\langle Y, \Psi\rangle$ denotes the scalar product in a real Hilbert space $L^2(\mathbb{R}^n) \otimes \mathbb{R}^n$. Property I follows from the Prokhorov Theorem by a method used in [37]. First, we prove a uniform bound for the mean local energy in $\mu_t$, using the conservation of mean energy density. The conditions of the Prokhorov Theorem are then checked by using Sobolev’s embedding Theorem in conjunction with Chebyshev’s inequality. Next, we deduce Property II from an analysis of oscillatory integrals arising in the FT. An important role is attributed to Proposition 6.1 reflecting the properties of the CFs in the FT deduced from the mixing condition. On the other hand, the FT approach alone is not sufficient for proving Property III even in the case of constant coefficients. The reason is that a function of infinite energy corresponds to a singular generalised function in the FT, and the exact interpretation of the mixing condition (1.10) for such generalised functions is unclear. We deduce Property III from a representation of the solution in terms of the initial data in coordinate space. This is a modification of the approach adopted in [19–21]. It allows us to combine the mixing condition with the fact that waves in the coordinate space disperse to infinity. This leads to a representation of the solution as a sum of weakly dependent random variables. Then (1.12) follows from a Central Limit Theorem (CLT) under a Lindeberg-type condition. Checking such a condition is an important part of the proof. It is useful to discuss the dispersive mechanism that is behind (1.12) and compare the KGE ($m > 0$) and WE ($m = 0$). Take, for simplicity, $n = 3$ and $u_0 \equiv 0$.
The solution to (1.3) (with $A_k(x) \equiv 0$) is given by

$u(x,t) = \int E(x-y,t)\, v_0(y)\, dy, \quad t > 0,$ (1.13)

where $E$ is the “retarded” fundamental solution

$E(x,t) = \frac{1}{4\pi t}\,\delta(|x|-t) - \frac{m\,\theta(t-|x|)}{4\pi}\, \frac{J_1(m\sqrt{t^2-x^2})}{\sqrt{t^2-x^2}},$ (1.14)

and $J_1$ is the Bessel function of the first order. For $m = 0$ the function $E(\cdot,t)$ is supported by the sphere $|x| = t$ of area $\sim t^2$, and (1.13) becomes the Kirchhoff formula

$u(x,t) = \frac{1}{4\pi t} \int_{|x-y|=t} v_0(y)\, dS(y),$ (1.15)

which manifests the dispersion of waves in the 3D space. Dividing the sphere $\{y \in \mathbb{R}^3 : |x-y| = t\}$ into $N \sim t^2$ “rooms” of a fixed width $d \gg 1$, we rewrite (1.15) as

$u(x,t) \sim \frac{\sum_{k=1}^{N} r_k}{\sqrt{N}},$ (1.16)
where $r_k$ are nearly independent owing to the mixing condition. Then (1.2) follows by the well-known Bernstein “room-corridor” arguments. For $m > 0$ the function $E(\cdot,t)$ is supported by the ball $|x| \le t$, which means the absence of a strong Huygens’ principle for the KGE. The volume of the ball is $\sim t^3$, hence rewriting (1.13) in the form (1.16) would need asymptotics of the type

$E(x,t) = O(t^{-3/2}), \quad |x| \le t$ (1.17)

as $t \to \infty$. As $J_1(r) \sim \cos(r - 3\pi/4)/\sqrt{r}$, asymptotics (1.17) only holds in the region $|x| \le vt$ with $v < 1$. For instance,

$E\big|_{|x|=vt} \sim \frac{\cos(m\gamma t - 3\pi/4)}{(\gamma t)^{3/2}},$

where $\gamma = \sqrt{1-v^2}$. However, the degree of the decay is different near the light cone $|x| = t$ corresponding to $v = 1$ and $\gamma = 0$. For example, for a fixed $r > 0$,

$E\big|_{|x|=t-r} \sim \frac{\cos(m\sqrt{2rt} - 3\pi/4)}{(2rt)^{3/4}} = O(t^{-3/4}),$ (1.18)

where $r = t - |x|$ is the “distance” from the light cone. This illustrates that an application of Bernstein’s method in the case of the KGE requires a new idea.

The key observation is that the asymptotics (1.18) displays oscillations $\sim \cos(m\sqrt{2rt})$ of $E$ near the light cone as $t \to \infty$. The solution becomes an oscillatory integral, and one is able to compensate the weak decay $\sim t^{-3/4}$ by a partial integration with Bessel functions, by a method following an argument from [23, Appendix B]. Such an approach was used in [21] and was accompanied by tedious computations in a combined “coordinate-momentum” representation. The approach adopted in this paper allows us to avoid this part of the argument. An important role is played by a duality argument of Lemma 7.1 leading to an analysis of an oscillatory integral with a phase function (= “dispersion relation”) with a nondegenerate Hessian, see (7.20).

Simple examples show that the convergence may fail when the mixing condition does not hold. For instance, take $u_0(x) \equiv \pm 1$ and $v_0(x) \equiv 0$ with probability $p_\pm = 0.5$. Then the mean value is zero and (1.9) holds, but (1.10) does not. The solution is $u(x,t) \equiv \pm\cos(mt)$ a.s., hence $\mu_t$ is periodic in time, and (1.2) fails.

Finally, a comment on the case of variable coefficients $A_k(x)$. In this case explicit formulas for the solution are unavailable. Here we construct a scattering theory for solutions of infinite global energy. This version of the scattering theory allows us to reduce the proof of (1.2) to the case of constant coefficients (this strategy is similar to [4, 11, 12]).
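The two decay rates in (1.17) and (1.18) can be checked numerically. The sketch below is an illustration only, not part of the proof: it evaluates $J_1$ from Bessel's integral with a trapezoid rule (a choice made here to avoid external libraries), compares it with the standard leading asymptotic term $\sqrt{2/(\pi r)}\cos(r - 3\pi/4)$ (the constant $\sqrt{2/\pi}$ is suppressed in the text), and evaluates the regular part of (1.14) inside the cone and near it. The function names and the sample values $m = 1$, $v = 1/2$, $r = 1$ are assumptions of this example.

```python
import math


def bessel_j1(x: float, steps: int = 4000) -> float:
    """J_1(x) from Bessel's integral (1/pi) * int_0^pi cos(tau - x*sin(tau)) dtau.

    The trapezoid rule is accurate here because the integrand extends to a
    smooth, even, 2*pi-periodic function (steps should comfortably exceed x).
    """
    h = math.pi / steps
    # Endpoint values: cos(0) = 1 at tau = 0 and cos(pi) = -1 at tau = pi.
    total = 0.5 * (1.0 + math.cos(math.pi))
    for i in range(1, steps):
        tau = i * h
        total += math.cos(tau - x * math.sin(tau))
    return total * h / math.pi


def j1_asymptotic(x: float) -> float:
    """Leading large-x term sqrt(2/(pi*x)) * cos(x - 3*pi/4)."""
    return math.sqrt(2.0 / (math.pi * x)) * math.cos(x - 0.75 * math.pi)


def kernel_regular_part(radius: float, t: float, m: float = 1.0) -> float:
    """Second term of (1.14) strictly inside the light cone (radius = |x| < t)."""
    s = math.sqrt(t * t - radius * radius)
    return -m / (4.0 * math.pi) * bessel_j1(m * s) / s


if __name__ == "__main__":
    for t in (50.0, 200.0, 800.0):
        inner = kernel_regular_part(0.5 * t, t)      # |x| = v*t with v = 1/2
        near_cone = kernel_regular_part(t - 1.0, t)  # |x| = t - r with r = 1
        print(f"t={t:6.0f}  inner={inner:+.3e}  near cone={near_cone:+.3e}")
```

Along $|x| = t/2$ the printed values stay within an envelope of order $t^{-3/2}$, while along $|x| = t - 1$ the envelope is only of order $t^{-3/4}$, which is the obstruction described above.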
In particular, in [11] one establishes, in the case of a WE, a long-time asymptotics

$U(t)Y_0 = \Theta\, U_0(t)Y_0 + \rho(t)Y_0, \quad t > 0.$ (1.19)

Here $U(t)$ is the dynamical group of the WE with variable coefficients, $U_0(t)$ corresponds to the “free” equation with constant coefficients, and $\Theta$ is a “scattering operator”. In this paper, instead of (1.19), we use a dual representation:

$U'(t)\Psi = U_0'(t)W\Psi + r(t)\Psi, \quad t \ge 0.$ (1.20)
Here $U'(t)$ is a “formal adjoint” to the dynamical group of Eq. (1.3), while $U_0'(t)$ corresponds to the “free” equation, with $A_k(x) \equiv 0$. The remainder $r(t)$ is small in mean:

$E|\langle Y_0, r(t)\Psi\rangle|^2 \to 0, \quad t \to \infty.$ (1.21)

This version of scattering theory is essentially based on Vainberg’s bounds for the local energy decay (see [35, 36]).

Remark 1.1. (i) In [11] we deduce asymptotics (1.19) from its dual counterpart (1.20). In this paper we do not analyse connections between (1.20) and (1.19). (ii) It is useful to comment on the difference between two versions of scattering theory produced for the WE and KGE. In the first theory, the remainders $\rho(t)$ and $r(t)$ are small a.s., while in the second theory, developed in this paper, $r(t)$ is small in mean (see (1.21)). Such a difference is related to a slow (power) decay of solutions to the KGE.

The main result of the paper is stated in Sect. 2 (see Theorem A). Sections 3–8 deal with the case of constant coefficients: the main statement is given in Sect. 3 (see Theorem B), the relation to CGDs is discussed in Sect. 4, the compactness (Property I) is established in Sect. 5, convergence (1.11) in Sect. 6, and convergence (1.12) in Sects. 7, 8. In Sect. 9 we check the Lindeberg condition needed for convergence to a Gaussian limit. In Sect. 10 we discuss the infinite energy version of the scattering theory, and in Sect. 11 convergence (1.2). In Appendix A we collect FT-type calculations. Appendix B is concerned with a formula on generalised GPMs on Sobolev spaces.
2. Main Results

2.1. Notation. We assume that functions $A_k(x)$ in (1.3) satisfy the following conditions:

E1. $A_j(x)$ are real $C^\infty$-functions.
E2. $A_j(x) = 0$ for $|x| > R_0$, where $R_0 < \infty$.
E3. $\partial A_1/\partial x_2 \equiv \partial A_2/\partial x_1$ if $n = 2$.

Assume that the initial state $Y_0$ belongs to the phase space $\mathcal{H}$ defined below.

Definition 2.1. $\mathcal{H} \equiv H^1_{loc}(\mathbb{R}^n) \oplus H^0_{loc}(\mathbb{R}^n)$ is the Fréchet space of pairs $Y(x) \equiv (u(x), v(x))$ of complex functions $u(x)$, $v(x)$, endowed with local energy seminorms

$\|Y\|_R^2 = \int_{|x|<R} \big( |v(x)|^2 + |\nabla u(x)|^2 + m^2 |u(x)|^2 \big)\, dx < \infty, \quad \forall R > 0.$ (2.1)

[…] $> 0$, and the embedding is compact.

2.2. Random solution. Convergence to equilibrium. Let ($\Omega$, […] 0

$\frac{|\mu_0(A \cap B) - \mu_0(A)\mu_0(B)|}{\mu_0(B)}.$ (2.7)
Definition 2.6. The measure $\mu_0$ satisfies the strong, uniform Ibragimov–Linnik mixing condition if

$\varphi(r) \to 0, \quad r \to \infty.$ (2.8)
Below, we specify the rate of decay of ϕ (see Condition S3).
2.4. Main assumptions and results. We assume that measure $\mu_0$ has the following properties S0–S3:

S0. $\mu_0$ has zero expectation value,

$E\,Y_0(x) \equiv 0, \quad x \in \mathbb{R}^n.$ (2.9)

S1. $\mu_0$ has translation-invariant CFs, i.e. Eq. (1.8) holds for almost all $x, y \in \mathbb{R}^n$.
S2. $\mu_0$ has a finite mean energy density, i.e. Eq. (1.9) holds.
S3. $\mu_0$ satisfies the strong uniform Ibragimov–Linnik mixing condition, with

$\overline{\varphi} \equiv \int_0^\infty r^{n-1} \varphi^{1/2}(r)\, dr < \infty.$ (2.10)
Define, for almost all $x, y \in \mathbb{R}^n$, the matrix $Q_\infty(x,y) \equiv \big( Q_\infty^{ij}(x,y) \big)_{i,j=0,1}$ by

$Q_\infty(x,y) \equiv \frac{1}{2} \begin{pmatrix} (q_0^{00} + P * q_0^{11})(x-y) & (q_0^{01} - q_0^{10})(x-y) \\ (q_0^{10} - q_0^{01})(x-y) & (q_0^{11} - (\Delta - m^2) q_0^{00})(x-y) \end{pmatrix}.$ (2.11)

Here $P(z)$ is the fundamental solution for the operator $-\Delta + m^2$, and $*$ stands for the convolution of generalized functions. We show below that $q_0^{11} \in L^2(\mathbb{R}^n)$ (see (6.1)). Then the convolution $P * q_0^{11}$ in (2.11) also belongs to $L^2(\mathbb{R}^n)$. Let $H = L^2(\mathbb{R}^n) \oplus H^1(\mathbb{R}^n)$ denote the space of complex valued functions $\Psi = (\Psi^0, \Psi^1)$ with a finite norm

$\|\Psi\|_H^2 = \int_{\mathbb{R}^n} \big( |\Psi^0(x)|^2 + |\nabla\Psi^1(x)|^2 + |\Psi^1(x)|^2 \big)\, dx < \infty.$ (2.12)

Denote by $Q_\infty$ a real quadratic form in $H$ defined by

$Q_\infty(\Psi,\Psi) = \sum_{i,j=0,1}\ \int_{\mathbb{R}^n \times \mathbb{R}^n} \big\langle Q_\infty^{ij}(x,y)\Psi^i(x), \Psi^j(y) \big\rangle\, dx\, dy,$ (2.13)

where $\langle\cdot,\cdot\rangle$ stands for the real scalar product in $\mathbb{C}^2 \equiv \mathbb{R}^4$. The form $Q_\infty$ is continuous in $H$ as the functions $Q_\infty^{ij}(x,y)$ are bounded.

Theorem A. Let $n \ge 2$, $m > 0$, and assume that E1–E3, S0–S3 hold. Then
(i) The convergence in (2.4) holds for any $\varepsilon > 0$.
(ii) The limiting measure $\mu_\infty$ is a GPM on $\mathcal{H}$.
(iii) The characteristic functional of $\mu_\infty$ has the form

$\hat\mu_\infty(\Psi) = \exp\{ -\tfrac{1}{2} Q_\infty(W\Psi, W\Psi) \}, \quad \Psi \in D,$

where $W : D \to H$ is a linear continuous operator.

2.5. Remarks on conditions on the initial measure. (i) The (rather strong) form of mixing in Definition 2.6 is motivated by two facts: (a) it greatly simplifies the forthcoming arguments, (b) it allows us to produce an “optimal” (slowest) decay of $\varphi$ indicating natural limits of Bernstein’s room-corridor method. Condition (2.7) can be easily verified for GPMs with finite-range dependence and their images under “local” maps $\mathcal{H} \to \mathcal{H}$. See the examples in Sect. 2.6 below. (ii) The uniform Rosenblatt mixing condition [30] also suffices, together with a higher power $> 2$ in the bound (1.9): there exists $\delta > 0$ such that

$E\big[ |v_0(x)|^{2+\delta} + |\nabla u_0(x)|^{2+\delta} + m^2 |u_0(x)|^{2+\delta} \big] < \infty.$ (1.4′)

Then (2.10) requires a modification:

$\int_0^\infty r^{n-1} \alpha^p(r)\, dr < \infty, \quad \text{where } p = \min\Big( \frac{\delta}{2+\delta}, \frac{1}{2} \Big),$ (2.10′)
where $\alpha(r)$ is the Rosenblatt mixing coefficient defined as in (2.7) but without $\mu_0(B)$ in the denominator. The statements of Theorem A and their proofs remain essentially unchanged; only Lemma 8.2 requires a suitable modification [17].

2.6. Examples of initial measures with mixing condition.

2.6.1. Gaussian measures. In this section we construct initial GPMs $\mu_0$ satisfying S0–S3. Let $\mu_0$ be a GPM on $\mathcal{H}$ with the characteristic functional

$\hat\mu_0(\Psi) \equiv E\exp(i\langle Y, \Psi\rangle) = \exp\{ -\tfrac{1}{2} Q_0(\Psi,\Psi) \}, \quad \Psi \in D.$ (2.14)

Here $Q_0$ is a real nonnegative quadratic form with an integral kernel $(Q_0^{ij}(x,y))_{i,j=0,1}$. Let

$Q_0^{ij}(x,y) \equiv q_0^{ij}(x-y),$ (2.15)

for any $i, j$, where the function $q_0^{ij} \in C^2(\mathbb{R}^n) \otimes M^2$ has compact support. Then S0, S1 and S2 are satisfied; S3 holds with $\varphi(r) \equiv 0$ for $r \ge r_0$ if $q_0^{ij}(z) \equiv 0$ for $|z| \ge r_0$. For a given matrix function $q_0^{ij}(z)$ such a measure exists on space $\mathcal{H}$ iff the corresponding FT is a nonnegative matrix-valued measure: $\hat q_0^{ij}(k) \ge 0$, $k \in \mathbb{R}^n$, [15, Thm V.5.1]. For example, all these conditions hold if $\hat q_0^{ij}(k) = D_i \delta^{ij} f(k_1) \cdots f(k_n)$ with $D_i \ge 0$ and

$f(z) = \Big( \frac{1 - \cos(r_0 z/\sqrt{n})}{z^2} \Big)^2, \quad z \in \mathbb{R}.$

2.6.2. Non-Gaussian measures. Now choose a pair of odd functions $f^0, f^1 \in C^1(\mathbb{R})$, with bounded first derivatives. Define $\mu_0^*$ as the distribution of the random function $(f^0(Y^0(x)), f^1(Y^1(x)))$, where $(Y^0, Y^1)$ is a random function with a Gaussian distribution $\mu_0$ from the previous example. Then S0–S3 hold for $\mu_0^*$ with a mixing coefficient $\varphi^*(r) \equiv 0$ for $r \ge r_0$. Measure $\mu_0^*$ is not Gaussian if $D_i > 0$ and the functions $f^i$ are bounded and nonconstant.

3. Equations with Constant Coefficients

In Sects. 3–9 we assume that coefficients $A_k(x) \equiv 0$. Problem (1.3) then becomes

$\ddot u(x,t) = \Delta u(x,t) - m^2 u(x,t), \quad t \in \mathbb{R}; \qquad u|_{t=0} = u_0(x), \quad \dot u|_{t=0} = v_0(x).$ (3.1)

As in (1.6), we rewrite (3.1) in the form

$\dot Y(t) = \mathcal{A}_0 Y(t), \quad t \in \mathbb{R}; \qquad Y(0) = Y_0.$ (3.2)

Here we denote

$\mathcal{A}_0 = \begin{pmatrix} 0 & 1 \\ A_0 & 0 \end{pmatrix},$ (3.3)
where $A_0 = \Delta - m^2$. Denote by $U_0(t)$, $t \in \mathbb{R}$, the dynamical group for problem (3.2), then $Y(t) = U_0(t)Y_0$. The following proposition is well-known and is proved by a standard integration by parts.

Proposition 3.1. Let $Y_0 = (u_0, v_0) \in \mathcal{H}$, and let $Y(\cdot,t) = (u(\cdot,t), \dot u(\cdot,t)) \in C(\mathbb{R}, \mathcal{H})$ be the solution to (3.1). Then the following energy bound holds: for $R > 0$ and $t \in \mathbb{R}$,

$\int_{|x|<R} \big( |\dot u(x,t)|^2 + |\nabla u(x,t)|^2 + m^2 |u(x,t)|^2 \big)\, dx$ […] $> 0$, and the bounds hold:

$\sup_{t \ge 0} E\|U_0(t)Y_0\|_R^2 < \infty, \quad R > 0.$ (3.5)

Proposition 3.3. For every $\Psi \in D$,

$\hat\mu_t(\Psi) \equiv \int \exp(i\langle Y, \Psi\rangle)\, \mu_t(dY) \to \exp\{ -\tfrac{1}{2} Q_\infty(\Psi,\Psi) \}, \quad t \to \infty.$ (3.6)
Propositions 3.2 and 3.3 are proved in Sects. 5 and 7–9, respectively. We will use repeatedly the FT (12.2) and (12.3) from Appendix A.

4. Relation to CGDs

In this section we discuss how our results are related to CGDs. We restrict consideration to the case of Eq. (1.3) with constant coefficients and to the translation-invariant isotropic case. The CGD $g_T$ with the absolute temperature $T \ge 0$ is defined formally by

$g_T(du \times dv) = \frac{1}{Z}\, e^{-H/T} \prod_x du(x)\, dv(x),$ (4.1)

where $H := \frac{1}{2} \int \big( |v(x)|^2 + |\nabla u(x)|^2 + m^2 |u(x)|^2 \big)\, dx$, and $Z$ is a normalisation constant. To make the definition rigorous, let us introduce a scale of weighted Sobolev spaces $H^{s,\alpha}(\mathbb{R}^n)$ with arbitrary $s, \alpha \in \mathbb{R}$. We use notation (2.2).
Definition 4.1. (i) $H^{s,\alpha}(\mathbb{R}^n)$ is the complex Hilbert space of the distributions $w \in S'(\mathbb{R}^n)$ with the finite norm

$\|w\|_{s,\alpha} \equiv \|\langle x\rangle^\alpha \Lambda^s w\|_{L^2(\mathbb{R}^n)} < \infty.$ (4.2)

(ii) $\mathcal{H}_{s,\alpha}$ is the Hilbert space of the pairs $Y = (u, v) \in H^{1+s,\alpha}(\mathbb{R}^n) \oplus H^{s,\alpha}(\mathbb{R}^n)$ with the norm

$|||Y|||_{s,\alpha} \equiv \|u\|_{1+s,\alpha} + \|v\|_{s,\alpha}.$ (4.3)

Note that $\mathcal{H}_{s',\alpha'} \subset \mathcal{H}_{s,\alpha}$ if $s < s'$ and $\alpha < \alpha'$, and this embedding is compact. These facts follow by standard methods of pseudodifferential operators and Sobolev’s Theorem (see, e.g. [16]). Now we can define the CGDs rigorously: $g_T$ is a GPM on a space $\mathcal{H}_{s,\alpha}$, $s, \alpha < -n/2$, with the CFs

$g_T^{00}(x-y) = T\,P(x-y), \quad g_T^{11}(x-y) = T\,\delta(x-y), \quad g_T^{01}(x-y) = g_T^{10}(x-y) = 0.$ (4.4)

By Minlos Theorem [15, Thm. V.5.1], such a measure exists on $\mathcal{H}_{s,\alpha}$ with $s, \alpha < -n/2$ as, formally (see Appendix B),

$\int |||Y|||^2_{s,\alpha}\, g_T(dY) < \infty.$ (4.5)
Measure $g_T$ is stationary for the KGE, as its CFs are stationary; the last fact follows from formulas (12.6), (12.2). Also, $g_T$ is translation invariant, so S1 holds. Condition S2 fails since the “mean energy density” $g_T^{11}(0) - \Delta g_T^{00}(0) + m^2 g_T^{00}(0)$ is infinite; this gives an “ultraviolet divergence”. Mixing condition S3 holds due to an exponential decay of $P(z)$. The convergence of type (1.1) holds for initial measures $\mu_0$ that are absolutely continuous with respect to the CGD $g_T$, and the limit measure coincides with $g_T$. This mixing property (and even the K-property) can be proved by using well-known methods developed for Gaussian processes [7], and we do not discuss it here.

Remark. Assumption S2 implies that $\mu_0(\mathcal{H}) = 1$ and hence $\mu_\infty(\mathcal{H}) = 1$. This excludes the case of a limiting CGD as it is a generalised GPM not supported by $\mathcal{H}$. However, it is possible to extend our results to a class of generalised initial measures converging to CGDs. For the case of constant coefficients such an extension could be done by smearing the initial generalised field as the dynamics commutes with the averaging (cf. [12]). For variable coefficients such an extension requires further work.

To demonstrate the special role of the CGDs we consider a family of initial GPMs $\mu_{0,r}$, $r \in (0, 1]$, satisfying S0–S3, with the radius of correlation $r$. More precisely, suppose that the corresponding CFs $q_{0,r}^{ij}$ have the following properties G0–G3:

G0. $q_{0,r}^{01}(z) = q_{0,r}^{01}(-z)$, $z \in \mathbb{R}^n$.
G1. $q_{0,r}^{11}(z) - \Delta q_{0,r}^{00}(z) + m^2 q_{0,r}^{00}(z) = 0$, $|z| \ge r$.
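G2. For some $T > 0$, $\frac{1}{2} \int \big( q_{0,r}^{11}(z) - \Delta q_{0,r}^{00}(z) + m^2 q_{0,r}^{00}(z) \big)\, dz \to T$, $r \to 0$.
G3. $\sup_{r \in (0,1]} \int \big( |q_{0,r}^{11}(z)| + |\Delta q_{0,r}^{00}(z)| + m^2 |q_{0,r}^{00}(z)| \big)\, dz$ […] $> 0$, choose $d \equiv d_t \ge 1$ and $\rho \equiv \rho_t > 0$. Asymptotic relations between $t$, $d_t$ and $\rho_t$ are specified below. Set

$h = d + \rho$ and $a^j = jh$, $b^j = a^j + d$, $j \in \mathbb{Z}.$ (7.6)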
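We call the slabs $R_t^j = \{x \in \mathbb{R}^n : a^j \le x^n \le b^j\}$ “rooms” and $C_t^j = \{x \in \mathbb{R}^n : b^j \le x^n \le a^{j+1}\}$ “corridors”. Here $x = (x^1, \dots, x^n)$, $d$ is the width of a room, and $\rho$ of a corridor. Denote by $\chi_r$ the indicator of the interval $[0, d]$ and by $\chi_c$ that of $[d, h]$, so that $\sum_{j \in \mathbb{Z}} (\chi_r(s - jh) + \chi_c(s - jh)) = 1$ for (almost all) $s \in \mathbb{R}$. The following decomposition holds:

$\langle Y_0, K(\cdot,t)\rangle = \sum_{j \in \mathbb{Z}} \big( \langle Y_0, \chi_r^j K(\cdot,t)\rangle + \langle Y_0, \chi_c^j K(\cdot,t)\rangle \big),$ (7.7)

where $\chi_r^j := \chi_r(x^n - jh)$ and $\chi_c^j := \chi_c(x^n - jh)$. Consider random variables $r_t^j$, $c_t^j$, where

$r_t^j = \langle Y_0, \chi_r^j K(\cdot,t)\rangle, \quad c_t^j = \langle Y_0, \chi_c^j K(\cdot,t)\rangle, \quad j \in \mathbb{Z}.$ (7.8)

Then (7.7) and (7.2) imply

$\langle U_0(t)Y_0, \Psi\rangle = \sum_{j \in \mathbb{Z}} (r_t^j + c_t^j).$ (7.9)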
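The partition behind (7.6)–(7.9) can be made concrete with a small sketch. The code below is only an illustration under assumptions made here: the helper names are hypothetical, and half-open intervals $[0, d)$ and $[d, h)$ are used instead of the closed ones so that the identity $\sum_j (\chi_r + \chi_c) = 1$ holds at every point rather than almost everywhere.

```python
import math


def chi_room(s: float, d: float) -> float:
    """Indicator of the room profile [0, d) (half-open to avoid double counting)."""
    return 1.0 if 0.0 <= s < d else 0.0


def chi_corridor(s: float, d: float, h: float) -> float:
    """Indicator of the corridor profile [d, h)."""
    return 1.0 if d <= s < h else 0.0


def classify(s: float, d: float, rho: float) -> tuple[str, int]:
    """Return ('room' | 'corridor', j) for the slab containing the coordinate x^n = s."""
    h = d + rho
    j = math.floor(s / h)
    offset = s - j * h  # position of s inside the j-th period [j*h, (j+1)*h)
    return ("room" if offset < d else "corridor", j)


def partition_sum(s: float, d: float, rho: float, j_max: int = 50) -> float:
    """Sum over |j| <= j_max of chi_r(s - j*h) + chi_c(s - j*h); equals 1 for |s| < j_max*h."""
    h = d + rho
    return sum(
        chi_room(s - j * h, d) + chi_corridor(s - j * h, d, h)
        for j in range(-j_max, j_max + 1)
    )


if __name__ == "__main__":
    d, rho = 3.0, 1.0  # sample room and corridor widths
    for s in (-7.5, -0.2, 0.0, 2.9, 3.0, 10.3):
        print(s, classify(s, d, rho), partition_sum(s, d, rho))
```

Each point of the $x^n$ axis falls into exactly one room or corridor, which is what allows the pairing $\langle Y_0, K(\cdot,t)\rangle$ to be split into the terms $r_t^j$ and $c_t^j$.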
The series in (7.9) is indeed a finite sum. In fact, (7.5) and (12.1) imply that in the FT representation, $\dot{\hat K}(k,t) = \hat{\mathcal{A}}_0(k)\hat K(k,t)$ and $\hat K(k,t) = \hat{\mathcal{G}}_t(k)\hat\Psi(k)$. Therefore,

$K(x,t) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} e^{-ikx}\, \hat{\mathcal{G}}_t(k)\hat\Psi(k)\, dk.$ (7.10)

This can be rewritten as a convolution

$K(\cdot,t) = R_t * \Psi,$ (7.11)

where $R_t = F^{-1}\hat{\mathcal{G}}_t$. The support $\mathrm{supp}\,\Psi \subset B_r$ with an $r > 0$. Then the convolution representation (7.11) implies that the support of the function $K$ at $t > 0$ is a subset of an “inflated future cone”

$\mathrm{supp}\,K \subset \{(x,t) \in \mathbb{R}^n \times \mathbb{R}_+ : |x| \le t + r\},$ (7.12)

as $R_t(x)$ is supported by the “future cone” $|x| \le t$. The last fact follows from general formulas (see [13, (II.4.5.12)]), or from the Paley–Wiener Theorem (see, e.g. [13, Thm. II.2.5.1]), as $\hat R_t(k)$ is an entire function of $k \in \mathbb{C}^n$ satisfying suitable bounds. Finally, (7.8) implies that

$r_t^j = c_t^j = 0$ for $jh + t < -r$ or $jh - t > r.$ (7.13)

Therefore, series (7.9) becomes a sum

$\langle U_0(t)Y_0, \Psi\rangle = \sum_{j=-N_t}^{N_t} (r_t^j + c_t^j), \quad N_t \sim \frac{t}{h},$ (7.14)

as $h \ge 1$.
Lemma 7.2. Let $n \ge 1$, $m > 0$, and S0–S3 hold. The following bounds hold for $t > 1$:

$E|r_t^j|^2 \le C(\Psi)\, d_t/t, \quad E|c_t^j|^2 \le C(\Psi)\, \rho_t/t, \quad j \in \mathbb{Z}.$ (7.15)

Proof. We discuss the first bound in (7.15) only; the second is done in a similar way.

Step 1. Rewrite the left-hand side as the integral of CFs. Definition (7.8) and Corollary 5.1 imply by Fubini’s Theorem that

$E|r_t^j|^2 = \big\langle \chi_r^j(x^n)\chi_r^j(y^n)\, q_0(x-y),\ K(x,t) \otimes K(y,t) \big\rangle.$ (7.16)

The following bound holds true (cf. [29, Thm. XI.17 (b)]):

$\sup_{x \in \mathbb{R}^n} |K(x,t)| = O(t^{-n/2}), \quad t \to \infty.$ (7.17)

In fact, (7.10) and (12.2) imply that $K$ can be written as the sum

$K(x,t) = \frac{1}{(2\pi)^n} \sum_{\pm} \int_{\mathbb{R}^n} e^{-i(kx \mp \omega t)}\, a^\pm(\omega)\hat\Psi(k)\, dk,$ (7.18)

where $a^\pm(\omega)$ is a matrix whose entries are linear functions of $\omega$ or $1/\omega$. Let us prove the asymptotics (7.17) along each ray $x = vt + x_0$ with $|v| \le 1$, then it holds uniformly in $x \in \mathbb{R}^n$ owing to (7.12). We have by (7.18),

$K(vt + x_0, t) = \frac{1}{(2\pi)^n} \sum_{\pm} \int_{\mathbb{R}^n} e^{-i(kv \mp \omega)t - ikx_0}\, a^\pm(\omega)\hat\Psi(k)\, dk.$ (7.19)

This is a sum of oscillatory integrals with the phase functions $\phi_\pm(k) = kv \pm \omega(k)$. Each function has exactly one stationary point, the solution to the equation $v = \mp\nabla\omega(k)$, if $|v| < 1$, and has none if $|v| \ge 1$. The phase functions are nondegenerate, i.e.

$\det\Big( \frac{\partial^2 \phi_\pm(k)}{\partial k_i \partial k_j} \Big)_{i,j=1}^{n} \ne 0, \quad k \in \mathbb{R}^n.$ (7.20)
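At last, $\hat\Psi(k)$ is smooth and decays rapidly at infinity. Therefore, $K(vt+x_0,t) = O(t^{-n/2})$ according to the standard method of stationary phase, [14].
Step 2. According to (7.12) and (7.17), Eq. (7.16) implies that
$$E|r_t^j|^2 \le Ct^{-n}\iint_{|x|\le t+r}\chi_r^j(x^n)\,\|q_0(x-y)\|\,dx\,dy = Ct^{-n}\int_{|x|\le t+r}\chi_r^j(x^n)\,dx\int_{\mathbb{R}^n}\|q_0(z)\|\,dz, \tag{7.21}$$
where $\|q_0(z)\|$ stands for the norm of a matrix $\big(q_0^{ij}(z)\big)$. The room $R_t^j$ is a slab of width $d_t$, so its intersection with the ball $|x| \le t+r$ has volume at most $C d_t\, t^{n-1}$ for $t > 1$. Therefore, (7.15) follows as $\|q_0(\cdot)\| \in L^1(\mathbb{R}^n)$ by (6.1). □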
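The decay rate (7.17) used in Step 1 can be checked numerically in the simplest setting. The following is only a sketch under stated assumptions: dimension $n = 1$, mass $m = 1$, the amplitude $a^\pm \equiv 1$, and a Gaussian $\hat\Psi(k) = e^{-k^2/2}$ chosen purely for illustration (none of these choices come from the paper). With $\omega''(0) = 1/m$, the leading stationary-phase term predicts $|I(t)| \approx \sqrt{2\pi m/t}$, i.e. the rate $t^{-n/2}$ with $n = 1$. The function names are hypothetical.

```python
import cmath
import math


def oscillatory_integral(t, m=1.0, k_max=8.0, step=1e-3):
    """Trapezoid approximation of I(t) = int exp(i*omega(k)*t) * exp(-k^2/2) dk."""
    n_steps = int(round(2.0 * k_max / step))
    total = 0j
    for i in range(n_steps + 1):
        k = -k_max + i * step
        omega = math.sqrt(k * k + m * m)            # dispersion relation of the KGE
        weight = 0.5 if i in (0, n_steps) else 1.0  # trapezoid end-point weights
        total += weight * cmath.exp(1j * omega * t) * math.exp(-0.5 * k * k)
    return total * step


def scaled_modulus(t, m=1.0):
    """|I(t)| divided by the leading stationary-phase prediction sqrt(2*pi*m/t)."""
    return abs(oscillatory_integral(t, m)) / math.sqrt(2.0 * math.pi * m / t)


if __name__ == "__main__":
    for t in (25.0, 100.0, 400.0):
        # The ratio should stay close to 1, i.e. |I(t)| decays like t**(-1/2).
        print(t, scaled_modulus(t))
```

The ratio printed for each $t$ measures how well the one-dimensional integral follows the predicted rate; the Gaussian cutoff at $|k| = 8$ and the step size are ad hoc numerical choices.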
8. Convergence of Characteristic Functionals
In this section we complete the proof of Proposition 3.3. We use a version of the CLT developed by Ibragimov and Linnik. If $Q_\infty(\Psi,\Psi) = 0$, Proposition 3.3 is obvious. Thus, we may assume that for a given $\Psi\in D$,
$$Q_\infty(\Psi,\Psi) \ne 0. \tag{8.1}$$
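Choose $0 < \delta < 1$ and
$$\rho_t \sim t^{1-\delta}, \qquad d_t \sim \frac{t}{\ln t}, \qquad t\to\infty. \tag{8.2}$$
Lemma 8.1. The following limit holds true:
$$N_t\Big(\varphi(\rho_t) + \Big(\frac{\rho_t}{t}\Big)^{1/2}\Big) + N_t^2\Big(\varphi^{1/2}(\rho_t) + \frac{\rho_t}{t}\Big) \to 0, \qquad t\to\infty. \tag{8.3}$$
Proof. Function $\varphi(r)$ is nonincreasing, hence by (2.10),
$$r^n\varphi^{1/2}(r) = n\int_0^r s^{n-1}\varphi^{1/2}(r)\,ds \le n\int_0^r s^{n-1}\varphi^{1/2}(s)\,ds \le C\overline{\varphi} < \infty. \tag{8.4}$$
Then Eq. (8.3) follows as (8.2) and (7.14) imply that $N_t \sim \ln t$. □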
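The limit (8.3) can be illustrated numerically. The sketch below makes several assumptions that are not in the paper: the asymptotic relations (8.2) and $N_t \sim t/h$ are taken as exact equalities, $\delta = 1/2$, $n = 2$, and the mixing coefficient is modelled by $\varphi(r) = \min(1, r^{-(2n+1)})$, which is nonincreasing and makes $r^{n-1}\varphi^{1/2}(r)$ integrable at infinity. The function name is hypothetical.

```python
import math


def lemma_8_1_quantity(t, delta=0.5, n=2):
    """Left-hand side of (8.3) for a model mixing coefficient (an assumption)."""
    rho = t ** (1.0 - delta)                # corridor width, rho_t ~ t^(1 - delta)
    d = t / math.log(t)                     # room width, d_t ~ t / ln t
    big_n = t / (d + rho)                   # N_t ~ t / h with h = d_t + rho_t
    phi = min(1.0, rho ** (-(2 * n + 1)))   # model value of phi(rho_t)
    return (big_n * (phi + math.sqrt(rho / t))
            + big_n ** 2 * (math.sqrt(phi) + rho / t))


if __name__ == "__main__":
    for exponent in (8, 12, 16):
        t = 10.0 ** exponent
        print(exponent, lemma_8_1_quantity(t))
```

The decay is slow because the dominant term behaves like $\ln t \cdot t^{-\delta/2}$, which is why very large values of $t$ are used.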
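By the triangle inequality,
$$\begin{aligned}
|\hat\mu_t(\Psi) - \hat\mu_\infty(\Psi)| &\le \Big|E\exp\{i\langle U_0(t)Y_0,\Psi\rangle\} - E\exp\Big\{i\sum\nolimits_t r_t^j\Big\}\Big| \\
&\quad + \Big|\exp\Big\{-\frac12\sum\nolimits_t E|r_t^j|^2\Big\} - \exp\Big\{-\frac12 Q_\infty(\Psi,\Psi)\Big\}\Big| \\
&\quad + \Big|E\exp\Big\{i\sum\nolimits_t r_t^j\Big\} - \exp\Big\{-\frac12\sum\nolimits_t E|r_t^j|^2\Big\}\Big| \\
&\equiv I_1 + I_2 + I_3,
\end{aligned} \tag{8.5}$$
where the sum $\sum_t$ stands for $\sum_{j=-N_t}^{N_t}$. We are going to show that all summands $I_1$, $I_2$, $I_3$ tend to zero as $t\to\infty$.
Step (i). Equation (7.14) implies
$$I_1 = \Big|E\exp\Big\{i\sum\nolimits_t r_t^j\Big\}\Big(\exp\Big\{i\sum\nolimits_t c_t^j\Big\} - 1\Big)\Big| \le \sum\nolimits_t E|c_t^j| \le \sum\nolimits_t \big(E|c_t^j|^2\big)^{1/2}. \tag{8.6}$$
From (8.6), (7.15) and (8.3) we obtain that
$$I_1 \le CN_t(\rho_t/t)^{1/2} \to 0, \qquad t\to\infty. \tag{8.7}$$
Step (ii). By the triangle inequality,
$$\begin{aligned}
I_2 &\le \frac12\Big|\sum\nolimits_t E|r_t^j|^2 - Q_\infty(\Psi,\Psi)\Big| \le \frac12\,|Q_t(\Psi,\Psi) - Q_\infty(\Psi,\Psi)| \\
&\quad + \frac12\Big|E\Big(\sum\nolimits_t r_t^j\Big)^2 - \sum\nolimits_t E|r_t^j|^2\Big| + \frac12\Big|E\Big(\sum\nolimits_t r_t^j\Big)^2 - Q_t(\Psi,\Psi)\Big| \\
&\equiv I_{21} + I_{22} + I_{23},
\end{aligned} \tag{8.8}$$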
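where $Q_t$ is a quadratic form with the integral kernel $\big(Q_t^{ij}(x,y)\big)$. Equation (6.5) implies that $I_{21}\to 0$. As to $I_{22}$, we first have that
$$I_{22} \le \sum_{j<l} E|r_t^j r_t^l|. \tag{8.9}$$
Lemma 8.2. Let $\xi$ and $\eta$ be random variables measurable with respect to the σ-algebras $\sigma(A)$ and $\sigma(B)$, respectively, where the sets $A$, $B$ satisfy $\mathrm{dist}(A,B) \ge r > 0$.
(i) Let $(E|\xi|^2)^{1/2} \le a$, $(E|\eta|^2)^{1/2}\le b$. Then $|E\xi\eta - E\xi E\eta| \le Cab\,\varphi^{1/2}(r)$.
(ii) Let $|\xi|\le a$, $|\eta|\le b$ a.s. Then $|E\xi\eta - E\xi E\eta| \le Cab\,\varphi(r)$.
We apply Lemma 8.2 to deduce that $I_{22}\to 0$ as $t\to\infty$. Note that $r_t^j = \langle Y_0(x), \chi_r^j(x^n)(R_t * \Psi)\rangle$ is measurable with respect to the σ-algebra $\sigma(R_t^j)$. The distance between the different rooms $R_t^j$ is greater than or equal to $\rho_t$ according to (7.6). Then (8.9) and S1, S3 imply, together with Lemma 8.2 (i), that
$$I_{22} \le CN_t^2\varphi^{1/2}(\rho_t), \tag{8.10}$$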
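which goes to 0 as $t\to\infty$ because of (7.15) and (8.3). Finally, it remains to check that $I_{23}\to 0$, $t\to\infty$. By the Cauchy–Schwarz inequality,
$$\begin{aligned}
I_{23} &\le \Big|E\Big(\sum\nolimits_t r_t^j\Big)^2 - E\Big(\sum\nolimits_t r_t^j + \sum\nolimits_t c_t^j\Big)^2\Big| \\
&\le CN_t\sum\nolimits_t E|c_t^j|^2 + C\Big(E\Big(\sum\nolimits_t r_t^j\Big)^2\Big)^{1/2}\Big(N_t\sum\nolimits_t E|c_t^j|^2\Big)^{1/2}.
\end{aligned} \tag{8.11}$$
Then (7.15), (8.9) and (8.10) imply
$$E\Big(\sum\nolimits_t r_t^j\Big)^2 \le \sum\nolimits_t E|r_t^j|^2 + 2\sum_{j<l}E|r_t^j r_t^l| \le CN_t d_t/t + C_1N_t^2\varphi^{1/2}(\rho_t) \le C_2 < \infty.$$
[…]
$$\forall\varepsilon > 0, \qquad \frac{1}{\sigma_t}\sum\nolimits_t E_{\varepsilon\sqrt{\sigma_t}}\,|r_t^j|^2 \to 0, \qquad t\to\infty. \tag{8.17}$$
Here $\sigma_t \equiv \sum_t E|r_t^j|^2$, and $E_\delta f \equiv E X_\delta f$, where $X_\delta$ is the indicator of the event $|f| > \delta^2$. Note that (8.13) and (8.1) imply that $\sigma_t \to Q_\infty(\Psi,\Psi) \ne 0$, $t\to\infty$. Hence it remains to verify that
$$\forall\varepsilon > 0, \qquad \sum\nolimits_t E_\varepsilon|r_t^j|^2 \to 0, \qquad t\to\infty. \tag{8.18}$$
We check Eq. (8.18) in Sect. 9. This will complete the proof of Proposition 3.3. □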
9. The Lindeberg Condition
The proof of (8.18) can be reduced to the case when for some $\Lambda \ge 0$ we have, almost surely, that
$$|u_0(x)| + |v_0(x)| \le \Lambda < \infty, \qquad x\in\mathbb{R}^n. \tag{9.1}$$
Then the proof of (8.18) is reduced to the convergence
$$\sum\nolimits_t E|r_t^j|^4 \to 0, \qquad t\to\infty \tag{9.2}$$
by using Chebyshev’s inequality. The general case can be covered by standard cutoff arguments by taking into account that the bound (7.15) for $E|r_t^j|^2$ depends only on $e_0$ and $\overline{\varphi}$. The last fact is obvious from (7.21) and (6.3) with $p = 1$ and $\gamma = 0$. We deduce (9.2) from
Theorem 9.1. Let the conditions of Theorem B hold and assume that (9.1) is fulfilled. Then for any $\Psi\in D$ there exists a constant $C(\Psi)$ such that
$$E|r_t^j|^4 \le C(\Psi)\Lambda^4 d_t^2/t^2, \qquad t > 1. \tag{9.3}$$
Proof. Step 1. Given four points $x_1, x_2, x_3, x_4\in\mathbb{R}^n$, set:
$$M_0^{(4)}(x_1,\dots,x_4) = E\big(Y_0(x_1)\otimes\dots\otimes Y_0(x_4)\big).$$
Then, similarly to (7.16), Eqs. (9.1) and (7.8) imply by the Fubini Theorem that
$$E|r_t^j|^4 = \langle\chi_r^j(x_1^n)\dots\chi_r^j(x_4^n)\,M_0^{(4)}(x_1,\dots,x_4),\; K(x_1,t)\otimes\dots\otimes K(x_4,t)\rangle. \tag{9.4}$$
Let us analyse the domain of the integration $(\mathbb{R}^n)^4$ in the RHS of (9.4). We partition $(\mathbb{R}^n)^4$ into three parts, $W_2$, $W_3$ and $W_4$:
$$(\mathbb{R}^n)^4 = \bigcup_{i=2}^{4} W_i, \qquad W_i = \{\bar x = (x_1,x_2,x_3,x_4)\in(\mathbb{R}^n)^4 : |x_1 - x_i| = \max_{p=2,3,4}|x_1 - x_p|\}. \tag{9.5}$$
Furthermore, given $\bar x = (x_1,x_2,x_3,x_4)\in W_i$, divide $\mathbb{R}^n$ into three parts $S_j$, $j=1,2,3$: $\mathbb{R}^n = S_1\cup S_2\cup S_3$, by two hyperplanes orthogonal to the segment $[x_1,x_i]$ and partitioning it into three equal segments, where $x_1\in S_1$ and $x_i\in S_3$. Denote by $x_p$, $x_q$ the two remaining points with $p,q\ne 1,i$. Set: $A_i = \{\bar x\in W_i: x_p\in S_1,\ x_q\in S_3\}$, $B_i = \{\bar x \in W_i: x_p, x_q\notin S_1\}$ and $C_i = \{\bar x\in W_i: x_p, x_q\notin S_3\}$, $i=2,3,4$. Then $W_i = A_i\cup B_i\cup C_i$. Define the function $m_0^{(4)}(\bar x)$, $\bar x\in(\mathbb{R}^n)^4$, in the following way:
$$m_0^{(4)}(\bar x)\Big|_{W_i} = \begin{cases} M_0^{(4)}(\bar x) - q_0(x_1-x_p)\otimes q_0(x_i-x_q), & \bar x\in A_i,\\ M_0^{(4)}(\bar x), & \bar x\in B_i\cup C_i.\end{cases} \tag{9.6}$$
This determines $m_0^{(4)}(\bar x)$ correctly for almost all quadruples $\bar x$. Note that
$$\begin{aligned}&\langle\chi_r^j(x_1^n)\dots\chi_r^j(x_4^n)\,q_0(x_1-x_p)\otimes q_0(x_i-x_q),\; K(x_1,t)\otimes\dots\otimes K(x_4,t)\rangle\\ &\quad= \langle\chi_r^j(x_1^n)\chi_r^j(x_p^n)\,q_0(x_1-x_p),\;K(x_1,t)\otimes K(x_p,t)\rangle\,\langle\chi_r^j(x_i^n)\chi_r^j(x_q^n)\,q_0(x_i-x_q),\;K(x_i,t)\otimes K(x_q,t)\rangle.\end{aligned}$$
Each factor here is bounded by $C(\Psi)\,d_t/t$. Similarly to (7.15), this can be deduced from an expression of type (7.16) for the factors. Therefore, the proof of (9.3) reduces to the proof of the bound
$$I_t := |\langle\chi_r^j(x_1^n)\dots\chi_r^j(x_4^n)\,m_0^{(4)}(x_1,\dots,x_4),\;K(x_1,t)\otimes\dots\otimes K(x_4,t)\rangle| \le C(\Psi)\Lambda^4 d_t^2/t^2, \qquad t>1. \tag{9.7}$$
Step 2. Similarly to (7.21), Eq. (7.17) implies,
$$I_t \le C(\Psi)\,t^{-2n}\int_{(B_t^r)^4}\chi_r^j(x_1^n)\dots\chi_r^j(x_4^n)\,|m_0^{(4)}(x_1,\dots,x_4)|\,dx_1dx_2dx_3dx_4, \tag{9.8}$$
where $B_t^r$ is the ball $\{x\in\mathbb{R}^n: |x|\le t+r\}$. We estimate $m_0^{(4)}$ using Lemma 8.2 (ii).
Lemma 9.2. For each $i=2,3,4$ and almost all $\bar x\in W_i$ the following bound holds:
$$|m_0^{(4)}(x_1,\dots,x_4)| \le C\Lambda^4\varphi(|x_1-x_i|/3). \tag{9.9}$$
Proof. For $\bar x\in A_i$ we apply Lemma 8.2 (ii) to $\mathbb{C}^2\otimes\mathbb{C}^2\equiv\mathbb{R}^4\otimes\mathbb{R}^4$-valued random variables $\xi = Y_0(x_1)\otimes Y_0(x_p)$ and $\eta = Y_0(x_i)\otimes Y_0(x_q)$. Then (9.1) implies the bound for almost all $\bar x\in A_i$,
$$|m_0^{(4)}(\bar x)| \le C\Lambda^4\varphi(|x_1-x_i|/3). \tag{9.10}$$
For $\bar x\in B_i$, we apply Lemma 8.2 (ii) to $\xi = Y_0(x_1)$ and $\eta = Y_0(x_p)\otimes Y_0(x_q)\otimes Y_0(x_i)$. Then S0 implies a similar bound for almost all $\bar x\in B_i$,
$$|m_0^{(4)}(\bar x)| = \big|M_0^{(4)}(\bar x) - EY_0(x_1)\otimes E\big(Y_0(x_p)\otimes Y_0(x_q)\otimes Y_0(x_i)\big)\big| \le C\Lambda^4\varphi(|x_1-x_i|/3), \tag{9.11}$$
and the same for almost all $\bar x\in C_i$. □
Step 3. It remains to prove the following bounds for each $i=2,3,4$:
$$V_i(t) := \int_{(B_t^r)^4}\chi_r^j(x_1^n)\dots\chi_r^j(x_4^n)\,X_i(\bar x)\,\varphi(|x_1-x_i|/3)\,dx_1dx_2dx_3dx_4 \le Cd_t^2t^{2n-2}, \tag{9.12}$$
where $X_i$ is an indicator of the set $W_i$. In fact, this integral does not depend on $i$, hence set $i=2$ in the integrand:
$$V_i(t) \le C\int_{(B_t^r)^2}\chi_r^j(x_1^n)\,\varphi(|x_1-x_2|/3)\Big[\int_{B_t^r}\chi_r^j(x_3^n)\Big(\int_{B_t^r}X_2(\bar x)\,dx_4\Big)dx_3\Big]dx_1dx_2. \tag{9.13}$$
Now a key observation is that the inner integral in $dx_4$ is $O(|x_1-x_2|^n)$ as $X_2(\bar x) = 0$ for $|x_4-x_1| > |x_1-x_2|$. This implies
$$V_i(t) \le C\int_{B_t^r}\chi_r^j(x_1^n)\Big(\int_{B_t^r}\varphi(|x_1-x_2|/3)\,|x_1-x_2|^n\,dx_2\Big)dx_1\int_{B_t^r}\chi_r^j(x_3^n)\,dx_3. \tag{9.14}$$
The inner integral in $dx_2$ is bounded as
$$\begin{aligned}\int_{B_t^r}\varphi(|x_1-x_2|/3)\,|x_1-x_2|^n\,dx_2 &\le C(n)\int_0^{2(t+r)}r^{2n-1}\varphi(r/3)\,dr\\ &\le C_1(n)\sup_{r\in[0,2(t+r)]}r^n\varphi^{1/2}(r/3)\int_0^{2(t+r)}r^{n-1}\varphi^{1/2}(r/3)\,dr,\end{aligned} \tag{9.15}$$
where the “sup” and the last integral are bounded by (8.4) and (2.10), respectively. Therefore, (9.12) follows from (9.14). This completes the proof of Theorem 9.1. □
Proof of convergence (9.2). As $d_t\le h\sim t/N_t$, bound (9.3) implies,
$$\sum\nolimits_t E|r_t^j|^4 \le \frac{C\Lambda^4d_t^2}{t^2}\,N_t \le \frac{C_1\Lambda^4}{N_t}\to 0, \qquad N_t\to\infty.$$
□
10. The Scattering Theory for Infinite Energy Solutions
In this section we develop a version of the scattering theory to deduce Theorem A from Theorem B. The main step is to establish an asymptotics of type (1.20) for adjoint groups by using results of Vainberg [35]. Consider operators $U'(t)$, $U_0'(t)$ in the complex space $H' = L^2(\mathbb{R}^n)\oplus H^1(\mathbb{R}^n)$ (see (2.12)). The energy conservation for the KGE implies the following corollary:
Corollary 10.1. There exists a constant $C>0$ such that $\forall\Psi\in H'$:
$$\|U_0'(t)\Psi\|_{H'}\le C\|\Psi\|_{H'}, \qquad \|U'(t)\Psi\|_{H'}\le C\|\Psi\|_{H'}, \qquad t\in\mathbb{R}. \tag{10.1}$$
Lemma 10.3 below develops earlier results [35, Thms. 3, 4, 5]. Consider a family of finite seminorms in $H'$,
$$\|\Psi\|_{(R)}^2 = \int_{|x|\le R}\big(|\Psi_0(x)|^2+|\Psi_1(x)|^2+|\nabla\Psi_1(x)|^2\big)\,dx, \qquad R>0.$$
Denote by $H'_{(R)}$ the subspace of functions from $H'$ with a support in the ball $B_R$.
Definition 10.2. $H'_c$ denotes the space $\cup_{R>0}H'_{(R)}$ endowed with the following convergence: a sequence $\Psi_n$ converges to $\Psi$ in $H'_c$ iff $\exists R>0$ such that all $\Psi_n\in H'_{(R)}$, and $\Psi_n$ converge to $\Psi$ in the norm $\|\cdot\|_{(R)}$.
Below, we speak of continuity of maps from $H'_c$ in the sense of sequential continuity. Given $t\ge0$, denote
$$\varepsilon(t) = \begin{cases}(t+1)^{-3/2}, & n\ge3,\\ (t+1)^{-1}\ln^{-2}(t+2), & n=2.\end{cases} \tag{10.2}$$
Lemma 10.3. Let Assumptions E1–E3 hold, and $n\ge2$. Then for any $R, R_0>0$ there exists a constant $C = C(R,R_0)$ such that for $\Psi\in H'_{(R)}$,
$$\|U'(t)\Psi\|_{(R_0)}\le C\varepsilon(t)\|\Psi\|_{(R)}, \qquad t\ge0. \tag{10.3}$$
This lemma has been proved in [21] by using Conditions E1–E3 and a method developed in [36]. For the proof, the contour of integration in the k-plane from [35] had to be curved logarithmically at infinity as in [36], but should not be chosen parallel to the real axis. The main result of this section is Theorem 10.4 below. Given $t\ge0$, set
$$\varepsilon_1(t) = \begin{cases}(t+1)^{-1/2}, & n\ge3,\\ \ln^{-1}(t+2), & n=2.\end{cases} \tag{10.4}$$
Theorem 10.4. Let Assumptions E1–E3 and S0–S3 hold, and $n\ge2$. Then there exist linear continuous operators $W, r(t): H'_c\to H'$ such that for $\Psi\in H'_c$,
$$U'(t)\Psi = U_0'(t)W\Psi + r(t)\Psi, \qquad t\ge0, \tag{10.5}$$
and the following bounds hold $\forall R>0$ and $\Psi\in H'_{(R)}$:
$$\|r(t)\Psi\|_{H'}\le C(R)\varepsilon_1(t)\|\Psi\|_{(R)}, \qquad t\ge0, \tag{10.6}$$
$$E|\langle Y_0, r(t)\Psi\rangle|^2\le C(R)\varepsilon_1^2(t)\|\Psi\|_{(R)}^2, \qquad t\ge0. \tag{10.7}$$
Proof. We apply the standard Cook method: see, e.g., [29, Thm. XI.4]. Fix $\Psi\in H'_{(R)}$ and define $W\Psi$, formally, as
$$W\Psi = \lim_{t\to\infty}U_0'(-t)U'(t)\Psi = \Psi + \int_0^\infty\frac{d}{dt}U_0'(-t)U'(t)\Psi\,dt.$$
We have to prove the convergence of the integral in norm in space $H'$. First, observe that
$$\frac{d}{dt}U_0'(t)\Psi = A_0'U_0'(t)\Psi, \qquad \frac{d}{dt}U'(t)\Psi = A'U'(t)\Psi,$$
where $A_0'$ and $A'$ are the generators to groups $U_0'(t)$, $U'(t)$, respectively. Similarly to (7.5), we have
$$A' = \begin{pmatrix}0 & A\\ 1 & 0\end{pmatrix}, \tag{10.8}$$
where $A = \sum_{j=1}^n(\partial_j - iA_j)^2 - m^2$. Therefore,
$$\frac{d}{dt}U_0'(-t)U'(t)\Psi = U_0'(-t)(A'-A_0')U'(t)\Psi. \tag{10.9}$$
Now (10.8) and (7.5) imply
$$A'-A_0' = \begin{pmatrix}0 & L\\ 0 & 0\end{pmatrix}.$$
Furthermore, E2 implies that $L = \sum_{j=1}^n(\partial_j-iA_j)^2 - \Delta$ is a first order partial differential operator with the coefficients vanishing for $|x|\ge R_0$. Thus, (10.1) and (10.3) imply that
$$\begin{aligned}\|U_0'(-t)(A'-A_0')U'(t)\Psi\|_{H'} &\le C\|(A'-A_0')U'(t)\Psi\|_{H'} = C\big\|\big((A'-A_0')U'(t)\Psi\big)_0\big\|_{L^2(B_{R_0})}\\ &\le C_1\big\|\big(U'(t)\Psi\big)_1\big\|_{H^1(B_{R_0})} \le C(R)\varepsilon(t)\|\Psi\|_{(R)}, \qquad t\ge0.\end{aligned} \tag{10.10}$$
Hence (10.9) implies
$$\int_s^\infty\Big\|\frac{d}{dt}U_0'(-t)U'(t)\Psi\Big\|_{H'}dt \le C(R)\varepsilon_1(s)\|\Psi\|_{(R)}, \qquad s\ge0. \tag{10.11}$$
Therefore, (10.5) and (10.6) follow by (10.1). It remains to prove (10.7). First, similarly to (7.16),
$$E\langle Y_0, r(t)\Psi\rangle^2 = \langle q_0(x-y),\; r(t)\Psi(x)\otimes r(t)\Psi(y)\rangle. \tag{10.12}$$
Therefore, the Schur Lemma implies (similarly to (7.21))
$$E\langle Y_0,r(t)\Psi\rangle^2 \le \|q_0\|_{L^1}\,\|r(t)\Psi\|_{L^2}\,\|r(t)\Psi\|_{L^2}, \tag{10.13}$$
where the norms $\|\cdot\|_{L^p}$ have an obvious meaning. Finally, (10.6) implies for $\Psi\in H'_{(R)}$,
$$\|r(t)\Psi\|_{L^2}\le C\|r(t)\Psi\|_{H'}\le C(R)\varepsilon_1(t)\|\Psi\|_{(R)}. \tag{10.14}$$
Therefore, (10.7) follows from (10.13) since $\|q_0\|_{L^1}<\infty$ by (6.1). □
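The passage in the proof from the local decay rate $\varepsilon(t)$ of (10.2) to the rate $\varepsilon_1(s)$ of (10.4) is an elementary tail integration. As a sketch, with constants that are not optimized and are not taken from the paper:

```latex
\begin{align*}
% n >= 3: direct integration
\int_s^\infty (t+1)^{-3/2}\,dt &= 2\,(s+1)^{-1/2} = 2\,\varepsilon_1(s), \\
% n = 2: use (t+1)^{-1} <= 2 (t+2)^{-1} for t >= 0, and the primitive -1/\ln u of 1/(u \ln^2 u)
\int_s^\infty \frac{dt}{(t+1)\ln^{2}(t+2)}
  &\le 2\int_{s+2}^\infty \frac{du}{u\,\ln^{2}u}
   = \frac{2}{\ln(s+2)} = 2\,\varepsilon_1(s).
\end{align*}
```

In both cases $\int_s^\infty \varepsilon(t)\,dt \le 2\,\varepsilon_1(s)$ for $s \ge 0$, which is the only property of $\varepsilon$ and $\varepsilon_1$ needed for the tail bound on the Cook integral.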
11. Convergence to Equilibrium for Variable Coefficients
The assertion of Theorem A follows from two propositions below:
Proposition 11.1. The family of the measures $\{\mu_t, t\in\mathbb{R}\}$, is weakly compact in $H^{-\varepsilon}$, $\forall\varepsilon>0$.
Proposition 11.2. For any $\Psi\in D$,
$$\hat\mu_t(\Psi)\equiv\int\exp(i\langle Y,\Psi\rangle)\,\mu_t(dY)\to\exp\Big\{-\frac12Q_\infty(W\Psi,W\Psi)\Big\}, \qquad t\to\infty. \tag{11.1}$$
We deduce these propositions from Propositions 3.2 and 3.3, respectively, with the help of Theorem 10.4.
Proof of Proposition 11.1. Similarly to Proposition 3.2, Proposition 11.1 follows from the bounds
$$\sup_{t\ge0}E\|U(t)Y_0\|_R<\infty, \qquad R>0. \tag{11.2}$$
For the proof, write the solution to (1.3) in the form
$$u(x,t) = v(x,t)+w(x,t). \tag{11.3}$$
Here $v(x,t)$ is the solution to (3.1), and $w(x,t)$ is the solution to the following Cauchy problem:
$$\begin{cases}\ddot w(x,t) = \sum_{k=1}^n(\partial_k-iA_k(x))^2w(x,t) - m^2w(x,t) - \sum_{k=1}^n2iA_k(x)\partial_kv(x,t) - \sum_{k=1}^n\big(i\partial_kA_k(x)+A_k^2(x)\big)v(x,t),\\ w|_{t=0}=0, \quad \dot w|_{t=0}=0, \quad x\in\mathbb{R}^n.\end{cases} \tag{11.4}$$
Then (11.3) implies
$$E\|U(t)Y_0\|_R\le E\|U_0(t)Y_0\|_R+E\|(w(\cdot,t),\dot w(\cdot,t))\|_R. \tag{11.5}$$
By Proposition 3.1 we have
$$\sup_{t\ge0}E\|U_0(t)Y_0\|_R<\infty. \tag{11.6}$$
It remains to estimate the second term in the right-hand side of (11.5). The Duhamel representation for the solution to (11.4) gives
$$(w,\dot w) = \int_0^tU(t-s)(0,\psi(\cdot,s))\,ds, \tag{11.7}$$
where $\psi(x,s) = -2i\sum_{k=1}^nA_k(x)\partial_kv(x,s)-\sum_{k=1}^n\big(i\partial_kA_k(x)+A_k^2(x)\big)v(x,s)$. Assumption E2 implies that $\operatorname{supp}\psi(\cdot,s)\subset B_{R_0}$. Moreover,
$$\|(0,\psi(\cdot,s))\|_{R_0}\le C\|v(\cdot,s)\|_{H^1(B_{R_0})}\le C\|U_0(s)Y_0\|_{R_0}. \tag{11.8}$$
The decay estimates of type (10.3) hold for the group $U(t)$, as well as for $U'(t)$, as both groups correspond to the same equation by Lemma 7.1. Hence, we have from (11.8),
$$\|U(t-s)(0,\psi(\cdot,s))\|_R\le C(R)\varepsilon(t-s)\|(0,\psi(\cdot,s))\|_{R_0}\le C_1(R)\varepsilon(t-s)\|U_0(s)Y_0\|_{R_0}, \tag{11.9}$$
where $\varepsilon(\cdot)$ is defined in (10.2). Therefore, (11.7) and (11.6) imply
$$E\|(w(\cdot,t),\dot w(\cdot,t))\|_R\le C(R)\int_0^t\varepsilon(t-s)\,E\|U_0(s)Y_0\|_{R_0}\,ds\le C_2(R)<\infty, \qquad t\ge0. \tag{11.10}$$
Then (11.6) and (11.5) imply (11.2). □
Proof of Proposition 11.2. Equations (10.5) and (10.7) imply by Cauchy–Schwarz,
$$|E\exp i\langle U(t)Y_0,\Psi\rangle - E\exp i\langle Y_0,U_0'(t)W\Psi\rangle|\le E|\langle Y_0,r(t)\Psi\rangle|\le\big(E|\langle Y_0,r(t)\Psi\rangle|^2\big)^{1/2}\to0, \qquad t\to\infty.$$
It remains to prove that
$$E\exp i\langle Y_0,U_0'(t)W\Psi\rangle\to\exp\Big\{-\frac12Q_\infty(W\Psi,W\Psi)\Big\}, \qquad t\to\infty. \tag{11.11}$$
This does not follow directly from Proposition 3.3 since generally, $W\Psi\notin D$. We approximate $W\Psi$ by functions from $D$. $W\Psi\in H'$, and $D$ is dense in $H'$. Hence, for any $V>0$ there exists $K\in D$ such that
$$\|W\Psi-K\|_{H'}\le V. \tag{11.12}$$
Therefore, we can derive (11.11) by the triangle inequality
$$\begin{aligned}
\Big|E\exp i\langle Y_0,U_0'(t)W\Psi\rangle - \exp\Big\{-\frac12Q_\infty(W\Psi,W\Psi)\Big\}\Big| &\le |E\exp i\langle Y_0,U_0'(t)W\Psi\rangle - E\exp i\langle Y_0,U_0'(t)K\rangle| \\
&\quad + \Big|E\exp i\langle U_0(t)Y_0,K\rangle - \exp\Big\{-\frac12Q_\infty(K,K)\Big\}\Big| \\
&\quad + \Big|\exp\Big\{-\frac12Q_\infty(K,K)\Big\} - \exp\Big\{-\frac12Q_\infty(W\Psi,W\Psi)\Big\}\Big|.
\end{aligned} \tag{11.13}$$
Applying Cauchy–Schwarz, we get, similarly to (10.12)–(10.14), that
$$E|\langle Y_0,U_0'(t)(W\Psi-K)\rangle|\le\big(E|\langle Y_0,U_0'(t)(W\Psi-K)\rangle|^2\big)^{1/2}\le C\|U_0'(t)(W\Psi-K)\|_{H'}.$$
Hence, (10.1) and (11.12) imply
$$E|\langle Y_0,U_0'(t)(W\Psi-K)\rangle|\le CV, \qquad t\ge0. \tag{11.14}$$
Now we can estimate each term on the right-hand side of (11.13). The first term is $O(V)$ uniformly in $t>0$ by (11.14). The second term converges to zero as $t\to\infty$ by Proposition 3.3 since $K\in D$. Finally, the third term is $O(V)$ owing to (11.12) and the continuity of the quadratic form $Q_\infty(\Psi,\Psi)$ in $L^2(\mathbb{R}^n)\otimes\mathbb{C}^2$. The continuity follows from the Schur Lemma since the integral kernels $q_\infty^{ij}(z)\in L^1(\mathbb{R}^n)\otimes M^2$ by Corollary 6.3. Now the convergence in (11.11) follows since $V>0$ is arbitrary. □
12. Appendix A. Fourier Transform Calculations
Consider the covariance functions of the solutions to the system (3.2). Let $F: w\mapsto\hat w$ denote the FT of a tempered distribution $w\in S'(\mathbb{R}^n)$ (see, e.g. [13]). We also use this notation for vector- and matrix-valued functions.
12.1. Dynamics in the FT space. In the FT representation, the system (3.2) becomes $\dot{\hat Y}(k,t) = \hat A_0(k)\hat Y(k,t)$, hence
$$\hat Y(k,t) = \hat{\mathcal G}_t(k)\hat Y_0(k), \qquad \hat{\mathcal G}_t(k) = \exp(\hat A_0(k)t). \tag{12.1}$$
Here we denote
$$\hat A_0(k) = \begin{pmatrix}0&1\\-|k|^2-m^2&0\end{pmatrix}, \qquad \hat{\mathcal G}_t(k) = \begin{pmatrix}\cos\omega t&\dfrac{\sin\omega t}{\omega}\\-\omega\sin\omega t&\cos\omega t\end{pmatrix}, \tag{12.2}$$
where $\omega=\omega(k)=\sqrt{|k|^2+m^2}$.
12.2. Covariance matrices in the FT space.
Lemma 12.1. In the sense of matrix-valued distributions,
$$q_t(x-y) := E\big(Y(x,t)\otimes Y(y,t)\big) = F^{-1}_{k\to x-y}\,\hat{\mathcal G}_t(k)\,\hat q_0(k)\,\hat{\mathcal G}_t'(k), \qquad t\in\mathbb{R}. \tag{12.3}$$
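Proof. Translation invariance (1.8) implies
$$E\big(Y_0(x)\otimes_C Y_0(y)\big) = C_0^+(x-y), \qquad E\big(Y_0(x)\otimes_C\overline{Y_0(y)}\big) = C_0^-(x-y), \tag{12.4}$$
where $\otimes_C$ stands for the tensor product of complex vectors. Therefore,
$$\begin{aligned}E\big(\hat Y_0(k)\otimes_C\hat Y_0(k')\big) &= F_{x\to k}F_{y\to k'}C_0^+(x-y) = (2\pi)^n\delta(k+k')\,\hat C_0^+(k),\\ E\big(\hat Y_0(k)\otimes_C\overline{\hat Y_0(k')}\big) &= F_{x\to k}F_{y\to-k'}C_0^-(x-y) = (2\pi)^n\delta(k-k')\,\hat C_0^-(k).\end{aligned} \tag{12.5}$$
Now (12.1) and (12.2) give in matrix notation that
$$\begin{aligned}E\big(\hat Y(k,t)\otimes_C\hat Y(k',t)\big) &= (2\pi)^n\delta(k+k')\,\hat{\mathcal G}_t(k)\,\hat C_0^+(k)\,\hat{\mathcal G}_t'(k),\\ E\big(\hat Y(k,t)\otimes_C\overline{\hat Y(k',t)}\big) &= (2\pi)^n\delta(k-k')\,\hat{\mathcal G}_t(k)\,\hat C_0^-(k)\,\hat{\mathcal G}_t'(k).\end{aligned} \tag{12.6}$$
Therefore, by the inverse FT formula we get
$$\begin{aligned}E\big(Y(x,t)\otimes_C Y(y,t)\big) &= F^{-1}_{k\to x-y}\,\hat{\mathcal G}_t(k)\,\hat C_0^+(k)\,\hat{\mathcal G}_t'(k),\\ E\big(Y(x,t)\otimes_C\overline{Y(y,t)}\big) &= F^{-1}_{k\to x-y}\,\hat{\mathcal G}_t(k)\,\hat C_0^-(k)\,\hat{\mathcal G}_t'(k).\end{aligned} \tag{12.7}$$
Then (12.3) follows by linearity. □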
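As a sanity check on the closed form of the propagator in (12.2), one can compare it with a truncated Taylor series of the matrix exponential in (12.1). The sketch below is illustrative only: the helper names, the sample values of $|k|$, $m$, $t$, and the number of series terms are all assumptions made for the example, and plain 2×2 lists are used to avoid external dependencies.

```python
import math


def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][l] * b[l][j] for l in range(2)) for j in range(2)]
            for i in range(2)]


def exp_series(a, terms=60):
    """Truncated Taylor series sum_{j < terms} a^j / j! for a 2x2 matrix."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]
    for j in range(1, terms):
        power = mat_mul(power, a)
        power = [[x / j for x in row] for row in power]  # now equals a^j / j!
        for r in range(2):
            for c in range(2):
                result[r][c] += power[r][c]
    return result


def generator_times_t(k_norm, m, t):
    """The matrix A_0(k) * t, with A_0(k) = [[0, 1], [-(|k|^2 + m^2), 0]]."""
    return [[0.0, t], [-(k_norm ** 2 + m ** 2) * t, 0.0]]


def propagator(k_norm, m, t):
    """Closed form of exp(A_0(k) t) as in (12.2)."""
    omega = math.sqrt(k_norm ** 2 + m ** 2)
    c, s = math.cos(omega * t), math.sin(omega * t)
    return [[c, s / omega], [-omega * s, c]]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    series = exp_series(generator_times_t(1.5, 1.0, 2.0))
    print(max_abs_diff(series, propagator(1.5, 1.0, 2.0)))
```

The same helpers also allow a check of the group property $\hat{\mathcal G}_{t+s} = \hat{\mathcal G}_t\hat{\mathcal G}_s$, which reduces to the addition formulas for sine and cosine.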
13. Appendix B. Measures in Sobolev’s Spaces
Here we formally verify the bound (4.5) for $s,\alpha<-n/2$. Definition (4.2) implies for $u\in H^{s,\alpha}$,
$$\|u\|^2_{s,\alpha} = \frac{1}{(2\pi)^{2n}}\int\langle x\rangle^{2\alpha}e^{-ix(k-k')}\langle k\rangle^s\langle k'\rangle^s\,\hat u(k)\,\overline{\hat u(k')}\,dk\,dk'\,dx. \tag{13.1}$$
Let $\mu(du)$ be a translation-invariant measure in $H^{s,\alpha}$ with a CF $Q(x,y) = q(x-y)$. Similarly to (12.5), (12.4), we get
$$\int\hat u(k)\,\overline{\hat u(k')}\,\mu(du) = (2\pi)^n\delta(k-k')\operatorname{tr}\hat q(k). \tag{13.2}$$
Then, integrating (13.1) with respect to the measure $\mu(du)$, we get the formula
$$\int\|u\|^2_{s,\alpha}\,\mu(du) = \frac{1}{(2\pi)^n}\int\langle x\rangle^{2\alpha}dx\int\langle k\rangle^{2s}\operatorname{tr}\hat q(k)\,dk. \tag{13.3}$$
Applying it to $\hat q(k) = T$ with $\alpha,s<-n/2$ and to $\hat q(k) = T(k^2+m^2)^{-1}$ with $1+s$ instead of $s$, we get (4.5).
Acknowledgements. The authors thank V. I. Arnold, A. Bensoussan, I. A. Ibragimov, H. P. McKean, J. Lebowitz, A. I. Shnirelman, H. Spohn, B. R. Vainberg and M. I. Vishik for fruitful discussions and remarks.
References
1. Billingsley, P.: Convergence of Probability Measures. New York, London, Sydney, Toronto: John Wiley, 1968
2. Boldrighini, C., Dobrushin, R.L., Sukhov, Yu.M.: Time asymptotics for some degenerate models of evolution of systems with an infinite number of particles. Technical Report, University of Camerino, 1980
3. Boldrighini, C., Pellegrinotti, A., Triolo, L.: Convergence to stationary states for infinite harmonic systems. J. Stat. Phys. 30, 123–155 (1983)
4. Botvich, D.D., Malyshev, V.A.: Unitary equivalence of temperature dynamics for ideal and locally perturbed fermi-gas. Commun. Math. Phys. 91, no. 4, 301–312 (1983)
5. Bournaveas, N.: Local existence for the Maxwell–Dirac equations in three space dimensions. Comm. Partial Diff. Equs. 21, no. 5–6, 693–720 (1996)
6. Bulinskii, A.V., Molchanov, S.A.: Asymptotic Gaussian property of the solution of the Burgers equation with random initial data. Theory Probab. Appl. 36, no. 2, 217–236 (1991)
7. Cornfeld, I.P., Fomin, S.V., Sinai, Ya.G.: Ergodic Theory. New York–Berlin: Springer, 1982
8. Dobrushin, R.L., Pellegrinotti, A., Suhov, Yu.M.: One-dimensional harmonic lattice caricature of hydrodynamics: A higher correction. J. Stat. Phys. 61, no. 1/2, 387–402 (1990)
9. Dobrushin, R.L., Sinai, Ya.G., Sukhov, Yu.M.: Dynamical systems of statistical mechanics. In: Dynamical Systems, Ergodic Theory and Applications, Encyclopaedia of Mathematical Sciences, V. 100. Berlin: Springer, 2000, pp. 384–431
10. Dobrushin, R.L., Suhov, Yu.M.: On the problem of the mathematical foundation of the Gibbs postulate in classical statistical mechanics. In: Mathematical Problems in Theoretical Physics, Lecture Notes in Physics, V. 80. Berlin: Springer-Verlag, 1978, pp. 325–340
11. Dudnikova, T.V., Komech, A.I., Ratanov, N.E., Suhov, Yu.M.: On convergence to equilibrium distribution. II. Wave equations with mixing. Submitted to J. Stat. Phys.
12. Dudnikova, T.V., Komech, A.I., Spohn, H.: On convergence to statistic equilibrium in two-temperature problem for wave equation with mixing. Preprint Max-Planck Institute for Mathematics in the Sciences, N. 26. Leipzig, 2000 (http://www.mis.mpg.de)
13. Egorov, Yu.V., Komech, A.I., Shubin, M.A.: Elements of the Modern Theory of Partial Differential Equations. Berlin: Springer, 1999
14. Fedoryuk, M.V.: The stationary phase method and pseudodifferential operators. Russ. Math. Surveys 26, no. 1, 65–115 (1971)
15. Gikhman, I.I., Skorokhod, A.V.: The Theory of Stochastic Processes, Vol. I. Berlin: Springer, 1974
16. Hörmander, L.: The Analysis of Linear Partial Differential Operators III: Pseudo-Differential Operators. Berlin–Heidelberg–New York: Springer-Verlag, 1985
17. Ibragimov, I.A., Linnik, Yu.V.: Independent and Stationary Sequences of Random Variables. Groningen: Wolters-Noordhoff, 1971
18. Jaksic, V., Pillet, C.-A.: Ergodic properties of classical dissipative systems. I. Acta Math. 181, no. 2, 245–282 (1998)
19. Komech, A.I.: Stabilisation of statistics in wave and Klein–Gordon equations with mixing. Scattering theory for solutions of infinite energy. Rend. Sem. Mat. Fis. Milano 65, 9–22 (1995)
20. Kopylova, E.A.: Stabilization of statistical solutions of the Klein–Gordon equation. Mosc. Univ. Math. Bull. 41, no. 2, 72–75 (1986)
21. Kopylova, E.A.: Stabilisation of Statistical Solutions of Klein–Gordon Equations. PhD Thesis, Moscow State University, 1986
22. Mikhailov, V.P.: Partial Differential Equations. Moscow: Mir, 1978
23. Morawetz, C.S., Strauss, W.A.: Decay and scattering of solutions of a nonlinear relativistic wave equation. Comm. Pure Appl. Math. 25, 1–31 (1972)
24. Petrov, V.V.: Limit Theorems of Probability Theory. Oxford: Clarendon Press, 1995
25. Planck, M.: The Theory of Heat Radiation. New York: Dover Publications, 1959
26. Ratanov, N.E.: Stabilisation of statistic solutions of second order hyperbolic equations. Russian Mathematical Surveys 39, no. 1, 179–180 (1984)
27. Ratanov, N.E., Shuhov, A.G., Suhov, Yu.M.: Stabilisation of the statistical solution of the parabolic equation. Acta Appl. Math. 22, no. 1, 103–115 (1991)
28. Reed, M., Simon, B.: Methods of Modern Mathematical Physics II: Fourier Analysis, Self-Adjointness. New York: Academic Press, 1975
29. Reed, M., Simon, B.: Methods of Modern Mathematical Physics III: Scattering Theory. New York: Academic Press, 1979
30. Rosenblatt, M.A.: A central limit theorem and a strong mixing condition. Proc. Nat. Acad. Sci. U.S.A. 42, no. 1, 43–47 (1956)
31. Seitz, F.: The Modern Theory of Solids. New York: McGraw-Hill, 1940
32. Shuhov, A.G., Suhov, Yu.M.: Ergodic properties of groups of the Bogoliubov transformations of CAR C∗-algebras. Ann. Phys. 175, 231–266 (1987)
33. Spohn, H., Lebowitz, J.L.: Stationary non equilibrium states of infinite harmonic systems. Commun. Math. Phys. 54, 97–120 (1977)
34. Sommerfeld, A.: Thermodynamics and Statistical Mechanics. New York: Academic Press, 1956
35. Vainberg, B.R.: Behaviour for large time of solutions of the Klein–Gordon equation. Trans. Moscow Math. Soc. 30, 139–158 (1976)
36. Vainberg, B.R.: Asymptotic Methods in Equations of Mathematical Physics. New York–London–Paris: Gordon and Breach, 1989
37. Vishik, M.I., Fursikov, A.V.: Mathematical Problems of Statistical Hydromechanics. Dordrecht: Kluwer Academic Publishers, 1988
Communicated by H. Spohn
Commun. Math. Phys. 225, 33 – 66 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Nonassociative Star Product Deformations for D-Brane World-Volumes in Curved Backgrounds Lorenzo Cornalba1 , Ricardo Schiappa2 1 Laboratoire de Physique Théorique, École Normale Supérieure, 75231 Paris Cedex 05, France.
E-mail:
[email protected] 2 Department of Physics, Harvard University, Cambridge, MA 02138, USA.
E-mail:
[email protected] Received: 22 March 2001 / Accepted: 13 July 2001
Abstract: We investigate the deformation of D-brane world-volumes in curved backgrounds. We calculate the leading corrections to the boundary conformal field theory involving the background fields, and in particular we study the correlation functions of the resulting system. This allows us to obtain the world-volume deformation, identifying the open string metric and the noncommutative deformation parameter. The picture that unfolds is the following: when the gauge invariant combination ω = B + F is constant one obtains the standard Moyal deformation of the brane world-volume. Similarly, when dω = 0 one obtains the noncommutative Kontsevich deformation, physically corresponding to a curved brane in a flat background. When the background is curved, H = dω ≠ 0, we find that the relevant algebraic structure is still based on the Kontsevich expansion, which now defines a nonassociative star product with an A∞ homotopy associative algebraic structure. We then recover, within this formalism, some known results of Matrix theory in curved backgrounds. In particular, we show how the effective action obtained in this framework describes, as expected, the dielectric effect of D-branes. The polarized branes are interpreted as a soliton, associated to the condensation of the brane gauge field.
Contents
1. Introduction and Summary
2. Open Strings in Parallelizable Backgrounds
3. Perturbation Theory
4. Computation of n-Point Functions
5. Nonassociative Deformations of World-volumes
6. Corrections Involving the Metric Tensor
7. Tachyons and Matrix Models in Curved Backgrounds
8. Future Perspectives
A. Dilogarithm Identities
B. Computation of the Function S (x)
1. Introduction and Summary Noncommutative quantum field theoretic limits of string theory have received considerable attention in the recent literature, and have been studied in a variety of papers (see, e.g., [1–6] and references therein). The attention is focused on a specific scaling limit, where the effects of large magnetic backgrounds are translated into Moyal noncommutative deformations of the D-brane world-volume algebra of functions. The open string physics is therefore captured within a quantum field theory (which is renormalizable, despite appearances [7, 8]). A common point to most previous investigations is that the background (sigma model) fields are taken to be constant and that, as a consequence, the target space is flat. One may then ask the natural question of what happens if the background is curved, i.e., if the background fields are no longer constant? This question received some attention in a couple of recent papers [9–12], but there is no general answer to it (other papers of interest with some relation to this subject are, e.g., [13– 16]). Our goal in this work is to address this problem in the context of a simple model with weakly curved backgrounds, which can be on one side connected to the known flat background framework, and on the other hand can be related to formal results of brane physics in WZW models, which can be analyzed exactly with conformal field theory techniques [10, 11, 17]. More concretely, the aim of this paper is to understand how the presence of a nontrivial background field affects the world-volume deformation of a D-brane. It is known that, in the presence of a constant background B-field, the physics can be exactly described either by a sigma model approach [18–25], or alternatively, by translating the background B-field into a noncommutative Moyal deformation of the brane worldvolume algebra of functions [3, 5]. 
The constant field situation represents a particular choice of background and one can ask what happens in more complicated situations. One thing to keep in mind is that (as for the Born–Infeld action [26–28]) the gauge covariant combination to consider is not B alone, but B + F, which we shall denote by ω ≡ B + F in the following. One may then consider three cases of increasing complexity: the case of constant ω, the case where dω = 0 but ω is not constant, and the most general case where dω ≠ 0 and we have NS–NS three-form flux (as dω = dB + dF = H) and a curved background. The analysis leads to the following complete picture. The first case, corresponding to constant ω, has been extensively studied in the literature where one obtains a noncommutative Moyal deformation of the brane world-volume [3–6]. The physics is by now very well understood, corresponding to a flat brane embedded in a flat background space. The second case, when ω is not constant but dω = 0, has also been studied in the literature, though to a much lesser extent. This gives the so-called Cattaneo–Felder model of [29]. One therefore obtains the natural extension of the Moyal deformation to the case of varying symplectic form, corresponding to the noncommutative Kontsevich star product deformation of the brane world-volume algebra of functions [30]. This situation corresponds to the embedding of a curved brane in a flat background space. These configurations have also been studied from the point of view of BPS membranes in Matrix theory, where the varying F-field physically corresponds to a varying density of zero-branes over a curved membrane [31, 32]. Finally, the general case where dω ≠ 0 is the main subject of this paper. One no longer has a symplectic form and apparently no obvious definition of a star product, which usually comes from a given Poisson structure on the world-volume of the D-brane. 
In this general situation, we will find that the world-volume algebra of functions is deformed to an algebra which is not only noncommutative, but also nonassociative. One interesting point we shall uncover is that this
nonassociative star product can still be defined using Kontsevich’s formula [30]. Therefore, the nonassociativity can be traced, thanks to Kontsevich’s formality formulae, to the Schouten–Nijenhuis bracket of ω−1 with itself, which is proportional to the NS–NS field strength dω = H [30, 29]. These nonassociative algebras have the structure of an A∞ homotopy associative algebra (see, e.g., [33–35]) which have previously received some attention in the string field theory literature since they are the natural algebras that appear in general open–closed string field theories [36, 37]. Our approach in this paper will rely on a perturbative calculation of n-point functions on the disk, using the background field method applied to open string theory [18, 19, 22, 23]. The background fields are expanded in Taylor series, and the derivative terms that appear are treated as new interactions, which we treat in a perturbative expansion. This allows us to obtain the open string parameters, metric G and deformation θ , generalizing the results in [23, 3, 5]. It also allows us to identify the star product deformations, as described in the previous paragraph. We begin, in Sect. 2, by describing the specific closed string backgrounds which we shall consider in this paper. These will be the class of parallelizable manifolds, exact background solutions for closed string theory [20]. Then, in Sect. 3, we shall describe in detail the perturbation theory on the disk for open strings in these curved backgrounds, i.e., we will study the new interaction vertices due to the curvature terms. In particular, we present the general methods that we then use in Sect. 4 for the calculation of n-point functions on the disk, with particular emphasis on the conformal properties of these disk correlators. These correlators also yield the open string parameters and the nonassociative Kontsevich star product. 
Section 5 includes a brief summary of the different situations and the different world-volume deformations and star products, which can be read directly by the reader who wishes to skip the calculations in the preceding sections. It also describes in some detail the concept of a nonassociative star product deformation, which could be a topic of great interest for future research. Most of the previous treatment is done in a particular α′ → 0 scaling limit [5], where the closed string metric, g, scales to zero. In Sect. 6 we move away from this limit and compute corrections to the previous results which explicitly depend on the closed string metric. These calculations yield the formulas relating open and closed string parameters. It is interesting to observe that the final answer is a simple generalization of the flat background results of [23, 3, 5]. In Sect. 7 we make contact with previous results and in particular we describe, within our formalism, the dielectric effect of D-branes [38] in these curved backgrounds. Indeed, these solutions describing polarization of lower dimensional branes, obtained first in [38] and then further studied in different situations involving D-branes and fundamental strings in R–R or NS–NS backgrounds by, e.g., [39–45], are now reinterpreted, dually, as an instability of the space filling brane, which condenses to a lower dimensional brane. This is accomplished by first studying the relation between the partition function (the correlators we computed in the earlier sections) and the effective action. Once this connection is made (using boundary string field theory arguments), we obtain the usual matrix action in the presence of an H-field, and we can then use the previous results on the subject. Finally, we discuss in the concluding sections how further studies of these nonassociative geometries could lead to a proper definition of Matrix theory [46–50] in a general curved background. 
These nonassociative geometries could provide the proper framework to generalize the arguments in [51–53] and the weak field calculations of [54, 55] in order to build the matrix theory action in a general curved target space.
L. Cornalba, R. Schiappa
2. Open Strings in Parallelizable Backgrounds

The physics of a string propagating in a curved background is conveniently described in terms of a nonlinear sigma model. In the presence of a background metric $g_{ab}(x)$ and NS–NS 2-form field $B_{ab}(x)$, the action which governs the motion of the string is given by [18–20, 22, 23]
\[ S = \frac{1}{4\pi\alpha'} \int_\Sigma g_{ab}(X)\, dX^a \wedge * dX^b + \frac{i}{4\pi\alpha'} \int_\Sigma B_{ab}(X)\, dX^a \wedge dX^b , \tag{1} \]
where $\Sigma$ is the string world-volume. Moreover, when considering open strings one can include boundary interactions on $\partial\Sigma$. In the sequel, we will mainly focus on the coupling to the U(1) gauge field $A_a(x)$, given by
\[ S_B = i \int_{\partial\Sigma} A_a(X)\, dX^a . \]
In this paper we will consider only the physics at weak string coupling, and we will consequently assume $\Sigma$ to have the topology of a disk. Other background fields (such as the dilaton) will not play a role in our subsequent analysis. We shall mainly address maximal branes, though our results are completely general. Also, from now on, we will work in units such that $2\pi\alpha' = 1$.

The action (1) is written in a generic coordinate system $x^a$ in spacetime. On the other hand, in order to use (1) to compute correlators in perturbation theory, it is natural to follow the standard techniques of the background field method and use coordinates $x^a$ which are Riemann normal coordinates at the origin, i.e., defined using geodesic paths in target space which start at $x^a = 0$ [18, 20]. We recall that the main advantage of this choice is that the Taylor series expansion of any tensor around $x^a = 0$ is explicitly given in terms of covariant tensors evaluated at the origin. In particular one has, up to quadratic order in the coordinates,
\[ g_{ab}(x) = g_{ab} - \frac{1}{3} R_{acbd}\, x^c x^d + \cdots . \tag{2} \]
Let us now consider the expansion of the NS–NS 2-form field, by first recalling that we have some gauge freedom in the definition of $B_{ab}(x)$. In fact, the transformations
\[ B \to B + d\Lambda, \qquad A \to A - \Lambda \]
leave the total action $S + S_B$ invariant, and we can use this freedom to impose the following (radial) gauge$^1$:
\[ x^a B_{ab}(x) = x^a B_{ab}(0) . \]
One can explicitly solve the above equation in terms of the NS–NS three-form field strength $H = dB$, and obtain
\[ B_{ab}(x) = B_{ab} + x^c \int_0^1 s^2 H_{abc}(sx)\, ds . \]

1 Given a generic field $B_{ab}(x)$, we can consider the gauge transformation parameter $\Lambda_a(x)$ given by $\Lambda_a(x) = x^b \int_0^1 s\, B_{ab}(sx)\, ds$. It is then a simple computation to see that the combination $\partial_a \Lambda_b - \partial_b \Lambda_a$ equals $-B_{ab}(x) + x^c \int_0^1 s^2 H_{abc}(sx)\, ds$.
Nonassociative Star Product Deformations
Therefore, the normal coordinate expansion for the field $B_{ab}$ is explicitly given by
\[ B_{ab}(x) = B_{ab} + \frac{1}{3} H_{abc}\, x^c + \frac{1}{4} \nabla_d H_{abc}\, x^c x^d + \cdots . \tag{3} \]
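The numerical coefficients 1/3 and 1/4 in (3) come from the radial-gauge integral $x^c \int_0^1 s^2 H_{abc}(sx)\, ds$: expanding $H_{abc}(sx)$ in powers of $s$, the term with $k$ derivatives picks up $\int_0^1 s^{k+2} ds / k! = 1/(k!\,(k+3))$. The following minimal sketch tabulates these coefficients with exact rational arithmetic. The function name is hypothetical, and the identification of ordinary with covariant derivatives is assumed to hold in normal coordinates only to the order displayed in (3).

```python
from fractions import Fraction
from math import factorial


def radial_gauge_coefficient(k: int) -> Fraction:
    """Coefficient of the k-derivative term of H in the expansion of B_ab(x).

    Inserting the Taylor series of H_abc(s x) in powers of s into
    x^c * int_0^1 s^2 H_abc(s x) ds gives int_0^1 s^(k+2) ds / k!,
    that is 1 / (k! * (k + 3)).
    """
    return Fraction(1, factorial(k) * (k + 3))


# k = 0 and k = 1 are the two coefficients written out in (3).
coefficients = [radial_gauge_coefficient(k) for k in range(3)]
print(coefficients)
```

The first two entries are the coefficients appearing in (3); the third is what the same integral would give at the next order, under the stated assumption.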
Using the expressions (2) and (3), one can expand (1) about the classical constant background $\partial X^a = 0$ and obtain
\[ S = S_0 + S_1 + \cdots , \tag{4} \]
where $S_n$ contains $n + 2$ powers of the coordinate fields $X^a$ and where, in particular,
\[ S_0 = \frac{1}{2} \int_\Sigma g_{ab}\, dX^a \wedge * dX^b + \frac{i}{2} \int_\Sigma B_{ab}\, dX^a \wedge dX^b , \qquad S_1 = \frac{i}{6} \int_\Sigma H_{abc}\, X^a\, dX^b \wedge dX^c . \tag{5} \]
In this paper, we will be primarily interested in the effects of the term $S_1$, which describes a small curved deviation from the flat closed string background. Let us elaborate more on this point. To leading order in $\alpha'$, the beta function equations which describe consistent closed string backgrounds read [18, 20]:
\[ R_{ab} = \frac{1}{4} H_{acd} H_b{}^{cd} , \qquad \nabla^a H_{abc} = 0 . \tag{6} \]
If we work to first order in $H$, one may then neglect the presence of curvature coming from the metric and only consider the effects of $H$ coming from (5).

We can actually make these arguments more systematic if we consider a general class of conjectured solutions to the beta function equations, called parallelizable manifolds [20]. These configurations are characterized by the following properties. First of all, the tensor $H_{abc}$ is covariantly constant, $\nabla_a H_{bcd} = 0$. Moreover, if we consider the generalized connection $\Gamma + \frac{1}{2} H$, then the corresponding curvature tensor,
\[ \mathcal{R}_{abcd} = R_{abcd} + \frac{1}{2} \nabla_a H_{bcd} - \frac{1}{2} \nabla_b H_{acd} + \frac{1}{4} H_{ade} H_{bc}{}^{e} - \frac{1}{4} H_{ace} H_{bd}{}^{e} , \]
must vanish. Using the fact that $R_{a[bcd]} = 0$, one can easily show that the field $H_{abc}$ must satisfy a Jacobi identity, in the sense that
\[ H_{abe} H_{cd}{}^{e} + \mathrm{cyclic}_{abc} = 0 . \]
These facts then imply
\[ R_{abcd} = \frac{1}{4} H_{abe} H_{cd}{}^{e} , \]
and therefore (6). Moreover, at a more fundamental level, it was explicitly shown that when the target is parallelizable, the string sigma model is ultra-violet finite to two loops, with vanishing beta functions [20]. It was moreover suggested that this holds true to higher orders for the superstring, and one thus has a consistent solution of closed string theory [20].

In the parallelizable situation the expansion (4) drastically simplifies. In the sequel we shall only need the explicit forms of $S_0$ and $S_1$ given above. On the other hand, in order to extend the results of this paper to higher order in $H$, one needs the expressions of $S_n$ for $n \ge 2$. We include, for completeness, the first of these terms, explicitly given by:
\[ S_2 = -\frac{1}{24} H_{abe} H_{cd}{}^{e} \int_\Sigma X^a X^c\, dX^b \wedge * dX^d . \]

3. Perturbation Theory

In the last section we have reviewed the general form of the sigma model action which describes open string dynamics in curved backgrounds. From now on we shall only consider backgrounds which are weakly curved. More precisely we will work, for the rest of the paper, to leading order in the background field $H$, and consequently we shall focus our analysis on the action $S_0 + S_1 + S_B$. If we denote with $F = dA$ the U(1) field strength, and with $\omega$ the symplectic structure
\[ \omega_{ab}(x) = B_{ab} + F_{ab}(x) , \tag{7} \]
then the relevant action is given by
\[ \frac{1}{2} \int_\Sigma g_{ab}\, dX^a \wedge * dX^b + i \int_\Sigma \omega + \frac{i}{6} \int_\Sigma H_{abc}\, X^a\, dX^b \wedge dX^c . \tag{8} \]
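As a concrete illustration of the Jacobi-type identity $H_{abe} H_{cd}{}^{e} + \mathrm{cyclic}_{abc} = 0$ required of parallelizable backgrounds in Sect. 2, the sketch below checks it for a three-dimensional example with $H_{abc} = h\, \epsilon_{abc}$. This choice (the structure constants of SU(2), with indices raised by a flat orthonormal frame metric) is an assumption made for illustration and is not taken from the text; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def levi_civita(a: int, b: int, c: int) -> int:
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


def h_tensor(h: int):
    """H_abc = h * epsilon_abc (illustrative assumption, flat frame indices)."""
    return {(a, b, c): h * levi_civita(a, b, c)
            for a, b, c in product(range(3), repeat=3)}


def jacobi_defect(H, a, b, c, d) -> int:
    """H_abe H_cde + H_bce H_ade + H_cae H_bde, summed over e."""
    return sum(H[a, b, e] * H[c, d, e] + H[b, c, e] * H[a, d, e]
               + H[c, a, e] * H[b, d, e] for e in range(3))


def ricci_from_H(H):
    """R_ab = (1/4) H_acd H_bcd, the first equation in (6)."""
    return [[Fraction(1, 4) * sum(H[a, c, d] * H[b, c, d]
                                  for c in range(3) for d in range(3))
             for b in range(3)] for a in range(3)]


H = h_tensor(2)
jacobi_holds = all(jacobi_defect(H, a, b, c, d) == 0
                   for a, b, c, d in product(range(3), repeat=4))
ricci = ricci_from_H(H)
print(jacobi_holds, ricci[0][0], ricci[0][1])
```

In this example the Ricci tensor obtained from (6) is proportional to the frame metric, as one expects for a round three-sphere.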
Before we start the detailed discussion of the perturbation theory for the action (8), and in order to set the stage and motivate the subsequent results, let us begin by recalling some known facts which are valid in the flat space limit of $H_{abc} = 0$. On one side, the conventional approach to open string physics starts by considering the simple free action $\frac{1}{2} \int_\Sigma g_{ab}\, dX^a \wedge * dX^b$, or even the full free action $S_0$. One then analyzes the physics of boundary interactions by considering the coupling $S_B$ to the U(1) gauge field, $A$, and one treats (following, for example, the approaches in [19, 22, 23]) the interactions perturbatively in $F = dA$. In this scheme the basic interaction vertex with $n$ external legs involves $n - 2$ derivatives of $F$, and the perturbation theory quickly becomes unmanageable as soon as one considers rapidly varying gauge fields. It was noted, on the other hand, in [29] that, if one considers the simple topological action $i \int_\Sigma \omega$ (that is, one looks at (8) in the limit $g_{ab}, H_{abc} \to 0$), then the resulting path integral drastically simplifies. In fact, if one considers the $n$-point function of $n$ generic functions $f_1(x), \dots, f_n(x)$, placed cyclically on the boundary $\partial\Sigma$ of the string world-volume, one obtains the simple result (independently of the moduli of the insertion points) [29, 5]:
\[ \langle f_1 \cdots f_n \rangle = \int V(\omega)\, dx\, ( f_1 \,\#\, \cdots \,\#\, f_n ) . \tag{9} \]
In the above, $\#$ is the associative Kontsevich star product$^2$ with respect to the Poisson structure $\alpha = \omega^{-1}$,
\[ f \,\#\, g = f \cdot g + \frac{i}{2} \alpha^{ab}\, \partial_a f\, \partial_b g + \cdots , \tag{10} \]
2 The terms hidden behind the dots · · · in (10) are given by explicit diagrammatic expressions, as explained in [30], valid for any bi-vector field α ab (x) in terms of the functions f , g, the tensor α ab and their derivatives. If α −1 is closed, then the corresponding product is associative.
and $V(\omega) = \sqrt{\det \omega}\, (1 + \cdots)$ is a volume form$^3$ such that $V(\omega)\, dx$ acts as a trace for the product $\#$. The basic point we would like to stress is that the product (10) contains derivatives of $\alpha$ (and therefore of $F$) to all orders, and is therefore valid for arbitrary gauge field configurations. This means that the perturbation theory in $A_a$ becomes tractable to all orders when $g_{ab} \to 0$, and is conveniently described in terms of the algebraic operation $\#$.

We shall see in this paper that, when one introduces the perturbation $S_1$ but still considers the limit $g_{ab} \to 0$, then one can still re-sum the perturbation theory to all orders in $A_a$. We will see that the relevant algebraic structure is still given by a Kontsevich product of the general form (10), but now with $\omega$ replaced in a natural way by the gauge invariant combination
\[ \tilde\omega_{ab}(x) = \omega_{ab}(x) + \frac{1}{3} H_{abc}\, x^c = B_{ab}(x) + F_{ab}(x) , \]
and with $\alpha$ replaced by $\tilde\alpha = \tilde\omega^{-1}$. In order to clearly distinguish the two cases, we shall denote this second product (relative to $\tilde\alpha$) with $\bullet$, given by the usual Kontsevich expansion,
\[ f \bullet g = f \cdot g + \frac{i}{2} \tilde\alpha^{ab}\, \partial_a f\, \partial_b g + \cdots . \]
The two-form $\tilde\omega$ is not closed and correspondingly the product $\bullet$ is now nonassociative. We will discuss later how the nonassociativity is controlled by the field strength $H = d\tilde\omega$. The $n$-point functions are again given by an equation similar to (9), with $\#$ replaced by $\bullet$. On the other hand, expressions like $f_1 \bullet \cdots \bullet f_n$ are ambiguous, due to the nonassociativity of the product, and one needs to insert parentheses to precisely define their meaning. This can be done in various ways, and this fact is reflected in the dependence of $n$-point functions on the $n - 3$ conformal moduli of the insertion points on the boundary $\partial\Sigma$. The $n$-point functions will then be interpolations, parameterized by $n - 3$ moduli, between the various possible positions of the parentheses in the expression $f_1 \bullet \cdots \bullet f_n$.

From now until Sect. 4.5 we will concentrate on the simplest case of $F = 0$ or $\omega_{ab}(x) = B_{ab}$. We thus neglect the boundary interaction $S_B$ and concentrate on the action $S_0 + S_1$. The generalization to the case (7) will be comparatively simple (as for the $d\omega = 0$ case) and is left to Sect. 4.5, which also summarizes the results in the general context. We now turn to a systematic discussion of the perturbation theory for the action $S_0 + S_1$.

3.1. The Free Theory. Let us first recall some facts about the unperturbed action $S_0$. Since $S_0$ is invariant under translations $X^a \to X^a + c^a$, the field $X^a$ can be split into a constant zero mode $x^a$ and a fluctuating quantum field $\zeta^a$,
\[ X^a = x^a + \zeta^a . \tag{11} \]
Path integrals with the free action $S_0$ are then explicitly given by a path integral over the quantum field $\zeta^a$ and an ordinary integral over the zero-mode $x^a$ as [19]:
\[ \int [dX]\, e^{-S_0(X)} \to \int dx \int [d\zeta]\, e^{-S_0(\zeta)} . \]

3 For more details on $V(\omega)$ we refer the reader to [56].
The integral in $[d\zeta]$ is gaussian and is determined once one obtains the two-point function for the fluctuating field $\zeta$. From now on, and unless otherwise specified, we will parameterize the disk with the complex upper-half plane $H^+$. As discussed in [22, 3, 5], the two-point function can be more conveniently written if one introduces the open string metric $G_{ab}$ and noncommutativity tensor $\theta^{ab}$ as given by
\[ \frac{1}{G} + \theta = \frac{1}{g + B} . \]
It then has the general form,
\[ \langle \zeta^a(z)\, \zeta^b(w) \rangle = \frac{i}{\pi} \theta^{ab} A(z, w) - \frac{1}{\pi} G^{ab} B(z, w) + \frac{1}{2\pi} g^{ab} C(z, w) , \tag{12} \]
where
\[ A(z, w) = \frac{1}{2i} \ln \frac{\bar w - z}{\bar z - w} , \qquad B(z, w) = \ln |z - \bar w| , \qquad C(z, w) = \ln \left| \frac{z - \bar w}{z - w} \right| . \]
In the sequel, we shall only need to consider the propagator (12) when one point (say $w$) is placed at the boundary $\partial\Sigma$ of the string world-sheet. In this case $w = \bar w$ and $C(z, w) = 0$. Also, in the case $w = \bar w$, the coefficients $A(z, w)$ and $B(z, w)$ have a simple geometrical interpretation. $A$ measures the angle between the line $z$–$w$ and the vertical line passing through $w$, and $B$ gives the logarithm of the distance between $z$ and $w$.

We now consider the limit $g_{ab} \to 0$. In this limit, the effective open string metric $G_{ab}$ becomes large and therefore the term in (12) proportional to $G^{ab}$ becomes irrelevant. Also, one has in this limit, that $\theta = B^{-1} = \alpha(x)$. In this case the propagator (12) reduces to
\[ \langle \zeta^a(z)\, \zeta^b(w) \rangle = \frac{i}{\pi} \theta^{ab} A(z, w) , \]
and the computation of path integrals becomes simple. As we discussed in the previous subsection, if one considers $n$ functions $f_1, \dots, f_n$, positioned at ordered points $\tau_1 < \cdots < \tau_n$ on the boundary $\partial\Sigma$ of the string world-sheet, then the path integral
\[ \int [dX]\, e^{-S_0(X)} f_1(X(\tau_1)) \cdots f_n(X(\tau_n)) \tag{13} \]
can be evaluated [3, 5] with the result,
\[ \int V(B)\, dx\, ( f_1 \,\#\, \cdots \,\#\, f_n ) . \]
Since $\omega(x) = B$ is constant, the product $\#$ is the usual Moyal star product and $V(B) = \sqrt{\det B}$.

A word on notation. From now on we will omit the explicit reference to the volume form in the integrals. We shall therefore use the following short-hand notation:
\[ \int V(\omega)\, dx \cdots \;\to\; \int \cdots . \]
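The relation $\frac{1}{G} + \theta = \frac{1}{g + B}$ and its $g_{ab} \to 0$ limit can be made concrete in two dimensions. The sketch below is a hedged illustration, not part of the original computation: it assumes $g_{ab} = g\, \delta_{ab}$ and $B = \begin{pmatrix} 0 & b \\ -b & 0 \end{pmatrix}$, splits $(g + B)^{-1}$ into its symmetric part (read as $G^{-1}$) and antisymmetric part (read as $\theta$), and all function names are hypothetical.

```python
import math


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def open_string_parameters(g: float, b: float):
    """Return (G, theta) from 1/G + theta = 1/(g + B).

    Assumes g_ab = g * delta_ab and B = [[0, b], [-b, 0]] (illustrative choice).
    The symmetric part of (g + B)^{-1} is G^{-1}; the antisymmetric part is theta.
    """
    inv = inv2([[g, b], [-b, g]])
    sym = [[(inv[i][j] + inv[j][i]) / 2 for j in range(2)] for i in range(2)]
    theta = [[(inv[i][j] - inv[j][i]) / 2 for j in range(2)] for i in range(2)]
    return inv2(sym), theta


# Small closed string metric: G grows like b^2 / g while theta approaches B^{-1}.
G, theta = open_string_parameters(g=1e-6, b=2.0)
B_inverse = inv2([[0.0, 2.0], [-2.0, 0.0]])
print(G[0][0], theta[0][1], B_inverse[0][1])
```

For this particular choice the open string metric is $G = (g + b^2/g)\, \delta_{ab}$, which grows without bound as $g \to 0$, while $\theta$ tends to $B^{-1}$, in line with the limit described above.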
3.2. The Interaction. Let us now consider the effects of the perturbation $S_1$. Corresponding to the split (11), the effect of $S_1$ is to introduce two bulk graphs:
\[ V = -\frac{i}{6} \int_\Sigma H_{abc}\, x^c\, d\zeta^a \wedge d\zeta^b , \tag{14} \]
\[ W = -\frac{i}{6} \int_\Sigma H_{abc}\, \zeta^a\, d\zeta^b \wedge d\zeta^c . \tag{15} \]
We will then consider the following path integral:
\[ \int [dX]\, e^{-S_0(X) - S_1(X)} f_1(X(\tau_1)) \cdots f_n(X(\tau_n)) \simeq \int [dX]\, e^{-S_0(X)}\, [1 + V + W]\, f_1(X(\tau_1)) \cdots f_n(X(\tau_n)) . \tag{16} \]
In order to analyze the effects of $V$ and $W$, let us first introduce some notation and discuss some useful simple results. Consider a generic point $z \in H^+$, and consider the path integral:
\[ \int [dX]\, e^{-S_0(X)}\, \zeta^a(z)\, f_1(X(\tau_1)) \cdots f_n(X(\tau_n)) . \]
If we introduce the short-hand notation, $A(z, \tau_i) = A_i$, for the angle between the line $z$–$\tau_i$ and the vertical through $\tau_i$, then the result of the above path integral is simply given by
\[ \frac{i}{\pi} \sum_{i=1}^{n} A_i\, \theta^{aa'} \int f_1 \,\#\, \cdots \,\#\, \partial_{a'} f_i \,\#\, \cdots \,\#\, f_n . \]
The above result is easy to understand once one considers the expansion of the functions $f_i(X) = f_i(x + \zeta)$ in Taylor series in powers of $\zeta$. The contraction of the field $\zeta^a(z)$ with a field $\zeta^{a'}(\tau_i)$ coming from the Taylor expansion of the function $f_i$ gives a factor of $\frac{i}{\pi} A_i\, \theta^{aa'}$. We are then left with a path integral of the form (13), where the function $f_i$ has been replaced with its derivative $\partial_{a'} f_i$. More generally, when a free field $\zeta^a(z)$ is contracted with one of the boundary functions it acts as a differentiation:
\[ \frac{i}{\pi}\, A\, \theta^{aa'}\, \partial_{a'} . \tag{17} \]
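The angle $A_i = A(z, \tau_i)$ entering (17) can be evaluated directly. The sketch below assumes the form $A(z, w) = \frac{1}{2i} \ln \frac{\bar w - z}{\bar z - w}$ on the principal branch of the logarithm (the placement of the complex conjugates is an assumption about the typeset formula) and checks that, for a boundary point $w = \bar w$, it is the angle between the line $z$–$w$ and the vertical through $w$. The function name is hypothetical.

```python
import cmath
import math


def angle_A(z: complex, w: complex) -> float:
    """A(z, w) = (1/2i) ln((conj(w) - z) / (conj(z) - w)), principal branch.

    For real w and Im z > 0 the ratio has unit modulus, so the result is
    real: it equals arg(z - w) - pi/2, the angle measured from the vertical
    line through w.
    """
    z, w = complex(z), complex(w)
    ratio = (w.conjugate() - z) / (z.conjugate() - w)
    return (cmath.log(ratio) / 2j).real


z = 1j                                   # a bulk point in the upper half plane
A_left, A_mid, A_right = (angle_A(z, tau) for tau in (-1.0, 0.0, 1.0))
print(A_left, A_mid, A_right)
```

With this sign convention, ordered boundary points $\tau_i < \tau_j$ give $A_i < A_j$, and both angles lie strictly between $-\pi/2$ and $\pi/2$.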
With this result, we can now consider the effects of the perturbation vertices $V$ and $W$ in the path integral (16). Let us start with the analysis of $V$. Choose any two indices $i < j$ and consider the term where the two $\zeta$’s in $V$ differentiate the two functions $f_i$ and $f_j$ (in the sense just described above). If $\zeta^a$ differentiates $f_i$ and $\zeta^b$ differentiates $f_j$ one then gets
\[ \frac{i}{6\pi^2} H_{abc}\, \theta^{aa'} \theta^{bb'} \int_\Sigma dA_i \wedge dA_j \int x^c \cdots \,\#\, \partial_{a'} f_i \,\#\, \cdots \,\#\, \partial_{b'} f_j \,\#\, \cdots . \]
The integral over $\Sigma$ can be evaluated by noting that the upper-half plane $H^+$ corresponds to the simplex $-\frac{\pi}{2} < A_i < A_j < \frac{\pi}{2}$ in the $A_i$–$A_j$ plane. Therefore the integral $\int_\Sigma dA_i \wedge dA_j$ is equal to $\frac{1}{2} \pi^2$. Moreover, if we instead let $\zeta^a$ differentiate $f_j$ and $\zeta^b$ differentiate $f_i$ we obtain, using the antisymmetry of $H_{abc}$, the same result as above. Summing the two contributions, and summing over all possible pairs $i < j$, one then obtains
\[ \sum_{i < j} V_{ij} , \]
where, for $i < j$, we have defined
\[ V_{ij} = \frac{i}{6} H_{abc}\, \theta^{aa'} \theta^{bb'} \int x^c \,\#\, f_1 \,\#\, \cdots \,\#\, \partial_{a'} f_i \,\#\, \cdots \,\#\, \partial_{b'} f_j \,\#\, \cdots \,\#\, f_n . \]
In the above equation we have used the fact that (for the Moyal product) $\int f \cdot g = \int f \,\#\, g$, in order to rewrite everything in terms of $\#$ products, including the multiplication by the coordinate function $x^c$.

To conclude the analysis of the effect of the two-vertex $V$, one must also consider the term coming from the contraction of the two $\zeta$’s in $V$ among themselves. This term will require some care, since we must regularize the contraction of two fields at coincident points. On the other hand, the general structure of the contribution can be obtained with little effort by recalling that the two indices $a$ and $b$ in (14) are contracted with the antisymmetric tensor $H_{abc}$. This implies that the contribution in question must have the form
\[ V = N\, H_{abc}\, \theta^{bc} \int x^a \,\#\, ( f_1 \,\#\, \cdots \,\#\, f_n ) , \]
where $N$ is an unknown constant which will later be determined to be 1/3.

We now move to the analysis of the contributions coming from the three-graph $W$. First, given three indices $i < j < k$, let us define
\[ W_{ijk} = -\frac{1}{12} H_{abc}\, \theta^{aa'} \theta^{bb'} \theta^{cc'} \int f_1 \,\#\, \cdots \,\#\, \partial_{a'} f_i \,\#\, \cdots \,\#\, \partial_{b'} f_j \,\#\, \cdots \,\#\, \partial_{c'} f_k \,\#\, \cdots \,\#\, f_n . \]
It is then easy to check, using the general result (17), that the contribution from the three-vertex which comes from the contraction of the fields $\zeta$’s in $W$ with the functions $f_i$, $f_j$, $f_k$ is given by
S τi , τj , τk Wij k , i<j 0 such that for any
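integer $n \ge 1$ and any real number $t \in [0; n^{\frac{1}{2} - \alpha}]$, we have $g_{n,t} \le C\, h_{n,t}$.

1. Control of $a^{(1)}_{n,p+1}(t) := \left| e^{-\frac{t^2}{2}} - \left( 1 - \frac{t^2}{2n} \right)^n \right|$. We have:
\[ a^{(1)}_{n,p+1}(t) \le n\, e^{-\frac{t^2}{2} \left( 1 - \frac{1}{n} \right)}\, \frac{t^4}{8 n^2} = e^{-\frac{t^2}{2} \left( 1 - \frac{1}{n} \right)}\, \frac{t^4}{8n} . \tag{3} \]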
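Inequality (3) follows from $|a^n - b^n| \le n \max(a, b)^{n-1} |a - b|$ with $a = e^{-t^2/(2n)}$ and $b = 1 - \frac{t^2}{2n}$, together with $0 \le e^{-x} - 1 + x \le \frac{x^2}{2}$ for $x \ge 0$. The sketch below is only a floating-point sanity check of (3) on a grid of values with $t \le n^{\frac{1}{2} - \alpha}$. The function names, the grid and the tolerance are choices made for this illustration.

```python
import math


def a1(n: int, t: float) -> float:
    """Left-hand side of (3): |exp(-t^2/2) - (1 - t^2/(2n))^n|."""
    return abs(math.exp(-t * t / 2) - (1 - t * t / (2 * n)) ** n)


def bound3(n: int, t: float) -> float:
    """Right-hand side of (3): exp(-(t^2/2)(1 - 1/n)) * t^4 / (8n)."""
    return math.exp(-(t * t / 2) * (1 - 1 / n)) * t ** 4 / (8 * n)


def check_inequality(alpha: float = 0.1, ns=(10, 100, 1000), steps: int = 50) -> bool:
    """Test (3) for t on a uniform grid in (0, n^(1/2 - alpha)]."""
    tolerance = 1e-12  # absorbs rounding error in the floating-point evaluation
    for n in ns:
        t_max = n ** (0.5 - alpha)
        for k in range(1, steps + 1):
            t = t_max * k / steps
            if a1(n, t) > bound3(n, t) + tolerance:
                return False
    return True


print(check_inequality())
```

A numerical check of this kind is of course no substitute for the two elementary inequalities quoted above; it only guards against misreading the exponents in (3).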
F. Pène
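2. Now, we shall get a bound for the following quantity:
\[ D_n(t) := \left| E_\nu\!\left[ e^{\frac{it S_n(f)}{\sqrt n}} \right] - \left( 1 - \frac{t^2}{2n} \right)^n \right| . \]
We first notice that we have:
\[ D_n(t) \le \sum_{l=0}^{n-1} \left( 1 - \frac{t^2}{2n} \right)^l \left| E_\nu\!\left[ Y \circ T^l \cdot e^{\frac{it S_{n-(l+1)}(f)}{\sqrt n}} \circ T^{l+1} \right] \right| , \tag{4} \]
with
\[ Y := e^{\frac{itf}{\sqrt n}} - \left( 1 - \frac{t^2}{2n} \right) . \]
According to Proposition 1.1 and to the fact that we have $\left| e^{\frac{itf}{\sqrt n}} - 1 \right| \le \frac{|tf|}{\sqrt n}$, the function $Y$ is in $H_{\eta,m_0}$ and we have $\|Y\|_{(\eta,m_0)} \le \frac{t^2}{2n} + \frac{t}{\sqrt n} \|f\|_{(\eta,m_0)} = O\!\left( \frac{t}{\sqrt n} \right)$.

3. We fix a real number $r_0 > 1$. We define integers $a_1 = a_1(n), \dots, a_{p+3} = a_{p+3}(n)$ as follows:
\[ a_1 := \left\lceil \frac{-\ln(n)}{\ln(\delta_{r_0})} \right\rceil \quad \text{and} \quad a_j := \left\lceil (r_0 - 1)(a_1 + \dots + a_{j-1}) + \frac{\ln\!\big( n^{-\frac{5}{2}} \big)}{\ln(\delta_{r_0})} \right\rceil , \]
for any integer $j = 2, \dots, p + 3$; where $\delta_{r_0}$ is the constant appearing in Proposition 1.2 for $r = r_0$. We can notice that there exists some constant $\kappa > 0$ such that, for any $n \ge 1$ and any $j = 1, \dots, p + 3$, we have: $a_j < \frac{\kappa n^\alpha}{p+3}$ and, therefore, $a_1 + \dots + a_{p+3} < \kappa n^\alpha$.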
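4. Control of: $a^{(2)}_{n,p+1}(t) := \sum_{l = n - \kappa n^\alpha}^{n-1} \left( 1 - \frac{t^2}{2n} \right)^l \left| E_\nu\!\left[ Y \cdot e^{\frac{it S_{n-l-1}(f)}{\sqrt n}} \circ T \right] \right|$. We have:
\[ a^{(2)}_{n,p+1}(t) = O\!\left( n^\alpha\, \frac{t}{\sqrt n}\, e^{-\frac{t^2 (n - \kappa n^\alpha - 1)}{2n}} \right) = O\!\left( \frac{t}{n^{\frac{1}{2} - \alpha}}\, e^{-\frac{t^2}{2} \left( 1 - \frac{\kappa}{n^{1-\alpha}} - \frac{1}{n} \right)} \right) , \tag{5} \]
since we consider only real numbers $t$ satisfying $t \in [0; n^{\frac{1}{2} - \alpha}]$.

5. We shall estimate the following quantity:
\[ \sum_{l=0}^{n - \kappa n^\alpha - 1} \left( 1 - \frac{t^2}{2n} \right)^l \left| E_\nu\!\left[ Y \cdot e^{\frac{it S_{n-(l+1)}(f)}{\sqrt n}} \circ T \right] \right| . \]
The study of this quantity constitutes the major part of our proof. Let us write:
\[ J_l(n, t) := E_\nu\!\left[ Y \cdot e^{\frac{it S_{n-(l+1)}(f)}{\sqrt n}} \circ T \right] . \]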
Billiards
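6. Let $l$ be any integer in $\{0, \dots, n - \kappa n^\alpha - 1\}$. Then, we have $n - (l + 1) > a_1 + \dots + a_{p+3}$. We use the following decomposition:
\[ S_{n-(l+1)}(f) = \sum_{j=1}^{p+3} S_{a_j}(f) \circ T^{a_1 + \dots + a_{j-1}} + S_{M_{n,l}}(f) \circ T^{a_1 + \dots + a_{p+3}} , \]
with $M_{n,l} := n - (l + 1) - (a_1 + \dots + a_{p+3})$. Thus, we have
\[ J_l(n, t) = E_\nu\!\left[ Y \prod_{j=1}^{p+3} e^{\frac{it S_{a_j}(f)}{\sqrt n}} \circ T^{1 + a_1 + \dots + a_{j-1}} \cdot e^{\frac{it S_{M_{n,l}}(f)}{\sqrt n}} \circ T^{1 + a_1 + \dots + a_{p+3}} \right] . \]
Let us denote by $I_l(n, t)$ the following quantity:
\[ E_\nu\!\left[ Y\, e^{\frac{it S_{a_1}(f)}{\sqrt n}} \circ T \prod_{j=2}^{p+3} \left( e^{\frac{it S_{a_j}(f)}{\sqrt n}} - 1 \right) \circ T^{1 + a_1 + \dots + a_{j-1}} \cdot e^{\frac{it S_{M_{n,l}}(f)}{\sqrt n}} \circ T^{1 + a_1 + \dots + a_{p+3}} \right] . \]
We have:
\[ |I_l(n, t)| \le D_1\, \frac{t^{p+3}}{\sqrt n \cdot n^{\frac{1-\alpha}{2} (p+2)}} , \tag{6} \]
for some constant $D_1 > 0$ independent of $(n, t, l)$. Indeed, we have:
\[ |I_l(n, t)| \le \|Y\|_\infty \prod_{j=2}^{p+3} \left\| e^{\frac{it S_{a_j}(f)}{\sqrt n}} - 1 \right\|_{p+2} \le O\!\left( \frac{t}{\sqrt n} \right) \prod_{j=2}^{p+3} \frac{t \sqrt{a_j}}{\sqrt n} , \]
writing $\left| e^{\frac{it S_{a_j}(f)}{\sqrt n}} - 1 \right| \le \frac{|t S_{a_j}(f)|}{\sqrt n}$ and according to Corollary 1.3 (we recall that we consider only real numbers $t$ satisfying $t \in [0; n^{\frac{1}{2} - \alpha}]$ and that we have $a_j < \kappa n^\alpha$).

The main part of our proof is devoted to the establishment of an estimation of the error term $I_l(n, t) - J_l(n, t)$. We notice that this error term is given by the following formula:
\[ \sum_{\varepsilon = (\varepsilon_1, \dots, \varepsilon_{p+3})} E_\nu\!\left[ Y \prod_{j=1}^{p+3} \varepsilon_j \circ T^{1 + a_1 + \dots + a_{j-1}} \cdot e^{\frac{it S_{M_{n,l}}(f)}{\sqrt n}} \circ T^{1 + a_1 + \dots + a_{p+3}} \right] , \]
where the sum is taken over the $\varepsilon = (\varepsilon_1, \dots, \varepsilon_{p+3})$ with $\varepsilon_j \in \left\{ -1;\ e^{\frac{it S_{a_j}(f)}{\sqrt n}} \right\}$, not all equal to $e^{\frac{it S_{a_j}(f)}{\sqrt n}}$ and such that $\varepsilon_1 = e^{\frac{it S_{a_1}(f)}{\sqrt n}}$. Let us consider such a vector $\varepsilon = (\varepsilon_1, \dots, \varepsilon_{p+3})$. We define the integer $j_0 := \max\{ j \ge 2 : \varepsilon_j = -1 \}$. We have:
\[ Y \prod_{j=1}^{p+3} \varepsilon_j \circ T^{1 + a_1 + \dots + a_{j-1}} \cdot e^{\frac{it S_{M_{n,l}}(f)}{\sqrt n}} \circ T^{1 + a_1 + \dots + a_{p+3}} = -D_\varepsilon(n, t) \cdot E_{\varepsilon,l}(n, t) \circ T^{1 + a_1 + \dots + a_{j_0}} , \]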
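with
\[ D_\varepsilon(n, t) := Y \cdot \prod_{j=1}^{j_0 - 1} \varepsilon_j \circ T^{1 + a_1 + \dots + a_{j-1}} \tag{7} \]
and
\[ E_{\varepsilon,l}(n, t) := \prod_{j = j_0 + 1}^{p+3} \varepsilon_j \circ T^{a_{j_0+1} + \dots + a_{j-1}} \cdot e^{\frac{it S_{M_{n,l}}(f)}{\sqrt n}} \circ T^{a_{j_0+1} + \dots + a_{p+3}} . \]
According to our choice of $j_0$, for any $j = j_0 + 1, \dots, p + 3$, we have: $\varepsilon_j = e^{\frac{it S_{a_j}(f)}{\sqrt n}}$. Consequently, we have
\[ E_{\varepsilon,l}(n, t) = e^{\frac{it}{\sqrt n} S_{n - (l+1) - (a_1 + \dots + a_{j_0})}(f)} . \]

• First step: Control of $\mathrm{Cov}\big( D_\varepsilon(n, t),\ E_{\varepsilon,l}(n, t) \circ T^{1 + a_1 + \dots + a_{j_0}} \big)$. Using the controls of the rate of decorrelation given in Proposition 1.2, we show that we have:
\[ \left| \mathrm{Cov}\big( D_\varepsilon(n, t),\ E_{\varepsilon,l}(n, t) \circ T^{1 + a_1 + \dots + a_{j_0}} \big) \right| \le D_2\, \frac{t}{n^2} , \tag{8} \]
for some constant $D_2 > 0$ independent of $(n, t, \varepsilon)$. Indeed, according to Proposition 1.1, $D_\varepsilon(n, t)$ is in $H_{\eta,\, a_1 + \dots + a_{j_0 - 1} + m_0}$ and we have:
\[ \| D_\varepsilon(n, t) \|_{(\eta,\, a_1 + \dots + a_{j_0 - 1} + m_0)} \le \|Y\|_{(\eta,m_0)} \prod_{j=1}^{j_0 - 1} \| \varepsilon_j \|_{(\eta,\, a_j + m_0 - 1)} \le O\!\left( \frac{t}{\sqrt n} \right) \prod_{j=1}^{j_0 - 1} \left( 1 + t\, C_f^{(\eta,m_0)} \frac{a_j}{\sqrt n} \right) \le O\!\left( \frac{t}{\sqrt n} \right) , \]
(we recall that we consider only the real numbers $t$ satisfying $t \in [0; n^{\frac{1}{2} - \alpha}]$). In the same way, we can show that $E_{\varepsilon,l}(n, t)$ is in $H_{\eta, N_{n,l}}$, with $N_{n,l} := n - (l + 1) - (a_1 + \dots + a_{j_0}) + m_0 - 1$ and that we have:
\[ \| E_{\varepsilon,l}(n, t) \|_{(\eta, N_{n,l})} \le 1 + \frac{t\, C_f^{(\eta,m_0)}\, n}{\sqrt n} = O(n) . \]
According to Proposition 1.2, we get:
\[ \left| \mathrm{Cov}\big( D_\varepsilon(n, t),\ E_{\varepsilon,l}(n, t) \circ T^{1 + a_1 + \dots + a_{j_0}} \big) \right| \le O(t \sqrt n)\, \delta_{r_0}^{\,1 + \sum_{j=1}^{j_0} a_j - r_0 \left( m_0 + \sum_{j=1}^{j_0 - 1} a_j \right)} . \]
According to our choice for the $a_j$’s, we have:
\[ \delta_{r_0}^{\,1 + \sum_{j=1}^{j_0} a_j - r_0 \left( m_0 + \sum_{j=1}^{j_0 - 1} a_j \right)} = O\!\left( \frac{1}{n^{\frac{5}{2}}} \right) . \]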
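• Second step: Control of the mean of $D_\varepsilon(n, t)$. We shall now show that we have:
\[ | E_\nu[ D_\varepsilon(n, t) ] | \le D_3 \left( \frac{t^3}{\sqrt n \cdot n^{1-\alpha}} + \frac{t^2}{n^2} \right) , \tag{9} \]
for some $D_3 > 0$ independent of $(n, t, \varepsilon)$. Let us denote by $J$ the following set:
\[ J := \left\{ j = 1, \dots, j_0 - 1 \ :\ \varepsilon_j = e^{\frac{it S_{a_j}(f)}{\sqrt n}} \right\} . \]
We recall that 1 is in $J$. By replacing, in formula (7), $\varepsilon_j$ by $\tilde\varepsilon_j := 1 + \frac{it S_{a_j}(f)}{\sqrt n}$, for each $j \in J$, we introduce an error in $O\!\left( \frac{t}{\sqrt n} \cdot \frac{t^2}{n^{1-\alpha}} \right)$ (uniformly in $(n, t, \varepsilon)$), since we have $| \varepsilon_j - \tilde\varepsilon_j | \le \frac{t^2 S_{a_j}(f)^2}{2n}$ and according to Corollary 1.3. By replacing $Y$ by $\tilde Y := \frac{itf}{\sqrt n} + \frac{t^2}{2n} (1 - f^2)$, we make an error in $O\!\left( \frac{t^3}{n \sqrt n} \right)$ (uniformly in $(n, t, \varepsilon)$). For any $j \in J$, we write $\varepsilon_{j,0} := 1$ and $\varepsilon_{j,1} := \frac{it S_{a_j}(f)}{\sqrt n}$. We shall control the following quantities:
\[ Z_{\varepsilon,q}(n, t) := E_\nu\!\left[ \tilde Y \prod_{j \in J} \varepsilon_{j,q_j} \circ T^{1 + a_1 + \dots + a_{j-1}} \right] , \]
with $q = (q_j)_{j \in J} \in \{0, 1\}^J$.

(a) If we have $\sum_{j \in J} q_j = 0$, then we have
\[ Z_{\varepsilon,q}(n, t) = E_\nu[\tilde Y] = \frac{t^2}{2n} \left( 1 - E_\nu[f^2] \right) . \]

(b) If we have $\sum_{j \in J} q_j = 1$ and $q_1 = 1$, then, according to Corollary 1.3, we have:
\[ Z_{\varepsilon,q}(n, t) = E_\nu\!\left[ \frac{itf}{\sqrt n} \cdot \frac{it}{\sqrt n} \sum_{k=1}^{a_1} f \circ T^k \right] + O\!\left( \frac{t^3}{n^{\frac{3}{2} - \alpha}} \right) = -\frac{t^2}{n} E_\nu\!\left[ f \cdot \sum_{k=1}^{a_1} f \circ T^k \right] + O\!\left( \frac{t^3}{\sqrt n\, n^{1-\alpha}} \right) . \]
According to Proposition 1.2, to formula (2) (we recall that we have supposed $\sigma^2(f) = 1$) and to our choice for $a_1$, we have:
\[ -\frac{t^2}{n} \sum_{k=1}^{a_1} E_\nu\!\left[ f \cdot f \circ T^k \right] = -\frac{t^2}{n} \sum_{k \ge 1} E_\nu\!\left[ f \cdot f \circ T^k \right] + O\!\left( \frac{t^2}{n}\, \delta_{r_0}^{a_1} \right) = -\frac{t^2}{n} \cdot \frac{1 - E_\nu[f^2]}{2} + O\!\left( \frac{t^2}{n}\, \delta_{r_0}^{a_1} \right) = -\frac{t^2}{n} \cdot \frac{1 - E_\nu[f^2]}{2} + O\!\left( \frac{t^2}{n} \cdot \frac{1}{n} \right) . \]

(c) If we have $\sum_{j \in J} q_j = 1$ and $q_j = 1$ (for some $j \in J \setminus \{1\}$), then, according to Corollary 1.3, we have:
\[ Z_{\varepsilon,q}(n, t) = E_\nu\!\left[ \frac{itf}{\sqrt n} \cdot \frac{it}{\sqrt n} \sum_{k = 1 + a_1 + \dots + a_{j-1}}^{a_1 + \dots + a_j} f \circ T^k \right] + O\!\left( \frac{t^3}{n^{\frac{3}{2} - \alpha}} \right) = O\!\left( \frac{t^2}{n}\, \delta_{r_0}^{a_1 + \dots + a_{j-1}} \right) + O\!\left( \frac{t^3}{\sqrt n\, n^{1-\alpha}} \right) = O\!\left( \frac{t^2}{n^2} \right) + O\!\left( \frac{t^3}{\sqrt n\, n^{1-\alpha}} \right) . \]

(d) If $\sum_{j \in J} q_j \ge 2$, then we have:
\[ Z_{\varepsilon,q}(n, t) = O\!\left( \frac{t^3}{\sqrt n\, n^{1-\alpha}} \right) . \]
Indeed, we have $\|\tilde Y\|_\infty = O\!\left( \frac{t}{\sqrt n} \right)$ and, according to Corollary 1.3,
\[ \left\| \prod_{j \in J} \varepsilon_{j,q_j} \circ T^{1 + a_1 + \dots + a_{j-1}} \right\|_1 = O\!\left( \left( \frac{t}{n^{\frac{1-\alpha}{2}}} \right)^{\sum_{j \in J} q_j} \right) = O\!\left( \frac{t^2}{n^{1-\alpha}} \right) . \]

Therefore, we have:
\[ | E_\nu[ D_\varepsilon(n, t) ] | \le O\!\left( \frac{t^3}{\sqrt n \cdot n^{1-\alpha}} \right) + \Big| \sum_{q \in \{0,1\}^J} Z_{\varepsilon,q}(n, t) \Big| \le O\!\left( \frac{t^3}{\sqrt n \cdot n^{1-\alpha}} \right) + O\!\left( \frac{t^2}{n^2} \right) . \]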
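• Third step: Control of the mean of $E_{\varepsilon,l}(n, t)$. We recall that we have:
\[ E_{\varepsilon,l}(n, t) = e^{\frac{it}{\sqrt n} S_{n - (l+1) - (a_1 + \dots + a_{j_0})}(f)} . \]
Let us write $n' = n'_{n,l,\varepsilon} := n - (l + 1) - (a_1 + \dots + a_{j_0})$ and $t' = t'_{n,l,\varepsilon} := t \sqrt{\frac{n'}{n}}$. We can notice that $n'$ and $t'$ satisfy: $n' \ge 1$ and $t' \in [0; n'^{\frac{1}{2} - \alpha}]$. Therefore, according to the induction hypothesis $H_p$ applied to $(n', t')$, we have:
\[ \left| E_\nu\!\left[ E_{\varepsilon,l}(n, t) \right] \right| \le e^{-\frac{t^2}{2} \frac{n'}{n}} + L_p\, \frac{t'^p}{n'^{\,p \left( \frac{1}{2} - \alpha \right)}} + a_{n',p}(t') \tag{10} \]
\[ \le e^{-\frac{t^2}{2}}\, e^{\frac{t^2}{2} \left( \frac{l+1}{n} + \frac{\kappa}{n^{1-\alpha}} \right)} + L_p\, \frac{t^p}{n^{\,p \left( \frac{1}{2} - \alpha \right)}} + a_{n',p}(t') . \tag{11} \]
On the other hand, we notice that we have
\[ \left| E_\nu\!\left[ E_{\varepsilon,l}(n, t) \right] \right| \le 1 . \tag{12} \]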
7. According to inequalities (6), (8), (9), (11) and (12), is less than α n−κn −1
l=0
+
t2 1− 2n
D3 √
ε
where the sum
l D1 √
t3 n.n1−α
t2
e− 2 e
t p+3 n.n
t2 2
t2 2n
1−
l=0
l
|Jl (n, t)|
t t2 p+2 + 2 D 3 n2 n2 tp + an ,p t , 1
+ 2p+2 D2
1−α 2 (p+2)
l+1 κ n + n1−α
n−κnα −1
+ Lp n
p
2 −α
(
ε
is taken over the ε = (ε1 , ..., εp+3 ) with εj ∈
itSaj (f ) √
−1; e
(13)
itSaj (f ) √ n
)
itSa1 (f ) √
not all equal to e n and with ε1 = e n (we recall that, by definition, t and n depend not only on t and n but also on l and ε). We can notice that we have: n−1 l=0
t2 1− 2n
l
2n ≤ min n, 2 t
n−1
and
t2 1− 2n
l=0
l
t2l
e 2n ≤ n.
In the following, we control each term of formula (13). l (1) t2 t p+3 (a) Let us write: bn,p+1 (t) := n−1 1−α (p+2) . We have: l=0 1 − 2n D1 √ n.n
(1) bn,p+1 (t)
t p+3 2n ≤ 2 D1 √ 1−α =O t n.n 2 (p+2)
(3)
(b) Let us write: an,p+1 (t) :=
n−1 l=0
t2 2n
1−
l
(3)
an,p+1 (t) ≤ n.2p+2 D2 (4)
(c) Let us write: an,p+1 (t) := (4) an,p+1 (t)
(5)
t p+1
n
(p+1) 21 −α
.
t =O n2
1−
t2 2n
l
t . n
(15)
D3 nt 2 . We have: 2
t2 1 ≤ 2n min 1, 2 D3 2 = O min 1, t 2 n−1 . t n
(d) Let us write: an,p+1 (t) := have:
n−1
(5) an,p+1 (t)
l=0
≤2
p+2
ε
1−
D3 e
(14)
2p+2 D2 nt2 . We have:
n−κnα −1 l=0
2
t2 2n
l
t2
t 2 (l+1)+κnα n
t −2 2 D3 √n.n e 1−α e
2 − t2 1− n1 −
3
κ n1−α
t3 n
1 2 −α
.
(16)
. We
(17)
n−1
(2)
(e) Let us write: bn,p+1 (t) :=
ε
l=0
t2 2n
1−
l
(6)
(f) Let us write: an,p+1 (t) := show that we have: (6) an,p+1 (t) = O
n
1−
l=0
t
2
+O e
n1−2α
1
np( 2 −α)
. We have:
. (p+1) 21 −α
n−κnα −1 ε
tp
3
t p+1
(2)
bn,p+1 (t) ≤ 2p+2 .2D3 Lp
t D3 √n.n 1−α Lp
− t2
t2 2n
l
1 κ 2 − n1−α
(18)
t D3 √n.n 1−α an ,p (t ). We 3
− n2
t3 n 2 −α 1
.
(19)
+ , Let a vector ε = ε1 , ..., εp+3 be given. We notice that if we have l ≤ n2 − κnα − 1, then we have n = n − (l + 1) − (a1 + ... + aj0 ) ≥ n2 . From this, we get: n 2
α !−κn −1
1−
l=0 n 2
t2 2n
l √
t3 an ,p (t ) n.n1−α
α !−κn −1
1−
≤
l=0
t2 2n
l
Ap t3 1 1−α n.n n 2 −α t , =O n1−2α √
Ap 2 2 −α t3 2n √ t 2 n.n1−α n 21 −α according to the induction hypothesis Hp . On the other hand, we have: 1
≤
α n−κn −1
l=
n 2
1−
!−κnα ≤
t2 2n
l
α n−κn −1
√
t3 an ,p (t ) n.n1−α t2l
e− 2n √
t3 Ap n.n1−α
!−κnα n − t2 1 − κ − 2 t3 ≤ + 1 e 2 2 n1−α n √ 1−α Ap 2 n.n t2 1 κ 2 3 t − − − . ≤ O e 2 2 n1−α n 1 n 2 −α l=
n 2
8. Consequently, we have $h_n(f, t) \le a_{n,p+1}(t) + b_{n,p+1}(t)$, with $b_{n,p+1}(t) := \sum_{j=1}^{2} b^{(j)}_{n,p+1}(t)$ and $a_{n,p+1}(t) := \min\!\Big( 2,\ \sum_{j=1}^{6} a^{(j)}_{n,p+1}(t) \Big)$. We have shown that if property $H_p$ is satisfied, then property $H_{p+1}$ is also satisfied. We conclude that property $H_p$ is true for any integer $p \ge 0$.
Conclusion (end of the proof of Theorem 2.1). We show how Theorem 2.1 can be deduced from Proposition 2.2. Let f be a real-valued function in Hη,m0 , ν-centered with σ 2 (f ) > 0. Without any loss of generality, we suppose that we have σ 2 (f ) = 1. According to an inequality given by C. Esseen in [5], for any real number U > 0, we have 24 1 2 U dt /n (f ) ≤ . hn (f, t) + √ π 0 t π 2π U 1 Let α ∈ 0; 41 be a real number, p ≥ 2 be an integer. Let us fix ω := α + 2p . According to Proposition 2.2, we have: 0
1
n 2 −ω
hn (f, t)
Lp dt ≤ t p
tp p 21 −α
n
= On→+∞ = On→+∞
n 21 −ω 0
1
np(ω−α) 1 . 1 n 2 −α
Therefore, we get:
/n (f ) = On→+∞
+ On→+∞
1 1
n2
1 −α− 2p
n 2 −α
1
+ On→+∞
1 1
n 2 −α 1
,
for any real number $\alpha \in\, ]0; \frac{1}{4}[$ and any integer $p \ge 2$.

4. Rate of Convergence for the Billiard Flow: Proof

In this section, we prove Theorem 2.3. We denote $\bar\tau := \int_M \tau^+(\omega)\, d\nu(\omega)$. Let $f : T^1 Q \to \mathbb{R}$ be a Hölder function of order $\eta$, $\mu$-centered and such that $\tilde\sigma^2(f) > 0$. We have to study the random variables $X_t : Q_1 \to \mathbb{R}$ given by:
\[ X_t := \frac{1}{\sqrt t} \int_0^t f \circ Y_s(\cdot)\, ds . \]
We shall denote by $X$ a centered gaussian random variable defined on a probability space $(B, P)$ such that $E[X^2] = \tilde\sigma^2(f)$. In this proof, we identify the billiard flow $(Q_1, \mu, (Y_t)_t)$ with the suspension flow $(N, \lambda, (S_t)_t)$. Let us define the function $F : M \to \mathbb{R}$ by $F(\omega) := \int_0^{\tau^+(\omega)} f(\omega, s)\, ds$. As $F$ is in $H_{\eta,1}$ and as we have $\tilde\sigma^2(f) = \frac{\sigma^2(F)}{\bar\tau}$, according to Theorem 2.1, for any real number $\varepsilon > 0$, we have:
\[ \sup_{x \in \mathbb{R}} \left| \nu\!\left( \frac{1}{\sqrt{\bar\tau\, n}} \sum_{k=0}^{n-1} F \circ T^k(\cdot) \le x \right) - P(X \le x) \right| = O\!\left( n^{-\frac{1}{2} + \varepsilon} \right) . \]
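1. Let $W_t : M \to \mathbb{R}$ be the random variable given by $W_t := \frac{1}{\sqrt t} \sum_{k=0}^{\lfloor t / \bar\tau \rfloor - 1} F \circ T^k(\cdot)$. We have:
\[ \sup_{x \in \mathbb{R}} | \nu(W_t \le x) - P(X \le x) | = O\!\left( t^{-\frac{1}{2} + \varepsilon} \right) , \]
for any real number $\varepsilon > 0$. Indeed, for any real number $t \ge 2\bar\tau$, we have:
\[ \left\| \left( \frac{1}{\sqrt t} - \frac{1}{\sqrt{\bar\tau \lfloor t / \bar\tau \rfloor}} \right) \sum_{k=0}^{\lfloor t / \bar\tau \rfloor - 1} F \circ T^k \right\|_\infty \le \frac{\sqrt 2\, \bar\tau}{t^{\frac{3}{2}}} \cdot \frac{t}{\bar\tau}\, \|F\|_\infty = \frac{\sqrt 2\, \|F\|_\infty}{\sqrt t} . \]
It follows that
\[ \nu\!\left( \frac{\sum_{k=0}^{\lfloor t / \bar\tau \rfloor - 1} F \circ T^k}{\sqrt{\bar\tau \lfloor t / \bar\tau \rfloor}} \le x - \frac{\sqrt 2\, \|F\|_\infty}{\sqrt t} \right) \le \nu(W_t \le x) \le \nu\!\left( \frac{\sum_{k=0}^{\lfloor t / \bar\tau \rfloor - 1} F \circ T^k}{\sqrt{\bar\tau \lfloor t / \bar\tau \rfloor}} \le x + \frac{\sqrt 2\, \|F\|_\infty}{\sqrt t} \right) . \]
We get:
\[ \sup_{x \in \mathbb{R}} | \nu(W_t \le x) - P(X \le x) | \le O\!\left( t^{-\frac{1}{2} + \varepsilon} \right) + P\!\left( |X - x| \le \frac{\sqrt 2\, \|F\|_\infty}{\sqrt t} \right) , \]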
for any real number $\varepsilon > 0$.

2. Let $Z_t : M \to \mathbb{R}$ be the random variable defined by $Z_t := \frac{1}{\sqrt t} \sum_{k=0}^{n(t,\cdot) - 1} F \circ T^k$, where $n(t, \omega)$ denotes the number of reflections off $\partial Q$ between instants 0 and $t$, for a particle having the configuration $\omega$ at time 0:
\[ n(t, \omega) := \max\left\{ n \ge 0 \ :\ \sum_{k=0}^{n-1} \tau^+\big( T^k(\omega) \big) \le t \right\} . \]
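The definition of $n(t, \omega)$ can be read as a simple counting procedure on the sequence of free-flight times $\tau^+(T^k(\omega))$, $k = 0, 1, \dots$ The sketch below is a toy illustration only: it takes a finite, made-up list of positive flight times in place of an actual billiard orbit, and the function name is hypothetical.

```python
from itertools import accumulate


def reflections_before(flight_times, t: float) -> int:
    """n(t, w) = max{ n >= 0 : tau(w) + ... + tau(T^(n-1) w) <= t }.

    flight_times[k] stands for tau^+(T^k w). The empty sum (n = 0) is 0,
    so the result is 0 when the first flight already exceeds t. If the
    finite list is exhausted, its length is returned.
    """
    n = 0
    for elapsed in accumulate(flight_times):
        if elapsed > t:
            break
        n += 1
    return n


# Made-up flight times; the partial sums are 1.0, 3.0, 3.5, 6.5.
flights = [1.0, 2.0, 0.5, 3.0]
print([reflections_before(flights, t) for t in (0.5, 3.2, 3.5, 100.0)])
```

With this reading, $Z_t$ sums $F$ along the first $n(t, \omega)$ points of the orbit, whereas $W_t$ uses the deterministic number $\lfloor t / \bar\tau \rfloor$ of terms.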
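In points 2 to 5 of this proof, we prove that we have:
\[ \sup_{x \in \mathbb{R}} | \nu(Z_t \le x) - P(X \le x) | \le O\!\left( t^{-\frac{1}{4} + \varepsilon} \right) , \]
for any real number $\varepsilon > 0$. First, we notice that we have:
\[ Z_t - W_t = \frac{1}{\sqrt t} \sum_{k = \min( \lfloor t / \bar\tau \rfloor,\, n(t,\cdot) )}^{\max( \lfloor t / \bar\tau \rfloor,\, n(t,\cdot) ) - 1} F \circ T^k . \]

3. Repeating calculations we have already done in [10, pp. 81 and 142], we prove the following result: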
Lemma 4.1. Let a real number $L \ge 2$ be given. There exists a constant $C_L > 0$ such that, for any real numbers $t > 1$ and $K \ge 1$, we have:
\[ \nu\!\left( \left\{ \left| n(t, \cdot) - \frac{t}{\bar\tau} \right| \ge K \sqrt t \right\} \right) \le C_L K^{-L} . \]
Proof. Let us consider any real numbers t > 1 and K ≥ 2. We have
. -
√ t
ν
n(t, ·) − τ¯ ≥ K t / 0. 1 2. √ √ t t ≤ν n(t, ·) ≥ +ν n(t, ·) ≤ +K t −K t τ¯ τ¯ t √ t √ τ¯ +K τ¯ −K t −1 t! ≤ ν τ+ ◦ T i ≤ t + ν τ+ ◦ T i > t i=0 i=0 t √ τ¯ +K t −1 √ τ + ◦ T i − τ¯ ≤ −τ¯ K t + ≤ ν i=0 t √ τ¯ −K √ t! τ + ◦ T i − τ¯ ≥ τ¯ K t − 1 +ν i=0
n−1 1 2L + i ≤ 2 sup √ τ ◦ T − τ¯ n n≥1 i=0
L2L (M,ν)
n−1 1 2L + i ≤ 2 sup √ τ ◦ T − τ¯ n n≥1 i=0
L2L (M,ν)
2L √ t/τ¯ + K t + 1 2L √ τ¯ (K t − 1) (Kt/τ¯ + Kt + Kt)L = O K −L , √ 2L τ¯ K t 2
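according to Corollary 1.3.

4. For any real number $p > 2$ and $L \ge 2$, there exists a constant $a_{p,L}$ such that, for any real numbers $t > 1$, $\alpha > 0$ and $\beta > 0$, we have
\[ \nu( \{ |Z_t - W_t| > \alpha \} ) \le a_{p,L} \left( \frac{t^{\,p \left( -\frac{1}{4} + \frac{\beta}{2} \right)}}{\alpha^p} + t^{-L\beta} \right) . \]
Indeed, we have:
\[ \nu( \{ |Z_t - W_t| > \alpha \} ) \le \nu\!\left( \left\{ |Z_t - W_t| > \alpha \ \text{and}\ \left| n(t, \cdot) - \frac{t}{\bar\tau} \right| \le t^{\frac{1}{2} + \beta} \right\} \right) + \nu\!\left( \left\{ \left| n(t, \cdot) - \frac{t}{\bar\tau} \right| > t^{\frac{1}{2} + \beta} \right\} \right) . \]
According to Lemma 4.1, we have:
\[ \nu\!\left( \left\{ \left| n(t, \cdot) - \frac{t}{\bar\tau} \right| > t^{\frac{1}{2} + \beta} \right\} \right) \le C_L\, t^{-L\beta} . \]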
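Moreover, according to Corollary 1.3 and to Theorem B of [12], there exists a constant $K_p > 0$ such that, for any integer $N \ge 0$, we have:
\[ E_\nu\!\left[ \sup_{n = 1, \dots, N} \Big| \sum_{k=0}^{n-1} F \circ T^k \Big|^p \right] + E_\nu\!\left[ \sup_{n = 1, \dots, N} \Big| \sum_{k=0}^{n-1} F \circ T^{-k} \Big|^p \right] \le K_p\, N^{\frac{p}{2}} . \]
Therefore, we have:
\[ \nu\!\left( \left\{ |Z_t - W_t| > \alpha \ \text{and}\ \left| n(t, \cdot) - \frac{t}{\bar\tau} \right| \le t^{\frac{1}{2} + \beta} \right\} \right) \le \frac{ E_\nu\!\left[ |Z_t - W_t|^p\, \mathbf{1}_{\left\{ | n(t,\cdot) - \frac{t}{\bar\tau} | \le t^{\frac{1}{2} + \beta} \right\}} \right] }{\alpha^p} \le O\!\left( \frac{ t^{\left( \frac{1}{4} + \frac{\beta}{2} \right) p} }{ \alpha^p\, t^{\frac{p}{2}} } \right) . \]

5. We get:
\[ \nu( |W_t - Z_t| > \alpha ) \le O\!\left( \frac{ t^{\,p \left( -\frac{1}{4} + \frac{\beta}{2} \right)} }{\alpha^p} + t^{-L\beta} \right) , \]
for any real numbers $p > 2$ and $L \ge 2$. From this, it follows that:
\[ \sup_{x \in \mathbb{R}} | \nu(Z_t \le x) - P(X \le x) | = O\!\left( t^{-\frac{1}{4} + \varepsilon} \right) , \]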
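for any real number $\varepsilon > 0$. Indeed, we have
\[ \nu(Z_t \le x) - P(X \le x) \le \nu(W_t \le x + \alpha) + \nu(W_t - Z_t > \alpha) - P(X \le x) \le \nu(W_t \le x + \alpha) - P(X \le x + \alpha) + \nu(W_t - Z_t > \alpha) + P(x \le X \le x + \alpha) \]
and
\[ P(X \le x) - \nu(Z_t \le x) \le P(X \le x) - \nu(W_t \le x - \alpha) + \nu(Z_t - W_t > \alpha) \le P(X \le x - \alpha) - \nu(W_t \le x - \alpha) + \nu(Z_t - W_t > \alpha) + P(x - \alpha \le X \le x) . \]
Therefore, according to the foregoing, for any real numbers $\varepsilon > 0$, $p > 2$ and $L \ge 2$, we have:
\[ \sup_{x \in \mathbb{R}} | \nu(Z_t \le x) - P(X \le x) | \le O\!\left( t^{-\frac{1}{2} + \varepsilon} \right) + \nu( |W_t - Z_t| > \alpha ) + P(x - \alpha \le X \le x + \alpha) \le O\!\left( t^{-\frac{1}{2} + \varepsilon} \right) + O\!\left( \frac{ t^{\,p \left( -\frac{1}{4} + \frac{\beta}{2} \right)} }{\alpha^p} + t^{-L\beta} \right) + O(\alpha) . \]
Taking $\beta := \frac{1}{2L}$ and $\alpha := t^{-\gamma}$, with $\gamma > 0$, we get:
\[ \sup_{x \in \mathbb{R}} | \nu(Z_t \le x) - P(X \le x) | \le O\!\left( t^{-\frac{1}{2} + \varepsilon} + t^{\,p \left( -\frac{1}{4} + \gamma + \frac{1}{4L} \right)} + t^{-\frac{1}{2}} + t^{-\gamma} \right) . \]
To conclude, we shall take $p$, $L$ large and $\gamma$ close to $\frac{1}{4}$. For any real numbers $L \ge 3$ and $p \ge 4$, we take $\gamma := \frac{1}{4} - \frac{1}{2p} - \frac{1}{4L}$. We get:
\[ \sup_{x \in \mathbb{R}} | \nu(Z_t \le x) - P(X \le x) | \le O\!\left( t^{-\frac{1}{2} + \varepsilon} + t^{-\frac{1}{2}} + t^{-\frac{1}{4} + \frac{1}{2p} + \frac{1}{4L}} \right) . \]
As this is true for any $L \ge 3$ and $p \ge 4$, we have shown that we have:
\[ \sup_{x \in \mathbb{R}} | \nu(Z_t \le x) - P(X \le x) | \le O\!\left( t^{-\frac{1}{4} + \varepsilon} \right) , \]
for any real number $\varepsilon > 0$.

6. We notice that, for any $\omega \in M$ and any real number $t > 0$, we have:
\[ | Z_t(\omega) - X_t(\omega, 0) | = \frac{1}{\sqrt t} \left| \int_{\sum_{k=0}^{n(t,\omega) - 1} \tau^+(T^k(\omega))}^{t} f( Y_s(\omega, 0) )\, ds \right| \le \frac{ \max_M \tau^+\, \|f\|_\infty }{ \sqrt t } . \]
As we did in point 1 of this proof, we deduce from the foregoing that we have:
\[ \sup_{x \in \mathbb{R}} | \nu( X_t(\cdot, 0) \le x ) - P(X \le x) | \le O\!\left( t^{-\frac{1}{4} + \varepsilon} \right) , \]
for any real number $\varepsilon > 0$.

7. Now we consider the random variable $\tilde X_t : Q_1 \to \mathbb{R}$ given by $\tilde X_t(\omega, s) := X_t(\omega, 0)$. We show that we have:
\[ \sup_{x \in \mathbb{R}} \left| \mu\big( \tilde X_t \le x \big) - P(X \le x) \right| \le O\!\left( t^{-\frac{1}{6}} \right) . \]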
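Let a real number $x$ be fixed. For any integer $m \ge 1$, we have:
\begin{align*}
\big|\mu(\tilde X_t \le x) - \nu(X_t(\cdot,0) \le x)\big|
&= \Big|\int_M \frac{\tau^+(\omega)}{\bar\tau}\,\mathbf 1_{\{X_t(\omega,0)\le x\}}\,d\nu(\omega) - \int_M \mathbf 1_{\{X_t(\omega,0)\le x\}}\,d\nu(\omega)\Big| \\
&= \Big|\frac1m\sum_{k=0}^{m-1}\int_M \Big(\frac{\tau^+(T^k(\omega))}{\bar\tau}\,\mathbf 1_{\{X_t(T^k(\omega),0)\le x\}} - \mathbf 1_{\{X_t(\omega,0)\le x\}}\Big)\,d\nu(\omega)\Big| \\
&\le \Big|\frac1m\sum_{k=0}^{m-1}\int_M \frac{\tau^+(T^k(\omega))}{\bar\tau}\Big(\mathbf 1_{\{X_t(T^k(\omega),0)\le x\}} - \mathbf 1_{\{X_t(\omega,0)\le x\}}\Big)\,d\nu(\omega)\Big| + \Big\|\frac1m\sum_{k=0}^{m-1}\frac{\tau^+\circ T^k}{\bar\tau} - 1\Big\|_{L^1(M,\nu)} \\
&\le \frac{\max_M \tau^+}{\bar\tau\, m}\sum_{k=0}^{m-1}\int_M \Big(\mathbf 1_{\{X_t(T^k(\omega),0)\le x\le X_t(\omega,0)\}} + \mathbf 1_{\{X_t(\omega,0)\le x\le X_t(T^k(\omega),0)\}}\Big)\,d\nu(\omega) + O\Big(\frac{1}{\sqrt m}\Big).
\end{align*}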
As we have $|X_t(\cdot,0) - X_t(T^k(\cdot),0)| \le \frac{2k\,\max_M \tau^+\,\|f\|_\infty}{\sqrt t}$, we get:
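\begin{align*}
\big|\mu(\tilde X_t \le x) - \nu(X_t(\cdot,0) \le x)\big|
&\le \frac{\max_M \tau^+}{\bar\tau}\,\nu\Big(|X_t(\cdot,0) - x| \le \frac{2m\,\max_M \tau^+\,\|f\|_\infty}{\sqrt t}\Big) + O\Big(\frac{1}{\sqrt m}\Big) \\
&\le \frac{\max_M \tau^+}{\bar\tau}\,P\Big(|X - x| \le \frac{2m\,\max_M \tau^+\,\|f\|_\infty}{\sqrt t}\Big) + O\Big(\frac{1}{\sqrt m}\Big) + O\big(t^{-\frac14+\varepsilon}\big) \\
&\le O\Big(\frac{1}{\sqrt m}\Big) + O\big(t^{-\frac14+\varepsilon}\big) + O\Big(\frac{m}{\sqrt t}\Big),
\end{align*}
for any real number $\varepsilon > 0$. By taking $m := \lfloor t^{\frac13}\rfloor$, we get:
\[ \sup_{x\in\mathbb R} \big|\mu(\tilde X_t \le x) - \nu(X_t(\cdot,0) \le x)\big| \le O\big(t^{-\frac16}\big). \]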
8. To conclude, we notice that, for any $(\omega,s) \in N$ and any real number $t > 0$ we have:
\[ |X_t(\omega,s) - X_t(\omega,0)| \le \frac{2\,\max_M \tau^+\,\|f\|_\infty}{\sqrt t}. \]
From this and the foregoing, we conclude that we have:
\[ \sup_{x\in\mathbb R} |\mu(X_t \le x) - P(X \le x)| \le O\big(t^{-\frac16}\big). \]
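The exponent bookkeeping of steps 5 and 7 can be checked mechanically. The following is only a sketch: the function names and the grid of trial exponents are assumptions introduced for illustration, and only the leading powers of $t$ are tracked (constants and the $\varepsilon$-losses are ignored).

```python
from fractions import Fraction

def step5_exponents(p, L):
    """Exponents of t in step 5 for gamma = 1/4 - 1/(2p) - 1/(4L) (hypothetical helper)."""
    gamma = Fraction(1, 4) - Fraction(1, 2 * p) - Fraction(1, 4 * L)
    # Exponent of the moment term t^{p(-1/4 + gamma + 1/(4L))}.
    moment_term = p * (Fraction(-1, 4) + gamma + Fraction(1, 4 * L))
    return gamma, moment_term

def step7_rate(a):
    """Exponent of t in O(m^{-1/2}) + O(m t^{-1/2}) when m = t^a."""
    return max(-a / 2, a - Fraction(1, 2))

gamma, moment_term = step5_exponents(p=4, L=3)

# Search a grid of exponents a for the choice of m = t^a that balances both terms.
grid = [Fraction(k, 60) for k in range(1, 30)]
best_a = min(grid, key=step7_rate)
```

With this choice of $\gamma$ the moment term is always $t^{-1/2}$, so the rate in step 5 is governed by $t^{-\gamma}$; in step 7 the two $m$-dependent terms balance at $m = t^{1/3}$.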
A. Construction of L.-S. Young’s Tower: Recalls

We see how Proposition 1.2 can be proved using the method developed by L.-S. Young in [14] for general hyperbolic systems. Let a real number $\eta \in\, ]0;1[$ be fixed.

Stable and unstable curves. Hyperbolic properties of $(M,\nu,T)$ (existence and absolute continuity of stable and unstable foliations) are useful for L.-S. Young’s construction. We recall here some well-known results about stable and unstable curves for $(M,\nu,T)$.

Definition. We call a curve of $M$ a curve $\gamma$ contained in a connected component of $M$ and which is $C^1$ for the parametrisation by $(r,\varphi)$. For such a curve $\gamma$, we write $l(\gamma) := \int_\gamma \sqrt{dr^2 + d\varphi^2}$. We call a stable curve (resp. unstable curve) a curve $\gamma^s$ (resp. $\gamma^u$) contained in $M \setminus R_{-\infty,0}$ (resp. in $M \setminus R_{0,+\infty}$) and satisfying
\[ \lim_{n\to+\infty} l(T^n(\gamma^s)) = 0 \quad \Big(\text{resp. } \lim_{n\to+\infty} l(T^{-n}(\gamma^u)) = 0\Big). \]
We recall the following results:

Proposition A.1. There exists a subset of $M$, exactly $T$-invariant and of full $\nu$-measure, each point $x$ of which is contained in a unique maximal stable curve, written $\gamma^s(x)$, and in a unique maximal unstable curve, written $\gamma^u(x)$.

Proposition A.2. There exist two real numbers $\alpha \in\, ]0;1[$ and $C > 0$ such that, for any stable curve $\gamma^s$, any unstable curve $\gamma^u$ and any integer $n \ge 0$, we have $l(T^n(\gamma^s)) \le C\alpha^n$ and $l(T^{-n}(\gamma^u)) \le C\alpha^n$. Moreover, the intersection of a stable curve with an unstable curve contains at most one point.

Following L.-S. Young, we can construct an extension $(\tilde M_d, \tilde\nu_d, \tilde T_d)$ of $(M,\nu,T^d)$ (for some integer $d \ge 1$) and a factor $(\hat M_d, \hat\nu_d, \hat T_d)$ of $(\tilde M_d, \tilde\nu_d, \tilde T_d)$ for which the transfer operator has “good” spectral properties on some functional space. The idea of the proof of the strong decorrelation property given in Proposition 1.2 is to prove first a result analogous to Proposition 1.2 for $(M,\nu,T^d)$ using these constructions. We shall establish these results after having briefly recalled the method of construction of these dynamical systems and stressed the properties that will be useful for our purpose. We recall the notions of extension and factor.

Definition. Let $(B_0,\mu_0,\theta_0)$ and $(B_1,\mu_1,\theta_1)$ be two dynamical systems. The system $(B_1,\mu_1,\theta_1)$ is said to be an extension of $(B_0,\mu_0,\theta_0)$ by the map $\pi : B_1 \to B_0$ if:
• the map $\pi$ is measurable;
• $\mu_0$ is the image measure of $\mu_1$ by $\pi$, i.e. $\mu_0(A) = \mu_1(\pi^{-1}(A))$ for any measurable subset $A$ of $B_0$;
• we have: $\pi \circ \theta_1 = \theta_0 \circ \pi$.
We also say that $(B_0,\mu_0,\theta_0)$ is the factor of $(B_1,\mu_1,\theta_1)$ by $\pi$.

An extension of $(M,\nu,T)$.

Definition. We call a rectangle of $M$ a measurable subset $A$ of $M$ of the following form:
\[ A = \Big(\bigcup_{\gamma^s \in I^s_A} \gamma^s\Big) \cap \Big(\bigcup_{\gamma^u \in I^u_A} \gamma^u\Big), \]
where $I^s_A$ is a family of stable curves and $I^u_A$ a family of unstable curves and such that $\gamma^s \cap \gamma^u \ne \emptyset$, for any $(\gamma^s,\gamma^u) \in I^s_A \times I^u_A$. Let a rectangle $A$ of $M$ be given. We call an s-sub-rectangle of $A$ a rectangle $B$ of the following form:
\[ B = \Big(\bigcup_{\gamma^s \in I^s_B} \gamma^s\Big) \cap \Big(\bigcup_{\gamma^u \in I^u_A} \gamma^u\Big), \]
with $I^s_B$ contained in $I^s_A$. We call a u-sub-rectangle of $A$ a rectangle $C$ of the following form:
\[ C = \Big(\bigcup_{\gamma^s \in I^s_A} \gamma^s\Big) \cap \Big(\bigcup_{\gamma^u \in I^u_C} \gamma^u\Big), \]
with $I^u_C$ contained in $I^u_A$.
In [14], L.-S. Young gives the construction of a rectangle $K = \big(\bigcup_{\gamma^s \in I^s} \gamma^s\big) \cap \big(\bigcup_{\gamma^u \in I^u} \gamma^u\big)$ contained in $M$ (where $I^s$ is a family of stable curves contained in $M \setminus R_1$ and $I^u$ a family of unstable curves contained in $M \setminus R_{-1}$) endowed with a return time $R(\cdot)$ in $K$ under the action of $T$ and of a (countable) $\nu$-essential partition $\{K_i\}_{i\ge0}$ of $K$ in s-sub-rectangles satisfying (in particular) the following:
• $R$ is equal to a constant $r_i$ on each $K_i$;
• For any $x \in K$, we have: $T^{R(x)}(\gamma^s(x)) \subseteq \gamma^s(T^{R(x)}(x))$ and $T^{R(x)}(\gamma^u(x)) \supseteq \gamma^u(T^{R(x)}(x))$;
• For any $i \ge 0$, $T^{r_i}(K_i)$ is a u-sub-rectangle of $K$;
• $K_i$ is contained in a connected component of $M \setminus R_{-r_i,0}$.

[Figure: the rectangle $\Lambda$, the sub-rectangle $\Lambda_i$, the curves $\gamma^s$ and $\gamma^u$, and the map $T^{R_i}$.]
Then, she constructs a Borel probability measure $\tilde\mu$ on $K$, $T^{R(\cdot)}$-invariant, such that $E_{\tilde\mu}[R] < +\infty$ and such that the "discrete-time suspension system" $(\tilde M_1, \tilde\nu_1, \tilde T_1)$ over $(K, \tilde\mu, T^{R(\cdot)})$ defined by the function $R(\cdot)$ as follows is an extension of $(M,\nu,T)$ (by $\tilde\pi_1 : \tilde M_1 \to M$ given by $\tilde\pi_1(x,l) = T^l(x)$):
• $\tilde M_1 := \{(x,l) : x \in K,\ 0 \le l \le R(x)-1\}$;
• $\tilde T_1(x,l) = (x,l+1)$ if $l < R(x)-1$ and $\tilde T_1(x,l) = (T^{R(x)}(x), 0)$ if $l = R(x)-1$;
• $\tilde\nu_1\big(\bigcup_{l\ge0} A_l \times \{l\}\big) = \dfrac{\sum_{l\ge0}\tilde\mu(A_l)}{E_{\tilde\mu}[R(\cdot)]}$, where, for each $l$, $A_l$ is a measurable subset of $\{R > l\}$.
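The suspension construction above can be illustrated on a finite toy model. This is a sketch only, not the billiard tower: the three-point base, the return times `R` and the permutation `base_return` (standing in for $T^{R(\cdot)}$) are hypothetical data chosen so that the uniform base measure is invariant.

```python
from fractions import Fraction

# Hypothetical toy data: base K = {0, 1, 2} with return times R and a base
# return map that is a permutation, so the uniform measure mu is invariant.
R = {0: 1, 1: 3, 2: 2}
base_return = {0: 1, 1: 2, 2: 0}
mu = {x: Fraction(1, 3) for x in R}

# Tower: all pairs (x, l) with 0 <= l <= R(x) - 1.
tower = [(x, l) for x in R for l in range(R[x])]
mean_R = sum(mu[x] * R[x] for x in R)

def tower_map(point):
    """Climb one level, or return to the base when the top level is reached."""
    x, l = point
    if l < R[x] - 1:
        return (x, l + 1)
    return (base_return[x], 0)

# Suspension measure: base mass of x spread over its column, normalised by E[R].
nu = {(x, l): mu[x] / mean_R for (x, l) in tower}

def pushforward(measure):
    """Image of a measure on the tower under tower_map."""
    image = {pt: Fraction(0) for pt in tower}
    for pt, mass in measure.items():
        image[tower_map(pt)] += mass
    return image
```

In this toy model `pushforward(nu)` coincides with `nu`, which is the finite analogue of the invariance of $\tilde\nu_1$ under $\tilde T_1$.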
A partition. We define $i_l : \{x \in K : R(x) > l\} \to L_l$ by $i_l(x) = (x,l)$. L.-S. Young gives the construction of a partition $D = \{L_{l,j};\ l \ge 0,\ j = 1,\dots,j_l\}$, where $\{L_{l,j}\}_j$ is a finite partition of the $l$-th "storey" $L_l := \{(x,l') \in \tilde M_1;\ l' = l\}$ satisfying the following properties:

Properties A.3.
1. $j_0 = 1$ and $L_{0,1} = L_0 = K \times \{0\}$;
2. each $i_l^{-1}(L_{l,j})$ is an s-sub-rectangle of $K$, union of $K_i$;
3. For any $l \ge 0$, $\{i_{l+1}^{-1}(L_{l+1,j});\ j = 1,\dots,j_{l+1}\}$ is a partition of $\{R > l+1\}$ finer than the one induced by $\{i_l^{-1}(L_{l,j});\ j = 1,\dots,j_l\}$;
4. For any $x, y$ in $i_l^{-1}(L_{l,j})$ and in a same unstable curve, there exists an unstable curve containing $x$ and $y$ and contained in $M \setminus R_{-l,0}$;
5. If $\tilde T_1^{-1}(L_0) \cap L_{l,j} \ne \emptyset$, then there exists an integer $i \ge 0$ such that $\tilde T_1^{-1}(L_0) \cap L_{l,j} = K_i \times \{r_i - 1\}$.

For any $X, Y \in \tilde M_1$, we define the separation time $s(X,Y)$ between $X$ and $Y$ as follows:
\[ s(X,Y) := \max\big\{n \ge 0 : \tilde T_1^n(Y) \in D\big(\tilde T_1^n(X)\big)\big\}. \]
The following fact shall be useful in our proof.

Fact A.4. Let $n \ge 0$ be an integer. Let $X$ and $Y$ be two points in $\tilde M_1$ such that $s(X,Y) \ge n$. Then, the intersection point $z$ of the curves $\gamma^s(\tilde\pi_1(X))$ and $\gamma^u(\tilde\pi_1(Y))$ exists. Moreover, $T^n(z)$ and $T^n(\tilde\pi_1(Y))$ are both contained in the same unstable curve.

Let us write $d := \gcd(r_i)$.

An extension of $(M,\nu,T^d)$. We can show that the dynamical system $(\tilde M_d, \tilde\nu_d, \tilde T_d)$, defined as follows, is an extension of $(M,\nu,T^d)$ by $\tilde\pi_d := \tilde\pi_{1|\tilde M_d}$:
• $\tilde M_d := \bigcup_{l\ge0} L_{ld}$;
• $\tilde\mu_d := (\tilde\nu_1)_{|\tilde M_d}$ and $\tilde\nu_d := d.\tilde\mu_d$ is the probability measure proportional to $\tilde\mu_d$;
• $\tilde T_d := (\tilde T_1^d)_{|\tilde M_d}$.
A factor with a quasicompact transfer operator. We consider the factor $(\hat M_d, \hat\nu_d, \hat T_d)$ of $(\tilde M_d, \tilde\nu_d, \tilde T_d)$ given by the canonical projection $\hat\pi_d : \tilde M_d \to \hat M_d$, where $\hat M_d$ is the set of the $\mathcal R_d$-classes of $\tilde M_d$, for the binary relation $\mathcal R_d$ defined on $\tilde M_d$ by: $(x,l)\,\mathcal R_d\,(x',l') \Leftrightarrow l = l'$ and $x, x'$ are in a same $\gamma^s \in I^s$. L.-S. Young defines a natural measure $\hat m$ on $\hat M_d$ such that $\hat\nu_d$ is absolutely continuous relatively to $\hat m$ and such that the density $\hat\rho := \frac{d\hat\nu_d}{d\hat m}$ satisfies:
• $c_0^{-1} \le \hat\rho \le c_0$, for some real number $c_0 > 1$;
• $|\hat\rho(\hat x) - \hat\rho(\hat y)| \le c_1\,\alpha_0^{\hat s(\hat x,\hat y)}\,\hat\rho(\hat x)$, for some real numbers $c_1 > 0$ and $\alpha_0 \in\, ]0;1[$;
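with $\hat s(\hat\pi_d(x), \hat\pi_d(y)) := s(x,y)$. We shall write $\hat L_{ld} := \hat\pi_d(L_{ld})$ and $\hat L_{ld,j} := \hat\pi_d(L_{ld,j})$. Let us fix $\alpha_1 := \max(\alpha,\alpha_0)$. For any real numbers $\beta \in\, ]0;1[$ and $\varepsilon > 0$, we define the functional space $V_{(\beta,\varepsilon)}$ as follows:
\[ V_{(\beta,\varepsilon)} := \big\{\hat f : \hat M_d \to \mathbb C \text{ measurable},\ \|\hat f\|_{V_{(\beta,\varepsilon)}} < +\infty\big\}, \]
where $\|\hat f\|_{V_{(\beta,\varepsilon)}} := \|\hat f\|_{(\beta,\varepsilon,\infty)} + \|\hat f\|_{(\beta,\varepsilon,h)}$, with
\[ \|\hat f\|_{(\beta,\varepsilon,\infty)} := \sup_{l\ge0}\big\|\hat f_{|\hat L_{ld}}\big\|_\infty e^{-ld.\varepsilon}, \qquad \|\hat f\|_{(\beta,\varepsilon,h)} := \sup_{l\ge0;\ j=1,\dots,j_{ld}}\ \sup_{\hat x,\hat y \in \hat L_{ld,j}} \frac{|\hat f(\hat x) - \hat f(\hat y)|}{\beta^{\hat s(\hat x,\hat y)/d}}\, e^{-ld.\varepsilon}. \]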
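0 such that, for any real number $\varepsilon \in\, ]0;\varepsilon_0]$, the three following points hold:
• There exists a real number $C_0 > 0$ satisfying $\|\cdot\|_{L^2(\hat\nu_d)} \le C_0 \|\cdot\|_{V_{(\beta,\varepsilon)}}$;
• There exist two real numbers $\tau_1 \in\, ]0;1[$ and $C_1 > 0$ such that, for any integer $n \ge 0$ and for any $\hat f \in V_{(\beta,\varepsilon)}$ satisfying $\int_{\hat M_d} \hat f\, d\hat m = 0$, we have $\|P^n \hat f\|_{V_{(\beta,\varepsilon)}} \le C_1 \tau_1^n \|\hat f\|_{V_{(\beta,\varepsilon)}}$;
• We have $P(\hat f)(\hat x) = \sum_{\hat z : \hat T_d(\hat z) = \hat x} \xi(\hat z)\hat f(\hat z)$, with $\big|\log\frac{\xi(\hat x)}{\xi(\hat y)}\big| \le C_2\,\alpha_1^{\hat s(\hat x,\hat y)-1}$, for any $\hat x$ and $\hat y$ in the same $\hat L_{l,j}$.
In the following, we shall consider $(\beta,\varepsilon)$ satisfying these properties.

B. Proof of the Strong Decorrelation Property

An exponential rate of decorrelation for $(M,\nu,T^d)$.

Theorem B.1. Let $\kappa \in\, ]0;\frac12[$ be a real number. There exists a constant $L_{\eta,\kappa} > 0$ such that, for any integers $\tilde m_1, \tilde m_2$, any functions $\varphi$ and $\psi$ in $H_{\eta,\tilde m_1.d}$ and in $H_{\eta,\tilde m_2.d}$ respectively and any integer $n \ge 0$, we have
\[ \big|\mathrm{Cov}_\nu(\varphi, \psi\circ T^{n.d})\big| \le L_{\eta,\kappa}\big(\|\varphi\|_\infty C_\psi + C_\varphi\|\psi\|_\infty + \|\varphi\|_\infty.\|\psi\|_\infty\big)\,\tau_0^{\,n - \frac{\tilde m_1}{1-2\kappa}}, \]
with $\tau_0 := \max\big(\alpha^{\kappa\eta.d}, \tau_1^{1-2\kappa}\big)$.

Before establishing this result, we give the idea of its proof. We shall suppose $n(1-2\kappa) \ge \tilde m_1$ and shall show how the study of $\mathrm{Cov}_\nu(\varphi, \psi\circ T^{n.d})$ leads us, after approximations, to the study of a quantity of the form $E_{\hat m}[\hat f.\hat g\circ\hat T_d^n]$, where $\hat f$ and $\hat g$ are two bounded functions defined on $\hat M_d$ such that $P^{n_1}\hat f$ is $\hat m$-centered and is in $V_{(\beta,\varepsilon)}$ with $n_1 = 2\lfloor\kappa n\rfloor + \tilde m_1$. Therefore, we shall get:
\[ \big|E_{\hat m}[\hat f.\hat g\circ\hat T_d^n]\big| = \big|E_{\hat m}[P^n(\hat f).\hat g]\big| \le \|\hat g\|_\infty\,\|P^n(\hat f)\|_1 \le \|\hat g\|_\infty\, C_0\,\|P^n(\hat f)\|_{V_{(\beta,\varepsilon)}} \le \|\hat g\|_\infty\, C_0 C_1 \tau_1^{\,n-n_1}\,\|P^{n_1}(\hat f)\|_{V_{(\beta,\varepsilon)}}. \]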
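Proof. Let $\tilde m_1$, $\tilde m_2$, $\kappa$, $\varphi$ and $\psi$ be as in the statement of the theorem. If $n(1-2\kappa) < \tilde m_1$, then we have
\[ \big|\mathrm{Cov}_\nu(\varphi, \psi\circ T^{n.d})\big| \le \|\varphi\|_\infty.\|\psi\|_\infty \le \|\varphi\|_\infty.\|\psi\|_\infty\,\tau_0^{-\frac{\tilde m_1}{1-2\kappa}}\,\tau_0^{\,n}. \]
In the following, we shall suppose $n(1-2\kappa) \ge \tilde m_1$. We denote $k = k_n := \lfloor\kappa n\rfloor$. We have $n \ge 2\kappa n + \tilde m_1 \ge 2k + \tilde m_1$. Therefore
\[ \mathrm{Cov}_\nu(\varphi, \psi\circ T^{n.d}) = \mathrm{Cov}_{\tilde\nu_d}\big(\tilde\varphi\circ\tilde T_d^k, (\tilde\psi\circ\tilde T_d^k)\circ\tilde T_d^n\big) \]
with $\tilde\varphi := \varphi\circ\tilde\pi_d$ and $\tilde\psi := \psi\circ\tilde\pi_d$. So, we have $|\mathrm{Cov}_\nu(\varphi, \psi\circ T^{n.d})| \le A_n + B_n + C_n$, with $A_n$, $B_n$ and $C_n$ defined as follows:

1. We write
\[ A_n := \Big|\mathrm{Cov}_{\tilde\nu_d}\big(\tilde\varphi\circ\tilde T_d^k, (\tilde\psi\circ\tilde T_d^k)\circ\tilde T_d^n\big) - \mathrm{Cov}_{\tilde\nu_d}\big(\tilde\varphi\circ\tilde T_d^k, \hat\psi_k\circ\tilde T_d^n\big)\Big| \]
where $\hat\psi_k(x)$ is the infimum of $\tilde\psi\circ\tilde T_d^k = \psi\circ T^{kd}\circ\tilde\pi_d$ on the atom of $\mathcal M_{2k+\tilde m_2}$ containing $x$, where we have written
\[ \mathcal M_{2k+\tilde m_2} := \Big(\bigvee_{i=0}^{(2k+\tilde m_2)d} \tilde T_1^{-i} D\Big)_{|\tilde M_d}. \]
We shall use the regularity of $\psi$ to get an upper bound for $A_n$. We recall that, by hypothesis, $x \mapsto \psi(x)$ is Hölder of order $\eta$ in $(x, T(x), \dots, T^{\tilde m_2.d}(x))$. Moreover, we shall see that each atom of $T^{kd+j}\big(\tilde\pi_d(\mathcal M_{2k+\tilde m_2})\big)$ (for $j = 0,\dots,\tilde m_2 d$) is contained in a connected component of $M \setminus R_{-s_j,0}$ and has a diameter less than $2C\alpha^{kd}$, with $s_j := (2k+\tilde m_2)d - (kd+j) \ge kd$. Indeed, let $Y_1$ and $Y_2$ be two points in the same atom of $\mathcal M_{2k+\tilde m_2}$. Then, we have $s(Y_1,Y_2) \ge 2k+\tilde m_2$. Therefore, according to Fact A.4, the intersection point $y_3$ of $\gamma^u(\tilde\pi_d(Y_1))$ with $\gamma^s(\tilde\pi_d(Y_2))$ exists. Since $y_3$ and $\tilde\pi_d(Y_2)$ are both contained in the same stable curve and according to Proposition A.2, we have
\[ d\big(T^{kd+j}(y_3), T^{kd+j}(\tilde\pi_d(Y_2))\big) \le C\alpha^{kd+j} \le C\alpha^{kd}. \]
Moreover, according to Fact A.4, $T^{(2k+\tilde m_2)d}(y_3)$ and $T^{(2k+\tilde m_2)d}(\tilde\pi_1(Y_1))$ are both contained in the same unstable curve. So we have:
\[ d\big(T^{kd+j}(y_3), T^{kd+j}(\tilde\pi_d(Y_1))\big) \le C\alpha^{s_j} \le C\alpha^{kd}. \]
As $\psi$ is in $H_{\eta,\tilde m_2 d}$, according to the foregoing, we have
\[ \big\|\tilde\psi\circ\tilde T_d^k - \hat\psi_k\big\|_\infty \le C_\psi\big(2C\alpha^{kd}\big)^\eta \le K_\eta C_\psi \tau_0^{\,n}; \]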
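with $K_\eta := (2C)^\eta \alpha^{-\eta d}$. We get
\[ A_n \le \|\varphi\|_\infty\,\big\|\tilde\psi\circ\tilde T_d^k - \hat\psi_k\big\|_\infty \le K_\eta \|\varphi\|_\infty C_\psi \tau_0^{\,n}. \]
2. We write $B_n := \big|\mathrm{Cov}_{\tilde\nu_d}\big(\tilde\varphi\circ\tilde T_d^k, \hat\psi_k\circ\tilde T_d^n\big) - \mathrm{Cov}_{\tilde\nu_d}\big(\hat\varphi_k, \hat\psi_k\circ\tilde T_d^n\big)\big|$, where $\hat\varphi_k(x)$ is the infimum of $\tilde\varphi\circ\tilde T_d^k$ on the atom of $\mathcal M_{2k+\tilde m_1} := \big(\bigvee_{i=0}^{(2k+\tilde m_1)d} \tilde T_1^{-i} D\big)_{|\tilde M_d}$ containing $x$. As previously, we can show that we have $B_n \le K_\eta C_\varphi \|\psi\|_\infty \tau_0^{\,n}$.
3. We shall now give an upper bound for the following quantity:
\begin{align*}
C_n := \big|\mathrm{Cov}_{\tilde\nu_d}\big(\hat\varphi_k, \hat\psi_k\circ\tilde T_d^n\big)\big|
&= \Big|\int_{\tilde M_d} \hat\psi_k\circ\tilde T_d^n.\hat\varphi_k\, d\tilde\nu_d - \int_{\tilde M_d}\hat\varphi_k\, d\tilde\nu_d . \int_{\tilde M_d}\hat\psi_k\, d\tilde\nu_d\Big| \\
&= \Big|\int_{\hat M_d} \hat\psi_k\circ\hat T_d^n.\hat\varphi_k\, d\hat\nu_d - \int_{\hat M_d}\hat\varphi_k\, d\hat\nu_d . \int_{\hat M_d}\hat\psi_k\, d\hat\nu_d\Big|,
\end{align*}
where we also denote by $\hat\varphi_k$ the map $\hat\varphi_k\circ\hat\pi_d^{-1}$,
\begin{align*}
&\le \Big|\int_{\hat M_d} \hat\psi_k.P^n(\hat\varphi_k\hat\rho)\, d\hat m - \int_{\hat M_d}\hat\varphi_k\, d\hat\nu_d . \int_{\hat M_d}\hat\psi_k\, d\hat\nu_d\Big| \\
&\le \Big|\int_{\hat M_d} \hat\psi_k.\Big(P^n(\hat\varphi_k\hat\rho) - \Big(\int_{\hat M_d}\hat\varphi_k\hat\rho\, d\hat m\Big)\hat\rho\Big)\, d\hat m\Big| \\
&\le \|\psi\|_\infty C_0\,\Big\|P^n(\hat\varphi_k\hat\rho) - \Big(\int_{\hat M_d}\hat\varphi_k\hat\rho\, d\hat m\Big)\hat\rho\Big\|_{V_{(\beta,\varepsilon)}} \\
&\le \|\psi\|_\infty C_0 C_1 \tau_1^{\,n-(2k+\tilde m_1)}\,\Big\|P^{2k+\tilde m_1}\Big(\Big(\hat\varphi_k - \int_{\hat M_d}\hat\varphi_k\hat\rho\, d\hat m\Big)\hat\rho\Big)\Big\|_{V_{(\beta,\varepsilon)}},
\end{align*}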
ˆ ρˆ is m-centered ˆ and we shall see that P 2k+m˜ 1 (φˆ k ρ) ˆ is in since φˆ k − Mˆ d φˆ k ρˆ d m V(β,ε) . Let l ≥ 0 and j = 1, ..., jld be two integers. We denote by Ald,j the set of ˆ 2k+m˜ := πˆ d M2k+m˜ such that Tˆ 2k+m˜ 1 (A) ⊆ L ˆ ld,j . Let A be an atoms A of M 1 1 d 2k+ m ˜ 1 ˆ ld,j . defines a one-to-one map from A onto L atom of Ald,j . Then, the map Tˆd Indeed, Point 5 of Properties A.3, the fact that each Ki is a s-sub-rectangle and that ˆ T ri (Ki ) is a u-sub-rectangle insure as that Td defines a one-to-one map from each ?d −j " ˜ ˆ ⊆ L ˆ 0 ) onto L ˆ 0 . We denote by Bˆ (in πˆ d (D) and such that Tˆd (B) j "=0 T1 −(2k+m ˜ ) Tˆ(A,d) 1 the inverse map of Tˆd 2k + m ˜ 1 restricted to A. We notice that, for any ˆ ld,j , we have xˆ ∈ L
P 2k+m˜ 1 (fˆ)(x) ˆ =
A∈Ald,j
−(2k+m ˜ ) ξA (x) ˆ fˆ Tˆ(A,d) 1 (x) ˆ ,
@2k+m˜ 1 −1 i −(2k+m˜ 1 ) where ξA (x) ˆ := i=0 ξ Tˆd (Tˆ(A,d) (x)) ˆ . Since we have P (ρ) ˆ = ρ, ˆ ξA ≥ 0 and ρˆ ≥ 0, we have
2k+m˜ 1 ˆ
(φk ρ) ˆ = sup P 2k+m˜ 1 (φˆ k ρ) ˆ |Lˆ ld,j e−ldε P (β,ε,∞)
∞
l,j
≤ sup φ∞ P 2k+m˜ 1 (ρ) ˆ |Lˆ ld,j e−ldε ≤ c0 φ∞ . ∞
l,j
ˆ l,j , we have According to the foregoing, for any x, ˆ yˆ ∈ L
ˆ y) ˆ
ˆ
C2 α1 sˆ(x,
log ξA (x) , ≤
ξA (y) ˆ
1 − α1
ˆ y) ˆ , for some constant C > 0 independent ˆ − ξA (x) ˆ ≤ C1 ξA (x)α ˆ 1 sˆ(x, and so ξA (y) 1 of n and of A. We denote by cA,ld,j the constant to which φˆ k is equal on A and (A) −(2k+m ˜ ) ρˆ2k+m˜ 1 ,ld,j := ρˆ ◦ Tˆ(A,d) 1 . We get P 2k+m˜ 1 (φˆ k ρ) ˆ (β,ε,h) 2k+ m ˜ 2k+ m ˜ 1 1 ˆ ˆ |P (φk ρ)( ˆ x) ˆ −P (φk ρ)( ˆ y)| ˆ −ldε = < e = sup sup sˆ (x, ˆ y) ˆ d l,j ˆ ld,j d x, ˆ y∈ ˆ L β ≤ sup
sup
(A)
l,j x, ˆ ld,j A∈A ˆ y∈ ˆ L ld,j
≤ sup
sup
l,j x, ˆ ld,j ˆ y∈ ˆ L
φ∞
|cA,j,ld |.
(A)
|ξA (x) ˆ ρˆ2k+m˜ 1 ,ld,j (x) ˆ − ξA (y) ˆ ρˆ2k+m˜ 1 ,ld,j (y)| ˆ β
A∈Ald,j
d
0 and $u_d$ minimal is called the fundamental solution since all the other solutions can be generated from it by a simple law. Set $\varepsilon_d := (u_d + v_d\sqrt d)/2$. The trace of $\varepsilon_d$ is defined as $\operatorname{tr}(\varepsilon_d) := u_d$. The one-to-one correspondence between primitive conjugacy classes in $\Gamma$ and closed geodesics on $F$ is as follows: Every primitive hyperbolic $P \in \Gamma$ has two real fixed points (one of them can be $\infty$). The orthogonal circle in $H$ which ends in these fixed points induces a closed geodesic on $F$ of length $l$ where $2\cosh(l/2) = |\operatorname{Tr} P|$. The possible values for $l$ and their multiplicities are described in

Proposition 2.1 ([15], Corollary 1.5). The lengths of closed geodesics on $F$ are the numbers $2\log\varepsilon_d$ with multiplicities $h(d)$.

From this proposition and the preceding description it follows that
\[ \alpha(n) = \frac{\log n}{n} \sum_{d:\ \operatorname{tr}(\varepsilon_d) = n} h(d). \tag{2.1} \]
Quantitative results about class numbers are often derived using Dirichlet’s class number formula. First we must define Jacobi’s character and Dirichlet L-series. Define $\chi_d : \mathbb Z \to \{0,\pm1\}$ to be completely multiplicative with
\[ \chi_d(p) := \begin{cases} 1, & x^2 \equiv d\ (p) \text{ is solvable},\ p \nmid d,\\ -1, & x^2 \equiv d\ (p) \text{ is insolvable},\\ 0, & p \mid d, \end{cases} \qquad \chi_d(2) := \begin{cases} 1, & d \equiv 1\ (8),\\ -1, & d \equiv 5\ (8),\\ 0, & d \equiv 0\ (4), \end{cases} \qquad \chi_d(-1) := 1, \]
where $p$ denotes an odd prime. $\chi_d$ is a Dirichlet character modulo $d$ (the quadratic reciprocity law is used to prove the $d$-periodicity). For $\Re s > 0$, the series
\[ L(s,\chi_d) := \sum_{n\ge1} \frac{\chi_d(n)}{n^s} \]
is uniformly convergent and defines a holomorphic function.

Proposition 2.2 (Dirichlet’s class number formula). For a positive non-square discriminant $d$, we have
\[ h(d)\log\varepsilon_d = \sqrt d\, L(1,\chi_d). \]
In the next section this formula will be used to prove the almost periodicity of $\alpha$. In principle this could be done by writing
\[ L(1,\chi_d) = \sum_{1\le n\le N} \frac{\chi_d(n)}{n} + \text{error}. \]
Note that the sum on the right-hand side is a periodic function of $d$ and can be used to approximate $L(1,\chi_d)$. But this procedure has three severe drawbacks: First the approximation is not particularly good. Therefore we will use a smoothed version of the series representation of $L(1,\chi_d)$ instead. Second we must approximate a sum of values $L(1,\chi_d)$ with the condition $\operatorname{tr}(\varepsilon_d) = n$. Breaking up this condition into easier summations, as must be done to process it further, would make things much more difficult. Third we want to compute the Fourier coefficients of $\alpha$ and show that they are multiplicative. This would be near to impossible with the above approach.
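The class number formula and the plain partial-sum approximation discussed above can be tried numerically. The following is a sketch under stated assumptions: the helper names (`jacobi`, `chi`, `fundamental_unit`, `class_number_estimate`) and the truncation length are illustrative choices, not part of the paper, and the character is computed through the Jacobi symbol, which agrees with the definition of $\chi_d$ on odd primes.

```python
import math

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def chi(d, n):
    """chi_d(n) for a discriminant d (d = 0 or 1 mod 4) and n >= 1."""
    value = 1
    while n % 2 == 0:
        n //= 2
        if d % 4 == 0:
            return 0
        value *= 1 if d % 8 == 1 else -1
    return value * jacobi(d, n)

def fundamental_unit(d):
    """epsilon_d = (u + v*sqrt(d))/2 from the least v >= 1 with u^2 - d*v^2 = 4."""
    v = 1
    while True:
        u = math.isqrt(d * v * v + 4)
        if u * u == d * v * v + 4:
            return (u + v * math.sqrt(d)) / 2
        v += 1

def class_number_estimate(d, terms=100_000):
    """sqrt(d) * (partial sum of L(1, chi_d)) / log(epsilon_d)."""
    l_value = sum(chi(d, n) / n for n in range(1, terms + 1))
    return math.sqrt(d) * l_value / math.log(fundamental_unit(d))
```

The slow convergence of the unsmoothed partial sums is visible here: many terms are needed for a few correct digits, which is in line with the first drawback mentioned above.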
Instead an approximating periodic function is used which already incorporates some sort of multiplicativity. From the multiplicativity of $\chi_d$ the Euler product
\[ L(s,\chi_d) = \prod_p \Big(1 - \frac{\chi_d(p)}{p^s}\Big)^{-1}, \qquad \Re s > 1, \]
follows. Thus it seems more reasonable to use a partial product of this representation than a partial sum of the series as approximating function.
3. Almost Periodicity

The standard reference for almost periodic arithmetical functions is [17]. Here the necessary material will be reviewed briefly. Let $q \ge 1$. For $f : \mathbb N \to \mathbb C$, define the seminorm
\[ \|f\|_q := \Big(\limsup_{N\to\infty} \frac1N \sum_{1\le n\le N} |f(n)|^q\Big)^{1/q} \in [0,\infty]. \]
$f$ is called $q$-limit periodic if for every $\varepsilon > 0$ there is a periodic function $h$ with $\|f - h\|_q \le \varepsilon$. The set $\mathcal D^q$ of all $q$-limit periodic functions becomes a Banach space with norm $\|\cdot\|_q$ if functions $f_1, f_2$ with $\|f_1 - f_2\|_q = 0$ are identified. If $1 \le q_1 \le q_2 < \infty$, we have $\mathcal D^1 \supseteq \mathcal D^{q_1} \supseteq \mathcal D^{q_2}$ as sets (but they are endowed with different norms!). There is the more general notion of $q$-almost periodic function which will not be used in this paper (they are defined as above but with arbitrary trigonometric polynomials for $h$ instead of periodic functions). For all $f \in \mathcal D^1$, the mean value
\[ M(f) := \lim_{N\to\infty} \frac1N \sum_{1\le n\le N} f(n) \]
exists. The space $\mathcal D^2$ is a Hilbert space with inner product
\[ \langle f, h\rangle := M(f\bar h), \qquad f, h \in \mathcal D^2. \]
For $u \in \mathbb R$, define $e_u(n) := e^{2\pi i u n}$, $n \in \mathbb N$. For all $f \in \mathcal D^1$, the Fourier coefficients $\hat f(u) := M(f e_{-u})$, $u \in \mathbb R$, exist. For $u \notin \mathbb Q$, we have $\hat f(u) = 0$ (this comes from the fact that $f$ can be approximated by linear combinations of functions $e_v$ with $v \in \mathbb Q$, and that $M(e_v e_{-u})$ equals 1 if $v - u \in \mathbb Z$ and 0 otherwise; for almost periodic functions it is no longer true). In $\mathcal D^2$, we have the canonical orthonormal base $\{e_{a/b}\}$, where $1 \le a \le b$ and $\gcd(a,b) = 1$. Limit periodic (and almost periodic) functions have a couple of nice properties. They can be added, multiplied and plugged into continuous functions and, under certain conditions, the result is again a limit periodic (almost periodic) function. They have mean values and limit distributions. Here we will use Parseval’s equation. In Corollary 4.2 we will prove that $\alpha \in \mathcal D^q$ for all $q \ge 1$. As a side result in Sect. 5, we get $M(\alpha) = \hat\alpha(0) = 1$. Thus for $r \in \mathbb N_0$, we have $\tilde\alpha, \tilde\alpha_{+r} \in \mathcal D^2$ (where $\tilde\alpha_{+r}(n) := \tilde\alpha(n+r)$), and Parseval’s equation gives
\[ \gamma(r) = M(\tilde\alpha\tilde\alpha_{+r}) = \langle\alpha, \alpha_{+r}\rangle - 2M(\alpha) + 1 = \sum_{b\ge1}\ \sum_{\substack{1\le a\le b:\\ (a,b)=1}} \hat\alpha\Big(\frac ab\Big)\overline{\hat\alpha_{+r}\Big(\frac ab\Big)} - 1 = \sum_{b\ge1}\ \sum_{\substack{1\le a\le b:\\ (a,b)=1}} \Big|\hat\alpha\Big(\frac ab\Big)\Big|^2 e^{2\pi i a r/b} - 1. \tag{3.1} \]
The last equation follows easily since $\alpha$ is real valued. If we can compute $\hat\alpha$ and show that it is multiplicative our main theorem follows. The approximation of $\alpha$ by periodic functions is done in two steps. First we bring Dirichlet L-series into the picture. For $n \in \mathbb N$, $n > 2$, define
\[ \beta(n) := \sum_{\substack{d,v\ge1:\\ dv^2 = n^2-4}} \frac1v L(1,\chi_d), \]
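where we must remember that $d$ will always run through non-square positive discriminants.

Lemma 3.1. For $q \ge 1$, we have $\|\alpha - \beta\|_q = 0$.

Proof. For $d$ fixed, the powers $\varepsilon_d^l$, $l \ge 1$, of the fundamental unit give all solutions $(u,v)$ of the Pellian equation $u^2 - dv^2 = 4$ with $u, v \ge 1$ by way of the rule
\[ \frac{u + v\sqrt d}{2} = \varepsilon_d^l. \]
For every such solution we have $v\sqrt d = \sqrt{u^2 - 4}$. Thus Proposition 2.2 gives
\[ \beta(n) = \sum_{\substack{d,l\ge1:\\ \operatorname{tr}(\varepsilon_d^l) = n}} \frac{h(d)\log\varepsilon_d}{\sqrt{n^2-4}}. \]
Since
\[ \log\varepsilon_d = \frac1l\log\varepsilon_d^l = \frac1l\log\frac{n + \sqrt{n^2-4}}{2}, \]
it follows from (2.1) that
\begin{align*}
\beta(n) &= \frac{1}{\sqrt{n^2-4}}\,\log\frac{n + \sqrt{n^2-4}}{2} \sum_{\substack{d,l\ge1:\\ \operatorname{tr}(\varepsilon_d^l) = n}} \frac1l\, h(d) \\
&= \frac{n}{\sqrt{n^2-4}}\cdot\frac{\log\frac12\big(n + \sqrt{n^2-4}\big)}{\log n}\,\alpha(n) + O\Big(\frac{\log n}{n} \sum_{\substack{d\ge1,\ l\ge2:\\ \operatorname{tr}(\varepsilon_d^l) = n}} h(d)\Big).
\end{align*}
Since $h(d) \ll \sqrt d$ (this can be seen, e.g. from Proposition 2.2 and the estimates $\log\varepsilon_d \ge \log\frac12(1+\sqrt d)$ and $L(1,\chi_d) \ll \log d$; the latter follows easily by partial summation from the orthogonality relation for characters), it follows that for $d, l \ge 1$ with $\operatorname{tr}(\varepsilon_d^l) = n$,
\[ h(d) \ll \sqrt d \ll (\varepsilon_d^l)^{1/l} = \Big(\frac{n + \sqrt{n^2-4}}{2}\Big)^{1/l} \le n^{1/l}. \]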
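Thus
\[ \alpha(n) \ll \log n \sum_{\substack{d,v\ge1:\\ dv^2 = n^2-4}} 1 \le \log n\ \tau(n^2-4) \ll n^{\varepsilon} \]
for all $\varepsilon > 0$; here $\tau$ denotes the divisor function and $\tau(m) \ll m^{\varepsilon}$ was used. This implies
\[ |\beta(n) - \alpha(n)| \ll \Big|\Big(1 - \frac{4}{n^2}\Big)^{-1/2}\Big(1 + \frac{\log\frac12\big(1 + \sqrt{1 - 4/n^2}\big)}{\log n}\Big) - 1\Big|\cdot|\alpha(n)| + \frac{\log n}{\sqrt n} \sum_{\substack{d,v\ge1:\\ dv^2 = n^2-4}} 1 \ll n^{-1/2+\varepsilon}, \]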
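which proves the lemma.

In the crucial second step $\beta$ is approximated by a sum of partial products of the Euler product of $L(1,\chi_d)$. For $n > 2$, $P \ge 2$, define
\[ \beta_P(n) := \sum_{\substack{d,v\ge1:\ dv^2 = n^2-4,\\ p|v \Rightarrow p\le P}} \frac1v \prod_{p\le P}\Big(1 - \frac{\chi_d(p)}{p}\Big)^{-1}. \]
Then $\beta(n) - \beta_P(n) = \Delta^{(1)}_P(n) + \Delta^{(2)}_P(n)$, where
\[ \Delta^{(1)}_P(n) := \sum_{\substack{d,v\ge1:\ dv^2 = n^2-4,\\ p|v \text{ for some } p>P}} \frac1v L(1,\chi_d) \]
and
\[ \Delta^{(2)}_P(n) := \sum_{\substack{d,v\ge1:\ dv^2 = n^2-4,\\ p|v \Rightarrow p\le P}} \frac1v\Big(L(1,\chi_d) - \prod_{p\le P}\Big(1 - \frac{\chi_d(p)}{p}\Big)^{-1}\Big). \]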
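Fix $q \in \mathbb N$ with $q > 1$ and choose $q' > 1$ with $1/(2q) + 1/q' = 1$. The next lemma shows that $\Delta^{(1)}_P(n)$ is negligible in the $2q$-norm as $P \to \infty$.

Lemma 3.2. For $P \ge 2$, we have
\[ \big\|\Delta^{(1)}_P\big\|_{2q} \ll \Big(\sum_{v>P} \frac{1}{v^{q'}}\Big)^{1/q'}. \]
Proof. Hölder’s inequality gives
\[ \big|\Delta^{(1)}_P(n)\big| \le \Big(\sum_{\substack{d,v\ge1:\ dv^2 = n^2-4\\ p|v \text{ for some } p>P}} \frac{1}{v^{q'}}\Big)^{1/q'} \Big(\sum_{\substack{d,v\ge1:\ dv^2 = n^2-4\\ p|v \text{ for some } p>P}} L(1,\chi_d)^{2q}\Big)^{1/(2q)}. \]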
For x ≥ 1, this gives
(1)
|,P (n)|2q ≤
v>P
2 2 the inner sum in (2,3) ,P ,N (n) is
l≥1: p|l⇒p≤P
1 −l/N e − 1 l
√ l> N: p|l⇒p≤P
√ l> N: p|l⇒p≤P
1 + N −1/2 =: c1 (P , N ). l
2 + l
√ 1≤l≤ N
1 l · l N
Hölder’s inequality now gives (2,3) 2q , P ,N (n) 2P
χd (l) −l/N 2q e l
1/(2q) .
Thus for x ≥ 1, (2,2,1) 2q , (n) P ,N
2P
2q
1 e−(l1 +···+l2q )/N l1 · · · l2q
χd (l) −l/N e l
√ 1≤v≤ x 2P
+
l1 ,... ,l2q ≥1
1≤v≤ x
l1 · · · l2q −(l1 +···+l2q )/N e l1 · · · l2q
τ2q (l) x + x (1+)/2 N 2q , l K(l) 2q
√ 1≤v≤ x
v
l>P
which together with (3.2) proves the lemma.

In order to estimate $\Delta^{(2,1)}_{P,N}(n)$ we must show that the error
\[ I(d,N) := L(1,\chi_d) - \sum_{l\ge1} \frac{\chi_d(l)}{l} e^{-l/N}, \tag{3.3} \]
which comes from smoothing the Dirichlet series expansion of $L(1,\chi_d)$, is small for large $N$. This is done by representing $I(d,N)$ as an integral over a vertical line in the critical strip $\{0 < \Re s < 1\}$ and using information about the location of the nontrivial zeros of $L(s,\chi_d)$. The Dirichlet series for $L(s,\chi_d)$ is absolutely and uniformly convergent on the line $\Re s = 2$. Stirling’s formula gives $\Gamma(s) \ll |\Im s|^c e^{-\pi|\Im s|/2}$ for $c_1 \le \Re s \le c_2$, $|\Im s| \ge 1$, with some constant $c = c(c_1,c_2)$. Using Mellin’s formula
\[ \frac{1}{2\pi i}\int_{1-i\infty}^{1+i\infty} \Gamma(s)\,y^{-s}\,ds = e^{-y}, \qquad y > 0, \]
we get
\[ \frac{1}{2\pi i}\int_{2-i\infty}^{2+i\infty} \Gamma(s-1)\,L(s,\chi_d)\,N^{s-1}\,ds = \sum_{l\ge1} \frac{\chi_d(l)}{l} e^{-l/N}. \tag{3.4} \]
On the other hand, for $1/2 \le \Re s \le 2$, we have
\[ L(s,\chi_d) \ll (d|s|)^{1/2} \tag{3.5} \]
(this is easily seen by partial summation). Thus the line of integration in (3.4) may be moved to the line $\Re s = \kappa$ with some $1/2 < \kappa < 1$. Taking into account the pole of the integrand at $s = 1$, and using the residue theorem we get for (3.4) the expression
\[ \frac{1}{2\pi i}\int_{\kappa-i\infty}^{\kappa+i\infty} \Gamma(s-1)\,L(s,\chi_d)\,N^{s-1}\,ds + L(1,\chi_d) \]
and thus
\[ I(d,N) = -\frac{1}{2\pi i}\int_{\kappa-i\infty}^{\kappa+i\infty} \Gamma(s-1)\,L(s,\chi_d)\,N^{s-1}\,ds. \tag{3.6} \]
We must now considerably reduce the exponent $1/2$ of $d$ in (3.5). In order to see the principle let us first assume the Generalized Riemann Hypothesis which says that for all Dirichlet characters $\chi$ modulo $q$, the L-series $L(s,\chi)$ has only zeros with real part $1/2$ in the critical strip $0 < \Re s < 1$. From this the Generalized Lindelöf Hypothesis follows and in particular for all $d \ge 1$, $\Re s = \kappa$ and $\varepsilon > 0$, we have $L(s,\chi_d) \ll (d|s|)^{\varepsilon}$. From (3.6) it follows that for $d, N \ge 1$, $I(d,N) \ll d^{\varepsilon} N^{\kappa-1}$. This shows that
(2,1)
,P ,N (n)
|I (d, N )| N κ−1 n3
d,v≥1: dv 2 =n2 −4
and for P ≥ 2 and x, N ≥ 1, we get 1 (2,1) 2q ,P ,N (n) x 6q N 2q(κ−1) . x
(3.7)
2 0, we have 2q 1 (2,1) 2q ,P ,N (n) x N 2q(κ−1) + x µ−1+ log(x 2 N ) . x 2 0 depending on . Together with (3.5) and (3.6) this gives (|t| + 1)c e−π|t|/2 x N κ−1 dt I (d, N ) |t|≤(log x)2 /2 + |t|c e−π|t|/2 (d|t|)1/2 N κ−1 dt |t|≥(log x)2 /2
x N κ−1 .
(3.8)
Next we must show that L(s, χd ) cannot have a zero in Rx too often. From zero density estimates it follows that # (n, v, d) 2 < n ≤ x, d, v ≥ 1, n2 − dv 2 = 4, L(s, χd ) has a zero in Rx x µ+ (3.9) (see [12], Lemma 4.11 or [14], estimate (2.6)). A trivial estimation of (3.3) gives I (d, N ) log(dN ).
(3.10)
(3.8), (3.9), (3.10) and Hölder’s inequality give (2,1) 2q , P ,N (n) 2P
Together with Lemma 3.2 this proves the proposition.
4. An Euler Product

In this section we exploit the particular construction of $\beta_P$ by writing it as a product of functions each depending only on a single prime. For $p$ a prime, $b \in \mathbb N_0$ and $n \in \mathbb Z$, set $I_{p^b}(n) := 1$ if $n^2 \equiv 4\ (p^{2b})$ and, in case $p = 2$, if $(n^2-4)2^{-2b}$ is a discriminant (for $p > 2$ this is automatically fulfilled). Set $I_{p^b}(n) := 0$ otherwise. Define
\[ \beta_{(p)}(n) := \sum_{b\ge0} \frac{1}{p^b}\Big(1 - \frac1p\,\chi_{(n^2-4)p^{-2b}}(p)\Big)^{-1} I_{p^b}(n), \qquad n > 2. \tag{4.1} \]
Lemma 4.1. For $P \ge 2$, we have
\[ \beta_P = \prod_{p\le P} \beta_{(p)}. \]
Proof. In the definition of $\beta_P(n)$, write $v = \prod_{p\le P} p^{b_p}$ with $b_p \in \mathbb N_0$. There is a discriminant $d > 0$ with $dv^2 = n^2-4$ iff $p^{2b_p} \mid n^2-4$ for all $p \le P$ and $d := (n^2-4)v^{-2} \equiv 0, 1\ (4)$. Since $p^{2b_p} \equiv 1\ (4)$ for $2 < p \le P$, the last condition is equivalent to $(n^2-4)2^{-2b_2} \equiv 0, 1\ (4)$. If these conditions are fulfilled, we have $(n^2-4)p^{-2b_p} = d r_p^2$ with $r_p \in \mathbb N$, $p \nmid r_p$, for $p \le P$. Thus $\chi_{(n^2-4)p^{-2b_p}}(p) = \chi_d(p)$ for $p \le P$. This proves the lemma.

Corollary 4.2. For $q \ge 1$, we have $\alpha \in \mathcal D^q$. In particular, $\lim_{P\to\infty} \beta_P = \alpha$ with respect to the $q$-norm.
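Lemma 4.1 is an exact identity for each $n$, so it can be tested with rational arithmetic. The following sketch assumes the definitions of $\chi_d(p)$, $I_{p^b}(n)$, $\beta_{(p)}$ and $\beta_P$ given in the text; the function names are hypothetical and the brute-force enumeration of $v$ is chosen for clarity, not speed.

```python
from fractions import Fraction
from math import isqrt, prod

def chi(m, p):
    """chi_m(p) for a prime p (for p = 2, m is assumed to be a discriminant)."""
    if p == 2:
        if m % 4 == 0:
            return 0
        return 1 if m % 8 == 1 else -1
    r = pow(m % p, (p - 1) // 2, p)  # Euler's criterion
    return -1 if r == p - 1 else r

def is_discriminant(m):
    return m % 4 in (0, 1)

def local_factor(m, p):
    """(1 - chi_m(p)/p)^(-1) as an exact fraction."""
    return 1 / (1 - Fraction(chi(m, p), p))

def beta_local(n, p):
    """beta_(p)(n) from (4.1); the sum over b stops once p^(2b) no longer divides n^2 - 4."""
    m, total, b = n * n - 4, Fraction(0), 0
    while m % p ** (2 * b) == 0:
        reduced = m // p ** (2 * b)
        if p != 2 or is_discriminant(reduced):
            total += Fraction(1, p ** b) * local_factor(reduced, p)
        b += 1
    return total

def beta_P(n, primes):
    """beta_P(n): sum over v composed of the given primes with d = (n^2 - 4)/v^2 a discriminant."""
    m, total = n * n - 4, Fraction(0)
    for v in range(1, isqrt(m) + 1):
        if m % (v * v):
            continue
        rest = v
        for p in primes:
            while rest % p == 0:
                rest //= p
        d = m // (v * v)
        if rest == 1 and is_discriminant(d):
            total += Fraction(1, v) * prod(local_factor(d, p) for p in primes)
    return total
```

For example, `beta_P(6, [2, 3])` and `beta_local(6, 2) * beta_local(6, 3)` both evaluate to `Fraction(9, 8)`.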
Proof. For $q \in \mathbb N$ with $q > 1$ fixed and $q' > 1$ with $1/(2q) + 1/q' = 1$ it follows from Lemma 3.1 and Proposition 3.7 that
\[ \|\alpha - \beta_P\|_{2q} \ll c_2(P)^{1/q'} + c_3(P)^{1/(2q)}; \]
here
\[ c_2(P) := \sum_{v>P} \frac{1}{v^{q'}} \to 0 \]
as $P \to \infty$ since $q' > 1$. Furthermore, $c_3(P) :=$
τ2q (l) →0 l K(l) 2q
l>P
as P → ∞, since τ2q (l) = l K(l) l≥1
a,b≥1: a squarefree
a b2 τ2q (ab2 ) < ∞. ab2 · a a2 b2 a≥1
b≥1
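Thus $\lim_{P\to\infty} \|\alpha - \beta_P\|_{2q} = 0$. For $f : \mathbb N \to \mathbb C$ arbitrary and $1 \le q_1 \le q_2 < \infty$, we have $\|f\|_{q_1} \le \|f\|_{q_2}$ by Hölder’s inequality. Thus $\lim_{P\to\infty} \|\alpha - \beta_P\|_q = 0$ for all $q \ge 1$. Since the $b$th summand of $\beta_{(p)}$ is $p^{2b+1}$-periodic ($2^{2b+3}$-periodic in case $p = 2$) and the series representing $\beta_{(p)}$ is uniformly convergent, the function $\beta_{(p)}$ is uniformly limit periodic, i.e. $\beta_{(p)} \in \mathcal D^u$; here $\mathcal D^u$ is the set of all functions which can be approximated to an arbitrary accuracy by periodic functions with respect to the supremum norm. Since $\mathcal D^u$ is closed under multiplication it follows from Lemma 4.1 that $\beta_P \in \mathcal D^u$ for all $P \ge 2$. This gives $\alpha \in \mathcal D^q$ for all $q \ge 1$.

Next the Fourier coefficients of $\alpha$ are computed in terms of the Fourier coefficients of the $\beta_{(p)}$. In particular, this will show their multiplicativity.

Lemma 4.3. (a) For all primes $p$, we have $\hat\beta_{(p)}(0) = 1$.
(b) For $b \in \mathbb N$, $a \in \mathbb Z$, $\gcd(a,b) = 1$, choose $a_p \in \mathbb Z$ for all $p \mid b$ such that $\sum_{p|b} a_p p^{-\operatorname{ord}_p b} \equiv ab^{-1}\ (1)$. Then
\[ \hat\alpha\Big(\frac ab\Big) = \prod_{p|b} \hat\beta_{(p)}\Big(\frac{a_p}{p^{\operatorname{ord}_p b}}\Big). \]
Proof. From Corollary 4.2 it follows that for arbitrary $0 < \varepsilon < 1$ there is some $P \ge 2$ with $\|\alpha - \beta_P\|_1 \le \varepsilon$ and $\operatorname{ord}_p b = 0$ for all $p > P$. For all $p \le P$ it follows from (4.1) that there is some $l_p \ge \operatorname{ord}_p b$ and coefficients $c_p(a_p^*) \in \mathbb C$, $1 \le a_p^* \le p^{l_p}$, such that
\[ \Big\|\beta_{(p)} - \sum_{1\le a_p^*\le p^{l_p}} c_p(a_p^*)\, e_{a_p^*/p^{l_p}}\Big\|_u \le \varepsilon', \]
where $\|\cdot\|_u$ denotes the supremum norm and $\varepsilon' := \varepsilon P^{-1}\big(\max_{p\le P}\|\beta_{(p)}\|_u + 1\big)^{-P}$. From Lemma 4.1 it follows that
\[ \Big\|\beta_P - \prod_{p\le P}\ \sum_{1\le a_p^*\le p^{l_p}} c_p(a_p^*)\, e_{a_p^*/p^{l_p}}\Big\|_u \le \varepsilon. \]
Thus
\[ \Big\|\alpha - \sum_{\substack{1\le a_p^*\le p^{l_p}\\ (p\le P)}}\ \prod_{p\le P} c_p(a_p^*)\ e_{\sum_{p\le P} a_p^*/p^{l_p}}\Big\|_1 \le 2\varepsilon. \]
For $f \in \mathcal D^1$ we have $|\hat f| \le \|f\|_1$. Furthermore, $\sum_{p\le P} a_p^* p^{-l_p} \equiv ab^{-1}\ (1)$ iff $a_p^* \equiv a_p p^{l_p - \operatorname{ord}_p b}\ (p^{l_p})$ for $p \le P$. Therefore the orthogonality relation for the exponential function gives
\[ \Big|\hat\alpha\Big(\frac ab\Big) - \prod_{p\le P} c_p\big(a_p p^{l_p-\operatorname{ord}_p b}\big)\Big| \le 2\varepsilon. \]
Similarly, $\big|\hat\beta_{(p)}(a_p p^{-\operatorname{ord}_p b}) - c_p(a_p p^{l_p-\operatorname{ord}_p b})\big| \le \varepsilon'$ for $p \le P$ and thus
\[ \Big|\prod_{p\le P} \hat\beta_{(p)}\big(a_p p^{-\operatorname{ord}_p b}\big) - \prod_{p\le P} c_p\big(a_p p^{l_p-\operatorname{ord}_p b}\big)\Big| \le \varepsilon. \]
This gives
\[ \Big|\hat\alpha\Big(\frac ab\Big) - \prod_{p\le P} \hat\beta_{(p)}\Big(\frac{a_p}{p^{\operatorname{ord}_p b}}\Big)\Big| \le 3\varepsilon. \]
In the next section we will compute $\hat\beta_{(p)}$ and thereby show that $\hat\beta_{(p)}(0) = 1$. This gives (a) and
\[ \Big|\hat\alpha\Big(\frac ab\Big) - \prod_{p|b} \hat\beta_{(p)}\Big(\frac{a_p}{p^{\operatorname{ord}_p b}}\Big)\Big| \le 3\varepsilon. \]
Since $0 < \varepsilon < 1$ is arbitrary, (b) follows.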
From Corollary 4.2 it follows that (3.1) holds. Here the series on the right hand side is absolutely convergent (plug in $r = 0$). Thus Lemma 4.3 gives
\[ \gamma(r) + 1 = \prod_p \Big(1 + \sum_{b_p\ge1} A_r(p^{b_p})\Big), \]
where for $p$ prime and $b \in \mathbb N$, we define
\[ A_r(p^b) := \sum_{\substack{1\le a\le p^b,\\ p\nmid a}} \Big|\hat\beta_{(p)}\Big(\frac{a}{p^b}\Big)\Big|^2 e^{2\pi i a r/p^b}. \]
5. Computation of the Local Factors

The last step is to calculate $\hat\beta_{(p)}$. In particular, this will show that $\hat\beta_{(p)}(0) = 1$ which is left over from the proof of Lemma 4.3. For $p$ prime, $b \in \mathbb N_0$, define
\[ \beta_{(p,b)}(n) := \Big(1 - \frac1p\,\chi_{(n^2-4)p^{-2b}}(p)\Big)^{-1} I_{p^b}(n), \qquad n > 2. \]
Then $\beta_{(p)} = \sum_{b\ge0} p^{-b}\beta_{(p,b)}$, where the series is uniformly convergent. The calculation will only be done for $p > 2$. The case $p = 2$ is similar but somewhat more elaborate. Let $b, c \in \mathbb N_0$, $a \in \mathbb Z$, $p \nmid a$.

Case 1. $2b < c-1$. Since $\beta_{(p,b)}$ is $p^{2b+1}$-periodic, we have $\hat\beta_{(p,b)}(a/p^c) = 0$.

Case 2. $b = c = 0$. Then
\[ \hat\beta_{(p,0)}(0) = \frac1p \sum_{n \bmod p}\Big(1 - \frac1p\,\chi_{n^2-4}(p)\Big)^{-1} = \frac{1}{p(1-1/p)}\,\#\{n \bmod p \mid \chi_{n^2-4}(p) = 1\} + \frac{1}{p(1+1/p)}\,\#\{n \bmod p \mid \chi_{n^2-4}(p) = -1\} + \frac2p. \]
The cardinality of the first set is $(p-3)/2$ and that of the second is $(p-1)/2$ (see for example the proof of Lemma 3.3 in [14]). Thus $\hat\beta_{(p,0)}(0) = 1 - 2/(p(p^2-1))$.

Case 3. $b = 0$, $c = 1$. Define
\[ S_p^{\pm}(a) := \sum_{\substack{n \bmod p:\\ \chi_{n^2-4}(p) = \pm1}} e^{2\pi i a n/p}. \]
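Then
\[ \hat\beta_{(p,0)}\Big(\frac ap\Big) = \frac1p \sum_{n \bmod p}\Big(1 - \frac1p\,\chi_{n^2-4}(p)\Big)^{-1} e^{-2\pi i a n/p} = \frac1p\Big(2\cos\frac{4\pi a}{p} + \sum_{\pm}\Big(1 \mp \frac1p\Big)^{-1} S_p^{\pm}(a)\Big). \]
Case 4. $2b \ge c-1$, $b \ge 1$. Then
\[ \hat\beta_{(p,b)}\Big(\frac{a}{p^c}\Big) = \frac{1}{p^{2b+1}} \sum_{\pm}\ \sum_{\substack{n \bmod p^{2b+1}:\\ n \equiv \pm2\ (p^{2b})}} \Big(1 - \frac1p\,\chi_{(n^2-4)p^{-2b}}(p)\Big)^{-1} e^{-2\pi i a n/p^c}. \]
Setting $n = \pm(2 + mp^{2b})$ gives
\[ \hat\beta_{(p,b)}\Big(\frac{a}{p^c}\Big) = \frac{1}{p^{2b+1}} \sum_{\pm} e^{\mp4\pi i a/p^c} \sum_{m \bmod p}\Big(1 - \frac1p\Big(\frac mp\Big)\Big)^{-1} e^{\mp2\pi i a m p^{2b-c}}, \]
where $\big(\frac{\cdot}{p}\big)$ denotes the Legendre symbol.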
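Case 4.1. $2b = c-1$. We have
\[ \sum_{\substack{m \bmod p:\\ (m/p) = 1}} e^{\mp2\pi i a m/p} = \frac12 \sum_{m \bmod p} e^{\mp2\pi i a m/p}\Big(\Big(\frac mp\Big) + 1\Big) - \frac12 = \frac12 \sum_{m \bmod p} e^{\mp2\pi i a m/p}\Big(\frac mp\Big) - \frac12. \]
Set $\epsilon_p := 1$ if $p \equiv 1\ (4)$ and $\epsilon_p := i$ otherwise. The last sum can be reduced to the Gaussian sum associated to the Legendre character which can be computed explicitly (see for example [8], Chapter 2). This gives for the above quantity the value
\[ \frac12\Big(\frac{\mp a}{p}\Big)\epsilon_p\, p^{1/2} - \frac12. \]
Therefore
\[ \hat\beta_{(p,b)}\Big(\frac{a}{p^c}\Big) = \frac{1}{p^{2b+1}}\Big(\frac{-2}{p^2-1}\cos\frac{4\pi a}{p^c} + \frac{\epsilon_p\, p^{3/2}}{p^2-1}\Big(\Big(\frac ap\Big)e^{4\pi i a/p^c} + \Big(\frac{-a}{p}\Big)e^{-4\pi i a/p^c}\Big)\Big). \]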
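The Gaussian sum evaluation used in Case 4.1 can be confirmed numerically for small primes. This is a sketch with hypothetical helper names; it assumes the classical value $\sum_{m \bmod p} \big(\frac mp\big) e^{2\pi i k m/p} = \big(\frac kp\big)\epsilon_p\sqrt p$ for $p \nmid k$, with $\epsilon_p$ as defined above.

```python
import cmath

def legendre(m, p):
    """Legendre symbol (m/p) for an odd prime p, via Euler's criterion."""
    r = pow(m % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def gauss_sum(k, p):
    """Sum over m mod p of (m/p) * exp(2*pi*i*k*m/p)."""
    return sum(legendre(m, p) * cmath.exp(2j * cmath.pi * k * m / p) for m in range(p))

def predicted(k, p):
    """(k/p) * eps_p * sqrt(p), with eps_p = 1 if p = 1 mod 4 and i otherwise."""
    eps_p = 1 if p % 4 == 1 else 1j
    return legendre(k, p) * eps_p * p ** 0.5

def residue_sum(k, p):
    """Sum of exp(2*pi*i*k*m/p) over the quadratic residues m mod p."""
    return sum(cmath.exp(2j * cmath.pi * k * m / p) for m in range(p) if legendre(m, p) == 1)
```

Taking $k = \mp a$, `residue_sum(k, p)` agrees with `predicted(k, p) / 2 - 0.5` up to rounding, which is the value stated in Case 4.1.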
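Case 4.2. $2b > c-1$. Then
\begin{align*}
\hat\beta_{(p,b)}\Big(\frac{a}{p^c}\Big) &= \frac{1}{p^{2b+1}} \sum_{\pm} e^{\mp4\pi i a/p^c} \sum_{m \bmod p}\Big(1 - \frac1p\Big(\frac mp\Big)\Big)^{-1} \\
&= \frac{1}{p^{2b+1}} \sum_{\pm} e^{\mp4\pi i a/p^c}\Big(\frac{1}{1-1/p}\cdot\frac{p-1}{2} + \frac{1}{1+1/p}\cdot\frac{p-1}{2} + 1\Big) \\
&= \frac{2}{p^{2b+1}}\cos\Big(\frac{4\pi a}{p^c}\Big)\frac{p^2+p+1}{p+1}.
\end{align*}
Now we can calculate the Fourier coefficients
\[ \hat\beta_{(p)}\Big(\frac{a}{p^c}\Big) = \sum_{b\ge0} \frac{1}{p^b}\,\hat\beta_{(p,b)}\Big(\frac{a}{p^c}\Big). \]
Case 1. $c = 0$. Then
\[ \hat\beta_{(p)}(0) = 1 - \frac{2}{p(p^2-1)} + \sum_{b\ge1} \frac{1}{p^b}\cdot\frac{2}{p^{2b+1}}\cos(0)\,\frac{p^2+p+1}{p+1} = 1. \]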
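The values of $\hat\beta_{(p,b)}(0)$ obtained in Case 2 and Case 4.2 can be compared with a brute-force average over one period. This is a sketch for odd primes only, with hypothetical function names; it uses that $\beta_{(p,b)}$ is $p^{2b+1}$-periodic, so its mean value equals the average over any full period.

```python
from fractions import Fraction

def legendre(m, p):
    """Legendre symbol (m/p) for an odd prime p, via Euler's criterion."""
    r = pow(m % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def beta_pb(n, p, b):
    """beta_(p,b)(n) for an odd prime p: zero unless p^(2b) divides n^2 - 4."""
    m = n * n - 4
    if m % p ** (2 * b):
        return Fraction(0)
    return 1 / (1 - Fraction(legendre(m // p ** (2 * b), p), p))

def mean_beta_pb(p, b):
    """Average of beta_(p,b) over one full period p^(2b+1)."""
    period = p ** (2 * b + 1)
    return sum(beta_pb(n, p, b) for n in range(3, 3 + period)) / period

def closed_form(p, b):
    """Value of the Fourier coefficient at 0 as stated in Case 2 (b = 0) and Case 4.2 (b >= 1)."""
    if b == 0:
        return 1 - Fraction(2, p * (p * p - 1))
    return Fraction(2, p ** (2 * b + 1)) * Fraction(p * p + p + 1, p + 1)

def partial_sum(p, B):
    """Sum over b <= B of p^(-b) * closed_form(p, b)."""
    return sum(Fraction(1, p ** b) * closed_form(p, b) for b in range(B + 1))
```

The partial sums equal $1 - 2/\big((p^2-1)p^{3B+1}\big)$ and hence tend to 1 as $B \to \infty$, in agreement with Case 1.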
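Case 2. $c = 1$. Since $S_p^+(a) + S_p^-(a) = -2\cos(4\pi a/p)$, we get
\begin{align*}
\hat\beta_{(p)}\Big(\frac ap\Big) &= \frac1p\Big(2\cos\frac{4\pi a}{p} + \sum_{\pm}\Big(1 \mp \frac1p\Big)^{-1} S_p^{\pm}(a)\Big) + \sum_{b\ge1} \frac{1}{p^b}\cdot\frac{2}{p^{2b+1}}\cos\Big(\frac{4\pi a}{p}\Big)\frac{p^2+p+1}{p+1} \\
&= \frac{2p}{p^2-1}\cos\frac{4\pi a}{p} + \sum_{\pm} \frac{1}{p \mp 1}\, S_p^{\pm}(a) \\
&= \frac{1}{p^2-1} \sum_{n \bmod p}\Big(\frac{n^2-4}{p}\Big) e^{2\pi i a n/p}.
\end{align*}
Case 3. $c \ge 2$. Since the terms with $b < (c-1)/2$ vanish by Case 1, we have
\[ \hat\beta_{(p)}\Big(\frac{a}{p^c}\Big) = \sum_{b\ge(c-1)/2} \frac{1}{p^b}\,\hat\beta_{(p,b)}\Big(\frac{a}{p^c}\Big) = \frac{1}{(p^2-1)\,p^{3c/2-2}} \begin{cases} \epsilon_p\Big(\big(\frac ap\big)e^{4\pi i a/p^c} + \big(\frac{-a}{p}\big)e^{-4\pi i a/p^c}\Big), & c \text{ odd},\\[4pt] 2\cos\dfrac{4\pi a}{p^c}, & c \text{ even}. \end{cases} \]
Finally $A_r(p^c)$ can be computed.

Case 1. $c = 1$. Then
\[ A_r(p) = \sum_{n_1,n_2 \bmod p} \frac{1}{(p^2-1)^2}\Big(\frac{n_1^2-4}{p}\Big)\Big(\frac{n_2^2-4}{p}\Big) \sum_{1\le a\le p-1} e^{2\pi i (n_1-n_2+r)a/p}. \]
The innermost sum is $p-1$ for $n_1 - n_2 + r \equiv 0\ (p)$ and $-1$ otherwise. Since
\[ \sum_{n \bmod p}\Big(\frac{n^2-4}{p}\Big) = \frac{p-3}{2} - \frac{p-1}{2} = -1, \]
the value for $A_r(p)$ as given in the theorem follows.

Case 2. $c \ge 2$. Then for $p \nmid a$, we have
\[ \Big|\hat\beta_{(p)}\Big(\frac{a}{p^c}\Big)\Big| = \frac{1}{(p^2-1)\,p^{3c/2-2}}\,\Big|1 + \Big(\frac{-1}{p}\Big)^c e^{-8\pi i a/p^c}\Big|. \]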
Thus
\[ A_r(p^c) = \frac{1}{(p^2-1)^2\,p^{3c-4}} \sum_{\substack{1\le a\le p^c,\\ p\nmid a}} e^{2\pi i a r/p^c}\Big(2 + 2\Big(\frac{-1}{p}\Big)^c \cos\frac{8\pi a}{p^c}\Big). \]
A short calculation gives the value in the theorem.
Acknowledgements. I would like to express my sincere gratitude to Prof. Zeév Rudnick for bringing this problem to my attention. I gained much from conversations with him and from the stimulating atmosphere which he and his co-organizers created at the DMV seminar “The Riemann Zeta Function and Random Matrix Theory”.
References
1. Aurich, R., Steiner, F.: Energy-level statistics of the Hadamard-Gutzwiller ensemble. Phys. D 43, 155–180 (1990)
2. Barner, K.: On A. Weil’s explicit formula. J. Reine Angew. Math. 323, 139–152 (1981)
3. Bleher, P.: Trace formula for quantum integrable systems, lattice point problem, and small divisors. In: Emerging Applications of Number Theory, Hejhal, D.A., Friedman, J., Gutzwiller, M.C., Odlyzko, A.M. (eds). Berlin–Heidelberg–New York: Springer, 1999
4. Bogomolny, E.B., Leyvraz, F., Schmit, C.: Distribution of eigenvalues for the modular group. Commun. Math. Phys. 176, 577–617 (1996)
5. Bogomolny, E.B., Georgeot, B., Giannoni, M.-J., Schmit, C.: Arithmetical Chaos. Phys. Rep. 291, 219–324 (1997)
6. Bohigas, O., Giannoni, M.-J., Schmit, C.: Characterization of chaotic quantum spectra and universality of level fluctuation laws. Phys. Rev. Lett. 52, 1–4 (1984); Spectral properties of the Laplacian and random matrix theory. J. Physique Lett. 45, L-1015 (1984)
7. Borevich, Z.I., Shafarevich, I.R.: Number Theory. New York: Academic Press, 1966
8. Davenport, H.: Multiplicative Number Theory. Graduate Texts in Mathematics 74. Berlin–New York: Springer, 1980
9. Hejhal, D.A.: The Selberg Trace Formula, Vol. 1. Lecture Notes in Mathematics 548, Berlin–Heidelberg–New York: Springer, 1976
10. Luo, W., Sarnak, P.: Number variance for arithmetic hyperbolic surfaces. Commun. Math. Phys. 161, 419–432 (1994)
11. Ono, T.: An Introduction to Algebraic Number Theory. New York: Plenum Press, 1990
12. Peter, M.: Momente der Klassenzahlen binärer quadratischer Formen mit ganzalgebraischen Koeffizienten. Acta Arith. 70, 43–77 (1995)
13. Peter, M.: Almost periodicity of the normalized representation numbers associated to positive definite ternary quadratic forms. J. Number Theory 77, 122–144 (1999)
14. Peter, M.: The limit distribution of a number theoretic function arising from a problem in statistical mechanics. J. Number Th. 90, 265–280 (2001)
15. Sarnak, P.: Class numbers of indefinite binary quadratic forms. J. Number Th. 15, 229–247 (1982)
16. Schmutz, P.: Arithmetic groups and the length spectrum of Riemann surfaces. Duke Math. J. 84, 199–215 (1996)
17. Schwarz, W., Spilker, J.: Arithmetical Functions. London Mathematical Society LNS 184. Cambridge: Cambridge University Press, 1994
18. Titchmarsh, E.C.: The Theory of the Riemann Zeta-Function. Oxford: Oxford University Press, 1986
Communicated by P. Sarnak
Commun. Math. Phys. 225, 191 – 217 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Weak Hopf Algebras and Singular Solutions of Quantum Yang–Baxter Equation Fang Li1,3, , Steven Duplij2 1 Department of Mathematics, Zhejiang University (Xixi Campus), Hangzhou, Zhejiang 310028, P.R. China.
E-mail:
[email protected];
[email protected] 2 Theoretical Physics, Kharkov National University, Kharkov 61077, Ukraine.
E-mail:
[email protected] 3 Institute of Mathematics, Chinese Academy of Sciences, Beijing 100080, P.R. China
Received: 1 May 2001 / Accepted: 1 September 2001
Abstract: We investigate a generalization of the Hopf algebra $sl_q(2)$ by weakening the invertibility of the generator $K$, i.e. replacing its invertibility $KK^{-1} = 1$ by the regularity $K\bar{K}K = K$. This leads to a weak Hopf algebra $wsl_q(2)$ and a $J$-weak Hopf algebra $vsl_q(2)$, which are studied in detail. It is shown that the monoids of group-like elements of $wsl_q(2)$ and $vsl_q(2)$ are regular monoids, which supports the general conjecture on the connection between weak Hopf algebras and regular monoids. Moreover, from $wsl_q(2)$ a quasi-braided weak Hopf algebra $\bar{U}_q^{w}$ is constructed and it is shown that the corresponding quasi-R-matrix is regular: $R^w \hat{R}^w R^w = R^w$.

1. Introduction
The concept of a weak Hopf algebra as a generalization of a Hopf algebra [29, 1] was introduced in [18] and its characterizations and applications were studied in [20]. A $k$-bialgebra$^1$ $H = (H, \mu, \eta, \Delta, \varepsilon)$ is called a weak Hopf algebra if there exists $T \in \mathrm{Hom}_k(H, H)$ such that $\mathrm{id} * T * \mathrm{id} = \mathrm{id}$ and $T * \mathrm{id} * T = T$, where $T$ is called a weak antipode of $H$. This concept also generalizes the notion of the left and right Hopf algebras [24, 12].
The first aim of this concept is to give a new sub-class of bialgebras which includes all Hopf algebras and which can be characterized through the monoids of all group-like elements [18, 20]. It is known that for every regular monoid $S$, its semigroup algebra $kS$ over $k$ is a weak Hopf algebra, generalizing the group algebra [19].
The second aim is to construct some singular solutions of the quantum Yang-Baxter equation (QYBE) and to study the QYBE in a larger scope. In this direction, in [20] a quantum quasi-double $D(H)$ for a finite dimensional cocommutative perfect weak Hopf algebra with invertible weak antipode was built and it was verified that its quasi-R-matrix is a regular solution of the QYBE. In particular, the quantum quasi-double of a finite Clifford monoid as a generalization of the quantum double of a finite group was derived [20].
In this paper, we construct two weak Hopf algebras in the other direction, as generalizations of the quantum algebra $sl_q(2)$ [22, 2]. We show that $wsl_q(2)$ possesses a quasi-R-matrix which becomes a singular (in fact, regular) solution of the QYBE, with a parameter $q$. For this reason, we want to treat $wsl_q(2)$ and its quasi-R-matrix just as $sl_q(2)$ [28, 16]. It is interesting to note that $wsl_q(2)$ is a natural and non-trivial example of a weak Hopf algebra.
Project (No. 19971074) supported by the National Natural Science Foundation of China.
1 In this paper, $k$ always denotes a field.

2. Weak Quantum Algebras
For completeness and consistency we recall the definition of the enveloping algebra $U_q = U_q(sl(2))$ (see e.g. [16]). Let $q \in \mathbb{C}$ and $q \neq \pm 1, 0$. The algebra $U_q$ is generated by four variables (Chevalley generators) $E$, $F$, $K$, $K^{-1}$ with the relations
$$K^{-1}K = KK^{-1} = 1, \tag{1}$$
$$KEK^{-1} = q^{2}E, \tag{2}$$
$$KFK^{-1} = q^{-2}F, \tag{3}$$
$$EF - FE = \frac{K - K^{-1}}{q - q^{-1}}. \tag{4}$$
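As a concrete illustration of relations (1)–(4), one can check them in the familiar two-dimensional representation of $U_q(sl(2))$. The sketch below assumes that representation ($E$ and $F$ as elementary matrices, $K = \mathrm{diag}(q, q^{-1})$) and the sample value $q = 2$; the helper names are illustrative and the exact rational arithmetic is only a convenience.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def scaled(c, a):
    """Scalar multiple c * a."""
    return [[c * x for x in row] for row in a]


def minus(a, b):
    """Entrywise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


q = Fraction(2)  # assumed sample value; any q other than 0, 1, -1 would do
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
K = [[q, 0], [0, 1 / q]]
K_INV = [[1 / q, 0], [0, q]]
IDENTITY = [[1, 0], [0, 1]]

relations = {
    "(1)": matmul(K, K_INV) == IDENTITY and matmul(K_INV, K) == IDENTITY,
    "(2)": matmul(matmul(K, E), K_INV) == scaled(q**2, E),
    "(3)": matmul(matmul(K, F), K_INV) == scaled(q**-2, F),
    "(4)": minus(matmul(E, F), matmul(F, E))
           == scaled(1 / (q - 1 / q), minus(K, K_INV)),
}
print(relations)
```

Here $K$ is invertible, so this is the non-weak situation that the regularity condition introduced next is meant to relax.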
Now we try to generalize the invertibility condition (1). The first idea is to weaken invertibility to regularity, as is usually done in semigroup theory [17] (see also [10, 6, 7] for higher regularity). So we consider a weakening of the algebra $U_q(sl_q(2))$ in which, instead of the pair $K$, $K^{-1}$, we introduce a pair $K_w$, $\bar{K}_w$ by means of the regularity relations
$$K_w \bar{K}_w K_w = K_w, \qquad \bar{K}_w K_w \bar{K}_w = \bar{K}_w. \tag{5}$$
If $\bar{K}_w$ satisfying (5) is unique for a given $K_w$, then it is called the inverse of $K_w$ (see e.g. [27, 11]). The regularity relations (5) imply that one can introduce the variables
$$J_w = K_w \bar{K}_w, \qquad \bar{J}_w = \bar{K}_w K_w. \tag{6}$$
In terms of $J_w$ and $\bar{J}_w$ the regularity conditions (5) are
$$J_w K_w = K_w, \qquad \bar{K}_w J_w = \bar{K}_w, \tag{7}$$
$$\bar{J}_w \bar{K}_w = \bar{K}_w, \qquad K_w \bar{J}_w = K_w. \tag{8}$$
Since the noncommutativity of the generators $K_w$ and $\bar{K}_w$ considerably complicates the generalized construction$^2$, we first consider the commutative case and assume in what follows that
$$J_w = \bar{J}_w. \tag{9}$$
2 This case will be considered elsewhere.
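The regularity relations (5)–(9) can be illustrated with a pair of commuting matrices in which $K$ is not invertible. The sketch below uses a hypothetical choice, $K = \mathrm{diag}(2, 0)$ and $\bar{K} = \mathrm{diag}(1/2, 0)$, which is not taken from the paper; it shows that $J = K\bar{K}$ is an idempotent different from the unity and that $J - 1$ is a zero divisor.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def minus(a, b):
    """Entrywise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical regular pair: K is singular, K_BAR inverts it on its image only.
K = [[Fraction(2), 0], [0, 0]]
K_BAR = [[Fraction(1, 2), 0], [0, 0]]
IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]

J = matmul(K, K_BAR)
J_BAR = matmul(K_BAR, K)

checks = {
    "K Kbar K = K": matmul(matmul(K, K_BAR), K) == K,
    "Kbar K Kbar = Kbar": matmul(matmul(K_BAR, K), K_BAR) == K_BAR,
    "J = Jbar": J == J_BAR,
    "J^2 = J": matmul(J, J) == J,
    "J != 1": J != IDENTITY,
    "K (J - 1) = 0": matmul(K, minus(J, IDENTITY)) == ZERO,
}
print(checks)
```

In this toy model $K \neq 0$ and $J - 1 \neq 0$, yet their product vanishes, which is the kind of zero divisor the weak algebras below have to accommodate.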
Let us list some useful properties of $J_w$ which will be needed below. First we note that commutativity of $K_w$ and $\bar{K}_w$ leads to the idempotency condition
$$J_w^2 = J_w, \tag{10}$$
which means that $J_w$ is a projector (see e.g. [15]).
Conjecture 1. In algebras satisfying the regularity conditions (5) there exists at least one zero divisor $J_w - 1$.
Remark 1. In addition to the unity 1 we have an idempotent analog of unity $J_w$, which makes the structure of weak algebras more complicated, but simultaneously more interesting.
For any variable $X$ we define “J-conjugation” as
$$X_{J_w} \overset{\mathrm{def}}{=} J_w X J_w \tag{11}$$
and the corresponding mapping will be written as $e_w(X) : X \to X_{J_w}$. Note that the mapping $e_w(X)$ is idempotent:
$$e_w^2(X) = e_w(X). \tag{12}$$
Remark 2. In the invertible case $K_w = K$, $\bar{K}_w = K^{-1}$ we have $J_w = 1$ and $e_w(X) = X = \mathrm{id}(X)$ for any $X$, so $e_w = \mathrm{id}$.
It is seen from (5) that the generators $K_w$ and $\bar{K}_w$ are stable under “$J_w$-conjugation”:
$$K_{J_w} = J_w K_w J_w = K_w, \qquad \bar{K}_{J_w} = J_w \bar{K}_w J_w = \bar{K}_w. \tag{13}$$
Obviously, for any $X$
$$K_w X \bar{K}_w = K_w X_{J_w} \bar{K}_w, \tag{14}$$
and for any $X$ and $Y$
$$K_w X \bar{K}_w = Y \ \Rightarrow\ K_w X_{J_w} \bar{K}_w = Y_{J_w}. \tag{15}$$
Another definition connected with the idempotent analog of unity $J_w$ is the “$J_w$-product” for any two elements $X$ and $Y$, viz.
$$X \odot_{J_w} Y \overset{\mathrm{def}}{=} X J_w Y. \tag{16}$$
Remark 3. From (7) it follows that the “$J_w$-product” coincides with the usual product if $X$ ends with the generators $K_w$ or $\bar{K}_w$ on the right side or $Y$ starts with them on the left side.
Let $J_w^{(ij)} = K_w^i \bar{K}_w^j$; then we will need the formula
$$J_w^{(ij)} = K_w^i \bar{K}_w^j = \begin{cases} K_w^{i-j}, & i > j, \\ J_w, & i = j, \\ \bar{K}_w^{j-i}, & i < j, \end{cases} \tag{17}$$
which follows from the regularity conditions (7). The variables $J_w^{(ij)}$ satisfy the regularity conditions
$$J_w^{(ij)} J_w^{(ji)} J_w^{(ij)} = J_w^{(ij)} \tag{18}$$
and are stable under “J-conjugation” (11): $J^{(ij)}_{wJ_w} = J_w^{(ij)}$.
The regularity conditions (7) lead to noncancellativity: for any two elements $X$ and $Y$ the following relations hold:
$$X = Y \ \Rightarrow\ K_w X = K_w Y, \tag{19}$$
$$K_w X = K_w Y \ \nRightarrow\ X = Y, \tag{20}$$
$$X = Y \ \Rightarrow\ \bar{K}_w X = \bar{K}_w Y, \tag{21}$$
$$\bar{K}_w X = \bar{K}_w Y \ \nRightarrow\ X = Y, \tag{22}$$
$$X = Y \ \Rightarrow\ X J_w = Y J_w, \tag{23}$$
$$X J_w = Y J_w \ \nRightarrow\ X = Y. \tag{24}$$
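Formula (17), the regularity (18) and the failure of cancellation in (20) can be tested in a small matrix model. The sketch below again uses a hypothetical commuting regular pair, $K = \mathrm{diag}(3, 0)$ and $\bar{K} = \mathrm{diag}(1/3, 0)$ (an assumption made for illustration only), and compares $K^i \bar{K}^j$ with the three cases of (17) for $1 \le i, j \le 4$.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matpow(a, n):
    """n-th power of a square matrix, for n >= 1."""
    result = a
    for _ in range(n - 1):
        result = matmul(result, a)
    return result


# Hypothetical regular pair (not from the paper).
K = [[Fraction(3), 0], [0, 0]]
K_BAR = [[Fraction(1, 3), 0], [0, 0]]
IDENTITY = [[1, 0], [0, 1]]
J = matmul(K, K_BAR)


def j_ij(i, j):
    """The element K^i Kbar^j of formula (17)."""
    return matmul(matpow(K, i), matpow(K_BAR, j))


def expected(i, j):
    """Right-hand side of formula (17)."""
    if i > j:
        return matpow(K, i - j)
    if i == j:
        return J
    return matpow(K_BAR, j - i)


pairs = [(i, j) for i in range(1, 5) for j in range(1, 5)]
formula_17 = all(j_ij(i, j) == expected(i, j) for i, j in pairs)
regularity_18 = all(
    matmul(matmul(j_ij(i, j), j_ij(j, i)), j_ij(i, j)) == j_ij(i, j)
    for i, j in pairs
)
# Noncancellativity (20): K X = K Y holds although X != Y.
X, Y = IDENTITY, J
noncancellative = matmul(K, X) == matmul(K, Y) and X != Y
print(formula_17, regularity_18, noncancellative)
```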
The generalization of $U_q(sl_q(2))$ by exploiting regularity (5) instead of invertibility (1) can be done in two different ways.
Definition 1. Define $U_q^w = wsl_q(2)$ as the algebra generated by the four variables $E_w$, $F_w$, $K_w$, $\bar{K}_w$ with the relations:
$$K_w \bar{K}_w = \bar{K}_w K_w, \tag{25}$$
$$K_w \bar{K}_w K_w = K_w, \qquad \bar{K}_w K_w \bar{K}_w = \bar{K}_w, \tag{26}$$
$$K_w E_w = q^{2} E_w K_w, \qquad \bar{K}_w E_w = q^{-2} E_w \bar{K}_w, \tag{27}$$
$$K_w F_w = q^{-2} F_w K_w, \qquad \bar{K}_w F_w = q^{2} F_w \bar{K}_w, \tag{28}$$
$$E_w F_w - F_w E_w = \frac{K_w - \bar{K}_w}{q - q^{-1}}. \tag{29}$$
We call $wsl_q(2)$ a weak quantum algebra.
Definition 2. Define $U_q^v = vsl_q(2)$ as the algebra generated by the four variables $E_v$, $F_v$, $K_v$, $\bar{K}_v$ with the relations ($J_v = K_v \bar{K}_v$):
$$K_v \bar{K}_v = \bar{K}_v K_v, \tag{30}$$
$$K_v \bar{K}_v K_v = K_v, \qquad \bar{K}_v K_v \bar{K}_v = \bar{K}_v, \tag{31}$$
$$K_v E_v \bar{K}_v = q^{2} E_v, \tag{32}$$
$$K_v F_v \bar{K}_v = q^{-2} F_v, \tag{33}$$
$$E_v J_v F_v - F_v J_v E_v = \frac{K_v - \bar{K}_v}{q - q^{-1}}. \tag{34}$$
We call $vsl_q(2)$ a J-weak quantum algebra.
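To see that the relations of Definition 1 admit realizations with $J_w \neq 1$, one can write down a small matrix representation. The sketch below is an illustrative construction, not taken from the paper: it glues the two-dimensional representation of $sl_q(2)$ to a one-dimensional block on which $K_w = \bar{K}_w = 0$ and $E_w$, $F_w$ act by arbitrary scalars (here the hypothetical values 5 and 7), with the sample value $q = 2$.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def scaled(c, a):
    """Scalar multiple c * a."""
    return [[c * x for x in row] for row in a]


def minus(a, b):
    """Entrywise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


q = Fraction(2)                  # assumed sample value, q != 0, 1, -1
e, f = Fraction(5), Fraction(7)  # hypothetical scalars on the block where K = 0

K = [[q, 0, 0], [0, 1 / q, 0], [0, 0, 0]]
K_BAR = [[1 / q, 0, 0], [0, q, 0], [0, 0, 0]]
E = [[0, 1, 0], [0, 0, 0], [0, 0, e]]
F = [[0, 0, 0], [1, 0, 0], [0, 0, f]]
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
J = matmul(K, K_BAR)

relations = {
    "(25)": matmul(K, K_BAR) == matmul(K_BAR, K),
    "(26)": matmul(matmul(K, K_BAR), K) == K
            and matmul(matmul(K_BAR, K), K_BAR) == K_BAR,
    "(27)": matmul(K, E) == scaled(q**2, matmul(E, K))
            and matmul(K_BAR, E) == scaled(q**-2, matmul(E, K_BAR)),
    "(28)": matmul(K, F) == scaled(q**-2, matmul(F, K))
            and matmul(K_BAR, F) == scaled(q**2, matmul(F, K_BAR)),
    "(29)": minus(matmul(E, F), matmul(F, E))
            == scaled(1 / (q - 1 / q), minus(K, K_BAR)),
}
j_commutes_with_generators = all(
    matmul(J, x) == matmul(x, J) for x in (E, F, K, K_BAR)
)
j_is_not_unity = J != IDENTITY
print(relations, j_commutes_with_generators, j_is_not_unity)
```

In this model $J_w = \mathrm{diag}(1, 1, 0)$ commutes with all four generators but is not the identity, so the quotient by $J_w - 1$ genuinely loses the extra block.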
In these definitions the first two lines, (25)–(26) and (30)–(31), generalize the invertibility $KK^{-1} = K^{-1}K = 1$. The remaining lines, (27)–(29) and (32)–(34), generalize the corresponding lines (2)–(4) in two different ways. In the first case, the weak quantum algebra $wsl_q(2)$, the last relation (29) between the $E$ and $F$ generators remains unchanged from $sl_q(2)$, while the two $EK$ and $FK$ relations are extended to the four relations (27)–(28). In $vsl_q(2)$, conversely, the two $EK$ and $FK$ relations remain unchanged from $sl_q(2)$ (with the substitution $K^{-1} \to \bar{K}$ only), while the last relation (34) between the $E$ and $F$ generators has the additional multiplier $J_v$, whose role will become clear later. Note that the $EK$ and $FK$ relations (32)–(33) can be written in the following form close to (27)–(28):
$$K_v E_v J_v = q^{2} J_v E_v K_v, \qquad \bar{K}_v E_v J_v = q^{-2} J_v E_v \bar{K}_v, \tag{35}$$
$$K_v F_v J_v = q^{-2} J_v F_v K_v, \qquad \bar{K}_v F_v J_v = q^{2} J_v F_v \bar{K}_v. \tag{36}$$
Using (16) and (7) in the case of $J_v$ we can also present the $vsl_q(2)$ algebra as an algebra with the “$J_v$-product”:
$$K_v \odot_{J_v} \bar{K}_v = \bar{K}_v \odot_{J_v} K_v, \tag{37}$$
$$K_v \odot_{J_v} \bar{K}_v \odot_{J_v} K_v = K_v, \qquad \bar{K}_v \odot_{J_v} K_v \odot_{J_v} \bar{K}_v = \bar{K}_v, \tag{38}$$
$$K_v \odot_{J_v} E_v \odot_{J_v} \bar{K}_v = q^{2} E_v, \tag{39}$$
$$K_v \odot_{J_v} F_v \odot_{J_v} \bar{K}_v = q^{-2} F_v, \tag{40}$$
$$E_v \odot_{J_v} F_v - F_v \odot_{J_v} E_v = \frac{K_v - \bar{K}_v}{q - q^{-1}}. \tag{41}$$
Remark 4. Due to (7) the only relation where the “$J_v$-product” really plays a role is the last relation (41).
From the following proposition, one can find the connection between $U_q^w = wsl_q(2)$, $U_q^v = vsl_q(2)$ and the quantum algebra $sl_q(2)$.
Proposition 1. $wsl_q(2)/(J_w - 1) \cong sl_q(2)$; $vsl_q(2)/(J_v - 1) \cong sl_q(2)$.
Proof. For cancellative $K_w$ and $K_v$ it is obvious.
Proposition 2. The quantum algebras $wsl_q(2)$ and $vsl_q(2)$ possess zero divisors, one of which is$^3$ $J_{w,v} - 1$, which annihilates all generators.
Proof. From regularity (26) and (31) it follows that $K_{w,v}(J_{w,v} - 1) = 0$ (see also (1)). Multiplying (27) by $J_w$ gives $K_w E_w J_w = q^{2} E_w K_w J_w \Rightarrow K_w (E_w \bar{K}_w) K_w = q^{2} E_w K_w$. Using the second equation in (27) for the term in the bracket we obtain $K_w (q^{2} \bar{K}_w E_w) K_w = q^{2} E_w K_w \Rightarrow (J_w - 1) E_w K_w = 0$. For $F_w$ similarly, but we use Eq. (28). By analogy, multiplying (32) by $J_v$ we have $K_v E_v \bar{K}_v K_v \bar{K}_v = q^{2} E_v J_v \Rightarrow K_v E_v \bar{K}_v = q^{2} E_v J_v \Rightarrow q^{2} E_v = q^{2} E_v J_v$, and so $E_v (J_v - 1) = 0$. For $F_v$ similarly, but we use Eq. (33).
Remark 5. Since $sl_q(2)$ is an algebra without zero divisors, some properties of $sl_q(2)$ cannot be carried over to $wsl_q(2)$ and $vsl_q(2)$, e.g. the standard theorem on Ore extensions and its proof (see Theorem I.7.1 in [16]).
3 We denote by $X_{w,v}$ one of the variables $X_w$ or $X_v$.
Remark 6. We conjecture that in $U_q^w$ and $U_q^v$ there are no zero divisors other than $J_{w,v} - 1$ which annihilate all generators. Otherwise a thorough analysis of them would be much more complicated and very different from the standard case of non-weak algebras.
We can get some properties of $U_q^w$ and $U_q^v$ as follows.
Lemma 1. The idempotent $J_w$ is in the center of $wsl_q(2)$.
Proof. For $K_w$ it follows from (13). Multiplying the first equation in (27) by $\bar{K}_w$ we derive $K_w E_w \bar{K}_w = q^{2} E_w J_w$, and applying the second equation in (27) we obtain $E_w J_w = J_w E_w$. For $F_w$ similarly, but we use Eq. (28).
Lemma 2. There are unique algebra automorphisms $\omega_w$ and $\omega_v$ of $U_q^w$ and $U_q^v$ respectively such that
$$\omega_{w,v}(K_{w,v}) = \bar{K}_{w,v}, \quad \omega_{w,v}(\bar{K}_{w,v}) = K_{w,v}, \quad \omega_{w,v}(E_{w,v}) = F_{w,v}, \quad \omega_{w,v}(F_{w,v}) = E_{w,v}. \tag{42}$$
Proof. The proof is obvious, if we note that $\omega_w^2 = \mathrm{id}$ and $\omega_v^2 = \mathrm{id}$.
As in the case of the automorphism $\omega$ for $sl_q(2)$ [16], the mappings $\omega_w$ and $\omega_v$ can be called the weak Cartan automorphisms.
Remark 7. Note that $\omega_w \neq \omega$ and $\omega_v \neq \omega$ in the general case.
The connection between the algebras $wsl_q(2)$ and $vsl_q(2)$ can be seen from the following
Proposition 3. There exists a partial algebra morphism $\chi : vsl_q(2) \to wsl_q(2)$ such that
$$\chi(X) = e_v(X) \tag{43}$$
or, more exactly: the generators $X_w^{(v)} = J_v X_v J_v = X_{vJ_v}$ for all $X_v = K_v, \bar{K}_v, E_v, F_v$ satisfy the same relations as $X_w$, namely (25)–(29).
Proof. Multiplying Eq. (32) by $K_v$ we have $K_v E_v \bar{K}_v K_v = q^{2} E_v K_v$, and using (7) we obtain $K_v E_v J_v = q^{2} E_v J_v K_v \Rightarrow K_v J_v E_v J_v = q^{2} J_v E_v J_v K_v$, and so $K_{vJ_v} E_{vJ_v} = q^{2} E_{vJ_v} K_{vJ_v}$, which has the shape of the first equation in (27). For $F_v$ similarly, using Eq. (33), we obtain $K_{vJ_v} F_{vJ_v} = q^{-2} F_{vJ_v} K_{vJ_v}$. Equation (34) can be modified using (7) and then applying (11); then we obtain
$$E_{vJ_v} F_{vJ_v} - F_{vJ_v} E_{vJ_v} = \frac{K_{vJ_v} - \bar{K}_{vJ_v}}{q - q^{-1}},$$
which coincides with (29).
For the conjugated equations (the second ones in (27)–(28)), after multiplication of (32) by $\bar{K}_v$ we have $\bar{K}_v K_v E_v \bar{K}_v = q^{2} \bar{K}_v E_v \Rightarrow J_v E_v J_v \bar{K}_v = q^{2} \bar{K}_v J_v E_v J_v$ or, using definition (11) and (7), $\bar{K}_{vJ_v} E_{vJ_v} = q^{-2} E_{vJ_v} \bar{K}_{vJ_v}$. By analogy, from (33) it follows that $\bar{K}_{vJ_v} F_{vJ_v} = q^{2} F_{vJ_v} \bar{K}_{vJ_v}$.
Note that the generators $X_w^{(v)}$ coincide with $X_w$ only if $J_v = 1$. Therefore, some (but not all) properties of $wsl_q(2)$ can be extended to $vsl_q(2)$ as well, and below we will mostly consider $wsl_q(2)$ in detail.
Lemma 3. Let $m \ge 0$ and $n \in \mathbb{Z}$. The following relations hold in $U_q^w$:
$$E_w^m K_w^n = q^{-2mn} K_w^n E_w^m, \qquad F_w^m K_w^n = q^{2mn} K_w^n F_w^m, \tag{44}$$
$$E_w^m \bar{K}_w^n = q^{2mn} \bar{K}_w^n E_w^m, \qquad F_w^m \bar{K}_w^n = q^{-2mn} \bar{K}_w^n F_w^m, \tag{45}$$
$$[E_w, F_w^m] = [m] F_w^{m-1}\, \frac{q^{-(m-1)} K_w - q^{m-1} \bar{K}_w}{q - q^{-1}} = [m]\, \frac{q^{m-1} K_w - q^{-(m-1)} \bar{K}_w}{q - q^{-1}}\, F_w^{m-1}, \tag{46}$$
$$[E_w^m, F_w] = [m]\, \frac{q^{-(m-1)} K_w - q^{m-1} \bar{K}_w}{q - q^{-1}}\, E_w^{m-1} = [m] E_w^{m-1}\, \frac{q^{m-1} K_w - q^{-(m-1)} \bar{K}_w}{q - q^{-1}}. \tag{47}$$
Proof. The first two relations result easily from Definition 1. The third one follows by induction using Definition 1 and
$$[E_w, F_w^m] = [E_w, F_w^{m-1}] F_w + F_w^{m-1} [E_w, F_w] = [E_w, F_w^{m-1}] F_w + F_w^{m-1}\, \frac{K_w - \bar{K}_w}{q - q^{-1}}.$$
Applying the automorphism $\omega_w$ (42) to (46), one gets (47).
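The induction behind (46) rests on a scalar identity for $q$-numbers. Assuming the usual convention $[m] = (q^m - q^{-m})/(q - q^{-1})$ (the text does not restate it), moving $K_w$ and $\bar{K}_w$ through powers of $F_w$ with (44)–(45) produces the sums $\sum_{j=0}^{m-1} q^{\mp 2j}$, which must equal $[m]\, q^{\mp(m-1)}$. The sketch below checks this with exact rationals; the function names and the sample value $q = 3/2$ are illustrative.

```python
from fractions import Fraction


def q_number(m, q):
    """The q-number [m] = (q^m - q^(-m)) / (q - q^(-1)) (assumed convention)."""
    return (q**m - q**(-m)) / (q - 1 / q)


def summed_coefficients(m, q):
    """Coefficients of K and Kbar in sum_{j=0}^{m-1} (q^(-2j) K - q^(2j) Kbar)."""
    return (sum(q**(-2 * j) for j in range(m)),
            -sum(q**(2 * j) for j in range(m)))


def closed_coefficients(m, q):
    """Coefficients of K and Kbar in [m] (q^(-(m-1)) K - q^(m-1) Kbar)."""
    return (q_number(m, q) * q**(-(m - 1)),
            -q_number(m, q) * q**(m - 1))


q = Fraction(3, 2)  # assumed sample value, q != 0, 1, -1
agreement = all(
    summed_coefficients(m, q) == closed_coefficients(m, q) for m in range(1, 9)
)
print(agreement)
```

The same sums reappear in the formulas for the derivation $\delta$ in the proofs of Theorems 2 and 3 below.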
Note that the commutation relations (44)–(47) coincide with the $sl_q(2)$ case. For $vsl_q(2)$ the situation is more complicated, because Eqs. (32)–(33) cannot be solved for $\bar{K}_v$ due to noncancellativity (see also (19)–(24)). Nevertheless, some analogous relations can be derived. Using the morphism (43) one can conclude that relations similar to (44)–(47) hold for $X_w^{(v)} = J_v X_v J_v$, from which we obtain for $vsl_q(2)$,
$$J_v E_v^m K_v^n = q^{-2mn} K_v^n E_v^m J_v, \qquad J_v F_v^m K_v^n = q^{2mn} K_v^n F_v^m J_v, \tag{48}$$
$$J_v E_v^m \bar{K}_v^n = q^{2mn} \bar{K}_v^n E_v^m J_v, \qquad J_v F_v^m \bar{K}_v^n = q^{-2mn} \bar{K}_v^n F_v^m J_v, \tag{49}$$
$$J_v E_v J_v F_v^m J_v - J_v F_v^m J_v E_v J_v = [m] J_v F_v^{m-1}\, \frac{q^{-(m-1)} K_v - q^{m-1} \bar{K}_v}{q - q^{-1}} = [m]\, \frac{q^{m-1} K_v - q^{-(m-1)} \bar{K}_v}{q - q^{-1}}\, F_v^{m-1} J_v, \tag{50}$$
$$J_v E_v^m J_v F_v J_v - J_v F_v J_v E_v^m J_v = [m]\, \frac{q^{-(m-1)} K_v - q^{m-1} \bar{K}_v}{q - q^{-1}}\, E_v^{m-1} J_v = [m] J_v E_v^{m-1}\, \frac{q^{m-1} K_v - q^{-(m-1)} \bar{K}_v}{q - q^{-1}}. \tag{51}$$
It is important to stress that due to noncancellativity of weak algebras we cannot cancel $J_v$ in these relations (see (19)–(24)).
In order to discuss the basis of $U_q^w = wsl_q(2)$, we need to generalize some properties of Ore extensions (see [16]).
3. Weak Ore Extensions
Let $R$ be an algebra over $k$ and $R[t]$ be the free left $R$-module consisting of all polynomials of the form $P = \sum_{i=0}^{n} a_i t^i$ with coefficients in $R$. If $a_n \neq 0$, define $\deg(P) = n$; say $\deg(0) = -\infty$. Let $\alpha$ be an algebra morphism of $R$. An $\alpha$-derivation of $R$ is a $k$-linear endomorphism $\delta$ of $R$ such that $\delta(ab) = \alpha(a)\delta(b) + \delta(a)b$ for all $a, b \in R$. It follows that $\delta(1) = 0$.
Theorem 1. (i) Assume that $R[t]$ has an algebra structure such that the natural inclusion of $R$ into $R[t]$ is a morphism of algebras and $\deg(PQ) \le \deg(P) + \deg(Q)$ for any pair $(P, Q)$ of elements of $R[t]$. Then there exists a unique injective algebra endomorphism $\alpha$ of $R$ and a unique $\alpha$-derivation $\delta$ of $R$ such that $ta = \alpha(a)t + \delta(a)$ for all $a \in R$;
(ii) Conversely, given an algebra endomorphism $\alpha$ of $R$ and an $\alpha$-derivation $\delta$ of $R$, there exists a unique algebra structure on $R[t]$ such that the inclusion of $R$ into $R[t]$ is an algebra morphism and $ta = \alpha(a)t + \delta(a)$ for all $a \in R$.
Proof. (i) Take any $0 \neq a \in R$ and consider the product $ta$. We have $\deg(ta) \le \deg(t) + \deg(a) = 1$. By the definition of $R[t]$, there exist uniquely determined elements $\alpha(a)$ and $\delta(a)$ of $R$ such that $ta = \alpha(a)t + \delta(a)$. This defines the maps $\alpha$ and $\delta$ in a unique fashion. The left multiplication by $t$ being linear, so are $\alpha$ and $\delta$. Expanding both sides of the equality $(ta)b = t(ab)$ in $R[t]$ using $ta = \alpha(a)t + \delta(a)$ for $a, b \in R$, we get $\alpha(a)\alpha(b)t + \alpha(a)\delta(b) + \delta(a)b = \alpha(ab)t + \delta(ab)$. It follows that $\alpha(ab) = \alpha(a)\alpha(b)$ and $\delta(ab) = \alpha(a)\delta(b) + \delta(a)b$, and $\alpha(1)t + \delta(1) = t1 = t$. So $\alpha(1) = 1$, $\delta(1) = 0$. Therefore $\alpha$ is an algebra endomorphism and $\delta$ is an $\alpha$-derivation. The uniqueness of $\alpha$ and $\delta$ follows from the freeness of $R[t]$ over $R$.
(ii) We need to construct the multiplication on $R[t]$ as an extension of that on $R$ such that $ta = \alpha(a)t + \delta(a)$. For this, we only need to determine the product $ta$ for any $a \in R$.
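Let $M = \{(f_{ij})_{i,j \ge 1} : f_{ij} \in \mathrm{End}_k(R)$ and each row and each column has only finitely many $f_{ij} \neq 0\}$, and let
$$I = \begin{pmatrix} 1 & & \\ & 1 & \\ & & \ddots \end{pmatrix}$$
be the identity of $M$. For $a \in R$, let $\hat{a} : R \to R$ satisfy $\hat{a}(r) = ar$. Then $\hat{a} \in \mathrm{End}_k(R)$, and for $r \in R$, $(\alpha\hat{a})(r) = \alpha(ar) = \alpha(a)\alpha(r) = (\widehat{\alpha(a)}\alpha)(r)$ and $(\delta\hat{a})(r) = \delta(ar) = \alpha(a)\delta(r) + \delta(a)r = (\widehat{\alpha(a)}\delta + \widehat{\delta(a)})(r)$; thus $\alpha\hat{a} = \widehat{\alpha(a)}\alpha$ and $\delta\hat{a} = \widehat{\alpha(a)}\delta + \widehat{\delta(a)}$ in $\mathrm{End}_k(R)$, and, obviously, for $a, b \in R$, $\widehat{ab} = \hat{a}\hat{b}$ and $\widehat{a+b} = \hat{a} + \hat{b}$. Let
$$T = \begin{pmatrix} \delta & & & \\ \alpha & \delta & & \\ & \alpha & \delta & \\ & & \ddots & \ddots \end{pmatrix} \in M$$
and define $\Phi : R[t] \to M$ satisfying $\Phi\left(\sum_{i=0}^{n} a_i t^i\right) = \sum_{i=0}^{n} (\hat{a}_i I) T^i$. It is seen that $\Phi$ is a $k$-linear map.
Lemma 4. The map $\Phi$ is injective.
Proof. Let $P = \sum_{i=0}^{n} a_i t^i$. Assume $\Phi(P) = 0$. For $e_i = (0_1, \dots, 0_{i-1}, 1_i, 0_{i+1}, \dots)^{T}$, obviously, $\{e_i\}_{i \ge 1}$ are linearly independent. Since $\delta(1) = 0$ and $\alpha(1) = 1$, we have $Te_i = (0_1, \dots, 0_{i-1}, \delta(1)_i, \alpha(1)_{i+1}, 0_{i+2}, \dots)^{T} = e_{i+1}$ and $T^i e_1 = e_{i+1}$ for any $i \ge 0$. Thus $0 = \Phi(P)e_1 = \sum_{i=0}^{n} (\hat{a}_i I) T^i e_1 = \sum_{i=0}^{n} \hat{a}_i e_{i+1}$. It means that $\hat{a}_i = 0$ for all $i$; then $a_i = a_i 1 = \hat{a}_i 1 = 0$. Hence $P = 0$.
Lemma 5. The following relation holds: $T(\hat{a} I) = (\widehat{\alpha(a)} I) T + \widehat{\delta(a)} I$.
Proof. We have
$$T(\hat{a} I) = \begin{pmatrix} \delta\hat{a} & & \\ \alpha\hat{a} & \delta\hat{a} & \\ & \alpha\hat{a} & \ddots \\ & & \ddots \end{pmatrix} = \begin{pmatrix} \widehat{\alpha(a)}\delta + \widehat{\delta(a)} & & \\ \widehat{\alpha(a)}\alpha & \widehat{\alpha(a)}\delta + \widehat{\delta(a)} & \\ & \widehat{\alpha(a)}\alpha & \ddots \\ & & \ddots \end{pmatrix} = \widehat{\alpha(a)}\, T + \widehat{\delta(a)}\, I = (\widehat{\alpha(a)} I) T + \widehat{\delta(a)} I.$$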
Now we complete the proof of Theorem 1. Let $S$ denote the subalgebra generated by $T$ and $\hat{a} I$ (all $a \in R$) in $M$. From Lemma 5, we see that every element of $S$ can be generated linearly by elements of the form $(\hat{a} I) T^n$ ($a \in R$, $n \ge 0$). But $\Phi(at^n) = (\hat{a} I) T^n$, so $\Phi(R[t]) = S$, i.e. $\Phi$ is surjective. Then by Lemma 4, $\Phi$ is bijective. It follows that $R[t]$ and $S$ are linearly isomorphic. Define $ta = \Phi^{-1}(T(\hat{a} I))$; then we can extend this formula to define the multiplication of $R[t]$ by $fg = \Phi^{-1}(xy)$ for any $f, g \in R[t]$ and $x = \Phi(f)$, $y = \Phi(g)$. Under this definition, $R[t]$ becomes an algebra and $\Phi$ is an algebra isomorphism from $R[t]$ to $S$, and $ta = \Phi^{-1}(T(\hat{a} I)) = \Phi^{-1}((\widehat{\alpha(a)} I) T + \widehat{\delta(a)} I) = \alpha(a)t + \delta(a)$ for all $a \in R$. Obviously, the inclusion of $R$ into $R[t]$ is an algebra morphism.
Remark 8. Note that Theorem 1 can be recognized as a generalization of Theorem I.7.1 in [16], since $R$ does not need to be without zero divisors, $\alpha$ does not need to be injective and only $\deg(PQ) \le \deg(P) + \deg(Q)$ is required.
Definition 3. We call the algebra constructed from $\alpha$ and $\delta$ a weak Ore extension of $R$, denoted by $R_w[t, \alpha, \delta]$.
Let $S_{n,k}$ be the linear endomorphism of $R$ defined as the sum of all $\binom{n}{k}$ possible compositions of $k$ copies of $\delta$ and of $n-k$ copies of $\alpha$. By induction on $n$, from $ta = \alpha(a)t + \delta(a)$ under the condition of Theorem 1(ii), we get $t^n a = \sum_{k=0}^{n} S_{n,k}(a)\, t^{n-k}$ and moreover
$$\left(\sum_{i=0}^{n} a_i t^i\right)\left(\sum_{i=0}^{m} b_i t^i\right) = \sum_{i=0}^{n+m} c_i t^i, \qquad \text{where } c_i = \sum_{p=0}^{i} a_p \sum_{k=0}^{p} S_{p,k}(b_{i-p+k}).$$
Corollary 1. Under the condition of Theorem 1(ii), the following statements hold:
(i) As a left $R$-module, $R_w[t, \alpha, \delta]$ is free with basis $\{t^i\}_{i \ge 0}$;
(ii) If $\alpha$ is an automorphism, then $R_w[t, \alpha, \delta]$ is also a free right $R$-module with the same basis $\{t^i\}_{i \ge 0}$.
Proof. (i) It follows from the fact that $R_w[t, \alpha, \delta]$ is just $R[t]$ as a left $R$-module.
(ii) Firstly, we show that $R_w[t, \alpha, \delta] = \sum_{i \ge 0} t^i R$, i.e. for any $p \in R_w[t, \alpha, \delta]$, there are $a_0, a_1, \dots, a_n \in R$ such that $p = \sum_{i=0}^{n} t^i a_i$. Equivalently, we show by induction on $n$ that for any $b \in R$, $bt^n$ can be written in the form $\sum_{i=0}^{n} t^i a_i$ for some $a_i$. When $n = 0$, it is obvious. Suppose that the result holds for $n \le k - 1$. Consider the case $n = k$. Since $\alpha$ is surjective, there is $a \in R$ such that $b = \alpha^n(a) = S_{n,0}(a)$. But $t^n a = \sum_{k=0}^{n} S_{n,k}(a)\, t^{n-k}$, so we get $bt^n = t^n a - \sum_{k=1}^{n} S_{n,k}(a)\, t^{n-k} = \sum_{i=0}^{n} t^i a_i$ by the induction hypothesis, for some $a_i$ with $a_n = a$.
For any $i$ and $a, b \in R$, $(t^i a)b = t^i(ab)$ since $R_w[t, \alpha, \delta]$ is an algebra. Then $R_w[t, \alpha, \delta]$ is a right $R$-module.
Suppose $f(t) = t^n a_n + \cdots + ta_1 + a_0 = 0$ for $a_i \in R$ and $a_n \neq 0$. Then $f(t)$ can be written as an element of $R[t]$ by the formula $t^n a = \sum_{k=0}^{n} S_{n,k}(a)\, t^{n-k}$, whose highest degree term is just that of $t^n a_n = \sum_{k=0}^{n} S_{n,k}(a_n)\, t^{n-k}$, i.e. $\alpha^n(a_n) t^n$. From (i), we get $\alpha^n(a_n) = 0$. It implies $a_n = 0$, which is a contradiction. Hence $R_w[t, \alpha, \delta]$ is a free right $R$-module.
We will need the following:
Lemma 6. Let $R$ be an algebra, $\alpha$ be an algebra automorphism and $\delta$ be an $\alpha$-derivation of $R$. If $R$ is left (resp. right) Noetherian, then so is the weak Ore extension $R_w[t, \alpha, \delta]$.
The proof can be made similarly as for Theorem I.8.3 in [16].
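The multiplication rule $ta = \alpha(a)t + \delta(a)$ of Theorem 1(ii) can be made concrete in a small computational model. The sketch below is an illustration under assumptions not made in the paper: it takes $R = \mathbb{Q}[x]$, $\alpha(f)(x) = f(Qx)$ with the hypothetical value $Q = 3$, and $\delta$ the Jackson $Q$-derivative, which satisfies $\delta(ab) = \alpha(a)\delta(b) + \delta(a)b$. Elements of $R$ are coefficient lists, and elements of $R[t]$ are lists of such coefficient lists.

```python
from fractions import Fraction

Q = Fraction(3)  # hypothetical parameter defining alpha: x -> Q*x


def trim(f):
    """Drop trailing zero coefficients of a polynomial in x."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def padd(f, g):
    """Sum of two polynomials in x."""
    n = max(len(f), len(g))
    return trim([(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
                 for i in range(n)])


def pmul(f, g):
    """Product of two polynomials in x."""
    if not f or not g:
        return []
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)


def alpha(f):
    """Algebra endomorphism of Q[x] sending x to Q*x."""
    return trim([c * Q**i for i, c in enumerate(f)])


def delta(f):
    """Jackson Q-derivative: x^i -> (1 + Q + ... + Q^(i-1)) x^(i-1)."""
    return trim([c * sum(Q**k for k in range(i)) for i, c in enumerate(f)][1:])


def otrim(p):
    """Normalize an element sum_i a_i t^i stored as [a_0, a_1, ...]."""
    p = [trim(a) for a in p]
    while p and not p[-1]:
        p.pop()
    return p


def oadd(p, s):
    """Sum of two elements of R[t]."""
    n = max(len(p), len(s))
    return otrim([padd(p[i] if i < len(p) else [], s[i] if i < len(s) else [])
                  for i in range(n)])


def t_times(p):
    """Left multiplication by t, using t*b = alpha(b)*t + delta(b)."""
    out = [[] for _ in range(len(p) + 1)]
    for j, b in enumerate(p):
        out[j + 1] = padd(out[j + 1], alpha(b))
        out[j] = padd(out[j], delta(b))
    return otrim(out)


def omul(p, s):
    """Product in the Ore extension: sum_i a_i * (t^i * s)."""
    result, current = [], otrim(s)
    for a in p:
        result = oadd(result, [pmul(a, b) for b in current])
        current = t_times(current)
    return result


T = [[], [1]]        # the element t
X = [[0, 1]]         # the element x of R
X2 = [[0, 0, 1]]     # the element x^2 of R
print(omul(T, X))
```

In this model $t^2 x = Q^2 x\, t^2 + (1 + Q)\, t$, which matches $t^n a = \sum_k S_{n,k}(a)\, t^{n-k}$ with $S_{2,1} = \alpha\delta + \delta\alpha$ and $S_{2,2}(x) = \delta^2(x) = 0$.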
Theorem 2. The algebra $wsl_q(2)$ is Noetherian with the basis
$$P_w = \{E_w^i F_w^j K_w^l,\ E_w^i F_w^j \bar{K}_w^m,\ E_w^i F_w^j J_w\}, \tag{52}$$
where $i, j, l$ are any non-negative integers and $m$ is any positive integer.
Proof. As is well known, the two-variable polynomial algebra $k[K_w, \bar{K}_w]$ is Noetherian (see e.g. [15]). Then $A_0 = k[K_w, \bar{K}_w]/(J_w K_w - K_w, \bar{K}_w J_w - \bar{K}_w)$ is also Noetherian. For any $i, j \ge 0$ and $a, b, c \in k$, if at least one of $a, b, c$ is not equal to 0, then $aK_w^i + b\bar{K}_w^j + cJ_w$ is not in the ideal $(J_w K_w - K_w, \bar{K}_w J_w - \bar{K}_w)$ of $k[K_w, \bar{K}_w]$. So, in $A_0$, $aK_w^i + b\bar{K}_w^j + cJ_w \neq 0$. It follows that $\{K_w^i, \bar{K}_w^j, J_w : i, j \ge 0\}$ is a basis of $A_0$.
Let $\alpha_1$ satisfy $\alpha_1(K_w) = q^{2} K_w$ and $\alpha_1(\bar{K}_w) = q^{-2} \bar{K}_w$. Then $\alpha_1$ can be extended to an algebra automorphism of $A_0$, and $A_1 = A_0[F_w, \alpha_1, 0]$ is a weak Ore extension of $A_0$ from $\alpha = \alpha_1$ and $\delta = 0$. By Corollary 1, $A_1$ is a free left $A_0$-module with basis $\{F_w^j\}_{j \ge 0}$. Thus, $A_1$ is a $k$-algebra with basis $\{K_w^l F_w^j, \bar{K}_w^m F_w^j, J_w F_w^j$ : $l$ and $j$ run respectively over all non-negative integers, $m$ runs over all positive integers$\}$. But, from the definition of the weak Ore extension, we have $K_w^l F_w^j = q^{-2lj} F_w^j K_w^l$, $\bar{K}_w^m F_w^j = q^{2mj} F_w^j \bar{K}_w^m$, $J_w F_w^j = F_w^j J_w$. So we conclude that $\{F_w^j K_w^l, F_w^j \bar{K}_w^m, F_w^j J_w$ : $l$ and $j$ run respectively over all non-negative integers, $m$ runs over all positive integers$\}$ is a basis of $A_1$.
Let $\alpha_2$ satisfy $\alpha_2(F_w^j K_w^l) = q^{-2l} F_w^j K_w^l$, $\alpha_2(F_w^j \bar{K}_w^m) = q^{2m} F_w^j \bar{K}_w^m$, $\alpha_2(F_w^j J_w) = F_w^j J_w$. Then $\alpha_2$ can be extended to an algebra automorphism of $A_1$. Let $\delta$ satisfy
$$\delta(1) = \delta(K_w) = \delta(\bar{K}_w) = 0,$$
$$\delta(F_w^j K_w^l) = \sum_{i=0}^{j-1} F_w^{j-1}\, \frac{q^{-2i} K_w - q^{2i} \bar{K}_w}{q - q^{-1}}\, K_w^l,$$
$$\delta(F_w^j \bar{K}_w^l) = \sum_{i=0}^{j-1} F_w^{j-1}\, \frac{q^{-2i} K_w - q^{2i} \bar{K}_w}{q - q^{-1}}\, \bar{K}_w^l,$$
$$\delta(F_w^j J_w) = \sum_{i=0}^{j-1} F_w^{j-1}\, \frac{q^{-2i} K_w - q^{2i} \bar{K}_w}{q - q^{-1}}\, J_w$$
for $j > 0$ and $l \ge 0$. Then just as in the proof of Lemma VI.1.5 in [16], it can be shown that $\delta$ can be extended to an $\alpha_2$-derivation of $A_1$ such that $A_2 = A_1[E_w, \alpha_2, \delta]$ is a weak Ore extension of $A_1$. Then in $A_2$,
$$E_w K_w = \alpha_2(K_w) E_w + \delta(K_w) = q^{-2} K_w E_w, \qquad E_w \bar{K}_w = q^{2} \bar{K}_w E_w,$$
$$E_w F_w = \alpha_2(F_w) E_w + \delta(F_w) = F_w E_w + \frac{K_w - \bar{K}_w}{q - q^{-1}}.$$
From these, we conclude that $A_2 \cong U_q^w$ as algebras. Thus, from Lemma 6, $U_q^w$ is Noetherian. By Corollary 1, $U_q^w$ is free with basis $\{E_w^i\}_{i \ge 0}$ as a left $A_1$-module. Thus, as a $k$-linear space, $U_q^w$ has the basis $Q_w = \{F_w^j K_w^l E_w^i, F_w^j \bar{K}_w^m E_w^i, F_w^j J_w E_w^i$ : $i, j, l$ run over all non-negative integers, $m$ runs over all positive integers$\}$. By Lemma 3 any $x \in P_w$ (resp. $Q_w$) can be $k$-linearly generated by some elements of $Q_w$ (resp. $P_w$), and therefore $P_w$ and $Q_w$ generate the same space $U_q^w$.
A similar theorem can be proved for $vsl_q(2)$ as well.
Theorem 3. The algebra $vsl_q(2)$ is Noetherian with the basis
$$P_v = \{J_v E_v^i J_v F_v^j K_v^l,\ J_v E_v^i J_v F_v^j \bar{K}_v^m,\ J_v E_v^i J_v F_v^j J_v\}, \tag{53}$$
where $i, j, l$ are any non-negative integers and $m$ is any positive integer.
Proof. The two-variable polynomial algebra $k[K_v, \bar{K}_v]$ is Noetherian (see e.g. [15]). Then $A_0 = k[K_v, \bar{K}_v]/(J_v K_v - K_v, \bar{K}_v J_v - \bar{K}_v)$ is also Noetherian. For any $i, j \ge 0$ and $a, b, c \in k$, if at least one of $a, b, c$ is not equal to 0, then $aK_v^i + b\bar{K}_v^j + cJ_v$ is not in the ideal $(J_v K_v - K_v, \bar{K}_v J_v - \bar{K}_v)$ of $k[K_v, \bar{K}_v]$. So, in $A_0$, $aK_v^i + b\bar{K}_v^j + cJ_v \neq 0$. It follows that $\{K_v^i, \bar{K}_v^j, J_v : i, j \ge 0\}$ is a basis of $A_0$.
Let $\alpha_1$ satisfy $\alpha_1(K_v) = q^{2} K_v$ and $\alpha_1(\bar{K}_v) = q^{-2} \bar{K}_v$. Then $\alpha_1$ can be extended to an algebra automorphism of $A_0$, and $A_1 = A_0[J_v F_v J_v, \alpha_1, 0]$ is a weak Ore extension of $A_0$ from $\alpha = \alpha_1$ and $\delta = 0$. By Corollary 1, $A_1$ is a free left $A_0$-module with basis $\{J_v F_v^j J_v\}_{j \ge 0}$. Thus, $A_1$ is a $k$-algebra with basis $\{K_v^l F_v^j J_v, \bar{K}_v^m F_v^j J_v, J_v F_v^j J_v$ : $l$ and $j$ run respectively over all non-negative integers, $m$ runs over all positive integers$\}$. From the definition of the weak Ore extension, we have $K_v^l F_v^j J_v = q^{-2lj} J_v F_v^j K_v^l$, $\bar{K}_v^m F_v^j J_v = q^{2mj} J_v F_v^j \bar{K}_v^m$, $J_v F_v^j = F_v^j J_v$. So we conclude that $\{F_v^j K_v^l J_v, F_v^j \bar{K}_v^m J_v, J_v F_v^j J_v$ : $l$ and $j$ run respectively over all non-negative integers, $m$ runs over all positive integers$\}$ is a basis of $A_1$.
Let $\alpha_2$ satisfy $\alpha_2(J_v F_v^j K_v^l) = q^{-2l} J_v F_v^j K_v^l$, $\alpha_2(J_v F_v^j \bar{K}_v^m) = q^{2m} J_v F_v^j \bar{K}_v^m$, $\alpha_2(J_v F_v^j J_v) = J_v F_v^j J_v$. Then $\alpha_2$ can be extended to an algebra automorphism of $A_1$. Let $\delta$ satisfy
$$\delta(1) = \delta(K_v) = \delta(\bar{K}_v) = 0,$$
$$\delta(J_v F_v^j K_v^l) = \sum_{i=0}^{j-1} J_v F_v^{j-1}\, \frac{q^{-2i} K_v - q^{2i} \bar{K}_v}{q - q^{-1}}\, K_v^l,$$
$$\delta(J_v F_v^j \bar{K}_v^l) = \sum_{i=0}^{j-1} J_v F_v^{j-1}\, \frac{q^{-2i} K_v - q^{2i} \bar{K}_v}{q - q^{-1}}\, \bar{K}_v^l,$$
$$\delta(J_v F_v^j J_v) = \sum_{i=0}^{j-1} J_v F_v^{j-1}\, \frac{q^{-2i} K_v - q^{2i} \bar{K}_v}{q - q^{-1}}\, J_v$$
for $j > 0$ and $l \ge 0$. Then just as in the proof of Lemma VI.1.5 in [16], it can be shown that $\delta$ can be extended to an $\alpha_2$-derivation of $A_1$ such that $A_2 = A_1[J_v E_v J_v, \alpha_2, \delta]$ is a weak Ore extension of $A_1$. Then in $A_2$,
$$J_v E_v K_v = \alpha_2(K_v) J_v E_v J_v + \delta(K_v) = q^{-2} K_v E_v J_v, \qquad J_v E_v \bar{K}_v = q^{2} \bar{K}_v E_v J_v,$$
$$J_v E_v J_v F_v J_v = \alpha_2(F_v) J_v E_v J_v + \delta(J_v F_v J_v) = J_v F_v J_v E_v J_v + \frac{K_v - \bar{K}_v}{q - q^{-1}}.$$
From these, we conclude that $A_2 \cong U_q^v$ as algebras. Thus, from Lemma 6, $U_q^v$ is Noetherian. By Corollary 1, $U_q^v$ is free with basis $\{J_v E_v^i J_v\}_{i \ge 0}$ as a left $A_1$-module. Thus, as a $k$-linear space, $U_q^v$ has the basis
$$Q_v = \{J_v F_v^j K_v^l E_v^i J_v,\ J_v F_v^j \bar{K}_v^m E_v^i J_v,\ J_v F_v^j J_v E_v^i J_v\},$$
where $i, j, l$ run over all non-negative integers and $m$ runs over all positive integers. By (48)–(51) any $x \in P_v$ (resp. $Q_v$) can be $k$-linearly generated by some elements of $Q_v$ (resp. $P_v$), and therefore $P_v$ and $Q_v$ generate the same space $U_q^v$.

4. Extension to the q = 1 Case
Let us discuss the relation between $U_q^w = wsl_q(2)$ and $U(sl_q(2))$. Just as for the quantum algebra $sl_q(2)$, we first have to give another presentation for $U_q^w$. Let $q \in \mathbb{C}$ and $q \neq \pm 1, 0$. Define $U_q^{\prime w}$ as the algebra generated by the five variables $E_w$, $F_w$, $K_w$, $\bar{K}_w$, $L_w$ with the relations (for $U_q^{\prime v}$ Eqs. (56) and (57) should be exchanged with (32) and (33) respectively):
$$K_w \bar{K}_w = \bar{K}_w K_w, \tag{54}$$
$$K_w \bar{K}_w K_w = K_w, \qquad \bar{K}_w K_w \bar{K}_w = \bar{K}_w, \tag{55}$$
$$K_w E_w = q^{2} E_w K_w, \qquad \bar{K}_w E_w = q^{-2} E_w \bar{K}_w, \tag{56}$$
$$K_w F_w = q^{-2} F_w K_w, \qquad \bar{K}_w F_w = q^{2} F_w \bar{K}_w, \tag{57}$$
$$[L_w, E_w] = q(E_w K_w + \bar{K}_w E_w), \tag{58}$$
$$[L_w, F_w] = -q^{-1}(F_w K_w + \bar{K}_w F_w), \tag{59}$$
$$E_w F_w - F_w E_w = L_w, \qquad (q - q^{-1}) L_w = (K_w - \bar{K}_w). \tag{60}$$
For $vsl_q(2)$ we can similarly define the algebra $U_q^{\prime v}$:
$$K_v \bar{K}_v = \bar{K}_v K_v, \tag{61}$$
$$K_v \bar{K}_v K_v = K_v, \qquad \bar{K}_v K_v \bar{K}_v = \bar{K}_v, \tag{62}$$
$$K_v E_v \bar{K}_v = q^{2} E_v, \tag{63}$$
$$K_v F_v \bar{K}_v = q^{-2} F_v, \tag{64}$$
$$L_v J_v E_v - E_v J_v L_v = q(E_v K_v + \bar{K}_v E_v), \tag{65}$$
$$L_v J_v F_v - F_v J_v L_v = -q^{-1}(F_v K_v + \bar{K}_v F_v), \tag{66}$$
$$E_v J_v F_v - F_v J_v E_v = L_v, \qquad (q - q^{-1}) L_v = (K_v - \bar{K}_v). \tag{67}$$
Note that, contrary to $U_q^w$ and $U_q^v$, the algebras $U_q^{\prime w}$ and $U_q^{\prime v}$ are defined for all invertible values of the parameter $q$, in particular for $q = 1$.
Proposition 4. The algebra $U_q^w$ is isomorphic to the algebra $U_q^{\prime w}$ with $\varphi_w$ satisfying $\varphi_w(E_w) = E_w$, $\varphi_w(F_w) = F_w$, $\varphi_w(K_w) = K_w$, $\varphi_w(\bar{K}_w) = \bar{K}_w$.
Proof. The proof is similar to that of Proposition VI.2.1 in [16] for $sl_q(2)$. It suffices to check that $\varphi_w$ and the map $\psi_w : U_q^{\prime w} \to U_q^w$ satisfying $\psi_w(E_w) = E_w$, $\psi_w(F_w) = F_w$, $\psi_w(K_w) = K_w$, $\psi_w(L_w) = [E_w, F_w]$ are reciprocal algebra morphisms.
On the other hand, we can give the following relationship between $U_q^{\prime w}$ and $U(sl(2))$, whose proof is easy.
Proposition 5. For $q = 1$: (i) the algebra isomorphism $U(sl(2)) \cong U_1^{\prime w}/(K_w - 1)$ holds; (ii) there exists an injective algebra morphism $\pi$ from $U_1^{\prime w}$ to $U(sl(2))[K_w]/(K_w^3 - K_w)$ satisfying $\pi(E_w) = XK_w$, $\pi(F_w) = Y$, $\pi(K_w) = K_w$, $\pi(L) = HK_w$.
Remark 9. In Proposition 5(ii), $\pi$ is only injective, but not surjective, since $K^2 \neq 1$ in $U(sl(2))[K]/(K^3 - K)$ and then $X$ does not lie in the image of $\pi$.
5. Weak Hopf Algebras Structure
Here we define weak analogs in $wsl_q(2)$ and $vsl_q(2)$ for the standard Hopf algebra structures $\Delta$, $\varepsilon$, $S$ (comultiplication, counit and antipode), which should be algebra morphisms. For the weak quantum algebra $wsl_q(2)$ we define the maps $\Delta_w : wsl_q(2) \to wsl_q(2) \otimes wsl_q(2)$, $\varepsilon_w : wsl_q(2) \to k$ and $T_w : wsl_q(2) \to wsl_q(2)$ satisfying respectively
$$\Delta_w(E_w) = 1 \otimes E_w + E_w \otimes K_w, \qquad \Delta_w(F_w) = F_w \otimes 1 + \bar{K}_w \otimes F_w, \tag{68}$$
$$\Delta_w(K_w) = K_w \otimes K_w, \qquad \Delta_w(\bar{K}_w) = \bar{K}_w \otimes \bar{K}_w, \tag{69}$$
$$\varepsilon_w(E_w) = \varepsilon_w(F_w) = 0, \qquad \varepsilon_w(K_w) = \varepsilon_w(\bar{K}_w) = 1, \tag{70}$$
$$T_w(E_w) = -E_w \bar{K}_w, \quad T_w(F_w) = -K_w F_w, \quad T_w(K_w) = \bar{K}_w, \quad T_w(\bar{K}_w) = K_w. \tag{71}$$
The difference with the standard case (we follow notations of [16]) is in substitution of K −1 with K w and the last line, where instead of antipod S the weak antipod Tw is introduced [18]. Proposition 6. The relations (68)–(71) endow wslq (2) with a bialgebra structure. Proof. It can be shown by direct calculation that the following relations hold valid: w (Kw )w (K w ) = w (K w )w (Kw ),
(72)
w (Kw )w (K w )w (Kw ) = w (Kw ),
(73)
w (K w )w (Kw )w (K w ) = w (K w ),
(74)
w (Kw )w (Ew ) = q 2 w (Ew )w (Kw ),
(75)
w (K w )w (Ew ) = q −2 w (Ew )w (K w ),
(76)
w (Kw )w (Fw ) = q
−2
w (Fw )w (Kw ),
w (K w )w (Fw ) = q 2 w (Fw )w (K w ), w (Ew )w (Fw ) − w (Fw )w (Ew ) =
(w (Kw ) − w (K w )) ; (q − q −1 )
(77) (78) (79)
Weak Hopf Algebras and Yang–Baxter Equation
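\begin{align*}
\varepsilon_w(K_w)\varepsilon_w(\overline{K}_w) &= \varepsilon_w(\overline{K}_w)\varepsilon_w(K_w), \tag{80}\\
\varepsilon_w(K_w)\varepsilon_w(\overline{K}_w)\varepsilon_w(K_w) &= \varepsilon_w(K_w), \tag{81}\\
\varepsilon_w(\overline{K}_w)\varepsilon_w(K_w)\varepsilon_w(\overline{K}_w) &= \varepsilon_w(\overline{K}_w), \tag{82}\\
\varepsilon_w(K_w)\varepsilon_w(E_w) &= q^{2}\varepsilon_w(E_w)\varepsilon_w(K_w), \tag{83}\\
\varepsilon_w(\overline{K}_w)\varepsilon_w(E_w) &= q^{-2}\varepsilon_w(E_w)\varepsilon_w(\overline{K}_w), \tag{84}\\
\varepsilon_w(K_w)\varepsilon_w(F_w) &= q^{-2}\varepsilon_w(F_w)\varepsilon_w(K_w), \tag{85}\\
\varepsilon_w(\overline{K}_w)\varepsilon_w(F_w) &= q^{2}\varepsilon_w(F_w)\varepsilon_w(\overline{K}_w), \tag{86}\\
\varepsilon_w(E_w)\varepsilon_w(F_w) - \varepsilon_w(F_w)\varepsilon_w(E_w) &= \frac{\varepsilon_w(K_w) - \varepsilon_w(\overline{K}_w)}{q - q^{-1}}; \tag{87}\\
T_w(\overline{K}_w)T_w(K_w) &= T_w(K_w)T_w(\overline{K}_w), \tag{88}\\
T_w(K_w)T_w(\overline{K}_w)T_w(K_w) &= T_w(K_w), \tag{89}\\
T_w(\overline{K}_w)T_w(K_w)T_w(\overline{K}_w) &= T_w(\overline{K}_w), \tag{90}\\
T_w(E_w)T_w(K_w) &= q^{2}T_w(K_w)T_w(E_w), \tag{91}\\
T_w(E_w)T_w(\overline{K}_w) &= q^{-2}T_w(\overline{K}_w)T_w(E_w), \tag{92}\\
T_w(F_w)T_w(K_w) &= q^{-2}T_w(K_w)T_w(F_w), \tag{93}\\
T_w(F_w)T_w(\overline{K}_w) &= q^{2}T_w(\overline{K}_w)T_w(F_w), \tag{94}\\
T_w(F_w)T_w(E_w) - T_w(E_w)T_w(F_w) &= \frac{T_w(K_w) - T_w(\overline{K}_w)}{q - q^{-1}}. \tag{95}
\end{align*}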
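Therefore, through the basis in Theorem 2, $\Delta_w$ and $\varepsilon_w$ can be extended to algebra morphisms from $wsl_q(2)$ to $wsl_q(2) \otimes wsl_q(2)$ and from $wsl_q(2)$ to $k$ respectively, and $T_w$ can be extended to an anti-algebra morphism from $wsl_q(2)$ to $wsl_q(2)$. Using (72)–(87) it can be shown that
\begin{align*}
(\Delta_w \otimes \mathrm{id})\Delta_w(X) &= (\mathrm{id} \otimes \Delta_w)\Delta_w(X), \tag{96}\\
(\varepsilon_w \otimes \mathrm{id})\Delta_w(X) &= (\mathrm{id} \otimes \varepsilon_w)\Delta_w(X) = X \tag{97}
\end{align*}
for any $X = E_w$, $F_w$, $K_w$ or $\overline{K}_w$. Let $\mu_w$ and $\eta_w$ be the product and the unit of $wsl_q(2)$ respectively. Hence $(wsl_q(2), \mu_w, \eta_w, \Delta_w, \varepsilon_w)$ becomes a bialgebra.

Next we introduce the star product in the bialgebra $(wsl_q(2), \mu_w, \eta_w, \Delta_w, \varepsilon_w)$ in a way similar to the standard one (see e.g. [16]):
\[
(A \ast_w B)(X) = \mu_w[A \otimes B]\Delta_w(X). \tag{98}
\]

Proposition 7. $T_w$ satisfies the regularity conditions
\begin{align*}
(\mathrm{id} \ast_w T_w \ast_w \mathrm{id})(X) &= X, \tag{99}\\
(T_w \ast_w \mathrm{id} \ast_w T_w)(X) &= T_w(X) \tag{100}
\end{align*}
for any $X = E_w$, $F_w$, $K_w$ or $\overline{K}_w$. It means that $T_w$ is a weak antipode.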
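Proof. Follows from (72)–(95) by tedious calculations. For $X = K_w, \overline{K}_w$ it is easy, and so we consider $X = E_w$ as an example. We have
\begin{align*}
(\mathrm{id} \ast_w T_w \ast_w \mathrm{id})(E_w)
&= \mu_w[(\mathrm{id} \ast_w T_w) \otimes \mathrm{id}]\Delta_w(E_w)\\
&= \mu_w[(\mathrm{id} \ast_w T_w) \otimes \mathrm{id}](1 \otimes E_w + E_w \otimes K_w)\\
&= (\mathrm{id} \ast_w T_w)(1)\,\mathrm{id}(E_w) + (\mathrm{id} \ast_w T_w)(E_w)\,\mathrm{id}(K_w)\\
&= \mu_w[\mathrm{id} \otimes T_w]\Delta_w(1)\,\mathrm{id}(E_w) + \mu_w[\mathrm{id} \otimes T_w]\Delta_w(E_w)\,\mathrm{id}(K_w)\\
&= \mu_w[\mathrm{id} \otimes T_w](1 \otimes 1)\,\mathrm{id}(E_w) + \mu_w[\mathrm{id} \otimes T_w](1 \otimes E_w + E_w \otimes K_w)\,\mathrm{id}(K_w)\\
&= T_w(1)\,\mathrm{id}(E_w) + \mathrm{id}(1)\,T_w(E_w)\,\mathrm{id}(K_w) + \mathrm{id}(E_w)\,T_w(K_w)\,\mathrm{id}(K_w)\\
&= E_w - E_w\overline{K}_w \cdot K_w + E_w \cdot \overline{K}_w \cdot K_w = E_w = \mathrm{id}(E_w).
\end{align*}
By analogy, for (100) and $X = E_w$ we obtain
\begin{align*}
(T_w \ast_w \mathrm{id} \ast_w T_w)(E_w)
&= \mu_w[(T_w \ast_w \mathrm{id}) \otimes T_w]\Delta_w(E_w)\\
&= \mu_w[(T_w \ast_w \mathrm{id}) \otimes T_w](1 \otimes E_w + E_w \otimes K_w)\\
&= (T_w \ast_w \mathrm{id})(1)\,T_w(E_w) + (T_w \ast_w \mathrm{id})(E_w)\,T_w(K_w)\\
&= \mu_w[T_w \otimes \mathrm{id}](1 \otimes 1)\,T_w(E_w) + \mu_w[T_w \otimes \mathrm{id}](1 \otimes E_w + E_w \otimes K_w)\,T_w(K_w)\\
&= T_w(1)\,T_w(E_w) + T_w(1)\,\mathrm{id}(E_w)\,T_w(K_w) + T_w(E_w)\,\mathrm{id}(K_w)\,T_w(K_w)\\
&= -E_w\overline{K}_w + E_w\overline{K}_w - E_w\overline{K}_wK_w\overline{K}_w = -E_w\overline{K}_w = T_w(E_w).
\end{align*}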
Corollary 2. The bialgebra $wsl_q(2)$ is a weak Hopf algebra with the weak antipode $T_w$.

We can get an inner endomorphism as follows.

Proposition 8. $T_w^2$ is an inner endomorphism of the algebra $wsl_q(2)$ satisfying, for any $X \in wsl_q(2)$,
\[
T_w^2(X) = K_wX\overline{K}_w, \tag{101}
\]
especially
\[
T_w^2(K_w) = \mathrm{id}(K_w), \qquad T_w^2(\overline{K}_w) = \mathrm{id}(\overline{K}_w). \tag{102}
\]

Proof. Follows from (71).

Assume that with the operations $\mu_w, \eta_w, \Delta_w, \varepsilon_w$ the algebra $wsl_q(2)$ would possess an antipode $S$ so as to become a Hopf algebra. It should satisfy $(S \ast_w \mathrm{id})(K_w) = \eta_w\varepsilon_w(K_w)$, and so it should follow that $S(K_w)K_w = 1$. But this cannot hold, since $S(K_w)$ can be written as a linear sum of the basis in Theorem 2. It implies that it is impossible for $wsl_q(2)$ to become a Hopf algebra for the operations above.

Corollary 3. $wsl_q(2)$ is an example of a non-commutative and non-cocommutative weak Hopf algebra which is not a Hopf algebra.
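The obstruction to an antipode can be made concrete in a small matrix model. The following is a minimal sketch, not taken from the paper: it assumes that the defining relations of $wsl_q(2)$ are the ones mirrored by (72)–(79), and it uses a hypothetical three-dimensional representation obtained by padding the standard two-dimensional $sl_q(2)$ block with a zero row and column. In this model $J_w = K_w\overline{K}_w$ is an idempotent different from the identity and $K_w$ is singular, so no matrix $S$ can satisfy $SK_w = 1$.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, a):
    """Multiply every entry of the matrix a by the scalar c."""
    return [[c * x for x in row] for row in a]


def sub(a, b):
    """Entrywise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


q = Fraction(3, 2)  # illustrative value; any q with q != 0 and q**2 != 1 works

# Hypothetical model: 2-dimensional sl_q(2) block padded with a zero block.
E = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
F = [[0, 0, 0], [1, 0, 0], [0, 0, 0]]
K = [[q, 0, 0], [0, 1 / q, 0], [0, 0, 0]]
Kbar = [[1 / q, 0, 0], [0, q, 0], [0, 0, 0]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
J = matmul(K, Kbar)

# Assumed defining relations (the pattern of (72)-(79) without Delta_w).
relations = {
    "K Kbar = Kbar K": matmul(K, Kbar) == matmul(Kbar, K),
    "K Kbar K = K": matmul(J, K) == K,
    "Kbar K Kbar = Kbar": matmul(J, Kbar) == Kbar,
    "K E = q^2 E K": matmul(K, E) == scale(q**2, matmul(E, K)),
    "Kbar E = q^-2 E Kbar": matmul(Kbar, E) == scale(q**-2, matmul(E, Kbar)),
    "K F = q^-2 F K": matmul(K, F) == scale(q**-2, matmul(F, K)),
    "Kbar F = q^2 F Kbar": matmul(Kbar, F) == scale(q**2, matmul(F, Kbar)),
    "EF - FE = (K - Kbar)/(q - q^-1)":
        sub(matmul(E, F), matmul(F, E)) == scale(1 / (q - 1 / q), sub(K, Kbar)),
}

# The third column of K vanishes, so the third column of S K vanishes for
# every matrix S, and S K can never be the identity.
k_third_column_zero = all(row[2] == 0 for row in K)

for name, holds in relations.items():
    print(name, holds)
print("J is idempotent:", matmul(J, J) == J, "| J equals identity:", J == I3)
```

Exact rational arithmetic is used so that the relations are tested as identities rather than up to rounding.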
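In order for $U_q^{w\prime}$ to become a weak Hopf algebra, it is enough to define $\Delta_w(E_w)$, $\Delta_w(F_w)$, $\Delta_w(K_w)$, $\Delta_w(\overline{K}_w)$, $\varepsilon_w(E_w)$, $\varepsilon_w(F_w)$, $\varepsilon_w(K_w)$, $\varepsilon_w(\overline{K}_w)$, $T_w(E_w)$, $T_w(F_w)$, $T_w(K_w)$, $T_w(\overline{K}_w)$ just as in $wsl_q(2)$ and define
\[
\Delta_w(L_w) = \frac{1}{q - q^{-1}}\left(K_w \otimes K_w - \overline{K}_w \otimes \overline{K}_w\right), \qquad \varepsilon_w(L_w) = 0, \qquad T_w(L_w) = \frac{\overline{K}_w - K_w}{q - q^{-1}}.
\]
From Proposition 4 we conclude that $wsl_q(2)$ is isomorphic to the algebra $U_q^{w\prime}$ with $\varphi_w$. Moreover, one can see easily that $\varphi_w$ is an isomorphism of weak Hopf algebras from $wsl_q(2)$ to $U_q^{w\prime}$.

For the $J$-weak quantum algebra $vsl_q(2)$ we suppose that some additional $J_v$ should appear even in the definitions of comultiplication and antipode. A thorough analysis gives the following nontrivial definitions:
\begin{align*}
\Delta_v(E_v) &= J_v \otimes J_vE_vJ_v + J_vE_vJ_v \otimes K_v, \tag{103}\\
\Delta_v(F_v) &= J_vF_vJ_v \otimes J_v + \overline{K}_v \otimes J_vF_vJ_v, \tag{104}\\
\Delta_v(K_v) &= K_v \otimes K_v, \qquad \Delta_v(\overline{K}_v) = \overline{K}_v \otimes \overline{K}_v, \tag{105}\\
\varepsilon_v(E_v) &= \varepsilon_v(F_v) = 0, \qquad \varepsilon_v(K_v) = \varepsilon_v(\overline{K}_v) = 1, \tag{106}\\
T_v(E_v) &= -J_vE_v\overline{K}_v, \qquad T_v(F_v) = -K_vF_vJ_v, \tag{107}\\
T_v(K_v) &= \overline{K}_v, \qquad T_v(\overline{K}_v) = K_v. \tag{108}
\end{align*}
Note that from (105) it follows that
\[
\Delta_v(J_v) = J_v \otimes J_v, \tag{109}
\]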
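and so $J_v$ is a group-like element.

Proposition 9. The relations (103)–(108) endow $vsl_q(2)$ with a bialgebra structure.

Proof. First we should prove that $\Delta_v$ defines a morphism of algebras from $vsl_q(2)$ into $vsl_q(2) \otimes vsl_q(2)$. We check that
\begin{align*}
\Delta_v(K_v)\Delta_v(\overline{K}_v) &= \Delta_v(\overline{K}_v)\Delta_v(K_v), \tag{110}\\
\Delta_v(K_v)\Delta_v(\overline{K}_v)\Delta_v(K_v) &= \Delta_v(K_v), \tag{111}\\
\Delta_v(\overline{K}_v)\Delta_v(K_v)\Delta_v(\overline{K}_v) &= \Delta_v(\overline{K}_v), \tag{112}\\
\Delta_v(K_v)\Delta_v(E_v)\Delta_v(\overline{K}_v) &= q^{2}\Delta_v(E_v), \tag{113}\\
\Delta_v(K_v)\Delta_v(F_v)\Delta_v(\overline{K}_v) &= q^{-2}\Delta_v(F_v), \tag{114}\\
\Delta_v(E_v)\Delta_v(J_v)\Delta_v(F_v) - \Delta_v(F_v)\Delta_v(J_v)\Delta_v(E_v) &= \frac{\Delta_v(K_v) - \Delta_v(\overline{K}_v)}{q - q^{-1}}. \tag{115}
\end{align*}
The relations (110)–(112) are clear from (105). For (113) we have
\begin{align*}
\Delta_v(K_v)\Delta_v(E_v)\Delta_v(\overline{K}_v)
&= (K_v \otimes K_v)(J_v \otimes J_vE_vJ_v + J_vE_vJ_v \otimes K_v)(\overline{K}_v \otimes \overline{K}_v)\\
&= J_v \otimes K_vE_v\overline{K}_v + K_vE_v\overline{K}_v \otimes K_v\\
&= q^{2}(J_v \otimes J_vE_vJ_v + J_vE_vJ_v \otimes K_v) = q^{2}\Delta_v(E_v).
\end{align*}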
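Relation (114) is obtained similarly. Next, for (115), exploiting (7), (34) and (35)–(36), we derive
\begin{align*}
&\Delta_v(E_v)\Delta_v(J_v)\Delta_v(F_v) - \Delta_v(F_v)\Delta_v(J_v)\Delta_v(E_v)\\
&\quad= (J_v \otimes J_vE_vJ_v + J_vE_vJ_v \otimes K_v)(J_v \otimes J_v)(J_vF_vJ_v \otimes J_v + \overline{K}_v \otimes J_vF_vJ_v)\\
&\qquad- (J_vF_vJ_v \otimes J_v + \overline{K}_v \otimes J_vF_vJ_v)(J_v \otimes J_v)(J_v \otimes J_vE_vJ_v + J_vE_vJ_v \otimes K_v)\\
&\quad= J_vF_vJ_v \otimes J_vE_vJ_v - J_vF_vJ_v \otimes J_vE_vJ_v + J_vE_v\overline{K}_v \otimes K_vF_vJ_v - \overline{K}_vE_vJ_v \otimes J_vF_vK_v\\
&\qquad+ J_vE_vJ_vF_vJ_v \otimes K_v - J_vF_vJ_vE_vJ_v \otimes K_v + \overline{K}_v \otimes J_vE_vJ_vF_vJ_v - \overline{K}_v \otimes J_vF_vJ_vE_vJ_v\\
&\quad= J_v(E_vJ_vF_v - F_vJ_vE_v)J_v \otimes K_v + \overline{K}_v \otimes J_v(E_vJ_vF_v - F_vJ_vE_v)J_v\\
&\quad= J_v\frac{K_v - \overline{K}_v}{q - q^{-1}}J_v \otimes K_v + \overline{K}_v \otimes J_v\frac{K_v - \overline{K}_v}{q - q^{-1}}J_v
= \frac{K_v \otimes K_v - \overline{K}_v \otimes \overline{K}_v}{q - q^{-1}}\\
&\quad= \frac{\Delta_v(K_v) - \Delta_v(\overline{K}_v)}{q - q^{-1}}.
\end{align*}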
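Then we show that $\Delta_v(X)$ is coassociative:
\[
(\Delta_v \otimes \mathrm{id})\Delta_v(X) = (\mathrm{id} \otimes \Delta_v)\Delta_v(X). \tag{116}
\]
Take $E_v$ as an example. On the one hand,
\begin{align*}
(\Delta_v \otimes \mathrm{id})\Delta_v(E_v)
&= (\Delta_v \otimes \mathrm{id})(J_v \otimes J_vE_vJ_v + J_vE_vJ_v \otimes K_v)\\
&= \Delta_v(J_v) \otimes J_vE_vJ_v + \Delta_v(J_v)\Delta_v(E_v)\Delta_v(J_v) \otimes K_v\\
&= J_v \otimes J_v \otimes J_vE_vJ_v + J_v \otimes J_vE_vJ_v \otimes K_v + J_vE_vJ_v \otimes K_v \otimes K_v.
\end{align*}
On the other hand,
\begin{align*}
(\mathrm{id} \otimes \Delta_v)\Delta_v(E_v)
&= (\mathrm{id} \otimes \Delta_v)(J_v \otimes J_vE_vJ_v + J_vE_vJ_v \otimes K_v)\\
&= J_v \otimes \Delta_v(J_v)\Delta_v(E_v)\Delta_v(J_v) + J_vE_vJ_v \otimes \Delta_v(K_v)\\
&= J_v \otimes J_v \otimes J_vE_vJ_v + J_v \otimes J_vE_vJ_v \otimes K_v + J_vE_vJ_v \otimes K_v \otimes K_v,
\end{align*}
which coincides with the previous expression.

The proof that the counit $\varepsilon_v$ defines a morphism of algebras from $vsl_q(2)$ onto $k$ is straightforward and the result has the form
\begin{align*}
\varepsilon_v(K_v)\varepsilon_v(\overline{K}_v) &= \varepsilon_v(\overline{K}_v)\varepsilon_v(K_v), \tag{117}\\
\varepsilon_v(K_v)\varepsilon_v(\overline{K}_v)\varepsilon_v(K_v) &= \varepsilon_v(K_v), \tag{118}\\
\varepsilon_v(\overline{K}_v)\varepsilon_v(K_v)\varepsilon_v(\overline{K}_v) &= \varepsilon_v(\overline{K}_v), \tag{119}\\
\varepsilon_v(K_v)\varepsilon_v(E_v)\varepsilon_v(\overline{K}_v) &= q^{2}\varepsilon_v(E_v), \tag{120}\\
\varepsilon_v(K_v)\varepsilon_v(F_v)\varepsilon_v(\overline{K}_v) &= q^{-2}\varepsilon_v(F_v), \tag{121}\\
\varepsilon_v(E_v)\varepsilon_v(J_v)\varepsilon_v(F_v) - \varepsilon_v(F_v)\varepsilon_v(J_v)\varepsilon_v(E_v) &= \frac{\varepsilon_v(K_v) - \varepsilon_v(\overline{K}_v)}{q - q^{-1}}. \tag{122}
\end{align*}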
Moreover, it can be shown that
\[
(\varepsilon_v \otimes \mathrm{id})\Delta_v(X) = (\mathrm{id} \otimes \varepsilon_v)\Delta_v(X) = X
\]
for $X = E_v, F_v, K_v, \overline{K}_v$. Further we check that $T_v$ defines an anti-morphism of algebras from $vsl_q(2)$ to $vsl_q^{\mathrm{op}}(2)$ as follows:
\begin{align*}
T_v(K_v)T_v(\overline{K}_v) &= T_v(\overline{K}_v)T_v(K_v), \tag{123}\\
T_v(K_v)T_v(\overline{K}_v)T_v(K_v) &= T_v(K_v), \tag{124}\\
T_v(\overline{K}_v)T_v(K_v)T_v(\overline{K}_v) &= T_v(\overline{K}_v), \tag{125}\\
T_v(\overline{K}_v)T_v(E_v)T_v(K_v) &= q^{2}T_v(E_v), \tag{126}\\
T_v(\overline{K}_v)T_v(F_v)T_v(K_v) &= q^{-2}T_v(F_v), \tag{127}\\
T_v(F_v)T_v(J_v)T_v(E_v) - T_v(E_v)T_v(J_v)T_v(F_v) &= \frac{T_v(K_v) - T_v(\overline{K}_v)}{q - q^{-1}}. \tag{128}
\end{align*}
The first three relations are obvious. For (126), using (107) and (35), we have
\begin{align*}
T_v(\overline{K}_v)T_v(E_v)T_v(K_v)
&= K_v(-J_vE_v\overline{K}_v)\overline{K}_v = q^{2}K_v(-\overline{K}_vE_vJ_v)\overline{K}_v\\
&= -q^{2}J_vE_vJ_v\overline{K}_v = -q^{2}J_vE_v\overline{K}_v = q^{2}T_v(E_v).
\end{align*}
For the last relation (128), using (35)–(36), we obtain
\begin{align*}
&T_v(F_v)T_v(J_v)T_v(E_v) - T_v(E_v)T_v(J_v)T_v(F_v)\\
&\quad= (-K_vF_vJ_v)J_v(-J_vE_v\overline{K}_v) - (-J_vE_v\overline{K}_v)J_v(-K_vF_vJ_v)\\
&\quad= J_v(F_vJ_vE_v - E_vJ_vF_v)J_v = J_v\frac{\overline{K}_v - K_v}{q - q^{-1}}J_v = \frac{T_v(K_v) - T_v(\overline{K}_v)}{q - q^{-1}}.
\end{align*}
Therefore, we conclude that $(vsl_q(2), \mu_v, \eta_v, \Delta_v, T_v)$ has the structure of a bialgebra.
The following property of $T_v$ is crucial for understanding the structure of the bialgebra $(vsl_q(2), \mu_v, \eta_v, \Delta_v, T_v)$.

Proposition 10. For any $X \in vsl_q(2)$ we have (cf. (101)–(102))
\begin{align*}
T_v^2(K_v) &= e_v(K_v), & T_v^2(\overline{K}_v) &= e_v(\overline{K}_v), \tag{129}\\
T_v^2(E_v) &= K_vE_v\overline{K}_v, & T_v^2(F_v) &= K_vF_v\overline{K}_v, \tag{130}
\end{align*}
where $e_v(X)$ is defined in (11).

Proof. Follows from (7) and (107)–(108). As an example, for $E_v$ we have
\[
T_v^2(E_v) = T_v(-J_vE_v\overline{K}_v) = -T_v(\overline{K}_v)T_v(E_v)T_v(J_v) = K_vJ_vE_v\overline{K}_vJ_v = K_vE_v\overline{K}_v.
\]
The star product in $(vsl_q(2), \mu_v, \eta_v, \Delta_v, T_v)$ has the form
\[
(A \ast_v B)(X) = \mu_v[A \otimes B]\Delta_v(X). \tag{131}
\]
Proposition 11. $T_v$ satisfies the regularity conditions
\begin{align*}
(e_v \ast_v T_v \ast_v e_v)(X) &= e_v(X), \tag{132}\\
(T_v \ast_v e_v \ast_v T_v)(X) &= T_v(X) \tag{133}
\end{align*}
for any $X = E_v$, $F_v$, $K_v$ or $\overline{K}_v$.

Proof. Follows from (103)–(108) and (131). For $X = K_v, \overline{K}_v$ it is easy, and so we consider $X = E_v$ as an example. We have
\begin{align*}
(e_v \ast_v T_v \ast_v e_v)(E_v)
&= \mu_v[(e_v \ast_v T_v) \otimes e_v]\Delta_v(E_v)\\
&= \mu_v[(e_v \ast_v T_v) \otimes e_v](J_v \otimes J_vE_vJ_v + J_vE_vJ_v \otimes K_v)\\
&= (e_v \ast_v T_v)(J_v)\,e_v(J_vE_vJ_v) + (e_v \ast_v T_v)(J_vE_vJ_v)\,e_v(K_v)\\
&= \mu_v[e_v \otimes T_v]\Delta_v(J_v)\,e_v(J_vE_vJ_v) + \mu_v[e_v \otimes T_v]\Delta_v(E_v)\,e_v(K_v)\\
&= \mu_v[e_v \otimes T_v](J_v \otimes J_v)\,e_v(E_v) + \mu_v[e_v \otimes T_v](J_v \otimes J_vE_vJ_v + J_vE_vJ_v \otimes K_v)\,e_v(K_v)\\
&= e_v(J_v)T_v(J_v)e_v(E_v) + e_v(J_v)T_v(J_vE_vJ_v)e_v(K_v) + e_v(E_v)T_v(K_v)e_v(K_v)\\
&= J_v \cdot J_v \cdot J_vE_vJ_v - J_v \cdot J_vJ_vE_v\overline{K}_v \cdot J_vK_vJ_v + J_vE_vJ_v \cdot \overline{K}_v \cdot J_vK_vJ_v\\
&= J_vE_vJ_v = e_v(E_v).
\end{align*}
By analogy, for (133) and $X = E_v$ we obtain
\begin{align*}
(T_v \ast_v e_v \ast_v T_v)(E_v)
&= \mu_v[(T_v \ast_v e_v) \otimes T_v]\Delta_v(E_v)\\
&= \mu_v[(T_v \ast_v e_v) \otimes T_v](J_v \otimes J_vE_vJ_v + J_vE_vJ_v \otimes K_v)\\
&= (T_v \ast_v e_v)(J_v)\,T_v(J_vE_vJ_v) + (T_v \ast_v e_v)(E_v)\,T_v(K_v)\\
&= \mu_v[T_v \otimes e_v](J_v \otimes J_v)\,T_v(J_vE_vJ_v) + \mu_v[T_v \otimes e_v](J_v \otimes J_vE_vJ_v + J_vE_vJ_v \otimes K_v)\,T_v(K_v)\\
&= T_v(J_v)e_v(J_v)T_v(J_vE_vJ_v) + T_v(J_v)e_v(J_vE_vJ_v)T_v(K_v) + T_v(J_vE_vJ_v)e_v(K_v)T_v(K_v)\\
&= -J_v \cdot J_v \cdot J_vJ_vE_v\overline{K}_vJ_v + J_v \cdot J_vE_vJ_v \cdot \overline{K}_v - J_vJ_vE_v\overline{K}_vJ_v \cdot J_vK_vJ_v \cdot \overline{K}_v\\
&= -J_vE_v\overline{K}_v = T_v(E_v).
\end{align*}
From (132)–(133) it follows that $vsl_q(2)$ is not a weak Hopf algebra in the definition of [18]. So we will call it a $J$-weak Hopf algebra and $T_v$ a $J$-weak antipode. As is seen from (99)–(100) and (132)–(133), the difference between them is in the exchange of $\mathrm{id}$ with $e_v$.

Remark 10. The variable $e_v$ can be treated as an $n = 2$ example of the “tower identity” $e_{\alpha\beta}^{(n)}$ introduced for semisupermanifolds in [9, 10] or the “obstructor” $e_X^{(n)}$ for general mappings, categories and the Yang–Baxter equation in [6–8].

Comparing (68)–(71) with (103)–(108) we conclude that the connection of $\Delta_w, T_w, \varepsilon_w$ and $\Delta_v, T_v, \varepsilon_v$ can be written in the following way:
\begin{align*}
\Delta_v(X) &= \Delta_w(e_v(X)), \tag{134}\\
T_v(X) &= T_w(e_v(X)), \tag{135}\\
\varepsilon_v(X) &= \varepsilon_w(e_v(X)), \tag{136}
\end{align*}
which means that additionally to the partial algebra morphism (43) there exists a partial coalgebra morphism which is described by (134)–(136).
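6. Group-Like Elements

Now we discuss the set $G(wsl_q(2))$ of all group-like elements of $wsl_q(2)$. As is well known (see e.g. [14]), a semigroup $S$ is called an inverse semigroup if for every $x \in S$ there exists a unique $y \in S$ such that $xyx = x$ and $yxy = y$, and a monoid is a semigroup with identity. We will show the following.

Proposition 12. The set of all group-like elements is $G(wsl_q(2)) = \{J^{(ij)} = K_w^i\overline{K}_w^j : i, j$ run over all non-negative integers$\}$, which forms a regular monoid under the multiplication of $wsl_q(2)$.

Proof. Suppose $x \in wsl_q(2)$ is a group-like element, i.e. $\Delta_w(x) = x \otimes x$. By Theorem 2, $x$ can be written as
\[
x = \sum_{i,j,l,m}\left(\alpha_{ijl}E_w^iF_w^jK_w^l + \beta_{ijm}E_w^iF_w^j\overline{K}_w^m + \gamma_{ij}E_w^iF_w^jJ_w\right).
\]
Here and in the sequel, every $\alpha$, $\beta$ and $\gamma$ with subscripts is in the field $k$ and does not equal zero. Then
\begin{align*}
\Delta_w(x) &= \sum_{i,j,l,m}\left[\alpha_{ijl}\Delta_w(E_w^iF_w^jK_w^l) + \Delta_w(\beta_{ijm}E_w^iF_w^j\overline{K}_w^m) + \Delta_w(\gamma_{ij}E_w^iF_w^jJ_w)\right]\\
&= \sum_{i,j,l,m}\big[\alpha_{ijl}(1 \otimes E_w + E_w \otimes K_w)^i(F_w \otimes 1 + \overline{K}_w \otimes F_w)^j(K_w \otimes K_w)^l\\
&\qquad\quad + \beta_{ijm}(1 \otimes E_w + E_w \otimes K_w)^i(F_w \otimes 1 + \overline{K}_w \otimes F_w)^j(\overline{K}_w \otimes \overline{K}_w)^m\\
&\qquad\quad + \gamma_{ij}(1 \otimes E_w + E_w \otimes K_w)^i(F_w \otimes 1 + \overline{K}_w \otimes F_w)^j(J_w \otimes J_w)\big];
\end{align*}
and
\begin{align*}
x \otimes x &= \sum_{i,j,l,m}\left(\alpha_{ijl}E_w^iF_w^jK_w^l + \beta_{ijm}E_w^iF_w^j\overline{K}_w^m + \gamma_{ij}E_w^iF_w^jJ_w\right)\\
&\quad \otimes \sum_{i,j,l,m}\left(\alpha_{ijl}E_w^iF_w^jK_w^l + \beta_{ijm}E_w^iF_w^j\overline{K}_w^m + \gamma_{ij}E_w^iF_w^jJ_w\right).
\end{align*}
It is seen that if $i \neq 0$ or $j \neq 0$, then $\Delta_w(x)$ cannot equal $x \otimes x$. So $i = 0$ and $j = 0$. We get $x = \sum_{l,m}\left(\alpha_lK_w^l + \beta_m\overline{K}_w^m + J_w\right)$. Then
\begin{align*}
\Delta_w(x) &= \sum_{l,m}\left(\alpha_lK_w^l \otimes K_w^l + \beta_m\overline{K}_w^m \otimes \overline{K}_w^m + J_w \otimes J_w\right);\\
x \otimes x &= \sum_{l,l',m,m'}\big(\alpha_l\alpha_{l'}K_w^l \otimes K_w^{l'} + \alpha_l\beta_mK_w^l \otimes \overline{K}_w^m + \alpha_lK_w^l \otimes J_w\\
&\qquad\quad + \alpha_l\beta_m\overline{K}_w^m \otimes K_w^l + \beta_m\beta_{m'}\overline{K}_w^m \otimes \overline{K}_w^{m'} + \beta_m\overline{K}_w^m \otimes J_w\\
&\qquad\quad + \alpha_lJ_w \otimes K_w^l + \beta_mJ_w \otimes \overline{K}_w^m + J_w \otimes J_w\big).
\end{align*}
If there exist $l \neq l'$, then $x \otimes x$ possesses the monomial $K_w^l \otimes K_w^{l'}$, which does not appear in $\Delta_w(x)$. It contradicts $\Delta_w(x) = x \otimes x$. Hence we have only a unique $l$. Similarly, there exists a unique $m$. Thus $x = \alpha_lK_w^l + \beta_m\overline{K}_w^m + J_w$. Moreover, it is easy to see that $\alpha_lK_w^l$, $\beta_m\overline{K}_w^m$ and $J_w$ cannot appear simultaneously in the expression of $x$. Therefore, we conclude that $x = \alpha_lK_w^l$, $\beta_m\overline{K}_w^m$ or $J_w$ (no summation) and we have
\[
\Delta_w(J_w^{(ij)}) = J_w^{(ij)} \otimes J_w^{(ij)}. \tag{137}
\]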
It follows that $G(wsl_q(2)) = \{J_w^{(ij)} = K_w^i\overline{K}_w^j : i, j$ run over all non-negative integers$\}$. For any $J_w^{(ij)} = K_w^i\overline{K}_w^j \in G(wsl_q(2))$, one can find $J_w^{(ji)} = K_w^j\overline{K}_w^i \in G(wsl_q(2))$ such that the regularity (18) takes place, $J_w^{(ij)}J_w^{(ji)}J_w^{(ij)} = J_w^{(ij)}$, which means that $G(wsl_q(2))$ forms a regular monoid under the multiplication of $wsl_q(2)$.

For $vsl_q(2)$ we have a similar statement.
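As a sanity check of the regularity property in Proposition 12, the products of the elements $K_w^i\overline{K}_w^j$ can be computed by a small rewriting procedure. The sketch below is illustrative only: it assumes that $K_w$ and $\overline{K}_w$ commute and satisfy $K_w\overline{K}_wK_w = K_w$ and $\overline{K}_wK_w\overline{K}_w = \overline{K}_w$ (the pattern of (72)–(74)), so that each product reduces to one of $1$, $J_w$, $K_w^n$ or $\overline{K}_w^n$. The pair encoding and the function names are hypothetical.

```python
def normalize(i, j):
    """Normal form of K^i Kbar^j as an exponent pair.

    (0, 0) is the unit 1, (1, 1) is J = K Kbar, (n, 0) is K^n and (0, n) is Kbar^n.
    Uses K Kbar = Kbar K = J, K J = K, Kbar J = Kbar and J J = J.
    """
    if i == 0 or j == 0:
        return (i, j)
    m = min(i, j)
    i, j = i - m, j - m
    # A full cancellation leaves J, not the unit, because K Kbar = J != 1.
    return (1, 1) if (i, j) == (0, 0) else (i, j)


def multiply(x, y):
    """Product of two elements given in normal form."""
    return normalize(x[0] + y[0], x[1] + y[1])


def is_regular_pair(i, j):
    """Check x y x = x and y x y = y for x = J^(ij), y = J^(ji)."""
    x, y = normalize(i, j), normalize(j, i)
    return (multiply(multiply(x, y), x) == x
            and multiply(multiply(y, x), y) == y)


all_regular = all(is_regular_pair(i, j) for i in range(8) for j in range(8))
print("regularity holds for all 0 <= i, j < 8:", all_regular)
```

In this encoding the non-unit elements behave like the integers under addition with $J_w$ as neutral element, which is one way to see why each $J_w^{(ij)}$ has the partner $J_w^{(ji)}$.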
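Proposition 13. The set of all group-like elements is $G(vsl_q(2)) = \{J_v^{(ij)} = K_v^i\overline{K}_v^j : i, j$ run over all non-negative integers$\}$, which forms a regular monoid under the multiplication of $vsl_q(2)$.

Proof. Suppose $x \in vsl_q(2)$ is a group-like element, i.e. $\Delta_v(x) = x \otimes x$. By Theorem 3, $x$ can be written as
\[
x = \sum_{i,j,l,m}\left(\alpha_{ijl}J_vE_v^iJ_vF_v^jK_v^l + \beta_{ijm}J_vE_v^iJ_vF_v^j\overline{K}_v^m + \gamma_{ij}J_vE_v^iJ_vF_v^jJ_v\right).
\]
Here and in the sequel, every $\alpha$, $\beta$ and $\gamma$ with subscripts is in the field $k$ and does not equal zero. Then
\begin{align*}
\Delta_v(x) &= \sum_{i,j,l,m}\left[\alpha_{ijl}\Delta_v(J_vE_v^iJ_vF_v^jK_v^l) + \Delta_v(\beta_{ijm}J_vE_v^iJ_vF_v^j\overline{K}_v^m) + \Delta_v(\gamma_{ij}J_vE_v^iJ_vF_v^jJ_v)\right]\\
&= \sum_{i,j,l,m}\big[\alpha_{ijl}(J_v \otimes J_v)(J_v \otimes J_vE_vJ_v + J_vE_vJ_v \otimes K_v)^i\\
&\qquad\qquad \times (J_v \otimes J_v)(J_vF_vJ_v \otimes J_v + \overline{K}_v \otimes J_vF_vJ_v)^j(K_v \otimes K_v)^l\\
&\qquad\quad + \beta_{ijm}(J_v \otimes J_v)(J_v \otimes J_vE_vJ_v + J_vE_vJ_v \otimes K_v)^i\\
&\qquad\qquad \times (J_v \otimes J_v)(J_vF_vJ_v \otimes J_v + \overline{K}_v \otimes J_vF_vJ_v)^j(\overline{K}_v \otimes \overline{K}_v)^m\\
&\qquad\quad + \gamma_{ij}(J_v \otimes J_v)(J_v \otimes J_vE_vJ_v + J_vE_vJ_v \otimes K_v)^i\\
&\qquad\qquad \times (J_v \otimes J_v)(J_vF_vJ_v \otimes J_v + \overline{K}_v \otimes J_vF_vJ_v)^j(J_v \otimes J_v)\big];
\end{align*}
and
\begin{align*}
x \otimes x &= \sum_{i,j,l,m}\left(\alpha_{ijl}J_vE_v^iJ_vF_v^jK_v^l + \beta_{ijm}J_vE_v^iJ_vF_v^j\overline{K}_v^m + \gamma_{ij}J_vE_v^iJ_vF_v^jJ_v\right)\\
&\quad \otimes \sum_{i,j,l,m}\left(\alpha_{ijl}J_vE_v^iJ_vF_v^jK_v^l + \beta_{ijm}J_vE_v^iJ_vF_v^j\overline{K}_v^m + \gamma_{ij}J_vE_v^iJ_vF_v^jJ_v\right).
\end{align*}
It is seen that if $i \neq 0$ or $j \neq 0$, then $\Delta_v(x)$ cannot equal $x \otimes x$. So $i = 0$ and $j = 0$. We get $x = \sum_{l,m}\left(\alpha_lK_v^l + \beta_m\overline{K}_v^m + J_v\right)$. Then
\begin{align*}
\Delta_v(x) &= \sum_{l,m}\left(\alpha_lK_v^l \otimes K_v^l + \beta_m\overline{K}_v^m \otimes \overline{K}_v^m + J_v \otimes J_v\right);\\
x \otimes x &= \sum_{l,l',m,m'}\big(\alpha_l\alpha_{l'}K_v^l \otimes K_v^{l'} + \alpha_l\beta_mK_v^l \otimes \overline{K}_v^m + \alpha_lK_v^l \otimes J_v\\
&\qquad\quad + \alpha_l\beta_m\overline{K}_v^m \otimes K_v^l + \beta_m\beta_{m'}\overline{K}_v^m \otimes \overline{K}_v^{m'} + \beta_m\overline{K}_v^m \otimes J_v\\
&\qquad\quad + \alpha_lJ_v \otimes K_v^l + \beta_mJ_v \otimes \overline{K}_v^m + J_v \otimes J_v\big).
\end{align*}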
If there exist $l \neq l'$, then $x \otimes x$ possesses the monomial $K_v^l \otimes K_v^{l'}$, which does not appear in $\Delta_v(x)$. It contradicts $\Delta_v(x) = x \otimes x$. Hence we have only a unique $l$. Similarly, there exists a unique $m$. Thus $x = \alpha_lK_v^l + \beta_m\overline{K}_v^m + J_v$. Moreover, it is easy to see that $\alpha_lK_v^l$, $\beta_m\overline{K}_v^m$ and $J_v$ cannot appear simultaneously in the expression of $x$. Therefore, we conclude that $x = \alpha_lK_v^l$, $\beta_m\overline{K}_v^m$ or $J_v$ (no summation) and we have
\[
\Delta_v(J_v^{(ij)}) = J_v^{(ij)} \otimes J_v^{(ij)}. \tag{138}
\]
It follows that $G(vsl_q(2)) = \{J_v^{(ij)} = K_v^i\overline{K}_v^j : i, j$ run over all non-negative integers$\}$. For any $J_v^{(ij)} = K_v^i\overline{K}_v^j \in G(vsl_q(2))$, one can find $J_v^{(ji)} = K_v^j\overline{K}_v^i \in G(vsl_q(2))$ such that the regularity (18) takes place, $J_v^{(ij)}J_v^{(ji)}J_v^{(ij)} = J_v^{(ij)}$, which means that $G(vsl_q(2))$ forms a regular monoid under the multiplication of $vsl_q(2)$.

These results show that $wsl_q(2)$ and $vsl_q(2)$ are examples of a weak Hopf algebra whose monoid of all group-like elements is a regular monoid. This further illustrates the corresponding relationship between weak Hopf algebras and regular monoids [19].

7. Regular Quasi-R-Matrix

From Proposition 1 we have seen that $wsl_q(2)/(J_w - 1) = sl_q(2)$. Now we give another relationship between $wsl_q(2)$ and $sl_q(2)$ so as to construct a non-invertible universal $R^w$-matrix from $wsl_q(2)$.

Theorem 4. $wsl_q(2)$ possesses an ideal $W$ and a sub-algebra $Y$ satisfying $wsl_q(2) = Y \oplus W$ and $W \cong sl_q(2)$ as Hopf algebras.

Proof. Let $W$ be the linear sub-space generated by $\{E_w^iF_w^jK_w^l,\ E_w^iF_w^j\overline{K}_w^m,\ E_w^iF_w^jJ_w$ : for all $i \geq 0$, $j \geq 0$, $l > 0$ and $m > 0\}$, and let $Y$ be the linear sub-space generated by $\{E_w^iF_w^j : i \geq 0, j \geq 0\}$. It is easy to see that $wsl_q(2) = Y \oplus W$; that $wsl_q(2)\,W\,wsl_q(2) \subseteq W$, thus $W$ is an ideal; and that $Y$ is a sub-algebra of $wsl_q(2)$. Note that the identity of $W$ is $J_w$. Moreover, $W$ is a Hopf algebra with the unit $J_w$, the comultiplication $\Delta_w^W$ satisfying
\begin{align*}
\Delta_w^W(E_w) &= J_w \otimes E_w + E_w \otimes K_w, \tag{139}\\
\Delta_w^W(F_w) &= F_w \otimes J_w + \overline{K}_w \otimes F_w, \tag{140}\\
\Delta_w^W(K_w) &= K_w \otimes K_w, \qquad \Delta_w^W(\overline{K}_w) = \overline{K}_w \otimes \overline{K}_w, \tag{141}
\end{align*}
and the same counit, multiplication and antipode as in $wsl_q(2)$. Let $\rho$ be the algebra morphism from $sl_q(2)$ to $W$ satisfying $\rho(E) = E_w$, $\rho(F) = F_w$, $\rho(K) = K_w$ and $\rho(K^{-1}) = \overline{K}_w$. Then $\rho$ is, in fact, a Hopf algebra isomorphism since $\{E_w^iF_w^jK_w^l,\ E_w^iF_w^j\overline{K}_w^m,\ E_w^iF_w^jJ_w$ : for all $i \geq 0$, $j \geq 0$, $l > 0$ and $m > 0\}$ is a basis of $W$ by Theorem 2.

Let us assume here that $q$ is a root of unity of order $d$ in the field $k$, where $d$ is an odd integer and $d > 1$. Set $I = (E_w^d, F_w^d, K_w^d - J_w)$, the two-sided ideal of $U_q^w$ generated by $E_w^d$, $F_w^d$, $K_w^d - J_w$. Define the algebra $\overline{U}_q^w = U_q^w/I$.
Remark 11. Note that $\overline{K}_w^d = J_w$ in $\overline{U}_q^w = U_q^w/I$ since $K_w^d = J_w$.

It is easy to prove that $I$ is also a coideal of $U_q^w$ and $T_w(I) \subseteq I$. Then $I$ is a weak Hopf ideal. It follows that $\overline{U}_q^w$ has a unique weak Hopf algebra structure such that the natural morphism is a weak Hopf algebra morphism, so the comultiplication, the counit and the weak antipode of $\overline{U}_q^w$ are determined by the same formulas as in $U_q^w$. We will show that $\overline{U}_q^w$ is a quasi-braided weak Hopf algebra. As a generalization of a braided bialgebra and $R$-matrix we have the following definitions [18].

Definition 4. Let there be $k$-linear maps $\mu : H \otimes H \to H$, $\eta : k \to H$, $\Delta : H \to H \otimes H$, $\varepsilon : H \to k$ in a $k$-linear space $H$ such that $(H, \mu, \eta)$ is a $k$-algebra and $(H, \Delta, \varepsilon)$ is a $k$-coalgebra. We call $H$ an almost bialgebra if $\Delta$ is a $k$-algebra morphism, i.e. $\Delta(xy) = \Delta(x)\Delta(y)$ for every $x, y \in H$.

Definition 5. An almost bialgebra $H = (H, \mu, \eta, \Delta, \varepsilon)$ is called quasi-braided if there exists an element $R$ of the algebra $H \otimes H$ satisfying
\[
\Delta^{\mathrm{op}}(x)R = R\Delta(x) \tag{142}
\]
for all $x \in H$ and
\begin{align*}
(\Delta \otimes \mathrm{id}_H)(R) &= R_{13}R_{23}, \tag{143}\\
(\mathrm{id}_H \otimes \Delta)(R) &= R_{13}R_{12}. \tag{144}
\end{align*}
Such $R$ is called a quasi-R-matrix.

By Theorem 4, we have $\overline{U}_q^w = U_q^w/I = Y/I \oplus W/I \cong Y/(E_w^d, F_w^d) \oplus \overline{U}_q$, where $\overline{U}_q = sl_q(2)/(E_w^d, F_w^d, K^d - 1)$ is a finite Hopf algebra. We know from [16] that the sub-algebra $\overline{B}_q$ of $\overline{U}_q$ generated by $\{E_w^mK^n : 0 \leq m, n \leq d - 1\}$ is a finite dimensional Hopf sub-algebra and $\overline{U}_q$ is a braided Hopf algebra as a quotient of the quantum double of $\overline{B}_q$. The R-matrix of $\overline{U}_q$ is
\[
\overline{R} = \frac{1}{d}\sum_{0 \leq i,j,k \leq d-1}\frac{(q - q^{-1})^k}{[k]!}\,q^{k(k-1)/2 + 2k(i-j) - 2ij}\,E_w^kK_w^i \otimes F_w^kK_w^j.
\]
Since $sl_q(2) \stackrel{\rho}{\cong} W$ as Hopf algebras and $(E^d, F^d, K^d - 1) \stackrel{\rho}{\cong} I$, we get $\overline{U}_q \cong W/I$ as Hopf algebras under the induced morphism of $\rho$. Then $W/I$ is a braided Hopf algebra with an R-matrix
\[
R^w = \frac{1}{d}\sum_{0 \leq k \leq d-1;\ 1 \leq i,j \leq d}\frac{(q - q^{-1})^k}{[k]!}\,q^{k(k-1)/2 + 2k(i-j) - 2ij}\,E_w^kK_w^i \otimes F_w^kK_w^j.
\]
Because the identity of $W/I$ is $J_w$, there exists the inverse $\hat{R}^w$ of $R^w$ such that $\hat{R}^wR^w = R^w\hat{R}^w = J_w$. Then we have
\begin{align*}
R^w\hat{R}^wR^w &= R^w, \tag{145}\\
\hat{R}^wR^w\hat{R}^w &= \hat{R}^w, \tag{146}
\end{align*}
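The pattern behind (145)–(146) is that of an element which is invertible only relative to an idempotent unit. The following sketch is a generic matrix illustration, not the matrix $R^w$ itself: the hypothetical matrix `R` is an invertible $2 \times 2$ block padded with zeros, `R_hat` is its inverse on that block, and their product is an idempotent `J` that plays the role of $J_w$ without being the full identity.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical data: the block [[2, 1], [1, 1]] has determinant 1 and
# inverse [[1, -1], [-1, 2]]; both are padded with a zero row and column.
R = [[2, 1, 0], [1, 1, 0], [0, 0, 0]]
R_hat = [[1, -1, 0], [-1, 2, 0], [0, 0, 0]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

J = matmul(R, R_hat)  # idempotent "local unit", here diag(1, 1, 0)

checks = {
    "R_hat R = R R_hat = J": matmul(R_hat, R) == J,
    "J J = J": matmul(J, J) == J,
    "J != identity": J != I3,
    "R R_hat R = R": matmul(matmul(R, R_hat), R) == R,
    "R_hat R R_hat = R_hat": matmul(matmul(R_hat, R), R_hat) == R_hat,
}
for name, holds in checks.items():
    print(name, holds)
```

The two last checks are exactly the von Neumann regularity conditions, and they hold although `R` has no inverse with respect to the full identity matrix.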
which shows that this R-matrix is regular in $\overline{U}_q^w$. It obeys the following relations:
\[
\Delta_w^{\mathrm{op}}(x)R^w = R^w\Delta_w(x) \tag{147}
\]
for any $x \in W/I$ and
\begin{align*}
(\Delta_w \otimes \mathrm{id})(R^w) &= R^w_{13}R^w_{23}, \tag{148}\\
(\mathrm{id} \otimes \Delta_w)(R^w) &= R^w_{13}R^w_{12}, \tag{149}
\end{align*}
which are also satisfied in $\overline{U}_q^w$. Therefore $R^w$ is a von Neumann regular quasi-R-matrix of $\overline{U}_q^w$. So we get the following.

Theorem 5. $\overline{U}_q^w$ is a quasi-braided weak Hopf algebra with
\[
R^w = \frac{1}{d}\sum_{0 \leq k \leq d-1;\ 1 \leq i,j \leq d}\frac{(q - q^{-1})^k}{[k]!}\,q^{k(k-1)/2 + 2k(i-j) - 2ij}\,E_w^kK_w^i \otimes F_w^kK_w^j
\]
as its quasi-R-matrix, which is regular.

The quasi-R-matrix from the $J$-weak Hopf algebra $vsl_q(2)$ has a more complicated structure and will be considered elsewhere.

8. Discussion

In conclusion we would like to compare the presented generalization of the Hopf algebra with the existing ones. A weak Hopf algebra in the sense of [4, 30, 26] is a $k$-linear vector space $H$ that is both an associative algebra $(H, \mu, \eta)$ and a coassociative coalgebra $(H, \Delta_{\mathrm{weak}}, \varepsilon_{\mathrm{weak}})$ related to each other in a certain self-dual way [3, 26] and that possesses an antipode $S_{\mathrm{weak}}$ satisfying (in Sweedler notations [29])
\begin{align*}
S_{\mathrm{weak}}(x_{(1)})\,x_{(2)} &= 1_{(1)}\,\varepsilon_{\mathrm{weak}}(x1_{(2)}), \tag{150}\\
x_{(1)}\,S_{\mathrm{weak}}(x_{(2)}) &= \varepsilon_{\mathrm{weak}}(x1_{(1)})\,1_{(2)} \tag{151}
\end{align*}
(pre-antipode), and if in addition
\[
S_{\mathrm{weak}}(x_{(1)})\,x_{(2)}\,S_{\mathrm{weak}}(x_{(3)}) = S_{\mathrm{weak}}(x), \tag{152}
\]
then $S_{\mathrm{weak}}$ can be called a Nill's antipode. Weak Hopf algebras have “weaker” axioms related to the unit and counit: $\varepsilon_{\mathrm{weak}}(xyz) = \varepsilon_{\mathrm{weak}}(xy_{(1)})\,\varepsilon_{\mathrm{weak}}(y_{(2)}z)$ and $\Delta^{(2)}_{\mathrm{weak}}(1) = (\Delta_{\mathrm{weak}}(1) \otimes 1)(1 \otimes \Delta_{\mathrm{weak}}(1))$. So the comultiplication is non-unital, $\Delta_{\mathrm{weak}}(1) \neq 1 \otimes 1$ (like in weak quasi Hopf algebras [23]), and the counit is only “weakly” multiplicative, $\varepsilon(xy) = \varepsilon(x1_{(1)})\,\varepsilon(1_{(2)}y)$. Therefore they can be called non-unital weak Hopf algebras. Note that this kind of “weakness” is the “strength” of weak Hopf algebras [3], because it allows (even in the finite dimensional and semisimple cases) the weak Hopf algebra to possess non-integral (quantum) dimensions. The earlier proposals of face algebras [13], quantum groupoids [25] and the (finite dimensional) generalized Kac algebras [31] are weak Hopf algebras in this sense [26], not the most general ones, but having an involutive antipode.

The weak antipode $T$ introduced in [18] and in this paper ($T_w$ and $T_v$) is not usually a pre-antipode in the sense of (150)–(151). Therefore the class of non-unital Hopf
algebras [26, 3] (or quantum groupoids [25]) and the class of weak Hopf algebras [18, 20, 5] are not included in each other. In fact, we have the following relation:
\[
\begin{array}{ccccc}
A & \longrightarrow & B & \longrightarrow & C\\
\downarrow & & & & \downarrow\\
D & & \longrightarrow & & E
\end{array}
\]
where $A$ denotes a Hopf algebra, $B$ a non-unital weak Hopf algebra, $C$ a non-unital almost weak Hopf algebra, $D$ a weak Hopf algebra and $E$ an almost weak Hopf algebra. From this we see easily that only the Hopf algebras form their common subclass.

Nill [26] points out that these algebras have many examples in the theory of quantum chain models. By contrast, our examples come from regular monoid algebras [18–20] and also from this paper, i.e. $wsl_q(2)$, $vsl_q(2)$, etc. Note that although the weak Hopf algebras in this paper and the non-unital weak Hopf algebras introduced earlier do not usually include each other, their antipodes are defined by a similar method, that is, by using the regularity of antipodes in the involution algebra of the original algebras. Therefore, we believe that it is possible to characterize certain aspects in similar ways. A further interesting work, which we want to continue, is to study our weak Hopf algebras through similar objects and methods as for the non-unital weak Hopf algebras and, moreover, to find applications in the theory of quantum chain models and other related areas.

Acknowledgements. F.L. thanks M. L. Ge, P. Trotter and N. H. Xi for kind help and fruitful discussions during his visits. S.D. is thankful to A. Kelarev, V. Lyubashenko, W. Marcinek and B. Schein for useful remarks. S.D. is grateful to the Zhejiang University for kind hospitality and the National Natural Science Foundation of China for financial support.
References
1. Abe, E.: Hopf Algebras. Cambridge: Cambridge Univ. Press, 1980
2. Bernstein, J. and Khovanova, T.: On quantum group SLq(2). Preprint MIT, hep-th/9412056, Cambridge, 1994
3. Böhm, G., Nill, F., and Szlachányi, K.: Weak Hopf algebras I. Integral theory and C∗-structure. J. Algebra 221, 385–438 (1999)
4. Böhm, G. and Szlachányi, K.: A coassociative C∗-quantum group with nonintegral dimensions. Lett. Math. Phys. 35, 437–456 (1996)
5. Duplij, S. and Li, F.: On regular solutions of quantum Yang-Baxter equation and weak Hopf algebras. J. Kharkov National University, ser. Nuclei, Particles and Fields 521, 15–30 (2001)
6. Duplij, S. and Marcinek, W.: Higher regularity properties of mappings and morphisms. Preprint Univ. Wrocław, IFT UWr 931/00, math-ph/0005033, Wrocław, 2000
7. Duplij, S. and Marcinek, W.: On higher regularity and monoidal categories. Kharkov State University Journal (Vestnik KSU), ser. Nuclei, Particles and Fields 481, 27–30 (2000)
8. Duplij, S. and Marcinek, W.: Noninvertibility, semisupermanifolds and categories regularization. In: Noncommutative Structures in Mathematics and Physics (Duplij S. and Wess J., eds.). Dordrecht: Kluwer, 2001, pp. 125–140
9. Duplij, S.: On semi-supermanifolds. Pure Math. Appl. 9, 283–310 (1998)
10. Duplij, S.: Semisupermanifolds and semigroups. Kharkov: Krok, 2000
11. Goodearl, K.: Von Neumann Regular Rings. London: Pitman, 1979
12. Green, J.A., Nichols, W.D., and Taft, E.J.: Left Hopf algebras. J. Algebra 65, 399–411 (1980)
13. Hayashi, T.: An algebra related to the fusion rules of Wess-Zumino-Witten models. Lett. Math. Phys. 22, 291–296 (1991)
14. Howie, J.M.: Fundamentals of Semigroup Theory. Oxford: Clarendon Press, 1995
15. Hungerford, T.W.: Algebra. New York: Springer-Verlag, 1980
16. Kassel, C.: Quantum Groups. New York: Springer-Verlag, 1995
17. Lawson, M.V.: Inverse Semigroups: The Theory of Partial Symmetries. Singapore: World Sci., 1998
18. Li, F.: Weak Hopf algebras and some new solutions of Yang-Baxter equation. J. Algebra 208, 72–100 (1998)
19. Li, F.: Weak Hopf algebras and regular monoids. J. Math. Research and Exposition 19, 325–331 (1999)
20. Li, F.: Solutions of Yang-Baxter equation in endomorphism semigroup and quasi-(co)braided almost bialgebras. Comm. Algebra 28, 2253–2270 (2000)
21. Li, F.: Weaker structures of Hopf algebras and singular solutions of Yang–Baxter equation. In: Symposium on The Frontiers of Physics at Millennium. Singapore: World Scientific Publishing Co. Pte. Ltd, 2001
22. Lusztig, G.: On quantum groups. J. Algebra 131, 466–475 (1990)
23. Mack, G. and Schomerus, V.: Quasi Hopf quantum symmetry in quantum theory. Nucl. Phys. B370, 185–191 (1992)
24. Nichols, W.D. and Taft, E.J.: The Left Antipodes of a Left Hopf Algebra. Contemp. Math. 13. Providence: Amer. Math. Soc., 1982
25. Nikshych, D. and Vainerman, L.: Finite quantum groupoids and their applications. Preprint Univ. California, math.QA/0006057, Los Angeles, 2000
26. Nill, F.: Axioms for weak bialgebras. Preprint Inst. Theor. Phys. FU, math.QA/9805104, Berlin, 1998
27. Petrich, M.: Inverse Semigroups. New York: Wiley, 1984
28. Shnider, S. and Sternberg, S.: Quantum Groups. Boston: International Press, 1993
29. Sweedler, M.E.: Hopf Algebras. New York: Benjamin, 1969
30. Szlachányi, K.: Weak Hopf algebras. In: Operator algebras and quantum field theory (Rome, 1996). Cambridge, MA: Internat. Press, 1997, pp. 621–632
31. Yamanouchi, T.: Duality for generalized Kac algebras and a characterization of finite groupoid algebras. J. Algebra 163, 9–50 (1994)

Communicated by A. Connes
Commun. Math. Phys. 225, 219 – 221 (2002)
Communications in
Mathematical Physics
Erratum
Ground State Energy of the One-Component Charged Bose Gas Elliott H. Lieb1 , Jan Philip Solovej2 1 Department of Physics, Jadwin Hall, P. O. Box 708, Princeton University, Princeton, NJ 08544, USA.
E-mail:
[email protected] 2 Department of Mathematics, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen,
Denmark. E-mail:
[email protected] Received: 17 September 2001 / Accepted: 5 November 2001 Commun. Math. Phys. 217, 127–163 (2001)
© 2001 by the authors. This article may be reproduced in its entirety for non-commercial purposes.

The proof of Lemma B.1 of [1] contains an unjustified operator inequality. In the last estimate on p. 162 the Cauchy–Schwarz inequality was used incorrectly. The lemma is, however, still correct as stated. We shall show this below. The operator inequality to be proven is that
\[
(\nabla\theta)^2\frac{-\Delta_N}{-\Delta_N + s^{-2}} + \frac{-\Delta_N}{-\Delta_N + s^{-2}}(\nabla\theta)^2 \leq Ct^{-2}\frac{-\Delta_N}{-\Delta_N + s^{-2}} + Cs^2t^{-4}, \tag{1}
\]
where $-\Delta_N$ is the Neumann Laplacian of some bounded open set $O \subset \mathbb{R}^n$, $s > 0$, and $\theta \in C^\infty(O)$ is constant near the boundary of $O$ and satisfies the estimates $\|\partial^\alpha\theta\|_\infty \leq Ct^{-|\alpha|}$ for some $t > 0$ and all multi-indices $\alpha$ with $|\alpha| \leq 3$.

The proof of (1) is a little technical. For the application in the paper the following estimate, in which the Cauchy–Schwarz inequality has been used correctly, would have sufficed:
\begin{align*}
(\nabla\theta)^2\frac{-\Delta_N}{-\Delta_N + s^{-2}} + \frac{-\Delta_N}{-\Delta_N + s^{-2}}(\nabla\theta)^2
&\leq st(\nabla\theta)^4 + (st)^{-1}\left(\frac{-\Delta_N}{-\Delta_N + s^{-2}}\right)^2\\
&\leq Cst^{-3} + C(st)^{-1}\frac{-\Delta_N}{-\Delta_N + s^{-2}}.
\end{align*}
In order to prove (1) we shall use the two operator inequalities
\[
[-\Delta_N, f][f, -\Delta_N] \leq C\|\nabla f\|_\infty^2(-\Delta_N) + C\Big\|\sum_i\partial_i^2f\Big\|_\infty^2 \tag{2}
\]
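and
\begin{align*}
f(-\Delta_N)f &= -\sum_i\partial_if^2\partial_i + \sum_i[\partial_if, f\partial_i] \leq -C\sum_i\partial_if^2\partial_i + C\sum_i(\partial_if)^2\\
&\leq C\|f\|_\infty^2(-\Delta_N) + C\|\nabla f\|_\infty^2, \tag{3}
\end{align*}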
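where $f$ is a smooth function with compact support in $O$, which we identify as a multiplication operator. We begin by rewriting the left side of (1):
\begin{align*}
&(\nabla\theta)^2\frac{-\Delta_N}{-\Delta_N + s^{-2}} + \frac{-\Delta_N}{-\Delta_N + s^{-2}}(\nabla\theta)^2\\
&\quad= \int_0^\infty\left[(\nabla\theta)^2\frac{-\Delta_N}{(-\Delta_N + s^{-2} + u)^2} + \frac{-\Delta_N}{(-\Delta_N + s^{-2} + u)^2}(\nabla\theta)^2\right]du\\
&\quad= \int_0^\infty\left[\frac{1}{-\Delta_N + s^{-2} + u}(\nabla\theta)^2\frac{-\Delta_N}{-\Delta_N + s^{-2} + u} + \frac{-\Delta_N}{-\Delta_N + s^{-2} + u}(\nabla\theta)^2\frac{1}{-\Delta_N + s^{-2} + u}\right]du\\
&\qquad+ \int_0^\infty\left[\frac{1}{-\Delta_N + s^{-2} + u}\left[-\Delta_N, (\nabla\theta)^2\right]\frac{-\Delta_N}{(-\Delta_N + s^{-2} + u)^2}\right.\\
&\qquad\qquad\quad\left. + \frac{-\Delta_N}{(-\Delta_N + s^{-2} + u)^2}\left[(\nabla\theta)^2, -\Delta_N\right]\frac{1}{-\Delta_N + s^{-2} + u}\right]du. \tag{4}
\end{align*}
The first integral we estimate using a Cauchy–Schwarz inequality:
\begin{align*}
&\int_0^\infty\left[\frac{1}{-\Delta_N + s^{-2} + u}(\nabla\theta)^2\frac{-\Delta_N}{-\Delta_N + s^{-2} + u} + \frac{-\Delta_N}{-\Delta_N + s^{-2} + u}(\nabla\theta)^2\frac{1}{-\Delta_N + s^{-2} + u}\right]du\\
&\quad\leq \int_0^\infty\left[t^2\frac{1}{-\Delta_N + s^{-2} + u}(\nabla\theta)^2(-\Delta_N)(\nabla\theta)^2\frac{1}{-\Delta_N + s^{-2} + u} + t^{-2}\frac{-\Delta_N}{(-\Delta_N + s^{-2} + u)^2}\right]du\\
&\quad\leq Ct^{-2}\frac{-\Delta_N}{-\Delta_N + s^{-2}} + Cs^2t^{-4},
\end{align*}
where in the last estimate we have used (3) with $f = (\nabla\theta)^2$.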
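The Cauchy–Schwarz step used here is the operator inequality $X^*Y + Y^*X \leq \epsilon X^*X + \epsilon^{-1}Y^*Y$, which holds because the difference equals $(\sqrt{\epsilon}X - Y/\sqrt{\epsilon})^*(\sqrt{\epsilon}X - Y/\sqrt{\epsilon}) \geq 0$. One possible reading, stated here as an assumption, is that it is applied with $X = (-\Delta_N)^{1/2}(\nabla\theta)^2(-\Delta_N + s^{-2} + u)^{-1}$, $Y = (-\Delta_N)^{1/2}(-\Delta_N + s^{-2} + u)^{-1}$ and $\epsilon = t^2$. The sketch below checks the inequality numerically for random real matrices by evaluating the quadratic form on random vectors; all names in it are illustrative.

```python
import random


def matvec(m, v):
    """Apply the matrix m (list of rows) to the vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def gap(X, Y, eps, v):
    """Quadratic form <v, (eps X^T X + Y^T Y / eps - X^T Y - Y^T X) v>."""
    xv, yv = matvec(X, v), matvec(Y, v)
    return eps * dot(xv, xv) + dot(yv, yv) / eps - 2.0 * dot(xv, yv)


rng = random.Random(0)  # fixed seed so that the run is reproducible
n = 4
X = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
Y = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]

min_gap = min(
    gap(X, Y, eps, [rng.uniform(-1, 1) for _ in range(n)])
    for eps in (0.1, 1.0, 10.0)
    for _ in range(200)
)
print("smallest value of the quadratic form:", min_gap)
```

The smallest sampled value should be non-negative up to rounding, for every choice of the weight `eps`; the freedom in choosing that weight is what produces the different powers of $t$ in the two terms of the estimate.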
The final step in the proof of (1) is to estimate the last integral in (4) using a Cauchy–Schwarz inequality, this time together with (2) with $f = (\nabla\theta)^2$:
\begin{align*}
&\int_0^\infty\left[\frac{1}{-\Delta_N + s^{-2} + u}\left[-\Delta_N, (\nabla\theta)^2\right]\frac{-\Delta_N}{(-\Delta_N + s^{-2} + u)^2} + \frac{-\Delta_N}{(-\Delta_N + s^{-2} + u)^2}\left[(\nabla\theta)^2, -\Delta_N\right]\frac{1}{-\Delta_N + s^{-2} + u}\right]du\\
&\quad\leq t^4\int_0^\infty\frac{1}{-\Delta_N + s^{-2} + u}\left[-\Delta_N, (\nabla\theta)^2\right]\left[(\nabla\theta)^2, -\Delta_N\right]\frac{1}{-\Delta_N + s^{-2} + u}\,du + t^{-4}\int_0^\infty\frac{(-\Delta_N)^2}{(-\Delta_N + s^{-2} + u)^4}\,du\\
&\quad\leq t^4\int_0^\infty\frac{1}{-\Delta_N + s^{-2} + u}\left(Ct^{-6}(-\Delta_N) + Ct^{-8}\right)\frac{1}{-\Delta_N + s^{-2} + u}\,du + s^2t^{-4}\\
&\quad\leq Ct^{-2}\frac{-\Delta_N}{-\Delta_N + s^{-2}} + Cs^2t^{-4}.
\end{align*}
This proves (1).

Another correction concerns the assumption made in Corollary 6.5 and on p. 152 that $\rho^{1/4}R$ is large. It is not necessary to assume this. Indeed, the integrand of the $I$ integral in Lemma 6.11 is monotone increasing in $g$ and, therefore, $g$ may be replaced by $4\pi|k|^{-2}$, and the resulting integral is finite. At the top of p. 154 we make a choice of $R$ that violates the assumption that $\rho^{1/4}R$ is large but, as we have just seen, this assumption is not needed.

References
1. Lieb, E.H. and Solovej, J.P.: Ground State Energy of the One-Component Charged Bose Gas. Commun. Math. Phys. 217, 127–163 (2001)

Communicated by M. Aizenman
Commun. Math. Phys. 225, 223 – 274 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
On the Point-Particle (Newtonian) Limit of the Non-Linear Hartree Equation Jürg Fröhlich1 , Tai-Peng Tsai2 , Horng-Tzer Yau2,∗ 1 Theoretical Physics, ETH-Hönggerberg, 8093 Zürich, Switzerland. E-mail:
[email protected] 2 Courant Institute, New York University, New York, NY 10012, USA.
E-mail:
[email protected];
[email protected] Received: 30 June 2000 / Accepted: 25 June 2001
Abstract: We consider the nonlinear Hartree equation describing the dynamics of weakly interacting non-relativistic Bosons. We show that a nonlinear Møller wave operator describing the scattering of a soliton and a wave can be defined. We also consider the dynamics of a soliton in a slowly varying background potential $W(\varepsilon x)$. We prove that the soliton decomposes into a soliton plus a scattering wave (radiation) up to times of order $\varepsilon^{-1}$. To leading order, the center of the soliton follows the trajectory of a classical particle in the potential $W(\varepsilon x)$.

1. Introduction and Summary of Main Results

The problem of identifying classical regimes of quantum mechanics is a long-standing problem of quantum theory. For simple systems it was first studied by Schrödinger in 1926; see [1]. In this paper, we explore a classical regime for a class of systems of identical, non-relativistic bosons, e.g., bosonic atoms such as $^7$Li, with very weak two-body interactions described by a potential $-\kappa\Phi$ of van der Waals or Newtonian type satisfying certain regularity properties described below. These bosons move under the influence of an external potential $\lambda V$, where $V$ is a smooth, positive function on physical space $\mathbb{R}^3$ and $\lambda \geq 0$. The potential $\lambda V$ describes e.g. a trap confining the bosons. Let $\kappa$ denote the strength of the two-body interaction between two bosons as compared to their average kinetic energy (e.g. in the sense that $\kappa\Phi$ is small as compared to the kinetic energy operator of two bosons, in the sense of Kato and Rellich [2]). We are interested in understanding the dynamics of a “condensate” of $N = O(\kappa^{-1})$ bosons in the “mean-field regime”, where $\kappa$ is very small. By a “condensate” we mean a state of the system with the property that all except for $o(N)$ bosons are in the same one-particle state described by a wave function $\psi(x)$, $x \in \mathbb{R}^3$. $N$-particle states of this kind are also called coherent states.

∗ Work partially supported by National Science Foundation Grant no. DMS-9703752 and DMS-0072098
J. Fröhlich, T.-P. Tsai, H.-T. Yau
Let $\psi_0 = \psi_0(x)$, $x \in \mathbb{R}^3$, denote the initial one-particle wave function of a coherent state of the system at time $t = 0$. In the mean-field limit,
$$\kappa \to 0, \quad N \to \infty, \quad \text{with } \kappa \cdot N =: \nu = \text{const.}, \tag{1.1}$$
the quantum-mechanical time evolution of a condensate of bosons has the property that it maps the initial coherent state with a one-particle wave function ψ0 to a coherent state at a later time t with a one-particle wave function ψt . As proven by K. Hepp [3] (see also [4] for some refinements and extensions), the one-particle wave function ψt of the condensate turns out to be a solution of the (non-linear) Hartree equation, Eq. (1.2) below. If the two-body interactions are dominantly attractive, as for 7 Li atoms, and, given κ, the number of bosons is large enough (i.e., N > Ncrit. (κ), or ν > νcrit. ), the system has bound states. In other words, the bosons may condense into a tightly bound, spatially sharply localized cluster. In the mean-field regime, such bound states appear to be (weakly) well approximated by coherent states with a one-particle wave function corresponding to a non-trivial local minimum of the Hartree energy functional. Turning on a very slowly varying external potential, λV (x) := W (εx),
(1.2)
where $W$ is a smooth, positive function, and $\varepsilon$ is much smaller than the diameter of a bound state of $N$ bosons when $\lambda = 0$, one expects that the position, $r(t) \in \mathbb{R}^3$, of the center of mass of that bound state closely follows a solution of Newton's equations of motion,
$$\dot r(t) = v(t), \qquad \dot v(t) = -\varepsilon(\nabla W)(\varepsilon r(t)), \tag{1.3}$$
for times $t$ with $|t| < O(\varepsilon^{-1})$. It is in this precise sense that the quantum system of bosons described above approaches a classical regime in the mean-field limit. For attractive two-body interactions, the Hartree equation describing the dynamics of a condensate (coherent state) in the mean-field limit has a self-focussing non-linearity. As a consequence, it has non-trivial "solitary wave solutions" looking like approximate δ-functions, for $\nu$ sufficiently large. These solitary wave solutions are precisely the one-particle wave functions of coherent bound states in the mean-field limit.

Our main objective in this paper is to study slow motion of solitons of the Hartree equation. We propose to show that, under the influence of a slowly varying external potential $W(\varepsilon x)$, the center of mass position, $r(t)$, of a solitary-wave solution of the self-focussing Hartree equation remains close to a solution of Newton's equations of motion stated above, for all times $t$ with $|t| < O(\varepsilon^{-1})$. (We do, however, not prove rigorous results on the precise way in which a system of identical bosons approaches its mean-field limit; but see [3–5].) Our main results on the self-focussing Hartree equations have been announced in [6], where the reader can find additional background material and motivation coming from physics.

In order to be able to describe our main results concisely, we introduce some notation and recall some known results on the Hartree equation. Let $H^1(\mathbb{R}^n)$ denote the Sobolev space,
$$H^1(\mathbb{R}^n) = \bigl\{\psi(x),\ x \in \mathbb{R}^n \bigm| \|\nabla\psi\|_2 + \|\psi\|_2 < \infty\bigr\}, \tag{1.4}$$
Point-Particle (Newtonian) Limit of Non-Linear Hartree Equation
where $\psi$ denotes a measurable complex function on $\mathbb{R}^n$, $\nabla\psi$ denotes its gradient, and $\|(\cdot)\|_2$ denotes the $L^2$-norm. We study properties of solutions of the Hartree equation
$$i\partial_t\psi_t = -\tfrac12\Delta\psi_t + \lambda V\psi_t - \nu\bigl(\Phi * |\psi_t|^2\bigr)\psi_t. \tag{1.5}$$
In Eq. (1.5), $\psi_t(x) = \psi(x,t)$, $x \in \mathbb{R}^n$, $t \in \mathbb{R}$, is a time-dependent, complex-valued scalar function on physical space $\mathbb{R}^n$ belonging to the Sobolev space $H^1(\mathbb{R}^n)$, for each time $t$; $\Delta$ denotes the scalar Laplacian, $\lambda V(x)$, $\lambda \in \mathbb{R}$, is an external potential, with $V$ a smooth, bounded, positive function on $\mathbb{R}^n$, and $-\Phi(x)$ is a radially symmetric two-body potential, with $\Phi \in L^p(\mathbb{R}^n, d^nx) + L^\infty(\mathbb{R}^n)$, $p \ge \frac n2$; furthermore $*$ denotes convolution.

We shall use the following standard notation: For an arbitrary measurable function $\psi$ on $\mathbb{R}^n$,
$$\int\psi := \int_{\mathbb{R}^n}\psi(x)\,d^nx, \tag{1.6}$$
$$\|\psi\|_p := \Bigl(\int|\psi|^p\Bigr)^{1/p} \tag{1.7}$$
is the norm on the space $L^p = L^p(\mathbb{R}^n, d^nx)$, $1 \le p < \infty$,
$$\|\psi\|_{H^1} := \|\nabla\psi\|_2 + \|\psi\|_2 \tag{1.8}$$
is the norm on $H^1 = H^1(\mathbb{R}^n)$, and
$$(\psi * \chi)(x) := \int_{\mathbb{R}^n}\psi(x-y)\chi(y)\,d^ny \tag{1.9}$$
denotes the convolution of $\psi$ with another such function $\chi$.

There are two important functionals on Sobolev space $H^1$ which are conserved under the flow $\psi := \psi_0 \mapsto \psi_t$, $\psi \in H^1$, determined by the Hartree equation (1.5). The first one is the squared $L^2$-norm of $\psi$,
$$\mathcal{N}(\bar\psi,\psi) := \int|\psi|^2 = \|\psi\|_2^2, \tag{1.10}$$
and the second one is the Hamilton (or energy) functional
$$\mathcal{H}(\bar\psi,\psi) := \frac14\int|\nabla\psi|^2 + \frac\lambda2\int V|\psi|^2 - \frac14\int\bigl(\Phi * |\psi|^2\bigr)|\psi|^2. \tag{1.11}$$
We note that if $\Phi$ is a non-negative function belonging to $L^p + L^\infty$, $p \ge \frac n2$, then, for an arbitrary $\delta > 0$, there exists a finite constant $C(\delta)$ such that
$$0 \le \int\bigl(\Phi * |\psi|^2\bigr)|\psi|^2 \le \delta\,\mathcal{N}(\bar\psi,\psi)\,\|\nabla\psi\|_2^2 + C(\delta)\,\mathcal{N}(\bar\psi,\psi)^2, \tag{1.12}$$
see e.g. [7]. Thus, for an arbitrary, but fixed value of $\mathcal{N}(\bar\psi,\psi)$, and for arbitrary $\lambda$, $|\lambda| < \infty$, the Hamilton functional $\mathcal{H}(\bar\psi,\psi)$ is bounded from below.

Under the assumptions that $\lambda V(x)$ has a minimum at $x = x_*$, $|x_*| < \infty$, that $\Phi(x) \ge 0$ and that the value, $N$, of the functional $\mathcal{N}(\bar\psi,\psi)$ is large enough, one can show (see Sect. 3) that the Hamilton functional $\mathcal{H}(\bar\psi,\psi)$ restricted to the sphere
$$S_N := \bigl\{\psi \bigm| \psi \in H^1,\ \mathcal{N}(\bar\psi,\psi) = N\bigr\} \tag{1.13}$$
in Sobolev space reaches its minimum on a positive function $Q_N \in S_N$ concentrated near $x_*$ and decaying exponentially fast in $|x|$, as $|x| \to \infty$. This result still holds when $\lambda = 0$ (i.e., for a vanishing external potential); but if $Q_N$ is a minimizer of $\mathcal{H}|_{S_N}$ then so is $Q_{N,a}$, where $Q_{N,a}(x) := Q_N(x-a)$, for arbitrary $a \in \mathbb{R}^n$. This is a consequence of the translation invariance of $\mathcal{H}$, for $\lambda = 0$. A minimizer, $Q_N$, of $\mathcal{H}|_{S_N}$ is a solution of the non-linear eigenvalue equation
$$-\tfrac12\Delta Q + \lambda VQ - \bigl(\Phi * Q^2\bigr)Q = EQ, \tag{1.14}$$
for some real number $E$, with
$$\mathcal{N}(\bar Q, Q) = N.$$
Then $\psi(x,t) = Q_N(x)e^{-iEt}$ is a stationary solution of the Hartree equation (1.5). Multiplying Eq. (1.14) by $Q := Q_N$ and integrating, we find that
$$E = \frac{1}{2N}\int(\nabla Q_N)^2 + \frac{\lambda}{N}\int VQ_N^2 - \frac1N\int\bigl(\Phi * Q_N^2\bigr)Q_N^2. \tag{1.15}$$
One should notice that $E_N$ is not the value of the energy functional $\mathcal{H}(\bar\psi,\psi)|_{S_N}$ evaluated on the minimizer $\psi = Q_N$, because one is minimizing $\mathcal{H}(\bar\psi,\psi)$ in the presence of a constraint, namely $\mathcal{N}(\bar\psi,\psi) = N$.

Let $Q_N^{(0)}$ be a minimizer of the Hamilton functional $\mathcal{H}(\bar\psi,\psi)|_{S_N}$, with $\lambda = 0$, centered at $x = 0$; ($Q_N^{(0)}$ is known to exist and to be non-trivial, for $N$ large enough). We set $\lambda = 1$ and choose
$$V(x) \equiv V^{(\varepsilon)}(x) := W(\varepsilon x), \tag{1.16}$$
where $W$ is a fixed, smooth, bounded, positive function on $\mathbb{R}^n$, and $\varepsilon > 0$ is a parameter. Our main concern, in this paper, is to construct local (in time $t$) solutions of the Hartree equation (1.5), with $\lambda = 1$ and $V = V^{(\varepsilon)}$ as in (1.16), of the form
$$\psi(x,t) = \bigl[Q_N^{(0)}(x - r(t)) + h_\varepsilon(x - r(t), t)\bigr]e^{i\theta(x,t)}, \tag{1.17}$$
where $h_\varepsilon$ is a small, dispersive correction to the solitary wave described by $Q_N^{(0)}(x - r(t))e^{i\theta(x,t)}$, with
$$\|h_\varepsilon(\cdot,t)\|_{H^1} \le O(\varepsilon^{3/2}), \tag{1.18}$$
$\theta(x,t)$ is a time-dependent phase,
$$\theta(x,t) = v(t)\cdot x - Et + \vartheta_0(t), \tag{1.19}$$
where $v(t) = \frac{dr(t)}{dt}$ is the velocity of the solitary wave, and $\vartheta_0(t)$ is independent of $x$, for all times $t$ with $|t| \le O(\varepsilon^{-1})$, and provided the soliton trajectory $(r(t), v(t))$ solves appropriate equations of motion. It will be shown that $(r(t), v(t))$ must solve the Newtonian equations of motion
$$\dot r(t) = v(t), \qquad \dot v(t) = -\varepsilon(\nabla W)(\varepsilon r(t)) + a(t), \tag{1.20}$$
where $a(t)$ is a "friction force", with
$$|a(t)| \le O(\varepsilon^2), \tag{1.21}$$
for $|t| \le O(\varepsilon^{-1})$. The friction force $a(t)$ will be determined more precisely in Sect. 3.

Neglecting the friction force $a(t)$, Eqs. (1.20) are Newton's equations of motion for a point particle of mass $N$ moving in an external acceleration field of strength $\varepsilon$ with potential $V^{(\varepsilon)}$. Thus, for the velocity $v(t)$ of this particle to deviate substantially from the initial condition $v(0) = v_0$, the time $t$ must be $O(\varepsilon^{-1})$. For times $t$ with $|t| \le O(\varepsilon^{-1})$, the friction force $a(t)$ has a negligibly small effect, for small $\varepsilon$.

A solution of the Hartree equation (1.5) of the form (1.17), with properties (1.18) through (1.21), for times $t$ with $|t| \le O(\varepsilon^{-1})$, describes the motion of an extended particle in a shallow potential well $V^{(\varepsilon)}$ interacting weakly with a dispersive medium of infinitely many degrees of freedom with which it can exchange mass and energy. The point-particle limit in which Newton's laws of motion become exact is the limit $\varepsilon \to 0$. For $\varepsilon > 0$, the interactions between the extended particle and the dispersive medium can lead to phenomena such as mass accretion, loss of mass and energy from the particle into dispersive waves, and friction, for times $t$ large on a scale of $\varepsilon^{-1}$.

The intuitive picture is one of a bound cluster of "dust" describing an extended particle, which exhibits Newtonian motion with friction. The friction is caused by the loss of some "dust" originally bound to the particle. This loss of "dust" is only observed when the motion of the particle is not inertial (i.e., accelerated or decelerated) and is described by dispersive waves satisfying a wave equation which is essentially the linearization of Eq. (1.5) around a solitary wave described by $Q^{(0)}_{N(t)}(x - r(t))e^{i\theta(x,t)}$.
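As a rough numerical illustration of the time scale in Eqs. (1.20), the following sketch integrates the frictionless Newtonian equations (that is, with $a(t)$ set to zero) in one space dimension. The potential `W`, its derivative `dW`, the function names and the velocity-Verlet scheme are all assumptions made for this illustration; none of them comes from the paper. Since $|\dot v| \le \varepsilon \max|W'|$, the velocity can change by at most $T \max|W'|$ over a time $T/\varepsilon$, and only by $O(\varepsilon)$ over a time of order one.

```python
import math


def W(y):
    # Hypothetical smooth, bounded, positive potential (not from the paper).
    return 1.0 + 0.5 * math.cos(y)


def dW(y):
    # Derivative of the hypothetical potential; |dW| <= 0.5 everywhere.
    return -0.5 * math.sin(y)


def newton_trajectory(eps, r0, v0, t_end, dt=1e-2):
    """Velocity-Verlet integration of r' = v, v' = -eps * W'(eps * r)."""
    r, v = r0, v0
    steps = int(round(t_end / dt))
    acc = -eps * dW(eps * r)
    for _ in range(steps):
        v_half = v + 0.5 * dt * acc      # half kick
        r = r + dt * v_half              # drift
        acc = -eps * dW(eps * r)         # force at the new position
        v = v_half + 0.5 * dt * acc      # second half kick
    return r, v


def energy(eps, r, v):
    """Conserved energy per unit mass: v^2 / 2 + W(eps * r)."""
    return 0.5 * v * v + W(eps * r)


if __name__ == "__main__":
    eps, r0, v0 = 0.01, 50.0, 1.0
    for t_end in (1.0, 1.0 / eps):
        r, v = newton_trajectory(eps, r0, v0, t_end)
        print(f"t = {t_end:7.1f}  |v - v0| = {abs(v - v0):.4f}")
```

The design choice of a symplectic (velocity-Verlet) step is only meant to keep the energy drift small over the long time window $1/\varepsilon$; any reasonable ODE integrator would serve for this check.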
For very large times, the trajectory of the extended particle is expected either to approach an inertial motion diverging to spatial infinity (if W (x) → const, as |x| → ∞ and if the initial mass and velocity of the particle were large enough), or to approach a local minimum of W where the particle will come to rest. This dissipative behavior of the particle motion is an example of the general phenomenon of “dissipation through radiation”. Some simple results on the large-time asymptotics of solutions of the Hartree equation (1.5) (existence of wave operators) are proven in Sect. 4. But it is fair to say that we do not yet have a good mathematical understanding of large-time behavior of solutions of Eq. (1.5). For some earlier results on scattering for the Hartree and nonlinear Schrödinger equation, see, e.g., [7,8] and references given there. Our analysis of solutions of the Hartree equation (1.5) of the form described in (1.17), with properties (1.18) through (1.21), is based on a key assumption, which is, implicitly,
an assumption on the two-body potential $-\Phi$ that will not be made explicit in this paper: Let $(f,g) := \int\bar f g$ denote the usual scalar product on $L^2$, and let $\mathcal{H}''$ denote the Hessian of the Hamilton functional $\mathcal{H}(\bar\psi,\psi)$, with $\lambda = 0$, at $\psi = Q_N^{(0)}$. Furthermore, let $\mathcal{H}''_{\mathrm{real}}$ denote the restriction of $\mathcal{H}''$ to real-valued functions, and extend it to a complex-linear operator. It will be shown in Sect. 3 that $\mathcal{H}''_{\mathrm{real}}$ is given by an unbounded, selfadjoint operator on $L^2$ defining a quadratic form on $H^1$ which is bounded from below. It is not hard to see that
$$\bigl(Q, (\mathcal{H}''_{\mathrm{real}} - E)Q\bigr) = \varepsilon_0\,(Q,Q) < 0, \quad \text{for } Q := Q_N^{(0)}, \tag{1.22}$$
where $E = E_N$ and
$$\varepsilon_0 = -\frac2N\int\bigl(\Phi * Q^2\bigr)Q^2. \tag{1.23}$$
Actually, $\mathcal{H}''_{\mathrm{real}} - E$ has only one negative eigenvalue. Since $\mathcal{H}$ is translation-invariant, it follows that $\nabla Q := \{\partial_1Q, \ldots, \partial_nQ\}$, $\partial_j := \frac{\partial}{\partial x_j}$, $j = 1, \ldots, n$, are $n$ non-vanishing, linearly independent zero-modes for $\mathcal{H}''_{\mathrm{real}} - E$ orthogonal to $Q$, i.e.,
$$(\mathcal{H}''_{\mathrm{real}} - E)\,\partial_jQ = 0, \quad \text{and} \quad (\partial_jQ, Q) = 0, \tag{1.24}$$
for all $j = 1, \ldots, n$. Thus 0 is an at least $n$-fold degenerate eigenvalue of $\mathcal{H}''_{\mathrm{real}} - E$. Since $Q$ is a minimizer of $\mathcal{H}(\bar\psi,\psi)|_{S_N}$, there is no spectrum of $\mathcal{H}''_{\mathrm{real}} - E$ in the interval $(\varepsilon_0, 0)$. Furthermore, it is easy to see that the spectrum of $\mathcal{H}''_{\mathrm{real}} - E$ in the interval $[0, -E)$, where
$$E = \frac1N\Bigl[\frac12\int(\nabla Q)^2 - \int\bigl(\Phi * Q^2\bigr)Q^2\Bigr] < 0, \tag{1.25}$$
is pure-point, while, on the half-line $[-E, \infty)$, it is continuous. Thus, there is a gap, $\varepsilon_2 > 0$, between 0 and the rest of the spectrum of $\mathcal{H}''_{\mathrm{real}} - E$ in $[0, \infty)$; see Sect. 3 for details.

Our key assumption is that the multiplicity of the eigenvalue 0 of $\mathcal{H}''_{\mathrm{real}} - E$ is precisely equal to $n$. This implies that
$$\bigl(h, (\mathcal{H}''_{\mathrm{real}} - E)h\bigr) \ge \varepsilon_2\,(h,h), \quad \varepsilon_2 > 0, \tag{1.26}$$
for all functions $h \in H^1$ with $h \perp \{Q, \nabla Q\}$ in the $L^2$-scalar product $(\cdot,\cdot)$.

We are now prepared to summarize the contents of this paper and to state our main results in the form of theorems. In Sect. 2, we recall the Hamiltonian nature of the Hartree equation (1.5) on the phase space $H^1$. We exhibit continuous symmetries of the Hamilton functional that give rise to Eq. (1.5) and derive the corresponding conservation laws. We show that the Hartree equation can also be viewed as the Euler–Lagrange equation derived from an action functional. The Lagrangian formulation of the non-linear Hartree equation is useful to study the formal point-particle limit (the $\varepsilon \to 0$ limit in (1.16) through (1.21)). This limit is discussed, in general terms but without mathematical proofs, in Sect. 2, using ideas
similar to those in [9] in an analysis of vortex motion in the Ginzburg–Landau equation, which is based on an effective-action formalism. We also discuss some expected features of the non-linear Hartree dynamics in the large-time limit. Our first main result is proven in Sect. 3.

Theorem 1.1. Suppose that assumption (1.26) holds for all minimizers $Q = Q_N^{(0)}$, with $N$ in an open neighborhood of some $N_0 > 0$. We also assume that $\Phi(x)$ is radial,
$$\|\Phi\|_{W^{2,1}(\mathbb{R}^3)\cap W^{2,\infty}(\mathbb{R}^3)} \le C \tag{1.27}$$
for some constant $C$. Then there is a positive constant $C_0$ such that, for an arbitrary $T < \infty$, there is an $\varepsilon_0 > 0$ with the property that, for any $0 < \varepsilon \le \varepsilon_0$ and any initial condition of the form
$$\psi(x,0) = \psi_0(x) = \bigl[Q(x - r_0) + h_{\varepsilon,0}(x)\bigr]e^{iv_0\cdot x}, \tag{1.28}$$
with $Q = Q^{(0)}_{N_0}$ and $\|h_{\varepsilon,0}\|_{H^1} \le C_0\varepsilon^{3/2}$, the Hartree equation, Eq. (1.5), with $\lambda = 1$ and $V(x) = W(\varepsilon x)$ as in (1.16), has a solution of the form (1.17), for all times $t$ with $|t| < T\varepsilon^{-1}$, with the following properties:
1. The phase $\theta(x,t)$ is as in (1.19);
2. the trajectory $(r(t), v(t))$ of the extended-particle solution (1.17) is a solution of the equations of motion (1.20) with initial conditions $r(0) = r_0$, $v(0) = v_0$, for a friction force $a(t)$ bounded by $|a(t)| \le C_1\varepsilon^2$;
3. the dispersive correction $h_\varepsilon$ satisfies
$$\|h_\varepsilon(\cdot,t)\|_{H^1} \le C_2\varepsilon^{3/2},$$
for some finite constants $C_1$, $C_2$ depending on $T$.

This result makes the point-particle limit ($\varepsilon \to 0$) of the Hartree equation (1.5) precise for initial conditions describing a single extended particle (solitary wave) moving in a shallow potential well, $W(\varepsilon x)$, and perturbed by a small amount of radiation (described by $h_\varepsilon$). It is a special case of the more general situation considered in Sect. 3. A more detailed discussion and the proof of Theorem 1.1 form the contents of Sect. 3.

The results just described raise the issue of asymptotic properties of the dynamics determined by the Hartree equation, as time $t$ tends to $\pm\infty$. In Sect. 4, we establish a result on the scattering of small-amplitude waves off a single solitary wave. For simplicity, we suppose that physical space is three-dimensional, $n = 3$ (but our methods can be applied whenever $n \ge 3$), we set $\lambda = 0$, and we choose $\Phi$ to be a non-negative, bounded function of rapid decrease, as $|x| \to \infty$. We consider an "asymptotic profile" described by
$$\psi_{as}(x,t) = Q(x - r_0 - v_0t)\,e^{i\left[x\cdot v_0 - \left(\frac12v_0^2 + E\right)t\right]} + h_{as}(x,t), \tag{1.29}$$
where $h_{as}$ is a solution of the free-particle Schrödinger equation
$$i\partial_th_{as}(x,t) = -\tfrac12\Delta h_{as}(x,t), \tag{1.30}$$
with initial condition $h_{as}(x,0) =: h_{as,0}(x)$ belonging to and being sufficiently small in the space $H^4(\mathbb{R}^3) \cap W^{3,1}\bigl(\mathbb{R}^3, (1+|x|^2)\,d^3x\bigr)$ and such that the Fourier transform, $\hat h_{as,0}(k)$, vanishes at $k = v_0$. In (1.29), $Q = Q^{(0)}_{N_0} \in S_{N_0}$ is a solution of Eq. (1.14), with $\lambda = 0$, and it is assumed that inequality (1.26) is satisfied for $Q = Q_N^{(0)} \in S_N$, for all $N$ in a small neighborhood of $N_0 > 0$.

Theorem 1.2. For an asymptotic profile $\psi_{as}(x,t)$ as described in (1.29), (1.30), and under the hypotheses stated above, there are solutions, $\psi_\pm(x,t)$, of the Hartree equation (1.5) (for $\lambda = 0$) such that
$$\psi_\pm(x,t) \longrightarrow \psi_{as}(x,t), \quad \text{as } t \to \pm\infty, \tag{1.31}$$
in $H^2(\mathbb{R}^3)$. Their difference is of order $O(t^{-1})$. Thus the non-linear Møller wave maps $\Omega_\pm : \psi_{as} \longmapsto \psi_\pm$ exist as symplectic maps on asymptotic profiles of the form (1.29), (1.30).

We emphasize that the effect of the scattering wave on the location and the phase of the soliton has to be tracked precisely for all time. The stability of the soliton is quite simple and can be obtained purely from energy considerations. A review can be found in Sect. 3 (see also Weinstein [14]). Therefore, the key points of Theorem 1.2 are its two precise assertions:
1. The location of the soliton is almost "linear."
2. The scattering wave behaves like an ordinary dispersive wave (described by $h_{as}(x,t)$), plus a small correction.
The condition on the Fourier transform of $h_{as,0}$ is a technical one and we expect to remove it later on. Our result constitutes the first step toward scattering theory. The proof of Theorem 1.2 is the contents of the final section, Sect. 4, of this paper.
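The form of the soliton part of the asymptotic profile (1.29) can be checked by a short computation. The following is a sketch of the standard Galilean-boost argument, assuming $\lambda = 0$ and $\nu = 1$ (the normalization used in (1.14)); the shorthand $y$ and $\varphi$ is introduced here only for this check and is not notation from the paper.

```latex
% Shorthand (assumed for this check only):
%   y = x - r_0 - v_0 t,
%   \varphi(x,t) = x\cdot v_0 - (\tfrac12 v_0^2 + E)\,t,
%   \psi(x,t) = Q(y)\,e^{i\varphi(x,t)},
% with Q and its derivatives evaluated at y.
\begin{align*}
i\partial_t\psi
  &= \bigl[-i\,v_0\cdot\nabla Q + (\tfrac12 v_0^2 + E)\,Q\bigr]e^{i\varphi},\\
-\tfrac12\Delta\psi
  &= \bigl[-\tfrac12\Delta Q - i\,v_0\cdot\nabla Q + \tfrac12 v_0^2\,Q\bigr]e^{i\varphi},\\
i\partial_t\psi + \tfrac12\Delta\psi
  &= \bigl[EQ + \tfrac12\Delta Q\bigr]e^{i\varphi}
   = -\bigl(\Phi * Q^2\bigr)Q\,e^{i\varphi}
   = -\bigl(\Phi * |\psi|^2\bigr)\psi .
\end{align*}
```

The second equality in the last line is Eq. (1.14) with $\lambda = 0$, and the third uses $|\psi|^2 = Q(y)^2$ together with the fact that convolution commutes with translations.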
2. The Hartree Equation as a Hamiltonian System with Infinitely Many Degrees of Freedom, and Its Point-Particle Limit

In the introduction, we have described results indicating how the Hartree equation (1.5) captures the dynamics of a system of very many non-relativistic bosons with very weak two-body interactions in a condensate state. This regime has been called the "mean-field limit". Actually, the mean-field limit is equivalent, mathematically, to the classical limit in which the value of Planck's constant, $\hbar$, is sent to 0. We are accustomed to expect (actually in general erroneously) that the unitary dynamics of a quantum-mechanical system reduces to the Hamiltonian dynamics of a corresponding classical system, in the classical limit. In the examples studied in this paper, this expectation is justified.
2.1. The Hamiltonian nature of the Hartree equation. The phase space, 0, for the Hartree equation (1.5) is the Sobolev (energy) space H 1 (Rn ) defined in (1.4). We use ¯ ψ(x) and its complex conjugate ψ(x), x ∈ Rn , as complex coordinates for 0. The i ¯ It leads to the following Poisson symplectic 2-form on 0 is given by 2 dψ ∧ d ψ. brackets: ¯ ¯ {ψ(x), ψ(y)} = ψ(x), ψ(y) = 0, ¯ ψ(x), ψ(y) = 2iδ(x − y) .
(2.1) (2.2)
¯ ψ), leading to the Hartree equation (1.5) is given by The Hamilton functional, H(ψ, 1 ¯ ψ) = H(ψ, (2.3) |∇ψ|2 + 2λV |ψ|2 − ∗ |ψ|2 |ψ|2 . 4 For ∈ Lp + L∞ , p ≥ n2 , H is well defined on 0 and bounded below on the spheres ¯ ψ =N N∗ . Since the proof of Lemma 2.1 is standard, it is omitted. The interesting analytical issues arise in the problems described in Remarks (iii) and (iv). They deserve further study.
2.3. A heuristic discussion of the point-particle limit of the Hartree equation. In this section we start from the results reviewed in the last section (see Lemma 2.1) to study the point-particle (Newtonian) limit of the Hartree equation. In this limit the Hartree equation reduces to the Newtonian mechanics of point-particles interacting through two-body potential forces. We use ideas closely related to those proposed in [9] in an analysis of vortex motion in the plane, as described by the Ginzburg–Landau equations.

Let $\lambda V$ and $\Phi$ be as in Eqs. (1.5), (2.6). We set $\lambda = 1$ and consider a family of external potentials of the form
$$V(x) \equiv V^{(\varepsilon)}(x) := W(\varepsilon x), \tag{2.29}$$
where $W$ is some smooth, positive function on $\mathbb{R}^n$, and $\varepsilon > 0$ is a parameter. Furthermore, the two-body potential, $-\Phi$, is chosen to be
$$\Phi(x) = \Phi_s(x) + \Phi_\ell(\varepsilon x), \tag{2.30}$$
where $\Phi_s(x)$ is a rotation-invariant, smooth function decaying rapidly in $\rho := |x|$, as $\rho \to \infty$, and with the properties that
$$\frac{d\Phi_s(\rho)}{d\rho} < 0, \quad \text{for } \rho > 0, \tag{2.31}$$
and that the key gap assumption (1.26) stated in Sect. 1 holds for $\Phi = \Phi_s$. The perturbing potential $\Phi_\ell$ is rotation-invariant and smooth and may be of long range, e.g.
$$\Phi_\ell(\rho) \sim \rho^{2-n}, \quad \text{as } \rho \to \infty, \tag{2.32}$$
for $n \ge 3$, which is the behavior of the Coulomb and of Newton's gravitational potential. For simplicity, we assume that $\bigl|\frac{d\Phi_\ell(\rho)}{d\rho}\bigr|$ is uniformly bounded in $\rho$.

We pick $k$ positive integers $N_1, \ldots, N_k$, with $N_j > N_*(\Phi_s)$, for all $j$. For $\lambda V = 0$ and $N > N_*(\Phi_s)$, we define
$$\delta_N := N^{-1}\int d^nx\,Q_N^{(0)}(x)^2\,x^2, \tag{2.33}$$
where $Q_N^{(0)}$ is a rotation-invariant minimizer of the functional $\mathcal{H}(\bar\psi,\psi)|_{S_N}$, as described in Lemma 2.1.

We consider an initial condition, $\psi_0(x)$, for the Hartree equation (2.6) describing a configuration of $k$ far-separated "solitons", $Q^{(0)}_{N_j}(x - r_j)$, $r_j \in \mathbb{R}^n$, $j = 1, \ldots, k$ (perturbed by a small-amplitude wave), with the following properties: Each soliton $Q^{(0)}_{N_j}(x)$ is a rotation-invariant solution of Eq. (3.22), with $\Phi = \Phi_s$ and $N = N_j$, minimizing $\mathcal{H}(\bar\psi,\psi)|_{S_N}$ (for $\lambda = 0$, $\Phi = \Phi_s$). Furthermore
$$\frac{\max_{j=1,\ldots,k}\delta_{N_j}}{\min_{1\le i<j\le k}|r_i - r_j|} \le \varepsilon, \tag{2.34}$$
where $\varepsilon$ is the parameter introduced in (2.29), (2.30).
Our goal is to construct a solution, $\psi_t$, of the Hartree equation (2.6) of the form
$$\psi_t(x) = \sum_{j=1}^kQ^{(0)}_{N_j(t)}\bigl(x - r_j(t)\bigr)e^{i\theta_j(x,t)} + h_\varepsilon(x,t), \tag{2.35}$$
where $r_j(0) = r_j$, as in (2.34), and $\dot r_j(0) = v_j \in \mathbb{R}^n$, $j = 1, \ldots, k$, with the following properties: There is a positive constant $T$ such that, for all times $t$ with $|t| < \frac T\varepsilon$,
(a) $\|h_\varepsilon(\cdot,t)\| \sim o(\varepsilon)$, for an appropriately chosen norm $\|(\cdot)\|$;
(b) $\theta_j(x,t) = \dot r_j(t)\cdot\bigl(x - r_j(t)\bigr) + \vartheta_j(t)$, where $\vartheta_j(t)$ is independent of $x$, and $\dot N_j(t) = o(\varepsilon)$;
(c) the trajectories $r_1(t), \ldots, r_k(t)$ and the phases $\vartheta_1(t), \ldots, \vartheta_k(t)$ will turn out to satisfy equations of motion which can be derived from the Hartree equation.

In this section we do not present a mathematical proof of the claim that solutions of the Hartree equation (2.6) of the form (2.35) with properties (a)–(c) exist (but see Sect. 3). We merely verify that a function $\psi_t(x)$ of the form (2.35) with properties (a)–(c) approaches a critical point of the action functional $S(\bar\psi,\psi)$ introduced in (2.14), as $\varepsilon \to 0$, provided the trajectories $r_j(t)$ satisfy certain Newtonian equations of motion and the phases $\vartheta_j(t)$ are suitably chosen ($j = 1, \ldots, k$). Since critical points of $S(\bar\psi,\psi)$ satisfy the Hartree equation (2.6), this makes it plausible that solutions of (2.6) of the form (2.35) with properties (a)–(c) exist. This claim is proven in Sect. 3 for $k = 1$.

Our heuristic analysis is based on the following simple facts:
(1) For $i \ne j$, $\int d^nx\,Q^{(0)}_{N_i}(x - r_i)\,Q^{(0)}_{N_j}(x - r_j) \to 0$, exponentially fast, as $|r_i - r_j| = O(\varepsilon^{-1}) \to \infty$. This follows from Lemma 2.1.
(2) $\bigl(Q^{(0)}_{N_i(t)}, h_\varepsilon(\cdot,t)\bigr) = o(\varepsilon)$, for $|t| \le \frac T\varepsilon$, as $\varepsilon \to 0$, for all $i = 1, \ldots, k$; see (2.35) and property (a).
(3) $\bigl(Q^{(0)}_{N_i}, \nabla Q^{(0)}_{N_i}\bigr) = 0$, for all $i$, by translation invariance (see Eq. (1.24)).
(4) For $y := x - r_i(t)$, $\int d^ny\,Q^{(0)}_{N_i(t)}(y)^2\,y = 0$, for all $i$, by rotation invariance.
(5) $\dot N_i(t) = 2\bigl(Q^{(0)}_{N_i(t)}, Q^{(0)}_{\dot N_i(t)}\bigr)$, for all $i$, because $N_i = \bigl(Q^{(0)}_{N_i}, Q^{(0)}_{N_i}\bigr)$.
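Using that
$$\frac{\partial}{\partial t}\Bigl[Q^{(0)}_{N_j(t)}\bigl(x - r_j(t)\bigr)e^{i\theta_j(x,t)}\Bigr] = \Bigl[Q^{(0)}_{\dot N_j(t)}\bigl(x - r_j(t)\bigr) - \dot r_j(t)\cdot\nabla Q^{(0)}_{N_j(t)}\bigl(x - r_j(t)\bigr) + i\dot\theta_j(x,t)\,Q^{(0)}_{N_j(t)}\bigl(x - r_j(t)\bigr)\Bigr]e^{i\theta_j(x,t)}, \tag{2.36}$$
with
$$\dot\theta_j(x,t) = \ddot r_j(t)\cdot\bigl(x - r_j(t)\bigr) - \dot r_j(t)^2 + \dot\vartheta_j(t), \tag{2.37}$$
and
$$\nabla\theta_j(x,t) = \dot r_j(t), \tag{2.38}$$
we find that, for $\psi_t(x)$ as in (2.35), the action functional $S(\bar\psi,\psi)$ introduced in (2.14), with $-\frac T\varepsilon \le t_1 < t_2 \le \frac T\varepsilon$, is given by
$$S(\bar\psi,\psi) = \frac12\int_{t_1}^{t_2}dt\sum_{j=1}^k\Bigl\{\frac i2\dot N_j - \int Q^{(0)}_{N_j}(x - r_j)^2\,\ddot r_j\cdot(x - r_j) + N_j\dot r_j^2 - N_j\dot\vartheta_j - \frac12\int\bigl|\nabla Q^{(0)}_{N_j}\bigr|^2 - \frac{N_j}{2}\dot r_j^2 + \frac12\int\bigl(\Phi * Q^{(0)2}_{N_j}\bigr)Q^{(0)2}_{N_j} - N_jW(\varepsilon r_j) + \frac12\sum_{i:\,i\ne j}N_iN_j\Phi_\ell\bigl(\varepsilon(r_i - r_j)\bigr)\Bigr\} + s_\varepsilon, \tag{2.39}$$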
where $s_\varepsilon$ is an error term $\sim o(\varepsilon)$. In the first term on the R.S. of (2.39) we have used (5), the second term proportional to $\ddot r_j$ vanishes by (4), in the third and fourth term we have used (2.37), in the sixth term we have used (2.38), and various cross terms vanish because of (3) or only contribute to the error term because of (1) and (2). We have also used that
$$\int d^nx\,W(\varepsilon x)\,Q^{(0)}_{N_j}(x - r_j)^2 = N_jW(\varepsilon r_j) + o(\varepsilon);$$
and that, for $i \ne j$,
$$\int d^nx\int d^ny\,Q^{(0)}_{N_i}(x - r_i)^2\,\Phi(x - y)\,Q^{(0)}_{N_j}(y - r_j)^2 = N_iN_j\Phi_\ell\bigl(\varepsilon(r_i - r_j)\bigr) + o(\varepsilon),$$
by (4) and because $\Phi_s(x)$ decays rapidly in $|x|$. Thus
$$S(\bar\psi,\psi) = \frac12S_{\mathrm{Newton}}\bigl(\{r_j, N_j\}_{j=1,\ldots,k}\bigr) + \frac12\int_{t_1}^{t_2}dt\sum_{j=1}^k\Bigl[\frac i2\dot N_j - N_j\dot\vartheta_j - 2\mathcal{H}\bigl(Q^{(0)}_{N_j}, Q^{(0)}_{N_j}\bigr)\Bigr] + s_\varepsilon, \tag{2.40}$$
where
$$S_{\mathrm{Newton}}\bigl(\{r_j, N_j\}_{j=1,\ldots,k}\bigr) = \int_{t_1}^{t_2}dt\sum_{j=1}^k\Bigl[\frac{N_j}{2}\dot r_j^2 - N_jW(\varepsilon r_j) + \frac12\sum_{i:\,i\ne j}N_iN_j\Phi_\ell\bigl(\varepsilon(r_i - r_j)\bigr)\Bigr] \tag{2.41}$$
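is the usual Hamiltonian action for $k$ point particles with masses $N_1, \ldots, N_k$ in an external acceleration field with potential $W(\varepsilon\,\cdot)$ and interacting through two-body forces with potential $N_iN_j\Phi_\ell\bigl(\varepsilon(r_i - r_j)\bigr)$.

In order to guarantee that the ansatz (2.35) yields a solution of the Hartree equation (2.6) with properties (a), (b) and (c), we must require that the variation of the action $S(\bar\psi,\psi)$ calculated in (2.40), (2.41) with respect to the variational parameters $r_j, N_j, \vartheta_j$, $j = 1, \ldots, k$, and $h_\varepsilon$ vanish! To write down the variational equations, we observe that the second term on the R.S. of (2.40) is independent of $r_1, \ldots, r_k$, except for the error term $s_\varepsilon$, which is $o(\varepsilon)$. Thus, varying $S(\bar\psi,\psi)$ with respect to $r_1, \ldots, r_k$ yields Newton's equations of motion
$$\ddot r_j = -\varepsilon(\nabla W)(\varepsilon r_j) + \frac\varepsilon2\sum_{i:\,i\ne j}N_i(\nabla\Phi_\ell)\bigl(\varepsilon(r_j - r_i)\bigr) + a_j, \tag{2.42}$$
where $a_j$ comes from the error term $s_\varepsilon$, and $|a_j(t)| \sim o(\varepsilon)$, for $|t| \le \frac T\varepsilon$; $j = 1, \ldots, k$. Variation with respect to $N_1, \ldots, N_k$ yields the equations
$$\dot\vartheta_j = \frac12\dot r_j^2 - W(\varepsilon r_j) + \sum_{i:\,i\ne j}N_i\Phi_\ell\bigl(\varepsilon(r_i - r_j)\bigr) - \frac{\partial}{\partial N_j}\mathcal{H}\bigl(Q^{(0)}_{N_j}, Q^{(0)}_{N_j}\bigr) + o(\varepsilon). \tag{2.43}$$
It is easy to see that
$$\frac{\partial}{\partial N_j}\mathcal{H}\bigl(Q^{(0)}_{N_j}, Q^{(0)}_{N_j}\bigr) = \frac{1}{2N_j}\int\bigl|\nabla Q^{(0)}_{N_j}\bigr|^2 - \frac{1}{N_j}\int\bigl(\Phi * Q^{(0)2}_{N_j}\bigr)Q^{(0)2}_{N_j} = E_j - N_j\Phi_\ell(0) + o(\varepsilon), \tag{2.44}$$
see Eq. (2.26). Hence, for $|t| \le \frac T\varepsilon$,
$$\dot\vartheta_j = \frac12\dot r_j^2 - W(\varepsilon r_j) + \sum_{i=1}^kN_i\Phi_\ell\bigl(\varepsilon(r_i - r_j)\bigr) - E_j + o(\varepsilon). \tag{2.45}$$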
Variation with respect to $\vartheta_1, \ldots, \vartheta_k$ yields the equations
$$\dot N_j = o(\varepsilon), \tag{2.46}$$
(approximate conservation of masses of particles), and, finally, variation with respect to $h_\varepsilon$ yields an equation of motion of the form
$$\frac{\partial}{\partial t}h_\varepsilon(x,t) = X\bigl(h_\varepsilon, \{r_j, N_j, \vartheta_j\}_{j=1}^k\bigr)(x,t), \tag{2.47}$$
with $\|X\| \sim o(\varepsilon)$, for $|t| \le \frac T\varepsilon$, where $\|(\cdot)\|$ is an appropriately chosen norm.

At a heuristic level, Eqs. (2.42) and (2.46) show very clearly that the limit $\varepsilon \to 0$ corresponds to the point-particle limit in which the masses, $N_1, \ldots, N_k$, of the particles ("solitons") are constant and their trajectories are solutions of Newton's equations of motion, on time scales of $O(\varepsilon^{-1})$.

It is interesting and useful to work out explicit expressions for all the terms of $o(\varepsilon)$ in Eqs. (2.42), (2.45), (2.46) and (2.47), in order to understand more about the corrections to the Newtonian point-particle limit and to get a handle on phenomena like radiation loss and dissipation through emission of small-amplitude dispersive radiation. But, since our discussion in this section is at a formal level, let's not! In the special case where $k = 1$, the terms of size $o(\varepsilon)$ are analyzed in Sect. 3.

The analysis of the correction term $s_\varepsilon$ in expression (2.40) for the action functional and of the properties of solutions of Eq. (2.47) is crucial in attempting to understand the long-time behavior of solutions of the Hartree equation (2.6). In the introduction, we have drawn attention to results of Soffer and Weinstein [8], see also [12], concerning "nonlinear Rayleigh scattering" for small-amplitude solutions of the non-linear Schrödinger or Hartree equations with a suitable external potential $\lambda V$. One would like to extend their results in the direction of a theory of non-linear resonances (metastable states) and gain understanding of the phenomenon of "approach to a groundstate". Of particular interest are situations where the Hamilton functional $\mathcal{H}(\bar\psi,\psi)$, see (2.3), restricted to a sphere $S_N$ in phase space has several distinct local minima, for $N$ large enough.
This happens when $\lambda V$ has several minima separated by large barriers and $-\Phi$ is the potential of an attractive force. One would then like to understand the shape of the "basins of attraction" in phase space of the local minima of $\mathcal{H}(\bar\psi,\psi)|_{S_N}$: The forward (backward) basin of attraction of a family of local minima of $\mathcal{H}(\bar\psi,\psi)|_{S_N}$ parametrized by $N$ consists of all initial conditions in phase space which approach an element of this family plus dispersive radiation decaying to 0 at the free dispersion rate, as $t \to +\infty$ ($t \to -\infty$). This is the phenomenon of "approach to a groundstate". More ambitiously, one might try to construct a "centre manifold" of asymptotically attracting configurations of solitons to which solutions of the Hartree equation with initial conditions sufficiently close to the centre manifold converge locally in space, as $|t| \to \infty$. See [12] for some preliminary results.

Let us consider an example: We choose an initial condition for the Hartree equation describing two far-separated solitons at positions $r_1, r_2$ and with initial velocities $v_1, v_2$. We suppose that $\lambda V = 0$ and that $-\Phi_\ell$ is purely attractive and of short range. The "masses" $N_1, N_2$ of the solitons and the initial conditions $r_1, v_1$ and $r_2, v_2$ are chosen such that the two solitons form a bound state, i.e., that
$$\frac{N_1}{2}v_1^2 + \frac{N_2}{2}v_2^2 - N_1N_2\Phi_\ell\bigl(\varepsilon(r_1 - r_2)\bigr) < 0. \tag{2.48}$$
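The bound-state condition (2.48) is easy to evaluate for concrete data. The sketch below does so for a hypothetical Gaussian profile `phi_long` standing in for the long-range part of the interaction; this profile, the function names and the sample values are assumptions chosen for illustration and are not taken from the paper.

```python
import math


def phi_long(rho):
    # Hypothetical rotation-invariant, attractive, short-range profile.
    return math.exp(-rho * rho)


def two_soliton_energy(N1, v1, r1, N2, v2, r2, eps):
    """Left-hand side of (2.48); positions and velocities are tuples."""
    kinetic = 0.5 * N1 * sum(c * c for c in v1) \
        + 0.5 * N2 * sum(c * c for c in v2)
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))
    return kinetic - N1 * N2 * phi_long(eps * dist)


def is_bound(N1, v1, r1, N2, v2, r2, eps):
    """True when the two point particles satisfy criterion (2.48)."""
    return two_soliton_energy(N1, v1, r1, N2, v2, r2, eps) < 0.0


if __name__ == "__main__":
    r1, r2 = (0.0, 0.0, 0.0), (10.0, 0.0, 0.0)
    slow, fast = (0.1, 0.0, 0.0), (2.0, 0.0, 0.0)
    print(is_bound(1.0, slow, r1, 1.0, slow, r2, eps=0.01))
    print(is_bound(1.0, fast, r1, 1.0, fast, r2, eps=0.01))
```

Note that the criterion, as written in (2.48), uses the kinetic energies in the given frame of reference, and the sketch follows it literally.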
One would then like to calculate the power, $P_R(t)$, of emission of dispersive radiation through a sphere of radius $R \gg \max(|r_1|, |r_2|)$. Moreover, one would like to show that, as $t \to \pm\infty$, a typical configuration of two solitons satisfying (2.48) collapses to a single soliton moving through space at a constant velocity. This phenomenon would describe the "radiative collapse of a binary system". More generally, it would be interesting to understand how, at intermediate times, small inhomogeneities in the initial conditions for solutions of the Hartree equation grow to form a structure of rotating bodies (solitons) perturbed by outgoing, dispersive radiation, before it eventually approaches a number of far separated solitons escaping from each other. [In studying such problems, one finds out that the Hartree equation not only "knows" about Newton's equations of motion, it also "knows" about the Euler equations for the motion of rigid bodies.]

The problems described here are problems on the scattering theory for the Hartree equation. If $-\Phi$ is attractive, i.e., for a self-focussing non-linearity, scattering theory is bound to be very subtle, involving infinitely many "scattering channels", and is beyond the reach of our methods (see, however, Sect. 4 for some preliminary results, and [7] for the case where $-\Phi$ is repulsive).

3. Proof of Theorem 1.1

In this section, we prove the first main result (Theorem 1.1) of this paper.

3.1. Stability of soliton solutions of Hartree equations. We first review the stability of the soliton solutions to the Hartree equation without external potential, i.e., for $\lambda = 0$. The equation is
$$i\partial_t\psi = 2\frac{\partial\mathcal{H}}{\partial\bar\psi} = -\tfrac12\Delta\psi - \bigl(\Phi * |\psi|^2\bigr)\psi, \tag{3.1}$$
where $\frac{\partial\mathcal{H}}{\partial\bar\psi}$ ($\mathcal{H} = \mathcal{H}^{(\lambda=0)}$, see (1.11)) is the first variation of the energy functional w.r.t. $\bar\psi$. Recall that $Q$ is a minimizer of $\mathcal{H}$ under the constraint $\mathcal{N}(\bar\psi,\psi) := \|\psi\|_2^2 = N$, for some $N$ fixed, and thus $Q$ satisfies the equation
$$-\tfrac12\Delta Q - \bigl(\Phi * |Q|^2\bigr)Q = EQ, \tag{3.2}$$
for some non-linear eigenvalue (Lagrange multiplier) $E$. Suppose the function $\psi$ can be written in the form $\psi = (Q + h)e^{-iEt}$. Then the linearized equation satisfied by $h$ takes the form
$$i\partial_th = Lh, \tag{3.3}$$
where
$$Lh = -\tfrac12\Delta h - Eh - \bigl(\Phi * Q^2\bigr)h - Q\bigl(\Phi * (Q(h + \bar h))\bigr). \tag{3.4}$$
Due to the appearance of $\bar h$ on the right side of (3.4), $L$ is not a complex-linear operator. It is therefore convenient to separate the last equation into real and imaginary parts
$$Lh = L_+A + iL_-B, \qquad h = A + iB, \qquad (A \text{ and } B \text{ real}), \tag{3.5}$$
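where
$$L_- = -\tfrac12\Delta - E - \Phi * Q^2, \qquad L_+ = L_- - 2Q\bigl[\Phi * (Q\,\cdot)\bigr], \qquad \bigl(L_+A = L_-A - 2Q[\Phi * (QA)]\bigr).$$
In matrix form,
$$\frac{\partial}{\partial t}\begin{pmatrix}A\\B\end{pmatrix} = \begin{pmatrix}0 & L_-\\ -L_+ & 0\end{pmatrix}\begin{pmatrix}A\\B\end{pmatrix} =: \mathbf{L}\begin{pmatrix}A\\B\end{pmatrix}. \tag{3.6}$$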
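($\mathbf{L}$ is the matrix form of $-iL$; it determines a linear Hamiltonian vector field.) The operators $L_-$ and $L_+$ also appear naturally in the second variation of the energy functional $\mathcal{H}$. Writing $\psi = u + iv$, we have by explicit computation
$$\mathcal{H}(Q + h) = \mathcal{H}(Q) + \int dx\,\frac{\partial\mathcal{H}}{\partial u}\Big|_QA + \int dx\,\frac{\partial\mathcal{H}}{\partial v}\Big|_QB + \frac12\Bigl[\int dx\,A\frac{\partial^2\mathcal{H}}{\partial u\,\partial u}\Big|_QA + 2\int dx\,A\frac{\partial^2\mathcal{H}}{\partial u\,\partial v}\Big|_QB + \int dx\,B\frac{\partial^2\mathcal{H}}{\partial v\,\partial v}\Big|_QB\Bigr] + O(h^3),$$
where
$$\frac{\partial\mathcal{H}}{\partial u}\Big|_Q = \frac{\partial\mathcal{H}}{\partial u}\Big|_{\bar\psi=\psi=Q}.$$
Notice that $\mathcal{H}$ has no cross terms in $u$ and $v$, except in the nonlinear term depending only on $|\psi|^2$. Since $Q$ is real, we have that
$$\frac{\partial\mathcal{H}}{\partial v}\Big|_Q = 0.$$
Thus the first order term is just
$$\int dx\,\frac{\partial\mathcal{H}}{\partial u}\Big|_QA = 2\int dx\,\frac{\partial\mathcal{H}}{\partial\bar\psi}\Big|_QA = E\int dx\,QA,$$
where we have used Eq. (3.2). Similarly,
$$\frac{\partial^2\mathcal{H}}{\partial u\,\partial v}\Big|_Q = 0,$$
and the second order term is just
$$\frac12\Bigl[\int dx\,A\frac{\partial^2\mathcal{H}}{\partial u\,\partial u}\Big|_QA + \int dx\,B\frac{\partial^2\mathcal{H}}{\partial v\,\partial v}\Big|_QB\Bigr] = \int dx\,AL_+A + \int dx\,BL_-B + E\int dx\,(A^2 + B^2).$$
(Observe that $\mathcal{H}''_{\mathrm{real}} - E = L_+$.) We have thus proved that
$$\mathcal{H}(Q + h) = \mathcal{H}(Q) + E\bigl[(Q, A) + \|h\|_2^2\bigr] + \mathrm{Re}(Lh, h) + O(h^3), \tag{3.7}$$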
242
J. Fröhlich, T.-P. Tsai, H.-T. Yau
where (f, g) =
f¯gdx is the standard L2 scalar product and Re(Lh, h) = dxAL+ A + dxBL− B.
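Let $Q_\varepsilon \equiv Q_{N+\varepsilon}$ be the (real) minimizer centered at the origin, with $\|Q_\varepsilon\|^2 = N + \varepsilon$. Let $h_\varepsilon = Q_\varepsilon - Q$. Then
\[ \varepsilon = \|Q + h_\varepsilon\|^2 - \|Q\|^2 = 2\int Q h_\varepsilon + \int h_\varepsilon^2 = 2\int Q h_\varepsilon + O(\varepsilon^2). \]
We define $E(N)$ as the minimal energy subject to the constraint $\|\psi\|^2 = N$:
\[ E(N) = \inf_{\|\psi\|^2 = N} H(\psi). \]
The last two equations and (3.7) then yield the standard relation
\[ \frac{\partial E(N)}{\partial N} = E/2. \tag{3.8} \]
For an arbitrary $h$ with $\mathrm{Re}\,h \perp Q$, Eq. (3.7) yields
\[ \int dx\, A L_+ A + \int dx\, B L_- B = H(Q + h) - H(Q) - E\|h\|^2 + O(h^3). \tag{3.9} \]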
Since $H(Q + h) \ge E(\|Q + h\|^2) = E(N + \|h\|^2)$ (because $\mathrm{Re}\,h \perp Q$, $\|Q + h\|^2 = \|Q\|^2 + \|h\|^2$), we obtain from Eq. (3.8)
\[ H(Q + h) - H(Q) - E\|h\|^2 \ge O(h^3). \]
This proves that
\[ \int dx\, A L_+ A \ge 0, \qquad \int dx\, B L_- B \ge 0, \]
for all $A \perp Q$ and arbitrary $B$. Thus $L_- \ge 0$, and $L_+$ has at most one negative eigenvalue. From the explicit form of $L_-$ and $L_+$ we conclude that
\[ L_- Q = 0, \qquad L_+ \nabla Q = 0, \qquad L_-(xQ) = -\nabla Q. \tag{3.10} \]
Since $Q$ is positive and $L_- \ge 0$, its null space is the span of $Q$, i.e.,
\[ L_- \ge 0, \qquad N(L_-) = \mathrm{span}_{\mathbb{R}}\{Q\}. \]
From the explicit form of $L_+$ we have that
\[ (Q, L_+ Q) =: \varepsilon_0 \cdot (Q, Q) < 0, \tag{3.11} \]
where
\[ \varepsilon_0 = -2\,(N(Q))^{-1} \int Q^2\, (\Phi * Q^2) < 0. \]
Thus $L_+$ has exactly one negative eigenvalue. The continuous spectra of $L_-$ and $L_+$ can easily be shown to be the half-line $[-E, \infty)$. Since $L_+ \nabla Q = 0$, $0$ is an at least $n$-fold degenerate eigenvalue of $L_+$. A key assumption in our analysis is that the whole null space of $L_+ = H''_{\mathrm{real}} - E$ is spanned by $\nabla Q$, i.e.,
\[ N(L_+) = \mathrm{span}_{\mathbb{R}}\{\nabla Q\}. \tag{3.12} \]
Since the continuous spectrum of $L_-$ and of $L_+$ is the half-line $[-E, \infty)$, $0$ is an isolated point. Hence there is a positive number $\delta$ such that $(h, L_+ h) \ge \delta (h, h)$, if $h$ is orthogonal to the span of $\nabla Q$ and to the ground state of $L_+$. In particular, the number of eigenvalues strictly below $\delta$ is exactly $n + 1$. We have proved the following lemma.

Lemma 3.1. Assume that (3.12) holds. Then the null spaces of $L_-$ and $L_+$ are given by $N(L_-) = \mathrm{span}_{\mathbb{R}}\{Q\}$, $N(L_+) = \mathrm{span}_{\mathbb{R}}\{\nabla Q\}$. Furthermore, there is a constant $\varepsilon_2 > 0$ such that
(a) $(g, L_- g) \ge \varepsilon_2 (g, g)$ if $g \perp Q$.
(b) $(f, L_+ f) \ge \varepsilon_2 (f, f)$ if $f \perp \mathrm{span}_{\mathbb{R}}\{Q, \nabla Q\}$.

If we assume that $\|Q + h\|^2 = \|Q\|^2$, the term with the factor $E$ in (3.7) vanishes, because $2(Q, A) = -\|h\|^2$, and we have that
\[ H(Q + h) = H(Q) + \int dx\, A L_+ A + \int dx\, B L_- B + O(h^3). \tag{3.13} \]
Thus if $h = A + iB$, with
\[ A \perp \mathrm{span}_{\mathbb{R}}\{\nabla Q\}, \qquad B \perp Q, \qquad \|Q + h\|^2 = \|Q\|^2, \tag{3.14} \]
then we can write $A = A_1 + cQ$, with $(A_1, Q) = 0$, for some $c$ of order $\|h\|^2$ (indeed, $c(Q, Q) = (A, Q) = -(h, h)/2$). Since $(Q, \nabla Q) = 0$, we have that $(\nabla Q, A_1) = 0$, provided that $(A, \nabla Q) = 0$. Therefore, under assumption (3.14), we can rewrite (3.7) as
\[ H(Q + h) - H(Q) = (A, L_+ A) + (B, L_- B) + O(h^3) \tag{3.15} \]
\[ \phantom{H(Q + h) - H(Q)} = (A_1, L_+ A_1) + (B, L_- B) + O(h^3). \tag{3.16} \]
Since $A_1 \perp \mathrm{span}_{\mathbb{R}}\{Q, \nabla Q\}$, we can apply Lemma 3.1 to conclude that
\[ (A_1, L_+ A_1) + (B, L_- B) \ge \varepsilon_2 (\|A_1\|^2 + \|B\|^2). \]
Since the difference between $\|A_1\|^2$ and $\|A\|^2$ is of higher order, we obtain
\[ H(Q + h) - H(Q) \ge \varepsilon_2 \|h\|^2 + O(h^3). \tag{3.17} \]
The last equation implies the global (modulational) stability of the soliton solution under small perturbations. To see this, suppose the initial data is of the form $Q + h_0$, with $\|Q + h_0\|^2 = \|Q\|^2$ (the last condition always holds, since we can choose a $Q$ with the mass of the initial value). At a later time $t$, we can find $r$ and $\theta$ such that $\psi_t(x - r)e^{-i\theta} = h(x) + Q(x)$ with the mass of the correction, $\|h\|^2$, minimized. One can easily check that $h$ satisfies condition (3.14). By inequality (3.17), $\|h\|^2$ is bounded from above by the left hand side, which is conserved under the time evolution.
3.2. Dynamical linearization of the Hartree equation around solitons. We now return to the Hartree equation (1.5) with external potential $\lambda V(x) = W(\varepsilon x)$. Since our time scale is of order $t \sim \varepsilon^{-1}$, the change in the external potential during the evolution on this time scale may not be small. Thus the argument in the last section no longer applies. We shall show that, nevertheless, the soliton solution is stable on this time scale, and we shall track the motion of the soliton precisely. The Hartree equation (1.5) is
\[ i\partial_t \psi = -\frac12 \Delta\psi + W(\varepsilon x)\psi - (\Phi * |\psi|^2)\psi =: H(\psi)\psi. \tag{3.18} \]
The solutions we are interested in are of the form
\[ \psi(x, t) = [Q(x - r(t)) + h_\varepsilon(x - r(t), t)]\, e^{i\theta(x,t)}, \tag{3.19} \]
for $\varepsilon > 0$ small enough, where $Q(x) = Q^{(\varepsilon=0)}(x)$ is a minimizer of the energy functional $H$, and $h_\varepsilon(x - r(t), t)$ is a small correction term which tends to $0$, as $\varepsilon \to 0$; $\theta(x, t)$ is a time-dependent phase of the form
\[ \theta(x, t) = v(t) \cdot (x - r(t)) - Et + \theta_1(t). \]
Also, we expect that, to leading order, the velocity $v(t)$ and the location $r(t)$ of the soliton are given by
\[ \dot r(t) = v(t), \qquad \dot v(t) = -\varepsilon (\nabla W)(\varepsilon r(t)). \]
For the time being, there is no canonical way to determine corrections to these equations, as the decomposition (3.19) is not unique. We require $v$, $r$, and $\theta_1$ to obey the following equations:
\[ \dot r(t) = v(t), \qquad \dot v(t) = -\varepsilon(\nabla W)(\varepsilon r(t)) + a(t), \qquad \dot\theta_1(t) = \frac12 v^2(t) - W(\varepsilon r(t)) + \omega(t), \]
where the (vector) acceleration correction $a(t)$ and the (scalar) “angular velocity” correction $\omega(t)$ are of higher order in $\varepsilon$ and will be used for fine adjustment, later on. Their initial values will be discussed in Subsect. 3.5.1, when we adjust the initial datum $h_{\varepsilon,0}$. We now derive the equations for $a$, $\omega$ and $h$. Let $\xi(x, t) = Q(y)e^{i\theta}$, $y = x - r(t)$. By explicit computation,
\[ \xi^{-1}\{i\partial_t - H(\xi)\}\xi = \xi^{-1}\Big\{ i\partial_t \xi + \frac12 \Delta\xi \Big\} - W(\varepsilon x) + \Phi * |\xi|^2 \]
\[ = -\dot\theta_1(t) - \partial_t[v(t) \cdot (x - r(t))] - \frac12 v^2(t) + i(v(t) - \dot r(t)) \cdot \frac{\nabla Q}{Q} + \frac{\Delta Q}{2Q} - W(\varepsilon x) + E + \Phi * |\xi|^2. \]
We expand the potential $W$ around the point $r(t)$:
\[ W(\varepsilon x) = W(\varepsilon r(t)) + \nabla W(\varepsilon r(t)) \cdot \varepsilon(x - r(t)) + \Omega_0(x, t), \]
where the remainder $\Omega_0(x, t)$ is real and, by the mean value theorem,
\[ |\Omega_0(x, t)| \le C\varepsilon^2 |y|^2, \tag{3.20} \]
where $C = C(W)$ depends on $W$. Recalling the equations for $r$, $v$ and (3.2), we then have
\[ \xi^{-1}\{i\partial_t - H(\xi)\}\xi = -\Omega, \tag{3.21} \]
where
\[ \Omega = -W(\varepsilon r) + W(\varepsilon x) + \dot v \cdot y + \omega = \Omega_0(x, t) + a(t) \cdot y + \omega(t). \tag{3.22} \]
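The quadratic bound (3.20) on the Taylor remainder can be checked numerically in one space dimension. The following is a minimal sketch, assuming a sample potential $W(x) = \cos x$ that does not come from the paper (it is chosen only because $\sup |W''| = 1$ is known exactly); the names `omega0` and `worst_ratio` are likewise illustrative.

```python
import math

def omega0(W, dW, eps, r, y):
    """Remainder of the first-order Taylor expansion of W(eps*x) about x = r, with y = x - r."""
    return W(eps * (r + y)) - W(eps * r) - eps * y * dW(eps * r)

# Hypothetical test potential (an assumption, not from the paper): W(x) = cos(x).
W = math.cos
dW = lambda x: -math.sin(x)
SUP_W2 = 1.0  # sup |W''| for this choice of W

def worst_ratio(eps, r, ys):
    """Largest value of |Omega_0| / (eps*y)^2 over the sample points ys."""
    return max(abs(omega0(W, dW, eps, r, y)) / (eps * y) ** 2 for y in ys)

ys = [k / 10 for k in range(-200, 201) if k != 0]
ratios = {eps: worst_ratio(eps, r=3.0, ys=ys) for eps in (0.1, 0.01, 0.001)}
for eps, ratio in ratios.items():
    print(f"eps = {eps}: max |Omega_0| / (eps y)^2 = {ratio:.6f}")
```

With the Lagrange form of the remainder one expects every ratio to stay below $\frac12 \sup|W''|$, uniformly in $\varepsilon$, which is the content of (3.20) with $C = \frac12 \sup|W''|$ in this one-dimensional setting.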
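Next, we consider $h(y, t) = h_\varepsilon(x - r(t), t)$. Substituting $\psi = (Q + h)e^{i\theta}$ into Eq. (3.18) and canceling $e^{i\theta}$ we get
\[ i\{(Q + h)(i\partial_t\theta) - \dot r \cdot \nabla(Q + h) + \partial_t h\} = -\frac12 \Delta(Q + h) - iv \cdot \nabla(Q + h) + \frac12 v^2 (Q + h) + W(\varepsilon x)(Q + h) - (\Phi * |Q + h|^2)(Q + h), \]
where $Q$ and $h$ are taken at $(y, t) = (x - r(t), t)$, that is, $Q = Q(x - r(t)) = Q(y)$ and $h = h(x - r(t), t) = h(y, t)$. Using $\dot r(t) = v(t)$, Eq. (3.2), and
\[ \partial_t\theta = \dot v \cdot x + \Big( -\frac12 v^2 - \dot v \cdot r \Big) - W(\varepsilon r) + \omega(t) - E = -W(\varepsilon x) + \Omega - \frac12 v^2 - E, \]
we obtain
\[ i\partial_t h = -\frac12 \Delta h - Eh + \Omega(Q + h) - \big[ (\Phi * |Q + h|^2)(Q + h) - (\Phi * Q^2)Q \big]. \tag{3.23} \]
Treating $\Omega h$ as an error term, we can rewrite this equation as
\[ \partial_t h = -i\mathcal{L}h + G, \tag{3.24} \]
where the operator $\mathcal{L}$ is given by (3.4), and the nonlinear part is
\[ G = -i\Omega(Q + h) - iF(h), \qquad \text{with } F(h) = -\big[ (\Phi * |h|^2)(Q + h) + (\Phi * [Q(h + \bar h)])h \big]. \tag{3.25} \]
In matrix form,
\[ \frac{\partial}{\partial t}\begin{pmatrix} A \\ B \end{pmatrix} = \begin{pmatrix} 0 & L_- \\ -L_+ & 0 \end{pmatrix}\begin{pmatrix} A \\ B \end{pmatrix} + \begin{pmatrix} \mathrm{Re}\,G \\ \mathrm{Im}\,G \end{pmatrix}. \tag{3.26} \]
We observe that, except for $\Omega_0$ which is part of $\Omega$ (and thus appears in $G$), all quantities in this system are evaluated at $(y, t)$.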
3.3. Properties of the linearized flow. We have shown that the linear part in the dynamical linearization of the nonlinear Hartree equation results in the standard linear evolution (3.3) with matrix form given in (3.6). We notice that $L_+$ and $L_-$, the real and imaginary part of $\mathcal{L}$, can be reinterpreted as complex-linear operators which turn out to be self-adjoint in the usual $L^2$ space. The operator
\[ L = \begin{pmatrix} 0 & L_- \\ -L_+ & 0 \end{pmatrix}, \qquad \text{with } L^* = \begin{pmatrix} 0 & -L_+ \\ L_- & 0 \end{pmatrix}, \]
acting on $H^1 \times H^1$ is, however, not symmetric. Although our functions $A$ and $B$ are real, we shall view $L_-$ and $L_+$ as self-adjoint operators on the Sobolev space $H^1$ of complex-valued functions. The operator $L$ is, however, not self-adjoint and thus does not have a spectral decomposition. A standard procedure is to decompose the space $H^1 \times H^1$ into a direct sum of its generalized null space,
\[ S := N_g(L) = \{ v : L^n v = 0 \text{ for some } n \}, \]
and the orthogonal complement of the generalized null space of its adjoint, i.e., the space $M = N_g(L^*)^\perp$. It is simple to check that both spaces, $S = N_g(L)$ and $M = N_g(L^*)^\perp$, are invariant under $L$. Note that the decomposition $H^1 \times H^1 = M \oplus S$ is, however, not an orthogonal decomposition. Following M. I. Weinstein [13], we want to establish the following picture:
1. $H^1 \times H^1 = M \oplus S$.
2. The $H^1 \times H^1$-norm on $M$ remains uniformly bounded under the linearized flow for all time.
3. The dynamics on $S$ can be computed explicitly.
We use $P_M$ and $P_S$ to denote (non-orthogonal) projections with respect to the decomposition $M \oplus S$. We first establish some spectral properties of $L_+$ and $L_-$.

3.3.1. Generalized null space. We first determine the generalized null space $S = N_g(L)$. We recall Lemma 3.1 and the equations
\[ L_- Q = 0, \qquad L_+ \nabla Q = 0, \qquad L_-\, xQ = -\nabla Q. \]
Since $Q \perp \mathrm{span}_{\mathbb{R}}\{\nabla Q\} = N(L_+)$ and $L_+$ is self-adjoint, there exists a solution, $\Gamma_1$, of the equation $L_+ \Gamma_1 = Q$. We may assume $\Gamma_1 \perp \nabla Q$ by subtracting its projection on the $\nabla Q$-direction. If $\Gamma_1 \perp Q$, then $(\Gamma_1, Q) = (\Gamma_1, L_+ \Gamma_1) > \varepsilon_1 (\Gamma_1, \Gamma_1)$, by Lemma 3.1. This contradiction shows $(\Gamma_1, Q) \ne 0$. Now we let $\Gamma = \Gamma_1 + b \cdot \nabla Q$ with $b = 2(\Gamma_1, xQ)$. Then $(\Gamma, Q) = (\Gamma_1, Q) \ne 0$, and $(\Gamma, xQ) = 0$. To summarize, we have found a $\Gamma$ such that
\[ L_+ \Gamma = Q, \qquad (\Gamma, xQ) = 0, \qquad (\Gamma, Q) \ne 0. \tag{3.27} \]
We require $(\Gamma, xQ) = 0$ in order to construct a dual basis on $S$ in Proposition 3.2 below. To determine the generalized null space, we need to find all solutions of the equation $L^n \binom{u}{v} = \binom{0}{0}$ for some $n$. If $n = 2k$ is even, it is equivalent to $(L_- L_+)^k u = 0$ and $(L_+ L_-)^k v = 0$. If $n = 2k + 1$ is odd, it is equivalent to $L_+ (L_- L_+)^k u = 0$ and $L_- (L_+ L_-)^k v = 0$. We have found the solutions for the case $n = 1$ above: they form the span of $\binom{0}{Q}$ and $\binom{\nabla Q}{0}$.
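We next consider the case $n = 2$. The null space of $L_+ L_-$ is
\[ N(L_+ L_-) = L_-^{-1} N(L_+) = N(L_-) \oplus \mathrm{span}_{\mathbb{R}}\{xQ\} = \mathrm{span}_{\mathbb{R}}\{Q, xQ\}. \tag{3.28} \]
Similarly,
\[ N(L_- L_+) = N(L_+) \oplus \mathrm{span}_{\mathbb{R}}\{\Gamma\} = \mathrm{span}_{\mathbb{R}}\{\nabla Q, \Gamma\}. \tag{3.29} \]
For the case $n = 3$, we have
\[ N(L_- L_+ L_-) = L_-^{-1} N(L_- L_+) = L_-^{-1}\, \mathrm{span}_{\mathbb{R}}\{\nabla Q, \Gamma\}. \]
Since $N(L_-) = \mathrm{span}_{\mathbb{R}}\{Q\}$ and $(Q, \Gamma) \ne 0$, $\Gamma$ is not in the range of $L_-$. Thus
\[ L_-^{-1}\, \mathrm{span}_{\mathbb{R}}\{\nabla Q, \Gamma\} = L_-^{-1}\, \mathrm{span}_{\mathbb{R}}\{\nabla Q\} = N(L_+ L_-). \]
This proves that $N(L_- L_+ L_-) = N(L_+ L_-)$. Similarly, $N(L_+ L_- L_+) = N(L_- L_+)$. Therefore, if $L^n \binom{u}{v} = \binom{0}{0}$ for some $n \ge 2$, then $L^2 \binom{u}{v} = \binom{0}{0}$. Thus we have found a basis for $N_g(L)$. We also have similar statements for $N_g(L^*)$. Summarizing, we have proved

Proposition 3.2.
\[ S = N_g(L) = \mathrm{span}_{\mathbb{R}}\Big\{ \begin{pmatrix} 0 \\ Q \end{pmatrix}, \begin{pmatrix} \nabla Q \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ xQ \end{pmatrix}, \begin{pmatrix} \Gamma \\ 0 \end{pmatrix} \Big\}, \qquad N_g(L^*) = \mathrm{span}_{\mathbb{R}}\Big\{ \begin{pmatrix} 0 \\ \Gamma \end{pmatrix}, \begin{pmatrix} xQ \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ \nabla Q \end{pmatrix}, \begin{pmatrix} Q \\ 0 \end{pmatrix} \Big\}. \tag{3.30} \]

Notice that these vectors are dual bases and we have ordered them correspondingly. In particular, for an arbitrary function $g$ we have
\[ P_S(g) = \kappa_1 (\mathrm{Im}\,g, \Gamma) \begin{pmatrix} 0 \\ Q \end{pmatrix} + \kappa_2 (\mathrm{Re}\,g, xQ) \begin{pmatrix} \nabla Q \\ 0 \end{pmatrix} + \kappa_2 (\mathrm{Im}\,g, \nabla Q) \begin{pmatrix} 0 \\ xQ \end{pmatrix} + \kappa_1 (\mathrm{Re}\,g, Q) \begin{pmatrix} \Gamma \\ 0 \end{pmatrix}, \]
where $\kappa_1 = 1/(Q, \Gamma)$ and $\kappa_2 = 1/(x_j Q, \partial_j Q) = -2$. Also note that we have
\[ L \begin{pmatrix} 0 \\ Q \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \qquad L \begin{pmatrix} \nabla Q \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \qquad L \begin{pmatrix} 0 \\ yQ \end{pmatrix} = -\begin{pmatrix} \nabla Q \\ 0 \end{pmatrix}, \qquad L \begin{pmatrix} \Gamma \\ 0 \end{pmatrix} = -\begin{pmatrix} 0 \\ Q \end{pmatrix}. \tag{3.31} \]
Let $g(t)$ be a solution to the linear evolution (3.3) and denote the projection onto $S$ by
\[ P_S g(t) = \alpha(t) \begin{pmatrix} 0 \\ Q \end{pmatrix} + \beta(t) \begin{pmatrix} \nabla Q \\ 0 \end{pmatrix} + \gamma(t) \begin{pmatrix} 0 \\ xQ \end{pmatrix} + \delta(t) \begin{pmatrix} \Gamma \\ 0 \end{pmatrix}. \]
Then by (3.31) the equations for the coefficients $(\alpha(t), \beta(t), \gamma(t), \delta(t))$ (note that $\beta(t)$ and $\gamma(t)$ are vector functions) are given by
\[ \begin{pmatrix} 0 \\ Q \end{pmatrix}: \ \dot\alpha = -\delta, \qquad \begin{pmatrix} \nabla Q \\ 0 \end{pmatrix}: \ \dot\beta = -\gamma, \qquad \begin{pmatrix} 0 \\ yQ \end{pmatrix}: \ \dot\gamma = 0, \qquad \begin{pmatrix} \Gamma \\ 0 \end{pmatrix}: \ \dot\delta = 0. \]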
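The statement that the dynamics on $S$ can be computed explicitly amounts to the observation that the coefficient system is nilpotent, so its flow is linear in $t$. The sketch below illustrates this in coordinates; it treats $\beta$ and $\gamma$ as scalars (a single spatial component, a simplification made here for illustration), and the names `N`, `matmul` and `flow` are not from the paper.

```python
# Coordinates c = (alpha, beta, gamma, delta) of P_S g in the basis of S.
# The coefficient equations  alpha' = -delta, beta' = -gamma, gamma' = 0, delta' = 0
# read c' = N c with the matrix N below.
N = [
    [0, 0, 0, -1],   # alpha' = -delta
    [0, 0, -1, 0],   # beta'  = -gamma
    [0, 0, 0, 0],    # gamma' = 0
    [0, 0, 0, 0],    # delta' = 0
]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def flow(c0, t):
    """exp(tN) c0 = (I + tN) c0, because N squared vanishes."""
    return [c0[i] + t * sum(N[i][j] * c0[j] for j in range(4)) for i in range(4)]

N2 = matmul(N, N)                       # the zero matrix: N is nilpotent of order 2
c_t = flow([1.0, 2.0, 0.5, -0.25], t=4.0)
print(N2)
print(c_t)
```

Since $N^2 = 0$, the coefficients $\gamma$ and $\delta$ are constant while $\alpha$ and $\beta$ grow at most linearly in time, which is the growth one has to control on the time scale $t \sim \varepsilon^{-1}$.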
3.3.2. The flow on M. We have decomposed the space $H^1 \times H^1$ into a direct sum of the generalized null space $S = N_g(L)$ and $M = N_g(L^*)^\perp$. The generalized null spaces for $L$ and $L^*$ are given by Proposition 3.2. Thus, $M$ is the space
\[ M = \Big\{ \binom{u}{v} : u \perp \mathrm{span}_{\mathbb{R}}\{Q, xQ\},\ v \perp \mathrm{span}_{\mathbb{R}}\{\nabla Q, \Gamma\} \Big\}. \]
Since all functions in the space $S = N_g(L)$ and $M^\perp = N_g(L^*)$ are smooth, the projections $P_S$ and $P_M$ are bounded in any $H^k$ space. Our first aim is to prove

Lemma 3.3 ($H^1$-norm on $M$).
1. If $g \in M$, then $\mathrm{Re}(\mathcal{L}g, g)$ is non-negative and comparable to $\|g\|_{H^1}^2$.
2. If $g(t) = e^{-it\mathcal{L}}\phi$ and $0 \ne \phi \in M$, then $\|g(t)\|_{H^1}$ is uniformly bounded below and above.

To prove this lemma, we first show that, for all vectors $\binom{u}{v} \in M$,
\[ C^{-1} \|u\|_{L^2}^2 \le (u, L_+ u), \qquad C^{-1} \|v\|_{L^2}^2 \le (v, L_- v), \tag{3.32} \]
for some constant $C$, as follows from Lemma 3.1. In fact, it is sufficient to assume that $u \perp \mathrm{span}_{\mathbb{R}}\{Q, xQ\}$ and $v \perp \mathrm{span}_{\mathbb{R}}\{\Gamma\}$. (As will become clear, we only use $(\Gamma, Q) \ne 0$ and $(xQ, \nabla Q) \ne 0$ in this argument.) For the $v$-part, if $(v, Q) = 0$, the claim follows from Lemma 3.1. Hence we may assume $tv = Q + w$ for some $t \ne 0$, $w \perp Q$. By assumption $0 = (\Gamma, tv) = (\Gamma, Q + w)$, hence $|(\Gamma, Q)| = |(\Gamma, w)| \le \|\Gamma\|_2 \|w\|_2$. Therefore we have $\|w\|_2 \ge c_3 > 0$ and
\[ \frac{(v, L_- v)}{(v, v)} = \frac{(w, L_- w)}{\|Q\|_2^2 + \|w\|_2^2} \ge \frac{\varepsilon_2 \|w\|_2^2}{\|Q\|_2^2 + \|w\|_2^2} \ge \frac{\varepsilon_2 c_3^2}{\|Q\|_2^2 + c_3^2}. \]
For the $u$-part, if $(u, \nabla Q) = 0$, the claim follows from Lemma 3.1. Hence we may assume $u = b \cdot \nabla Q + w$ for some vector $b \ne 0$ and some $w \perp Q, \nabla Q$. By assumption, $0 = b \cdot (xQ, u) = (b \cdot xQ, b \cdot \nabla Q + w) = C|b|^2 + (b \cdot xQ, w)$, with $C \ne 0$. Hence $\|w\|_2 > C|b|$ and
\[ \frac{(u, L_+ u)}{(u, u)} = \frac{(w, L_+ w)}{Cb^2 + \|w\|_2^2} \ge \frac{\varepsilon_2 \|w\|_2^2}{Cb^2 + \|w\|_2^2} \ge C\varepsilon_2, \]
by a similar estimate. Hence (3.32) is proved. Now, since $\|\nabla u\|_2$ is bounded by $(u, L_+ u)$ and $\|u\|_2$ (and hence by $(u, L_+ u)$, see (3.32)), we can replace the norm on the left hand side of (3.32) by the $H^1$-norm (here the $H^1$-norm is the sum of the $L^2$-norm plus the $L^2$-norm of the derivative). Therefore we have proved that, for $\binom{u}{v} \in M$,
\[ C^{-1} (u, u)_{H^1} \le (u, L_+ u) \le C (u, u)_{H^1}, \qquad C^{-1} (v, v)_{H^1} \le (v, L_- v) \le C (v, v)_{H^1}. \tag{3.33} \]
The upper bounds on $(u, L_+ u)$ and $(v, L_- v)$ are obvious. Hence the first part of the lemma is proved. The second part follows from the first part and the next lemma, which states that the quantity $(u, L_+ u) + (v, L_- v)$, which is equivalent to the $H^1$-norm on $M$, is actually conserved by the linear flow (3.3). We note that $(u, L_+ u) + (v, L_- v) = \mathrm{Re}(\mathcal{L}g, g) = \mathrm{Im}(Lg, g)$.
Lemma 3.4. Recall that $L = -i\mathcal{L}$, and $iL \ne Li$ (see (4.5)).
1. $\mathrm{Re}(\mathcal{L}f, g) = \mathrm{Im}(Lf, g) = \mathrm{Im}(Lg, f) = -\mathrm{Im}(f, Lg)$.
2. If $g(t) = e^{-it\mathcal{L}}\phi$, then $\mathrm{Im}(L^k g, g)$ is constant for any integer $k \ge 0$.
3. For any $g(t)$ with $\partial_t g = Lg + G$, one has
\[ \frac{d}{dt} \mathrm{Im}(Lg, g) = 2\,\mathrm{Im}(Lg, G). \]

Proof. All these assertions can be checked by simple computations. We only prove the last one in the following.
\[ \frac{d}{dt} \mathrm{Im}(Lg, g) = \mathrm{Im}(L^2 g + LG, g) + \mathrm{Im}(Lg, Lg + G) = \mathrm{Im}(LG, g) + \mathrm{Im}(Lg, G) = 2\,\mathrm{Im}(Lg, G). \qquad \Box \]
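The conservation law behind Lemmas 3.3 and 3.4 has a simple finite-dimensional analogue. The sketch below replaces $L_+$ and $L_-$ on $M$ by two positive constants `a` and `b` (hypothetical stand-ins, not quantities from the paper), so that the matrix flow becomes $u' = b v$, $v' = -a u$, and checks that the quadratic form $a u^2 + b v^2$, the scalar counterpart of $(u, L_+ u) + (v, L_- v)$, is constant along the exact solution.

```python
import math

def flow(a, b, u0, v0, t):
    """Exact solution of u' = b*v, v' = -a*u for constants a, b > 0."""
    w = math.sqrt(a * b)                       # oscillation frequency
    c, s = math.cos(w * t), math.sin(w * t)
    return (u0 * c + (b / w) * v0 * s, v0 * c - (a / w) * u0 * s)

def energy(a, b, u, v):
    """Scalar stand-in for (u, L_+ u) + (v, L_- v)."""
    return a * u * u + b * v * v

a, b = 2.0, 0.5            # hypothetical positive stand-ins for L_+ and L_- on M
u0, v0 = 1.0, -3.0
energies = [energy(a, b, *flow(a, b, u0, v0, t)) for t in (0.0, 0.7, 5.0, 123.4)]
print(energies)
```

The generator is not symmetric when $a \ne b$, yet the flow stays bounded because it preserves this positive quadratic form; this is the mechanism used on $M$, where positivity is supplied by (3.33).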
The following two lemmas will be used to prove inequality (3.61) below. (Note that $H^k$ denotes the Sobolev space $W^{k,2}$.)

Lemma 3.5. (a) For any $m \ge 1$, $e^{-it\mathcal{L}}$ is a bounded map from $M \cap H^m$ into itself. Explicitly, for any $\phi \in M \cap H^m$,
\[ \| e^{-it\mathcal{L}} \phi \|_{H^m} \le C_m \|\phi\|_{H^m}. \]
(b) $L = -i\mathcal{L}$, restricted to $M$, has an inverse which is bounded from $M \cap L^2$ to $M \cap H^2$.

Proof. Proof of (a): Let $g(t) = e^{-it\mathcal{L}}\phi$. The case $m = 1$ is Lemma 3.3, part 2. If $m \ge 3$ is odd, we have that
\[ \|g(t)\|_{H^m}^2 \le C\,\mathrm{Im}(L^m g, g) + C \|g(t)\|_{H^{m-2}}^2 \le C\,\mathrm{Im}(L^m \phi, \phi) + C \|\phi\|_{H^{m-2}}^2 \le C \|\phi\|_{H^m}^2. \]
The second inequality uses Lemma 3.4, part 2. (Note: If $m$ is even, $\mathrm{Im}(L^m g, g) = 0$, and the first inequality fails.) The general case follows from interpolation.

Proof of (b): For $\binom{u}{v} \in M$ we seek $\binom{x}{y} \in M$ such that $L \binom{x}{y} = \binom{u}{v}$, i.e., $L_- y = u$ and $L_+ x = -v$. Notice that $u \perp Q$ and $v \perp \nabla Q$, and the null spaces of the self-adjoint operators $L_-$ and $L_+$ are spanned by $Q$ and $\nabla Q$ respectively. Since $0$ is an isolated eigenvalue of $L_-$ and $L_+$, it follows that $L_-^{-1}$ and $L_+^{-1}$ are bounded operators on the orthogonal complements of the null spaces. This proves that $L$ has a bounded inverse on $M \cap L^2$. To prove the bound, write $w = \binom{u}{v} \in M \cap H^2$. By (3.33)
\[ \|u\|_{W^{1,2}}^2 \le C (u, L_+ u) \le \frac12 \|u\|_2^2 + C \|L_+ u\|_2^2. \]
Hence $\|u\|_{W^{1,2}} \le C \|L_+ u\|_2$. Similarly $\|v\|_{W^{1,2}} \le C \|L_- v\|_2$. Furthermore, write $L_+ = -\frac12 \Delta + V$. (The explicit form of $V$ can easily be read from the definition of $L_+$.) Then
\[ \|\Delta u\|_2 = 2 \|L_+ u - Vu\|_2 \le 2 \|L_+ u\|_2 + C(V) \|u\|_2 \le C \|L_+ u\|_2. \]
Hence $\|u\|_{W^{2,2}} \le C \|L_+ u\|_2$. Similarly $\|v\|_{W^{2,2}} \le C \|L_- v\|_2$. We conclude that
\[ \|w\|_{W^{2,2}} \le C \|Lw\|_2. \]
The lemma follows by a duality argument. $\Box$
Let
\[ X_k = H^k \cap L^2\big( (1 + |y|^{2k})\,dy \big) \tag{3.34} \]
denote the subspace of $H^k$ of functions with prescribed decay at infinity.

Lemma 3.6 (Finite propagation speed). For any integer $k \ge 0$, for any real $m \ge 1$, and for $\phi \in M \cap X_k \cap H^{k+m}$,
\[ \| y^m e^{tL} \phi \|_{H^k} \le C \| y^m \phi \|_{X_k} + C (1 + |t|^m) \|\phi\|_{H^{k+m}}. \tag{3.35} \]
The constant $C$ depends on $k$ and $m$.

Remark. For the free Schrödinger equation, one need not assume that $\phi \in M$, since Lemma 3.5 (a) always holds.

Proof. Let $\alpha$ be any multi-index with $|\alpha| = k$. Let $g(t) = e^{-it\mathcal{L}}\phi$ and $v(t) = \nabla_y^\alpha g(t)$. We have
\[ \partial_t v = Lv + Ig, \qquad \text{with } I = [\nabla^\alpha, L]. \]
Hence,
\[ \frac{d}{dt} \int y^{2m} |v|^2 = 2\,\mathrm{Re} \int y^{2m} \bar v v_t \le C \int y^{2m-1} |v| |\nabla v|\, dy + C \|v\|_2^2 + C \int y^{2m} |v| |Ig|. \]
Since $I$ is a localized operator involving derivatives only up to $(k-1)$st order ($I$ vanishes for $k = 0$), the last term is bounded by $\|v\|_2 \|g\|_{H^{k-1}} \le C \|\phi\|_{H^k}^2$. Hence, by interpolation,
\[ \frac{d}{dt} \|y^m v\|_2^2 \le C \|y^m v\|_2 \cdot \|y^{m-1} \nabla v\|_2 + C \|\phi\|_{H^k}^2 \le C \|y^m v\|_2^{2(1 - 1/2m)} \cdot \|v\|_{H^m}^{1/m} + C \|\phi\|_{H^k}^2. \]
Let $f(t) = \|y^m v\|_2^2$ and $N = C \|\phi\|_{H^{k+m}}^2$. By Hölder's inequality,
\[ f' \le f^{1 - 1/2m} N^{1/2m} + N \le \frac{f}{1 + t} + C (1 + t)^{2m-1} N, \]
hence $f(t) \le C f(0) + C (1 + t)^{2m} N$, which proves the claim. We will need the case $k = 1$ when we prove Lemma 3.8. $\Box$
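3.4. The fine adjustment. We first recall the conclusion of dynamical linearization. We decompose the function $\psi$ into the sum
\[ \psi(x, t) = [Q(x - r(t)) + h_\varepsilon(x - r(t), t)]\, e^{i\theta(x,t)}, \]
where $\theta(x, t)$ is a time-dependent phase of the form $\theta(x, t) = v(t) \cdot (x - r(t)) - Et + \theta_1(t)$, with
\[ \dot r(t) = v(t), \qquad \dot v(t) = -\varepsilon \nabla W(\varepsilon r(t)) + a(t), \qquad \dot\theta_1(t) = \frac12 v^2(t) - W(\varepsilon r(t)) + \omega(t). \]
Here the (vector) acceleration correction $a(t)$ and the (scalar) angular velocity correction $\omega(t)$ are of smaller orders, and we shall determine their values in this subsection. The main correction $h$ satisfies the equation
\[ \frac{\partial}{\partial t} h = Lh + G, \qquad h(0) = h_{\varepsilon,0}, \tag{3.36} \]
with
\[ G = -i\Omega(Q + h) - iF(h), \qquad \Omega = \Omega_0 + a \cdot y + \omega, \]
\[ \Omega_0 = W(\varepsilon x) - W(\varepsilon r) - \varepsilon y \cdot \nabla W(\varepsilon r), \qquad F = -(\Phi * |h|^2)(Q + h) - (\Phi * [Q(h + \bar h)])h. \]
We decompose $h(t)$ into a sum of its components in $S$ and $M$: $h(t) = h_S(t) + h_M(t)$. The component in $S$ is a sum of the basis vectors (3.30),
\[ h_S(t) = \alpha(t) \begin{pmatrix} 0 \\ Q \end{pmatrix} + \beta(t) \begin{pmatrix} \nabla Q \\ 0 \end{pmatrix} + \gamma(t) \begin{pmatrix} 0 \\ yQ \end{pmatrix} + \delta(t) \begin{pmatrix} \Gamma \\ 0 \end{pmatrix}. \]
We now consider projections of Eq. (3.36) onto $S$ and $M$. Taking inner products with the dual basis (see Proposition 3.2), we obtain the equations on $S$:
\[ \begin{pmatrix} 0 \\ Q \end{pmatrix}: \quad \dot\alpha = -\delta + \kappa_1 (\mathrm{Im}\,G, \Gamma), \tag{3.37} \]
\[ \begin{pmatrix} \nabla Q \\ 0 \end{pmatrix}: \quad \dot\beta = -\gamma + \kappa_2 (\mathrm{Re}\,G, yQ), \tag{3.38} \]
\[ \begin{pmatrix} 0 \\ yQ \end{pmatrix}: \quad \dot\gamma = \kappa_2 (\mathrm{Im}\,G, \nabla Q), \tag{3.39} \]
\[ \begin{pmatrix} \Gamma \\ 0 \end{pmatrix}: \quad \dot\delta = \kappa_1 (\mathrm{Re}\,G, Q). \tag{3.40} \]
The equation on $M$ is
\[ \frac{\partial}{\partial t} h_M = L h_M + P_M G. \tag{3.41} \]
Notice that $\Omega_0 = W(\varepsilon x) - W(\varepsilon r) - \varepsilon y \cdot \nabla W(\varepsilon r)$ is determined by $r(t)$, which solves $\ddot r = -\varepsilon \nabla W(\varepsilon r) + a$.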
This system is not yet closed, since we still need to determine $a$ and $\omega$, which are used for the fine adjustment. Observe that $a$ and $\omega$ appear explicitly in the equation on $S$ only through $\mathrm{Im}\,G$, that is, through $a \cdot yQ$ and $\omega Q$. These two terms appear in (3.37) and (3.39), the equations for $\alpha$ and $\gamma$. Our strategy is to choose $a$ and $\omega$ so that $\dot\alpha = 0$ and $\dot\gamma = 0$. Then $h_S(t)$ has at most linear growth. It is important to understand the orders of these quantities. Assume that $\|h\| \le o(\varepsilon)$. Since the force $G$ contains an external input $\Omega_0 Q \sim \varepsilon^2$, $G$ is of the form $O(h^2) + \varepsilon^2$. The equation for $h_M$, i.e., (3.41), is thus of the form
\[ f' \le f^2 + c^2 \varepsilon^2, \qquad c > 1, \tag{3.42} \]
(and we have assumed that we can take care of the linear part). The solutions of this equation can blow up at times of order $(c\varepsilon)^{-1}$. Explicitly, if $f(0) = 0$ and equality holds, then $f(t) = c\varepsilon \tan(c\varepsilon t)$. A more careful examination shows that, due to a cancellation property when integrating in time (which is due to an oscillatory behaviour in time), one can show that $h(t) \sim \varepsilon^{3/2}$. Based on this observation, we will prove that
\[ a(t) \sim \varepsilon^2, \qquad \omega(t) \sim \varepsilon^2, \qquad P_S(h) \sim \varepsilon^3, \qquad P_M(h) \sim \varepsilon^{3/2}. \tag{3.43} \]
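The explicit solution of the model equation (3.42) with equality can be confirmed numerically. The following is a minimal sketch: it integrates $f' = f^2 + c^2 \varepsilon^2$, $f(0) = 0$, with a classical Runge-Kutta scheme and compares the result with $c\varepsilon \tan(c\varepsilon t)$; the values of `c` and `eps` and the helper `rk4` are illustrative choices, not taken from the paper.

```python
import math

def rk4(f0, rhs, t_end, steps):
    """Classical fourth-order Runge-Kutta for the autonomous scalar ODE f' = rhs(f)."""
    h, f = t_end / steps, f0
    for _ in range(steps):
        k1 = rhs(f)
        k2 = rhs(f + 0.5 * h * k1)
        k3 = rhs(f + 0.5 * h * k2)
        k4 = rhs(f + h * k3)
        f += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return f

c, eps = 2.0, 0.01                        # hypothetical sample values with c > 1
rhs = lambda f: f * f + (c * eps) ** 2
t_blowup = math.pi / (2 * c * eps)        # first pole of tan(c*eps*t)
t_end = 0.9 * t_blowup                    # stop safely before the pole
numeric = rk4(0.0, rhs, t_end, steps=20000)
exact = c * eps * math.tan(c * eps * t_end)
print(numeric, exact, t_blowup)
```

The first pole of the tangent sits at $t = \pi / (2c\varepsilon)$, which is of order $\varepsilon^{-1}$; this is why a naive estimate of this type cannot by itself cover the whole time range $0 \le t \le T\varepsilon^{-1}$.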
In the following subsections, we will prove the existence of $h(y, t)$ by proving a priori bounds and using its local existence. It is also possible to prove existence by a contraction mapping argument, as we will do in Sect. 4 for the wave operator.

3.5. Initial value and equations on S.

3.5.1. Initial value. Recall that the initial datum is given by
\[ \psi_0(x) = [Q(x) + h_{\varepsilon,0}(x)]\, e^{iv_0 \cdot x}. \]
The coordinates of the initial value $h_{\varepsilon,0}$ in the $S$ direction can be calculated:
\[ \begin{pmatrix} 0 \\ Q \end{pmatrix}: \ \alpha(0) = \kappa_1 (\mathrm{Im}\,h_{\varepsilon,0}, \Gamma), \qquad \begin{pmatrix} \nabla Q \\ 0 \end{pmatrix}: \ \beta(0) = \kappa_2 (\mathrm{Re}\,h_{\varepsilon,0}, yQ), \]
\[ \begin{pmatrix} 0 \\ yQ \end{pmatrix}: \ \gamma(0) = \kappa_2 (\mathrm{Im}\,h_{\varepsilon,0}, \nabla Q), \qquad \begin{pmatrix} \Gamma \\ 0 \end{pmatrix}: \ \delta(0) = \kappa_1 (\mathrm{Re}\,h_{\varepsilon,0}, Q). \]
By our assumption on $h_{\varepsilon,0}$, these initial values are of order $\varepsilon^{3/2}$, which is too large for our purpose. They can be made smaller by introducing suitable normalization conventions. We first replace $Q$ by $Q^* = Q_{\|\psi_0\|^2}$, with $\|Q^*\|_{L^2} = \|\psi_0\|_{L^2}$. We then define $h_1$ by the equation
\[ \psi_0(x) = [Q(x) + h_{\varepsilon,0}(x)]\, e^{iv_0 \cdot x} = [Q^*(x) + h_1(x)]\, e^{iv_0 \cdot x}. \]
From the assumption $\|Q^*\|^2 = \|\psi_0\|^2$, we have $2(Q^*, \mathrm{Re}\,h_1) = -\|h_1\|^2$. Next, we want to choose $r^*$, $v^*$, $\theta^*$, and write
\[ \psi_0(x) = [Q^*(x - r^*) + h^*(x - r^*)]\, e^{iv^* \cdot (x - r^*) + i\theta^*}, \tag{3.44} \]
so that $P_S h^*$ is essentially zero. Notice that $h^*$ is determined once we have chosen $r^*$, $v^*$ and $\theta^*$: as a function of $y = x - r^*$,
\[ h^*(y) = [Q^*(y + r^*) + h_1(y + r^*)]\, e^{i[v_0 \cdot (y + r^*) - v^* \cdot y - \theta^*]} - Q^*(y). \]
The leading term of $h^*$ is given by (we will choose $r^* \sim 0$, $v^* \sim v_0$ and $\theta^* \sim v_0 \cdot r^*$)
\[ h^*(y) \sim h_1(y) + Q^*(y) [ i(v_0 - v^*) \cdot y + i(v_0 \cdot r^* - \theta^*) ] + r^* \cdot \nabla Q^*(y). \]
We can now calculate the initial value of $h^*$ along the $S$ direction (w.r.t. $Q^*$) as before. The conclusion is
\[ \begin{pmatrix} 0 \\ Q^* \end{pmatrix}: \ \alpha^* \sim \kappa_1 (\mathrm{Im}\,h_1, \Gamma^*) + \kappa_1 (v_0 \cdot r^* - \theta^*)(Q^*, \Gamma^*), \]
\[ \begin{pmatrix} \nabla Q^* \\ 0 \end{pmatrix}: \ \beta^* \sim \kappa_2 (\mathrm{Re}\,h_1, yQ^*) + \kappa_2 \sum_k r_k^* (\nabla_k Q^*, yQ^*), \]
\[ \begin{pmatrix} 0 \\ yQ^* \end{pmatrix}: \ \gamma^* \sim \kappa_2 (\mathrm{Im}\,h_1, \nabla Q^*) + \kappa_2 (v_0 - v^*)(yQ^*, \nabla Q^*), \]
\[ \begin{pmatrix} \Gamma^* \\ 0 \end{pmatrix}: \ \delta^* \sim \kappa_1 (\mathrm{Re}\,h_1, Q^*). \]
Since the initial value $h_{\varepsilon,0}$ is of order $\varepsilon^{3/2}$, $h_1$ is of the same order and we can choose $v_0 \cdot r^* - \theta^*$, $r^*$ and $v_0 - v^*$ of order $\varepsilon^{3/2}$ such that $\alpha^*$, $\beta^*$ and $\gamma^*$ vanish to leading order. It is easy to check that the next order is bounded by $\varepsilon^3$. Furthermore, $\delta^*$ is of order $\varepsilon^3$ as well, thanks to the relation $2(Q^*, \mathrm{Re}\,h_1) = -\|h_1\|^2$. In the remaining part of this section, we will prove Theorem 1.1 with $\psi_0$ of the form (3.44), and $P_S h^* \sim \varepsilon^3$. The initial values of $r(0)$, $v(0)$, and $\theta(0)$ are defined correspondingly. Notice that, by the assumption of the Theorem, (3.12) is satisfied by $Q^*$. After this case is proved, the statement in the theorem, with $Q = Q_{N_0}$, can be obtained by defining $h$ as
\[ h(y, t) = \psi(x, t) e^{-i\theta} - Q_{N_0}(y) = (\psi e^{-i\theta} - Q^*) + (Q^* - Q_{N_0}) = h^*(y, t) + (Q^* - Q_{N_0}) = O(\varepsilon^{3/2}). \]
From now on, we may and will drop the superscript $*$ and assume
\[ \| P_S h_{\varepsilon,0} \| \le C c_0 \varepsilon^3, \tag{3.45} \]
where $\varepsilon$ is sufficiently small: $\varepsilon \le \varepsilon_{-1}$, with $\varepsilon_{-1}$ and $C$ depending only on the initial setting ($H$, $N_0$, $Q$, ...) but not on $W$ or $T$. Equation (3.45) will be used in (3.52) below. We note that the smallness of $c_0$ is only used to find a suitable $h^*(0)$. It is no longer needed in the sequel and hence $c_0$ is independent of $T$ and $W$. Also note that we may assume $\|h_{\varepsilon,0}\| \le c_0 \varepsilon^{1+\sigma}$ for $\sigma \in (0, 1/2]$. Then we replace $\varepsilon^{3/2}$, in the above argument, by $\varepsilon^{1+\sigma}$, and we get a similar conclusion, with (3.45) replaced by $\| P_S h_{\varepsilon,0} \| \le C c_0 \varepsilon^{2+2\sigma}$.
3.5.2. Equations on S. From now on, $C$ denotes a constant which may depend on the quantities ($\Phi$, $Q$, ...), but not on $W$ or $T$. Recall that we want to set $\dot\alpha = 0$ and $\dot\gamma = 0$ in (3.37) and (3.39), which yield equations for $a$ and $\omega$. From the definition of $G$ and the inner product relations $(xQ, \Gamma) = 0$ and $\kappa_1 = 1/(Q, \Gamma)$, we have
\[ \kappa_1 (\mathrm{Im}\,G, \Gamma) = -\omega - \kappa_1 (G_2, \Gamma), \qquad \text{where } G_2 = \Omega_0 Q + \Omega\,\mathrm{Re}\,h + \mathrm{Re}\,F(h). \tag{3.46} \]
Similarly, from $(Q, \nabla Q) = 0$ and $\kappa_2 = 1/(x_j Q, \partial_j Q)$, we have $\kappa_2 (\mathrm{Im}\,G, \nabla Q) = -a - \kappa_2 (G_2, \nabla Q)$. Therefore, in order to have $\dot\alpha = 0$ and $\dot\gamma = 0$, it suffices to set
\[ \omega = -\delta - \kappa_1 (G_2, \Gamma), \qquad a = -\kappa_2 (G_2, \nabla Q). \tag{3.47} \]
With this choice of $a$ and $\omega$, we have $\alpha(t) = \alpha(0)$, $\gamma(t) = \gamma(0)$; $\beta(t)$ and $\delta(t)$ are defined by solving the ODEs (3.38) and (3.40), i.e.,
\[ \delta(t) = \int_0^t \kappa_1 (\mathrm{Re}\,G(s), Q)\, ds + \delta(0), \tag{3.48} \]
\[ \beta(t) = \int_0^t \kappa_2 (\mathrm{Re}\,G(s), yQ)\, ds - \gamma(0) t + \beta(0). \tag{3.49} \]
Let $C_w = 1 + \|W\|_{W^{3,\infty}}$. Then $|\Omega_0(x, t)| \le C C_w \varepsilon^2 |y|^2$ (cf. (3.20)). Define
\[ \zeta(t) := |a(t)| + |\omega(t)| + \varepsilon^{1/2} \|h(t)\|_{H^1} + C_w \varepsilon^2. \]
(We would like to have that $\zeta(t) = O(\varepsilon^2)$ for $0 \le t \le T\varepsilon^{-1}$.) In the following we work in the time range $[0, t_1]$ where
\[ \zeta(t) \le C_* \varepsilon^2, \qquad \text{with } \varepsilon \le \varepsilon_0 \le (C_* + T + 100)^{-2} \tag{3.50} \]
holds. Here $C_* > C_w$ is a (large) constant to be determined later. Equation (3.50) is true for $t = 0$ if $C_*$ is sufficiently large with respect to $c_0$. Moreover, if $\zeta(s) < C_* \varepsilon^2$ for some $s < T\varepsilon^{-1}$, then (3.50) holds for a small time interval $[s, s + \delta s]$ by a local wellposedness result. Our goal is to show that Eq. (3.50) holds for $0 \le t \le T\varepsilon^{-1}$, by requiring $\varepsilon_0$ sufficiently small. Our strategy is to show that, indeed, $\zeta(t) \le \frac12 C_* \varepsilon^2$ if (3.50) holds. A local wellposedness result then guarantees that (3.50) holds for the whole time range. The quantities $\omega$ and $a$ are defined in terms of $G_2$ in (3.47); recall the definition of $G_2$, Eq. (3.46). The main term in their definitions comes from the leading term $\Omega_0 Q$ of $G_2$, and we have
\[ |(\Omega_0 Q, \Gamma)| + |(\Omega_0 Q, \nabla Q)| \le C C_w \varepsilon^2. \]
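Also
\[ |(\Omega(t)\,\mathrm{Re}\,h(t), \Gamma)| + |(\Omega(t)\,\mathrm{Re}\,h(t), \nabla Q)| \le C (C_w \varepsilon^2 + |a(t)| + |\omega(t)|) \|h\|_2 \le C \varepsilon^{-1/2} \zeta(t)^2. \]
From the assumption (1.27) on $\Phi$, we have for a general $\phi \in H^1$,
\[ \| (\Phi * |\phi|^2) \phi \|_{H^1} \le \| \Phi * |\phi|^2 \|_\infty \cdot \|\phi\|_{H^1} + \| (\nabla\Phi) * |\phi|^2 \|_{L^\infty} \cdot \|\phi\|_{L^2}. \]
From the Young inequality, we have
\[ \| \Phi * |\phi|^2 \|_\infty \le \|\Phi\|_{L^\infty} \| |\phi|^2 \|_{L^1} = \|\Phi\|_{L^\infty} \|\phi\|_{L^2}^2. \]
Similarly, we can bound the term with $\Phi$ replaced by $\nabla\Phi$. Thus we have proved that
\[ \|F(\phi)\|_{H^1} \le C \|\phi\|_{H^1}^2 + C \|\phi\|_{H^1}^3 \]
for some constant depending on $\Phi$. Hence we can bound $(F(h(t)), \Gamma)$ and $(F(h(t)), \nabla Q)$ by
\[ |(F(h(t)), \Gamma)| + |(F(h(t)), \nabla Q)| \le C \|h\|_{H^1}^2 + C \|h\|_{H^1}^3 \le C \varepsilon^{-1} \zeta(t)^2. \]
Under assumption (3.50), we have thus proved that
\[ |\omega(t)| + |a(t)| \le C C_w \varepsilon^2 + C \varepsilon^{-1} \zeta(t)^2. \tag{3.51} \]
To estimate $\beta$ and $\delta$, we note that $\mathrm{Re}\,G = (\Omega_0 + a \cdot y + \omega)\,\mathrm{Im}\,h + \mathrm{Im}\,F(h)$. Since we are only interested in the inner products of $\mathrm{Re}\,G$ with $Q$ or $yQ$, and $Q$ has exponential decay, we can treat $y$ to be of order one. Thus we have the bound
\[ |\beta(t)| + |\delta(t)| \le C c_0 (T + 1) \varepsilon^3 + \int_0^t ds\, \varepsilon^{-1} \zeta(s)^2, \tag{3.52} \]
where we have used (3.45) and $\varepsilon t \le T$.

3.6. Modified linear operator on M. It is important to observe that $\Omega$ is not bounded. In fact,
\[ \Omega = W(\varepsilon x) - W(\varepsilon r) - \varepsilon y \cdot \nabla W(\varepsilon r) + a \cdot y + \omega \]
\[ \phantom{\Omega} = O\big( \varepsilon^2 (y^2 + 1) \big) \tag{3.53} \]
\[ \phantom{\Omega} = O(1 + \varepsilon |y|), \tag{3.54} \]
depending on whether we use Taylor expansion. In either case $\Omega$ is not bounded. This makes the term $-i\Omega h$ in the nonlinear term $G$ hard to control, although the term $-i\Omega Q$ stays fine since $Q$ is localized. By a finite propagation speed estimate we will see that $\Omega$ is of order 1. However, $-i\Omega h$ still cannot be considered an error term. To overcome this difficulty, we will include this term in the linear operator.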
Explicitly, Eq. (3.41) for $h_M$ can be rewritten as
\[ \partial_t h_M = \big( L + P_M \tfrac{1}{i} \Omega \big) h_M + P_M \tilde G, \qquad \tilde G = -i\Omega(Q + h_S) - iF(h). \tag{3.55} \]
Hence we must consider the solution propagator $P(s, t)$ which solves the following problem: If $u(t) = P(s, t)\phi$, then $u$ is a solution of the equation
\[ \partial_t u(t) = \big( L + P_M \tfrac{1}{i} \Omega \big) u(t), \qquad u(s) = \phi. \]
We note that the operator $L + P_M \frac{1}{i} \Omega$ leaves $M$ and $S$ invariant; but we will primarily consider $P(s, t)$ on $M$. Now the equation for $h_M$ can be written as
\[ h_M(t) = \int_0^t P(s, t) P_M \tilde G(s)\, ds + P(0, t) h_M(0). \tag{3.56} \]
We decompose $P_M \tilde G$ into the sum of a main part, $\phi(s)$, and a remainder, $P_M G_3(s)$, where
\[ \phi(s) = P_M(-i\Omega(s) Q) = P_M(-i\Omega_0(s) Q), \qquad G_3 = -i\Omega h_S - iF(h). \]
The following lemma provides a basic estimate on the propagator $P(s, t)$.

Lemma 3.7. Assume (3.50) is true for $0 \le t \le T\varepsilon^{-1}$. For $\phi \in M$,
\[ \| P(s, t) \phi \|_{H^1} \le C_{10} \|\phi\|_{H^1} \]
for $0 \le s \le t \le T\varepsilon^{-1}$, where $C_{10} = e^{C C_w T}$ is independent of $\varepsilon$.

We shall prove this lemma in the next subsection. Assuming this lemma and recalling that $G_3(s)$ is of order $h^2 + h^3$, we can bound the contribution of $G_3(s)$ to $h_M$ by
\[ \Big\| \int_0^t P(s, t) P_M G_3(s)\, ds \Big\|_{H^1} \le C C_{10} \int_0^t \varepsilon^{-1} \zeta(s)^2\, ds. \]
The key observation is the following lemma, which takes into account cancellations in the time integration.

Lemma 3.8 (Cancellation). Assume (3.50) is true for $0 \le t \le T\varepsilon^{-1}$. Let $\phi \in M \cap X_3$. For $1 \ll t \le T\varepsilon^{-1}$, we have that
\[ \Big\| \int_0^t P(s, t) \phi\, ds \Big\|_{H^1} \le C_{12}\, t^{1/2} \|\phi\|_{X_3} \]
for a constant $C_{12} = C_{12}(W, T)$ independent of $\varepsilon$. Furthermore, for $\phi(t) : [0, T\varepsilon^{-1}] \to M \cap X_3$,
\[ \Big\| \int_0^t P(s, t) \phi(s)\, ds \Big\|_{H^1} \le C_{12}\, t^{1/2} \sup_s \|\phi(s)\|_{X_3} + C_{12}\, t \sup_{|s - \sigma| \le t^{1/2}} \|\phi(s) - \phi(\sigma)\|_{H^1}. \tag{3.57} \]
The space $X_3$ has been defined in (3.34).
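We also claim the following bounds on the main term $\phi(s) = P_M(-i\Omega_0(s) Q)$:
\[ \|\phi(s)\|_{X_3} \le C C_w \varepsilon^2, \qquad \|\phi(s) - \phi(\sigma)\|_{H^1} \le C C_w \varepsilon^3 |s - \sigma|. \tag{3.58} \]
We will prove the lemma and the claim in the next subsection. Assuming Lemma 3.8 and the claim, we get
\[ \Big\| \int_0^t P(s, t) P_M(-i\Omega_0(s) Q)\, ds \Big\|_{H^1} \le C_{12}\, t^{1/2}\, C C_w \varepsilon^2 + C_{12}\, t\, C C_w \varepsilon^3 t^{1/2} \le C_{13} \varepsilon^{3/2}, \]
where $C_{13} = C C_{12} C_w (T + 1)^{3/2}$. Hence, by (3.56), $h_M(t)$ is bounded by
\[ \|h_M(t)\|_{H^1} \le C_{13} \varepsilon^{3/2} + C C_{10} \int_0^t \varepsilon^{-1} \zeta(s)^2\, ds + C c_0 \varepsilon^{3/2}. \tag{3.59} \]
Recall that $\zeta(t) = |a(t)| + |\omega(t)| + \varepsilon^{1/2} \|h(t)\|_{H^1} + C_w \varepsilon^2$. Then we can combine all these bounds, (3.51), (3.52), and (3.59), to obtain the following estimate:
\[ \zeta(t) \le C (C_w + c_0 (1 + \varepsilon^{1/2} T) + C_{13}) \varepsilon^2 + C \varepsilon^{-1} \zeta^2(t) + C C_{10} \int_0^t \varepsilon^{-1} \zeta(s)^2\, ds \]
\[ \phantom{\zeta(t)} \le C \varepsilon^2 (C_w + c_0 (1 + \varepsilon^{1/2} T) + C_{13} + C_{10} C_*^2 \varepsilon (1 + T)) \le C \varepsilon^2 C_{14}, \]
where $C_{14} = C_w + 2c_0 + C_{13} + 1$, if we require $\varepsilon^{1/2} T \le 1$ and $C_{10} C_*^2 \varepsilon (1 + T) < 1$, in addition to assumption (3.50). We now choose $C_* = 2 C C_{14}$ and then $\varepsilon_0$ such that
\[ \varepsilon_0 \le (C_* + 100)^{-2}, \qquad \varepsilon_0^{1/2} T \le 1, \qquad C_{10} C_*^2 \varepsilon_0 (1 + T) < 1. \]
With these choices, we have proven that
\[ \zeta(t) \le \frac12 C_* \varepsilon^2 \tag{3.60} \]
under assumption (3.50) that $\zeta(t) \le C_* \varepsilon^2$. Suppose that $[0, t_1]$ is the maximal time interval such that (3.50) holds and $t_1 < T\varepsilon^{-1}$. Then the equality must hold at $t = t_1$ by local existence and continuity, and $\zeta(t)$ must be slightly less than $C_* \varepsilon^2$ for some $t < t_1$. This is a contradiction to (3.60) and hence (3.50) holds true for all $t \le T\varepsilon^{-1}$.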
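3.7. Proofs of lemmas.

3.7.1. Proof of Lemma 3.7. Here we prove that the flow given by $P(s, t)$ is bounded in $M$.

Proof. Let $u(t) = P(s, t)\phi \in M$, and $f(t) = \mathrm{Im}(Lu(t), u(t)) \ge 0$. Recall the second assertion of Lemma 3.4: It implies that $\hat f(t) = \mathrm{Im}(Lg(t), g(t))$, with $g(t) := e^{tL}\phi$, is constant. We claim that $f(t)$ does not grow very fast in $t$, for $s, t \in (0, T\varepsilon^{-1})$. More precisely, we will prove that
\[ \frac{d}{dt} f(t) \le C \varepsilon f(t), \]
which implies $f(t) \le C f(s)$, and hence Lemma 3.7 follows. We recall the third assertion of Lemma 3.4. In our case, $\partial_t u = Lu + P_M \frac{1}{i} \Omega u$, hence
\[ \frac12 \frac{d}{dt} f(t) = \mathrm{Im}(Lu, P_M \tfrac{1}{i} \Omega u) = \mathrm{Im}(Lu, -i\Omega u) - \mathrm{Im}(Lu, P_S(-i\Omega u)) \]
\[ \phantom{\frac12 \frac{d}{dt} f(t)} = \mathrm{Im}\, \frac12 \int \nabla \bar u\, (\nabla\Omega)\, u + O(\varepsilon^2 \|u\|_2^2) - \mathrm{Im}(Lu, P_S(-i\Omega u)). \]
If $\{e_j\}$ and $\{e^j\}$ denote dual bases of $S$, then $P_S(-i\Omega u) = \sum_j (e^j, -i\Omega u) e_j = \sum_j (i\Omega e^j, u) e_j$. Hence $\| P_S(-i\Omega u) \|_{H^1} \le C \varepsilon^2 \|u\|_{L^2}$, and
\[ \mathrm{Im}(Lu, P_S(-i\Omega u)) \le C \|u\|_{H^1} \cdot \| P_S(-i\Omega u) \|_{H^1} \le C \varepsilon^2 \|u\|_{H^1}^2. \]
Since $\|\nabla\Omega\|_\infty = \| \varepsilon \nabla W(\varepsilon x) - \varepsilon \nabla W(\varepsilon r) + a \|_\infty \le 2 C_w \varepsilon + C_* \varepsilon^2$ (with no $y$ dependence), the term $\mathrm{Im}\, \frac12 \int \nabla \bar u\, (\nabla\Omega)\, u$ dominates, and
\[ \frac{d}{dt} f(t) \le \mathrm{Im} \int \nabla \bar u\, (\nabla\Omega)\, u + C \varepsilon^2 \|u\|_{H^1}^2 \le (2 C_w \varepsilon + C \zeta(t)) \|u\|_{H^1}^2 \le C C_w \varepsilon f. \]
The last inequality follows from (3.50) and Lemma 3.3. Hence $f(t) \le e^{C C_w \varepsilon t} f(0) \le e^{C C_w T} f(0)$ for $t \le T\varepsilon^{-1}$. $\Box$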
3.7.2. Proof of Lemma 3.8. Next we prove the key cancellation lemma. The cancellation is due to oscillatory behavior in time. We first prove a variant of Lemma 3.8 for the original flow et L , which will help us to visualize the oscillation. Then we will prove a weaker result for the modified flow in Lemma 3.8. d ρ(t) = O(ε) in H 1 . (One such Suppose ρ(t) ∈ M satisfies ρ(t) = O(1) and dt −2 example is ε /0 (t)Q.) Then there is a C > 0 such that t e(t−s)L ρ(s)ds ≤ C(1 + εt). (3.61) 0
H1
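By Lemma 3.5, $L^{-1}$ is defined on $M$ and commutes with $e^{(t-s)L}$. Thus
\[ \int_0^t e^{(t-s)L}\rho(s)\,ds = \int_0^t \frac{d}{ds}\big[-e^{(t-s)L}\big] L^{-1}\rho(s)\,ds = \Big[-e^{(t-s)L}L^{-1}\rho(s)\Big]_0^t + \int_0^t e^{(t-s)L}L^{-1}\frac{d}{ds}\rho(s)\,ds \]
\[ = O(1) + \int_0^t e^{(t-s)L}\,O(\varepsilon)\,ds = O(1) + O(\varepsilon t), \quad \text{in } H^1. \]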
Here we have used Lemma 3.3. (Notice the analogy with the integration of $e^{it}$, which does not increase the order of $e^{it}$ because of its oscillation.) Now we prove the lemma.

Proof. Choose $\tau \sim t^{1/2} \gg 1$. We have
\[ \int_0^t P(s,t)\phi\,ds = \sum_j \int_{j\tau}^{(j+1)\tau} P(s,t)\phi\,ds = \sum_j P((j+1)\tau, t) \int_{j\tau}^{(j+1)\tau} P(s,(j+1)\tau)\phi\,ds. \]
We write each summand as
\[ (I) \equiv \int_{j\tau}^{(j+1)\tau} P(s,(j+1)\tau)\phi\,ds = \int_{j\tau}^{(j+1)\tau} e^{((j+1)\tau - s)L}\phi\,ds + \int_{j\tau}^{(j+1)\tau}\!\!\int_s^{(j+1)\tau} P(\sigma,(j+1)\tau)\,P_M\tfrac{1}{i}\Omega(\sigma)\,e^{(\sigma-s)L}\phi\,d\sigma\,ds \equiv (II) + (III). \]
We have
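$\|(II)\|_{H^1} \le C\|\phi\|_{H^1}(1 + \varepsilon\tau) \le C\|\phi\|_{H^1}$ by (3.61) and (3.50). For (III), since $\phi$ is localized, we expect it is not affected much by the large potential in $P_M\tfrac{1}{i}\Omega(\sigma)$ for large $y$. To prove this, we use the finite propagation speed estimate (3.35): for $s \in (0,\tau)$,
\[ \big\|P_M\tfrac{1}{i}\Omega(\cdot)e^{sL}\phi\big\|_{H^1} \le C\big\|\big(C_w\varepsilon^2y^2 + C_*\varepsilon^2(1+|y|)\big)e^{sL}\phi\big\|_{H^1} \]
\[ \le CC_w\varepsilon^2\big(\|(1+y^2)\phi\|_{H^1} + (1+s^2)\|\phi\|_{H^3}\big) + CC_*\varepsilon^2\big(\|(1+|y|)\phi\|_{H^1} + (1+s)\|\phi\|_{H^2}\big). \]
Hence
\[ \|(III)\|_{H^1} \le CC_{10}\varepsilon^2\tau^2\big[(C_w + C_*)\|(1+y^2)\phi\|_{H^1} + (C_w\tau^2 + C_*\tau)\|\phi\|_{H^3}\big] \le CC_{10}C_w\|\phi\|_{X_3}, \]
since $\tau^2\varepsilon < 2$ and $\varepsilon^{1/2}C_* \le 1$, see (3.50). Therefore $\|(I)\|_{H^1} \le C_{11}\|\phi\|_{X_3}$ with $C_{11} = C + CC_{10}C_w$ and
\[ \Big\|\int_0^t P(s,t)\phi\,ds\Big\|_{H^1} \le \sum_j C_{10}C_{11}\|\phi\|_{X_3} \le CC_{10}C_{11}\,t^{1/2}\|\phi\|_{X_3}. \]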
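Next, for a suitably localized function $\phi(t) \in M \cap X_3$,
\[ \Big\|\int_0^t P(s,t)\phi(s)\,ds\Big\|_{H^1} = \Big\|\sum_j P((j+1)\tau,t)\int_{j\tau}^{(j+1)\tau} P(s,(j+1)\tau)\big[\phi(j\tau) + \phi(s) - \phi(j\tau)\big]\,ds\Big\|_{H^1} \]
\[ \le \sum_j C_{10}C_{11}\|\phi(j\tau)\|_{X_3} + \sum_j C_{10}^2\,\tau \sup_{|s-\sigma|\le\tau}\|\phi(s) - \phi(\sigma)\|_{H^1} \]
\[ \le C_{12}\,t^{1/2}\sup_s\|\phi(s)\|_{X_3} + C_{12}\,t\sup_{|s-\sigma|\le t^{1/2}}\|\phi(s) - \phi(\sigma)\|_{H^1}, \]
with $C_{12} = CC_{11}^2 = C(1 + C_we^{CC_wT})^2$. □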
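This estimate is mainly used for $\phi(s) = P_M(-i\Omega_0(s)Q)$.

3.7.3. Proof of claim (3.58). We rewrite $\Omega_0$ in the form
\[ \Omega_0(x,t) = W(\varepsilon r + \varepsilon y) - W(\varepsilon r) - \nabla W(\varepsilon r)\cdot\varepsilon y \]
\[ = \int_0^1 \{\nabla W(\varepsilon r + u\varepsilon y)\cdot\varepsilon y\}\,du - \nabla W(\varepsilon r)\cdot\varepsilon y \]
\[ = \int_0^1\!\!\int_0^1 \nabla^2 W(\varepsilon r + vu\varepsilon y) : \varepsilon y \otimes u\varepsilon y\;dv\,du. \]
From the first line we have that $\|\nabla^3\Omega_0\|_\infty \le \varepsilon^3\|\nabla^3 W\|_\infty$. From the third line we obtain $\|\Omega_0e^{-\nu|y|}\|_\infty \le \varepsilon^2\|\nabla^2 W\|_\infty$. Hence, for $\phi(s) = P_M(-i\Omega_0(s)Q)$, we have that
\[ \|\phi(s)\|_{X_3} \le C\|\nabla^3\Omega_0\|_\infty + C\|\Omega_0e^{-\nu|y|}\|_\infty \le C\|W\|_{W^{3,\infty}}\,\varepsilon^2, \]
where the factor $e^{-\nu|y|}$ is due to the exponential decay of $Q$. Furthermore, since
\[ \big|\nabla^2 W(\varepsilon r(s) + vu\varepsilon y) - \nabla^2 W(\varepsilon r(\sigma) + vu\varepsilon y)\big| \le \sup|\nabla^3 W|\cdot\varepsilon|r(s) - r(\sigma)| \]
and $|r(s) - r(\sigma)| \le CC_*|s - \sigma|$ (note $|v(t)| \le CC_*$), we conclude that
\[ \|\phi(s) - \phi(\sigma)\|_{L^2} \le C\|\nabla^3 W\|_\infty C_*\varepsilon^3|s - \sigma|. \]
By rewriting $\nabla\Omega_0(x,t) = \int_0^1 \nabla^2 W(\varepsilon r(t) + u\varepsilon y)\cdot\varepsilon^2 y\,du$, we get the same bound for $\|\nabla[\phi(s) - \phi(\sigma)]\|_{L^2}$.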
4. Møller Wave Operator

In this section we prove Theorem 1.2. We assume for simplicity that the space dimension $n = 3$. All arguments can be modified easily to $n > 3$. In the main argument of this section, we assume $v_0 = 0$ and work with the profile $\xi_\infty = h_{as,0}$, with $\hat\xi_\infty(0) = 0$. At the end of this section we deal with general $v_0$ by applying a Galilei transform. In either case, we have $h_{as,0}(x) = \xi_\infty(x)e^{iv_0\cdot x}$, and $\hat h_{as,0}(v_0) = \hat\xi_\infty(0) = 0$.

4.1. Dynamical linearization. We recall the Hartree equation
\[ i\partial_t\psi = -\frac12\Delta\psi - (\Phi * |\psi|^2)\psi, \]
and the equation for the ground state $Q$,
\[ -\frac12\Delta Q - (\Phi * Q^2)Q = EQ. \]
We consider solutions of the Hartree equation of the form $\psi = (Q(y) + h(y,t))e^{i\theta(y,t)}$, where
\[ y = x - r(t), \qquad \dot r(t) = v(t), \qquad \dot v(t) = a(t), \]
\[ \theta(y,t) = v(t)y - Et + \theta_1(t), \qquad \theta_1(t) = -\int_t^\infty \Big(\frac12 v^2 + \omega\Big)\,ds. \]
The argument here is the same as that in Subsect. 4.2, with $W \equiv 0$. We obtain the equation for $h$:
\[ \partial_t h = Lh + G(h), \]
where the linear part is
\[ L = \frac{1}{i}\Big\{-\frac12\Delta - E + A\Big\}, \qquad A(h) = -(\Phi * Q^2)h - Q\,\big(\Phi * (Q(h + \bar h))\big), \]
and the nonlinear part is
\[ G = \frac{1}{i}\{\Omega(Q + h) + F(h)\}, \qquad \Omega = \omega + ay, \qquad F(h) = -(\Phi * |h|^2)(Q + h) - \big(\Phi * [Q(h + \bar h)]\big)h. \]
We take projections of Eq. (4.1) onto S and M. The equations on S are 0 ˙ = −δ +κ1 (ImG, 0), Q : α ∇Q : β˙ = −γ +κ2 (ReG, yQ), 0 0 κ2 (ImG, ∇Q), yQ : γ˙ = 0 : δ˙ = κ1 (ReG, Q). 0
(4.1)
(See Proposition 4.2.) The equation on $M$ is
\[ \partial_t h_M = Lh_M + P_MG(h). \]
Next we consider the wave operator. Given a profile $\xi_\infty$ at $t = \infty$, we hope to find a function $h(y,t)$ such that $h(y,t) - e^{tL_0}\xi_\infty \to 0$, as $t \to \infty$, in a sense to be made more precise. Here $L_0 = -i\big(-\frac12\Delta - E\big)$ so that $L = L_0 - iA$. Our strategy is to write $h(\cdot,t) = \xi(\cdot,t) + g(\cdot,t)$, where $\xi(t)$ is the main term, which satisfies a linear equation and has the desired profile explicitly; $g(t)$ is an error and converges to zero, as $t \to \infty$, in a suitable sense. In view of the equation for $h$, we would like $\xi$ to satisfy the linear equation
\[ \partial_t\xi = L\xi + P_MJ\xi, \qquad \xi(t) \in M, \tag{4.2} \]
with $\xi(t) \to e^{tL_0}\xi_\infty$, as $t \to \infty$. The operator $J$ is a modification of the multiplication operator $-i\Omega$ and is to be defined later in (4.9). Define the propagator $P(s,t)$ such that $u(t) := P(s,t)\phi$ solves the equation
\[ \partial_t u = Lu + P_MJu, \qquad u(s) = \phi \in M. \]
Clearly, $P(s,t)$ leaves the space $M$ invariant so that $u \in M$. Note that $t < s$ in this section, cf. Sect. 3. We define $\xi$ to be given by
\[ \xi(t) = P_Me^{tL_0}\xi_\infty - \int_t^\infty P(s,t)\,P_M\Big[\frac{1}{i}A + P_MJ(s)\Big]e^{sL_0}\xi_\infty\,ds. \tag{4.3} \]
We have that $\xi \in M$, by definition, and that $\xi$ satisfies (4.2) (differentiate (4.3) and use that $[L, P_M] = 0$). We shall prove later on that
\[ \xi(t) \to e^{tL_0}\xi_\infty \quad \text{in } L^2, \text{ as } t \to \infty, \tag{4.4} \]
under the assumption
\[ \hat\xi_\infty(0) = 0. \tag{4.5} \]
The potential $\Omega = \omega + ay$ is unbounded and complicates the analysis. One may prove certain finite propagation speed estimates, so that $y$ is effectively cut off, as in Sect. 3. Alternatively, we can modify the form of $\psi$ so that the unbounded potential is cut off. We shall follow the second option in this section. Specifically, we would like $h$ not to "see" the fast phase change $vy$ in $\theta$ when $y$ is large. Let $\chi(\cdot)$ be a smooth cutoff function with $\chi(x) = 1$, for $|x| \le 1$, and $\chi(x) = 0$, for $|x| \ge 2$. We consider $\psi$ of the form
\[ \psi = Q(y)e^{i\theta} + h(y,t)e^{i(\chi vy - Et + \theta_1)} = \big[Q + \mu^{-1}h\big]e^{i\theta}, \]
where $\theta = vy - Et + \theta_1$, $\mu = \exp(i(1-\chi)vy)$, $y = x - r(t)$ and $\chi = \chi(C_*y/t)$ ($C_* > 0$ is a constant to be chosen later). Then $\mu^{-1}h$ satisfies (4.1):
\[ \partial_t(\mu^{-1}h) = L(\mu^{-1}h) - i\Omega(Q + \mu^{-1}h) - iF(\mu^{-1}h). \tag{4.6} \]
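Now $\mu\partial_t(\mu^{-1}h) = \partial_th + h\,\partial_t(-i(1-\chi)vy)$ and $\partial_t(-i(1-\chi)vy) =: -i(ay + J^{(1)})$, where
\[ J^{(1)} = -\chi ay - (1-\chi)v^2 + (\nabla\chi)(vt + y)\,t^{-2}vy. \]
Also $\mu L(\mu^{-1}h) = Lh + \mu[L,\mu^{-1}]h$. Explicit computation gives
\[ \mu\nabla\mu^{-1} = i\big[-(1-\chi)v + (\nabla\chi)t^{-1}vy\big] = iJ^{(2)}, \qquad \mu\Delta\mu^{-1} = -(J^{(2)})^2 + i\nabla\cdot J^{(2)}, \]
\[ \nabla\cdot J^{(2)} = 2(\nabla\chi)\cdot t^{-1}v + (\Delta\chi)\,t^{-2}vy. \]
Recall that $L = -i(-\Delta/2 - E + A)$. Thus,
\[ \mu[L,\mu^{-1}]h = \frac{i}{2}\mu[\Delta,\mu^{-1}]h - i\mu[A,\mu^{-1}]h = \Big[-\frac{i}{2}(J^{(2)})^2 - \frac{\nabla\cdot J^{(2)}}{2}\Big]h - J^{(2)}\cdot\nabla h + J^{(3)}h, \]
where
\[ J^{(3)}h := -i\mu[A,\mu^{-1}]h = i\mu Q\,\Phi * [Q(\mu^{-1}h + \mu\bar h)] - iQ\,\Phi * [Q(h + \bar h)]. \]
This yields the following equation for $h$:
\[ \partial_th = Lh + Jh - i\Omega\mu Q - i\mu F(\mu^{-1}h), \tag{4.7} \]
where
\[ Jh = \Big[-i\omega + iJ^{(1)} - \frac{i}{2}(J^{(2)})^2 - \frac{\nabla\cdot J^{(2)}}{2}\Big]h - J^{(2)}\cdot\nabla h + J^{(3)}h. \]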
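The conjugation identity behind the formula for $\mu[\Delta,\mu^{-1}]h$ can be sanity-checked numerically. The sketch below is an illustration only: it works in one space dimension, and the phase `phase(y) = 0.5*sin(y)` (standing in for $(1-\chi)vy$) and the test function `h` are hypothetical choices, not taken from the text. With $\mu = e^{ig}$ and $J^{(2)} = -g'$, it compares $\mu\,\partial_y^2(\mu^{-1}h)$ against $h'' + 2iJ^{(2)}h' + \big(-(J^{(2)})^2 + i\,\partial_yJ^{(2)}\big)h$ using central finite differences.

```python
import cmath
import math

STEP = 1e-3  # finite-difference step (illustrative choice)


def phase(y):
    """Hypothetical real phase g(y); mu = exp(i*g) stands in for exp(i(1-chi)vy)."""
    return 0.5 * math.sin(y)


def J2(y):
    """J^(2) = -g'(y), so that mu * d/dy(mu^{-1}) = i * J^(2)."""
    return -0.5 * math.cos(y)


def dJ2(y):
    """Derivative of J^(2), i.e. -g''(y)."""
    return 0.5 * math.sin(y)


def h(y):
    """Hypothetical smooth, localized complex test function."""
    return math.exp(-y * y) * (1.0 + 1j * y)


def d1(f, y):
    """Central first difference of f at y."""
    return (f(y + STEP) - f(y - STEP)) / (2 * STEP)


def d2(f, y):
    """Central second difference of f at y."""
    return (f(y + STEP) - 2 * f(y) + f(y - STEP)) / STEP**2


def conjugated_laplacian(y):
    """Left side: mu * d^2/dy^2 (mu^{-1} h)."""
    mu = cmath.exp(1j * phase(y))

    def twisted(z):
        return cmath.exp(-1j * phase(z)) * h(z)

    return mu * d2(twisted, y)


def expanded_form(y):
    """Right side: h'' + 2i J h' + (-J^2 + i J') h."""
    jv = J2(y)
    return d2(h, y) + 2j * jv * d1(h, y) + (-jv * jv + 1j * dJ2(y)) * h(y)


def max_mismatch(points):
    """Largest absolute difference between the two sides over the given points."""
    return max(abs(conjugated_laplacian(y) - expanded_form(y)) for y in points)


if __name__ == "__main__":
    grid = [k / 10 for k in range(-20, 21)]
    print(max_mismatch(grid))
```

The mismatch should be at the level of the finite-difference truncation error. Subtracting $h''$ and multiplying by $i/2$ reproduces the bracketed multiplication term and the $-J^{(2)}\cdot\nabla h$ term in $\mu[L,\mu^{-1}]h$.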
Notice that $J$ depends on $\omega$, $a$ and $v$ with $\dot v(t) = a(t)$. Throughout the rest of this section we assume that there is a constant $C_*$ such that
\[ t^3|a(t)| + t^2|\omega(t)| \le C_*. \tag{4.8} \]
We shall prove later on that this assumption holds. Under this assumption, one finds that
\[ \|J^{(1)}\|_\infty \le O(t^{-2}), \quad \|J^{(2)}\|_\infty \le O(t^{-2}), \quad \|\nabla\cdot J^{(2)}\|_\infty \le O(t^{-3}), \quad \|J^{(3)}\|_\infty \le O(e^{-t}). \]
We write
\[ J = J_a + J_b\cdot\nabla + J_c, \tag{4.9} \]
with
\[ J_a = i\big[-\omega - \chi ay + (\nabla\chi)\,t^{-2}(vy)y\big], \qquad J_b = -J^{(2)} = -\big[-(1-\chi)v + (\nabla\chi)t^{-1}vy\big], \]
\[ J_c = -i(1-\chi)v^2 + i(\nabla\chi)v\,t^{-1}(vy) - \frac{i}{2}(J^{(2)})^2 - \frac{\nabla\cdot J^{(2)}}{2} + J^{(3)}. \]
Note that $J_b$ is real. Furthermore, the only appearance of $\mu$ in $J$ is in $J^{(3)}$, which is exponentially small. Assuming the bound (4.8) on $a$ and $\omega$, we can check the following bounds on $J$:
\[ \|J_a\|_\infty + \|J_b\|_\infty \le O(t^{-2}), \qquad \|J_c\|_\infty \le O(t^{-3}). \tag{4.10} \]
Once $J(t)$ is defined, so is $\xi(t)$ by (4.2). We can now use (4.7) and (4.2) to obtain an equation for $g := h - \xi$:
\[ \partial_tg = (L + J)g + P_SJ\xi - i\Omega\mu Q - i\mu F(\mu^{-1}(\xi + g)). \tag{4.11} \]
Let
\[ G^{(1)}_\mu := Jg_S - i\Omega(\mu - 1)Q - i\mu F(\mu^{-1}(\xi + g)). \tag{4.12} \]
Since $-i\Omega Q \in S$, we have that $P_MG = P_MJg_M + P_MG^{(1)}_\mu$, and the equation for $g$ on $M$ is
\[ g_M(t) = -\int_t^\infty P(s,t)\,P_MG^{(1)}_\mu\,ds. \tag{4.13} \]
Let
\[ G^{(2)}_\mu := Jg + P_SJ\xi - i\Omega(\mu - 1)Q - i\mu F(\mu^{-1}(\xi + g)). \tag{4.14} \]
(2)
Then PS G = −i/Q + PS Gµ , and the equations on S are 0 Q
∇Q
0 0 yQ
0 0
:
α˙ = −δ − ω +κ1 (ImG(2) µ , 0),
:
β˙ = −γ
:
γ˙ =
:
δ˙ =
+κ2 (ReG(2) µ , yQ), −a+κ2 (ImG(2) µ , ∇Q), κ1 (ReG(2) µ , Q).
Here we have used that κ1 (−/Q, 0) = −ω, κ2 (−/Q, ∇Q) = −a.
(4.15)
4.2. Bounds on the Propagator $P(s,t)$. The following lemma shows that $P(s,t)$ conserves the $H^1$-norm in $M$.

Lemma 4.1. Assume the bound (4.8). Then $P(s,t)$ is bounded in $M \cap H^k$, $k = 1, 2, 3$. More precisely, there is a constant $C$ such that for any $C_*$ (the bound in (4.8)), any $T \ge 1$, and any $\phi \in M \cap H^k$, we have
\[ \|P(s,t)\phi\|_{H^k} \le e^{CC_*/T}\|\phi\|_{H^k}, \qquad k = 1, 2, 3, \]
provided that $s, t \ge T$. (The larger $T$ is, the better the estimate.)

Proof. We first consider the case $k = 1$. Assume $u(t) \in M$, $\partial_tu(t) = Lu + P_MJ(t)u$, $u(s) = \phi$. Let $f(t) = \operatorname{Im}(Lu,u) \ge 0$. Then
\[ \frac{d}{2\,dt}f(t) = \operatorname{Im}(Lu, P_MJu) = \operatorname{Im}(Lu, Ju) - \operatorname{Im}(Lu, P_SJu). \]
Here we have used Lemma 4.3. Note $|(Lu, P_SJu)| \le CC_*t^{-2}\|u\|_{L^2}^2$ and $\operatorname{Im}(Lu, Ju) = C\operatorname{Re}(\Delta u, J_b\cdot\nabla u) + O(t^{-2})\|u\|_{H^1}^2$; also (recall $J_b$ is real)
\[ 2\operatorname{Re}(\Delta u, J_b\cdot\nabla u) = -\int J_b\cdot\nabla|\nabla u|^2 - 2\operatorname{Re}\int(\nabla\bar u\cdot\nabla)J_b\cdot\nabla u = \int(\nabla\cdot J_b)|\nabla u|^2 - 2\operatorname{Re}\int(\nabla\bar u\cdot\nabla)J_b\cdot\nabla u \le CC_*t^{-3}\|u\|_{H^1}^2. \tag{4.16} \]
Hence we have
\[ \frac{d}{dt}f(t) \le CC_*t^{-2}\|u\|_{H^1}^2 \le CC_*t^{-2}f(t). \tag{4.17} \]
Hence we get
\[ \big[\ln f\big]_s^t \le \big[-CC_*t^{-1}\big]_s^t \le CC_*T^{-1}. \]
In particular,
\[ \frac{f(t)}{f(s)},\ \frac{f(s)}{f(t)} \le e^{CC_*T^{-1}}. \]
Now we consider the case $k = 3$. The case $k = 2$ follows by interpolation. Let $u(t)$ be as above and $w = Lu \in M$. We have
\[ \partial_tw = Lw + LP_MJu = Lw + Jw + [L,J]u - LP_SJu. \]
This time we let $f_3(t) = \operatorname{Im}(Lw,w)$ and have
\[ \frac{d}{2\,dt}f_3(t) = \operatorname{Im}(Lw, Jw + [L,J]u - LP_SJu). \]
We have $|(Lw, LP_SJu)| \le CC_*t^{-2}\|w\|_2\|u\|_2$, and we already showed $|\operatorname{Im}(Lw,Jw)| \le CC_*t^{-2}\|w\|_{H^1}^2$ when we considered $f(t)$, see especially (4.16). Finally
\[ |\operatorname{Im}(Lw,[L,J]u)| = |\operatorname{Im}(Lw, -i(\nabla J_b)\cdot\nabla u + O(t^{-2})u)| \le CC_*t^{-3}\|w\|_{H^1}\|u\|_{H^2} + CC_*t^{-2}\|w\|_{H^1}\|u\|_{H^1}, \]
by integration by parts. Since $\|w\|_{H^1}^2$ is comparable with $f_3$, we conclude
\[ \frac{d}{dt}f_3(t) \le CC_*t^{-2}\Big[f_3(t) + \sqrt{f_3(t)}\,\|u(t)\|_{H^1}\Big] \le CC_*t^{-2}\big[f_3(t) + f(t)\big]. \]
Together with (4.17), we see $(f + f_3)$ satisfies the same inequality as in (4.17), and hence the same bound. Since $(f + f_3) \sim \|u(t)\|_{H^3}^2$, the lemma is proved. □

Remark. Due to the spatial cut-off in our Eq. (4.7), we do not need to prove a finite speed estimate for P (as we did in Lemma 4.6 for P) in order to prove the above lemma.

4.3. Estimates of $\xi$. We now estimate $\xi$ precisely. Recall (4.2) and (4.3), the equations of $\xi$. Our goal is to estimate the term $-\int_t^\infty P(s,t)P_M\big(\tfrac{1}{i}A + P_MJ(s)\big)e^{sL_0}\xi_\infty\,ds$. We need the following standard results on the free evolution.

Lemma 4.2 (Decay of $e^{it\Delta/2}$). Let $k > 0$ be a positive integer and assume $\nabla_p^m\hat\xi_\infty(0) = 0$ for all non-negative integers $m \le 2k - 2$. Then
\[ \Big|\nabla_x^ne^{it\Delta/2}\xi_\infty(x)\Big|_{x = O(1)} \le \frac{C}{t^{d/2+k}}\int(1 + |y|^{2k})\,|\nabla_y^n\xi_\infty(y)|\,dy, \tag{4.18} \]
for any integer $n \ge 0$.

Proof. We first consider the case $n = 0$. Write $r = \frac{i|x-y|^2}{2t}$. We have
\[ (e^{it\Delta/2}\xi_\infty)(x) = \frac{1}{(2\pi it)^{d/2}}\int e^{\frac{i|x-y|^2}{2t}}\xi_\infty(y)\,dy = \frac{1}{(2\pi it)^{d/2}}\int\Big[1 + r + \frac12r^2 + \cdots + \frac{1}{(k-1)!}r^{k-1} + O(r^k)\Big]\xi_\infty(y)\,dy. \]
Therefore, the conclusion of the lemma holds if $\int|x-y|^{2l}\xi_\infty(y)\,dy = 0$ for all $x$, for all $l < k$, which is true under the assumption of the lemma. For general $n$, we take the derivative first and then proceed as above. Note $\nabla_p^m\widehat{(\nabla_x^n\xi_\infty)}(0) = \nabla_p^m(p^n\hat\xi_\infty)(0) = 0$ for all $m \le 2k - 2$. □

We now use that $P(s,t)$ is bounded in $H^1$ (Lemma 4.1) to have
\[ \Big\|\int_t^\infty P(s,t)P_M\frac{1}{i}Ae^{sL_0}\xi_\infty\,ds\Big\|_{H^1} \le \int_t^\infty\Big\|P_M\frac{1}{i}Ae^{sL_0}\xi_\infty\Big\|_{H^1}ds. \]
From Lemma 4.2 with $k = 1$, the last term is bounded by
\[ \int_t^\infty\big\|e^{sL_0}\xi_\infty\big\|_{W^{1,\infty}(y\sim1)}\,ds \le \int_t^\infty s^{-5/2}\,ds \le Ct^{-3/2}. \]
Notice that this is the only place we use assumption (4.5). Now we recall $J = J_a + J_b\cdot\nabla + J_c$. Since $\|J_c(s)\|_\infty \le s^{-3}$, we have
\[ \Big\|-\int_t^\infty P(s,t)P_MJ_c(s)e^{sL_0}\xi_\infty\,ds\Big\|_{H^1} \le \int_t^\infty Cs^{-3}\,ds\,\|\xi_\infty\|_{H^1} \le Ct^{-2}\|\xi_\infty\|_{H^1}. \]
We now expand $P(s,t)$ once more to get
\[ -\int_t^\infty P(s,t)P_M(J_a + J_b\cdot\nabla)(s)e^{sL_0}\xi_\infty\,ds = \xi_J + \xi_{AJ} + \xi_{JJ}, \]
where
\[ \xi_J = -P_M\int_t^\infty e^{(t-s)L_0}P_M(J_a + J_b\cdot\nabla)(s)e^{sL_0}\xi_\infty\,ds, \]
\[ \xi_{AJ} = \int_t^\infty\!\!\int_t^sP(\sigma,t)P_M\frac{1}{i}Ae^{(\sigma-s)L_0}\,d\sigma\;P_M(J_a + J_b\cdot\nabla)(s)e^{sL_0}\xi_\infty\,ds, \]
\[ \xi_{JJ} = \int_t^\infty\!\!\int_t^sP(\sigma,t)P_MJ(s)e^{(\sigma-s)L_0}\,d\sigma\;P_M(J_a + J_b\cdot\nabla)(s)e^{sL_0}\xi_\infty\,ds. \]
Recall from J.-L. Journé, A. Soffer and C. D. Sogge [15],
\[ \big\|e^{is_0H_0}Ve^{is_1H_0}\big\|_{(L^1,L^\infty)} \le \frac{C\|\hat V\|_1}{(s_0 + s_1)^{d/2}}. \tag{4.19} \]
Suppose that we can neglect the second projection $P_M$ in the definition of $\xi_J$. Since $J_b\nabla e^{sL_0} = J_be^{sL_0}\nabla$, and we can write $J_a = -i\omega + J_{a2}$, $J_b = v + J_{b2}$, where $J_{a2}$ and $J_{b2}$ have compact supports and $\|\hat J_{a2}(s)\|_{L^1(p)} + \|\hat J_{b2}(s)\|_{L^1(p)} = O(s^{-2})$, the $L^\infty$-norm of the integrand of $\xi_J$ is bounded by $Ct^{-3/2}s^{-2}$. Integrating in $s$ we get
\[ \|\xi_J(t)\|_{L^\infty} \le Ct^{-3/2}\|\xi_\infty\|_{W^{1,1}}. \]
To handle the $P_M$, we simply use that $P_M = 1 - P_S$. Since $P_S$ is a projection onto local smooth functions, the same proof applies. We shall not repeat the argument to handle the projection $P_M$ later on. We can also bound $\xi_J(t)$ in the $H^1$ norm by brute force, as we dealt with $J_c$:
\[ \|\xi_J(t)\|_{H^1} \le Ct^{-1}\|\xi_\infty\|_{H^2}, \]
since $\xi_J$ involves only free evolution. We now use that $P(s,t)$ is bounded in $H^1$ to have
\[ \|\xi_{AJ}(t)\|_{H^1} \le \int_t^\infty\!\!\int_t^s\big\|Ae^{(\sigma-s)L_0}P_M(J_a + J_b\cdot\nabla)(s)e^{sL_0}\xi_\infty\big\|_{H^1}\,d\sigma\,ds. \]
From the definition of $A$, we have
\[ \big\|Ae^{(\sigma-s)L_0}P_M(J_a + J_b\cdot\nabla)(s)e^{sL_0}\xi_\infty\big\|_{H^1} \le \big\|e^{(\sigma-s)L_0}P_M(J_a + J_b\cdot\nabla)(s)e^{sL_0}\xi_\infty\big\|_{L^\infty} + \big\|\nabla e^{(\sigma-s)L_0}P_M(J_a + J_b\cdot\nabla)(s)e^{sL_0}\xi_\infty\big\|_{L^\infty}. \]
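Again we use (4.19) to have
\[ \big\|e^{(\sigma-s)L_0}P_M(J_a + J_b\cdot\nabla)(s)e^{sL_0}\xi_\infty\big\|_{L^\infty} \le \sigma^{-3/2}s^{-2}\|\xi_\infty\|_{W^{1,1}}. \]
Since $\nabla$ and $e^{(\sigma-s)L_0}$ commute, we can bound the term with $\nabla e^{(\sigma-s)L_0}$ in the same way by also using $\|\widehat{\nabla_yJ_{a2}}\|_{L^1(p)} + \|\widehat{\nabla_yJ_{b2}}\|_{L^1(p)} \le O(s^{-2})$. We conclude that
\[ \|\xi_{AJ}(t)\|_{H^1} \le \int_t^\infty\!\!\int_t^s\sigma^{-3/2}s^{-2}\,d\sigma\,ds\,\|\xi_\infty\|_{W^{2,1}} \le t^{-3/2}\|\xi_\infty\|_{W^{2,1}}. \tag{4.20} \]
Finally, we can bound $\xi_{JJ}(t)$ by
\[ \|\xi_{JJ}(t)\|_{H^1} \le \int_t^\infty\!\!\int_t^s\sigma^{-2}s^{-2}\,d\sigma\,ds\,\|\xi_\infty\|_{H^3} \le t^{-2}\|\xi_\infty\|_{H^3}. \tag{4.21} \]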
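As a worked check (not part of the original argument), the two double integrals in (4.20) and (4.21) can be evaluated in closed form for $t > 0$, which confirms the stated powers of $t$:
\[ \int_t^\infty\!\!\int_t^s\sigma^{-3/2}s^{-2}\,d\sigma\,ds = \int_t^\infty 2\big(t^{-1/2} - s^{-1/2}\big)s^{-2}\,ds = 2t^{-3/2} - \tfrac43t^{-3/2} = \tfrac23\,t^{-3/2}, \]
\[ \int_t^\infty\!\!\int_t^s\sigma^{-2}s^{-2}\,d\sigma\,ds = \int_t^\infty\big(t^{-1} - s^{-1}\big)s^{-2}\,ds = t^{-2} - \tfrac12t^{-2} = \tfrac12\,t^{-2}. \]
Both values are below $t^{-3/2}$ and $t^{-2}$ respectively, consistent with the right-hand sides of (4.20) and (4.21).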
Let $\xi(t) = \xi^{(0)}(t) + \xi^{(1)}(t) + \xi^{(2)}(t)$, where $\xi^{(0)}(t) = P_Me^{tL_0}\xi_\infty$, $\xi^{(1)} = \xi_J$ and $\xi^{(2)}(t)$ denotes the rest. Then we have proved that
\[ \|\xi^{(0)}(t)\|_{L^\infty} + \|\xi^{(1)}(t)\|_{L^\infty} \le Ct^{-3/2}, \qquad \|\xi^{(1)}(t)\|_{H^1} \le Ct^{-1}, \qquad \|\xi^{(2)}(t)\|_{H^1} \le Ct^{-3/2}, \tag{4.22} \]
with the constants depending on $\xi_\infty$. In fact, tracking the proof we see that, since $\nabla$ commutes with $e^{sL_0}$, we actually have
\[ \|\xi^{(0)}(t)\|_{W^{2,\infty}} + \|\xi^{(1)}(t)\|_{W^{2,\infty}} \le Ct^{-3/2}, \qquad \|\xi^{(1)}(t)\|_{H^2} \le Ct^{-1}, \qquad \|\xi^{(2)}(t)\|_{H^2} \le Ct^{-3/2}. \tag{4.23} \]
Of course we need to use a stronger norm for $\xi_\infty$. The following norm is sufficient:
\[ \|\xi_\infty\|_{H^4} + \|\xi_\infty\|_{W^{3,1}} + \|\xi_\infty\|_{W^{2,1}((1+x^2)dx)} \le C^{-1}C_*, \tag{4.24} \]
where $C_*$ is a small constant to be chosen in the next subsection.

4.4. Existence of $g$. In this section we construct the solution via a contraction mapping argument. After defining the map in Step 1, we show the following bounds in Step 2:
\[ t^2|\omega(t)| + t^3|a(t)| + t^2\|g(t)\|_{H^2} < C_*, \qquad (t > T) \tag{4.25} \]
provided that $\|\xi_\infty\| \le C^{-1}C_*$ with $C_* > 0$ sufficiently small (see (4.24)) and $T$ sufficiently large. Finally in Step 3 we show that the contraction mapping converges in the norm $t^2|\omega(t)| + t^3|a(t)| + t^2\|g(t)\|_{H^1}$ in the ball $t^2|\omega(t)| + t^3|a(t)| + t^2\|g(t)\|_{H^2} < C_*$. Notice that we use the $H^1$ norm for $g(t)$ in the contraction, which is weaker than the $H^2$ norm appearing in (4.25). Our approach is certainly not the shortest. Once a certain a priori bound is established, we can follow the standard existence construction by taking weak limits. This would avoid the proof of the contraction completely. Our approach, however, provides more information on the scattering operator.

STEP 1
We first define the map
\[ (\omega, a, g) \longrightarrow (\omega', a', g') \tag{4.26} \]
with the convention $g'_S = P_Sg'$ and $g'_M = P_Mg'$, and so on. Recall that $J(t)$ and $\xi(t)$, defined by (4.9) and (4.3) respectively, depend on $\omega$ and $a$. To solve the equation on $S$ (4.15), we first solve $\beta$ and $\delta$ from (4.15). Since we plan to solve the equation by iteration, we define (we think $\gamma = 0$)
\[ \delta'(t) = -\int_t^\infty\kappa_1\big(\operatorname{Re}G^{(2)}_\mu(s), Q\big)\,ds, \tag{4.27} \]
\[ \beta'(t) = -\int_t^\infty\kappa_2\big(\operatorname{Re}G^{(2)}_\mu(s), yQ\big)\,ds. \]
Instead of solving the equation for α and γ , we choose ω and a such that α˙ = γ˙ = 0. Therefore, we define ω , a to be ω = −δ + κ1 (ImG(2) µ , 0),
a =
(4.28)
κ2 (ImG(2) µ , ∇Q).
With this choice, the component of g in the S direction is simply gS (t) = β (t) ∇Q + δ (t) 00 . 0 Finally, the component on the M direction is given by ∞ P(s, t)PM G(1) gM (t) = − µ ds, t
(4.29)
where $G^{(1)}_\mu$ is defined in (4.12). Note that the definition of $P(s,t)$ depends on $a$ and $\omega$, and so does $\mu$. Our next step is to prove this map is bounded in a certain norm.

STEP 2
Suppose that $\|\xi_\infty\| \le C^{-1}C_*$ (see (4.24)) and
\[ t^2|\omega(t)| + t^3|a(t)| + t^2\|g(t)\|_{H^2} < C_*. \tag{4.30} \]
We will prove the following bound:
\[ t^2|\omega'(t)| + t^3|a'(t)| + t^2\|g'(t)\|_{H^2} < C_*/2 \tag{4.31} \]
provided that $C_*$ is sufficiently small. The last statement seems to be contradictory, as the norm is getting smaller after each iteration and we could drive the constant to zero. But this is impossible, as the constant in the estimate of $\xi_\infty$ remains unchanged. Indeed, the right hand side of the last bound depends mainly on the constant appearing in the estimate of $\xi_\infty$, i.e., in the inequality $\|\xi_\infty\| \le C^{-1}C_*$. Since $a(t)$ satisfies (4.30) and $a = \dot v$, $v = \dot r$, we have $|v(t)| \le CC_*t^{-2}$ and $|r(t)| \le CC_*t^{-1}$. We now estimate $\|\mu F(\mu^{-1}(\xi + g))\|_{H^2}$. By definition,
\[ \mu F(\mu^{-1}h) = -\big(\Phi * |h|^2\big)(\mu Q + h) - 2\big(\Phi * [Q\operatorname{Re}(\mu^{-1}h)]\big)h. \]
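Recall the decomposition and the estimate for $\xi$ (4.23) from Subsect. 4.3. Write $h = \xi + g = \xi^{(0)} + \xi^{(1)} + (\xi^{(2)} + g)$. Because of the bound (1.27) on $\Phi$,
\[ \big\|\big(\Phi * |h|^2\big)(\mu Q + h)\big\|_{H^2} \le \big\|\Phi * |h|^2\big\|_{W^{2,\infty}}\cdot\|\mu Q + h\|_{H^2} \]
\[ \le C\big\|\Phi * |\xi^{(0)} + \xi^{(1)}|^2\big\|_{W^{2,\infty}} + C\big\|\Phi * |\xi^{(2)} + g|^2\big\|_{W^{2,\infty}} \]
\[ \le C\|\Phi\|_{W^{2,1}}\cdot\big\|\xi^{(0)} + \xi^{(1)}\big\|_{W^{2,\infty}}^2 + C\|\Phi\|_{W^{2,\infty}}\cdot\big\|\xi^{(2)} + g\big\|_{H^2}^2 \le CC_*^2t^{-3}. \]
Since $\big(\Phi * [Q\operatorname{Re}(\mu^{-1}h)]\big)h$ is a local term by the presence of $Q$, using the bound (1.27) on $\Phi$ we have
\[ \big\|\big(\Phi * [Q\operatorname{Re}(\mu^{-1}h)]\big)h\big\|_{H^2} \le C\|h\|_{H^2(y\sim1)}^2 \le CC_*^2t^{-3}. \]
We conclude that
\[ \big\|\mu F(\mu^{-1}(\xi + g))\big\|_{H^2} \le CC_*^2t^{-3}. \]
From the bound of $J$ (4.10) and the assumption on the norm of $g$ (4.25), we have
\[ \|Jg_S(t)\|_{H^2} \le CC_*^2t^{-2-2}. \]
For any $f \in S$, we also have $|(f, Jg_M)| \le CC_*t^{-2}\|f\|_{H^1}\|g_M\|_{L^2} \le CC_*^2t^{-4}\|f\|_{H^1}$. Also, $|(f, P_SJ\xi)| \le CC_*^2t^{-2-3/2}$. Finally $-i\Omega(\mu - 1)Q$ is exponentially small in $t$. Hence we conclude that $|(f, G^{(2)}_\mu)| \le CC_*^2t^{-3}\|f\|$. Thus
\[ |\beta'(t)| + |\delta'(t)| \le \tfrac18C_*t^{-2}, \qquad |\omega'(t)| \le \tfrac18C_*t^{-2}, \qquad |a'(t)| \le \tfrac18C_*t^{-3}, \qquad \|g'_S(t)\|_{H^2} \le \tfrac18C_*t^{-2}, \]
provided that $C_*$ is sufficiently small. One can also easily check that
\[ \|g'_M(t)\|_{H^2} \le \int_t^\infty\big\|G^{(1)}_\mu(s)\big\|_{H^2}\,ds \le \int_t^\infty CC_*^2s^{-3}\,ds \le \tfrac18C_*t^{-2}. \]
The claim (4.31) is proved.

STEP 3

Given two data $(\omega_1, a_1, g_1)$ and $(\omega_2, a_2, g_2)$ we denote by $\delta$ their differences: $\delta\omega = \omega_1 - \omega_2$, $\delta a = a_1 - a_2$, $\delta g = g_1 - g_2$, $\delta g' = g'_1 - g'_2$, and so on. We also let
\[ \delta_0 = \sup_t\Big[t^2|\delta\omega(t)| + t^3|\delta a(t)| + t^2\|\delta g(t)\|_{H^1}\Big]. \tag{4.32} \]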
Note: different $a(t)$ gives different $\mu$ ($\mu = e^{i(1-\chi)vy}$), but $\chi$ is the same. Also, from the definition of $J$, we have $\|\delta J_a(t)\|_\infty + \|\delta J_b(t)\|_\infty + \|\delta J_c(t)\|_\infty \le C\delta_0t^{-2}$. Our goal is to estimate $t^2|\delta\omega'(t)| + t^3|\delta a'(t)| + t^2\|\delta g'(t)\|_{H^1}$. Recall the definition of $\omega'$, $a'$ and $g'$ from (4.27), (4.28) and (4.29). In order to estimate the difference of $\omega'$, $a'$ from two initial data, we need to control the difference $\delta G^{(2)}_\mu := G^{(2)}_{\mu,1} - G^{(2)}_{\mu,2}$, where $G^{(2)}_{\mu,k} := J_kg_k - i\mu_kF(\mu_k^{-1}(\xi_k + g_k))$, $k = 1, 2$. Here $\mu_k$, $\xi_k$ and $J_k$ denote the corresponding $\mu$, $\xi$ and $J$, $k = 1, 2$, and thus $\partial_t\xi_k = (L + P_MJ_k)\xi_k$. We shall first estimate $\delta\xi$, then $\delta F = \mu_1F(\mu_1^{-1}(\xi_1 + g_1)) - \mu_2F(\mu_2^{-1}(\xi_2 + g_2))$ and finally $\delta(Jg) := J_1g_1 - J_2g_2$ and $\delta(J\xi) := J_1\xi_1 - J_2\xi_2$. From the equation of $\xi$, $\delta\xi$ satisfies
\[ \partial_t\delta\xi = (L + P_MJ_1)\delta\xi + P_M(\delta J)\xi_2. \]
Since $\delta\xi(t) \to 0$ in $H^1$ as $t \to \infty$ (see (4.23)), we have
\[ \delta\xi = -\int_t^\infty P_1(s,t)P_M(\delta J(s))\xi_2(s)\,ds \]
in $H^1$. We now derive a bound on $\delta\xi$. The last term can be decomposed into two parts $A + B$ with
\[ A := -\int_t^\infty P_1(s,t)P_M(\delta J(s))P_Me^{sL_0}\xi_\infty\,ds, \qquad B := -\int_t^\infty P_1(s,t)P_M(\delta J(s))\big[\xi_2(s) - P_Me^{sL_0}\xi_\infty\big]\,ds. \]
Since $\|\delta J(s)\|_\infty \le C\delta_0s^{-2}$ and $\|\xi_2(s) - P_Me^{sL_0}\xi_\infty\|_{H^2} \le Cs^{-1}$ from (4.23), we can bound $B$ by
\[ \|B\|_{H^1} \le \int_t^\infty C\delta_0s^{-2}C_*s^{-1}\,ds = CC_*\delta_0t^{-2}. \]
We can bound $A$ exactly as in Subsect. 4.3. In other words, it can be written as a sum of three terms satisfying (4.23). More precisely, $A = A^{(0)} + A^{(1)} + A^{(2)}$ and
\[ \|A^{(0)}(t)\|_{W^{2,\infty}} + \|A^{(1)}(t)\|_{W^{2,\infty}} \le C\delta_0t^{-3/2}, \qquad \|A^{(1)}(t)\|_{H^2} \le C\delta_0t^{-1}, \qquad \|A^{(2)}(t)\|_{H^2} \le C\delta_0t^{-3/2}. \]
(In fact, $A^{(0)} = 0$.) Notice that the constants on the right hand side now have a $\delta_0$ factor. In particular, we can write $\delta\xi = (\delta\xi)_a + (\delta\xi)_b$ with $(\delta\xi)_a = A^{(0)} + A^{(1)}$ and $(\delta\xi)_b = A^{(2)} + B$ such that
\[ \|(\delta\xi)_a(t)\|_{W^{1,\infty}} \le C\delta_0t^{-3/2}, \qquad \|(\delta\xi)_b(t)\|_{H^1} \le C\delta_0t^{-3/2}. \tag{4.33} \]
From the definition of $\delta F$, we can bound $\delta F$ in terms of $\delta\xi$ and $\delta g$. (Note that $(\mu_1 - \mu_2)Q$ is exponentially small in $t$.) The previous bound on $\delta\xi$ and the bound (4.32) on $\delta g$ thus yield that
\[ \|\delta F\|_{H^1} \le CC_*\delta_0t^{-3}. \]
Also, $\delta(Jg) = (\delta J)g_1 + J_2(\delta g)$. Thus, for any $f \in S$ with $\|f\|_{H^1} \le 1$ we have $|(f, \delta[Jg])| \le CC_*\delta_0t^{-4}$. Similarly, $|(f, \delta[P_SJ\xi])| \le CC_*\delta_0t^{-7/2}$. Finally, $\delta(-i\Omega(\mu - 1)Q) \le CC_*t^{-2}e^{-Ct}$. We conclude for any $f \in S$ with $\|f\|_{H^1} \le 1$ that
\[ |(f, \delta G^{(2)}_\mu)| \le CC_*\delta_0t^{-3}. \]
Simple calculations then show that
\[ |\delta a'(t)| \le \tfrac18\delta_0t^{-3}, \qquad |\delta\omega'(t)| \le \tfrac18\delta_0t^{-2}, \qquad \|\delta g'_S(t)\|_{H^1} \le \tfrac18\delta_0t^{-2}, \]
provided that $C_*$ is sufficiently small. Finally, the equation of $g'_M$ (4.29) can be written explicitly as
\[ \partial_tg'_M = Lg'_M + P_M\big[J(g'_M + g_S) - i\Omega(\mu - 1)Q - i\mu F(\mu^{-1}(g + \xi))\big]. \]
Hence for $\delta g'_M = g'_{1,M} - g'_{2,M}$ we have
\[ \partial_t\delta g'_M = (L + P_MJ_1)\delta g'_M + P_M\big[(-\delta J)g'_{2,M} + \delta(Jg_S) - i\delta(\Omega(\mu - 1))Q - i\delta F\big]. \]
Since $(\delta g'_M)(t) \to 0$ as $t \to \infty$ in $H^1$, we can put it in integral form:
\[ (\delta g'_M)(t) = -\int_t^\infty P_1(s,t)P_M\big[(-\delta J)g'_{2,M} + \delta(Jg_S) - i\delta(\Omega(\mu - 1))Q - i\delta F\big]\,ds. \tag{4.34} \]
Therefore, we can bound the $H^1$ norm of $\delta g'_M$ by
\[ \|\delta g'_M\|_{H^1} \le C\int_t^\infty\big\|(-\delta J)g'_{2,M} + \delta(Jg_S) - i\delta(\Omega(\mu - 1))Q - i\delta F\big\|_{H^1}\,ds. \]
Since
\[ \|(\delta J)g'_{2,M}\|_{H^1} \le C\delta_0s^{-2}\|g'_{2,M}\|_{H^2} \]
(that is why we needed to prove a stronger bound for $g'$ in Step 2), together with the previous bounds on $\delta(Jg_S)$, $-i\delta(\Omega(\mu - 1))Q$ and $i\delta F$, we can bound the integrand by $C_*\delta_0s^{-3}$. Thus we have
\[ \|\delta g'_M\|_{H^1} \le \tfrac18\delta_0t^{-2} \]
provided that $C_*$ is sufficiently small.
Conclusion: For the case $v_0 = 0$, we have proved that
\[ t^2|\delta\omega'(t)| + t^3|\delta a'(t)| + t^2\|\delta g'(t)\|_{H^1} \le \delta_0/2 \]
under the assumptions (4.24), (4.30) and (4.32). Thus the map (4.26) is a contraction. Since (4.30) holds for a nonempty set of functions (including zero), we obtain a solution $(\omega, a, g)$, together with $\xi$. Furthermore, we have proved that
\[ t^2|\omega(t)| + t^3|a(t)| + t^2\|g(t)\|_{H^2} \le C_* \]
for $t$ greater than an absolute constant $T$. Hence $v(t) = -\int_t^\infty a(s)\,ds = O(t^{-2})$. Similarly $r(t) = O(t^{-1})$ and $\theta_0(t) = O(t^{-1})$. Also recall $y = x - r(t)$ and $h_{as,0} = \xi_\infty$. Therefore, by Taylor expansion,
\[ \psi(x,t) - \psi_{as}(x,t) = Q(y)e^{i(vy - Et + \theta_0)} + h(y,t)e^{i(\chi vy - Et + \theta_0)} - \big[Q(x)e^{-iEt} + e^{it\Delta/2}\xi_\infty(x)\big] = O(t^{-1}) \quad \text{in } H^2. \]
Note that our result is true for $t > T$. However, if we replace all previous estimates of the form $t^{-m}$ by $(t + T)^{-m}$, our contraction argument still holds. Hence Theorem 1.2 is proved for the case $v_0 = 0$. To conclude Theorem 1.2 for general $v_0$, we apply the following Galilei transform (boost):
\[ \psi(x,t) \longrightarrow \psi(x - v_0t, t)\,e^{i(v_0\cdot x - \frac12v_0^2t)}. \]
(Recall $h_{as,0}(x) = \xi_\infty(x)e^{iv_0\cdot x}$ and $\hat h_{as,0}(v_0) = \hat\xi_\infty(0) = 0$.) Also, for general $r_0$ we apply a translation, which does not require a change of assumption. The proof is complete.

Acknowledgements. We thank J. Bourgain, I.M. Sigal and T. Spencer for some useful discussions and comments. T. P. Tsai and H. T. Yau would like to acknowledge the support of the Center for Theoretical Sciences at Taiwan where part of this work was done.
References
1. Schrödinger, E.: Der stetige Übergang von der Mikro- zur Makromechanik. Die Naturwissenschaften 28, 664–669 (1926)
2. Kato, T.: Perturbation Theory for Linear Operators on Hilbert Space. Berlin–Heidelberg–New York: Springer-Verlag, 1980
3. Hepp, K.: The classical limit for quantum mechanical correlation functions. Commun. Math. Phys. 35, 265–277 (1974)
4. Ginibre, J. and Velo, G.: The classical field limit of nonrelativistic bosons, I. Ann. Phys. (NY) 128, 243–285 (1980); "· · · , II". Ann. Inst. H. Poincaré 33, 363–394 (1980); (see also: Ginibre, J. and Velo, G.: Commun. Math. Phys. 66, 37–76 (1979) and 68, 45–68 (1979))
5. Lieb, E.H., Seiringer, R. and Yngvason, J.: Bosons in a Trap: A Rigorous Derivation of the Gross-Pitaevskii Energy Functional. Los Alamos archive, math-ph/9908027
6. Fröhlich, J., Tsai, T.-P. and Yau, H.-T.: On a classical limit of quantum theory and the non-linear Hartree equation. To appear in the proceedings of the conference "Visions in Mathematics" (Tel Aviv, 1999). Special volume of GAFA. Basel: Birkhäuser, 2000
7. Ginibre, J. and Velo, G.: On a class of nonlinear Schrödinger equations with nonlocal interaction. Math. Z. 170, 109–136 (1980); Scattering theory in the energy space for a class of nonlinear Schrödinger equations. J. Math. Pure Appl. 64, 363–401 (1985); Scattering theory in the energy space for a class of Hartree equations. Preprint 1998
8. Soffer, A. and Weinstein, M.I.: Multichannel nonlinear scattering theory for nonintegrable equations I, II. Commun. Math. Phys. 133, 119–146 (1990); J. Diff. Eqns. 98, 376–390 (1992)
9. Ovchinnikov, Y.N. and Sigal, I.M.: Dynamics of localized structures. Physica A 261, 143–158 (1998); Ginzburg-Landau equation, I, general discussion. In: PDE's and their Applications. Seco, L. et al. (eds.). CRM Proceedings and Lecture Notes 12, 199–220 (1997)
10. Reed, M. and Simon, B.: Methods of modern mathematical physics, II, Fourier analysis, self-adjointness. New York, San Francisco, London: Academic Press, 1975
11. Schlein, B.: diploma thesis, ETH 1999
12. Pillet, C.-A. and Wayne, C.E.: Invariant manifolds for a class of dispersive, Hamiltonian partial differential equations. J. Diff. Equations 141, 310–326 (1997)
13. Weinstein, M.I.: Modulational stability of ground states of nonlinear Schrödinger equations. SIAM J. Math. Anal. 16, no. 3, 472–491 (1985)
14. Weinstein, M.I.: Lyapunov stability of ground states of nonlinear dispersive evolution equations. Comm. Pure Appl. Math. 39, 51–68 (1986)
15. Journé, J.-L., Soffer, A. and Sogge, C.D.: Decay estimates for Schrödinger operators. Comm. Pure Appl. Math. 44, no. 5, 573–604 (1991)

Communicated by A. Kupiainen
Commun. Math. Phys. 225, 275 – 304 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
From Invariant Curves to Strange Attractors

Qiudong Wang (1), Lai-Sang Young (2)

(1) Dept. of Math., University of Arizona, Tucson, AZ 85721, USA. E-mail: [email protected]
(2) Courant Institute of Mathematical Sciences, 251 Mercer St., New York, NY 10012, USA. E-mail: [email protected]

Received: 10 January 2001 / Accepted: 10 July 2001
Abstract: We prove that simple mechanical systems, when subjected to external periodic forcing, can exhibit a surprisingly rich array of dynamical behaviors as parameters are varied. In particular, the existence of global strange attractors with fully stochastic properties is proved for a class of second order ODEs.

Introduction

In the history of classical mechanics, dissipative systems received only limited attention, in part because it was believed that in these systems all orbits eventually tended toward stable equilibria (fixed points or periodic cycles). Evidence that second order equations with a periodic forcing term can have interesting behavior first appeared in the study of van der Pol's equation, which describes an oscillator with nonlinear damping. The first observations were due to van der Pol and van der Mark. Cartwright and Littlewood proved later that in certain parameter ranges, this equation had periodic orbits of different periods [CL]. Their results pointed to an attracting set more complicated than a fixed point or an invariant curve. Levinson obtained detailed information for a simplified model [Ln]. His work inspired Smale, who introduced the general idea of a horseshoe [Sm], which Levi used later to explain the observed phenomena [Li1].

A number of other differential equations with chaotic behavior have been studied in the last few decades, both numerically and analytically. Examples from the dissipative category include the equations of Lorenz [Lo, G, Ro, Ry, Sp, T, W], Duffing's equation [D, Ho], Lorentz gases acted on by external forces [CELS], and modified van der Pol type systems [Li2]. For a systematic treatment of the Lorenz and Duffing equations, see [GH]. While some progress has been made, the number of equations for which a rigorous global description of the dynamics is available has remained small.

This research is partially supported by a grant from the NSF
In this paper, we consider an equation of the form
\[ \frac{d^2\theta}{dt^2} + \lambda\Big(\frac{d\theta}{dt} - 1\Big) = \Phi(\theta)P_T(t), \]
where $\theta \in S^1$ and $\lambda > 0$. If the right side is set identically equal to zero, this equation represents the motion of a particle subjected to a constant external force which causes it to decelerate when its velocity exceeds one and to accelerate when it is below one. Independent of the initial condition, the particle approaches uniform motion in which it moves with velocity equal to one. To this extremely simple dynamical system, we add another external force in the form of a pulse: $\Phi$ is an arbitrary function, $P_T$ is time-periodic with period $T$, and for $t \in [0,T)$, it is equal to 1 on a short interval and 0 otherwise. We learned after this work was completed that a similar equation has been studied numerically in the physics literature by G. Zaslavsky.$^1$

We prove that the system above exhibits, for different values of $\lambda$ and $T$, a very rich array of dynamical phenomena, including
(a) invariant curves with quasi-periodic behavior,
(b) gradient-like dynamics with stable and unstable equilibria,
(c) transient chaos caused by the presence of horseshoes, with almost every trajectory eventually tending to a stable equilibrium, and
(d) strange attractors with SRB measures and fully stochastic behavior.

These results are new for the equation in question. As abstract dynamical phenomena, (a)–(c) are fairly well understood, and their occurrences in concrete models have been noted; see [GH]. The situation with regard to (d) is very different. The analysis that allows us to handle attractors of this type was not available until recently. To our knowledge, this is the first time a concrete differential equation has been proved analytically to have a global nonuniformly hyperbolic attractor with an SRB measure.$^2$ We regard Theorem 3, which discusses the strange attractor case, as the main result of this paper.
Our proof of Theorem 3 is based on [WY], in which we built a dynamical theory for a (general) class of attractors with one direction of instability and strong dissipation. In [WY], we identified a set of conditions which guarantees the existence of strange attractors with strong stochastic properties. The properties in question include most of the standard mathematical notions associated with chaos: positive Lyapunov exponents, positive entropy, SRB measures, exponential decay of correlations, symbolic coding of orbits, fractal geometry, etc. The occurrence of scenario (d) above is proved by checking the conditions in [WY]. For the convenience of the reader, we will recall these conditions as well as the package of results that follows once these conditions are checked.

Our purpose in writing this paper is not only to point out the range of phenomena that can occur when simple second order equations are periodically forced, but also to bring to the foreground the techniques that have allowed us to reach these conclusions in a relatively straightforward manner. These techniques are clearly not limited to the systems considered here. It is our hope that they will find applications in other dynamical systems, particularly those that arise naturally from mechanics or physics.

$^1$ Zaslavsky produced in [Z1] numerical evidence of strange attractors. He also discussed in [Z2] how this model can be viewed as a strong idealization of the turbulence problem.
$^2$ Levi proved in [Li1] the occurrence of scenario (c) for his modified van der Pol systems, not scenario (d) as is sometimes incorrectly reported.
From Invariant Curves to Strange Attractors
1. Statement of Results

1.1. Setting and assumptions. Consider the differential equation

$$\frac{d^2\theta}{dt^2} + \lambda\frac{d\theta}{dt} = \mu + \Phi(\theta)\,P_T(t), \tag{1}$$

where $\theta \in S^1$, $\lambda, \mu > 0$ are constants, $\Phi : S^1 \to \mathbb{R}$ is a smooth function, and $P_T$ has the following form: for some $t_0 < T$, $P_T$ satisfies $P_T(t) = P_T(t + T)$ for all $t$ and

$$P_T(t) = \begin{cases} 1 & \text{for } t \in [0, t_0], \\ 0 & \text{for } t \in (t_0, T). \end{cases}$$

As discussed in the introduction, (1) describes a simple mechanical system consisting of a particle moving in a circle subjected to an external time-periodic force. With $r = \frac{d\theta}{dt} - \frac{\mu}{\lambda}$, (1) is equivalent to

$$\frac{d\theta}{dt} = r + \frac{\mu}{\lambda}, \qquad \frac{dr}{dt} = -\lambda r + \Phi(\theta)\,P_T(t). \tag{2}$$
Let $F_T$ denote the time-$T$ map of (2), that is, the map that transforms the phase space $S^1 \times \mathbb{R}$ from time 0 to time $T$. Unless explicitly stated otherwise, when we write $F_T$, it will be assumed that $T$ is the period of the forcing. We set $\mu = \lambda$ for simplicity, and normalize the forcing term as follows: Given a function $\Phi_0 : S^1 \to \mathbb{R}$, we let $\Phi = \frac{1}{t_0}\Phi_0$, that is to say, the magnitude of this part of the force is taken to be inversely proportional to the duration of its action, and the proportionality constant is taken to be 1 for simplicity. Our analysis will proceed as follows:
* The function $\Phi_0$ is fixed throughout. With the exception of Theorem 2(b) (where more is assumed), the only requirements are that $\Phi_0$ is of class $C^4$ and all of its critical points are nondegenerate.
* We assume $t_0 < \frac{1}{10}\min\{\lambda^{-1}, K_0^{-2}\}$, where $K_0 = \max\{\|\Phi_0\|_{C^4}, 1\}$. Further restrictions on $t_0$ are imposed in each case as needed. (We do not regard $t_0$ as an important parameter and will assume it is as small as the arguments require.)
* The two important parameters are $\lambda$ and $T$. We will prove that (i) the properties of (1) are intrinsically different for $\lambda$ small and for $\lambda$ large, and (ii) for fixed $\lambda$, the properties of (1) depend quite delicately on the value of $T$.
To interpret our results correctly, the reader should keep in mind that the dynamical pictures described below are not the only ones that can occur, and it is possible to have combinations of them, such as sinks and strange attractors, on different parts of the phase space. Our aim here is to identify several important pure dynamics types, to indicate the nature and approximate locations of the parameter sets on which they occur, and to convey a sense of prevalence, meaning that these phenomena occur naturally and not as a result of mere coincidence.
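For readers who wish to experiment, the time-$T$ map of (2) can be approximated numerically. The following is a minimal sketch, not the construction used in the proofs. It assumes $\mu = \lambda$ and $\Phi = \Phi_0/t_0$ as above, identifies $S^1$ with $\mathbb{R}/\mathbb{Z}$ (an assumption of this sketch), integrates the kick on $[0, t_0]$ by a classical Runge–Kutta scheme, and uses the exact solution of the linear unforced system on $(t_0, T)$. The names `make_time_T_map`, `phi0` and `steps` are hypothetical.

```python
import math

def make_time_T_map(lam, T, t0, phi0, steps=200):
    """Return an approximation of F_T for system (2) with mu = lam, Phi = phi0 / t0."""
    def rhs(theta, r):
        # During the kick P_T = 1: dtheta/dt = r + 1, dr/dt = -lam*r + phi0(theta)/t0.
        return r + 1.0, -lam * r + phi0(theta) / t0

    def F(theta, r):
        h = t0 / steps
        for _ in range(steps):  # classical RK4 on [0, t0]
            k1 = rhs(theta, r)
            k2 = rhs(theta + 0.5 * h * k1[0], r + 0.5 * h * k1[1])
            k3 = rhs(theta + 0.5 * h * k2[0], r + 0.5 * h * k2[1])
            k4 = rhs(theta + h * k3[0], r + h * k3[1])
            theta += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
            r += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        # On (t0, T) the forcing is off, so (2) is linear and solved in closed form.
        s = T - t0
        decay = math.exp(-lam * s)
        theta += s + r * (1.0 - decay) / lam
        r *= decay
        return theta % 1.0, r

    return F
```

For instance, `make_time_T_map(0.1, 50.0, 0.01, lambda th: math.sin(2 * math.pi * th))` returns a map that can be iterated from any `(theta, r)`; the choice of $\Phi_0$ and of the parameter values here is purely illustrative.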
Q. Wang, L.-S. Young
1.2. Statements of theorems. The setting of Sect. 1.1 is assumed throughout. We consider the discrete-time system defined by the Poincaré map $F_T$. Precise meanings of some of the technical terms are given after the statements of the theorems. Theorem 3 is our main result. The scenarios presented in Theorems 1 and 2 are also integral parts of the picture.

Theorem 1 (Existence of invariant curves). Let $\lambda \ge 4K_0$ and $T \ge t_0 + \frac{2}{3}$. Then there is a simple closed curve $\Omega$ of class $C^4$ to which all the orbits of $F_T$ converge. Moreover, we have the following dichotomy:
(a) (Quasi-periodic attractors) Let $\Delta_0 = \{T : \rho(T) \in \mathbb{R} \setminus \mathbb{Q}\}$, where $\rho(T)$ is the rotation number of $F_T|_\Omega$. Then (i) $\Delta_0$ intersects every unit interval in $[\frac{2}{3}, \infty)$ in a set of positive Lebesgue measure, and (ii) the following hold for $T \in \Delta_0$: $F_T|_\Omega$ is topologically conjugate to an irrational rotation, and for every $z \in S^1 \times \mathbb{R}$, $\frac{1}{n}\sum_{i=0}^{n-1}\delta_{F_T^i z}$ converges weakly to $\mu$ where $\mu$ is the unique invariant probability measure on $\Omega$.
(b) (Periodic sinks and saddles) There is an open and dense subset $\Delta_1$ of $[t_0 + \frac{2}{3}, \infty) \setminus \bar\Delta_0$ such that for $T \in \Delta_1$, $F_T$ has a finite number of periodic sinks and saddles on $\Omega$. Every orbit of $F_T$ converges to one of these periodic orbits.

Theorem 2 is elementary; it uses standard techniques, and $\Phi_0$ is required only to be $C^2$. We include this result because the dynamical pictures described occur for a nontrivial set of parameters.

Theorem 2 (Convergence to stable equilibria).
(a) (Gradient-like dynamics) $\exists \lambda_0 < \max|\Phi_0'|$ such that $\forall \lambda > \lambda_0$, if $t_0$ is sufficiently small, then there are open intervals of $T$ for which $F_T$ has a finite number of periodic points all of which are saddles or sinks, and every orbit not on the stable manifold of a saddle tends to a sink.
(b) (Transient chaos) Assume $\Phi_0$ has exactly two critical points.
Then there exist intervals of $\lambda$ accumulating at 0 such that for each of these $\lambda$, if $t_0$ is sufficiently small, then there are open intervals of $T$ for which $F_T$ has a periodic sink and a “horseshoe”, i.e. a uniformly hyperbolic invariant set $\Lambda$ such that $F_T|_\Lambda$ is conjugate to a shift of finite type with positive topological entropy. Lebesgue-a.e. $z \in S^1 \times \mathbb{R}$ is attracted to the sink as $n \to \infty$.

Remarks. (i) The picture in Theorem 2(a) is more general than that in Theorem 1(b): there are no simple closed invariant curves in general (see Proposition 4.1). (ii) We describe the scenario in Theorem 2(b) as “transient chaos” for the following reasons: $\Lambda$ being an invariant set, points near it tend to stay near it for some period of time, mimicking the dynamics on $\Lambda$. This chaotic behavior, however, is transient, because $\Lambda$ has Lebesgue measure zero, and for a typical initial condition, the orbit eventually leaves $\Lambda$ behind and heads for a sink.

Our next result deals with a notion of chaos that is sustained through time. A compact, $F_T$-invariant set $\Omega \subset S^1 \times \mathbb{R}$ is called a global attractor for $F_T$ if for every $z \in S^1 \times \mathbb{R}$, $\mathrm{dist}(F_T^n z, \Omega) \to 0$ as $n \to \infty$. In order not to interrupt the flow of ideas, we postpone the technical definitions of some of the terms used in Theorems 2 and 3 to after the statements of both results. Here is our main result:
Theorem 3 (Strange attractors). For the parameters specified below, $F = F_T$ has a strange attractor, a description of which follows:

Relevant parameter set. There exist $\bar\lambda, \bar t_0 > 0$ such that for every $\lambda < \bar\lambda$ and $t_0 < \bar t_0$, there is a positive Lebesgue measure set $\Delta = \Delta(\lambda, t_0)$ in $T$-space for which the results of this theorem hold; $\Delta \subset [T_0, \infty)$ for some large $T_0$, and $\Delta$ meets every subinterval of $[T_0, \infty)$ of length $O(\lambda)$ in a set of positive Lebesgue measure.

Dynamical characteristics. Let $\lambda < \bar\lambda$, $t_0 < \bar t_0$, and $T \in \Delta(\lambda, t_0)$. Then $F = F_T$ has a global attractor $\Omega$ with the following dynamical properties:

(1) Hyperbolic behavior. $F|_\Omega$ is nonuniformly hyperbolic with an identifiable set $C \subset \Omega$ which is the source of all nonhyperbolic behavior. More precisely:
(a) $C = \cup_i C_i$ where $C_i$ is a Cantor set located near $(\theta, r) = (c_i, 0)$, $c_i$ being the critical points of $\Phi_0$; at each $z \in C$, stable and unstable directions coincide, i.e. there is a vector $v$ with $\|DF^n(z)v\| \to 0$ exponentially fast as $n \to \pm\infty$.
(b) Away from $C$ the dynamics is uniformly hyperbolic. More precisely, let $\Omega_\varepsilon := \{z \in \Omega : d_C(F^n z) \ge \varepsilon\ \forall n \in \mathbb{Z}\}$, where $d_C(\cdot)$ is a notion of distance to $C$. Then $\Omega$ is the closure of $\cup_{\varepsilon > 0}\Omega_\varepsilon$, $\Omega_\varepsilon$ is a uniformly hyperbolic invariant set for each $\varepsilon > 0$, and the hyperbolicity of $F|_{\Omega_\varepsilon}$ deteriorates (e.g. $\min \angle(E^u, E^s) \to 0$) as $\varepsilon \to 0$.

(2) Statistical properties.
(a) $F$ admits a unique SRB measure $\mu$ supported on $\Omega$.
(b) With the exception of a Lebesgue measure zero set of initial conditions, the asymptotic behavior of every orbit of $F$ is governed by $\mu$. More precisely, for Lebesgue-a.e. $z \in S^1 \times \mathbb{R}$, if $\varphi : S^1 \times \mathbb{R} \to \mathbb{R}$ is a continuous function, then $\frac{1}{n}\sum_{i=0}^{n-1}\varphi(F^i z) \to \int \varphi\, d\mu$ as $n \to \infty$.
(c) $(F, \mu)$ is ergodic, mixing, and Bernoulli.
(d) For every observable $\varphi : \Omega \to \mathbb{R}$ of Hölder class, the sequence $\varphi, \varphi \circ F, \varphi \circ F^2, \cdots, \varphi \circ F^n, \cdots$ viewed as a stochastic process with underlying probability space $(\Omega, \mu)$ has exponential decay of correlations and obeys the Central Limit Theorem.

(3) Symbolic coding and other geometric properties.
(a) Kneading sequences are well defined for all critical orbits, i.e. all orbits emanating from $C$.
(b) With respect to the partition defined by the fractal sets $C_i$, the coding of orbits in $\Omega$ is well defined and essentially one-to-one. More precisely, if $\sigma$ is the shift operator, then there is a closed subset $\Sigma \subset \prod_{-\infty}^{\infty}\{1, \cdots, s\}$ with $\sigma(\Sigma) \subset \Sigma$ and a continuous surjection $\pi : \Sigma \to \Omega$ such that $\pi \circ \sigma = F \circ \pi$; moreover, $\pi$ is one-to-one except over $\cup_{i=-\infty}^{\infty} F^i C$, where it is two-to-one. (In general, $(\Sigma, \sigma)$ is not a shift of finite type.)
(c) Let $h_{top}(F)$ denote the topological entropy of $F$, $N_n$ the number of cylinder sets of length $n$ in $\Sigma$ above, and $P_n$ the number of fixed points of $F^n$. Then

$$h_{top}(F) = \lim_{n \to \infty}\frac{1}{n}\log N_n = \lim_{n \to \infty}\frac{1}{n}\log P_n.$$

Moreover, $F$ has an invariant measure of maximal entropy.
For a more detailed description of the dynamics on these strange attractors, see [WY]. We review below the definitions and related background information for some of the technical terms used in the theorems. For more information on this material, see [KH] and [Y1].

A compact $F$-invariant set $\Lambda$ is called uniformly hyperbolic if the following hold: (1) The tangent space at every $x \in \Lambda$ splits into $E^u(x) + E^s(x)$ with $\min_{x \in \Lambda}\angle(E^u, E^s) > 0$; (2) this splitting is $DF$-invariant; and (3) there exist $C \ge 1$ and $\sigma < 1$ such that for all $x \in \Lambda$ and $n \ge 0$,

$$\|DF^n(x)v\| \le C\sigma^n\|v\| \ \text{ for all } v \in E^s(x), \qquad \|DF^{-n}(x)v\| \le C\sigma^n\|v\| \ \text{ for all } v \in E^u(x).$$

In Theorem 3(1)(b), not only does $\min \angle(E^u, E^s) \to 0$ as $\varepsilon \to 0$, we have $C \to \infty$ as well. This means the smaller $\varepsilon$, the longer it takes for the geometry of hyperbolic behavior to take hold.

An $F$-invariant Borel probability measure $\mu$ is called an SRB measure if $F$ has a positive Lyapunov exponent $\mu$-a.e. and the conditional measures of $\mu$ on unstable manifolds are equivalent to the Riemannian volume on these leaves. SRB measures are of physical relevance because they can be observed: in dissipative dynamical systems, all invariant probability measures are necessarily singular, but ergodic SRB measures with nonzero Lyapunov exponents have the property that there is a positive Lebesgue measure set of points $z$ for which $\frac{1}{n}\sum_{i=0}^{n-1}\varphi(F^i z) \to \int \varphi\, d\mu$ as $n \to \infty$ for every continuous function $\varphi$. Referring to the set of points $z$ above as the measure-theoretic basin of $\mu$, Theorem 3(2)(b) says that the measure-theoretic basin here is not just a positive Lebesgue measure set, it is, modulo a set of Lebesgue measure zero, the entire phase space. By a decomposition theorem for SRB measures with no zero exponents ([Le]), the uniqueness of $\mu$ implies that it is ergodic, and the mixing and Bernoulli properties are equivalent to $(F^n, \mu)$ being ergodic for all $n \ge 1$.
We say the dynamical system $(F, \mu)$ has exponential decay of correlations for Hölder continuous observables if given a Hölder exponent $\eta$, there exists $\tau = \tau(\eta) < 1$ such that for all $\varphi \in L^\infty(\mu)$ and $\psi : \Omega \to \mathbb{R}$ Hölder with exponent $\eta$, there exists $K = K(\varphi, \psi)$ such that

$$\left|\int (\varphi \circ F^n)\psi\, d\mu - \int \varphi\, d\mu \int \psi\, d\mu\right| \le K(\varphi, \psi)\tau^n \quad \text{for all } n \ge 1.$$

Finally, we say the Central Limit Theorem holds for $\varphi$ with $\int \varphi\, d\mu = 0$ if $\frac{1}{\sqrt{n}}\sum_{i=0}^{n-1}\varphi \circ F^i$ converges in distribution to the normal distribution, and the variance is strictly positive unless $\varphi \circ F = \psi \circ F - \psi$ for some $\psi$.

1.3. Illustrations. Figure 1 below shows the approximate location and shape of the invariant curve or strange attractor (corresponding to different values of $\lambda$ and $T$) for the time-$T$ map $F_T : S^1 \times \mathbb{R} \to S^1 \times \mathbb{R}$. Figure 2 explains the mechanisms behind the changes in the dynamical picture as $\lambda$ decreases. The straight line in (a) represents $\{r = 0\}$ in $(\theta, r)$-coordinates, and the subsequent pictures show the images of this line (or circle) at various times under the flow. Figure 2(b) shows the effect of the forcing; observe that it need not constitute a large perturbation. For $t \in (t_0, T]$, the forcing is turned off, and the system relaxes to a limit cycle with contraction rate $e^{-\lambda}$. Figure 2(d) shows the image of $\{r = 0\}$ for $\lambda > 1$ and $e^{-\lambda T}$ reasonably contractive; these parameters correspond to the existence of invariant curves. As $\lambda$ decreases, the effect of the shear term in (2) becomes more
Fig. 1. Left: Invariant curves, λ > 1; right: Strange attractor, λ ≪ 1
Fig. 2 (panel labels). (a) t = 0; (b) t = t0; (c) t0 < t < T, contraction $e^{-\lambda}$; (d) t = T, λ > 1; (e) t = T, λ decreasing; (f) t = T, λ decreasing further; (g) t = T, T ≫ 1, λ […]
[…] 0 and the fact that $\lambda t_0 < \frac{1}{10}$, we see immediately that the four terms above add up to $< 5K_0 t_0$. (ii)

$$\frac{\partial\theta}{\partial\theta_0} = 1 + A + B,$$
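where

$$A = \frac{1}{\lambda t_0}\int_0^t \Phi_0'(\theta)\,\frac{\partial\theta}{\partial\theta_0}\,(1 - e^{\lambda\tau})\,d\tau, \qquad B = \frac{1}{\lambda t_0}\,(1 - e^{-\lambda t})\int_0^t \Phi_0'(\theta)\,\frac{\partial\theta}{\partial\theta_0}\,e^{\lambda\tau}\,d\tau.$$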
Letting 71 = maxt≤t0 | ∂θ(t) ∂θ0 − 1| and recalling that t0 K0
1, so that each $W^s$-leaf is a $C^1$ segment joining the two boundary components of $A$. Moreover, $F$ maps each $W^s$-leaf strictly into a $W^s$-leaf, contracting length by a factor $< \frac{1}{10}$. It follows from this that $\Omega := \cap_{n > 0} F^n(A)$ is a compact set which meets each $W^s$-leaf in exactly one point. Part (b) of Lemma 4.2 follows immediately. Let $\gamma_0$ be the curve $\{r = 0\}$. Then the images $\gamma_n := F^n\gamma_0$ converge in the Hausdorff metric to $\Omega$, the center manifold of $F$. By Lemma 4.1(a), the tangent vectors to $\gamma_n$ have slopes between $\pm 1/4$ for all $n$. This proves that $\Omega$ is the graph of a Lipschitz function $g$ with Lipschitz constant $\le 1/4$. That $g$ is $C^4$ follows from the fact that $F$ is $C^4$ and standard graph transform arguments involving the Fiber Contraction Theorem. We refer the reader to [HPS]. □
4.1.2. Dynamics on invariant circles. For each $T$, let $\Omega_T$ be the simple closed curve left invariant by $F_T$. We introduce a family of maps $h_T : S^1 \to S^1$ as follows: For $\theta_0 \in S^1$, let $z$ be the unique point in $\Omega_T$ whose $\theta$-coordinate is $\theta_0$. Then $h_T(\theta_0) = \theta_1$, where $\theta_1$ is the $\theta$-coordinate of $F_T(z)$. Let $\rho(h_T)$ denote the rotation number of $h_T$. Since

$$\frac{d\theta_1}{dT} > 1 - e^{-\lambda(T - t_0)}|r(t_0)| > \frac{99}{100}, \tag{11}$$

it is an easy exercise to see that $T \mapsto \rho(h_T)$ is a continuous nondecreasing function with $\rho(h_{T+1}) \approx \rho(h_T) + 1$.

Case 1. $\rho(h_T) \in \mathbb{R} \setminus \mathbb{Q}$. By Denjoy theory, $h_T$ is topologically conjugate to the rigid rotation by $\rho(h_T)$, which is well known to admit only one invariant probability measure. This together with Lemmas 2.5 and 4.2(b) implies immediately the unique ergodicity of $F_T$. To prove that $\Delta_0$ in Theorem 1 has positive Lebesgue measure, we appeal to the following theorem of Herman:

Theorem ([He]). Let $\mathrm{Diff}^r_+(S^1)$ denote the space of $C^r$ orientation-preserving diffeomorphisms of $S^1$. Let $s \mapsto h_s \in \mathrm{Diff}^3_+(S^1)$ be $C^1$ and suppose that for some $s_0 < s_1$, $\rho(h_{s_0}) \ne \rho(h_{s_1})$. Then $\{s \in [s_0, s_1] : \rho(h_s) \in \mathbb{R} \setminus \mathbb{Q}\}$ has positive Lebesgue measure.

Case 2. $\rho(h_T) \in \mathbb{Q}$. We fix $p, q \in \mathbb{Z}^+$, $p, q$ relatively prime, and let $I$ be a connected component of $\{T : \rho(h_T) = \frac{p}{q}\}$ with nonempty interior. From (11), it follows that $\frac{d}{dT}(h_T^q(\theta_0)) > \frac{99}{100}$ for every $\theta_0$. Standard transversality arguments give an open and dense subset $\tilde I$ of $I$ such that for $T \in \tilde I$, the graph of $h_T^q$ is transversal to the diagonal of $S^1 \times S^1$. For $T \in \tilde I$, the fixed points of $h_T^q$ (in the order in which they appear on $S^1$) are alternately strictly repelling and strictly contracting. With the contraction normal to $\Omega_T$, they correspond to saddles and sinks respectively for $F_T$. This completes the proof of Theorem 1.
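The monotone dependence of the rotation number on $T$ can be explored numerically. The sketch below is only an illustration under stated assumptions: it does not use the actual maps $h_T$, but a hypothetical stand-in lift $x \mapsto x + T + \varepsilon\sin(2\pi x)$ on $\mathbb{R}$ (circle of length 1, $\varepsilon$ small so that the map is a diffeomorphism), and it estimates the rotation number as the average displacement of one orbit of the lift. The names `rotation_number` and `model_lift` are ours.

```python
import math

def rotation_number(lift, n=2000, x0=0.0):
    """Estimate the rotation number of a circle map from its lift to the real line."""
    x = x0
    for _ in range(n):
        x = lift(x)
    # Average displacement per iterate; the error is at most 1/n.
    return (x - x0) / n

def model_lift(T, eps=0.05):
    """Hypothetical stand-in for a lift of h_T: x -> x + T + eps*sin(2*pi*x)."""
    return lambda x: x + T + eps * math.sin(2 * math.pi * x)
```

Evaluating `rotation_number(model_lift(T))` on a grid of `T` values produces a nondecreasing staircase with plateaus at rational values, which is the qualitative picture behind Cases 1 and 2.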
4.2. Proof of Theorem 2. Our analysis will proceed as follows. Referring the reader to Sect. 2.1 for definitions and notation, we will argue that uniformly expanding invariant sets of fa translate directly into uniformly hyperbolic invariant sets of Ta,b for b sufficiently small. That being the case, to produce the phenomena described in Theorem 2, it suffices to produce the corresponding behaviors for fa . Furthermore, since uniformly expanding invariant sets are stable under perturbations, and fa is a small perturbation of fˆa for t0 m0 . Fix λ > m0 . Varying a (which corresponds to moving the graph of fˆa up and down), we see that there is an open set of a for which fˆa has a finite number of fixed points which are alternately repelling and attracting. For these a, it is a simple exercise to show that for sufficiently small t0 and b, FT = Ta,b has the gradient-like dynamics described in Theorem 2. More generally, if ρ(fˆa ) = pq , then the discussion q q above applies to fˆa unless fˆa = id.
Fig. 3 a,b. (Axis labels: (a) p1, c1, p2, c2, p1; (b) p1, c1, x1, p2, c2, p1.)
Gradient-like dynamics, in general, persist when λ drops below m0 . Intuitively, no simple closed invariant curve exists beyond this point because the unstable manifold of the saddle “turns around”. We provide a rigorous proof in a restricted context. Proposition 4.1. Suppose 0 has exactly two critical points and negative Schwarzian derivative. Then there exist intervals of λ, t0 and T for which FT has gradient-like dynamics but there are no smooth simple closed invariant curves. Proof. Let c1 and c2 denote the critical points of 0 . There is an interval of a0 such ˜ 0 , then ˜ 0 has exactly two zeros, at say p1 and p2 . Fix such an that if 0 = a0 + a0 . Without loss of generality, we assume p1 < c1 < p2 < c2 < p1 + 1 = p1 , and 0 (p1 ) > 0, 0 (p2 ) < 0. In the rest of the proof, for each λ we consider, let f = fˆa , ˜ 0 (s). Observe that p1 is a repelling fixed where a = − aλ0 mod 1, so that f (s) = s + λ1 point of f , p2 is an attractive fixed point of f , and f (c1 ) = f (c2 ) = 1. This discussion is valid for all λ. For large λ, f maps (c1 , c2 ) strictly into itself. (See Fig. 3(a).) This continues to be the case for some interval of λ below m0 . Since 0 < 0 on (c1 , c2 ), we have 1 − mλ0 < f < 1 on (c1 , c2 ), so there exist ε, ε > 0 and an interval L of λ below m0 for which f (c1 + ε, c2 − ε) ⊂ (c1 + 2ε, c2 − 2ε) and |f |(c1 +ε,c2 −ε) | < 1 − ε . (See Fig. 3(b).) Thus every point in (c1 + ε, c2 − ε) tends to p2 , and since every point in S 1 \ (c1 + ε, c2 − ε) eventually enters (c1 + ε, c2 − ε), we conclude that f and hence F = Ta,b have gradient-like dynamics for a as above and t0 and b suitably small. Let p˜ 1 and p˜ 2 denote the saddle and sink of F respectively. To prove the proposition, suppose F leaves invariant a smooth simple closed curve . Since it is not possible for all the points in an invariant circle to converge to the same point, must intersect the stable manifold of p˜ 1 . 
This implies p˜ 1 ∈ , and hence W u , the unstable manifold of p˜ 1 , must be contained in . Fix an orientation on , and let τ be a positively oriented tangent field on W u . To derive a contradiction, we will produce, for every ε1 > 0, two points z, z ∈ W u such that d(z, z ) < ε1 and τ (z) and τ (z ) point in opposite directions. By the negative Schwarzian property of 0 , f = 0 at exactly two points x1 < x2 in (c1 , c2 ). Move λ if necessary so xi = p2 , i = 1, 2. Without loss of generality, we
Fig. 4. (Labels: stable curves, p̃1, p1, x1, f(x1).)
may assume x1 ∈ (c1 , p2 ). The following two statements, which we claim are valid for suitable choices of t0 , a and b, clearly lead to the desired contradiction. (1) The right branch of W u is roughly horizontal until about f (x1 ), where it makes a sharp turn and doubles back for a definite distance, creating two roughly parallel segments with opposite orientation (see Fig. 4). (2) There exist pairs of points on these parallel segments joined by stable curves. Claims (1) and (2) follow from Lemma 4.3, which is a general result valid for any λ and any 0 (and not just the ones considered in this subsection). It is similar in spirit to Lemma 4.2 and has the same proof, which will be omitted. ¯ 0 , λ, δ, ε) 0, ∃b¯ = b( ¯ Let z = (r, θ ) ∈ A (which depends on b) be following hold for F = Ta,b with b < b. such that |fa (θ )| > δ. Then: (a) |s(v)| = O( bδ ) $⇒ |s(DFz v)| = O( bδ ) and |DFz v| > (1 − ε)δ|v|; (b) there exists C = C( 0 , λ) such that |s(DFz v)| > Cδ $⇒ |s(v)| > Cδ and |DFz v| b |v| = O( δ ). Claim (1) follows immediately from Lemma 4.3(a). Part (b) of this lemma implies that if a region of A misses the two rectangles {(r, θ ) : |f (θ )| < δ} in all of its forward iterates, then it is foliated by stable curves. Since f (p2 ) = 0, Claim (2) is easily arranged by choosing δ sufficiently small. " # 4.2.2. Transient chaos. We return to the family fˆa where λ is now assumed to be small. Let c1 and c2 be the critical points of 0 . Then fˆa has exactly two critical points s1 and s2 near c1 and c2 . Let a be fixed for now. As λ is varied, the critical values fˆa (s1 ) and fˆa (s2 ) move at rates ∼ λ1 in opposite directions. There exists, therefore, a sequence of λ for which they coincide. Observe that this sequence is independent of a. We now fix each of these λ and adjust a so that fˆa (s1 ) = s1 , where s1 is the critical point with the property that | 0 (c1 )| ≤ | 0 (c2 )|. 
We will show that for the $(\lambda, a)$-pairs selected above, $f = \hat f_a$ has the following properties: (i) it has a sink, and (ii) when restricted to the set of points that are not attracted to the sink, $f$ is uniformly expanding. By design, we have $f(s_1) = s_1$, which is therefore a sink, and $f(s_2) = s_1$. For $i = 1, 2$, let

$$\alpha_i = \frac{\sqrt{1.5}}{|\Phi_0''(c_i)|}\,\lambda \quad \text{and} \quad I_i = [s_i - \alpha_i,\ s_i + \alpha_i].$$

Lemma 4.4. Assume $\lambda$ is sufficiently small. Then (a) for $s \notin I_1 \cup I_2$, we have $|f'(s)| > \sqrt{1.4}$; (b) for $s \in I_1 \cup I_2$, we have $f^n s \to s_1$ as $n \to \infty$.

Proof. (a) We may assume for $s \notin I_1 \cup I_2$ that $|f'(s)| \ge |f'(s_i \pm \alpha_i)|$ for some $i$. Since this is $= \frac{1}{\lambda}|\Phi_0''(\xi_i)|\alpha_i$ for some $\xi_i \in I_i$, it is $> \sqrt{1.4}$. (b) First we check $f(I_i) \subset I_1$, $i = 1, 2$:

$$|f(s_i \pm \alpha_i) - f(s_i)| = \frac{1}{2\lambda}|\Phi_0''(\xi_i)|\alpha_i^2 \le \frac{1}{2\lambda}|\Phi_0''(\xi_i)| \cdot \frac{1.5\,\lambda^2}{|\Phi_0''(c_i)|^2} \le \frac{\lambda}{|\Phi_0''(c_i)|} \le \frac{\lambda}{|\Phi_0''(c_1)|} < \alpha_1.$$

A similar computation shows that $f$ restricted to $I_1$ is a contraction. □
Let $F = T_{a,b}$, where $\lambda$ and $a$ are near the ones selected above and $t_0$ and $b$ are sufficiently small. Let $B_i$, $i = 1, 2$, be the two components of $A \setminus \{(\theta, r) : \theta \in I_1 \cup I_2\}$. With $\lambda$ sufficiently small, $F$ wraps each $B_i$ around $A$ (in the horizontal direction) at least once, with $F(B_i)$ crossing completely $B_j$ every time they meet. This, on the topological level, is the standard construction of a horseshoe. Let

$$\Lambda := \{z \in A : F^n(z) \in B_1 \cup B_2 \ \ \forall n \in \mathbb{Z}\}.$$

With $b$ sufficiently small, the uniform hyperbolicity of $F|_\Lambda$ follows from Lemma 4.3. This completes the proof of Theorem 2.

5. Proof of Theorem 3

5.1. Conditions from [WY] for strange attractors. As explained in the introduction, the proof of Theorem 3 is obtained largely via a direct application of [WY] – provided the conditions in Sect. 1.1 of [WY] are verified. For the convenience of the reader, we give a self-contained discussion of these conditions here, modifying one of them to improve its checkability and adding a new one, (C4), to guarantee mixing. The notation in this section is that in [WY]. We consider a family of maps $T_{a,b} : A = S^1 \times [-1, 1] \to A$, where $a \in [a_0, a_1] \subset \mathbb{R}$ and $b \in B_0 \subset \mathbb{R}$, $B_0$ being any subset with 0 as an accumulation point.⁴ In this setup, $b$ is a measure of dissipation; our results hold for $b$ sufficiently small. We explain the role of the parameter $a$: For systems that are not uniformly hyperbolic, a scenario that competes with that of strange attractors and SRB measures is the presence of periodic sinks. In general, arbitrarily near systems with SRB measures, there are open sets of maps with sinks; proving directly the existence of an SRB measure for a given dynamical system requires information of arbitrarily high precision. We get around this problem by considering one-parameter families, in our case $a \mapsto T_{a,b}$, and by showing that if a family satisfies certain reasonable conditions, then a positive measure set of parameters with SRB measures is guaranteed. We now state our conditions on these families.

⁴ In [WY], $B_0$ is taken to be an interval but the formulation here is all that is used.
(C1) Regularity conditions.
(i) For each $b \in B_0$, the function $(x, y, a) \mapsto T_{a,b}(x, y)$ is $C^3$; and as $b \to 0$, these functions converge in the $C^3$ norm to $(x, y, a) \mapsto T_{a,0}(x, y)$.
(ii) For each $b \ne 0$, $T_{a,b}$ is an embedding of $A$ into itself, whereas $T_{a,0}$ is a singular map with $T_{a,0}(A) \subset S^1 \times \{0\}$.
(iii) There exists $K > 0$ such that for all $a, b$ with $b \ne 0$,

$$\frac{|\det DT_{a,b}(z)|}{|\det DT_{a,b}(z')|} \le K \quad \forall z, z' \in S^1 \times [-1, 1].$$

As before, we refer to $T_{a,0}$ as well as its restriction to $S^1 \times \{0\}$, i.e. the family of one-dimensional maps $f_a : S^1 \to S^1$ defined by $f_a(x) = T_{a,0}(x, 0)$, as the singular limit of $T_{a,b}$. The rest of our conditions are imposed on the singular limit alone. The second condition in [WY] is:

(C2) There exists $a^* \in [a_0, a_1]$ such that $f = f_{a^*}$ satisfies the Misiurewicz condition.

The Misiurewicz condition (see [M]) encapsulates a number of properties some of which are hard to check or not needed in full force. We propose here to replace it by (C2’), a set of conditions that is more directly checkable (although a little cumbersome to state). That the results in [WY] are valid when (C2) is replaced by (C2’) below is proved in Lemma A.1 in the Appendix.

(C2’) Existence of a sufficiently expanding map from which to perturb. There exists $a^* \in [a_0, a_1]$ such that $f = f_{a^*}$ has the following properties:
(i) There are numbers $c_1 > 0$, $N_1 \in \mathbb{Z}^+$, and a neighborhood $I$ of the critical set $C$ such that $f$ is expanding on $S^1 \setminus I$ in the following sense: (a) if $x, fx, \cdots, f^{n-1}x \notin I$, $n \ge N_1$, then $|(f^n)'x| \ge e^{c_1 n}$; (b) if $x, fx, \cdots, f^{n-1}x \notin I$ and $f^n x \in I$, any $n$, then $|(f^n)'x| \ge e^{c_1 n}$;
(ii) $f^n x \notin I$ $\forall x \in C$ and $n > 0$;
(iii) in $I$, the derivative is controlled as follows: (a) $|f''|$ is bounded away from 0; (b) by following the critical orbit, every $x \in I \setminus C$ is guaranteed a recovery time $n(x) \ge 1$ with the property that $f^j x \notin I$ for $0 < j < n(x)$ and $|(f^{n(x)})'x| \ge e^{c_1 n(x)}$.
Next we introduce the notion of smooth continuations. Let $C_a$ denote the critical set of $f_a$. For $x = x(a^*) \in C_{a^*}$, the continuation $x(a)$ of $x$ to $a$ near $a^*$ is the unique critical point of $f_a$ near $x$. If $p$ is a hyperbolic periodic point of $f_{a^*}$, then $p(a)$ is the unique periodic point of $f_a$ near $p$ having the same period. It is a fact that in general, if $p$ is a point whose $f_{a^*}$-orbit is bounded away from $C_{a^*}$, then for $a$ sufficiently near $a^*$, there is a unique point $p(a)$ with the same symbolic itinerary under $f_a$.

(C3) Conditions on $f_{a^*}$ and $T_{a^*,0}$.
(i) Parameter transversality. For each $x \in C_{a^*}$, let $p = f(x)$, and let $x(a)$ and $p(a)$ denote the continuations of $x$ and $p$ respectively. Then

$$\frac{d}{da}f_a(x(a)) \ne \frac{d}{da}p(a) \quad \text{at } a = a^*.$$
(ii) Nondegeneracy at “turns”.

$$\frac{\partial}{\partial y}T_{a^*,0}(x, 0) \ne 0 \quad \forall x \in C_{a^*}.$$

The following fact often facilitates the checking of condition (C3)(i):

Lemma 5.1 ([TTY], Sect. VII). Let $f = f_{a^*}$, and suppose $\sum_{n \ge 0}\frac{1}{|(f^n)'(fx)|} < \infty$ for all $x \in C$. Then

$$\sum_{k=0}^{\infty}\frac{[(\partial_a f_a)(f^k x)]_{a=a^*}}{(f^k)'(fx)} = \left[\frac{d}{da}f_a(x(a)) - \frac{d}{da}p(a)\right]_{a=a^*}.$$
The main conditions in [WY] are contained in (C1)–(C3) (or, equivalently, (C1), (C2’) and (C3)). The conclusions of Theorem 3, however, are more specific than those of [WY], which allow the co-existence of multiple ergodic SRB measures. We now introduce a fourth condition,⁵ which along with (C1)–(C3) implies the uniqueness of SRB measures and their mixing properties. This implication is proved in Lemma A.2 in the Appendix.

(C4) Conditions for mixing.
(i) $e^{c_1} > 2$ where $c_1$ is in (C2’).
(ii) Let $J_1, \cdots, J_r$ be the intervals of monotonicity of $f_{a^*}$, and let $P = (p_{i,j})$ be the matrix defined by

$$p_{i,j} = \begin{cases} 1 & \text{if } f(J_i) \supset J_j, \\ 0 & \text{otherwise.} \end{cases}$$

Then there exists $N_2 > 0$ such that $P^{N_2} > 0$.

The discussion in this subsection can be summarized as follows:

Theorem 3’. Assume $\{T_{a,b}\}$ satisfies (C1), (C2’), (C3) and (C4) above. Then for all sufficiently small $b > 0$, there is a positive measure set of $a$ for which $T_{a,b}$ has the properties in (1), (2) and (3) of Theorem 3.

We remark that [WY] contains a more detailed description of the dynamical picture than the statement of Theorem 3 and refer the interested reader there for more information. In the rest of this section the discussion pertains to the differential Eq. (1) defined in Sect. 1.1. All notation is as in Sect. 2.1. To prove Theorem 3, it suffices to verify that for the parameters in question, $T_{a,b}$ satisfies the conditions above. This is carried out in the next three subsections.

⁵ Condition (*) in Sect. 1.2 of [WY], the only condition in [WY] not implied by (C1)–(C3), is clearly contained in (C4).
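Condition (C4)(ii) is a finite combinatorial check once the 0–1 matrix $P$ is known. The following is a minimal sketch of such a check, assuming $P$ is given as a list of rows; it is an illustration and not part of the argument. The function name `positivity_exponent` is hypothetical, and the default search bound $(r-1)^2 + 1$ is Wielandt's classical bound on the exponent of a primitive $r \times r$ nonnegative matrix.

```python
def positivity_exponent(P, max_power=None):
    """Return the smallest N with P^N > 0 entrywise, or None if no such N is found."""
    r = len(P)
    if max_power is None:
        max_power = (r - 1) ** 2 + 1  # Wielandt's bound for primitive matrices
    Q = [row[:] for row in P]  # Q holds the 0-1 pattern of P^N
    for N in range(1, max_power + 1):
        if all(entry > 0 for row in Q for entry in row):
            return N
        # Boolean matrix product: pattern of P^(N+1) from the pattern of P^N.
        Q = [[1 if any(Q[i][k] and P[k][j] for k in range(r)) else 0
              for j in range(r)]
             for i in range(r)]
    return None
```

For example, `positivity_exponent([[1, 1], [1, 0]])` returns 2, while a permutation pattern such as `[[0, 1], [1, 0]]` yields `None`, so (C4)(ii) would fail for it.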
5.2. Verification of (C2’): Expanding properties. Among the conditions to be checked, (C2’), which guarantees a suitable environment from which to perturb, is arguably the most fundamental of the four. It is also the one that requires the most work. In this subsection, we will – after placing some restrictions on $\lambda$ and $t_0$ – show that (C2’) is valid for all $f_a$ for which (C2’)(ii) is satisfied. The existence of $a$ satisfying (C2’)(ii) is the topic of the next subsection.

Let $\bar x_1, \bar x_2, \cdots, \bar x_{k_1}$ be the critical points of $\Phi_0$, and let $k_2 = \min\{1, \frac{1}{2}\min_i|\Phi_0''(\bar x_i)|\}$. We fix $\varepsilon = \varepsilon(\Phi_0) > 0$ with the property that $|\bar x_i - \bar x_j| > 4\varepsilon$ for $i \ne j$ and $|\Phi_0''| > k_2$ on $\cup_i(\bar x_i - 2\varepsilon, \bar x_i + 2\varepsilon)$, and claim that by choosing $\lambda$ and $t_0$ sufficiently small, we may assume the following about $f_a$. Let $C$ denote the critical set of $f_a$, and let $C_\varepsilon$ denote the $\varepsilon$-neighborhood of $C$. Then
(i) $C = \{x_1, \cdots, x_{k_1}\}$ with $|x_i - \bar x_i| < \varepsilon$;
(ii) on $C_\varepsilon$, $|f_a''| > \frac{k_2}{\lambda}$.
To justify these claims, observe first that by taking $\lambda$ small enough, the critical set of $\hat f_a$ can be made arbitrarily close to that of $\Phi_0$. Second, by choosing $t_0$ sufficiently small (independent of $\lambda$), we can make $\|f_a - \hat f_a\|_{C^3} < \frac{\varepsilon_1}{\lambda}$ for $\varepsilon_1$ as small as we please (Lemma 2.3). These observations together with $\hat f_a'' = \frac{1}{\lambda}\Phi_0''$ imply (i) and (ii).

A number of other conditions will be imposed on $\lambda$; they will be specified as we go along. Some of these conditions are determined via an auxiliary constant $K > 1$ which depends only on $\Phi_0$ and which will be chosen to be large enough for certain purposes. Let $\sigma := 2k_2^{-1}K^3\lambda$. We assume $\frac{1}{2}\sigma < \varepsilon$, so that $|f_a'(x)| > K^3$ for $x \in C_\varepsilon \setminus C_{\frac{1}{2}\sigma}$. We also assume $\lambda$ is small enough that $|f_a'| > K^3$ outside of $C_\varepsilon$. Together these imply
(iii) $|f_a'| > K^3$ outside of $C_{\frac{1}{2}\sigma}$.
For simplicity of notation, we write $f = f_a$ in the rest of this subsection.

Lemma 5.2. Let $c \in C$ be such that $f^n(c) \notin C_\sigma$ $\forall n > 0$. Consider $x$ with $|x - c| < \frac{1}{2}\sigma$, and let $n(x)$ be the smallest $n$ such that $|f^n(x) - f^n(c)| > \frac{1}{3K_0}K^3\lambda$. Then $n(x) > 1$ and $|(f^{n(x)})'(x)| \ge k_3K^{n(x)}$ for some $k_3 = k_3(K_0, k_2)$.

Before giving the proof of this lemma, we first prove a distortion estimate.

Sublemma 5.1. Let $x, y \in S^1$ and $n \in \mathbb{Z}^+$ be such that $\omega_i$, the segment between $f^i x$ and $f^i y$, satisfies $|\omega_i| < \frac{1}{3K_0}K^3\lambda$ and $\mathrm{dist}(\omega_i, C) > \frac{1}{2}\sigma$ for all $i$ with $0 \le i < n$. Then

$$\frac{(f^n)'x}{(f^n)'y} \le 2.$$

Proof.

$$\log\frac{(f^n)'x}{(f^n)'y} = \sum_{i=0}^{n-1}\log\frac{f'(f^i x)}{f'(f^i y)} \le \sum_{i=0}^{n-1}\frac{|f'(f^i x) - f'(f^i y)|}{|f'(f^i y)|} \le \sum_{i=0}^{n-1}\frac{1 + \frac{K_0}{\lambda}}{K^3}\,|f^i x - f^i y| < \frac{1 + \frac{K_0}{\lambda}}{K^3}\sum_{i=0}^{n-1}\frac{1}{K^{3i}}\,|f^{n-1}x - f^{n-1}y|.$$

Assuming that $\frac{1}{\lambda}$ and $K$ are sufficiently large, this is $< \frac{1}{2}$. □
Proof of Lemma 5.2. First we show n(x) > 1. Given the location of x, we have K > |f x| = |f (ξ )||x − c| for some ξ between x and c. This implies |f x − f c| = which we may assume is
3K K 3 λ, it follows from Sublemma 5.1 that for some ξ1 , 0 1 1 |f (ξ1 )||x − c|2 · 2|(f n−1 ) (f c)| > K 3 λ. 2 3K0
(12)
Reversing the inequality at time n − 1 and using Sublemma 5.1 again, we have 1 1 1 K 3 λ. |f (ξ2 )||x − c|2 · |(f n−2 ) (f c)| < 2 2 3K0
(13)
Substituting the estimate for |(f n−1 ) (f c)| from (12) into 1 |(f n ) x| ≥ |f (ζ )||x − c| · |(f n−1 ) (f c)|, 2 we obtain |(f n ) x| ≥
1 |f (ζ )| 1 1 . K 3λ 2 |f (ξ1 )| 2K0 |x − c|
Now plug the estimate for |x − c| from (13) into the last inequality and use the lower bounds for |f (ξ2 )| and |(f n−2 ) (f c)| from (ii) and (iii) earlier on in this subsection. We arrive at the estimate k2 3(n−2) (ζ )| 1 3 3 1 |f n 3 λK |(f ) x| > K λ = constK 2 (n−2)+ 2 . 1 3 2 |f (ξ1 )| 3K0 4 3K K λ 0
The power to which K is raised is ≥ n for n ≥ 3. This completes the proof of Lemma 5.2. # " We have proved the following: Suppose fa has the property that each of its critical points c satisfies fan (c) ∈ Cσ for all n > 0. Then (C2’)(i) and (iii) hold for fa with I = C 1 σ . This follows from properties (ii) and (iii) in the first part of this subsection 2 and from Lemma 5.2.
5.3. Verification of (C2’): “Multiple Misiurewicz points”. The goal of this section is to show that for many values of the parameter $a$, $f_a$ has the property that its critical orbits (in strictly positive time) stay away from its critical set. Precise statements will be formulated later.

We remark that for the quadratic family $x \to 1 - ax^2$ or any other family with a single critical point, this is a trivial exercise: there are many periodic orbits or compact invariant Cantor sets $\Lambda$ disjoint from the critical set, and if changes in parameter correspond to the movement of $f_a(c)$ in a reasonable way, then there would be many parameters for which $f_a(c) \in \Lambda$. We call these parameters “Misiurewicz points”. For maps with more than one critical point, as circle maps necessarily have, the required condition is that all of the critical orbits are trapped in some invariant set $\Lambda$ away from $C$. This is clearly more problematic, especially with $\Lambda$ having measure zero. We call parameters with these properties “multiple Misiurewicz points”. Their existence and $O(\lambda)$-density within the family $\{f_a\}$ is the concern of this subsection.

Recall that $\sigma = 2k_2^{-1}K^3\lambda$ and $C_\sigma$ is the $\sigma$-neighborhood of $C$. Recall also from Sect. 5.2 that outside of $C_\sigma$, $|f_a'| > K^3$. We are looking for a parameter $a^*$ such that $f = f_{a^*}$ has the property that for all $c \in C$, $f^n c \notin C_\sigma$ $\forall n > 0$. Write $C = \{x_1, \cdots, x_{k_1}\}$ as before, and let $\Delta$ be a parameter interval. For $k = 1, 2, \cdots, k_1$ and $i = 1, 2, \cdots$, we introduce the curves of critical points
$$a \mapsto \gamma_i^{(k)}(a) := f_a^i(x_k), \quad a \in \Delta.$$
Observe that for all $k$, $\frac{d}{da}\gamma_1^{(k)} = 1$, and for all $i$,
$$\frac{d}{da}\gamma_{i+1}^{(k)}(a) = \frac{d}{da}\gamma_i^{(k)}(a)\,f_a'(\gamma_i^{(k)}(a)) + 1.$$
Thus if $\gamma_j^{(k)}(a) \notin C_\sigma$ for all $j \le i$ and $K$ is sufficiently large, then
$$\frac{d}{da}\gamma_{i+1}^{(k)}(a) \approx \frac{d}{da}\gamma_i^{(k)}(a)\,f_a'(\gamma_i^{(k)}(a)) \tag{14}$$
and
$$\left|\frac{d}{da}\gamma_{i+1}^{(k)}(a)\right| \ge \frac12 K^{3i}. \tag{15}$$
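The recursion for the parameter derivatives of the critical curves can be checked numerically. The following is a minimal sketch under stated assumptions: the family `f(a, x) = x + a + L*sin(2*pi*x)` (a lift of a circle map to the real line), the value `L = 1.0`, and all function names are hypothetical stand-ins chosen for illustration, not the maps $f_a$ of this paper. The only feature used is that $\partial_a f_a = 1$.

```python
import math

L = 1.0  # hypothetical kick amplitude for the illustrative family


def f(a, x):
    """Illustrative lift of a circle map with additive parameter a."""
    return x + a + L * math.sin(2 * math.pi * x)


def df(x):
    """Derivative of f with respect to x (independent of a)."""
    return 1.0 + 2 * math.pi * L * math.cos(2 * math.pi * x)


def critical_point():
    """One solution of df(x) = 0, i.e. cos(2*pi*x) = -1/(2*pi*L)."""
    return math.acos(-1.0 / (2 * math.pi * L)) / (2 * math.pi)


def critical_curve(a, x0, i):
    """Return (gamma_i(a), d gamma_i / da) for gamma_i(a) = f_a^i(x0), i >= 1.

    Uses d gamma_1 / da = 1 and
    d gamma_{i+1} / da = f'(gamma_i) * d gamma_i / da + 1.
    """
    g, dg = f(a, x0), 1.0
    for _ in range(i - 1):
        # Both right-hand sides are evaluated with the old value of g.
        g, dg = f(a, g), df(g) * dg + 1.0
    return g, dg


if __name__ == "__main__":
    x0 = critical_point()
    for i in (1, 2, 3):
        print(i, critical_curve(0.3, x0, i))
```

Comparing the second component with a centered finite difference in $a$ is a quick way to confirm the recursion for small $i$.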
We also have the following distortion estimate:

Sublemma 5.2. For $k = 1, 2, \cdots, k_1$ and $n \in \mathbb{Z}^+$, let $\Delta \subset [0, 1)$ be such that $\gamma_i^{(k)}(a) \notin C_\sigma$ for $i = 1, 2, \cdots, n - 1$. Assume that $|\gamma_{n-1}^{(k)}| \le \frac{1}{3K_0}K^3\lambda$. Then for all $a, a' \in \Delta$, we have
$$\frac{\left|\frac{d}{da}\gamma_n^{(k)}(a)\right|}{\left|\frac{d}{da}\gamma_n^{(k)}(a')\right|} \le 2.$$
Using (14) and (15), we see that the proof is entirely parallel to that of Sublemma 5.1 with slightly weaker estimates. We leave it as an exercise for the reader.

Let $d$ be the minimum distance between critical points. Choosing $\lambda$ sufficiently small, we may assume $6k_1\sigma$ 0.
Proof. We describe first an algorithm for selecting a sequence of intervals $\Delta_0 \supset \Delta_1 \supset \Delta_2 \supset \cdots$ so that $a^* \in \cap_i \Delta_i$ has the desired property: At step $n$, the $(k_1 + 1)$-tuple $(\Delta_n; i_{1,n}, i_{2,n}, \cdots, i_{k_1,n})$ is called an “admissible configuration” if $\Delta_n$ is a subinterval of $\Delta_0$, $i_{k,n} \le n$, and the following conditions are satisfied for each $k$:

(A1) $\gamma_i^{(k)}|_{\Delta_n} \cap C_\sigma = \emptyset$ for all $i \le i_{k,n}$;
(A2) for all $a, a' \in \Delta_n$, $\left|\frac{d}{da}\gamma_{i_{k,n}}^{(k)}(a)\right| \Big/ \left|\frac{d}{da}\gamma_{i_{k,n}}^{(k)}(a')\right| \le 2$;
(A3) (“minimum length condition”) $|\gamma_{i_{k,n}+1}^{(k)}|_{\Delta_n}| \ge 12k_1\sigma$.

Observe that (A3) is about the length of the critical curve one iterate later.

Let us first show that we have an admissible configuration for $n = 1$. Let $i_{k,1} = 1$ for all $k$. The parameter interval $\Delta_1$ is chosen as follows. Since $\frac{d}{da}\gamma_1^{(k)} = 1$, we have $|\gamma_1^{(k)}|_{\Delta_0}| = 6k_1\sigma$, so that $\gamma_1^{(k)}$ meets at most one component of $C_\sigma$ and $|(\gamma_1^{(k)})^{-1}C_\sigma| \le 2\sigma$. Even in the worst case scenario when all $k_1$ intervals $(\gamma_1^{(k)})^{-1}C_\sigma$ are evenly spaced, there exists an interval $\Delta_1 \subset \Delta_0$ with $|\Delta_1| = 2\sigma$ such that $\gamma_1^{(k)}|_{\Delta_1} \cap C_\sigma = \emptyset$ for all $k$. Conditions (A1) and (A2) are trivially satisfied, as is (A3) since $|\gamma_2^{(k)}|_{\Delta_1}| > 2\sigma K^3$, and $2K^3$ is assumed to be $> 12k_1$.

We now discuss how to proceed at a generic step, i.e. step $n$, assuming we are handed an admissible configuration $(\Delta_n; i_{1,n}, i_{2,n}, \cdots, i_{k_1,n})$. First, we divide the set $\{1, 2, \cdots, k_1\}$ into indices $k$ that are “ready to advance”, meaning the situation is right for the $k$-th curve to progress to the next iterate, and those that are not. Say $k \in A$ if

(A4) $|\gamma_{i_{k,n}}^{(k)}|_{\Delta_n}|$ $2K^3\sigma$.
Consider now $k \notin A$. Conditions (A1) and (A2) are inherited from the previous step, and (A3) is checked as follows: If $k \notin A$ because (A4) fails, then
$$|\gamma_{i_{k,n+1}}^{(k)}|_{\Delta_{n+1}}| \ge \frac12 \cdot \frac{1}{3k_1}\,|\gamma_{i_{k,n}}^{(k)}|_{\Delta_n}| \ge cK^3\lambda,$$
where $c$ is a constant independent of $K$ or $\lambda$. Notice that this uses only the distortion estimate from step $n$. One iterate later, this curve will have length $> cK^6\lambda$, which we may assume is $> 12k_1\sigma$. If (A4) holds but (A5) fails, then the distortion estimate holds for the next iterate, and
$$|\gamma_{i_{k,n+1}+1}^{(k)}|_{\Delta_{n+1}}| \ge \frac{1}{6k_1}\,|\gamma_{i_{k,n}+1}^{(k)}|_{\Delta_n}| \ge cd,$$
which we may also assume is $> 12k_1\sigma$. This completes the construction from step $n$ to step $n + 1$ when $A \ne \emptyset$.

If $A = \emptyset$, then we let $\Delta_n'$ be the left half of $\Delta_n$, and observe that the $(k_1 + 1)$-tuple $(\Delta_n'; i_{1,n}, i_{2,n}, \cdots, i_{k_1,n})$ is again admissible. To verify (A3), we fix $k$, and argue separately, as in the last paragraph, the two cases corresponding to (i) the failure of (A4) with respect to $\Delta_n$ and (ii) the failure of (A5) but not (A4). Repeat this process if necessary until $A \ne \emptyset$. □

5.4. Verification of (C1), (C3) and (C4). We now verify the remaining conditions in Sect. 5.1. Observe from the arguments below that (C1) and (C3)(ii) are quite natural for systems arising from differential equations, while (C3)(i) and (C4) are, to a large extent, consequences of the fact that the maps $f_a$ are sufficiently expanding.

Verification of (C1): Let $F_{t_0}$ denote the time-$t_0$-map of (2) (the period of the forcing continues to be $T$). Then (i) follows from the fact that $F_{t_0}$ has bounded $C^3$ norms on $S^1 \times [-1, 1]$; (ii) is obvious, and (iii) is a consequence of the fact that $\det(DF_T) = e^{-\lambda(T - t_0)}\det(DF_{t_0})$.

Verification of (C3): For (i), since $(\partial_a f_a)(\cdot) = 1$ and $|(f^k)'(fx)| \ge K^k$, Lemma 5.1 applies, and the quantity in question has absolute value $\ge 1 - \sum_{i \ge 1}\frac{1}{K^i} > 0$. Part (ii) is Lemma 2.3(i).

Verification of (C4): (i) is proved since $e^{c_1} = K > 2$. For (ii), by choosing $\lambda$ sufficiently small depending on 0, it is easily arranged that $p_{i,j} = 1$ for all $i, j$. This completes the proof of Theorem 3.

Appendix

We supply here the proofs of the two lemmas promised in Sect. 5.1. This appendix has to be read in conjunction with [WY].

Lemma A.4. All the theorems in [WY] remain valid if the Misiurewicz condition in Step I, Sect. 1.1, of [WY] is replaced by condition (C2’) in Sect. 5.1 of this paper.

Proof.
The three most important uses of the Misiurewicz condition in [WY] are:
– the nondegeneracy of the critical points (this is guaranteed by (C2’)(iii)(a));
– every critical orbit stays a fixed distance away from $C$ (this is precisely (C2’)(ii));
– there exist $c_0, c > 0$ such that for every critical point $x$, $|(f^n)'(fx)| > c_0e^{cn}$ (this is guaranteed by (C2’)(i) and (ii)).

These three properties aside, the only consequences of the Misiurewicz condition used in [WY] are contained in Lemma 2.5 of [WY]. Let $C_\delta$ denote the $\delta$-neighborhood of $C$. Then there exist $\hat c_0, \hat c_1 > 0$ such that the following hold for all sufficiently small $\delta > 0$: Let $x \in S^1$ be such that $x, fx, \cdots, f^{n-1}x \notin C_\delta$, any $n$. Then (i) $|(f^n)'x| \ge \hat c_0\delta e^{\hat c_1 n}$; (ii) if, in addition, $f^n x \in C_\delta$, then $|(f^n)'x| \ge \hat c_0 e^{\hat c_1 n}$.

We claim that the conclusions of this lemma also follow from (C2’). Let $n_1 < \cdots < n_q$, $0 \le n_1$, $n_q \le n$, be the times when $f^{n_i}x \in I$. Then
– $|(f^{n_1})'x| \ge e^{c_1 n_1}$ by (C2’)(i)(b);
– $|(f^{n_{i+1} - n_i})'(f^{n_i}x)| \ge e^{c_1(n_{i+1} - n_i)}$ by (C2’)(iii)(b) followed by (i)(b);
– $|(f^{n - n_q})'(f^{n_q}x)| = |f'(f^{n_q}x)| \cdot |(f^{n - (n_q + 1)})'(f^{n_q + 1}x)|$, where $|f'(f^{n_q}x)| \ge |f''(\xi)|\,d(x, C) \ge c_0\delta$ by (C2’)(iii)(a) and $|(f^{n - (n_q + 1)})'(f^{n_q + 1}x)| \ge c_0e^{c_1(n - (n_q + 1))}$ by (C2’)(i)(a).

Together these inequalities prove both of the assertions in the lemma. □
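For assertion (i), the way these inequalities combine can be written out explicitly. This sketch takes the three bounds exactly as stated, assumes $q \ge 1$ and $n_q < n$, and the resulting constants are one admissible choice rather than the ones used in [WY].

```latex
% Chain rule along the orbit, split at the visit times n_1 < ... < n_q to I:
|(f^n)'x|
 = |(f^{n_1})'x| \cdot \prod_{i=1}^{q-1} |(f^{n_{i+1}-n_i})'(f^{n_i}x)|
   \cdot |(f^{n-n_q})'(f^{n_q}x)|
% Insert the three lower bounds:
 \ge e^{c_1 n_1} \cdot e^{c_1 (n_q - n_1)} \cdot c_0\delta \cdot c_0\, e^{c_1 (n - n_q - 1)}
 = c_0^2 e^{-c_1}\, \delta\, e^{c_1 n} .
% So (i) holds in this case with, for instance,
\hat c_0 = c_0^2 e^{-c_1}, \qquad \hat c_1 = c_1 .
```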
Lemma A.5. Let $\{T_{a,b}\}$ be as in Sect. 5.1 of this paper, and let $\Delta$ be the set of $(a, b)$ such that $T = T_{a,b}$ satisfies the conclusions of Theorem 1 in [WY]. Suppose $\{T_{a,b}\}$ also satisfies (C4), and $\delta$ is smaller than a number depending on $c_1$. Then (i) $T$ admits at most one SRB measure $\mu$; (ii) $(T, \mu)$ is mixing.

Proof. Let $\{x_1 < \cdots < x_r\}$ be the set of critical points of $f$. Consider a segment $\omega \subset \partial R_0$ corresponding to an outermost $I_{\mu j}$ at one of the components of $C^{(0)}$. First we claim there exist $N \in \mathbb{Z}^+$ and $\hat\omega \subset \omega$ such that $T^i\hat\omega \cap C^{(0)} = \emptyset$ for all $0 < i < N$ and $T^N\hat\omega$ connects two components of $C^{(0)}$.

This claim is proved as follows. Let $\omega'$ denote the image of $\omega$ at the end of its bound period. Then $\omega'$ has length $> \delta^{K\beta}$. We continue to iterate, deleting all parts that fall into $C^{(0)}$. Then $i$ steps later, the undeleted part of $T^i\omega'$ is made up of finitely many segments. Suppose that for all $i \le n$, none of these segments is long enough to connect two components of $C^{(0)}$, so that the number of segments deleted up to step $i$ is $\le 2^i$. We estimate the average length of these segments at time $n$ as follows: First, the pull-back to $\omega'$ of all the deleted parts has total measure $\le \sum_{i \le n} 2^i e^{-c_1 i}(2\delta)$ by (C2’)(i)(b). Since $2 < e^{c_1}$ by (C4)(i), we may assume this is $< \frac12\delta^{K\beta}$ provided $\delta$ is sufficiently small. The undeleted segments of $T^n\omega'$ add up, therefore, to $> e^{c_1 n}\,\frac12\delta^{K\beta}$ in length, and since there are at most $2^n$ of them, their average length is $> 2^{-n}e^{c_1 n}\,\frac12\delta^{K\beta}$. Thus one sees that as $n$ increases, there must come a point when our claim is fulfilled.

Next we observe that if $\omega$ is a $C^2(b)$ segment connecting two components of $C^{(0)}$, then using (C4)(ii) and reasoning as with finite state Markov chains, we have that for every $n \ge N_2$ and every $k \in \{1, \cdots, r\}$, there is a subsegment $\omega_{n,k} \subset \omega$ such that for all $i < n$, $T^i\omega_{n,k} \cap C^{(0)} = \emptyset$ and $T^n\omega_{n,k}$ stretches across the region between $x_k$ and $x_{k+1}$, extending beyond the critical regions containing these two points.
Recall that in [WY], Sects. 8.1 and 8.2, a finite number of ergodic SRB measures $\{\mu_i, i \le r'\}$ are constructed, and it is shown in Sect. 8.3 that these are all the ergodic SRB measures $T$ has. The discussion above shows that starting at any reference set, a segment $\omega \subset \partial R_0$ as above will spend a positive fraction of time in every reference set, proving that $r' \le 1$. Furthermore, starting from any reference set, the return time to it takes on all values greater than some $N_0$, proving that $\mu_1$ is mixing. □

6. Concluding Remarks

• For area-preserving maps, it is well known that when integrability first breaks down, the phase portrait is dominated by KAM curves. Farther away from integrability, one sees larger Birkhoff zones of instability interspersed with elliptic islands. Continuing to move toward the chaotic end of the spectrum, it is widely believed – though not proved – that most of the phase space is covered with ergodic regions with positive Lyapunov exponents. This paper deals with the corresponding pictures for strongly dissipative systems. We consider a simple model consisting of a periodically forced limit cycle. Keeping the magnitude of the “kick” constant, we prove that scenarios roughly parallel to those in the last paragraph occur for our Poincaré maps, with attracting invariant circles (taking the place of KAM curves), periodic sinks (instead of elliptic islands), and as the contractive power of the cycle diminishes, we prove that the stage is shared by at least two scenarios occupying parameter sets that are delicately intertwined: horseshoes and sinks, and strange attractors. By “strange attractors”, we refer to attractors characterized by SRB measures, positive Lyapunov exponents, and strong mixing properties. For the differential equation in question, we prove that the system has global strange attractors of this kind for a positive measure set of parameters.
• Our second point has to do with bridging the gap between abstract theory and concrete problems. Today we have a fairly good hyperbolic theory, yet chaotic phenomena in naturally occurring dynamical systems have continued to resist analysis. One of the messages of this paper is that for certain types of strange attractors, the situation is now improved: For attractors with strong dissipation and one direction of instability, there are now relatively simple, checkable conditions which, when satisfied, guarantee the existence of an attractor with a detailed package of statistical and geometric properties. Our conditions are formulated to give rigorous results, but where rigorous analysis is out of reach, they can also serve as a basis for numerical work to provide justification for various mathematical statements about strange attractors.

References

[A] Arnold, V.I.: Small denominators, I: Mappings of the circumference onto itself. AMS Transl. Ser. 2 46, 213–284 (1965)
[BC] Benedicks, M. and Carleson, L.: The dynamics of the Hénon map. Ann. Math. 133, 73–169 (1991)
[BY] Benedicks, M. and Young, L.-S.: Sinai-Bowen-Ruelle measure for certain Hénon maps. Invent. Math. 112, 541–576 (1993)
[B] Bowen, R.: Equilibrium states and the ergodic theory of Anosov diffeomorphisms. Lecture Notes in Math. Vol. 470, Berlin: Springer, 1975
[CL] Cartwright, M.L. and Littlewood, J.E.: On nonlinear differential equations of the second order. J. London Math. Soc. 20, 180–189 (1945)
[CELS] Chernov, N., Eyink, G., Lebowitz, J. and Sinai, Ya.G.: Steady-state electrical conduction in the periodic Lorentz gas. Commun. Math. Phys. 154, 569–601 (1993)
[CE] Collet, P. and Eckmann, J.-P.: Iterated Maps on the Interval as Dynamical Systems. Progress in Physics I (1980)
[D] Duffing, G.: Erzwungene Schwingungen bei veränderlicher Eigenfrequenz. Braunschweig: 1918
[GS] Graczyk, J. and Swiatek, G.: Generic hyperbolicity in the logistic family. Ann. Math. 146, 1–52 (1997)
[G] Guckenheimer, J.: A strange strange attractor. In: Bifurcation and its Application (J.E. Marsden and M. McCracken, ed.). Berlin–Heidelberg–New York: Springer-Verlag, 1976, pp. 81
[GH] Guckenheimer, J. and Holmes, P.: Nonlinear oscillators, dynamical systems and bifurcations of vector fields. Appl. Math. Sciences 42, Berlin–Heidelberg–New York: Springer-Verlag, 1983
[He] Herman, M.: Mesure de Lebesgue et Nombre de Rotation. Lecture Notes in Math. 597, Berlin–Heidelberg–New York: Springer, 1977, pp. 271–293
[HPS] Hirsch, M., Pugh, C. and Shub, M.: Invariant Manifolds. Lecture Notes in Math. 583, Berlin–Heidelberg–New York: Springer Verlag, 1977
[Ho] Holmes, P.: A nonlinear oscillator with a strange attractor. Phil. Trans. Roy. Soc. A. 292, 419–448 (1979)
[J] Jakobson, M.: Absolutely continuous invariant measures for one-parameter families of one-dimensional maps. Commun. Math. Phys. 81, 39–88 (1981)
[KH] Katok, A. and Hasselblatt, B.: Introduction to the modern dynamical systems. Cambridge: Cambridge University Press, 1995
[Le] Ledrappier, F.: Propriétés ergodiques des mesures de Sinai. Publ. Math. Inst. Hautes Etud. Sci. 59, 163–188 (1984)
[LY] Ledrappier, F. and Young, L.-S.: The metric entropy of diffeomorphisms. Ann. Math. 122, 509–574 (1985)
[Li1] Levi, M.: Qualitative analysis of periodically forced relaxation oscillations. Mem. AMS 214, 1–147 (1981)
[Li2] Levi, M.: A new randomness-generating mechanism in forced relaxation oscillations. Physica D 114, 230–236 (1998)
[Ln] Levinson, N.: A second order differential equation with singular solutions. Ann. Math. 50, No. 1, 127–153 (1949)
[Lo] Lorenz, E.N.: Deterministic nonperiodic flow. J. Atmos. Sc. 20, No. 1, 130–141 (1963)
[Lyu1] Lyubich, M.: Dynamics of quadratic polynomials, I-II. Acta Math. 178, 185–297 (1997)
[Lyu2] Lyubich, M.: Regular and stochastic dynamics in the real quadratic family. Proc. Natl. Acad. Sci. USA 95, 14025–14027 (1998)
[dMvS] de Melo, W. and van Strien, S.: One-dimensional Dynamics. Berlin–Heidelberg–New York: Springer-Verlag, 1993
[M1] Misiurewicz, M.: Absolutely continuous invariant measures for certain maps of an interval. Publ. Math. IHES. 53, 17–51
[MV] Mora, L. and Viana, M.: Abundance of strange attractors. Acta. Math. 171, 1–71 (1993)
[P] Pesin, Ja.B.: Characteristic Lyapunov exponents and smooth ergodic theory. Russ. Math. Surv. 32.4, 55–114 (1977)
[PS] Pugh, C. and Shub, M.: Ergodic attractors. Trans. A. M. S. 312, 1–54 (1989)
[Ro] Robinson, C.: Homoclinic bifurcation to a transitive attractor of Lorenz type. Nonlinearity 2, 495–518 (1989)
[R1] Ruelle, D.: A measure associated with Axiom A attractors. Am. J. Math. 98, 619–654 (1976)
[R2] Ruelle, D.: Ergodic theory of differentiable dynamical systems. Publ. Math. Inst. Hautes Étud. Sci. 50, 27–58 (1979)
[Ry] Rychlik, M.: Lorenz attractors through Sil’nikov-type bifurcation, Part I. Ergodic Theory and Dynamical Systems 10, 793–822 (1990)
[Sh] Shub, M.: Global Stability of Dynamical Systems. Berlin–Heidelberg–New York: Springer, 1987
[Si] Sinai, Y.G.: Gibbs measure in ergodic theory. Russ. Math. Surv. 27, 21–69 (1972)
[Sp] Sparrow, C.: The Lorenz Equations. Berlin–Heidelberg–New York: Springer, 1982
[TTY] Thieullen, P., Tresser, C. and Young, L.-S.: Positive exponent for generic 1-parameter families of unimodal maps. C.R. Acad. Sci. Paris, t. 315, Serie (1992), 69–72; J. d’Analyse 64, 121–172 (1994)
[T] Tucker, W.: The Lorenz attractor exists. C. R. Acad. Sci. Paris Ser. I Math. 328 (1999), no. 12, 1197–1202
[W] Williams, R.: The structure of Lorenz attractors. In: Turbulence Seminar Berkeley 1996/97 (P. Bernard and T. Ratiu, ed.), Berlin–Heidelberg–New York: Springer-Verlag, 1977, pp. 94–112
[WY] Wang, Q.D. and Young, L.-S.: Strange attractors with one direction of instability. Commun. Math. Phys. 218, 1–97 (2001)
[Y1] Young, L.-S.: Ergodic theory of differentiable dynamical systems. In: Real and Complex Dynamical Systems, B. Branner and P. Hjorth (eds.), Dordrecht: Kluwer Acad. Press, 1995
[Y2] Young, L.-S.: Statistical properties of dynamical systems with some hyperbolicity. Ann. of Math. 147, 585–650 (1998)
[Z1] Zaslavsky, G.: The simplest case of a strange attractor. Phys. Lett. A 69, no. 3, 145–147 (1978)
[Z2] Zaslavsky, G.: Chaos in Dynamic Systems. Harwood Academic Publishers, first printing, 1985
Communicated by M. Aizenman
Commun. Math. Phys. 225, 305 – 329 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Exponential Convergence to Non-Equilibrium Stationary States in Classical Statistical Mechanics

Luc Rey-Bellet, Lawrence E. Thomas

Department of Mathematics, University of Virginia, Kerchof Hall, Charlottesville, VA 22903, USA.

Received: 12 March 2001 / Accepted: 5 August 2001
Abstract: We continue the study of a model for heat conduction [6] consisting of a chain of non-linear oscillators coupled to two Hamiltonian heat reservoirs at different temperatures. We establish existence of a Liapunov function for the chain dynamics and use it to show exponentially fast convergence of the dynamics to a unique stationary state. Ingredients of the proof are the reduction of the infinite dimensional dynamics to a finite-dimensional stochastic process as well as a bound on the propagation of energy in chains of anharmonic oscillators.

1. Introduction

In its present state, non-equilibrium statistical mechanics is lacking the firm theoretical foundations that equilibrium statistical mechanics has. This is due, perhaps, to the extremely great variety of physical phenomena that non-equilibrium statistical mechanics describes. We will concentrate here on a system which is maintained, by suitable forces, in a state far from equilibrium. In such an idealization, the non-equilibrium phenomena can be described by stationary non-equilibrium states (SNS), which are the analog of canonical or microcanonical states of equilibrium.

Recently many works have been devoted to the rigorous study of SNS. Two main streams are emerging. In the first approach, for open systems, a system is driven out of equilibrium by interacting with several reservoirs at different temperatures. In the second approach, for thermostated systems, a system is driven out of equilibrium by non-Hamiltonian forces and constrained to a compact energy surface by Gaussian (or other) thermostats [9, 24]. One should view both approaches as two different idealizations of the same physical situation, in the same spirit as the equivalence of ensembles in equilibrium statistical mechanics. But for the moment, the extent to which both approaches are equivalent remains a largely open problem.
We consider here an open system, a model of heat conduction consisting of a finite-dimensional classical Hamiltonian model, a one-dimensional finite lattice of anharmonic oscillators (referred to as the chain), coupled, at the boundaries only, to two reservoirs of classical non-interacting phonons at positive and different temperatures. We believe this model to be quite realistic; in particular, it is completely Hamiltonian and non-linear. This model goes back (in the linear case) to [8] (see also [23, 26]). First rigorous results for anharmonic models appear in [6] and go further in [7, 5]. Similar models in classical and quantum mechanics have attracted attention in the last few years, mostly for systems coupled to a single reservoir at zero or positive temperature, i.e., for systems near thermal equilibrium (see e.g. [12, 13, 3, 15, 25]). In our case, with two reservoirs, no Gibbs Ansatz is available and in general, even the very existence of a (non-equilibrium) stationary state is a mathematically challenging question which requires a sufficiently deep understanding of the dynamics.

For the model at hand, conditions for the existence of the SNS have been given in [6] and generalized in [5]. The uniqueness of the SNS as well as the strict positivity of entropy production (or heat flux) have been proved in [7]. The leading asymptotics of the invariant measure (for low temperatures) are studied in [21] and shown to be described by a variational principle.

Under suitable assumptions on the chain interactions and its interactions with the reservoirs, we establish the existence of a Liapunov function for the chain dynamics. We then use this Liapunov function to establish that the relaxation to the SNS occurs at an exponential rate, and finally we prove that the system has a spectral gap (using probabilistic techniques developed by Meyn and Tweedie in [18]).

The Hamiltonian of the model has the form
$$H = H_B + H_S + H_I. \tag{1}$$
The two reservoirs of free phonons are described by wave equations in $\mathbb{R}^d$ with the Hamiltonian
$$H_B = H(\varphi_L, \pi_L) + H(\varphi_R, \pi_R), \qquad H(\varphi, \pi) = \frac12\int dx\,\left(|\nabla\varphi(x)|^2 + |\pi(x)|^2\right),$$
where $L$ and $R$ stand for the “left” and “right” reservoirs, respectively. The Hamiltonian describing the chain of length $n$ is given by
$$H_S(p, q) = \sum_{i=1}^{n}\frac{p_i^2}{2} + V(q_1, \ldots, q_n), \qquad V(q) = \sum_{i=1}^{n}U^{(1)}(q_i) + \sum_{i=1}^{n-1}U^{(2)}(q_i - q_{i+1}),$$
where $(p_i, q_i) \in \mathbb{R}^d \times \mathbb{R}^d$ are the momentum and coordinate of the $i$-th particle of the chain. The phase space of the chain is $\mathbb{R}^{2dn}$. The interaction between the chain and the reservoirs occurs at the boundaries only and is of dipole type,
$$H_I = q_1 \cdot \int dx\,\nabla\varphi_L(x)\rho_L(x) + q_n \cdot \int dx\,\nabla\varphi_R(x)\rho_R(x),$$
where $\rho_L$ and $\rho_R$ are coupling functions (“charge densities”) which we will assume spherically symmetric.
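As an illustration of the chain Hamiltonian, the following sketch evaluates $H_S$ and $\nabla V$ in dimension $d = 1$. The potentials `U1` (quadratic) and `U2` (quartic plus quadratic) are hypothetical choices made only for this example, and the function names are ours.

```python
def U1(x):
    """Hypothetical one-body (pinning) potential, quadratic."""
    return 0.5 * x * x


def dU1(x):
    return x


def U2(x):
    """Hypothetical nearest-neighbor potential, quartic and strictly convex."""
    return 0.25 * x ** 4 + 0.5 * x * x


def dU2(x):
    return x ** 3 + x


def chain_hamiltonian(p, q):
    """H_S(p, q) = sum_i p_i^2 / 2 + sum_i U1(q_i) + sum_i U2(q_i - q_{i+1})."""
    n = len(q)
    kinetic = sum(pi * pi for pi in p) / 2.0
    pinning = sum(U1(qi) for qi in q)
    coupling = sum(U2(q[i] - q[i + 1]) for i in range(n - 1))
    return kinetic + pinning + coupling


def grad_V(q):
    """Gradient of V(q) = sum_i U1(q_i) + sum_i U2(q_i - q_{i+1})."""
    n = len(q)
    grad = []
    for i in range(n):
        g = dU1(q[i])
        if i < n - 1:
            g += dU2(q[i] - q[i + 1])   # bond (i, i+1), derivative in q_i
        if i > 0:
            g -= dU2(q[i - 1] - q[i])   # bond (i-1, i), derivative in q_i
        grad.append(g)
    return grad


if __name__ == "__main__":
    p, q = [1.0, 0.0], [0.0, 1.0]
    print(chain_hamiltonian(p, q), grad_V(q))
```

In the bulk of the chain the equations of motion are then simply $\dot q_i = p_i$, $\dot p_i = -\partial_{q_i}V(q)$; the reservoirs enter only through the first and last particle.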
Our assumptions on the anharmonic lattice described by $H_S(p, q)$ are the following:

• H1 Growth at infinity. The potentials $U^{(1)}(x)$ and $U^{(2)}(x)$ are $C^\infty$ and grow at infinity like $\|x\|^{k_1}$ and $\|x\|^{k_2}$: There exist constants $C_i$, $D_i$, $i = 1, 2$ such that
$$\lim_{\lambda\to\infty}\lambda^{-k_i}U^{(i)}(\lambda x) = a^{(i)}\|x\|^{k_i}, \tag{2}$$
$$\lim_{\lambda\to\infty}\lambda^{-k_i+1}\nabla U^{(i)}(\lambda x) = a^{(i)}k_i\|x\|^{k_i-2}x, \tag{3}$$
$$\|\partial^2 U^{(i)}(x)\| \le \left(C_i + D_iV(x)\right)^{1-\frac{2}{k_i}}, \tag{4}$$
where $\|\cdot\|$ in Eq. (4) denotes some matrix norm. Moreover we will assume that $k_2 \ge k_1 \ge 2$, so that, for large $\|x\|$, the interaction potential $U^{(2)}$ is “stiffer” than the one-body potential $U^{(1)}$. It follows from Eqs. (2) and (3) that the critical set of $V(q)$, i.e., the set $\{q : \nabla V(q) = 0\}$, is a compact set.

• H2 Non-degeneracy. The coupling potential between nearest neighbors $U^{(2)}$ is non-degenerate in the following sense. For $x \in \mathbb{R}^d$ and $m = 1, 2, \cdots$, let $A^{(m)}(x) : \mathbb{R}^d \to \mathbb{R}^{d^m}$ denote the linear maps given by
$$\left(A^{(m)}(x)v\right)_{l_1l_2\cdots l_m} = \sum_{l=1}^{d}\frac{\partial^{m+1}U^{(2)}}{\partial x^{(l_1)}\cdots\partial x^{(l_m)}\partial x^{(l)}}(x)\,v_l.$$
We assume that for each $x \in \mathbb{R}^d$ there exists $m_0$ such that
$$\mathrm{Rank}\left(A^{(1)}(x), \cdots, A^{(m_0)}(x)\right) = d.$$
In particular this condition is satisfied, for $m_0 = 1$, if $U^{(2)}$ is strictly convex. If $d = 1$, this condition means that for any $x$, there exists $m_0 = m_0(x) \ge 2$ such that $\partial^{m_0}U^{(2)}(x) \ne 0$. In other words the potential $U^{(2)}$ has no flat piece or infinitely degenerate points.

The class of coupling functions $\rho_i$, $i \in \{L, R\}$ we can allow is relatively restrictive:

• H3 Rationality of the coupling. Let $\hat\rho_i$ denote the Fourier transform of $\rho_i$. We assume that
$$|k|^{d-1}|\hat\rho_i(k)|^2 = \frac{1}{Q_i(k^2)},$$
where $Q_i$, $i \in \{L, R\}$ are polynomials with real coefficients and no roots on the real axis. In particular, if $k_0$ is a root of $Q_i$, then so are $-k_0$, $\bar k_0$ and $-\bar k_0$.

Under these conditions we have the following result (a more detailed and precise statement will be given in the next section). Let $F(p, q)$ be an observable on the phase space of the chain, for example any function with at most polynomial growth (no smoothness is required). We denote as $(p(t), q(t))$ the solution of the Hamiltonian equation of motion with Hamiltonian (1) and initial conditions $(p, q)$. Of course $(p(t), q(t))$ depends also on the variables of the reservoirs, though only through their initial conditions $(\pi_L, \varphi_L, \pi_R, \varphi_R)$. We introduce the temperature by making the assumption that the initial conditions of the reservoirs are distributed according to thermal equilibrium at temperature $T_R$ and $T_L$ respectively and we denote $\langle\cdot\rangle_{LR}$ as the corresponding average.
Theorem 1.1. Under Conditions H1–H3, there is a measure $\nu(dp, dq)$ with a smooth everywhere positive density such that the Law of Large Numbers holds:
$$\lim_{T\to\infty}\frac{1}{T}\int_0^T F(p(t), q(t))\,dt = \int F\,d\nu$$
for almost all initial conditions $(\pi_L, \varphi_L, \pi_R, \varphi_R)$ of the reservoirs and for all initial conditions $(p, q)$ of the chain. Moreover there exist a constant $r > 1$ and a function $C(p, q)$ with $\int C\,d\nu < \infty$ such that
$$\left|\langle F(p(t), q(t))\rangle_{LR} - \int F\,d\nu\right| \le C(p, q)\,r^{-t}$$
for all initial conditions $(p, q)$. That is, if we average over the initial conditions of the reservoirs the convergence is exponential.

Note that the ergodic properties stated in Theorem 1.1 hold not only for $\nu$-almost every initial condition $(p, q)$, but in fact for every $(p, q)$. The existence of a (unique) stationary state was proved for (exactly solvable) quadratic harmonic potentials $V(q)$ in [26], for $k_1 = k_2 = 2$ (i.e., for potentials which are quadratic at infinity) in [6, 7] and generalized to the case $k_2 > k_1 \ge 2$ in [5]. What is really new here is that we prove that the convergence occurs exponentially fast and we also weaken slightly the conditions on the potential (in particular the case $k_1 = k_2$ is allowed and our Condition H2 on $U^{(2)}$ is weaker than the one used in [6, 7, 5]). Our methods also differ notably from those used in [6, 5]; in fact we reprove the existence of the SNS (with a shorter and more constructive proof than in [6, 5]) and, at the same time, we prove much stronger ergodic properties.

We devote the rest of this section to a brief discussion of the Assumptions H1–H3. Since the reservoirs are free phonon gases and since we make a statistical assumption on the initial condition of the reservoirs, one can integrate out the variables of the reservoirs yielding random integro-differential equations for the variables $(p, q)$.
Our Assumption H3 of rational coupling is, in effect, a Markovian assumption: with such coupling one can eliminate the memory terms by adding a finite number of auxiliary variables to obtain a system of Markovian stochastic differential equations on the extended phase space consisting of the dynamical variables (p, q) together with the auxiliary variables. The main (new) ingredient in our proof is then the construction of a Liapunov function for the system, which implies, using probabilistic methods developed in [1, 20, 18], the exponential convergence towards the stationary state. To explain the construction of a Liapunov function, note that the dynamics of the chain in the bulk is simply Hamiltonian, while at the boundaries the action of the reservoirs results into two distinct forces. There are dissipative forces which correspond to the fact that the energy of the chain dissipates into the reservoirs. This force is independent of the temperature. On the other hand since the reservoirs are infinite and at positive temperatures, they exert (random) forces at the boundaries of the chain and these forces turn out to be proportional to the temperatures of the reservoirs. The construction of the Liapunov function proceeds in two steps. In a first step we neglect completely the random force, only dissipation acts. This corresponds to dynamics at temperature zero, and one can prove that the energy decreases and that the system relaxes to a (local) equilibrium of the Hamiltonian H (p, q). We establish the rate at which this relaxation takes place (at sufficiently high energies). In the second step we consider the complete dynamics and we show that for energies which are much higher
than the temperatures of the reservoirs, the random force is essentially negligible with respect to the dissipation. This means that except for (exponentially) rare excursions the system spends most of its time in a compact neighborhood of the equilibrium points. On the other hand, in this compact set, i.e., at energies of order of the temperatures of the reservoirs, the dynamics is essentially determined by the fluctuations and to prove exponential convergence to a SNS one has to show that the fluctuations are such that every part of the phase space is visited by the dynamics. To summarize, we control the dynamics at any temperature by the dynamics at zero temperature. This allows one to understand the meaning of our assumptions on the potential V (q). If we suppose that the energy has an infinite number of local minima tending to infinity, the zero temperature (long-time) dynamics is not confined to a compact energy domain and our argument fails. With regard to the condition k2 ≥ k1 in Condition H1 on the exponents of the potentials, since the results of [27] and the rigorous proofs of [17, 2], it is known that stable (in the sense of Nekhoroshev) localized states exist in non-linear lattices. Consider, for example, an infinite chain of oscillators (without reservoirs). Numerically and in certain cases rigorously [17], one can show the existence of breathers, i.e., of solutions which are spatially (exponentially) localized and time-periodic. Although the breathers occur both for k1 > k2 and k2 ≥ k1 they behave differently at high energies. For k1 > k2 , the higher the energy, the more localized the breathers get (hard breathers), while for k2 ≥ k1 , as the energy gets bigger the breathers become less and less localized (soft breathers). 
In fact a key point of our analysis is to show that at high energy, if the energy $E$ of the initial condition is localized away from the boundary, then after a time of order one, the oscillators at the boundaries carry at least an energy of order $E^{2/k_2}$ so that the chain system energy can relax into the reservoirs. Although we believe that the existence of a SNS probably may not depend too much on these localization phenomena, the rate of convergence to the SNS presumably does. Our approach of controlling the dynamics by the zero-temperature dynamics may not be adequate if Condition H1 fails to hold and so more refined estimates on the dynamics are needed to show that these localized states might be in fact destroyed by the coupling to the reservoirs.

As regards the organization of this paper, Sect. 2 presents the effective stochastic differential equations for the chain, a discussion of allowable interactions between the reservoirs and the chain and a concise statement, Theorem 2.1, of the exponential convergence. In Sect. 3 we discuss the dissipative deterministic system (corresponding to reservoirs at temperature 0), Theorem 3.3, and then we show the extent to which the random paths follow the deterministic ones, Proposition 3.7. We give a lower bound on the random energy dissipation, Corollary 3.8. We then conclude Sect. 3 by providing the Liapunov function, Theorem 3.10, and bounds on the exponential hitting times on (sufficiently large) compact sets, Theorem 3.11. In Sect. 4 we prove that the random process has a smooth law and at most one ergodic component, improving slightly results of [6, 7, 5]. Finally in Sect. 5 we conclude the proof of Theorem 2.1 by invoking results of [18] on the ergodic theory of the Markov processes.

2. Effective Equations

We first give a precise description of the reservoirs and of their coupling to the system and derive the stochastic equations which we will study.
A free phonon gas is described by a linear wave equation in $\mathbb{R}^d$, i.e., by the pair of real fields $\phi(x) = (\varphi(x), \pi(x))$, $x \in \mathbb{R}^d$. We define the norm $\|\phi\|$ by $\|\phi\|^2 \equiv \int dx\,(|\nabla\varphi(x)|^2 + |\pi(x)|^2)$ and denote by $\langle\cdot,\cdot\rangle$ the corresponding scalar product. The phase space of the reservoirs at finite energy is the real Hilbert space of functions $\phi(x)$ such that the energy $H_B(\phi) = \|\phi\|^2/2$ is finite, and the equations of motion are

$$\dot\phi(t,x) = L\phi(t,x), \qquad L = \begin{pmatrix} 0 & 1 \\ \Delta & 0 \end{pmatrix}.$$

In order to describe the coupling of the reservoir to the system, let us consider first a single confined particle in $\mathbb{R}^d$ with Hamiltonian $H_S(p,q) = p^2/2 + V(q)$. As the Hamiltonian for the coupled system particle plus one single reservoir, we have

$$\begin{aligned} H(\phi,p,q) &= \tfrac{1}{2}\|\phi\|^2 + \tfrac{1}{2}p^2 + V(q) + q\cdot\int dx\,\nabla\varphi(x)\rho(x) \\ &= H_B(\phi) + H_S(p,q) + q\cdot\langle\phi,\alpha\rangle, \end{aligned}$$

where $\rho(x)$ is a real rotation invariant function and $\alpha = (\alpha^{(1)}, \cdots, \alpha^{(d)})$ is, in Fourier space, given by

$$\hat\alpha^{(i)}(k) = \begin{pmatrix} -ik^{(i)}\hat\rho(k)/k^2 \\ 0 \end{pmatrix}.$$

We introduce the covariance matrix $C^{(ij)}(t) = \langle \exp(Lt)\alpha^{(i)}, \alpha^{(j)}\rangle$. A simple computation shows that

$$C^{(ij)}(t) = \frac{1}{d}\,\delta_{ij}\int dk\,|\rho(k)|^2 e^{i|k|t},$$

and we define a coupling constant $\lambda$ by putting $\lambda^2 = C^{(ii)}(0) = \frac{1}{d}\int dk\,|\rho(k)|^2$. The equations of motion of the coupled system are

$$\begin{aligned} \dot q(t) &= p(t), \\ \dot p(t) &= -\nabla V(q(t)) - \langle\phi,\alpha\rangle, \\ \dot\phi(t,k) &= L\left(\phi(t,k) + q(t)\cdot\alpha(k)\right). \end{aligned} \tag{5}$$
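With the change of variables $\psi(k) = \phi(k) + q\cdot\alpha(k)$, Eqs. (5) become

$$\begin{aligned} \dot q(t) &= p(t), \\ \dot p(t) &= -\nabla V_{\rm eff}(q(t)) - \langle\psi,\alpha\rangle, \\ \dot\psi(t,k) &= L\psi(t,k) + p(t)\cdot\alpha(k), \end{aligned} \tag{6}$$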
where $V_{\rm eff}(q) = V(q) - \lambda^2 q^2/2$. Integrating the last of Eqs. (6) with initial condition $\psi_0(k)$ one finds

$$\psi(t,k) = e^{Lt}\psi_0(k) + \int_0^t ds\, e^{L(t-s)}\alpha(k)\cdot p(s),$$

and inserting into the second of Eqs. (6) gives

$$\begin{aligned} \dot q(t) &= p(t), \\ \dot p(t) &= -\nabla V_{\rm eff}(q(t)) - \int_0^t ds\, C(t-s)p(s) - \langle\psi_0, e^{-Lt}\alpha\rangle. \end{aligned} \tag{7}$$
If we now assume that, at time $t = 0$, the reservoir is at temperature $T$, then $\psi_0$ is distributed according to the Gaussian measure with covariance $T\langle\cdot,\cdot\rangle$ and then $\xi(t) \equiv \langle\psi_0, e^{-Lt}\alpha\rangle$ is a $d$-dimensional stationary Gaussian process with mean 0 and covariance $TC(t-s)$. Note that the covariance itself appears in the deterministic memory term on the r.h.s. of Eq. (7) (fluctuation-dissipation relation). By Assumption H3 there is a polynomial $p(u)$ which is a real function of $iu$ and which has its roots in the lower half plane such that

$$C^{(ii)}(t) = \int_{-\infty}^{\infty} du\, \frac{1}{|p(u)|^2}\, e^{iut}.$$

Note that this is a Markovian assumption [4]: $\xi(t)$ is Markovian in the sense that we have the identity $p(-i\,d/dt)\xi(t) = \dot\omega(t)$, where $\dot\omega(t)$ is a white noise, i.e., the joint motion of $d^m\xi(t)/dt^m$, $0 \le m \le \deg p - 1$, is a (Gaussian) Markov process. This assumption together with the fluctuation-dissipation relation allows us, by extending the phase space with a finite number of variables, to rewrite the integro-differential equations (7) as a Markov process. Note that $\xi(t)$ can be written as [4]

$$\xi(t) = \int_{-\infty}^{\infty} k(t-t')\,d\omega(t'), \qquad k(t) = \int du\, e^{iut}\, p(u)^{-1},$$

with $k(t) = 0$ for $t \le 0$. For example if $p(u) \propto iu + \gamma$ then $C^{(ii)}(t) = \lambda^2 e^{-\gamma|t|}$. Introducing the variable $r$ defined by

$$\lambda r(t) = \int_0^t ds\, C(t-s)p(s) + \int_{-\infty}^t k(t-t')\,d\omega(t'),$$

we obtain from Eqs. (7) the set of Markovian differential equations:

$$\begin{aligned} \dot q(t) &= p(t), \\ \dot p(t) &= -\nabla V_{\rm eff}(q(t)) - \lambda r(t), \\ dr(t) &= (-\gamma r(t) + \lambda p(t))\,dt + (2T\gamma)^{1/2}\,d\omega(t). \end{aligned} \tag{8}$$
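The example $p(u) \propto iu + \gamma$ can be checked numerically. The following is a minimal sketch, assuming the normalization $|p(u)|^{-2} = (\lambda^2\gamma/\pi)/(u^2+\gamma^2)$, which the text leaves implicit in the symbol $\propto$; the function names, the truncation `u_max` and the step `h` are illustrative choices and not part of the paper.

```python
import math

def covariance_from_spectrum(t, gamma=1.0, lam=1.0, u_max=1000.0, h=0.02):
    """Trapezoidal approximation of
    C(t) = (lam^2 * gamma / pi) * int_{-u_max}^{u_max} cos(u t) / (u^2 + gamma^2) du.
    The imaginary part of e^{iut} integrates to zero by symmetry."""
    n = int(round(2 * u_max / h))
    total = 0.0
    for j in range(n + 1):
        u = -u_max + j * h
        weight = 0.5 if j in (0, n) else 1.0   # end points get half weight
        total += weight * math.cos(u * t) / (u * u + gamma * gamma)
    return lam ** 2 * gamma / math.pi * total * h

def covariance_closed_form(t, gamma=1.0, lam=1.0):
    """Closed form quoted in the text: C(t) = lam^2 * exp(-gamma |t|)."""
    return lam ** 2 * math.exp(-gamma * abs(t))
```

The two functions should agree up to the truncation error of the integral, which is of order $1/u_{\max}$ at $t = 0$ and smaller for $t \neq 0$.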
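If $p(u) \propto (iu + \gamma + i\sigma)(iu + \gamma - i\sigma)$, then $C(t) = \lambda^2\cos(\sigma t)e^{-\gamma|t|}$ and introducing the two auxiliary variables $r$ and $s$ defined by

$$\begin{aligned} \lambda r(t) &= \lambda^2\int_0^t ds\, \cos(\sigma(t-s))\,e^{-\gamma|t-s|}\,p(s) + (T\lambda^2\gamma)^{1/2}\int_{-\infty}^t \cos(\sigma(t-s))\,e^{-\gamma|t-s|}\,d\omega(s), \\ \lambda s(t) &= \lambda^2\int_0^t ds\, \sin(\sigma(t-s))\,e^{-\gamma|t-s|}\,p(s) + (T\lambda^2\gamma)^{1/2}\int_{-\infty}^t \sin(\sigma(t-s))\,e^{-\gamma|t-s|}\,d\omega(s), \end{aligned}$$

we obtain then the set of Markovian differential equations:

$$\begin{aligned} \dot q(t) &= p(t), \\ \dot p(t) &= -\nabla V_{\rm eff}(q(t)) - \lambda r(t), \\ dr(t) &= (-\gamma r(t) - \sigma s(t) + \lambda p(t))\,dt + (2T\gamma)^{1/2}\,d\omega(t), \\ \dot s(t) &= -\gamma s(t) + \sigma r(t). \end{aligned} \tag{9}$$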
Obviously other similar sets of equations can be derived for an arbitrary polynomial $p(u)$. Another coupling which we could easily handle with our methods occurs in the following limiting case, see [8]. Formally one wants to take $C^{(ii)}(t) = \eta^2\delta(t)$. Note that this corresponds to a coupling function with $|\rho(k)|^2 = 1$, in which case $\lambda^2 = \infty$. A possible limiting procedure consists in taking a sequence of covariances tending to a delta function and at the same time suitably rescaling the coupling (see [8]). In this case one obtains the Langevin equations which serve as the commonly used model system with reservoir in the physics literature,

$$\begin{aligned} \dot q(t) &= p(t), \\ dp(t) &= (-\nabla V_{\rm eff}(q(t)) - \eta^2 p(t))\,dt + (2T\eta^2)^{1/2}\,d\omega(t). \end{aligned} \tag{10}$$
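As an illustration of Eq. (10), the following sketch integrates the Langevin equations with the Euler–Maruyama scheme for a single one-dimensional particle. The harmonic choice $V_{\rm eff}(q) = q^2/2$, the step size, the run length and the function name are assumptions made only for this example. For such a confining potential the stationary law is the Gibbs measure, so the time average of $p^2$ should be close to $T$, up to statistical error and an $O(dt)$ discretization bias.

```python
import math
import random

def simulate_langevin(T=1.0, eta2=1.0, dt=0.01, n_steps=200_000, seed=0):
    """Euler-Maruyama discretization of
        dq = p dt,
        dp = (-V_eff'(q) - eta2 * p) dt + sqrt(2 T eta2) dw,
    with the illustrative potential V_eff(q) = q^2 / 2.
    Returns the time average of p^2 along the trajectory."""
    rng = random.Random(seed)
    q, p = 0.0, 0.0
    noise = math.sqrt(2.0 * T * eta2 * dt)   # standard deviation of one noise increment
    sum_p2 = 0.0
    for _ in range(n_steps):
        force = -q                            # -V_eff'(q) for V_eff(q) = q^2/2
        q_new = q + p * dt
        p_new = p + (force - eta2 * p) * dt + noise * rng.gauss(0.0, 1.0)
        q, p = q_new, p_new
        sum_p2 += p * p
    return sum_p2 / n_steps
```

Calling `simulate_langevin(T=1.0)` returns an estimate of the stationary average of $p^2$; other confining potentials can be tried by changing the line that computes `force`.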
The derivation of the effective equations for the chain is a straightforward generalization of the above computations. Our techniques apply equally well to any of the couplings above. However, for simplicity, we will only consider the case where the couplings to both reservoirs satisfy $|\rho_i(k)|^2 \propto k^2 + \gamma^2$, $i = L, R$. For notational simplicity we set $T_1 = T_L$ and $T_n = T_R$, we denote by $r_1$ and $r_n$ the two auxiliary variables and we will use the notations $r = (r_1, r_n)$ and $x = (p, q, r) \in X = \mathbb{R}^{2d(n+1)}$. In this case we obtain the set of Markovian stochastic differential equations given by

$$\begin{aligned} \dot q_1 &= p_1, \\ \dot p_1 &= -\nabla_{q_1}V_{\rm eff}(q) - \lambda r_1, \\ dr_1 &= (-\gamma r_1 + \lambda p_1)\,dt + (2T_1\gamma)^{1/2}\,d\omega_1, \\ \dot q_j &= p_j, \qquad j = 2, \ldots, n-1, \\ \dot p_j &= -\nabla_{q_j}V_{\rm eff}(q), \qquad j = 2, \ldots, n-1, \\ \dot q_n &= p_n, \\ \dot p_n &= -\nabla_{q_n}V_{\rm eff}(q) - \lambda r_n, \\ dr_n &= (-\gamma r_n + \lambda p_n)\,dt + (2T_n\gamma)^{1/2}\,d\omega_n, \end{aligned} \tag{11}$$
where $V_{\rm eff}(q) = V(q) - \lambda^2 q_1^2/2 - \lambda^2 q_n^2/2$. From now on, for notational simplicity we will suppress the index "eff" and consider $V = V_{\rm eff}$ as our potential energy. It will be useful to introduce the following notation. We define the linear maps $\Lambda : \mathbb{R}^{dn} \to \mathbb{R}^{2d}$ by $\Lambda(x_1, \ldots, x_n) = (\lambda x_1, \lambda x_n)$ and $T : \mathbb{R}^{2d} \to \mathbb{R}^{2d}$ by $T(x, y) = (T_1 x, T_n y)$. With this we can rewrite Eqs. (11) in the compact form

$$\begin{aligned} \dot q &= p, \\ \dot p &= -\nabla_q V - \Lambda^* r, \\ dr &= (-\gamma r + \Lambda p)\,dt + (2\gamma T)^{1/2}\,d\omega. \end{aligned} \tag{12}$$
The solution $x(t)$ of Eqs. (12) is a Markov process. We denote by $T^t$ the associated semigroup, $T^t f(x) = E_x[f(x(t))]$, with generator

$$L = \gamma\left(\nabla_r T\nabla_r - r\nabla_r\right) + \Lambda p\nabla_r - r\Lambda\nabla_p + p\nabla_q - (\nabla_q V(q))\nabla_p, \tag{13}$$

and by $P_t(x, dy)$ the transition probability of the Markov process $x(t)$. There is a natural energy function which is associated to Eq. (12), given by

$$G(p,q,r) = \frac{r^2}{2} + H(p,q).$$

A straightforward computation shows that in the special case $T_1 = T_n = T$, $Z^{-1}e^{-G(p,q,r)/T}$ is an invariant measure for the Markov process $x(t)$.

Given a function $W : X \to \mathbb{R}$ satisfying $W \ge 1$ we consider the following weighted total variation norm $\|\cdot\|_W$ given by

$$\|\pi\|_W = \sup_{|f| \le W}\left|\int f\,d\pi\right|, \tag{14}$$

for any (signed) measure $\pi$. We introduce norms $\|\cdot\|_\theta$ and Banach spaces $L^\infty_\theta(X)$ given by

$$\|f\|_\theta = \sup_{x\in X}\frac{|f(x)|}{e^{\theta G(x)}}, \qquad L^\infty_\theta(X) = \{f : \|f\|_\theta < \infty\}, \tag{15}$$

and write $\|K\|_\theta$ for the norm of an operator $K : L^\infty_\theta(X) \to L^\infty_\theta(X)$. Theorem 1.1 is a direct consequence of the following result:
Theorem 2.1. Assume that Conditions H1 and H2 hold. The Markov process $x(t)$ which solves (12) has smooth transition probability densities, $P_t(x, dy) = p_t(x, y)\,dy$, with $p_t(x, y) \in C^\infty((0,\infty) \times X \times X)$. The Markov process $x(t)$ has a unique invariant measure $\mu$, and $\mu$ has a $C^\infty$ everywhere positive density. For any $\theta$ with $0 < \theta < (\max\{T_1, T_n\})^{-1}$ there exist constants $r = r(\theta) > 1$ and $R = R(\theta) < \infty$ such that

$$\|P_t(x, \cdot) - \mu\|_{\exp(\theta G)} \le R\,r^{-t}\exp(\theta G(x)), \tag{16}$$

for all $x \in X$ (exponential convergence to the SNS), or equivalently

$$\|T^t - \mu\|_\theta \le R\,r^{-t}$$

(spectral gap). Furthermore for all functions $f$, $g$ with $f^2, g^2 \in L^\infty_\theta(X)$ and all $t > 0$ we have

$$\left|\int g\,T^t f\,d\mu - \int f\,d\mu\int g\,d\mu\right| \le R\,r^{-t}\,\|f^2\|_\theta^{1/2}\,\|g^2\|_\theta^{1/2}$$

(exponential decay of correlations in the SNS).

The convergence in the weighted variation norm, Eq. (16), implies that the Law of Large Numbers holds [10, 18].

Corollary 2.2. Under Assumptions H1 and H2, $x(t)$ satisfies the Law of Large Numbers: For all initial conditions $x \in X$ and all $f \in L^1(X, d\mu)$,

$$\lim_{T\to\infty}\frac{1}{T}\int_0^T f(x(t))\,dt = \int f\,d\mu$$

almost surely.
The convergence of the transition probabilities as given in (16) is shown in [18] to follow from the following properties:

• Strong Feller property. The diffusion process is strong Feller, i.e., the semigroup $T^t$ maps bounded measurable functions into continuous functions. This is a consequence of the hypoellipticity of the diffusion $x(t)$, which follows from Condition H2, see Sect. 4.

• Small-time open set accessibility. For all $t > 0$, all $x \in X$ and all open sets $A \subset X$ we have $P_t(x, A) > 0$. This means that the Markov process is "strongly aperiodic". In particular, combined with the strong Feller property it implies uniqueness of the invariant measure. This property is discussed in Sect. 4 using the support theorem of [28] and explicit computations. This generalizes (slightly) the result obtained in [7].

• Liapunov function and hitting times. Fix $s > 0$ arbitrary. Set $W = \exp(\theta G)$ and choose $\theta$ with $0 < \theta < (\max\{T_1, T_n\})^{-1}$. Then $W$ is a Liapunov function for the Markov chain $\{x(ns)\}_{n\ge 0}$: $W > 1$, $W$ has compact level sets and there is a compact set $U$ (depending on $s$ and $\theta$) and constants $\kappa < 1$ and $b < \infty$ (both depending on $U$, $s$ and $\theta$) such that

$$T^s W(x) \le \kappa W(x) + b\,1_U(x), \tag{17}$$

where $1_U$ denotes the indicator function of the set $U$. In addition the constant $\kappa$ in Eq. (17) can be chosen arbitrarily small by choosing the set $U$ sufficiently large.

The existence of a Liapunov function is the main technical result of this paper (see Sect. 3) and Condition H1 is crucial to obtain it. Note that the time derivative of the (averaged) energy,

$$\frac{d}{dt}E_x[G(x(t))] = \gamma E_x[\mathrm{Tr}(T) - r^2(t)],$$

is not necessarily negative. But it follows from our analysis below that, for $t > 0$, $E_x[G(x(t)) - G(x)] < -cG(x)^{2/k_2}$ for $x$ sufficiently large.

A nice interpretation of a Liapunov bound of the form (17) is in terms of hitting times. Let $\tau_U$ denote the first time the diffusion $x(t)$ hits the set $U$; then Eq. (17) implies that $\tau_U$ is exponentially bounded. We will show that for any $a > 0$, no matter how large, we can find a compact set $U = U(a)$ such that $E_x[e^{a\tau_U}] < \infty$ for all $x \in X$. So except for exponentially rare excursions the Markov process $x(t)$ lives on the compact set $U$. Combined with the fact that the process has a smooth law, this provides an intuitive picture of the exponential convergence result of Theorem 2.1.
3. Liapunov Function and Hitting Times

3.1. Scaling and deterministic energy dissipation. We first consider the question of energy dissipation for the following deterministic equations:

$$\begin{aligned} \dot q &= p, \\ \dot p &= -\nabla_q V(q) - \Lambda^* r, \\ \dot r &= -\gamma r + \Lambda p, \end{aligned} \tag{18}$$

obtained from Eq. (12) by setting $T_1 = T_n = 0$, corresponding to an initial condition of the reservoirs with energy 0. A simple computation shows that the energy $G(p,q,r)$ is non-increasing along the flow $x(t) = (p(t), q(t), r(t))$ given by Eq. (18):

$$\frac{d}{dt}G(p(t), q(t), r(t)) = -\gamma r^2(t) \le 0.$$

We now show by a scaling argument that for any initial condition with sufficiently high energy, after a small time, a substantial amount of energy is dissipated. At high energy, the two-body interaction $U^{(2)}$ in the potential dominates the term $U^{(1)}$ since $k_2 \ge k_1$, and so for an initial condition with energy $G(x) = E$, the natural time scale (essentially the period of a single one-dimensional oscillator in the potential $|q|^{k_2}$) is $E^{1/k_2 - 1/2}$. We scale a solution of Eq. (18) with initial energy $E$ as follows:

$$\begin{aligned} \tilde p(t) &= E^{-\frac{1}{2}}\,p\left(E^{\frac{1}{k_2}-\frac{1}{2}}\,t\right), \\ \tilde q(t) &= E^{-\frac{1}{k_2}}\,q\left(E^{\frac{1}{k_2}-\frac{1}{2}}\,t\right), \\ \tilde r(t) &= E^{-\frac{1}{k_2}}\,r\left(E^{\frac{1}{k_2}-\frac{1}{2}}\,t\right). \end{aligned} \tag{19}$$

Accordingly the energy scales as $G(p,q,r) = E\,\tilde G_E(\tilde p, \tilde q, \tilde r)$, where

$$\begin{aligned} \tilde G_E(\tilde p, \tilde q, \tilde r) &= E^{\frac{2}{k_2}-1}\,\frac{\tilde r^2}{2} + \frac{\tilde p^2}{2} + \tilde V_E(\tilde q), \\ \tilde V_E(\tilde q) &= \sum_{i=1}^{n}\tilde U^{(1)}(\tilde q_i) + \sum_{i=1}^{n-1}\tilde U^{(2)}(\tilde q_i - \tilde q_{i+1}), \\ \tilde U^{(i)}(\tilde x) &= E^{-1}\,U^{(i)}\left(E^{\frac{1}{k_2}}\tilde x\right), \qquad i = 1, 2. \end{aligned}$$
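The identity $G(p,q,r) = E\,\tilde G_E(\tilde p, \tilde q, \tilde r)$ can be verified directly on a random state. The sketch below is only an illustration: the choices $d = 1$, $n = 4$, $U^{(1)}(x) = x^2/2$ (so $k_1 = 2$) and $U^{(2)}(x) = x^4/4$ (so $k_2 = 4$), as well as all function names, are assumptions and do not come from the paper.

```python
import random

K2 = 4  # growth exponent of the pair interaction (illustrative choice)

def u1(x):
    return 0.5 * x * x        # pinning potential U^(1), k1 = 2

def u2(x):
    return 0.25 * x ** 4      # pair interaction U^(2), k2 = 4

def energy(p, q, r):
    """G(p, q, r) = r^2/2 + p^2/2 + V(q) for a chain with d = 1."""
    kinetic = sum(x * x for x in p) / 2
    reservoir = sum(x * x for x in r) / 2
    pinning = sum(u1(x) for x in q)
    pair = sum(u2(q[i] - q[i + 1]) for i in range(len(q) - 1))
    return reservoir + kinetic + pinning + pair

def rescaled_energy(p_t, q_t, r_t, E):
    """G~_E evaluated on the rescaled variables, with U~(x) = U(E^(1/k2) x) / E."""
    s = E ** (1.0 / K2)
    kinetic = sum(x * x for x in p_t) / 2
    reservoir = E ** (2.0 / K2 - 1.0) * sum(x * x for x in r_t) / 2
    pinning = sum(u1(s * x) / E for x in q_t)
    pair = sum(u2(s * (q_t[i] - q_t[i + 1])) / E for i in range(len(q_t) - 1))
    return reservoir + kinetic + pinning + pair

rng = random.Random(1)
p = [rng.uniform(-5, 5) for _ in range(4)]
q = [rng.uniform(-5, 5) for _ in range(4)]
r = [rng.uniform(-5, 5) for _ in range(2)]
E = energy(p, q, r)

# Rescaled variables of Eq. (19) at t = 0.
p_t = [x / E ** 0.5 for x in p]
q_t = [x / E ** (1.0 / K2) for x in q]
r_t = [x / E ** (1.0 / K2) for x in r]

ratio = rescaled_energy(p_t, q_t, r_t, E)   # equals G / E, i.e. 1 up to rounding
```

Since $E$ is chosen as the energy of the state itself, the rescaled state lies on the surface $\{\tilde G_E = 1\}$ used in the proof of Theorem 3.3.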
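The equations of motion for the rescaled variables are

$$\begin{aligned} \dot{\tilde q} &= \tilde p, \\ \dot{\tilde p} &= -\nabla_{\tilde q}\tilde V_E(\tilde q) - E^{\frac{2}{k_2}-1}\,\Lambda^*\tilde r, \\ \dot{\tilde r} &= -E^{\frac{1}{k_2}-\frac{1}{2}}\,\gamma\tilde r + \Lambda\tilde p. \end{aligned}$$

By Assumption H1, as $E \to \infty$ the rescaled energy becomes

$$\tilde G_\infty(\tilde p, \tilde q, \tilde r) \equiv \lim_{E\to\infty}\tilde G_E(\tilde p, \tilde q, \tilde r) = \begin{cases} \tilde p^2/2 + \tilde V_\infty(\tilde q) & k_1 = k_2 > 2 \text{ or } k_2 > k_1 \ge 2, \\ \tilde r^2/2 + \tilde p^2/2 + \tilde V_\infty(\tilde q) & k_1 = k_2 = 2, \end{cases} \tag{20}$$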
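where

$$\tilde V_\infty(\tilde q) = \begin{cases} \sum_{i=1}^{n} a^{(1)}\|\tilde q_i\|^{k_2} + \sum_{i=1}^{n-1} a^{(2)}\|\tilde q_i - \tilde q_{i+1}\|^{k_2} & k_1 = k_2 \ge 2, \\ \sum_{i=1}^{n-1} a^{(2)}\|\tilde q_i - \tilde q_{i+1}\|^{k_2} & k_2 > k_1 \ge 2. \end{cases}$$

The equations of motion scale in this limit to

$$\begin{aligned} \dot{\tilde q} &= \tilde p, \\ \dot{\tilde p} &= -\nabla_{\tilde q}\tilde V_\infty(\tilde q), \\ \dot{\tilde r} &= \Lambda\tilde p, \end{aligned} \tag{21}$$

in the case $k_2 > 2$, while they scale to

$$\begin{aligned} \dot{\tilde q} &= \tilde p, \\ \dot{\tilde p} &= -\nabla_{\tilde q}\tilde V_\infty(\tilde q) - \Lambda^*\tilde r, \\ \dot{\tilde r} &= -\gamma\tilde r + \Lambda\tilde p, \end{aligned} \tag{22}$$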
in the case $k_1 = k_2 = 2$.

Remark 3.1. The scaling for $p$ and $q$ is natural due to the Hamiltonian nature of the problem, but the scaling of $r$ has a certain amount of arbitrariness. Since $G$ is quadratic in $r$, it might appear natural to scale $r$ with a factor $E^{-1/2}$ instead of $E^{-1/k_2}$ as we do. On the other hand, the very definition of $r$ as an integral of $p$ suggests that $r$ should scale as $q$, as we have chosen.

Remark 3.2. Had we supposed, instead of H1, that $k_1 > k_2$, then the natural time scale at high energy would be $E^{1/k_1 - 1/2}$. Scaling the variables (with $k_2$ replaced by $k_1$) would yield the limiting Hamiltonian $\sum_{i=1}^{n}\left(\tilde p_i^2/2 + a^{(1)}\|\tilde q_i\|^{k_1}\right)$, i.e., the Hamiltonian of $n$ uncoupled oscillators. So in this case, at high energy, essentially no energy is transmitted through the chain. While this does not necessarily preclude the existence of an invariant measure, we expect in this case the convergence to a SNS to be much slower. In any case even the existence of the SNS in this case remains an open problem.

Theorem 3.3. Given $\tau > 0$ fixed there are constants $c > 0$ and $E_0 < \infty$ such that for any $x$ with $G(x) = E > E_0$ and any solution $x(t)$ of Eq. (18) with $x(0) = x$ we have the estimate, for $t_E = E^{1/k_2 - 1/2}\tau$,

$$G(x(t_E)) - E \le -cE^{\frac{3}{k_2}-\frac{1}{2}}. \tag{23}$$
Remark 3.4. In view of Eq. (23), this shows that $r$ is at least typically $O(E^{1/k_2})$ on the time interval $[0, E^{1/k_2 - 1/2}\tau]$.

Proof. Given a solution of Eq. (18) with initial condition $x$ of energy $G(x) = E$, we use the scaling given by Eq. (19) and we obtain

$$G(x(t_E)) - E = -\gamma\int_0^{t_E} dt\, r^2(t) = -\gamma E^{\frac{3}{k_2}-\frac{1}{2}}\int_0^\tau dt\, \tilde r^2(t), \tag{24}$$

where $\tilde r(t)$ is the solution of the rescaled equations of motion with initial condition $\tilde x$ of (rescaled) energy $\tilde G_E(\tilde x) = 1$. By Assumption H2 we may choose $E_0$ so large that for $E > E_0$ the critical points of $\tilde G_E$ are contained in, say, the set $\{\tilde G_E \le 1/2\}$.
For a fixed $E$ and $x$ with $G(x) = E$, we show that there is a constant $c_{x,E} > 0$ such that

$$\int_0^\tau dt\, \tilde r^2(t) \ge c_{x,E}. \tag{25}$$

The proof is by contradiction, cf. [21]. Suppose that $\int_0^\tau dt\, \tilde r^2(t) = 0$; then we have $\tilde r(t) = 0$ for all $t \in [0, \tau]$. From the third of the rescaled equations of motion we conclude that $\tilde p_1(t) = \tilde p_n(t) = 0$ for all $t \in [0, \tau]$, and so from the first of these equations we see that $\tilde q_1(t)$ and $\tilde q_n(t)$ are constant on $[0, \tau]$. The second equation then gives

$$0 = \dot{\tilde p}_1(t) = -\nabla_{\tilde q_1}\tilde V(\tilde q(t)) = -\nabla_{\tilde q_1}\tilde U^{(1)}(\tilde q_1(t)) - \nabla_{\tilde q_1}\tilde U^{(2)}(\tilde q_1(t) - \tilde q_2(t)),$$

together with a similar equation for $\dot{\tilde p}_n$. By our Assumption H1 the map $\nabla\tilde U^{(2)}$ has a right inverse $g$ which is locally bounded and measurable, and thus we obtain

$$\tilde q_2(t) = \tilde q_1(t) - g\left(-\nabla\tilde U^{(1)}(\tilde q_1(t))\right).$$

Since $\tilde q_1$ is constant, this implies that $\tilde q_2$ is also constant on $[0, \tau]$. Similarly we see that $\tilde q_{n-1}$ is constant on $[0, \tau]$. Using again the first equation we obtain now $\tilde p_2(t) = \tilde p_{n-1}(t) = 0$ for all $t \in [0, \tau]$. Inductively one concludes that $\tilde r = 0$ implies $\tilde p = 0$ and $\nabla_{\tilde q}\tilde V = 0$, and thus the initial condition $\tilde x$ is a critical point of $\tilde G_E$. This contradicts our assumption and Eq. (25) follows.

Now for given $E$, the energy surface $\{\tilde G_E = 1\}$ is compact. Using the continuity of the solutions of O.D.E. with respect to initial conditions we conclude that there is a constant $c_E > 0$ such that

$$\inf_{\tilde x \in \{\tilde G_E = 1\}}\int_0^\tau dt\, \tilde r^2(t) \ge c_E.$$

Finally we investigate the dependence on $E$ of $c_E$. We note that for $E = \infty$, $\tilde G_E$ has a well-defined limit $\tilde G_\infty$ given by Eq. (20) and the rescaled equations of motion, in the limit $E \to \infty$, are given by Eqs. (21) in the case $k_2 > 2$ and by Eq. (22) in the case $k_1 = k_2 = 2$. Except in the case $k_1 = k_2 = 2$ the energy surface $\{\tilde G_\infty = 1\}$ is not compact. However, in the case $k_1 = k_2 > 2$, the Hamiltonian $\tilde G_\infty$ and the equation of motion are invariant under the translation $r \to r + a$, for any $a \in \mathbb{R}^{2d}$. And in the case $k_2 > k_1 > 2$ the Hamiltonian $\tilde G_\infty$ and the equation of motion are invariant under the translation $r \to r + a$, $q \to q + b$, for any $a \in \mathbb{R}^{2d}$ and $b \in \mathbb{R}^{dn}$. The quotient of the energy surface $\{\tilde G_\infty = 1\}$ by these translations is compact.

Note that for a given $\tilde x \in \{\tilde G_\infty = 1\}$ a similar argument as above shows that $\int_0^\tau dt\,(\tilde r + a)^2 > 0$ for any $a > 0$, and since this integral clearly goes to $\infty$ as $a \to \infty$ there exists a constant $c_\infty > 0$ such that

$$\inf_{\tilde x \in \{\tilde G_\infty = 1\}}\int_0^\tau \tilde r^2(t)\,dt > c_\infty.$$

Using again that the solution of O.D.E. depends smoothly on its parameters, we obtain

$$\inf_{E > E_0}\ \inf_{\tilde x \in \{\tilde G_E = 1\}}\int_0^\tau dt\, \tilde r^2(t) > c.$$

This estimate, together with Eq. (24), gives the conclusion of Theorem 3.3. □
3.2. Approximate deterministic behavior of random paths. In this section we show that at sufficiently high energies, the overwhelming majority of the random paths $x(t) = x(t, \omega)$ solving Eqs. (12) follows very closely the deterministic paths $x_{\rm det}$ solving Eqs. (18). As a consequence, for most random paths the same amount of energy is dissipated into the reservoirs as for the corresponding deterministic ones. We need the following a priori "no-runaway" bound on the growth of $G(x(t))$.

Lemma 3.5. Let $\theta \le (\max\{T_1, T_n\})^{-1}$. Then $E_x[\exp(\theta G(x(t)))]$ is well-defined and satisfies the bound

$$E_x[\exp(\theta G(x(t)))] \le \exp(\gamma\,\mathrm{Tr}(T)\,\theta t)\exp(\theta G(x)). \tag{26}$$

Moreover for any $x$ with $G(x) = E$ and any $\delta > 0$ we have the estimate

$$P_x\left\{\sup_{0\le s\le t} G(x(s)) \ge (1+\delta)E\right\} \le \exp(\gamma\,\mathrm{Tr}(T)\,\theta t)\exp(-\delta\theta E). \tag{27}$$
Remark 3.6. The lemma shows that for $E$ sufficiently large, with very high probability, $G(x(t)) = O(E)$ if $G(x) = E$. The assumption on $\theta$ here arises naturally in the proof, where we need $(1 - \theta T) \ge 0$, cf. Eq. (28).

Proof. For $\theta \le (\max\{T_1, T_n\})^{-1}$ we have the bound (the generator $L$ is given by Eq. (13))

$$L\exp(\theta G(x)) = \gamma\theta\exp(\theta G(x))\left(\mathrm{Tr}(T) - r(1 - \theta T)r\right) \le \gamma\theta\,\mathrm{Tr}(T)\exp(\theta G(x)), \tag{28}$$

so that for the function $W(t, x) = \exp(-\gamma\theta\,\mathrm{Tr}(T)t)\exp(\theta G(x))$ we have the inequality $(\partial_t + L)W(t, x) \le 0$. We denote by $\sigma_R$ the exit time from the set $\{G(x) < R\}$, i.e., $\sigma_R = \inf\{t \ge 0,\ G(x(t)) \ge R\}$. If the initial condition $x$ satisfies $G(x) = E < R$, we denote by $x_R(t)$ the process which is stopped when it exits $\{G(x) < R\}$, i.e., $x_R(t) = x(t)$ for $t < \sigma_R$ and $x_R(t) = x(\sigma_R)$ for $t \ge \sigma_R$. We set $\sigma_R(t) = \min\{\sigma_R, t\}$ and applying Ito's formula with stopping time to the function $W(t, x)$ we obtain

$$E_x\left[\exp(\theta G(x(\sigma_R(t))))\exp(-\gamma\theta\,\mathrm{Tr}(T)\sigma_R(t))\right] - \exp(\theta G(x)) \le 0,$$

thus

$$E_x\left[\exp(\theta G(x(\sigma_R(t))))\right] \le \exp(\gamma\theta\,\mathrm{Tr}(T)t)\exp(\theta G(x)). \tag{29}$$
Since Ex exp (θ G(x(σR (t)))) ≥ Ex exp (θ G(x(σR (t))))1σR 0 such that for paths x(t, ω) ∈ S(x, E, tE ) with tE = E 1/k2 −1/2 τ and E > E0 we have 2 −1 ,q(t) E k2 sup ,p(t) ≤ c sup 2γ T ω(t) E k12 − 21 . 0≤t≤tE 0≤t≤tE ,r(t) 1
(30)
Proof. We write differential equations for $\Delta x(t)$, again assuming both the random and deterministic paths start at the same point $x$ with energy $G(x) = E$. These equations can be written in the somewhat symbolic form:

$$\begin{aligned} d\Delta q &= \Delta p\,dt, \\ d\Delta p &= \left(O(E^{1-2/k_2})\Delta q - \Lambda^*\Delta r\right)dt, \\ d\Delta r &= \left(-\gamma\Delta r + \Lambda\Delta p\right)dt + \sqrt{2\gamma T}\,d\omega. \end{aligned} \tag{31}$$

The $O(E^{1-2/k_2})$ coefficient refers to the difference between forces, $-\nabla_q V(\cdot)$ evaluated at $x(t)$ and $x_{\rm det}(t)$; we have that $G(x(t)) \le 2E$, so that $\nabla_q V(q) - \nabla_q V(q_{\rm det}) = O(\partial^2 V)\Delta q = O(E^{1-2/k_2})\Delta q$. For later purposes we pick a constant $c$ so large that

$$\rho = \rho(x) = cE^{1-\frac{2}{k_2}} \ge \sup_i\sum_j\ \sup_{\{q:\,V(q)\le 2E\}}\left|\frac{\partial^2 V(q)}{\partial q_i\,\partial q_j}\right|$$

for all sufficiently large $E$. In order to estimate the solutions of Eqs. (31), we consider the $3\times 3$ matrix which bounds the coefficients in this system, and which is given by

$$M = \begin{pmatrix} 0 & 1 & 0 \\ \rho & 0 & \lambda \\ 0 & \lambda & \gamma \end{pmatrix}. \tag{32}$$
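We have the following estimate on powers of $M$: For $\Delta X^{(0)} = (0, 0, 1)^T$, we set $\Delta X^{(m)} \equiv M^m\Delta X^{(0)}$. For $\alpha = \max(1, \gamma + \lambda)$, we obtain $\Delta X^{(1)} \le \alpha(0, 1, 1)^T$, $\Delta X^{(2)} \le \alpha^2(1, 1, 1)^T$, and, for $m \ge 3$,

$$\Delta X^{(m)} \equiv \begin{pmatrix} u^{(m)} \\ v^{(m)} \\ w^{(m)} \end{pmatrix} \le \alpha^m 2^{m-2}\begin{pmatrix} \rho^{\frac{m-2}{2}} \\ \rho^{\frac{m-1}{2}} \\ \rho^{\frac{m-2}{2}} \end{pmatrix},$$

where the inequalities are componentwise. From this we obtain the bound

$$e^{tM}\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \le \begin{pmatrix} \frac{1}{2}(\alpha t)^2 e^{\sqrt{\rho}\,2\alpha t} \\ \alpha t\,e^{\sqrt{\rho}\,2\alpha t} \\ 1 + \alpha t + \frac{1}{2}(\alpha t)^2 e^{\sqrt{\rho}\,2\alpha t} \end{pmatrix}. \tag{33}$$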
√
ρt
0 and E0 < ∞ such that all paths x(t, w) with initial condition x with G(x) = E > E0 satisfy the bound tE 3 −1 r 2 (s)ds ≥ cE k2 2 . (37) 0
Remark 3.9. For large energy $E$, paths not satisfying the hypotheses of the corollary have measure bounded by

$$\begin{aligned} P_x\left\{\sup_{0\le s\le t_E}\left\|\sqrt{2\gamma T}\,\omega\right\| > L(E)\right\} + P\left\{S(x, E, t_E)^C\right\} &\le \frac{a}{2}\exp\left(-\frac{L(E)^2}{b\gamma T_{\max}t_E}\right) + \exp\left(\theta(\gamma\,\mathrm{Tr}(T)t_E - E)\right) \\ &\le a\exp\left(-\frac{L(E)^2}{b\gamma T_{\max}t_E}\right), \end{aligned} \tag{38}$$

where $a$ and $b$ are constants which depend only on the dimension of $\omega$. Here we have used the reflection principle to estimate the first probability and Eq. (27) and the definition of $S$ to estimate the second probability. For $E$ large enough, the second term is small relative to the first.

Proof. It is convenient to introduce the $L^2$-norm on functions on $[0, t]$, $\|f\|_t \equiv \left(\int_0^t ds\,\|f(s)\|^2\right)^{1/2}$. By Theorem 3.3, there are constants $E_1$ and $c_1$ such that for $E > E_1$ the deterministic paths $x_{\rm det}(s)$ satisfy the bound

$$\|r_{\rm det}\|_{t_E}^2 = \int_0^{t_E} r_{\rm det}^2(s)\,ds \ge c_1E^{\frac{3}{k_2}-\frac{1}{2}}.$$

By Proposition 3.7, there are constants $E_2$ and $c_2$ such that $\|\Delta r(s)\| \le c_2L(E)$, uniformly in $s$, $0 \le s \le t_E$, and uniformly in $x$ with $G(x) > E_2$. So we have

$$\|r\|_{t_E} \ge \|r_{\rm det}\|_{t_E} - \|\Delta r\|_{t_E} \ge \left(c_1E^{\frac{3}{k_2}-\frac{1}{2}}\right)^{1/2} - c_2L(E)\left(E^{\frac{1}{k_2}-\frac{1}{2}}\right)^{1/2}.$$

But the last term is $O(E^{\alpha - 1/4 + 1/(2k_2)})$, which is of lower order than the first since $\alpha < 1/k_2$, so the corollary follows, for an appropriate constant $c$ and $E$ sufficiently large. □
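3.3. Liapunov function and exponential hitting times. With the estimates we prove now our main technical result.

Theorem 3.10. Let $s > 0$ and $\theta < \theta_0 \equiv (\max\{T_1, T_n\})^{-1}$. Then there are a compact set $U = U(s, \theta)$ and constants $\kappa = \kappa(U, s, \theta) < 1$ and $L = L(U, s, \theta) < \infty$ such that

$$T^s\exp(\theta G)(x) \le \kappa\exp(\theta G)(x) + L\,1_U(x), \tag{39}$$

where $1_U$ is the indicator function of the set $U$. The constant $\kappa$ can be made arbitrarily small by choosing $U$ large enough.

Proof. For any compact set $U$ and for any $t$, $T^s\exp(\theta G)(x)$ is a bounded function, uniformly on $[0, t]$. So, in order to prove Eq. (39), we only have to prove that there exist a compact set $U$ and $\kappa < 1$ such that

$$\sup_{x\in U^C} E_x\left[\exp(\theta(G(x(s)) - G(x)))\right] \le \kappa < 1.$$

Using Ito's Formula to compute $G(x(s)) - G(x)$ in terms of a stochastic integral we obtain

$$E_x\left[\exp(\theta(G(x(s)) - G(x)))\right] = \exp(\theta\gamma\,\mathrm{Tr}(T)s)\,E_x\left[\exp\left(-\theta\int_0^s\gamma r^2\,dt + \theta\int_0^s\sqrt{2\gamma T}\,r\,d\omega(t)\right)\right]. \tag{40}$$

For any $\theta < \theta_0$, we choose $p > 1$ such that $\theta p < \theta_0$. Using the Hölder inequality we obtain

$$\begin{aligned} &E_x\left[\exp\left(-\theta\int_0^s\gamma r^2\,dt + \theta\int_0^s\sqrt{2\gamma T}\,r\,d\omega(t)\right)\right] \\ &\quad = E_x\left[\exp\left(-\theta\int_0^s\gamma r^2\,dt + \frac{p\theta^2}{2}\int_0^s\left(\sqrt{2\gamma T}\,r\right)^2dt\right)\right. \\ &\qquad\qquad \left.\times\exp\left(-\frac{p\theta^2}{2}\int_0^s\left(\sqrt{2\gamma T}\,r\right)^2dt + \theta\int_0^s\sqrt{2\gamma T}\,r\,d\omega(t)\right)\right] \\ &\quad \le E_x\left[\exp\left(-q\theta\int_0^s\gamma r^2\,dt + \frac{qp\theta^2}{2}\int_0^s\left(\sqrt{2\gamma T}\,r\right)^2dt\right)\right]^{1/q} \\ &\qquad\qquad \times E_x\left[\exp\left(-\frac{p^2\theta^2}{2}\int_0^s\left(\sqrt{2\gamma T}\,r\right)^2dt + \theta p\int_0^s\sqrt{2\gamma T}\,r\,d\omega(t)\right)\right]^{1/p} \\ &\quad = E_x\left[\exp\left(-q\theta\int_0^s dt\,\gamma r^2 + \frac{qp\theta^2}{2}\int_0^s dt\left(\sqrt{2\gamma T}\,r\right)^2\right)\right]^{1/q}. \end{aligned}$$

Here, in the next to last line, we have used the fact that the second factor is the expectation of a martingale (the integrand is non-anticipating) with expectation 1. Finally we obtain the bound

$$E_x\left[\exp(\theta(G(x(s)) - G(x)))\right] \le \exp(\theta\gamma\,\mathrm{Tr}(T)s)\,E_x\left[\exp\left(-q\theta(1 - p\theta T_{\max})\int_0^s dt\,\gamma r^2\right)\right]^{1/q}. \tag{41}$$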
In order to proceed we need to distinguish two cases according to whether $3/k_2 - 1/2 > 0$ or $3/k_2 - 1/2 \le 0$ (see Corollary 3.8). In the first case we let $E_0$ be defined by $s = E_0^{1/k_2 - 1/2}\tau$. For $E > E_0$ we break the expectation in Eq. (41) into two parts according to whether the paths satisfy the hypotheses of Corollary 3.8 or not. For the first part we use Corollary 3.8 and that $\int_0^s r^2(s)\,ds \ge \int_0^{t_E} r^2(s)\,ds \ge cE^{3/k_2 - 1/2}$; for the second part we use estimate (38) in Remark 3.9 on the probability of unlikely paths together with the fact that the exponential under the expectation in Eq. (41) is bounded by 1. We obtain for all $x$ with $G(x) = E > E_0$ the bound

$$E_x\left[\exp(\theta(G(x(s)) - G(x)))\right] \le \exp\left(\theta\gamma\,\mathrm{Tr}(T)t_{E_0}\right)\left[\exp\left(-q\theta(1 - p\theta T_{\max})cE^{\frac{3}{k_2}-\frac{1}{2}}\right) + a\exp\left(-\frac{L(E)^2\theta_0}{b\gamma t_E}\right)\right]^{1/q}. \tag{42}$$

Choosing the set $U = \{x;\ G(x) \le E_1\}$ with $E_1$ large enough we can make the term in Eq. (42) as small as we want.

If $3/k_2 - 1/2 \le 0$, for a given $s$ and a given $x$ with $G(x) = E$ we split the time interval $[0, s]$ into $E^{1/2 - 1/k_2}$ pieces $[t_j, t_{j+1}]$, each one of size of order $E^{1/k_2 - 1/2}s$. For the "good" paths, i.e., for the paths $x(t)$ which satisfy the hypotheses of Corollary 3.8 on each time interval $[t_j, t_{j+1}]$, the tracking estimates of Proposition 3.7 imply that $G(x(t)) = O(E)$ for $t$ in each interval. Applying Corollary 3.8 and using that $G(x(t_j)) = O(E)$ we conclude that $\int_0^s r^2(s)\,ds$ is at least of order $E^{3/k_2 - 1/2} \times E^{1/2 - 1/k_2} = E^{2/k_2}$. The probability of the remaining paths can be estimated, using Eq. (38), not to exceed

$$1 - \left(1 - a\exp\left(-\frac{L_{\max}^2\theta_0}{b\gamma t_E}\right)\right)^{E^{\frac{1}{2}-\frac{1}{k_2}}}.$$

The remainder of the argument is essentially as above, Eq. (42), and this concludes the proof of Theorem 3.10. □

The existence of the Liapunov function given by Eq. (39) can be interpreted in terms of hitting times. Let $\tau_U$ be the time for the diffusion $x(t)$ to hit the set $U$.

Theorem 3.11. Assume that $\theta < (\max\{T_1, T_n\})^{-1}$. For any (arbitrarily large) $a > 0$ there exists a constant $E_0 = E_0(a) > 0$ such that for $U = \{x;\ G(x) \le E_0\}$ and $x \in U^C$ we have

$$E_x\left[e^{a\tau_U}\right] < e^a + (e^a - 1)\exp(\theta(G(x) - E_0)). \tag{43}$$

Proof. Let $s = 1$ and $\theta < \theta_0$ be given; we set $\kappa = \exp(-a)/2$ and take $U$ to be the set given by Theorem 3.10. Let $X_n$ be the Markov chain defined by $X_n = x(n)$ and $N_U$ be the least integer such that $X_{N_U} \in U$. Then

$$E_x[e^{a\tau_U}] \le E_x[e^{aN_U}], \tag{44}$$

so that to estimate the exponential hitting time, it suffices to estimate the exponential "step number".
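Using Chernov's inequality we obtain

$$\begin{aligned} P_x\{N_U > n\} &= P_x\left\{-\sum_{j=1}^{n}\left(G(X_j) - G(X_{j-1})\right) < G(x) - E_0,\ X_j \in U^c\right\} \\ &\le e^{\theta(G(x)-E_0)}\,E_x\left[\prod_{j=1}^{n}e^{\theta(G(X_j)-G(X_{j-1}))},\ X_j \in U^c\right] \\ &\le e^{\theta(G(x)-E_0)}\,E_x\left[\prod_{j=1}^{n-1}e^{\theta(G(X_j)-G(X_{j-1}))}\times E_{X_{n-1}}\left[e^{\theta(G(X_n)-G(X_{n-1}))}\right],\ X_j \in U^c\right] \\ &\le e^{\theta(G(x)-E_0)}\,\sup_{y\in U^c}E_y\left[e^{\theta(G(X_1)-G(y))}\right]\times E_x\left[\prod_{j=1}^{n-1}e^{\theta(G(X_j)-G(X_{j-1}))},\ X_j \in U^c\right] \\ &\le \cdots \le e^{\theta(G(x)-E_0)}\left(\sup_{y\in U^c}E_y\left[e^{\theta(G(X_1)-G(y))}\right]\right)^n. \end{aligned}$$

By Theorem 3.10 we have

$$\sup_{x\in U^c}E_x\left[e^{\theta(G(X_1)-G(x))}\right] < \kappa,$$

and therefore we have geometric decay of $P_{>n} \equiv P_x\{N_U > n\}$ in $n$,

$$P_{>n} \le \kappa^n\exp(\theta(G(x) - E_0)).$$

Summing by parts we obtain

$$E_x\left[e^{aN_U}\right] = \sum_{n=1}^{\infty}e^{an}P_x\{N_U = n\} = \lim_{M\to\infty}\left(\sum_{n=1}^{M}P_{>n}\left(e^{a(n+1)} - e^{an}\right) + e^aP_{>0} - e^{a(M+1)}P_{>M}\right),$$

which, together with Eq. (44), gives Eq. (43). □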
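The summation by parts can be checked on an explicit example. The sketch below assumes a step number $N$ whose tail is exactly geometric, $P\{N > n\} = \kappa^n$ with $\kappa = e^{-a}/2$, which corresponds to the extreme case of the bound on $P_{>n}$ with $\exp(\theta(G(x) - E_0))$ replaced by 1; the function names and the truncation `n_max` are illustrative. In this case both expressions for $E[e^{aN}]$ equal $2e^a - 1 = e^a + (e^a - 1)$, which is the right-hand side of Eq. (43) with the exponential factor set to 1.

```python
import math

def exponential_moment_direct(a, kappa, n_max=200):
    """E[e^{aN}] = sum_n e^{an} P{N = n} for P{N = n} = kappa^(n-1) (1 - kappa)."""
    return sum(math.exp(a * n) * kappa ** (n - 1) * (1 - kappa)
               for n in range(1, n_max + 1))

def exponential_moment_by_parts(a, kappa, n_max=200):
    """Same quantity via the summation-by-parts formula, with tail P{N > n} = kappa^n."""
    def tail(n):
        return kappa ** n
    s = sum(tail(n) * (math.exp(a * (n + 1)) - math.exp(a * n))
            for n in range(1, n_max + 1))
    return s + math.exp(a) * tail(0) - math.exp(a * (n_max + 1)) * tail(n_max)

a = 1.5
kappa = math.exp(-a) / 2          # the choice of kappa made in the proof
direct = exponential_moment_direct(a, kappa)
by_parts = exponential_moment_by_parts(a, kappa)
closed_form = 2 * math.exp(a) - 1
```

The truncation at `n_max = 200` leaves a remainder of order $2^{-200}$, so the three numbers should agree to rounding accuracy.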
4. Accessibility and Strong Feller Property

In this section we prove that the Markov process is strong Feller and moreover we show that it is strongly aperiodic in the sense that for all $t > 0$, all $x \in X$ and all open sets $A \subset X$ we have $P_t(x, A) > 0$. Both results imply immediately that $x(t)$ has at most one invariant measure: Since the process is strong Feller the invariant measure (if it exists) has a smooth density which is everywhere positive by the property of aperiodicity. Obviously no two different such measures can exist.
The strong Feller property is an immediate consequence of the hypoelliptic properties of the generator $L$ of the diffusion. The result is an easy consequence of the estimates in [7, 5], since there much stronger global hypoelliptic estimates are proven (though under stronger conditions on the potential $U^{(2)}$). We present here the argument for completeness. The generator of the Markov process $x(t)$ can be written in the form

$$L = \sum_{i=1}^{2d}X_i^2 + X_0.$$

If the Lie algebra generated by the set of commutators

$$\{X_i\}_{i=1}^{2d},\quad \{[X_i, X_j]\}_{i,j=0}^{2d},\quad \{[[X_i, X_j], X_k]\}_{i,j,k=0}^{2d},\quad \cdots \tag{45}$$

has rank $\dim(X)$ at every point $x \in X$, then the Markov process has a $C^\infty$ law. In particular it is strong Feller. This is a consequence of the Hörmander Theorem [11, 16] or it can be proved directly using Malliavin Calculus developed by Malliavin, Bismut, Stroock and others (see e.g. [19]).

Proposition 4.1. If H2 holds then the generator $L$ given by Eq. (13) satisfies the rank condition (45).

Proof. This is a straightforward computation. The vector fields $X_i$, $i = 1, \cdots, 2d$, give $\partial_{r_i^{(j)}}$, $i = 1, n$, $j = 1, \cdots, d$. The commutators

$$\begin{aligned} \left[\partial_{r_1^{(j)}}, X_0\right] &= \gamma\,\partial_{r_1^{(j)}} - \lambda\,\partial_{p_1^{(j)}}, \\ \left[\left[\partial_{r_1^{(j)}}, X_0\right], X_0\right] &= \gamma^2\,\partial_{r_1^{(j)}} - \gamma\lambda\,\partial_{p_1^{(j)}} - \lambda\,\partial_{q_1^{(j)}}, \end{aligned}$$

yield the vector fields $\partial_{p_1^{(j)}}$ and $\partial_{q_1^{(j)}}$. Further

$$\left[\partial_{q_1^{(j)}}, X_0\right] = \sum_{l=1}^{d}\frac{\partial^2 V}{\partial q_1^{(j)}\partial q_1^{(l)}}(q)\,\partial_{p_1^{(l)}} + \sum_{l=1}^{d}\frac{\partial^2 U^{(2)}}{\partial q_1^{(j)}\partial q_2^{(l)}}(q_1 - q_2)\,\partial_{p_2^{(l)}}.$$

If $U^{(2)}$ is strictly convex, this yields $\partial_{p_2^{(j)}}$, while in the general case we need to consider further the commutators

$$\left[\partial_{q_1^{(j_1)}}, \left[\cdots, \left[\partial_{q_1^{(j_{m-1})}}, \sum_{l=1}^{d}\frac{\partial^2 U^{(2)}}{\partial q_1^{(j_m)}\partial q_2^{(l)}}(q_1 - q_2)\,\partial_{p_2^{(l)}}\right]\right]\right] = \sum_{l=1}^{d}\frac{\partial^{m+1}U^{(2)}}{\partial q_1^{(j_1)}\cdots\partial q_1^{(j_m)}\partial q_2^{(l)}}(q_1 - q_2)\,\partial_{p_2^{(l)}}.$$

Condition H3 means that we can write $\partial_{p_2^{(j)}}$ as a linear combination of these commutators for every $x \in X$. The other basis elements of the tangent space are obtained inductively following the same procedure. □
We now prove the strong aperiodicity of the process $x(t)$. This is based on the support theorem of Stroock and Varadhan [28]. The support of the diffusion process $x(t)$ with initial condition $x$ on the time interval $[0, t]$ is by definition the smallest closed subset $S_{x,t}$ of $C([0, t])$ such that $P_x[x(t, \omega) \in S_{x,t}] = 1$. The support can be studied using the associated control system, i.e., the ordinary differential equation where the white noise $\dot\omega(t)$ is replaced by a control $u(t) \in L^1([0, T])$: For our problem we have the control system

$$\begin{aligned} \dot q &= p, \\ \dot p &= -\nabla_q V - \Lambda^* r, \\ \dot r &= (-\gamma r + \Lambda p) + u, \end{aligned} \tag{46}$$

and we denote by $x_u(t)$ the solution of this control system with initial condition $x$ and control $u$. The support theorem asserts that the support of the diffusion $S_{x,t}$ is the closure of the set $\{x_u;\ u \in L^1([0, t])\}$. As a consequence $\mathrm{supp}\,P_t(x, \cdot)$, the support of the transition probabilities, is equal to the closure of the set of accessible points $\{y;\ \exists u \in L^1([0, t]) \text{ s.t. } x_u(t) = y\}$.

Proposition 4.2. If Condition H2 holds then for all $t > 0$, all $x \in X$,

$$\mathrm{supp}\,P_t(x, \cdot) = X. \tag{47}$$

Proof. This result is proved in [7] under the additional condition that the interaction potential $U^{(2)}$ is strictly convex, in particular $\nabla U^{(2)}$ is a diffeomorphism. Our Condition H2 implies that $\nabla U^{(2)}$ is surjective. We can choose an inverse $g : \mathbb{R}^d \to \mathbb{R}^d$ which is locally bounded. From this point the proof proceeds exactly as in Theorem 3.2 of [7] and we will not repeat it here. □

5. Proof of Theorem 2.1

The proof of Theorem 2.1 is a consequence of the theory linking the ergodic properties of the Markov process with existence of Liapunov functions, a theory which has been developed over the past twenty years. The proof of these ergodic properties relies on the intuition that the compact set $U$ together with a Liapunov function plays much the same role as an atom in, say, a countable state space Markov chain. The technical device to implement this idea was invented in [1, 20], and is called splitting. It consists in constructing a new Markov chain with state space $X_0 \cup X_1$, where $X_i$ are two copies of the original state space $X$. The new chain possesses an atom and has a projection which is the original chain. The ergodic properties of a chain with an atom are then analyzed by means of renewal theory and a coupling argument is applied to the return times to the atom. A complete account of this theory for a discrete time Markov process is developed in the book of Meyn and Tweedie [18], from which the result needed here is taken (Chapter 15).

For a given $s > 0$ consider the discrete time Markov chain $X_j = x(js)$ with transition probabilities $P(x, dy) \equiv P_s(x, dy)$ and semigroup $P^j \equiv T^{js}$. By the results of Sect. 4, the Markov chain is strongly aperiodic, i.e., $P(x, A) > 0$ for any open set $A$ and for any $x$, and it is strong Feller. The exponential bound on the hitting time given in Theorem 3.11 implies in particular that $E_x[\tau_U]$ is finite for all $x \in X$ and thus we have an invariant measure $\mu$ (for hypoelliptic diffusions this is established in [14]). By aperiodicity and the strong Feller property, this invariant measure is unique.
The following theorem is proved in [18]:

Theorem 5.1. If the Markov chain $\{X_j\}$ is strong Feller and strongly aperiodic and if there are a function $W > 1$, a compact set $U$, and constants $\kappa < 1$ and $L < \infty$ such that
$$PW(x) \le \kappa W(x) + L\,1_U(x), \tag{48}$$
then there exist constants $r > 1$ and $R < \infty$ such that, for any $x$,
$$\sum_n r^n \,\|P^n(x,\cdot) - \mu\|_W \le R\,W(x),$$
where the weighted variation norm $\|\cdot\|_W$ is defined in Eq. (14).

By Theorem 3.10 the assumptions of Theorem 5.1 are satisfied with $W = \exp(\theta G)$ and $\theta < (\max\{T_1, T_n\})^{-1}$. For the semigroup $T^t$ we note that we have the a priori estimate
$$T^t \exp(\theta G)(x) \le \exp(\gamma\theta\,\mathrm{Tr}(T)\,t)\,\exp(\theta G)(x),$$
cf. Lemma 3.5, which shows that $T^t$ is a bounded operator on $L^\infty_\theta(X)$ defined in Eq. (15). Setting $t = ns + u$ with $0 \le u < s$, and using the invariance of $\mu$ one obtains
$$\|T^t - \mu\|_\theta \le \|T^{ns} - \mu\|_\theta\,\|T^{u}\|_\theta \le \tilde R\,\tilde r^{-t}, \tag{49}$$
for some $\tilde r > 1$ and $\tilde R < \infty$ or equivalently
$$\int_0^\infty \tilde r^{\,t}\,\|P_t(x,\cdot) - \mu\|_{\exp(\theta G)}\,dt \le \tilde R\,\exp(\theta G(x)).$$
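The drift condition (48) and the resulting geometric convergence can be checked by hand on a toy example. The sketch below is only an illustration under stated assumptions, not the oscillator chain of the paper: the state space is the finite set {0, ..., 20}, the chain is a reflecting birth-death walk with made-up step probabilities, $W(x) = 2^{x+1}$ and $U = \{0\}$ are chosen for convenience, and the discrete sum $\sum_y W(y)\,|\nu(y)|$ is assumed as a stand-in for the weighted variation norm of Eq. (14). All function names are hypothetical.

```python
# Toy birth-death chain on {0, ..., 20}. Every numerical choice here is an
# illustrative assumption and is unrelated to the model studied in the paper.
N_STATES = 21
P_UP, P_DOWN = 0.3, 0.7


def transition_row(x):
    """Row x of the transition matrix: up w.p. 0.3, down w.p. 0.7, reflecting ends."""
    row = [0.0] * N_STATES
    row[min(x + 1, N_STATES - 1)] += P_UP
    row[max(x - 1, 0)] += P_DOWN
    return row


P = [transition_row(x) for x in range(N_STATES)]
W = [2.0 ** (x + 1) for x in range(N_STATES)]  # assumed Liapunov function, W > 1
U = {0}                                        # assumed "compact" set


def apply_to_function(kernel, f):
    """(P f)(x) = sum_y P(x, y) f(y)."""
    return [sum(p * fy for p, fy in zip(row, f)) for row in kernel]


def push_measure(nu, kernel):
    """(nu P)(y) = sum_x nu(x) P(x, y)."""
    n = len(nu)
    return [sum(nu[x] * kernel[x][y] for x in range(n)) for y in range(n)]


# Smallest kappa and L for which P W <= kappa W + L 1_U holds on this chain.
PW = apply_to_function(P, W)
kappa = max(PW[x] / W[x] for x in range(N_STATES) if x not in U)
L = max(PW[x] - kappa * W[x] for x in U)

# Invariant measure from detailed balance: mu(x + 1) * P_DOWN = mu(x) * P_UP.
_weights = [(P_UP / P_DOWN) ** x for x in range(N_STATES)]
mu = [w / sum(_weights) for w in _weights]


def weighted_distance(x0, n):
    """sum_y W(y) |P^n(x0, y) - mu(y)|, a discrete weighted variation norm."""
    nu = [0.0] * N_STATES
    nu[x0] = 1.0
    for _ in range(n):
        nu = push_measure(nu, P)
    return sum(w * abs(a - b) for w, a, b in zip(W, nu, mu))


if __name__ == "__main__":
    print("kappa =", kappa, "L =", L)
    for n in (10, 100, 400):
        print(n, weighted_distance(10, n))
```

Running the script prints the fitted constants and the weighted distance after 10, 100 and 400 steps; the distance shrinks by many orders of magnitude, which is the qualitative content of Theorem 5.1 in this finite setting.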
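As a consequence of the bound (49), for any $s > 0$, $T^s$ has 1 as a simple eigenvalue and the rest of the spectrum is contained in a disk of radius $\rho < 1$. The exponential decay of correlations in the stationary states follows from this.

Corollary 5.2. There exist constants $R < \infty$ and $r > 1$ such that for all $f$, $g$ with $f^2, g^2 \in L^\infty_\theta(X)$, we have
$$\left| \int f\,T^t g\,d\mu - \int f\,d\mu \int g\,d\mu \right| \le R\,\|f^2\|_\theta^{1/2}\,\|g^2\|_\theta^{1/2}\,r^{-t}.$$

Proof. If $f^2 \in L^\infty_\theta$, we have $|f(x)| \le \|f^2\|_\theta^{1/2} \exp(\theta G(x)/2)$ and similarly for $g$. Further if Eq. (49) holds with $W = \exp(\theta G)$ it also holds for $\exp(\theta G/2)$, and thus for some $R_1 < \infty$ and $r_1 > 1$ we have
$$\left| T^t g(x) - \int g\,d\mu \right| \le R_1\,r_1^{-t}\,\|g^2\|_\theta^{1/2} \exp\!\left(\frac{\theta G(x)}{2}\right).$$
Therefore we obtain
$$\left| \int f\,T^t g\,d\mu - \int f\,d\mu \int g\,d\mu \right| \le \int |f(x)| \left| T^t g(x) - \int g\,d\mu \right| d\mu \le \int \exp(\theta G)\,d\mu \; R_1\,r_1^{-t}\,\|f^2\|_\theta^{1/2}\,\|g^2\|_\theta^{1/2}.$$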
To conclude we need to show that $\int \exp(\theta G)\,d\mu < \infty$. This follows from Eq. (48), which we rewrite as
$$(1-\kappa)\,\exp(\theta G(x)) \le \exp(\theta G(x)) - P\exp(\theta G(x)) + L\,1_U(x).$$
From this we obtain
$$(1-\kappa)\,\frac{1}{N}\sum_{k=1}^{N} \exp(\theta G(X_k)) \le \frac{1}{N}\exp(\theta G(x)) + L\,\frac{1}{N}\sum_{k=1}^{N} 1_U(X_k). \tag{50}$$
By the Law of Large Numbers the r.h.s. of Eq. (50) converges to $L\mu(U)$, which is finite, and thus $\int \exp(\theta G)\,d\mu$ is finite, too. □

This concludes the proof of Theorem 2.1.

Note added in proof. Stronger spectral properties as well as a fluctuation theorem for the entropy production are proved in [22].

Acknowledgement. We would like to thank Pierre Collet, Jean-Pierre Eckmann, Servet Martinez and Claude-Alain Pillet for their comments and suggestions as well as Martin Hairer for useful comments on the controllability issues discussed in Sect. 4. L. E. Thomas is supported in part by NSF Grant 980139.
References

1. Athreya, K.B., Ney, P.: A new approach to the limit theory of recurrent Markov chains. Trans. Am. Math. Soc. 245, 493–501 (1978)
2. Bambusi, D.: Exponential stability of breathers in Hamiltonian networks of weakly coupled oscillators. Nonlinearity 9, 433–457 (1996)
3. Bach, V., Fröhlich, J., Sigal, I.M.: Quantum electrodynamics of confined nonrelativistic particles. Adv. Math. 137, 299–395 (1998)
4. Dym, H., McKean, H.P.: Gaussian processes, function theory, and the inverse spectral problem. Probability and Mathematical Statistics, Vol. 31. New York–London: Academic Press, 1976
5. Eckmann, J.-P., Hairer, M.: Non-equilibrium statistical mechanics of strongly anharmonic chains of oscillators. Commun. Math. Phys. 212, 105–164 (2000)
6. Eckmann, J.-P., Pillet, C.-A., Rey-Bellet, L.: Non-equilibrium statistical mechanics of anharmonic chains coupled to two heat baths at different temperatures. Commun. Math. Phys. 201, 657–697 (1999)
7. Eckmann, J.-P., Pillet, C.-A., Rey-Bellet, L.: Entropy production in non-linear, thermally driven Hamiltonian systems. J. Stat. Phys. 95, 305–331 (1999)
8. Ford, G.W., Kac, M., Mazur, P.: Statistical mechanics of assemblies of coupled oscillators. J. Math. Phys. 6, 504–515 (1965)
9. Gallavotti, G., Cohen, E.G.D.: Dynamical ensembles in stationary states. J. Stat. Phys. 80, 931–970 (1995)
10. Has'minskii, R.Z.: Stochastic stability of differential equations. Alphen aan den Rijn–Germantown: Sijthoff and Noordhoff, 1980
11. Hörmander, L.: The Analysis of linear partial differential operators. Vol. III. Berlin: Springer, 1985
12. Jakšić, V., Pillet, C.-A.: Ergodic properties of classical dissipative systems. I. Acta Math. 181, 245–282 (1998)
13. Jakšić, V., Pillet, C.-A.: On a model for quantum friction. III. Ergodic properties of the spin-boson system. Commun. Math. Phys. 178, 627–651 (1996)
14. Kliemann, W.: Recurrence and invariant measures for degenerate diffusions. Ann. of Prob. 15, 690–702 (1987)
15. Komech, A., Spohn, H., Kunze, M.: Long-time asymptotics for a classical particle interacting with a scalar wave field. Comm. Partial Differ. Eq. 22, 307–335 (1997)
16. Kunita, H.: Supports of diffusion processes and controllability problems. In: Proc. Intern. Symp. SDE Kyoto 1976. New York: Wiley, 1978, pp. 163–185
17. MacKay, R.S., Aubry, S.: Proof of existence of breathers for time-reversible or Hamiltonian networks of weakly coupled oscillators. Nonlinearity 7, 1623–1643 (1994)
18. Meyn, S.P., Tweedie, R.L.: Markov Chains and Stochastic Stability. Communications and Control Engineering Series, London: Springer-Verlag London, 1993
19. Norris, J.: Simplified Malliavin Calculus. In: Séminaire de probabilités XX. Lecture Notes in Math. 1204, Berlin: Springer, 1986, pp. 101–130
20. Nummelin, E.: A splitting technique for stationary Markov Chains. Z. Wahrscheinlichkeitstheorie Verw. Geb. 43, 309–318 (1978)
21. Rey-Bellet, L., Thomas, L.E.: Asymptotic behavior of thermal non-equilibrium steady states for a driven chain of anharmonic oscillators. Commun. Math. Phys. 215, 1–24 (2000)
22. Rey-Bellet, L., Thomas, L.E.: Fluctuations of the entropy production in anharmonic chains. Preprint 2001
23. Rieder, Z., Lebowitz, J.L., Lieb, E.: Properties of a harmonic crystal in a stationary non-equilibrium state. J. Math. Phys. 8, 1073–1085 (1967)
24. Ruelle, D.: Smooth dynamics and new theoretical ideas in non-equilibrium statistical mechanics. J. Stat. Phys. 95, 393–468 (1999)
25. Ruelle, D.: Natural non-equilibrium states in quantum statistical mechanics. J. Stat. Phys. 98, 57–75 (2000)
26. Spohn, H., Lebowitz, J.L.: Stationary non-equilibrium states of infinite harmonic systems. Commun. Math. Phys. 54, 97–120 (1977)
27. Sievers, A.J., Takeno, S.: Intrinsic localized modes in anharmonic crystals. Phys. Rev. Lett. 61, 970–973 (1988)
28. Stroock, D.W., Varadhan, S.R.S.: On the support of diffusion processes with applications to the strong maximum principle. In: Proc. 6th Berkeley Symp. Math. Stat. Prob., Vol. III. Berkeley: Univ. California Press, 1972, pp. 361–368

Communicated by H. Spohn
Commun. Math. Phys. 225, 331 – 359 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
A Quantum Weak Energy Inequality for Dirac Fields in Curved Spacetime
Christopher J. Fewster1, Rainer Verch2
1 Department of Mathematics, University of York, Heslington, York YO10 5DD, UK. E-mail: [email protected]
2 Institut für Theoretische Physik, Universität Göttingen, Bunsenstr. 9, 37073 Göttingen, Germany. E-mail: [email protected]
Received: 21 May 2001 / Accepted: 23 August 2001
Abstract: Quantum fields are well known to violate the weak energy condition of general relativity: the renormalised energy density at any given point is unbounded from below as a function of the quantum state. By contrast, for the scalar and electromagnetic fields it has been shown that weighted averages of the energy density along timelike curves satisfy “quantum weak energy inequalities” (QWEIs) which constitute lower bounds on these quantities. Previously, Dirac QWEIs have been obtained only for massless fields in two-dimensional spacetimes. In this paper we establish QWEIs for the Dirac and Majorana fields of mass m ≥ 0 on general four-dimensional globally hyperbolic spacetimes, averaging along arbitrary smooth timelike curves with respect to any of a large class of smooth compactly supported positive weights. Our proof makes essential use of the microlocal characterisation of the class of Hadamard states, for which the energy density may be defined by point-splitting.
1. Introduction In general relativity, it is customary to assume that the stress-energy tensor satisfies one or more of the classical energy conditions; the weak energy condition, for example, being the assertion that the energy density measured by any observer is nonnegative. The primary motivation behind these energy conditions is that they ensure that gravity acts as an attractive force (in the sense of focussing geodesic congruences) in accordance with our experience of gravitation on a wide range of scales. It is therefore natural to assume that physically reasonable forms of classical matter obey such conditions, and to regard matter theories (such as the nonminimally coupled scalar field [4,43,12,17]) violating such conditions as being of questionable physical significance on many scales. Moreover, the energy conditions have proved to be of great value in obtaining deep results in classical general relativity, such as the positive mass and singularity theorems [39, 48, 21].
However, it is well known that all the pointwise energy conditions are violated in quantum field theory. Indeed, Epstein, Glaser and Jaffe [6] proved that no Wightman field theory on Minkowski space can admit a (nontrivial) energy density observable whose expectation values are bounded from below and vanish in the Minkowski vacuum state. Moreover, in linear field theories (both in flat and curved spacetimes) it is easy to construct states whose energy density at a given point may be tuned to be arbitrarily negative [3, 26]. This raises the possibility that quantum matter might be used to construct spacetimes with exotic properties, such as traversable wormholes [18] or so-called “warp drive” spacetimes [33], usually excluded by the classical energy conditions. One might also ask whether the conclusions of the singularity theorems remain valid for quantum matter. Furthermore, it is necessary to understand how classical matter contrives to obey the classical energy conditions, given that its fundamental constituents need not. One profitable line of enquiry, starting with the work of Ford [14], has been to investigate weighted averages of the renormalised energy density along the worldline of an observer, or over a small spacetime region. It turns out that the expectation values of these averaged observables are bounded from below independently of the state, and such bounds have been developed in successively greater generality over the last few years [16, 34, 11, 7, 9, 22, 8]. Most recently, one of us [8] has established the existence of such bounds (and given an explicit, though not optimal, lower bound) for the minimally coupled real linear scalar field in any globally hyperbolic spacetime, in the case where averaging is performed with respect to proper time along any smooth timelike curve using an arbitrary smooth compactly supported positive weight belonging to the class$^1$
$$\mathcal{W} = \{ f \in C_0^\infty(\mathbb{R}) \mid f(\tau) = g(\tau)^2 \text{ for some real-valued } g \in C_0^\infty(\mathbb{R}) \}. \tag{1.1}$$
The constraints imposed by these lower bounds have been called “quantum inequalities” by various authors. However, as we hope to discuss elsewhere, there seem to be strong parallels between the phenomena discussed above and various situations arising in quantum mechanics. Indeed, quantum inequalities appear to be a widespread feature of quantum theory as a whole, stemming ultimately from the uncertainty principle. In this light, we adopt the more specific terminology “quantum weak energy inequality” (or QWEI) in relation to lower bounds on the renormalised energy density of a quantum field. There are also related constraints which demand that the integral of the energy density over any complete (i.e. inextendible) smooth timelike or lightlike (“null”-) curve be nonnegative. These are called the “averaged weak energy condition” (AWEC) or “averaged null energy condition” (ANEC), respectively; they may be viewed as a limiting case of QWEIs when the weight function (f in Eq. (1.2) below) against which the energy density is integrated along a timelike or lightlike curve approaches the unit function. Historically, it was first pointed out in a work by Tipler [41] that suitable versions of AWEC or ANEC imply singularity theorems in general relativity similar to those which one obtains from pointwise positivity conditions on the energy density. Subsequently, the question of whether quantum fields obey AWEC or ANEC has been investigated in a number of works [26, 13, 46, 49, 50, 15, 12, 42]. Most of these references treat linear quantum fields; [42] establishes ANEC for general (axiomatic) quantum field theory in two-dimensional Minkowski spacetime. A study of the interrelations between averaged energy conditions and QWEIs is contained in [15]. We refer to [12] for further discussion and review of averaged energy conditions. 1 See the remarks following Theorem 4.1 for a brief discussion of this class.
QWEIs place stringent constraints on attempts to generate exotic spacetimes [18, 33] and may open a route towards proving results analogous to the singularity theorems for quantum matter [49, 50]. To date, however, most QWEI results have been obtained for scalar field theories, while the more physically interesting electromagnetic and Dirac fields have received comparatively little attention. Ford and Roman have considered the electromagnetic field in Minkowski space [16] and shown that a QWEI holds for averaging against the Lorentzian weight $f(\tau) = \tau_0/[\pi(\tau^2 + \tau_0^2)]$ along timelike geodesics; this result was generalised to static trajectories in static spacetimes by Pfenning [31], who has also removed the restriction to Lorentzian weights [32]. It is reasonable to suppose that even more general QWEIs may be obtained for this case. Both the scalar and electromagnetic fields have the property that the classical energy density is manifestly nonnegative, a fact which underpins all the results on these fields. The Dirac field is technically very different in that the “classical” energy density is unbounded both from above and below; in second quantization, renormalisation serves the dual purpose of restoring finiteness and imposing positivity of the Hamiltonian. This problem appears to have restricted progress on the Dirac field to date. The main contribution has been that of Vollick, who established a QWEI for Dirac fields in two-dimensional spacetimes [45] by converting the problem to one involving a scalar field and then adapting arguments due to Flanagan [11]. There seems little prospect of generalising this argument beyond the two-dimensional setting. In four dimensions, Vollick has also given explicit examples of states with locally negative energy densities [44] and demonstrated that the resulting energy densities nonetheless obey QWEIs modelled on those for the scalar field.
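As a small numerical aside on the Lorentzian weight $f(\tau) = \tau_0/[\pi(\tau^2+\tau_0^2)]$ quoted above, the sketch below checks that it has unit total integral and that a finite window $[-T, T]$ captures the fraction $(2/\pi)\arctan(T/\tau_0)$ of it. It also makes visible that this weight has tails and hence is not compactly supported, in contrast with the class (1.1). The function names, the value $\tau_0 = 1.5$ and the trapezoid rule are illustrative assumptions and are not taken from the cited works.

```python
import math


def lorentzian_weight(tau, tau0):
    """Lorentzian sampling function f(tau) = tau0 / [pi (tau^2 + tau0^2)]."""
    return tau0 / (math.pi * (tau * tau + tau0 * tau0))


def integrate_trapezoid(func, a, b, steps):
    """Composite trapezoid rule on [a, b] with `steps` equal subintervals."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h


def mass_in_window(half_width, tau0):
    """Closed-form integral of the weight over [-half_width, half_width]."""
    return (2.0 / math.pi) * math.atan(half_width / tau0)


if __name__ == "__main__":
    tau0 = 1.5  # illustrative sampling time, an assumption
    numeric = integrate_trapezoid(lambda t: lorentzian_weight(t, tau0), -30.0, 30.0, 20000)
    print("numeric mass on [-30, 30]:", numeric)
    print("closed form             :", mass_in_window(30.0, tau0))
    print("mass missing in tails   :", 1.0 - mass_in_window(30.0, tau0))
```

The slowly decaying tails are the reason one cannot simply truncate the Lorentzian to obtain a member of the class (1.1): the missing mass falls off only like $1/T$.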
In this paper we establish a general QWEI for massive or massless Dirac fields on four-dimensional$^2$ globally hyperbolic spacetimes. To be more specific, let $\gamma$ be a smooth timelike curve, parametrized by its proper time $\tau$, in a globally hyperbolic spacetime $(M, g)$. Let $\omega_0$ be a given (but arbitrary) Hadamard state$^3$ of the Dirac field on $(M, g)$. The state $\omega_0$ is used as a “reference state” to define the expected normal ordered energy density $\langle {:}T_{00}{:} \rangle_\omega$ for any other Hadamard state $\omega$. Our main result, Theorem 4.1, asserts that
$$\inf_\omega \int d\tau\, \langle {:}T_{00}{:} \rangle_\omega(\gamma(\tau))\, f(\tau) > -\infty, \tag{1.2}$$
where the infimum is taken over the class of Hadamard states and $f$ belongs to the class $\mathcal{W}$. In principle our arguments yield an explicit lower bound for the left-hand side of (1.2). This expression is unfortunately not particularly enlightening and is not expected to be sharp. Let us note that (1.2) remains true if the normal ordered energy density is replaced by the renormalised energy density, as these two quantities differ by a smooth function.

The plan of the paper is as follows. We begin, for completeness and to fix notation, by reviewing the theory of the quantized Dirac field in Sect. 2. Particular attention is given to the class of Hadamard states, which may be characterised by a microlocal spectrum condition on the wave-front set of the two-point function. This formulation of the Hadamard condition is technically convenient and allows us to bring the tools of microlocal analysis to bear. Section 3 explains how the normal ordered energy density may be constructed by point-splitting.

2 The restriction to four dimensions is purely for convenience: our methods would apply in more general dimensions.
3 See Sect. 2 for a brief review of the concepts used here.
The proof of our QWEI begins in Sect. 4, using the following strategy. The averaged normal ordered energy density is first expressed as an integral over $\mathbb{R}^2$; decomposing this integral according to the quadrants of $\mathbb{R}^2$, each piece is then split further into four using a decomposition induced by the reference state $\omega_0$. All but two of the resulting sixteen contributions can be bounded (both above and below) using estimates obtained in Sect. 5. The remaining terms are then expressed in the form $R = \lim_{\Lambda \to +\infty} \mathrm{Tr}\, J_\Lambda W$, where $J_\Lambda$ and $W$ are self-adjoint and $J_\Lambda$ is independent of $\omega$. The parameter $\Lambda \in \mathbb{R}^+$ defines a cut-off, used to avoid domain problems. We prove that $W$ is positive and trace-class with bounded trace as $\omega$ varies. To conclude that $R$ is bounded below, it then suffices to establish that the operators $J_\Lambda$ are bounded below uniformly in $\Lambda$. This is accomplished in Sect. 6, completing the proof of our QWEI. In Sect. 7 we briefly describe how our arguments can also be applied to the Majorana field.

Conventions. The metric signature is $(+, -, -, -)$. Lower (resp. upper) case Latin characters from the beginning of the alphabet will label tetrad (resp. spinor) indices. Tetrad indices run from 0 to 3, and we will use $j, k$ to label the spatial components 1, 2, 3. The summation convention will be used throughout the paper except where otherwise indicated. Units with $c = \hbar = 1$ are adopted. The Fourier transform of an integrable function $f$ on $\mathbb{R}^n$ will be defined using the non-standard convention $\hat f(k) = \int d^n x\, e^{ikx} f(x)$, with inverse $\check h(x) = (2\pi)^{-n} \int d^n k\, e^{-ikx} h(k)$. The Fourier transform of a distribution $u$ with compact support is $\hat u(k) = u(e_k)$ with $e_k(x) = e^{ikx}$. Given a Lorentzian manifold $(M, g)$, $\mathcal{D}'(M)$ will denote the space of distributions on $M$ as defined in §6.3 of [24]. Thus if $u \in \mathcal{D}'(M)$ there is, for each chart $(U, \kappa)$ in $M$, a distribution $u_\kappa \in \mathcal{D}'(\kappa(U))$ such that$^4$
$$u(f) = u_\kappa\big( (\sqrt{-|g|}\, f) \circ \kappa^{-1} \big) \qquad \forall f \in C_0^\infty(U) \tag{1.3}$$
and so that $u_{\kappa'} = (\kappa \circ \kappa'^{-1})^* u_\kappa$ in $\kappa'(U \cap U')$ for any other chart $(U', \kappa')$. We will also (and usually) write $u \circ \kappa^{-1}$ for $u_\kappa$. Spinor and cospinor distributions will be defined in an analogous fashion, with the convention that a spinor distribution acts on test cospinor fields, and vice versa.
4 The factor of $\sqrt{-|g|}$ is used to identify test functions with test densities, on which $u$ strictly speaking acts.

2. The Quantized Dirac Field

2.1. Geometrical preliminaries. In order to make the present paper sufficiently self-contained, we need to summarize a few basic facts about the geometry of spinor fields in curved spacetimes. We will follow Dimock’s work [5] to a large extent. We will consider Dirac fields in a four-dimensional globally hyperbolic spacetime $(M, g)$. To begin with, we recall that a globally hyperbolic spacetime is a Lorentzian spacetime admitting a Cauchy surface, the latter being a smooth hypersurface in $M$ which is intersected exactly once by each inextendible causal curve in $(M, g)$. We will also suppose that $(M, g)$ is orientable and time-orientable, and that such orientations have been chosen. Then $(M, g)$ possesses a spin-structure, that is, there is a principal fibre bundle $S(M, g)$ having $SL(2, \mathbb{C})$ as structure group, acting from the right, together with
a 2-1 fibre-bundle homomorphism $\psi : S(M, g) \to F(M, g)$ which projects $S(M, g)$ onto the frame bundle $F(M, g)$. That is to say, $\psi$ preserves base-points and obeys
$$\psi \circ R_s = R_{\Lambda(s)} \circ \psi, \tag{2.1}$$
where $R$ denotes the right action of the structure groups on the principal fibre bundles involved and $SL(2, \mathbb{C}) \ni s \mapsto \Lambda(s) \in \mathcal{L}_+^\uparrow$ is the covering projection onto the proper orthochronous Lorentz group. We recall that $F(M, g)$ is the bundle of oriented and time-oriented tetrads $(e_0, e_1, e_2, e_3)$, so that $g(e_a, e_b) = \eta_{ab}$ with $\eta_{ab} = \mathrm{diag}(1, -1, -1, -1)$, where $e_0$ is timelike and future-pointing and the tetrad is given the orientation of $M$. Moreover, a collection of $4 \times 4$-matrices $\gamma_0, \ldots, \gamma_3$ is called a set of Dirac matrices if
$$\gamma_a \gamma_b + \gamma_b \gamma_a = 2\eta_{ab} \cdot 1. \tag{2.2}$$
A theorem due to Pauli states that, if $\gamma_0, \ldots, \gamma_3$ and $\gamma'_0, \ldots, \gamma'_3$ are two sets of Dirac matrices, then there is an invertible matrix $M$ so that $\gamma'_a = M \gamma_a M^{-1}$. Any set of Dirac matrices $\gamma_0, \ldots, \gamma_3$ is connected to the covering $SL(2, \mathbb{C}) \xrightarrow{\Lambda(\,\cdot\,)} \mathcal{L}_+^\uparrow$ in the following way. Let $\mathrm{Spin}(1, 3)$ consist of all unimodular $4 \times 4$-matrices $S$ so that
$$S \gamma_a S^{-1} = \gamma_b \Lambda^b{}_a \tag{2.3}$$
holds for some real numbers $\Lambda^b{}_a = \Lambda^b{}_a(S)$. It follows from the defining properties of Dirac matrices that $\Lambda^b{}_a(S)$ is contained in the Lorentz group. The restriction of the map $S \mapsto \Lambda^b{}_a(S)$ in (2.3) to $\mathrm{Spin}_0(1, 3)$, the unit connected component of $\mathrm{Spin}(1, 3)$, is a group homomorphism with range $\mathcal{L}_+^\uparrow$, and thus $\mathrm{Spin}_0(1, 3)$ is isomorphic to $SL(2, \mathbb{C})$. Sometimes it is useful to distinguish sets of Dirac matrices with certain properties. One says that a set $\gamma_0, \ldots, \gamma_3$ of Dirac matrices belongs to a standard representation if
$$\gamma_0^* = \gamma_0 \quad \text{and} \quad \gamma_k^* = -\gamma_k. \tag{2.4}$$
Here, $\gamma_a^*$ is the hermitian adjoint of $\gamma_a$. A set of Dirac matrices which belongs to a standard representation and has the additional property that the complex conjugate matrices fulfil
$$\overline{\gamma_a} = -\gamma_a, \tag{2.5}$$
is said to belong to a Majorana representation. We will now suppose that we are given a globally hyperbolic spacetime $(M, g)$ together with a spin-structure $(S(M, g), \psi)$ and a set of Dirac matrices $\gamma_0, \ldots, \gamma_3$ which we will assume, for the sake of notational simplicity, to belong to a standard representation. (We note, however, that everything which follows could also be carried out in a similar way without that assumption.) As was pointed out above, via the isomorphism $\mathrm{Spin}_0(1, 3) \simeq SL(2, \mathbb{C})$, $\mathbb{C}^4$ carries a representation of the universal covering group of $\mathcal{L}_+^\uparrow$ which is given by the action of the matrices $S$ in $\mathrm{Spin}_0(1, 3)$ on vectors in $\mathbb{C}^4$. Via $\mathrm{Spin}_0(1, 3) \simeq SL(2, \mathbb{C})$, we can also regard $S(M, g)$ as a $\mathrm{Spin}_0(1, 3)$-principal bundle and form the associated vector bundle
$$DM = S(M, g) \times_{\mathrm{Spin}_0(1,3)} \mathbb{C}^4. \tag{2.6}$$
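The algebraic conditions (2.2), (2.4) and (2.5) are easy to verify for a concrete choice of matrices. The following sketch does this for the familiar Dirac representation built from the Pauli matrices; that choice is an assumption made here for illustration (the text only requires some standard representation), and the helper names are hypothetical. The check confirms the anticommutation relation and the standard-representation property, and shows that this particular set does not satisfy the Majorana condition.

```python
# Plain-Python 4x4 complex matrices; the Dirac representation is an assumed example.
ETA = [1, -1, -1, -1]  # metric signature (+, -, -, -)


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def dagger(a):
    """Hermitian adjoint (conjugate transpose)."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def conj(a):
    """Entrywise complex conjugate."""
    return [[x.conjugate() for x in row] for row in a]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]


I2 = [[1 + 0j, 0j], [0j, 1 + 0j]]
Z2 = [[0j, 0j], [0j, 0j]]
SIGMA = [
    [[0j, 1 + 0j], [1 + 0j, 0j]],   # sigma_1
    [[0j, -1j], [1j, 0j]],          # sigma_2
    [[1 + 0j, 0j], [0j, -1 + 0j]],  # sigma_3
]
I4 = block(I2, Z2, Z2, I2)

# gamma_0 = diag(1, 1, -1, -1); gamma_k = [[0, sigma_k], [-sigma_k, 0]].
DIRAC_REP = [block(I2, Z2, Z2, scale(-1, I2))] + [
    block(Z2, s, scale(-1, s), Z2) for s in SIGMA
]


def is_clifford(gammas):
    """Check gamma_a gamma_b + gamma_b gamma_a = 2 eta_ab 1, i.e. relation (2.2)."""
    for a in range(4):
        for b in range(4):
            anti = add(matmul(gammas[a], gammas[b]), matmul(gammas[b], gammas[a]))
            target = scale(2 * ETA[a] if a == b else 0, I4)
            if not close(anti, target):
                return False
    return True


def is_standard(gammas):
    """Check gamma_0^* = gamma_0 and gamma_k^* = -gamma_k, i.e. condition (2.4)."""
    if not close(dagger(gammas[0]), gammas[0]):
        return False
    return all(close(dagger(g), scale(-1, g)) for g in gammas[1:])


def is_majorana(gammas):
    """Standard representation with conj(gamma_a) = -gamma_a, i.e. condition (2.5)."""
    return is_standard(gammas) and all(close(conj(g), scale(-1, g)) for g in gammas)


if __name__ == "__main__":
    print(is_clifford(DIRAC_REP), is_standard(DIRAC_REP), is_majorana(DIRAC_REP))
```

By Pauli's theorem quoted above, any other set of Dirac matrices is related to this one by a similarity transformation, so the check of (2.2) carries over, whereas (2.4) and (2.5) depend on the representation chosen.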
The fibre of $DM$ at $p \in M$ consists of the orbits
$$[s_p, x] = \{ (R_{S^{-1}} s_p, Sx) : S \in \mathrm{Spin}_0(1, 3) \} \tag{2.7}$$
for $s_p \in S(M, g)_p$ and $x \in \mathbb{C}^4$. There is a fibrewise left action of $\mathrm{Spin}_0(1, 3)$ on $DM$ by $L_S [s_p, x] = [s_p, Sx]$. Elements in $DM$ are called spinors, and elements in the dual bundle $D^*M$ are called cospinors. Moreover, if $E$ is a (local) section in $S(M, g)$, then it induces on one hand a tetrad field $(e_0, \ldots, e_3) = \psi \circ E$, i.e. a (local) smooth section in $F(M, g)$, via the spin-structure, and on the other hand it induces a set $(E_A)_{A=1}^4$ of (local) smooth sections in $DM$, defined by
$$E_A = [E, b_A], \tag{2.8}$$
where $b_1, \ldots, b_4$ is the standard basis in $\mathbb{C}^4$. There are corresponding dual tetrad fields $(e^0, \ldots, e^3)$ defined by $e^b(e_a) = \delta^b_a$ and $(E^B)_{B=1}^4$ defined by $E^B(E_A) = \delta^B_A$. The $e^b$ are smooth sections in $T^*M$, and the $E^B$ are smooth sections in $D^*M$, the dual bundle to $DM$. We shall denote by $C^\infty(DM)$ and $C^\infty(D^*M)$ the sets of smooth sections in $DM$ and $D^*M$, respectively. The notation for smooth sections in $TM$ and $T^*M$ will be similar. With respect to the given set of Dirac matrices, one can define a section $\gamma$ in $C^\infty(T^*M) \otimes C^\infty(DM) \otimes C^\infty(D^*M)$, i.e. a mixed spinor-tensor field, by setting its components $\gamma_b{}^A{}_B$ in the induced frame $e^b \otimes E_A \otimes E^B$ to be equal to the matrix elements $(\gamma_b)^A{}_B$ of $\gamma_b$. This definition is independent of the induced frames, i.e. independent of the chosen (local) section $E$ in $S(M, g)$. (Once the set of Dirac matrices $\gamma_0, \ldots, \gamma_3$ is given, $\gamma$ encodes the spin-structure at the level of $DM$.)

Moreover, there is an anti-linear isomorphism $DM \to D^*M$ induced by forming the Dirac adjoint: If $u = u^A E_A$ is a spinor, one can assign to it a cospinor $u^+ = u^+_B E^B$ with components
$$u^+_B = \overline{u^A}\, \gamma_{0AB}, \tag{2.9}$$
where $\gamma_{0AB}$ are the matrix elements of $\gamma_0$. This assignment possesses an inverse (denoted by the same symbol), where a cospinor $v = v_B E^B$ is mapped to a spinor $v^+$ having components $v^{+A} = \gamma_0{}^{AB}\, \overline{v_B}$. Again, $\gamma_0{}^{AB}$ are the matrix elements of $\gamma_0$. The operation of taking the Dirac adjoint gives rise to anti-linear isomorphisms between $C^\infty(DM)$ and $C^\infty(D^*M)$ in the obvious manner.

The metric-induced covariant derivative $\nabla$ on $C^\infty(TM)$ induces a covariant derivative, also denoted by $\nabla$, on $C^\infty(DM)$. If $(e_0, \ldots, e_3)$ and $(E_A)_{A=1}^4$ are induced by a section $E$ in $S(M, g)$ and $f = f^A E_A$ is a local section in $DM$, then
$$\nabla f = \nabla_b f^A\, (e^b \otimes E_A) \in C^\infty(T^*M) \otimes C^\infty(DM) \tag{2.10}$$
has the frame components $\nabla_b f^A = \partial_b f^A + \sigma_b{}^A{}_B f^B$, where
$$\partial_b f^A = df^A(e_b), \qquad \sigma_b{}^A{}_B = -\frac{1}{4} \Gamma^a_{bd}\, \gamma_a{}^A{}_C\, \gamma^{dC}{}_B. \tag{2.11}$$
Here, $df^A$ is the differential of the function $f^A$, and we read the components of the Dirac matrices on the right hand side, while $\Gamma^a_{bd}$ are Christoffel’s connection coefficients, defined by
$$\nabla k = (\partial_b k^a + \Gamma^a_{bd} k^d)\, e^b \otimes e_a \tag{2.12}$$
for $k = k^b e_b \in C^\infty(TM)$. The covariant derivative $\nabla$ can be extended to cospinor fields and mixed spinor-tensor fields by requiring the Leibniz rule and commutativity with contractions. Thus, if $h = h_B E^B$ is a cospinor field, then $\nabla h = \nabla_b h_B\, e^b \otimes E^B$ has the components
$$\nabla_b h_B = \partial_b h_B - h_C\, \sigma_b{}^C{}_B. \tag{2.13}$$
It follows that $\nabla \gamma = 0$.

2.2. The Dirac Equation. The Dirac-operator $\slashed{\nabla}$ is a first order differential operator taking spinor fields to spinor fields, or cospinor fields to cospinor fields; it is defined as the action of the covariant derivative followed by contraction with the spinor-tensor $\gamma$. More precisely, if $f = f^A E_A \in C^\infty(DM)$ and $h = h_B E^B \in C^\infty(D^*M)$, then
$$\slashed{\nabla} f = (\slashed{\nabla} f)^A E_A = \eta^{ab} \gamma_a{}^A{}_B \nabla_b f^B E_A, \tag{2.14}$$
$$\slashed{\nabla} h = (\slashed{\nabla} h)_B E^B = \eta^{ab} \nabla_b h_C\, \gamma_a{}^C{}_B E^B. \tag{2.15}$$
An important property of the Dirac operator is that it commutes with taking the Dirac adjoint, i.e.
$$(\slashed{\nabla} f)^+ = \slashed{\nabla} f^+ \quad \text{and} \quad (\slashed{\nabla} h)^+ = \slashed{\nabla} h^+ \tag{2.16}$$
for $f \in C^\infty(DM)$ and $h \in C^\infty(D^*M)$. The Dirac equation is the following first order partial differential equation for spinor fields $u \in C^\infty(DM)$ or for cospinor fields $v \in C^\infty(D^*M)$:
$$(-i\slashed{\nabla} + m) u = 0, \tag{2.17}$$
$$(i\slashed{\nabla} + m) v = 0, \tag{2.18}$$
where $m \ge 0$ is a constant. Then
$$P = (i\slashed{\nabla} + m)(-i\slashed{\nabla} + m) \tag{2.19}$$
is the Lichnerowicz wave operator on spinors or cospinors. It is a second order wave operator which has metric principal part, and owing to global hyperbolicity of $(M, g)$ this implies that the Cauchy problem for the corresponding wave equations is well-posed and that $P$ possesses uniquely determined advanced and retarded fundamental solutions. As shown in [5], this implies that the Dirac operators $-i\slashed{\nabla} + m$ on spinor fields and $i\slashed{\nabla} + m$ on cospinor fields possess uniquely determined pairs of advanced(−) and retarded(+) fundamental solutions $S^\pm_{\mathrm{sp}}$ and $S^\pm_{\mathrm{cosp}}$, respectively: This means that, for the spinor case,
$$S^\pm_{\mathrm{sp}} : C_0^\infty(DM) \to C^\infty(DM) \tag{2.20}$$
are continuous linear maps so that
$$(-i\slashed{\nabla} + m) S^\pm_{\mathrm{sp}} u = u = S^\pm_{\mathrm{sp}} (-i\slashed{\nabla} + m) u \tag{2.21}$$
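holds for all $u \in C_0^\infty(DM)$ and, moreover, $\mathrm{supp}\, S^\pm_{\mathrm{sp}} u$ is contained in the causal future(+)/causal past(−) of $\mathrm{supp}\, u$. (We note that our convention concerning advanced/retarded fundamental solutions is opposite to that in [38].) The cospinor case is analogous. Then one defines the retarded-minus-advanced fundamental solutions
$$S_{\mathrm{sp}} = S^+_{\mathrm{sp}} - S^-_{\mathrm{sp}} \quad \text{and} \quad S_{\mathrm{cosp}} = S^+_{\mathrm{cosp}} - S^-_{\mathrm{cosp}}. \tag{2.22}$$
In order to quantize the Dirac field, it is very convenient to “double” the system by taking pairs of spinor fields and cospinor fields together, as was done in the references [28, 29, 23]. We give here an equivalent version which makes contact with the notation used in [38]. To this end, let us denote $C_0^\infty(DM)$ by $D_{\mathrm{sp}}$ and $C_0^\infty(D^*M)$ by $D_{\mathrm{cosp}}$, and define the doubled space $D_{\mathrm{double}} = D_{\mathrm{cosp}} \oplus D_{\mathrm{sp}}$. On $D_{\mathrm{double}}$ we introduce the sesquilinear form
$$\left( \begin{pmatrix} h_1 \\ f_1 \end{pmatrix}, \begin{pmatrix} h_2 \\ f_2 \end{pmatrix} \right) = \langle f_1^+, f_2 \rangle - \langle h_2, h_1^+ \rangle \tag{2.23}$$
for $h_1, h_2 \in D_{\mathrm{cosp}}$ and $f_1, f_2 \in D_{\mathrm{sp}}$, where for $v \in D_{\mathrm{cosp}}$ and $u \in D_{\mathrm{sp}}$ we employ the dual pairing
$$\langle v, u \rangle = \int_M d\mu_g(p)\, v_p(u_p) \tag{2.24}$$
with $d\mu_g$ denoting the canonical 4-volume form induced by the metric $g$ on $M$. This dual pairing also embeds $D_{\mathrm{cosp}}$ in the (topological) dual space $D'_{\mathrm{sp}}$ of $D_{\mathrm{sp}}$, and vice versa, embeds $D_{\mathrm{sp}}$ in $D'_{\mathrm{cosp}}$. The sesquilinear form $(\cdot\,, \cdot)$ is non-degenerate, but not positive. A useful relation is
$$\langle S_{\mathrm{cosp}} h, f \rangle = -\langle h, S_{\mathrm{sp}} f \rangle \tag{2.25}$$
for $h \in D_{\mathrm{cosp}}$ and $f \in D_{\mathrm{sp}}$. Let us define the conjugate-linear isomorphism $\Gamma : D_{\mathrm{double}} \to D_{\mathrm{double}}$, playing the role of a charge-conjugation, by
$$\Gamma \begin{pmatrix} h \\ f \end{pmatrix} = \begin{pmatrix} f^+ \\ h^+ \end{pmatrix}. \tag{2.26}$$
Then one finds that $\Gamma$ is a skew-conjugation with respect to $(\cdot\,, \cdot)$, namely it holds that
$$(\Gamma F_1, \Gamma F_2) = -(F_2, F_1) \qquad \forall\, F_1, F_2 \in D_{\mathrm{double}}. \tag{2.27}$$
Now we introduce the following “doubled” operators on $D_{\mathrm{double}}$:
$$D_{\lhd} = \begin{pmatrix} -\slashed{\nabla} + im & 0 \\ 0 & \slashed{\nabla} + im \end{pmatrix}, \qquad D_{\rhd} = \begin{pmatrix} -\slashed{\nabla} - im & 0 \\ 0 & \slashed{\nabla} - im \end{pmatrix}, \tag{2.28}$$
$$S_{\lhd} = \begin{pmatrix} iS_{\mathrm{cosp}} & 0 \\ 0 & iS_{\mathrm{sp}} \end{pmatrix}. \tag{2.29}$$
Then it holds that
$$D_{\lhd} D_{\rhd} = D_{\rhd} D_{\lhd} = P_{\mathrm{double}} = \begin{pmatrix} P_{\mathrm{cosp}} & 0 \\ 0 & P_{\mathrm{sp}} \end{pmatrix}, \tag{2.30}$$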
where $P_{\ldots}$ denotes the Lichnerowicz wave operator on spinors and cospinors, respectively; moreover, one finds that $\Gamma$ commutes with $P_{\mathrm{double}}$ and
$$\Gamma D_{\lhd} = -D_{\lhd} \Gamma, \qquad \Gamma D_{\rhd} = -D_{\rhd} \Gamma, \tag{2.31}$$
$$(D_{\lhd} F_1, F_2) = -(F_1, D_{\lhd} F_2), \qquad (D_{\rhd} F_1, F_2) = -(F_1, D_{\rhd} F_2) \qquad \forall\, F_1, F_2 \in D_{\mathrm{double}}. \tag{2.32}$$
One may also check that $S^\pm_{\lhd}$ (defined in obvious analogy to $S_{\lhd}$) are the retarded(+)/advanced(−) fundamental solutions for the operator $D_{\rhd}$; consequently
$$D_{\rhd} S_{\lhd} = S_{\lhd} D_{\rhd} = 0. \tag{2.33}$$
Furthermore, from (2.31) one can see that
$$\Gamma S_{\lhd} = -S_{\lhd} \Gamma. \tag{2.34}$$
This entails that $\Gamma$ is a complex conjugation for the sesquilinear form
$$(F_1, F_2)_S = (S_{\lhd} F_1, F_2), \qquad F_1, F_2 \in D_{\mathrm{double}}, \tag{2.35}$$
so that
$$(\Gamma F_1, \Gamma F_2)_S = (F_2, F_1)_S \qquad \forall\, F_1, F_2 \in D_{\mathrm{double}}. \tag{2.36}$$
On the other hand one can see that (cf. (2.25))
$$\left( \begin{pmatrix} h_1 \\ f_1 \end{pmatrix}, \begin{pmatrix} h_2 \\ f_2 \end{pmatrix} \right)_S = -i \langle f_1^+, S_{\mathrm{sp}} f_2 \rangle + i \langle S_{\mathrm{cosp}} h_2, h_1^+ \rangle \tag{2.37}$$
and this implies by Prop. 2.2 in [5] that $(\cdot\,, \cdot)_S$ is positive-semidefinite, $(F, F)_S \ge 0$. Now we introduce the quotient space $D_{\mathrm{double}} / \ker S_{\lhd}$ and denote by $H$ its completion with respect to $(\cdot\,, \cdot)_S$. The conjugation $\Gamma$ induces by (2.34) a conjugation on $H$ which will again be denoted by $\Gamma$. Hence, we have derived from the doubled Dirac equation a complex Hilbert-space $H$ (with scalar product $(\cdot\,, \cdot)_S$) together with a complex conjugation $\Gamma$. The system can be quantized, following Araki [1], by assigning to these data the algebra of canonical anti-commutation relations $\mathrm{CAR}(H, \Gamma)$. This is the unique $C^*$-algebra with unit $1$ which is generated by a family $\{ B(v) : v \in H \}$ subject to the relations:
(i) $v \mapsto B(v)$ is $\mathbb{C}$-linear,
(ii) $B(\Gamma v) = B(v)^*$,
(iii) $B(v)^* B(w) + B(w) B(v)^* = (v, w)_S \cdot 1$.
Now let $q : D_{\mathrm{double}} \to D_{\mathrm{double}} / \ker S_{\lhd}$ denote the quotient map, then we define the quantized Dirac field as the linear map which assigns to each $h \in D_{\mathrm{cosp}}$ the element
$$\Psi(h) = B\left( q \begin{pmatrix} h \\ 0 \end{pmatrix} \right) \tag{2.38}$$
in $\mathrm{CAR}(H, \Gamma)$. The adjoint spinor field will be defined by
$$\Psi^+(f) = B\left( q \begin{pmatrix} 0 \\ f \end{pmatrix} \right), \qquad f \in D_{\mathrm{sp}}. \tag{2.39}$$
As a consequence of (iii), the field and its adjoint satisfy the anti-commutation relations
\[ \Psi(h)\Psi^+(f) + \Psi^+(f)\Psi(h) = -i\langle h, S_{\rm sp} f\rangle. \tag{2.40} \]
We also note that, owing to (2.33), $B(q(D_{\rhd}F)) = 0$, and this entails
\[ \Psi((i\nabla\!\!\!/ + m)h) = 0 \quad\text{and}\quad \Psi^+((-i\nabla\!\!\!/ + m)f) = 0 \tag{2.41} \]
for all $h \in \mathcal{D}_{\rm cosp}$ and $f \in \mathcal{D}_{\rm sp}$.
Remarks. (i) $\Psi$ acts linearly on cospinors and fulfills, in the sense of distributions, the equation $(-i\nabla\!\!\!/ + m)\Psi = 0$, therefore the map $h \mapsto \Psi(h)$ is regarded as a spinor field. Similarly, $\Psi^+$ acts linearly on spinor fields and fulfills in the distributional sense $(i\nabla\!\!\!/ + m)\Psi^+ = 0$; hence it is viewed as a cospinor field.
(ii) $\Psi$ and $\Psi^+$ are $C^*$-valued distributions since e.g. $2\|\Psi^+(f)\|^2 = -i\langle f^+, S_{\rm sp} f\rangle$ and $\mathcal{D}_{\rm sp} \otimes \mathcal{D}_{\rm sp} \ni f_1 \otimes f_2 \mapsto -i\langle f_1^+, S_{\rm sp} f_2\rangle$ is continuous (with respect to the usual test-function topology).
(iii) We briefly comment on the case where $\gamma_0, \ldots, \gamma_3$ belong to a Majorana representation (as considered in [38]) and one wishes to quantize the Majorana field. In that situation, $\mathcal{D}_{\rm cosp}$ (and similarly, $\mathcal{D}_{\rm sp}$) carries an “intrinsic” charge conjugation $\Gamma$ given by $v = v_B E^B \mapsto \overline{v_B} E^B$ in any frame. Then $\Gamma$ is a skew-conjugation for the sesquilinear form
\[ (h, h') = \int_M d\mu_g(p)\, h_p(h'^+_p) \tag{2.42} \]
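on $\mathcal{D}_{\rm cosp}$. Upon defining $D_{\rhd} = \nabla\!\!\!/ - im$, $D_{\lhd} = \nabla\!\!\!/ + im$ and $S_{\lhd} = iS_{\rm cosp}$, one obtains similar relations as before. Then $\mathcal{H}$ arises as completion of $\mathcal{D}_{\rm cosp}/\ker S_{\lhd}$ and $(h, h')_S = (S_{\lhd} h, h')$. The field operators then simplify to
\[ \Psi(h) = B(q(h)); \qquad \Psi^+(f) = \Psi(f^+)^* = \Psi(\Gamma f^+) \tag{2.43} \]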
for $h \in \mathcal{D}_{\rm cosp}$, $f \in \mathcal{D}_{\rm sp}$. In other words, the Majorana field may be quantized without doubling the classical system. Instead of starting with $\mathcal{D}_{\rm cosp}$ one can likewise consider $\mathcal{D}_{\rm sp}$; this has been done in [38].
2.3. Hadamard states. We recall that a state on a $C^*$-algebra $\mathcal{C}$ is a linear functional $\omega : \mathcal{C} \to \mathbb{C}$ fulfilling $\omega(1) = 1$ and $\omega(A^*A) \ge 0$ for all $A \in \mathcal{C}$. For the purpose of the present work, it is sufficient to focus on the two-point functions $\omega_2$ of states $\omega$ on CAR$(\mathcal{H}, \Gamma)$. The two-point function $\omega_2$ of $\omega$ is an element in $(\mathcal{D}_{\rm double} \otimes \mathcal{D}_{\rm double})'$ given by
\[ \omega_2(F_1 \otimes F_2) = \omega(B(q(F_1))B(q(F_2))), \qquad F_1, F_2 \in \mathcal{D}_{\rm double}. \tag{2.44} \]
It was shown in [1] that there is for each state $\omega$ on CAR$(\mathcal{H}, \Gamma)$ a linear operator $Q$ on $\mathcal{H}$ with the properties:
(I) $0 \le Q^* = Q \le 1$,
(II) $Q + \Gamma Q \Gamma = 1$,
(III) $\omega_2(F_1 \otimes F_2) = (\Gamma q(F_1), Qq(F_2))_S$ for all $F_1, F_2 \in \mathcal{D}_{\rm double}$.
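Properties (I) to (III) can be seen at work in a small matrix model. The following sketch assumes a hypothetical Hilbert space C^n ⊕ C^n with Γ(x, y) = (conj(y), conj(x)), realises B(v) through Jordan-Wigner matrices, and takes the Fock vacuum as the state. The labelling operator is then the projection onto the first summand. All names are illustrative and not taken from the paper.

```python
import numpy as np
from functools import reduce

n = 2
Z, I2 = np.diag([1.0, -1.0]), np.eye(2)
low = np.array([[0.0, 1.0], [0.0, 0.0]])
# Jordan-Wigner annihilation operators on (C^2)^{tensor n}
a = [reduce(np.kron, [Z] * j + [low] + [I2] * (n - j - 1)).astype(complex)
     for j in range(n)]

def B(v):
    """B(x, y) = sum_j x_j a_j^dagger + y_j a_j for v = (x, y)."""
    return sum(v[j] * a[j].conj().T + v[n + j] * a[j] for j in range(n))

def gamma(v):
    """Assumed conjugation Gamma(x, y) = (conj(y), conj(x))."""
    return np.concatenate([v[n:].conj(), v[:n].conj()])

# Gamma A Gamma acts as the matrix S conj(A) S, with S the swap of the summands.
S = np.block([[np.zeros((n, n)), np.eye(n)], [np.eye(n), np.zeros((n, n))]])
Q = np.diag([1.0] * n + [0.0] * n)          # candidate labelling operator

vac = np.zeros(2 ** n, dtype=complex)
vac[0] = 1.0                                 # annihilated by every a_j

rng = np.random.default_rng(0)
v = rng.normal(size=2 * n) + 1j * rng.normal(size=2 * n)
w = rng.normal(size=2 * n) + 1j * rng.normal(size=2 * n)

eig = np.linalg.eigvalsh(Q)
prop_I = bool(np.allclose(Q, Q.conj().T) and eig.min() >= -1e-12 and eig.max() <= 1 + 1e-12)
prop_II = bool(np.allclose(Q + S @ Q.conj() @ S, np.eye(2 * n)))
two_point = np.vdot(vac, B(v) @ B(w) @ vac)  # omega(B(v) B(w)) in the vacuum
prop_III = bool(np.isclose(two_point, np.vdot(gamma(v), Q @ w)))
is_pure = bool(np.allclose(Q @ Q, Q))        # Q is a projection

print(prop_I, prop_II, prop_III, is_pure)
```

In this toy model the vacuum is a Fock state, consistent with Q being a projection.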
A Quantum Weak Energy Inequality for Dirac Fields in Curved Spacetime
Conversely, each linear operator $Q$ with the properties (I) and (II) determines by (III) the two-point function $\omega_2$ of some state $\omega$ on CAR$(\mathcal{H}, \Gamma)$, a so-called quasifree state, determined by the two-point function $\omega_2$ (see [1] for discussion). Such a quasifree state $\omega$ is pure (and then often called a Fock state) if and only if $Q$ is a projection, i.e. $Q^2 = Q$. The operator $Q$ with the properties (I), (II) and (III) above will be called the operator labelling the quasifree state $\omega$.
The stress-energy tensor is defined using the two-point functions of a particular class of states, the Hadamard states. One says that a state $\omega$ on CAR$(\mathcal{H}, \Gamma)$ is a Hadamard state if we may write
\[ \omega_2(F_1 \otimes F_2) = i\,w(D_{\lhd}F_1 \otimes F_2) \tag{2.45} \]
for some distribution $w \in (\mathcal{D}_{\rm double} \otimes \mathcal{D}_{\rm double})'$ of Hadamard form for the doubled wave-operator $P_{\rm double}$ on $\mathcal{D}_{\rm double}$; the definition of a Hadamard form for such a wave-operator has been given in [38] (cf. also [28, 29, 23, 30]). This definition entails that the difference between the two-point functions of two Hadamard states is smooth. For the purposes of this work, we will also need the characterization of Hadamard states in terms of properties of the wave-front set WF$(\omega_2)$ that appears in [29, 23, 38], following a line of argument given in [35] for the scalar case. The relevant statement, proven in the references just stated, is: A state $\omega$ on CAR$(\mathcal{H}, \Gamma)$ is a Hadamard state if and only if the wave-front set of its two-point function $\omega_2$ satisfies the relation
\[ {\rm WF}(\omega_2) = \{ (p, \xi; p', -\xi') \in \dot{T}^*(M \times M) \mid (p, \xi) \sim (p', \xi');\ \xi \in N_p^+ \}, \tag{2.46} \]
where $\dot{T}^*(M \times M)$ is the cotangent bundle over $M \times M$ without the zero-section, $(p, \xi) \sim (p', \xi')$ means that there is a lightlike geodesic connecting the points $p$ and $p'$ in $M$ and to which $\xi$ and $\xi'$ are co-tangent, and $N_p^+$ is the set of all future-directed null covectors at $p$. (We remark that in [38] the opposite sign convention for the Fourier transform was chosen, leading to the opposite constraint $\xi \in N_p^-$ (i.e., past-directed null covectors) in that reference compared to (2.46).)
At this point we very briefly recall the definition of the wave-front set of a distribution [24]. For a distribution $t \in \mathcal{D}'(\mathbb{R}^n)$, a point $(x, k) \in \mathbb{R}^n \times (\mathbb{R}^n \setminus \{0\})$ is called a regular directed point of $t$ if there exists $\chi \in \mathcal{D}(\mathbb{R}^n)$ with $\chi(x) \neq 0$ and a conic open neighbourhood $C$ of $k$ in $\mathbb{R}^n \setminus \{0\}$ such that
\[ \sup_{k' \in C} (1 + |k'|)^N\, |\widehat{\chi t}(k')| < \infty \quad \forall\, N \in \mathbb{N}. \tag{2.47} \]
(If this holds we will say that $\widehat{\chi t}$ is of rapid decay in $C$.) The complement in $\mathbb{R}^n \times (\mathbb{R}^n \setminus \{0\})$ of the set of all regular directed points of $t$ is called the wave-front set WF$(t)$ of $t$. Given a scalar distribution $\tau$ on a manifold $M$, one says that a non-zero covector $(p, \xi) \in \dot{T}^*(M)$ is in WF$(\tau)$ if there is a coordinate chart $(U, \kappa)$ around $p \in M$ so that $(\kappa(p), {}^t(\kappa^{-1})'\xi) \in {\rm WF}(\tau \circ \kappa^{-1})$ where (as discussed at the end of Sect. 1) $\tau \circ \kappa^{-1} \in \mathcal{D}'(\kappa(U))$ is a distribution on the chart range of $\kappa$. This definition of WF$(\tau)$ is independent of the choice of the chart $\kappa$. We refer the reader to [24] for further discussion of the properties of the wave-front set of distributions on manifolds. For the case that $\tau$ is a distribution on test-sections of a vector bundle, e.g. defined on $\mathcal{D}_{\rm double}$, $\tau$ can be viewed, via (local) trivializations of the bundle, as the collection $(\tau^A{}_B)_{A,B}$ of scalar distributions, and then WF$(\tau)$ is defined as the union of WF$(\tau^A{}_B)$ over all components $A, B$. It is not difficult to see that this definition is independent of the chosen (local) trivialization.
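The notion of a regular directed point can be illustrated numerically in one dimension. The sketch below is a rough illustration under several assumptions: the distribution is 1/(x + i0), regularised by a small hypothetical parameter `eps`; a Gaussian stands in for the compactly supported cutoff χ; and the Fourier transform uses the kernel exp(-ikx), which need not be the sign convention of the paper. With these choices the localised transform does not decay for k > 0 and decays rapidly for k < 0, so only directions k > 0 over x = 0 belong to the wave-front set.

```python
import numpy as np

def fourier(u, x, k):
    """Riemann-sum approximation of the integral of exp(-i k x) u(x) dx."""
    h = x[1] - x[0]
    return h * np.sum(np.exp(-1j * k * x) * u)

eps = 1e-3                            # regularisation of 1/(x + i0)
x = np.arange(-10.0, 10.0, eps / 10)  # grid fine enough to resolve eps
chi = np.exp(-x ** 2)                 # Gaussian stand-in for a cutoff, chi(0) != 0
t = 1.0 / (x + 1j * eps)

ks = [5.0, 10.0, 20.0]
plus = [abs(fourier(chi * t, x, k)) for k in ks]    # direction k > 0
minus = [abs(fourier(chi * t, x, -k)) for k in ks]  # direction k < 0

for k, p, m in zip(ks, plus, minus):
    print(f"|k| = {k:5.1f}   k>0: {p:.3e}   k<0: {m:.3e}")
```

The values for k > 0 stay of order one (close to 2π), while those for k < 0 fall off faster than any power until they reach rounding noise.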
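Now let $\omega_2$ be the two-point function of a Hadamard state $\omega$ on CAR$(\mathcal{H}, \Gamma)$, and let $Q$ be the corresponding operator on $\mathcal{H}$ with the properties (I), (II) and (III) above. We will use the notation
\[ Q^\Gamma = \Gamma Q \Gamma \tag{2.48} \]
for the “charge conjugate” of $Q$, and we will adopt this notation also for other operators on $\mathcal{H}$. In studying the stress-energy tensor, we will be particularly interested in the following distributions on $\mathcal{D}_{\rm sp} \otimes \mathcal{D}_{\rm cosp}$ associated with $\omega$ (respectively, with $Q$), which we will also refer to as two-point functions:
\[ \omega_Q(f \otimes h) = \omega(\Psi^+(f)\Psi(h)), \qquad \omega_Q^\Gamma(f \otimes h) = \omega(\Psi(h)\Psi^+(f)), \tag{2.49} \]
where $f \in \mathcal{D}_{\rm sp}$ and $h \in \mathcal{D}_{\rm cosp}$. Note that $\omega_Q^\Gamma = \omega_{Q^\Gamma}$. As a consequence of the constraint on the wave-front set (2.46) for Hadamard states, which is also called the microlocal spectrum condition, one finds that the following microlocal spectrum condition holds for $\omega_Q$ and $\omega_Q^\Gamma$:
\[ {\rm WF}(\omega_Q^\sharp) = \{ (p, \xi; p', -\xi') \in \dot{T}^*(M \times M) \mid (p, \xi) \sim (p', \xi');\ \xi \in N_p^\sharp \}. \tag{2.50} \]
Here, and below, we use $\sharp$ and $\flat$ to denote either the presence or absence of a $\Gamma$ in the following context-dependent way:
\[ \begin{array}{c|cccc} \sharp & Q^\sharp & \omega_Q^\sharp & N_p^\sharp & \mathbb{R}^\sharp \\ \hline \cdot & Q & \omega_Q & N_p^+ & \mathbb{R}^+ \\ \Gamma & Q^\Gamma & \omega_Q^\Gamma & N_p^- & \mathbb{R}^- \end{array} \]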
To avoid confusion we will sometimes use $\cdot$ as a placeholder to indicate the absence of a $\Gamma$. Thus $Q^\cdot = Q$, for example. We note that the microlocal spectrum condition in the form (2.50) has been proved directly for quasifree Hadamard states in [29, 23]. Let us note that (2.49) may be written
\[ \omega_Q(f \otimes h) = \left( \Gamma q\begin{pmatrix} 0 \\ f \end{pmatrix}, Qq\begin{pmatrix} h \\ 0 \end{pmatrix} \right)_S; \tag{2.51} \]
as a slight abuse of notation we will use this relation to define $\omega_Q$ for general bounded operators $Q$ on $\mathcal{H}$ and refer to $\omega_Q$ as the two-point function labelled by $Q$. (Of course $\omega_Q$ is not in general the two-point function of a state.) We will also denote by Had$(\mathcal{H}, \Gamma)$ the class of operators which obey properties (I) and (II) and such that $\omega_Q^\sharp$ obeys (2.50). Thus Had$(\mathcal{H}, \Gamma)$ parametrizes the quasifree Hadamard states on CAR$(\mathcal{H}, \Gamma)$.
Our proof of Thm. 4.1 below relies on the existence of pure, quasifree Hadamard states for the Dirac field. It seems that this has never been established in the literature in full detail, therefore we sketch here how such states may be constructed by adapting an argument employed by Fulling, Narcowich and Wald [19] for the case of the free scalar field to the Dirac field. The first step is to show that there exists a pure, quasifree Hadamard state for the Dirac field for ultrastatic $(M, g)$. In fact, if $(M, g)$ is ultrastatic, then it may be endowed with a suitable spin structure so that the ultrastatic time shifts give rise to a continuous unitary group $U_t$ ($t \in \mathbb{R}$) on $\mathcal{H}$ leaving the scalar product $(\cdot\,,\cdot)_S$ invariant and fulfilling $\Gamma U_t = U_t \Gamma$. Moreover, if the mass parameter $m$ appearing in the Dirac equation is strictly positive, then the spectrum of the self-adjoint generator of $U_t$
is bounded away from zero. Theorem 2 in [1], or the results in [47], thus show that there is a pure quasifree state ω0 on the CAR algebra of the Dirac field on ultrastatic (M, g) which is a ground state for the C ∗ -dynamics induced by Ut ; the projection P0 labelling ω0 is the projection onto the positive spectral subspace of the unitary group Ut . Since ω0 is a ground state, it fulfills the microlocal spectrum condition [37] and hence is a Hadamard state [38]. In a second step, one uses a technique developed in [19] which allows one to view a neighbourhood of a Cauchy surface of any given globally hyperbolic spacetime as being isometrically embedded in a globally hyperbolic spacetime that has an ultrastatic part (with suitable spin structure as above) in its past. By the uniqueness of the Cauchy problem and the “propagation of Hadamard form” under the dynamics of the Dirac equation [38], any pure, quasifree Hadamard state prescribed on the ultrastatic part of the spacetime (e.g. ω0 ) induces a pure, quasifree Hadamard state everywhere on the spacetime, in particular on the embedded neighbourhood of the Cauchy surface of the initially given globally hyperbolic spacetime. Using the same argument once more, a pure, quasifree Hadamard state for the Dirac field is thereby induced on any given globally hyperbolic spacetime. The mass parameter m may be allowed to be variable over spacetime in this process without affecting the Hadamard form, so that one obtains a pure, quasifree Hadamard state of the Dirac field on any globally hyperbolic spacetime for any m ≥ 0. The argument just sketched implicitly also shows that there exists an abundance of quasifree Hadamard states. 3. A Point-Split Energy Density For the remainder of this paper, we will assume that (M, g) is globally hyperbolic, orientable and time orientable, with spin structure (S(M, g), ψ) and that the Dirac matrices γa belong to a standard representation. 
Let $\gamma : \mathbb{R} \to M$ be a smooth timelike curve in $(M, g)$, parametrized by its proper time, along which we wish to establish a QWEI. The starting point is the construction of a normal ordered energy density on $\gamma$, which is accomplished as follows. We first claim that there exists a tubular neighbourhood $C_\gamma$ of $\gamma$ and a local section $E$ of $S(M, g)$ over $C_\gamma$ such that the induced tetrad field $(e_0, \ldots, e_3) = \psi \circ E$ satisfies $e_0|_\gamma = u$, where $u = \dot\gamma$ is the velocity of $\gamma$. To see this, choose any locally finite open cover $\{U_j \mid j \in \mathbb{Z}\}$ of $\gamma$ by charts $U_j$ such that (i) $U_j \cap U_k = \emptyset$ unless $|j - k| \le 1$; (ii) $U_j \cap U_{j+1}$ is contractible for each $j \in \mathbb{Z}$; (iii) $U_j \cap U_k \cap U_l = \emptyset$ if $j, k, l$ are distinct. The existence of such a cover follows from global hyperbolicity of $(M, g)$ since $\gamma$ is timelike. Now extend $u$ to a smooth timelike unit vector field $u$ on some tubular neighbourhood $C_\gamma$ of $\gamma$, so that $C_\gamma \subset \bigcup_j U_j$. Choose any tetrad $(e'_0, \ldots, e'_3)$ on $C_\gamma$. Then we may obtain a tetrad $(e_0, \ldots, e_3)$ with $e_0 = u$ by applying a unique boost in $L_+^\uparrow$ at each point (whose parameters are given by the components of $u$ with respect to $e'_a$, and therefore vary smoothly). This tetrad lifts smoothly to $S(M, g)$ in each $U_j \cap C_\gamma$ and may be patched together along $C_\gamma$ to obtain the required section $E$ by virtue of properties (i), (ii) and (iii).5 Next, we choose smooth spinor fields $v_A$ ($A = 1, \ldots, 4$) in $C_\gamma$, such that
\[ \delta^{AB} v_A \otimes v_B^+ = \gamma_0. \tag{3.1} \]
5 Of course, there are exactly two such sections.
This is easily satisfied by taking $v_A = E_A$, where $E_A$ is the spin frame induced by $E$; however, it will be convenient to make a slightly different choice when considering the Majorana field. The changes relevant for Majorana fields will be described in Sect. 7. With respect to the frame $e_a$ the Dirac stress-energy tensor is
\[ T_{ab} = \frac{i}{2}\left[ \psi^+ \gamma_{(a} \nabla_{b)} \psi - (\nabla_{(a} \psi^+)\gamma_{b)} \psi \right], \tag{3.2} \]
which is manifestly symmetric, and is conserved provided $\psi$ obeys the Dirac equation (2.17). In particular, $T_{00}(\gamma(\tau))$ is the energy density measured by an observer with worldline $\gamma$ at proper time $\tau$. We may use (3.1) to define a bi-scalar point-split energy density
\[ T(x, y) = \delta^{AB}\, \frac{i}{2}\left[ (\psi^+ v_A)(x)\,(v_B^+ e_0 \cdot \nabla\psi)(y) - ([e_0 \cdot \nabla\psi^+] v_A)(x)\,(v_B^+ \psi)(y) \right] \tag{3.3} \]
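with the property that $T(x, x) = T_{00}(x)$. Integrating by parts, $T$ becomes a scalar bi-distribution $T \in (\mathcal{D}(M) \otimes \mathcal{D}(M))'$,
\[ T(f \otimes g) = \delta^{AB}\, \frac{i}{2}\left[ \psi^+(\nabla \cdot [e_0 v_A f])\,\psi(v_B^+ g) - \psi^+(v_A f)\,\psi(\nabla \cdot [e_0 v_B^+ g]) \right]. \tag{3.4} \]
Here (and below) the notation $\nabla \cdot [e_0 v]$ denotes minus the distributional dual of $e_0 \cdot \nabla$, applied to the test function or (co)spinor $v$. Thus,
\[ \nabla \cdot [e_0 v] = v\, \nabla \cdot e_0 + e_0 \cdot \nabla v, \tag{3.5} \]
where, with respect to local coordinates $(x^\mu)$, $\nabla \cdot e_0 = \nabla_\mu e_0^\mu$ and $e_0 \cdot \nabla v = e_0^\mu \nabla_\mu v$. Upon quantization, we obtain the algebra-valued bi-distribution $T$ given by
\[ T(f \otimes g) = \delta^{AB}\, \frac{i}{2}\left[ \Psi^+(\nabla \cdot [e_0 v_A f])\,\Psi(v_B^+ g) - \Psi^+(v_A f)\,\Psi(\nabla \cdot [e_0 v_B^+ g]) \right]. \tag{3.6} \]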
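Given a state $\omega$ we will now use the same symbol to denote its two-point function $\omega(f \otimes g) = \omega(\Psi^+(f)\Psi(g))$ and also set $v_{AB} = v_A \otimes v_B^+$. Thus $v_{AB}\,\omega$ will denote the matrix of scalar bi-distributions
\[ v_{AB}\,\omega(f \otimes g) = \omega(\Psi^+(v_A f)\Psi(v_B^+ g)). \tag{3.7} \]
The formulae $\nabla \cdot (e_0 v_A f) = v_A \nabla \cdot (e_0 f) + \sigma_0{}^C{}_A f v_C$ and $\nabla \cdot (e_0 v_B^+ g) = v_B^+ \nabla \cdot (e_0 g) + \sigma_0{}^C{}_B g v_C^+$ now allow us to write the expectation value $\langle T \rangle_\omega$ of $T$ in state $\omega$ as
\[ \langle T \rangle_\omega = L^{AB} v_{AB}\,\omega, \tag{3.8} \]
where
\[ L^{AB} = \frac{1}{2}\left( 1 \otimes i e_0 \cdot \nabla - i e_0 \cdot \nabla \otimes 1 \right)\delta^{AB} + \frac{1}{2} A^{AB}, \tag{3.9} \]
and
\[ A^{AB} = i\left[ \delta^{CB} \sigma_0{}^A{}_C \otimes 1 - 1 \otimes \delta^{AC} \sigma_0{}^B{}_C \right]. \tag{3.10} \]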
If a reference Hadamard state $\omega_0$ is now specified, we may define the normal ordered point-split energy density (with respect to $\omega_0$) by
\[ \langle {:}T{:} \rangle_\omega = \langle T \rangle_\omega - \langle T \rangle_{\omega_0}. \tag{3.11} \]
This may also be written
\[ \langle {:}T{:} \rangle_\omega = L^{AB} v_{AB}\, {:}\omega{:}, \tag{3.12} \]
where ${:}\omega{:} = \omega - \omega_0$ is the normal ordered two-point function. Since ${:}\omega{:}$ is smooth for Hadamard $\omega$, $\langle {:}T{:} \rangle_\omega$ is also smooth. Accordingly, the “coincidence limit” (i.e., the restriction of $\langle {:}T{:} \rangle_\omega$ to the diagonal) is well defined and yields the normal ordered energy density near $\gamma$. We denote the energy density along $\gamma$ by
\[ \langle {:}\rho{:} \rangle_\omega(\tau) = \langle {:}T{:} \rangle_\omega(\gamma(\tau), \gamma(\tau)). \tag{3.13} \]
Note also that $\langle {:}T{:} \rangle_\omega(f \otimes g)$ is symmetric in $f$ and $g$ by virtue of the CAR's. It will be convenient to regard $\langle {:}\rho{:} \rangle_\omega$ as the diagonal of the pull-back $\gamma_2^* \langle {:}T{:} \rangle_\omega$, where $\gamma_2(\tau, \tau') = (\gamma(\tau), \gamma(\tau'))$. In turn, $\gamma_2^* \langle {:}T{:} \rangle_\omega$ may be written as the action of a differential operator on the pulled-back normal ordered two-point function:
\[ \gamma_2^* \langle {:}T{:} \rangle_\omega = \frac{1}{2}\left[ (1 \otimes D - D \otimes 1)\delta^{AB} + \gamma_2^* A^{AB} \right] \gamma_2^* v_{AB}\, {:}\omega{:}, \tag{3.14} \]
where $D = i\,d/d\tau$ (strictly speaking, $D$ should be regarded as the distributional dual of $-i\,d/d\tau$).
4. Main Argument
We now come to the proof of the QWEI for Dirac fields. In the following, $(M, g)$ is assumed to satisfy the hypotheses stated at the beginning of the previous section.
Theorem 4.1. Let $\gamma : \mathbb{R} \to M$ be a smooth timelike curve in $(M, g)$ parametrized by its proper time. Let $\omega_0$ be a Hadamard state of the Dirac field on $(M, g)$. Define the normal ordered energy density $\langle {:}\rho{:} \rangle_\omega$ by (3.13) with respect to the reference state $\omega_0$. Then for any weight $f$ belonging to $W$ (defined in Eq. (1.1))
\[ \inf_\omega \int d\tau\, \langle {:}\rho{:} \rangle_\omega(\tau) f(\tau) > -\infty, \tag{4.1} \]
where the infimum is taken over all Hadamard states $\omega$. That is, there exists a quantum weak energy inequality for the Dirac field.
Remarks. (i) If the reference state $\omega_0$ is changed, $\langle {:}\rho{:} \rangle_\omega$ is modified by a smooth function which is independent of $\omega$. Thus we may assume without loss of generality that $\omega_0$ is pure and quasifree. Exactly the same argument entails that (4.1) holds if we replace the normal ordered energy density by the renormalised energy density.
(ii) Perhaps surprisingly, the class $W$ (of squares of real-valued $C_0^\infty(\mathbb{R})$ functions) does not coincide with the class of nonnegative smooth compactly supported functions. In fact, Glaeser [20]6 has constructed an example of a $C^\infty$ nonnegative function $f$, vanishing only at the origin, so that $\frac{d^2}{dx^2}\sqrt{f(x)}$ diverges as $x \to 0$. The delicacy of this point resides in the behaviour of $f$ at zeros of infinite order. It is not clear whether the restriction to weights in $W$ is purely a technical limitation of our proof, or whether QWEIs should be understood as quadratic form results (cf. [10]).
6 We are grateful to S.P. Eveson and P.J. Bushell for help in locating this reference.
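Proof. It is sufficient to prove (4.1) for arbitrary $f \in C_0^\infty(I) \cap W$, where $I \subset \mathbb{R}$ is an arbitrary open interval with compact closure. To this end, choose $\eta \in C_0^\infty(M)$ such that $\eta$ equals unity on a neighbourhood of $\gamma(I)$. It is easy to see that $\langle {:}\rho{:} \rangle_\omega$ is unaltered on $I$ if we replace $v_{AB}\,{:}\omega{:}$ in (3.14) by the compactly supported distribution $u_{AB}\,{:}\omega{:}$, where
\[ u_{AB} = \eta v_A \otimes \eta v_B^+. \tag{4.2} \]
Applying the formula
\[ \int d\tau\, F(\tau, \tau)\varphi(\tau) = \int \frac{d\lambda\, d\lambda'}{(2\pi)^2}\, \hat\varphi(\lambda - \lambda')\, \hat F(-\lambda, \lambda'), \tag{4.3} \]
which is valid for $F \in C_0^\infty(\mathbb{R}^2)$ and $\varphi \in C_0^\infty(\mathbb{R})$, one may show that
\[ I = \int d\tau\, \langle {:}\rho{:} \rangle_\omega(\tau) f(\tau) = \int d\lambda\, d\lambda'\, J^{AB}(\lambda, \lambda')\, W^{(\omega)}_{AB}(\lambda, \lambda'), \tag{4.4} \]
where
\[ W^{(\omega)}_{AB}(\lambda, \lambda') = \left[ \gamma_2^* u_{AB}\, {:}\omega{:} \right]^\wedge(-\lambda, \lambda'), \tag{4.5} \]
and we have also written
\[ J^{AB}(\lambda, \lambda') = \frac{1}{8\pi^2}\left[ (\lambda + \lambda')\hat f(\lambda - \lambda')\delta^{AB} + [\theta^{AB} f]^\wedge(\lambda - \lambda') \right], \tag{4.6} \]
where
\[ \theta^{AB}(\tau) = \gamma_2^* A^{AB}(\tau, \tau) = i\left[ \delta^{CB} \sigma_0{}^A{}_C|_{\gamma(\tau)} - \delta^{AC} \sigma_0{}^B{}_C|_{\gamma(\tau)} \right] \tag{4.7} \]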
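is clearly hermitian ($\theta^{BA}(\tau) = \overline{\theta^{AB}(\tau)}$). It follows that $J^{AB}$ is a hermitian matrix kernel, i.e., $J^{AB}(\lambda, \lambda') = \overline{J^{BA}(\lambda', \lambda)}$. Note that $J^{AB}$ is state-independent, while $W^{(\omega)}_{AB}$ contains all the dependence on the state of interest $\omega$ and the reference state $\omega_0$. We also note that $J^{AB}(\lambda, \lambda')$ decays rapidly away from the diagonal in $\mathbb{R}^2$.
Assuming without loss that $\omega_0$ is pure and quasifree, it must be labelled by some projection $P$ on $\mathcal{H}$. Since the stress-energy tensor is defined in terms of the two-point function, it is enough to establish (4.1) when the infimum is taken over quasifree Hadamard states, whose two-point functions are of the form $\omega_Q$ for $Q \in {\rm Had}(\mathcal{H}, \Gamma)$ as discussed in Sect. 2.3. The normal ordered two-point function ${:}\omega_Q{:} = \omega_Q - \omega_P$ is labelled by $Q - P$, i.e., ${:}\omega_Q{:} = \omega_{Q-P}$ in the spirit of the remarks following (2.51). Now
\[ Q - P = -P Q^\Gamma P + P^\Gamma Q P + P Q P^\Gamma + P^\Gamma Q P^\Gamma, \tag{4.8} \]
and this induces a decomposition of ${:}\omega_Q{:}$ into four pieces
\[ {:}\omega_Q{:} = -\omega^{\cdot\,\Gamma\,\cdot} + \omega^{\Gamma\,\cdot\,\cdot} + \omega^{\cdot\,\cdot\,\Gamma} + \omega^{\Gamma\,\cdot\,\Gamma}, \tag{4.9} \]
where $\omega^{\sharp\natural\flat}$ is the two-point function labelled by $P^\sharp Q^\natural P^\flat$; that is,
\[ \omega^{\sharp\natural\flat}(f \otimes h) = \left( \Gamma q\begin{pmatrix} 0 \\ f \end{pmatrix}, P^\sharp Q^\natural P^\flat q\begin{pmatrix} h \\ 0 \end{pmatrix} \right)_S \tag{4.10} \]
for $f \in \mathcal{D}_{\rm sp}$ and $h \in \mathcal{D}_{\rm cosp}$. In Sect. 5 we will show that each $\omega^{\sharp\natural\flat}$ may be pulled back to $\mathbb{R}^2$ by $\gamma_2$, allowing us to write
\[ W^{(\omega_Q)}_{AB} = -W^{\cdot\,\Gamma\,\cdot}_{AB} + W^{\Gamma\,\cdot\,\cdot}_{AB} + W^{\cdot\,\cdot\,\Gamma}_{AB} + W^{\Gamma\,\cdot\,\Gamma}_{AB}, \tag{4.11} \]
where
\[ W^{\sharp\natural\flat}_{AB}(\lambda, \lambda') = \left[ \gamma_2^* u_{AB}\, \omega^{\sharp\natural\flat} \right]^\wedge(-\lambda, \lambda'). \tag{4.12} \]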
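These functions are analytic in $\lambda, \lambda'$ because they are the Fourier transforms of compactly supported distributions. Furthermore, the following $Q$-independent bounds will be established in Sect. 5.
Lemma 4.2. For any $Q \in {\rm Had}(\mathcal{H}, \Gamma)$,
\[ \left| W^{\sharp\natural\flat}_{AB}(\lambda, \lambda') \right| \le X^{\sharp\flat}_{AB}(\lambda, \lambda') \quad \forall\, (\lambda, \lambda') \in \mathbb{R}^2, \tag{4.13} \]
where $X^{\sharp\flat}_{AB}$ is independent of $Q$ and is defined in terms of the reference two-point function $\omega_P$ by
\[ X^{\sharp\flat}_{AB}(\lambda, \lambda') = Y^\sharp_A(\lambda)\, Y^\flat_B(\lambda'), \tag{4.14} \]
where $Y^\sharp_A(\lambda)$ is the positive square root of
\[ Y^\sharp_A(\lambda)^2 = \left[ \gamma_2^* u_{AA}\, \omega_P^\sharp \right]^\wedge(-\lambda, \lambda) \quad \text{(no sum on } A\text{)}. \tag{4.15} \]
Furthermore, $Y_A^\cdot(\lambda)$ (resp. $Y_A^\Gamma(\lambda)$) decays rapidly as $\lambda \to +\infty$ (resp. $\lambda \to -\infty$) and is of polynomially bounded growth as $\lambda \to -\infty$ (resp. $\lambda \to +\infty$).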
Remarks. (i) The right-hand side of Eq. (4.15) is nonnegative because $u_{AA}\,\omega_P^\sharp$ is of positive type as a scalar bi-distribution, and this property is inherited under pull-back by $\gamma_2$ (see Theorem 2.2 in [8]). In fact, the calculation
\[ u_{AB}\,\omega_P^\sharp(f^A \otimes \overline{f^B}) = \omega_P^\sharp(f^A u_A \otimes (f^B u_B)^+) = \omega_0(\Psi(f^A u_A)\Psi(f^B u_B)^*) \ge 0 \tag{4.16} \]
(summing on $A$ and $B$) for $f^A \in C_0^\infty(M)$ ($A = 1, \ldots, 4$) shows that $u_{AB}\,\omega_P^\sharp$ is of positive type as a matrix-valued distribution. Positive type of $u_{AA}\,\omega_P^\sharp$ follows in consequence. A similar argument (using (4.10) and the property $Q \ge 0$ for $Q \in {\rm Had}(\mathcal{H}, \Gamma)$) shows that $u_{AB}\,\omega^{\cdot\,\Gamma\,\cdot}$, $u_{AB}\,\omega^{\Gamma\,\cdot\,\Gamma}$ and their pull-backs by $\gamma_2$ share the matrix positive type property.
(ii) The statements on the growth of $Y^\sharp_A$ are obtained from the Paley-Wiener-Schwartz theorem [24] which entails that the Fourier transform of a compactly supported distribution is of at worst polynomial growth. Below, we will frequently use the fact that the product of a rapidly decaying function and one of polynomial growth is itself rapidly decaying.
Because the bounds obtained in Lemma 4.2 exhibit different behaviour in the four quadrants $C_1, \ldots, C_4$ of the $(\lambda, \lambda')$-plane it is convenient to decompose the averaged energy density (4.4) as $I = \sum_k I_k$, where $I_k$ is the contribution arising from quadrant $C_k$. We proceed to bound the $I_k$ in turn.
Starting with the second and fourth quadrants $C_2 = \mathbb{R}^- \times \mathbb{R}^+$ and $C_4 = \mathbb{R}^+ \times \mathbb{R}^-$, Lemma 4.2 yields the $Q$-independent bound
\[ \left| W^{(\omega_Q)}_{AB}(\lambda, \lambda') \right| \le \sum_{\sharp, \flat} X^{\sharp\flat}_{AB}(\lambda, \lambda') \tag{4.17} \]
in which each summand on the right-hand side is of at worst polynomial growth. This may be combined with the rapid decay of $J^{AB}$ away from the diagonal to yield the following $Q$-independent bound on the contribution from these quadrants:
\[ |I_2 + I_4| \le \sum_{\sharp, \flat} \int_{C_2 \cup C_4} d\lambda\, d\lambda'\, \left| J^{AB}(\lambda, \lambda') \right| X^{\sharp\flat}_{AB}(\lambda, \lambda') < \infty. \tag{4.18} \]
We are left with the first and third quadrants. Since $J^{AB}$ exhibits polynomial growth along the diagonal, the previous argument will not allow us to bound all the terms arising from the decomposition (4.11). To see this, note that Lemma 4.2 applied to the $P^\Gamma Q P^\Gamma$ term $W^{\Gamma\,\cdot\,\Gamma}_{AB}$ gives a bound $X^{\Gamma\Gamma}_{AB}$ growing polynomially7 in all directions in the first quadrant. Similarly, the bound $X^{\cdot\,\cdot}_{AB}$ for the $P Q^\Gamma P$ term is polynomially growing in the third quadrant. However, Lemma 4.2 suffices to bound the other contributions to $I_1$ and $I_3$ because at least one factor in the relevant $X^{\sharp\flat}_{AB}$ is rapidly decaying. Thus
\[ |I_1 - R_1| \le \int_{C_1} d\lambda\, d\lambda'\, \left| J^{AB}(\lambda, \lambda') \right| \left[ X^{\cdot\,\cdot}_{AB}(\lambda, \lambda') + X^{\cdot\,\Gamma}_{AB}(\lambda, \lambda') + X^{\Gamma\,\cdot}_{AB}(\lambda, \lambda') \right] \]
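We first establish the existence of the quantities $W^{\sharp\natural\flat}_{AB}$ and $Y^\sharp_A(\lambda)$ defined by (4.12) and (4.15). Using self-adjointness of $P^\sharp$, Cauchy-Schwarz and $\|Q\| \le 1$ (following from property (I) of Sect. 2.3 for $Q \in {\rm Had}(\mathcal{H}, \Gamma)$) we obtain from (4.10) the inequality
\[ \left| \omega^{\sharp\natural\flat}(f_1 \otimes f_2^+) \right|^2 \le \left\| P^\sharp\, \Gamma q\begin{pmatrix} 0 \\ f_1 \end{pmatrix} \right\|^2 \left\| P^\flat\, \Gamma q\begin{pmatrix} 0 \\ f_2 \end{pmatrix} \right\|^2 = \omega_P^\sharp(f_1 \otimes f_1^+)\, \omega_P^\flat(f_2 \otimes f_2^+) \tag{5.1} \]
for any $f_i \in \mathcal{D}_{\rm sp}$ ($i = 1, 2$). This inequality underlies the following lemma, which will be proved at the end of this section.
Lemma 5.1. The wave-front set of $\omega^{\sharp\natural\flat}$ satisfies
\[ {\rm WF}(\omega^{\sharp\natural\flat}) \subset \left( N^\sharp \cup Z \right) \times \left( -N^\flat \cup Z \right) \tag{5.2} \]
as a subset of $T^*(M \times M)$, where $N^\sharp = \{(p, \xi) \mid \xi \in N_p^\sharp\}$ and $Z$ is the zero section $Z = \{(p, 0) \mid p \in M\}$ of $T^*M$.
8 For this purpose, we regard $L^2(\mathbb{R}^+, d\lambda) \otimes \mathbb{C}^4$ as $L^2(X, d\lambda \otimes d\mu)$, where $X$ is the locally compact space $\mathbb{R}^+ \times \mathbb{Z}_4$ and $\mu$ is the counting measure on $\mathbb{Z}_4$. The measure $d\lambda \otimes d\mu$ is a Baire measure and the lemma may be applied.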
This is by no means a sharp estimate of the wave-front set, but it will suffice for our purposes. Defining $u_{AB}$ as in (4.2), $u_{AB}\,\omega^{\sharp\natural\flat}$ is a scalar bi-distribution with wave-front set contained in the right-hand side of Eq. (5.2). Now by Theorem 2.5.11 in [25], the pull-back $\gamma_2^* u_{AB}\,\omega^{\sharp\natural\flat}$ exists provided the intersection of its wave-front set with the set of normals $N_{\gamma_2}$ of $\gamma_2$ is empty. One may show that
\[ N_{\gamma_2} = \{ (\gamma(\tau), \xi; \gamma(\tau'), \xi') \in T^*(M \times M) \mid \xi_a u^a(\tau) = \xi'_b u^b(\tau') = 0 \} \tag{5.3} \]
(see Sect. 3 of [8] where a corresponding argument is given). This has trivial intersection with WF$(u_{AB}\,\omega^{\sharp\natural\flat})$, because no null covector can annihilate a timelike vector. Accordingly, $\gamma_2^* u_{AB}\,\omega^{\sharp\natural\flat}$ exists in $\mathcal{D}'(\mathbb{R}^2)$, and (again by Theorem 2.5.11 in [25]) its wave-front set obeys
\[ {\rm WF}(\gamma_2^* u_{AB}\,\omega^{\sharp\natural\flat}) \subset \left( \mathbb{R} \times \mathbb{R}^\sharp \right) \times \left( \mathbb{R} \times -\mathbb{R}^\flat \right). \tag{5.4} \]
Since $u_{AB}$ is compactly supported, we may take Fourier transforms and conclude that the $W^{\sharp\natural\flat}_{AB}$ do indeed exist. Exactly the same argument, using the microlocal spectrum condition (2.50) in place of (5.2), shows that the pull-backs $\gamma_2^* u_{AB}\,\omega_P^\sharp$ exist with
\[ {\rm WF}(\gamma_2^* u_{AB}\,\omega_P^\sharp) \subset \left( \mathbb{R} \times \mathbb{R}^\sharp \right) \times \left( \mathbb{R} \times -\mathbb{R}^\sharp \right). \tag{5.5} \]
It remains to prove Lemmas 4.2 and 5.1.
Proof of Lemma 4.2. An argument using regularising sequences in analogy with the proof of Theorem 2.2 in [8] shows that the inequality (5.1) is inherited by the pull-back and becomes
\[ \left| \gamma_2^* u_{AB}\,\omega^{\sharp\natural\flat}(f_1 \otimes \overline{f_2}) \right|^2 \le \gamma_2^* u_{AA}\,\omega_P^\sharp(f_1 \otimes \overline{f_1})\; \gamma_2^* u_{BB}\,\omega_P^\flat(f_2 \otimes \overline{f_2}) \quad \forall\, f_1, f_2 \in C_0^\infty(\mathbb{R}) \tag{5.6} \]
(no sum on either $A$ or $B$). Substituting $f_1(t) = e^{-it\lambda}$, $f_2(t') = e^{-it'\lambda'}$, the required bounds (4.13) are obtained. As $\gamma_2^* u_{AA}\,\omega_P^\cdot$ is compactly supported, its set of singular directions (those directions in which its Fourier transform fails to decay rapidly) is given by
\[ K(\gamma_2^* u_{AA}\,\omega_P^\cdot) = \{ (\lambda, \lambda') \mid (\tau, \lambda; \tau', \lambda') \in {\rm WF}(\gamma_2^* u_{AA}\,\omega_P^\cdot) \text{ for some } (\tau, \tau') \in \mathbb{R}^2 \} = \mathbb{R}^+ \times \mathbb{R}^- \tag{5.7} \]
(see Proposition 8.1.3 in [24]). Thus $(-1, 1)$ is not a singular direction for $\gamma_2^* u_{AA}\,\omega_P^\cdot$ and we deduce that $Y_A^\cdot(\lambda)$ is rapidly decaying as $\lambda \to +\infty$ and of polynomially bounded growth as $\lambda \to -\infty$ by the Paley-Wiener-Schwartz theorem [24]. An analogous argument shows that $Y_A^\Gamma(\lambda)$ decays rapidly as $\lambda \to -\infty$ and is polynomially bounded as $\lambda \to +\infty$. □
Proof of Lemma 5.1. Suppose $(p_1, \xi_1; p_2, -\xi_2) \in {\rm WF}(\omega^{\sharp\natural\flat})$ with $\xi_1 \neq 0$. We will show that
\[ (p_1, \xi_1; p_1, -\xi_1) \in {\rm WF}(\omega_P^\sharp), \tag{5.8} \]
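from which it follows that $\xi_1 \in N^\sharp_{p_1}$ by the microlocal spectrum condition (2.50). To prove (5.8) fix charts $(U_i, \kappa_i)$ with $p_i \in U_i$ ($i = 1, 2$), and define $k_i$ so that $\xi_i = {}^t\kappa_i'(p_i) k_i$. Let $\chi_i$ be arbitrary smooth spinor fields compactly supported in $U_i$ with $\chi_i(p_i) \neq 0$. We will use the notation
\[ \chi_{ij} = \chi_i \otimes \chi_j^+; \qquad \kappa_{ij} = \kappa_i \times \kappa_j. \tag{5.9} \]
Since $\xi_1$, and hence $k_1$, is nonzero, any conical neighbourhood $V_{11}$ of $(k_1, -k_1)$ contains a set of form $O_1 \times -O_1$, where $O_1$ is a neighbourhood of $k_1$ bounded away from zero. We claim that $[\chi_{11}\,\omega_P^\sharp \circ \kappa_{11}^{-1}]^\wedge$ is not of rapid decay in the conic neighbourhood $\bigcup_{\alpha > 0} \alpha\,(O_1 \times -O_1) \subset V_{11}$. Since $V_{11}$ was arbitrary, we conclude that $(k_1, -k_1)$ is a singular direction for $\chi_{11}\,\omega_P^\sharp \circ \kappa_{11}^{-1}$; letting the support of $\chi_1$ tend to $\{p_1\}$, Eq. (5.8) is established as required. To justify our claim above we apply (5.1) to
\[ f_j = \chi_j \left( \frac{e^{i(\,\cdot\,)\ell_j}}{\sqrt{-|g|} \circ \kappa_j^{-1}} \right) \circ \kappa_j, \tag{5.10} \]
and recall (1.3) to obtain
\[ \left| \left[ \chi_{12}\,\omega^{\sharp\natural\flat} \circ \kappa_{12}^{-1} \right]^\wedge(\ell_1, -\ell_2) \right|^2 \le \left[ \chi_{11}\,\omega_P^\sharp \circ \kappa_{11}^{-1} \right]^\wedge(\ell_1, -\ell_1) \times \left[ \chi_{22}\,\omega_P^\flat \circ \kappa_{22}^{-1} \right]^\wedge(\ell_2, -\ell_2). \tag{5.11} \]
Now let $O_2$ be any neighbourhood of $k_2$. By the Paley-Wiener-Schwartz theorem we have
\[ \left[ \chi_{22}\,\omega_P^\flat \circ \kappa_{22}^{-1} \right]^\wedge(\alpha\ell_2, -\alpha\ell_2) \le R(\alpha) \quad \forall\, \ell_2 \in O_2 \tag{5.12} \]
for some polynomial $R$; accordingly,
\[ \sup_{\ell_j \in O_j} \left| \left[ \chi_{12}\,\omega^{\sharp\natural\flat} \circ \kappa_{12}^{-1} \right]^\wedge(\alpha\ell_1, -\alpha\ell_2) \right|^2 \le R(\alpha) \sup_{\ell_1 \in O_1} \left[ \chi_{11}\,\omega_P^\sharp \circ \kappa_{11}^{-1} \right]^\wedge(\alpha\ell_1, -\alpha\ell_1) \tag{5.13} \]
for all $\alpha > 0$. Were our claim false, the right-hand side of this equation would be rapidly decaying as $\alpha \to +\infty$ and we could infer that $[\chi_{12}\,\omega^{\sharp\natural\flat} \circ \kappa_{12}^{-1}]^\wedge$ was of rapid decay in the conical neighbourhood $V_{12} = \bigcup_{\alpha > 0} \alpha\,(O_1 \times -O_2)$ of $(k_1, -k_2)$. But this would contradict the initial hypothesis that $(p_1, \xi_1; p_2, -\xi_2) \in {\rm WF}(\omega^{\sharp\natural\flat})$, so the claim is proved.
We have therefore shown that $(p_1, \xi_1; p_2, -\xi_2) \in {\rm WF}(\omega^{\sharp\natural\flat})$ implies $\xi_1 \in N^\sharp_{p_1} \cup \{0\}$. An exactly analogous argument shows that, in addition, $\xi_2 \in N^\flat_{p_2} \cup \{0\}$. This completes the proof. □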
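6. Proof of Proposition 4.4
We now prove that the operators $J_\Lambda$ on $L^2(\mathbb{R}^+, d\lambda) \otimes \mathbb{C}^4$ are bounded from below uniformly in $\Lambda$. To begin, we consider the related operator $K_\Lambda$ acting on $L^2(\mathbb{R}^+, d\lambda)$ by
\[ (K_\Lambda \varphi)(\lambda) = \int_0^\infty d\lambda'\, \frac{\sigma_\Lambda(\lambda)\sigma_\Lambda(\lambda')}{8\pi^2}\, (\lambda + \lambda')\hat f(\lambda - \lambda')\varphi(\lambda'). \tag{6.1} \]
If the spin-connection terms vanished, $J_\Lambda$ would be equal to $K_\Lambda \otimes 1$. Our analysis of $K_\Lambda$ is based on the following identity.
Lemma 6.1. If $f = g^2$ for real-valued $g \in C_0^\infty(\mathbb{R})$, then
\[ (\lambda + \lambda')\hat f(\lambda - \lambda') = \int_{-\infty}^\infty \frac{d\mu}{\pi}\, \mu\, \hat g(\lambda - \mu)\, \overline{\hat g(\lambda' - \mu)}. \tag{6.2} \]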
Proof. Note first that the right-hand side (RHS) exists for each $\lambda, \lambda' \in \mathbb{R}$. Changing variables to $\nu = \mu - (\lambda + \lambda')/2$ and writing $\zeta = (\lambda - \lambda')/2$,
\[ \text{RHS of (6.2)} = \int_{-\infty}^\infty \frac{d\nu}{\pi} \left( \frac{\lambda + \lambda'}{2} + \nu \right) \hat g(\zeta - \nu)\, \hat g(\zeta + \nu) = (\lambda + \lambda')(\hat g \star \hat g)(2\zeta) = (\lambda + \lambda')\hat f(\lambda - \lambda') \tag{6.3} \]
as required, where we have also used $\overline{\hat g(u)} = \hat g(-u)$, the fact that $\nu\, \hat g(\zeta - \nu)\, \hat g(\zeta + \nu)$ is odd, and a further change of variables. In addition, we have used $\star$ to denote the convolution $(h_1 \star h_2)(\lambda) = \int d\lambda'/(2\pi)\, h_1(\lambda - \lambda') h_2(\lambda')$. □
It follows from this identity that the kernel of $K_\Lambda$ may be rewritten in the form
\[ K_\Lambda(\lambda, \lambda') = \frac{\sigma_\Lambda(\lambda)\sigma_\Lambda(\lambda')}{8\pi^3} \int_{-\infty}^\infty d\mu\, \mu\, \hat g(\lambda - \mu)\, \overline{\hat g(\lambda' - \mu)}. \tag{6.4} \]
We now define $K_\Lambda^\pm$ to have the kernels
\[ K_\Lambda^+(\lambda, \lambda') = \frac{\sigma_\Lambda(\lambda)\sigma_\Lambda(\lambda')}{8\pi^3} \int_{-\infty}^\infty d\mu\, |\mu|\, \hat g(\lambda - \mu)\, \overline{\hat g(\lambda' - \mu)} \tag{6.5} \]
and
\[ K_\Lambda^-(\lambda, \lambda') = -\frac{2\sigma_\Lambda(\lambda)\sigma_\Lambda(\lambda')}{8\pi^3} \int_{-\infty}^0 d\mu\, \mu\, \hat g(\lambda - \mu)\, \overline{\hat g(\lambda' - \mu)}. \tag{6.6} \]
The integrals in these kernels are bounded on compact subsets of $\mathbb{R}^+ \times \mathbb{R}^+$, so the cut-off functions $\sigma_\Lambda$ ensure that $K_\Lambda^+$ and $K_\Lambda^-$ are Hilbert-Schmidt. Clearly $K_\Lambda = K_\Lambda^+ - K_\Lambda^-$; furthermore, the easily proven identity
\[ \langle \varphi \mid K_\Lambda^- \varphi \rangle = \int_0^\infty d\mu\, \frac{\mu}{4\pi^3} \left| \int_0^\infty d\lambda'\, \overline{\hat g(\lambda' + \mu)}\, \sigma_\Lambda(\lambda')\varphi(\lambda') \right|^2 \tag{6.7} \]
(valid, say, for $\varphi \in C_0^\infty(\mathbb{R}^+)$) shows that $K_\Lambda^-$ is positive. A similar argument establishes positivity of $K_\Lambda^+$.
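The identity of Lemma 6.1 and the splitting of the weight μ into |μ| and the part supported on μ < 0 can be checked numerically. The sketch below makes simplifying assumptions that go beyond the text: g is the Gaussian exp(−τ²/2), which is not compactly supported but has closed-form transforms; the cut-off factors are dropped; and the μ-integral is approximated by a Riemann sum on a hypothetical grid.

```python
import numpy as np

def g_hat(lam):
    """Fourier transform of g(t) = exp(-t^2/2); real and even."""
    return np.sqrt(2 * np.pi) * np.exp(-lam ** 2 / 2)

def f_hat(lam):
    """Fourier transform of f = g^2 = exp(-t^2)."""
    return np.sqrt(np.pi) * np.exp(-lam ** 2 / 4)

mu = np.linspace(-40.0, 40.0, 80001)
h = mu[1] - mu[0]

def mu_integral(lam, lam_p, weight):
    """Riemann sum for the integral of weight(mu) g_hat(lam - mu) conj(g_hat(lam_p - mu)) dmu / pi."""
    vals = weight(mu) * g_hat(lam - mu) * np.conj(g_hat(lam_p - mu))
    return h * np.sum(vals) / np.pi

lam, lam_p = 1.3, 0.4
lhs = (lam + lam_p) * f_hat(lam - lam_p)                    # left side of (6.2)
full = mu_integral(lam, lam_p, lambda m: m)                 # right side of (6.2)
plus = mu_integral(lam, lam_p, np.abs)                      # weight |mu|, as in (6.5)
minus = mu_integral(lam, lam_p, lambda m: -2 * m * (m < 0)) # weight -2 mu on mu < 0, as in (6.6)

print(lhs, full, plus - minus)
```

The three printed numbers agree to high accuracy, mirroring the identity (6.2) and the decomposition of the kernel into its two positive parts.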
where
− ϕ ≤ C .ϕ.2 , ϕ | K
(6.8)
2 1 )σ (λ ) C = dµ dλ µ g (µ + λ 4π 3 R+ ×R+ ∞ = du| g (u)|2 F (u)
(6.9)
0
and 1 F (u) = 4π 3
u 0
dλ (u − λ )σ (λ )2
(6.10)
is bounded and nonnegative. Let us observe that this step depends in an essential way on the fact that, for µ > 0, the argument of g in (6.7) is bounded away from zero, together with the rapid decay property of g. The above analysis entails that −C is a lower bound for K , but this is certainly − not the sharpest bound. In fact essentially the same argument applies if K is replaced + + − 1 − by 2 K and K is adjusted to maintain K = K − K , with the conclusion that −C /2 is also a lower bound for K . The convenience of the choices made above is + that, as we now show, the operator L = J − K ⊗ 1 is form bounded relative to K with relative bound no greater than 21 . To this end, we first use the convolution theorem to write 1 ∞ ϕ | L ϕ = dτ [σ ϕ]∨ (τ )† f (τ )θ (τ )[σ ϕ]∨ (τ ) (6.11) 2 −∞ for ϕ ∈ C0∞ (R+ ) ⊗ C4 , where † denotes the matrix hermitian conjugate. Setting C = sup .θ (τ ).C4 , we then estimate
τ ∈R
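\[ \begin{aligned} |\langle \varphi \mid L_\Lambda \varphi \rangle| &\le \frac{C}{2} \int_{-\infty}^\infty d\tau\, \left\| (g[\sigma_\Lambda \varphi]^\vee)(\tau) \right\|^2_{\mathbb{C}^4} \\ &\le \frac{C}{2} \int_{-\infty}^\infty \frac{d\mu}{2\pi} \left\| (g[\sigma_\Lambda \varphi]^\vee)^\wedge(\mu) \right\|^2_{\mathbb{C}^4} \\ &= \frac{C}{2} \int_{-\infty}^\infty \frac{d\mu}{2\pi} \left\| \int_0^\infty \frac{d\lambda'}{2\pi}\, \hat g(\mu - \lambda')\sigma_\Lambda(\lambda')\varphi(\lambda') \right\|^2_{\mathbb{C}^4}. \end{aligned} \tag{6.12} \]
By comparison with the definition of $K_\Lambda^+$, this implies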
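\[ |\langle \varphi \mid L_\Lambda \varphi \rangle| \le \frac{1}{2} \langle \varphi \mid (K_\Lambda^+ \otimes 1)\varphi \rangle + \frac{C^2}{16\pi^3} \int \frac{d\mu}{|\mu|}\, [\ldots] \]
[...] 0:
\[ \lim_{\xi \to -\infty} W_{c,q,n}(\xi) = A_{q,n}, \qquad \lim_{\xi \to \infty} W_{c,q,n}(\xi) = 0. \]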
These modulated front solutions are constructed with the help of a center manifold reduction, where all $W_{c,q,n}$ are determined by the central modes $W_{c,q,\pm 1}$. In the reduced four-dimensional system for $W_{c,q,\pm 1} = W_{c,q,\pm 1}(\xi)$ there is a heteroclinic connection lying in the intersection of a four-dimensional stable manifold of the origin and a two-dimensional unstable manifold of an equilibrium corresponding to $U_{q,a}$. Since this is a
Stability of Modulated Fronts
very robust situation these solutions can be constructed by some perturbation analysis from the ones for $q = 0$. For small $\varepsilon$ and $q = 0$ the solution $W_{c,0,1}$ of the amplitude equation on the center manifold is close to the real-valued front solution $W_{c,0,1}(\xi) = \varepsilon B(\varepsilon\xi) = \varepsilon B(\zeta)$ of the equation
\[ 4\partial_\zeta^2 B + c_B\, \partial_\zeta B + B - 3B|B|^2 = 0, \]
connecting $W_{c,0,1} = 0$ at $\zeta = +\infty$ with $W_{c,0,1} = A_0$ at $\zeta = -\infty$. The constant $c_B$ is given by $c_B = \varepsilon^{-1} c = O(1)$.
Our paper deals with the question: Under which conditions does the solution of (1.1) with initial data $F_{c,q,a}(x, x) + v(x)$ converge to $F_{c,q,a}(x - ct, x)$ as $t \to \infty$? We will show our results for the case $q = 0$ and $a = 0$ only, to keep the notation on a reasonable level. The extension to arbitrary $a$ is trivial by translating the origin, while the extension to arbitrary $q$ satisfying $4|q|^2 < \frac{1}{3}$ necessitates some notational work and leads to bounds which depend on $q$. Thus, we will write the periodic solution as
\[ U_*(x) = A \cos x + O(\varepsilon^2), \tag{1.3} \]
with $A = 2\varepsilon$, and the modulated front (moving with speed $c = O(\varepsilon)$) as
\[ F_c(\xi, x) = \frac{1}{2} \sum_{n \in 2\mathbb{Z}+1} W_c(\xi) e^{inx}. \]
We describe next the nature of the stability problem. Consider an initial condition $u_0(x) = F_c(x, x) + v_0(x)$, and let $u(x, t)$ denote the solution of (1.1) with that initial condition. Since $F_c$ solves (1.1), we find for the evolution of $v(x, t) \equiv u(x, t) - F_c(x - ct, x)$:
$$\partial_t v(x,t) = L v(x,t) - 3F_c(x-ct,x)^2\, v(x,t) - 3F_c(x-ct,x)\, v(x,t)^2 - v(x,t)^3. \tag{1.4}$$
Here, $L = -(1 + \partial_x^2)^2 + \varepsilon^2$. We define the translation operator $\tau_{ct}$ by $(\tau_{ct} f)(x) = f(x - ct, x)$, so that (1.4) can be written as
$$\partial_t v = Lv - 3(\tau_{ct}F_c)^2 v - 3(\tau_{ct}F_c)v^2 - v^3. \tag{1.5}$$
Introduce now $K_{ct}$ (the difference between the modulated front and the periodic solution) by
$$K_{ct}(x) = \tau_{ct}F_c(x) - U_*(x) = F_c(x - ct, x) - U_*(x). \tag{1.6}$$
Note that $K_{ct}(x)$ vanishes as $x \to -\infty$, and approaches $-U_*(x)$ as $x \to \infty$. With these notations we can rewrite (1.5) as
$$\partial_t v = Lv - 3U_*^2 v - 6U_*K_{ct}v - 3K_{ct}^2 v - 3U_* v^2 - v^3 - 3K_{ct}v^2 = Mv + M_i v + N(v) + N_i(v), \tag{1.7}$$
where
$$Mv = Lv - 3U_*^2 v, \quad M_i v = -6U_*K_{ct}v - 3K_{ct}^2 v, \quad N(v) = -3U_* v^2 - v^3, \quad N_i(v) = -3K_{ct}v^2. \tag{1.8}$$
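The passage from (1.5) to (1.7) is a direct expansion. The following worked step is a sketch that only uses the definition (1.6), written as $\tau_{ct}F_c = U_* + K_{ct}$; no additional assumption enters.

```latex
\begin{align*}
3(\tau_{ct}F_c)^2 v &= 3(U_* + K_{ct})^2 v
   = 3U_*^2 v + 6U_*K_{ct}\,v + 3K_{ct}^2 v,\\
3(\tau_{ct}F_c)\,v^2 &= 3U_* v^2 + 3K_{ct} v^2,\\
\partial_t v &= \underbrace{Lv - 3U_*^2 v}_{Mv}
   \;\underbrace{-\,6U_*K_{ct}v - 3K_{ct}^2 v}_{M_i v}
   \;\underbrace{-\,3U_* v^2 - v^3}_{N(v)}
   \;\underbrace{-\,3K_{ct}v^2}_{N_i(v)} .
\end{align*}
```

All terms carrying the index $i$ contain at least one factor $K_{ct}$, which is why they inherit its decay for fixed $x$.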
J.-P. Eckmann, G. Schneider
The variables with index $i$ vanish with some exponential rate for fixed $x \in \mathbb{R}$ in the laboratory frame. They will be seen to be exponentially “irrelevant” in terms of a renormalization group analysis. In order to explain this renormalization problem, we will study, in the next section, the model problem
$$\partial_t u(x,t) = \partial_x^2 u(x,t) + a(x - ct)\,u(x,t) + u(x,t)^p,$$
with $a(\xi) = \frac{1}{2}(1 + \tanh\xi)$, and $p > 3$. This problem is nice in its own right. The similitude will come from the correspondence of $M$ with $\partial_x^2$, and of $M_i v$ with the term $a(x - ct)u(x,t)$. Indeed:
– the first term will be seen to be diffusive in the laboratory frame,
– the second term will be seen to be irrelevant in the laboratory frame, but the first together with the second term will be exponentially damping in a suitable space of exponentially decaying functions in a frame moving with a speed close to $c$.

As in previous work [Sa77, BK94, Ga94, EW94] our analysis will be based on an interplay of estimates obtained in these two topologies. Our main results are stated in Theorem 4.1 for the simplified problem and in Theorem 7.1 for the Swift–Hohenberg problem. We not only show convergence to the front, but also give precise first order estimates in both cases. As far as possible, the treatment of the two problems is done in analogous fashion, so that the reader who has followed the proof of the simplified problem should have no difficulty in reading the proof for the full, more complicated, problem.

Remark. An ideal treatment of this problem would necessitate a norm in a frame moving with the same speed as the front. Such a space is needed to study the stability of so-called critical fronts (moving at the minimal possible speed where they are linearly stable). Achieving this aim seems to be a necessary step in solving the long-standing problem of “front selection” [DL83], in a case where the maximum principle [AW78] is not available.

Remark. The method also applies to more complicated systems, like hydrodynamic stability problems. A typical example is given by the fronts connecting the Taylor vortices with the Couette flow in the Taylor-Couette problem. These fronts have been constructed in [HS99]. The stability of the spatially periodic Taylor vortices has been shown in [Schn98].

Notation. Throughout this paper many different constants are denoted with the same symbol $C$.
Part I. A Simplified Problem

2. The Model Equation

Let
$$a(\xi) = \tfrac{1}{2}(1 + \tanh\xi). \tag{2.1}$$
We want to study the equation
$$\partial_t u(x,t) = \partial_x^2 u(x,t) + a(x - ct)\,u(x,t) + u(x,t)^p, \tag{2.2}$$
with $c > 0$ and $p > 3$. For notational simplicity we assume $p \in \mathbb{N}$. To understand the dynamics of (2.2) it might be useful to consider the following simplified problem:
$$\partial_t v(x,t) = \partial_x^2 v(x,t) + \vartheta(x - ct)\,v(x,t), \tag{2.3}$$
where $\vartheta(z) = 1$ when $z > 0$ and $\vartheta(z) = 0$ when $z < 0$. If we go to the moving frame $\xi = x - ct$ and let $w(\xi, t) = v(\xi + ct, t) = v(x, t)$, then the equation for $w$ becomes
$$\partial_t w(\xi,t) = \partial_\xi^2 w(\xi,t) + c\,\partial_\xi w(\xi,t) + \vartheta(\xi)\,w(\xi,t). \tag{2.4}$$
For $x > 0$, we have $\vartheta(x) = 1$ and hence the corresponding characteristic polynomial for (2.4) (in momentum space) is $-k^2 + ick + 1$, while for $x < 0$, we have $\vartheta(x) = 0$ with its corresponding polynomial $-k^2 + ick$. Thus, we expect the solution to be exponentially unstable ahead of the front, i.e., for $x > 0$, and diffusively stable behind the front. If we consider an initial condition $v_0(\xi)$ localized near $\xi = \xi_0 > 0$, and of amplitude $A$, then we expect the amplitude to grow like $e^{t}A$ until $t = t_* = \xi_0/c$, when this perturbation “hits” the back of the front (in the moving frame), or, in other words, when the back of the front hits the perturbation (in the laboratory frame). Thus, the perturbation does not grow larger than $Ae^{\xi_0/c}$. We use this in the following way. Assume that the amplitude at $\xi > 0$ is bounded by $Ae^{-\beta\xi}$. Then, ignoring diffusion, we find that the contribution to the amplitude at the origin at time $t = \xi_0/c$ is bounded by
$$\int_0^{\xi_0} d\xi\, A\,e^{\xi(1-\beta c)/c}.$$
Clearly, if $\beta c > 1$, the initial perturbations are sufficiently small for the total effect at the origin (in the moving frame) to be small. Once this has happened, a second epoch starts where the perturbation is behind the front. Then, due to the diffusive behavior, the amplitude will go down as
$$\frac{C}{(t - t_* + 1)^{1/2}}.$$
These considerations will be used in the choice of topology below.
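The heuristic bound above can be checked numerically. The following is a minimal sketch, not part of the original argument: it evaluates the integral $\int_0^{\xi_0} A e^{\xi(1-\beta c)/c}\,d\xi$ in closed form and compares it with its limit $Ac/(\beta c - 1)$. The function names and the sample values of $A$, $\beta$, $c$ are hypothetical and chosen only for illustration.

```python
import math

def front_contribution(amplitude, beta, c, xi0):
    """Closed form of int_0^xi0 A * exp(xi * (1 - beta*c) / c) d xi."""
    rate = (1.0 - beta * c) / c
    if rate == 0.0:
        return amplitude * xi0
    return amplitude * math.expm1(rate * xi0) / rate

def uniform_bound(amplitude, beta, c):
    """Supremum over xi0 of the integral; finite only when beta * c > 1."""
    if beta * c <= 1.0:
        return math.inf
    return amplitude * c / (beta * c - 1.0)

# Hypothetical sample values, chosen only for illustration.
A, beta, c = 1.0, 1.0, 3.0
damped = [front_contribution(A, beta, c, xi0) for xi0 in (1.0, 10.0, 100.0)]
growing = front_contribution(A, 0.2, c, 100.0)   # here beta * c = 0.6 < 1
```

For $\beta c > 1$ the values in `damped` stay below `uniform_bound(A, beta, c)` however far ahead of the front the perturbation starts, whereas for $\beta c < 1$ the contribution grows without bound in $\xi_0$.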
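2.1. Function spaces and Fourier transform. We start the precise analysis and will work in Fourier space and revert to the $x$-variables only at the end of the discussion. We define the Fourier transform by
$$(\mathcal{F}f)(k) = \frac{1}{2\pi}\int dx\, f(x)\,e^{-ikx}.$$

Notation. If $f$ denotes a function, then $\tilde f$ is defined by $\tilde f = \mathcal{F}f$, and if $A$ is an operator, then $\tilde A$ is defined by $\tilde A = \mathcal{F}A\mathcal{F}^{-1}$. We also use the notation $\tilde f * \tilde g$ for the convolution product
$$\widetilde{fg}(k) = (\tilde f * \tilde g)(k) = \int d\ell\, \tilde f(k - \ell)\,\tilde g(\ell).$$
Finally, $T_\zeta$ denotes the conjugate of translation: $(T_\zeta \tilde f)(k) = e^{-i\zeta k}\tilde f(k)$, so that the Fourier transform of $T_\zeta f(x) = f(x - \zeta)$ is
$$\mathcal{F}T_\zeta f = T_\zeta \mathcal{F}f. \tag{2.5}$$
The relation ([Ta97])
$$k^\alpha \partial_k^\beta (\mathcal{F}f)(k) = (-i)^{\alpha+\beta}\,\mathcal{F}(\partial_x^\alpha x^\beta f)(k)$$
motivates the introduction of the following norms: We fix a small $\delta > 0$ and define
$$\|\tilde f\|_{\tilde H^{m,\delta}_n} = \Big(\sum_{\ell=0}^{m}\sum_{j=0}^{n} \delta^{2(\ell+j)} \int dk\, |\partial_k^j (k^\ell \tilde f(k))|^2\Big)^{1/2}. \tag{2.6}$$
The dual norm to this is
$$\|f\|_{H^n_{m,\delta}} = \Big(\sum_{\ell=0}^{m}\sum_{j=0}^{n} \delta^{2(\ell+j)} \int dx\, |\partial_x^\ell f(x)|^2\, x^{2j}\Big)^{1/2}. \tag{2.7}$$
Parseval’s identity immediately leads to: $\|f\|_{H^n_{m,\delta}} = \|\mathcal{F}f\|_{\tilde H^{m,\delta}_n}$.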
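In the following we mainly work with the spaces with $m = n = 2$ and with $m = 0$, $n = 2$. For some constant $C$ independent of $1 \ge \delta > 0$,
$$\|fg\|_{H^2_{2,\delta}} \le C\|f\|_{H^2_{2,\delta}}\|g\|_{H^2_{2,\delta}}, \qquad \|\tilde f * \tilde g\|_{\tilde H^{2,\delta}_2} \le C\|\tilde f\|_{\tilde H^{2,\delta}_2}\|\tilde g\|_{\tilde H^{2,\delta}_2}, \tag{2.8}$$
or in a stronger version
$$\|fg\|_{H^2_{2,\delta}} \le C\|f\|_{H^2_{2,\delta}}\|g\|_{H^2_{0,\delta}}, \qquad \|\tilde f * \tilde g\|_{\tilde H^{2,\delta}_2} \le C\|\tilde f\|_{\tilde H^{2,\delta}_2}\|\tilde g\|_{\tilde H^{0,\delta}_2}. \tag{2.9}$$
Finally, we shall also need the inequality
$$\|\tilde f * \tilde g\|_{\tilde H^{0,\delta}_2} \le \|f\|_{C^2_{b,\delta}}\|\tilde g\|_{\tilde H^{0,\delta}_2}, \tag{2.10}$$
where
$$\|f\|_{C^2_{b,\delta}} = \sum_{j=0}^{2} \delta^j \sup_{x\in\mathbb{R}} |\partial_x^j f(x)|. \tag{2.11}$$
This follows from
$$\|\tilde f * \tilde g\|_{\tilde H^{0,\delta}_2} = \|f\cdot g\|_{H^2_{0,\delta}} \le \|f\|_{C^2_{b,\delta}}\|g\|_{H^2_{0,\delta}} = \|f\|_{C^2_{b,\delta}}\|\tilde g\|_{\tilde H^{0,\delta}_2},$$
where the inequality above is a direct consequence of the definition of $\tilde H^{0,\delta}_2$. We define the map $W_{\beta,\hat c t}$ by
$$(W_{\beta,\hat c t} f)(\xi) = f(\xi + \hat c t)\,e^{\beta\xi}, \tag{2.12}$$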
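where $\beta \in (0, \beta_*)$ and $\hat c \in (0, c)$ will be fixed later. The Fourier conjugate of this operator then satisfies
$$(\widetilde W_{\beta,\hat c t}\tilde f)(k) \equiv (\mathcal{F}W_{\beta,\hat c t}\mathcal{F}^{-1}\tilde f)(k) = e^{i(k+i\beta)\hat c t}\,\tilde f(k + i\beta), \tag{2.13}$$
as one sees from the following equalities:
$$\begin{aligned}
2\pi\,\widetilde{(W_{\beta,\hat c t}f)}(k) &= \int d\xi\, e^{-ik\xi}\,(W_{\beta,\hat c t}f)(\xi)\\
&= \int d\xi\, e^{-ik\xi} f(\xi + \hat c t)\,e^{\beta\xi}\\
&= \int d\xi\, e^{-i(k+i\beta)\xi} f(\xi + \hat c t)\\
&= \int d\xi\, e^{-i(k+i\beta)(\xi - \hat c t)} f(\xi)\\
&= 2\pi\, e^{i(k+i\beta)\hat c t}\,\tilde f(k + i\beta).
\end{aligned}$$
This calculation also shows that if $f(\xi)e^{\beta_*\xi} \in H^2_{0,\delta}$ for $f \in C^2_{b,\delta}$, then $\widetilde W_{\beta,\hat c t}\tilde f$ extends to an analytic function in $\{0 < \operatorname{Im} k < \beta_*\}$ and $(\widetilde W_{\beta,\hat c t}\tilde f)(\cdot - i\beta) \in \tilde H^{0,\delta}_2$ for all $\beta \in [0, \beta_*)$.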
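The identity (2.13) can be tested on a concrete function. The sketch below is an illustration under stated assumptions: it takes the Gaussian $f(x) = e^{-x^2}$, whose transform $e^{-k^2/4}/(2\sqrt{\pi})$ is entire, and compares a trapezoidal quadrature of the left-hand side with the right-hand side. The helper names and the values of $\beta$, $\hat c t$, $k$ are hypothetical.

```python
import cmath
import math

def fourier(f, k, half_width=20.0, steps=8000):
    """(F f)(k) = (1/2pi) int f(x) exp(-i k x) dx, trapezoidal rule; k may be complex."""
    h = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for n in range(steps + 1):
        x = -half_width + n * h
        weight = 0.5 if n in (0, steps) else 1.0
        total += weight * f(x) * cmath.exp(-1j * k * x)
    return total * h / (2.0 * math.pi)

def gaussian(x):
    return math.exp(-x * x)

def gaussian_hat(k):
    """Closed-form transform of exp(-x^2); the formula is valid for complex k."""
    return cmath.exp(-k * k / 4.0) / (2.0 * math.sqrt(math.pi))

# Hypothetical sample values; `shift` plays the role of c_hat * t.
beta, shift, k = 0.5, 0.6, 1.3

def weighted(xi):
    # (W_{beta, c_hat t} f)(xi) = f(xi + c_hat t) * exp(beta * xi), cf. (2.12)
    return gaussian(xi + shift) * math.exp(beta * xi)

lhs = fourier(weighted, k)
rhs = cmath.exp(1j * (k + 1j * beta) * shift) * gaussian_hat(k + 1j * beta)
```

The two complex numbers `lhs` and `rhs` agree to quadrature accuracy, which illustrates that the exponential weight in position space is a shift of the Fourier variable into the complex plane.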
Remark. Since the norms for different $\delta$ are equivalent, all theorems throughout this paper can also be formulated in a version with $\delta = 1$.

3. The Linear Simplified Problem

In this section we study the linearization of Eq. (2.2):
$$\partial_t U(x,t) = \partial_x^2 U(x,t) + a(x - ct)\,U(x,t). \tag{3.1}$$
The function $a$ is given as
$$a(\xi) = \tfrac{1}{2}(1 + \tanh\xi), \tag{3.2}$$
but our methods will work for many other functions. The crucial property we need is the existence of a $\beta_* > 0$ such that $a(\xi)e^{-\beta\xi}$ satisfies
$$\|\xi \mapsto a(\xi)e^{-\beta\xi}\|_{H^2_{2,\delta}} \le C, \tag{3.3}$$
for all $\beta \in (0, \beta_*)$. For the case of (3.2) we can take $\beta_* = 2$. The Fourier transform $\tilde a$ of $a$ is therefore a tempered distribution which is the boundary value of a function (again called $\tilde a$) which is analytic in the strip $\{z \mid 0 > \operatorname{Im} z > -\beta_*\}$. Furthermore, there is a $K$ such that, for all $\delta \in (0, 1]$,
$$\|a\|_{C^2_{b,\delta}} \le 1 + K\delta, \tag{3.4}$$
since
$$\sup_{x\in\mathbb{R}} |a(x)| \le 1. \tag{3.5}$$
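For the specific choice (3.2) the constant in (3.4) can be made explicit. The sketch below is an illustration, not the authors' computation: it uses $a' = \frac12\operatorname{sech}^2$ and $a'' = -\operatorname{sech}^2\tanh$, whose suprema are $\frac12$ and $2/(3\sqrt3)$, so that $K = \frac12 + 2/(3\sqrt3)$ is an admissible (assumed, not necessarily optimal) value for $\delta \in (0,1]$. The grid parameters are arbitrary.

```python
import math

def a(xi):
    return 0.5 * (1.0 + math.tanh(xi))

def a_prime(xi):
    return 0.5 / math.cosh(xi) ** 2

def a_second(xi):
    return -math.tanh(xi) / math.cosh(xi) ** 2

def c2_norm(delta, half_width=20.0, steps=40000):
    """Grid approximation of sum_j delta^j sup |d^j a / d xi^j|, j = 0, 1, 2, cf. (2.11)."""
    grid = [-half_width + 2.0 * half_width * n / steps for n in range(steps + 1)]
    sups = [max(abs(f(x)) for x in grid) for f in (a, a_prime, a_second)]
    return sum(delta ** j * s for j, s in enumerate(sups))

# One admissible constant for (3.4) when delta <= 1 (an assumption of this sketch).
K = 0.5 + 2.0 / (3.0 * math.sqrt(3.0))
norms = {delta: c2_norm(delta) for delta in (0.1, 0.5, 1.0)}
```

Each entry of `norms` stays below $1 + K\delta$, and the excess over $1$ shrinks linearly as $\delta \to 0$, which is the only feature of (3.4) used later.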
The bound (3.5) will be tacitly used later. The next proposition describes how solutions of (3.1) tend to 0 as $t \to \infty$. We write $U_t(x)$ for $U(x, t)$ and use similar notation for other functions of space and time.

Proposition 3.1. Assume that there are a $\beta$ and a $\hat c \in (0, c)$ such that $\beta^2 - \beta\hat c + 1 \equiv -2\gamma < 0$. Then there exists a $\delta \in (0, 1]$ such that the following holds. Assume that $U_0 \in H^2_{2,\delta}$ and that $W_0(\xi) = W_{\beta,0}U_0(\xi) = U_0(\xi)e^{\beta\xi} \in H^2_{0,\delta}$. (These conditions are independent of $\delta > 0$.) Then the solution $U_t(x) = U(x, t)$ of (3.1) with initial data $U_0$ exists for all $t > 0$ and with $\tilde\psi(k) = e^{-k^2}$ the rescaled solution $\tilde V(k, t) = \tilde U(kt^{-1/2}, t)$ satisfies
$$\|\tilde V_t - \tilde U_0(0)\tilde\psi\|_{\tilde H^{2,\delta}_2} \le \frac{C}{(1+t)^{1/2}}\,\|\tilde U_0\|_{\tilde H^{2,\delta}_2}. \tag{3.6}$$
The function $\tilde W_t = \widetilde W_{\beta,\hat c t}\tilde U_t$ satisfies
$$\|\tilde W_t\|_{\tilde H^{0,\delta}_2} \le Ce^{-3\gamma t/2}\,\|\tilde W_0\|_{\tilde H^{0,\delta}_2}. \tag{3.7}$$
The constant $C$ does not depend on $U_0$.

Remark. Note that it is optimal to choose $\hat c$ arbitrarily close to $c$.

Proof. First of all, we rewrite Eq. (3.1) for $U_t$ in terms of $\tilde U_t$ and $\tilde W_t$: The equation for $W_t = W_{\beta,\hat c t}U_t$ is
$$\partial_t W(\xi,t) = \partial_\xi^2 W(\xi,t) + (\hat c - 2\beta)\,\partial_\xi W(\xi,t) + a(\xi - (c - \hat c)t)\,W(\xi,t) + (\beta^2 - \beta\hat c)\,W(\xi,t). \tag{3.8}$$
Taking Fourier transforms we then find, omitting the argument $k$ and using the notation of (2.5):
$$\partial_t \tilde U_t = -k^2\tilde U_t + (T_{ct}\tilde a) * \tilde U_t, \tag{3.9}$$
$$\partial_t \tilde W_t = \big(\beta^2 - \beta\hat c - k^2 + ik(\hat c - 2\beta)\big)\tilde W_t + (T_{(c-\hat c)t}\tilde a) * \tilde W_t. \tag{3.10}$$
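For the reader who wishes to check (3.8), the following worked step is a sketch of the change of variables; it uses only the definition (2.12), i.e. $W(\xi,t) = U(\xi + \hat c t, t)\,e^{\beta\xi}$, and Eq. (3.1).

```latex
\begin{align*}
\partial_\xi W &= (\partial_x U + \beta U)\,e^{\beta\xi}
   &&\Longrightarrow\quad (\partial_x U)\,e^{\beta\xi} = \partial_\xi W - \beta W,\\
\partial_\xi^2 W &= (\partial_x^2 U + 2\beta\,\partial_x U + \beta^2 U)\,e^{\beta\xi}
   &&\Longrightarrow\quad (\partial_x^2 U)\,e^{\beta\xi}
      = \partial_\xi^2 W - 2\beta\,\partial_\xi W + \beta^2 W,\\
\partial_t W &= (\partial_t U + \hat c\,\partial_x U)\,e^{\beta\xi}
   = \partial_\xi^2 W + (\hat c - 2\beta)\,\partial_\xi W
     + a(\xi - (c-\hat c)t)\,W + (\beta^2 - \beta\hat c)\,W .
\end{align*}
```

In the last line $U$ and its derivatives are evaluated at $x = \xi + \hat c t$, so that $a(x - ct) = a(\xi - (c - \hat c)t)$.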
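It is at this point that the simultaneous choice of two representations for the solution and their associated topologies is crucial. We first show that $\tilde W_t$ converges to 0, i.e., we show (3.7). We find from (2.10):
$$\|(T_\zeta\tilde a) * \tilde f\|_{\tilde H^{0,\delta}_2} \le \|a(\cdot - \zeta)\|_{C^2_{b,\delta}}\cdot\|\tilde f\|_{\tilde H^{0,\delta}_2} = \|a\|_{C^2_{b,\delta}}\cdot\|\tilde f\|_{\tilde H^{0,\delta}_2}.$$
Therefore, (3.4) implies
$$\|(T_{(c-\hat c)t}\tilde a) * \tilde W_t\|_{\tilde H^{0,\delta}_2} \le (1 + K\delta)\,\|\tilde W_t\|_{\tilde H^{0,\delta}_2}, \tag{3.11}$$
and we get from (3.10) the bound
$$\tfrac{1}{2}\,\partial_t \|\tilde W_t\|^2_{\tilde H^{0,\delta}_2} \le (\beta^2 - \beta\hat c + 1 + K\delta)\,\|\tilde W_t\|^2_{\tilde H^{0,\delta}_2}.$$
We choose $\delta > 0$ so small that $\beta^2 - \beta\hat c + 1 + K\delta \le -3\gamma/2$. Integrating over $t$ we get from the choice of $\beta$, $\delta$, and $\hat c$:
$$\|\tilde W_t\|_{\tilde H^{0,\delta}_2} \le e^{-3\gamma t/2}\,\|\tilde W_0\|_{\tilde H^{0,\delta}_2}. \tag{3.12}$$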
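Thus, we have shown Eq. (3.7).

Next, we study $\tilde U$. From (2.13) and deforming the contour of integration, we get
$$\begin{aligned}
(T_\zeta\tilde a * \tilde f)(k) &= \int d\ell\, e^{-i\zeta(k-\ell)}\,\tilde a(k-\ell)\,\tilde f(\ell)\\
&= \int d\ell\, e^{-i\zeta(k-\ell)}\,\tilde a(k-\ell)\,(\widetilde W_{\beta,\hat c t}\tilde f)(\ell - i\beta)\,e^{-i\ell\hat c t}\\
&= \int d\ell\, e^{-i\zeta(k-\ell-i\beta)}\,\tilde a(k-\ell-i\beta)\,(\widetilde W_{\beta,\hat c t}\tilde f)(\ell)\,e^{-i(\ell+i\beta)\hat c t}\\
&= e^{-\beta(\zeta-\hat c t)}\int d\ell\, e^{-i\zeta(k-\ell)}\,\tilde a(k-\ell-i\beta)\,(\widetilde W_{\beta,\hat c t}\tilde f)(\ell)\,e^{-i\ell\hat c t}.
\end{aligned} \tag{3.13}$$
Let $\tilde h(k) = e^{-ictk}\,\tilde a(k - i\beta)$ and $\tilde g(k) = e^{-ik\hat c t}\,(\widetilde W_{\beta,\hat c t}\tilde U_t)(k) = e^{-ik\hat c t}\,\tilde W_t(k)$. Then (3.13) implies
$$T_{ct}\tilde a * \tilde U_t = e^{-\beta(c-\hat c)t}\,\tilde h * \tilde g.$$
From this and (3.3) we conclude that
$$\begin{aligned}
\|T_{ct}\tilde a * \tilde U_t\|_{\tilde H^{2,\delta}_2} &= e^{-\beta(c-\hat c)t}\,\|\tilde h * \tilde g\|_{\tilde H^{2,\delta}_2}\\
&\le Ce^{-\beta(c-\hat c)t}\,\|\tilde h\|_{\tilde H^{2,\delta}_2}\,\|\tilde g\|_{\tilde H^{0,\delta}_2}\\
&\le C(1 + tc)^2\,e^{-\beta(c-\hat c)t}\,\|\tilde W_t\|_{\tilde H^{0,\delta}_2}.
\end{aligned} \tag{3.14}$$
On the other hand, from (3.7) we know that $\|\tilde W_t\|_{\tilde H^{0,\delta}_2}$ stays bounded (it actually decays exponentially), and thus the evolution equation for $\tilde U_t$ is of the form
$$\partial_t \tilde U_t(k) = -k^2\,\tilde U_t(k) + \tilde f(k, t)\,(1 + tc)^2\,e^{-\beta(c-\hat c)t},$$
with $\|\tilde f(\cdot, t)\|_{\tilde H^{2,\delta}_2}$ uniformly bounded in $t$. Since, by construction, $\hat c < c$, we conclude that (3.6) holds, using well-known arguments which will be made explicit in the proof of Theorem 4.1. The proof of Proposition 3.1 is complete.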
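The hypotheses of Proposition 3.1 restrict the admissible parameters. The following sketch makes this explicit under the elementary observation that $\beta^2 - \beta\hat c + 1 < 0$ has real solutions only if $\hat c > 2$; it then picks $\delta$ so that $K\delta \le \gamma/2$, as in the proof. The function names and the numerical values ($\hat c = 2.5$, $\beta = 1$, $K = 0.885$) are hypothetical.

```python
import math

def gamma_of(beta, c_hat):
    """gamma defined through beta^2 - beta * c_hat + 1 = -2 * gamma."""
    return -(beta * beta - beta * c_hat + 1.0) / 2.0

def admissible_beta_interval(c_hat, beta_star=2.0):
    """Open interval of beta in (0, beta_star) with beta^2 - beta*c_hat + 1 < 0, or None."""
    disc = c_hat * c_hat - 4.0
    if disc <= 0.0:
        return None
    lo = (c_hat - math.sqrt(disc)) / 2.0
    hi = min((c_hat + math.sqrt(disc)) / 2.0, beta_star)
    return (lo, hi) if lo < hi else None

def delta_for(gamma, K):
    """Largest delta in (0, 1] with K * delta <= gamma / 2."""
    return min(1.0, gamma / (2.0 * K))

# Hypothetical sample values, chosen only for illustration.
c_hat, beta, K = 2.5, 1.0, 0.885
interval = admissible_beta_interval(c_hat)
gamma = gamma_of(beta, c_hat)
delta = delta_for(gamma, K)
rate = beta * beta - beta * c_hat + 1.0 + K * delta   # exponent appearing before (3.12)
```

With these values `interval` is $(0.5, 2)$, $\gamma = 0.25$, and `rate` does not exceed $-3\gamma/2$, which is the decay rate stated in (3.7).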
4. The Renormalization Approach for the Simplified Problem

We consider now the non-linear problem (2.2) and its related version for $\tilde w_t = \widetilde W_{\beta,\hat c t}\tilde u_t = \mathcal{F}W_{\beta,\hat c t}u_t$ in Fourier space. It takes the form
$$\begin{aligned}
\partial_t \tilde u_t &= -k^2\tilde u_t + (T_{ct}\tilde a) * \tilde u_t + \tilde u_t^{*p},\\
\partial_t \tilde w_t &= \big(\beta^2 - \beta\hat c - k^2 + ik(\hat c - 2\beta)\big)\tilde w_t + (T_{(c-\hat c)t}\tilde a) * \tilde w_t + (T_{-\hat c t}\tilde u_t)^{*(p-1)} * \tilde w_t.
\end{aligned} \tag{4.1}$$
Let $M_\beta$ be the operator of multiplication: $(M_\beta f)(x) = e^{\beta x}f(x)$. Choose the constants $\hat c$ and $\beta$ such that they satisfy as before $0 > -2\gamma = \beta^2 - \beta\hat c + 1$, and fix them henceforth. Our main result for the simplified problem is:

Theorem 4.1. For all $\vartheta \in (0, 1/2)$ there are positive constants $R$, $C$ and $\delta \in (0, 1]$ such that the following holds: Assume $\|u_0\|_{H^2_{2,\delta}} + \|M_\beta u_0\|_{H^2_{0,\delta}} \le R$. Then the solution $u_t$ of (2.2) with initial condition $u_0$ converges to a Gaussian in the sense that there is a constant $A_* = A_*(u_0)$ such that with $\tilde\psi(k) = e^{-k^2}$ the rescaled solution $\tilde v(k, t) = \tilde u(kt^{-1/2}, t)$ satisfies
$$\|\tilde v_t - A_*\tilde\psi\|_{\tilde H^{2,\delta}_2} \le \frac{CR}{(t+1)^{1/2-\vartheta}}. \tag{4.2}$$
Furthermore,
$$\|\tilde w_t\|_{\tilde H^{0,\delta}_2} = \|\mathcal{F}W_{\beta,\hat c t}u_t\|_{\tilde H^{0,\delta}_2} \le CRe^{-\gamma t}.$$
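We shall use the renormalization technique of [BK92] to show that $\tilde u_t$ and $\tilde w_t$ behave (as $t \to \infty$) essentially in the same way as their linear counterparts $\tilde U_t$ and $\tilde W_t$ from the previous section. This technique consists, see [CEE92], in pushing forward the solution for some time and then rescaling it. This process makes the effective non-linearity smaller at each step, so that in the end the convergence properties of the linearized problem are obtained. We fix $0 < \sigma \le 1$ and introduce:
$$(\tilde{\mathcal{L}}\tilde f)(\kappa) = \tilde f(\sigma\kappa). \tag{4.3}$$
This is a linear change of coordinates in function space. Definition (2.6) and (4.3) imply that
$$\|\tilde{\mathcal{L}}\tilde f\|^2_{\tilde H^{2,\delta}_2} = \sigma^{-1}\sum_{j,\ell=0}^{2}\int d(\sigma\kappa)\, \delta^{2\ell}\,\sigma^{-2\ell}\,\sigma^{2j}\,|(\partial^j\tilde f)(\sigma\kappa)|^2\,(\sigma\kappa)^{2\ell}.$$
From this we conclude immediately that for $0 < \sigma < 1$:
$$\|\tilde{\mathcal{L}}\tilde f\|_{\tilde H^{2,\delta}_2} \le \sigma^{-5/2}\,\|\tilde f\|_{\tilde H^{2,\delta}_2} \quad\text{and}\quad \|\tilde{\mathcal{L}}^{-1}\tilde f\|_{\tilde H^{2,\delta}_2} \le \sigma^{-3/2}\,\|\tilde f\|_{\tilde H^{2,\delta}_2}. \tag{4.4}$$
Similarly
$$\|\tilde{\mathcal{L}}\tilde f\|_{\tilde H^{0,\delta}_2} \le \sigma^{-1/2}\,\|\tilde f\|_{\tilde H^{0,\delta}_2} \quad\text{and}\quad \|\tilde{\mathcal{L}}^{-1}\tilde f\|_{\tilde H^{0,\delta}_2} \le \sigma^{1/2}\,\|\tilde f\|_{\tilde H^{0,\delta}_2}. \tag{4.5}$$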
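Note also that
$$\tilde{\mathcal{L}}(\tilde f * \tilde g)(\kappa) = \int d\kappa'\, \tilde f(\sigma\kappa - \kappa')\,\tilde g(\kappa') = \sigma\int d(\sigma^{-1}\kappa')\, \tilde f(\sigma\kappa - \sigma\sigma^{-1}\kappa')\,\tilde g(\sigma\sigma^{-1}\kappa') = \sigma\big((\tilde{\mathcal{L}}\tilde f) * (\tilde{\mathcal{L}}\tilde g)\big)(\kappa).$$
Furthermore, by the definition of $T_\zeta$ preceding (2.5),
$$\tilde{\mathcal{L}}(T_\zeta\tilde a)(\kappa) = e^{-i\zeta\sigma\kappa}\,\tilde a(\sigma\kappa) = \big(T_{\sigma\zeta}(\tilde{\mathcal{L}}\tilde a)\big)(\kappa),$$
and therefore we have
$$\tilde{\mathcal{L}}\big((T_\zeta\tilde a) * \tilde f\big) = \sigma\,(T_{\sigma\zeta}\tilde{\mathcal{L}}\tilde a) * (\tilde{\mathcal{L}}\tilde f). \tag{4.6}$$
We next define
$$\tilde u_{n,\tau}(\kappa) = (\tilde{\mathcal{L}}^n\tilde u)(\kappa, \sigma^{-2n}\tau) = \tilde u(\sigma^n\kappa, \sigma^{-2n}\tau), \qquad \tilde w_{n,\tau}(\kappa) = e^{\gamma\sigma^{-2n}\tau}\,\tilde w(\kappa, \sigma^{-2n}\tau), \tag{4.7}$$
so that this corresponds to an additional rescaling of the time axis. Note that $\tilde u_{n,\sigma^2}(\kappa) = \tilde u_{n-1,1}(\sigma\kappa)$, and
$$\tilde w_{n,\sigma^2}(\kappa) = e^{\gamma\sigma^{-2n}\sigma^2}\,\tilde w(\kappa, \sigma^{-2n}\sigma^2) = \tilde w_{n-1,1}(\kappa),$$
i.e., the exponentially damped variable $w$ is not scaled in space. We also let $\tilde a_n = \tilde{\mathcal{L}}^n\tilde a$. From (4.6), (4.7), and $\partial_\tau = \sigma^{-2n}\partial_t$ we find easily that (4.1) transforms to the system (omitting the argument $\kappa$):
$$\partial_\tau\tilde u_{n,\tau} = -\kappa^2\,\tilde u_{n,\tau} + \sigma^{-n}\,(T_{c\sigma^{-n}\tau}\tilde a_n) * \tilde u_{n,\tau} + \sigma^{n(p-3)}\,\tilde u_{n,\tau}^{*p}, \tag{4.8}$$
$$\sigma^{2n}\,\partial_\tau\tilde w_{n,\tau} = \big((\beta^2 - \beta\hat c + \gamma) - \kappa^2 + i\kappa(\hat c - 2\beta)\big)\tilde w_{n,\tau} + (T_{(c-\hat c)\sigma^{-2n}\tau}\tilde a) * \tilde w_{n,\tau} + \big(T_{-\hat c\sigma^{-2n}\tau}(\tilde{\mathcal{L}}^{-n}\tilde u_{n,\tau})\big)^{*(p-1)} * \tilde w_{n,\tau}. \tag{4.9}$$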
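The factor $\sigma$ gained by each convolution under rescaling is what produces the power $\sigma^{n(p-3)}$ in (4.8): a $p$-fold convolution contains $p-1$ convolutions, and the time rescaling contributes $\sigma^{-2n}$. The sketch below checks the single-step identity $\tilde{\mathcal{L}}(\tilde f * \tilde g) = \sigma(\tilde{\mathcal{L}}\tilde f) * (\tilde{\mathcal{L}}\tilde g)$ numerically for two Gaussians. The test functions, the helper names and the values of $\sigma$, $\kappa$ are hypothetical choices made for illustration.

```python
import math

def convolve(f, g, k, half_width=30.0, steps=12000):
    """(f * g)(k) = int f(k - l) g(l) dl by the trapezoidal rule."""
    h = 2.0 * half_width / steps
    total = 0.0
    for n in range(steps + 1):
        l = -half_width + n * h
        weight = 0.5 if n in (0, steps) else 1.0
        total += weight * f(k - l) * g(l)
    return total * h

def rescale(f, sigma):
    """(L f)(kappa) = f(sigma * kappa), cf. (4.3)."""
    return lambda kappa: f(sigma * kappa)

def f(k):
    return math.exp(-k * k)

def g(k):
    return math.exp(-2.0 * k * k)

sigma, kappa = 0.5, 0.8                       # hypothetical sample values
lhs = convolve(f, g, sigma * kappa)           # L(f * g)(kappa)
rhs = sigma * convolve(rescale(f, sigma), rescale(g, sigma), kappa)
closed_form = math.sqrt(math.pi / 3.0) * math.exp(-2.0 * (sigma * kappa) ** 2 / 3.0)
```

Both `lhs` and `rhs` agree with the closed form of the Gaussian convolution, so that for $p > 3$ the coefficient $\sigma^{n(p-3)}$ of the non-linearity indeed shrinks with every renormalization step.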
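We see that under these rescalings the coefficients of the non-linear terms in the first equation go to 0 as $n \to \infty$. We will now put this observation into more mathematical form. Equation (4.1) is of the form $\partial_t X_t = LX_t + N(X_t)$, where $L$ contains the linear parts with the exception of those depending on $\tilde a_n$ and $N$ denotes the other terms. We can write the solution as
$$X_t = e^{(t-t_0)L}X_{t_0} + \int_{t_0}^{t} ds\, e^{(t-s)L}\,N(X_s).$$
Going to the rescaled variables $X_{n,\tau}$, and taking $t_0 = \sigma^{-2(n-1)}$ and $t = \sigma^{-2n}\tau$, we can express this (for the $\tilde u$) as follows. Equation (4.8) leads to
$$\tilde u_{n,\tau}(\kappa) = e^{-\kappa^2(\tau-\sigma^2)}\,\tilde u_{n,\sigma^2}(\kappa) + \int_{\sigma^2}^{\tau} d\tau'\, e^{-\kappa^2(\tau-\tau')}\Big(\sigma^{-n}\,(T_{c\sigma^{-n}\tau'}\tilde a_n) * \tilde u_{n,\tau'} + \sigma^{n(p-3)}\,\tilde u_{n,\tau'}^{*p}\Big)(\kappa). \tag{4.10}$$
Similarly, we rewrite (4.9) as
$$\partial_\tau\tilde w_{n,\tau} = \tilde G_{n,\tau}\tilde w_{n,\tau} + \sigma^{-2n}\big(T_{-\hat c\sigma^{-2n}\tau}(\tilde{\mathcal{L}}^{-n}\tilde u_{n,\tau})\big)^{*(p-1)} * \tilde w_{n,\tau},$$
where $\tilde G_{n,\tau}$ is defined, cf. (4.9), by
$$\sigma^{2n}\,(\tilde G_{n,\tau}\tilde f)(\kappa) = \big((\beta^2 - \beta\hat c + \gamma) - \kappa^2 + i\kappa(\hat c - 2\beta)\big)\tilde f(\kappa) + \big((T_{(c-\hat c)\sigma^{-2n}\tau}\tilde a) * \tilde f\big)(\kappa).$$
The solution of the linear evolution equation $\partial_\tau\tilde f_{n,\tau} = \tilde G_{n,\tau}\tilde f_{n,\tau}$ is nothing but (3.10) in a new coordinate system. We write the solution as $\tilde f_{n,\tau} = \tilde S_{n,\tau,\tau'}\tilde f_{n,\tau'}$. Then, in analogy to (4.10) we get
$$\tilde w_{n,\tau}(\kappa) = (\tilde S_{n,\tau,\sigma^2}\tilde w_{n,\sigma^2})(\kappa) + \sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, \Big(\tilde S_{n,\tau,\tau'}\Big(\big(T_{-\hat c\sigma^{-2n}\tau'}(\tilde{\mathcal{L}}^{-n}\tilde u_{n,\tau'})\big)^{*(p-1)} * \tilde w_{n,\tau'}\Big)\Big)(\kappa). \tag{4.11}$$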
Remark. The proof of Theorem 4.1 is divided into several steps: In Lemma 4.2 below, we give the inequalities for the exponentially damped part in scaled variables. Then in Lemma 4.4 a priori estimates for the solutions of (4.10) and (4.11) are established. With these a priori bounds we show Proposition 4.5. From these results, Theorem 4.1 will follow rather simply by a contraction argument.

4.1. The weighted linear problem. We bound $\tilde S_{n,\tau,\tau'}$. Recall that we are assuming $\beta^2 - \beta\hat c + 1 = -2\gamma < 0$.

Lemma 4.2. There exists a $C > 0$ such that for $1 > \tau > \tau' \ge 0$ one has
$$\|\tilde S_{n,\tau,\tau'}\tilde f\|_{\tilde H^{0,\delta}_2} \le Ce^{-\gamma\sigma^{-2n}(\tau-\tau')/2}\,\|\tilde f\|_{\tilde H^{0,\delta}_2}, \tag{4.12}$$
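for all $n \in \mathbb{N}$.

Proof. We consider the equation $\partial_\tau\tilde f_\tau = \tilde G_{n,\tau}\tilde f_\tau$, with solution $\tilde f_\tau = \tilde S_{n,\tau,\tau'}\tilde f_{\tau'}$:
$$\partial_\tau\tilde f_\tau = \lambda_n\tilde f_\tau + \sigma^{-2n}\,(T_{(c-\hat c)\sigma^{-2n}\tau}\tilde a) * \tilde f_\tau,$$
where $\lambda_n$ is the operator of multiplication by
$$\lambda_n(\kappa) = \big((\beta^2 - \beta\hat c + \gamma) - \kappa^2 + i\kappa(\hat c - 2\beta)\big)\sigma^{-2n}.$$
The variation of constant formula yields
$$\tilde f_\tau = e^{\lambda_n(\tau-\tau')}\tilde f_{\tau'} + \int_{\tau'}^{\tau} ds\, e^{\lambda_n(\tau-s)}\,\sigma^{-2n}\,(T_{(c-\hat c)\sigma^{-2n}s}\tilde a) * \tilde f_s.$$
We use
$$\|e^{\lambda_n\tau}\tilde f\|_{\tilde H^{0,\delta}_2} \le \|e^{\lambda_n\tau}\|_{C^0_b}\,\|\tilde f\|_{\tilde H^{0,\delta}_2}, \tag{4.13}$$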
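and
$$\|e^{\lambda_n\tau}\|_{C^0_b} \le e^{(\beta^2-\beta\hat c+\gamma)\sigma^{-2n}\tau}.$$
We find
$$\|\tilde f_\tau\|_{\tilde H^{0,\delta}_2} \le e^{(\beta^2-\beta\hat c+\gamma)\sigma^{-2n}(\tau-\tau')}\,\|\tilde f_{\tau'}\|_{\tilde H^{0,\delta}_2} + \int_{\tau'}^{\tau} ds\, e^{(\beta^2-\beta\hat c+\gamma)\sigma^{-2n}(\tau-s)}\,\sigma^{-2n}\,\|a\|_{C^2_{b,\delta}}\,\|\tilde f_s\|_{\tilde H^{0,\delta}_2},$$
since
$$\|(T_\zeta\mathcal{F}a) * \tilde f\|_{\tilde H^{0,\delta}_2} = \|a(\cdot - \zeta)\,\mathcal{F}^{-1}\tilde f\|_{H^2_{0,\delta}} \le \|a(\cdot - \zeta)\|_{C^2_{b,\delta}}\,\|\mathcal{F}^{-1}\tilde f\|_{H^2_{0,\delta}} = \|a\|_{C^2_{b,\delta}}\,\|\tilde f\|_{\tilde H^{0,\delta}_2}.$$
Using $\|a\|_{C^2_{b,\delta}} \le 1 + K\delta$ from (3.4) and applying Gronwall’s inequality to $e^{-(\beta^2-\beta\hat c+\gamma)\sigma^{-2n}\tau}\,\|\tilde f_\tau\|_{\tilde H^{0,\delta}_2}$ we get
$$e^{-(\beta^2-\beta\hat c+\gamma)\sigma^{-2n}(\tau-\tau')}\,\|\tilde f_\tau\|_{\tilde H^{0,\delta}_2} \le e^{\sigma^{-2n}(1+K\delta)(\tau-\tau')}\,\|\tilde f_{\tau'}\|_{\tilde H^{0,\delta}_2},$$
or equivalently,
$$\|\tilde f_\tau\|_{\tilde H^{0,\delta}_2} \le \|\tilde f_{\tau'}\|_{\tilde H^{0,\delta}_2}\,e^{(\beta^2-\beta\hat c+\gamma+1+K\delta)\sigma^{-2n}(\tau-\tau')}. \tag{4.14}$$
Since $\beta^2 - \beta\hat c + 1 = -2\gamma$, the exponent equals $(-\gamma + K\delta)\sigma^{-2n}(\tau-\tau')$. Choosing $\delta \in (0, 1]$ so small that $K\delta < \gamma/2$ completes the proof of Lemma 4.2.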
4.2. An a priori bound on the non-linear problem. We now state and prove a priori bounds on the solution of (4.10) and (4.11). Finally these solutions will be controlled by proving inequalities for the elements of the following sequences.

Definition 4.3. For all $n$, we define
$$\rho^u_n = \|\tilde u_{n,1}\|_{\tilde H^{2,\delta}_2} \quad\text{and}\quad \rho^w_n = \|\tilde w_{n,1}\|_{\tilde H^{0,\delta}_2}.$$
Moreover, we define
$$R^u_n = \sup_{\tau\in[\sigma^2,1]}\|\tilde u_{n,\tau}\|_{\tilde H^{2,\delta}_2} \quad\text{and}\quad R^w_n = \sup_{\tau\in[\sigma^2,1]}\|\tilde w_{n,\tau}\|_{\tilde H^{0,\delta}_2}. \tag{4.15}$$
Lemma 4.4. For all $n \in \mathbb{N}$ there is a constant $\eta_n > 0$ such that the following holds: If $\rho^u_{n-1}$, $\rho^w_{n-1}$, and $\sigma > 0$ are smaller than $\eta_n$, the solutions of (4.10) and (4.11) exist for all $\tau \in [\sigma^2, 1]$. Moreover, we have the estimates
$$R^u_n \le C\sigma^{-5/2}\rho^u_{n-1} + Ce^{-C\sigma^{-n}}R^w_n + C\sigma^{n(p-3)}(R^u_n)^p, \tag{4.16}$$
and
$$R^w_n \le C\rho^w_{n-1} + C\sigma^{n(p-3/2)}(R^u_n)^{p-1}R^w_n, \tag{4.17}$$
with a constant $C$ independent of $\sigma$ and $n$.
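Remark. There is no need for a detailed expression for $\eta = \eta_n$ since the existence of the solutions is guaranteed if we can show $R^u_n < \infty$ and $R^w_n < \infty$. With (4.16) and (4.17) we have detailed control of these quantities in terms of the norm of the initial conditions and $\sigma$.

Proof. We start with (4.11). We bound the first term of (4.11) by
$$C\rho^w_{n-1}. \tag{4.18}$$
For the second term in (4.11), we get with (4.5) and (4.6),
$$\begin{aligned}
\big\|\big(T_{-\hat c\sigma^{-2n}\tau}(\tilde{\mathcal{L}}^{-n}\tilde u_{n,\tau})\big)^{*(p-1)} * \tilde w_{n,\tau}\big\|_{\tilde H^{0,\delta}_2}
&\le \big\|\big(\tilde{\mathcal{L}}^{-n}(T_{-\hat c\sigma^{-n}\tau}\tilde u_{n,\tau})\big)^{*(p-1)} * \tilde w_{n,\tau}\big\|_{\tilde H^{0,\delta}_2}\\
&\le \sigma^{n/2}\,\sigma^{n(p-2)}\,\|\tilde u_{n,\tau}\|^{p-1}_{\tilde H^{0,\delta}_2}\,\|\tilde w_{n,\tau}\|_{\tilde H^{0,\delta}_2}\\
&\le \sigma^{n(p-3/2)}\,(R^u_n)^{p-1}R^w_n,
\end{aligned}$$
and hence, together with Lemma 4.2 and the prefactor $\sigma^{-2n}$ in (4.11), a bound
$$C\sigma^{n(p-7/2)}\int_{\sigma^2}^{\tau} d\tau'\, e^{-\gamma\sigma^{-2n}(\tau-\tau')/2}\,(R^u_n)^{p-1}R^w_n \le C\sigma^{n(p-3/2)}\,(R^u_n)^{p-1}R^w_n.$$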
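We next consider (4.10). The first term is bounded by
$$\|\kappa \mapsto e^{-\kappa^2(\tau-\sigma^2)}\,\tilde u_{n-1,1}(\sigma\kappa)\|_{\tilde H^{2,\delta}_2} \le \|\kappa \mapsto e^{-\kappa^2(\tau-\sigma^2)}\|_{C^2_{b,\delta}}\,\|\kappa \mapsto \tilde u_{n-1,1}(\sigma\kappa)\|_{\tilde H^{2,\delta}_2} \le C\sigma^{-5/2}\rho^u_{n-1}, \tag{4.19}$$
using (4.4). We use (4.7) and recall $\tilde a_n = \tilde{\mathcal{L}}^n\tilde a$ to rewrite the second term of (4.10) into
$$\begin{aligned}
&\sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{-\kappa^2(\tau-\tau')}\big((T_{c\sigma^{-2n}\tau'}\tilde a) * (\tilde{\mathcal{L}}^{-n}\tilde u_{n,\tau'})\big)(\kappa)\\
&\quad= \sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{-\kappa^2(\tau-\tau')}\big((T_{c\sigma^{-2n}\tau'}\tilde a) * \tilde u_{\sigma^{-2n}\tau'}\big)(\kappa)\\
&\quad= \sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{-\beta\sigma^{-2n}\tau'(c-\hat c)}\,e^{-\kappa^2(\tau-\tau')}\\
&\qquad\times\int d\ell\, e^{-i(\kappa-\ell)c\sigma^{-2n}\tau'}\,\tilde a(\kappa - \ell - i\beta)\,\tilde w(\ell, \sigma^{-2n}\tau')\,e^{-i\ell\hat c\sigma^{-2n}\tau'}\,e^{-\gamma\sigma^{-2n}\tau'},
\end{aligned}$$
where (3.13) is used for the last equality. Using this identity, we get from the techniques leading to (3.14):
$$\begin{aligned}
&\sigma^{-n}\Big\|\kappa \mapsto \int_{\sigma^2}^{\tau} d\tau'\, e^{-\kappa^2(\tau-\tau')}\big((T_{c\sigma^{-n}\tau'}\tilde a_n) * \tilde u_{n,\tau'}\big)(\kappa)\Big\|_{\tilde H^{2,\delta}_2}\\
&\quad\le \sigma^{-n}\int_{\sigma^2}^{\tau} d\tau'\, \|(T_{c\sigma^{-n}\tau'}\tilde a_n) * \tilde u_{n,\tau'}\|_{\tilde H^{2,\delta}_2}\\
&\quad\le \sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, \|\tilde{\mathcal{L}}^n\big((T_{c\sigma^{-2n}\tau'}\tilde a) * \tilde u_{\sigma^{-2n}\tau'}\big)\|_{\tilde H^{2,\delta}_2}\\
&\quad\le \sigma^{-9n/2}\int_{\sigma^2}^{\tau} d\tau'\, \|(T_{c\sigma^{-2n}\tau'}\tilde a) * \tilde u_{\sigma^{-2n}\tau'}\|_{\tilde H^{2,\delta}_2}\\
&\quad\le C\sigma^{-9n/2}\int_{\sigma^2}^{\tau} d\tau'\, (1 + c\sigma^{-2n}\tau')^2\,e^{-(\beta(c-\hat c)+\gamma)\sigma^{-2n}\tau'}\,R^w_n\\
&\quad\le C\sigma^{-17n/2}\,e^{-(\beta(c-\hat c)+\gamma)\sigma^{-2(n-1)}}\,R^w_n \le Ce^{-(\beta(c-\hat c)+\gamma)\sigma^{-n}}\,R^w_n.
\end{aligned} \tag{4.20}$$
For the last term in (4.10) we get a bound
$$C\sigma^{n(p-3)}\int_{\sigma^2}^{\tau} d\tau'\, (R^u_n)^p \le C\sigma^{n(p-3)}\,(R^u_n)^p. \tag{4.21}$$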
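The proof of Lemma 4.4 now follows by applying the contraction mapping principle to (4.10) and (4.11). For $\rho^u_{n-1}$, $\rho^w_{n-1}$ and $\sigma > 0$ sufficiently small the Lipschitz constant for the right-hand side of (4.10) and (4.11) in $C([\sigma^2, 1], \tilde H^{2,\delta}_2 \times \tilde H^{0,\delta}_2)$ is smaller than 1. An application of a classical fixed point argument completes the proof of Lemma 4.4.

4.3. The iteration process. We next decompose the solution $\tilde u_{n,\tau}$ for $\tau = 1$ into a Gaussian part and a remainder. Let $\tilde\psi(\kappa) = e^{-\kappa^2}$ and write
$$\tilde u_{n,1}(\kappa) = A_n\tilde\psi(\kappa) + \tilde r_n(\kappa),$$
where $\tilde r_n(0) = 0$, and the amplitude $A_n$ is in $\mathbb{R}$. We also define $\Pi : \tilde H^{2,\delta}_2 \to \mathbb{R}$ by
$$\Pi\tilde f = \tilde f\big|_{\kappa=0}. \tag{4.22}$$
Then (4.10) can be decomposed accordingly and takes the form
$$A_n = A_{n-1} + \Pi\Big(\int_{\sigma^2}^{1} d\tau\, e^{-\kappa^2(1-\tau)}\Big(\sigma^{-n}\,(T_{c\sigma^{-n}\tau}\tilde a_n) * \tilde u_{n,\tau} + \sigma^{n(p-3)}\,\tilde u_{n,\tau}^{*p}\Big)(\kappa)\Big), \tag{4.23}$$
$$\begin{aligned}
\tilde r_n(\kappa) &= e^{-\kappa^2(1-\sigma^2)}\,\tilde r_{n-1}(\sigma\kappa) + \int_{\sigma^2}^{1} d\tau\, e^{-\kappa^2(1-\tau)}\Big(\sigma^{-n}\,(T_{c\sigma^{-n}\tau}\tilde a_n) * \tilde u_{n,\tau} + \sigma^{n(p-3)}\,\tilde u_{n,\tau}^{*p}\Big)(\kappa)\\
&\quad+ e^{-\kappa^2(1-\sigma^2)}\,A_{n-1}\tilde\psi(\sigma\kappa) - A_n\tilde\psi(\kappa).
\end{aligned} \tag{4.24}$$
Then we define $\rho^r_n = \|\tilde r_n\|_{\tilde H^{2,\delta}_2}$ and so $\rho^u_n \le C(|A_n| + \rho^r_n)$. Our main estimate is now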
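Proposition 4.5. There is a constant $C > 0$ such that for $\sigma > 0$ sufficiently small the solution $\tilde u$ of (2.2) satisfies for all $n \in \mathbb{N}$:
$$|A_n - A_{n-1}| \le Ce^{-C\sigma^{-n}}R^w_n + C\sigma^{n(p-3)}(R^u_n)^p, \tag{4.25}$$
$$\begin{aligned}
\rho^r_n &\le C\sigma\rho^r_{n-1} + Ce^{-C\sigma^{-n}}R^w_n + C\sigma^{n(p-3)}(R^u_n)^p,\\
\rho^w_n &\le Ce^{-C\sigma^{-2n}}\rho^w_{n-1} + C\sigma^{n(p-3/2)}(R^u_n)^{p-1}R^w_n.
\end{aligned} \tag{4.26}$$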
Proof. We begin by bounding the difference $A_n - A_{n-1}$ using (4.23). Observe that since we work in $\tilde H^{2,\delta}_2$, we have
$$|\Pi\tilde f| \le C\|\tilde f\|_{\tilde H^{2,\delta}_2}, \tag{4.27}$$
with $C$ independent of $\delta$. Thus, it suffices to bound the norm of the integral in (4.23). The first term in (4.23) is the one containing the translated term $\tilde a_n$ and was already bounded in (4.20) while the second was bounded in (4.21). Combining these bounds with (4.27), we find (4.25).

We next bound $\tilde r_n$ in terms of $\tilde r_{n-1}$, using (4.24). The first term is the one where the projection is crucial: For $\sigma > 0$ sufficiently small, $\tilde f \in \tilde H^{2,\delta}_2$ with $\tilde f(0) = 0$ one has
$$\|\kappa \mapsto e^{-\kappa^2(1-\sigma^2)}\,\tilde f(\sigma\kappa)\|_{\tilde H^{2,\delta}_2} \le C\sigma\,\|\tilde f\|_{\tilde H^{2,\delta}_2}. \tag{4.28}$$
Indeed, writing out the definition (2.6) of $\tilde H^{2,\delta}_2$, one gets for the term with $j = \ell = 0$:
$$\int d\kappa\, e^{-2\kappa^2(1-\sigma^2)}\,|\tilde f(\sigma\kappa)|^2 = \int d\kappa\, e^{-2\kappa^2(1-\sigma^2)}\,(\sigma\kappa)^2\,\Big|\frac{\tilde f(\sigma\kappa) - \tilde f(0)}{\sigma\kappa}\Big|^2.$$
Clearly, a bound of the type of (4.28) follows for this term by the assumptions on $\tilde f$. The derivatives are handled similarly, except that there is no need to divide and multiply by powers of $\sigma\kappa$ since each derivative produces a factor $\sigma$.

We now bound the other terms in (4.24). The first term is bounded using (4.28) and yields a bound (in $\tilde H^{2,\delta}_2$) of
$$C\sigma\rho^r_{n-1}. \tag{4.29}$$
The second and third terms have been bounded in (4.20) and (4.21):
$$Ce^{-C\sigma^{-n}}R^w_n + C\sigma^{n(p-3)}(R^u_n)^p. \tag{4.30}$$
Finally, the last term in (4.24) can be written as
$$\tilde X_n \equiv A_{n-1}\big(e^{-\kappa^2(1-\sigma^2)}e^{-\kappa^2\sigma^2} - e^{-\kappa^2}\big) + (A_{n-1} - A_n)\,e^{-\kappa^2}.$$
The first expression vanishes and we get a bound (in $\tilde H^{2,\delta}_2$):
$$\|\tilde X_n\|_{\tilde H^{2,\delta}_2} \le Ce^{-C\sigma^{-n}}R^w_n + C\sigma^{n(p-3)}(R^u_n)^p. \tag{4.31}$$
Collecting the bounds (4.29)–(4.31), the assertion (4.26) for $\tilde r_n$ follows. Finally, the bounds on $\rho^w_n$ follow as those in Lemma 4.4. The proof of Proposition 4.5 is complete.
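The mechanism behind (4.28) and the vanishing of the first part of $\tilde X_n$ can be seen on two explicit functions. The sketch below is an illustration only: it applies the linear step $\tilde f \mapsto e^{-\kappa^2(1-\sigma^2)}\tilde f(\sigma\kappa)$ to the Gaussian $\tilde\psi$ and to $\kappa e^{-\kappa^2}$, which vanishes at $\kappa = 0$, and it measures only the $L^2$ part ($j = \ell = 0$) of the norm. The names `rg_map`, `l2_norm` and the value of $\sigma$ are hypothetical.

```python
import math

def rg_map(f, sigma):
    """One linear renormalization step: kappa -> exp(-kappa^2 (1 - sigma^2)) f(sigma kappa)."""
    return lambda kappa: math.exp(-kappa * kappa * (1.0 - sigma * sigma)) * f(sigma * kappa)

def l2_norm(f, half_width=15.0, steps=6000):
    """Riemann-sum approximation of the L^2 norm (the j = l = 0 term of (2.6))."""
    h = 2.0 * half_width / steps
    total = sum(f(-half_width + n * h) ** 2 for n in range(steps + 1))
    return math.sqrt(total * h)

def psi(kappa):
    return math.exp(-kappa * kappa)

def odd_mode(kappa):
    return kappa * math.exp(-kappa * kappa)   # vanishes at kappa = 0

sigma = 0.3                                   # hypothetical sample value
ratio_psi = l2_norm(rg_map(psi, sigma)) / l2_norm(psi)
ratio_odd = l2_norm(rg_map(odd_mode, sigma)) / l2_norm(odd_mode)
```

Here `ratio_psi` equals $1$ (the Gaussian is a fixed point of the linear step) while `ratio_odd` equals $\sigma$, which is the contraction by one power of $\sigma$ that the projection $\Pi$ makes available for the remainder $\tilde r_n$.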
Proof of Theorem 4.1. The proof is an induction argument, using repeatedly the above estimates. Again we write $C$ for (positive) constants which can be chosen independent of $\sigma$ and $n$. Assume that $R = \sup_{n\in\mathbb{N}} R^u_n < \infty$ exists. From Lemma 4.4 we observe for $\sigma > 0$ sufficiently small
$$\begin{aligned}
R^w_n &\le \frac{C\rho^w_{n-1}}{1 - C\sigma^{n(p-3/2)}R^{p-1}} \le C\rho^w_{n-1},\\
R^u_n &\le \frac{C\sigma^{-5/2}\rho^u_{n-1} + Ce^{-C\sigma^{-n}}R^w_n}{1 - C\sigma^{n(p-3)}R^{p-1}} \le C\sigma^{-5/2}\rho^u_{n-1} + Ce^{-C\sigma^{-n}}\rho^w_{n-1},
\end{aligned} \tag{4.32}$$
with a constant $C$ which can be chosen independent of $R$. Using Proposition 4.5 we find
$$\begin{aligned}
|A_n - A_{n-1}| &\le Ce^{-C\sigma^{-n}}\rho^w_{n-1} + C\sigma^{n(p-3)}\sigma^{-5/2}\rho^u_{n-1},\\
\rho^r_n &\le C\sigma\rho^r_{n-1} + Ce^{-C\sigma^{-n}}\rho^w_{n-1} + C\sigma^{n(p-3)}\sigma^{-5/2}\rho^u_{n-1},\\
\rho^u_n &\le C(|A_n| + \rho^r_n),\\
\rho^w_n &\le Ce^{-C\sigma^{-2n}}\rho^w_{n-1} + C\sigma^{n(p-3/2)}\rho^w_{n-1}.
\end{aligned} \tag{4.33}$$
Therefore, we can choose $\sigma > 0$ so small that for $n > 3$ (recall $p > 3$ and $p \in \mathbb{N}$):
$$\begin{aligned}
|A_n - A_{n-1}| &\le \rho^w_{n-1}/10 + \sigma^{n-3}\,(|A_{n-1}| + \rho^r_{n-1}),\\
\rho^r_n &\le 3\rho^r_{n-1}/4 + \rho^w_{n-1}/10 + \sigma^{n-3}\,|A_{n-1}|,\\
\rho^w_n &\le \rho^w_{n-1}/10.
\end{aligned}$$
Thus, the sequence of $A_n$ converges geometrically to a finite limit $A_*$. Furthermore, we find that $\lim_{n\to\infty}\rho^r_n = 0$, and $\lim_{n\to\infty}\rho^w_n = 0$. Since the quantities $|A_n|$, $\rho^r_n$, $\rho^w_n$ increase only for at most three steps the term $CR^{p-1}$ in (4.32) stays less than $1/2$ if we choose $|A_1|, \rho^r_1, \rho^w_1 = O(\sigma^m)$, for an $m > 0$ sufficiently large. We then deduce from (4.32) the existence of a finite constant $R = \sup_{n\in\mathbb{N}} R^u_n$. Going back to (4.33) for given $\vartheta > 0$ we can choose $\sigma > 0$ so small that $|A_n - A_{n-1}| + \rho^r_n \le C\sigma^{(1-2\vartheta)n}$ which implies the associated convergence rate stated in Theorem 4.1. This holds since $\rho_n \le C\sigma\rho_{n-1}$ implies $\rho|_{t=\sigma^{-2n}} = \rho_n \le (C\sigma)^n\rho_0$ and
$$\rho(t) \le \rho_0\,t^{-1/2}\,t^{\ln C/\ln\sigma^{-2}} \le \rho_0\,t^{-1/2+\vartheta}$$
for $\sigma > 0$ sufficiently small. Finally, the scaling of $\tilde w_{n,\tau}$ implies the exponential decay of $\tilde w_t$. The proof of Theorem 4.1 is complete.
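The three recursive inequalities used for $n > 3$ can be iterated explicitly. The sketch below is a worst-case illustration, assuming that each inequality is saturated and that $|A_n| \le |A_{n-1}| + |A_n - A_{n-1}|$; the starting values and $\sigma = 0.1$ are hypothetical. It shows the geometric convergence of $A_n$ and the decay of $\rho^r_n$ and $\rho^w_n$ claimed in the proof.

```python
def iterate(sigma, steps, A, rho_r, rho_w, start=4):
    """Iterate the saturated recursion for (|A_n|, rho_n^r, rho_n^w), n = start, start+1, ..."""
    history = []
    for n in range(start, start + steps):
        small = sigma ** (n - 3)
        A_new = A + rho_w / 10.0 + small * (A + rho_r)
        rho_r_new = 0.75 * rho_r + rho_w / 10.0 + small * A
        rho_w_new = rho_w / 10.0
        history.append((n, A_new - A, rho_r_new, rho_w_new))
        A, rho_r, rho_w = A_new, rho_r_new, rho_w_new
    return A, rho_r, rho_w, history

# Hypothetical starting values of order one, chosen only for illustration.
A_limit, rho_r_end, rho_w_end, history = iterate(0.1, 120, 1.0, 1.0, 1.0)
increments = [step[1] for step in history]
```

The increments $|A_n - A_{n-1}|$ in `increments` are summable, `A_limit` stays of the order of the initial data, and both remainders tend to zero, in agreement with the statement that $A_n$ converges to a finite $A_*$.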
Part II. The Swift–Hohenberg Equation

5. Bloch Waves

Since the problem we consider takes place in a setting with a periodic background provided by the stationary solution of the Swift–Hohenberg equation, it is natural to work with the Bloch representation of the functions. For additional information see [RS72]. The starting point of Bloch wave analysis in the case of a $2\pi$-periodic underlying pattern is the following relation:
$$u(x) = \int dk\, e^{ikx}\,\tilde u(k) = \sum_{n\in\mathbb{Z}}\int_{-1/2}^{1/2} d\ell\, e^{i(n+\ell)x}\,\tilde u(n+\ell) = \int_{-1/2}^{1/2} d\ell \sum_{n\in\mathbb{Z}} e^{i(n+\ell)x}\,\tilde u(n+\ell) = \int_{-1/2}^{1/2} d\ell\, e^{i\ell x}\,\hat u(\ell, x), \tag{5.1}$$
where we define
$$(\mathcal{T}u)(\ell, x) \equiv \hat u(\ell, x) = \sum_{n\in\mathbb{Z}} e^{inx}\,\tilde u(n+\ell). \tag{5.2}$$
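The operator $\mathcal{T}$ will play a rôle analogous to that played by the Fourier transform $\mathcal{F}$ for the simplified problem of Part I. We will use analogous notation:

Notation. If $f$ denotes a function, then $\hat f$ is defined by $\hat f = \mathcal{T}f$, and if $A$ is an operator, then $\hat A$ is defined by $\hat A = \mathcal{T}A\mathcal{T}^{-1}$.

Note that
$$\int_{\mathbb{R}} dx\, |u(x)|^2 = 2\pi\int_{-1/2}^{1/2} d\ell \int_0^{2\pi} dx\, |\hat u(\ell, x)|^2. \tag{5.3}$$
This is easily seen from Parseval’s identity:
$$\begin{aligned}
\int_{\mathbb{R}} dx\, |u(x)|^2 &= 2\pi\int_{\mathbb{R}} dk\, |\tilde u(k)|^2\\
&= 2\pi\sum_{n\in\mathbb{Z}}\int_{-1/2}^{1/2} d\ell\, |\tilde u(n+\ell)|^2\\
&= 2\pi\int_{-1/2}^{1/2} d\ell \sum_{n\in\mathbb{Z}} |\tilde u(n+\ell)|^2\\
&= 2\pi\int_{-1/2}^{1/2} d\ell \int_0^{2\pi} dx\, |\hat u(\ell, x)|^2.
\end{aligned}$$
The sum and the integral can be interchanged in (5.1) due to Fubini’s theorem when $u$ is in the Schwartz space $\mathcal{S}$. We shall use frequently the following fundamental properties (which follow at once from (5.2)):
$$\begin{aligned}
\hat u(\ell, x) &= e^{ix}\,\hat u(\ell + 1, x),\\
\hat u(\ell, x) &= \hat u(\ell, x + 2\pi),\\
\hat u(\ell, x) &= \overline{\hat u(-\ell, x)} \quad\text{for real-valued } u.
\end{aligned} \tag{5.4}$$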
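The definition (5.2), the reconstruction formula (5.1) and the properties (5.4) can be tested on a concrete function. The sketch below is an illustration under stated assumptions: it uses the Gaussian $u(x) = e^{-x^2/2}$, whose transform in the normalization of Sect. 2.1 is $\tilde u(k) = e^{-k^2/2}/\sqrt{2\pi}$, truncates the sum over $n$ at a hypothetical `n_max`, and evaluates the $\ell$-integral by the midpoint rule.

```python
import cmath
import math

def u(x):
    return math.exp(-x * x / 2.0)

def u_tilde(k):
    """Fourier transform of u with the 1/(2 pi) normalization of Sect. 2.1."""
    return math.exp(-k * k / 2.0) / math.sqrt(2.0 * math.pi)

def bloch(l, x, n_max=12):
    """Truncated version of (5.2): sum over |n| <= n_max of exp(i n x) u_tilde(n + l)."""
    return sum(cmath.exp(1j * n * x) * u_tilde(n + l) for n in range(-n_max, n_max + 1))

def reconstruct(x, steps=200):
    """Right-hand side of (5.1): integral over l in [-1/2, 1/2] by the midpoint rule."""
    h = 1.0 / steps
    total = 0.0 + 0.0j
    for m in range(steps):
        l = -0.5 + (m + 0.5) * h
        total += cmath.exp(1j * l * x) * bloch(l, x)
    return total * h

l0, x0 = 0.2, 0.7                              # hypothetical sample point
quasi_periodic_gap = abs(bloch(l0, x0) - cmath.exp(1j * x0) * bloch(l0 + 1.0, x0))
periodic_gap = abs(bloch(l0, x0) - bloch(l0, x0 + 2.0 * math.pi))
conjugation_gap = abs(bloch(l0, x0) - bloch(-l0, x0).conjugate())
reconstruction_gap = abs(reconstruct(x0) - u(x0))
```

All four gaps are at the level of rounding errors, since the truncation error of the sum is of order $e^{-n_{\max}^2/2}$ for this choice of $u$.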
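Finally $T_\zeta$ denotes the conjugate of translation, so that the Bloch transform of $T_\zeta f(x) = f(x - \zeta)$ is $\mathcal{T}T_\zeta f = T_\zeta\mathcal{T}f$. Multiplication in position space corresponds to a modified convolution operation for the Bloch-functions:
$$\widehat{(u\cdot v)}(\ell, x) = \int_{-1/2}^{1/2} d\ell'\, \hat u(\ell - \ell', x)\,\hat v(\ell', x) \equiv (\hat u * \hat v)(\ell, x).$$
This follows from (5.4) and the identities:
$$\begin{aligned}
\widehat{(u\cdot v)}(\ell, x) &= \sum_{m\in\mathbb{Z}}\int_{\mathbb{R}} dk\, \tilde u(\ell + m - k)\,\tilde v(k)\,e^{imx}\\
&= \int_{-1/2}^{1/2} d\ell' \sum_{m,n\in\mathbb{Z}} \tilde u(\ell + m - \ell' - n)\,\tilde v(\ell' + n)\,e^{i(m-n)x}\,e^{inx}.
\end{aligned}$$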
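Recalling the norm
$$\|f\|_{H^n_{s,\delta}} = \Big(\sum_{j=0}^{n}\sum_{m=0}^{s} \delta^{2(m+j)}\int dx\, |\partial_x^m f(x)|^2\,x^{2j}\Big)^{1/2}$$
we now introduce
$$\|\hat f\|_{\hat H^{s,\delta}_n} = \Big(\sum_{j=0}^{n}\sum_{m=0}^{s} \delta^{2(m+j)}\int_{-1/2}^{1/2} d\ell \int_0^{2\pi} dx\, |\partial_x^j\partial_\ell^m \hat f(\ell, x)|^2\Big)^{1/2}.$$
We get from Parseval’s equality
$$C^{-1}\|u\|_{H^n_{s,\delta}} \le \|\hat u\|_{\hat H^{s,\delta}_n} \le C\|u\|_{H^n_{s,\delta}},$$
for some $C$ independent of $\delta \in (0, 1)$. As before, in the following we mainly work with the spaces with $s = n = 2$ and with $s = 0$, $n = 2$. Similarly, in analogy to (2.8), we also have
$$\|\widehat{uv}\|_{\hat H^{2,\delta}_2} = \|\hat u * \hat v\|_{\hat H^{2,\delta}_2} \le C\|\hat u\|_{\hat H^{2,\delta}_2}\|\hat v\|_{\hat H^{2,\delta}_2}, \tag{5.5}$$
$$\|\widehat{uv}(\cdot - i\beta, \cdot)\|_{\hat H^{0,\delta}_2} = \|(\hat u * \hat v)(\cdot - i\beta, \cdot)\|_{\hat H^{0,\delta}_2} \le C\|\hat u\|_{\hat H^{0,\delta}_2}\|\hat v(\cdot - i\beta, \cdot)\|_{\hat H^{0,\delta}_2}, \tag{5.6}$$
or in a stronger version
$$\begin{aligned}
\|\widehat{uv}\|_{\hat H^{2,\delta}_2} &= \|\hat u * \hat v\|_{\hat H^{2,\delta}_2} \le C\|\hat u\|_{\hat H^{0,\delta}_2}\|\hat v\|_{\hat H^{2,\delta}_2},\\
\|\widehat{uv}(\cdot - i\beta, \cdot)\|_{\hat H^{2,\delta}_2} &= \|(\hat u * \hat v)(\cdot - i\beta, \cdot)\|_{\hat H^{2,\delta}_2} \le C\|\hat u\|_{\hat H^{2,\delta}_2}\|\hat v(\cdot - i\beta, \cdot)\|_{\hat H^{0,\delta}_2}.
\end{aligned} \tag{5.7}$$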
Finally, suppose $f$ is a function in $C^2_{b,\delta}$ (see (2.11) for the definition): Then,
$$\|\widehat{fv}\|_{\hat H^{0,\delta}_2} = \|\hat f * \hat v\|_{\hat H^{0,\delta}_2} \le C\|f\|_{C^2_{b,\delta}}\|\hat v\|_{\hat H^{0,\delta}_2}, \tag{5.8}$$
$$\|(\hat f * \hat v)(\cdot - i\beta, \cdot)\|_{\hat H^{0,\delta}_2} \le C\|f\|_{C^2_{b,\delta}}\|\hat v(\cdot - i\beta, \cdot)\|_{\hat H^{0,\delta}_2}. \tag{5.9}$$
Thus, apart from notational differences, we can work in the Bloch spaces with much the same bounds as in the spaces used for the model problem of the previous sections.

6. The Linearized Problem

We discuss here again the behavior of the linearized problem as in Sect. 3, but now for the Swift–Hohenberg equation. The discussion will again be split in an aspect behind the front and one ahead of the front. In Sect. 3, the behavior of the problem in the bulk behind the traveling front was diffusive by construction, and the only difficulty was to understand the rôle of the decay of $a$ to 0 (as $e^{-\beta|x|}$) as $x \to -\infty$. For the problem of the Swift–Hohenberg equation, the situation is similar, leading again to diffusive behavior. However, this observation is not obvious. Therefore, the first problem consists in showing the diffusive behavior. In order to obtain optimal results for the analysis ahead of the front, i.e., for the variable in the weighted representation, we use our approximate knowledge of the shape of the front.

6.1. The unweighted representation. In analogy with the simplified example, the linearized problem would be now
$$\partial_t v = Mv + M_i v, \tag{6.1}$$
where $M$ and $M_i$ have been defined in Eqs. (1.7) and (1.8). By the analysis for the model problem we expect that the term $M_i v$ will be irrelevant for the dynamics in the bulk with some exponential rate. Therefore, it will be considered in the sequel together with the non-linear terms. As a consequence, the linear equation dominating the behavior behind the front is given by
$$\partial_t v = Mv. \tag{6.2}$$
We recall those features of the proof of diffusive stability of [Schn96, Schn98] which are relevant to the study of (6.2). In order to do this, we need to localize the spectrum of $M$. Since this is well documented, we just summarize the results. As the linearized problem has periodic coefficients, the operator $\hat M = \mathcal{T}M\mathcal{T}^{-1}$ equals a direct integral $\int^{\oplus} d\ell\, M_\ell$, where each $M_\ell$ acts on the subspace with fixed quasi-momentum $\ell$ in $\hat H^{2,\delta}_2$. The eigenfunctions of $M_\ell$ are given by Bloch waves of the form $e^{i\ell x}w_{\ell,n}$ with $2\pi$-periodic $w_{\ell,n}$. The index $n \in \mathbb{N}$ counts various eigenvalues for fixed $\ell$. For each $\ell \in \mathbb{R}$ (or rather in the Brillouin zone $[-\frac{1}{2}, \frac{1}{2}]$) they are solutions of the eigenvalue equation
$$(M_\ell w_\ell)(x) \equiv -\big(1 + (i\ell + \partial_x)^2\big)^2 w_\ell(x) + \varepsilon^2 w_\ell(x) - 3U_*^2(x)\,w_\ell(x) = \mu_\ell\,w_\ell(x).$$
The spectrum takes the familiar form of a curve $\mu_1(\ell)$ with an expansion
$$\mu_1(\ell) = -c_1\ell^2 + O(\ell^3),$$
and $c_1 > 0$ and the remainder of the spectrum negative and bounded away from 0. The eigenfunction associated with $\mu_1(0)$ is $\partial_x U_*(x)$, reflecting the translation invariance of the original problem (1.1). There is an $\ell_0 > 0$ such that for fixed $\ell \in (-\ell_0, \ell_0)$ the eigenfunction $\varphi_\ell(x) = w_{\ell,1}(x)$ of the main branch $\mu_1(\ell)$ is well defined (and a continuation of $\partial_x U_*(x)$) as $\ell$ is varied away from 0. Corresponding to this we define the central projections $\hat P_c(\ell)$ by
$$\hat P_c(\ell)f = \langle\bar\varphi_\ell, f\rangle\,\varphi_\ell,$$
where $\langle\cdot,\cdot\rangle$ is the scalar product in $L^2([0, 2\pi])$ and $\bar\varphi_\ell$ the associated eigenfunction of the adjoint problem. We will need a smooth version of the projection in $\hat H^{2,\delta}_2$. We fix once and for all a non-negative smooth cutoff function $\chi$ with support in $[-\ell_0/2, \ell_0/2]$ which equals 1 on $[-\ell_0/4, \ell_0/4]$. Then we define the operators $\hat E_c$ and $\hat E_s$ by:
$$\hat E_c(\ell) = \chi(\ell)\,\hat P_c(\ell), \qquad \hat E_s(\ell) = 1(\ell) - \hat E_c(\ell).$$
It will be useful to define auxiliary “mode filters” $\hat E^h_c$ and $\hat E^h_s$ by
$$\hat E^h_c(\ell) = \chi(\ell/2)\,\hat P_c(\ell), \qquad \hat E^h_s(\ell) = 1(\ell) - \chi(2\ell)\,\hat P_c(\ell).$$
These definitions are made in such a way that
$$\hat E^h_c\hat E_c = \hat E_c, \qquad \hat E^h_s\hat E_s = \hat E_s,$$
which will be used to replace the (missing) projection property of $\hat E_c$ and $\hat E_s$.

We next extend the definitions (4.3) of Sect. 4 to the Bloch spaces. To avoid cumbersome notation, we shall use mostly the same symbols as in that section. Thus, with $\sigma < 1$ as before, we let now
$$(\mathcal{L}\hat u)(\kappa, x) = \hat u(\sigma\kappa, x).$$
Note that here, and elsewhere, the scaling does not act on the $x$ variable, only on the quasi-momentum $\kappa$. The novelty of renormalization in Bloch space here is that since the integration region over the $\ell$ variable is finite it will change with the scaling. Therefore, we introduce (for fixed $\delta > 0$),
$$\mathcal{K}_{\sigma,\rho} = \{\hat u \mid \|\hat u\|_{\mathcal{K}_{\sigma,\rho}} < \infty\}, \tag{6.3}$$
where
$$\|\hat u\|^2_{\mathcal{K}_{\sigma,\rho}} \equiv \sum_{n,n'=0}^{2}\int_{-1/(2\sigma)}^{1/(2\sigma)} d\ell \int_0^{2\pi} dx\, \delta^{2(n+n')}\,|\partial_\ell^n\partial_x^{n'}\hat u(\ell, x)|^2\,(1 + \ell^2)^{2\rho}.$$
For technical reasons we introduced a weight in the Bloch variable (. It turns out that an appropriate choice for the critical part is Kσc = Kσ,3/2 and for the stable part Kσs = Kσ,1 . Note that T , as defined in (5.2) is an isomorphism between the space H 22,δ and the space Kσ,ρ by (5.3) and the definition (6.3). As before we have fˆK n ≤ σ −5/2−2ρ fˆK L , σ ,ρ σ n−1 ,ρ
(6.4)
J.-P. Eckmann, G. Schneider
for $0 < \sigma \le 1$, where the additional factor $\sigma^{-2\rho}$ is due to the weight in the $\ell$-variable. Moreover, as before, we will not scale the weighted variable and so we fix $K_w = \hat H_2^{0,\delta}$.
Consider again the eigenfunctions $\varphi_\ell(x)$. The function $\hat v_t(\ell, x) = e^{\mu_1(\ell)t}\varphi_\ell(x)$ solves the equation
$$\partial_t \hat v_t(\ell, \cdot) = M_\ell(\hat v_t(\ell, \cdot)).$$
Because of the nature of the spectrum $\mu_1(\ell)$, this solution satisfies
$$\hat v_t(\ell t^{-1/2}, x) = e^{-c_1\ell^2}\, \hat v_0(0, x) + O(t^{-1/2}).$$
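The $O(t^{-1/2})$ rate in the rescaled variable can be checked numerically on a toy dispersion curve. The sketch below is only an illustration under stated assumptions: the cubic model $\mu_1(\ell) = -c_1\ell^2 + a\ell^3$ and the values of `c1`, `a` and `ell_max` are hypothetical and are not taken from the Swift–Hohenberg problem. It compares $e^{\mu_1(\ell t^{-1/2})t}$ with the Gaussian $e^{-c_1\ell^2}$ and prints the sup error multiplied by $\sqrt{t}$.

```python
import math

def mu1(ell, c1=1.0, a=0.5):
    # Hypothetical model branch: mu_1(ell) = -c1*ell^2 + a*ell^3 (assumption, not from the paper)
    return -c1 * ell**2 + a * ell**3

def rescaled_error(t, c1=1.0, a=0.5, ell_max=3.0, samples=601):
    # sup over ell in [-ell_max, ell_max] of |exp(mu1(ell/sqrt(t)) * t) - exp(-c1*ell^2)|
    worst = 0.0
    for j in range(samples):
        ell = -ell_max + 2.0 * ell_max * j / (samples - 1)
        evolved = math.exp(mu1(ell / math.sqrt(t), c1, a) * t)
        limit = math.exp(-c1 * ell**2)
        worst = max(worst, abs(evolved - limit))
    return worst

# The times are chosen so that ell/sqrt(t) stays inside the Brillouin zone [-1/2, 1/2].
errors = {t: rescaled_error(t) for t in (100.0, 400.0, 1600.0, 6400.0)}
for t, err in errors.items():
    print(f"t = {t:7.0f}   sup error = {err:.5f}   error * sqrt(t) = {err * math.sqrt(t):.4f}")
```

In this toy model the product of the error and $\sqrt{t}$ stays roughly constant, which is the behavior the expansion of $\mu_1$ suggests; the cubic correction contributes $a\ell^3 t^{-1/2}$ in the exponent after rescaling.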
Using this observation and the fact that the $\hat E_s$-part is exponentially damped, the result will be

Proposition 6.1. The solution $\hat V_t$ of the problem (6.2) with initial data $\hat V_0$ satisfies:
$$\bigl\|(\ell, x) \mapsto \hat V_t(\ell t^{-1/2}, x) - e^{-c_1\ell^2}\hat P_c(0)\hat V_0(0, x)\bigr\|_{K_{1/\sqrt t,1}} \le \frac{C}{t^{1/2}}\, \|\hat V_0\|_{\hat H_2^{2,\delta}}, \tag{6.5}$$
for a constant $C > 0$ and all $t \ge 1$. Moreover, there is a constant $\gamma_- > 0$ such that
$$\bigl\|(\ell, x) \mapsto \hat E_s \hat V_t(\ell t^{-1/2}, x)\bigr\|_{K_{1/\sqrt t,1}} \le C e^{-\gamma_- t}\, \|\hat V_0\|_{\hat H_2^{2,\delta}}, \tag{6.6}$$
for all $t \ge 1$.

6.2. The weighted representation. The weighted representation will be obtained by translating the effect of the transformation $W_{\beta,\hat c t}$ defined in (2.12) to the language of the Bloch waves. In accordance with our notational conventions, we set
$$\hat W_{\beta,\hat c t} = T W_{\beta,\hat c t} T^{-1},$$
and we get now, in analogy to (2.13),
$$(\hat W_{\beta,\hat c t}\hat f)(\ell, x) = e^{i\hat c(\ell + i\beta)t}\, \hat f(\ell + i\beta, x + \hat c t).$$
Equation (6.1), expressed in terms of $\hat W_{\beta,\hat c t}\hat v$, then takes the form
$$\partial_t \hat W_{\beta,\hat c t}\hat v = \hat M_{\beta,\hat c t}\hat W_{\beta,\hat c t}\hat v + \hat M_{i,\beta,\hat c t}\hat W_{\beta,\hat c t}\hat v, \tag{6.7}$$
with
$$(\hat M_{\beta,\hat c t}\hat f)(\ell, x) = (\hat L_{i\beta}\hat f)(\ell, x) - 3U_*^2(x + \hat c t)\hat f(\ell, x) + \hat c\,(i(\ell + i\beta) + \partial_x)\hat f(\ell, x),$$
$$(\hat M_{i,\beta,\hat c t}\hat f)(\ell, x) = -6U_*(x + \hat c t)\,(T_{-\hat c t}\hat K_{ct} * \hat f)(\ell, x) - 3\,(T_{-\hat c t}\hat K_{ct} * T_{-\hat c t}\hat K_{ct} * \hat f)(\ell, x).$$
Some explanations are in order: $\hat L_{i\beta}$ is the operator $-(1 + (\partial_x + i\ell - \beta)^2)^2 + \varepsilon^2$. The functions $U_*$ are just multiplications in the Bloch representation because they are periodic. More precisely, one has $\hat U_*(\ell, x) = U_*(x)\delta(\ell)$ in the sense of distributions. The functions $\hat K_{ct}$ are derived from $K_{ct}$ of Eq. (1.6) and are seen to be given by
$$\hat K_{ct}(\ell, x) \equiv (T K_{ct})(\ell, x) = e^{-i\ell c t}\, \hat F_c(\ell, x - ct, x) - U_*(x)\delta(\ell),$$
where the Bloch transform is taken in the first (non-periodic) variable of $F_c$.

In order to obtain optimal results for the analysis ahead of the front, i.e., for the variable $\hat w$ in the weighted representation, we recall some facts from the construction [CE86, EW91] of the fronts. For small $\varepsilon > 0$ the bifurcating solutions $u$ of the Swift–Hohenberg equation can be approximated by
$$\tilde\psi(x, t, \varepsilon) = \varepsilon A(\varepsilon x, \varepsilon^2 t)e^{ix} + \text{c.c.},$$
up to an error $O(\varepsilon^2)$, where $A$ satisfies the Ginzburg–Landau equation
$$\partial_T A = 4\partial_X^2 A + A - 3A|A|^2,$$
with $X \in \mathbb{R}$, $T \ge 0$ and $A(X, T) \in \mathbb{C}$. See [CE90b, vH91, KSM92, Schn94]. This equation possesses a real-valued front $A_f(X, T) = B(X - c_B T)$, where $\xi \mapsto B(\xi)$ satisfies the ordinary differential equation
$$4B'' + c_B B' + B - 3B|B|^2 = 0.$$
For $|c_B| \ge 4$ the real-valued fronts of this equation are monotonic. These fronts and the trivial solution $A = 0$ can be stabilized by introducing a weight $e^{\beta_A x}$ satisfying the stability condition
$$D_A(c_B, \beta_A) = 4\beta_A^2 - \beta_A c_B + 1 < 0,$$
see [BK92].

Remark. Since $B(\xi)$ converges at a faster rate to $1/\sqrt 3$ for $\xi \to -\infty$ than to 0 for $\xi \to \infty$ there will be no additional restriction such as (3.3) on $\beta_A$.

Remark. Our result will be optimal in the sense that each modulated front $F_c$ which corresponds to a front of the associated amplitude equation satisfying $D_A(c_B, \beta_A) < 0$ is stable. The connection between the quantities of the Ginzburg–Landau equation and the associated Swift–Hohenberg equation is as follows. We have $c = \varepsilon c_B + O(\varepsilon^2)$, and $\beta = \varepsilon\beta_A + O(\varepsilon^2)$.

In order to prove this remark we write the modulated front $F_c$ as defined in (1.2) as a sum of the Ginzburg–Landau part and a remainder
$$F_c(\xi, x) = 2\varepsilon B(\varepsilon\xi)\cos(x) + \varepsilon^2 F_r(\xi, x),$$
where $F_r$ satisfies
$$\sup_{y \in \mathbb{R}} \|F_r(\cdot + y, \cdot)\|_{C^2_{b,\delta}} \le C,$$
for a constant $C$ independent of $\varepsilon \in (0, 1)$ and $\delta \in (0, 1)$. Then we consider (6.7) which we write without decomposition as
$$\partial_t \hat W = \hat L_{i\beta}\hat W + \hat c\,(i(\ell + i\beta) + \partial_x)\hat W - 3\,T_{-\hat c t}(\tau_{ct}\hat F_c) * T_{-\hat c t}(\tau_{ct}\hat F_c) * \hat W. \tag{6.8}$$
In order to control these solutions we use that the linearized system (6.7) evolves in such a way that during times of order $O(1/\varepsilon^2)$ it can be approximated by the associated linearized Ginzburg–Landau equation
$$\partial_\tau A = 4(\partial_X - \beta_A)^2 A + c_B(\partial_X - \beta_A)A + A - B^2(2A + \bar A). \tag{6.9}$$

Theorem 6.2. For all $C_0 > 0$, and $\tau_1 > 0$ there exist positive constants $\varepsilon_0$, $C_1$, $C_2$, and $\tau_0$ such that for all $\varepsilon \in (0, \varepsilon_0]$ the following is true: For all initial conditions $\hat W_0$ with $\|\hat W_0\|_{\hat H_2^{0,\delta}} \le C_0\varepsilon$ there are a solution $\hat W_t$ of (6.8) and a solution $A_\tau$ of (6.9) with $\|A_0\|_{\tilde H_2^{0,\delta}} \le C_1$ such that the function $A_\tau$ approximates $\hat W_t$ in the sense that
$$\bigl\|\hat W_t - \varepsilon T\bigl(A_{\varepsilon^2 t - \tau_0}(x)e^{ix} + \text{c.c.}\bigr)\bigr\|_{\hat H_2^{0,\delta}} \le C_2\varepsilon^2,$$
for all $t \in [\tau_0/\varepsilon^2, (\tau_0 + \tau_1)/\varepsilon^2]$. Here $T$ again denotes the map of Eq. (5.2) from a function $f$ of $x$ to its Bloch representation $\hat f(\ell, x)$.

Proof. The proof of this is very similar to the case of the (non-linear) Swift–Hohenberg equation which was discussed in the literature [CE90b, vH91, KSM92, Schn94]. Our (linear) problem is in fact easier and the proof is left to the reader.

For the system (6.9) we have the estimate [BK92]
$$\|A_\tau\|_{H^2_{0,\delta}} \le C e^{D_A(c_B,\beta_A,\delta)\tau}\, \|A_0\|_{H^2_{0,\delta}},$$
with $\lim_{\delta\to 0} D_A(c_B, \beta_A, \delta) = D_A(c_B, \beta_A)$. The deviation of $D_A(c_B, \beta_A, \delta)$ from $D_A(c_B, \beta_A)$ comes again from the derivatives of $B$. As a consequence of this estimate and of Theorem 6.2 we conclude that
$$\|\hat W_t\|_{\hat H_2^{0,\delta}} \le C e^{D(c,\beta,\varepsilon,\delta)(t - t')}\, \|\hat W_{t'}\|_{\hat H_2^{0,\delta}}, \tag{6.10}$$
for a constant $C$ and a coefficient $D = D(c, \beta, \varepsilon, \delta)$. We can (and will) choose this constant $D$ in such a way that (for $\varepsilon \to 0$):
$$D(c, \beta, \varepsilon, \delta) = \varepsilon^2\bigl(D_A(c_B, \beta_A, \delta) + o(1)\bigr). \tag{6.11}$$
We define $D(c, \beta, \varepsilon) = \lim_{\delta\to 0} D(c, \beta, \varepsilon, \delta)$.

Remark. The choice of a sufficiently small $\delta > 0$ and $\varepsilon > 0$ will allow us to prove the stability of all fronts which are predicted to be stable by the associated amplitude equation since $\lim_{(\varepsilon,\delta)\to 0} \varepsilon^{-2} D(c, \beta, \varepsilon, \delta) = D_A(c_B, \beta_A)$.

In the following we consider a modulated front with velocity $c$ and a given (sufficiently small) bifurcation parameter $\varepsilon > 0$ for which there are a $\beta$ and a $\hat c \in (0, c)$ which satisfy:
$$D(\hat c, \beta, \varepsilon) = -2\gamma < 0. \tag{6.12}$$
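At leading order in $\varepsilon$, relation (6.11) reduces the stability condition to the sign of the quadratic $D_A(c_B, \beta_A) = 4\beta_A^2 - \beta_A c_B + 1$. The following sketch only does the elementary algebra for that quadratic. The function names are hypothetical helpers introduced for illustration, and the $o(1)$ and $\delta$ corrections in (6.11) are ignored.

```python
import math

def D_A(c_B, beta_A):
    # Amplitude-equation stability function D_A(c_B, beta_A) = 4*beta_A^2 - beta_A*c_B + 1
    return 4.0 * beta_A**2 - beta_A * c_B + 1.0

def stable_weight_interval(c_B):
    # Open interval of weights beta_A with D_A(c_B, beta_A) < 0, or None if it is empty.
    # The quadratic in beta_A has two distinct real roots exactly when c_B^2 > 16.
    disc = c_B**2 - 16.0
    if disc <= 0.0:
        return None
    root = math.sqrt(disc)
    return ((c_B - root) / 8.0, (c_B + root) / 8.0)

def best_weight(c_B):
    # Minimizer beta_A = c_B/8 of D_A and the minimal value 1 - c_B^2/16
    beta_A = c_B / 8.0
    return beta_A, D_A(c_B, beta_A)

if __name__ == "__main__":
    for c_B in (3.0, 4.0, 5.0, 8.0):
        print(c_B, stable_weight_interval(c_B), best_weight(c_B))
```

For a speed such as $c_B = 5$ the admissible weights form a non-empty interval, while at $c_B = 4$ the minimum of $D_A$ is exactly zero, so no weight makes $D_A$ strictly negative in this leading-order picture.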
Proposition 6.3. Suppose that the above stability condition (6.12) is satisfied. Then there is a $\delta \in (0, 1]$ such that: There is a $C < \infty$ for which the functions $\hat W_t = \hat W_{\beta,\hat c t}\hat V_t$ obey the bounds
$$\|\hat W_t\|_{\hat H_2^{0,\delta}} \le C e^{-3\gamma(t-s)/2}\, \|\hat W_s\|_{\hat H_2^{0,\delta}}. \tag{6.13}$$

As in the previous sections this result will have to be improved for the non-linear problem. Therefore, we skip at this point the proof, and will only deal with the improved version later. Thus, the linear problems (6.2) and (6.7) are the analogs of (3.9) and (3.10) and can be studied pretty much as in the case of the simplified problem, yielding inequalities similar to (3.6) and (3.7).

7. The Renormalization Process for the Full Problem

We assume throughout this section that the stability condition (6.12) is satisfied. We prove here our main

Theorem 7.1. There are a $\delta > 0$ and positive constants $R$ and $C$ such that the following holds: Assume $\|v_0\|_{H^2_{2,\delta}} + \|M_\beta v_0\|_{H^2_{2,\delta}} \le R$ and denote by $v_t$ the solution of (1.4) with initial condition $v_0$. Let $\tilde\psi(\ell) = \exp(-c_1\ell^2)$. There is a constant $A_* = A_*(v_0)$ such that the rescaled solution $\hat v_t^r(\ell, x) = \hat v_t(\ell t^{-1/2}, x)$ satisfies
$$\|\hat v_t^r - A_*\tilde\psi\,\partial_x U_*\|_{K_{1/\sqrt t,1}} \le \frac{CR}{(t + 1)^{1/4}}. \tag{7.1}$$
Furthermore,
$$\|\hat w_t\|_{K_w} = \|\hat W_{\beta,\hat c t}\hat v_t\|_{K_w} \le CRe^{-\gamma t}. \tag{7.2}$$

Remarks.
• The inequality (7.1) really says that the difference
$$\hat v_t(\ell t^{-1/2}, x) - A_* e^{-c_1\ell^2}\partial_x U_*(x)$$
is small, where $U_*$ is the periodic solution (see Eq. (1.3)) of the Swift–Hohenberg equation. Expressed in the laboratory frame, this means that an initial perturbation $v_0(x)$ will go to 0 like
$$v_t(x) \approx A_*(v_0)\sqrt{\frac{\pi}{c_1 t}}\, \exp\Bigl(\frac{-x^2}{4c_1 t}\Bigr)\, \partial_x U_*(x),$$
when $t \to \infty$, uniformly for $x \in \mathbb{R}$. See [Schn96]. In particular, this means that near the extrema of $U_*$ the convergence is faster than $O(t^{-1/2})$ since at those points $\partial_x U_*$ vanishes.
• The inequality (7.2) gives some more precise bound on the growth of a perturbation ahead of the front, because it says that this perturbation decays exponentially in the weighted norm. More explicitly, we have at least a bound
$$|v_t(x + ct)| \le C e^{\beta x - \gamma' t},$$
with $\gamma'$ slightly smaller than $\gamma$.
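• The decay $(t + 1)^{-1/4}$ in (7.1) can be improved easily to $(t + 1)^{-1/2+\vartheta}$ for any $\vartheta > 0$. We have chosen $\vartheta = 1/4$ to keep the notation at a reasonable level.

Proof. As we explained before, the proof is similar to the one in Sect. 3 except that now the function behind the front is split into a diffusive part $\hat v_c$ and into an exponentially damped part $\hat v_s$, and correspondingly there will be a few more equations. In Bloch space the initial conditions satisfy $\|\hat v_0\|_{\hat H_2^{2,\delta}} + \|\hat v_0(\cdot + i\beta, \cdot)\|_{\hat H_2^{0,\delta}} \le R$. The system for the variables $\hat v_c$ and $\hat v_s$ with initial conditions $\hat v_c|_{t=0} = \hat E_c\hat v|_{t=0}$, $\hat v_s|_{t=0} = \hat E_s\hat v|_{t=0}$, and for the variable $\hat w = \hat W_{\beta,\hat c t}\hat v$ with initial conditions $\hat w|_{t=0} = \hat W_{\beta,0}\hat v|_{t=0}$ is given in Bloch space by
$$\partial_t \hat v_c = \hat M\hat v_c + \hat E_c\hat H(\hat v_c, \hat v_s) + \hat E_c\hat N(\hat v_c, \hat v_s),$$
$$\partial_t \hat v_s = \hat M\hat v_s + \hat E_s\hat H(\hat v_c, \hat v_s) + \hat E_s\hat N(\hat v_c, \hat v_s), \tag{7.3}$$
$$\partial_t \hat w = \hat M_w\hat w + \hat N_w(\hat v_c, \hat v_s, \hat w),$$
where, see (1.8) and (6.7), with $\hat v = \hat v_c + \hat v_s$,
$$\hat M = TMT^{-1},$$
$$\hat H(\hat v_c, \hat v_s) = TM_iT^{-1}\hat v + TN_i(T^{-1}\hat v),$$
$$\hat N(\hat v_c, \hat v_s) = TN(T^{-1}\hat v),$$
$$\hat M_w = \hat M_{\beta,\hat c t} + \hat M_{i,\beta,\hat c t},$$
$$\hat N_w(\hat v_c, \hat v_s, \hat w) = -3\,T_{-\hat c t}U_* \cdot T_{-\hat c t}\hat v * \hat w - 3\,T_{-\hat c t}\hat K_{ct} * T_{-\hat c t}\hat v * \hat w - T_{-\hat c t}\hat v * T_{-\hat c t}\hat v * \hat w.$$
It is useful to modify this system by introducing the coordinates $(\hat u_c, \hat u_s)$ by
$$\hat u_c = \hat v_c, \qquad \hat u_s = -\hat M^{-1}\hat E_s(3U_* \cdot \hat v_c * \hat v_c) + \hat v_s. \tag{7.4}$$
This coordinate transform takes care of the fact that asymptotically $\hat v_s$ can be expressed by $\hat v_c$. Under the scaling used below the new variable $\hat u_s$ converges to zero, while the old variable $\hat v_s$ converges to a nontrivial expression. Under this transform (7.3) becomes
$$\partial_t \hat u_c = \hat M\hat u_c + \hat N_{c,i}(\hat u_c, \hat u_s) + \hat N_c(\hat u_c, \hat u_s),$$
$$\partial_t \hat u_s = \hat M\hat u_s + \hat N_{s,i}(\hat u_c, \hat u_s) + \hat N_s(\hat u_c, \hat u_s),$$
$$\partial_t \hat w = \hat M_w\hat w + \hat N_w(\hat u_c, \hat u_s, \hat w),$$
where
$$\hat N_{c,i}(\hat u_c, \hat u_s) = \hat E_c\hat H\bigl(\hat u_c,\ \hat M^{-1}\hat E_s(3U_* \cdot \hat u_c * \hat u_c) + \hat u_s\bigr),$$
$$\hat N_{s,i}(\hat u_c, \hat u_s) = \hat E_s\hat H\bigl(\hat u_c,\ \hat M^{-1}\hat E_s(3U_* \cdot \hat u_c * \hat u_c) + \hat u_s\bigr),$$
$$\hat N_c(\hat u_c, \hat u_s) = \hat E_c\hat N\bigl(\hat u_c,\ \hat M^{-1}\hat E_s(3U_* \cdot \hat u_c * \hat u_c) + \hat u_s\bigr), \tag{7.5}$$
$$\hat N_s(\hat u_c, \hat u_s) = \hat E_s\hat N\bigl(\hat u_c,\ \hat M^{-1}\hat E_s(3U_* \cdot \hat u_c * \hat u_c) + \hat u_s\bigr) - \partial_t\bigl[\hat M^{-1}\hat E_s(3U_* \cdot \hat u_c * \hat u_c)\bigr],$$
$$\hat N_w(\hat u_c, \hat u_s, \hat w) = \hat N_w(\hat v_c, \hat v_s, \hat w).$$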
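We follow the lines of Sect. 4 and start with the renormalization process by introducing the scalings
$$\hat v_{c,n}(\kappa, x, \tau) = \hat u_c(\sigma^n\kappa, x, \sigma^{-2n}\tau),$$
$$\hat v_{s,n}(\kappa, x, \tau) = \sigma^{-3n/2}\, \hat u_s(\sigma^n\kappa, x, \sigma^{-2n}\tau),$$
$$\hat w_n(\kappa, x, \tau) = e^{\gamma\sigma^{-2n}\tau}\, \hat w(\kappa, x, \sigma^{-2n}\tau).$$
(The 3rd argument is the time, and the function $\hat w$ has here another meaning than in Sect. 4.) Note again that only the Bloch variable is rescaled, but $x$ is left untouched. As before the Bloch variable is not scaled in the weighted representation $\hat w$. Under these scalings the functions $\hat v_{s,n}$ and $\hat w_n$ still converge to 0 as $n \to \infty$. The variation of constant formula yields now
$$\hat v_{c,n}(\kappa, x, \tau) = e^{\sigma^{-2n}\hat M_{c,n}(\tau - \sigma^2)}\, \hat v_{c,n-1}(\sigma\kappa, x, 1) + \sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(\tau - \tau')}\, \hat N_{c,i,n}(\hat v_{c,n}, \hat v_{s,n})(\kappa, x, \tau') + \sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(\tau - \tau')}\, \hat N_{c,n}(\hat v_{c,n}, \hat v_{s,n})(\kappa, x, \tau'), \tag{7.6}$$
$$\hat v_{s,n}(\kappa, x, \tau) = e^{\sigma^{-2n}\hat M_{s,n}(\tau - \sigma^2)}\, \sigma^{-3/2}\, \hat v_{s,n-1}(\sigma\kappa, x, 1) + \sigma^{-7n/2}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{s,n}(\tau - \tau')}\, \hat N_{s,i,n}(\hat v_{c,n}, \hat v_{s,n})(\kappa, x, \tau') + \sigma^{-7n/2}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{s,n}(\tau - \tau')}\, \hat N_{s,n}(\hat v_{c,n}, \hat v_{s,n})(\kappa, x, \tau'), \tag{7.7}$$
$$\hat w_n(\kappa, x, \tau) = S_n(\tau, \sigma^2)\, \hat w_{n-1}(\kappa, x, 1) + \sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, S_n(\tau, \tau')\, \hat N_{w,n}(\hat v_{c,n}, \hat v_{s,n}, \hat w_n)(\kappa, x, \tau'), \tag{7.8}$$
with
$$\hat M_{c,n} = \hat{\mathcal L}^n \hat E_c^h \hat M \hat{\mathcal L}^{-n},$$
$$\hat M_{s,n} = \hat{\mathcal L}^n \hat E_s^h \hat M \hat{\mathcal L}^{-n},$$
$$\hat N_{c,i,n}(\hat v_{c,n}, \hat v_{s,n}) = \hat{\mathcal L}^n \hat N_{c,i}(\hat{\mathcal L}^{-n}\hat v_{c,n},\ \sigma^{3n/2}\hat{\mathcal L}^{-n}\hat v_{s,n}),$$
$$\hat N_{s,i,n}(\hat v_{c,n}, \hat v_{s,n}) = \hat{\mathcal L}^n \hat N_{s,i}(\hat{\mathcal L}^{-n}\hat v_{c,n},\ \sigma^{3n/2}\hat{\mathcal L}^{-n}\hat v_{s,n}),$$
$$\hat N_{c,n}(\hat v_{c,n}, \hat v_{s,n}) = \hat{\mathcal L}^n \hat N_c(\hat{\mathcal L}^{-n}\hat v_{c,n},\ \sigma^{3n/2}\hat{\mathcal L}^{-n}\hat v_{s,n}),$$
$$\hat N_{s,n}(\hat v_{c,n}, \hat v_{s,n}) = \hat{\mathcal L}^n \hat N_s(\hat{\mathcal L}^{-n}\hat v_{c,n},\ \sigma^{3n/2}\hat{\mathcal L}^{-n}\hat v_{s,n}),$$
$$\hat N_{w,n}(\hat v_{c,n}, \hat v_{s,n}, \hat w_n) = \hat N_w(\hat{\mathcal L}^{-n}\hat v_{c,n},\ \sigma^{3n/2}\hat{\mathcal L}^{-n}\hat v_{s,n},\ \hat w_n),$$
where we recall the definition
$$(\hat{\mathcal L}\hat f)(\ell, x) \equiv \hat f(\sigma\ell, x),$$
and where $S_n(\tau, \tau')$ is now the evolution operator associated with the equation
$$\partial_\tau \hat f_\tau = \sigma^{-2n}(\hat M_w + \gamma)\hat f_\tau. \tag{7.9}$$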
Again, the exponential scaling of $\hat w_n$ with respect to time does not affect the definition of $\hat N_w$ due to the fact that $\hat w_n$ only appears linearly. All this is quite analogous to the developments in Eqs. (4.10) and (4.11).
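7.1. The scaled linear evolution operators. First we bound the linear evolution operators generated by $\hat M_{c,n}$ and $\hat M_{s,n}$.

Lemma 7.2. For all $\rho_1 \ge \rho_2 \ge 0$ there exist $C_{\rho_1,\rho_2} > 0$ and $\gamma_- > 0$ such that for $1 \ge \tau > \tau' \ge \sigma^2$ and all $\sigma \in (0, 1)$ one has
$$\bigl\|e^{\sigma^{-2n}\hat M_{c,n}(\tau - \tau')}\, \hat{\mathcal L}^n \hat E_c^h \hat{\mathcal L}^{-n}\hat g\bigr\|_{K_{\sigma^n,\rho_1}} \le C(\tau - \tau')^{\rho_2 - \rho_1}\, \|\hat g\|_{K_{\sigma^n,\rho_2}},$$
$$\bigl\|e^{\sigma^{-2n}\hat M_{s,n}(\tau - \tau')}\, \hat{\mathcal L}^n \hat E_s^h \hat{\mathcal L}^{-n}\hat g\bigr\|_{K_{\sigma^n,\rho_1}} \le C e^{-\gamma_-\sigma^{-2n}(\tau - \tau')}(\tau - \tau')^{\rho_2 - \rho_1}\, \|\hat g\|_{K_{\sigma^n,\rho_2}},$$
for all $n \in \mathbb{N}$.

Proof. The first estimate follows directly from the fact that
$$\hat M_{c,n}(\ell)f = \mu_1(\ell)\hat P_c(\ell)f = -c_1\ell^2\hat P_c(\ell)f + O(\ell^3).$$
The second estimate follows from the fact that the real part of the spectrum of $\hat M_{s,n}(\ell)$ as a function of $\ell$ can be bounded from above by a strictly negative parabola.

Next, we bound $S_n(\tau, \tau')$ as defined through (7.9) and state the analog of Lemma 4.2.

Lemma 7.3. Suppose that the stability condition (6.12) is satisfied. Then there is a $\delta \in (0, 1]$ and a $C > 0$ such that for $1 > \tau > \tau' \ge 0$ and all $\sigma \in (0, 1]$ one has
$$\|S_n(\tau, \tau')\hat w\|_{K_w} \le C e^{-\gamma\sigma^{-2n}(\tau - \tau')/2}\, \|\hat w\|_{K_w}, \tag{7.10}$$
for all $n \in \mathbb{N}$.

The proof of Lemma 7.3 follows closely the one of Lemma 4.2 in Sect. 4.1. Therefore, it will be omitted here.

7.2. The scaled non-linear terms. Next we estimate the scaled non-linear terms in $\hat N_{c,n}$, $\hat N_{s,n}$, and $\hat N_{w,n}$. In order to estimate the time derivatives on the right hand side of (7.5) coming from the coordinate transform (7.4) we need to choose $\hat v_{c,n}$ in the better space $K^c_{\sigma^n} = K_{\sigma^n,3/2}$ instead of only being in $K_{\sigma^n,1}$.

Lemma 7.4. Suppose $\max\{\|\hat v_{c,n}\|_{K^c_{\sigma^n}}, \|\hat v_{s,n}\|_{K^s_{\sigma^n}}, \|\hat w_n\|_{K_w}\} \le 1$. Then there exists a $C_1 > 0$ such that for all $\sigma \in (0, 1]$ one has
$$\|\hat N_{c,n}\|_{K_{\sigma^n,3/4}} \le C_1\sigma^{5n/2}\bigl(\|\hat v_{c,n}\|_{K^c_{\sigma^n}} + \|\hat v_{s,n}\|_{K^s_{\sigma^n}}\bigr)^2,$$
$$\|\hat N_{s,n}\|_{K^s_{\sigma^n}} \le C_1\sigma^{2n}\bigl(\|\hat v_{c,n}\|_{K^c_{\sigma^n}} + \|\hat v_{s,n}\|_{K^s_{\sigma^n}}\bigr)^2,$$
$$\|\hat N_{w,n}\|_{K_w} \le C_1\sigma^{n/2}\bigl(\|\hat v_{c,n}\|_{K^c_{\sigma^n}} + \|\hat v_{s,n}\|_{K^s_{\sigma^n}}\bigr)\, \|\hat w_n\|_{K_w}.$$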
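Proof. Throughout the proof we use
$$\hat{\mathcal L}(\hat f * \hat g)(\kappa) = \sigma\,\bigl((\hat{\mathcal L}\hat f) * (\hat{\mathcal L}\hat g)\bigr)(\kappa). \tag{7.11}$$

i) We start with the estimates for $\hat N_{w,n}$. The most dangerous term in $\hat N_{w,n}$ coming from $\hat N_w(\hat v_c, \hat v_s, \hat w)$ is
$$3\,T_{-\hat c\sigma^{-2n}\tau}\hat K_{ct} * T_{-\hat c\sigma^{-2n}\tau}(\hat{\mathcal L}^{-n}\hat v_{c,n}) * \hat w_n.$$
From $\|\hat{\mathcal L}^{-n}\hat v_{c,n}\|_{\hat H_2^{0,\delta}} \le C\sigma^{n/2}\|\hat v_{c,n}\|_{K^c_{\sigma^n}}$ and (5.6) we obtain
$$\bigl\|T_{-\hat c\sigma^{-2n}\tau}\hat K_{ct} * T_{-\hat c\sigma^{-2n}\tau}(\hat{\mathcal L}^{-n}\hat v_{c,n}) * \hat w_n\bigr\|_{\hat H_2^{0,\delta}} \le \|T_{-\hat c\sigma^{-2n}\tau}K_{ct}\|_{C^2_{b,\delta}}\, \bigl\|\hat{\mathcal L}^{-n}(T_{-\hat c\sigma^{-2n}\tau}\hat v_{c,n}) * \hat w_n\bigr\|_{\hat H_2^{0,\delta}} \le C\sigma^{n/2}\, \|\hat v_{c,n}\|_{K^c_{\sigma^n}}\, \|\hat w_n\|_{K_w}.$$

ii) We use (7.11) to obtain the estimates for $\hat N_{s,n}$. The only difficulty stems from the term
$$\partial_t\bigl[\hat M^{-1}\hat E_s(3U_* \cdot \hat u_c * \hat u_c)\bigr] = \hat M^{-1}\hat E_s(6U_* \cdot \hat u_c * \partial_t\hat u_c)$$
coming from the change of coordinates (7.4). This can be estimated in the required way by expressing $\partial_t\hat u_c$ by the right-hand side of (7.5), by using then the points ii.1)–ii.3) and the fact that we already have a factor $\sigma^n$ from $\hat u_c * \partial_t\hat u_c$, using again (7.11).

ii.1) The first bound for the terms on the right-hand side of (7.5) is
$$\|\hat M_{c,n}\hat v_{c,n}\|_{K_{\sigma^n,1}} \le C\sigma^n\, \|\hat v_{c,n}\|_{K_{\sigma^n,3/2}},$$
which follows from the form of $\mu_1(\ell)$ by using the following lemma.

Lemma 7.5. Let $\mu \in C^2_{per}([-1/2, 1/2), C^2((0, 2\pi), \mathbb{C}))$ with $\|\mu(\ell, \cdot)\|_{C^2((0,2\pi),\mathbb{C})} \le C|\ell|^{2(\rho_1 - \rho_2)}$ for a $\rho_1 \ge \rho_2 \ge 0$. Then, there exists a $C > 0$ such that for all $\sigma \in (0, 1]$ we have
$$\|(\hat{\mathcal L}_\sigma\mu)\hat u\|_{K_{\sigma,\rho_2}} \le C\sigma^{2(\rho_1 - \rho_2)}\, \|\mu\|_{C^2_{per}([-1/2,1/2),C^2((0,2\pi),\mathbb{C}))}\, \|\hat u\|_{K_{\sigma,\rho_1}}. \tag{7.12}$$

Proof. This follows since
$$\sup_{\ell \in \mathbb{R}} \left|\frac{\ell^{2(\rho_1 - \rho_2)}\sigma^{2(\rho_1 - \rho_2)}}{(1 + \ell^2)^{(\rho_1 - \rho_2)}}\right| < C\sigma^{2(\rho_1 - \rho_2)}.$$

ii.2) By Lemma 7.8 below the term $\hat N_{c,i,n}$ is exponentially small in terms of $\sigma$.

ii.3) From (7.11) we easily obtain
$$\|\hat N_{c,n}\|_{K_{\sigma^n,1}} \le \sigma^n\bigl(\|\hat v_{c,n}\|_{K^c_{\sigma^n}} + \|\hat v_{s,n}\|_{K^s_{\sigma^n}}\bigr)^2.$$

iii) From [Schn96] we recall the estimates for the $\hat N_{c,n}$ part. Note that $\hat N_{c,n}$ can be written as
$$\hat N_{c,n} = \hat s_1 + \hat s_2 + \hat N_{c,n,r},$$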
where
$$\hat s_1 = -3\sigma^n\, \hat{\mathcal L}^n \hat E_c \hat{\mathcal L}^{-n}(U_* \cdot \hat v_{c,n} * \hat v_{c,n}),$$
$$\hat s_2 = -6\sigma^{2n}\, \hat{\mathcal L}^n \hat E_c \hat{\mathcal L}^{-n}\bigl(U_* \cdot \hat v_{c,n} * (\hat M_{s,n})^{-1}(3U_* \cdot \hat v_{c,n} * \hat v_{c,n})\bigr) - \sigma^{2n}\, \hat{\mathcal L}^n \hat E_c \hat{\mathcal L}^{-n}(\hat v_{c,n} * \hat v_{c,n} * \hat v_{c,n}),$$
$$\|\hat N_{c,n,r}\|_{K_{\sigma^n,1}} = O\bigl(\sigma^{5n/2}(\|\hat v_{c,n}\|_{K^c_{\sigma^n}} + \|\hat v_{s,n}\|_{K^s_{\sigma^n}})^2\bigr).$$
The estimate for $\hat N_{c,n,r}$ follows easily by applying again (7.11).

It remains to estimate $\hat s_1$ and $\hat s_2$. These estimates have been obtained in [Schn96]. For completeness we recall some of the arguments. Introducing $a_n(\ell) \in \mathbb{C}$ by $\hat v_{c,n}(\ell, x) = a_n(\ell)\varphi_{\sigma^n\ell}(x)$ shows that the terms $\hat s_1$ and $\hat s_2$ are of the form
$$\hat s_1(\ell, x) = \sigma^n \int dm\, K_1(\sigma^n\ell, \sigma^n(\ell - m), \sigma^n m)\, a_n(\ell - m)\, a_n(m)\, \varphi_{\sigma^n\ell}(x),$$
$$\hat s_2(\ell, x) = \sigma^{2n} \int dm \int dk\, K_2(\sigma^n\ell, \sigma^n(\ell - m), \sigma^n(m - k), \sigma^n k)\, a_n(\ell - m)\, a_n(m - k)\, a_n(k)\, \varphi_{\sigma^n\ell}(x),$$
with $K_j : \mathbb{R}^{2+j} \to \mathbb{C}$ the kernel of an integral operator. The detailed expression for $K_1$ is given in (7.13) below. The case $n = m = k = \ell = 0$ corresponds to the spatially periodic case. In the spatially periodic case there exists a center manifold $G = \{u = U_{0,a} \mid a \in \mathbb{R}\}$, consisting of the spatially periodic fixed points related to each other by the translation invariance of the original Swift–Hohenberg equation. By a formal calculation it turns out that the flow on the one-dimensional center manifold $G$ is determined by the ordinary differential equation
$$\frac{d}{dt}a = 0 \cdot a + K_1(0, 0, 0)a^2 + K_2(0, 0, 0, 0)a^3 + O(a^4).$$
Since the center manifold consists of fixed points the flow $a = a(t)$ is trivial, i.e., $\frac{d}{dt}a = 0$. Consequently, we obtain $K_1(0, 0, 0) = K_2(0, 0, 0, 0) = 0$. Therefore,
$$|K_2(\ell, \ell - m, m - k, k)| \le C(|\ell| + |\ell - m| + |m - k| + |k|),$$
and so (7.11) and (7.12) imply
$$\|\hat s_2\|_{K_{\sigma^n,1}} \le C\sigma^{3n}\bigl(\|\hat v_{c,n}\|_{K^c_{\sigma^n}}\bigr)^2.$$
Interestingly it turned out that the first derivatives of $K_1$ vanish as well. Since the eigenvalue problem $M_\ell\varphi_\ell = \mu_1(\ell)\varphi_\ell$ is self-adjoint, the projection $\hat P_c(\ell)$ is orthogonal in $L^2(0, 2\pi)$ and is given by $\hat P_c(\ell)u = \bigl(\int \bar\varphi_\ell(x)u(\ell, x)\,dx\bigr)\varphi_\ell(\cdot)$. Thus
$$K_1(\ell, \ell - m, m) = 3\int dx\, \bar\varphi_\ell(x)\varphi_{\ell - m}(x)\varphi_m(x)U(x). \tag{7.13}$$
Expanding $\varphi_\ell(x) = \partial_x U(x) + i\ell g(x) + O(\ell^2)$, with $g(x) \in \mathbb{R}$, yields
$$K_1(\ell, \ell - m, m) = 3\int dx\, \Bigl[(\partial_x U(x))^3 U(x) - i\ell g(x)(\partial_x U(x))^2 U(x) + i(\ell - m)g(x)(\partial_x U(x))^2 U(x) + (\partial_x U(x))^2\, i m g(x) U(x) + O(\ell^2 + (\ell - m)^2 + m^2)\Bigr].$$
Note that $U$ is an even function, so $\partial_x U$ is odd, which proves again $K_1(0, 0, 0) = 0$. Since, in addition, the first order terms cancel we have
$$|K_1(\ell, \ell - m, m)| \le C|\ell^2 + (\ell - m)^2 + m^2|,$$
and so from (7.11) and (7.12),
$$\|\hat s_1\|_{K_{\sigma^n,3/4}} \le C\sigma^{5n/2}\bigl(\|\hat v_{c,n}\|_{K^c_{\sigma^n}}\bigr)^2.$$
Summing the estimates shows the assertion.
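The two cancellations used for $K_1$ (the parity argument at the origin and the vanishing of the first-order terms) can be illustrated numerically. The sketch below is a toy check, not the actual Swift–Hohenberg data: the even profile `U`, the real function `g`, and the truncation of $\varphi_\ell$ to first order are all hypothetical choices made only for illustration.

```python
import math

N = 256  # number of quadrature points on [0, 2*pi); exact for the trigonometric polynomials below

def U(x):
    # Hypothetical even 2*pi-periodic profile standing in for the periodic solution
    return math.cos(x) + 0.1 * math.cos(3.0 * x)

def dU(x):
    # Derivative of U; odd because U is even
    return -math.sin(x) - 0.3 * math.sin(3.0 * x)

def g(x):
    # Hypothetical real first-order correction in phi_ell = dU + i*ell*g + O(ell^2)
    return 1.0 + 0.5 * math.sin(2.0 * x)

def phi(ell, x):
    # Eigenfunction truncated at first order in ell (assumption for this sketch)
    return complex(dU(x), ell * g(x))

def K1(ell, m):
    # K1(ell, ell - m, m) = 3 * integral of conj(phi_ell) * phi_{ell-m} * phi_m * U over one period
    total = 0j
    for j in range(N):
        x = 2.0 * math.pi * j / N
        total += phi(ell, x).conjugate() * phi(ell - m, x) * phi(m, x) * U(x)
    return 3.0 * total * (2.0 * math.pi / N)

if __name__ == "__main__":
    print("K1 at the origin:", abs(K1(0.0, 0.0)))
    for s in (0.1, 0.01, 0.001):
        print(f"s = {s}: |K1| / s^2 = {abs(K1(s, 0.3 * s)) / s**2:.4f}")
```

In this toy setting $|K_1|/s^2$ stays bounded as the arguments are scaled by $s \to 0$, which is the quadratic vanishing that yields the extra powers of $\sigma^n$ in the bound for $\hat s_1$.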
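7.3. Bounds on the integrals. Here we estimate the integrals in the variation of constant formula in terms of the following quantities.

Definition 7.6. For all $n$, we define
$$R^u_{cs,n} = \sup_{\tau \in [\sigma^2,1]} \|\hat v_{c,n}(\tau)\|_{K^c_{\sigma^n}} + \sup_{\tau \in [\sigma^2,1]} \|\hat v_{s,n}(\tau)\|_{K^s_{\sigma^n}},$$
and
$$R^w_n = \sup_{\tau \in [\sigma^2,1]} \|\hat w_n(\tau)\|_{K_w}.$$

In the following two lemmas we estimate the integrals appearing in (7.6)–(7.8).

Lemma 7.7. Assume $R^u_{cs,n} + R^w_n \le 1$. Then for all $1 \ge \tau \ge \sigma^2$ and all $\sigma \in (0, 1]$ one has
$$\Bigl\|\sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(\tau - \tau')}\, \hat N_{c,n}(\hat v_{c,n}, \hat v_{s,n})(\cdot, \cdot, \tau')\Bigr\|_{K^c_{\sigma^n}} \le C\sigma^{n/2}(R^u_{cs,n})^2,$$
$$\Bigl\|\sigma^{-7n/2}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{s,n}(\tau - \tau')}\, \hat N_{s,n}(\hat v_{c,n}, \hat v_{s,n})(\cdot, \cdot, \tau')\Bigr\|_{K^s_{\sigma^n}} \le C\sigma^{n/2}(R^u_{cs,n})^2,$$
$$\Bigl\|\sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, S_n(\tau, \tau')\, \hat N_{w,n}(\hat v_{c,n}, \hat v_{s,n}, \hat w_n)(\cdot, \cdot, \tau')\Bigr\|_{K_w} \le C\sigma^{n/2}R^u_{cs,n}R^w_n.$$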
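Proof. We first use Lemma 7.2 and Lemma 7.4. For the second integral in (7.6) we get a bound
$$\sup_{\tau \in [\sigma^2,1]} \Bigl\|\sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(\tau - \tau')}\, \hat N_{c,n}(\hat v_{c,n}, \hat v_{s,n})(\cdot, \cdot, \tau')\Bigr\|_{K^c_{\sigma^n}} \le C\sigma^{-2n}(R^u_{cs,n})^2\sigma^{5n/2}\int_{\sigma^2}^{1} d\tau'\, (1 - \tau')^{-3/4} \le C\sigma^{n/2}(R^u_{cs,n})^2.$$
For the second integral in (7.7) we find similarly
$$\sup_{\tau \in [\sigma^2,1]} \Bigl\|\sigma^{-7n/2}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{s,n}(\tau - \tau')}\, \hat N_{s,n}(\hat v_{c,n}, \hat v_{s,n})(\cdot, \cdot, \tau')\Bigr\|_{K^s_{\sigma^n}} \le C(R^u_{cs,n})^2\sigma^{-3n/2}\int_{\sigma^2}^{1} d\tau'\, e^{-C\sigma^{-2n}(1 - \tau')} \le C\sigma^{n/2}(R^u_{cs,n})^2.$$
For the integral in (7.8) we find, using now Lemma 7.3 and Lemma 7.4, a bound
$$C\sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{-\gamma\sigma^{-2n}(\tau - \tau')/2}\bigl(\sigma^{n/2}R^u_{cs,n}R^w_n\bigr) \le C\sigma^{n/2}R^u_{cs,n}R^w_n.$$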
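Lemma 7.8. Assume $R^u_{cs,n} + R^w_n \le 1$. Then for all $1 \ge \tau \ge \sigma^2$ and all $\sigma \in (0, 1)$ one has
$$\Bigl\|\sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(\tau - \tau')}\, \hat N_{c,i,n}(\hat v_{c,n}, \hat v_{s,n})(\cdot, \cdot, \tau')\Bigr\|_{K^c_{\sigma^n}} \le C e^{-(\beta(c - \hat c) + \gamma)\sigma^{-n}}R^w_n,$$
$$\Bigl\|\sigma^{-7n/2}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{s,n}(\tau - \tau')}\, \hat N_{s,i,n}(\hat v_{c,n}, \hat v_{s,n})(\cdot, \cdot, \tau')\Bigr\|_{K^s_{\sigma^n}} \le C e^{-(\beta(c - \hat c) + \gamma)\sigma^{-n}}R^w_n.$$

Proof. We restrict ourselves to the linear part $M_i$. A typical term of (7.6), the first in the definition of $M_i$ in (1.8), can be rewritten as
$$\sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(\tau - \tau')}\, \hat{\mathcal L}^n\bigl(\hat K_{c\sigma^{-2n}\tau'} * (\hat{\mathcal L}^{-n}\hat v_{c,n}(\tau'))\bigr)(\kappa, x)\, U(x) = \sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(\tau - \tau')}\, \hat{\mathcal L}^n\bigl(\hat K_{c\sigma^{-2n}\tau'} * \hat u_{\sigma^{-2n}\tau'}\bigr)(\kappa, x)\, U(x).$$
Since $K_{ct}(x)$ vanishes as $x \to -\infty$ with some exponential rate, its Bloch wave transform $\hat K_{c\sigma^{-2n}\tau'}$ can be extended into a strip in the complex plane such that
$$\bigl(\hat K_{c\sigma^{-2n}\tau'} * \hat u_{\sigma^{-2n}\tau'}\bigr)(\kappa, x) = \int d\ell\, \hat K_{c\sigma^{-2n}\tau'}(\kappa - \ell - i\beta, x)\, \hat w_n(\ell, x, \tau')\, e^{-i\ell\hat c\sigma^{-2n}\tau'}\, e^{-\gamma\sigma^{-2n}\tau'} \times e^{-\beta(c - \hat c)\sigma^{-2n}\tau'}\, e^{i(\kappa - \ell)c\sigma^{-2n}\tau'}.$$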
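Using this identity, we get, as in (4.20), because $\exp(\sigma^{-2n}\hat M_{c,n}(\tau - \tau'))$ is bounded, from $\|\hat{\mathcal L}^n\hat u\|_{K_{\sigma^n,3/2}} \le C\sigma^{-11n/2}\|\hat u\|_{\hat H_2^{2,\delta}}$ and (5.7), and recalling $K_w = \hat H_2^{0,\delta}$:
$$\sigma^{-2n}\Bigl\|\int_{\sigma^2}^{\tau} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(\tau - \tau')}\, \hat{\mathcal L}^n\bigl(\hat K_{c\sigma^{-2n}\tau'} * \hat u_{n,\tau'}\bigr)\Bigr\|_{K_{\sigma^n,3/2}}$$
$$\le C\sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, \bigl\|\hat{\mathcal L}^n\bigl(\hat K_{c\sigma^{-2n}\tau'} * \hat u_{n,\tau'}\bigr)\bigr\|_{K_{\sigma^n,3/2}}$$
$$\le C\sigma^{-2n}\int_{\sigma^2}^{\tau} d\tau'\, e^{-\beta(c - \hat c)\sigma^{-2n}\tau'}\, \sigma^{-11n/2}\, (1 + c\sigma^{-2n}\tau')^2 \times \bigl\|(\kappa, x) \mapsto e^{-ic\sigma^{-2n}\tau'\kappa}\, \hat K_{c\sigma^{-2n}\tau'}(\kappa - i\beta, x)\bigr\|_{\hat H_2^{2,\delta}} \times \bigl\|(\kappa, x) \mapsto e^{-i\kappa\hat c\sigma^{-2n}\tau'}\, \hat w_{n,\tau'}(\kappa, x)\bigr\|_{K_w}\, e^{-\gamma\sigma^{-2n}\tau'} \tag{7.14}$$
$$\le C\sigma^{-15n/2}\int_{\sigma^2}^{\tau} d\tau'\, (1 + c\sigma^{-2n}\tau')^2\, e^{-\beta(c - \hat c)\sigma^{-2n}\tau'}\, e^{-\gamma\sigma^{-2n}\tau'}\, R^w_n$$
$$\le C\sigma^{-23n/2}\, e^{-(\beta(c - \hat c) + \gamma)\sigma^{-2(n-1)}}\, R^w_n \le C e^{-(\beta(c - \hat c) + \gamma)\sigma^{-n}}\, R^w_n.$$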
The non-linear terms coming from $N_i$ can be handled in exactly the same way and yield similar bounds. The same is true for the terms with $\hat N_{s,i,n}$ in (7.7).

7.4. Bounds on the initial condition. Here, we estimate the first terms on the right-hand side of the variation of constant formulae (7.6)–(7.8).

Lemma 7.9. For all $1 \ge \tau \ge \sigma^2$ and all $\sigma \in (0, 1]$ we have
$$\bigl\|e^{\sigma^{-2n}\hat M_{c,n}(\tau - \sigma^2)}\, \hat{\mathcal L}^n \hat E_c^h \hat{\mathcal L}^{-n}\hat{\mathcal L}\hat g\bigr\|_{K^c_{\sigma^n}} \le C\sigma^{-11/2}\, \|\hat g\|_{K^c_{\sigma^{n-1}}},$$
$$\bigl\|e^{\sigma^{-2n}\hat M_{s,n}(\tau - \sigma^2)}\, \hat{\mathcal L}^n \hat E_s^h \hat{\mathcal L}^{-n}\sigma^{-3/2}\hat{\mathcal L}\hat g\bigr\|_{K^s_{\sigma^n}} \le C\sigma^{-6}e^{-C\sigma^{-2n}(\tau - \sigma^2)}\, \|\hat g\|_{K^s_{\sigma^{n-1}}},$$
$$\|S_n(\tau, \sigma^2)\hat g\|_{K_w} \le C e^{-\gamma\sigma^{-2n}(\tau - \sigma^2)/2}\, \|\hat g\|_{K_w}.$$

Proof. The first two bounds of Lemma 7.9 follow immediately from Lemma 7.2 and (6.4). The third inequality is an immediate consequence of Lemma 7.3.

7.5. A priori bounds on the non-linear problem. This section follows closely Sect. 4.4. We need a priori bounds on the solution of (7.6)–(7.8). We (re)define now quantities analogous to those of Definition 4.3.

Definition 7.10. For all $n \in \mathbb{N}$, we define
$$\rho^u_{cs,n} = \|\hat v_{c,n}|_{\tau=1}\|_{K^c_{\sigma^n}} + \|\hat v_{s,n}|_{\tau=1}\|_{K^s_{\sigma^n}},$$
and
$$\rho^w_n = \|\hat w_n|_{\tau=1}\|_{K_w}.$$

Lemma 7.11. For all $n \in \mathbb{N}$ there is a constant $\eta_n > 0$ such that the following holds: If $\rho^u_{cs,n-1}$, $\rho^w_{n-1}$, and $\sigma > 0$ are smaller than $\eta_n$, the solutions of (7.6)–(7.8) exist for all $\tau \in [\sigma^2, 1]$. Moreover, we have the estimates
$$R^u_{cs,n} \le C\sigma^{-6}\rho^u_{cs,n-1} + C e^{-C\sigma^{-n}}R^w_n + C\sigma^{n/2}(R^u_{cs,n})^2, \tag{7.15}$$
and
$$R^w_n \le C\rho^w_{n-1} + C\sigma^{n/2}R^u_{cs,n}R^w_n, \tag{7.16}$$
with a constant $C$ independent of $\sigma$ and $n$.

Remark. We remark again that there is no need for a detailed expression for $\eta_n$ since the existence of the solutions is guaranteed if we can show $R^u_{cs,n} < \infty$ and $R^w_n < \infty$. By (7.15) and (7.16) we have detailed control of these quantities in terms of the norms of the initial conditions and $\sigma$.

Proof. For the derivation of the estimates we assume in the sequel, without loss of generality, that $R^u_{cs,n} + R^w_n \le 1$. For the first term in (7.8) we obtained in Lemma 7.9 a bound
$$C\rho^w_{n-1}. \tag{7.17}$$
For the second term in (7.8), we obtained in Lemma 7.7 a bound $C\sigma^{n/2}R^u_{cs,n}R^w_n$.

We now discuss in detail (7.7). Using Lemma 7.9 the first term is bounded by $C\sigma^{-6}\rho^u_{cs,n-1}$. Lemma 7.7 and Lemma 7.8 yield for the second and third terms a bound $C\sigma^{n/2}(R^u_{cs,n})^2 + C e^{-C\sigma^{-n}}R^w_n$ for a $C > 0$ independent of $\sigma \in (0, 1]$ and $n \in \mathbb{N}$.

Finally, we come to the bounds for (7.6). Using Lemma 7.9 the first term is bounded by $C\sigma^{-11/2}\rho^u_{cs,n-1}$. Lemma 7.7 and Lemma 7.8 yield for the second and third terms a bound $C\sigma^{n/2}(R^u_{cs,n})^2 + C e^{-C\sigma^{-n}}R^w_n$ for a $C > 0$ independent of $\sigma \in (0, 1]$ and $n \in \mathbb{N}$.

The proof of Lemma 7.11 now follows by applying the contraction mapping principle to the system consisting of (7.6), (7.7), and (7.8). For $\rho^u_{cs,n-1}$, $\rho^w_{n-1}$ and $\sigma > 0$ sufficiently small the Lipschitz constant on the right-hand side of (7.6) to (7.8) in $C([\sigma^2, 1], K^c_{\sigma^n} \times K^s_{\sigma^n} \times K_w)$ is smaller than 1. An application of a classical fixed point argument completes the proof of Lemma 7.11.
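An estimate of the form (7.15) is a quadratic inequality $R \le a + bR^2$, with $a$ collecting the terms that do not involve $R^u_{cs,n}$ and $b = C\sigma^{n/2}$. The sketch below shows the standard way such an inequality is closed, under the assumption (supplied in the proof by continuity in $\tau$ and the smallness of the data) that $R$ lies on the small branch. The helper name and the sample numbers are hypothetical.

```python
import math

def bootstrap_bound(a, b):
    # Upper bound for R >= 0 on the small branch of R <= a + b*R^2.
    # If 4*a*b < 1, the inequality forces R <= R_minus or R >= R_plus, where R_minus and
    # R_plus are the roots of b*R^2 - R + a = 0. On the small branch,
    # R <= R_minus = 2*a / (1 + sqrt(1 - 4*a*b)) <= 2*a.
    # Returns None when 4*a*b >= 1, since the inequality then gives no information.
    if a < 0.0 or b <= 0.0:
        raise ValueError("expected a >= 0 and b > 0")
    disc = 1.0 - 4.0 * a * b
    if disc <= 0.0:
        return None
    return (1.0 - math.sqrt(disc)) / (2.0 * b)

if __name__ == "__main__":
    # Hypothetical sample values: a plays the role of C*sigma^{-6}*rho + (exponentially small term),
    # b plays the role of C*sigma^{n/2}.
    for a, b in ((0.1, 1.0), (0.01, 0.5), (1.0, 1.0)):
        print(a, b, bootstrap_bound(a, b))
```

The point of the small-branch bound $R \le 2a$ is that the quadratic term can be absorbed, which is how (7.15) is converted into a linear bound of the type (7.29) in the proof of Theorem 7.1.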
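7.6. The iteration process. As in the case of the simplified problem, we decompose the solution $\hat v_{c,n}(\cdot, \cdot, \tau)$ for $\tau = 1$ into a Gaussian part and a remainder. Let $\tilde\psi(\kappa) = e^{-c_1\kappa^2}$ and write
$$\hat v_{c,n}(\kappa, x, 1) = A_n\tilde\psi(\kappa)\varphi_{\sigma^n\kappa}(x) + \hat r_n(\kappa, x),$$
where $\hat r_n(0, x) = 0$, and the amplitude $A_n$ is in $\mathbb{C}$. We also define $\Pi : K^c_{\sigma^n} \to \mathbb{C}$ by
$$(\Pi\hat f)\varphi_0 = \hat P_c(0)\hat f\big|_{\kappa=0}. \tag{7.18}$$
Then (7.6) can be decomposed accordingly and takes the form
$$A_n = A_{n-1} + \Pi\Bigl(\sigma^{-2n}\int_{\sigma^2}^{1} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(1 - \tau')}\, (\hat N_{c,i,n} + \hat N_{c,n})\Bigr), \tag{7.19}$$
$$\hat r_n(\kappa, x) = e^{\sigma^{-2n}\hat M_{c,n}(1 - \sigma^2)}\, \hat r_{n-1}(\sigma\kappa, x) + \sigma^{-2n}\int_{\sigma^2}^{1} d\tau'\, e^{\sigma^{-2n}\hat M_{c,n}(1 - \tau')}\, (\hat N_{c,i,n} + \hat N_{c,n})(\kappa, x) + e^{\sigma^{-2n}\hat M_{c,n}(1 - \sigma^2)}A_{n-1}\tilde\psi(\sigma\kappa)\varphi_{\sigma^n\kappa}(x) - A_n\tilde\psi(\kappa)\varphi_{\sigma^n\kappa}(x). \tag{7.20}$$
If we define next $\rho^r_n = \|\hat r_n\|_{K^c_{\sigma^n}} + \|\hat v_{s,n}|_{\tau=1}\|_{K^s_{\sigma^n}}$ then the above construction implies
$$\rho^u_{cs,n} \le C(|A_n| + \rho^r_n).$$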
Our main estimate is now

Proposition 7.12. There is a constant $C > 0$ such that for sufficiently small $\sigma > 0$ the solution $(\hat v_{c,n}, \hat v_{s,n}, \hat w_n)$ of (7.6)–(7.8) satisfies for all $n \in \mathbb{N}$:
$$|A_n - A_{n-1}| \le C e^{-C\sigma^{-n}}R^w_n + C\sigma^{n/2}(R^u_{cs,n})^2, \tag{7.21}$$
$$\rho^r_n \le C\sigma\rho^r_{n-1} + C e^{-C\sigma^{-n}}R^w_n + C\sigma^{n/2}(R^u_{cs,n})^2 + C\sigma^n R^u_{cs,n}, \tag{7.22}$$
$$\rho^w_n \le C e^{-C\sigma^{-2n}}\rho^w_{n-1} + C\sigma^{n/2}R^u_{cs,n}R^w_n. \tag{7.23}$$

Proof. We begin by bounding the difference $A_n - A_{n-1}$ using (7.19). Since $\hat f$ is in $H^2$ as a function of $\ell$ we obviously have
$$|\Pi\hat f| \le C\|\hat f\|_{K^c_{\sigma^n}}. \tag{7.24}$$
Thus, it suffices to bound the norm of the integral in (7.19), but this has already been done in the proof of Lemma 7.7 and Lemma 7.8.

We next bound $\hat r_n$ in terms of $\hat r_{n-1}$, using (7.20). The first term is the one where the projection is crucial: For $\sigma > 0$ sufficiently small, $\hat r_{n-1} \in K^c_{\sigma^{n-1}}$ with $\hat r_{n-1}(0) = 0$ one has
$$\bigl\|(\kappa, x) \mapsto e^{\sigma^{-2n}\hat M_{c,n}(1 - \sigma^2)}\, \hat r_{n-1}(\sigma\kappa, x)\bigr\|_{K^c_{\sigma^n}} \le C\sigma\, \|\hat r_{n-1}\|_{K^c_{\sigma^{n-1}}}, \tag{7.25}$$
as in the proof of Proposition 4.5. This leads for the first term in (7.20) to a bound (in $K^c_{\sigma^n}$)
$$C\sigma\rho^r_{n-1}. \tag{7.26}$$
The second and third term have been bounded in the proof of Lemma 7.7 and Lemma 7.8 by
$$C e^{-C\sigma^{-n}}R^w_n + C\sigma^{n/2}(R^u_{cs,n})^2. \tag{7.27}$$
Finally, the last term
$$\hat X_n(\kappa, x) \equiv e^{\sigma^{-2n}\hat M_{c,n}(1 - \sigma^2)}A_{n-1}\tilde\psi(\sigma\kappa)\varphi_{\sigma^n\kappa}(x) - A_n\tilde\psi(\kappa)\varphi_{\sigma^n\kappa}(x),$$
in (7.20) leads to a bound (in $K^c_{\sigma^n}$):
$$\|\hat X_n\| \le C e^{-C\sigma^{-n}}R^w_{n-1} + C\sigma^{n/2}(R^u_{cs,n})^2 + C\sigma^n R^u_{cs,n}, \tag{7.28}$$
where the last term is due to $\mu_1(\ell) = -c_1\ell^2 + O(\ell^3)$ not being exactly a parabola. For details see [Schn96]. Collecting the bounds, the assertion (7.22) for $\hat r_n$ follows.

Finally, the bounds on $\rho^w_n$ follow in the same way as those in Lemma 7.11. The proof of Proposition 7.12 is complete.
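The linear part of the iteration can be sketched on a scalar model. The code below iterates the map $(\mathcal R f)(\kappa) = e^{-c_1\kappa^2(1 - \sigma^2)}f(\sigma\kappa)$, which is the first term of (7.6) at $\tau = 1$ when $\mu_1$ is replaced by the exact parabola $-c_1\kappa^2$. This is an illustration under simplifying assumptions only: the values of `C1` and `SIGMA` and the initial profile are hypothetical, the $x$-dependence and the finite Brillouin zone are ignored, and the non-linear terms are dropped.

```python
import math

C1 = 1.0     # hypothetical diffusion coefficient c_1
SIGMA = 0.5  # hypothetical scaling factor sigma in (0, 1)

def rg_step(f, sigma=SIGMA, c1=C1):
    # One linear renormalization step: (R f)(kappa) = exp(-c1*kappa^2*(1 - sigma^2)) * f(sigma*kappa)
    def renormalized(kappa):
        return math.exp(-c1 * kappa**2 * (1.0 - sigma**2)) * f(sigma * kappa)
    return renormalized

def gaussian(kappa, c1=C1):
    # The Gaussian psi(kappa) = exp(-c1*kappa^2), a fixed point of rg_step
    return math.exp(-c1 * kappa**2)

def initial_profile(kappa):
    # Hypothetical data: amplitude 2 at kappa = 0 plus a remainder vanishing at kappa = 0
    return 2.0 + math.sin(kappa)

def sup_distance(f, amplitude, kappa_max=4.0, samples=801):
    # sup over a grid of |f(kappa) - amplitude * psi(kappa)|
    worst = 0.0
    for j in range(samples):
        kappa = -kappa_max + 2.0 * kappa_max * j / (samples - 1)
        worst = max(worst, abs(f(kappa) - amplitude * gaussian(kappa)))
    return worst

amplitude = initial_profile(0.0)
profile = initial_profile
distances = []
for n in range(1, 9):
    profile = rg_step(profile)
    distances.append(sup_distance(profile, amplitude))
    print(f"n = {n}: distance to A * psi = {distances[-1]:.6f}")
```

In this model the amplitude at $\kappa = 0$ is preserved by every step, while the remainder, which vanishes at $\kappa = 0$, shrinks by roughly a factor $\sigma$ per step. This mirrors the role of the condition $\hat r_{n-1}(0) = 0$ in the contraction estimate (7.25).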
Proof of Theorem 7.1. As before the proof is just an induction argument, using repeatedly the above estimates. Again we write $C$ for constants which can be chosen independent of $\sigma$ and $n$. Assume that $R = \sup_{n \in \mathbb{N}} R^u_{cs,n} < \infty$ exists. From Lemma 7.11 we observe for $\sigma > 0$ sufficiently small,
$$R^w_n \le \frac{C\rho^w_{n-1}}{1 - C\sigma^{n/2}R} \le C\rho^w_{n-1},$$
$$R^u_{cs,n} \le \frac{C\sigma^{-6}\rho^u_{cs,n-1} + C e^{-C\sigma^{-n}}R^w_n}{1 - C\sigma^{n/2}R} \le C\sigma^{-6}\rho^u_{cs,n-1} + C e^{-C\sigma^{-n}}\rho^w_{n-1}, \tag{7.29}$$
with a constant $C$ which can be chosen independent of $R$. Using Proposition 7.12 we find
$$|A_n - A_{n-1}| \le C e^{-C\sigma^{-n}}\rho^w_{n-1} + C\sigma^{n/2}\sigma^{-6}\rho^u_{cs,n-1},$$
$$\rho^r_n \le C\sigma\rho^r_{n-1} + C e^{-C\sigma^{-n}}\rho^w_{n-1} + C\sigma^{n/2}\sigma^{-6}\rho^u_{cs,n-1},$$
$$\rho^u_{cs,n} \le C(|A_n| + \rho^r_n), \tag{7.30}$$
$$\rho^w_n \le C e^{-C\sigma^{-2n}}\rho^w_{n-1} + C\sigma^{n/2}\rho^w_{n-1}.$$
Therefore, we can choose $\sigma > 0$ so small that for $n > 13$:
$$|A_n - A_{n-1}| \le \rho^w_{n-1}/10 + \sigma^{(n-13)/2}(|A_{n-1}| + \rho^r_{n-1}),$$
$$\rho^r_n \le 3\rho^r_{n-1}/4 + \rho^w_{n-1}/10 + \sigma^{(n-13)/2}|A_n|,$$
$$\rho^w_n \le \rho^w_{n-1}/10.$$
Thus, the sequence of $A_n$ converges geometrically to a finite limit $A_*$. Furthermore, we find that $\lim_{n\to\infty}\rho^r_n = 0$, and $\lim_{n\to\infty}\rho^w_n = 0$. Since the quantities $|A_n|$, $\rho^r_n$, $\rho^w_n$ increase only for at most 13 steps, the term $CR$ in (7.29) stays less than 1/2 if we choose $|A_1|, \rho^r_1, \rho^w_1 = O(\sigma^m)$, for a sufficiently large $m > 0$. From (7.29) the existence of a finite constant $R = \sup_{n \in \mathbb{N}} R^u_{cs,n}$ follows. Going back to (7.30) we can choose $\sigma > 0$ so small that $|A_n - A_{n-1}| + \rho^r_n \le C\sigma^{n/2}$, which implies the associated convergence rate stated in Theorem 7.1. Finally, the scaling of $\hat w_n(\cdot, \cdot, \tau)$ implies the exponential decay of $\hat w(t)$. The proof of Theorem 7.1 is complete.

Acknowledgement. Guido Schneider would like to thank the Physics Department of the University of Geneva for kind hospitality. Both authors would like to thank the referee for reading the paper very carefully and for very helpful comments. This work is partially supported by the Fonds National Suisse. The work of Guido Schneider is partially supported by the Deutsche Forschungsgemeinschaft DFG under the grant Mi459/2–3.
References

[AW78] Aronson, D.G., Weinberger, H.: Multidimensional nonlinear diffusion arising in population genetics. Adv. Math. 30, 33–76 (1978)
[BK92] Bricmont, J., Kupiainen, A.: Renormalization group and the Ginzburg–Landau equation. Commun. Math. Phys. 150, 193–208 (1992)
[BK94] Bricmont, J., Kupiainen, A.: Stability of moving fronts in the Ginzburg–Landau equation. Commun. Math. Phys. 159, 287–318 (1994)
[CE86] Collet, P., Eckmann, J.-P.: The existence of dendritic fronts. Commun. Math. Phys. 107, 39–92 (1986)
[CE87] Collet, P., Eckmann, J.-P.: The stability of modulated fronts. Helv. Phys. Acta 60, 969–991 (1987)
[CE90a] Collet, P., Eckmann, J.-P.: Instabilities and fronts in extended systems. Princeton: Princeton University Press, 1990
[CE90b] Collet, P., Eckmann, J.-P.: The time dependent amplitude equation for the Swift–Hohenberg problem. Commun. Math. Phys. 132, 139–153 (1990)
[CEE92] Collet, P., Eckmann, J.-P., Epstein, H.: Diffusive repair for the Ginsburg–Landau equation. Helv. Phys. Acta 65, 56–92 (1992)
[DL83] Dee, G., Langer, J.S.: Propagating pattern selection. Phys. Rev. Lett. 50, 383–386 (1983)
[Eck65] Eckhaus, W.: Studies in nonlinear stability theory. Springer Tracts in Nat. Phil. Vol. 6, Berlin–Heidelberg–New York: Springer, 1965
[EW91] Eckmann, J.-P., Wayne, C.E.: Propagating fronts and the center manifold theorem. Commun. Math. Phys. 136, 285–307 (1991)
[EW94] Eckmann, J.-P., Wayne, C.E.: The non-linear stability of front solutions for parabolic partial differential equations. Commun. Math. Phys. 161, 323–334 (1994)
[EWW97] Eckmann, J.-P., Wayne, C.E., Wittwer, P.: Geometric stability analysis of periodic solutions of the Swift–Hohenberg equation. Commun. Math. Phys. 190, 173–211 (1997)
[Ga94] Gallay, T.: Local stability of critical fronts in nonlinear parabolic partial differential equations. Nonlinearity 7, 741–764 (1994)
[HS99] Haragus, M., Schneider, G.: Bifurcating fronts for the Taylor–Couette problem in infinite cylinders. Zeitschrift für Angewandte Mathematik und Physik (ZAMP) 50, 120–151 (1999)
[KSM92] Kirrmann, P., Schneider, G., Mielke, A.: The validity of modulation equations for extended systems with cubic nonlinearities. Proceedings of the Royal Society of Edinburgh 122A, 85–91 (1992)
[RS72] Reed, M., Simon, B.: Methods of Modern Mathematical Physics I–IV. New York: Academic Press, 1972
[Sa77] Sattinger, D.H.: Weighted norms for the stability of travelling waves. J. Diff. Eqns. 25, 130–144 (1977)
[Schn94] Schneider, G.: Error estimates for the Ginzburg–Landau approximation. J. Appl. Math. Phys. 45, 433–457 (1994)
[Schn96] Schneider, G.: Diffusive stability of spatial periodic solutions of the Swift–Hohenberg equation. Commun. Math. Phys. 178, 679–702 (1996)
[Schn98] Schneider, G.: Nonlinear stability of Taylor-vortices in infinite cylinders. Arch. Rational Mech. Anal. 144, 121–200 (1998)
[Ta97] Taylor, M.E.: Partial Differential Equations I: Basic Theory. Appl. Math. Sciences 115, Berlin–Heidelberg–New York: Springer, 1997
[vH91] van Harten, A.: On the validity of Ginzburg–Landau's equation. J. Nonlinear Science 1, 397–422 (1991)
[Wa97] Wayne, C.E.: Invariant manifolds for parabolic partial differential equations on unbounded domains. Arch. Rat. Mech. Anal. 138, 279–306 (1997)

Communicated by A. Kupiainen
Commun. Math. Phys. 225, 399 – 421 (2002)
Pauli Operator and Aharonov–Casher Theorem for Measure Valued Magnetic Fields
László Erdős1, Vitali Vougalter2
1 School of Mathematics, Georgia Tech, Atlanta, GA 30332, USA. E-mail: [email protected]
2 Department of Mathematics, University of British Columbia, Vancouver, BC, Canada V6T 1Z2. E-mail: [email protected]
Received: 14 May 2001 / Accepted: 5 September 2001
Partially supported by NSF grant DMS-9970323

Abstract: We define the two dimensional Pauli operator and identify its core for magnetic fields that are regular Borel measures. The magnetic field is generated by a scalar potential, hence we bypass the usual $A \in L^2_{loc}$ condition on the vector potential, which does not allow such singular fields to be considered. We extend the Aharonov–Casher theorem to magnetic fields that are measures with finite total variation and we present a counterexample in the case of infinite total variation. One of the key technical tools is a weighted $L^2$ estimate on a singular integral operator.

1. Introduction
We consider the usual Pauli operator in d = 2 dimensions with a magnetic field B,
$$H = [\sigma\cdot(-i\nabla + A)]^2 = (-i\nabla + A)^2 + \sigma_3 B \quad \text{on } L^2(\mathbb{R}^2, \mathbb{C}^2), \qquad B := \mathrm{curl}(A) = \nabla^\perp\cdot A$$
with $\nabla^\perp := (-\partial_2, \partial_1)$. Here $\sigma\cdot(-i\nabla + A)$ is the two dimensional Dirac operator on the trivial spinor bundle over $\mathbb{R}^2$ with real vector potential A and $\sigma = (\sigma_1, \sigma_2)$ are the first two Pauli matrices. Precise conditions on A and B will be specified later. The Aharonov–Casher theorem [A-C] states that the dimension of the kernel of H is given by $\dim \mathrm{Ker}(H) = \lfloor |\Phi| \rfloor$, where
$$\Phi := \frac{1}{2\pi}\int_{\mathbb{R}^2} B(x)\,dx \tag{1}$$
(possibly ±∞) is the flux (divided by 2π) and $\lfloor\,\cdot\,\rfloor$ denotes the lower integer part ($\lfloor n \rfloor = n - 1$ for n ≥ 1 integer and $\lfloor 0 \rfloor = 0$). Moreover, $\sigma_3\psi = -s\psi$ for any $\psi \in \mathrm{Ker}(H)$, where $s = \mathrm{sign}(\Phi)$.

On a $\mathrm{Spin}^c$-bundle over $S^2$ with a smooth magnetic field the analogous theorem is equivalent to the index theorem (for a short direct proof see [E-S]). For topological reasons the analogue of Φ, the total curvature of a connection, is an integer (the Chern number of the determinant line bundle), and the number of zero modes of the corresponding Dirac operator is |Φ|.

In the present paper we investigate two related questions: (i) What is the most general class of magnetic fields for which the Pauli operator can be properly defined on $\mathbb{R}^2$? (ii) What is the most general class of magnetic fields for the Aharonov–Casher theorem to hold on $\mathbb{R}^2$?

Pauli operators are usually defined either via the magnetic Schrödinger operator, $(-i\nabla + A)^2$, by adding the magnetic field $\sigma_3 B$ as an external potential, or directly by the quadratic form of the Dirac operator $\sigma\cdot(-i\nabla + A)$ (see Sect. 2.1). In both ways, the standard condition $A \in L^2_{loc}(\mathbb{R}^2, \mathbb{R}^2)$ is necessary. On the other hand, the statement of the Aharonov–Casher theorem uses only that $B \in L^1(\mathbb{R}^2)$, and in fact B can even be a measure. It is therefore a natural question to extend the Pauli operator to such magnetic fields and investigate the validity of the Aharonov–Casher theorem.

However, even if $B \in L^1$, it might not be generated by an $A \in L^2_{loc}$. For example, any gauge A generating the radial field $B(x) = |x|^{-2}\,|\log|x||^{-3/2}\,\mathbf{1}(|x| \le \tfrac12) \in L^1$ satisfies $\int_{|x|\le 1/2} |A(x)|^2\,dx \ge \int_0^{1/2} (r|\log r|)^{-1}\,dr = \infty$ (here $\mathbf{1}$ is the characteristic function). Hence the Pauli operator cannot be defined in the usual way on $C_0^\infty$ as its core. In the case of a point singularity at $p \in \mathbb{R}^2$ one can study the extensions from $C_0^\infty(\mathbb{R}^2 \setminus \{p\})$, but such an approach may not be possible for B with a more complicated singular set.
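The radial example above can be checked with closed-form antiderivatives. The following is only an illustrative sketch: the function names are ours, and the substitution $u = |\log r|$ is used to evaluate both integrals on the annulus $\varepsilon \le |x| \le 1/2$. The mass of B stays bounded as $\varepsilon \to 0$, while the lower bound for $\int |A|^2$ grows like $\log|\log \varepsilon|$.

```python
import math

def radial_mass_of_B(eps):
    """Integral of s**-1 * |log s|**-1.5 over [eps, 1/2].

    Up to the factor 2*pi this is the integral of
    B(x) = |x|**-2 * |log|x||**-1.5 over eps <= |x| <= 1/2
    (polar coordinates, then u = |log s|).
    """
    return 2.0 / math.sqrt(math.log(2.0)) - 2.0 / math.sqrt(abs(math.log(eps)))

def lower_bound_for_A(eps):
    """Integral of (r * |log r|)**-1 over [eps, 1/2] (same substitution)."""
    return math.log(abs(math.log(eps))) - math.log(math.log(2.0))

if __name__ == "__main__":
    for eps in (1e-2, 1e-10, 1e-100, 1e-300):
        # first column stays below 2/sqrt(log 2); second column keeps growing
        print(eps, radial_mass_of_B(eps), lower_bound_for_A(eps))
```

The divergence is very slow (doubly logarithmic), which is why it is easier to see from the closed forms than from numerical quadrature.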
In this paper we present an alternative method which enables us to define the Pauli operator for any magnetic field that is a regular Borel measure (Theorem 2.7). Moreover, we actually define the corresponding quadratic form on the maximal domain and identify a core. We recall that the maximal domain contains all finite energy states, hence it has a direct physical interpretation. For mathematical analysis, however, one needs to know a core explicitly that contains reasonably “nice” functions. For most Schrödinger type operators the core consists of smooth functions. In case of Pauli operators with singular magnetic fields the core will be identified as the set of smooth functions times an explicit nonsmooth factor.

The basic idea is to define the Pauli operator via a real generating potential function h, satisfying
$$\Delta h = B \tag{2}$$
instead of the usual vector potential A. This potential function appears in the original proof of the Aharonov–Casher theorem. The key identity is the following:
$$\int |\sigma\cdot(-i\nabla + A)\psi|^2 = 4\int \left|\partial_{\bar z}(e^{-h}\psi_+)\right|^2 e^{2h} + \left|\partial_z(e^{h}\psi_-)\right|^2 e^{-2h} \tag{3}$$
for regular data, with $A := \nabla^\perp h$ (integrals without specified domains are understood on $\mathbb{R}^2$ with respect to the Lebesgue measure). We will define the Pauli quadratic form by
the right-hand side even for less regular data. It turns out that any magnetic field that is a regular Borel measure can be handled by an h-potential. The main technical tool is that for an appropriate choice of h, the weight function $e^{\pm 2h}$ (locally) belongs to the Muckenhoupt $A_2$ class ([G-R, St]). Therefore the maximal operator and certain singular integral operators are bounded on the weighted $L^2$ spaces. This will be essential to identify the core of the Pauli operator. We point out that this approach does not apply to the magnetic Schrödinger operator $(-i\nabla + A)^2$.

The Aharonov–Casher theorem has been rigorously proven only for a restricted class of magnetic fields on $\mathbb{R}^2$. The conditions involve some control on the decay at infinity and on local singularities. In fact, to our knowledge, the optimal conditions have never been investigated. The original paper [A-C] does not focus on conditions. The exposition [CFKS] assumes a compactly supported bounded magnetic field B(x). The Ph.D. thesis by K. Miller [Mi] assumes boundedness, and assumes that $\int |B(x)| \log|x|\,dx < \infty$. The boundedness condition is clearly too strong, and it can be easily replaced with the assumption that B belongs to the Kato class $K(\mathbb{R}^2)$. Miller also observes that in the case of an integer Φ ≠ 0 there could be either |Φ| or |Φ| − 1 zero states, but if the field is compactly supported then the number of states is always |Φ| − 1 [CFKS].

The idea behind each proof is to construct a potential function h satisfying (2). Locally, $H\psi = 0$ is equivalent to $\psi = (e^{h}g_+, e^{-h}g_-)$ with $\partial_{\bar z} g_+ = 0$, $\partial_z g_- = 0$, where we identify $\mathbb{R}^2$ with $\mathbb{C}$ and use the notations $x = (x_1, x_2) \in \mathbb{R}^2$ and $z = x_1 + ix_2 \in \mathbb{C}$ simultaneously. The condition $\psi \in L^2(\mathbb{R}^2, \mathbb{C}^2)$ together with the explicit growth (or decay) rate of h at infinity determines the global solution space by identifying the space of (anti)holomorphic functions $g_\pm$ with a controlled growth rate at infinity.
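The counting of admissible (anti)holomorphic functions can be illustrated in a toy model. The sketch below assumes, purely for illustration, that $h(x) = \Phi\log|x|$ for large |x| with a finite flux Φ ≥ 0, and that $g_-$ is a monomial of degree k; only integrability at infinity is tested, and all function names are ours. Under these assumptions $|e^{-h}g_-|^2 \sim r^{2k-2\Phi}$, which is integrable against $r\,dr$ at infinity exactly when $2k - 2\Phi + 1 < -1$.

```python
import math

def lower_integer_part(x):
    """Largest integer strictly below x for x > 0, and 0 for x = 0."""
    if x < 0:
        raise ValueError("defined here for non-negative arguments only")
    return 0 if x == 0 else math.ceil(x) - 1

def admissible_degrees(flux):
    """Degrees k >= 0 such that r**(2k - 2*flux) * r is integrable at infinity."""
    degrees = []
    k = 0
    # the radial integrand behaves like r**(2k - 2*flux + 1);
    # it is integrable at infinity iff the exponent is < -1
    while 2 * k - 2 * flux + 1 < -1:
        degrees.append(k)
        k += 1
    return degrees

if __name__ == "__main__":
    for flux in (0.5, 1, 2.5, 3, 7):
        print(flux, admissible_degrees(flux), lower_integer_part(flux))
```

In this model the number of admissible degrees agrees with the lower integer part of the flux, including the integer case where one mode is lost.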
For bounded magnetic fields decaying fast enough at infinity, a solution to (2) is given by
$$h(x) = \frac{1}{2\pi}\int_{\mathbb{R}^2} \log|x - y|\, B(y)\,dy \tag{4}$$
and h(x) behaves as $\approx \Phi\log|x|$ for large x. If Φ ≥ 0, then $e^{h}g_+$ is never in $L^2$, and $e^{-h}g_- \in L^2$ if $|g_-|$ grows at most as the $(\lfloor\Phi\rfloor - 1)$th power of |x|. If Φ < ∞, then $g_-$ must be a polynomial of degree at most $\lfloor\Phi\rfloor - 1$. If Φ = ∞, then the integral in (4) is not absolutely convergent. If the radial behavior of B is regular enough, then h may still be defined via (4) as a conditionally convergent integral and we then have a solution space of infinite dimension.

Conditions on local regularity and decay at infinity are used to establish bounds on the auxiliary function h given by (4), but they are not a priori needed for the Aharonov–Casher Theorem (1). We show that local regularity conditions are irrelevant by proving the Aharonov–Casher theorem for any measure valued magnetic fields with finite total variation (Theorem 3.1). Many fields with infinite total variation can also be covered; some regular behavior at infinity is sufficient (Corollary 3.3). However, some control is needed in general, as we present a counterexample to the Aharonov–Casher theorem for a magnetic field with infinite total variation.

Counterexample 1.1. There exists a continuous bounded magnetic field B such that $\int_{\mathbb{R}^2} |B| = \infty$ and
$$\Phi := \lim_{r\to\infty} \Phi(r) = \lim_{r\to\infty} \frac{1}{2\pi}\int_{|x|\le r} B(x)\,dx \tag{5}$$
exists and Φ > 1, but dim Ker H = 0.

Finally, we recall a conjecture from [Mi]:

Conjecture 1.2. Let B(x) ≥ 0 with flux $\Phi := \frac{1}{2\pi}\int B$, which may be infinite. Then the dimension of Ker(H) is at least $\lfloor\Phi\rfloor$.
The proof in [Mi] failed because it would have relied on the conjecture that for any continuous function B ≥ 0 there exists a positive solution h to (2). This is false. A counterexample (even with finite Φ) was given by C. Fefferman and B. Simon and it was presented in [Mi]. However, the same magnetic field does not yield a counterexample to Conjecture 1.2. Theorem 3.1 settles this conjecture for Φ < ∞, but the case Φ = ∞ remains open. The magnetic field in our counterexample does not have a definite sign; in fact Φ is defined only as an improper integral.

2. Definition of the Pauli Operator

2.1. Standard definition for $A \in L^2_{loc}$. The standard definition of the magnetic Schrödinger operator, $(-i\nabla + A)^2$, or the Pauli operator, $[\sigma\cdot(-i\nabla + A)]^2$, as a quadratic form, requires $A \in L^2_{loc}$ (see e.g. [L-S, L-L] and for the Pauli operator [So]). We define $\Pi_k := -i\partial_k + A_k$, $Q_\pm := \Pi_1 \pm i\Pi_2$, or with complex notation $Q_+ = -2i\partial_{\bar z} + a$, $Q_- = -2i\partial_z + \bar a$ with $a := A_1 + iA_2$. These are closable operators, originally defined on $C_0^\infty(\mathbb{R}^2)$. Their closures are denoted by the same letter on the minimal domains $D_{min}(\Pi_j)$ and $D_{min}(Q_\pm)$. Let
$$s_A(u, u) := \|\Pi_1 u\|^2 + \|\Pi_2 u\|^2 = \int |(-i\nabla + A)u|^2, \qquad u \in C_0^\infty(\mathbb{R}^2)$$
be the closable quadratic form associated with the magnetic Schrödinger operator on the minimal form domain $D_{min}(s_A)$. It is known [Si] that the minimal domain coincides with the maximal domain $D_{max}(s_A) := \{u \in L^2(\mathbb{R}^2) : s_A(u, u) < \infty\}$. We will denote $D(s_A) := D_{max}(s_A) = D_{min}(s_A)$ and let $S_A$ be the corresponding self-adjoint operator. The closable quadratic form associated with the Pauli operator is
$$p_A(\psi, \psi) := \|Q_+\psi_+\|^2 + \|Q_-\psi_-\|^2 = \int |\sigma\cdot(-i\nabla + A)\psi|^2, \qquad \psi = \begin{pmatrix}\psi_+ \\ \psi_-\end{pmatrix} \in C_0^\infty(\mathbb{R}^2, \mathbb{C}^2).$$
The condition $A \in L^2_{loc}$ is obviously necessary. The minimal form domain is $D_{min}(p_A) = D_{min}(Q_+) \otimes D_{min}(Q_-)$, while $D_{max}(p_A) := \{\psi \in L^2(\mathbb{R}^2, \mathbb{C}^2) : p_A(\psi, \psi) < \infty\}$. The unique self-adjoint operators associated with these forms are $P_A^{min}$ and $P_A^{max}$. Clearly $D_{min}(p_A) \subset D_{max}(p_A)$.

For a locally bounded magnetic field $B = \nabla^\perp\cdot A$ one can choose a vector potential $A \in L^\infty_{loc}$ by the Poincaré formula and in this case $D_{min}(p_A) = D_{max}(p_A)$, i.e., $P_A^{min} = P_A^{max}$. To see this, we first approximate any $\psi \in D_{max}(p_A)$ in the norm $[\|\cdot\|^2 + p_A(\cdot, \cdot)]^{1/2}$ by functions $\psi_n = \psi\chi_n$ of compact support, where $\chi_n \to 1$ and $\|\nabla\chi_n\|_\infty \to 0$. Then we use that $\|\nabla\psi_n\|^2 \le 2p_A(\psi_n, \psi_n) + 2\|A\psi_n\|^2 < \infty$, i.e. $\psi_n \in H^1$, so it can be approximated by $C_0^\infty$ functions in $H^1$ and also in $[\|\cdot\|^2 + p_A(\cdot, \cdot)]^{1/2}$.

To our knowledge, the precise conditions for $D_{min}(p_A) = D_{max}(p_A)$ have not been investigated in general. Such a result is expected to be harder than $D_{min}(s_A) = D_{max}(s_A)$ due to the lack of the diamagnetic inequality. In the present paper we do not address this question. We will define the Pauli quadratic form differently and always on the appropriate maximal domain since this is the physically relevant object (finite energy) and we identify a natural core for computations. We will see that this approach works for data even more singular than $A \in L^2_{loc}$ and for $A \in L^2_{loc}$ we obtain $P_A^{max}$ back. It is nevertheless a mathematically interesting open question to determine the biggest subset of $L^2_{loc}$ vector potentials such that the set $C_0^\infty$ is still a core for the Pauli form.

Finally we remark that $D(s_A) \otimes D(s_A) \subset D_{min}(p_A)$ for $A \in L^2_{loc}$. In case of $B = \nabla^\perp\cdot A \in L^\infty$, these two domains are equal and $P_A^{min} = S_A \otimes I_2 + \sigma_3 B$. If $B \in L^\infty_{loc}$ only, then the form domains coincide locally. For more details on these statements, see Sect. 2 of [So].
2.2. Measures and integer point fluxes. Let $\overline{\mathcal{M}}$ be the set of signed real Borel measures µ(dx) on $\mathbb{R}^2$ with finite total variation, $|\mu|(\mathbb{R}^2) = \int_{\mathbb{R}^2} |\mu|(dx) < \infty$. Let $\mathcal{M}$ be the set of signed real regular Borel measures µ on $\mathbb{R}^2$; in particular they have σ-finite total variation. If µ(dx) = B(x)dx is absolutely continuous, then $\mu \in \overline{\mathcal{M}}$ is equivalent to $B \in L^1$. Let $\mathcal{M}^*$ be the set of all measures $\mu \in \mathcal{M}$ such that $\mu(\{x\}) \in (-2\pi, 2\pi)$ for any point $x \in \mathbb{R}^2$, and $\overline{\mathcal{M}}^* := \overline{\mathcal{M}} \cap \mathcal{M}^*$.

Definition 2.1. Two measures $\mu, \mu' \in \mathcal{M}$ are said to be equivalent if $\mu - \mu' = 2\pi\sum_j n_j\delta_{z_j}$, where $n_j \in \mathbb{Z}$, $z_j \in \mathbb{R}^2$. The equivalence class of any measure $\mu \in \mathcal{M}$ contains a unique measure, called the reduction of µ and denoted by $\mu^*$, such that $\mu^*(\{x\}) \in [-\pi, \pi)$ for any $x \in \mathbb{R}^2$. In particular, $\mu^* \in \mathcal{M}^*$.

The Pauli operator associated with $\mu \in \mathcal{M}$ will depend only on the equivalence class of µ up to a gauge transformation, so we can work with $\mu \in \mathcal{M}^*$. This just reflects the physical expectation that any magnetic point flux $2\pi n\delta_z$, with integer n, is removable by the gauge transformation $\psi(x) \to e^{in\varphi}\psi(x)$, where $\varphi = \arg(x - z)$. In case of several point fluxes, $2\pi\sum_j n_j\delta_{z_j}$, the phase factor should be $\exp\big(i\sum_j n_j\arg(x - z_j)\big)$, but it may not converge for an infinite set of points $\{z_j\}$.

However, any $\mu \in \mathcal{M}$ can be uniquely written as $\mu = \mu^* + 2\pi\sum_j n_j\delta_{z_j}$ with $n_j \in \mathbb{Z}\setminus\{0\}$ and with a set of distinct points $\{z_j\}$ which do not accumulate in $\mathbb{R}^2 \equiv \mathbb{C}$. Let $I_+ := \{j : n_j > 0\}$ and $I_- := \{j : n_j < 0\}$ be the sets of indices of the points with positive and negative masses, respectively. By the Weierstrass theorem, there exist analytic functions $F_\mu(x)$ and $G_\mu(x)$ (recall $x = x_1 + ix_2$) such that $F_\mu$ has zeros exactly at the points $\{z_j : j \in I_+\}$ with multiplicities $n_j$, and $G_\mu$ has zeros at $\{z_j : j \in I_-\}$ with multiplicities $-n_j$. Let $L_\mu(x) := F_\mu(x)\overline{G_\mu(x)}$. Then the integer point fluxes can be removed by the unitary gauge transformation
$$U_\mu : \psi(x) \to \frac{L_\mu(x)}{|L_\mu(x)|}\,\psi(x). \tag{6}$$
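The role of the integrality of n in the gauge transformation $\psi \to e^{in\varphi}\psi$ can be seen numerically: the phase factor $e^{in\arg(x - z)}$ is single valued across the branch cut of arg exactly when n is an integer. The sketch below is only an illustration with names of our own choosing; it compares the phase factor just above and just below the cut.

```python
import cmath
import math

def gauge_phase(n, x, z):
    """Phase factor exp(i n arg(x - z)) for complex points x != z."""
    return cmath.exp(1j * n * cmath.phase(x - z))

def jump_across_cut(n, z=0j, radius=1.0, delta=1e-9):
    """Modulus of the jump of the phase factor across the branch cut of arg.

    The two sample points sit on the circle |x - z| = radius at the
    angles pi - delta and -(pi - delta), i.e. on either side of the cut.
    """
    above = z + radius * cmath.exp(1j * (math.pi - delta))
    below = z + radius * cmath.exp(-1j * (math.pi - delta))
    return abs(gauge_phase(n, above, z) - gauge_phase(n, below, z))

if __name__ == "__main__":
    for n in (1, 2, -3, 0.5):
        # integer n: jump of order delta; half-integer n: jump close to 2
        print(n, jump_across_cut(n))
```

For non-integer n the factor is discontinuous, so multiplication by it does not map smooth spinors to smooth spinors; this is consistent with only integer point fluxes being removable.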
For example, for any compact set $K \subset \mathbb{R}^2$, we can write $L_\mu/|L_\mu|$ as
$$\frac{L_\mu(x)}{|L_\mu(x)|} = \exp\Big(i\sum_{j\,:\,z_j\in K} n_j\arg(x - z_j) + iH_K(x)\Big), \qquad x \in K,$$
where $H_K$ is a real harmonic function on K. In particular, for any ψ supported on K,
$$U_\mu^*(-i\nabla)U_\mu\psi = \Big(-i\nabla + \sum_{j\,:\,z_j\in K} n_j A_j + \nabla H_K\Big)\psi,$$
where $\nabla^\perp\cdot A_j = 2\pi\delta_{z_j}$.

2.3. Potential function. The Pauli quadratic form for magnetic fields $\mu \in \overline{\mathcal{M}}^*$ will be defined via the right hand side of (3), where h is a solution to $\Delta h = \mu$. The following theorem shows that for $\mu \in \overline{\mathcal{M}}^*$ one can always choose a good potential function h. Later we will extend it for $\mu \in \mathcal{M}^*$.

Theorem 2.2. Let $\mu \in \overline{\mathcal{M}}^*$ and $\Phi := \frac{1}{2\pi}\int\mu(dx)$ be the total flux (divided by 2π). There is 0 < ε(µ) ≤ 1 such that for any 0 < ε < ε(µ) there exists a real valued 1,p function h = h(ε) ∈ ∩p 4 is a sufficiently large real number. We decompose the solutions $r_t(x)$ to Eq. (7.1) accordingly:
$$r_t(x) = r_t^{(-)}(x) + r_t^{(+)}(x),$$
$$r_t^{(-)}(x) = K_t^{(-)} r_0(x) + \int_0^t K_{t-s}^{(-)}\big(G_1(u_s, v_s)r_s + G_2(u_s, v_s)\bar r_s\big)(x)\,ds,$$
$$r_t^{(+)}(x) = K_t^{(+)} r_0(x) + \int_0^t K_{t-s}^{(+)}\big(G_1(u_s, v_s)r_s + G_2(u_s, v_s)\bar r_s\big)(x)\,ds.$$
The kernels $K_t^{(-)}$ and $K_t^{(+)}$ have some regularity and decay properties that we next describe: let the Bernstein class $B_{R,k}$ be the following set of functions:
$$B_{R,k} \equiv \big\{ f \in L^\infty : f \text{ extends to an entire function},\ |f(z)| \le R\,e^{k|\mathrm{Im}\,z|} \big\}. \tag{7.3}$$
We have

Lemma 7.1. For all $p^* > 4$, $t > \frac12$, $f \in L^\infty$, $K_t^{(-)} f$ is in $B_{R,2p^*}$ with $R \le 2C_0\|f\|_\infty$. Moreover, for all $n \in \mathbb{N}$, there is a $C_n > 0$ such that
$$|K_t^{(-)}(x)| \le \frac{C_n}{\sqrt t}\,(1 + x^2/t)^{-n}, \qquad |K_t^{(+)}(x)| \le \frac{C_n}{\sqrt t}\,e^{-(p^*)^2 t/2}\,(1 + x^2/t)^{-n}.$$
The proof of Lemma 7.1 is omitted, see [CE2, Ro]. Pick a 2ε-cover of $A_\omega|_{Q_{L+C(\varepsilon)}}$ (which exists a.s. by compactness, see Definition 6.3) and let u and v belong to one of its elements. Then $r_0 = u - v$ satisfies $|r_0(x)| \le 2\varepsilon$ for $|x| \le \frac12(L + C(\varepsilon))$. Define
$$\xi_y^{(n)}(x) = \frac{1}{(1 + (x - y)^2)^{n/2}}.$$
Remark that Lemma 10.1 also holds with $\varphi_y$ replaced by $\xi_y^{(n)}$ (n ≥ 2). Moreover by reproducing the proof of Lemma 6.1 using the bounds from Lemma 7.1 we obtain (for |x| ≤ L/2):
$$|r_1^{(-)}(x)| \le |K_1^{(-)} r_0(x)| + C\int_0^1 \big\|K_{1-s}^{(-)}/\xi_0^{(n)}\big\|_2\,\big\|\xi_y^{(n)} r_s\big\|_2\,ds \le C\varepsilon + 2C\varepsilon\int_0^1 \frac{C_n}{\sqrt{1-s}}\,e^{\gamma s}\,ds \le A\varepsilon, \tag{7.4}$$
where A depends on n but not on $p^*$ and
$$|r_1^{(+)}(x)| \le |K_1^{(+)} r_0(x)| + C\int_0^1 \big\|K_{1-s}^{(+)}/\xi_0^{(n)}\big\|_2\,\big\|\xi_y^{(n)} r_s\big\|_2\,ds \le e^{-(p^*)^2/2}\varepsilon + 2C\varepsilon\int_0^1 \frac{C_n e^{-(p^*)^2(1-s)/2}}{\sqrt{1-s}}\,e^{\gamma s}\,ds \le B(p^*)\varepsilon, \tag{7.5}$$
where $B(p^*) \to 0$ as $p^* \to \infty$. We choose $p^*$ so large that $B(p^*) < \frac12$. We next use a result of Cartwright (see [KT, Eq. (191)]): for all f in the Bernstein class $B_{R,2p^*}$ (see (7.3)), the following identity holds:
$$f(x) = \frac{\sin(8p^* x)}{32(p^*)^2}\sum_{n=-\infty}^{\infty} (-1)^n f(x_n)\,\frac{\sin(4p^*(x - x_n))}{(x - x_n)^2}, \tag{7.6}$$
where $x_n = \frac{n\pi}{8p^*}$.
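The sampling identity (7.6) can be tested numerically on a simple member of the Bernstein class. The sketch below is an illustration only: the function names, the test function $f(x) = \cos(2p^* x)$, the value $p^* = 5$ and the truncation of the series at $|n| \le n_{max}$ are our choices. It uses the fact that $\sin(8p^*(x - x_n)) = (-1)^n\sin(8p^* x)$, so each term of (7.6) can be written with a kernel depending only on $x - x_n$.

```python
import math

def cartwright_kernel(y, p):
    """sin(8 p y) sin(4 p y) / (32 p^2 y^2), extended by its limit 1 at y = 0."""
    if abs(y) < 1e-12:
        return 1.0
    return math.sin(8 * p * y) * math.sin(4 * p * y) / (32 * p * p * y * y)

def reconstruct(f, x, p, n_max):
    """Truncated right-hand side of (7.6) with nodes x_n = n pi / (8 p)."""
    total = 0.0
    for n in range(-n_max, n_max + 1):
        x_n = n * math.pi / (8 * p)
        total += f(x_n) * cartwright_kernel(x - x_n, p)
    return total

if __name__ == "__main__":
    p = 5.0
    f = lambda x: math.cos(2 * p * x)  # bounded, entire, |f(z)| <= exp(2p|Im z|)
    x = 0.3
    print(f(x), reconstruct(f, x, p, 20000))
```

The terms decay like $1/n^2$, so the truncation error is of order $1/n_{max}$; the point of (7.6) in the text is precisely that f on $Q_L$ is controlled by finitely many samples up to a small error.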
Let f, g be in $B_{R,2p^*}$. A simple application of Eq. (7.6) shows that
$$\|f - g\|_{L^\infty(Q_L)} \le C\sup_{|n| \le [4p^* L/\pi] + 4Cp^*/(\varepsilon\pi)} |f(x_n) - g(x_n)| + \frac14\varepsilon.$$
Hence, among all the functions in $B_{R_\omega,2p^*}$ that are bounded by Aε in $[-\frac12 L, \frac12 L]$ (by (7.4), $r_1^{(-)}$ is such a function), at most $(4A)^{Cp^* L}(4R_\omega/\varepsilon)^{Cp^*/\varepsilon}$ of them are ε/2-separated on $Q_L$. By taking a ball of diameter ε around each of them, and repeating the operation for each element of the original 2ε-cover, we get an ε-cover of $\Phi^1_\omega(A_\omega)|_{Q_L} = A_{\theta^1\omega}|_{Q_L}$. The number of elements in this cover is at most
$$(4A)^{Cp^* L}(4R_\omega/\varepsilon)^{Cp^*/\varepsilon}\,M_{2\varepsilon,Q_{L+C},\omega}.$$
The proof of Lemma 6.2 is complete.
8. Proof of Proposition 6.1

We follow Collet and Eckmann’s proof [CE2], which is itself an adaptation of standard proofs of existence of the topological entropy, see e.g. [KH] and references therein. The proof of Proposition 6.1 is based on the following inequalities:

Lemma 8.1. For all compacts Q, Q′, all m, n ∈ N and ε > ε′ > 0 one has
$$N_{\omega,n,\tau,Q,\varepsilon} \le N_{\omega,n,\tau,Q,\varepsilon'}, \tag{8.1}$$
$$N_{\omega,n,\tau,Q\cup Q',\varepsilon} \le N_{\omega,n,\tau,Q,\varepsilon}\,N_{\omega,n,\tau,Q',\varepsilon}, \tag{8.2}$$
$$N_{\omega,n+m,\tau,Q,\varepsilon} \le N_{\omega,n,\tau,Q,\varepsilon}\,N_{\theta^{n\tau}\omega,m,\tau,Q,\varepsilon}. \tag{8.3}$$
Furthermore for any τ′ < τ the following inequalities hold:
$$N_{\omega,n,\tau',Q_L,\varepsilon} \le N_{\omega,n,\tau,Q_{f(L)},g(\varepsilon)} \le N_{\omega,n,\tau',Q_{f(f(L))},g(g(\varepsilon))}, \tag{8.4}$$
where $f(L) = L + C(\tau + 1)\log\varepsilon^{-1}$ and $g(\varepsilon) = c\exp(-\gamma\tau)\varepsilon$ with C, c, γ some constants.

Lemma 8.1 implies immediately that the limit in Eq. (6.3) exists: by subadditivity (8.3) and by invariance of P under $\theta^t$, we get that
$$J_1 = \lim_{n\to\infty}\frac{1}{n\tau}\,\mathrm{E}\log N_{\omega,n,\tau,Q_L,\varepsilon}$$
exists, it is non-increasing in ε and by further subadditivity (8.2),
$$J_2 = \lim_{L\to\infty}\frac{1}{L^d}\,J_1$$
also exists and is non-increasing in ε (by (8.1)). Hence the limit in Eq. (6.3) exists. By (8.4), it is independent of τ.

Proof of Lemma 8.1. The inequality (8.1) is obvious from the definitions. We prove (8.2) by making the observation that if $\{A_1, \ldots, A_N\}$ is an (n, ε)-cover of $A_\omega|_Q$ and $\{B_1, \ldots, B_M\}$ an (n, ε)-cover of $A_\omega|_{Q'}$, then $\{A_j \cap B_k : j = 1, \ldots, N,\ k = 1, \ldots, M\}$ is an (n, ε)-cover of $A_\omega|_{Q\cup Q'}$.
Similarly if {A1 , . . . , AN } is an (n, ε)-cover of Aω |Q and {B1 , . . . , BM } an (m, ε)cover of Aθ nτ ω |Q , then {Aj ∩ $−nτ ω Bk : j = 1, . . . , N, k = 1, . . . , M} is an (m + n, ε)cover of Aω |Q which proves (8.3). The inequality (8.4) follows immediately from Lemma 6.1, since if D is a set of diameter g(ε) in the metric dω,n,τ,Qf (L) then D is a set of diameter at most ε in the metric dω,n,τ ,QL . Remark 8.1. The topology of L∞ (Q) is a simplifying choice (as far as Eq. (8.2) is concerned), but [CE3] shows that other topologies can be used as well. 9. Proof of Proposition 6.2 This proof is, like the proof of Proposition 6.1, based on subadditive bounds. We use wellknown properties of the function Hµ (·), see [KH], Chapter 4.3 (in particular Proposition 4.3.3). We recall that x → −x log x is concave, hence for any partition U and any t > 0, the following holds: −t Hµ $−t $ (U) P(dω) ≤ H (U)P(dω) = Hµ (U), µ ω ω where we have used Eq. (6.2). We thus have Hµ
n+m−1
=
k=0
Hµ
n−1
Hµ
k=n
≤
Hµ +
n−1 k=0
k=0
$−kτ ω (.θ kτ ω,ε ) P(dω)
$−kτ ω (.θ kτ ω,ε ) P(dω)
k=0
Hµ
n−1
+
k=0
$−kτ ω (.θ kτ ω,ε ) P(dω)
m−1 −kτ Hµ $−nτ $ (. kτ ω,ε ) P(dω )P(dω) θ ω ω k=0
≤
n−1
$−kτ ω (.θ kτ ω,ε )
m−1 Hµ $−nτ $−kτ ω θ nτ ω (.θ (k+n)τ ω,ε ) P(dω)
≤
$−kτ ω (.θ kτ ω,ε ) P(dω)
k=0 n+m−1
+
$−kτ ω (.θ kτ ω,ε ) P(dω)
Hµ
n−1 k=0
$−kτ ω (.θ kτ ω,ε )
P(dω) +
Hµ
m−1 k=0
$−kτ (. ) P(dω), kτ θ ω,ε ω
namely subadditivity in the time variable. We can prove subadditivity in the space variable in a similar way. Thus the first two limits in Eq. (6.4) exist. These limits are monotonically increasing as ε → 0, hence the third limit is well-defined.
ω,ε We next prove that the limit is independent of the choice of .ω,ε : let .ω,ε and . be two different sequences, we get (by the Rokhlin inequality) n−1 lim 1 lim 1 Hµ $−kτ ω T−x (.θ kτ Tx ω,ε ) L→∞ Ld n→∞ nτ x∈ Zd ∩QL
k=0
n−1 1 1 −kτ − lim d lim Hµ $ω T−x (.θ kτ Tx ω,ε ) L→∞ L n→∞ nτ x∈ Zd ∩QL
k=0
ω,ε ) + Hµ (. ω,ε |.ω,ε ), ≤ Hµ (.ω,ε |. and the r.h.s. above vanishes as ε → 0 since these sequences generate the whole sigmaalgebra of Aω in this limit. We prove that Eq. (6.4) is independent of τ by using Lemma 6.1 and an argument similar to the one used in Sect. 8. 10. Uniqueness of Solutions In this section, we provide details of the existence and uniqueness result for Eq. (4.2). First remark that the process t ζ (t) = e(t−s)L ξ dw(s) 0 m , hence in L2 for any δ, y. Moreover, is a well defined Gaussian stochastic process in Hul δ,y by construction, the nonlinearity in Eq. (4.3) is uniformly Lipschitz, hence local existence m follows by a contraction argument. It is immediate that the corresponding process in Hul m. ut has bounded moments in Hul 2 The uniqueness in Lδ,y space follows from the fact that bounded smooth functions are dense and the following
Lemma 10.1. The semi-flow $tω extends almost surely to a bounded continuous semiflow on L2δ,y for any δ > 0 and y ∈ Rd . Proof. We apply the non-propagation estimate of Ginibre and Velo [GV1]. Let u0 and v0 be two functions in L2δ,y and denote the corresponding solutions to Eq. (2.1) by ut and vt . Their difference ut − vt satisfies (almost surely) the following inequality: 1 √ 1 √ ∂t ϕδ,y (ut − vt )22 ≤ (1 + 1 + α 2 ) ϕδ,y (ut − vt )22 2 2 −Re (1 + iβ) ϕδ,y (ut − v t ) |ut |2q ut − |vt |2q vt . By [GV1] (Proposition 3.1), Hypothesis 2.1 implies that the last term above is negative. We thus get an estimate of the form ut − vt L2 ≤ exp(ct)u0 − v0 L2 δ,y
δ,y
This and Lemma 4.2 prove that $\Phi^t_\omega$ is uniformly bounded and continuous on $L^2_{\delta,y}$ for any δ > 0 and $y \in \mathbb{R}^d$ if we define $u_t = \lim_{n\to\infty} u_t^{(n)}$, where $u_0^{(n)}$ is a Cauchy sequence of bounded functions approaching $u_0$.
11. Compact Embedding for Local Spaces In this section, we give a proof of Relation (2.7) which is a trivial adaptation of [Ad], Theorem 6.53, p.174. More precisely we prove the embedding (2.7) to be Hilbert– m+k Schmidt. Let {en }n∈N be a complete orthonormal basis of Hδ,y . Let {Qn }n∈N be a d countable cover of R by balls of radius 1. Let x ∈ Qn , let α ≤ m and define the m+k by bounded linear operator Dxα on Hδ,y Dxα (u) = ∇ α u(x). Its norm is (by Sobolev embedding) bounded by Dxα (u)2 m+k ≤ max sup |∇ α u(x)|2 ≤ Hδ,y
0≤α≤m x∈Qn
C u2 m+k . Hδ,y inf x∈Qn ϕδ,y (x)
By Riesz’ Lemma, Dxα (·) = (vxα , ·)Hm+k for some vector vxα and δ,y
∞
|∇ α en (x)|2 =
n=1
∞
n=1
|(en , vxα )Hm+k |2 = vxα 2 m+k . δ,y
Hδ,y
Thus the Hilbert–Schmidt norm of the embedding map is ∞
n=1
en 2Hm δ ,y
=
d α≤m R ∞
≤m which is finite whenever δ > δ.
vxα 2 m+k ϕδ ,y (x) dx
n=1 Qn
Hδ ,y
Cϕδ ,y (x) dx, inf z∈Qn ϕδ,y (z)
12. Proof of Lemma 4.3 The proof can be found in [BGO, Mi] and is summarised below. We decompose the plane into countably many sets Q(m, n) of unit area and use the bounds ϕδ,y (x) ≤ exp(−δ|x − y|) ≤ eϕδ,y (x). For simplicity we assume δ = 1 and we drop it from our notation (if Lemma 4.3 is true for δ = 1 then it is true for all δ > 0 by scaling, possibly with different constants). We simply write D f for D f (x) dx for D ⊂ R2 . We have R2
ϕy |(|f |2q f )f | ≤ C
m,n
e−|n|
Q(m,n)
|f ||f |2q−1 |f ||f | + |∇f |2 , (12.1)
# where m Q(m, n) ⊃ {x ∈ R2 : n − 21 ≤ |x − y| ≤ n + 21 }. We estimate each summand using Hölder and Gagliardo–Nirenberg inequalities. For any p, r with p−1 + r −1 = 1
and in particular for r = 1 + 1/q and p = 1 + q, we get: |f ||f |2q−1 |f ||f | + |∇f |2 Q(m,n)
2q 2q−1 ≤ c1 f 2p f 2pq/(p−1) f 2p + f 2pq/(p−1) ∇f 24pq/(p+q−1) 2q 2q−1 1/2 1/2 2 ≤ c2 f 2p f 2pq/(p−1) f 2p + f 2pq/(p−1) f 2pq/(p−1) f 2p 2q
= c3 f 22p f 2qr 2(2q+2)/(2q+3)
≤ c4 ∇ 3 f 2
2(q+1/(2q+3))
f 2(q+1)
4q 2 +6q+2
≤ K −1 ∇ 3 f 22 + c5 Kf 2(q+1)
.
By summing up all contribution to (12.1) we arrive at ϕy |(|f |2q f )f | 2 R
≤ CK −1 e−|n| |∇ 3 f |2 + C K e−|n| Q(m,n)
m,n
−1 ≤ CK −1 = CK
R2
R2
ϕy |∇ 3 f |2 + C K
m,n
ne−|n| sup
n
ϕy |∇ 3 f |2 + C K sup
which proves Lemma 4.3.
y
R2
y
R2
Q(m,n)
|f |2(q+1)
ϕy |f |2(q+1)
ϕy |f |2(q+1)
η
η
η
,
Acknowledgements. This work was supported by the Fonds National Suisse. I am grateful to Martin Hairer, Sergei Kuksin and Armen Shirikyan for their comments and suggestions.
References [Ad] Adams, R.A.: Sobolev Spaces. New York: Academic Press, 1975 [Ar] Arnold, L.: Random Dynamical Systems. Berlin–Heidelberg: Springer, 1998 [BGO] Bartucelli, M.V., Gibbon, J.D., Oliver, M.: Length scales in solutions of the complex Ginzburg– Landau equation. Physica D 89, 267–286 (1996) [BKL] Bricmont, J., Kupiainen, A., Lefevere, R.: Ergodicity of the 2D Navier–Stokes Equations with Random Forcing. Commun. Math. Phys. 224, 65–81 (2001) [C1] Collet, P.: Thermodynamic limit of the Ginzburg–Landau equations. Nonlinearity 7, 1175–1190 (1994) [C2] Collet, P.: Extended Dynamical Systems, Doc. Math. Extra Volume ICM III (1998), 123–132. [CDF] Crauel, H., Debussche, A., Flandoli, F.: Random Attractors. J. Dyn. Diff. Equ. 9, 307–341 (1997) [CE1] Collet, P., Eckmann, J.-P.: Extensive Properties of the Complex Ginzburg–Landau Equation, Commun. Math. Phys. 200, 699–722 (1999) [CE2] Collet, P., Eckmann, J.-P.: The definition and measurement of the topological entropy per unit volume in parabolic PDEs. Nonlinearity 12, 451–473 (1999) [CE3] Collet, P., Eckmann, J.-P.: Topological entropy and ε–entropy for damped hyperbolic equations. Ann. Henri Poincaré 1, 715–752 (2000) [D] Debussche, A.: Hausdorff Dimension of a Random Invariant Set. J. Math. Pures Appl. 77, 967–988 (1998) [DZ1] Da Prato, G., Zabczyk, J.: Stochastic equations in infinite dimensions. Cambridge: Cambridge University Press, 1992
[DZ2]
Da Prato, G., Zabczyk, J.: Ergodicity for infinite-dimensional systems. Cambridge: Cambridge University Press, 1996 [EH] Eckmann, J.-P., Hairer, M.: Invariant Measures for Stochastic PDE’s on Unbounded Domains. Nonlinearity 14, 133–151 (2001) [F1] Funaki, T.: The Reversible measures of Multi-Dimensional Ginzburg–Landau Type Continuum Model. Osaka J. Math. 28, 463–494 (1991) [F2] Funaki T.: Regularity Properties for Stochastic Partial Differential Equations of Parabolic Type. Osaka J. Math. 28, 495–516 (1991) [FM] Flandoli, F., Maslowski, B.: Ergodicity of the 2–D Navier–Stokes equation under random perturbations. Commun. Math. Phys. 172, 119–141 (1995) [GV1] Ginibre, J., Velo G.: The Cauchy Problem in Local Spaces for the Complex Ginzburg–Landau Equation I. Compactness Methods. Physica D 95, 191–228 (1996) [GV2] Ginibre, J., Velo, G.: The Cauchy Problem in Local Spaces for the Complex Ginzburg–Landau Equation II. Contraction Methods. Commun. Math. Phys. 187, 45–79 (1997) [KH] Katok, A., Hasselblatt, B.: Introduction to the Modern Theory of Dynamical Systems. Cambridge: Cambridge University Press, 1995 [Kr] Krylov, N.V.: An analytic approach to SPDEs. In: Stochastic partial differential equations: six perspectives. Carmona, R.A. and Rozovskii, B., eds.. Providence, RI: Am. Math. Soc., 1999 [KS] Kuksin, S.B., Shirikyan, A.: Stochastic Dissipative PDEs and Gibbs Measures. Commun. Math. Phys. 213, 291–330 (2000) [KT] Kolmogorov, A.N., Tikhomirov, V.M.: ε-entropy and ε-capacity of sets in functional spaces. In: Selected Works of Kolmogorov, A.N., Vol III, Shiryayev, A.N., ed.. Dordrecht: Kluwer, 1993. [Ku] Kuksin, S.B.: Stochastic Nonlinear Schrödinger Equation. 1. A priori Estimates. Proc. Steklov Inst. Math. 225, 219–242 (1999) [LO] Levermore, C.D., Oliver, M.: The complex Ginzburg–Landau equation as a model problem. In: Dynamical systems and probabilistic methods in partial differential equations. Deift, P. et al., eds. Providence, RI: Am. Math. 
Soc., 1996 [LQ] Liu, P.-D., Qian, M.: Smooth Ergodic Theory of Random Dynamical Systems. Lecture Notes in Mathematics, 1606. Berlin–Heidelberg, Springer, 1995 [Ma] Mattingly, J.C.: Ergodicity of 2D Navier–Stokes equations with random noise and large viscosity. Commun. Math Phys. 206, 273–288 (1999) [Mi] Mielke, A.: Bounds for the solutions of the complex Ginzburg–Landau equation in terms of the dispersion parameters. Physica D 117, 106–116 (1998) [Ro] Rougemont, J.: ε–Entropy Estimates for Driven Parabolic Equations. Preprint (2000) [Ru] Ruelle, D.: Large Volume Limit of the Distribution of Characteristic Exponents in Turbulence. Commun. Math. Phys. 87, 287–302 (1982) [S] Sinai, Ya.G.: Two Results Concerning Asymptotic Behaviour of Solutions of the Burgers Equation. J. Statist. Phys. 64, 1–12 (1991) [VF] Vishik, M.J., Fursikov, A.V.: Mathematical Problems of Statistical Hydromechanics. Dordrecht: Kluwer, 1988 [Y1] Yosida, K.: Functional Analysis, Sixth edition. Berlin–New York: Springer, 1980 [Y2] Young, L.-S.: Ergodic Theory of Chaotic Dynamical Systems. In: XIIIth International Congress of Mathematical Physics (ICMP’97), Brisbane. Cambridge, MA: Internat. Press, 1999 Communicated by Ya. G. Sinai
Commun. Math. Phys. 225, 449 – 450 (2002)
Erratum
Monotonicity of Optimal Transportation and the FKG and Related Inequalities
Luis A. Caffarelli
Department of Mathematics, RLM 8.000, C1200, University of Texas at Austin, Austin, TX 78712-1082, USA. E-mail: [email protected]
Received: 3 July 2001 / Accepted: 5 July 2001
Commun. Math. Phys. 214, 547–563 (2000)
It was pointed out to us by Gilles Harge, that the proof of Theorem 11, p. 559 was incomplete, since we prove there only that δϕ ≤ 2h2 and thus Dαα ϕ ≤ 2 . We now complete the proof: We change the formula on line 11, p. 560 to (the correct one) h δϕ = ∇ϕ(x0 + te) − ∇ϕ(x0 − te), e dt . (∗) 0
We first plug the information ∇ϕ(x0 + te) − ∇ϕ(x0 − te), e ≤ 2λ ≤ 2h (from convexity along the x0 + te line) and we get δϕ ≤ 2h2 , and thus Dαα ϕ ≤ 2. We now have the extra information that 0 ≤ ϕαα ≤ 2 . More generally, suppose we know that 0 ≤ ϕαα ≤ a0 for some a0 > 1. We plug that information in the formula (∗), above, and get, for any 0 ≤ t ≤ h ∇ϕ(x0 + te) − ∇ϕ(x0 − te), e ≤ min(2h, 2α0 t) .
450
L. A. Caffarelli
Thus, by integration along the segment we get

$$\delta\varphi \le 2\left[ h^2\left(1 - \frac{1}{a_0}\right) + \frac{h^2}{2a_0} \right] = h^2\,\frac{2a_0 - 1}{a_0} = h^2 a_1 < h^2 a_0.$$

Thus, $\varphi_{\alpha\alpha} \le a_1 < a_0$. Starting with $a_0 = 2$ and repeating the argument infinitely many times, we end up proving that $\delta\varphi \le h^2$, since 1 is the unique solution of

$$\frac{2a - 1}{a} = 1.$$

This completes the proof.

Communicated by J. L. Lebowitz
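The bootstrap in the erratum above replaces a bound $a$ on $\varphi_{\alpha\alpha}$ by the improved bound $(2a - 1)/a$. The following sketch, added here purely as an illustration and not taken from the erratum, iterates that map in exact rational arithmetic starting from $a_0 = 2$. The function names `improve` and `iterate_bounds` are hypothetical.

```python
from fractions import Fraction


def improve(a):
    """One pass of the argument: a bound `a` yields the new bound (2a - 1)/a."""
    return (2 * a - 1) / a


def iterate_bounds(a0, steps):
    """Return [a_0, a_1, ..., a_steps] obtained by repeating `improve`."""
    bounds = [Fraction(a0)]
    for _ in range(steps):
        bounds.append(improve(bounds[-1]))
    return bounds


if __name__ == "__main__":
    # Print the successive bounds as exact fractions and as floats.
    for k, a in enumerate(iterate_bounds(2, 8)):
        print(k, a, float(a))
```

Starting from $a_0 = 2$ the iterates are $a_k = (k+2)/(k+1)$, which decrease strictly and approach 1 without reaching it, in line with the claim that only the limiting bound $\delta\varphi \le h^2$ is obtained.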
Commun. Math. Phys. 225, 451 – 452 (2002)
Erratum
Energy Correlations in O(N) Models and the Wolff Representation Michael Campbell Department of Mathematics, University of California, Irvine, CA 92697-3875, USA. E-mail:
[email protected] Received: 27 September 2001 / Accepted: 27 September 2001 Commun. Math. Phys. 218, 99–111 (2001)
1. Introduction In this erratum, an error in [1] is pointed out. 1. The proof of Lemma 1 in [1] is not correct. A high temperature expansion for a simple 4-site graph ([2] and the author) shows inconsistencies that point to an error. ˜ m ) + (1/2) The problem in Lemma 1 is the association above (25) of (1/2)δ( ˜ m = π) with δ(Zm = 0). An approximation to δ( ˜ m = 0) is to replace it with δ( ˜ m = Zm /ym , exp[−λ(arctan(Zm /ym ))2 ]/ dsm exp[−λ(arctan(Zm /ym ))2 ]. Since tan ˜ m = π ) + (1/3)δ( ˜ m = 2π ) ˜ m = 0) + (1/3)δ( clearly this will converge to (1/3)δ( as λ → ∞. However, in the new coordinates of (25), the above approximation does not converge to δ(Zm = 0) in (27), which ends up being a uniform measure in the xm -ym plane. Although the maximum of arctan(Zm /ym ) is uniformly distributed in the xm -ym plane (at Zm = 0), the mass is not. Geometrically this can be established by looking at the surface Zm /ym = constant, which is a plane. Note this plane has the most mass between it and the xm -ym plane when ym = 1 and the least when ym = 0. Thus the approximation to the delta function will not converge to a uniform measure in the coordinates of (25). The δ(Zm = 0) should be removed from (27) and replaced with the correct limit. 2. Theorem 2 in [1] relies upon the assumptions of Lemma 1 mentioned above. So it does not hold, and part (ii) of Theorem 2 is incorrect. However, a slight modification does show that if the inductive assumption that (ii) holds for the O(N −1) model is made, then (ii) does hold for the O(N ) model if we replace all dot products in (ii) si1 sj1 + · · · + siN sjN with the first N − 1 terms: si1 sj1 + · · · + siN−1 sjN−1 . In effect Theorem 2 says that if it is inductively assumed that (ii) holds for O(N − 1), then any subset of the same N − 1 terms in the O(N ) model will also satisfy (ii) by a direct application of the strong-FKG property. If it is assumed (i) and (ii) hold for the O(N − 1) model, then part (i) holds exactly as stated.
3. All other results in [1] are correct for O(N) under the assumption that (i) and (ii) of Theorem 2 hold for the O(N − 1) model. Hence if an inductive approach is taken towards proving (i) and (ii), then there are some potentially useful tools available. Namely, if it is assumed that (i) and (ii) hold for the O(N − 1) model, then the strong-FKG property can be used in the O(N) model.

References
1. Campbell, M.: Energy Correlations in O(N) models and the Wolff Representation. Commun. Math. Phys. 218, 99–111 (2001)
2. Hara, T. and Sokal, A.: Private communication

Communicated by J. L. Lebowitz
Commun. Math. Phys. 225, 453 – 463 (2002)
Mean-Field Criticality for Percolation on Planar Non-Amenable Graphs Roberto H. Schonmann Department of Mathematics, University of California at Los Angeles, Los Angeles, CA 90095, USA. E-mail:
[email protected] Received: 4 April 2001 / Accepted: 4 October 2001
Work partially supported by the N.S.F. through grant DMS-0071766 and by a Guggenheim Foundation fellowship.

Abstract: The critical exponents β, γ, δ and Δ are proved to exist and to take their mean-field values for independent percolation on the following classes of infinite, locally finite, connected transitive graphs: (1) Non-amenable planar with one end. (2) Unimodular with infinitely many ends.

1. Introduction

1.1. Results. A great deal of attention has been given recently to the study of statistical mechanics and related systems on various classes of graphs. The reader is invited to consult [Lyo] and [Sch] for introductions to the subject, and for references to the literature. This paper can be seen as a continuation of [Sch], and we refer the reader to that paper for background and motivation. Basic terminology, definitions and notation will be reviewed later in this introduction.

We will consider independent bond percolation on an infinite, locally finite, connected transitive graph G = (V, E). Results similar to the ones presented here hold also for independent site percolation, with similar proofs. The same remark can be made about the extension from transitive to quasi-transitive graphs.

Conjecture 1.2 in [Sch], combined with Conjecture 6 in [BS1] (reproduced as Conjecture 1.1 in [Sch]), state that if the graph is non-amenable, critical exponents exist and take their mean-field values. In [Sch], Theorem 1.1, this was proved for various critical exponents in the case in which the graph is unimodular and the edge-isoperimetric constant (Cheeger constant) is a sufficiently large fraction of the degree of the graph (previously, a special case had been handled in [Wu]). Here we prove similar results in two cases.

Theorem 1.1. For independent bond percolation on the following classes of infinite, locally finite, connected transitive graphs the critical exponents β, γ, δ and Δ exist and take their mean-field values:
(i) Graphs which are planar non-amenable and have one end.
(ii) Graphs which are unimodular and have infinitely many ends.

At the end of the next subsection, after introducing the necessary notation, we review the meaning of the exponents addressed in this theorem and recall what their mean-field values are.

It is worth pointing out that while Theorem 1.1 in [Sch] was proved by verifying the triangle condition of [AN] (or, more precisely, the open triangle condition of [BA]), in the present paper we will follow a somewhat different route, based nevertheless also on the work of [AN, BA], and [Ngu]. We do not know whether the triangle condition holds in the cases treated here.

The fact that there is no percolation at the critical point, which is a feature of mean-field criticality, is known to hold for independent percolation on any infinite, locally finite, connected transitive unimodular graph. This was proved in [BLPS1], and a simpler proof was provided in [BLPS2]. Unfortunately, the methods from these papers do not provide information on critical exponents.

Part (i) of Theorem 1.1 is the main contribution in this paper. This is one more instance in which the extra techniques resulting from planarity allow one to prove that results on percolation conjectured to hold with greater generality are true at least in the planar case. In the classical study of percolation (and other statistical mechanics processes) on transitive amenable graphs, and especially on the graphs $\mathbb{Z}^d$, this is a well known fact: planarity allows one to make much faster progress, and much more has been proved in the case of $\mathbb{Z}^2$ than in the more general case of $\mathbb{Z}^d$ (see, e.g., [Gri]). In the context of percolation on transitive non-amenable graphs, a similar pattern has been followed. The paper [Lal1] anticipated for certain transitive non-amenable planar graphs some of the results which would later be proved for more general transitive non-amenable graphs.
The study of percolation on transitive non-amenable planar graphs was later greatly developed in the papers [Lal2] and [BS2]. For instance, the fundamental Conjecture 6 in [BS1], which states that for independent bond or site percolation on transitive non-amenable graphs there is always a regime with infinitely many infinite clusters, was proved to hold under the extra assumption of planarity.

In contrast to Theorem 1.1(i), independent percolation on transitive amenable planar graphs with one end is expected to have critical exponents with non-mean-field values. The case in which the graph is $\mathbb{Z}^2$ is extensively discussed in [Gri]. In the case of site percolation on the triangular lattice, various critical exponents have recently been proved to indeed take their conjectured, non-mean-field, values. This is a result of the rapid progress on conformal invariance, in combination with earlier work by H. Kesten relating various critical exponents in the two dimensional case (see [LSW, SW] and references therein).

1.2. Terminology and notation. We will consider independent bond percolation on an infinite, locally finite, connected graph G = (V, E), where V is the set of vertices (sites) and E is the set of edges (bonds). A site $r \in V$ will be singled out and denoted the root of G. The cardinality of a set $S \subset V$ will be denoted by $|S|$. The edge boundary of a set $S \subset V$ is $\partial_E S = \{\{x, y\} \in E : x \in S,\ y \in S^c\}$ and its inner vertex boundary is $\partial_{\mathrm{in}} S = \{x \in S : \{x, y\} \in \partial_E S \text{ for some } y \in S^c\}$. The edge-isoperimetric constant (Cheeger constant) of G is defined as

$$i_E(G) = \inf\left\{ \frac{|\partial_E S|}{|S|} : S \subset V,\ 0 < |S| < \infty \right\}.$$
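To get a feel for the edge-isoperimetric constant, the following sketch (an illustration added here, not part of the original argument) computes the ratio $|\partial_E S| / |S|$ for balls $S$ in the 3-regular tree and in $\mathbb{Z}^2$. Since $i_E(G)$ is an infimum over all finite sets, these ratios are only upper bounds for it; the point is merely that they stay above 1 on the tree while they tend to 0 on $\mathbb{Z}^2$. All function names and the encoding of tree vertices as tuples are assumptions made for this example.

```python
from fractions import Fraction


def ball(root, neighbors, radius):
    """Vertices within graph distance `radius` of `root` (breadth-first search)."""
    seen = {root}
    frontier = [root]
    for _ in range(radius):
        next_frontier = []
        for v in frontier:
            for w in neighbors(v):
                if w not in seen:
                    seen.add(w)
                    next_frontier.append(w)
        frontier = next_frontier
    return seen


def boundary_ratio(S, neighbors):
    """|edge boundary of S| / |S| for a finite vertex set S."""
    # Each boundary edge has exactly one endpoint in S, so it is counted once.
    boundary_edges = sum(1 for v in S for w in neighbors(v) if w not in S)
    return Fraction(boundary_edges, len(S))


def tree_neighbors(v):
    """3-regular tree: the root () has three children, any other vertex
    has its parent and two children. Vertices are tuples of labels."""
    if v == ():
        return [(0,), (1,), (2,)]
    return [v[:-1], v + (0,), v + (1,)]


def z2_neighbors(v):
    """Nearest neighbours in the square lattice Z^2."""
    x, y = v
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]


if __name__ == "__main__":
    for n in (1, 3, 5, 10):
        t = boundary_ratio(ball((), tree_neighbors, n), tree_neighbors)
        z = boundary_ratio(ball((0, 0), z2_neighbors, n), z2_neighbors)
        print(n, float(t), float(z))
```

For balls of radius $n$ the tree ratio equals $3 \cdot 2^n / (3 \cdot 2^n - 2)$, while the $\mathbb{Z}^2$ ratio equals $(8n + 4)/(2n^2 + 2n + 1)$.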
G is said to be amenable in case $i_E(G) = 0$. The number of ends of the graph G is

$$E(G) = \sup_{S \subset V,\ |S| < \infty} \{\text{number of infinite connected components of } G \setminus S\}.$$

[…] 0}. From the methods of [AB], we know that for quasi-transitive graphs

$$p_c = \sup\{p \in [0, 1] : \chi(p) < \infty\}.$$

The threshold for uniqueness of the infinite cluster is

$$p_u = \inf\{p \in [0, 1] : P_p(\text{there is a unique infinite cluster}) = 1\}.$$

In order to define the critical exponent δ, we introduce a "ghost field". Each site is painted green, independently of anything else, with probability q. $P_{p,q}$ will denote the corresponding probability measure in this enlarged probability space, and $E_{p,q}$ will be the corresponding expectation. The random set of green sites will be denoted by Q. One defines $\theta(p, q) = P_{p,q}(r \leftrightarrow Q)$, and

$$\chi(p, q) = E_{p,q}\big(|C(r)|;\ C(r) \cap Q = \emptyset\big) = \sum_{x \in V} P_{p,q}(r \leftrightarrow x,\ r \not\leftrightarrow Q).$$

Next we review what is meant by saying that each one of the critical exponents which appears in Theorem 1.1 exists and takes its mean-field value. The labels on the left indicate the way one usually refers to each statement, and provide the corresponding mean-field value of each critical exponent:

[γ = 1]  $C_1 (p_c - p)^{-1} \le \chi(p) \le C_2 (p_c - p)^{-1}$, for $p < p_c$,
[β = 1]  $C_1 (p - p_c)^{1} \le \theta(p) \le C_2 (p - p_c)^{1}$, for $p > p_c$,
[δ = 2]  $C_1 q^{1/2} \le \theta(p_c, q) \le C_2 q^{1/2}$, for $q > 0$,
[Δ = 2]  for $m = 1, 2, \ldots$: $C_1 (p_c - p)^{-2} \le E_p(|C(r)|^{m+1}) / E_p(|C(r)|^{m}) \le C_2 (p_c - p)^{-2}$, for $p < p_c$,
where in each case $C_1, C_2 \in (0, \infty)$.

2. Sufficient Conditions for Mean-Field Criticality

From the arguments in [AN] (modified in the fashion of Sect. 3.1 of [BA]) and [Ngu], we have:

Lemma 2.1.A. Suppose that G = (V, E) is an infinite, locally finite, connected transitive unimodular graph such that $p_c < 1$. Suppose also that there are $\epsilon, c > 0$ and sites $x_1, x_2 \in V$ such that for every $p \in (p_c - \epsilon, p_c)$,

$$\sum_{z_1, z_2 \in V} P_p(x_1 \leftrightarrow z_1,\ x_2 \leftrightarrow z_2,\ x_1 \not\leftrightarrow x_2) \ge c\,(\chi(p))^2.$$

Then γ = 1 and Δ = 2.

From the arguments in [BA] and [New] we have:

Lemma 2.1.B. Suppose that G = (V, E) is an infinite, locally finite, connected transitive unimodular graph such that $p_c < 1$. Suppose also that there are $\epsilon, c > 0$ and sites $x_1, x_2, x_3 \in V$ such that for every $p \in (p_c - \epsilon, p_c)$ and $q \in (0, \epsilon)$,

$$\sum_{z \in V} P_{p,q}(x_1 \leftrightarrow z,\ x_1 \not\leftrightarrow Q,\ x_2 \leftrightarrow Q,\ x_3 \leftrightarrow Q,\ x_2 \not\leftrightarrow x_3) \ge c\,\chi(p, q)\,(\theta(p, q))^2.$$

Then δ = 2 and β = 1.

The role of unimodularity in the derivation of the two lemmas above is explained in Section 3.2 of [Sch]. In the remainder of this section, we will reduce the lemmas above to further sufficient conditions for statements of mean-field criticality. The reader can either study these lemmas in the order in which they will be presented, or alternatively, study first the lemmas labeled with "A", which refer to the exponents γ and Δ, and later study the lemmas labeled with "B", which refer to the exponents δ and β, and which have more involved proofs.

Lemma 2.2.A. Suppose that G = (V, E) is an infinite, locally finite, connected transitive unimodular graph. Suppose also that there are $\epsilon, c > 0$, disjoint sets of sites $V_1, V_2 \subset V$ and sites $x_1 \in V_1$, $x_2 \in V_2$ such that for every $p \in (p_c - \epsilon, p_c)$,

$$P_p(V_1 \leftrightarrow V_2) \le 1 - c,$$

$$\sum_{z \in V_i} P_p\big(x_i \overset{V_i}{\longleftrightarrow} z\big) \ge c\,\chi(p) \qquad (i = 1, 2).$$

Then γ = 1 and Δ = 2.
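Proof. For i = 1, 2, the events $\{x_i \overset{V_i}{\longleftrightarrow} z\}$ depend only on the state of occupancy of the edges which have both endpoints in $V_i$, while the event $\{V_1 \not\leftrightarrow V_2\}$ depends only on the state of occupancy of the other edges. Therefore, by independence,

$$\begin{aligned}
\sum_{z_1, z_2 \in V} P_p(x_1 \leftrightarrow z_1,\ x_2 \leftrightarrow z_2,\ x_1 \not\leftrightarrow x_2)
&\ge \sum_{z_1, z_2 \in V} P_p\big(x_1 \overset{V_1}{\longleftrightarrow} z_1,\ x_2 \overset{V_2}{\longleftrightarrow} z_2,\ V_1 \not\leftrightarrow V_2\big) \\
&= \sum_{z_1, z_2 \in V} P_p\big(x_1 \overset{V_1}{\longleftrightarrow} z_1\big)\, P_p\big(x_2 \overset{V_2}{\longleftrightarrow} z_2\big)\, P_p(V_1 \not\leftrightarrow V_2) \\
&= \Big(\sum_{z_1 \in V_1} P_p\big(x_1 \overset{V_1}{\longleftrightarrow} z_1\big)\Big) \Big(\sum_{z_2 \in V_2} P_p\big(x_2 \overset{V_2}{\longleftrightarrow} z_2\big)\Big)\, P_p(V_1 \not\leftrightarrow V_2) \\
&\ge c^3 (\chi(p))^2.
\end{aligned}$$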
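And the claim follows from Lemma 2.1.A. (The hypothesis in that lemma that $p_c < 1$ must hold, since otherwise $P_p(V_1 \leftrightarrow V_2) \to 1$, as $p \nearrow p_c$.)

Lemma 2.2.B. Suppose that G = (V, E) is an infinite, locally finite, connected transitive unimodular graph. Suppose also that there are $\epsilon, c > 0$, disjoint sets of sites $V_1, V_2, V_3 \subset V$ and sites $x_1 \in V_1$, $x_2 \in V_2$, $x_3 \in V_3$ such that for every $p \in (p_c - \epsilon, p_c)$ and $q \in (0, \epsilon)$,

$$P_{p,q}(V_i \leftrightarrow V_j) \le 1 - c \qquad (i \ne j),$$

$$\sum_{z \in V_1} P_{p,q}\big(x_1 \overset{V_1}{\longleftrightarrow} z,\ x_1 \not\leftrightarrow Q\big) \ge c\,\chi(p, q),$$

$$P_{p,q}\big(x_i \overset{V_i}{\longleftrightarrow} Q\big) \ge c\,\theta(p, q) \qquad (i = 2, 3).$$

Then δ = 2 and β = 1.

Proof. Set

$$A_1^z = \{x_1 \overset{V_1}{\longleftrightarrow} z\}, \qquad \tilde{A}_1 = \{x_1 \not\leftrightarrow Q\}, \qquad \tilde{A}_1^z = \{x_1 \overset{V_1}{\longleftrightarrow} z,\ x_1 \not\leftrightarrow Q\},$$

$$A_2 = \{x_2 \overset{V_2}{\longleftrightarrow} Q\}, \qquad A_3 = \{x_3 \overset{V_3}{\longleftrightarrow} Q\},$$

$$B = \{V_1 \not\leftrightarrow V_2,\ V_2 \not\leftrightarrow V_3,\ V_3 \not\leftrightarrow V_1\}.$$

For each set $E' \subset E$, we will denote by $\mathcal{F}_{E'}$ the σ-field generated by the state of occupancy of the edges in $E'$. Let $E_1 = \{\{u, v\} \in E : u, v \in V_1\}$, and let $(E_1^k)_{k \ge 1}$ be an increasing sequence of subsets of $E_1$ which converges to this set (i.e., $\cup_k E_1^k = E_1$). For any k, and any configuration $\omega_1 \in \{0, 1\}^{E_1^k}$, the set of configurations in $\{0, 1\}^{E \setminus E_1^k} \times \{0, 1\}^V$ which in combination with $\omega_1$ produce a configuration in $\tilde{A}_1$ is a decreasing set. Similarly for B. Therefore, by Harris' inequality,

$$P_{p,q}(\tilde{A}_1 B \mid \mathcal{F}_{E_1^k}) \ge P_{p,q}(\tilde{A}_1 \mid \mathcal{F}_{E_1^k})\, P_{p,q}(B \mid \mathcal{F}_{E_1^k}) = P_{p,q}(\tilde{A}_1 \mid \mathcal{F}_{E_1^k})\, P_{p,q}(B),$$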
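where in the last step we used the fact that B depends only on the state of occupancy of the edges which have at least one endpoint in $(V_1)^c$ and therefore is independent of $\mathcal{F}_{E_1^k}$. Letting $k \to \infty$, and using (5.9) on p. 264 of [Dur], yields

$$P_{p,q}(\tilde{A}_1 B \mid \mathcal{F}_{E_1}) \ge P_{p,q}(\tilde{A}_1 \mid \mathcal{F}_{E_1})\, P_{p,q}(B).$$

Integration over $A_1^z \in \mathcal{F}_{E_1}$ yields now

$$P_{p,q}(\tilde{A}_1^z B) = P_{p,q}(\tilde{A}_1 A_1^z B) \ge P_{p,q}(\tilde{A}_1 A_1^z)\, P_{p,q}(B) = P_{p,q}(\tilde{A}_1^z)\, P_{p,q}(B).$$

Therefore,

$$\begin{aligned}
\sum_{z \in V} P_{p,q}(x_1 \leftrightarrow z,\ x_1 \not\leftrightarrow Q,\ x_2 \leftrightarrow Q,\ x_3 \leftrightarrow Q,\ x_2 \not\leftrightarrow x_3)
&\ge \sum_{z \in V_1} P_{p,q}(\tilde{A}_1^z A_2 A_3 B) \\
&= \sum_{z \in V_1} P_{p,q}(\tilde{A}_1^z B)\, P_{p,q}(A_2)\, P_{p,q}(A_3) \\
&\ge \sum_{z \in V_1} P_{p,q}(\tilde{A}_1^z)\, P_{p,q}(A_2)\, P_{p,q}(A_3)\, P_{p,q}(B) \\
&\ge c^6\, \chi(p, q)\, (\theta(p, q))^2.
\end{aligned}$$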
In the second step above we used the fact that $\tilde{A}_1^z B$ depends only on the state of occupancy of the edges which have at least one endpoint in $(V_2 \cup V_3)^c$ and on the state (green or not) of the vertices in $(V_2 \cup V_3)^c$, while, for i = 2, 3, $A_i$ depends only on the state of occupancy of the edges which have both endpoints in $V_i$ and on the state (green or not) of the vertices in $V_i$. In the last step above we used Harris' inequality to obtain $P_{p,q}(B) \ge c^3$. The claim follows now from Lemma 2.1.B. (The hypothesis in that lemma that $p_c < 1$ must hold, since otherwise $P_{p,q}(V_i \leftrightarrow V_j) \to 1$, as $p \nearrow p_c$.)

Lemma 2.3.A. Suppose that G = (V, E) is an infinite, locally finite, connected transitive unimodular graph. Suppose also that there are disjoint sets of sites $V_1, V_2 \subset V$ and sites $x_1 \in V_1$, $x_2 \in V_2$ such that

$$P_{p_c}(V_1 \leftrightarrow V_2) < 1,$$

$$\sum_{v \in \partial_{\mathrm{in}} V_i} P_{p_c}(x_i \leftrightarrow v) < 1 \qquad (i = 1, 2).$$

Then γ = 1 and Δ = 2.

Proof. We will verify the hypothesis of Lemma 2.2.A with $\epsilon = p_c$ and

$$c = \min\Big\{1 - P_{p_c}(V_1 \leftrightarrow V_2),\ 1 - \sum_{v \in \partial_{\mathrm{in}} V_1} P_{p_c}(x_1 \leftrightarrow v),\ 1 - \sum_{v \in \partial_{\mathrm{in}} V_2} P_{p_c}(x_2 \leftrightarrow v)\Big\}.$$

By monotonicity in p, only the second display in the hypothesis of Lemma 2.2.A requires any non-trivial argumentation. To verify it, we note that if $\{x_i \leftrightarrow z\}$ occurs, then either $\{x_i \overset{V_i}{\longleftrightarrow} z\}$ occurs, or else there is some vertex $v \in \partial_{\mathrm{in}} V_i$ for which the event $\{x_i \leftrightarrow v\} \,\square\, \{v \leftrightarrow z\}$ occurs. From the van den Berg–Kesten–Fiebig–Reimer inequality, we obtain then, for $p < p_c$,

$$\begin{aligned}
P_p(x_i \leftrightarrow z) &\le P_p\big(x_i \overset{V_i}{\longleftrightarrow} z\big) + \sum_{v \in \partial_{\mathrm{in}} V_i} P_p(x_i \leftrightarrow v)\, P_p(v \leftrightarrow z) \\
&\le P_p\big(x_i \overset{V_i}{\longleftrightarrow} z\big) + \sum_{v \in \partial_{\mathrm{in}} V_i} P_{p_c}(x_i \leftrightarrow v)\, P_p(v \leftrightarrow z).
\end{aligned}$$
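Summing over $z \in V$,

$$\chi(p) \le \sum_{z \in V_i} P_p\big(x_i \overset{V_i}{\longleftrightarrow} z\big) + \sum_{v \in \partial_{\mathrm{in}} V_i} P_{p_c}(x_i \leftrightarrow v)\, \chi(p).$$

Therefore,

$$\sum_{z \in V_i} P_p\big(x_i \overset{V_i}{\longleftrightarrow} z\big) \ge c\,\chi(p).$$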
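The last step of this proof is a one-line rearrangement. A spelled-out version is sketched below, under two assumptions stated here for clarity: $\chi(p) < \infty$ for $p < p_c$ (which follows from the characterization of $p_c$ quoted from [AB] together with monotonicity), and transitivity of G, so that $\sum_{z \in V} P_p(v \leftrightarrow z) = \chi(p)$ for every vertex v. The shorthand $\sigma_i$ is introduced only for this illustration.

```latex
\sigma_i := \sum_{v \in \partial_{\mathrm{in}} V_i} P_{p_c}(x_i \leftrightarrow v) \le 1 - c ,
\qquad
\chi(p) \le \sum_{z \in V_i} P_p\bigl(x_i \overset{V_i}{\longleftrightarrow} z\bigr) + \sigma_i\, \chi(p)
\;\Longrightarrow\;
\sum_{z \in V_i} P_p\bigl(x_i \overset{V_i}{\longleftrightarrow} z\bigr)
\ge (1 - \sigma_i)\, \chi(p) \ge c\, \chi(p) .
```

The finiteness of $\chi(p)$ is what allows the term $\sigma_i \chi(p)$ to be moved to the left-hand side.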
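Lemma 2.3.B. Suppose that G = (V, E) is an infinite, locally finite, connected transitive unimodular graph. Suppose also that there are disjoint sets of sites $V_1, V_2, V_3 \subset V$ and sites $x_1 \in V_1$, $x_2 \in V_2$, $x_3 \in V_3$ such that

$$P_{p_c}(V_i \leftrightarrow V_j) < 1 \qquad (i \ne j),$$

$$\sum_{v \in \partial_{\mathrm{in}} V_i} P_{p_c}(x_i \leftrightarrow v) < 1 \qquad (i = 1, 2, 3).$$

Then δ = 2 and β = 1.

Proof. We will verify the hypothesis of Lemma 2.2.B with $\epsilon = p_c$ and

$$c = \min\Big\{1 - P_{p_c}(V_1 \leftrightarrow V_2),\ 1 - P_{p_c}(V_2 \leftrightarrow V_3),\ 1 - P_{p_c}(V_3 \leftrightarrow V_1),\ 1 - \sum_{v \in \partial_{\mathrm{in}} V_1} P_{p_c}(x_1 \leftrightarrow v),\ 1 - \sum_{v \in \partial_{\mathrm{in}} V_2} P_{p_c}(x_2 \leftrightarrow v),\ 1 - \sum_{v \in \partial_{\mathrm{in}} V_3} P_{p_c}(x_3 \leftrightarrow v)\Big\}.$$

By monotonicity in p, only the second and third displays in the hypothesis of Lemma 2.2.B require any non-trivial argumentation.

To verify the third display, we note that if $\{x_i \leftrightarrow Q\}$ occurs, then either $\{x_i \overset{V_i}{\longleftrightarrow} Q\}$ occurs, or else there is some vertex $v \in \partial_{\mathrm{in}} V_i$ for which the event $\{x_i \leftrightarrow v\} \,\square\, \{v \leftrightarrow Q\}$ occurs. From the van den Berg–Kesten–Fiebig–Reimer inequality, we obtain then, for $p < p_c$ and $q \in (0, 1]$,

$$\begin{aligned}
\theta(p, q) = P_{p,q}(x_i \leftrightarrow Q) &\le P_{p,q}\big(x_i \overset{V_i}{\longleftrightarrow} Q\big) + \sum_{v \in \partial_{\mathrm{in}} V_i} P_{p,q}(x_i \leftrightarrow v)\, P_{p,q}(v \leftrightarrow Q) \\
&\le P_{p,q}\big(x_i \overset{V_i}{\longleftrightarrow} Q\big) + \sum_{v \in \partial_{\mathrm{in}} V_i} P_{p_c}(x_i \leftrightarrow v)\, \theta(p, q).
\end{aligned}$$

Therefore,

$$P_{p,q}\big(x_i \overset{V_i}{\longleftrightarrow} Q\big) \ge c\,\theta(p, q) \qquad (i = 2, 3).$$

To verify the second display in the hypothesis of Lemma 2.2.B note that if $\{x_1 \leftrightarrow z,\ x_1 \not\leftrightarrow Q\}$ occurs, then either $\{x_1 \overset{V_1}{\longleftrightarrow} z,\ x_1 \not\leftrightarrow Q\}$ occurs, or else there is some vertex $v \in \partial_{\mathrm{in}} V_1$ for which the event $\{x_1 \leftrightarrow v\} \,\square\, \{v \leftrightarrow z,\ v \not\leftrightarrow Q\}$ occurs (this is slightly subtle; recall that our sample space is $\{0, 1\}^E \times \{0, 1\}^V$ and, using the notation in [Gri], p. 38, take for the set K in the definition of $\square$ a set of edges which produce a path from $x_1$ to v; do not include any vertex in K). From the van den Berg–Kesten–Fiebig–Reimer inequality, we obtain then, for $p < p_c$ and $q \in (0, 1]$,

$$\begin{aligned}
P_{p,q}(x_1 \leftrightarrow z,\ x_1 \not\leftrightarrow Q)
&\le P_{p,q}\big(x_1 \overset{V_1}{\longleftrightarrow} z,\ x_1 \not\leftrightarrow Q\big) + \sum_{v \in \partial_{\mathrm{in}} V_1} P_{p,q}(x_1 \leftrightarrow v)\, P_{p,q}(v \leftrightarrow z,\ v \not\leftrightarrow Q) \\
&\le P_{p,q}\big(x_1 \overset{V_1}{\longleftrightarrow} z,\ x_1 \not\leftrightarrow Q\big) + \sum_{v \in \partial_{\mathrm{in}} V_1} P_{p_c}(x_1 \leftrightarrow v)\, P_{p,q}(v \leftrightarrow z,\ v \not\leftrightarrow Q).
\end{aligned}$$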
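Summing over $z \in V$,

$$\chi(p, q) \le \sum_{z \in V_1} P_{p,q}\big(x_1 \overset{V_1}{\longleftrightarrow} z,\ x_1 \not\leftrightarrow Q\big) + \sum_{v \in \partial_{\mathrm{in}} V_1} P_{p_c}(x_1 \leftrightarrow v)\, \chi(p, q).$$

Therefore,

$$\sum_{z \in V_1} P_{p,q}\big(x_1 \overset{V_1}{\longleftrightarrow} z,\ x_1 \not\leftrightarrow Q\big) \ge c\,\chi(p, q).$$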
3. The Case of Planar Graphs

In this section we suppose that G = (V, E) is an infinite, locally finite, connected transitive non-amenable planar single-ended graph. Proposition 2.1 of [BS2] states that G is unimodular and that it can be embedded in the hyperbolic plane $\mathbb{H}^2$ in the following way. Each vertex of G is mapped into a point of $\mathbb{H}^2$ and each edge of G is mapped into a geodesic line segment with endpoints at the points of $\mathbb{H}^2$ which are images of its endpoints; moreover the group of automorphisms of G is mapped in this way into a group of isometries of $\mathbb{H}^2$. It is clear that, by adjusting the length scale, such an embedding can be chosen so that each face in the embedding has diameter less than 1. In particular any point of $\mathbb{H}^2$ is then within distance 1 of a point which represents a vertex of G, and all the geodesic line segments which represent edges of G have length at most 1. We will refer to such an embedding as a "nice embedding".

One convenient way to describe the dual $G^\dagger = (V^\dagger, E^\dagger)$ of G is to represent each element of $V^\dagger$ by a face (tile) in the embedding of G, described above, and represent elements of $E^\dagger$ by pairs of faces whose topological boundaries intersect on a non-degenerate geodesic line segment (which represents an edge of G). This establishes a one-to-one correspondence between E and $E^\dagger$, and the image of $e \in E$ under this correspondence will be denoted by $e^\dagger$. Since G is transitive, $G^\dagger$ is quasi-transitive. Any bond percolation process on G is coupled to a bond percolation process on $G^\dagger$, by declaring each edge $e^\dagger$ vacant (resp. occupied) if e is occupied (resp. vacant). Independent percolation at density p on G is coupled in this fashion to independent percolation at density 1 − p on $G^\dagger$.

The following lemma is a basic building block in our argumentation in this section. In the statement of this lemma, we identify a path in the dual graph with the union of the tiles that correspond to the endpoints of the dual edges in this path in the embedding.

Lemma 3.1. Suppose that G is an infinite, locally finite, connected transitive non-amenable planar single-ended graph, nicely embedded in $\mathbb{H}^2$. If $p < p_u$, then there is $C_0 > 0$ such that the following happens. Let $\mathcal{L}$ be an arbitrary geodesic line in $\mathbb{H}^2$, $s'$ and $s''$ be two points on $\mathcal{L}$, separated by distance $l > 2$, and $\mathcal{L}'$ and $\mathcal{L}''$ be geodesic lines perpendicular to $\mathcal{L}$ through $s'$ and $s''$, respectively. Then

$$P_p(\text{there is an occupied dual path separating } \mathcal{L}' \text{ from } \mathcal{L}'') > C_0.$$

Proof. This was proved in a somewhat more restricted setting and for site percolation in [Lal2], Lemma 2.15. The more general case considered here can be handled in the same way, by using results in [BS2]. First, from Theorem 3.7 of [BS2], we learn that there is percolation in the dual process when $p < p_u$. From the generalization of Corollary 4.4 of [BS2] to quasi-transitive tilings of $\mathbb{H}^2$, we learn then that percolation also occurs in this dual process on hyperbolic half-spaces. This enables us to use the arguments in the proofs of Lemma 2.14 and 2.15 in [Lal2] to conclude the proof.
Given a nice embedding of G in $\mathbb{H}^2$ and a set $S \subset \mathbb{H}^2$, we will use the notation $\bar{S}$ for the set of vertices of G which are endpoints of edges represented in the embedding by geodesic line segments which intersect S.

Lemma 3.2. Suppose that G is an infinite, locally finite, connected transitive non-amenable planar single-ended graph, nicely embedded in $\mathbb{H}^2$. If $p < p_u$, then there are $C_1, C_2 \in (0, \infty)$, such that the following happens. Let $\mathcal{L}$ be an arbitrary geodesic line in $\mathbb{H}^2$, $s'$ and $s''$ be two points on $\mathcal{L}$, separated by distance L, and $\mathcal{L}'$ and $\mathcal{L}''$ be geodesic lines perpendicular to $\mathcal{L}$ through $s'$ and $s''$, respectively. Then

$$P_p(\bar{\mathcal{L}}' \leftrightarrow \bar{\mathcal{L}}'') \le C_1 e^{-C_2 L}.$$

Proof. Take $l > 2$ and consider the set of geodesic lines which separate $\mathcal{L}'$ from $\mathcal{L}''$, are perpendicular to $\mathcal{L}$ and cross it at points which are at distance $jl$, $j = 1, 2, \ldots, \lfloor L/l \rfloor$ from $s'$. Since any path from $\bar{\mathcal{L}}'$ to $\bar{\mathcal{L}}''$ has to cross all these lines, the claim follows from Lemma 3.1.

Lemma 3.3. Suppose that G is an infinite, locally finite, connected transitive non-amenable planar single-ended graph, nicely embedded in $\mathbb{H}^2$. If $p < p_u$, then there are $C_3, C_4 \in (0, \infty)$, such that the following happens. Let $\mathcal{L}$ be an arbitrary geodesic line in $\mathbb{H}^2$, $s$ and $s'$ be two points on $\mathcal{L}$, separated by distance L, and $\mathcal{L}'$ be the geodesic line perpendicular to $\mathcal{L}$ through $s'$. Let x be a vertex of G which in the embedding is mapped into a point of $\mathbb{H}^2$ at distance at most 1 from s. Then

$$\sum_{y \in \bar{\mathcal{L}}'} P_p(x \leftrightarrow y) \le C_3 e^{-C_4 L}.$$

Proof. Let $\mathcal{L}'_+$ and $\mathcal{L}'_-$ be the two half-lines into which $s'$ partitions $\mathcal{L}'$. Take some $l > 2$. Set $s_0 = s'$, and for $k \in \{1, 2, \ldots\}$ let $s_k$ (resp. $s_{-k}$) be the point on $\mathcal{L}'_+$ (resp. $\mathcal{L}'_-$) at distance $kl$ from $s'$. For $k \in \{1, 2, \ldots\}$ let $I_k$ (resp. $I_{-k}$) be the geodesic segment (contained in $\mathcal{L}'$) with endpoints $s_{k-1}$ and $s_k$ (resp. $s_{-k+1}$ and $s_{-k}$). For $j \in \mathbb{Z}$, let $\mathcal{L}_j$ be the geodesic line perpendicular to $\mathcal{L}'$ through $s_j$. Then

$$\sum_{y \in \bar{\mathcal{L}}'} P_p(x \leftrightarrow y) \le \sum_{j \in \mathbb{Z} \setminus \{0\}} \sum_{y \in \bar{I}_j} P_p(x \leftrightarrow y).$$

Let D be the degree of G. It is easy to see that for some small $\epsilon > 0$ any ball of radius $\epsilon$ in $\mathbb{H}^2$ can intersect at most D edges of the embedding of G in $\mathbb{H}^2$. Therefore it is also easy to see that any geodesic line segment of length d can intersect at most $dD/\epsilon$ such edges. Therefore, from the previous display we obtain, for arbitrary J,

$$\sum_{y \in \bar{\mathcal{L}}'} P_p(x \leftrightarrow y) \le \frac{lD}{\epsilon} \sum_{j : |j| > J} P_p(x \leftrightarrow \bar{I}_j) + \frac{2lJD}{\epsilon}\, P_p(x \leftrightarrow \bar{\mathcal{L}}').$$

When $j > 1$ (resp. $j < -1$) any path from x to $\bar{I}_j$ has to cross the lines $\mathcal{L}_i$, $i = 1, 2, \ldots, j - 1$ (resp. $i = -1, -2, \ldots, j + 1$). Hence, Lemma 3.1 implies $P_p(x \leftrightarrow \bar{I}_j) \le C_5 e^{-C_6 |j|}$, for some $C_5, C_6 \in (0, \infty)$. Therefore, using Lemma 3.2 and taking $J = \lfloor L \rfloor$, we obtain

$$\sum_{y \in \bar{\mathcal{L}}'} P_p(x \leftrightarrow y) \le C_7 e^{-C_6 \lfloor L \rfloor} + C_8 \lfloor L \rfloor e^{-C_2 L} \le C_3 e^{-C_4 L}.$$
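The final inequality in the proof of Lemma 3.3 absorbs a polynomial prefactor into a slightly weaker exponential rate. One admissible choice of constants is sketched below; it is offered only as an illustration and is not necessarily the choice the author had in mind. It uses $\lfloor L \rfloor \ge L - 1$ and the elementary bound $\sup_{t \ge 0} t\,e^{-at} = 1/(ae)$ with $a = C_2/2$.

```latex
C_7 e^{-C_6 \lfloor L \rfloor} \le C_7 e^{C_6} e^{-C_6 L},
\qquad
C_8 \lfloor L \rfloor e^{-C_2 L}
\le C_8 \bigl( L e^{-C_2 L/2} \bigr) e^{-C_2 L/2}
\le \frac{2 C_8}{e\, C_2}\, e^{-C_2 L/2},
\\[4pt]
\text{so that one may take}\quad
C_4 = \min\{ C_6,\ C_2/2 \},
\qquad
C_3 = C_7 e^{C_6} + \frac{2 C_8}{e\, C_2}.
```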
Proof of Theorem 1.1(i). We will check that the hypotheses of Lemma 2.3.A and Lemma 2.3.B are satisfied (note that the former are contained in the latter). Suppose that G is nicely embedded in $\mathbb{H}^2$. Let $\mathcal{L}$ be a geodesic line and $s_1, \ldots, s_7$ be distinct points on $\mathcal{L}$, such that for $i = 1, \ldots, 6$, the distance between $s_i$ and $s_{i+1}$ has the same common value L. For each i, let $\mathcal{L}_i$ be the geodesic line perpendicular to $\mathcal{L}$ through $s_i$. The removal of $\mathcal{L}_2 \cup \mathcal{L}_3 \cup \mathcal{L}_5 \cup \mathcal{L}_6$ breaks $\mathbb{H}^2$ into 5 connected components. For i = 1, 4, 7, let $\mathcal{V}_i$ be the connected component which contains $s_i$. Set $V_1 = \bar{\mathcal{V}}_1$, $V_2 = \bar{\mathcal{V}}_4$, $V_3 = \bar{\mathcal{V}}_7$. Let $x_1$, $x_2$ and $x_3$ be vertices of G which in the embedding are mapped into points of $\mathbb{H}^2$ at distance at most 1 from $s_1$, $s_4$ and $s_7$, respectively. With these choices, the hypotheses of Lemma 2.3.B are satisfied, provided that L is large enough, as can be seen from Lemma 3.2, Lemma 3.3 and Theorem 1.1 of [BS2], which states that $p_c < p_u$.
4. The Case of Graphs with Infinitely Many Ends

We will need some notation and terminology related to the binary homogeneous tree, $T_2$, i.e., the tree in which every vertex has degree 3. The set of vertices of this tree will be denoted by $V(T_2)$. Given $i, j, k \in V(T_2)$ we will say that k is between i and j if the shortest path from i to j passes through k.

The following proposition will be used in this section; it can be easily proved with the arguments in the proof of Proposition 6.1 in [Moh2]. (Compare with Proposition 2.1 in [Sch].) Below $B(u, n)$ will denote the ball of radius n centered at $u \in V$ in the graph G = (V, E).

Proposition 4.1. Suppose that G = (V, E) is an infinite, locally finite, connected transitive graph. If G has infinitely many ends, then there is a positive integer n and vertices $u_k \in V$, $k \in V(T_2)$, such that the balls $B(u_k, n)$, $k \in V(T_2)$, are disjoint and have the following property. For each $i, j \in V(T_2)$ any path from $B(u_i, n)$ to $B(u_j, n)$ intersects each $B(u_k, n)$ with k between i and j.

Proof of Theorem 1.1(ii). We will check that the hypotheses of Lemma 2.3.A and Lemma 2.3.B are satisfied (note that the former are contained in the latter). Let $k_0, k_1, k_2, k_3 \in V(T_2)$ be such that for $1 \le i < j \le 3$, $k_0$ is between $k_i$ and $k_j$, and for i = 1, 2, 3, the distance in $T_2$ between $k_i$ and $k_0$ has a common value l. Using the notation in Proposition 4.1, set $x_i = u_{k_i}$, i = 0, 1, 2, 3. Proposition 4.1 implies that $G \setminus B(x_0, n)$ has at least 3 distinct infinite components, which contain respectively $x_1$, $x_2$ and $x_3$. Call them, respectively, $V_1$, $V_2$ and $V_3$. Since G has infinitely many ends, it is non-amenable and hence, by Theorem 2 of [BS1] (adapted to bond percolation), it has $p_c < 1$.

To verify the hypothesis of Lemma 2.3.B, let K be the number of edges of G which have at least one endpoint in $B(x_0, n)$, and note that, for $1 \le i < j \le 3$,

$$P_{p_c}(V_i \leftrightarrow V_j) \le 1 - (1 - p_c)^K < 1,$$

and, for i = 1, 2, 3,

$$\sum_{v \in \partial_{\mathrm{in}} V_i} P_{p_c}(x_i \leftrightarrow v) \le |\partial_{\mathrm{in}} V_i|\, P_{p_c}(x_i \leftrightarrow \partial_{\mathrm{in}} V_i) \le K \big(1 - (1 - p_c)^K\big)^{l-1}.$$

The last expression can be made arbitrarily small by taking l sufficiently large.
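As a numerical illustration of this last step (added here, not part of the original proof), the sketch below evaluates the bound $K(1 - (1 - p_c)^K)^{l-1}$ and searches for the smallest l that pushes it below 1, which is what Lemma 2.3.B requires. The function names and the sample values K = 6, p_c = 0.5 are assumptions chosen only for the example; they are not derived from any particular graph.

```python
def boundary_sum_bound(K, p_c, l):
    """Right-hand side K * (1 - (1 - p_c)**K)**(l - 1) of the estimate."""
    return K * (1.0 - (1.0 - p_c) ** K) ** (l - 1)


def smallest_l(K, p_c, target=1.0):
    """Smallest integer l >= 1 for which the bound is strictly below `target`.

    Requires K >= 1 and 0 < p_c < 1, so that the base of the power is
    strictly less than 1 and the loop terminates. The answer grows roughly
    like log(K) / (1 - p_c)**K, so large K can make the search slow.
    """
    if not (K >= 1 and 0.0 < p_c < 1.0):
        raise ValueError("need K >= 1 and 0 < p_c < 1")
    l = 1
    while boundary_sum_bound(K, p_c, l) >= target:
        l += 1
    return l


if __name__ == "__main__":
    K, p_c = 6, 0.5  # hypothetical sample values
    l = smallest_l(K, p_c)
    print(l, boundary_sum_bound(K, p_c, l))
```

The geometric decay in l is slow when K is large, because the base $1 - (1 - p_c)^K$ is then very close to 1; the argument only needs that some finite l works.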
Acknowledgement. I am grateful to Ander Holroyd and Oded Schramm for their various comments and suggestions.
References
[AB] Aizenman, M. and Barsky, D.: Sharpness of the phase transition in percolation models. Commun. Math. Phys. 108, 489–526 (1987)
[AN] Aizenman, M. and Newman, C.M.: Tree graph inequalities and critical behavior in percolation models. J. Stat. Phys. 16, 811–828 (1983)
[BA] Barsky, D.J. and Aizenman, M.: Percolation critical exponents under the triangle condition. Commun. Math. Phys. 19, 1520–1536 (1991)
[BLPS1] Benjamini, I., Lyons, R., Peres, Y. and Schramm, O.: Group-invariant percolation on graphs. Geom. and Funct. Anal. 9, 29–66 (1999)
[BLPS2] Benjamini, I., Lyons, R., Peres, Y. and Schramm, O.: Critical percolation on any non-amenable group has no infinite clusters. Ann. of Probability 27, 1347–1356 (1999)
[BS1] Benjamini, I. and Schramm, O.: Percolation beyond Zd, many questions and a few answers. Electronic Communications in Probability 1, 71–82 (1996)
[BS2] Benjamini, I. and Schramm, O.: Percolation in the hyperbolic plane. J. Am. Math. Soc. 14, 487–507 (2000)
[Dur] Durrett, R.: Probability: Theory and Examples. Duxbury Press, Second edition, 1996
[Gri] Grimmett, G.R.: Percolation. New York–Berlin: Springer-Verlag, 2nd edition, 1999
[Lal1] Lalley, S.P.: Percolation on Fuchsian groups. Annales de L'Institut Henri Poincaré (Probability and Statistics) 34, 151–177 (1998)
[Lal2] Lalley, S.P.: Percolation clusters in hyperbolic tesselations. Geom. and Funct. Anal. (to appear)
[LSW] Lawler, G., Schramm, O. and Werner, W.: One-arm exponent for critical 2D percolation. Preprint, 2001
[Lyo] Lyons, R.: Phase transition on non-amenable graphs. J. Math. Phys. 41, 1099–1126 (2000)
[Moh] Mohar, B.: Some relations between analytic and geometric properties of infinite graphs. Discrete Mathematics 95, 193–219 (1991)
[New] Newman, C.M.: Another critical exponent inequality for percolation: β ≥ 2/δ. J. Stat. Phys. 47, 695–699 (1987)
[Ngu] Nguyen, B.: Gap exponent for percolation processes with triangle condition. J. Stat. Phys. 49, 235–243 (1987)
[Sch] Schonmann, R.H.: Multiplicity of phase transitions and mean-field criticality on highly non-amenable graphs. Commun. Math. Phys. 219, 271–322 (2001)
[SW] Smirnov, S. and Werner, W.: Critical exponents for two-dimensional percolation. Preprint, 2001
[Wu] Wu, C.C.: Critical behavior of percolation and Markov fields on branching planes. J. Appl. Probability 30, 538–547 (1993)

Communicated by M. Aizenman
Commun. Math. Phys. 225, 465 – 485 (2002)
Normal Coordinates and Primitive Elements in the Hopf Algebra of Renormalization C. Chryssomalakos, H. Quevedo, M. Rosenbaum, J. D. Vergara Instituto de Ciencias Nucleares, Universidad Nacional Autónoma de México, Apdo. Postal 70-543, 04510 México, D.F., Mexico. E-mail: {chryss,quevedo,mrosen,vergara}@nuclecu.unam.mx Received: 25 May 2001 / Accepted: 5 October 2001
Abstract: We introduce normal coordinates on the infinite dimensional group G introduced by Connes and Kreimer in their analysis of the Hopf algebra of rooted trees. We study the primitive elements of the algebra and show that they are generated by a simple application of the inverse Poincaré lemma, given a closed left invariant 1-form on G. For the special case of the ladder primitives, we find a second description that relates them to the Hopf algebra of functionals on power series with the usual product. Either approach shows that the ladder primitives are given by the Schur polynomials. The relevance of the lower central series of the dual Lie algebra in the process of renormalization is also discussed, leading to a natural concept of k-primitiveness, which is shown to be equivalent to the one already in the literature.
Contents
1. Introduction
2. Differential Geometry à la Hopf
3. The Hopf Algebra of Rooted Trees and Its Dual
   3.1 Functions
   3.2 Vector fields
   3.3 1-forms
4. Normal Coordinates
   4.1 A new basis
   4.2 The Hopf structure
5. Primitive Elements
   5.1 Ladder generators
   5.2 The general case
   5.3 The lower central series and k-primitiveness
6. Normal Coordinates and Toy Model Renormalization
   6.1 The toy model
   6.2 Renormalization in the ψ-basis
1. Introduction The process of renormalization in quantum field theory has been substantially elucidated in recent years. In a series of papers (see, e.g., [11, 7, 2, 9] and references therein), a Hopf algebra structure has been identified that greatly simplifies its combinatorics. This, in turn, has led to the development of an underlying geometric picture, involving an infinite dimensional group manifold G, the coordinates of which are in one-to-one correspondence with (classes of) 1PI superficially divergent Feynman diagrams of the theory. The latter are indexed by a type of graphs known as (decorated) rooted trees, which capture the subdivergence structure of the diagram. The forest formula prescription for the renormalization of a diagram then is translated into a series of operations on the corresponding rooted tree and the latter have been shown to deliver standard Hopf algebraic quantities, like the coproduct and the antipode of the rooted tree. The above results were obtained using a powerful mixture of algebraic and combinatoric techniques that brought to light unexpected interconnections with noncommutative geometry, among several other fields. The complexity of the full Hopf algebra of decorated rooted trees is, in many respects, overwhelming. Even in the simplest cases, one is confronted with an infinite set of available decorations for the vertices of the rooted trees, originating in the infinite number of primitive divergent diagrams appearing in the underlying theory. It is rather fortunate then that the considerably simpler algebra of rooted trees with a single decoration seems to capture many of the features of realistic theories. It is for this reason that it has been studied extensively, as a first step towards an understanding of the full theory. Of primary importance, given their rôle in renormalization theory, is the study of the primitive elements of the above Hopf algebra. 
These correspond to sums of products of diagrams with the property that their renormalization involves a single subtraction. In Ref. [3], an ansatz is presented for a (conjectured) infinite family of such elements, corresponding to the ladder generators of the algebra, i.e., to trees whose every vertex has fertility at most one. Furthermore, dealing with the general case, a set of vertex-increasing operators is constructed that generates new primitive elements from known ones. As the number of primitive elements increases rapidly with increasing number of vertices, this approach necessitates the introduction of new operators in each step, a task that has not yet been systematized. Our motivation in this paper is two-fold. On a general, methodological level, we argue that the above algebraic/combinatoric approach, with all its multiple successes, should nevertheless be complemented by a differential geometric one, which, we feel, has not been sufficiently considered in the literature. On a second, more concrete level, we provide support for our claim, by showing how a simple application of the inverse Poincaré lemma reduces the search for primitive elements to that of closed, left invariant (LI) 1-forms on G. For the case of the ladder primitives, we give a simple generating formula that identifies them with the Schur polynomials. Our discussion uses the normal coordinates on the group, a choice that leads naturally to a concept of k-primitiveness, associated with the lower central series of the dual Lie algebra – we prove that this coincides with the k-primitiveness introduced in Ref. [3]. We discuss the rôle of the new
coordinates in renormalization, using the toy model realization of Ref. [10], while also commenting on similar results obtained for the more realistic heavy quark model of [2].

2. Differential Geometry à la Hopf

We will be dealing with differential geometric concepts expressed in Hopf algebraic terms. We opt for this formulation having in mind the transcription of our results for the non-commutative case – Hopf algebras are ideally suited to this task. We start by providing a short dictionary between the two languages and establish the notation, assuming nevertheless familiarity with the basic definitions. Two algebras will be of main interest to us: on the one hand we have the (commutative, non-cocommutative) algebra A of functions on a (possibly infinite dimensional) group manifold, generated by $\{\phi^A\}$, with A ranging in an index set – we denote by a, b, . . . general elements of A. On the other hand, we have the (non-commutative, cocommutative) universal enveloping algebra U of the Lie algebra of the group. We actually work with a suitable completion of U, so as to allow exponentials of its generators $Z_A$, which we identify with the points of the manifold¹ – we denote by x, y, . . . general elements of U (we use $g, g', \dots$ if we refer to group elements in particular). Both algebras are Hopf algebras. For A, the coproduct $\Delta(a) \equiv a_{(1)} \otimes a_{(2)}$ codifies left and right translations,
\[ L_g^*(a)(\cdot) = a_{(1)}(g)\, a_{(2)}(\cdot), \tag{1} \]
and similarly for right translations. For U, it expresses Leibniz’s rule, $\Delta(Z) = Z \otimes 1 + 1 \otimes Z$, for the left-invariant generator Z. The two Hopf algebras are dual, via the inner product (also called pairing)
\[ \langle \cdot\,, \cdot \rangle : U \otimes A \to \mathbb{C}, \qquad x \otimes a \mapsto \langle x, a \rangle, \tag{2} \]
which, when x stands for a generator Z, amounts to taking the derivative of a along x and evaluating it at the identity. For x = g, the above definition produces a Taylor series expansion of a at the identity which gives, for a analytic, the value a(g) of a at the point g. The coproduct in A is dual to the product in U via
\[ \langle xy, a \rangle = \langle x \otimes y,\ a_{(1)} \otimes a_{(2)} \rangle \tag{3} \]
and vice-versa. We usually work with dual bases, so that $Z_A$ only gives 1 when paired with $\phi^A$, while its inner product with all other φ’s, as well as with all products of φ’s, vanishes. Given a Poincaré–Birkhoff–Witt basis $\{f^i\}$ for A,
\[ \{f^i\} = \{1, \phi^A, \phi^A \phi^B, \dots\}, \tag{4} \]
one can build a dual basis $\{e_i\}$ for the entire U by adjoining to the above Z’s polynomials in them, $\{e_i\} = \{1, Z_A, \text{quadratic}, \text{cubic}, \dots\}$, with $\langle e_i, f^j \rangle = \delta_i^j$ – this, in general, involves a non-trivial calculation. To every element a of A we can associate a LI 1-form $\Pi_a$, given by
\[ \Pi_a = S(a_{(1)})\, da_{(2)}, \tag{5} \]
1 The particular group we deal with in Sect. 3 is non-compact and infinite dimensional. Nevertheless, in this paper, we only consider elements that correspond to exponentials of linear combinations of the generators. For a readable account of what we might be missing in doing so, see Ref. [12].
d being the exterior derivative and S the antipode in A. The map $a \mapsto \Pi_a$ is linear, while on products it gives
\[ \Pi_{ab} = \Pi_a\, \epsilon(b) + \epsilon(a)\, \Pi_b \qquad (A \text{ commutative}), \tag{6} \]
where $\epsilon$ is the counit in A. We take all generators $\phi^T$ of A to be counitless, i.e., we choose functions that vanish at the identity of the group, except for the unit function $1_A$ (which we often write as just 1). This implies that Π only returns a non-zero result when applied to the generators and vanishes on all products, as well as on $1_A$. The Maurer-Cartan (MC) equations take the form
\[ d\Pi_a = -\Pi_{a_{(1)}}\, \Pi_{a_{(2)}}. \tag{7} \]
Using (6), one sees that only the bilinear part of the coproduct contributes to the MC equations.

3. The Hopf Algebra of Rooted Trees and Its Dual

3.1. Functions. We specialize the general considerations of the previous section to the Connes–Kreimer algebra of renormalization. For a detailed exposition we refer the reader to [10, 6, 8] and references therein; we give here only a brief account of the basic definitions and some illustrative examples. A is now the Hopf algebra $H_R$ of functions generated by $\phi^T$, where T is a rooted tree. This means that the group manifold G is, in this case, infinite dimensional, with one dimension for every rooted tree – the φ’s are coordinate functions on this manifold. The group law is encoded in the coproduct
\[ \Delta(\phi^T) = \phi^T \otimes 1 + 1 \otimes \phi^T + \sum_{\text{cuts } C} \phi^{P^C(T)} \otimes \phi^{R^C(T)}. \tag{8} \]
The sum in the above definition is over admissible cuts, i.e., cuts that may involve more than one edge (simple cuts) but such that there is no more than one simple cut on any path from the root downwards. $R^C(T)$ is the part that is left containing the root of T while $P^C(T)$ is the product of all branches cut, e.g., writing a tree as a vertex $\bullet$ followed, in parentheses, by its subtrees,
\[ \Delta\big(\bullet(\bullet\bullet)\big) = \bullet(\bullet\bullet) \otimes 1 + 1 \otimes \bullet(\bullet\bullet) + 2\, \bullet \otimes \bullet(\bullet) + \bullet\bullet \otimes \bullet, \tag{9} \]
where we let a tree T itself denote the corresponding function $\phi^T$, a convention freely used in the rest of the paper. The factor 2 on the r.h.s. appears because there are two possible cuts on $\bullet(\bullet\bullet)$ generating the corresponding term. A convenient way to recast (8) as a single sum is to introduce a full and an empty cut, above and below any tree T respectively:
[Figure (10): a rooted tree with the full cut drawn above its root and the empty cut drawn below its leaves.]
We rewrite (8) in the form
\[ \Delta(\phi^T) = \sum_{\text{cuts } C'} \phi^{P^{C'}(T)} \otimes \phi^{R^{C'}(T)}, \tag{11} \]
where the above two extra cuts, included in $C'$, produce the primitive part of the coproduct. Notice that Δ respects the grading given by the number v(T) of vertices of a tree T. We call this the v-degree of $\phi^T$, denote it by $\deg_v(\phi^T)$, and extend it to monomials as the sum of the v-degrees of the factors. The polynomial degree will be called p-degree to avoid confusion – it is obviously not respected by the coproduct. We will use the notation $A_i^{(n)}$ for the subspace of A of v-degree n and p-degree i, e.g., $A_1^{(n)}$ is the linear span of the generators with n vertices.

3.2. Vector fields. The rôle of U is now played by $H_R^*$, generated by $\{Z_T\}$, with T a rooted tree, and we take the Z’s dual to the φ’s, in the sense of the previous section. $Z_T$ is a left invariant vector field on G. The Lie algebra of such vector fields is found by computing, using (3), the pairing of $Z_A Z_B - Z_B Z_A$ with all basis functions $\{f^i\}$.
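The coproduct (8) and the commutators it induces can be explored with a few lines of Python. The following is only a sketch under stated assumptions: the encoding of a rooted tree as the sorted tuple of its child subtrees, and all function names (`admissible_cuts`, `coproduct`, `bracket`, ...), are illustrative choices and do not come from the paper. The bracket is read off from the simple cuts, i.e., from the terms of the coproduct whose first factor is a single tree.

```python
from collections import Counter
from itertools import product

# Assumed encoding: a rooted tree is the sorted tuple of its child subtrees,
# so the single vertex is () and the two-vertex ladder is ((),).
DOT = ()

def tree(*children):
    return tuple(sorted(children))

def size(t):
    return 1 + sum(size(c) for c in t)

def admissible_cuts(t):
    """Yield (pruned forest, trunk) for each admissible cut of t.

    The empty cut is included (pruned == ()), the full cut is not.
    """
    options = []
    for child in t:
        # Either cut the edge above `child`, or keep it and cut inside it.
        options.append([((child,), None)] + list(admissible_cuts(child)))
    for choice in product(*options):
        pruned = tuple(sorted(x for p, _ in choice for x in p))
        trunk = tuple(sorted(r for _, r in choice if r is not None))
        yield pruned, trunk

def coproduct(t):
    """Delta(phi^T) as a Counter {(pruned forest, trunk): multiplicity}.

    trunk is None for the term phi^T (x) 1 produced by the full cut.
    """
    terms = Counter(admissible_cuts(t))
    terms[((t,), None)] += 1
    return terms

def add_leaf(t):
    """All trees obtained from t by attaching one new leaf."""
    yield tree(*t, DOT)
    for i, child in enumerate(t):
        for grown in add_leaf(child):
            yield tree(*t[:i], grown, *t[i + 1:])

def trees(n):
    """All rooted trees with n vertices."""
    level = {DOT}
    for _ in range(n - 1):
        level = {s for t in level for s in add_leaf(t)}
    return sorted(level)

def bracket(t1, t2):
    """[Z_t1, Z_t2] as {T: n(t1,t2;T) - n(t2,t1;T)}, read off from simple cuts."""
    result = {}
    for t in trees(size(t1) + size(t2)):
        delta = coproduct(t)
        coeff = delta[((t1,), t2)] - delta[((t2,), t1)]
        if coeff:
            result[t] = coeff
    return result

L2, L3, CHERRY = tree(DOT), tree(tree(DOT)), tree(DOT, DOT)
print(dict(coproduct(CHERRY)))   # the four kinds of terms of Eq. (9)
print(bracket(DOT, L2))          # the single-vertex / two-vertex-ladder commutator
```

Enumerating all trees with the right number of vertices is the simplest way to collect the structure constants for small n; it becomes expensive quickly and is meant for checking hand computations only.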
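Example 1. Computation of $[Z_{\bullet}, Z_{\bullet(\bullet)}]$. Writing a tree as a vertex $\bullet$ followed, in parentheses, by its subtrees, we have
\[ \tilde\Delta\big(\bullet(\bullet(\bullet))\big) = \bullet \otimes \bullet(\bullet) + \bullet(\bullet) \otimes \bullet, \]
\[ \tilde\Delta\big(\bullet(\bullet\bullet)\big) = 2\, \bullet \otimes \bullet(\bullet) + \bullet\bullet \otimes \bullet, \]
\[ \tilde\Delta\big(\bullet \cdot \bullet(\bullet)\big) = \bullet \otimes \bullet(\bullet) + \bullet(\bullet) \otimes \bullet + \bullet\bullet \otimes \bullet + \bullet \otimes \bullet\bullet, \tag{12} \]
where $\tilde\Delta(\phi^T) \equiv \Delta(\phi^T) - \phi^T \otimes 1 - 1 \otimes \phi^T$ and $\bullet \cdot \bullet(\bullet)$ is the product of the two functions. These are the only functions that contain the term $\bullet \otimes \bullet(\bullet)$ in their coproduct. We find therefore, using (3),
\[ \langle Z_{\bullet} Z_{\bullet(\bullet)},\ \bullet(\bullet(\bullet)) \rangle = 1, \qquad \langle Z_{\bullet} Z_{\bullet(\bullet)},\ \bullet(\bullet\bullet) \rangle = 2, \qquad \langle Z_{\bullet} Z_{\bullet(\bullet)},\ \bullet \cdot \bullet(\bullet) \rangle = 1. \tag{13} \]
Similarly, one computes
\[ \langle Z_{\bullet(\bullet)} Z_{\bullet},\ \bullet(\bullet(\bullet)) \rangle = 1, \qquad \langle Z_{\bullet(\bullet)} Z_{\bullet},\ \bullet \cdot \bullet(\bullet) \rangle = 1, \tag{14} \]
the pairings with all other functions being zero. It follows that the only non-zero pairing of the commutator is
\[ \langle [Z_{\bullet}, Z_{\bullet(\bullet)}],\ \bullet(\bullet\bullet) \rangle = 2. \tag{15} \]
But the element $2 Z_{\bullet(\bullet\bullet)}$ of U has exactly the same pairings, therefore, in order for the inner product between U and A to be non-degenerate, one must set $[Z_{\bullet}, Z_{\bullet(\bullet)}] = 2 Z_{\bullet(\bullet\bullet)}$.

Proceeding along these lines, one arrives at the general expression [7]
\[ [Z_{T_1}, Z_{T_2}] = \big( n_{T_1 T_2}{}^{T} - n_{T_2 T_1}{}^{T} \big) Z_T \equiv f_{T_1 T_2}{}^{T} Z_T, \tag{16} \]
where $n_{T_1 T_2}{}^{T}$ is the number of simple cuts on T that produce $T_1$, $T_2$, with $T_2$ containing the root of T (denoted by $n(T_1, T_2, T)$ in [6]) and the last equation defines the structure constants $f_{T_1 T_2}{}^{T}$ of the Lie algebra. We introduce, following [7], a ∗-operation among the Z’s, defined by
\[ Z_{T_1} * Z_{T_2} = n_{T_1 T_2}{}^{T} Z_T. \tag{17} \]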
Notice that this is not the product in U but, nevertheless, it gives correctly the commutator when antisymmetrized (cf. (16)). The above Lie bracket conserves the number of vertices.
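3.3. 1-forms. We turn now to LI 1-forms. Starting from (5) and using the particular form of the coproduct in (8), we find
\[ \Pi_{\phi^T} = \sum_{C'} \phi^{S(P^{C'}(T))}\, d\phi^{R^{C'}(T)} = d\phi^T + \sum_{C} \phi^{S(P^{C}(T))}\, d\phi^{R^{C}(T)}. \tag{18} \]
For the MC equations we may use directly (7) and the comment that follows it to find
\[ d\Pi_{\phi^T} = -\sum_{\text{simple } C} \Pi_{\phi^{P^C(T)}}\, \Pi_{\phi^{R^C(T)}}. \tag{19} \]
The restriction to simple cuts is possible since cuts that involve more than one edge produce non-linear terms in the first tensor factor of the coproduct and these are annihilated by Π. This is probably the easiest way to derive the structure constants.

Example 2. Maurer–Cartan equation for $\Pi_{\bullet(\bullet\bullet)}$. Writing a tree as a vertex $\bullet$ followed, in parentheses, by its subtrees, and using (18), we find
\[ \Pi_{\bullet} = d\,\bullet, \qquad \Pi_{\bullet(\bullet)} = d\,\bullet(\bullet) - \bullet\, d\,\bullet, \qquad \Pi_{\bullet(\bullet\bullet)} = d\,\bullet(\bullet\bullet) - 2\, \bullet\, d\,\bullet(\bullet) + \bullet\bullet\, d\,\bullet. \tag{20} \]
Direct application of d to the above expression for $\Pi_{\bullet(\bullet\bullet)}$, or use of (19), gives
\[ d\Pi_{\bullet(\bullet\bullet)} = -2\, \Pi_{\bullet}\, \Pi_{\bullet(\bullet)}, \tag{21} \]
in agreement with the commutator $[Z_{\bullet}, Z_{\bullet(\bullet)}] = 2 Z_{\bullet(\bullet\bullet)}$ of Ex. 1.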
General vector and 1-form fields are obtained as linear combinations of the above, with coefficients in A.

4. Normal Coordinates

4.1. A new basis. We introduce new coordinates $\{\psi^A\}$ on G, defined by
\[ \langle g, \psi^A \rangle = \alpha^A, \qquad \text{where } g = e^{\alpha^A Z_A}, \tag{22} \]
i.e., the ψ’s are normal coordinates centered at the origin and, like the φ’s, are indexed by rooted trees. Of fundamental importance in the sequel will be the canonical element C (see, e.g., [4]), given by
\[ C = e_i \otimes f^i = e^{Z_A \otimes \psi^A}. \tag{23} \]
$\{e_i\}$ and $\{f^i\}$ above are dual bases of U and A respectively (see (4)). In contrast with (4), we fix now the $\{e_i\}$ to be $\{1, Z_A, Z_A Z_B, \dots\}$ and define the ψ’s by the second equality above (the tensor product sign ensures that the Z’s do not act on the ψ’s). C may be regarded as an “indefinite group element” – when the ψ’s get evaluated on some specific point $g_0$ of the group manifold, C becomes $g_0$. One may also view C as an “indefinite
function” on the group – when the Z’s get evaluated on some particular (analytic) $\phi_0$, the resulting Taylor series delivers $\phi_0$, i.e.,
\[ \langle e^{Z_A \otimes \psi^A},\ \mathrm{id} \otimes g_0 \rangle = g_0, \qquad \langle e^{Z_A \otimes \psi^A},\ \phi_0 \otimes \mathrm{id} \rangle = \phi_0. \tag{24} \]
In the above, $g_0$, $\phi_0$ stand for any element in the corresponding universal enveloping algebra, not just the generators.

The second of (24) gives the relation between the two linear bases $\{f^i_{(\phi)}\}$ and $\{f^i_{(\psi)}\}$, generated by the φ’s and the ψ’s respectively. Indeed, taking $\phi_0 = \phi^A$ and expanding the exponential we find
\[ \phi^A = \sum_{m=0}^{\infty} \frac{1}{m!} \langle Z_{B_1} \dots Z_{B_m}, \phi^A \rangle\, \psi^{B_1} \dots \psi^{B_m} = \psi^A + \frac{1}{2} \langle Z_{B_1} Z_{B_2}, \phi^A \rangle\, \psi^{B_1} \psi^{B_2} + \dots. \tag{25} \]
Lemma 1. The change of linear basis in A generated by (25) is invertible.

Proof. Notice that the linear part of $\phi^A(\psi)$ is $\psi^A$ and also, that the above expansion preserves the v-degree. We choose a linear basis in A with the following ordering (a tree being written as a vertex $\bullet$ followed, in parentheses, by its subtrees)
\[ \{ \underbrace{\phi^{\bullet}}_{v=1},\ \underbrace{\phi^{\bullet(\bullet)},\ \phi^{\bullet}\phi^{\bullet}}_{v=2},\ \underbrace{\phi^{\bullet(\bullet(\bullet))},\ \phi^{\bullet(\bullet\bullet)},\ \phi^{\bullet}\phi^{\bullet(\bullet)},\ (\phi^{\bullet})^3}_{v=3},\ \dots \}, \tag{26} \]
namely, in blocks of increasing v-degree and, within each block, non-decreasing p-degree. The above remarks then show that the matrix A, defined by
\[ f^i_{(\phi)} = A^i{}_j\, f^j_{(\psi)}, \tag{27} \]
where $\{f^i_{(\psi)}\}$ is also ordered as in (26), is upper triangular, with units along the diagonal and hence invertible.

Notice that A is in block-diagonal form, with each block $A_v$ acting on $A^{(v)}$, v = 1, 2, . . . . The computation of $\phi^A(\psi)$, via (25), reduces essentially to the evaluation of the inner product of $\phi^A$ with monomials in the Z’s – this is facilitated by the following

Lemma 2. The inner product $\langle Z_{B_1} \dots Z_{B_m}, \phi^A \rangle$ is given by
\[ \langle Z_{B_1} \dots Z_{B_m}, \phi^A \rangle = \langle Z_{B_1} * \dots * Z_{B_m}, \phi^A \rangle = n_{B_1 \dots B_m}{}^{A}, \tag{28} \]
where
\[ n_{B_1 \dots B_m}{}^{A} = n_{B_1 R_1}{}^{A}\, n_{B_2 R_2}{}^{R_1} \dots n_{B_{m-1} B_m}{}^{R_{m-2}} \tag{29} \]
($Z_{B_1} * \dots * Z_{B_m}$ above is computed starting from the right, e.g., $Z_{B_1} * Z_{B_2} * Z_{B_3} \equiv Z_{B_1} * (Z_{B_2} * Z_{B_3})$).
Proof. We have
\[ \langle Z_{B_1} \dots Z_{B_m}, \phi^A \rangle = \langle Z_{B_1} \otimes \dots \otimes Z_{B_m},\ \Delta^{m-1}(\phi^A) \rangle. \tag{30} \]
In the above inner product, only the m-linear terms in $\Delta^{m-1}(\phi^A)$ contribute, since the Z’s vanish on products and the unit function. One particular way of evaluating the (m − 1)-fold coproduct is to apply Δ always on the rightmost tensor factor. It is then clear that, in this case, we may instead apply $\Delta_{\rm lin}$, since $\prod_{j=1}^{m-1} (\mathrm{id}^{\otimes (j-1)} \otimes \Delta)(\phi^A)$ and $\prod_{j=1}^{m-1} (\mathrm{id}^{\otimes (j-1)} \otimes \Delta_{\rm lin})(\phi^A)$ only differ by terms containing products of the φ’s or units (this is only true if $\Delta_{\rm lin}$ is applied in the rightmost factor). Notice now that the ∗-product of the Z’s is dual to $\Delta_{\rm lin}$,
\[ \langle Z_{B_1} * Z_{B_2}, \phi^A \rangle = \langle Z_{B_1} \otimes Z_{B_2},\ \Delta_{\rm lin}(\phi^A) \rangle. \tag{31} \]
Repeated application of this equation and use of the definition of ∗, Eq. (17), completes the proof.

A concise way to express the relation between the two sets of generators is via the ∗-exponential ($x \in U_1$)
\[ e_*^{x} \equiv \sum_{i=0}^{\infty} \frac{1}{i!}\, x^{*i} = \sum_{i=0}^{\infty} \frac{1}{i!}\, \underbrace{x * \dots * x}_{i \text{ factors}}. \tag{32} \]
Combining (25) and (28) we find
\[ e_*^{Z_A \otimes \psi^A} = Z_B \otimes \phi^B, \tag{33} \]
where the convention $(Z_A \otimes \psi^A) * (Z_B \otimes \psi^B) = Z_A * Z_B \otimes \psi^A \psi^B$ is understood and the sum on the r.h.s. starts with 1 ⊗ 1.
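The expansion (25), with the pairings evaluated through the recursion (29), lends itself to a short mechanical check. The sketch below is an illustration only: the tuple encoding of trees and the names `simple_cuts`, `sequences` and `phi_in_psi` are assumptions introduced here, not notation from the paper. It enumerates every ordered sequence $(B_1, \dots, B_m)$ with non-zero pairing by peeling off one simple cut at a time, and collects the result as a polynomial in the ψ's with exact rational coefficients.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import factorial

# Assumed encoding: a rooted tree is the sorted tuple of its child subtrees.
DOT = ()

def tree(*children):
    return tuple(sorted(children))

def admissible_cuts(t):
    """Yield (pruned forest, trunk) for each admissible cut (empty cut included)."""
    options = [[((c,), None)] + list(admissible_cuts(c)) for c in t]
    for choice in product(*options):
        pruned = tuple(sorted(x for p, _ in choice for x in p))
        trunk = tuple(sorted(r for _, r in choice if r is not None))
        yield pruned, trunk

def simple_cuts(t):
    """Counter {(B, R): n}: number of simple cuts of t with branch B and trunk R."""
    return Counter((p[0], r) for p, r in admissible_cuts(t) if len(p) == 1)

def sequences(a):
    """Yield ((B1, ..., Bm), weight) following the recursion of Eq. (29)."""
    yield (a,), 1
    for (b, r), n in simple_cuts(a).items():
        for seq, c in sequences(r):
            yield (b,) + seq, n * c

def phi_in_psi(a):
    """phi^A as a polynomial in the psi's, Eq. (25).

    Returns {monomial: coefficient}, a monomial being a sorted tuple of trees.
    """
    poly = Counter()
    for seq, c in sequences(a):
        poly[tuple(sorted(seq))] += Fraction(c, factorial(len(seq)))
    return dict(poly)

L2, CHERRY = tree(DOT), tree(DOT, DOT)
for monomial, coeff in phi_in_psi(CHERRY).items():
    print(coeff, monomial)
```

The same sequence may be reached through different intermediate trunks; the accumulation in `poly` then performs the implied sum over the $R_k$ in (29).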
4.2. The Hopf structure. We derive now the Hopf data for the new basis. A standard property of C is
\[ (\Delta \otimes \mathrm{id})\, C = C_{13} C_{23}, \qquad (\mathrm{id} \otimes \Delta)\, C = C_{12} C_{13}, \tag{34} \]
where, e.g., $C_{13} \equiv e^{Z_A \otimes 1 \otimes \psi^A}$ – this is just the product-coproduct duality in (3). The second of (34) permits the calculation of the coproduct of the ψ’s by applying the Baker–Campbell–Hausdorff (BCH) formula to the product on its r.h.s.: $\Delta(\psi^A)$ is the coefficient of $Z_A$ in the resulting single exponential,
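\[ \exp\big( Z_A \otimes \Delta(\psi^A) \big) = \exp\big( Z_A \otimes \psi^A \otimes 1 \big)\, \exp\big( Z_B \otimes 1 \otimes \psi^B \big) \]
\[ = \exp\Big( Z_A \otimes \psi^A \otimes 1 + Z_B \otimes 1 \otimes \psi^B + \tfrac{1}{2} [Z_A, Z_B] \otimes \psi^A \otimes \psi^B + \dots \Big) \]
\[ = \exp\Big( Z_A \otimes \big( \psi^A \otimes 1 + 1 \otimes \psi^A + \tfrac{1}{2} f_{B_1 B_2}{}^{A}\, \psi^{B_1} \otimes \psi^{B_2} + \dots \big) \Big), \tag{35} \]
so that
\[ \Delta(\psi^A) = \psi^A \otimes 1 + 1 \otimes \psi^A + \tfrac{1}{2} f_{B_1 B_2}{}^{A}\, \psi^{B_1} \otimes \psi^{B_2} + \dots. \tag{36} \]
Higher terms in the coproduct can be computed by using a recursion relation for the BCH formula (see, e.g., Sect. 16 of [1]). The counit of all $\psi^A$ vanishes. Although $\Delta(\psi^A)$ can be complicated, $S(\psi^A)$ never is. Using $\langle S(g), \psi^A \rangle = \langle g, S(\psi^A) \rangle$ and the fact that $S(g) = g^{-1}$, it is easily inferred that
\[ S(\psi^A) = -\psi^A, \tag{37} \]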
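which extends as $S(p_r(\psi)) = (-1)^r p_r(\psi)$ on homogeneous polynomials of p-degree r. We see the first of the many advantages of working in the ψ-basis: the antipode is diagonal.

Example 3. Computation of $\psi^{(n)}$, n ≤ 4. Trees are written in bracket notation, a vertex $\bullet$ followed, in parentheses, by its subtrees, with the abbreviations
\[ t_1 = \bullet, \quad t_2 = \bullet(\bullet), \quad t_{31} = \bullet(\bullet(\bullet)), \quad t_{32} = \bullet(\bullet\bullet), \]
\[ t_{41} = \bullet(\bullet(\bullet(\bullet))), \quad t_{42} = \bullet(\bullet(\bullet\bullet)), \quad t_{43} = \bullet(\bullet\ \bullet(\bullet)), \quad t_{44} = \bullet(\bullet\bullet\bullet). \]
A straightforward application of (25) gives
\[ \phi^{t_1} = \psi^{t_1} \langle Z_{t_1}, \phi^{t_1} \rangle = \psi^{t_1}, \]
\[ \phi^{t_2} = \psi^{t_2} \langle Z_{t_2}, \phi^{t_2} \rangle + \tfrac{1}{2} (\psi^{t_1})^2 \langle Z_{t_1} Z_{t_1}, \phi^{t_2} \rangle = \psi^{t_2} + \tfrac{1}{2} (\psi^{t_1})^2, \]
\[ \phi^{t_{31}} = \psi^{t_{31}} + \psi^{t_1} \psi^{t_2} + \tfrac{1}{6} (\psi^{t_1})^3, \]
\[ \phi^{t_{32}} = \psi^{t_{32}} + \psi^{t_1} \psi^{t_2} + \tfrac{1}{3} (\psi^{t_1})^3, \]
\[ \phi^{t_{41}} = \psi^{t_{41}} + \psi^{t_1} \psi^{t_{31}} + \tfrac{1}{2} (\psi^{t_2})^2 + \tfrac{1}{2} (\psi^{t_1})^2 \psi^{t_2} + \tfrac{1}{24} (\psi^{t_1})^4, \]
\[ \phi^{t_{42}} = \psi^{t_{42}} + \psi^{t_1} \psi^{t_{31}} + \tfrac{1}{2} \psi^{t_1} \psi^{t_{32}} + \tfrac{2}{3} (\psi^{t_1})^2 \psi^{t_2} + \tfrac{1}{12} (\psi^{t_1})^4, \]
\[ \phi^{t_{43}} = \psi^{t_{43}} + \tfrac{1}{2} \psi^{t_1} \psi^{t_{31}} + \tfrac{1}{2} \psi^{t_1} \psi^{t_{32}} + \tfrac{1}{2} (\psi^{t_2})^2 + \tfrac{5}{6} (\psi^{t_1})^2 \psi^{t_2} + \tfrac{1}{8} (\psi^{t_1})^4, \]
\[ \phi^{t_{44}} = \psi^{t_{44}} + \tfrac{3}{2} \psi^{t_1} \psi^{t_{32}} + (\psi^{t_1})^2 \psi^{t_2} + \tfrac{1}{4} (\psi^{t_1})^4. \tag{38} \]
Inverting the above expressions we find (on the r.h.s. a tree t stands for the function $\phi^t$)
\[ \psi^{t_1} = t_1, \]
\[ \psi^{t_2} = t_2 - \tfrac{1}{2} t_1^2, \]
\[ \psi^{t_{31}} = t_{31} - t_1 t_2 + \tfrac{1}{3} t_1^3, \]
\[ \psi^{t_{32}} = t_{32} - t_1 t_2 + \tfrac{1}{6} t_1^3, \]
\[ \psi^{t_{41}} = t_{41} - t_1 t_{31} - \tfrac{1}{2} t_2^2 + t_1^2 t_2 - \tfrac{1}{4} t_1^4, \]
\[ \psi^{t_{42}} = t_{42} - t_1 t_{31} + \tfrac{5}{6} t_1^2 t_2 - \tfrac{1}{2} t_1 t_{32} - \tfrac{1}{6} t_1^4, \]
\[ \psi^{t_{43}} = t_{43} - \tfrac{1}{2} t_1 t_{31} - \tfrac{1}{2} t_1 t_{32} + \tfrac{2}{3} t_1^2 t_2 - \tfrac{1}{2} t_2^2 - \tfrac{1}{12} t_1^4, \]
\[ \psi^{t_{44}} = t_{44} - \tfrac{3}{2} t_1 t_{32} + \tfrac{1}{2} t_1^2 t_2. \tag{39} \]
Concerning the coproduct, Eq. (36) shows that all ladder ψ’s are primitive. For the rest of the ψ’s, we get (omitting the primitive part)
\[ \tilde\Delta(\psi^{t_{32}}) = \psi^{t_1} \otimes \psi^{t_2} - \psi^{t_2} \otimes \psi^{t_1}, \]
\[ \tilde\Delta(\psi^{t_{42}}) = \psi^{t_1} \otimes \psi^{t_{31}} - \psi^{t_{31}} \otimes \psi^{t_1} + \tfrac{1}{2} \psi^{t_{32}} \otimes \psi^{t_1} - \tfrac{1}{2} \psi^{t_1} \otimes \psi^{t_{32}} \]
\[ \qquad + \tfrac{1}{6} \psi^{t_1} \psi^{t_2} \otimes \psi^{t_1} + \tfrac{1}{6} \psi^{t_1} \otimes \psi^{t_1} \psi^{t_2} - \tfrac{1}{6} (\psi^{t_1})^2 \otimes \psi^{t_2} - \tfrac{1}{6} \psi^{t_2} \otimes (\psi^{t_1})^2, \]
\[ \tilde\Delta(\psi^{t_{43}}) = \tfrac{1}{2} \psi^{t_1} \otimes \psi^{t_{31}} - \tfrac{1}{2} \psi^{t_{31}} \otimes \psi^{t_1} + \tfrac{1}{2} \psi^{t_1} \otimes \psi^{t_{32}} - \tfrac{1}{2} \psi^{t_{32}} \otimes \psi^{t_1} \]
\[ \qquad - \tfrac{1}{6} \psi^{t_1} \otimes \psi^{t_1} \psi^{t_2} - \tfrac{1}{6} \psi^{t_1} \psi^{t_2} \otimes \psi^{t_1} + \tfrac{1}{6} (\psi^{t_1})^2 \otimes \psi^{t_2} + \tfrac{1}{6} \psi^{t_2} \otimes (\psi^{t_1})^2, \]
\[ \tilde\Delta(\psi^{t_{44}}) = \tfrac{3}{2} \psi^{t_1} \otimes \psi^{t_{32}} - \tfrac{3}{2} \psi^{t_{32}} \otimes \psi^{t_1} - \tfrac{1}{2} \psi^{t_1} \otimes \psi^{t_1} \psi^{t_2} - \tfrac{1}{2} \psi^{t_1} \psi^{t_2} \otimes \psi^{t_1} \]
\[ \qquad + \tfrac{1}{2} (\psi^{t_1})^2 \otimes \psi^{t_2} + \tfrac{1}{2} \psi^{t_2} \otimes (\psi^{t_1})^2. \tag{40} \]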
One can easily verify that $S(\psi^A) = -\psi^A$.

5. Primitive Elements

We turn now to the study of the primitive elements of A. These are of fundamental importance in any Hopf algebra, but acquire even more privileged status in our case, given their rôle in renormalization. Apart from this, they are also of interest in representation theory: given a primitive element a ∈ A, $\Delta(a) = a \otimes 1_A + 1_A \otimes a$, one obtains a one-dimensional representation $\rho_a$ of U via
\[ \rho_a(x) \equiv \langle x, e^a \rangle. \tag{41} \]
Indeed, $e^a$ is group-like, $\Delta(e^a) = e^a \otimes e^a$, so that
\[ \rho_a(xy) \equiv \langle xy, e^a \rangle = \langle x \otimes y,\ e^a \otimes e^a \rangle = \rho_a(x)\, \rho_a(y). \tag{42} \]
Conversely, every one-dimensional representation of U is associated to some primitive element in A. Primitive elements are typically rare, but the algebra of rooted trees is quite exceptional in this respect: there is an infinite number of them in A, with a non-trivial index set. We start our discussion with the easiest case, that of the ladder generators, for which our Theorem 1 below supplies a complete answer. We then turn to the considerably more complicated general case which Theorem 2 reduces to the problem of finding all closed LI 1-forms on G.

5.1. Ladder generators. We consider the subalgebra $\mathcal{T}$ of $H_R$ generated by the ladder generators $T_n$, where n counts the number of vertices. Their coproduct is
\[ \Delta(T_n) = \sum_{k=0}^{n} T_k \otimes T_{n-k}, \tag{43} \]
making $\mathcal{T}$ a sub-Hopf algebra of $H_R$ (notice though that for φ not in $\mathcal{T}$, Δ(φ) may involve terms in $\mathcal{T} \otimes \mathcal{T}$). Experimenting a little we find that, for the first few n’s, each $T_n$ gives rise to a primitive $P^{(n)}$. The general case is handled by the following

Theorem 1. To each ladder generator $T_n$, n = 1, 2, . . . , corresponds a primitive element $P^{(n)}$, with $T_n$ as its linear part, given by
\[ P^{(n)} = \frac{1}{n!} \left. \frac{\partial^n}{\partial x^n} \log\Big( \sum_{m=0}^{\infty} T_m x^m \Big) \right|_{x=0}. \tag{44} \]
Proof. Consider the algebra $\mathcal{F}$ of formal power series $f(x) = \sum_{n=0}^{\infty} c_n x^n$, $c_0 = 1$, with the usual product. Define a basis $\{\xi_n,\ n = 0, 1, 2, \dots\}$ of $\mathcal{F}^*$, the dual of $\mathcal{F}$, via
\[ \langle \xi_n, f(x) \rangle = c_n, \tag{45} \]
i.e., $\xi_n$ reads off the coefficient of $x^n$ in f and $\xi_0 = 1$. For $f''(x) = f(x) f'(x)$ we have²
\[ f''(x) = \sum_{n=0}^{\infty} c''_n x^n, \qquad c''_n = \sum_{k=0}^{n} c_k\, c'_{n-k}, \tag{46} \]
which implies the coproduct
\[ \Delta(\xi_n) = \sum_{k=0}^{n} \xi_k \otimes \xi_{n-k} \tag{47} \]
in $\mathcal{F}^*$. Endowing $\mathcal{F}^*$ with a commutative product, we arrive at the isomorphism $\mathcal{F}^* \cong \mathcal{T}$, as Hopf algebras, with $\xi_n \leftrightarrow T_n$. Define a new basis $\{\sigma_n,\ n = 0, 1, 2, \dots\}$ in $\mathcal{F}^*$ by
\[ \langle \sigma_r, f(x) \rangle = \tilde c_r, \qquad \text{with} \quad f(x) = e^{\sum_{r=1}^{\infty} \tilde c_r x^r} \tag{48} \]
and $\sigma_0 = 1$. Then
\[ f''(x) = e^{\sum_{r=1}^{\infty} \tilde c''_r x^r}, \qquad \text{with} \quad \tilde c''_r = \tilde c_r + \tilde c'_r, \tag{49} \]
implying the coproduct $\Delta(\sigma_r) = \sigma_r \otimes 1 + 1 \otimes \sigma_r$. The σ’s, under the above isomorphism, correspond to the $P^{(n)}$ in $\mathcal{T}$. Solving the equation
\[ e^{\sum_{r=1}^{\infty} P^{(r)} x^r} = \sum_{n=0}^{\infty} T_n x^n \tag{50} \]
for $P^{(r)}$, one arrives at (44).

² Notice that primes only distinguish functions here, they do not denote differentiation.
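We read off $P^{(n)}$, for the first few values of n, as the coefficient of $x^n$ in the Taylor series expansion
\[ \log\Big( \sum_{n=0}^{\infty} T_n x^n \Big) = T_1 x + \big( T_2 - \tfrac{1}{2} T_1^2 \big) x^2 + \big( T_3 - T_1 T_2 + \tfrac{1}{3} T_1^3 \big) x^3 \]
\[ \quad + \big( T_4 - T_1 T_3 - \tfrac{1}{2} T_2^2 + T_1^2 T_2 - \tfrac{1}{4} T_1^4 \big) x^4 \]
\[ \quad + \big( T_5 - T_1 T_4 - T_2 T_3 + T_1^2 T_3 + T_1 T_2^2 - T_1^3 T_2 + \tfrac{1}{5} T_1^5 \big) x^5 \]
\[ \quad + \big( T_6 - T_1 T_5 - T_2 T_4 - \tfrac{1}{2} T_3^2 + T_1^2 T_4 + 2 T_1 T_2 T_3 - T_1^3 T_3 + \tfrac{1}{3} T_2^3 - \tfrac{3}{2} T_1^2 T_2^2 + T_1^4 T_2 - \tfrac{1}{6} T_1^6 \big) x^6 + \dots. \tag{51} \]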
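The coefficients in (51) can be generated mechanically. The sketch below does not differentiate n times as in (44); instead it uses a recursion obtained by differentiating (50) with respect to x and comparing powers of x, namely $n\,T_n = \sum_{k=1}^{n} k\,P^{(k)}\,T_{n-k}$ with $T_0 = 1$. This recursion, the function name `ladder_primitives` and the encoding of a monomial $T_{i_1} \cdots T_{i_r}$ as the sorted tuple $(i_1, \dots, i_r)$ are choices made here for illustration, not taken from the paper.

```python
from fractions import Fraction

def ladder_primitives(n_max):
    """Return {n: P^(n)} for n = 1..n_max.

    Each P^(n) is a dict {monomial: coefficient}; the monomial T_i1 ... T_ir
    is stored as the sorted tuple (i1, ..., ir).
    Recursion used: P^(n) = T_n - (1/n) * sum_{k=1}^{n-1} k * P^(k) * T_{n-k}.
    """
    P = {}
    for n in range(1, n_max + 1):
        poly = {(n,): Fraction(1)}
        for k in range(1, n):
            for monomial, coeff in P[k].items():
                key = tuple(sorted(monomial + (n - k,)))
                poly[key] = poly.get(key, Fraction(0)) - Fraction(k, n) * coeff
        P[n] = {m: c for m, c in poly.items() if c != 0}
    return P

P = ladder_primitives(6)
for n in (2, 3, 4):
    print(n, P[n])
```

Exact rational arithmetic (`fractions.Fraction`) keeps the coefficients in the same form in which they appear in (51).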
The polynomials $P^{(n)}(T_i)$ are known as Schur polynomials.

5.2. The general case. Given a closed LI 1-form α on G, there exists a linear combination $\phi^i$ of the generators $\phi^A$ such that $\alpha = \Pi_{\phi^i}$. Applying the inverse Poincaré lemma, we may write (locally)
\[ \Pi_{\phi^i} = d\psi^i, \tag{52} \]
for some function $\psi^i$ in A. Requiring additionally that $\psi^i$ vanish at the origin, $\epsilon(\psi^i) = 0$, fixes the constant left arbitrary by (52) to zero. $\psi^i$ can be expressed in terms of the φ’s. Since $\Pi_{\phi^i}$ reduces to $d\phi^i$ at the origin, the linear part $\psi^i_{\rm lin}$ of $\psi^i(\phi)$ is $\phi^i$. But then $\Pi_{\phi^i} = \Pi_{\psi^i}$, since Π projects to the linear part. Comparing the r.h.s. of (52) with the general expression for a LI 1-form, Eq. (5), we conclude that $\psi^i$ is primitive. Conversely, every primitive function $\psi^i$ gives rise to a closed LI 1-form, $d\Pi_{\psi^i} = dd\psi^i = 0 = d\Pi_{\psi^i_{\rm lin}}$.
Equation (7), and the comment that follows it, show that $\Delta_{\rm lin}(\phi^i)$ is symmetric under the interchange of its two tensor factors. This observation leads to a particularly simple way to identify primitive elements. One first looks for linear combinations $\phi^i$ of the $\phi^A$ with symmetric $\Delta_{\rm lin}(\phi^i)$ (notice that $\Delta_{\rm lin}$ is given by simple cuts). The explicit expression for the corresponding primitive $\psi^i$ then is given by the standard formula for the (local) potential of a closed form. We find that the result is simplified considerably due to the particular form of the coproduct of the $\phi^A$, namely the linearity of $\Delta(\phi^A)$ in its second tensor factor.

Theorem 2. Given $\phi^i \in A_1$, such that $d\Pi_{\phi^i} = 0$. Then the element $\psi^i$ of A, given by
\[ \psi^i = -\Phi^{-1} \circ S(\phi^i), \tag{53} \]
is primitive and has $\phi^i$ as its linear part (Φ above is the p-degree operator for the φ’s, $\Phi(\phi^{A_1} \dots \phi^{A_r}) = r\, \phi^{A_1} \dots \phi^{A_r}$).

Proof. We apply the inverse Poincaré lemma to $\Pi_{\phi^i}$. For a given v-degree n, only $\phi^A$ of v-degree up to n enter in the formulas – we denote them collectively by x (e.g., S(φ)(x) denotes the standard expression of S(φ) in terms of the $\phi^A$ while S(φ)(zx) denotes the same expression with every $\phi^A$ multiplied by z). Consider the family of diffeomorphisms $\varphi_t : x \mapsto (1 - t)x$, 0 ≤ t ≤ 1. Then $\varphi_0^*$ is the identity map while $\varphi_1^*$ is the zero map. The corresponding velocity field is
\[ v = \frac{d}{dt} \varphi_t(x) = -x \quad \Rightarrow \quad v(y, t) = -\frac{1}{1-t}\, y, \tag{54} \]
where $y = \varphi_t(x)$. We have³
\[ \Pi_{\phi^i}(x) = \varphi_0^* \big( \Pi_{\phi^i}(\varphi_0(x)) \big) - \varphi_1^* \big( \Pi_{\phi^i}(\varphi_1(x)) \big) = \int_1^0 dt\, \frac{d}{dt}\, \varphi_t^* \big( \Pi_{\phi^i}(y) \big). \tag{55} \]
However, we find $\frac{d}{dt} \varphi_t^* = \varphi_t^* L_v = \varphi_t^* (d\, i_v + i_v\, d)$ and, taking into account the closure of $\Pi_{\phi^i}$,
\[ \Pi_{\phi^i}(x) = d \int_1^0 dt\, \varphi_t^* \big( i_v \Pi_{\phi^i}(y) \big). \tag{56} \]
This is the inverse Poincaré lemma. We concentrate now on the action of $i_v$ on $\Pi_{\phi^i}(y)$. We have
\[ i_v = -\frac{1}{1-t}\, y^j\, i_{\partial_{y^j}}, \qquad \Pi_{\phi^i}(y) = S(\phi^i_{(1)})\, d\phi^i_{(2)}(y). \tag{57} \]
In this latter (implied) sum, all terms in the coproduct of $\phi^i$ appear except the first one, $\phi^i \otimes 1$, which is annihilated by d. Notice now that $y^j\, i_{\partial_{y^j}}\, dy^i = y^i$. Since $\Delta(\phi^i)$ is linear in its second factor we conclude that
\[ y^j\, i_{\partial_{y^j}} \big( S(\phi^i_{(1)})\, d\phi^i_{(2)} \big)(y) = \big( S(\phi^i_{(1)})\, \phi^i_{(2)} \big)(y) - S(\phi^i)(y) = -S(\phi^i)(y). \tag{58} \]
Substituting back into (56) and putting 1 − t ≡ z we find
\[ \Pi_{\phi^i}(x) = -d \int_0^1 \frac{dz}{z}\, S(\phi^i)(zx), \tag{59} \]

³ We ignore in the sequel the singularity of v at t = 1 – it is easily shown to be harmless.
which, upon performing the integration over z, gives $\Pi_{\phi^i} = -d\, \Phi^{-1} \circ S(\phi^i)$, with Φ the p-degree operator of Theorem 2. The remarks preceding the theorem complete the proof.

5.3. The lower central series and k-primitiveness. We extend here the notion of primitiveness to that of k-primitiveness. Our starting point is our BCH-based prescription for calculating the coproduct of the ψ’s, Eq. (36). Suppose we identify all generators $Z_i^{[1]}$ of $\mathcal{G}$ that cannot be written as commutators (the $Z_i^{[1]}$ are, in general, linear combinations of the $Z_A$). Then we may perform a linear change of basis in $\mathcal{G}$ and split the generators into two classes, one made up of the above $Z_i^{[1]}$ and the other spanning the complement – we denote the latter by $\{Z'_i\}$. Writing the canonical element in the new basis,
\[ C = e^{Z_i^{[1]} \otimes \psi^i_{[1]} + Z'_i \otimes \psi'^i}, \tag{60} \]
we are led to the identification of the $\psi^i_{[1]}$ with the primitive ψ’s. This is so since, in the BCH formula, the $Z_i^{[1]}$ are never produced by the commutators, so that the only contribution to $\Delta(\psi^i_{[1]})$ is the primitive part. Consider now the lower central series of $\mathcal{G}$, consisting of the series of subspaces $\mathcal{G}^{[1]}, \mathcal{G}^{[2]}, \dots$. A particular Z in $\mathcal{G}$ belongs to $\mathcal{G}^{[k]}$ if it can be written as a (k − 1)-nested commutator. This implies that if Z belongs to $\mathcal{G}^{[k]}$, it also belongs to all $\mathcal{G}^{[r]}$, with r < k. This is the standard definition of $\mathcal{G}^{[k]}$ – we actually need a slightly modified one, according to which Z belongs only to the $\mathcal{G}^{[k]}$ with the maximum k. With this definition, $\mathcal{G}^{[k]} \cap \mathcal{G}^{[r]} = \emptyset$ whenever k ≠ r. We may now perform a linear change of basis in $\mathcal{G}$ such that each generator $Z_i^{[k]}$ in the new basis belongs to $\mathcal{G}^{[k]}$. Writing the canonical element in the form
\[ C = e^{Z_i^{[k]} \otimes \psi^i_{[k]}}, \tag{61} \]
defines the k-primitiveness for the $\psi^i_{[k]}$ dual to the above $Z_i^{[k]}$. Since the $Z_i^{[k]}$ are linear combinations of the $Z_A$, the $\psi^i_{[k]}$ will be linear combinations of the $\psi^A$. A splits accordingly to a direct sum, $A = \bigoplus_{k=1}^{\infty} A^{[k]}$ – the primitive ψ’s, in particular, span $A^{[1]}$. Notice that ψ’s with n vertices may belong to $A^{[k]}$ with k ≤ n − 1. This is so because the “longest” nested commutator with n vertices is $[Z_{\bullet}, [Z_{\bullet}, \dots [Z_{\bullet}, Z_{\bullet(\bullet)}]] \dots ]$, with n − 2 entries of $Z_{\bullet}$ (here $\bullet$ is the single-vertex tree and $\bullet(\bullet)$ the two-vertex ladder).

The above concept of k-primitiveness arose naturally in our study of the primitive $\psi^i_{[1]}$. Some time afterwards, we became aware of Ref. [3], where a concept of k-primitiveness is also defined, as follows: given an element χ of A, one computes successive powers of the coproduct, $\Delta^k(\chi)$. There is a minimum k for which all terms in $\Delta^k(\chi)$ contain a unity in at least one of the tensor factors – this defines the k-primitiveness of χ. Our definition is intrinsically defined only on the generators $\psi^i_{[k]}$, while the above makes sense in all of A. We now show that, for $\psi^i_{[k]}$, the two definitions coincide.

Lemma 3. The minimum value of r for which $\Delta^r(\psi^i_{[k]})$ contains at least one unit tensor factor in each of its terms, is r = k.
Proof. The various powers of the coproduct of $\psi^i_{[k]}$ can be computed by iteration of the second of (34),
\[ \Delta^{r-1}(\psi^i_{[k]}) = \text{coeff. of } Z_i^{[k]} \text{ in } \log\big( C_{01} C_{02} \dots C_{0r} \big). \tag{62} \]
This shows that in $\Delta^k(\psi^i_{[k]})$, the (k + 1)-linear term can only be produced by the k-nested commutator
\[ [Z_{i_1}, [Z_{i_2}, \dots [Z_{i_k}, Z_{i_{k+1}}]] \dots ] \otimes \psi^{i_1} \otimes \dots \otimes \psi^{i_{k+1}}. \]
The latter, however, has no $Z_i^{[k]}$ component, since $Z_i^{[k]}$ can be written as a (k − 1)-nested commutator at most. It is also clear, for the same reason, that there are no terms of higher p-degree in the ψ’s, as those would correspond to even longer nested commutators. $\Delta^k(\psi^i_{[k]})$ then must have at least one unit tensor factor in each of its terms. On the other hand, the k-linear term in $\Delta^{k-1}(\psi^i_{[k]})$ is not zero, because, by definition, the corresponding (k − 1)-nested commutator has a $Z_i^{[k]}$ component.

As shown in [3], the k-degree satisfies
\[ \deg_k\big( \psi^i_{[k_1]}\, \psi^j_{[k_2]} \big) = k_1 + k_2. \tag{63} \]
We use the two definitions of the k-degree interchangeably in what follows. We may now clarify the relation between the primitive elements given by the inverse Poincaré formula, Eq. (53), and the ones introduced above via the lower central series of $\mathcal{G}$.
Lemma 4. Given $\phi^i = c^i{}_A\, \phi^A$, with $c^i{}_A$ constants, such that $d\Pi_{\phi^i} = 0$. Then the primitive element $\psi^i$ of (53) is equal to $c^i{}_A\, \psi^A$, i.e.,
\[ \psi^i = -\Phi^{-1} \circ S(\phi^i) = c^i{}_A\, \psi^A, \tag{64} \]
with Φ the p-degree operator of Theorem 2. All primitive elements of A can be obtained in this form.

Proof. Any linear combination of the $\psi^i_{[1]}$ is primitive, while (sums of) products of them are not, due to (63). Therefore, the $\psi^i_{[1]}$ constitute a linear basis in the vector space of primitive elements of A. To the given $\phi^i$, Eq. (53) associates a primitive element $\psi^i$, with $\phi^i$ as its linear part. The unique linear combination of the $\psi^A$ (and, hence, of the $\psi^i_{[1]}$) with this linear part is $\psi^i = c^i{}_A\, \psi^A$.
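Formula (53) is easy to test on small trees once the antipode is available. The following sketch computes S recursively from the defining relation $m(S \otimes \mathrm{id})\Delta = \epsilon$ and then divides each monomial by its p-degree. It is an illustration under assumptions: the tuple encoding of trees, the polynomial representation and the names `antipode` and `primitive_from` are introduced here and are not the authors' notation (their own computations used a REDUCE program).

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumed encoding: a rooted tree is the sorted tuple of its child subtrees.
DOT = ()

def tree(*children):
    return tuple(sorted(children))

def admissible_cuts(t):
    """Yield (pruned forest, trunk) for each admissible cut (empty cut included, full cut not)."""
    options = [[((c,), None)] + list(admissible_cuts(c)) for c in t]
    for choice in product(*options):
        pruned = tuple(sorted(x for p, _ in choice for x in p))
        trunk = tuple(sorted(r for _, r in choice if r is not None))
        yield pruned, trunk

def mul(p, q):
    """Product of polynomials {monomial: coefficient}; a monomial is a sorted tuple of trees."""
    out = Counter()
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            out[tuple(sorted(m1 + m2))] += c1 * c2
    return {m: c for m, c in out.items() if c != 0}

def antipode(t):
    """S(phi^T) from m(S (x) id)Delta = epsilon: S(T) = -T - sum_C S(P^C(T)) R^C(T)."""
    result = Counter({(t,): Fraction(-1)})
    for pruned, trunk in admissible_cuts(t):
        if not pruned:                       # empty cut: this is the -T term above
            continue
        s_pruned = {(): Fraction(1)}
        for branch in pruned:                # S is multiplicative on the pruned forest
            s_pruned = mul(s_pruned, antipode(branch))
        for m, c in mul(s_pruned, {(trunk,): Fraction(1)}).items():
            result[m] -= c
    return {m: c for m, c in result.items() if c != 0}

def primitive_from(phi):
    """Apply -(p-degree)^(-1) o S, cf. Eq. (53), to phi = {tree: coefficient}."""
    psi = Counter()
    for t, c in phi.items():
        for m, s in antipode(t).items():
            psi[m] -= Fraction(c) * s / len(m)
    return {m: c for m, c in psi.items() if c != 0}

L2 = tree(DOT)
L3 = tree(L2)
print(primitive_from({L2: 1}))
print(primitive_from({L3: 1}))
```

For the ladders the output reproduces the first entries of (51); for a non-ladder input the result is only meaningful when the input combination has symmetric $\Delta_{\rm lin}$, which the sketch does not check.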
We give an example illustrating the above. Example 4. Construction of G (n)[k] , A(n)[k] , for n ≤ 4. To identify the generators of G (n)[k] , we construct all (k−1)-nested commutators with n vertices – G (n)[1] is determined (n)[k] in G (n) (below we use the orthogonal complement as the complement of n−1 k=2 G but this is not essential, one simply has to complete the basis of the Z’s). This gives a matrix that effects the transition from the basis {ZA }, indexed by rooted trees, to the i in terms basis {Zi[k] }, of definite k-primitiveness. The inverse matrix then gives the ψ[k] of the ψ A .
480
C. Chryssomalakos, H. Quevedo, M. Rosenbaum, J. D. Vergara
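(Rooted trees are written in nested-parenthesis form, one pair of parentheses per vertex enclosing its subtrees: $()$ is the single vertex, $(())$ the two-vertex tree, $((()))$ the three-vertex ladder, $(()())$ the root with two leaves.) $G^{(1)[1]} = G^{(1)}$ is generated by $Z_{()}$. $G^{(2)[1]} = G^{(2)}$ is generated by $Z_{(())}$, since the only commutator with two vertices, $[Z_{()}, Z_{()}]$, is zero. For n = 3, we have the only non-zero commutator$^4$ $Z_1^{(3)[2]} \equiv [Z_{()}, Z_{(())}] = 2 Z_{(()())}$. The complement in $G^{(3)}$ is spanned by $Z_1^{(3)[1]} = Z_{((()))}$. Next we look at the case n = 4. We find the only non-zero commutators
\[
[Z_{()}, Z_{((()))}] = (0, 2, 1, 0) \equiv Z_1^{(4)[2]}, \qquad
[Z_{()}, Z_{(()())}] = (0, -1, 1, 3) \equiv Z_1^{(4)[3]},
\tag{65}
\]
in the basis $Z_{(((())))}$, $Z_{((()()))}$, $Z_{(()(()))}$, $Z_{(()()())}$. The orthogonal complement in $G^{(4)}$ is spanned by
\[
Z_1^{(4)[1]} = (1, 0, 0, 0), \qquad Z_2^{(4)[1]} = (0, 1, -2, 1).
\tag{66}
\]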
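Writing the above change of basis symbolically as $Z_i^{[k]} = M Z_A$, with M a matrix of numerical coefficients, the dual change of basis for the ψ's is given by $\psi^i_{[k]} = \psi^A M^{-1}$. We find (trees in nested-parenthesis form, one pair of parentheses per vertex)
\[
\psi^1_{(1)[1]} = \psi^{()}, \qquad
\psi^1_{(2)[1]} = \psi^{(())}, \qquad
\psi^1_{(3)[1]} = \psi^{((()))}, \qquad
\psi^1_{(3)[2]} = \tfrac{1}{2}\,\psi^{(()())},
\tag{67}
\]
while, for n = 4,
\begin{align}
\psi^1_{(4)[1]} &= \psi^{(((())))}, \nonumber\\
\psi^2_{(4)[1]} &= \tfrac{1}{6}\,\psi^{((()()))} - \tfrac{1}{3}\,\psi^{(()(()))} + \tfrac{1}{6}\,\psi^{(()()())}, \nonumber\\
\psi^1_{(4)[2]} &= \tfrac{7}{18}\,\psi^{((()()))} + \tfrac{2}{9}\,\psi^{(()(()))} + \tfrac{1}{18}\,\psi^{(()()())}, \nonumber\\
\psi^1_{(4)[3]} &= -\tfrac{1}{18}\,\psi^{((()()))} + \tfrac{1}{9}\,\psi^{(()(()))} + \tfrac{5}{18}\,\psi^{(()()())}.
\tag{68}
\end{align}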
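The coefficients in (68) can be cross-checked by inverting the change-of-basis matrix built from (65) and (66). The following is only a sketch: the row ordering of `M` (the two 1-primitive vectors, then the 2- and 3-primitive ones) and the helper name `invert` are assumptions made for this illustration, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a square matrix exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Rows: Z_1^{(4)[1]}, Z_2^{(4)[1]}, Z_1^{(4)[2]}, Z_1^{(4)[3]} in the tree basis of (65).
M = [
    [1, 0, 0, 0],
    [0, 1, -2, 1],
    [0, 2, 1, 0],
    [0, -1, 1, 3],
]
M_inv = invert(M)

# Column i of M^{-1} holds the coefficients of the i-th psi_{[k]} in the psi^A.
columns = [[M_inv[r][c] for r in range(4)] for c in range(4)]
for name, col in zip(["psi^1_(4)[1]", "psi^2_(4)[1]", "psi^1_(4)[2]", "psi^1_(4)[3]"], columns):
    print(name, [str(v) for v in col])
```

Each printed column lists the coefficients with respect to the four-vertex tree basis in the order used in (65).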
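Referring to, e.g., $\psi^2_{(4)[1]}$, one easily verifies that (trees in nested-parenthesis form, one pair of parentheses per vertex)
\[
\phi^i = \tfrac{1}{6}\,\phi^{((()()))} - \tfrac{1}{3}\,\phi^{(()(()))} + \tfrac{1}{6}\,\phi^{(()()())}
\tag{69}
\]
has symmetric lin and, when inserted in (53), delivers $\psi^2_{(4)[1]}$.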
To continue the above construction to the cases n = 5, 6, we developed a REDUCE program, incorporating some of the procedures of [2]. The numbers $P_{n,k}$ of k-primitive ψ's with n ≤ 6 vertices that we find coincide with the ones in Table 4 of [3], as expected. As far as the primitive ψ's are concerned, the procedure presented above, starting with φ's with symmetric lin and then using (53), should be considerably more efficient than the one used in [3]; it would be interesting to quantify this statement. Notice that an equivalent procedure involves expanding the primitive ψ's as $\psi_{[1]} = c_A \psi^A$ and then determining the constants $c_A$ from the set of equations $f_{RS}{}^{T} \langle Z_T, \psi_{[1]} \rangle = 0$ (the latter is the statement that $\psi_{[1]}$ is invariant under the coadjoint coaction).
4 We remind the reader of our notation: $Z_i^{(n)[k]}$ is the i-th element in the subspace of k-primitive, n-vertex Z's. The same notation is used for the ψ's, with the position of the indices (upper–lower) interchanged.
6. Normal Coordinates and Toy Model Renormalization
We turn now to what, in some sense, is our main objective, namely, the application of the formalism presented so far to the problem of renormalization in perturbative quantum field theory. The scope of our considerations in this section can only be modest, since realistic quantum field theories involve rooted trees with an infinite number of decorations. Nevertheless, a toy model exists (see [10]) that realizes the $\phi^A$ as nested divergent integrals, regulated by a parameter ε. We find this an extremely useful construct that captures many of the most important features of realistic renormalization; again, we refer the reader to [10, 6] for a detailed presentation. What we are interested in here is the rôle of the new coordinates ψ in the renormalization of divergent quantities. We start with a brief review of the basics.
6.1. The toy model. The elementary divergence in the toy model we deal with is given by the integral
\[
I_1(c;\epsilon) = \int_0^\infty dy\, \frac{y^{-\epsilon}}{y+c},
\tag{70}
\]
which diverges as ε goes to zero. c above will be referred to as the external parameter of the integral. We associate the function $\phi^{()}$ of the single-vertex tree with $I_1(c;\epsilon)$ (rooted trees are written in nested-parenthesis form, one pair of parentheses per vertex enclosing its subtrees). To the function $\phi^{(())}$ of the two-vertex tree corresponds the nested integral
\[
I_2(c;\epsilon) = \int_0^\infty dy_1\, \frac{y_1^{-\epsilon}}{y_1+c}\, I_1(y_1;\epsilon)
= \int_0^\infty dy_1\, \frac{y_1^{-\epsilon}}{y_1+c} \int_0^\infty dy_2\, \frac{y_2^{-\epsilon}}{y_2+y_1}.
\tag{71}
\]
Notice that the external parameter of the subdivergence $I_1$ is $y_1$. To $\phi^{(()())}$ (root with two leaves) and $\phi^{((()))}$ (three-vertex ladder) correspond, respectively,
\[
I_{3,1}(c;\epsilon) = \int_0^\infty dy_1\, \frac{y_1^{-\epsilon}}{y_1+c}\, \big(I_1(y_1;\epsilon)\big)^2, \qquad
I_{3,2}(c;\epsilon) = \int_0^\infty dy_1\, \frac{y_1^{-\epsilon}}{y_1+c}\, I_2(y_1;\epsilon);
\tag{72}
\]
it should be clear how this assignment extends to all $\phi^A$. In this way, each $\phi^A$ can be associated with the Laurent series in ε that corresponds to its associated integral, e.g.
\[
\phi^{()} = \int_0^\infty dy\, \frac{y^{-\epsilon}}{y+c} = c^{-\epsilon}\, \frac{\pi}{\sin(\pi\epsilon)} = \frac{1}{\epsilon} - a + O(\epsilon),
\tag{73}
\]
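where a ≡ log(c) and, similarly (using MAPLE; trees in nested-parenthesis form, one pair of parentheses per vertex),
\begin{align}
\phi^{(())} &= \frac{1}{2\epsilon^2} - \frac{a}{\epsilon} + a^2 + \frac{5\pi^2}{12} + O(\epsilon), \nonumber\\
\phi^{((()))} &= \frac{1}{6\epsilon^3} - \frac{a}{2\epsilon^2} + \Big(\frac{3a^2}{4} + \frac{7\pi^2}{18}\Big)\frac{1}{\epsilon} - \frac{a}{12}\big(9a^2 + 14\pi^2\big) + O(\epsilon), \nonumber\\
\phi^{(()())} &= \frac{1}{3\epsilon^3} - \frac{a}{\epsilon^2} + \Big(\frac{3a^2}{2} + \frac{11\pi^2}{18}\Big)\frac{1}{\epsilon} - \frac{a}{6}\big(9a^2 + 11\pi^2\big) + O(\epsilon), \nonumber\\
\phi^{(((())))} &= \frac{1}{24\epsilon^4} - \frac{a}{6\epsilon^3} + \Big(\frac{a^2}{3} + \frac{5\pi^2}{24}\Big)\frac{1}{\epsilon^2} - \frac{a}{18}\big(8a^2 + 15\pi^2\big)\frac{1}{\epsilon} + O(\epsilon^0), \nonumber\\
\phi^{((()()))} &= \frac{1}{12\epsilon^4} - \frac{a}{3\epsilon^3} + \Big(\frac{2a^2}{3} + \frac{3\pi^2}{8}\Big)\frac{1}{\epsilon^2} - \frac{a}{18}\big(16a^2 + 27\pi^2\big)\frac{1}{\epsilon} + O(\epsilon^0), \nonumber\\
\phi^{(()(()))} &= \frac{1}{8\epsilon^4} - \frac{a}{2\epsilon^3} + \Big(a^2 + \frac{11\pi^2}{24}\Big)\frac{1}{\epsilon^2} - \frac{a}{6}\big(8a^2 + 11\pi^2\big)\frac{1}{\epsilon} + O(\epsilon^0), \nonumber\\
\phi^{(()()())} &= \frac{1}{4\epsilon^4} - \frac{a}{\epsilon^3} + \Big(2a^2 + \frac{19\pi^2}{24}\Big)\frac{1}{\epsilon^2} - \frac{a}{6}\big(16a^2 + 19\pi^2\big)\frac{1}{\epsilon} + O(\epsilon^0),
\tag{74}
\end{align}
and so on. It is easily seen that φ's with n vertices give rise to Laurent series with leading pole of order n. The process of renormalization assigns to each $\phi^A$ a finite "renormalized" value $\phi^A_R$ (see, e.g., [5]). In Hopf algebraic terms, the latter is given by [2]
\[
\phi^A_R = S_R\big(\phi^A_{(1)}\big)\, \phi^A_{(2)},
\tag{75}
\]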
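The ladder entries of (73) and (74) can be spot-checked numerically. The sketch below assumes the closed form obtained by applying $\int_0^\infty dy\, y^{-s}/(y+c) = \pi c^{-s}/\sin(\pi s)$ (valid for 0 < s < 1) once per nesting level, so that the n-vertex ladder equals $c^{-n\epsilon}\prod_{k=1}^{n} \pi/\sin(k\pi\epsilon)$; the function names are illustrative only.

```python
import math


def ladder_phi(n, c, eps):
    """Closed form of the n-vertex ladder integral (assumes 0 < n*eps < 1)."""
    value = c ** (-n * eps)
    for k in range(1, n + 1):
        value *= math.pi / math.sin(k * math.pi * eps)
    return value


def laurent_ladder(n, c, eps):
    """Truncated Laurent series of the ladder trees, copied from (73)-(74)."""
    a = math.log(c)
    p2 = math.pi ** 2
    if n == 1:
        return 1 / eps - a
    if n == 2:
        return 1 / (2 * eps**2) - a / eps + a**2 + 5 * p2 / 12
    if n == 3:
        return (1 / (6 * eps**3) - a / (2 * eps**2)
                + (3 * a**2 / 4 + 7 * p2 / 18) / eps
                - a * (9 * a**2 + 14 * p2) / 12)
    raise ValueError("only n = 1, 2, 3 are tabulated in this sketch")


def truncation_error(n, c=2.0, eps=1e-4):
    """Difference between closed form and truncated series; should be O(eps)."""
    return ladder_phi(n, c, eps) - laurent_ladder(n, c, eps)


for n in (1, 2, 3):
    print(n, truncation_error(n))
```

The printed differences shrink linearly with `eps`, as expected for series truncated at O(ε).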
where the twisted antipode $S_R$ is defined recursively by
\[
S_R\big(\phi^A\big) = -R\big(\phi^A\big) - R\Big( S_R\big(\phi^A_{(1')}\big)\, \phi^A_{(2')} \Big).
\tag{76}
\]
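As a small worked instance of (75) and (76), the sketch below renormalizes the two-vertex tree numerically. It rests on assumptions not spelled out at this point of the text: R is taken to be the pole part at external parameter c = 1 (the choice the text goes on to make), the coproduct of the two-vertex tree is taken to be $\phi^{(())} \otimes 1 + 1 \otimes \phi^{(())} + \phi^{()} \otimes \phi^{()}$, and the pole parts are inserted by hand from the Laurent series rather than extracted symbolically.

```python
import math


def phi_vertex(c, eps):
    """phi of the single-vertex tree, Eq. (73) in closed form."""
    return math.pi * c ** (-eps) / math.sin(math.pi * eps)


def phi_ladder2(c, eps):
    """phi of the two-vertex tree: two nested one-loop integrals."""
    return (math.pi / math.sin(math.pi * eps)) * (
        math.pi * c ** (-2 * eps) / math.sin(2 * math.pi * eps))


def renormalized_ladder2(c, eps):
    # S_R(phi^()) = -R(phi^()) = -1/eps  (pole part at c = 1).
    s_vertex = -1.0 / eps
    # S_R(phi^(())) = -R(phi^(())) - R(S_R(phi^()) phi^())
    #              = -1/(2 eps^2) + 1/eps^2   (pole parts at c = 1, by hand).
    s_ladder2 = -1.0 / (2 * eps**2) + 1.0 / eps**2
    # Eq. (75) with the assumed coproduct: phi + S_R(phi^()) phi^() + S_R(phi).
    return phi_ladder2(c, eps) + s_vertex * phi_vertex(c, eps) + s_ladder2


def expected_limit(c):
    """Finite value a^2/2 + pi^2/4 obtained by cancelling the poles by hand."""
    a = math.log(c)
    return a * a / 2 + math.pi ** 2 / 4


print(renormalized_ladder2(2.0, 1e-4), expected_limit(2.0))
```

All c-dependent poles cancel, and the subtracted terms themselves do not depend on c, which is the point made in the text about counterterms being independent of external parameters.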
R above is a renormalization map that we choose here to give the pole part of its argument, evaluated at the external parameter equal to 1, e.g., $R(\phi^{(())}) = 1/(2\epsilon^2)$ for the two-vertex tree $(())$ (compare with the first of (74)). The primed sum in the second term of (76) excludes the primitive part of the coproduct. The magic of renormalization lies in the fact that, for any $\phi^A$, the renormalized $\phi^A_R$ in (75) has no poles in ε; what makes this statement non-trivial is that all terms subtracted iteratively from $\phi^A$, to give $\phi^A_R$, are independent of external parameters. We conclude our brief review with the following statement, proven in [11]: if R satisfies the multiplicative constraint
\[
R(xy) - R\big(R(x)\,y\big) - R\big(x\,R(y)\big) + R(x)R(y) = 0,
\tag{77}
\]
then $S_R$ is multiplicative, $S_R(xy) = S_R(x)S_R(y)$; our choice of R above does satisfy (77).
6.2. Renormalization in the ψ-basis. For a given number n of vertices, the renormalization of every generator $\phi^A$ gives rise to $2^n$ counterterms, for a total of $r_n 2^n$, where $r_n$ is the number of rooted trees with n vertices. To renormalize the ψ's, one can always express them in terms of the φ's and then proceed as above. However, for renormalization schemes R that satisfy (77), a much more efficient possibility arises. Equation (75), in this case, is valid for any function in A, and, in particular, for the ψ's. Notice that although the action of the antipode S is trivial on the $\psi^A$, that of the twisted antipode $S_R$ is not, in general. The advantage of working in the basis $\{\psi^i_{[k]}\}$ is that the complexity of the renormalization of a generator $\psi^i_{(n)[k]}$ is governed by k, not n, which entails, in general, significant savings. As an extreme example, a primitive ψ with one hundred vertices is renormalized by a simple subtraction; this should be compared with the $2^{100}$ counterterms necessary for the renormalization of each of the $\phi_{(100)}$'s. How significant the savings can be in, e.g., CPU time, depends on the distribution of the $\psi^i_{(n)}$ in the various k-classes. As proved in [3], the numbers $P_{n,k}$ of k-primitive ψ's with n vertices are generated by
\[
P_k(x) \equiv \sum_{n=1}^{\infty} P_{n,k}\, x^n = \sum_{s|k} \frac{\mu(s)}{k} \Big( 1 - \prod_{n=1}^{\infty} \big(1 - x^{ns}\big)^{r_n} \Big)^{k/s},
\tag{78}
\]
a rather non-trivial result. The sum in the r.h.s. above extends over all divisors s of k, including 1 and k. µ(s) is the Möbius function, equal to zero if s is divisible by a square, and to $(-1)^p$ if s is the product of p distinct primes (µ(1) ≡ 1). Of particular interest to us is the asymptotic behavior of $P_{n,k}$, for large values of n [3],
\[
f_k \equiv \lim_{n\to\infty} \frac{P_{n,k}}{r_n} = \frac{1}{c} \Big(1 - \frac{1}{c}\Big)^{k-1},
\tag{79}
\]
where c = 2.95 . . . is the Otter constant. This is encouraging, as the population of the CPU-intensive high-k ψ's is seen to be exponentially suppressed. A realistic estimate of the complexity of renormalization in the ψ-basis is outside the scope of this article, as it would probably entail implementation-dependent parameters. Nevertheless, we attempt a first-order estimation by assigning a computational cost of $2^k$ to a k-primitive ψ, while the $\phi_{(n)}$ are assigned the cost $2^n$. The ratio of the total costs of renormalizing all generators with n vertices in the two bases then is
\[
\rho_n = \frac{r_n 2^n}{\sum_{k=1}^{n-1} P_{n,k}\, 2^k} \approx (c-2) \Big(\frac{c}{c-1}\Big)^{n-1},
\tag{80}
\]
with $\rho_{33} \approx 6 \times 10^5$ making the difference between a week and a second. We consider (80) as a loose upper bound on the potential savings.
Another feature of the ψ's that is worth pointing out is their toy model pole structure. As mentioned above, each of the $\phi^A_{(n)}$ corresponds to a Laurent series with maximal pole order n. We find that the behavior of the $\psi^i_{(n)}$ is much milder. We list the series expansion of the first few $\psi^A$, which should be compared with the analogous expressions for the $\phi^A$, Eq. (74) (trees in nested-parenthesis form, one pair of parentheses per vertex enclosing its subtrees),
\begin{align}
\psi^{()} &= \frac{1}{\epsilon} - a + O(\epsilon), &
\psi^{(((())))} &= \frac{\pi^4}{8} + O(\epsilon), \nonumber\\
\psi^{(())} &= \frac{\pi^2}{4} + O(\epsilon), &
\psi^{((()()))} &= \frac{19\pi^4}{72} + O(\epsilon), \nonumber\\
\psi^{((()))} &= \frac{\pi^2}{18\epsilon} - \frac{\pi^2 a}{6} + O(\epsilon), &
\psi^{(()(()))} &= \frac{\pi^2}{24\epsilon^2} - \frac{\pi^2 a}{6\epsilon} + O(\epsilon^0), \nonumber\\
\psi^{(()())} &= \frac{7\pi^2}{36\epsilon} - \frac{7\pi^2 a}{12} + O(\epsilon), &
\psi^{(()()())} &= \frac{\pi^2}{12\epsilon^2} - \frac{\pi^2 a}{3\epsilon} + O(\epsilon^0).
\tag{81}
\end{align}
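The counting formula (78) and the cost ratio in (80) above are easy to evaluate for small n. The sketch below truncates all power series at degree 6 and uses exact rational arithmetic; the tree numbers $r_1, \dots, r_6$ = 1, 1, 2, 4, 9, 20 are the standard counts of rooted trees, and the function names are illustrative, not taken from the REDUCE program mentioned in the text.

```python
from fractions import Fraction

ROOTED_TREES = [1, 1, 2, 4, 9, 20]   # r_1 .. r_6
N = len(ROOTED_TREES)                # truncation degree of all series


def mobius(s):
    """Moebius function mu(s) by trial division."""
    result, p = 1, 2
    while p * p <= s:
        if s % p == 0:
            s //= p
            if s % p == 0:
                return 0             # s is divisible by a square
            result = -result
        p += 1
    return -result if s > 1 else result


def multiply(a, b):
    """Product of two series (coefficient lists), truncated at degree N."""
    out = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if i + j > N:
                    break
                out[i + j] += ai * bj
    return out


def one_minus_product(s):
    """Series of 1 - prod_n (1 - x^(n s))^(r_n), truncated at degree N."""
    prod = [Fraction(1)] + [Fraction(0)] * N
    for n, r in enumerate(ROOTED_TREES, start=1):
        if n * s > N:
            break
        factor = [Fraction(0)] * (N + 1)
        factor[0], factor[n * s] = Fraction(1), Fraction(-1)
        for _ in range(r):
            prod = multiply(prod, factor)
    return [Fraction(int(d == 0)) - coeff for d, coeff in enumerate(prod)]


def primitive_counts(k):
    """Coefficient list of P_k(x) from Eq. (78); entry n is P_{n,k}."""
    total = [Fraction(0)] * (N + 1)
    for s in range(1, k + 1):
        if k % s == 0 and mobius(s) != 0:
            term = [Fraction(1)] + [Fraction(0)] * N
            for _ in range(k // s):
                term = multiply(term, one_minus_product(s))
            total = [t + Fraction(mobius(s), k) * c for t, c in zip(total, term)]
    return total


def cost_ratio(n):
    """Exact ratio r_n 2^n / sum_k P_{n,k} 2^k of Eq. (80), for 2 <= n <= N."""
    cost_psi = sum(primitive_counts(k)[n] * 2**k for k in range(1, n))
    return Fraction(ROOTED_TREES[n - 1] * 2**n) / cost_psi


for n in range(2, N + 1):
    counts = [int(primitive_counts(k)[n]) for k in range(1, n)]
    print(n, counts, float(cost_ratio(n)))
```

For each n the printed counts add up to $r_n$, since every generator falls into exactly one k-class.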
Notice that, e.g., the primitive $\psi^{(((())))}$ is actually finite, as is $\psi^{((()()))}$, which is not primitive (trees in nested-parenthesis form, one pair of parentheses per vertex enclosing its subtrees). We emphasize that $\psi^{((()()))}_R$ is still given by (75) (with $\phi^A \to \psi^{((()()))}$) and does not coincide with the finite $\psi^{((()()))}$ (see Ex. 5 below). The other two $\psi^i_{(4)}$ are of order $1/\epsilon^2$, even though they have $G^{[3]}$ components. These initial observations point to a general feature of the ψ's: the pole order does not specify the complexity of their renormalization, as is the case with the φ's. The cancellations of the higher-order poles observed point to rather non-trivial underlying combinatorics that, we believe, deserve further investigation. The series expansion of the $\psi^i_{(n)[k]}$ is
\begin{align}
\psi^2_{(4)[1]} &= \frac{\pi^4}{48} + O(\epsilon), \nonumber\\
\psi^1_{(4)[2]} &= \frac{\pi^2}{72\epsilon^2} - \frac{\pi^2 a}{18\epsilon} + O(\epsilon^0), \nonumber\\
\psi^1_{(4)[3]} &= \frac{\pi^2}{36\epsilon^2} - \frac{\pi^2 a}{9\epsilon} + O(\epsilon^0)
\tag{82}
\end{align}
(the rest are essentially identical to the $\psi^A$). We also point out that some of the n = 6 primitive ψ's are of order $1/\epsilon^3$; nevertheless, the coefficients of all poles are independent of c and their renormalization is accomplished by a simple subtraction, in agreement with (75).
Example 5. Renormalization of $\psi^2_{(4)[1]}$, $\psi^1_{(4)[2]}$, $\psi^{((()()))}$. For the primitive $\psi^2_{(4)[1]}$, Eqs. (76), (82) give
\[
S_R\big(\psi^2_{(4)[1]}\big) = -R\big(\psi^2_{(4)[1]}\big) = 0,
\tag{83}
\]
so that the renormalized value $\big(\psi^2_{(4)[1]}\big)_R = \psi^2_{(4)[1]} + S_R\big(\psi^2_{(4)[1]}\big)$ coincides with $\psi^2_{(4)[1]}$. For the 2-primitive $\psi^1_{(4)[2]}$, the first of (65) and (36) give
\[
\Delta\big(\psi^1_{(4)[2]}\big) = \psi^1_{(4)[2]} \otimes 1 + 1 \otimes \psi^1_{(4)[2]} + \tfrac{1}{2}\, \psi^1_{(1)[1]} \otimes \psi^1_{(3)[1]} - \tfrac{1}{2}\, \psi^1_{(3)[1]} \otimes \psi^1_{(1)[1]},
\tag{84}
\]
so that
\[
\big(\psi^1_{(4)[2]}\big)_R = \psi^1_{(4)[2]} + S_R\big(\psi^1_{(4)[2]}\big) + \tfrac{1}{2}\, S_R\big(\psi^1_{(1)[1]}\big)\, \psi^1_{(3)[1]} - \tfrac{1}{2}\, S_R\big(\psi^1_{(3)[1]}\big)\, \psi^1_{(1)[1]}.
\tag{85}
\]
For the (non-trivial) twisted antipode we find
\[
S_R\big(\psi^1_{(4)[2]}\big) = -R\big(\psi^1_{(4)[2]}\big) + \tfrac{1}{2}\, R\Big( R\big(\psi^1_{(1)[1]}\big)\, \psi^1_{(3)[1]} \Big) - \tfrac{1}{2}\, R\Big( R\big(\psi^1_{(3)[1]}\big)\, \psi^1_{(1)[1]} \Big).
\tag{86}
\]
Substituting above we get
\[
\big(\psi^1_{(4)[2]}\big)_R = \frac{7}{96}\, \pi^4 + O(\epsilon).
\tag{87}
\]
Finally, for $\psi^{((()()))}$ (the four-vertex tree whose root has a single child carrying two leaves, written in nested-parenthesis form), we use the coproduct given in (40) and, proceeding along the same lines, we find
\[
\psi^{((()()))}_R = \frac{13}{96}\, \pi^4 - \frac{1}{24}\, \pi^2 a^2 + O(\epsilon),
\tag{88}
\]
which is different, as mentioned above, from the finite $\psi^{((()()))}$. The remarkable pole structure of the ψ's observed above persists for other, more realistic models as well. For example, we have repeated the above analysis for the heavy-quark model of [2]. We find that, for n ≤ 4, the maximal pole order appearing is only 1/ε, with all ladder ψ's, except the first one, finite.
Acknowledgement. C. C. would like to thank Denjoe O'Connor for discussions and for pointing out Ref. [12]. The authors acknowledge partial support from CONACyT grant 32307-E and DGAPA-UNAM grant IN119792 (C. C.), DGAPA-UNAM grant 981212 (H. Q.) and CONACyT project G245427-E (M. R.).
References
1. Borodulin, V.I., Rogalyov, R.N. and Slabospitsky, S.R.: CORE: COmpendium of RElations. hep-ph/9507456
2. Broadhurst, D.J. and Kreimer, D.: Renormalization Automated by Hopf Algebra. J. Symb. Comp. 27, 581 (1999), hep-th/9810087
3. Broadhurst, D.J. and Kreimer, D.: Towards Cohomology of Renormalization: Bigrading the Combinatorial Hopf Algebra of Rooted Trees. Commun. Math. Phys. 215, 217–236 (2000), hep-th/0001202
4. Chryssomalakos, C., Schupp, P. and Watts, P.: The Rôle of the Canonical Element in the Quantized Algebra of Differential Operators A U . hep-th/9310100
5. Collins, J.: Renormalization. Cambridge: Cambridge University Press, 1984
6. Connes, A. and Kreimer, D.: Hopf Algebras, Renormalization and Noncommutative Geometry. Commun. Math. Phys. 199, 203–242 (1998), hep-th/9808042
7. Connes, A. and Kreimer, D.: Renormalization in Quantum Field Theory and the Riemann-Hilbert Problem. 1. The Hopf Algebra Structure of Graphs and the Main Theorem. Commun. Math. Phys. 210, 249–273 (2000), hep-th/9912092
8. Kastler, D.: Connes-Moscovici-Kreimer Hopf Algebras. Fields Institute Communications XX, 2001, math-ph/0104017
9. Kreimer, D.: Combinatorics of (Perturbative) Quantum Field Theory. hep-th/0010059
10. Kreimer, D.: On the Hopf Algebra Structure of Perturbative Quantum Field Theories. Adv. Theor. Math. Phys. 2, 303–334 (1998), q-alg/9707029
11. Kreimer, D.: Chen's Iterated Integral Represents the Operator Product Expansion. Adv. Theor. Math. Phys. 3, 627–670 (1999), hep-th/9901099
12. Milnor, J.: Remarks on Infinite-Dimensional Lie Groups. In: DeWitt, B.S. and Stora, R. (eds), Relativity, Groups and Topology II (Les Houches 1983). Elsevier Science B.V., 1984, pp. 1007–1057
Communicated by A. Connes
Commun. Math. Phys. 225, 487 – 521 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Finite-Wavelength Stability of Capillary-Gravity Solitary Waves Mariana Haragus1 , Arnd Scheel2, 1 Mathématiques Appliquées de Bordeaux, Université Bordeaux 1, 351, Cours de la Libération, 33405 Talence
Cedex, France. E-mail:
[email protected] 2 Institut für Mathematik I, Freie Universität Berlin, Arnimallee 2–6, 14195 Berlin, Germany
Received: 7 February 2001 / Accepted: 6 October 2001
Abstract: We consider the Euler equations describing nonlinear waves on the free surface of a two-dimensional inviscid, irrotational fluid layer of finite depth. For large surface tension, Bond number larger than 1/3, and Froude number close to 1, the system possesses a one-parameter family of small-amplitude, traveling solitary wave solutions. We show that these solitary waves are spectrally stable with respect to perturbations of finite wave-number. In particular, we exclude possible unstable eigenvalues of the linearization at the soliton in the long-wavelength regime, corresponding to small frequency, and unstable eigenvalues with finite but bounded frequency, arising from non-adiabatic interaction of the infinite-wavelength soliton with finite-wavelength perturbations. 1. Introduction In this article, we study stability of solitary waves traveling at constant velocity on the free surface of a two-dimensional inviscid fluid layer of finite depth under the influence of gravity and surface tension. The equations of motion are the Euler equations for nonlinear surface waves. Solitary waves are among the most striking phenomena and appear to be stable in several parameter regimes. Both for large surface tension and in the absence of surface tension, solitary waves are known to exist as particular solutions. Together with the solitary waves, there exists a family of spatially periodic waves, which are known as Stokes waves in the absence of surface tension. Phenomenologically, solitary waves appear to be stable in both parameter regimes mentioned, whereas Stokes waves are stable only for large enough wavelengths. At some critical finite wavelength, the periodic waves destabilize, an instability mechanism first discovered in [BF67, Be67,Wh67] and known as the Benjamin-Feir instability. Mathematically, the water wave problem is an evolutionary partial differential equation and possesses a Hamiltonian structure [Za68]. 
Permanent address: University of Minnesota, School of Mathematics, 206 Church St. S.E., Minneapolis, MN 55455, USA
Various symmetries and associated
conservation laws are known; see [BO80]. The initial-value problem to this partial differential equation is well posed locally in time in the case of gravity waves [Na74, KN79,Yo82, Cr85,Wu97]. Both solitary waves and spatially periodic Stokes waves are particular equilibria of the Hamiltonian system. Their stability or instability is to first order determined by the spectrum of the linearization. Complete stability proofs would however have to take into consideration the effects of nonlinearity, as well. Throughout this paper, we focus on the spectrum of the linearization, the first and basic step towards stability of solitary waves. Existence of free surface waves in the full Euler equations has attracted a lot of interest in the late 80’s using bifurcation theory. For example, existence of solitary waves for large surface tension, Bond number larger than 1/3, was shown in [Ki88,AK89, Sa91]. Stability of surface waves in the full Euler equations is, from a mathematical point of view, a completely open problem, for both cases of gravity and capillary-gravity waves. Although a tremendous amount of literature is devoted to stability and instability of surface waves, to our knowledge, the present work represents the first rigorous attempt to show stability of solitary waves. Below, we summarize part of the previous work on stability and instability. Most detailed results are available for Stokes waves. In the absence of surface tension, a rigorous proof of the Benjamin-Feir instability of small-amplitude Stokes waves has been given in [BM95]. Rigorous stability proofs, even for the linearized problem, do not seem to be available. On the other hand, instability induced by critical eigenvalues leaving the imaginary axis of the linearized equations about a periodic wave upon variations of parameters has been extensively studied, both numerically and analytically; see, for example, [Mc82, LH84, Sa85, MS86, LHT97] and the references therein. 
Solitary waves in shallow water in the absence of surface tension appear to be stable at small amplitude. This is suggested by the numerical results on eigenvalues of the linearized operator in the absence of surface tension in [Ta86]. An instability seems to occur at some critical, finite amplitude, see again [Ta86]. The nature of this crest instability has also been investigated in direct numerical simulations, in [LHT97]. As already mentioned, stability results for solitary waves in the full Euler equations are not known. However, for large-wavelength initial data, the evolution of the free surface is governed on large time scales by certain model equations. For example, both for zero and for large surface tension, a formal expansion of the solution in the large wavelength exhibits at leading order a Korteweg–de Vries equation [KdV, Bou]. In other parameter regimes, the fifth order Kawahara equation [Ka72], or nonlinear Schrödinger equations can be derived. Together with these model equations, there come two mathematical problems: (i) What are the wave dynamics in the model equations? (ii) What can we conclude from the dynamics in the model equations for the dynamics of the full equations? For the particular question of stability of solitary waves we are interested in, these two problems reduce to first, the question of stability of solitary waves in the Korteweg– de Vries equation, and second, the question of validity of the approximation. Stability of solitary waves in the Korteweg–de Vries equation is fairly well understood. Orbital stability of the two-parameter family of solitary waves in this infinite-dimensional, integrable Hamiltonian system has been shown in [Be72, BSS87]. More towards the spirit of the present work, asymptotic stability of solitary waves has been shown in [PW96]. The proof there relies on a very careful understanding of the linearized problem using
a scattering-type analysis. Convergence then is, necessarily, established in an exponentially weighted function space, where the Korteweg–de Vries equation is not Hamiltonian. Deviating from the primary objective of this work, we also mention stability results for the Kawahara equation [Ka72]. This fifth order partial differential equation describes the dynamics of surface waves in the critical case of moderate surface tension, that is, for Bond numbers close to 1/3. For Bond numbers slightly larger than 1/3, the Kawahara equation supports solitary wave solutions just like the Korteweg–de Vries equation. Again, existence and orbital stability of these waves have been proved; see [IS92]. These stability results for the model equations let us believe that the solitary waves of the full Euler equations are stable at low amplitudes. However, the question to which extent solutions of the full system are well approximated by solutions of the model equations has not received a satisfactory answer that would allow us to conclude the stability of the solitary waves of the full system from only the stability of the corresponding waves of the model equation. Moreover, results on the validity of the model equations exist only in the case of gravity waves [KN79, KN86, Cr85, SW00]. In the presence of surface tension, the reduction method in [Ha96] permits to derive, in a rigorous and systematic manner, reduced systems that are nonlocal in the unbounded space variable and local in time, for different regions in the parameter plane (λ, b). The model equations, such as the Korteweg–de Vries and Kawahara equations, appear as the lowest order part in these reduced systems, but the connection between the solutions of the model equations and those of the reduced systems is still not clear. If we want to infer stability of solitary waves in the full Euler equations from stability of the soliton in the Korteweg–de Vries equation, two major problems arise. 
First, the Korteweg–de Vries equation is valid on large, but finite time scales. Instabilities beyond these time scales are invisible in this leading order approximation. The second difficulty is non-adiabatic interactions between the infinite-wavelength solitary wave and finite-wavelength perturbations. In the long-wavelength approximation of the Korteweg–de Vries equation, these perturbations are ignored. However, even at the linear level, these types of interaction may produce unstable eigenvalues, as has been shown, in a different context, in [KS98].
We give an outline of our results. In the case of large surface tension, we use bifurcation theory to deduce spectral stability of small-amplitude solitary waves for eigenvalues of finite frequencies, corresponding to finite wave numbers of the perturbations; see Theorem 2. As a first step, we reformulate the Euler equations as an abstract, first-order differential equation in the spatial variable x; Sect. 2. The existence of solitary waves, Sect. 3, is described by a four-dimensional differential equation, which, due to symmetries, reduces at leading order to a one-degree-of-freedom Hamiltonian system. The homoclinic orbit of this Hamiltonian system represents the solitary wave solution. This part of the analysis is similar to [Ki88]. The formulation of the Euler equations as a dynamical system in the spatial variable x in [Ki88] is slightly simpler, but does not generalize to the time-dependent case. We then linearize the Euler equations about this solitary wave solution and look for eigenfunctions with temporal growth $e^{\sigma t}$. We obtain a generalized eigenvalue problem for the linearized operator L(σ), depending on the spectral parameter σ. We formulate the stability problem in terms of the spectrum of this generalized eigenvalue problem and state our main results in Sect. 4.
Stability of the continuous spectrum then follows from general perturbation arguments together with an explicit computation of the dispersion relation; Sect. 5. The main body of the proof is contained in Sect. 6, where point spectrum off the imaginary axis is excluded. It is here that we crucially rely on the dynamical systems formulation of the problem. We
define a complex analytic function, depending on the spectral parameter σ , which we call the Evans function of the full water-wave problem. Its zeroes σ coincide with the point spectrum. Stability of the solitary wave decomposes into stability in three different regimes, depending on the magnitude of the frequency of the eigenvalue, given by the imaginary part of the spectral parameter σ : (I) the long-wavelength, (II) the intermediate-wavelength, and (III) the short-wavelength regime. Our main result claims stability in (I) and (II). Stability in the short-wavelength regime (III) remains open. In the intermediate-wavelength regime (II), we exclude eigenvalues popping out of the essential spectrum by analytically continuing the Evans function into the essential spectrum and explicitly computing its value from the linear dispersion relation about the flat surface. The long-wavelength regime (I) requires a more subtle analysis. In appropriate scalings, we find the Korteweg–de Vries equation and the Evans function associated to the Korteweg–de Vries soliton, already computed explicitly in [PW92]. The major difficulty then is associated to the fact that the linear dispersion relation about the trivial surface in the long-wavelength limit is the dispersion relation of the wave equation and not the dispersion relation of the Korteweg–de Vries equation. Technically, the problem appears when we formulate the Euler equations for the potential of the velocity field, whereas we derive the Korteweg–de Vries equation for the derivative of the potential. In particular, at bifurcation, we have four critical modes with zero group velocity. Only three are represented in the third order Korteweg–de Vries equation. The central argument relies on the symmetry of the dispersion relation induced by reflection in physical space. The symmetry is exploited in Sect. 6.2.4, where we show that the additional critical mode does not couple to the three other modes. 
More precisely, we show that we can continue the Evans function for the full water-wave problem analytically in the KdV-scaled spectral parameter σ. At leading order, we are able to compute the Evans function explicitly and find the Evans function of the KdV-soliton, multiplied by σ. The additional factor σ precisely accounts for the fourth critical mode induced by translation of the velocity potential by constants. The stability proof is concluded by a perturbation argument, which shows that all roots of the Evans function are located at the origin, even for higher order perturbations, since they are induced by symmetries of the full water-wave problem. The method developed here for the case of large surface tension can be applied to the case of zero surface tension, as well. Although the formulation of the problem, Sect. 2, has to be adapted, most of the subsequent analysis is very similar. In particular, Theorem 2 on spectral stability holds in the absence of surface tension, as well. An important difference arises when proving the absence of unstable point spectrum with small frequency. The fourth critical mode, which appears in addition to the KdV-spectrum, carries a group velocity with the opposite sign when compared with the case of large surface tension. This actually simplifies the stability proof substantially in allowing for a continuation of the Evans function across the essential spectrum by means of exponentially weighted spaces, just like in the Korteweg–de Vries approximation; see [PW97] and [HS01] for solitary waves in different contexts, where a similar situation arises.
2. The Euler Equations and Spatial Dynamics
Consider nonlinear waves propagating at a constant speed c on the free surface of an inviscid fluid layer of mean depth h and constant density ρ. Assume that both gravity and surface tension are present, and denote by g the acceleration due to gravity and by T the coefficient of surface tension. In a coordinate system (X, Y) moving with the waves the bottom lies at Y = 0 and the free surface is described by Y = Z(X, t), where t is the time variable. The flow is supposed to be irrotational, so the velocity field has a potential Φ = Φ(X, Y, t). Introduce dimensionless variables by choosing the unit length to be h and the unit velocity to be c. The Euler equations of motion become
\[
\Phi_{XX} + \Phi_{YY} = 0, \qquad \text{for } 0 < Y < 1 + Z(X,t),
\tag{2.1}
\]
with the boundary conditions
\[
\Phi_Y = 0
\tag{2.2}
\]
at the bottom Y = 0, and
\begin{align}
Z_t + Z_X + Z_X \Phi_X &= \Phi_Y, \tag{2.3}\\
\Phi_t + \Phi_X + \tfrac{1}{2}\big(\Phi_X^2 + \Phi_Y^2\big) + \lambda Z - \frac{b Z_{XX}}{(1 + Z_X^2)^{3/2}} &= 0 \tag{2.4}
\end{align}
on the free surface Y = 1 + Z(X, t). The dimensionless numbers $\lambda = gh/c^2$ and $b = T/(\rho h c^2)$ are, respectively, the inverse square of the Froude number and the Bond number. The analysis is made for capillary-gravity waves, so we fix b ≠ 0. The goal of this section is to write the system (2.1)–(2.4) in the abstract form
\[
D w_t = w_x + F(w;\lambda),
\tag{2.5}
\]
with boundary conditions
\begin{align}
0 &= f(w), \qquad \text{on } y = 0, \tag{2.6}\\
B w_t &= f(w), \qquad \text{on } y = 1, \tag{2.7}
\end{align}
where D, B are linear and F, f nonlinear maps acting on a Hilbert space of functions defined on the bounded cross-section of the domain. Consider the new variables
\[
u = \Phi_X, \qquad \eta = \frac{b Z_X}{\sqrt{1 + Z_X^2}},
\]
and the change of coordinates
\[
x = X, \qquad y = \frac{Y}{1 + Z(X,t)},
\tag{2.8}
\]
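which transforms the moving domain $\{(X, Y) \in \mathbb{R}^2 \mid 0 \le Y \le 1 + Z(X,t)\}$ into $\mathbb{R} \times [0, 1]$. Then, (2.1), (2.4) lead to the system
\begin{align}
0 &= \Phi_x - u - \frac{y\eta}{(1+Z)\sqrt{b^2 - \eta^2}}\, \Phi_y, \qquad \text{in } \mathbb{R} \times (0,1), \tag{2.9}\\
0 &= u_x + \frac{1}{(1+Z)^2}\, \Phi_{yy} - \frac{y\eta}{(1+Z)\sqrt{b^2 - \eta^2}}\, u_y, \qquad \text{in } \mathbb{R} \times (0,1), \tag{2.10}\\
0 &= Z_x - \frac{\eta}{\sqrt{b^2 - \eta^2}}, \tag{2.11}\\
\Phi_t &= \eta_x - \lambda Z - u - \frac{u^2}{2} + \frac{1}{2(1+Z)^2}\, \Phi_y^2 - \frac{\eta(1+u)}{(1+Z)\sqrt{b^2 - \eta^2}}\, \Phi_y, \qquad \text{on } y = 1, \tag{2.12}
\end{align}
with boundary conditions
\begin{align}
0 &= \Phi_y, \qquad \text{on } y = 0, \tag{2.13}\\
Z_t &= \frac{1}{1+Z}\, \Phi_y - \frac{\eta(1+u)}{\sqrt{b^2 - \eta^2}}, \qquad \text{on } y = 1, \tag{2.14}
\end{align}
obtained from (2.2) and (2.3). Equations (2.9)–(2.12) are of the form (2.5) in which the independent variable w, the linear operator D and the map F are defined through $w = (\Phi, u, Z, \eta)^T$, $Dw = (0, 0, 0, \Phi|_{y=1})^T$, and
\[
F(w;\lambda) =
\begin{pmatrix}
-u - \dfrac{y\eta}{(1+Z)\sqrt{b^2 - \eta^2}}\, \Phi_y \\[2ex]
\dfrac{1}{(1+Z)^2}\, \Phi_{yy} - \dfrac{y\eta}{(1+Z)\sqrt{b^2 - \eta^2}}\, u_y \\[2ex]
-\dfrac{\eta}{\sqrt{b^2 - \eta^2}} \\[2ex]
\Big( -\lambda Z - u - \dfrac{u^2}{2} + \dfrac{1}{2(1+Z)^2}\, \Phi_y^2 - \dfrac{\eta(1+u)}{(1+Z)\sqrt{b^2 - \eta^2}}\, \Phi_y \Big)\Big|_{y=1}
\end{pmatrix}.
\]
The boundary conditions (2.13), (2.14) are of the form (2.6), (2.7) in which
\[
Bw = Z, \qquad f(w) = \frac{1}{1+Z}\, \Phi_y - \frac{y\eta(1+u)}{\sqrt{b^2 - \eta^2}}.
\]
We consider (2.5) as an abstract differential equation on the phase space $X := H^1(0,1) \times L^2(0,1) \times \mathbb{R}^2$. Set $U = \{(\Phi, u, Z, \eta) \in X \mid Z > -1,\ |\eta| < b\}$, and define $X^1 := H^2(0,1) \times H^1(0,1) \times \mathbb{R}^2$, and $V = U \cap X^1$. The properties of D, F, B and f are summarized in the following lemma.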
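The constraint |η| < b in the definition of U mirrors the substitution η = b Z_X / sqrt(1 + Z_X^2), whose inverse is exactly relation (2.11). A minimal numerical sketch of this round trip follows; the function names and the sample value of b are illustrative assumptions, not part of the paper.

```python
import math


def eta_from_slope(slope, b):
    """eta = b * Z_X / sqrt(1 + Z_X^2); always satisfies |eta| < b."""
    return b * slope / math.sqrt(1.0 + slope**2)


def slope_from_eta(eta, b):
    """Inverse map Z_x = eta / sqrt(b^2 - eta^2), cf. (2.11)."""
    if abs(eta) >= b:
        raise ValueError("eta must satisfy |eta| < b")
    return eta / math.sqrt(b**2 - eta**2)


b = 0.5  # sample Bond number larger than 1/3 (illustrative choice)
for slope in (-3.0, -0.5, 0.0, 0.2, 10.0):
    eta = eta_from_slope(slope, b)
    print(slope, eta, slope_from_eta(eta, b))
```

Any finite surface slope is mapped into the open interval (−b, b), and the original slope is recovered from η.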
Lemma 2.1. The following statements hold: (i) D is a bounded linear operator from X (resp. X 1 ) into X (resp. X 1 ). (ii) B is a bounded linear operator from X (resp. X1 ) into R. (iii) F ∈ C k (V × R, X) and f ∈ C k (U, L2 (0, 1)) ∩ C k (V , H 1 (0, 1)), for any k ≥ 0. The proof is an easy consequence of the definition of D, B, f , and F and is left to the reader. Remark 2.2. The Euler equations (2.1)–(2.4) possess a reversibility symmetry. For any solution (Z(X, t), (X, t)), reversibility yields a different solution (Z(−X, −t), −(−X, −t)). For the system (2.5) this means that D commutes and F anticommutes with the R = diag(−1, 1, 1, −1), and for the boundary conditions (2.6)–(2.7) that BR = B and g(Rw) = −g(w), for any w ∈ U . 3. Steady Solitary Waves The Euler equations (2.1)–(2.4) possess steady solitary-wave solutions for any b > 1/3 and λ = 1 + ε2 for ε sufficiently small. Mathematical proofs go back to [Ki88,AK89, Sa91]. Our main purpose is a study of the temporal stability properties of these solitary waves. As we explained in the previous section, our approach to the stability problem is technically based on a spatial dynamics formulation of the eigenvalue problem – similar to the existence proof given in [Ki88]. However, our formulation slightly differs from the one exploited there. For the convenience of the reader, and in order to exhibit the main technical tools in the slightly simpler steady problem, we sketch the proof of existence of solitary waves in this section. In particular, we describe the most important properties of the steady solitary wave solutions of (2.5)–(2.7) that exist for b > 1/3 and λ > 1, λ close to 1. From now on we fix b > 1/3 and set λ = 1 + ε2 . The solitary waves are not unique, due to the invariance of the equations under translations in X, , and due to Galilean invariance. 
Translational symmetry is ruled out by restriction to symmetric waves, that is, reversible solutions of the spatial dynamics formulation, satisfying Z(X) = Z(−X) and Φ(X, Y) = −Φ(−X, Y). In the steady problem the mean flow m is conserved and can be used to select a unique solitary wave from the family of solitary waves obtained by Galilean invariance. Fixing the mean flow through a cross section to one amounts to the condition
\[ m = 1 + Z(X) + \int_0^{1+Z(X)} \Phi_X(X, Y)\, dY = 1. \tag{3.1} \]
We consider the steady water-wave problem (2.5) with w_t = 0,
\[ w_x + F(w; \lambda) = 0, \tag{3.2} \]
with boundary conditions
\[ f(w) = 0 \quad \text{on } y = 0, 1. \tag{3.3} \]
The proof of existence of solitary waves for this system is, as the one in [Ki88], based on a center manifold reduction. However, the reduction procedure cannot be applied directly to this system because of the nonlinear boundary condition on y = 1. We therefore consider first a nonlinear change of variables on U which transforms this boundary condition into a linear condition on y = 1.
Lemma 3.1. The map χ : U → U defined by χ(Φ, u, Z, η) = (Φ̃, u, Z, η), where
\[ \tilde\Phi = \Phi + \int_0^y \bigl(f(w) - f'(0)w\bigr)\, dy' - \frac{Z}{1+Z}\,\Phi(0), \]
is a C¹-diffeomorphism. Moreover, the restriction χ : V → V is a C¹-diffeomorphism.
Proof. It is easy to check that χ is a smooth map from U into X. A direct calculation shows that
\[ \tilde\Phi = \frac{1}{1+Z}\,\Phi + \frac{y^2\eta}{2b} - \frac{\eta}{\sqrt{b^2-\eta^2}} \int_0^y y'\,(1+u(y'))\, dy', \]
so χ is invertible with inverse χ⁻¹ : U → U defined through χ⁻¹(Φ̃, u, Z, η) = (Φ, u, Z, η) with
\[ \Phi = (1+Z)\Bigl(\tilde\Phi - \frac{y^2\eta}{2b} + \frac{\eta}{\sqrt{b^2-\eta^2}} \int_0^y y'\,(1+u(y'))\, dy'\Bigr). \]
The fact that χ⁻¹ is smooth proves the first part of the lemma. The second part follows from the fact that the restrictions to V, χ : V → V and χ⁻¹ : V → V, are well defined and smooth.
Set w = χ⁻¹(w̃). Then (3.2)–(3.3) yields the following system for w̃:
\[ \tilde w_x = -\bigl(D\chi^{-1}(\tilde w)\bigr)^{-1} F(\chi^{-1}(\tilde w); \lambda) =: G(\tilde w; \lambda), \tag{3.4} \]
with boundary conditions
\[ \tilde\Phi_y = 0 \quad \text{on } y = 0, \tag{3.5} \]
\[ \tilde\Phi_y = \frac{\eta}{b} \quad \text{on } y = 1, \tag{3.6} \]
since
\[ \tilde\Phi_y = f(w) + \frac{y\eta}{b}. \]
We treat this system as an infinite dimensional dynamical system on the phase space X. We write
\[ \tilde w_x = \tilde A(\lambda)\tilde w + \tilde G(\tilde w; \lambda), \tag{3.7} \]
where Ã(λ) = D_w̃ G(0; λ) and G̃(w̃; λ) = G(w̃; λ) − Ã(λ)w̃. The boundary conditions (3.5)–(3.6) are included in the domain of definition of the linear operator Ã(λ) by taking
\[ Y := \mathrm{Dom}(\tilde A(\lambda)) = \Bigl\{ (\tilde\Phi, u, Z, \eta) \in X^1 \;\Big|\; \tilde\Phi_y(0) = 0,\ \tilde\Phi_y(1) = \frac{\eta}{b} \Bigr\}. \]
Then Ã(λ) is a closed linear operator in X with domain Y dense in X, and G̃ is a smooth map from W = U ∩ Y × ℝ into X.
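As a quick consistency check on the boundary conditions (3.5)–(3.6), one can differentiate the explicit formula for the transformed potential from the proof of Lemma 3.1 with respect to y. The sketch below assumes that f(w) is the expression in the second display; this is only what the identity for the y-derivative of the transformed potential suggests, since the definition of f is given in Sect. 2 and is not restated here.

```latex
% Differentiate the explicit formula for \tilde\Phi in y:
\tilde\Phi_y = \frac{\Phi_y}{1+Z} + \frac{y\,\eta}{b}
             - \frac{\eta}{\sqrt{b^2-\eta^2}}\; y\,\bigl(1+u(y)\bigr).
% Comparing with \tilde\Phi_y = f(w) + y\eta/b identifies (assumption, see lead-in):
f(w) = \frac{\Phi_y}{1+Z} - \frac{y\,\eta\,(1+u)}{\sqrt{b^2-\eta^2}} .
% Hence f(w) = 0 on y = 0 and on y = 1 gives the linear conditions
\tilde\Phi_y\big|_{y=0} = 0, \qquad \tilde\Phi_y\big|_{y=1} = \frac{\eta}{b}.
```

The nonlinear condition f(w) = 0 on y = 1 is thus traded for a condition that is linear in the new variables, which is what the center manifold reduction requires.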
Note that χ(0) = 0 and Dχ(0) = I, so Ã(λ) = D_w̃ G(0; λ) = −D_w F(0; λ). This means that the linear part of the system (3.2) is not changed by the transformation above. The same is true for the boundary conditions (3.3). A direct calculation shows that
\[ \tilde A(\lambda)\tilde w = \Bigl(u,\ -\tilde\Phi_{yy},\ \frac{\eta}{b},\ \lambda Z + u|_{y=1}\Bigr)^T. \]
Remark also that (3.7) is reversible with reverser R defined in Sect. 2, since χ(Rw) = Rχ(w). We apply center manifold reduction directly to this system. We find a four-dimensional reduced system which describes the steady waves. Note that the reduced system obtained in [Ki88] is only two-dimensional. The two additional dimensions here are due to the invariance of (2.1)–(2.4) under translations in the fluid potential and due to Galilean invariance. Both symmetries are inherited by the system (3.7) from the full Euler equations. In [Ki88], these invariances were factored out already in the dynamical formulation of the problem, before the reduction procedure, so that the reduced equation no longer possessed these symmetries. Here, we only use them after the reduction, and show that it is possible to simplify the reduced system on the four-dimensional center manifold to a two-dimensional differential equation with the help of reversibility and condition (3.1). The reason for this slightly different approach is that we cannot factor out these symmetries in the eigenvalue problem.
Theorem 1. For any b > 1/3 and k ≥ 0 there exist ε* > 0 and C > 0 such that, for any ε ∈ (0, ε*) the system (3.2)–(3.3) with λ = 1 + ε² possesses a unique solitary-wave solution w_ε^* ∈ C_b^k(ℝ, X¹) with the following properties:
(i) w_ε^* = w_ε^{*0} + w̃_ε^*, where w_ε^{*0} = (U⁰, u⁰, −u⁰, −b u⁰_x) with
\[ u^0(x) = \varepsilon^2 \operatorname{sech}^2\Bigl(\frac{\sqrt{\beta}\,\varepsilon x}{2}\Bigr), \qquad U^0(x) = \int_0^x u^0(x')\, dx', \qquad \beta = \frac{3}{3b-1}, \]
and ‖w̃_ε^*(x)‖_{X¹} ≤ Cε³ for any x ∈ ℝ. Moreover,
\[ \|(I - P)\,\tilde w_\varepsilon^*(x)\|_{X^1} \le C\varepsilon^4 e^{-\sqrt{\beta}\,\varepsilon|x|}, \qquad \|\partial_y P\,\tilde w_\varepsilon^*(x)\|_{X^1} \le C\varepsilon^3 e^{-\sqrt{\beta}\,\varepsilon|x|}, \]
where P is the projection on the Φ-component of w: P : X → X, P = diag(1, 0, 0, 0).
(ii) w_ε^* is reversible, i.e. R w_ε^*(x) = w_ε^*(−x), and the components Φ_ε^*, u_ε^*, Z_ε^*, η_ε^* of w_ε^* satisfy
\[ Z_\varepsilon^*(x) + \bigl(1 + Z_\varepsilon^*(x)\bigr) \int_0^1 u_\varepsilon^*(x, y)\, dy = 0. \]
(iii) w_ε^* is a smooth function of ε.
Proof. By Lemma 3.1 it is enough to show the existence of solitary waves for the system (3.7). As in [Ki88] one can show that Ã(λ) has compact resolvent, so its spectrum consists only of isolated eigenvalues of finite multiplicities. The eigenvalue problem
\[ \tilde A(\lambda)\tilde w = \zeta \tilde w, \qquad \tilde w \in Y, \]
can be solved explicitly, and we find that ζ is an eigenvalue of Ã(λ) if and only if it satisfies the equality
\[ \zeta^2 \cos\zeta = (\lambda - b\zeta^2)\,\zeta \sin\zeta. \]
A direct calculation shows that 0 is always an eigenvalue of Ã(λ) with generalized eigenvectors w₀ = (1, 0, 0, 0)^T, w_λ = (0, 1, −1/λ, 0)^T, such that Ã(λ)w₀ = 0, Ã(λ)w_λ = w₀. If b > 1/3 and λ = 1 this eigenvalue has algebraic multiplicity 4; the generalized eigenvectors
\[ w_0 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad w_1 = \begin{pmatrix} 0 \\ 1 \\ -1 \\ 0 \end{pmatrix}, \quad w_2 = \begin{pmatrix} -y^2/2 \\ 0 \\ 0 \\ -b \end{pmatrix}, \quad w_3 = \begin{pmatrix} 0 \\ -y^2/2 \\ 1/2 - b \\ 0 \end{pmatrix} \]
satisfy Ã(1)w₀ = 0, Ã(1)w_i = w_{i−1}, i = 1, 2, 3, and form a basis for the generalized eigenspace associated to the eigenvalue 0. We apply the reduction result in [Mi88] to system (3.7) with b > 1/3 and λ = 1 + ε² close to λ₀ = 1. By direct calculation one can prove that there exist positive constants C(λ) and q₀ such that
\[ \|(iq - \tilde A(\lambda))^{-1}\|_{X \to X} \le \frac{C(\lambda)}{|q|}, \tag{3.8} \]
for any q ∈ ℝ, |q| > q₀. Moreover, the map G̃ is smooth in w̃ and ε² when considered as a map from the domain Y = Dom(Ã) into X. With these preparations, the reduction theorem in [Mi88] shows that any small bounded solution w̃ ∈ C_b^k(ℝ, Y) of (3.7) is of the form
\[ \tilde w(x) = a_0(x) w_0 + a_1(x) w_1 + a_2(x) w_2 + a_3(x) w_3 + \Psi(a_0, a_1, a_2, a_3; \varepsilon^2), \tag{3.9} \]
with Ψ(a₀, a₁, a₂, a₃; ε²) = O(|a_j|(|a_j| + ε²)), and a_j satisfy the reduced system
\[ \begin{aligned} a_{0,x} &= a_1 + f_0(a_0, a_1, a_2, a_3; \varepsilon^2), \\ a_{1,x} &= a_2 + f_1(a_0, a_1, a_2, a_3; \varepsilon^2), \\ a_{2,x} &= a_3 + f_2(a_0, a_1, a_2, a_3; \varepsilon^2), \\ a_{3,x} &= f_3(a_0, a_1, a_2, a_3; \varepsilon^2), \end{aligned} \tag{3.10} \]
in which f_j(a₀, a₁, a₂, a₃; ε²) = O(|a_j|(|a_j| + ε²)). By a careful choice of a cut-off function, necessary in the construction of the center manifold, one can arrange to have the reduced flow inherit the symmetries of the full system (3.7). In particular, the invariance of (3.7) under translation in Φ implies that Ψ and (3.10) are invariant under transformations of the form a₀ → a₀ + α, for any α ∈ ℝ,
such that Ψ and the f_j, j = 0, …, 3, do not depend upon a₀. The reduced equations (3.10) possess a skew-product structure and decouple into a system for a₁, a₂, a₃,
\[ \begin{aligned} a_{1,x} &= a_2 + f_1(a_1, a_2, a_3; \varepsilon^2), \\ a_{2,x} &= a_3 + f_2(a_1, a_2, a_3; \varepsilon^2), \\ a_{3,x} &= f_3(a_1, a_2, a_3; \varepsilon^2), \end{aligned} \tag{3.11} \]
and one differential equation for a₀, which can be integrated. Reversibility can be used to uniquely determine a₀. The reduced system (3.10) is reversible with reverser R₀ acting through R₀(a₀, a₁, a₂, a₃) = (−a₀, a₁, −a₂, a₃), since Rw₀ = −w₀, Rw₁ = w₁, Rw₂ = −w₂, Rw₃ = w₃. Reversible solutions of (3.10) are those with a₀, a₂ odd and a₁, a₃ even functions in x. For such solutions a₀ is uniquely determined by the condition a₀(0) = 0, which leads to
\[ a_0(x) = \int_0^x \bigl(a_1 + f_0(a_1, a_2, a_3; \varepsilon^2)\bigr)\, dx'. \tag{3.12} \]
Next, we use the condition (3.1) to uniquely determine a₃ for solutions of (3.7) with mean flow one. For w̃ = (Φ̃, u, Z, η) this condition reads
\[ Z(x) + (1 + Z(x)) \int_0^1 u(x, y)\, dy = 0, \qquad x \in \mathbb{R}. \]
Substitution of w̃ from (3.9) yields an equality F(a₁, a₂, a₃; ε²) = 0. It is not difficult to see that F is smooth in its arguments, and a direct calculation shows that
\[ D_{a_3} F(0, 0, 0; \varepsilon^2) = \frac{1}{3} - b \ne 0. \]
Then by the implicit function theorem we obtain
\[ a_3 = \psi(a_1, a_2; \varepsilon^2) = O(|a_j|(|a_j| + \varepsilon^2)), \tag{3.13} \]
with ψ a smooth function. Substituting (3.13) into (3.11) we obtain the two-dimensional system
\[ a_{1,x} = a_2 + g_1(a_1, a_2; \varepsilon^2), \qquad a_{2,x} = g_2(a_1, a_2; \varepsilon^2). \tag{3.14} \]
This system is also reversible with reverser acting through a₁ → a₁, a₂ → −a₂. One can now argue as in [Ki88] and prove that (3.14) possesses a unique reversible homoclinic solution (a₁*(ε), a₂*(ε)), a smooth function of ε, for sufficiently small ε > 0. Explicit calculation of the relevant quadratic terms shows that
\[ a_1^*(x; \varepsilon) = \varepsilon^2 \operatorname{sech}^2\Bigl(\frac{\sqrt{\beta}\,\varepsilon x}{2}\Bigr) + O(\varepsilon^4). \]
The equalities (3.12), (3.13) give the reversible homoclinic solution of the reduced system (3.10), and from (3.9) we find the reversible solitary-wave solution of (3.7). This proves the theorem.
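The Jordan chain used in the proof of Theorem 1 can be checked by hand from the explicit form of the linear operator in (3.7), which sends (Φ̃, u, Z, η) to (u, −Φ̃_yy, η/b, λZ + u|_{y=1}). The worked computation below is only a sanity check at λ = 1. The last display takes the linear part of the mean-flow condition to be Z + ∫₀¹ u dy and ignores the O(ε²) contribution of the reduction function; both are assumptions made here for illustration.

```latex
% Action of \tilde A(1) on w = (\tilde\Phi, u, Z, \eta)^T:
%   \tilde A(1) w = (u,\ -\tilde\Phi_{yy},\ \eta/b,\ Z + u|_{y=1})^T.
\tilde A(1) w_1 = \tilde A(1)\,(0,\ 1,\ -1,\ 0)^T = (1,\ 0,\ 0,\ -1 + 1)^T = w_0, \\
\tilde A(1) w_2 = \tilde A(1)\,(-y^2/2,\ 0,\ 0,\ -b)^T = (0,\ 1,\ -1,\ 0)^T = w_1, \\
\tilde A(1) w_3 = \tilde A(1)\,(0,\ -y^2/2,\ \tfrac12 - b,\ 0)^T
               = (-y^2/2,\ 0,\ 0,\ \tfrac12 - b - \tfrac12)^T = w_2 .
% Domain check for w_2: \tilde\Phi_y(1) = -1 = \eta/b, so w_2 \in Y.
% Linearized mean-flow condition in the direction w_3 (leading order in \varepsilon^2):
D_{a_3} F(0,0,0;\varepsilon^2)
  = \Bigl(\tfrac12 - b\Bigr) + \int_0^1 \Bigl(-\tfrac{y^2}{2}\Bigr)\, dy
  = \tfrac12 - b - \tfrac16 = \tfrac13 - b .
```

The value 1/3 − b is non-zero exactly because b > 1/3, which is where the assumption on the Bond number enters the implicit function argument.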
4. Spectral Stability of Solitary Waves
In this section we formulate the stability problem in terms of the spectrum of a family of linear operators and state the main results.
4.1. Linearized system. Consider the linearization of the problem (2.5)–(2.7) about the solitary wave w_ε^* ∈ C_b^k(ℝ, X¹) found in Theorem 1 for ε ∈ (0, ε*):
\[ D W_t = W_x + D_w F(w_\varepsilon^*; 1 + \varepsilon^2) W, \tag{4.1} \]
\[ 0 = f'(w_\varepsilon^*) W \quad \text{on } y = 0, \tag{4.2} \]
\[ B W_t = f'(w_\varepsilon^*) W \quad \text{on } y = 1. \tag{4.3} \]
We look for solutions of this system of the form
\[ W(t, x) = e^{\sigma t} W_\sigma(x), \tag{4.4} \]
with W_σ a bounded function from ℝ into the complexification of X¹, for σ ∈ ℂ. For simplicity we denote the complexification of X¹, and later those of X and Y, also by X¹ (resp. X and Y). Roughly speaking, the solitary wave w_ε^* is stable if (4.1)–(4.3) does not possess any solutions of the form (4.4) for any σ ∈ ℂ with Re σ > 0. Substitution of (4.4) into (4.1)–(4.3) yields the following system for W_σ:
\[ \sigma D W = W_x + D_w F(w_\varepsilon^*; 1 + \varepsilon^2) W, \tag{4.5} \]
\[ 0 = f'(w_\varepsilon^*) W \quad \text{on } y = 0, \tag{4.6} \]
\[ \sigma B W = f'(w_\varepsilon^*) W \quad \text{on } y = 1. \tag{4.7} \]
We write this system in abstract form
\[ \mathcal{L}(\sigma, \varepsilon)\tilde W := \tilde W_x - L(\sigma, \varepsilon)\tilde W = 0, \]
with L(σ, ε) some linear operator in X, and then formulate the stability problem for w_ε^* in terms of the spectrum of the family of operators ℒ_ε = (ℒ(σ, ε))_{σ∈ℂ}. We proceed as in the steady problem by constructing first a linear diffeomorphism χ_σ which transforms the non-autonomous boundary conditions (4.6)–(4.7) into autonomous boundary conditions.
Lemma 4.1. Assume σ ∈ ℂ and ε ∈ (0, ε*). The linear map χ_σ : X → X defined by
\[ \chi_\sigma W = D\chi(w_\varepsilon^*) W - \Bigl(\frac{\sigma}{2}\, y^2\, BW,\ 0,\ 0,\ 0\Bigr) \]
is bounded and has bounded inverse χ_σ⁻¹ : X → X. Moreover, χ_σ and χ_σ⁻¹ are analytic in σ, smooth in ε, and their restrictions to X¹ are well defined.
The proof is similar to the one of Lemma 3.1 so we omit it here. Note that χ₀ is the linearization about w_ε^* of the diffeomorphism χ in Lemma 3.1. Set W = χ_σ⁻¹ W̃. Then the system (4.5) becomes
\[ \tilde W_x = \chi_\sigma \bigl(\sigma D - D_w F(w_\varepsilon^*; 1 + \varepsilon^2)\bigr) \chi_\sigma^{-1} \tilde W + (\partial_x \chi_\sigma)\, \chi_\sigma^{-1} \tilde W, \tag{4.8} \]
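with boundary conditions
\[ \tilde\Phi_y = 0 \quad \text{on } y = 0, \tag{4.9} \]
\[ \tilde\Phi_y = \frac{\eta}{b} \quad \text{on } y = 1, \tag{4.10} \]
for W̃ = (Φ̃, u, Z, η). Explicit calculation of the equations in (4.8) shows that it is of the form
\[ \tilde W_x = D(\sigma, \varepsilon)\tilde W + A(\varepsilon)\tilde W, \tag{4.11} \]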
with D(σ, ε) = D_∞(σ) + ε²D₁(x; σ, ε), a bounded linear operator in X, and A(ε) = A_∞(ε²) + ε²A₁(x; ε), a closed linear operator in X. The parts A_∞ and D_∞ correspond to the linearization evaluated at the asymptotic state of the solitary wave, at x = ∞. The parts A₁ and D₁ correspond to the perturbation due to the solitary wave. These are operators with coefficients depending on x, and decaying to 0 at x = ∞ with the same rate as the decay rate of the solitary wave w_ε^*. Since we do not need the explicit formulas of these operators in the following, we omit them here. However, note that A_∞(ε²) = Ã(1 + ε²), and that D_∞(σ) and D₁(x; σ, ε) depend upon σ in the following way:
\[ D_\infty(\sigma) = \sigma D_{\infty 1} + \sigma^2 D_{\infty 2}, \qquad D_1(x; \sigma, \varepsilon) = \sigma D_{11}(x; \varepsilon) + \sigma^2 D_{12}(x; \varepsilon), \]
since BD = 0. As in the formulation of the steady problem (3.7), the boundary conditions (4.9)–(4.10) are included in the domain of definition of the operator A(ε). The properties of D(σ, ε) and A(ε) needed later are summarized in the next lemma. They follow from Lemma 2.1, the decay properties of w_ε^* in Theorem 1, and the definition of χ_σ in Lemma 4.1.
Lemma 4.2. Assume σ ∈ ℂ, ε ∈ (0, ε*) and x ∈ ℝ.
(i) D_∞(σ) and D₁(x; σ, ε) are bounded linear operators in X (resp. X¹), depending analytically upon σ and smoothly upon ε.
(ii) A_∞(ε²) and A₁(x; ε) are closed linear operators in X with dense domain Y, depending analytically upon σ and smoothly upon ε.
Moreover, there exists a positive constant C such that the following inequalities hold for any σ ∈ ℂ, ε ∈ (0, ε*) and x ∈ ℝ:
\[ \|D_\infty(\sigma)\|_{X \to X\ (\text{resp. } X^1 \to X^1)} \le C|\sigma|(1 + |\sigma|), \qquad \|D_1(x; \sigma, \varepsilon)\|_{X \to X\ (\text{resp. } X^1 \to X^1)} \le C|\sigma|(1 + |\sigma|)\, e^{-\sqrt{\beta}\,\varepsilon|x|}, \]
\[ \|A_\infty(\varepsilon^2)\|_{Y \to X} \le C, \qquad \|A_1(x; \varepsilon)\|_{Y \to X} \le C e^{-\sqrt{\beta}\,\varepsilon|x|}. \]
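To see why the map of Lemma 4.1 turns (4.6)–(4.7) into the autonomous conditions (4.9)–(4.10), one can linearize the identity for the y-derivative of the transformed potential from Sect. 3 about the solitary wave. The computation below is a sketch under three assumptions: the map subtracts (σ/2) y² BW from the first component only, BW does not depend on y (B maps into the scalars by Lemma 2.1), and η denotes the last component of W.

```latex
% First component of \chi_\sigma W, differentiated in y
% (linearization of \tilde\Phi_y = f(w) + y\eta/b, minus the \sigma-correction):
\partial_y (\chi_\sigma W)_1
  = f'(w_\varepsilon^*) W + \frac{y\,\eta}{b} - \sigma\, y\, BW .
% On y = 0, (4.6) gives f'(w_\varepsilon^*) W = 0, hence
\partial_y (\chi_\sigma W)_1 \big|_{y=0} = 0 .
% On y = 1, (4.7) gives f'(w_\varepsilon^*) W = \sigma BW, hence
\partial_y (\chi_\sigma W)_1 \big|_{y=1}
  = \sigma BW + \frac{\eta}{b} - \sigma BW = \frac{\eta}{b} .
```

The σ-dependence is thereby moved out of the boundary conditions and into the bounded operator D(σ, ε), which is why the domain Y of A(ε) does not depend on σ.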
4.2. Spectral stability. Set L(σ, ε) = D(σ, ε) + A(ε), and consider the family of operators ℒ_ε = (ℒ(σ, ε))_{σ∈ℂ} defined by
\[ \mathcal{L}(\sigma, \varepsilon) = \frac{d}{dx} - L(\sigma, \varepsilon). \]
Equation (4.11) becomes ℒ(σ, ε)W̃ = 0. Set H = L²(ℝ, X) and W = H¹(ℝ, X) ∩ L²(ℝ, Y). Then ℒ(σ, ε) is a closed linear operator in H with dense domain W. Define the resolvent set of the family of operators ℒ_ε as the set ρ(ℒ_ε) = {σ ∈ ℂ : ℒ(σ, ε) invertible}. The set Σ(ℒ_ε) = ℂ \ ρ(ℒ_ε) is called the spectrum of ℒ_ε. We distinguish between point spectrum Σ_p(ℒ_ε) = Σ(ℒ_ε) ∩ {σ ∈ ℂ : ℒ(σ, ε) Fredholm with index 0}, and essential spectrum Σ_e(ℒ_ε) = Σ(ℒ_ε) \ Σ_p(ℒ_ε).
Definition 4.3. The solitary wave w_ε^* is called spectrally stable if Σ(ℒ_ε) ⊂ {σ ∈ ℂ : Re σ ≤ 0}, and spectrally unstable otherwise.
The main result in this paper is:
Theorem 2. Fix b > 1/3, and choose any R > 0 large. Then there exists ε_b > 0 such that, for any ε ∈ (0, ε_b), the spectrum of ℒ_ε coincides with the imaginary axis in a ball of radius R: Σ(ℒ_ε) ∩ {σ ∈ ℂ : |σ| ≤ R} = iℝ ∩ {σ ∈ ℂ : |σ| ≤ R}.
The proof consists of two parts summarized in the following two theorems.
Theorem 3. There exists ε_e > 0 such that for any ε ∈ (0, ε_e) the essential spectrum of ℒ_ε coincides with the imaginary axis.
Theorem 4. Fix b > 1/3, and choose any R > 0 large. Then there exists ε_p > 0 such that for any ε ∈ (0, ε_p) the point spectrum of ℒ_ε is contained in iℝ ∪ {|σ| ≥ R}.
Both theorems are proved in Sects. 5 and 6. The result in Theorem 2 is a consequence of Theorems 3 and 4.
Remark 4.4. In fact, we prove slightly more. We actually compute eigenvalues embedded into the essential spectrum Σ_e(ℒ_ε) = iℝ. We show that inside the essential spectrum, there is only the zero eigenvalue with geometric multiplicity two and algebraic multiplicity three. One eigenfunction is due to the invariance of the Euler equations under Φ → Φ + const, and the second eigenfunction is given by the x-derivative of the solitary wave. The generalized eigenvector to the second eigenfunction is given by the derivative of the solitary wave with respect to the wave speed.
5. The Essential Spectrum of Solitary Waves
We prove Theorem 3. We study first the spectrum of the family of asymptotic operators ℒ_ε^∞ = (ℒ_∞(σ, ε))_{σ∈ℂ}, where
\[ \mathcal{L}_\infty(\sigma, \varepsilon) = \frac{d}{dx} - L_\infty(\sigma, \varepsilon), \qquad L_\infty(\sigma, \varepsilon) = D_\infty(\sigma) + A_\infty(\varepsilon^2). \]
Lemma 5.1. For any ε ≥ 0, the essential spectrum of ℒ_ε^∞ is equal to iℝ. The point spectrum of ℒ_ε^∞ is empty.
Proof. The asymptotic operators D_∞(σ) and A_∞(ε²) are independent of x, so in order to determine the spectrum of ℒ_ε^∞ we can use the Fourier transform in x. Let k denote the Fourier variable. Then the spectrum of ℒ_ε^∞ in H coincides with the spectrum of ℒ̂_ε^∞ = (ℒ̂_∞(σ, ε))_{σ∈ℂ}, where ℒ̂_∞(σ, ε) = ik − L_∞(σ, ε). The domain of ℒ̂_∞(σ, ε) is Ŵ = L²(ℝ, Y) ∩ Ĥ¹(ℝ, X), where Ĥ¹(ℝ, X) = {f ∈ L²(ℝ, X) : (1 + |k|)f ∈ L²(ℝ, X)}. The resolvent set of ℒ_ε^∞ consists of the values σ ∈ ℂ with the following two properties:
(i) Σ(L_∞(σ, ε)) ∩ iℝ = ∅, where Σ(L_∞(σ, ε)) is the spectrum of L_∞(σ, ε) in X,
(ii) there exists a positive constant C(σ, ε) such that the estimate
\[ \|(ik - L_\infty(\sigma, \varepsilon))^{-1}\|_{X \to X} \le \frac{C(\sigma, \varepsilon)}{1 + |k|} \tag{5.1} \]
holds for any k ∈ ℝ.
Indeed, assume that (i) and (ii) hold for some σ ∈ ℂ. Then, for any f̂ ∈ H there exists ĝ(k) = (ik − L_∞(σ, ε))⁻¹ f̂(k) with
\[ \|(1 + |k|)\hat g\|_H^2 = \int_{\mathbb{R}} (1 + |k|)^2 \|\hat g(k)\|_X^2\, dk \le C(\sigma, \varepsilon)^2 \int_{\mathbb{R}} \|\hat f(k)\|_X^2\, dk = C(\sigma, \varepsilon)^2 \|\hat f\|_H^2. \]
Hence ĝ ∈ Ŵ and the map f̂ ↦ ĝ is bounded from H into Ŵ. The operator L_∞(σ, ε) has compact resolvent, so its spectrum consists only of isolated eigenvalues of finite multiplicities. The eigenvalue problem
\[ L_\infty(\sigma, \varepsilon)\tilde w = \zeta \tilde w, \qquad \tilde w \in Y, \]
can be solved explicitly. We find that ζ is an eigenvalue of L_∞(σ, ε) if and only if
\[ (\sigma + \zeta)^2 \cos\zeta = (1 + \varepsilon^2 - b\zeta^2)\,\zeta \sin\zeta. \tag{5.2} \]
Set σ = σ₁ + iσ₂ and ζ = ik. Then (5.2) yields
\[ (\sigma_2 + k)^2 - \sigma_1^2 = (1 + \varepsilon^2 + bk^2)\, k \tanh k, \tag{5.3} \]
\[ 2\sigma_1(\sigma_2 + k) = 0. \tag{5.4} \]
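The step from (5.2) to (5.3)–(5.4) is a direct substitution; it is spelled out here for convenience, using only cos(ik) = cosh k and sin(ik) = i sinh k.

```latex
% Put \sigma = \sigma_1 + i\sigma_2 and \zeta = ik in (5.2):
\bigl(\sigma_1 + i(\sigma_2 + k)\bigr)^2 \cosh k
   = (1 + \varepsilon^2 + b k^2)\,(ik)\,(i \sinh k)
   = -(1 + \varepsilon^2 + b k^2)\, k \sinh k .
% Expand the square and divide by \cosh k > 0:
\sigma_1^2 - (\sigma_2 + k)^2 + 2 i\, \sigma_1 (\sigma_2 + k)
   = -(1 + \varepsilon^2 + b k^2)\, k \tanh k .
% Real part:      (\sigma_2 + k)^2 - \sigma_1^2 = (1 + \varepsilon^2 + b k^2)\, k \tanh k   (5.3)
% Imaginary part: 2 \sigma_1 (\sigma_2 + k) = 0                                             (5.4)
```

Since k tanh k ≥ 0 for real k, the right-hand side of (5.3) is non-negative, which is what rules out k = −σ₂ when σ₁ is non-zero.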
If σ₁ ≠ 0, i.e. σ ∉ iℝ, the equality (5.4) implies k = −σ₂, which is clearly not a solution of (5.3). Hence (5.2) has no purely imaginary solutions, i.e. Σ(L_∞(σ, ε)) ∩ iℝ = ∅, for any σ ∉ iℝ. If σ₁ = 0, i.e. σ ∈ iℝ, the last equality is always satisfied, and (5.3) has, for any σ₂ ≠ 0, exactly two real solutions, one positive and one negative (recall that b > 1/3), so (5.2) has in this case two purely imaginary solutions, both simple and different from zero. For σ = 0, (5.2) has only one purely imaginary solution, ζ = 0, which is a root of multiplicity two if ε ≠ 0, and a root of multiplicity four if ε = 0. We conclude that (i) is satisfied for any σ ∉ iℝ, and is not satisfied if σ ∈ iℝ.
We show that (ii) holds for any σ ∉ iℝ. Recall that A_∞(ε²) = Ã(1 + ε²), where Ã(λ) is the linear operator in (3.7). Then (3.8) implies
\[ \|(ik - A_\infty(\varepsilon^2))^{-1}\|_{X \to X} \le \frac{C(\varepsilon)}{|k|}, \]
for any |k| ≥ k₀, for some positive k₀ and C(ε). Since D_∞(σ) is a bounded operator in X, we find
\[ \|(ik - A_\infty(\varepsilon^2))^{-1} D_\infty(\sigma)\|_{X \to X} \le \|D_\infty(\sigma)\| \frac{C(\varepsilon)}{|k|} \le \frac{1}{2}, \]
if |k| ≥ k₁(σ, ε) = max{k₀, 2‖D_∞(σ)‖C(ε)}. Then
\[ (ik - L_\infty(\sigma, \varepsilon))^{-1} = \bigl(I - (ik - A_\infty(\varepsilon^2))^{-1} D_\infty(\sigma)\bigr)^{-1} (ik - A_\infty(\varepsilon^2))^{-1}, \]
so, for any |k| ≥ k₁(σ, ε),
\[ \|(ik - L_\infty(\sigma, \varepsilon))^{-1}\|_{X \to X} \le \frac{2C(\varepsilon)}{|k|}. \]
Now (5.1) follows for σ ∉ iℝ from Σ(L_∞(σ, ε)) ∩ iℝ = ∅. We conclude that any σ ∉ iℝ belongs to the resolvent set of ℒ_ε^∞. It remains to show that the entire imaginary axis belongs to the essential spectrum. We therefore exhibit an orthonormal sequence w_ℓ ∈ X, with ℒ_∞(σ, ε)w_ℓ → 0 and conclude that ℒ_∞(σ, ε) cannot be Fredholm of index zero, for σ ∈ iℝ. From (5.3), (5.4), we find a k∗ = k∗(σ) ∈ ℝ and a vector w₀ such that (ik∗ − L_∞(σ, ε))w₀ = 0. Let θ_R be a smooth, even cut-off function, with θ_R(x) = 1 for |x| ≤ R, θ_R(x) = 0 for |x| ≥ R + 1, and θ_R(x) = θ₀(x − R) for x ∈ [R, R + 1]. Define w̃_ℓ := θ_ℓ(x − 2 […] 0, σ ∉ iℝ, v ∈ H. For w ∈ W set v = ℒ_∞(σ, ε)w ∈ H. Then
\[ \|w\|_W = \|\mathcal{L}_\infty(\sigma, \varepsilon)^{-1} v\|_W \le C(\sigma, \varepsilon)\|v\|_H = C(\sigma, \varepsilon)\|\mathcal{L}_\infty(\sigma, \varepsilon) w\|_H \]
and (5.5) is proved. Choose σ₀ ∉ iℝ and ε₀ ∈ (0, ε*). Then
\[ \|w\|_W \le c_1(\sigma_0, \varepsilon_0)\|\mathcal{L}_\infty(\sigma_0, \varepsilon_0) w\|_H \le c_1(\sigma_0, \varepsilon_0)\Bigl( \|\mathcal{L}(\sigma, \varepsilon) w\|_H + \|(D_\infty(\sigma) - D_\infty(\sigma_0)) w\|_H + \|(A_\infty(\varepsilon^2) - A_\infty(\varepsilon_0^2)) w\|_H + \varepsilon^2 \|D_1(x; \sigma, \varepsilon) w\|_H + \varepsilon^2 \|A_1(x; \varepsilon) w\|_H \Bigr). \]
From the explicit formula for A_∞(ε²) = Ã(1 + ε²) we deduce that A_∞(ε²) − A_∞(ε₀²) is a bounded operator in X, and
\[ \|(A_\infty(\varepsilon^2) - A_\infty(\varepsilon_0^2)) w\|_H \le |\varepsilon^2 - \varepsilon_0^2|\, \|w\|_H. \]
Furthermore, Lemma 4.2 implies
\[ \|D_1(x; \sigma, \varepsilon) w\|_H \le C|\sigma|(1 + |\sigma|)\|w\|_H, \qquad \|A_1(x; \varepsilon) w\|_H \le C\|w\|_W, \]
for any w ∈ W. The constant C is independent of ε and σ. Finally, recall that D_∞(σ) is bounded in X and conclude
\[ \|w\|_W \le c_1(\sigma_0, \varepsilon_0)\Bigl( \|\mathcal{L}(\sigma, \varepsilon) w\|_H + C(\sigma)\|w\|_H + |\varepsilon^2 - \varepsilon_0^2|\, \|w\|_H + \varepsilon^2 C \|w\|_W \Bigr). \]
Choose ε₁ such that ε₁² c₁(σ₀, ε₀)C ≤ 1/2 and (5.6) is proved.
For the next two lemmas we follow [RS95]. For each T > 0 define the Hilbert spaces H_T = L²([−T, T], X), W_T = L²([−T, T], Y) ∩ H¹([−T, T], X). The embedding W_T ⊂ H_T is compact (cf. [RS95], Lemma 3.8).
Lemma 5.4. Assume ε ∈ (0, ε₁) and σ ∉ iℝ. There exist T = T(σ, ε) > 0 and c₃(σ, ε) > 0, such that the inequality
\[ \|w\|_W \le c_3(\sigma, \varepsilon)\bigl( \|w\|_{H_T} + \|\mathcal{L}(\sigma, \varepsilon) w\|_H \bigr) \tag{5.7} \]
holds for any w ∈ W.
Proof. Assume w ∈ W is such that w(x) = 0, for |x| ≤ T, for some T > 0. Then (5.5) and the inequalities in Lemma 4.2 imply
\[ \|w\|_W \le c_1(\sigma, \varepsilon)\|\mathcal{L}_\infty(\sigma, \varepsilon) w\|_H \le c_1(\sigma, \varepsilon)\bigl( \|\mathcal{L}(\sigma, \varepsilon) w\|_H + \varepsilon^2 \|D_1(x; \sigma, \varepsilon) w\|_H + \varepsilon^2 \|A_1(x; \varepsilon) w\|_H \bigr) \le c_1(\sigma, \varepsilon)\|\mathcal{L}(\sigma, \varepsilon) w\|_H + C_0(\sigma, \varepsilon)\, \varepsilon^2 e^{-\sqrt{\beta}\,\varepsilon T} \|w\|_W. \]
Then, there exist T = T(σ, ε) > 0 and C₁(σ, ε) > 0 such that for any w ∈ W, with w(x) = 0 for |x| ≤ T − 1, we have
\[ \|w\|_W \le C_1(\sigma, \varepsilon)\|\mathcal{L}(\sigma, \varepsilon) w\|_H. \tag{5.8} \]
Take a smooth cutoff function φ : ℝ → [0, 1] such that φ(x) = 0 for |x| ≥ T, φ(x) = 1 for |x| ≤ T − 1, and |φ′(x)| ≤ m. Using (5.6) and (5.8) we obtain
\[ \|w\|_W \le \|\varphi w\|_W + \|(1 - \varphi) w\|_W \le c_2(\sigma)\bigl( \|\varphi w\|_H + \|\mathcal{L}(\sigma, \varepsilon)\varphi w\|_H \bigr) + C_1(\sigma, \varepsilon)\|\mathcal{L}(\sigma, \varepsilon)(1 - \varphi) w\|_H \le c_3(\sigma, \varepsilon)\bigl( \|w\|_{H_T} + \|\mathcal{L}(\sigma, \varepsilon) w\|_H \bigr), \]
since ℒ(σ, ε)φw = φℒ(σ, ε)w + φ′w.
Lemma 5.5. For any ε ∈ (0, ε₁) and σ ∉ iℝ, the operator ℒ(σ, ε) has closed range and finite dimensional kernel.
Proof. Since the restriction W → H_T is compact, the conclusion follows from Lemma 5.4 and the Abstract Closed Range Lemma (cf. [RS95]).
Lemma 5.6. For any ε ∈ (0, ε₁) and σ ∉ iℝ, the adjoint operator ℒ(σ, ε)* has closed range and finite dimensional kernel.
Proof. The proof is similar to the proof of Lemma 5.5 and we omit it.
Lemmas 5.5 and 5.6 imply:
Lemma 5.7. For any ε ∈ (0, ε₁) and σ ∉ iℝ, the operator ℒ(σ, ε) is Fredholm.
Finally, we show
Lemma 5.8. For any ε ∈ (0, ε₁) and σ ∉ iℝ, the Fredholm index of ℒ(σ, ε) is zero.
Proof. Since ℒ(σ, ε) − ℒ_∞(σ, ε) is a small perturbation of ℒ_∞(σ, ε), and since this operator has a bounded inverse from H into W, for any σ ∉ iℝ, a perturbation argument shows that ℒ(σ, ε) is invertible for σ in an open set in the right half plane Re σ > 0, and for σ in an open set in the left half plane Re σ < 0. Hence, for σ in these open subsets the Fredholm index of ℒ(σ, ε) is zero. Since the Fredholm index of ℒ(σ, ε) is constant on connected subsets of ℂ \ iℝ, we conclude that its Fredholm index is zero, for any ε ∈ (0, ε₁) and σ ∉ iℝ.
Proposition 5.9. For any ε ∈ (0, ε₁), the entire imaginary axis σ ∈ iℝ belongs to the essential spectrum of ℒ_ε.
Proof. The proof is identical to the proof for ℒ_ε^∞ from Lemma 5.1. The orthonormal sequence w_ℓ, which was constructed there, satisfies ℒ(σ, ε)w_ℓ → 0 for ℓ → ∞.
6. The Point Spectrum of Solitary Waves
The goal of this section is to prove Theorem 4. Equivalently, given the information on the essential spectrum from Theorem 3, we show that Re σ ≠ 0 belongs to the resolvent set for bounded |σ| and small ε.
Proposition 6.1. For any R > 0, there exists ε₂ > 0 such that, for any ε ∈ (0, ε₂) and any σ ∉ iℝ, |σ| ≤ R, the operator ℒ(σ, ε) is invertible.
The proposition is proved in several steps. Since we have bounds on the norm of ℒ_∞(σ, ε)⁻¹, uniformly for values |Re σ| ≥ δ > 0, |σ| ≤ R, it is sufficient to consider a neighborhood of the imaginary axis σ ∈ i[−R, R]. We therefore concentrate on a neighborhood of σ = iq for fixed q. There are then two different cases: (I) finite frequencies q ≠ 0, (II) small frequencies q = 0. In both cases, we are interested in the kernel of the operator ℒ(σ, ε), which is Fredholm with index zero for Re σ ≠ 0. Elements of the kernel are bounded solutions of the abstract, non-autonomous, linear differential equation
\[ \tilde W_x = D(\sigma, \varepsilon)\tilde W + A(\varepsilon)\tilde W. \tag{6.1} \]
It is sufficient to show that this ordinary differential equation does not possess any nontrivial, bounded solutions. We will see that, just as for the nonlinear steady equation, bounded solutions lie on a finite-dimensional, invariant manifold. To the abstract, quasilinear differential equation (6.1), we apply non-autonomous center-manifold reduction; see [Mi88]. The reduction is performed for σ close to iq ∈ iℝ and ε small. Note that for any q fixed, finite, and ε = 0, the linear equation is a relatively bounded perturbation of the principal part
\[ \tilde W_x = L_\infty(iq, 0)\tilde W = (D_\infty(iq) + A_\infty(0))\tilde W, \]
with small relative bound. In Lemma 5.1 we proved the resolvent estimate
\[ \|(ik - L_\infty(iq, 0))^{-1}\|_{X \to X} \le \frac{C}{|k|}, \]
for all |k| ≥ k₀(q), and we may apply the reduction theorem in [Mi88] in a neighborhood of any fixed point q, uniformly for bounded q.
6.1. The case of non-zero frequency.
6.1.1. The reduction. We exclude point spectrum in a neighborhood of iq ≠ 0, case (I). Set σ = iq + δ and rewrite (6.1) as
\[ \tilde W_x = L_\infty(iq, 0)\tilde W + \delta B_\infty(\delta)\tilde W + \varepsilon^2 (B_0 + B_1(x; \delta, \varepsilon))\tilde W, \tag{6.2} \]
where L_∞(iq, 0) = D_∞(iq) + A_∞(0), and
\[ \delta B_\infty(\delta) = D_\infty(iq + \delta) - D_\infty(iq), \qquad \varepsilon^2 B_0 = A_\infty(\varepsilon^2) - A_\infty(0), \qquad B_1(x; \delta, \varepsilon) = D_1(x; iq + \delta, \varepsilon) + A_1(x; \varepsilon). \]
We view Eq. (6.2) as a small perturbation of the eigenvalue problem for δ = 0 and ε = 0. This is justified by the following inequalities
\[ \|B_\infty(\delta)\|_{Y \to Y\ (\text{resp. } X \to X)} \le C(1 + |q|), \qquad \|B_0\|_{Y \to Y\ (\text{resp. } X \to X)} \le C, \qquad \|B_1(x; \delta, \varepsilon)\|_{Y \to X} \le C(1 + |q|^2)\, e^{-\sqrt{\beta}\,\varepsilon|x|}, \]
for ε ∈ (0, ε₃) and any q ≠ 0. The reduction procedure is performed for small ε and δ. We have to find the center eigenspace of the linear operator L_∞(iq, 0). The linear operator L_∞(iq, 0) is closed in X with dense domain Y. Moreover, it has compact resolvent, so its spectrum consists only of isolated eigenvalues of finite multiplicities. As shown in the proof of Lemma 5.1, ζ is an eigenvalue of L_∞(iq, 0) if
\[ (iq + \zeta)^2 \cos\zeta = (1 - b\zeta^2)\,\zeta \sin\zeta. \]
Imaginary solutions ζ = ik of this equation satisfy
\[ (q + k)^2 = (1 + bk^2)\, k \tanh k. \]
We find exactly two simple roots ik₁ and ik₂ with k₂ < 0 < k₁ (since b > 1/3). Hence, L_∞(iq, 0) has two simple, purely imaginary eigenvalues ik₁, ik₂. The corresponding eigenvectors are
\[ w_{1,2} = \begin{pmatrix} \cosh(k_{1,2}\, y) + \tfrac{1}{2} q y^2 \tilde z_{1,2} \\ i k_{1,2} \cosh(k_{1,2}\, y) \\ i \tilde z_{1,2} \\ -b k_{1,2} \tilde z_{1,2} \end{pmatrix}, \]
where
\[ \tilde z_{1,2} = -\frac{k_{1,2} \sinh k_{1,2}}{k_{1,2} + q} = -\frac{(k_{1,2} + q) \cosh k_{1,2}}{\lambda + b k_{1,2}^2}. \]
The center manifold reduction implies that small bounded solutions of (6.2) are of the form
\[ \tilde W(x) = a_1(x) w_1 + a_2(x) w_2 + O((|\delta| + \varepsilon^2)|a_j|). \tag{6.3} \]
For the amplitudes a = (a₁, a₂), we find a linear, non-autonomous system of ordinary differential equations, depending on the eigenvalue parameter σ and the bifurcation parameter ε,
\[ a_x = A(x; \delta, \varepsilon)\, a. \tag{6.4} \]
At ε = 0, the 2 × 2-matrix A no longer depends on x and possesses two distinct purely imaginary eigenvalues. In the remainder of this section, we set up a perturbation argument, which shows that for ε small and Re δ > 0, there are no bounded solutions to (6.4).
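The root count behind the two center eigenvalues can be explored numerically. The following is a minimal sketch, not part of the paper's argument: it brackets one negative and one positive real root of (q + k)² = (1 + bk²) k tanh k by bisection. The function names, the bracket size `k_max` and the sample values q = 1, b = 1 are illustrative assumptions; the sketch does not prove uniqueness of the roots.

```python
import math


def dispersion_gap(k, q, b, eps=0.0):
    """Left side minus right side of (q + k)^2 = (1 + eps^2 + b k^2) k tanh k."""
    return (q + k) ** 2 - (1.0 + eps ** 2 + b * k * k) * k * math.tanh(k)


def bisect_root(func, lo, hi, tol=1e-12, max_iter=200):
    """Plain bisection; requires a sign change of func on [lo, hi]."""
    f_lo, f_hi = func(lo), func(hi)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on the bracket")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_mid == 0.0 or hi - lo < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid              # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid  # root lies in [mid, hi]
    return 0.5 * (lo + hi)


def imaginary_roots(q, b, eps=0.0, k_max=50.0):
    """Return (k_neg, k_pos) with k_neg < 0 < k_pos, assuming q != 0.

    At k = 0 the gap equals q^2 > 0, while for large |k| the cubic term
    b k^3 tanh k dominates and the gap is negative, so each half-line
    [-k_max, 0] and [0, k_max] contains a sign change when k_max is
    large enough compared to |q| (an assumption of this sketch).
    """
    def gap(k):
        return dispersion_gap(k, q, b, eps)

    return bisect_root(gap, -k_max, 0.0), bisect_root(gap, 0.0, k_max)
```

For example, `imaginary_roots(q=1.0, b=1.0)` returns one root in (−1, 0) and one positive root, matching the ordering k₂ < 0 < k₁ stated in the text.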
6.1.2. Exponential dichotomies. At ε = 0, Eq. (6.4) is autonomous. At Re σ = 0, the spectrum of the matrix A consists precisely of the eigenvalues ζ₁ = ik₁, k₁ > 0 and ζ₂ = ik₂, k₂ < 0. Depending on δ = σ − iq, the eigenvalues may move off the imaginary axis. A direct computation shows that dζ₁/dδ > 0 and dζ₂/dδ < 0, so that the eigenvalues leave the axis, with non-vanishing speed, in opposite directions. In particular, for Re σ > 0 small and ε = 0, we find that (6.4) is a hyperbolic, linear ordinary differential equation. The eigenspaces are analytic in δ = σ − iq. For ε > 0, the eigenvalues ζ₁ and ζ₂ still describe the dynamics at x = ±∞, since the solitary wave and therefore the coefficients of the matrix A(x; δ, ε) converge to zero, with rate e^{−√β ε|x|}. Therefore, when Re δ > 0, the dynamics for |x| → ∞ are hyperbolic, with stable eigenvalue ζ₂ and unstable eigenvalue ζ₁. The following lemma on exponential dichotomies shows in which sense the hyperbolic structure can be continued to finite x. We therefore consider a general non-autonomous, linear differential equation
\[ a_x = A(x; \delta, \mu)\, a, \qquad a \in \mathbb{R}^n, \tag{6.5} \]
depending on a real parameter µ and a complex spectral parameter δ. In our example, µ represents the (small) parameter ε.
Lemma 6.2. Consider (6.5) with fundamental solution ϕ(x, y). Assume asymptotically constant coefficients A(x; δ, µ) → A_±(δ, µ) as x → ±∞, and smoothness: A and A_± are C^k in the parameter µ ∈ U_µ ⊆ ℝ, k ≥ 0, and analytic in the spectral parameter δ ∈ U_δ ⊆ ℂ, and A is continuous in x. Furthermore assume that A_± are hyperbolic, that is, they do not possess eigenvalues on the imaginary axis, for all µ ∈ U_µ and all δ ∈ U_δ.
Then there exists a unique decomposition of the phase space ℝⁿ into linear, stable and unstable subspaces E_+^s(x; δ, µ) and E_−^u(x; δ, µ), which are as smooth as A. The subspaces are invariant under the linear evolution ϕ(x, y):
\[ \varphi(x, y) E_+^s(y) = E_+^s(x), \qquad \varphi(x, y) E_-^u(y) = E_-^u(x). \]
Moreover, any initial value to a bounded solution on [0, ∞) is contained in E_+^s(0) and initial values to bounded solutions on (−∞, 0] are contained in E_−^u(0).
On the other hand, there are positive constants C, η_+ > 0, and η_− > 0 such that we have uniform exponential decay for solutions in forward time,
\[ |\varphi(x, y) a| \le C e^{-\eta_+ |x - y|} |a| \]
for all a ∈ E_+^s(y), x ≥ y ≥ 0, and in backward time
\[ |\varphi(x, y) a| \le C e^{-\eta_- |x - y|} |a| \]
for all a ∈ E_−^u(y), x ≤ y ≤ 0. The constants C and η_± can be chosen independently of µ, δ in compact subsets of U_µ × U_δ.
For the proof, see [Co78], for example. By the above lemma, we find nontrivial, bounded solutions if and only if stable and unstable subspaces intersect nontrivially:
\[ E_-^u(0) \cap E_+^s(0) \ne \{0\}. \]
We may choose bases a_±^j, analytic in δ and continuous in µ, in the two subspaces and compute the determinant
\[ E(\delta; \mu) = \det(a_\pm^j). \tag{6.6} \]
A variant of this analytic function is usually referred to as the Evans function [Ev72, AGJ90]. Clearly, zeroes of E detect precisely the nontrivial bounded solutions to (6.4), and therefore the point spectrum coincides with the zeroes of E. The algebraic multiplicity of eigenvalues coincides with the order of the zeroes of E; see [AGJ90]. By analyticity in δ and continuous dependence on µ, the number of zeroes counted with multiplicity varies continuously with µ. We are going to exploit this fact in Sect. 6.2.
In our setting, both subspaces are well-defined and complex one-dimensional for Re δ > 0. They are spanned by the complex vectors a_+^s(0) and a_−^u(0), which lead to solutions a_+^s(x) and a_−^u(x). It is our goal to show that both solutions can be extended, analytically in δ and continuously in ε, in an open neighborhood of δ = 0, in particular across the imaginary axis, where hyperbolicity at x = ±∞ is lost, into the left half plane. We show that in the limit ε = 0 and Re δ = 0, the initial values a_+^s(0) and a_−^u(0) converge to eigenvectors e₂ and e₁ to the eigenvalues ζ₂ and ζ₁, respectively. In particular, E(0; 0) ≠ 0, and by continuity, we can exclude unstable eigenvalues in a neighborhood of σ = iq.
6.1.3. A gap lemma. The goal here is to continue the Evans function across the essential spectrum. The idea is to exploit rapid convergence of the coefficients of the non-autonomous differential equation A(x), compensating for the loss of hyperbolicity in the asymptotic equation at x = ±∞. The main idea was already used in [GZ98, Theorem 2.3] and [KS98, Lemma 2.2]. We recall the results stated there.
Theorem 5 ([GZ98, KS98]). Consider a non-autonomous, linear differential equation
\[ a_x = A(x; \delta, \mu)\, a, \qquad a \in \mathbb{R}^n, \]
with fundamental solution ϕ(x, y), with parameters δ ∈ U_δ(0) ⊂ ℂ and µ ∈ U_µ(0) ⊂ ℝ close to the origin. Assume exponential convergence to asymptotically constant coefficients
\[ |A(x; \delta, \mu) - A_\infty(\delta, \mu)| \le C e^{-\eta|x|} \]
with positive constants C, η > 0.
Assume furthermore that A and A_∞ are C^k in µ, k ≥ 0, and analytic in δ, and A is continuous in x. At µ = 0, δ = 0, we require the existence of a spectral projection P to A_∞ such that Re spec PA_∞ ≤ 0 and Re spec (id − P)A_∞ ≥ 0.
Then there exists a unique decomposition of the phase space ℝⁿ into linear, stable and unstable subspaces E_+^s(x; δ, µ) and E_−^u(x; δ, µ), which are as smooth as A. The subspaces are invariant under the linear evolution ϕ(x, y):
\[ \varphi(x, y) E_+^s(y) = E_+^s(x), \qquad \varphi(x, y) E_-^u(y) = E_-^u(x). \]
Solutions to initial values in E_+^s(0) converge to E^s(δ, µ) as x → ∞, where the eigenspace E^s(δ, µ) smoothly depends on δ and µ and coincides with the range Im P for µ = 0, δ = 0.
Also, solutions to initial values in E_−^u(0) converge to E^u(δ, µ) as x → −∞, where the eigenspace E^u(δ, µ) smoothly depends on δ and µ and coincides with the kernel Ker P for µ = 0, δ = 0. In particular, for parameter values δ, µ where the eigenspaces E^{s/u}(δ, µ) are actually the stable and unstable eigenspaces, respectively, the subspaces E_+^s(x; δ, µ) and E_−^u(x; δ, µ) coincide with the eigenspaces from Lemma 6.2.
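For orientation, in the x-independent limit the stable and unstable subspaces are just eigenspaces of a constant matrix, and the determinant (6.6) reduces to the determinant of the two eigenvectors. The sketch below treats only this toy situation for a 2 × 2 complex matrix with distinct eigenvalues; the function names and the sample matrices are hypothetical, and the normalization of the eigenvectors (hence the value of the determinant, but not whether it vanishes) is arbitrary.

```python
import cmath


def eigenpairs_2x2(a, b, c, d):
    """Eigenvalues and eigenvectors of [[a, b], [c, d]], assuming distinct eigenvalues."""
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)
    pairs = []
    for lam in ((trace + root) / 2.0, (trace - root) / 2.0):
        # Both candidates solve (A - lam I) v = 0; for distinct eigenvalues
        # at least one of them is non-zero, so keep the larger one.
        cand_row1 = (b, lam - a)
        cand_row2 = (lam - d, c)
        vec = max(cand_row1, cand_row2, key=lambda w: abs(w[0]) + abs(w[1]))
        pairs.append((lam, vec))
    return pairs


def evans_constant(a, b, c, d):
    """det[v_u, v_s] for a constant hyperbolic 2x2 matrix (toy Evans function)."""
    pairs = eigenpairs_2x2(a, b, c, d)
    unstable = [vec for lam, vec in pairs if lam.real > 0]
    stable = [vec for lam, vec in pairs if lam.real < 0]
    if len(unstable) != 1 or len(stable) != 1:
        raise ValueError("need exactly one stable and one unstable eigenvalue")
    (u1, u2), (s1, s2) = unstable[0], stable[0]
    return u1 * s2 - u2 * s1
```

With eigenvalues 0.1 + 2i and −0.1 − i, loosely mimicking ζ₁ and ζ₂ pushed off the imaginary axis for Re δ > 0, `evans_constant(0.1 + 2j, 0.0, 0.0, -0.1 - 1j)` is non-zero, reflecting that the two eigendirections stay transverse. The gap lemma is what allows this transversality to be tracked when the coefficients depend on x and hyperbolicity is lost.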
In our problem, one additional difficulty arises. The convergence rate $\eta$ of the non-autonomous perturbation depends on $\mu = \varepsilon$. The rate, $\sqrt{\beta}\,\varepsilon$, although fast compared to the eigenvalues of the asymptotic matrix, $O(\varepsilon^2)$, is not bounded away from zero, as required in the above theorem. We therefore restate a parameter-dependent version of these results, taking into account the different orders of convergence of the solitary wave and possible eigenfunctions.

Proposition 6.3. Consider a non-autonomous, linear differential equation $a_x = A(x; \delta, \mu)a$, depending on a parameter $\mu \in \mathbb{R}^p$ and an eigenvalue parameter $\delta \in \mathbb{C}$, in a neighborhood of the origin in $\mathbb{R}^p \times \mathbb{C}$. Assume that the coefficients $A$ are $C^k$, $k \ge 0$, in $\mu$ and analytic in $\delta$, and that $A$ is continuous in $x$. Furthermore assume that $A(x; \delta, \mu)$ converge to constant matrices as $|x| \to \infty$,
\[ |A(x; \delta, \mu) - A_\infty(\delta, \mu)| \le C|\mu|e^{-\eta(\mu)|x|}, \]
and, as $\mu \to 0$,
\[ |A(x; \delta, \mu) - A_0(\delta)| \le C|\mu|. \]
Assume $\mathrm{spec}\,A_0(0) \subset i\mathbb{R}$, and $A_\infty(\delta, \mu)$ is hyperbolic for $\mathrm{Re}\,\delta \neq 0$, with
\[ |(ik - A_\infty(\delta, \mu))^{-1}| \le \frac{C}{|\mathrm{Re}\,\delta|}, \tag{6.7} \]
for all $k \in \mathbb{R}$ and with $C > 0$ independent of $\mu \ge 0$. Suppose that spatial convergence of the coefficients is fast compared to the rate of hyperbolicity: $\mu/\eta(\mu) \to 0$ as $\mu \to 0$. Then the Evans function $E(\delta; \mu)$, defined for $\mathrm{Re}\,\delta > 0$, can be extended continuously in $\mu$ and analytically in $\delta$, in a sector $\{(\delta, \mu);\ -\mathrm{Re}\,\delta \le M|\mu|\}$, for any fixed constant $M > 0$. In the limit $\mu \to 0$, we find $E(\delta; 0) \neq 0$ for $\delta$ close to zero.

Proof. For any $\mu \neq 0$ small, the conclusions of the proposition directly follow from the gap lemma, Theorem 5. We have to show that the limit $\mu \to 0$ of $E(\delta; \mu)$ exists, and is nonzero. For $\mathrm{Re}\,\delta > 0$, $\mu \ge 0$, the equation possesses exponential dichotomies, as stated in Lemma 6.2. The subspaces can actually be constructed from a fixed point argument. We focus on $E_+^s$, first. From the resolvent estimate (6.7), we conclude that the subspaces $E_+^{s/u}(\delta)$ corresponding to stable and unstable eigenvalues for the equation with $\mu = 0$ continue analytically in a neighborhood of $\delta = 0$. We write $P_+$ for the projection on $E_+^s(0)$ along $E_+^u(0)$, and $B(y) := A(y) - A_\infty$, suppressing the dependence on $\delta$ and $\mu$. For $\mathrm{Re}\,\delta > 0$, solutions $a(x)$ which are bounded on $x \ge 0$ then solve the integral equation
\[ a(x) = e^{A_\infty x}a_0 + \int_0^x e^{A_\infty(x-y)}P_+B(y)a(y)\,dy + \int_\infty^x e^{A_\infty(x-y)}(\mathrm{id} - P_+)B(y)a(y)\,dy \]
with $a_0 = P_+a(0)$. We substitute $\hat a(x) = e^{-A_\infty x}a(x)$ and arrive at
\[ \hat a(x) = a_0 + \int_0^x e^{-A_\infty y}P_+B(y)e^{A_\infty y}\hat a(y)\,dy + \int_\infty^x e^{-A_\infty y}(\mathrm{id} - P_+)B(y)e^{A_\infty y}\hat a(y)\,dy. \]
We view the right side as an affine operator on the space of bounded, continuous functions on $[0, \infty)$, equipped with the supremum norm. Since $|B(y)| \le C|\mu|e^{-\eta(\mu)|y|}$ and $|e^{A_\infty y}| \le Ce^{C|\mu|y}$ for $-\mathrm{Re}\,\delta \le C|\mu|$, we find that the norm of the linear part of the right side is bounded by $C\mu/\eta(\mu)$, which converges to zero for $\mu \to 0$ by assumption. We therefore find a unique solution $\hat a(x)$ in the sector, which converges to the constant solution as $\mu \to 0$. We find the stable subspace as $\hat a(0)$, parameterized over $a_0$. The construction of the unstable subspace is similar. In the limit, $\mu = 0$, we find the Evans function for the constant coefficient equation, which is nonzero, since we have a spectral decomposition on the imaginary axis corresponding to the limits of stable and unstable subspaces.

Together with the considerations in Sect. 6.1.2, this proves absence of point spectrum in a neighborhood of the imaginary axis, outside a given small neighborhood of the origin, which we consider next.

6.2. The case of small frequency. We exclude point spectrum in a neighborhood of the origin $\sigma = 0$, off the imaginary axis. As a first step, we reduce the eigenvalue problem to finding non-trivial solutions to a four-dimensional non-autonomous ordinary differential equation, Sect. 6.2.1. We then introduce and justify a long-wave scaling corresponding to the Korteweg–de Vries limit, Sect. 6.2.2. We then recall from [PW92] the structure of the spectrum in the scaling limit, where we find the spectrum of the Korteweg–de Vries soliton, Sect. 6.2.3. The last part of this section, Sect. 6.2.4, is devoted to the central perturbation arguments. We show that the spectrum of the capillary-gravity waves coincides with the point spectrum of the Korteweg–de Vries soliton in a neighborhood of the imaginary axis.

6.2.1. The reduction. Rewrite (6.1) for $\sigma = \delta$ small as
\[ W_x = L_\infty(0, 0)W + \delta B_\infty(\delta)W + \varepsilon^2(B_0 + B_1(x; \delta, \varepsilon))W, \tag{6.8} \]
where
\[ L_\infty(0, 0) = A_\infty(0), \qquad \delta B_\infty(\delta) = D_\infty(\delta), \qquad \varepsilon^2B_0 = A_\infty(\varepsilon^2) - A_\infty(0), \qquad B_1 = D_1 + A_1. \]
Recall that $A_\infty(0) = \tilde A(1)$, so it is exactly the linear operator used for the analysis of the steady problem in Theorem 1. From those results we find that $A_\infty(0)$ has only one purely imaginary eigenvalue $\zeta = 0$, with algebraic multiplicity four. The corresponding (generalized) eigenvectors are $w_0, w_1, w_2, w_3$ found in the proof of Theorem 1.
The center manifold reduction implies that the bounded solutions of (6.8) are of the form
\[ W(x) = a_0(x)w_0 + a_1(x)w_1 + a_2(x)w_2 + a_3(x)w_3 + O((|\delta| + \varepsilon^2)|a_j|), \]
and the amplitudes $a_j$ satisfy a non-autonomous, linear, reduced system of the form
\[ \begin{aligned}
a_{0,x} &= a_1 + \delta(c_{00}a_0 + c_{02}a_2) + \varepsilon^2(c_{01}a_1 + c_{03}a_3) + \varepsilon^2f_0(x; a_1, a_2, a_3) + O((|\delta| + \varepsilon^2)^2|a_j|),\\
a_{1,x} &= a_2 + \delta(c_{11}a_1 + c_{13}a_3) + \varepsilon^2f_1(x; a_1, a_2, a_3) + O((|\delta| + \varepsilon^2)^2|a_j|),\\
a_{2,x} &= a_3 + \delta(c_{20}a_0 + c_{22}a_2) + \varepsilon^2(c_{21}a_1 + c_{23}a_3) + \varepsilon^2f_2(x; a_1, a_2, a_3) + O((|\delta| + \varepsilon^2)^2|a_j|),\\
a_{3,x} &= \delta(c_{31}a_1 + c_{33}a_3) + \varepsilon^2f_3(x; a_1, a_2, a_3) + O((|\delta| + \varepsilon^2)^2|a_j|).
\end{aligned} \tag{6.9} \]
The constants $c_{ij}$ are $O(1)$ and can be determined explicitly. In particular, we have $c_{20} = c_{31} = -\beta$ and $c_{21} = \beta$. Note that the functions $f_j$ are independent of $a_0$. This is due to the invariance of (2.1)–(2.4) under the addition of a constant to the potential, which implies the invariance of the reduced system under $a_0 \to a_0 + \mathrm{const.}$ if $\delta = 0$. A direct calculation of the relevant terms gives
\[ \begin{aligned}
\varepsilon^2f_2(x; a_1, a_2, a_3) &= -\beta u_0a_1 + \varepsilon^2f_{22}(x; a_2, a_3),\\
\varepsilon^2f_3(x; a_1, a_2, a_3) &= -2\beta u_{0x}a_1 - 2\beta u_0a_2 + \varepsilon^2f_{33}(x; a_3)
\end{aligned} \]
with $u_0$ from Theorem 1 (i).

6.2.2. Justifying the Korteweg–de Vries scaling. As a first step, we prove that any eigenvalue $\delta$ with $\mathrm{Re}\,\delta \neq 0$ is necessarily located in an $O(\varepsilon^3)$-neighborhood of the origin. Suppose therefore $\varepsilon = \nu|\delta|^{1/3}$ with $\nu$ small. We shall prove that the system (6.9) does not possess non-trivial, bounded solutions, provided $\mathrm{Re}\,\delta \neq 0$. We may scale the system (6.9) according to
\[ \xi = |\delta|^{1/3}x, \qquad a_j(x) = |\delta|^{j/3}A_j(\xi), \qquad j = 0, \dots, 3, \]
and obtain
\[ \begin{aligned}
A_{0,\xi} &= A_1 + O(\delta^{2/3} + \nu^2\delta^{2/3}),\\
A_{1,\xi} &= A_2 + O(\delta^{2/3} + \nu^2\delta^{1/3}),\\
A_{2,\xi} &= A_3 - \beta e^{i\arg(\delta)}A_0 + O(\delta^{2/3} + \nu^2),\\
A_{3,\xi} &= -\beta e^{i\arg(\delta)}A_1 + O(\delta^{2/3} + \nu^2).
\end{aligned} \tag{6.10} \]
At $\nu = \delta = 0$, we have an autonomous linear ODE with eigenvalues $\zeta_0 = 0$, $\zeta_{1,2,3}^3 = -2\beta e^{i\arg(\delta)}$, and corresponding eigenvectors $A_0^0 = 1$, $A_1^0 = A_2^0 = 0$, $A_3^0 = -\beta e^{i\arg(\delta)}$, and $A_j^k = (-\zeta_k)^j$, $k = 1, 2, 3$ and $j = 0, \dots, 3$. Now suppose first $\mathrm{Re}\,\delta \neq 0$. Then $\zeta_j$, $j = 1, 2, 3$, are hyperbolic. Therefore the eigenspace to the eigenvalue $\zeta_0$ forms a normally hyperbolic center-manifold for the linear flow. This center-manifold persists under small, non-autonomous perturbations and contains all bounded solutions (we may construct the center-manifold as the robust intersection of the center-stable manifold at $x = \infty$ and the center-unstable manifold at $x = -\infty$). On the other hand, the eigenvalue $\zeta_0 = 0$ is easily seen from (5.2) to move off the imaginary axis whenever $\delta$ moves off the axis. But this eigenvalue determines the asymptotic behavior of solutions in the center-manifold at $x = +\infty$ and $x = -\infty$.

If now $\delta$ approaches the imaginary axis, we have to refine the arguments as in Case I, above. Using the gap lemma, Proposition 6.3, we continue the center-stable manifold at $x = +\infty$ and the center-unstable manifold at $x = -\infty$ smoothly across the imaginary axis, exploiting fast convergence of the non-autonomous terms on the scale $O(\varepsilon)$, compared to the order of the perturbation $O(\varepsilon^2)$. We omit the details, which are similar to the case of non-zero frequency, Sect. 6.1.

6.2.3. The Korteweg–de Vries limit. We may now assume that the eigenvalue $\delta$ is necessarily of the order $\varepsilon^3$ and therefore scale $\delta = \varepsilon^3H$. We obtain in the KdV-scaling
\[ \xi = \varepsilon x, \qquad a_j(x) = \varepsilon^jA_j(\xi), \qquad j = 0, \dots, 3, \]
the scaled reduced system
\[ \begin{aligned}
A_{0,\xi} &= A_1 + O(\varepsilon^2),\\
A_{1,\xi} &= A_2 + O(\varepsilon),\\
A_{2,\xi} &= A_3 - \beta HA_0 + \beta A_1 - \beta A_1^*A_1 + O(\varepsilon^2),\\
A_{3,\xi} &= -\beta HA_1 - 2\beta A_{1,\xi}^*A_1 - 2\beta A_1^*A_2 + O(\varepsilon).
\end{aligned} \tag{6.11} \]
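Here $A_1^*$ is the steady solitary wave solution of the KdV-equation
\[ 2\beta A_{1,\tau} + A_{1,\xi\xi\xi} - \beta A_{1,\xi} + 3\beta A_1A_{1,\xi} = 0, \qquad A_1^*(\xi) = \mathrm{sech}^2\left(\frac{\sqrt{\beta}\,\xi}{2}\right). \tag{6.12} \]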
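As a quick numerical sanity check of (6.12), the following sketch verifies that the profile $\mathrm{sech}^2(\sqrt{\beta}\,\xi/2)$ satisfies the steady KdV equation after one integration in $\xi$, namely $A'' - \beta A + \tfrac{3}{2}\beta A^2 = 0$ (the integration constant vanishes for a decaying profile). The value $\beta = 0.8$, the step size and the sample points are arbitrary choices made for illustration; they are not taken from the paper.

```python
import math

def soliton(xi, beta):
    """KdV solitary wave profile A1*(xi) = sech^2(sqrt(beta) * xi / 2)."""
    return 1.0 / math.cosh(0.5 * math.sqrt(beta) * xi) ** 2

def steady_kdv_residual(xi, beta, h=1e-3):
    """Residual of A'' - beta*A + 1.5*beta*A^2, the steady part of (6.12)
    integrated once in xi; A'' is approximated by a central difference."""
    a_minus, a_mid, a_plus = (soliton(xi + s * h, beta) for s in (-1, 0, 1))
    second_derivative = (a_plus - 2.0 * a_mid + a_minus) / (h * h)
    return second_derivative - beta * a_mid + 1.5 * beta * a_mid ** 2

beta = 0.8  # illustrative value only
max_residual = max(abs(steady_kdv_residual(0.25 * k, beta)) for k in range(-40, 41))
print(f"largest residual on [-10, 10]: {max_residual:.2e}")
```

The residual is limited only by the finite-difference error, so it shrinks further if the step `h` is reduced moderately.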
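We consider the case $\varepsilon = 0$ first. We transform variables $B_0 = A_0$, $B_1 = A_1$, $B_2 = A_2$, $B_3 = A_3 - \beta HA_0 + \beta A_1 - \beta A_1^*A_1$ and obtain at $\varepsilon = 0$,
\[ \begin{aligned}
B_{0,\xi} &= B_1,\\
B_{1,\xi} &= B_2,\\
B_{2,\xi} &= B_3,\\
B_{3,\xi} &= -2\beta HB_1 + \beta B_2 - 3\beta A_{1,\xi}^*B_1 - 3\beta A_1^*B_2,
\end{aligned} \tag{6.13} \]
which is the KdV-equation, linearized in the soliton solution $A_1^*$, for $B_1 = B_{0,\xi}$. The equation at $|\xi| = \infty$ reduces to
\[ B_{0,\xi} = B_1, \qquad B_{1,\xi\xi\xi} + 2\beta HB_1 - \beta B_{1,\xi} = 0 \]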
with characteristic polynomial $\zeta^4 + 2\beta H\zeta - \beta\zeta^2$ for the $\zeta$-eigenvalues, determining exponential spatial decay or growth of possible eigenfunctions. Besides $\zeta = 0$ with eigenvector $(1, 0, 0, 0)^T$, we have precisely the spectrum of the linearization about the KdV-soliton. In particular, dynamics in the space $(1, 0, 0, 0)^\perp$ are precisely the (linear) dynamics around the KdV-soliton. This strongly suggests that eigenfunctions will appear wherever the KdV-soliton possesses eigenfunctions, and nowhere else. Given the stability of the KdV-soliton [PW92], this would then prove stability of the solitary wave in the Euler-equations!

We construct in the sequel a more refined picture of the spectrum in $\varepsilon = 0$, which will, in particular, be persistent for $\varepsilon > 0$. First of all, we note that the trivial zero-eigenvalue moves out of zero as soon as $\varepsilon$ becomes positive and $H$ non-zero. This can be readily seen from (5.2), by substituting the KdV-scaling $\delta = \varepsilon^3H$ and $\zeta = \varepsilon Z$. From the dispersion relation (5.2) we then obtain a new equation for $Z$, $H$ and $\varepsilon$. The Taylor expansion of this equation in $\varepsilon^2$ is, up to third order,
\[ \varepsilon^4\left[\left(b - \tfrac{1}{3}\right)Z^4 - Z^2 + 2HZ + \varepsilon^2\left(-\tfrac{1}{6}\left(b - \tfrac{1}{5}\right)Z^6 + \tfrac{1}{6}Z^4 - HZ^3 + H^2\right) + O(\varepsilon^4)\right] = 0. \tag{6.14} \]
To second order in $\varepsilon^2$, there is still one eigenvalue $\zeta_0 = 0$, which can be seen to be perturbed to $\zeta_0 = -\tfrac{1}{2}\varepsilon^2H + O(\varepsilon^4H)$ by the third order terms in $\varepsilon^2$. We emphasize here that all eigenvalues are, for $\varepsilon \ge 0$ small, smooth functions in $H$ and $\varepsilon$.

6.2.4. Perturbing the Korteweg–de Vries spectrum. We consider the scaled eigenvalue problem for the water-waves (6.11) as a small perturbation of the eigenvalue problem for the KdV-equation (6.13). We distinguish three cases, with increasing difficulty. First we consider $H$ bounded away from the imaginary axis. We then continue the arguments for $H$ close to the imaginary axis, but bounded away from the origin. Finally, we study the eigenvalue problem for $H$ in a neighborhood of the origin.

(I) Eigenvalues far from the imaginary axis. Suppose first that $\mathrm{Re}\,H \ge \nu_* > 0$ for some $\nu_* > 0$. We have to exclude bounded solutions to (6.11) for $\varepsilon > 0$, small. As in Sect. 6.1, we exploit the fact that the $\xi$-dependent coefficients in (6.11) converge exponentially as $|\xi| \to \infty$, uniformly in $\varepsilon \ge 0$.
In order to construct stable and unstable subspaces as in Sect. 6.1, we discuss the spatial eigenvalues $\zeta_j$ of (6.11) at $|\xi| = \infty$. From the scaled dispersion relation (6.14), we find two eigenvalues with positive real part, $\zeta_1$ and $\zeta_3$, one eigenvalue with negative real part, which we call $\zeta_2$, and the eigenvalue $\zeta_0$, which for $\varepsilon = 0$ remains in the origin, and moves into the left half plane for $\varepsilon > 0$:
\[ \mathrm{Re}\,\zeta_1,\ \mathrm{Re}\,\zeta_3 > 0, \qquad \mathrm{Re}\,\zeta_0 \le 0, \qquad \mathrm{Re}\,\zeta_2 < 0. \]
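The sign pattern of the real parts can be illustrated numerically at $\varepsilon = 0$, where, besides $\zeta_0 = 0$, the spatial eigenvalues are the roots of the cubic $\zeta^3 - \beta\zeta + 2\beta H = 0$ coming from the asymptotic form of (6.13). The sketch below solves this cubic with Cardano's formula and counts the roots in each half plane for a few sample values with $\mathrm{Re}\,H > 0$. The helper names and the chosen values of $\beta$ and $H$ are illustrative assumptions, not taken from the paper.

```python
import cmath

def kdv_spatial_eigenvalues(H, beta):
    """Roots of zeta^3 - beta*zeta + 2*beta*H = 0 via Cardano's formula
    for the depressed cubic zeta^3 + p*zeta + q = 0."""
    p, q = -beta, 2.0 * beta * H
    # Any cube root C of -q/2 + sqrt(q^2/4 + p^3/27) works, provided C != 0
    # (which holds here because p = -beta is nonzero).
    C = (-q / 2.0 + cmath.sqrt(q * q / 4.0 + p ** 3 / 27.0)) ** (1.0 / 3.0)
    omega = cmath.exp(2j * cmath.pi / 3.0)  # primitive cube root of unity
    return [C * omega ** k - p / (3.0 * C * omega ** k) for k in range(3)]

def count_half_planes(roots):
    """Return (number of roots with Re > 0, number of roots with Re < 0)."""
    return (sum(r.real > 0 for r in roots), sum(r.real < 0 for r in roots))

beta = 1.0  # illustrative value only
for H in (0.05, 1.0, 0.3 + 2.0j, 0.2 - 5.0j):
    roots = kdv_spatial_eigenvalues(H, beta)
    print(H, count_half_planes(roots))
```

The count cannot change while $\mathrm{Re}\,H > 0$, since a root $\zeta = ik$ on the imaginary axis forces $H = i(k^3 + \beta k)/(2\beta)$ to be purely imaginary.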
With Lemma 6.2, we can construct linear subspaces $E^s(0)$ and $E^u(0)$, such that all initial values at $\xi = 0$ of the linear equation (6.11) leading to bounded solutions on $\mathbb{R}^+$ or $\mathbb{R}^-$ are contained in $E^s(0)$ or $E^u(0)$, respectively. Both subspaces depend analytically on $H$, $\mathrm{Re}\,H \ge \nu_* > 0$, and smoothly on $\varepsilon \ge 0$. Choosing analytic bases $B_{s/u}^1, B_{s/u}^2$ in $E^{s/u}(0)$, we can compute the Evans function
\[ E(H; \varepsilon) = \det\left(B_s^1, B_s^2, B_u^1, B_u^2\right). \]
We show that $E(H; 0)$ is nonzero for $\mathrm{Re}\,H \ge 0$. By continuity in $\varepsilon$ and the previous considerations for large $H$, this excludes eigenvalues in $\mathrm{Re}\,H \ge \nu_* > 0$.
The Evans function $E(H; 0)$ can be computed almost explicitly from (6.13). Recall that the equation for $(B_1, B_2, B_3)$ does not depend on $B_0$ and is precisely the linearization about the KdV-soliton. We therefore define the subspace $(0, *, *, *) = (1, 0, 0, 0)^\perp$ as the KdV-subspace. This subspace is not flow-invariant, but the dynamics in this subspace are independent of the value of $B_0$ in the first component if $\varepsilon = 0$. This gives the equations a skew-product structure. We may first solve the equation in the KdV-subspace and then solve the equation for $B_0$. Within the KdV-subspace, we find the eigenvalues $\zeta_1$, $\zeta_2$, and $\zeta_3$. We find the stable and unstable subspaces $E_{KdV}^s(0)$ and $E_{KdV}^u(0)$ by intersecting the subspaces $E^s(0)$ and $E^u(0)$ with the KdV-subspace. In particular, $E_{KdV}^s(0)$ is one-dimensional and $E_{KdV}^u(0)$ is two-dimensional. Choosing analytic bases in these two subspaces, we can compute an analytic function $E_{KdV}(H)$, the Evans function of the KdV-soliton. We are now going to use information from [PW92] on the zeroes of $E_{KdV}(H)$.

Theorem 6 ([PW92]). The Evans function $E_{KdV}(H)$ for the KdV-soliton can be extended analytically into $\mathrm{Re}\,H > -4/3$. It vanishes precisely in the origin, where we have $E_{KdV}(0) = 0$, $E_{KdV}'(0) = 0$, $E_{KdV}''(0) \neq 0$.

From this information, we can infer absence of zeroes for $E(H; 0)$ in $\mathrm{Re}\,H \ge \nu_*$.

Lemma 6.4. The reduced, scaled Evans function of the water-wave problem, $E(H; 0)$, and the Evans function for the KdV-soliton, $E_{KdV}(H)$, differ by a non-vanishing analytic function $S(H)$:
\[ E(H; 0) = S(H)E_{KdV}(H); \qquad S(H) \neq 0 \quad \text{for } \mathrm{Re}\,H > 0. \]

Proof. We compute $E(H)$ choosing a particular analytic basis in $E^s(0)$ and $E^u(0)$. Note first that $B_s^1 := B_0 = (1, 0, 0, 0)^T \in E^s(0)$ since this vector is constant under time-$\xi$ evolution. Next, let $B_{s,KdV}^2(H), B_{u,KdV}^1(H), B_{u,KdV}^2(H) \in \mathbb{C}^3$ denote the basis vectors for stable and unstable KdV-subspaces $E_{KdV}^{s/u}(0)$. Solving $B_{0,\xi} = B_1$, with $B_1$ given from the KdV-subspace, with initial condition $B_{s/u,KdV}^j(H)$, we find particular bases $B_{s/u}^j$ of $E^{s/u}(0)$, which coincide with $B_{s/u,KdV}^j$ in the KdV-subspace. Since $B_s^1 = (1, 0, 0, 0)^T$, we find that in these coordinates the determinant $\det(B_s^1, B_s^2, B_u^1, B_u^2)$ is of the form
\[ E(H; 0) = \det\begin{pmatrix}
1 & * & * & *\\
0 & \left(B_{s,KdV}^2\right)_1(H) & \left(B_{u,KdV}^1\right)_1(H) & \left(B_{u,KdV}^2\right)_1(H)\\
0 & \left(B_{s,KdV}^2\right)_2(H) & \left(B_{u,KdV}^1\right)_2(H) & \left(B_{u,KdV}^2\right)_2(H)\\
0 & \left(B_{s,KdV}^2\right)_3(H) & \left(B_{u,KdV}^1\right)_3(H) & \left(B_{u,KdV}^2\right)_3(H)
\end{pmatrix} = \det\left(B_{s,KdV}^2(H), B_{u,KdV}^1(H), B_{u,KdV}^2(H)\right) = E_{KdV}(H). \]
Choosing different analytic bases, the determinant only differs by a nonzero, analytic factor, which proves the lemma.

Corollary 6.5. The scaled Evans function of the water-wave problem $E(H; 0)$ does not vanish in the right half plane. In particular, for $0 < \varepsilon \le \varepsilon_*(\nu)$, there are no unstable eigenvalues of the solitary wave in $\mathrm{Re}\,\delta \ge \nu\varepsilon^{3/2}$.
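The determinant reduction used in the proof of Lemma 6.4 is a cofactor expansion along a first column of the form $(1, 0, 0, 0)^T$: the entries marked $*$ in the first row do not contribute. A minimal sketch with made-up integer entries (purely illustrative stand-ins; they do not come from the water-wave problem):

```python
def det(matrix):
    """Determinant by cofactor expansion along the first row."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * matrix[0][col] * det(minor)
    return total

# Stand-in for the 3x3 block whose columns play the role of the KdV basis vectors.
kdv_block = [[2, -1, 3],
             [0, 4, 1],
             [5, 2, -2]]

# Full 4x4 matrix: first column (1, 0, 0, 0)^T, arbitrary "*" entries in row 1.
stars = [7, -3, 11]
full = [[1] + stars] + [[0] + row for row in kdv_block]

print(det(full), det(kdv_block))
```

Changing the entries of `stars` leaves `det(full)` unchanged, which is the reason the first components of the basis vectors play no role in the identity $E(H; 0) = E_{KdV}(H)$ for this choice of bases.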
(II) Eigenvalues close to the imaginary axis. We show that we may continue the construction from Lemma 6.4 across the imaginary axis, outside a neighborhood of the origin.

Lemma 6.6. The reduced Evans function $E(H; \varepsilon)$ can be continued analytically in $H$ and continuously in $\varepsilon$ in a region $\{\mathrm{Re}\,H \ge -\nu,\ |H| \ge \nu\} \subset \mathbb{C}$.

Proof. We have to show that the stable and unstable subspaces $E^s(0)$ and $E^u(0)$ continue analytically in $H$ and continuously in $\varepsilon$ across the imaginary axis. This in turn is an immediate consequence of the gap lemma, Theorem 5.

Corollary 6.7. The scaled Evans function of the water-wave problem $E(H; \varepsilon)$ does not vanish in a region $\{\mathrm{Re}\,H \ge -\nu,\ |H| \ge \nu\} \subset \mathbb{C}$.

In order to finish the proof, it remains to exclude eigenvalues for the perturbed, scaled eigenvalue problem (6.14) in a neighborhood of the origin.

(III) Eigenvalues close to the origin. Finally, we address the crucial neighborhood of the origin. We may already suspect that transversality as above might not hold, since already the KdV-equation possesses an eigenvalue $H = 0$ of algebraic multiplicity two, embedded in the essential spectrum. Again, the strategy consists of first continuing the Evans function $E(H; \varepsilon)$ for the water-wave problem analytically in $H$ and continuously in $\varepsilon$ in a neighborhood of the origin. As a second step, we show how this Evans function is related to the Evans function of the Korteweg–de Vries equation, $E_{KdV}(H)$. The goal of this step is to conclude that for all $\varepsilon \ge 0$ sufficiently small, $E$ possesses at most three zeroes in a neighborhood of the origin, exploiting that the number of zeroes of an analytic function is invariant under small perturbations. We then conclude the stability proof by exhibiting two explicit eigenvectors in the kernel and an explicit principal vector in the generalized kernel. We start with some notational preliminaries for the asymptotic equation at $|\xi| = \infty$.
The eigenvalues of the linear equation on the right side of (6.13) at $H = 0$, $|\xi| = \infty$ are $\zeta_s = \zeta_u = 0$, a double zero eigenvalue, and $\zeta_{ss} = -\sqrt{\beta}$ and $\zeta_{uu} = \sqrt{\beta}$. The zero eigenvalue is geometrically simple with eigenvector $(1, 0, 0, 0)^T$. The central observation now is that for $H, \varepsilon \neq 0$ the zero eigenvalues unfold smoothly:
\[ \zeta_s = -\tfrac{1}{2}\varepsilon^2H + O(\varepsilon^2H^2), \qquad \zeta_u = 2H + O(H^2 + \varepsilon^2H). \]
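As a hedged numerical illustration of the two expansions above, consider the quadratic $-Z^2 + 2HZ + \varepsilon^2H^2 = 0$, which collects the terms of the scaled dispersion relation (6.14) that are relevant near $Z = 0$. Its roots are $Z = H(1 \pm \sqrt{1 + \varepsilon^2})$, so they depend smoothly (here even linearly) on $H$. The sketch compares them with $2H$ and $-\varepsilon^2H/2$; the sample values of $H$ and $\varepsilon$ are arbitrary choices for illustration.

```python
import cmath

def unfolded_zero_eigenvalues(H, eps):
    """Roots of -Z^2 + 2*H*Z + eps^2*H^2 = 0, returned as (zeta_s, zeta_u)."""
    root = cmath.sqrt(1.0 + eps * eps)
    return H * (1.0 - root), H * (1.0 + root)

H, eps = 0.1 + 0.05j, 1e-2  # illustrative sample values
zeta_s, zeta_u = unfolded_zero_eigenvalues(H, eps)
print("zeta_s:", zeta_s, " leading order:", -0.5 * eps ** 2 * H)
print("zeta_u:", zeta_u, " leading order:", 2.0 * H)
```

No square root of $H$ appears in either root, in contrast with a generic unfolding of a double zero eigenvalue.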
These expansions are readily computed from the Newton polygon to (6.14), with leading order contribution $-Z^2 + 2HZ + \varepsilon^2H^2$. Eigenvectors are smooth as well and given by $e_j = (-1, \zeta_j, \zeta_j^2, \zeta_j^3)$ for $j = s, u, ss, uu$. For $\varepsilon > 0$, $\mathrm{Re}\,H > 0$, the stable eigenspace is spanned by $E^s = \mathrm{span}\{e_s, e_{ss}\}$ and the unstable eigenspace by $E^u = \mathrm{span}\{e_u, e_{uu}\}$. At $H = 0$, we find a nontrivial intersection of stable and unstable subspaces $E^s \cap E^u = \mathrm{span}\{e_s\} = \mathrm{span}\{e_u\}$.

We emphasize that this smooth unfolding is non-generic: in a typical unfolding of the Jordan block with a parameter $H$, the eigenvalues are smooth functions of $\sqrt{H}$! The smooth unfolding here is due to reversibility: in the scaled dispersion relation (6.14), there is no linear term $H$, which would make the leading order contribution in the $Z$-$H$ Newton polygon for (6.14) to be $-Z^2 + H = 0$, with $Z \sim \sqrt{H}$. Reversibility implies invariance of the dispersion relation under $Z \mapsto -Z$ and $H \mapsto -H$, for all $\varepsilon$! It is this symmetry which excludes linear terms in $H$.

We next show that the subspaces $E^s(\xi)$ and $E^u(\xi)$, constructed for $H$ outside a neighborhood of zero above, can be continued analytically in $H$ and smoothly in $\varepsilon$ across this neighborhood.

Lemma 6.8. The Evans function $E(H; \varepsilon)$ to the scaled linearization about the solitary wave in the water-wave problem (6.14) possesses an analytic extension into an open neighborhood of the origin $|H| \le \nu_0$, which depends continuously on $\varepsilon \ge 0$ sufficiently small. The neighborhood is uniform in $\varepsilon$, that is, $\nu_0$ does not depend on $\varepsilon \ge 0$.

Proof. The construction very much relies, in the spirit of the gap lemma, Theorem 5, on a stable manifold theorem. However, we cannot apply the gap lemma directly, since additional hyperbolic eigenvalues are present, which actually are in resonance with spatial convergence of the coefficients at $H = 0$.

We compactify time $2\beta\xi = \log\frac{1+\tau}{1-\tau}$, $\tau \in [-1, 1]$, and obtain a smooth ($C^1$ in $\tau$ and analytic in $H$) differential equation, suspended with the equation $\tau_\xi = \beta(1 - \tau^2)$. The fibers $\tau = +1$ and $\tau = -1$ are invariant and describe the limiting situation at $\xi = \pm\infty$. In these fibers the dynamics possesses invariant subspaces which are the linear eigenspaces to the eigenvalues $\zeta_j$, $j = s, u, ss, uu$. In the $\tau$-direction, the asymptotic $\tau = \pm 1$-subspaces are linearly stable ($\tau = +1$) and linearly unstable ($\tau = -1$), respectively, with exponential rate $\pm 2\beta$.

The flows inside $\tau = 1$ and $\tau = -1$ are linear and coincide. Subspaces corresponding to eigenspaces and generalized eigenspaces are flow-invariant subspaces. For example, the two-dimensional subspace in $\tau = \pm 1$ corresponding to the generalized kernel for $H = 0$ can be viewed as a smooth, normally hyperbolic, local center-manifold. Inside this center-manifold, we find the particularly important flow-invariant subspaces $\mathrm{span}\{e_s\}$ in $\tau = +1$ and $\mathrm{span}\{e_u\}$ in $\tau = -1$.
The subspaces are analytic in $H$ and continuous in $\varepsilon$. They possess strong unstable and strong stable foliations, which are as smooth as the vector field. Indeed, we may smoothly transform variables, $B_j \mapsto B_je^{-\zeta_{s/u}\xi}$, to trivialize the flow inside the eigenspace, which consists of a line of equilibria after the rescaling. The foliations are then given as the strong stable manifolds of the equilibria in the eigenspaces. Analyticity follows from differentiability and the Cauchy-Riemann differential equations.

We denote by $W^{ss}(\mathrm{span}\{e_s\})$ the three-dimensional stable manifold of the subspace $\mathrm{span}\{e_s\}$ in the extended phase space $(\tau, B)$. Analogously, let $W^{uu}(\mathrm{span}\{e_u\})$ denote the three-dimensional unstable manifold of the subspace $\mathrm{span}\{e_u\}$. By construction, these manifolds are the smooth continuations of $E^s(\xi)$ and $E^u(\xi)$ that we already constructed in the region $\mathrm{Re}\,H > 0$: $W^{ss}(\mathrm{span}\{e_s\}) \cap \{\tau = 0\} = E^s(0)$ and $W^{uu}(\mathrm{span}\{e_u\}) \cap \{\tau = 0\} = E^u(0)$. Choosing analytic bases in these subspaces, and evaluating the determinant, we have continued the Evans function $E$ into a neighborhood of the origin $H = 0$ smoothly, analytically in $H$ and continuously in $\varepsilon$.

Remark 6.9. The above construction does not show that we can smoothly single out a particular one-dimensional subspace of initial conditions which converges to $\mathrm{span}\{e_u\}$ or $\mathrm{span}\{e_s\}$ faster than the other solutions, which is part of the proof of the gap lemma; see the proof of Proposition 6.3. In fact, we believe that this is in general impossible, since precisely at the origin, $H = 0$, the contracting and expanding eigenvalues $\zeta_{ss}$ and $\zeta_{uu}$ are equal to the rate of exponential approach in the $\xi$-direction, which makes it impossible to single out a strong stable or unstable direction.
The next step provides an expansion for $E(H; 0)$ near $H = 0$.

Lemma 6.10. There exists a nonzero coefficient $E_3 \neq 0$ such that $E(H; 0) = E_3H^3 + O(H^4)$.

Proof. For $\varepsilon = 0$, the linear equation (6.13) possesses a skew-product structure, already exploited in the previous paragraphs (I) and (II). In the KdV-subspace, the dynamics are independent of $B_0$. Stable and unstable subspaces $E_{KdV}^s(0)$ and $E_{KdV}^u(0)$ are well-defined. We may choose particular bases
\[ E_{KdV}^s(0) = \mathrm{span}\{B_{KdV}^{ss}(0)\}, \qquad E_{KdV}^u(0) = \mathrm{span}\{B_{KdV}^{uu}(0), B_{KdV}^u(0)\} \]
such that solutions in the KdV-subspace with these initial conditions satisfy
\[ \begin{aligned}
e^{-\zeta^{ss}\xi}B_{KdV}^{ss}(\xi) &\to b_{KdV}^{ss} && \text{for } \xi \to \infty,\\
e^{-\zeta^{uu}\xi}B_{KdV}^{uu}(\xi) &\to b_{KdV}^{uu} && \text{for } \xi \to -\infty,\\
e^{-\zeta^{u}\xi}B_{KdV}^{u}(\xi) &\to b_{KdV}^{u} && \text{for } \xi \to -\infty.
\end{aligned} \]
From these solutions, we are going to construct a basis of stable and unstable subspaces for the full water-wave problem (6.14), $E^s(0)$ and $E^u(0)$. We start with $E^s(0)$. First, $B^s \equiv (1, 0, 0, 0)^T$ is a $\xi$-independent, bounded solution and belongs to $E^s(0)$. The second basis vector is readily computed from $B_{KdV}^{ss}(\xi)$. Define
\[ B_0^{ss}(\xi) = \int_\infty^\xi \left(B_{KdV}^{ss}\right)_1(s)\,ds \qquad \text{and} \qquad \left(B_1^{ss}(\xi), B_2^{ss}(\xi), B_3^{ss}(\xi)\right)^T = B_{KdV}^{ss}(\xi). \]
Then $B^{ss}(\xi) = \left(B_0^{ss}(\xi), B_1^{ss}(\xi), B_2^{ss}(\xi), B_3^{ss}(\xi)\right)^T$ is exponentially decaying for $\xi \to \infty$ and $B^{ss}(0)$ is the desired second basis vector in $E^s(0)$. Similarly, we define
\[ B_0^{uu}(\xi) = \int_{-\infty}^\xi \left(B_{KdV}^{uu}\right)_1(s)\,ds \qquad \text{and} \qquad \left(B_1^{uu}(\xi), B_2^{uu}(\xi), B_3^{uu}(\xi)\right)^T = B_{KdV}^{uu}(\xi). \]
Then $B^{uu}(\xi) = \left(B_0^{uu}(\xi), B_1^{uu}(\xi), B_2^{uu}(\xi), B_3^{uu}(\xi)\right)^T$ is exponentially decaying for $\xi \to -\infty$ and $B^{uu}(0) \in E^u(0)$.
The same construction for $B_{KdV}^u$ would give a pole in $H = 0$ since the integral diverges due to slow exponential decay, $\zeta^u = 2H + O(H^2 + \varepsilon^2H)$,
\[ B_{KdV}^u(\xi) = b_{KdV}^ue^{\zeta^u\xi} + r(\xi) \]
with $r(\xi) = O(e^{(\zeta^u + \nu)\xi})$ for $\xi \to -\infty$ with some $\nu > 0$, uniformly in $H$ close to zero. We therefore rescale the KdV-eigenvector with $H$ and set
\[ \tilde B_{KdV}^u(\xi) = HB_{KdV}^u(\xi). \]
We then proceed as for $B^{uu}$ and define
\[ B_0^u(\xi) = \int_0^\xi \left(\tilde B_{KdV}^u\right)_1(s)\,ds + B_0^u(0) \]
with
\[ B_0^u(0) = \frac{Hb^u}{\zeta^u} + \int_{-\infty}^0 r(s)\,ds. \]
With this choice of $B_0^u(0)$, $B_0^u(\xi)$ decays to zero exponentially for $\mathrm{Re}\,H > 0$. Note that $B_0^u(0)$ is analytic in a neighborhood of $H = 0$ and that, with a suitable choice of $b^u$, we can arrange to have
\[ B_0^u(0) = 1 + O(H), \qquad \left(B_1^u(\xi), B_2^u(\xi), B_3^u(\xi)\right)^T = \tilde B_{KdV}^u(\xi). \]
Then $B^u(\xi) = \left(B_0^u(\xi), B_1^u(\xi), B_2^u(\xi), B_3^u(\xi)\right)^T$ is exponentially decaying for $\xi \to -\infty$ and $\mathrm{Re}\,H > 0$, and $B^u(0) \in E^u(0)$. The Evans function for the water-wave problem is then given by the determinant $E(H, 0) = \det(B^{uu}, B^u, B^s, B^{ss})$. Exploiting that $B^s(0) = (1, 0, 0, 0)^T$, we find that
\[ E(H, 0) = \det\left(B_{KdV}^{uu}, \tilde B_{KdV}^u, B_{KdV}^{ss}\right) = HE_{KdV}(H). \]
Together with Theorem 6 for the Evans function of the KdV equation, this proves the lemma.

Geometrically, the unfolding of the subspaces is as follows, roughly speaking. For $H = 0$, $B^u$ and $B^s$ coincide and $B^{ss}$ and $B^{uu}$ can be assumed to coincide as well. The weak directions, $B^s$ and $B^u$, cross transversely in $H = 0$, contributing a factor $H$ to $E$. The strong directions $B^{ss}$ and $B^{uu}$ unfold with quadratic tangency, just as in the KdV-equation, contributing a factor $H^2$ to $E$.

By continuity in $\varepsilon$ and analyticity in $H$, Lemma 6.8, we conclude using Rouché's theorem that for $\varepsilon > 0$ small, $E(H; \varepsilon)$ possesses precisely three roots close to the origin, counted with multiplicity. The following lemma therefore shows that there are indeed no unstable eigenvalues in a small enough neighborhood of the origin.
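The counting step can be illustrated on a toy function. The sketch below is not the Evans function of the paper: it uses the made-up analytic function $f_\varepsilon(H) = H^3e^H + \varepsilon\cos H$, which has a triple zero at the origin for $\varepsilon = 0$, and counts its zeros inside a fixed disc via the winding number of $f_\varepsilon$ along the boundary circle (argument principle). For small $\varepsilon$ the count stays at three, as Rouché's theorem predicts. The radius, the number of sample points and the form of the perturbation are illustrative assumptions.

```python
import cmath
import math

def count_zeros_in_disc(f, radius, samples=4000):
    """Number of zeros of an analytic f inside |H| < radius, computed as the
    winding number of f along the circle |H| = radius.
    Assumes f has no zeros on the circle itself."""
    total_phase = 0.0
    previous = f(complex(radius, 0.0))
    for k in range(1, samples + 1):
        current = f(radius * cmath.exp(2j * math.pi * k / samples))
        total_phase += cmath.phase(current / previous)  # increment in (-pi, pi]
        previous = current
    return round(total_phase / (2.0 * math.pi))

def toy_evans(eps):
    """Hypothetical stand-in with a triple zero at H = 0 when eps = 0."""
    return lambda H: H ** 3 * cmath.exp(H) + eps * cmath.cos(H)

for eps in (0.0, 1e-4, 1e-3):
    print(eps, count_zeros_in_disc(toy_evans(eps), radius=0.5))
```

On the circle $|H| = 0.5$ the unperturbed term satisfies $|H^3e^H| \ge 0.125\,e^{-0.5}$, which dominates $\varepsilon|\cos H|$ for the values of $\varepsilon$ used, so the hypothesis of Rouché's theorem holds in this toy setting.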
Lemma 6.11. The Evans function for the water-wave problem $E(H; \varepsilon)$ possesses a triple root in the origin for all $\varepsilon \ge 0$ sufficiently small.

Proof. Let $H = 0$. We find for $\varepsilon \ge 0$ a two-dimensional intersection of $E^s(0)$ and $E^u(0)$, generated by the derivative of the solitary wave and the translation of the potential $(1, 0, 0, 0)^T$. Indeed, by construction, Lemma 6.8, any bounded solution necessarily lies in the intersection, since solutions which at $H = 0$ do not belong to the intersection grow at least linearly. From Galilean invariance, we find the exponentially localized derivative of the solitary wave with respect to the wave speed as a principal vector to the derivative of the solitary wave. Following [PW92], we conclude that $E(H; \varepsilon)$ possesses at least a triple zero in $H = 0$. On the other hand, Lemma 6.10 shows that the multiplicity is at most three. This proves the lemma.

6.3. Proof of Proposition 6.1. We conclude the proof of absence of point spectrum in the right half plane. First, we showed in Sect. 6.1 that there are no unstable eigenvalues in a neighborhood of the imaginary axis, up to possible eigenvalues with large imaginary part or in a neighborhood of the origin. We then showed in Sect. 6.2.2 that eigenvalues in a neighborhood of the origin necessarily scale with $\varepsilon^3$, justifying the Korteweg–de Vries scaling. Finally, we showed in Sect. 6.2.4 that in the Korteweg–de Vries scaling, there are no unstable eigenvalues. The main part was a perturbation argument, based on the construction of an analytic Evans function. We showed that any eigenvalue is a root of an analytic function $E(H; \varepsilon)$. We then continued $E(H; \varepsilon)$ analytically in an open neighborhood of $H = 0$, for $\varepsilon \ge 0$. Lemma 6.10 showed that there are at most three eigenvalues in a neighborhood of zero, counting multiplicity, and Lemma 6.11 showed that all three eigenvalues are located in zero, for $\varepsilon \ge 0$ sufficiently small.
This proves spectral stability up to possible eigenvalues with imaginary part tending to ∞ as ε → 0, Proposition 6.1. Acknowledgement. The authors gratefully acknowledge financial support by DAAD/Procope, Nr. D/0031082 and F/03132UD.
References

[AGJ90] Alexander, J., Gardner, R. and Jones, C.K.R.T.: A topological invariant arising in the stability analysis of traveling waves. J. Reine Angew. Math. 410, 167–212 (1990)
[AK89] Amick, C.J. and Kirchgässner, K.: A theory of solitary water-waves in the presence of surface tension. Arch. Rational Mech. Anal. 105, 1–49 (1989)
[Be67] Benjamin, T.B.: Instability of periodic wavetrains in nonlinear dispersive systems. Proc. Roy. Soc. Lond. A 299, 59–75 (1967)
[Be72] Benjamin, T.B.: The stability of solitary waves. Proc. R. Soc. Lon. A 328, 153–183 (1972)
[BF67] Benjamin, T.B. and Feir, J.E.: The disintegration of wave trains on deep water, Part 1. J. Fluid Mech. 27, 417–430 (1967)
[BO80] Benjamin, T.B. and Olver, P.: Hamiltonian structure, symmetries and conservation laws for water waves. J. Fluid Mech. 125, 137–185 (1982)
[BSS87] Bona, J.L., Souganidis, P.E. and Strauss, W.A.: Stability and instability of solitary waves of Korteweg-de Vries type. Proc. R. Soc. Lon. A 411, 395–412 (1987)
[Bou] Boussinesq, M.J.: Essai sur la théorie des eaux courantes. Mémoires présentés par divers savants à l'Académie des Sciences Inst. France (séries 2) 23, 1–680 (1877)
[BM95] Bridges, T.J. and Mielke, A.: A proof of the Benjamin-Feir instability. Arch. Rational Mech. Anal. 133, 145–198 (1995)
[Co78] Coppel, W.A.: Dichotomies in stability theory. Lect. Notes Math. 629. Berlin: Springer, 1978
[Cr85] Craig, W.: An existence theory for water waves and the Boussinesq and Korteweg-de Vries scaling limits. Comm. Partial Diff. Eq. 10, 787–1003 (1985)
[Ev72] Evans, J.: Nerve axon equations (iii): Stability of the nerve impulses. Indiana Univ. Math. J. 22, 577–594 (1972)
[GZ98] Gardner, R. and Zumbrun, K.: The gap lemma and geometric criteria for instability of viscous shock profiles. Comm. Pure Appl. Math. 51, 797–855 (1998)
[Ha96] Haragus, M.: Model equations for water waves in the presence of surface tension. Eur. J. Mech. B/Fluids 15, 471–492 (1996)
[HS01] Haragus, M. and Scheel, A.: Linear stability and instability of ion-acoustic plasma solitary waves. Preprint.
[IS92] Il'ichev, A.T. and Semenov, A.Y.: Stability of solitary waves in dispersive media described by a fifth-order evolution equation. Theoret. Comput. Fluid Dynamics 3, 307–326 (1992)
[KN79] Kano, T. and Nishida, T.: Sur les ondes de surface de l'eau avec une justification mathématique des équations des ondes en eau peu profonde. J. Math. Kyoto Univ. 19, 335–370 (1979)
[KN86] Kano, T. and Nishida, T.: A mathematical justification for Korteweg-de Vries and Boussinesq equation of water surface waves. Osaka J. Math. 23, 389–413 (1986)
[KS98] Kapitula, T. and Sandstede, B.: Stability of bright solitary-wave solutions to perturbed nonlinear Schrödinger equations. Physica D 124, 58–103 (1998)
[Ka72] Kawahara, T.: Oscillatory solitary waves in dispersive media. Phys. Soc. Japan 33, 260–264 (1972)
[Ki88] Kirchgässner, K.: Nonlinearly Resonant Surface Waves and Homoclinic Bifurcation. Adv. Appl. Mech. 26, 135–181 (1988)
[KdV] Korteweg, D.J. and de Vries, G.: On the change of form of long waves advancing in a rectangular channel, and on a new type of long stationary waves. Phil. Mag. 5, 422–443 (1895)
[LH84] Longuet-Higgins, M.S.: On the stability of steep gravity waves. Proc. R. Soc. Lon. A 396, 269–280 (1984)
[LHT97] Longuet-Higgins, M.S. and Tanaka, M.: On the crest instabilities of steep surface waves. J. Fluid Mech. 336, 51–68 (1997)
[MS86] MacKay, R.S. and Saffman, P.G.: Stability of water waves. Proc. R. Soc. Lon. A 406, 115–125 (1986)
[Mc82] McLean, J.W.: Instabilities of finite-amplitude water waves. J. Fluid Mech. 114, 315–330 (1982)
[Mi88] Mielke, A.: Reduction of quasilinear elliptic equations in cylindrical domains with applications. Math. Meth. Appl. Sci. 10, 51–66 (1988)
[Na74] Nalimov, V.I.: The Cauchy-Poisson Problem (in Russian). Dynamik Splosh. Sredy 18, 104–210 (1974)
[PW92] Pego, R.L. and Weinstein, M.I.: Eigenvalues, and instabilities of solitary waves. Philos. Trans. Roy. Soc. Lond. A 340, 47–94 (1992)
[PW96] Pego, R.L. and Weinstein, M.I.: Asymptotic stability of solitary waves. Commun. Math. Phys. 164, 305–349 (1996)
[PW97] Pego, R.L. and Weinstein, M.I.: Convective linear stability of solitary waves for Boussinesq equations. Stud. Appl. Math. 99, 311–375 (1997)
[RS95] Robbin, J. and Salamon, D.: The spectral flow and the Maslov index. Bull. London Math. Soc. 27, 1–33 (1995)
[Sa91] Sachs, R.L.: On the existence of small amplitude solitary waves with strong surface tension. J. Diff. Equ. 90, 31–51 (1991)
[Sa85] Saffman, P.G.: The superharmonic instability of finite amplitude water waves. J. Fluid Mech. 159, 169–174 (1985)
[SW00] Schneider, G. and Wayne, C.E.: The long wave limit for the water wave problem I. The case of zero surface tension. Comm. Pure Appl. Math. 53, 1475–1535 (2000)
[Ta86] Tanaka, M.: The stability of solitary waves. Phys. Fluids 29, 650–655 (1986)
[Wh67] Whitham, G.B.: Nonlinear dispersion of water-waves. J. Fluid Mech. 27, 399–412 (1967)
[Wu97] Wu, S.: Well-posedness in Sobolev spaces of the full water wave problem in 2-D. Invent. Math. 130, 39–72 (1997)
[Yo82] Yosihara, H.: Gravity waves on the free surface of an incompressible perfect fluid of finite depth. RIMS Kyoto 18, 49–96 (1982)
[Za68] Zakharov, V.E.: Stability of periodic waves of finite amplitude on the surface of a deep fluid. J. Appl. Mech. Tech. Phys. 2, 190–194 (1968)
Communicated by P. Constantin
Commun. Math. Phys. 225, 523 – 549 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
One Dimensional Behavior of Singular N Dimensional Solutions of Semilinear Heat Equations

Hatem Zaag^{1,2}

1 Courant Institute, New York University, 251 Mercer Street, New York, NY 10012, USA
2 Département de Mathématiques et Applications, CNRS UMR 8553, École Normale Supérieure, 45 rue d’Ulm, 75005 Paris, France. E-mail: [email protected]

Received: 20 June 2001 / Accepted: 6 October 2001
Abstract: We consider $u(x,t)$, a solution of $u_t = \Delta u + |u|^{p-1}u$ that blows up at time $T$, where $u : \mathbb{R}^N \times [0,T) \to \mathbb{R}$, $p > 1$, $(N-2)p < N+2$ and either $u(0) \ge 0$ or $(3N-4)p < 3N+8$. We are concerned with the behavior of the solution near a non isolated blow-up point, as $T - t \to 0$. Under a non-degeneracy condition and assuming that the blow-up set is locally continuous and $N-1$ dimensional, we escape logarithmic scales of the variable $T-t$ and give a sharper expansion of the solution with the much smaller error term $(T-t)^{1/2-\eta}$ for any $\eta > 0$. In particular, if in addition $p > 3$, then the solution is very close to a superposition of one dimensional solutions as functions of the distance to the blow-up set. Finally, we prove that the mere hypothesis that the blow-up set is continuous implies that it is $C^{1,1/2-\eta}$ for any $\eta > 0$.

1. Introduction

In this paper, we are mainly concerned with the blow-up behavior at non-isolated blow-up points of the following semilinear heat equation:
$$u_t = \Delta u + |u|^{p-1}u, \qquad u(\cdot,0) = u_0 \in L^\infty(\mathbb{R}^N), \tag{1}$$
where $u(t) : x \in \mathbb{R}^N \to u(x,t) \in \mathbb{R}$ and $\Delta$ stands for the Laplacian in $\mathbb{R}^N$. We assume in addition that the exponent $p > 1$ is subcritical: if $N \ge 3$ then $1 < p < (N+2)/(N-2)$. Moreover, we assume that
$$\text{either } u_0 \ge 0 \text{ or } (3N-4)p < 3N+8. \tag{2}$$
This problem has attracted a lot of attention because it captures features common to a whole range of blow-up problems arising in various physical situations; particularly it highlights the role of scaling and self-similarity. Among related equations, we mention: the motion by mean curvature, surface diffusion (Bernoff, Bertozzi and Witelski [1]) and
chemotaxis (Brenner et al. [3], Betterton and Brenner [2]). However, Eq. (1) is simple enough to be tractable in rigorous mathematical terms, unlike other physical equations. In this work, we build up tools that may be useful in more physical situations. As a matter of fact, in Sect. 5 we will mention connections with a chemotaxis problem.

The behavior near singular points is a major concern in all singularity problems. One general idea of this work is to find out how to refine the singular behavior beyond first order terms and reach significantly small error terms. Through a change of variables, singular behavior reduces to the asymptotic behavior of some PDE when a small positive parameter $\epsilon$ goes to zero. For the heat equation (1), $\epsilon = T - t \to 0$, where $T$ is the blow-up time. In previous work, an explicit profile is found to be a good first order approximation, up to $\nu^\alpha$ where $\nu = -1/\log\epsilon$ and $\alpha > 0$. Further refinements in this direction should give an expansion of the solution in terms of powers of $\nu$, i.e., in logarithmic scales of $\epsilon$ (see Stewartson and Stuart [18]). Logarithmic scales also arise in some singular perturbation problems such as low Reynolds number fluids and some vibrating membranes studies (see Ward [20] and the references therein, see also Segur and Kruskal [17] for a Klein–Gordon equation). Since $\nu$ goes to zero slowly, infinite logarithmic series may be of only limited practical use in approximating the exact solution. Relevant approximations, i.e., approximations up to lower order terms such as $\epsilon^\beta$ for $\beta > 0$, lie beyond all logarithmic scales. In this work, our idea to capture such relevant terms is to abandon the explicit profile function obtained as a first order approximation, and take a less explicit function as a first order description of the singular behavior. Both formulations agree to the first order. Through scaling and matching, we can reach the order $\epsilon^\beta$ by iterating the expansion around the less explicit function.
A second general idea in this work is to see how more constraints on the singular set yield more regularity for that set. This idea is found in studies of free boundary problems, where over determined boundary conditions yield regularity of the free boundary. In this work, we focus on the case where the blow-up set of (1) is a continuum. The mere hypothesis that the blow-up set is continuous, which is an unstable situation (see Sect. 5), adds constraints in the problem, yielding $C^{1,\alpha}$ regularity for the blow-up set.

1.1. Blow-up behavior in logarithmic scales of $T - t$. A solution $u(t)$ to (1) blows up in finite time if its maximal existence time $T$ is finite. In this case,
$$\lim_{t\to T} \|u(t)\|_{H^1(\mathbb{R}^N)} = \lim_{t\to T} \|u(t)\|_{L^\infty(\mathbb{R}^N)} = +\infty.$$
Let us consider such a solution. $T$ is called the blow-up time of $u$. A point $a \in \mathbb{R}^N$ is called a blow-up point if $|u(x,t)| \to +\infty$ as $(x,t) \to (a,T)$ (this definition is equivalent to the usual local unboundedness definition, because of Corollary 2 in Merle and Zaag [15]). $S$ denotes the blow-up set, i.e., the set of all blow-up points. From [15], we know that there exists a blow-up profile $u^* \in C^2_{loc}(\mathbb{R}^N \setminus S)$ such that
$$u(x,t) \to u^*(x) \text{ in } C^2_{loc}(\mathbb{R}^N \setminus S) \text{ as } t \to T. \tag{3}$$
Given $\hat a \in S$, we know from Velázquez [19] that up to some scalings, $u$ approaches a particular explicit function near the singularity $(\hat a, T)$. We consider the case where for all $K_0 > 0$,
$$\sup_{|z| \le K_0} \left| (T-t)^{\frac{1}{p-1}}\, u\!\left(\hat a + Q_{\hat a} z \sqrt{(T-t)|\log(T-t)|},\, t\right) - f_{l_{\hat a}}(z) \right| \to 0 \tag{4}$$
as $t \to T$, where $Q_{\hat a}$ is an orthonormal $N \times N$ matrix, $l_{\hat a} = 1, \dots, N$, and
$$f_l(z) = \left( p - 1 + \frac{(p-1)^2}{4p} \sum_{i=1}^{l} z_i^2 \right)^{-\frac{1}{p-1}}. \tag{5}$$
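As a small numerical illustration of the profile (5), the sketch below evaluates $f_l$ and checks that its value at the origin is $(p-1)^{-1/(p-1)}$, the constant called $\kappa$ in (20). The function names `kappa` and `profile` and the choice $p = 3$ are assumptions made for this example only.

```python
import math

def kappa(p):
    """The constant (p-1)^(-1/(p-1)), named kappa in (20)."""
    return (p - 1.0) ** (-1.0 / (p - 1.0))

def profile(z, p):
    """f_l(z) from (5); z is the sequence (z_1, ..., z_l)."""
    r2 = sum(zi * zi for zi in z)
    return (p - 1.0 + (p - 1.0) ** 2 / (4.0 * p) * r2) ** (-1.0 / (p - 1.0))

p = 3.0  # illustrative exponent, not taken from the paper
f_at_zero = profile([0.0], p)
# f_1 sampled at increasing |z|: the profile decays away from the origin.
values = [profile([r], p) for r in (0.0, 1.0, 2.0, 4.0)]
```

The profile is radially decreasing in $(z_1, \dots, z_l)$ and does not depend on the remaining $N - l$ coordinates.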
Other behaviors with the scaling $(T-t)^{-\frac{1}{2k}}(x - \hat a)$ where $k = 2, 3, \dots$ may occur (see [19]). We suspect them to be unstable. If $l_{\hat a} = N$, then $\hat a$ is an isolated blow-up point. An extensive literature is devoted to this case (Weissler [21], Bricmont and Kupiainen [5], Herrero and Velázquez [12] and [19], ...). We have proved the stability of such a behavior with Fermanian and Merle in [8]. The key argument in our proof was the following Liouville Theorem proved by Merle and Zaag in [13] and [15]:

Consider $U$ a solution of (1) defined for all $(x,t) \in \mathbb{R}^N \times (-\infty, T)$ such that for all $(x,t) \in \mathbb{R}^N \times (-\infty, T)$, $|U(x,t)| \le C(T-t)^{-\frac{1}{p-1}}$. Then, either $U \equiv 0$ or $U(x,t) = \pm\left[(p-1)(T^* - t)\right]^{-\frac{1}{p-1}}$ for some $T^* \ge T$.

When $l_{\hat a} = N$, the blow-up behavior of $u(x,t)$ near the isolated blow-up point $\hat a$ is already contained in (4), which shows that the profile of $u(x,t)$ is a function of a one dimensional variable:
$$u(x,t) \sim (T-t)^{-\frac{1}{p-1}} f_1\!\left( \frac{d(x,S)}{\sqrt{(T-t)|\log(T-t)|}} \right), \tag{6}$$
since $S = \{\hat a\}$ and $d(x,S) = |x - \hat a|$ when $x$ is close to $\hat a$. This description remains valid even when $\hat a$ is not isolated, as we will show later.

The case $l_{\hat a} < N$ is known to occur, namely when $u$ is invariant with respect to some coordinates. However, when $l_{\hat a} < N$, we cannot even tell whether $\hat a$ is isolated or not. The first singularity description was obtained in [23]. For simplicity, we assume that locally near $\hat a$, $S$ is a $(N - l_{\hat a})$-dimensional $C^1$ manifold. We have shown in Theorems 3 and 4 in [23] that for some $t_0 < T$ and $\delta > 0$, for all $K_0 > 0$, $t \in [t_0, T)$ and $x \in B(\hat a, \delta)$ such that $d(x,S) \le K_0 \sqrt{(T-t)|\log(T-t)|}$, we have
$$\left| (T-t)^{\frac{1}{p-1}} u(x,t) - f_1\!\left( \frac{d(x,S)}{\sqrt{(T-t)|\log(T-t)|}} \right) \right| \le C_0(K_0) \frac{\log|\log(T-t)|}{|\log(T-t)|}, \tag{7}$$
where $f_1$ is defined in (5). Note that formally, this is the same description as in the case $l_{\hat a} = N$, where $\hat a$ was isolated (see (6)). The variable $d(x,S)$, normal to $S$, appears as the blow-up variable that determines the size of $u$. The major step in [23] is the proof of the stability of the behavior (4) in a neighborhood of $\hat a$ in $S$. The key argument in getting this stability is the Liouville Theorem of [15], stated earlier in this section.

The error term in (7) shows that we fall in logarithmic scales of the small parameter $\epsilon = T - t$. In this paper, we do better, and get to error terms of order $(T-t)^\alpha$ with $\alpha > 0$. Following the ideas of the Introduction, we will replace the explicit profile $f_1$ by a less explicit function, and then go beyond all logarithmic scales, through scaling and matching.
1.2. Blow-up behavior beyond all logarithmic scales of $T - t$. A natural candidate for this non explicit function is simply a one dimensional solution of (1) that has the same profile $f_1$. It is classical that there exists a one dimensional even function $\tilde u(x_1, t)$, solution of (1), which decays on $(0, \infty)$ and blows up at time $T$ only at the origin, with the profile $f_1$, in the sense that for all $K_0 > 0$ and $t \in [t_0, T)$, if $|x_1| \le K_0 \sqrt{(T-t)|\log(T-t)|}$, then
$$\left| (T-t)^{\frac{1}{p-1}} \tilde u(x_1, t) - f_1\!\left( \frac{x_1}{\sqrt{(T-t)|\log(T-t)|}} \right) \right| \le C_0(K_0) \frac{\log|\log(T-t)|}{|\log(T-t)|} \tag{8}$$
(see Appendix A for a proof of this fact). Hence, it follows from (7) that for all $K_0 > 0$, $t \in [t_0, T)$ and $x \in B(\hat a, \delta)$ such that $d(x,S) \le K_0 \sqrt{(T-t)|\log(T-t)|}$, we have
$$(T-t)^{\frac{1}{p-1}} \left| u(x,t) - \tilde u(d(x,S), t) \right| \le C(K_0) \frac{\log|\log(T-t)|}{|\log(T-t)|}. \tag{9}$$
This estimate remains valid even if we replace $\tilde u(d(x,S), t)$ by any $\tilde u_{\sigma(x,t)}(d(x,S), t)$, where $\tilde u_\sigma$ is defined by
$$\tilde u_\sigma(x_1, t) = e^{-\frac{\sigma}{p-1}}\, \tilde u\!\left( e^{-\frac{\sigma}{2}} x_1,\; T - e^{-\sigma}(T-t) \right), \tag{10}$$
provided that $|\sigma(x,t)| \le C(K_0)$. Indeed, for any $\sigma \in \mathbb{R}$, $\tilde u_\sigma$ is still a blow-up solution of (1) with the same properties and the same profile (8) as $\tilde u$. Moreover, $\tilde u_\sigma \ne \tilde u$, unless $\sigma = 0$, because $\tilde u$ is not self-similar (see Appendix A). For each blow-up point $a$ near $\hat a$, we will suitably choose this free scaling parameter $\sigma = \sigma(a)$ so that the difference $(T-t)^{\frac{1}{p-1}} \left| u(x,t) - \tilde u_{\sigma(a)}(d(x,S), t) \right|$ along the normal direction to $S$ at $a$ is minimum. Following the ideas of the Introduction, if we refine the expansion about this well chosen, though less explicit, function $\tilde u_{\sigma(a)}(d(x,S), t)$, then we escape logarithmic scales. In particular, if $p > 3$, then the difference $u(x,t) - \tilde u_{\sigma(a)}(d(x,S), t)$ is bounded and goes to zero as $t \to T$, although both functions blow up. This can be done only when $l_{\hat a} = 1$, which corresponds to a $(N-1)$-dimensional blow-up set, according to [23]. We claim the following:

Theorem 1 (The N dimensional solution seen as a superposition of one dimensional solutions of the normal variable to the blow-up set, with a suitable dilation). Assume $N \ge 2$ and consider $u$ a solution of (1) that blows up at time $T$ on a set $S$ which is a $(N-1)$-dimensional $C^1$ manifold, locally near $\hat a$. If $u$ behaves as stated in (4) near $(\hat a, T)$ with $l_{\hat a} = 1$ and if $p > 3$, then for all $t \in [t_1, T)$ and $x \in B(\hat a, \delta)$ such that $d(x,S) < \epsilon_0$ for some $t_1 < T$, $\delta > 0$ and $\epsilon_0 > 0$, we have
$$\left| u(x,t) - \tilde u_{\sigma(P_S(x))}(d(x,S), t) \right| \le h(x,t) < M < +\infty, \tag{11}$$
where $P_S(x)$ is the projection of $x$ over $S$ and $h(x,t) \to 0$ as $d(x,S) \to 0$ and $t \to T$.

Thus, when $p > 3$, all the singular terms of $u$ in a neighborhood of $(\hat a, T)$ are contained in the rescaled one dimensional solution $\tilde u_{\sigma(P_S(x))}(d(x,S), t)$, which shows that in a tubular neighborhood of the blow-up set $S$, the space variable splits into 2 independent variables:
– A primary variable, $d(x,S)$, normal to $S$. It accounts for the main singular term of $u$ and gives the size of $u(x,t)$, as already shown in the old formulation (9), which follows directly from [23].
– A secondary variable, $P_S(x)$, whose effect is sharper. Through the optimal choice of the dilation $\sigma(P_S(x))$, it absorbs all next singular terms in the normal direction to $S$ at $P_S(x)$.

Similar ideas are used by Betterton and Brenner [2] in a chemotaxis model; see Sect. 5 for a short discussion of connections with that work. We would like to mention that we have successfully used this idea of modulation of the dilation with Fermanian in [9] to prove that for $N = 1$ and $p \ge 3$, there is only one blow-up solution of (1) with the profile (4), up to a bounded function and to the invariances of the equation (the dilation and translations in space and in time). Theorem 1 is a direct consequence of the following result, which is valid also for $1 < p \le 3$.

Theorem 2 (Blow-up behavior and profile near a blow-up point where u behaves as in (4) assuming S is locally a (N − 1)-dimensional manifold). Under the hypotheses of Theorem 1 and without the restriction $p > 3$, there exist $t_1 < T$ and $\epsilon_0 > 0$ such that for all $x \in B(\hat a, \delta)$ such that $d(x,S) \le \epsilon_0$, we have the following:
(i) For all $t \in [t_1, T)$,
$$\left| u(x,t) - \tilde u_{\sigma(P_S(x))}(d(x,S), t) \right| \le C\, m_M \left\{ (T-t)^{\frac{p-3}{2(p-1)}} |\log(T-t)|^{\frac{3}{2} + C_0},\; d(x,S)^{\frac{p-3}{p-1}} |\log d(x,S)|^{\frac{p}{p-1} + C_0} \right\}, \tag{12}$$
where $P_S(x)$ is the projection of $x$ over $S$, $m_M = \min$ if $1 < p \le 3$ and $m_M = \max$ if $p > 3$.
(ii) If $x \notin S$, then $u(x,t) \to u^*(x)$ as $t \to T$ and
$$\left| u^*(x) - e^{-\frac{\sigma(P_S(x))}{p-1}}\, \tilde u^*\!\left( e^{-\frac{\sigma(P_S(x))}{2}} d(x,S) \right) \right| \le C\, d(x,S)^{\frac{p-3}{p-1}} |\log d(x,S)|^{\frac{p}{p-1} + C_0},$$
where $\tilde u^*(x_1) = \lim_{t \to T} \tilde u(x_1, t)$.

Remark. In [23], we have obtained the following explicit equivalent for $u^*$:
$$u^*(x) \sim \left( \frac{8p}{(p-1)^2} \frac{|\log d(x,S)|}{d(x,S)^2} \right)^{\frac{1}{p-1}} \sim \tilde u^*(d(x,S)) \text{ as } d(x,S) \to 0.$$
Our new estimate shows that up to a suitable dilation, all the next terms in the expansion of $u^*$ up to the order $d(x,S)^{\frac{p-3}{p-1}} |\log d(x,S)|^{\frac{p}{p-1} + C_0}$ are the same as the particular one dimensional solution.
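The dilation (10) can be sanity-checked on the one solution of (1) that is known in closed form, the space-independent solution $[(p-1)(T-t)]^{-1/(p-1)}$. This is only a sketch: the helper names `dilate`, `ode_solution` and `similarity_time` and the numerical values are assumptions chosen for illustration. Because that solution is self-similar, the dilation leaves it unchanged, in contrast with $\tilde u$; the sketch also checks that the dilation amounts to the shift $s \mapsto s + \sigma$ of the time variable $s = -\log(T-t)$.

```python
import math

def dilate(u, sigma, p, T):
    """Return u_sigma from (10) for a function u(x1, t)."""
    def u_sigma(x1, t):
        return math.exp(-sigma / (p - 1.0)) * u(
            math.exp(-sigma / 2.0) * x1,
            T - math.exp(-sigma) * (T - t),
        )
    return u_sigma

def ode_solution(p, T):
    """Space-independent solution [(p-1)(T-t)]^(-1/(p-1)) of (1)."""
    return lambda x1, t: ((p - 1.0) * (T - t)) ** (-1.0 / (p - 1.0))

def similarity_time(t, T):
    """s = -log(T - t)."""
    return -math.log(T - t)

# Illustrative values (assumptions, not from the paper).
p, T, sigma = 3.0, 1.0, 0.7
u = ode_solution(p, T)
u_sigma = dilate(u, sigma, p, T)
samples = [(0.0, 0.5), (0.3, 0.9), (-1.2, 0.99)]
max_gap = max(abs(u(x, t) - u_sigma(x, t)) for x, t in samples)

# Time seen by the rescaled solution, expressed in the variable s.
t = 0.9
s_shift = similarity_time(T - math.exp(-sigma) * (T - t), T) - similarity_time(t, T)
```

Here `max_gap` vanishes up to rounding and `s_shift` equals $\sigma$, consistent with the blow-up time $T$ being preserved by (10).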
1.3. $C^{1,\alpha}$ regularity of the blow-up set. The splitting of the space variable $x$ into $d(x,S)$ and $P_S(x)$, as shown in (12), induces a geometric constraint on the blow-up set $S$, leading to more regularity on $S$.

Proposition 3 ($C^{1,\frac12-\eta}$ regularity for $S$ and $C^{1-\eta}$ regularity for the dilation $\sigma$). Under the hypotheses of Theorem 2, $S$ is the graph of a function $\varphi \in C^{1,\frac12-\eta}(B_{N-1}(0,\delta_1), \mathbb{R})$, locally near $\hat a$, and $\sigma$ is a $C^{1-\eta}$ function, for any $\eta > 0$. More precisely, there is an $h_0 > 0$ such that for all $|\xi| < \delta_1$ and $|h| < h_0$ such that $|\xi + h| < \delta_1$, we have
$$|\varphi(\xi + h) - \varphi(\xi) - h \cdot \varphi'(\xi)| \le C |h|^{3/2} |\log|h||^{\frac12 + C_0},$$
$$|\sigma(\xi, \varphi(\xi)) - \sigma(\xi + h, \varphi(\xi + h))| \le C |h| |\log|h||^{3 + C_0}.$$

The regularity of the blow-up set $S$ is our second concern in this paper. We know from Velázquez [19] that the $(N-1)$-dimensional Hausdorff measure of $S$ is bounded on compact sets. Under a local non-degeneracy condition, we have proved in [23] that if $S$ locally contains a continuum, then $S$ is locally a $C^1$ manifold of dimension $k = 1, \dots, N-1$. Since Proposition 3 derives $C^{1,\frac12-\eta}$ regularity assuming $C^1$ regularity, we can weaken the hypotheses of Proposition 3 and get a stronger version that derives $C^{1,\frac12-\eta}$ regularity just assuming continuity.

Stating this new version requires additional technical notation. We consider a non-isolated blow-up point $\hat a$ where $u$ has the behavior (4) with $l_{\hat a} = 1$. We may take $Q_{\hat a} = \mathrm{Id}$. According to Theorem 2 in [19], for all $\epsilon > 0$, there is $\delta(\epsilon) > 0$ such that
$$S \cap B(\hat a, \delta) \subset \Omega_{\hat a, \pi, \epsilon} \equiv \left\{ x \;\middle|\; |P_\pi(x - \hat a)| \ge (1 - \epsilon)|x - \hat a| \right\},$$
where $P_\pi$ is the orthogonal projection over $\pi$, the subspace spanned by $e_2, \dots, e_N$. Note that $\Omega_{\hat a, \pi, \epsilon}$ is a cone with vertex $\hat a$ that shrinks to $\hat a + \pi$ as $\epsilon \to 0$. In fact, $\hat a + \pi$ is the candidate for the tangent plane to $S$ at $\hat a$. We assume there is $a \in C((-1,1)^{N-1}, \mathbb{R}^N)$ such that $a(0) = \hat a$ and $\mathrm{Im}\, a \subset S$, where $\mathrm{Im}\, a$ is at least $(N-1)$-dimensional in the sense that
$$\forall b \in \mathrm{Im}\, a, \text{ there are } (N-1) \text{ independent vectors } v_1, \dots, v_{N-1} \text{ in } \mathbb{R}^N \text{ and } a_1, \dots, a_{N-1} \text{ functions in } C^1([0,1], S) \text{ such that } a_i(0) = b \text{ and } a_i'(0) = v_i. \tag{13}$$
This hypothesis means that $b$ is actually non-isolated in $(N-1)$ independent directions. We also assume that $\hat a = a(0)$ is not an endpoint in $\mathrm{Im}\, a$ in the sense that
$$\forall \epsilon > 0, \text{ the projection of } a((-\epsilon, \epsilon)^{N-1}) \text{ on the plane } \hat a + \pi \text{ contains an open ball with center } \hat a. \tag{14}$$
We claim the following:

Theorem 4 (Regularity of the blow-up set near a point with the behavior (4) assuming S contains a (N − 1)-dimensional continuum). Take $N \ge 2$ and consider $u$ a solution of (1) that blows up at time $T$ on a set $S$ and take $\hat a \in S$, where $u$ behaves locally as stated in (4) with $l_{\hat a} = 1$. Consider $a \in C((-1,1)^{N-1}, \mathbb{R}^N)$ such that $\hat a = a(0) \in \mathrm{Im}\, a \subset S$ and $\mathrm{Im}\, a$ is at least $(N-1)$-dimensional in the sense (13). If $\hat a$ is not an endpoint (in the sense (14)), then there are $\delta > 0$, $\delta_1 > 0$ and $\varphi \in C^{1,\frac12-\eta}(B_{N-1}(0,\delta_1), \mathbb{R})$ (for any $\eta > 0$) such that
$$S \cap B(\hat a, 2\delta) = \mathrm{graph}\, \varphi \cap B(\hat a, 2\delta) = \mathrm{Im}\, a \cap B(\hat a, 2\delta). \tag{15}$$
Moreover, the conclusions of Theorem 2 and Proposition 3 hold. In particular, if $p > 3$, then the conclusion of Theorem 1 also holds.

Remark. When $N = 2$, we can replace conditions (13) and (14) just by the existence of $\alpha_0$ such that for all $\epsilon > 0$, $a(-\epsilon, \epsilon)$ intersects the complement of any connected closed cone with vertex at $\hat a$ and angle $\alpha \in (0, \alpha_0]$.

Remark. In the case $l_{\hat a} \ge 2$ in (4), that is when the blow-up set is of dimension $N - l_{\hat a} \le N - 2$, we are unable to suitably choose the dilation in (10) and we cannot escape the logarithmic scale in $T - t$. Hence, we cannot obtain $C^{1,\alpha}$ regularity. We can nonetheless improve estimate (9) and prove that: For all $t \in [t_1, T)$ and $x \in B(\hat a, \delta)$ such that $d(x,S) \le \epsilon_0$, we have
$$|u(x,t) - \tilde u(d(x,S), t)| \le C \min\left\{ \frac{(T-t)^{-\frac{1}{p-1}}}{|\log(T-t)|},\; \frac{d(x,S)^{-\frac{2}{p-1}}}{|\log d(x,S)|^{\frac{p-2}{p-1}}} \right\}.$$

Theorem 1 is a direct consequence of Theorem 2. Throughout the paper, we assume the hypotheses of Theorem 2. In Sect. 2, we start from the conclusion given in [23] under the hypotheses of Theorem 2 and show that for any blow-up point $a$ near $\hat a$, there is $\sigma(a) \in \mathbb{R}$ such that $\tilde u_{\sigma(a)}$ is the best profile for $u$ along the normal direction to $S$ at $a$. In Sect. 3, we use this to get the blow-up behavior of $u$ in a tubular neighborhood of $S$ (Theorem 2). In Sect. 4, we prove regularity results (Proposition 3). Theorem 4 is a direct consequence of Theorem 2 and Proposition 3 because of the results of [23]. Indeed, Theorem 4 in [23] asserts that under the hypotheses of Theorem 4, $S$ is the graph of a $C^1$ function; hence Theorem 2 and Proposition 3 apply. Some connections with a chemotaxis model are presented in Sect. 5. The results of this paper and those of [23] have been presented in the note [22].

2. Modulation of the Dilation, Uniformly with Respect to the Blow-up Point

This is a major step in our paper. Under the hypotheses of Theorem 2, there is a $C^1$ function $\varphi$ such that
$$S_\delta \equiv S \cap B(\hat a, 2\delta) = \mathrm{graph}\, \varphi \cap B(\hat a, 2\delta) \tag{16}$$
for some $\delta > 0$ and $\varphi \in C^1(B_{N-1}(0, \delta_1), \mathbb{R})$, where $\delta_1 > 0$ and $B_{N-1}(0, \delta_1)$ is a ball in $\mathbb{R}^{N-1}$. If $a \in S_\delta$ and $w_a$ is defined by
$$w_a(y, s) = (T-t)^{\frac{1}{p-1}} u(x,t), \quad y = \frac{x - a}{\sqrt{T-t}}, \quad s = -\log(T-t), \tag{17}$$
then we see from (1) that for all $(y,s) \in \mathbb{R}^N \times [-\log T, \infty)$,
$$\frac{\partial w}{\partial s} = \Delta w - \frac12 y \cdot \nabla w - \frac{w}{p-1} + |w|^{p-1} w. \tag{18}$$
We have proved in Propositions 3.1, 4.4 and 4.4' of [23] that for all $a \in S_\delta$ and $s \ge -\log T$,
$$\left\| w_a(Q_a y, s) - \left\{ \kappa + \frac{\kappa}{2ps}\left( 1 - \frac{y_1^2}{2} \right) \right\} \right\|_{L^2_\rho} \le C \frac{\log s}{s^2}, \tag{19}$$
where $Q_a$ is a $N \times N$ orthogonal matrix continuous in terms of $a$, such that $\{Q_a e_i \mid i = 2, \dots, N\}$ span the tangent plane $T_a$ to $S$ at $a$, $Q_a e_1$ is the normal direction to $S$ at $a$,
$$\kappa = (p-1)^{-\frac{1}{p-1}} \quad \text{and} \quad \rho(y) = \frac{e^{-\frac{|y|^2}{4}}}{(4\pi)^{N/2}}. \tag{20}$$
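Two elementary facts behind (18)–(20) can be checked numerically: the constant $\kappa$ is a stationary solution of (18), and the weight $\rho$ has total mass one (it is enough to check the one dimensional Gaussian factor $e^{-\xi^2/4}/\sqrt{4\pi}$). The sketch below is a minimal illustration; the function names and the quadrature parameters are assumptions, not part of the paper.

```python
import math

def kappa(p):
    """kappa = (p-1)^(-1/(p-1)) from (20)."""
    return (p - 1.0) ** (-1.0 / (p - 1.0))

def reaction(w, p):
    """Right-hand side of (18) for a constant w: -w/(p-1) + |w|^(p-1) w.
    The Laplacian and the term y.grad(w) vanish on constants."""
    return -w / (p - 1.0) + abs(w) ** (p - 1.0) * w

def reaction_slope_at_kappa(p):
    """Derivative of the reaction term with respect to w, evaluated at kappa."""
    return -1.0 / (p - 1.0) + p * kappa(p) ** (p - 1.0)

def gaussian_factor_mass(half_width=40.0, n=20000):
    """Midpoint-rule approximation of the integral of exp(-xi^2/4)/sqrt(4 pi)."""
    step = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        xi = -half_width + (i + 0.5) * step
        total += math.exp(-xi * xi / 4.0) / math.sqrt(4.0 * math.pi)
    return total * step

residuals = [reaction(kappa(p), p) for p in (1.5, 2.0, 3.0, 5.0)]
slopes = [reaction_slope_at_kappa(p) for p in (1.5, 2.0, 3.0, 5.0)]
mass = gaussian_factor_mass()
```

The slope at $\kappa$ equals 1 for every $p$, which is where the zero-order term $+1$ in the linear operator $\Delta - \frac{y}{2}\cdot\nabla + 1$ of (26) comes from.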
To show (19), we first start from (4) and use the paper by Filippas and Kohn [10] to establish (19) at $a = \hat a$. Then, we use dynamical system methods to show the stability of the behavior (19) for solutions of (18). The Liouville Theorem stated in Subsect. 1.1 is a central argument. The particular one dimensional solution $\tilde u(x_1, t)$ in Subsect. 1.2 can also be thought of as a $N$ dimensional solution blowing up on the hyperplane $\{x_1 = 0\}$ in $\mathbb{R}^N$. Therefore, the results of [23] apply to $\tilde u$ and (19) holds for $\tilde u$ too. Since $\tilde u$ is invariant in the direction of the blow-up set, we have for all $a \in \{x_1 = 0\}$, $Q_a \equiv \mathrm{Id}$ and $\tilde w_a = \tilde w$ defined by
$$\tilde w(y_1, s) = (T-t)^{\frac{1}{p-1}} \tilde u(x_1, t), \quad y_1 = \frac{x_1}{\sqrt{T-t}}, \quad s = -\log(T-t). \tag{21}$$
$\tilde w$ is a solution of (18) and (19) yields for all $s \ge -\log T$,
$$\left\| \tilde w(y_1, s) - \left\{ \kappa + \frac{\kappa}{2ps}\left( 1 - \frac{y_1^2}{2} \right) \right\} \right\|_{L^2_\rho} \le C \frac{\log s}{s^2}. \tag{22}$$
Using (19) and (22), we get for all $\sigma_0 > 0$, $a \in S_\delta$, $|\sigma| \le \sigma_0$ and $s \ge -\log T + \sigma_0$,
$$\left\| w_a(Q_a y, s) - \tilde w(y_1, s + \sigma) \right\|_{L^2_\rho} \le C(\sigma_0) \frac{\log s}{s^2}. \tag{23}$$
We aim in this section at choosing a particular $\sigma = \sigma(a)$ so that this difference becomes less than $C e^{-\frac{s}{2}} s^{C_0}$ for some $C_0 \ge 0$. This is equivalent to choosing an appropriate dilation $\lambda(a) = e^{-\sigma(a)}$ in (10) for the original function $\tilde u(x_1, t)$. The following proposition is the goal of this section.

Proposition 2.1 (Modulation of the dilation in the one dimensional solution). There exist $s_0 > 0$ and $C_0 > 0$ and a continuous function $\sigma : S_\delta \to \mathbb{R}$ such that for all $a \in S_\delta$ and $s \ge s_0$,
$$\left\| w_a(Q_a y, s) - \tilde w(y_1, s + \sigma(a)) \right\|_{L^2_\rho} \le C_0 e^{-\frac{s}{2}} s^{C_0}.$$

Let us first recall from [15] some consequences of the Liouville Theorem of Subsect. 1.1, namely some $L^\infty$ estimates and a localization property for blow-up solutions of (1). We also need some elementary estimates of the one dimensional solution $\tilde u$.

2.1. Uniform $L^\infty$ estimates. The following propositions are consequences of the Liouville Theorem of Subsect. 1.1.

Proposition 2.2 ($L^\infty$ estimates for solutions to (1) at blow-up). There exists $C > 0$ such that if $u$ is a solution to (1) which blows up at time $T > 0$, then there exists $\hat s$ such that for all $s \ge \hat s$ and $a \in \mathbb{R}^N$,
$$\| w_a(s) \|_{L^\infty} \le \kappa + \frac{C}{s} \quad \text{and} \quad \| \nabla^i w_a(s) \|_{L^\infty} \le \frac{C}{s^{i/2}} \tag{24}$$
for all $i \in \{1, 2, 3\}$, where $w_a$ is defined in (17).
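Proposition 2.3 (A uniform localization of the PDE (1) by means of the associated ODE). Let $u$ be a solution to (1) which blows up at time $T$. Then, $\forall \epsilon > 0$, $\exists C_\epsilon > 0$, $\forall t \in \left[\frac{T}{2}, T\right)$, $\forall x \in \mathbb{R}^N$,
$$\left| \frac{\partial u}{\partial t} - |u|^{p-1} u \right| \le \epsilon |u|^p + C_\epsilon.$$

The reader will find a proof of these propositions in [15] and [14] respectively. In the following lemma, we give some elementary estimates for the particular one dimensional solution $\tilde u$:

Lemma 2.4 (Elementary estimates for $\tilde u$).
(i) There exist $C > 0$ and $\hat s > 0$ such that for all $s \ge \hat s$ and $|y_1| \le \sqrt{s}$, we have
$$\tilde w(y_1, s) \le \tilde w(0, s) - C \frac{y_1^2}{s}.$$
(ii)
$$\int_{\mathbb{R}} \frac{\partial \tilde w}{\partial s}(y_1, s)\, \frac{(y_1^2 - 2)}{8}\, \frac{e^{-\frac{y_1^2}{4}}}{\sqrt{4\pi}}\, dy_1 \sim \frac{\kappa}{4ps^2} \text{ as } s \to \infty.$$
(iii)
$$\frac{\partial \tilde w}{\partial s}(0, s) \sim -\frac{\kappa}{2ps^2} \text{ as } s \to \infty.$$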
Proof. See Appendix A.

2.2. A dynamical system formulation for the modulation problem. Our approach is identical to what we did with Fermanian in [9] for the difference of two solutions with the radial profile ($l_{\hat a} = N$) in (4), instead of the non symmetric profile ($1 = l_{\hat a} < N$) we handle here. Therefore, we follow the full strategy of [9] and emphasize the novelties. However, some technical details (most of them straightforward and long) are omitted. The reader can find them in [9].

Consider an arbitrary $\sigma_0 \ge 0$ and fix $a \in S_\delta$ and $|\sigma| \le \sigma_0$. If we define
$$g_{a,\sigma}(y, s) = w_a(Q_a y, s) - \tilde w(y_1, s + \sigma), \tag{25}$$
then we see from (18) that for all $(y,s) \in \mathbb{R}^N \times [-\log T + \sigma_0, \infty)$,
$$\partial_s g_{a,\sigma}(y, s) = \left( \mathcal{L} + \alpha_{a,\sigma} \right) g_{a,\sigma}, \tag{26}$$
where $\mathcal{L} = \Delta - \frac{y}{2} \cdot \nabla + 1$ and $\forall (y,s) \in \mathbb{R}^N \times \mathbb{R}$,
$$\alpha_{a,\sigma}(y, s) = \frac{|w_a(Q_a y, s)|^{p-1} w_a - |\tilde w(y_1, s + \sigma)|^{p-1} \tilde w}{w_a - \tilde w} - \frac{p}{p-1} \tag{27}$$
if $w_a(Q_a y, s) \ne \tilde w(y_1, s + \sigma)$, and in general,
$$\alpha(y, s) = p\, |\bar w_{a,\sigma}(y, s)|^{p-1} - \frac{p}{p-1} \tag{28}$$
for some $\bar w_{a,\sigma}(y, s) \in \left( w_a(Q_a y, s), \tilde w(y_1, s + \sigma) \right)$. In the following, we drop the index $(a, \sigma)$ unless there is ambiguity. One should keep in mind that all quantities defined from $g$ also depend on $(a, \sigma)$. According to (23) and (25), $g \to 0$ in $L^2_\rho$ as $s \to \infty$. More precisely, for all $s \ge -\log T + \sigma_0$,
$$\| g(s) \|_{L^2_\rho} \le C(\sigma_0) \frac{\log s}{s^2}. \tag{29}$$
Operator $\mathcal{L}$ is self-adjoint on $D(\mathcal{L}) \subset L^2_\rho(\mathbb{R}^N)$ where $\rho$ is defined in (20). The spectrum of $\mathcal{L}$ consists of eigenvalues
$$\mathrm{spec}\, \mathcal{L} = \left\{ 1 - \frac{m}{2},\; m \in \mathbb{N} \right\}.$$
Note that except two positive eigenvalues $1$ and $\frac12$ and a null eigenvalue, all the spectrum is negative. The eigenfunctions of $\mathcal{L}$ are
$$h_\beta(y) = h_{\beta_1}(y_1) \cdots h_{\beta_N}(y_N), \tag{30}$$
where $\beta = (\beta_1, \dots, \beta_N) \in \mathbb{N}^N$ and for each $m \in \mathbb{N}$, $h_m$ is the rescaled Hermite polynomial
$$h_m(\xi) = \sum_{j=0}^{[m/2]} \frac{m!}{j!(m-2j)!} (-1)^j \xi^{m-2j}. \tag{31}$$
We note $k_m = h_m / \|h_m\|^2_{L^2_{\rho_1}(\mathbb{R})}$, where $L^2_{\rho_1}(\mathbb{R})$ is the $L^2$ space with the measure
$$\rho_1(\xi) = \frac{e^{-\frac{\xi^2}{4}}}{\sqrt{4\pi}} \quad \text{that satisfies} \quad \rho(y) = \prod_{i=1}^{N} \rho_1(y_i). \tag{32}$$
RN
(33)
(34)
If Pn is the orthogonal projector of L2ρ over the eigenspace of L corresponding to 1 − n2 , gβ (s)hβ (y). Since the eigenfunctions of L span the whole space then Pn g(y, s) = |β|=n
L2ρ , we can write g(y, s) = P g = gβ (s)hβ (y) n N n∈N β∈N 2 2 ln (s)2 where ln (s) ≡ Pn g L2ρ . g(s) L2ρ ≡ I (s) = n∈N
As for α, we claim the following:
(35)
Solutions of Semilinear Heat Equations
533
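Lemma 2.5 (Estimates on $\alpha$). For all $\sigma_0 \ge 0$, $a \in S_\delta$, $|\sigma| \le \sigma_0$, $y \in \mathbb{R}^N$ and $s \ge -\log T + \sigma_0$,
$$\alpha(y, s) \le \frac{C(\sigma_0)}{s}, \qquad |\alpha(y, s)| \le \frac{C(\sigma_0)}{s} (1 + |y|^2) \qquad \text{and} \qquad \left| \alpha(y, s) + \frac{1}{4s} h_2(y_1) \right| \le \frac{C(\sigma_0)}{s^{3/2}} (1 + |y|^3). \tag{36}$$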
Proof. See Lemma 2.5 in [9] where a similar lemma was derived from Proposition 2.2, (22) and (19) (note that both (22) and (19) hold in $C^k_{loc}$ by parabolic regularity).

2.3. Modulation for the dilation in the one dimensional solution. We prove Proposition 2.1 here. Practically, since $g_{a,\sigma}$ satisfies Eq. (26), we consider that equation as a dynamical system and classify all possible asymptotic behaviors the equation can exhibit as $s \to \infty$, under the growth condition (29). It turns out that the effect of $\alpha$ in (26) can be neglected, except in the neutral mode of $\mathcal{L}$. Since the eigenvalues of $\mathcal{L}$ are $1$, $\frac12$, $0$ and $-\frac{k}{2}$ for any integer $k \ge 1$, we expect the positive modes to be neglected. More precisely, unless $g_{a,\sigma}$ decreases faster than $e^{-ks}$ for any $k \in \mathbb{N}$, either the null mode or a negative mode of $\mathcal{L}$ will dominate as $s \to \infty$. Moreover, we expect $g_{a,\sigma}$ to decrease polynomially in the former case (because of the effect of the $\frac{1}{s}$ term in $\alpha$) and exponentially in the latter. We proceed in 3 steps:
– In Step 1, we project Eq. (26) on the different modes. We then show that the positive modes are relatively small and that either the null or a negative mode dominates (unless $g_{a,\sigma}$ decreases faster than $e^{-ks}$ for any $k \in \mathbb{N}$).
– In Step 2, we solve the ODE satisfied by the null mode and show that it decays like $\frac{1}{s^2}$, except for a critical explicit value $\sigma(a)$ of $\sigma$, where it decays faster.
– In Step 3, we take $\sigma$ equal to this critical value $\sigma(a)$ and show that the null mode cannot dominate, unless $g_{a,\sigma} \equiv 0$. Thus, we drop down in the spectrum from $0$ to $-\frac12$ or less, which gives exponentially fast decay for $g_{a,\sigma}$.

Step 1: Dominance of a particular mode. Let us first project (26) on the different modes. For the null mode of $\mathcal{L}$ ($|\beta| = 2$), the main term of the equation comes from the main term of $\alpha$ (see (36)).

Lemma 2.6 (Projection of (26) on the different modes). For all $\sigma_0 \ge 0$, $a \in S_\delta$, $|\sigma| \le \sigma_0$ and $s \ge -\log T + \sigma_0$, we have the following:
(i) For all $n \in \mathbb{N}$, $\left| l_n' + \left( \frac{n}{2} - 1 \right) l_n \right| \le C(n, \sigma_0) \frac{I(s)}{s}$.
(ii) For all $n \in \mathbb{N}$, $I'(s) \le \left( 1 - \frac{n+1}{2} + \frac{C_0(\sigma_0)}{s} \right) I(s) + \frac12 \sum_{k=0}^{n} (n + 1 - k)\, l_k(s)$.
(iii) If $|\beta| = 2$, then $\left| g_\beta'(s) + \frac{\beta_1}{s} g_\beta(s) \right| \le C(\sigma_0) \frac{I(s)}{s^{3/2}} + C(\sigma_0) \frac{l_0 + l_4}{s}$.

Proof. Parts (i) and (ii) follow from (26) and Lemma 2.5 exactly as in Lemma 2.7 in [9]. (iii) The calculation is straightforward and similar to the proof of Proposition 2.9 in [9]. See Appendix B.1 for details.
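The coefficient $\frac{\beta_1}{s}$ in (iii) of Lemma 2.6 can be traced back, at a heuristic level, to the leading term $-\frac{1}{4s} h_2(y_1)$ of $\alpha$ in (36): multiplying $h_{\beta_1}$ by $h_2$ and projecting back on $h_{\beta_1}$ produces the factor $4\beta_1$. The sketch below checks this product rule exactly for the rescaled Hermite polynomials (31) with the weight (32). It only accounts for this diagonal contribution and is not a substitute for the computation of Appendix B.1; all helper names are assumptions made for the example.

```python
from fractions import Fraction
from math import factorial

def hermite(m):
    """Coefficients c[k] of xi^k in h_m, following (31)."""
    c = [0] * (m + 1)
    for j in range(m // 2 + 1):
        c[m - 2 * j] = (-1) ** j * (factorial(m) // (factorial(j) * factorial(m - 2 * j)))
    return c

def moment(k):
    """Integral of xi^k against rho_1 from (32) (Gaussian of variance 2)."""
    if k % 2:
        return 0
    result = 1
    for odd in range(1, k, 2):
        result *= odd
    return result * 2 ** (k // 2)

def inner(a, b):
    """Inner product of two polynomials in L^2 with weight rho_1."""
    return sum(a[i] * b[j] * moment(i + j) for i in range(len(a)) for j in range(len(b)))

def multiply(a, b):
    """Product of two polynomials given by coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def hermite_coefficient(poly, n):
    """Coefficient of h_n in the expansion of poly on the basis (h_m)."""
    hn = hermite(n)
    return Fraction(inner(poly, hn), inner(hn, hn))

def diagonal_coefficient(m):
    """Coefficient of h_m in the product h_2 * h_m."""
    return hermite_coefficient(multiply(hermite(2), hermite(m)), m)

diagonal = [diagonal_coefficient(m) for m in range(3)]
# Contribution of -(1/(4s)) h_2 to the equation of g_beta: -(beta_1 / s) g_beta.
rates = [-Fraction(1, 4) * d for d in diagonal]
```

The product $h_2 h_m$ also has components on $h_{m+2}$ and $h_{m-2}$, which couple the null mode to the modes $|\beta| = 0$ and $|\beta| = 4$; this is consistent with the term $\frac{l_0 + l_4}{s}$ on the right-hand side of (iii).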
Our main goal in this step is to show that one mode has to dominate all the others (unless $I(s)$ decays faster than $e^{-ks}$ for any $k \in \mathbb{N}$). The argument would be clear if $\alpha$ was identically zero, because the modes would not interact in that case. In the actual proof, we rely on this simple fact and treat the term $\alpha g$ as a perturbation to get the result. We claim the following lemma (which was proved in [9] with no special care to uniform estimates with respect to $a \in S_\delta$):

Lemma 2.7 (Dominance of a mode). For all $a \in S_\delta$ and $\sigma \in \mathbb{R}$, either for all $m \in \mathbb{N}$, $l_m(s) = O\!\left( \frac{I(s)}{s} \right)$, or there is $n \ge 2$ such that $I \sim l_n$ as $s \to \infty$. In that case, $\forall m \ne n$, $l_m = O\!\left( \frac{I}{s} \right)$ as $s \to \infty$.

Proof. See Proposition 2.6 in [9].

Lemma 2.7 asserts that the positive modes $l_0$ and $l_1$ are $O\!\left( \frac{I}{s} \right)$ as $s \to \infty$. We need to know that this holds uniformly with respect to $a$ and $\sigma$. We claim the following:

Lemma 2.8 (Uniform smallness of the positive modes). For all $\sigma_0 \ge 0$, there exists $s_1 > 0$ such that for all $a \in S_\delta$, $|\sigma| \le \sigma_0$ and $s \ge s_1$, $l_0(s) + l_1(s) \le 2C(\sigma_0) \frac{I(s)}{s}$.

Proof. It is the same as in [9], with more care about the dependence of the constants. See Appendix B.2 for the proof.

Step 2: Asymptotic behavior of the null mode. We first use the decay information on $I(s)$ and $l_0(s)$ contained in (29) and Lemma 2.8 to solve the ODE satisfied by the null mode and stated in (iii) of Lemma 2.6. We claim the following:

Lemma 2.9 (Decay of the null mode of (26)). For all $\sigma_0 \ge 0$, there is $s_3(\sigma_0)$ such that for all $a \in S_\delta$, $|\sigma| \le \sigma_0$, $s \ge s_3(\sigma_0)$ and $|\beta| = 2$, we have:
$$|g_{a,\sigma,\beta}(s)| \le C(\sigma_0) \frac{\log s}{s^{5/2}} \quad \text{if } \beta_1 \ne 2,$$
$$\left| g_{a,\sigma,\beta}(s) - \frac{k_{a,\sigma}}{s^2} \right| \le C(\sigma_0) \frac{\log s}{s^{5/2}} \quad \text{if } \beta_1 = 2.$$

Proof. This is straightforward. See Appendix B.3.
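The mechanism behind Lemma 2.9 in the case $\beta_1 = 2$ can be illustrated on a model equation $g' + \frac{2}{s} g = r(s)$. Here the forcing $r(s) = c\, s^{-7/2}$ is a hypothetical stand-in for the right-hand side of (iii) in Lemma 2.6 (the logarithm is dropped so that a closed form is available), and the values of $c$, $s_0$ and $g(s_0)$ are arbitrary. Since $(s^2 g)' = s^2 r$, the quantity $s^2 g(s)$ converges to a limit $k$ and $g(s) - k/s^2 = -2c\, s^{-5/2}$. The sketch compares a Runge–Kutta integration with this closed form.

```python
def rk4(f, s0, g0, s1, n):
    """Classical fourth-order Runge-Kutta integration of g' = f(s, g) on [s0, s1]."""
    h = (s1 - s0) / n
    s, g = s0, g0
    for _ in range(n):
        k1 = f(s, g)
        k2 = f(s + h / 2, g + h * k1 / 2)
        k3 = f(s + h / 2, g + h * k2 / 2)
        k4 = f(s + h, g + h * k3)
        g += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += h
    return g

# Hypothetical model data (assumptions for illustration only).
c, s0, g0, s1 = 0.5, 1.0, 1.0, 50.0

def model_rhs(s, g):
    """Right-hand side of the model null-mode equation g' = -(2/s) g + c s^(-7/2)."""
    return -2.0 * g / s + c * s ** -3.5

g_end = rk4(model_rhs, s0, g0, s1, 49000)

# Closed form: s^2 g(s) = s0^2 g0 + 2c (s0^(-1/2) - s^(-1/2)).
limit_k = s0 ** 2 * g0 + 2.0 * c * s0 ** -0.5
exact_scaled = limit_k - 2.0 * c * s1 ** -0.5
numeric_scaled = s1 ** 2 * g_end
```

In this model the null mode decays like $k/s^2$ unless $k = 0$, in which case only the faster correction of order $s^{-5/2}$ remains; this is the situation the choice of $\sigma$ in the sequel is designed to produce.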
Now it becomes clear that by making $k_{a,\sigma} = 0$, the decay of the null mode is faster, which suggests that the null mode may not dominate; therefore, we drop down in the spectrum to $-\frac12$ or less, which yields exponential decay. But can we make $k_{a,\sigma} = 0$? The answer is yes and this comes from a simple fact: the difference $k_{a,\sigma} - k_{a,0}$ does not depend on the function $w$ or on the blow-up point $a \in S_\delta$, or even on the one dimensional solution $\tilde w$; it is a linear function of $\sigma$. More precisely, we have the following lemma, which is the core of our argument:

Lemma 2.10 (Modulation of the value of $\sigma$). For all $a \in S_\delta$ and $\sigma \in \mathbb{R}$,
$$k_{a,\sigma} = k_{a,0} - \frac{\kappa}{4p} \sigma.$$

Proof. By definition of $k_{a,\sigma}$ (see Lemma 2.9 and (34)),
$$k_{a,\sigma} = \lim_{s \to \infty} s^2 \int_{\mathbb{R}^N} g_{a,\sigma}(y, s)\, k_2(y_1) \rho(y)\, dy. \tag{37}$$
Therefore,
$$k_{a,\sigma} - k_{a,0} = \lim_{s \to \infty} s^2 \int_{\mathbb{R}^N} \left( g_{a,\sigma}(y, s) - g_{a,0}(y, s) \right) k_2(y_1) \rho(y)\, dy = \lim_{s \to \infty} s^2 \int_{\mathbb{R}} \left( \tilde w(y_1, s) - \tilde w(y_1, s + \sigma) \right) k_2(y_1) \rho_1(y_1)\, dy_1 \tag{38}$$
according to (25) and (32). In particular, $k_{a,0} - k_{a,\sigma}$ does not depend on $w$ or on $a \in S_\delta$. Since we know from (ii) in Lemma 2.4, (31) and (32) that
$$\int_{\mathbb{R}} \frac{\partial \tilde w}{\partial s}(y_1, s)\, k_2(y_1) \rho_1(y_1)\, dy_1 \sim \frac{\kappa}{4ps^2} \text{ as } s \to \infty,$$
the conclusion follows by the mean value theorem.
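The limit in (38) can be illustrated on the leading-order expansion of $\tilde w$ given by (22), namely $\tilde w(y_1, s) \approx \kappa - \frac{\kappa}{4ps} h_2(y_1)$, whose component on $h_2$ is $-\frac{\kappa}{4ps}$. The sketch below is only a model computation under that approximation (the remainder in (22) is ignored), and the numerical values of $p$, $\sigma$ and $k_{a,0}$ are arbitrary assumptions. It recovers the slope $-\frac{\kappa}{4p}$ of Lemma 2.10 and the value of $\sigma$ that cancels $k_{a,\sigma}$.

```python
def kappa(p):
    """kappa = (p-1)^(-1/(p-1)) from (20)."""
    return (p - 1.0) ** (-1.0 / (p - 1.0))

def h2_component(s, p):
    """Component on h_2 of the model profile kappa - kappa/(4 p s) h_2(y_1)."""
    return -kappa(p) / (4.0 * p * s)

def scaled_shift(sigma, p, s):
    """s^2 times the h_2-component of w_model(., s) - w_model(., s + sigma), as in (38)."""
    return s * s * (h2_component(s, p) - h2_component(s + sigma, p))

def critical_sigma(k_a0, p):
    """Value of sigma for which k_{a,0} - kappa sigma / (4p) vanishes."""
    return 4.0 * p * k_a0 / kappa(p)

# Illustrative values (assumptions, not from the paper).
p, sigma, k_a0 = 3.0, 0.7, 0.25
approx_slope_term = scaled_shift(sigma, p, 1.0e6)
expected = -kappa(p) * sigma / (4.0 * p)
sigma_star = critical_sigma(k_a0, p)
k_at_sigma_star = k_a0 - kappa(p) * sigma_star / (4.0 * p)
```

For finite $s$ the model gives $-\frac{\kappa\sigma}{4p} \cdot \frac{s}{s+\sigma}$, which tends to $-\frac{\kappa\sigma}{4p}$ as $s \to \infty$.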
In the following, we take $\sigma = \sigma(a) \equiv \frac{4p}{\kappa} k_{a,0}$, which makes $k_{a,\sigma} = 0$.

Step 3: Exponential decay of $g_{a,\sigma(a)}$ in $L^2_\rho$. We conclude the proof of Proposition 2.1 here. With this choice of $\sigma$, $k_{a,\sigma} = 0$, hence (iii) of Lemma 2.6 and Lemma 2.9 yield
$$l_2'(s) \ge -\frac{2}{s} l_2 - \frac{C}{s^{3/2}} I(s) \quad \text{and} \quad l_2(s) = O\!\left( \frac{\log s}{s^{5/2}} \right) \text{ as } s \to \infty \tag{39}$$
(recall that $l_2^2 = \sum_{|\beta| = 2} g_\beta^2 \| h_\beta \|^2_{L^2_\rho}$). This implies that we cannot have $I \sim l_2$, unless $I \equiv 0$. Therefore, Lemma 2.7 implies that either a negative mode dominates, or all the modes are less than $C I(s)/s$. In both cases, the differential inequality (ii) in Lemma 2.6 yields exponential decay for $I(s)$, which is the desired conclusion. However, we need to make this decay uniform with respect to the blow-up point $a \in S_\delta$. We need first to fix $\sigma_0$. The uniform estimate of Lemma 2.9 along with the continuity of $g_{a,\sigma}(y, s)$ with respect to $a$, $\sigma$ and $s$ (see (25)) yields the continuity of $k_{a,\sigma}$ with respect to $(a, \sigma) \in S_\delta \times \mathbb{R}$ (see (37)). Hence, we can fix
$$\sigma_0 = \max_{a \in S_\delta} \frac{4p}{\kappa} |k_{a,0}| < +\infty \tag{40}$$
and define a continuous function $\sigma : S_\delta \to [-\sigma_0, \sigma_0]$ by $\sigma(a) = \frac{4p}{\kappa} k_{a,0}$. Just note that if we take $n = 2$ in (i) and (ii) of Lemma 2.6 and use Lemma 2.8, then we see that $x = l_2$ and $y = I$ satisfy the inequality (41) in the following ODE lemma:

Lemma 2.11 (ODE lemma). For all $M > 0$ and $\hat s$, there is $\bar s(M, \hat s) \ge \hat s$ such that if $0 \le x(s) \le y(s) \to 0$ as $s \to \infty$ and
$$\forall s \ge \hat s, \quad x' \ge -\frac{M}{s} y, \qquad y' \le -\frac12 y + \frac{M}{s} y + \frac12 x, \tag{41}$$
then
$$\text{either } \forall s \ge \bar s, \; x(s) \le \frac{5M}{s} y(s) \tag{42}$$
$$\text{or } y \ge x > 0 \text{ and } y \sim x \text{ as } s \to \infty. \tag{43}$$

Remark. If (43) holds, then we have no uniform control with respect to $M$ and $\hat s$.
Proof. See Appendix B.2.
We have just proved that (43) doesn't hold. Therefore, for all $a \in S_\delta$ and $s \ge s_2$ for some $s_2 > 0$, $l_2(s) \le CI(s)/s$. Using Lemma 2.8 and (ii) in Proposition 2.6 (take n = 3) yields for all $a \in S_\delta$, if $\sigma = \frac{4p}{\kappa} k_{a,0}$, then
\[ \forall s \ge s_0, \qquad l_k(s) \le C\,\frac{I(s)}{s} \ \text{ if } k = 0, 1 \text{ or } 2, \qquad I'(s) \le \left(-\frac12 + \frac{C_0}{s}\right) I(s) + \frac12 \sum_{k=0}^{2} (n+1-k)\, l_k(s). \]
Therefore, $\forall s \ge s_0$, $I'(s) \le \left(-\frac12 + \frac{C}{s}\right) I(s)$, hence $I(s) \le C_0 e^{-s/2} s^{C_0}$ for some $C_0 > 0$. This concludes the proof of Proposition 2.1.

3. Blow-up Behavior of u in a Tubular Neighborhood of S
We prove Theorem 2 here. We have proved in [23] that (7) holds. This estimate identifies for each $t \in [0, T)$ three regions in $B(\hat a, \delta)$:
– The blow-up region. It is $\{x \mid d(x,S) \le \sqrt{(T-t)|\log(T-t)|}\}$. According to (7), it corresponds to the set $\{x \mid |u(x,t)| \ge \eta \|u(t)\|_{L^\infty}\}$ for some $0 < \eta < 1$.
– The regular region. It is the region far away from blow-up, where u stays bounded, say by 1. It corresponds to $\{x \mid d(x,S) \ge \epsilon_0\}$ for some $\epsilon_0 > 0$.
– The intermediate region. It is between the two others, that is $\{x \mid 1 \le |u(x,t)| \le \eta \|u(t)\|_{L^\infty}\}$ or $\{x \mid \sqrt{(T-t)|\log(T-t)|} \le d(x,S) \le \epsilon_0\}$.
We handle separately the blow-up and the intermediate regions whose union makes the tubular neighborhood. Our technique is the same as in [9]. Although we had only one blow-up point in [9], it turns out that the techniques of [9] hold uniformly with respect to the blow-up point, when they are adapted to the present case. Therefore, we follow the method of [9]. However, we omit technical details; the reader can find them in [9] and in the appendix. We proceed in 3 steps:
– In Step 1, we use the transport effect of the term $-\frac12 y\cdot\nabla g$ in Eq. (26) to extend the convergence of Proposition 2.1 from compact sets to larger sets $|y| \le \sqrt{s}$, i.e., the blow-up region $d(x,S) \le \sqrt{(T-t)|\log(T-t)|}$, after the change (17).
– In Step 2, we use the information on the edge of the blow-up region, i.e., when $d(x,S) = \sqrt{(T-t)|\log(T-t)|}$, as initial data to solve the ODE $u' = u^p$, which turns out to be a very good approximation for the PDE in the intermediate region $\sqrt{(T-t)|\log(T-t)|} \le d(x,S) \le \epsilon_0$, as mentioned in Proposition 2.3.
– In Step 3, we just gather the previous information to prove Theorem 2.
Step 1: The blow-up region. The $L^2_\rho$ estimate of Proposition 2.1 also holds uniformly on compact sets.
The convection term $-\frac12 y\cdot\nabla g$ in Eq. (26) allows us to carry estimates from compact sets to sets $|y| \le \sqrt{s}$ along characteristics of the type $y = Re^{\frac{s-s'}{2}}$. The following lemma is a corollary of Proposition 2.1 in Velázquez [19]. It is proved in the course of the proof of Proposition 2.13 in [9].
Solutions of Semilinear Heat Equations
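Lemma 3.1 (Velázquez: extension of the convergence from compact sets to sets $|y| \le \sqrt{s}$). Assume g is a solution of
\[ \partial_s g = \Delta g - \frac12 y\cdot\nabla g + g + \alpha(y,s)\, g \quad \text{for } (y,s) \in \mathbb{R}^N \times [\hat s, \infty), \]
where $\alpha(y,s) \le \frac{M}{s}$ and $|g(y,s)| \le M$. Then, for all $s' \ge \hat s$ and $s \ge s' + 1$ such that $e^{\frac{s-s'}{2}} = \sqrt{s}$, we have
\[ \sup_{|y| \le \sqrt{s}} |g(y,s)| \le C(M)\, e^{s-s'}\, \|g(s')\|_{L^2_\rho}. \]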
This lemma along with Proposition 2.1 yields for all $a \in S_\delta$ and $s' \ge s_0 + 1$,
\[ \sup_{|y| \le \sqrt{s}} |g_{a,\sigma(a)}(y,s)| \le C e^{s-s'}\, C_0\, e^{-s'/2}\, s'^{\,C_0}, \]
where $e^{\frac{s-s'}{2}} = \sqrt{s}$. Since $s' = s - \log s$, we have just proved part (i) of the following proposition:
Proposition 3.2 (Uniform estimates for $w_a$ in larger sets $|y| \le \sqrt{s}$). For all $a \in S_\delta$, $s \ge s_0 + 1$ and $|y| \le \sqrt{s}$,
(i) $|g_{a,\sigma(a)}(y,s)| \le C e^{-s/2} s^{\frac32 + C_0}$,
(ii) $|w_a(y,s) - \tilde w(y\cdot Q_a e_1, s + \sigma(a))| \le C e^{-s/2} s^{\frac32 + C_0}$,
where $s_0$ and $C_0$ are defined in Proposition 2.1.
Proof of (ii). Just change $Q_0 y$ into $y$ in part (i) and use the definition of g given in (25).
Now, we just rewrite part (ii) of the previous proposition in the original variables $u(x,t)$ through the transformation (17) to get the following corollary:
Corollary 3.3 (Uniform estimates for $u(x,t)$ in the larger sets $|x - a| \le \sqrt{(T-t)|\log(T-t)|}$). For all $a \in S_\delta$, $t \ge T - e^{-s_0-1}$ and $|x - a| \le \sqrt{(T-t)|\log(T-t)|}$,
\[ \left| u(x,t) - (T-t)^{-\frac{1}{p-1}}\, \tilde w\!\left( \frac{d(x,T_a)}{\sqrt{T-t}},\, -\log(T-t) + \sigma(a) \right) \right| = \left| u(x,t) - \tilde u_{\sigma(a)}(d(x,T_a), t) \right| \le C (T-t)^{\frac12 - \frac{1}{p-1}} |\log(T-t)|^{\frac32 + C_0}, \]
where $T_a$ is the tangent plane to S at a and $\tilde u_{\sigma(a)}$ is defined in (10).
The only delicate point in this transformation is the computation of $y\cdot Q_a e_1$ in terms of x, a and t. Using (17), we have $|y\cdot Q_a e_1| = |(x-a)\cdot Q_a e_1| (T-t)^{-1/2} = d(x,T_a)(T-t)^{-1/2}$, because $Q_a e_1$ is the normal direction to the blow-up set S at the blow-up point a (see (20)). The relation between $\tilde w$ and $\tilde u_\sigma$ follows directly from the definition of $\tilde w$ (21) and the definition of $\tilde u_\sigma$ (10).
Now, if we choose a to be the closest blow-up point to x, that is $a = P_S(x)$, the projection of x on the blow-up set S, then we get $d(x,T_a) = d(x,S)$, which yields the following corollary:
Corollary 3.4 (Uniform estimates for $u(x,t)$ in the blow-up region $d(x,S) \le \sqrt{(T-t)|\log(T-t)|}$). For all $t \ge T - e^{-s_0-1}$ and $x \in B(\hat a, \delta)$ such that $d(x,S) \le \sqrt{(T-t)|\log(T-t)|}$,
\[ \left| u(x,t) - \tilde u_{\sigma(P_S(x))}(d(x,S), t) \right| \le C (T-t)^{\frac12 - \frac{1}{p-1}} |\log(T-t)|^{\frac32 + C_0}, \]
where $P_S(x)$ is the projection of x over S.
Remark. We need the restriction $|x - \hat a| < \delta$ to guarantee the fact that $P_S(x)$ is in $S_\delta \equiv S \cap B(\hat a, 2\delta)$, defined in (16), so that Corollary 3.3 applies. Indeed, if $|x - \hat a| < \delta$, then $|P_S(x) - \hat a| \le |P_S(x) - x| + |x - \hat a| \le 2|x - \hat a| < 2\delta$, because $\hat a \in S$. Hence $P_S(x) \in S_\delta$.
Step 2: Estimates in the intermediate region. We consider a point $(x,t)$ in the intermediate region, i.e. such that $d(x,S) \ge \sqrt{(T-t)|\log(T-t)|}$. We remark that the point $(x, \tilde t(d(x,S)))$, where $\tilde t(d)$ is defined by
\[ d = \sqrt{(T - \tilde t)\,|\log(T - \tilde t)|} \tag{44} \]
is on the frontier of the two regions (note that $\tilde t \le t$). Therefore, we have an estimate on $u$ and on $u - \tilde u_{\sigma(P_S(x))}$ at $(x, \tilde t(d(x,S)))$, respectively from (7) and from Corollary 3.4. Moreover, the PDE (1) can be uniformly localized by the ODE $u' = u^p$, according to Proposition 2.3. The same holds for the one dimensional solution $\tilde u$. Our idea is simple: we use the ODE to propagate the information on $u - \tilde u_{\sigma(P_S(x))}$ from time $\tilde t$ to $t$. Thus, the error term on $u - \tilde u_{\sigma(P_S(x))}$ in the intermediate region will be the same as the one on the edge. More precisely:
Proposition 3.5 (Estimates in the intermediate region $\sqrt{(T-t)|\log(T-t)|} \le d(x,S) \le \epsilon_0$). There exists $\epsilon_0 > 0$ such that for all $x \in B(\hat a, \delta)$ and $t \in [0,T)$, if $\sqrt{(T-t)|\log(T-t)|} \le d(x,S) \le \epsilon_0$, then
\[ |u(x,t) - \tilde u_{\sigma(P_S(x))}(d(x,S), t)| \le C (T - \tilde t)^{\frac12 - \frac{1}{p-1}} |\log(T - \tilde t)|^{\frac32 + C_0} \le C\, d(x,S)^{1 - \frac{2}{p-1}}\, |\log d(x,S)|^{\frac{p}{p-1} + C_0}, \]
where $P_S(x)$ is the orthogonal projection of x on S and $\tilde t = \tilde t(d(x,S))$ is defined by (44).
Proof. The main argument of the proof has just been given. The reader can find the "technical" proof in Appendix C.
Step 3: Estimates in a tubular neighborhood of S. We prove Theorem 2 here. Let $t_1 = \max(T - e^{-s_0-1}, \tilde t(\epsilon_0))$, where $\epsilon_0$ and $\tilde t(\epsilon_0)$ are given in Proposition 3.5, and consider some $x \in B(\hat a, \delta)$ such that $d(x,S) \le \epsilon_0$.
(i) Let $t \in [t_1, T)$. If $t \le \tilde t(d(x,S))$ defined in (44), then $d(x,S) \le \sqrt{(T-t)|\log(T-t)|}$. Use Corollary 3.4. If $t \ge \tilde t(d(x,S))$, then $d(x,S) \ge \sqrt{(T-t)|\log(T-t)|}$. Use Proposition 3.5.
(ii) Just make $t \to T$ in (i) and use (10).
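The case distinction of Step 3 can be sketched in a few lines. This is only an illustration, assuming the width of the blow-up region is $\sqrt{(T-t)|\log(T-t)|}$ (as the relation $d = \sqrt{\tau|\log\tau|}$ of Sect. 4 suggests); the function names and the numerical value used for the threshold `eps0` are hypothetical, not taken from the paper.

```python
import math


def blowup_radius(T_minus_t: float) -> float:
    """Assumed width sqrt((T-t)|log(T-t)|) of the blow-up region at time t."""
    return math.sqrt(T_minus_t * abs(math.log(T_minus_t)))


def region(d: float, T_minus_t: float, eps0: float) -> str:
    """Classify a point by its distance d = d(x, S) to the blow-up set.

    eps0 stands for the (unspecified) threshold of Proposition 3.5.
    """
    if d <= blowup_radius(T_minus_t):
        return "blow-up"        # Corollary 3.4 is the relevant estimate
    if d <= eps0:
        return "intermediate"   # Proposition 3.5 is the relevant estimate
    return "regular"            # u stays bounded here
```

For instance, with `T_minus_t = 1e-6` the assumed radius is about `3.7e-3`, so a point at distance `1e-4` falls in the blow-up region while a point at distance `1e-2` falls in the intermediate one (for any `eps0` larger than `1e-2`).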
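4. Regularity of the Blow-up Set
We prove Theorem 4 and Proposition 3 here. To keep up with the notation of [23], we assume that $\hat a = 0$ and $Q_{\hat a} = \mathrm{Id}$, and consider that $S_\delta$, the intersection of S with $B(\hat a, 2\delta)$ (see (16)), is the graph of a function $\varphi \in C^1(B_{N-1}(0,\delta_1), \mathbb{R})$ of the variable $\tilde x = (x_2, \dots, x_N)$. If we introduce $A(\tilde x) = (\varphi(\tilde x), \tilde x)$, then $\mathrm{Im}\, A \cap B(\hat a, 2\delta) = \mathrm{graph}\, \varphi \cap B(\hat a, 2\delta) = S_\delta$. Given x near $S_\delta$, Corollary 3.3 gives many different asymptotic behaviors for $u(x,t)$, depending on the choice of the point $a \in \mathrm{Im}\, A \cap B(x, \sqrt{(T-t)|\log(T-t)|})$. All these possible behaviors have to agree, up to the error term in Corollary 3.3. This implies a geometric constraint on $S_\delta$, which gives some more regularity on A (and $\varphi$).
We consider some $|\tilde x| < \delta_1$ and some $\tilde h \in \mathbb{R}^{N-1}$ such that $|\tilde x + \tilde h| < \delta_1$ and $A(\tilde x)$ as well as $A(\tilde x + \tilde h)$ are in $S_\delta$. Since A is $C^1$ and $\sigma$ is continuous (see Proposition 2.1), there is $C^*$ such that
\[ |\nabla\varphi(\tilde x)| \le C^*, \quad |A(\tilde x + \tilde h) - A(\tilde x)| \le C^* |\tilde h| \quad \text{and} \quad |\sigma(A(\tilde x))| \le C^*. \tag{45} \]
For any time $t \ge T - e^{-s_0-1}$ such that $|A(\tilde x) - A(\tilde x + \tilde h)| \le \sqrt{(T-t)|\log(T-t)|}$, we can estimate $u(A(\tilde x + \tilde h), t)$ from Corollary 3.3 in two ways:
– First by taking $x = a = A(\tilde x + \tilde h)$ and $s = -\log(T-t)$, which gives
\[ \left| (T-t)^{\frac{1}{p-1}} u(A(\tilde x + \tilde h), t) - \tilde w(0, s + \sigma(A(\tilde x + \tilde h))) \right| \le C e^{-s/2} s^{\frac32 + C_0}. \tag{46} \]
– Second, by taking $a = A(\tilde x)$, $x = A(\tilde x + \tilde h)$ and $s = -\log(T-t)$, which gives
\[ \left| (T-t)^{\frac{1}{p-1}} u(A(\tilde x + \tilde h), t) - \tilde w\!\left( d(A(\tilde x + \tilde h), T_{A(\tilde x)})\, e^{s/2},\, s + \sigma(A(\tilde x)) \right) \right| \le C e^{-s/2} s^{\frac32 + C_0}. \tag{47} \]
Now, if we fix $t = \tilde t(\tilde x, \tilde h)$ such that
\[ |A(\tilde x + \tilde h) - A(\tilde x)| = \sqrt{(T - \tilde t(\tilde x, \tilde h))\, |\log(T - \tilde t(\tilde x, \tilde h))|} \tag{48} \]
and take $|\tilde h| < h_1(s_0)$ for some $h_1(s_0) > 0$, then we see from (45) that $\tilde t(\tilde x, \tilde h) \ge T - e^{-s_0-1}$, hence (46) and (47) hold. Therefore, if $\tilde s = -\log(T - \tilde t)$, then
\[ \left| \tilde w(0, \tilde s + \sigma(A(\tilde x + \tilde h))) - \tilde w\!\left( d(A(\tilde x + \tilde h), T_{A(\tilde x)})\, e^{\tilde s/2},\, \tilde s + \sigma(A(\tilde x)) \right) \right| \le C e^{-\tilde s/2}\, \tilde s^{\frac32 + C_0}. \tag{49} \]
By changing the roles of $\tilde x$ and $\tilde x + \tilde h$, we don't change $\tilde t(\tilde x, \tilde h)$ and obtain similarly
\[ \left| \tilde w(0, \tilde s + \sigma(A(\tilde x))) - \tilde w\!\left( d(A(\tilde x), T_{A(\tilde x + \tilde h)})\, e^{\tilde s/2},\, \tilde s + \sigma(A(\tilde x + \tilde h)) \right) \right| \le C e^{-\tilde s/2}\, \tilde s^{\frac32 + C_0}. \tag{50} \]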
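Since $\tilde u$, hence $\tilde w$ are radially decreasing (see Subsect. 1.2), this yields
\[ \left| \tilde w(0, \tilde s + \sigma(A(\tilde x))) - \tilde w(0, \tilde s + \sigma(A(\tilde x + \tilde h))) \right| \le C e^{-\tilde s/2}\, \tilde s^{\frac32 + C_0}. \tag{51} \]
Indeed, if $\tilde w(0, \tilde s + \sigma(A(\tilde x))) - \tilde w(0, \tilde s + \sigma(A(\tilde x + \tilde h))) \ge 0$, then
\[ 0 \le \tilde w(0, \tilde s + \sigma(A(\tilde x))) - \tilde w(0, \tilde s + \sigma(A(\tilde x + \tilde h))) \le \tilde w(0, \tilde s + \sigma(A(\tilde x))) - \tilde w\!\left( d(A(\tilde x), T_{A(\tilde x + \tilde h)})\, e^{\tilde s/2},\, \tilde s + \sigma(A(\tilde x + \tilde h)) \right) \]
because $\tilde w$ is radially decreasing. Hence, (51) follows from (50). Do the same and use (49) in the other case. Therefore, with the triangle inequality, we get from (51) and (49)
\[ 0 \le \tilde w(0, \tilde s + \sigma(A(\tilde x))) - \tilde w\!\left( d(A(\tilde x + \tilde h), T_{A(\tilde x)})\, e^{\tilde s/2},\, \tilde s + \sigma(A(\tilde x)) \right) \le C e^{-\tilde s/2}\, \tilde s^{\frac32 + C_0}. \tag{52} \]
Note that since $A(\tilde x) \in T_{A(\tilde x)}$, we have $\frac{d(A(\tilde x + \tilde h), T_{A(\tilde x)})}{|A(\tilde x + \tilde h) - A(\tilde x)|} \le 1$. Therefore, (i) of Lemma 2.4 implies that there is $C > 0$ and $h_2 > 0$ such that if $|\tilde h| < h_2$ then $\tilde s + \sigma(A(\tilde x)) \ge \hat s$ by (48) and (45) and
\[ \frac{C}{\tilde s + \sigma(A(\tilde x))} \left( d(A(\tilde x + \tilde h), T_{A(\tilde x)})\, e^{\tilde s/2} \right)^2 \le \tilde w(0, \tilde s + \sigma(A(\tilde x))) - \tilde w\!\left( d(A(\tilde x + \tilde h), T_{A(\tilde x)})\, e^{\tilde s/2},\, \tilde s + \sigma(A(\tilde x)) \right). \tag{53} \]
Since Im A is the graph of $\varphi$, we have
\[ d(A(\tilde x + \tilde h), T_{A(\tilde x)}) = \frac{\left| \varphi(\tilde x + \tilde h) - \varphi(\tilde x) - \tilde h\cdot\nabla\varphi(\tilde x) \right|}{\sqrt{1 + |\nabla\varphi(\tilde x)|^2}}. \tag{54} \]
Using (iii) in Lemma 2.4, we get $h_3 > 0$ such that if $|\tilde h| < h_3$, then $\tilde s$ is large enough by (48) and (45) and
\[ \frac{C}{\tilde s^2} \left| \sigma(A(\tilde x)) - \sigma(A(\tilde x + \tilde h)) \right| \le \left| \tilde w(0, \tilde s + \sigma(A(\tilde x))) - \tilde w(0, \tilde s + \sigma(A(\tilde x + \tilde h))) \right|. \tag{55} \]
If $\tau(d)$ is given by $d = \sqrt{\tau |\log\tau|}$, then
\[ \log\tau \sim 2\log d \quad \text{and} \quad \tau \sim \frac{d^2}{2|\log d|} \quad \text{as } d \to 0. \]
Therefore, $\frac{\log|\log\tau|}{|\log\tau|} \le \frac{\log|\log d|}{|\log d|}$ if $|d| \le d_0$ for some $d_0 > 0$. Combining this with (48) and (45), we have for all $|\tilde h| < h_4$ for some $h_4 > 0$,
\[ e^{-\tilde s/2} \sqrt{ \frac{\tilde s + \sigma(A(\tilde x))}{C}\, e^{-\tilde s/2}\, \tilde s^{\frac32 + C_0} } \le C d^{\frac32} |\log d|^{\frac12 + \frac{C_0}{2}} \le C |\tilde h|^{\frac32} \left|\log|\tilde h|\right|^{\frac12 + \frac{C_0}{2}}, \qquad \tilde s^2 e^{-\tilde s/2}\, \tilde s^{\frac32 + C_0} \le C d\, |\log d|^{3 + C_0} \le C |\tilde h| \left|\log|\tilde h|\right|^{3 + C_0}, \tag{56} \]
where $d = |A(\tilde x) - A(\tilde x + \tilde h)|$. Take $h_0 = \min(h_1, h_2, h_3, h_4)$. Combining (53), (54), (52), (56) and (45) gives the regularity estimate for $\varphi$. Combining (55), (51) and (56) gives the regularity estimate for $\sigma$ and closes the proof of Proposition 3.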
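The asymptotics of $\tau(d)$ used above can be checked numerically. The sketch below assumes the defining relation is $d = \sqrt{\tau|\log\tau|}$ and works with $L = |\log\tau|$: taking logarithms gives $L - \log L = D$ with $D = -2\log d$, and the ratio $\tau / (d^2/(2|\log d|))$ then equals $D/L$. The function names are illustrative only.

```python
import math


def solve_L(d: float) -> float:
    """Return L = |log tau| for tau solving d = sqrt(tau * |log tau|), d small.

    Uses bisection on L - log(L) = D, D = -2 log d, which is increasing for L >= 1.
    """
    D = -2.0 * math.log(d)
    lo, hi = 1.0, 2.0 * D + 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid - math.log(mid) < D:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ratio(d: float) -> float:
    """tau(d) divided by d^2 / (2 |log d|); algebraically this is D / L."""
    D = -2.0 * math.log(d)
    return D / solve_L(d)
```

The ratio tends to 1 only logarithmically slowly, since $D/L = 1/(1 + \log L / D)$; this is consistent with the equivalence being stated only as $d \to 0$.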
5. Connection with a Chemotaxis Problem
We would like to mention connections between the ideas of this paper and the chemotaxis problem of Betterton and Brenner [2]. Chemotaxis refers to the movement of bacteria under a gradient of some chemical substance. Under special conditions, bacteria excrete a substance to attract neighboring individuals. This way, bacteria aggregate and their density blows up in finite time $T > 0$. For simplicity, we assume that the cellular division is much slower than the dynamics of chemotaxis, and that the diffusion of bacteria is much slower than the diffusion of the attractant. Therefore, we have from [2] the equations satisfied by $\rho$, the bacterial density and c, the chemical attractant concentration:
\[ \frac{\partial\rho}{\partial t} = \Delta\rho - \nabla\cdot(\rho\nabla c) = \Delta\rho + \rho^2 - \nabla\rho\cdot\nabla c, \qquad 0 = \Delta c + \rho. \tag{57} \]
Many blow-up regimes are possible, depending on the relative importance of the three terms on the right-hand side of (57). A global picture is presented by Brenner et. al in [3], in the case of radial solutions. One of those regimes has the same scaling (T − t)| log(T − t)| as Eq. (1) with p = 2 (see Subsect. 4.3 in [3]). In an experiment conducted by Budrene and Berg [6, 7], (see also Brenner, Levitov and Budrene [4]), it appears clearly that the dynamics are 3 dimensional and not radial. The authors observe two regimes in this finite time blow-up: – The transient regime, for t ≤ t1 for some t1 < T . The bacteria aggregate along cylindrical structures that shrink towards their common axis, as time grows. This suggests that the axis of the cylinder would be the singular set. – The asymptotic regime. The cylinder is destabilized at time t = t1 and breaks up into spherical aggregates. Then, the three dimensions of the sphere shrink simultaneously, leading to isolated blow-up points. Although the chemotaxis equation is non-local, it has the same one dimensional scaling as the heat equation (1). Both equations deal with blow-up on a continuum (say on a line) and share the idea of the instability of such a behavior (only single point blow-up is thought to be generic for Eq. (1)). However, the goals of the two papers are different. Indeed, while [2] proves the instability of the blow-up on a line, we prove here that if this occurs, which is exceptional, then we have more constraints, hence more regularity on that line. Although the goals are different, the same idea is used in both works: how to connect all local singular behavior near singular points (or candidates for singular points in the case of [2]) to get a global picture of the situation? In [2], the destabilization of the cylinder at time t1 breaks the symmetry and induces a variation of a “local blow-up time”, or phase. The variation of the phase along the line is governed by a phase equation. 
The minimum of the phase determines the actual blow-up point. In our case, the connection between local behaviors is done through the dilation σ (a), a ∈ S, analogous to the phase of [2]. The Liouville theorem of [15] cited in Subsect. 1.1 is the key tool to connect local descriptions. We are unable to find a non trivial phase equation for σ , analogous to that of chemotaxis. However, since σ is linked to the one dimensional scaling of (1), which is also present for chemotaxis, we believe that if one adopts our point of view in chemotaxis, σ would satisfy a non trivial equation, related to the phase equation of [2].
A. Properties of the Particular Single Point Blow-up Solution in One Dimension
A.1. Existence of the one dimensional solution. We prove here the existence of the particular one dimensional solution announced in Subsect. 1.2. Take g a symmetric positive continuous function, decreasing on $(0,\infty)$ and going to zero at infinity. The solution $\tilde u(x_1, t)$ of (1) with initial data $kg$ is symmetric and decreasing on $(0,\infty)$ as well. If k is large enough, then $\tilde u(x_1,t)$ blows up in finite time $\tilde T$, only at the origin (see Theorems 1 and 2 in Mueller and Weissler [16]). We can assume $\tilde T = T$ by changing $\tilde u$ into some
\[ \tilde u_\lambda(x_1, t) = \lambda^{\frac{2}{p-1}}\, \tilde u(\lambda x_1, \lambda^2 t). \]
Theorem 1 in Herrero and Velázquez [12] then asserts that $\tilde u$ has the profile $f_1$ defined in (5). $\tilde u$ is not self-similar, because the only self-similar solutions of (1) are independent of space, hence trivial (see Theorem 1' in Giga and Kohn [11]).
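How the scaling $\tilde u_\lambda(x_1,t) = \lambda^{2/(p-1)}\tilde u(\lambda x_1, \lambda^2 t)$ moves the blow-up time can be illustrated on the simplest example. The sketch below does not use the solution $\tilde u$ of the text: as a stand-in it takes the space-independent solution $\kappa (T-t)^{-1/(p-1)}$ of $u' = u^p$ with $\kappa = (p-1)^{-1/(p-1)}$, for which the rescaled function blows up at $T/\lambda^2$. All function names are illustrative.

```python
import math


def kappa(p: float) -> float:
    """Constant (p-1)^(-1/(p-1))."""
    return (p - 1.0) ** (-1.0 / (p - 1.0))


def u_ode(t: float, T: float, p: float) -> float:
    """Space-independent solution of u' = u^p blowing up at time T."""
    return kappa(p) * (T - t) ** (-1.0 / (p - 1.0))


def rescaled(t: float, T: float, p: float, lam: float) -> float:
    """u_lambda(t) = lam^(2/(p-1)) * u(lam^2 * t) for the solution above."""
    return lam ** (2.0 / (p - 1.0)) * u_ode(lam * lam * t, T, p)


def lam_for(T_tilde: float, T: float) -> float:
    """Scaling factor sending the blow-up time T_tilde to T."""
    return math.sqrt(T_tilde / T)
```

With `lam = lam_for(T_tilde, T)`, the function `rescaled(t, T_tilde, p, lam)` coincides with `u_ode(t, T, p)`, which is the sense in which one "can assume $\tilde T = T$".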
A.2. Elementary estimates for the one dimensional solution. We prove Lemma 2.4 here.
(i) Using a Taylor expansion, we write
\[ \tilde w(y_1, s) = \tilde w(0,s) + y_1 \frac{\partial \tilde w}{\partial y_1}(0,s) + \frac12 y_1^2 \frac{\partial^2 \tilde w}{\partial y_1^2}(0,s) + \frac16 y_1^3 \frac{\partial^3 \tilde w}{\partial y_1^3}(z_1, s) \]
for some $z_1 \in (0, y_1)$. Since $\tilde w$ is even, we have $\frac{\partial \tilde w}{\partial y_1}(0,s) \equiv 0$. Since (22) also holds in $C^k_{loc}$, we have $\frac{\partial^2 \tilde w}{\partial y_1^2}(0,s) \le -\frac{4\tilde C}{s}$ for some $\tilde C > 0$. Since Proposition 2.2 implies that $\left| \frac{\partial^3 \tilde w}{\partial y_1^3}(z_1,s) \right| \le \frac{C_3}{s^{3/2}}$, we combine all the previous estimates with the Taylor expansion to get
\[ \forall |y_1| \le \frac{6\tilde C}{C_3}\sqrt{s} \equiv \tilde\delta\sqrt{s}, \qquad \tilde w(y_1, s) \le \tilde w(0,s) - \frac{\tilde C}{s}\, y_1^2. \tag{58} \]
If $\tilde\delta \ge 1$, then the proof is complete. If $\tilde\delta < 1$, then recall that
\[ \sup_{|y_1| \le \sqrt{s}} \left| \tilde w(y_1, s) - f_1\!\left( \frac{y_1}{\sqrt{s}} \right) \right| \to 0 \quad \text{as } s \to \infty, \tag{59} \]
since $\tilde u$ has the profile $f_1$ defined in (5). Therefore, there is $\hat s > 0$ such that if $s \ge \hat s$ and $\tilde\delta\sqrt{s} \le |y_1| \le \sqrt{s}$, then
\[ |\tilde w(0,s) - \tilde w(y_1,s)| \ge \frac12 \left( f_1(0) - f_1(\tilde\delta) \right) \ge \frac12 \left( f_1(0) - f_1(\tilde\delta) \right) \frac{|y_1|^2}{s}. \tag{60} \]
The conclusion then follows from (58) and (60).
(ii) See identity (5.34) on p. 854 in Filippas and Kohn [10].
(iii) We know from (59) that
\[ \tilde w(y_1, s) \to f_1(0) = (p-1)^{-\frac{1}{p-1}} \quad \text{as } s \to \infty \]
uniformly on compact sets. Since $\|\Delta\tilde w(s)\|_{L^\infty}$ and $\|\nabla\tilde w(s)\|_{L^\infty}$ go to 0 as $s \to \infty$ (see Proposition 2.2), we use Eq. (18) to get
\[ \frac{\partial\tilde w}{\partial s}(y_1, s) \to 0 \quad \text{as } s \to \infty \]
uniformly on compact sets. By the Lebesgue Theorem, we obtain
\[ \left\| \frac{\partial\tilde w}{\partial s}(s) \right\|_{L^2_{\rho_1}(\mathbb{R})} \to 0 \quad \text{as } s \to \infty. \]
Let us introduce $q(y_1, s) = \frac{\partial\tilde w}{\partial s}(y_1, s)$. From (18), we see that q satisfies an equation of the same type as (26):
\[ \frac{\partial q}{\partial s} = (L + \alpha(y_1, s))\, q, \tag{61} \]
where $Lq = \frac{\partial^2 q}{\partial y_1^2} - \frac12 y_1 \frac{\partial q}{\partial y_1} + q$ and $\alpha(y_1, s) = p\,\tilde w(y_1,s)^{p-1} - \frac{p}{p-1}$. In particular, we have the same dynamical system techniques as for Eq. (26). Therefore, we just sketch our argument and borrow techniques from Sect. 2 and from [9] where the same equation was considered. Since $\tilde w$ satisfies Proposition 2.2 and (22), $\alpha$ satisfies the estimates of Lemma 2.5. If we borrow the notations we used for g in Sect. 2 and write
\[ q(y_1, s) = \sum_{n \in \mathbb{N}} q_n(s) h_n(y_1), \qquad I(s) = \|q(s)\|_{L^2_{\rho_1}}, \qquad l_n(s) = |q_n(s)|\, \|h_n(y_1)\|_{L^2_{\rho_1}}, \tag{62} \]
then we have $I(s)^2 = \sum_{n \in \mathbb{N}} q_n(s)^2 \|h_n\|^2_{L^2_{\rho_1}}$ and Eqs. (i) and (ii) in Lemma 2.6 hold. Let us remark that
\[ I(s) \ge \frac{C}{s^2} \quad \text{for s large, where } C > 0. \tag{63} \]
Indeed, $I(s) \ge |q_2(s)|\, \|h_2\|_{L^2_{\rho_1}}$ and by definition (see (31) and (32)),
\[ q_2(s) = \int \frac{\partial\tilde w}{\partial s}(y_1, s)\, k_2(y_1)\, \rho_1(y_1)\, dy_1 = w_2'(s) \sim \frac{\kappa}{4ps^2} \tag{64} \]
as mentioned in (ii) of the lemma we are proving. Like for Eq. (26), Lemma 2.7 holds and either no mode dominates in $I(s)$ or $I(s) \sim l_n(s)$ as $s \to \infty$ for some $n \ge 2$. We claim that $I(s) \sim l_2(s)$ as $s \to \infty$. Indeed, if no mode dominates or if $I(s) \sim l_n(s)$ with $n \ge 3$, then Lemmas 2.6 and 2.7 imply that $I(s)$ has to decay exponentially fast. Contradiction with (63). Using (64), we see that
\[ I(s) \sim l_2(s) = 2\sqrt{2}\, |q_2(s)| \sim \frac{\sqrt{2}\,\kappa}{2ps^2} \quad \text{as } s \to \infty. \tag{65} \]
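Our conclusion follows if we prove that
\[ \|q(y_1, s) - q_2(s) h_2(y_1)\|_{L^2_{\rho_1}} = O\!\left( \frac{1}{s^3} \right). \tag{66} \]
Indeed, parabolic regularity implies that (66) also holds in $L^\infty_{loc}$, in particular, at $y_1 = 0$:
\[ \frac{\partial\tilde w}{\partial s}(0, s) = q(0, s) \sim q_2(s) h_2(0) \sim -\frac{\kappa}{2ps^2} \quad \text{as } s \to \infty, \]
which is the desired conclusion (note that $h_2(0) = -2$, by (31)). Let us prove (66).
Proof of (66). From (62), we see that
\[ \|q - q_2(s) h_2(y_1)\|^2_{L^2_{\rho_1}} = c_0 q_0(s)^2 + c_1 q_1(s)^2 + l_3(s)^2, \tag{67} \]
where $l_3 = \|\pi_3 q\|_{L^2_{\rho_1}}$ and $\pi_3 q = \sum_{n=3}^{\infty} q_n(s) h_n(y_1)$. Using (i) of Lemma 2.6 with n = 0 or 1, along with (65), we see that $\left| l_n'(s) + \left( \frac{n}{2} - 1 \right) l_n(s) \right| \le \frac{C}{s^3}$, which yields
\[ l_n(s) = O\!\left( \frac{1}{s^3} \right) \quad \text{as } s \to \infty \text{ for } n = 0 \text{ or } 1. \tag{68} \]
If we project (61) using $\pi_3$, we see that $\partial_s \pi_3 q = L\pi_3 q + \pi_3(\alpha q)$. Multiplying this equation by $\pi_3 q\, \rho_1(y_1)$ and integrating over $\mathbb{R}$, we see that
\[ \frac12 \frac{d}{ds} l_3^2 = \int L\pi_3 q \cdot \pi_3 q\, \rho_1\, dy_1 + \int \pi_3(\alpha q)\, \pi_3 q\, \rho_1\, dy_1 \le -\frac12 l_3^2 + \int \pi_3(\alpha q)\, \pi_3 q\, \rho_1\, dy_1 \]
because $\pi_3$ is the projector over the negative part of the spectrum. Using the Cauchy–Schwarz inequality twice, we write
\[ \left| \int \pi_3(\alpha q)\, \pi_3 q\, \rho_1\, dy_1 \right| \le \|\pi_3(\alpha q)\|_{L^2_{\rho_1}} \|\pi_3 q\|_{L^2_{\rho_1}} \le \|\alpha q\|_{L^2_{\rho_1}}\, l_3 \ \ (\text{because } \pi_3 \text{ is a projector}) \ \le \|\alpha\|_{L^4_{\rho_1}} \|q\|_{L^4_{\rho_1}}\, l_3. \]
Therefore,
\[ l_3' \le -\frac12 l_3 + \|\alpha\|_{L^4_{\rho_1}} \|q\|_{L^4_{\rho_1}}. \tag{69} \]
Using Proposition 2.5, we see that $\|\alpha\|_{L^4_{\rho_1}} \le \frac{C}{s} \|(1 + y_1^2)\|_{L^4_{\rho_1}} \equiv \frac{C}{s}$. Equation (61) has a nice property of control of the $L^4_{\rho_1}$ norm by the $L^2_{\rho_1}$ norm up to some delay in time (see Lemma 2.3 in [12]):
\[ \left( \int q(y_1, s)^4 \rho_1\, dy_1 \right)^{1/4} \le C \left( \int q(y_1, s - s^*)^2 \rho_1\, dy_1 \right)^{1/2} \]
for some $s^* > 0$. Using (65), we end up with $\|q\|_{L^4_{\rho_1}} \le \frac{C}{s^2}$. Therefore, (69) becomes
\[ l_3' \le -\frac12 l_3 + \frac{C}{s^3}, \]
which yields
\[ l_3(s) = O\!\left( \frac{1}{s^3} \right) \quad \text{as } s \to \infty. \tag{70} \]
Thus, (66) follows from (67), (68) and (70). This concludes the proof of Lemma 2.4.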
B. Projection of Equation (26) on the Different Modes
We prove in this appendix various technical lemmas from Sect. 2. In Subsect. B.1, we prove part (iii) of Lemma 2.6. We prove Lemma 2.8 and Lemma 2.11 in Subsect. B.2. Subsection B.3 is devoted to the proof of Lemma 2.9.
B.1. Equation on the null mode. We prove (iii) of Lemma 2.6 here. Take $\beta \in \mathbb{N}^N$ such that $|\beta| = 2$. If we multiply (26) by $k_\beta(y)\rho(y)$ and integrate over $\mathbb{R}^N$, then we get from (34) and (33) $g_\beta'(s) = \int \alpha g k_\beta \rho\, dy$. Using (36) and the Cauchy–Schwarz inequality, we write for all $a \in S_\delta$, $|\sigma| < \sigma_0$ and $s \ge -\log T + \sigma_0$,
\[ \left| g_\beta' + \frac{1}{4s} \int h_2(y_1)\, g\, k_\beta\, \rho\, dy \right| \le \frac{C}{s^{3/2}} \int (1 + |y|^3)\, |g|\, |k_\beta|\, \rho\, dy \le \frac{C}{s^{3/2}} \|(1 + |y|^3) k_\beta\|_{L^2_\rho} \|g\|_{L^2_\rho} \equiv \frac{C(\beta)}{s^{3/2}}\, I(s). \]
Using (35), (30), (32) and (33), we write
\[ \int h_2(y_1)\, g(y,s)\, k_\beta(y)\, \rho(y)\, dy = \sum_{\gamma \in \mathbb{N}^N} g_\gamma(s) \int h_2(y_1)\, h_\gamma(y)\, k_\beta(y)\, \rho(y)\, dy \]
\[ = \sum_{\gamma \in \mathbb{N}^N} g_\gamma(s) \int h_2(y_1) h_{\gamma_1}(y_1) k_{\beta_1}(y_1) \rho_1(y_1)\, dy_1 \prod_{i=2}^{N} \int h_{\gamma_i} k_{\beta_i} \rho_1(y_i)\, dy_i \]
\[ = \sum_{\gamma \in \mathbb{N}^N} g_\gamma(s) \int h_2(y_1) h_{\gamma_1}(y_1) k_{\beta_1}(y_1) \rho_1(y_1)\, dy_1 \prod_{i=2}^{N} \delta_{\gamma_i, \beta_i}. \]
Because of the orthogonality relation (33) and symmetry, the above term is zero except when for all $i = 2, \dots, N$, $\gamma_i = \beta_i$ and $|\gamma_1 - \beta_1| = 0$ or 2. If $\gamma = \beta$, then the term is $g_\beta(s) \int h_2(y_1) h_{\beta_1}(y_1) k_{\beta_1}(y_1) \rho_1(y_1)\, dy_1 = 4\beta_1 g_\beta(s)$ after straightforward calculations based on (31) and (33), performed for $\beta_1 = 0$, 1 or 2. If $\gamma = \beta \pm (2, 0, \dots, 0)$, then the term is $g_\gamma(s) \int h_2 h_{\beta_1 \pm 2} k_{\beta_1} \rho_1\, dy_1 \equiv C|g_\gamma(s)| \le C(l_0 + l_4)$ by (35). This concludes the proof of (iii) in Lemma 2.6.
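The "straightforward calculations" giving the coefficient $4\beta_1$ can be reproduced exactly with integer arithmetic. The sketch below rests on assumptions that are not visible in this section: the weight is taken to be $\rho_1(y) = e^{-y^2/4}/\sqrt{4\pi}$, the polynomials $h_n$ are generated by $h_{n+1} = y h_n - 2n h_{n-1}$ (which gives $h_2 = y^2 - 2$, consistent with $h_2(0) = -2$), and $k_n = h_n / \|h_n\|^2_{L^2_{\rho_1}}$. The helper names are illustrative.

```python
from fractions import Fraction
from math import factorial


def hermite(n):
    """Coefficients (lowest degree first) of h_n, via h_{k+1} = y*h_k - 2k*h_{k-1}."""
    prev, cur = [Fraction(0)], [Fraction(1)]
    for k in range(n):
        nxt = [Fraction(0)] + cur              # y * h_k
        for i, c in enumerate(prev):
            nxt[i] -= 2 * k * c                # minus 2k * h_{k-1}
        prev, cur = cur, nxt
    return cur


def mul(a, b):
    """Product of two polynomials given by coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def integrate(poly):
    """Integral of poly(y) * exp(-y^2/4) / sqrt(4*pi) over R.

    The even moments of this Gaussian weight are (2k)!/k!; odd moments vanish.
    """
    return sum(c * Fraction(factorial(i), factorial(i // 2))
               for i, c in enumerate(poly) if i % 2 == 0)


def null_mode_coefficient(n):
    """Integral of h_2 * h_n * k_n * rho_1 with k_n = h_n / ||h_n||^2."""
    h2, hn = hermite(2), hermite(n)
    return integrate(mul(mul(h2, hn), hn)) / integrate(mul(hn, hn))
```

Under these assumptions `null_mode_coefficient(n)` returns `4*n` for `n = 0, 1, 2`, and `integrate(mul(hermite(2), hermite(2)))` returns 8, i.e. $\|h_2\|_{L^2_{\rho_1}} = 2\sqrt{2}$ as used in (65).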
B.2. Uniform smallness of the positive modes. We prove Lemma 2.8 and Lemma 2.11 here.
Proof of Lemma 2.8. If we take n = 0 in (i) and (ii) in Lemma 2.6, then we see that $x = e^{-s} l_0(s)$ and $y = e^{-s} I(s)$ satisfy inequality (41) in the ODE Lemma 2.11. Therefore, either (42) or (43) holds. Let us assume by contradiction that (43) holds. Then, we see that $I(s) \sim l_0 > 0$ as $s \to \infty$. Using (i) of Lemma 2.6 with n = 0, we see that $l_0$ and $I$ go to infinity. Contradiction. Thus (42) holds and we get the estimate for $l_0$. We do the same for $l_1$ and $I$, using Lemma 2.11 with $x = e^{-s/2} l_1$ and $y = e^{-s/2} I$. This closes the proof of Lemma 2.8. It remains to prove Lemma 2.11.
Proof of Lemma 2.11. This lemma was proved in [9] with no attention to the dependence of the conclusion on the data. We have proved there that
\[ \text{either } x = O\!\left( \frac{y}{s} \right) \ \text{ or } \ x \sim y > 0 \ \text{ as } s \to \infty, \tag{71} \]
with no uniform estimates. Let us prove the uniform version. Define $\bar s(M, \hat s) \ge \hat s$ such that
\[ \forall s \ge \bar s, \quad \frac{3M}{2s} + \frac{M}{s^2} \left( 5 - \frac{35}{2} M \right) \ge 0. \tag{72} \]
If (42) doesn't hold, then there is $\tilde s \ge \bar s$ such that $\gamma(\tilde s) > 0$, where $\gamma(s) = x(s) - \frac{5M}{s} y(s)$ ($\tilde s$ may depend on x and y). Using (41) and (72), we get
\[ \forall s \ge \tilde s, \quad \gamma' \ge y \left( \frac{3M}{2s} + \frac{M}{s^2} \left( 5 - \frac{35}{2} M \right) \right) - \frac{5M}{2s} \gamma \ge -\frac{5M}{2s} \gamma. \]
Therefore, $\gamma(s) \ge \gamma(\tilde s) \left( \frac{\tilde s}{s} \right)^{\frac{5M}{2}} > 0$ and
\[ \forall s \ge \tilde s, \quad x(s) > \frac{5M}{s} y(s). \tag{73} \]
In particular, $y \ge x > 0$ and we can write from (41) the following inequality:
\[ \forall s \ge \tilde s, \quad \left( \frac{x}{y} \right)' \ge -\frac{M}{s} \left( 1 + \frac{x}{y} \right) + \frac12 \frac{x}{y} \left( 1 - \frac{x}{y} \right) \ge -\frac{2M}{s} + \frac12 \frac{x}{y} \left( 1 - \frac{x}{y} \right). \tag{74} \]
The proof will be completed if we rule out the first possibility in (71). We proceed by contradiction. If $x = O\!\left( \frac{y}{s} \right)$, then we have from (73) and (74),
\[ \left( \frac{x}{y} \right)' \ge -\frac{2M}{s} + \frac{5M}{2s} \left( 1 - \frac{5M}{s} \right) = \frac{M}{2s} - \frac{25M^2}{2s^2} \ge \frac{M}{4s} \]
for s large. This implies that $\frac{x}{y} \to \infty$ as $s \to \infty$. Contradiction with $x \le y$. Thus, only the second case in (71) holds and Lemma 2.11 as well as Lemma 2.8 are proved.
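The algebra behind the lower bound on $\gamma'$ and the choice of $\bar s$ can be checked with exact rational arithmetic. The sketch below assumes (41) reads $x' \ge -\frac{M}{s}y$, $y' \le -\frac12 y + \frac{M}{s}y + \frac12 x$ and evaluates the equality case, for which the bound on $\gamma' = (x - \frac{5M}{s}y)'$ becomes an identity. The formula for `s_bar` is one admissible choice obtained by solving (72) for s (with $\hat s > 0$ assumed); it is not claimed to be the paper's choice.

```python
from fractions import Fraction


def gamma_prime_equality_case(M, s, x, y):
    """d/ds of gamma = x - (5M/s) y when both inequalities in (41) are equalities."""
    x_prime = -M / s * y
    y_prime = -y / 2 + M / s * y + x / 2
    return x_prime - 5 * M / s * y_prime + 5 * M / s**2 * y


def claimed_expression(M, s, x, y):
    """y * (3M/(2s) + (M/s^2)(5 - 35M/2)) - (5M/(2s)) * gamma."""
    gamma = x - 5 * M / s * y
    bracket = 3 * M / (2 * s) + M / s**2 * (5 - Fraction(35, 2) * M)
    return y * bracket - 5 * M / (2 * s) * gamma


def s_bar(M, s_hat):
    """Smallest s >= s_hat with 3M/(2s) + (M/s^2)(5 - 35M/2) >= 0, i.e. s >= (35M - 10)/3."""
    return max(s_hat, (35 * M - 10) / 3)
```

For example, with `M = Fraction(3)` the threshold is `Fraction(95, 3)`, and at that value of s the bracket in (72) vanishes exactly.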
B.3. Decay of the null mode. We prove Lemma 2.9 here. We use Eq. (iii) in Lemma 2.6. We need to estimate the error terms there. Let $s_3(\sigma_0) = \max(-\log T + \sigma_0, s_1(\sigma_0))$, where $s_1(\sigma_0)$ is defined in Lemma 2.8. Consider some $a \in S_\delta$ and $|\sigma| \le \sigma_0$. According to Lemma 2.8 and (29), we have for all $s \ge s_3(\sigma_0)$,
\[ l_4(s) \le I(s) \le C(\sigma_0) \frac{\log s}{s^2} \quad \text{and} \quad l_0 \le C \frac{I(s)}{s} \le C(\sigma_0) \frac{\log s}{s^3}. \tag{75} \]
As for the size of $l_4$, we integrate the equation in (i) of Lemma 2.6 with n = 4 to get
\[ \forall s \ge s_3, \quad l_4(s) \le e^{-(s - s_3)} l_4(s_3) + e^{-s} \int_{s_3}^{s} e^{t} \frac{I(t)}{t}\, dt. \]
Using (75), we see that
\[ \int_{s_3}^{s} e^{t} \frac{I(t)}{t}\, dt \le C(\sigma_0) \int_{s_3}^{s} e^{t} \frac{\log t}{t^3}\, dt \le C(\sigma_0)\, e^{s} \frac{\log s}{s^3}. \]
Therefore,
\[ \forall s \ge s_3, \quad l_4(s) \le C(\sigma_0) \frac{\log s}{s^3}. \tag{76} \]
Using (iii) of Lemma 2.6 along with (75) and (76) yields
\[ \forall s \ge s_3,\ \forall |\beta| = 2, \quad \left| g_\beta'(s) + \frac{\beta_1}{s} g_\beta(s) \right| \le C(\sigma_0) \frac{\log s}{s^{7/2}}. \]
Since $\beta_1 = 0$, 1 or 2 and $|g_\beta(s)| \le C l_2(s) \le C I(s) \le C(\sigma_0) \frac{\log s}{s^2}$ by (75), this yields the conclusion.
C. Estimates in the Intermediate Region
We prove Proposition 3.5 here. From (7) and Corollary 3.4, we have information on $u$ and $u - \tilde u_{\sigma(P_S(x))}$ at $(x, \tilde t(d(x,S)))$, a point on the edge of the blow-up region. We use this as initial data, and solve the 2 ODEs of Proposition 2.3 between $\tilde t$ and $t$ to get an estimate on $u$ and $u - \tilde u_{\sigma(P_S(x))}$ at $(x,t)$, when $t \in [\tilde t, T)$. For clarity, we work with rescaled versions of $u$ and $\tilde u$, defined for all $(\xi, \tau) \in \mathbb{R}^2 \times \left[ -\frac{\tilde t}{T - \tilde t}, 1 \right)$ by:
\[ v(x, \xi, \tau) = (T - \tilde t)^{\frac{1}{p-1}}\, u\!\left( x + \xi\sqrt{T - \tilde t},\ \tilde t + \tau (T - \tilde t) \right), \]
\[ \tilde v(x, \xi, \tau) = (T - \tilde t)^{\frac{1}{p-1}}\, \tilde u_{\sigma(P_S(x))}\!\left( d(x,S) + \xi_1 \sqrt{T - \tilde t},\ \tilde t + \tau (T - \tilde t) \right), \tag{77} \]
\[ h(x, \xi, \tau) = v - \tilde v, \]
where $\tilde t = \tilde t(d(x,S))$ is defined in (44) and goes to T as $d(x,S) \to 0$. We start with initial data at $\tau = 0$ for $v$, $\tilde v$ and $h$ (which corresponds to information on u at time $\tilde t$, i.e. at the frontier between the blow-up and the intermediate regions).
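We see from Corollary 3.4 and (7) that there is $\epsilon_1 > 0$ such that if $|x - \hat a| < \delta$ and $d(x,S) < \epsilon_1$, then
\[ |v(x, 0, 0) - f_1(1)| \le C \frac{\log|\log(T - \tilde t)|}{|\log(T - \tilde t)|}, \qquad |h(x, 0, 0)| \le C (T - \tilde t)^{\frac12} |\log(T - \tilde t)|^{\frac32 + C_0}. \tag{78} \]
As rescaled versions, $v$ and $\tilde v$ are still solutions of the PDE (1). However, it is easier to work with the localizing ODE given in Proposition 2.3: for all $\epsilon > 0$ and $(x,t) \in \mathbb{R}^N \times [\frac{T}{2}, T)$,
\[ |\partial_t u - |u|^{p-1} u| \le \epsilon |u|^p + C_\epsilon, \qquad |\partial_t \tilde u - |\tilde u|^{p-1} \tilde u| \le \epsilon |\tilde u|^p + C_\epsilon, \]
where $C_\epsilon$ denotes hereafter a constant depending only on $\epsilon$. Since $\sigma(a)$ is continuous in terms of a (see Proposition 2.1), we see from the definition of $\tilde u_\sigma$ (10) that for all $a \in S_\delta$ and $(x,t) \in \mathbb{R}^N \times [T - e^{-\sigma_0} \frac{T}{2}, T)$,
\[ |\partial_t \tilde u_{\sigma(a)} - |\tilde u_{\sigma(a)}|^{p-1} \tilde u_{\sigma(a)}| \le \epsilon |\tilde u_{\sigma(a)}|^p + C_\epsilon. \]
Using (77), we get for all $\epsilon > 0$, $x \in B(\hat a, \delta)$ and $\tau \in [0, 1)$,
\[ |\partial_\tau v(x, 0, \tau) - |v|^{p-1} v| \le \epsilon |v|^p + C_\epsilon (T - \tilde t)^{\frac{p}{p-1}}, \]
\[ |\partial_\tau \tilde v(x, 0, \tau) - |\tilde v|^{p-1} \tilde v| \le \epsilon |\tilde v|^p + C_\epsilon (T - \tilde t)^{\frac{p}{p-1}}, \tag{79} \]
\[ |\partial_\tau h(x, 0, \tau) - p |\bar v|^{p-1} h| \le \epsilon (|v|^p + |\tilde v|^p) + C_\epsilon (T - \tilde t)^{\frac{p}{p-1}} \]
for some $\bar v \in [v, \tilde v]$. Since the solution of
\[ v_0' = v_0^p, \qquad v_0(0) = f_1(1) \]
is $v_0(\tau) = \left( \frac{(p-1)^2}{4p} + (p-1)(1 - \tau) \right)^{-\frac{1}{p-1}}$, a bounded function for all $\tau \in [0, 1]$, we use the continuity of ODE solutions with respect to initial data to get
\[ \sup_{\tau \in [0,1)} |v(x, 0, \tau) - v_0(\tau)| + |\tilde v(x, 0, \tau) - v_0(\tau)| \to 0 \quad \text{as } d(x,S) \to 0 \]
and
\[ \sup_{\tau \in [0,1)} |h(x, 0, \tau)| \le C |h(x, 0, 0)| \]
whenever $d(x,S) \le \epsilon_0$ for some $\epsilon_0 > 0$. Therefore, we get from (77) and (78):
\[ \sup_{\tilde t \le t < T} \left| u(x,t) - \tilde u_{\sigma(P_S(x))}(d(x,S), t) \right| \le C (T - \tilde t)^{\frac12 - \frac{1}{p-1}} |\log(T - \tilde t)|^{\frac32 + C_0}. \]
[…] t > 0),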
(1.1)
with
\[ u(x, t = 0) = u_0(x) \ge 0, \qquad (x \in \mathbb{R}). \tag{1.2} \]
This equation, derived from a lubrication approximation, models the surface tension dominated motion of thin viscous films and spreading droplets [24]
\[ \frac{\partial u}{\partial t} = -\nabla_x \cdot \left( f(u) \nabla_x \Delta_x u \right). \tag{1.3} \]
Equation (1.1) is a particular case of the thin film equation
\[ \frac{\partial u}{\partial t} = -(|u|^n u_{xxx})_x, \qquad (x \in \mathbb{R},\ t > 0), \tag{1.4} \]
J. A. Carrillo, G. Toscani
where $n > 0$. Compactly supported nonnegative source type solutions to (1.4) exist for all $0 < n < 3$ [7]. For a given n and mass, there is more than one similarity solution $U(x,t)$. A unique (up to translation) solution is obtained by imposing the additional constraint $U_x(x,t) = 0$ at the edge of the support. Recent numerical studies [8, 9] indicate that the support of the solution has finite speed of propagation and continuous flux, two properties desirable for a physically correct model. Moreover, they show a rapid convergence of the solution onto the similarity solution before the merging of support. For second order degenerate diffusion equations, like the porous medium equation [2]
\[ \frac{\partial u}{\partial t} = \nabla_x \cdot (|u|^n \nabla_x u), \qquad n > 0, \tag{1.5} \]
the convergence of the solution towards the similarity solution has been known for many years (see [30, 31] and the references therein). Recently, this problem has been addressed by techniques borrowed from kinetic theory, essentially based on the study of the time decay of the entropy [13, 14, 16, 26]. This strategy is possible any time we can work with an evolution equation which possesses a unique steady state of given mass, in correspondence to which the convex entropy attains the (unique) extremal point. For this reason, instead of working on (1.1) directly, one considers the asymptotic decay towards its equilibrium state of solutions to the (nonlinear) equation
\[ \frac{\partial v}{\partial t} = (xv - v v_{xxx})_x, \qquad (x \in \mathbb{R},\ t > 0), \tag{1.6} \]
\[ v(x, t = 0) = u_0(x) \ge 0, \qquad (x \in \mathbb{R}). \tag{1.7} \]
The reason relies on the following fundamental remark: there exists a time dependent scaling which transforms (1.6) into the thin film equation (1.1); moreover, we can fix the time scaling in order for the initial data for (1.6) after rescaling to be the same as for the original equation. The exact expression of this time transformation follows from a by now standard time dependent change of variables [13, 14, 16, 26] and is given by
\[ v(x,t) = \alpha(t)\, u(\alpha(t) x, \beta(t)), \tag{1.8} \]
where $\alpha(t) = e^{t}$ and $\beta(t) = \left( e^{5t} - 1 \right)/5$.
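The effect of the change of variables (1.8) can be seen on a self-similar example. The sketch below is built on two assumptions that should be read as such: the profile is taken to be $V(x) = \frac{1}{24}(C^2 - x^2)_+^2$, and $u(x,t) = (5t+1)^{-1/5} V\big(x (5t+1)^{-1/5}\big)$ is used as a source-type solution of the thin film equation with $n = 1$ (this follows from $V''' = x$ on the support, which the code does not verify). Under (1.8) this $u$ is mapped to the time-independent function $V$.

```python
import math


def profile(x: float, C: float) -> float:
    """Assumed profile (C^2 - x^2)_+^2 / 24."""
    return max(C * C - x * x, 0.0) ** 2 / 24.0


def u_selfsimilar(x: float, t: float, C: float) -> float:
    """Assumed source-type solution u = (5t+1)^(-1/5) * profile(x (5t+1)^(-1/5))."""
    scale = (5.0 * t + 1.0) ** (-0.2)
    return scale * profile(x * scale, C)


def v_rescaled(x: float, t: float, C: float) -> float:
    """v(x,t) = alpha(t) u(alpha(t) x, beta(t)), alpha = e^t, beta = (e^{5t} - 1)/5."""
    alpha = math.exp(t)
    beta = (math.exp(5.0 * t) - 1.0) / 5.0
    return alpha * u_selfsimilar(alpha * x, beta, C)
```

Since $5\beta(t) + 1 = e^{5t}$, the factor $(5\beta + 1)^{-1/5}$ equals $1/\alpha(t)$, so `v_rescaled(x, t, C)` agrees with `profile(x, C)` for every `t` up to rounding; the initial data coincide as well because $\alpha(0) = 1$ and $\beta(0) = 0$.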
As a conclusion, any property about the asymptotic behavior of $v(x,t)$ can be translated into a result about the asymptotic behavior of $u(x,t)$. In fact, self-similar source-type solutions of (1.1) are translated into steady states for (1.6). Equation (1.6) has a unique $C^1(\mathbb{R})$ (up to translation) compactly supported steady state of given mass M,
\[ v_\infty(x) = \frac{1}{24} \left( C^2 - x^2 \right)_+^2 \tag{1.9} \]
with $C = C(M)$, and, as usual, $g_+$ indicates the positive part of g. Note that the uniqueness is due to the regularity condition $C^1(\mathbb{R})$ which implies that $v_x = 0$ at the edge of the support. In fact, this uniqueness is derived from the uniqueness of the source type solutions $U(x,t)$ for (1.1) with the additional assumption $U_x(x,t) = 0$ at the edge of the support (see [7]). This solution has been found first by Smyth and Hill [28] and
Long-Time Asymptotics for Strong Solutions of the Thin Film Equation
proved to be linearly stable very recently [8]. The steady state (1.9) is nothing but the Barenblatt–Pattle steady state of the rescaled porous medium equation
\[ \frac{\partial h}{\partial t} = (xh + h^{\frac12} h_x)_x, \qquad (x \in \mathbb{R},\ t > 0). \tag{1.10} \]
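That a profile of the form $\frac{1}{24}(C^2 - x^2)^2$ is a steady state of (1.6) with vanishing slope at the edge of its support can be verified by exact polynomial differentiation. The sketch below assumes this is the form of (1.9); inside the support, a steady state with zero flux satisfies $xv - v v_{xxx} = 0$, that is $v_{xxx} = x$. Function names are illustrative.

```python
from fractions import Fraction


def steady_profile(C2):
    """Coefficients (lowest degree first) of (C^2 - x^2)^2 / 24, with C2 = C^2."""
    return [Fraction(C2 * C2, 24), Fraction(0), Fraction(-2 * C2, 24),
            Fraction(0), Fraction(1, 24)]


def derivative(poly):
    """Derivative of a polynomial given by its coefficient list."""
    return [i * c for i, c in enumerate(poly)][1:]


def evaluate(poly, x):
    """Value of the polynomial at x."""
    return sum(c * x**i for i, c in enumerate(poly))


def third_derivative(poly):
    """Apply the derivative three times."""
    for _ in range(3):
        poly = derivative(poly)
    return poly
```

With `C2 = 9` (so $C = 3$), `third_derivative(steady_profile(9))` is the polynomial `x`, and both the profile and its first derivative vanish at `x = 3`, which is the $C^1$ matching condition at the edge of the support mentioned in the text.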
Following [13, 14] the natural entropy to study the asymptotic behavior for (1.10) is given by
\[ H(f) = \int_{\mathbb{R}} \left( \frac{x^2}{2} f + \sqrt{\frac{8}{3}}\, f^{3/2} \right) dx. \tag{1.11} \]
In fact, this entropy has a unique minimum point attained at the stationary state (1.9). A related functional involving the second moment has been recently used in [11] for the analysis of a fourth order diffusion equation similar to (1.6), where the confining effect is due to a nonlinear second order antidiffusion term. It is interesting to remark that through the evolution of this functional they are able to study the blow-up in finite time of solutions. However, in the present study the term of $H(f)$ that is nonlinear in f is due to the fourth-order term rather than to the second-order term as in [11]. In the next sections, given the solution to (1.6), we shall study the asymptotic decay of its entropy (2.9) towards the minimum, showing exponential convergence to equilibrium in relative entropy. A Csiszár–Kullback type inequality [20, 14] then implies an exponential convergence to the equilibrium $v_\infty$ in $L^1$. Going back through the change of variables we recover an algebraic decay rate towards the most regular source-type solution [7, 10] of the thin-film equation (1.1). All the above computations can be done rigorously for strong solutions of the problem (1.1) in which $u \in C^1(\mathbb{R})$ for a.e. $t > 0$ and this is the aim of the rest of the paper. We state our main result here and we refer to the fourth section for a more rigorous statement.
Theorem 1.1. The intermediate asymptotics in $L^1$ for strong positive solutions of the thin film equation (1.1) are given by the unique (among source type solutions) strong source type solution of the equation with the same mass. Moreover, an explicit and universal algebraic rate of convergence can be obtained.
Let us point out that to our knowledge the term intermediate asymptotics has been used [3] when self-similar behavior as t → ∞ for solutions of Cauchy problems is obtained, although they can be considered anyway as large-time asymptotics results. For the case of prescribed contact angle at the boundary we refer to [27], in which the existence of solution for this problem is analysed. These solutions and the less regular source-type solutions are not strong solutions, and thus outside the scope of Theorem 1.1. It would be certainly interesting to study the stability properties of these less regular source-type solutions. The methods introduced in this work are restricted to Eq. (1.1) among the set of Eqs. (1.4) due to two facts. First, the entropy associated to the strong steady state (1.9) is explicit as the steady state is; in the general case (1.4), 0 < n < 3, the strong steady states are not explicit nor the entropy functionals. Therefore, the application of the entropy-entropy production method based on the knowledge of the derivative in time of the functional becomes rather difficult. Nevertheless, extensions of the method to other fourth-order equations is possible, and we discuss them in the last section. Equation (1.1) has a very particular structure, and in fact it is the only one among thin-film equations
J. A. Carrillo, G. Toscani
(1.4), 0 < n < 3, that can be written formally as the gradient flow for a suitable functional in a suitable metric (we refer to [26, 27] for details). In our case, (1.6) can be written formally as a gradient flow with respect to the entropy functional, while (1.1) can be written as one with respect to the dissipation of $H(f)$. This would be part of future research. Also, there are particular weak forms of the equation that are only valid for n = 1, see [11]. Finally, we should remark that the uniqueness of strong solutions of the Cauchy problem for (1.1) remains an open problem. Nevertheless, we are able to understand the large-time asymptotics of these strong solutions simply by looking at entropy and energy identities and inequalities. This uniqueness issue has nothing to do with the uniqueness of the strong source-type solution among the set of source-type solutions [7]. The results of the present paper will be reached in steps. As a first step, in the next section we will discuss the analogies between the second- and fourth-order diffusions. By this analysis, the case n = 1 will be clearly separated from any other equation of type (1.4) with n ≠ 1. Section 3 will be devoted to an overview of the properties of strong solutions to the thin film equation (1.4). Section 4 contains the main result, namely the rigorous study of the time decay of the entropy for compactly supported initial data. Section 5 contains the generalization of the main result to non-compactly supported initial data and finally, in Sect. 6 we discuss different fourth order degenerate diffusion equations to which this strategy may be applied. Some of these problems have been addressed before. In particular, the asymptotic decay of the solution to (1.4) in a bounded domain with periodic boundary conditions has been studied in [10] by means of generalized convex entropies.
The authors show that the weak nonnegative solution becomes a strong positive solution after some finite time, and approaches its mean exponentially in time as t → ∞. Recently, explicit constants for the exponential rate of decay of the solution to the same problem have been obtained in [23]. Here, the natural entropy (1.11) has been introduced for the bounded domain problem.

2. Similarities Between the Fourth- and Second-Order Diffusion Equations

In this section we compare the fourth-order problem (1.4) to the well-known second order degenerate diffusion equation (1.5). Some similarities between the fourth- and second-order cases are that both equations are parabolic and in divergence form, with a subdiffusive nonlinear diffusion coefficient. Moreover, in both cases there are compactly supported source type solutions: for all n > 1 in the second-order case, and for all 0 < n < 3 in the fourth-order case. In general, at any deeper level, the similarities between the fourth- and second-order problems cease to exist. One striking difference is the lack of a maximum principle for the fourth-order problem. If n = 1, additional similarities with the porous medium equation

$$h_t = a\,\bigl(h^{1/2} h_x\bigr)_x, \tag{2.1}$$

can be found. First, let us remark that (1.1) can be written as

$$u_t = -2\left(u^{3/2}\,\bigl(u^{1/2}\bigr)_{xx}\right)_{xx}. \tag{2.2}$$

After the rescaling (1.8), Eq. (2.2) becomes

$$v_t = -2\left(v^{3/2}\,\bigl(v^{1/2}\bigr)_{xx}\right)_{xx} + (xv)_x. \tag{2.3}$$
Long-Time Asymptotics for Strong Solutions of the Thin Film Equation
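For a given positive constant $c$, let us add and subtract to (2.3) the quantity

$$2c\,\bigl(v^{3/2}\bigr)_{xx} = 2c\left(v^{3/2}\left(\frac{x^2}{2}\right)_{xx}\right)_{xx}. \tag{2.4}$$

We have

$$\begin{aligned}
v_t &= -2\left(v^{3/2}\left(v^{1/2} + c\,\frac{x^2}{2}\right)_{xx}\right)_{xx} + 2c\,\bigl(v^{3/2}\bigr)_{xx} + (xv)_x \\
&= -2c\left(v^{3/2}\left(\frac{1}{c}\,v^{1/2} + \frac{x^2}{2}\right)_{xx}\right)_{xx} + \left(v\left(6c\,v^{1/2} + \frac{x^2}{2}\right)_x\right)_x .
\end{aligned} \tag{2.5}$$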
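If we set $c = 1/\sqrt{6}$, we finally obtain

$$v_t = -\frac{2}{\sqrt{6}}\left(v^{3/2}\left(\sqrt{6}\,v^{1/2} + \frac{x^2}{2}\right)_{xx}\right)_{xx} + \left(v\left(\sqrt{6}\,v^{1/2} + \frac{x^2}{2}\right)_x\right)_x . \tag{2.6}$$

Now, consider Eq. (2.1) rescaled as in (1.8). If we set $a = \sqrt{6}/2$ we obtain

$$w_t = \left(w\left(\sqrt{6}\,w^{1/2} + \frac{x^2}{2}\right)_x\right)_x , \tag{2.7}$$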
which is nothing but Eq. (2.6) without the higher order term. Since steady states of both Eqs. (2.6) and (2.7) are obtained by setting

$$v\left(\sqrt{6}\,v^{1/2} + \frac{x^2}{2}\right)_x = 0, \tag{2.8}$$
both equations have the same $C^1(\mathbb{R})$ steady states. Thus, by studying entropies of the second order nonlinear degenerate diffusion equation (2.7) we obtain at once entropies for the fourth-order nonlinear diffusion equation (2.6). Following [13, 14], it is immediate to recover the exact form of the entropy associated to the steady state $v_\infty$ given in (1.9):

$$H(f) = \int_{\mathbb{R}} \left( \frac{x^2}{2}\, f + \sqrt{\frac{8}{3}}\, f^{3/2} \right) dx. \tag{2.9}$$

Theorem 2.1 of [29] then gives that, for any given $\Omega \supseteq (-C, C)$, $v_\infty$ is the unique extremal of $H(f)$ for all $f$ belonging to the manifold

$$F_c = \left\{ f \ge 0,\ f \in L^1(\Omega),\ \int_\Omega f(x)\,dx = M \right\}, \tag{2.10}$$

namely

$$H(f) \ge H(v_\infty), \quad \text{and the equality holds if and only if } f = v_\infty. \tag{2.11}$$
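The steady-state relation behind (2.8) can be checked numerically. The following is a minimal sketch, assuming the steady state has the form $v_\infty(x) = (C^2 - x^2)_+^2/24$ used later in the paper; the closed form $C^5 = 45M/2$ for the mass constraint is derived here by direct integration and is not stated in the text, and all function names are illustrative.

```python
import math


def c_from_mass(M):
    # Assumption: integrating (C^2 - x^2)^2 / 24 over [-C, C] gives 2 C^5 / 45,
    # so the mass constraint reads C^5 = 45 M / 2.
    return (45.0 * M / 2.0) ** 0.2


def v_inf(x, C):
    # Candidate steady state: (C^2 - x^2)_+^2 / 24.
    return max(C * C - x * x, 0.0) ** 2 / 24.0


def mass(C, n=20000):
    # Composite trapezoidal rule on the support [-C, C].
    h = 2.0 * C / n
    total = 0.0
    for i in range(n + 1):
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * v_inf(-C + i * h, C)
    return total * h


def steady_state_residual(C, n=200):
    # Largest deviation of x^2/2 + sqrt(6) v^{1/2} from the constant C^2/2
    # at interior points of the support.
    worst = 0.0
    for i in range(1, n):
        x = -C + 2.0 * C * i / n
        value = 0.5 * x * x + math.sqrt(6.0 * v_inf(x, C))
        worst = max(worst, abs(value - 0.5 * C * C))
    return worst


C = c_from_mass(1.0)
print(mass(C), steady_state_residual(C))
```

Inside the support the quantity $x^2/2 + \sqrt{6}\,v_\infty^{1/2}$ is constant, so its $x$-derivative vanishes, which is the content of (2.8).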
Using this entropy (2.9) we have, at least formally, integrating by parts in Eq. (2.6), that

$$\begin{aligned}
\frac{d}{dt} H(v) &= \int_{\mathbb{R}} v_t \left( \frac{x^2}{2} + \sqrt{6}\,v^{1/2} \right) dx \\
&= -\int_{\mathbb{R}} v \left[ \left( \frac{x^2}{2} + \sqrt{6}\,v^{1/2} \right)_x \right]^2 dx - \frac{2}{\sqrt{6}} \int_{\mathbb{R}} v^{3/2} \left[ \left( \frac{x^2}{2} + \sqrt{6}\,v^{1/2} \right)_{xx} \right]^2 dx.
\end{aligned} \tag{2.12}$$

Let $v_\infty(x)$ be the stationary solution defined by (1.9). The relative entropy $H(v|v_\infty)$ is defined by $H(v|v_\infty) = H(v) - H(v_\infty)$. Thus, by (2.12) we have

$$\frac{d}{dt} H(v|v_\infty) \le -\int_{\mathbb{R}} v \left[ \left( \frac{x^2}{2} + \sqrt{6}\,v^{1/2} \right)_x \right]^2 dx = -D_p(v) \le 0. \tag{2.13}$$

We remark that $D_p(v)$ is the entropy production associated to the porous medium type equation (2.1). Lower bounds for the entropy production in terms of the relative entropy have been obtained in [13, 16]. These bounds correspond to generalized logarithmic Sobolev inequalities. The results of [13] and Theorem 17 of [14] ensure that

$$H(v|v_\infty) \le \frac{1}{2} D_p(v). \tag{2.14}$$
Applying (2.14) to $v(t)$ we finally deduce

$$\frac{d}{dt} H(v(t)|v_\infty) \le -2 H(v(t)|v_\infty),$$

which implies exponential convergence to equilibrium in relative entropy with an explicit rate. A Csiszár–Kullback type inequality [20, 14] shows that the $L^1$ deviation of $v$ with respect to $v_\infty$ is controlled by $H(v(t)|v_\infty)$ and thus, one proves the exponential convergence to the equilibrium $v_\infty$ in $L^1$. The change of variables (1.8) finally gives an algebraic decay rate towards the most regular source-type solution [7, 10] of the thin-film equation (1.1). All the formal computations we gave can be done rigorously for strong solutions and compactly supported initial data. This will be the object of the next sections.

Remark 2.1. The similarity between the thin film equation (1.4) and the porous medium equation (1.5) is unfortunately limited to the case discussed in this section. In fact, only in this case is the $C^1$ similarity solution to (1.1) also a similarity solution to the porous medium equation (1.5). In all the other cases in which a similarity solution to (1.4) is known to exist, namely for 0 < n < 3, the solution is neither explicit nor a Barenblatt–Pattle type similarity solution. Moreover, only if n = 1 can the thin film equation be written in the form (2.6), which is the key to the whole analogy. Nevertheless, the properties of the similarity solutions for n ≠ 1 discussed in [7] (compact support and monotonicity) do not exclude, in the spirit of the analysis of [29], the existence of a suitable entropy which attains its minimum at the similarity solution.
3. Overview on Properties of Strong Solutions

In this section, we deal with the asymptotic decay of the solution to (2.5). In the sequel we will assume the initial data $u_0$ is non-negative and compactly supported. Moreover, $u_0 \in H^1(\mathbb{R})$ with mass $M > 0$. We will denote by (H0) this set of hypotheses on the initial data. The Cauchy problem for Eq. (1.1) has been studied deeply in [5]. The basic existence theorem in a bounded domain with no-flux boundary conditions has been obtained by Bernis and Friedman [6]. This theorem assures the existence of a weak solution in the sense of Definition 3.1 below. Later on, a detailed analysis of the regularity of the solution [4, 10] proved the existence of strong solutions in the sense of Definition 3.2 below and many other properties. Bernis [5] studied the finite speed of propagation of strong solutions by means of new entropy estimates. These strong solutions are obtained by a regularization first introduced in [6] and subsequently used in [4, 10, 5]. In what follows, we merely collect the concepts of weak and strong solutions and their properties, referring the reader mainly to [5, 10] for details. We will use the notations $Q = \mathbb{R} \times (0, \infty)$, $Q_T = \mathbb{R} \times (0, T)$, $P = Q \setminus (\{u = 0\} \cup \{t = 0\})$ and $P_T = Q_T \setminus (\{u = 0\} \cup \{t = 0\})$.

Definition 3.1. A weak solution to problem (1.1) with initial condition satisfying (H0) is a function $u(x, t) \ge 0$ enjoying the following properties:

$$u \in C^{1/2,1/8}(\bar{Q}) \cap L^\infty(0, \infty; H^1(\mathbb{R})), \tag{3.1}$$

$$u \in C^\infty(P) \quad \text{and} \quad u^{1/2} u_{xxx} \in L^2(P), \tag{3.2}$$

$$\iint_Q u\,\psi_t + \iint_P u\,u_{xxx}\,\psi_x = 0 \tag{3.3}$$

for all $\psi \in \mathrm{Lip}(\bar{Q})$ with compact support inside $Q$,

$$u(x, 0) = u_0(x),\ x \in \mathbb{R}, \quad \text{and} \quad u_x(\cdot, t) \to u_{0x} \text{ strongly in } L^2(\mathbb{R}) \text{ as } t \to 0. \tag{3.4}$$

If a weak solution satisfies additional regularity it is called a strong solution. Let us note that the concept of strong solution is much weaker than that of classical solution.

Definition 3.2. A strong solution to problem (1.1) with initial condition satisfying (H0) is a weak solution verifying

$$u \in L^2([0, T], H^2(\mathbb{R})) \tag{3.5}$$

for any $T > 0$, and thus $u(\cdot, t) \in C^1(\mathbb{R})$ for a.e. $t > 0$.

By means of a regularization, a strong solution for problem (1.1) was proved to exist in [6, 4, 10, 5]. This solution satisfies the following additional regularity:

$$u^{1-s/2} \in L^2([0, T], H^2(\mathbb{R})) \quad \text{for } 0 < s < \frac{1}{2}, \tag{3.6}$$

$$(u^r)_x \in L^4(Q_T) \quad \text{for any } \frac{1}{2} - \frac{s}{4} \le r < 1,\ 0 < s < \frac{1}{2}, \tag{3.7}$$
for any $T > 0$, and $u$ satisfies Eq. (1.1) in the sense

$$\iint_Q u\,\psi_t - \iint_Q u\,u_{xx}\,\psi_{xx} - \iint_Q u^{1-r} \left( \frac{u^r}{r} \right)_x u_{xx}\,\psi_x = 0 \tag{3.8}$$

for all $\psi \in C^\infty(\bar{Q})$ with compact support inside $Q$ and $r$, $s$ as in (3.6)–(3.7). Notice that the case $s = 0$ and $r = 1/2$ is not included. Moreover, $u$ also satisfies Eq. (1.1) in the sense (see [11])

$$-\iint_{Q_T} u\,\psi_t + \int_{\mathbb{R}} u(x, T)\,\psi(x, T)\,dx - \int_{\mathbb{R}} u_0(x)\,\psi(x, 0)\,dx = \iint_{Q_T} u\,u_x\,\psi_{xxx} + \frac{3}{2} \iint_{Q_T} u_x^2\,\psi_{xx} \tag{3.9}$$

for all $\psi \in C_0^\infty(\bar{Q}_T)$ and all $T > 0$. Let us remark that this last property comes from the previous weak formulation by integrating by parts once in the last term in (3.8), taking into account that $u_x = 0$ in the set where $u = 0$ due to $u \in C^1(\mathbb{R})$ and $u \ge 0$. Strong solutions are known to preserve mass,

$$\int_{\mathbb{R}} u(x, t)\,dx = \int_{\mathbb{R}} u_0(x)\,dx = M \quad \text{(conservation of mass)}. \tag{3.10}$$

Moreover, there is dissipation of surface-tension energy, that is,

$$\int_{\mathbb{R}} u_x^2(x, T)\,dx + 2 \iint_{P_T} u\,u_{xxx}^2(x, t)\,dx\,dt \le \int_{\mathbb{R}} u_{0x}^2\,dx, \tag{3.11}$$

for all $T > 0$, and they admit entropies (Sect. 3 in [5]): in particular, the function

$$t \longmapsto \int_{\mathbb{R}} u^{1+\lambda}(x, t)\,dx$$

is absolutely continuous in $[0, \infty)$ for $-\tfrac{1}{2} < \lambda$ and $\lambda \notin \{0, 2, 3\}$, and verifies

$$-\frac{1}{\lambda(\lambda + 1)} \frac{d}{dt} \int_{\mathbb{R}} u^{1+\lambda}(x, t)\,dx = \int_{\mathbb{R} \cap \{u > 0\}} u^\lambda u_{xx}^2\,dx + \frac{\lambda(1 - \lambda)}{3} \int_{\mathbb{R} \cap \{u > 0\}} u^{\lambda - 2} u_x^4\,dx. \tag{3.12}$$

Moreover, we have the following integration-by-parts formula:

$$\int_{\mathbb{R} \cap \{u > 0\}} u^{\lambda - 1} u_{xx}\,u_x^2\,dx = \frac{1 - \lambda}{3} \int_{\mathbb{R} \cap \{u > 0\}} u^{\lambda - 2} u_x^4\,dx \quad \text{a.e. } t > 0. \tag{3.13}$$

Let us remark that this integration-by-parts formula is not directly written in [5] but it is a straightforward consequence of Lemma 3.3 in [5]. Finally, the support of the solution increases following the law

$$A_1 M^{1/5} t^{1/5} \le |\chi(u)|(t) \le |\chi(u_0)| + A_2 M^{1/5} t^{1/5} \tag{3.14}$$
for any t > 0 where |χ (u)|(t) = χ+ (u)(t) − χ− (u)(t) and χ± (t) are the supremum and the infimum of the support of u(x, t) respectively (Theorems 7.1 and 7.2 in [5]). This finite speed of the propagation property for strong solutions of (1.1) is the reason
behind the intermediate asymptotics. Let us remark that (3.14) is optimal for the unique strong (self-similar) source-type solution. Finally, we have also the following property on strong solutions of problem (1.1): the function

$$t \longmapsto \int_{\mathbb{R}} \frac{x^2}{2}\,u(x, t)\,dx$$

is absolutely continuous in $[0, \infty)$ and verifies

$$\frac{d}{dt} \int_{\mathbb{R}} \frac{x^2}{2}\,u\,dx = \frac{3}{2} \int_{\mathbb{R}} u_x^2\,dx. \tag{3.15}$$
The evolution of the second moment is directly derived from (3.9) by taking as test function $\psi(x, t) = \theta_1(t)\,\frac{x^2}{2}\,\theta_2(x)$, where $\theta_2(x) \in C_0^\infty(\mathbb{R})$ and $\theta_2(x) = 1$ inside the support of $u(x, t)$ for any $0 \le t \le T$. Now, we translate all the properties and definitions of solutions for (1.1) to properties and definitions of solutions for the nonlinear Fokker–Planck type equation (1.6) through the change of variables (1.8). It is straightforward to change variables in weak formulations using suitable test functions. Thus, we have a weak solution to problem (1.6) with initial condition satisfying (H0), which is a function $v(x, t) \ge 0$ enjoying the following properties:

$$v \in C^{1/2,1/8}(\bar{Q}) \cap L^\infty(0, \infty; H^1(\mathbb{R})), \tag{3.16}$$

$$v \in C^\infty(P) \quad \text{and} \quad v^{1/2} v_{xxx} \in L^2(P), \tag{3.17}$$

$$\iint_Q v\,\psi_t - \iint_Q x\,v\,\psi_x + \iint_P v\,v_{xxx}\,\psi_x = 0 \tag{3.18}$$

for all $\psi \in \mathrm{Lip}(\bar{Q})$ with compact support inside $Q$,

$$v(x, 0) = u_0(x),\ x \in \mathbb{R}, \quad \text{and} \quad v_x(\cdot, t) \to u_{0x} \text{ strongly in } L^2(\mathbb{R}) \text{ as } t \to 0. \tag{3.19}$$

Let us note that the sets where $u$ and $v$ are positive coincide, and thus $P = Q \setminus (\{v = 0\} \cup \{t = 0\})$. Moreover, $v$ is a strong solution in the sense of Definition 3.2 and thus,

$$v \in L^2([0, T], H^2(\mathbb{R})) \tag{3.20}$$

for any $T > 0$, with $v(\cdot, t) \in C^1(\mathbb{R})$ for a.e. $t > 0$. Furthermore, $v$ satisfies

$$v^{1-s/2} \in L^2([0, T], H^2(\mathbb{R})) \quad \text{for } 0 < s < \frac{1}{2}, \tag{3.21}$$

$$(v^r)_x \in L^4(Q_T) \quad \text{for any } \frac{1}{2} - \frac{s}{4} \le r < 1,\ 0 < s < \frac{1}{2}, \tag{3.22}$$

for any $T > 0$, and $v$ satisfies Eq. (1.6) in the sense

$$\iint_Q v\,\psi_t - \iint_Q x\,v\,\psi_x - \iint_Q v\,v_{xx}\,\psi_{xx} - \iint_Q v^{1-r} \left( \frac{v^r}{r} \right)_x v_{xx}\,\psi_x = 0 \tag{3.23}$$
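for all $\psi \in C^\infty(\bar{Q})$ with compact support inside $Q$ and $r$, $s$ as in (3.21)–(3.22). Weak formulation (3.9) can be translated analogously. Also, the solution $v$ verifies the following properties:

$$\int_{\mathbb{R}} v(x, t)\,dx = \int_{\mathbb{R}} u_0(x)\,dx = M \quad \text{(conservation of mass)}. \tag{3.24}$$

Moreover, the function

$$t \longmapsto \int_{\mathbb{R}} v^{1+\lambda}(x, t)\,dx$$

is absolutely continuous in $[0, \infty)$ for $-\tfrac{1}{2} < \lambda$ and $\lambda \notin \{0, 2, 3\}$, and verifies

$$-\frac{1}{\lambda(\lambda + 1)} \frac{d}{dt} \int_{\mathbb{R}} v^{1+\lambda}(x, t)\,dx + \frac{1}{\lambda + 1} \int_{\mathbb{R}} v^{1+\lambda}(x, t)\,dx = \int_{\mathbb{R} \cap \{v > 0\}} v^\lambda v_{xx}^2\,dx + \frac{\lambda(1 - \lambda)}{3} \int_{\mathbb{R} \cap \{v > 0\}} v^{\lambda - 2} v_x^4\,dx. \tag{3.25}$$

Furthermore, we have the following integration-by-parts formula

$$\int_{\mathbb{R} \cap \{v > 0\}} v^{\lambda - 1} v_{xx}\,v_x^2\,dx = \frac{1 - \lambda}{3} \int_{\mathbb{R} \cap \{v > 0\}} v^{\lambda - 2} v_x^4\,dx \quad \text{a.e. } t > 0 \tag{3.26}$$

and

$$A_1 M^{1/5} \left( \frac{e^{5t} - 1}{5 e^{5t}} \right)^{1/5} \le |\chi(v)|(t) \le |\chi(u_0)|\,e^{-t} + A_2 M^{1/5} \left( \frac{e^{5t} - 1}{5 e^{5t}} \right)^{1/5} \tag{3.27}$$

for any $t > 0$. Let us remark that to obtain the last bound we have made use of (1.8), and we again write $t$ for the new time variable for $v$, as done in the rest of the paper.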
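The passage from (3.14) to (3.27) is a change of time variable, and the arithmetic can be checked with a short script. This is a minimal sketch under the assumption that the rescaling (1.8) uses the factor $(5t+1)^{1/5}$ and the new time $s = \tfrac{1}{5}\log(5t+1)$; the function names are illustrative and not taken from the paper.

```python
import math


def rescaled_time(t):
    # Assumed new time variable: s = log((5t + 1)^(1/5)).
    return math.log(5.0 * t + 1.0) / 5.0


def original_time(s):
    # Inverse map: t = (e^{5s} - 1) / 5.
    return (math.exp(5.0 * s) - 1.0) / 5.0


def support_factor(s):
    # Time-dependent factor that multiplies A_i M^{1/5} in (3.27).
    return ((math.exp(5.0 * s) - 1.0) / (5.0 * math.exp(5.0 * s))) ** 0.2


for t in (0.1, 1.0, 10.0):
    s = rescaled_time(t)
    # Dividing the factor t^{1/5} of (3.14) by the spatial scaling
    # (5t + 1)^{1/5} = e^{s} should reproduce support_factor(s).
    print(t, support_factor(s), math.exp(-s) * t ** 0.2)
```

The same division by $e^{s}$ turns the constant $|\chi(u_0)|$ in (3.14) into the term $|\chi(u_0)|\,e^{-t}$ of (3.27).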
Remark 3.3. The finite speed of propagation property for strong solutions of problem (1.1) is translated to the property that the support of $v$ always remains inside a suitable bounded interval $[-R, R]$ and therefore, the Cauchy problem for compactly supported initial data $u_0$ for the nonlinear Fokker–Planck equation (1.6) coincides with the no-flux initial-boundary value problem for (1.6) obtained by setting $v_x = v_{xxx} = 0$ at $x = \pm R$. In fact, given $u_0$ there exist $R > 0$ and $\omega_-, \omega_+ \in \mathbb{R}$, $-R < \omega_- < \omega_+ < R$, such that $v = 0$ in $[-R, \omega_-]$ and $[\omega_+, R]$.

Remark 3.4. The existence theory of the initial-boundary value problem for equations slightly different from (1.6), with no-flux boundary conditions, has been studied directly by a regularization procedure in [17]. The results there can be easily extended to cover (1.6).

Finally, we will need also the following property on strong solutions of problem (1.6): the function

$$t \longmapsto \int_{\mathbb{R}} \frac{x^2}{2}\,v(x, t)\,dx$$

is absolutely continuous in $[0, \infty)$ and verifies

$$\frac{d}{dt} \int_{\mathbb{R}} \frac{x^2}{2}\,v\,dx = -\int_{\mathbb{R}} x^2 v\,dx + \frac{3}{2} \int_{\mathbb{R}} v_x^2\,dx. \tag{3.28}$$

This property is proved easily by taking a suitable test function in (3.23) and integrating by parts, which is completely rigorous in this case due to (3.20).
4. Entropy–Entropy Production Method

Our main goal here is to study the asymptotic behavior of strong solutions of (1.6) obtained in the previous section. Let us remark that, whenever the initial datum $u_0$ satisfies (H0), if $\mu$ denotes the first moment of $u_0$,

$$\mu = \int_{\mathbb{R}} x\,u_0(x)\,dx, \tag{4.1}$$

one has $|\mu| < \infty$. Thus, without loss of generality, we shall study the asymptotic decay of (1.6) choosing as initial datum $\bar{u}_0(x) = u_0(x + \mu/M)$, which has first moment equal to zero. The general case will follow by translation. This choice is connected with the optimal decay in relative entropy, as we will detail at the end of this section. As discussed in Sect. 2, we can easily compute a strong positive steady solution of Eq. (1.6), given by

$$v_\infty(x) = \frac{1}{24} \left( C^2 - x^2 \right)_+^2 \tag{4.2}$$

with a suitable $C = C(M)$ to fix the mass $M$. This solution corresponds (up to a time translation) to the unique strong source-type solution of problem (1.1) through the change of variables (1.8). Moreover, by symmetry, it has first moment equal to zero. Let us recall that the entropy for (4.2) is given by

$$H(w) = \int_{\mathbb{R}} \left( \frac{x^2}{2}\,w + \sqrt{\frac{8}{3}}\,w^{3/2} \right) dx. \tag{4.3}$$

The relative entropy between $w$ and $v_\infty$, where $w$ and $v_\infty$ have the same mass, is defined by the quantity

$$H(w|v_\infty) = H(w) - H(v_\infty). \tag{4.4}$$

By (2.11), $H(w|v_\infty) \ge 0$ with equality if and only if $w = v_\infty$. Taking into account the results of the previous section, mainly (3.16)–(3.27), we can rigorously compute the derivative of the relative entropy. Using properties (3.25) for $\lambda = 1/2$ and (3.28), we conclude that the function $t \longmapsto H(v|v_\infty)$ is absolutely continuous in $[0, \infty)$ and verifies

$$\frac{d}{dt} H(v|v_\infty) = -D(v), \tag{4.5}$$

where the entropy dissipation $D(v)$ is given by

$$\begin{aligned}
D(v) ={}& \int_{\mathbb{R}} x^2 v\,dx - \frac{3}{2} \int_{\mathbb{R}} v_x^2\,dx - \sqrt{\frac{2}{3}} \int_{\mathbb{R}} v^{3/2}\,dx \\
&+ \sqrt{\frac{3}{2}} \int_{\mathbb{R} \cap \{v > 0\}} v^{1/2} v_{xx}^2\,dx + \frac{1}{8} \sqrt{\frac{2}{3}} \int_{\mathbb{R} \cap \{v > 0\}} v^{-3/2} v_x^4\,dx.
\end{aligned} \tag{4.6}$$
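Let us define the following functional over the solution $v$:

$$\tilde{D}(v) = \int_{\mathbb{R} \cap \{v > 0\}} v \left[ \left( \frac{x^2}{2} + \sqrt{6}\,v^{1/2} \right)_x \right]^2 dx + \frac{2}{\sqrt{6}} \int_{\mathbb{R} \cap \{v > 0\}} v^{3/2} \left[ \left( \frac{x^2}{2} + \sqrt{6}\,v^{1/2} \right)_{xx} \right]^2 dx, \tag{4.7}$$

which makes sense due to (3.20) and (3.25). It is an exercise now to check that in fact $\tilde{D}(v) = D(v)$. This is how the completion of the square in (2.6) and (2.12) of Sect. 2 becomes rigorous. Thanks to (3.20), we expand the squares and take the derivatives up to second order. Comparing terms, we notice that there are three integration-by-parts identities to be proved. The first one we need is

$$\int_{\mathbb{R}} v\,v_{xx}\,dx = -\int_{\mathbb{R}} v_x^2\,dx \quad \text{a.e. } t > 0, \tag{4.8}$$

which is obvious from (3.20). The second one is

$$\int_{\mathbb{R}} x\,v^{1/2} v_x\,dx = -\frac{2}{3} \int_{\mathbb{R}} v^{3/2}\,dx \quad \text{a.e. } t > 0, \tag{4.9}$$

which is also true by (3.16)–(3.23). Finally, the third one is

$$\int_{\mathbb{R} \cap \{v > 0\}} v^{-1/2} v_{xx}\,v_x^2\,dx = \frac{1}{6} \int_{\mathbb{R} \cap \{v > 0\}} v^{-3/2} v_x^4\,dx \quad \text{a.e. } t > 0, \tag{4.10}$$

which corresponds to (3.26) for $\lambda = 1/2$. Therefore, it follows rigorously that

$$D(v) = \int_{\mathbb{R} \cap \{v > 0\}} v \left[ \left( \frac{x^2}{2} + \sqrt{6}\,v^{1/2} \right)_x \right]^2 dx + \frac{2}{\sqrt{6}} \int_{\mathbb{R} \cap \{v > 0\}} v^{3/2} \left[ \left( \frac{x^2}{2} + \sqrt{6}\,v^{1/2} \right)_{xx} \right]^2 dx, \tag{4.11}$$

which is of course greater than or equal to

$$D_p(v) = \int_{\mathbb{R} \cap \{v > 0\}} v \left[ \left( \frac{x^2}{2} + \sqrt{6}\,v^{1/2} \right)_x \right]^2 dx. \tag{4.12}$$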
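The identity between the expanded dissipation (4.6) and the sum-of-squares form (4.11) can be sanity-checked numerically on a smooth profile. The sketch below is only an illustration: it uses a hypothetical positive test profile $v = e^{-x^2}$ (not a solution of (1.6)), for which the three integrations by parts hold classically, and a plain trapezoidal rule; all names are illustrative.

```python
import math

SQRT6 = math.sqrt(6.0)


def profile(x):
    # Hypothetical smooth positive test profile: g = v^{1/2} = exp(-x^2 / 2).
    g = math.exp(-0.5 * x * x)
    g_x = -x * g
    g_xx = (x * x - 1.0) * g
    v = g * g
    v_x = 2.0 * g * g_x
    v_xx = 2.0 * (g_x * g_x + g * g_xx)
    return v, v_x, v_xx, g, g_x, g_xx


def integrand_expanded(x):
    # Integrand of the expanded form (4.6); note v^{3/2} = g^3.
    v, v_x, v_xx, g, _, _ = profile(x)
    return (x * x * v
            - 1.5 * v_x ** 2
            - math.sqrt(2.0 / 3.0) * g ** 3
            + math.sqrt(1.5) * g * v_xx ** 2
            + math.sqrt(2.0 / 3.0) / 8.0 * v_x ** 4 / g ** 3)


def integrand_squares(x):
    # Integrand of the sum-of-squares form (4.7)/(4.11).
    v, _, _, g, g_x, g_xx = profile(x)
    first = x + SQRT6 * g_x          # (x^2/2 + sqrt(6) v^{1/2})_x
    second = 1.0 + SQRT6 * g_xx      # (x^2/2 + sqrt(6) v^{1/2})_xx
    return v * first ** 2 + 2.0 / SQRT6 * g ** 3 * second ** 2


def trapezoid(f, a=-8.0, b=8.0, n=16000):
    # Composite trapezoidal rule; the integrands are negligible at |x| = 8.
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


D_expanded = trapezoid(integrand_expanded)
D_squares = trapezoid(integrand_squares)
print(D_expanded, D_squares)
```

The two integrands differ pointwise, but their integrals agree up to quadrature error, which is exactly what the identities (4.8)–(4.10) encode.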
Let us note that Dp (v) is exactly the entropy dissipation of the porous medium equation (2.1) Thus, we proved the entropy dissipation bound d (4.13) H (v|v∞ ) ≤ −Dp (v). dt To this point, we recall the generalized logarithmic Sobolev inequality proved in [13, 14, 16] in the one dimensional case and for m = 3/2. Theorem 4.1. Let w(x) ≥ 0 belong to L1 (IR) with mass M such that the distributional derivative of w is square integrable. Then w 3/2 ∈ L1 (IR) and 3 1 (4.14) w 3/2 dx ≤ wx2 dx + B(M), 8 3 IR IR
x2 3/2 v∞ + 2v∞ dx. 2 IR Moreover there is equality in (4.14) if and only if w is a multiple and translate of w = v∞ . where
B(M) =
The previous generalized logarithmic Sobolev inequality implies

$$H(w|v_\infty) \le \frac{1}{2} D_p(w). \tag{4.15}$$

Substituting this inequality into (4.13) we finally deduce

$$\frac{d}{dt} H(v(t)|v_\infty) \le -2 H(v(t)|v_\infty),$$

which furnishes

$$H(v|v_\infty) \le e^{-2t} H(\bar{u}_0|v_\infty) \tag{4.16}$$
for any $t \ge 0$. Equation (4.16) gives the exponential decay to the steady state in relative entropy. The next step consists in obtaining the exponential decay towards $v_\infty$ in $L^1(\mathbb{R})$. This is by now a classical argument, which uses Csiszár–Kullback type inequalities. To this aim, we quote the following theorem [14, 16, 26].

Theorem 4.2. Let $w(x) \ge 0$ belong to $L^1(\mathbb{R})$ with mass $M$. Then, the following Csiszár–Kullback type inequality holds:

$$\| w - v_\infty \|_{L^1(\mathbb{R})} \le \left( \frac{16}{3} \int_{\mathbb{R}} v_\infty^{1/2}\,dx \right)^{1/2} \sqrt{H(w|v_\infty)}. \tag{4.17}$$

Substituting (4.17) into (4.16) we get

$$\| v - v_\infty \|_{L^1(\mathbb{R})} \le e^{-t} \left( \frac{16}{3} \int_{\mathbb{R}} v_\infty^{1/2}\,dx \right)^{1/2} \sqrt{H(\bar{u}_0|v_\infty)} \tag{4.18}$$
for any $t \ge 0$. Now, the result for the thin film equation (1.1) follows by coming back to the original equation through the change of variables (1.8). So, we proved our main result.

Theorem 4.3. Let $u_0 \in H^1(\mathbb{R})$ be a non-negative, compactly supported initial condition with mass $M > 0$ and first moment $\mu$. Let $u(t, x)$ be a corresponding strong solution to the Cauchy problem (1.1). Then, if $U(t, x)$ is the unique strong source-type (self-similar) solution of (1.1) with mass $M$ ([28, 5, 7]), up to a time translation to set $U(0, x) = v_\infty$, there exists a constant $A > 0$ depending only on $v_\infty$ and $M$ such that

$$\| u(x, t) - U(x - \mu/M, t) \|_{L^1(\mathbb{R})} \le A \sqrt{H(\bar{u}_0|v_\infty)}\,(5t + 1)^{-1/5}$$

for any $t \ge 0$. The constant $A$ is given by

$$A = A(M, v_\infty) = \left( \frac{16}{3} \int_{\mathbb{R}} v_\infty^{1/2}\,dx \right)^{1/2}. \tag{4.19}$$
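The way the exponential rate in the rescaled variables becomes the algebraic rate of Theorem 4.3 can be illustrated with a few lines of arithmetic. This is a minimal sketch, assuming the rescaled time is $s = \tfrac{1}{5}\log(5t+1)$ and that the $L^1$ distance is unchanged by the mass-preserving rescaling; the numerical values of `A` and `H0` below are hypothetical placeholders, not values from the paper.

```python
import math


def entropy_bound(s, H0):
    # Bound (4.16) on the relative entropy at rescaled time s.
    return math.exp(-2.0 * s) * H0


def l1_bound_rescaled(s, A, H0):
    # Bound (4.18): a Csiszar-Kullback inequality applied to (4.16).
    return A * math.sqrt(entropy_bound(s, H0))


def l1_bound_original(t, A, H0):
    # Bound of Theorem 4.3 in the original time variable t.
    return A * math.sqrt(H0) * (5.0 * t + 1.0) ** (-0.2)


A, H0 = 2.0, 0.75  # hypothetical values, for illustration only
for t in (0.0, 1.0, 100.0):
    s = math.log(5.0 * t + 1.0) / 5.0  # assumed rescaled time
    print(t, l1_bound_rescaled(s, A, H0), l1_bound_original(t, A, H0))
```

Both columns coincide because $e^{-s} = (5t+1)^{-1/5}$ under the assumed time change.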
The decay rate in Theorem 4.3 is optimal since the difference between $v(x, t)$ and $v(x - x_0, t)$ decays exactly as $t^{-1/5}$, cf. [8].
Remark 4.4. Theorem 4.3 shows that a solution to the initial value problem for (1.1), with given mass M and first momentum µ, converges to a self-similar solution of mass M that is symmetric about x¯ = µ/M and was a delta function at time t = −1/5, with an explicit rate of convergence. Since the support is spreading in time and the height is decaying in time, the initial data clearly does not select a particular self-similar solution as its long-time limit, only specified by x¯ = µ/M or t = −1/5 in the rate of decay. The exact meaning of Theorem 4.3 relies in the fact that the decay towards equilibrium in relative entropy of the solution to the nonlinear Fokker–Planck type equation (1.6), corresponding to an initial condition of the form u0 (x + c), where c ∈ IR, is optimal in correspondence to c = µ/M, where M and µ are respectively the mass and the first momentum of u0 (x). In fact, a direct computation shows that µ 2 H (u0 (x + c)) = H (u¯ 0 (x + c − µ/M)) = H (u¯ 0 (x)) + c − M. (4.20) M This equality in particular shows that, for initial data as in Theorem 4.3, translations in space are controlled by
1/2 µ 2 u(x, t) − U (x − c, t)L1 (IR) ≤ A H (u¯ 0 |v∞ ) + c − M (5t + 1)−1/5 . M Likewise, direct computations give a control of translations in time. To this aim, consider that, for any time τ > 0,
α(t) ˆ α(t) ˆ |U (x, t) − U (x, t + τ )| dx = v∞ (x, t) − v∞ x dx α(t ˆ + τ) α(t ˆ + τ) IR IR
≤ 1−
α(t) ˆ α(t) ˆ α(t) ˆ v∞ dx, M+ x − v (x, t) ∞ α(t ˆ + τ) α(t ˆ + τ ) IR α(t ˆ + τ)
ˆ α(t ˆ + τ ) < 1. Recalling that a 5 = 15 where α(t) ˆ = (5t + 1)1/5 . Let us set ρ = α(t)/ 16 M, the last integral can be easily evaluated to give the inequality
a a/ρ ρ |v∞ (ρx) − v∞ (x)| dx = 2ρ v∞ (ρx) dx (v∞ (ρx) − v∞ (x)) dx + a
0
IR
7 ≤ M(1 − ρ)2 . 4 Finally, since 1−ρ ≤ |U (x, t) − U (x, t + τ )| dx ≤ IR
τ , (5t + 1)4/5
7 τ M+ M 4/5 4 (5t + 1)
τ (5t + 1)4/5
2 ,
(4.21)
which implies, for τ > 0, the following formula for time translations: 1 u(x, t) − U (x, t + τ )L1 (IR) ≤ A H (u¯ 0 |v∞ ) (5t + 1)1/5 2
τ 7 τ + M + . (4.22) M 4 (5t + 1)4/5 (5t + 1)4/5
Theorem 4.3 implies, together with the dissipation of surface-tension energy (3.11), the asymptotic decay of the solution in L∞ (IR)-norm. To this aim, the following interpolation inequality will be useful [23]. Lemma 4.5. Let f ∈ L1 (IR) ∩ H 1 (IR). Then, the following inequality: f L∞ (IR) ≤
3
1
f L3 1 (IR) 2
1 fx2 dx
2π 3
3
(4.23)
IR
holds. Using Lemma 4.5, we get Theorem 4.6. We are given a non negative, compactly supported initial condition u0 ∈ H 1 (IR) with mass M > 0 and first moment µ. Let u(t, x) be a corresponding strong solution to the Cauchy problem (1.1). Then, if U (t, x) is the unique strong source-type (self-similar) solution of (1.1) with mass M ([28,5,7]) up to a time translation to set U (0, x) = v∞ , there exists a constant B > 0 depending only on v∞ , M and uo H 1 (IR) such that u(x, t) − U (x − µ/M, t)L∞ (IR) ≤ BH (u¯ 0 |v∞ )1/6 (5t + 1)−1/15 for any t ≥ 0. The constant B is given by 1/6 3A1/3 2 2 2 v∞,x dx + 2 u0,x dx . B= 2 IR IR 2π 3
(4.24)
(4.25)
Remark 4.7. It is straightforward by interpolation to obtain from the previous Theorems 4.3 and 4.6 a decay estimate in $L^p$ norm with rate

$$\left( \frac{1}{5t + 1} \right)^{\frac{1}{15} \left( 1 + \frac{2}{p} \right)},$$
for all 1 ≤ p ≤ ∞. Likewise, one can obtain control of space and time translation even in these cases. 5. Cauchy Problem for Non-Compactly Supported Initial Data In this section we show that Theorems 4.3 and 4.6 can be extended to non-compactly supported initial data with finite entropy. As a first step, we extend the existence theory of strong solutions to non-compactly supported initial data. The procedure is close to that introduced in [10], and makes use of similar arguments. In the remainder of this section we will assume the nonnegative initial data u0 ∈ L1 (IR) ∩ H 1 (IR) with mass M > 0, x 2 u0 ∈ L1 (IR) and u0 log u0 ∈ L1 (IR). We will denote by (H1) this set of hypotheses on the initial data. Let us also assume that the first moment µ vanishes without loss of generality. Let us consider a sequence of positive functions χn (x) ∈ Co∞ (IR) such that 0 ≤ χn ≤ 1 with χn = 1 on the interval [−n, n] and χn = 0 outside the interval [−(n + 1), n + 1]. We consider the approximated initial data u0n (x) = u0 (x)χn (x). For any n ≥ n0 , u0n verifies all the set of hypotheses (H0) given at the beginning of Sect. 3. Its mass will be denoted by Mn . Let us remark that, for n sufficiently large, Mn > 0. Moreover, we
have that $u_{0n} \to u_0$ as $n \to \infty$ in $L^1(\mathbb{R}) \cap H^1(\mathbb{R})$ while the second moments $x^2 u_{0n}$ are uniformly bounded and converge towards $x^2 u_0$ in $L^1(\mathbb{R})$. The sequence of solutions $u_n(t, x)$ verifies all the properties and results obtained in the previous two sections. Moreover, if we denote by $M_n$ the mass of the approximated sequence and if $U_n$ and $U$ denote the unique strong source-type (self-similar) solutions of (1.1) with mass $M_n$ and $M$ respectively, then $M_n \to M$ and $U_n \to U$ as $n \to \infty$. Taking into account the dissipation of surface-tension energy (3.11), the conservation of mass and the Nash inequality [25] (there exists a constant $\Pi > 0$ such that for all $w \in L^1(\mathbb{R}) \cap H^1(\mathbb{R})$, $\|w\|_{L^2}^3 \le \Pi\,\|w\|_{L^1}^2\,\|\nabla w\|_{L^2}$), we deduce that $u_n$ is a bounded sequence in $L^1(Q) \cap L^2([0, T], H^1(\mathbb{R}))$ independent of $t$ and $n$. Let us prove that we can get a uniform control on the $L^2([0, T], H^2(\mathbb{R}))$ norm. In order to do this we use the entropy $G_o(u) = u \log u - u$ as in [6, 10, 11]; we have
QT
(un )2xx (x, t) dx ≤
Go (un ) dx − Go (Mn ).
(5.1)
IR
This is not directly written in these references, but choosing a suitable approximation of the initial data, it follows directly for instance from estimates around Eq. (43) in [10]. Using hypotheses (H1) we have that the right-hand side is uniformly bounded in $n$. Therefore, $u_n$ is a bounded sequence in $L^2([0, T], H^2(\mathbb{R}))$ independent of $n$ for any $T > 0$. Hence, up to a subsequence (which we denote with the same index), $u_n$ converges weakly in $L^2([0, T], H^2(\mathbb{R}))$ towards a function $u$, and the convergence is strong in the spaces $L^2([0, T], H^1(\mathbb{R}))$ for any $T > 0$. Furthermore, since we have a uniform control on the second order moment by (3.15), i.e.,

$$\frac{d}{dt} \int_{\mathbb{R}} \frac{x^2}{2}\,u_n\,dx = \frac{3}{2} \int_{\mathbb{R}} (u_n)_x^2\,dx, \tag{5.2}$$
we deduce the strong convergence of $u_n$ towards $u$ in $L^1(Q_T)$. Due to Sobolev inequalities, the Ascoli–Arzelà theorem and similar arguments as in [6] (Sect. 2) we deduce that we have convergence in $C^{1/2,1/8}(K \times [0, T])$ for any compact interval $K$ and $T > 0$, and therefore, the initial data $u_0$ is taken by $u$ as a continuous function and in $L^2(\mathbb{R})$ using (5.2). Moreover, we have also convergence of $u_n \to u$ in $C([0, T], L^p(K))$ for any $1 \le p \le \infty$, any $T > 0$ and any compact interval $K$. Again, using (5.2) one can prove convergence in $C([0, T], L^p(\mathbb{R}))$ for any $1 \le p < \infty$. Regarding the convergence of $u_x(t, \cdot)$ as $t \to 0$ in $L^2(\mathbb{R})$, we use the strong convergence of $u(t, \cdot) \to u_0$ in $L^2(\mathbb{R})$ to deduce $u_x(t, \cdot) \to u_{0x}$ weakly in $L^2(\mathbb{R})$. Using now (5.2) we have that

$$\limsup_{t \to 0} \int_{\mathbb{R}} u_x^2(x, t)\,dx \le \int_{\mathbb{R}} u_{0x}^2\,dx$$

and therefore, since $L^2(\mathbb{R})$ is a uniformly convex space, we deduce that $u_x(t, \cdot) \to u_{0x}$ as $t \to 0$ strongly in $L^2(\mathbb{R})$.
We shall now prove that we can pass to the limit in the weak formulation (3.8) of Eq. (1.1), that is,

$$\iint_Q u_n\,\psi_t - \iint_Q u_n\,(u_n)_{xx}\,\psi_{xx} - \iint_Q (u_n)^{1-r} \left( \frac{(u_n)^r}{r} \right)_x (u_n)_{xx}\,\psi_x = 0 \tag{5.3}$$
for all $\psi \in C^\infty(\bar{Q})$ with compact support inside $Q$, $0 < s < 1/2$ and $1/2 - s/4 \le r < 1$. Since $u_n$ converges strongly in $L^2(Q_T)$ and $(u_n)_{xx}$ converges weakly in $L^2(Q_T)$ for any $T > 0$, we can take limits in the first two terms as $n \to \infty$. In order to pass to the limit in the last term we will use the proof given in [10], to which we refer for the details. Let us sketch it, using the additional entropies (3.12). If $-\tfrac{1}{2} < \lambda = -s < 0$ we have

$$\frac{1}{s(1 - s)} \frac{d}{dt} \int_{\mathbb{R}} (u_n)^{1-s}(x, t)\,dx = \int_{\mathbb{R} \cap \{u_n > 0\}} (u_n)^{-s} (u_n)_{xx}^2\,dx - \frac{s(1 + s)}{3} \int_{\mathbb{R} \cap \{u_n > 0\}} (u_n)^{-s-2} (u_n)_x^4\,dx. \tag{5.4}$$
First, using that $u_n$ is uniformly bounded with respect to $n$ in $L^1(Q_T)$ and the uniform bound in $L^\infty([0, T], H^1(\mathbb{R}))$ we have that $u_n$ is uniformly bounded with respect to $n$ in $L^\infty([0, T], L^p(\mathbb{R}))$ for any $1 \le p \le \infty$. Now using (5.4), we can produce a uniform bound of $[(u_n)^{1-s/2}]_{xx}$ in $L^2(Q_T)$ and of $[(u_n)^{1/2-s/4}]_x$ in $L^4(Q_T)$ by mimicking the procedure in [10, Sect. 4, p. 14–16]. Finally, using Lemma 6 in [10] we deduce directly the convergence of the last nonlinear term in (5.3). Concluding, we proved the existence of a strong solution to the Cauchy problem for Eq. (1.1) with initial data $u_0$ satisfying (H1). Now, it is easy to apply Theorem 4.3 to the approximated problems to obtain

$$\| u_n(x, t) - U_n(x, t) \|_{L^1(\mathbb{R})} \le A \sqrt{H(u_{0n}|v_{\infty n})}\,(5t + 1)^{-1/5},$$

and to take the limit $n \to \infty$ to prove that Theorems 4.3 and 4.6 apply to the strong solutions corresponding to initial data satisfying hypotheses (H1). We proved

Theorem 5.1. Let $u_0 \in L^1(\mathbb{R}) \cap H^1(\mathbb{R})$ be a non-negative initial condition with mass $M > 0$, $x^2 u_0 \in L^1(\mathbb{R})$ and $u_0 \log u_0 \in L^1(\mathbb{R})$. Let $\mu$ be the first moment of $u_0$. Then, there exists a strong solution $u(t, x)$ to the Cauchy problem (1.1). Moreover, if $U(t, x)$ is the unique strong source-type (self-similar) solution of (1.1) with mass $M$ ([28, 5, 7]), up to a time translation to set $U(0, x) = v_\infty$, there exist constants $A, B > 0$ such that

$$\| u(x, t) - U(x - \mu/M, t) \|_{L^1(\mathbb{R})} \le A \sqrt{H(\bar{u}_0|v_\infty)}\,(5t + 1)^{-1/5}$$

and

$$\| u(x, t) - U(x - \mu/M, t) \|_{L^\infty(\mathbb{R})} \le B\,H(\bar{u}_0|v_\infty)^{1/6}\,(5t + 1)^{-1/15}$$

for any $t \ge 0$. The constants $A$ and $B$ are given in Theorems 4.3 and 4.6 respectively.
6. Final Remarks

Equation (2.6) admits less regular steady states with positive slope $\gamma$ at the edge of the support. Equation (1.1) with a prescribed contact angle has been recently studied by Otto [27], who did not consider the asymptotic behavior of the solution. It would certainly be interesting to study whether these less regular steady states are the asymptotic limit of solutions (with the same prescribed slope $\gamma$) to the rescaled thin film equation in the situation studied by Otto. In more detail, any steady solution of the rescaled thin film equation, of given mass $M$, with a prescribed positive slope $\gamma$, has the form

$$v_{\gamma,\infty}(x) = \frac{1}{24} \left( C_1^2 - x^2 \right)_+^2 + \frac{\gamma}{2 C_1} \left( C_1^2 - x^2 \right)_+ \tag{6.1}$$

with $C_1 = C_1(M)$. The $C^1$ steady state we considered in this paper is included in (6.1), and corresponds to the choice $\gamma = 0$. Whenever $\gamma \ne 0$, $v_{\gamma,\infty}(x)$ is no longer a steady state of the rescaled porous medium equation (1.10), the representation (2.6) is useless, and the entropy method of course fails. Any steady state of the form (6.1) admits a natural convex entropy (see [29])

$$H(f) = \int_{\mathbb{R}} \left( \frac{x^2}{2}\,f + \Psi(f) \right) dx, \tag{6.2}$$

where

$$\Psi(f) = \int_0^f \left( \sqrt{6y + \left( \frac{3\gamma}{C_1} \right)^2} - \frac{3\gamma}{C_1} \right) dy. \tag{6.3}$$
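That the entropy (6.2)–(6.3) is matched to the steady states (6.1) can be checked pointwise: inside the support, $x^2/2 + \Psi'(v_{\gamma,\infty})$ should be constant. The following is a minimal sketch of that check; the parameter values are arbitrary illustrative choices and the function names are not from the paper.

```python
import math


def v_gamma(x, C1, gamma):
    # Steady state (6.1) with slope gamma at the edge of the support.
    z = max(C1 * C1 - x * x, 0.0)
    return z * z / 24.0 + gamma * z / (2.0 * C1)


def psi_prime(y, C1, gamma):
    # Integrand of (6.3), i.e. the derivative of Psi.
    k = 3.0 * gamma / C1
    return math.sqrt(6.0 * y + k * k) - k


def residual(C1, gamma, n=400):
    # Largest deviation of x^2/2 + Psi'(v) from C1^2 / 2 inside the support.
    worst = 0.0
    for i in range(1, n):
        x = -C1 + 2.0 * C1 * i / n
        value = 0.5 * x * x + psi_prime(v_gamma(x, C1, gamma), C1, gamma)
        worst = max(worst, abs(value - 0.5 * C1 * C1))
    return worst


def edge_slope(C1, gamma, h=1e-6):
    # One-sided difference quotient at the right edge x = C1.
    return (v_gamma(C1 - h, C1, gamma) - v_gamma(C1, C1, gamma)) / h


print(residual(1.5, 0.7), edge_slope(1.5, 0.7))
```

For $\gamma = 0$ the integrand of (6.3) reduces to $\sqrt{6y}$, which recovers the term $\sqrt{8/3}\,f^{3/2}$ of the entropy (2.9).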
By Theorem 2.1 of [29], for all nonnegative functions f of given mass M, 9(f ) ≥ 9(vγ ,∞ ),
(6.4)
with equality if and only if f = vγ ,∞ . Unlikely, at present we do not know if the entropy (6.2) is monotone decreasing along the solution to Eq. (1.6). At least for strong solutions, Eq. (1.6) has another Lyapunov functional (entropy) which is non-increasing monotonically in time, H (f, fx ) = x 2 f + (fx )2 dx. (6.5) IR
In fact, surface-tension energy dissipation (3.11) gives, for any $T > 0$,
\[ H(u, u_x)(T) + 2 \int_0^T \!\! \int_{\mathbb{R}} u \left[ \left( u_{xx} - \frac{x^2}{2} \right)_x \right]^2 dx\, dt = H(u, u_x)(0). \tag{6.6} \]
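Formally, the identity (6.6) follows from a direct computation. The sketch below assumes that the rescaled equation reads $u_t = -(u\,u_{xxx})_x + (xu)_x$ (our reading of Eq. (1.6), which is not reproduced in this section) and that $u$ is smooth and decays fast enough for all boundary terms to vanish.

```latex
\begin{aligned}
u_t &= -\Bigl( u \bigl( u_{xx} - \tfrac{x^2}{2} \bigr)_x \Bigr)_x ,
  \qquad \text{since } \bigl( u_{xx} - \tfrac{x^2}{2} \bigr)_x = u_{xxx} - x, \\
\frac{d}{dt} H(u, u_x)
  &= \int_{\mathbb{R}} \bigl( x^2 u_t + 2 u_x u_{xt} \bigr)\, dx
   = -2 \int_{\mathbb{R}} \bigl( u_{xx} - \tfrac{x^2}{2} \bigr)\, u_t \, dx \\
  &= 2 \int_{\mathbb{R}} \bigl( u_{xx} - \tfrac{x^2}{2} \bigr)
       \Bigl( u \bigl( u_{xx} - \tfrac{x^2}{2} \bigr)_x \Bigr)_x dx
   = -2 \int_{\mathbb{R}} u \Bigl[ \bigl( u_{xx} - \tfrac{x^2}{2} \bigr)_x \Bigr]^2 dx .
\end{aligned}
```

Integrating in time over $[0, T]$ gives the balance between the functional (6.5) and the non-negative dissipation term.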
We remark that both the entropy (6.5) and the entropy production
\[ D(f) = \int_{\mathbb{R}} f \left[ \left( f_{xx} - \frac{x^2}{2} \right)_x \right]^2 dx, \tag{6.7} \]
do not distinguish among the steady states (6.1). We believe that the study of the time evolution of this Lyapunov functional would be of great importance to understand the
asymptotic behavior of solutions with a prescribed contact angle, but presently we are not in a position to do so. On the other hand, within the same strategy we can treat in a formal way any equation of the form
\[ \frac{\partial v}{\partial t} = -C_1 \Bigl( \Psi(v) \bigl( V(x) + h(v) \bigr)_{xx} \Bigr)_{xx} + \Bigl( v \bigl( V(x) + h(v) \bigr)_x \Bigr)_x, \tag{6.8} \]
where $C_1 > 0$, $\Psi \ge 0$ and $V$ and $h$ verify the hypotheses needed in [14] to work on the general nonlinear diffusion equation
\[ \frac{\partial v}{\partial t} = \Bigl( v \bigl( V(x) + h(v) \bigr)_x \Bigr)_x, \qquad (x \in \mathbb{R},\ t > 0). \tag{6.9} \]
A class of fourth order diffusion equations that can be written in the form (6.8) by taking $V(x) = x^2/2$ is the following:
\[ \frac{\partial v}{\partial t} = -\Bigl( \Psi(v) \bigl( h(v) \bigr)_{xx} \Bigr)_{xx} + (xv)_x, \tag{6.10} \]
where $\Psi$ is increasing from $\Psi(0) = 0$, while $h$ is conjugate to $\Psi$, in the sense that
\[ h'(v) = \frac{\Psi'(v)}{v}. \tag{6.11} \]
Unfortunately, we cannot study the asymptotic behavior rigorously, since the existence theory for this class of equations is at present not well developed. A particular case of (6.10) is relevant in semiconductor device modeling, and corresponds to the choice $\Psi(v) = v$. In this case we obtain
\[ \frac{\partial v}{\partial t} = -\bigl( v (\log v)_{xx} \bigr)_{xx} + (xv)_x \tag{6.12} \]
that corresponds, through a suitable change of variables of the type (1.8), to the spin equation
\[ \frac{\partial u}{\partial t} = -\bigl( u (\log u)_{xx} \bigr)_{xx} \tag{6.13} \]
introduced by Derrida, Lebowitz, Speer and Spohn in [15]. Bleher, Lebowitz and Speer in [12] and subsequently Jüngel and Pinnau in [22] studied Eq. (6.13) in the bounded domain $(0, 1)$ subject to boundary conditions $u(0) = u(1) = 1$, and $u_x(0) = u_x(1) = 0$. In fact, (6.12) admits the Gaussian as a steady state which is the minimum of the physical entropy
\[ H(f) = \int_{\mathbb{R}} \left( \frac{x^2}{2} f + f \log f \right) dx. \]
The formal computation of the evolution of the relative entropy and the logarithmic Sobolev inequality imply the exponential convergence in relative entropy towards the Gaussian for solutions of (6.12). This result would imply an algebraic decay in $L^1$-norm towards a modified heat kernel for solutions of (6.13). The rigorous study of the Cauchy problem for Eq. (6.13) is under consideration.
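The steady-state claim for (6.12) can be checked by a short formal computation. The sketch below assumes the Gaussian is written as $v_\infty(x) = c\, e^{-x^2/2}$ with an arbitrary constant $c > 0$ fixed by the mass (this normalization is our notation, not the paper's).

```latex
\begin{aligned}
(\log v_\infty)_{xx} &= -1, \qquad (v_\infty)_x = -x\, v_\infty, \\
-\bigl( v_\infty (\log v_\infty)_{xx} \bigr)_{xx} + (x v_\infty)_x
  &= (v_\infty)_{xx} + (x v_\infty)_x
   = \bigl( (v_\infty)_x + x\, v_\infty \bigr)_x = 0 .
\end{aligned}
```

So the right-hand side of (6.12) vanishes identically on the Gaussian profile.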
Acknowledgement. This work has been performed and financially supported within the activities both from the TMR project “Asymptotic Methods in Kinetic Theory”, No. ERBFRMXCT 970157, funded by the EC., from the Italian MURST, project “Mathematical Problems in Kinetic Theories”, from the Spanish–Italian Acciones Integradas and from the Spanish DGES projects PB98-1281 and PB98-1294. The authors would like to express their sincere gratitude to Mary Pugh for helpful references and fruitful comments. The suggestions of the anonymous referees, which led to a marked improvement in the structure of the paper, are gratefully acknowledged.
References
1. Arnold, A., Markowich, P., Toscani, G., Unterreiter, A.: On logarithmic Sobolev inequalities, Csiszár–Kullback inequalities and the rate of convergence to equilibrium for Fokker-Planck type equations. To appear in Comm. P.D.E.
2. Aronson, D.G.: The porous media equation. In: Nonlinear Diffusion Problems, edited by A. Fasano and M. Primicerio, Montecatini 1985, Lecture Notes in Mathematics 1224, Berlin: Springer, 1986, pp. 1–46
3. Barenblatt, G.I.: Similarity, Self-Similarity and Intermediate Asymptotics. New York–London: Plenum, 1979
4. Beretta, E., Bertsch, M., Dal Passo, R.: Nonnegative solutions of a fourth-order degenerate parabolic equation. Arch. Rational Mech. Anal. 129, 175–200 (1995)
5. Bernis, F.: Finite speed of propagation and continuity of the interface for thin viscous flows. Adv. in Diff. Eq. 3, 337–368 (1996)
6. Bernis, F., Friedman, A.: Higher order nonlinear degenerate parabolic equations. J. Diff. Eqns. 83, 179–206 (1990)
7. Bernis, F., Peletier, L.A., Williams, S.M.: Source type solutions of a fourth order nonlinear degenerate parabolic equation. Nonlinear Analysis 18, 217–234 (1992)
8. Bernoff, A.J., Witelski, T.P.: Linear stability of source-type similarity solutions of the thin film equation. To appear in Appl. Math. Letters (2001)
9. Bertozzi, A.L.: The mathematics of moving contact lines in thin liquid films. Notices of the AMS, June–July 1998, 689–697 (1998)
10. Bertozzi, A.L., Pugh, M.: The lubrication approximation for thin viscous films: Regularity and long-time behavior of weak solutions. Comm. Pure Appl. Math. XLIX, 85–123 (1996)
11. Bertozzi, A.L., Pugh, M.: Finite-time blow-up of solutions of some long-wave unstable thin film equations. Indiana Univ. Math. J. 49, 1323–1366 (2000)
12. Bleher, P.M., Lebowitz, J.L., Speer, E.R.: Existence and positivity of solutions of a fourth order nonlinear PDE describing interface fluctuations. Commun. Pure Appl. Math. XLVII, 923–942 (1994)
13. Carrillo, J.A., Toscani, G.: Asymptotic L1-decay of solutions of the porous medium equation to self-similarity. Indiana Univ. Math. J. 49, 113–141 (2000)
14. Carrillo, J.A., Jungel, A., Markowich, P., Toscani, G., Unterreiter, A.: Entropy dissipation methods for degenerate parabolic problems and generalized Sobolev inequalities. Monatsh. Math. 133, 1–82 (2001)
15. Derrida, B., Lebowitz, J.L., Speer, J., Spohn, H.: Fluctuations of a stationary nonequilibrium interface. Phys. Rev. Lett. 67, 165–168 (1991)
16. Dolbeault, J., del Pino, M.: Generalized Sobolev inequalities and asymptotic behaviour in fast diffusion and porous medium problems. Preprint
17. Giacomelli, L.: A fourth-order degenerate parabolic equation describing thin viscous flows over an inclined plane. Appl. Math. Lett. 12, 107–111 (1999)
18. Giacomelli, L., Otto, F.: Variational formulation for the lubrication approximation of the Hele-Shaw flow. Preprint
19. Hocking, L.M.: The spreading of a thin drop by gravity and capillarity. Q. J. Mech. Appl. Math. 34, 37–55 (1981)
20. Kullback, S.: Information Theory and Statistics. New York: John Wiley, 1959
21. Lacey, A.A.: The motion with slip of a thin viscous droplet over a solid surface. Stud. Appl. Math. 67, 217–230 (1982)
22. Jungel, A., Pinnau, R.: Global non-negative solutions of a nonlinear fourth-order parabolic equation for quantum systems. SIAM J. Math. Anal. 32, 760–777 (2000)
23. Lopez, J.L., Soler, J., Toscani, G.: Time rescaling and asymptotic behavior of some fourth order degenerate diffusion equations. To appear in Computers and Math. Applications
24. Myers, T.G.: Thin films with high surface tension. SIAM Reviews 40, 441–462 (1998)
25. Nash, J.: Continuity of solutions of parabolic and elliptic equations. Am. J. Math. 80, 931–954 (1958)
26. Otto, F.: The geometry of dissipative evolution equations: The porous medium equation. Comm. P.D.E. 26, 101–174 (2001)
27. Otto, F.: Lubrication approximation with prescribed nonzero contact angle. Comm. P.D.E. 23, 2077–2164 (1998)
28. Smyth, N.F., Hill, J.M.: Higher order nonlinear diffusion. IMA J. Appl. Math. 40, 73–86 (1988)
29. Toscani, G.: Remarks on entropy and equilibrium states. Appl. Math. Letters 12, 19–25 (1999)
30. Vázquez, J.L.: Asymptotic behaviour for the porous medium equation in the whole space. Notas del curso de doctorado Métodos asintóticos en ecuaciones de evolución
31. Vázquez, J.L.: An introduction to the mathematical theory of the porous medium equation. In: Shape optimization and free boundaries (Montreal, PQ, 1990), NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., 380, Dordrecht: Kluwer Acad. Publ. (1992), pp. 347–389
Communicated by J. L. Lebowitz
Commun. Math. Phys. 225, 573 – 609 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Unitary Representations of $U_q(\mathfrak{sl}(2,\mathbb{R}))$, the Modular Double and the Multiparticle q-Deformed Toda Chains
S. Kharchev¹, D. Lebedev¹, M. Semenov-Tian-Shansky²,³
¹ Institute of Theoretical and Experimental Physics, Moscow 117259, Russia
² Université de Bourgogne, 21078 Dijon, France
³ Steklov Math. Institute, St. Petersburg 191011, Russia
Received: 11 April 2001 / Accepted: 8 October 2001
Abstract: The paper deals with the analytic theory of the quantum q-deformed Toda chains; the technique used combines the methods of representation theory and the Quantum Inverse Scattering Method. The key phenomenon which is under scrutiny is the role of the modular duality concept (first discovered by L. Faddeev) in the representation theory of noncompact semisimple quantum groups. Explicit formulae for the Whittaker vectors are presented in terms of the double sine functions and the wave functions of the $N$-particle q-deformed open Toda chain are given as a multiple integral of the Mellin–Barnes type. For the periodic chain the two dual Baxter equations are derived.
Preface
In the late seventies B. Kostant [1] discovered a fascinating link between the representation theory of non-compact semisimple Lie groups and the quantum Toda chain. Let $G$ be a real split semisimple Lie group, $B = MAN$ its minimal Borel subgroup, and let $N$ and $V = \bar N$ be the corresponding opposite unipotent subgroups. Let $\chi_N$, $\chi_V$ be nondegenerate unitary characters of $N$ and $V$, respectively. Let $H_T$ be the space of smooth functions on $G$ which satisfy the functional equation
\[ \varphi(vxn) = \chi_V(v)\, \chi_N(n)\, \varphi(x), \qquad v \in V,\ n \in N. \]
A function $\varphi \in H_T$ is uniquely determined by its restriction to $A \subset G$. Obviously, $H_T$ is invariant under the action of the center of the universal enveloping algebra $Z \subset U(\mathfrak{g})$; hence, any Casimir operator $C \in Z$ gives rise to a differential operator acting in $C^\infty(A)$. When $C$ is the quadratic Casimir, this is precisely the Toda Hamiltonian; other Casimirs provide a complete set of quantum integrals of motion. This observation reduces the spectral theory of the Toda chain to the representation theory of semisimple Lie groups. The joint eigenfunctions of the quantum Toda Hamiltonians are the so-called generalized Whittaker functions. The theory of Whittaker
functions has been extensively studied in the 60's and 70's [2–4]; it displays deep parallels with the celebrated Harish-Chandra theory of spherical functions [5] and depends on a profound study of the principal series representations [6]. The group theoretic approach based on representation theory of finite-dimensional semisimple groups is matched by a more sophisticated technique of the Quantum Inverse Scattering Method [7]. The treatment of the Toda chain by means of QISM is based on a $2 \times 2$ matrix first order difference Lax operator for the Toda lattice. (In order to understand its relation to the $n \times n$ Lax representation which is implicit in Kostant's approach, recall that the Lax matrix is a tridiagonal Jacobi matrix which defines a three-term recurrence relation and hence may be regarded as a second order scalar difference operator.) While the use of the lattice Lax representation restricts generality (we have to assume that $\mathfrak{g} = \mathfrak{sl}(n)$¹), it allows us to bring into play the powerful machinery of quantum R-matrices (and hence eventually of infinite dimensional quantum groups). Recently the first two authors have established an explicit connection of the QISM-based approach to the quantum Toda chain to the theory of Whittaker functions [9]. The technique of QISM yields new explicit formulae for the Whittaker functions which, to the best of our knowledge, were not known in the elementary representation theory. It looks rather natural to generalize this approach to the q-deformed case. The use of the lattice Lax representation makes the procedure rather straightforward: one simply has to replace the rational R-matrix with the trigonometric one (we shall see, however, that this generalization includes a number of nontrivial points). On the other hand, the very definitions of "noncompact quantum groups" which one needs to proceed with the q-deformed version of the Kostant approach are by no means obvious.
It is the interplay of the explicit formulae based on QISM and of their not-yet-defined counterparts coming from the representation theory of noncompact finite-dimensional quantum groups that makes the entire game very exciting. Our preliminary results suggest that the correct treatment of the problem requires a very significant change in the entire framework of the representation theory of $U_q(\mathfrak{g})$; the crucial role is played by the "modular dual" of $U_q(\mathfrak{g})$ and the modular double $U_q(\mathfrak{g}) \otimes U_{\tilde q}(\mathfrak{g})$ which was introduced recently by Faddeev [10]². Among other things, this new point of view leads to new possibilities in the choice of real forms of the relevant algebras: it is the real form of the modular double $U_q(\mathfrak{g}) \otimes U_{\tilde q}(\mathfrak{g})$ which really matters. One nontrivial possibility for the choice of the real form has been recently pointed out by Faddeev, Kashaev and Volkov [11] in their study of the quantum Liouville theory; it is very encouraging that the same real form naturally arises in the study of the q-deformed Toda chain. Analytical aspects of the theory bring into play the double gamma and double sine functions of Barnes [12–16], or the closely related quantum dilogarithms [17], which replace the ordinary gamma functions in the formulae for both the Harish-Chandra c-functions and the Whittaker functions. We believe that the implications of these constructions for the representation theory are probably more interesting than the q-deformed Toda model itself (commonly known as the relativistic Toda chain [18]).
Our strategy in the present paper is as follows. In Sect. 1 we shall start with the elementary representation theory of the algebra $U_q(\mathfrak{sl}(2,\mathbb{R}))$. Section 2 deals with the theory of Whittaker vectors and Whittaker functions for the modular double $U_q(\mathfrak{sl}(2,\mathbb{R})) \otimes U_{\tilde q}(\mathfrak{sl}(2,\mathbb{R}))$ and with the 2-particle q-deformed open Toda chain. We obtain explicit formulae for the Whittaker vectors in terms of the double sine functions and derive the integral representations for solutions to a one-parameter family of two-particle relativistic Toda chains in the framework of representation theory; all these solutions possess dual symmetry. Generalization to the $N$-particle case is described in Sect. 3; using the QISM approach, we derive an appropriate solution of the spectral problem for the open $N$-particle chain in the form of a multiple integral of the Mellin–Barnes type with the gamma functions replaced by double sine functions. It is shown that the solution for the $N$-periodic chain is represented as a generalized Fourier transform of the $(N-1)$-particle open wave function with the kernel satisfying two mutually dual Baxter equations. Finally, in the Appendix we list the essential analytic properties of the double sine functions.
¹ The treatment of other classical Lie algebras is also possible; to that end, one needs to use lattice Lax pairs with boundary conditions introduced by Sklyanin [8]. In the present note we shall not deal with this generalization and assume that $\mathfrak{g} = \mathfrak{sl}(n)$.
² The definition of the modular double was coined by Faddeev in the special case $\mathfrak{g} = \mathfrak{sl}(2)$; as pointed out to the authors by B. Feigin, it is most likely that for general semisimple Lie algebras the modular dual of $U_q(\mathfrak{g})$ is $U_{\tilde q}(\check{\mathfrak{g}})$, where $\check{\mathfrak{g}}$ is the Langlands dual of $\mathfrak{g}$.
1. Representations of Principal Series of $U_q(\mathfrak{sl}(2,\mathbb{R}))$ and the Modular Double
In this section we shall discuss the representations of $U_q(\mathfrak{sl}(2,\mathbb{R}))$ which may be regarded as deformations of the principal series representations of $SL(2,\mathbb{R})$. As pointed out by Faddeev [10], these representations possess a remarkable duality which is similar to the modular duality for noncommutative tori discovered by Rieffel [19]. We start with the algebraic definition of $U_q(\mathfrak{sl}(2,\mathbb{C}))$ (see, for example, [20]). It is generated by elements $K^{\pm 1}, E, F$ subject to the relations
\[ KE = q^2 EK, \qquad KF = q^{-2} FK, \qquad EF - FE = \frac{K - K^{-1}}{q - q^{-1}}, \tag{1.1} \]
where
\[ q = e^{\pi i \tau}, \qquad \tau \in \mathbb{C}. \tag{1.2} \]
The bialgebra structure on $U_q(\mathfrak{sl}(2,\mathbb{C}))$ is given by the coproduct³
\[ \Delta K = K \otimes K, \qquad \Delta E = E \otimes 1 + K \otimes E, \qquad \Delta F = 1 \otimes F + F \otimes K^{-1}. \tag{1.3} \]
The center of $U_q(\mathfrak{sl}(2,\mathbb{C}))$ is generated by the Casimir element
\[ C_2 = qK + q^{-1}K^{-1} + (q - q^{-1})^2 FE. \tag{1.4} \]
The algebra $U_q(\mathfrak{sl}(2,\mathbb{C}))$ admits a real form defined by the involution
\[ K^* = K, \qquad E^* = -E, \qquad F^* = -F, \tag{1.5} \]
which is compatible with the commutation relations (1.1) only if $|q| = 1$, i.e. $\tau \in \mathbb{R}$. The corresponding real algebra is called $U_q(\mathfrak{sl}(2,\mathbb{R}))$. (We shall see later that when $U_q(\mathfrak{sl}(2))$ is replaced with its modular double, there is a possibility to choose the real structure in a different way.)
³ The coalgebraic structure of $U_q(\mathfrak{sl}(2,\mathbb{C}))$ is not used in the present paper.
It is sometimes useful to consider the corresponding "infinitesimal" algebra $U_\tau(\mathfrak{sl}(2,\mathbb{R}))$, $\tau \in \mathbb{R}$, with generators $E, F, H$ and relations
\[ [H, E] = 2E, \qquad [H, F] = -2F, \qquad K = e^{H}, \qquad [E, F] = \frac{q^{H} - q^{-H}}{q - q^{-1}}. \tag{1.6} \]
Evidently, there is an involution
\[ H^* = -H, \qquad E^* = -E, \qquad F^* = -F. \tag{1.7} \]
Let us sketch the representation theory of $U_q(\mathfrak{sl}(2,\mathbb{R}))$ in a way which stresses the role of the modular duality concept (cf. [21]). The representations of the principal series of $U_q(\mathfrak{sl}(2,\mathbb{R}))$ admit an explicit realization by means of finite difference operators on the real line; the commutation relations of the basic operators which are the building blocks for these representations are the ordinary Weyl relations. To put it in a different way, the principal series representations of $U_q(\mathfrak{sl}(2,\mathbb{R}))$ factor through a noncommutative torus.
Definition 1.1. The noncommutative torus $A_q$ is the associative algebra generated by $u, v$ subject to the relation $uv = q^2 vu$.
We shall adjoin to $A_q$ the inverse elements $u^{-1}, v^{-1}$ (in other words, we replace $A_q$ with its field of fractions, which we denote by the same letter).
Proposition 1.1. For any $z \in \mathbb{C}$ the mapping $U_q(\mathfrak{sl}(2,\mathbb{R})) \to A_q$ defined by
\[ K \mapsto z u^{-1}, \qquad E \mapsto \frac{v^{-1}}{q - q^{-1}} \left( 1 - u^{-1} \right), \qquad F \mapsto \frac{q v}{q - q^{-1}} \left( z - z^{-1} u \right) \tag{1.8} \]
is a homomorphism of algebras.
Note that the Casimir $C_2$ is mapped by the homomorphism (1.8) to $qz + q^{-1}z^{-1}$. It is sometimes technically convenient to extend the algebra $U_q(\mathfrak{sl}(2))$ by adjoining to it "virtual Casimir elements". The following assertion is well-known.
Proposition 1.2. The center of $U_q(\mathfrak{sl}(2,\mathbb{R}))$ is isomorphic to the polynomial algebra $Z = \mathbb{C}[qz + q^{-1}z^{-1}] \subset \mathbb{C}[z, z^{-1}] = \hat Z$; $U_q(\mathfrak{sl}(2,\mathbb{R}))$ is a free $Z$-module.
Set
\[ \hat U_q(\mathfrak{sl}(2,\mathbb{R})) = U_q(\mathfrak{sl}(2,\mathbb{R})) \otimes_Z \hat Z. \]
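The homomorphism property stated in Proposition 1.1 depends only on the relation $uv = q^2 vu$, so it can be sanity-checked numerically in any concrete realization of that relation. The sketch below uses finite "clock and shift" matrices with $q^2$ a root of unity; this finite-dimensional model is our assumption for illustration only (the paper works with irrational $\tau$ and operators on the line), and all helper names are hypothetical.

```python
import cmath

N = 5                                    # matrix size (illustrative)
q = cmath.exp(1j * cmath.pi / N)         # so that q^2 = exp(2*pi*i/N)
z = 0.7 + 0.4j                           # arbitrary nonzero central parameter
d = q - 1 / q

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(*terms):
    """Entrywise sum of c * M over the given (c, M) pairs."""
    n = len(terms[0][1])
    return [[sum(c * M[i][j] for c, M in terms) for j in range(n)]
            for i in range(n)]

def norm(A):
    return max(abs(x) for row in A for x in row)

I = [[1 if i == j else 0 for j in range(N)] for i in range(N)]
u = [[q ** (2 * i) if i == j else 0 for j in range(N)] for i in range(N)]      # clock
u_inv = [[q ** (-2 * i) if i == j else 0 for j in range(N)] for i in range(N)]
v = [[1 if i == (j + 1) % N else 0 for j in range(N)] for i in range(N)]       # shift
v_inv = [[v[j][i] for j in range(N)] for i in range(N)]                        # transpose

# Images of the generators under the mapping (1.8).
K = lincomb((z, u_inv))
K_inv = lincomb((1 / z, u))
E = lincomb((1 / d, matmul(v_inv, lincomb((1, I), (-1, u_inv)))))
F = lincomb((q / d, matmul(v, lincomb((z, I), (-1 / z, u)))))

# Residuals of the torus relation, the relations (1.1) and the Casimir value.
r_torus = norm(lincomb((1, matmul(u, v)), (-q ** 2, matmul(v, u))))
r_KE = norm(lincomb((1, matmul(K, E)), (-q ** 2, matmul(E, K))))
r_KF = norm(lincomb((1, matmul(K, F)), (-q ** -2, matmul(F, K))))
r_EF = norm(lincomb((1, matmul(E, F)), (-1, matmul(F, E)), (-1 / d, K), (1 / d, K_inv)))
casimir = lincomb((q, K), (1 / q, K_inv), (d * d, matmul(F, E)))
r_C = norm(lincomb((1, casimir), (-(q * z + 1 / (q * z)), I)))
```

All five residuals should vanish up to rounding; in particular the image of $C_2$ is the scalar $qz + q^{-1}z^{-1}$, as noted after Proposition 1.1.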
The mapping (1.8) canonically extends to $\hat U_q(\mathfrak{sl}(2))$. Informally, we may think of $\hat U_q(\mathfrak{sl}(2))$ as a bundle of noncommutative tori parameterized by the spectrum of the central element $z \in \hat Z$. Proposition 1.1 is a simple instance of the "free field representations" for quantum groups; it may also be compared with the well-known Gelfand–Kirillov theorem [22] which asserts that the field of fractions of the universal enveloping algebra is isomorphic to the standard noncommutative division algebra (central extension of the field of fractions of the Weyl algebra generated by several pairs of "canonical variables" $p_i, q_i$).
As a motivation for the study of the modular duality for $U_q(\mathfrak{sl}(2,\mathbb{R}))$ let us recall the following standard construction from ergodic theory [19, 10]. Let $q = \exp(\pi i \omega_1/\omega_2)$, where $\omega_1, \omega_2 \in \mathbb{R}$; we shall assume that $\tau = \omega_1/\omega_2$ is irrational. Put $\tilde q = \exp(\pi i \omega_2/\omega_1)$ and let $A_{\tilde q}$ be the dual torus with generators $\tilde u, \tilde v$ and relations $\tilde u \tilde v = \tilde q^2 \tilde v \tilde u$. Let us define unitary operators $T_{\omega_1}, T_{\omega_2}, S_{-i\omega_1}, S_{-i\omega_2}$ in $L^2(\mathbb{R})$ by
\[ T_{\omega_1}\varphi(t) = \varphi(t + \omega_1), \quad T_{\omega_2}\varphi(t) = \varphi(t + \omega_2), \quad S_{-i\omega_1}\varphi(t) = e^{\frac{2\pi i t}{\omega_1}}\varphi(t), \quad S_{-i\omega_2}\varphi(t) = e^{\frac{2\pi i t}{\omega_2}}\varphi(t). \tag{1.9} \]
Define the dual representations of $A_q$ and $A_{\tilde q}$ in $H = L^2(\mathbb{R})$ by
\[ \rho : u \mapsto T_{\omega_1}, \quad v \mapsto S_{-i\omega_2}; \qquad \tilde\rho : \tilde u \mapsto T_{\omega_2}, \quad \tilde v \mapsto S_{-i\omega_1}. \tag{1.10} \]
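That the operators (1.9) realize the torus relation, and that the two representations in (1.10) commute, can be seen from a one-line computation on any function $\varphi$. The following is a sketch under the conventions of (1.9), with $q^2 = e^{2\pi i \omega_1/\omega_2}$:

```latex
\begin{aligned}
T_{\omega_1} S_{-i\omega_2}\varphi(t)
  &= e^{\frac{2\pi i (t + \omega_1)}{\omega_2}}\, \varphi(t + \omega_1)
   = q^{2}\, S_{-i\omega_2} T_{\omega_1}\varphi(t), \\
T_{\omega_1} S_{-i\omega_1}\varphi(t)
  &= e^{\frac{2\pi i (t + \omega_1)}{\omega_1}}\, \varphi(t + \omega_1)
   = S_{-i\omega_1} T_{\omega_1}\varphi(t).
\end{aligned}
```

The first line is $\rho(u)\rho(v) = q^2 \rho(v)\rho(u)$; the second shows that $\rho(u)$ commutes with $\tilde\rho(\tilde v)$, because the extra phase is $e^{2\pi i} = 1$. The same argument with $\omega_1$ and $\omega_2$ exchanged handles $\tilde\rho(\tilde u)$ and $\rho(v)$, while translations commute among themselves and so do multiplication operators.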
It is easy to see that $A_q$ and $A_{\tilde q}$ are the centralizers of each other in the algebra $B(H)$ of all bounded operators in $H$. The space $H = L^2(\mathbb{R})$, which has the structure of a left $A_q$-module and of a right $A_{\tilde q}$-module, is called the imprimitivity $(A_q, A_{\tilde q})$-bimodule. The images of $A_q$, $A_{\tilde q}$ in $B(H)$ are factors of type II$_1$. Clearly, the representations of $A_q$ and $A_{\tilde q}$ are reducible (in fact, both $A_q$ and $A_{\tilde q}$ contain plenty of idempotent elements which are represented by projection operators in $H$; the image of a projection operator $P \in \tilde\rho(A_{\tilde q})$ is an invariant subspace for $A_q$; the subspaces of $H$ which arise in this way are the celebrated fractional dimensional spaces of von Neumann). On the other hand, the second commutant of $A_q \otimes A_{\tilde q}$ coincides with $B(H)$ and hence (1.10) is an irreducible representation of $A_q \otimes A_{\tilde q}$ (as a matter of fact, up to unitary equivalence, this algebra has a unique irreducible representation).
The relation between the two noncommutative tori described above is called by Rieffel the strong Morita equivalence; in a more general way, Rieffel showed [19] that two tori $A_q$ and $A_{\tilde q}$, $q = e^{\pi i \tau}$, $\tilde q = e^{\pi i \tilde\tau}$, are strongly Morita equivalent if and only if
\[ \tilde\tau = \frac{a\tau + b}{c\tau + d}, \qquad \text{where } \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in GL(2, \mathbb{Z}), \]
which explains the term "modular duality".
Definition 1.2. The modular dual of $U_q(\mathfrak{sl}(2,\mathbb{R}))$ is the Hopf algebra $U_{\tilde q}(\mathfrak{sl}(2,\mathbb{R}))$ with $\tilde q = e^{\pi i/\tau}$; we set also
\[ \hat U_{\tilde q}(\mathfrak{sl}(2)) = U_{\tilde q}(\mathfrak{sl}(2)) \otimes_{\tilde Z} \hat{\tilde Z}, \qquad \tilde Z = \mathbb{C}[\tilde q \tilde z + \tilde q^{-1} \tilde z^{-1}], \qquad \hat{\tilde Z} = \mathbb{C}[\tilde z, \tilde z^{-1}]. \]
The obvious motivation for this definition is the existence of the "dual free field representation" $U_{\tilde q}(\mathfrak{sl}(2,\mathbb{R})) \to A_{\tilde q}$.
Remark 1.1. The modular transformation usually considered in the theory of theta functions is $\tau \to -\frac{1}{\tau}$; this transformation preserves the upper half-plane $\operatorname{Im} \tau > 0$ and the unit disc $|q| < 1$. While the flip $\tilde q \to \tilde q^{-1}$ amounts to the simple exchange of the generators of the quantum torus (and hence, in particular, the quantum algebras $U_{\tilde q}(\mathfrak{sl}(2,\mathbb{R}))$ and $U_{\tilde q^{-1}}(\mathfrak{sl}(2,\mathbb{R}))$ are isomorphic), our choice of the sign of the modular transform appears to be most natural for the study of the q-deformed Toda chain.
The fundamental difference which arises in the representation theory of $U_q(\mathfrak{sl}(2,\mathbb{R}))$ is that its unitary representations are constructed from non-unitary representations of the quantum torus: we need to make a kind of "Wick rotation" and hence $u, v \in A_q$ are represented by unbounded operators [23]. More precisely, let us consider the following operators in $H$:
\[ T_{i\omega_1}\varphi(t) = \varphi(t + i\omega_1), \quad T_{i\omega_2}\varphi(t) = \varphi(t + i\omega_2), \quad S_{\omega_1}\varphi(t) = e^{\frac{2\pi t}{\omega_1}}\varphi(t), \quad S_{\omega_2}\varphi(t) = e^{\frac{2\pi t}{\omega_2}}\varphi(t). \tag{1.11} \]
The dual representations of $A_q$, $A_{\tilde q}$ are now given by
\[ \rho_W : u \mapsto T_{i\omega_1}, \quad v \mapsto S_{\omega_2}; \qquad \tilde\rho_W : \tilde u \mapsto T_{i\omega_2}, \quad \tilde v \mapsto S_{\omega_1}. \tag{1.12} \]
Operators (1.11) are essentially self-adjoint on the common domain $P$ which consists of entire functions $\psi$ such that
\[ \int_{\mathbb{R}} e^{sx} |\psi(x + iy)|^2\, dx < \infty \qquad \text{for all } y \in \mathbb{R},\ s \in \mathbb{R}. \]
Remark 1.2. Unlike the unitary case, the definition of the centralizer of an unbounded operator must take care of the domains of operators; thus $AB = BA$ implies that $B(\mathrm{Dom}_A) \subset \mathrm{Dom}_A$; this may not be true even if $B$ is bounded. As a result, although the four operators (1.11) commute with each other, the same is not true, e.g., for their spectral projection operators⁴; thus, contrary to the ergodic case, the centralizer of $\rho_W(A_q)$ does not contain projection operators, and hence the representations $\rho_W$, $\tilde\rho_W$ are geometrically irreducible. It is much more important for us, however, that they are not irreducible in the operator sense, as each of them still admits a huge algebra of intertwiners.
Proposition 1.3. Operators which commute with all four operators (1.11) are scalars.
Corollary 1.1. Representation of $A_q \otimes A_{\tilde q}$ is strongly irreducible (i.e. it does not admit any nontrivial intertwiners).
Let us now describe explicitly the particular principal series representation of $\hat U_q(\mathfrak{sl}(2,\mathbb{R}))$ which is extensively used in the paper (the realization we use is slightly different from those described in [21]). Namely, the representation $\pi_\lambda$ of $\hat U_q(\mathfrak{sl}(2,\mathbb{R}))$ ($q = e^{\pi i \omega_1/\omega_2}$, $\omega_1, \omega_2 \in \mathbb{R}_+$), which depends on a parameter $\lambda \in \mathbb{C}$, is given by
\[ \pi_\lambda : \quad
\begin{aligned}
K &\mapsto e^{\frac{\pi i \lambda}{\omega_2}}\, T_{i\omega_1}^{-1}, \\
E &\mapsto \frac{S_{\omega_2}^{-1}}{q - q^{-1}} \left( 1 - T_{i\omega_1}^{-1} \right), \\
F &\mapsto \frac{q\, S_{\omega_2}}{q - q^{-1}} \left( e^{\frac{\pi i \lambda}{\omega_2}} - e^{-\frac{\pi i \lambda}{\omega_2}}\, T_{i\omega_1} \right), \\
C_2 &\mapsto q\, e^{\frac{\pi i \lambda}{\omega_2}} + q^{-1} e^{-\frac{\pi i \lambda}{\omega_2}}, \\
z &\mapsto e^{\frac{\pi i \lambda}{\omega_2}}.
\end{aligned} \tag{1.13} \]
⁴ Spectral projection operators $E(\Delta)$ for multiplication operators are multiplication operators by the characteristic function of the interval $\Delta$; this function has compact support and hence the spectral projector does not preserve the domain which consists of analytic functions.
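The formulae (1.13) are the mapping (1.8) with $u = T_{i\omega_1}$, $v = S_{\omega_2}$ and $z = e^{\pi i \lambda/\omega_2}$, so the relations (1.1) and the scalar value of $C_2$ can be checked pointwise on any entire function. The sketch below does this numerically; the periods, the value of $\lambda$, the evaluation point and the Gaussian test function are all illustrative assumptions (the test function is not claimed to lie in the domain used in the paper), and operators are modelled as plain Python closures.

```python
import cmath
import math

w1, w2 = 1.0, math.sqrt(2.0)            # real periods with irrational ratio (illustrative)
lam = 0.3 + 0.5j                        # arbitrary complex parameter lambda
q = cmath.exp(1j * math.pi * w1 / w2)
z = cmath.exp(1j * math.pi * lam / w2)
d = q - 1 / q

def shift(f, a):
    """Translation by a complex step: (T_a f)(t) = f(t + a)."""
    return lambda t: f(t + a)

def K(f):        # K -> z * T_{i w1}^{-1}
    g = shift(f, -1j * w1)
    return lambda t: z * g(t)

def K_inv(f):    # K^{-1} -> z^{-1} * T_{i w1}
    g = shift(f, 1j * w1)
    return lambda t: g(t) / z

def E(f):        # E -> S_{w2}^{-1} (1 - T_{i w1}^{-1}) / (q - 1/q)
    g = shift(f, -1j * w1)
    return lambda t: cmath.exp(-2 * math.pi * t / w2) * (f(t) - g(t)) / d

def F(f):        # F -> q S_{w2} (z - z^{-1} T_{i w1}) / (q - 1/q)
    g = shift(f, 1j * w1)
    return lambda t: q * cmath.exp(2 * math.pi * t / w2) * (z * f(t) - g(t) / z) / d

def phi(t):      # entire test function (hypothetical choice)
    return cmath.exp(-t * t)

t0 = 0.2 + 0.1j  # arbitrary evaluation point

# Operator products act right to left: (KE)phi = K(E(phi)).
res_KE = K(E(phi))(t0) - q ** 2 * E(K(phi))(t0)
res_EF = E(F(phi))(t0) - F(E(phi))(t0) - (K(phi)(t0) - K_inv(phi)(t0)) / d
casimir_value = q * K(phi)(t0) + K_inv(phi)(t0) / q + d ** 2 * F(E(phi))(t0)
expected = (q * z + 1 / (q * z)) * phi(t0)
```

Because the shifts are applied exactly rather than discretized, the residuals are pure rounding error. The check says nothing about domains or self-adjointness, which are the analytic content of the surrounding discussion.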
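By duality, we define the representation $\tilde\pi_\lambda$ of the modular dual algebra $\hat U_{\tilde q}(\mathfrak{sl}(2,\mathbb{R}))$ with $\tilde q = e^{\pi i \omega_2/\omega_1}$ by
\[ \tilde\pi_\lambda : \quad
\begin{aligned}
\tilde K &\mapsto e^{\frac{\pi i \lambda}{\omega_1}}\, T_{i\omega_2}^{-1}, \\
\tilde E &\mapsto \frac{S_{\omega_1}^{-1}}{\tilde q - \tilde q^{-1}} \left( 1 - T_{i\omega_2}^{-1} \right), \\
\tilde F &\mapsto \frac{\tilde q\, S_{\omega_1}}{\tilde q - \tilde q^{-1}} \left( e^{\frac{\pi i \lambda}{\omega_1}} - e^{-\frac{\pi i \lambda}{\omega_1}}\, T_{i\omega_2} \right), \\
\tilde C_2 &\mapsto \tilde q\, e^{\frac{\pi i \lambda}{\omega_1}} + \tilde q^{-1} e^{-\frac{\pi i \lambda}{\omega_1}}, \\
\tilde z &\mapsto e^{\frac{\pi i \lambda}{\omega_1}}.
\end{aligned} \tag{1.14} \]
The representations $\pi_\lambda$, $\tilde\pi_\lambda$ are defined on a larger space $P_\lambda \supset P$ which depends on $\lambda$.
Definition 1.3. $P_\lambda$ is the set of entire functions such that
(i) For $t \to +\infty$ a function $\psi \in P_\lambda$ admits an asymptotic expansion
\[ \psi(t + is) \sim_{t \to +\infty} e^{\frac{2\pi \lambda t}{\omega_1 \omega_2}} \sum_{n_1, n_2 \ge 0} C_{n_1, n_2}\, e^{-\frac{2\pi t (n_1\omega_1 + n_2\omega_2)}{\omega_1 \omega_2}} \tag{1.15} \]
uniformly in each bounded strip.
(ii) For $t \to -\infty$ it admits an asymptotic expansion
\[ \psi(t + is) \sim_{t \to -\infty} C \Bigl( 1 + \sum_{n_1, n_2 > 0} C_{n_1, n_2}\, e^{\frac{2\pi t (n_1\omega_1 + n_2\omega_2)}{\omega_1 \omega_2}} \Bigr) \tag{1.16} \]
uniformly in each bounded strip⁵.
The scalar product which is adapted to the discussion of the unitarity conditions in our algebra, defined on $P_\lambda$ only for $\lambda \in i\mathbb{R} - \omega_1 - \omega_2$, is given by
\[ (\varphi, \psi) = \int_{\mathbb{R}} e^{2\pi t \left( \frac{1}{\omega_1} + \frac{1}{\omega_2} \right)}\, \overline{\varphi(t)}\, \psi(t)\, dt. \tag{1.17} \]
Proposition 1.4. The following statements hold:
(i) Operators $\pi_\lambda(X)$, $X \in U_q(\mathfrak{sl}(2,\mathbb{R}))$ and $\tilde\pi_\lambda(Y)$, $Y \in U_{\tilde q}(\mathfrak{sl}(2,\mathbb{R}))$ leave $P_\lambda \subset L^2(\mathbb{R})$ invariant and commute with each other on this domain; any operator which commutes with both algebras is scalar.
(ii) If $\lambda \in i\mathbb{R} - \omega_1 - \omega_2$, all operators $\pi_\lambda(X)$, $X \in U_q(\mathfrak{sl}(2,\mathbb{R}))$ obey the involution generated by (1.5) with respect to the scalar product (1.17). A similar statement holds for all operators $\tilde\pi_\lambda(Y)$, $Y \in U_{\tilde q}(\mathfrak{sl}(2,\mathbb{R}))$.
(iii) The central characters $z$, $\tilde z$ of $\pi_\lambda$, $\tilde\pi_\lambda$ are related by $z = \tilde z^{\tau}$, $\tau = \omega_1/\omega_2$.
As in Proposition 1.3, the commutativity condition implies that the operator preserves the domains of our unbounded operators.
⁵ The space $P_\lambda$ essentially coincides with those considered in [21].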
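The commutativity asserted in Proposition 1.4 (i) rests on the fact that a shift by $i\omega_2$ changes $e^{2\pi t/\omega_2}$ by the factor $e^{2\pi i} = 1$, and likewise with the indices exchanged. A minimal numerical sketch of this, together with the relation $z = \tilde z^{\tau}$ of item (iii), is given below. The parameter values, the test function and the factory names `make_E`, `make_F` are illustrative assumptions; the complex power is taken on the principal branch, which matches $z$ here only because $|\operatorname{Re}\lambda| < \omega_1$ for the chosen values.

```python
import cmath
import math

w1, w2 = 1.0, math.sqrt(2.0)                 # illustrative real periods
lam = 0.3 + 0.5j                             # arbitrary complex lambda
q = cmath.exp(1j * math.pi * w1 / w2)
qt = cmath.exp(1j * math.pi * w2 / w1)       # dual parameter q-tilde
z = cmath.exp(1j * math.pi * lam / w2)
zt = cmath.exp(1j * math.pi * lam / w1)      # dual central character z-tilde

def make_E(step, period, qq):
    """f -> S_period^{-1} (1 - T_{i*step}^{-1}) f / (qq - 1/qq)."""
    def op(f):
        return lambda t: (cmath.exp(-2 * math.pi * t / period)
                          * (f(t) - f(t - 1j * step)) / (qq - 1 / qq))
    return op

def make_F(step, period, qq, zz):
    """f -> qq S_period (zz - zz^{-1} T_{i*step}) f / (qq - 1/qq)."""
    def op(f):
        return lambda t: (qq * cmath.exp(2 * math.pi * t / period)
                          * (zz * f(t) - f(t + 1j * step) / zz) / (qq - 1 / qq))
    return op

E, F = make_E(w1, w2, q), make_F(w1, w2, q, z)          # as in (1.13)
Et, Ft = make_E(w2, w1, qt), make_F(w2, w1, qt, zt)     # as in (1.14)

def phi(t):                                  # entire test function (hypothetical choice)
    return cmath.exp(-t * t)

t0 = 0.2
scale = abs(E(Et(phi))(t0)) + abs(F(Ft(phi))(t0))

comm_E_Et = E(Et(phi))(t0) - Et(E(phi))(t0)
comm_F_Et = F(Et(phi))(t0) - Et(F(phi))(t0)
comm_F_Ft = F(Ft(phi))(t0) - Ft(F(phi))(t0)
char_gap = abs(z - zt ** (w1 / w2))          # principal-branch check of z = zt^tau
```

The commutators vanish up to rounding relative to `scale`; this illustrates the algebraic part of the statement only and does not address invariance of the domain $P_\lambda$.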
Corollary 1.2. The principal series representation $\pi_\lambda$ of $\hat U_q(\mathfrak{sl}(2,\mathbb{R}))$ canonically extends to a representation of $\hat U_q(\mathfrak{sl}(2,\mathbb{R})) \otimes \hat U_{\tilde q}(\mathfrak{sl}(2,\mathbb{R}))$ which is defined on the same domain $P_\lambda$; this representation is unitary if and only if $\lambda \in i\mathbb{R} - \omega_1 - \omega_2$.
Remark 1.3. Since representations of the dual algebras $U_q(\mathfrak{sl}(2))$ and $U_{\tilde q}(\mathfrak{sl}(2))$ are constructed from the dual representations of the quantum tori $A_q$, $A_{\tilde q}$ for any values of the central characters $z = e^{\pi i \lambda/\omega_2}$, $\tilde z = e^{\pi i \mu/\omega_1}$, one might conclude that the principal series representations $\pi_\lambda$, $\tilde\pi_\mu$ centralize each other for any pair of indices $\lambda, \mu$; it is the condition on the common domain which imposes the selection rule.
Let us now introduce the following key definition.
Definition 1.4. The modular double of $\hat U_q(\mathfrak{sl}(2))$ is the Hopf algebra $D_{\mathrm{mod}} = \hat U_q(\mathfrak{sl}(2)) \otimes \hat U_{\tilde q}(\mathfrak{sl}(2))$.
The bialgebra structure of the modular double, i.e., its product and coproduct, is standard. The point is that this algebra admits an unexpected class of representations which are not tensor products of representations of the factors, but rather are related to a kind of "type II" operator algebras (the quotation marks reflect the fact that due to analyticity constraints our algebras are "thinner" than the genuine type II factors; in particular, they do not contain projection operators). The modular double should be regarded as an analytic rather than algebraic object which for the first time brings into play the nontrivial analytic properties of noncompact semisimple quantum groups. In what follows we shall be interested only in the principal series representations of $D_{\mathrm{mod}}$ defined above; with respect to this subclass of representations $D_{\mathrm{mod}}$ behaves as a rank one algebra. Note that the kernel of these representations contains the two-sided ideal $J \subset \hat U_q(\mathfrak{sl}(2)) \otimes \hat U_{\tilde q}(\mathfrak{sl}(2))$ generated by the relations
\[ z = \tilde z^{\tau}, \qquad K = \tilde K^{\tau}. \]
The use of the modular double and its representations, instead of those of its factors, appears to be very natural in many ways. We shall see below that the definition of the Whittaker vectors becomes unambiguous only if we require that they are the eigenvectors of both nilpotent generators $\pi_\lambda(E)$, $\tilde\pi_\lambda(\tilde E)$. One more reason to enjoy the presence of a double set of generators is the integrability problem for the q-deformed (relativistic) Toda model discussed below. The q-Toda Hamiltonian, which is derived from the Casimir element of $U_q(\mathfrak{sl}(2))$, is a difference operator which involves only translations $T_{i\omega_1}$; due to the presence of quasiconstants (i.e., functions with period $i\omega_1$), its spectrum becomes multiple with infinite multiplicity; the multiplicity problem is resolved when we take into account the dual Casimir element which involves dual translations $T_{i\omega_2}$.
The real form of $D_{\mathrm{mod}}$ used above is inherited from the real forms of $\hat U_q(\mathfrak{sl}(2))$, $\hat U_{\tilde q}(\mathfrak{sl}(2))$. As pointed out by Faddeev [10], for a special choice of the complex periods $\omega_1, \omega_2$ there exists another real form of $D_{\mathrm{mod}}$ which does not reduce to real forms of its factors. Namely,
Proposition 1.5. (i) Let us assume that $\bar\omega_1 = \omega_2$, or, equivalently, that $|\tau| = 1$. Then the mapping
\[ E \mapsto -\tilde E, \qquad F \mapsto -\tilde F, \qquad K \mapsto \tilde K, \qquad z \mapsto \tilde q^{2} \tilde z \tag{1.18} \]
extends to a $\mathbb{C}$-antilinear involution of $D_{\mathrm{mod}}$.
(ii) Let $\rho$ be a unitary representation of $D_{\mathrm{mod}}$ with respect to the real form (1.18); then all operators $\rho(X)$, $X \in \hat U_q(\mathfrak{sl}(2)) \subset D_{\mathrm{mod}}$, $\rho(Y)$, $Y \in \hat U_{\tilde q}(\mathfrak{sl}(2)) \subset D_{\mathrm{mod}}$, are normal.
(iii) Let $\lambda \in i\mathbb{R} - \omega_1 - \omega_2$; then the principal series representation $\pi_\lambda$ extends to a unitary representation of $D_{\mathrm{mod}}$ with respect to the real form (1.18).
Physical self-adjoint Hamiltonians associated with the real form (1.18) can be derived from the real and imaginary parts of the Casimir operators. Analytically, Faddeev's real form is particularly attractive, since in that case the lattice generated by $\omega_1, \omega_2$ is non-degenerate.
The use of the modular double is very well suited for the treatment of interpolation problems. Recall that we are dealing with the "rational form" of the quantum algebra $U_q(\mathfrak{sl}(2))$ which is defined in terms of the generator $K = q^{H}$. This choice is at the core of modular duality: if we replace $U_q(\mathfrak{sl}(2))$ with the "infinitesimal" algebra $U_\tau(\mathfrak{sl}(2))$ generated by $E, F, H$, the commutativity of the two dual sets of generators will be destroyed. On the other hand, when it comes to computing special functions associated with representations of $U_q(\mathfrak{sl}(2))$, i.e., some specific matrix coefficients of its irreducible representations, e.g., spherical functions or Whittaker functions, and to constructing the corresponding spectral theory, it is important to define these functions on the entire real line or on its complexification. By contrast, the use of the rational form $U_q(\mathfrak{sl}(2))$ implies that these functions are defined a priori only on a discrete set $\{K^n,\ n \in \mathbb{Z}\}$. Let us assume that $\omega_1, \omega_2$ are real and $\tau = \omega_1/\omega_2$ is irrational.
Proposition 1.6. For any $\alpha \in \mathbb{R}$ the operators $\pi_\lambda(e^{\alpha H})$ are approximated by linear combinations of $\pi_\lambda(K^n \cdot \tilde K^m)$, $n, m \in \mathbb{Z}$.
Indeed, $\pi_\lambda(e^{\alpha H})$ is a translation operator, $\pi_\lambda(e^{\alpha H})\varphi(t) = \varphi(t - i\alpha)$; on the other hand, $\pi_\lambda(K^n \cdot \tilde K^m)\varphi(t) = e^{\pi i \lambda n/\omega_2 + \pi i \lambda m/\omega_1}\varphi(t - in\omega_1 - im\omega_2)$. The set $\{in\omega_1 + im\omega_2;\ n, m \in \mathbb{Z}\}$ is dense in $i\mathbb{R}$.
Remark 1.4. There exists a whole family of principal series representations similar to those described above. It is easy to find a realization of the algebras $U_q(\mathfrak{sl}(2))$ and $U_{\tilde q}(\mathfrak{sl}(2))$ labeled by integer indices $k_1$ and $k_2$, respectively, such that all representation operators, which act on an appropriate space $P_\lambda^{k_1 k_2}$, satisfy the unitarity condition with respect to the scalar product with the measure $\exp\{\frac{2\pi (k_1\omega_1 + k_2\omega_2) t}{\omega_1\omega_2}\}$. In this case one obtains a unitary representation if and only if $\lambda \in i\mathbb{R} - k_1\omega_1 - k_2\omega_2$. For simplicity, in the present paper we restrict ourselves to the case $k_1 = k_2 = 1$, although the more general case can be treated quite similarly.
2. Whittaker Vectors
Let $\mathfrak{g}$ be a semisimple Lie algebra, $\mathfrak{n}$ its maximal nilpotent subalgebra generated by positive root vectors. A character $\chi : \mathfrak{n} \to \mathbb{C}$ is uniquely fixed by its values on root vectors associated with simple roots; it is called nondegenerate if $\chi(e_\alpha) \neq 0$ for all simple roots $\alpha$. A Whittaker vector in a $\mathfrak{g}$-module $V$ is a vector $w \in V$ such that
\[ Xw = \chi(X) w \tag{2.1} \]
for all X ∈ n. The extension of this definition to q-deformed algebras is nontrivial: it is easy to see that for rank g ≥ 2 the algebra Uq (n) generated by the Chevalley generators
582
S. Kharchev, D. Lebedev, M. Semenov-Tian-Shansky
associated with positive simple roots does not admit nondegenerate characters (the obstruction is associated with the q-deformed Serre relations). In [24] Sevostyanov found a way around this difficulty: one has to rescale the generators of the nilpotent subalgebra, multiplying them by appropriate group-like elements from the Cartan subalgebra. Although the Serre relations are vacuous in the sl2 case, the same trick proves useful in that case as well; it provides an extra freedom which serves to construct various versions of the q-deformed Toda Hamiltonians.

Whittaker vectors associated with the unitary principal series representations of SL(2, R) do not lie in the Hilbert space, because the spectrum of E, F is continuous; as a result, the Whittaker functions, which are defined as formal matrix coefficients of the principal series representations between a pair of Whittaker vectors, are expressed by a divergent integral which requires regularization. The situation in the q-deformed case is completely similar. As already mentioned, the natural definition of Whittaker vectors in the q-deformed setting requires the use of the modular double. The two commuting generators $E, \tilde E \in D_{\mathrm{mod}}$ give rise to two compatible difference equations which have a unique common solution with nice analytic properties; this solution does not belong to $L_2(\mathbb{R})$, because it does not decrease rapidly enough.

With these remarks in mind, we may now proceed to the formal definition. Let $(\pi_\lambda, \tilde\pi_\lambda)$, $\lambda \in i\mathbb{R} - \omega_1 - \omega_2$, be the unitary representation of $D_{\mathrm{mod}}$ and $\alpha \in \mathbb{R}$ an arbitrary parameter. The E-Whittaker vector $\Phi_\lambda^{(\alpha)}$ is defined by
\[
\begin{aligned}
\pi_\lambda(E)\,\Phi_\lambda^{(\alpha)} &= \frac{g^{\omega_1}}{q - q^{-1}}\, e^{\pi i\alpha}\, \pi_\lambda(q^{\alpha H})\,\Phi_\lambda^{(\alpha)}, \\
\tilde\pi_\lambda(\tilde E)\,\Phi_\lambda^{(\alpha)} &= \frac{g^{\omega_2}}{\tilde q - \tilde q^{-1}}\, e^{\pi i\alpha}\, \tilde\pi_\lambda(\tilde q^{\alpha \tilde H})\,\Phi_\lambda^{(\alpha)};
\end{aligned}
\tag{2.2}
\]
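here g is a positive real number (the "coupling constant"). The extra parameter α matches the freedom in the choice of the quantum Lax operator in the alternative formulation of the q-deformed Toda theory based on the Quantum Inverse Scattering Method. In other words, particular choices of α correspond to different Toda-like models. In a similar way, the F-Whittaker vectors $\widehat\Phi_\lambda^{(\alpha)}$ are defined by
\[
\begin{aligned}
\pi_\lambda(F)\,\widehat\Phi_\lambda^{(\alpha)} &= -\frac{g^{\omega_1}}{q - q^{-1}}\, e^{\pi i\alpha}\, \pi_\lambda(q^{-\alpha H})\,\widehat\Phi_\lambda^{(\alpha)}, \\
\tilde\pi_\lambda(\tilde F)\,\widehat\Phi_\lambda^{(\alpha)} &= -\frac{g^{\omega_2}}{\tilde q - \tilde q^{-1}}\, e^{\pi i\alpha}\, \tilde\pi_\lambda(\tilde q^{-\alpha \tilde H})\,\widehat\Phi_\lambda^{(\alpha)}.
\end{aligned}
\tag{2.3}
\]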
The definition of the Whittaker vectors is completely symmetric with respect to the exchange of the two dual algebras $U_q(sl(2))$, $U_{\tilde q}(sl(2))$. Note that the existence of a common eigenvector of the commuting generators $\pi_\lambda(E)$, $\tilde\pi_\lambda(\tilde E)$, or $\pi_\lambda(F)$, $\tilde\pi_\lambda(\tilde F)$, is guaranteed due to our "selection rule" for the central characters of $\hat U_q(sl(2))$, $\hat U_{\tilde q}(sl(2))$.

2.1. Whittaker vectors: Explicit solutions. We shall start with the explicit formulae for the simplest Whittaker vectors corresponding to a particular choice of α. Using the representations (1.13), (1.14), we get the following system of difference equations for the vectors $\Phi_\lambda^{(\alpha)}$ with α = 0, 1:
\[ \frac{\Phi_\lambda^{(0)}(t - i\omega_1)}{\Phi_\lambda^{(0)}(t)} = 1 - g^{\omega_1} e^{\frac{2\pi t}{\omega_2}}, \tag{2.4a} \]
\[ \frac{\Phi_\lambda^{(0)}(t - i\omega_2)}{\Phi_\lambda^{(0)}(t)} = 1 - g^{\omega_2} e^{\frac{2\pi t}{\omega_1}}, \tag{2.4b} \]
\[ \frac{\Phi_\lambda^{(1)}(t - i\omega_1)}{\Phi_\lambda^{(1)}(t)} = \frac{1}{1 - g^{\omega_1} e^{\frac{2\pi t}{\omega_2} + \frac{\pi i\lambda}{\omega_2}}}, \tag{2.4c} \]
\[ \frac{\Phi_\lambda^{(1)}(t - i\omega_2)}{\Phi_\lambda^{(1)}(t)} = \frac{1}{1 - g^{\omega_2} e^{\frac{2\pi t}{\omega_1} + \frac{\pi i\lambda}{\omega_1}}}. \tag{2.4d} \]
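In a similar way, the Whittaker vectors $\widehat\Phi_\lambda^{(\alpha)}$ with α = 0, 1 satisfy the difference equations
\[ \frac{\widehat\Phi_\lambda^{(0)}(t + i\omega_1)}{\widehat\Phi_\lambda^{(0)}(t)} = e^{\frac{2\pi i\lambda}{\omega_2}} \Bigl(1 + q^{-1} g^{\omega_1} e^{-\frac{2\pi t}{\omega_2} - \frac{\pi i\lambda}{\omega_2}}\Bigr), \tag{2.5a} \]
\[ \frac{\widehat\Phi_\lambda^{(0)}(t + i\omega_2)}{\widehat\Phi_\lambda^{(0)}(t)} = e^{\frac{2\pi i\lambda}{\omega_1}} \Bigl(1 + \tilde q^{-1} g^{\omega_2} e^{-\frac{2\pi t}{\omega_1} - \frac{\pi i\lambda}{\omega_1}}\Bigr), \tag{2.5b} \]
\[ \frac{\widehat\Phi_\lambda^{(1)}(t + i\omega_1)}{\widehat\Phi_\lambda^{(1)}(t)} = \frac{e^{\frac{2\pi i\lambda}{\omega_2}}}{1 + q^{-1} g^{\omega_1} e^{-\frac{2\pi t}{\omega_2}}}, \tag{2.5c} \]
\[ \frac{\widehat\Phi_\lambda^{(1)}(t + i\omega_2)}{\widehat\Phi_\lambda^{(1)}(t)} = \frac{e^{\frac{2\pi i\lambda}{\omega_1}}}{1 + \tilde q^{-1} g^{\omega_2} e^{-\frac{2\pi t}{\omega_1}}}. \tag{2.5d} \]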
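Let S(y) be the function defined in terms of the double sine $S_2(y)$ according to (A.17).6

Proposition 2.1. The Whittaker vectors satisfying Eqs. (2.4a–2.5d) are given by the following formulae:
\[ \Phi_\lambda^{(0)}(t) = S\Bigl(-it + \omega_1 + \omega_2 - \tfrac{i\omega_1\omega_2}{2\pi}\log g\Bigr), \tag{2.6a} \]
\[ \Phi_\lambda^{(1)}(t) = S^{-1}\Bigl(-it + \omega_1 + \omega_2 + \tfrac{\lambda}{2} - \tfrac{i\omega_1\omega_2}{2\pi}\log g\Bigr), \tag{2.6b} \]
\[ \widehat\Phi_\lambda^{(0)}(t) = S\Bigl(it + \tfrac12(\omega_1 + \omega_2) - \tfrac{\lambda}{2} - \tfrac{i\omega_1\omega_2}{2\pi}\log g\Bigr)\, e^{\frac{2\pi\lambda t}{\omega_1\omega_2}}, \tag{2.6c} \]
\[ \widehat\Phi_\lambda^{(1)}(t) = S^{-1}\Bigl(it + \tfrac12(\omega_1 + \omega_2) - \tfrac{i\omega_1\omega_2}{2\pi}\log g\Bigr)\, e^{\frac{2\pi\lambda t}{\omega_1\omega_2}}. \tag{2.6d} \]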
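As a consistency check of (2.6a) against (2.4a), the following short sketch may be helpful. It assumes that S obeys the first-order functional equation $S(y)/S(y+\omega_1) = 1 - e^{2\pi i y/\omega_2}$; this normalization is an assumption made here for illustration (the appendix formulas are not reproduced in this section), and the shorthand $a$ is introduced only for this computation.

```latex
\begin{align*}
a &:= \omega_1+\omega_2-\frac{i\omega_1\omega_2}{2\pi}\log g ,
\qquad \Phi^{(0)}_\lambda(t)=S(-it+a),\\[2pt]
\frac{\Phi^{(0)}_\lambda(t-i\omega_1)}{\Phi^{(0)}_\lambda(t)}
 &=\frac{S(y)}{S(y+\omega_1)}\bigg|_{y=-it-\omega_1+a}
 \;=\;1-e^{2\pi i(-it-\omega_1+a)/\omega_2} \quad\text{(assumed functional equation)},\\[2pt]
e^{2\pi i(-it-\omega_1+a)/\omega_2}
 &=e^{2\pi t/\omega_2}\,e^{2\pi i}\,e^{\omega_1\log g}
 \;=\;g^{\omega_1}e^{2\pi t/\omega_2}.
\end{align*}
```

Under the same assumption, replacing $a$ by $a + \lambda/2$ and $S$ by $S^{-1}$ produces the extra factor $e^{\pi i\lambda/\omega_2}$ and the inverted right-hand side of (2.4c).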
6 In the main text we shall write S(y) instead of S(y|ω) for brevity. We omit such dependence for any other function of such type.
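In a more general way, one can prove the following formulae for the Whittaker vectors $\Phi_\lambda^{(\alpha)}$, $\widehat\Phi_\lambda^{(\alpha)}$ with arbitrary values of α:
\[ \Phi_\lambda^{(\alpha)}(t) = \int_{\Gamma_\alpha} c(\zeta)\, \exp\Bigl\{\frac{\pi i(2\alpha - 1)\zeta^2}{2\omega_1\omega_2} + \frac{2\pi i\zeta}{\omega_1\omega_2}\Bigl[t + \frac{i\alpha}{2}(\lambda + \omega_1 + \omega_2) + \frac{i}{4}(\omega_1 + \omega_2)\Bigr]\Bigr\}\, d\zeta, \tag{2.7a} \]
\[ \widehat\Phi_\lambda^{(\alpha)}(t) = e^{\frac{2\pi\lambda t}{\omega_1\omega_2}} \int_{\Gamma_\alpha} c(\zeta)\, \exp\Bigl\{\frac{\pi i(2\alpha - 1)\zeta^2}{2\omega_1\omega_2} - \frac{2\pi i\zeta}{\omega_1\omega_2}\Bigl[t + \frac{i(1 - \alpha)}{2}(\lambda + \omega_1 + \omega_2) - \frac{i}{4}(\omega_1 + \omega_2)\Bigr]\Bigr\}\, d\zeta, \tag{2.7b} \]
where
\[ c(\zeta) \equiv \frac{g^{i\zeta}}{\sqrt{\omega_1\omega_2}}\, S_2^{-1}(-i\zeta) \tag{2.8} \]
and the contour $\Gamma_\alpha$ is chosen in such a way that it passes above the poles of the integrand and escapes to infinity in the sector where the function $e^{\frac{\pi i\alpha\zeta^2}{\omega_1\omega_2}}$ is decaying on the left and in the sector where $e^{\frac{\pi i(\alpha - 1)\zeta^2}{\omega_1\omega_2}}$ is decaying on the right. For α ≠ 0, 1, $\Phi_\lambda^{(\alpha)}$, $\widehat\Phi_\lambda^{(\alpha)}$ are entire functions of the variable t; for the "degenerate cases" α = 0, 1, the integrals in (2.7a), (2.7b) may be evaluated explicitly using formulae (A.27) and reduce to (2.6); in these cases both vectors are meromorphic functions of t. Let us note that the function c(ζ) may be regarded as the q-deformed Harish-Chandra function (this term is justified by its role in the asymptotic formulae for the Whittaker functions, see below).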
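2.2. Whittaker functions. Now we would like to define the q-deformed Whittaker functions as the matrix elements of Whittaker vectors. As mentioned before, the standard integral (1.17) is divergent in this case. To regularize the integral, one should deform the integration contour in an appropriate way. Therefore, by the scalar product below we mean a suitable regularization of (1.17).

Definition 2.1. Let $\alpha = (\alpha_1, \alpha_2) \in \mathbb{R}^2$. The Whittaker functions $w_\lambda^{(\alpha)}(x)$ corresponding to the representation $(\pi_\lambda, \tilde\pi_\lambda)$ of the algebra $D_{\mathrm{mod}}$ are the matrix elements
\[ w_\lambda^{(\alpha)}(x) = e^{-\frac{\pi(\omega_1 + \omega_2)x}{\omega_1\omega_2}} \Bigl\langle \widehat\Phi_\lambda^{(\alpha_1)},\ e^{-\frac{\pi x}{\omega_2}H}\, \Phi_\lambda^{(\alpha_2)} \Bigr\rangle. \tag{2.9} \]

Proposition 2.2. The Whittaker functions (2.9) satisfy the equations
\[ w_\lambda^{(\alpha)}(x - i\omega_1) + w_\lambda^{(\alpha)}(x + i\omega_1) + q^{\alpha_1 - \alpha_2} g^{2\omega_1} e^{\frac{2\pi x}{\omega_2}}\, w_\lambda^{(\alpha)}\bigl(x + i(\alpha_1 - \alpha_2)\omega_1\bigr) = -\Bigl(q e^{\frac{\pi i\lambda}{\omega_2}} + q^{-1} e^{-\frac{\pi i\lambda}{\omega_2}}\Bigr) w_\lambda^{(\alpha)}(x), \tag{2.10a} \]
\[ w_\lambda^{(\alpha)}(x - i\omega_2) + w_\lambda^{(\alpha)}(x + i\omega_2) + \tilde q^{\,\alpha_1 - \alpha_2} g^{2\omega_2} e^{\frac{2\pi x}{\omega_1}}\, w_\lambda^{(\alpha)}\bigl(x + i(\alpha_1 - \alpha_2)\omega_2\bigr) = -\Bigl(\tilde q e^{\frac{\pi i\lambda}{\omega_1}} + \tilde q^{-1} e^{-\frac{\pi i\lambda}{\omega_1}}\Bigr) w_\lambda^{(\alpha)}(x). \tag{2.10b} \]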
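Let us check (2.10a) formally; we shall discuss the convergence of the integral in (2.9) a little later. Set
\[ F_\lambda^{(\alpha)}(x) = \Bigl\langle \widehat\Phi_\lambda^{(\alpha_1)},\ e^{-\frac{\pi x}{\omega_2}H}\, \Phi_\lambda^{(\alpha_2)} \Bigr\rangle. \tag{2.11} \]
The eigenvalue of the Casimir operator $\pi_\lambda(C_2)$ is
\[ C_2 = q e^{\frac{\pi i\lambda}{\omega_2}} + q^{-1} e^{-\frac{\pi i\lambda}{\omega_2}}. \tag{2.12} \]
Therefore,
\[ \Bigl\langle \widehat\Phi_\lambda^{(\alpha_1)},\ e^{-\frac{\pi x}{\omega_2}H} C_2\, \Phi_\lambda^{(\alpha_2)} \Bigr\rangle = \Bigl(q e^{\frac{\pi i\lambda}{\omega_2}} + q^{-1} e^{-\frac{\pi i\lambda}{\omega_2}}\Bigr) F_\lambda^{(\alpha)}(x). \tag{2.13} \]
On the other hand,
\[
\begin{aligned}
\Bigl\langle \widehat\Phi_\lambda^{(\alpha_1)},\ e^{-\frac{\pi x}{\omega_2}H} C_2\, \Phi_\lambda^{(\alpha_2)} \Bigr\rangle
&= \Bigl\langle \widehat\Phi_\lambda^{(\alpha_1)},\ e^{-\frac{\pi x}{\omega_2}H} \bigl(q^{H+1} + q^{-H-1} + (q - q^{-1})^2 FE\bigr) \Phi_\lambda^{(\alpha_2)} \Bigr\rangle \\
&= \Bigl\langle \widehat\Phi_\lambda^{(\alpha_1)},\ e^{-\frac{\pi x}{\omega_2}H} \bigl(q^{H+1} + q^{-H-1}\bigr) \Phi_\lambda^{(\alpha_2)} \Bigr\rangle
 - (q - q^{-1})^2 e^{\frac{2\pi x}{\omega_2}} \Bigl\langle F\,\widehat\Phi_\lambda^{(\alpha_1)},\ e^{-\frac{\pi x}{\omega_2}H} E\, \Phi_\lambda^{(\alpha_2)} \Bigr\rangle.
\end{aligned}
\tag{2.14}
\]
Using the definition of the Whittaker vectors (2.2), (2.3), we obtain
\[
\begin{aligned}
\Bigl\langle \widehat\Phi_\lambda^{(\alpha_1)},\ e^{-\frac{\pi x}{\omega_2}H} C_2\, \Phi_\lambda^{(\alpha_2)} \Bigr\rangle
&= \Bigl\langle \widehat\Phi_\lambda^{(\alpha_1)},\ e^{-\frac{\pi x}{\omega_2}H} \bigl(q^{H+1} + q^{-H-1}\bigr) \Phi_\lambda^{(\alpha_2)} \Bigr\rangle \\
&\quad - e^{\pi i(\alpha_2 - \alpha_1)} g^{2\omega_1} e^{\frac{2\pi x}{\omega_2}} \Bigl\langle \widehat\Phi_\lambda^{(\alpha_1)},\ e^{-\frac{\pi x}{\omega_2}H} q^{(\alpha_2 - \alpha_1)H}\, \Phi_\lambda^{(\alpha_2)} \Bigr\rangle \\
&= \Bigl(q e^{-i\omega_1\partial_x} + q^{-1} e^{i\omega_1\partial_x} - e^{\pi i(\alpha_2 - \alpha_1)} g^{2\omega_1} e^{\frac{2\pi x}{\omega_2}} e^{i(\alpha_1 - \alpha_2)\omega_1\partial_x}\Bigr) F_\lambda^{(\alpha)}(x).
\end{aligned}
\tag{2.15}
\]
From (2.13) and (2.15) it follows that the matrix coefficient $F_\lambda^{(\alpha)}$ satisfies the equation
\[ \Bigl(q e^{-i\omega_1\partial_x} + q^{-1} e^{i\omega_1\partial_x} - e^{\pi i(\alpha_2 - \alpha_1)} g^{2\omega_1} e^{\frac{2\pi x}{\omega_2}} e^{i(\alpha_1 - \alpha_2)\omega_1\partial_x}\Bigr) F_\lambda^{(\alpha)}(x) = \Bigl(q e^{\frac{\pi i\lambda}{\omega_2}} + q^{-1} e^{-\frac{\pi i\lambda}{\omega_2}}\Bigr) F_\lambda^{(\alpha)}(x). \tag{2.16} \]
Hence, the function
\[ w_\lambda^{(\alpha)}(x) = e^{-\frac{\pi(\omega_1 + \omega_2)x}{\omega_1\omega_2}} F_\lambda^{(\alpha)}(x) \tag{2.17} \]
satisfies (2.10a).

Corollary 2.1. Let the unitary weight be $\lambda = -i\gamma - \omega_1 - \omega_2$. The Whittaker functions $w^{(\alpha)}_{-i\gamma - \omega_1 - \omega_2} \equiv w_\gamma^{(\alpha)}$ are eigenfunctions of the Hamilton operators
\[
\begin{aligned}
H^{(\alpha_1 - \alpha_2)} &= e^{i\omega_1\partial_x} + e^{-i\omega_1\partial_x} + q^{\alpha_1 - \alpha_2} g^{2\omega_1} e^{\frac{2\pi x}{\omega_2}} e^{i(\alpha_1 - \alpha_2)\omega_1\partial_x}, \\
\tilde H^{(\alpha_1 - \alpha_2)} &= e^{i\omega_2\partial_x} + e^{-i\omega_2\partial_x} + \tilde q^{\,\alpha_1 - \alpha_2} g^{2\omega_2} e^{\frac{2\pi x}{\omega_1}} e^{i(\alpha_1 - \alpha_2)\omega_2\partial_x}
\end{aligned}
\tag{2.18}
\]
with eigenvalues $\varepsilon_\gamma = e^{\frac{\pi\gamma}{\omega_2}} + e^{-\frac{\pi\gamma}{\omega_2}}$ and $\tilde\varepsilon_\gamma = e^{\frac{\pi\gamma}{\omega_1}} + e^{-\frac{\pi\gamma}{\omega_1}}$, respectively.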
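The passage from (2.16) to (2.10a), and the eigenvalue quoted in Corollary 2.1, can be made explicit as follows. This is a sketch that uses only $q = e^{i\pi\omega_1/\omega_2}$ (as in (3.14)) and the definition (2.17); the shorthand $\kappa$ is introduced here for convenience.

```latex
\begin{align*}
&\text{Put } \kappa=\frac{\pi(\omega_1+\omega_2)}{\omega_1\omega_2},\quad
 F^{(\alpha)}_\lambda(x)=e^{\kappa x}\,w^{(\alpha)}_\lambda(x),\quad
 e^{\mp i\kappa\omega_1}=e^{\mp\pi i}\,e^{\mp\pi i\omega_1/\omega_2}=-q^{\mp1}.\\[4pt]
&q\,e^{-i\omega_1\partial_x}F^{(\alpha)}_\lambda(x)
   = q\,(-q^{-1})\,e^{\kappa x}\,w^{(\alpha)}_\lambda(x-i\omega_1)
   = -e^{\kappa x}\,w^{(\alpha)}_\lambda(x-i\omega_1),\\
&q^{-1}e^{i\omega_1\partial_x}F^{(\alpha)}_\lambda(x)
   = -e^{\kappa x}\,w^{(\alpha)}_\lambda(x+i\omega_1),\\
&-e^{\pi i(\alpha_2-\alpha_1)}\,e^{i(\alpha_1-\alpha_2)\omega_1\partial_x}F^{(\alpha)}_\lambda(x)
   = -e^{\pi i(\alpha_2-\alpha_1)}\,e^{\pi i(\alpha_1-\alpha_2)}\,q^{\alpha_1-\alpha_2}\,
     e^{\kappa x}\,w^{(\alpha)}_\lambda\bigl(x+i(\alpha_1-\alpha_2)\omega_1\bigr)\\
&\hphantom{-e^{\pi i(\alpha_2-\alpha_1)}\,e^{i(\alpha_1-\alpha_2)\omega_1\partial_x}F^{(\alpha)}_\lambda(x)}
   = -q^{\alpha_1-\alpha_2}\,e^{\kappa x}\,w^{(\alpha)}_\lambda\bigl(x+i(\alpha_1-\alpha_2)\omega_1\bigr).\\[4pt]
&\text{Dividing (2.16) by } -e^{\kappa x} \text{ gives (2.10a). For } \lambda=-i\gamma-\omega_1-\omega_2:\\
&q\,e^{\pi i\lambda/\omega_2}=q\,e^{\pi\gamma/\omega_2}\,q^{-1}e^{-\pi i}=-e^{\pi\gamma/\omega_2},\qquad
 q^{-1}e^{-\pi i\lambda/\omega_2}=-e^{-\pi\gamma/\omega_2},\\
&\text{so } -\bigl(q\,e^{\pi i\lambda/\omega_2}+q^{-1}e^{-\pi i\lambda/\omega_2}\bigr)
   =e^{\pi\gamma/\omega_2}+e^{-\pi\gamma/\omega_2}=\varepsilon_\gamma .
\end{align*}
```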
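Operators $H^{(\alpha)}$, $\tilde H^{(\alpha)}$ are the two dual Hamiltonians of the q-deformed 2-particle Toda chain. If $\omega_1, \omega_2$ are real, both $H^{(\alpha)}$ and $\tilde H^{(\alpha)}$ are essentially self-adjoint in the space of smooth functions on the line which decrease faster than $e^{-|\omega x|}$, where $\omega = \max(\omega_1, \omega_2)$. When $\bar\omega_1 = \omega_2$, "physical" self-adjoint Hamiltonians are $H^{(\alpha)} + \tilde H^{(\alpha)}$ and $i(H^{(\alpha)} - \tilde H^{(\alpha)})$.

Using the explicit formulae for the Whittaker vectors $\Phi_\lambda^{(\alpha)}$, $\widehat\Phi_\lambda^{(\alpha)}$, we may express the Whittaker functions $w_\gamma^{(\alpha)}$ in integral form:
\[
\begin{aligned}
w_\gamma^{(\alpha)}(x) &= N_\gamma\, e^{-\frac{\pi(\omega_1 + \omega_2)x}{\omega_1\omega_2}} \Bigl\langle \widehat\Phi^{(\alpha_1)}_{-i\gamma - \omega_1 - \omega_2},\ e^{-\frac{\pi x}{\omega_2}H}\, \Phi^{(\alpha_2)}_{-i\gamma - \omega_1 - \omega_2} \Bigr\rangle \\
&= N_\gamma\, e^{\frac{\pi i\gamma x}{\omega_1\omega_2}} \int e^{\frac{2\pi(\omega_1 + \omega_2)t}{\omega_1\omega_2}}\, \overline{\widehat\Phi^{(\alpha_1)}_{-i\gamma - \omega_1 - \omega_2}(t)}\ \Phi^{(\alpha_2)}_{-i\gamma - \omega_1 - \omega_2}(t + x)\, dt,
\end{aligned}
\tag{2.19}
\]
where one introduces the normalization factor $N_\gamma$ for future convenience as follows:
\[ N_\gamma = \frac{1}{\omega_1\omega_2}\, e^{-\frac{\pi i}{2}[B_{2,2}(i\gamma) - B_{2,2}(0)]} \tag{2.20} \]
with the polynomial $B_{2,2}(z)$ defined by (A.4). In particular, substituting in (2.19) the expressions (2.6b), (2.6c), and using (A.21), we get for the Whittaker function $w_\gamma^{(0,1)} \equiv w_\gamma^{(-)}$ the integral representation
\[
\begin{aligned}
w_\gamma^{(-)}(x) = N_\gamma\, e^{\frac{\pi i\gamma x}{\omega_1\omega_2}} \int_{C_-} & S^{-1}\Bigl(it + \tfrac{i}{2}\gamma - \tfrac{i\omega_1\omega_2}{2\pi}\log g\Bigr) \\
\times\ & S^{-1}\Bigl(-it - ix - \tfrac{i}{2}\gamma + \tfrac12(\omega_1 + \omega_2) - \tfrac{i\omega_1\omega_2}{2\pi}\log g\Bigr)\, e^{\frac{2\pi i\gamma t}{\omega_1\omega_2}}\, dt,
\end{aligned}
\tag{2.21}
\]
where the contour belongs asymptotically to the sectors
\[
\begin{aligned}
\arg\omega_1 + \tfrac{\pi}{2} &< \arg t < \tfrac12(\arg\omega_1 + \arg\omega_2) + \pi, \\
\arg\omega_1 - \tfrac{\pi}{2} &< \arg t < \tfrac12(\arg\omega_1 + \arg\omega_2)
\end{aligned}
\tag{2.22}
\]
(at this point one can relax the "physical" constraints imposed on parameters $\omega_1, \omega_2$) and lies between the two sets of poles of the integrand:
\[
\begin{aligned}
t^{(-)}_{n_1,n_2} &= -\tfrac{\gamma}{2} + \tfrac{\omega_1\omega_2}{2\pi}\log g + i(n_1\omega_1 + n_2\omega_2), & n_1, n_2 &\ge 0, \\
t^{(-)}_{m_1,m_2}(x) &= -x - \tfrac{\gamma}{2} - \tfrac{\omega_1\omega_2}{2\pi}\log g - i\bigl((m_1 + \tfrac12)\omega_1 + (m_2 + \tfrac12)\omega_2\bigr), & m_1, m_2 &\ge 0.
\end{aligned}
\]
(See (A.12), (A.17) for the description of the poles and the zeros of S(y).) The choice of the integration contour assures convergence and provides a natural regularization of the divergent inner product. Indeed, to see that the integral in (2.21) is well defined, observe that due to (A.24) the integrand has the asymptotics
\[ e^{-\frac{\pi i t^2}{\omega_1\omega_2} + t(\ldots)}. \]
But in the sectors (2.22) the quadratic exponential decreases. Hence, the integral (2.21) is absolutely convergent.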
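In a similar way, the function $w_\gamma^{(1,0)} \equiv w_\gamma^{(+)}$ corresponding to (2.6a), (2.6d) admits the integral representation
\[
\begin{aligned}
w_\gamma^{(+)}(x) = N_\gamma\, e^{\frac{\pi i\gamma x}{\omega_1\omega_2}} \int_{C_+} & S\Bigl(it + \tfrac12(\omega_1 + \omega_2) - \tfrac{i\omega_1\omega_2}{2\pi}\log g\Bigr) \\
\times\ & S\Bigl(-it - ix + \omega_1 + \omega_2 - \tfrac{i\omega_1\omega_2}{2\pi}\log g\Bigr)\, e^{\frac{2\pi i\gamma t}{\omega_1\omega_2}}\, dt,
\end{aligned}
\tag{2.23}
\]
where the contour belongs asymptotically to the sectors
\[
\begin{aligned}
\tfrac12(\arg\omega_1 + \arg\omega_2) + \pi &< \arg t < \arg\omega_2 + \tfrac{3\pi}{2}, \\
\tfrac12(\arg\omega_1 + \arg\omega_2) &< \arg t < \arg\omega_2 + \tfrac{\pi}{2},
\end{aligned}
\tag{2.24}
\]
and lies between the two sets of poles of the integrand:
\[
\begin{aligned}
t^{(+)}_{n_1,n_2} &= \tfrac{\omega_1\omega_2}{2\pi}\log g - i\bigl((n_1 + \tfrac12)\omega_1 + (n_2 + \tfrac12)\omega_2\bigr), & n_1, n_2 &\ge 0, \\
t^{(+)}_{m_1,m_2}(x) &= -x - \tfrac{\omega_1\omega_2}{2\pi}\log g + i(m_1\omega_1 + m_2\omega_2), & m_1, m_2 &\ge 0.
\end{aligned}
\]
The integral (2.23) is absolutely convergent.

Quite similarly, one can construct the function $w_\gamma^{(0,0)}(x) \equiv w_\gamma^{(0)}(x)$ using the Whittaker vectors $\widehat\Phi_\lambda^{(0)}$ and $\Phi_\lambda^{(0)}$:
\[
\begin{aligned}
w_\gamma^{(0)}(x) = N_\gamma\, e^{\frac{\pi i\gamma x}{\omega_1\omega_2}} \int_{C_0} & S^{-1}\Bigl(it + \tfrac{i}{2}\gamma - \tfrac{i\omega_1\omega_2}{2\pi}\log g\Bigr) \\
\times\ & S\Bigl(-it - ix + \omega_1 + \omega_2 - \tfrac{i\omega_1\omega_2}{2\pi}\log g\Bigr)\, e^{\frac{2\pi i\gamma t}{\omega_1\omega_2}}\, dt,
\end{aligned}
\tag{2.25}
\]
where the contour $C_0$ belongs asymptotically to the sectors
\[
\begin{aligned}
\arg\omega_1 + \tfrac{\pi}{2} &< \arg t < \tfrac12(\arg\omega_1 + \arg\omega_2) + \pi, \\
\tfrac12(\arg\omega_1 + \arg\omega_2) &< \arg t < \arg\omega_2 + \tfrac{\pi}{2}
\end{aligned}
\tag{2.26}
\]
and lies below the poles of the integrand. Thus, the functions (2.21), (2.23), and (2.25) are the eigenfunctions of the corresponding spectral problems,
\[ \Bigl(1 + q^{-1} g^{2\omega_1} e^{\frac{2\pi x}{\omega_2}}\Bigr) w_\gamma^{(-)}(x - i\omega_1) + w_\gamma^{(-)}(x + i\omega_1) = \Bigl(e^{\frac{\pi\gamma}{\omega_2}} + e^{-\frac{\pi\gamma}{\omega_2}}\Bigr) w_\gamma^{(-)}(x), \tag{2.27} \]
\[ w_\gamma^{(+)}(x - i\omega_1) + \Bigl(1 + q\, g^{2\omega_1} e^{\frac{2\pi x}{\omega_2}}\Bigr) w_\gamma^{(+)}(x + i\omega_1) = \Bigl(e^{\frac{\pi\gamma}{\omega_2}} + e^{-\frac{\pi\gamma}{\omega_2}}\Bigr) w_\gamma^{(+)}(x), \tag{2.28} \]
\[ w_\gamma^{(0)}(x - i\omega_1) + w_\gamma^{(0)}(x + i\omega_1) + g^{2\omega_1} e^{\frac{2\pi x}{\omega_2}}\, w_\gamma^{(0)}(x) = \Bigl(e^{\frac{\pi\gamma}{\omega_2}} + e^{-\frac{\pi\gamma}{\omega_2}}\Bigr) w_\gamma^{(0)}(x). \tag{2.29} \]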
Besides, these solutions are the eigenfunctions for the dual spectral problems where $\omega_1 \leftrightarrow \omega_2$.

The solutions $w_\gamma^{(\pm)}$ described above appear to be close to the q-Macdonald functions of the first and second kind which arise in the context of the relativistic Toda chain [25]. However, the deformations of the Macdonald function have been investigated in the framework of the standard q-analysis [26] for the typical region |q| < 1 (which evidently fails in the case |q| = 1) and without any reference to the dual symmetry.

Formulae (2.19) will be referred to as the Gauss–Euler representation for Whittaker functions. The integral representations for $w_\gamma^{(\epsilon)}(x)$ ($\epsilon = 0, \pm 1$) are degenerations of a more general q-hypergeometric function [27]. We shall see later that the technique of QISM yields a different integral representation for Whittaker functions which is a q-deformation of the Mellin–Barnes integrals.
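2.3. Analytic properties. Let us give a summary of the analytic properties of the Whittaker functions which may be derived directly from the Gauss–Euler representation.

Lemma 2.1. $w_\gamma^{(\pm)}$ and $w_\gamma^{(0)}$ can be extended to entire functions in $\gamma \in \mathbb{C}$. As a function of $x \in \mathbb{C}$, $w_\gamma^{(-)}(x)$ has poles at
\[ x = -\tfrac{\omega_1\omega_2}{\pi}\log g - i(k_1 + \tfrac12)\omega_1 - i(k_2 + \tfrac12)\omega_2, \qquad k_1, k_2 \ge 0. \tag{2.30} \]
Similarly, the function $w_\gamma^{(+)}(x)$ has poles at
\[ x = -\tfrac{\omega_1\omega_2}{\pi}\log g + i(k_1 + \tfrac12)\omega_1 + i(k_2 + \tfrac12)\omega_2, \qquad k_1, k_2 \ge 0. \tag{2.31} \]
The function $w_\gamma^{(0)}(x)$ is entire in $x \in \mathbb{C}$.

Lemma 2.2.
\[ w_\gamma^{(+)}(x) = S\Bigl(-ix + \tfrac12(\omega_1 + \omega_2) - \tfrac{i\omega_1\omega_2}{\pi}\log g\Bigr)\, w_\gamma^{(-)}(x). \tag{2.32} \]

Lemma 2.3. For any $\gamma \in \mathbb{C}$ such that
\[ \arg\gamma \notin \bigl[\arg\omega_2 - \tfrac{\pi}{2},\ \arg\omega_1 - \tfrac{\pi}{2}\bigr] \cup \bigl[\arg\omega_2 + \tfrac{\pi}{2},\ \arg\omega_1 + \tfrac{\pi}{2}\bigr], \tag{2.33} \]
the following asymptotics holds as x tends to infinity in the sector
\[ \arg\omega_1 + \tfrac{\pi}{2} < \arg x < \arg\omega_2 + \tfrac{3\pi}{2}: \tag{2.34} \]
\[ w_\gamma^{(\epsilon)}(x) = c(\gamma)\, e^{\frac{\pi i\gamma x}{\omega_1\omega_2}}\bigl(1 + o(1)\bigr) + c(-\gamma)\, e^{-\frac{\pi i\gamma x}{\omega_1\omega_2}}\bigl(1 + o(1)\bigr), \tag{2.35} \]
($\epsilon = 0, \pm$), where the function c(γ) is defined by (2.8). We shall call c(γ) the quantum Harish-Chandra function associated with $U_q(sl(2, \mathbb{R}))$.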
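2.4. Mellin–Barnes representation. To make a comparison with the formulae provided by the Quantum Inverse Scattering Method we need a different integral representation of the Whittaker functions. Put
\[ \psi_\gamma^{(\epsilon)}(x) = e^{\frac{\pi i\gamma x}{\omega_1\omega_2}} \int_{C_\epsilon} c(\zeta)\, c(\zeta + \gamma)\, e^{-\frac{\pi i\epsilon}{\omega_1\omega_2}[\zeta^2 + \gamma\zeta]}\, e^{\frac{2\pi i\zeta x}{\omega_1\omega_2}}\, d\zeta, \tag{2.36} \]
where the contour $C_\epsilon$ is above the poles of the integrand and belongs in the left (right) half-plane in $\zeta \in \mathbb{C}$ to the sectors where the exponential $e^{-\frac{\pi(\epsilon - 1)\zeta^2}{\omega_1\omega_2}}$ $\bigl(e^{-\frac{\pi(\epsilon + 1)\zeta^2}{\omega_1\omega_2}}\bigr)$ quadratically vanishes. The integral (2.36) is absolutely convergent for any $x \in \mathbb{C}$ provided $\epsilon \neq \pm 1$. In the degenerate case $\epsilon = -1$ the integral is convergent provided that
\[ \arg x \notin \bigl[\arg\omega_2 - \tfrac{\pi}{2},\ \arg\omega_1 - \tfrac{\pi}{2}\bigr], \tag{2.37} \]
while for $\epsilon = 1$ it is defined in the region
\[ \arg x \notin \bigl[\arg\omega_2 + \tfrac{\pi}{2},\ \arg\omega_1 + \tfrac{\pi}{2}\bigr]. \tag{2.38} \]
Using the properties of the double sine it can be directly verified that the function $\psi_\gamma^{(\epsilon)}(x)$ satisfies Eqs. (2.10) where $\alpha_1 - \alpha_2 = \epsilon$.

Proposition 2.3. For $\epsilon = 0, \pm 1$,
\[ w_\gamma^{(\epsilon)}(x) = \psi_\gamma^{(\epsilon)}(x). \tag{2.39} \]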
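The expression (2.36) will be referred to as the (q-deformed) Mellin–Barnes representation for Whittaker functions. It will be shown below that this is the representation which can be easily generalized to the N-particle q-deformed Toda chain.

2.5. Limit to SL(2, R) Toda chain. Let $\omega_k > 0$ (k = 1, 2). Suppose that the "coupling constant" g(ω) has the asymptotics
\[ g^{\omega_1}(\omega) = \frac{2\pi}{\omega_2}\bigl[1 + O(\omega_2^{-1})\bigr] \qquad (\omega_2 \to \infty). \tag{2.40} \]
For example, the simplest (and standard) choice $g^{\omega_1}(\omega) = \frac{q - q^{-1}}{i\omega_1}$ satisfies this condition, since $q - q^{-1} = 2i\sin(\pi\omega_1/\omega_2)$. After the rescaling $x \to \frac{\omega_2}{\pi}x$, Eqs. (2.27), (2.28), and (2.29) take the form
\[ \Bigl[\bigl(1 + q^{-1} g^{2\omega_1} e^{2x}\bigr) e^{-\frac{\pi i\omega_1}{\omega_2}\partial_x} + e^{\frac{\pi i\omega_1}{\omega_2}\partial_x}\Bigr] w_\gamma^{(-)}(x) = \Bigl(e^{\frac{\pi\gamma}{\omega_2}} + e^{-\frac{\pi\gamma}{\omega_2}}\Bigr) w_\gamma^{(-)}(x), \tag{2.41a} \]
\[ \Bigl[e^{-\frac{\pi i\omega_1}{\omega_2}\partial_x} + \bigl(1 + q\, g^{2\omega_1} e^{2x}\bigr) e^{\frac{\pi i\omega_1}{\omega_2}\partial_x}\Bigr] w_\gamma^{(+)}(x) = \Bigl(e^{\frac{\pi\gamma}{\omega_2}} + e^{-\frac{\pi\gamma}{\omega_2}}\Bigr) w_\gamma^{(+)}(x), \tag{2.41b} \]
\[ \Bigl[e^{\frac{\pi i\omega_1}{\omega_2}\partial_x} + e^{-\frac{\pi i\omega_1}{\omega_2}\partial_x} + g^{2\omega_1} e^{2x}\Bigr] w_\gamma^{(0)}(x) = \Bigl(e^{\frac{\pi\gamma}{\omega_2}} + e^{-\frac{\pi\gamma}{\omega_2}}\Bigr) w_\gamma^{(0)}(x). \tag{2.41c} \]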
In the limit (2.40) the Eqs. (2.41) reduce to the SL(2, R) Toda equation
\[ \bigl(p^2 + 4e^{2x}\bigr)\, w_\gamma(x) = \gamma^2\, w_\gamma(x), \tag{2.42} \]
where $p = -i\omega_1\partial_x$ and $\omega_1$ plays the role of the Planck constant. Note that the more general equation (2.10a) has the same limit (2.42). The solution to (2.42) with appropriate asymptotic behavior is written in terms of the Macdonald function
\[ w_\gamma(x) = K_{\frac{\gamma}{i\omega_1}}\Bigl(\frac{2}{\omega_1}\, e^x\Bigr). \tag{2.43} \]
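The claim that (2.43) solves (2.42) can be checked numerically. The sketch below is an illustration only: it assumes real γ and ω1 > 0, evaluates the Macdonald function of imaginary order through the standard integral representation $K_{i\mu}(z) = \int_0^\infty e^{-z\cosh t}\cos(\mu t)\,dt$, and approximates $\partial_x^2$ by a central difference. The function names `macdonald_imag_order` and `toda_residual` are hypothetical helpers introduced here.

```python
import math


def macdonald_imag_order(mu, z, upper=12.0, steps=6000):
    """K_{i*mu}(z) for real mu and z > 0, by the trapezoidal rule applied to
    K_{i mu}(z) = int_0^infty exp(-z cosh t) cos(mu t) dt.
    The integrand decays double-exponentially, so a finite cutoff suffices."""
    h = upper / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-z * math.cosh(t)) * math.cos(mu * t)
    return total * h


def toda_residual(gamma, omega1, x, dx=1e-3):
    """Residual of (p^2 + 4 e^{2x}) w - gamma^2 w with p = -i*omega1*d/dx,
    for w(x) = K_{gamma/(i omega1)}(2 e^x / omega1).
    K_{-i mu} = K_{i mu}, so the order enters only through mu = gamma/omega1."""
    def w(y):
        return macdonald_imag_order(gamma / omega1, 2.0 * math.exp(y) / omega1)

    second = (w(x + dx) - 2.0 * w(x) + w(x - dx)) / dx ** 2
    lhs = -omega1 ** 2 * second + 4.0 * math.exp(2.0 * x) * w(x)
    return lhs - gamma ** 2 * w(x)


if __name__ == "__main__":
    for gamma, omega1, x in [(1.0, 1.0, 0.0), (0.7, 1.3, -0.5)]:
        value = macdonald_imag_order(gamma / omega1, 2.0 * math.exp(x) / omega1)
        print(gamma, omega1, x, value, toda_residual(gamma, omega1, x))
```

The residual is limited by the finite-difference step, so it should be compared with the size of $w_\gamma(x)$ itself rather than with machine precision.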
Lemma 2.4.
\[ \lim_{\omega_2 \to \infty} \psi_\gamma^{(\epsilon)}\Bigl(\frac{\omega_2}{\pi}x\Bigr) = \frac{1}{\pi\omega_1}\, K_{\frac{\gamma}{i\omega_1}}\Bigl(\frac{2}{\omega_1}\, e^x\Bigr). \tag{2.44} \]

Proof. Using the formula
\[ \lim_{\omega_2 \to \infty} \sqrt{2\pi}\, \Bigl(\frac{2\pi\omega_1}{\omega_2}\Bigr)^{\frac12 - \frac{z}{\omega_1}} S_2^{-1}(z) = \Gamma\Bigl(\frac{z}{\omega_1}\Bigr), \tag{2.45} \]
proved in [28], one easily finds that the quantum Harish-Chandra function (2.8) reduces to the usual Γ-function:
\[ \lim_{\omega_2 \to \infty} c(\zeta) = \frac{1}{2\pi\omega_1}\, \omega_1^{\frac{\zeta}{i\omega_1}}\, \Gamma\Bigl(\frac{\zeta}{i\omega_1}\Bigr), \tag{2.46} \]
provided that the asymptotics (2.40) holds. (This function is closely related to the standard Harish-Chandra function for the Toda chain [6]; the difference with the usual definition is due to a different normalization of solutions.) Hence, in the limit $\omega_2 \to \infty$ the rescaled function (2.36) takes the form
\[ \lim_{\omega_2 \to \infty} \psi_\gamma^{(\epsilon)}\Bigl(\frac{\omega_2}{\pi}x\Bigr) = \frac{1}{\pi\omega_1} \Bigl[\frac{1}{4\pi\omega_1} \Bigl(\frac{e^x}{\omega_1}\Bigr)^{\frac{i\gamma}{\omega_1}} \int \Bigl(\frac{e^x}{\omega_1}\Bigr)^{\frac{2i\zeta}{\omega_1}} \Gamma\Bigl(\frac{\zeta}{i\omega_1}\Bigr) \Gamma\Bigl(\frac{\zeta + \gamma}{i\omega_1}\Bigr)\, d\zeta\Bigr], \tag{2.47} \]
where the contour is parallel to the real axis and passes above the poles of the integrand. The expression in brackets is exactly the Macdonald function (2.43) in the Mellin–Barnes representation. □

3. N-Particle q-Toda Chain and Duality

The extension of the formalism described above to the case of the N-particle Toda chain may be performed directly with the help of the "free field representation" for $U_q(sl(N, \mathbb{R}))$, i.e., the homomorphism of $U_q(sl(N, \mathbb{R}))$ into an appropriate multidimensional quantum torus. Instead, we shall describe a different approach based on the "lattice Lax representation with spectral parameter". As usual, the Lax representation allows one to construct quantum Hamiltonians for a bunch of related systems: the periodic Toda chain, the open Toda chain, as well as different degenerate systems obtained by removing some of the potential terms from the Hamiltonians. Of course, the choice of the model
in question depends on our choice of the quantum R-matrix. The obvious choice is the standard trigonometric 4 × 4 R-matrix; to get more freedom in the choice of the model we may use twisted trigonometric R-matrices. In all cases, there is a natural homomorphism of the corresponding quantum algebra into the tensor product of noncommutative tori; this allows one to introduce the corresponding dual system realized by means of the natural representation of the product of modular dual quantum tori in the same Hilbert space. The entire picture of modular duality is thus fully generalized to the N-particle case. We would like to point out that in the R-matrix formalism it is more convenient to work with $U_q(gl(N, \mathbb{R}))$ and reduce the final formulae to the case of $U_q(sl(N, \mathbb{R}))$ in the standard way. The main advantage provided by the use of the lattice Lax representation is the possibility to get inductive integral representations for the wave functions in question and to generalize the above construction to the periodic case, as was done in [29].

3.1. The models. The q-Toda chain, or relativistic Toda chain (RTC), was introduced by Ruijsenaars [18]. The periodic chain can be described by the Hamiltonian
\[ H_1(x_1, p_1; \ldots; x_N, p_N) = \sum_{n=1}^{N} \Bigl(1 + q^{-1} g^{2\omega_1} e^{\frac{2\pi}{\omega_2}(x_n - x_{n+1})}\Bigr) e^{\omega_1 p_n}, \tag{3.1} \]
where $x_n, p_n$ are the canonical coordinates and momenta with standard commutation relations $[x_n, p_m] = i\delta_{nm}$ and the boundary condition $x_{N+1} = x_1$ is imposed. The system has exactly N mutually commuting Hamiltonians (the polynomial functions of the Weyl variables $u_n^{\pm1} = e^{\pm\frac{2\pi x_n}{\omega_2}}$, $v_n = e^{\omega_1 p_n}$)7.

Guided by the notion of the modular double considered above, one can define the dual system which is determined by the Hamiltonian
\[ \tilde H_1(x_1, p_1; \ldots; x_N, p_N) = \sum_{n=1}^{N} \Bigl(1 + \tilde q^{-1} g^{2\omega_2} e^{\frac{2\pi}{\omega_1}(x_n - x_{n+1})}\Bigr) e^{\omega_2 p_n} \tag{3.2} \]
with the same boundary condition. It is evident that the systems mutually commute. Analogously, the open relativistic Toda chain and its dual system are defined by the Hamiltonians
\[ h_1(x_1, p_1; \ldots; x_N, p_N) = \sum_{n=1}^{N} \Bigl(1 + q^{-1} g^{2\omega_1} e^{\frac{2\pi}{\omega_2}(x_n - x_{n+1})}\Bigr) e^{\omega_1 p_n} \tag{3.3} \]
and
\[ \tilde h_1(x_1, p_1; \ldots; x_N, p_N) = \sum_{n=1}^{N} \Bigl(1 + \tilde q^{-1} g^{2\omega_2} e^{\frac{2\pi}{\omega_1}(x_n - x_{n+1})}\Bigr) e^{\omega_2 p_n} \tag{3.4} \]
respectively, with the boundary condition $x_{N+1} \equiv \infty$. Similarly to the periodic case, each open system possesses exactly N mutually commuting Hamiltonians. Moreover, the Hamiltonians of the dual system commute with those of the original one.

The basic goal of the present section is to construct the explicit integral representation of the common eigenfunctions for all Hamiltonians in the case of the open N-particle RTC. This will be done in the framework of the QISM approach for the periodic RTC.

7 Higher Hamiltonians will be described below using the standard Lax formalism.
3.2. Twisted trigonometric R-matrix. In order to investigate the relativistic Toda chain using the quantum version of the corresponding classical Lax matrix [30], one needs to introduce the notion of the twisted R-matrix [31]. Let
\[ R(z/w) = \frac{1}{z^2 - w^2} \begin{pmatrix} qz^2 - q^{-1}w^2 & 0 & 0 & 0 \\ 0 & z^2 - w^2 & (q - q^{-1})zw & 0 \\ 0 & (q - q^{-1})zw & z^2 - w^2 & 0 \\ 0 & 0 & 0 & qz^2 - q^{-1}w^2 \end{pmatrix} \tag{3.5} \]
be the R-matrix in the principal gradation satisfying the standard Yang–Baxter equation. Consider the twisting of the R-matrix (3.5):
\[ R_\theta(z/w) = F_{21}(\theta)\, R(z/w)\, F_{12}^{-1}(\theta) \tag{3.6} \]
with
\[ F_{12}(\theta) \equiv F_{21}^{-1}(\theta) = \exp\Bigl\{\frac{\theta}{4}\bigl(1 \otimes \sigma_3 - \sigma_3 \otimes 1\bigr)\Bigr\}, \tag{3.7} \]
where $\sigma_3$ is the Pauli matrix. One gets
\[ R_\theta(z/w) = \frac{1}{z^2 - w^2} \begin{pmatrix} a(z, w) & 0 & 0 & 0 \\ 0 & b(z, w) & c(z, w) & 0 \\ 0 & c(z, w) & \bar b(z, w) & 0 \\ 0 & 0 & 0 & a(z, w) \end{pmatrix}, \tag{3.8} \]
where
\[ a(z, w) = qz^2 - q^{-1}w^2, \quad b(z, w) = e^{\theta}(z^2 - w^2), \quad \bar b(z, w) = e^{-\theta}(z^2 - w^2), \quad c(z, w) = (q - q^{-1})zw. \tag{3.9} \]
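The Yang–Baxter property of the twisted matrix (3.8), (3.9) can be checked numerically at sample points. The sketch below is an illustration under stated assumptions: the basis of each two-dimensional space is ordered so that the 4 × 4 matrix acts on (++, +−, −+, −−), the equation is taken in the form $R_{12}(z_1/z_2)R_{13}(z_1/z_3)R_{23}(z_2/z_3) = R_{23}(z_2/z_3)R_{13}(z_1/z_3)R_{12}(z_1/z_2)$, and the helper names `twisted_r`, `embed`, `matmul`, `ybe_defect` are hypothetical.

```python
import cmath


def twisted_r(z, w, q, theta):
    """4x4 twisted R-matrix (3.8)-(3.9) in the basis (++, +-, -+, --)."""
    n = z ** 2 - w ** 2
    a = q * z ** 2 - w ** 2 / q
    b = cmath.exp(theta) * n
    b_bar = cmath.exp(-theta) * n
    c = (q - 1 / q) * z * w
    return [[a / n, 0, 0, 0],
            [0, b / n, c / n, 0],
            [0, c / n, b_bar / n, 0],
            [0, 0, 0, a / n]]


def embed(r, i, j):
    """Embed a 4x4 matrix acting on tensor factors i < j (0-based) of
    C^2 x C^2 x C^2 as an 8x8 matrix; the third factor is untouched."""
    k = 3 - i - j  # index of the spectator factor
    m = [[0j] * 8 for _ in range(8)]
    for row in range(8):
        rb = [(row >> (2 - p)) & 1 for p in range(3)]
        for col in range(8):
            cb = [(col >> (2 - p)) & 1 for p in range(3)]
            if rb[k] == cb[k]:
                m[row][col] = r[2 * rb[i] + rb[j]][2 * cb[i] + cb[j]]
    return m


def matmul(x, y):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[r][k] * y[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def ybe_defect(z1, z2, z3, q, theta):
    """Largest entry of R12 R13 R23 - R23 R13 R12 in absolute value."""
    r12 = embed(twisted_r(z1, z2, q, theta), 0, 1)
    r13 = embed(twisted_r(z1, z3, q, theta), 0, 2)
    r23 = embed(twisted_r(z2, z3, q, theta), 1, 2)
    lhs = matmul(matmul(r12, r13), r23)
    rhs = matmul(matmul(r23, r13), r12)
    return max(abs(lhs[r][c] - rhs[r][c]) for r in range(8) for c in range(8))


if __name__ == "__main__":
    q = cmath.exp(0.4j)
    print(ybe_defect(1.3, 0.7, 2.1, q, 0.0))       # untwisted case
    print(ybe_defect(1.3, 0.7, 2.1, q, 0.4j))      # e^theta = q
    print(ybe_defect(1.3, 0.7, 2.1, q, 0.3 + 0.2j))
```

A numerical check at a few points is of course not a proof; it only guards against sign or ordering slips in the entries of (3.8).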
It is easy to verify that $R_\theta(z/w)$ satisfies the same Yang–Baxter equation as $R(z/w)$. A quantum Lax operator L(z) is, by definition, a 2 × 2 matrix
\[ L(z) = \begin{pmatrix} L_{11}(z) & L_{12}(z) \\ L_{21}(z) & L_{22}(z) \end{pmatrix} \tag{3.10} \]
with operator-valued entries which satisfies the fundamental commutation relations
\[ R_\theta(z/w)\, L(z) \otimes L(w) = (1 \otimes L(w))(L(z) \otimes 1)\, R_\theta(z/w). \tag{3.11} \]
We define the quantum determinant of the matrix (3.10) by the formula
\[ \det\nolimits_q L(z) = L_{11}(zq^{1/2})\, L_{22}(zq^{-1/2}) - e^{\theta} L_{12}(zq^{1/2})\, L_{21}(zq^{-1/2}). \tag{3.12} \]
3.3. Lax operator and monodromy matrix. As usual in the Quantum Inverse Scattering Method, the entries of the quantum Lax operator generate the basic Hopf algebra $A_R$ (defined implicitly by the fundamental commutation relation (3.11)) which underlies all the associated quantum integrable systems; to get a particular system, we need to fix its representation. The representation which yields the q-deformed Toda chain is provided by the following construction. Let $\omega_1, \omega_2 \in \mathbb{C}$. We consider a lattice system with local quantum Lax operators
\[ L_n(z) = \begin{pmatrix} z - z^{-1} e^{\omega_1 p_n} & g^{\omega_1} e^{-\frac{2\pi x_n}{\omega_2}} \\ -g^{\omega_1} e^{\frac{2\pi x_n}{\omega_2} + \omega_1 p_n} & 0 \end{pmatrix}, \tag{3.13} \]
where $x_n, p_n$ are the canonical coordinates and momenta with the commutation relations $[x_n, p_m] = i\delta_{nm}$ and g is a real parameter (possibly depending on ω). On the classical level the Lax matrices (3.13) have been introduced in [30].

Proposition 3.1. The Lax operator (3.13) satisfies the commutation relations (3.11) with the quantum R-matrix (3.8), (3.9), where
\[ q = e^{i\pi\frac{\omega_1}{\omega_2}}, \tag{3.14} \]
and
\[ e^{\theta} = q. \tag{3.15} \]
The monodromy matrix for the N-periodic chain is defined in the standard way:
\[ T_N(z) = L_N(z) \cdots L_1(z) \equiv \begin{pmatrix} A_N(z) & B_N(z) \\ C_N(z) & D_N(z) \end{pmatrix}. \tag{3.16} \]
By the usual Hopf algebra properties, the entries of T(z) satisfy the same commutation relations as the corresponding entries of the Lax operators. The quantum determinant of the Lax operator (3.13) is
\[ \det\nolimits_q L_n(z) = g^{2\omega_1} e^{\omega_1 p_n}. \tag{3.17} \]
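Formula (3.17) follows from (3.12)–(3.15) in one line once the exponential in the lower-left entry of (3.13) is disentangled. The sketch below uses the Baker–Campbell–Hausdorff identity $e^{A+B} = e^{A}e^{B}e^{-[A,B]/2}$ for a central commutator, with $[x_n, p_n] = i$; the shorthand $A$, $B$ is introduced only here.

```latex
\begin{align*}
A &= \frac{2\pi x_n}{\omega_2},\quad B=\omega_1 p_n,\quad
 [A,B]=\frac{2\pi i\,\omega_1}{\omega_2}
 \;\Longrightarrow\;
 e^{A+B}=e^{A}e^{B}e^{-i\pi\omega_1/\omega_2}=q^{-1}\,e^{\frac{2\pi x_n}{\omega_2}}\,e^{\omega_1 p_n},\\[4pt]
\det\nolimits_q L_n(z)
 &= L_{11}(zq^{1/2})\cdot 0 \;-\; e^{\theta}\,
    g^{\omega_1}e^{-\frac{2\pi x_n}{\omega_2}}\cdot
    \Bigl(-g^{\omega_1}e^{\frac{2\pi x_n}{\omega_2}+\omega_1 p_n}\Bigr)\\
 &= q\,g^{2\omega_1}\,e^{-\frac{2\pi x_n}{\omega_2}}\;q^{-1}\,e^{\frac{2\pi x_n}{\omega_2}}\,e^{\omega_1 p_n}
 \;=\; g^{2\omega_1}\,e^{\omega_1 p_n}.
\end{align*}
```

The same disentangling gives the relation $C_N(z) = -q^{-1} g^{\omega_1} e^{2\pi x_N/\omega_2} e^{\omega_1 p_N} A_{N-1}(z)$ used later in the text, since $(L_N)_{22} = 0$.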
It is simple to show that the quantum determinant of the monodromy matrix
\[ \det\nolimits_q T_N(z) = A_N(zq^{1/2})\, D_N(zq^{-1/2}) - q\, B_N(zq^{1/2})\, C_N(zq^{-1/2}) \tag{3.18} \]
obeys the property
\[ \det\nolimits_q T_N(z) = \det\nolimits_q L_N(z) \cdot \ldots \cdot \det\nolimits_q L_1(z). \tag{3.19} \]
Hence, due to (3.17),
\[ \det\nolimits_q T_N(z) = g^{2N\omega_1} \prod_{n=1}^{N} e^{\omega_1 p_n}. \tag{3.20} \]
Note that in the twisted case the quantum determinant is no longer a central element in the quantum algebra.
Following the same line of argument as in Sect. 1, we may introduce the modular dual system by
\[ \tilde L_n(z) = \begin{pmatrix} z - z^{-1} e^{\omega_2 p_n} & g^{\omega_2} e^{-\frac{2\pi x_n}{\omega_1}} \\ -g^{\omega_2} e^{\frac{2\pi x_n}{\omega_1} + \omega_2 p_n} & 0 \end{pmatrix}. \tag{3.21} \]
The operator (3.21) satisfies the commutation relation (3.11) with the twisted R-matrix (3.8), (3.9) with the only change $q \to \tilde q$, $\theta \to \tilde\theta$, where $\tilde q = e^{\tilde\theta} = e^{i\pi\frac{\omega_2}{\omega_1}}$. The dual monodromy matrix is defined by
\[ \tilde T_N(z) = \tilde L_N(z) \cdots \tilde L_1(z) \equiv \begin{pmatrix} \tilde A_N(z) & \tilde B_N(z) \\ \tilde C_N(z) & \tilde D_N(z) \end{pmatrix}. \tag{3.22} \]
The system described by the Lax operators (3.13), (3.21) may be referred to as the modular relativistic Toda chain.

3.4. Hamiltonians. As usual, the transfer matrix
\[ t_N(z) = A_N(z) + D_N(z) \tag{3.23} \]
satisfies the commutation relations
\[ [t_N(z), t_N(w)] = 0. \tag{3.24} \]
The same is true for the dual transfer matrix
\[ \tilde t_N(z) = \tilde A_N(z) + \tilde D_N(z); \tag{3.25} \]
moreover, the modular duality implies that
\[ [t_N(z), \tilde t_N(w)] = 0. \tag{3.26} \]
Clearly, $t_N(z)$ has the following structure:
\[ t_N(z) = \sum_{k=0}^{N} (-1)^k z^{N-2k} H_k(x_1, p_1; \ldots; x_N, p_N), \tag{3.27} \]
where $H_0 = 1$ and
\[ H_N(p_1, \ldots, p_N) = \exp\Bigl\{\omega_1 \sum_{n=1}^{N} p_n\Bigr\}, \tag{3.28} \]
\[ H_1(x_1, p_1; \ldots; x_N, p_N) = \sum_{n=1}^{N} \Bigl(1 + q^{-1} g^{2\omega_1} e^{\frac{2\pi}{\omega_2}(x_n - x_{n+1})}\Bigr) e^{\omega_1 p_n}, \tag{3.29} \]
\[ H_{N-1}(x_1, p_1; \ldots; x_N, p_N) = H_N \sum_{n=1}^{N} \Bigl(1 + q^{-1} g^{2\omega_1} e^{\frac{2\pi}{\omega_2}(x_{n-1} - x_n)}\Bigr) e^{-\omega_1 p_n}, \tag{3.30} \]
where in (3.29), (3.30) the periodicity is assumed: $x_{N+1} = x_1$. Hence, due to (3.24) the periodic RTC has exactly N commuting operators. The following statement is true: the operator $A_N(z)$ is the generating function for the Hamiltonians of the N-particle open RTC:
\[ A_N(z) = \sum_{k=0}^{N} (-1)^k z^{N-2k} h_k(x_1, p_1; \ldots; x_N, p_N), \tag{3.31} \]
where $h_0 = 1$ and
\[ h_N(p_1, \ldots, p_N) = \exp\Bigl\{\omega_1 \sum_{n=1}^{N} p_n\Bigr\}, \tag{3.32} \]
\[ h_1(x_1, p_1; \ldots; x_N, p_N) = \sum_{n=1}^{N} \Bigl(1 + q^{-1} g^{2\omega_1} e^{\frac{2\pi}{\omega_2}(x_n - x_{n+1})}\Bigr) e^{\omega_1 p_n}, \tag{3.33} \]
\[ h_{N-1}(x_1, p_1; \ldots; x_N, p_N) = h_N \sum_{n=1}^{N} \Bigl(1 + q^{-1} g^{2\omega_1} e^{\frac{2\pi}{\omega_2}(x_{n-1} - x_n)}\Bigr) e^{-\omega_1 p_n} \tag{3.34} \]
assuming $x_{N+1} \equiv \infty$ in (3.33) and $x_0 \equiv -\infty$ in (3.34). The second set of the Hamiltonians $\tilde H_1, \ldots, \tilde H_N$ and $\tilde h_1, \ldots, \tilde h_N$ is obtained from the former one by the flip $\omega_1 \leftrightarrow \omega_2$.

Lemma 3.1. 1. Suppose that $\omega_1, \omega_2$ are real; then all coefficients of $t_N(z)$, $\tilde t_N(z)$ and $A_N(z)$, $\tilde A_N(z)$ are formally self-adjoint in $L_2(\mathbb{R}^N)$. 2. Suppose that $\mathrm{Im}\,\omega_1 \neq 0$ and $\bar\omega_1 = \omega_2$; then all coefficients of $t_N(z)$, $\tilde t_N(z)$ and $A_N(z)$, $\tilde A_N(z)$ are normal operators and their "real" and "imaginary" parts ($X + X^*$, $i(X - X^*)$) are formally self-adjoint.

3.5. Integral representation for the wave functions: Inductive procedure. Our goal is to get an inductive integral representation for the wave functions of the multiparticle open relativistic Toda chain. The approach described below is an analytic version of the algebraic method of separation of variables invented by Sklyanin [32]. Set $\gamma = (\gamma_1, \ldots, \gamma_{N-1}) \in \mathbb{R}^{N-1}$, $x = (x_1, \ldots, x_{N-1}) \in \mathbb{R}^{N-1}$. Let $\psi_\gamma(x)$ be the common wave function for the dual open RTC systems with N − 1 particles:
\[ A_{N-1}(z)\, \psi_\gamma(x) = \prod_{m=1}^{N-1} \Bigl(z - z^{-1} e^{\frac{2\pi\gamma_m}{\omega_2}}\Bigr) \psi_\gamma(x), \tag{3.35} \]
\[ \tilde A_{N-1}(z)\, \psi_\gamma(x) = \prod_{m=1}^{N-1} \Bigl(z - z^{-1} e^{\frac{2\pi\gamma_m}{\omega_1}}\Bigr) \psi_\gamma(x). \tag{3.36} \]
The key point of the inductive procedure (described for the first time in [9] for the ordinary Toda chain) is to compute the action on $\psi_\gamma(x)$ of the N-particle Hamiltonians.
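It turns out that such an action "preserves" the form of the wave function8. This computation, which starts with the case N = 2, is based on the ordinary RTT commutation relations for the quantum monodromy matrix; the inductive formula given below is based on a self-consistent choice of the normalization for the wave functions.

Proposition 3.2. There exists a unique solution $\psi_{\gamma_1, \ldots, \gamma_{N-1}}(x_1, \ldots, x_{N-1})$ to the common spectral problem (3.35), (3.36) such that for any N ≥ 2 the eigenfunction $\psi_\gamma$ is an entire function in $\gamma \in \mathbb{C}^{N-1}$ satisfying the relations
\[ A_N\bigl(e^{\frac{\pi\gamma_j}{\omega_2}}\bigr)\, \psi_\gamma(x) = q^{-1} (-ig^{\omega_1})^N \exp\Bigl\{\frac{2\pi}{\omega_2} \sum_{m=1}^{N-1} \gamma_m\Bigr\}\, e^{-\frac{2\pi x_N}{\omega_2}}\, \psi_{\gamma - i\omega_1 e_j}(x), \tag{3.37} \]
\[ \tilde A_N\bigl(e^{\frac{\pi\gamma_j}{\omega_1}}\bigr)\, \psi_\gamma(x) = \tilde q^{-1} (-ig^{\omega_2})^N \exp\Bigl\{\frac{2\pi}{\omega_1} \sum_{m=1}^{N-1} \gamma_m\Bigr\}\, e^{-\frac{2\pi x_N}{\omega_1}}\, \psi_{\gamma - i\omega_2 e_j}(x), \tag{3.38} \]
where $\{e_j\}$ is the standard basis of $\mathbb{R}^{N-1}$.

As a comment to the proposition above we remark that $C_N(z) = -q^{-1} g^{\omega_1} e^{\frac{2\pi x_N}{\omega_2}} e^{\omega_1 p_N} A_{N-1}(z)$. Hence, the compatibility of (3.35) and (3.37) follows from the quadratic RTT relations; the argument for the dual system is completely similar.

Assuming that such a function is known, the heuristic idea behind the inductive integral representation for the N-particle wave function $\psi_{\lambda_1, \ldots, \lambda_N}(x_1, \ldots, x_N)$ is to represent it as the generalized Fourier transform with respect to the (N − 1)-particle wave function $\psi_\gamma(x)$. Note that the "one-particle" solution is $\psi_{\gamma_1}(x_1) = e^{\frac{2\pi i\gamma_1 x_1}{\omega_1\omega_2}}$; it is easy to verify directly that this trivial function satisfies the conditions of Proposition 3.2, which forms the induction basis. The exact statement is given by Theorem 3.1 below; here we present the essential ideas leading to this statement. Introduce an auxiliary wave function by
\[ K_{\gamma,\varepsilon}(x_1, \ldots, x_N) \stackrel{\mathrm{def}}{=} \exp\Bigl\{\frac{\pi i}{\omega_1\omega_2}\Bigl[\Bigl(\sum_{m=1}^{N-1} \gamma_m\Bigr)^2 - \varepsilon \sum_{m=1}^{N-1} \gamma_m\Bigr]\Bigr\}\, e^{\frac{2\pi i}{\omega_1\omega_2}\bigl(\varepsilon - \sum_{m=1}^{N-1} \gamma_m\bigr) x_N}\, \psi_\gamma(x_1, \ldots, x_{N-1}). \tag{3.39} \]
Generalizing Proposition 3.2, one can prove the following result.

Proposition 3.3. The action of $A_N(z)$, $\tilde A_N(z)$ on the auxiliary wave function (3.39) is given by
\[
\begin{aligned}
A_N(z)\, K_{\gamma,\varepsilon} &= \Bigl(z - z^{-1} e^{\frac{2\pi}{\omega_2}\bigl(\varepsilon - \sum_{m=1}^{N-1} \gamma_m\bigr)}\Bigr) \prod_{j=1}^{N-1} \Bigl(z - z^{-1} e^{\frac{2\pi\gamma_j}{\omega_2}}\Bigr) K_{\gamma,\varepsilon} \\
&\quad + (-ig^{\omega_1})^N e^{\frac{\pi\varepsilon}{\omega_2}} \sum_{j=1}^{N-1} K_{\gamma - i\omega_1 e_j,\varepsilon} \prod_{s \neq j} \frac{z - e^{\frac{2\pi\gamma_s}{\omega_2}} z^{-1}}{e^{\frac{\pi\gamma_j}{\omega_2}} - e^{\frac{2\pi\gamma_s}{\omega_2}} e^{-\frac{\pi\gamma_j}{\omega_2}}},
\end{aligned}
\tag{3.40}
\]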
8 This idea goes back to M. Gutzwiller [33], who explicitly calculated such an action on the 2- and 3-particle eigenfunctions for the Toda chain.
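\[
\begin{aligned}
\tilde A_N(z)\, K_{\gamma,\varepsilon} &= \Bigl(z - z^{-1} e^{\frac{2\pi}{\omega_1}\bigl(\varepsilon - \sum_{m=1}^{N-1} \gamma_m\bigr)}\Bigr) \prod_{j=1}^{N-1} \Bigl(z - z^{-1} e^{\frac{2\pi\gamma_j}{\omega_1}}\Bigr) K_{\gamma,\varepsilon} \\
&\quad + (-ig^{\omega_2})^N e^{\frac{\pi\varepsilon}{\omega_1}} \sum_{j=1}^{N-1} K_{\gamma - i\omega_2 e_j,\varepsilon} \prod_{s \neq j} \frac{z - e^{\frac{2\pi\gamma_s}{\omega_1}} z^{-1}}{e^{\frac{\pi\gamma_j}{\omega_1}} - e^{\frac{2\pi\gamma_s}{\omega_1}} e^{-\frac{\pi\gamma_j}{\omega_1}}}.
\end{aligned}
\tag{3.41}
\]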
Let us write formally
\[ \psi_{\lambda_1, \ldots, \lambda_N}(x_1, \ldots, x_N) = \int \mu(\gamma)\, Q(\gamma|\lambda)\, K_{\gamma, \lambda_1 + \ldots + \lambda_N}\, d\gamma, \tag{3.42} \]
where µ(γ ) = def
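j maxk {Im λk }; (b) The left end of the contour escapes to infinity in the sectors
\[ \tfrac12(\arg\omega_1 + \arg\omega_2) + \pi < \arg\gamma_j < \arg\omega_2 + \tfrac{3\pi}{2}; \]
(c) The right end of the contour escapes to infinity in the sectors
\[ \arg\omega_1 - \tfrac{\pi}{2} < \arg\gamma_j < \arg\omega_2 + \tfrac{\pi}{2}. \]
Then the function (3.47) is a common eigenfunction for the N-particle open RTC. Namely, it satisfies the following properties: (i) $\psi_{\lambda_1, \ldots, \lambda_N}$ is an entire function in $\lambda \in \mathbb{C}^N$; (ii) $\psi_{\lambda_1, \ldots, \lambda_N}$ is the solution to the following set of equations:
\[ A_N(z)\, \psi_{\lambda_1, \ldots, \lambda_N} = \prod_{k=1}^{N} \Bigl(z - z^{-1} e^{\frac{2\pi\lambda_k}{\omega_2}}\Bigr) \psi_{\lambda_1, \ldots, \lambda_N}, \tag{3.48} \]
\[ A_{N+1}\bigl(e^{\frac{\pi\lambda_n}{\omega_2}}\bigr)\, \psi_{\lambda_1, \ldots, \lambda_N} = q^{-1} (-ig^{\omega_1})^{N+1} \exp\Bigl\{\frac{2\pi}{\omega_2} \sum_{k=1}^{N} \lambda_k\Bigr\}\, e^{-\frac{2\pi x_{N+1}}{\omega_2}}\, \psi_{\lambda_1, \ldots, \lambda_n - i\omega_1, \ldots, \lambda_N}, \tag{3.49} \]
\[ \tilde A_N(z)\, \psi_{\lambda_1, \ldots, \lambda_N} = \prod_{k=1}^{N} \Bigl(z - z^{-1} e^{\frac{2\pi\lambda_k}{\omega_1}}\Bigr) \psi_{\lambda_1, \ldots, \lambda_N}, \tag{3.50} \]
\[ \tilde A_{N+1}\bigl(e^{\frac{\pi\lambda_n}{\omega_1}}\bigr)\, \psi_{\lambda_1, \ldots, \lambda_N} = \tilde q^{-1} (-ig^{\omega_2})^{N+1} \exp\Bigl\{\frac{2\pi}{\omega_1} \sum_{k=1}^{N} \lambda_k\Bigr\}\, e^{-\frac{2\pi x_{N+1}}{\omega_1}}\, \psi_{\lambda_1, \ldots, \lambda_n - i\omega_2, \ldots, \lambda_N}. \tag{3.51} \]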
By inductive application of the formula (3.47), starting with the trivial one-particle wave function $\psi_{\gamma_1}(x_1) = e^{\frac{2\pi i\gamma_1 x_1}{\omega_1\omega_2}}$, we get an explicit solution for the N-particle system.
||γj k ||N j,k=1
Theorem 3.2. Let be a lower triangular N × N matrix and let the last row (γN 1 , . . . , γNN ) be identified with(λ1 , . . . , λN ). (i) The solution to (3.48)–(3.51) can be written in the form: ψλ1 ,... ,λN (x1 , . . . , xN ) N−1 n π π = 4ω1 ω2 sinh (γnj −γnk ) · sinh (γnj − γnk ) ω1 ω2 DN n=1
×
n n+1
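j,k=1 j< […]

[…] > 0 for all nontrivial ξ. On the other hand, for $L_\xi = \mathbb{C}$ (the trivial line bundle), the Laplacian has a 1-dimensional kernel, i.e. one zero eigenvalue.

As usual, we can decompose f on the eigenstates of $\Delta_\xi^{(z)}$, i.e.:
$$ f = \sum_n g_n(w)\,\varphi_n(z), \tag{11} $$
where $\{\varphi_n\}$ is an orthonormal basis for the $L^2$ norm on $\Lambda^{(0,0)}_M(L_\xi)$ of eigenstates with eigenvalues $\{\lambda_n^2\}$; so, $\|f\|^2_{L^2(T\times\mathbb{C})} = \sum_n \|g_n\|^2_{L^2(\mathbb{C})}$. Moreover:
$$ \Delta_\xi f = \sum_n \big[(\Delta^{(w)} + \lambda_n^2)\,g_n\big]\,\varphi_n. \tag{12} $$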
Proposition 3. Let $\rho \in L^2(L_\xi \otimes S^+)$ be compactly supported and suppose that ξ is nontrivial. Then there is $f \in L^2(L_\xi \otimes S^+)$ and a constant $k < \infty$ such that $\Delta_\xi f = \rho$ and $\|f\|_{L^2} \le k\|\rho\|_{L^2}$.

Proof. Given (12), solving the equation $\Delta_\xi f = \rho$ amounts to solving $(\Delta^{(w)} + \lambda_n^2) g_n = \rho_n$ for each n, where $g_n$, $\rho_n$ are the components of f, ρ along the eigenspaces of $\lambda_n^2$, respectively. Fix some integer n and denote by $F_n$ the fundamental solution of $(\Delta^{(w)} + \lambda_n^2)F_n(w) = 0$. Rescale the plane coordinate $w' = \lambda_n w$, which transforms the previous equation to $(\Delta^{(w')} + 1)F_n(w'/\lambda_n) = 0$. The unique integrable solution for this equation is the Bessel function $K_0$ (see below), so that $F_n(w) = K_0(\lambda_n |w|)$. Solutions to the non-homogeneous equations will then be given by the convolution:
$$ g_n(w) = \int_{\mathbb{R}^2} F_n(w - x)\,\rho_n(x)\,dx\,d\bar{x} \tag{13} $$
and recall that $\|g_n\|_{L^2} \le \|F_n\|_{L^1}\|\rho_n\|_{L^2}$. So, all we need is an estimate for $\|F_n\|_{L^1}$ which is independent of n. From the expression above, one sees that each $F_n$ is integrable if the Bessel function $K_0$ is: $\|F_n\|_{L^1} = \lambda_n^{-2}\|K_0\|_{L^1}$. So, let $\lambda = \min\{\lambda_n\}_{n\in\mathbb{N}}$; therefore, $\|F_n\|_{L^1} \le \lambda^{-2}\|K_0\|_{L^1}$; putting $k = \lambda^{-2}\|K_0\|_{L^1}$ we have $\|g_n\|_{L^2} \le k\|\rho_n\|_{L^2}$ for each n. This completes the proof.
646
M. Jardim
Consider the Hilbert space $L^2_2(L_\xi \otimes S^\pm)$ obtained by the completion of $\Gamma(L_\xi \otimes S^\pm)$ with respect to the norm:
$$ \|s\|_{L^2_2} = \|s\|_{L^2} + \|\Delta_\xi s\|_{L^2}. \tag{14} $$
The map $\Delta_\xi : L^2_2(L_\xi \otimes S^-) \to L^2(L_\xi \otimes S^-)$ is then bounded, for clearly $\|\Delta_\xi s\|_{L^2} \le \|s\|_{L^2_2}$. Let $G_\xi : L^2(L_\xi \otimes S^-) \to L^2_2(L_\xi \otimes S^-)$ be the inverse of $\Delta_\xi$ given by Proposition 3. Using the inequality of the proposition, one shows that $G_\xi$ is also bounded, if ξ is nontrivial:
$$ \|G_\xi s\|_{L^2_2} = \|G_\xi s\|_{L^2} + \|\Delta_\xi G_\xi s\|_{L^2} = \|G_\xi s\|_{L^2} + \|s\|_{L^2} \le k\|s\|_{L^2} + \|s\|_{L^2} \le (k+1)\cdot\|s\|_{L^2}. $$
Moreover, we also conclude that:
$$ \|G_\xi\| < 1 + \frac{C}{\lambda^2}. \tag{15} $$
Hence, $G_\xi$ is an invertible operator when acting between the above Hilbert spaces, if ξ is nontrivial.

Remark 1. We emphasise the necessity of assuming that ξ is nontrivial. If $\xi = \hat{e}$, then Eq. (10i) admits one zero eigenvalue; on the other hand, the fundamental solution of $\Delta^{(w)} g = 0$ is essentially log r, which is not integrable. It is then impossible to get the estimate of Proposition 3; in other words, the operator $\Delta_{(\xi=\hat{e})}$ fails to be invertible. In addition, the parameter k also depends on ξ, and k → ∞ (i.e. λ → 0) as ξ → 0.

Now, define the norms:
$$ \|s\|_{L^2_1} = \|s\|_{L^2} + \|D_\xi^* s\|_{L^2} \ \text{ if } s \in \Gamma(L_\xi \otimes S^-), \qquad \|s\|_{L^2_{l+1}} = \|s\|_{L^2_l} + \|D_\xi s\|_{L^2_l} \ \text{ if } s \in \Gamma(L_\xi \otimes S^+), \tag{16} $$
and consider the Dirac operators as maps between the following Hilbert spaces, obtained by the completion of $\Gamma(L_\xi \otimes S^\pm)$ with respect to the above norms:
$$ D_\xi^* : L^2_1(L_\xi \otimes S^-) \to L^2(L_\xi \otimes S^+), \qquad D_\xi : L^2_{l+1}(L_\xi \otimes S^+) \to L^2_l(L_\xi \otimes S^-). \tag{17} $$
Then $D_\xi^*$ is clearly bounded. Furthermore, it has an inverse given by $(D_\xi^*)^{-1} = D_\xi G_\xi : L^2(L_\xi \otimes S^+) \to L^2_1(L_\xi \otimes S^-)$, which is also bounded:
$$ \|(D_\xi^*)^{-1} s\|_{L^2_1} = \|(D_\xi^*)^{-1} s\|_{L^2} + \|D_\xi^*(D_\xi^*)^{-1} s\|_{L^2} = \|D_\xi G_\xi s\|_{L^2} + \|s\|_{L^2} = \|D_\xi G_\xi s\|_{L^2_1} \le \|G_\xi s\|_{L^2_2} \le (k+1)\cdot\|s\|_{L^2}. $$
So, $D_\xi^*$ is also Fredholm when acting as in (17), and our proof is complete. For further reference, we shall denote $Q^\infty_\xi = (D_\xi^*)^{-1}$; note, moreover, that this is a bounded, elliptic, pseudo-differential operator of order −1.
Nahm Transform and Spectral Curves for Doubly-Periodic Instantons
647
We are left with one point to establish: the integrability of the fundamental solution of $(\Delta + 1)F = 0$ in the plane. Indeed, first note that since the operator $\Delta + 1$ has polar symmetry, the fundamental solution F also has it. After imposing this symmetry, we obtain the following ODE, for r > 0:
$$ (\Delta + 1)F(r) = 0 \ \Rightarrow\ F'' + \frac{1}{r}F' - F = 0. $$
This is a Bessel equation with parameter ν = 0. Its solutions are linear combinations of the Bessel functions of imaginary argument $I_0$ and $K_0$ (see [1], chapter 11). Below are possible integral representations for these functions (see [8]):
$$ K_0(r) = \int_1^\infty e^{-rt}(t^2 - 1)^{-\frac12}\,dt, \qquad I_0(r) = \frac{1}{\pi}\int_{-1}^{1} \cosh(rt)\,(1 - t^2)^{-\frac12}\,dt. $$
It is easy to see that $I_0(r)$ increases exponentially with r; it is also finite for r = 0. For the purpose of finding a Green's function for the operator $\Delta + 1$, this solution can be eliminated. With the help of a table of integrals, one finds that $K_0$ is integrable; indeed, by [8]:
$$ \int_{\mathbb{R}^2} K_0(r)\,d^2\mathrm{vol} = \int_0^{2\pi}\!\!\int_0^\infty K_0(r)\,r\,dr\,d\theta = 2\pi\int_0^\infty rK_0(r)\,dr = 2\pi. $$
This means that $\|K_0\|_{L^1} = 2\pi$.

Proposition 4. The solution f of the flat Laplacian problem $\Delta_\xi f = \rho$ of Proposition 3 decays exponentially if ξ is nontrivial, in the sense that there is a real constant λ > 0 such that:
$$ \lim_{r\to\infty} e^{\lambda r}|f| < \infty. $$
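Proof. As r → ∞, the Bessel function $K_0$ admits the following asymptotic expansion ([20], p. 202):
$$ K_0(r) \sim \left(\frac{\pi}{2}\right)^{\frac12} \frac{e^{-r}}{\sqrt{r}}\left(1 - \frac{1}{8r} + \frac{9}{128 r^2} + \cdots\right). \tag{18} $$
Now since each $\rho_n$ has compact support, it follows from (13) that each $g_n$ will also decay exponentially:
$$ g_n(w) \sim \int_\Omega \left(\frac{\pi}{2}\right)^{\frac12} \frac{e^{-\lambda_n|w-x|}}{\sqrt{\lambda_n|w-x|}}\left(1 - \frac{1}{8\lambda_n|w-x|} + \cdots\right)\rho_n(x)\,dx\,d\bar{x}, $$
where Ω is the support of ρ. As |w| → ∞, also |w − x| ∼ |w| for all x ∈ Ω. Therefore,
$$ g_n(w) \sim \left(\frac{\pi}{2}\right)^{\frac12} \frac{e^{-\lambda_n|w|}}{\sqrt{\lambda_n|w|}}\left(1 - \frac{1}{8\lambda_n|w|} + \cdots\right)\int_\Omega \rho_n(x)\,dx\,d\bar{x}, \quad \text{as } |w| \to \infty. $$
Choosing $0 < \lambda < \min\{\lambda_n\}_{n\in\mathbb{N}}$, the statement follows from the eigenspace decomposition of f, (11) and (12).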
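The two facts about $K_0$ used above, namely $\|K_0\|_{L^1} = 2\pi$ and the leading terms of the expansion (18), can be sanity-checked numerically. What follows is only an illustrative sketch, not part of the argument: the helper names `k0`, `radial_mass` and `k0_asymptotic` are ours, and the quadrature parameters are ad hoc choices. It uses the representation $K_0(r) = \int_0^\infty e^{-r\cosh u}\,du$, which follows from the integral formula for $K_0$ quoted above through the substitution t = cosh u.

```python
import math

def k0(r, steps=400):
    """Approximate K_0(r) = int_0^inf exp(-r*cosh(u)) du by the trapezoid rule."""
    # Beyond `upper` the integrand is below exp(-40) times its maximum exp(-r).
    upper = math.acosh(1.0 + 40.0 / r)
    h = upper / steps
    total = 0.5 * (math.exp(-r) + math.exp(-r * math.cosh(upper)))
    for i in range(1, steps):
        total += math.exp(-r * math.cosh(i * h))
    return total * h

def radial_mass(r_max=30.0, cells=1500):
    """Midpoint-rule approximation of int_0^{r_max} r*K_0(r) dr."""
    h = r_max / cells
    return h * sum((i + 0.5) * h * k0((i + 0.5) * h) for i in range(cells))

def k0_asymptotic(r):
    """First three terms of the large-r expansion (18)."""
    prefactor = math.sqrt(math.pi / (2.0 * r)) * math.exp(-r)
    return prefactor * (1.0 - 1.0 / (8.0 * r) + 9.0 / (128.0 * r * r))

if __name__ == "__main__":
    # The planar L^1 norm is 2*pi times the radial integral of r*K_0(r).
    print("||K_0||_L1 approx:", 2.0 * math.pi * radial_mass())
    print("K_0(10), quadrature vs expansion:", k0(10.0), k0_asymptotic(10.0))
```

The radial integral should come out close to 1, so that the planar norm is close to 2π, and the truncated expansion should agree with the quadrature value at r = 10 to a relative error well below one percent.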
In particular, note that (f/w) also belongs to $L^2(L_\xi \otimes S^+)$. Define $\tilde{L}^2(L_\xi \otimes S^+)$ as the space of all $\psi \in \Gamma(L_\xi \otimes S^+)$ such that ψ/w is square-integrable. The proposition just proved implies that the flat model Laplacian acting as follows:
$$ \Delta_\xi : \tilde{L}^2(L_\xi \otimes S^\pm) \to L^2(L_\xi \otimes S^\pm) $$
is an invertible operator. Since $\Delta_\xi = D_\xi D_\xi^*$, we conclude that:
$$ D_\xi^* : \tilde{L}^2(L_\xi \otimes S^-) \to L^2(L_\xi \otimes S^+) \tag{19} $$
is also invertible.

3.2. Completing the proof of Theorem 2. Let K denote a closed ball in $\mathbb{C}$ of sufficiently large radius R; its complement is $D_R$ defined as above. To show that $D^*_{A_\xi}$ is Fredholm, first note that the usual elliptic theory for compact manifolds guarantees the existence of a parametrix for $D^*_{A_\xi}$ inside this compact core T × K; this is a bounded, elliptic, pseudo-differential operator:
$$ Q^K_{A_\xi} : L^2(E(\xi) \otimes S^+|_{T\times K}) \to L^2_1(E(\xi) \otimes S^-|_{T\times K}) $$
of order −1. On the other hand, it follows from Lemma 3 that:
$$ \|D^*_{A_\xi} - (D^*_{\xi_0+\xi} \oplus D^*_{-\xi_0+\xi})\|^2_{L^2(T\times D_R)} < $$
[…]

[…] > 0, then there is an isomorphism $H^1(T \times \mathbb{P}^1, E) \equiv \ker D_A^*$.

Note that $\ker D_A^* \subset L^2_1(E \otimes S^-)$, with the norm defined in (6). First, we must show that $H^1(T \times \mathbb{P}^1, E)$ has the correct dimension.
Vanishing theorem. Since χ(E) = −k, it is enough to show that the cohomologies of orders 0 and 2 vanish in order to conclude that $h^1(T \times \mathbb{P}^1, \mathcal{O}(E)) = k$.

A holomorphic bundle $E \to T \times \mathbb{P}^1$ is said to be generically fibrewise semistable if the restriction $E|_{T_w}$ is semistable (see footnote 1) for generic $w \in \mathbb{P}^1$ (here, $T_w = T \times \{w\}$). Similarly, E is said to be fibrewise semistable (regular) if the restriction $E|_{T_w}$ is semistable (regular) for all $w \in \mathbb{P}^1$. Notice that every instanton bundle is generically fibrewise semistable, since $E|_{T_\infty}$ is semistable, which is a generic condition. This observation leads to the desired vanishing result:

Lemma 5. If E is an irreducible instanton bundle and k > 0, then:
$$ h^0(T \times \mathbb{P}^1, E(\xi)) = h^2(T \times \mathbb{P}^1, E(\xi)) = 0, \quad \forall \xi \in \hat{T}. $$
Let $L_\xi \to T$ be a flat line bundle as described in [14]; denote:
$$ E(\xi) = E \otimes p_1^* L_\xi \quad\text{and}\quad \tilde{E}(\xi) = E \otimes p_1^* L_\xi \otimes p_2^* \mathcal{O}_{\mathbb{P}^1}(1). $$
Note that we can regard $p_2^*\mathcal{O}_{\mathbb{P}^1}(1)$ as the line bundle corresponding to the divisor $T_\infty$. It follows from the lemma that:
$$ h^1(T \times \mathbb{P}^1, E(\xi)) = h^1(T \times \mathbb{P}^1, \tilde{E}(\xi)) = k $$
for every $\xi \in \hat{T}$.

Proof. Take $w \in \mathbb{P}^1$ such that $E(\xi)|_{T_w} = L_{\xi_1} \oplus L_{\xi_2}$ for some nontrivial $\xi_1, \xi_2 \in \hat{T}$ and let $V \subset \mathbb{P}^1$ be an open neighbourhood of w such that every point of V satisfies the same condition; the existence of such an open set is guaranteed by the fact that E is generically fibrewise semistable. Suppose there is a holomorphic section $s \in H^0(M, E(\xi))$; it gives rise to a holomorphic section $s_w$ of $E(\xi)|_{T_w} \to T_w$. On the other hand, we have that $h^0(T, E(\xi)|_{T\times\{w\}}) = 0$, hence $s_w \equiv 0$. Moreover, $s_w \equiv 0$ for all w ∈ V, so that s must vanish identically on the open set T × V, hence vanish everywhere and $h^0(E(\xi)) = 0$. The vanishing of $h^0(\tilde{E}(\xi))$ is proved in the very same way by noting $\tilde{E}(\xi)|_{T_w} \equiv E(\xi)|_{T_w}$ since $p_2^*\mathcal{O}_{\mathbb{P}^1}(1)|_{T_w} = \mathbb{C}$. The vanishing of the $h^2$'s follows from Serre duality and a similar argument for the bundle $E(\xi) \otimes K_{\mathbb{P}^1}$.

More precisely, Serre duality implies that:
$$ H^2(T \times \mathbb{P}^1, E(\xi)) = H^0(T \times \mathbb{P}^1, E(\xi)^\vee \otimes K_{T\times\mathbb{P}^1})^* = H^0(T \times \mathbb{P}^1, E(\xi)^\vee \otimes p_2^*\mathcal{O}_{\mathbb{P}^1}(-2))^*. $$

Footnote 1. Recall that every semistable, rank 2 vector bundle over an elliptic curve with trivial determinant either splits as a sum of flat line bundles or is the unique nontrivial extension of a flat line bundle of order 2 by itself. Such a bundle is regular if it is not the sum of trivial line bundles.
On the other hand, it is easy to see that:
$$ E(-\xi)|_{T_w} \equiv (E(\xi)^\vee \otimes p_2^*\mathcal{O}_{\mathbb{P}^1}(-2))|_{T_w}, $$
so that we can apply the same argument as above to show that $h^0(T \times \mathbb{P}^1, E(\xi)^\vee \otimes K_{T\times\mathbb{P}^1}) = 0$.
Proof of Proposition 5. Let $\{w_i\} \subset \mathbb{P}^1$ be such that $H^0(T_{w_i}, E|_{T_{w_i}})$ does not vanish. As we argued above, there are only finitely many such points; in fact, it can be shown that there are at most k such points (see Lemma 2 of [14]). Suppose that $\#\{w_i\} = p \le k$; note also that $\infty \notin \{w_i\}$ if $\xi_0$ is nontrivial. Denote by B the divisor in $T \times \mathbb{P}^1$ consisting of the elliptic curves lying over these points, i.e. $B = \sum_i T \times \{w_i\}$. Also, denote $E(p) = E \otimes \mathcal{O}_{T\times\mathbb{P}^1}(B)$. Consider the exact sequence of sheaves:
$$ 0 \to \mathcal{O}(E) \to \mathcal{O}(E(p)) \to \mathcal{O}(E(p)|_B) \to 0 $$
which induces the following sequence of cohomology:
$$ 0 \to H^0(B, E(p)|_B) \to H^1(T \times \mathbb{P}^1, E) \to H^1(T \times \mathbb{P}^1, E(p)) \to H^1(B, E(p)|_B) \to 0 \tag{23} $$
(the two middle terms both have dimension k), and note that $p \le h^0(B, E(p)|_B) = h^1(B, E(p)|_B) \le 2k$. It follows from (23) that $h^0(B, E(p)|_B) = h^1(B, E(p)|_B) = k$, so that the map $H^0(B, E(p)|_B) \to H^1(T \times \mathbb{P}^1, E)$ is an isomorphism. This means that each element in $H^1(T \times \mathbb{P}^1, E)$ can be represented by a (0, 1)-form θ supported on tubular neighbourhoods of the fibres $T \times \{w_i\}$. Pulling θ back to $T \times \mathbb{C}$, we obtain a compactly supported (0, 1)-form, which we also denote by θ, since $\xi_0$ is nontrivial.

We want to fashion a solution ψ of $D_A^*\psi = 0$ out of θ, and within the same cohomology class. In other words, we want to find a section $s \in L^2(\Lambda^0 E)$ such that $D_A^*(\theta + \bar\partial_A s) = 0$. Since $D_A^* = \bar\partial_A^* - \bar\partial_A$, this is the same as solving the equation:
$$ \bar\partial_A^*\bar\partial_A s = \Delta_A s = -\bar\partial_A^*\theta $$
for a compactly supported θ.

In the Fredholm theory for the Dirac operator developed above, we constructed the Green's operator $G_A$ of the Dirac Laplacian $\Delta_A$. Thus, we can write $s = -G_A\bar\partial_A^*\theta$ and
$$ \psi = \theta - \bar\partial_A G_A \bar\partial_A^*\theta = P\theta, $$
where P denotes the $L^2$ projection $L^2(E \otimes S^-) \to \ker D_A^*$.

We must verify that $\psi \in L^2(E \otimes S^-)$; it is enough to show that $\bar\partial_A G_A \bar\partial_A^*\theta$ is square-integrable for any compactly supported (0,1)-form θ. First note that $\gamma = \bar\partial_A^*\theta$ also has compact support, thus $s = G_A\gamma \in L^2(\Lambda^0 E)$. Therefore, we have:
$$ \|\bar\partial_A s\|^2_{L^2} = \langle \bar\partial_A s, \bar\partial_A s\rangle = \langle \bar\partial_A s, (\bar\partial_A G_A)\gamma\rangle = \langle (\bar\partial_A G_A)^*\bar\partial_A s, \gamma\rangle $$
which is finite, since γ is compactly supported. Note that the integration by parts made in the last step is justified by the same fact. Therefore, ψ is indeed a square-integrable solution of $D_A^*\psi = 0$.
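Finally, to see that the map defined above is injective (hence an isomorphism), let θ′ be another (0, 1)-form supported around B and within the same cohomology class as θ, so that $\theta' - \theta = \bar\partial_A\alpha$. Thus:
$$ \psi' - \psi = (\theta' - \bar\partial_A G_A \bar\partial_A^*\theta') - (\theta - \bar\partial_A G_A \bar\partial_A^*\theta) = (\theta' - \theta) - \bar\partial_A G_A \bar\partial_A^*(\theta' - \theta) = \bar\partial_A\alpha - \bar\partial_A G_A \bar\partial_A^*\bar\partial_A\alpha = \bar\partial_A\alpha - \bar\partial_A\alpha = 0. \tag{24} $$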
This completes the proof.
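The mechanism behind (24) is purely linear-algebraic and can be seen in a finite-dimensional toy model. The sketch below is an illustration under stated assumptions, not the operators of the paper: the role of $\bar\partial_A$ is played by a hypothetical injective map D from C to C³ (a single, arbitrarily chosen column vector), so that the Green's operator $G = (D^*D)^{-1}$ is just a positive scalar. It shows that P = I − D G D* lands in ker D*, is idempotent, and gives the same result on θ and on θ + Dα.

```python
def inner(u, v):
    """Hermitian inner product <u, v>, conjugate-linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def norm(u):
    return abs(inner(u, u)) ** 0.5

# Hypothetical stand-in for dbar_A: the linear map alpha -> alpha * D from C to C^3.
D = [1 + 2j, -1j, 3 + 0j]
G = 1.0 / inner(D, D).real   # Green's operator of D^* D, here a positive scalar

def project(theta):
    """P theta = theta - D G D^* theta, the orthogonal projection onto ker D^*."""
    coefficient = G * inner(D, theta)   # this is G D^* theta
    return [t - coefficient * d for t, d in zip(theta, D)]

theta = [2 - 1j, 0.5j, 1 + 1j]                             # arbitrary representative
alpha = 0.7 - 0.3j
theta_prime = [t + alpha * d for t, d in zip(theta, D)]    # theta' = theta + D alpha

psi = project(theta)
psi_prime = project(theta_prime)

if __name__ == "__main__":
    print("|D^* psi|     =", abs(inner(D, psi)))
    print("|P psi - psi| =", norm([a - b for a, b in zip(project(psi), psi)]))
    print("|psi' - psi|  =", norm([a - b for a, b in zip(psi_prime, psi)]))
```

All three printed quantities should vanish up to rounding error; the last one is the finite-dimensional counterpart of (24).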
4. Nahm Transform of Doubly-Periodic Instantons

Recall that our starting point is a rank two vector bundle $E \to T \times \mathbb{C}$ provided with an instanton connection $A \in \mathcal{A}_{(k,\xi_0)}$, where the instanton number k and the asymptotic state $\xi_0$ are from now on fixed.

Over the punctured Jacobian torus $\hat{T} \setminus \{\pm\xi_0\}$, consider the trivial Hilbert bundle $\hat{H} \to \hat{T} \setminus \{\pm\xi_0\}$ whose fibres are $\hat{H}_\xi = L^2_1(E(\xi) \otimes S^-)$. Taking the $L^2_1$-norm on the fibres, $\hat{H}$ becomes a hermitian bundle. Moreover, call $\hat{d}$ the trivial covariant derivative on $\hat{H}$; such a derivative is clearly unitary, hence one can define a holomorphic structure over $\hat{H}$.

Now consider the finite-dimensional sub-bundle $V \hookrightarrow \hat{H}$ over $\hat{T} \setminus \{\pm\xi_0\}$ whose fibres are given by $V_\xi = \ker D^*_{A_\xi}$. Remark that this is actually the index bundle for the family of Dirac operators $D_{A_\xi}$. Let $i : V \to \hat{H}$ be the natural inclusion and $P : \hat{H} \to V$ the fibrewise orthogonal $L^2$ projection; more precisely, $P_\xi = I - D_{A_\xi} G_{A_\xi} D^*_{A_\xi}$ for each $\xi \in \hat{T} \setminus \{\pm\xi_0\}$, where $G_{A_\xi}$ denotes the Green's operator for (22) and I is the identity operator. We can define a connection on V via the projection formula:
$$ \nabla_B = P \circ \hat{d} \circ i, \tag{25} $$
where B is the associated connection form. Clearly, V inherits the hermitian metric from $\hat{H}$, and B is also unitary with respect to this induced metric. Hence, we can provide V with the holomorphic structure coming from the unitary connection B.

Alternatively, V also admits an interpretation in terms of monads, see [6]. The Dirac operator can be unfolded into a family of elliptic complexes parametrised by $\hat{T} \setminus \{\pm\xi_0\}$, namely:
$$ 0 \to L^2_2(\Lambda^0 E(\xi)) \xrightarrow{\ \bar\partial_{A_\xi}\ } L^2_1(\Lambda^{0,1} E(\xi)) \xrightarrow{\ -\bar\partial_{A_\xi}\ } L^2(\Lambda^{0,2} E(\xi)) \to 0 \tag{26} $$
which, of course, are also Fredholm. Moreover, the cohomologies of order 0 and 2 must vanish, by Proposition 5. As in [6], such a holomorphic family defines a holomorphic vector bundle $V \to (\hat{T} \setminus \{\pm\xi_0\})$, with fibres $V_\xi = \ker\{D^*_{A_\xi}\}$, plus a unitary connection, induced by orthogonal projection, which is compatible with the given holomorphic structure. Such a connection will be denoted by B. We will invoke this construction repeatedly throughout this work.
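The projection formula (25) can be made concrete in the simplest possible setting. The sketch below is a toy model chosen by us, not the bundle V of the text: a line sub-bundle of the trivial bundle with fibre C² over a circle, spanned by a unit vector ψ(t). For a unit frame, P(dψ) = ψ⟨ψ, dψ⟩, so the connection form is B = ⟨ψ, dψ/dt⟩, and unitarity of the induced connection means that B is purely imaginary. The family `psi`, the constant `THETA` and the finite-difference step are all illustrative assumptions.

```python
import cmath
import math

THETA = 1.0  # fixed polar angle; an arbitrary choice for this illustration

def psi(t):
    """Unit vector in C^2 spanning the fibre over t of a line sub-bundle."""
    return [complex(math.cos(THETA / 2)), cmath.exp(1j * t) * math.sin(THETA / 2)]

def inner(u, v):
    """Hermitian inner product, conjugate-linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def connection_form(t, h=1e-5):
    """B(t) = <psi, d psi/dt>: the component of P(d psi) along psi,
    with d psi/dt approximated by a central finite difference."""
    plus, minus = psi(t + h), psi(t - h)
    dpsi = [(p - m) / (2 * h) for p, m in zip(plus, minus)]
    return inner(psi(t), dpsi)

if __name__ == "__main__":
    b = connection_form(0.3)
    print("Re B =", b.real, " Im B =", b.imag)
    print("exact Im B = sin^2(THETA/2) =", math.sin(THETA / 2) ** 2)
```

For this family one computes by hand that B = i sin²(θ/2), so the real part should vanish and the imaginary part should match the closed form up to discretisation error.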
The curvature $F_B$ of B is simply given by:
$$ F_B = \nabla_B\nabla_B = P\hat{d}(P\hat{d}). $$
Explicit formulas for the matrix elements on an arbitrary local trivialisation of $V \to (\hat{T} \setminus \{\pm\xi_0\})$ will be useful later on. For instance, pick an orthonormal frame $\{\psi_i\}_{i=1}^k$ over an open set $U \subset \hat{T} \setminus \{\pm\xi_0\}$. Then, we have that:
$$ (B)_{ij} = \langle\psi_j, \nabla_B\psi_i\rangle = \langle\psi_j, \hat{d}\psi_i\rangle, \qquad (F_B)_{ij} = \langle\psi_j, F_B\psi_i\rangle = \langle\psi_j, P\hat{d}(P\hat{d}\psi_i)\rangle = \langle\psi_j, \hat{d}(P\hat{d}\psi_i)\rangle. \tag{27} $$

Higgs field. We now define the Higgs field $\Phi \in \mathrm{End}(V) \otimes K_{\hat{T}}$. Let w be the complex coordinate of the plane, and $\psi \in \Gamma(V)$, i.e. for each $\xi \in \hat{T} \setminus \{\pm\xi_0\}$, $\psi[\xi] \in \ker D^*_{A_\xi}$. For a fixed ξ, the Higgs field will act on ψ[ξ] by multiplying this section by the plane coordinate w and then projecting it back to $\ker D^*_{A_\xi}$:
$$ (\Phi(\psi))[\xi] = 2\sqrt{2}\pi\, P_\xi(w\psi[\xi])\,d\xi. \tag{28} $$
Its conjugate is clearly given by $(\Phi^*(\psi))[\xi] = 2\sqrt{2}\pi\, P_\xi(\bar{w}\psi[\xi])\,d\bar\xi$.

There is a subtle analytical point here. The spinors ψ belong to $L^2(E(\xi) \otimes S^-)$, but it is not necessarily the case that wψ also belongs to $L^2(E(\xi) \otimes S^-)$. However, we have the following lemma:

Lemma 6. If $\psi \in \ker D_A^*$ and A has nontrivial asymptotic state, then $w\psi \in L^2(E \otimes S^-)$.

Proof. The key result here is Proposition 4, and the observation that follows it, in particular the invertibility of the operator (19). Let $K \subset T \times \mathbb{C}$ be a compact subset such that $D_A^*$ is sufficiently close to the flat Dirac operator $D^*_{\pm\xi_0}$ outside K. Thus, restricted to the complement of K, $D_A^*$ is invertible acting from $\tilde{L}^2 \to L^2$. Now if $\psi \in \ker D_A^*$, then $D_A^*(w\psi) = dw\cdot\psi \in L^2(E \otimes S^+|_{T\times\mathbb{C}\setminus K})$ and the lemma follows.

Note that the dependence of (B, Φ) on the original instanton A is contained in the $L^2$ projection operator P, i.e. in the k solutions of $D^*_{A_\xi}\psi = 0$. It is easy to see that the finite-dimensional space spanned by these ψ is gauge invariant; moreover, the multiplication by w also commutes with gauge transformations $\hat{g} \in \mathrm{Aut}(V)$. Therefore, we have that:

Proposition 6. If A and A′ are gauge equivalent irreducible instantons, then the corresponding pairs (B, Φ) and (B′, Φ′) are also gauge equivalent.

A pair (B, Φ) is called a Higgs pair on the bundle $V \to \hat{T} \setminus \{\pm\xi_0\}$ if it satisfies Hitchin's self-duality equations:
$$ \text{(i)}\ F_B + [\Phi, \Phi^*] = 0, \qquad \text{(ii)}\ \bar\partial_B\Phi = 0. \tag{29} $$
Recall from [14] that the unitary connection of the Poincaré line bundle $\mathcal{P} \to T \times \hat{T}$ and its corresponding curvature are given by:
$$ \omega(z, \xi) = i\pi\cdot\sum_{\mu=1}^{2}\left(\xi_\mu\,dz_\mu - z_\mu\,d\xi_\mu\right) \quad\text{and}\quad \Omega(z, \xi) = 2i\pi\cdot\sum_{\mu=1}^{2} d\xi_\mu \wedge dz_\mu. $$
From Braam & van Baal [4], we know that if $s \in \Gamma(E(\xi) \otimes S^-)$, then:
$$ D^*_{A_\xi}(\hat{d}s) = [D^*_{A_\xi}, \hat{d}]s = -\Omega\cdot s, $$
where "·" means Clifford multiplication. The local formula for the curvature (27) may now be cast in a more convenient form:
$$ (F_B)_{ij} = \langle\psi_j, \hat{d}(P\hat{d}\psi_i)\rangle = \langle\psi_j, \hat{d}(D_{A_\xi} G_{A_\xi} D^*_{A_\xi}\hat{d}\psi_i)\rangle = \langle -D^*_{A_\xi}\hat{d}\psi_j, G_{A_\xi}(D^*_{A_\xi}\hat{d}\psi_i)\rangle = \langle\Omega\cdot\psi_j, G_{A_\xi}(\Omega\cdot\psi_i)\rangle. $$
Since the Clifford multiplication commutes with the Green's operator, we end up with:
$$ (F_B)_{ij} = -\langle(\Omega\wedge\Omega)\cdot\psi_j, G_{A_\xi}\psi_i\rangle = 8\pi^2\langle(dz_1\wedge dz_2)\cdot\psi_j, G_{A_\xi}\psi_i\rangle\,d\xi_1\wedge d\xi_2 = -4\pi^2 i\,\langle(dz_1\wedge dz_2)\cdot\psi_j, G_{A_\xi}\psi_i\rangle\,d\xi\wedge d\bar\xi. \tag{30} $$
Note moreover that the inner product is taken in $L^2(E(\xi) \otimes S^-)$, integrating out the (z, w) coordinates.

Theorem 3. If $A \in \mathcal{A}_{(k,\xi_0)}$, then the associated pair (B, Φ) on the dual bundle $V \to \hat{T} \setminus \{\pm\xi_0\}$ constructed above satisfies Hitchin's equations (29).

Proof. Choose an open set $U \subset \hat{T} \setminus \{\pm\xi_0\}$ and pick a local orthonormal trivialisation of $V \to \hat{T} \setminus \{\pm\xi_0\}$ over U, such that the corresponding local frame $\{\psi_i\}_{i=1}^k$ is parallel at ξ. Recall that $\psi_i(\xi) \in \ker D^*_{A_\xi}$.

First, we shall look at the second equation of (29), and recall that $\hat{T} \setminus \{\pm\xi_0\}$ was given the flat Euclidean metric induced from the quotient. Once a local trivialisation is chosen, the endomorphism Φ can then be put in matrix form, with matrix elements given by:
$$ a_{ij}(\xi) = \langle\psi_j(\xi), \Phi[\psi_i](\xi)\rangle, $$
where $\langle\,,\rangle$ is the inner product on $L^2(E(\xi) \otimes S^-)$, integrating out the (z, w) coordinates. Clearly, Φ is a holomorphic endomorphism if its matrix elements in a holomorphic trivialisation are holomorphic functions. However:
$$ \Phi[\psi_i](\xi) = 2\sqrt{2}\pi\,P_\xi(w\psi_i(\xi))\,d\xi = 2\sqrt{2}\pi\,(I - D_{A_\xi} G_{A_\xi} D^*_{A_\xi})(w\psi_i(\xi))\,d\xi $$
so that:
$$ a_{ij}(\xi) = 2\sqrt{2}\pi\left\{\langle\psi_j(\xi), w\psi_i(\xi)\rangle - \langle\psi_j(\xi), D_{A_\xi} G_{A_\xi} D^*_{A_\xi}(w\psi_i(\xi))\rangle\right\} = 2\sqrt{2}\pi\left\{\langle\psi_j(\xi), w\psi_i(\xi)\rangle - \langle D^*_{A_\xi}\psi_j(\xi), G_{A_\xi} D^*_{A_\xi}(w\psi_i(\xi))\rangle\right\} = 2\sqrt{2}\pi\,\langle\psi_j(\xi), w\psi_i(\xi)\rangle. $$
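Therefore:
$$ \frac{\partial a_{ij}}{\partial\bar\xi}(\xi) = 2\sqrt{2}\pi\left\{\langle\partial_B\psi_j, w\psi_i\rangle + \langle\psi_j, \bar\partial_B(w\psi_i)\rangle\right\} = 2\sqrt{2}\pi\,\Big\langle\psi_j, \frac{\partial w}{\partial\bar\xi}\psi_i + w\,\bar\partial_B\psi_i\Big\rangle = 0 $$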
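as $\psi_i$ is parallel at ξ. Since this can be done for all $\xi \in \hat{T} \setminus \{\pm\xi_0\}$, the second equation is satisfied.

Now, we move back to (29(i)). Let us first compute the matrix elements $([\Phi, \Phi^*])_{ij}$. Note that:
$$ \text{(i)}\ [D^*_{A_\xi}, w]\psi_i(\xi) = D^*_{A_\xi}(w\psi_i(\xi)) = -dw\cdot\psi_i(\xi), \qquad \text{(ii)}\ [D^*_{A_\xi}, \bar{w}]\psi_i(\xi) = D^*_{A_\xi}(\bar{w}\psi_i(\xi)) = 0, \tag{31} $$
where we used the fact that $D^*_{A_\xi} = \bar\partial^*_{A_\xi} - \bar\partial_{A_\xi}$. Recall that for 1-forms $[\Phi, \Phi^*] = \Phi\Phi^* + \Phi^*\Phi$. We compute each term separately:
$$ \Phi^*\Phi(\psi_i) = 8\pi^2 P[wP(\bar{w}\psi_i)]\,d\xi\wedge d\bar\xi = 8\pi^2\left\{wP(\bar{w}\psi_i) - D_{A_\xi} G_{A_\xi} D^*_{A_\xi} wP(\bar{w}\psi_i)\right\}d\xi\wedge d\bar\xi = 8\pi^2\left\{w\bar{w}\psi_i - wD_{A_\xi} G_{A_\xi} D^*_{A_\xi}(\bar{w}\psi_i) - D_{A_\xi} G_{A_\xi} D^*_{A_\xi} wP(\bar{w}\psi_i)\right\}d\xi\wedge d\bar\xi, $$
$$ \Phi\Phi^*(\psi_i) = 8\pi^2 P[\bar{w}P(w\psi_i)]\,d\bar\xi\wedge d\xi = 8\pi^2\left\{\bar{w}w\psi_i - \bar{w}D_{A_\xi} G_{A_\xi} D^*_{A_\xi}(w\psi_i) - D_{A_\xi} G_{A_\xi} D^*_{A_\xi}\bar{w}P(w\psi_i)\right\}d\bar\xi\wedge d\xi. $$
The first terms of $\Phi\Phi^*$ and $\Phi^*\Phi$ cancel each other and the third terms will cancel out when we take the inner product with $\psi_j$. Moreover, the second term of $\Phi^*\Phi$ is zero by (31(ii)). So we are left with:
$$ ([\Phi, \Phi^*])_{ij} = \langle\psi_j, [\Phi, \Phi^*]\psi_i\rangle = 8\pi^2\langle\psi_j, \bar{w}D_{A_\xi} G_\xi D^*_{A_\xi}(w\psi_i)\rangle\,d\xi\wedge d\bar\xi = 8\pi^2\langle D^*_{A_\xi}(w\psi_j), G_\xi D^*_{A_\xi}(w\psi_i)\rangle\,d\xi\wedge d\bar\xi $$
$$ = -8\pi^2\langle(dw\wedge d\bar{w})\cdot\psi_j, G_\xi\psi_i\rangle\,d\xi\wedge d\bar\xi = -4\pi^2 i\,\langle(dw_1\wedge dw_2)\cdot\psi_j, G_\xi\psi_i\rangle\,d\xi\wedge d\bar\xi, $$
where we have once more used the fact that the Clifford multiplication commutes with the Green's operator. Summing the final expression above with (30), one gets:
$$ (F_B)_{ij} + ([\Phi, \Phi^*])_{ij} = -4\pi^2 i\,\langle(dz_1\wedge dz_2 + dw_1\wedge dw_2)\cdot\psi_j, G_\xi\psi_i\rangle\,d\xi\wedge d\bar\xi = 0, $$
for the first term of the inner product is zero since it consists of a self-dual form (the Kähler form) acting on a negative spinor.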
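The last step rests on a standard fact of Clifford algebra in dimension four: a self-dual 2-form acts by zero on one of the two chirality eigenspaces. The following sketch checks this for the form $e_1\wedge e_2 + e_3\wedge e_4$ in one explicit matrix representation. The representation, the sign convention $\gamma_a\gamma_b + \gamma_b\gamma_a = 2\delta_{ab}$, and the identification of the annihilated eigenspace with the negative spinors are assumptions made for illustration; other conventions differ by factors of i or by exchanging the two eigenspaces, which does not affect the vanishing.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def kron(a, b):
    """Kronecker product of two 2x2 matrices, returned as a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def is_zero(m):
    return all(abs(x) < 1e-12 for row in m for x in row)

I2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]        # Pauli matrices
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
I4 = kron(I2, I2)

# One (assumed) choice of generators with g_a g_b + g_b g_a = 2 delta_ab.
GAMMA = [kron(S1, S1), kron(S1, S2), kron(S1, S3), kron(S2, I2)]

def two_form(a, b):
    """Clifford action of e_a ^ e_b for a != b (indices 0..3)."""
    return matmul(GAMMA[a], GAMMA[b])

# Chirality operator: squares to the identity, eigenvalues +1 and -1.
OMEGA = matmul(two_form(0, 1), two_form(2, 3))
SELF_DUAL = add(two_form(0, 1), two_form(2, 3))            # e1^e2 + e3^e4
ANTI_SELF_DUAL = add(two_form(0, 1), two_form(2, 3), -1)   # e1^e2 - e3^e4

PLUS = add(I4, OMEGA)        # twice the projector onto the +1 eigenspace
MINUS = add(I4, OMEGA, -1)   # twice the projector onto the -1 eigenspace

if __name__ == "__main__":
    print("self-dual form kills the +1 eigenspace:", is_zero(matmul(SELF_DUAL, PLUS)))
    print("anti-self-dual form kills the -1 eigenspace:", is_zero(matmul(ANTI_SELF_DUAL, MINUS)))
```

In this representation the identity behind the check is $\gamma_1\gamma_2 + \gamma_3\gamma_4 = \gamma_1\gamma_2(1 - \omega)$ with $\omega = \gamma_1\gamma_2\gamma_3\gamma_4$, which holds for any generators satisfying the relations above.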
Clearly, the above result has two weak points: it tells nothing about the behaviour of the Higgs field around the singular points $\pm\xi_0$; and it fails to show that the Higgs pairs so obtained are admissible in the sense of [14]. In fact, establishing the first point requires the use of algebraic-geometric methods, and will be taken up in Sect. 5 below. The second point will be clarified in Sect. 6.

5. Holomorphic Version

The vanishing results of Sect. 3.4 put us in position to define the transformed bundle $\mathcal{V} \to \hat{T}$. Indeed, consider the following elliptic complex:
$$ 0 \to L^2_2(\Lambda^0 E(\xi)) \xrightarrow{\ \bar\partial_{A_\xi}\ } L^2_1(\Lambda^{0,1} E(\xi)) \xrightarrow{\ -\bar\partial_{A_\xi}\ } L^2(\Lambda^{0,2} E(\xi)) \to 0. \tag{32} $$
According to Proposition 5, $H^1(T \times \mathbb{P}^1, E(\xi))$ is the only nontrivial cohomology of this complex. It then follows that the family of vector spaces given by $\mathcal{V}_\xi = H^1(T \times \mathbb{P}^1, E(\xi))$ forms a holomorphic vector bundle of rank k over $\hat{T}$; denote such holomorphic structure by $\bar\partial_{\mathcal{V}}$. Note that $\mathcal{V}_\xi$ is defined even if $\xi = \pm\xi_0$. Furthermore, by Proposition 5, $\mathcal{V}|_{\hat{T}\setminus\{\pm\xi_0\}}$ coincides holomorphically with the dual bundle V defined in the previous section, i.e.:
$$ (\mathcal{V}, \bar\partial_{\mathcal{V}})|_{\hat{T}\setminus\{\pm\xi_0\}} \simeq (V, \bar\partial_B). $$
Moreover, $\mathcal{V}$ comes equipped with a hermitian metric h′, which we want to compare with h, the hermitian metric on V induced from the monad (26). The key point is a fact we noted before in Lemma 3: given a 1-form a on $T \times \mathbb{P}^1$, its $L^2$-norm with respect to the round metric is always larger than its $L^2$-norm with respect to the flat metric on $T \times (\mathbb{P}^1 \setminus \{\infty\})$:
$$ \|a\|_{L^2_R} > \|a\|_{L^2_F}. $$
Thus, comparing the monads (26) and (32), one sees that h is bounded above by h′. In particular, the metric h is bounded at $\pm\xi_0$.

We can regard $\mathcal{V}$ as an index bundle for the family of Dirac operators over $T \times \mathbb{P}^1$ parametrised by $\xi \in \hat{T}$. Hence, its degree can be computed by the Atiyah-Singer index theorem for families. Consider now the bundle $G = p_{12}^* E \otimes p_{13}^*\mathcal{P}$ over $T \times \mathbb{P}^1 \times \hat{T}$, and note that $G|_{T\times\mathbb{P}^1\times\{\xi\}} = E(\xi)$. Then we have:
$$ ch(\mathcal{V}) = -ch(G)\cdot td(T \times \mathbb{P}^1)/[T \times \mathbb{P}^1] = -\left(2 + 2c_1(\mathcal{P}) + c_1(\mathcal{P})^2 - c_2(E)\right)\left(1 + \tfrac12 c_1(\mathbb{P}^1)\right)/[T \times \mathbb{P}^1] = k - \tfrac12 c_1(\mathcal{P})^2 c_1(\mathbb{P}^1)/[T \times \mathbb{P}^1] = k - 2\hat{t}, $$
where the "−" sign in the first line is needed since $\mathcal{V}$ is formed by the null spaces of the adjoint Dirac operator. Summing up:

Lemma 7. The dual bundle $(V, \bar\partial_B) \to \hat{T} \setminus \{\pm\xi_0\}$ admits a holomorphic extension $\mathcal{V} \to \hat{T}$ of degree −2. Moreover, its hermitian metric h is bounded above at the punctures $\pm\xi_0$.
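The determinant line bundle of $\mathcal{V}$ is not fixed, however. In fact, let $t_x : T \times \mathbb{P}^1 \to T \times \mathbb{P}^1$ be the translation of the torus by x ∈ T, acting trivially on $\mathbb{P}^1$, and let $E' = t_x^* E$. If $\mathcal{V}'$ is the dual bundle associated with E′ then $\mathcal{V}' = \mathcal{V} \otimes L_x$. Indeed:
$$ \mathcal{V}'_\xi = H^1(T \times \mathbb{P}^1, E'(\xi)) = H^1\left(T \times \mathbb{P}^1, p_{12}^*(t_x^* E) \otimes p_{13}^*\mathcal{P}|_{T\times\mathbb{P}^1\times\{\xi\}}\right) = H^1\left(T \times \mathbb{P}^1, t_x^*(p_{12}^* E \otimes p_{13}^*\mathcal{P}) \otimes p_3^* L_x|_{T\times\mathbb{P}^1\times\{\xi\}}\right) $$
$$ = H^1\left(T \times \mathbb{P}^1, p_{12}^* E \otimes p_{13}^*\mathcal{P}|_{T\times\mathbb{P}^1\times\{\xi\}}\right) \otimes (L_x)_\xi \ \Rightarrow\ \mathcal{V}'_\xi = \mathcal{V}_\xi \otimes (L_x)_\xi $$
as a canonical isomorphism for each $\xi \in \hat{T}$. Thus $\mathcal{V}' = \mathcal{V} \otimes L_x$. Note also that if B is an admissible connection, $\mathcal{V}$ admits no splitting $\mathcal{V} = \mathcal{V}_0 \oplus L$ compatible with B for any flat line bundle L.

Defining the Higgs field. The next step is to give a holomorphic description of the Higgs field Φ. Recall that $h^0(T \times \mathbb{P}^1, p_2^*\mathcal{O}_{\mathbb{P}^1}(1)) = 2$, and regarding $\mathbb{P}^1 = \mathbb{C} \cup \{\infty\}$, we can fix two holomorphic sections $s_0, s_\infty \in H^0(\mathbb{P}^1, \mathcal{O}_{\mathbb{P}^1}(1))$ such that $s_0$ vanishes at $0 \in \mathbb{C}$ and $s_\infty$ vanishes at the point added at infinity. In homogeneous coordinates $\{(w_1, w_2) \in \mathbb{C}^2 \mid w_2 \ne 0\}$ and $\{(w_1, w_2) \in \mathbb{C}^2 \mid w_1 \ne 0\}$, we have that, respectively ($w = w_1/w_2$):
$$ s_0(w) = w,\ \ s_\infty(w) = 1; \qquad s_0(w) = 1,\ \ s_\infty(w) = \frac{1}{w}. $$
Let us first consider an alternative definition of the transformed Higgs field. For each $\xi \in \hat{T}$, we define the map:
$$ H_\xi : H^1(T \times \mathbb{P}^1, E(\xi)) \times H^1(T \times \mathbb{P}^1, E(\xi)) \to H^1(T \times \mathbb{P}^1, \tilde{E}(\xi)), \qquad (\alpha, \beta) \mapsto \alpha \otimes s_0 - \beta \otimes s_\infty. \tag{33} $$
If $(\alpha, \beta) \in \ker H_\xi$, we define an endomorphism ϕ of $H^1(T \times \mathbb{P}^1, E(\xi))$ at the point $\xi \in \hat{T}$ as follows:
$$ \varphi_\xi(\alpha) = \beta. \tag{34} $$
We check that ϕ actually coincides with the Higgs field Φ we defined in the previous section, up to a multiplicative constant. Note that:
$$ \alpha \otimes s_0 - \beta \otimes s_\infty = 0 \ \Leftrightarrow\ \beta = \alpha(\otimes s_0)(\otimes s_\infty)^{-1}. $$
Moreover, recall that, for any trivialisation of $\mathcal{O}_{\mathbb{P}^1}(1)$ with local coordinate w on $\mathbb{P}^1$, the quotient $\frac{s_0}{s_\infty}(w) = w$. The claim now follows from the proof of Proposition 5; we denote $\Phi[\xi] = 2\sqrt{2}\pi\cdot\varphi_\xi$.

Proposition 7. The eigenvalues of the Higgs field Φ have at most simple poles at $\pm\xi_0$. Moreover, the residues of Φ are semi-simple and have rank ≤ 2 if $\xi_0$ is an element of order 2 in the Jacobian of T, and rank ≤ 1 otherwise.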
Proof. Suppose α(ξ ) is an eigenvector of Eξ with eigenvalue Eξ (α(ξ )) = (ξ ) · α(ξ ). Thus,
(ξ )
= 1/ (ξ ), i.e.
α(ξ ) ⊗ s0 − (ξ ) · α(ξ ) ⊗ s∞ = 0. ⇒ α(ξ ) ⊗ ( (ξ ) · s0 − s∞ ) = 0 Therefore, denoting s (ξ ) = (ξ ) · s0 − s∞ , we have that α(ξ ) ∈ ker(⊗s (ξ )). On the other hand, consider the sheaf sequence: ⊗s (ξ ) ) → E(ξ )|T → 0, 0 → E(ξ ) → E(ξ (ξ )
since the section s (ξ ) vanishes at (ξ ). It induces the cohomology sequence: 0 → H 0 (T
(ξ )
˜ )|T ) → H 1 (T × P1 , E(ξ )) ⊗s→(ξ ) . . . , E(ξ (ξ )
(35)
˜ )|T ) which is non-empty if and only if so that ker(⊗s (ξ )) = H 0 (T (ξ ) , E(ξ (ξ ) E(ξ )|T (ξ ) = Lξ ⊕ L−ξ or F2 ⊗ Lξ . Hence, as ξ approaches ±ξ0 , we must have that one of the eigenvalues of E, say (ξ ) approaches ∞, since E| T∞ = Lξ0 ⊕ L−ξ0 . Moreover, s (ξ ) → s∞ , so that: lim α(ξ ) ∈ ker(⊗s∞ ) = H 0 (T∞ , E(ξ )|T∞ ).
ξ →±ξ0
Therefore, we conclude that, if ξ0 ! = −ξ0 , then one of the eigenvalues of E has a simple pole at ±ξ0 since h0 (T∞ , E(±ξ0 )|T∞ ) = 1; similarly, if ξ0 = −ξ0 , then two of the eigenvalues of E have a simple poles at ξ0 . Note in particular that the images of the residues of E at ±ξ0 are precisely given by: 1 1 ˜ H 0 (T∞ , E(±ξ 0 )|T∞ ) ⊂ H (T × P , E(±ξ0 )).
This proposition almost concludes the main task of this paper, namely to construct the inverse of the Nahm transform of [14]. It only remains to be shown that the Nahm transformed Higgs pair is admissible. We must then show how to match the SU(2) bundle $\check{E} \to T \times \mathbb{C}$ with doubly-periodic instanton $\check{A}$ constructed from the transformed Higgs pair (B, Φ) as in [14] with the original objects A and $E \to T \times \mathbb{C}$ we started with in the present paper. These tasks are taken up in the following section.

6. Proof of Inversion

So far, we have established that the Nahm transform of a doubly-periodic instanton is the same kind of singular Higgs pair as those we started with in the first part of this series [14]. We must now show that the transform presented here is actually the inverse of the construction of instantons of [14]. More precisely, we show that if we start with a doubly-periodic instanton A and apply the Nahm transform to obtain a Higgs pair (B, Φ), then the corresponding doubly-periodic instanton constructed as in [14] is gauge equivalent to the original object.

First, consider the six-dimensional manifold $T \times \mathbb{C} \times (\hat{T} \setminus \{\pm\xi_0\})$. To shorten notation, we denote $M_\xi = T \times \mathbb{C} \times \{\xi\}$ and $\hat{T}_{(z,w)} = \{z\} \times \{w\} \times (\hat{T} \setminus \{\pm\xi_0\})$.
∗ E ⊗ p ∗ P over T × C × (Tˆ \ {±ξ }); note that Now take the bundle G = p23 0 12 G|Mξ ≡ E(ξ ) and G|Tˆ(z,w) ≡ E(z,w) ⊗ Lz , where E(z,w) denotes a trivial rank 2 bundle over Tˆ \ {±ξ0 } with the fibres canonically identified with the vector space E(z,w) . G is clearly holomorphic; we denote by ∂ M the action of the associated Dolbeault operator along the T × P1 direction, and by ∂ Tˆ its action along the Tˆ direction. In particular, ∂ M |Mξ ≡ ∂ Aξ . q
Let Cp,q = (T ×C (G) ⊗ ( ˆ (G); in other words, Cp,q consists of the (p + q)-forms T over T × C × (Tˆ \ {±ξ0 }) with values in G spanned by forms of the shape: 0,p
s(z, w, ξ )dzi1 dw i2 dξj1 dξ j2 , i1 , i2 , j1 , j2 ∈ {0, 1} and i1 + i2 = p, j1 + j2 = q.
(36)
Analytically, we want to regard Cp,q as the completion of the set of smooth forms of the shape above with respect to a Sobolev norm described as follows: s|T ×C×{ξ } ∈ L2 ((2−q E(ξ )) for each ξ ∈ Tˆ \ {±ξ0 }, q s|{(z,w)}×Tˆ \{±ξ0 } ∈ L2q ((2−q Lz ) for each (z, w) ∈ T × C. Now, define the maps: δ1
δ2
Cp,0 → Cp,1 → Cp,2
(37) δ1 (s) = (∂ Tˆ s, −w · s ∧ dξ ) δ2 (s1 , s2 ) = (∂ Tˆ s2 + w · s1 ∧ dξ ) 0,p 1,0 for (s1 , s2 ) ∈ (T ×P1 (G) ⊗ (0,1 (G) ⊕ ( (G) ≡ C(p, 1). Note that (37) does define Tˆ Tˆ a complex. The inversion result will follow from the analysis of the spectral sequences associated to the following double complex (for the general theory of spectral sequences and double complexes, we refer to [3]): ∂M
C0,2 → ↑ δ2 ∂M
C0,1 → ↑ δ1 ∂M
C0,0 →
−∂ M
C1,2 → C2,2 ↑ −δ2 ↑ δ2 −∂ M
C1,1 → C2,0 ↑ −δ1 ↑ δ1 C1,0
(38)
−∂ M
→ C2,0 .
The idea is to compute the total cohomology of the spectral sequence in the two possible different ways and compare the filtrations of the total cohomology. Lemma 8. By first taking the cohomology of the rows, we obtain p,q E2
0 H 2 (C(e, 0)) 0 0 H 1 (C(e, 0)) 0 q ↑ 0 H 0 (C(e, 0)) 0, →
p
(39)
where H i (C(e, 0)) are the cohomology groups of the complex that yields the monad description of the construction of doubly-periodic instantons in [14] (see Proposition 3 there). Proof. First, note that the rows coincide with the complex (26). Moreover, we can regard elements in Cp,q as q-forms over Tˆ with values in 0,p 2 L2−p ((T ×C G). To see this, fix some ξ ∈ Tˆ ; by (36), s(z, w, ξ ) ∈ (0,p G|Mξ . So, by varying ξ we get the interpretation above. p,q This said, it is clear that the first and second columns of E1 must vanish, since A is ∗ irreducible. In the middle column, we get q-forms over Tˆ with values in ker(∂ M − ∂ M ), ∗ ). which for a fixed ξ restricts to ker(DA ξ
Therefore, after taking the cohomologies of the rows, we are left with: 0 p,q
C1
Lp ((1,1 V )
0
↑ (∂ B + E) p 0 L1 ((1,0 V ⊕ (0,1 V ) 0 ↑ (E + ∂ B ) q↑0
→
p
L2 ((0 V )
(40)
0.
p
But this is just the complex that yields the monad description of the construction of doubly-periodic instantons in [14]. The lemma follows after taking the cohomology of the remaining column. Total cohomology and admissibility. Note that, as we pointed out in the beginning of this section, we still do not know if the Higgs pair (B, E) arising from the instanton (E, A) is admissible or not, i.e. the hypercohomology spaces H0 and H2 might be nontrivial. The next lemma deals with this problem. Lemma 9. The only nontrivial cohomology of the total complex is H 2 (C(p, q)), which is naturally isomorphic to the fibre E(e,0) . In particular, this shows that the Higgs pairs (B, E) obtained via Nahm transform on instanton connection A ∈ A(k,ξ0 ) are indeed admissible, see [14]. Proof. First note that we can regard an element in Cp,q as a (0, p)-form over T × C ∗ q ,q with values in ( ˆ1 2 (G). Since G|Tˆ(z,w) ≡ E(z,w) ⊗ Lz , ker∂ M and ker∂ M are nontrivial T only if z = e, the identity element in the group law of T . Hence, it is enough to work on a tubular neighbourhood of {e} × P1 (Tˆ \ {±ξ0 }). More precisely, we define another double complex (germ C)p,q , consisting of forms defined on arbitrary neighbourhoods of {e}×P1 ×(Tˆ \{±ξ0 }). Then we have a restriction map Cp,q → (germ C)p,q commuting with ∂ M , δ1 and δ2 . Such a map also induces an isomorphism between the total cohomologies of Cp,q and (germ C)p,q . So we can work with (germ C)p,q to prove the lemma.
Let Ve be some neighbourhood of e ∈ T . By the Poincaré lemma applied to ∂ T , we get:
p,q
(germ C)1
q↑
→
(2Ve (G) 0 0 ↑ (1Ve (G) 0 0 ↑ (0Ve (G) 0 0
(41)
p
where Ve denotes a tubular neighbourhood of Ne = {e} × P1 × (Tˆ \ {±ξ0 }) As in [6] (see pp. 91–92), the complex in the first row is, after restriction, mapped into a Koszul complex over Ne : (w ξ )
(−ξ,z)
ONe (G) −→ ONe (G) ⊕ ONe (G) → ONe (G) so that: E(e,0) 0 0
p,q
(germ C)2
q↑
0 0
00 00
(42)
→
p
It then follows from Lemmas 8 and 9 that there is a natural isomorphism of vector spaces $\mathrm{II} : H^1(C(e,0)) \equiv \check E_{(e,0)} \to E_{(e,0)}$, which in principle may depend on the choice of complex structure $I$ on $T\times\mathbb{C}$.

Matching $(\check E,\check A)$ with the original data. Since the choice of identity element in $T$ and of origin in $\mathbb{C}$ is arbitrary, we can extend $\mathrm{II}$ to a bundle isomorphism $\check E \to E$. More precisely, let $t_{(u,v)} : T\times\mathbb{C} \to T\times\mathbb{C}$ be the translation map $(z,w) \mapsto (z+u, w+v)$. Clearly, the connection $t^*_{(u,v)}A$ on the pullback bundle $t^*_{(u,v)}E$ is also irreducible and $t^*_{(u,v)}E_{(e,0)} \equiv E_{(u,v)}$. Computing the total cohomology of the double complex (38) associated to the bundle $t^*_{(u,v)}G$ (where $t^*_{(u,v)}$ acts trivially on the $\hat T$ coordinate), Lemmas 8 and 9 lead to an isomorphism of vector spaces $H^1(C(u,v)) \equiv \check E_{(u,v)} \to E_{(u,v)}$.

It is clear from the naturality of the constructions that these fibre isomorphisms fit together to define a holomorphic bundle isomorphism $\mathrm{II} : \check E \to E$. In particular, $\mathrm{II}$ takes the Dolbeault operator $\bar\partial_A$ of the holomorphic bundle $E \to T\times\mathbb{C}$ to the Dolbeault operator $\bar\partial_{\check A}$ of the holomorphic bundle $\check E \to T\times\mathbb{C}$. It also follows from this observation that the holomorphic extensions $\mathcal{E}$ and $\check{\mathcal{E}}$ must be isomorphic as holomorphic vector bundles.

However, this fact still does not guarantee that the connections $A$ and $\check A$ are gauge-equivalent. This is accomplished if we can show that $\mathrm{II}$ is actually independent of the choice of complex structure on $T\times\mathbb{C}$. Therefore, the proof of the main Theorem 1 is completed by the following proposition:
M. Jardim
Proposition 8. The bundle map $\mathrm{II} : \check E \to E$ is independent of the choice of complex structure on $T\times\mathbb{C}$.

Proof. Again, it is sufficient to consider only the fibre over $(e,0)$. As in [6], the idea is to present an explicit description of $\mathrm{II} : \check E_{(e,0)} \to E_{(e,0)}$, and then show that it is Euclidean invariant. Let $\alpha \in H^1(C(e,0)) \subset C^{1,1}$. To find $\mathrm{II}([\alpha])$ we have to find $\beta \in C^{0,2}$ such that $\bar\partial_M\beta = \delta_2\alpha$. A solution to this equation is provided by the Hodge theory for the $\bar\partial_M$ operator:
\[
\beta = G_M(\bar\partial_M^{*}\,\delta_2\alpha),
\]
where $G_M$ denotes the Green's operator for $\bar\partial_M^{*}\bar\partial_M$, which can be regarded fibrewise as the family of Green's operators $G_{A_\xi} = G_M|_{M_\xi}$ parametrised by $\xi \in \hat T\setminus\{\pm\xi_0\}$. In principle, $\beta$ depends on the complex structure $I$ via the operators $\bar\partial_M$ and $G_M$. However, by the Weitzenböck formula applied to the bundle $G$, we have:
\[
\bar\partial_M^{*}\bar\partial_M = \nabla_M^{*}\nabla_M .
\]
Here, $\nabla_M$ is the covariant derivative in the $T\times\mathbb{C}$ direction on $G$. With this interpretation, $G_M = (\nabla_M^{*}\nabla_M)^{-1}$ is seen to be independent of the complex structure $I$; in fact, it is Euclidean invariant. Now $\beta$, as an element of $C^{0,2}$, has the form $\beta(z,w;\xi)\,d\xi\,d\bar\xi$, so that the restriction $r_{(e,0)}(\beta) = \beta|_{\hat T_{(e,0)}}$ is a $(1,1)$-form over $\hat T\setminus\{\pm\xi_0\}$ with values in $E_{(e,0)}$. Take its cohomology class in $H^2(\hat T\setminus\{\pm\xi_0\}, \mathbb{C}\otimes E_{(e,0)})$, so that:
\[
\mathrm{II}([\alpha]) = \int_{\hat T_{(e,0)}} r_{(e,0)}(\beta),
\]
which is the desired explicit description.
Together with the work done in [14], we have thus proven Theorem 1.

7. Instantons of Higher Rank

One easily realizes that there is nothing really special about rank two bundles; the whole proof could be generalised to higher rank. Indeed, the only point in restricting to the rank two case is to reduce the number of possible vector bundles over an elliptic curve, and avoid a tedious case-by-case study throughout the various stages of the proof.

Before we can state the generalisation of the main Theorem 1, we must review our definitions of asymptotic state and irreducibility. The restriction of the instanton bundle $\mathcal{E} \to T\times\mathbb{P}^1$ to the added divisor $T_\infty$ is a flat $SU(n)$ bundle, i.e.
\[
\mathcal{E}|_{T_\infty} = L_{\xi_1}\oplus\cdots\oplus L_{\xi_k}
\quad\text{such that}\quad
\bigotimes_{l=1}^{k} L_{\xi_l} = \mathcal{O}_T .
\]
In other words, $\mathcal{E}|_{T_\infty}$ is determined by a set of points $(\xi_1,\dots,\xi_j) \in J(T)$ with multiplicities $(m_1,\dots,m_j)$, and such that $\sum_{l=1}^{j} m_l\,\xi_l = 0$. We call such data the generalised asymptotic state.
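The sum condition on a generalised asymptotic state can be checked mechanically. The following is only a minimal sketch under an assumption that is not part of the text: points of $J(T)$ are modelled as pairs of rational numbers read modulo 1, i.e. as elements of $(\mathbb{R}/\mathbb{Z})^2$. The function name `satisfies_sum_condition` is hypothetical.

```python
from fractions import Fraction


def satisfies_sum_condition(points, multiplicities):
    """Return True if sum_l m_l * xi_l = 0 in (R/Z)^2.

    Each point xi_l is a pair of rationals (a, b) read modulo 1.  This
    coordinate model of J(T) is an assumption made for illustration only.
    """
    if len(points) != len(multiplicities):
        raise ValueError("one multiplicity per point is required")
    total_a = sum(m * Fraction(a) for (a, _), m in zip(points, multiplicities))
    total_b = sum(m * Fraction(b) for (_, b), m in zip(points, multiplicities))
    # A rational number vanishes modulo 1 exactly when it is an integer.
    return Fraction(total_a).denominator == 1 and Fraction(total_b).denominator == 1


# Rank-two pattern: a point xi_0 and its inverse -xi_0, each with multiplicity 1.
xi0 = (Fraction(1, 3), Fraction(1, 5))
minus_xi0 = (Fraction(2, 3), Fraction(4, 5))
print(satisfies_sum_condition([xi0, minus_xi0], [1, 1]))  # True

# A single point of order two with multiplicity 2 also satisfies the condition.
print(satisfies_sum_condition([(Fraction(1, 2), Fraction(0))], [2]))  # True
```

In this model the rank-two asymptotic state $\pm\xi_0$ of the main theorem appears as the special case of two points with multiplicity one each, or of one point of order two with multiplicity two.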
Moreover, we will say that $(E,A)$ is 1-irreducible if there is no flat line bundle $L \to T\times\mathbb{C}$ such that $E$ admits a splitting $E'\oplus L$ which is compatible with the connection $A$.

Theorem 4. There is a bijective correspondence between the following objects:
– Gauge equivalence classes of 1-irreducible $SU(n)$-instantons over $T\times\mathbb{C}$ with fixed instanton number $k$ and generalised asymptotic state $(\xi_1,\dots,\xi_j)$ with multiplicities $(m_1,\dots,m_j)$;
– Admissible $U(k)$ solutions of Hitchin's equations over the dual torus $\hat T$, such that the Higgs field has at most simple poles at $\{\xi_1,\dots,\xi_j\}$; moreover, its residue at $\xi_j$ is semi-simple and has rank $\le m_j$.

8. The Instanton Spectral Data

Our goal now is to construct an algebraic curve $S \hookrightarrow \hat T\times\mathbb{C}$ associated to a doubly-periodic instanton $A$; let $\mathcal{E}$ be the associated instanton bundle. Let $D^*_{A_\xi(w)}$ denote the restriction of the coupled Dirac operator $D^*_{A_\xi}$ to the torus $T_w$. We define:
\[
S = \{(\xi,w) \in \hat T\times\mathbb{C} \mid \ker\{D^*_{A_\xi(w)}\} \neq 0\}.
\tag{43}
\]
Since $D^*_{A_\xi(w)} = \bar\partial^*_{A_\xi}|_{T_w} - \bar\partial_{A_\xi}|_{T_w}$, it is easy to see that:
\[
\ker\{D^*_{A_\xi(w)}\} = H^1(T_w, E(\xi)|_{T_w}) = H^1(T_w, \mathcal{E}(\xi)|_{T_w}).
\tag{44}
\]
Note also that $S$ can be compactified to a curve $S \hookrightarrow \hat T\times\mathbb{P}^1$ by adding the two points $(\pm\xi_0,\infty)$ corresponding to the asymptotic states. Assuming that the instanton bundle $\mathcal{E}$ is fibrewise semistable², we conclude that $S$ is a branched double cover of $\mathbb{P}^1$; the branch points correspond to those $w \in \mathbb{C}$ such that $\mathcal{E}|_{T_w}$ is an extension of a line bundle of order two by itself.

Lemma 10. If $\mathcal{E}$ is fibrewise semistable, the natural projection map $\pi_1 : S \to \hat T$ is a branched $k$-fold covering map. Furthermore, the projection $\pi_2 : S \to \mathbb{P}^1$ is a branched double covering map with $4k$ branch points, counted with multiplicity.

It follows that all spectral curves belong to the linear system $|k\cdot[\hat T] + 2\cdot[\mathbb{P}^1]| \subset \hat T\times\mathbb{P}^1$. Moreover, if $\mathcal{E}$ is regular, then $S$ is a smooth curve of genus $g(S) = 2k-1$, by the Riemann–Hurwitz formula.

Proof. The proof of the first statement is a simple application of the Riemann–Roch theorem for the family of Dolbeault operators $\bar\partial_w$ on $\mathcal{E}(\xi)|_{T_w}$, parametrised by $\mathbb{P}^1$. For generic $w \in \mathbb{P}^1$, $\dim\{\ker\bar\partial_w\} = 0$; this dimension jumps precisely when $\mathcal{E}(\xi)|_{T_w}$ is an extension of the trivial line bundle by itself. Thus the cardinality of $\pi_1^{-1}(\xi)$ coincides with the number of jumping points (counted with multiplicity).

² In general, $\mathcal{E}$ is only generically fibrewise semistable, so that $S$ may contain whole fibres.
From index theory, we know that the number of jumping points is precisely the first Chern class of the index bundle; therefore:
\[
\#(\text{jumping points}) = c_1(\mathrm{Ind}[\bar\partial_w])
= \int_{\mathbb{P}^1} \mathrm{ch}(\mathcal{E}(\xi))\cdot\mathrm{td}(K_T)/[T]
= -\int_{T\times\mathbb{P}^1} c_2(\mathcal{E}(\xi)) = -k.
\tag{45}
\]
This shows that $S$ is a $k$-fold covering of $\hat T$. Since the branch points of the projection $\pi_2$ are exactly the pre-images of the elements of order two in $\hat T$, there are $4k$ branch points, counted with multiplicity.

Line bundle with connection. Let $\pi_1 : \hat T\times\mathbb{P}^1 \to \hat T$ and $\pi_2 : \hat T\times\mathbb{P}^1 \to \mathbb{P}^1$ be the natural projection maps; we will also use $\pi_1$ and $\pi_2$ to denote the projections $S \to \hat T$ and $S \to \mathbb{P}^1$. To each $s \in S$, we attach the vector space:
\[
\mathcal{L}_s = \ker D^*_{A_{\pi_1(s)}(\pi_2(s))} = H^1\big(T_{\pi_2(s)}, E(\pi_1(s))|_{T_{\pi_2(s)}}\big).
\tag{46}
\]
If $\mathcal{E}$ is generically fibrewise semistable, then $\mathcal{L}$ is only a coherent sheaf on the (possibly singular, non-reduced) spectral curve. However, when the instanton bundle is regular, $\mathcal{L}$ becomes a line bundle over the smooth spectral curve.

So now let us assume that $A$ is a regular doubly-periodic instanton, and consider the bundle $\pi_1^*\mathcal{H} \to S$. There is a bundle map $T : \pi_1^*\mathcal{H} \to \mathcal{L}$, which is given by the following composition on each fibre:
\[
L^2_1\big(\Lambda^{0,1}E(\pi_1(s))\big) \xrightarrow{\;P\;} \ker D^*_{A_{\pi_1(s)}} \xrightarrow{\;r\;} \ker D^*_{A_{\pi_1(s)}(\pi_2(s))},
\tag{47}
\]
where $r$ denotes the restriction map. Let $\iota_{\mathcal{L}\to\mathcal{H}}$ denote the inclusion $\mathcal{L} \hookrightarrow \pi_1^*\mathcal{H}$, which makes sense in terms of distributions. A connection on the line bundle $\mathcal{L} \to S$ is defined by:
\[
\nabla = T\circ\pi_1^*d\circ\iota_{\mathcal{L}\to\mathcal{H}}.
\tag{48}
\]
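The genus count in Lemma 10 is a direct application of the Riemann–Hurwitz formula $2g(S)-2 = d\,(2g(B)-2) + b$ for a degree $d$ cover of a base curve $B$ with total ramification $b$. The sketch below simply evaluates this formula; the function names are hypothetical, and it assumes that "4k branch points, counted with multiplicity" means total ramification $4k$.

```python
def riemann_hurwitz_genus(degree, base_genus, ramification):
    """Genus g of a cover, solved from 2g - 2 = degree*(2*base_genus - 2) + ramification."""
    euler = degree * (2 * base_genus - 2) + ramification
    if euler % 2 != 0:
        raise ValueError("total ramification has the wrong parity")
    return euler // 2 + 1


def spectral_curve_genus(k):
    """Genus of S viewed as a double cover of P^1 with total ramification 4k."""
    return riemann_hurwitz_genus(degree=2, base_genus=0, ramification=4 * k)


def ramification_over_dual_torus(k):
    """Total ramification of the k-fold cover S -> dual torus (base genus 1).

    Since 2*1 - 2 = 0, the formula reduces to ramification = 2g(S) - 2.
    """
    return 2 * spectral_curve_genus(k) - 2


for k in (1, 2, 3):
    print(k, spectral_curve_genus(k), ramification_over_dual_torus(k))
```

For each $k$ the first function returns $2k-1$, in agreement with the lemma, and the same genus forces the $k$-fold cover of the torus to have total ramification $4k-4$.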
9. Hitchin's Spectral Data

We now look at the other side of the correspondence in Theorem 1 and review Hitchin's construction of spectral curves associated to Higgs bundles [12]. Recall that $V \to \hat T\setminus\{\pm\xi_0\}$ is a rank $k$ vector bundle, and $E$ is an endomorphism-valued $(1,0)$-form with simple poles at $\pm\xi_0$. So, for any fixed $\xi \in \hat T\setminus\{\pm\xi_0\}$, $E[\xi]$ is a $k\times k$ matrix and one can compute its $k$ eigenvalues. As we vary $\xi$, we get a $k$-fold covering, possibly branched, of $\hat T\setminus\{\pm\xi_0\}$ inside $\hat T\times\mathbb{C}$. This curve of eigenvalues is what we want to define as our Higgs spectral curve; more precisely:
\[
C = \{(\xi,w) \in \hat T\times\mathbb{C} \mid \det(E[\xi] - w\cdot I_k) = 0\}.
\tag{49}
\]
In other words, $C$ is the set of points $(\xi,w) \in \hat T\times\mathbb{C}$ such that $w$ is an eigenvalue of the endomorphism $E[\xi] : V_\xi \to V_\xi$. Since we are assuming that $E$ has simple poles at $\pm\xi_0$, the curve $C \hookrightarrow \hat T\times\mathbb{C}$ can be compactified to a curve $C \hookrightarrow \hat T\times\mathbb{P}^1$ by adding the points $(\pm\xi_0,\infty)$. The following proposition is a familiar fact from the theory of Higgs bundles.
Proposition 9. If $\xi_0 \neq -\xi_0$, the spectral curve associated to a generic Higgs bundle $(V,B,E)$ is smooth. Otherwise, if $\xi_0 = -\xi_0$, then all spectral curves have a double-point at $(\pm\xi_0,\infty)$, but are generically smooth elsewhere.

Defining the spectral bundle. As before, we will denote the projections $C \to \hat T$ and $C \to \mathbb{P}^1$ by $\pi_1$ and $\pi_2$. We define a coherent sheaf $\mathcal{N}$ on $C$ with stalks given by:
\[
\mathcal{N}_c = \mathrm{coker}\,\{E[\pi_1(c)] - \pi_2(c)\cdot\mathrm{Id}_k\},
\tag{50}
\]
i.e. the dual of the π2 (c)-eigenspace of E[π1 (c)]. Generically, one expects the eigenvalues to be distinct, so that N becomes a line bundle over the smooth curve C. Assuming that Higgs bundle (V , B, E) is generic, we define a connection ( on the line bundle N → C. First note that N is naturally a subbundle of π1∗ V ; let ιN →V be the inclusion and E : π1∗ V → N the fibrewise projection. We define: ∇( = E ◦ π1∗ ∇B ◦ ιN →V .
(51)
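The description of the Higgs spectral curve in (49) as the set of pairs $(\xi, w)$ with $w$ an eigenvalue of $E[\xi]$ can be illustrated numerically for $k = 2$. The sketch below is only a toy: the matrix-valued function `higgs_matrix` is a hypothetical stand-in for $E[\xi]$ in a local coordinate $\xi$, chosen for illustration, and it ignores both the periodicity of $\hat T$ and the poles at $\pm\xi_0$.

```python
import cmath


def higgs_matrix(xi):
    """Hypothetical 2x2 matrix standing in for E[xi] in a local coordinate xi."""
    return [[xi, 1.0], [xi * xi - 1.0, -xi]]


def eigenvalues_2x2(m):
    """Roots of the characteristic polynomial w^2 - tr(m) w + det(m)."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + disc) / 2, (tr - disc) / 2)


def char_poly(m, w):
    """Value of det(m - w * I_2), the defining function in (49) for k = 2."""
    return (m[0][0] - w) * (m[1][1] - w) - m[0][1] * m[1][0]


def spectral_points(xis):
    """Sample points (xi, w) of the toy spectral curve, two per value of xi."""
    return [(xi, w) for xi in xis for w in eigenvalues_2x2(higgs_matrix(xi))]


for xi, w in spectral_points([0.3 + 0.1j, 1.0, 2j]):
    # Each sampled pair satisfies det(E[xi] - w I) = 0 up to rounding error.
    print(xi, w, abs(char_poly(higgs_matrix(xi), w)))
```

For this toy matrix the characteristic polynomial is $w^2 + 1 - 2\xi^2$, so the two sheets of the covering meet exactly where $2\xi^2 = 1$; these are the branch points of the projection to the $\xi$ coordinate in the example.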
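10. Matching the Spectral Data

We are finally in a position to state and prove the second main result of this paper:

Theorem 5. If $(V,B,E)$ is the Nahm transform of a regular instanton $(E,A)$, then the instanton spectral data $(S,\mathcal{L},\nabla)$ is equivalent to the Higgs spectral data consisting of $C$, $\mathcal{N}$ and the connection (51), in the sense that the curves $S$ and $C$ coincide pointwise and there is a natural isomorphism $\mathcal{L} \to \mathcal{N}$ preserving the connections.

Proof. Clearly, both spectral curves already have the points $(\pm\xi_0,\infty)$ in common. So let $\xi \neq \pm\xi_0$ and suppose that $\alpha$ is an eigenvector of $E[\xi]$ with eigenvalue $\lambda < \infty$. In particular, the point $(\xi,\lambda) \in \hat T\times\mathbb{C}$ belongs to the Higgs spectral curve $C$. By definition, we have:
\[
E[\xi](\alpha) = \lambda\cdot\alpha \;\Rightarrow\; \alpha\otimes(s_0 - \lambda\cdot s_\infty) = 0.
\]
Clearly, $s = s_0 - \lambda\cdot s_\infty$ is a holomorphic section in $H^0(\mathbb{P}^1,\mathcal{O}_{\mathbb{P}^1}(1))$ vanishing at $\lambda \in \mathbb{P}^1\setminus\{\infty\}$. Therefore it induces the following exact sequence:
\[
0 \to \mathcal{E}(\xi) \xrightarrow{\;\otimes s\;} \mathcal{E}(\xi)\otimes\mathcal{O}_{\mathbb{P}^1}(1) \to \big(\mathcal{E}(\xi)\otimes\mathcal{O}_{\mathbb{P}^1}(1)\big)|_{T_\lambda} \to 0,
\]
which in turn induces the cohomology sequence:
\[
\begin{aligned}
0 &\to H^0\big(T_\lambda, (\mathcal{E}(\xi)\otimes\mathcal{O}_{\mathbb{P}^1}(1))|_{T_\lambda}\big) \to H^1(T\times\mathbb{P}^1, \mathcal{E}(\xi)) \xrightarrow{\;\otimes s\;} \\
&\to H^1\big(T\times\mathbb{P}^1, \mathcal{E}(\xi)\otimes\mathcal{O}_{\mathbb{P}^1}(1)\big) \xrightarrow{\;r\;} H^1\big(T_\lambda, (\mathcal{E}(\xi)\otimes\mathcal{O}_{\mathbb{P}^1}(1))|_{T_\lambda}\big) \to 0.
\end{aligned}
\tag{52}
\]
Now $H^0\big(T_\lambda, (\mathcal{E}(\xi)\otimes\mathcal{O}_{\mathbb{P}^1}(1))|_{T_\lambda}\big)$ is nonzero (since it contains $\alpha$), thus $H^1\big(T_\lambda, (\mathcal{E}(\xi)\otimes\mathcal{O}_{\mathbb{P}^1}(1))|_{T_\lambda}\big) = H^1(T_\lambda, \mathcal{E}(\xi)|_{T_\lambda})$ is also nonzero, since $h^1(T\times\mathbb{P}^1, \mathcal{E}(\xi)) = h^1\big(T\times\mathbb{P}^1, \mathcal{E}(\xi)\otimes\mathcal{O}_{\mathbb{P}^1}(1)\big) = k$. Therefore, $\ker\{D^*_{A_\xi(\lambda)}\}$ is also nonzero, since it can be identified with $H^1(T_\lambda, \mathcal{E}(\xi)|_{T_\lambda})$ (see (44)). Hence $(\xi,\lambda) \in \hat T\times\mathbb{C}$ also belongs to the instanton spectral curve $S$. The same argument provides the converse statement. Thus the curves $C$ and $S$ must coincide pointwise.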
It also follows from the cohomology sequence (52) that the dual of the $\lambda$-eigenspace of $E[\xi]$ is exactly $H^1\big(T_\lambda, (\mathcal{E}(\xi)\otimes\mathcal{O}_{\mathbb{P}^1}(1))|_{T_\lambda}\big) = H^1(T_\lambda, \mathcal{E}(\xi)|_{T_\lambda})$. In other words, there are canonical identifications between the fibres $\mathcal{N}_{(\xi,\lambda)}$ and $\mathcal{L}_{(\xi,\lambda)}$; therefore, the line bundles are isomorphic.

Finally, let us check that the two connections also coincide. Noting that the projection $E : \pi_1^*V \to \mathcal{N} = \mathcal{L}$ is just the restriction map $r : \ker D^*_{A_{\pi_1(s)}} \to \ker D^*_{A_{\pi_1(s)}(\pi_2(s))}$ on each $s \in S = C$, it is easy to see that $T = E\circ\pi_1^*P$. Therefore, we have:
\[
\nabla = T\circ\pi_1^*d\circ\iota_{\mathcal{N}\to\mathcal{H}}
= E\circ\pi_1^*P\circ\pi_1^*d\circ\iota_{V\to\mathcal{H}}\circ\iota_{\mathcal{N}\to V}
= E\circ\pi_1^*\nabla_B\circ\iota_{\mathcal{N}\to V},
\]
which is precisely the connection defined in (51).

Remark 2. More generally, the above argument shows that the pairs $(S,\mathcal{L})$ and $(C,\mathcal{N})$ also coincide (i.e. the curves $S$ and $C$ coincide pointwise, and $\mathcal{L}$ and $\mathcal{N}$ are isomorphic as coherent sheaves) when $\mathcal{E}$ is fibrewise semistable.

Remark 3. Cherkis and Kapustin used a similar argument to establish the analogous result for periodic monopoles [5]. More precisely, they considered monopoles on $S^1\times\mathbb{R}^2$, so that the Nahm transformed object is a Higgs pair on $S^1\times\mathbb{R}$. Each of these objects can be associated to a spectral pair consisting of an algebraic curve on $\mathbb{R}^2\times(\mathbb{R}^2\setminus\{0\})$ plus a line bundle over it. If the Higgs pair is the Nahm transform of a periodic monopole, Cherkis and Kapustin have shown that both spectral data coincide.

11. Conclusion

11.1. An analytical remark. The attentive reader might have noticed that the assumptions on doubly-periodic instantons used in this paper (namely extensibility) do coincide with the conclusions of the first paper of the series. However, it is important to point out at this stage the small gap remaining between the conclusions of the present paper and the assumptions in [14]. More precisely, we assumed in [14] that the harmonic metric associated with the Higgs pair $(B,E)$ on the bundle $V \to \hat T\setminus\{\pm\xi_0\}$ is non-degenerate along the kernel of the residues of $E$, and $h \sim O(r^{1\pm\alpha})$ along the image of the residues of $E$, for some $0 \le \alpha < 1/2$, in a holomorphic trivialisation of $V$ over a sufficiently small neighbourhood around $\pm\xi_0$. The gap is closed in [2], where it is shown that the Nahm transformed Higgs pairs constructed here do satisfy the above condition.

The analytical features of extensible doubly-periodic instantons are further studied by Olivier Biquard and the author in [2]. In particular, we show that if $|F_A| \sim O(r^{-2})$ then $A$ is extensible, and the asymptotic behaviour is completely determined.
Moreover, we give a deformation theory description of the moduli space of rank two doubly-periodic instanton connections as a hyperkähler manifold of complex dimension 4k − 2. It is also shown that the Nahm transform is a hyperkähler isometry between the moduli space of doubly-periodic instantons and the moduli space of singular Higgs pairs. 11.2. Relation with Fourier–Mukai transform. The instanton spectral pair (S, L) could also be constructed via Fourier–Mukai transform in the following way.
Let $F$ be a sheaf on $T\times\mathbb{P}^1$ and consider the diagram:
\[
T\times\mathbb{P}^1 \xleftarrow{\;\;O\;\;} T\times\hat T\times\mathbb{P}^1 \xrightarrow{\;\;\hat O\;\;} \hat T\times\mathbb{P}^1 .
\]
The Fourier–Mukai transform of $F$ is given by
\[
H(F) = R\hat O_*(O^*F\otimes P),
\]
where $P$ denotes the pullback of the Poincaré bundle from $T\times\hat T$ to $T\times\hat T\times\mathbb{P}^1$. If $F$ is torsion-free and generically fibrewise semistable, then $H(F)$ is a torsion sheaf on $\hat T\times\mathbb{P}^1$.

It is simple to show that if $F$ is locally-free and generically fibrewise semistable (as the instanton bundles considered in this paper are), then $H(F)$ is supported exactly over the spectral curve $S$, and the restriction of $H(F)$ to its support coincides with $\mathcal{L}$ [18]. Furthermore, $V = \pi_{1*}(R^1\hat O_*(O^*F\otimes P))$. A more careful study of doubly-periodic instantons from the point of view of their Fourier–Mukai transform is done in [17].

Therefore, the holomorphic version of the Nahm transform can be seen as a Fourier–Mukai transform composed with Hitchin's correspondence. However, the Nahm transform (and the spectral construction of Sect. 8) also contains some differential-geometric information (i.e. the instanton $A$, the transformed connection $B$, and the spectral connection $\nabla$) in addition to the holomorphic information encoded into the Fourier–Mukai transform.

Of course, such differential-geometric information is usually encoded into the holomorphic data in the form of a stability condition. Such a condition is well-known for Higgs bundles [11]. For doubly-periodic instantons, the appropriate concept of stability for the corresponding instanton bundles is established in [2]. It is less clear, though, what stability condition should be imposed on the spectral pairs $(S,\mathcal{L})$; such a question is addressed in [18] in a more general context.

Acknowledgement. This work is part of my Ph.D. project [13], which was funded by CNPq, Brazil. I am grateful to my supervisors, Simon Donaldson and Nigel Hitchin, for their constant support and guidance.
I also thank Brian Steer and Olivier Biquard for valuable suggestions in the later stages of this project.
References

1. Arfken, G.: Mathematical methods in physics. London, New York: Academic Press, 1966
2. Biquard, O., Jardim, M.: Asymptotic behaviour and the moduli space of doubly-periodic instantons. J. Eur. Math. Soc. 3, 335–375 (2001)
3. Bott, R., Tu, L.: Differential forms in algebraic topology. New York: Springer-Verlag, 1982
4. Braam, P., van Baal, P.: Nahm's transform for instantons. Commun. Math. Phys. 122, 267–280 (1989)
5. Cherkis, S., Kapustin, A.: Nahm Transform for Periodic Monopoles and N = 2 Super Yang–Mills Theory. Commun. Math. Phys. 218, 333–371 (2001)
6. Donaldson, S., Kronheimer, P.: Geometry of four-manifolds. Oxford: Clarendon Press, 1990
7. García Pérez, M., González-Arroyo, A., Pena, C., van Baal, P.: Nahm dualities on the torus – a synthesis. Nucl. Phys. B564, 159–181 (2000)
8. Gradshteyn, I., Ryzhik, I., Jeffrey, A. (ed.): Table of integrals, products and series. Boston: Academic Press, 1994
9. Gromov, M., Lawson, H.: Positive scalar curvature and the index of the Dirac operator on complete Riemannian manifolds. Inst. des Hautes Études Scientifiques Publ. Math. 58, 295–408 (1983)
10. Hitchin, N.: Construction of monopoles. Commun. Math. Phys. 89, 145–190 (1983)
11. Hitchin, N.: The self-duality equations on a Riemann surface. Proc. London Math. Soc. 55, 59–126 (1987)
12. Hitchin, N.: Stable bundles and integrable systems. Duke Math. J. 54, 91–114 (1987)
13. Jardim, M.: Nahm transform for doubly-periodic instantons. Ph.D. thesis, Oxford 1999. Available at math.DG/9912028
14. Jardim, M.: Construction of doubly-periodic instantons. Commun. Math. Phys. 216, 1–15 (2001)
15. Jardim, M.: Nahm transform for doubly-periodic instantons. Preprint math.DG/9910120
16. Jardim, M.: Spectral curves and Nahm transform for doubly-periodic instantons. Preprint math.AG/9909146
17. Jardim, M.: Classification and existence of doubly-periodic instantons. Preprint math.DG/0108004
18. Jardim, M., Maciocia, A.: A Fourier–Mukai approach to spectral data for instantons. Preprint math.AG/0006054
19. Kapustin, A., Sethi, S.: Higgs branch of impurity theories. Adv. Theor. Math. Phys. 2, 571–592 (1998)
20. Watson, G.N.: A treatise on the theory of Bessel functions. Cambridge: Cambridge University Press, 1995
Communicated by R. Dijkgraaf