Commun. Math. Phys. 296, 1–33 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1009-8
Communications in
Mathematical Physics
Phase Transition in a Vlasov-Boltzmann Binary Mixture

R. Esposito¹, Y. Guo², R. Marra³

¹ Dipartimento di Matematica pura ed applicata, Università dell'Aquila, Coppito, 67100 L'Aquila, Italy. E-mail: [email protected]
² Division of Applied Mathematics, Brown University, Providence, RI 02812, U.S.A. E-mail: [email protected]
³ Dipartimento di Fisica and Unità INFN, Università di Roma Tor Vergata, 00133 Roma, Italy. E-mail: [email protected]

Received: 5 April 2009 / Accepted: 24 November 2009
Published online: 19 February 2010 – © Springer-Verlag 2010
Abstract: There are not many kinetic models where it is possible to prove bifurcation phenomena for any value of the Knudsen number. Here we consider a binary mixture over a line with collisions and long range repulsive interaction between different species. It undergoes a segregation phase transition at sufficiently low temperature. The spatially homogeneous Maxwellian equilibrium corresponding to the mixed phase, minimizing the free energy at high temperature, changes into a maximizer when the temperature goes below a critical value, while non homogeneous minimizers, corresponding to coexisting segregated phases, arise. We prove that they are dynamically stable with respect to the Vlasov-Boltzmann evolution, while the homogeneous equilibrium becomes dynamically unstable.
1. Introduction and Main Results The phenomenon of phase transition in a thermodynamic system is usually described by the arising of multiple minimizers of the free energy. Namely, when the temperature is lowered below a certain critical value, the unique equilibrium minimizer becomes a local maximizer and new minimizers appear. This is interpreted as a loss of stability of the old minimizer and the birth of new stable states. Here the meaning of stability is merely related to the free energy minimizing properties of the state. Of course it would be desirable to have a more detailed, dynamical analysis of the stability properties involved in such a phenomenon of phase transition. This issue is, generally speaking, very difficult, because the natural dynamics is a many-component one, whose detailed understanding is very far from being attainable. A simpler but, in our opinion, still interesting problem is to study this kind of behavior at the kinetic level, where, instead of a system of a huge number of particles, one has to study a partial differential equation for the probability distribution of particles on the one particle phase space. This is the problem we want to address in this paper.
The standard kinetic model of a rarefied gas undergoing collisions (Boltzmann equation), however, describes essentially the ideal gas, which does not exhibit phase transitions. Following van der Waals, it is convenient to include a long range attractive interaction between particles to see a vapor-liquid transition. With the introduction of such an interaction, new problems arise, due to the fact that nothing prevents the system from collapsing (Statistical Mechanics instability). The van der Waals approach of adding a hard core interaction is not easy to handle and more complicated many body long range interactions have been introduced (see [13,15]) to handle this in the framework of the equilibrium Statistical Mechanics. While one could in principle try to follow the approach in [13], we find it easier to consider a different kinetic model introduced a few years ago in [2], where there is no attractive interaction and the Statistical Mechanics instability does not arise. The model consists of two species of particles which, for simplicity, have the same mass. To fix the ideas, think of them as distinguished just by their color, say red and blue. Their short range interaction is modeled by Boltzmann-like collisions which are color blind, while the long range repulsive interaction, arising only between particles of different color, is modeled by a Vlasov force with a smooth, bounded, finite range potential. We refer to [1–4] for more information on this model.

For any $t \in \mathbb{R}^+$, let the non negative functions $f_i(t,x,\xi)$, $i=1,2$, denote the probability densities of finding a particle of species 1 (red) or 2 (blue) in a cell of the phase space around the point $(x,\xi)$ at time $t$. In this paper we will only consider the case $x \in \mathbb{R}$, the real line, while the velocity $\xi=(v,\zeta)\in\mathbb{R}^3$ with $\zeta\in\mathbb{R}^2$. The time evolution is governed by

$$\partial_t f_1 + v\,\partial_x f_1 + F(f_2)\,\partial_v f_1 = Q(f_1, f_1+f_2), \qquad \partial_t f_2 + v\,\partial_x f_2 + F(f_1)\,\partial_v f_2 = Q(f_2, f_1+f_2). \tag{1.1}$$

Here the Vlasov force $F(h)$ due to the mass distribution $h$ is defined as

$$F(h)(t,x) = -\partial_x \int_{\mathbb{R}} dy\, U(|x-y|) \int_{\mathbb{R}^3} d\xi\, h(t,y,\xi), \tag{1.2}$$

where $U(r)$, the interaction potential, is non negative, smooth, bounded, with $U(r)=0$ for $r\ge 1$ and $\int_{\mathbb{R}} dx\, U(|x|) = 1$.

The collision integral is defined as

$$Q(h_1,h_2)(\xi) = \int_{\mathbb{R}^3\times S^2} |(\xi-\xi')\cdot\omega|\,\{h_1(\xi_*)h_2(\xi'_*) - h_1(\xi)h_2(\xi')\}\,d\xi'\,d\omega := Q^{\mathrm{gain}}(h_1,h_2) - Q^{\mathrm{loss}}(h_1,h_2), \tag{1.3}$$

with $S^2=\{\omega\in\mathbb{R}^3 \mid |\omega|=1\}$ and $\xi_*$, $\xi'_*$ related to $\xi$, $\xi'$ by the usual elastic collision relations

$$\xi_* = \xi - \omega[\omega\cdot(\xi-\xi')], \qquad \xi'_* = \xi' + \omega[\omega\cdot(\xi-\xi')]. \tag{1.4}$$
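The collision relations (1.4) conserve the total momentum and the total kinetic energy of the colliding pair, which is what makes the Maxwellians below equilibria of the collision term. The following is a minimal numerical sketch of that fact, not part of the paper: the function names and the sampled velocities are illustrative assumptions.

```python
import math
import random

def collide(xi, xi_p, omega):
    """Apply the elastic collision relations (1.4) to the pair (xi, xi')."""
    # s = omega . (xi - xi'), the relative velocity component along omega.
    s = sum(o * (a - b) for o, a, b in zip(omega, xi, xi_p))
    xi_star = tuple(a - o * s for a, o in zip(xi, omega))
    xi_p_star = tuple(b + o * s for b, o in zip(xi_p, omega))
    return xi_star, xi_p_star

def random_unit_vector(rng):
    """Draw a point of S^2 by normalizing a Gaussian vector (illustrative helper)."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(3)]
        n = math.sqrt(sum(c * c for c in g))
        if n > 1e-12:
            return tuple(c / n for c in g)

def momentum(a, b):
    """Total momentum of two unit-mass particles."""
    return tuple(x + y for x, y in zip(a, b))

def energy(a, b):
    """Total kinetic energy of two unit-mass particles."""
    return 0.5 * (sum(x * x for x in a) + sum(y * y for y in b))

# Arbitrary sample data (assumption: any velocities and any unit omega would do).
rng = random.Random(0)
xi = tuple(rng.uniform(-2.0, 2.0) for _ in range(3))
xi_p = tuple(rng.uniform(-2.0, 2.0) for _ in range(3))
omega = random_unit_vector(rng)
xi_star, xi_p_star = collide(xi, xi_p, omega)
```

Comparing `momentum(xi, xi_p)` with `momentum(xi_star, xi_p_star)`, and likewise for `energy`, shows agreement up to rounding error.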
For any $\beta>0$, the couple $(a_1\mu_\beta, a_2\mu_\beta)$, with $\mu_\beta$ the spatially homogeneous Maxwellian at temperature $\beta^{-1}$,

$$\mu_\beta(\xi) = \left(\frac{\beta}{2\pi}\right)^{3/2} e^{-\beta|\xi|^2/2}, \tag{1.5}$$

and $a_i>0$, is an equilibrium solution and the most general homogeneous equilibrium differs from it just for rescaling and centering. However, due to the presence of the Vlasov force, non homogeneous Maxwellian equilibria are possible and they are of the form

$$f_i(x,\xi) = \rho_i(x)\,\mu_\beta(\xi), \tag{1.6}$$

provided that the densities $\rho_i(x)>0$ satisfy the conditions

$$\ln\rho_i + \beta\, U*\rho_{i+1} = C_i, \qquad i=1,2. \tag{1.7}$$
Here and in the rest of the paper the label $i+1$ means 2 if $i=1$ and 1 if $i=2$, and the convolution product $*$ is defined by $a*b(x) = \int_{\mathbb{R}} dy\, a(|x-y|)\,b(y)$. The constants $C_i$ have the physical meaning of chemical potentials. The phase transition phenomenon we want to discuss occurs when $C_1=C_2$, a situation where the equilibrium solutions are symmetric under the exchange $1\leftrightarrow 2$. This symmetry is spontaneously broken below the critical temperature. Therefore we assume $C_1=C_2$ in the rest of the paper. If $\beta$ is suitably small, conditions (1.7) are satisfied only by constant $\rho_i$'s. On the other hand, if $\beta$ is sufficiently large, non constant $\rho_i$ solving (1.7) can be constructed. To understand the arising of multiple equilibria, let us start by defining the local free energy $\varphi(\rho_1,\rho_2)$ on $\mathbb{R}^+\times\mathbb{R}^+$ (i.e. the free energy density when $U$ is replaced by a $\delta$-function) as

$$\varphi(\rho_1,\rho_2) = \rho_1\ln\rho_1 + \rho_2\ln\rho_2 + \beta\rho_1\rho_2. \tag{1.8}$$
Let $\rho=\rho_1+\rho_2$. It is shown in [4] (in a slightly more general context) that, if $\beta\rho<2$, then the only stationary points for $\varphi$ are characterized by $\rho_1=\rho_2=\rho/2$ and they are minimizers for $\varphi$. On the other hand, if $\beta\rho>2$, then there are $\rho^+>\rho^->0$ such that

$$(\rho_1,\rho_2) = (\rho^+,\rho^-) \tag{1.9}$$

(and $(\rho_1,\rho_2)=(\rho^-,\rho^+)$ by symmetry) is an absolute minimizer for $\varphi$, while $(\rho_1,\rho_2)=(\rho/2,\rho/2)$ is a local maximizer for $\varphi$. When $(a_1,a_2)$ is chosen equal to $(\rho^+,\rho^-)$, the equilibrium state is interpreted as a pure phase rich in red particles, while the state $(a_1,a_2)=(\rho^-,\rho^+)$ corresponds to a phase rich in blue particles.

Non-homogeneous solutions become relevant when one has to describe a situation of phase coexistence. The idea to construct non-homogeneous solutions is to observe that at low temperature, in order to minimize the local free energy, the system has to be in a pure phase at infinity. A situation with phase coexistence should arise when the system is in the blue-rich phase at $-\infty$ and in the red-rich phase at $+\infty$ or vice versa. Therefore one looks at the minimizers of a suitable free energy functional with the above constraints. Indeed, the conditions (1.7) are the Euler-Lagrange equations for the free energy functional

$$\mathcal{F}(\rho_1,\rho_2) = \int_I dx\,(\rho_1\ln\rho_1 + \rho_2\ln\rho_2) + \beta\int_I dx\int_I dy\, U(|x-y|)\,\rho_1(x)\rho_2(y) \tag{1.10}$$

on a finite interval $I$ with periodic boundary conditions. Moreover, it can be shown (see [3]) that when $\beta$ is large, with the masses of the two species fixed, there are non constant couples $(\rho_1,\rho_2)$ giving lower values to the free energy than the constants. Over the real line $-\infty<x<\infty$, the above free energy does not make sense and a more careful definition is required: let $(-\ell,\ell)$ be a bounded interval and set

$$\mathcal{F}_{(-\ell,\ell)}(\rho_1,\rho_2) = \int_{(-\ell,\ell)} dx\,\varphi(\rho_1,\rho_2) + \frac{\beta}{2}\int_{(-\ell,\ell)^2} dx\,dy\, U(|x-y|)\,[\rho_1(x)-\rho_1(y)][\rho_2(y)-\rho_2(x)].$$

We define the excess free energy as

$$\hat{\mathcal{F}}(\rho_1,\rho_2) := \lim_{\ell\to\infty}\left[\mathcal{F}_{(-\ell,\ell)}(\rho_1,\rho_2) - 2\ell\,\varphi(\rho^+,\rho^-)\right]. \tag{1.11}$$
Note that, since $\varphi(\rho^+,\rho^-)=\varphi(\rho^-,\rho^+)$, the functional $\hat{\mathcal{F}}$ is $+\infty$ when $\rho=(\rho_1,\rho_2)$ does not go to pure phases at infinity. Since we are interested in the coexistence of phases, we assume that $(\rho_1,\rho_2)$ satisfy the conditions $\lim_{x\to\pm\infty}\rho_1(x)=\rho^\pm$ and $\lim_{x\to\pm\infty}\rho_2(x)=\rho^\mp$. In [4] several results are proved about the minimizers for the functional $\hat{\mathcal{F}}$, which are summarized in the following theorem (see also [5]):

Theorem 1.1. Let $\beta\rho>2$. Then there exists a unique (up to translations) positive minimizer (front) for the one-dimensional excess free energy $\hat{\mathcal{F}}$, defined by (1.11), in the class of continuous functions $\rho=(\rho_1(x),\rho_2(x))$ such that $\lim_{z\to\pm\infty}\rho_1(z)=\rho^\pm$, $\lim_{z\to\pm\infty}\rho_2(z)=\rho^\mp$, where $\rho^\pm$ are defined in (1.9). We denote by $\bar\rho=(\bar\rho_1(x),\bar\rho_2(x))$ the unique minimizer such that $\bar\rho_1(x)=\bar\rho_2(-x)$. $\bar\rho_1$ is monotone increasing and $\bar\rho_2$ is monotone decreasing, and $\rho^-<\bar\rho_i(x)<\rho^+$ for any $x\in\mathbb{R}$. Moreover, the front $\bar\rho$ is $C^\infty(\mathbb{R})$-smooth and satisfies the Euler-Lagrange equations (1.7); its derivative $\bar\rho'$ satisfies the equations

$$\bar\rho_i' + \beta\,\bar\rho_i\,(U*\bar\rho_{i+1}') = 0, \qquad i=1,2. \tag{1.12}$$

The front $\bar\rho$ converges to its asymptotic values exponentially fast, in the sense that there is $\alpha>0$ such that

$$|\bar\rho_1(x)-\rho^\mp|\,e^{\alpha|x|}\to 0 \ \text{ as } x\to\mp\infty, \qquad |\bar\rho_2(x)-\rho^\pm|\,e^{\alpha|x|}\to 0 \ \text{ as } x\to\mp\infty.$$

Finally, the derivatives of $\bar\rho_i$ of any order vanish at infinity exponentially fast and $\bar\rho'$ is odd in the sense that

$$\bar\rho_1'(x) = -\bar\rho_2'(-x). \tag{1.13}$$
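The equation (1.12) for the derivative of the front is the $x$-derivative of the Euler-Lagrange equations (1.7). A short derivation is sketched here, with a prime denoting $d/dx$ and assuming that the derivative can be moved onto the second factor of the convolution, which is legitimate for a smooth, compactly supported $U$ and a smooth bounded front:

```latex
\begin{align*}
0 &= \frac{d}{dx}\Big(\ln\bar\rho_i + \beta\,U*\bar\rho_{i+1}\Big)
   = \frac{\bar\rho_i'}{\bar\rho_i} + \beta\,U*\bar\rho_{i+1}' , \\
\text{hence}\quad
0 &= \bar\rho_i' + \beta\,\bar\rho_i\,\big(U*\bar\rho_{i+1}'\big), \qquad i=1,2 .
\end{align*}
```

The oddness relation (1.13) follows in the same way by differentiating the symmetry $\bar\rho_1(x)=\bar\rho_2(-x)$.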
The aim of this paper is to show that for β larger than a certain critical value, the non-homogeneous equilibria, the front solutions (ρ¯1 , ρ¯2 ) are dynamically stable with respect to the evolution (1.1), while the homogeneous mixed phase is unstable. This will show that a complete bifurcation scenario arises in this model for any value of the Knudsen number, i.e. the ratio between the mean free path and the range of the interaction potential. In the rest of the paper, without loss of generality we fix the asymptotic total density ρ = ρ + + ρ − = 2. If β < 1, the only minimizer is the homogeneous equilibrium (mixed phase) Mhom (ξ ) = (µβ (ξ ), µβ (ξ )). On the other hand, if β > 1, the pure phases Mred = (ρ + µβ (ξ ), ρ − µβ (ξ )) and Mblue = (ρ − µβ (ξ ), ρ + µβ (ξ )) are constant minimizers and Mhom is a maximizer. Moreover, we have the non homogeneous equilibrium Mρ¯ = (ρ¯1 (x)µβ (ξ ), ρ¯2 (x)µβ (ξ )), with ρ¯1 (x) + ρ¯2 (x) → 2 as x → ±∞.
(1.14)
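With the normalization $\rho=2$, the threshold $\beta\rho>2$ becomes $\beta>1$, and the pure-phase densities can be computed from the local free energy (1.8). The sketch below is only an illustration under that normalization: writing $\rho_1=1+m$, $\rho_2=1-m$, stationarity of $\varphi$ in $m$ reduces to $m=\tanh(\beta m)$, which is solved here by bisection. The function names and the tolerance are assumptions, not taken from the paper.

```python
import math

def local_free_energy(rho1, rho2, beta):
    """Local free energy (1.8): rho1 ln rho1 + rho2 ln rho2 + beta rho1 rho2."""
    return rho1 * math.log(rho1) + rho2 * math.log(rho2) + beta * rho1 * rho2

def pure_phase_densities(beta, tol=1e-12):
    """Return (rho_plus, rho_minus) with rho_plus + rho_minus = 2.

    With rho1 = 1 + m and rho2 = 1 - m, d(phi)/dm = 2 atanh(m) - 2 beta m,
    so stationary points solve m = tanh(beta m). For beta <= 1 the only
    root is m = 0, i.e. the mixed phase (1, 1).
    """
    if beta <= 1.0:
        return 1.0, 1.0
    g = lambda m: math.tanh(beta * m) - m
    lo, hi = 1e-9, 1.0  # g(lo) > 0 and g(hi) < 0 when beta > 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    m = 0.5 * (lo + hi)
    return 1.0 + m, 1.0 - m
```

For instance, `pure_phase_densities(2.0)` returns a pair with $\rho^+>1>\rho^->0$ whose local free energy is lower than that of the mixed phase $(1,1)$, and which satisfies (1.7) with constant densities and $C_1=C_2$, since $U*\rho=\rho$ for a constant $\rho$.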
We assume that the initial datum for the dynamics (1.1) is a small perturbation of one of the above equilibria, denoted generically by $M$:

$$f_i(0) = M_i + \sqrt{M_i}\,g_i(0), \tag{1.15}$$

with $g_i$ sufficiently small in a sense that will be specified later. Let $f(t,x,\xi)$ be the solution to (1.1) and define $g(t,x,\xi)$ by setting

$$f_i(t,x,\xi) = M_i(x,\xi) + \sqrt{M_i(x,\xi)}\,g_i(t,x,\xi). \tag{1.16}$$

The equation for the perturbation $g$ is

$$\Big[\partial_t + v\,\partial_x + F\big(M_{i+1}+\sqrt{M_{i+1}}\,g_{i+1}\big)\partial_v + \nu(x,\xi)\Big]g_i = \beta F\big(\sqrt{M_{i+1}}\,g_{i+1}\big)\sqrt{M_i}\,v + \sum_{j=1}^{2}K_{i,j}\,g_j + F\big(\sqrt{M_{i+1}}\,g_{i+1}\big)g_i\,v + \Gamma(g_i,g_i) + \Gamma(g_i,g_{i+1}), \tag{1.17}$$

where we have used the notation:

$$\nu(x,\xi) = \int_{\mathbb{R}^3\times S^2}|(\xi-\xi')\cdot\omega|\,\big[M_1(\xi')+M_2(\xi')\big]\,d\xi'\,d\omega,$$

$$K_{i,i}\,g_i = \frac{1}{\sqrt{M_i}}\,Q^{\mathrm{gain}}\big(\sqrt{M_i}\,g_i,\,M_1+M_2\big) + \frac{1}{\sqrt{M_i}}\,Q\big(M_i,\,\sqrt{M_i}\,g_i\big), \tag{1.18}$$

$$K_{i,i+1}\,g_{i+1} = \frac{1}{\sqrt{M_i}}\,Q\big(M_i,\,\sqrt{M_{i+1}}\,g_{i+1}\big),$$

$$\Gamma(g_i,g_j) = \frac{1}{\sqrt{M_i}}\,Q\big(\sqrt{M_i}\,g_i,\,\sqrt{M_j}\,g_j\big).$$

We will need the following symmetry condition on the initial data:

$$g_1(0,x,v,\zeta) = g_2(0,-x,-v,\zeta). \tag{1.19}$$
We note that such a property is preserved by the time evolution. It plays a crucial role in removing the obvious orbital instability of our problem related to the translation invariance: indeed such an invariance is explicitly broken when condition (1.19) is assumed.

The momentum, kinetic energy and particle number of each species are conserved during collisions, while the sum of potential and kinetic energy is conserved along the trajectories. Therefore the following quantities are conserved under the evolution (1.1):

– The masses of the perturbation $g$,

$$\mathcal{M}_i(g) = \int_{\mathbb{R}}dx\int_{\mathbb{R}^3}d\xi\,\big(f_i(x,\xi)-M_i(x,\xi)\big), \qquad i=1,2;$$

– The energy of the perturbation $g$,

$$\mathcal{E}(g) = \int_{\mathbb{R}}dx\int_{\mathbb{R}^3}d\xi\,\frac{\xi^2}{2}\big[(f_1+f_2)-(M_1+M_2)\big] + \int_{\mathbb{R}}dx\int_{\mathbb{R}}dy\,U(|x-y|)\big[\rho_{f_1}(x)\rho_{f_2}(y)-\rho_{M_1}(x)\rho_{M_2}(y)\big], \tag{1.20}$$

with $f_i = M_i+\sqrt{M_i}\,g_i$ and

$$\rho_f(x) = \int_{\mathbb{R}^3}d\xi\,f(x,\xi), \tag{1.21}$$

the spatial density of the distribution $f$.

Moreover, by standard arguments it follows that the $H$-function of the perturbation $g$,

$$H(g) = \int_{\mathbb{R}}dx\int_{\mathbb{R}^3}d\xi\,\Big[\big(f_1\ln f_1 + f_2\ln f_2\big) - \big(M_1\ln M_1 + M_2\ln M_2\big)\Big](x,\xi), \tag{1.22}$$

does not increase in the evolution (1.1). We notice that, since the spatial domain is $\mathbb{R}$, the total masses, energy and $H$-function of the distribution $f$ are not well defined, while the above differences are finite. Since $H(g)$ does not increase during the evolution and $\mathcal{E}(g)$ and the total masses $\mathcal{M}_i(g)$ are constant, any linear combination of them, with a positive coefficient for the entropy, does not increase. In particular, the following non-increasing entropy-energy functional is crucial to study the stability of the equilibria:

$$\mathcal{H}(g) = H(g) + \beta\,\mathcal{E}(g) - \sum_{i=1}^{2}\mathcal{M}_i(g)\left[C_i + 1 + \ln\left(\frac{\beta}{2\pi}\right)^{3/2}\right]. \tag{1.23}$$
The factors multiplying the masses are suitably chosen to cancel some linear terms, as will be shown in the next section. The factor $\beta$ in front of the energy is dictated by the free energy minimizing properties of the equilibria. In the next sections we shall use the following weighted norm for $p\in[1,+\infty]$:

$$\|f\|_{L^p_w} = \left(\int_{\mathbb{R}}dx\int_{\mathbb{R}^3}d\xi\,|w(\xi)f(x,\xi)|^p\right)^{1/p},$$

for some positive weight function $w$. Moreover, $L^p_w$ denotes the space of the measurable functions on $\mathbb{R}\times\mathbb{R}^3$ with $\|\cdot\|_{L^p_w}$ finite. When the notation is unambiguous, we will also omit the index $w$. Finally, $\nabla_{x,v}$ denotes the couple $(\partial_x,\partial_v)$.

Theorem 1.2 (Stability). Assume $\beta>1$ and $M=M_{\bar\rho}$. Let $w(\xi)=(\Sigma+|\xi|^2)^\gamma$, for $\Sigma>0$ and $\gamma>3/2$. There are $\delta>0$ and $\Sigma>0$, $C>0$ such that, if the initial datum $f_i(0)=M_i+\sqrt{M_i}\,g_i(0)$ satisfies the symmetry condition (1.19), the bound

$$\|wg(0)\|_{L^\infty} + \sqrt{\mathcal{H}(g(0))} < \delta$$

and $\|\nabla_{x,v}g(0)\|_{L^2}<+\infty$, then the initial value problem for (1.17) has a unique global in time solution with

$$\sup_{0\le t\le\infty}\|wg(t)\|_{L^\infty} \le C\Big\{\|wg(0)\|_{L^\infty} + \sqrt{\mathcal{H}(g(0))}\Big\}, \tag{1.24}$$

$$\|\nabla_{x,v}g(t)\|_{L^2} \le e^{Ct}\,\|\nabla_{x,v}g(0)\|_{L^2}. \tag{1.25}$$
Remark 1.1. Theorem 1.2 implies the stability of the non homogeneous equilibrium solution Mρ¯ in the Vlasov-Boltzmann evolution (1.1), with respect to initial perturbations satisfying (1.19). It turns out that H (g(0)) > 0 for wg(0) L ∞ small (see Lemma 2.2).
Dynamical stability and rate of convergence for the same front solutions have been established in [5] for the Vlasov-Fokker-Planck dynamics: the Vlasov force is the same introduced here, but the collisions are replaced by a Fokker-Planck operator modeling the contact with a reservoir at inverse temperature $\beta$. A careful analysis of the macroscopic equation plays an important role. To prove nonlinear stability, the first major mathematical difficulty we encounter is the presence of a large amplitude potential $U$. To our knowledge, so far there has been no published work on the stability result in the presence of a large external field in the Boltzmann theory. The main problem is the collapse of Sobolev estimates in higher order energy norms. Indeed, even upon taking one $x$-derivative, the $H^1$-norm might actually grow in time due to the presence of the term $\partial_x F(M_{i+1})\,\partial_v g_i$ in (4.14). We are thus forced to design a strategy of proof based on a weighted $L^\infty$ formulation without any derivatives and get control of derivatives only afterwards. Furthermore, unlike the previous linear Fokker-Planck interaction, the nonlinear Boltzmann collisions make it difficult to analyze the equation for the projection on the hydrodynamical modes in this case. In fact, even an $L^2$ stability of $g$ is difficult to obtain directly from the analysis of Eq. (1.17). Instead, we avoid a direct study of Eq. (1.17) and make crucial use of the fundamental entropy-energy $\mathcal{H}(g)$ estimate to obtain a mixed $L^1$-$L^2$ type of stability estimate, based on the spectral gap of the linearized free energy operator $A$ (Lemma 2.2). We note that the spectral gap estimate relies in an essential way on the minimizing properties of the front solution and the symmetry condition is used to control the part of the solution in the null space of the operator $A$.
We then bootstrap such an $L^2$ stability to an $L^\infty$ estimate to obtain a pointwise stability estimate, by following the curved trajectory induced by the force field $F(M_{i+1}+\sqrt{M_{i+1}}\,g_{i+1})$. The success of such a strategy is somewhat surprising: the perturbation of the field does not need to decay in time, and all the analysis over the one dimensional trajectory is carried out in a finite time interval $[0,T_0]$ (Lemma 4.1). Yet this is still sufficient for the global in time estimate due to the strong exponential time decay from the collision frequency.

Remark 1.2. The same result also holds when perturbing initially the equilibria $M_{\mathrm{red}}$ and $M_{\mathrm{blue}}$. Indeed the proof is even simpler in this case, because the analog of the operator $A$ has again a spectral gap property, but its null space is trivial. Due to this, the symmetry assumption (1.19) is no longer needed.

We next discuss the homogeneous equilibrium $M_{\mathrm{hom}}=(\mu_\beta,\mu_\beta)$.

Theorem 1.3. Assume $\beta<1$ and $M=M_{\mathrm{hom}}$. There are $\delta>0$ and $\Sigma>0$, $C>0$ such that, if the initial datum $f_i(0)=M_i+\sqrt{M_i}\,g_i(0)$ satisfies the bound

$$\|wg(0)\|_{L^\infty} + \sqrt{\mathcal{H}(g(0))} < \delta$$

and $\|\nabla_{x,v}g(0)\|_{L^2}<+\infty$, then the initial value problem for (1.17) has a unique global in time solution satisfying the estimates (1.24) and (1.25).

Remark 1.3. This implies the stability of the mixed phase for $\beta<1$. The proof is again a simpler version of the one for Theorem 1.2 and will be omitted. In this case the analog of the operator $A$ has a spectral gap property (because $\beta<1$) and a trivial null space, so that we do not need to require the symmetry property (1.19).
We have already noticed that, when $\beta>1$, the couple $(1,1)$ is a local maximizer for the free energy. In the next theorem we establish the dynamical instability of $M_{\mathrm{hom}}$.

Theorem 1.4 (Instability). Assume $\beta>1$. There exist constants $k_0>0$, $\theta>0$, $C>0$, $c>0$ and a family of initial $\frac{2\pi}{k_0}$-periodic data $f_i^\delta(0)=\mu_\beta+\sqrt{\mu_\beta}\,g_i^\delta(0)\ge 0$, with $g^\delta(0)$ satisfying (1.19) and

$$\|\nabla_{x,v}g^\delta(0)\|_{L^2} + \|wg^\delta(0)\|_{L^\infty} \le C\delta,$$

for $\delta$ sufficiently small, but the solution $g^\delta(t)$ to (1.1) satisfies

$$\sup_{0\le t\le T^\delta}\|wg^\delta(t)\|_{L^\infty} \ge c\sup_{0\le t\le T^\delta}\|g^\delta(t)\|_{L^2} \ge c\,\theta > 0.$$

Here the escape time is

$$T^\delta = \frac{1}{\mathrm{Re}\,\lambda}\,\ln\frac{\theta}{\delta}, \tag{1.26}$$
λ is the eigenvalue with the largest real part for the linearized Vlasov-Boltzmann system constructed in Theorem 5.1 with Reλ > 0. Remark 1.4. Note that T δ → ∞ as δ → 0. We also observe that the critical value β = 1 escapes our analysis, since it is based on some strict inequalities which collapse for the critical β. Furthermore, the growing mode which we construct satisfies the symmetry condition (1.19). Hence the instability is not a consequence of the absence of such a symmetry. To prove instability of the homogeneous state, we encounter a second major difficulty. Even though such a homogeneous equilibrium is not a local minimizer for the free-energy (1.11), it is not clear at all if such a property leads to a dynamical instability along the time evolution. As a matter of fact, the presence of (possibly strong) stabilizing collision in the velocity space might damp out the instability. It is thus fundamental to understand such a stabilizing collision effect. Unfortunately, a direct linear stability analysis in the presence of the collision effect is too complicated to draw any conclusion. We therefore developed a new perturbation argument to establish the linear instability around such a homogeneous steady state. We first observe that in the absence of the collision effect, the homogeneous state is indeed dynamically unstable, by an explicit analysis similar to the Penrose criterion in plasma physics [14]. It then follows, via a perturbation argument, that for ‘weak’ collision effects, such an instability should persist. The key is to show that this is even true for arbitrarily ‘strong’ collision effect. We use an argument of contradiction and a method of continuation. In fact, if instability should fail at some level of the collision effect, then a neutral mode must occur. 
The interaction between the Vlasov force and the collision effect forces any neutral mode to behave like a multiple of $M_{\mathrm{hom}}$ with a particular dispersion relation never satisfied for our model (Theorem 5.1). To bootstrap such a linear instability into a nonlinear one is delicate due to severe nonlinearity, and we follow the program developed by Strauss and the second author over the years [10,11]. We remark that the instability of the homogeneous equilibrium for the VFP model is still open because the techniques used in the present paper cannot apply due to the unboundedness of the Fokker-Planck operator. On the other hand, in this paper we cannot prove the convergence to the stable equilibrium, while it was possible in the VFP model even to compute the rate.
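The escape time (1.26) is the time at which a perturbation of size $\delta$, growing like $\delta e^{\mathrm{Re}\lambda\, t}$, reaches the fixed level $\theta$. The following small sketch only illustrates how slowly (logarithmically) $T^\delta$ diverges as $\delta\to 0$; the numerical values of $\mathrm{Re}\,\lambda$ and $\theta$ used with it are hypothetical and are not taken from the paper.

```python
import math

def escape_time(re_lambda, theta, delta):
    """Escape time (1.26): T = ln(theta / delta) / Re(lambda)."""
    if re_lambda <= 0 or not (0 < delta < theta):
        raise ValueError("need Re(lambda) > 0 and 0 < delta < theta")
    return math.log(theta / delta) / re_lambda

# Hypothetical growth rate and threshold, chosen only for illustration.
times = [escape_time(0.5, 1.0, 10.0 ** (-n)) for n in (2, 4, 8)]
```

Each time $\delta$ is squared, the escape time only doubles, which is consistent with $T^\delta\to\infty$ as $\delta\to 0$ in Remark 1.4.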
The paper is organized as follows: In Sect. 2 we use the energy-entropy to derive a mixed $L^1$-$L^2$ estimate. In Sect. 3, we establish some lemmas on the characteristic curves for Eqs. (1.1). In Sect. 4 we establish the nonlinear stability in $L^\infty$ norm. In Sect. 5 we construct a growing mode for the linearized problem around the homogeneous equilibrium and finally, in Sect. 6 we show that such a linear growing mode leads to nonlinear instability.

2. Entropy-Energy Estimate

In this section we use the conservation of energy and masses, and the entropy inequality to obtain a priori estimates on the deviation of the solution from equilibrium. A crucial role is played by the quadratic approximation of the excess free energy functional $\hat{\mathcal{F}}$. We need a few definitions and some notation.

Let $u=(u_1,u_2)$ be a couple of functions in $L^2(\mathbb{R})$ and denote $\langle u,v\rangle=\sum_{i=1}^{2}(u_i,v_i)$. Given a couple of densities $\rho=(\rho_1(x),\rho_2(x))$, we define the operator $A$ by setting

$$\langle u,Au\rangle := \sum_{i=1}^{2}\int_{\mathbb{R}}dx\,u_i(x)(Au)_i(x) = \frac{1}{2}\,\frac{d^2}{ds^2}\hat{\mathcal{F}}(\rho+su)\Big|_{s=0}.$$

In particular, when $\rho=(\bar\rho_1(x),\bar\rho_2(x))$, the action of the operator $A$ on $u$ is

$$(Au)_1 = \frac{u_1}{\bar\rho_1} + \beta\,U*u_2, \qquad (Au)_2 = \frac{u_2}{\bar\rho_2} + \beta\,U*u_1. \tag{2.1}$$

Hence

$$\langle u,Au\rangle = \frac{1}{2}\sum_{i=1}^{2}\int_{\mathbb{R}}dx\,\frac{u_i^2(x)}{\bar\rho_i(x)} + \beta\int_{\mathbb{R}}dx\int_{\mathbb{R}}dy\,U(|x-y|)\,u_1(x)u_2(y).$$

Due to the minimizing properties of $\bar\rho$, this quadratic form is non negative. Moreover, by (1.12) and (2.1), we see that $A\bar\rho'=0$, which shows that $\bar\rho'$ is in the null space of $A$. Indeed, one can show (see [5] and references quoted therein) the following

Lemma 2.1. Suppose $\beta>1$ and $\rho=(\bar\rho_1,\bar\rho_2)$. Then there exists $\delta_0>0$ such that

$$\langle u,Au\rangle \ge \delta_0\,\langle (I-P)u,(I-P)u\rangle,$$

where $P$ is the projector on $\mathrm{Null}\,A$:

$$\mathrm{Null}\,A = \{u\in L^2(\mathbb{R})\times L^2(\mathbb{R}) \mid u=c\,\bar\rho',\ c\in\mathbb{R}\}.$$

If either $\rho=(\rho^\pm,\rho^\mp)$, or $\rho=(1,1)$ with $\beta<1$, we have

$$\langle u,Au\rangle \ge \delta_0\,\langle u,u\rangle,$$

and the null space of $A$ reduces to $\{0\}$.

We prove the following lemma which plays a crucial role in the proof of the stability of $M_{\bar\rho}$. Recall that we have adopted the notation $\xi=(v,\zeta)$, with $\zeta\in\mathbb{R}^2$ and $v\in\mathbb{R}$ for the velocity.
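For the homogeneous state $\rho=(1,1)$ the operator in (2.1), with $\bar\rho_i$ replaced by 1, is a convolution operator, so its coercivity can be read off in Fourier variables: for each wave number $k$ it acts as the $2\times 2$ matrix with diagonal entries 1 and off-diagonal entries $\beta\hat U(k)$, whose eigenvalues are $1\pm\beta\hat U(k)$. The sketch below evaluates the smallest eigenvalue over a grid of $k$. It is an illustration only: the tent potential $U(r)=\max(0,1-|r|)$ is an assumption made for its explicit transform (it has unit integral and range 1 but is not smooth, unlike the potential of the paper), and the function names and grid parameters are hypothetical.

```python
import math

def tent_hat(k):
    """Fourier transform of the tent potential U(r) = max(0, 1 - |r|).

    Equals (sin(k/2) / (k/2))**2, with value 1 at k = 0 (unit integral).
    """
    if abs(k) < 1e-8:
        return 1.0
    s = math.sin(0.5 * k) / (0.5 * k)
    return s * s

def lowest_symbol(beta, k_max=50.0, n=5001):
    """Smallest eigenvalue of [[1, b], [b, 1]], b = beta * U_hat(k), over a k-grid."""
    best = float("inf")
    for j in range(n):
        k = k_max * j / (n - 1)
        b = beta * tent_hat(k)
        # The eigenvalues of the symmetric 2x2 matrix are 1 + b and 1 - b.
        best = min(best, 1.0 - abs(b))
    return best
```

With this choice the minimum is attained at $k=0$ and equals $1-\beta$: it is positive for $\beta<1$, in line with the coercivity stated in Lemma 2.1 for $\rho=(1,1)$, and negative for $\beta>1$, where the antisymmetric direction $u_1=-u_2$ lowers the free energy.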
Lemma 2.2. Let M = Mρ¯ . Let f 1 (t, x, v, ζ ) = f 2 (t, −x, −v, ζ ) and that g L ∞ ≤ δ for some small δ. Then there exists C > 0 and κ > 0, such that ( f i (t) − Mi )2 C dx dξ 1| fi (t)−Mi |≤κ Mi + Mi R3 i=1,2 R | f i (t) − Mi |1| fi (t)−Mi |≥κ Mi ≤ H(g(0)). Remark 2.1. Note that, when dealing with Mred , Mblue and Mhom , the functions ρ¯1 and ρ¯2 have to be replaced by the constant values (ρ + , ρ − ), (ρ − , ρ + ) and (1, 1) respectively. With this modification the lemma still holds. Proof. Remember the notation ρ f (t, x) = R3 dξ f (t, x, ξ ). We may construct solutions (see [8]) such that H(g) ≤ H(g(0)). We expand H(g) and use (1.7) to cancel the linear part of the expansion, which takes the form 3 2 β 2 − dx dξ {Ci + 1 + ln }{ f i − Mi } 2π R R3 i=1 2
+
β|ξ |2 ( f i − Mi ) + (ln Mi + 1)( f i − Mi ) 2 R3 i=1 R + β( f i − Mi )U ∗ ρ¯i+1 . dx
dξ
β 2 Indeed, since ln Mi = ln( 2π ) − β|ξ2 | + ln ρ¯i , by (1.7), the above quantity is zero by construction. Therefore, we turn to the second order expansion of H(g). For some f˜i between Mi and f i , 2 ( f i (t) − Mi )2 H(g) = dx dξ d xdξ 2 f˜i R3 i=1 R dx dy(ρ f1 (t, x) − ρ¯1 (x))U (|x − y|)(ρ f2 (t, x) − ρ¯2 (x)). (2.2) +β 3
R
2
R
For some small number κ to be determined, we introduce the indicator functions χi< = 1| fi (t)−Mi |≤κ Mi and χi> = 1| fi (t)−Mi |>κ Mi and split the first term into ( f i (t) − Mi )2 > ( f i (t) − Mi )2 < dx dξ χi + dx dξ χi . 2 f˜i 2 f˜i R R3 R R3 ( f i (t) − Mi )2 in the case of | f i (t) − Mi | > κ Mi . Notice that either 2 f˜i f i ≥ (1 + κ)Mi , or f i ≤ (1 − κ)Mi . If f i ≥ (1 + κ)Mi , f˜i (t) ≤ f i , and we have
We first estimate
κ | f i (t) − Mi | | f i (t) − Mi | Mi 1 = . ≥ =1− ≥1− ˜ f f 1 + κ 1 + κ f i (t) i i
In the second case f i ≤ (1 − κ)Mi , f˜i (t) ≤ Mi and | f i (t) − Mi | | f i (t) − Mi | fi κ . ≥ =1− ≥ 1 − {1 − κ} = κ > ˜ Mi Mi 1+κ f i (t) Combining these two cases and noticing f˜i ≤ (1 + κ)Mi for | f i (t) − Mi | ≤ κ Mi , we conclude ( f i (t) − Mi )2 < ( f i (t) − Mi )2 > dx dξ χi + dx dξ χi 2 f˜i 2 f˜i R R3 R R3 ( f i (t) − Mi )2 < κ ≥ dx dξ χi + dx dξ | f i (t) − Mi |χi> 3 3 2(1 + κ)M 2(1 + κ) i R R R R 1 κ 2 < = dx dξ gi χi + dx dξ | f i (t) − Mi |χi> 2(1 + κ) 2(1 + κ) R R R3 R3 1 κ 2 ≥ d xn i + dx dξ | f i (t) − Mi |χi> , (2.3) 2(1 + κ) 2(1 + κ) R R R3 √ where we have set n i (t, x) µβ = P[( f i (t) − Mi )χi< ], and P denotes the L 2ξ -projection √ on Mi : P f := dξ f (ξ ) µβ (ξ ) µβ (ξ ). R3
We now split the potential contribution in (2.2) to get β dx dy dξ dη M1 g1 χ1< U (|x − y|) M2 g2 χ2< 3 R R3 R R +β dx dy dξ dη M1 g1 χ1< U (|x − y|) M2 g2 χ2> 3 3 R R R R +β dx dy dξ dη M1 g1 χ1> U (|x − y|) M2 g2 χ2> 3 3 R R R R +β dx dy dξ dη M1 g1 χ1> U (|x − y|) M2 g2 χ2< . R
R
R3
R3
From our assumption g L ∞ ≤ δ, the last three terms are controlled by Cβ ( g1 L ∞ + g2 L ∞ ) dx dξ | f i (t) − M1 |χi> ≤ Cβ δ
2 i=1
R
dx
R3
i=2
R
R3
dξ | f i (t) − Mi |1| fi (t)−Mi |≥κ M1 ,
√ √ which is bounded by the second term in (2.3) for δ 0. Let |v| ≤ N . 1. For any ε > 0, there exists L ε sufficiently large so that, if |x| ≥ L ε , then for 0 ≤ s ≤ t − ε, ∂ X i0 (s; t, x, v) ε < − < 0. ∂v 2 2. For any η > 0, there exist P finite points |xk | ≤ L ε (1 ≤ k ≤ P) and corresponding open sets Oxk = {an < s < bn } × {co < v < do } (n,o)∈Ik
with the property |[0, T0 ] × {|v| ≤ N } ∩ Oxck | < η, so that there exists m > 0 and, for any |x| ≤ L ε , there exists l ∈ {1, . . . , P}, ∂ X 0 (s; t, x, v) i > m > 0. ∂v for (s, v) ∈ Oxl
Proof. For any ε > 0, from (3.2) and (3.3 ),
|Vi0 (s; t, x, v)| ≤ |v| + 2 φ0 L ∞ ≤ N + C,
|X i0 (s; t, x, v) − x| ≤ T0 {N + C}. By choosing L ε (depending on T0 and N ) large enough, for |x| ≥ L ε , |v| ≤ N , and 0 ≤ s ≤ T0 , |X i0 (s; t, x, v)| ≥
Lε . 2
(3.4)
From (3.2), we have ∂ X i0 (s; t, x, v) d 2 ∂ X i0 (s; t, x, v) 0 = −∂ , φ (X (s; t, x, v)) xx 0 i ds 2 ∂v ∂v and we deduce that for |s| ≤ T0 , ∂ X 0 (s; t, x, v) i ≤ C T0 . ∂v
(3.5)
(3.6)
By the Taylor expansion for s, we get ∂ X i0 (s; t, x, v) ∂ X i0 (s; t, x, v) d ∂ X i0 (s; t, x, v) + (s − t) = ∂v ∂v ds ∂v s=t s=t (s − t)2 d 2 ∂ X i0 (¯s ; t, x, v) 2 ds 2 ∂v (s − t)2 d 2 ∂ X i0 (¯s ; t, x, v) = (s − t) + 2 ds 2 ∂v +
for some t − T0 ≤ s¯ ≤ t. Since the densities ρ¯i tend to their asymptotic values at infinity, lim y→∞ ∂x x φ0 (y) = 0. Therefore, using again (3.5) with s = s¯ and (3.6), we have ⎛ ⎞ ∂ X i0 (s; t, x, v) ≤ (s − t) ⎝1 − (t − s)C T0 sup |∂x x φ0 (y)|⎠ ∂v |y|≥ L ε 2
s−t ≤ < 0, 2 by choosing L ε sufficiently large. Part (1) thus follows. To prove part (2), for |x| ≤ L ε , introduce the zero set of Z x = {t − T0 ≤ s ≤ T0 , |v| ≤ N :
∂ X i0 (s; t, x, v) = 0}. ∂v
Then from the Fubini Theorem and Lemma 3.1, N t |Z x | = 1 ∂ X 0 (s;t,x,v) −N
t−T0
{s,v:
i
∂v
∂ X i0 (s; t, x, v) as ∂v
=0}
ds dv = 0.
η . Clearly 2 0 ∂ X i (s;t,x,v) = 0 over the compact set [t − T0 , t] × {|v| ≤ N } ∩ Ωxc . By the continuity in ∂v s and v, there exists m x > 0 such that over [t − T0 , t] × {|v| ≤ N } ∩ Ωxc , ∂ X 0 (s; t, x, v) i > 4m x > 0. ∂v
Therefore, there exists an open set Ωx such that Z x ⊂ Ωx with |Ωx |
2m x > 0 ∂v for all x ∈ (x − ∆x , x + ∆x ), (s, v) ∈ [t − T0 , t] × {|v| ≤ N } ∩ Ωxc . Such (x − ∆x , x + ∆x ) forms an open covering for |x| ≤ L ε , hence there is a finite subcovering {(xk − ∆k , xk + ∆k ), k = 1, . . . , P} for |x| ≤ L ε . For any |x| ≤ L ε , there exists Ωxl such that x ∈ (xl − ∆l , xl + ∆l ) and for [t − T0 , t] × {|v| ≤ N } ∩ Ωxcl , ∂ X 0 (s; t, x, v) i > 2m = min 2m k > 0. 1≤k≤P ∂v c We finally choose an (finite) open covering of [t − T0 , t] × {|v| ≤ N } ∩ Ωxk of the form Oxk = n,o {an < s < bn } × {co < v < do } with |an − bn | + |co − do | sufficiently small so that over Oxk , ∂ X 0 (s; t, x, v) i > m > 0. ∂v
Lemma 3.3. Fix T0 > 0 and N > 0. Let |v| ≤ N . For any ε > 0, recall L ε , Oxk , 1 ≤ k ≤ P constructed in Lemma 3.2. There exists δ > 0 such that if g L 2 < δ, 1. If |x| ≥ L ε then for 0 ≤ s ≤ t − ε, ∂ X i (s; t, x, v) ε > 0. ∂v 2 Proof. Denote the solution operator of (3.2) and (3.1) by G 0 and G. By the Duhamel principle, we have X i (s; t, x, v) G 0 (s−t) x =e v Vi (s; t, x, v) s 0 G 0 (s−τ ) √ dτ. (3.7) + e −∂x U ∗ { Mi+1 gi+1 dξ }(τ ) t
Notice that
∂ ∂x U ∗ { Mi+1 gi+1 dξ }(τ ) ∂v ∂ X i (τ ; t, x, v) = ∂ x x U ∗ { Mi+1 gi+1 dξ } ∂v ∂ X i (τ ; t, x, v) . ≤ C g L 2 ∂v
It thus follows from taking the derivative of v in (3.7) and by Gronwall’s lemma that 0 ≤ s ≤ T0 , ∂ X i (s; t, x, v) ≤ eC T0 . ∂v We now use (3.7) again to get ∂ X (s; t, x, v) ∂ X 0 (s; t, x, v) i i − ≤ C T0 g L 2 . ∂v ∂v Hence, we deduce our lemma by choosing g L 2 sufficiently small.
4. Weighted L ∞ Stability In this section we use the entropy-energy bound and the estimates on the characteristics to show that the perturbation g of the non homogeneous equilibrium Mρ¯ , is arbitrarily small at any positive time in a suitable weighted L ∞ norm, provided that it is initially sufficiently small, thus showing the
stability γ of the non homogeneous equilibrium. We use the weight function w(ξ ) = Σ + |ξ |2 , with Σ a positive constant to be chosen later and γ > 23 . Lemma 4.1. Let h = wg. There exist T0 > 0 and δ > 0 such that, if h L ∞ < δ, then
h(T0 ) L ∞ ≤
1
h(0) L ∞ + C T0 H(g(0)). 2
Proof. We first write the equation for h = wg from (1.17): h i+1 ∂v + ν(x, ξ ) h i = ∂t + v∂x + F(Mi+1 + Mi+1 w
h i+1 w h i + wF F Mi+1 + Mi+1 M I +1 gi+1 v Mi + K wi j h j w w j=1,2 h i h i+1 h i+1 hi hi vh i + wΓ , + wΓ , , (4.1) +F Mi+1 w w w w w ·
ij where K w (·) = wK i j . We note that for Σ ≥ 1, w [Σ + |ξ |2 ]γ w(ξ ) [Σ + |ξ |2 ]γ + |ξ − ξ |2γ = ≤ C ≤ Cγ [1 + |ξ − ξ |2 ]γ . γ w(ξ ) [Σ + |ξ |2 ]γ [Σ + |ξ |2 ]γ
(4.2)
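One way to see the weight bound (4.2) is sketched below, writing $\xi'$ for the second velocity and using the explicit, non-optimal constant $C_\gamma=2^\gamma$; this value of the constant is an assumption of the sketch, not a value given in the text. The only ingredients are $|\xi'|^2\le 2|\xi|^2+2|\xi-\xi'|^2$ and $\Sigma+|\xi|^2\ge 1$, which holds because $\Sigma\ge 1$:

```latex
\begin{align*}
\Sigma + |\xi'|^2
  &\le \Sigma + 2|\xi|^2 + 2|\xi-\xi'|^2 \\
  &\le 2\big(\Sigma+|\xi|^2\big) + 2|\xi-\xi'|^2\big(\Sigma+|\xi|^2\big)
   = 2\big(\Sigma+|\xi|^2\big)\big(1+|\xi-\xi'|^2\big), \\
\text{hence}\quad
\frac{w(\xi')}{w(\xi)}
  &= \frac{[\Sigma+|\xi'|^2]^\gamma}{[\Sigma+|\xi|^2]^\gamma}
   \le 2^\gamma\,\big[1+|\xi-\xi'|^2\big]^\gamma .
\end{align*}
```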
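For any $(t, x, \xi)$, integrating along its backward trajectory (3.1), $[X_i(s), V_i(s)] = [X_i(s;t,x,v), V_i(s;t,x,v)]$, we can express $h_i(t,x,\xi)$ as
\[
\begin{aligned}
& h_i(0, X_i(0;t,x,v), V_i(0;t,x,v), \zeta) \\
&+ \int_0^t e^{\int_t^s \nu_i(\tau)d\tau} \Big\{F\big(M_{i+1} + \sqrt{M_{i+1}}\, g_{i+1}\big)\frac{w'}{w}\, h_i\Big\}(s, X_i(s), V_i(s), \zeta)\, ds \\
&+ \int_0^t e^{\int_t^s \nu_i(\tau)d\tau} \big\{F\big(\sqrt{M_{i+1}}\, g_{i+1}\big) V_i(s) \sqrt{M_i}\big\}(s, X_i(s), V_i(s), \zeta)\, ds \\
&+ \int_0^t e^{\int_t^s \nu_i(\tau)d\tau} \Big(\sum_{j=1}^{2} K_w^{i,j} h_j\Big)(s, X_i(s), V_i(s), \zeta)\, ds \\
&+ \int_0^t e^{\int_t^s \nu_i(\tau)d\tau} \big\{F\big(\sqrt{M_{i+1}}\, g_{i+1}\big) v h_i\big\}(s, X_i(s), V_i(s), \zeta)\, ds \\
&+ \int_0^t e^{\int_t^s \nu_i(\tau)d\tau} \Big\{w\Gamma\Big(\frac{h_i}{w}, \frac{h_i}{w}\Big) + w\Gamma\Big(\frac{h_i}{w}, \frac{h_{i+1}}{w}\Big)\Big\}(s, X_i(s), V_i(s), \zeta)\, ds.
\end{aligned} \tag{4.3}
\]
We have set $\nu_i(\tau) \equiv \nu(V_i(\tau), \zeta) \ge \nu_0 > 0$. Fix a small constant $\varepsilon > 0$. We can choose $\Sigma$ large so that $|\frac{w'}{w}| \le \varepsilon$. Since $\|F(M_{i+1} + \sqrt{M_{i+1}}\, g_{i+1})\|_{L^\infty} \le C$ if $\|h\|_{L^\infty}$ is small, the second term in (4.4) is bounded by
\[
C \varepsilon e^{-\frac{\nu_0}{2} t} \sup_{0 \le s \le T_0} \{e^{\frac{\nu_0}{2} s} \|h(s)\|_{L^\infty}\}.
\]
For the third term in (4.4), we split $g_{i+1} = g_{i+1} 1_{|f_i(t) - M_i| \le \kappa M_i} + g_{i+1} 1_{|f_i(t) - M_i| \ge \kappa M_i}$. Since $U$ is smooth,
\[
\|F(\sqrt{M_{i+1}}\, g_{i+1})\|_{L^\infty} \le C\big\{\|\sqrt{M_{i+1}}\, g_{i+1} 1_{|f_i(t) - M_i| \le \kappa M_i}\|_{L^2} + \|\sqrt{M_{i+1}}\, g_{i+1} 1_{|f_i(t) - M_i| \ge \kappa M_i}\|_{L^1}\big\},
\]
and, by Lemma 2.2,
\[
\|F(\sqrt{M_{i+1}}\, g_{i+1}) \sqrt{M_i}(s, X_i(s), V_i(s), \zeta) V_i(s)\|_{L^\infty} \le C\big\{\sqrt{H(g(0))} + H(g(0))\big\}. \tag{4.4}
\]
For the fifth term, we note that, since for hard spheres $\nu_i(s) \ge \nu_0(1 + |\zeta| + |V_i(s)|)$, it follows that $\frac{|V_i(s)|}{\nu_i(s)} < \frac{1}{\nu_0}$. Moreover, $\int_0^t \frac{\nu_i(s)}{2} e^{\int_t^s (\nu_i(\tau)/2) d\tau}\, ds \le 1$. Therefore,
\[
\left|\int_0^t e^{\int_t^s \nu_i(\tau) d\tau} F(\sqrt{M_{i+1}}\, g_{i+1}) V_i(s) h_i\, ds\right| \le C e^{-\nu_0 t} \sup_{0 \le s \le T_0} \{e^{\frac{\nu_0}{2} s} \|h(s)\|_{L^\infty}\}^2.
\]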
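For the last term in (4.4), by Lemma 10 of [7], it follows
\[
\Big|w\Gamma\Big(\frac{h_i}{w}, \frac{h_i}{w}\Big)(\xi)\Big| + \Big|w\Gamma\Big(\frac{h_i}{w}, \frac{h_{i+1}}{w}\Big)(\xi)\Big| \le C \nu(\xi) \|h\|_{L^\infty}^2.
\]
We therefore get the bound for the last term by
\[
\int_0^t e^{\int_t^s \nu(\tau) d\tau} \nu_i(s) \|h(s)\|_{L^\infty}^2\, ds
\le C \Big\{\sup_{0 \le s \le T_0} e^{\frac{\nu_0}{2} s} \|h(s)\|_{L^\infty}\Big\}^2 \int_0^t e^{\int_t^s \nu_i(\tau) d\tau} \nu_i(s) e^{-\nu_0 s}\, ds.
\]
Note that $\frac{d}{ds}\big[e^{\int_t^s \nu_i(\tau) d\tau}\big] = e^{\int_t^s \nu_i(\tau) d\tau} \nu_i(s)$. Integrating by parts yields
\[
\int_0^t e^{\int_t^s \nu_i(\tau) d\tau} \nu_i(s) e^{-\nu_0 s}\, ds
= \Big[e^{\int_t^s \nu_i(\tau) d\tau} e^{-\nu_0 s}\Big]_{s=0}^{s=t} + \nu_0 \int_0^t e^{\int_t^s \nu_i(\tau) d\tau} e^{-\nu_0 s}\, ds
\le C(1 + t) e^{-\nu_0 t}.
\]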
We shall mainly concentrate on the fourth term in (4.4). Let ki, j (ξ, ξ ) be the corij responding kernel associated with K w in (4.1). We now use (4.4) for h j (s, X i (s), ξ ) again to evaluate i, j {K wi, j h j }(s, X i (s), Vi (s), ζ ) = kw (Vi (s), ζ, ξ )h j (s, X i (s), ξ )dξ . Denote [X j (s1 ), V j (s1 )] ≡ [X j (s1 ; X i (s; t, x, v), v ), V j (s1 ; X i (s; t, x, v), v )]. We can bound the fourth term in (4.4) by the sum on j of t 0 s i, j e t νi (τ )dτ + s ν j (τ )dτ |kw (Vi (s), ζ, ξ )|h j (0, X j (0), V j (0), ζ )|dξ ds R3 0 t s s s 1 i, j + e t νi (τ )dτ + s ν j (τ )dτ |kw (Vi (s), ζ, ξ )| 0
s1
w
R3
×{F(M j+1 + M j+1 g j+1 ) h j }(s1 , X j (s1 ), V j (s1 ), ζ )dξ dsds1 w t s s s 1 νi (τ )dτ + s ν j (τ )dτ i, j t + e |kw (Vi (s), ζ, ξ )| R3 0 s1 ×{F( M j+1 g j+1 )v M j }(s1 , X j (s1 ), V j (s1 ), ζ )dξ ds s s t s i, j 1 + e t νi (τ )dτ + s ν j (τ )dτ kw (Vi (s), ζ, ξ ) R3 × R3 0 s1 k ×kwj,k (V j (s1 ), ζ , ξ )h k (s1 , X j (s1 ), ξ )|dξ dξ dsds1 t s s s 1 i, j + e t νi (τ )dτ + s ν j (τ )dτ |kw (Vi (s), ζ, ξ )| 3 R 0 s1 ×{F( M j+1 g j+1 )vh j }(s1 , X j (s1 ), V j (s1 ), ζ )dξ dsds1
(4.5)
t
s 1 i, j e t νi (τ )dτ + s ν j (τ )dτ |kw (Vi (s), ζ, ξ )| 3 R 0 s1 h j h j+1 hj hj , +Γ , (s, X j (s1 ), V j (s1 ))dξ dsds1 . ×w Γ w w w w
+
s s
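We will make extended use of Lemma 7 of [7], which we report here for the reader's convenience. For hard spheres, the usual Grad estimates imply:
\[
|k_{i,j}(\xi, \xi')| \le C\{|\xi - \xi'| + |\xi - \xi'|^{-1}\}\, e^{-\frac{1}{8}|\xi - \xi'|^2 - \frac{1}{8}\frac{||\xi'|^2 - |\xi|^2|^2}{|\xi - \xi'|^2}}. \tag{4.6}
\]
Lemma 4.2 (Lemma 7 of [7]). There are $\varepsilon > 0$ and $C > 0$ such that
\[
\int_{\mathbb{R}^3} \{|\xi - \xi'| + |\xi - \xi'|^{-1}\}\, e^{-\frac{1-\varepsilon}{8}|\xi - \xi'|^2 - \frac{1-\varepsilon}{8}\frac{||\xi'|^2 - |\xi|^2|^2}{|\xi - \xi'|^2}}\, \frac{w(\xi)}{w(\xi')}\, d\xi' \le \frac{C}{1 + |\xi|}. \tag{4.7}
\]
By Lemma 4.2, we obtain the crucial estimate
\[
\int_{\mathbb{R}^3} |k_w^{i,j}(\xi, \xi')|\, d\xi' < \frac{C}{1 + |\xi|} \tag{4.8}
\]
uniformly in $\Sigma$. Since $\nu_i \ge \nu_0$, by taking the $L^\infty$ norm for $h$ and (4.8), we bound the first term in (4.6) by $C t e^{-\nu_0 t} \|h_0\|_{L^\infty}$, and the second term by
\[
\varepsilon C e^{-\frac{\nu_0}{2} t} \sup_{0 \le s \le T_0} \{e^{\frac{\nu_0}{2} s} \|h(s)\|_{L^\infty}\}.
\]
By (4.8), the third term is bounded by $C\sqrt{H(g(0))}$ as in (4.4), and the last two nonlinear terms are bounded by
\[
C\{1 + t\} e^{-\nu_0 t} \Big\{\sup_{0 \le s \le T_0} e^{\frac{\nu_0}{2} s} \|h(s)\|_{L^\infty}\Big\}^2.
\]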
We now concentrate on the fourth term in (4.6), which will be estimated along the same lines of the proof of Theorem 20 in [7]. Case 1. For |ξ | ≥ N T0 , we know that from (3.1), |Vi (s; t, x, v) − v| ≤ |s − t|C ≤ C T0 . By Lemma 4.2 and (4.2), for N T0 large, i, j (Vi (s), ζ, ξ )kwj,k (V j (s1 ), ζ , ξ )|dξ dξ |kw ≤
C C ≤ , 1 + |Vi (s)| + |ζ | N − C T0
we therefore can find an upper bound for the fourth term in (4.6) by (N >> T0 ) s C t −ν0 (t−s) e × e−ν0 (s−s1 ) h(s1 ) L ∞ ds1 ds N 0 0 ν0
≤
Ce− 2 t N
sup e 0≤s≤T0
ν0 2 s
h(s) L ∞ .
Case 2. For |ξ | ≤ N , |ξ | ≥ 2N , or |ξ | ≤ 2N , |ξ | ≥ 3N . Notice that we have either |ξ − ξ | ≥ N or |ξ − ξ | ≥ N . This implies that |v − Vi (s; t, x, v)| ≥ |v − v| − |v − Vi (s; t, x, v)| ≥ |v − v| − C T0 , |v − V j (s1 ; X i (s; t, x, v), v )| ≥ |v − v | − |v − V j (s1 ; X i (s; t, x, v), v )| ≥ |v − v | − C T0 . Therefore, either one of the following are valid correspondingly for some σ > 0: σ
σ
2 +|ζ −ζ |2 }
i, j i, j |kw (Vi (s), ζ, ξ )| ≤ C T0 e− 8 N |kw (Vi (s), ζ, ξ )e 8 {|Vi (s)−v |
|kwj,k (V j (s1 ), ζ , ξ )|
≤
2
|,
− σ8 N 2
C T0 e |kwj,k (V j (s1 ), ζ , ξ ) × σ 2 2 8 {|V j (s1 )−v | +|ζ −ζ | } |.
e
From Lemma 4.2, σ 2 2 i, j |kw (Vi (s), ζ, ξ )e 8 {|Vi (s)−v | +|ζ −ζ | } |dξ σ 2 2 + |kwj,k (V j (s1 ), ζ , ξ )e 8 {|V j (s1 )−v | +|ζ −ζ | } |dξ < +∞.
(4.9)
We use this bound to combine the cases of |ξ − ξ | ≥ N or |ξ − ξ | ≥ N as: t s1 + . 0
|ξ |≤N ,|ξ |≥2N ,
0
|ξ |≤2N ,|ξ |≥3N
We first integrate ξ for the first integral and apply (4.8) to integrate kw over ξ . We i, j then integrate ξ for the second integral and apply (4.8) to integrate kw over ξ . We thus find an upper bound j,k
t
s1
C
sup 0
0
+ sup ξ
|ξ |≤N ,|ξ |≥2N ,
ξ
|ξ |≤2N ,|ξ |≥3N
Cη η 2 ≤ 2 e− 8 N κ η
t
≤ C η e− 8 N e 2
s1
e 0 0 ν − 20 t
ij |kw (Vi (s), ζ, ξ )|dξ
ij |kw (V j (s1 ), ζ , ξ )|dξ
(4.10)
−ν0 (t−s1 )
sup {e
ν0 2 s
h(s1 ) L ∞ ds1 ds
h(s) L ∞ }.
0≤s≤t
Case 3. |ξ | ≤ N , |ξ | ≤ 2N , |ξ | ≤ 3N . This is the last remaining case because if |ξ | > 2N , it is included in Case 2; while if |ξ | > 3N , either |ξ | ≤ 2N or |ξ | ≥ 2N are also included in Case 2. We now can bound the second term in (4.6) by t
s
C 0
B
0
i, j e−ν0 (t−s1 ) |kw (Vi (s), ζ, ξ )kwj,k (V j (s1 ), ζ , ξ )h k (s1 , X j (s1 ), ξ )|,
where B = {|ξ | ≤ 2N , |ξ | ≤ 3N }. We notice that kw (ξ, ξ ) has a possible integrable i, j 1 singularity of the type |ξ −ξ | . We can choose k N (ξ, ξ ) smooth with compact support such that 1 i, j i, j (4.11) |k N ( p, ξ ) − kw ( p, ξ )|dξ ≤ . sup N | p|≤3N |ξ |≤3N i, j
Split kw (Vi (s), ζ, ξ )kw (V j (s1 ), ζ , ξ ) into ij
j,k
i, j (Vi (s), ζ, ξ ) − k N (Vi (s), ζ , ξ )}kwj,k (V j (s1 ), ζ , ξ ) {kw i, j
i, j +{kw (V j (s1 ), ζ, ξ ) − k N (V j (s1 ), ζ, ξ )}k N (V j (s), ζ , ξ ) i, j
j,k
+k N (Vi (s), ζ, ξ )k N (V j (s1 ), ζ , ξ ). i, j
j,k
We then integrate the first term above in ξ and the second term above in ξ . By (4.8), we can use such an approximation (4.11) to bound the s1 , s integration by ν0
ν0 Ce− 2 t sup {e 2 s h(s) L ∞ } (4.12) N 0≤s≤t i, j j,k × sup |kw (V j (s1 ), ζ , ξ )|dξ + sup |kw (Vi (s), ζ, ξ )|dξ
|ξ |≤2N
t
|ξ |≤2N
s
+C 0
B
s1
e−ν0 (t−s1 ) |k N (Vi (s), ζ, ξ )k N (V j (s1 ), ξ )h j (s, X j (s1 ), ζ , ξ )|. i, j
i, j
The first term above is further bounded by
ν0 t
Ce− 2 N
sup0≤s≤t {e
ν0 2 s
h(s) L ∞ }.
Fix ε > 0. We use now Lemma 3.3 for the last main contribution in (4.13) for which we separate two cases |X j (s; t, x, v)| ≥ L ε and |X j (s; t, x, v)| ≤ L ε , where L ε is given in Lemma 3.3. In the case |X j (s; t, x, v)| ≥ L ε , we bound it by t s e−ν0 (t−s1 ) |h k (s, X j (s1 ), ξ )|1|X j (s1 )|≥L ε dsdξ dξ ds1 CN 0 B 0 t−ε t . ≤ CN + 0
t−ε ν0
ν0
The second integral is bounded by Cεe− 2 t sup0≤s≤t {e 2 s h(s) L ∞ }. In the first integral, since s ≤ t − ε, by Lemma 3.3, we can make a change of variable y = X j (s1 ) = X j (s1 ; X i (s; t, x, v), v ) dy ε because | dv | ≥ 2 . We observe since ∂ x φ L ∞ ≤ C, that from (3.1), s |v − V j (τ )| ≤
∂x φ L ∞ dτ ≤ T0 ∂x φ L ∞ , τ s |V j (τ )|dτ ≤ T0 (|v | + T0 ∂x φ L ∞ ) ≤ C T0 ,N |y − X i (s)| ≤ s1
(4.13)
for |v | ≤ 2N . By first integrating over ζ and using the change of variable (4.13), t−ε s e−ν0 (t−s1 ) |h k (s1 , X j (s1 ), ξ )|1|X j (s1 )|≥L ε dsdξ dξ ds1 B 0 0 s C t−ε ≤ e−ν0 (t−s1 ) |h k (s1 , y, ξ )|dsdξ dyds1 ε 0 |y−X i (s)|≤C T0 ,N |ξ |≤3N 0 CN ≤ |h k (s1 , y, ξ )|ds1 dξ dy sup ε 0≤s1 ≤T0 |y−X i (s)|≤C T0 ,N |ξ |≤3N + = | f k (t)−Mk |≥κ M j
| f k (t)−Mk |≤κ M j
H(g(0))}.
≤ C T0 ,N ,ε {H(g(0)) +
f k −Mk ) We have used the fact h k = w( √ , (which is bounded by f k − Mk for |ξ | ≤ 3N ), Mk and applied to Lemma 2.2. For |X i (s; t, x, v)| ≤ L ε , for any η > 0, we again employ Lemma 3.3 to find Oxl such that T0 T0 e−ν0 (t−s1 ) |h k (s1 , X j (s1 ), ξ )|1|X j (s)|≤L ε dsdξ dξ ds1 0
=
B 0 T0
0
B
T0
T0
1 Oxc e−ν0 (t−s1 ) |h k (s1 , X j (s1 ), ξ )|1|X j (s)|≤L ε dsdξ dξ ds1 l
0
T0
+ 0
B
1 Ox e−ν0 (t−s1 ) |h k (s1 , X j (s1 ), ξ )|1|X j (s)|≤L ε dsdξ dξ ds1 . l
0
Since |[0, T0 ] × [−N , N ] ∩ Oxcl | < η, the first part is bounded by ν0
C T0 ,N ,ε ηe− 2 t sup {e
ν0 2 s
h(s) L ∞ }.
0≤s≤t
The second part is bounded by T0 T0 C T0 ,N ,ε 1 Oxl |h k (s1 , X j (s1 ), ξ )dsds1 dξ dξ . 0
0
B
∂ X (s ;X (s;t,x,v),v ) Since | j 1 i∂v | > m η /2 on Oxl from Lemma change of variable y = X j (s1 ) = X j (s1 ; X i (s; t, x, v), v )
C T0 ,N ,ε
T0
0
= C T0 ,N ,ε |ξ |≤3N
=
T0
0 Il
B
1 Oxc |h k (s1 , X j (s1 ), ξ )dsds1 dξ dξ
C T0 ,N
l
T0
0
0
T0
|y−X i (s1 )|≤C T0 ,N
h k (s1 , y, ξ )dsds1 dydξ +
| f k (t)−Mk |≤κ Mk
3.3, we can make a (local) to get
| f k (t)−Mk |≥κ Mk
≤ C T0 ,N ,ε,η {H(g(0)) +
H(g(0))}.
×
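Collecting terms, we conclude
\[
\begin{aligned}
\sup_{0 \le s \le T_0} e^{\frac{\nu_0}{2} s} \|h(s)\|_{L^\infty} \le{}& C(1 + T_0) \|h(0)\|_{L^\infty}
+ \Big\{\frac{C_{T_0}}{N} + C_{N,T_0}\, \varepsilon + C_{N,T_0,\varepsilon}\, \eta\Big\} \sup_{0 \le s \le T_0} \{e^{\frac{\nu_0}{2} s} \|h(s)\|_{L^\infty}\} \\
&+ C\Big\{\sup_{0 \le s \le T_0} e^{\frac{\nu_0}{2} s} \|h(s)\|_{L^\infty}\Big\}^2 + C_{T_0,N,\varepsilon,\eta} \sqrt{H(g(0))}.
\end{aligned}
\]
Assume $\sup_{0 \le s \le T_0} \|h(s)\|_{L^\infty}$ is sufficiently small. We first choose $T_0$ sufficiently large so that
\[
2C(1 + T_0) e^{-\frac{\nu_0}{2} T_0} \le \frac{1}{2},
\]
then $N$ sufficiently large, then $\varepsilon$ sufficiently small, finally $\eta$ small to conclude our lemma.

Proof of Theorem 1.2. Assume $\sup_{0 \le t \le \infty} \|h(t)\|_{L^\infty}$ is small. We first establish (1.24). Choose any $n = 0, 1, 2, 3, \dots$ and apply Lemma 4.1 repeatedly to get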
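\[
\begin{aligned}
\|h(nT_0)\|_{L^\infty} &\le \frac{1}{2} \|h(\{n-1\}T_0)\|_{L^\infty} + C_{T_0} \sqrt{H(g(0))} \\
&\le \frac{1}{4} \|h(\{n-2\}T_0)\|_{L^\infty} + \frac{1}{2} C_{T_0} \sqrt{H(g(0))} + C_{T_0} \sqrt{H(g(0))} \\
&\le \dots \\
&\le \frac{1}{2^n} \|h_0\|_{L^\infty} + C_{T_0} \sqrt{H(g(0))}\, \Big\{1 + \frac{1}{2} + \frac{1}{4} + \dots\Big\} \\
&\le \frac{1}{2^n} \|h_0\|_{L^\infty} + 2 C_{T_0} \sqrt{H(g(0))}.
\end{aligned}
\]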
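The iteration above is a contraction with a constant forcing term, so the forcing sums as a geometric series. A minimal numerical sketch follows; `a0` stands in for the initial norm of $h$ and `c` for the constant forcing term contributed at each step by Lemma 4.1, and both names are illustrative assumptions, not notation from the paper.

```python
def iterate_bound(a0, c, n):
    """Apply the worst case a_k = a_{k-1}/2 + c of the one-step estimate n times."""
    a = a0
    for _ in range(n):
        a = 0.5 * a + c
    return a


def closed_form_bound(a0, c, n):
    """Right-hand side of the final inequality: a0 / 2^n + 2c."""
    return a0 / 2 ** n + 2.0 * c


if __name__ == "__main__":
    # The iterated value never exceeds the closed-form bound.
    for n in (0, 1, 5, 20):
        print(n, iterate_bound(1.0, 0.01, n), closed_form_bound(1.0, 0.01, n))
```

The exact iterate is $a_0/2^n + c\,(2 - 2^{1-n})$, so the bound $a_0/2^n + 2c$ is sharp only as $n \to \infty$.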
For any t, we can find n such that nT0 ≤ t ≤ {n + 1}T0 , and from L ∞ estimate from [0, T0 ], we conclude (1.24) by
h(t) L ∞ ≤ C T0 h(nT0 ) ≤ C{ h 0 L ∞ + H(g(0))}. To prove (1.25), we take x and v derivatives to get {∂t + v∂x + F(Mi+1 + Mi+1 gi+1 )∂v + ν(ξ )}∂x gi − K i, j ∂x g j = −∂x F(Mi+1 + Mi+1 gi+1 )∂v gi + β∂x F( Mi+1 gi+1 )v∂x Mi + +∂x {F( M j g j )vgi } + ∂x {Γ (gi , gi ) + Γ (gi , g j )}; (4.14) i, j {∂t + v∂x + F(M j + M j g j )∂v + ν(ξ )}∂v gi − ∂v {K g j } + {∂v ν(ξ )}gi = −∂x gi + β∂x F( M j g j )v Mi + β∂x F( M j g j )v∂x Mi +F( M j g j )∂v {vgi } + ∂v {Γ (gi , gi ) + Γ (gi , g j )}, (4.15) where K i, j has a similar property as K i in [6] (see Lemma 2.2 in [6], p. 1109). In particular, ∂v {K i, j g j }∂v gi L 1 ≤ 21 ∂v g 2ν + C g 2L 2 so that a positive dissipation for ∂v gi
occur for small h L ∞ in (4.15). Notice that L = ν − K ≥ 0. We take the inner product with ∂x gi and ∂v gi respectively, following the procedures in [6] to get: d dt d dt
1
∂x g 2L 2 ≤ C{ ∂x F(Mi+1 ) L ∞ + h L ∞ } ∇x,v g 2L 2 + C g 2L 2 . 2 1 1
∂v g 2L 2 + ∂v g 2ν ≤ C ∂x g 2L 2 + C g 2L 2 . (4.16) 2 4
Hence (1.25) follows from the Gronwall Lemma since $\sup_{0 \le t \le \infty} \|h(t)\|_{L^\infty}$ is bounded by (1.24). With such an estimate, we obtain the uniqueness by taking the $L^2$ estimate for the difference for (1.17) because the most difficult term $F(M_{i+1} + \sqrt{M_{i+1}}\, g_{i+1})\partial_v g_i$ can be handled.

5. Linear Instability: Growing Mode

In this section we study the linearization of Eq. (1.1) around the homogeneous equilibrium $M_{hom} = (\mu_\beta, \mu_\beta)$. In the sequel we omit the index $\beta$ for the sake of brevity: $\mu = \mu_\beta = (\frac{\beta}{2\pi})^{3/2} e^{-\beta \frac{|\xi|^2}{2}}$. When $M$ is replaced by $M_{hom} = (\mu, \mu)$ in (1.17), we get the following linearized Vlasov-Boltzmann system:
\[
\partial_t g + \mathcal{L} g = 0, \tag{5.1}
\]
where $g = (g_1, g_2)$,
\[
(\mathcal{L} g)_i = v \partial_x g_i - \beta F(\sqrt{\mu}\, g_{i+1}) v \sqrt{\mu} - L_i g
\]
and
\[
L_i g = \frac{1}{\sqrt{\mu}} \big[Q(\sqrt{\mu}\, g_i, 2\mu) + Q(\mu, \sqrt{\mu}(g_1 + g_2))\big].
\]
We seek an exponential growing mode for such a system when $\beta > 1$. To this end, we consider a family of systems
\[
\partial_t g + \mathcal{L}_\alpha g = 0, \qquad (\mathcal{L}_\alpha g)_i = v \partial_x g_i - \beta F(\sqrt{\mu}\, g_{i+1}) v \sqrt{\mu} - \alpha L_i g, \tag{5.2}
\]
and show that there is a growing mode for all $\alpha > 0$. We seek a growing mode periodic in $x$, so we assume periodic dependence on space and exponential in time:
\[
g_1(t, x, v, \zeta) = e^{\lambda t} e^{ikx} q(v, \zeta), \qquad g_2(t, x, v, \zeta) = e^{\lambda t} e^{-ikx} q(-v, \zeta),
\]
so that the system (5.2), using the definition of $F$ (see (1.2)), reduces to the single equation
\[
\{\lambda + ivk\} q - \beta k i \hat{U}(k) \Big(\int q \sqrt{\mu}\, d\xi\Big) v \sqrt{\mu} = \alpha L q, \tag{5.3}
\]
with $Lg = \frac{2}{\sqrt{\mu}} \big[Q(\sqrt{\mu}\, g, \mu) + Q(\mu, \sqrt{\mu}\, g)\big]$. Equivalently, $q(\xi)$ is an eigenfunction for the operator $T^\alpha$,
\[
(T^\alpha q)(\xi) = ivk\, q(\xi) - \beta k i \hat{U}(k) \Big(\int q(\xi) \sqrt{\mu}\, d\xi\Big) v \sqrt{\mu} - \alpha L q(\xi), \tag{5.4}
\]
with eigenvalue $-\lambda$.
Lemma 5.1. Let $\beta > 1$. There exists sufficiently small $\alpha > 0$ such that there is an eigenfunction $q(\xi)$ to $T^\alpha$ with $\mathrm{Re}\,\lambda > 0$.

Proof. We first study the eigenvalue problem for the unperturbed operator $T^0$ for $\alpha = 0$:
\[
(\lambda + ivk) q - \beta k i \hat{U}(k) \Big(\int_{\mathbb{R}^3} q \sqrt{\mu}\, d\xi\Big) v \sqrt{\mu} = 0. \tag{5.5}
\]
This is similar to the Penrose dispersion relation for the unperturbed Vlasov-Poisson system ([14]) for a collisionless plasma. From (5.5), we obtain:
\[
q = \frac{\beta k \hat{U}(k) i \big(\int_{\mathbb{R}^3} q \sqrt{\mu}\, d\xi\big) v \sqrt{\mu}}{\lambda + ivk}. \tag{5.6}
\]
Normalizing $\int_{\mathbb{R}^3} q \sqrt{\mu}\, d\xi = 1$, we deduce that
\[
\beta i \int_{\mathbb{R}^3} \frac{\hat{U}(k) k v \mu(\xi)}{\lambda + ivk}\, d\xi = 1.
\]
Multiply and divide (5.6) by $(\lambda - ivk)\sqrt{\mu}$, take the imaginary part and then integrate on $\xi$. By consistency we must have:
\[
\beta \int_{\mathbb{R}^3} \frac{v^2 \hat{U}(k) k^2 \mu(\xi)}{\lambda^2 + k^2 v^2}\, d\xi = 1.
\]
Define
\[
F(\lambda, k) \equiv \beta \int_{\mathbb{R}^3} \frac{v^2 \hat{U}(k) k^2 \mu(\xi)}{\lambda^2 + k^2 v^2}\, d\xi.
\]
Clearly $F(0, k) = \beta \hat{U}(k)$. Since $\hat{U}(0) = 1$ and $\beta > 1$, there is $k_0$ sufficiently small so that
\[
F(0, k_0) = \beta \hat{U}(k_0) > 1. \tag{5.7}
\]
Moreover, $\lim_{\lambda \to \infty} F(\lambda, k_0) = 0$ for any $k_0 \ne 0$. Hence there exists a real number $\lambda > 0$ such that $F(\lambda, k_0) = 1$ and $q = \frac{\beta k_0 \hat{U}(k_0) i v \sqrt{\mu}}{\lambda + ivk_0}$ is the eigenfunction.

We now fix $k = k_0$ and return to (5.3). It can be proved (see for example [9]) that $\|Lq\|_{L^2} \le C \|\nu q\|_{L^2}$. Moreover, for hard spheres, there are constants $C_1$ and $C_2 > 0$ such that
\[
\|T^0 q\|_{L^2} = \Big\|ik_0 v q - \beta k_0 i \hat{U}(k_0) \Big\{\int_{\mathbb{R}^3} q \sqrt{\mu}\, d\xi\Big\} v \sqrt{\mu}\Big\|_{L^2} \ge C_1 \|\nu q\|_{L^2} - C_2 \|q\|_{L^2}. \tag{5.8}
\]
This implies $\|Lq\|_{L^2} \le C\{\|T^0 q\|_{L^2} + \|q\|_{L^2}\}$, so the perturbation $L$ is $T^0$-bounded. Since $T^\alpha = T^0 - \alpha L$, we deduce from Kato's book (p. 206, [12]) that, for $\alpha$ small, there is an eigenvalue $-\lambda$ and an eigenfunction $q(v, \zeta)$ for (5.3) with positive real part of $\lambda$.
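The existence of a real root $\lambda > 0$ of $F(\lambda, k_0) = 1$ used in the proof above can be illustrated numerically. The sketch below is not the authors' construction: it assumes a hypothetical Gaussian profile for $\hat{U}$ (the text only requires $\hat{U}(0) = 1$), integrates out $\zeta$ so that only the one-dimensional marginal of $\mu$ in $v$ remains, and locates the root by bisection, using that $F$ decreases from $\beta \hat{U}(k_0)$ to $0$ as $\lambda$ grows. All function names and numerical parameters are illustrative.

```python
import math


def mu_marginal(v, beta):
    """Marginal of the Maxwellian mu_beta in the v-component: N(0, 1/beta) density."""
    return math.sqrt(beta / (2.0 * math.pi)) * math.exp(-0.5 * beta * v * v)


def U_hat(k):
    """Hypothetical Fourier transform of the potential, normalized so U_hat(0) = 1."""
    return math.exp(-0.5 * k * k)


def dispersion_F(lam, k, beta, n=4001, cutoff=10.0):
    """F(lam, k) = beta * U_hat(k) * int k^2 v^2 mu / (lam^2 + k^2 v^2) dv (trapezoid rule)."""
    half_width = cutoff / math.sqrt(beta)
    h = 2.0 * half_width / (n - 1)
    total = 0.0
    for j in range(n):
        v = -half_width + j * h
        den = lam * lam + k * k * v * v
        # At lam = 0 the integrand equals 1 almost everywhere; use that value at v = 0.
        ratio = 1.0 if den == 0.0 else (k * k * v * v) / den
        weight = 0.5 if j in (0, n - 1) else 1.0
        total += weight * ratio * mu_marginal(v, beta)
    return beta * U_hat(k) * h * total


def growth_rate(k, beta, tol=1e-12):
    """Return lam > 0 with F(lam, k) = 1, or None when F(0, k) <= 1 (no real root)."""
    if dispersion_F(0.0, k, beta) <= 1.0:
        return None
    lo, hi = 0.0, 1.0
    while dispersion_F(hi, k, beta) > 1.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dispersion_F(mid, k, beta) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, `growth_rate(0.5, 2.0)` returns a positive rate because $\beta \hat{U}(k_0) > 1$ there, while `growth_rate(0.5, 0.9)` returns `None`, consistent with condition (5.7) failing when $\beta \le 1$.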
Theorem 5.1. Let β > 1. Then there is a 2π k0 -periodic eigenvector (g˜ 1 , g˜ 2 ) for −L such that Reλ > 0 and g˜ 2 (x, v, ζ ) = g˜ 1 (−x, −v, ζ ). Proof. We fix k0 as in Lemma 5.1. We define, for the family of Eqs. (5.2): α0 = sup α : there is an eigenvalue with positive real part for T α α 2π with a – periodic eigenvector . k0 By Lemma 5.1, for α sufficiently small this set is not empty. We want to show that α0 = +∞. We prove it by contradiction. Suppose α0 < +∞. We claim that, if there is such a finite α0 > 0, then there is an eigenvalue λ0 with an eigenfunction q0 with q0 ν = ν|q0 |2 dξ = 1, such that Reλ0 = 0 and (λ0 + ivk0 )q0 − βk0 iUˆ (k0 )
R3
√ √ q0 µdξ v µ = α0 Lq0 .
(5.9)
Proof of the claim: In fact, by (5.8), choose a family of eigenfunctions qα ∈ L 2 such that qα ν = 1, Reλα > 0, as α → α0 and √ √ ˆ qα µdξ v µ = αLqα . (5.10) (λα + ivk0 )qα − βk0 iU (k0 ) R3
Let q¯α denote the complex conjugate of qα . Notice that both (Lqα , q¯α ) and (ivk0 qα , q¯α ) are bounded by C qα 2ν . As α → α0 , taking the L 2 inner product with q¯a for (5.10), we deduce that |λα | is bounded for α → α0 . Hence limα→α0 λα = λ0 (up to subsequences) with Reλ0 ≥ 0. We now prove that λ0 is an eigenvalue so that Reλ0 = 0 by the definition of α0 and the claim is proven. Clearly, we may assume that lim qα = q0 weakly in L 2 and (5.9) are valid as α → α0 . We only need to show that lim qα =√q0 strongly so that q0 ν = 1, and q0 is an eigenfunction. Denote Ph = {1, v, |v|2 } µ. Clearly Pqα → Pq0 strongly in L 2 . It thus is left to show that (I − P)qα → (I − P)q0 strongly in L 2 . We subtract (5.10) from (5.9) to get (λα − λ0 )qα + (α − α0 )Lqα + (λ0 + ik0 v)(qα − q0 ) √ √ {qα − q0 } µdξ v µ + α0 L(qα − q0 ) −βk0 iUˆ (k0 ) = 0.
R3
We take the L 2 inner product with q¯α − q¯0 and then take the real part. Since Reλ0 qα − q0 2 ≥ 0, and ik0 v|qα − q0 |2 dξ is purely imaginary, we obtain: α0 L(g α − g), (g¯ α − g) ¯ ≤ (|λα − λ0 | + |α − α0 |) qα ν · qα − q0 ν √ +C| {qα − q0 } µdξ | · (g α − g) . R3
Therefore, (I − P){g α − g} → 0 in L 2ν and g ν = 1 and our claim follows.
Hence, λ0 is purely imaginary. Actually we show that it is 0. To do this we take the inner product with q¯0 in (5.9) to get √ √ 2 ˆ λ0 q0 2 − βk0 iU (k0 ) q0 µdξ q¯0 v µdξ + α0 Lg, g ¯ = 0. R3
But by integrating
√
R3
µ×(5.9) over ξ , we obtain the continuity equation √ √ λ0 q0 v µdξ + k0 i q0 v µdξ = 0. R3
√
R3
√
Therefore, k0 i R3 q¯0 v µdξ = λ¯ 0 R3 q¯0 v µdξ and 2 √ 2 ˆ ¯ q0 µdξ + α0 Lg, g ¯ = 0. λ0 g 2 − β λ0 U (k0 ) R3
(5.11)
Since Uˆ (k0 ) is real and λ0 is purely imaginary, taking the real part of (5.11) we conclude that
Lq0 , q¯0 = √ 0. Therefore, q0 is a linear combination of the collision invariants √ α0√ µ, ξ µ and |ξ 2 |2 µ and α0 Lq0 vanishes in (5.9). Now (5.9) reduces to a pure Vlasov equation, and we deduce that √ √ βik0 Uˆ (k0 ){ q0 (ξ ) µdξ }v µ q0 (ξ ) = . λ0 + ivk0 √ Since q0 (ξ ) µdξ = 0, this is compatible with the condition that q0 is a combination √ of collision invariants if and only if λ0 = 0. Thus q0 (ξ ) = β Uˆ (k0 ) µ, and hence β Uˆ (k0 ) = 1, which is a contradiction to (5.7). Theorem 5.1 thus follows.
(5.12)
6. Nonlinear Instability

In order to establish the non-linear instability, we need several lemmas on the properties of the fastest linear growing mode. First of all we need to establish the smoothness and long time behavior for the growing mode. Recall the definition of the operator $\mathcal{L}$ (5.1). We define
\[
\mathcal{M} = \{g = [g_1, g_2] \in L^2 \mid g_1(x, v, \zeta) = g_2(-x, -v, \zeta)\}
\]
and $\|\cdot\|_{L^2(\mathcal{M})}$ will denote the $L^2$ norm on this set. We have the following lemmas:

Lemma 6.1. Let $\beta > 1$. Then for $k_0$ sufficiently small, for all $\delta > 0$, the spectrum of $-\mathcal{L}$ in $\{\mathrm{Re}\,\lambda > \delta\}$ consists of a finite number of eigenvalues of finite multiplicity. If $\lambda_1$ denotes an eigenvalue with maximal real part, and $\Lambda > \max\{0, \mathrm{Re}\,\lambda_1\}$, then there exists $C_\Lambda > 0$ such that, for any $g_0 \in \mathcal{M}$,
\[
\|e^{-t\mathcal{L}} g_0\|_{L^2(\mathcal{M})} \le C_\Lambda e^{\Lambda t} \|g_0\|_{L^2(\mathcal{M})}.
\]
Proof. This follows easily from Vidav's Lemma [16]. Notice that we can split
\[
\mathcal{L} g = \{v \partial_x g + Lg\} + \{\beta F(\sqrt{\mu}\, g) v \sqrt{\mu}\} \equiv Ag + Kg,
\]
where $K$ is a compact operator from $L^2$ to $L^2$, while $\|e^{-tA}\|_{L^2(\mathcal{M}) \to L^2(\mathcal{M})} \le 1$.
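Lemma 6.2. Let $R = (R_1, R_2) \in L^2(\mathcal{M})$ with $\|R\|_{L^2(\mathcal{M})} = 1$ be an eigenvector of $-\mathcal{L}$ with $\mathrm{Re}\,\lambda > 0$. Then there exists a constant $C$ depending only on $\lambda$ such that
\[
\|\nabla_{x,v} R\|_{L^2(\mathcal{M})} \le C, \tag{6.1}
\]
\[
\sup_{x,v} w(v) |R(x, v)| \le C, \tag{6.2}
\]
where $w$ is a polynomial weight as in the previous section.

Proof. We begin with $R \in L^2$. We first claim
\[
R = -\int_0^\infty e^{-\lambda t} e^{-tA} K R\, dt. \tag{6.3}
\]
Notice that the corresponding growing mode $g(t) = e^{\lambda t} R$ satisfies $\partial_t g + Ag = -Kg$, so that
\[
e^{\lambda t} R = e^{-(t-s)A} e^{\lambda s} R - \int_s^t e^{-(t-\tau)A} K R\, e^{\lambda \tau}\, d\tau.
\]
Letting $s \to -\infty$, since $\|e^{-tA}\|_{L^2 \to L^2} \le 1$ for any $t > 0$ and $\mathrm{Re}\,\lambda > 0$, we get
\[
e^{\lambda t} R = -\int_{-\infty}^t e^{-(t-\tau)A} K R\, e^{\lambda \tau}\, d\tau = -\int_0^\infty e^{-\tau A} K R\, e^{\lambda(t - \tau)}\, d\tau.
\]
Dividing by $e^{\lambda t}$ we prove our claim because $\mathrm{Re}\,\lambda > 0$ and the integral converges in $L^2$. From the property of linear Boltzmann equation, clearly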
∂x {e−t A g} L 2 (M) ≤ ∂x g L 2 (M) . Taking the v derivative of (∂t + v∂x + L)g = 0 yields: {∂t + v∂x }{∂v g} + ∂v {−Lg} = −∂x g. From [6], ∂v {−Lg}∂v g ≥ ν2 |∂v g|2 − Cν g 2L 2 . We thus obtain by taking L 2 inner product with ∂v g,
||∂v g|| L 2 = ∂v {e−t A g0 } L 2 ≤ C(t + 1){ g0 L 2 + ∇x,v g0 L 2 }.
(6.4)
Since KR ∈ C ∞ and ∂x {KR} L 2 + ∂v {KR} L 2 ≤ C R L 2 , we can take ∂x and ∂v derivatives in (6.3) to get ∞
∂x R L 2 + ∂v R L 2 ≤ e−Reλt { ∂x {e−t A KR} L 2 + ∂v {e−t A KR} L 2 }dt 0 ∞ e−Reλt (t + 1) ∂x {KR} L 2 + ∂v {KR} L 2 dt ≤C 0 ∞ e−Reλt (t + 1)dt ≤ C R L 2 . ≤ C R L 2 0
We therefore deduce (6.1). To show (6.2), we denote S = w R. We then have √ µS S √ )vw µ} ≡ Aw S + Kw S. λS = {v∂x S + wL( )} + {β F( w w Applying the same proof in Sect. 3 for the stability for the pure linear Boltzmann operator, we can establish:
e−t Aw g0 L ∞ ≤ C{ g0 L 2 + g0 L ∞ }. ∞ We can similarly obtain S = − 0 e−λt e−t Aw Kw Sdt, so that from Reλ > 0, ∞ ∞
S L ≤ e−Reλt e−t Aw Kw S L ∞ dt 0 ∞ e−Reλt { Kw S L ∞ + Kw S L 2 }dt ≤C ≤ C.
0
Lemma 6.3. Let R be an eigenvector of −L with its eigenvalue λ with Reλ > 0. If λ is not real, then there is a constant ζ > 0 such that for all t > 0,
e−Lt Im R L 2 ≥ ζ eReλt Im R 2 > 0. √ √ Proof. We prove by contradiction. Notice that, since ImF( µ R) = F( µIm R), one can immediately check that e−Lt Im R = Im{e−Lt R} = eReλt (sin[Imλt]ReR + cos[Imλt]Im R). If the lemma were false, by passing through a convergent subsequence of sin[Imλtn ] and cos[Imλtn ], with n → ∞ we would have aIm R + bReR = 0, with a 2 + b2 = 1. Therefore either Im R or ReR would be a real eigenvector and λ would be real, a contradiction. Lemma 6.4. Let R be as in the preceeding lemma. There exists δ0 > 0, such that for 0 < δ < δ0 , there exists a (compactly supported) approximate eigenfunction Rδ such that √ δ|Rδ (x, v)| µ ≤ µ, √
R − Rδ L 2 ≤ δ,
∂x Rδ L 2 + ∂v Rδ L 2 ≤ C{ ∂x R L 2 + ∂v R L 2 }. Proof. In fact, we choose χ (v) to be a smooth cutoff function χ (v) = 1 for |v| ≤ N and χ (v) ≡ 0 for |v| ≥ N + 1. By Lemma 6.2, we have √ √ √ |χ (v)R(x, v)| µ ≤ |R(x, v)| µ = |wS| µ ≤ {Cwµ−1/2 }µ. (6.5) Define N by the equation δ =
µ1/2 (N +1) Cw(N +1)
and define
Rδ = χ (v)R(x, v).
Clearly the third estimate in the lemma is valid. From (6.5) and the definition of N and δ, the first inequality in the lemma is also valid. Since w is a polynomial, we have √ √ 1/2 (N +1) δ = µCw(N +1) ≥ µ(N ) when N is large. We then conclude the lemma by
R − Rδ L 2 = R1|v|≥N L 2 ≤ C
√ 1 µ 2 (v) dv = Cµ 2 (N ) ≤ δ. w(v) 1
|v|≥N
We now establish the crucial bootstrap lemma which shows that L 2 growth leads to the same growth rate for L ∞ . Lemma 6.5. Let g = (g1 , g2 ) be a solution to the nonlinear problem around Mhom : √ √ (6.6) (∂t + v∂x ) gi + β F( µgi+1 ) µv − L i g √ √ = −F( µgi+1 )∂v gi + F( µgi+1 )gi v + Γ (gi , gi ) + Γ (gi , gi+1 ). Assume that Reλ > 0 and
g(t) L 2 ≤ CeReλt g(0) L 2 for t ∈ [0, T ]. There exists ε0 > 0 such that if sup0≤t≤T { wg(t) L ∞ + g(t) L 2 } ≤ ε0 , then there is a constant C such that
∂x g(t) L 2 + ∂v g(t) L 2 + wg(t) L ∞ ≤ CeReλt { ∂x g(0) L 2 + ∂v g(0) L 2 + h(0) L ∞ }.
(6.7)
Proof. We take x and v derivatives for (6.6). Since F(Mi ) = 0, from (4.16), we have (h = wg) d
∂x g 2L 2 ≤ C h L ∞ { ∂x g 2L 2 + ∂v g 2L 2 } + C g 2L 2 , dt d
∂v g 2L 2 ≤ C ∂x g 2L 2 + C g 2L 2 . dt
(6.8)
Applying Gronwall’s inequality to (6.8), by ||h|| L ∞ ≤ ε0 < Reλ, we obtain t
∂x g(t) 2L 2 ≤ ∂x g(0) 2L 2 + Cε0 eCε0 (t−s) ∂v g(s) 2L 2 ds 0
+Ce d
∂v g(t) 2L 2 ≤ Cε0 dt
Reλt
0
t
g(0) 2L 2 ,
(6.9)
eCε0 (t−s) ∂v g(s) 2L 2 ds + CeReλt { g(0) 2L 2
+ ∂x g(0) 2L 2 }.
(6.10)
We further integrate over t of Eq. (6.10) to get t 2 2 ||∂v g(t)|| L 2 ≤ ||∂v g(0)|| L 2 + Cε0 0
τ 0
eCε0 (τ −s) ||∂v g(s)||2L 2 dsdτ
+CeReλt {||∂x g(0)||2L 2 + ||g(0)||2L 2 }.
(6.11)
We therefore have, by writing −Cε0 s = −Reλs + {Reλ − Cε0 }s, e−Reλt ||∂v g(t)||2L 2 ≤ e−Reλt ||∂v g(0)||2L 2 t τ +Cε0 e−Reλt eCε0 (τ −s) ||∂v g(s)||2L 2 dsdτ
≤
(6.12)
0 0 +C{||∂x g(0)||2L 2 + ||g(0)||2L 2 } t τ Cε0 e−Reλt eCε0 τ e(Reλ−Cε0 )s {e−Reλs ||∂v g(s)||2L 2 }dsdτ 0 0 +C{||∂v,x g(0)||2L 2 + ||g(0)||2L 2 }.
τ Since 0 e(Reλ−Cε0 )s ds ≤ Cλ e(Reλ−Cε0 )τ for Reλ − Cε0 > 21 Reλ > 0, we further bound the double integration as t τ −Reλs 2 −Reλt Cε0 τ ||∂v g(s)|| L 2 } × e e e(Reλ−Cε0 )s dsdτ Cε0 sup {e 0≤s≤t
0
≤ Cε0 sup {e−Cε0 s ||∂v g(s)||2L 2 } × Cλ e−Reλt 0≤s≤t
0
t
eCε0 τ e(Reλ−Cε0 )τ dτ
0
≤ Cλ ε0 sup {e−Reλs ||∂v g(s)||2L 2 }. 0≤s≤t
We therefore deduce from (6.12) for Cλ ε0
0. We choose the approximate eigenfunction Im R δ to the imaginary part of R by Lemma 6.4. In case λ is real, we simply do not take the imaginary parts. √ We choose a family of solutions f δ (0, x, ξ ) = µ + δIm R δ µ ≥ 0 or g δ (0, x, ξ ) = δIm R δ . Note that the positivity follows from the first statement in Lemma 6.4, for δ sufficiently small. Clearly, from Lemma 6.4, 1
g δ (0) − δIm R L 2 = δ R − Im R δ L 2 ≤ δ 1+ 2 ≤
δr , 2
for δ sufficiently small. Hence, from Lemma 6.4,
g δ (0) H 1 + h δ (0) L ∞ = δ Im R δ H 1 + δ Im R δ L ∞ ≤ Cδr. Now from the nonlinear Vlasov-Boltzmann system (6.6), we have g δ (t) = δe−Lt Im R δ √ √ t −F( µg2 )∂v g1 + F( µg2 )vg1 +Γ (g1 , g1 ) + Γ (g1 , g2 ) dτ. + e−L(t−τ ) √ √ −F( µg1 )∂v g2 + F( µg1 )vg2 +Γ (g2 , g2 ) + Γ (g2 , g1 ) 0 (6.13)
We choose Λ such that 1 Reλ < Λ < (1 + )Reλ. 2
(6.14)
Let T δδ =
1 ζr | ln √ |. Λ − Reλ 2CΛ δ
1 ln θδ , since 2(Λ − Reλ) < Reλ, for small δ, we have T δ ≤ T δδ . By (6.14) and T δ = Reλ δδ Clearly, e(Λ−Reλ)t ≤ e(Λ−Reλ)T = ζ r√ , and for 0 ≤ t ≤ T δδ : 2CΛ δ
√ ζ CΛ δeΛt ≤ eReλt r. 2
(6.15)
Let T ∗ = sup{s : ∇x g δ (t) L 2 + ∇v g δ (t) L 2 + h δ (t) L ∞ ≤ ε0 },
(6.16)
s
ζ Reλt δe r, for all 0 ≤ t ≤ s}. 4
T ∗∗ = sup{s : g δ (t) − δe−Lt R δ L 2 ≤ s
For 0 ≤ t ≤ min{T δ , T ∗∗ }, we have from (6.15), ζ
g δ (t) L 2 ≤ δ e−Lt Im R δ L 2 + δeReλt r 4
(6.18)
= δ e−Lt Im R L 2 + δ e−Lt {R − R δ } L 2 + 1
≤ δeReλt + CΛ δ {1+ 2 } eΛt + ≤ (1 +
(6.17)
ζ Reλt δe r 4
ζ Reλt δe r 4
3ζ Reλt δ )e
g (0) L 2 . 4
We now claim that T δ ≤ min{T ∗ , T ∗∗ }. In fact, if T ∗ < min{T δ , T ∗∗ }, then by (1.26), (6.16), (6.18), and the Lemma 6.5 (bootstrap lemma), we obtain for 0 ≤ t ≤ T ∗ :
∇x,v g δ (t) L 2 + h δ (t) L ∞ ≤ CeReλt { ∇x,v g δ (0) L 2 + h δ (0) L ∞ }. In particular, by (1.26), ∗
∇x,v g δ (T ∗ ) L 2 + h δ (T ∗ ) L ∞ ≤ CeReλT { ∇x,v g δ (0) L 2 + h δ (0) L ∞ } δ
< CeReλT δ = Cθ. This is a contradiction to the definition of T ∗ if θ is chosen 0. We now introduce a Ruelle transfer operator L: Assumption 2.1. Let a be an element in B ⊗ A[1,∞) , and E : B ⊗ Md (C) → B a completely positive unital map. Define a Ruelle transfer operator L on B ⊗ A[1,∞) by
L(Q) := τc,+ E ⊗ id[2,∞) (a ∗ Qa), Q ∈ B ⊗ A[1,∞) . (2) Assume that (i) The element a is in Fθ and invertible in Fθ . (ii) There exists an invariant state ϕ of L. (iii) There exists a positive constant K such that the following bound is valid: Let Q be any strictly positive element in B ⊗ (Aloc ∩ A[1,∞) ). There exists a positive integer N = N (Q) satisfying L n (Q) ≤ K inf L n (Q), ∀n ≥ N .
Y. Ogata
If Assumption 2.1 is valid, the restriction of L to the Banach space Fθ gives a bounded operator on Fθ (see Lemma C.3). Assumption 2.1 guarantees the following properties of L. Theorem 2.1. Let L be a Ruelle transfer operator satisfying Assumption 2.1. Then (i) There exist an element h in Fθ and a positive constant m > 0 such that L(h) = h, m ≤ h, ϕ(h) = 1. (ii) There exist strictly positive constants C1 and δ1 such that n L (Q) − ϕ(Q)h ≤ C1 e−δ1 n |Q|θ , ∀Q ∈ Fθ , ∀n ∈ N. θ Proof. See Appendix B.
Now we consider a family of Ruelle transfer operators {L α }α∈C . Theorem 2.2. Let C α → a(α) ∈ Fθ ∩ A[1,∞) be an Fθ -valued entire analytic function such that each a(α), α ∈ C has an inverse in Fθ . Let E : B ⊗ Md (C) → B be a completely positive unital map. For each α ∈ C, define a map L α : B⊗A[1,∞) → B⊗A[1,∞) by
¯ ∗ Qa(α)), Q ∈ B ⊗ A[1,∞) . L α (Q) := τc,+ E ⊗ id[2,∞) (a(α) (3) Assume that for real α, L α satisfies (iii) of Assumption 2.1. Then, for real α, (i) There exist a strictly positive number λ(α) and a strictly positive element h(α) in Fθ such that L α (h(α)) = λ(α)h(α), and lim λ(α)−n L nα (1) − h(α) = 0.
n→∞
(ii) The function R α → λ(α) is differentiable. Remark 2.1. An analogous result for the left-side chain A(−∞,−1] holds. Proof. Each L α , α ∈ C gives a bounded operator on Fθ into itself. (See Lemma C.3). To prove (i), we claim that for real α, there exists a strictly positive number λ(α) such that λ(α)−1 L α satisfies Assumption 2.1. Note that if α is real, L α is a completely positive map satisfying (i) and (iii) of Assumption 2.1. For each α ∈ R, there are a state ϕα and a strictly positive scalar λ(α) such that L ∗α ϕα = λ(α)ϕα . In fact, as a(α) is invertible and E is completely positive and unital, we have −2 L α (1) ≥ a(α)−1 > 0. Accordingly, if ν is a state of B ⊗ A[1,∞) , a state G(ν)(Q) :=
ν(L α (Q)) , ν(L α (1))
Q ∈ B ⊗ A[1,∞)
Large Deviations in Quantum Spin Chains
is well defined. This defines a weak∗ -continuous map G from the state space into itself. As the state space is weak∗ -compact and convex, by the Schauder Tychonov theorem, there exists a fixed point ϕα of G. This state ϕα and a strictly positive scalar λ(α) = ϕα (L α (1)), satisfy L ∗α ϕα = λ(α)ϕα . (See [A1].) Clearly, the operator λ(α)−1 L α satisfies Assumption 2.1. Applying Theorem 2.1 to λ(α)−1 L α , α ∈ R, we obtain a strictly positive element h(α) in Fθ such that L α (h(α)) = λ(α)h(α), ϕα (h(α)) = 1. Furthermore, for some strictly positive constants Cα and δα , we have λ(α)−n L n (Q) − ϕα (Q)h(α) ≤ Cα e−δα n |Q|θ , ∀Q ∈ Fθ , ∀n ∈ N. α θ
(4)
Hence (i) is proven. To prove (ii), we use Lemma 9.2 of [A1]: Lemma 2.1 (Lemma 9.2 of [A1]). Let X be a Banach space. Let L α be a bounded linear operator on X , analytic in α in a neighborhood D of a real point α0 ∈ D, satisfying the following conditions for all α ∈ R ∩ D: (a) There exist λ(α) > 0, h(α) ∈ X , and ϕα ∈ X ∗ such that L α (h(α)) = λ(α)h(α), L ∗α ϕα = λ(α)ϕα , ϕα (h(α)) = 1. (b) Define a projection Eα : X → X by Eα (Q) = ϕα (Q)h(α). There exists 0 < µα < λ(α) such that N lim µ−N α L α (1 − Eα ) N →∞
B(X )
= 0.
Then, there exists a neighborhood D of α0 such that D ∩ R α → λ(α) has an analytic extension to D . In particular, D ∩ R α → λ(α) is differentiable. As a(α) is an Fθ -valued entire analytic function, L α is a B(Fθ )-valued entire analytic function. (See Lemma C.4.) Furthermore, by the above argument, L α satisfies (a) of Lemma 2.1 with λ(α), h(α), and ϕα . From (4), we obtain λ(α)−n L nα (1 − Eα ) B(F ) ≤ Cα e−δα n . θ
1
Hence for 0 < µα := λ(α)e− 2 δα < λ(α), we have n 1 L (1 − Eα ) ≤ C α e − 2 δα n , µ−n α α B(F ) θ
and L α satisfies (b). Applying Lemma 2.1, R α → λ(α) is differentiable.
We will construct Ruelle operators L α so that the eigenvalue λ(α) in Theorem 2.2 corresponds to the logarithmic moment generating function f (α) in (1).
3. Large Deviation Principle for KMS-States Let be a finite range interaction and ω a unique (τ , β)-KMS state. Let be another finite range interaction. In this section, we prove the large deviation principle of the distribution of n1 H ([1, n]) in ω, Theorem 1.1. By the Gärtner-Ellis Theorem, it suffices to show the existence and differentiability of the logarithmic moment generating function, f (α) = lim
n→∞
1 log ω(eα H ([1,n]) ), ∀α ∈ R. n
(5)
Lemma 3.1. Let pn (α) be
β β pn (α) := T r[1,n] e− 2 H [1,n] eα H [1,n] e− 2 H [1,n] , α ∈ R.
It suffices to prove the existence and differentiability of the limit 1 log pn (α), ∀α ∈ R. n→∞ n lim
(6)
Proof. In [LR], it was shown that there exists a positive constant C1 such that C1−1 ωn ≤ ω|A[1,n] ≤ C1 ωn ,
(7)
where ωn is a state on A[1,n] given by ωn (A) =
T r[1,n] e−β H [1,n] A . T r[1,n] e−β H [1,n]
From this inequality, we have 1
lim log pn (α) − log ω(eα H [1,n] ) − log T r[1,n] e−β H [1,n] n→∞ n 1
log ωn (eα H [1,n] ) − log ω(eα H [1,n] ) = 0. = lim n→∞ n As the existence of the limit 1 log T r[1,n] e−β H [1,n] n→∞ n lim
is known, it suffices to prove the existence and differentiability of the limit (6). For Lemma 3.1, we shall confine our attention to the analysis of pn (α). We now define a family of Ruelle transfer operators {L α }α∈C in the form (3). We set B = Md (C), and define a completely positive unital map E : Md (C) ⊗ Md (C) → Md (C), through the formula E(a ⊗ b) := d −1 T r Md (C) (a)b. Next we introduce an Fθ -valued entire analytic function a(α). We denote by A1 the subalgebra given by Q ∈ Fθ ∩ A[1,∞) : 0 < ∀θ < 1 . Note that Aloc ∩ A[1,∞) is included in A1 . Let be any translation invariant finite range interaction. For a subset I of [1, ∞), we denote by eitδ(H (I )) the strongly continuous one parameter group of automorphisms generated by a generator i X ⊂I [(X ), ·].(See Appendix A.) Any element in A1 is
entire analytic for this automorphism group (Lemma A.2). We denote the cocycle associated to the perturbed dynamics eit (δ(H (I ))+i[P,·]) of eitδ(H (I )) with P = P ∗ ∈ A, by F1 (e(·)δ(H (I )) (P); it), t ∈ R. (See (47).) If Q and P = P ∗ are in A1 , then eitδ(H (I )) (Q) and F1 (e(·)δ(H (I )) (P); it) are in Fθ , for all 0 < θ < 1. Furthermore, the maps iR it → eitδ(H (I )) (Q), F1 (e(·)δ(H (I )) (P); it) ∈ Fθ have entire analytic extensions C z → e zδ(H (I )) (Q), F1 (e(·)δ(H (I )) (P); z) ∈ Fθ . Both e zδ(H (I )) (Q) and F1 (e(·)δ(H (I )) (P); z) are in A1 .(See Lemma A.3 and Lemma A.4.) For a translation invariant finite range interaction , we define Hˆ r (n) := (I ) ∈ A[1,∞) ∩ Aloc , I ⊂[1,∞),I ∩[1,n]=φ
Wr (n)
:=
(I )
∈ A[1,∞) ∩ Aloc .
I ⊂[1,∞),I ⊂[1,n−1],I ⊂[n+1,∞)
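We define $a(\alpha)$ by
\[
a(\alpha) := e^{\frac{\alpha}{2}\delta_\Psi(H[1,\infty))}\Big( F_1\big( e^{(\cdot)\delta_\Phi(H[2,\infty))}(\hat H^r_\Phi(1)),\, -\tfrac{\beta}{2} \big) \Big)\, F_1\big( e^{(\cdot)\delta_\Psi(H[2,\infty))}(\hat H^r_\Psi(1)),\, \tfrac{\alpha}{2} \big), \quad \alpha \in \mathbb{C}.
\]
As $F_1\big( e^{(\cdot)\delta_\Phi(H[2,\infty))}(\hat H^r_\Phi(1)), -\tfrac{\beta}{2} \big)$ is in $\mathcal{A}_1$, $a(\alpha)$ is a well-defined element in $\mathcal{F}_\theta$ and the map $\mathbb{C} \ni \alpha \mapsto a(\alpha) \in \mathcal{F}_\theta$ is entire analytic. Each $a(\alpha)$ has an inverse in $\mathcal{F}_\theta \cap \mathcal{A}_{[1,\infty)}$ (Lemma A.6). The Ruelle transfer operator on $\mathcal{B} \otimes \mathcal{A}_{[1,\infty)} = \mathcal{A}_{[0,\infty)}$ is given by
\[
L_\alpha(Q) := \gamma_{-1}\Big( d^{-1}\, \mathrm{Tr}_{\{0\}} \otimes \mathrm{id}_{[1,\infty)} \big( a(\bar\alpha)^* Q\, a(\alpha) \big) \Big), \quad \alpha \in \mathbb{C},\ Q \in \mathcal{A}_{[0,\infty)}. \tag{8}
\]
Now we prove that $L_\alpha$ with real $\alpha$ satisfies (iii) of Assumption 2.1. The proof goes parallel to the argument in [M]. We shall first write $L_\alpha^n$ in a more tractable form. By an inductive calculation, we obtain
\[
L_\alpha^n(Q) = d^{-n}\, \gamma_{-n} \circ \big( \mathrm{Tr}_{[0,n-1]} \otimes \mathrm{id}_{[n,\infty)} \big) \big( \tilde a_n(\alpha)^* Q\, \tilde a_n(\alpha) \big),
\]
where we denoted $a(\alpha)\gamma_1(a(\alpha))\gamma_2(a(\alpha)) \cdots \gamma_{n-1}(a(\alpha))$ by $\tilde a_n(\alpha)$. It can be shown that
\[
\tilde a_n(\alpha) = e^{\frac{\alpha}{2}\delta_\Psi(H[1,\infty))}\Big( F_1\big( e^{(\cdot)\delta_\Phi(H[n+1,\infty))}(\hat H^r_\Phi(n)),\, -\tfrac{\beta}{2} \big) \Big)\, F_1\big( e^{(\cdot)\delta_\Psi(H[n+1,\infty))}(\hat H^r_\Psi(n)),\, \tfrac{\alpha}{2} \big)
\]
(see Appendix A.7). Let $a_n(\alpha)$, $n \ge 2$, be
\[
a_n(\alpha) := \tilde a_n(\alpha)\, e^{\frac{\beta}{2} H_\Phi[1,n-1]}\, e^{-\frac{\alpha}{2} H_\Psi[1,n-1]}.
\]
We have (Lemma A.7)
\[
a_n(\alpha) = e^{\frac{\alpha}{2}\delta_\Psi(H[1,\infty))}\Big( F_1\big( e^{(\cdot)\delta_\Phi(H[1,\infty) - W^r(n))}(W^r_\Phi(n)),\, -\tfrac{\beta}{2} \big) \Big)\, F_1\big( e^{(\cdot)\delta_\Psi(H[1,\infty) - W^r(n))}(W^r_\Psi(n)),\, \tfrac{\alpha}{2} \big).
\]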
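We define a completely positive unital map $\varphi_n : \mathcal{A}_{[0,\infty)} \to \mathcal{A}_{[0,\infty)}$, $n \ge 2$, by
\[
\varphi_n(Q) := p_{n-1}(\alpha)^{-1} d^{-1}\, \gamma_{-n} \circ \big( \mathrm{Tr}_{[0,n-1]} \otimes \mathrm{id}_{[n,\infty)} \big) \Big( e^{-\frac{\beta}{2} H_\Phi[1,n-1]} e^{\frac{\alpha}{2} H_\Psi[1,n-1]}\, Q\, e^{\frac{\alpha}{2} H_\Psi[1,n-1]} e^{-\frac{\beta}{2} H_\Phi[1,n-1]} \Big).
\]
Using these notations, we can rewrite $L_\alpha^n$ as
\[
L_\alpha^n(Q) = d^{-(n-1)}\, p_{n-1}(\alpha)\, \varphi_n\big( a_n(\alpha)^* Q\, a_n(\alpha) \big), \quad n \ge 2. \tag{9}
\]
Next we evaluate (9), using the properties of $a_n(\alpha)$ given in Lemma A.7: that is,
\[
\lim_{n\to\infty} \big\| [Q, a_n(\alpha)] \big\| = 0, \quad \forall Q \in \mathcal{A}_{\mathrm{loc}}, \tag{10}
\]
and that there exists a positive constant $C$ such that
\[
\sup_{n\in\mathbb{N}} \| a_n(\alpha) \|, \ \sup_{n\in\mathbb{N}} \| (a_n(\alpha))^{-1} \| < C. \tag{11}
\]
Let $Q$ be any strictly positive element in $\mathcal{A}_{[0,n_0]}$. By (10), we can choose $\varepsilon > 0$ and $N(Q) \in \mathbb{N}$ so that
\[
4 C^3 \| Q^{\frac12} \|\, \varepsilon \le \inf Q, \quad\text{and}\quad N(Q) \ge n_0 + 1, \quad \big\| [Q^{\frac12}, a_n(\alpha)] \big\| < \varepsilon, \quad \forall n \ge N(Q).
\]
As $\varphi_n$ is a completely positive unital map, we have $\|\varphi_n\| = \|\varphi_n(1)\| = 1$. Note that $\varphi_n(Q)$ is a scalar if $n - 1 \ge n_0$. Thus we get
\[
L_\alpha^n(Q) \le d^{-(n-1)} p_{n-1}(\alpha) \Big( C^2 \varphi_n(Q) + 2C \| Q^{\frac12} \|\, \big\| [Q^{\frac12}, a_n(\alpha)] \big\| \Big) \le d^{-(n-1)} p_{n-1}(\alpha) \Big( \frac{1}{2C^2} + C^2 \Big) \varphi_n(Q),
\]
and
\[
L_\alpha^n(Q) \ge d^{-(n-1)} p_{n-1}(\alpha) \Big( -2C \| Q^{\frac12} \|\, \big\| [Q^{\frac12}, a_n(\alpha)] \big\| + \frac{1}{C^2} \varphi_n(Q) \Big) \ge d^{-(n-1)} p_{n-1}(\alpha)\, \frac{1}{2C^2}\, \varphi_n(Q),
\]
for all $n \ge N(Q)$. Hence we obtain (iii) of Assumption 2.1:
\[
\| L_\alpha^n(Q) \| \le (1 + 2C^4)\, \inf L_\alpha^n(Q), \quad \text{for all } n \ge N(Q).
\]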
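Proof of Theorem 1.1. We have seen that $\{L_\alpha\}_{\alpha\in\mathbb{C}}$ satisfies all the assumptions in Theorem 2.2. Therefore, we can apply the theorem to $\{L_\alpha\}_{\alpha\in\mathbb{C}}$. Accordingly, for real $\alpha$, there exist a strictly positive number $\lambda(\alpha)$ and a strictly positive element $h(\alpha)$ in $\mathcal{F}_\theta$ such that $L_\alpha(h(\alpha)) = \lambda(\alpha) h(\alpha)$, and
\[
\lim_{n\to\infty} \big\| \lambda(\alpha)^{-n} L_\alpha^n(1) - h(\alpha) \big\| = 0.
\]
Furthermore, $\mathbb{R} \ni \alpha \mapsto \lambda(\alpha)$ is differentiable. By (9) and (11), for $\alpha \in \mathbb{R}$ we have
\[
d^{-(n-1)} p_{n-1}(\alpha)\, C^{-2} \le L_\alpha^n(1) = d^{-(n-1)} p_{n-1}(\alpha)\, \varphi_n\big( a_n(\alpha)^* 1\, a_n(\alpha) \big) \le d^{-(n-1)} p_{n-1}(\alpha)\, C^2. \tag{12}
\]
Hence for any state $\nu$ on $\mathcal{A}_{[0,\infty)}$, we have
\[
\lim_{n\to\infty} \frac{1}{n-1} \Big| \log p_{n-1}(\alpha) - \log \nu\big( \lambda(\alpha)^{-n} L_\alpha^n(1) \big) - n \log \lambda(\alpha) - (n-1) \log d \Big|
= \lim_{n\to\infty} \Big| \frac{1}{n} \log p_n(\alpha) - \log \lambda(\alpha) - \log d \Big| = 0.
\]
Therefore, the limit
\[
\lim_{n\to\infty} \frac{1}{n} \log p_n(\alpha) = \log \lambda(\alpha) + \log d, \quad \forall \alpha \in \mathbb{R} \tag{13}
\]
exists and is differentiable. Applying Lemma 3.1, we have thus proved the theorem.

4. Large Deviation Principle for $C^*$-Finitely Correlated States

In this section, we prove the large deviation principle for finitely correlated states, Theorem 1.2. First, we note the following fact:

Lemma 4.1. For $i = 1, \ldots, l$, $l \in \mathbb{N}$, let $\{\mu_{i,n}\}_{n\in\mathbb{N}}$ be a sequence of distributions over $\mathbb{R}$. Suppose that each $\{\mu_{i,n}\}_{n\in\mathbb{N}}$ satisfies the large deviation principle with a good rate function $I_i$. Let $\lambda_i$, $i = 1, \ldots, l$, be positive numbers such that
\[
\lambda_i > 0, \quad \sum_{i=1}^{l} \lambda_i = 1.
\]
For each $n \in \mathbb{N}$, define $\mu_n$ by
\[
\mu_n := \sum_{i=1}^{l} \lambda_i\, \mu_{i,n}.
\]
Then $\{\mu_n\}_{n\in\mathbb{N}}$ satisfies the large deviation principle with a good rate function $I(x) := \min_{1\le i\le l} I_i(x)$.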
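Lemma 4.1 can be illustrated numerically with a simple example that is an assumption of this sketch, not part of the paper: take $\mu_{i,n} = N(m_i, 1/n)$, which satisfies the large deviation principle with good rate function $I_i(x) = (x - m_i)^2/2$, and evaluate the mixture on a half-line $[a, \infty)$ with $a \ge \max_i m_i$. The function names below are hypothetical.

```python
import math

def log_gaussian_tail(a, mean, n):
    """log P(X >= a) for X ~ N(mean, 1/n); valid while erfc does not underflow."""
    return math.log(0.5 * math.erfc((a - mean) * math.sqrt(n / 2.0)))

def mixture_rate_estimate(a, means, weights, n):
    """-(1/n) log mu_n([a, inf)) for mu_n = sum_i weights[i] * N(means[i], 1/n)."""
    total = sum(w * math.exp(log_gaussian_tail(a, m, n))
                for w, m in zip(weights, means))
    return -math.log(total) / n

def min_rate(a, means):
    """min_i inf_{x >= a} (x - m_i)^2 / 2, assuming a >= max(means)."""
    return min((a - m) ** 2 / 2.0 for m in means)
```

With `means = [0.0, 0.5]`, equal weights and `a = 1.5`, the estimate approaches `min_rate(a, means) = 0.5` from above as `n` grows: the component with the smallest rate dominates, and the weights only contribute a correction of order `1/n`.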
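Proof. For any Borel set $\Gamma$ of $\mathbb{R}$, we have $\inf_{x\in\Gamma} I(x) = \min_{1\le i\le l} \inf_{x\in\Gamma} I_i(x)$. As $\{\mu_{i,n}\}_{n\in\mathbb{N}}$ satisfies the large deviation principle with a rate function $I_i$, we have
\[
-\inf_{x\in\Gamma^{\circ}} I_i(x) \le \liminf_{n\to\infty} \frac{1}{n} \log \mu_{i,n}(\Gamma) \le \limsup_{n\to\infty} \frac{1}{n} \log \mu_{i,n}(\Gamma) \le -\inf_{x\in\bar\Gamma} I_i(x) \le -\inf_{x\in\bar\Gamma} I(x). \tag{14}
\]
First we prove the upper bound. For any Borel set $\Gamma$, we have
\[
\sup_{m\ge n} \frac{1}{m} \log \mu_m(\Gamma) = \sup_{m\ge n} \frac{1}{m} \log \Big( \sum_{i=1}^{l} \lambda_i\, \mu_{i,m}(\Gamma) \Big) \le \max_{1\le i\le l}\, \sup_{m\ge n} \frac{1}{m} \log \mu_{i,m}(\Gamma). \tag{15}
\]
If $\inf_{x\in\bar\Gamma} I(x) = +\infty$, then for any $R > 0$, we have from (15) and (14),
\[
\sup_{m\ge n} \frac{1}{m} \log \mu_m(\Gamma) \le \max_{1\le i\le l}\, \sup_{m\ge n} \frac{1}{m} \log \mu_{i,m}(\Gamma) < -R,
\]
for any $n$ large enough. Hence we have
\[
\limsup_{n\to\infty} \frac{1}{n} \log \mu_n(\Gamma) = -\infty \le -\inf_{x\in\bar\Gamma} I(x) = -\infty.
\]
If $\inf_{x\in\bar\Gamma} I(x) < +\infty$, then for any $\varepsilon > 0$, we have from (15) and (14),
\[
\sup_{m\ge n} \frac{1}{m} \log \mu_m(\Gamma) \le \max_{1\le i\le l}\, \sup_{m\ge n} \frac{1}{m} \log \mu_{i,m}(\Gamma) \le -\inf_{x\in\bar\Gamma} I(x) + \varepsilon,
\]
for any $n$ large enough. Hence we have
\[
\limsup_{n\to\infty} \frac{1}{n} \log \mu_n(\Gamma) \le -\inf_{x\in\bar\Gamma} I(x).
\]
We thus proved the upper bound. The lower bound is trivial if $\inf_{x\in\Gamma^{\circ}} I(x) = +\infty$. If $\inf_{x\in\Gamma^{\circ}} I(x) < +\infty$, then there exists $i_0$ such that $\inf_{x\in\Gamma^{\circ}} I_{i_0}(x) = \inf_{x\in\Gamma^{\circ}} I(x) < +\infty$. For any $\varepsilon > 0$, we have
\[
\inf_{m\ge n} \frac{1}{m} \log \mu_{i_0,m}(\Gamma) \ge -\inf_{x\in\Gamma^{\circ}} I_{i_0}(x) - \varepsilon,
\]
for $n$ large enough. We thus obtain
\[
\inf_{m\ge n} \frac{1}{m} \log \mu_m(\Gamma) = \inf_{m\ge n} \frac{1}{m} \log \Big( \sum_{i=1}^{l} \lambda_i\, \mu_{i,m}(\Gamma) \Big) \ge \inf_{m\ge n} \frac{1}{m} \log \big( \lambda_{i_0}\, \mu_{i_0,m}(\Gamma) \big)
\]
\[
\ge \frac{1}{n} \log \lambda_{i_0} + \inf_{m\ge n} \frac{1}{m} \log \mu_{i_0,m}(\Gamma) \ge \frac{1}{n} \log \lambda_{i_0} - \inf_{x\in\Gamma^{\circ}} I_{i_0}(x) - \varepsilon,
\]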
for $n$ large enough. Therefore, we have the lower bound
\[
\liminf_{n\to\infty} \frac{1}{n} \log \mu_n(\Gamma) \ge -\inf_{x\in\Gamma^{\circ}} I_{i_0}(x) = -\inf_{x\in\Gamma^{\circ}} I(x).
\]
Note that $\{x \in \mathbb{R} : I(x) \le \alpha\} = \cup_{i=1}^{l} \{x \in \mathbb{R} : I_i(x) \le \alpha\}$ for any $\alpha \in [0, \infty)$. As each $\{x \in \mathbb{R} : I_i(x) \le \alpha\}$ is compact, so is $\{x \in \mathbb{R} : I(x) \le \alpha\}$. Hence $I$ is a good rate function.

Let $\omega$ be a $C^*$-finitely correlated state generated by a finite dimensional $C^*$-algebra $\mathcal{B}$, a completely positive unital map $E : M_d(\mathbb{C}) \otimes \mathcal{B} \to \mathcal{B}$ and a faithful state $\rho$. We define a completely positive unital map $\hat E_1 : \mathcal{B} \to \mathcal{B}$ through the formula $\hat E_1(b) := E(1 \otimes b)$, $b \in \mathcal{B}$. For $l \in \mathbb{N}$, we denote the $l$-th iterate of $E$, $E \circ (\mathrm{id}_{\{1\}} \otimes E) \circ \cdots \circ (\mathrm{id}_{[-l+1,-1]} \otimes E)$, by $E^{(l)}$. It is known that every $C^*$-finitely correlated state has a decomposition as a finite convex combination of extremal periodic states, which are again $C^*$-finitely correlated [FNW]. That is, we can write $\omega$ as a finite sum $\omega = \sum_{i=1}^{n} \lambda_i \omega_i$, $0 < \lambda_i$, $\sum_{i=1}^{n} \lambda_i = 1$, where each $\omega_i$ is an extremal $p_i$-periodic state. Furthermore, $\omega_i$ is a $C^*$-finitely correlated state on $(M_d(\mathbb{C})^{\otimes p_i})^{\mathbb{Z}}$, generated by a triple $(\mathcal{B}_i, E_i, \rho_i)$, such that 1 is a nondegenerate eigenvalue of $(\hat E_i)_1$, and the rest of the spectrum has modulus strictly less than 1. Therefore, by Lemma 4.1, it suffices to prove the large deviation principle for $\omega$ generated by a completely positive map $E$ such that $\hat E_1$ has a nondegenerate eigenvalue 1 and the rest of the spectrum has modulus strictly less than 1. We shall confine our attention to this case.

Lemma 4.2. Let $\omega$ be a $C^*$-finitely correlated state on $\bigotimes_{\mathbb{Z}} M_d(\mathbb{C})$ generated by $(\mathcal{B}, E, \rho)$. Assume $\hat E_1$ has a nondegenerate eigenvalue 1 and the rest of the spectrum has modulus strictly less than 1. Then there exist a positive constant $s > 0$ and $l \in \mathbb{N}$ such that $\omega$ is a $C^*$-finitely correlated state on $\bigotimes_{\mathbb{Z}} M_d(\mathbb{C})^{\otimes l}$ generated by $(\mathcal{B}, E^{(l)}, \rho)$ satisfying
\[
s^{-1} \rho(b) \le \big(\widehat{E^{(l)}}\big)_1(b) \le s\, \rho(b), \quad 0 \le \forall b,\ b \in \mathcal{B}. \tag{16}
\]
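The mechanism behind Lemma 4.2 can be seen in a commutative toy case, offered here only as an illustrative sketch under stated assumptions: replace $\mathcal{B}$ by $\mathbb{C}^k$, $\hat E_1$ by a primitive row-stochastic matrix `E` acting on column vectors (so that `E` preserves the constant vector), and $\rho$ by its stationary probability vector. Since both sides of (16) are linear and positive in $b$, it is enough to test the basis vectors, which gives an entrywise condition on the powers of `E`. The function names are hypothetical.

```python
import numpy as np

def stationary_state(E):
    """Probability vector rho with rho @ E = rho for a row-stochastic matrix E."""
    w, v = np.linalg.eig(E.T)
    k = np.argmin(np.abs(w - 1.0))
    rho = np.real(v[:, k])
    return rho / rho.sum()

def find_block_length(E, s, max_l=200):
    """Smallest l with rho[j] / s <= (E^l)[i, j] <= s * rho[j] for all i, j."""
    rho = stationary_state(E)
    power = np.eye(E.shape[0])
    for l in range(1, max_l + 1):
        power = power @ E
        # Broadcasting compares column j of the power with rho[j].
        if np.all(power >= rho / s) and np.all(power <= s * rho):
            return l
    return None
```

For a matrix with some zero entries the condition fails at `l = 1` but holds after finitely many steps, because the powers of `E` converge to the rank-one matrix whose rows all equal `rho`; this is the classical counterpart of grouping `l` sites into one block.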
Proof. We claim that there exist an integer $l$ and a positive constant $s > 0$ such that
\[
s^{-1} \rho(b) \le \hat E_1^{\,l}(b) \le s\, \rho(b), \quad 0 \le b,\ b \in \mathcal{B}. \tag{17}
\]
To see this, let $P$ be a spectral projection of $\hat E_1$ corresponding to the eigenvalue 1, and set $\bar P = 1 - P$. By assumption, the range of $P$ is $\mathbb{C}1$. As $\rho$ is a faithful state on a finite dimensional $C^*$-algebra, there exists $c > 0$ such that $\rho(\cdot) \ge c\, \mathrm{Tr}_{\mathcal{B}}(\cdot)$. Accordingly, we have $c \|b\| \le \rho(b)$, $\forall b \ge 0$, $b \in \mathcal{B}$. By the assumption, if we take $l$ large enough, we have
\[
\big\| (\hat E_1)^l \bar P(b) \big\| \le \frac{c}{2} \|b\|, \quad \forall b \in \mathcal{B}.
\]
Furthermore, we have
\[
\rho(b) = \lim_{n\to\infty} \rho\big( \hat E_1^{\,n}(b) \big) = \rho(P(b)).
\]
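Hence we have $P(b) = \rho(b) 1$. We thus obtain the claim: there exists $l$ such that
\[
\frac{1}{2} \rho(b) \le \rho(b) - \frac{c}{2} \|b\| \le \hat E_1^{\,l}(b) = \hat E_1^{\,l}(P b) + \hat E_1^{\,l}(\bar P b) = \rho(b) + \hat E_1^{\,l} \bar P(b) \le \rho(b) + \frac{c}{2} \|b\| \le \frac{3}{2} \rho(b),
\]
for $0 \le b$, $b \in \mathcal{B}$. Note that $\omega$ is a $C^*$-finitely correlated state on $\bigotimes_{\mathbb{Z}} (M_d(\mathbb{C}))^{\otimes l}$, generated by $(\mathcal{B}, E^{(l)}, \rho)$. Furthermore, we have $\big(\widehat{E^{(l)}}\big)_1 = (\hat E_1)^l$, and obtain (16).

Now we apply the Ruelle transfer operator method to prove existence and differentiability of the logarithmic moment generating function for a $C^*$-finitely correlated state $\omega$ satisfying (16).

Lemma 4.3. Let $\omega$ be a $C^*$-finitely correlated state on $\bigotimes_{\mathbb{Z}} M_d(\mathbb{C})$ generated by $(\mathcal{B}, E, \rho)$. Suppose that there exists a positive constant $s$ such that
\[
s^{-1} \rho(b) \le (\hat E)_1(b) \le s\, \rho(b), \quad 0 \le \forall b,\ b \in \mathcal{B}. \tag{18}
\]
Then for any $\alpha \in \mathbb{R}$, the limit
\[
\lim_{n\to\infty} \frac{1}{n} \log \omega\big( e^{\alpha H[-n,-1]} \big)
\]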
exists and is differentiable with respect to α. Proof. As in Sect. 3, we use the Ruelle transfer operator method. We use the analogous notation and arguments of Sect. 2 for the left side chain. The algebra Fθ is defined analogously for the left side chain and an analogous result of Theorem 2.2 holds. For a translation invariant finite range interaction , we define Hˆ l (n) := (I ) ∈ A(−∞,−1] ∩ Aloc , I ⊂(−∞,−1],I ∩[−n,−1]=φ
Wl (n)
:=
(I )
∈ A(−∞,−1] ∩ Aloc .
(19)
I ⊂(−∞,−1],I ⊂[−n+1,−1],I ⊂(−∞,−n−1]
As a transfer operator, we consider a map from A(−∞,−1] ⊗ B to A(−∞,−1] ⊗ B. For each α ∈ C, we define L α by
L α (Q) := τc− ◦ id(−∞,−2] ⊗ E a(α) ¯ ∗ Qa(α) , Q ∈ A(−∞,−1] ⊗ B. We set a(α) to be
α a(α) := F1 e(·)δ(H (−∞,−2]) Hˆ l (1) , , α ∈ C. 2
As in Lemma A.6, C α → a(α) is an Fθ -valued entire analytic function and each a(α) has an inverse in Fθ . Now we prove that each L α , α ∈ R satisfies (iii) of Assumption 2.1. We shall first write L nα in a more tractable form. By an inductive calculation, we obtain
n
L nα (Q) = τc− ◦ (id(−∞,−2] ⊗ E) a˜ n (α)∗ Q a˜ n (α) ,
where a˜ n (α) := a(α)γ−1 (a(α)) · · · γ−(n−1) (a(α)). Let an (α), n ≥ 2 be α
an (α) := a˜ n (α)e− 2 H [−n+1,−1] . For each n ≥ 2, we define a positive constant pn (α) and a completely positive map n by
pn (α) := ω eα H [−n+1,−1] ,
n α H [−n+1,−1] α H [−n+1,−1] n (Q) := pn−1 (α) τc− ◦ (id(−∞,−2] ⊗ E) e2 . Qe 2 (20) Using these notations, we can write L nα as L nα (Q) = pn (α)n (an (α)∗ Qan (α)),
Q ∈ A(−∞,−1] ⊗ B, n ≥ 2.
Next, note that for R ∈ A[−n+1,−1] ⊗ B, n ≥ 2, an element
n−1 α H [−n+1,−1] α H [−n+1,−1] e2 τc− ◦ (id(−∞,−2] ⊗ E) Re 2
(21)
(22)
belongs to 1A(−∞,−1] ⊗ B, and (identifying 1A(−∞,−1] ⊗ B with B),
n−1 α H [−n+1,−1] e = ω(eα H [−n+1,−1] ) = pn (α). ρ τc− ◦ (id(−∞,−2] ⊗ E) Accordingly, ϕn (R) := pn (α)−1 ρ
n−1 α H [−n+1,−1] α H [−n+1,−1] e2 τc− ◦ (id(−∞,−2] ⊗ E) Re 2
defines a state on A[−n+1,−1] ⊗ B. We claim s −1 ϕn (R) ≤ n (R) ≤ sϕn (R), ∀R ≥ 0,
R ∈ A[−n+1,−1] ⊗ B.
(23)
To see this, we denote (22) by 1A(−∞,−1] ⊗ b R . We have
n (R) = pn−1 (α) τc− ◦ (id(−∞,−2] ⊗ E)
n−1 α H [−n+1,−1] α H [−n+1,−1] e2 Re 2 τc− ◦ (id(−∞,−2] ⊗ E)
= pn−1 (α) 1A(−∞,−1] ⊗ Eˆ1 (b R ) . Therefore, from the bound (18), we obtain the claim: s −1 ϕn (R) = s −1 pn−1 (α)ρ(b R ) ≤ n (R)
= pn−1 (α) 1A(−∞,−1] ⊗ Eˆ1 (b R ) ≤ spn−1 (α)ρ(b R ) = sϕn (R).
(24)
From (23), we have 0 ≤ n (1) ≤ s. As n is completely positive, we obtain n = n (1) ≤ s.
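We now check the condition (iii). As in Sect. 3, there exists a positive constant $C > 0$ such that
\[
\sup_{n\in\mathbb{N}} \| a_n(\alpha) \|, \ \sup_{n\in\mathbb{N}} \| a_n(\alpha)^{-1} \| < C. \tag{25}
\]
Furthermore, we have
\[
\lim_{n\to\infty} \big\| [Q, a_n(\alpha)] \big\| = 0, \quad \forall Q \in \mathcal{A}_{\mathrm{loc}}.
\]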
(See Lemma A.7.) For a strictly positive element Q in A[−n 0 ,−1] ⊗ B, we can choose ε > 0 and N (Q) ∈ N so that 1 1 −2 s inf Q, 2ε Q 2 C ≤ 2C 2 and
1 n 0 + 1 ≤ N (Q), [Q 2 , an (α)] < ε, ∀n ≥ N (Q).
Thus, due to the inequality (23), for n ≥ N (Q), we have L nα (Q) = pn (α)n (an (α)∗ Qan (α)) 1 1 ≤ 2C n [Q 2 , an (α)] Q 2 pn (α) + C 2 spn (α)ϕn (Q) 1 −1 2 ≤ pn (α) s + C s ϕn (Q), 2C 2 1 1 1 L nα (Q) ≥ −2 n C [Q 2 , an (α)] Q 2 pn (α) + 2 s −1 ϕn (Q) pn (α) C 1 −1 ≥ pn (α)ϕn (Q) 2 s . 2C Hence for n ≥ N (Q), we obtain L nα (Q) ≤ 2C 2 s
1 −1 2 s + C s inf L nα (Q). 2C 2
We thus showed (iii). Hence {L α }α∈C satisfies all the assumptions of Theorem 2.2. We thus can apply the left-side version of Theorem 2.2 to {L α }α∈C , and for α ∈ R, we obtain lim λ(α)−n L nα (1) − h(α) = 0, n→∞
for some strictly positive element h(α) in A(−∞,−1] ⊗ B and a strictly positive number λ(α). Furthermore, λ(α) is differentiable with respect to α. By (21), (23)and (25), we have 1 pn (α) ≤ C −2 pn (α)n (1) ≤ L nα (1) = pn (α)n (an (α)∗ an (α)) sC 2 ≤ C 2 pn (α)n (1) ≤ C 2 spn (α).
(26)
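For any state $\nu$ on $\mathcal{A}_{(-\infty,-1]} \otimes \mathcal{B}$, we obtain
\[
\lim_{n\to\infty} \frac{1}{n} \log \omega\big( e^{\alpha H[-n,-1]} \big) = \lim_{n\to\infty} \frac{1}{n} \log p_n(\alpha) = \lim_{n\to\infty} \frac{1}{n} \log \nu\big( L_\alpha^n(1) \big) = \log \lambda(\alpha). \tag{27}
\]
As $\log \lambda(\alpha)$ is differentiable, we have proved the lemma.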
Proof of Theorem 1.2. For a translation invariant finite range interaction over
ˆ over ⊗Z Md (C)⊗k by ⊗Z Md (C) and k ∈ N, we define an interaction ˆ ) := (I (X ), X :I = I˜(X )
I˜(X ) := {l ∈ Z : (kl + [0, k − 1]) ∩ X = φ}.
ˆ is translation invariant finite range interaction over ⊗Z Md (C)⊗k It is easy to see that such that Hˆ (J ) = H ( Jˆ),
Jˆ = k · J + [0, k − 1],
(28)
for all finite subsets J of Z. Furthermore, by the same argument as Appendix A, (use Lemma A.2 and A.4), we have α α α n n α C := sup e 2 H ([−n,−1]) e− 2 H ([−k[ k ],−1]) + sup e 2 H ([−k[ k ],−1]) e− 2 H ([−n,−1]) n n
α
n n = sup F1 e(·)δ ( H ([−k[ k ],−1])) H ([−n, −1]) − H ([−k[ ], −1]) , k 2 n
α n + sup F1 e(·)δ(H ([−n,−1])) H ([−k[ ], −1]) − H ([−n, −1]) , k 2 n < ∞. Hence we obtain n
n
C −2 eα Hˆ [−[ k ],−1] ≤ eα H [−n,−1] ≤ C 2 eα H [−k[ k ],−1] = C 2e
α Hˆ [−[ nk ],−1]
, ∀n ∈ N, ∀α ∈ R.
(29)
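Any $C^*$-finitely correlated state $\omega$ has a decomposition as a finite convex combination
\[
\omega = \sum_{i=1}^{m} \lambda_i \omega_i, \quad 0 < \lambda_i, \quad \sum_{i=1}^{m} \lambda_i = 1,
\]
where each $\omega_i$ is a $C^*$-finitely correlated state on $\bigotimes_{\mathbb{Z}} M_d(\mathbb{C})^{\otimes p_i}$, generated by a triple $(\mathcal{B}_i, E_i, \rho_i)$, such that 1 is a nondegenerate eigenvalue of $(\hat E_i)_1$, and the rest of the spectrum has modulus strictly less than 1. By Lemma 4.2, $\omega_i$ is a $C^*$-finitely correlated state on $\bigotimes_{\mathbb{Z}} M_d(\mathbb{C})^{\otimes p_i l_i}$, generated by $(\mathcal{B}_i, E_i^{(l_i)}, \rho_i)$ satisfying
\[
s_i^{-1} \rho_i(b) \le \big(\widehat{E_i^{(l_i)}}\big)_1(b) \le s_i\, \rho_i(b), \quad 0 \le \forall b,\ b \in \mathcal{B}_i, \tag{30}
\]
for some $l_i \in \mathbb{N}$ and $s_i > 0$.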
ˆ i over As stated above, there exists a translation invariant finite range interaction
⊗ p l i i such that ⊗Z Md (C) Ci−2 e
α Hˆ [−[ pnl ],−1] i
i i
≤ eα H [−n,−1] ≤ Ci2 e
α Hˆ [−[ pnl ],−1] i i
i
, ∀n ∈ N, ∀α ∈ R,
for some Ci > 0. By this, we have
1
1 α H [−[ n ],−1] − log ωi eα H [−n,−1] lim log ωi e ˆ i pi li = 0. n→∞ n n
As ωi is a C ∗ −finitely correlated state on Z Md (C)⊗ pi li ) generated by (Bi , Ei(li ) , ρi ) satisfying (30), Lemma 4.3, implies the existence and differentiability of
1 1 α H [−[ n ],−1] . log ωi eα H [−n,−1] = lim log ωi e ˆ i pi li n→∞ n n→∞ n lim
By the Gärtner-Ellis Theorem this proves the large deviation principle with good rate function Ii for each ωi . Applying Lemma 4.1, we conclude the large deviation principle for ω with a good rate function I (x) = min1≤i≤l Ii (x). 5. Equivalence of Ensembles An immediate consequence of Theorem 1.1 is the equivalence of ensembles considered in [DMN1]. Let 1 , . . . , K be the translation invariant finite range interactions and X 1,N , . . . , X K ,N corresponding macroscopic observables: X k,N := N1 Hk [1, N ]. Several notions of concentration of macroscopic observables were introduced in [DMN1]: A sequence of projections {PN } N , PN ∈ A[1,N ] , is said to be concentrating at x ∈ R K whenever
T r[1,N ] F(X k,N )PN = F(xk ), lim N →∞ T r[1,N ] (PN ) mc
for all F ∈ C(R) and k = 1, · · · K , and written PN →x. In order to define concentration of states, we need a set F of maps G from a set of all finite sequences of {1, . . . , K }, I , to C, such that
|G(k1 , . . . , km )|
m≥0 (k1 ,...,km )∈I
m k < ∞. i i=1
We define G(X N ) by G(X N ) :=
G(k1 , . . . , km )X k1 ,N . . . X km ,N .
m≥0 (k1 ,...,km )∈I
A sequence of states $\omega_N$ on $\mathcal{A}_{[1,N]}$ is concentrating at $x \in \mathbb{R}^K$ if
\[
\lim_{N\to\infty} \omega_N(G(X_N)) = G(x),
\]
for all $G \in \mathcal{F}$, and written $\omega_N \to x$. It was shown in [DMN1] that if $P_N \overset{mc}{\to} x$, then the states $\frac{\mathrm{Tr}_{[1,N]}(\,\cdot\, P_N)}{\mathrm{Tr}_{[1,N]}(P_N)} \to x$. Furthermore, we write $\omega_N \overset{1}{\to} x$ whenever $\lim_{N\to\infty} \omega_N(X_{k,N}) = x_k$. Three H-functions $H^{mc}$, $H^{can}$, $H_1^{can}$ were introduced in [DMN1]:
\[
H^{mc}(x) := \sup_{P_N \overset{mc}{\to} x}\ \limsup_{N\to+\infty} \frac{1}{N} \log \mathrm{Tr}_{[1,N]}(P_N),
\]
\[
H^{can}(x) := \sup_{\omega_N \to x}\ \limsup_{N\to+\infty} \frac{1}{N} H(\omega_N),
\]
\[
H_1^{can}(x) := \sup_{\omega_N \overset{1}{\to} x}\ \limsup_{N\to+\infty} \frac{1}{N} H(\omega_N),
\]
where $H(\omega_N)$ is the von Neumann entropy of $\omega_N$. By definition, we have $H^{mc}(x) \le H^{can}(x) \le H_1^{can}(x)$. The following theorem was proven in [DMN1].

Theorem 5.1. Assume that there exists a sequence of states $\omega_N$ on $\mathcal{A}_{[1,N]}$ with density matrices $\sigma_N$, satisfying the following conditions:
(i) For all $\delta > 0$ and $k$, there exist $C_k(\delta) > 0$ and $N_k(\delta) \in \mathbb{N}$ such that
\[
\int_{x_k-\delta}^{x_k+\delta} \omega_N\big( Q_N^k(d\lambda) \big) \ge 1 - e^{-C_k(\delta) N}, \quad \forall N \ge N_k(\delta),
\]
where $Q_N^k$ is the spectral projection of $X_{k,N}$.
(ii) For all $\delta > 0$,
\[
\lim_{N\to\infty} \frac{1}{N} \log \int_{-\delta}^{\delta} \omega_N\big( \tilde Q_N(d\lambda) \big) = 0,
\]
where $\tilde Q_N$ is the spectral projection of $\frac{1}{N}\big( \log \sigma_N - \mathrm{Tr}_{[1,N]}\, \sigma_N \log \sigma_N \big)$.
(iii) $H_1^{can}(x) = \lim_{N\to\infty} \frac{1}{N} H(\omega_N)$.
Then we have
\[
H^{mc}(x) = H^{can}(x) = H_1^{can}(x).
\]
This means the equivalence of microcanonical ensemble and canonical ensemble. Let us consider a sequence of states of the form
\[
\omega_N(A) = \frac{\mathrm{Tr}_{[1,N]}\big( e^{\sum_k \lambda_k H_k[1,N]} A \big)}{\mathrm{Tr}_{[1,N]}\big( e^{\sum_k \lambda_k H_k[1,N]} \big)}, \quad \lambda_k \in \mathbb{R}. \tag{31}
\]
Theorem 1.1 and a bound similar to (7) guarantee that $\omega_N$ concentrates at $x$ for some $x \in \mathbb{R}^K$ and satisfies conditions (i), (ii) of Theorem 5.1. Furthermore, it can be shown that a state of this form satisfies (iii) [DMN1]. Therefore, applying Theorem 5.1, we obtain the equivalence of ensembles in the one dimensional quantum spin system:

Corollary 5.1. If there exists a sequence of states $\omega_N$ of the form (31) such that $\omega_N \overset{1}{\to} x \in \mathbb{R}^K$, then $H^{mc}(x) = H^{can}(x) = H_1^{can}(x)$.

Acknowledgement. The author thanks Professor L. Rey-Bellet and Dr. W. De Roeck for interesting discussions. The present research is supported by JSPS Grant-in-Aid for Young Scientists (B) and Hayashi Memorial Foundation for Female Natural Scientists.
A. Analyticity of Local Elements All the results in this section are straightforward application of arguments in [A1]. For the readers’ convenience, we sketch the proof here. Throughout this section, we fix 3 ≤ r ∈ N. For R > 0, we denote by B R an open ball in C centered at the origin with radius R. Let I be a subset of Z, and a translation invariant finite range interaction with range diameter less than r . The derivation iδ (H (I )) (Q) := i [(X ), Q] , Q ∈ Aloc , X ⊂I
defined on Aloc is closable and generates a strongly continuous one parameter group of automorphisms on A. In order to show the dependence of I explicitly, we use the notation of H. Araki and denote this one parameter group by exp(itδ (H (I ))). In one dimension, local elements are entire analytic for this dynamics: Theorem A.1. Let n 1 , n 2 , N1 , N2 be integers such that n := n 1 +n 2 +1 ≥ r , [−n 1 , n 2 ] ⊂ [−N1 , N2 ], and −N1 ≤ N2 . Then Q ∈ A[−n 1 ,n 2 ] is an entire analytic element for exp(itδ (H (I ))) for any I ⊂ Z. Furthermore, we have exp(βδ (H (I )))(Q) ≤ Fn (2 |β| ||) Q ,
(32)
and exp(βδ (H (I )))(Q) − exp(βδ (H (I ∩ [−N1 , N2 ])))(Q) ≤ Fnmin{N2 +n 1 ,N1 +n 2 }+1 (2 |β| ||) Q ,
(33)
for all β ∈ C. Here, Fn is a function on R given by Fn (x) := exp [(n − r + 1)x + g(x)] , r g(x) := 2 k −1 {exp(kx) − 1},
(34)
k=1
and
FnL
is a function such that
FnL+1 (x) ≤ FnL (x), FnLr +n (x) ≤ ((L + 1)!)−1 [g(x)] L+1 Fn (x), ∀x > 0, 0 ≤ L ∈ N. (35) Proof. Basically, this is proven in [A1] (Theorem 4.2). A slight difference of the setting (for example, N1 , N2 are replaced by −N , n + N , N ∈ N in [A1]) causes no difference of the proof because of the translation invariance of . For an element Q in A I , a0 ∈ Z, and 0 < θ < 1, we say that Q allows a decomposition into local elements centered at a0 with rate θ , if there exists a sequence of local 0 elements := (Q θ,a k )k such that Q =
∞
0 Q θ,a k ,
0 Q θ,a ∈ A[a0 −k,a0 +k]∩I , k
k=r
0 −k < ∞. Cθ,a0 , (Q) := sup Q θ,a k θ
(36)
k
0 Furthermore, we say that a decomposition is self-adjoint if all Q θ,a are self-adjoint. k We denote by A1 the subalgebra given by Q ∈ Fθ ∩ A[1,∞) : 0 < ∀θ < 1 .
Lemma A.1. Let Q be an element of A[1,∞) , a0 ∈ Z, and 0 < θ < 1. Suppose that there exist a positive number C and r ≤ l ∈ N satisfying the following condition: for all l ≤ N ∈ N, there exists Q N ∈ A[a0 −N ,a0 +N ]∩[1,∞) such that Q − Q N ≤ Cθ N .
(37)
Then Q has a decomposition into local elements centered at a0 with rate θ , such that QN =
N
−l −1 0 Q θ,a k , l ≤ N , C θ,a0 , (Q) ≤ C + max{Q θ , Cθ } < ∞. (38)
k=r
If each Q N is self-adjoint, the decomposition can be taken to be self-adjoint. In particular, all Q ∈ A1 has a decomposition into local elements centered at a0 = 0 for any 0 < θ < 1, with C ≡ |Q|θ in (38). Proof. This follows by taking 0 Q θ,a k
⎧ ⎨ 0, = Ql , ⎩Q − Q , k k−1
If Q ∈ A1 , we take Q N = Q (N ) of Lemma C.1.
r ≤k 0 and 0 < θ < e−4R|| , let Q be an element in A with a decomposition into local elements centered at a0 with rate θ . Then for any I ⊂ Z, exp(itδ (H (I )))(Q) has an analytic extension to B R such that exp(βδ (H (I )))(Q) ≤ Cθ,a0 , (Q)C1,R,θ,|| , |β| ≤ R.
(39)
Furthermore, for N ∈ N, N θ,a0 Q k ) exp(βδ (H (I )))(Q) − exp(βδ (H (I ∩ [a0 − 2N , a0 + 2N ])))( k=r
N ≤ Cθ,a0 , (Q)C2,R,θ,|| θ e4R|| , |β| ≤ R. (40) Here, positive constants C1,R,θ,|| and C2,R,θ,|| depend only on R, θ , and ||. In particular, any Q ∈ A1 is entire analytic for eitδ (I ) . Proof. The first part is an immediate consequence of Theorem A.1. Let Q be an element of A1 . For any R > 0, take 0 < θ < e−4R|| . Then by Lemma A.1, Q admits a decomposition into local elements centered at a0 = 0 with rate θ . Hence exp(itδ (H (I )))(Q) has an analytic extension in A to B R . As this holds for all R > 0, exp(itδ (H (I )))(Q) has an entire analytic extension in A.
Lemma A.3. For any element Q in A1 , I ⊂ [1, ∞), and β ∈ C, the element exp(βδ (H (I )))(Q) belongs to Fθ for all 0 < θ < 1. Furthermore, C β → exp(βδ (H (I )))(Q) ∈ Fθ is Fθ -entire analytic. For any R > 0 and 0 < θ < 1, fix some 0 < θ < e−4R|| (θ )3 . Let be a decomposition of Q into local elements centered at a0 = 0 with rate θ . Then we have |exp(βδ (H (I )))(Q)|θ ≤ C3,R,θ,θ ,|| Cθ,0, (Q), |β| < R, and
N θ,0 Q k ) exp(βδ (H (I )))(Q) − exp(βδ (H (I ∩ [0, 2N ])))( k=r
N
≤ C4,R,θ,θ ,|| Cθ,0, (Q)(θ )
|β| < R.
(41)
θ
(42)
Here, positive constants C3,R,θ,θ ,|| and C4,R,θ,θ ,|| depend only on R, θ, θ and ||. Proof. Note that for A ∈ A[1,∞) and A j ∈ A[1, j−1] , j ∈ N, we have var j (A) ≤ 2 A ,
(43)
var j (A) = var j (A − A j ) ≤ 2 A − A j .
(44)
Fix any R > 0 and 0 < θ < 1, and take 0 < θ < e−4R|| (θ )3 . Let be a decomposition of Q into local elements centered at a0 = 0 with rate 0 < θ < 1. Lemma A.2 implies l (θ )−2l exp(βδ (H (I )))(Q) − exp(βδ (H (I ∩ [0, 2l])))( Q θ,0 ) k k=r
l
≤ (θ ) Cθ,0, (Q)C2,R,θ,|| , |β| < R, l exp(βδ (H (I ∩ [0, 2l])))( Q θ,0 k ) ∈ A[1,2l] , r ≤ ∀l ∈ N.
(45)
k=r
Applying (45) for l =
j−1 2
and using (44), we obtain
(θ )− j var j (exp(βδ (H (I )))(Q)) ≤ C1,R,θ,θ ,|| · (θ )
j−1 2
Cθ,0, (Q), |β| < R, (46)
for a positive constant C1,R,θ,θ ,|| which depends only on R, θ, θ , ||. Hence we obtain (41) and exp(βδ (H (I )))(Q) is an element in Fθ for |β| < R. N Q θ,0 Next, note that if 2N < j, exp(βδ (H (I ∩ [0, 2N ])))( k=r k ) is in A[1, j−1] . Hence we have N θ,0 −j (θ ) var j exp(βδ (H (I )))(Q) − exp(βδ (H (I ∩ [0, 2N ])))( Qk ) −j
= (θ )
k=r −1 N var j (exp(βδ (H (I )))(Q)) ≤ C1,R,θ,θ ,|| θ (θ ) Cθ,0, (Q),
|β| < R.
On the other hand, if 2N ≥ j, applying (45) for l = N and using (43), −j
(θ )
var j exp(βδ (H (I )))(Q) − exp(βδ (H (I ∩ [0, 2N ])))(
N
Q θ,0 k )
k=r
≤ 2(θ ) N Cθ,0, (Q)C2,R,θ,|| , |β| < R. Combining these estimates, we obtain (42) with C4,R,θ,θ ,|| := max{2C2,R,θ,|| , (θ )−1 C1,R,θ,θ ,|| }. Now we show that exp(βδ (H (I )))(Q) is Fθ -analytic on B R . For each N ∈ N, N exp(βδ (H (I ∩ [0, 2N ])))( k=r Q θ,0 ) is in A[0,2N ] ⊂ Fθ and the map B R β N k θ,0 → exp(βδ (H (I ∩ [0, 2N ])))( k=r Q k ) ∈ Fθ is analytic. Furthermore, the N Q θ,0 bounded on B R i.e., sequence exp(βδ k ) is uniformly (H (I ∩ [0, 2N ])))( k=r θ,0 N sup|β| N . We have Cθ,0, N (Q N ) ≤ Cθ,0, (Q). Applying (41) to this decomposition, we obtain a uniform bound.) Hence {exp(βδ (H (I ∩ [0, 2N ])))
$\big( \sum_{k=r}^{N} Q_k^{\theta,0} \big) \}_N$ is a uniformly bounded sequence of $\mathcal{F}_\theta$-valued analytic functions on $B_R$. From (42), the $\mathcal{F}_\theta$-valued function $\exp(\beta\delta(H(I)))(Q)$ is approximated by this sequence in $\mathcal{F}_\theta$-norm. Therefore, $\exp(\beta\delta(H(I)))(Q)$ is also $\mathcal{F}_\theta$-analytic on $B_R$. As $R > 0$ is arbitrary, $\exp(\beta\delta(H(I)))(Q)$ is $\mathcal{F}_\theta$-entire analytic.

Next, for an $\mathcal{F}_\theta$-valued entire analytic function $A(z)$, we define
\[
F_1(A, z) := \sum_{n=0}^{\infty} z^n \int_0^1 du_1 \int_0^{u_1} du_2 \cdots \int_0^{u_{n-1}} du_n\, A(u_n z) \cdots A(u_1 z),
\]
\[
F_2(A, z) := \sum_{n=0}^{\infty} (-z)^n \int_0^1 du_1 \int_0^{u_1} du_2 \cdots \int_0^{u_{n-1}} du_n\, A(u_1 z) \cdots A(u_n z). \tag{47}
\]
In particular, for $P = P^* \in \mathcal{A}$, $F_1(e^{(\cdot)\delta(H(I))}(P), it)$ is a co-cycle related to the perturbed automorphism group $e^{it\delta(H(I)+P)}$ of $e^{it\delta(H(I))}$. By a routine argument, we obtain the following:

Lemma A.4. For $\mathcal{F}_\theta$-valued entire analytic functions $A(z)$, $B(z)$, each $F_i(A, z)$, $i = 1, 2$, is in $\mathcal{F}_\theta$ and the map $\mathbb{C} \ni z \mapsto F_i(A, z) \in \mathcal{F}_\theta$ is entire analytic. Furthermore, we have
\[
\| F_i(A, z) \| \le \exp\Big( |z| \sup_{|w|\le|z|} \| A(w) \| \Big),
\]
\[
| F_i(A, z) |_\theta \le \exp\Big( 2|z| \sup_{|w|\le|z|} | A(w) |_\theta \Big),
\]
\[
\| F_i(A, z) - F_i(B, z) \| \le |z| \exp\Big( |z| \sup_{|w|\le|z|} \big( \| A(w) \| + \| B(w) \| \big) \Big) \sup_{|w|\le|z|} \| A(w) - B(w) \|,
\]
\[
| F_i(A, z) - F_i(B, z) |_\theta \le |z| \exp\Big( 2|z| \sup_{|w|\le|z|} \big( | A(w) |_\theta + | B(w) |_\theta \big) \Big) \sup_{|w|\le|z|} | A(w) - B(w) |_\theta. \tag{48}
\]
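In finite dimension the series $F_1$ of (47) can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's method: $H$ and $Q$ are small Hermitian matrices, $z = \beta$ is real, and $A(u) = e^{uH} Q e^{-uH}$. Differentiating the series in its upper integration limit shows that $F(\beta) := F_1(A, \beta)$ solves $F'(\beta) = F(\beta) A(\beta)$ with $F(0) = 1$, and the same equation is solved by $e^{\beta(H+Q)} e^{-\beta H}$, which is the closed form stated for finite $I$ later in this appendix. The code integrates the differential equation with a classical Runge-Kutta scheme and compares the two; all function names are hypothetical.

```python
import numpy as np

def hermitian_exp(h, t):
    """exp(t * h) for a Hermitian matrix h and real t, via eigendecomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(t * w)) @ v.conj().T

def f1_by_ode(H, Q, beta, steps=400):
    """Solve F'(s) = F(s) A(s), F(0) = 1, with A(s) = e^{sH} Q e^{-sH}, by RK4."""
    def A(s):
        return hermitian_exp(H, s) @ Q @ hermitian_exp(H, -s)
    F = np.eye(H.shape[0], dtype=complex)
    h = beta / steps
    for k in range(steps):
        s = k * h
        k1 = F @ A(s)
        k2 = (F + 0.5 * h * k1) @ A(s + 0.5 * h)
        k3 = (F + 0.5 * h * k2) @ A(s + 0.5 * h)
        k4 = (F + h * k3) @ A(s + h)
        F = F + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return F

def f1_closed_form(H, Q, beta):
    """e^{beta (H + Q)} e^{-beta H}, the finite-dimensional closed form of F_1."""
    return hermitian_exp(H + Q, beta) @ hermitian_exp(H, -beta)
```

The ordering matters: the factor $A(u_1 z)$ with the largest parameter stands on the right in $F_1$, which is why the differential equation multiplies by $A$ from the right.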
Lemma A.5. Let a0 be a positive integer and I ⊂ [1, ∞). Let Q, P = P ∗ be elements in A[1,∞) , which allow for any rate 0 < θ < 1, decompositions Q,θ , P,θ into local elements centered at a0 . Assume that P,θ can be taken to be self-adjoint. Let eitδ(H (I )+P) be a strongly continuous one parameter group of automorphisms over A[1,∞) generated by i (δ (H (I )) + [P, ·]). Then for all 0 < θ < 1, eitδ(H (I )+P) (Q) is in Fθ and the map R t → eitδ(H (I )+P) (Q) ∈ Fθ has an Fθ -entire analytic extension e zδ(H (I )+P) (Q). Furthermore, for any R > 0 and 0 < θ < e−4R|| , we have
zδ(H (I )+P) (Q) ≤ C1,R,θ,|| exp C5,R,θ,|| Cθ,a0 , P,θ (P) Cθ,a0 , Q,θ (Q), (49) e
N N θ,a zδ H (I ∩[a0 −2N ,a0 +2N ])+ k=r Pk 0 θ,a0 zδ(H (I )+P) (Q) − e ( Q k ) e k=r
≤ C6,R,θ,|| exp C7,R,θ,|| Cθ,a0 , P,θ (P) Cθ,a0 , P,θ (P) + 1 N
Cθ,a0 , Q,θ (Q) θ e4R|| , (50) |z| ≤ R. Here, positive constants C5,R,θ,|| , C6,R,θ,|| , and C7,R,θ,|| depend only on R, θ , and ||. Proof. This follows from identity
\[
e^{it\delta(H(I)+P)}(Q) = F_1\big( e^{(\cdot)\delta(H(I))}(P), it \big)\, e^{it\delta(H(I))}(Q)\, F_2\big( e^{(\cdot)\delta(H(I))}(P), it \big),
\]
and the estimation of the analytic continuation of the right hand side using Lemma A.3 and Lemma A.4.

If $I$ is finite, then
\[
\exp(\beta\delta(H(I)))(Q) = e^{\beta H(I)} Q\, e^{-\beta H(I)},
\]
\[
F_1\big( \exp((\cdot)\delta(H(I)))(Q), \beta \big) = e^{\beta(H(I)+Q)} e^{-\beta H(I)},
\]
\[
F_2\big( \exp((\cdot)\delta(H(I)))(Q), \beta \big) = e^{\beta H(I)} e^{-\beta(H(I)+Q)},
\]
and satisfies the relations:
\[
F_1\big( \exp((\cdot)\delta(H(I)+Q))(Q_1+Q_2), \beta \big) = F_1\big( \exp((\cdot)\delta(H(I)+Q+Q_1))(Q_2), \beta \big)\, F_1\big( \exp((\cdot)\delta(H(I)+Q))(Q_1), \beta \big),
\]
\[
F_1\big( \exp((\cdot)\delta(H(I)+Q))(Q_1), \beta \big)\, \exp(\beta\delta(H(I)+Q))(Q_2) = \exp(\beta\delta(H(I)+Q+Q_1))(Q_2)\, F_1\big( \exp((\cdot)\delta(H(I)+Q))(Q_1), \beta \big),
\]
\[
F_1\big( \exp((\cdot)\delta(H(I)+Q))(Q_1), \beta \big)\, F_2\big( \exp((\cdot)\delta(H(I)+Q))(Q_1), \beta \big) = F_2\big( \exp((\cdot)\delta(H(I)+Q))(Q_1), \beta \big)\, F_1\big( \exp((\cdot)\delta(H(I)+Q))(Q_1), \beta \big) = 1, \tag{51}
\]
for all self-adjoint elements Q, Q 1 in A1 and all element Q 2 in A1 . As Fi (exp((·)δ (H (I )))(Q), β) and exp(βδ (H (I )))(Q) are approximated by local elements by Lemma A.4 and Lemma A.5, the relations (51) hold for general I . We use the following notation: Hˆ r (n) := (I ) ∈ A[1,∞) ∩ Aloc , I ⊂[1,∞),I ∩[1,n]=φ
Hˆ l (n)
:=
(I )
∈ A(−∞,−1] ∩ Aloc ,
I ⊂(−∞,−1],I ∩[−n,−1]=φ
Wr (n) := Wl (n)
:=
(I )
∈ A[1,∞) ∩ Aloc ,
I ⊂[1,∞),I ⊂[1,n−1],I ⊂[n+1,∞)
(I )
∈ A(−∞,−1] ∩ Aloc . (52)
I ⊂(−∞,−1],I ⊂[−n+1,−1],I ⊂(−∞,−n−1]
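Let $\Phi$ and $\Psi$ be translation invariant finite range interactions with range diameter less than $r$.

Lemma A.6. For fixed $\beta \in \mathbb{R}$, define the $\mathcal{A}$-valued function $a(\alpha)$ on $\mathbb{C}$ by
\[
a(\alpha) := e^{\frac{\alpha}{2}\delta_\Psi(H[1,\infty))}\Big( F_1\big( e^{(\cdot)\delta_\Phi(H[2,\infty))}(\hat H^r_\Phi(1)),\, -\tfrac{\beta}{2} \big) \Big)\, F_1\big( e^{(\cdot)\delta_\Psi(H[2,\infty))}(\hat H^r_\Psi(1)),\, \tfrac{\alpha}{2} \big), \quad \alpha \in \mathbb{C}. \tag{53}
\]
Then for any $0 < \theta < 1$ and $\alpha \in \mathbb{C}$, $a(\alpha)$ belongs to $\mathcal{F}_\theta$ and $\mathbb{C} \ni \alpha \mapsto a(\alpha) \in \mathcal{F}_\theta$ is $\mathcal{F}_\theta$-entire analytic. For each $\alpha \in \mathbb{C}$, $a(\alpha)$ is invertible in $\mathcal{F}_\theta$ and
\[
a(\alpha)^{-1} = F_2\big( e^{(\cdot)\delta_\Psi(H[2,\infty))}(\hat H^r_\Psi(1)),\, \tfrac{\alpha}{2} \big)\, e^{\frac{\alpha}{2}\delta_\Psi(H[1,\infty))}\Big( F_2\big( e^{(\cdot)\delta_\Phi(H[2,\infty))}(\hat H^r_\Phi(1)),\, -\tfrac{\beta}{2} \big) \Big).
\]
Proof. By Lemma A.3 and Lemma A.4, $F_1\big( e^{(\cdot)\delta_\Phi(H[2,\infty))}(\hat H^r_\Phi(1)), -\tfrac{\beta}{2} \big)$ is an element in $\mathcal{A}_1$. Therefore, by Lemma A.3, $e^{\frac{\alpha}{2}\delta_\Psi(H[1,\infty))}\big( F_1( e^{(\cdot)\delta_\Phi(H[2,\infty))}(\hat H^r_\Phi(1)), -\tfrac{\beta}{2} ) \big)$ is a well-defined element in $\mathcal{F}_\theta$ and
\[
\mathbb{C} \ni \alpha \mapsto e^{\frac{\alpha}{2}\delta_\Psi(H[1,\infty))}\Big( F_1\big( e^{(\cdot)\delta_\Phi(H[2,\infty))}(\hat H^r_\Phi(1)),\, -\tfrac{\beta}{2} \big) \Big) \in \mathcal{F}_\theta
\]
is entire analytic for all $0 < \theta < 1$. On the other hand, by Lemma A.3 and Lemma A.4, $\mathbb{C} \ni \alpha \mapsto F_1\big( e^{(\cdot)\delta_\Psi(H[2,\infty))}(\hat H^r_\Psi(1)), \tfrac{\alpha}{2} \big) \in \mathcal{F}_\theta$ is entire analytic. Hence $a(\alpha)$ is an $\mathcal{F}_\theta$-valued entire analytic function for all $0 < \theta < 1$. The existence of an inverse follows from (51).

Lemma A.7. For $n \in \mathbb{N}$ and $\alpha \in \mathbb{C}$, define $\tilde a_n(\alpha)$ and $a_n(\alpha)$ by
\[
\tilde a_n(\alpha) := a(\alpha)\gamma_1(a(\alpha))\gamma_2(a(\alpha)) \cdots \gamma_{n-1}(a(\alpha)),
\]
\[
a_n(\alpha) := \tilde a_n(\alpha)\, e^{\frac{\beta}{2} H_\Phi[1,n-1]}\, e^{-\frac{\alpha}{2} H_\Psi[1,n-1]}.
\]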
Then
α β a˜ n (α) = e 2 δ(H [1,∞)) F1 e(·)δ(H [n+1,∞)) Hˆ r (n) , − 2
α (·)δ(H [n+1,∞)) r , Hˆ (n) , F1 e 2
α β r (n) δ(H [1,∞)) (·)δ H [1,∞)−W r ( ) F1 e an (α) = e 2 W (n) , − 2
α r r (n) (·)δ (( H [1,∞)−W )) . W (n) , F1 e 2
For any K > 0, there exists a positive constant C K such that sup sup an (α) , sup sup (an (α))−1 < C K . |α| 0, 0 < θ < e−4R|| and I ⊂ [1, ∞), we have
zδ ( H (I )−Wr (n)) r W (n) ≤ C2,R,θ,|| , (59) e
N N r zδ H [1,∞)−W r (n) r ) W (n) − e zδ H [1,∞)∩[n−2 2 ,n+2 2 ] −W (n) W r (n) e (
≤
C3,R,θ,||
θ e4R||
N 2
,
N ≥ 2r, |z| < R.
(60)
Here, C2,R,θ,|| and C3,R,θ,|| are positive constants which depend only on R, θ , and ||. Then from Lemma A.4, for R > 0 and 0 < θ < e−4R|| , we have
r , (61) F1 e(·)δ ( H [1,∞)−W (n)) Wr (n) , z ≤ C4,R,θ,||
r F1 e(·)δ ( H [1,∞)−W (n)) Wr (n) , z
(·)δ H [1,∞)∩[n−2 N2 ,n+2 N2 ] −Wr (n) r W (n) ; z −F1 e ≤
C5,R,θ,||
θe
4R||
N 2
,
N ≥ 2r, |z| < R,
(62)
and C5,R,θ,|| which depend only on R, θ and with positive constants C4,R,θ,|| ||.
r Now, we claim that for any 0 < θ < 1, F1 e(·)δ ( H [1,∞)−W (n)) Wr (n) , − β2 has
a decomposition n,θ into local elements centered at n with rate θ which is uniformly
r bounded in n, i.e., supn∈N Cθ ,n,n,θ (F1 e(·)δ ( H [1,∞)−W (n)) Wr (n) , − β2 )
0, 0 < θ < e−4K || , 0 < θ < e−4|β||| θ 3 , and I ⊂ [1, ∞). Here, C6,K ,|β|,θ,θ ,||,|| , C 7,K ,|β|,θ,θ ,||,|| are constants which depend only on K , |β| , θ, θ , || and ||. Combining (63) with (61), we obtain from (55) the bound sup sup an (α) ≤ C8,K ,|β|,θ,θ ,||,|| C 4,K ,θ ,|| ,
|α| 0 with 0 < θ < e−4K || and 0 < θ < e−4|β||| θ 3 . The bound for an (α)−1 follows by the same argument. To prove (57), define anN (α) :=
β (·)δ H [1,∞)∩[n−2 N2 ,n+2 N2 ]−Wr (n) r F1 e W (n) , − e 2
r (n)
α (·)δ H [1,∞)∩[n−2 N2 ,n+2 N2 ]−W F1 e Wr (n) , 2 ∈ A[n−2N ,n+2N ]∩[1,∞) , N ≥ 2r. α 2 δ(H [1,∞)∩[n−2N ,n+2N ])
Using (55), (61), (62), (63), and (64), we have an (α) − anN (α)
N 4K || ≤ C7,K θ e C4,K ,|β|,θ,θ ,||,|| ,θ ,||
4K || +C6,K ,|β|,θ,θ ,||,|| C 5,K ,θ ,|| θ e
N 2
→ 0, N → ∞, |α| < K , for all K > 0 with 0 < θ < e−4K || , and 0 < θ < e−4|β||| θ 3 . From this, we have n an (α) − an 3 (α) → 0, n → ∞. Hence, for a local element A, we have lim [an (α), A] = lim
n→∞
n→∞
! n an 3 (α), A = 0.
B. Proof of Theorem 2.1 In this section, we sketch the proof of Theorem 2.1. Although we are considering a generalized form of L (B = Md (C), and E(a ⊗ b) := d −1 T r Md (C) (a)b in [M]), most parts of the proof can be carried out parallel to [M]. Instead of repeating the details, we indicate the corresponding part of [M]. Lemma B.1. Let L : B ⊗ A[1,∞) → B ⊗ A[1,∞) be a linear operator satisfying the following: (a) L is completely positive. (b) L has an invariant state ϕ. (c) There are positive constants a1 , a2 such that n L (Q) ≤ a1 Q + a2 θ n Qθ , ∀Q ∈ Fθ , ∀n ∈ N. θ
(d) There exists a positive constant K such that the following bound is valid: Let Q be any strictly positive element in Fθ . There exists a positive integer N = N (Q) satisfying L n (Q) ≤ K inf L n (Q), ∀n ≥ N . (e) L is unital. Then there are positive constants C1 and δ1 such that n L (Q) − ϕ(Q)1 ≤ C1 e−δ1 n |Q|θ , ∀Q ∈ Fθ , ∀n ∈ N. θ
(65)
Proof. For any strictly positive element Q in Fθ , we have lim L n (Q) − ϕ(Q)1 = 0
(66)
n→∞
([M] Lemma 2.8). To see this, note that a set {L n (Q)}n∈N is a subset of the C ∗ -norm compact set {A ∈ Fθ : |A|θ ≤ a1 Q + a2 Qθ } by (c) (see Lemma C.2). Therefore, it has a convergent subsequence with a limit in Fθ . Let {L n k (Q)} be any convergent n k ¯ subsequence, L (Q) − Q → 0, with 0 ≤ Q¯ ∈ Fθ . We show Q¯ = ϕ(Q) · 1. Tak¯ − Q¯ → ing suitable 0 < q(l) = n k(l+1) − n k(l) ∈ N, l ∈ N, we have L q(l) ( Q) 0, q(l) → ∞, l → ∞. (See the explanation before (2.23) of [M].) On the other hand, ¯ n is an increasing sequence and { L n ( Q) ¯ } a decreasing from (a) and (e), {inf L n ( Q)} sequence. (See (2.20) of [M].) Therefore, we have ¯ ≤ lim L q(l) ( Q) ¯ = Q¯ ≤ L n ( Q) ¯ ≤ Q¯ inf Q¯ ≤ inf L n ( Q) l→∞
¯ and L n ( Q) ¯ = Q¯ for all n ∈ N. for all n ∈ N. Hence we obtain inf Q¯ = inf L n ( Q) Then from (d), we have 0 ≤ Q¯ − λ = L n ( Q¯ − λ) ≤ K inf L n ( Q¯ − λ) = K (inf Q¯ − λ), ¯ This ¯ for all λ < inf Q¯ and n ≥ N ( Q−λ). Taking λ ↑ inf Q¯ limit, we get Q¯ = inf( Q). n ¯ = limk→∞ ϕ(L k (Q)) = ϕ(Q), means Q¯ is proportional to 1. From (b), we obtain ϕ( Q) and Q¯ = ϕ(Q)1. As this argument applies to any subsequence of L n (Q), we obtain (66) for strictly positive Q ∈ Fθ . As L(1) = 1, we get (66) for all selfadjoint Q in Fθ . Fix any ε > 0. From (66), a compact set {Q = Q ∗ ∈ Fθ : |Q|θ ≤ 1} can be covered by a finite number of open sets Un := {Q ∈ B ⊗ A[1,∞) : L n (Q) − ϕ(Q) < ε }. Because L is a completely positive unital map, L is contractive and we have U1 ⊂ U2 ⊂ U3 · · ·. Therefore, there exists a positive integer N1 ∈ N such that {Q = Q ∗ ∈ Fθ : |Q|θ ≤ 1} ⊂ U N1 , i.e., n L (Q) − ϕ(Q)1 ≤ ε |Q|θ , ∀Q = Q ∗ ∈ Fθ , ∀n ≥ N1 . (67) (See [M], Lemma 2.9.) Hence for any 0 < ε < 1 we have from (c) and (67), 2N0 L (Q) − ϕ(Q)1 = L N0 (L N0 (Q) − ϕ(Q)1) θ θ N0 N0 N0 ≤ a1 L (Q) − ϕ(Q)1 + a2 θ L (Q) ≤ ε |Q|θ , Q = Q ∗ ∈ Fθ θ
for N0 large enough. For n = 2m N0 + r , we thus obtain n L (Q) − ϕ(Q) ≤ Cεm |Q|θ , θ
Q = Q ∗ ∈ Fθ ,
with some constant C > 0. For Q = Q 1 + i Q 2 ∈ Fθ with Q 1 , Q 2 self-adjoint, we have |Q 1 |θ , |Q 2 |θ ≤ |Q|θ . Therefore, we obtain (65) for all Q ∈ Fθ . Our transfer operator L does not satisfy the condition (e). However, it turns out to be similar to an operator satisfying (a)-(e): Lemma B.2. Let L : B ⊗ A[1,∞) → B ⊗ A[1,∞) be a linear operator satisfying (a)-(d) of Lemma B.1 and (f) There exists m, M > 0 such that m ≤ L n (1) ≤ M, ∀n ∈ N. Then there exists an element h in Fθ such that L(h) = h, m ≤ h ≤ M, ϕ(h) = 1.
(68)
Furthermore, an operator L h defined by 1
1 1 1 L h (Q) := h − 2 L h 2 Qh 2 h − 2 , Q ∈ B ⊗ A[1,∞) , satisfies (a)-(e) of Lemma B.1 with an invariant state
1 1 ϕh (Q) := ϕ h 2 Qh 2 . Therefore, from (72) and Lemma B.1, we have n L (Q) − ϕ(Q)h ≤ C1 e−δ1 n |Q|θ , ∀Q ∈ Fθ , ∀n ∈ N, θ for some constants C1 , δ1 > 0. Proof. By (c), C ∗ -norm closure S of the convex hull of {L n (1) : n ∈ N} is a compact convex set. As (a) and (f) imply L = L(1) ≤ M, the operator L maps S continuously (in C ∗ -norm) into S. Therefore, by the Schauder-Tychonov fixed point theorem, there exists h ∈ S such that L(h) = h. This h satisfies (68) (Lemma 2.5 of [M]). As 1 1 h > 0 is an element in Fθ , h 2 , h − 2 are also in Fθ .(See [M], p. 1191.) The properties (a), (b), (e) of L h are trivial, and (c) follows from that of L, using (72). Property (d) of 1 1 L h follows from (d) of L and the fact that inf R ≤ h inf(h − 2 Rh − 2 ) for all R ≥ 0 ([M], Lemma 2.7). Proof of Theorem 2.1. The operator given in Theorem 2.1 satisfies the conditions in the last lemma: Lemma B.3. Let L be an operator satisfying Assumption 2.1. Then L satisfies (a)-(d) of Lemma B.1 and (f) of Lemma B.2.
Proof. (a), (b) is trivial. First we prove (f). As L satisfies (a), (b) and (iii) of Assumption 2.1, we have 1 = ϕ(L n (1)) ≤ L n (1) ≤ K inf L n (1) ≤ K ϕ(L n (1)) = K , −2 for n ≥ N = N (1). On the other hand, because a is invertible, we have a ∗ a ≥ a −1 , −1 −2 and obtain L(1) ≥ a > 0, hence L k (1) > 0, for all k ∈ N. For m = N −1 min{inf L(1), . . . , inf L (1), K −1 } and M = max{L(1) , . . . , L N −1 (1) , K }, L satisfies (f) ([M], Lemma 2.3). In order to prove (d), we approximate Q by a local element. For 0 < K in (iii), fix any positive constant K such that K < K . We show that (d) holds for K . For any strictly positive Q ∈ Fθ , choose ε > 0 and l ∈ N such that (K + 1)Mε ≤ m(K − K ) inf Q and |Q|θ θ l ≤ ε. For this l, by Lemma C.1, there exists Q (l) ∈ B ⊗ A[1,l−1] such that 0 < inf Q ≤ Q (l) ≤ Q and Q − Q (l) ≤ |Q|θ θ l . Using L n = L n (1) ≤ M, we obtain − M |Q|θ θ l ≤ L n (Q) − L n (Q (l) ) ≤ M |Q|θ θ l , n ∈ N.
(69)
Applying (iii) to Q (l) , we have L n (Q (l) ) ≤ K inf L n (Q (l) ), ∀n ≥ N (Q (l) ).
(70)
Using (69), (70), and m inf Q ≤ inf L n (Q), n ∈ N, we obtain (d): L n (Q) ≤ M |Q|θ θ l + L n (Q (l) ) ≤ M |Q|θ θ l + K inf L n (Q (l) )
≤ M |Q|θ θ l + K inf L n (Q) + M |Q|θ θ l ≤ M(K + 1)ε + K inf L n (Q) ≤ m(K − K ) inf Q + K inf L n (Q) ≤ K inf L n (Q), n ≥ N (Q (l) ). To prove (c), we define a map L˜ : O → O by
˜ L(Q) := τc,+ ⊗ τc,+ ◦ (ε ⊗ id[2,∞ ) ⊗ (ε ⊗ id[2,∞ ) (a ∗ ⊗ 1)Q(a ⊗ 1) . ˜ Note that L(Q ⊗ 1) = L(Q) ⊗ 1, Q ∈ B ⊗ A[1,∞) . Define ( L˜ k ) j := j ◦ L˜ k ◦ j+k , k ∈ N ∪ {0}, j ∈ N. For any n, j ∈ N, Q ∈ B ⊗ A[1,∞) , we have
var j L n (Q) ≤ ( L˜ n ) j (Q ⊗ 1) − L˜ n (Q ⊗ 1)
+ j ◦ L˜ n j+n (Q ⊗ 1) − Q ⊗ 1 . (71)
Qθ θ j+n . By j ◦ τc,+ ⊗ τc,+ ◦ (ε ⊗ id[2,∞ ) By (f), the second
by M
term is bounded
⊗(ε ⊗ id[2,∞ ) = τc,+ ⊗ τc,+ ◦ (ε ⊗ id[2,∞ ) ⊗ (ε ⊗ id[2,∞ ) ◦ j+1 , j ∈ N, we have
˜ ( L˜ k−1 ) j ◦ L(R) = ( L˜ k ) j j+k (a ∗−1 ⊗ 1)(a ∗ ⊗ 1)R(a ⊗ 1) j+k (a −1 ⊗ 1) , ∀R ∈ O, j, k ∈ N.
From this, we obtain ˜ k−1 ˜ − ( L˜ k ) j (R) ( L ) j ◦ L(R)
= ( L˜ k ) j j+k (a ∗−1 ⊗ 1)(a ∗ ⊗ 1)R(a ⊗ 1) j+k (a −1 ⊗ 1) − R
≤ M a −1 a + 1 a a −1 θ j+k R , R ∈ O, j, k ∈ N. θ
Hence we get ˜n ( L ) j (Q ⊗ 1) − L˜ n (Q ⊗ 1) n
k L˜ ≤ ◦ L˜ n−k (Q ⊗ 1) − L˜ k−1 j
k=1
j
≤ θ j M(M + 1)( a −1 a + 1) a a −1
θ
◦ L˜ n−k+1 (Q ⊗ 1) θ Q =: θ j K Q . 1−θ
Substituting these for (71), we obtain (c): n L (Q) ≤ K Q + M Qθ θ n , θ ([M], Lemma 2.4)
Theorem 2.1 is an immediate consequence of Lemma B.2 and Lemma B.1.
C. Banach Space Fθ It is straightforward to check |AB|θ ≤ 2 |A|θ |B|θ , |A + B|θ ≤ |A|θ + |B|θ , |A|θ = A∗ θ , ∀A, B ∈ Fθ .
(72)
Lemma C.1 ([M], Lemma 2.1). If Q is an element of Fθ and k is a positive integer, there exists Q (k) in B ⊗ A[1,k−1] such that Q − Q (k) ≤ |Q|θ θ k . If Q is positive, Q (k) can be taken to satisfy inf Q ≤ Q (k) ≤ Q . If Q is in Fθ ∩ A[1,∞) , then Q (k) is in A[1,k−1] . A closed ball in Fθ is compact with respect to the C ∗ -norm ([M], Lemma 2.2): Lemma C.2. For any C > 0, a set {R ∈ Fθ : |R|θ ≤ C} is compact with respect to the C ∗ -norm. Next we consider operators in the form of (2).
Lemma C.3. For a, b ∈ Fθ and E : B ⊗ Md (C) → B a completely positive unital map, define an operator L on Fθ by
L(Q) := τc,+ E ⊗ id[2,∞) (bQa), Q ∈ Fθ . Then L is a bounded linear operator on Fθ such that |L(Q)|θ ≤ 8 |a|θ |b|θ |Q|θ , Q ∈ Fθ . Proof. From (72), the map Fθ Q → bQa ∈ Fθ is a bounded linear operator on Fθ such that |bQa|θ ≤ 4 |a|θ |b|θ |Q|θ for all Q ∈ Fθ . For Q ∈ Fθ and j ∈
N, we take Q ( j) ∈ B ⊗ A[1, j−1] given in Lemma C.1. As var j τc,+ E ⊗ id[2,∞) (Q ( j) ) = 0, we have
var j τc,+ E ⊗ id[2,∞) (Q) = var j τc,+ E ⊗ id[2,∞) (Q − Q ( j) )
Q − Q ( j) ≤ 2 |Q|θ θ j . ≤ 2 τc,+ E ⊗ id[2,∞) B(B⊗A[1,∞) )
Here, we used the fact that a completely positive unital map is a contraction. Hence we obtain
τc,+ E ⊗ id[2,∞) (Q) ≤ 2 |Q|θ , ∀Q ∈ Fθ . (73) θ Combining these estimates, we obtain the claim.
Lemma C.4. Let $\mathbb{C} \ni \alpha \mapsto a(\alpha) \in F_\theta$ be an $F_\theta$-valued entire analytic function. Let $E : B \otimes M_d(\mathbb{C}) \to B$ be a completely positive unital map. Define a family of operators $(L_\alpha)_{\alpha \in \mathbb{C}}$ on $F_\theta$ by
\[ L_\alpha(Q) := \tau_{c,+}\bigl(E \otimes \mathrm{id}_{[2,\infty)}\bigr)\bigl(a(\bar\alpha)^* Q\, a(\alpha)\bigr), \quad Q \in F_\theta,\ \alpha \in \mathbb{C}. \]
Then the $B(F_\theta)$-valued function $\mathbb{C} \ni \alpha \mapsto L_\alpha \in B(F_\theta)$ is $\|\cdot\|_{B(F_\theta)}$-entire analytic.
Proof. It is straightforward to see from (72) and (73) that the analyticity of $a(\alpha)$ implies that of $L_\alpha$.

References
[A1] Araki, H.: Gibbs states of a one dimensional quantum lattice. Commun. Math. Phys. 14, 120–157 (1969)
[A2] Araki, H.: Relative Hamiltonian for faithful normal states of a von Neumann algebra. Pub. R.I.M.S., Kyoto Univ. 9, 165–209 (1973)
[BLP] van den Berg, M., Lewis, J.T., Pule, J.V.: The large deviation principle and some models of an interacting boson gas. Commun. Math. Phys. 118, 61–85 (1988)
[BR1] Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics 1. Berlin-Heidelberg-New York: Springer-Verlag, 1986
[BR2] Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics 2. Berlin-Heidelberg-New York: Springer-Verlag, 1996
[DZ] Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. Second edition, Berlin-Heidelberg-New York: Springer-Verlag, 1998
[DMN1] De Roeck, W., Maes, C., Netočný, K.: Quantum macrostates, equivalence of ensembles and an H-theorem. J. Math. Phys. 47, 073303 (2006)
[DMN2] De Roeck, W., Maes, C., Netočný, K.: The Gibbs property for classical restrictions of quantum equilibrium states. In preparation
[FNW] Fannes, M., Nachtergaele, B., Werner, R.F.: Finitely correlated states on quantum spin chains. Commun. Math. Phys. 144, 443–490 (1992)
[GLM] Gallavotti, G., Lebowitz, J.L., Mastropietro, V.: Large deviations in rarefied quantum gases. J. Stat. Phys. 108, 831–861 (2002)
[GN] Golodets, V.Y., Neshveyev, S.: Gibbs states for AF algebras. J. Math. Phys. 39, 6329–6344 (1998)
[HMO] Hiai, F., Mosonyi, M., Ogawa, T.: Large deviations and Chernoff bound for certain correlated states on a spin chain. J. Math. Phys. 48, 123301 (2007)
[HMOP] Hiai, F., Mosonyi, M., Ohno, H., Petz, D.: Free energy density for mean field perturbation of states of a one-dimensional spin chain. Rev. Math. Phys. 20, 335–365 (2008)
[LLS] Lebowitz, J.L., Lenci, M., Spohn, H.: Large deviations for ideal quantum systems. J. Math. Phys. 41, 1224–1243 (2000)
[LR] Lenci, M., Rey-Bellet, L.: Large deviations in quantum lattice systems: one-phase region. J. Stat. Phys. 119, 715–746 (2005)
[M] Matsui, T.: On non-commutative Ruelle transfer operator. Rev. Math. Phys. 13, 1183–1201 (2001)
[NR] Netočný, K., Redig, F.: Large deviations for quantum spin systems. J. Stat. Phys. 117, 521–547 (2004)
[PRV] Petz, D., Raggio, G.P., Verbeure, A.: Asymptotics of Varadhan-type and the Gibbs variational principle. Commun. Math. Phys. 121, 271–282 (1989)
[P] Petz, D.: First steps towards a Donsker and Varadhan theory in operator algebras. In: Quantum Probability and Applications IV, Lecture Notes in Math, 1442, Berlin-Heidelberg-New York: Springer, 1990, pp. 311–319
[R] Ruelle, D.: Statistical mechanics of a one dimensional lattice gas. Commun. Math. Phys. 9, 267–278 (1968)
[RW] Raggio, G.A., Werner, R.F.: Quantum statistical mechanics of general mean field systems. Helv. Phys. Acta 62, 980–1003 (1989)
Communicated by M. Aizenman
Commun. Math. Phys. 296, 69–88 (2010) Digital Object Identifier (DOI) 10.1007/s00220-009-0964-4
Communications in
Mathematical Physics
A Characterization of Vertex Operator Algebra $L(\frac{1}{2}, 0) \otimes L(\frac{1}{2}, 0)$
Chongying Dong1,2, Cuipo Jiang3
1 Department of Mathematics, University of California, Santa Cruz, CA 95064, USA. E-mail: [email protected]
2 School of Mathematics, Sichuan University, Chengdu 610065, China
3 Department of Mathematics, Shanghai Jiaotong University, Shanghai 200240, China
Received: 7 May 2009 / Accepted: 8 September 2009 Published online: 3 December 2009 – © The Author(s) 2009. This article is published with open access at Springerlink.com
Abstract: We study a simple, rational and $C_2$-cofinite vertex operator algebra whose weight 1 subspace is zero, whose weight 2 subspace has dimension greater than or equal to 2, and with $c = \tilde{c} = 1$. Under some additional conditions it is shown that such a vertex operator algebra is isomorphic to $L(\frac{1}{2}, 0) \otimes L(\frac{1}{2}, 0)$.
Supported by NSF grants and a Faculty research grant from the University of California at Santa Cruz; part of this work was done when C. Dong was a Changjiang Visiting Chair Professor in Sichuan University.
Supported in part by China NSF grants 10871125, 10811120445, and a grant of Science and Technology Commission of Shanghai Municipality (No. 09XD1402500).

1. Introduction
The vertex operator algebra $L(\frac{1}{2}, 0) \otimes L(\frac{1}{2}, 0)$ is characterized in [ZD] as the unique simple, rational, $C_2$-cofinite vertex operator algebra with $c = \tilde{c} = 1$, weight one subspace zero and weight two subspace of dimension 2. In this paper we strengthen this result by allowing the dimension of the weight two subspace to be greater than or equal to 2. This proves the conjecture given in [ZD].
The importance of $L(\frac{1}{2}, 0) \otimes L(\frac{1}{2}, 0)$ was first noticed in [DMZ] (see also [M2,DGH]) in the study of the moonshine vertex operator algebra $V^\natural$ [FLM]. In fact, it was essentially proved in [DMZ] that the fixed point vertex operator subalgebra $V_L^+$ under the involution induced from the $-1$ isometry of $L$ is isomorphic to $L(\frac{1}{2}, 0) \otimes L(\frac{1}{2}, 0)$ if $L$ is a rank one lattice generated by a vector whose squared length is 4, and that $V^\natural$ contains $L(\frac{1}{2}, 0)^{\otimes 48}$. This led to the theory of code vertex operator algebras [M1,M2,M3] and framed vertex operator algebras [DGH]. A new construction of the moonshine vertex operator algebra $V^\natural$ is given in [M4] using the theory of code and framed vertex operator algebras. Furthermore, the recent progress in [DGL,LY] on proving the uniqueness of $V^\natural$ depends largely on the theory of framed vertex operator algebras and code vertex operator algebras. See also [KL] for the study of conformal nets arising from framed vertex operator algebras.
The characterization of $L(\frac{1}{2}, 0) \otimes L(\frac{1}{2}, 0)$ given in this paper is a necessary step in the classification of rational vertex operator algebras with $c = 1$. It is a well known conjecture (cf. [K,ZD]) that any simple rational vertex operator algebra with $c = 1$ is either $V_L$, $V_L^+$ or $V_{L_{A_1}}^G$, where $L$ is a rank one positive definite even lattice, $L_{A_1}$ is the root lattice of type $A_1$ and $G$ is a subgroup of $SO(3)$ isomorphic to $A_4$, $S_4$ or $A_5$. As pointed out in [ZD], the correct conjecture should also assume that $c$ is equal to the effective central charge $\tilde{c}$. A characterization of $V_L$ for an arbitrary positive definite even lattice is obtained in [DM1]. Although there was some progress at the $q$-character level on the classification of rational vertex operator algebras with $c = 1$ in the physics literature [K], the conjecture is still far from being proved completely, owing to the lack of a characterization of $V_L^+$. It is hoped that the characterization of $L(\frac{1}{2}, 0) \otimes L(\frac{1}{2}, 0)$ may help to understand $V_L^+$ in general.
If the weight one subspace of a vertex operator algebra is 0, then its weight two subspace is a commutative (non-associative) algebra (cf. [FLM,DGL]). Since the weight two subspace $V_2$ in [ZD] is assumed to be 2-dimensional, it is necessarily a commutative associative algebra. The main result in [ZD] was based on the study of the vertex operator algebra $W(2, 2)$ and the growth of the graded dimensions of vertex operator algebras. In this paper, however, we only assume $\dim V_2 \ge 2$, so $V_2$ is not necessarily an associative algebra and the situation is much more complicated. By a result from [R], $V_2$ either has two nontrivial idempotent elements or has a nontrivial nilpotent element.
The former case basically follows from the argument in [ZD]. The key point in this paper is to use the fusion rules for the Virasoro algebra with $c = 1$ to deal with the latter case. This explains why we need the assumption in the main theorem that the vertex operator algebra is a sum of highest weight modules for the Virasoro algebra. This assumption is expected to hold for all rational vertex operator algebras with $c = 1$.
This leads us to the study of fusion rules for the Virasoro algebra with $c = 1$, which have been investigated from different points of view [RT,X]. The fusion rules among irreducible modules $L(1, m^2/4)$ with $m \in \mathbb{Z}$ for the Virasoro algebra have been given in [M] based on the $A(V)$-theory developed in [Z,FZ,L2]. We extend these results to include irreducible modules $L(1, n)$ for $n \in \mathbb{Z}$. We believe that the fusion rules computed in this paper will play an important role in the future classification of rational vertex operator algebras with $c = 1$.
The paper is organized as follows. In Sect. 2 we review the various notions of modules and define rational vertex operator algebras. Section 3 is about the Virasoro vertex operator algebras and some results on the structure of highest weight modules for the Virasoro algebra with $c = 1$. We also prove that any simple vertex operator algebra with $c > 1$ is a completely reducible module for the Virasoro algebra. In Sect. 4 we first review the $A(V)$-theory, including how to use bimodules to compute the fusion rules. The new results in this section are the fusion rules for the Virasoro algebra with $c = 1$. The most difficult case is the fusion rules for the irreducible modules $L(1, m^2)$ for integers $m$, as they are not Verma modules. These fusion rules are fundamental later in the proof of the main theorem. Section 5 is devoted to the proof of the main theorem. In the case that $V_2$ has a nontrivial nilpotent element we need to construct some highest weight vectors with certain properties. Then we use the fusion rules to prove that this is impossible. This forces the dimension of $V_2$ to be 2, and the result in [ZD] applies.
2. Preliminaries
Let $V = (V, Y, \mathbf{1}, \omega)$ be a vertex operator algebra [B,FLM]. We review various notions of $V$-modules (cf. [FLM,Z,DLM1]) and the definition of rational vertex operator algebras. We also discuss some consequences following [DLM1].
Definition 2.1. A weak $V$-module is a vector space $M$ equipped with a linear map
\[ Y_M : V \to \mathrm{End}(M)[[z, z^{-1}]], \quad v \mapsto Y_M(v, z) = \sum_{n \in \mathbb{Z}} v_n z^{-n-1}, \quad v_n \in \mathrm{End}(M), \]
satisfying the following:
1) $v_n w = 0$ for $n \gg 0$, where $v \in V$ and $w \in M$,
2) $Y_M(\mathbf{1}, z) = \mathrm{Id}_M$,
3) the Jacobi identity holds:
\[ z_0^{-1}\delta\Bigl(\frac{z_1 - z_2}{z_0}\Bigr) Y_M(u, z_1) Y_M(v, z_2) - z_0^{-1}\delta\Bigl(\frac{z_2 - z_1}{-z_0}\Bigr) Y_M(v, z_2) Y_M(u, z_1) = z_2^{-1}\delta\Bigl(\frac{z_1 - z_0}{z_2}\Bigr) Y_M(Y(u, z_0)v, z_2). \tag{2.1} \]
Definition 2.2. An admissible $V$-module is a weak $V$-module which carries a $\mathbb{Z}_+$-grading $M = \bigoplus_{n \in \mathbb{Z}_+} M(n)$ such that if $v \in V_r$ then $v_m M(n) \subseteq M(n + r - m - 1)$.
Definition 2.3. An ordinary $V$-module is a weak $V$-module which carries a $\mathbb{C}$-grading $M = \bigoplus_{\lambda \in \mathbb{C}} M_\lambda$ such that: 1) $\dim(M_\lambda) < \infty$, 2) $M_{\lambda+n} = 0$ for fixed $\lambda$ and $n \ll 0$ … $c > 1$ and $h > 0$. $V(1, h) = L(1, h)$ if and only if $h \ne \frac{m^2}{4}$ for $m \in \mathbb{Z}$. In case $h = m^2$ for a nonnegative integer $m$, the unique maximal submodule of $V(1, m^2)$ is generated by a highest weight vector with highest weight $(m + 1)^2$ and is isomorphic to $V(1, (m + 1)^2)$.
We next study a general simple vertex operator algebra as a module for the Virasoro algebra.
Lemma 3.2. Let $V$ be a simple vertex operator algebra such that $V_0 = \mathbb{C}\mathbf{1}$ and $L(1)V_1 = 0$. Let $h > 0$ be such that the Verma module $V(c, h)$ for the Virasoro algebra is irreducible. Let $U$ be the sum of irreducible submodules of $V$ isomorphic to $V(c, h)$. Then $V = U \oplus U^\perp$, where $U^\perp = \{v \in V \mid (v, U) = 0\}$ and $(\cdot,\cdot)$ is the canonical non-degenerate symmetric invariant bilinear form on $V$ such that $(\mathbf{1}, \mathbf{1}) = 1$ [FHL], [L1].
Proof. It is enough to prove that $U \cap U^\perp = 0$. First note that $U$ is a completely reducible module for the Virasoro algebra. Also, $U^\perp$ is a module for the Virasoro algebra. Suppose that $U \cap U^\perp \ne 0$. Let $W$ be an irreducible submodule of $U \cap U^\perp$. Then $X = V/W^\perp$ is an irreducible module for the Virasoro algebra isomorphic to $V(c, h)$ and can be identified with the graded dual $W'$ of $W$. Let $v \in V_h$ be such that $v + W^\perp$ is the highest weight vector of $V/W^\perp$. Let $M$ be the module for the Virasoro algebra generated by $v$. Then $M \cap W^\perp$ is a submodule of $M$, $M/(M \cap W^\perp)$ is isomorphic to $X$ and $M \cap V_h = \mathbb{C}v \oplus (M \cap W^\perp \cap V_h)$ (direct sum of subspaces). Note that there are only finitely many composition factors in $M \cap W^\perp$. We have the following exact sequences of modules for the Virasoro algebra:
\[ 0 \to M \cap W^\perp \to M \to L(c, h) \to 0 \]
and
\[ 0 \to L(c, h) \to M' \to (M \cap W^\perp)' \to 0. \]
Since $(W, v) \ne 0$, it follows that $M$ cannot be a direct sum of submodules $L(c, h)$ and $M \cap W^\perp$ for the Virasoro algebra. So $M'$ cannot be a direct sum of submodules $L(c, h)$ and $(M \cap W^\perp)'$. Therefore there exists a highest weight submodule $Z$ of $M'$ such that $L(c, h)$ is a submodule of $Z$. But from the module structure theory in [KR], $L(c, h)$ can never be a submodule of any highest weight module if $V(c, h) = L(c, h)$. This is a contradiction. The proof is complete.
Proposition 3.3. Let $V$ be a simple vertex operator algebra such that $V_0 = \mathbb{C}\mathbf{1}$, $L(1)V_1 = 0$ and $c > 1$. Then $V$ is a completely reducible module for the Virasoro algebra.
Proof. Recall from [KR] or Proposition 3.1 that $V(c, h) = L(c, h)$ if $h > 0$ and $L(c, 0) = \bar{V}(c, 0)$. It is clear that the vertex operator subalgebra of $V$ generated by $\omega$ is isomorphic to $L(c, 0)$. So we can regard $L(c, 0)$ as a subalgebra of $V$. Then we have the decomposition $V = L(c, 0) \oplus L(c, 0)^\perp$, as $(\mathbf{1}, \mathbf{1}) = 1$ and $L(c, 0) \cap L(c, 0)^\perp = 0$. Let $U^n$ be the $L(c, 0)$-submodule of $V$ generated by the highest weight vectors with highest weight $n$. Then $U^n$ is a completely reducible module for the Virasoro algebra and $V = \bigoplus_{n \ge 0} U^n$ by Lemma 3.2.
We remark that in the case $c = 1$ we cannot establish the result in Proposition 3.3, although we strongly believe it is true if we also assume that $V$ is rational and $C_2$-cofinite. We need this assumption for $c = 1$ later to characterize the vertex operator algebra $L(1/2, 0) \otimes L(1/2, 0)$.
This is also the original motivation for us to study the complete reducibility of vertex operator algebras as modules for the Virasoro algebra. How to decompose an arbitrary vertex operator algebra and its modules as a sum of indecomposable modules for $sl(2, \mathbb{C}) = \mathbb{C}L(1) + \mathbb{C}L(-1) + \mathbb{C}L(0)$ has been studied extensively in [DLiM]. Decomposing an arbitrary vertex operator algebra into a sum of indecomposable modules for the Virasoro algebra seems to be much more difficult. But such a decomposition is definitely important in the study of vertex operator algebras and their representations.

4. A(V)-Theory and Fusion Rules
Let $V$ be a vertex operator algebra. An associative algebra $A(V)$ has been introduced and studied in [Z]. It turns out that $A(V)$ is very powerful and useful in the representation theory of vertex operator algebras. One can use $A(V)$ not only to classify the irreducible admissible modules [Z], but also to compute the fusion rules using $A(V)$-bimodules [FZ]. We will first review the definition of $A(V)$ and some important results about $A(V)$ from [Z,FZ,L2]. We then apply the $A(V)$-theory to the vertex operator algebra $L(1, 0)$ to compute the fusion rules for $L(1, 0)$. The central task is to determine the $A(L(1, 0))$-bimodule $A(L(1, m^2))$ for any integer $m$.
As a vector space, $A(V)$ is the quotient space of $V$ by $O(V)$, where $O(V)$ denotes the linear span of the elements
\[ u \circ v = \mathrm{Res}_z\Bigl(Y(u, z)\frac{(z + 1)^{\mathrm{wt}\,u}}{z^2} v\Bigr) = \sum_{i \ge 0} \binom{\mathrm{wt}\,u}{i} u_{i-2} v \tag{4.1} \]
for $u, v \in V$ with $u$ homogeneous. The product in $A(V)$ is induced from the multiplication
\[ u * v = \mathrm{Res}_z\Bigl(Y(u, z)\frac{(z + 1)^{\mathrm{wt}\,u}}{z} v\Bigr) = \sum_{i \ge 0} \binom{\mathrm{wt}\,u}{i} u_{i-1} v \tag{4.2} \]
for $u, v \in V$ with $u$ homogeneous. $A(V) = V/O(V)$ is an associative algebra with identity $\mathbf{1} + O(V)$ and with $\omega + O(V)$ in the center of $A(V)$. The most important result about $A(V)$ is that for any admissible $V$-module $M = \bigoplus_{n \ge 0} M(n)$ with $M(0) \ne 0$, $M(0)$ is an $A(V)$-module such that $v + O(V)$ acts as $o(v)$, where $o(v) = v_{\mathrm{wt}\,v - 1}$ for homogeneous $v$.
For an admissible $V$-module $W$, we also define $O(W) \subset W$ to be the linear span of elements of type
\[ \mathrm{Res}_z\Bigl(Y(v, z)\frac{(z + 1)^{\mathrm{wt}\,v}}{z^2} w\Bigr) = \sum_{i \ge 0} \binom{\mathrm{wt}\,v}{i} v_{i-2} w \tag{4.3} \]
for homogeneous $v \in V$ and $w \in W$. Let $A(W) = W/O(W)$. Then $A(W)$ has an $A(V)$-bimodule structure [FZ] induced by the following bilinear operations $V \times W \to W$ and $W \times V \to W$: for $w \in W$ and homogeneous $v \in V$,
\[ v * w = \mathrm{Res}_z\Bigl(Y(v, z)\frac{(z + 1)^{\mathrm{wt}\,v}}{z} w\Bigr) = \sum_{i \ge 0} \binom{\mathrm{wt}\,v}{i} v_{i-1} w, \tag{4.4} \]
\[ w * v = \mathrm{Res}_z\Bigl(Y(v, z)\frac{(z + 1)^{\mathrm{wt}\,v - 1}}{z} w\Bigr) = \sum_{i \ge 0} \binom{\mathrm{wt}\,v - 1}{i} v_{i-1} w. \tag{4.5} \]
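The mode expansions in (4.1)–(4.5) all come from the same residue computation, which can be sketched in a few lines. The helper name `residue_mode_coefficients` is hypothetical and not from the paper; it assumes a nonnegative integer exponent and the convention $Y(u, z) = \sum_j u_j z^{-j-1}$ used above.

```python
from math import comb


def residue_mode_coefficients(exponent, pole_order):
    """Expand Res_z Y(u, z) (z + 1)^exponent / z^pole_order in modes u_j.

    Returns a dict {j: coefficient} meaning sum_j coefficient * u_j.
    Assumes exponent is a nonnegative integer (hypothetical helper).
    """
    coefficients = {}
    for i in range(exponent + 1):
        # (z + 1)^exponent contributes comb(exponent, i) * z^i.  The residue
        # keeps the z^(-1) term of z^(i - pole_order) * z^(-j - 1),
        # which forces j = i - pole_order.
        coefficients[i - pole_order] = comb(exponent, i)
    return coefficients


# For a weight-2 vector such as omega: modes entering u o v and u * v.
circle_modes = residue_mode_coefficients(2, 2)
star_modes = residue_mode_coefficients(2, 1)
```

Since $\omega_j = L(j - 1)$, the second dictionary corresponds to $\omega * v = (L(-2) + 2L(-1) + L(0))v$, the combination that appears in Proposition 4.5.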
We quote the following proposition from [FZ]:
Proposition 4.1. If $W$ is an admissible module for a vertex operator algebra $V$ and $M$ is a submodule of $W$, then the image $\bar{M}$ of $M$ in $A(W)$ is a sub-$A(V)$-bimodule of $A(W)$, and the quotient $A(W)/\bar{M}$ is isomorphic to the $A(V)$-bimodule $A(W/M)$ associated to the quotient $V$-module $W/M$.
Let $W^i$ ($i = 1, 2, 3$) be ordinary $V$-modules. We denote by $I_V\binom{W^3}{W^1\ W^2}$ the vector space of all intertwining operators of type $\binom{W^3}{W^1\ W^2}$. For a $V$-module $W$, let $W'$ denote the graded dual of $W$. Then $W'$ is also a $V$-module [FHL]. It is well known that fusion rules have the following symmetry (see [FHL]).
Proposition 4.2. Let $W^i$ ($i = 1, 2, 3$) be $V$-modules. Then
\[ \dim I_V\binom{W^3}{W^1\ W^2} = \dim I_V\binom{W^3}{W^2\ W^1}, \qquad \dim I_V\binom{W^3}{W^1\ W^2} = \dim I_V\binom{(W^2)'}{W^1\ (W^3)'}. \]
Let $W^i = \bigoplus_{n \ge 0} W^i(n)$ ($i = 1, 2, 3$) be $V$-modules such that $L(0)|_{W^i(0)} = \lambda_i$. Let $\mathcal{Y}(\cdot, z)$ be an intertwining operator of type $\binom{W^3}{W^1\ W^2}$. Define the following bilinear map:
\[ f_{\mathcal{Y}} : A(W^1) \otimes_{A(V)} W^2(0) \to W^3(0), \quad u^1 \otimes u^2 \mapsto o(u^1)u^2, \quad u^1 \in A(W^1),\ u^2 \in W^2(0), \]
where $o(u^1)$ is the component operator of $\mathcal{Y}(u^1, z)$ such that $o(u^1)$ maps $W^2(0)$ to $W^3(0)$. Then $f_{\mathcal{Y}}$ is an $A(V)$-module homomorphism [FZ]. To state the next result we need to define the Verma type admissible module $M(U)$ associated to an $A(V)$-module $U$:
Definition 4.3. Let $V$ be a vertex operator algebra and $U$ an $A(V)$-module. An admissible $V$-module $M = \bigoplus_{n=0}^{\infty} M(n)$ is called the Verma type module generated by $U$ if $M(0) = U$ as $A(V)$-module and, for any admissible $V$-module $W = \bigoplus_{n=0}^{\infty} W(n)$ with $W(0) = U$ as $A(V)$-module, the identity map from $M(0)$ to $W(0)$ lifts to a $V$-module homomorphism from $M$ to $W$.
The existence of a Verma type admissible module was given in [Z] (also see [DLM2]). The following result comes from [L2]:
Lemma 4.4. Let $W^i$ be $V$-modules for $i = 1, 2, 3$. If $W^3$ is an irreducible $V$-module, then the linear map $\mathcal{Y} \mapsto f_{\mathcal{Y}}$ is an injective map from the space of intertwining operators of type $\binom{W^3}{W^1\ W^2}$ to $\mathrm{Hom}_{A(V)}(A(W^1) \otimes_{A(V)} W^2(0), W^3(0))$. Furthermore, $\mathcal{Y} \mapsto f_{\mathcal{Y}}$ is an isomorphism if both $W^2$ and $(W^3)'$ are Verma type modules for $V$.
We quote a result about the vertex operator algebra $\bar{V}(c, 0)$ from [FZ].
Proposition 4.5. (1) The associative algebra $A(\bar{V}(c, 0))$ is isomorphic to the polynomial algebra $\mathbb{C}[x]$, with the isomorphism being given by $x^n \in \mathbb{C}[x] \mapsto [(L(-2) + L(-1))^n \mathbf{1}]$, where $[a] = a + O(\bar{V}(c, 0))$ for $a \in \bar{V}(c, 0)$.
(2) For the Verma module $V(c, h)$, the $A(\bar{V}(c, 0))$-bimodule $A(V(c, h))$ is $\mathbb{C}[x, y]$ with $x$ and $y$ acting on the left and right as multiplications by $x$ and $y$ respectively. The isomorphism from $\mathbb{C}[x, y]$ to $A(V(c, h))$ is given by $x^m y^n \mapsto [(L(-2) + 2L(-1) + L(0))^m (L(-2) + L(-1))^n \mathbf{1}_h]$, where $\mathbf{1}_h$ is a fixed nonzero highest weight vector of $V(c, h)$.
We now discuss the relation between the Verma module for the Virasoro algebra and the Verma type admissible module for the vertex operator algebra $\bar{V}(c, 0)$. By Proposition 4.5, $A(\bar{V}(c, 0)) = \mathbb{C}[x]$. So any irreducible $A(\bar{V}(c, 0))$-module is one dimensional, with $[\omega]$ acting as a constant $h$. Denote this module by $U$. It is clear that the Verma type admissible $\bar{V}(c, 0)$-module generated by $U$ is exactly the Verma module $V(c, h)$.
We next turn our attention to the fusion rules for the vertex operator algebra $L(1, 0)$. The following theorem is the foundation of our computation of the fusion rules.
Theorem 4.6. Let $r$ be a positive integer. Then $A(L(1, r^2)) = \mathbb{C}[x, y]/\bar{I}$, where
\[ \bar{I} = \Bigl\langle (x - y)\prod_{i=1}^{r}\bigl[(x - y)^2 - 2i^2(x + y) + i^4\bigr] \Bigr\rangle \]
is the two-sided ideal of $\mathbb{C}[x, y]$ generated by $(x - y)\prod_{i=1}^{r}[(x - y)^2 - 2i^2(x + y) + i^4]$.
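As a sanity check on the statement, the generator of the ideal can be evaluated in exact integer arithmetic at the points $(n^2, k^2)$ with $|n - r| \le k \le n + r$, which are the points used in the proof. The following is only an illustrative sketch; the function names are hypothetical and not from the paper.

```python
def quadratic_factor(i, x, y):
    """The factor (x - y)^2 - 2 i^2 (x + y) + i^4 from Theorem 4.6."""
    return (x - y) ** 2 - 2 * i * i * (x + y) + i ** 4


def ideal_generator(r, x, y):
    """The polynomial (x - y) * prod_{i=1}^{r} quadratic_factor(i, x, y)."""
    value = x - y
    for i in range(1, r + 1):
        value *= quadratic_factor(i, x, y)
    return value


def vanishes_on_fusion_points(r, n_max):
    """Check the generator at (n^2, k^2) for |n - r| <= k <= n + r, n <= n_max."""
    return all(
        ideal_generator(r, n * n, k * k) == 0
        for n in range(n_max + 1)
        for k in range(abs(n - r), n + r + 1)
    )
```

For instance, `vanishes_on_fusion_points(3, 25)` exercises $r = 3$ for all $0 \le n \le 25$, while a point outside the range, such as $(0, 4)$ for $r = 1$, gives a nonzero value.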
Proof. Since $\bar{V}(1, 0) = L(1, 0)$, by Proposition 4.5 the associative algebra $A(L(1, 0))$ is $\mathbb{C}[x]$ and the $A(L(1, 0))$-bimodule $A(V(1, r^2))$ is isomorphic to $\mathbb{C}[x, y]$ with $x$ and $y$ acting on the left and right as multiplications by $x$ and $y$ respectively. By Proposition 4.1, as an $A(L(1, 0))$-bimodule, $A(L(1, r^2)) \cong \mathbb{C}[x, y]/\bar{I}$, where $\bar{I}$ is the image in $A(V(1, r^2))$ of the maximal proper submodule $I$ of $V(1, r^2)$. Since $I$ is generated by a non-zero element $v^{(r+1)}$ in $V(1, r^2)$ such that
\[ L(0)v^{(r+1)} = (r + 1)^2 v^{(r+1)}, \qquad L(k)v^{(r+1)} = 0, \quad k \in \mathbb{Z},\ k > 0, \]
it follows that $\bar{I}$ is generated by a polynomial $f(x, y)$ in $\mathbb{C}[x, y]$ with degree $s \le 2r + 1$. Assume that
\[ f(x, y) = \sum_{i=0}^{s} a_i(x) y^i, \]
where $a_i(x)$, $i = 0, 1, \ldots, s$, are polynomials in $x$ of degrees at most $2r + 1 - i$.
We need to use the vertex operator algebra $V_L$ associated to the rank one even positive definite lattice $L = \mathbb{Z}\alpha$ with $(\alpha, \alpha) = 2$ [FLM]. Let $\mathfrak{h} = L \otimes_{\mathbb{Z}} \mathbb{C}$, and let $\hat{\mathfrak{h}}_{\mathbb{Z}}$ be the corresponding Heisenberg algebra. Denote by $M(1) = \mathbb{C}[\alpha(-n) \mid n > 0]$ the associated irreducible induced module for $\hat{\mathfrak{h}}_{\mathbb{Z}}$ such that the canonical central element of $\hat{\mathfrak{h}}_{\mathbb{Z}}$ acts as 1. Let $\mathbb{C}[L]$ be the group algebra of $L$ with a basis $e^\gamma$ for $\gamma \in L$. Let $\beta \in \mathfrak{h}$ be such that $(\beta, \beta) = 1$. It is known that $V_L = M(1) \otimes \mathbb{C}[L]$ is a simple rational vertex operator algebra with $\mathbf{1} = 1 \otimes e^0$ and $\omega = \frac{1}{2}\beta(-1)^2\mathbf{1}$ [B,FLM,D,DLM1]. The subalgebra of $V_L$ generated by $\omega$ is isomorphic to $L(1, 0)$, and
\[ M(1) = \bigoplus_{p \ge 0} L(1, p^2), \qquad V_L = \bigoplus_{m \ge 0} (2m + 1) L(1, m^2), \tag{4.6} \]
as modules for the Virasoro algebra (cf. [DG]). It is well known that $V_L$ is isomorphic to the fundamental representation $L(\Lambda_0)$ for the affine Kac-Moody algebra $A_1^{(1)}$ [FK]. Note that the weight one subspace $(V_L)_1$ of $V_L$ forms a Lie algebra $\mathfrak{g}$ isomorphic to $sl(2, \mathbb{C})$, where the Lie bracket in $(V_L)_1$ is defined as $[u, v] = u_0 v$ and $u_0$ is the component operator of $Y(u, z) = \sum_{n \in \mathbb{Z}} u_n z^{-n-1}$. The Lie algebra $\mathfrak{g}$ acts on $V_L$ via $v_0$ for $v \in (V_L)_1$. The $\mathfrak{g}$-invariant elements $V_L^{\mathfrak{g}} = \{v \in V_L \mid \mathfrak{g} \cdot v = 0\}$ form a simple vertex operator algebra isomorphic to $L(1, 0)$ (see [DG]).
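The decompositions (4.6) can be checked at the level of graded dimensions. The sketch below rests on two assumptions that are standard but not spelled out at this point in the text: that, up to an overall power of $q$, the character of $L(1, m^2)$ is $(q^{m^2} - q^{(m+1)^2})/\prod_{n \ge 1}(1 - q^n)$ (consistent with the maximal submodule of $V(1, m^2)$ being $V(1, (m+1)^2)$, as recalled in Sect. 3), and that the graded dimension of $V_L$ is $\sum_{n \in \mathbb{Z}} q^{n^2}/\prod_{n \ge 1}(1 - q^n)$. All function names are hypothetical.

```python
def partition_series(order):
    """Coefficients of prod_{n>=1} 1/(1 - q^n) up to q^order (graded dim of M(1))."""
    coeffs = [1] + [0] * order
    for part in range(1, order + 1):
        for degree in range(part, order + 1):
            coeffs[degree] += coeffs[degree - part]
    return coeffs


def theta_series(order):
    """Coefficients of sum_{n in Z} q^(n^2) up to q^order."""
    coeffs = [0] * (order + 1)
    coeffs[0] = 1
    n = 1
    while n * n <= order:
        coeffs[n * n] += 2  # contributions of n and -n
        n += 1
    return coeffs


def numerator_sum(order, multiplicity):
    """Coefficients of sum_m multiplicity(m) * (q^(m^2) - q^((m+1)^2)) up to q^order."""
    coeffs = [0] * (order + 1)
    m = 0
    while m * m <= order:
        coeffs[m * m] += multiplicity(m)
        if (m + 1) ** 2 <= order:
            coeffs[(m + 1) ** 2] -= multiplicity(m)
        m += 1
    return coeffs


def truncated_product(a, b, order):
    """Product of two power series given as coefficient lists, truncated at q^order."""
    out = [0] * (order + 1)
    for i, a_i in enumerate(a):
        if a_i == 0:
            continue
        for j, b_j in enumerate(b):
            if i + j > order:
                break
            out[i + j] += a_i * b_j
    return out
```

With multiplicity 1 the numerators telescope to 1, which matches the graded dimension of $M(1)$; with multiplicity $2m + 1$ they sum to the theta series, which matches that of $V_L$.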
A Characterization of Vertex Operator Algebra L( 21 , 0) ⊗ L( 21 , 0)
77
Let $W_m$ be the unique $(m + 1)$-dimensional highest weight module for $\mathfrak{g}$ with highest weight $m \in \mathbb{Z}_{\ge 0}$. Let $V_L^{W_m}$ be the sum of irreducible $\mathfrak{g}$-submodules of $V_L$ isomorphic to $W_m$, and $(V_L)_{W_m}$ the space of highest weight vectors in $V_L^{W_m}$. Then by [DG], as a $(V_L^{\mathfrak{g}}, \mathfrak{g})$-module, $V_L$ has the decomposition
\[
V_L = \bigoplus_{m \ge 0} V_L^{W_{2m}} = \bigoplus_{m \ge 0} (V_L)_{W_{2m}} \otimes W_{2m} \tag{4.7}
\]
and $(V_L)_{W_{2m}}$ is an irreducible module for $V_L^{\mathfrak{g}}$. Moreover, $(V_L)_{W_{2k}}$ and $(V_L)_{W_{2m}}$ are isomorphic if and only if $k = m$. By [DG], $(V_L)_{W_{2m}}$ is isomorphic to $L(1, m^2)$ as an $L(1, 0)$-module.

For $m, n \in \mathbb{Z}_+$, $m \ge n$, let $W_{2m,2n} = \mathrm{span}\{u_j v \mid u \in W_{2m}, v \in W_{2n}, j \in \mathbb{Z}\}$. Then $W_{2m,2n}$ is a $\mathfrak{g}$-module. Let $u \in W_{2m}$ and $v \in W_{2n}$ be such that
\[
\alpha(0)u = (2m - 2i)u, \qquad \alpha(0)v = (2n - 2j)v,
\]
for some $0 \le i \le 2m$, $0 \le j \le 2n$, where $\alpha(0) = (\alpha(-1)\mathbf{1})_0$ is the component operator of $\alpha(z) = Y(\alpha(-1)\mathbf{1}, z) = \sum_{k \in \mathbb{Z}} \alpha(k) z^{-k-1}$. Then
\[
\alpha(0) u_p v = (\alpha(0)u)_p v + u_p \alpha(0) v = (2m + 2n - 2i - 2j) u_p v
\]
for all $p \in \mathbb{Z}$. This means that $W_{2m,2n}$ is a sum of irreducible $\mathfrak{g}$-modules in $\{W_{2k} \mid 0 \le k \le m + n\}$. On the other hand, we have the following well-known tensor product decomposition:
\[
W_{2m} \otimes W_{2n} = W_{2(m-n)} \oplus W_{2(m-n)+2} \oplus \cdots \oplus W_{2(m+n)-2} \oplus W_{2(m+n)}. \tag{4.8}
\]
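As a quick sanity check on (4.8), the following minimal sketch lists the highest weights of the summands and compares dimensions on both sides, using $\dim W_m = m + 1$. The function names are illustrative only and do not come from the paper.

```python
def clebsch_gordan_weights(m, n):
    """Highest weights 2k of the summands of W_{2m} (x) W_{2n} as in (4.8)."""
    if m < n:                      # (4.8) is stated for m >= n; the tensor product is symmetric
        m, n = n, m
    return [2 * k for k in range(m - n, m + n + 1)]


def dim_W(weight):
    """Dimension of the irreducible sl(2, C)-module with the given highest weight."""
    return weight + 1


def dimensions_agree(m, n):
    """Compare dim(W_{2m} (x) W_{2n}) with the total dimension of the summands."""
    lhs = dim_W(2 * m) * dim_W(2 * n)
    rhs = sum(dim_W(w) for w in clebsch_gordan_weights(m, n))
    return lhs == rhs


if __name__ == "__main__":
    print(clebsch_gordan_weights(2, 1))
    print(all(dimensions_agree(m, n) for m in range(8) for n in range(m + 1)))
```

A dimension count does not prove the decomposition, but it catches off-by-one errors in the range $m - n \le k \le m + n$ that is used repeatedly in what follows.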
By Lemma 2.2 of [DM2], for a small enough integer $p$, the map $\psi_p : W_{2m} \otimes W_{2n} \to W_{2m,2n}$ defined by
\[
\psi_p : u \otimes v \mapsto \sum_{i=p}^{\infty} u_i v, \qquad u \in W_{2m},\ v \in W_{2n},
\]
is injective. Therefore in the decomposition of $W_{2m,2n}$ into irreducible $\mathfrak{g}$-modules, each $W_{2k}$ appears for $m - n \le k \le m + n$. Denote by $U_{m,n}$ the $L(1, 0)$-submodule of $V_L$ generated by $W_{2m,2n}$. Then by (4.7), we have
\[
U_{m,n} \supseteq \bigoplus_{m-n \le k \le m+n} (V_L)_{W_{2k}} \otimes W_{2k}.
\]
This proves that
\[
I_{L(1,0)} \binom{L(1, k^2)}{L(1, m^2)\ \ L(1, n^2)} \ne 0
\]
for all $m, n, k \in \mathbb{Z}_+$ such that $|m - n| \le k \le n + m$. Let $m = r$; then we have $f(n^2, k^2) = 0$ for all $n, k \in \mathbb{Z}_+$ satisfying $|r - n| \le k \le n + r$. Thus for $n \in \mathbb{Z}_+$ with $n - r \ge 0$, we have
\[
\begin{bmatrix}
1 & (n-r)^2 & (n-r)^4 & (n-r)^6 & \cdots & (n-r)^{2s} \\
1 & (n-r+1)^2 & (n-r+1)^4 & (n-r+1)^6 & \cdots & (n-r+1)^{2s} \\
1 & (n-r+2)^2 & (n-r+2)^4 & (n-r+2)^6 & \cdots & (n-r+2)^{2s} \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
1 & (n+r)^2 & (n+r)^4 & (n+r)^6 & \cdots & (n+r)^{2s}
\end{bmatrix}
\begin{bmatrix}
a_0(n^2) \\ a_1(n^2) \\ a_2(n^2) \\ \vdots \\ a_s(n^2)
\end{bmatrix}
= 0. \tag{4.9}
\]
C. Dong, C. Jiang
If $s \le 2r$, then for each $n \in \mathbb{Z}_+$ such that $n \ge r$, the coefficient matrix of (4.9) contains an $(s + 1) \times (s + 1)$ minor which is a non-singular Vandermonde determinant, and it follows that (4.9) has only the zero solution. This implies that $a_i(x) = 0$ for all $i$, a contradiction. So we have $s = 2r + 1$. We may assume that $a_{2r+1}(x) = 1$. Then we have
\[
A(n)
\begin{bmatrix}
a_0(n^2) \\ a_1(n^2) \\ a_2(n^2) \\ \vdots \\ a_{2r}(n^2)
\end{bmatrix}
=
\begin{bmatrix}
-(n-r)^{2(2r+1)} \\ -(n-r+1)^{2(2r+1)} \\ -(n-r+2)^{2(2r+1)} \\ \vdots \\ -(n+r)^{2(2r+1)}
\end{bmatrix}, \tag{4.10}
\]
where
\[
A(n) =
\begin{bmatrix}
1 & (n-r)^2 & (n-r)^4 & (n-r)^6 & \cdots & (n-r)^{4r} \\
1 & (n-r+1)^2 & (n-r+1)^4 & (n-r+1)^6 & \cdots & (n-r+1)^{4r} \\
1 & (n-r+2)^2 & (n-r+2)^4 & (n-r+2)^6 & \cdots & (n-r+2)^{4r} \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
1 & (n+r)^2 & (n+r)^4 & (n+r)^6 & \cdots & (n+r)^{4r}
\end{bmatrix}.
\]
This shows that (4.10) has a unique solution for each $n \in \mathbb{Z}_+$ such that $n \ge r$. Since $a_i(x)$, $i = 0, 1, \dots, 2r + 1$, are polynomials in $x$ with degrees at most $2r + 1$, it follows that $f(x, y)$ is uniquely determined (up to a non-zero scalar) by the condition that $f(n^2, k^2) = 0$ for all $n, k \in \mathbb{Z}_+$ such that $|n - r| \le k \le n + r$. Let
\[
f_i(x, y) = (x - y)^2 - 2i^2(x + y) + i^4, \qquad i = 1, 2, \dots, r.
\]
Then we have $f_i(n^2, (n \pm i)^2) = 0$. This proves that the polynomial
\[
(x - y) \prod_{i=1}^{r} \left[(x - y)^2 - 2i^2(x + y) + i^4\right]
\]
satisfies the above condition. So we have
\[
f(x, y) = (x - y) \prod_{i=1}^{r} \left[(x - y)^2 - 2i^2(x + y) + i^4\right],
\]
as expected.

We are now in a position to give the fusion rules for the vertex operator algebra $L(1, 0)$.
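The vanishing condition that pins down $f(x, y)$ is easy to test numerically. The sketch below evaluates the product formula with exact integer arithmetic; the function names and the search bounds are illustrative assumptions. The complementary check (non-vanishing outside the range) is restricted here to $n \ge r$, which is the case where the zero set of $f(n^2, \cdot)$ on squares is exactly the interval $n - r \le k \le n + r$.

```python
def f_i(x, y, i):
    """The quadratic factor f_i(x, y) = (x - y)^2 - 2 i^2 (x + y) + i^4."""
    return (x - y) ** 2 - 2 * i * i * (x + y) + i ** 4


def f(x, y, r):
    """The product (x - y) * prod_{i=1}^{r} f_i(x, y)."""
    value = x - y
    for i in range(1, r + 1):
        value *= f_i(x, y, i)
    return value


def vanishes_on_range(r, n_max=12):
    """f(n^2, k^2) = 0 whenever |n - r| <= k <= n + r."""
    return all(
        f(n * n, k * k, r) == 0
        for n in range(n_max + 1)
        for k in range(abs(n - r), n + r + 1)
    )


def nonzero_outside_range(r, n_max=12):
    """For n >= r, f(n^2, k^2) != 0 when k < n - r or k > n + r."""
    return all(
        f(n * n, k * k, r) != 0
        for n in range(r, n_max + 1)
        for k in range(n + r + 6)
        if k < n - r or k > n + r
    )


if __name__ == "__main__":
    for r in range(1, 5):
        print(r, vanishes_on_range(r), nonzero_outside_range(r))
```

Both checks rest on the factorization $f_i(n^2, y) = [y - (n - i)^2][y - (n + i)^2]$ evaluated at $y = k^2$.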
Theorem 4.7. We have
\[
\dim I_{L(1,0)} \binom{L(1, k^2)}{L(1, m^2)\ \ L(1, n^2)} = 1, \qquad k \in \mathbb{Z}_+,\ |n - m| \le k \le n + m, \tag{4.11}
\]
\[
\dim I_{L(1,0)} \binom{L(1, k^2)}{L(1, m^2)\ \ L(1, n^2)} = 0, \qquad k \in \mathbb{Z}_+,\ k < |n - m| \text{ or } k > n + m, \tag{4.12}
\]
where $n, m \in \mathbb{Z}_+$. For $n \in \mathbb{Z}_+$ such that $n \ne p^2$ for all $p \in \mathbb{Z}_+$, we have
\[
\dim I_{L(1,0)} \binom{L(1, n)}{L(1, m^2)\ \ L(1, n)} = 1, \tag{4.13}
\]
\[
\dim I_{L(1,0)} \binom{L(1, k)}{L(1, m^2)\ \ L(1, n)} = 0, \tag{4.14}
\]
for $k \in \mathbb{Z}_+$ such that $k \ne n$.

Proof. By Lemma 4.4, for $k_1, k_2, k_3 \in \mathbb{Z}_+$,
\[
\dim I_{L(1,0)} \binom{L(1, k_3)}{L(1, k_1)\ \ L(1, k_2)}
\]
is less than or equal to
\[
\dim \mathrm{Hom}_{A(L(1,0))}\left(A(L(1, k_1)) \otimes_{A(L(1,0))} L(1, k_2)(0),\ L(1, k_3)(0)\right),
\]
where $L(1, h)(0) = \mathbb{C}\mathbf{1}_h$ is the one-dimensional lowest weight space of the irreducible $L(1, 0)$-module $L(1, h)$ such that $L(0)\mathbf{1}_h = h\mathbf{1}_h$, $L(n)\mathbf{1}_h = 0$, $1 \le n \in \mathbb{Z}_+$. That is, $x$ in $\mathbb{C}[x] = A(L(1, 0))$ acts on $L(1, h)(0)$ as $h$.

Let $m, n, k \in \mathbb{Z}_+$ be such that $|m - n| \le k \le m + n$. It is easy to see that
\[
A(L(1, m^2)) \otimes_{A(L(1,0))} L(1, n^2)(0) \cong \mathbb{C}[x] \Big/ \Big\langle (x - n^2) \prod_{i=1}^{m} \left[(x - n^2)^2 - 2i^2(x + n^2) + i^4\right] \Big\rangle.
\]
Denote the ideal $\big\langle (x - n^2) \prod_{i=1}^{m} [(x - n^2)^2 - 2i^2(x + n^2) + i^4] \big\rangle$ by $\bar{I}_n$. For $0 \ne \varphi \in \mathrm{Hom}_{A(L(1,0))}(A(L(1, m^2)) \otimes_{A(L(1,0))} L(1, n^2)(0), L(1, k^2)(0))$, we have
\[
x \cdot \varphi(1 + \bar{I}_n)\mathbf{1}_{k^2} = k^2 \mathbf{1}_{k^2} = \varphi(x + \bar{I}_n)\mathbf{1}_{k^2},
\]
since $x \cdot \mathbf{1}_{k^2} = k^2 \mathbf{1}_{k^2}$. So $\varphi(p(x) + \bar{I}_n)\mathbf{1}_{k^2} = p(k^2)\mathbf{1}_{k^2}$ for $p(x) \in \mathbb{C}[x]$. This means that
\[
\dim \mathrm{Hom}_{A(L(1,0))}\left(A(L(1, m^2)) \otimes_{A(L(1,0))} L(1, n^2)(0),\ L(1, k^2)(0)\right) = 1.
\]
On the other hand, by Theorem 4.6, we have
\[
I_{L(1,0)} \binom{L(1, k^2)}{L(1, m^2)\ \ L(1, n^2)} \ne 0.
\]
So (4.11) holds.
For $n, k \in \mathbb{Z}_+$ such that $k < |n - m|$ or $k > n + m$, let $x = k^2$, $y = n^2$; then we have
\[
f(k^2, n^2) = (k^2 - n^2) \prod_{i=1}^{m} \left[(k^2 - n^2)^2 - 2i^2(k^2 + n^2) + i^4\right]
= (k^2 - n^2) \prod_{i=1}^{m} \left[k^2 - (n - i)^2\right]\left[k^2 - (n + i)^2\right] \ne 0.
\]
This proves that
\[
\dim \mathrm{Hom}_{A(L(1,0))}\left(A(L(1, m^2)) \otimes_{A(L(1,0))} L(1, n^2)(0),\ L(1, k^2)(0)\right) = 0.
\]
So (4.12) is true. For (4.14), we have
\[
f(k, n) = (k - n) \prod_{i=1}^{m} \left[(k - n)^2 - 2i^2(k + n) + i^4\right]
= (k - n) \prod_{i=1}^{m} \left[(k - n - i^2)^2 - 4i^2 n\right] \ne 0,
\]
since $n \ne k$ and $n \ne p^2$ for all $p \in \mathbb{Z}_+$. Therefore (4.14) holds. By Theorem 4.6, we have
\[
\dim \mathrm{Hom}_{A(L(1,0))}\left(A(L(1, m^2)) \otimes_{A(L(1,0))} L(1, n)(0),\ L(1, n)(0)\right) = 1.
\]
Since for $n \in \mathbb{Z}_+$ such that $n \ne p^2$ for all $p \in \mathbb{Z}_+$ we have $L(1, n) = V(1, n) \cong L(1, n)'$, (4.13) then follows from Lemma 4.4.

The following corollary is not used in this paper, but it is an interesting result.

Corollary 4.8. Let $U$ be a highest weight module for the Virasoro algebra generated by the highest weight vector $u^{(r)}$ such that $L(0)u^{(r)} = r^2 u^{(r)}$, $L(k)u^{(r)} = 0$, $r \in \mathbb{Z}_+ \setminus \{0\}$. Let $m, n \in \mathbb{Z}_+ \setminus \{0\}$ be such that $m \ne n$ and $m, n$ are not perfect squares. Then
\[
I_{L(1,0)} \binom{U}{L(1, m)\ \ L(1, n)} = 0.
\]
Proof. If $U$ is irreducible, the corollary follows immediately from Proposition 4.2 and Theorem 4.7. Otherwise, let $U'$ be the graded dual of $U$. Then $U'$ contains an irreducible submodule $W^{(r)}$ which is isomorphic to $L(1, r^2)$. By Theorem 4.7,
\[
I_{L(1,0)} \binom{L(1, n)}{W^{(r)}\ \ L(1, m)} = 0.
\]
$U'$ contains a submodule $W^{(r+1)}$ such that $\bar{W}^{(r+1)} = W^{(r+1)}/W^{(r)}$ is an irreducible $L(1, 0)$-module isomorphic to $L(1, (r + 1)^2)$. Again by Theorem 4.7, we have
\[
I_{L(1,0)} \binom{L(1, n)}{\bar{W}^{(r+1)}\ \ L(1, m)} = 0.
\]
This implies
\[
I_{L(1,0)} \binom{L(1, n)}{W^{(r+1)}\ \ L(1, m)} = 0.
\]
Continuing the above steps, we deduce that
\[
I_{L(1,0)} \binom{L(1, n)}{W\ \ L(1, m)} = 0
\]
for any proper submodule $W$ of $U'$. We now claim that
\[
I_{L(1,0)} \binom{L(1, n)}{U'\ \ L(1, m)} = 0.
\]
Let $\mathcal{Y} \in I_{L(1,0)} \binom{L(1, n)}{U'\ \ L(1, m)}$ be a nonzero intertwining operator. Then $\mathcal{Y}(u, z) \ne 0$ for some $u \in U'$. Since $U$ is a highest weight module for the Virasoro algebra, there exists a proper submodule $W$ of $U'$ such that $u \in W$. This shows that
\[
I_{L(1,0)} \binom{L(1, n)}{W\ \ L(1, m)} \ne 0,
\]
a contradiction. Using Proposition 4.2 we conclude that
\[
\dim I_{L(1,0)} \binom{U}{L(1, m)\ \ L(1, n)} = \dim I_{L(1,0)} \binom{L(1, n)}{U'\ \ L(1, m)} = 0,
\]
as desired.

5. Uniqueness of L(1/2, 0) ⊗ L(1/2, 0)

In this section we prove the main theorem in this paper:

Theorem 5.1. If $V$ is a simple, rational and $C_2$-cofinite vertex operator algebra such that $V_1 = 0$, $c = \tilde{c} = 1$, $V$ is a sum of highest weight modules for the Virasoro algebra and $\dim V_2 \ge 2$, then $\dim V_2 = 2$ and $V$ is isomorphic to $L(1/2, 0) \otimes L(1/2, 0)$.

From now on we assume that $V$ satisfies all the assumptions given in Theorem 5.1. First we notice that $V_n = 0$ if $n < 0$ and $V_0 = \mathbb{C}\mathbf{1}$ (see [DGL]). Also there is a unique symmetric, non-degenerate invariant bilinear form $(\cdot, \cdot)$ on $V$ such that $(\mathbf{1}, \mathbf{1}) = 1$ (see [L1]). Then for any $u, v, w \in V$,
\[
(u, v)\mathbf{1} = \mathrm{Res}_z\, z^{-1} Y\!\left(e^{L(1)z}(-z^{-2})^{L(0)} u, z^{-1}\right) v.
\]
In particular, the restriction of the form to each homogeneous subspace $V_n$ is non-degenerate and
\[
(u_{n+1} v, w) = (v, u_{-n+1} w)
\]
for all $u, v \in V_2$ and $w \in V$. $V_2$ is a commutative non-associative algebra with the product $ab = a_1 b$ for $a, b \in V_2$ and the identity $\omega/2$ (cf. [FLM]). For $a, b \in V_2$ we have $(a, b)\mathbf{1} = a_3 b$. Moreover, the form on $V_2$ is associative. That is, $(ab, c) = (a, bc)$ for $a, b, c \in V_2$. By [R], either there is a nontrivial nilpotent element $x \in V_2$ or $V_2$ is spanned by idempotent elements.

Lemma 5.2. If $V_2$ is spanned by the idempotent elements, then $V$ is isomorphic to $L(1/2, 0) \otimes L(1/2, 0)$.

Proof. Let $x \in V_2$ be a nontrivial idempotent element. Set $\omega^1 = 2x$ and $\omega^2 = \omega - 2x$. Then the $\omega^i$ are Virasoro elements [M1]. It follows from the proof of Theorem 3.1 of [ZD] that $V$ contains $L(c_1, 0) \otimes L(c_2, 0)$ as a subalgebra for some complex numbers $c_1, c_2$ such that $c_1 + c_2 = 1$. In fact, $L(c_i, 0)$ is isomorphic to the subalgebra generated by $\omega^i$. It then follows from the proof of Lemmas 4.5 and 4.6 of [ZD] that both $c_1$ and $c_2$ are $1/2$. That is, $V$ contains the rational vertex operator algebra $L(1/2, 0) \otimes L(1/2, 0)$ (see [DMZ] and [W]) as a subalgebra and $V$ is a completely reducible $L(1/2, 0) \otimes L(1/2, 0)$-module. Since the irreducible modules of $L(1/2, 0) \otimes L(1/2, 0)$ are $L(1/2, h_1) \otimes L(1/2, h_2)$ for $h_i \in \{0, \frac{1}{2}, \frac{1}{16}\}$ and $\dim V_0 = 1$, $\dim V_1 = 0$, we immediately see that $V = L(1/2, 0) \otimes L(1/2, 0)$. In particular, $\dim V_2 = 2$.

We now deal with the case that there exists $0 \ne x \in V_2$ such that $x^2 = 0$. There are two cases: (1) $(\omega, x) \ne 0$; (2) $(\omega, x) = 0$.

Lemma 5.3. We must have $(\omega, x) = 0$.

Proof. If $(\omega, x) \ne 0$, we can assume that $(\omega, x) = 1$. Then the component operators $W(n)$ of $Y(x, z) = \sum_{n \in \mathbb{Z}} W(n) z^{-n-2}$ and the component operators $L(n)$ of $Y(\omega, z)$ generate a copy of the $W$-algebra $W(2, 2)$ with central charge 1, where $W(2, 2)$ is an infinite dimensional Lie algebra with basis $L_m, W_m, C$ for $m \in \mathbb{Z}$ and Lie brackets
\[
[L_m, L_n] = (m - n)L_{m+n} + \frac{m^3 - m}{12}\delta_{m+n,0}C,
\]
\[
[L_m, W_n] = (m - n)W_{m+n} + \frac{m^3 - m}{12}\delta_{m+n,0}C,
\]
\[
[W_m, W_n] = 0
\]
for $m, n \in \mathbb{Z}$, where $C$ is a central element (see [ZD]). Let $c, h_1, h_2 \in \mathbb{C}$ and denote by $V(c, h_1, h_2)$ the Verma module for $W(2, 2)$ with central charge $c$ and highest weight $(h_1, h_2)$. Then $V(c, h_1, h_2) = U(W(2, 2))/I_{c,h_1,h_2}$, where $I_{c,h_1,h_2}$ is the left ideal of the universal enveloping algebra $U(W(2, 2))$ generated by $L_m$, $W_m$, $C - c$, $L_0 - h_1$ and $W_0 - h_2$ for positive $m$. By the PBW theorem, $V(c, h_1, h_2)$ has basis
\[
\{W_{-m_1} \cdots W_{-m_s} L_{-n_1} \cdots L_{-n_t} \mathbf{1}_{(h_1,h_2)} \mid m_1 \ge \cdots \ge m_s \ge 1,\ n_1 \ge \cdots \ge n_t \ge 1\},
\]
where $\mathbf{1}_{(h_1,h_2)} = 1 + I_{c,h_1,h_2}$. It is standard that $V(c, h_1, h_2)$ has a unique maximal submodule $J(c, h_1, h_2)$, so that $L(c, h_1, h_2) = V(c, h_1, h_2)/J(c, h_1, h_2)$ is an irreducible highest weight module of $W(2, 2)$. By Theorem 2.1 of [ZD], if $c \ne 0$ then
\[
J(c, 0, 0) = U(W(2, 2))L_{-1}\mathbf{1}_{(0,0)} + U(W(2, 2))W_{-1}\mathbf{1}_{(0,0)}
\]
and $L(c, 0, 0)$ has a basis
\[
\{W_{-m_1} \cdots W_{-m_s} L_{-n_1} \cdots L_{-n_t} \mathbf{1}_0 \mid m_1 \ge \cdots \ge m_s > 1,\ n_1 \ge \cdots \ge n_t > 1\}, \tag{5.1}
\]
where $\mathbf{1}_0$ is the canonical highest weight vector of $L(c, 0, 0)$.
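The basis (5.1) can be counted degree by degree: a basis vector of degree $d$ is a pair of partitions, one from the $W$-modes and one from the $L$-modes, with all parts at least 2 and total size $d$. The following minimal sketch computes these counts, which are the coefficients of $\prod_{n \ge 2}(1 - q^n)^{-2}$. The function name is illustrative and not taken from the paper.

```python
def graded_dimensions(max_degree):
    """Number of basis vectors (5.1) in each degree 0..max_degree.

    Equivalently, the coefficients of prod_{n >= 2} (1 - q^n)^(-2),
    computed by the usual partition-counting recurrence.
    """
    coeffs = [0] * (max_degree + 1)
    coeffs[0] = 1
    for _ in range(2):                      # one pass for W-modes, one for L-modes
        for part in range(2, max_degree + 1):
            for degree in range(part, max_degree + 1):
                coeffs[degree] += coeffs[degree - part]
    return coeffs


if __name__ == "__main__":
    print(graded_dimensions(10))
```

In degree 4, for example, the five basis vectors are $W_{-4}\mathbf{1}_0$, $W_{-2}W_{-2}\mathbf{1}_0$, $W_{-2}L_{-2}\mathbf{1}_0$, $L_{-4}\mathbf{1}_0$ and $L_{-2}L_{-2}\mathbf{1}_0$.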
Let $U$ be the vertex operator subalgebra generated by $\omega, x$. Then $U$ is a highest weight $W(2, 2)$-module with highest weight vector $\mathbf{1}$ such that $W_n$ acts as $W(n)$ and $L_n$ acts as $L(n)$ for all $n \in \mathbb{Z}$. Since $L(-1)\mathbf{1} = W(-1)\mathbf{1} = 0$, we see that $U$ is isomorphic to $L(1, 0, 0)$. By (5.1), $U$ has $q$-character
\[
\mathrm{ch}_q U = \frac{q^{-1/24}}{\prod_{n>1}(1 - q^n)^2}.
\]
By Proposition 4.2 of [ZD], the coefficients of
\[
\eta(q)\,\mathrm{ch}_q U = \frac{1 - q}{\prod_{n>1}(1 - q^n)}
\]
grow faster than any polynomial in $n$. But this is a contradiction, as the coefficients of $\eta(q)\,\mathrm{ch}_q V$ satisfy the polynomial growth condition by Lemma 2.6.

So we can now assume that $(\omega, x) = 0$. Since $L(1)x \in V_1$ and $(\omega, x) = (L(2)x, \mathbf{1})$, we see that $x$ is a highest weight vector for the Virasoro algebra. By the fact that the bilinear form $(\cdot, \cdot)$ on $V$ is non-degenerate and $(\omega, \omega) = \frac{1}{2}$, there exists $y \in V_2$ such that $(x, y) = 1$, $(y, \omega) = 0$. So $(L(2)y, \mathbf{1}) = 0$. This means that $L(2)y = 0$. Since $L(1)y \in V_1 = 0$, we deduce that $y$ is a highest weight vector for the Virasoro algebra. Assume that
\[
xy = a\omega + \alpha x + \beta y + u,
\]
where $\alpha, \beta \in \mathbb{C}$, and $u \in V_2$ is such that $(u, x) = (u, y) = (u, \omega) = 0$. Note that
\[
(x, y) = \frac{1}{2}(x, y\omega) = \frac{1}{2}(xy, \omega)
\]
and $(\omega, \omega) = \frac{1}{2}$. We have $a = 4$. Since $(y, xx) = (xy, x) = \beta(x, y) = 0$, it follows that $\beta = 0$. Therefore
\[
xy = 4\omega + \alpha x + u.
\]
It is obvious that $u$ is a highest weight vector for the Virasoro algebra. The following lemma is an immediate consequence of the commutator formula in vertex operator algebras.

Lemma 5.4. Let $v$ be a highest weight vector for the Virasoro algebra with highest weight 2. Then
\[
[L(m), v_n] = (m - n + 1)v_{n+m}
\]
for all $m, n \in \mathbb{Z}$.

Lemma 5.5. Assume that $x_{-1}x = 0$. Then we have (1) $u_1 x = -10x$, (2) $u_0 x = -5x_{-2}\mathbf{1}$.

Proof. Since $V_n = 0$ for $n < 0$, we have $x_n x = 0$ for $n \ge 4$. By the fact that $x_1 x = x^2 = 0$, we have
\[
(x, x) = (x_3 x, \mathbf{1}) = (\omega/2, x^2) = 0.
\]
So $x_3 x = 0$. Using the skew symmetry $Y(x, z)x = e^{L(-1)z}Y(x, -z)x$ we see that
\[
x_0 x = -x_0 x + L(-1)x_1 x = -x_0 x + L(-1)x^2 = -x_0 x.
\]
This proves that $x_0 x = 0$. Note that $x_2 x = 0$, since $V_1 = 0$. So we have $x_n x = 0$ for $n \ge 0$. Thus
\[
Y(x, z_1)Y(x, z_2) = Y(x, z_2)Y(x, z_1)
\]
and
\[
Y(x_{-1}x, z) = Y(x, z)Y(x, z) = 0.
\]
In particular,
\[
x_1 x_1 + 2\sum_{i \ge 1} x_{1-i}x_{1+i} = 0
\]
and
\[
\Big(x_1 x_1 + 2\sum_{i \ge 1} x_{1-i}x_{1+i}\Big)y = x_1 x_1 y + 2x = 10x + x_1 u = 0.
\]
This proves (1). For (2), we apply the zero operator $\sum_{i \ge 0} x_{-i}x_{i+1}$ to $y$ to obtain
\[
0 = x_0 x_1 y + x_{-2}x_3 y = x_0(4\omega + \alpha x + u) + x_{-2}\mathbf{1} = 5x_{-2}\mathbf{1} + x_0 u,
\]
where we have used Lemma 5.4. Thus $x_0 u = -5x_{-2}\mathbf{1}$. Using the skew symmetry we see that
\[
u_0 x = -x_0 u + L(-1)x_1 u = 5x_{-2}\mathbf{1} - 10x_{-2}\mathbf{1} = -5x_{-2}\mathbf{1},
\]
as desired.

From now on we redefine $y$ as $y = y + \frac{\alpha}{10}u$. It follows from Lemma 5.5 that $x_1 y = y_1 x = 4\omega + u$. Although this new $y$ is again a highest weight vector for the Virasoro algebra, we cannot assume $(y, u) = 0$ any more.
Corollary 5.6. (1) $[u_m, x_n] = 5(n - m)x_{m+n-1}$ for $m, n \in \mathbb{Z}$. (2) $(u, u) = -10$.

Proof. (1) follows from Lemma 5.5 and the commutator formula
\[
[u_m, x_n] = \sum_{i \ge 0} \binom{m}{i}(u_i x)_{m+n-i}.
\]
For (2) we compute
\[
(x_1 y, x_1 y) = (4\omega + u, 4\omega + u) = 8 + (u, u).
\]
On the other hand,
\[
(x_1 y, x_1 y) = (y, x_1(4\omega + u)) = (y, 8x - 10x) = -2.
\]
That is, $(u, u) = -10$.

Lemma 5.7. Assume that $x_{-1}x = 0$. Then there exist $a, b \in \mathbb{C}$ such that
\[
v = u_{-1}x + ax_{-3}\mathbf{1} + bL(-2)x
\]
is a nonzero highest weight vector of weight 4 for the Virasoro algebra.
Proof. We first use the conditions $L(1)v = L(2)v = 0$ to determine $a, b$. Using Lemmas 5.4 and 5.5 we have
\[
L(1)v = L(1)u_{-1}x + aL(1)x_{-3}\mathbf{1} + bL(1)L(-2)x = 3u_0 x + 5ax_{-2}\mathbf{1} + 3bx_{-2}\mathbf{1} = (-15 + 5a + 3b)x_{-2}\mathbf{1}
\]
and
\[
L(2)v = L(2)u_{-1}x + aL(2)x_{-3}\mathbf{1} + bL(2)L(-2)x = 4u_1 x + 6ax + b\Big(4L(0) + \frac{1}{2}\Big)x = \Big(-40 + 6a + b\Big(8 + \frac{1}{2}\Big)\Big)x.
\]
So $a = \frac{15}{49}$, $b = \frac{220}{49}$ are uniquely determined by the linear system
\[
5a + 3b = 15, \qquad 12a + 17b = 80.
\]
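The values of $a$ and $b$ can be confirmed with exact rational arithmetic. The sketch below solves the $2 \times 2$ system by Cramer's rule and also evaluates the quantity $200 - 10(3a + 4b + 1)$, which appears in the proof that $v$ is nonzero. The helper name `solve_2x2` is an illustrative choice, not notation from the paper.

```python
from fractions import Fraction


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 exactly by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the system is singular")
    x = Fraction(b1 * a22 - a12 * b2, det)
    y = Fraction(a11 * b2 - b1 * a21, det)
    return x, y


# The system 5a + 3b = 15, 12a + 17b = 80 from L(1)v = L(2)v = 0.
a, b = solve_2x2(5, 3, 15, 12, 17, 80)

# The scalar whose non-vanishing shows that v is nonzero.
pairing = 200 - 10 * (3 * a + 4 * b + 1)

if __name__ == "__main__":
    print(a, b, pairing)
```

The determinant of the system is $5 \cdot 17 - 3 \cdot 12 = 49$, which explains the common denominator.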
It is clear that $L(n)v = 0$ for $n > 2$. We now prove that $v$ is nonzero. It is enough to prove that $y_3 v \ne 0$. We have the following computation:
\[
\begin{aligned}
y_3 v &= \sum_{i=0}^{3}\binom{3}{i}(y_i u)_{2-i}x + u + a\sum_{i=0}^{3}\binom{3}{i}(y_i x)_{-i}\mathbf{1} + b(4y_1 + L(-2)y_3)x \\
&= (y_0 u)_2 x + 3(y_1 u)_1 x + (y, u)x + u + 3ay_1 x + 4by_1 x + b\omega \\
&= (-u_0 y + L(-1)u_1 y)_2 x + 3(y_1 u)_1 x + (y, u)x + u + (3a + 4b)(4\omega + u) + b\omega \\
&= -u_0 y_2 x + y_2 u_0 x - 2(u_1 y)_1 x + 3(y_1 u)_1 x + (y, u)x + (12a + 17b)\omega + (3a + 4b + 1)u \\
&= -5y_2 x_{-2}\mathbf{1} + (u_1 y)_1 x + (y, u)x + (12a + 17b)\omega + (3a + 4b + 1)u.
\end{aligned}
\]
Thus we have
\[
\begin{aligned}
(y_3 v, u) &= \big(-5y_2 x_{-2}\mathbf{1} + (u_1 y)_1 x + (y, u)x + (12a + 17b)\omega + (3a + 4b + 1)u,\ u\big) \\
&= -5(x_{-2}\mathbf{1}, y_0 u) + (u_1 y, x_1 u) + (3a + 4b + 1)(u, u) \\
&= -5(x_{-2}\mathbf{1}, -u_0 y + L(-1)u_1 y) - 10(u_1 y, x) - 10(3a + 4b + 1) \\
&= 5(u_2 x_{-2}\mathbf{1}, y) - 5(L(1)x_{-2}\mathbf{1}, u_1 y) + 100 - 10(3a + 4b + 1) \\
&= -100(x, y) - 20(x, u_1 y) + 100 - 10(3a + 4b + 1) \\
&= 200 - 10(3a + 4b + 1) = \frac{60}{49} \ne 0.
\end{aligned}
\]
The proof is complete.

Lemma 5.8. Assume that $x_{-1}x = 0$. Let $v = u_{-1}x + ax_{-3}\mathbf{1} + bL(-2)x$ be the nonzero highest weight vector given in Lemma 5.7. Then $x_i v = 0$ for all $i \ge 0$.
Proof. Since $x_{-1}x = 0$, it follows that $x_{-2}x = \frac{1}{2}L(-1)x_{-1}x = 0$. So for $i \ge 0$, we have
\[
x_i v = x_i u_{-1}x + ax_i x_{-3}\mathbf{1} + bx_i L(-2)x = 5(-1 - i)x_{i-2}x + u_{-1}x_i x + b(i + 1)x_{i-2}x + bL(-2)x_i x = 0,
\]
as desired.

Lemma 5.9. $V$ is a completely reducible module for the Virasoro algebra.

Proof. By the assumption, $V$ is a sum of highest weight modules for the Virasoro algebra. We claim that any highest weight module for the Virasoro algebra generated by a highest weight vector $w \in V$ with highest weight $n$ is isomorphic to $L(1, n)$. If not, let $U$ be the highest weight module generated by $w$ for the Virasoro algebra. Then $U$ has a unique maximal submodule $M$ generated by a highest weight vector $f$. Then we can write $f$ as a linear combination of $L(-n_1) \cdots L(-n_k)w$ for $n_1 \ge \cdots \ge n_k \ge 1$. Let $X$ be a highest weight module in $V$ for the Virasoro algebra generated by a highest weight vector $g$. It is clear that
\[
(L(-n_1) \cdots L(-n_k)w, g) = (w, L(n_k) \cdots L(n_1)g) = 0,
\]
and so $(f, g) = 0$. Let $L(-m_1) \cdots L(-m_p)g \in X$ be such that $m_i > 0$ and $p \ge 1$. Then
\[
(f, L(-m_1) \cdots L(-m_p)g) = (L(m_p) \cdots L(m_1)f, g) = 0.
\]
This shows that $(f, V) = 0$. Since the form is non-degenerate, this is impossible. As a result, $V$ is a completely reducible module for the Virasoro algebra.

We can now complete the proof of Theorem 5.1. Let $v$ be the vector given in Lemma 5.7 if $x_{-1}x = 0$; otherwise let $v = x_{-1}x$. Then $v$ is a nonzero highest weight vector for the Virasoro algebra with highest weight 4 such that $x_i v = 0$ for all $i \ge 0$. It follows from Lemma 5.9 that the highest weight modules generated by $x$ and $v$ are isomorphic to $L(1, 2)$ and $L(1, 4)$ respectively. By Proposition 11.9 of [DL], $Y(x, z)v \ne 0$ as $V$ is simple. Thus there exists $n > 0$ such that $x_{-n}v \ne 0$ and $x_{-m}v = 0$ for all $m < n$. Then $x_{-n}v$ is a highest weight vector for the Virasoro algebra with highest weight $n + 5$ and generates an irreducible highest weight module isomorphic to $L(1, n + 5)$. As a result we have a nonzero intertwining operator of type $\binom{L(1, n + 5)}{L(1, 4)\ \ L(1, 2)}$. This is a contradiction by Theorem 4.7. Hence there is no nontrivial nilpotent element in $V_2$ and Theorem 5.1 holds by Lemma 5.2.

Remark 5.10. As we pointed out in [ZD], the assumption $c = \tilde{c}$ in Theorem 5.1 is necessary. We believe that the assumption that $V$ is a sum of highest weight modules for the Virasoro algebra is unnecessary, but we do not know how to prove the main result of this paper without this assumption.

Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References

[B] Borcherds, R.: Vertex algebras, Kac-Moody algebras and the Monster. Proc. Natl. Acad. Sci. USA 83, 3068–3071 (1986)
[D] Dong, C.: Vertex algebras associated with even lattices. J. Algebra 160, 245–265 (1993)
[DG] Dong, C., Griess, R. Jr.: Rank one lattice type vertex operator algebras and their automorphism groups. J. Algebra 208, 262–275 (1998)
[DGH] Dong, C., Griess, R. Jr., Hoehn, G.: Framed vertex operator algebras, codes and the moonshine module. Commun. Math. Phys. 193, 407–448 (1998)
[DGL] Dong, C., Griess, R. Jr., Lam, C.: Uniqueness results of the moonshine vertex operator algebra. Amer. J. Math. 129, 583–609 (2007)
[DL] Dong, C., Lepowsky, J.: Generalized Vertex Algebras and Relative Vertex Operators. Progress in Math. Vol. 112, Boston: Birkhäuser, 1993
[DLM1] Dong, C., Li, H., Mason, G.: Regularity of rational vertex operator algebras. Adv. in Math. 132, 148–166 (1997)
[DLM2] Dong, C., Li, H., Mason, G.: Twisted representations of vertex operator algebras. Math. Ann. 310, 571–600 (1998)
[DLM3] Dong, C., Li, H., Mason, G.: Modular invariance of trace functions in orbifold theory and generalized moonshine. Commun. Math. Phys. 214, 1–56 (2000)
[DLiM] Dong, C., Lin, Z., Mason, G.: On vertex operator algebras as sl2-modules. In: Groups, Difference Sets, and the Monster, Proc. of a Special Research Quarter at The Ohio State University, Spring 1993, ed. by Arasu, K.T., Dillon, J.F., Harada, K., Sehgal, S., Solomon, R., Berlin-New York: Walter de Gruyter, 1996, pp. 349–362
[DM1] Dong, C., Mason, G.: Rational vertex operator algebras and the effective central charge. International Math. Research Notices 56, 2989–3008 (2004)
[DM2] Dong, C., Mason, G.: Quantum Galois theory for compact Lie groups. J. Algebra 214, 92–102 (1999)
[DMZ] Dong, C., Mason, G., Zhu, Y.: Discrete series of the Virasoro algebra and the moonshine module. Proc. Symp. Pure. Math. American Math. Soc. 56(II), 295–316 (1994)
[FHL] Frenkel, I.B., Huang, Y., Lepowsky, J.: On axiomatic approaches to vertex operator algebras and modules. Memoirs American Math. Soc. 104, 1993
[FK] Frenkel, I., Kac, V.: Basic representations of affine Lie algebras and dual resonance models. Invent. Math. 62, 23–66 (1980)
[FLM] Frenkel, I.B., Lepowsky, J., Meurman, A.: Vertex Operator Algebras and the Monster. Pure and Applied Math. Vol. 134, New York-London: Academic Press, 1988
[FZ] Frenkel, I., Zhu, Y.: Vertex operator algebras associated to representations of affine and Virasoro algebras. Duke Math. J. 66, 123–168 (1992)
[KR] Kac, V.G., Raina, A.: Highest Weight Representations of Infinite Dimensional Lie Algebras. Adv. Ser. In Math. Phys., Singapore: World Scientific, 1987
[KL] Kawahigashi, Y., Longo, R.: Local conformal nets arising from framed vertex operator algebras. Adv. Math. 206, 729–751 (2006)
[K] Kiritsis, E.: Proof of the completeness of the classification of rational conformal field theories with c = 1. Phys. Lett. B 217, 427–430 (1989)
[KM] Knopp, M., Mason, G.: On vector-valued modular forms and their Fourier coefficients. Acta Arith. 110, 117–124 (2003)
[LY] Lam, C., Yamauchi, H.: A characterization of the moonshine vertex operator algebra by means of Virasoro frames. Int. Math. Res. Not. 2007 (2007), ID rnm003, 10 pp
[L1] Li, H.: Symmetric invariant bilinear forms on vertex operator algebras. J. Pure Appl. Algebra 96, 279–297 (1994)
[L2] Li, H.: Determining fusion rules by A(V)-modules and bimodules. J. Algebra 212, 515–556 (1999)
[M] Milas, A.: Fusion rings for degenerate minimal models. J. Algebra 254, 300–335 (2002)
[M1] Miyamoto, M.: Griess algebras and conformal vectors in vertex operator algebras. J. Algebra 179, 523–548 (1996)
[M2] Miyamoto, M.: Binary codes and vertex operator superalgebras. J. Algebra 181, 207–222 (1996)
[M3] Miyamoto, M.: Representation theory of code vertex operator algebra. J. Algebra 201, 115–150 (1998)
[M4] Miyamoto, M.: A new construction of the moonshine vertex operator algebra over the real number field. Ann. of Math. 159, 535–596 (2004)
[R] Röhrl, H.: Finite-dimensional algebras without nilpotents over algebraically closed fields. Arch. Math. 32, 10–12 (1979)
[RT] Rehren, K., Tuneke, H.: Fusion rules for the continuum sectors of the Virasoro algebra of c = 1. Lett. Math. Phys. 53, 305–312 (2000)
[W] Wang, W.: Rationality of Virasoro vertex operator algebras. Internat. Math. Res. Notices 7, 197–211 (1993)
[X] Xu, F.: Strong additivity and conformal nets. Pacific J. Math. 221, 167–199 (2005)
[ZD] Zhang, W., Dong, C.: W-algebra W(2,2) and the vertex operator algebra L(1/2, 0) ⊗ L(1/2, 0). Commun. Math. Phys. 285, 991–1004 (2009)
[Z] Zhu, Y.: Modular invariance of characters of vertex operator algebras. J. Amer. Math. Soc. 9, 237–302 (1996)
Communicated by Y. Kawahigashi
Commun. Math. Phys. 296, 89–109 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-0993-z
Communications in
Mathematical Physics
Langlands Duality for Representations and Quantum Groups at a Root of Unity
Kevin McGerty
Department of Mathematics, Imperial College London, London SW7 2AZ, United Kingdom.
Received: 19 May 2009 / Accepted: 28 October 2009 Published online: 4 February 2010 – © Springer-Verlag 2010
Abstract: We give a representation-theoretic interpretation of the Langlands character duality of [FH], and show that the “Langlands branching multiplicities” for symmetrizable Kac-Moody Lie algebras are equal to certain tensor product multiplicities. For finite type quantum groups, the connection with tensor products can be explained in terms of tilting modules.

1. Introduction

Let $\mathfrak{g}$ be a simple Lie algebra and ${}^L\mathfrak{g}$ its Langlands dual Lie algebra. In [FH] a duality between the irreducible characters of $\mathfrak{g}$ and ${}^L\mathfrak{g}$ was established and a number of conjectures were made about its properties, both on the level of representations and, more combinatorially, at the level of crystals. In this paper we generalize the character duality to the category $\mathcal{O}^{int}$ of integrable representations in category $\mathcal{O}$ for a symmetrizable Kac-Moody Lie algebra, and establish, in this more general context, a number of the conjectures of [FH]. With a mild restriction on the generalized Cartan matrix we also give a representation-theoretic interpretation of the character duality using Lusztig’s modified quantum groups at a root of unity.

It will be convenient to use Lusztig’s notion of a root datum and Cartan datum, the latter being essentially a symmetrizable generalized Cartan matrix with an integral choice of symmetrization. Indeed for any symmetrizable generalized Cartan matrix $C = (a_{ij})_{i,j \in I}$ on an indexing set $I$, we may choose a root datum $(X, Y, I)$ consisting of a weight lattice $X$, a coweight lattice $Y$, a perfect pairing $\langle \cdot, \cdot \rangle : Y \times X \to \mathbb{Z}$, along with a set of simple roots $\{\alpha_i\}_{i \in I} \subset X$ and a set of simple coroots $\{\check{\alpha}_i\}_{i \in I} \subset Y$, which satisfy $\langle \check{\alpha}_i, \alpha_j \rangle = a_{ij}$. We will assume that both the simple roots and simple coroots are linearly independent, so that we have the standard partial ordering on $X$, and a dominant cone
Supported by a Royal Society University Research Fellowship.
$X^+ = \{\lambda \in X : \langle \check{\alpha}_i, \lambda \rangle \ge 0,\ \forall i \in I\}$. Let $(d_i)_{i \in I}$ be an integral vector such that $DC$ is symmetric, where $D$ is the diagonal matrix with $D_{ii} = d_i$. Set $d$ to be the least common multiple of the $d_i$, and let $l_i = d/d_i$. Let ${}^L\mathfrak{g}$ be the Langlands dual Kac-Moody algebra with Cartan matrix $C^t$. Then we may embed the weight lattice of ${}^L\mathfrak{g}$ into $X$ in such a way that $\varpi_i^L \mapsto \varpi_i^* = l_i \varpi_i$ (where $\varpi_i$, $\varpi_i^L$ are the fundamental weights). Let $X^*$ denote the image of this map$^1$. Given a simple highest weight representation $\nabla(\lambda)$ of $\mathfrak{g}$, such that $\lambda \in X^* \cap X^+$, let $c^*(\nabla(\lambda))$ denote the direct sum of those weight spaces of $\nabla(\lambda)$ whose weights lie in $X^*$. In [FH] it was shown that the character of $c^*(\nabla(\lambda))$ is the virtual character of a representation of ${}^L\mathfrak{g}$: more precisely it was shown that
\[
\chi(c^*(\nabla(\lambda))) = \chi^L(\lambda) + \sum_{\mu \in X^*,\ \mu < \lambda} m^{\lambda}_{\mu}\chi^L(\mu), \qquad (m^{\lambda}_{\mu} \in \mathbb{Z}), \tag{1.1}
\]
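while
\[
\chi^L(\lambda) = \prod_{\alpha^* \in \Phi^*,\ \alpha^* > 0} (1 - e^{-\alpha^*})^{-n^*_{\alpha}} \sum_{w \in W} \varepsilon(w)e^{w(\lambda + \rho^L) - \rho^L},
\]
where $n_{\alpha}$ and $n^*_{\alpha}$ denote the dimensions of the root spaces in $\mathfrak{g}$ and ${}^L\mathfrak{g}$ respectively. But now we have
\[
\begin{aligned}
\chi(\rho^L - \rho) &= \Big(\prod_{\alpha > 0}(1 - e^{-\alpha})^{-n_{\alpha}}\Big)\sum_{w \in W}\varepsilon(w)e^{w(\rho^L) - \rho} \\
&= \Big(e^{-\rho + \rho^L}\prod_{\alpha > 0}(1 - e^{-\alpha})^{-n_{\alpha}}\Big)\sum_{w \in W}\varepsilon(w)e^{w(\rho^L) - \rho^L}.
\end{aligned}
\]
Applying Weyl’s denominator formula for $(X^*, Y^*, I)$ to the sum in the last expression, we find that
\[
\chi(\rho^L - \rho) = \Big(e^{\rho^L - \rho}\prod_{\alpha > 0}(1 - e^{-\alpha})^{-n_{\alpha}}\Big)\prod_{\alpha^* > 0}(1 - e^{-\alpha^*})^{n^*_{\alpha}}.
\]
The statement of the lemma follows immediately.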
Remark 5.3. In the finite or affine case, Lemma 2.12 shows that $\Phi$ and $\Phi^*$ are in bijection $\alpha \leftrightarrow \alpha^*$, so that $\alpha^* = l_{\alpha}\alpha$ for some positive $l_{\alpha} \in \mathbb{Z}$. Since all root spaces are one-dimensional in the finite case, we get a simple expression for the Weyl character of $\rho^L - \rho$:
\[
\chi(\rho^L - \rho) = e^{\rho^L - \rho}\prod_{\alpha \in \Phi,\ \alpha > 0}\big(1 + e^{-\alpha} + \dots + e^{-(l_{\alpha} - 1)\alpha}\big).
\]
In the affine case suppose the generalized Cartan matrix is of type $X_m^{(s)}$ in the classification given in [Kac, Chap. 4]. Then although the root spaces corresponding to real roots are again all one-dimensional, the root space of weight $j\delta$ has dimension $|I| - 1$ if $s$ divides $j$ and dimension $(m - |I| + 1)/(s - 1)$ otherwise. The explicit formula for $\chi(\rho^L - \rho)$ is thus similar but contains a more elaborate product contribution coming from imaginary roots.

Example 5.4. Let $\ell = 2$, and let $\dot{U}$ be the quantum group of type $B_2$, so that $\dot{U}^*$ is of type $C_2$. We take $\alpha_1$ to be the long root and $\alpha_2$ to be the short root, so that $\rho^L - \rho = \varpi_2$. Setting $\mu = \varpi_1 \in X^*$, for example, and writing $y_i = e^{\varpi_i}$, we have
\[
\chi(\rho^L - \rho + \mu) = \chi(\rho) = y_1 y_2 (1 + y_1^{-2}y_2^{2})(1 + y_1 y_2^{-2})(1 + y_1^{-1})(1 + y_2^{-2})
\]
(since we always have $\chi(\rho) = e^{\rho}\prod_{\alpha > 0}(1 + e^{-\alpha})$). Now the representation of highest weight $\varpi_2$ is 4-dimensional with character $\chi(\varpi_2) = y_2(1 + y_1 y_2^{-2})(1 + y_1^{-1})$, and so dually we have $\chi^L(\varpi_1) = y_1(1 + y_1^{-2}y_2^{2})(1 + y_2^{-2})$. The product formula of the previous lemma now follows immediately.
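For $\nu_1, \nu_2, \nu_3 \in X^+$ let $\{c^{\nu_3}_{\nu_1,\nu_2}\}$ be the structure constants for the multiplication on the Grothendieck group of the category $\mathcal{O}^{int}(\mathfrak{g})$ given by tensor product. Since $K_0(\mathcal{O}^{int}(\mathfrak{g}))$ injects into $E$ via the character map we have
\[
\chi(\nu_1)\chi(\nu_2) = \sum_{\nu_3 \in X^+} c^{\nu_3}_{\nu_1,\nu_2}\chi(\nu_3).
\]
Theorem 5.5. The Langlands duality branching rules $m^{\lambda}_{\mu}$ are positive. More precisely, we have
\[
m^{\lambda}_{\mu} = c^{\mu + \rho^L - \rho}_{\rho^L - \rho,\lambda}.
\]
Proof. Suppose we have $\Pi(\chi(\lambda)) = \sum_{\mu \in X^+ \cap X^*,\ \mu \le \lambda} m^{\lambda}_{\mu}\chi^L(\mu)$. Then it follows from Lemma 5.2 that
\[
\chi(\rho^L - \rho)\,\Pi(\chi(\lambda)) = \sum_{\mu \in X^+ \cap X^*,\ \mu \le \lambda} m^{\lambda}_{\mu}\chi(\mu + \rho^L - \rho). \tag{5.1}
\]
On the other hand, since $\chi(\lambda)$ is $W$-invariant, and $\Pi$ is $W$-equivariant, we may apply Lemma 5.1 with $\xi = \chi(\lambda) - \Pi(\chi(\lambda))$ to obtain
\[
\chi(\rho^L - \rho).\big(\chi(\lambda) - \Pi(\chi(\lambda))\big) = \sum_{\nu \in X^+,\ \nu \notin \rho^L - \rho + X^*} n^{\lambda}_{\nu}\chi(\nu) \tag{5.2}
\]
for some integers $n^{\lambda}_{\nu} \in \mathbb{Z}$. Finally, since we have
\[
\chi(\rho^L - \rho).\chi(\lambda) = \sum_{\eta \in X^+} c^{\eta}_{\rho^L - \rho,\lambda}\chi(\eta),
\]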
and the Weyl characters which can occur on the right-hand sides of Eqs. (5.1) and (5.2) have highest weights which lie in different $X^*$-cosets, the claim immediately follows.

Remark 5.6. Note that the above argument shows that $\chi(\rho^L - \rho).(\chi(\lambda) - \Pi(\chi(\lambda)))$ is indeed a positive sum of Weyl characters, in contrast to the general situation of Lemma 5.1. We will see in the next section that, at least in the finite-type case, it is the character of a direct summand of the $\dot{U}$-module $\nabla(\rho^L - \rho) \otimes \nabla(\lambda)$.

Tensor product multiplicities have been computed combinatorially by various people: for finite type, building on a combinatorial description due to Lusztig, Berenstein and Zelevinsky have given “polyhedral” multiplicity formulas in [BZ]; for a general Kac-Moody algebra, there is a Littlewood-Richardson rule in terms of Littelmann paths [Li95].

Example 5.7. Consider again the case of $B_2$. We need to calculate the multiplicities in the tensor products $\nabla(\lambda) \otimes \nabla(\varpi_2)$ for $\lambda \in X^* \cap X^+$ (where $\alpha_2$ is the short root and $\alpha_1$ the long root). As we have seen above, the set of weights of $\nabla(\varpi_2)$ is the Weyl group orbit of $\varpi_2$ and each weight has multiplicity one. Let $W_{\varpi_2}$ be the stabilizer of $\varpi_2$ in $W$. Using the formula in the statement of Lemma 5.1, it follows that for $w \in W/W_{\varpi_2}$, the simple highest weight representation $\nabla(\lambda + w(\varpi_2))$ occurs exactly once in the tensor product, provided $\lambda + w(\varpi_2)$ is dominant (since it is easy to check that if $\lambda + w(\varpi_2)$ is not dominant, then $\lambda + w(\varpi_2) + \rho$ is not regular) and these are all the constituents. Hence we have
\[
\Pi(\chi(\lambda)) = \sum_{\substack{w \in W/W_{\varpi_2} \\ \lambda + w(\varpi_2) \in X^+}} \chi^L(\lambda + w(\varpi_2) - \varpi_2).
\]
Note that since λ ∈ X ∗ ∩ X + , the weight λ + w(2 ) is dominant if and only if λ + w(2 ) − 2 is dominant. This recovers the calculations of [FH, Remark 6.9]. 6. On Langlands Duality Branching Rules in the Finite-type Case and Tilting Modules In this section we study the branching multiplicities m λµ from a representation-theoretic point of view, and give an interpretation of the results of the last section using tilting modules. We restrict ourselves to the case of finite type algebras, as we will use the machinery of induction etc. for quantum algebras at roots of unity provided by [APW,Ka], and the infinitesimal quantum groups defined by Lusztig. We begin by recalling the results on quantum groups at a root of unity that we will need. Since we work here with only finite type quantum groups we may work with the category F of finite dimensional representations (of type 1). In [L90], Lusztig defined root vectors θα for each positive root α, via a braid group action. For a positive root α, let α = i , where α is conjugate to the simple root αi under the action of W the Weyl group. Proposition 6.1. Let f be the subalgebra of f = A ⊗A f generated by {θα : α ≥ 2}. Then f is a finite dimensional algebra and we have an isomorphism f∗ ⊗ f → f , given by (x, y) → Fr (x).y. Proof. This is established in most cases in [L93, 35], except when is small. The excluded cases (which are already stated in [L93] but without detailed proof) have been checked in [Ka, 2.7]. ˙ ∗ . Define algebras u˙ and uˆ by ˙ and U Definition 6.2. We need various subalgebras of U + − setting u˙ = {x 1λ y : x, y ∈ f, λ ∈ X } and uˆ = {x1λ : x ∈ u˙ or x ∈ U − }, (note that ˙ ≤0 = {x − 1λ : x ∈ f}. these are indeed subalgebras). Finally, we need the subalgebra U In [APW92] the authors define (derived) induction functors on integrable modules ˙ denoted H i (U1 /U2 , −), where U1 ⊃ U2 are two ˙ ≤0 and U for the algebras u˙ , uˆ , U 6 of the algebras above . 
Given λ ∈ X , there is a natural one-dimensional module for ˙ ≤0 which we denote simply by λ. When λ is dominant, H 0 (U/ ˙ U ˙ ≤ , λ), just as in the U classical theory of induction from a Borel subgroup, is an indecomposable module with character given by Weyl’s formula, which we denote by ∇(λ). It has a unique simple submodule L(λ). The dual of ∇(λ), denoted (λ), is a costandard, or Weyl, module. ˙ ∗ , though here of course the theory is easier The same theory exists for the algebra U 6 In fact they work with “unmodified” algebras, but the categories of modules obviously correspond to categories of modules for the modified algebras – see [Ka] for more details.
˙ ∗ -modules is semisimple. We will write ∇ ∗ (λ), ∗ (λ) for the because the category of U ˙∗ corresponding modules for U . We define σ = i∈I (i − 1)i , and let St , the Steinberg representation, be the ˙ ≤0 , −) is exact (see [Ka, 2.9]), and we denote it by module ∇(σ ). The functor H 0 (ˆu/U ∼ ˙ and ˆ ˆ Z . It is known that St = Z (σ ) as uˆ -modules, and in fact St is simple as both a U uˆ -module, see for example [APW92, §0.9] for more details. We will also need the class of modules known as tilting modules, whose definition we now recall. ˙ module is said to be tilting if it has a filtration both by standard Definition 6.3. A U and costandard modules. We now review some of the basic results on tilting modules. Although all the results are standard, we sketch the proof to point out that they all hold even for small values of . Theorem 6.4. (1) If M1 , M2 are tilting modules, then so is M1 ⊗ M2 . (2) If M and N are tilting modules, then M ∼ = N if and only if M and N have the same character. Proof. The key to (1) is to show that the tensor product of two standard modules has a filtration by standard modules. This follows even integrally from Lusztig’s theory of based modules: see for example [Ka98]. The general construction of tilting modules shows that for each λ ∈ X + there is a unique indecomposable tilting module T (λ), where λ occurs as a weight of T (λ) with multiplicity one and all other weights of T (λ) are strictly less than λ. Moreover every indecomposable tilting module has this form. See [A92, §2] for more details. This readily implies (2). We also need to understand some relations between pulling back via the Frobenius, ˙ ∗≤0 -module there is an and induction. The main result of [Ka] asserts that for M a U isomorphism ˙ u, M Fr ) ∼ ˙ ∗ /U ˙ ∗≤0 , M) Fr , (i ≥ 0), H i (U/ˆ = H i (U
(6.1)
where M Fr is the uˆ -module obtained via composition with Fr , and similarly for the right-hand side. (This result is already established in [APW92] with some restrictions.) Theorem 6.5. Let λ ∈ X + ∩ X ∗ , then we have ˙ /U ˙ ≤0 , λ + σ ) ∼ ˙ ∗ /U ∗≤0 , λ) Fr . H i (U = St ⊗ H i (U
(6.2)
Proof. With the ingredients provided by [Ka], the proof is standard. By (6.1) we have ˙ ∗ /U ∗≤0 , λ) Fr ∼ ˙ u, λ Fr ). H i (U = H i (U/ˆ Thus tensoring both sides with St we find the right-hand side of (6.2) is isomorphic to ˙ /ˆu, λ Fr ) ⊗ St ∼ ˙ /ˆu, λ Fr ⊗ St ) H i (U = H i (U ∼ ˙ /ˆu, λ Fr ⊗ Zˆ (σ )) = H i (U ∼ H i (U ˙ /ˆu, Zˆ (λ + σ )) = ∼ ˙ /U ˙ ≤0 , λ + σ ), = H i (U
˙ -modules, in the second the isowhere in the first line we use the tensor identity for U morphism St ∼ = Zˆ (σ ), in the third the tensor identity for uˆ -modules, and in the final line, the fact that Zˆ is exact, so the spectral sequence for the composition of induction ˙ /ˆu, Zˆ (M)) ∼ ˙ /U ˙ ≤0 , M). functors degenerates to give an isomorphism H i (U = H i (U Remark 6.6. The characteristic p version of this theorem, due to Andersen, gives a short proof of Kempf’s vanishing theorem. Moreover, taking characters of H 0 when = d we recover Lemma 5.2, and thus it can be seen as the representation-theoretic version of that calculation. Proposition 6.7 ([APW,A92]). We have the following properties of the Steinberg module St : (1) St is injective and projective in F. (2) If M is a finite dimensional representation, then St ⊗ M is tilting, projective and injective in F. Proof. We outline a proof of this theorem here to emphasize that it holds for all (at least over C, which is the only field we need here). We must show that Ext1 (St , L(λ)) = 0 for all λ ∈ X + . The linkage principle already implies that this Ext vanishes unless λ = σ + µ, where µ ∈ X ∗ . Now the previous theorem shows that these modules are in fact standard modules ∇(σ + µ) = St ⊗ Fr ∗ (∇ ∗ (µ)). Hence it is enough to show that Ext1 (St , ∇(σ + µ)) = 0. Since St is self-dual, this follows if we can show Ext1 ((σ ), ∇(σ + µ)) = 0, but it is known (and crucial in the construction of tilting modules) that Ext1 ((λ), ∇(µ)) = 0, ∀λ, µ ∈ X + , and so we are done. Self-duality also immediately implies that St is injective. Moreover, using standard properties of Hom and the tensor product, it follows readily that St ⊗ E is injective and projective for any finite-dimensional module E. To see that it is tilting, one can show that any module can be imbedded in a module of the form St ⊗ T , where T is tilting. 
Since St is also tilting, and tilting modules are closed under direct summands, it follows that indecomposable projectives and injective modules are tilting. See [A92] for more details. Corollary 6.8. Let µ ∈ X ∗ ∩ X + . Then ∇(µ + ρ L − ρ) is simple, tilting, projective and injective. Proof. From Theorem 6.2 and Lusztig’s quantum version of Steinberg’s tensor product theorem, it follows that the modules ∇(µ + ρ L − ρ) = ∇(ρ L − ρ) ⊗ Fr ∗ (∇ ∗ (µ)) are simple. By the previous proposition, they are also tilting and injective. By duality, they are also projective. We now examine the Langlands branching multiplicities. We would like a representation-theoretic interpretation of the calculation of these multiplicities in terms of tensor-product multiplicities in Theorem 5.5. The key, unsurprisingly, is Theorem 6.2 and the theory of tilting modules outlined above. Notice first that c∗ (∇(λ)) is a rep˙ ∗ , so it does not make sense to compare it to ∇(λ), however we may resentation of U pull it back via Fr without changing its character. Unfortunately, Fr ∗ c∗ (∇) still has no obvious (at least to the authors) relation to ∇(λ). Nevertheless once we tensor with the
Steinberg representation, a natural relation appears. Recall from [A03] that the linkage ˙ shows that the orbits of the ρ-shifted action of the affine Weyl group Wˆ principle for U ˙ . The simple modules ∇(µ + ρ L − ρ) for µ ∈ X ∗ thus all lie are unions of blocks for U in the union of blocks given by the orbits of Wˆ on ρ L − ρ + X ∗ . Proposition 6.9. Let λ ∈ X ∗ . The module St ⊗ Fr ∗ c∗ (∇(λ)) is a direct summand of the module St ⊗ ∇(λ), and moreover is precisely the summand which lies in the union ˙ contained in the Wˆ -orbits of ρ L − ρ + X ∗ . of the blocks of U Proof. By part (2) of Proposition 6.7 we see that St ⊗ ∇(λ) is a tilting module. Thus if T (γ ) denotes the indecomposable tilting module with highest weight γ , we may write St ⊗ ∇(λ) = T (ν + ρ L − ρ), ν∈X +
a sum of indecomposable tilting modules (the tilting modules which occur must have highest weight of the form ν + ρ L − ρ, by [A92, 5.12]). For any µ ∈ X ∗ , Theorem 6.2 combined with Proposition 6.7 shows that ∇(µ + ρ L − ρ) is projective and injective and tilting, thus it cannot occur as a composition factor of a standard filtration of T (ν + ρ L − ρ) for ν ∈ / X ∗ . Therefore ⎛ ⎞ µ+ρ L −ρ ⊕c St ⊗ ∇(λ) = T ⊕ ⎝ ∇(µ + ρ L − ρ) ρ L −ρ,µ ⎠ , µ∈X ∗
where T is a tilting module whose character lies entirely in the (positive) span of the Weyl characters χ (ν + ρ L − ρ) for ν ∈ / X ∗. On the other hand, we have λ St ⊗ Fr ∗ c∗ (∇(λ)) = ∇(µ + ρ L − ρ)⊕m µ . µ∈X ∗
Hence using Theorem 5.5 the result follows.
Remark 6.10. This also shows that the expression χ (ρ L − ρ).(χ (λ) − (χ (λ)) is the character of T in the above proof, which is also a direct summand of St ⊗ ∇(λ). Note also that the above proof shows that $m^{\lambda}_{\mu} \le c^{\mu+\rho^L-\rho}_{\mu,\,\rho^L-\rho}$, independently of Sect. 5, since tilting modules are determined by their character. It would be interesting to know if there is a natural $\dot U$-module map between St ⊗ ∇(λ) and St ⊗ Fr∗ c∗ (∇(λ)).

References

[A03] Andersen, H.H.: The strong linkage principle for quantum groups at roots of 1. J. Alg. 260, 2–15 (2003)
[A92] Andersen, H.H.: Tensor products of quantized tilting modules. Commun. Math. Phys. 159, 149–159 (1992)
[AP] Andersen, H.H., Paradowski, J.: Fusion categories arising from semisimple Lie algebras. Commun. Math. Phys. 169, 563–588 (1995)
[APW] Andersen, H., Polo, P., Wen, K.: Representations of quantum algebras. Invent. Math. 104, 1–53 (1991)
[APW92] Andersen, H., Polo, P., Wen, K.: Injective modules for quantum groups. Amer. J. Math. 114, 571–604 (1992)
[BZ] Berenstein, A., Zelevinsky, A.: Tensor product multiplicities, canonical bases and totally positive varieties. Invent. Math. 143, 77–128 (2001)
[FH] Frenkel, E., Hernandez, D.: Langlands duality for representations of quantum groups. http://arXiv.org/abs/0809.4453v3[math.QA], 2008
[FH2] Frenkel, E., Hernandez, D.: Langlands duality for finite-dimensional representations of quantum affine algebras. http://arXiv.org/abs/09.2.0447v2[math.QA], 2009
[FM] Frenkel, E., Mukhin, E.: The q-characters at a root of unity. Adv. Math. 171(1), 139–167 (2002)
[Kac] Kac, V.: Infinite dimensional Lie algebras. 3rd ed., Cambridge: Cambridge University Press, 1990
[Ka98] Kaneda, M.: Based modules and good filtrations in algebraic groups. Hiroshima Math. J. 28, 337–344 (1998)
[Ka] Kaneda, M.: Cohomology of infinitesimal quantum algebras. J. Alg. 226, 250–282 (2000)
[K96] Kashiwara, M.: Similarity of crystal bases. Cont. Math. 194, 177–186 (1996)
[K02] Kashiwara, M.: Bases cristallines des groupes quantiques, Edited by Charles Cochet. Cours Spécialisés, 9. Paris: Société Mathématique de France, 2002
[KL02] Kumar, S., Littelmann, P.: Algebraization of Frobenius splitting via quantum groups. Ann. of Math. (2) 155(2), 491–551 (2002)
[KS] Kumar, S., Stembridge, J.: Special Isogenies and Tensor Product Multiplicities. Int. Math. Res. Not. 2007, Article ID rnm081, 13 pp.
[Li95] Littelmann, P.: Path and root operators in representation theory. Ann. of Math. (2) 142, 499–525 (1995)
[Li] Littelmann, P.: Contracting modules and standard monomial theory for symmetrizable Kac-Moody algebras. J. Amer. Math. Soc. 11(3), 551–567 (1998)
[L90] Lusztig, G.: Quantum groups at roots of 1. Geom. Dedicata 35, 89–114 (1990)
[L93] Lusztig, G.: Introduction to Quantum Groups. Birkhäuser, Boston, 1993
[M] McGerty, K.: Generalized q-Schur algebras and quantum Frobenius. Adv. Math. 214(1), 116–131 (2007)
[St] Steinberg, R.: The isomorphism and isogeny theorems for reductive algebraic groups. J. Alg. 216, 366–383 (1999)
Communicated by Y. Kawahigashi
Commun. Math. Phys. 296, 111–143 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-0996-9
Communications in
Mathematical Physics
Comments on Hastings’ Additivity Counterexamples

Motohisa Fukuda¹, Christopher King², David K. Moser³

¹ Department of Mathematics, University of California, Davis, CA, USA
² Department of Mathematics, Northeastern University, Boston, MA 02115, USA. E-mail: [email protected]
³ Department of Physics, Northeastern University, Boston, MA 02115, USA
Received: 22 May 2009 / Accepted: 11 November 2009 Published online: 4 February 2010 – © Springer-Verlag 2010
Abstract: Hastings [12] recently provided a proof of the existence of channels which violate the additivity conjecture for minimal output entropy. In this paper we present an expanded version of Hastings’ proof. In addition to a careful elucidation of the details of the proof, we also present bounds for the minimal dimensions needed to obtain a counterexample.

Contents

1. The Additivity Conjectures
2. Notation and Statement of Results
   2.1 Notation
   2.2 The main result
3. Background on Random States and Channels
   3.1 Probability distributions for states
   3.2 Probability distributions on the simplex $\Delta_d$
   3.3 Estimates for $\mu_{d,n}$
   3.4 Probability distribution for random unitary channels
4. Proof of Theorem 2
   4.1 Definition of the typical channel
   4.2 Definition of the low-entropy events E
   4.3 The upper bound for Prob(E)
   4.4 The lower bound for Prob(E)
   4.5 Combining the bounds for Prob(E) and finishing the proof
   4.6 Optimizing the bounds for Prob(E) and the proof of Proposition 3
5. Proofs of Lemmas
   5.1 Proof of Lemma 1
   5.2 Proof of Lemma 4
   5.3 Proof of Lemma 7
   5.4 Proof of Lemma 9
   5.5 Proof of Lemma 11
   5.6 Proof of Lemma 12
   5.7 Proof of Lemma 13
6. Discussion
A. Derivation of Bound for Z(n, d)
B. Proof of Proposition 14
References
1. The Additivity Conjectures

The classical capacity of a quantum channel is the maximum rate at which classical information can be reliably transmitted through the channel. This maximum rate is approached asymptotically with multiple channel uses by encoding the classical information in quantum states which can be reliably distinguished by measurements at the output. In general, in order to achieve optimal performance, it is necessary to use measurements which are entangled across the multiple channel outputs. However, it was conjectured that product input states are sufficient to achieve the maximal rate of transmission, in other words that there is no benefit in using entangled states to encode the classical information. This conjecture is closely related to other additivity conjectures of quantum information theory, as will be explained below. Recently Hastings [12] disproved all of these additivity conjectures by proving the existence of channels which violate the additivity of minimal output entropy. The purpose of this paper is to present in detail the findings of Hastings’ paper, and also to find bounds for the minimal dimensions needed for this type of counterexample.

We begin by formulating the various additivity conjectures. The Holevo capacity of a quantum channel $\Phi$ is defined by
\[
\chi^*(\Phi) = \sup_{\{p_i,\rho_i\}} \Big[ S\Big(\Phi\Big(\sum_i p_i \rho_i\Big)\Big) - \sum_i p_i\, S(\Phi(\rho_i)) \Big], \tag{1.1}
\]
where the supremum runs over ensembles of input states, and where $S(\rho)$ denotes the von Neumann entropy of the state $\rho$ (here and throughout the paper log denotes the natural logarithm):
\[
S(\rho) = -\mathrm{Tr}\,\rho \log \rho. \tag{1.2}
\]
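As a small numerical illustration of (1.2), the entropy of a state can be evaluated from its eigenvalues. The sketch below assumes the spectrum is already known (it does not diagonalize a matrix), and the function name is hypothetical, chosen only for this example.

```python
import math

def von_neumann_entropy(eigenvalues):
    """Return S(rho) = -Tr(rho log rho) from the eigenvalues of rho (natural log)."""
    if min(eigenvalues) < 0.0 or abs(sum(eigenvalues) - 1.0) > 1e-9:
        raise ValueError("a state has non-negative eigenvalues summing to 1")
    # Convention 0 * log 0 = 0: zero eigenvalues contribute nothing.
    return -sum(p * math.log(p) for p in eigenvalues if p > 0.0)

pure_entropy = von_neumann_entropy([1.0, 0.0, 0.0])   # rank-one (pure) state
mixed_entropy = von_neumann_entropy([0.25] * 4)        # maximally mixed state on C^4
print(pure_entropy, mixed_entropy, math.log(4))
```

A pure state has zero entropy, while the maximally mixed state in dimension 4 attains the maximal value log 4.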
The classical information capacity $C(\Phi)$ is known [15,25] to equal the following limit:
\[
C(\Phi) = \lim_{n\to\infty} \frac{1}{n}\, \chi^*(\Phi^{\otimes n}). \tag{1.3}
\]
It has been a longstanding conjecture that the classical information capacity is in fact equal to the Holevo capacity:

Conjecture 1
\[
C(\Phi) = \chi^*(\Phi). \tag{1.4}
\]
Conjecture 1 would be implied by additivity of $\chi^*$ over tensor products. This led to the following conjecture: for all channels $\Phi$ and $\Omega$,

Conjecture 2
\[
\chi^*(\Phi \otimes \Omega) = \chi^*(\Phi) + \chi^*(\Omega). \tag{1.5}
\]
Subsequently a third conjecture appeared, namely the additivity of minimum output entropy:

Conjecture 3
\[
S_{\min}(\Phi \otimes \Omega) = S_{\min}(\Phi) + S_{\min}(\Omega), \tag{1.6}
\]
where $S_{\min}$ is defined by
\[
S_{\min}(\Phi) = \inf_{\rho} S(\Phi(\rho)). \tag{1.7}
\]
Finally Amosov, Holevo and Werner [3] proposed a generalization of Conjecture 3 with von Neumann entropy replaced by the Renyi entropy: for all $p \ge 1$,

Conjecture 4
\[
S_{p,\min}(\Phi \otimes \Omega) = S_{p,\min}(\Phi) + S_{p,\min}(\Omega), \tag{1.8}
\]
where $S_{p,\min}$ is the minimal Renyi entropy defined for $p \neq 1$ by
\[
S_{p,\min}(\Phi) = \inf_{\rho} S_p(\Phi(\rho)), \qquad S_p(\tau) = \frac{1}{1-p} \log \mathrm{Tr}\, \tau^p. \tag{1.9}
\]
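The Renyi entropy $S_p$ in (1.9) can likewise be computed from a spectrum. The following sketch, with hypothetical function names and an illustrative spectrum that is not taken from the paper, checks numerically that $S_p$ approaches the von Neumann entropy as $p \to 1$ and that $S_2$ does not exceed it.

```python
import math

def renyi_entropy(eigenvalues, p):
    """S_p(tau) = log(Tr tau^p) / (1 - p) for p != 1, from the eigenvalues of tau."""
    if p == 1:
        raise ValueError("p = 1 is the von Neumann limit; use von_neumann_entropy")
    return math.log(sum(x ** p for x in eigenvalues if x > 0.0)) / (1.0 - p)

def von_neumann_entropy(eigenvalues):
    """S(tau) = -Tr(tau log tau), with the convention 0 * log 0 = 0."""
    return -sum(x * math.log(x) for x in eigenvalues if x > 0.0)

spectrum = [0.5, 0.3, 0.2]          # illustrative spectrum, not taken from the paper
s_vn = von_neumann_entropy(spectrum)
s_near_one = renyi_entropy(spectrum, 1.0 + 1e-6)   # p just above 1
s_two = renyi_entropy(spectrum, 2.0)
print(s_vn, s_near_one, s_two)
```

For the maximally mixed state every $S_p$ equals $\log d$, so the differences between the entropies only show up for non-uniform spectra.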
In 2004 Shor [27] proved the equivalence of several additivity conjectures, including Conjectures 2 and 3 above. In subsequent work [11] it was shown that Conjectures 1 and 2 are equivalent. The conjectures have been proved in several special cases [1,2,6,9,17–20], but recently most progress has been made in the search for counterexamples. This started with the Holevo-Werner channel [28], which provided a counterexample to Conjecture 4 with $p > 4.79$; more recently Winter and Hayden found counterexamples to Conjecture 4 for all $p > 1$ [14], and violations have since been found also for $p = 0$ and $p$ close to zero [8]. Finally in 2008, Hastings [12] produced a family of channels which violate Conjecture 3, namely additivity of minimal output von Neumann entropy, thereby also proving (via [27] and [11]) that Conjectures 1 and 2 are false.

Following Winter’s idea, the product channels used by Hastings have the form $\Phi \otimes \bar\Phi$, where $\Phi$ is a special channel which we call a random unitary channel. This means that there are positive numbers $w_1, \dots, w_d$ with $\sum_i w_i = 1$ and unitary $n \times n$ matrices $U_1, \dots, U_d$ such that
\[
\Phi(\rho) = \sum_{i=1}^{d} w_i\, U_i \rho\, U_i^*, \qquad \bar\Phi(\rho) = \sum_{i=1}^{d} w_i\, \bar U_i \rho\, \bar U_i^*. \tag{1.10}
\]
These channels are chosen randomly using a distribution that depends on the two integers $n$ and $d$, where $n$ is the dimension of the input space and $d$ is the dimension of the environment. Hastings’ main result is that for $n$ and $d$ sufficiently large there are random unitary channels which violate Conjecture 3, that is
\[
S_{\min}(\Phi \otimes \bar\Phi) < S_{\min}(\Phi) + S_{\min}(\bar\Phi). \tag{1.11}
\]
This result also allows a direct construction of channels which violate Conjectures 1 and 2, as we now show. Using results from the paper [11], the inequality (1.11) implies that the additivity of minimal output entropy does not hold for the product $\tilde\Phi \otimes \tilde\Phi$, where $\tilde\Phi = \Phi \oplus \bar\Phi$. In addition, as shown in the paper [10], there is a unital extension of $\tilde\Phi$, denoted $\tilde\Phi'$, such that the additivity of minimal output entropy does not hold for $\tilde\Phi' \otimes \tilde\Phi'$, and such that
\[
S_{\min}(\tilde\Phi' \otimes \tilde\Phi') = 2 \log D - \chi^*(\tilde\Phi' \otimes \tilde\Phi'), \tag{1.12}
\]
where $D$ is the dimension of the output space for $\tilde\Phi'$. Thus $\tilde\Phi'$ provides a counterexample for Conjecture 2, and
\[
\lim_{k\to\infty} \frac{1}{2k}\, \chi^*\big((\tilde\Phi')^{\otimes 2k}\big) > \chi^*(\tilde\Phi'). \tag{1.13}
\]
Therefore, the classical capacity of $\tilde\Phi'$ does not equal its Holevo capacity, and this provides a counterexample for Conjecture 1.

One key ingredient in the proof is the relative sizes of dimensions, namely $n \gg d \gg 1$, where $n$ is the dimension of the input space, and $d$ is the dimension of the environment. Recall that in the Stinespring representation a channel is viewed as a partial isometry from the input space $H_{\mathrm{in}}$ to the product of output and environment spaces $H_{\mathrm{out}} \otimes H_{\mathrm{env}}$, followed by a partial trace over the environment. The image of $H_{\mathrm{in}}$ under the partial isometry is a subspace of dimension $n$ sitting in the product $H_{\mathrm{out}} \otimes H_{\mathrm{env}}$. Making the environment dimension $d$ much smaller than the input dimension $n$ should guarantee that with high probability this subspace will consist of almost maximally entangled states. For such states the output entropy will be close to the maximal possible value $\log d$, and therefore the minimal entropy of the channel should also (hopefully) be close to $\log d$. At the same time the product channel $\Phi \otimes \bar\Phi$ sends the maximally entangled state into an output with one relatively large eigenvalue, and thus one might hope to find a gap between $S_{\min}(\Phi \otimes \bar\Phi)$ and $S_{\min}(\Phi) + S_{\min}(\bar\Phi)$. Turning this vague notion into a proof requires considerable insight and ingenuity.

In this paper we focus on the technical aspects of Hastings’ proof. Some of the estimates and inequalities derived in this paper are new, but all the main ideas and methods are taken from [12]. The paper is organized as follows. In Sect. 2 we define notation and make a precise statement of Hastings’ results. In Sect. 3 we present some background material on probability distributions for states and channels. In Sect. 4 we ‘walk through’ the proof of Hastings’ Theorem, stating results where needed and delineating the logic of the argument. In Sect. 5 we give the proofs of various results needed in Sect. 4 and elsewhere. Sect. 6 discusses different aspects of the proof and possible directions for further research. The Appendix contains the derivation of some estimates needed for the proof.
2. Notation and Statement of Results

We will mostly avoid Dirac bra and ket notation, although it will be used in Sects. 5.1 and 5.5.

2.1. Notation. Let $M_n$ denote the algebra of complex $n \times n$ matrices. The identity matrix will be denoted $I_n$, or just $I$. The set of states in $M_n$ is defined as
\[
S_n = \{\rho \in M_n : \rho = \rho^* \ge 0,\ \mathrm{Tr}\,\rho = 1\}. \tag{2.1}
\]
The set of unit vectors in $\mathbb{C}^n$ will be denoted
\[
V_n = \Big\{ z = (z_1, \dots, z_n)^T \in \mathbb{C}^n : z^* z = \sum_{i=1}^{n} |z_i|^2 = 1 \Big\}. \tag{2.2}
\]
Every unit vector $z \in V_n$ defines a pure state $\rho = zz^*$ satisfying $\rho^2 = \rho$. The set of unit vectors $V_n$ is identified with the real $(2n-1)$-dimensional sphere $S^{2n-1}$, and hence carries a unique uniform probability measure which we denote $\sigma_n$. The set of unitary matrices in $M_n$ is denoted
\[
U(n) = \{U \in M_n : UU^* = I\}. \tag{2.3}
\]
We will write $H_n$ for the normalized Haar measure on $U(n)$. A channel is a completely positive trace-preserving map $\Phi : M_n \to M_m$. Recall the definition of random unitary channel (1.10):
\[
\Phi(\rho) = \sum_{i=1}^{d} w_i\, U_i \rho\, U_i^*. \tag{2.4}
\]
The set of all random unitary channels on $M_n$ with $d$ summands will be denoted $R_d(n)$. Given a channel $\Phi \in R_d(n)$ the complementary or conjugate channel $\Phi^C : M_n \to M_d$ is defined by [16,22]
\[
\Phi^C(\rho) = \sum_{i,j=1}^{d} \sqrt{w_i w_j}\; \mathrm{Tr}(\rho\, U_j^* U_i)\, |i\rangle\langle j|. \tag{2.5}
\]
As is well-known, for any input state $\rho$ the output states $\Phi(\rho)$ and $\Phi^C(\rho)$ are related by
\[
\Phi(\rho) = \mathrm{Tr}_2\, W \rho W^*, \qquad \Phi^C(\rho) = \mathrm{Tr}_1\, W \rho W^*. \tag{2.6}
\]
Here, $W : \mathbb{C}^n \to \mathbb{C}^{nd}$ is a partial isometry. Also $\mathrm{Tr}_2$ denotes the partial trace over the state space of the environment, and $\mathrm{Tr}_1$ denotes the partial trace over the state space of the system. When $\rho = zz^*$ is a pure state, the matrices $\Phi(zz^*)$ and $\Phi^C(zz^*)$ are partial traces of the same pure state, and thus have the same non-zero spectrum and the same entropy. Therefore $S_{\min}(\Phi) = S_{\min}(\Phi^C)$. For the purposes of constructing the counterexample it is convenient to work with both $\Phi$ and $\Phi^C$. In particular, we are interested in the cases where $W$ consists of rescaled unitary block matrices:
\[
W = \begin{pmatrix} \sqrt{w_1}\, U_1 \\ \vdots \\ \sqrt{w_d}\, U_d \end{pmatrix}. \tag{2.7}
\]
Note that $\sum_i w_i = 1$ as $W$ is a partial isometry. We define a measure on this subset of partial isometries, in Sect. 3.4, as the product of Haar measures and a particular measure on the simplex. The complex conjugate channel is defined by
\[
\bar\Phi(\rho) = \sum_{i=1}^{d} w_i\, \bar U_i \rho\, \bar U_i^* = \sum_{i=1}^{d} w_i\, \bar U_i \rho\, U_i^T. \tag{2.8}
\]
Again note that $\Phi$ and $\bar\Phi$ have identical minimum output entropies.
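The relation between a random unitary channel (2.4) and its complementary channel (2.5) can be checked on a toy case. The sketch below is only an illustration under stated assumptions: it takes $n = d = 2$ with $U_1 = I$ and $U_2$ the Pauli X matrix (a choice made here, not in the paper), uses hypothetical helper names, and compares traces and purities rather than computing eigenvalues.

```python
import math

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def channel(weights, unitaries, rho):
    """Phi(rho) = sum_i w_i U_i rho U_i^*, as in (2.4)."""
    n = len(rho)
    out = [[0j] * n for _ in range(n)]
    for w, u in zip(weights, unitaries):
        term = matmul(matmul(u, rho), dagger(u))
        for r in range(n):
            for c in range(n):
                out[r][c] += w * term[r][c]
    return out

def complementary_channel(weights, unitaries, rho):
    """Phi^C(rho)_{ij} = sqrt(w_i w_j) Tr(rho U_j^* U_i), as in (2.5)."""
    d = len(weights)
    return [[math.sqrt(weights[i] * weights[j])
             * trace(matmul(rho, matmul(dagger(unitaries[j]), unitaries[i])))
             for j in range(d)] for i in range(d)]

def purity(m):
    """Tr(m^2), real for Hermitian m."""
    return trace(matmul(m, m)).real

# Illustrative choice (not from the paper): n = d = 2, U_1 = I, U_2 = Pauli X.
weights = [0.7, 0.3]
unitaries = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]
z = [0.6, 0.48 + 0.64j]                                         # unit vector in C^2
rho = [[zi * complex(zj).conjugate() for zj in z] for zi in z]  # pure state z z^*

out = channel(weights, unitaries, rho)
comp = complementary_channel(weights, unitaries, rho)
print(trace(out).real, trace(comp).real, purity(out), purity(comp))
```

For 2 × 2 Hermitian matrices, equal trace and equal purity force equal eigenvalues, so in this toy case the check is equivalent to the statement that $\Phi(zz^*)$ and $\Phi^C(zz^*)$ share the same spectrum.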
2.2. The main result. Following the work of Winter and Hayden [14], the counterexample is taken to be a product channel of the form $\Phi \otimes \bar\Phi$, where $\Phi$ is a random unitary channel. Hastings first proves the following universal upper bound for the minimum output entropy of such a product.

Lemma 1. For any $\Phi \in R_d(n)$,
\[
S_{\min}(\Phi \otimes \bar\Phi) \le 2 \log d - \frac{\log d}{d}. \tag{2.9}
\]
Lemma 1 will be proved in Sect. 5.1. The counterexample is found by proving the existence of a random unitary channel whose minimum output entropy is greater than one half of this upper bound, that is greater than $\log d - (\log d)/(2d)$. For such a channel it will follow that
\begin{align}
S_{\min}(\Phi \otimes \bar\Phi) &\le 2 \log d - \frac{\log d}{d} \tag{2.10} \\
&< 2\, S_{\min}(\Phi) \tag{2.11} \\
&= S_{\min}(\Phi) + S_{\min}(\bar\Phi) \tag{2.12}
\end{align}
and this will provide the counterexample to Conjecture 3. Hastings [12] proved the existence of such channels using a combination of probabilistic arguments and estimates involving the distribution of the reduced density matrix of a random pure state. The next theorem is a precise statement of Hastings’ result.

Theorem 2. There is $h_{\min} < \infty$, such that for all $h > h_{\min}$, all $d$ satisfying $d \log d \ge h$, and all $n$ sufficiently large, there is $\Phi \in R_d(n)$ satisfying
\[
S_{\min}(\Phi) > \log d - \frac{h}{d}. \tag{2.13}
\]
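Combining the upper bound (2.9) with the lower bound (2.13), and using that $\Phi$ and $\bar\Phi$ have the same minimum output entropy, a channel satisfying (2.13) violates additivity by more than $2(\log d - h/d) - (2\log d - (\log d)/d) = (\log d - 2h)/d$. The sketch below simply evaluates this arithmetic; the function names are hypothetical and the values of $h$ are illustrative assumptions, since the text does not give $h_{\min}$ at this point.

```python
import math

def lemma1_upper_bound(d):
    """Right-hand side of (2.9): 2 log d - (log d)/d."""
    return 2.0 * math.log(d) - math.log(d) / d

def theorem2_lower_bound(d, h):
    """Right-hand side of (2.13): log d - h/d."""
    return math.log(d) - h / d

def guaranteed_violation(d, h):
    """Lower bound on S_min(Phi) + S_min(conj Phi) - S_min(Phi x conj Phi)."""
    return 2.0 * theorem2_lower_bound(d, h) - lemma1_upper_bound(d)

# h = 1.0 and h = 3.0 are hypothetical values, used only to show the sign change.
gap_small_h = guaranteed_violation(100, 1.0)
gap_large_h = guaranteed_violation(100, 3.0)
print(gap_small_h, gap_large_h)
```

The bound is positive exactly when $\log d > 2h$; a negative value only means that these two estimates together do not guarantee a violation for that pair $(d, h)$.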
By taking $d$ large enough so that $2h_{\min} < \log d$, we deduce from Theorem 2 that there is a channel $\Phi$ satisfying
\[
S_{\min}(\Phi) > \log d - \frac{\log d}{2d}, \tag{2.14}
\]
and this establishes the existence of counterexamples for Conjecture 3. In fact the proof will show that as $d, n \to \infty$, the probability that a randomly chosen channel in $R_d(n)$ will satisfy the bound (2.13) approaches one.

It would be interesting to determine the set of integers $(n, d)$ for which there are random unitary channels in $R_d(n)$ violating additivity, and in particular to find the smallest dimensions which allow violations, as well as the size of the largest possible violation. Following this line of reasoning we define
\begin{align}
d_{\min} &= \inf\Big\{ d : \exists\, n,\ \exists\, \Phi \in R_d(n) \ \text{s.t.}\ S_{\min}(\Phi) > \log d - \frac{\log d}{2d} \Big\}, \notag\\
n_{\min} &= \inf\Big\{ n : \exists\, d,\ \exists\, \Phi \in R_d(n) \ \text{s.t.}\ S_{\min}(\Phi) > \log d - \frac{\log d}{2d} \Big\}, \tag{2.15}\\
\Delta S_{\max} &= \sup_{n,d}\ \sup_{\Phi \in R_d(n)} \big( S_{\min}(\Phi) + S_{\min}(\bar\Phi) - S_{\min}(\Phi \otimes \bar\Phi) \big). \notag
\end{align}
The next result gives some bounds on these quantities.
Proposition 3. $d_{\min} < 3.9 \times 10^{4}$, $n_{\min} < 7.8 \times 10^{32}$, $\Delta S_{\max} > 9.5 \times 10^{-6}$.

Proposition 3 will be proved in Sect. 4.6. The bounds in Proposition 3 are surely not optimal; however, they may indicate the delicacy of the non-additivity effect for this class of channels. It would certainly be interesting to tune the estimates in this paper in order to improve the bounds in Proposition 3, or even better to find a different class of channels where the effect is larger.

3. Background on Random States and Channels

As mentioned above, the proof of Theorem 2 relies on probabilistic arguments, involving distributions of pure states and random unitary channels. The next sections explain the distributions which play a role in the proof.

3.1. Probability distributions for states. Recall that $V_n$ is the set of unit vectors in $\mathbb{C}^n$. This set carries a natural uniform measure $\sigma_n$, namely the uniform measure on the (real) $(2n-1)$-dimensional sphere. If $\mathbb{C}^{dn} = \mathbb{C}^d \otimes \mathbb{C}^n$ is a product space, then a unit vector $z \in V_{dn}$ can be written as an $n \times d$ matrix $M$, with entries
\[
M_{ij}(z) = z_{(i-1)d+j}, \qquad i = 1, \dots, n,\quad j = 1, \dots, d, \tag{3.1}
\]
satisfying $\mathrm{Tr}\, M^* M = \sum_{ij} |z_{ij}|^2 = 1$ (writing $z_{ij}$ for $M_{ij}(z)$). Define the map $G : V_{dn} \to M_d$ by
\[
G(z) = M(z)^* M(z). \tag{3.2}
\]
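The construction (3.1)–(3.2) is easy to simulate. The sketch below assumes the standard fact that normalizing a vector of IID complex Gaussians yields a uniformly distributed point on the unit sphere; the helper names, the seed and the sizes $n = 200$, $d = 3$ are illustrative choices, not values from the paper.

```python
import math
import random

def random_unit_vector(dim, rng):
    """Normalised IID complex Gaussians: one standard way to sample sigma_dim."""
    comps = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(dim)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in comps))
    return [c / norm for c in comps]

def reduced_state(z, n, d):
    """G(z) = M(z)^* M(z) with M_ij = z_{(i-1)d+j} (0-based: z[i*d + j])."""
    m = [[z[i * d + j] for j in range(d)] for i in range(n)]
    return [[sum(m[i][j].conjugate() * m[i][k] for i in range(n))
             for k in range(d)] for j in range(d)]

rng = random.Random(0)          # fixed seed, for reproducibility only
n, d = 200, 3                   # illustrative sizes with n much larger than d
g = reduced_state(random_unit_vector(n * d, rng), n, d)
g_trace = sum(g[j][j].real for j in range(d))
print(g_trace, [round(g[j][j].real, 3) for j in range(d)])
```

The resulting $d \times d$ matrix is Hermitian with unit trace, and for $n$ much larger than $d$ its diagonal entries cluster around $1/d$.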
It follows that $G(z) \ge 0$ and $\mathrm{Tr}\, G(z) = 1$, and hence the image of $G$ lies in $S_d$ (the set of $d$-dimensional states). Since $z$ is a random vector (with distribution $\sigma_{dn}$) it follows that $G(z)$ is an $S_d$-valued random variable, or more simply a random state. Its distribution has been studied in many other contexts (see for example [13]) and it plays a key role in the proof here.

3.2. Probability distributions on the simplex $\Delta_d$. Let $\Delta_d$ denote the simplex of $d$-dimensional probability distributions:
\[
\Delta_d = \Big\{ (x_1, \dots, x_d) \in \mathbb{R}^d : x_i \ge 0,\ \sum_{i=1}^{d} x_i = 1 \Big\}. \tag{3.3}
\]
We define below three different probability distributions on $\Delta_d$. One is the uniform measure inherited from $\mathbb{R}^d$, and the others are defined by the diagonal entries and the eigenvalues of $G(z)$, where $z$ is a random unit vector in $V_{dn}$.

Uniform distribution. The simplex $\Delta_d$ carries a natural measure inherited from Lebesgue measure on $\mathbb{R}^d$: this is conveniently written as
\[
\delta\Big(\sum_{i=1}^{d} w_i - 1\Big)\, dw_1 \cdots dw_d = \delta\Big(\sum_{i=1}^{d} w_i - 1\Big)\, [dw], \tag{3.4}
\]
where $\delta(\cdot)$ is the Dirac $\delta$-function. Integrals with respect to this measure can be evaluated by introducing local coordinates on $\mathbb{R}^d$ in a neighborhood of $\Delta_d$. In particular the volume of $\Delta_d$ with respect to the measure (3.4) can be computed:
\[
\int_{\Delta_d} \delta\Big(\sum_{i=1}^{d} w_i - 1\Big)\, [dw] = \frac{1}{(d-1)!}. \tag{3.5}
\]
Diagonal distribution $\nu_{d,n}$. Let $z \in V_{dn}$ be a random unit vector in $\mathbb{C}^n \otimes \mathbb{C}^d$. The joint distribution of the diagonal entries $(G_{11}(z), \dots, G_{dd}(z))$ will be denoted $\nu_{d,n}$. It is possible to find an explicit formula for the density of $\nu_{d,n}$; however, we will not need it in this paper. It is sufficient to note that a collection of $d$ random variables $Y_1, \dots, Y_d$ have the joint distribution $\nu_{d,n}$ if and only if they can be written as
\[
Y_j = \sum_{i=1}^{n} |z_{ij}|^2, \qquad j = 1, \dots, d, \tag{3.6}
\]
where $\{z_{ij}\}$ are the components of a uniform random vector on the unit sphere in $\mathbb{C}^n \otimes \mathbb{C}^d$. We come back to this problem in Sect. 5.3.

Eigenvalue distribution $\mu_{d,n}$. As noted above the eigenvalues of $G(z)$ are non-negative and sum to one.¹ However the eigenvalues are not ordered and so define a map not into $\Delta_d$ but rather into the quotient $\Delta_d / \Sigma_d$, where $\Sigma_d$ is the symmetric group. Thus when $z \in V_{dn}$ is a random vector the eigenvalues of $G(z)$ are $\Delta_d / \Sigma_d$-valued random variables. However it is convenient to use a joint density for the eigenvalue distribution on $\Delta_d$, with the understanding that it should be evaluated only on events which are invariant under $\Sigma_d$. This density is known explicitly [21,29]: for any event $A \subset \Delta_d$,
\[
\mu_{d,n}(A) = Z(n,d)^{-1} \int_A \prod_{1 \le i < j \le d} (w_i - w_j)^2\, \prod_{i=1}^{d} w_i^{\,n-d}\; \delta\Big(\sum_{i=1}^{d} w_i - 1\Big)\, [dw], \tag{3.7}
\]
where $Z(n, d)$ is a normalization factor. The distribution $\mu_{d,n}$ plays an essential role in the proof of Theorem 2. Explicit expressions for $Z(n, d)$ are known [29]. In Appendix A we derive the following bound: for $n$ sufficiently large,
\[
Z(n,d)^{-1} \le n^{d^2}\, d^{\,d(n-d)}. \tag{3.8}
\]
3.3. Estimates for $\mu_{d,n}$. Define the function
\[
F(x) = -\log x + x - 1. \tag{3.9}
\]
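One way to see why $F$ is a natural function here (offered as a plausible reading of how (3.7) and (3.8) combine, not as the argument of Sect. 5.2) is that on the simplex $\sum_i (d w_i - 1) = 0$, so $\sum_i \log(d w_i) = -\sum_i F(d w_i)$. The sketch below checks this identity and the non-negativity of $F$ numerically; the sample point is an illustrative assumption.

```python
import math

def F(x):
    """F(x) = -log x + x - 1, as in (3.9)."""
    return -math.log(x) + x - 1.0

weights = [0.5, 0.3, 0.2]       # an illustrative point of the simplex, d = 3
d = len(weights)
sum_log = sum(math.log(d * w) for w in weights)   # log of prod_i (d * w_i)
sum_F = sum(F(d * w) for w in weights)
min_F_on_grid = min(F(0.01 * k) for k in range(1, 1001))   # x in (0, 10]
print(sum_log, -sum_F, min_F_on_grid)
```

Under this reading, the product $\prod_i (d w_i)^{n-d}$ equals $\exp\{-(n-d)\sum_i F(d w_i)\}$ on the simplex, which matches the shape of the exponent appearing in the estimates for $\mu_{d,n}$.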
Lemma 4. For all $d$, for $n$ sufficiently large, and for any event $A \subset \Delta_d$,
\[
\mu_{d,n}(A) \le \int_A \exp\Big\{ d^2 \log n - (n-d) \sum_{i=1}^{d} F(d w_i) \Big\}\; \delta\Big(\sum_{i=1}^{d} w_i - 1\Big)\, [dw]. \tag{3.10}
\]

¹ $G(z)$ gives the complex Wishart matrix when $z \in \mathbb{C}^n \otimes \mathbb{C}^d$ with each entry $z_{ij}$ having IID complex normal distribution. The eigenvalue distribution was shown to be proportional to $\prod_{1 \le i < j \le d} (w_i - w_j)^2 \prod_{i=1}^{d} w_i^{\,n-d}\, [dw]$, for example, in [5].
This lemma will be proved in Sect. 5.2. Using (3.5) we immediately get the following bound.

Corollary 5. For all $d$, for $n$ sufficiently large, and for any event $A \subset \Delta_d$,
\[
\mu_{d,n}(A) \le \exp\Big\{ d^2 \log n - \log (d-1)! - (n-d) \inf_{w \in A} \sum_{i=1}^{d} F(d w_i) \Big\}. \tag{3.11}
\]
Note that $F(x)$ is convex, and also $F(1) = F'(1) = 0$. Writing $w = 1/d + \delta w$, the Taylor expansion around 1 gives
\[
F(1 + d\,\delta w) = \frac{1}{2}\, d^2 (\delta w)^2 + R, \tag{3.12}
\]
where the remainder is
\[
R = -\frac{1}{3}\, (1 + d\delta)^{-3} (d\,\delta w)^3, \tag{3.13}
\]
and $\delta$ is some value between 0 and $\delta w$. Note that $-1/d < \delta w < (d-1)/d$ as $0 \le w \le 1$. Also, $R > 0$ if $\delta w < 0$. When $\delta w > 0$,
\[
0 > R > -\frac{1}{3}\, d^3 (\delta w)^3. \tag{3.14}
\]
Recall that $F(x) \ge 0$, so we have the bound $F(d w_i) \ge 0$ for all $i$. Thus feeding (3.12) into Corollary 5 gives the following estimate, which will be used in Sect. 5.4.

Corollary 6. For all $d$, for $n$ sufficiently large, and for any $i = 1, \dots, d$,
\[
\mu_{d,n}\Big( \Big\{ w : \Big| w_i - \frac{1}{d} \Big| \ge t \Big\} \Big) \le \exp\Big\{ d^2 \log n - \log (d-1)! - \frac{n-d}{2}\, d^2 t^2 + \frac{n-d}{3}\, d^3 t^3 \Big\}. \tag{3.15}
\]

3.4. Probability distribution for random unitary channels. A random unitary channel (1.10) is determined by the coefficients $w_i$ and the unitary matrices $U_i$. Thus the set of random unitary channels $R_d(n)$ is naturally identified with $\Delta_d \times U(n)^d$. Recall the distribution $\nu_{d,n}$ defined in Sect. 3.2 for the diagonal entries of $G(z)$, and the Haar measure $H_n$ defined on $U(n)$. We define the following product probability measure on $R_d(n)$:
\[
P_{d,n} = \nu_{d,n} \times H_n \times \cdots \times H_n, \tag{3.16}
\]
where $H_n \times \cdots \times H_n$ is the $d$-fold product Haar measure on $U(n)^d$. Using the measure $P_{d,n}$ on $R_d(n)$ means that the unitaries $U_i$ are selected randomly and independently, while the coefficients $w_j$ have the joint distribution $\nu_{d,n}$, and thus can be written in the form (3.6), where $\{z_{ij}\}$ ($i = 1, \dots, n$; $j = 1, \dots, d$) are the components of a random unit vector in $V_{nd}$.

Recall the definition (2.5) of the conjugate channel. Define the map
\[
H : R_d(n) \times V_n \to M_d, \qquad (\Phi, z) \mapsto \Phi^C(zz^*). \tag{3.17}
\]
Recall the definition (3.2) of the map $G : V_{dn} \to M_d$. The following relation between the distributions $P_{d,n}$, $\sigma_n$ and $\sigma_{dn}$ is crucial to the proof. Given a measurable map $f : X \to Y$ between measure spaces $(X, \mathcal{A})$ and $(Y, \mathcal{B})$ (where $\mathcal{A}$ and $\mathcal{B}$ are $\sigma$-algebras on $X$, $Y$ respectively), and given a measure $\mu$ on $(X, \mathcal{A})$, we define the push-forward measure $f^*(\mu)$ on $(Y, \mathcal{B})$ by $f^*(\mu)(B) = \mu(f^{-1}(B))$ for all $B \in \mathcal{B}$.
120
M. Fukuda, C. King, D. K. Moser
Lemma 7.
\[ H^*(P_{d,n} \times \sigma_n) = G^*(\sigma_{dn}). \tag{3.18} \]
Lemma 7 will be proved in Sect. 5.3. It implies that if $\Phi$ is chosen randomly according to the measure $P_{d,n}$ and $z$ is chosen randomly and uniformly in $V_n$, then the eigenvalues of the matrix $\Phi^C(zz^*)$ will have the distribution $\mu_{d,n}$.

4. Proof of Theorem 2

The main idea of the proof is to isolate some properties of random unitary channels which are typical for large values of $n$ and $d$. These properties will then be used to prove that large minimum output entropy is also typical for random unitary channels when $n$, $d$ are large. Recall that the environment dimension $d$ will be chosen to be much smaller than the input dimension $n$. As the identity (2.6) shows, selecting a channel in $R_d(n)$ corresponds to selecting a subspace of dimension $n$ in the product space $\mathbb{C}^n \otimes \mathbb{C}^d$. The structure of random bipartite subspaces was analyzed in the paper [13], and it was shown that in some circumstances most states in a randomly selected subspace will be close to maximally entangled. In such a situation the reduced density matrix of a randomly selected output state $\Phi^C(zz^*)$ will be close to the maximally mixed state $I/d$, and hence its entropy will be close to $\log d$. Although this observation plays an essential role in Hastings' proof, the methods used in [13] do not directly yield the bounds needed.

4.1. Definition of the typical channel. A channel $\Phi$ will be called typical if $\Phi^C$ maps at least one half of input states into a small ball centered at the maximally mixed output state $I/d$. The size of the small ball in question involves a numerical parameter $b$ and is defined as follows:
\[ B_d(n) = \Big\{ \rho \in S_d : \Big\| \rho - \frac{1}{d} I \Big\|_\infty \le b \sqrt{\frac{\log n}{n}} \Big\}. \tag{4.1} \]

Definition 8. A random unitary channel $\Phi$ is called typical if with probability at least 1/2 a randomly chosen input state is mapped by $\Phi^C$ into the set $B_d(n)$. The set of typical channels is denoted $T$:
\[ T = \big\{ \Phi : \sigma_n\big( z : \Phi^C(zz^*) \in B_d(n) \big) \ge 1/2 \big\}. \tag{4.2} \]
As the next result shows, for large $n$ most channels are typical.

Lemma 9. For all $b > \sqrt{3}$, $d \ge 2$ and $0 < \alpha < b^2/3 - 1$, taking $n$ sufficiently large,
\[ P_{d,n}(T^c) \le \frac{2d}{(d-1)!} \exp[-\alpha\, d^2 \log n]. \tag{4.3} \]
Thus if $b > \sqrt{3}$, then as $n \to \infty$ with high probability a randomly chosen channel will lie in the set $T$. In particular $P_{d,n}(T^c) < 1$ for $n$ sufficiently large. The number $\alpha$ can be chosen to satisfy
\[ \alpha = \frac{b^2 (n-d)}{3n} - 1. \tag{4.4} \]
The dimension $n$ must be large enough so that the right side of (4.4) is positive, and also so that $n \ge 4 b^2 d^2 \log n$ (this is a technical condition needed in the proof, see Sect. 5.4).

The second property of a typical channel is the existence of a 'tube' of output states surrounding $\Phi^C(zz^*)$ for every input state $z \in V_n$. This property is used to eliminate the possibility of isolated output states with low entropy: if for some $z$ the output entropy $S(\Phi^C(zz^*))$ is small, then there is a nonzero fraction of input states whose outputs also have low entropy. In order to define the tube we first construct a line segment $Y(\rho)$ pointing from a general state $\rho$ toward the maximally mixed state $I/d$. The length of the segment depends on a parameter $\gamma$, which satisfies $0 < \gamma < 1$:
\[ Y(\rho) = \Big\{ r\rho + (1-r)\frac{1}{d} I : \gamma \le r \le 1 \Big\}. \tag{4.5} \]
The tube at $\rho$ is defined to be the set of states which lie within a small distance of the set $Y(\rho)$, and thus form a thickened line segment pointing from $\rho$ toward the maximally mixed state. The definition of 'small' here depends on the size of the ball $B_d(n)$, and also on another numerical parameter $t$.

Definition 10. Let $\rho \in S_d$, then the Tube at $\rho$ is defined as
\[ \mathrm{Tube}(\rho) = \Big\{ \theta \in S_d : \mathrm{dist}(\theta, Y(\rho)) \le t \sqrt{\frac{d \log n}{n}} \Big\}, \qquad \mathrm{dist}(\theta, Y(\rho)) = \inf_{\tau \in Y(\rho)} \|\theta - \tau\|_\infty. \tag{4.6} \]
The next result shows that for a channel $\Phi$ in the typical set $T$, and for any state $\rho = \Phi^C(zz^*)$ in the image of $\Phi^C$, there is a uniform lower bound for the probability that a randomly chosen state belongs to the tube at $\rho$. As explained before, this means that an output state $\Phi^C(zz^*)$ cannot be too isolated from the other output states.

Lemma 11. For all $d \ge 3$ there is $\beta > 0$ such that for $n$ sufficiently large, for all $t \ge b + 4$, and for all $\Phi \in T$ and $\rho \in \mathrm{Im}(\Phi^C)$,
\[ \sigma_n\big( z : \Phi^C(zz^*) \in \mathrm{Tube}(\rho) \big) \ge \beta\, (1-\gamma)^{n-1}. \tag{4.7} \]
Lemma 11 will be proved in Sect. 5.5. The number $\beta$ is given by the following expression:
\[ \beta = \frac{1}{2} - (d^2 + 2) \Big( 1 - \frac{d \log d}{n} \Big)^{n-1}. \tag{4.8} \]
It can be easily seen that for all $d \ge 3$ the right side of (4.8) is positive for $n$ sufficiently large.

4.2. Definition of the low-entropy events E. Define the set of channels whose minimum output entropy does not satisfy our requirements for a violation:
\[ C_{d,n} = \Big\{ \Phi \in R_d(n) : S_{\min}(\Phi) \le \log d - \frac{h}{d} \Big\}. \tag{4.9} \]
The goal is to show that for $d$, $n$ and $h$ sufficiently large we have $P_{d,n}(C_{d,n}) < 1$, implying that $P_{d,n}(C_{d,n}^c) > 0$, and thus that there exist random unitary channels with
$S_{\min}(\Phi) > \log d - h/d$. The proof will hold for all $h$, $d$ sufficiently large, and thus by taking $\log d \ge 2h$ this will provide a counterexample to additivity. The method is to find useful upper and lower bounds for the probability of a particular event $E$ in $R_d(n) \times V_n$. The event $E$ is chosen to contain all the pairs $(\Phi, z)$, where $\Phi^C(zz^*)$ lies in a tube connected to a state of low entropy. This set of tubes is defined by
\[ J = \bigcup_{\rho} \Big\{ \mathrm{Tube}(\rho) : S(\rho) \le \log d - \frac{h}{d} \Big\}. \tag{4.10} \]
Then the main event of interest for us is the following subset of $R_d(n) \times V_n$:
\[ E = \{ (\Phi, z) : \Phi^C(zz^*) \in J \} = H^{-1}(J), \tag{4.11} \]
where $H$ is the map defined in (3.17). The proof will proceed by proving upper and lower bounds for the probability of $E$, that is $(P_{d,n} \times \sigma_n)(E)$. These bounds will hold for any $0 < \gamma < 1$; the parameter $\gamma$ will be 'tuned' at the end in order to derive an estimate for the minimal size $h_{\min}$ needed for the counterexample. As noted, the construction works for any values of the parameters $b$, $t$ satisfying $b > \sqrt{3}$ and $t \ge b + 4$. The sizes of $b$ and $t$ do not play a crucial role, and they can be set to the values $b = 2$ and $t = 6$ without changing anything in the proof.

4.3. The upper bound for Prob(E). Note that by Lemma 7,
\[ (P_{d,n} \times \sigma_n)(E) = (P_{d,n} \times \sigma_n)(H^{-1}(J)) = H^*(P_{d,n} \times \sigma_n)(J) = G^*(\sigma_{dn})(J). \tag{4.12} \]
Let $\rho$ be a fixed state in the set of tubes $J$. Then by definition there is a state $\tau \in S_d$ with low entropy such that $\rho$ lies in the tube at $\tau$. Thus for some $r$ satisfying $\gamma \le r \le 1$,
\[ \Big\| \rho - \Big( r\tau + (1-r)\frac{1}{d} I \Big) \Big\|_\infty \le t \sqrt{\frac{d \log n}{n}}, \qquad S(\tau) \le \log d - \frac{h}{d}. \tag{4.13} \]
Letting $q_i$, $p_i$ denote the eigenvalues of $\rho$, $\tau$ respectively, it follows that
\[ q_i = r p_i + (1-r)\frac{1}{d} + \epsilon_i, \qquad i = 1, \dots, d, \tag{4.14} \]
where $p_i$, $\epsilon_i$ satisfy
\[ -\sum_i p_i \log p_i \le \log d - \frac{h}{d}, \qquad \sum_{i=1}^{d} \epsilon_i = 0. \tag{4.15} \]
Weyl's perturbation theorem [4, Cor. III.2.6] and (4.13) imply that
\[ |\epsilon_i| \le t \sqrt{\frac{d \log n}{n}}. \tag{4.16} \]
The entropy condition (4.15) can be written as
\[ \sum_{i=1}^{d} p_i d \log(p_i d) = \sum_{i=1}^{d} \big( p_i d \log(p_i d) - p_i d + 1 \big) \ge h. \tag{4.17} \]
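The rewriting of the entropy condition in (4.17) can be checked numerically. The following is a minimal sketch, assuming an arbitrary illustrative probability vector `p` (not taken from the paper); the helper names are hypothetical. It confirms that the two sums in (4.17) agree and that both equal $d(\log d - S(p))$, so that $S(p) \le \log d - h/d$ is the same condition as the sum being at least $h$.

```python
import math

def f(x):
    """f(x) = x log x - x + 1, with the continuous extension f(0) = 1."""
    return 1.0 if x == 0 else x * math.log(x) - x + 1

def entropy(p):
    """Shannon entropy -sum p_i log p_i (natural logarithm)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def entropy_deficit(p):
    """First sum in (4.17): sum_i p_i d log(p_i d)."""
    d = len(p)
    return sum(pi * d * math.log(pi * d) for pi in p if pi > 0)

# Illustrative eigenvalue vector (an assumption made for this example only).
p = [0.5, 0.3, 0.2]
d = len(p)

lhs = entropy_deficit(p)                  # sum_i p_i d log(p_i d)
mid = sum(f(pi * d) for pi in p)          # sum_i f(p_i d)
rhs = d * (math.log(d) - entropy(p))      # d (log d - S(p))
print(lhs, mid, rhs)
```

The three printed numbers should coincide up to rounding, because $\sum_i (1 - p_i d) = 0$ for any probability vector.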
Define the function
\[ f(x) = x \log x - x + 1. \tag{4.18} \]

Lemma 12.
\[ \sup_{x \ge 0,\ \gamma \le r \le 1} \frac{f(x)}{f(rx + 1 - r)} = \frac{f(0)}{f(1-\gamma)} = \frac{1}{f(1-\gamma)}. \tag{4.19} \]
Lemma 12 will be proved in Sect. 5.6. Recall (4.14) and define
\[ z_i = q_i - \epsilon_i = r p_i + (1-r)\frac{1}{d}. \tag{4.20} \]
Then Lemma 12 implies that for each $i = 1, \dots, d$,
\[ p_i d \log(p_i d) - p_i d + 1 = f(p_i d) \le \frac{1}{f(1-\gamma)}\, f(z_i d). \tag{4.21} \]
Therefore from (4.17) it follows that
\[ \sum_{i=1}^{d} \big( z_i d \log(z_i d) - z_i d + 1 \big) = \sum_{i=1}^{d} f(z_i d) \ge h\, f(1-\gamma). \tag{4.22} \]
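Lemma 12, which drives the step from (4.17) to (4.22), can be sanity-checked on a grid. The sketch below is only a numerical illustration under stated assumptions: the grid ranges (`x_max`, `nx`, `nr`) and the function name `ratio_sup` are choices made here, and a finite grid cannot replace the proof in Sect. 5.6. The point $x = 1$, where the ratio is of the form $0/0$, is skipped.

```python
import math

def f(x):
    """f(x) = x log x - x + 1, with the continuous extension f(0) = 1."""
    return 1.0 if x == 0 else x * math.log(x) - x + 1

def ratio_sup(gamma, x_max=50.0, nx=2000, nr=50):
    """Grid maximum of f(x) / f(r x + 1 - r) over 0 <= x <= x_max, gamma <= r <= 1."""
    best = 0.0
    for i in range(nx + 1):
        x = x_max * i / nx
        for j in range(nr + 1):
            r = gamma + (1.0 - gamma) * j / nr
            denom = f(r * x + 1.0 - r)
            if denom < 1e-12:
                # x = 1 makes numerator and denominator vanish; skip it.
                continue
            best = max(best, f(x) / denom)
    return best

for gamma in (0.3, 0.5, 0.762):
    print(gamma, ratio_sup(gamma), 1.0 / f(1.0 - gamma))
```

On each line the grid maximum should agree with $1/f(1-\gamma)$, attained at $x = 0$, $r = \gamma$.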
We will use a standard bound for the difference between the entropies of $z_i$ and $q_i$ in terms of the $l_1$-norm of their difference [7, Th. 16.3.2]:
\[ \Big| -\sum_i z_i \log z_i + \sum_i q_i \log q_i \Big| \le m \Big( \log d + \log \frac{1}{m} \Big), \tag{4.23} \]
where
\[ m = \sum_{i=1}^{d} |z_i - q_i| = \sum_{i=1}^{d} |\epsilon_i| \le t\, d \sqrt{\frac{d \log n}{n}}. \tag{4.24} \]
Define
\[ \eta = d\, m \Big( \log d + \log \frac{1}{m} \Big). \tag{4.25} \]
Note that for all $d$ and $t$, $m \to 0$ as $n \to \infty$, and hence also $\eta \to 0$ as $n \to \infty$. From (4.23) and (4.22) we deduce
\[ \sum_{i=1}^{d} f(q_i d) \ge h\, f(1-\gamma) - \eta. \tag{4.26} \]
To summarize what we have shown so far: if $\rho \in J$ has eigenvalues $(q_1, \dots, q_d)$ then (4.26) holds. Thus we may upper bound the probability (4.12) by the probability of the state $\rho$ satisfying the inequality (4.26). Since this event depends only on the eigenvalues of $\rho$, we obtain
\[ G^*(\sigma_{dn})(J) \le \mu_{d,n}\Big( q : \sum_{i=1}^{d} f(q_i d) > h\, f(1-\gamma) - \eta \Big). \tag{4.27} \]
This probability is estimated using the bound (3.11): given a positive number $x \le d \log d$, define
\[ M_d(x) = \inf_{q \in \Delta_d} \Big\{ \sum_{i=1}^{d} F(q_i d) : \sum_{i=1}^{d} f(q_i d) \ge x \Big\}, \tag{4.28} \]
where $F(x) = -\log x + x - 1$ as defined in (3.9). Then from (4.27) and (3.11) we deduce
\[ (P_{d,n} \times \sigma_n)(E) = G^*(\sigma_{dn})(J) \le \exp\Big[ d^2 \log n - \log(d-1)! - (n-d)\, M_d\big( h f(1-\gamma) - \eta \big) \Big]. \tag{4.29} \]
The next lemma gives a lower bound for $M_d(x)$ which is not optimal but is sufficient for our purposes.

Lemma 13. The function $M_d(x)$ is increasing. Suppose that $2e^2 \le x \le d \log d$. Then
\[ M_d(x) \ge \log(x - 1) - \log(2e^2 - 1). \tag{4.30} \]
Lemma 13 will be proved in Sect. 5.7. Applying (4.30) to (4.29) gives
\[ (P_{d,n} \times \sigma_n)(E) \le \exp\Big[ d^2 \log n - \log(d-1)! - (n-d) \log \frac{h f(1-\gamma) - \eta - 1}{2e^2 - 1} \Big], \tag{4.31} \]
where $h$ is assumed to satisfy the bounds
\[ 2e^2 \le h f(1-\gamma) - \eta \le d \log d. \tag{4.32} \]
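4.4. The lower bound for Prob(E). First we write
\[ (P_{d,n} \times \sigma_n)(E) = \mathbb{E}_\Phi\big[ \sigma_n( z : \Phi^C(zz^*) \in J ) \big] \ge \mathbb{E}_\Phi\big[ 1_{C_{d,n} \cap T}\, \sigma_n( z : \Phi^C(zz^*) \in J ) \big], \]
where $\mathbb{E}_\Phi$ denotes expectation over $R_d(n)$ with respect to the measure $P_{d,n}$, and $1_{C_{d,n} \cap T}$ is the characteristic function of the event $C_{d,n} \cap T$. Given that $\Phi \in C_{d,n}$, there is a state $v \in \mathbb{C}^n$ such that
\[ S(\Phi^C(vv^*)) \le \log d - \frac{h}{d}. \tag{4.33} \]
Since $\mathrm{Tube}(\Phi^C(vv^*)) \subset J$ it follows that
\[ (P_{d,n} \times \sigma_n)(E) \ge \mathbb{E}_\Phi\big[ 1_{C_{d,n} \cap T}\, \sigma_n\big( z : \Phi^C(zz^*) \in \mathrm{Tube}(\Phi^C(vv^*)) \big) \big]. \tag{4.34} \]
Applying Lemma 11 to (4.34) gives
\[ (P_{d,n} \times \sigma_n)(E) \ge \beta\, (1-\gamma)^{n-1}\, \mathbb{E}_\Phi[ 1_{C_{d,n} \cap T} ] \tag{4.35} \]
\[ = \beta\, (1-\gamma)^{n-1}\, P_{d,n}(C_{d,n} \cap T) \tag{4.36} \]
\[ \ge \beta\, (1-\gamma)^{n-1} \big( P_{d,n}(C_{d,n}) - P_{d,n}(T^c) \big). \tag{4.37} \]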
4.5. Combining the bounds for Prob(E) and finishing the proof. Putting together the upper and lower bounds for $(P_{d,n} \times \sigma_n)(E)$ and using Lemma 9 produces the following bound: for all $d \ge 3$, for all $b > \sqrt{3}$ and $t \ge b + 4$, for all $0 < \gamma < 1$, for $h$, $d$ satisfying (4.32), and for $n$ sufficiently large,
\[
\begin{aligned}
P_{d,n}(C_{d,n}) &\le P_{d,n}(T^c) + \frac{1}{\beta} \Big( \frac{1}{1-\gamma} \Big)^{n-1} (P_{d,n} \times \sigma_n)(E) \\
&\le \frac{2d}{(d-1)!} \exp[-\alpha d^2 \log n] + \frac{1}{\beta} \Big( \frac{1}{1-\gamma} \Big)^{n-1} \exp\big[ d^2 \log n - \log(d-1)! - (n-d) \log \tilde{h} \big] \\
&= \frac{2d}{(d-1)!} \exp[-\alpha d^2 \log n] + \frac{1-\gamma}{\beta\,(d-1)!} \exp\big[ d^2 \log n + d \log \tilde{h} - n \log\big( (1-\gamma)\tilde{h} \big) \big],
\end{aligned}
\tag{4.38}
\]
where $\tilde{h} = (h f(1-\gamma) - \eta - 1)/(2e^2 - 1)$. Define
\[ h_{\min} = \frac{2e^2 - \gamma}{(1-\gamma)\, f(1-\gamma)} \tag{4.39} \]
(note that $h_{\min}$ satisfies the lower bound in (4.32)). As $n \to \infty$ the parameter $\eta$ approaches zero, and therefore for $h > h_{\min}$ the second term on the right side of (4.38) is controlled by the factor
\[ \exp\Big[ -n \log \frac{(1-\gamma)(h f(1-\gamma) - 1)}{2e^2 - 1} \Big] = \Big( \frac{h f(1-\gamma) - 1}{h_{\min} f(1-\gamma) - 1} \Big)^{-n}. \tag{4.40} \]
The first term on the right side of (4.38) approaches zero as $n \to \infty$, therefore (4.40) implies that for $h > h_{\min}$,
\[ P_{d,n}(C_{d,n}) \to 0 \quad \text{as } n \to \infty. \tag{4.41} \]

Summary and conclusion. We have shown that for any $0 < \gamma < 1$, for $h > h_{\min}$ as defined in (4.39), for any $b > \sqrt{3}$ and $t \ge b + 4$, for any $d \ge 3$ satisfying $d \log d > h f(1-\gamma)$ (this comes from the second inequality in (4.32)), there is $N < \infty$ such that for all $n \ge N$ we have $P_{d,n}(C_{d,n}) < 1$. In this case we also have $P_{d,n}(C_{d,n}^c) > 0$, and thus a guarantee that the set $C_{d,n}^c$ is non-empty. Referring to (4.9), this means that there exists a random unitary channel $\Phi$ such that
\[ S_{\min}(\Phi) > \log d - \frac{h}{d}. \tag{4.42} \]
4.6. Optimizing the bounds for Prob(E) and the proof of Proposition 3. First consider the value $h_{\min}$ defined in (4.39). Varying $\gamma$ shows that the right side achieves its minimum value at $\gamma = 0.72$. In order to achieve a counterexample we need $\log d \ge 2h$, so this implies the existence of counterexamples for all $d \ge d_0$ with
\[ d_0 = \exp[2 h_{\min} + 1] \approx \exp[276]. \tag{4.43} \]
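The minimization of $h_{\min}$ over $\gamma$ is a one-dimensional problem and is easy to reproduce. The sketch below is an illustration only: it assumes a plain grid search with step `1e-3` (the paper does not say how the minimum was located), and the function names are hypothetical. It evaluates the right side of (4.39) with $f(x) = x \log x - x + 1$ from (4.18).

```python
import math

def f(x):
    """f(x) = x log x - x + 1, with the continuous extension f(0) = 1."""
    return 1.0 if x == 0 else x * math.log(x) - x + 1

def h_min(gamma):
    """Right side of (4.39) as a function of gamma, 0 < gamma < 1."""
    return (2 * math.e ** 2 - gamma) / ((1 - gamma) * f(1 - gamma))

def minimize_h_min(step=1e-3):
    """Grid search for the gamma in (0, 1) that minimizes h_min."""
    grid = [k * step for k in range(1, int(round(1 / step)))]
    best_gamma = min(grid, key=h_min)
    return best_gamma, h_min(best_gamma)

gamma_star, h_star = minimize_h_min()
print(gamma_star, h_star, 2 * h_star)
```

The minimizer should land close to the value $\gamma = 0.72$ quoted above, with $2 h_{\min}$ close to the exponent 276 appearing in (4.43).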
In order to get a better estimate of $d_{\min}$, we return to the bound (4.29) and look for the smallest value of $d$ satisfying
\[ M_d\Big( \frac{f(1-\gamma)}{2} \log d \Big) + \log(1-\gamma) > 0. \tag{4.44} \]
For $n$ sufficiently large this will yield a counterexample. This is a straightforward numerical problem: for each $\gamma$ we find the smallest $d$ so that
\[ -\log(1-\gamma) < \inf_{z > 1} \Big\{ -\log z - (d-1) \log \frac{d-z}{d-1} \ :\ z \log z + (d-z) \log \frac{d-z}{d-1} = \frac{f(1-\gamma)}{2} \log d \Big\} \tag{4.45} \]
and then minimize over $\gamma$. The solution occurs at $\gamma = 0.762$ and yields $d_0 = 38578$. This also proves the first statement in Proposition 3.

For the second statement we estimate the smallest value of $n$ which yields $P_{d,n}(C_{d,n}) < 1$. Using the values $b = 2$, $t = 6$, $\gamma = 0.762$, and with $d = 50{,}000$, crude numerical estimates show that we can achieve this with $n = d^7$. This proves the second statement in Proposition 3.

For the third statement, we note from Lemma 1 that for any random unitary channel $\Phi$,
\[ S_{\max} \ge 2 S_{\min}(\Phi) - 2 \log d + \frac{\log d}{d}. \tag{4.46} \]
Thus for every $\Phi \in C_{d,n}^c$ we have
\[ S_{\max} \ge \frac{\log d - 2h}{d}. \tag{4.47} \]
For a fixed value $h$, the right side of (4.47) achieves its maximum value when $d = [\exp(2h + 1)]$, and this maximum value is $1/d$. Numerical calculation shows that we can achieve $M_d(f(1-\gamma) h) + \log(1-\gamma) > 0$ using the values $\gamma = 0.762$, $h = \log(38590)/2$ and $d = [\exp(2h + 1)]$, and then $1/d$ yields the lower bound for $S_{\max}$ stated in Proposition 3.

5. Proofs of Lemmas

5.1. Proof of Lemma 1. First, note that for any unit vectors $\{|\psi_k\rangle\}$ and probability distribution $\{p_k\}$,
\[ S\Big( \sum_k p_k |\psi_k\rangle\langle\psi_k| \Big) \le -\sum_k p_k \log p_k. \tag{5.1} \]
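Let $|\hat\psi\rangle$ be the maximally entangled state. Then
\[ (\Phi \otimes \bar\Phi)(|\hat\psi\rangle\langle\hat\psi|) = \sum_{i,j=1}^{d} w_i w_j\, (U_i \otimes \bar U_j)\, |\hat\psi\rangle\langle\hat\psi|\, (U_i^* \otimes U_j^T) \tag{5.2} \]
\[ = \sum_{i=1}^{d} w_i^2\, |\hat\psi\rangle\langle\hat\psi| + \sum_{i \ne j} w_i w_j\, (U_i \otimes \bar U_j)\, |\hat\psi\rangle\langle\hat\psi|\, (U_i^* \otimes U_j^T), \tag{5.3} \]
where we used the identity $(U_i \otimes \bar U_i)\, |\hat\psi\rangle\langle\hat\psi|\, (U_i^* \otimes U_i^T) = |\hat\psi\rangle\langle\hat\psi|$ for all $i$. Hence, by (5.1),
\[ S\big( (\Phi \otimes \bar\Phi)(|\hat\psi\rangle\langle\hat\psi|) \big) \le -\Big( \sum_{i=1}^{d} w_i^2 \Big) \log \Big( \sum_{i=1}^{d} w_i^2 \Big) - \sum_{i \ne j} w_i w_j \log(w_i w_j). \tag{5.4} \]
Write $p = \sum_{i=1}^{d} w_i^2$ and then $\sum_{i \ne j} w_i w_j = 1 - p$. Hence
\[ S\big( (\Phi \otimes \bar\Phi)(|\hat\psi\rangle\langle\hat\psi|) \big) \le -p \log p + \sup \Big\{ -\sum_{k=1}^{d^2 - d} v_k \log v_k \ :\ v_k \ge 0,\ \sum_{k=1}^{d^2 - d} v_k = 1 - p \Big\}. \tag{5.5} \]
The supremum on the right side of (5.5) is achieved with $v_k = (1-p)/(d^2 - d)$ for all $k$, hence
\[ S\big( (\Phi \otimes \bar\Phi)(|\hat\psi\rangle\langle\hat\psi|) \big) \le h(p) = -p \log p - (1-p) \log \frac{1-p}{d^2 - d}, \tag{5.6} \]
where $1/d \le p \le 1$. However,
\[ h'(p) = -\log p + \log(1-p) - \log(d^2 - d) = \log \Big[ \Big( \frac{1}{p} - 1 \Big) \frac{1}{d(d-1)} \Big] \le \log \frac{1}{d} \]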
b some i = 1, . . . , d d n d ! Li , (5.32) = 2µd,n i=1
where the events $L_i$ are defined by $L_i = \{ (q_1, \dots, q_d) : |q_i - 1/d| > b \sqrt{\log n / n} \}$. Thus we have
\[ P_{d,n}(T^c) \le 2 \sum_{i=1}^{d} \mu_{d,n}(L_i) = 2\, d\, \mu_{d,n}(L_i). \tag{5.33} \]
We use the bound (3.15) of Corollary 6 with $t = b \sqrt{\log n / n}$ to estimate $\mu_{d,n}(L_i)$. In addition we assume that $n$ is large enough so that
\[ d\,t = d\,b \sqrt{\frac{\log n}{n}} \le \frac{1}{2}, \tag{5.34} \]
and hence
\[ \frac{n-d}{2}\, d^2 t^2 - \frac{n-d}{3}\, d^3 t^3 \ge \frac{n-d}{3}\, d^2 t^2. \tag{5.35} \]
Thus (5.32) gives
\[ P_{d,n}(T^c) \le 2\, d \exp\Big[ d^2 \log n - \log(d-1)! - \frac{n-d}{3}\, d^2 b^2\, \frac{\log n}{n} \Big] = \frac{2d}{(d-1)!} \exp\Big[ -d^2 \log n \Big( \frac{b^2 (n-d)}{3n} - 1 \Big) \Big]. \tag{5.36} \]
5.5. Proof of Lemma 11. This result relies on several properties of random states. We will switch to Dirac bra and ket notation throughout this section, as it lends itself well to the arguments used in the proof. To set up the notation, let $|\psi\rangle$ be a fixed state in $V_n$, and let $|\theta\rangle$ be a random pure state in $V_n$, with probability distribution $\sigma_n$. Without loss of generality we assume that a basis is chosen so that $|\psi\rangle = (1, 0, \dots, 0)^T$. We write $x = \langle\psi|\theta\rangle$, and let $|\phi\rangle$ be the state orthogonal to $|\psi\rangle$ such that
\[ |\theta\rangle = x\, |\psi\rangle + \sqrt{1 - |x|^2}\, |\phi\rangle. \tag{5.37} \]
Thus $|\phi\rangle$ is also a random state, defined by its relation to the uniformly random state $|\theta\rangle$ in (5.37). The following results are proved in Appendix B.

Proposition 14. $x$ and $|\phi\rangle$ are independent. $|\phi\rangle$ is a random vector in $V_{n-1}$ with distribution $\sigma_{n-1}$. For all $0 \le t \le 1$,
\[ \sigma_n\{ |\theta\rangle : |\langle\psi|\theta\rangle| = |x| > t \} = (1 - t^2)^{n-1}. \tag{5.38} \]
Proposition 14 implies that as $n \to \infty$ the overlap $x = \langle\psi|\theta\rangle$ becomes concentrated around zero. In other words, with high probability a randomly chosen state will be almost orthogonal to any fixed state. As a consequence, from (5.37) it follows that $|\phi\rangle$ will be almost equal to $|\theta\rangle$. This statement is made precise by noting that
\[ \big\| |\theta\rangle - |\phi\rangle \big\|_2 \le \sqrt{2}\, |\langle\psi|\theta\rangle|. \tag{5.39} \]
Then (5.38) immediately implies that
\[ \sigma_n\big( |\theta\rangle : \big\| |\theta\rangle - |\phi\rangle \big\|_2 > t \big) \le \Big( 1 - \frac{t^2}{2} \Big)^{n-1}. \tag{5.40} \]
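The tail formula (5.38) and the norm inequality (5.39) can be illustrated by simulation. The following is a rough Monte Carlo sketch, not part of the proof: it assumes that a uniformly random unit vector in $\mathbb{C}^n$ can be sampled by normalizing a vector of independent complex Gaussians, and the dimension, threshold, sample size and seed are arbitrary choices made here.

```python
import math
import random

def random_unit_vector(n, rng):
    """Uniform random unit vector in C^n, obtained by normalizing a complex Gaussian vector."""
    v = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in v))
    return [c / norm for c in v]

def overlap_statistics(n, t, samples, seed=0):
    """Estimate sigma_n(|x| > t) and the largest value of ||theta - phi|| - sqrt(2)|x|."""
    rng = random.Random(seed)
    hits = 0
    worst_gap = -math.inf
    for _ in range(samples):
        theta = random_unit_vector(n, rng)
        x = theta[0]  # overlap with psi = (1, 0, ..., 0)^T
        s = math.sqrt(max(0.0, 1.0 - abs(x) ** 2))
        # phi is the normalized component of theta orthogonal to psi, as in (5.37).
        phi = [0j] + [c / s for c in theta[1:]]
        dist = math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(theta, phi)))
        worst_gap = max(worst_gap, dist - math.sqrt(2.0) * abs(x))
        if abs(x) > t:
            hits += 1
    return hits / samples, worst_gap

n, t = 5, 0.5
estimate, worst_gap = overlap_statistics(n, t, samples=20000)
print(estimate, (1 - t ** 2) ** (n - 1), worst_gap)
```

The empirical frequency should be close to $(1 - t^2)^{n-1}$, and `worst_gap` should never be positive beyond rounding error, in line with (5.39).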
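The second property relies on the particular form of the random unitary channel, or more precisely on the form of the complementary channel $\Phi^C$. Roughly, this property says that for any fixed random unitary channel $\Phi$ and random state $|\theta\rangle$, with high probability the norm of the matrix $\Phi^C(|\theta\rangle\langle\psi|)$ is small, and approaches zero as $n \to \infty$. We will prove the following bound: for any $\Phi \in R_d(n)$, and for all $0 \le t \le 1$,
\[ \sigma_n\big( |\theta\rangle : \| \Phi^C(|\theta\rangle\langle\psi|) \|_2 > t \big) \le d^2 (1 - t^2)^{n-1}. \tag{5.41} \]
As a first step toward deriving (5.41), note that for any states $|u\rangle$ and $|v\rangle$,
\[ \| \Phi^C(|u\rangle\langle v|) \|_2 = \Big( \sum_{k,l=1}^{d} w_k w_l\, |\langle v| U_l^* U_k |u\rangle|^2 \Big)^{1/2} \le \max_{k,l} |\langle v| U_l^* U_k |u\rangle|. \tag{5.42} \]
In particular this implies that
\[ \| \Phi^C(|u\rangle\langle v|) \|_2 \le \| |u\rangle \|_2\, \| |v\rangle \|_2. \tag{5.43} \]
To derive (5.41) we apply (5.42) with $u = \theta$ and $v = \psi$ and deduce that
\[
\begin{aligned}
\sigma_n\big( |\theta\rangle : \| \Phi^C(|\theta\rangle\langle\psi|) \|_2 > t \big) &\le \sigma_n\big( |\theta\rangle : \max_{k,l} |\langle\psi| U_l^* U_k |\theta\rangle| > t \big) \\
&\le d^2\, \sigma_n\big( |\theta\rangle : |\langle\psi| U_l^* U_k |\theta\rangle| > t \big) \\
&= d^2 (1 - t^2)^{n-1},
\end{aligned}
\tag{5.44}
\]
where the last equality follows from (5.38).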
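With these ingredients in place the proof of Lemma 11 can proceed. By assumption $\Phi$ is a random unitary channel belonging to the typical set $T$, and $\rho = \Phi^C(|\psi\rangle\langle\psi|)$ is some state in $\mathrm{Im}(\Phi^C)$. Let $|\theta\rangle$ be a random input state, then as in (5.37) we write $|\theta\rangle = x\, |\psi\rangle + \sqrt{1 - |x|^2}\, |\phi\rangle$. It follows that
\[ |\theta\rangle\langle\theta| = |x|^2\, |\psi\rangle\langle\psi| + (1 - |x|^2)\, |\phi\rangle\langle\phi| + \sqrt{1 - |x|^2}\, \big( x\, |\psi\rangle\langle\phi| + \bar{x}\, |\phi\rangle\langle\psi| \big). \tag{5.45} \]
Write $r = |x|^2$, then (5.45) yields
\[
\begin{aligned}
\Phi^C(|\theta\rangle\langle\theta|) - \Big( r\, \Phi^C(|\psi\rangle\langle\psi|) + (1-r) \frac{1}{d} I \Big) &= (1-r) \Big( \Phi^C(|\phi\rangle\langle\phi|) - \frac{1}{d} I \Big) \\
&\quad + \sqrt{r(1-r)}\; \Phi^C\big( e^{i\xi} |\psi\rangle\langle\phi| + e^{-i\xi} |\phi\rangle\langle\psi| \big),
\end{aligned}
\tag{5.46}
\]
where $\xi$ is the phase of $x$. Since $r \le 1$ this implies
\[ \Big\| \Phi^C(|\theta\rangle\langle\theta|) - \Big( r\, \Phi^C(|\psi\rangle\langle\psi|) + (1-r) \frac{1}{d} I \Big) \Big\|_\infty \le \Big\| \Phi^C(|\phi\rangle\langle\phi|) - \frac{1}{d} I \Big\|_\infty + \big\| \Phi^C(|\psi\rangle\langle\phi|) \big\|_\infty. \tag{5.47} \]
Referring to the definition (4.6) of $\mathrm{Tube}(\rho)$, recall that $\Phi^C(|\theta\rangle\langle\theta|)$ belongs to $\mathrm{Tube}(\rho)$ if and only if for some $r$ satisfying $\gamma \le r \le 1$,
\[ \Big\| \Phi^C(|\theta\rangle\langle\theta|) - \Big( r\, \Phi^C(|\psi\rangle\langle\psi|) + (1-r) \frac{1}{d} I \Big) \Big\|_\infty \le t \sqrt{\frac{d \log n}{n}} \tag{5.48} \]
(the set $Y(\rho)$ defined in (4.5) is closed so the infimum in (4.6) is achieved). Define the following three events in $V_n$:
\[ A_1 = \{ |\theta\rangle : r = |\langle\psi|\theta\rangle|^2 \ge \gamma \}, \tag{5.49} \]
\[ A_2 = \Big\{ |\theta\rangle : \Big\| \Phi^C(|\phi\rangle\langle\phi|) - \frac{1}{d} I \Big\|_\infty \le 2\sqrt{2} \sqrt{\frac{d \log d}{n}} + b \sqrt{\frac{\log n}{n}} \Big\}, \tag{5.50} \]
\[ A_3 = \Big\{ |\theta\rangle : \big\| \Phi^C(|\psi\rangle\langle\phi|) \big\|_\infty \le (1 + \sqrt{2}) \sqrt{\frac{d \log d}{n}} \Big\}. \tag{5.51} \]
Assume that $d^2 \le n$ and then since $t \ge b + 4$ it follows from (5.47) and (5.48) that
\[ A_1 \cap A_2 \cap A_3 \subset \{ |\theta\rangle : \Phi^C(|\theta\rangle\langle\theta|) \in \mathrm{Tube}(\rho) \}. \tag{5.52} \]
Note that $n \ge d^2$. Furthermore, by Proposition 14, $A_1$ is independent of $A_2$ and $A_3$, hence
\[ \sigma_n\big( |\theta\rangle : \Phi^C(|\theta\rangle\langle\theta|) \in \mathrm{Tube}(\rho) \big) \ge \sigma_n(A_1 \cap A_2 \cap A_3) = \sigma_n(A_1)\, \sigma_n(A_2 \cap A_3). \tag{5.53} \]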
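Proposition 14 immediately yields
\[ \sigma_n(A_1) = (1-\gamma)^{n-1}. \tag{5.54} \]
From (5.53) this gives
\[ \sigma_n\big( |\theta\rangle : \Phi^C(|\theta\rangle\langle\theta|) \in \mathrm{Tube}(\rho) \big) \ge (1-\gamma)^{n-1} \big( 1 - \sigma_n(A_2^c) - \sigma_n(A_3^c) \big). \tag{5.55} \]
In order to bound $\sigma_n(A_3^c)$ we first use (5.43) to deduce
\[ \big\| \Phi^C(|\psi\rangle\langle\phi|) \big\|_\infty \le \big\| \Phi^C(|\psi\rangle\langle\phi|) \big\|_2 \le \big\| \Phi^C(|\psi\rangle\langle\theta|) \big\|_2 + \big\| |\theta\rangle - |\phi\rangle \big\|_2. \tag{5.56} \]
Thus
\[
\begin{aligned}
\sigma_n(A_3^c) &= \sigma_n\Big( |\theta\rangle : \big\| \Phi^C(|\psi\rangle\langle\phi|) \big\|_\infty > (1 + \sqrt{2}) \sqrt{\frac{d \log d}{n}} \Big) \\
&\le \sigma_n\Big( |\theta\rangle : \big\| \Phi^C(|\psi\rangle\langle\theta|) \big\|_2 + \big\| |\theta\rangle - |\phi\rangle \big\|_2 > (1 + \sqrt{2}) \sqrt{\frac{d \log d}{n}} \Big) \\
&\le \sigma_n\Big( |\theta\rangle : \big\| \Phi^C(|\psi\rangle\langle\theta|) \big\|_2 > \sqrt{\frac{d \log d}{n}} \Big) + \sigma_n\Big( |\theta\rangle : \big\| |\theta\rangle - |\phi\rangle \big\|_2 > \sqrt{2} \sqrt{\frac{d \log d}{n}} \Big) \\
&\le (d^2 + 1) \Big( 1 - \frac{d \log d}{n} \Big)^{n-1},
\end{aligned}
\tag{5.57}
\]
where the last inequality follows from (5.44) and (5.40). Turning now to $\sigma_n(A_2^c)$, note first that
\[
\begin{aligned}
\Big\| \Phi^C(|\phi\rangle\langle\phi|) - \frac{1}{d} I \Big\|_\infty &\le \big\| \Phi^C(|\phi\rangle\langle\phi|) - \Phi^C(|\theta\rangle\langle\theta|) \big\|_\infty + \Big\| \Phi^C(|\theta\rangle\langle\theta|) - \frac{1}{d} I \Big\|_\infty \\
&\le \big\| \Phi^C(|\phi\rangle\langle\phi|) - \Phi^C(|\theta\rangle\langle\theta|) \big\|_2 + \Big\| \Phi^C(|\theta\rangle\langle\theta|) - \frac{1}{d} I \Big\|_\infty \\
&\le 2\, \big\| |\theta\rangle - |\phi\rangle \big\|_2 + \Big\| \Phi^C(|\theta\rangle\langle\theta|) - \frac{1}{d} I \Big\|_\infty,
\end{aligned}
\tag{5.58}
\]
where we used (5.43) for the last inequality. As in (5.57) this gives
\[
\begin{aligned}
\sigma_n(A_2^c) &= \sigma_n\Big( |\theta\rangle : \Big\| \Phi^C(|\phi\rangle\langle\phi|) - \frac{1}{d} I \Big\|_\infty > 2\sqrt{2} \sqrt{\frac{d \log d}{n}} + b \sqrt{\frac{\log n}{n}} \Big) \\
&\le \sigma_n\Big( |\theta\rangle : 2\, \big\| |\theta\rangle - |\phi\rangle \big\|_2 > 2\sqrt{2} \sqrt{\frac{d \log d}{n}} \Big) + \sigma_n\Big( |\theta\rangle : \Big\| \Phi^C(|\theta\rangle\langle\theta|) - \frac{1}{d} I \Big\|_\infty > b \sqrt{\frac{\log n}{n}} \Big) \\
&\le \Big( 1 - \frac{d \log d}{n} \Big)^{n-1} + \sigma_n\Big( |\theta\rangle : \Big\| \Phi^C(|\theta\rangle\langle\theta|) - \frac{1}{d} I \Big\|_\infty > b \sqrt{\frac{\log n}{n}} \Big),
\end{aligned}
\tag{5.59}
\]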
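where we used (5.40) for the last inequality. By assumption $\Phi \in T$, and therefore there is a set of input states $L$ with $\sigma_n(L) \ge 1/2$ such that
\[ |\theta\rangle \in L \ \Rightarrow\ \Big\| \Phi^C(|\theta\rangle\langle\theta|) - \frac{1}{d} I \Big\|_\infty \le b \sqrt{\frac{\log n}{n}}. \tag{5.60} \]
Thus
\[ \sigma_n\Big( |\theta\rangle : \Big\| \Phi^C(|\theta\rangle\langle\theta|) - \frac{1}{d} I \Big\|_\infty > b \sqrt{\frac{\log n}{n}} \Big) \le \sigma_n(L^c) \le \frac{1}{2}. \tag{5.61} \]
Putting together the bounds (5.55), (5.57), (5.59) and (5.61) we get
\[
\begin{aligned}
\sigma_n\big( \Phi^C(|\theta\rangle\langle\theta|) \in \mathrm{Tube}(\rho) \big) &\ge (1-\gamma)^{n-1} \big( 1 - \sigma_n(A_2^c) - \sigma_n(A_3^c) \big) \\
&\ge (1-\gamma)^{n-1} \Big( 1 - \Big( 1 - \frac{d \log d}{n} \Big)^{n-1} - \frac{1}{2} - (d^2 + 1) \Big( 1 - \frac{d \log d}{n} \Big)^{n-1} \Big) \\
&= (1-\gamma)^{n-1} \Big( \frac{1}{2} - (d^2 + 2) \Big( 1 - \frac{d \log d}{n} \Big)^{n-1} \Big).
\end{aligned}
\tag{5.62}
\]
This completes the proof, with
\[ \beta = \frac{1}{2} - (d^2 + 2) \Big( 1 - \frac{d \log d}{n} \Big)^{n-1}. \tag{5.63} \]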
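The role of the restriction $d \ge 3$ in Lemma 11 becomes visible when (5.63) is evaluated numerically. The sketch below is an illustration with hypothetical function names; it uses the elementary limit $(1 - a/n)^{n-1} \to e^{-a}$ with $a = d \log d$, which gives the limiting value $\tfrac12 - (d^2 + 2)\, d^{-d}$ as $n \to \infty$. That limit is a remark added here and is not stated in the text.

```python
import math

def beta(d, n):
    """The constant beta of (5.63) for environment dimension d and input dimension n."""
    return 0.5 - (d ** 2 + 2) * (1.0 - d * math.log(d) / n) ** (n - 1)

def beta_limit(d):
    """Limit of beta(d, n) as n -> infinity: 1/2 - (d^2 + 2) d^(-d)."""
    return 0.5 - (d ** 2 + 2) * float(d) ** (-d)

for d in (2, 3, 4, 5):
    print(d, beta(d, 10 ** 4), beta(d, 10 ** 7), beta_limit(d))
```

For $d = 2$ the limiting value is negative, while for $d = 3$ it is already positive, which is consistent with the statement that the right side of (4.8) is positive for $d \ge 3$ and $n$ sufficiently large.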
5.6. Proof of Lemma 12. It is clear that $f(rx + 1 - r)$ is monotone increasing in $r$, and therefore
\[ \sup_{x \ge 0}\ \sup_{\gamma \le r \le 1} \frac{f(x)}{f(rx + 1 - r)} = \sup_{x \ge 0} \frac{f(x)}{f(\gamma x + 1 - \gamma)}. \tag{5.64} \]
The function $f(x)\, f(\gamma x + 1 - \gamma)^{-1}$ is analytic and decreasing at $x = 1$ for $\gamma < 1$. Thus either the supremum in (5.64) is achieved at $x = 0$ or else there is a critical point of the function $f(x)\, f(\gamma x + 1 - \gamma)^{-1}$ in the interval $(0, \infty)$. In order to rule out the second possibility, we introduce a Lagrange multiplier and define the function
\[ h(x, y, \beta) = \log f(x) - \log f(y) - \beta (\gamma x + 1 - \gamma - y). \tag{5.65} \]
To find the critical points of $h$ we solve
\[ \frac{\partial h}{\partial x} = \frac{\partial h}{\partial y} = \frac{\partial h}{\partial \beta} = 0. \tag{5.66} \]
Solving for $\beta$ leads to
\[ \frac{f'(x)}{f(x)} = \gamma\, \frac{f'(y)}{f(y)}. \tag{5.67} \]
Since $y - 1 = \gamma (x - 1)$ this is equivalent to
\[ \frac{(x-1) \log x}{x \log x - x + 1} = \frac{(y-1) \log y}{y \log y - y + 1}. \tag{5.68} \]
Direct computation shows that
\[ \frac{d}{dx}\, \frac{(x-1) \log x}{x \log x - x + 1} = f(x)^{-2} \Big( (\log x)^2 - \frac{(x-1)^2}{x} \Big) = f(x)^{-2} (\log x)^2 \Big( 1 - \Big( \frac{x^{1/2} - x^{-1/2}}{\log x} \Big)^2 \Big). \tag{5.69} \]
Furthermore, the function $x^{1/2} - x^{-1/2} - \log x$ is monotone increasing for all $x > 0$, and thus $x^{1/2} - x^{-1/2} > \log x$ for $x > 1$. Thus for $x \ge 1$ the derivative (5.69) is negative, and therefore (5.68) has no solution with $x > 1$. Similarly $x^{1/2} - x^{-1/2} < \log x$ for $0 < x < 1$, and hence again (5.69) is negative for $0 < x < 1$. So there are no solutions of (5.68) except $x = y = 1$. Therefore (5.65) has no critical points except $x = y = 1$, and thus the function $f(x)\, f(\gamma x + 1 - \gamma)^{-1}$ achieves its supremum at $x = 0$.

5.7. Proof of Lemma 13. Suppose first that $0 < h < d \log d$. Recall the definition
\[ M_d(h) = \inf_{q \in \Delta_d} \Big\{ \sum_{i=1}^{d} F(q_i d) : \sum_{i=1}^{d} f(q_i d) \ge h \Big\}, \tag{5.70} \]
where $F(x) = -\log x + x - 1$ and $f(x) = x \log x - x + 1$. Letting $x_i = q_i d$ we have
\[ M_d(h) = \inf_{x_i \ge 0} \Big\{ \sum_{i=1}^{d} F(x_i) : \sum_{i=1}^{d} f(x_i) \ge h,\ \sum_{i=1}^{d} x_i = d \Big\}. \tag{5.71} \]
The gradient of the function $\sum_{i=1}^{d} F(x_i)$ is zero only at $x_1 = \cdots = x_d = 1$, hence since $h > 0$ there are no critical points of $\sum_{i=1}^{d} F(x_i)$ in the region $\sum_{i=1}^{d} f(x_i) \ge h$. Thus the infimum in (5.71) is achieved at the boundary where $\sum_{i=1}^{d} f(x_i) = h$, and so
\[ M_d(h) = \inf_{x_i \ge 0} \Big\{ \sum_{i=1}^{d} F(x_i) : \sum_{i=1}^{d} f(x_i) = h,\ \sum_{i=1}^{d} x_i = d \Big\}. \tag{5.72} \]
We introduce Lagrange multipliers and define
\[ H(x_i, \alpha, \beta) = \sum_{i=1}^{d} F(x_i) - \alpha \Big( \sum_{i=1}^{d} f(x_i) - h \Big) - \beta \Big( \sum_{i=1}^{d} x_i - d \Big). \tag{5.73} \]
The critical equations for $H$ are
\[ \frac{\partial H}{\partial x_i} = 1 - \frac{1}{x_i} - \alpha \log x_i - \beta = 0. \tag{5.74} \]
The constraints can be used to eliminate $\beta$ and obtain
\[ \Big( 1 + \alpha \frac{h}{d} \Big) x_i - 1 = \alpha\, x_i \log x_i, \qquad i = 1, \dots, d. \tag{5.75} \]
If α ≤ 0, Eqs. (5.75) have the unique solution xi = 1 for all i = 1, . . . , d. However this d f (xi ) = h for h > 0. Thus α > 0, in which case does not satisfy the constraint i=1 there are positive numbers w and z satisfying 0<w 3 log L, so that the first term in the r.h.s. of (4.19) is smaller than 1/L. Taking t1 := exp(c(log L)ε ), one has from (1.9), π − , µ− K ,t
π τ , µτ,− K ,t
π K−
− µ− K ,t1 − π K ≤ e
− −t1 /Tmix,K
≤ exp[− exp(c(log L)ε − c ε )] " 1/| L |,
(4.20)
Mixing Time of 2D Stochastic Ising Model at Low Temperature
201
if one chooses c suitably larger than c (recall that we chose = O(log L)) and the corollary is proved. 4.3. Proof of Corollary 1.10. This is rather standard, once (1.13) is known (cf. for instance Theorem 3.2 in [16] or Theorem 3.6 in [7]). Clearly, it is sufficient to prove the result with f redefined as f (σ ) := (σ0 + 1) which has the advantage of being non-negative, increasing and with support {0}. Consider a square J ⊂ Z2 with side 2 + 1 ∈ {2n − 1}n∈N and centered at 0. By the exponential decay of correlations in the +, pure phase π∞
+ ( f ) − π J+ ( f )| ≤ c e−c . |π∞
(4.21)
Moreover, by monotonicity, for every initial configuration σ of the infinite system + tL 0 ≤ (et L f )(σ ) ≤ e J f (σ ) (4.22) and the right-hand side is an increasing function of σ ; in accord with the notations of Sect. 1.2, L+J denotes the generator of the dynamics in J with + boundary conditions on ∂ J (its invariant measure is of course π J+ ) and L is the generator of the infinite-volume dynamics. One has then (using once more monotonicity) 2 L+ + (4.23) π∞ (et L f )2 ≤ π J+ e J f which, together with (4.21), gives +
tL ρ(t) = Var +∞ et L f ≤ Var π J+ e J f + c e−c .
(4.24)
By (1.6), one has that Var π J+
+ tL −2t gap+J , e J f ≤ Var π J+ ( f )e
(4.25)
with gap+J the spectral gap of L+J . From the inequality gap ≥
1 Tmix
(cf. (1.10)) and (1.13), one deduces that for every ε > 0,
−cε . Var +∞ (et L f ) ≤ c e−c + e−2te
(4.26)
(4.27)
Now letting = (t) be the smallest integer such that cε ≥ log t −
1 log log t, ε
(4.28)
(with the condition that 2 + 1 ∈ {2n − 1}n∈N ) one sees that (4.27) implies (1.15).
202
F. Martinelli, F. L. Toninelli
Appendix A. Some Equilibrium Estimates A.1. A few basic facts on cluster expansion. In this section we rely on the results of ∗ [9], but we try to be reasonably self-contained. We let Z2 be the dual lattice of Z2 and ∗ we call a bond any segment joining two neighboring sites in Z2 . Two sites x, y in Z2 are said to be separated by a bond e if their distance (in R2 ) from e is 1/2. A pair of ∗ orthogonal bonds which meet in a site x ∗ ∈ Z2 is said to be a linked pair of bonds if both bonds are on the same side of the forty-five degrees line across x ∗ . A contour is a sequence e0 , . . . , en of bonds such that: (1) ei = e j for every i = j, except possibly when (i, j) = (0, n), ∗ (2) for every i, ei and ei+1 have a common vertex in Z2 , ∗ (3) if four bonds ei , ei+1 and e j , e j+1 , i = j, j + 1 intersect at some x ∗ ∈ Z2 , then ei , ei+1 and e j , e j+1 are linked pairs of bonds. If e0 = en , the contour is said to be closed, otherwise it is said to be open. Given a contour γ , we let γ be the set of sites in Z2 such that either their distance (in R2 ) from ∗ γ is 1/2, or their distance from the set of vertices in Z2 where two non-linked bonds √ of γ meet equals 1/ 2. We need the following Definition A.1. Given V ⊂ Z2 , we let V˜ ⊂ R2 be the union of all closed unit squares ∗ centered at each site in V , and V¯ be the set of all bonds e ∈ Z2 such that at least one of the two sites separated by e belongs to V . Given a rectangular domain V ⊂ Z2 , a configuration σ ∈ V and a boundary condition τ on ∂ V , let σ (τ,+) be the spin configuration on Z2 which coincides with σ in V , with τ on ∂ V and which is + otherwise. One immediately sees that the (finite) collection ∗ (τ,+) (τ,+) of bonds of Z2 which separate neighboring sites x, y ∈ Z2 such that σx = σ y splits in a unique way into a finite collection τ (σ ) of closed contours. 
It is easy to see that τ (σ ) ∩ V˜ consists of a certain number of closed contours, plus m open contours, where m is such that going along ∂ V one meets 2m changes of sign in τ . Note that the collection of the 2m endpoints of the open contours is fixed uniquely by τ . We write τ open (σ ) for the collection {γ1 , . . . , γm } of open contours in τ (σ ) ∩ V˜ . Of course, the open contours γi have to satisfy certain compatibility conditions: γi and γ j have no ∗ bond in common if i = j, and if they meet at some x ∗ ∈ Z2 , each of the two linked pairs of bonds belongs to only one contour. Moreover, each γi is contained in V˜ and the collection of the endpoints of the {γi }i≤m must coincide with that dictated by τ . We will write {γ1 , . . . , γm } ∼ τ to indicate that the collection of open contours is compatible with τ . The following result can be easily deduced from [9, Sect. 3.9 and 4.3]. Writing as usual πVτ for the equilibrium measure in V with b.c. τ , one has Theorem A.2. There exists β0 such that for every β > β0 the following holds. For every rectangle V ⊂ Z2 , every b.c. τ on ∂ V and every collection {γ1 , . . . , γm } of open contours compatible with τ , one has ({γ , . . . , γ }; V ) 1 m τ πVτ σ : open , (A1) (σ ) = {γ1 , . . . , γm } = (V, τ )
where the Boltzmann weight ({γ1 , . . . , γm }; V ) is defined as ⎧ ⎪ ⎪ m ⎨ ({γ1 , . . . , γm }; V ) := exp −2β |γi | − ⎪ ⎪ i=1 ⊂V : ⎩
∩(∪i γi )=∅
|γi | is the geometric length of γi and (V, τ ) :=
⎫ ⎪ ⎪ ⎬
() , ⎪ ⎪ ⎭
({γ1 , . . . , γm }; V ).
(A2)
(A3)
{γ1 ,...,γm }∼τ
The potential satisfies for every ⊂ V, || ≥ 2 and for every x ∈ V : |()| ≤ exp(−2(β − β0 )d()), |({x})| ≤ exp(−8(β − β0 )),
(A4) (A5)
where, for connected (in the sense of subgraphs of the graph Z2 ) , d() is the length of ¯ (cf. Definition A.1) containing all the bonds the smallest connected set of bonds from which separate sites in from sites in c . If is not connected then d() := +∞. The fast decay property of (with respect to both β and d()) has the following simple consequence: Lemma A.3 [9, Lemma 3.10]. There exists β0 depending only on β0 of Theorem A.2 ∗ such that for β > β0 , for every bond e ∈ Z2 and for every d > 0 one has
e−2(β−β0 )d() ≤ e−2(β−β0 )d . (A6) ¯ ⊂Z2 :e∈ d()≥d
This allows to essentially neglect the interaction between portions of a contour which are sufficiently far from each other. In order to apply directly results from [9] to obtain the estimates we need, we define the canonical ensemble of contours. Let a, b be sites in Z2 . Then, for any open contour γ ∗ γ which has a + (1/2, 1/2), b + (1/2, 1/2) ∈ Z2 as endpoints, in formulas a ↔ b (with some abuse of language, we will sometimes say that γ connects a and b), we define the probability distribution ⎧ ⎫ ⎪ ⎪ ⎪ ⎪ ⎨ ⎬ −1 −1 Pa,b (γ ) := Za,b exp −2β|γ | − () = Za,b (γ ; Z2 ) (A7) ⎪ ⎪ ⎪ ⎪ ⎩ ⎭ ⊂Z2 : ∩γ =∅
and of course Za,b :=
(γ ; Z2 ).
(A8)
γ
γ :a ↔b
Note that we do not require that γ ⊂ V˜ and the sum in is now over all (connected) sets ⊂ Z2 . The expectation w.r.t. Pa,b will be denoted by Ea,b .
A.1.1. Surface tension and basic properties Let n be a vector in the unit circle S such that n · e 1 > 0 and call φn the angle it forms with e 1 (of course, −π/2 < φn < π/2). For N ∈ N, let b N , n = (N , y N , n ) ∈ Z2 , where y N , n = max{y ∈ Z : y ≤ N tan(φn )}. Let also 0 := (0, 0). Then, it is known [9, Prop. 4.12] that, for β large enough, the surface tension introduced in (1.2) is given by τβ ( n ) := − lim
N →∞
1 log Z0,b N , n , βd(0, b N , n )
(A9)
where, if x, y ∈ R2 , d(x, y) is their Euclidean distance. To be precise, one has to assume that φn is bounded away from ±π/2 uniformly in N , but this will be inessential for us since we will always have φn small. One can extract from [9, Sect. 4.8, 4.9 and 4.12] that the surface tension is an analytic function of φn (always assuming that β is large enough), and by symmetry one sees that it is an even function of φn . In [9, Sect. 4.12], sharp estimates on the rate of convergence in (A9) (e.g. (A13) below) are given. A.2. Proof of (3.20). The domain E L (Q L ) which appears in (3.20) is a rectangle with height shorter than its base, and the b.c. τ is + on the South border and − otherwise. Since the event that the unique open contour reaches the height of the South border of A is increasing, in order to prove (3.20), by the FKG inequalities we can first of all move upwards the North border of E L (Q L ) until we obtain a square (of side 3L, which however here we call just L); we let therefore V := {1, . . . , L}2 . Secondly (always by FKG) we can change the b.c. τ to τ ≥ τ by first fixing a δ > 0 and then establishing that τx = + if x = (x1 , x2 ) ∈ ∂ V with x2 ≤ δL 1/2+ε , and τx = − otherwise. τ (σ ): of Given a configuration σ ∈ V , let γ be the unique open contour in open γ course, γ ⊂ V˜ and a1 ↔ a2 , where a1 := (0, δL 1/2+ε ) and a2 := (L , δL 1/2+ε ). We let h(γ ) := max{x2 : (x1 , x2 ) ∈ γ } be the maximal height reached by γ , while as usual ε > 0 is small and fixed. Looking at (A1) and (3.20), we see that what we have to prove is that for every fixed δ > 0 one has for every L ∈ N, N 2ε γ ∼τ (γ ; V )1{h(γ )>2δL 1/2+ε } := ≤ e−cL (A10) (V, τ ) (V, τ ) for some c(β, δ, ε) > 0. We will always assume that β is large enough. First we upper bound the numerator in (A10): with the notations of Sect. A.1 (cf. in particular (A7)) and setting for a given contour γ and a given V ⊂ Z2 , (), (A11) V (γ ) := ⊂Z2 : ∩γ =∅,∩V c =∅
one has N ≤ Za1 ,a2 Ea1 ,a2 1{h(γ )>2δL 1/2+ε } exp (V (γ )) ≤ Za1 ,a2 Pa1 ,a2 (h(γ ) > 2δL 1/2+ε ) Ea1 ,a2 exp (2V (γ )) ,
(A12)
where in the first step we simply removed the constraint that γ ⊂ V˜ , which is implicit in the requirement γ ∼ τ . It follows directly from [9, Prop. 4.15] that the first square root is
Fig. A1. The two topologically distinct possibilities: either γ1 connects 1 to 2 , or it connects w1 to 1 . The first case is very unlikely, see (A18)
smaller than exp(−cL 2ε ) (note that we are requiring the contour to reach a height which exceeds by δL 1/2+ε the height of its endpoints). On the other hand, from [9, Th. 4.16, in particular Eq. (4.16.6)] and the fast decay properties of (in particular Lemma A.3) it is not difficult to deduce that the second one is upper bounded by exp (c(log L)c ). Moreover, one has [9, Eq. (4.12.3)] that Za1 ,a2 ≤ c(β)
$\dfrac{e^{-\beta\tau_\beta(\vec e_1)L}}{\sqrt{L}}$, (A13)
where of course τβ ( e1 ) is the surface tension in the horizontal direction and we used the fact that d(a1 , a2 ) = L. In conclusion, we have (A14) N ≤ exp −βτβ ( e1 )L − cL 2ε . Next we observe that, again from [9, Th. 4.16 and Eq. (4.16.7)], (V, τ ) ≥ exp −βτβ ( e1 )L − c(log L)c , which together with (A14) concludes the proof of (3.20).
(A15)
A.3. Proof of Claim 3.10. In this section, V is the rectangle {(i, j) ∈ Z2 : 1 ≤ i ≤ L , 1 ≤ j ≤ 4(2L + 1)1/2+ε } and the b.c. τ is defined by τx = − for x ∈ := {(i, 0) ∈ Z2 : |i − L/2 | ≤ L 3ε } and for x = (x1 , x2 ) ∈ ∂ V with x2 > 2(2L + 1)1/2+ε ; τx = + otherwise. Moreover, C is the infinite vertical column C = {(x1 , x2 ) ∈ R2 : x1 = L/2 }. Write 1 + (1, 0) (resp. 2 ) for the left-most (resp. right-most) point τ in . For every σ ∈ V there are two open contours in open (σ ): γ1 and γ2 , and we establish by convention that γ1 is the contour which contains 1 + (1/2, 1/2) as one of its endpoints. Two cases can occur (see Fig. A1): γ1
γ2
• either 1 ↔ 2 and w1 ↔ w2 , where w1 := (0, 2(2L + 1)1/2+ε ) and w2 := (L , 2(2L + 1)1/2+ε ), γ1 γ2 • or w1 ↔ 1 and 2 ↔ w2 . Let C1 (resp. C2 ) be the vertical column at distance L ε to the left (resp. to the right) of the column C. Then, one has
Lemma A.4. The probability that appears in Claim 3.10 can be upper bounded as (−,+,)
π A¯
( c ) ≤ πVτ (¯ c ),
(A16)
where γi ¯ := {wi ↔ i and γi ∩ Ci = ∅, i = 1, 2}.
(A17)
Therefore, from Theorem A.2 we see that to prove Claim 3.10 it is enough to show that γ1 {γ1 ,γ2 }∼τ ({γ1 , γ2 }; V )1{ ↔ N1 3ε 1 2} := ≤ e−cL (A18) (V, τ ) (V, τ ) and that N2 := (V, τ )
{γ1 ,γ2 }∼τ
({γ1 , γ2 }; V )1
γ1
{1 ↔w1 }
(V, τ )
1{γ1 ∩C1 =∅}
≤ e−cL , 3ε
(A19)
for some positive c = c(β, ε). Proof of Lemma A.4. Since the event c is increasing, we note first of all that thanks to FKG we can enlarge the system from A¯ to V and change the b.c. from (−, +, ) to τ . Secondly, we observe that the event ¯ implies . A.3.1. Lower bound on (V, τ ) We will prove that there exists a positive constant c
such that for β large, (V, τ ) ≥ exp −βτβ ( e1 )(L − c L 3ε ) . (A20) Since we want a lower bound, we are allowed to keep only the configurations {γ1 , γ2 } ∼ τ γi such that wi ↔ i and γi does not touch the column Ci , for i = 1, 2. Call Gi , i = 1, 2 the set of configurations of γi allowed by the above constraints. Using the decay properties of , one sees that ⎞2 ⎛ (γ1 ; V )⎠ . (A21) (V, τ ) ≥ c ⎝ γ1 ∈G1
The square is due to the fact that γ1 and γ2 essentially do not interact because their mutual distance is larger than L ε (the residual interaction can be bounded by a constant which is absorbed in c). It remains to prove that (γ1 ; V ) ≥ exp(−βτβ ( e1 )((L/2) − c L 3ε )) (A22) γ1 ∈G1
for some positive c . This is an immediate consequence of Lemma A.6 below (applied with κ = ε), together with the fact that d(w1 , 1 ) = L/2 − L 3ε + O(L 2ε ), of the fact that the angle φ formed by the segment w1 1 and e 1 is O(L −1/2+ε ), and finally of the analyticity of the surface tension and its symmetry around e 1 .
A.3.2. Upper bound on N1 Using rough upper bounds on the number of paths γ1 which connect 1 and 2 and the decay properties of (in particular Lemma A.3), one sees that for L large, N1 ≤ e−cL
3ε
(γ ; V )
(A23)
γ γ ⊂V˜ : w1 ↔w2
for some c = c(β, ε) > 0, where of course one uses the fact that d(1 , 2 ) = 2L 3ε . Moreover, Theorem 4.16 of [9] ensures that (γ ; V ) ≤ exp(−βτβ ( e1 )L + c(log L)c ), (A24) γ
γ ⊂V˜ : w1 ↔w2
which, together with (A20), concludes the proof of (A18). A.3.3. Proof of (A19) The estimate we wish to prove is very intuitive: if the path γ1 makes a deviation to the right to touch the column C1 , it has an excess length, and therefore an excess energy, of order L 3ε with respect to typical paths. The actual proof of (A19) is a straightforward (although a bit lengthy) application of results from [9] and of the FKG inequalities. We sketch only the main steps. First of all, letting d(γ1 , γ2 ) := min{d(x1 , x2 ), xi ∈ γi , i = 1, 2}, we show that the contribution of the configurations such that d(γ1 , γ2 ) < L ε is negligible. To this purpose, decompose first of all N2 as N2 = N2 + N2
, where N2 :=
({γ1 , γ2 }; V )1
γ1
{1 ↔w1 }
{γ1 ,γ2 }∼τ
1{γ1 ∩C1 =∅} 1{d(γ1 ,γ2 ) a1 . Let v ab be the unit vector pointing from a to b and φab be the angle which v ab forms with e 1 . Assume that −π/4 ≤ φab ≤ π/4. Let A > 0, κ > 0, let Ua,b = Ua,b (A, κ) ⊂ R2 be the cigar-shaped region which is delimited by the two curves
$$x \mapsto \xi^{\pm}_{a,b;A,\kappa}(x) := x\tan(\phi_{ab}) \pm A\left(\frac{(x-a_1)(b_1-x)}{b_1-a_1}\right)^{1/2+\kappa}, \qquad x\in[a_1,b_1],$$
Fig. A3. A typical path γ which contributes to the lower bound (A39). For graphical convenience, we have assumed that a and b have the same vertical coordinate, and not all the cigar-shaped sets Uzi ,zi+1 (A , κ) have been drawn + be the upper half of U , obtained by slicing U and Ua,b a,b a,b along the segment ab. Also, we will denote by a,b = a,b (A, κ) the set of all open contours γ having a and b as endpoints, and such that every bond in γ has non-empty intersection with Ua,b ; similarly + . Then, we define a,b
Lemma A.6. Let β be large enough, and consider a domain V ⊂ Z2 such that V˜ + (A, κ) (cf. Definition A.1). There exists c depending on β, A, κ such that contains Ua,b (A39) (γ ; V ) ≥ exp −βτβ ( vab )d(a, b) − c(d(a, b))2κ . + γ ∈a,b
This result can be obtained via a repeated use of Theorem 4.16 of [9]. The error term $\exp(-c\,(d(a,b))^{2\kappa})$ is very rough (but sufficient for our purposes) and can presumably be improved. We do not give full details because they are a bit lengthy, although standard, but we sketch the main steps. First of all, let for simplicity of notation $L := b_1 - a_1$ and $A' := A/10$. Then, one proceeds as follows (keep in mind Fig. A3):
• for every $-n \le i \le n$, with $n = \lfloor\log_2(L)\rfloor - 2$, let $z_i = (x_i, y_i)$ be a point in $\mathbb{Z}^{2*}$ at minimal distance from $(\tilde x_i, \xi^{+}_{a,b;A',\kappa}(\tilde x_i))$, where
$$\tilde x_i := a_1 + (b_1 - a_1)\left(\frac12 + \frac{\operatorname{sign}(i)}{4}\sum_{j=0}^{|i|-1} 2^{-j}\right); \qquad \text{(A40)}$$
• remark via elementary geometrical considerations that for every −n ≤ i < n, the + (A, κ); cigar-shaped set Uzi ,zi+1 (A , κ) is entirely contained in Ua,b • restrict the sum (A39) to the paths γ which, when oriented from a to b, go through the points z −n , z −n+1 , . . . , z n (in this order), and such that the portion of the path between z i and z i+1 belongs to zi ,zi+1 (A , κ); • remark that, via the decay properties of the potential , the interaction between two adjacent portions of γ just defined can be bounded above by a constant;
• apply Theorem 4.16 of [9] to write that for every −n ≤ i < n one has (γ ; V )≥exp −βτβ ( vzi ,zi+1 )d(z i , z i+1 )−c(log d(z i , z i+1 ))c , γ ∈zi ,zi+1 (A ,κ)
(A41)
for some constant c depending on A, κ, β. As for the two portions of γ from a to z −n and from z n to b, they give a multiplicative contribution of order 1 to (A39) (this is because d(a, z −n ) = O(1) and d(b, z n ) = O(1), as is immediately seen from the definition of n);
• put together the estimates on the contributions coming from the 2n + 3 portions of γ obtained in the previous point: using the convexity and smoothness properties of the surface tension τβ (·), one obtains the claim of the lemma.

Acknowledgements. We are extremely grateful to Senya Shlosman and to Yvan Velenik for valuable help on low-temperature equilibrium estimates. Part of this work was done during the authors’ stay at the Institut Henri Poincaré - Centre Emile Borel during the semester “Interacting particle systems, statistical mechanics and probability theory”. The authors thank this institution for hospitality and support.
References
1. Alexander, K.S.: The spectral gap of the 2-D stochastic Ising model with nearly single-spin boundary conditions. J. Stat. Phys. 104, 59–87 (2001)
2. Alexander, K.S., Yoshida, N.: The spectral gap of the 2-D stochastic Ising model with mixed boundary conditions. J. Stat. Phys. 104, 89–109 (2001)
3. Higuchi, Y., Yoshida, N.: Slow relaxation of 2-D stochastic Ising models with random and non-random boundary conditions. In: New Trends in Stochastic Analysis, (Charingworth, England, Sept. 1994), Singapore: World Scientific, 1994, pp. 153–167
4. Schonmann, R.H., Yoshida, N.: Exponential relaxation of Glauber dynamics with some special boundary conditions. Commun. Math. Phys. 189(2), 299–309 (1997)
5. Bianchi, A.: Glauber dynamics on non-amenable graphs: boundary conditions and mixing time. Electron. J. Probab. 13, 1980–2012 (2008)
6. Bodineau, T., Martinelli, F.: Some new results on the kinetic Ising model in a pure phase. J. Stat. Phys. 109, 207–235 (2002)
7. Caputo, P., Martinelli, F., Toninelli, F.L.: On the approach to equilibrium for a polymer with adsorption and repulsion. Electron. J. Probab. 13, 213–258 (2008)
8. Cesi, F., Guadagni, G., Martinelli, F., Schonmann, R.H.: On the 2D stochastic Ising model in the phase coexistence region close to the critical point. J. Stat. Phys. 85, 55–102 (1996)
9. Dobrushin, R., Kotecký, R., Shlosman, S.: Wulff Construction. A Global Shape from Local Interaction. Transl. Math. Monographs 104, Providence, RI: Amer. Math. Soc., 1992
10. Fisher, D.S., Huse, D.A.: Dynamics of droplet fluctuations in pure and random Ising systems. Phys. Rev. B 35, 6841–6846 (1987)
11. Fortuin, C.M., Kasteleyn, P.W., Ginibre, J.: Correlation inequalities on some partially ordered sets. Commun. Math. Phys. 22, 89–103 (1971)
12. Higuchi, Y., Wang, J.: Spectral gap of Ising model for Dobrushin’s boundary condition in two dimensions. Preprint, 1999
13. Liggett, T.M.: Interacting particle systems. New York: Springer Verlag, 1985
14. Levin, D.A., Peres, Y., Wilmer, E.L.: Markov Chains and Mixing Times. Providence, RI: Amer. Math. Soc., 2009
15. Levin, D., Luczak, M., Peres, Y.: Glauber dynamics for the Mean-field Ising Model: cut-off, critical power law, and metastability. Probab. Theory Related Fields 146(1,2), 223–265 (2010)
16. Martinelli, F.: On the two dimensional dynamical Ising model in the phase coexistence region. J. Stat. Phys. 76, 1179–1246 (1994)
17. Martinelli, F.: Lectures on Glauber dynamics for discrete spin models. Lecture Notes in Math. 1717, Berlin: Springer, 1999
18. Martinelli, F., Sinclair, A., Weitz, D.: Glauber dynamics on trees: Boundary conditions and mixing time. Commun. Math. Phys. 250(2), 301–334 (2004)
19. Martinelli, F., Sinclair, A.: Mixing time for the solid-on-solid model. In: Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC), New York: Assoc. for Comp. Mach., 2009, pp. 571–580
20. Martin-Löf, A.: Mixing properties, differentiability of the free energy and the central limit theorem for a pure phase in the Ising model at low temperature. Commun. Math. Phys. 32, 75–92 (1973)
21. Messager, A., Miracle-Solé, S., Ruiz, J.: Convexity properties of the surface tension and equilibrium crystals. J. Stat. Phys. 67, 449–470 (1992)
22. Peres, Y.: Mixing for Markov Chains and Spin Systems. Available at www.stat.berkeley.edu/~peres/ubc.pdf, August 2005
23. Shlosman, S.: The droplet in the tube: a case of phase transition in the canonical ensemble. Commun. Math. Phys. 125, 81–90 (1989)
24. Simon, B.: The Statistical Mechanics of Lattice Gases. Vol. I. Princeton Series in Physics. Princeton, NJ: Princeton University Press, 1993
25. Sugimine, N.: A lower bound on the spectral gap of the 3-dimensional stochastic Ising models. J. Math. Kyoto Univ. 42, 751–788 (2002)
26. Sugimine, N.: Extension of Thomas’ result and upper bound on the spectral gap of d(≥ 3)-dimensional stochastic Ising models. J. Math. Kyoto Univ. 42(1), 141–160 (2002)
27. Thomas, L.E.: Bound on the mass gap for finite volume stochastic Ising models at low temperature. Commun. Math. Phys. 126, 1–11 (1989)

Communicated by H. Spohn
Commun. Math. Phys. 296, 215–249 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-0994-y
Communications in
Mathematical Physics
From Limit Cycles to Strange Attractors
William Ott1 , Mikko Stenlund2,3
1 Department of Mathematics, University of Houston, Houston, TX 77204-3008, USA. E-mail: [email protected]
2 Courant Institute of Mathematical Sciences, New York, NY 10012, USA. E-mail: [email protected]
3 Department of Mathematics and Statistics, University of Helsinki, P.O. Box 68, 00014 Helsinki, Finland
Received: 28 May 2009 / Accepted: 27 October 2009
Published online: 11 February 2010 – © Springer-Verlag 2010
Abstract: We define a quantitative notion of shear for limit cycles of flows. We prove that strange attractors and SRB measures emerge when systems exhibiting limit cycles with sufficient shear are subjected to periodic pulsatile drives. The strange attractors possess a number of precisely-defined dynamical properties that together imply chaos that is both sustained in time and physically observable.

1. Introduction

This paper is about a mechanism for producing chaos: shear. We are guided by the idea that in the presence of shear, a stable dynamical structure can be transformed into a strange attractor with strong stochastic properties by forcing the structure with a pulsatile drive. The forcing does not overwhelm the intrinsic dynamics. Instead, it acts as an amplifier, amplifying the effects of the intrinsic shear.

We focus on one particular dynamical structure of great importance: the limit cycle. Limit cycles are asymptotically stable periodic orbits of flows on Riemannian manifolds. The application of a periodic pulsatile drive to a flow exhibiting a limit cycle causes deformations to occur. If shear is present in a neighborhood of the limit cycle, if the limit cycle only weakly attracts nearby orbits, and if the time between pulses (the relaxation time) is sufficiently large, then stretch-and-fold geometry emerges in a neighborhood of the limit cycle. Stretch-and-fold geometry suggests that chaotic behavior that is both sustained in time and observable may exist.

We prove that such chaotic behavior does exist in a certain parameter regime for any (generic) forcing function if the shear is sufficiently strong. Moreover, we define a quantity called the shear integral that quantifies the amount of shear that is present in the intrinsic flow in a neighborhood of the limit cycle. We emphasize that the shear integral depends only on the intrinsic system and not on the external forcing. Our result is the first of its kind for general limit cycles. Wang and Young [16,17] obtain results of a similar flavor for supercritical Hopf bifurcations and certain linear models.
The search for and analysis of stochastic behavior in deterministic dynamical systems have played a major role in guiding dynamical systems research. We discuss a few relevant developments.

The theory of uniformly hyperbolic systems is well-developed. Let $M$ be a compact Riemannian manifold and let $f : M \to M$ be a $C^2$ diffeomorphism of $M$. An attractor for $f$ is a compact set $\Omega$ satisfying $f(\Omega) = \Omega$ for which there exists an open set $U \subset M$ (the basin) such that $f(\bar U) \subset U$ and $\Omega = \bigcap_{i=0}^{\infty} f^i(\bar U)$. An attractor $\Omega$ is said to be an Axiom A attractor if the tangent bundle over $\Omega$ splits into two $Df$-invariant subbundles $E^s$ and $E^u$ such that vectors in $E^s$ are contracted by $Df$ and vectors in $E^u$ are expanded by $Df$ (we assume $E^u$ is nontrivial). An Axiom A attractor supports a special invariant measure known as a Sinai-Ruelle-Bowen (SRB) measure that describes the asymptotic distribution of the orbit of almost every point in $U$ with respect to Riemannian volume and has strong stochastic properties. In this sense, the chaotic behavior associated with Axiom A systems is observable. It is also sustained in time because of the presence of positive Lyapunov exponent(s). One can, in principle, detect the presence of uniform hyperbolicity in a given system by finding invariant cone families with suitable properties. For example, Tucker uses this approach to prove that the Lorenz equations are chaotic for the classical parameter values studied by Lorenz [13].

Many systems of interest in the biological and physical sciences display some form of hyperbolicity but are not uniformly hyperbolic. A mature theory of nonuniform hyperbolicity has emerged over the last four decades. However, the following problem remains a challenge. Given a dynamical system (or a parametrized family of dynamical systems), how can nonuniform hyperbolicity be detected? Numerical techniques include the calculation of Lyapunov exponents and the 0-1 test [4,5].
This paper addresses the analytical component of the problem in the context of limit cycles. Our proofs are based on the recently-developed theory of rank one maps [15,18]. Rank one theory is based on the ideas of Jakobson [8], Benedicks and Carleson [1,2], and Young [19,20]. Rank one theory provides checkable conditions that imply the existence of SRB measures with strong stochastic properties in parametrized families of diffeomorphisms.

We conclude the introduction with a remark that the results obtained in this paper are in some sense dual to the phenomenon known as self-induced stochastic resonance (SISR) (see e.g. [3]). Our results demonstrate that certain intrinsic characteristics of a deterministic system (shear) can produce stochastic-type behavior when the system is forced in a deterministic way. SISR demonstrates that underlying phase space structures can produce deterministic (coherent) behavior in stochastically-forced systems when the noise level is taken to 0 along certain distinguished limits.

2. Statement of Results

We state the main results and discuss their relationship to the existing literature. Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be a $C^5$ vector field and consider the differential equation
$$\frac{dx}{dt} = f(x). \qquad (2.1)$$
We assume that (2.1) admits an asymptotically stable hyperbolic periodic solution $\eta$ of length $L$ and period $p_0$. Let $\gamma : \mathbb{R} \to \mathbb{R}^n$ be a function of the parameter $s$ that parametrizes $\eta$ by length. Define $\Gamma = \{\gamma(s) : s \in [0, L)\}$. Solutions to (2.1) that begin sufficiently close to $\Gamma$ will converge to $\Gamma$ at an exponential rate as $t \to \infty$. We are interested in the effects of adding periodic pulsatile forcing to the vector field defining (2.1).
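To make the interplay between exponential convergence and shear concrete, the following is a minimal sketch for a linear shear flow, assumed here in the form θ' = 1 + σy, y' = −λy, where y = 0 is the limit cycle. The model, the parameter values, and the function names `flow` and `asymptotic_phase_shift` are illustrative assumptions and are not taken from the paper. The flow is explicit: y decays like e^{−λt}, while the phase picks up an extra shift that tends to σy₀/λ, the ratio of shear to contraction.

```python
import math


def flow(theta0, y0, t, sigma, lam):
    """Exact time-t flow of theta' = 1 + sigma*y, y' = -lam*y.

    Illustrative model only: y = 0 plays the role of the limit cycle,
    sigma is the shear and lam > 0 is the contraction rate.
    """
    decay = math.exp(-lam * t)
    y = y0 * decay
    theta = theta0 + t + sigma * y0 * (1.0 - decay) / lam
    return theta, y


def asymptotic_phase_shift(y0, sigma, lam):
    """Limit as t -> infinity of theta(t) - theta0 - t."""
    return sigma * y0 / lam


if __name__ == "__main__":
    theta, y = flow(0.0, 0.5, 50.0, sigma=3.0, lam=0.1)
    print("distance to the cycle after t = 50:", abs(y))
    print("accumulated phase shift:", theta - 50.0)
    print("limiting phase shift:", asymptotic_phase_shift(0.5, 3.0, 0.1))
```

In this toy model a displacement of size y₀ off the cycle is converted into a phase displacement of size σy₀/λ, which is large exactly when the shear is strong and the contraction is weak.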
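For $0 < \rho < T$, define the periodic function $P_{\rho,T} : \mathbb{R} \to \mathbb{R}$ as follows. For $0 \le t < T$, set
$$P_{\rho,T}(t) = \begin{cases} 1, & \text{if } 0 \le t \le \rho, \\ 0, & \text{if } \rho < t < T, \end{cases}$$
and then extend periodically to all $t \in \mathbb{R}$ by requiring $P_{\rho,T}(t + T) = P_{\rho,T}(t)$. We study the externally-forced system
$$\frac{dx}{dt} = f(x) + \varepsilon P_{\rho,T}(t) F(x), \qquad (2.2)$$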
where $F : \mathbb{R}^n \to \mathbb{R}^n$ is a $C^4$ vector field and the parameter $\varepsilon > 0$ controls the amplitude of the forcing. Notice that the right side of (2.2) is not continuous. In Sect. 3 we compute a normal form of Eq. (2.2) that is valid in a tubular neighborhood $\tilde M \approx \Gamma \times D$, where $D$ is a closed disk in $\mathbb{R}^{n-1}$ of sufficiently small radius. We are interested in the dynamics of (2.2) in the tubular neighborhood $M \approx \Gamma \times \frac12 D$. Since the external forcing is periodic with period $T$, it is natural to study the time-$T$ map induced by (2.2). We write the time-$T$ map as the composition of a kick map $H_k : M \to \tilde M$ and a relaxation map $H_r : \tilde M \to \operatorname{int}(M)$. Let $H_k$ be the time-$\rho$ map induced by the flow associated with (2.2). Notice that the external forcing is active during the kick phase because $P_{\rho,T}(t) = 1$ for $0 \le t \le \rho$. For $\varepsilon$ sufficiently small, $H_k$ maps $M$ into $\tilde M$ diffeomorphically. Let $H_r$ be the time-$(T - \rho)$ map induced by (2.2) with $\varepsilon$ set to 0. There exists $T_0 = T_0(\varepsilon)$ such that if $T \ge T_0$, then $H_r$ maps $\tilde M$ into $\operatorname{int}(M)$. The composition $G_T := H_r \circ H_k$ is the time-$T$ map induced by (2.2). The dynamical properties of $G_T : M \to \operatorname{int}(M)$ depend on a number of factors. One feature common to every map $G_T$ for $T \ge T_0$ is the existence of an attractor $\Omega$ defined by
$$\Omega = \bigcap_{i=0}^{\infty} G_T^i(M).$$
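The kick-then-relax structure of the time-T map can be sketched numerically. The example below is a hedged illustration, not the paper's construction: it assumes a planar linear shear flow θ' = 1 + σy, y' = −λy with forcing ε sin(2πθ) acting on y while the pulse is on. The model, the constants `SIGMA` and `LAM`, and the function names are all assumptions. The kick map is integrated with classical RK4, the relaxation map is the exact unforced flow, and `time_T_map` composes them as G_T = H_r ∘ H_k.

```python
import math

SIGMA = 3.0  # illustrative shear (assumption, not from the paper)
LAM = 0.1    # illustrative contraction rate (assumption)


def rhs(theta, y, eps):
    """Right side of the forced toy system while the pulse is on (P = 1)."""
    return 1.0 + SIGMA * y, -LAM * y + eps * math.sin(2.0 * math.pi * theta)


def kick_map(theta, y, eps, rho, steps=200):
    """H_k: time-rho map with the forcing active, via classical RK4."""
    h = rho / steps
    for _ in range(steps):
        k1 = rhs(theta, y, eps)
        k2 = rhs(theta + 0.5 * h * k1[0], y + 0.5 * h * k1[1], eps)
        k3 = rhs(theta + 0.5 * h * k2[0], y + 0.5 * h * k2[1], eps)
        k4 = rhs(theta + h * k3[0], y + h * k3[1], eps)
        theta += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        y += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return theta, y


def relaxation_map(theta, y, t):
    """H_r: exact time-t map of the unforced toy flow (eps = 0)."""
    decay = math.exp(-LAM * t)
    return theta + t + SIGMA * y * (1.0 - decay) / LAM, y * decay


def time_T_map(theta, y, eps, rho, T):
    """G_T = H_r o H_k: kick for time rho, then relax for time T - rho."""
    theta_k, y_k = kick_map(theta, y, eps, rho)
    return relaxation_map(theta_k, y_k, T - rho)
```

Points of the cycle that are kicked outward (y > 0) advance in phase by more than T during relaxation, and points kicked inward advance by less. This phase-dependent stretching is the mechanism the text describes; the sketch does not attempt to verify any of the hypotheses of the theorem.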
We call $U := \operatorname{int}(M)$ the basin of attraction of $\Omega$. For every $x \in U$, $G_T^i(x) \to \Omega$ as $i \to \infty$. Two characteristics of the intrinsic system (2.1) play a key role in determining the structure of $\Omega$ and the dynamical properties of $G_T$: shear and the strength of the limit cycle. We quantify these notions momentarily; for now, imagine that (2.1) exhibits strong shear in $M$ if for most points $x \in \Gamma$, the velocity vector $f(\hat x)$ varies substantially as $\hat x$ moves away from $x$ in directions orthogonal to the limit cycle $\Gamma$. Think of the limit cycle as strongly stable if solutions to (2.1) that begin in $M$ converge quickly to $\Gamma$. If the shear is weak and the limit cycle is strongly stable, then the attractor associated with $G_T$ will be an invariant closed curve. We are interested in the opposite situation. Suppose that the shear is strong in $M$ and the limit cycle is weakly stable. The addition of the periodic pulsatile external force $\varepsilon P_{\rho,T}(t)F(x)$ will amplify the effect of the shear in the following way: disturbances that are created when $P_{\rho,T} = 1$ will be stretched during the relaxation period (when $P_{\rho,T} = 0$). The stretching effect increases in intensity as $T$ increases. If $T$ is large, then folds will be created in the phase space. If $G_T$ exhibits stretch-and-fold geometry, then $G_T$ potentially exhibits chaotic behavior that is sustained in time and observable. This paper aims to accomplish the following:
(1) We define a computable quantity called the shear integral that quantifies the shear associated with the intrinsic system (2.1) near the limit cycle $\Gamma$.
(2) We prove that if the magnitude of the shear integral is sufficiently large and if the contraction near the limit cycle is sufficiently weak, then the following holds for suitable values of $\varepsilon$. For a typical external vector field $F$, there exists $T_1 > 0$ and a set $\Delta \subset [T_1, \infty)$ of positive Lebesgue measure such that for $T \in \Delta$, the time-$T$ map $G_T$ associated with (2.2) admits a strange attractor and exhibits chaos that is sustained in time and observable.

The quantity $T_1$ satisfies $T_1 \gg \rho$, ensuring sufficient relaxation time for the stretch-and-fold geometry to emerge. The term strange attractor refers to a number of precisely defined dynamical and structural properties that represent sustained, observable chaos. For $T \in \Delta$, $\Omega$ supports a unique ergodic SRB measure $\nu$. Here the term SRB measure refers to a measure $\nu$ with a positive Lyapunov exponent $\nu$-almost everywhere and whose conditional measures on unstable manifolds are absolutely continuous with respect to Riemannian volume on these manifolds. The SRB measure $\nu$ satisfies the central limit theorem and exhibits exponential decay of correlations for Hölder continuous observables. For Lebesgue almost every $x$ in the basin of attraction $U$, the orbit of $x$ has a positive Lyapunov exponent and is asymptotically distributed according to $\nu$ in the sense that for every continuous function $\varphi : U \to \mathbb{R}$, we have
$$\lim_{m\to\infty} \frac{1}{m} \sum_{i=0}^{m-1} \varphi(G_T^i(x)) = \int \varphi \, d\nu. \qquad (2.3)$$
Notice that this statement is substantially stronger than the conclusion of the Birkhoff ergodic theorem. The Birkhoff ergodic theorem implies that (2.3) holds for $\nu$-almost every $x$. However, $\nu$ is singular with respect to Lebesgue measure (supported on a set of Lebesgue measure zero) because the dynamics are dissipative. We prove that (2.3) holds for Lebesgue almost every $x \in U$. See (SA1)–(SA4) in Sect. 4 for a more precise description of the dynamical properties of $G_T$ for $T \in \Delta$.

We now define the shear integral. In Sect. 3 we derive a normal form of (2.1) that is valid in $\tilde M$. The normal form, expressed in the natural $(s, z)$-coordinates introduced in Sect. 3.1, is given by
$$\frac{dt}{ds} = |f(\gamma(s))|^{-1} + \langle \beta(s), z \rangle + \omega_1(s, z), \qquad (2.4a)$$
$$\frac{dz}{ds} = Az + \omega_2(s, z). \qquad (2.4b)$$
Here $\langle \cdot, \cdot \rangle$ denotes the inner product on $\mathbb{R}^{n-1}$. Functions depending on $s$ in (2.4a)–(2.4b) are periodic in $s$ with period $2L$. The matrix $A$ is in Jordan canonical form. The functions $\omega_1$ and $\omega_2$ represent higher order corrections. The function $\beta$ gives the pointwise magnitude and direction of the shear. Define the shear integral by
$$\Sigma = (\Sigma_1, \ldots, \Sigma_{n-1}) := \int_0^{2L} \beta(\tau) \, d\tau$$
and define the shear factor $\sigma$ by $\sigma := |\Sigma|$. Having defined the shear integral, we describe the setting of the main theorem. We identify intrinsic parameters (parameters associated with $f$) and external parameters
(parameters associated with the external forcing). We fix the normalized shear vector $\Sigma/\sigma$ and view the shear factor $\sigma$ as the first intrinsic parameter. The second intrinsic parameter quantifies the strength of the contraction near the limit cycle and is derived from $A$. We assume for the sake of simplicity that $A$ is a diagonal matrix given by $A = \operatorname{diag}(\lambda_1, \ldots, \lambda_{n-1})$, where $0 > \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{n-1}$ are the eigenvalues of $A$. We fix the eigenvalue ratios $\mu_i = \lambda_i/\lambda_1$ for $1 \le i \le n - 1$ and we view the weakest eigenvalue $\lambda_1$ as an intrinsic parameter. The only external parameter is $\varepsilon$, the factor that controls the amplitude of the external forcing. We fix $\rho > 0$. A key parameter derived from $\varepsilon$, $\sigma$, and $\lambda_1$ is the hyperbolicity factor $\varepsilon\sigma/|\lambda_1|$.

One additional ingredient is needed. Even if $\sigma$ is large and $|\lambda_1|$ is small, a strange attractor cannot emerge unless the forcing $F$ acts in direction(s) in which shear is present. We express this idea by introducing a certain function on the circle $S := \mathbb{R}/2L\mathbb{Z}$. We identify $S$ with the interval $[0, 2L)$. In Sect. 3 we derive a normal form of the forced system (2.2) that is valid in $\tilde M$ when the forcing is active ($P_{\rho,T} = 1$):
$$\frac{dt}{ds} = |f(\gamma(s))|^{-1} + \langle \beta(s), z \rangle + \omega_3(s, z), \qquad (2.5a)$$
$$\frac{dz}{ds} = Az + \varepsilon\zeta(s) + \omega_4(s, z). \qquad (2.5b)$$
Functions depending on $s$ in (2.5a)–(2.5b) are periodic in $s$ of period $2L$. The functions $\omega_3$ and $\omega_4$ are higher order corrections. The function $\zeta$ is related to the projection of $F$ in directions orthogonal to $\Gamma$. For $s_0 \in S$, define $\tilde s$ implicitly by
$$\rho = \int_{s_0}^{\tilde s} |f(\gamma(\tau))|^{-1} \, d\tau.$$
Define the vector
$$d := \left( \frac{\Sigma_i}{\mu_i \sigma} \right)_{i=1}^{n-1}$$
and define $\Phi : S \to \mathbb{R}$ by
$$\Phi(s_0) = \left\langle d, \int_{s_0}^{\tilde s} \zeta(\tau) \, d\tau \right\rangle. \qquad (2.6)$$
We say that $\Phi$ is a Morse function if the critical set $C(\Phi) = \{s \in S : \Phi'(s) = 0\}$ is finite and if for every $s \in C(\Phi)$, we have $\Phi''(s) \ne 0$. We are now in position to state the main theorem. In Theorem 1, we assume that the radius of $M$ is $\kappa_0 \varepsilon$ for some constant $\kappa_0 > 0$.

Theorem 1. Let $G_T$ denote the time-$T$ map associated with (2.2). Suppose that the function $\Phi$ defined by (2.6) is a Morse function. Then there exist a small constant $\kappa_1 > 0$ and a large constant $\kappa_2 > \kappa_1$ such that the following holds. If
(1) $|\lambda_1| < \kappa_1$,
(2) $\varepsilon/|\lambda_1| < \kappa_1$,
(3) $\varepsilon\sigma/|\lambda_1| > \kappa_2$,
then there exists $T_1 > 0$ and a set $\Delta \subset [T_1, \infty)$ of positive Lebesgue measure such that for $T \in \Delta$, $G_T$ admits a strange attractor $\Omega$ in $M$ and satisfies (SA1)–(SA4) from Sect. 4. For every interval $I \subset [T_1, \infty)$ of length 1, $\mathrm{Leb}(\Delta \cap I) > 0$, where $\mathrm{Leb}$ denotes the Lebesgue measure on $\mathbb{R}$.

Remark 2.1. The assumption that $\Phi$ is a Morse function is quite mild and should hold for a typical forcing vector field $F$. We do not formulate precise results of this type in this paper, but such results should hold in terms of both topological genericity and prevalence. Prevalence is a measure-theoretic notion of genericity that generalizes the concept of ‘Lebesgue almost every’ to infinite-dimensional spaces. It provides a powerful framework for describing generic phenomena in a probabilistic way (see e.g. [6,7,10]).

Remark 2.2. Theorem 1 concludes that $G_T$ exhibits sustained, observable chaos for a set of values of $T$ of positive Lebesgue measure rather than for all $T \in [T_1, \infty)$. This is not a consequence of the nature of the proof. Rather, it is a fundamental consequence of the fact that an alternate scenario competes with the SRB scenario in the space of $T$-values. For an open set $\mathcal{S}$ of $T$-values in $[T_1, \infty)$, the basin $U$ contains a $G_T$-invariant Cantor set on which $G_T$ is uniformly hyperbolic (a horseshoe) and a periodic sink. The trajectory of Lebesgue almost every $x \in U$ converges to the periodic sink. Thus for $T \in \mathcal{S}$, $G_T$ exhibits transient chaos: a typical trajectory in the basin will move erratically for some time due to the presence of the horseshoe before finally converging to the periodic sink.

Remark 2.3. The function $\Phi$ does not depend on the parameters $\lambda_1$, $\sigma$, and $\varepsilon$.

Theorem 1 is related to two results obtained by Wang and Young in [17]. Wang and Young consider limit cycles forced by periodic δ-function kicks. First, they prove that any limit cycle, when suitably kicked, can be transformed into a strange attractor. This result is universal but not constructive.
An artificially-strong kick is needed if geometric conditions are unfavorable for the creation of nonuniform hyperbolicity. Second, they prove that the Hopf limit cycle that emerges from a supercritical Hopf bifurcation can be transformed into a strange attractor. Here the so-called twist factor plays the role of the shear integral. Unlike the shear integral, the twist factor is local in the sense that it depends only on derivatives of the vector field at the bifurcation parameter.

Many of the quantities in Theorem 1 are required to be sufficiently large or sufficiently small. This is an unavoidable consequence of the perturbative nature of the analytic techniques used in the proof. However, numerical evidence suggests that shear-induced chaos emerges over parameter ranges that far exceed those to which the rigorous analysis applies. For example, Lin and Young [9] conduct numerical studies of a linear shear flow model previously studied by Zaslavsky [21]. The work of Lin and Young also provides numerical evidence that the temporal form of the kicks need not be periodic: temporally-sustained chaotic behavior is observed for random kicks at Poisson-distributed times and for continuous-time forcing by white noise.

3. Derivation of the Singular Limit

3.1. Derivation of the normal forms. We derive the normal forms (2.4a)–(2.4b) and (2.5a)–(2.5b) that are valid in a small neighborhood of $\Gamma$. For $s \in S$, let $\{e_i(s)\}_{i=1}^{n}$ be an orthonormal basis for $\mathbb{R}^n$ such that $e_n(s) = \gamma'(s)$ (where $\gamma'$ denotes the derivative of $\gamma$ with respect to $s$) and $e_i$ is a $C^5$ function of $s$ for all $1 \le i \le n$. One may choose the first $n - 1$ vectors in many ways. For example, if $\gamma$ is at least $C^{n+5}$ and the first $n$ derivatives of $\gamma$ are linearly independent, then one may construct the basis by applying the Gram-Schmidt procedure to the first $n$ derivatives of $\gamma$. For any $x \in \mathbb{R}^n$ sufficiently close to $\Gamma$, there exist unique $s \in S$ and $y = (y_1, \ldots, y_{n-1})$ such that
$$x = \gamma(s) + \sum_{i=1}^{n-1} y_i e_i(s). \qquad (3.1)$$
We use (s, y) as new phase variables. Define
⎛
⎞ (e1 (s))T ⎜(e2 (s))T ⎟ ⎜ ⎟ E(s) = ⎜ ⎟. .. ⎝ ⎠ . (en (s))T
Differentiating E(s) with respect to s, we have E (s) = K(s)E(s), where K(s) = (k j,i (s)) is a skew-symmetric matrix of generalized curvatures defined by k j,i (s) = ej (s), ei (s) . If the first n derivatives of γ are used to create E, then this differential equation is the classical Frenet-Serret equation from differential geometry. For 1 i n, define the vector ⎞ ⎛ k1,i (s) ⎜ k2,i (s) ⎟ ⎟. ki (s) = ⎜ .. ⎠ ⎝ . kn−1,i (s)
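Differentiating (3.1) with respect to $t$, we obtain
\[
\frac{dx}{dt} = \frac{ds}{dt}\Big( \gamma'(s) + \sum_{j=1}^{n-1} y_j e_j'(s) \Big) + \sum_{i=1}^{n-1} \frac{dy_i}{dt} e_i(s) = f(x) + \varepsilon P_{\rho,T}(t) F(x). \tag{3.2}
\]
Taking the inner product of (3.2) with $e_i(s)$ for $1 \le i \le n-1$ yields
\[
\frac{dy_i}{dt} = \langle f(x), e_i(s) \rangle + \varepsilon P_{\rho,T}(t) \langle F(x), e_i(s) \rangle - \langle y, k_i(s) \rangle \frac{ds}{dt}.
\]
Taking the inner product of (3.2) with $e_n(s)$ yields
\[
\frac{ds}{dt} \big( \langle y, k_n(s) \rangle + 1 \big) = \langle f(x), e_n(s) \rangle + \varepsilon P_{\rho,T}(t) \langle F(x), e_n(s) \rangle.
\]
Notice that $\langle y, k_n(s) \rangle + 1 \neq 0$ if $y$ is sufficiently small. Consequently, the system
\[
\frac{ds}{dt} = \frac{1}{1 + \langle y, k_n(s) \rangle} \big( \langle f(x), e_n(s) \rangle + \varepsilon P_{\rho,T}(t) \langle F(x), e_n(s) \rangle \big), \tag{3.3a}
\]
\[
\frac{dy_i}{dt} = \langle f(x), e_i(s) \rangle + \varepsilon P_{\rho,T}(t) \langle F(x), e_i(s) \rangle - \langle y, k_i(s) \rangle \frac{ds}{dt} \tag{3.3b}
\]
is valid in a small neighborhood of $\Gamma$.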
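As a sanity check of the change of variables, the unforced ($\varepsilon = 0$) system can be evaluated in the planar case $n = 2$. The sketch below assumes a hypothetical vector field with $r' = r(1 - r^2)$ and $\theta' = \omega(r)$, whose limit cycle is the unit circle; the field, the function names, and the choice $\omega(r) = 1 + 2(r-1)$ are illustrative assumptions and are not taken from the paper. With $e_1(s) = (\cos s, \sin s)$ and $e_2(s) = \gamma'(s)$, one has $k_{1,2} = \langle e_1', e_2 \rangle = 1$ and $k_{1,1} = 0$, so one expects $ds/dt = \omega(1+y)$ and $dy/dt = r(1-r^2)$ with $r = 1 + y$.

```python
import math

def omega(r):
    """Hypothetical angular speed with radial shear."""
    return 1.0 + 2.0 * (r - 1.0)

def f(x):
    """Planar field with r' = r(1 - r^2), theta' = omega(r)."""
    r = math.hypot(x[0], x[1])
    er = (x[0] / r, x[1] / r)          # radial unit vector
    et = (-er[1], er[0])               # angular unit vector
    rdot = r * (1.0 - r * r)
    return (rdot * er[0] + r * omega(r) * et[0],
            rdot * er[1] + r * omega(r) * et[1])

def rates(s, y):
    """Unforced right sides of the (s, y) system for n = 2 (requires 1 + y > 0)."""
    e1 = (math.cos(s), math.sin(s))    # normal direction
    e2 = (-math.sin(s), math.cos(s))   # tangent, plays the role of e_n
    k12 = 1.0                          # <e1'(s), e2(s)> since e1' = e2
    k11 = 0.0                          # diagonal of a skew-symmetric matrix
    x = (math.cos(s) + y * e1[0], math.sin(s) + y * e1[1])
    fx = f(x)
    ds_dt = (fx[0] * e2[0] + fx[1] * e2[1]) / (1.0 + y * k12)
    dy_dt = (fx[0] * e1[0] + fx[1] * e1[1]) - y * k11 * ds_dt
    return ds_dt, dy_dt
```

For instance, `rates(0.3, 0.1)` should reproduce $\omega(1.1)$ and $1.1\,(1 - 1.1^2)$ up to rounding, which matches the polar form of the assumed field.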
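We now extract the terms of leading order in (3.3a) and (3.3b). For $1 \le j \le n$, define $\psi_j(s, y) = \langle f(x), e_j(s) \rangle$. For $1 \le i \le n-1$, we have $\psi_i(s, y) = \langle \psi_i^{(1)}(s), y \rangle + O_{s,y}(\|y\|^2)$, where
\[
\psi_i^{(1)}(s) = \frac{\partial \langle f(x), e_i(s) \rangle}{\partial y} \bigg|_{y=0}.
\]
Here $O_{s,y}(\|y\|^2)$ denotes a function of $s$ and $y$ for which there exists a constant $K > 0$ independent of $s$ and $y$ such that $|O_{s,y}(\|y\|^2)| \le K \|y\|^2$. Expanding $\psi_n(s, y)$, we have
\[
\psi_n(s, y) = \psi_n^{(0)}(s) + \langle \psi_n^{(1)}(s), y \rangle + O_{s,y}(\|y\|^2),
\]
where
\[
\psi_n^{(0)}(s) = \|f(\gamma(s))\|, \qquad \psi_n^{(1)}(s) = \frac{\partial \langle f(x), e_n(s) \rangle}{\partial y} \bigg|_{y=0}.
\]
Set $\phi_j(s, y) = \langle F(x), e_j(s) \rangle$ for $1 \le j \le n$. Writing (3.3a) and (3.3b) in terms of $\psi_j$ and $\phi_j$, when the forcing is active ($P_{\rho,T}(t) = 1$) we obtain
\[
\begin{cases}
\dfrac{dt}{ds} = \dfrac{1}{\psi_n^{(0)}(s) + \varepsilon \phi_n(s, y)} \Big[ 1 + \Big( k_n(s) - \dfrac{\psi_n^{(1)}(s)}{\psi_n^{(0)}(s) + \varepsilon \phi_n(s, y)} \Big) \cdot y + O_{s,y}(\|y\|^2) \Big] \\[2ex]
\dfrac{dy_i}{ds} = \dfrac{\varepsilon \phi_i(s, y)}{\psi_n^{(0)}(s) + \varepsilon \phi_n(s, y)} + \Big( \dfrac{\psi_i^{(1)}(s)}{\psi_n^{(0)}(s) + \varepsilon \phi_n(s, y)} - k_i(s) \Big) \cdot y \\[2ex]
\qquad\quad + \dfrac{\varepsilon \phi_i(s, y)}{\psi_n^{(0)}(s) + \varepsilon \phi_n(s, y)} \Big( k_n(s) - \dfrac{\psi_n^{(1)}(s)}{\psi_n^{(0)}(s) + \varepsilon \phi_n(s, y)} \Big) \cdot y + O_{s,y}(\|y\|^2).
\end{cases} \tag{3.4}
\]
When the forcing is off ($P_{\rho,T}(t) = 0$), we have
\[
\begin{cases}
\dfrac{dt}{ds} = \dfrac{1}{\psi_n^{(0)}(s)} \Big[ 1 + \Big( k_n(s) - \dfrac{\psi_n^{(1)}(s)}{\psi_n^{(0)}(s)} \Big) \cdot y + O_{s,y}(\|y\|^2) \Big] \\[2ex]
\dfrac{dy_i}{ds} = \Big( \dfrac{\psi_i^{(1)}(s)}{\psi_n^{(0)}(s)} - k_i(s) \Big) \cdot y + O_{s,y}(\|y\|^2).
\end{cases} \tag{3.5}
\]
Define
\[
b_0(s) := \frac{1}{\psi_n^{(0)}(s)}, \qquad b_1(s) := \frac{1}{\psi_n^{(0)}(s)} \Big( k_n(s) - \frac{\psi_n^{(1)}(s)}{\psi_n^{(0)}(s)} \Big),
\]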
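and let $\tilde{A}(s)$ denote the $(n-1) \times (n-1)$ matrix with $i$th row given by
\[
\Big( \frac{\psi_i^{(1)}(s)}{\psi_n^{(0)}(s)} - k_i(s) \Big)^T.
\]
In terms of $b_0$, $b_1$, and $\tilde{A}$, system (3.5) becomes
\[
\begin{cases}
\dfrac{dt}{ds} = b_0(s) + \langle b_1(s), y \rangle + O_{s,y}(\|y\|^2) \\[2ex]
\dfrac{dy}{ds} = \tilde{A}(s) y + O_{s,y}(\|y\|^2).
\end{cases} \tag{3.6}
\]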
Applying the Floquet theorem, there exists a real-valued, periodic $(n-1) \times (n-1)$ matrix $P(s)$ of period $2L$ such that setting $z = P^{-1}(s) y$, we transform (3.6) into
\[
\frac{dt}{ds} = b_0(s) + \big( (b_1(s))^T P(s) \big) z + h_2(s, z), \tag{3.7a}
\]
\[
\frac{dz}{ds} = A z + h_1(s, z). \tag{3.7b}
\]
This is the normal form of (2.2) on which we will base our analysis of the flow during the relaxation period (when $P_{\rho,T}(t) = 0$). We obtain the normal form of (2.2) during the forcing period (when $P_{\rho,T}(t) = 1$) by writing (3.4) in $(s, z)$-coordinates, giving
\[
\frac{dt}{ds} = b_0(s) + \big( (b_1(s))^T P(s) \big) z + O_{s,z}(\varepsilon) + O_{s,z}(\varepsilon \|z\|) + O_{s,z}(\|z\|^2), \tag{3.8a}
\]
\[
\frac{dz}{ds} = A z + \frac{\varepsilon P^{-1}(s) \phi(s, 0)}{\psi_n^{(0)}(s)} + O_{s,z}(\varepsilon \|z\|) + O_{s,z}(\varepsilon^2) + O_{s,z}(\|z\|^2), \tag{3.8b}
\]
where $\phi(s, 0) = (\phi_1(s, 0), \ldots, \phi_{n-1}(s, 0))^T$.

3.2. A general form of the singular limit. Let $\tilde{M} \approx \Gamma \times D$ be a tubular neighborhood of $\Gamma$ in $\mathbb{R}^n$, where $D$ is a disk of sufficiently small radius so that the normal form (3.8a)–(3.8b) is valid. Let $M \approx \Gamma \times \frac{1}{2} D$. We define flow-induced maps $H_k : M \to \tilde{M}$ and $H_r : \tilde{M} \to \tilde{M}$ as follows. Let $H_k$ be the time-$\rho$ map associated with the forced system (3.8a)–(3.8b). We call $H_k$ the 'kick'. Notice that for $\varepsilon$ sufficiently small, $H_k$ maps $M$ into $\tilde{M}$. Let $H_r$ be the time-$(T - \rho)$ map associated with the relaxation system (3.7a)–(3.7b). We call $H_r$ the relaxation map. There exists $T_0 = T_0(\varepsilon)$ such that if $T \ge T_0$, then $H_r$ maps $\tilde{M}$ into $\operatorname{int}(M)$. The composition $G_T := H_r \circ H_k$ is the time-$T$ map generated by the flow. Our goal is to show that the family $\{G_T : M \to \operatorname{int}(M),\ T \ge T_0\}$ of diffeomorphisms on $M$ has a well-defined singular limit in a certain sense as $T \to \infty$. Let $(s_0, y_0) \in M$. We write $H_k(s_0, y_0) = (\hat{s}, \hat{z})$ and compute $H_r(\hat{s}, \hat{z})$. Integrating (3.7b), we have
\[
z(s) = e^{(s - \hat{s}) A} \Big( \hat{z} + \int_{\hat{s}}^{s} e^{-(\tau - \hat{s}) A} h_1(\tau, z(\tau)) \, d\tau \Big).
\]
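Integrating (3.7a), we have
\[
T - \rho = \int_{\hat{s}}^{s(T)} b_0(\tau) \, d\tau + \hat{z} \cdot \int_{\hat{s}}^{s(T)} b_1(\tau)^T P(\tau) e^{(\tau - \hat{s}) A} \, d\tau + \sum_{k=1}^{2} E_k(s(T)), \tag{3.9}
\]
where the error terms are given by
\[
E_1(s(T)) = \int_{\hat{s}}^{s(T)} b_1(\tau)^T P(\tau) e^{(\tau - \hat{s}) A} \int_{\hat{s}}^{\tau} e^{-(\xi - \hat{s}) A} h_1(\xi, z(\xi)) \, d\xi \, d\tau,
\]
\[
E_2(s(T)) = \int_{\hat{s}}^{s(T)} h_2(\tau, z(\tau)) \, d\tau.
\]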
Letting $T \to \infty$ in (3.9) yields nothing meaningful. However, we use the fact that $s$ can be computed modulo $2L$ to introduce an auxiliary parameter $a \in S$ and thereby obtain the singular limit. Recall that $p_0$ is the period of $\eta$. As $a$ varies from $0$ to $2L$, $\gamma$ traverses $\Gamma$ 2 times. Let $\hat{t} : [0, 2L) \to [0, 2p_0)$ be the strictly increasing function defined by $\eta(\hat{t}(a)) = \gamma(a)$. For $m \in \mathbb{Z}^+$ and $a \in S$, set $T = \rho + 2 p_0 m + \hat{t}(a)$. Substituting into (3.9), writing $s(\rho + 2 p_0 m + \hat{t}(a)) = \hat{s} + 2Lm + \tilde{s}(\rho + 2 p_0 m + \hat{t}(a))$, and using the fact that
\[
\int_{v}^{v + 2Lm} b_0(\tau) \, d\tau = 2 p_0 m
\]
for all $v \in \mathbb{R}$, we obtain
\[
\hat{t}(a) = \int_{\hat{s}}^{\hat{s} + \tilde{s}(\rho + 2 p_0 m + \hat{t}(a))} b_0(\tau) \, d\tau + \hat{z} \cdot \int_{\hat{s}}^{s(\rho + 2 p_0 m + \hat{t}(a))} b_1(\tau)^T P(\tau) e^{(\tau - \hat{s}) A} \, d\tau + \sum_{k=1}^{2} E_k\big( s(\rho + 2 p_0 m + \hat{t}(a)) \big). \tag{3.10}
\]
Define $G_{a, m^{-1}} : M \to \operatorname{int}(M)$ by $G_{a, m^{-1}}(s_0, y_0) = \big( s(\rho + 2 p_0 m + \hat{t}(a)),\ y(\rho + 2 p_0 m + \hat{t}(a)) \big)$. It follows from [17, Prop. 3.1] that there exists $s_\infty(s_0, y_0, a)$ such that
\[
\lim_{m \to \infty} \hat{s} + \tilde{s}(\rho + 2 p_0 m + \hat{t}(a)) = s_\infty(s_0, y_0, a),
\]
and $s_\infty(s_0, y_0, a)$ is defined implicitly by taking the $m \to \infty$ limit in (3.10):
\[
\hat{t}(a) = \int_{\hat{s}}^{s_\infty(s_0, y_0, a)} b_0(\tau) \, d\tau + \Big\langle \hat{z}, \int_{\hat{s}}^{\infty} b_1(\tau)^T P(\tau) e^{(\tau - \hat{s}) A} \, d\tau \Big\rangle + \sum_{k=1}^{2} E_k(\infty). \tag{3.11}
\]
The family of maps $\{G_{a,0} : M \to \Gamma \times \{0\}\}_{a \in S}$ defined by $G_{a,0}(s_0, y_0) = (s_\infty(s_0, y_0, a), 0)$ is the desired singular limit. It follows from [17, Prop. 3.1] that the maps $(s_0, y_0, a) \mapsto G_{a, m^{-1}}(s_0, y_0)$ converge to the map $(s_0, y_0, a) \mapsto G_{a,0}(s_0, y_0)$ in $C^3(M \times S)$ as $m \to \infty$.
3.3. A computable form of the singular limit. From this point forward, we assume the setting of Theorem 1. We now extract the primary terms in the right side of (3.11). Recall that the shear integral is defined by
\[
\Sigma = (\Sigma_1, \ldots, \Sigma_{n-1}) = \int_{0}^{2L} b_1(\tau)^T P(\tau) \, d\tau,
\]
and that the shear factor is given by $\sigma = \|\Sigma\|$. We assume that the operator $A$ is diagonalizable and that the $z$-coordinate has been chosen such that $A = \operatorname{diag}(\lambda_1, \ldots, \lambda_{n-1})$, where $0 > \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{n-1}$ are the eigenvalues of $A$. Fix the normalized shear vector $\Sigma / \sigma$ and the eigenvalue ratios $\mu_i = \lambda_1 / \lambda_i$ for $1 \le i \le n-1$. Set $\rho = 1$ for notational simplicity. We regard $\sigma$, $\varepsilon$, and $\lambda_1$ as the parameters associated with the singular limit. Expanding the second term on the right side of (3.11), we have
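\[
\int_{\hat{s}}^{\infty} b_1(\tau)^T P(\tau) e^{(\tau - \hat{s}) A} \, d\tau = \Sigma \int_{\hat{s}}^{\infty} e^{(\tau - \hat{s}) A} \, d\tau + \int_{\hat{s}}^{\infty} \big( b_1(\tau)^T P(\tau) - \Sigma \big) e^{(\tau - \hat{s}) A} \, d\tau
= \bar{d} + \int_{\hat{s}}^{\infty} \big( b_1(\tau)^T P(\tau) - \Sigma \big) e^{(\tau - \hat{s}) A} \, d\tau, \tag{3.12}
\]
where
\[
\bar{d} = \Sigma \int_{\hat{s}}^{\infty} e^{(\tau - \hat{s}) A} \, d\tau = - \Big( \frac{\Sigma_i}{\lambda_i} \Big)_{i=1}^{n-1}.
\]
Let $\tilde{H}_k : M \to \tilde{M}$ be the time-1 map generated by the system
\[
\frac{dt}{ds} = b_0(s), \tag{3.13a}
\]
\[
\frac{dz}{ds} = \frac{\varepsilon P^{-1}(s) \phi(s, 0)}{\psi_n^{(0)}(s)}, \tag{3.13b}
\]
obtained from (3.8a)–(3.8b) by retaining only the terms of leading order. For $(s_0, y_0) \in M$, write $\tilde{H}_k(s_0, y_0) = (\tilde{s}, \tilde{z})$. Integrating (3.13a) and (3.13b) gives
\[
1 = \int_{s_0}^{\tilde{s}} b_0(\tau) \, d\tau, \qquad \tilde{z} = z_0 + \varepsilon \int_{s_0}^{\tilde{s}} \frac{P^{-1}(\tau) \phi(\tau, 0)}{\psi_n^{(0)}(\tau)} \, d\tau. \tag{3.14}
\]
Proposition 3.1. There exists a system constant $K_0 > 0$ such that
\[
\hat{s} = \tilde{s} + \xi_1(s_0, y_0), \qquad \hat{z} = \tilde{z} + \xi_2(s_0, y_0), \tag{3.15}
\]
where $\| \xi_1|_{\{y_0 = 0\}} \|_{C^3(S)} \le K_0 \varepsilon$ and $\| \xi_2|_{\{y_0 = 0\}} \|_{C^3(S)} \le K_0 \varepsilon |\lambda_1|$.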
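Setting $y_0 = 0$, define $g(s_0, a) = s_\infty(s_0, 0, a)$. Substituting (3.12), (3.14), and (3.15) into (3.11), the value $g(s_0, a)$ is defined implicitly by
\[
\hat{t}(a) + 1 = \int_{s_0}^{g(s_0, a)} b_0(\tau) \, d\tau - \int_{\tilde{s}}^{\hat{s}} b_0(\tau) \, d\tau + \big\langle \tilde{z} + \xi_2(s_0, 0), \bar{d} \big\rangle
+ \hat{z} \cdot \int_{\hat{s}}^{\infty} \big( b_1(\tau)^T P(\tau) - \Sigma \big) e^{(\tau - \hat{s}) A} \, d\tau + \sum_{k=1}^{2} E_k(\infty). \tag{3.16}
\]
Rescaling $\bar{d}$, we define
\[
d = \Big( \frac{\Sigma_i \mu_i}{\sigma} \Big)_{i=1}^{n-1}, \qquad \Phi(s_0) = \Big\langle d, \int_{s_0}^{\tilde{s}} \frac{P^{-1}(\tau) \phi(\tau, 0)}{\psi_n^{(0)}(\tau)} \, d\tau \Big\rangle,
\]
giving
\[
\langle \tilde{z}, \bar{d} \rangle = \frac{\varepsilon \sigma}{|\lambda_1|} \Phi(s_0).
\]
The higher-order terms are given by
\[
\mathcal{E}_1 = E_1(\infty), \qquad \mathcal{E}_2 = E_2(\infty), \qquad \mathcal{E}_3 = \Big\langle \hat{z}, \int_{\hat{s}}^{\infty} \big( b_1(\tau)^T P(\tau) - \Sigma \big) e^{(\tau - \hat{s}) A} \, d\tau \Big\rangle,
\]
\[
\mathcal{E}_4 = - \int_{\tilde{s}}^{\hat{s}} b_0(\tau) \, d\tau, \qquad \mathcal{E}_5 = \big\langle \xi_2(s_0, 0), \bar{d} \big\rangle.
\]
Setting $\mathcal{E} = \sum_{k=1}^{5} \mathcal{E}_k$ and substituting into (3.16), we obtain the final form of the singular limit:
\[
\hat{t}(a) + 1 = \int_{s_0}^{g(s_0, a)} b_0(\tau) \, d\tau + \frac{\varepsilon \sigma}{|\lambda_1|} \Phi(s_0) + \mathcal{E}. \tag{3.17}
\]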
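Because $b_0 > 0$, the implicit relation (3.17) determines $g(s_0, a)$ uniquely, and it can be solved numerically by bisection. The following is a minimal sketch under stated assumptions: the error term is dropped, and `b0`, `phi`, the lower bound `b_min`, and all function names are hypothetical stand-ins chosen for illustration rather than quantities computed from an actual vector field. The value returned is a lift of $g$ to the real line, to be reduced modulo the period if needed.

```python
import math

def b0(tau):
    """Hypothetical positive, 2-periodic stand-in for b_0 (bounded below by 0.5)."""
    return 1.0 + 0.5 * math.cos(math.pi * tau)

def integral(func, lo, hi, steps=2000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = func(lo) + func(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0

def solve_g(s0, target, b_min=0.5, tol=1e-10):
    """Find g >= s0 such that the integral of b0 over [s0, g] equals target >= 0."""
    if target < 0:
        raise ValueError("target must be nonnegative in this sketch")
    lo, hi = s0, s0 + target / b_min   # b0 >= b_min puts the root in [lo, hi]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if integral(b0, s0, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def g_value(s0, t_hat, strength, phi):
    """Solve the leading-order singular limit with strength = eps*sigma/|lambda_1|."""
    return solve_g(s0, t_hat + 1.0 - strength * phi(s0))
```

A call such as `g_value(0.2, 0.5, 0.3, lambda s: math.sin(math.pi * s))` returns the upper limit of integration for one hypothetical choice of $\Phi$; increasing `strength` makes the dependence on $s_0$ through $\Phi$ more pronounced.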
Proposition 3.2. There exists a system constant K 1 > 0 such that the following hold: ε σε E1 C 3 (S) K 1 , |λ1 | |λ1 | σε ε E2 C 3 (S) K 1 , |λ1 | σ σε E3 C 3 (S) K 1 (|λ1 |), |λ1 | σ ε |λ1 | E4 C 3 (S) K 1 , |λ1 | σ σε (|λ1 |). E5 C 3 (S) K 1 |λ1 |
4. Theory of Rank One Attractors

Let $D$ denote the closed unit disk in $\mathbb{R}^{n-1}$ and let $M = S^1 \times D$. We consider a family of maps $G_{a,b} : M \to M$, where $a = (a_1, \ldots, a_k) \in V$ is a vector of parameters and $b \in B_0$ is a scalar parameter. Here $V = V_1 \times \cdots \times V_k \subset \mathbb{R}^k$ is a product of intervals and $B_0 \subset \mathbb{R} \setminus \{0\}$ is a subset of $\mathbb{R}$ with an accumulation point at 0. Points in $M$ are denoted by $(x, y)$ with $x \in S^1$ and $y \in D$. Rank one theory postulates the following:

(H1) Regularity conditions.
(a) For each $b \in B_0$, the function $(x, y, a) \mapsto G_{a,b}(x, y)$ is $C^3$.
(b) Each map $G_{a,b}$ is an embedding of $M$ into itself.
(c) There exists $K_D > 0$ independent of $a$ and $b$ such that for all $a \in V$, $b \in B_0$, and $z, z' \in M$, we have
\[
\frac{|\det DG_{a,b}(z)|}{|\det DG_{a,b}(z')|} \le K_D.
\]

(H2) Existence of a singular limit. For $a \in V$, there exists a map $G_{a,0} : M \to S^1 \times \{0\}$ such that the following holds. For every $(x, y) \in M$ and $a \in V$, we have
\[
\lim_{b \to 0} G_{a,b}(x, y) = G_{a,0}(x, y).
\]
Identifying $S^1 \times \{0\}$ with $S^1$, we refer to $G_{a,0}$ and the restriction $f_a : S^1 \to S^1$ defined by $f_a(x) = G_{a,0}(x, 0)$ as the singular limit of $G_{a,b}$.

(H3) $C^3$ convergence to the singular limit. We select a special index $j \in \{1, \ldots, k\}$. Fix $a_i \in V_i$ for $i \neq j$. For every such choice of parameters $a_i$, the maps $(x, y, a_j) \mapsto G_{a,b}(x, y)$ converge in the $C^3$ topology to $(x, y, a_j) \mapsto G_{a,0}(x, y)$ on $M \times V_j$ as $b \to 0$.

(H4) Existence of a sufficiently expanding map within the singular limit. There exists $a^* = (a_1^*, \ldots, a_k^*) \in V$ such that $f_{a^*} \in \mathcal{M}$, where $\mathcal{M}$ is the set of Misiurewicz-type maps defined in Definition 4.1 below.

(H5) Parameter transversality. Let $C_{a^*}$ denote the critical set of $f_{a^*}$. For $a_j \in V_j$, define the vector $\tilde{a}_j \in V$ by $\tilde{a}_j = (a_1^*, \ldots, a_{j-1}^*, a_j, a_{j+1}^*, \ldots, a_k^*)$. We say that the family $\{f_a\}$ satisfies the parameter transversality condition with respect to parameter $a_j$ if the following holds. For each $x \in C_{a^*}$, let $p = f_{a^*}(x)$ and let $x(\tilde{a}_j)$ and $p(\tilde{a}_j)$ denote the continuations of $x$ and $p$, respectively, as the parameter $a_j$ varies around $a_j^*$. The point $p(\tilde{a}_j)$ is the unique point such that $p(\tilde{a}_j)$ and $p$ have identical symbolic itineraries under $f_{\tilde{a}_j}$ and $f_{a^*}$, respectively. We have
\[
\frac{d}{da_j} f_{\tilde{a}_j}(x(\tilde{a}_j)) \bigg|_{a_j = a_j^*} \neq \frac{d}{da_j} p(\tilde{a}_j) \bigg|_{a_j = a_j^*}.
\]

(H6) Nondegeneracy at 'turns'. For each $x \in C_{a^*}$, there exists $1 \le m \le n-1$ such that
\[
\frac{\partial}{\partial y_m} G_{a^*,0}(x, y) \bigg|_{y=0} \neq 0.
\]
(H7) Conditions for mixing.
(a) We have $e^{\frac{1}{3} \lambda_0} > 2$, where $\lambda_0$ is defined within Definition 4.1.
(b) Let $J_1, \ldots, J_r$ be the intervals of monotonicity of $f_{a^*}$. Let $Q = (q_{im})$ be the matrix of 'allowed transitions' defined by
\[
q_{im} = \begin{cases} 1, & \text{if } f_{a^*}(J_i) \supset J_m, \\ 0, & \text{otherwise.} \end{cases}
\]
There exists $N > 0$ such that $Q^N > 0$.

We now define the family $\mathcal{M}$.

Definition 4.1. We say that $f \in C^2(S^1, \mathbb{R})$ is a Misiurewicz map and we write $f \in \mathcal{M}$ if the following hold for some neighborhood $U$ of the critical set $C = C(f) = \{x \in S^1 : f'(x) = 0\}$:
(A) Outside of $U$. There exist $\lambda_0 > 0$, $M_0 \in \mathbb{Z}^+$, and $0 < d_0 \le 1$ such that
(1) for all $m \ge M_0$, if $f^i(x) \notin U$ for $0 \le i \le m-1$, then $|(f^m)'(x)| \ge e^{\lambda_0 m}$,
(2) for any $m \in \mathbb{Z}^+$, if $f^i(x) \notin U$ for $0 \le i \le m-1$ and $f^m(x) \in U$, then $|(f^m)'(x)| \ge d_0 e^{\lambda_0 m}$.
(B) Critical orbits. For all $c \in C$ and $i > 0$, $f^i(c) \notin U$.
(C) Inside $U$.
(1) We have $f''(x) \neq 0$ for all $x \in U$, and
(2) for all $x \in U \setminus C$, there exists $p_0(x) > 0$ such that $f^i(x) \notin U$ for all $i < p_0(x)$ and $|(f^{p_0(x)})'(x)| \ge d_0^{-1} e^{\frac{1}{3} \lambda_0 p_0(x)}$.

Rank one theory states that given a family $\{G_{a,b}\}$ satisfying (H1)–(H6), a measure-theoretically significant subset of this family consists of maps admitting attractors with strong chaotic and stochastic properties. We formulate the precise results and we then describe the properties that the attractors possess.

Theorem 4.2 ([15,18]). Suppose the family $\{G_{a,b}\}$ satisfies (H1), (H2), (H4), and (H6). The following holds for all $1 \le j \le k$ such that the parameter $a_j$ satisfies (H3) and (H5). For all sufficiently small $b \in B_0$, there exists a subset $\Delta_j \subset V_j$ of positive Lebesgue measure such that for $a_j \in \Delta_j$, $G_{\tilde{a}_j, b}$ admits a strange attractor $\Omega$ with properties (SA1), (SA2), and (SA3).

Theorem 4.3 ([15,16,18]). In the sense of Theorem 4.2, (H1)–(H7) ⇒ (SA1)–(SA4).

Remark 4.4. The proof of Theorem 4.2 for the special case $n = 2$ appears in [15]. The additional component (H7) ⇒ (SA4) in Theorem 4.3 is proved in [16]. For general $n$, Wang and Young [18] prove the existence of an SRB measure for $G_{\tilde{a}_j, b}$ if $a_j \in \Delta_j$. The complete proofs of (SA1)–(SA3) (and (SA4) assuming (H7)) for $G_{\tilde{a}_j, b}$ with $a_j \in \Delta_j$ will appear in [14] for general $n$.

We now describe (SA1)–(SA4) precisely. Write $G = G_{\tilde{a}_j, b}$.

(SA1) Positive Lyapunov exponent. Let $U$ denote the basin of attraction of the attractor $\Omega$. This means that $U$ is an open set satisfying $G(U) \subset U$ and
\[
\Omega = \bigcap_{m=0}^{\infty} G^m(U).
\]
For almost every $z \in U$ with respect to Lebesgue measure, the orbit of $z$ has a positive Lyapunov exponent. That is,
\[
\lim_{m \to \infty} \frac{1}{m} \log \| DG^m(z) \| > 0.
\]

(SA2) Existence of SRB measures and basin property.
(a) The map $G$ admits at least one and at most finitely many ergodic SRB measures, each one of which has no zero Lyapunov exponents. Let $\nu_1, \ldots, \nu_r$ denote these measures.
(b) For Lebesgue-a.e. $z \in U$, there exists $j(z) \in \{1, \ldots, r\}$ such that for every continuous function $\varphi : U \to \mathbb{R}$,
\[
\frac{1}{m} \sum_{i=0}^{m-1} \varphi(G^i(z)) \to \int \varphi \, d\nu_{j(z)}.
\]

(SA3) Statistical properties of dynamical observations.
(a) For every ergodic SRB measure $\nu$ and every Hölder continuous function $\varphi : \Omega \to \mathbb{R}$, the sequence $\{\varphi \circ G^i : i \in \mathbb{Z}^+\}$ obeys a central limit theorem. That is, if $\int \varphi \, d\nu = 0$, then the sequence
\[
\frac{1}{\sqrt{m}} \sum_{i=0}^{m-1} \varphi \circ G^i
\]
converges in distribution (with respect to $\nu$) to the normal distribution. The variance of the limiting normal distribution is strictly positive unless $\varphi = \psi \circ G - \psi$ for some $\psi \in L^2(\nu)$.
(b) Suppose that for some $N \ge 1$, $G^N$ has an SRB measure $\nu$ that is mixing. Then given a Hölder exponent $\eta$, there exists $\tau = \tau(\eta) < 1$ such that for all Hölder $\varphi, \psi : \Omega \to \mathbb{R}$ with Hölder exponent $\eta$, there exists $K = K(\varphi, \psi)$ such that for all $m \in \mathbb{N}$,
\[
\bigg| \int (\varphi \circ G^{mN}) \psi \, d\nu - \int \varphi \, d\nu \int \psi \, d\nu \bigg| \le K(\varphi, \psi) \tau^m.
\]

(SA4) Uniqueness of SRB measures and ergodic properties.
(a) The map $G$ admits a unique (and therefore ergodic) SRB measure $\nu$, and
(b) the dynamical system $(G, \nu)$ is mixing, or, equivalently, isomorphic to a Bernoulli shift.
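Condition (H7)(b) above asks that some power of the 0-1 transition matrix $Q$ be entrywise positive, that is, that $Q$ be primitive. A minimal sketch of such a check follows. The function names are hypothetical, the matrix is supplied by hand rather than derived from a map $f_{a^*}$, and the search is cut off at Wielandt's bound $(n-1)^2 + 1$, which is known to suffice for primitive $n \times n$ nonnegative matrices.

```python
def bool_product(a, b):
    """Boolean matrix product: entry is 1 when some intermediate index connects i to j."""
    n = len(a)
    return [[1 if any(a[i][k] and b[k][j] for k in range(n)) else 0
             for j in range(n)]
            for i in range(n)]

def first_positive_power(q):
    """Return the smallest N with Q^N entrywise positive, or None if Q is not primitive."""
    n = len(q)
    max_power = (n - 1) ** 2 + 1      # Wielandt's bound for primitive matrices
    power = [row[:] for row in q]     # holds the zero pattern of Q^N
    for exponent in range(1, max_power + 1):
        if all(all(row) for row in power):
            return exponent
        power = bool_product(power, q)
    return None
```

Only the zero pattern of $Q^N$ matters for the condition, so Boolean products are enough and avoid large integers. For example, the pattern `[[0, 1], [1, 1]]` becomes positive at the second power, while the permutation pattern `[[0, 1], [1, 0]]` never does.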
5. Verification of the Rank One Hypotheses We view the singular limit {G a,0 : a ∈ S} as a function of 3 parameters: ε, σ , and λ1 . We show that the family {G a,m −1 : a ∈ S, m ∈ Z+ } satisfies (H1)–(H7) if the parameters ε, σ , and λ1 satisfy certain scaling assumptions.
5.1. 1D analysis: Verification of (H4), (H5), and (H7). Recall that g(s, a) is defined implicitly by g(s,a) εσ b0 (τ ) dτ + tˆ(a) + 1 =
(s) + E. |λ 1| s Defining f a (s) = g(s, a), = becomes tˆ(a) + 1 =
εσ |λ1 | ,
and (s) = (s) + −1 E, the singular limit
f a (s)
b0 (τ ) dτ + (s).
(5.1)
s
For a map f : S → S and δ > 0, let C( f ) = {s : f (s) = 0} and let Cδ ( f ) = {s : |s − sˆ | < δ for some sˆ ∈ C( f )}. We assume the following about : there exist positive constants K 2 , d0 , d1 , and d2 , and a constant δ0 satisfying 0 < δ0 < 21 d1 , such that the following hold: (A1) (A2) (A3) (A4)
C 3 (S) < K 2 , | (s)| > d0 for s ∈ Cδ0 (), If (s1 ) = (s2 ) = 0 and s1 = s2 , then |s1 − s2 | > d1 , | (s)| > d2 for s ∈ S \ Cδ0 ().
Because is a Morse function, Proposition 3.2 implies that Assumptions (A1)–(A4) are satisfied if σ 1, |λ1 | is sufficiently small, and |λε1 | is sufficiently small. We now compare the map f a to the map . Let {v¯1 , . . . , v¯q0 } be the set of critical 3 points of . Set ξ = − 4 . Lemma 5.1. There exists 0 > 0 and positive constants K 3 , K 4 , and K 5 such that the following hold for fixed > 0 : (a) C( f a ) = {v1 , . . . , vq0 } with |vi − v¯i | < K 3 −1 for 1 i q0 , (b) | f a (s)| > K 4 for all s ∈ Cξ ( f a ), 1 (c) | f a (s)| > K 5 4 for all s ∈ S \ C 1 ξ ( f a ). 2
Proof of Lemma 5.1. Differentiating (5.1) with respect to s, we obtain b0 (s) − (s) = b0 ( f a (s)) f a (s).
(5.2)
Setting f a (s) = 0 gives b0 (s) = (s). Since b0 is bounded above and bounded away from 0, (A2)–(A4) imply (a). Solving for f a (s), we have f a (s) =
b0 (s) − (s) . b0 ( f a (s))
(5.3)
On S \ C 1 ξ ( f a ) we have | (s)| > K ξ using (a), (A2), and (A4). Estimate (c) now 2 follows from (5.3). Differentiating (5.2) with respect to s, we obtain b0 (s) − (s) − b0 ( f a (s))[ f a (s)]2 = b0 ( f a (s)) f a (s). | (s)|
(5.4) | f a (s)|
For all s ∈ Cξ ( f a ), we have < K ξ by (A1) and (a). This implies that < 1 K 4 on Cξ using (5.3). Therefore the second term on the left side of (5.4) dominates and (b) holds.
5.1.1. Critical curves. Assume > 0 and let ⊂ S be a parameter interval. For a ∈ , we have C( f a ) = {v1 (a), . . . , vq0 (a)} by Lemma 5.1. Write γ (i) (a) = vi (a) for 1 i q0 . For 1 k q0 and i ∈ N, define γi(k) (a) := f ai (γ (k) (a)). Differentiating γ1(k) (a) = f a (γ (k) (a)) = f (γ (k) (a), a) with respect to a, we have d (k) ∂ f (k) d (k) ∂ f (k) γ (a) = (γ (a), a) · γ (a) + (γ (a), a) da 1 ∂s da ∂a ∂ f (k) = (γ (a), a). ∂a Differentiating (5.1) with respect to a and using the fact that
d ˆ da t (a)
(5.5)
= b0 (a), we obtain
b0 (a) ∂ f (s, a) = . ∂a b0 ( f (s, a))
(5.6)
Thus d (k) mins∈S b0 (s) γ1 (a) > 0. da maxs∈S b0 (s) More generally, an estimate on
d (k) da γi+1 (a)
for i ∈ N follows from the recursive formula
d (k) ∂ f (k) d (k) ∂ f (k) γi+1 (a) = (γi (a), a) · γi (a) + (γ (a), a). da ∂s da ∂a i
(5.7)
Lemma 5.2 (Growth estimate for derivatives of critical curves). There exists 1 0 such that the following holds for all > 1 . For any k ∈ {1, . . . , q0 } and i ∈ N such that γ j(k) (a) ∈ S \ Cξ () for all 1 j i, then i d (k) γ (a) > K 5 14 > 5i . da i+1 2
(5.8)
Proof of Lemma 5.2. Estimate (5.8) follows from (5.7), estimate (c) from Lemma 5.1, and the fact that for all s ∈ S and a ∈ we have ∂ maxs∈S b0 (s) f (s, a) =: K 6 . ∂a mins∈S b0 (s) Lemma 5.3 (Distortion estimate for critical curves). There exists 2 1 and D1 > 0 such that the following holds for all > 2 . For any k ∈ {1, . . . , q0 } and any n 2, let be a parameter interval such that (k)
(a) γi () ⊂ S \ Cξ () for 1 i n − 1, and (k) (b) (γn−1 ()) < ξ ( denotes Lebesgue measure on S). Then for all a, aˆ ∈ , we have
d γ (k) (a) da n d (k) < D1 . γn (a) ˆ da
If n = 1, then (5.9) holds for all k ∈ {1, . . . , q0 } and for all a, aˆ ∈ S.
(5.9)
Proof of Lemma 5.3. For n = 1 and a, aˆ ∈ S, the estimate d γ (k) (a) da 1 d (k) < K 62 γ (a) ˆ da 1 (k)
(k)
follows from (5.5) and (5.6). For n 2 and a, aˆ ∈ , let si = γi (a) and sˆi = γi (a). ˆ We have ∂ d d s f (s ) · d s da i a i−1 da i−1 + ∂a f a (si−1 ) f a (si−1 ) · da si−1 − 15 (i−1) 1 + O( = ) . d = sˆi f (ˆsi−1 ) · d sˆi−1 + ∂ f aˆ (ˆsi−1 ) f (ˆsi−1 ) · d sˆi−1 aˆ
da
∂a
da
aˆ
da
This implies the estimate n−1 n−1 ds ds f (s ) i da 1 da n a i log d = log d + log log 1 + O(− 5 ) + sˆ1 sˆn f aˆ (ˆsi ) da da i=1
i=1
n−1 | f a (si ) − f aˆ (ˆsi )|
| f aˆ (ˆsi )|
i=1
+ O(1).
The equality
1 | f a (si ) − f aˆ (ˆsi )| = (b0 (si+1 ) − b0 (ˆsi+1 )) f aˆ (ˆsi ) b0 ( f a (si ))
+ ( (si ) − (ˆsi )) + (b0 (ˆsi ) − b0 (si ))
implies the estimate n−1 n−1 n−1 ds 3 1 da n log d K |si+1 − sˆi+1 | + K 4 |si − sˆi | + K − 4 |ˆsi − si | + O(1) sˆn da
i=1
i=1
= K |sn − sˆn | +
n−1
i=1
|si − sˆi |
i=2 3
+K 4
n−1
1
|si − sˆi | + K − 4
i=1
n−1
|ˆsi − si | + O(1)
i=1
K |sn − sˆn | + K |sn−1 − sˆn−1 |
n−3
1
− 5 i
i=0 3
1
+K |sn−1 − sˆn−1 |( 4 + − 4 )
n−2
1
− 5 i + O(1)
i=0
= O(1).
5.1.2. Verification of (H4): Definition 4.1(B) We prove the existence of a parameter a ∗ such that f a ∗ satisfies Definition 4.1(B). We will then show that if is sufficiently large, then for any parameter a, if f a satisfies Definition 4.1(B), then f a ∈ M. Proposition 5.4. There exists 3 2 such that if 3 and ⊂ S is a parameter interval satisfying () = 3D1 K 6 q0 ξ , then there exists a ∗ ∈ such that for all c ∈ C( f a ∗ ), f an∗ (c) ∈ S \ Cξ () for all n ∈ N. Proof of Proposition 5.4. We inductively construct a nested sequence of parameter inter∞ vals = 0 ⊃ 1 ⊃ 2 ⊃ · · · such that a ∗ ∈ i=0 i has the desired property. Definition 5.5. The (q0 + 1)-tuple (n ; i 1,n , . . . , i q0 ,n ) is called an admissible configuration if n is a subinterval of 0 and if for every k ∈ {1, . . . , q0 }, i k,n n and the following conditions are satisfied. (k)
(M1) γi (n ) ∩ Cξ () = ∅ for all i i k,n (M2) For all a, aˆ ∈ n , we have the distortion estimate d (k) da γik,n (a) d (k) < D1 . ˆ da γik,n (a) (k) (M3) γik,n +1 (n ) 3D1 q0 ξ We inductively construct admissible configurations for all n ∈ N such that i k,n → ∞ as n → ∞ for every k. We begin with n = 1. Let d˜ :=
min |s − t|.
s,t∈C() s =t
˜ Let i k,1 = 1 for all k. We choose 1 as follows. We We assume that 3D1 K 62 q0 ξ < 21 d. have d (k) b0 (a) γ1 (a) = , (k) da b0 (γ1 (a)) so (k)
3D1 q0 ξ (γ1 (0 )) 3D1 K 62 q0 ξ
0 such that for sufficiently large and a ∗ as in Proposition 5.4, we have the following. For c ∈ C( f a ∗ ) and s ∈ S satisfying |s − c| 11 − 12 , let m(s) be the smallest value of m ∈ Z+ such that | f am∗ (s) − f am∗ (c)| > 21 ξ . Then m(s) > 1 and |( f am(s) ) (s)| (K 7 ) ∗
m(s) 16
.
Proof of Proposition 5.6. We begin with a spatial distortion lemma. Lemma 5.7 (Spatial distortion estimate). There exists D2 1 such that the following holds for all a ∈ S. For s, sˆ ∈ S, let m ∈ Z+ be such that πi , the segment between f ai (s) and f ai (ˆs ), satisfies (πi ) < 21 ξ and πi ∩ C 1 ξ ( f a ) = ∅ for all 0 i < m. Then 2 m ( f a ) (s) ( f m ) (ˆs ) D2 . a Proof of Lemma 5.7. Writing si = f ai (s) and sˆi = f ai (ˆs ) and using Lemma 5.1 and its proof, we have m m−1 ( f a ) (s) f a (si ) log m = log ( f a ) (ˆs ) f a (ˆsi ) i=0
m−1 i=0
| f a (si ) − f a (ˆsi )| | f a (ˆsi )|
1 K 5−1 − 4
1
K6 +
3
C 0 (S) minw∈S b0 (w)
(K − 4 + K 4 )|sm−1 − sˆm−1 |
m−1
|si − sˆi |
i=0 m−1
1
(K 5 4 )−i
i=0
= O(1).
Returning to the proof of Proposition 5.6, write f = f a ∗ . We first show that m(s) > 1. We have 1 | f (ζ )|(s − c)2 2
| f (s) − f (c)| = 11
for some ζ satisfying |ζ − c| − 12 . Arguing as in the proof of Lemma 5.1, | f (ζ )| K . Therefore 5
| f (s) − f (c)| K − 6
ξ 2
for sufficiently large. Now assume m(s) > 1. Using Lemma 5.7, we have ξ < | f m(s) (s) − f m(s) (c)| 2 = |( f m(s)−1 ) (ζ1 )| · | f (s) − f (c)| D2 |( f
(for some ζ1 between f (s) and f (c))
m(s)−1
) ( f (c))| · | f (s) − f (c)|,
and therefore ξ < D2 |( f m(s)−1 ) ( f (c))| · | f (ζ )| · (s − c)2 .
(5.10)
Reversing inequality (5.10) at time m(s) − 1, we have ξ D2−1 |( f m(s)−2 ) ( f (c))| · | f (ζ )| · (s − c)2 .
(5.11)
Estimating |( f m(s)−1 ) ( f (c))| from below using (5.10) gives |( f m(s) ) (s)| = | f (s) − f (c)| · |( f m(s)−1 ) ( f (s))| D2−1 |( f m(s)−1 ) ( f (c))|·| f (ζ4 )|·|s −c| (for some ζ4 between s and c) | f (ζ4 )| ξ . (5.12) 2 D2 |s − c| | f (ζ )|
Arguing as in the proof of Lemma 5.1, || ff (ζ(ζ4)|)| K > 0 since ζ4 and ζ are between s and c. Using this fact and estimating |s − c|−1 from below using (5.11), (5.12) implies |( f
Kξ ) (s)| 2 D2
m(s)
D2−1 |( f m(s)−2 ) ( f (c))| · | f (ζ )| ξ
(K )
m(s) 1 8 −8
(K )
m(s) 16
1 2
.
5.1.4. Verification of (H5) and (H7). The following lemma facilitates the verification of (H5). Lemma 5.8 ([11,12]). Let f = f a ∗ . Suppose that for all x ∈ C( f a ∗ ), we have ∞ k=0
1 < ∞. |( f k ) ( f (x))|
Then for each x ∈ C( f a ∗ ), ! " ∞ d [(∂a f a )( f k (x))]a=a ∗ d = f p(a) (x(a)) − . a ( f k ) ( f (x)) da da a=a ∗ k=0
Property (H5) follows from Lemma 5.8 for sufficiently large. To see this, suppose f a ∗ ∈ M and let c ∈ C( f a ∗ ). For k ∈ Z+ , we have 1 k |( f ak∗ ) ( f (c))| K 5 4 by Lemma 5.1(c). Since K 6−1 large, then
∂ ∂a
f (s, a) K 6 , we conclude that if is sufficiently
∞ [∂a f a ( f ak∗ (c))]a=a ∗ k=0
( f ak∗ ) ( f a ∗ (c))
K 6−1
−
∞ k=1
K6 1
K5 4
k > 0.
Property (H7) follows from Lemma 5.1 and Proposition 5.6 provided is sufficiently large. Acknowledgements Mikko Stenlund was partially supported by the Academy of Finland. William Ott has been partially supported by NSF grant DMS-0603509.
Appendix A. Some Proofs

We assume throughout Sect. A that $L = 1$. Notice that if $V$ denotes a vector field, then
\[
\frac{dz}{ds} = V \quad \Rightarrow \quad \frac{d\|z\|}{ds} = \frac{1}{2\|z\|} \frac{d\|z\|^2}{ds} = \frac{z}{\|z\|} \cdot \frac{dz}{ds} = \frac{z}{\|z\|} \cdot V. \tag{A.1}
\]
We will use this fact together with the following Grönwall-type inequality:

Lemma A.1. Assume that $\beta$ is a constant, the function $\varphi$ is continuous on the interval $[\hat{s}, \check{s}]$, and that the function $u$ is differentiable and satisfies $\frac{du}{ds} \le \beta u + \varphi$ on $(\hat{s}, \check{s})$. Then, for all $s \in (\hat{s}, \check{s})$,
\[
u(s) \le u(\hat{s}) e^{\beta (s - \hat{s})} + \int_{\hat{s}}^{s} e^{\beta (s - \tau)} \varphi(\tau) \, d\tau.
\]
Proof. Suppose $v(s) = u(\hat{s}) e^{\beta (s - \hat{s})} + \int_{\hat{s}}^{s} e^{\beta (s - \tau)} \varphi(\tau) \, d\tau$. Then $v$ satisfies the equation $\frac{dv}{ds}(s) = \beta v(s) + \varphi(s)$ with $v(\hat{s}) = u(\hat{s})$. Since $u - v$ is differentiable, $\frac{d}{ds}(u - v) \le \beta (u - v)$, and $(u - v)(\hat{s}) = 0$, a standard Grönwall argument shows that $u \le v$.

We get immediately
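Corollary A.2. Suppose that in Lemma A.1 $\frac{du}{ds}(s) \le \frac{\lambda_1}{2} u + C_0 e^{\lambda_1 (s - \hat{s})}$. Then
\[
u(s) \le \Big( u(\hat{s}) + \frac{2 C_0}{|\lambda_1|} \Big) e^{\frac{\lambda_1}{2} (s - \hat{s})}.
\]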
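The bound in Corollary A.2 can be checked numerically in the extremal case where the differential inequality holds with equality. The sketch below is illustrative only: it assumes $\lambda_1 < 0$ and $\hat{s} = 0$, uses a hand-written fourth-order Runge-Kutta integrator, and picks arbitrary parameter values and function names that do not come from the paper.

```python
import math

def rk4(rhs, u0, s0, s1, steps):
    """Integrate du/ds = rhs(s, u) with the classical Runge-Kutta scheme."""
    h = (s1 - s0) / steps
    s, u = s0, u0
    values = [(s, u)]
    for _ in range(steps):
        k1 = rhs(s, u)
        k2 = rhs(s + h / 2, u + h * k1 / 2)
        k3 = rhs(s + h / 2, u + h * k2 / 2)
        k4 = rhs(s + h, u + h * k3)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += h
        values.append((s, u))
    return values

def check_corollary(lam1=-1.0, c0=2.0, u0=0.5, length=5.0, steps=500):
    """Return True if the equality-case solution stays below the Corollary A.2 bound."""
    if lam1 >= 0:
        raise ValueError("this sketch assumes lam1 < 0")

    def rhs(s, u):
        # Equality case of the hypothesis, with s_hat = 0.
        return 0.5 * lam1 * u + c0 * math.exp(lam1 * s)

    def bound(s):
        return (u0 + 2.0 * c0 / abs(lam1)) * math.exp(0.5 * lam1 * s)

    return all(u <= bound(s) + 1e-9 for s, u in rk4(rhs, u0, 0.0, length, steps))
```

Since any function satisfying the inequality lies below the equality-case solution (by Lemma A.1), a passing check for the equality case is consistent with the corollary for the chosen parameters; it is of course not a proof.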
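Our first application of a Grönwall inequality is

Lemma A.3. Assume $z$ solves the forced equation (3.8b) with $z(s_0) = z_0$ and fix a constant $K > 0$. If $\varepsilon / |\lambda_1|$ is sufficiently small,
\[
\| \partial_s^m \partial_{s_0}^l z(s) \| \le C \varepsilon \qquad (0 \le l + m \le 3) \tag{A.2}
\]
as long as $s - s_0 \le K$. Moreover,
\[
\| \partial_{z_0} z - 1 \| \le C |\lambda_1|. \tag{A.3}
\]
Proof. Equation (3.8b) reads
\[
\frac{dz}{ds} = A z + h_3(s, z) \quad \text{with} \quad h_3(s, z) = O_s(\varepsilon) + O_{s,z}(\varepsilon \|z\|) + O_{s,z}(\varepsilon^2) + O_{s,z}(\|z\|^2). \tag{A.4}
\]
Assuming $\|z\| / |\lambda_1|$ and $\varepsilon / |\lambda_1|$ are sufficiently small, (A.1) implies
\[
\frac{d\|z\|}{ds} \le \frac{\lambda_1}{2} \|z\| + C_0 \varepsilon.
\]
By Lemma A.1,
\[
\|z(s)\| \le \|z_0\| e^{\frac{\lambda_1}{2} (s - s_0)} + \frac{2 C_0 \varepsilon}{|\lambda_1|} \Big( 1 - e^{\frac{\lambda_1}{2} (s - s_0)} \Big) \le \|z_0\| e^{\frac{\lambda_1}{2} (s - s_0)} + C_0 \varepsilon (s - s_0).
\]
For $s - s_0 \le K$ we get $\|z(s)\| / |\lambda_1| \le \|z_0\| / |\lambda_1| + C_0 K \varepsilon / |\lambda_1|$, which proves the assumption legitimate. Differentiating (A.4) with respect to $s$ up to two times yields an expression for $\partial_s^m z(s)$. One immediately obtains $\| \partial_s^m z(s) \| \le C \varepsilon$ for $0 \le m \le 3$ and $s - s_0 \le K$. Equation (A.4) implies
\[
z(s) = e^{(s - s_0) A} z_0 + \int_{s_0}^{s} e^{(s - \tau) A} h_3(\tau, z(\tau)) \, d\tau. \tag{A.5}
\]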
Differentiating this with respect to s0 up to three times and evaluating at s = s0 yields ∂s0 z(s0 ) = −Az0 − h3 (s0 , z0 ), ∂s20 z(s0 ) = A2 z0 + Ah3 (s0 , z0 ) − ∂s h3 (s0 , z0 ) − Dh3 (s0 , z0 )∂s0 z(s0 ), ∂s30 z(s0 ) = −A3 z0 − A2 h3 (s0 , z0 ) + 2A∂s h3 (s0 , z0 ) −∂s2 h3 (s0 , z0 ) + ADh3 (s0 , z0 )∂s0 z(s0 ) −D(∂s h3 )(s0 , z0 )∂s0 z(s0 ) − Dh3 (s0 , z0 )
d ∂s z(s0 ) ds0 0
−D 2 h3 (s0 , z0 )(∂s0 z(s0 ), ∂s0 z(s0 )) − Dh3 (s0 , z0 )∂s20 z(s0 ).
Clearly, for 1 l 3, ∂sl0 z(s0 ) Cε. Such initial conditions are needed for analyzing the variational equations d (A.6) ∂s z = (A + Dh3 (s, z)) ∂s0 z, ds 0 d 2 ∂ z = (A + Dh3 (s, z)) ∂s20 z + D 2 h3 (s, z)(∂s0 z, ∂s0 z), (A.7) ds s0 d 3 ∂ z = (A + Dh3 (s, z)) ∂s30 z + 3D 2 h3 (s, z)(∂s20 z, ∂s0 z) + D 3 h3 (s, z)(∂s0 z, ∂s0 z, ∂s0 z). ds s0 One then checks recursively, using (A.1) and Corollary A.2, that ∂sl0 z(s) Cεe
λ1 2 (s−s0 )
Cε
hold for 1 l 3 and s − s0 K . Equations (A.6) and (A.7) provide us with an expression for ∂s ∂sl0 z(s) with l = 1 and l = 2. Moreover, (A.6) can be differentiated with respect to s to yield an expression for ∂s2 ∂s0 z(s). The bounds in (A.2) are then readily obtained. Finally, we will prove (A.3). To this end, notice that d ∂z z = (A + Dh3 (s, z)) ∂z0 z. ds 0
(A.8)
In particular, each row, ∂z0,i z, of ∂z0 z satisfies this equation. Hence, by principle (A.1), λ1 d ds ∂z0,i z 2 ∂z0,i z, so that the matrix ∂z0 z remains perpetually bounded. Integrating both sides of (A.8) from s0 to s and recalling ∂z0 z(s0 ) = 1 gives ∂z0 z(s) − 1 |s − s0 | |λn−1 | + sup Dh3 (s , z) sup ∂z0 z(s ) , s0 s s
s0 s s
where · now denotes the matrix norm induced by the Euclidean norm. This estimate implies (A.3). Proof of Proposition 3.1. Throughout the proof, · C 3 will stand for the C 3 -norm with respect to s0 . By (3.8a) and (3.14), s˜ and sˆ have to satisfy sˆ s˜ b0 (τ ) dτ = ρ = b0 (τ ) + v(τ ) dτ, (A.9) s0
s0
where v(s) = bT1 (s)P(s)z(s) + Os,z (ε) + Os,z (εz(s)) + Os,z (z(s)2 ) and z = z(s) solves (3.8b) with z(s0 ) = z0 . We theorem to find s˜ . Clearly, F : R × R → R : (s0 , s) → s use the implicit function 3 . Observe that F(s , s ) = −ρ and lim b (τ ) dτ − ρ is C 0 0 0 s→∞ F(s0 , s) = ∞ as s0 min b0 = m > 0. By the intermediate value theorem, there exists a number s˜ such ∂ that F(s0 , s˜ ) = 0. Because ∂s F(s0 , s) = b0 (s) m, the implicit function theorem implies that s˜ is a C 3 -function of s0 . Notice that F(s0 + 1, s + 1) ≡ F(s0 , s), so that s˜ (s0 + 1) = s˜ (s0 ) + 1 which implies that s0 → s˜ (s0 ) − s0 is periodic.
Now that we have s˜ , let us define the function g(ξ ) := −ρ +
s˜ +ξ
b0 (τ ) + v(τ ) dτ.
s0
Notice that, denoting ξ1 = sˆ − s˜ , the right side of (A.9) is equivalent to g(ξ1 ) = 0. The Taylor expansion g(ξ ) = g(0) + g (0)ξ + δ2 g(ξ ) yields G(ξ ) := −
1 (g(0) + δ2 g(ξ )) = ξ, g (0)
which we regard, for all fixed z0 , as a fixed point equation on the space of C 3 functions ξ = ξ(s0 ). Assuming G is a contraction in a closed, origin-centered, ball B¯ r ⊂ C 3 of radius r , there exists a unique solution, ξ1 , to G(ξ ) = ξ inside the ball. Next, we prove that for a suitably small value of r , G is indeed a contraction. First, notice that g(0) =
s˜
v(τ ) dτ = (˜s − s0 )
s0
1
v((1 − τ )s0 + τ s˜ ) dτ,
0
g (0) = b0 (˜s ) + v(˜s ), 1 1 2 2 δ2 g(ξ ) = ξ (1 − τ ) g (ξ τ ) dτ = ξ (1 − τ ) (b0 + v )(˜s + ξ τ ) dτ 0
0
are smooth functions of s0 . Because s˜ is C 3 in s0 and inf s0 g (0) > 0, the bounds (A.2) yield # # # 1 # # # # g (0) #
C3
# # # g(0) # # C and # # Cε. g (0) #C 3
Moreover, 2 δ2 g(ξ )C 3 Cξ C s + ζ )C 3 . 3 sup (b0 + v )(˜ ζ ∈ B¯ r
Hence, G(ξ )C 3 C0 (ε +r 2 ) for some C0 . Choosing r = 2C0 ε, we have G( B¯ r ) ⊂ B¯ r for ε small enough. Second, let ξ 1 and ξ 2 be elements of B¯ r . Since the map ξ → G(ξ ) is differentiable and the operator norm of the derivative obeys the bound supξ ∈ B¯ r DG(ξ )L(C 3 ) C supξ ∈ B¯ r Dδ2 g(ξ )L(C 3 ) Cr , the mean value theorem yields G(ξ 1 )−G(ξ 2 )C 3 Cr ξ 1 − ξ 2 C 3 . Hence, G is a contraction on B¯ r if ε is sufficiently small. We will now prove that the fixed point, ξ1 , of G is a periodic function of s0 . Let us denote z(s, s0 , z0 ) the solution and v(s)|s0 the function v defined above, when the initial condition z(s0 ) = z0 is being used. Because h3 (s + 2, z) = h3 (s, z) in (A.4), we have z(s + 2, s0 + 2, z0 ) = z(s, s0 , z0 ) and v(s + 2)|s0 +2 = v(s)|s0 . Since g(ξ1 ) = 0 for all
values of s0 and s˜ (s0 + 2) = s˜ (s0 ) + 2, the computation g(ξ1 )(s0 + 2) = −ρ + = −ρ + = −ρ + = −ρ +
s˜ (s0 +2)+ξ1 (s0 +2) s0 +2 s˜ (s0 )+2+ξ1 (s0 +2) s0 +2 s˜ (s0 )+ξ1 (s0 +2) s0 s˜ (s0 )+ξ1 (s0 +2) s0
= g(ξ1 )(s0 ) +
b0 (τ ) + v(τ )|s0 +2 dτ b0 (τ ) + v(τ )|s0 +2 dτ
b0 (τ + 2) + v(τ + 2)|s0 +2 dτ b0 (τ ) + v(τ )|s0 dτ
s˜ (s0 )+ξ1 (s0 +2)
s˜ (s0 )+ξ1 (s0 )
b0 (τ ) + v(τ )|s0 dτ,
implies that the last integral vanishes despite the fact that the integrand is positive, so we must have $\xi_1(s_0+2)=\xi_1(s_0)$.

As the last step, we will bound the difference $\hat z-\tilde z$. Let $z^{(1)}$ and $z^{(2)}$ solve (3.8b) and (3.13b), respectively, with the initial condition $z^{(1)}(s_0)=z^{(2)}(s_0)=z_0$. Both of these are $C^3$ functions of $(s_0,z_0)$ by the smoothness of the vector fields. By definition, $\hat z=z^{(1)}(\hat s)$ and $\tilde z=z^{(2)}(\tilde s)$. We need a bound on the $C^3$ norm of the difference $\xi^2(s_0)=\hat z-\tilde z$ for fixed $z_0$. Notice that
\[
\xi^2(s_0) = (z^{(1)}-z^{(2)})(\hat s) + \bigl(z^{(2)}(\hat s)-z^{(2)}(\tilde s)\bigr).
\]
Observe that the difference $\delta=z^{(1)}-z^{(2)}$ satisfies the differential equation
\[
\frac{d\delta}{ds} = Az^{(1)} + O_{s,z^{(1)}}(\varepsilon z^{(1)}) + O_{s,z^{(1)}}(\varepsilon^2) + O_{s,z^{(1)}}(\|z^{(1)}\|^2) = w.
\]
Here $z^{(1)}$, and hence $w$, is to be regarded as a predetermined function for which we already have good bounds. Indeed, let $S=\{(s_0,s) : 0\le s_0<2,\ s_0\le s\le K\}$ and let $\|\cdot\|_{C^3_S}$ stand for the $C^3$ norm on this set. According to (A.2),
\[
\|w-Az^{(1)}\|_{C^3_S} \le C\varepsilon^2,
\]
whereas, recalling that all eigenvalues of $A$ are proportional to $\lambda_1$,
\[
\|Az^{(1)}\|_{C^3_S} \le C|\lambda_1|\varepsilon.
\]
In other words, $\|w\|_{C^3_S}\le C|\lambda_1|\varepsilon$. As $\delta(s_0)=0$, we have
\[
(z^{(1)}-z^{(2)})(\hat s) = \delta(\hat s) = \int_{s_0}^{\hat s} w(\tau)\,d\tau.
\]
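Because $\hat s$ is $C^3$ in $s_0$, it follows that $\|(z^{(1)}-z^{(2)})(\hat s)\|_{C^3}\le C|\lambda_1|\varepsilon$. By (3.13b), the remaining contribution reads
\[
z^{(2)}(\hat s)-z^{(2)}(\tilde s) = \int_{\tilde s}^{\hat s} \varepsilon P^{-1}(\tau)\,\phi(0,\tau)\,\psi_n^{(0)}(\tau)\,d\tau.
\]
We have seen above that $\|\hat s-\tilde s\|_{C^3}\le C\varepsilon$, which implies $\|z^{(2)}(\hat s)-z^{(2)}(\tilde s)\|_{C^3}\le C\varepsilon^2$ and finally $\|\xi^2(s_0)\|_{C^3}\le C|\lambda_1|\varepsilon$.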
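Remark A.4. It follows from the previous proof that, under the conditions of Proposition 3.1,
\[
\|\partial_{z_0}\hat s\| \le C. \tag{A.10}
\]
Indeed, $\partial_{z_0}\hat s=\partial_{z_0}\xi_1$, as $\partial_{z_0}\tilde s=0$. From the fixed point equation $\xi_1=G(\xi_1)$ we get $\partial_{z_0}\xi_1=(1-DG(\xi_1))^{-1}(\partial_{z_0}G)(\xi_1)$ and then the claimed bound. Moreover,
\[
\|\partial_{z_0}\hat z-1\| \le C|\lambda_1|. \tag{A.11}
\]
Let $z(s)=z(s;s_0,z_0)$ be the solution to (3.8b) with $z(s_0;s_0,z_0)=z_0$ and recall that $\hat s$ depends on $(s_0,z_0)$. By definition, $\hat z=z(\hat s;s_0,z_0)$, so that $\partial_{z_0}\hat z=\partial_s z(\hat s)\,\partial_{z_0}\hat s+\partial_{z_0}z(\hat s)=1+O(\lambda_1)$ by the bounds in Lemma A.3.

Let us view the solution
\[
z(s)=z(s,\hat s,\hat z), \qquad z(\hat s)\equiv\hat z \tag{A.12}
\]
to Eq. (3.7b) as a function of three variables and abbreviate $\partial_s=\partial/\partial s$, $\hat\partial_i=\partial/\partial\hat z_i$, $\hat\partial_{i_1\cdots i_k}=\hat\partial_{i_1}\cdots\hat\partial_{i_k}$, and $\partial_{\hat s}=\partial/\partial\hat s$.

Proposition A.5. Assuming $\|\hat z\|/|\lambda_1|$ is small enough, we have, for $0\le k+l+m\le 3$ and $s\ge\hat s$, the following bounds:
\[
\|\partial_s^m\partial_{\hat s}^l z(s)\| \le C\|\hat z\|\,e^{\frac{\lambda_1}{2}(s-\hat s)},
\qquad
\|\partial_s^m\partial_{\hat s}^l\hat\partial_{i_1\cdots i_k}z(s)\| \le \frac{C}{|\lambda_1|^{k-1}}\,e^{\frac{\lambda_1}{2}(s-\hat s)} \quad (k>0).
\]

Proof. The initial conditions $(\partial z/\partial\hat z)(\hat s)=1$, $\hat\partial_{ij}z(\hat s)=0$, and $\hat\partial_{ijk}z(\hat s)=0$ follow from (A.12), as the $\hat z$-derivatives can be computed after evaluating $z$ at $s=\hat s$. Similarly, taking $\hat s$-derivatives of
\[
z(s) = e^{(s-\hat s)A}\Bigl(\hat z + \int_{\hat s}^{s} e^{-(\tau-\hat s)A}\,h_1(\tau,z(\tau))\,d\tau\Bigr)
\]
yields first, analogously to how the identities below (A.5) were obtained,
\[
\begin{aligned}
\partial_{\hat s}z(\hat s) &= -A\hat z-h_1(\hat s,\hat z),\\
\partial_{\hat s}^2z(\hat s) &= A^2\hat z+Ah_1(\hat s,\hat z)-\partial_sh_1(\hat s,\hat z)-Dh_1(\hat s,\hat z)\,\partial_{\hat s}z(\hat s),\\
\partial_{\hat s}^3z(\hat s) &= -A^3\hat z-A^2h_1(\hat s,\hat z)+2A\,\partial_sh_1(\hat s,\hat z)-\partial_s^2h_1(\hat s,\hat z)+A\,Dh_1(\hat s,\hat z)\,\partial_{\hat s}z(\hat s)\\
&\quad -D(\partial_sh_1)(\hat s,\hat z)\,\partial_{\hat s}z(\hat s)-Dh_1(\hat s,\hat z)\,\frac{d}{d\hat s}\partial_{\hat s}z(\hat s)-D^2h_1(\hat s,\hat z)\bigl(\partial_{\hat s}z(\hat s),\partial_{\hat s}z(\hat s)\bigr)\\
&\quad -Dh_1(\hat s,\hat z)\,\partial_{\hat s}^2z(\hat s).
\end{aligned}
\]
These formulas can then be differentiated with respect to $\hat z$ in order to find higher-order initial conditions. As $h_1(s,z)=O(\|z\|^2)$, we obtain the following estimates:
\[
\begin{aligned}
\|\partial_{\hat s}^l z(\hat s)\| &\le C\|\hat z\|, && l=1,2,3,\\
\|\partial_{\hat s}^l\hat\partial_i z(\hat s)\| &\le C, && l=1,2,\\
\|\partial_{\hat s}\hat\partial_{ij}z(\hat s)\| &\le C.
\end{aligned}
\]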
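Combining (3.7b) and (A.1),
\[
\frac{d}{ds}\|z\| = \frac{z\cdot Az+z\cdot h_1(s,z)}{\|z\|} \le \lambda_1\|z\|+\|h_1(s,z)\| \le \frac{\lambda_1}{2}\|z\|
\]
if $\|z\|/|\lambda_1|$ is small enough. Below, $\|\hat z\|/|\lambda_1|$ will always be assumed small enough. Thus, for all $s>\hat s$,
\[
\|z(s)\| \le \|\hat z\|\,e^{\frac{\lambda_1}{2}(s-\hat s)}. \tag{A.13}
\]
Differentiating (3.7b) with respect to various components of $\hat z$, we obtain the variational equations
\[
\frac{d}{ds}\hat\partial_i z = (A+Dh_1(s,z))\,\hat\partial_i z, \tag{A.14}
\]
\[
\frac{d}{ds}\hat\partial_{ij}z = (A+Dh_1(s,z))\,\hat\partial_{ij}z + D^2h_1(s,z)(\hat\partial_i z,\hat\partial_j z), \tag{A.15}
\]
\[
\begin{aligned}
\frac{d}{ds}\hat\partial_{ijk}z &= (A+Dh_1(s,z))\,\hat\partial_{ijk}z + D^2h_1(s,z)(\hat\partial_i z,\hat\partial_{jk}z) + D^2h_1(s,z)(\hat\partial_k z,\hat\partial_{ij}z)\\
&\quad + D^2h_1(s,z)(\hat\partial_j z,\hat\partial_{ki}z) + D^3h_1(s,z)(\hat\partial_i z,\hat\partial_j z,\hat\partial_k z).
\end{aligned} \tag{A.16}
\]
Combining (A.14), (A.1), and (A.13), we have
\[
\|\hat\partial_i z(s)\| \le e^{\frac{\lambda_1}{2}(s-\hat s)} \tag{A.17}
\]
in analogy with (A.13). Combining (A.15), (A.1), (A.13), and (A.17),
\[
\frac{d}{ds}\|\hat\partial_{ij}z\| \le \frac{\lambda_1}{2}\|\hat\partial_{ij}z\| + C\|\hat\partial_i z\|\,\|\hat\partial_j z\| \le \frac{\lambda_1}{2}\|\hat\partial_{ij}z\| + Ce^{\lambda_1(s-\hat s)}.
\]
Applying Corollary A.2,
\[
\|\hat\partial_{ij}z(s)\| \le \frac{C}{|\lambda_1|}\,e^{\frac{\lambda_1}{2}(s-\hat s)}. \tag{A.18}
\]
Similarly, combining (A.16), (A.1), (A.13), (A.17), and (A.18),
\[
\|\hat\partial_{ijk}z(s)\| \le \frac{C}{|\lambda_1|^2}\,e^{\frac{\lambda_1}{2}(s-\hat s)}. \tag{A.19}
\]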
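The step from the differential inequality to (A.18) can be illustrated on the extremal scalar problem. The following is a minimal sketch, assuming that Corollary A.2 is a Gronwall-type comparison statement (its text is not reproduced in this section); the function names and the sample values of `lam` and `C` are hypothetical. It writes down the closed-form solution of $y' = \frac{\lambda}{2}y + Ce^{\lambda t}$, $y(0)=0$, with $t$ playing the role of $s-\hat s$, and compares it with the bound $\frac{2C}{|\lambda|}e^{\lambda t/2}$.

```python
import math

def extremal_solution(t, lam, C):
    # Closed-form solution of y' = (lam/2) y + C e^{lam t}, y(0) = 0, for lam < 0.
    # Obtained by variation of constants:
    #   y(t) = C e^{lam t/2} * (e^{lam t/2} - 1) / (lam/2).
    return (2.0 * C / lam) * (math.exp(lam * t) - math.exp(0.5 * lam * t))

def claimed_bound(t, lam, C):
    # Right-hand side of the (A.18)-type bound, with explicit constant 2C.
    return (2.0 * C / abs(lam)) * math.exp(0.5 * lam * t)

def ode_residual(t, lam, C, h=1e-5):
    # Central-difference check that extremal_solution satisfies the ODE.
    dy = (extremal_solution(t + h, lam, C) - extremal_solution(t - h, lam, C)) / (2.0 * h)
    return dy - (0.5 * lam * extremal_solution(t, lam, C) + C * math.exp(lam * t))

if __name__ == "__main__":
    lam, C = -0.2, 3.0  # hypothetical sample values
    for t in (0.0, 0.5, 2.0, 10.0, 50.0):
        print(t, extremal_solution(t, lam, C), claimed_bound(t, lam, C))
```

The factor $1/|\lambda_1|$ in (A.18) comes from integrating $e^{\lambda_1\tau/2}$ over $\tau\ge 0$; iterating the same argument once more is what would produce the $1/|\lambda_1|^2$ in (A.19).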
Differentiating Eqs. (3.7b), (A.14), and (A.15) with respect to $\hat s$ produces equations for $\partial_{\hat s}^l z$, $\partial_{\hat s}^l\hat\partial_i z$, and $\partial_{\hat s}\hat\partial_{ij}z$. For example,
\[
\frac{d}{ds}\partial_{\hat s}z = (A+Dh_1(s,z))\,\partial_{\hat s}z,
\qquad
\frac{d}{ds}\partial_{\hat s}^2z = (A+Dh_1(s,z))\,\partial_{\hat s}^2z + D^2h_1(s,z)(\partial_{\hat s}z,\partial_{\hat s}z).
\]
Such equations can be handled in a similar fashion and it is easy to verify that the additional $\hat s$-derivatives do not change the bounds by more than a constant prefactor. The bounds with $m=0$ in the proposition are now clear. The bounds with $m\neq 0$ follow immediately from the appropriate differential equation; for instance, bounding the right-hand side of (A.14) yields the bound on $\frac{d}{ds}\hat\partial_i z$.
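Proof of Proposition 3.2. We first view the error terms $E_k$ with $1\le k\le 3$ as smooth functions of $(\hat s,\hat z)$ and bound their derivatives with respect to these variables. The bounds on the $C^3$-norms with respect to $s_0$ follow by the chain rule. Bounding the $C^3$-norms of $E_4$ and $E_5$ is trivial and is done at the end of the proof. Throughout the proof, $\|\cdot\|_{C^3}$ will stand for the $C^3$-norm with respect to $s_0$.

Terms $E_1$ and $E_2$. It is convenient to express $E_1$ and $E_2$ in the form
\[
E_1 = \int_0^\infty b_1(\hat s+\tau)^T P(\hat s+\tau)\int_0^\tau e^{(\tau-\xi)A}\,h_1(\hat s+\xi,z(\hat s+\xi))\,d\xi\,d\tau,
\qquad
E_2 = \int_0^\infty h_2(\hat s+\tau,z(\hat s+\tau))\,d\tau.
\]
First of all, $\|(b_1^TP)(\hat s+\tau)\|_{C^3}\le C\|b_1^TP\|_{C^3}$ for every $\tau$, because $\|\hat s-s_0\|_{C^3}\le C$, so that
\[
\begin{aligned}
\|E_1\|_{C^3} &\le C\|b_1^TP\|_{C^3}\int_0^\infty\!\!\int_0^\tau e^{\lambda_1(\tau-\xi)}\,\|h_1(\hat s+\xi,z(\hat s+\xi))\|_{C^3}\,d\xi\,d\tau,\\
\|E_2\|_{C^3} &\le \int_0^\infty \|h_2(\hat s+\tau,z(\hat s+\tau))\|_{C^3}\,d\tau.
\end{aligned}
\]
Since $h_1(s,z)$ and $h_2(s,z)$ are periodic in the variable $s$, their partial derivatives of any order (less than four) with respect to $s$ are periodic functions of $s$ and can be bounded exactly as $h_1(s,z)$ and $h_2(s,z)$. To save a considerable amount of space, we write $\eta=\hat s+\xi$ and $\zeta=(\hat s+\xi,z(\hat s+\xi))$ below. Notice that the first three total $\hat s$-derivatives of $z(\eta)$ are
\[
\begin{aligned}
\frac{d}{d\hat s}z(\eta) &= \partial_sz(\eta)+\partial_{\hat s}z(\eta),\\
\frac{d^2}{d\hat s^2}z(\eta) &= \partial_s^2z(\eta)+2\partial_s\partial_{\hat s}z(\eta)+\partial_{\hat s}^2z(\eta),\\
\frac{d^3}{d\hat s^3}z(\eta) &= \partial_s^3z(\eta)+3\partial_s^2\partial_{\hat s}z(\eta)+3\partial_s\partial_{\hat s}^2z(\eta)+\partial_{\hat s}^3z(\eta).
\end{aligned}
\]
Taking $\hat z$-derivatives of the first two formulas above,
\[
\begin{aligned}
\frac{d}{d\hat s}\hat\partial_iz(\eta) &= \partial_s\hat\partial_iz(\eta)+\partial_{\hat s}\hat\partial_iz(\eta),\\
\frac{d}{d\hat s}\hat\partial_{ij}z(\eta) &= \partial_s\hat\partial_{ij}z(\eta)+\partial_{\hat s}\hat\partial_{ij}z(\eta),\\
\frac{d^2}{d\hat s^2}\hat\partial_iz(\eta) &= \partial_s^2\hat\partial_iz(\eta)+2\partial_s\partial_{\hat s}\hat\partial_iz(\eta)+\partial_{\hat s}^2\hat\partial_iz(\eta).
\end{aligned}
\]
Proposition A.5 then implies the bounds for $0\le k+l\le 3$:
\[
\Bigl\|\frac{d^l}{d\hat s^l}z(\eta)\Bigr\| \le C\|\hat z\|\,e^{\frac{\lambda_1}{2}\xi},
\qquad
\Bigl\|\frac{d^l}{d\hat s^l}\hat\partial_{i_1\ldots i_k}z(\eta)\Bigr\| \le \frac{C}{|\lambda_1|^{k-1}}\,e^{\frac{\lambda_1}{2}\xi} \quad (k>0).
\]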
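These will be used to bound the $C^3$-norm of $h_1(\zeta)$. To this end, we compute
\[
\begin{aligned}
\frac{d}{d\hat s}h_1(\zeta) &= \partial_sh_1(\zeta)+Dh_1(\zeta)\frac{d}{d\hat s}z(\eta) = O\bigl(\|\hat z\|^2e^{\lambda_1\xi}\bigr),\\
\frac{d^2}{d\hat s^2}h_1(\zeta) &= \partial_s^2h_1(\zeta)+2D(\partial_sh_1)(\zeta)\frac{d}{d\hat s}z(\eta)+Dh_1(\zeta)\frac{d^2}{d\hat s^2}z(\eta)+D^2h_1(\zeta)\Bigl(\frac{d}{d\hat s}z(\eta),\frac{d}{d\hat s}z(\eta)\Bigr)\\
&= O\bigl(\|\hat z\|^2e^{\lambda_1\xi}\bigr),\\
\frac{d^3}{d\hat s^3}h_1(\zeta) &= \partial_s^3h_1(\zeta)+D(\partial_s^2h_1)(\zeta)\frac{d}{d\hat s}z(\eta)+2D(\partial_s^2h_1)(\zeta)\frac{d}{d\hat s}z(\eta)+3D(\partial_sh_1)(\zeta)\frac{d^2}{d\hat s^2}z(\eta)\\
&\quad+3D^2(\partial_sh_1)(\zeta)\Bigl(\frac{d}{d\hat s}z(\eta),\frac{d}{d\hat s}z(\eta)\Bigr)+Dh_1(\zeta)\frac{d^3}{d\hat s^3}z(\eta)+3D^2h_1(\zeta)\Bigl(\frac{d}{d\hat s}z(\eta),\frac{d^2}{d\hat s^2}z(\eta)\Bigr)\\
&\quad+D^3h_1(\zeta)\Bigl(\frac{d}{d\hat s}z(\eta),\frac{d}{d\hat s}z(\eta),\frac{d}{d\hat s}z(\eta)\Bigr) = O\bigl(\|\hat z\|^2e^{\lambda_1\xi}\bigr),
\end{aligned}
\]
and
\[
\begin{aligned}
\frac{d}{d\hat z_i}h_1(\zeta) &= Dh_1(\zeta)\hat\partial_iz(\eta) = O\bigl(\|\hat z\|e^{\lambda_1\xi}\bigr),\\
\frac{d^2}{d\hat z_i\,d\hat z_j}h_1(\zeta) &= Dh_1(\zeta)\hat\partial_{ij}z(\eta)+D^2h_1(\zeta)(\hat\partial_iz(\eta),\hat\partial_jz(\eta))
= O\Bigl(\Bigl(\frac{\|\hat z\|}{|\lambda_1|}+1\Bigr)e^{\lambda_1\xi}\Bigr) = O\bigl(e^{\lambda_1\xi}\bigr),\\
\frac{d^3}{d\hat z_i\,d\hat z_j\,d\hat z_k}h_1(\zeta) &= Dh_1(\zeta)\hat\partial_{ijk}z(\eta)+D^2h_1(\zeta)(\hat\partial_iz(\eta),\hat\partial_{jk}z(\eta))+D^2h_1(\zeta)(\hat\partial_kz(\eta),\hat\partial_{ij}z(\eta))\\
&\quad+D^2h_1(\zeta)(\hat\partial_jz(\eta),\hat\partial_{ki}z(\eta))+D^3h_1(\zeta)(\hat\partial_iz(\eta),\hat\partial_jz(\eta),\hat\partial_kz(\eta))\\
&= O\Bigl(\Bigl(\frac{\|\hat z\|}{|\lambda_1|^2}+\frac{1}{|\lambda_1|}\Bigr)e^{\lambda_1\xi}\Bigr) = O\Bigl(\frac{1}{|\lambda_1|}e^{\lambda_1\xi}\Bigr).
\end{aligned}
\]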
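Taking $\hat z$-derivatives of $\frac{d}{d\hat s}h_1(\zeta)$, $\frac{d^2}{d\hat s^2}h_1(\zeta)$, and the resulting expression for $\frac{d^2}{d\hat z_i\,d\hat s}h_1(\zeta)$, we get
\[
\begin{aligned}
\frac{d^2}{d\hat z_i\,d\hat s}h_1(\zeta) &= D(\partial_sh_1)(\zeta)\hat\partial_iz(\eta)+Dh_1(\zeta)\frac{d}{d\hat s}\hat\partial_iz(\eta)+D^2h_1(\zeta)\Bigl(\frac{d}{d\hat s}z(\eta),\hat\partial_iz(\eta)\Bigr) = O\bigl(\|\hat z\|e^{\lambda_1\xi}\bigr),\\
\frac{d^3}{d\hat z_i\,d\hat s^2}h_1(\zeta) &= D(\partial_s^2h_1)(\zeta)\hat\partial_iz(\eta)+2D(\partial_sh_1)(\zeta)\frac{d}{d\hat s}\hat\partial_iz(\eta)+2D^2(\partial_sh_1)(\zeta)\Bigl(\frac{d}{d\hat s}z(\eta),\hat\partial_iz(\eta)\Bigr)\\
&\quad+Dh_1(\zeta)\frac{d^2}{d\hat s^2}\hat\partial_iz(\eta)+D^2h_1(\zeta)\Bigl(\frac{d^2}{d\hat s^2}z(\eta),\hat\partial_iz(\eta)\Bigr)+2D^2h_1(\zeta)\Bigl(\frac{d}{d\hat s}\hat\partial_iz(\eta),\frac{d}{d\hat s}z(\eta)\Bigr)\\
&\quad+D^3h_1(\zeta)\Bigl(\hat\partial_iz(\eta),\frac{d}{d\hat s}z(\eta),\frac{d}{d\hat s}z(\eta)\Bigr) = O\bigl(\|\hat z\|e^{\lambda_1\xi}\bigr),\\
\frac{d^3}{d\hat z_i\,d\hat z_j\,d\hat s}h_1(\zeta) &= D(\partial_sh_1)(\zeta)\hat\partial_{ij}z(\eta)+D^2(\partial_sh_1)(\zeta)\bigl(\hat\partial_iz(\eta),\hat\partial_jz(\eta)\bigr)+Dh_1(\zeta)\frac{d}{d\hat s}\hat\partial_{ij}z(\eta)\\
&\quad+D^2h_1(\zeta)\Bigl(\frac{d}{d\hat s}\hat\partial_iz(\eta),\hat\partial_jz(\eta)\Bigr)+D^2h_1(\zeta)\Bigl(\frac{d}{d\hat s}\hat\partial_jz(\eta),\hat\partial_iz(\eta)\Bigr)\\
&\quad+D^2h_1(\zeta)\Bigl(\frac{d}{d\hat s}z(\eta),\hat\partial_{ij}z(\eta)\Bigr)+D^3h_1(\zeta)\Bigl(\frac{d}{d\hat s}z(\eta),\hat\partial_iz(\eta),\hat\partial_jz(\eta)\Bigr)\\
&= O\Bigl(\Bigl(\frac{\|\hat z\|}{|\lambda_1|}+1\Bigr)e^{\lambda_1\xi}\Bigr) = O\bigl(e^{\lambda_1\xi}\bigr).
\end{aligned}
\]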
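We bound the derivatives of $h_2(\hat s+\tau,z(\hat s+\tau))$ in exactly the same way.

Term $E_3$. Setting v(τ) = b1(τ)^T P(τ) − , we have $E_3=\hat z\cdot\int_{\hat s}^\infty v(\tau)e^{(\tau-\hat s)A}\,d\tau$. Using the facts that $v$ is 2-periodic, that its integral vanishes, and that $A$ is negative definite,
\[
\begin{aligned}
\int_{\hat s}^\infty v(\tau)e^{(\tau-\hat s)A}\,d\tau
&= \int_0^\infty v(\hat s+\tau)e^{\tau A}\,d\tau
= \sum_{k=0}^\infty\int_0^2 v(\hat s+\tau)e^{\tau A}\,d\tau\;e^{2kA}\\
&= \int_0^2 v(\hat s+\tau)e^{\tau A}\,d\tau\,\bigl(1-e^{2A}\bigr)^{-1}
= \int_0^2 v(\hat s+\tau)\bigl(e^{\tau A}-1\bigr)\,d\tau\,\bigl(1-e^{2A}\bigr)^{-1}.
\end{aligned}
\]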
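Hence,
\[
\frac{d^k}{d\hat s^k}\int_{\hat s}^\infty v(\tau)e^{(\tau-\hat s)A}\,d\tau = \int_0^2 v^{(k)}(\hat s+\tau)\bigl(e^{\tau A}-1\bigr)\,d\tau\,\bigl(1-e^{2A}\bigr)^{-1}.
\]
Recalling that $A$ is diagonal, we obtain for each value of $k$ the upper bound
\[
\Bigl|\Bigl(\frac{d^k}{d\hat s^k}\int_{\hat s}^\infty v(\tau)e^{(\tau-\hat s)A}\,d\tau\Bigr)_i\Bigr|
\le \bigl\|v_i^{(k)}\bigr\|_\infty\int_0^2\bigl|e^{\tau\lambda_i}-1\bigr|\,d\tau\,\bigl(1-e^{2\lambda_i}\bigr)^{-1}
\le C\bigl\|v^{(k)}\bigr\|_\infty. \tag{A.20}
\]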
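The mechanism behind (A.20) can be checked numerically in the scalar case. The sketch below is illustrative only: it assumes a single eigenvalue `lam` < 0 in place of the diagonal matrix $A$ and uses the hypothetical test function $v(\tau)=\cos(\pi\tau)$, which is 2-periodic with zero integral over a period. It compares the integral over $[0,\infty)$ with the one-period formula and evaluates the factor $\int_0^2|e^{\tau\lambda}-1|\,d\tau\,(1-e^{2\lambda})^{-1}$, which should stay bounded as $\lambda\to 0$.

```python
import math

def integrate(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def v(tau):
    # Hypothetical 2-periodic, zero-mean test function (not taken from the paper).
    return math.cos(math.pi * tau)

def tail_integral(s_hat, lam, cutoff=60.0):
    # int_0^infty v(s_hat + t) e^{lam t} dt, truncated at `cutoff` (lam < 0).
    return integrate(lambda t: v(s_hat + t) * math.exp(lam * t), 0.0, cutoff, n=60000)

def one_period_formula(s_hat, lam):
    # int_0^2 v(s_hat + t) (e^{lam t} - 1) dt * (1 - e^{2 lam})^{-1}
    num = integrate(lambda t: v(s_hat + t) * math.expm1(lam * t), 0.0, 2.0)
    return num / (-math.expm1(2.0 * lam))

def bound_factor(lam):
    # int_0^2 |e^{lam t} - 1| dt * (1 - e^{2 lam})^{-1}
    num = integrate(lambda t: abs(math.expm1(lam * t)), 0.0, 2.0)
    return num / (-math.expm1(2.0 * lam))

if __name__ == "__main__":
    print(tail_integral(0.3, -0.5), one_period_formula(0.3, -0.5))
    for lam in (-1e-3, -0.1, -1.0, -10.0):
        print(lam, bound_factor(lam))
```

Subtracting 1 from $e^{\tau\lambda}$ is what makes the numerator vanish at the same rate as $1-e^{2\lambda}$ when $\lambda\to 0$; without the zero-integral property of $v$ the quotient would blow up like $1/|\lambda|$.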
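Incorporating $(s_0,z_0=0)\to(\hat s,\hat z)$. We set $z_0=0$ and denote $(\hat s,\hat z)=H_k(s_0,0)$. As $\hat z=z(\hat s)=z(\hat s(s_0),s_0,0)$, we have $\frac{d^k\hat z}{ds_0^k}=\frac{d^k}{ds_0^k}z(\hat s(s_0),s_0,0)$. The bounds
\[
\Bigl\|\frac{d^k\hat z}{ds_0^k}\Bigr\| \le C\varepsilon \qquad (1\le k\le 3)
\]
follow from the fact that $\hat s$ is a $C^3$-function of $s_0$ and the bounds in (A.2). Since $\hat s$ and $\hat z$ are functions of $s_0$, for any function $u=u(\hat s,\hat z)$,
\[
\begin{aligned}
\frac{d}{ds_0}u &= \frac{d\hat s}{ds_0}\partial_{\hat s}u+\frac{d\hat z_i}{ds_0}\partial_{\hat z_i}u,\\
\frac{d^2}{ds_0^2}u &= \frac{d^2\hat s}{ds_0^2}\partial_{\hat s}u+\frac{d^2\hat z_i}{ds_0^2}\partial_{\hat z_i}u+\Bigl(\frac{d\hat s}{ds_0}\Bigr)^2\partial_{\hat s\hat s}u+2\frac{d\hat s}{ds_0}\frac{d\hat z_i}{ds_0}\partial_{\hat s\hat z_i}u+\frac{d\hat z_i}{ds_0}\frac{d\hat z_j}{ds_0}\partial_{\hat z_i\hat z_j}u,\\
\frac{d^3}{ds_0^3}u &= \frac{d^3\hat s}{ds_0^3}\partial_{\hat s}u+\frac{d^3\hat z_i}{ds_0^3}\partial_{\hat z_i}u+2\frac{d\hat s}{ds_0}\frac{d^2\hat s}{ds_0^2}\partial_{\hat s\hat s}u
+2\Bigl(\frac{d^2\hat s}{ds_0^2}\frac{d\hat z_i}{ds_0}+\frac{d\hat s}{ds_0}\frac{d^2\hat z_i}{ds_0^2}\Bigr)\partial_{\hat s\hat z_i}u+2\frac{d^2\hat z_i}{ds_0^2}\frac{d\hat z_j}{ds_0}\partial_{\hat z_i\hat z_j}u\\
&\quad+\frac{d^2\hat s}{ds_0^2}\frac{d}{ds_0}\partial_{\hat s}u+\frac{d^2\hat z_i}{ds_0^2}\frac{d}{ds_0}\partial_{\hat z_i}u+\Bigl(\frac{d\hat s}{ds_0}\Bigr)^2\frac{d}{ds_0}\partial_{\hat s\hat s}u
+2\frac{d\hat s}{ds_0}\frac{d\hat z_i}{ds_0}\frac{d}{ds_0}\partial_{\hat s\hat z_i}u+\frac{d\hat z_i}{ds_0}\frac{d\hat z_j}{ds_0}\frac{d}{ds_0}\partial_{\hat z_i\hat z_j}u.
\end{aligned}
\]
Here summation over repeated indices is understood and we leave it to the reader to expand the remaining $s_0$-derivatives in the last five terms. Using the bounds derived earlier, we then get
\[
\begin{aligned}
\|h_1(\zeta)\| &\le C\|\hat z\|^2e^{\lambda_1\xi},\\
\Bigl\|\frac{d}{ds_0}h_1(\zeta)\Bigr\| &\le C\bigl(\|\hat z\|^2+\varepsilon\|\hat z\|\bigr)e^{\lambda_1\xi},\\
\Bigl\|\frac{d^2}{ds_0^2}h_1(\zeta)\Bigr\| &\le C\bigl(\|\hat z\|^2+\varepsilon\|\hat z\|+\varepsilon^2\bigr)e^{\lambda_1\xi},\\
\Bigl\|\frac{d^3}{ds_0^3}h_1(\zeta)\Bigr\| &\le C\Bigl(\|\hat z\|^2+\varepsilon\|\hat z\|+\varepsilon^2+\frac{\varepsilon^3}{|\lambda_1|}\Bigr)e^{\lambda_1\xi}.
\end{aligned}
\]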
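Similar bounds are obtained for $h_2(\zeta)$. We conclude that
\[
\|E_1\|_{C^3} \le C\|b_1^TP\|_{C^3}\frac{\varepsilon^2}{|\lambda_1|^2} \le C\sigma\frac{\varepsilon^2}{|\lambda_1|^2},
\qquad
\|E_2\|_{C^3} \le C\frac{\varepsilon^2}{|\lambda_1|},
\qquad
\|E_3\|_{C^3} \le C\varepsilon\sigma.
\]
The final inequality involving $E_1$ holds because $\frac{b_1^TP}{\sigma}$ is independent of $\sigma$.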
Terms E4 and E5 . Writing E4 in the form 1 b0 ((1 − τ )˜s + τ sˆ ) dτ, E4 = (˜s − sˆ ) 0
and recalling that s˜ and sˆ are both
C3
functions of s0 allows us to estimate
E4 C 3 C˜s − sˆ C 3 Cε. Proposition 3.1 was used here. Finally, by the same proposition, ¯ C 3 Cε, E5 C 3 = ξ 2 (s0 , 0), d
which finishes the proof.
Lemma A.6. For all s0 and a, we have ∂z0 s∞ (s0 , 0, a) > 0. Proof. Differentiating both sides of (3.11) with respect to z0 , we get ∞ 0 = b0 (s∞ )∂z0 s∞ + eτ A dτ , ∂z0 zˆ + R,
(A.21)
0
where
R = −b0 (ˆs )∂z0 sˆ + + zˆ , 0
∞
∞ 0
(bT1 P − )(ˆs + τ )eτ A dτ , ∂z0 zˆ
2 (bT1 P) (ˆs + τ )eτ A dτ (∂z0 sˆ ) + ∂z0 Ek . k=1
Because (A.20) holds for any periodic, zero-integral function, the two integrals appearing in R are O(σ ) in the limit λ1 → 0. Terms ∂z0 sˆ and ∂z0 zˆ are bounded by (A.10) and (A.11), respectively. Estimating ∂z0 E1 and ∂z0 E2 , we conclude that σε . R = O(σ ) + O |λ1 |2 From (A.21) and (A.11), we have ! " 1 σε −1 A (1 + O(λ1 )) + O(σ ) + O ∂z0 s∞ = b0 (s∞ ) |λ1 |2
(A.22)
n−1 as λ1 → 0. Since A−1 = (− i λi−1 )i=1 , if |λε1 | is sufficiently small, then the first term on the right side of (A.22) dominates and thus ∂z0 s∞ > 0.
References
1. Benedicks, M., Carleson, L.: On iterations of 1 − ax² on (−1, 1). Ann. of Math. (2) 122(1), 1–25 (1985)
2. Benedicks, M., Carleson, L.: The dynamics of the Hénon map. Ann. of Math. (2) 133(1), 73–169 (1991)
3. DeVille, R.E.L., Vanden-Eijnden, E., Muratov, C.B.: Two distinct mechanisms of coherence in randomly perturbed dynamical systems. Phys. Rev. E (3) 72(3), 031105 (2005)
4. Falconer, I., Gottwald, G.A., Melbourne, I., Wormnes, K.: Application of the 0-1 test for chaos to experimental data. SIAM J. Appl. Dyn. Syst. 6(2), 395–402 (2007) (electronic)
5. Gottwald, G.A., Melbourne, I.: A new test for chaos in deterministic systems. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 460(2042), 603–611 (2004)
6. Hunt, B.R., Sauer, T., Yorke, J.A.: Prevalence: a translation-invariant “almost every” on infinite-dimensional spaces. Bull. Amer. Math. Soc. (N.S.) 27(2), 217–238 (1992)
7. Hunt, B.R., Sauer, T., Yorke, J.A.: Prevalence. An addendum to: Prevalence: a translation-invariant ‘almost every’ on infinite-dimensional spaces. Bull. Amer. Math. Soc. (N.S.) 28(2), 306–307 (1993)
8. Jakobson, M.V.: Absolutely continuous invariant measures for one-parameter families of one-dimensional maps. Commun. Math. Phys. 81(1), 39–88 (1981)
9. Lin, K.K., Young, L.-S.: Shear-induced chaos. Nonlinearity 21(5), 899–922 (2008)
10. Ott, W., Yorke, J.A.: Prevalence. Bull. Amer. Math. Soc. (N.S.) 42(3), 263–290 (2005) (electronic)
11. Thieullen, Ph., Tresser, C., Young, L.-S.: Positive Lyapunov exponent for generic one-parameter families of unimodal maps. J. Anal. Math. 64, 121–172 (1994)
12. Thieullen, P., Tresser, C., Young, L.-S.: Exposant de Lyapunov positif dans des familles à un paramètre d’applications unimodales. C. R. Acad. Sci. Paris Sér. I Math. 315(1), 69–72 (1992)
13. Tucker, W.: A rigorous ODE solver and Smale’s 14th problem. Found. Comput. Math. 2(1), 53–117 (2002)
14. Wang, Q., Young, L.-S.: In preparation
15. Wang, Q., Young, L.-S.: Strange attractors with one direction of instability. Commun. Math. Phys. 218(1), 1–97 (2001)
16. Wang, Q., Young, L.-S.: From invariant curves to strange attractors. Commun. Math. Phys. 225(2), 275–304 (2002)
17. Wang, Q., Young, L.-S.: Strange attractors in periodically-kicked limit cycles and Hopf bifurcations. Commun. Math. Phys. 240(3), 509–529 (2003)
18. Wang, Q., Young, L.-S.: Toward a theory of rank one attractors. Ann. of Math. (2) 167(2), 349–480 (2008)
19. Young, L.-S.: Statistical properties of dynamical systems with some hyperbolicity. Ann. of Math. (2) 147(3), 585–650 (1998)
20. Young, L.-S.: Recurrence times and rates of mixing. Israel J. Math. 110, 153–188 (1999)
21. Zaslavsky, G.M.: The simplest case of a strange attractor. Phys. Lett. A 69(3), 145–147 (1978/79)

Communicated by G. Gallavotti
Commun. Math. Phys. 296, 251–270 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1015-x
Communications in
Mathematical Physics
Uniform Regularity Close to Cross Singularities in an Unstable Free Boundary Problem John Andersson1 , Henrik Shahgholian2 , Georg S. Weiss3 1 Mathematics Institute, University of Warwick, Coventry CV4 7AL, UK.
E-mail:
[email protected] 2 Department of Mathematics, Royal Institute of Technology,
100 44 Stockholm, Sweden. E-mail:
[email protected] 3 Graduate School of Mathematical Sciences, University of Tokyo, 3-8-1 Komaba,
Meguro-ku, Tokyo-to, 153-8914, Japan. E-mail:
[email protected] Received: 29 May 2009 / Accepted: 24 November 2009 Published online: 18 February 2010 – © Springer-Verlag 2010
Dedicated to Nina Nikolaevna Uraltseva on the occasion of her 75th birthday

Abstract: We introduce a new method for the analysis of singularities in the unstable problem $\Delta u = -\chi_{\{u>0\}}$, which arises in solid combustion as well as in the composite membrane problem. Our study is confined to points of “supercharacteristic” growth of the solution, i.e. points at which the solution grows faster than the characteristic/invariant scaling of the equation would suggest. At such points the classical theory is doomed to fail, due to incompatibility of the invariant scaling of the equation and the scaling of the solution. In the case of two dimensions our result shows that in a neighborhood of the set at which the second derivatives of $u$ are unbounded, the level set $\{u = 0\}$ consists of two $C^1$-curves meeting at right angles. It is important that our result is not confined to the minimal solution of the equation but holds for all solutions.
Contents
1. Introduction
2. Notation
3. Preliminaries
4. A Newtonian Potential and its Projection
5. Growth of the Solution at Singular Points
6. Controlling the Movement of $\Pi(u(x + r\,\cdot))$
7. Conclusion
References
1. Introduction

In the last decade, the theory of free boundary regularity of obstacle type has received renewed attention, owing to the seminal paper [4] of L.A. Caffarelli as well as [7]. Many interesting old and new problems, intractable by earlier techniques, have been solved thanks to the ideas in [4] and [7] (see for example [16]). All these problems share a common feature: the scaling of the solution at free boundary points coincides with the characteristic/invariant scaling of the equation. However, there are problems arising in applications for which this does not hold. An example is the unstable obstacle problem
\[
\Delta u = -\chi_{\{u>0\}} \quad\text{in } \Omega\subset\mathbb{R}^n, \tag{1.1}
\]
related to traveling wave solutions in solid combustion with ignition temperature (see the Introduction of [14] for more details), to the composite membrane problem (see [3,8–11,15]) as well as the shape of self-gravitating rotating fluids (see [6, Eq. (1.26)]). Solutions of Equation (1.1) may exhibit “supercharacteristic” growth of order r 2 | log r | not suggested by the invariant/characteristic scaling u(r x)/r 2 of the equation. In this paper we introduce a new method to analyze the fine structure of singular sets close to points of supercharacteristic growth of the solution. Equation (1.1) has been investigated by R. Monneau-G.S. Weiss in [14]. They establish partial regularity for second order non-degenerate solutions of (1.1). More precisely they show that the singular set has Hausdorff dimension less than or equal to n − 2, and that in two dimensions the free boundary consists close to points where the second derivative is unbounded, of four Lipschitz graphs meeting at right angles. They also show that energy-minimising solutions are in the two-dimensional case of class C 1,1 and that their free boundaries are locally analytic. J. Andersson-G.S. Weiss have constructed a cross-shaped counter-example proving that the solution need not be of class C 1,1 (see [1]). In [14] it has been shown that the second variation of the energy at that particular solution takes the value −∞. In this sense the cross-solution is completely unstable. Moreover, it cannot be obtained by naive numerical schemes. In this paper we analyze the behavior of solutions at points at which the second derivatives are unbounded. Difficulties in the analysis are: (i) At cross-like singular points the solution has the “wrong scaling”, i.e. u(r x) scales like r 2 | log(r )| which is different from the characteristic scaling r 2 of the equation. 
The lack of a suitable local Lyapunov functional/monotonicity formula implies that methods like the Lojasiewicz inequality (see for example [17,18]) would be hard to apply even at isolated singularities. (ii) The cross-like singularities are unstable. (iii) The comparison principle does not hold. Instead we use knowledge about the Newtonian potential of the right-hand side to derive a quantitative estimate for the projection of the solution onto the homogeneous harmonic polynomials of degree 2. Remark 1.1. Although our problem may superficially resemble nodal line problems where one is also interested in the singular set {w = 0} ∩ {∇w = 0}, there is a fundamental difference: whereas in nodal lines the solution w is sometimes expanded close to
a singular point into a harmonic polynomial and a (relatively) smooth remainder term, and this expansion thereafter quickly leads to regularity, the same is not true for our problem. Let us make this difference more precise for the example of the nodal line paper [5]: In [5, Theorem 4.3], it is shown that the solution of that paper, w, satisfies close to a (more general) singular point x 0 , w = H + , where H is a harmonic polynomial of precise degree N and the remainder satisfies |(x)| ≤ C|x − x 0 | N +δ for some δ > 0. The corresponding statement would in our case be that in a small neighborhood of the singularity x 0 , u = H + , where H is a harmonic polynomial of precise degree 2 and the remainder satisfies |(x)| ≤ C|x − x 0 |2+δ for some δ > 0. But that is by our Lemma 5.3 not true! What we do is a finer expansion of the type u = H +z+ (up to rotation), where z is the singular Newtonian potential of Lemma 4.4. Observe that the Newtonian potential z is a much more sophisticated object than the harmonic polynomial H . A careful analysis leads in the case of two dimensions to the growth estimate Theorem A (i) for the solution as well as an estimate of order
\[
\int_0^r \frac{\sqrt{|\log|\log s||}}{s\,|\log s|^{3/2}}\,ds \tag{1.2}
\]
for how much the projection of u(x + s·) and also the approximate tangent space of the singular set can turn as s moves from r to 0 (see Theorem A and Remark 1.2). Our main result Theorem A shows that close to a non-degenerate singular point, the level set {u = 0} consists of two C 1 -curves meeting at right angles. We provide estimates for the modulus of the normal of the free boundary close to singular points. Different from the (also two-dimensional) unique tangent cone result [14, Theorem 7.1], the result in the present paper is a quantitative result valid uniformly for a certain class of solutions. Moreover the result in the present paper is not confined to the minimal solution. In the paper [2] in preparation the authors extend these new methods to the case of higher dimensions. Our main result in the present paper is the following (cf. Corollary 5.6 and Corollary 7.1):
Theorem A. Let $u$ be a solution of (1.1) in $\Omega\subset\mathbb{R}^2$ satisfying $\sup_\Omega|u|\le M$. Moreover let $d>0$. Then there exist an $r_0=r_0(M,d)>0$ and a $\delta_0=\delta_0(M,d)>0$ such that if $x^0\in\Omega_d=\{x\in\Omega : \operatorname{dist}(x,\partial\Omega)>d\}$ and
\[
S^u(x^0,r) \equiv \Bigl(\frac{1}{r}\int_{\partial B_r(x^0)}u^2\,d\mathcal{H}^1\Bigr)^{1/2} \ge \frac{r^2}{\delta} \tag{1.3}
\]
for some $\delta\le\delta_0$, $r\le r_0$ and $u(x^0)=|\nabla u(x^0)|=0$ then:
(i) $\bigl(\frac{1}{\delta}-C(M,d)\bigr)s^2+c\log(r/s)\,s^2 \le S^u(x^0,s)$ for every $s\le r$.
(ii) There exists a second order homogeneous harmonic polynomial $p^{x^0,u}=p$ such that for each $\alpha\in(0,1/2)$ and each $\beta\in(0,1)$,
\[
\Bigl\|\frac{u(x^0+sx)}{\sup_{B_s(x^0)}|u|}-p\Bigr\|_{C^{1,\beta}(\bar B_1)} \le C(M,d,\alpha,\beta)\Bigl(\frac{\delta}{1+\delta\log(r/s)}\Bigr)^{\alpha}. \tag{1.4}
\]
(iii) The set $\{u=0\}\cap B_r(x^0)$ consists of two $C^1$-curves intersecting each other at right angles at $x^0$.

Remark 1.2. 1) By [14, Lemma 8.5] the estimate Theorem A (i) is sharp. The inequality (1.3) is always satisfied for some $r$ at singular points, that is, points at which the solution $u$ is not $C^{1,1}$. Theorem A thus states that $x^0$ is a singular point if and only if (1.3) is satisfied for some $r$. 2) The left hand side in (1.4) may be estimated by the somewhat sharper term in (1.2) (see the end of the proof of Theorem 6.3).

The proof of (i) in Theorem A is contained in Corollary 5.6, and (ii) and (iii) will be proved in Corollary 7.1.

2. Notation

Throughout this article $\mathbb{R}^n$ will be equipped with the Euclidean inner product $x\cdot y$ and the induced norm $|x|$. We define $e_i$ as the $i$th unit vector in $\mathbb{R}^n$, and $B_r(x^0)$ will denote the open $n$-dimensional ball of center $x^0$, radius $r$ and volume $r^n\omega_n$. When not specified, $x^0$ is assumed to be 0. We shall often use abbreviations for inverse images like $\{u>0\}:=\{x\in\Omega : u(x)>0\}$, $\{x_n>0\}:=\{x\in\mathbb{R}^n : x_n>0\}$, etc., and occasionally we shall employ the decomposition $x=(x_1,\ldots,x_n)$ of a vector $x\in\mathbb{R}^n$. Since we are concerned with local regularity we will use the set $\Omega_d:=\{x\in\Omega : \operatorname{dist}(x,\partial\Omega)\ge d>0\}$. We will use the $k$-dimensional Hausdorff measure $\mathcal{H}^k$. When considering a set $A$, $\chi_A$ shall stand for the characteristic function of $A$, while $\nu$ shall typically denote the outward normal to a given boundary.

3. Preliminaries

In this section we state some of the definitions and tools from [20,14] and mention some examples from [1]. First we need the monotonicity formula derived in [20] by G.S. Weiss for a class of semilinear free boundary problems. For the sake of completeness let us state the unstable case here:
Theorem 3.1 (Monotonicity formula, [20]). Suppose that $u$ is a solution of (1.1) in $\Omega$ and that $B_\delta(x^0)\subset\Omega$. Then for all $0<\rho<\sigma<\delta$ the function
\[
\Phi^u_{x^0}(r) := r^{-n-2}\int_{B_r(x^0)}\bigl(|\nabla u|^2-2\max(u,0)\bigr) - 2\,r^{-n-3}\int_{\partial B_r(x^0)}u^2\,d\mathcal{H}^{n-1},
\]
defined in $(0,\delta)$, satisfies the monotonicity formula
\[
\Phi^u_{x^0}(\sigma)-\Phi^u_{x^0}(\rho) = \int_\rho^\sigma r^{-n-2}\int_{\partial B_r(x^0)}2\Bigl(\nabla u\cdot\nu-2\frac{u}{r}\Bigr)^2\,d\mathcal{H}^{n-1}\,dr \ge 0.
\]
The following proposition has been proved in [14, Sect. 5].

Proposition 3.2 (Classification of blow-up limits with fixed center, Prop. 5.1 in [14]). Let $u$ be a solution of (1.1) in $\Omega$ and let us consider a point $x^0\in\Omega\cap\{u=0\}\cap\{\nabla u=0\}$.
(i) In the case $\Phi^u_{x^0}(0+)=-\infty$, $\lim_{r\to 0}r^{-3-n}\int_{\partial B_r(x^0)}u^2\,d\mathcal{H}^{n-1}=+\infty$, and for $S^u(x^0,r)=\bigl(r^{1-n}\int_{\partial B_r(x^0)}u^2\,d\mathcal{H}^{n-1}\bigr)^{1/2}$, each limit of
\[
\frac{u(x^0+rx)}{S^u(x^0,r)}
\]
as $r\to 0$ is a homogeneous harmonic polynomial of degree 2.
(ii) In the case $\Phi^u_{x^0}(0+)\in(-\infty,0)$,
\[
u_r(x) := \frac{u(x^0+rx)}{r^2}
\]
is bounded in $W^{1,2}(B_1(0))$, and each limit as $r\to 0$ is a homogeneous solution of degree 2.
(iii) Else $\Phi^u_{x^0}(0+)=0$, and
\[
\frac{u(x^0+rx)}{r^2}\to 0 \quad\text{in } W^{1,2}(B_1(0)) \text{ as } r\to 0.
\]

Remark 3.3. 1. As observed recently by one of the authors, case (ii) is possible even in two dimensions (cf. [2]). 2. Case (iii) is equivalent to $u$ being degenerate of second order at $x^0$.

In [1], the authors have obtained abstract existence of solutions in two dimensions that exhibit cross-like singularities, at which the second derivatives of the solution are unbounded (case (i) of Proposition 3.2), as well as degenerate singularities, at which the solution decays to zero faster than any quadratic polynomial (case (iii) of Proposition 3.2):
Theorem 3.4 (Cross-shaped singularity in two dimensions, Cor. 4.2 in [1]). There exists a solution $u$ of
\[
\Delta u = -\chi_{\{u>0\}} \quad\text{in } B_1\subset\mathbb{R}^2
\]
that is not of class $C^{1,1}$. Each limit of
\[
\frac{u(rx)}{S^u(0,r)}
\]
as $r\to 0$ coincides after rotation with the function $(x_1^2-x_2^2)/\|x_1^2-x_2^2\|_{L^2(\partial B_1(0))}$.

Theorem 3.5 (Existence of a degenerate point, Cor. 4.4 in [1]). There exists a nontrivial solution $u$ of
\[
\Delta u = -\chi_{\{u>0\}} \quad\text{in } B_1\subset\mathbb{R}^2
\]
that is degenerate of second order at the origin.

4. A Newtonian Potential and its Projection

In what follows we will need the space $\mathcal{P}$ of second order homogeneous harmonic polynomials and two dimensional homogeneous polynomials respectively which we define now.

Definition 4.1. Let us first define in each dimension $n\ge 2$ the space $\mathcal{P}$ of 2-homogeneous harmonic polynomials, i.e. harmonic polynomials of degree 2.

Definition 4.2. (i) Let us define the projection $\Pi: W^{2,2}(B_1)\to\mathcal{P}$ as follows: for $v\in W^{2,2}(B_1)$, let $\Pi(v)$ be the, by Lemma 4.3 unique, minimizer of
\[
p\mapsto\int_{B_1}|D^2v-D^2p|^2
\]
on $\mathcal{P}$, where $|A|=\sqrt{\sum_{i,j=1}^n a_{ij}^2}$ is the Frobenius norm of the matrix $A$.
(ii) Let us also define $\tau(v)\ge 0$ by
\[
\Pi(v)=\tau(v)\,p, \qquad p\in\mathcal{P},\ \sup_{B_1}|p|=1.
\]

Lemma 4.3. (i) For each $v\in W^{2,2}(B_1)$ the minimizer of Definition 4.2 exists and is unique. Thus $\Pi: W^{2,2}(B_1)\to\mathcal{P}$ is well-defined. (ii) $\Pi$ is a linear operator. (iii) If $h\in W^{2,2}(B_1)$ is harmonic in $B_1$ then $\Pi(h(x))=\Pi(h(rx)/r^2)$ for all $r\in(0,1)$. (iv) For every $v,w\in W^{2,2}(B_1)$,
\[
\sup_{B_1}|\Pi(v+w)| \le \sup_{B_1}|\Pi(v)|+\sup_{B_1}|\Pi(w)|.
\]
Proof. The first and second statement follow from the projection theorem with respect to the $L^2(B_1;\mathbb{R}^{n^2})$-inner product and the linear subspace
\[
\{f\in L^2(B_1;\mathbb{R}^{n^2}) : f(x)\text{ is symmetric, constant, and }\operatorname{trace}(f)=0\}.
\]
Writing $h$ as the sum of homogeneous harmonic polynomials $h_j$ that are orthogonal to each other with respect to
\[
(v,w) := \int_{B_1}\sum_{i,j=1}^n\partial_{ij}v\,\partial_{ij}w,
\]
we see that $\Pi(h_j)=0$ for all $j$ such that the degree of $h_j$ is different from 2, implying the third statement. The last statement follows from the linearity of $\Pi$ and the triangle inequality in $L^2(B_1;\mathbb{R}^{n^2})$.

In [13] L. Karp-A.S. Margulis derive eigenfunction expansions for generalized Newtonian potentials with respect to a large class of right-hand sides. In the following lemma we calculate explicitly a normalized generalized Newtonian potential of $-\chi_{\{x_1x_2>0\}}$ as well as its projections. Properties (iv), (v) and (vi) in Lemma 4.4 are crucial for what follows.

Lemma 4.4. Define $v:(0,+\infty)\times[0,+\infty)\to\mathbb{R}$ by
\[
v(x_1,x_2) := -4x_1x_2\log(x_1^2+x_2^2)+2(x_1^2-x_2^2)\Bigl(\frac{\pi}{2}-2\arctan\frac{x_2}{x_1}\Bigr)-\pi(x_1^2+x_2^2).
\]
Moreover let
\[
w(x_1,x_2) := \begin{cases} v(x_1,x_2), & x_1x_2\ge 0,\ x_1\neq 0,\\ -v(-x_1,x_2), & x_1<0,\ x_2\ge 0,\\ -v(x_1,-x_2), & x_1>0,\ x_2\le 0,\end{cases}
\]
and let
\[
z(x_1,x_2) := \frac{w(x_1,x_2)-\pi(x_1^2+x_2^2)+8x_1x_2}{8\pi}.
\]
Then $z$ is the unique solution to
(i) $\Delta z=-\chi_{\{x_1x_2>0\}}$ in $\mathbb{R}^2$,
(ii) $z(0)=|\nabla z(0)|=0$,
(iii) $\lim_{x\to\infty}\frac{z(x)}{|x|^3}=0$,
(iv) $\Pi(z)=0$,
(v) $\Pi(z_{1/2})=\log(2)\,x_1x_2/\pi$,
(vi) $\tau(z_{1/2})=\log(2)/(2\pi)$.
Proof. A calculation shows that $w$ can be extended to a $C^1$-function and that $\Delta w=-4\pi\chi_{\{x_1x_2>0\}}+4\pi\chi_{\{x_1x_2<0\}}$, so that $z$ is a solution of $\Delta z=-\chi_{\{x_1x_2>0\}}$ in $\mathbb{R}^2$ satisfying (ii) and (iii). Next we show that $h:=\Pi(z)=0$: setting
\[
D^2h=\begin{pmatrix}a&b\\ b&-a\end{pmatrix},
\]
we obtain
\[
0=\partial_b\int_{B_1}|D^2z-D^2h|^2=4\int_{B_1}\partial_{12}(h-z)=4b-4\int_{B_1}\partial_{12}z=4b+2\int_{B_1}\frac{1+\log(x_1^2+x_2^2)}{\pi}=4b
\]
as well as
\[
0=\partial_a\int_{B_1}|D^2z-D^2h|^2=4a,
\]
implying that $h\equiv 0$. Rescaling $z$ we see that
\[
\frac{z(rx_1,rx_2)}{r^2}=z(x_1,x_2)-\frac{x_1x_2\log r^2}{2\pi},
\]
which implies
\[
\Pi(z_{1/2})=\Pi(z)-\Pi\Bigl(\frac{x_1x_2\log\bigl((1/2)^2\bigr)}{2\pi}\Bigr)=-\log(1/2)\,x_1x_2/\pi.
\]
Thus (v) and (vi) are true. Last, we show uniqueness of $z$ satisfying (i)-(iv). Observe that (v) and (vi) are not needed to show uniqueness. If $z_1$ and $z_2$ are two solutions to (i)-(iv), then by (i), $z_1-z_2$ is harmonic. Condition (iii) implies that $z_1-z_2$ is a second order polynomial. Conditions (ii) and (iv) then imply that $z_1-z_2=0$.
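The explicit potential of Lemma 4.4 lends itself to a quick numerical sanity check. The sketch below is not part of the paper's argument; the helper names (`laplacian`, `scaling_defect`, `half_scale_coefficient`) and the sample points are hypothetical. It evaluates $z$ away from the axis $x_1=0$, approximates $\Delta z$ with a five-point stencil in each open quadrant, and checks the rescaling identity $z(rx)/r^2=z(x)-x_1x_2\log r^2/(2\pi)$ that yields property (v).

```python
import math

def v(x1, x2):
    # Formula of Lemma 4.4 on the quadrant x1 > 0, x2 >= 0.
    r2 = x1 * x1 + x2 * x2
    return (-4.0 * x1 * x2 * math.log(r2)
            + 2.0 * (x1 * x1 - x2 * x2) * (math.pi / 2.0 - 2.0 * math.atan(x2 / x1))
            - math.pi * r2)

def w(x1, x2):
    # Reflection rule of Lemma 4.4: +v where x1*x2 >= 0, -v of the reflected
    # point otherwise. Requires x1 != 0 (on that axis w is defined by continuity).
    sign = 1.0 if x1 * x2 >= 0 else -1.0
    return sign * v(abs(x1), abs(x2))

def z(x1, x2):
    return (w(x1, x2) - math.pi * (x1 * x1 + x2 * x2) + 8.0 * x1 * x2) / (8.0 * math.pi)

def laplacian(f, x1, x2, h=1e-3):
    # Five-point finite-difference stencil; the stencil must stay inside one quadrant.
    return (f(x1 + h, x2) + f(x1 - h, x2) + f(x1, x2 + h) + f(x1, x2 - h)
            - 4.0 * f(x1, x2)) / (h * h)

def scaling_defect(x1, x2, r):
    # z(r x)/r^2 - z(x) + x1 x2 log(r^2) / (2 pi); zero if the rescaling identity holds.
    return z(r * x1, r * x2) / (r * r) - z(x1, x2) + x1 * x2 * math.log(r * r) / (2.0 * math.pi)

def half_scale_coefficient():
    # Coefficient of x1*x2 picked up when rescaling with r = 1/2.
    return -math.log(0.25) / (2.0 * math.pi)

if __name__ == "__main__":
    print(laplacian(z, 0.3, 0.4), laplacian(z, -0.3, 0.4))
    print(scaling_defect(0.3, 0.4, 0.5), half_scale_coefficient())
```

Since $\sup_{B_1}|x_1x_2|=1/2$, the coefficient $\log(2)/\pi$ returned by `half_scale_coefficient` corresponds to $\tau(z_{1/2})=\log(2)/(2\pi)$ in property (vi).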
5. Growth of the Solution at Singular Points

The next lemma is crucial for all that follows.

Lemma 5.1. Let $u$ solve (1.1) and suppose that $d>0$, $\sup_\Omega|u|\le M<+\infty$, $x^0\in\Omega_d$, $u(x^0)=|\nabla u(x^0)|=0$ and $r\le d/2$. Then
\[
\Bigl(\int_{B_1}\Bigl|D^2\frac{u(x^0+rx)}{r^2}-D^2\Pi\Bigl(\frac{u(x^0+rx)}{r^2}\Bigr)\Bigr|^p\Bigr)^{1/p} \le C(n,M,d,p)
\]
and
\[
\Bigl\|\frac{u(x^0+rx)}{r^2}-\Pi\Bigl(\frac{u(x^0+rx)}{r^2}\Bigr)\Bigr\|_{C^{1,\beta}(\bar B_1)} \le C(n,M,d,\beta).
\]
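Proof. Let $u_r(x)=\frac{u(x^0+rx)}{r^2}$. From [19, 4.1 Prop. 1] we infer that $D^2u$ is locally of class BMO, and that
\[
\Bigl(\int_{B_{3/2}}\bigl|D^2u_r-(D^2u)_{3r/2}\bigr|^2\Bigr)^{1/2} \le C_1,
\]
where
\[
(D^2u)_{3r/2} = \frac{1}{\omega_n(3/2)^n}\int_{B_{3/2}}D^2u_r,
\]
and $C_1$ is a constant depending only on $n$, $M$ and $d$. It follows that
\[
\begin{aligned}
C_1 &\ge \Bigl(\int_{B_{3/2}}\bigl|D^2u_r-(D^2u)_{3r/2}\bigr|^2\Bigr)^{1/2}\\
&\ge \Bigl(\int_{B_{3/2}}\Bigl|D^2u_r-\Bigl((D^2u)_{3r/2}-\frac{1}{n}\operatorname{trace}\bigl((D^2u)_{3r/2}\bigr)I\Bigr)\Bigr|^2\Bigr)^{1/2}
-\Bigl(\int_{B_{3/2}}\Bigl|\frac{1}{n}\operatorname{trace}\bigl((D^2u)_{3r/2}\bigr)I\Bigr|^2\Bigr)^{1/2},
\end{aligned}
\]
where $I$ is the identity matrix. Next it is easy to see that
\[
\int_{B_{3/2}}\Bigl|\frac{1}{n}\operatorname{trace}\bigl((D^2u)_{3r/2}\bigr)I\Bigr|^2 \le 1,
\]
since
\[
\operatorname{trace}\bigl((D^2u)_{3r/2}\bigr) = \frac{1}{\omega_n(3/2)^n}\int_{B_{3/2}}\Delta u_r
\]
and $|\Delta u_r|\le 1$. In particular we have
\[
C_1+1 \ge \Bigl(\int_{B_{3/2}}\Bigl|D^2u_r-\Bigl((D^2u)_{3r/2}-\frac{1}{n}\operatorname{trace}\bigl((D^2u)_{3r/2}\bigr)I\Bigr)\Bigr|^2\Bigr)^{1/2}.
\]
Using the minimizing property of the projection we get
\[
(C_1+1)^2 \ge \int_{B_{3/2}}\Bigl|D^2u_r-\Bigl((D^2u)_{3r/2}-\frac{1}{n}\operatorname{trace}\bigl((D^2u)_{3r/2}\bigr)I\Bigr)\Bigr|^2 \ge \int_{B_{3/2}}\bigl|D^2u_r-D^2\Pi(u_{3r/2})\bigr|^2.
\]
Observe that if we set $v:=u_r-\Pi(u_{3r/2})$, then
\[
\int_{B_{3/2}}|D^2v|^2\le(C_1+1)^2, \qquad \|\Pi(v)\|_{L^2(B_1)}\le C_2, \qquad \|\Pi(v)\|_{L^2(B_{3/2})}\le C_3 \quad\text{and}\quad \|v-\Pi(v)\|_{L^2(B_{3/2})}\le C_4.
\]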
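It follows that $D^2(u_r - \Pi(u_r))$ is bounded in $L^2(B_{3/2})$. Moreover, since $\Pi(u_r)$ is harmonic, $\Delta(u_r - \Pi(u_r)) = -\chi_{\{u_r > 0\}}$. Poincaré's inequality implies that
\[ \left\| u_r - \Pi(u_r) - \overline{\nabla u_r} \cdot x - \overline{u_r} \right\|_{W^{2,2}(B_{3/2})} \le \left\| D^2 u_r - D^2 \Pi(u_r) \right\|_{L^2(B_{3/2})} \le C_5, \]
where $\overline{\nabla u_r}$ and $\overline{u_r}$ denote the averages. Thus $L^p$-theory (see for example [12, Th. 9.11]) implies that
\[ \left\| u_r - \Pi(u_r) - \overline{\nabla u_r} \cdot x - \overline{u_r} \right\|_{W^{2,p}(B_1)} \le C_6. \]
The embedding into Hölder spaces therefore yields
\[ \left\| u_r - \Pi(u_r) - \overline{\nabla u_r} \cdot x - \overline{u_r} \right\|_{C^{1,\beta}(\overline{B_1})} \le C_7. \]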
Using that $u(x^0) = |\nabla u(x^0)| = 0$ and the above estimates implies the statement of the lemma.

Remark 5.2. The above lemma implies in particular that when one of the quantities $\|u\|_{L^\infty(B_r(x^0))}$, $S^u(x^0, r)$ and $\tau(u(x^0 + r\cdot))$ is large in comparison to $r^2$ then all these quantities are comparable. Let us indicate how to prove this: assume that $\tau(u(x^0 + r\cdot)) > \bar{C} r^2$ for some large constant $\bar{C} = \bar{C}(n, M, d)$, then
\[ S^u(x^0, r) = \left( \frac{1}{r^{n-1}} \int_{\partial B_r(x^0)} u^2 \, d\mathcal{H}^{n-1} \right)^{1/2} \ge \left( \frac{1}{r^{n-1}} \int_{\partial B_r(x^0)} \Pi(u)^2 \, d\mathcal{H}^{n-1} \right)^{1/2} - \left( \frac{1}{r^{n-1}} \int_{\partial B_r(x^0)} (u - \Pi(u))^2 \, d\mathcal{H}^{n-1} \right)^{1/2} \ge c(n)\, \tau(u(x^0 + r\cdot)) - C(n, M, d)\, r^2. \]
It follows that if $\bar{C} > 2C(n, M, d)/c(n)$ then $S^u(x^0, r) > c(n)\, \tau(u(x^0 + r\cdot))/2$. Similarly one may deduce that under the above assumptions $S^u(x^0, r) < C(n)\, \tau(u(x^0 + r\cdot))$ and that the corresponding relationships between the other quantities above hold.

In what follows, we denote by $z(x_1, \dots, x_n) := z(x_1, x_2)$ the solution of Lemma 4.4, extended to $\mathbb{R}^n$.

Lemma 5.3. For each $\epsilon > 0$, $n \in \mathbb{N}$, $d > 0$, $M < +\infty$, $\alpha \in [1, +\infty)$ and $\beta \in (0, 1)$ there exist $r_0, \delta > 0$ with the following property: Suppose that $0 < r \le r_0$, $x \in \Omega_d$ and that $u$ is a solution of (1.1) in $\Omega$ satisfying $\sup_{\Omega} |u| \le M$, $u(x) = |\nabla u(x)| = 0$ and
\[ \mathcal{L}^n\big( (\{u(x + r\cdot) > 0\} \,\triangle\, \{x_1 x_2 > 0\}) \cap B_1 \big) \le \delta. \]
Then
\[ \left\| \frac{u(x + r\cdot)}{r^2} - \Pi\!\left( \frac{u(x + r\cdot)}{r^2} \right) - z \right\|_{C^{1,\beta}(\overline{B_1})} \le \epsilon. \]
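Proof. Suppose that $r_j \to 0$, that $\mathcal{L}^n(\{u_j(x_j + r_j\cdot) > 0\} \,\triangle\, \{x_1 x_2 > 0\}) \to 0$ as $j \to \infty$, and that
\[ \frac{u_j(x_j + r_j\cdot)}{r_j^2} - \Pi\!\left( \frac{u_j(x_j + r_j\cdot)}{r_j^2} \right) \to \tilde{z} \quad \text{in } C^{1,\beta}_{\mathrm{loc}}(\mathbb{R}^n) \text{ and weakly in } W^{2,\alpha}_{\mathrm{loc}}(\mathbb{R}^n) \]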
as $j \to \infty$ (cf. Lemma 5.1). Now let $\tilde{N}$ be the Newtonian potential of $\chi_{\Omega_d} \Delta u_j$, i.e.
\[ \tilde{N}(y) := \begin{cases} \dfrac{1}{n(2-n)\omega_n} \displaystyle\int_{\mathbb{R}^n} |y - \xi|^{2-n} (\chi_{\Omega_d} \Delta u_j)(\xi) \, d\xi, & n > 2, \\[2ex] \dfrac{1}{2\pi} \displaystyle\int_{\mathbb{R}^2} \log|y - \xi| \, (\chi_{\Omega_d} \Delta u_j)(\xi) \, d\xi, & n = 2. \end{cases} \]
Next we let $N(y) := \tilde{N}(y) - \tilde{N}(x_j) - \nabla \tilde{N}(x_j) \cdot (y - x_j)$, and consider the harmonic function $h(y) := u_j(y) - N(y)$. Since $\sup |u_j| \le M$, $|h| \le C_2$ on $\partial B_d(x_j)$, and it follows that $|D^3 h(y)| \le C_3$ in $B_{d/2}(x_j)$, where $C_3$ depends on $n$, $d$ and $M$. Consequently
\[ |u_j(y) - N(y) - D^2 h(x_j)(y - x_j)(y - x_j)| \le C_4 |x_j - y|^3 \quad \text{in } B_{d/2}(x_j), \]
where $C_4$ depends only on $n$, $d$ and $M$. For the scaled functions $v_j(y) := u_j(x_j + r_j y)/r_j^2$, $N_j(y) := N(x_j + r_j y)/r_j^2$ and $p_j(y) = D^2 h(x_j)(y)(y)$ we obtain
\[ |v_j(y) - N_j(y) - p_j(y)| \le C_4 r_j |y|^3 \quad \text{in } B_{d/(2r_j)}. \]
Thus $v_j - \Pi(v_j) = N_j - \Pi(N_j) + o(1)$ as $j \to \infty$. Passing if necessary to another subsequence $j \to \infty$, the functions $N_j$ converge locally to $N_0$, where $\Delta N_0 = -\chi_{\{x_1 x_2 > 0\}}$, $N_0(0) = 0$, $\nabla N_0(0) = 0$ and $N_0 - \Pi(N_0) = \tilde{z}$. We need to establish that $|N_0(y)| = o(|y|^3)$ as $|y| \to \infty$. Once this is established the uniqueness part of Lemma 4.4 implies that $\tilde{z} = N_0 - \Pi(N_0) = z$ and the lemma follows. First, $D^2 N_0 \in BMO(\mathbb{R}^n)$, so that
\[ \int_{B_1} \left| \frac{D^2(N_0(Ry)) - \overline{D^2(N_0(R\cdot))}}{\sup_{B_R} |D^2 N_0|} \right|^2 dy \le C_5 \frac{R^4}{\sup_{B_R} |D^2 N_0|^2} \]
for all $R \in (0, +\infty)$, where $\overline{D^2(N_0(R\cdot))}$ denotes the mean value of $D^2(N_0(R\cdot))$ on $B_1$. Thus $\limsup_{R \to \infty} \sup_{B_1} |D^2 N_0(R\cdot)|/R^2 = +\infty$ implies that $N_0(R_k\cdot)/\sup_{B_{R_k}} |D^2 N_0|$ converges for a sequence $R_k \to \infty$ to a 2-homogeneous harmonic polynomial.
(5.1)
Now suppose towards a contradiction that
\[ \limsup_{|y| \to \infty} \frac{|N_0(y)|}{|y|^3} > 0. \]
Then $\Delta(N_0 - z) = 0$ in $\mathbb{R}^n$ and
\[ \limsup_{|y| \to \infty} \frac{|N_0(y) - z(y)|}{|y|^3} > 0. \]
Thus $N_0 - z$ must be a harmonic polynomial of degree $m \ge 3$, contradicting (5.1).
Lemma 5.4. Let $n = 2$, $d > 0$ and $M < +\infty$. Then there are $r_0, \delta > 0$ with the following property: Suppose that $0 < r \le r_0$, $x^0 \in \Omega_d$ and that $u$ is a solution of (1.1) in $\Omega$ satisfying $\sup_{\Omega} |u| \le M$, $u(x^0) = |\nabla u(x^0)| = 0$ and
\[ S^u(x^0, r) \ge \frac{r^2}{\delta} \]
for some $r \le r_0$. Then
\[ \mathcal{L}^n\big( (\{u(x^0 + r\cdot) > 0\} \,\triangle\, \{\Pi(u(x^0 + r\cdot)) > 0\}) \cap B_1 \big) \le C \frac{|\log(S^u(x^0, r)/r^2)|}{S^u(x^0, r)/r^2}, \]
where $C = C(d, M, r_0)$.

Proof. Let $u_r(y) := u(x^0 + ry)/r^2$. Then $u_r$ is a solution to (1.1) and $S^{u_r}(0, 1) > 1/\delta$. Let $\tau(u_r) p_r = \Pi(u_r)$. By Lemma 5.1, $\sup_{B_1} |u_r - \tau(u_r) p_r| \le C$, and we obtain at each point $x \in \{u_r > 0\} \cap \{p_r \le 0\}$ that
\[ |p_r(x)| \le \frac{C}{\tau(u_r)} \le \frac{C_1}{S^{u_r}(0, 1)}, \]
where we have used that $S^{u_r}(0, 1)$ is comparable to $\tau(u_r)$ (see Remark 5.2). Next we calculate
\[ \mathcal{L}^n(\{u_r > 0\} \cap \{p_r \le 0\} \cap B_1) \le \mathcal{L}^n\Big( \Big\{ |p_r| \le \frac{C_1}{S^{u_r}(0, 1)} \Big\} \cap B_1 \Big) \le 4 \mathcal{L}^n\Big( \Big\{ (x_1, x_2) : 0 < x_1 < 1,\ 0 < x_2 < 1,\ x_1 x_2 \le \frac{C_1}{S^{u_r}(0, 1)} \Big\} \Big) \]
\[ = 4 \int_0^{C_1/S^{u_r}(0,1)} dx_1 + 4 \int_{C_1/S^{u_r}(0,1)}^{1} \frac{C_1}{x_1 S^{u_r}(0, 1)} \, dx_1 \le \frac{C |\log(S^{u_r}(0, 1))|}{S^{u_r}(0, 1)}. \]
The lemma follows by scaling back $S^u(x^0, r) = r^2 S^{u_r}(0, 1)$.
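The measure computation at the end of the proof of Lemma 5.4 can be checked numerically. The following is a minimal sketch, not part of the original argument: it writes `eps` for the ratio $C_1/S^{u_r}(0,1)$ (a name chosen here for illustration) and compares the closed form $\varepsilon + \varepsilon \log(1/\varepsilon)$ of the area of $\{0 < x_1 < 1,\ 0 < x_2 < 1,\ x_1 x_2 \le \varepsilon\}$ with a midpoint-rule approximation.

```python
import math


def sliver_area_exact(eps: float) -> float:
    """Area of {0 < x1 < 1, 0 < x2 < 1, x1 * x2 <= eps} for 0 < eps <= 1.

    For x1 <= eps the whole vertical segment qualifies (length 1);
    for x1 > eps only the part x2 <= eps / x1 qualifies.
    """
    return eps + eps * math.log(1.0 / eps)


def sliver_area_numeric(eps: float, n: int = 100_000) -> float:
    """Midpoint-rule approximation of the same area (illustrative helper)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x1 = (i + 0.5) * h
        total += min(1.0, eps / x1) * h
    return total


if __name__ == "__main__":
    for eps in (0.5, 0.1, 0.01):
        print(eps, sliver_area_exact(eps), sliver_area_numeric(eps))
```

For small `eps` the logarithmic term dominates, which is the source of the factor $|\log(S^{u_r}(0,1))|/S^{u_r}(0,1)$ in the lemma.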
Lemma 5.5. Let $n = 2$. For each $\gamma \in (0, \log(2)/(2\pi))$, $d > 0$ and $M < +\infty$ there are $r_0, \delta > 0$, depending only on $\gamma$, $d$ and $M$, with the following property: Suppose that $0 < r \le r_0$, $x^0 \in \Omega_d$ and that $u$ is a solution of (1.1) in $\Omega$ satisfying $\sup_{\Omega} |u| \le M$, $u(x^0) = |\nabla u(x^0)| = 0$ and for some $r \le r_0$,
\[ S^u(x^0, r) \ge \frac{r^2}{\delta}. \]
Then $\tau(4u(x^0 + r\cdot/2)/r^2) \ge \tau(u(x^0 + r\cdot)/r^2) + \gamma$.

Proof. Suppose towards a contradiction that $\tau(4u_j(x_j + r_j\cdot/2)/r_j^2) < \tau(u_j(x_j + r_j\cdot)/r_j^2) + \gamma$ for a sequence $u_j$ satisfying the assumptions with $\delta = \delta_j \to 0$ as $j \to \infty$. Let $v_j := u_j(x_j + r_j\cdot)/r_j^2$. A straightforward calculation shows that $v_j$ solves (1.1) and that
\[ S^{v_j}(0, 1) \ge \frac{1}{\delta_j}. \]
From Lemma 5.4 it follows that $\mathcal{L}^n\big( (\{v_j > 0\} \,\triangle\, \{\Pi(v_j) > 0\}) \cap B_1 \big) \to 0$. We may apply Lemma 5.3 and deduce that, after a rotation of the coordinate system, $v_j - \Pi(v_j) \to z$ weakly in $W^{2,\alpha}(B_1)$ and strongly in $C^{1,\beta}(\overline{B_1})$ as $j \to \infty$, and that therefore (rotating each $v_j$ only slightly more) $\Pi(v_j) = M_j x_1 x_2$ with $M_j \to +\infty$ as $j \to \infty$. Defining $f_{1/2}(y) := 4f(y/2)$, it follows from Lemma 4.4 (v) that $\Pi((v_j)_{1/2} - M_j x_1 x_2) \to \Pi(z_{1/2}) = \log(2)\, x_1 x_2/\pi$ as $j \to \infty$. On the other hand, $\tau((v_j)_{1/2}) < \tau(v_j) + \gamma$, so that
\[ (\log(2)/\pi + M_j)/2 = \tau\big( (\log(2)/\pi + M_j)\, x_1 x_2 \big) = o(1) + \tau((v_j)_{1/2}) < o(1) + \tau(v_j) + \gamma = o(1) + M_j/2 + \gamma, \]
a contradiction for large $j$.
The next corollary proves the first statement in Theorem A and is fundamental for the rest of the paper.

Corollary 5.6. Let $n = 2$. Fix a $\gamma \in (0, \log(2)/(2\pi))$, and let $u$ satisfy the assumptions in Lemma 5.5 for some $r \le r_0$ (with possibly somewhat smaller $\delta$). Then $\tau(2^{2j} u(x^0 + 2^{-j} r\cdot)/r^2) \ge \tau(u(x^0 + r\cdot)/r^2) + j\gamma$ for all $j \in \mathbb{N}$. Moreover, for each $s \le r$,
\[ \frac{S^u(x^0, s)}{s^2} \ge \frac{S^u(x^0, r)}{r^2} + c\gamma \frac{\log(r/s)}{\log(2)} - 2C, \]
where $c = \|x_1 x_2\|_{L^2(\partial B_1)}$, and $C = C(M, d, r_0)$.

Proof. Since by Lemma 5.1,
\[ \sup_{B_1} \left| \frac{u(x^0 + rx)}{r^2} - \Pi\!\left( \frac{u(x^0 + rx)}{r^2} \right) \right| \le C_0, \]
it follows that for $s \le r$, $u_s(x) = u(x^0 + sx)/s^2$ and $c = \|x_1 x_2\|_{L^2(\partial B_1)}$,
\[ S^{u_s}(0, 1) - \sqrt{2\pi}\, C_0 \le \left( \int_{\partial B_1} |\Pi(u_s)|^2 \, d\mathcal{H}^1 \right)^{1/2} + \left( \int_{\partial B_1} |u_s - \Pi(u_s)|^2 \, d\mathcal{H}^1 \right)^{1/2} - \sqrt{2\pi}\, C_0 \le c\, \tau(u_s). \tag{5.2} \]
Similarly it follows that
\[ \tau(u_s) \left( \int_{\partial B_1} (x_1 x_2)^2 \, d\mathcal{H}^1 \right)^{1/2} \le S^{u_s}(0, 1) + \sqrt{2\pi}\, C_0. \tag{5.3} \]
From Lemma 5.5 we infer that if $S^u(x^0, r)/r^2 \ge 1/\sigma$ with $\sigma < \delta$ and $\delta$ is as in Lemma 5.5, then $\tau_{r/2} \ge \tau_r + \gamma$. Here we use the shorthand $\tau_r \equiv \tau(u(x + r\cdot)/r^2)$. From inequalities (5.2) and (5.3) we see that
\[ \frac{S^u(x^0, r/2)}{(r/2)^2} \ge (\tau_r + \gamma)c - \sqrt{2\pi}\, C_0 \ge \frac{S^u(x^0, r)}{r^2} + \gamma c - 2\sqrt{2\pi}\, C_0, \tag{5.4} \]
where $c$ is the constant in the statement of the corollary. In particular, if $\sigma$ has been chosen small enough, say $1/\sigma > 1/\delta + 2C_1$, then $u$ satisfies the assumptions of Lemma 5.5 in $B_{r/2}$. We may thus apply Lemma 5.5 again and deduce that
\[ \frac{S^u(x^0, r/4)}{(r/4)^2} \ge (\tau_r + 2\gamma)c - 2\sqrt{2\pi}\, C_0. \]
Applying Lemma 5.5 $j$ times, we arrive at
\[ \frac{S^u(x^0, r/2^j)}{(r/2^j)^2} \ge (\tau_r + j\gamma)c - C_1 \ge \frac{S^u(x^0, r)}{r^2} + c\gamma j - 2C_1. \]
Notice that $\tau_{2^{-j} r}$ is increasing in $j$, and thus $S^u(x^0, 2^{-j} r) \ge \tau_r - 2\sqrt{2\pi}\, C_0$ for each $j$; the assumptions of Lemma 5.5 are therefore satisfied for each $j$. If we put $s = 2^{-j} r$ then $j = \log(r/s)/\log(2)$ and we obtain the statement in the corollary. For general $s \le r$ we may consider a $j$ such that $2^{-(j+1)} r < s \le 2^{-j} r$. Using Lemma 5.1,
\[ \left\| \frac{u(x^0 + 2^{-j} r x)}{(2^{-j} r)^2} - \Pi\!\left( \frac{u(x^0 + 2^{-j} r x)}{(2^{-j} r)^2} \right) \right\|_{C^{1,\beta}(\overline{B_1})} \le C_2, \]
and it follows that
\[ \left| \frac{S^u(x^0, s)}{s^2} - \frac{S^u(x^0, 2^{-j} r)}{(2^{-j} r)^2} \right| \le C_3. \]
The corollary follows with a slightly larger constant $C$.

6. Controlling the Movement of $\Pi(u(x + r\cdot))$

In this section we will exploit the estimate in Corollary 5.6 to obtain control of how much the projection of $u(x + r\cdot)$ can turn when passing to a smaller radius $r$.

Lemma 6.1. Let $n = 2$, $d > 0$ and $M < \infty$. Then there are $r_0, \delta > 0$ with the following property: Suppose that $0 < r \le r_0$, $x^0 \in \Omega_d$ and that $u$ is a solution of (1.1) in $\Omega$ satisfying $\sup_{\Omega} |u| \le M$, $u(x) = |\nabla u(x)| = 0$ and
\[ \frac{S^u(x^0, r)}{r^2} \ge \frac{1}{\delta}. \]
Let $g$ be the solution of
\[ \Delta g = \chi_{\{\Pi(u(x + r\cdot)) > 0\}} - \chi_{\{u(x + r\cdot) > 0\}} \text{ in } B_1, \qquad g = 0 \text{ on } \partial B_1. \]
Then
(i)
\[ \|D^2 g\|_{L^2(B_1)} \le C \sqrt{\frac{|\log(S^u(x^0, r)/r^2)|}{S^u(x^0, r)/r^2}}. \]
(ii)
\[ \tau(g) \le C \sqrt{\frac{|\log(S^u(x^0, r)/r^2)|}{S^u(x^0, r)/r^2}}, \]
where $C = C(d, M, r_0)$.

Proof. (i) follows from Lemma 5.4 and $L^2$-theory (see for example [12, Th. 8.8]). (ii) Rotating and setting $p := \Pi(g) = a_1 x_1^2 + a_2 x_2^2$, we obtain
\[ \|D^2 p\|_{L^2(B_1)} \le C_1 \|D^2 g\|_{L^2(B_1)} \le C_2 \sqrt{\frac{|\log(S^u(x^0, r)/r^2)|}{S^u(x^0, r)/r^2}} \]
and
\[ |a_j| \le C_3 \sqrt{\frac{|\log(S^u(x^0, r)/r^2)|}{S^u(x^0, r)/r^2}} \]
for $j = 1, 2$.

The next proposition already contains the desired estimate for how much the projection may turn when passing from $u(x^0 + r\cdot)$ to $u(x^0 + r\cdot/2)$.

Proposition 6.2. Let $n = 2$, $d > 0$ and $M < +\infty$. Then there are $r_0, \delta > 0$ with the following property: Suppose that $0 < r \le r_0$, $x^0 \in \Omega_d$ and that $u$ is a solution of (1.1) in $\Omega$ satisfying $\sup_{\Omega} |u| \le M$, $u(x) = |\nabla u(x)| = 0$ and
\[ \frac{S^u(x^0, r)}{r^2} \ge \frac{1}{\delta}. \]
Then
\[ \sup_{B_1} \left| \frac{\Pi(u(x + r\cdot))}{\sup_{B_1} |\Pi(u(x + r\cdot))|} - \frac{\Pi(u(x + r\cdot/2))}{\sup_{B_1} |\Pi(u(x + r\cdot/2))|} \right| \le C \frac{\sqrt{|\log(|S^u(x^0, r)/r^2|)|}}{(S^u(x^0, r)/r^2)^{3/2}}, \]
where $C = C(n, M, d)$.

Proof. Let us consider $v = u_r - z \circ Q_r - h_r - \tau(u_r) p_r$, where $u_r(y) = u(x + ry)/r^2$, $z$ is the function defined in Lemma 4.4, $\Pi(u_r) = \tau(u_r) p_r$, the orthogonal matrix $Q_r$ has been chosen such that $\{\Pi(u_r) > 0\} = \{(x_1 x_2) \circ Q_r > 0\}$ (we may assume that $Q_r = I$, the identity matrix), $h_r = h(ry)/r^2$, and $h$ is harmonic and satisfies $h(x) \le C_1 |x|^3$. It follows that $\Delta \Pi(v) = 0$. Moreover we may express $v = g + \tilde{h}$, where $g$ is the solution of Lemma 6.1 and $\tilde{h}$ is harmonic. Lemma 6.1 (ii) implies now that for $\tilde{h}_{1/2}(y) = 4\tilde{h}(y/2)$, $g_{1/2}(y) = 4g(y/2)$ and $v_{1/2}(y) = 4v(y/2)$,
\[ \sup_{B_1} |\Pi(v_{1/2})| = \sup_{B_1} |\Pi(\tilde{h}_{1/2} + g_{1/2})| \le \sup_{B_1} |\Pi(g_{1/2})| + \sup_{B_1} |\Pi(\tilde{h}_{1/2})| \le \sup_{B_1} |\Pi(\tilde{h}_{1/2})| + C_2 \sqrt{\frac{|\log(S^u(x^0, r)/r^2)|}{S^u(x^0, r)/r^2}}. \]
Since $\Pi(v) = 0$ we also know that $|\Pi(\tilde{h})| \le |\Pi(g)| \le C_2 \sqrt{\frac{|\log(S^u(x^0, r)/r^2)|}{S^u(x^0, r)/r^2}}$. On the other hand, using that $\tilde{h}$ is harmonic and Lemma 4.3 (iii), $\Pi(\tilde{h}) = \Pi(\tilde{h}_{1/2})$ so that
\[ \sup_{B_1} \left| \Pi\big( u_{r/2} - z_{1/2} - h_{r/2} - \tau(u_r) p_r \big) \right| = \sup_{B_1} |\Pi(v_{1/2})| \le 2C_2 \sqrt{\frac{|\log(S^u(x^0, r)/r^2)|}{S^u(x^0, r)/r^2}}. \]
From the linearity of $\Pi$, $|h(x)| \le C_3 |x|^3$ and Lemma 4.4 we infer that
\[ \sup_{B_1} \left| \Pi(u_{r/2}) - (\tau(u_r) + \log(2)/(2\pi)) p_r \right| \le 2C_2 \sqrt{\frac{|\log(S^u(x^0, r)/r^2)|}{S^u(x^0, r)/r^2}} + \sup_{B_1} |\Pi(h_{r/2})| \le C_4 \sqrt{\frac{|\log(S^u(x^0, r)/r^2)|}{S^u(x^0, r)/r^2}}; \tag{6.1} \]
here we also used that $\sup_{B_1} |\Pi(h_{r/2})| \le C_4 r$, which can be absorbed in the last term since $S^u(x^0, r)/r^2$ is large by assumption. From (6.1) we conclude that
\[ \sup_{B_1} \left| \frac{\Pi(u_r)}{\sup_{B_1} |\Pi(u_r)|} - \frac{\Pi(u_{r/2})}{\sup_{B_1} |\Pi(u_{r/2})|} \right| \le \sup_{B_1} \left| \frac{\Pi(u_r)}{\sup_{B_1} |\Pi(u_r)|} - \frac{(\tau(u_r) + \log(2)/(2\pi)) p_r}{\sup_{B_1} |\Pi(u_{r/2})|} \right| + C_6 \frac{\sqrt{|\log(S^u(x^0, r)/r^2)|}}{(S^u(x^0, r)/r^2)^{3/2}}, \]
where we also used $\sup_{B_1} |\Pi(u_{r/2})| \ge C_7 S^u(x^0, r)/r^2$ (cf. Remark 5.2). Next we make the following estimate, which together with the previous estimate yields the conclusion of the proposition:
\[ \sup_{B_1} \left| \frac{\tau(u_r) p_r}{\tau(u_r)} - \frac{(\tau(u_r) + \log(2)/(2\pi)) p_r}{\sup_{B_1} |\Pi(u_{r/2})|} \right| \le \sup_{B_1} \left| \frac{\tau(u_r) p_r}{\tau(u_r)} - \frac{(\tau(u_r) + \log(2)/(2\pi)) p_r}{\tau(u_r) + \log(2)/(2\pi)} \right| + \left| \frac{\tau(u_r) + \log(2)/(2\pi)}{\sup_{B_1} |\Pi(u_{r/2})|} - 1 \right| \le C_8 \frac{1}{S^u(x^0, r)/r^2} \sqrt{\frac{|\log(S^u(x^0, r)/r^2)|}{S^u(x^0, r)/r^2}}, \]
where we have used (6.1) to estimate
\[ \left| \sup_{B_1} |\Pi(u_{r/2})| - (\tau(u_r) + \log(2)/(2\pi)) \right| \le C_4 \sqrt{\frac{|\log(S^u(x^0, r)/r^2)|}{S^u(x^0, r)/r^2}}, \]
\[ \left| \frac{\tau(u_r) + \log(2)/(2\pi)}{\sup_{B_1} |\Pi(u_{r/2})|} - 1 \right| \le C_9 \frac{1}{S^u(x^0, r)/r^2} \sqrt{\frac{|\log(S^u(x^0, r)/r^2)|}{S^u(x^0, r)/r^2}}. \]
Theorem 6.3. Let $n = 2$, $d > 0$ and suppose that $u$ solves (1.1) and that $\sup_{\Omega} |u| \le M < +\infty$. Then there exists a $\delta = \delta(M, d) > 0$ and an $r_0 = r_0(M, d) > 0$ such that if $x^0 \in \Omega_d$ and
\[ \frac{S^u(x^0, r)}{r^2} \ge \frac{1}{\delta} \]
for some $r \le r_0$ then for each $\alpha \in (0, 1/2)$ and all $s \le r$,
\[ \sup_{B_1} \left| \frac{\Pi(u(x^0 + rx))}{\sup_{B_1} |\Pi(u(x^0 + rx))|} - \frac{\Pi(u(x^0 + sx))}{\sup_{B_1} |\Pi(u(x^0 + sx))|} \right| \le C(d, M, \alpha) \left( \frac{r^2}{S^u(x^0, r)} \right)^{\alpha}. \]

Proof. For simplicity we will only prove the theorem for $s = 2^{-j} r$; for general $s$ we may use the estimate in Lemma 5.1 as indicated in the proof of Corollary 5.6. Let us choose $\delta$ small enough so that Corollary 5.6 holds for some fixed $\gamma > 0$, i.e.
\[ \frac{S^u(x^0, 2^{-j} r)}{2^{-2j} r^2} \ge \frac{S^u(x^0, r)}{r^2} + c\gamma j - 2C. \tag{6.2} \]
Decreasing $\delta$ somewhat more if necessary, we see that (6.2) implies that the assumptions in Proposition 6.2 hold for every ball $B_{2^{-j} r}(x^0)$. Using the triangle inequality we obtain that
\[ \sup_j \sup_{B_1} \left| \frac{\Pi(u(x^0 + rx))}{\sup_{B_1} |\Pi(u(x^0 + rx))|} - \frac{\Pi(u(x^0 + 2^{-j} r x))}{\sup_{B_1} |\Pi(u(x^0 + 2^{-j} r x))|} \right| \le \sum_{j=0}^{\infty} \sup_{B_1} \left| \frac{\Pi(u(x^0 + 2^{-j} r x))}{\sup_{B_1} |\Pi(u(x^0 + 2^{-j} r x))|} - \frac{\Pi(u(x^0 + 2^{-j-1} r x))}{\sup_{B_1} |\Pi(u(x^0 + 2^{-j-1} r x))|} \right|. \]
This sum may be estimated, by Proposition 6.2, from above by
\[ \sum_{j=0}^{\infty} \frac{\log\big( S^u(x^0, 2^{-j} r)/(2^{-2j} r^2) \big)}{\big( S^u(x^0, 2^{-j} r)/(2^{-2j} r^2) \big)^{3/2}}. \tag{6.3} \]
Let us set $k$ to be the smallest integer satisfying
\[ k \ge \frac{1}{c\gamma} \left( \frac{S^u(x^0, r)}{r^2} - 2C \right). \]
For $S^u(x^0, r)/r^2$ large enough we see that
\[ k > c_1 \frac{S^u(x^0, r)}{r^2}. \]
Using (6.2) we may estimate (6.3) by
\[ C_2 \sum_{j=k}^{\infty} \frac{\log(c\gamma j)}{(c\gamma j)^{3/2}} \le C_3 \int_k^{\infty} \frac{\log(c\gamma t)}{(c\gamma t)^{3/2}} \, dt \le C_4 \frac{2 + \log k}{\sqrt{k}} \le C_5(\alpha)\, k^{-\alpha} \]
for each $\alpha \in (0, 1/2)$. Using (6.4) gives the Theorem.
(6.4)
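The tail estimate used at the end of the proof of Theorem 6.3 rests on an elementary antiderivative. The following is a minimal sketch, under the simplifying assumption $c\gamma = 1$ (made here only for illustration), of the identity $\int_k^\infty \log(t)\, t^{-3/2}\, dt = (2\log k + 4)/\sqrt{k}$ and of the comparison of the sum with the integral; the function names are hypothetical.

```python
import math


def tail_integral(k: float) -> float:
    """Closed form of the integral of log(t) / t**1.5 over [k, infinity), k > 0.

    An antiderivative is -2*log(t)/sqrt(t) - 4/sqrt(t), which tends to 0 at infinity.
    """
    return (2.0 * math.log(k) + 4.0) / math.sqrt(k)


def partial_sum(k: int, n_terms: int) -> float:
    """Sum of log(j) / j**1.5 for j = k, ..., k + n_terms - 1."""
    return sum(math.log(j) / j ** 1.5 for j in range(k, k + n_terms))


if __name__ == "__main__":
    alpha = 0.4  # any exponent in (0, 1/2)
    for k in (10, 1_000, 100_000):
        # The product grows only like log(k) * k**(alpha - 1/2), so it stays bounded.
        print(k, tail_integral(k), tail_integral(k) * k ** alpha)
```

Since the integrand is decreasing for $t > e^{2/3}$, the sum from $j = k$ is bounded by the integral from $k - 1$, and the logarithm is absorbed by $k^{1/2 - \alpha}$ for any $\alpha < 1/2$, which is why the exponent range $(0, 1/2)$ appears in the theorem.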
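7. Conclusion

Corollary 7.1. Under the assumptions in Theorem 6.3 the following holds:
(i) there exists a homogeneous harmonic polynomial $p^{x^0, u} = p$ of second order such that for each $\alpha \in (0, 1/2)$ and each $\beta \in (0, 1/2)$,
\[ \left\| \frac{u(x^0 + sx)}{\sup_{B_s(x^0)} |u|} - p \right\|_{C^{1,\beta}(\overline{B_1})} \le C(d, M, \alpha, \beta) \left( \frac{\delta}{1 + \delta \log(r/s)} \right)^{\alpha}. \]
(ii) The set $\{u = 0\} \cap B_r(x^0)$ consists of two $C^1$-curves intersecting each other at right angles at $x^0$.

Proof. From Corollary 5.6 we know that for each $s \le r$,
\[ \frac{S^u(x^0, s)}{s^2} \ge c_1 \left( \frac{1}{\delta} + \log(r/s) \right). \tag{7.1} \]
It follows from Theorem 6.3 that
\[ \lim_{s \to 0} \frac{\Pi(u(x^0 + sx))}{\sup_{B_1} |\Pi(u(x^0 + sx))|} = p^{x^0, u} \equiv p \tag{7.2} \]
exists. Using Lemma 5.1 gives
\[ C_2 \ge \left\| \frac{u(x^0 + sx)}{s^2} - \frac{\Pi(u(x^0 + sx))}{s^2} \right\|_{C^{1,\beta}} \ge \left\| \frac{u(x^0 + sx)}{s^2} - p\, \frac{\sup_{B_s(x^0)} |u|}{s^2} \right\|_{C^{1,\beta}} - \left\| p\, \frac{\sup_{B_s(x^0)} |u|}{s^2} - \frac{\Pi(u(x^0 + sx))}{s^2} \right\|_{C^{1,\beta}} \]
\[ = \frac{\sup_{B_s(x^0)} |u|}{s^2} \left( \left\| \frac{u(x^0 + sx)}{\sup_{B_s(x^0)} |u|} - p \right\|_{C^{1,\beta}} - \left\| p - \frac{\sup_{B_1} |\Pi(u(x^0 + sx))|}{\sup_{B_s(x^0)} |u|} \, \frac{\Pi(u(x^0 + sx))}{\sup_{B_1} |\Pi(u(x^0 + sx))|} \right\|_{C^{1,\beta}} \right). \tag{7.3} \]
As a direct consequence of Lemma 5.1 we obtain
\[ \left| \frac{\sup_{B_1} |\Pi(u(x^0 + sx))|}{\sup_{B_s(x^0)} |u|} - 1 \right| \le \frac{C_3 s^2}{\sup_{B_s(x^0)} |u|}. \]
This, together with Theorem 6.3, implies that
\[ \left\| p - \frac{\sup_{B_1} |\Pi(u(x^0 + sx))|}{\sup_{B_s(x^0)} |u|} \, \frac{\Pi(u(x^0 + sx))}{\sup_{B_1} |\Pi(u(x^0 + sx))|} \right\|_{C^{1,\beta}} \le C_4 \left( \frac{s^2}{S^u(x^0, s)} \right)^{\alpha}. \]
Rearranging terms in (7.3) we get
\[ \left\| \frac{u(x^0 + sx)}{\sup_{B_s(x^0)} |u|} - p \right\|_{C^{1,\beta}} \le C_5 \left( \frac{s^2}{S^u(x^0, s)} \right)^{\alpha} \le C(d, M, \alpha, \beta) \left( \frac{\delta}{1 + \delta \log(r/s)} \right)^{\alpha}. \]
This proves (i).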
Rotating the coordinate system we may assume that $p^{x^0, u} = p = 2x_1 x_2$. The first part of the corollary implies that
\[ u(x^0 + s\cdot) < 0 \text{ in } \left\{ (x_1, x_2) \in B_1 : x_1 x_2 \le -C(d, M, \alpha, \beta) \left( \frac{\delta}{1 + \delta \log(r/s)} \right)^{\alpha} \right\} \equiv K_s^-, \]
that
\[ u(x^0 + s\cdot) > 0 \text{ in } \left\{ (x_1, x_2) \in B_1 : x_1 x_2 \ge C(d, M, \alpha, \beta) \left( \frac{\delta}{1 + \delta \log(r/s)} \right)^{\alpha} \right\} \equiv K_s^+, \]
and that
\[ \left| \partial_\theta \frac{u(x^0 + sx)}{\sup_{B_s(x^0)} |u|} \right| \ge c_6 |x| \text{ in } B_1 \setminus (K_s^- \cup K_s^+). \]
From the implicit function theorem it follows that, for each > 0, $\{u = 0\}$ consists of four $C^1$-curves in $B_s(x^0) \setminus B_{s/2}(x^0)$. To show that $\{u = 0\}$ consists of two $C^1$-curves we only need to show that these four curves are differentiable at $x^0$ and that their derivatives match. The normal $\nu$ of $\{u = 0\}$ will point in the same (or opposite) direction as $\nabla u$ at any point of $(B_s(x^0) \setminus \{x^0\}) \cap \{u = 0\}$. Let us consider a point $x^0 + sx$ of $\{u = 0\}$ such that $x_2 = 1$ and $|x_1| \le 1$: from (i) it follows that at the point $x^0 + sx$,
\[ \frac{\nabla u(x^0 + sx)}{\sup_{B_s(x^0)} |u|} = \left( \frac{\nabla u(x^0 + sx)}{\sup_{B_s(x^0)} |u|} - 2\nabla(x_1 x_2) \right) + 2\nabla(x_1 x_2) = 2e_1 + \text{terms of order } \left( \frac{\delta}{1 + \delta \log(r/s)} \right)^{\alpha}. \]
By a similar argument for each of the four components of $(\{u = 0\} \cap B_s(x^0)) \setminus \{x^0\}$ it follows that each component is a $C^1$-curve with modulus of continuity $\sigma(s) = C_7 (\log(r/s))^{-\alpha}$ and that each component approaches $x^0$ tangentially relative to the $x_1$- or $x_2$-axis. This proves (ii).

Acknowledgements. H. Shahgholian has been supported in part by the Swedish Research Council. G.S. Weiss has been partially supported by the Grant-in-Aid 18740086 of the Japanese Ministry of Education, Culture, Sports, Science and Technology. He also thanks the Knut och Alice Wallenberg foundation for a visiting appointment to KTH. Both J. Andersson and G.S. Weiss thank the Göran Gustafsson Foundation for visiting appointments to KTH. The present result is part of the ESF-program GLOBAL. It was completed while the first two authors were visiting the Petroleum Institute in Abu Dhabi.
References
1. Andersson, J., Weiss, G.S.: Cross-shaped and degenerate singularities in an unstable elliptic free boundary problem. J. Diff. Eqs. 228(2), 633–640 (2006)
2. Andersson, J., Shahgholian, H., Weiss, G.S.: In preparation
3. Blank, I.: Eliminating mixed asymptotics in obstacle type free boundary problems. Comm. Part. Diff. Eqs. 29(7–8), 1167–1186 (2004)
4. Caffarelli, L.A.: The obstacle problem revisited. J. Fourier Anal. Appl. 4(4–5), 383–402 (1998)
5. Caffarelli, L.A., Friedman, A.: The free boundary in the Thomas-Fermi atomic model. J. Diff. Eqs. 32(3), 335–356 (1979)
6. Caffarelli, L.A., Friedman, A.: The shape of axisymmetric rotating fluid. J. Funct. Anal. 35(1), 109–142 (1980)
7. Caffarelli, L.A., Karp, L., Shahgholian, H.: Regularity of a free boundary with application to the Pompeiu problem. Ann. of Math. (2) 151(1), 269–292 (2000)
8. Chanillo, S., Grieser, D., Imai, M., Kurata, K., Ohnishi, I.: Symmetry breaking and other phenomena in the optimization of eigenvalues for composite membranes. Commun. Math. Phys. 214(2), 315–337 (2000)
9. Chanillo, S., Grieser, D., Kurata, K.: The free boundary problem in the optimization of composite membranes. In: Differential Geometric Methods in the Control of Partial Differential Equations (Boulder, CO, 1999), Volume 268 of Contemp. Math., Providence, RI: Amer. Math. Soc., 2000, pp. 61–81
10. Chanillo, S., Kenig, C.E.: Weak uniqueness and partial regularity for the composite membrane problem. J. Eur. Math. Soc. 10, 705–737 (2007)
11. Chanillo, S., Kenig, C.E., To, T.: Regularity of the minimizers in the composite membrane problem in R^2. J. Funct. Anal. 255(9), 2299–2320 (2008)
12. Gilbarg, D., Trudinger, N.S.: Elliptic partial differential equations of second order. Volume 224 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Berlin: Springer-Verlag, second edition, 1983
13. Karp, L., Margulis, A.S.: Newtonian potential theory for unbounded sources and applications to free boundary problems. J. Anal. Math. 70, 1–63 (1996)
14. Monneau, R., Weiss, G.S.: An unstable elliptic free boundary problem arising in solid combustion. Duke Math. J. 136(2), 321–341 (2007)
15. Shahgholian, H.: The singular set for the composite membrane problem. Commun. Math. Phys. 271(1), 93–101 (2007)
16. Shahgholian, H., Uraltseva, N., Weiss, G.S.: The two-phase membrane problem—regularity of the free boundaries in higher dimensions. Int. Math. Res. Not. IMRN, (8):Art. ID rnm026, 16 (2007)
17. Simon, L.: Asymptotics for a class of nonlinear evolution equations, with applications to geometric problems. Ann. of Math. (2) 118(3), 525–571 (1983)
18. Simon, L.: Theorems on Regularity and Singularity of Energy Minimizing Maps. Based on lecture notes by Norbert Hungerbühler, Lectures in Mathematics ETH Zürich. Basel: Birkhäuser Verlag, 1996
19. Stein, E.M.: Harmonic Analysis: Real-Variable Methods, Orthogonality, and Oscillatory Integrals. With the assistance of Timothy S. Murphy, Monographs in Harmonic Analysis, III. Volume 43 of Princeton Mathematical Series. Princeton, NJ: Princeton University Press, 1993
20. Weiss, G.S.: Partial regularity for weak solutions of an elliptic free boundary problem. Comm. Part. Diff. Eqs. 23(3–4), 439–455 (1998)

Communicated by P. Constantin
Commun. Math. Phys. 296, 271–283 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-0990-2
Communications in
Mathematical Physics
Limit of Quasilocal Mass at Spatial Infinity
Mu-Tao Wang1, Shing-Tung Yau2
1 Department of Mathematics, Columbia University, New York, NY 10027, USA. E-mail: [email protected]
2 Department of Mathematics, Harvard University, Cambridge, MA 02138, USA
Received: 8 June 2009 / Accepted: 6 October 2009 Published online: 4 February 2010 – © Springer-Verlag 2010
The first author is supported by NSF grant DMS-0605115. The second author is supported by NSF grant DMS-0628341.

Abstract: We study the limit of quasilocal mass defined in [4 and 5] for a family of spacelike 2-surfaces in spacetime. In particular, we show the limit coincides with the ADM mass at spatial infinity. The limit for coordinate spheres of a boosted slice of the Schwarzschild solution is computed explicitly and shown to give the expected energy-momentum four-vector.

1. Review of the Definition of Quasilocal Energy

In [4 and 5], we define a notion of quasilocal mass for spacelike 2-surfaces in a spacetime. Given an isometric embedding of a 2-surface into $\mathbb{R}^{3,1}$ and a future timelike unit vector (observer) in $\mathbb{R}^{3,1}$, we associated a quasilocal energy with respect to a canonical gauge. Minimizing among the reference data gives the quasilocal mass and the quasilocal energy-momentum four-vector. We prove that the mass has the important positivity property when the 2-surface bounds a non-singular hypersurface that satisfies the dominant energy condition and it vanishes for surfaces in $\mathbb{R}^{3,1}$. The expression for the mass is nevertheless rather nonlinear and complicated. In this article, we show that for a family of surfaces going out to spatial infinity, the expression indeed gets "linearized" and gives a well-defined energy-momentum four-vector.

First of all, we recall the definition of quasilocal energy in [4]. Let $\Sigma$ be a spacelike 2-surface in a time-orientable spacetime $N$. Consider a reference isometric embedding $\Sigma \to \mathbb{R}^{3,1} = \hat{N}$. Fix a future timelike unit vector $\hat{t}^\nu$ in $\mathbb{R}^{3,1}$. We decompose $\hat{t}^\nu$ along $\Sigma \subset \mathbb{R}^{3,1}$ into $\hat{t}^\nu = \hat{N} \hat{u}^\nu + \hat{N}^\nu$ in which $\hat{N}$ is the lapse function, $\hat{N}^\nu$ is the shift vector, and $\hat{u}^\nu$ is the future timelike unit normal vector field along $\Sigma \subset \mathbb{R}^{3,1}$ determined by this decomposition. We also take the spacelike outward pointing unit normal $\hat{v}^\nu$ that is orthogonal to $\hat{u}^\nu$ along $\Sigma \subset \mathbb{R}^{3,1}$. $(\hat{u}^\nu, \hat{v}^\nu)$ is the reference gauge for $\Sigma \subset \mathbb{R}^{3,1}$ with respect to $\hat{t}^\nu$. To compute the quasilocal energy, we also need the canonical gauge $(u^\nu, v^\nu)$ along $\Sigma \subset N$. $u^\nu$ is characterized as the unique future timelike unit normal vector field along $\Sigma \subset N$ such that
\[ h_\nu u^\nu = \hat{h}_\nu \hat{u}^\nu, \tag{1.1} \]
where $h^\nu$ is the mean curvature vector of $\Sigma \subset N$ and $\hat{h}^\nu$ is the mean curvature vector of $\Sigma \subset \mathbb{R}^{3,1}$. $v^\nu$ is the spacelike unit normal vector that is orthogonal to $u^\nu$ and satisfies $v^\nu h_\nu < 0$. Take a spacelike hypersurface $\hat{\Omega} \subset \mathbb{R}^{3,1}$ spanned by $\Sigma \subset \mathbb{R}^{3,1}$ and $\hat{v}^\nu$, and a spacelike hypersurface $\Omega \subset N$ spanned by $\Sigma \subset N$ and $v^\nu$. Let $\hat{k}$ be the mean curvature of $\Sigma$ with respect to $\hat{\Omega}$ and $k$ be the mean curvature of $\Sigma$ with respect to $\Omega$. Also denote by $\hat{K}_{\mu\nu}$ and $K_{\mu\nu}$ the extrinsic curvatures of $\hat{\Omega}$ and $\Omega$, respectively. These data depend only on the gauges along $\Sigma$ but not on the hypersurfaces. Quasilocal energy in the canonical gauge (see Eq. (6) in [4]) is defined to be
\[ \frac{1}{8\pi} \int_\Sigma (\hat{k} - k) \hat{N} - (\hat{v}^\mu \hat{K}_{\mu\nu} - v^\mu K_{\mu\nu}) \hat{N}^\nu. \tag{1.2} \]
We shall rewrite the quasilocal energy in terms of the mean curvature gauge. In order to do so, we adopt a different set of notations from [5]. Set $T_0 = \hat{t}^\nu$, $\hat{H} = \hat{h}^\nu$, $H = h^\nu$, $\hat{e}_3 = \hat{v}^\nu$, $\hat{e}_4 = \hat{u}^\nu$, $e_3 = v^\nu$, $e_4 = u^\nu$. Denote by $X : \Sigma \to \mathbb{R}^{3,1}$ the position vector of the isometric embedding and by $\tau = -\langle X, T_0 \rangle$ the restriction of the time function associated with $T_0$. $T_0 = \sqrt{1 + |\nabla\tau|^2}\, \hat{e}_4 - \nabla\tau$ and thus $\hat{N} = \sqrt{1 + |\nabla\tau|^2}$ and $\hat{N}^\nu = -\nabla\tau$. The quasilocal energy becomes
\[ \frac{1}{8\pi} \int_\Sigma \Big[ \big( -\langle \hat{H}, \hat{e}_3 \rangle + \langle H, e_3 \rangle \big) \sqrt{1 + |\nabla\tau|^2} - \big( \langle \nabla^{\mathbb{R}^{3,1}}_{-\nabla\tau} \hat{e}_4, \hat{e}_3 \rangle - \langle \nabla^{N}_{-\nabla\tau} e_4, e_3 \rangle \big) \Big] \, d\Sigma. \tag{1.3} \]
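Suppose the mean curvature vector $\hat{H}$ of $\Sigma$ in $\mathbb{R}^{3,1}$ is spacelike. Let $e_3^{\hat{H}} = -\hat{H}/|\hat{H}|$ be the unit vector in the direction of $-\hat{H}$ and $e_4^{\hat{H}}$ be the future-directed timelike unit normal vector with $\langle e_3^{\hat{H}}, e_4^{\hat{H}} \rangle = 0$. The relation between the two gauges along $\Sigma \subset \mathbb{R}^{3,1}$ is
\[ e_3^{\hat{H}} = \cosh\hat{\theta}\, \hat{e}_3 + \sinh\hat{\theta}\, \hat{e}_4, \quad \text{and} \quad e_4^{\hat{H}} = \sinh\hat{\theta}\, \hat{e}_3 + \cosh\hat{\theta}\, \hat{e}_4 \]
for some $\hat{\theta} \in \mathbb{R}$. Since $\Delta\tau = -\langle \hat{H}, T_0 \rangle$, we derive
\[ \sinh\hat{\theta} = \frac{-\Delta\tau}{|\hat{H}| \sqrt{1 + |\nabla\tau|^2}}. \]
Therefore,
\[ \langle \nabla^{\mathbb{R}^{3,1}}_{\nabla\tau} \hat{e}_4, \hat{e}_3 \rangle = -\nabla\hat{\theta} \cdot \nabla\tau + \langle \nabla^{\mathbb{R}^{3,1}}_{\nabla\tau} e_4^{\hat{H}}, e_3^{\hat{H}} \rangle. \]
The canonical gauge condition (1.1) $\langle \hat{H}, \hat{e}_4 \rangle = \langle H, e_4 \rangle$ implies $e_3^{H} = -H/|H|$ is given by
\[ e_3^{H} = \cosh\theta\, e_3 + \sinh\theta\, e_4 \quad \text{with} \quad \sinh\theta = \frac{-\Delta\tau}{|H| \sqrt{1 + |\nabla\tau|^2}}. \tag{1.4} \]
Expression (1.3) can now be rewritten in terms of the mean curvature gauge.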
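To summarize, let $\Sigma \subset N$ be a spacelike 2-surface in a spacetime $N$ and let $X : \Sigma \to \mathbb{R}^{3,1}$ be a reference isometric embedding of $\Sigma$ into the Minkowski space. For any given future timelike constant unit vector $T_0 \in \mathbb{R}^{3,1}$, the time function on $\Sigma \subset \mathbb{R}^{3,1}$ is denoted by $\tau = -\langle X, T_0 \rangle$. Let $H$ be the mean curvature vector of $\Sigma$ in $N$; we assume $H$ is spacelike. Let $J$ be the future timelike normal vector field along $\Sigma$ in $N$ which is dual to $H$ along the light cone in the normal bundle of $\Sigma$ in $N$, i.e. $J$ is the unique future timelike vector that is the reflection of $H$ along the light cone in the normal bundle. Denote by $\hat{H}$ and $\hat{J}$ the corresponding data on the isometric embedding in $\mathbb{R}^{3,1}$. Again, $\hat{H}$ is assumed to be spacelike in $\mathbb{R}^{3,1}$. The quasilocal energy of $\Sigma$ with respect to the pair $(X, T_0)$ is given by
\[ E(\Sigma, X, T_0) = \frac{1}{8\pi} \int_\Sigma \Bigg[ \sqrt{|\hat{H}|^2 (1 + |\nabla\tau|^2) + (\Delta\tau)^2} - \sqrt{|H|^2 (1 + |\nabla\tau|^2) + (\Delta\tau)^2} - \Delta\tau \left( \sinh^{-1} \frac{\Delta\tau}{\sqrt{1 + |\nabla\tau|^2}\, |\hat{H}|} - \sinh^{-1} \frac{\Delta\tau}{\sqrt{1 + |\nabla\tau|^2}\, |H|} \right) - \left\langle \nabla^{\mathbb{R}^{3,1}}_{\nabla\tau} \frac{\hat{J}}{|\hat{H}|}, \frac{\hat{H}}{|\hat{H}|} \right\rangle + \left\langle \nabla^{N}_{\nabla\tau} \frac{J}{|H|}, \frac{H}{|H|} \right\rangle \Bigg] \, d\Sigma, \tag{1.5} \]
where $\Delta\tau$ is the Laplacian of $\tau$ on $\Sigma$ (with respect to the induced metric), $\nabla^N$ and $\nabla^{\mathbb{R}^{3,1}}$ are the covariant derivatives on $N$ and $\mathbb{R}^{3,1}$, respectively, and $\nabla\tau$ is the gradient of $\tau$ on $\Sigma$ (with respect to the induced metric again), considered as a tangent vector field on $\Sigma$. In the expressions for the last two integrands, we push forward $\nabla\tau$ by the embeddings and identify it as vector fields along $\Sigma$ in $\mathbb{R}^{3,1}$ and $N$, respectively.

2. General Formula for the Limit of Quasilocal Energy

Fix $R_0 > 0$ and suppose $\Sigma_r$, $R_0 < r < \infty$ is a family of closed 2-surfaces in $N$, and $X_r$ is a family of isometric embeddings of $\Sigma_r$ into $\mathbb{R}^{3,1}$. In the following theorem, we derive an expression for the limit of $E(\Sigma_r, X_r, T_0)$.

Theorem 2.1. Suppose the mean curvature vectors of $\Sigma_r$ and of the image of $X_r$ in $\mathbb{R}^{3,1}$ are both spacelike for $r > R_0$ and $|\hat{H}|/|H| \to 1$ as $r \to \infty$. Then the limit of $E(\Sigma_r, X_r, T_0)$ as $r \to \infty$ is the same as the limit of
\[ \frac{1}{8\pi} \int_{\Sigma_r} \left[ -\left\langle T_0, \frac{\hat{J}}{|\hat{H}|} \right\rangle (|\hat{H}| - |H|) - \left\langle \nabla^{\mathbb{R}^{3,1}}_{\nabla\tau} \frac{\hat{J}}{|\hat{H}|}, \frac{\hat{H}}{|\hat{H}|} \right\rangle + \left\langle \nabla^{N}_{\nabla\tau} \frac{J}{|H|}, \frac{H}{|H|} \right\rangle \right] d\Sigma_r, \]
as long as the limits exist.

Proof. We compute
\[ \Delta\tau = -\langle \hat{H}, T_0 \rangle = |\hat{H}| \langle e_3^{\hat{H}}, T_0 \rangle \tag{2.1} \]
and
\[ |\nabla\tau|^2 = -1 + \langle e_4^{\hat{H}}, T_0 \rangle^2 - \langle e_3^{\hat{H}}, T_0 \rangle^2, \tag{2.2} \]
where $e_3^{\hat{H}} = -\hat{H}/|\hat{H}|$ and $e_4^{\hat{H}} = \hat{J}/|\hat{H}|$ is the future timelike unit normal dual to $e_3^{\hat{H}}$ along the image of $X$ in $\mathbb{R}^{3,1}$. Rationalize the expression
\[ \sqrt{|\hat{H}|^2 (1 + |\nabla\tau|^2) + (\Delta\tau)^2} - \sqrt{|H|^2 (1 + |\nabla\tau|^2) + (\Delta\tau)^2} \]
as
\[ (|\hat{H}| - |H|) \frac{(|\hat{H}| + |H|)(1 + |\nabla\tau|^2)}{\sqrt{|\hat{H}|^2 (1 + |\nabla\tau|^2) + (\Delta\tau)^2} + \sqrt{|H|^2 (1 + |\nabla\tau|^2) + (\Delta\tau)^2}}. \]
By assumption of $|H|/|\hat{H}| \to 1$ at infinity, the limit as $r \to \infty$ is thus the same as the limit
\[ \frac{1}{8\pi} \int_{\Sigma_r} \frac{\langle e_4^{\hat{H}}, T_0 \rangle^2 - \langle e_3^{\hat{H}}, T_0 \rangle^2}{-\langle e_4^{\hat{H}}, T_0 \rangle} (|\hat{H}| - |H|) \, d\Sigma_r. \tag{2.3} \]
Next we study the term
\[ -\Delta\tau \left( \sinh^{-1} \frac{\Delta\tau}{\sqrt{1 + |\nabla\tau|^2}\, |\hat{H}|} - \sinh^{-1} \frac{\Delta\tau}{\sqrt{1 + |\nabla\tau|^2}\, |H|} \right) \]
by rewriting it as
\[ -\Delta\tau \left( \sinh^{-1} \frac{\Delta\tau}{\sqrt{1 + |\nabla\tau|^2}\, |\hat{H}|} - \sinh^{-1} \left( \frac{\Delta\tau}{\sqrt{1 + |\nabla\tau|^2}\, |\hat{H}|} \, \frac{|\hat{H}|}{|H|} \right) \right). \]
Note that
\[ \frac{\sinh^{-1} A - \sinh^{-1}(A(1 + x))}{x} \to \frac{-A}{\sqrt{1 + A^2}} \]
as $x \to 0$. With $x = \frac{|\hat{H}|}{|H|} - 1 \to 0$, the limit of the second term is thus the same as the limit of
\[ \frac{1}{8\pi} \int_{\Sigma_r} \frac{\langle e_3^{\hat{H}}, T_0 \rangle^2}{-\langle e_4^{\hat{H}}, T_0 \rangle} (|\hat{H}| - |H|) \, d\Sigma_r. \tag{2.4} \]
The theorem is proved by combining (2.3) and (2.4).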
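The elementary limit used for the $\sinh^{-1}$ term in the proof of Theorem 2.1 is the derivative of $x \mapsto \sinh^{-1}(A(1 + x))$ at $x = 0$, up to sign. The following is a minimal numerical sketch of that limit; the function names are chosen here for illustration and are not from the paper.

```python
import math


def difference_quotient(A: float, x: float) -> float:
    """(asinh(A) - asinh(A * (1 + x))) / x for a small nonzero x."""
    return (math.asinh(A) - math.asinh(A * (1.0 + x))) / x


def predicted_limit(A: float) -> float:
    """Limit of the difference quotient as x -> 0, namely -A / sqrt(1 + A^2)."""
    return -A / math.sqrt(1.0 + A * A)


if __name__ == "__main__":
    for A in (-2.0, 0.3, 5.0):
        print(A, difference_quotient(A, 1e-6), predicted_limit(A))
```

Multiplying by $-\Delta\tau$ and inserting $A = \Delta\tau / (\sqrt{1 + |\nabla\tau|^2}\, |\hat{H}|)$ together with (2.1) and (2.2) gives the integrand in (2.4).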
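Suppose the image of the isometric embedding $X_r$ lies in $\mathbb{R}^3 \subset \mathbb{R}^{3,1}$, then $e_4^{\hat{H}} = \hat{J}/|\hat{H}|$ is a constant vector and the $\left\langle \nabla^{\mathbb{R}^{3,1}}_{\nabla\tau} \frac{\hat{J}}{|\hat{H}|}, \frac{\hat{H}}{|\hat{H}|} \right\rangle$ term vanishes. In this case, $e_3^{\hat{H}}$ coincides with the outward unit normal of the embedding in $\mathbb{R}^3$.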
Corollary 2.1. Suppose the reference isometric embedding is in $\mathbb{R}^3 \subset \mathbb{R}^{3,1}$ and $|\hat{H}|/|H| \to 1$ as $r \to \infty$, then the limit of the quasilocal energy with respect to $T_0 = (\sqrt{1 + |a|^2}, a^1, a^2, a^3)$ with $|a|^2 = \sum_{i=1}^3 (a^i)^2$ is the same as the limit of
\[ \frac{\sqrt{1 + |a|^2}}{8\pi} \int_{\Sigma_r} (|\hat{H}| - |H|) \, d\Sigma_r + \frac{1}{8\pi} \int_{\Sigma_r} \left\langle \nabla^{N}_{\nabla\tau} \frac{J}{|H|}, \frac{H}{|H|} \right\rangle d\Sigma_r, \tag{2.5} \]
when $r \to \infty$ as long as the limits exist.

Suppose the isometric embedding for $\Sigma_r$ is given by $X_r = (X^1, X^2, X^3) : \Sigma \to \mathbb{R}^3$ and consider $X^i$, $i = 1, 2, 3$ as functions on $\Sigma_r$. Thus $\nabla\tau = -\sum_{i=1}^3 a^i \nabla X^i$ and we obtain a limiting quasilocal energy-momentum four-vector $(e, p_1, p_2, p_3)$ as the limit of
\[ e = \lim_{r \to \infty} \frac{1}{8\pi} \int_{\Sigma_r} (|\hat{H}| - |H|) \, d\Sigma_r, \qquad p_i = \lim_{r \to \infty} \frac{1}{8\pi} \int_{\Sigma_r} \left\langle \nabla^{N}_{-\nabla X^i} \frac{J}{|H|}, \frac{H}{|H|} \right\rangle d\Sigma_r, \quad i = 1, 2, 3. \tag{2.6} \]

3. Relating to ADM Energy-Momentum

Let $(M, g_{ij}, p_{ij})$ be an asymptotically flat hypersurface in a spacetime $N$. Thus there exists a compact set $K \subset M$ such that $M \setminus K$ is diffeomorphic to a union of complements of balls in $\mathbb{R}^3$ (ends) such that $g_{ij} = \delta_{ij} + a_{ij}$ with $a_{ij} = O(\frac{1}{r})$, $\partial_k(a_{ij}) = O(\frac{1}{r^2})$, $\partial_l \partial_k(a_{ij}) = O(\frac{1}{r^3})$, and $p_{ij} = O(\frac{1}{r^2})$, $\partial_k(p_{ij}) = O(\frac{1}{r^3})$ on each end of $M \setminus K$. The ADM energy-momentum (Arnowitt-Deser-Misner) of an end of $M$ is the four-vector $(E, P_1, P_2, P_3)$, where
\[ E = \lim_{r \to \infty} \frac{1}{16\pi} \int_{S_r} (\partial_j g_{ij} - \partial_i g_{jj}) \nu^i \, dS_r \]
is the total energy and
\[ P_k = \lim_{r \to \infty} \frac{1}{16\pi} \int_{S_r} 2(p_{ik} - \delta_{ik} p_{jj}) \nu^i \, dS_r \]
is the total momentum. Here $S_r$ is a coordinate sphere of radius $r$ on the end and $\nu$ is the outward unit normal of $S_r$. The positive mass theorem (Schoen-Yau [3], Witten [6]) asserts that under the dominant energy condition, the four-vector $(E, P_1, P_2, P_3)$ is future timelike, i.e. $E \ge 0$ and $-E^2 + P_1^2 + P_2^2 + P_3^2 \le 0$. In the following, we prove that for coordinate spheres of radius $r$, the limit of the quasilocal energy-momentum (2.6) is the same as the ADM energy-momentum.
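A short numerical sketch may help to see why a future timelike four-vector $(E, P_1, P_2, P_3)$ gives a nonnegative value of $\sqrt{1 + |a|^2}\, E + \sum_i a^i P_i$ for every observer of the form $T_0 = (\sqrt{1 + |a|^2}, a^1, a^2, a^3)$ used in Corollary 2.1. The helper names below are hypothetical, and the sample values of $E$, $P$ and $a$ are chosen purely for illustration.

```python
import math


def minkowski(u, v):
    """Minkowski inner product with signature (-, +, +, +)."""
    return -u[0] * v[0] + sum(x * y for x, y in zip(u[1:], v[1:]))


def observer(a):
    """Unit timelike vector T0 = (sqrt(1 + |a|^2), a1, a2, a3)."""
    return (math.sqrt(1.0 + sum(x * x for x in a)),) + tuple(a)


def observed_energy(a, E, P):
    """The combination sqrt(1 + |a|^2) * E + sum_i a^i * P_i."""
    return math.sqrt(1.0 + sum(x * x for x in a)) * E + sum(x * p for x, p in zip(a, P))


if __name__ == "__main__":
    E, P = 1.0, (0.2, -0.5, 0.7)       # illustrative values with |P| < E
    a = (-2.0, 5.0, -7.0)              # boost directed against P
    print(minkowski(observer(a), observer(a)))  # close to -1
    print(observed_energy(a, E, P))             # remains positive
```

By the Cauchy-Schwarz inequality, $|\sum_i a^i P_i| \le |a|\,|P| < \sqrt{1 + |a|^2}\, E$ whenever $|P| \le E$, so the combination cannot become negative.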
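Theorem 3.1. Suppose $S_r$ is the coordinate sphere of radius $r$ in an end of an asymptotically flat three-manifold $(M, g_{ij}, p_{ij})$ and $(E, P_1, P_2, P_3)$ is the ADM energy-momentum four-vector of this end, then
\[ \lim_{r \to \infty} E(S_r, X_r, T_0) = \sqrt{1 + |a|^2}\, E + \sum_{i=1}^3 a^i P_i, \]
where $X_r$ is the (unique) isometric embedding of $S_r$ into $\mathbb{R}^3 \subset \mathbb{R}^{3,1}$ and $T_0 = (\sqrt{1 + |a|^2}, a^1, a^2, a^3)$ is an arbitrary constant timelike unit vector.

Proof. Denote by $e_0$ the future timelike unit normal of the hypersurface $M$ and $\nu$ the unit outward normal of the coordinate sphere $S_r$. Let $(y^1, y^2, y^3)$ be the asymptotically flat coordinates on the end. $S_r$ is given by $(y^1)^2 + (y^2)^2 + (y^3)^2 = r^2$ and we denote the embedding of $S_r$ into $M$ by $Y$. Since $p_{ij} = O(\frac{1}{r^2})$, we have $\langle H, e_0 \rangle = O(\frac{1}{r^2})$. It is known that $\langle H, \nu \rangle = \frac{2}{r} + O(\frac{1}{r^2})$ (see for example [1]). Since $H = \langle H, \nu \rangle \nu - \langle H, e_0 \rangle e_0$, we estimate
\[ |H| - |\langle H, \nu \rangle| = O\!\left( \frac{1}{r^3} \right). \]
Therefore,
\[ \lim_{r \to \infty} \int_{S_r} (|\hat{H}| - |H|) \, dS_r = \lim_{r \to \infty} \int_{S_r} (|\hat{H}| - |\langle H, \nu \rangle|) \, dS_r, \]
i.e., the Brown-York energy and the Liu-Yau energy have the same limit at spatial infinity. It is known that (see for example [1] and the references therein) the Brown-York energy approaches the ADM energy $E$ at spatial infinity. Now it suffices to prove
\[ \sum_{i=1}^3 a^i P_i = \sum_{i=1}^3 a^i p_i = \frac{1}{8\pi} \int_{S_r} \left\langle \nabla^{N}_{\nabla\tau} \frac{J}{|H|}, \frac{H}{|H|} \right\rangle dS_r. \]
By definition, the ADM momentum is
\[ \sum_{i=1}^3 a^i P_i = \frac{1}{8\pi} \int_{S_r} \left[ p\!\left( a^i \frac{\partial}{\partial y^i}, \nu \right) - (\operatorname{tr}_M p) \left\langle a^i \frac{\partial}{\partial y^i}, \nu \right\rangle \right] dS_r, \]
where $\operatorname{tr}_M p$ is the trace $g^{ij} p_{ij}$ of $p$ on $M$. We decompose $a^i \frac{\partial}{\partial y^i} = (a^i \frac{\partial}{\partial y^i})^{\top} + \langle a^i \frac{\partial}{\partial y^i}, \nu \rangle \nu$, where $\top$ denotes the part tangent to $S_r$, and the integrand becomes
\[ p\!\left( \Big( a^i \frac{\partial}{\partial y^i} \Big)^{\top}, \nu \right) + \left\langle a^i \frac{\partial}{\partial y^i}, \nu \right\rangle \big( p(\nu, \nu) - (\operatorname{tr}_M p) \big). \]
By the definition of the mean curvature vector $H$, we obtain $p(\nu, \nu) - (\operatorname{tr}_M p) = \langle H, e_0 \rangle$. Therefore the ADM momentum term is
\[ \sum_{i=1}^3 a^i P_i = \frac{1}{8\pi} \int_{S_r} \left[ \left\langle \nabla^{N}_{(a^i \frac{\partial}{\partial y^i})^{\top}} e_0, \nu \right\rangle + \langle H, e_0 \rangle \left\langle a^i \frac{\partial}{\partial y^i}, \nu \right\rangle \right] dS_r. \tag{3.1} \]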
Now we turn to the limit of the quasilocal energy-momentum. We can express the normal vector fields $H$ and $J$ in terms of $\nu$ and $e_0$ as $H = \langle H, \nu\rangle \nu - \langle H, e_0\rangle e_0$ and $J = -\langle H, \nu\rangle e_0 + \langle H, e_0\rangle \nu$. We compute
H, e0 H N J N = −∇τ · ∇ sinh−1 − ∇∇τ ∇∇τ e0 , ν. , |H | |H | |H | Integrating by parts gives
1 1 H N J N −1 H, e0 , d Sr = d Sr . ∇ −∇∇τ e0 , ν + τ sinh 8π Sr ∇τ |H | |H | 8π Sr |H | Plug in (2.1), the second integrand on the right-hand side becomes −1 H, e0 Hˆ ˆ e3 , T0 | H | sinh . |H | Recall the asymptotics 2 2 1 1 1 ˆ H, e0 = O , |H | = + O , |H | = + O r2 r r2 r r2 1 H N J and sinh−1 x ∼ x if x 0 and β with γ = 1/ 1 − β 2 , consider coordinates given by (y 0 ) = γ y 0 − βγ y 3 , (y 3 ) = γ y 3 − βγ y 0 , (y 1 ) = y 1 , (y 2 ) = y 2 . Now consider the family of 2-surfaces r0 given by (y 0 ) = 0 and (y 1 )2 + (y 2 )2 + (y 3 )2 = r02 as r0 → ∞. These are standard coordinate spheres of a boosted slice of Schwarzchild’s solution. We calculate with the standard isotropic coordinates, in terms of which, r0 is defined by γ y 0 − βγ y 3 = 0 and (y 1 )2 + (y 2 )2 + (γ y 3 − βγ y 0 )2 = r02 . Denote the embedding of r0 into Schwarzchild’s solution by Y = (y 0 , y 1 , y 2 , y 3 ). We parametrize the 2-surfaces r0 by y0 y1 y2 y3
= βγ r0 cos θ, = r0 sin θ sin φ, = r0 sin θ cos φ, = γ r0 cos θ.
In terms of local coordinates u 1 = θ and u 2 = φ on the surface, the induced metric on r0 is 2M 2M a b 2 2 2 2 2 2 σab du du = r0 1 + (1 + 2β γ sin θ ) dθ + r0 1 + sin2 θ dφ 2 ρ ρ +O(r0 ), (4.1) and
2M 1 + β 2 γ 2 sin2 θ + O(r0 ). det σab = r02 | sin θ | 1 + ρ
The mean curvature vector H = H γ ∂ ∂y γ of r0 is by definition: 2 γ α β ∂ y γ ∂y ∂y γ γ (δβ − β ), + H γ = σ ab αβ ∂u a ∂u b ∂u a ∂u b
(4.2)
γ
γ
α
γ
where αβ are the Christoffel symbols of the metric G αβ and β = G βα σ ab ∂∂uy a ∂∂uy b is γ the projection operator onto the tangent space of r0 . The asymptotic expansion of αβ can be computed from the asymptotic expansion of G αβ . α Denote by y˜ α = yr0 and ρ˜ = rρ0 , which are both scaling invariant now. We shall use the following frames along r0 to express the mean curvature vector: ∂ N = y˜ α α , ∂y ∂ ∂ B=γ + β , and ∂ y0 ∂ y3 ∂ y˜ α ∂ . T = ∂θ ∂ y α We notice that T is a tangent vector fieldto r0 while N and Bareonly asymptot1 ically normal in the sense that N , T = O r0 and B, T = O r10 . We also have N , N = 1 + O r10 , B, B = −1 + O r10 , and T, T = 1 + O r10 . A straightforward calculation gives Lemma 4.1. −2 1 H= N + 2 (nN + tT + bB) + O r0 r0
1 r03
with M (6 + 6β 2 γ 2 + 2β 4 γ 4 sin2 θ cos2 θ ), ρ˜ 3 M t = 3 (−8ρ˜ 2 )(β 2 γ 2 sin θ cos θ ), and ρ˜ M b = 3 (2βγ 2 cos θ )(β 2 γ 2 sin2 θ − 1). ρ˜
n=
From here, we compute the norm of |H |: Proposition 4.1. 2 1 |H | = + r0 r02
2M 1 2 2 2 (1 + 2β γ cos θ ) − n + O . ρ˜ r03
Let J be the future-directed timelike normal vector that is dual to H along the light cone in the normal bundle. Lemma 4.2. J is given by 2 1 −2M(1 + 2β 2 γ 2 cos2 θ + γ 2 + β 2 γ 2 ) +n B B− 2 r0 ρ˜ r0 1 1 8M 8Mβγ 2 cos θ 2 (βγ sin θ )T − (b + )N + O + 2 . ρ˜ ρ˜ r0 r03
Proof. The coefficients of $J$ are determined by the following equations: $-\langle J, J\rangle = \langle H, H\rangle$, $\langle J, H\rangle = 0$, and $\langle J, T\rangle = 0$.
With this explicit formula, we compute the coefficients of the connection form of the normal bundle in the mean curvature vector gauge: Proposition 4.2.
1 1 8M N J, H = 3 2b + , βγ 2 sin θ + O ∇ ∂Y ρ˜ ∂θ r0 r04 where b denotes the derivative of b with respect to θ and 1 N ∇ ∂Y J, H = O . ∂φ r04 It turns out the second term does not contribute to the limit of the quasilocal energy.
4.2. Total mean curvature of isometric embedding. We consider the isometric embedding of a general axially symmetric metric into R3 . The metric is of the form r02 P 2 (r0 , θ )dθ 2 + r02 Q 2 (r0 , θ ) sin2 θ dφ 2 with
P(r0 , θ ) = 1 + O
1 r0
, and Q(r0 , θ ) = 1 + O
Suppose the isometric embedding is given by X = (u(r0 , θ ) sin φ, u(r0 , θ ) cos φ, v(r0 , θ )). Thus ∂X = ∂θ
∂u ∂u ∂v sin φ, cos φ, ∂θ ∂θ ∂θ
and ∂X = (u cos φ, −u sin φ, 0). ∂φ It is not hard to see
1 r0
.
Lemma 4.3. u and v are given by 2 2 ∂u ∂v + = r02 P 2 ∂θ ∂θ and u 2 = r02 Q 2 sin2 θ. Proposition 4.3. The mean curvature of the isometric embedding of the metric r02 P 2 (r0 , θ )dθ 2 + r02 Q 2 (r0 , θ ) sin2 θ dφ 2 into R3 is given by 2 u ∂v 2 v ∂u 1 ∂v ∂ 1 ∂ + 2 + , Hˆ = − 3 3 2 2 ∂θ ∂θ ∂θ ∂θ ∂θ r0 P r0 P Q sin θ where u(r0 , θ ) = r0 Q(r0 , θ ) sin θ, and
∂v ∂θ
2
=
r02 P 2
−
∂u ∂θ
2 =
Now suppose p P =1+ +O r0 and q +O Q =1+ r0
P −
r02
2
1 r02 1 r02
∂Q sin θ + Q cos θ ∂θ
2 .
, p = p(θ ) , q = q(θ ).
The asymptotic expansion of the mean curvature is found to be 1 cos θ 2 (2q − p ) + q ). − (2 p + Hˆ = r0 r 2 sin θ
(4.3)
Comparing with (4.1), we deduce that in our case p=
M M (1 + 2β 2 γ 2 sin2 θ ) and q = . ρ˜ ρ˜
u and v can be solved explicitly: u = r0 sin θ +
M M sin θ and v = r0 cos θ + cos θ + 2Mβγ sinh−1 (βγ cos θ ). ρ˜ ρ˜
Plug in the expression of p and q into (4.3) and integrate by parts we obtain Proposition 4.4. ˆ H dr0 = 8πr0 + 2π M r0
2π 0
1 + β 2 γ 2 sin2 θ | sin θ |dθ + O ρ˜
This calculation is compatible with Lemma 2.4 in [1].
1 r0
.
(4.4)
4.3. Evaluating the quasilocal energy. We are ready to compute the limit of the Liu-Yau mass: Proposition 4.5. r0
( Hˆ − |H |)dr0 = 8π γ M + O
1 r0
.
Proof. Combine Proposition 4.1 and Proposition 4.4; we obtain r0
2+4β 2 γ 2 −6β 2 γ 2 cos2 θ − 4β 4 γ 4 cos4 θ | sin θ |dθ ρ˜ 0 1 . +O r0
( Hˆ − |H |)dr0 = π M
2π
The integral can be evaluated by the substitution $\beta\gamma \cos\theta = \sinh y$.
Now we turn to the momentum part. Suppose T0 = ( 1 + |a|2 , a 1 , a 2 , a 3 ), |a|2 = 3 i 2 3 3,1 i=1 (a ) is a future timelike unit vector and the isometric embedding into R ⊂ R is given by X = (0, u sin φ, u cos φ, v). We know from (4.4) that u = r sin θ + O(1) ∂τ ab ∂Y and v = r cos θ + O(1). The gradient of τ is given by ∇τ = ∂u . We compute aσ ∂u b
r0
N ∇∇τ
H J , dr0 = −a 1 |H | |H |
r0
−a 2 −a
r0
3 r0
1 N (u sin θ ) σ θθ ∇ ∂Y J, H dr0 |H |2 ∂θ 1 N (u cos θ ) σ θθ ∇ ∂Y J, H dr0 |H |2 ∂θ 1 θθ N 1 . (4.5) v σ ∇ ∂Y J, H dr0 + O 2 |H | r0 ∂θ
These integrals can be evaluated and we obtain Proposition 4.6. r0
N ∇∇τ
H J , dr0 = a 3 8πβγ M + O |H | |H |
Proof. By Proposition 4.2,
1 r0
.
∇N
∂Y ∂θ
J, H is of the order
1 r03
while u and v are both of order
r0 . We have
1 θθ N ∇ ∂Y J, H dr0 (u sin θ ) σ 2 ∂θ r0 |H |
π 2π 2 r0 1 N = J, H r02 | sin θ |dθ dφ (r0 sin2 θ ) 2 ∇ ∂Y 4 ∂θ r0 0 0
2π π N (sin2 θ ) r03 ∇ ∂Y J, H | sin θ |dθ. = 4 0 ∂θ
Therefore the first integral on the right-hand side of (4.5) is π 2π 8M −a 1 βγ 2 sin θ | sin θ |dθ, (sin2 θ ) 2b + 4 0 ρ˜ where b=
M (2βγ 2 cos θ )(β 2 γ 2 sin2 θ − 1), ρ˜ 3
and the second one is 2π 8M 2π 2 −a βγ sin θ | sin θ |dθ. (sin θ cos θ ) 2b + 4 0 ρ˜ 2π Both integrate to zero as they are of the form 0 (cos θ )F(cos2 θ )| sin θ |dθ or 2π 2 2 0 (sin θ )F(cos θ )| sin θ |dθ for some smooth function F of cos θ . The last integral becomes 2π 8M 3π 2 a βγ sin θ | sin θ |dθ sin θ 2b + 4 0 ρ˜ which can be simplified by integration by parts as π sin θ a 3 4π Mβγ 2 dθ. ρ˜ 3 0 Using the same substitution βγ cos θ = sinh y, the integral is a 3 8πβγ M.
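The last step of the proof can be checked numerically. The sketch below is only an illustration and rests on one assumption that is not restated in this part of the text: that the scaled isotropic radius on the boosted coordinate sphere satisfies $\tilde\rho^2 = 1 + \beta^2\gamma^2\cos^2\theta$ (this is what the parametrization of $y^1, y^2, y^3$ suggests). Under that assumption the integral of $\sin\theta/\tilde\rho^3$ over $[0,\pi]$ should equal $2/\gamma$, so that $a^3\, 4\pi M\beta\gamma^2 \cdot 2/\gamma = a^3\, 8\pi\beta\gamma M$. The helper names are hypothetical.

```python
import math


def rho_tilde(theta, beta):
    """Assumed scaled radius: rho~^2 = 1 + beta^2 gamma^2 cos^2(theta)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return math.sqrt(1.0 + (beta * gamma * math.cos(theta)) ** 2)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def momentum_integral(beta):
    """Numerical value of the integral of sin(theta) / rho~^3 over [0, pi]."""
    return simpson(lambda th: math.sin(th) / rho_tilde(th, beta) ** 3, 0.0, math.pi)


if __name__ == "__main__":
    for beta in (0.1, 0.5, 0.9):
        gamma = 1.0 / math.sqrt(1.0 - beta * beta)
        # Second and third columns should agree if the assumption on rho~ holds.
        print(beta, momentum_integral(beta), 2.0 / gamma)
```

The agreement of the two printed columns is consistent with the value $a^3\, 8\pi\beta\gamma M$ stated above; it does not replace the substitution argument.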
Therefore the limit of the quasilocal energy (1.5) is
\[ \sqrt{1+|a|^2}\,\gamma M + a^3 \beta\gamma M. \]
Recall that $\gamma^2 - \beta^2\gamma^2 = 1$. Minimizing this expression among all $T_0 = (\sqrt{1+|a|^2}, a^1, a^2, a^3)$, we see the minimum is achieved at $(\gamma, 0, 0, -\beta\gamma)$ and the minimum value is $M$. The limit of the quasilocal energy-momentum is thus $M(\gamma, 0, 0, -\beta\gamma)$.
Acknowledgements. We would like to thank PoNing Chen for his help in checking the correctness of the calculations in §3.
References
1. Fan, X.-Q., Shi, Y., Tam, L.-F.: Large-sphere and small-sphere limits of the Brown-York mass. Comm. Anal. Geom. 17(1), 37–72 (2009)
2. Liu, C.-C.M., Yau, S.-T.: Positivity of quasilocal mass. Phys. Rev. Lett. 90(23), 231102 (2003)
3. Schoen, R., Yau, S.-T.: Positivity of the total mass of a general space-time. Phys. Rev. Lett. 43(20), 1457–1459 (1979)
4. Wang, M.-T., Yau, S.-T.: Quasilocal mass in general relativity. Phys. Rev. Lett. 102(2), 021101 (2009)
5. Wang, M.-T., Yau, S.-T.: Isometric embeddings into the Minkowski space and new quasi-local mass. Commun. Math. Phys. 288, 919–942 (2009)
6. Witten, E.: A new proof of the positive energy theorem. Commun. Math. Phys. 80(3), 381–402 (1981)
Communicated by P.T. Chruściel
Commun. Math. Phys. 296, 285–301 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-0991-1
Non-Uniform Dependence on Initial Data of Solutions to the Euler Equations of Hydrodynamics A. Alexandrou Himonas, Gerard Misiołek Department of Mathematics, University of Notre Dame, Notre Dame, IN 46556, USA. E-mail:
[email protected];
[email protected] Received: 13 June 2009 / Accepted: 24 October 2009 Published online: 29 January 2010 – © Springer-Verlag 2010
Abstract: We show that continuous dependence on initial data of solutions to the Euler equations of incompressible hydrodynamics is optimal. More precisely, we prove that the data-to-solution map is not uniformly continuous in the Sobolev $H^s(\Omega)$ topology for any $s \in \mathbb{R}$ if the domain $\Omega$ is the (flat) torus $\mathbb{T}^n = \mathbb{R}^n/2\pi\mathbb{Z}^n$ and for any $s > 0$ if the domain is the whole space $\mathbb{R}^n$.

1. Introduction

The classical notion of well-posedness of an abstract Cauchy problem due to Hadamard requires constructing a unique solution which depends continuously on initial conditions. This notion is quite strong and difficulties of a specific problem often force one to relax or drop the requirements of continuous dependence or uniqueness. On the other hand, there are many equations for which the solution operator can be shown in suitably chosen topologies to be uniformly continuous, Lipschitz or even differentiable.¹

¹ Various examples can be found in [H,B,Sh or KPV].

The Cauchy problem for the Euler equations of ideal hydrodynamics has a long and distinguished history which we will not attempt to survey here, recommending instead the monograph of Majda and Bertozzi [MB] or a recent article by Constantin [C] for fundamental results and additional references. Of relevance to us however will be the property of continuous dependence of solutions (in the strong sense of Sobolev norms) which was first established by Ebin and Marsden [EM] for bounded domains (possibly with boundary) and by Kato [K1] for the whole space using semigroup techniques. The former approach is based on a result of V. Arnold according to which motions of an $n$-dimensional ideal fluid correspond to geodesics of its kinetic energy functional in the group of diffeomorphisms preserving the volume of the fluid domain, see Arnold and Khesin [AK] for a detailed exposition. Topologizing the space of diffeomorphisms by Sobolev $H^s$ norms (with $s > n/2 + 1$) the geodesic equation is then solved (locally in time) using Banach contractions. In particular, the geodesics depend smoothly on initial conditions but since derivative loss occurs upon changing back from Lagrangian to Eulerian coordinates this approach ultimately yields only continuous dependence of the corresponding solutions to the Euler equations.² There arises therefore a natural question whether this dependence is optimal, see [EM], 15.2(ii) p.151. We point out that for the closely related Navier-Stokes equations, the dependence of solutions on the data in sufficiently high Sobolev norms and sufficiently large viscosity is at least Lipschitz.³ However, we have not been able to find an answer to the optimality question for the Euler equations anywhere in the literature.

In this paper we show that continuous dependence in Eulerian coordinates is indeed the best one can expect. More precisely, we will prove that the solution map $u_0 \mapsto u$ for the Euler equations in $\Omega$ (in dimension 2 and 3) is not uniformly continuous on bounded sets into $C([0, T], H^s)$ for any $s \in \mathbb{R}$ when $\Omega = \mathbb{T}^n$ and for any $s > 0$ when $\Omega = \mathbb{R}^n$. Our methods in the two cases are different.

One of the first results of this type was proved by Kato [K2] who showed that the solution operator for the (inviscid) Burgers equation is not Hölder continuous in the $H^s(\mathbb{T})$ norm ($s > 3/2$) regardless of the Hölder exponent. Since then other techniques have been developed and successfully applied in the study of various nonlinear dispersive and integrable equations in one space dimension, see for example Kenig, Ponce and Vega [KPV] and its references. Our approach is most closely related to that of Koch and Tzvetkov [KT] in their study of the Benjamin-Ono equation (see also [HK]).

In the next section we describe the basic set-up and present the statements of our results. Sections 3 and 4 contain the main constructions of the paper. A few well-known technical proofs are gathered in the Appendix in an attempt to make the paper reasonably self-contained.

2. Background and Statements of Main Results

The initial value problem for the Euler equations governing the motion of an incompressible fluid in an $n$-dimensional domain $\Omega$ can be formulated as
\[ \partial_t u + \nabla_u u + \nabla p = 0, \qquad \mathrm{div}\, u = 0, \tag{2.1} \]
\[ u(0, x) = u_0(x), \qquad x \in \Omega,\ t \in \mathbb{R}, \tag{2.2} \]
where $u : \mathbb{R} \times \Omega \to \mathbb{R}^n$ is the fluid velocity, $p : \mathbb{R} \times \Omega \to \mathbb{R}$ is the pressure function and $u_0 : \Omega \to \mathbb{R}^n$ is a divergence free initial condition. We shall be concerned only with the cases when $\Omega$ is either the flat $n$-torus $\mathbb{T}^n = \mathbb{R}^n/2\pi\mathbb{Z}^n$ or the whole space $\Omega = \mathbb{R}^n$ and where $n = 2$ or $3$. However, it should be clear that similar constructions can be done also for certain bounded domains (such as a disc or a finite cylinder for example) with appropriate boundary conditions. As is well known, pressure can be eliminated from (2.1). In fact, applying the divergence operator to the first equation and then solving for $p$ gives
\[ \nabla p = -\nabla \Delta^{-1} \mathrm{div}\, \nabla_u u = -(1 - P)\nabla_u u, \tag{2.3} \]
where $P$ is the $L^2$-orthogonal projection onto the divergence free part in the Hodge decomposition of vector fields on $\Omega$ into divergence free fields and gradients of functions in $\Omega$. Using (2.3) the first of the equations in (2.1) takes the form
\[ \partial_t u + \nabla_u u - \nabla \Delta^{-1} \mathrm{div}\, \nabla_u u = 0. \tag{2.4} \]

² The fact that fluids are “better behaved” in Lagrangian coordinates has led to detailed studies of the associated Riemannian exponential map on the diffeomorphism group as in [EMP], see also [AK].
³ In fact, using semilinear parabolic techniques it can be shown to be analytic, see [H], p. 79–81.
We emphasize that the nonlocal term in the above equation is more regular than it appears. In fact, since $u = (u_1, \dots, u_n)$ is divergence free it follows that the function
\[ \mathrm{div}\, \nabla_u u = \sum_{i,j=1}^{n} \partial_i u_j\, \partial_j u_i \tag{2.5} \]
involves only first order derivatives of $u$. For any $s \in \mathbb{R}$ we shall equip the Sobolev space of vector-valued distributions $H^s(\Omega, \mathbb{R}^n)$ with the norm
\[ \|u\|_{H^s(\Omega, \mathbb{R}^n)} = \sum_{j=1}^{n} \|u_j\|_{H^s(\Omega)}, \tag{2.6} \]
where $\|f\|_{H^s(\Omega)}$ is the standard Sobolev norm for functions on $\Omega$ defined either as
\[ \Big( \sum_{n \in \mathbb{Z}} (1 + n^2)^s\, |\hat f(n)|^2 \Big)^{1/2} \quad \text{or} \quad \Big( \int_{\mathbb{R}^n} (1 + |\xi|^2)^s\, |\hat f(\xi)|^2\, d\xi \Big)^{1/2}, \]
depending on the context. We shall also often use the symbols $\lesssim$ and $\simeq$ to denote estimates that hold up to some universal constant. Local well-posedness of the Euler equations in $\Omega$ (in dimensions $n \ge 2$) has of course been established by many authors. We summarize the result in the form which is convenient for our purposes.

Local well-posedness. If $s > n/2 + 1$ and $u_0 \in H^s(\Omega, \mathbb{R}^n)$ is a divergence free vector field then there exists $T > 0$ and a unique solution $u \in C([0, T], H^s(\Omega, \mathbb{R}^n))$ of the Cauchy problem (2.1)–(2.2) which depends continuously on the initial data $u_0$. Furthermore, we have the estimate
\[ \|u(t)\|_{H^s} \le \frac{\|u_0\|_{H^s}}{1 - Ct\, \|u_0\|_{H^s}} \quad \text{for } 0 \le t \le T < C^{-1} \|u_0\|_{H^s}^{-1}, \tag{2.7} \]
where $C > 0$ is a constant depending on $s$. The proof can be found for example in the references [MB or KP]. Our main results show that in general one cannot expect to improve the continuous dependence of Eulerian solutions on initial data.

Theorem 2.1. Let $n = 2$ or $3$ and let $u_0 \mapsto u$ denote the solution map of the Euler equations defined by the Cauchy problem (2.1)–(2.2):
(P) Periodic case. For any $s \in \mathbb{R}$ the solution map is not uniformly continuous from the unit ball in $H^s(\mathbb{T}^n, \mathbb{R}^n)$ into $C([0, T], H^s(\mathbb{T}^n, \mathbb{R}^n))$.
(NP) Non-periodic case. If $s > 0$ then the solution map is not uniformly continuous from the unit ball in $H^s(\mathbb{R}^n, \mathbb{R}^n)$ into $C([0, T], H^s(\mathbb{R}^n, \mathbb{R}^n))$.
A few comments are in order. Although our proofs for the periodic and the non-periodic domains are different, in both cases the general strategy will be to construct for each Sobolev index two sequences of solutions which are converging at time zero but remain far apart at any later time. The constructions are essentially two dimensional but modifications needed for higher dimensions are trivial. Furthermore, it seems that the restriction on the values of the Sobolev index $s$ in the non-periodic case (NP) is merely a consequence of our methods and can be improved. On the other hand, one may reasonably argue that the range of $s$ is determined by those values which correspond to a well-posed evolution of the fluid. For classical solutions of the Euler equations this range is $s > n/2 + 1$ and thus is covered by our results.

The main value of our result consists in the fact that the instability we prove occurs on a finite time interval independent of the initial distance between the two sequences. This would certainly be known in the space $C([0, \infty], H^s)$ due to the existence of (plenty of) smooth exponentially unstable stationary solutions with smooth unstable eigenfunctions corresponding to unstable eigenvalues of the linearized equation. Indeed, if $u_0$ is a parallel shear flow with smooth profile possessing a smooth unstable eigenfunction $v_\lambda$, then the difference between $u_0$, as one solution, and the solution with initial data $u_0 + \varepsilon v_\lambda$ will be greater than an absolute constant $c_0$ on a time interval of order $-\log \varepsilon$. This is a well-known bootstrap argument (see Lin [L] and the references on the subject therein).

Remark 2.1. It is not difficult to see that the solution map $u_0 \mapsto u$ is differentiable from $H^s$ into $C([0, T], H^{s-1})$ for $s > n/2 + 1$. This follows from the fact that the corresponding solution map in Lagrangian coordinates (or equivalently, the Riemannian exponential map of the $L^2$ metric on the group of volume-preserving diffeomorphisms)
\[ u_0 \mapsto \eta(t) = \exp_e(t u_0), \qquad \text{where } e(x) = x, \]
is differentiable as a map into $H^s$ diffeomorphisms for any $s > n/2 + 1$. As already mentioned, the change from Lagrangian to Eulerian coordinates introduces a loss of derivatives. However, letting $\iota : \eta \mapsto \eta^{-1}$ denote the inversion of diffeomorphisms and writing $u = \dot\eta \circ \iota \circ \eta$ we easily conclude that the map $u_0 \mapsto u$ retains the desired differentiability considered as a function from $H^s$ to $C([0, T], H^{s-1})$ because $\iota$ itself is of class $C^1$ when mapping diffeomorphisms of class $H^s$ to those of class $H^{s-1}$. We refer to [EM and EMP] for more details regarding diffeomorphism groups and the geometry of the $L^2$ exponential map.

3. Proof of Theorem 2.1: Part (P)

The proof of part (P) for the case when $\Omega = \mathbb{T}^2$ relies essentially on the construction of two sequences of explicit periodic solutions to the Euler equations with mixed high and low frequency terms that have suitably chosen phase shifts. Our motivation comes partly from the constructions in [M]. Also, sequences similar to those defined by formula (3.1) below have been used for equations like Burgers’, BO and CH which can be considered as approximations to the Euler equations (see [KT,HKM] and the references therein). Remarkably, in the case of the Euler equations these functions are indeed solutions.

Lemma 3.1. For any $\omega \in \mathbb{R}$ and $n \in \mathbb{Z}_+$ the divergence free vector field
\[ u^{\omega,n}(t, x_1, x_2) = \big( \omega n^{-1} + n^{-s} \cos(n x_2 - \omega t),\ \omega n^{-1} + n^{-s} \cos(n x_1 - \omega t) \big) \tag{3.1} \]
is a solution to the Euler equations on $\mathbb{T}^2$.
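Proof. We compute the first two terms
\[ \partial_t u^{\omega,n} = \big( \omega n^{-s} \sin(n x_2 - \omega t),\ \omega n^{-s} \sin(n x_1 - \omega t) \big) \tag{3.2} \]
and
\[ \nabla_{u^{\omega,n}} u^{\omega,n} = \Big( -\big( \omega n^{-1} + n^{-s} \cos(n x_1 - \omega t) \big)\, n^{-s+1} \sin(n x_2 - \omega t),\ -\big( \omega n^{-1} + n^{-s} \cos(n x_2 - \omega t) \big)\, n^{-s+1} \sin(n x_1 - \omega t) \Big) \tag{3.3} \]
so that
\[ \partial_t u^{\omega,n} + \nabla_{u^{\omega,n}} u^{\omega,n} = \big( -n^{-2s+1} \cos(n x_1 - \omega t) \sin(n x_2 - \omega t),\ -n^{-2s+1} \sin(n x_1 - \omega t) \cos(n x_2 - \omega t) \big). \tag{3.4} \]
Furthermore, from (2.5) we have
\[ \mathrm{div}\, \nabla_{u^{\omega,n}} u^{\omega,n} = 2 n^{-2s+2} \sin(n x_1 - \omega t) \sin(n x_2 - \omega t), \tag{3.5} \]
and since a quick inspection shows that the product $\sin n x_1 \sin n x_2$ is an eigenfunction of the Laplacian with eigenvalue $-2n^2$, we have
\[ \Delta^{-1} \mathrm{div}\, \nabla_{u^{\omega,n}} u^{\omega,n} = -n^{-2s} \sin(n x_1 - \omega t) \sin(n x_2 - \omega t), \tag{3.6} \]
and hence
\[ \nabla \Delta^{-1} \mathrm{div}\, \nabla_{u^{\omega,n}} u^{\omega,n} = \big( -n^{-2s+1} \cos(n x_1 - \omega t) \sin(n x_2 - \omega t),\ -n^{-2s+1} \sin(n x_1 - \omega t) \cos(n x_2 - \omega t) \big). \tag{3.7} \]
Combining (3.4) and (3.7) gives
\[ \partial_t u^{\omega,n} + \nabla_{u^{\omega,n}} u^{\omega,n} - \nabla \Delta^{-1} \mathrm{div}\, \nabla_{u^{\omega,n}} u^{\omega,n} = 0, \tag{3.8} \]
which completes the proof.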
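The computation in the proof of Lemma 3.1 can be spot-checked numerically. The following is a minimal sketch, not part of the original argument: it evaluates $\partial_t u + \nabla_u u + \nabla p$ at a point by central differences, taking the pressure $p = n^{-2s} \sin(n x_1 - \omega t) \sin(n x_2 - \omega t)$ that follows from (2.3) and (3.6). The function names and the step size `h` are illustrative choices.

```python
import math


def velocity(t, x1, x2, omega, n, s):
    """The vector field (3.1) on the two-torus."""
    c = n ** (-s)
    return (omega / n + c * math.cos(n * x2 - omega * t),
            omega / n + c * math.cos(n * x1 - omega * t))


def pressure(t, x1, x2, omega, n, s):
    """p = -Delta^{-1} div(grad_u u), read off from (3.6)."""
    return n ** (-2 * s) * math.sin(n * x1 - omega * t) * math.sin(n * x2 - omega * t)


def _partial(f, point, i, h):
    """Central difference of f(t, x1, x2) in argument i (0 = t, 1 = x1, 2 = x2)."""
    up, dn = list(point), list(point)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2.0 * h)


def euler_residual(t, x1, x2, omega, n, s, h=1e-5):
    """Finite-difference value of d_t u + (u . grad) u + grad p at one point."""
    point = (t, x1, x2)
    u = velocity(t, x1, x2, omega, n, s)
    p = lambda tt, a, b: pressure(tt, a, b, omega, n, s)
    residual = []
    for j in range(2):
        comp = lambda tt, a, b, j=j: velocity(tt, a, b, omega, n, s)[j]
        dt = _partial(comp, point, 0, h)
        advection = u[0] * _partial(comp, point, 1, h) + u[1] * _partial(comp, point, 2, h)
        grad_p = _partial(p, point, j + 1, h)
        residual.append(dt + advection + grad_p)
    return tuple(residual)


if __name__ == "__main__":
    # Both components should vanish up to discretization and rounding error.
    print(euler_residual(0.3, 0.7, 1.1, omega=1.0, n=3, s=1.5))
```

A residual at the level of discretization error is what (3.8) predicts; the check is pointwise only and says nothing about regularity.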
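The second ingredient we need is provided by the following simple estimate:

Lemma 3.2. For any $s \in \mathbb{R}$ and any $n \gg 1$ we have
\[ \| n^{-s} \cos(n \cdot - \omega t) \|_{H^s(\mathbb{T})} + \| n^{-s} \sin(n \cdot - \omega t) \|_{H^s(\mathbb{T})} \simeq 1. \tag{3.9} \]
Proof. A direct computation gives
\[ \big( \cos(n \cdot - \omega t) \big)^{\wedge}(k) = \int_0^{2\pi} e^{-ikx} \cos(nx - \omega t)\, dx = \pi e^{-i\omega t} \delta_{k,n} + \pi e^{i\omega t} \delta_{k,-n}, \]
and consequently $\| \cos(n \cdot - \omega t) \|_{H^s(\mathbb{T})} = \pi \sqrt{2}\, (1 + n^2)^{s/2}$. An analogous computation for $\sin(n \cdot - \omega t)$ yields the estimate.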
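As an illustration of the computation in this proof, the sketch below approximates the Fourier coefficient by an equispaced Riemann sum (which is exact for trigonometric polynomials once the number of samples exceeds $n + |k|$) and evaluates the scaled norm from the closed form $\pi\sqrt{2}\,(1+n^2)^{s/2}$. The function names and the sample count are assumptions made for this example only.

```python
import cmath
import math


def fourier_coefficient(k, n, omega, t, samples=256):
    """Riemann-sum value of the integral of e^{-ikx} cos(nx - omega t) over [0, 2 pi]."""
    h = 2.0 * math.pi / samples
    total = sum(cmath.exp(-1j * k * j * h) * math.cos(n * j * h - omega * t)
                for j in range(samples))
    return total * h


def scaled_sobolev_norm(n, s):
    """n^{-s} times the H^s(T) norm of cos(n . - omega t), from the closed form."""
    return n ** (-s) * math.pi * math.sqrt(2.0) * (1 + n * n) ** (s / 2)


if __name__ == "__main__":
    print(fourier_coefficient(5, 5, 1.0, 0.3))   # compare with pi * e^{-0.3 i}
    print(fourier_coefficient(2, 5, 1.0, 0.3))   # compare with 0
    print(scaled_sobolev_norm(1000, 3.0))        # compare with pi * sqrt(2)
```

The last value stays close to $\pi\sqrt{2}$ for large $n$ regardless of the sign of $s$, which is the content of (3.9) for the cosine term.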
In order to show that the solution map is not uniformly continuous on bounded sets of initial data into $C([0, T], H^s(\mathbb{T}^2, \mathbb{R}^2))$ it will be sufficient to select two sequences of solutions converging in the $H^s$ norm at $t = 0$ but separated at some later time $t > 0$ and which remain confined to a bounded set in $H^s(\mathbb{T}^2)$ for all $0 \le t \le T$. To this end we pick the solutions described in Lemma 3.1, namely $u^{1,n}(t)$ and $u^{-1,n}(t)$ with $n = 1, 2, \dots$, corresponding to $\omega = 1$ and $\omega = -1$ respectively, and use Lemma 3.2 as our tool to verify that they meet our requirements. First, observe that boundedness of the two sequences in any $H^s$ norm follows since for any $t$ and $\omega = \pm 1$, we have
\[ \begin{aligned} \| u^{\omega,n}(t) \|_{H^s(\mathbb{T}^2, \mathbb{R}^2)} &= \| u_1^{\omega,n}(t) \|_{H^s(\mathbb{T}^2)} + \| u_2^{\omega,n}(t) \|_{H^s(\mathbb{T}^2)} \\ &= \| \omega n^{-1} + n^{-s} \cos(n \cdot - \omega t) \|_{H^s(dx_2)} + \| \omega n^{-1} + n^{-s} \cos(n \cdot - \omega t) \|_{H^s(dx_1)} \\ &\lesssim n^{-1} + 1 \lesssim 1. \end{aligned} \]
Next, estimating the difference of the two sequences at time $t = 0$, we find that
\[ \| u^{1,n}(0) - u^{-1,n}(0) \|_{H^s(\mathbb{T}^2, \mathbb{R}^2)} \lesssim 2 n^{-1} \longrightarrow 0 \quad \text{as } n \to \infty. \]
On the other hand, by the triangle inequality, a little trigonometry and repeated application of Lemma 3.2, we have
\[ \begin{aligned} \| u^{1,n}(t) - u^{-1,n}(t) \|_{H^s(\mathbb{T}^2, \mathbb{R}^2)} &= \Big\| \frac{2}{n} + \frac{\cos(n \cdot - t) - \cos(n \cdot + t)}{n^s} \Big\|_{H^s_{x_2}} + \Big\| \frac{2}{n} + \frac{\cos(n \cdot - t) - \cos(n \cdot + t)}{n^s} \Big\|_{H^s_{x_1}} \\ &\gtrsim \Big\| \frac{\cos(n \cdot - t) - \cos(n \cdot + t)}{n^s} \Big\|_{H^s_{x_2}} + \Big\| \frac{\cos(n \cdot - t) - \cos(n \cdot + t)}{n^s} \Big\|_{H^s_{x_1}} - \frac{4}{n} \\ &= 2 n^{-s} \| \sin t\, \sin n(\cdot) \|_{H^s_{x_2}} + 2 n^{-s} \| \sin t\, \sin n(\cdot) \|_{H^s_{x_1}} - \frac{4}{n} \\ &\gtrsim |\sin t| - \frac{1}{n} \end{aligned} \]
for any $t \ge 0$. From the last inequality we now obtain
\[ \liminf_{n \to \infty} \| u^{1,n}(t) - u^{-1,n}(t) \|_{H^s(\mathbb{T}^2, \mathbb{R}^2)} \gtrsim |\sin t|, \]
which completes the proof in the case when $\Omega = \mathbb{T}^2$.

Remark 3.1. (The case $\Omega = \mathbb{T}^3$) It only suffices to observe that for any constants $\omega$, $n$ and $s$ the vector-valued function
\[ u^{\omega,n}(t, x_1, x_2, x_3) = \big( \omega n^{-1} + n^{-s} \cos(n x_2 - \omega t),\ \omega n^{-1} + n^{-s} \cos(n x_1 - \omega t),\ 0 \big) \tag{3.10} \]
is also a solution to the Euler equations on the three torus $\mathbb{T}^3$. The construction above applies without change.⁴ This completes the proof of part (P) of Theorem 2.1.

⁴ A similar argument works for any higher-dimensional flat torus.
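The mechanism of part (P) can be made concrete with a few numbers. The sketch below is an illustration under stated conventions, not the paper's computation: each component of $u^{1,n}(t) - u^{-1,n}(t)$ equals $2/n + 2 n^{-s} \sin t\, \sin(n x)$ in one variable, and its one-variable $H^s(\mathbb{T})$ norm is computed exactly from its three nonzero Fourier modes using the normalization of Lemma 3.2. The function name is hypothetical.

```python
import math


def difference_norm(n, s, t):
    """Sum over both components of the one-variable H^s(T) norm of u^{1,n}(t) - u^{-1,n}(t)."""
    # Mode k = 0: the constant 2/n has Fourier coefficient 2*pi*(2/n).
    mean_mode = 4.0 * math.pi / n
    # Modes k = +-n: after scaling by (1+n^2)^{s/2} n^{-s} each has size 2*pi*|sin t|*(1+n^-2)^{s/2}.
    osc_mode_sq = (2.0 * math.pi * math.sin(t)) ** 2 * (1.0 + 1.0 / (n * n)) ** s
    one_component = math.sqrt(mean_mode ** 2 + 2.0 * osc_mode_sq)
    return 2.0 * one_component


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        # The gap at t = 0 shrinks like 1/n while the gap at t = 1 does not.
        print(n, difference_norm(n, 2.5, 0.0), difference_norm(n, 2.5, 1.0))
```

The first column of norms decays like $8\pi/n$, while the second approaches $4\sqrt{2}\,\pi\,|\sin 1|$, matching the statement that the two sequences converge at time zero but stay apart at later times.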
4. Proof of Theorem 2.1: Part (NP)

As in the previous section we prove (NP) in the two-dimensional case and later indicate modifications needed in three dimensions. Our strategy in the non-periodic case will be to select two sequences of approximate solutions which are arbitrarily close at time zero but are separated at later times. They will consist of a high frequency part as in the previous section but localized in the spatial variable and a low frequency part which in fact will be a smooth solution with suitably chosen initial data. One of our tasks will be to control the error terms. This approach has been successfully applied to equations in one space dimension, see e.g. [KT or HK].

4.1. Approximate solutions. Our approximate solutions have the following form:
\[ u^{\omega,\lambda}(t, x) = u^l(t, x) + u^h(t, x), \qquad x = (x_1, x_2) \in \mathbb{R}^2,\ t \in \mathbb{R}, \tag{4.1} \]
where $u^h$ is the high frequency term
\[ u^h(t, x) = \mathrm{rot}\, \phi^h(t, x) = \big( \partial_2 \phi^h(t, x),\ -\partial_1 \phi^h(t, x) \big) \tag{4.2} \]
given by the stream function
\[ \phi^h(t, x) = \lambda^{-\delta-s-1}\, \phi\Big(\frac{x_1}{\lambda^\delta}\Big) \phi\Big(\frac{x_2}{\lambda^\delta}\Big) \sin(\lambda x_2 - \omega t), \qquad \lambda \in \mathbb{Z}_+, \tag{4.3} \]
and where $\phi \in C_c^\infty(\mathbb{R})$ satisfies $\mathrm{supp}\, \phi \subset [-2, 2]$ and $\phi(x) \equiv 1$ for $|x| < 1$. The values of the parameters $\delta > 0$ and $s \in \mathbb{R}$ will be specified later. The low frequency term $u^l$ is defined as the solution of the following initial value problem:
\[ \partial_t u^l + \nabla_{u^l} u^l - \nabla \Delta^{-1} \mathrm{div}\, \nabla_{u^l} u^l = 0, \qquad \mathrm{div}\, u^l = 0, \qquad u^l(0, x) = \mathrm{rot}\, \phi^l(x), \quad x \in \mathbb{R}^2, \tag{4.4} \]
with the corresponding stream function
\[ \phi^l(x) = -\omega \lambda^{-1+\delta}\, \psi_1\Big(\frac{x_1}{\lambda^\delta}\Big) \psi_2\Big(\frac{x_2}{\lambda^\delta}\Big), \qquad \omega = \pm 1,\ \lambda \in \mathbb{Z}_+, \tag{4.5} \]
and where the localizing functions $\psi_1, \psi_2 \in C_c^\infty(\mathbb{R})$ are chosen such that $\psi_1' = \psi_2 \equiv 1$ on the support of $\phi$. We first aim to show that the functions $u^{\omega,\lambda}$ are indeed good approximations to solutions in that they satisfy the Euler equations in $\mathbb{R}^2$ up to a “small” error term. Since both $u^l$ and $u^h$ are divergence free, we have $\mathrm{div}\, u^{\omega,\lambda} = 0$. Furthermore, using the first of the equations in (4.4) we compute six error terms
\[ \begin{aligned} \partial_t u^{\omega,\lambda} + \nabla_{u^{\omega,\lambda}} u^{\omega,\lambda} - \nabla \Delta^{-1} \mathrm{div}\, \nabla_{u^{\omega,\lambda}} u^{\omega,\lambda} &= \partial_t u^h + \nabla_{u^l} u^h + \nabla_{u^h} u^l + \nabla_{u^h} u^h - 2 \nabla \Delta^{-1} \mathrm{div}\, \nabla_{u^l} u^h - \nabla \Delta^{-1} \mathrm{div}\, \nabla_{u^h} u^h \\ &= E_1 + E_2 + E_3 + E_4 + E_5 + E_6. \end{aligned} \tag{4.6} \]
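4.2. $L^2$-estimates of error terms. Before we proceed to estimate the error terms $E_1, \dots, E_6$ we first need to derive appropriate bounds on $u^l$ and $u^h$. The next lemma, whose proof involves a little Fourier analysis, will be helpful in what follows.

Lemma 4.1. Let $\sigma \ge 0$ and $\delta \ge 0$. For any Schwartz function $\psi \in \mathcal{S}(\mathbb{R})$ we have
\[ \lambda^{\delta/2} \|\psi\|_{L^2(\mathbb{R})} \le \Big\| \psi\Big(\frac{\cdot}{\lambda^\delta}\Big) \Big\|_{H^\sigma(\mathbb{R})} \le \lambda^{\delta/2} \|\psi\|_{H^\sigma(\mathbb{R})}, \qquad \lambda \gg 1. \tag{4.7} \]
Furthermore, for any constant $a \in \mathbb{R}$ we have the estimate
\[ \Big\| \psi\Big(\frac{\cdot}{\lambda^\delta}\Big) \cos(\lambda \cdot - a) \Big\|_{H^\sigma(\mathbb{R})} \simeq \lambda^{\sigma + \delta/2} \|\psi\|_{L^2(\mathbb{R})}, \qquad \lambda \gg 1, \tag{4.8} \]
which also holds when $\cos(\lambda \cdot - a)$ is replaced by $\sin(\lambda \cdot - a)$.
Proof. See Appendix.

Since
\[ u^l(0, x) = \Big( -\omega \lambda^{-1} \psi_1\Big(\frac{x_1}{\lambda^\delta}\Big) \psi_2'\Big(\frac{x_2}{\lambda^\delta}\Big),\ \omega \lambda^{-1} \psi_1'\Big(\frac{x_1}{\lambda^\delta}\Big) \psi_2\Big(\frac{x_2}{\lambda^\delta}\Big) \Big), \]
from estimate (4.7) we obtain that
\[ \| u^l(0) \|_{H^\sigma(\mathbb{R}^2)} \lesssim \lambda^{-1+\delta}. \tag{4.9} \]
Next, using (4.9) and standard energy estimates we derive the following lemma whose proof is also relegated to the Appendix.

Lemma 4.2. For any $\delta > 0$ and $\lambda \gg 1$ the initial value problem (4.4) has a unique solution $u^l = u^l(t, x)$ such that for any $\sigma \ge 0$ we have
\[ \| u^l(t) \|_{H^\sigma(\mathbb{R}^2)} \lesssim \lambda^{-1+\delta} \tag{4.10} \]
uniformly for all $t \in [0, 1]$.
Proof. See Appendix.

Using Lemma 4.1 we can estimate Sobolev $H^\sigma$ norms of the high frequency term $u^h(t)$. From (4.2) and (4.3) we have
\[ \begin{aligned} u^h(t, x) = \Big( &\lambda^{-s-\delta} \phi\Big(\frac{x_1}{\lambda^\delta}\Big) \phi\Big(\frac{x_2}{\lambda^\delta}\Big) \cos(\lambda x_2 - \omega t) + \lambda^{-s-1-2\delta} \phi\Big(\frac{x_1}{\lambda^\delta}\Big) \phi'\Big(\frac{x_2}{\lambda^\delta}\Big) \sin(\lambda x_2 - \omega t), \\ &-\lambda^{-s-1-2\delta} \phi'\Big(\frac{x_1}{\lambda^\delta}\Big) \phi\Big(\frac{x_2}{\lambda^\delta}\Big) \sin(\lambda x_2 - \omega t) \Big). \end{aligned} \tag{4.11} \]
For any $\sigma \ge 0$ we estimate its $H^\sigma$ norm
\[ \| u^h(t) \|_{H^\sigma(\mathbb{R}^2)} = \| u_1^h(t) \|_{H^\sigma(\mathbb{R}^2)} + \| u_2^h(t) \|_{H^\sigma(\mathbb{R}^2)} \tag{4.12} \]
by the sum of three terms
\[ \begin{aligned} &\lambda^{-s-\delta} \Big\| \phi\Big(\frac{\cdot}{\lambda^\delta}\Big) \Big\|_{H^\sigma_{x_1}} \Big\| \phi\Big(\frac{\cdot}{\lambda^\delta}\Big) \cos(\lambda \cdot - \omega t) \Big\|_{H^\sigma_{x_2}} + \lambda^{-s-1-2\delta} \Big\| \phi\Big(\frac{\cdot}{\lambda^\delta}\Big) \Big\|_{H^\sigma_{x_1}} \Big\| \phi'\Big(\frac{\cdot}{\lambda^\delta}\Big) \sin(\lambda \cdot - \omega t) \Big\|_{H^\sigma_{x_2}} \\ &\quad + \lambda^{-s-1-2\delta} \Big\| \phi'\Big(\frac{\cdot}{\lambda^\delta}\Big) \Big\|_{H^\sigma_{x_1}} \Big\| \phi\Big(\frac{\cdot}{\lambda^\delta}\Big) \sin(\lambda \cdot - \omega t) \Big\|_{H^\sigma_{x_2}} \end{aligned} \]
and apply (4.7) and (4.8) to obtain
\[ \| u^h(t) \|_{H^\sigma(\mathbb{R}^2)} \lesssim \lambda^{-s+\sigma}, \tag{4.13} \]
which holds uniformly in $t \in [0, 1]$. Similarly, we find another bound
\[ \| u^h(t) \|_{\infty} \le \| u_1^h(t) \|_{\infty} + \| u_2^h(t) \|_{\infty} \lesssim \lambda^{-s-\delta}, \tag{4.14} \]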
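also valid for any $t \in [0, 1]$. We proceed to derive $L^2$ estimates of the error terms. We expect the contributions involving derivatives of the high frequency part to have the slowest decay in $\lambda$. It is possible to improve this decay somewhat by combining $E_1$ and $E_2$ and using energy estimates for the low frequency part. We will therefore bound the sum $E_1 + E_2$. Observe that
\[ \begin{aligned} \partial_t u^h(t, x) = \Big( &\omega \lambda^{-s-\delta} \phi\Big(\frac{x_1}{\lambda^\delta}\Big) \phi\Big(\frac{x_2}{\lambda^\delta}\Big) \sin(\lambda x_2 - \omega t) - \omega \lambda^{-s-1-2\delta} \phi\Big(\frac{x_1}{\lambda^\delta}\Big) \phi'\Big(\frac{x_2}{\lambda^\delta}\Big) \cos(\lambda x_2 - \omega t), \\ &\omega \lambda^{-s-1-2\delta} \phi'\Big(\frac{x_1}{\lambda^\delta}\Big) \phi\Big(\frac{x_2}{\lambda^\delta}\Big) \cos(\lambda x_2 - \omega t) \Big), \end{aligned} \]
and hence with our choices of the cut-offs $\psi_1$ and $\psi_2$ the first term in the first component of $\partial_t u^h(t, x)$ can be written as
\[ \begin{aligned} \omega \lambda^{-s-\delta} \phi\Big(\frac{x_1}{\lambda^\delta}\Big) \phi\Big(\frac{x_2}{\lambda^\delta}\Big) \sin(\lambda x_2 - \omega t) &= \omega \lambda^{-1} \psi_1'\Big(\frac{x_1}{\lambda^\delta}\Big) \psi_2\Big(\frac{x_2}{\lambda^\delta}\Big)\, \lambda^{-s+1-\delta} \phi\Big(\frac{x_1}{\lambda^\delta}\Big) \phi\Big(\frac{x_2}{\lambda^\delta}\Big) \sin(\lambda x_2 - \omega t) \\ &= \lambda^{-s+1-\delta}\, u_2^l(0, x)\, \phi\Big(\frac{x_1}{\lambda^\delta}\Big) \phi\Big(\frac{x_2}{\lambda^\delta}\Big) \sin(\lambda x_2 - \omega t), \end{aligned} \]
using the formula for $u^l(0, x)$ given by (4.4) and (4.5). We can now write the first component of $E_1 + E_2$ explicitly in the form
\[ \begin{aligned} \big( \partial_t u^h + \nabla_{u^l} u^h \big)_1(t, x) = {} & \lambda^{-s+1-\delta} \big( u_2^l(0, x) - u_2^l(t, x) \big) \phi\Big(\frac{x_1}{\lambda^\delta}\Big) \phi\Big(\frac{x_2}{\lambda^\delta}\Big) \sin(\lambda x_2 - \omega t) \\ & - \omega \lambda^{-s-1-2\delta} \phi\Big(\frac{x_1}{\lambda^\delta}\Big) \phi'\Big(\frac{x_2}{\lambda^\delta}\Big) \cos(\lambda x_2 - \omega t) \\ & + \lambda^{-s-2\delta}\, u_1^l(t, x)\, \phi'\Big(\frac{x_1}{\lambda^\delta}\Big) \phi\Big(\frac{x_2}{\lambda^\delta}\Big) \cos(\lambda x_2 - \omega t) \\ & + \lambda^{-s-1-3\delta}\, u_1^l(t, x)\, \phi'\Big(\frac{x_1}{\lambda^\delta}\Big) \phi'\Big(\frac{x_2}{\lambda^\delta}\Big) \sin(\lambda x_2 - \omega t) \\ & + 2 \lambda^{-s-2\delta}\, u_2^l(t, x)\, \phi\Big(\frac{x_1}{\lambda^\delta}\Big) \phi'\Big(\frac{x_2}{\lambda^\delta}\Big) \cos(\lambda x_2 - \omega t) \\ & + \lambda^{-s-1-3\delta}\, u_2^l(t, x)\, \phi\Big(\frac{x_1}{\lambda^\delta}\Big) \phi''\Big(\frac{x_2}{\lambda^\delta}\Big) \sin(\lambda x_2 - \omega t), \end{aligned} \]
while its second component is
\[ \begin{aligned} \big( \partial_t u^h + \nabla_{u^l} u^h \big)_2(t, x) = {} & \omega \lambda^{-s-1-2\delta} \phi'\Big(\frac{x_1}{\lambda^\delta}\Big) \phi\Big(\frac{x_2}{\lambda^\delta}\Big) \cos(\lambda x_2 - \omega t) \\ & - \lambda^{-s-1-3\delta}\, u_1^l(t, x)\, \phi''\Big(\frac{x_1}{\lambda^\delta}\Big) \phi\Big(\frac{x_2}{\lambda^\delta}\Big) \sin(\lambda x_2 - \omega t) \\ & - \lambda^{-s-2\delta}\, u_2^l(t, x)\, \phi'\Big(\frac{x_1}{\lambda^\delta}\Big) \phi\Big(\frac{x_2}{\lambda^\delta}\Big) \cos(\lambda x_2 - \omega t) \\ & - \lambda^{-s-1-3\delta}\, u_2^l(t, x)\, \phi'\Big(\frac{x_1}{\lambda^\delta}\Big) \phi'\Big(\frac{x_2}{\lambda^\delta}\Big) \sin(\lambda x_2 - \omega t). \end{aligned} \]
Using the estimates in Lemma 4.2 and Lemma 4.1 we can now estimate the $L^2$ norm of the first two error terms,
\[ \| E_1 + E_2 \|_{L^2} \lesssim \big\| \big( \partial_t u^h + \nabla_{u^l} u^h \big)_1(t) \big\|_{L^2(\mathbb{R}^2)} + \big\| \big( \partial_t u^h + \nabla_{u^l} u^h \big)_2(t) \big\|_{L^2(\mathbb{R}^2)}. \]
The norm of the first component is bounded by the sum
\[ \begin{aligned} &\lambda^{-s+1-\delta} \| u_2^l(t) - u_2^l(0) \|_{L^2} \|\phi\|_\infty \Big\| \phi\Big(\frac{\cdot}{\lambda^\delta}\Big) \sin(\lambda \cdot - \omega t) \Big\|_\infty + \lambda^{-s-1-2\delta} \Big\| \phi\Big(\frac{\cdot}{\lambda^\delta}\Big) \Big\|_{L^2} \Big\| \phi'\Big(\frac{\cdot}{\lambda^\delta}\Big) \cos(\lambda \cdot - \omega t) \Big\|_{L^2(\mathbb{R})} \\ &\quad + \| u_1^l(t) \|_{L^2} \|\phi'\|_\infty \Big( \lambda^{-s-2\delta} \Big\| \phi\Big(\frac{\cdot}{\lambda^\delta}\Big) \cos(\lambda \cdot - \omega t) \Big\|_\infty + \lambda^{-s-1-3\delta} \Big\| \phi'\Big(\frac{\cdot}{\lambda^\delta}\Big) \sin(\lambda \cdot - \omega t) \Big\|_\infty \Big) \\ &\quad + \| u_2^l(t) \|_{L^2} \|\phi\|_\infty \Big( \lambda^{-s-2\delta} \Big\| \phi'\Big(\frac{\cdot}{\lambda^\delta}\Big) \cos(\lambda \cdot - \omega t) \Big\|_\infty + \lambda^{-s-1-3\delta} \Big\| \phi''\Big(\frac{\cdot}{\lambda^\delta}\Big) \sin(\lambda \cdot - \omega t) \Big\|_\infty \Big), \end{aligned} \]
and that of the second component is bounded by
\[ \begin{aligned} &\lambda^{-s-1-2\delta} \Big\| \phi'\Big(\frac{\cdot}{\lambda^\delta}\Big) \Big\|_{L^2(\mathbb{R})} \Big\| \phi\Big(\frac{\cdot}{\lambda^\delta}\Big) \cos(\lambda \cdot - \omega t) \Big\|_{L^2(\mathbb{R})} + \lambda^{-s-1-3\delta} \| u_1^l(t) \|_{L^2(\mathbb{R}^2)} \|\phi''\|_\infty \Big\| \phi\Big(\frac{\cdot}{\lambda^\delta}\Big) \sin(\lambda \cdot - \omega t) \Big\|_\infty \\ &\quad + \| u_2^l(t) \|_{L^2} \|\phi'\|_\infty \Big( \lambda^{-s-2\delta} \Big\| \phi\Big(\frac{\cdot}{\lambda^\delta}\Big) \cos(\lambda \cdot - \omega t) \Big\|_\infty + \lambda^{-s-1-3\delta} \Big\| \phi'\Big(\frac{\cdot}{\lambda^\delta}\Big) \sin(\lambda \cdot - \omega t) \Big\|_\infty \Big). \end{aligned} \]
Combining these estimates and using (4.7), (4.8) and (4.10) we get
\[ \begin{aligned} \| E_1 + E_2 \|_{L^2} \lesssim {} & \lambda^{-s+1-\delta} \int_0^T \| \partial_t u^l(t) \|_{L^2(\mathbb{R}^2)}\, dt + \lambda^{-s-1-2\delta} \lambda^{\delta/2} \lambda^{\delta/2} + \lambda^{-s-2\delta} \lambda^{-1+\delta} + \lambda^{-s-1-3\delta} \lambda^{-1+\delta} \\ & + \lambda^{-s-2\delta} \lambda^{-1+\delta} + \lambda^{-s-1-3\delta} \lambda^{-1+\delta} + \lambda^{-s-1-2\delta} \lambda^{\delta/2} \lambda^{\delta/2} + \lambda^{-s-1-3\delta} \lambda^{-1+\delta} \\ & + \lambda^{-s-2\delta} \lambda^{-1+\delta} + \lambda^{-s-1-3\delta} \lambda^{-1+\delta} \\ \lesssim {} & \lambda^{-s-1-\delta} + \lambda^{-s+1-\delta} \int_0^T \| \partial_t u^l(t) \|_{L^2(\mathbb{R}^2)}\, dt. \end{aligned} \]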
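Since $u^l(t, x)$ is defined in (4.4) as a solution of the Euler equations we can estimate the integral term above using Lemma 4.2 by
\[ \int_0^T \| \partial_t u^l(t) \|_{L^2(\mathbb{R}^2)}\, dt = \int_0^T \| P \nabla_{u^l} u^l(t) \|_{L^2(\mathbb{R}^2)}\, dt \lesssim \int_0^T \| u^l(t) \|_{\infty} \| u^l(t) \|_{H^1(\mathbb{R}^2)}\, dt \lesssim \lambda^{2(-1+\delta)}, \]
where $P$ is the $L^2$ orthogonal Hodge projection onto divergence free vector fields defined in (2.3). We have therefore obtained the estimate
\[ \| E_1 + E_2 \|_{L^2(\mathbb{R}^2)} \lesssim \lambda^{-s-1+\delta}. \tag{4.15} \]
Using the bounds on the high frequency term in (4.13) and (4.14) and proceeding as above we find estimates of the remaining error terms
\[ \| E_3 + E_5 \|_{L^2(\mathbb{R}^2)} = \| P \nabla_{u^h} u^l - (1 - P) \nabla_{u^h} u^l \|_{L^2(\mathbb{R}^2)} \le 2\, \| u^h(t) \|_{\infty} \| u^l(t) \|_{H^1(\mathbb{R}^2)} \lesssim \lambda^{-s-\delta} \lambda^{-1+\delta} = \lambda^{-s-1}, \tag{4.16} \]
and similarly
\[ \| E_4 + E_6 \|_{L^2(\mathbb{R}^2)} = \| P \nabla_{u^h} u^h \|_{L^2(\mathbb{R}^2)} \le \| u^h(t) \|_{\infty} \| u^h(t) \|_{H^1(\mathbb{R}^2)} \lesssim \lambda^{-s-\delta} \lambda^{-s+1} = \lambda^{-2s+1-\delta}. \tag{4.17} \]
Collecting the estimates above gives the following $L^2$ bound:
\[ \Big\| \sum_{j=1}^{6} E_j \Big\|_{L^2(\mathbb{R}^2)} \lesssim \lambda^{-r_{s,\delta}}, \tag{4.18} \]
where $r_{s,\delta} = \min(2s - 1 + \delta,\ s + 1 - \delta)$.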
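The bookkeeping behind (4.18) is just a comparison of three exponents. The short sketch below, with hypothetical function names, records the decay exponents read off from (4.15), (4.16) and (4.17) and takes their minimum; for $\delta \ge 0$ the exponent from (4.16) never attains the minimum, which is why only two terms appear in $r_{s,\delta}$.

```python
def error_exponents(s, delta):
    """Decay exponents in lambda of the grouped error terms, read off from (4.15)-(4.17)."""
    return {
        "E1+E2": s + 1 - delta,      # from (4.15)
        "E3+E5": s + 1,              # from (4.16)
        "E4+E6": 2 * s - 1 + delta,  # from (4.17)
    }


def r_exponent(s, delta):
    """The exponent r_{s,delta} in (4.18): the slowest decay among the grouped terms."""
    return min(error_exponents(s, delta).values())


if __name__ == "__main__":
    for s, delta in [(2.0, 0.5), (0.3, 0.5), (0.2, 0.5)]:
        # The error is small for large lambda only when the exponent is positive.
        print(s, delta, r_exponent(s, delta), r_exponent(s, delta) > 0)
```

For example, with $\delta = 0.5$ the exponent is positive for $s = 0.3$ but not for $s = 0.2$, which illustrates how the choice of $\delta$ interacts with the admissible range of $s$.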
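Note that in order to assure that the error terms are small for $\lambda \gg 1$ we need $r_{s,\delta} > 0$.

4.3. Construction of solutions. Our next task will be to show that the family of functions $u^{\omega,\lambda}$ constructed in the previous subsections is a sufficiently good approximation to solutions of the Euler equations. Let $u_{\omega,\lambda} = u_{\omega,\lambda}(t, x)$ be the unique solution of the Euler equations in $\mathbb{R}^2$ with initial data given by the values of $u^{\omega,\lambda}$ at time $t = 0$. Namely,
\[ \begin{aligned} &\partial_t u_{\omega,\lambda} + \nabla_{u_{\omega,\lambda}} u_{\omega,\lambda} = -\nabla p_{\omega,\lambda}, \qquad \mathrm{div}\, u_{\omega,\lambda} = 0, \\ &u_{\omega,\lambda}(0, x) = u^{\omega,\lambda}(0, x) = u^l(0, x) + u^h(0, x), \qquad x \in \mathbb{R}^2,\ t \in \mathbb{R}. \end{aligned} \tag{4.19} \]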
Observe that from formulas (4.5) and (4.11) and the estimates in (4.13) and Lemma 4.1 we have
\[
\|u_{\omega,\lambda}(0)\|_{H^s(\mathbb{R}^2)} \le \|u^{l}(0)\|_{H^s(\mathbb{R}^2)} + \|u^{h}(0)\|_{H^s(\mathbb{R}^2)} \lesssim \lambda^{-1+\delta} + 1 \lesssim 1, \tag{4.20}
\]
provided that we pick $\delta$ such that $0<\delta<1$.
It follows that if $s>2$ then by the local existence and uniqueness theorem for the Euler equations the solution $u_{\omega,\lambda}(t,x)$ is defined globally in time and with values in $H^s(\mathbb{R}^2)$. In fact, it also follows from the bound on the lifespan (2.7) in the local wellposedness theorem, see estimate (5.1). Next, consider the difference $v = u^{\omega,\lambda}-u_{\omega,\lambda}$ between the approximate and the real solutions constructed above and observe that $v$ satisfies the Cauchy problem
\[
\partial_t v - P\nabla_v v + P\nabla_{u^{\omega,\lambda}}v + P\nabla_v u^{\omega,\lambda} = \sum_{j=1}^{6}E_j, \qquad v(0)=0. \tag{4.21}
\]
Standard energy estimates give
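\[
\begin{aligned}
\frac12\,\partial_t\|v\|^2_{L^2(\mathbb{R}^2)}
&= \int_{\mathbb{R}^2}\langle\partial_t v, v\rangle
= -\int_{\mathbb{R}^2}\langle\nabla_v u^{\omega,\lambda}, v\rangle + \int_{\mathbb{R}^2}\Big\langle\sum_j E_j, v\Big\rangle\\
&\le \|\nabla_v u^{\omega,\lambda}\|_{L^2(\mathbb{R}^2)}\|v\|_{L^2(\mathbb{R}^2)} + \Big\|\sum_j E_j\Big\|_{L^2(\mathbb{R}^2)}\|v\|_{L^2(\mathbb{R}^2)}\\
&\lesssim \|u^{\omega,\lambda}\|_{C^1(\mathbb{R}^2)}\|v\|^2_{L^2(\mathbb{R}^2)} + \lambda^{-r_{s,\delta}}\|v\|_{L^2(\mathbb{R}^2)},
\end{aligned}
\]
where by the estimate of Lemma 4.2, the explicit formula for $u^h(t,x)$ in (4.11) and the Sobolev lemma we have
\[
\|u^{\omega,\lambda}(t)\|_{C^1(\mathbb{R}^2)} \lesssim \|u^{l}(t)\|_{H^{2+\varepsilon}(\mathbb{R}^2)} + \|u^{h}(t)\|_{C^1(\mathbb{R}^2)} \lesssim \lambda^{-1+\delta} + \lambda^{-s+1-\delta}.
\]
Therefore, we obtain
\[
\partial_t\|v\|_{L^2(\mathbb{R}^2)} \lesssim \max\{\lambda^{-1+\delta},\lambda^{-s+1-\delta}\}\,\|v\|_{L^2(\mathbb{R}^2)} + \lambda^{-r_{s,\delta}},
\]
and using Gronwall's inequality we find
\[
\|v(t)\|_{L^2(\mathbb{R}^2)} \lesssim \lambda^{-r_{s,\delta}}\,e^{\,t\max\{\lambda^{-1+\delta},\,\lambda^{-s+1-\delta}\}}, \tag{4.22}
\]
which in particular holds uniformly for all $t\in[0,1]$.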
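The Gronwall step can be illustrated with a small numerical sketch. It assumes the scalar model inequality $y' \le a y + b$, $y(0)=0$, with $a=\max\{\lambda^{-1+\delta},\lambda^{-s+1-\delta}\}$ and $b=\lambda^{-r_{s,\delta}}$ (all implicit constants set to one, which is an assumption made only for illustration). The worst case is the equality $y' = ay+b$, whose solution $(b/a)(e^{at}-1)$ is dominated by $b\,t\,e^{at}$. All function names are hypothetical.

```python
import math


def growth_rate(lam, s, delta):
    """Coefficient a = max(lambda^(-1+delta), lambda^(-s+1-delta)) with constants set to 1."""
    return max(lam ** (-1 + delta), lam ** (-s + 1 - delta))


def forcing(lam, s, delta):
    """Coefficient b = lambda^(-r_{s,delta}) with r_{s,delta} as in (4.18)."""
    return lam ** (-min(2 * s - 1 + delta, s + 1 - delta))


def worst_case_solution(a, b, t):
    """Exact solution of y' = a*y + b, y(0) = 0."""
    if a == 0:
        return b * t
    return (b / a) * math.expm1(a * t)


def gronwall_bound(a, b, t):
    """Upper bound b*t*exp(a*t) for any y with y' <= a*y + b, y(0) = 0."""
    return b * t * math.exp(a * t)
```

For $\lambda \ge 1$ and $s \ge 1-\delta$ the rate $a$ is at most one, so on $[0,1]$ the bound is at most $e\,\lambda^{-r_{s,\delta}}$, which is the uniformity in $t$ used after (4.22).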
4.4. Conclusion of the proof. Let $u_{+1,\lambda}(t)$ and $u_{-1,\lambda}(t)$ be two sequences of solutions of the Cauchy problem (4.19) corresponding to initial conditions $u_{+1,\lambda}(0)=u^{+1,\lambda}(0)$ and $u_{-1,\lambda}(0)=u^{-1,\lambda}(0)$. Since $0<\delta<1$ for any integer $k>2$ we have
\[
\|u^{\pm1,\lambda}(t)\|_{H^k(\mathbb{R}^2)} \le \|u^{l}(t)\|_{H^k(\mathbb{R}^2)} + \|u^{h}(t)\|_{H^k(\mathbb{R}^2)} \lesssim \lambda^{-1+\delta}+\lambda^{-s+k} \lesssim \lambda^{-s+k}
\]
uniformly in $t$ by (4.13) and the estimate of Lemma 4.2. Using the corresponding energy estimate for the solutions $u_{\pm1,\lambda}(t)$ (see the Appendix) we then also get
\[
\|u_{\pm1,\lambda}(t)\|_{H^k(\mathbb{R}^2)} \lesssim \|u_{\pm1,\lambda}(0)\|_{H^k(\mathbb{R}^2)} = \|u^{\pm1,\lambda}(0)\|_{H^k(\mathbb{R}^2)} \lesssim \lambda^{-s+k}.
\]
Put together, these estimates give the following bound for the difference:
\[
\|u^{\pm1,\lambda}(t)-u_{\pm1,\lambda}(t)\|_{H^k(\mathbb{R}^2)} \lesssim \lambda^{-s+k}, \tag{4.23}
\]
which holds uniformly in time for any integer $k>n/2+1$. On the other hand, since $0<\delta<1$, we see that if $s>1-\delta$, then $r_{s,\delta}>0$ (see the definition in (4.18) above). Furthermore the exponential term in (4.22) is bounded for large $\lambda\gg1$ and thus we have
\[
\|u^{\pm1,\lambda}(t)-u_{\pm1,\lambda}(t)\|_{L^2(\mathbb{R}^2)} \lesssim \lambda^{-r_{s,\delta}} \longrightarrow 0 \quad\text{as } \lambda\to\infty. \tag{4.24}
\]
Now let $s>0$. Choosing $\delta\in(0,1)$ such that $s>1-\delta$ we have $r_{s,\delta}>0$.
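Interpolating between $s_1=0$ and $s_2=k=[s]+2$ gives
\[
\begin{aligned}
\|u^{\pm1,\lambda}(t)-u_{\pm1,\lambda}(t)\|_{H^s(\mathbb{R}^2)}
&\le \|u^{\pm1,\lambda}(t)-u_{\pm1,\lambda}(t)\|^{(k-s)/k}_{L^2(\mathbb{R}^2)}\,
\|u^{\pm1,\lambda}(t)-u_{\pm1,\lambda}(t)\|^{s/k}_{H^k(\mathbb{R}^2)}\\
&\lesssim \lambda^{-r_{s,\delta}(k-s)/k}\,\lambda^{(-s+k)s/k}
= \lambda^{-(r_{s,\delta}-s)(k-s)/k}.
\end{aligned} \tag{4.25}
\]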
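Observe that by definition of $r_{s,\delta}$ in (4.18) and our choice of $\delta$ in (4.20) we have
\[
r_{s,\delta}-s = \min(s-1+\delta,\ 1-\delta) > 0, \tag{4.26}
\]
and therefore (4.23) and (4.26) give the key estimate
\[
\|u^{\pm1,\lambda}(t)-u_{\pm1,\lambda}(t)\|_{H^s(\mathbb{R}^2)} \lesssim \lambda^{-(r_{s,\delta}-s)(k-s)/k} \longrightarrow 0 \quad\text{as } \lambda\to\infty. \tag{4.27}
\]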
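The exponent arithmetic in the interpolation step can be checked with exact rationals. The sketch below is illustrative only; `interpolated_exponent` is a hypothetical helper that combines the $L^2$ rate $\lambda^{-r_{s,\delta}}$ and the $H^k$ rate $\lambda^{-s+k}$ with the weights $(k-s)/k$ and $s/k$, and compares the result with the closed form $-(r_{s,\delta}-s)(k-s)/k$.

```python
import math
from fractions import Fraction


def interpolated_exponent(s, delta):
    """Return (power of lambda from interpolation, closed form -(r-s)(k-s)/k)."""
    s, delta = Fraction(s), Fraction(delta)
    k = math.floor(s) + 2                        # k = [s] + 2
    r = min(2 * s - 1 + delta, s + 1 - delta)    # r_{s,delta} from (4.18)
    weight_l2 = (k - s) / k                      # exponent on the L^2 norm
    weight_hk = s / k                            # exponent on the H^k norm
    combined = -r * weight_l2 + (k - s) * weight_hk
    closed_form = -(r - s) * (k - s) / k
    return combined, closed_form
```

For example, with $s=3/2$ and $\delta=3/4$ one has $k=3$, $r_{s,\delta}=7/4$ and both expressions equal $-1/8$, so the $H^s$ distance between approximate and exact solutions decays like $\lambda^{-1/8}$ in this instance.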
We can now complete the proof as follows. On the one hand, we have u +1,λ (0) − u −1,λ (0)
· · −1 = 2λ ψ ψ 1 H s (R 2 ) λδ Hxs1 (R) 2 λδ Hxs2 (R) · · + 2λ−1 ψ1 (4.28) ψ2 δ s δ s λ λ H x (R ) H x (R ) 1 2 λ−1+δ + λ−1+δ −→ 0,
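provided that $\lambda\to\infty$. On the other hand, for any $t>0$ by the triangle inequality and (4.25) we have
\[
\begin{aligned}
\|u_{+1,\lambda}(t)-u_{-1,\lambda}(t)\|_{H^s(\mathbb{R}^2)}
&\ge \|u^{+1,\lambda}(t)-u^{-1,\lambda}(t)\|_{H^s(\mathbb{R}^2)}
- \|u^{+1,\lambda}(t)-u_{+1,\lambda}(t)\|_{H^s(\mathbb{R}^2)}
- \|u^{-1,\lambda}(t)-u_{-1,\lambda}(t)\|_{H^s(\mathbb{R}^2)}\\
&\gtrsim \|u^{+1,\lambda}(t)-u^{-1,\lambda}(t)\|_{H^s(\mathbb{R}^2)} - \lambda^{-(r_{s,\delta}-s)(k-s)/k}.
\end{aligned} \tag{4.29}
\]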
The difference of the approximate solutions (with obvious notation for low and high frequency terms) can be written as
\[
u^{+1,\lambda}(t,x)-u^{-1,\lambda}(t,x) = u^{l,+1}(t,x)-u^{l,-1}(t,x)+u^{h,+1}(t,x)-u^{h,-1}(t,x),
\]
so that by Lemma 4.2 we have
\[
\begin{aligned}
\|u^{+1,\lambda}(t)-u^{-1,\lambda}(t)\|_{H^s(\mathbb{R}^2)}
&\ge \|u^{h,+1}(t)-u^{h,-1}(t)\|_{H^s(\mathbb{R}^2)} - \|u^{l,+1}(t)\|_{H^s(\mathbb{R}^2)} - \|u^{l,-1}(t)\|_{H^s(\mathbb{R}^2)}\\
&\gtrsim \|u^{h,+1}(t)-u^{h,-1}(t)\|_{H^s(\mathbb{R}^2)} - \lambda^{-1+\delta}.
\end{aligned} \tag{4.30}
\]
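Furthermore, the difference of the high frequency terms appearing on the right side can be expressed explicitly using (4.11) as
\[
u^{h,+1}(t,x)-u^{h,-1}(t,x) =
\begin{pmatrix}
\lambda^{-s-\delta}\varphi\big(\tfrac{x_1}{\lambda^\delta}\big)\varphi\big(\tfrac{x_2}{\lambda^\delta}\big)\big(\cos(\lambda x_2-t)-\cos(\lambda x_2+t)\big)
+\lambda^{-s-1-2\delta}\varphi\big(\tfrac{x_1}{\lambda^\delta}\big)\varphi'\big(\tfrac{x_2}{\lambda^\delta}\big)\big(\sin(\lambda x_2-t)-\sin(\lambda x_2+t)\big)\\[4pt]
-\lambda^{-s-1-2\delta}\varphi'\big(\tfrac{x_1}{\lambda^\delta}\big)\varphi\big(\tfrac{x_2}{\lambda^\delta}\big)\big(\sin(\lambda x_2-t)-\sin(\lambda x_2+t)\big)
\end{pmatrix},
\]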
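so that using the triangle inequality and a little trigonometry as in the periodic case together with Lemma 4.1 we obtain
\[
\begin{aligned}
\|u^{h,+1}(t)-u^{h,-1}(t)\|_{H^s}
&\gtrsim |\sin t|\,\lambda^{-s-\delta}\,\big\|\varphi\big(\tfrac{\cdot}{\lambda^\delta}\big)\big\|_{H^s_{x_1}}\big\|\varphi\big(\tfrac{\cdot}{\lambda^\delta}\big)\sin\lambda(\cdot)\big\|_{H^s_{x_2}}\\
&\quad-\lambda^{-s-1-2\delta}\,\big\|\varphi\big(\tfrac{\cdot}{\lambda^\delta}\big)\big\|_{H^s_{x_1}}\big\|\varphi'\big(\tfrac{\cdot}{\lambda^\delta}\big)\big(\sin(\lambda\cdot-t)-\sin(\lambda\cdot+t)\big)\big\|_{H^s_{x_2}}\\
&\quad+\lambda^{-s-1-2\delta}\,\big\|\varphi'\big(\tfrac{\cdot}{\lambda^\delta}\big)\big\|_{H^s_{x_1}}\big\|\varphi\big(\tfrac{\cdot}{\lambda^\delta}\big)\big(\sin(\lambda\cdot-t)-\sin(\lambda\cdot+t)\big)\big\|_{H^s_{x_2}}\\
&\gtrsim |\sin t|\,\lambda^{-s-\delta}\lambda^{\delta/2}\lambda^{s+\delta/2}-\lambda^{-s-1-2\delta}\lambda^{\delta/2}\lambda^{s+\delta/2}
= |\sin t|-\lambda^{-1-\delta}.
\end{aligned}
\]
Combining this estimate with (4.29) and (4.30) we obtain
\[
\|u_{+1,\lambda}(t)-u_{-1,\lambda}(t)\|_{H^s(\mathbb{R}^2)} \gtrsim |\sin t|-\lambda^{-1-\delta}-\lambda^{-1+\delta}-\lambda^{-(r_{s,\delta}-s)(k-s)/k} \longrightarrow |\sin t|
\]
as $\lambda\to\infty$ for any $0<t<1$. The proof of part (NP) of Theorem 2.1 is complete.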
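5. Appendix

In order to make the paper self-contained we provide here proofs of the estimates omitted from the main text.

5.1. Proof of Lemma 4.1. Let $\sigma\ge0$ and $\delta\ge0$. For any Schwartz function $\psi\in\mathcal{S}(\mathbb{R})$ and any $\lambda\gg1$ a simple computation gives
\[
\begin{aligned}
\big\|\psi\big(\tfrac{\cdot}{\lambda^\delta}\big)\big\|^2_{H^\sigma}
&= \frac{1}{2\pi}\int_{\mathbb{R}}(1+\xi^2)^\sigma\,\Big|\widehat{\psi\big(\tfrac{\cdot}{\lambda^\delta}\big)}(\xi)\Big|^2 d\xi
= \frac{1}{2\pi}\int_{\mathbb{R}}(1+\xi^2)^\sigma\,\big|\lambda^\delta\hat\psi(\lambda^\delta\xi)\big|^2 d\xi\\
&= \lambda^\delta\,\frac{1}{2\pi}\int_{\mathbb{R}}\Big(1+\frac{1}{\lambda^{2\delta}}(\lambda^\delta\xi)^2\Big)^{\sigma}\big|\hat\psi(\lambda^\delta\xi)\big|^2\,d(\lambda^\delta\xi)
= \lambda^\delta\,\frac{1}{2\pi}\int_{\mathbb{R}}\Big(1+\frac{\xi^2}{\lambda^{2\delta}}\Big)^{\sigma}\big|\hat\psi(\xi)\big|^2\,d\xi,
\end{aligned}
\]
which proves estimate (4.7) of Lemma 4.1. Next, for any constant $a\in\mathbb{R}$ we have
\[
\begin{aligned}
\Big(\psi\big(\tfrac{\cdot}{\lambda^\delta}\big)\cos(\lambda\cdot-a)\Big)^{\wedge}(\xi)
&= \int_{\mathbb{R}}e^{-ix\xi}\psi(\lambda^{-\delta}x)\cos(\lambda x-a)\,dx\\
&= \frac{e^{-ia}}{2}\int_{\mathbb{R}}e^{-ix(\xi-\lambda)}\psi(\lambda^{-\delta}x)\,dx+\frac{e^{ia}}{2}\int_{\mathbb{R}}e^{-ix(\xi+\lambda)}\psi(\lambda^{-\delta}x)\,dx\\
&= \frac{e^{-ia}}{2}\lambda^\delta\hat\psi\big(\lambda^\delta(\xi-\lambda)\big)+\frac{e^{ia}}{2}\lambda^\delta\hat\psi\big(\lambda^\delta(\xi+\lambda)\big).
\end{aligned}
\]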
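The rescaling identity behind (4.7) can be checked numerically on a concrete profile. The following is a minimal sketch under stated assumptions: it takes the Gaussian $\psi(x)=e^{-x^2/2}$ (so that $|\hat\psi(\xi)|^2=2\pi e^{-\xi^2}$ with the Fourier convention used above), writes $a=\lambda^\delta$, and uses a plain midpoint rule. The helper names are hypothetical.

```python
import math


def midpoint(f, lo, hi, n=20000):
    """Composite midpoint rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (j + 0.5) * h) for j in range(n))


def psi_hat_sq(xi):
    """|psi_hat(xi)|^2 for psi(x) = exp(-x^2/2), psi_hat(xi) = int e^{-i x xi} psi(x) dx."""
    return 2.0 * math.pi * math.exp(-xi * xi)


def norm_sq_direct(a, sigma, cutoff=12.0):
    """(1/2pi) * int (1 + xi^2)^sigma |a * psi_hat(a * xi)|^2 dxi."""
    f = lambda xi: (1.0 + xi * xi) ** sigma * a * a * psi_hat_sq(a * xi)
    return midpoint(f, -cutoff / a, cutoff / a) / (2.0 * math.pi)


def norm_sq_rescaled(a, sigma, cutoff=12.0):
    """a * (1/2pi) * int (1 + xi^2 / a^2)^sigma |psi_hat(xi)|^2 dxi."""
    f = lambda xi: (1.0 + (xi / a) ** 2) ** sigma * psi_hat_sq(xi)
    return a * midpoint(f, -cutoff, cutoff) / (2.0 * math.pi)
```

For large $a$ the rescaled integrand tends to $|\hat\psi|^2$, so the squared norm behaves like $a\,\|\psi\|^2_{L^2}=a\sqrt{\pi}$ for this Gaussian, in line with the $\lambda^{\delta/2}$ growth of the norm used in the main text.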
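The second estimate (4.8) follows now from
\[
\begin{aligned}
\lambda^{-2\sigma-\delta}\Big\|\psi\big(\tfrac{\cdot}{\lambda^\delta}\big)\cos(\lambda\cdot-a)\Big\|^2_{H^\sigma}
&= \lambda^{-2\sigma-\delta}\frac{1}{2\pi}\int_{\mathbb{R}}(1+\xi^2)^\sigma\Big|\big(\psi(\lambda^{-\delta}\cdot)\cos(\lambda\cdot-a)\big)^{\wedge}(\xi)\Big|^2 d\xi\\
&= \lambda^{-2\sigma}\frac{1}{8\pi}\bigg(\int_{\mathbb{R}}\big(1+(\lambda^{-\delta}\xi+\lambda)^2\big)^\sigma|\hat\psi(\xi)|^2\,d\xi\\
&\qquad+2\int_{\mathbb{R}}\big(1+(\lambda^{-\delta}\xi+\lambda)^2\big)^\sigma\,\mathrm{Re}\Big(e^{-2ia}\hat\psi(\xi)\overline{\hat\psi(\xi+2\lambda^{1+\delta})}\Big)d\xi\\
&\qquad+\int_{\mathbb{R}}\big(1+(\lambda^{-\delta}\xi-\lambda)^2\big)^\sigma|\hat\psi(\xi)|^2\,d\xi\bigg),
\end{aligned}
\]
which is equal to
\[
\begin{aligned}
&\frac{1}{8\pi}\int_{\mathbb{R}}\big(\lambda^{-2}+(\lambda^{-1-\delta}\xi+1)^2\big)^\sigma|\hat\psi(\xi)|^2\,d\xi
+\frac{1}{8\pi}\int_{\mathbb{R}}\big(\lambda^{-2}+(\lambda^{-1-\delta}\xi-1)^2\big)^\sigma|\hat\psi(\xi)|^2\,d\xi\\
&\quad+\frac{1}{4\pi}\int_{\mathbb{R}}\big(\lambda^{-2}+(\lambda^{-1-\delta}\xi+1)^2\big)^\sigma\,\mathrm{Re}\Big(e^{-2ia}\hat\psi(\xi)\overline{\hat\psi(\xi+2\lambda^{1+\delta})}\Big)d\xi.
\end{aligned}
\]
By the dominated convergence theorem the first two terms together converge to $\tfrac12\|\psi\|^2_{L^2}$ while the third term vanishes as $\lambda\to\infty$.

5.2. Proof of Lemma 4.2. The inequality in (4.10) follows from Lemma 4.1 and an energy estimate for (4.4) in $H^s(\mathbb{R}^n,\mathbb{R}^n)$, where $s>n/2+1$. For the energy estimate one usual trick is to use Friedrichs mollifiers $J_\varepsilon$ ($\varepsilon>0$) combined with a limiting argument. First, we replace (4.4) with a regularized equation
\[
\partial_t\Lambda^sJ_\varepsilon u^l+\Lambda^sJ_\varepsilon P\nabla_{u^l}u^l=0,\qquad P=1-\nabla\Delta^{-1}\operatorname{div},
\]
where $\Lambda^s=(1-\Delta)^{s/2}$. We can arrange so that the pseudodifferential operators $J_\varepsilon$, $\Lambda^s$ and $P$ commute and then proceed with standard estimates to get
\[
\begin{aligned}
\frac12\,\partial_t\|J_\varepsilon u^l\|^2_{H^s}
&= -\int_{\mathbb{R}^n}\big\langle\nabla_{u^l}\Lambda^sJ_\varepsilon u^l,\ P\Lambda^sJ_\varepsilon u^l\big\rangle
-\int_{\mathbb{R}^n}\big\langle[\Lambda^sJ_\varepsilon,\nabla_{u^l}]u^l,\ \Lambda^sJ_\varepsilon u^l\big\rangle\\
&\le C\,\|u^l\|_{C^1}\,\|u^l\|^2_{H^s}.
\end{aligned}
\]
Rewriting it as an integral inequality and passing to the limit with $\varepsilon\to0$ we eliminate dependence on $\varepsilon$ on the left-hand side. Integrating in time over $[0,t]$ we get
\[
\|u^l(t)\|_{H^s}\le\frac{\|u^l(0)\|_{H^s}}{1-Ct\,\|u^l(0)\|_{H^s}}
\]
so that
\[
\|u^l(t)\|_{H^s}\le2\,\|u^l(0)\|_{H^s}
\]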
for all
\[
0\le t\le T<T_c=\frac{1}{2C\,\|u^l(0)\|_{H^s}}.
\]
Finally, observe that for any $\sigma\le s$ we have
\[
\|u^l(t)\|_{H^\sigma(\mathbb{R}^n)}\le\|u^l(t)\|_{H^s(\mathbb{R}^n)}\le2\,\|u^l(0)\|_{H^s(\mathbb{R}^n)}\lesssim\lambda^{-1+\delta}
\]
by (4.5) and Lemma 4.1, which holds for all $0\le t\le T$, where by the estimates above
\[
T\ge\frac12\,C^{-1}\lambda^{1-\delta}\ge1, \tag{5.1}
\]
provided that $\lambda\gg1$ and $0<\delta<1$.
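The lifespan bound can be illustrated on the scalar comparison problem $y'=Cy^2$, $y(0)=y_0$, whose solution $y_0/(1-Cty_0)$ doubles exactly at $t=1/(2Cy_0)$. The sketch below is an assumption-laden illustration: it replaces $\|u^l(0)\|_{H^s}$ by $K\lambda^{-1+\delta}$ with hypothetical constants $C$ and $K$, and the function names are not from the paper.

```python
def riccati_solution(C, y0, t):
    """Solution of y' = C*y^2, y(0) = y0, valid while 1 - C*t*y0 > 0."""
    denom = 1.0 - C * t * y0
    if denom <= 0.0:
        raise ValueError("t is beyond the blow-up time 1/(C*y0)")
    return y0 / denom


def doubling_time(C, y0):
    """Time 1/(2*C*y0) up to which the solution stays below 2*y0."""
    return 1.0 / (2.0 * C * y0)


def low_frequency_lifespan(lam, delta, C=1.0, K=1.0):
    """Doubling time when y0 = K * lambda^(-1+delta), as in (5.1) with illustrative constants."""
    return doubling_time(C, K * lam ** (-1.0 + delta))
```

With $C=K=1$, $\lambda=100$ and $\delta=1/2$ this gives $y_0=0.1$ and a guaranteed time $T=5\ge1$, matching the scaling $T\gtrsim\lambda^{1-\delta}$.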
Acknowledgements. The authors would like to thank the referee for constructive suggestions.
Communicated by P. Constantin
Commun. Math. Phys. 296, 303–321 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1019-6
Communications in
Mathematical Physics
Interaction of Four Rarefaction Waves in the Bi-Symmetric Class of the Two-Dimensional Euler Equations
Jiequan Li1, Yuxi Zheng2
1 Department of Mathematics, Capital Normal University, Beijing 100037, People's Republic of China
2 Department of Mathematics, The Pennsylvania State University, University Park, PA 16802, USA. E-mail: [email protected]
Received: 6 August 2008 / Accepted: 27 December 2009
Published online: 24 February 2010 – © Springer-Verlag 2010
Abstract: The global existence and structures of solutions to multi-dimensional unsteady compressible Euler equations are interesting and important open problems. In this paper, we construct global classical solutions to the interaction of four orthogonal planar rarefaction waves with two axes of symmetry for the Euler equations in two space dimensions, in the case where the initial rarefaction waves are large. The bi-symmetric initial data is a basic type of four-wave two-dimensional Riemann problems. The solutions in this case are continuous, bounded and self-similar, and we characterize how large the rarefaction waves must be. We use the methods of hodograph transformation, characteristic decomposition, and phase space analysis. We resolve binary interactions of simple waves in the process.

1. Introduction

Consider the two-dimensional isentropic compressible Euler system
\[
\begin{cases}
\rho_t+(\rho u)_x+(\rho v)_y=0,\\
(\rho u)_t+(\rho u^2+p)_x+(\rho uv)_y=0,\\
(\rho v)_t+(\rho uv)_x+(\rho v^2+p)_y=0,
\end{cases} \tag{1.1}
\]
where $\rho$ is the density, $(u,v)$ is the velocity and $p$ is the pressure given by $p(\rho)=K\rho^\gamma$, where $K>0$ will be scaled to be one and $\gamma>1$ is the gas constant. Cauchy problems for (1.1) are open. Riemann problems for (1.1) are a current research topic, as they are reducible to involve fewer independent variables while retaining important features of general solutions. We refer the reader to [2,7,8,13,20,25] for some general solutions to one-dimensional and multidimensional Euler equations, and to [15,30] for the rich flow patterns displayed by solutions to Riemann problems. Shock reflection problems [5,19,24,31,34], in particular, are included in the Riemann problems.

Research partially supported by the Key Program from Beijing Educational Commission (KZ200910028002), 973 project (2006CB805902) and PHR(IHLB) and NSFC (10971142).
Research partially supported by NSF-DMS-0603859, 0908207.

Two-dimensional (2-D) Riemann problems are Cauchy problems with special initial data that are constant along each ray from the origin. The one-dimensional case is quite well-understood [4]. The two-dimensional case was formulated, and the solution configurations conjectured, in [28]. The solution configurations are complicated, as confirmed afterward by several numerical simulations [3,11,12,23]. There has been no rigorous proof of the numerical simulations due to lack of effective methods of analysis. In this paper, with the first success since the proposition [28], we construct analytic solutions to a case of a configuration of the 2-D four-wave Riemann problems of (1.1), using methods that we have developed in recent years. The construction is based on the analysis of (1.1) in three planes: the self-similar variables $(\xi,\eta)=(x/t,y/t)$, the inclination angles of characteristics $(\alpha,\beta)$, and the velocity $(u,v)$-plane of the hodograph transformation. These forms enable us to do analysis effectively. The case that we solve in this paper has two axes of symmetry. The initial data of a 2-D Riemann problem may possess a certain symmetry, e.g., axial symmetry, or piecewise constant along the axial direction with one or two axes of symmetry of the plane. A global solution for the axially symmetric case has been constructed [29,33]. Furthermore, a global solution for the binary interaction of two planar rarefaction waves has also been constructed recently [17]. The construction for the interaction of four planar rarefaction waves with one axis of symmetry, a primary class of the two-dimensional four-wave Riemann problems, and denoted as Configuration A in [15,28,30], has not been available, but the reason is clearly revealed in paper [10] in which shock formation is established numerically.
The shock formation near the sonic boundary makes the global construction difficult. The class of four planar rarefaction waves with two axes of symmetry, which we call bi-symmetric and denote as Configuration B in [15,28], have also been shown numerically to have shock development in [10] as well as in earlier numerical experiments [3,12,15,23]. However, the extra symmetry makes the configuration accessible by our newly developed tools. We obtain continuous global solutions in this class when the rarefaction waves are large, see Theorem 7.1. We use the hodograph transformation and characteristic decomposition in constructing the global solution. The characteristic decomposition handles simple waves best while the hodograph transformation, valid for non-simple waves, reduces the system to a linearly degenerate one. Assuming that the flow is ir-rotational and self-similar, Pogodin, Suchkov and Ianenko ([21], 1958) introduced the hodograph transformation to represent the system of equations (1.1) in the velocity variables (u, v), resulting in a decoupled partial differential equation of second order for the speed of sound c. In 2001, Li ([14]) carried out an analysis of the second order equation in the space (c, u, v), through a pair of variables resembling the well-known Riemann invariants together with their invariant regions, and established the existence of a solution to the expansion of a wedge of gas into vacuum in the hodograph plane for wide ranges of the gas constant and the wedge angle. In 2006, paper [16] clarified the concept of simple waves for (1.1). Then, in paper [17], we show that the hodograph transformation is non-degenerate (and globally one-to-one) precisely for non-simple waves, and the solutions constructed in [14] in the hodograph plane can be transformed back to the self-similar plane. 
Thus a complete procedure of construction of solutions is now available, which we use here for studying the interactions involved in Configuration B – the bi-symmetric class. In particular, the interaction of any two simple waves is completed in this paper, provided that the two waves are expanding toward vacuum, see Theorem 6.1.
Our main results are given in Theorems 6.1 and 7.1. A helpful characterization of simple waves, especially their boundaries, is given in Lemma 6.1, which will have broad applications in solving other 2-D Riemann problems. In the next section we list formulas and equations in various forms which form the basis for our construction in this paper. In Sect. 3, we select data to set up Configuration B, eliminating redundancy through scaling and translation or normalization. In Sect. 4, we recall previous work on the interaction of two symmetric planar rarefaction waves. In Sect. 5, we winnow the data to keep the problem hyperbolic. In Sect. 6, we characterize a complete patch of simple wave, as is needed in our solution, and construct the solution of interactions of two simple waves. In Sect. 7, we put the various pieces together to obtain the existence of global solutions, which is stated in Theorem 7.1 and reproduced here:

Theorem (Main theorem). Consider the Riemann problem for system (1.1) with initial data consisting of constant states $(c_i,u_i,v_i)$ in the $i$th quadrants ($i=1,2,3,4$) so that states 1 and 2 form a forward rarefaction wave $R^+_{12}$, states 2 and 3 form a backward rarefaction wave $R^-_{23}$, states 3 and 4 form a forward rarefaction wave $R^+_{34}$, and states 4 and 1 form a backward rarefaction wave $R^-_{41}$. (The rarefaction wave requirement on the data forces $c_2=c_4$, $c_1=c_3$, thus we call it a bi-symmetric problem.) Then, there exists a number $c_2^*(\gamma)\in(0,1)$ for $\gamma>1+\sqrt2$, such that our bi-symmetric Riemann problem has global continuous solutions, provided $0<c_2<c_2^*(\gamma)c_1$.

Notations. Here is a list of our notations: Besides the primitive variables $\rho$, $(u,v)$ and $p$, we have $c=\sqrt{\gamma p/\rho}$ as the speed of sound, $i=c^2/(\gamma-1)$ the enthalpy, $\varphi$ the pseudo-velocity potential. In terms of the self-similar variables $(\xi,\eta)$, we often use the pseudo-velocity $(U,V)=(u-\xi,v-\eta)$, and $\Lambda_\pm$, $\lambda_\pm$, where
\[
\Lambda_\pm=\frac{UV\pm c\sqrt{U^2+V^2-c^2}}{U^2-c^2},\qquad \Lambda_\pm\lambda_\mp=-1. \tag{1.2}
\]
The angles $\alpha$, $\beta$ and $\omega$ are defined as
\[
\tan\alpha:=\Lambda_+,\qquad \tan\beta:=\Lambda_-,\qquad \omega=(\alpha-\beta)/2,\qquad \tau=(\alpha+\beta)/2. \tag{1.3}
\]
We also use the following vector fields:
\[
\begin{aligned}
&\partial^\pm=\partial_\xi+\Lambda_\pm\partial_\eta,\qquad \partial_\pm=\partial_u+\lambda_\pm\partial_v,\qquad \partial_0=\partial_u,\\
&\bar\partial^+=(\cos\alpha,\sin\alpha)\cdot(\partial_\xi,\partial_\eta),\qquad \bar\partial^-=(\cos\beta,\sin\beta)\cdot(\partial_\xi,\partial_\eta),\qquad \bar\partial^0=\cos\tau\,\partial_\xi+\sin\tau\,\partial_\eta,\\
&\bar\partial_+=(\sin\beta,-\cos\beta)\cdot(\partial_u,\partial_v),\qquad \bar\partial_-=(\sin\alpha,-\cos\alpha)\cdot(\partial_u,\partial_v),
\end{aligned} \tag{1.4}
\]
and some notations
\[
\kappa=\frac{\gamma-1}{2},\qquad m=\frac{3-\gamma}{\gamma+1},\qquad \Omega=m-\tan^2\omega,\qquad \tan^2\theta_s=m,\qquad \nu=\frac{\gamma+1}{2(\gamma-1)}. \tag{1.5}
\]

2. Systems of Self-Similar Flows

2.1. Irrotational flows. Our primary system is system (1.1) in the self-similar variables $(\xi,\eta)=(x/t,y/t)$:
\[
\begin{cases}
U i_\xi+V i_\eta+2\kappa\,i\,(u_\xi+v_\eta)=0,\\
U u_\xi+V u_\eta+i_\xi=0,\\
U v_\xi+V v_\eta+i_\eta=0
\end{cases} \tag{2.1}
\]
with the irrotationality condition
\[
u_\eta=v_\xi. \tag{2.2}
\]
System (2.1), (2.2) can also be reduced to the system
\[
\begin{cases}
(c^2-U^2)u_\xi-UV(u_\eta+v_\xi)+(c^2-V^2)v_\eta=0,\\
u_\eta-v_\xi=0,
\end{cases} \tag{2.3}
\]
supplemented by Bernoulli's law
\[
i+\frac12(U^2+V^2)=-\varphi,\qquad \varphi_\xi=U,\qquad \varphi_\eta=V. \tag{2.4}
\]
The (pseudo-)characteristics are
\[
\frac{d\eta}{d\xi}=\Lambda_\pm:\ \equiv\frac{UV\pm c\sqrt{U^2+V^2-c^2}}{U^2-c^2}. \tag{2.5}
\]
Then system (2.3) can be written in characteristic form:
\[
\partial^\pm u+\Lambda_\mp\partial^\pm v=0. \tag{2.6}
\]
2.2. Characteristic decomposition. In [16], it is shown that system (2.3) and (2.4) has a characteristic decomposition, in analogy with that for the classical wave operator. It is very useful in the discussion of simple waves and their interactions.

Proposition 2.1 (Commutator relation). For any quantity $I(\xi,\eta)$, there holds
\[
\partial^-\partial^+I-\partial^+\partial^-I=\frac{\partial^-\Lambda_+-\partial^+\Lambda_-}{\Lambda_--\Lambda_+}\,(\partial^-I-\partial^+I). \tag{2.7}
\]

Proposition 2.2 (Characteristic decomposition). For (2.3) and (2.4), there hold
\[
\partial^+\partial^-I=m_1\partial^-I,\qquad \partial^-\partial^+J=m_2\partial^+J, \tag{2.8}
\]
where $m_1$ and $m_2$ can be expressed in the form $m_1=m_1(u,v)(\partial_\xi u+\zeta_1(u,v)\partial_\eta u)$, $m_2=m_2(u,v)(\partial_\xi u+\zeta_2(u,v)\partial_\eta u)$; and $I=u$, $v$, $c$ or $\Lambda_-$, $J=u$, $v$, $c$ or $\Lambda_+$.

2.3. System in inclination angles of characteristics. The inclination angles $(\alpha,\beta)$ of characteristics (see notation (1.3)) play an important role in our study. First we have from [17,32]
\[
u-\xi=-c\,\frac{\cos\tau}{\sin\omega},\qquad v-\eta=-c\,\frac{\sin\tau}{\sin\omega}. \tag{2.9}
\]
System (2.3) can be written as the closed system of equations in terms of $\alpha$, $\beta$, and $c$:
\[
\begin{cases}
c\,\bar\partial^-\beta=\Omega\cos^2\omega\,(2\sin^2\omega+c\,\bar\partial^-\alpha),\\[2pt]
c\,\bar\partial^+\alpha=\Omega\cos^2\omega\,[-2\sin^2\omega+c\,\bar\partial^+\beta],\\[2pt]
\bar\partial^0c=\dfrac{\gamma-1}{2(\gamma+1)\sin\omega}\,[4\sin^2\omega+c\,\bar\partial^-\alpha-c\,\bar\partial^+\beta],
\end{cases} \tag{2.10}
\]
where $\Omega=m-\tan^2\omega$ as in (1.5). Form (2.10) is given in Chen and Zheng [6]. Note that the sign for $\bar\partial^0$ here is the opposite of $\bar\partial_0$ of [6]. Here we offer a more direct derivation.
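Proof of (2.10). We start with the differential relations among the new variables:
\[
\begin{aligned}
du-d\xi&=-\frac{\cos\tau}{\sin\omega}\,dc+\frac{c\cos\beta\,d\alpha-c\cos\alpha\,d\beta}{2\sin^2\omega},\\
dv-d\eta&=-\frac{\sin\tau}{\sin\omega}\,dc+\frac{c\sin\beta\,d\alpha-c\sin\alpha\,d\beta}{2\sin^2\omega}.
\end{aligned} \tag{2.11}
\]
The differential form of the Bernoulli law (2.4) can be written as
\[
dc=\frac{\kappa}{\sin\omega}\,(\cos\tau\,du+\sin\tau\,dv). \tag{2.12}
\]
Using (2.6) and (2.11), we have
\[
\cot\omega\,\bar\partial^-c=\cos(2\omega)+\frac{c}{2\sin^2\omega}\,[\cos(2\omega)\,\bar\partial^-\alpha-\bar\partial^-\beta]. \tag{2.13}
\]
The Bernoulli law (2.12) gives
\[
\bar\partial^-c=\frac{\kappa\sin(2\omega)}{2(\kappa+\sin^2\omega)}+\frac{c\,\kappa\cot\omega}{2(\kappa+\sin^2\omega)}\,(\bar\partial^-\alpha-\bar\partial^-\beta). \tag{2.14}
\]
The above two equations together give
\[
c\,\bar\partial^-\beta=\Omega\cos^2\omega\,(2\sin^2\omega+c\,\bar\partial^-\alpha),\qquad \Omega=m-\tan^2\omega. \tag{2.15}
\]
Similarly, we have
\[
c\,\bar\partial^+\alpha=\Omega\cos^2\omega\,[-2\sin^2\omega+c\,\bar\partial^+\beta]. \tag{2.16}
\]
On the other hand, we can obtain the equation for $c$
\[
\begin{aligned}
\bar\partial^-c&=\frac{\kappa}{1+\kappa}\cot\omega\,(2\sin^2\omega+c\,\bar\partial^-\alpha),\\
\bar\partial^+c&=-\frac{\kappa}{1+\kappa}\cot\omega\,(-2\sin^2\omega+c\,\bar\partial^+\beta).
\end{aligned} \tag{2.17}
\]
Note that $\cos\omega\,\bar\partial^0=\dfrac{\bar\partial^++\bar\partial^-}{2}$. Then we sum up:
\[
\bar\partial^0c=\frac{\kappa}{2(1+\kappa)\sin\omega}\,[4\sin^2\omega+c\,\bar\partial^-\alpha-c\,\bar\partial^+\beta]. \tag{2.18}
\]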
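Thus we obtain the closed system of equations (2.10).

Remark 2.1. In fact, (2.10) can be reduced to a diagonal form,
\[
\begin{cases}
\bar\partial^+(-\beta+\psi(\omega))=\dfrac{\sin^2\omega\,[\cos(2\omega)-\kappa]}{c\,(\kappa+\sin^2\omega)},\\[8pt]
\bar\partial^-(\alpha+\psi(\omega))=\dfrac{\sin^2\omega\,[\cos(2\omega)-\kappa]}{c\,(\kappa+\sin^2\omega)},\\[8pt]
\bar\partial^0[c^2(1+\kappa M^2)]=2c\kappa M,
\end{cases} \tag{2.19}
\]
where
\[
\psi(\omega):=\sqrt{\frac{\gamma+1}{\gamma-1}}\,\arctan\Big(\sqrt{\frac{\gamma-1}{\gamma+1}}\,\cot\omega\Big), \tag{2.20}
\]
and the pseudo-Mach number $M$ is related to $\omega$ as
\[
\frac{1}{\sin\omega}=M:=\sqrt{U^2+V^2}/c. \tag{2.21}
\]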
The Riemann variables $\psi-\beta$ and $\psi+\alpha$ correspond to the classical Riemann invariants for homogeneous systems. However, it is convenient for us to use (2.10) in this paper. Other useful formulas are given below.

Proposition 2.3. First-order derivatives have the formulas
\[
\begin{cases}
\bar\partial^-u=\dfrac{\sin\alpha}{\kappa}\,\bar\partial^-c, & \bar\partial^+u=-\dfrac{\sin\beta}{\kappa}\,\bar\partial^+c,\\[6pt]
c\,\bar\partial^-\beta=\nu\,\Omega\sin(2\omega)\,\bar\partial^-c, & c\,\bar\partial^+\alpha=-\nu\,\Omega\sin(2\omega)\,\bar\partial^+c,\\[6pt]
c\,\bar\partial^-\alpha=2\nu\tan\omega\,\bar\partial^-c-2\sin^2\omega, & c\,\bar\partial^+\beta=-2\nu\tan\omega\,\bar\partial^+c+2\sin^2\omega,\\[6pt]
c\,\bar\partial^\pm\omega=\dfrac{\tan\omega}{\kappa}\,(\sin^2\omega+\kappa)\,\bar\partial^\pm c-\sin^2\omega,
\end{cases} \tag{2.22}
\]
where $\Omega=m-\tan^2\omega$ as in (1.5).

2.4. System in the hodograph plane. Pogodin, Suchkov and Ianenko [21] proposed the hodograph transformation
\[
T:(\xi,\eta)\to(u,v) \tag{2.23}
\]
for (2.1), reversing the roles of $(\xi,\eta)$ and $(u,v)$ and regarding $i$ as a function of $(u,v)$. Then $i$ as the function of $u$ and $v$ satisfies
\[
\xi-u=i_u,\qquad \eta-v=i_v, \tag{2.24}
\]
provided that the transformation (2.23) is non-degenerate. System (2.3) becomes a linearly degenerate system
\[
\begin{cases}
(2\kappa\,i(u,v)-i_u^2)\,\eta_v+i_ui_v(\xi_v+\eta_u)+(2\kappa\,i-i_v^2)\,\xi_u=0,\\
\xi_v-\eta_u=0
\end{cases} \tag{2.25}
\]
for the unknowns $(\xi,\eta)$. And $i$ satisfies
\[
(2\kappa\,i-i_u^2)\,i_{vv}+2i_ui_vi_{uv}+(2\kappa\,i-i_v^2)\,i_{uu}=i_u^2+i_v^2-4\kappa\,i. \tag{2.26}
\]
The linear degeneracy of (2.25) becomes more transparent when it is expressed in terms of $\alpha$, $\beta$ and $c$. In paper [17], we convert (2.26) to
\[
\begin{cases}
\bar\partial_+\alpha=\dfrac{1+\gamma}{4c}\cdot\sin(\alpha-\beta)\cdot\big(m-\tan^2\omega\big)=:G(\alpha,\beta,c),\\[6pt]
\bar\partial_-\beta=G(\alpha,\beta,c),\\[2pt]
\partial_0c=\kappa\cos\dfrac{\alpha+\beta}{2}\Big/\sin\omega,
\end{cases} \tag{2.27}
\]
with
\[
\bar\partial_+c=-\kappa,\qquad \bar\partial_-c=\kappa. \tag{2.28}
\]
The definitions of $\bar\partial_\pm$ and $\partial_0$ (see (1.4)) imply that system (2.27) is linearly degenerate. In our construction of solutions, we need $C^0$, $C^1$ and $C^{1,1}$ estimates. The main difficulty lies in the non-homogeneity of (2.27). Thus we shall need the second-order derivatives, given in [17], which are obtained by direct calculation.
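Proposition 2.4. Assume that the solution of (2.27) $(\alpha,\beta)\in C^2$. Then we have
\[
\bar\partial_+\bar\partial_-\alpha+W\bar\partial_-\alpha=Q(\omega,c),\qquad -\bar\partial_-\bar\partial_+\beta+W\bar\partial_+\beta=Q(\omega,c), \tag{2.29}
\]
where $W(\omega,c)$ and $Q(\omega,c)$ are
\[
\begin{aligned}
W(\omega,c)&:=\frac{1+\gamma}{4c}\Big[\big(m-\tan^2\omega\big)\big(3\tan^2\omega-1\big)\cos^2\omega+2\tan^2\omega\Big],\\
Q(\omega,c)&:=\frac{(1+\gamma)^2}{16c^2}\,\sin(2\omega)\,\big(m-\tan^2\omega\big)\big(3\tan^2\omega-1\big).
\end{aligned} \tag{2.30}
\]

Proposition 2.5. Assume that the solution of (2.27) $(\alpha,\beta)\in C^2$. Then we have
\[
\begin{aligned}
\bar\partial_+\bar\partial_-(\alpha+\beta)+W\bar\partial_-(\alpha+\beta)&=a(\omega,c)\,\bar\partial_+(\alpha+\beta),\\
-\bar\partial_-\bar\partial_+(\alpha+\beta)+W\bar\partial_+(\alpha+\beta)&=a(\omega,c)\,\bar\partial_-(\alpha+\beta),
\end{aligned} \tag{2.31}
\]
where
\[
a(\omega,c):=\frac{\gamma+1}{4c}\cos^2\omega\,(\tan^2\omega+\alpha_2)(\tan^2\omega-\alpha_1), \tag{2.32}
\]
where
\[
\alpha_2:=\frac12\Big[3+m+\sqrt{(3+m)^2+4m}\,\Big],\qquad \alpha_1:=\frac{2m}{3+m+\sqrt{(3+m)^2+4m}}. \tag{2.33}
\]

Proposition 2.6. Assume that the solution of (2.27) $(\alpha,\beta)\in C^2$. Then we have
\[
\begin{cases}
(\bar\partial_++W)(Z-\bar\partial_-\alpha)=\dfrac{\gamma+1}{4c}\,(\tan^2\omega+1)\,(Z-\bar\partial_+\beta),\\[8pt]
(-\bar\partial_-+W)(Z-\bar\partial_+\beta)=\dfrac{\gamma+1}{4c}\,(\tan^2\omega+1)\,(Z-\bar\partial_-\alpha),
\end{cases} \tag{2.34}
\]
where
\[
Z:=\frac{\gamma+1}{2c}\tan\omega. \tag{2.35}
\]
To invert the solution on the hodograph plane to the $(\xi,\eta)$ plane, we notice that (2.24) defines a mapping from $(u,v)$ to $(\xi,\eta)$ as $\xi=u+i_u$, $\eta=v+i_v$. The Jacobian has the formula
\[
j(\xi,\eta;u,v)=\frac{\partial(\xi,\eta)}{\partial(u,v)}=\frac{c^2}{4\sin^4\omega}\,(\bar\partial_-\alpha-Z)(\bar\partial_+\beta-Z). \tag{2.36}
\]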
Fig. 3.1. Illustration of characteristics and planar rarefaction waves (panels (a) and (b))
3. Bi-symmetric Four Rarefaction Waves

The initial data for the 2-D Riemann problem are constant along each ray from the origin,
\[
(u,v,\rho)(t,x,y)|_{t=0}=(u_0,v_0,\rho_0)(\theta),\qquad \theta=\arctan(y/x). \tag{3.1}
\]
For theoretical and application reasons, $(u_0,v_0,\rho_0)(\theta)$ is usually piecewise constant. The four-constant Riemann problem is a prototype example and has special initial data that takes on constant values in each of the four initial quadrants; i.e.,
\[
(u_0,v_0,\rho_0)(\theta)=(u_i,v_i,\rho_i),\qquad (i-1)\pi/2<\theta<i\pi/2, \tag{3.2}
\]
($i=1,2,3,4$). We use I, II, III and IV to designate the corresponding state $(u_i,v_i,\rho_i)$. The four-wave Riemann problem is restricted further so that each adjacent pair of data in the four-constant Riemann problem is connectible by a single planar wave. See [28]. The characteristics defined by (2.5) in the region of a constant state provide a basic reference point for understanding non-constant states. The characteristics are straight lines and the set $\{(\xi,\eta);\ (\xi-\bar u)^2+(\eta-\bar v)^2=\bar c^2\}$ is a sonic circle of the state $(\bar u,\bar v,\bar\rho)$. The plus characteristic lines, denoted by $\Lambda_+$ in Fig. 3.1(a), are tangent to the sonic circle, and go in the counterclockwise direction if regarded as starting at the tangent points in reference to the sonic circle, while $\Lambda_-$ go clockwise. The Euler system (1.1) has two classes of planar rarefaction waves connecting a given pair of states $(u_f,v_f,\rho_f)$ and $(u_b,v_b,\rho_b)$. We denote the first class by $R^+_{fb}$, whose tell-tale feature is that the family of straight-line characteristics go counterclockwise in reference to the sonic circle of the state $(u_b,v_b,\rho_b)$. We denote the second class by $R^-_{fb}$, whose tell-tale feature is that the family of straight-line characteristics go clockwise in reference to the sonic circle of the state $(u_b,v_b,\rho_b)$. The two classes have examples represented by
\[
R^\pm_{fb}:\quad
\begin{cases}
\xi=u+c,\qquad \dfrac{du}{d\rho}=\dfrac{c}{\rho},\qquad v=v_f=v_b,\qquad \rho_b<\rho_f,\\[4pt]
\eta>v_b\ \text{ or }\ \eta<v_b.
\end{cases} \tag{3.3}
\]
We designate that the front is the state with higher pressure, or equivalently, higher density, as shown in Fig. 3.1(b). We require our initial data (3.2) to have $R^+_{12}$ connecting states I and II, $R^-_{14}$ connecting states I and IV, $R^-_{32}$ connecting states III and II, and $R^+_{34}$ connecting states III and IV. These requirements place strong restrictions on the four states and a number of compatibility conditions result. In the end, see [15], however, we only need
\[
\rho_1=\rho_3,\qquad \rho_2=\rho_4,\qquad u_1-u_2=v_1-v_4\qquad(\rho_2<\rho_1). \tag{3.4}
\]
And the set-up is symmetric with respect to
\[
\xi-\eta=u_1-v_1\qquad\text{and}\qquad \xi+\eta=u_2+v_2.
\]
So our data enjoys two axes of symmetry and so we call it bi-symmetric data. It is denoted traditionally Configuration B $R^+_{12}R^-_{14}R^-_{32}R^+_{34}$, see [12,15,22,23,28,30]. In a recent paper [17] we handled the interaction of an $R^+$ with an $R^-$ at any angle between 0 and $\pi$. It is a fundamental case of wave interactions. We use it in this paper to consider the bi-symmetric configuration, where the interaction of $R^+_{12}R^-_{14}$ is to interact with the interaction of $R^-_{32}R^+_{34}$, which is really the second level interaction of primary binary interactions, cf. Dinu [9].

Normalization. We use scaling and translations to get rid of unnecessary freedom in the data to prepare for our construction. We note that $\xi$ and $u$ can be shifted by an equal amount without changing the solution. The same is true for $\eta$ and $v$. So we shall assume that
\[
u_1-v_1=0,\qquad u_2+v_2=0. \tag{3.5}
\]
In addition, the transformation $(u,v,c,\xi,\eta)\to\bar c\,(u,v,c,\xi,\eta)$ (where $\bar c$ is any positive constant) does not change system (2.1), so we shall assume that
\[
c_1=1. \tag{3.6}
\]
Thus we have only two free parameters: $c_2\in(0,1)$ and $\gamma>1$. And there hold
\[
u_1=\frac{c_1-c_2}{\gamma-1}>0,\qquad v_1=v_2=u_1,\qquad u_2=-u_1. \tag{3.7}
\]
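The normalization can be sketched in a few lines. This is an illustrative check only: it assumes $p=\rho^\gamma$, so that the rarefaction relation $du/d\rho=c/\rho$ from (3.3) integrates to $u=u_f+2(c-c_f)/(\gamma-1)$, and the function names are hypothetical.

```python
def normalized_velocities(c2, gamma, c1=1.0):
    """Velocities of states I and II under the normalization (3.5)-(3.7)."""
    u1 = (c1 - c2) / (gamma - 1.0)
    return {"u1": u1, "v1": u1, "u2": -u1, "v2": u1}


def velocity_across_rarefaction(u_front, c_front, c, gamma):
    """u along R^+ for p = rho^gamma: integrating du/drho = c/rho gives a linear law in c."""
    return u_front + 2.0 * (c - c_front) / (gamma - 1.0)
```

Evaluating the second function at $c=c_2$ with front state I returns $-u_1$, which is the value of $u_2$ required by the symmetry condition $u_2+v_2=0$ in (3.5).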
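The rarefaction wave $R^+_{12}$ has the explicit expression
\[
\begin{cases}
\xi=u_1+\displaystyle\int_{\rho_1}^{\rho}\rho^{-1}\sqrt{p'(\rho)}\,d\rho+\sqrt{p'(\rho)},\qquad \rho_2<\rho<\rho_1,\\[8pt]
v=v_1,\qquad u=u_1+\displaystyle\int_{\rho_1}^{\rho}\rho^{-1}\sqrt{p'(\rho)}\,d\rho,\\[8pt]
\eta>v_1.
\end{cases} \tag{3.8}
\]
The characteristics of the plus family are straight lines; the characteristics of the minus family are given by
\[
\eta=v_1+\rho^{\frac{\gamma+1}{4}}\sqrt{\bar c+\frac{\gamma(\gamma+1)}{3-\gamma}\,\rho^{\frac{\gamma-3}{2}}}, \tag{3.9}
\]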
Fig. 4.1. Interaction of planar rarefaction waves: In (a) we show the case of gas expansion into a vacuum; In (b) we show the critical case that the vacuum interface is a single point
where $\bar c$ is a constant, for $\gamma\ne3$. For $\gamma=3$, the characteristics are
\[
\eta=v_1+\rho\sqrt{\bar c-6\ln\rho}. \tag{3.10}
\]
For the special minus characteristic curve that starts horizontally (i.e., the curve $ab$ in Fig. 5.1) we have
\[
\bar c=-\frac{2\gamma(\gamma-1)}{3-\gamma}\,\rho_1^{\frac{\gamma-3}{2}} \tag{3.11}
\]
for $\gamma\ne3$, and
\[
\bar c=3+6\ln\rho_1 \tag{3.12}
\]
for $\gamma=3$.

4. Binary Interaction of Planar Rarefaction Waves

Before constructing the global solution for the four bi-symmetric rarefaction waves we proposed in the last section, we recall the binary interaction of planar rarefaction waves from [17]. The most typical case is the interaction of full rarefaction waves $R^+$ and $R^-$ that connect the vacuum to a constant state, as shown in Fig. 4.1(a). These two waves penetrate each other completely and fully expand into the vacuum.

Lemma 4.1 (Gas expansion [17]). There exists a solution $(u,v,\rho)\in C^1$ of (1.1) for the problem of gas expansion into a vacuum in the wave interaction region in the self-similar $(\xi,\eta)$-plane for all $\gamma\ge1$ and all wedge half-angle $\theta\in(0,\pi/2]$. For $\theta>\theta_s:=\arctan\Big(\mathrm{Re}\sqrt{\tfrac{3-\gamma}{\gamma+1}}\Big)$, the vacuum boundary is representable as a single-valued concave function $\xi=B(\eta)$, the minus family of characteristics are concave, the plus family of characteristics are convex, and the difference of their inclination angles at the boundary is $2\theta_s(\gamma)$.
Fig. 4.2. Interaction of two symmetric rarefaction waves: In (a) the data satisfies $0\le\rho_1=\rho_2<\rho_*<\rho_0$; In (b) the data satisfies $\rho_*\le\rho_1=\rho_2<\rho_0$

As a corollary, we can study the interaction of two planar rarefaction waves $R^+_{01}$ and $R^-_{02}$ in Fig. 4.1(b), $R^+_{01}$ connecting states $(u_0,v_0,\rho_0)$ and $(u_1,v_1,\rho_1)$, $R^-_{02}$ connecting states $(u_0,v_0,\rho_0)$ and $(u_2,v_2,\rho_2)$, for two appropriate states $(u_1,v_1,\rho_1)$ and $(u_2,v_2,\rho_2)$. These two waves penetrate each other. Here we state a symmetric case: $u_1=u_2$, $v_1=-v_2$ and $\rho_1=\rho_2$, and fix the state $(u_0,v_0,\rho_0)$. Then it is evident that there exists a state $(u_*,v_*,\rho_*)$ such that if $(u_1,v_1,\rho_1)=(u_*,v_*,\rho_*)$ and $(u_2,v_2,\rho_2)=(u_*,-v_*,\rho_*)$, the vacuum interface just shrinks into a single point. That is, the wave-tail characteristics from points $b$ and $b'$ meet at a point $d$ at which the density is zero. This case is referred to as the critical case. Once the states $(u_1,v_1,\rho_1)$ and $(u_2,v_2,\rho_2)$ are such that $\rho_1=\rho_2<\rho_*$, then the vacuum interface is no longer a single point. We refer to this case as the "large" rarefaction waves. We summarize the interaction of planar rarefaction waves in the following corollary.

Corollary 4.1. For the interaction of two symmetric planar rarefaction waves $R^+_{01}$ and $R^-_{02}$, there are three cases of solutions. For the large data case, they expand into vacuum and an interface separates the vacuum from the interaction region; For the small data case, they penetrate each other without the presence of vacuum. The third case is the middle case when the data yields a single point of vacuum. See Fig. 4.2.
We remark that we do not have any quantitative estimate on the location of the vacuum boundary $\xi = B(\eta)$, e.g., the value of $B(0)$.

5. Hyperbolicity and Non-overlapping of Domains of Determinacy

We start the construction of solutions. See Fig. 5.1. The interaction of $R_{12}^{+}R_{14}^{-}$ follows from Lemma 4.1 and Corollary 4.1. So does the interaction of $R_{32}^{-}R_{34}^{+}$. Under our normalization, the interaction point $a$ has the coordinate
$$a = (\xi_1, \eta_1) = \Bigl(\frac{1-c_2}{\gamma-1} + 1,\ \frac{1-c_2}{\gamma-1} + 1\Bigr),$$
which follows from (3.6), (3.7) and the solution formula $\xi = u + c$ for $R_{12}^{+}$. Similarly, point $b$ has the horizontal coordinate
$$\xi_b = u_2 + c_2 = -\frac{1-c_2}{\gamma-1} + c_2 = -\frac{1}{\gamma-1} + \frac{\gamma c_2}{\gamma-1}.$$

Fig. 5.1. Interaction of four bi-symmetric rarefaction waves
Depending on the magnitude $c_2$ of state II, we may or may not have a portion of the sonic circle of state II in the solution. We require that state II has no sonic point. In other words, the characteristic lines $bh$ and $b'h$ intersect before contacting the sonic circle of state II. Thus we need the exiting slope of the minus characteristic curve (line $bh$), that starts horizontally from the first quadrant, to be greater than one, so that it will intersect its counterpart (line $b'h$) from the interaction of $R_{32}^{-}R_{34}^{+}$ before hitting the sonic circle of state II.

Lemma 5.1. For state II to be hyperbolic at point $h$ it is necessary and sufficient to have $\gamma > 1 + \sqrt{2}$ with
$$\rho_2/\rho_1 < \left[\frac{(2-\sqrt{2})(\gamma-1)}{2(\gamma-1-\sqrt{2})}\right]^{\frac{2}{\gamma-3}} \tag{5.1}$$
for $\gamma \ne 3$, and
$$\rho_2/\rho_1 < \exp(-\sqrt{2}-1) \tag{5.2}$$
for $\gamma = 3$.
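The bounds of Lemma 5.1 can be evaluated numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and `exit_slope` assumes that the slope $d\eta/d\xi$ along the characteristic $ab$ is given by (5.4) for $\gamma = 3$ and by the ratio of the two sides of (5.7) for $\gamma \ne 3$, with $x = (\rho/\rho_1)^{(\gamma-3)/2}$. Under that assumption the slope should equal one exactly at the bound.

```python
import math

SQRT2 = math.sqrt(2.0)


def density_ratio_bound(gamma):
    """Upper bound on rho2/rho1 from (5.1) (gamma != 3) or (5.2) (gamma == 3)."""
    if gamma <= 1.0 + SQRT2:
        raise ValueError("Lemma 5.1 requires gamma > 1 + sqrt(2)")
    if gamma == 3.0:
        return math.exp(-SQRT2 - 1.0)
    base = (2.0 - SQRT2) * (gamma - 1.0) / (2.0 * (gamma - 1.0 - SQRT2))
    return base ** (2.0 / (gamma - 3.0))


def exit_slope(gamma, ratio):
    """Assumed slope d(eta)/d(xi) on curve ab at density ratio rho/rho1 < 1.

    gamma == 3 uses (5.4); otherwise the slope is taken as the left side
    of (5.7) divided by its right side (an assumption of this sketch).
    """
    if gamma == 3.0:
        log_term = math.log(1.0 / ratio)
        return log_term / math.sqrt(1.0 + 2.0 * log_term)
    x = ratio ** ((gamma - 3.0) / 2.0)
    lhs = ((gamma - 1.0) / math.sqrt(gamma + 1.0)
           * (x - 1.0) / (3.0 - gamma) / math.sqrt(x))
    rhs = math.sqrt(1.0 / (gamma + 1.0) + (x - 1.0) / (3.0 - gamma))
    return lhs / rhs


if __name__ == "__main__":
    for g in (2.8, 3.0, 4.0):
        bound = density_ratio_bound(g)
        print(g, bound, exit_slope(g, bound))
```

In this sketch, densities below the bound give a slope above one and densities above it give a slope below one, which is the direction of the inequalities (5.1) and (5.2).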
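Proof. For $\gamma = 3$, we have $c = \sqrt{p'} = \sqrt{3}\,\rho$, $u = c + u_1 - c_1$, $\xi = u + c = u_1 - c_1 + 2\sqrt{3}\,\rho$ in $R_{12}^{+}$. The characteristic curve $ab$ is given by (3.10) with data (3.12) or
$$\eta = v_1 + \rho\sqrt{3 + 6\ln(\rho_1/\rho)}. \tag{5.3}$$
We then compute
$$\frac{d\eta}{d\xi} = \frac{d\eta/d\rho}{d\xi/d\rho} = \frac{\ln(\rho_1/\rho)}{\sqrt{1 + 2\ln(\rho_1/\rho)}}. \tag{5.4}$$
Requiring the slope to be greater than or equal to one, we find
$$\rho_1/\rho \ge e^{\sqrt{2}+1}. \tag{5.5}$$
Next for $\gamma \ne 3$, we use $\xi$ from (3.8) and $\eta$ from (3.9) with data (3.11) to compute
$$\frac{d\xi}{d\rho} = \frac{\gamma+1}{2}\sqrt{\gamma}\,\rho^{\frac{\gamma-3}{2}}; \qquad \frac{d\eta}{d\rho} = \frac{\gamma(\gamma+1)(\gamma-1)}{2(3-\gamma)(\eta - v_1)}\,\rho^{\frac{\gamma-1}{2}}\bigl(\rho^{\frac{\gamma-3}{2}} - \rho_1^{\frac{\gamma-3}{2}}\bigr). \tag{5.6}$$
We require that the slope be greater than or equal to one, i.e., $d\eta/d\xi \ge 1$, which simplifies to
$$\frac{\gamma-1}{\sqrt{\gamma+1}}\,\frac{1}{\sqrt{x}}\,\frac{x-1}{3-\gamma} \ge \sqrt{\frac{1}{\gamma+1} + \frac{x-1}{3-\gamma}} \tag{5.7}$$
for $x := (\rho/\rho_1)^{\frac{\gamma-3}{2}}$. We then factorize (5.7),
$$(\gamma^2 - 2\gamma - 1)\left[x - \frac{(2+\sqrt{2})(\gamma-1)}{2(\gamma+\sqrt{2}-1)}\right]\left[x - \frac{(2-\sqrt{2})(\gamma-1)}{2(\gamma-\sqrt{2}-1)}\right] \ge 0, \tag{5.8}$$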
which then reduces to our conclusion. This completes the proof.

Non-overlapping of domains of determinacy. In addition to a hyperbolic point $h$, we shall choose $c_2$ low enough so that the point $d$ is a vacuum. We do not have an explicit value of $c_2$ to tell when this happens; its value is most probably a numerical one, depending on $\gamma$ only, under the current normalization (3.5), (3.6). The reason that we require $d$ to be a vacuum is to minimize further interaction. Further, we need point $d$ to be above the line $\xi + \eta = 0$; i.e., $\xi_d > -\eta_d$, which may require $R_{12}^{+}$ to be possibly larger than before. We need it because we want to avoid overlapping of the domains of determinacy of the interactions $R_{12}^{+}R_{14}^{-}$ and $R_{32}^{-}R_{34}^{+}$. We explain that this is achievable. In fact, it is sufficient to require that
$$\beta > \pi/4 \tag{5.9}$$
along the plus characteristic curve db in Fig. 5.1. We use Fig. 5.2 for better illustration, in which the curve ab has a tangent line at point b with an inclination angle greater than π/4. Note that the lines to the left of curve c2 bd are not present in the four-rarefaction wave interaction because they are shadowed by the state c2 > 0, except for the segment bh. Since the curve abb0 is concave, we see that the value of β at b0 is also greater than π/4.
Fig. 5.2. Interaction of two rarefaction waves in rotated coordinate
Using the fact that the solution to the binary interaction is continuous, we see that $\beta$ on the curve $bd$ will be greater than $\pi/4$ once the point $b$ is sufficiently close to point $b_0$. We call such a critical value of $c_2$ (so that $\beta \ge \pi/4$ along curve $bd$) $c_2^0$. That is,
$$c_2^0 = \sup\{c_2 \in (0, 1) \mid \beta > \pi/4 \text{ on plus characteristic curve } bd\}. \tag{5.10}$$
We explain now that condition (5.9) implies point $d$ is above the line $\xi + \eta = 0$. Let us rotate the coordinate system of Fig. 5.2 counter-clockwise by $\pi/4$, so that we regard the line $\xi - \eta = 0$ as the new $\xi$-axis, called the $\tilde\xi$-axis. In rotated Fig. 5.2, we note that the velocity component along the $\tilde\xi$-axis is
$$\tilde u = (u + v)/\sqrt{2}.$$
In particular we have $\tilde u = 0$ at point $b$ due to our normalization $u_2 + v_2 = 0$. We observe that
$$\bar\partial^{+}\tilde u = -\frac{\sin\tilde\beta}{\kappa}\,\bar\partial^{+} c$$
holds along $db$, which we can integrate to find that $\tilde u > 0$ at point $d$, since $\tilde\beta > 0$ along $db$ and $c$ is increasing from $d$ to $b$. We observe further that $\tilde\xi \ge \tilde u$ at the vacuum from
$$\tilde\xi - \tilde u = c\,\frac{\cos\tilde\tau}{\sin\omega} \ge 0,$$
thus $\xi + \eta = \sqrt{2}\,\tilde\xi > 0$ at point $d$. In sum,

Proposition 5.1. Suppose $c_2 \in (0, c_2^0)$. Then we have $\beta > \pi/4$ along curve $bd$, point $h$ is hyperbolic and point $d$ is above the line $\xi + \eta = 0$.
Fig. 6.1. A patch of simple wave
6. Simple Waves and Their Interaction

6.1. A complete patch of simple wave. In [16] we showed that adjacent to a constant state is a simple wave, by using the characteristic decomposition (2.2). Thus the region adjacent to state II and covered by the curvilinear boundaries $bhd$ in Fig. 5.1 is a simple wave. We show that its vacuum boundary is the single point $d$, and the boundary $hd$ is a characteristic curve of the plus family. See Fig. 6.1.

Lemma 6.1 (Simple wave). Let $bd$ be a characteristic curve of the plus family, along which the density $\rho$ decreases from point $b$ to zero at point $d$. Let $bk$ be a straight characteristic curve, where point $k$ is sonic. Then a simple wave exists, forming a curvilinear triangle $bkd$, for which the boundary $kd$ (the dotted curve in Fig. 6.1) is sonic, and each of the characteristics of the plus family extends from point $d$ to a point on $bk$ or $kd$.

Remark. We do not know if $\beta$ is monotone along the curve $bd$.

Proof. It follows from [16] that the patch is a simple wave, in which the characteristics of the minus family are straight lines, along which the density is constant. Thus point $k$ is a sonic point where $U^2 + V^2 - c^2 = 0$. Every minus characteristic ends at a sonic point instead of vacuum. The length of the minus characteristics inside the patch shrinks to zero since the density shrinks to zero. The minus characteristics do not form shocks inside the patch since it can be shown, following the idea and proofs of [1,18,26], that $\bar\partial^{+} c$ is always finite. In fact, we claim that
$$-\bar\partial^{-}(\bar\partial^{+} c) = \frac{1}{2c}\left[2\sin(2\omega) - \frac{1+\kappa}{\kappa\cos^2\omega}\,\bar\partial^{+} c\right]\bar\partial^{+} c. \tag{6.1}$$
We derive (6.1) as follows. We use $I = c$ in the commutator relation (2.7) and $\partial^{-} c = 0$ to obtain
$$\partial^{-}\partial^{+} c = \frac{\partial^{-}\tan\alpha - \partial^{+}\tan\beta}{\tan\beta - \tan\alpha}\,(-\partial^{+} c).$$
We use $\partial^{-}\beta = 0$ in (2.10) to obtain
$$2\sin^2\omega + c\,\bar\partial^{-}\alpha = 0. \tag{6.2}$$
So we obtain
$$\partial^{-}\tan\alpha = \frac{1}{\cos^2\alpha}\,\partial^{-}\alpha = -\frac{2\sin^2\omega}{c\cos\beta\cos^2\alpha}.$$
We use (2.17) to obtain
$$\partial^{+}\tan\beta = \frac{1}{\cos^2\beta\cos\alpha}\,\bar\partial^{+}\beta = \frac{1}{c\cos^2\beta\cos\alpha}\left[2\sin^2\omega - \frac{1+\kappa}{\kappa}\tan\omega\,\bar\partial^{+} c\right].$$
In addition, we have
$$\bar\partial^{-}\bar\partial^{+} c = \cos\beta\,[\partial^{-}(\cos\alpha\,\partial^{+} c)] = \cos\alpha\cos\beta\,\partial^{-}\partial^{+} c - \tan\alpha\,\bar\partial^{-}\alpha\,\bar\partial^{+} c.$$
Using (6.2) again, we obtain
$$\bar\partial^{-}\bar\partial^{+} c = \cos\alpha\cos\beta\,\partial^{-}\partial^{+} c + 2c^{-1}\sin^2\omega\tan\alpha\,\bar\partial^{+} c.$$
Combining the above and using $\cos\alpha + \cos\beta = 2\cos\tau\cos\omega$ and $\sin\omega\sin\alpha - \cos\tau = -\cos\alpha\cos\omega$, we obtain (6.1).

In (6.1), the direction $\bar\partial^{+}$ is going from $d$ to $b$, thus $\bar\partial^{+} c > 0$ on the curve $db$. The direction of $-\bar\partial^{-}$ is going from $b$ to $h$. Because the right-hand side has the factor $\bar\partial^{+} c$, it does not get to zero. And because the coefficient of the quadratic term is negative, it does not grow to positive infinity. Thus $\bar\partial^{+} c$ remains positive and finite in the whole patch of the simple wave. This proves the above claim.

We need to show that the plus characteristics do not start from an interior point of the boundary $kd$. From (6.2), we obtain that along a minus characteristic,
$$\sin^2\omega + c\,\bar\partial^{-}\omega = 0, \tag{6.3}$$
thus $\omega$ is monotone increasing in the direction parallel to that from $b$ to $k$. Following the monotonicity, we can conclude that the plus characteristics cannot start from an interior point of the boundary $kd$. Indeed, if one did, then $\omega$ would be zero at the starting point, but our $\omega$ on the "initial" line $bd$ is positive, thus contradicting the monotonicity along the minus characteristic connecting the starting point and the initial point on $bd$. This completes the proof of the lemma.

6.2. Interaction of simple waves. We consider the interaction of two simple waves. The two simple waves will be quite general, with quite general interaction angles, which will include the interaction of the waves $bhd$ with $b'he$, as well as the interaction of $R_{12}^{+}$ with $R_{14}^{-}$, see Fig. 5.1.

Let us survey what angles of interaction are involved in our bi-symmetric interaction. The interaction half-angle of $R_{12}^{+}$ with $R_{14}^{-}$ is $\pi/4$. For the interaction at point $h$, the maximum angle $\pi/2$ is achieved when $bh$ is parallel to $b'h$, while the minimum angle $\pi/4$ is achieved when the point $h$ becomes vacuum so that $bh$ becomes $bd$ and thus $bh$ is perpendicular to $b'h$. Therefore we shall need interaction (half-)angles in $(\pi/4, \pi/2)$.
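As a side check on the algebra behind (6.1) in the proof of Lemma 6.1, the intermediate formulas can be combined numerically and compared with the closed form. This is only a sketch under stated assumptions: the function names are hypothetical, the expressions for $\partial^{-}\tan\alpha$ and $\partial^{+}\tan\beta$ are taken as read from the derivation above (with $\omega = (\alpha-\beta)/2$), and the argument `dplus_c_bar` stands for a sample value of $\bar\partial^{+}c$.

```python
import math


def mixed_derivative_from_pieces(alpha, beta, c, kappa, dplus_c_bar):
    """Assemble bar-d^- bar-d^+ c from the intermediate steps of the proof."""
    omega = 0.5 * (alpha - beta)
    s2 = math.sin(omega) ** 2
    # d^- tan(alpha), obtained from (6.2)
    dminus_tan_alpha = -2.0 * s2 / (c * math.cos(beta) * math.cos(alpha) ** 2)
    # d^+ tan(beta), as assumed from (2.17)
    dplus_tan_beta = (
        2.0 * s2 - (1.0 + kappa) / kappa * math.tan(omega) * dplus_c_bar
    ) / (c * math.cos(beta) ** 2 * math.cos(alpha))
    # un-normalized derivative: d^+ c = (bar-d^+ c) / cos(alpha)
    dplus_c = dplus_c_bar / math.cos(alpha)
    # commutator relation with I = c and d^- c = 0
    dminus_dplus_c = (
        (dminus_tan_alpha - dplus_tan_beta)
        / (math.tan(beta) - math.tan(alpha))
        * (-dplus_c)
    )
    return (math.cos(alpha) * math.cos(beta) * dminus_dplus_c
            + 2.0 / c * s2 * math.tan(alpha) * dplus_c_bar)


def mixed_derivative_closed_form(alpha, beta, c, kappa, dplus_c_bar):
    """bar-d^- bar-d^+ c according to (6.1), moved to the other sign."""
    omega = 0.5 * (alpha - beta)
    bracket = (2.0 * math.sin(2.0 * omega)
               - (1.0 + kappa) / (kappa * math.cos(omega) ** 2) * dplus_c_bar)
    return -bracket * dplus_c_bar / (2.0 * c)


if __name__ == "__main__":
    sample = (0.9, 0.2, 1.3, 0.7, 0.4)
    print(mixed_derivative_from_pieces(*sample),
          mixed_derivative_closed_form(*sample))
```

The two functions agree to rounding error for generic angles, which supports the reading of the quadratic term in (6.1) as having a negative coefficient.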
Fig. 6.2. Interaction of two simple waves
We use the interaction of $bhd$ with $b'he$ as the primary problem, with $\gamma > 1 + \sqrt{2}$. Notice that the Suchkov angle $\theta_s(\gamma)$ is less than $\pi/4$ for $\gamma > 1$, so our interaction half-angles are always greater than the Suchkov angles, hence our interactions belong to the large angle case following the terminology of paper [17]. Thus we consider data as shown in Fig. 6.2, part (a). The data is symmetric with respect to the $\xi$-axis. Point $h$ is on the $\xi$-axis. The lower curvilinear triangle $bhd$ is a simple wave in which $bh$ is a straight characteristic curve of the minus family, while $d$ is vacuum. The density is monotone decreasing to zero from $h$ to $d$ along $hd$, and $hd$ is a convex characteristic curve of the plus family. The length of $hd$ is finite. We need to construct the interaction zone $dhe$, where the dotted curve $de$ is vacuum. Part (b) of Fig. 6.2 represents the hodograph domain of interaction, while part (c) of Fig. 6.2 represents the phase space $(\alpha, \beta)$, where the three lower branches of parts (a), (b), and (c) represent the same boundary.

Local existence of the solution at point $P$ in part (b) of Fig. 6.2 follows from the standard argument for Goursat problems, see [27,32] for example. We need uniform estimates on $(\alpha, \beta)$ and their derivatives to extend the local solution up to the vacuum boundary. Curve $DZ$ in part (c) of Fig. 6.2 represents the relation between $\alpha$ and $\beta$ on the boundary $dh$ of part (a) of Fig. 6.2. It is a horizontal straight segment if the simple wave $bhd$ is a planar wave. We note, once we require condition (5.10) that $c_2 < c_2^0$, that $\beta > -\pi/2$ along $DZ$, i.e., the slopes of the straight lines in curvilinear triangle $bhd$ of part (a) of Fig. 6.2 are negative but not $-\infty$. Similarly, we have $\alpha < \pi/2$ along the boundary $EZ$ in part (c) of Fig. 6.2. Thus we can use
$$\underline{\beta} = \min_{DZ}\beta, \qquad \overline{\alpha} = \max_{EZ}\alpha \tag{6.4}$$
to form the bottom and right sides of an invariant triangle for $(\alpha, \beta)$, while the third side is the line $\alpha - \beta = 2\theta_s$ for $\gamma \in (1 + \sqrt{2}, 3)$ or $\alpha - \beta = 0$ for $\gamma \ge 3$. The fact that the three straight lines are not penetrable has been established in paper [17]. Hence the $(\alpha, \beta)$ are bounded in this way:
$$\alpha \le \max_{EZ}\alpha, \qquad \beta \ge \min_{DZ}\beta, \qquad \alpha - \beta \ge 2\theta_s(\gamma), \tag{6.5}$$
where we use $\theta_s(\gamma) = 0$ for $\gamma \ge 3$.
Now that the invariant region (6.5) for $(\alpha, \beta)$ is available, the derivatives of $(\alpha, \beta)$ can be shown to be bounded in terms of $c > 0$, see [17,32]. We omit the details. Thus, a global solution exists where $c > 0$ in region $DPE$. Using Proposition 2.6, we can invert the mapping to yield a solution in the $\xi$-$\eta$ plane. By the invariance of the $(\alpha, \beta)$ of (6.5), we obtain that the characteristics in the $\xi$-$\eta$ plane are either convex or concave. We summarize this subsection in a theorem.

Theorem 6.1 (Simple wave interactions). The interaction of two simple waves with an interaction half-angle in $(\pi/4, \pi/2)$ and a density that vanishes along the interaction boundaries exists as a smooth solution, in which the plus family of characteristics is convex while the minus family is concave, provided that $c_2 \in (0, c_2^0)$.

7. Global Solution

We construct the global solution for the bi-symmetric four rarefaction wave interaction. The first interaction of $R_{12}^{+}$ with $R_{14}^{-}$ has been done in [17], and it also follows from the previous section, for any $c_2 \in [0, 1]$. We need the wave $R_{12}^{+}$ to be large, by choosing $c_2$ close to zero, so that the point $d$ of Fig. 5.1 is a vacuum and point $h$ is hyperbolic. Choosing $c_2 \in (0, c_2^0)$, we avoid point $d$ running into point $e$ and obtain the global existence of the interaction of the two simple waves $dhe$ at the same time. We summarize our results in a theorem.

Theorem 7.1 (Global existence). Suppose $\gamma > 1 + \sqrt{2}$. Then there exists $c_2^0(\gamma) \in (0, 1)$ so that for any $c_2 \in (0, c_2^0)$ the associated bi-symmetric four rarefaction wave interactions have continuous global solutions, whose centers are vacuum.

We then take all of $c_2^0$ under which there exists a global continuous solution regardless of the sign of $\beta$ on curve $bd$, and denote the supremum of such $c_2^0$ by $c_2^*$, to obtain the main theorem stated in the Introduction.

References
1. Bang, S.: Interaction of three and four rarefaction waves of the pressure-gradient system. J. Diff. Eqs. 246, 453–481 (2009)
2. Bressan, A.: Hyperbolic systems of conservation laws. In: The One Dimensional Cauchy Problem. Oxford: Oxford University Press, 2000
3. Chang, T., Chen, G.Q., Yang, S.L.: On the 2-D Riemann problem for the compressible Euler equations, I. Interaction of shock waves and rarefaction waves. Disc. Cont. Dyn. Syst. 1, 555–584 (1995)
4. Chang, T., Hsiao, L.: The Riemann Problem and Interaction of Waves in Gas Dynamics. Pitman Monographs and Surveys in Pure and Applied Mathematics, 41, Harlow: Longman Scientific & Technical, 1989
5. Chen, G.-Q., Feldman, M.: Global solutions of shock reflection by large-angle wedges for potential flow. Ann. Math (2), to appear, available at http://pjm.math.berkeley.edu/annals/ta/080510-Chen/080510Chen-v1.pdf
6. Chen, X., Zheng, Y.: The interaction of rarefaction waves of the two-dimensional Euler equations. Indiana Univ. Math. J. 58(2009), No. 6 (in press)
7. Courant, R., Friedrichs, K.O.: Supersonic Flow and Shock Waves, New York: Interscience Publishers, Inc., 1948
8. Dafermos, C.: Hyperbolic Conservation Laws in Continuum Physics. Grundlehren der mathematischen Wissenschaften, Berlin-Heidelberg-New York: Springer, 2000
9. Dinu, L.F.: Multidimensional Wave-Wave Regular Interactions and Genuine Nonlinearity: Some Remarks. Lecture presented in Loughborough University, UK, 2006-07
10. Glimm, G., Ji, X., Li, J., Li, X., Zhang, P., Zhang, T., Zheng, Y.: Transonic shock formation in a rarefaction Riemann problem for the 2-D compressible Euler equations. SIAM J. Appl. Math. 69, 720–742 (2008)
11. Kurganov, A., Tadmor, E.: Solution of two-dimensional Riemann problems for gas dynamics without Riemann problem solvers. Num. Meth. Part. Diff. Eqs. 18, 584–608 (2002)
12. Lax, P., Liu, X.: Solutions of two-dimensional Riemann problem of gas dynamics by positive schemes. SIAM J. Sci. Compt. 19, 319–340 (1998)
13. LeFloch, P.G.: Hyperbolic Systems of Conservation Laws, The Theory of Classical and Non-Classical Shock Waves. Basel: Birkhäuser Verlag, 2002
14. Li, J.: On the two-dimensional gas expansion for compressible Euler equations. SIAM J. Appl. Math. 62, 831–852 (2001)
15. Li, J., Zhang, T., Yang, S.: The Two-Dimensional Riemann Problem in Gas Dynamics. Pitman Monographs and Surveys in Pure and Applied Mathematics 98, Essex: Addison Wesley Longman Limited, 1998
16. Li, J., Zhang, T., Zheng, Y.: Simple waves and a characteristic decomposition of the two dimensional compressible Euler equations. Commun. Math. Phys. 267, 1–12 (2006)
17. Li, J., Zheng, Y.: Interaction of rarefaction waves of the two-dimensional self-similar Euler equations. Arch. Rat. Mech. Anal. 193, 623–657 (2009)
18. Li, M., Zheng, Y.: Semi-hyperbolic patches of solutions of the two-dimensional Euler equations. Preprint, available on request
19. Elling, V., Liu, T.P.: Supersonic flow on a solid wedge. Comm. Pure Appl. Math. 61, 1331–1481 (2008)
20. Majda, A.: Compressible Fluid Flow and Systems of Conservation Laws in Several Space Variables. Applied Mathematical Sciences 53. New York: Springer-Verlag, 1984
21. Pogodin, I.A., Suchkov, V.A., Ianenko, N.N.: On the traveling waves of gas dynamic equations. J. Appl. Math. Mech. 22, 256–267 (1958)
22. Schulz-Rinne, C.W.: Classification of the Riemann problem for two-dimensional gas dynamics. SIAM J. Math. Anal. 24, 76–88 (1993)
23. Schulz-Rinne, C.W., Collins, J.P., Glaz, H.M.: Numerical solution of the Riemann problem for two-dimensional gas dynamics. SIAM J. Sci. Compt. 4, 1394–1414 (1993)
24. Serre, D.: Écoulements de fluides parfaits en deux variables indépendantes de type espace. Réflexion d'un choc plan par un dièdre compressif. Arch. Rat. Mech. Anal. 132, 15–36 (1995)
25. Smoller, J.: Shock Waves and Reaction-Diffusion Equations. Berlin-Heidelberg-New York: Springer, 1983
26. Song, K., Zheng, Y.: Semi-hyperbolic patches of solutions of the pressure gradient system. Disc. Cont. Dyn. Syst. Series A 24, 1365–1380 (2009)
27. Wang, R., Wu, Z.: On mixed initial boundary value problem for quasilinear hyperbolic system of partial differential equations in two independent variables (in Chinese), Acta Sci. Natur. Jilin Univ., 2, 459–502 (1963)
28. Zhang, T., Zheng, Y.: Conjecture on the structure of solution of the Riemann problem for two-dimensional gas dynamics systems. SIAM J. Math. Anal. 21, 593–630 (1990)
29. Zhang, T., Zheng, Y.: Axisymmetric solutions of the Euler equations for polytropic gases. Arch. Rat. Mech. Anal. 142, 253–279 (1998)
30. Zheng, Y.: Systems of Conservation Laws: Two-Dimensional Riemann Problems. Vol. 38, PNLDE, Boston: Birkhäuser, 2001
31. Zheng, Y.: Two-dimensional regular shock reflection for the pressure gradient system of conservation laws. Acta Math. Appl. Sin. Engl. Ser. 22, 177–210 (2006)
32. Zheng, Y.: The compressible Euler system in two space dimensions. In: Series of Cont. Appl. Math. Vol. 13, (Shanghai Mathematics Summer School, 2007). G. Q. Chen, T.-T. Li, C. Liu (eds.) Singapore: World Scientific/Higher Ed. Press, 2008
33. Zheng, Y.: Absorption of characteristics by sonic curves of the two-dimensional Euler equations. Disc. Cont. Dyn. Syst. 23, 605–616 (2009)
34. Zheng, Y.: Shock reflection for the Euler system. In: Hyperbolic Problems Theory, Numerics and Applications (Proceedings of the Osaka meeting 2004), Vol. II. Eds. F. Asakura (Chief), H. Aiso, S. Kawashima, A. Matsumura, S. Nishibata, K. Nishihara; Yokohama: Yokohama Publishers, 2006, pp. 425–432

Communicated by P. Constantin
Commun. Math. Phys. 296, 323–351 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1021-z
Communications in
Mathematical Physics
Uniqueness of Topological Solutions and the Structure of Solutions for the Chern-Simons System with Two Higgs Particles
Jann-Long Chern1, Zhi-You Chen1, Chang-Shou Lin2
1 Department of Mathematics, National Central University, Chung-Li 32001, Taiwan. E-mail: [email protected]; [email protected]
2 Department of Mathematics, Taida Institute for Mathematical Sciences, National Taiwan University, Taipei 10617, Taiwan. E-mail: [email protected]
Received: 3 October 2008 / Accepted: 7 January 2010
Published online: 4 March 2010 – © Springer-Verlag 2010
Abstract: The existence of topological solutions for the Chern-Simons equation with two Higgs particles has been proved by Lin, Ponce and Yang [16]. However, both the uniqueness problem and the existence of non-topological solutions have been left open. In this paper, we consider the case of one vortex at origin. Among others, we prove the uniqueness of topological solutions and give a complete study of the radial solutions, in particular, the existence of some non-topological solutions.
1. Introduction and Main Results

In this paper, we will consider the nonlinear elliptic system
$$\begin{cases} \Delta u + \lambda e^{v}(1 - e^{u}) = 4\pi \displaystyle\sum_{s=1}^{N'} \alpha'_s \delta_{p'_s} \\[2mm] \Delta v + \lambda e^{u}(1 - e^{v}) = 4\pi \displaystyle\sum_{s=1}^{N''} \alpha''_s \delta_{p''_s} \end{cases} \quad \text{in } \mathbb{R}^2, \tag{1.1}$$
where $\Delta = \sum_{i=1}^{2} \partial^2/\partial x_i^2$, $\lambda$ is a positive constant, $N'$ and $N''$ are two positive constants which are called the vortex numbers, $\alpha'_s > 0$ and $\alpha''_s > 0$ are constants, and $\delta_p$ is the Dirac measure at $p$. Equation (1.1) arises from a relativistic Abelian Chern-Simons model with two Higgs particles.

Work partially supported by National Science Council of Taiwan.

For any solution $(u, v)$ to Eq. (1.1), we let $z = x_1 + i x_2$ and define $\phi$, $\chi$, $A_r^{(1)}$ and $A_r^{(2)}$, $r = 1, 2$, in the following:
$$\begin{cases} \theta_1(z) = -\displaystyle\sum_{s=1}^{N'} \arg(z - p'_s), \quad \theta_2(z) = -\displaystyle\sum_{s=1}^{N''} \arg(z - p''_s), \\[2mm] \phi(z) = e^{\frac{1}{2}u(z) + i\theta_1(z)}, \quad \chi(z) = e^{\frac{1}{2}v(z) + i\theta_2(z)}, \\[1mm] A_1^{(1)}(z) = -\mathrm{Re}\{2i\partial \ln\phi(z)\}, \quad A_2^{(1)}(z) = -\mathrm{Im}\{2i\partial \ln\phi(z)\}, \\[1mm] A_1^{(2)}(z) = -\mathrm{Re}\{2i\partial \ln\chi(z)\}, \quad A_2^{(2)}(z) = -\mathrm{Im}\{2i\partial \ln\chi(z)\}; \end{cases} \tag{1.2}$$
here $\phi$ and $\chi$ are interpreted as two complex scalar fields in $\mathbb{R}^2$ representing two Higgs particles, and $A_r^{(1)}$ and $A_r^{(2)}$, $r = 1, 2$, are two gauge fields. Then $(\phi, \chi, A_r^{(I)})$, $I = 1, 2$, $r = 1, 2$, satisfy the self-dual equation for the Chern-Simons-Higgs model with two Higgs particles. For details of computations, we refer the readers to [8,15,16] and the references therein.

For the past twenty years, the equation of Chern-Simons with one Higgs particle has been intensively studied, e.g., see [2–4,6–8,10–15,17–20,22,23] and references therein. However, the study of the system (1.1) only recently began with the paper [16]. For Eq. (1.1), there are two natural boundary conditions for solutions at $\infty$, namely,
$$\text{(i)}\ \lim_{|x|\to\infty} u(x) = \lim_{|x|\to\infty} v(x) = 0, \quad \text{or} \quad \text{(ii)}\ \lim_{|x|\to\infty} u(x) = \lim_{|x|\to\infty} v(x) = -\infty. \tag{1.3}$$
We note that if $(u, v)$ is a solution with the boundary condition either (i) or (ii), then, by the maximum principle, we have $u(x) < 0$ and $v(x) < 0$ for all $x \in \mathbb{R}^2$. In physics literature, a solution $(u, v)$ satisfying boundary condition (i) is called a topological solution. Since the nonlinear term $e^{v}(1 - e^{u}) = -e^{v}u + O(|u|^2)$ for $u$ small and $u(x), v(x) \to 0$ as $|x| \to +\infty$, by the estimates of elliptic PDE, we know that if $(u, v)$ is a topological solution of (1.1), then both $|u|$ and $|v|$ decay exponentially at $\infty$. To solve (1.1), one may consider a regularized form:
$$\begin{cases} \Delta u + \lambda e^{v}(1 - e^{u}) = \displaystyle\sum_{s=1}^{N'} \frac{4\alpha'_s \varepsilon}{(\varepsilon + |x - p'_s|^2)^2} \\[3mm] \Delta v + \lambda e^{u}(1 - e^{v}) = \displaystyle\sum_{s=1}^{N''} \frac{4\alpha''_s \varepsilon}{(\varepsilon + |x - p''_s|^2)^2}, \end{cases} \tag{1.4}$$
where $\varepsilon$ is a small positive number, and introduce the background functions
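$$u_0^{\varepsilon}(x) = \sum_{s=1}^{N'} \alpha'_s \ln\frac{\varepsilon + |x - p'_s|^2}{1 + |x - p'_s|^2}, \qquad v_0^{\varepsilon}(x) = \sum_{s=1}^{N''} \alpha''_s \ln\frac{\varepsilon + |x - p''_s|^2}{1 + |x - p''_s|^2}.$$
Then
$$\Delta u_0^{\varepsilon}(x) = -h_1(x) + \sum_{s=1}^{N'} \frac{4\alpha'_s \varepsilon}{(\varepsilon + |x - p'_s|^2)^2}, \qquad \Delta v_0^{\varepsilon}(x) = -h_2(x) + \sum_{s=1}^{N''} \frac{4\alpha''_s \varepsilon}{(\varepsilon + |x - p''_s|^2)^2},$$
where $h_1, h_2 \in W^{1,2}$ do not depend on $\varepsilon > 0$. By letting $u = u_0^{\varepsilon} + f$, $v = v_0^{\varepsilon} + g$, the regularized form of (1.1) becomes
$$\begin{cases} \Delta f + \lambda e^{v_0^{\varepsilon} + g}(1 - e^{u_0^{\varepsilon} + f}) = h_1 \\[1mm] \Delta g + \lambda e^{u_0^{\varepsilon} + f}(1 - e^{v_0^{\varepsilon} + g}) = h_2. \end{cases} \tag{1.5}$$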
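The regularization rests on the planar identity $\Delta \ln(\varepsilon + |x|^2) = 4\varepsilon/(\varepsilon + |x|^2)^2$, whose right-hand side has total mass $4\pi$. The sketch below checks both facts numerically for a single vortex at the origin with unit weight; the function names, the step size `h`, and the sample values are illustrative assumptions, not taken from the paper.

```python
import math


def log_profile(r, eps):
    """Radial profile ln((eps + r^2) / (1 + r^2)) of one background term."""
    return math.log((eps + r * r) / (1.0 + r * r))


def radial_laplacian(f, r, h=1e-4):
    """Central-difference approximation of f'' + f'/r in the plane."""
    d1 = (f(r + h) - f(r - h)) / (2.0 * h)
    d2 = (f(r + h) - 2.0 * f(r) + f(r - h)) / (h * h)
    return d2 + d1 / r


def closed_form_laplacian(r, eps):
    """Regularized bump minus the eps-independent part."""
    return 4.0 * eps / (eps + r * r) ** 2 - 4.0 / (1.0 + r * r) ** 2


def bump_mass(eps, radius):
    """Integral of 4 eps / (eps + |x|^2)^2 over the disk |x| < radius."""
    return 4.0 * math.pi * radius ** 2 / (eps + radius ** 2)


if __name__ == "__main__":
    eps, r = 0.1, 0.7
    print(radial_laplacian(lambda s: log_profile(s, eps), r),
          closed_form_laplacian(r, eps))
    print(bump_mass(1e-8, 1.0), 4.0 * math.pi)
```

As $\varepsilon \to 0$ the mass inside any fixed disk tends to $4\pi$, which is consistent with the bump approximating $4\pi\delta_0$ in (1.1).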
It is clear that (1.5) is the Euler-Lagrange equations of the nonlinear functional:
$$I(f, g) = \int \left(\nabla f \cdot \nabla g + \lambda e^{u_0^{\varepsilon} + v_0^{\varepsilon} + f + g} - \lambda e^{u_0^{\varepsilon} + f} - \lambda e^{v_0^{\varepsilon} + g} + h_2 f + h_1 g\right) dx. \tag{1.6}$$
We refer to [16] for the details of arguments. From (1.6), we see that Eq. (1.1) is the so-called skew gradient system in the literature, see [21]. Clearly, the indefinite form of $I$ presents a lot of difficulties for solving Eq. (1.1). Hence, it is remarkable that Lin, Ponce and Yang [16] are able to show the existence of topological solutions for Eq. (1.1) for any given set of singularities.

Theorem A. [16] For any given sets $\{p'_1, \ldots, p'_{N'}\}$ and $\{p''_1, \ldots, p''_{N''}\}$ and $\alpha'_s, \alpha''_s > 0$, Eq. (1.1) possesses a topological solution $(u, v)$.

After Theorem A, it is natural to ask the question about the uniqueness of topological solutions for Eq. (1.1). For the single Chern-Simons-Higgs model, the uniqueness result was proved in [6] with only one singularity, and in [3] and [19] for multi-singularity in $\mathbb{R}^2$ and large $\lambda$ as well as in the periodic case. In this article, we consider the topological solution $(u, v)$ for the case $N' = N'' = 1$ and $p'_1$ and $p''_1$ to be the origin $O$. Then $(u, v)$ satisfies
$$\begin{cases} \Delta u + e^{v}(1 - e^{u}) = 4\pi N_1 \delta_0 \\ \Delta v + e^{u}(1 - e^{v}) = 4\pi N_2 \delta_0 \end{cases} \quad \text{in } \mathbb{R}^2, \tag{1.7}$$
with the boundary condition
$$u(x) \to 0, \quad v(x) \to 0 \quad \text{as } |x| \to \infty. \tag{1.8}$$
By noting $u(x) < 0$, $v(x) < 0$ for $x \in \mathbb{R}^2$, and by applying the standard method of moving planes, we can show that $(u, v)$ is radially symmetric with respect to the origin $O$. The proof is standard, and will be omitted here. We refer to [1] for the details of the proof.

For a single nonlinear elliptic equation, the uniqueness problem has been extensively studied for the last decades. It is well-known that the uniqueness problem is closely related to non-degeneracy of its linearized equation. See [4,5] and references therein. In this paper, we also want to prove the uniqueness by studying the non-degeneracy of linearized equations. The linearized equation at $(u, v)$ of (1.7) is called degenerate if there exists a nonzero bounded solution pair $(A(r), B(r))$ of
$$\begin{cases} \Delta A + e^{v}(1 - e^{u})B - e^{u+v}A = 0 \\ \Delta B + e^{u}(1 - e^{v})A - e^{u+v}B = 0 \end{cases} \quad \text{in } \mathbb{R}^2. \tag{1.9}$$
Compared with the case of a single equation, there are additional difficulties to be overcome for (1.9). In the proof of the uniqueness for a single equation, some standard techniques such as the Sturm-Liouville comparison theorem play important roles. See [4] and [5]. However, these standard tools are no longer available for the system (1.9). Hence, we have to develop new ideas to handle (1.9), which will be presented in Sect. 2. We believe that the method developed here should be helpful for a general class of nonlinear elliptic systems. After the non-degeneracy of (1.9) is established, we can prove the following uniqueness theorem.
Theorem 1.1. Let $(u, v)$ be a topological solution of (1.7). Then the linearized equation (1.9) of (1.7) at $(u, v)$ is non-degenerate. Moreover, Eq. (1.7) possesses one and only one topological solution.

Now we come back to discuss the case of the boundary condition (ii) of (1.3). In the Abelian Chern-Simons-Higgs model with one particle, a solution $u(x)$ satisfies
$$\Delta u + e^{u}(1 - e^{u}) = 4\pi N \delta_0 \quad \text{in } \mathbb{R}^2. \tag{1.10}$$
Suppose $u = u(|x|)$ is a non-topological solution of (1.10), i.e., $u(r) \to -\infty$ as $r \to +\infty$. Then it can be proved that $u$ satisfies
$$\int_{\mathbb{R}^2} e^{u}(1 - e^{u})\, dx < +\infty. \tag{1.11}$$
But, for the system (1.1), (1.11) might not hold even for the radial solution $(u(r), v(r))$. Actually, in Sect. 5, we will show that there exists a solution pair $(u, v)$ of (1.7) such that both $u(r)$ and $v(r)$ tend to $-\infty$ as $r \to \infty$ with $\int_{\mathbb{R}^2} e^{v}(1 - e^{u})\, dx < +\infty$ and $\int_{\mathbb{R}^2} e^{u}(1 - e^{v})\, dx = +\infty$. Thus, compared with (1.10), the structure of solutions for (1.7) could be more complicated. One of our purposes in this paper is to classify solutions according to their behaviors at infinity. In this paper, we call a solution non-topological if $(u, v)$ satisfies the boundary condition (ii) in (1.3), and both $e^{u}(1 - e^{v})$ and $e^{v}(1 - e^{u})$ are in $L^1(\mathbb{R}^2)$. For an entire solution $(u, v)$ of (1.1), we set
$$\beta_1 = \frac{1}{2\pi}\int_{\mathbb{R}^2} e^{v}(1 - e^{u})\, dx, \qquad \beta_2 = \frac{1}{2\pi}\int_{\mathbb{R}^2} e^{u}(1 - e^{v})\, dx. \tag{1.12}$$
In order to investigate the structure of all radial solutions of (1.7), we consider the following ODE system:
$$\begin{cases} u''(r) + \dfrac{1}{r}u'(r) + e^{v(r)}(1 - e^{u(r)}) = 0, \\[2mm] v''(r) + \dfrac{1}{r}v'(r) + e^{u(r)}(1 - e^{v(r)}) = 0, \end{cases} \quad r > 0 \tag{1.13}$$
with the initial value
$$u(r) = 2N_1 \log r + \alpha_1 + o(1), \quad v(r) = 2N_2 \log r + \alpha_2 + o(1) \quad \text{as } r \to 0^{+}. \tag{1.14}$$
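For readers who want to experiment with (1.13)–(1.14), the following is a minimal shooting sketch, not the method of the paper. It assumes the change of variable $t = \log r$, in which $p = r u'$ and $q = r v'$ satisfy $dp/dt = -r^2 e^{v}(1 - e^{u})$ and $dq/dt = -r^2 e^{u}(1 - e^{v})$, and it starts from the leading-order data of (1.14) at a small radius $e^{t_0}$. The function names, the classical fourth-order Runge-Kutta scheme, and the parameters `t0` and `steps` are illustrative choices. As a consistency check, the code also tracks the quantity on the left-hand side of the Pohozaev identity (2.1) of Lemma 2.1, which should stay equal to $4N_1N_2$.

```python
import math


def rhs(t, y):
    """System (1.13) in t = log r, with y = (u, p, v, q, K).

    p = r u', q = r v'; K accumulates the two integral terms of (2.1).
    """
    u, p, v, q, _ = y
    r2 = math.exp(2.0 * t)
    eu, ev = math.exp(u), math.exp(v)
    return (
        p,
        -r2 * ev * (1.0 - eu),
        q,
        -r2 * eu * (1.0 - ev),
        r2 * (-2.0 * (eu + ev) + 2.0 * eu * ev),
    )


def rk4_step(t, y, h):
    """One classical Runge-Kutta step of size h."""
    k1 = rhs(t, y)
    k2 = rhs(t + h / 2, tuple(a + h / 2 * b for a, b in zip(y, k1)))
    k3 = rhs(t + h / 2, tuple(a + h / 2 * b for a, b in zip(y, k2)))
    k4 = rhs(t + h, tuple(a + h * b for a, b in zip(y, k3)))
    return tuple(
        a + h / 6 * (b1 + 2 * b2 + 2 * b3 + b4)
        for a, b1, b2, b3, b4 in zip(y, k1, k2, k3, k4)
    )


def shoot(n1, n2, alpha1, alpha2, r_end, t0=-10.0, steps=4000):
    """Integrate from r = exp(t0) to r_end using the leading terms of (1.14)."""
    h = (math.log(r_end) - t0) / steps
    y = (2 * n1 * t0 + alpha1, 2.0 * n1, 2 * n2 * t0 + alpha2, 2.0 * n2, 0.0)
    t = t0
    for _ in range(steps):
        y = rk4_step(t, y, h)
        t += h
    return t, y


def pohozaev(t, y):
    """Left-hand side of (2.1) evaluated at r = exp(t)."""
    u, p, v, q, k = y
    r2 = math.exp(2.0 * t)
    return p * q + r2 * (math.exp(u) + math.exp(v) - math.exp(u + v)) + k


if __name__ == "__main__":
    t, y = shoot(1, 1, -2.0, -2.0, r_end=2.0)
    print("u, v at r = 2:", y[0], y[2])
    print("Pohozaev quantity:", pohozaev(t, y))
```

Varying `alpha1` and `alpha2` in such a sketch is one way to explore the dependence of the large-$r$ behavior on the initial data, although a fixed-step scheme is only reliable while the solution stays bounded.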
According to the behaviors at $\infty$, all entire solutions of (1.13) can be classified into the following five types:
Type (I): $\lim_{r\to\infty}(u(r), v(r)) = (0, 0)$, i.e., $(u, v)$ is the topological solution.
Type (II): $\lim_{r\to\infty}(u(r), v(r)) = (-\infty, -\infty)$ with $\beta_1 < \infty$ and $\beta_2 < \infty$, i.e., $(u, v)$ is a non-topological solution.
Type (III): $\lim_{r\to\infty} u(r) = -\infty$, $\lim_{r\to\infty} v(r) = -\infty$, and either $2N_1 < \beta_1 \le 2N_1 + 2$, $\beta_2 = \infty$ or $\beta_1 = \infty$, $2N_2 < \beta_2 \le 2N_2 + 2$.
Type (IV): $\lim_{r\to\infty}(u(r), v(r)) = (-c_u, -\infty)$ or $\lim_{r\to\infty}(u(r), v(r)) = (-\infty, -c_v)$ for some constants $c_u > 0$ and $c_v > 0$.
Type (V): $\lim_{r\to\infty}(u(r), v(r)) = (+\infty, -\infty)$ or $\lim_{r\to\infty}(u(r), v(r)) = (-\infty, +\infty)$.
Our second result is the asymptotic behaviors of all entire solutions.

Theorem 1.2. Let $(u, v)$ be a solution of (1.13)–(1.14). Then $(u, v)$ must be one of the above five types. Conversely, solutions of all types do exist.

Let $\alpha = (\alpha_1, \alpha_2)$, and let $(u(r, \alpha), v(r, \alpha))$ denote the solution of (1.13)–(1.14). According to the behavior of $(u, v)$, the set of initial data could be classified into the following regions:
$\Omega = \{\alpha \mid (u(r, \alpha), v(r, \alpha))$ is a solution with $\lim_{r\to\infty}(u(r, \alpha), v(r, \alpha)) = (-\infty, -\infty)\}$,
$T = \{\alpha \mid (u(r, \alpha), v(r, \alpha))$ is the unique topological solution$\}$,
$NT = \{\alpha \mid (u(r, \alpha), v(r, \alpha))$ is a non-topological solution$\}$,
$S_u = \{\alpha \mid (u(r, \alpha), v(r, \alpha))$ is a Type (IV) solution with $\lim_{r\to\infty} u(r) = -c_u\}$,
$S_v = \{\alpha \mid (u(r, \alpha), v(r, \alpha))$ is a Type (IV) solution with $\lim_{r\to\infty} v(r) = -c_v\}$,
$W_u = \{\alpha \mid (u(r, \alpha), v(r, \alpha))$ is a Type (V) solution with $\lim_{r\to\infty} u(r) = \infty\}$,
$W_v = \{\alpha \mid (u(r, \alpha), v(r, \alpha))$ is a Type (V) solution with $\lim_{r\to\infty} v(r) = \infty\}$.

Then the structure of solutions sets is described as follows:

Theorem 1.3. Both $\Omega$ and $NT$ are non-empty, open and simply connected. Furthermore, all sets $\Omega \setminus NT$, $S_u$, $S_v$, $W_u$ and $W_v$ are non-empty, and the following statements are valid.
(i) $\Omega = \Omega_v \cup NT \cup \Omega_u$ is a non-empty and simply connected set, where
$\Omega_u = \{\alpha \in \Omega \mid (u(r, \alpha), v(r, \alpha))$ is a Type (III) solution with $\beta_1 < \infty\}$,
$\Omega_v = \{\alpha \in \Omega \mid (u(r, \alpha), v(r, \alpha))$ is a Type (III) solution with $\beta_2 < \infty\}$.
(ii) $\partial\Omega = S_u \cup T \cup S_v$ and $\overline{S_u} \cap \overline{S_v} = \partial\Omega \cap \partial NT = T$.
(iii) For any $\alpha \in NT$ the corresponding $(\beta_1, \beta_2)$ satisfies
$$(\beta_1 - 2(N_1 + 1))(\beta_2 - 2(N_2 + 1)) > 4(N_1 + 1)(N_2 + 1). \tag{1.15}$$
(iv) $W_u$ is open. Furthermore, for each $(\theta, \eta) \in S_u$ there exists $\varepsilon > 0$ such that $(\alpha_1, \eta) \in W_u$ for all $\theta < \alpha_1 < \theta + \varepsilon$.
(v) $W_v$ is open. Furthermore, for each $(\mu, \nu) \in S_v$ there exists $\delta > 0$ such that $(\mu, \alpha_2) \in W_v$ for all $\nu < \alpha_2 < \nu + \delta$.

We remark that the uniqueness of topological solutions implies the simple-connectedness of both $\Omega$ and $NT$. The simple-connectedness is important in itself, because it allows us to study the linearized equation of (1.7) at any non-topological solution through the argument of continuation. An important question about non-topological solutions arises: given any pair of $(\beta_1, \beta_2)$ satisfying (1.15) of Theorem 1.3, is there a unique non-topological solution $(u, v)$ which satisfies (1.12)? We will come back to this issue in a coming paper.

From Theorem 1.3, we note that there are drastic differences between the solutions of (1.10) and (1.7). For Eq. (1.10), if a solution is positive somewhere, then it will blow
up in finite |x|. But the situation changes for the system of equations. For example, the solution of Type (V) shows that u might be positive somewhere, yet neither u nor v blows up in finite |x|. Another consequence of Theorem 1.2 is that if both u and v are positive at some |x0|, then u and v must blow up in finite |x|.

The paper is organized as follows. First we investigate the monotone and non-degenerate properties of the linearized equations on the negative solutions of (1.13) in Sect. 2. Based on the results of Sect. 2 and applying the Implicit Function Theorem, we prove the uniqueness of the topological solution for (1.13) in Sect. 3. In Sect. 4, we give the asymptotic behaviors of all entire solutions. Finally, we prove the existence and classification of solutions of all types, Theorems 1.2 and 1.3, in Sect. 5.

2. The Non-Degeneracy of Linearized Equations

In this section, we prove the non-degeneracy of the linearized equation on the topological solution of (1.7). Before going to our proof, we need to state some properties concerning solutions. First, we have the Pohozaev identity as follows.

Lemma 2.1. (Pohozaev identity). Let $(u(r), v(r))$ be a solution of (1.13)–(1.14) in $(0, R]$ for some $R > 0$. Then we have the following identity:
\[
\bigl[r u'(r)\cdot r v'(r) + r^{2}\bigl(e^{u(r)} + e^{v(r)}\bigr) - r^{2}e^{u(r)+v(r)}\bigr] - 2\int_0^r s\bigl(e^{u(s)} + e^{v(s)}\bigr)\,ds + 2\int_0^r s\,e^{u(s)+v(s)}\,ds = 4N_1N_2 \quad \forall r \in (0, R]. \tag{2.1}
\]
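As a reading aid, the computation behind (2.1) can be written out as follows. This is only a sketch, not the authors' own presentation: it assumes the radial form $(ru')' + re^{v}(1-e^{u}) = 0$, $(rv')' + re^{u}(1-e^{v}) = 0$ of (1.13), and it assumes that (1.14) gives $ru'(r) \to 2N_1$, $rv'(r) \to 2N_2$ and $r^{2}e^{u(r)}, r^{2}e^{v(r)} \to 0$ as $r \to 0^{+}$.

```latex
% Sketch of the computation behind (2.1).
% Assumptions (not restated in this section): the radial form of (1.13),
% and r u'(r) -> 2 N_1, r v'(r) -> 2 N_2 as r -> 0+ from (1.14).
\begin{align*}
0 &= rv'\,(ru')' + ru'\,(rv')' + r^{2}e^{v}(1-e^{u})\,v' + r^{2}e^{u}(1-e^{v})\,u' \\
  &= \bigl(ru'\cdot rv'\bigr)' + r^{2}\bigl(e^{u}+e^{v}\bigr)' - r^{2}\bigl(e^{u+v}\bigr)' .
\intertext{Integrate over $(0,r)$ and use
$\int_0^r s^{2}\,d\bigl(e^{w(s)}\bigr) = r^{2}e^{w(r)} - 2\int_0^r s\,e^{w(s)}\,ds$
for $w \in \{u,\ v,\ u+v\}$:}
0 &= ru'(r)\cdot rv'(r) - 4N_1N_2 + r^{2}\bigl(e^{u(r)}+e^{v(r)}\bigr) - r^{2}e^{u(r)+v(r)} \\
  &\quad - 2\int_0^r s\bigl(e^{u(s)}+e^{v(s)}\bigr)\,ds + 2\int_0^r s\,e^{u(s)+v(s)}\,ds .
\end{align*}
```

Moving $4N_1N_2$ to the right-hand side gives the stated identity.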
Proof. By multiplying $rv'$ and $ru'$ on both sides of the first and second equation of (1.13) respectively, we obtain
\[
\begin{cases}
rv'\,(ru')' + rv'\,r e^{v}(1 - e^{u}) = 0, \\
ru'\,(rv')' + ru'\,r e^{u}(1 - e^{v}) = 0
\end{cases}
\quad \forall r \in (0, R]. \tag{2.2}
\]
Then adding these two equations together and integrating from $0$ to $r$, we get
\[
\Bigl[ru'(r)\cdot rv'(r) - \lim_{r\to 0^{+}}\bigl(ru'(r)\cdot rv'(r)\bigr)\Bigr] + \int_0^r s^{2}\,d\bigl(e^{u(s)}\bigr) + \int_0^r s^{2}\,d\bigl(e^{v(s)}\bigr) - \int_0^r s^{2}\,d\bigl(e^{u(s)+v(s)}\bigr) = 0 \quad \forall r \in (0, R].
\]
By the above equality, and using the initial value (1.14) and integration by parts, we can easily obtain (2.1).

Secondly, we have the following property for solutions with zero boundary value.

Lemma 2.2. Let $(u(r), v(r))$ be a solution of (1.13)–(1.14) satisfying $u(R_0) = v(R_0) = 0$ for some $R_0 > 0$ (or $R_0 = +\infty$). Then the following are valid:
(i) $u < 0$, $v < 0$, $u' > 0$ and $v' > 0$ on $(0, R_0)$. Furthermore, if $R_0 = \infty$, i.e., $(u, v)$ is a topological solution of (1.7), then the corresponding $(\beta_1, \beta_2)$ satisfies $\beta_1 = 2N_1$ and $\beta_2 = 2N_2$, where $(\beta_1, \beta_2)$ is defined in (1.12).
(ii) If $N_1 < N_2$, then $u > v$ on $(0, R_0)$.
(iii) If $N_1 > N_2$, then $u < v$ on $(0, R_0)$.
(iv) If $N_1 = N_2$, then $u \equiv v$.
Proof. We shall apply the maximum principle to prove (i). Suppose $u(r_0) = \max_{(0,R_0]} u > 0$. Then $\Delta u(r_0) \le 0$ and thus
\[
0 = \Delta u(r_0) + e^{v(r_0)}\bigl(1 - e^{u(r_0)}\bigr) < 0,
\]
which yields a contradiction. Hence, $u(r) \le 0$ on $(0, R_0)$. The strong maximum principle implies $u(r) < 0$ in $(0, R_0)$. Similarly, it holds for $v$. Since $u(r) < 0$ and $v(r) < 0$ in $(0, R_0)$, the maximum principle also implies that neither $u$ nor $v$ can attain a local minimum inside $(0, R_0)$. Since $u'(r) > 0$ and $v'(r) > 0$ for $r$ near $0$, we obtain $u'(r) > 0$, $v'(r) > 0$ on $(0, R_0)$. If $R_0 = \infty$ then, by $u(r) < 0$ on $(0, \infty)$ and (1.7), we have $(ru'(r))' = re^{v(r)}(e^{u(r)} - 1) < 0$ $\forall r \in (0, \infty)$. Thus, by $u'(r) > 0$ on $(0, \infty)$, we get that
\[
0 \le \lim_{r\to\infty} ru'(r) = 2N_1 - \int_0^\infty re^{v}(1 - e^{u})\,dr
\]
exists ($\equiv c_u$). If $c_u > 0$ then we easily have $u(r) > 0$ for large $r$. This contradiction proves $c_u = 0$. From this we get $\beta_1 = 2N_1$. The case of $\beta_2$ is similar. Hence (i) holds.

By (1.15), we have $\Delta(u - v) = 4\pi(N_1 - N_2)\delta_0 + (e^{u} - e^{v})$. If $(u - v)(r_0) < 0$ at some $r_0 \in (0, R_0)$, then we can let $r_0$ satisfy $(u - v)(r_0) = \min_{(0,R_0]}(u - v) < 0$, and we have
\[
0 \le \Delta(u - v)(r_0) = e^{u(r_0)} - e^{v(r_0)} < 0,
\]
a contradiction. Hence $u(r) \ge v(r)$. By the strong maximum principle, the strict inequality $u(r) > v(r)$ holds for $r \in (0, R_0)$. This proves (ii). Obviously, (iii) and (iv) follow easily.

In the following, we investigate the monotone property of the negative solution of (1.13)–(1.14). Let, for $i = 1, 2$,
\[
\phi_i(r) = \frac{\partial U}{\partial \alpha_i}, \qquad \psi_i(r) = \frac{\partial V}{\partial \alpha_i}, \tag{2.3}
\]
where $U(r; \alpha_1, \alpha_2) = u(r; \alpha_1, \alpha_2) - 2N_1 \log r$ and $V(r; \alpha_1, \alpha_2) = v(r; \alpha_1, \alpha_2) - 2N_2 \log r$. Then $(\phi_i, \psi_i)$, $i = 1, 2$, satisfy the linearized equations
\[
\begin{cases}
\Delta\phi_i - e^{u+v}\phi_i + e^{v}(1 - e^{u})\psi_i = 0, & r \in (0, R_0), \\
\Delta\psi_i - e^{u+v}\psi_i + e^{u}(1 - e^{v})\phi_i = 0, & r \in (0, R_0), \\
\phi_1(0) = 1 = \psi_2(0), \quad \phi_2(0) = 0 = \psi_1(0), \quad \phi_i'(0) = 0 = \psi_i'(0).
\end{cases} \tag{2.4}
\]
The monotone property of $\phi_i$ and $\psi_i$ is as follows:

Lemma 2.3. Let $(u(r), v(r))$ be a solution of (1.13)–(1.14). If $u(r) < 0$ and $v(r) < 0$ for $r \in (0, R_0)$ for some $R_0 > 0$ (or $R_0 = \infty$), then the corresponding $(\phi_i, \psi_i)$ satisfy
\[
\begin{cases}
\phi_1(r) > 0,\ \phi_1'(r) > 0, & \phi_2(r) < 0,\ \phi_2'(r) < 0, \\
\psi_1(r) < 0,\ \psi_1'(r) < 0, & \psi_2(r) > 0,\ \psi_2'(r) > 0
\end{cases}
\quad \forall r \in (0, R_0). \tag{2.5}
\]
Proof. By (2.4) and (1.14), there exists $r_0 \in (0, R_0)$ such that
\[
\begin{aligned}
r\psi_1'(r) &= -\int_0^r s\bigl[e^{u(s)}(1 - e^{v(s)})\phi_1(s) - e^{u(s)+v(s)}\psi_1(s)\bigr]\,ds && \forall r > 0 \\
&\le -\int_0^r s\bigl[C_1 s^{2N_1}(1 - C_2 s^{2N_2})\phi_1(s) - C_3 s^{2N_1+2N_2}\psi_1(s)\bigr]\,ds && \forall r \in (0, r_0) \\
&\le -Cr^{2N_1+2} < 0 && \forall r \in (0, r_0).
\end{aligned} \tag{2.6}
\]
By $\psi_1(0) = 0$, $\psi_1'(0) = 0$ and (2.6), we have $\psi_1(r) < 0$ and $\psi_1'(r) < 0$ $\forall r \in (0, r_0)$. On the other hand, by (2.4), (1.14), and the above result, we get
\[
\begin{aligned}
r\phi_1'(r) &= \int_0^r s\bigl[e^{u(s)+v(s)}\phi_1(s) + e^{v(s)}(e^{u(s)} - 1)\psi_1(s)\bigr]\,ds && \forall r > 0 \\
&\ge C_4\int_0^r s\cdot s^{2N_1+2N_2}\phi_1(s)\,ds && \forall r \in (0, r_0) \\
&\ge Cr^{2N_1+2N_2+2} > 0 && \forall r \in (0, r_0).
\end{aligned} \tag{2.7}
\]
By $\phi_1(0) = 1$, $\phi_1'(0) = 0$ and (2.7), we have $\phi_1(r) > 0$ and $\phi_1'(r) > 0$ $\forall r \in (0, r_0)$. These prove that the first inequality of (2.5) holds for $r \in (0, r_0)$. However, (2.6) and (2.7) hold as long as the first inequality of (2.5) is true. This shows that the first inequality of (2.5) holds. The proof for the second inequality of (2.5) is similar. The proof is complete.

Finally, we state and prove the non-degenerate property of the linearized equation at a topological solution in the following:

Lemma 2.4. Let $(u(r), v(r))$ be a solution of (1.13)–(1.14) satisfying $u(R_0) = v(R_0) = 0$ for some $R_0 > 0$ (or $R_0 = +\infty$). If $(\phi_i(r), \psi_i(r))$, $i = 1, 2$, is the respective solution pair of (2.4), then the following statements are valid.
(i) If $R_0 = \infty$, i.e., $(u, v)$ is a topological solution, then there exist constants $c_1 > 0$, $c_2 < 0$, $d_1 < 0$ and $d_2 > 0$ such that
\[
\lim_{r\to\infty}\frac{\phi_i(r)}{r^{-1/2}e^{r}} = c_i \quad\text{and}\quad \lim_{r\to\infty}\frac{\psi_i(r)}{r^{-1/2}e^{r}} = d_i, \quad i = 1, 2.
\]
(ii) Let $M_A(r) = -\dfrac{\phi_1(r)}{\phi_2(r)}$ and $M_B(r) = -\dfrac{\psi_1(r)}{\psi_2(r)}$. Then $M_A(r) > M_B(r) > 0$ $\forall r \in [0, R_0]$ (resp., $[0, \infty)$ if $R_0 = \infty$) and $M_A'(r) < 0$, $M_B'(r) > 0$ $\forall r \in (0, R_0)$.
(iii) $\det\begin{pmatrix}\phi_1(r) & \phi_2(r) \\ \psi_1(r) & \psi_2(r)\end{pmatrix} \ne 0$ $\forall r \in [0, R_0]$ (resp., $[0, \infty)$ if $R_0 = \infty$).
(iv) The corresponding linearized equation (1.9) is non-degenerate.

Proof. (i) We prove the asymptotic behavior of $\phi_1$. The cases of $\psi_1$, $\phi_2$ and $\psi_2$ are similar. Let $w(r) = \phi_1(r) - \psi_1(r) - e^{r}$. Then by (2.4), $w$ satisfies
\[
\Delta w(r) = (e^{u}\phi_1 - e^{v}\psi_1) - \Bigl(1 + \frac{1}{r}\Bigr)e^{r}, \qquad w(0) = 0, \quad w'(0) = -1.
\]
Since $u(r) < 0$, $v(r) < 0$, $\phi_1(r) > 0$ and $\psi_1(r) < 0$ $\forall r > 0$, it follows that
\[
\Delta w \le (\phi_1(r) - \psi_1(r)) - \Bigl(1 + \frac{1}{r}\Bigr)e^{r} = w(r) - \frac{1}{r}e^{r} \quad \forall r \in (0, \infty).
\]
Thus we obtain $w(r) < 0$ $\forall r \in (0, \infty)$, i.e.,
\[
\phi_1(r) - \psi_1(r) < e^{r} \quad\text{on } (0, \infty). \tag{2.8}
\]
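The formula for $\Delta w$ and the differential inequality leading to (2.8) can be checked term by term. The following is a minimal sketch, assuming only the linearized system (2.4) and writing $\Delta = \frac{d^{2}}{dr^{2}} + \frac{1}{r}\frac{d}{dr}$ for the radial Laplacian in $\mathbb{R}^{2}$.

```latex
% Sketch: verifying the equation and inequality for w = phi_1 - psi_1 - e^r.
% Assumes (2.4) and the radial Laplacian in the plane.
\begin{align*}
\Delta\phi_1 - \Delta\psi_1
  &= \bigl[e^{u+v}\phi_1 - e^{v}(1-e^{u})\psi_1\bigr]
   - \bigl[e^{u+v}\psi_1 - e^{u}(1-e^{v})\phi_1\bigr]
   = e^{u}\phi_1 - e^{v}\psi_1, \\
\Delta e^{r}
  &= \Bigl(\frac{d^{2}}{dr^{2}} + \frac{1}{r}\frac{d}{dr}\Bigr)e^{r}
   = \Bigl(1+\frac{1}{r}\Bigr)e^{r}, \\
\Delta w
  &= e^{u}\phi_1 - e^{v}\psi_1 - \Bigl(1+\frac{1}{r}\Bigr)e^{r}
   \le (\phi_1-\psi_1) - \Bigl(1+\frac{1}{r}\Bigr)e^{r}
   = w - \frac{1}{r}e^{r}.
\end{align*}
```

The inequality in the last line uses $e^{u} < 1$ with $\phi_1 > 0$, and $e^{v} < 1$ with $\psi_1 < 0$.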
Let $z(r) = \phi_1(r)\,r^{1/2}$. Then $z$ satisfies
\[
z''(r) + [-1 + q(r)]\,z(r) = 0, \tag{2.9}
\]
where
\[
q(r) = 1 - e^{u+v} + \frac{e^{v}(1 - e^{u})\psi_1}{\phi_1} + \frac{1}{4r^{2}}.
\]
Since $\lim_{r\to\infty}\dfrac{e^{u} - 1}{u} = 1$ and $\psi_1 < 0$, we have
\[
(e^{u} - 1)\psi_1 \le C\cdot u(r)\psi_1(r) \quad\text{for large } r \text{ and some } C > 0. \tag{2.10}
\]
By $|u(r)|, |v(r)| \le Cr^{-1/2}e^{-r}$ for large $r$, (2.8) and (2.10), we easily obtain
\[
(e^{u} - 1)\psi_1 \le Cr^{-1/2} \quad\text{for large } r.
\]
From this and $\phi_1(r) > r$ for large $r$, we deduce $\dfrac{-e^{v}(1 - e^{u})\psi_1}{\phi_1} \in L^{1}(R, \infty)$ for $R > 0$ large. Moreover, since $\int_R^\infty (1 - e^{u+v})\,dr < \infty$, we get
\[
q(r) \in L^{1}[R, \infty). \tag{2.11}
\]
By (2.11) and applying Corollary 9.2 of [9] to (2.9), we finally obtain
\[
\lim_{r\to\infty}\frac{z(r)}{e^{r}} = c_1 > 0, \quad\text{and hence}\quad \lim_{r\to\infty}\frac{\phi_1(r)}{r^{-1/2}e^{r}} = c_1.
\]
This proves the case of $\phi_1$. Thus (i) holds.

(ii) By (2.4), we have $\lim_{r\to 0^{+}} M_A(r) = \infty$, $\lim_{r\to 0^{+}} M_B(r) = 0$, and thus $M_A(r) > M_B(r)$ $\forall r \in (0, r_1)$ for some $r_1 \in (0, R_0]$. We divide the proof of (ii) into the following two steps.

Step 1. If $M_A(r) > M_B(r)$ $\forall r \in (0, r_0)$ for some $r_0 \le R_0$, then $M_A'(r) < 0$ and $M_B'(r) > 0$ $\forall r \in (0, r_0)$.

We prove Step 1 by contradiction. Suppose $M_A'(r) < 0$ $\forall r \in (0, r_0)$ is not true. Then there exist $0 < r_1 < r_2 \le r_0$ such that
\[
M_A'(r_1) < 0, \quad M_A'(r_2) > 0, \quad M_A(r_1) = M_A(r_2)\ (\equiv C_0), \quad\text{and}\quad 0 < M_B(r) < M_A(r) < C_0 \quad \forall r \in (r_1, r_2). \tag{2.12}
\]
For any $c > 0$ and $r \in (0, R_0]$, we define
\[
A_c(r) = \phi_1(r) + c\cdot\phi_2(r) \quad\text{and}\quad B_c(r) = \psi_1(r) + c\cdot\psi_2(r). \tag{2.13}
\]
Then $A_c$ and $B_c$ satisfy
\[
\begin{cases}
\Delta A_c - e^{u+v}A_c = e^{v}(e^{u} - 1)B_c & \forall r \in (0, R_0], \\
\Delta B_c - e^{u+v}B_c = e^{u}(e^{v} - 1)A_c & \forall r \in (0, R_0], \\
A_c(0) = 1, \quad B_c(0) = c > 0.
\end{cases} \tag{2.14}
\]
From (2.12) and (2.13), we easily obtain
\[
A_{C_0}(r) < 0 < B_{C_0}(r) \quad \forall r \in (r_1, r_2) \quad\text{and}\quad A_{C_0}(r_1) = 0 = A_{C_0}(r_2), \tag{2.15}
\]
which imply that $A_{C_0}$ has a local minimum at some $\bar r \in (r_1, r_2)$ and $\Delta A_{C_0}(\bar r) \ge 0$. But, from (2.14) and (2.15), we get
\[
\Delta A_{C_0}(\bar r) = e^{u(\bar r)+v(\bar r)}A_{C_0}(\bar r) + e^{v(\bar r)}\bigl(e^{u(\bar r)} - 1\bigr)B_{C_0}(\bar r) < 0. \tag{2.16}
\]
This contradiction proves $M_A'(r) < 0$ $\forall r \in (0, r_0)$. Similarly, suppose $M_B'(r) > 0$ $\forall r \in (0, r_0)$ is not true. Then there exist $0 < r_1 < r_2 \le r_0$ such that
\[
M_B'(r_1) > 0, \quad M_B'(r_2) < 0, \quad M_B(r_1) = M_B(r_2)\ (\equiv C_0), \quad\text{and}\quad C_0 < M_B(r) < M_A(r) \quad \forall r \in (r_1, r_2). \tag{2.17}
\]
By (2.17) and (2.13), we easily obtain
\[
B_{C_0}(r) < 0 < A_{C_0}(r) \quad \forall r \in (r_1, r_2) \quad\text{and}\quad B_{C_0}(r_1) = 0 = B_{C_0}(r_2), \tag{2.18}
\]
and hence $B_{C_0}$ has a local minimum at some $\bar r \in (r_1, r_2)$ with $\Delta B_{C_0}(\bar r) \ge 0$. However, from (2.14) and (2.18) we get
\[
\Delta B_{C_0}(\bar r) = e^{u(\bar r)+v(\bar r)}B_{C_0}(\bar r) + e^{u(\bar r)}\bigl(e^{v(\bar r)} - 1\bigr)A_{C_0}(\bar r) < 0. \tag{2.19}
\]
This contradiction proves Step 1.

Step 2. There does not exist $R \in (0, R_0)$ such that $M_A(R) = M_B(R)$.

Suppose Step 2 is not true. Then there exists a smallest $R \in (0, R_0]$ such that $M_A(R) = M_B(R)\ (\equiv C)$ and $M_A(r) > M_B(r) > 0$ $\forall r \in (0, R)$. Let $A_c$ and $B_c$ be defined in (2.13). Then, in this case, by Step 1 we obtain
\[
A_C(r) > 0, \quad B_C(r) > 0 \quad \forall r \in (0, R), \qquad A_C(R) = B_C(R) = 0, \qquad A_C'(R) < 0, \quad B_C'(R) < 0 \ \text{ if } R < \infty. \tag{2.20}
\]
Differentiating both sides of the Pohozaev identity (2.1) with respect to $\alpha_i$, $i = 1, 2$, we obtain, for any $c > 0$ and $r \in (0, R_0]$,
\[
\begin{aligned}
& r^{2}A_c'(r)v'(r) + r^{2}B_c'(r)u'(r) + r^{2}\bigl[e^{u(r)}A_c(r) + e^{v(r)}B_c(r)\bigr] - r^{2}e^{u(r)+v(r)}\bigl(A_c(r) + B_c(r)\bigr) \\
&\quad - 2\int_0^r s\bigl[e^{u}A_c + e^{v}B_c\bigr]\,ds + 2\int_0^r s\,e^{u+v}(A_c + B_c)\,ds = 0.
\end{aligned} \tag{2.21}
\]
If $R < \infty$ then, by replacing $c$ and $r$ with $C$ and $R$ in (2.21) respectively, we easily have
\[
\begin{aligned}
0 &= R^{2}A_C'(R)v'(R) + R^{2}B_C'(R)u'(R) + R^{2}B_C(R)e^{v(R)}\bigl(1 - e^{u(R)}\bigr) + R^{2}A_C(R)e^{u(R)}\bigl(1 - e^{v(R)}\bigr) \\
&\quad + 2\Bigl[\int_0^R rA_Ce^{u}(e^{v} - 1)\,dr + \int_0^R rB_Ce^{v}(e^{u} - 1)\,dr\Bigr].
\end{aligned} \tag{2.22}
\]
Then, combining (i) of Lemma 2.2, (2.20) and (2.22), we deduce
\[
0 > R^{2}A_C'(R)v'(R) + R^{2}B_C'(R)u'(R) = 2\Bigl[\int_0^R rA_Ce^{u}(1 - e^{v})\,dr + \int_0^R rB_Ce^{v}(1 - e^{u})\,dr\Bigr] > 0,
\]
which yields a contradiction. If $R = \infty$ then we first claim that

(†) one of $A_C$ and $B_C$ is unbounded.

Suppose (†) is not true. Then $A_C$ and $B_C$ are bounded. By (2.21) we have
\[
\begin{aligned}
0 &= \lim_{r\to\infty}\bigl[rA_C'(r)\cdot rv'(r) + rB_C'(r)\cdot ru'(r)\bigr] \\
&\quad + \lim_{r\to\infty}\bigl[B_C(r)\cdot r^{2}e^{v(r)}(1 - e^{u(r)}) + A_C(r)\cdot r^{2}e^{u(r)}(1 - e^{v(r)})\bigr] \\
&\quad + 2\Bigl[\int_0^\infty rA_Ce^{u}(e^{v} - 1)\,dr + \int_0^\infty rB_Ce^{v}(e^{u} - 1)\,dr\Bigr].
\end{aligned} \tag{2.23}
\]
Moreover, (i) of Lemma 2.2 implies that
\[
\lim_{r\to\infty} ru'(r) = 0 = \lim_{r\to\infty} rv'(r), \qquad \lim_{r\to\infty}\bigl[r^{2}e^{u(r)}(1 - e^{v(r)})\bigr] = 0 = \lim_{r\to\infty}\bigl[r^{2}e^{v(r)}(1 - e^{u(r)})\bigr]. \tag{2.24}
\]
Since $|u|$ and $|v|$ decay exponentially at $\infty$, and $(A_C, B_C)$ is bounded, by (2.14) and (i) of Lemma 2.2, we get that the limits
\[
\begin{aligned}
\lim_{r\to\infty} rA_C'(r) &= \int_0^\infty \bigl[re^{u+v}A_C\bigr]\,dr - \int_0^\infty \bigl[re^{v}(1 - e^{u})B_C\bigr]\,dr, \\
\lim_{r\to\infty} rB_C'(r) &= \int_0^\infty \bigl[re^{u+v}B_C\bigr]\,dr - \int_0^\infty \bigl[re^{u}(1 - e^{v})A_C\bigr]\,dr
\end{aligned} \tag{2.25}
\]
all exist. Hence, due to (2.20) and (2.23)–(2.25), we finally obtain
\[
0 = 2\Bigl[\int_0^\infty rA_Ce^{u}(1 - e^{v})\,dr + \int_0^\infty rB_Ce^{v}(1 - e^{u})\,dr\Bigr] > 0.
\]
This contradiction shows that $A_C$ or $B_C$ is unbounded, and the claim is proved. Secondly, suppose $A_C$ is unbounded. From (2.14) we have
\[
\Delta(A_C - B_C) - e^{u}(A_C - B_C) = (e^{u} - e^{v})B_C,
\]
and hence, by Lemma 2.2 and the strong maximum principle, we obtain that $A_C(r)$ intersects $B_C(r)$ at most at one point on $[0, \infty)$. Thus, w.l.o.g., we may assume that there exists $r_1 > 0$ such that
\[
A_C'(r_1) \ge 0, \qquad A_C(r) > B_C(r) > 0 \ \text{ on } [r_1, \infty). \tag{2.26}
\]
Since $(u, v)$ is a topological solution of (1.7), there exists $r_2 > r_1$ such that
\[
e^{u(r)+v(r)} \ge \max\bigl\{e^{v(r)}(1 - e^{u(r)}),\ e^{u(r)}(1 - e^{v(r)})\bigr\} \quad \forall r \ge r_2. \tag{2.27}
\]
Therefore, by (2.14) and (2.26)–(2.27), we get $A_C'(r) > 0$ on $(r_1, \infty)$ and thus $\lim_{r\to\infty} A_C(r) = \infty$. Now, by applying the same arguments as in the proof of (i), we can obtain
\[
\lim_{r\to\infty}\frac{A_C(r)}{r^{-1/2}e^{r}} = C_A = c_1 + C\cdot c_2 > 0,
\]
where $c_1$ and $c_2$ are the constants in (i). Then there exists $\varepsilon > 0$ such that $C_\varepsilon = c_1 + (C + \varepsilon)\cdot c_2 > 0$ and
\[
\lim_{r\to\infty}\frac{A_{C+\varepsilon}(r)}{r^{-1/2}e^{r}} = C_\varepsilon.
\]
But, by Step 1 we have $A_{C+\varepsilon}(r) < 0$ for large $r$. We get a contradiction. The case of unboundedness for $B_C$ is similar. This shows Step 2. According to Steps 1 and 2, we easily obtain (ii).

(iii) Suppose $\det\begin{pmatrix}\phi_1(R) & \phi_2(R) \\ \psi_1(R) & \psi_2(R)\end{pmatrix} = 0$ for some $R \in [0, R_0]$ (resp., $R \in [0, \infty)$ if $R_0 = \infty$). Then, w.l.o.g., there exists $C_0 > 0$ such that
\[
\begin{pmatrix}\phi_1(R) \\ \psi_1(R)\end{pmatrix} + C_0\begin{pmatrix}\phi_2(R) \\ \psi_2(R)\end{pmatrix} = \begin{pmatrix}0 \\ 0\end{pmatrix}. \tag{2.28}
\]
By (2.28) we obtain $M_A(R) = C_0 = M_B(R)$, which contradicts the result of (ii). Hence we prove (iii).

(iv) Let $(u, v)$ be a topological solution of (1.7). Then any solution pair $(A(r), B(r))$ of the linearized equations (1.9) can be written as $A(r) = c_1\phi_1(r) + c_2\phi_2(r)$ and $B(r) = c_1\psi_1(r) + c_2\psi_2(r)$ for some $c_1, c_2 \in \mathbb{R}$. By the result of (†) in the proof of (ii), we easily obtain the non-degeneracy result if $c = c_2/c_1 > 0$. When $c \le 0$ or $c_1 = 0$, then by (i), we can also get that both $A_c(r)$ and $B_c(r)$ are unbounded. This proves (iv).

3. Uniqueness of Topological Solution

In this section, we will use a continuation argument and Lemma 2.4 to establish the uniqueness of topological solutions. As we have seen in Sect. 2, if $N_1 = N_2$, then $u \equiv v$, and the uniqueness follows from the case of the scalar equation (1.10). Concerning the uniqueness for the scalar equation, we refer readers to [6] or [8].

Proof of Theorem 1.1. Suppose that for some pair $(N_1^0, N_2^0)$, Eq. (1.15) possesses at least two topological solutions. Without loss of generality, we may assume $0 \le N_1^0 < N_2^0$. Let
\[
N_1^{*} = \inf\{0 \le N_1 \mid \text{(1.15) possesses a unique topological solution for all } (\hat N_1, N_2^0) \text{ where } N_1 \le \hat N_1 \le N_2^0\}.
\]
Clearly, $N_1^{*} \ge N_1^0$. To yield a contradiction, we claim the following:

(∗) Suppose $(u_0, v_0)$ is a topological solution of (1.15) with respect to $(N_1, N_2)$. Let $U_0(r) = u_0(r) - 2N_1 \log r$ and $V_0(r) = v_0(r) - 2N_2 \log r$. Then there is a neighborhood $B$ of $(N_1, N_2)$ such that for any pair of $(N_1, N_2)$ in $B$, there exists the corresponding $(U, V)$ with respect to $(N_1, N_2)$, which is close to $(U_0, V_0)$ in $C^{2}(B_R(0)) \times C^{2}(B_R(0))$ for any $R > 0$, where $(u(r), v(r)) = (U(r) + 2N_1 \log r, V(r) + 2N_2 \log r)$ is a topological solution of (1.15) with respect to $(N_1, N_2)$.
If the domain is bounded, then claim (∗) follows directly from the non-degeneracy of the linearized equation and the Implicit Function Theorem. Since our domain is $\mathbb{R}^{2}$, in order to apply the Implicit Function Theorem, we need to show that the linearized equation of (1.7) is an invertible operator from $W_r^{2,2}(\mathbb{R}^{2}) \times W_r^{2,2}(\mathbb{R}^{2})$ to $L^{2}(\mathbb{R}^{2})$, where $W_r^{2,2}(\mathbb{R}^{2}) = \{z(x) = z(r) \mid z, z', z'' \in L^{2}(\mathbb{R}^{2})\}$. For the sake of completeness, we will present a proof of the claim (∗). First, let us assume claim (∗) holds. The proof of claim (∗) will be given later. By the claim (∗), at $(N_1^{*}, N_2)$, Eq. (1.7) possesses a unique topological solution. Thus $N_1^{*} > N_1^0$. By the definition of $N_1^{*}$, there are two sequences of solutions $(u_k, v_k)$, $(u_k^{*}, v_k^{*})$ of (1.7) with $(N_1^k, N_2^0)$ such that $N_1^k \downarrow N_1^{*}$. The following lemma shows the pre-compactness of $(U_k, V_k)$, where $U_k(r) = u_k(r) - 2N_1^k \log r$ and $V_k(r) = v_k(r) - 2N_2^0 \log r$.

Lemma 3.1. There exists a subsequence of $(U_k, V_k)$ such that it converges to $(U, V)$ in $C^{2}(B_R(0)) \times C^{2}(B_R(0))$ for any $R > 0$, where $(u(r), v(r)) = (U(r) + 2N_1^{*} \log r, V(r) + 2N_2^0 \log r)$ is a topological solution of (1.15).

Proof. Since $(u_k, v_k)$ is a topological solution of (1.15) with $(N_1^k, N_2^0)$, we have
\[
\int_0^\infty e^{v_k}(1 - e^{u_k})\,r\,dr = 2N_1^k \quad\text{and}\quad \int_0^\infty e^{u_k}(1 - e^{v_k})\,r\,dr = 2N_2^0. \tag{3.1}
\]
Since
\[
2\int_0^r s\bigl(1 - e^{u_k(s)}\bigr)\bigl(1 - e^{v_k(s)}\bigr)\,ds = r^{2} - 2\int_0^r s\bigl(e^{u_k(s)} + e^{v_k(s)}\bigr)\,ds + 2\int_0^r s\,e^{u_k(s)+v_k(s)}\,ds,
\]
by the Pohozaev identity (2.1), we have
\[
\int_0^\infty r\bigl(1 - e^{u_k(r)}\bigr)\bigl(1 - e^{v_k(r)}\bigr)\,dr = 2N_1^kN_2^0.
\]
Thus by (3.1), we obtain
\[
\int_0^\infty \bigl[(1 - e^{u_k}) + (1 - e^{v_k})\bigr]\,r\,dr = 2(N_1^k + N_2^0) + 4N_1^kN_2^0 \le C_0 < +\infty \quad \forall k. \tag{3.2}
\]
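The last equality rests on a pointwise algebraic identity. The sketch below spells it out, with the index $k$ dropped and $(N_1, N_2)$ standing for $(N_1^k, N_2^0)$. It takes the two integrals in (3.1) and the Pohozaev consequence $\int_0^\infty r(1-e^{u})(1-e^{v})\,dr = 2N_1N_2$ as given.

```latex
% Sketch: combining (3.1) with the Pohozaev consequence.
% Index k dropped; (N_1, N_2) stands for (N_1^k, N_2^0).
\begin{align*}
(1-e^{u}) + (1-e^{v})
  &= e^{v}(1-e^{u}) + e^{u}(1-e^{v}) + 2(1-e^{u})(1-e^{v}), \\
\int_0^\infty \bigl[(1-e^{u}) + (1-e^{v})\bigr]\,r\,dr
  &= 2N_1 + 2N_2 + 2\cdot 2N_1N_2
   = 2(N_1+N_2) + 4N_1N_2 .
\end{align*}
```

Expanding the right-hand side of the first line gives $2 - e^{u} - e^{v}$, which confirms the identity.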
Then
\[
\Delta U_k + e^{v_k}(1 - e^{u_k}) = 0, \qquad \Delta V_k + e^{u_k}(1 - e^{v_k}) = 0.
\]
By integrating the equation, one has
\[
-U_k'(r)\,r = \int_0^r e^{v_k}(1 - e^{u_k})\,s\,ds \le \int_0^\infty (1 - e^{u_k})\,s\,ds < C_0 \quad (\text{by (3.2)}).
\]
Thus, $|U_k'(r)|$ is uniformly bounded in any bounded subinterval of $[0, \infty)$. We claim that $U_k(1)$ is bounded.
Otherwise, since $U_k(r) - U_k(1)$ is uniformly bounded on any bounded subinterval of $[0, \infty)$, we have $U_k(r) = U_k(1) + O(1)$ for $0 \le r \le R_0$, where $R_0$ is chosen so that
\[
\frac{1}{2}\int_0^{R_0} s\,ds > C_0.
\]
Suppose $U_k(1) \to -\infty$. Then for large $k$, $1 - e^{u_k(r)} \ge \frac{1}{2}$ for $0 \le r \le R_0$. Therefore, we have
\[
C_0 > \int_0^\infty (1 - e^{u_k})\,r\,dr \ge \int_0^{R_0} (1 - e^{u_k})\,r\,dr \ge \frac{1}{2}\int_0^{R_0} r\,dr > C_0,
\]
a contradiction. Therefore, $U_k(1)$ is bounded. Recall that $u_k$ and $v_k$ are increasing in $r$ and both are negative. Thus $|u_k(r)|$ and $|v_k(r)|$ are uniformly bounded in $[r_0, \infty)$ for any $r_0 > 0$. Without loss of generality, we may assume $U_k(r)$, $V_k(r)$ converge to $U(r)$, $V(r)$ in $C^{2}([0, R])$ for all $R > 0$, and $(u(r), v(r))$, which is defined in Lemma 3.1, satisfies (1.15) with $(N_1^{*}, N_2^0)$ and $u'(r), v'(r) > 0$. By (3.2) and Fatou's Lemma,
\[
\int_0^\infty \bigl[(1 - e^{u}) + (1 - e^{v})\bigr]\,r\,dr \le \liminf_{k\to\infty}\int_0^\infty \bigl[(1 - e^{u_k}) + (1 - e^{v_k})\bigr]\,r\,dr \le C_0,
\]
which implies $\lim_{r\to+\infty} u(r) = \lim_{r\to+\infty} v(r) = 0$, that is, $(u, v)$ is a topological solution. This completes the proof.

Now we go back to the proof of uniqueness. By Lemma 3.1, $(u_k, v_k)$ and $(u_k^{*}, v_k^{*})$ converge to $(u, v)$, due to the fact that (1.15) has only one topological solution at $(N_1^{*}, N_2^0)$. W.l.o.g., we can assume $|(u_k - u_k^{*})(x_k)| = \|u_k - u_k^{*}\|_{L^\infty} \ge \|v_k - v_k^{*}\|_{L^\infty}$ $\forall k$. Set
\[
A_k = \frac{u_k - u_k^{*}}{\|u_k - u_k^{*}\|_{L^\infty}} \quad\text{and}\quad B_k = \frac{v_k - v_k^{*}}{\|u_k - u_k^{*}\|_{L^\infty}}.
\]
Then $A_k$, $B_k$ satisfy
\[
\begin{cases}
\Delta A_k + e^{\eta_k(x)}(1 - e^{u_k})B_k - e^{\xi_k(x)+v_k^{*}}A_k = 0, \\
\Delta B_k + e^{\xi_k(x)}(1 - e^{v_k})A_k - e^{\eta_k(x)+u_k^{*}}B_k = 0,
\end{cases}
\]
where $\xi_k(x)$ lies between $u_k(x)$ and $u_k^{*}(x)$, and $\eta_k(x)$ lies between $v_k(x)$ and $v_k^{*}(x)$. Since for any fixed $k$, $u_k(x) \to 0$, $v_k(x) \to 0$ as $x \to \infty$, we can apply the same argument as in (3.13) and Lemma 3.2 to obtain that the maximum points $x_k$ are bounded. Thus $A_k$ and $B_k$ converge to $A$ and $B$ in $C^{2}(\mathbb{R}^{2})$ respectively, where $(A, B)$ satisfies
\[
\Delta A + e^{v}(1 - e^{u})B - e^{u+v}A = 0, \qquad \Delta B + e^{u}(1 - e^{v})A - e^{u+v}B = 0.
\]
Since $A$ and $B$ are bounded and not all zero in $\mathbb{R}^{2}$, by Lemma 2.4, we have $A \equiv 0$ and $B \equiv 0$, a contradiction. This completes the proof of Theorem 1.1.
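The equations for the normalized differences $(A_k, B_k)$ used in the proof above come from the mean value theorem. The following sketch shows one way to obtain the first of them. It drops the index $k$ and writes $(u, v)$ and $(u^{*}, v^{*})$ for the two topological solutions with the same pair $(N_1^k, N_2^0)$, so that the singular terms at the origin cancel. The particular splitting of the difference is an assumption of this sketch, and the authors may have grouped the terms differently.

```latex
% Sketch: subtracting the equations of two solutions (u,v) and (u^*,v^*)
% sharing the same vortex numbers; index k dropped.
% The splitting in the second line is one possible choice (an assumption).
\begin{align*}
\Delta(u-u^{*})
  &= -e^{v}(1-e^{u}) + e^{v^{*}}(1-e^{u^{*}}) \\
  &= -\bigl(e^{v}-e^{v^{*}}\bigr)(1-e^{u}) + e^{v^{*}}\bigl(e^{u}-e^{u^{*}}\bigr) \\
  &= -e^{\eta}(1-e^{u})\,(v-v^{*}) + e^{\xi+v^{*}}\,(u-u^{*}),
\end{align*}
% where e^{v}-e^{v^*} = e^{\eta}(v-v^*) and e^{u}-e^{u^*} = e^{\xi}(u-u^*)
% by the mean value theorem, with \eta between v and v^*, \xi between u and u^*.
```

Dividing by $\|u - u^{*}\|_{L^\infty}$ gives $\Delta A + e^{\eta}(1 - e^{u})B - e^{\xi + v^{*}}A = 0$. The equation for $B$ follows in the same way with the roles of $u$ and $v$ exchanged.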
Now we need to show claim (∗). To show it, we define a background function pair $(u_0, v_0)$ by
\[
u_0(x) = N_1 \ln\frac{|x|^{2}}{1 + |x|^{2}} \quad\text{and}\quad v_0(x) = N_2 \ln\frac{|x|^{2}}{1 + |x|^{2}}. \tag{3.3}
\]
Let $(\hat u + u_0, \hat v + v_0)$ be a topological solution of (1.7). Then $(\hat u, \hat v)$ satisfies
\[
\begin{cases}
\Delta\hat u + e^{v_0+\hat v}\bigl(1 - e^{u_0+\hat u}\bigr) - \dfrac{4N_1}{(1 + |x|^{2})^{2}} = 0 & \text{in } \mathbb{R}^{2}, \\[2mm]
\Delta\hat v + e^{u_0+\hat u}\bigl(1 - e^{v_0+\hat v}\bigr) - \dfrac{4N_2}{(1 + |x|^{2})^{2}} = 0 & \text{in } \mathbb{R}^{2}, \\[2mm]
\hat u(x) \to 0 \text{ and } \hat v(x) \to 0 & \text{as } |x| \to \infty.
\end{cases} \tag{3.4}
\]
To prove our claim (∗), we have to prove that the linearized equation is an invertible operator from $W_r^{2,2}(\mathbb{R}^{2}) \times W_r^{2,2}(\mathbb{R}^{2})$ to $L^{2}(\mathbb{R}^{2})$, i.e., Eq. (3.5) below is uniquely solvable in $W_r^{2,2}(\mathbb{R}^{2}) \times W_r^{2,2}(\mathbb{R}^{2})$ for any pair $(f, g) \in L^{2}$,
\[
\begin{cases}
\Delta A + e^{v}(1 - e^{u})B - e^{u+v}A = f, \\
\Delta B + e^{u}(1 - e^{v})A - e^{u+v}B = g,
\end{cases}
\quad\text{in } \mathbb{R}^{2}. \tag{3.5}
\]
For any pair $(f, g)$, by Lemma 2.4 there is at most one solution $(A, B) \in W_r^{2,2}(\mathbb{R}^{2}) \times W_r^{2,2}(\mathbb{R}^{2})$. Hence, it suffices for us to show the existence of solutions. Since $\mathbb{R}^{2}$ is an unbounded domain, the existence cannot follow directly from the uniqueness of the solution of (3.5), i.e., the Fredholm alternative theorem might not always hold. However, for any $R > 0$, the equation
\[
\begin{cases}
\Delta A_k + e^{v}(1 - e^{u})B_k - e^{u+v}A_k = f, \\
\Delta B_k + e^{u}(1 - e^{v})A_k - e^{u+v}B_k = g,
\end{cases}
\quad\text{in } B_R(O), \qquad A_k = B_k = 0 \quad\text{on } \partial B_R(O), \tag{3.6}
\]
has a solution, i.e., the Fredholm alternative theorem is true for each $R > 0$. Then by letting $R = R_n \to +\infty$, we want to prove that $(A_n, B_n) = (A_{R_n}, B_{R_n})$ has a convergent subsequence in $W_r^{2,2}(\mathbb{R}^{2}) \times W_r^{2,2}(\mathbb{R}^{2})$.

Lemma 3.2. $(A_n, B_n)$ has a convergent subsequence in $W_r^{2,2}(\mathbb{R}^{2})$.

Proof. By Sobolev's embedding theorem, $A_n$ and $B_n$ are locally Hölder continuous functions. We want to show that
\[
\|A_n\|_{L^\infty(B_{R_n})} + \|B_n\|_{L^\infty(B_{R_n})} \le C\bigl\{\|f\|_{L^{2}(\mathbb{R}^{2})} + \|g\|_{L^{2}(\mathbb{R}^{2})}\bigr\} \tag{3.7}
\]
for some constant $C$ independent of $n$. Suppose (3.7) does not hold. Without loss of generality, one may assume
\[
\|A_n\|_{L^\infty} = \max\bigl(\|A_n\|_{L^\infty}, \|B_n\|_{L^\infty}\bigr) \quad\text{and}\quad \|A_n\|_{L^\infty} \to +\infty \text{ as } n \to \infty. \tag{3.8}
\]
Let $x_n \in \mathbb{R}^{2}$ be such that $|A_n(x_n)| = \|A_n\|_{L^\infty}$. First, we claim
\[
x_n \text{ is bounded.} \tag{3.9}
\]
To prove our claim, Eq. (3.6) can be rewritten as
\[
\Delta A_n - A_n = f + e^{v}(e^{u} - 1)B_n + (e^{u+v} - 1)A_n, \tag{3.10}
\]
and let $K_R(x, y)$ be the fundamental solution of $\Delta - I$ with zero boundary value on $\partial B_R(O)$. It is easy to see
\[
0 < K_R(x, y) \le C\,\frac{e^{-|x-y|}}{\sqrt{|x - y|}} \quad\text{for } |x - y| \ge 1. \tag{3.11}
\]
Then
\[
A_n(x_n) = \int_{B_R} K_R(x_n, y)\bigl[e^{v}(e^{u} - 1)B_n(y) + (e^{u+v} - 1)A_n(y) + f(y)\bigr]\,dy. \tag{3.12}
\]
Since $v(y) \to 0$ and $u(y) \to 0$ as $y \to +\infty$, by (3.12) we have
\[
|A_n(x_n)| \le o(1)\bigl(\|A_n\|_{L^\infty} + \|B_n\|_{L^\infty}\bigr) + C\cdot\|f\|_{L^{2}}, \tag{3.13}
\]
which yields a contradiction. Thus, $x_n$ is bounded. Letting $\hat A_n = \dfrac{A_n}{\|A_n\|_{L^\infty}}$ and $\hat B_n = \dfrac{B_n}{\|A_n\|_{L^\infty}}$, a subsequence of $(\hat A_n, \hat B_n)$ will converge to $(\hat A, \hat B)$, and $(\hat A, \hat B)$ satisfies
\[
\Delta\hat A + e^{v}(1 - e^{u})\hat B - e^{u+v}\hat A = 0, \qquad \Delta\hat B + e^{u}(1 - e^{v})\hat A - e^{u+v}\hat B = 0
\]
and $\hat A \not\equiv 0$. Since $\hat A$ and $\hat B$ are bounded, by Lemma 2.4, we have $\hat A \equiv 0$ and $\hat B \equiv 0$, which yields a contradiction. Thus (3.7) is established. By (3.7), a subsequence of $(A_n, B_n)$ will converge to $(A, B)$ and $(A, B)$ satisfies (3.5). Let Eq. (3.4) be rewritten as (3.6). Since $A$ and $B$ are bounded, by (3.5) and a standard argument, we can show that $A(r)$ and $B(r)$ are in $L^{2}(\mathbb{R}^{2})$. This proves that the linearized equation is 1-1 and onto from $W_r^{2,2}(\mathbb{R}^{2}) \times W_r^{2,2}(\mathbb{R}^{2})$ to $L^{2}(\mathbb{R}^{2})$. Then by the open mapping theorem of (linear) functional analysis, we know that the inverse operator of the linearized equation is bounded from $L^{2}(\mathbb{R}^{2})$ to $W_r^{2,2}(\mathbb{R}^{2}) \times W_r^{2,2}(\mathbb{R}^{2})$. By applying the Implicit Function Theorem, our claim (∗) is proved, and thus the proof of Theorem 1.1 is complete.

4. Asymptotic Behaviors of All Entire Solutions

In this section, we give the asymptotic behaviors of all entire radial solutions for (1.7) as follows. Here solutions are not necessarily negative in $\mathbb{R}^{2}$.

Proposition 4.1. Let $(u, v)$ be an entire solution of (1.13)–(1.14). Then $(u, v)$ satisfies one of the following behaviors:
(A) $\lim_{r\to\infty} u(r) = 0$ and $\lim_{r\to\infty} v(r) = 0$;
(B) $\lim_{r\to\infty} u(r) = -\infty$ and $\lim_{r\to\infty} v(r) = -\infty$;
(C) $\lim_{r\to\infty}\Bigl(u(r), \dfrac{v(r)}{r^{2}}\Bigr) = \Bigl(-c_u, -\dfrac{e^{-c_u}}{4}\Bigr)$ for some $c_u > 0$;
(D) $\lim_{r\to\infty}\Bigl(\dfrac{u(r)}{r^{2}}, v(r)\Bigr) = \Bigl(-\dfrac{e^{-c_v}}{4}, -c_v\Bigr)$ for some $c_v > 0$;
(E) $\lim_{r\to\infty} u(r) = \infty$, $\lim_{r\to\infty} v(r) = -\infty$;
(F) $\lim_{r\to\infty} u(r) = -\infty$, $\lim_{r\to\infty} v(r) = \infty$.

In order to prove Proposition 4.1, we need the following lemmas.

Lemma 4.1. Let $(u, v)$ be an entire solution of (1.13)–(1.14). Then the following statements hold.
(i) If $u(r_1) \ge 0$ for some $r_1 > 0$, then $u'(r) > 0$ $\forall r > 0$, $\lim_{r\to\infty} u(r) = \infty$ and $\lim_{r\to\infty} v(r) = -\infty$.
(ii) If $u'(r_1) \le 0$ for some $r_1 > 0$, then $u(r_1) < 0$ and $\lim_{r\to\infty} u(r) = -\infty$.

Proof. (i) We shall use the maximum principle to prove the first part of (i). Suppose $u'(r_2) \le 0$ for some $r_2 > 0$. Then, since $u(r) \to -\infty$ as $r \to 0^{+}$ and $u(r_1) \ge 0$, we obtain that $u$ either has a local minimum $u(r_3) < 0$ or a local maximum $u(r_4) > 0$ for some $r_3$ and $r_4$ depending on $r_1$ and $r_2$. From (1.13) we get
\[
0 = \Delta u(r_3) + e^{v(r_3)}\bigl(1 - e^{u(r_3)}\bigr) > 0 \quad\text{or}\quad 0 = \Delta u(r_4) + e^{v(r_4)}\bigl(1 - e^{u(r_4)}\bigr) < 0.
\]
This contradiction shows $u'(r) > 0$ $\forall r \in (0, \infty)$. Furthermore, since $(ru'(r))' = re^{v(r)}(e^{u(r)} - 1) > 0$ $\forall r > r_1$, we have $ru'(r) > r_1u'(r_1) > 0$ $\forall r > r_1$ and thus $\lim_{r\to\infty} u(r) = \infty$. This proves the first part of (i). If $\lim_{r\to\infty} v(r) = -\infty$ is not true, then there exist $r_2 > 0$ and a constant $C_1$ such that $v(r) > C_1$ $\forall r > r_2$ and
\[
\Delta u(r) \ge e^{C_1}\bigl(e^{u(r)} - 1\bigr) \ge Ce^{u(r)} \quad \forall r > r_3, \tag{4.1}
\]
for some constants $C > 0$ and $r_3 > r_2$. From (4.1) we easily obtain that $u$ must blow up at a finite $r$. This contradiction shows that $\lim_{r\to\infty} v(r) = -\infty$ and (i) holds.

(ii) If $u'(r_1) \le 0$ for some $r_1 > 0$, then by (i) we have $u(r_1) < 0$. From this and (1.13), we obtain $ru'(r) < r_1u'(r_1) \le 0$ $\forall r > r_1$. Hence we get $\lim_{r\to\infty} u(r) = -\infty$. This completes the proof.

Lemma 4.2. Let $(u, v)$ be an entire solution of (1.13)–(1.14). If $\lim_{r\to\infty} u(r) = -c_u$ exists and $v'(r_1) \le 0$ for some $r_1 > 0$, then $\lim_{r\to\infty} ru'(r) = 0$, $\lim_{r\to\infty}\dfrac{v(r)}{r^{2}} = -\dfrac{e^{-c_u}}{4}$ and $c_u > 0$.

Proof. Since $v'(r_1) \le 0$, by (ii) of Lemma 4.1 in the case of $v$, we have $\lim_{r\to\infty} v(r) = -\infty$ and
\[
\begin{aligned}
rv'(r) &= r_1v'(r_1) - \int_{r_1}^r s\,e^{u(s)}\bigl(1 - e^{v(s)}\bigr)\,ds && \forall r > r_1 \\
&< -C\int_{r_1}^r s\,ds && \forall r > r_2 > r_1 \text{ and for some constant } C > 0 \\
&\to -\infty && \text{as } r \to \infty.
\end{aligned}
\]
Then, combining the above inequality and (1.13), we easily obtain
\[
\lim_{r\to\infty}\frac{v(r)}{r^{2}} = \lim_{r\to\infty}\frac{rv'(r)}{2r^{2}} = \lim_{r\to\infty}\frac{re^{u}(e^{v} - 1)}{4r} = -\frac{e^{-c_u}}{4}.
\]
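For readers who want the intermediate steps, the chain of limits can be expanded as follows. This is a sketch that applies l'Hôpital's rule twice, using $v(r) \to -\infty$ and $rv'(r) \to -\infty$ from the preceding display, the relation $(rv')' = re^{u}(e^{v} - 1)$ from (1.13), and $e^{u} \to e^{-c_u}$, $e^{v} \to 0$ as $r \to \infty$.

```latex
% Sketch of the two l'Hopital steps for lim v(r)/r^2.
% Uses (r v')' = r e^{u}(e^{v}-1) from (1.13),
% e^{u} -> e^{-c_u} and e^{v} -> 0 as r -> infinity.
\begin{align*}
\lim_{r\to\infty}\frac{v(r)}{r^{2}}
  &= \lim_{r\to\infty}\frac{v'(r)}{2r}
   = \lim_{r\to\infty}\frac{r v'(r)}{2r^{2}}
   = \lim_{r\to\infty}\frac{\bigl(r v'(r)\bigr)'}{4r} \\
  &= \lim_{r\to\infty}\frac{r e^{u(r)}\bigl(e^{v(r)}-1\bigr)}{4r}
   = \frac{e^{-c_u}\,(0-1)}{4}
   = -\frac{e^{-c_u}}{4}.
\end{align*}
```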
This proves the second result of this lemma. Since $\lim_{r\to\infty} u(r) = -c_u$ exists, by Lemma 4.1, we easily obtain that $u(r) < 0$, $u'(r) > 0$ $\forall r \in (0, \infty)$ and thus $-c_u \le 0$. Then, by $(ru'(r))' = -re^{v(r)}(1 - e^{u(r)}) < 0$ $\forall r > 0$, it follows that $\lim_{r\to\infty} ru'(r) = 0$. Suppose $c_u = 0$. We claim the following two statements:
(a) $\lim_{r\to\infty}\dfrac{u(r)}{e^{v(r)}} = 0$;
(b) $\dfrac{u'(r)}{u(r)} \ge v'(r)$ $\forall r \ge r_0$ for some sufficiently large $r_0$.

Since $\lim_{r\to\infty} u(r) = 0$ and $\lim_{r\to\infty} e^{v(r)} = 0$, by (1.13) we have
\[
\lim_{r\to\infty}\frac{u(r)}{e^{v(r)}} = \lim_{r\to\infty}\frac{ru'(r)}{re^{v(r)}v'(r)} = \lim_{r\to\infty}\frac{re^{v(r)}(e^{u(r)} - 1)}{re^{v(r)}(v'(r))^{2} + re^{u(r)+v(r)}(e^{v(r)} - 1)} = \lim_{r\to\infty}\frac{e^{u} - 1}{(v')^{2} + e^{u}(e^{v} - 1)} = 0.
\]
This proves claim (a). In order to prove claim (b), we first show
(i) $\lim_{r\to\infty}\dfrac{e^{u(r)} - 1}{u(r)} = 1$;
(ii) $\lim_{r\to\infty}\dfrac{u(r)}{ru'(r)} = 0$;
(iii) $\lim_{r\to\infty}\dfrac{u'(r)}{u(r)} = 0$.

Since $\lim_{r\to\infty} u(r) = 0$, we obtain $\lim_{r\to\infty}\dfrac{e^{u} - 1}{u} = \lim_{r\to\infty}\dfrac{e^{u(r)}u'(r)}{u'(r)} = \lim_{r\to\infty} e^{u(r)} = 1$, which proves (i). In addition, combining the facts of $\lim_{r\to\infty} ru'(r) = 0$ and $\lim_{r\to\infty} r^{2}e^{v} = 0$ with (1.13), we have
\[
\lim_{r\to\infty}\frac{u(r)}{ru'(r)} = \lim_{r\to\infty}\frac{ru'(r)}{r^{2}e^{v}(e^{u} - 1)} = \lim_{r\to\infty}\frac{re^{v}(e^{u} - 1)}{2re^{v}(e^{u} - 1) + r^{2}e^{v}(e^{u} - 1)v' + r^{2}e^{u+v}u'} = \lim_{r\to\infty}\frac{1}{2 + rv' + \dfrac{ru'e^{u}}{e^{u} - 1}}. \tag{4.2}
\]
Since $\lim_{r\to\infty} rv'(r) = -\infty$ and $\dfrac{ru'e^{u}}{e^{u} - 1} < 0$ $\forall r > 0$, by (4.2) we obtain (ii). Using the assertions of (i) and (ii), we get
\[
\begin{aligned}
\lim_{r\to\infty}\frac{u'(r)}{u(r)} &= \lim_{r\to\infty}\frac{u''(r)}{u'(r)} = \lim_{r\to\infty}\frac{-u'(r) - re^{v(r)}(1 - e^{u(r)})}{ru'(r)} \\
&= \lim_{r\to\infty}\Bigl[\frac{ru(r)e^{v(r)}}{ru'(r)}\cdot\frac{e^{u(r)} - 1}{u(r)}\Bigr] = \lim_{r\to\infty}\frac{ru(r)e^{v(r)}}{ru'(r)} \quad (\text{by (i)}) \\
&= \lim_{r\to\infty}\Bigl[\frac{u(r)}{ru'(r)}\cdot re^{v(r)}\Bigr] = 0 \quad (\text{by (ii) and } \lim_{r\to\infty} re^{v(r)} = 0).
\end{aligned}
\]
This proves (iii).
Applying the results (i)–(iii) and (1.13), we obtain
\[
\begin{aligned}
\lim_{r\to\infty}\frac{ru'(r)/u(r)}{rv'(r)} &= \lim_{r\to\infty}\frac{u(u'(r) + ru''(r)) - r(u'(r))^{2}}{u^{2}(r)}\cdot\frac{1}{re^{u(r)}(e^{v(r)} - 1)} \\
&= \lim_{r\to\infty}\frac{-rue^{v}(1 - e^{u}) - r(u')^{2}}{ru^{2}e^{u}(e^{v} - 1)} \quad (\text{by (1.13)}) \\
&= \lim_{r\to\infty}\frac{e^{v}(1 - e^{u})}{u} + \lim_{r\to\infty}\Bigl(\frac{u'}{u}\Bigr)^{2} = 0 \quad (\text{by (i) and (iii)}).
\end{aligned} \tag{4.3}
\]
Since $\dfrac{ru'(r)}{u(r)} < 0$ and $rv'(r) < 0$ $\forall r > 0$, by (4.3) we finally have $\dfrac{ru'(r)}{u(r)} \ge rv'(r)$ for sufficiently large $r$. Hence we prove claim (b). From (b), we easily obtain that
\[
[\ln(-u(r)) - v(r)]' \ge 0 \quad \forall r \ge r_0. \tag{4.4}
\]
Integrating both sides of (4.4) from $r_0$ to $r$, we deduce $\dfrac{u(r)}{e^{v(r)}} \le -e^{C_0} < 0$ $\forall r \ge r_0$, where $C_0 = \ln(-u(r_0)) - v(r_0)$. This contradicts (a). Therefore $c_u > 0$ and we complete the proof.

Now we are in a position to prove Proposition 4.1.

Proof of Proposition 4.1. We divide the proof into the following cases.

Case 1. $u(r_1) \ge 0$ (resp. $v(r_1) \ge 0$) for some $r_1 > 0$: Then, by (i) of Lemma 4.1, we obtain that $\lim_{r\to\infty} u(r) = \infty$ and $\lim_{r\to\infty} v(r) = -\infty$ (resp. $\lim_{r\to\infty} v(r) = \infty$ and $\lim_{r\to\infty} u(r) = -\infty$). Hence (E) (resp. (F)) happens in this case.

Case 2. $u(r) < 0$ and $v(r) < 0$ $\forall r \in (0, \infty)$:
(i) If $u'(r_1) \le 0$ and $v'(r_2) \le 0$ for some $r_1 > 0$ and $r_2 > 0$, then by (ii) of Lemma 4.1, we have $\lim_{r\to\infty} u(r) = \lim_{r\to\infty} v(r) = -\infty$. This proves that (B) holds in this case.
(ii) If $u'(r) > 0$ and $v'(r) > 0$ $\forall r \in (0, \infty)$, then $\lim_{r\to\infty} u(r) = -C_1 \le 0$ and $\lim_{r\to\infty} v(r) = -C_2 \le 0$ both exist. If $-C_1 < 0$ then, by (1.13)–(1.14), we have
\[
\begin{aligned}
ru'(r) &= 2N_1 - \int_0^r s\,e^{v(s)}\bigl(1 - e^{u(s)}\bigr)\,ds && \forall r > 0 \\
&< 2N_1 - (1 - e^{-C_1})\int_0^r s^{1+2N_2}\,ds && \forall r > 0 \\
&< -C < 0 && \text{for } r \text{ large},
\end{aligned}
\]
which implies $u'(r) < 0$ for large $r$. This contradiction shows $C_1 = 0$. Similarly, $C_2 = 0$ as well. Thus (A) occurs in this case.
(iii) If $u'(r) > 0$ (resp. $v'(r) > 0$) $\forall r \in (0, \infty)$ and $v'(r_1) \le 0$ (resp. $u'(r_1) \le 0$) for some $r_1 > 0$, then $\lim_{r\to\infty} u(r) = -c_u$ (resp. $\lim_{r\to\infty} v(r) = -c_v$) exists. By Lemma 4.2, we obtain $\lim_{r\to\infty}\dfrac{v(r)}{r^{2}} = -\dfrac{e^{-c_u}}{4}$ (resp. $\lim_{r\to\infty}\dfrac{u(r)}{r^{2}} = -\dfrac{e^{-c_v}}{4}$) and $c_u > 0$ (resp. $c_v > 0$). This shows that (C) (resp. (D)) happens in this case.

According to Case 1 and Case 2, we complete the proof.
5. The Structure of All Entire Solutions In this section, we will study the structures of all radial entire solutions for (1.7). Applying this classification, we give the proof of Theorems 1.2 and 1.3. Let the respective set of initial data according to the behaviors of solutions be depicted beneath the statement of Theorem 1.2 in Sect. 1. First, we derive the structure property of in the following. Proposition 5.1. is an open subset of R2 and the following statements are valid. (A) If (α1 , α21 ), (α1 , α22 ) ∈ with α21 < α22 , then (α1 , α2 ) ∈ ∀α21 < α2 < α22 . Similarly, if (α11 , α2 ), (α12 , α2 ) ∈ with α11 < α12 , then (α1 , α2 ) ∈ ∀α11 < α1 < α12 . (B) There exists (α˜ 1 , α˜ 2 ) ∈ R2 such that (α1 , α2 ) ∈ ∀α1 ≥ α˜ 1 or ∀α2 ≥ α˜ 2 . (C) is a simply connected and unbounded region such that
= N T
v , u
∂ = Su T Sv and S u S v = T. In particular, both Su and Sv are nonempty. Proof. We divide the proof into the following steps. ¯ > 0 such that u (r0 , α) ¯ 0 such that ¯ u (r0 , α) < 0 and v (r0 , α) < 0 ∀α ∈ Bδ (α).
(5.1)
By (ii) of Lemma 4.1, we obtain u(r, α) → −∞ and v(r, α) → −∞ as r → ∞ ∀α ∈ ¯ This proves Bδ (α) ¯ ⊂ and thus is open. Bδ (α). Step 2. By the monotone property of (φi , ψi ), i = 1.2, Lemma 2.3, we easily obtain (A). Step 3. We prove (B) by scaling arguments and monotone property. Choose d > 0 such that N2 N2 + 1 0 and d2 > 0. Now suppose there exists a sequence (s j , ds j ) ∈ with (s j , ds j ) → (∞, ∞).
Set (uˆ j , vˆ j ) = (uˆ s j , vˆs j ). Then by euˆ j ≤ 1, evˆ j ≤ 1 and euˆ j +vˆ j ≤ 1, we have that for all R > 0, |uˆ j (r )|, |vˆ j (r )| ≤ M on [0, R] for some M = M(R) > 0. Then ¯ |uˆ j (r )|, |vˆ j (r )| ≤ M¯ on [0, R] for some M¯ = M(R) > 0. From elliptic estimates, we have (uˆ j , vˆ j ) → (u, ˆ v) ˆ (passing subsequence if necessary) in C 2 ([0, R]) for any R > 0 and (u, ˆ v) ˆ which satisfies ⎧ ˆ vˆ ˆ ) = r 2(N1 +N2 ) eu+ ⎨ u(r ˆ vˆ (5.5) v(r ˆ ) = r 2(N1 +N2 ) eu+ ⎩ u(0) ˆ = 0, uˆ (0) = 0, v(0) ˆ = 0, vˆ (0) = 0. Since U (t, s, ds) and V (t, s, ds) are both decreasing in t, we have uˆ and vˆ are nonincreasing in r . But, any solution pair of (5.5) must be increasing in r . Actually, both uˆ or vˆ must blow up at finite r . This contradiction shows that there exists s0 > 0 such that u(r, s, ds) or v(r, s, ds) blows up ∀s > s0 , and hence, by Lemma 2.3, so does u(r, α1 , α2 ) or v(r, α1 , α2 ) for any α1 ≥ s0 ≡ α˜ 1 or α2 ≥ ds0 ≡ α˜ 2 . This proves (B). Step 4. In this step, we prove the result of (C). For this purpose we claim the following statements. (a) Let α ∈ and (u(r ), v(r )) = (u(r, α), v(r, α)). Then the corresponding (β1 , β2 ) satisfies either < ∞ or β2 < ∞. β1 (b) = N T
v . u (c) ∂ ⊂ Su T Sv . (d) If α = (α1 , α2 ) ∈ ∂ and (α1 − , α2 ) ∈ for some > 0, then α ∈ Su . (e) If α = (α1 , α2 ) ∈ ∂ and (α1 , α2 − ) ∈ for some
> 0, then α ∈ Sv . (f) Let 1 = ∂ Su and 2 = ∂ Sv . Then 1 2 = T . (g) is a simply connected and unbounded region. (h) If α = (α1 , α2 ) ∈ Su , then (α1 + , α2 ) ∈ Su ∀ > 0. (i) If α = (α1 , α2 ) ∈ Sv , then (α1 , α2 + ) ∈ Sv ∀ > 0. (a) Suppose the result is not true. Then β1 = ∞, β2 = ∞ and lim r u (r ) = −∞. r →∞
Hence, for each $M > 2$ there exists $r_M > 0$ such that $ru'(r) < -M$ $\forall r > r_M$. Furthermore we get $e^{u(r)} < C\cdot r^{-M}$ for large $r$ and
\[
\beta_2 = \int_0^\infty re^{u}(1 - e^{v})\,dr \le \int_0^\infty re^{u}\,dr < \infty.
\]
This contradiction proves (a). (b) By (a) and the of u and v , we easily have (b). definitions (c) Let E = Su T Sv , α ∈ ∂ and (u, v) be the respective solution. Then, by the Hopf lemma we have u(r ) < 0, v(r ) < 0 ∀r > 0. By Proposition 4.1, we have ∂ ⊂ E. This proves (c). (d) If α = (α1 , α2 ) ∈ ∂ and α = (α1 − , α2 ) ∈ for some > 0, then, by Lemma 2.3, v(r, α ) → −∞ and v(r, α) < v(r, α ) ∀r > 0. This proves α ∈ Su . (e) The proof is similar to (d). (f) From (c)-(e), 1 = ∅ and 2 = ∅, then by the continuity w.r.t. initial data, we obtain that 1 ∩ Sv = ∅ and 2 ∩ Su = ∅. Therefore, (f) is proved.
(g) Suppose is not connected. Then there exist two disjoint open sets O1 and O2 satisfying O1 O2 = . From (d)–(f), O1 and O2 possess one initial data of topological solutions at least, respectively, and by (A)–(B), there exist two distinct initial data of topological solutions T1 and T2 such that T1 ∈ ∂ O1 and T2 ∈ ∂ O2 . This contradicts the uniqueness of the topological solution. Hence is connected. By using the similar arguments, we get is unbounded. From (A), we obtain that the set does not have a hole, that is, is a simply connected set. This shows (g). (h) First we show that if α = (α1 , α2 ) ∈ Su , then ∀ > 0 we have α = (α1 + , α2 ) and u(r0 , α ) ≥ 0 for some r0 = r0 () > 0, that is, α ∈ / Su ∀ > 0 by Lemma 4.1. Suppose there exists 0 > 0 such that u(r, α0 ) < 0 ∀r > 0. By Lemma 2.3, we have u(r, α) < u(r, α ) < u(r, α0 ) < 0 ∀r ∈ (0, ∞). v(r, α0 ) < v(r, α ) < v(r, α) < 0
(5.6)
From (5.6) and α ∈ Su , we obtain u(r, α0 ) is bounded below for large r . Then, by (ii) of Lemma 4.1, we have u (r, α0 ) > 0 ∀r > 0, and hence lim u(r, α0 ) = −cu ≤ 0. r →∞
Furthermore, by Lemma 4.2, we get lim r u (r, α) = lim r u (r, α0 ) = 0. Combining r →∞ r →∞ these facts, we attain β1 (α) = 2N1 = β1 (α0 ). But, from (5.6) we have β1 (α) > β1 (α0 ). This contradiction proves our assertion and thus (h) holds. (i) The proof is similar to (h). By above claims (b)–(i) and the existence of the topological solution, we finally obtain (C) and the proof is complete. In the following, we let u and v be defined in Sect. 1. Then the corresponding (β1 , β2 ) of each solution can be classified as follows. Proposition 5.2. Let (u, v) be an entire solution of (1.13)–(1.14) and the corresponding (β1 , β2 ) be defined in (1.20). Then the following statements are valid: (a) β1 is continuous w.r.t α1 and α2 for all (α1 , α2 ) ∈ N T ∪ u . Similarly, β2 is continuous w.r.t α1 and α2 for all (α1 , α2 ) ∈ N T ∪ v . (b) If (u, v) is a topological solution, then (β1 , β2 ) = (2N1 , 2N2 ). (c) If (u, v) is a non-topological solution, then the respective (β1 , β2 ) satisfies (β1 − 2(N1 + 1))(β2 − 2(N2 + 1)) > 4(N1 + 1)(N2 + 1). (d) For any α ∈ Su (resp. α ∈ Sv ), (u, v) is a Type (IV) solution with (β1 , β2 ) = (2N1 , ∞) (resp. (β1 , β2 ) = (∞, 2N2 )). (e) For any α ∈ u (resp. α ∈ v ), (u, v) is a Type (III) solution with 2N1 < β1 ≤ 2N1 + 2 and β2 = ∞ (resp. β1 = ∞ and 2N2 < β2 ≤ 2N2 + 2). Proof. (a) We prove the case of β1 . The case of β2 is similar. Let D = N T u , α0 = (α10 , α20 ) ∈ D. First, we want to show that D = N T u is open. Because of β1 (α0 ) < ∞, we obtain that lim r v (r ; α0 ) < −2 − ε
r →∞
for some $\varepsilon > 0$. By continuity and (1.13), there exist $r_0, \delta > 0$ such that for all $|\alpha - \alpha_0| < \delta$,
$$r v'(r;\alpha) < -2 - \frac{\varepsilon}{2} \quad \text{on } [r_0, \infty),$$
which imply β1 (α) < ∞, that is, α ∈ D. Thus the set D is open. Now if δn > 0 and δn → 0 as n → ∞ such that α1n = α10 +δn , (α1n , α20 ) ∈ D and β1,n = β1 (α1n , α20 ) ∀n. We want to prove β1n → β1 (α0 ) as n → ∞. By monotone property, Lemma 2.3, and the continuity of (u, v) w.r.t. the initial data, we obtain that u(r, α1n , α20 ) u(r, α0 ) pointwise in r as n → ∞. (5.7) v(r, α1n , α20 ) v(r, α0 ) By (5.7), the definition of β1 and the monotone convergence theorem, we obtain β1n → β1 (α0 ) as n → ∞. The case of δn < 0 is similar. So β1 is continuous w.r.t. α1 . By using the same arguments, we get β1 is continuous w.r.t. α2 . This proves (a). (b) By (i) of Lemma 2.2 we obtain the result. (c) Since (u, v) is a non-topological solution of (1.13), the respective β1 < ∞ and β2 < ∞. Hence there exists a sequence {r j } such that r j → ∞ and r 2j eu(r j ) (1 − ev(r j ) ) → 0, r 2j ev(r j ) (1 − eu(r j ) ) → 0 as j → ∞. By the Pohozaev identity, Lemma 2.1, we easily obtain
$$r u'(r)\cdot r v'(r) - 2\int_0^r e^{v(s)}(1-e^{u(s)})\,s\,ds - 2\int_0^r e^{u(s)}(1-e^{v(s)})\,s\,ds + r^2 e^{u(r)}(1-e^{v(r)}) + r^2 e^{v(r)}(1-e^{u(r)}) = 4N_1N_2 + 6\int_0^r s\,e^{u(s)+v(s)}\,ds \quad \forall r \in (0,\infty). \tag{5.8}$$
Taking $r = r_j$ on both sides of (5.8) and then letting $j \to \infty$, we have
$$(2N_1-\beta_1)(2N_2-\beta_2) - 2\beta_1 - 2\beta_2 = 4N_1N_2 + 6\int_0^\infty r\,e^{u(r)+v(r)}\,dr,$$
which implies
$$(\beta_1 - 2(N_1+1))(\beta_2 - 2(N_2+1)) = 4(N_1+1)(N_2+1) + 6\int_0^\infty r\,e^{u(r)+v(r)}\,dr.$$
This proves (c). (d) We prove the case of Su . The proof for Sv is similar. Let α ∈ Su and (u(r ), v(r )) = (u(r, α), v(r, α)). By Lemma 4.2 we have lim r u (r ) = 0 = 2N1 − β1 (α). This r →∞ proves (d). (e) We prove the case of u . The proof for v is similar. Let α ∈ u and (u(r ), v(r )) = (u(r, α), v(r, α)). By (1.21), (1.22) and the definition of u , we have lim r u (r ; α) < 0 r →∞
and lim r u (r ) = 2N1 − β1 (α). Hence β1 (α) > 2N1 . Now we claim β1 (α) ≤ 2N1 + r →∞ 2 ∀α ∈ u . Suppose not, then there exists α ∈ u such that β1 (α) > 2N1 + 2 and ∞ β2 (α) = ∞. Then we obtain lim r u (r ) = 2N1 − β1 (α) < −2, and thus 0 r eu dr < r →∞ ∞. From this we deduce
$$\infty > \int_0^\infty r e^{u}\,dr > \int_0^\infty r e^{u}(1-e^{v})\,dr = \beta_2(\alpha) = \infty.$$
This contradiction proves (e) and the proof is complete.
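The step in part (c) from the limiting Pohozaev relation to the product identity is a short expansion. The following sketch spells it out; the abbreviation $I$ for the integral is introduced here for convenience and is not notation from the paper.

```latex
% Abbreviation (ours, not the paper's): I := \int_0^\infty r\,e^{u(r)+v(r)}\,dr > 0.
\begin{align*}
(2N_1-\beta_1)(2N_2-\beta_2) - 2\beta_1 - 2\beta_2 &= 4N_1N_2 + 6I \\
% expand the product and cancel 4 N_1 N_2 on both sides
\beta_1\beta_2 - 2N_2\beta_1 - 2N_1\beta_2 - 2\beta_1 - 2\beta_2 &= 6I \\
% collect the terms linear in \beta_1 and in \beta_2
\beta_1\beta_2 - 2(N_2+1)\beta_1 - 2(N_1+1)\beta_2 &= 6I \\
% add 4(N_1+1)(N_2+1) to both sides and factor the left-hand side
\bigl(\beta_1 - 2(N_1+1)\bigr)\bigl(\beta_2 - 2(N_2+1)\bigr) &= 4(N_1+1)(N_2+1) + 6I .
\end{align*}
```

Since the integrand $r\,e^{u+v}$ is positive, $I > 0$, which gives the strict inequality stated in Proposition 5.2(c).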
The following results describe the existence and properties of Type (V) solution.
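Proposition 5.3. $W_u$ and $W_v$ are open subsets of $\mathbb{R}^2$. Furthermore, the following statements are valid:
(i) For each $(\theta, \eta) \in S_u$, there exists $\epsilon > 0$ such that $(\alpha_1, \eta) \in W_u$ for all $\theta < \alpha_1 < \theta + \epsilon$ and
$$\begin{cases} \lim_{r\to\infty} (u(r) - \lambda \log r) = c_u, \\ \lim_{r\to\infty} \dfrac{v(r)}{r^{2+\lambda}} = -\dfrac{e^{c_u}}{(2+\lambda)^2}, \\ \beta_1 = 2N_1 - \lambda, \quad \beta_2 = \infty, \end{cases} \tag{5.9}$$
where $c_u$ and $\lambda = \lambda(\alpha_1, \eta) > 0$ are constants.
(ii) For each $(\mu, \nu) \in S_v$, there exists $\delta > 0$ such that $(\mu, \alpha_2) \in W_v$ for all $\nu < \alpha_2 < \nu + \delta$ and
$$\begin{cases} \lim_{r\to\infty} \dfrac{u(r)}{r^{2+\gamma}} = -\dfrac{e^{c_v}}{(2+\gamma)^2}, \\ \lim_{r\to\infty} (v(r) - \gamma \log r) = c_v, \\ \beta_1 = \infty, \quad \beta_2 = 2N_2 - \gamma, \end{cases} \tag{5.10}$$
where $c_v$ and $\gamma = \gamma(\mu, \alpha_2) > 0$ are constants.
(iii) $W_u$ and $W_v$ are both nonempty.
In order to prove Proposition 5.3, we need the following lemmas.
Lemma 5.1. Suppose $(u(r), v(r))$ is a radial solution satisfying $u(r_0) > 0$, $v(r_0) < 0$ and $v'(r_0) < 0$ (resp. $v(r_0) > 0$, $u(r_0) < 0$ and $u'(r_0) < 0$). Then $(u(r), v(r))$ is an entire solution and $\lim_{r\to\infty} u(r) = \infty$ and $\lim_{r\to\infty} v(r) = -\infty$ (resp. $\lim_{r\to\infty} v(r) = \infty$ and $\lim_{r\to\infty} u(r) = -\infty$).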
Proof. Suppose $(u, v)$ is not an entire solution. Then there exists $R_0 > 0$ such that $u(r) \to \infty$ as $r \to R_0^-$. Then we claim that
(a) $\lim_{r\to R_0^-} r v'(r) = \lim_{r\to R_0^-} v(r) = -\infty$.
(b) $(u + v)$ is bounded above on $[R_1, R_0]$ for some $0 < R_1 < R_0$.
Since $v(r_0) < 0$ and $v'(r_0) < 0$, we easily have $v'(r) < 0$ for all $r \in [r_0, R_0)$. Then we obtain
$$r v'(r) = r_0 v'(r_0) + \int_{r_0}^r s e^{u}(e^{v} - 1)\,ds \le r_0 v'(r_0) + (e^{v(r_0)} - 1)\int_{r_0}^r s e^{u}\,ds \to -\infty,$$
$$\begin{aligned} v(r) &= v(r_0) + r_0 v'(r_0)\ln\frac{r}{r_0} + \int_{r_0}^r s \ln\frac{r}{s}\, e^{u}(e^{v} - 1)\,ds \\ &\le v(r_0) + r_0 v'(r_0)\ln\frac{r}{r_0} + (e^{v(r_0)} - 1)\int_{r_0}^r s \ln\frac{r}{s}\, e^{v}(e^{u} - 1)\,ds \\ &= v(r_0) + r_0 v'(r_0)\ln\frac{r}{r_0} + (e^{v(r_0)} - 1)\Bigl(u(r) - u(r_0) - r_0 u'(r_0)\ln\frac{r}{r_0}\Bigr) \to -\infty \end{aligned}$$
as $r \to R_0^-$, since $\lim_{r\to R_0^-} u(r) = \lim_{r\to R_0^-} r u'(r) = \infty$. This proves (a).
By (a), there exists $0 < R_1 < R_0$ such that $2e^{v(r)} - 1 \le -\frac{1}{2}$ for all $r \in (R_1, R_0)$. By (1.13) we easily have
$$r(u'(r) + v'(r)) = R_1(u'(R_1) + v'(R_1)) + \int_{R_1}^r s e^{u}(2e^{v} - 1 - e^{v-u})\,ds \le C - \frac{1}{2}\int_{R_1}^r s e^{u}\,ds < 0 \quad \text{for } r \text{ close to } R_0^-.$$
This proves $(u + v)'(r) < 0$ for $r$ near $R_0^-$ and thus (b) follows. Now by (a)-(b) we deduce
$$\infty = \lim_{r\to R_0^-} r u'(r) = R_1 u'(R_1) + \int_{R_1}^{R_0} s e^{v}(e^{u} - 1)\,ds < \infty.$$
This contradiction proves the first result. From Lemma 4.1, we obtain the second result and the proof is complete.
The following lemma describes the asymptotic behavior of Type (V) solutions at infinity.
Lemma 5.2. Let $(u, v)$ be an entire solution of (1.13)–(1.14) on $(0, \infty)$. If $u(r_0) \ge 0$ for some $r_0 > 0$, then
(a) $\lim_{r\to\infty} r u'(r) = \lambda$ and $\lim_{r\to\infty} (u(r) - \lambda \log r) = c_u$ for some constants $\lambda > 0$ and $c_u$.
(b) $r^p e^{u(r)+v(r)} \in L^1(0, \infty)$ for any $p \ge 0$.
(c) $\lim_{r\to\infty} \dfrac{r v'(r)}{r^{2+\lambda}} = -\dfrac{e^{c_u}}{2+\lambda}$ and $\lim_{r\to\infty} \dfrac{v(r)}{r^{2+\lambda}} = -\dfrac{e^{c_u}}{(2+\lambda)^2}$, where $\lambda$ and $c_u$ are the constants in (a).
Proof. (a) By Lemma 4.1, we see that
$$u(r) \ge C \quad \text{and} \quad 1 - e^{v(r)} \ge \frac{1}{2} \quad \text{on } [r_0, \infty)$$
for some $C, r_0 > 0$, and then
$$\lim_{r\to\infty} r v'(r) = -\infty \quad \text{and} \quad v(r) \le -Cr^2 \quad \forall r \ge r_0.$$
From (4.1) we obtain
$$(r u'(r))' > 0 \text{ on } [r_0, \infty) \quad \text{and} \quad \lim_{r\to\infty} r u'(r) = \lambda, \quad 0 < \lambda \le \infty. \tag{5.11}$$
To complete this proof, we need the following fact.
Claim. $\lambda < \infty$.
Proof of Claim. Suppose $\lambda = \infty$. Then, using (1.13)–(1.14), we obtain
$$\int_0^\infty r e^{u(r)+v(r)}\,dr \ge \lim_{r\to\infty} r u'(r) - 2N_1 = \infty, \tag{5.12}$$
and by $\lim_{r\to\infty} r v'(r) = -\infty$,
$$\lim_{r\to\infty} \frac{r u'(r)}{r v'(r)} = \lim_{r\to\infty} \frac{r e^{v(r)}(e^{u(r)} - 1)}{r e^{u(r)}(e^{v(r)} - 1)} = \lim_{r\to\infty} \frac{1 - e^{-u(r)}}{1 - e^{-v(r)}} = 0. \tag{5.13}$$
By (5.13) we easily have, for any $p > 0$,
$$(r^{2+p} e^{u+v})' = r^{1+p} e^{u+v}\,(2 + p + r u'(r) + r v'(r)) < 0 \quad \text{for large } r. \tag{5.14}$$
From (5.14) we see that $r^{2+p} e^{u+v}$ is bounded from above by a positive constant and hence $r e^{u+v} \le C r^{-(1+p)}$ for all large $r$ and $\int_0^\infty r e^{u(r)+v(r)}\,dr < \infty$. This contradicts (5.12) and thus $\lambda < \infty$.
Next, we show the asymptotic behavior of $u$ at $r = \infty$. Let $y(r) = u(r) - \lambda \log r$. Then, by (5.11), we have $\lim_{r\to\infty} r y'(r) = 0$ and
$$\lim_{r\to\infty} \frac{r y'(r)}{r^{-p}} = \lim_{r\to\infty} \frac{r e^{v}(e^{u} - 1)}{-p\,r^{-p-1}} = 0 \quad \text{for any } p > 0. \tag{5.15}$$
Since $\lim_{r\to\infty} r y'(r) = 0$ and $(r y')' = r e^{v}(e^{u} - 1) > 0$ on $[r_0, \infty)$, we obtain that $y'(r) < 0$ on $[r_0, \infty)$. From (5.15) and $y'(r) < 0$ on $[r_0, \infty)$, we easily obtain $y'(r) > -C_1 r^{-2}$ and $y(r) > C_2$ for large $r$ for some $C_2 \in \mathbb{R}$, and thus $\lim_{r\to\infty} y(r) = c_u$ for some $c_u \in \mathbb{R}$. This shows the results of (a).
(b) Now, by Lemma 4.1 we easily obtain $v(r) \le -Cr^2$ for all $r \ge R$ for some constants $C > 0$ and $R > 0$. From this inequality and the claim in the proof of (a), we have that $r^p e^{u(r)+v(r)} < C r^{-2}$ for all large $r > 0$ for any $p > 0$. Thus the result (b) is valid.
(c) By Lemma 4.1 and Eq. (1.13) we have $\lim_{r\to\infty} v(r) = -\infty$ and
$$\lim_{r\to\infty} \frac{v(r)}{r^{2+\lambda}} = \lim_{r\to\infty} \frac{r v'(r)}{(2+\lambda)\,r^{2+\lambda}} = \lim_{r\to\infty} \frac{r e^{u}(e^{v} - 1)}{(2+\lambda)^2\,r^{1+\lambda}} = \frac{1}{(2+\lambda)^2} \cdot \lim_{r\to\infty} \frac{e^{u}}{r^{\lambda}} \cdot \lim_{r\to\infty} (e^{v} - 1) = \frac{1}{(2+\lambda)^2} \cdot e^{c_u} \cdot (-1).$$
Thus (c) is true and the proof is complete.
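The integrability claimed in part (b) can be made explicit by combining the limit in (a) with the quadratic decay of $v$. Below is a minimal sketch for the tail of the integral; the constants $C'$, $c$ and the radius $R$ are illustrative names chosen here and do not appear in the paper.

```latex
% Assumption from (a): u(r) - \lambda\log r \to c_u, hence e^{u(r)} \le C' r^{\lambda} for r \ge R.
% Assumption from Lemma 4.1 (as quoted in the proof): v(r) \le -c\,r^{2} for r \ge R, with c > 0.
\begin{align*}
r^{p}\, e^{u(r)+v(r)} &\le C'\, r^{p+\lambda}\, e^{-c r^{2}} \qquad (r \ge R), \\
\int_R^\infty r^{p}\, e^{u(r)+v(r)}\,dr &\le C' \int_R^\infty r^{p+\lambda}\, e^{-c r^{2}}\,dr < \infty
\qquad \text{for every } p \ge 0 .
\end{align*}
```

The Gaussian factor dominates any power of $r$, which is also why the cruder bound $r^p e^{u+v} < C r^{-2}$ used in the proof holds for all large $r$. The sketch does not address the behavior near $r = 0$.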
Remark 5.2. If we replace $u$ by $v$ in the condition of Lemma 5.2, we obtain the following corresponding results. The proof is similar.
Lemma 5.3. Let $(u, v)$ be an entire solution of (1.13)–(1.14) on $(0, \infty)$. If $v(r_0) \ge 0$ for some $r_0 > 0$, then
(a) $\lim_{r\to\infty} r v'(r) = \eta$ and $\lim_{r\to\infty} (v(r) - \eta \log r) = c_v$ for some constants $\eta > 0$ and $c_v$.
(b) $r^p e^{u(r)+v(r)} \in L^1(0, \infty)$ for any $p \ge 0$.
(c) $\lim_{r\to\infty} \dfrac{r u'(r)}{r^{2+\eta}} = -\dfrac{e^{c_v}}{2+\eta}$ and $\lim_{r\to\infty} \dfrac{u(r)}{r^{2+\eta}} = -\dfrac{e^{c_v}}{(2+\eta)^2}$, where $\eta$ and $c_v$ are the constants in (a) above.
Now we are in the position to prove Proposition 5.3.
Proof of Proposition 5.3. We prove the case of $(W_u, S_u)$. The case of $(W_v, S_v)$ is similar. We divide the proof into the following steps.
Step 1. $W_u$ is open. Let $\bar\alpha \in W_u$. Then there exists $r_0 > 0$ such that $u(r_0, \bar\alpha) > 0$, $v(r_0, \bar\alpha) < 0$ and $v'(r_0, \bar\alpha) < 0$. By the continuity of $(u, v)$, $(u', v')$ w.r.t. $\alpha$, there exists $\delta > 0$ such that
$$u(r_0, \alpha) > 0, \quad v(r_0, \alpha) < 0 \quad \text{and} \quad v'(r_0, \alpha) < 0 \quad \forall \alpha \in B_\delta(\bar\alpha). \tag{5.16}$$
By (5.16) and Lemma 5.1, we obtain $(u(r, \alpha), v(r, \alpha)) \to (\infty, -\infty)$ as $r \to \infty$ for all $\alpha \in B_\delta(\bar\alpha)$. This proves $B_\delta(\bar\alpha) \subset W_u$ and hence $W_u$ is open.
Step 2. Let $\tilde\alpha = (\theta, \eta) \in S_u$. Then there exists $r_0 > 0$ such that $v(r_0, \tilde\alpha) < 0$ and $v'(r_0, \tilde\alpha) < 0$. By continuity, there exists $\epsilon > 0$ such that $v(r_0, \alpha) < 0$ and $v'(r_0, \alpha) < 0$ for all $\alpha = (\alpha_1, \eta)$ with $\theta < \alpha_1 < \theta + \epsilon$. By (h) of Step 4 in the proof of Proposition 5.1 and (1.13), we have $u(r_1, \alpha) > 0$ and $v'(r_1, \alpha) < 0$ for some $r_1 > r_0$. From Lemma 5.1, we obtain that $(\alpha_1, \eta) \in W_u$ for all $\theta < \alpha_1 < \theta + \epsilon$. By Lemma 5.2, we also obtain the asymptotic behavior of $(u, v)$ at $\infty$ and the corresponding $(\beta_1, \beta_2)$, which satisfies
$$\beta_1 = 2N_1 - \lim_{r\to\infty} r u'(r) = 2N_1 - \lambda \quad \text{and} \quad \beta_2 = 2N_2 - \lim_{r\to\infty} r v'(r) = \infty.$$
These prove (i) and (ii). Step 3. Wu and Wv are all nonempty. Since, by (C) of Proposition 5.1, Su and Sv are all nonempty, we get Wu and Wv are also all nonempty from (i)-(ii). This completes the proof. Now, we give the proof of Theorems 1.3 and 1.2 in the following. Proof of Theorem 1.3. We divide the proof into the following steps. Step 1. By (C) of Proposition 5.1, (c) of Proposition 5.2 and (i)-(ii) of Proposition 5.3, we obtain (i), (iii) and (iv)-(v) respectively. Step 2. First we claim the following statements. (a) For each α ∈ (Sv v ) (resp. α ∈ (Su u )), there does not exist {αk } ⊂ u (resp. {αk } ⊂ v ) such that αk → α as k → ∞. (b) For each α = (α1 , α2 ) ∈ Sv (resp. α ∈ Su ), there does not exist k 0 with {αk = (α1 + k , α2 )} ⊂ N T (resp. {αk = (α1 − k , α2 )} ⊂ N T ) such that αk → α as k → ∞. (c) For each α = (α1 , α2 ) ∈ Sv (resp. α ∈ Su ), there does not exist k 0 with {αk = (α1 , α2 − k )} ⊂ N T (resp. {αk = (α1 , α2 + k )} ⊂ N T ) such that αk → α as i → ∞. (a) We prove the case of Sv v . The case of Su u is based on the same arguments. Suppose there exists {αk } ⊂ u such that αk → α as k → ∞ for some α ∈ (Sv v ). Then, by (d)-(e) of Proposition 5.2, we have β1 (α) = ∞ and β1 (αk ) ≤ 2N1 + 2 ∀k. Denote (u(r ), v(r )) = (u(r, α), v(r, α)) and (u k (r ), vk (r )) = (u(r, αk ), v(r, R αk )) ∀k. Then, by the definition of β1 , we obtain that there exists R0 > 0 such that 0 0 r ev (1 − eu )dr > 2N1 + 2, and
$$2N_1 + 2 \ge \lim_{k\to\infty} \int_0^{R_0} r e^{v_k}(1 - e^{u_k})\,dr = \int_0^{R_0} r e^{v}(1 - e^{u})\,dr \ \ \text{(by the Bounded Convergence Theorem)} \ > 2N_1 + 2.$$
Fig. 1. Structure of all entire solutions
This contradiction proves (a). (b) We prove the case of Sv . The case of Su is similar. Let α = (α1 , α2 ) ∈ Sv . Suppose there exists k 0 with {αk = (α1 + k , α2 )} ⊂ N T such that αk → α as k → ∞. Then, by (c)–(d) of 5.2, we have β2 (α) = 2N2 and β2 (αk ) > 2N2 + 2 ∀k. Then by the continuity of β2 w.r.t. α1 , i.e., (a) of Proposition 5.2, we obtain β2 (α) = lim β2 (αk ) ≥ 2N2 + 2. This contradiction shows (b). r →∞ (c) The proof is similar to (b). We omit the details. Since Su and Sv are all nonempty, by (a)-(c) above we obtain that ∀α = (α1 , α2 ) ∈ Sv and ∀α¯ = (α¯ 1 , α¯ 2 ) ∈ Su , there respectively exists δ1 = δ1 (α) > 0 and δ2 = δ2 (α) ¯ >0 such that (α1 + δ, α2 ), (α1 , α2 − δ) ∈ v ∀0 < δ < δ1 (α¯ 1 − δ, α¯ 2 ), (α¯ 1 , α¯ 2 + δ) ∈ u ∀0 < δ < δ2 .
(5.17)
By (5.17) we deduce both u and v are nonempty and connected. Now, by (C) of Proposition 5.1 and (a) above, we obtain N T = ∅. By Proposition 5.2, is simple and open connected. From Lemma 2.3, we also have u , N T and v are all simple. Suppose N T is not open. Then, by (5.17) and Lemma 2.3, w.l.o.g., there exists {αi = (αi1 , αi2 )} ⊂ v such that αi → α¯ as i → ∞ for some α¯ = (α¯ 1 , α¯ 2 ) ∈ N T . Let αi = (α¯ 1 , αi2 ). Then αi → α¯ and { αi } ⊂ v . By (a),(c) and (e) of Proposition 5.2, we finally obtain αi ) = β2 (α) ¯ > 2N2 + 2. 2N2 + 2 ≥ lim β2 ( i→∞
This contradiction proves that
N T is open. Now, suppose Z = ∂ N T ∂ = ∅, then there exist α ∈ u and a sequence {αi } ⊂ v such that αi → α as i → ∞. This contradicts (a). Hence Z = ∅ and Z = T . Furthermore, by (a) we also get N T is connected. The proof is complete. Proof of Theorem 1.2. Let (u, v) be a radial solution of (1.7). Then, by Propositions 4.1 and 5.2, we obtain that (u, v) must be one of the Types (I)–(V). Conversely, by Theorem 1.1, the Type (I) solution, i.e., topological solution, exists and is unique. Then,
by (C) of Proposition 5.1, we have that both ∂ and are nonempty. Thus Types (II)–(IV) solutions all exist due to Proposition 5.1 and Theorem 1.3. In particular, the non-topological solution exists. Furthermore, by Proposition 5.3, the Type (V) solution exists. We complete the proof. Remark 5.3. Combining the results of Theorems 1.1, 1.2 and 1.3, we can sketch the structure of entire solutions as in Fig. 1. Acknowledgement. The authors would like to express their gratitude to the referee for valuable comments and suggestions.
Communicated by M. Aizenman
Commun. Math. Phys. 296, 353–403 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1022-y
Communications in
Mathematical Physics
Linear Perturbations of Quaternionic Metrics
Sergei Alexandrov1, Boris Pioline2,3, Frank Saueressig4, Stefan Vandoren5
1 Laboratoire de Physique Théorique & Astroparticules (Unité mixte de recherche du CNRS UMR 5207), Université Montpellier II, 34095 Montpellier Cedex 05, France. E-mail: [email protected]
2 Laboratoire de Physique Théorique et Hautes Energies (Unité mixte de recherche du CNRS UMR 7589), Université Pierre et Marie Curie, 4 place Jussieu, 75252 Paris cedex 05, France. E-mail: [email protected]
3 Laboratoire de Physique Théorique de l’Ecole Normale Supérieure (Unité mixte de recherche du CNRS UMR 8549), 24 rue Lhomond, 75231 Paris cedex 05, France
4 Institut de Physique Théorique (Unité de recherche associée au CNRS URA 2306), CEA, F-91191 Gif-sur-Yvette, France. E-mail: [email protected]
5 Institute for Theoretical Physics and Spinoza Institute, Utrecht University, Leuvenlaan 4, 3508 TD Utrecht, The Netherlands. E-mail: [email protected]
Received: 25 November 2008 / Accepted: 18 December 2009 Published online: 25 February 2010 – © The Author(s) 2010. This article is published with open access at Springerlink.com
Abstract: We extend the twistor methods developed in our earlier work on linear deformations of hyperkähler manifolds [1] to the case of quaternionic-Kähler manifolds. Via Swann’s construction, deformations of a 4d-dimensional quaternionic-Kähler manifold $M$ are in one-to-one correspondence with deformations of its 4d + 4-dimensional hyperkähler cone $S$. The latter can be encoded in variations of the complex symplectomorphisms which relate different locally flat patches of the twistor space $Z_S$, with a suitable homogeneity condition that ensures that the hyperkähler cone property is preserved. Equivalently, we show that the deformations of $M$ can be encoded in variations of the complex contact transformations which relate different locally flat patches of the twistor space $Z_M$ of $M$, by-passing the Swann bundle and its twistor space. We specialize these general results to the case of quaternionic-Kähler metrics with d + 1 commuting isometries, obtainable by the Legendre transform method, and linear deformations thereof. We illustrate our methods for the hypermultiplet moduli space in string theory compactifications at tree- and one-loop level.
Contents
1. Introduction
2. Quaternionic-Kähler Geometry and Twistors
2.1 Bottom-up: from QK to HKC
2.2 Top down: from HKC to QK
2.3 Patchwork construction of twistor spaces of HK manifolds - a summary
2.4 Conditions for superconformal invariance
2.5 Homogeneous symplectic vs. contact geometry
3. Quaternionic Geometry with Commuting Isometries
3.1 Tri-holomorphic isometries and superconformal invariance
3.2 Superconformal quotient
3.3 Contact twistor lines
4. The Perturbative Hypermultiplet Moduli Space
4.1 Tree-level geometry
4.2 One-loop correction
4.3 Superconformal quotient
5. Linear Deformations of O(2) Quaternionic-Kähler Spaces
5.1 Linear deformations of O(2) hyperkähler cones
5.2 Perturbed contact twistor lines
A. Infinitesimal SU(2) Transformations
B. An alternative Formulation for Hypermultiplet Moduli Spaces
C. Deformed Superconformal Quotient
References
1. Introduction
Quaternionic-Kähler (QK) manifolds play an important role in string and supergravity theories, primarily because the hypermultiplet moduli spaces appearing in string theory backgrounds with 8 supercharges fall into this class. In this work, we study general aspects of QK manifolds and of their twistor spaces, and provide a general formalism for describing linear perturbations of 4d-dimensional QK manifolds with d + 1 commuting isometries. For this purpose we build on our previous study of linear deformations of hyperkähler (HK) manifolds obtainable by the Legendre transform method [1].
A key fact for the present study is the (local) one-to-one correspondence between 4d-dimensional QK manifolds $M$ and 4d + 4-dimensional “hyperkähler cones” (HKC) $S$, i.e. 4d + 4-dimensional HK manifolds with a homothetic Killing vector and an isometric SU(2) action rotating the three complex structures (see Fig. 1 for orientation). In particular, Swann’s construction produces $S$ as a $\mathbb{C}^2/\mathbb{Z}_2$ bundle over $M$, twisted by the SU(2) part of the spin connection on $M$ [2]. The converse relation goes under the name of “superconformal quotient” in the physics literature [3,4]. Moreover, any isometry of $M$ can be lifted to a tri-holomorphic isometry of $S$, see e.g. [5,6].
Therefore, the formalism of [1] is directly applicable to the Swann bundle $S$, with a suitable restriction to ensure the hyperkähler cone (or “superconformal invariance”) property. For this purpose, one introduces the twistor space $Z_S = S \times \mathbb{CP}^1$ of the HK manifold $S$, an open covering $\hat U_i$ of $Z_S$ projecting to open disks $U_i$ on $\mathbb{CP}^1$, and a local Darboux coordinate system $(\nu^I_{[i]}, \mu^{[i]}_I)$ ($I = \flat, 0, \dots, d-1$) for the O(2)-twisted complex symplectic structure $\Omega^{[i]} = d\mu^{[i]}_I \wedge d\nu^I_{[i]}$ on $\hat U_i$. Since$^1$ $\Omega^{[i]} = f_{ij}^2\, \Omega^{[j]} \mod d\zeta$ on the overlap $\hat U_i \cap \hat U_j$, the coordinate systems $(\nu^I_{[i]}, \mu^{[i]}_I)$ and $(\nu^I_{[j]}, \mu^{[j]}_I)$ must be related by a symplectomorphism on $\hat U_i \cap \hat U_j$; the latter can be parametrized in the usual way by a generating function $S^{[ij]}(\nu^I_{[i]}, \mu^{[j]}_I, \zeta)$. The set of all $S^{[ij]}$, subject to consistency relations, reality conditions and gauge equivalence, encodes the complex symplectic structure on $Z_S$, and therefore the HK metric on $S$.
1 Here $f_{ij}$ are the transition functions of the O(1) bundle on $\mathbb{CP}^1$ with coordinate $\zeta$.
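To illustrate what it means for the gluing map to be parametrized by a generating function, the following sketch works at fixed $\zeta$. The placement of the factor $f_{ij}^2$ is an assumption made here so that the twisted gluing condition for the symplectic form comes out; the normalization actually used in Sect. 2.3 of the paper may differ.

```latex
% Assumed convention (hypothetical normalization): S^{[ij]} depends on \nu^I_{[i]} and \mu^{[j]}_I,
% and f_{ij} depends on \zeta only, so that d(f_{ij}^2) vanishes mod d\zeta.
\begin{align*}
\mu^{[i]}_I &= \frac{\partial S^{[ij]}}{\partial \nu^I_{[i]}}, &
f_{ij}^{2}\,\nu^I_{[j]} &= \frac{\partial S^{[ij]}}{\partial \mu^{[j]}_I}, \\
dS^{[ij]} &= \mu^{[i]}_I\, d\nu^I_{[i]} + f_{ij}^{2}\,\nu^I_{[j]}\, d\mu^{[j]}_I
  \quad (\mathrm{mod}\ d\zeta), \\
0 = d\bigl(dS^{[ij]}\bigr) &= d\mu^{[i]}_I \wedge d\nu^I_{[i]}
  - f_{ij}^{2}\, d\mu^{[j]}_I \wedge d\nu^I_{[j]} \quad (\mathrm{mod}\ d\zeta).
\end{align*}
```

The last line reproduces $\Omega^{[i]} = f_{ij}^2\, \Omega^{[j]} \mod d\zeta$, so under these assumptions any sufficiently regular function of the mixed variables defines an admissible gluing.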
Fig. 1. Summary of various coordinate systems on the QK space M, its twistor space ZM , its Swann bundle S and the twistor space of the Swann bundle ZS
As we show in Sect. 2.4 below, superconformal invariance restricts $S^{[ij]}$ to be a function of $f_{ij}^{-2n} \nu^I_{[i]}$ and $\mu^{[i]}_I$ only, with no further $\zeta$ dependence, and to be homogeneous of degree one when $(\nu^I_{[i]}, \mu^{[i]}_I)$ are rescaled with weight $(n, 1-n)$, respectively$^2$. The integer $n$ characterizes the transformation rules of the local coordinates under both dilations and SU(2) rotations. For $n = 1$, the relevant case for QK manifolds with isometries, the O(0) sections $\mu^{[i]}_I$ may acquire anomalous scaling dimensions, and the homogeneity condition may be relaxed into a “quasi-homogeneity” property, as explained further in Sect. 2.4.
2 See [7–9] for other discussions of superconformal invariance in projective superspace and [10,11] for an analysis in components.
Deformations of $S$ correspond to variations $H^{[ij]}_{(1)}(\nu^I_{[i]}, \mu^{[i]}_I, \zeta)$ of the generating functions $S^{[ij]}$, subject to the consistency, reality and quasi-homogeneity conditions and gauge equivalence. When $S$ is obtainable by the Legendre transform method, which is the case when $M$ admits d + 1 commuting isometries, the deformed twistor lines and hyperkähler potential are easily computed to first order in the perturbation. The deformed QK metric may in principle be obtained by the standard superconformal quotient procedure. In Appendix C, we construct a natural set of coordinates on the deformed QK manifold, but stop short of writing the deformed metric explicitly, as the expressions would be too cumbersome.
While the strategy outlined above is conceptually straightforward, it is rather impractical. As we explain in detail in Sect. 2.5, one may by-pass the twistor space of the Swann bundle $Z_S$, and work directly with the twistor space $Z_M$ of the QK manifold $M$, as emphasized in particular by Salamon and Lebrun [12,14–16]. While $Z_S$ carries a complex O(2)-valued symplectic structure and a HK metric degenerate along the $\mathbb{CP}^1$ fiber, $Z_M$ carries a complex O(2)-valued contact structure and a non-degenerate Kähler-Einstein metric$^3$ [12]. Conversely, Fano contact manifolds with a Kähler-Einstein metric are twistor spaces of QK manifolds [13,14]$^4$. Similarly to $Z_S$, the complex contact structure $X$ on $Z_M$ admits local Darboux coordinates $\xi^\Lambda_{[i]}, \tilde\xi^{[i]}_\Lambda, \alpha^{[i]}$ ($\Lambda = 0, \dots, d-1$) such that locally, the contact one-form takes the canonical form $X^{[i]} = d\alpha^{[i]} + \xi^\Lambda_{[i]}\, d\tilde\xi^{[i]}_\Lambda$. These Darboux coordinates on $Z_M$ are essentially the projectivization$^5$ of the complex Darboux coordinates $\nu^I_{[i]}, \mu^{[i]}_I$ on $Z_S$. More precisely, we show that the projectivized Darboux coordinates depend on the coordinates $(\zeta, \pi^1, \pi^2)$ on the $\mathbb{CP}^1 \times (\mathbb{C}^2/\mathbb{Z}_2)$ fiber over a given point on $M$ only through the ratio $z$ defined in (2.57) below. Together with the projectivization, this provides the desired reduction from $Z_S$ to $Z_M$. The homogeneous generating functions $S^{[ij]}(\nu^I_{[i]}, \mu^{[i]}_I, \zeta)$ of complex symplectomorphisms on $Z_S$ yield generating functions $\hat S^{[ij]}(\xi^\Lambda_{[i]}, \tilde\xi^{[i]}_\Lambda, \alpha^{[i]})$ of complex contact transformations on $Z_M$. Their deformations can be encoded in functions $\hat H^{[ij]}_{(1)}$ of the same variables, subject to consistency relations, reality conditions, and gauge equivalence. This recovers Lebrun’s assertion that the QK deformations of $M$ are classified by the Čech cohomology group $H^1(Z_M, O(2))$ [14]. The deformed QK metric on $M$ can then be extracted in a systematic way from the knowledge of the “contact twistor lines” (referred to as the “twistor map” in [17]), i.e. the complex coordinates $\xi^\Lambda_{[i]}, \tilde\xi^{[i]}_\Lambda, \alpha^{[i]}$ on $Z_M$ expressed as functions of the coordinates $z$ on $\mathbb{CP}^1$ and $x^\mu$ on the base $M$.
We end this introduction with an important remark. In string theory or supergravity, only QK manifolds with negative scalar curvature appear as a consequence of supersymmetry [18]. Such QK spaces are generically non-compact. The linear deformation theory set up in this paper is local and applies to both compact and non-compact manifolds. Possible obstructions to extend and integrate infinitesimal deformations into finite global deformations, however, depend strongly on the (non-)compactness. For instance, it is known that complete QK manifolds with positive scalar curvature admit no deformations, see e.g. [15,16].
In contrast, the hypermultiplet moduli spaces arising from string theory compactifications are in general deformed by quantum corrections, as explained e.g. in the introduction of [1] and to be discussed further in [19].
3 Moreover, in contrast to $Z_S$, the projection from $Z_M$ to $\mathbb{CP}^1$ is not holomorphic.
4 In fact, our local analysis seems to support Lebrun’s conjecture [14] that every Fano contact manifold is a twistor space.
5 The equivalence between contact structures and homogeneous symplectic structures is a standard trick in contact geometry, see e.g. [14] and references therein.
This paper is organized as follows.
• In Sect. 2, we review general aspects of QK manifolds, their twistor spaces, HKC and twistor spaces thereof, and study the consequences of superconformal invariance on the symplectomorphisms used in the patchwork construction of the complex symplectic structure. In particular, in Sect. 2.5, we explain in detail how the homogeneous complex symplectic structure on $Z_S$ reduces to a complex contact structure on $Z_M$, thus allowing to by-pass the Swann bundle and its twistor space.
• In Sect. 3 we specify the case when the 4d-dimensional QK space has d + 1 commuting isometries, i.e. when its Swann bundle is obtainable by the Legendre transform construction. We find the corresponding restriction on the symplectomorphisms, perform the superconformal quotient and obtain the contact twistor lines.
• In Sect. 4, we illustrate these methods on the example of the hypermultiplet moduli space in type II string theory, both at tree and one-loop level, and in the process strengthen the case for the absence of perturbative corrections beyond one-loop.
• Section 5 studies deformations of QK manifolds with d + 1 commuting isometries. We determine the allowed linear perturbations which preserve superconformal invariance, and find the deformed twistor lines and contact twistor lines. These results will be applied to the hypermultiplet moduli space of type II string theories in [19].
• In Appendix A, we spell out the SU(2) action on the various multiplets at the infinitesimal level. In Appendix B, we briefly discuss an alternative description of the hypermultiplet moduli space using a different choice of contour, and show that it is related to the one in Sect. 4 by a local symplectomorphism. In Appendix C we generalize the superconformal quotient of Sect. 3.2 to the perturbed case, and provide an independent check on the results of Sect. 5.2.
2. Quaternionic-Kähler Geometry and Twistors
In this section, we review the relation between quaternionic-Kähler (QK) manifolds and hyperkähler cones (HKC). This relation is one-to-one up to coverings (Theorem (5.9) in [2]), and can be established “bottom up”, by constructing the Swann bundle $S$ over the QK manifold $M$, or “top down”, by performing the superconformal quotient of $S$. These two constructions are summarized in Sects. 2.1 and 2.2 following [4,17]. In Sect. 2.3, we recall the patchwork construction of the twistor space $Z_S$ of the HK space $S$ developed in our previous work [1]. In Sect. 2.4 we derive the restrictions on the transition functions imposed by superconformal invariance. In Sect. 2.5, we study the reduction of the homogeneous complex symplectic structure on $Z_S$ to a complex contact structure on $Z_M$. The reader will find it helpful to refer to Fig. 1 for the various coordinate systems involved in these constructions.
2.1. Bottom-up: from QK to HKC.
A quaternionic-Kähler manifold $\mathcal{M}$ is a $4d$-dimensional manifold with Riemannian metric $g_\mathcal{M}$ and Levi-Civita connection $\nabla$ whose holonomy group is contained in $USp(d)\times SU(2)$ [12]. $\mathcal{M}$ admits a triplet of almost complex Hermitian structures $\vec J$ (defined up to $SU(2)$ rotations) satisfying the algebra of the unit imaginary quaternions. The quaternionic two-forms $\vec\omega_\mathcal{M}(X,Y) = g_\mathcal{M}(\vec J X, Y)$ are covariantly closed with respect to the $SU(2)$ part $\vec p$ of the Levi-Civita connection, and are proportional with a fixed coefficient $\nu$ to the curvature of $\vec p$,
\[ d\vec\omega_\mathcal{M} + \vec p\times\vec\omega_\mathcal{M} = 0, \qquad d\vec p + \frac12\,\vec p\times\vec p = \frac{\nu}{2}\,\vec\omega_\mathcal{M}, \tag{2.1} \]
where we use the notation $(\vec a\times\vec b)^i = \epsilon^{ijk}\,a^j\wedge b^k$. As a consequence, the metric on $\mathcal{M}$ is Einstein, with constant Ricci scalar curvature $R = 4d(d+2)\nu$. HK manifolds are degenerate limits of QK manifolds, where $\nu = 0$. We are mainly concerned in this work with the case of negative curvature, $\nu<0$. The Swann bundle $\mathcal{S}$ associated to $\mathcal{M}$ is the total space of a $\mathbb{C}^2$ bundle (more precisely $\mathbb{C}^2/\mathbb{Z}_2$ with the zero section deleted) over $\mathcal{M}$. It is a hyperkähler manifold of dimension $4(d+1)$ with an $SU(2)$ isometric action which rotates the complex structures into each other, and a homothetic Killing vector. The homothetic Killing vector ensures that the hyperkähler manifold is actually a cone, and the $SU(2)$ isometries guarantee that this is a cone over a three-Sasakian space with $S^3$ fibres over the quaternionic base $\mathcal{M}$ [20]. In physics terminology, these properties follow from $N=2$ superconformal
S. Alexandrov, B. Pioline, F. Saueressig, S. Vandoren
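invariance of the associated sigma model [4]. We denote by $\pi^A$ the complex coordinates on the $\mathbb{C}^2/\mathbb{Z}_2$ fiber, $\bar\pi_A \equiv (\pi^A)^*$ their complex conjugate, and use the antisymmetric tensor $\epsilon_{AB}$ to raise and lower the indices.$^6$ The HK metric on $\mathcal{S}$ is given by
\[ ds^2_\mathcal{S} = |D\pi^A|^2 + \frac{\nu}{4}\,r^2\,ds^2_\mathcal{M}, \tag{2.2} \]
where $ds^2_\mathcal{M}$ is the QK metric on $\mathcal{M}$, $r^2 \equiv |\pi^1|^2+|\pi^2|^2 = \pi^A\bar\pi_A$ is the squared norm on the fiber, and
\[ D\pi^A \equiv d\pi^A + p^A{}_B\,\pi^B \tag{2.3} \]
is the covariant differential of $\pi^A$. The isometric $SU(2)$ action on $\mathcal{S}$ is given by the infinitesimal transformations
\[ \delta\pi^A = \frac{i}{2}\,\epsilon_3\,\pi^A + \epsilon_+\,\bar\pi^A, \qquad \delta\bar\pi^A = -\frac{i}{2}\,\epsilon_3\,\bar\pi^A + \epsilon_-\,\pi^A. \tag{2.4} \]
In particular, the norm $r^2$ is $SU(2)$ invariant. The homothetic Killing vector $r\partial_r = \pi^A\partial_{\pi^A} + \bar\pi^A\partial_{\bar\pi^A}$ corresponds to dilations of the fiber. With respect to the complex structure $J^3$, where $D\pi^A$ are $(1,0)$ forms, the Kähler form is
\[ \omega^3_\mathcal{S} = i\left( D\pi^A\wedge D\bar\pi_A + \frac{\nu}{2}\,\pi_A\bar\pi_B\,\omega^{AB}_\mathcal{M}\right), \tag{2.5} \]
while the holomorphic symplectic form $\omega^+_\mathcal{S} = -\frac12(\omega^1_\mathcal{S} - i\omega^2_\mathcal{S})$ is given by
\[ \omega^+_\mathcal{S} = D\pi_A\wedge D\pi^A + \frac{\nu}{2}\,\pi_A\pi_B\,\omega^{AB}_\mathcal{M} = d\left(\pi_A D\pi^A\right). \tag{2.6} \]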
This construction directly defines the HKC, or Swann bundle $\mathcal{S}$, given a QK manifold $\mathcal{M}$, see [17] for more details. For many purposes, it is useful to decompose the construction above in two steps, by first introducing the twistor space $\mathcal{Z}_\mathcal{M}$ [12], a $\mathbb{CP}^1$ bundle over $\mathcal{M}$, and then obtaining the Swann bundle $\mathcal{S}$ as a $\mathbb{C}^\times$ bundle over $\mathcal{Z}_\mathcal{M}$. The twistor space $\mathcal{Z}_\mathcal{M}$ over $\mathcal{M}$ should not be confused with the twistor space $\mathcal{Z}_\mathcal{S}$ of $\mathcal{S}$ itself, to be introduced in Sect. 2.3 below. $\mathcal{Z}_\mathcal{M}$ is a complex manifold with a canonical Kähler-Einstein metric and a complex contact structure.$^7$ Introducing a complex coordinate $z$ on $\mathbb{CP}^1$, the line element is given by
\[ ds^2_{\mathcal{Z}_\mathcal{M}} = \frac{|dz+P|^2}{(1+z\bar z)^2} + \frac{\nu}{4}\,ds^2_\mathcal{M}, \tag{2.7} \]
while the Kähler form on $\mathcal{Z}_\mathcal{M}$ is given by
\[ \omega_{\mathcal{Z}_\mathcal{M}} = i\left( \frac{(dz+P)\wedge(d\bar z+\bar P)}{(1+z\bar z)^2} + \frac{\nu}{2}\,\frac{(1-z\bar z)\,\omega^3_\mathcal{M} - 2iz\,\omega^+_\mathcal{M} + 2i\bar z\,\omega^-_\mathcal{M}}{1+z\bar z}\right), \tag{2.8} \]
6 We use conventions in which $\epsilon_{12} = 1 = -\epsilon_{21}$ and $\bar\pi_A = \epsilon_{AB}\,\bar\pi^B$.
7 Recall that a complex contact form on a complex manifold of complex dimension $2d+1$ is a holomorphic one-form $\hat{\mathcal{X}}$, defined globally, such that $\hat{\mathcal{X}}\wedge(d\hat{\mathcal{X}})^d$ is a nowhere vanishing holomorphic top form. A contact structure corresponds to the case where $\hat{\mathcal{X}}$ is a local one-form defined up to multiplication by a nowhere vanishing smooth function.
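where $\omega^\pm_\mathcal{M} = -\frac12(\omega^1_\mathcal{M} \mp i\omega^2_\mathcal{M})$, $\omega^+_\mathcal{M} = (\omega^-_\mathcal{M})^*$. In these expressions, $P$ stands for the "projectivized connection", defined from the $SU(2)$ connection $p^A{}_B$ as
\[ P = p_+ - i p_3\,z + p_-\,z^2, \tag{2.9} \]
where $p_+ \equiv p^1{}_2$, $p_3 \equiv i(p^1{}_1 - p^2{}_2)$, $p_- \equiv -p^2{}_1$, with $p_3$ real and $(p_-)^* = p_+$. The complex contact structure on $\mathcal{Z}_\mathcal{M}$ is induced from the Liouville form $\mathcal{X}$ on $\mathcal{S}$,
\[ \mathcal{X} \equiv \pi_A D\pi^A = \frac12\,(\pi^2)^2\,(dz+P), \tag{2.10} \]
and, as apparent from the overall factor of $(\pi^2)^2$, is a section$^8$ of the $\mathcal{O}(2)$ line bundle on $\mathbb{CP}^1$. From the complex contact structure one may easily extract the $SU(2)$ connection $\vec p$, and therefore the triplet of quaternionic two-forms $\vec\omega_\mathcal{M}$ via Eq. (2.1). Thus, the knowledge of the complex structure and contact structure on $\mathcal{Z}_\mathcal{M}$ is sufficient to reconstruct the quaternionic-Kähler metric. To construct the Swann bundle $\mathcal{S}$ we introduce two more real coordinates $r$ and $\psi$ parametrizing the fiber of a $\mathbb{C}^\times$ bundle over $\mathcal{Z}_\mathcal{M}$, with metric$^9$
\[ ds^2_\mathcal{S} = dr^2 + r^2\left[ (D\psi)^2 + ds^2_{\mathcal{Z}_\mathcal{M}} \right], \tag{2.11} \]
where
\[ D\psi = d\psi + \frac{i}{2(1+z\bar z)}\left[ (z\,d\bar z - \bar z\,dz) - i(1-z\bar z)\,p_3 + 2z\,p_- - 2\bar z\,p_+ \right]. \tag{2.12} \]
The metrics (2.2) and (2.11) are identical, provided the coordinates $r,\psi,z,\bar z$ are related to $\pi^A, \bar\pi_A$ via
\[ r^2 = |\pi^1|^2+|\pi^2|^2, \qquad e^{i\psi} = \sqrt{\pi^2/\bar\pi_2}, \qquad z = \frac{\pi^1}{\pi^2}, \qquad \bar z = \frac{\bar\pi_1}{\bar\pi_2}, \tag{2.13} \]
or conversely
\[ \begin{pmatrix} \pi^1 \\ \pi^2 \end{pmatrix} = \frac{r\,e^{i\psi}}{\sqrt{1+z\bar z}} \begin{pmatrix} z \\ 1 \end{pmatrix}. \tag{2.14} \]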
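The change of variables (2.13)-(2.14) can be sanity-checked numerically. The following is a minimal sketch, not part of the original construction: it assumes that $\bar\pi_A$ is the plain complex conjugate of $\pi^A$ and picks the principal branch of the square root in (2.13); the helper names `to_cone` and `to_fiber` are hypothetical.

```python
import cmath
import math

def to_cone(pi1, pi2):
    """(pi^1, pi^2) -> (r, psi, z), following (2.13)."""
    r = math.sqrt(abs(pi1) ** 2 + abs(pi2) ** 2)
    # e^{i psi} = sqrt(pi^2 / conj(pi^2)); the principal branch gives psi = arg(pi^2).
    psi = cmath.phase(pi2)
    z = pi1 / pi2
    return r, psi, z

def to_fiber(r, psi, z):
    """(r, psi, z) -> (pi^1, pi^2), following (2.14)."""
    prefactor = r * cmath.exp(1j * psi) / math.sqrt(1 + abs(z) ** 2)
    return prefactor * z, prefactor

# Arbitrary sample point on the fiber (illustrative values only).
pi1, pi2 = 0.3 - 1.2j, -0.7 + 0.4j
r, psi, z = to_cone(pi1, pi2)
back1, back2 = to_fiber(r, psi, z)
```

The other branch of the square root sends $(\pi^1,\pi^2)$ to $(-\pi^1,-\pi^2)$, which is the same point of $\mathbb{C}^2/\mathbb{Z}_2$.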
For more details, we again refer the reader to [17].
8 More precisely, $\mathcal{X}$ is defined on $\mathcal{S}$. In (2.23) we define a contact one-form $\hat{\mathcal{X}}$ proportional to $\mathcal{X}$, which does live on $\mathcal{Z}_\mathcal{M}$.
9 We follow the conventions of [17], but with a slightly different notation. E.g. the coordinate $\psi$ here is denoted by $\phi$ in [17]. Also, in [17], the $SU(2)$ index in $\pi^A$ was not lowered after complex conjugation.
2.2. Top down: from HKC to QK.
The characterizing property of an HKC is that there exists a function $\chi$ on $\mathcal{S}$, known as the hyperkähler potential, such that the metric, in local (real) coordinates $\phi^M$, $M = 1,\dots,4(d+1)$, satisfies [2,4]
\[ g_{MN} = D_M\partial_N\,\chi(\phi). \tag{2.15} \]
For any Hermitian complex structure, in adapted complex coordinates $z^m$, $m = 1,\dots,2(d+1)$, (2.15) implies that
\[ g_{m\bar n} = \partial_m\partial_{\bar n}\,\chi(z,\bar z), \qquad D_m\partial_n\,\chi(z,\bar z) = 0. \tag{2.16} \]
In particular, $\chi$ provides a Kähler potential in any complex structure. The dilation and $SU(2)$ symmetries are generated by the vector fields
\[ \chi^M = g^{MN}\chi_N, \qquad \vec k^M = \vec J^M{}_N\,\chi^N, \tag{2.17} \]
where $\chi_M \equiv \partial_M\chi$, $g^{MN}$ is the inverse HKC metric, and $\vec J$ is a triplet of complex structures. The $SU(2)$ Killing vector fields are not tri-holomorphic but rotate the complex structures into each other. It follows from (2.15) that the four vector fields $\chi^M$ and $\vec k^N$ satisfy
\[ D_M\chi^N = \delta_M^N, \qquad D_M\vec k_N = \vec J_{NM}. \tag{2.18} \]
In particular, $\chi^m\partial_{z^m}$ is holomorphic. One can also express the hyperkähler potential in terms of the metric and the homothetic Killing vector fields,
\[ \chi = \frac12\,\chi^M g_{MN}\,\chi^N = \frac12\,\chi^M\partial_M\chi, \tag{2.19} \]
consistent with (2.15). It is easy to check that this form of the hyperkähler potential is $SU(2)$ invariant. In the coordinates that appear in the construction of the Swann bundle, the homothetic Killing vector is generated by the vector field $\chi^M\partial_M = r\partial_r$, and so (2.19) yields
\[ \chi = r^2 = \pi^A\bar\pi_A. \tag{2.20} \]
One can descend from the HKC $\mathcal{S}$ to the twistor space $\mathcal{Z}_\mathcal{M}$ by performing a $U(1)$ Kähler quotient. For any choice of complex structure $\vec n\cdot\vec J$ with $\vec n$ a unit vector, $\vec n\cdot\vec k$ is a holomorphic Killing vector. The Kähler quotient of $\mathcal{S}$ with respect to $\vec n\cdot\vec k$ provides a Kähler manifold of real dimension $4d+2$, independent of the choice of $\vec n$, which is just the twistor space $\mathcal{Z}_\mathcal{M}$. By Frobenius' theorem, one may choose a set of independent complex coordinates $\lambda, u^i$, $i = 1,\dots,2d+1$, adapted to the action of the holomorphic vector field $\chi^M$,
\[ \chi^m(z)\,\partial_m = \partial_\lambda\big|_{u^i}. \tag{2.21} \]
The Kähler potential on $\mathcal{Z}_\mathcal{M}$ is then determined from the hyperkähler potential $\chi$ by means of
\[ \chi(\lambda,\bar\lambda,u,\bar u) = e^{\lambda+\bar\lambda+K_{\mathcal{Z}_\mathcal{M}}(u,\bar u)}. \tag{2.22} \]
Defining the $\mathcal{O}(2)$-twisted holomorphic contact form on $\mathcal{Z}_\mathcal{M}$,
\[ \hat{\mathcal{X}} \equiv e^{-2\lambda}\,\mathcal{X} = e^{\bar\lambda-\lambda+2i\psi+K_{\mathcal{Z}_\mathcal{M}}}\,\frac{dz+P}{2(1+z\bar z)}, \tag{2.23} \]
one may rewrite the metric on the fiber as the modulus square of the contact form [3],
\[ \frac{|dz+P|^2}{(1+z\bar z)^2} = 4\,e^{-2K_{\mathcal{Z}_\mathcal{M}}}\,|\hat{\mathcal{X}}|^2. \tag{2.24} \]
Note that $\psi$ is not an independent coordinate, but rather will be determined in terms of $\lambda, z, x^\mu$ in Eq. (2.79) below, in such a way that $\bar\lambda-\lambda+2i\psi$ is a function on $\mathcal{Z}_\mathcal{M}$ only. The QK metric on $\mathcal{M}$ can be computed from the holomorphic contact form $\mathcal{X}$ as indicated below (2.10), or by decomposing the metric on the twistor space as in (2.7), see [4,5] for more details. To express the metric on $\mathcal{M}$ in closed form, one needs to express the complex coordinates $z^m$ on $\mathcal{S}$ (or $u^i$ on $\mathcal{Z}_\mathcal{M}$) in terms of $4d$ independent real coordinates, corresponding to $\mathbb{R}^+\times SU(2)$ invariant combinations of $\phi^M$, and coordinates on the $\mathbb{C}^2$ fiber $z,\bar z,\lambda,\bar\lambda$. As we shall see shortly, this problem is a QK analog of the problem of "parametrizing the twistor lines" in HK geometry.
2.3. Patchwork construction of twistor spaces of HK manifolds - a summary.
As explained e.g. in [1,21–25], HK geometry is equivalent to complex symplectic geometry on the twistor space, compatible with the real structure. This, of course, also applies to the HKC metric on the Swann bundle $\mathcal{S}$, with suitable restrictions on the complex symplectic structure to ensure the HK cone property. In this subsection, we briefly review the twistorial description of general HK manifolds $\mathcal{S}$ following [1], before studying the implications of superconformal invariance in Sect. 2.4. In contrast to the quaternionic-Kähler case described in Sect. 2.1, the twistor space $\mathcal{Z}_\mathcal{S}$ over a $(4d+4)$-dimensional HK manifold$^{10}$ $\mathcal{S}$ is a trivial product $\mathcal{Z}_\mathcal{S} = \mathcal{S}\times\mathbb{CP}^1$. Its structure was developed from a physics viewpoint in [22,24], and its relation to projective superspace was recently further analysed in [25]. We denote by $\zeta$ a complex coordinate on the projective line $\mathbb{CP}^1$ around the north pole $\zeta = 0$. $\mathcal{Z}_\mathcal{S}$ carries an integrable complex structure given by
\[ J(\zeta,\bar\zeta) = \frac{1-\zeta\bar\zeta}{1+\zeta\bar\zeta}\,J^3 + \frac{\zeta+\bar\zeta}{1+\zeta\bar\zeta}\,J^2 + i\,\frac{\zeta-\bar\zeta}{1+\zeta\bar\zeta}\,J^1 \tag{2.25} \]
on the base $\mathcal{S}$ (where $J^i$ are the three complex structures on $\mathcal{S}$) and the standard complex structure on $\mathbb{CP}^1$. Moreover, in this complex structure, $\mathcal{Z}_\mathcal{S}$ carries a holomorphic two-form $\Omega$ (more accurately, a section of $\Lambda^2 T_F^*(2)$, see [22]) and a Kähler form $\omega$ given locally by
\[ \Omega(\zeta) = \omega^+_\mathcal{S} - i\zeta\,\omega^3_\mathcal{S} + \zeta^2\,\omega^-_\mathcal{S}, \tag{2.26} \]
and
\[ \omega(\zeta,\bar\zeta) = \frac{1}{1+\zeta\bar\zeta}\left[ (1-\zeta\bar\zeta)\,\omega^3_\mathcal{S} - 2i\zeta\,\omega^+_\mathcal{S} + 2i\bar\zeta\,\omega^-_\mathcal{S} \right], \tag{2.27} \]
where $\omega^\pm_\mathcal{S} = -\frac12(\omega^1_\mathcal{S} \mp i\omega^2_\mathcal{S})$. Note that, in contrast to the quaternionic-Kähler case, both of these forms are degenerate along the $\mathbb{CP}^1$ fiber direction $d\zeta$. The Kähler form $\omega$ coincides with $\omega^3_\mathcal{S}$ at the north pole $\zeta = 0$, and with $-\omega^3_\mathcal{S}$ at the south pole $\zeta = \infty$.
10 For obvious reasons, we deviate from the notations of [1] which considered $4d$ dimensional HK manifolds $\mathcal{M}$.
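That (2.25) defines a complex structure for every $\zeta$, and that it flips sign between antipodal points (in line with $\omega$ going from $\omega^3_\mathcal{S}$ at $\zeta=0$ to $-\omega^3_\mathcal{S}$ at $\zeta=\infty$), can be illustrated on a toy model. The sketch below is an assumption-laden illustration: it represents $J^1, J^2, J^3$ by the $2\times2$ matrices $-i\sigma_k$, which obey the quaternion algebra, rather than by complex structures on an actual HK manifold; the helper names are hypothetical.

```python
# Toy model (an assumption of this sketch): J^k = -i * sigma_k, so that
# (J^k)^2 = -1 and J^1 J^2 = J^3, as for unit imaginary quaternions.
J1 = [[0, -1j], [-1j, 0]]
J2 = [[0, -1], [1, 0]]
J3 = [[-1j, 0], [0, 1j]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def J(zeta):
    """The combination (2.25) at the point zeta of CP^1."""
    zb = zeta.conjugate()
    den = 1 + (zeta * zb).real
    n3 = (1 - (zeta * zb).real) / den
    n2 = (zeta + zb).real / den
    n1 = (1j * (zeta - zb)).real / den   # i(zeta - conj zeta) is real
    return [[n3 * J3[i][j] + n2 * J2[i][j] + n1 * J1[i][j] for j in range(2)]
            for i in range(2)]

def max_dev(a, b):
    """Largest entrywise deviation between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

zeta = 0.4 - 1.3j                      # arbitrary sample point
Jz = J(zeta)
square_error = max_dev(matmul(Jz, Jz), [[-1, 0], [0, -1]])

# Antipodal point -1/conj(zeta): the complex structure should change sign.
Ja = J(-1 / zeta.conjugate())
antipodal_error = max(abs(Ja[i][j] + Jz[i][j]) for i in range(2) for j in range(2))

north_pole_error = max_dev(J(0j), J3)  # J(0, 0) = J^3
```

The check works because the coefficients of $J^1, J^2, J^3$ in (2.25) form a unit vector for every $\zeta$.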
The holomorphic two-form $\Omega$ however, while coinciding with $\omega^+_\mathcal{S}$ at the north pole, diverges with a second order pole at $\zeta = \infty$. As explained in [1], it is useful to introduce a set of patches $\hat U_i$, $i = 1,\dots,N$ on $\mathcal{Z}_\mathcal{S}$, which project to open disks$^{11}$ $U_i$ on $\mathbb{CP}^1$, and a local section $\Omega^{[i]}$ which is regular on each patch. In order for the holomorphic section $\Omega$ to be well defined, one must require that, on the overlap $\hat U_i\cap\hat U_j$,
\[ \Omega^{[i]} = f_{ij}^2(\zeta)\,\Omega^{[j]} \mod d\zeta. \tag{2.28} \]
The factor $f_{ij}(\zeta)$ corresponds to the transition function of the $\mathcal{O}(1)$ bundle on $\mathbb{CP}^1$, and was discussed in detail in [1]. In particular, we recall that
\[ f_{ij}\,f_{jk} = f_{ik}, \qquad f_{ii} = 1, \qquad \tau(f_{ij}^2) = f_{\bar\imath\bar\jmath}^2, \tag{2.29} \]
where $\tau$ is the antipodal map $[\tau(\nu)](\zeta) \equiv \overline{\nu(-1/\bar\zeta)}$, and $\bar\imath$ labels the patch $U_{\bar\imath}$ opposite to the patch $U_i$ under the involution $\tau$. Defining $\Omega^{[0]} = \Omega$ and using $f_{0\infty} = \zeta$ one finds that
\[ \Omega^{[\infty]} \equiv \zeta^{-2}\,\Omega^{[0]} = \omega^-_\mathcal{S} - i\,\omega^3_\mathcal{S}\,\zeta^{-1} + \omega^+_\mathcal{S}\,\zeta^{-2} \tag{2.30} \]
is regular at the south pole $\zeta = \infty$. Now, we may choose the covering $\hat U_i$ such that, on each patch, the holomorphic section $\Omega^{[i]}$ takes the Darboux form
\[ \Omega^{[i]} = d\mu^{[i]}_I\wedge d\nu^I_{[i]}, \tag{2.31} \]
where $(\nu^I_{[i]}, \mu^{[i]}_I, \zeta)$ is a local complex coordinate system on $\mathcal{Z}_\mathcal{S}$, regular throughout the patch $\hat U_i$ (here $I$ runs over $d+1$ values, which we shall denote $\flat, 0, \dots, d-1$). Equation (2.28) implies that on the overlap of two patches, $(\nu^I_{[i]}, \mu^{[i]}_I)$ and $(\nu^I_{[j]}, \mu^{[j]}_I)$ must be related by a complex ($\mathcal{O}(2)$-twisted) symplectomorphism. This is conveniently encoded by a generating function $S^{[ij]}$ of the initial "position" and final "momentum" coordinates, such that
\[ \nu^I_{[j]} = \partial_{\mu^{[j]}_I} S^{[ij]}(\nu_{[i]},\mu^{[j]},\zeta), \qquad \mu^{[i]}_I = f_{ij}^2\,\partial_{\nu^I_{[i]}} S^{[ij]}(\nu_{[i]},\mu^{[j]},\zeta). \tag{2.32} \]
To check (2.28), one may use the identity
\[ dS^{[ij]} = \nu^I_{[j]}\,d\mu^{[j]}_I + f_{ij}^{-2}\,\mu^{[i]}_I\,d\nu^I_{[i]} \mod d\zeta. \tag{2.33} \]
The transition functions $S^{[ij]}$ are restricted by consistency conditions which ensure that the symplectomorphisms compose properly (see [1] for more details). As a result, the holomorphic symplectic structure on the twistor space $\mathcal{Z}_\mathcal{S}$ is entirely specified by $N-1$ freely chosen functions $S^{[0i]}(\nu_{[0]},\mu^{[i]},\zeta)$. In order to ensure the reality of the resulting metric, it is also necessary to require that the sections $\nu^I_{[i]}, \mu^{[i]}_I$ transform under the real structure as
\[ \tau\big(\nu^I_{[i]}\big) = -\nu^I_{[\bar\imath]}, \qquad \tau\big(\mu^{[i]}_I\big) = -\mu^{[\bar\imath]}_I. \tag{2.34} \]
11 In principle, one should introduce a local coordinate $\zeta^{[i]}$ on each connected disk; to avoid cluttering we shall use a single coordinate $\zeta$ to parametrize all patches at once, with each connected disk $U_i$ being centered at $\zeta = \zeta_i$.
The condition (2.34) requires that the functions $S^{[i\bar\imath]}$ are related by complex conjugation to their Legendre transform [1]. For a suitably generic choice of such transition functions, it is a general property of HK manifolds that the space of solutions of (2.32) has dimension $4d+4$, i.e. all $\nu^I_{[i]}$, $\mu^{[i]}_I$ can be expressed as infinite Taylor series around $\zeta = \zeta_i$ whose coefficients are all functions of $4d+4$ parameters. The moduli space of solutions is isomorphic to the HK base $\mathcal{S}$, and the map $\zeta\mapsto(\nu^I_{[i]},\mu^{[i]}_I)$ defines the "twistor lines", i.e. realizes the $\mathbb{CP}^1$ fiber over any point in $\mathcal{S}$ as a rational curve in $\mathcal{Z}_\mathcal{S}$. Having found the twistor lines, the geometry of $\mathcal{S}$ can be computed by Taylor expanding the holomorphic section $\Omega$ around any point $\zeta\in\mathbb{CP}^1$. When $\mathcal{S}$ is a HKC, as we discuss further in the next section, all points of $\mathbb{CP}^1$ are equivalent, and we can therefore expand around $\zeta = 0$. Since $\Omega$ is a global section of $\mathcal{O}(2)$, the Taylor expansion stops at quadratic order,
\[ \Omega^{[0]} = dw_I\wedge dv^I - i\,\omega^3_\mathcal{S}\,\zeta + d\bar w_I\wedge d\bar v^I\,\zeta^2, \tag{2.35} \]
where $v^I, w_I$ are the complex coordinates in the complex structure $J^3 = J(0,0)$,
\[ v^I = \nu^I_{[0]}(\zeta=0), \qquad w_I = \mu^{[0]}_I(\zeta=0). \tag{2.36} \]
Knowing the complex coordinates and the Kähler form $\omega^3_\mathcal{S}$, it is straightforward to obtain the metric and a Kähler potential. When $\mathcal{S}$ is a HKC, as we will discuss in the next section, it is always possible to choose the Kähler potential such that it is invariant under $SU(2)$, and therefore equal to the hyperkähler potential $\chi$.

2.4. Conditions for superconformal invariance.
We now discuss the implications of superconformal invariance for the general construction of the twistor space $\mathcal{Z}_\mathcal{S}$ of a HK manifold $\mathcal{S}$. We recall from Sect. 2.2 that superconformal invariance requires the existence of a homothetic Killing vector and an $SU(2)$ group of Killing vectors that rotates the complex structures and commutes with the dilations. As follows from the first equation in (2.18), the dilations rescale the hyperkähler cone metric and leave the complex structures invariant. We normalize the action of the dilations such that the metric has weight 2,
\[ g' = \lambda^2 g, \qquad \vec J' = \vec J. \tag{2.37} \]
This implies that all the two-forms $\vec\omega_\mathcal{S}$ on the Swann bundle $\mathcal{S}$ scale with weight two. The action of the dilations can be extended to the twistor space $\mathcal{Z}_\mathcal{S}$ by assigning a scaling weight zero to $\zeta$. In this way, the holomorphic two-form $\Omega$ from (2.26) transforms uniformly throughout the $\zeta$ plane,
\[ \Omega'^{[i]} = \lambda^2\,\Omega^{[i]}. \tag{2.38} \]
The local Darboux coordinates $\nu^I_{[i]}$ and $\mu^{[i]}_I$ must transform in such a way that (2.38) is obeyed, so we postulate$^{12}$
\[ \nu'^I_{[i]} = \lambda^{2n}\,\nu^I_{[i]}, \qquad \mu'^{[i]}_I = \lambda^{2-2n}\,\mu^{[i]}_I, \tag{2.39} \]
12 One may also consider giving a different scaling weight $n_I$ for each conjugate pair $(\nu^I, \mu_I)$. The generalization of the following discussion is immediate.
for some constant $n$. This is a symmetry of the gluing conditions (2.32) provided the generating functions are homogeneous functions of degree one when $\nu$ and $\mu$ are scaled with degree $n$ and $1-n$ respectively,
\[ S^{[ij]}\big(\lambda^{2n}\nu_{[i]},\ \lambda^{2-2n}\mu^{[j]},\ \zeta\big) = \lambda^2\,S^{[ij]}\big(\nu_{[i]},\mu^{[j]},\zeta\big). \tag{2.40} \]
We now turn to the $SU(2)$ action. In order for the complex structure $J(\zeta,\bar\zeta)$ given in (2.25) to be invariant, one should compensate the rotation of $\vec J$ by a rotation on the $\mathbb{CP}^1$ fiber. Thus the fiber coordinate $\zeta$ must transform as
\[ \zeta' = \frac{\alpha\zeta+\beta}{-\bar\beta\zeta+\bar\alpha}, \qquad \alpha\bar\alpha+\beta\bar\beta = 1. \tag{2.41} \]
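A quick way to see that (2.41) is a rigid rotation of the Riemann sphere is that it commutes with the antipodal point map $\zeta\mapsto-1/\bar\zeta$ underlying the real structure $\tau$ of (2.29). The following minimal sketch checks this numerically for one sample $SU(2)$ element; the function names and the sample parameters are illustrative choices, not taken from the text.

```python
import cmath
import math

def rotate(zeta, alpha, beta):
    """SU(2) action (2.41) on the fiber coordinate zeta."""
    return (alpha * zeta + beta) / (-beta.conjugate() * zeta + alpha.conjugate())

def antipodal(zeta):
    """Antipodal map on points of CP^1."""
    return -1 / zeta.conjugate()

# Sample SU(2) element, built so that |alpha|^2 + |beta|^2 = 1.
theta, a, b = 0.7, 1.1, -0.4
alpha = math.cos(theta) * cmath.exp(1j * a)
beta = math.sin(theta) * cmath.exp(1j * b)

zeta = 0.8 + 0.3j
lhs = rotate(antipodal(zeta), alpha, beta)
rhs = antipodal(rotate(zeta, alpha, beta))
commutator_error = abs(lhs - rhs)
```

A generic Möbius map does not have this property; it is the constraint $\alpha\bar\alpha+\beta\bar\beta=1$ together with the specific form of the denominator that enforces it.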
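Under this transformation, $\Omega$ should transform as a $\mathcal{O}(2)$ section,
\[ \Omega'^{[0]}(\zeta) = \big(-\bar\beta\zeta+\bar\alpha\big)^2\,\Omega^{[0]}(\zeta'). \tag{2.42} \]
Here, we have written the action in the patch $U_0$ around the north pole of $\mathbb{CP}^1$, parametrized by the local coordinate $\zeta = \zeta^{[0]}$. The action in the patch $U_i$ can be obtained by replacing $\zeta\to\zeta^{[i]}$, $\Omega^{[0]}\to\Omega^{[i]}$. If we continue to use $\zeta$ as a coordinate in $U_i$, then from (2.28) the transformation of $\Omega^{[i]}$ becomes
\[ \Omega'^{[i]}(\zeta) = \left(\frac{f_{i0}(\zeta)}{f_{i0}(\zeta')}\right)^2\big(-\bar\beta\zeta+\bar\alpha\big)^2\,\Omega^{[i]}(\zeta'). \tag{2.43} \]
In order to ensure that $\Omega$ transforms as (2.43) in every patch, we postulate that the local Darboux coordinates $\nu^I_{[i]}, \mu^{[i]}_I$ transform locally as $\mathcal{O}(2n)$ and $\mathcal{O}(2-2n)$ sections
\[ \nu'^I_{[i]}(\zeta) = \left[\frac{f_{i0}(\zeta)}{f_{i0}(\zeta')}\big(-\bar\beta\zeta+\bar\alpha\big)\right]^{2n}\nu^I_{[i]}(\zeta'), \qquad \mu'^{[i]}_I(\zeta) = \left[\frac{f_{i0}(\zeta)}{f_{i0}(\zeta')}\big(-\bar\beta\zeta+\bar\alpha\big)\right]^{2-2n}\mu^{[i]}_I(\zeta'). \tag{2.44} \]
This is a symmetry of the gluing equations (2.32) provided
\[ S^{[ij]}\big(\nu'_{[i]}(\zeta),\mu'^{[j]}(\zeta),\zeta\big) = \frac{f_{j0}^2(\zeta)}{f_{j0}^2(\zeta')}\,\big(-\bar\beta\zeta+\bar\alpha\big)^2\,S^{[ij]}\big(\nu_{[i]}(\zeta'),\mu^{[j]}(\zeta'),\zeta'\big). \tag{2.45} \]
Using the homogeneity property (2.40), this translates into
\[ S^{[ij]}\left(\frac{f_{ij}^{2n}(\zeta)}{f_{ij}^{2n}(\zeta')}\,\nu_{[i]}(\zeta'),\ \mu^{[j]}(\zeta'),\ \zeta\right) = S^{[ij]}\big(\nu_{[i]}(\zeta'),\mu^{[j]}(\zeta'),\zeta'\big). \tag{2.46} \]
This equation fixes the $\zeta$ dependence to be of the form
\[ S^{[ij]}\big(\nu_{[i]},\mu^{[j]},\zeta\big) = \hat S^{[ij]}\big(f_{ij}^{-2n}\nu_{[i]},\ \mu^{[j]}\big). \tag{2.47} \]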
In particular, note that the special case where $\nu^I_{[i]}$ and $\mu^{[i]}_I$ are global sections of $\mathcal{O}(2n)$ and $\mathcal{O}(2-2n)$,
\[ S^{[ij]}\big(\nu_{[i]},\mu^{[j]},\zeta\big) = f_{ij}^{-2n}\,\nu^I_{[i]}\,\mu^{[j]}_I, \tag{2.48} \]
solves the conditions of superconformal invariance. In addition, as in [1], one must impose the reality conditions
\[ \tau\Big(S^{[ij]}\big(\nu_{[i]},\mu^{[j]},\zeta^{[i]}\big)\Big) = S^{[\bar\imath\bar\jmath]}\big(\nu_{[\bar\imath]},\mu^{[\bar\jmath]},\zeta^{[\bar\imath]}\big). \tag{2.49} \]
Thus, we conclude that superconformal invariance is guaranteed provided the generating functions $S^{[ij]}(\nu_{[i]},\mu^{[j]},\zeta)$ are functions of $f_{ij}^{-2n}\nu_{[i]}$ and $\mu^{[j]}$, without explicit dependence on $\zeta$, homogeneous of degree 1 when their first and second arguments are scaled with weight $n$ and $1-n$, respectively, and satisfying the reality condition (2.49).

Anomalous $\mathcal{O}(0)$ multiplets. In fact, the above conditions are sufficient but not strictly speaking necessary. Indeed, we have assumed that the Darboux coordinates are adapted to the superconformal action, in the sense that dilations and $SU(2)$ act canonically as in (2.39) and (2.44), respectively. Clearly, a local gauge transformation depending on $\zeta$ only would not affect the existence of an isometric $SU(2)$ action, but would just make it look more complicated. More importantly, when $n = 1$ (or equivalently $n = 0$, after exchanging $\mu_I$ with $\nu^I$), it is possible that $\mu_I$ transforms anomalously under dilations, namely
\[ \mu'^{[i]}_I = \mu^{[i]}_I - c^{[i]}_I\,\log\lambda^2, \tag{2.50} \]
for some constants $c^{[i]}_I$, which we shall refer to as "anomalous dimensions". This anomalous transformation may be generated from the standard transformation (2.39) with $n = 0$ by a local symplectomorphism generated by
\[ T^{[i]} = \tilde\mu^{[i]}_I\,\nu^I_{[i]} - c^{[i]}_I\,\nu^I_{[i]}\,\log\nu^\flat_{[i]}, \tag{2.51} \]
where $\nu^\flat_{[i]}$ is any one of the $\nu^I_{[i]}$. This however need not be a regular gauge transformation in the patch $U_i$, and so the geometry will in general depend non-trivially on $c^{[i]}_I$. After this local symplectomorphism, the generating functions $S^{[ij]}$ are now of the form$^{13}$
\[ S^{[ij]} = f_{ij}^{-2}\left[\hat S^{[ij]}\Big(\nu_{[i]},\ \mu^{[j]}_I + c^{[j]}_I\log\big(f_{ij}^{-2}\nu^\flat_{[i]}\big)\Big) - c^{[i]}_I\,\nu^I_{[i]}\,\log\nu^\flat_{[i]}\right], \tag{2.52} \]
where $\hat S^{[ij]}$ is a homogeneous function of degree one in its first argument. In particular, $S^{[ij]}$ satisfy a "quasi-homogeneity condition"
\[ S^{[ij]}\Big(\lambda^2\nu^I_{[i]},\ \mu^{[j]}_I - c^{[j]}_I\log\lambda^2,\ \zeta\Big) = \lambda^2\,S^{[ij]}\Big(\nu^I_{[i]},\mu^{[j]}_I,\zeta\Big) - f_{ij}^{-2}\,c^{[i]}_I\,\nu^I_{[i]}\,\lambda^2\log\lambda^2. \tag{2.53} \]
13 The $\hat S^{[ij]}$ appearing in this equation differs from the one in (2.47), the relation between the two being transcendental in general.
Such generating functions are consistent with $SU(2)$ invariance and dilations provided $\mu^{[i]}_I$ transforms in the same way as $-c^{[i]}_I\log\nu^\flat_{[i]}$, namely as (2.50) under dilations and
\[ \mu'^{[i]}_I(\zeta) = \mu^{[i]}_I(\zeta') - 2c^{[i]}_I\,\log\left[\frac{f_{i0}(\zeta)}{f_{i0}(\zeta')}\big(-\bar\beta\zeta+\bar\alpha\big)\right] \tag{2.54} \]
under rotations. Anomalous transformations play an important role, e.g., in describing the one-loop correction to the hypermultiplet metric in Sect. 4.2. Note that the constants $c^{[i]}_I$ are not arbitrary. Firstly, they must satisfy the reality conditions $(c^{[i]}_I)^* = -c^{[\bar\imath]}_I$. Besides, they are also subject to additional consistency constraints, which follow from the requirement that the open contours around the logarithmic branch cuts in the $\zeta$ plane (as discussed in [1]) combine consistently into closed contours. This requires in particular that the anomalous dimensions associated with the patches containing the zeros of $\nu^\flat_{[i]}$ are real. In this paper we assume that $\nu^\flat_{[i]}$ has always two first order zeros $\zeta_\pm$ in the patches $U_\pm$ related by the antipodal map, and therefore demand that $c^{[+]}_I = -c^{[-]}_I$ are real constants. For a similar reason, we impose the same condition on $c^{[0]}_I = -c^{[\infty]}_I$ (see footnote 18). For later reference, we give the action of the symplectomorphism generated by (2.52),
\[ \nu^I_{[j]} = f_{ij}^{-2}\,\partial_{\mu^{[j]}_I}\hat S^{[ij]}, \qquad \mu^{[i]}_I = \partial_{\nu^I_{[i]}}\hat S^{[ij]} - c^{[i]}_I\,\log\nu^\flat_{[i]} + \delta_I^\flat\,\frac{1}{\nu^\flat_{[i]}}\left( c^{[j]}_J\,\partial_{\mu^{[j]}_J}\hat S^{[ij]} - c^{[i]}_J\,\nu^J_{[i]}\right). \tag{2.55} \]
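From $\mathbb{CP}^1_\zeta\times\mathbb{C}^2_\pi$ to $\mathbb{CP}^1_{\mathbf z}$. We close this discussion of $SU(2)$ transformations with an important observation, which will be instrumental for understanding the relation between the twistor spaces $\mathcal{Z}_\mathcal{S}$ and $\mathcal{Z}_\mathcal{M}$. Notice that the isometric $SU(2)$ action on $\mathcal{S}$ corresponds to an $SU(2)$ action on the fiber coordinates $\pi^A, \bar\pi_A$ (2.4), at a fixed position on the QK base $\mathcal{M}$. Thus, any local $\mathcal{O}(2n)$ section $\nu_{[i]}$, viewed as a function of $(\zeta,\pi^A,\bar\pi_A)$ and $x^\mu$, satisfies differential equations
\[ \begin{aligned} \left[\partial_\zeta + \bar\pi_1\,\partial_{\pi^2} - \bar\pi_2\,\partial_{\pi^1}\right]\big(f_{i0}^{-2n}\nu_{[i]}\big) &= 0,\\ \left[2\zeta\partial_\zeta - 2n + \pi^1\partial_{\pi^1} + \pi^2\partial_{\pi^2} - \bar\pi_1\partial_{\bar\pi_1} - \bar\pi_2\partial_{\bar\pi_2}\right]\big(f_{i0}^{-2n}\nu_{[i]}\big) &= 0,\\ \left[\zeta^2\partial_\zeta - 2n\zeta + \pi^1\partial_{\bar\pi_2} - \pi^2\partial_{\bar\pi_1}\right]\big(f_{i0}^{-2n}\nu_{[i]}\big) &= 0. \end{aligned} \tag{2.56} \]
It follows that there exists a function $\tilde\nu_{[i]}(\mathbf z, x^\mu)$ of the coordinates $x^\mu$ on $\mathcal{M}$ and of the ratio
\[ \mathbf z \equiv \frac{\bar\pi_2\,\zeta + \pi^1}{-\bar\pi_1\,\zeta + \pi^2}, \tag{2.57} \]
such that
\[ \nu_{[i]}(\zeta,\pi^A,\bar\pi_A,x^\mu) = f_{i0}^{2n}\,\big(\pi^2 - \zeta\bar\pi_1\big)^{2n}\,\tilde\nu_{[i]}(\mathbf z,x^\mu). \tag{2.58} \]
For anomalous $\mathcal{O}(0)$ sections, the same argument guarantees the existence of a function $\tilde\mu^{[i]}(\mathbf z,x^\mu)$ such that
\[ \mu^{[i]}_I(\zeta,\pi^A,\bar\pi_A,x^\mu) = \tilde\mu^{[i]}_I(\mathbf z,x^\mu) - c^{[i]}_I\,\log\left[f_{i0}^2\,\big(\pi^2 - \zeta\bar\pi_1\big)^2\right]. \tag{2.59} \]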
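Moreover, under the action of the antipodal map,
\[ \tau(\mathbf z) = -1/\mathbf z, \qquad \tau(\tilde\nu_{[i]}) = -\tilde\nu_{[\bar\imath]}/\mathbf z^{2n}, \qquad \tau(\tilde\mu^{[i]}) = -\tilde\mu^{[\bar\imath]} - 2c^{[\bar\imath]}\log\mathbf z. \tag{2.60} \]
The coordinate $\mathbf z$ can be viewed as a coordinate on the $\mathbb{CP}^1$ fiber of the twistor space $\mathcal{Z}_\mathcal{M}$. After an appropriate $SU(2)$ rotation on the $\mathbb{C}^2/\mathbb{Z}_2$ fiber, we can always assume that the zero and the pole of (2.57) occur at the zeros $\zeta_\pm$ of the singled-out section $\nu^\flat$,
\[ \mathbf z = -\frac{1}{\bar z}\,\frac{\zeta-\zeta_+}{\zeta-\zeta_-}, \qquad \zeta_+ = -\frac{\pi^1}{\bar\pi_2}, \qquad \zeta_- = \frac{\pi^2}{\bar\pi_1}, \qquad z = \frac{\pi^1}{\pi^2}. \tag{2.61} \]
In particular, the points $(0,\zeta_+,\zeta_-,\infty)$ in the $\zeta$ plane are mapped to $(z, 0, \infty, -1/\bar z)$ in the $\mathbf z$ plane, respectively. Since $\nu_{[i]}$ is assumed to be regular at $\zeta = \zeta_i$, $\tilde\nu_{[i]}(\mathbf z,x^\mu)$ is regular at the point $\mathbf z_i \equiv \mathbf z(\zeta_i)$, except when $i = -$, where the factors $(\pi^2 - \zeta\bar\pi_1)$ introduce extra singularities at $\mathbf z = \infty$. In the next subsection, we elaborate on these observations and relate the symplectic and contact structures on the twistor spaces $\mathcal{Z}_\mathcal{S}$ and $\mathcal{Z}_\mathcal{M}$.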
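The statements around (2.57) and (2.61) about where the special points of the $\zeta$ plane land can be verified numerically. The sketch below is only an illustration: it assumes that the barred fiber coordinates are the plain complex conjugates of $\pi^1, \pi^2$, uses arbitrary sample values, and approximates $\zeta=\infty$ by a large number; the function name `z_of_zeta` is hypothetical.

```python
def z_of_zeta(zeta, pi1, pi2):
    """Moebius map (2.57), with barred quantities taken as complex conjugates."""
    return (pi2.conjugate() * zeta + pi1) / (-pi1.conjugate() * zeta + pi2)

pi1, pi2 = 0.6 + 0.2j, -0.3 + 0.9j      # arbitrary sample fiber point
z0 = pi1 / pi2                          # the coordinate z of (2.13)
zeta_plus = -pi1 / pi2.conjugate()      # zero of (2.57)
zeta_minus = pi2 / pi1.conjugate()      # pole of (2.57)

at_north = z_of_zeta(0j, pi1, pi2)                 # should equal z0
at_plus = z_of_zeta(zeta_plus, pi1, pi2)           # should vanish
pole_denominator = -pi1.conjugate() * zeta_minus + pi2   # should vanish
near_infinity = z_of_zeta(1e9 + 0j, pi1, pi2)      # should approach -1/conj(z0)

# Factorized form of the same map, first expression in (2.61).
zeta = 0.25 - 0.5j
factorized = -(1 / z0.conjugate()) * (zeta - zeta_plus) / (zeta - zeta_minus)
direct = z_of_zeta(zeta, pi1, pi2)
```

The image of $\zeta=\infty$ is the antipode of the image of $\zeta=0$, as expected for a map that intertwines the two antipodal involutions.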
2.5. Homogeneous symplectic vs. contact geometry.
Having understood the constraints of superconformal invariance on the transition functions $S^{[ij]}$, we now explain how the homogeneous symplectic structure on $\mathcal{Z}_\mathcal{S}$ descends to a contact structure on $\mathcal{Z}_\mathcal{M}$, in effect rederiving the inverse construction of [13]. For definiteness, and since this is the case of most physical interest, we restrict to twistor spaces with $n = 1$ from this section onward.

From homogeneous symplectic to contact. Let us return to (2.33): the term proportional to $d\zeta$, usually unspecified in HK geometry, can be computed explicitly in the case of HKC manifolds. Indeed, by differentiating the factors $f_{ij}$ appearing explicitly in (2.52), and integrating by parts, one obtains
\[ d\Big(S^{[ij]} - f_{ij}^{-2}\,\mu^{[i]}_I\,\nu^I_{[i]}\Big) = \nu^I_{[j]}\,d\mu^{[j]}_I - f_{ij}^{-2}\,\nu^I_{[i]}\,d\mu^{[i]}_I + \Big(\hat S^{[ij]} + c^{[j]}_I\,\partial_{\mu^{[j]}_I}\hat S^{[ij]} - c^{[i]}_I\,\nu^I_{[i]}\log\nu^\flat_{[i]} - \mu^{[i]}_I\,\nu^I_{[i]}\Big)\,d\big(f_{ij}^{-2}\big). \tag{2.62} \]
Re-expressing $\mu^{[i]}_I$ using (2.55) and using the homogeneity property of $\hat S^{[ij]}$, one concludes that
\[ \mathcal{X}^{[i]} = f_{ij}^2\,\mathcal{X}^{[j]}, \tag{2.63} \]
where $\mathcal{X}^{[i]}$ is the $\mathcal{O}(2)$-valued complex Liouville form
\[ \mathcal{X}^{[i]} = \nu^I_{[i]}\,d\mu^{[i]}_I + c^{[i]}_I\,d\nu^I_{[i]}, \tag{2.64} \]
satisfying the reality condition $\tau(\mathcal{X}^{[i]}) = \mathcal{X}^{[\bar\imath]}$.
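Let us now introduce the dilation-invariant $\mathcal{O}(0)$ sections$^{14}$
\[ \xi^\Lambda_{[i]} \equiv \frac{\nu^\Lambda_{[i]}}{\nu^\flat_{[i]}}, \qquad \tilde\xi^{[i]}_I \equiv \mu^{[i]}_I + c^{[i]}_I\,\log\nu^\flat_{[i]}, \tag{2.65} \]
where we have singled out one coordinate$^{15}$ $\nu^\flat_{[i]}$, and denoted by $\nu^\Lambda_{[i]}$ the remaining $d$ coordinates. In this trivialization, the Liouville form $\mathcal{X}^{[i]}$ leads to a contact form $\hat{\mathcal{X}}^{[i]}$,
\[ \mathcal{X}^{[i]} = \nu^\flat_{[i]}\,\hat{\mathcal{X}}^{[i]}, \qquad \hat{\mathcal{X}}^{[i]} \equiv d\tilde\xi^{[i]}_\flat + \xi^\Lambda_{[i]}\,d\tilde\xi^{[i]}_\Lambda + c^{[i]}_\Lambda\,d\xi^\Lambda_{[i]}. \tag{2.66} \]
The term linear in $c^{[i]}_I$ may be reabsorbed by defining
\[ \alpha^{[i]} \equiv \tilde\xi^{[i]}_\flat + c^{[i]}_I\,\xi^I_{[i]}, \qquad \hat{\mathcal{X}}^{[i]} = d\alpha^{[i]} + \xi^\Lambda_{[i]}\,d\tilde\xi^{[i]}_\Lambda. \tag{2.67} \]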
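The gluing condition (2.63) becomes
\[ \hat{\mathcal{X}}^{[i]} = \hat f_{ij}^2\,\hat{\mathcal{X}}^{[j]}, \qquad \hat f_{ij}^2 \equiv f_{ij}^2\,\nu^\flat_{[j]}/\nu^\flat_{[i]}, \tag{2.68} \]
while $\hat{\mathcal{X}}$ satisfies the reality condition $\tau(\hat{\mathcal{X}}^{[i]}) = -\hat{\mathcal{X}}^{[\bar\imath]}$. According to the remark at the end of the previous section, $\xi^\Lambda_{[i]}$, $\tilde\xi^{[i]}_I$ and $\hat f_{ij}^2$ are all functions of the $\mathbb{CP}^1$ coordinate $\mathbf z$ defined in (2.57) and of the coordinates $x^\mu$ on $\mathcal{M}$,
\[ \xi^\Lambda_{[i]} = \frac{\tilde\nu^\Lambda_{[i]}(x^\mu,\mathbf z)}{\tilde\nu^\flat_{[i]}(x^\mu,\mathbf z)}, \qquad \tilde\xi^{[i]}_I = \tilde\mu^{[i]}_I(x^\mu,\mathbf z) + c^{[i]}_I\,\log\tilde\nu^\flat_{[i]}(x^\mu,\mathbf z), \qquad \hat f_{ij}^2 = \frac{\tilde\nu^\flat_{[j]}(x^\mu,\mathbf z)}{\tilde\nu^\flat_{[i]}(x^\mu,\mathbf z)}. \tag{2.69} \]
Thus, the sections $(\xi^\Lambda_{[i]}, \tilde\xi^{[i]}_I)$ provide local complex Darboux coordinates for the complex contact structure $\hat{\mathcal{X}}$ on $\mathcal{Z}_\mathcal{M}$. They satisfy the reality conditions
\[ \tau\big(\xi^\Lambda_{[i]}\big) = \xi^\Lambda_{[\bar\imath]}, \qquad \tau\big(\tilde\xi^{[i]}_I\big) = -\tilde\xi^{[\bar\imath]}_I + i\pi\,c^{[\bar\imath]}_I. \tag{2.70} \]
On the overlap of two patches, the Darboux coordinates are related by contact transformations following directly from (2.55),
\[ \begin{aligned} \xi^\Lambda_{[j]} &= \hat f_{ij}^{-2}\,\partial_{\tilde\xi^{[j]}_\Lambda}\hat S^{[ij]}, \qquad \tilde\xi^{[i]}_\Lambda = \partial_{\xi^\Lambda_{[i]}}\hat S^{[ij]},\\ \tilde\xi^{[i]}_\flat &= \hat S^{[ij]} - \xi^\Lambda_{[i]}\,\partial_{\xi^\Lambda_{[i]}}\hat S^{[ij]} + c^{[j]}_I\,\partial_{\tilde\xi^{[j]}_I}\hat S^{[ij]} - c^{[i]}_I\,\xi^I_{[i]}, \end{aligned} \tag{2.71} \]
where $\hat S^{[ij]}$ is a general function of $\xi^\Lambda_{[i]}$ and $\tilde\xi^{[j]}_I + c^{[j]}_I\log\hat f_{ij}^{-2}$, related to the original quasi-homogeneous generating function $S^{[ij]}$ via
\[ S^{[ij]}\big(\nu_{[i]},\mu^{[j]}_I,\zeta\big) = f_{ij}^{-2}\,\nu^\flat_{[i]}\left[\hat S^{[ij]}\Big(\xi^\Lambda_{[i]},\ \tilde\xi^{[j]}_I + c^{[j]}_I\log\big(\hat f_{ij}^{-2}\big)\Big) - c^{[i]}_I\,\xi^I_{[i]}\,\log\nu^\flat_{[i]}\right], \tag{2.72} \]
14 Our notations are related to the ones in [17] via $\xi^\Lambda_{NPV} = \xi^\Lambda_{[0]}$, $\tilde\xi^{NPV}_\Lambda = -2i\,\tilde\xi^{[0]}_\Lambda$, $\alpha^{NPV} = 4i\,\tilde\xi^{[0]}_\flat + 2i\,\xi^\Lambda_{[0]}\tilde\xi^{[0]}_\Lambda$, where the quantities on the r.h.s. are evaluated at $\zeta = 0$, $\mathbf z = z$.
15 Equation (2.65) is singular at the zeros of $\nu^\flat_{[i]}$, and one should in principle single out a second coordinate $\nu^0_{[i]}$ to cover these patches. Rather than doing so, we allow for poles and logarithmic cuts in $\xi^\Lambda_{[i]}$ and $\tilde\xi^{[i]}_I$, as in (2.80) below, in effect trivializing the $\mathcal{O}(2)$ bundle over $\mathbb{CP}^1_{\mathbf z}$.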
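and it is understood that $\xi^\flat_{[i]} = 1$. In particular, note that the transition functions $\hat f_{ij}^2$ are holomorphic functions on $\mathcal{Z}_\mathcal{M}$ given by
\[ \hat f_{ij}^2 = \partial_{\tilde\xi^{[j]}_\flat}\hat S^{[ij]}, \tag{2.73} \]
and are equal to one if and only if $\nu^\flat$ is a global $\mathcal{O}(2)$ section.

Recovering the metric from the contact twistor lines. The functions $\xi^\Lambda_{[i]}(\mathbf z,x^\mu)$ and $\tilde\xi^{[i]}_I(\mathbf z,x^\mu)$ specify the twistor fiber over each point in $\mathcal{M}$, and are the analogs of the twistor lines $\nu^I_{[i]}(\zeta), \mu^{[i]}_I(\zeta)$ on $\mathcal{S}$. The knowledge of these "contact twistor lines" allows one to reconstruct the Kähler-Einstein metric on $\mathcal{Z}_\mathcal{M}$ and the quaternionic-Kähler metric on $\mathcal{M}$, in the following manner. First, identifying $\mathcal{X}^{[0]} = \mathcal{X}$ in (2.10) and using (2.58), the holomorphic contact form in any patch $U_i$ may be written as
\[ \hat{\mathcal{X}}^{[i]} = \frac{\mathcal{X}^{[0]}}{f_{0i}^2\,\nu^\flat_{[i]}} = \frac12\left(\frac{\pi^2}{\pi^2-\zeta\bar\pi_1}\right)^2\frac{dz+P}{\tilde\nu^\flat_{[i]}(x^\mu,\mathbf z)}. \tag{2.74} \]
Since $\hat{\mathcal{X}}^{[i]}$ depends on the fiber coordinates $\pi^A, \bar\pi_A, \zeta$ only through the combination $\mathbf z$, we may set $\zeta = 0$, $\mathbf z = z$ in this expression, and obtain
\[ \frac{d\mathbf z}{\mathbf z} + \frac{p_+}{\mathbf z} - i p_3 + p_-\,\mathbf z = \frac12\,e^{-\Phi_{[i]}}\,\hat{\mathcal{X}}^{[i]}, \tag{2.75} \]
where we define the "contact potential",
\[ e^{-\Phi_{[i]}(x^\mu,\mathbf z)} \equiv 4\,\tilde\nu^\flat_{[i]}(x^\mu,\mathbf z)/\mathbf z, \qquad \tau(\Phi_{[i]}) = \Phi_{[\bar\imath]}. \tag{2.76} \]
Applying (2.24), we conclude that the Kähler potential on $\mathcal{Z}_\mathcal{M}$ is given by
\[ K_{\mathcal{Z}_\mathcal{M}} = \log\frac{4(1+\mathbf z\bar{\mathbf z})}{|\mathbf z|} + \mathrm{Re}\,\Phi_{[i]}(x^\mu,\mathbf z) + \log|\hat f_{0i}|^2. \tag{2.77} \]
Since $\hat f_{0i}$ is a holomorphic function, the last term in (2.77) can be absorbed by a Kähler transformation, leading to a Kähler potential valid in the patch $U_i$. In order to derive the metric on $\mathcal{Z}$, one could therefore express $\mathbf z, \bar{\mathbf z}$ and $\Phi_{[i]}$ in terms of the complex coordinates $(\xi^\Lambda_{[i]}, \tilde\xi^{[i]}_\Lambda, \alpha^{[i]})$ in $\hat U_i$. For the purpose of computing the QK metric on $\mathcal{M}$, this step is unnecessary and it suffices to study the contact twistor lines, as we show below. For later reference, we record the hyperkähler potential which follows from (2.22) using $v^\flat = e^{2\lambda}$,
\[ \chi = 4\,\big|v^\flat\,\hat f_{0i}^2\big|\,\frac{1+\mathbf z\bar{\mathbf z}}{|\mathbf z|}\,e^{\mathrm{Re}\,[\Phi_{[i]}(x^\mu,\mathbf z)]}. \tag{2.78} \]
By comparing (2.23) with (2.75) for $i = 0$, we can also relate the coordinate $\psi$ in (2.14) to the coordinates $v^\flat, \mathbf z, x^\mu$ on $\mathcal{S}$,
\[ e^{2i\psi} = \sqrt{\frac{v^\flat\,\bar{\mathbf z}}{\bar v^\flat\,\mathbf z}}\;e^{i\,\mathrm{Im}\,[\Phi_{[0]}(x^\mu,\mathbf z)]}. \tag{2.79} \]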
We now restrict our attention to the patches Uˆ+ and Uˆ− around z = 0 and z = ∞, respectively, corresponding to ζ = ζ+ and ζ = ζ− . Using (2.65), (2.58), (2.59) and f 0+ ∼ ζ − ζ− , we find that the contact twistor lines behave near z = 0 as ,−1 −1 ,0 ,1 ξ[+] = ξ[+] z + ξ[+] + ξ[+] z + O(z2 ), [+] [+] [+] ξ˜[+] = c log z + ξ˜,0 + ξ˜,1 z + O(z2 ),
(2.80)
[+] ,−1 −1 α [+] = c[+] log z + c ξ[+] z + α0[+] + α1[+] z + O(z2 ).
Similarly, near z = ∞, ,−1 ,0 ,1 −1 ξ[−] = ξ[−] z + ξ[−] + ξ[−] z + O(z−2 ), [−] [−] [−] −1 ξ˜[−] = −c log z + ξ˜,0 + ξ˜,1 z + O(z−2 ),
(2.81)
[−] ,−1 ξ[−] z + α0[−] + α1[−] z−1 + O(z−2 ), α [−] = −c[−] log z + c
where the Laurent coefficients are related to those at z = 0 by the reality conditions (2.70). It is also useful to specify the Laurent expansion of the contact potentials, 0 1 [+] = φ[+] + φ[+] z + O(z2 ), 0 1 [−] = φ[−] + φ[−] z−1 + O(z−2 ),
(2.82)
related by the antipodal map [−] = τ ([+] ). For generic choices of contact transformations, we expect16 that similarly to the HK case [22], the moduli space of solutions to the gluing conditions (2.71) and reality conditions (2.70) is of real dimension 4d + 1, and can be parametrized by the lowest Laurent [−] ∗ ,−1 ,−1 ∗ ˜ [+] coefficients ξ[+] = −(ξ[−] ) , ξ,0 = −(ξ˜,0 ) and the real coefficient i(α0[+] +α0[−] ). This parameter space admits a U (1) action induced by phase rotations of z, which can be quotiented out to produce the QK manifold M itself. Expanding the contact form (2.67) for i = ± at z = 0, ∞ and identifying the coefficients of zn on either side of (2.75) allows to extract the SU (2) connection, 0 1 [±] ,−1 ,−1 ˜ [±] , p± = e−φ[±] ξ[±] dξ,0 + c dξ[±] 2 (2.83) 0 i −φ[+] [+] ,0 ˜ [+] ,−1 ˜ [+] 1 p3 = e dα0 + ξ[+] dξ,0 + ξ[+] dξ,1 − iφ[+] p+ , 2 and to express the Laurent coefficients of the contact potentials in terms of the Laurent coefficients of the contact twistor lines, 0 1 ,−1 [±] [±] ,0 ξ˜,1 + c ξ[±] + c[±] , eφ[±] = ± ξ[±] 2 (2.84) 0 1 [±] ,1 ,−1 ˜ [±] ,0 ˜ [±] 1 φ[±] = ± e−φ[±] α1[±] + 2ξ[±] ξ,2 + ξ[±] ξ,1 + c ξ[±] . 2 Via (2.1), one obtains the triplet of quaternionic forms ω M , in particular ωM,3 =
$\dfrac{2}{\nu}\left(\mathrm{d}p_3 + 2\mathrm{i}\, p_+ \wedge p_-\right)$. (2.85)
16 In the case where M admits d + 1 commuting isometries, or for perturbations thereof, this will be demonstrated in Sects. 3 and 5 below.
Linear Perturbations of Quaternionic Metrics
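As anticipated above, the $U(1)$ action induced by phase rotations of $z$ shifts $p_3$ by a total derivative and acts on $p_\pm$ in opposite ways, so lies in the kernel of $\omega_3$. In order to obtain the metric from $\omega_{\mathcal{M},3}$, it is still necessary to specify the almost complex structure $J_3$. This is achieved by expanding the holomorphic one-forms $\mathrm{d}\xi^\Lambda_{[+]}$ and $\mathrm{d}\tilde\xi^{[+]}_I$ around $z = 0$, and projecting them along the base $\mathcal{M}$:
\[
\begin{aligned}
\mathrm{d}\xi^\Lambda_{[+]} &= \xi^{\Lambda,-1}_{[+]}\, p_+\, z^{-2} + V^\Lambda z^{-1} + O(z^0) \quad \bmod Dz,\\
\mathrm{d}\tilde\xi^{[+]}_I &= -c^{[+]}_I\, p_+\, z^{-1} + \tilde V_I + O(z^1) \quad \bmod Dz,
\end{aligned}
\tag{2.86}
\]
where $Dz = \mathrm{d}z + p_+ - \mathrm{i} p_3 z + p_- z^2$ and
\[
V^\Lambda \equiv (\mathrm{d} - \mathrm{i} p_3)\,\xi^{\Lambda,-1}_{[+]}, \qquad
\tilde V_I \equiv \mathrm{d}\tilde\xi^{[+]}_{I,0} - \tilde\xi^{[+]}_{I,1}\, p_+ + \mathrm{i} c^{[+]}_I\, p_3.
\tag{2.87}
\]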
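As a consistency check of the projection above, the following short derivation shows where the two leading terms of $\mathrm{d}\xi^\Lambda_{[+]}$ come from. It is only a sketch: it assumes the leading Laurent behaviour $\xi^\Lambda_{[+]} = \xi^{\Lambda,-1}_{[+]} z^{-1} + O(z^0)$ from (2.80), with the index $\Lambda$ restored by hand, and the form of $Dz$ quoted above.

```latex
% Working modulo Dz means replacing dz by -(p_+ - i p_3 z + p_- z^2).
\begin{align*}
\mathrm{d}\xi^\Lambda_{[+]}
 &= -\xi^{\Lambda,-1}_{[+]}\, z^{-2}\,\mathrm{d}z
    + z^{-1}\,\mathrm{d}\xi^{\Lambda,-1}_{[+]} + O(z^0) \\
 &\equiv \xi^{\Lambda,-1}_{[+]}\, z^{-2}\bigl(p_+ - \mathrm{i}\,p_3\, z\bigr)
    + z^{-1}\,\mathrm{d}\xi^{\Lambda,-1}_{[+]} + O(z^0) \pmod{Dz} \\
 &= \xi^{\Lambda,-1}_{[+]}\, p_+\, z^{-2}
    + \bigl[(\mathrm{d} - \mathrm{i}\,p_3)\,\xi^{\Lambda,-1}_{[+]}\bigr]\, z^{-1} + O(z^0).
\end{align*}
```

The $p_- z^2$ term in $Dz$ only contributes at order $z^0$, which is why it does not appear in $V^\Lambda$. The same substitution applied to $c^{[+]}_I \log z + \tilde\xi^{[+]}_{I,0} + \tilde\xi^{[+]}_{I,1} z$ reproduces the expression for $\tilde V_I$.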
$(1,0)$ forms with respect to the almost complex structure $J_3$ on $\mathcal{M}$ can then be obtained by forming linear combinations of $\mathrm{d}\xi^\Lambda_{[+]}$ and $\mathrm{d}\tilde\xi^{[+]}_I$ which are regular at $z = 0$, and setting $z = 0$ in the corresponding expressions. Thus, singling the index 0 out of $\Lambda$, a basis of $(1,0)$ forms on $\mathcal{M}$ is given by
\[
\Pi^a = \xi^{0,-1}_{[+]} V^a - \xi^{a,-1}_{[+]} V^0, \qquad
\tilde\Pi_I = \xi^{0,-1}_{[+]} \tilde V_I + c^{[+]}_I V^0,
\tag{2.88}
\]
where a runs from 1 to d − 1. Note that the (1,0) form p+ is not linearly independent from those, as it satisfies 0 1 [+] 0,−1 ,−1 ˜ [+] a,−1 ˜ ξ[+] 1 − ξ[+] (2.89) . ξ,1 p+ = e−φ[+] ξ[+] a + c 2 Having determined J3 in this way, the QK metric then follows from ω3 (X, Y ) = gM (J3 X, Y ). Of course, the SU (2) connection and almost complex structure can equivalently be obtained by expanding near z = ∞. Before closing this section, let us note that the above discussion simplifies considerably in the special case where ν is a global O(2) section: in this case, the transition functions fˆi2j become equal to one, and the contact potentials i (x µ , z) become independent of z, defining a real function on M. 3. Quaternionic Geometry with Commuting Isometries In this section, we study aspects of the twistor space ZS of a HKC S (of real dimension 4d + 4) with d + 1 commuting tri-holomorphic isometries. As explained in the introduction, this situation arises when S is the Swann bundle of a QK manifold M with d + 1 commuting isometries. 3.1. Tri-holomorphic isometries and superconformal invariance. As explained in [1], the moment maps associated to the d + 1 commuting tri-holomorphic isometries provide d + 1 global O(2) sections, which can be taken as the “position” coordinates ν I (I = , 0, . . . , d − 1) for the holomorphic section . The fact that ν I are global O(2) sections restricts the form of the transition functions S [i j] to [ j]
\[
S^{[ij]}(\nu_{[i]}, \mu^{[j]}, \zeta) = f_{ij}^{-2}\, \nu^I_{[i]}\, \mu^{[j]}_I - \tilde H^{[ij]}(\nu_{[i]}, \zeta),
\tag{3.1}
\]
S. Alexandrov, B. Pioline, F. Saueressig, S. Vandoren
in such a way that, on the overlap of two patches, [ j] [i] I 2 ˜ [i j] (ν[i] , ζ ). ν[Ij] = f i−2 j ν[i] , µ I = µ I − f i j ∂ν I H [i]
(3.2)
The condition of superconformal invariance (2.47) further restricts $\tilde H^{[ij]}(\nu_{[i]}, \zeta)$ to be of the form
\[
\tilde H^{[ij]}(\nu_{[i]}, \zeta) = f_{ij}^{-2}\, \hat H^{[ij]}(\nu_{[i]}),
\tag{3.3}
\]
where $\hat H^{[ij]}(\nu_{[i]})$ is a homogeneous function of degree one in $\nu_{[i]}$.$^{17}$ Following [1], we would like to express $\tilde H^{[ij]}$ in terms of the standard $O(2)$ multiplet
\[
\eta^I(\zeta) \equiv \zeta^{-1} \nu^I_{[0]}(\zeta) = \frac{v^I}{\zeta} + x^I - \bar v^I \zeta.
\tag{3.4}
\]
Thus, we define (cf. Eq. (3.7) in [1])
\[
H^{[ij]}(\eta^I, \zeta) \equiv \zeta^{-1} f_{0j}^{2}\, \tilde H^{[ij]}(\zeta f_{0i}^{-2} \eta, \zeta).
\tag{3.5}
\]
Using (3.3), this reduces to
\[
H^{[ij]}(\eta^I, \zeta) = \hat H^{[ij]}(\eta^I) \equiv H^{[ij]}(\eta^I).
\tag{3.6}
\]
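The reduction from (3.5) to (3.6) can be spelled out in a few lines. The sketch below assumes, in addition to (3.3) and the degree-one homogeneity of $\hat H^{[ij]}$, the cocycle relation $f_{0i} f_{ij} = f_{0j}$ for the transition functions; that relation is not restated in this section, so it is flagged here as an assumption.

```latex
\begin{align*}
H^{[ij]}(\eta,\zeta)
 &= \zeta^{-1} f_{0j}^{2}\, f_{ij}^{-2}\, \hat H^{[ij]}\bigl(\zeta f_{0i}^{-2}\eta\bigr)
   && \text{by (3.3)} \\
 &= \zeta^{-1} f_{0j}^{2}\, f_{ij}^{-2}\, \zeta f_{0i}^{-2}\, \hat H^{[ij]}(\eta)
   && \text{degree-one homogeneity} \\
 &= \bigl(f_{0j}\, f_{ij}^{-1} f_{0i}^{-1}\bigr)^{2}\, \hat H^{[ij]}(\eta)
  = \hat H^{[ij]}(\eta)
   && \text{assuming } f_{0i} f_{ij} = f_{0j}.
\end{align*}
```

All explicit dependence on $\zeta$ and on the $f$'s drops out, which is the content of (3.6).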
In terms of $H^{[ij]}(\eta)$, the gluing conditions (3.2) simply become
\[
\mu^{[i]}_I = \mu^{[j]}_I - \partial_{\eta^I} H^{[ij]}(\eta).
\tag{3.7}
\]
The consistency conditions on $H^{[ij]}(\eta, \zeta)$ were analyzed in [1], and just need to be restricted to the superconformal case. Thus, we require that
\[
H^{[ji]} = -H^{[ij]}, \qquad H^{[ik]} + H^{[kj]} = H^{[ij]},
\tag{3.8}
\]
subject to the equivalence relation
\[
H^{[ij]} \to H^{[ij]} + G^{[i]} - G^{[j]},
\tag{3.9}
\]
and reality conditions
\[
\tau(H^{[ij]}) = -H^{[\bar\imath \bar\jmath]},
\tag{3.10}
\]
where all quantities are $\zeta$-independent functions of $\eta^I$, homogeneous of degree one. As in [1], we shall abuse notation and define $\hat H^{[ij]}$ away from the overlap $\hat U_i \cap \hat U_j$ (in particular when the two patches do not intersect) using analytic continuation and the second equation in (3.8) to interpolate from $\hat U_i$ to $\hat U_j$.
17 R. Ionas and A. Neitzke have independently shown that the condition that the generalized prepotential is a section of O(2) implies that S is HKC [27].
Anomalous O(0) multiplets. As discussed in the previous section, it is possible to relax the homogeneity condition (2.47) into the “quasi-homogeneity” condition (2.52). In this case, H˜ [i j] is restricted to be of the form [ j] I −2 I ˆ [i j] (ν[i] ) + c[i] ν[i] H , (3.11) log ν − c ν log f ν H˜ [i j] (ν[i] , ζ ) = f i−2 [i] I [i] j i j [i] I where Hˆ [i j] (ν[i] ) is again homogeneous of degree one in its argument. Defining H [i j] as in (3.5), we find [ j] −2 I − c I η I log ζ f 0−2 (3.12) H [i j] (η, ζ ) = Hˆ [i j] (η) + c[i] j η . I η log ζ f 0i η The explicit dependence on ζ may be removed by a local symplectomorphism −2 I I G [i] = −c[i] + c[0] I η log ζ f 0i I η log ζ,
(3.13)
where the second, $i$-independent term was added to ensure regularity in the patches $i = 0$ and $i = \infty$.$^{18}$ After this gauge transformation, we find
\[
H^{[ij]}(\eta, \zeta) = \hat H^{[ij]}(\eta) + c^{[ij]}_I\, \eta^I \log \eta^\flat \equiv H^{[ij]}(\eta),
\tag{3.14}
\]
where $c^{[ij]}_I \equiv c^{[i]}_I - c^{[j]}_I$, while the momentum coordinate $\mu^{[i]}_I$ is replaced by
\[
\mu^{[i]}_{T;I} = \mu^{[i]}_I + c^{[i]}_I \log\bigl(\zeta f_{0i}^{-2}\bigr) - c^{[0]}_I \log \zeta.
\tag{3.15}
\]
Note that (3.14) is no longer a homogeneous function of η I , but rather satisfies the quasi-homogeneity condition [i j] η I ∂η I − 1 H [i j] = c I η I . (3.16) Complex contact structure. As in Sect. 2.5, using the (quasi)-homogeneity property of the transition functions H [i j] , one may reduce the complex symplectic structure on ZS to a complex contact structure on ZM . One should only be careful that due to the gauge transformation (3.13), the anomalous O(0) sections satisfy [i] [i] A µ µ 2 2 , x ) = µ µ[i] − c[0i] (ζ, π , π ¯ ˜ (z, x ) − c log (π − ζ π ¯ ) 1 A I I log ζ, (3.17) T ;I T ;I rather than (2.59), while ξ˜ I[i] (z, x µ ) defined in (2.65) becomes [i] [0] ξ˜ I[i] ≡ µ[i] T ;I (ζ ) + c I log η + c I log ζ.
(3.18)
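The quasi-homogeneity condition (3.16) follows directly from the form (3.14). A minimal check, assuming only that $\hat H^{[ij]}$ is homogeneous of degree one and that the logarithm in (3.14) depends on the single multiplet written here as $\eta^\flat$ (the label of that multiplet is an assumption, since the symbol is lost in the text):

```latex
\begin{align*}
\bigl(\eta^J\partial_{\eta^J} - 1\bigr)\hat H^{[ij]} &= 0
  && \text{(Euler's theorem, degree one)} \\
\bigl(\eta^J\partial_{\eta^J} - 1\bigr)\bigl(c^{[ij]}_I\, \eta^I \log\eta^\flat\bigr)
  &= c^{[ij]}_I\, \eta^I \log\eta^\flat + c^{[ij]}_I\, \eta^I
     - c^{[ij]}_I\, \eta^I \log\eta^\flat \\
  &= c^{[ij]}_I\, \eta^I .
\end{align*}
```

The middle term comes from $\eta^\flat \partial_{\eta^\flat} \log \eta^\flat = 1$, so the anomaly is linear in $\eta^I$, as stated in (3.16).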
Using the fact that fˆi j = 1 when ν is a globally defined O(2) section, the transition function (3.1) with H [i j] as in (3.14) then leads to [ j] ˜ [ j] ˜ [ j] ξ − Hˆ [i j] (ξ[i] , ξ I ) = ξ˜ + ξ[i] ). Sˆ [i j] (ξ[i] 18 This is the place where the additional reality condition c[0] = −c[∞] becomes necessary. I I
(3.19)
The section $\xi^\Lambda \equiv \xi^\Lambda_{[j]}$ is globally well defined, and therefore takes the form
\[
\xi^\Lambda = z^{-1} Y^\Lambda_+ + A^\Lambda - z\, Y^\Lambda_-,
\tag{3.20}
\]
where A is real and (Y+ )∗ = Y− . The vector (2 Re (Y+ ), 2 Im (Y+ ), A ) is in fact the generalized moment map for the translational isometry along A , as defined in [26]. The relation between A , Y+ and x I , v I will be discussed in Sect. 3.3 below. On the other hand, the sections ξ˜ I[i] are defined only in the patch Ui , and are related on the overlap of two patches by the complex contact transformation [ j] ξ˜[i] = ξ˜ − ∂ξ Hˆ [i j] (ξ ),
[ j] [i j] ξ˜[i] = ξ˜ − Hˆ [i j] (ξ ) + ξ ∂ξ Hˆ [i j] (ξ ) − c I ξ I .
(3.21)
[i j]
It should be noted that the term proportional to c in this expression disappears when ξ˜ [i] is traded for α [i] as in (2.67),
α [i] = α [ j] − Hˆ [i j] (ξ ) + ξ ∂ξ Hˆ [i j] (ξ ).
(3.22)
Lagrangian, hyperkähler potential and twistor lines. As explained in [1], the transition functions $H^{[ij]}(\eta)$ determine the holomorphic symplectic structure of $\mathcal{Z}_{\mathcal{S}}$, and therefore a HK metric on $\mathcal{S}$. The latter can be computed from the “Lagrangian”, a function of the components $v^I, x^I, \bar v^I$ of $\eta^I$ defined by the contour integral
\[
\mathcal{L} = \sum_j \oint_{C_j} \frac{\mathrm{d}\zeta}{2\pi\mathrm{i}\,\zeta}\, H^{[0j]}(\eta(\zeta)),
\tag{3.23}
\]
where the contours $C_j$ encircle the centered disks $U_j$ in the complex $\zeta$-plane. Note that due to the consistency conditions (3.8), the index 0 on the right-hand side of this expression may be substituted with any other value without changing the result. A Kähler potential for the HK metric on $\mathcal{S}$ is then obtained by Legendre transformation with respect to $x^I$ [22],
\[
\chi(v^I, w_I, \bar v^I, \bar w_I) = \mathcal{L} - x^I \partial_{x^I} \mathcal{L}, \qquad
\partial_{x^I} \mathcal{L} = w_I + \bar w_I.
\tag{3.24}
\]
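As shown in [1], the “momentum” coordinates $\mu^{[i]}_I(\zeta)$ are given by a single expression valid for all patches $i$,
\[
\mu^{[i]}_{T;I}(\zeta) = \frac{\mathrm{i}}{2}\,\varrho_I + \sum_j \oint_{C_j} \frac{\mathrm{d}\zeta'}{2\pi\mathrm{i}\,\zeta'}\, \frac{\zeta + \zeta'}{2(\zeta' - \zeta)}\, \partial_{\eta^I} H^{[0j]}(\eta(\zeta')),
\tag{3.25}
\]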
provided ζ lies in the open disk Ui . In particular, it is manifestly regular in Ui . The coordinates I ≡ −i(w I − w¯ I ) correspond to overall additive constants unconstrained by (3.7), and are adapted to the tri-holomorphic isometries ∂ I of S. We now discuss the homogeneity and SU (2) transformation properties of L and χ . Taking into account the quasi-homogeneous property (3.16), we readily find the scaling relation I 2 L 2 v I , 2 v¯ I , 2 x I = 2 L(v I , v¯ I , x I ) − 2c[0] . (3.26) x log I
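The integral kernel in (3.25) is such that moving $\zeta$ across a contour $C_j$ reproduces the gluing condition (3.7). A brief sketch, assuming the kernel $\frac{\zeta+\zeta'}{2(\zeta'-\zeta)}$ with $\zeta'$ the integration variable, counter-clockwise contours, and disjoint disks $U_j$ (the placement of the primes is a reading of the formula, not something spelled out in the text):

```latex
% Residue of the integrand at \zeta' = \zeta, picked up only when \zeta lies inside C_j:
\begin{align*}
\operatorname*{Res}_{\zeta'=\zeta}\,
 \frac{1}{\zeta'}\,\frac{\zeta+\zeta'}{2(\zeta'-\zeta)}\,
 \partial_{\eta^I}H^{[0j]}(\eta(\zeta'))
 &= \frac{2\zeta}{2\zeta}\,\partial_{\eta^I}H^{[0j]}(\eta(\zeta))
  = \partial_{\eta^I}H^{[0j]}(\eta(\zeta)), \\
\mu^{[j]}_{T;I}(\zeta) - \mu^{[0]}_{T;I}(\zeta)
 &= \partial_{\eta^I}H^{[0j]}(\eta(\zeta))
 \quad\Longleftrightarrow\quad
 \mu^{[0]}_{T;I} = \mu^{[j]}_{T;I} - \partial_{\eta^I}H^{[0j]} .
\end{align*}
```

The second line is (3.7) with $i = 0$, written for the gauge-transformed coordinates $\mu_{T;I}$.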
On the other hand, the hyperkähler potential satisfies 2 2 2 I I ¯ I −c[0] ¯ I ). (3.27) χ 2 v I , 2 v¯ I , w I −c[0] I log , w I log = χ (v , v¯ , w I , w The SU (2) action on v I and w I can be obtained from (A.5) and (A.6), in Appendix A, leading to δv I = i3 v I + + x I , δ v¯ I = −i3 v¯ I + − x I , ¯ I = − Lv¯ I + i 3 c[0] δw I = + Lv I − i 3 c[0] I , δw I , while the real combinations x I and I transform as
δx I = −2 + v¯ I + − v I , δ I = −i + Lv I − − Lv¯ I − 23 c[0] I .
(3.28)
(3.29)
Note that in the quasi-homogeneous case, w I has anomalous transformations under dilations and U (1) transformations [28], compared to the transformations found in [4]. The anomalous terms can be removed by defining wˆ I ≡ w I + c[0] I log v . It is instructive to check explicitly that χ is SU (2) invariant. Keeping only the homogeneous term in (3.14), one may rewrite dζ ˆ [0 j] I H (η ) − x I ∂η I Hˆ [0 j] (η I ) χˆ = C j 2π i ζ j
ˆ [0 j] v dζ H − v¯ ζ = ζ η C j 2π i ζ j
∂η Hˆ [0 j] v x − x v + (x v¯ − v¯ x )ζ + . (3.30) ζ η Integrating the round bracket in the first term by parts, a short computation establishes dζ η ∂η Hˆ [0 j] dζ ∂η Hˆ [0 j] − ( r · r ) , (3.31) χˆ = (r )2 (η )2 η C j 2π iζ C j 2π iζ j
j
where
\[
\vec r^{\,I} \cdot \vec r^{\,J} = x^I x^J + 2 v^I \bar v^J + 2 \bar v^I v^J
\tag{3.32}
\]
is the inner product of the 3-vectors $\vec r^{\,I} = \bigl(2\,\mathrm{Re}(v^I),\, 2\,\mathrm{Im}(v^I),\, x^I\bigr)$ associated to the $O(2)$ multiplets $\eta^I$. Each term is the product of a $SU(2)$ invariant quantity times a contour integral of an $O(-2)$ section, and so (according to a general argument discussed at the end of Appendix A) is $SU(2)$ invariant. In the quasi-homogeneous case, the same line of argument combined with the contour deformations discussed in Sect. 3.4 of [1] leads to
\[
\chi = \hat\chi + c^{[+-]}_I\, \frac{\vec r^{\,\flat} \cdot \vec r^{\,I}}{r^\flat},
\tag{3.33}
\]
where $c^{[+-]}_I$ denotes the quasi-homogeneity coefficient relating the patches around the two roots of $\eta^\flat(\zeta)$. Thus $\chi$ is $SU(2)$ invariant, and therefore equal to the hyperkähler potential on $\mathcal{S}$. This concludes the proof that transition functions of the form (3.14) indeed lead to a HKC metric on $\mathcal{S}$.
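The identity (3.32) between the Euclidean inner product of the 3-vectors and the combination of $O(2)$ components is easy to test numerically. The snippet below is a minimal sketch, not part of the original construction; the helper names `r_vec`, `dot_euclid` and `dot_formula` are hypothetical and introduced only for this check.

```python
import random


def r_vec(v: complex, x: float):
    """3-vector (2 Re v, 2 Im v, x) attached to an O(2) multiplet with components (v, x)."""
    return (2.0 * v.real, 2.0 * v.imag, x)


def dot_euclid(a, b):
    """Ordinary Euclidean inner product in R^3."""
    return sum(p * q for p, q in zip(a, b))


def dot_formula(v_i: complex, x_i: float, v_j: complex, x_j: float) -> float:
    """Right-hand side of (3.32): x^I x^J + 2 v^I vbar^J + 2 vbar^I v^J."""
    value = x_i * x_j + 2 * v_i * v_j.conjugate() + 2 * v_i.conjugate() * v_j
    return value.real  # the imaginary parts of the last two terms cancel


random.seed(0)
max_err = 0.0
for _ in range(1000):
    v_i = complex(random.uniform(-1, 1), random.uniform(-1, 1))
    v_j = complex(random.uniform(-1, 1), random.uniform(-1, 1))
    x_i = random.uniform(-1, 1)
    x_j = random.uniform(-1, 1)
    lhs = dot_euclid(r_vec(v_i, x_i), r_vec(v_j, x_j))
    rhs = dot_formula(v_i, x_i, v_j, x_j)
    max_err = max(max_err, abs(lhs - rhs))

print(max_err)
```

Setting $I = J$ gives $r^2 = x^2 + 4 v \bar v$, the squared norm that appears under the square root in (3.34).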
3.2. Superconformal quotient. In this subsection and the following one, we perform the superconformal quotient explicitly for a general HKC S described by the formalism of the previous subsection. We start by constructing a convenient set of coordinates x µ , π A , π¯ A on M × C2 /Z2 , in terms of the complex coordinates ν I (ζ ), µ I (ζ ) on the Swann bundle S in an arbitrary complex structure ζ . In the next subsection, we find the reciprocal change of variables, and determine the twistor lines. The real coordinates x µ on M are characterized by their invariance under the scaling and isometric SU (2) actions on ZS . Instead, the coordinates π A , π¯ A should transform as a pair of doublets under SU (2), and have a squared norm dictated by the hyperkähler potential, π A π¯ A = χ . These constraints do not determine the coordinates x µ , π A , π¯ A uniquely. In the case of QK spaces obtained by the classical and quantum corrected c-map, studied in [17,28], it was convenient to choose coordinates x µ adapted to the action of a 2d + 1 dimensional Heisenberg group of isometries. In the general O(2) case, the only isometries are the d + 1 abelian shift symmetries, and there is no such “canonical” choice. Our construction below is tailored to reproduce the results [17,28] for c-map spaces, as we illustrate later in Sect. 4. It also follows from considerations in contact geometry, as discussed in greater generality in Sect. 5.2. We start by singling out two multiplets η , η0 of the d + 1 O(2) multiplets η I , and denote by ηa , a = 1, . . . , d − 1, the remaining ones. The zeros of ν[0] = ζ η are now ζ± =
$\dfrac{x^\flat \mp r^\flat}{2 \bar v^\flat}, \qquad r^\flat = \sqrt{(x^\flat)^2 + 4 v^\flat \bar v^\flat}.$ (3.34)
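The roots (3.34) and the residue that underlies the simplest contour-integral invariant can be checked numerically. The following is a hedged sketch: the function names are hypothetical, the sample values of $v$ and $x$ are arbitrary, and the contour is taken to be a small circle around $\zeta_+$ discretized by the trapezoid rule.

```python
import cmath
import math


def zeta_eta(zeta: complex, v: complex, x: float) -> complex:
    """zeta * eta(zeta) = v + x*zeta - vbar*zeta^2 for an O(2) multiplet."""
    return v + x * zeta - v.conjugate() * zeta ** 2


def roots(v: complex, x: float):
    """Return (r, zeta_plus, zeta_minus) as in (3.34); requires v != 0."""
    r = math.sqrt(x * x + 4 * abs(v) ** 2)
    vbar = v.conjugate()
    return r, (x - r) / (2 * vbar), (x + r) / (2 * vbar)


def contour_integral(f, center: complex, radius: float, n: int = 400) -> complex:
    """Approximate (1 / 2 pi i) * closed integral of f around a counter-clockwise circle."""
    total = 0j
    for k in range(n):
        w = radius * cmath.exp(2j * math.pi * k / n)
        total += f(center + w) * w
    return total / n


v, x = 0.3 + 0.4j, 0.7
r, zeta_plus, zeta_minus = roots(v, x)

# Small circle around zeta_plus that stays away from zeta_minus.
radius = 0.25 * abs(zeta_plus - zeta_minus)
integral = contour_integral(lambda z: 1 / zeta_eta(z, v, x), zeta_plus, radius)

print(abs(zeta_eta(zeta_plus, v, x)), abs(integral - 1 / r))
```

Both printed numbers should be at the level of rounding error: $\zeta_\pm$ are zeros of $\zeta\eta$, and the integral of $\mathrm{d}\zeta/(2\pi\mathrm{i}\,\zeta\,\eta)$ around $\zeta_+$ equals $1/r$. The same data also satisfy $\bar\zeta_+ \zeta_- = -1$, so the two roots are antipodal.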
As explained below (A.9), SU (2)-invariant quantities can be constructed by contour– integrating O(−2) sections on S. The simplest example is dζ 1 1 = , (3.35) r C+ 2π i ζ η which recovers the SU (2) invariant r , homogeneous of degree one under dilations. Other convenient SU (2) and dilation invariants are given by I r · r ηI dζ I = , A ≡r )2 2π i ζ (η (r )2 C+ (3.36) η− η dζ η dζ η+ Z ≡r = 0 , Z¯ ≡ −r = 0, 0 0 η+ η− C+ 2π i ζ η η C− 2π i ζ η η and, when µ[i] is a non-anomalous O(0) local section, B I ≡ −i r
C+
[+] dζ µ I + i r 2π i ζ η
C−
[−]
dζ µ I = −i µ+I + µ− I . 2π i ζ η
(3.37)
Here C± denote the contours containing ζ± , µ[±] I are the multiplets which are regular in [±] the patch containing ζ± , η± = η (ζ± ), and µ± I = µ I (ζ± ). Moreover, an additional invariant can be constructed out of the hyperkähler potential itself, eφ ≡
$\dfrac{\chi}{4 r^\flat}$. (3.38)
As we shall see the 4d variables x µ = {φ, Z a , Z¯ a , A , B I } provide a convenient coordinate system on M. In particular, the coordinates B I correspond to the directions along the d + 1 isometries. In the quasi-homogeneous case, (3.37), is no longer SU (2)-invariant. One may imag˜ [+] ine replacing µ[+] I by ξ I , which is non-anomalous, however this quantity is singular in the patch U+ , as apparent from (2.80). The logarithmic singularity can be cancelled 0 without affecting the SU (2) transformation properties by adding c[+] I log(η /η ), which 19 leads us to define dζ [0] 0 µ[+] B I ≡ −i r + c[+] (3.39) I log η + c I log ζ − (+ ↔ −). T ;I C+ 2π i ζ η It is important to note that there exists another manifestly SU (2) and dilation invariant quantity, R≡
| r × r0 | |v η+0 | = , 2(r )2 (r )2
(3.40)
where × denotes the inner product of vectors in R3 . As we shall see shortly, it is R rather than φ which appears most naturally in the general formulae (3.50) for the twistor lines. Note that R vanishes when the zeros of η and η0 collide. As far as the coordinates on the fiber π A , π¯ A are concerned, the reasoning below (A.9) shows that SU (2) doublets can be constructed by contour-integrating O(−3) sections. Thus, it is natural to consider20 1 dζ C ζ+ 1 π =C = , r η+0 C+ 2π i η ζ η0 1 dζ C¯ ζ− 2 ¯ =− − 0 , π =C 0 r η− −ζ η C− 2π i η (3.41) 1 dζ C¯ ¯ π¯ 1 = −C = , 0 C− 2π i ζ η −ζ η0 r −ζ− η− 1 dζ C π¯ 2 = −C =− , 0 2π i ζη ζη C+ r ζ+ η+0 where the proportionality constant C can be chosen to be real, and adjusted such that (2.20) is obeyed. This gives the SU (2) invariant (3.42) C = 2r eφ/2 |v η+0 |. Equation (2.14) may now be rewritten as in [17], 1
1 √ 2 v¯ η¯ +0 π z φ/2 = 2 e v , , z = ζ 1 + π2 v η+0 z− 2
e4iψ =
v z¯ . v¯ z
(3.43)
19 This definition arises naturally from the general procedure explained in Sect. 5.2. 20 The contour integrals given below suffer from ambiguities in the choice of square root branches. This is inherent to the fact that the fiber of the Swann bundle is C2 /Z2 . We choose the branch cuts in such a way that the reality condition π¯ A = (π A )∗ is obeyed, and (π A , A B π¯ B ) transform as SU (2) doublets.
The following further relations are also often useful: v 1 v |v | |v | ζ− = , ζ = −|z| , r = = (1 + z z ¯ ), x (1 − z z¯ ). + |z| v¯ v¯ |z| |z|
(3.44)
Finally, we note that the reduced O(2) global section ν˜ [i] defined in (2.58) takes the simple form
ν˜ [i] =
1 −φ e z. 4
(3.45)
Comparing with (2.76), we see that the contact potential [+] is real, and coincides with the invariant φ defined in (3.38). This is also apparent from (2.78), using the third equation in (3.44) and the fact that fˆi j = 1 in O(2) geometries. 3.3. Contact twistor lines. We now consider the converse problem, of determining the complex coordinates ν I , µ I on S in terms of the coordinates x µ on the QK base M and of the coordinates π A on the C2 fiber. This is known in mathematics as “parametrizing the twistor lines”. In view of the discussion in Sect. 2.5, this is equivalent to expressing the “contact twistor lines” (ξ , ξ˜ I[i] ) in terms of the coordinates (z, x µ ) on ZM . for any i) Let us consider first ξ (z, x µ ). As explained in (2.80), ξ (equal to ξ[i] admits a single pole at z = 0 and z = ∞. The coefficients of the Laurent expansion of ξ (z, x µ ) around z = 0 can be extracted from the contour integral dz dζ η −k µ ξ (z, x ) = r z , (3.46) ξ ,k = k+1 2 0 2π iz C+ 2π iζ (η ) where we used dζ dz = r . 2π iz 2π iζ η
(3.47)
Equation (3.46) vanishes for k ≤ −2 and k ≥ 2, as can be seen by deforming the contour around ζ+ to a contour around ζ− . For k = 0, one immediately recovers the first quantity in (3.36). For k = −1, one may decompose dζ η z η0 ,−1 = R Z , =r (3.48) ξ η0 2π iζ η η C+ where Z 0 ≡ 1, and we used the fact that the term in brackets is regular at ζ = ζ+ and equal at that point to the invariant R defined in (3.40). For k = 1, a similar argument leads to ξ ,1 = −R Z¯ . As a side product, we obtain a contour integral representation for the quantity (3.40), η0 η0 −1 dζ dζ z = r z . (3.49) R=r 2 2 C+ 2π i ζ (η ) C− 2π i ζ (η ) Thus, we conclude that the contact twistor line is parametrized by ξ (z, x µ ) = A + R z−1 Z − z Z¯ ,
(3.50)
so that Y+ = R Z in (3.20). Setting ζ = 0, z = z in this expression allows to express the complex coordinate ξ on ZM in terms of the coordinates on the base and on the CP 1 fiber. For ξ˜ I[i] defined in (2.65), and assuming c[i] I = 0 for simplicity, one may eliminate I in (3.25) in favor of B I , and use the identity dζ 2π iζ
ζ + ζ dz z + z 1 ζ+ + ζ 1 ζ− + ζ = − , − ζ −ζ 2 ζ − ζ+ 2 ζ − ζ− 2π i z z − z
(3.51)
where z is obtained from (3.47) by replacing ζ → ζ , z → z . This gives ξ˜ I[0] (z, x µ ) =
i 1 BI + 2 2 j
C˜ j
dz z + z [0 j] H (ξ(z )), 2π i z z − z I
(3.52)
[0 j]
where C˜ i is the image of the contour Ci in the z plane, and H I (ξ ) ≡ ∂η I H [0 j] (η I ). In the quasi-homogeneous case, similar manipulations (explained in greater generality in Sect. 5.2) lead to dz z + z i 1 1 [+−] B + ∂ξ Hˆ [0 j] (ξ(z )) + c log z, 2 2 2 C˜ j 2π i z z − z j [0 j] 1 dz z + z ˆ i 1 ˆ + cI ξ I H − ξ = B + ∂ + c[+−] log z. H ξ z − z 2 2 2π i z 2 ˜ Cj
ξ˜[0] = ξ˜[0]
j
(3.53) Again, by substituting ζ = 0, z = z in these expressions, one may obtain the complex coordinate ξ˜ I on ZM in terms of the fiber coordinate z and the coordinates on M. As in the case of (3.25), the r.h.s. of (3.53) gives the contact twistor line ξ˜ I[i] in any patch Ui , provided z is chosen to lie in the corresponding patch (moreover, as in (3.25), one may replace the superscript [0 j] with any [k j] without changing the result). Indeed, one may check that the discontinuity of the r.h.s. of (3.53) across the contours C˜ i precisely implements the contact transformations given in (3.21). Another important remark is that, due to the fact that the argument ξ of Hˆ [0 j] has a pole at z = 0, the integrals appearing in (3.53) need not be regular at z = 0: we shall see an example of this phenomenon in (4.23) below. For what concerns ξ˜ I[+] however, the integrals are regular and the first Laurent coefficients needed to find the SU (2) connection and the quaternionic-Kähler metric are readily extracted: [+] ξ˜,0 =
dz dz [+] ∂ξ Hˆ [0 j] , ξ˜,1 = ∂ Hˆ [0 j] , 2 ξ 2π i z 2π i z ˜ ˜ Cj Cj j j (3.54) dz i 1 [0 j] [0 j] ˆ ˆ H . = B + − ξ ∂ξ H 2 2 C˜ j 2π i z
i 1 B + 2 2 α0[+]
j
From these expressions it is easy to find the contact potentials using (2.84), dz 1 1 I Z ∂ξ Hˆ [0 j] (ξ (z)) + c[+] R I A , 2 2 2π iz 2 ˜ Cj j dz ¯ 1 1 I =− R Z ∂ξ Hˆ [0 j] (ξ (z)) − c[−] I A , 2 2π i 2 ˜ Cj
e[+] = e
[−]
(3.55)
j
In particular, e[±] are independent of the fiber coordinate z and are equal to each other, since their difference is the integral of a total derivative. On the other hand, changing variable from ζ to z in (3.31), we may rewrite the hyperkähler potential χ , Eq. (3.33), as a contour integral
χ =r R
j
C˜ j
dz −1 z Z − z Z¯ ∂ξ Hˆ [0 j] (ξ (z)) + r c[+−] AI , I 2π i z
(3.56)
where we used the notation A = 1. Equation (3.56) may be used to express R in terms of eφ or vice-versa. Moreover, comparing (3.56) and (3.55), one confirms that the invariant φ defined in (3.38) is indeed equal to the contact potential [±] . 4. The Perturbative Hypermultiplet Moduli Space In order to illustrate the results in the previous section, we now discuss the geometry of hypermultiplet moduli spaces in type II string theories compactified on a Calabi-Yau three-fold, which is the main motivation for this study. In Sect. 4.1 we focus on the tree-level geometry, deferring the inclusion of the one-loop correction to the next subsection. Non-perturbative contributions will be considered in [19], using the results on deformation in Sect. 5 below.
4.1. Tree-level geometry. At tree-level in the string perturbative expansion, the hypermultiplet moduli space $\mathcal{M}$ in type IIA (resp. IIB) string theory compactified on a Calabi-Yau three-fold $Y$ (resp. $X$) is a QK space of quaternionic dimension $d = h^{2,1}(Y) + 1$ (resp. $d = h^{1,1}(X) + 1$). It is obtained by the c-map construction from the vector multiplet moduli space $\mathcal{M}_V$ in type IIB (resp. IIA) theory compactified on the same Calabi-Yau manifold $Y$ (resp. $X$) [29,30]. $\mathcal{M}_V$ is a projective special Kähler manifold of dimension $2d - 2$, representing the moduli space of complex deformations of $Y$ (resp. complexified Kähler deformations of $X$), described in the standard way [31,32] by a holomorphic prepotential $F(X^\Lambda)$ ($\Lambda = 0, \dots, d-1$), homogeneous of degree two. As shown in [33,34], the Lagrangian describing the Swann bundle of $\mathcal{M}$ is given by
\[
\mathcal{L} = \mathrm{Im} \oint_{C_+} \frac{\mathrm{d}\zeta}{2\pi\mathrm{i}\,\zeta}\, \frac{F(\eta^\Lambda)}{\eta^\flat},
\tag{4.1}
\]
where $\eta^I$ ($I = \flat, 0, \dots, d-1$) are $O(2)$ multiplets parameterized as in (3.4), and the contour $C_+$ encloses the root $\zeta_+$ of $\zeta\eta^\flat(\zeta)$ (given in Eq. (3.34)) counter-clockwise. This
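can be cast in our general framework (3.23) by introducing four patches$^{21}$ on $\mathbb{CP}^1$, centered at $0, \infty, \zeta_+, \zeta_-$, with transition functions
\[
H^{[0+]}_{\rm tree} = -\frac{\mathrm{i}}{2}\, \frac{F(\eta^\Lambda)}{\eta^\flat}, \qquad
H^{[0-]}_{\rm tree} = -\frac{\mathrm{i}}{2}\, \frac{\bar F(\eta^\Lambda)}{\eta^\flat}, \qquad
H^{[0\infty]}_{\rm tree} = 0.
\tag{4.2}
\]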
The contour integral (4.1) was evaluated in [17] (generalizing a previous computation in [33,34] restricted to the locus $v = \bar v = 0$), resulting in
\[
\mathcal{L}(v, \bar v, x) = \frac{1}{2\mathrm{i}\, r^\flat} \left( F(\eta^\Lambda_+) - \bar F(\eta^\Lambda_-) \right).
\tag{4.3}
\]
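The hyperkähler potential $\chi$ obtained by Legendre transform is given by [17]
\[
\chi = \frac{v^\flat \bar v^\flat}{(r^\flat)^3}\, K(\eta_+, \eta_-),
\tag{4.4}
\]
where
\[
K(Z, \bar Z) \equiv \mathrm{i} \left( \bar Z^\Lambda F_\Lambda(Z) - Z^\Lambda \bar F_\Lambda(\bar Z) \right) \equiv e^{-\mathcal{K}(Z, \bar Z)}.
\tag{4.5}
\]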
The hyperkähler potential χ may be further expressed in terms of the complex coordinates v I , w I and their complex conjugate by means of the Hesse potential associated to the special Kähler manifold MV [17]. The momentum coordinates µ[0] I for this geometry can be evaluated using (3.25) (away from the locus where the zeros of η collide with other singularities of F(η )): i ζ +ζ− ¯ ζ +ζ+ 1 µ[0] = − F (η ) − (η ) F ζ −ζ− − , 4ir ζ −ζ+ + 2 i ¯ −) ζ +ζ− ¯ ζ+ F(η+ ) ζ− F(η ζ ζ +ζ+ x µ[0] = + 2i(r )2 (ζ −ζ+ )2 + (ζ −ζ− )2 + 4i(r )3 ζ −ζ+ F(η+ )− ζ −ζ− F(η− ) 2 ζ +ζ− ¯ +ζ+ v v + . (4.6) F (η ) + ζ v ¯ (η ) + ζ v ¯ F − 4i(r1 )2 ζζ−ζ + + − − ζ+ ζ −ζ− ζ− + Since H [0∞] = 0, the momentum coordinates around the south pole are given by µ[∞] = I [0] µ I , and one may check that the reality conditions (2.34) are indeed satisfied. [0] The multiplets µ[0] and µ have a first order and second order pole at ζ = ζ± , respectively, while being regular elsewhere. It is readily checked that the combinations [0] µ[+] = µ −
i F (η) i F(η) [0] , µ[+] , = µ + 2 η 2 (η )2
(4.7)
[0+] related to µ[0] I by the symplectomorphism generated by Htree , are regular at ζ = ζ+ , while being singular at ζ = ζ− and other possible singularities of F(η). Indeed, evaluating µ[+] I at ζ = ζ+ yields
i i x i v ¯ (4.8) F (η ) − F (η ) + F (η ) + v ¯ ζ µ+ = − + − + + 2 4(r )2 2r ζ+ 21 Since H [0∞] = 0, U and U are really one and the same patch. We further assume that all singularities ∞ 0 of F(η) belong to either U0 or U∞ , but not to U± .
and µ+ =
i (x )2 − 2v v¯ ¯ −) + i F(η+ ) − F(η 4 2 4(r )
v v i x ¯ F (η+ ) + + ζ+ v¯ + ζ− v¯ (4.9) − F (η− ) 4(r )3 ζ+ ζ−
i v¯ v − v v¯ v v i . − F (η ) + F + ζ v ¯ + ζ v ¯ + + + 2(r )3 4(r )2 ζ+ ζ+
Similarly, the combinations [0] µ[−] = µ −
¯ i F¯ (η) i F(η) [−] [0] , µ = µ + , 2 η 2 (η )2
(4.10)
[0−] , are regular at ζ = ζ , related to µ[0] − I by the symplectomorphism generated by H while being singular at ζ = ζ+ and other possible singularities of F(η). We note that the multiplets µ I may be obtained independently by making use of the special symmetry properties of the QK metrics in the image of the c-map, namely the existence of an extended Heisenberg group of tri-holomorphic isometries [29,30]. Upon lifting them to the Swann bundle, these isometries are generated by the holomorphic Killing vector fields [17]
i K = − ∂w , 4 M = w ∂w
i ∂ w , Q = w ∂w − v ∂v , 2 1 1 − v ∂v + w ∂w − v ∂v . 2 2 P =
(4.11)
The commuting isometries K , P are manifest in the O(2) projective superfield construction; their O(2)-valued moment maps are just the O(2) multiplets η , η . The moment maps, λ , λ associated to the remaining isometries Q and M provide d + 1 additional global O(2) sections
λ = v w /ζ + w ∂w χ − v ∂v χ + v¯ w¯ ζ,
1 1 1 −1 λ = v w + v w ζ + w ∂w χ − v ∂v χ + w ∂w χ − v ∂v χ 2 2 2
1 (4.12) + v¯ w¯ + v¯ w¯ ζ. 2 Matching the leading terms in the expansion around ζ = 0, one readily checks that the momentum coordinates around the north pole are given in terms of the global O(2) sections [0] µ =
λ , η
µ[0] =
λ λ η − . η 2(η )2
(4.13)
4.2. One-loop correction. In type II theories compactified on a Calabi-Yau Y , the metric on the hypermultiplet moduli space receives a one-loop correction, proportional to the Euler number of Y [35]. There is evidence that there are no perturbative corrections to the hypermultiplet metric beyond one-loop [36].22 As shown in [36], the one-loop correction can be described in the projective superspace formalism by adding a term
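\[
\mathcal{L}_{\rm 1\text{-}loop} = 2c \oint_C \frac{\mathrm{d}\zeta}{2\pi\mathrm{i}\,\zeta}\, \eta^\flat \log \eta^\flat
= -4c \left( r^\flat - x^\flat \log \frac{x^\flat + r^\flat}{2 |v^\flat|} \right)
\tag{4.14}
\]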
to the Lagrangian (4.1), where $c$ is a constant determined in [36], proportional to the Euler character of the Calabi-Yau threefold. Here the contour $C$ is a figure-eight contour around $\zeta_+$ and $\zeta_-$, and the branch cuts in $\log \eta^\flat$ are chosen to extend from $\zeta_+$ to 0 and $\zeta_-$ to $\infty$ (see Sect. 3.4 in [1] for a more detailed discussion). Equivalently, the one-loop correction gives rise to additional contributions to the transition functions (4.2),
\[
H^{[0+]}_{\rm 1\text{-}loop} = 2c\, \eta^\flat \log \eta^\flat, \qquad
H^{[0-]}_{\rm 1\text{-}loop} = -2c\, \eta^\flat \log \eta^\flat, \qquad
H^{[0\infty]}_{\rm 1\text{-}loop} = 0.
\tag{4.15}
\]
In particular, the transition functions are no longer homogeneous, but fall in the “quasi-homogeneous” class, with anomalous dimensions
\[
c^{[0]}_\flat = c^{[\infty]}_\flat = 0, \qquad
c^{[\pm]}_\flat = \mp 2c, \qquad
c^{[i]}_\Lambda = 0.
\tag{4.16}
\]
The one-loop contribution to the hyperkähler potential is given by a simple correction to the formula (4.4) [28],
\[
\chi = \frac{v^\flat \bar v^\flat}{(r^\flat)^3}\, K(\eta_+, \eta_-) - 4c\, r^\flat,
\tag{4.17}
\]
in agreement with the general result (3.33). Let us now determine the twistor lines for the one-loop corrected hypermultiplet moduli space. Starting from the general expression (3.25), the additional contribution (4.15) to the transition functions gives rise to extra terms in µ[i] ;T ,
ζ − ζ− [0]tree , µ[0] = µ + 2c log |ζ | + T; ζ − ζ+
|v |(ζ − ζ− )2 [+] [+]tree µT ; = µ + 2c, + 2c log − ζ ζ−
(4.18)
[i]tree while the other momentum coordinates remain unaltered, µ[i] . It is easy to = µ check that the multiplet (4.18) transforms under SU (2) transformations according to (A.6). 22 In the case of the universal hypermultiplet, this was established rigorously in [37]. See the end of Sect. 4.3 for a strengthening of the non-renormalization argument in [36].
4.3. Superconformal quotient. The superconformal quotient of the HKC defined by (4.1) was studied in [17,28].23 The dilation and SU (2) invariant coordinates used in these references were given by χ r · r η+a x a ˜ = −i(w − w¯ ) + , ζ = , z = , ζ Re [F (η+ )] , 4r (r )2 (r )2 η+0 (4.19) η0 ˜ v v¯ x σ = 2i(w − w¯ )+i v w − v¯ w¯ − (r )2 Re η+ ζ − F (η+ )ζ −4ic log η0+ .
e2U =
−
While the SU (2)-invariance of ζ˜ and σ directly is rather tedious to check, it can be made manifest by casting the resulting expressions in the form of a contour integral of an O(−2) section, e.g. when c = 0, [0] [0] µ η µ[0] dζ µ dζ ζ˜ = −2ir , σ = 4ir + . (4.20) η 2(η )2 C + 2π iζ η C + 2π iζ In the presence of the one-loop correction, these expressions may be generalized by performing the same replacement as in (3.39) and taking the real part. We now relate the result (4.19) to the general SU (2) and dilation invariant coordinates introduced in Sect. 3.2. Clearly, U = φ/2, z a = Z a , ζ = A . On the other hand, evaluating (3.39) with the help of (4.8), (4.9) and (4.18) leads to x Re F (η+ ) − A Re F (η+ ), (r )2 1 +x A η0 Re F (η+ )+ A A Re F (η+ )+2ic log η0+ . B = + (r 1 )2 Re F(η+ )− x 2(r )2 − 2 (4.21)
B = +
Thus the coordinates B I differ from σ, ζ˜ by SU (2) invariant terms, B = ζ˜ − A Re F (Z ),
1 1 B = − σ − A B . 2 2
(4.22)
The contact twistor lines can be found using the general formulae (3.50), (3.53), ξ = A + R z−1 Z − z Z¯ , i ξ˜[0] = B + A Re F (Z ) + R z−1 F (Z ) − z F¯ ( Z¯ ) , 2
i 1 [0] ξ˜ = B − A A Re F (Z ) + R2 Re Z¯ F (Z ) 2 2 ¯ Z¯ ) − 2c log z. −RA z−1 F (Z ) − z F¯ ( Z¯ ) − R2 z−2 F(Z ) + z2 F( (4.23) Finally, it remains to express R, defined in (3.40) above, in terms of the base coordinates (3.36)–(3.39). For this purpose, one may substitute the one-loop corrected hyperkähler 23 Ref. [28] used a different contour prescription related to the one used here by a local gauge transformation, as we explain in Appendix B. As a result, the expressions for ζ˜ and σ acquired some additional terms.
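potential (4.17) into the definition of $\phi$, Eq. (3.38), and use the homogeneity property of $K(\cdot, \cdot)$ to obtain
\[
R = 2\, e^{\frac{1}{2} \mathcal{K}(Z, \bar Z)} \sqrt{e^\phi + c}.
\tag{4.24}
\]
Introducing
\[
W \equiv F_\Lambda(Z)\, \zeta^\Lambda - Z^\Lambda \tilde\zeta_\Lambda
\tag{4.25}
\]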
and using (4.22), one may obtain the contact twistor lines for the one-loop corrected hypermultiplet geometry in the form found in [17], ξ = ζ + R z−1 Z − z Z¯ , −2iξ˜[0] = ζ˜ + R z−1 F − z F¯ , (4.26) 4iξ˜[0] + 2iξ˜[0] ξ = σ + R z−1 W − z W¯ − 8i c log z. We proceed to extract the one-loop corrected hypermultiplet metric from the twistor data on ZM . Following the procedure outlined at the end of Sect. 2.5, we first compute the Laurent coefficients of ξ˜ I[+] entering the SU (2) connection (2.83), i i [+] [+] ξ˜,0 ζ˜ − F ζ , ξ˜,0 = = − (σ + ζ ζ˜ − F ζ ζ ) − eφ + c, 2 4 1 i [+] F ζ ζ , = RN Z¯ − (4.27) ξ˜,1 2 4R 1 i [+] ξ˜,1 F ζ ζ ζ . = − RN ζ Z¯ + 2 12R where N ≡ i(F − F¯ ), leading to the SU (2) connection i p+ = e−φ R Z dζ˜ − F dζ = ( p− )∗ , 4 (4.28) 1 −φ ˜ φ ˜ p3 = e dσ + ζ dζ − ζ dζ + 4(e + c)A K , 8
where A K ≡ i Ka dZ a − Ka¯ d Z¯ a¯ is the Kähler connection of the projective special Kähler base MV . A direct computation of the (1, 0) forms (2.88) then yields a = R2 dZ a , i ˜ = R dζ˜ − F dζ − F ζ dZ 2
i 3 i T ˜ T Z R Im (F ) Z¯ + , + F ζ ζ d ζ − F dζ T T 4r 4R2 r + 2c ˜ = −R dr + c dK r +c i + dσ + ζ˜ dζ + ζ dζ˜ − F ζ ζ dZ − 2F ζ dζ 4
i i Z T dζ˜T − FT dζ T , R2 Im (F )ζ Z¯ + F ζ ζ ζ + 4r 12 (4.29)
where r ≡ eφ . Taking linear combinations, a basis of (1,0) forms can be chosen as dZ a , f a dζ˜ − F dζ , (4.30) r + 2c i Z dζ˜ − F dζ , dσ + ζ˜ dζ − ζ dζ˜ + c dK, dr + r +c 4 where f a = eK/2 (∂a Z +∂a K Z ), generalizing the one-forms ea , E a , u, v introduced in [30] for c = 0. Finally, computing the Kähler form (2.85) in this basis and raising the indices, one obtains the one-loop corrected metric on the hypermultiplet moduli space [28,36],
r + 2c 1 2(r + c) ¯ dζ˜ − F dζ ds 2 = 2 dr 2 − N − Z Z r (r + c) r rK × dζ˜ − F¯ dζ 2 4(r + c) r +c ¯ ˜ dζ − ζ dζ˜ + 4c A K + dσ + ζ + Ka b¯ dZ a d Z¯ b , 16r 2 (r + 2c) r (4.31) which identifies φ as the four-dimensional dilaton. It is worthwhile noting that the one-loop correction changes the topology of the fibration of the σ -circle over the torus coordinatized by ζ , ζ˜ , by a term proportional to A K [38]. Any perturbative correction to the hypermultiplet metric beyond one-loop would presumably induce extra terms in the connection on the σ circle bundle proportional to a positive power of r , and would therefore conflict with the quantization of its first Chern class. This observation reinforces the arguments given in [36] ruling out perturbative corrections to the hypermultiplet metric beyond one-loop. 5. Linear Deformations of O(2) Quaternionic-Kähler Spaces In this section, we study the infinitesimal deformations of 4d-dimensional QK spaces M with d + 1 commuting isometries, which preserve the QK property but may break some or all of the isometries. Our strategy is to apply the general analysis of linear deformations of O(2) HK spaces developed in [1] to the Swann bundle S of M, restricting to deformations which preserve the superconformal invariance property. As explained in the introduction, it is possible to bypass the Swann bundle and work directly with the twistor space ZM . This strategy will be realized in Sect. 5.2. 5.1. Linear deformations of O(2) hyperkähler cones. As explained in [1], deformations of HK spaces S are conveniently described by perturbing the transition functions S [i j] which encode the holomorphic symplectic structure on the twistor space ZS , [ j]
\[ S^{[ij]}(\nu_{[i]}, \mu^{[j]}, \zeta) = f_{ij}^{-2}\, \nu^I_{[i]}\, \mu^{[j]}_I - \tilde H^{[ij]}(\nu_{[i]}, \zeta) - \tilde H^{[ij]}_{(1)}(\nu_{[i]}, \mu^{[j]}, \zeta), \] (5.1)
and working out the perturbations $\hat\nu^I_{[i]}$, $\hat\mu^{[i]}_I$ of the twistor lines,
\[ \nu^I_{[i]} = \zeta f_{0i}^{-2}\, \eta^I + \hat\nu^I_{[i]}, \qquad \mu^{[i]}_I = \breve\mu^{[i]}_I + \hat\mu^{[i]}_I \] (5.2)
[i j] . Here and below, unperturbed quantities are to first order in the perturbations H˜ (1) denoted with ˘, perturbations with ˆ, and perturbed quantities with no extra symbol (with the exception of η I , v I , x I which will continue to denote unperturbed quantities). As shown in Sect. 3.1, superconformal invariance restricts the undeformed transition functions H˜ [i j] (ν[i] , ζ ) to be homogeneous of degree one in ν[i] , and without any explicit dependence on ζ except for some factors of f i j . The same reasoning shows the [i j] should satisfy the same conditions, namely perturbation H˜ (1) [ j] [ j] −2 [i j] [i j] I ˆ (1) ν , (5.3) H˜ (1) (ν[i] , µ[ j] , ζ ) = f i−2 , µ + c log f ν H [i] j i j [i] I I [i j] is a homogeneous function of degree one in its first argument.24 Following where Hˆ (1) [i j] for [1], we now trade H˜ (1) [i j] [i j] H(1) (η, µ˘ [ j] , ζ ) ≡ ζ −1 f 02j H˜ (1) (ζ f 0i−2 η, µ˘ [ j] , ζ ) [ j] [ j] [i j] η I , µ˘ I + c I log f 0−2 . = Hˆ (1) j ζη
We then perform the gauge transformation (3.13) to obtain [ j] [ j] [ j] [i j] [i j] H(1) η I , µ˘ T ;I + c I log η + c[0] (η, µ˘ T , ζ ) = Hˆ (1) I log ζ .
(5.4)
(5.5)
Finally, we trade the argument $\breve\mu^{[j]}_{T;I}$ for the real multiplet
\[ \rho_I(\zeta) \equiv -i\left(\breve\mu^{[0]}_{T;I} + \breve\mu^{[\infty]}_{T;I}\right) - i\, c^{[0\infty]}_I \log\zeta . \] (5.6)
This quantity has the advantage of having non-anomalous O(0) transformations; moreover, the reality conditions are also automatically satisfied provided $H^{[i\bar\imath]}_{(1)}$ is a real function of $\eta^I$ and $\rho_I$. After these redefinitions, $H^{[ij]}_{(1)}$ is now a function of $\eta^I$, $\rho_I$, homogeneous of degree one in $\eta^I$, and with no explicit dependence on ζ. In addition, it must satisfy the co-cycle condition (3.8) and is subject to the gauge equivalence (3.9), where $G^{[i]}$ is now a function of $\eta^I$, $\rho_I$ regular in the patch $\hat U_i$. We may now borrow the results from [1], Sect. 5. In particular, the first order variation of the HK twistor lines is given by
\[ \hat\nu^I_{[i]} = i f_{0i}^{-2} \sum_j \oint_{C_j} \frac{d\zeta'}{2\pi i\,\zeta'}\, \frac{\zeta^3 + \zeta'^3}{\zeta'(\zeta' - \zeta)}\, H^{[0j]I}_{(1)}(\zeta'), \] (5.7)
\[ \hat\mu^{[i]}_{T;I} = \sum_j \oint_{C_j} \frac{d\zeta'}{2\pi i\,\zeta'}\, \frac{\zeta + \zeta'}{2(\zeta' - \zeta)}\, G^{[0j]}_I(\zeta') \] (5.8)
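with
\[ H_I \equiv \partial_{\eta^I} H, \qquad H_{IJ} \equiv \partial_{\eta^I}\partial_{\eta^J} H, \qquad H_{(1)I} \equiv \partial_{\eta^I} H_{(1)}, \qquad H^I_{(1)} \equiv \partial_{\rho_I} H_{(1)}, \]
\[ G^{[ij]}_I \equiv H^{[ij]}_{(1)I} + i\, H^{[ij]J}_{(1)} \left( H^{[j0]}_{IJ} + H^{[j\infty]}_{IJ} \right) + \zeta^{-1} f_{0i}^{2}\, \hat\nu^J_{[i]}\, H^{[ij]}_{IJ} . \] (5.9)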
24 For simplicity, we do not consider deformations of the anomalous dimensions. However, it may be checked that all formulae below continue to hold provided $c^{[i]}_I$ denote the total perturbed anomalous dimensions.
The corresponding deformations of the Kähler potential can be conveniently described by introducing the deformed Lagrangian
\[ \mathcal{L}(v, \bar v, x, \varrho) = \sum_j \oint_{C_j} \frac{d\zeta}{2\pi i\,\zeta} \left[ H^{[0j]}(\eta) + H^{[0j]}_{(1)}(\eta, \rho) \right] . \] (5.10)
So defined, it is a function of the complex variables $v^I$. However, after perturbation the $v^I$ are no longer Darboux coordinates. Instead, a system of complex Darboux coordinates of the deformed HKC, such that $\omega^+_{\mathcal{S}} = dw_I \wedge du^I$, is given by
\[ u^I = v^I + i\, \partial_{\varrho_I} \sum_j \oint_{C_j} \frac{d\zeta}{2\pi i}\, H^{[0j]}_{(1)}, \qquad w_I = \frac{i}{2}\, \varrho_I + \frac{1}{2}\, \partial_{x^I} \mathcal{L}(u, \bar u, x, \varrho), \] (5.11)
where in the second relation the arguments $v^I$ of the Lagrangian are replaced by the new complex variables $u^I$ and the derivative is evaluated keeping $u^I$, $\bar u^I$ and $\varrho_I$ fixed. Similarly, the Kähler potential for the deformed HK metric is given by the Legendre transform of the deformed Lagrangian (5.10), but written as a function of the new variables,
\[ \chi(u, \bar u, w, \bar w) = \left\langle \mathcal{L}(u, \bar u, x, \varrho) - x^I (w_I + \bar w_I) \right\rangle_{x^I} . \] (5.12)
In particular, the variation of the hyperkähler potential is given by a Penrose-type integral,
\[ \chi_{(1)}(u, \bar u, w, \bar w) = \sum_j \oint_{C_j} \frac{d\zeta}{2\pi i\,\zeta}\, H^{[0j]}_{(1)}(\eta, \rho). \] (5.13)
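We now verify that the perturbed HK manifold is indeed a HKC (as is of course guaranteed by construction). Using the quasi-homogeneity property of $H_{(1)}(\eta, \rho)$, and in particular the property
\[ \left( u^I \partial_{u^I} - \bar u^I \partial_{\bar u^I} + \zeta \partial_\zeta \right) \rho_J = 0, \] (5.14)
it is easily checked that $\mathcal{L}$ satisfies
\[ u^I \mathcal{L}_{u^I} - \bar u^I \mathcal{L}_{\bar u^I} = -2i\, c^{[0]}_I\, \mathcal{L}_{\varrho_I}, \qquad x^I \mathcal{L}_{x^I} + u^I \mathcal{L}_{u^I} + \bar u^I \mathcal{L}_{\bar u^I} = \mathcal{L} - 2 c^{[0]}_I x^I . \] (5.15)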
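Together with the identities
\[ \left( \partial_{x^I} + i \mathcal{L}_{x^I x^K} \partial_{\varrho_K} \right) \left( \partial_{x^J} - i \mathcal{L}_{x^J x^L} \partial_{\varrho_L} \right) \mathcal{L} + \left( \partial_{u^I} - i \mathcal{L}_{u^I x^K} \partial_{\varrho_K} \right) \left( \partial_{\bar u^J} + i \mathcal{L}_{\bar u^J x^L} \partial_{\varrho_L} \right) \mathcal{L} = 0, \] (5.16)
\[ \left( \partial_{x^I} + i \mathcal{L}_{x^I x^K} \partial_{\varrho_K} \right) \left( \partial_{u^J} - i \mathcal{L}_{u^J x^L} \partial_{\varrho_L} \right) \mathcal{L} - \left( \partial_{x^J} + i \mathcal{L}_{x^J x^K} \partial_{\varrho_K} \right) \left( \partial_{u^I} - i \mathcal{L}_{u^I x^L} \partial_{\varrho_L} \right) \mathcal{L} = 0, \] (5.17)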
these equations guarantee that L satisfies the constraints of superconformal invariance. Moreover, from (2.17), one may compute the homothetic Killing vector and the Killing vectors for the SU (2) isometric action, χ w I = −2c[0] I . δu I = i3 u I + + x I + iL I , δ u¯ I = −i3 v¯ I + − x I − iL I , I
χ u = 2u I ,
δw I = + Lu I − i 3 c[0] ¯ I = − Lu¯ I + i 3 c[0] I , δw I .
(5.18) (5.19)
In particular, the homothetic Killing vector is holomorphic and identical to the undeformed case. Moreover, one may check that the one-form obtained by lowering the index on $k^+$ using the deformed metric reproduces the Liouville form (2.64) on S in the patch i = 0 [4].
5.2. Perturbed contact twistor lines. In order to extract the deformed quaternionic-Kähler metric on M, one possible strategy is to study the deformations of the superconformal quotient: this computationally intensive approach is outlined in Appendix C. However, it turns out to be more economical and elegant to work directly with the complex contact structure on the twistor space $\mathcal{Z}_{\mathcal{M}}$, without reference to the Swann bundle and its twistor space. For this purpose, let us recast the deformed symplectomorphism (5.1) into the form of the contact transformations (2.71). Introducing the same coordinates as in (2.65), it is easy to check that the deformed contact transformations are generated by the following transition functions:
\[ \hat S^{[ij]}(\xi_{[i]}, \tilde\xi^{[j]}_I) = \tilde\xi^{[j]}_\flat + \xi^\Lambda_{[i]}\, \tilde\xi^{[j]}_\Lambda - \hat H^{[ij]}(\xi_{[i]}) - \hat H^{[ij]}_{(1)}(\xi_{[i]}, \tilde\xi^{[j]}_I), \] (5.20)
where $\tilde\xi^{[j]}_I$ should be replaced by $\tilde\xi^{[j]}_I + c^{[j]}_I \log(\hat f_{ij}^{-2})$. However, it follows from (2.73) that
\[ \hat f_{ij}^{2} \approx 1 - \partial_{\tilde\xi^{[j]}_\flat} \hat H^{[ij]}_{(1)}, \] (5.21)
so that its logarithm is of first order in the perturbation already. Therefore, to the first order it is consistent to neglect the term $c^{[j]}_I \log(\hat f_{ij}^{-2})$ in the argument of $\hat H^{[ij]}_{(1)}$, and take $\hat H^{[ij]}_{(1)}$ to be an arbitrary function of the undeformed coordinates $\breve\xi^\Lambda_{[i]}$ and $\breve{\tilde\xi}{}^{[j]}_I$. As a result, one finds the following deformed contact transformations:
\[ \xi^\Lambda_{[i]} = \xi^\Lambda_{[j]} - T^\Lambda_{[ij]}, \qquad \tilde\xi^{[i]}_I = \tilde\xi^{[j]}_I - \tilde T^{[ij]}_I, \] (5.22)
where, in view of later applications, we abbreviated [i j] [i j] T[ij] ≡ −∂ξ˜ [ j] Hˆ (1) + ξ[i] ∂ξ˜ [ j] Hˆ (1) , [i j] [ j] [i j] [i j] − c ∂ξ˜ [ j] Hˆ (1) T˜ ≡ ∂ξ Hˆ [i j] + Hˆ (1) , (5.23) [i] [i j] [i j] I [ j] [i j] [i j] [i j] − ξ[i] + c I ξ[i] T˜ ≡ Hˆ [i j] + Hˆ (1) ∂ξ Hˆ [i j] + Hˆ (1) + c ∂ξ˜ [ j] Hˆ (1) . [i]
As usual, the functions $\hat H^{[ij]}_{(1)}$ must satisfy the co-cycle condition (3.8) and are defined up to the gauge equivalence (3.10). Thus, they define an element in the Čech cohomology
group25 $H^1(\mathcal{Z}_{\mathcal{M}}, O(2))$, realizing LeBrun's assertion that this group classifies the QK deformations of M [14]. We now determine the deformed contact twistor lines. For definiteness we focus on the coordinate $\xi^\Lambda_{[+]}(z, x^\mu)$ around $\zeta = \zeta_+$, i.e. z = 0. The pole and the constant term in the Laurent expansion (2.80) are readily obtained by contour integrating around z = 0:
\[ \xi^\Lambda_{[+]}(z, x^\mu) = Y^\Lambda_+ z^{-1} + A^\Lambda_+ + O(z), \] (5.24)
where
\[ A^\Lambda_+ = \oint_0 \frac{dz}{2\pi i\, z}\, \xi^\Lambda_{[+]}, \qquad Y^\Lambda_+ = \oint_0 \frac{dz}{2\pi i}\, \xi^\Lambda_{[+]} . \] (5.25)
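On the other hand, the full Laurent series expansion of $\xi^\Lambda_{[+]}(z, x^\mu)$ at z = 0 is given by the series
\[ \xi^\Lambda_{[+]}(z) = Y^\Lambda_+ z^{-1} + \sum_{n=0}^{\infty} \oint_0 \frac{dz'}{2\pi i\, z'} \left( \frac{z}{z'} \right)^{n} \xi^\Lambda_{[+]}(z'). \] (5.26)
The contour around z = 0 may be deformed into a sum of contours $\tilde C_j$ around the other singularities in the z plane. Using the contact transformations (5.23) on each patch, we obtain
\[ \xi^\Lambda_{[+]}(z) = Y^\Lambda_+ z^{-1} - \sum_{n=0}^{\infty} \sum_{j \neq +} \oint_{\tilde C_j} \frac{dz'}{2\pi i\, z'} \left( \frac{z}{z'} \right)^{n} \left[ \xi^\Lambda_{[j]}(z') - T^\Lambda_{[+j]}(z') \right] . \] (5.27)
The first term in the square bracket gives a non-vanishing contribution for j = − and n = 0, 1 only, while the second term contributes an infinite Laurent series. Therefore, we arrive at the following representation:
\[ \xi^\Lambda_{[+]}(z) = Y^\Lambda_+ z^{-1} + A^\Lambda_- - Y^\Lambda_- z + \sum_j \oint_{\tilde C_j} \frac{dz'}{2\pi i\,(z' - z)}\, T^\Lambda_{[+j]}(z'), \] (5.28)
where now
\[ A^\Lambda_- = -\oint_\infty \frac{dz}{2\pi i\, z}\, \xi^\Lambda_{[-]}, \qquad Y^\Lambda_- = \oint_\infty \frac{dz}{2\pi i\, z^2}\, \xi^\Lambda_{[-]} . \] (5.29)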
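The step from (5.27) to (5.28) rests on resumming a geometric series into a Cauchy kernel, and the later symmetric form (5.31) on a simple partial-fraction identity. The sketch below checks both numerically at one sample point. It is only an illustration: the function names and the sample values of z and z' are our own choices, and the check assumes |z| < |z'| so that the series converges.

```python
# Numerical check of the kernels behind (5.27) -> (5.28) -> (5.31).
# All names and sample values are illustrative assumptions, not from the paper.

def cauchy_kernel_partial_sum(z, zp, terms):
    """Partial sum of sum_{n>=0} (z/zp)^n / zp, which tends to 1/(zp - z) for |z| < |zp|."""
    ratio = z / zp
    total = 0j
    power = 1 + 0j
    for _ in range(terms):
        total += power / zp
        power *= ratio
    return total


def symmetrized_kernel(z, zp):
    """(1/2) (zp + z)/(zp - z) / zp, the kernel multiplying T under dz' in (5.31)."""
    return 0.5 * (zp + z) / (zp - z) / zp


z, zp = 0.3 + 0.1j, 1.2 - 0.5j

# Geometric series resummation used to pass from (5.27) to (5.28).
series_error = abs(cauchy_kernel_partial_sum(z, zp, 60) - 1 / (zp - z))

# Trading A_- for A = (A_+ + A_-)/2 subtracts 1/(2 zp) from the Cauchy kernel.
identity_error = abs(1 / (zp - z) - 1 / (2 * zp) - symmetrized_kernel(z, zp))

print(series_error, identity_error)
```

The second identity is exact algebra, so its residual is pure rounding error; the first depends on the number of terms kept and on the ratio |z/z'|.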
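From the reality conditions (2.70), we conclude that $A^\Lambda_- = (A^\Lambda_+)^*$, $Y^\Lambda_- = (Y^\Lambda_+)^*$. Comparing the $O(z^0)$ terms between (5.24) and (5.28) gives the difference
\[ A^\Lambda_+ - A^\Lambda_- = \sum_j \oint_{\tilde C_j} \frac{dz'}{2\pi i\, z'}\, T^\Lambda_{[+j]}(z'). \] (5.30)
Eliminating $A^\Lambda_-$ in (5.28) in favor of the real quantity $A^\Lambda = (A^\Lambda_+ + A^\Lambda_-)/2$ leads to
\[ \xi^\Lambda_{[i]}(z) = A^\Lambda + z^{-1} Y^\Lambda_+ - z\, Y^\Lambda_- + \frac{1}{2} \sum_j \oint_{\tilde C_j} \frac{dz'}{2\pi i\, z'}\, \frac{z' + z}{z' - z}\, T^\Lambda_{[+j]}(z') \] (5.31)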
25 The twisting by O(2) is not apparent in our formalism, as explained in footnote 14, but follows from the fact that $\hat H^{[ij]}_{(1)}$ originates from a homogeneous function of degree 1 on S.
for i = +. As observed below (3.53), these equations are in fact valid in any patch $U_i$, since they exhibit the correct discontinuities across the contours $\tilde C_i$. In an analogous way one may obtain the deformed conjugate coordinates $\tilde\xi^{[i]}_I$. The Laurent coefficient $\tilde\xi^{[+]}_{I,0}$ may be extracted by integrating
\[ \tilde\xi^{[+]}_{I,0} = \oint_0 \frac{dz}{2\pi i\, z} \left[ \tilde\xi^{[+]}_I - c^{[+]}_I \log z \right] = i B^+_I - c^{[+]}_I \oint_0 \frac{dz}{2\pi i\, z}\, \log\left( z\, \xi^0_{[+]} \right), \] (5.32)
where we defined
\[ B^+_I \equiv -i \oint_0 \frac{dz}{2\pi i\, z} \left[ \mu^{[+]}_I + c^{[+]}_I \log \nu^0_{[+]} \right] . \] (5.33)
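On the other hand, the Laurent series expansion around z = 0 may be obtained by deforming the contour around z = 0 into a sum of contours around the other singularities in the z plane, and using the symplectomorphism (5.22) to map $\tilde\xi^{[+]}_I$ to $\tilde\xi^{[j]}_I$:
\[ \tilde\xi^{[+]}_I(z) = c^{[+]}_I \log z - \sum_{n=0}^{\infty} \sum_{j \neq +} \oint_{\tilde C_j} \frac{dz'}{2\pi i\, z'} \left( \frac{z}{z'} \right)^{n} \left[ \tilde\xi^{[j]}_I - c^{[+]}_I \log z' - \tilde T^{[+j]}_I \right], \] (5.34)
where $\tilde T^{[ij]}_I$ are defined in (5.23). The first two terms in the bracket only contribute when j = −. The cancellation of the logarithmic singularity at z = ∞ is ensured by the condition $c^{[+]}_I = -c^{[-]}_I$, corresponding to the figure-eight contour prescription discussed in [1]. Using (2.81), we obtain
\[ \tilde\xi^{[+]}_I(z) = c^{[+]}_I \log z + i B^-_I + c^{[-]}_I \oint_\infty \frac{dz'}{2\pi i\, z'}\, \log\frac{\xi^0_{[-]}}{z'} + \sum_j \oint_{\tilde C_j} \frac{dz'}{2\pi i\,(z' - z)}\, \tilde T^{[+j]}_I, \] (5.35)
where
\[ B^-_I \equiv i \oint_\infty \frac{dz}{2\pi i\, z} \left[ \mu^{[-]}_I + c^{[-]}_I \log \nu^0_{[-]} \right] \] (5.36)
is the complex conjugate of $B^+_I$. Comparing the z-independent terms in (5.34) and (5.35) establishes the identity
\[ i \left( B^+_I - B^-_I \right) = c^{[+]}_I \oint_0 \frac{dz}{2\pi i\, z}\, \log\left( z\, \xi^0_{[+]} \right) - c^{[-]}_I \oint_\infty \frac{dz}{2\pi i\, z}\, \log\frac{z}{\xi^0_{[-]}} + \sum_j \oint_{\tilde C_j} \frac{dz}{2\pi i\, z}\, \tilde T^{[+j]}_I . \] (5.37)
Eliminating $B^-_I$ in (5.35) in favor of $B_I = B^+_I + B^-_I$ leads to
\[ \tilde\xi^{[+]}_I(z) = \frac{i}{2} B_I - \frac{1}{2} c^{[+]}_I \left[ \oint_0 \frac{dz'}{2\pi i\, z'}\, \log\left( z' \xi^0_{[+]} \right) + \oint_\infty \frac{dz'}{2\pi i\, z'}\, \log\left( \xi^0_{[-]}/z' \right) \right] + \frac{1}{2} \sum_j \oint_{\tilde C_j} \frac{dz'}{2\pi i\, z'}\, \frac{z' + z}{z' - z}\, \tilde T^{[+j]}_I + c^{[+]}_I \log z . \] (5.38)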
Using the fact that $\xi^0_{[+]} \sim Y^0_+/z$ and $\xi^0_{[-]} \sim Y^0_- z$ at z = 0 and ∞, we finally obtain
\[ \tilde\xi^{[i]}_I(z) = \frac{i}{2} B_I + \frac{1}{2} \sum_j \oint_{\tilde C_j} \frac{dz'}{2\pi i\, z'}\, \frac{z' + z}{z' - z}\, \tilde T^{[+j]}_I + c^{[+]}_I \log\left( z \sqrt{\frac{Y^0_-}{Y^0_+}} \right) \] (5.39)
for i = +, and in fact also for any i. This relation generalizes (3.53) to the perturbed case. Taken together, (5.31) and (5.39) give the contact twistor lines of the deformed twistor space in terms of the perturbation $H^{[ij]}_{(1)}$, which is considered as a function of the undeformed twistor lines $\breve\xi^\Lambda_{[i]}$, $\breve{\tilde\xi}{}^{[i]}_I$ given in (3.20) and (3.53). In order to make contact with the construction in Sect. 3.3, one should recall that $Y^\Lambda_+ = R Z^\Lambda$, where
\[ R = \oint_0 \frac{dz}{2\pi i}\, \xi^0_{[+]}, \qquad Z^\Lambda = \oint_0 \frac{dz}{2\pi i\, z}\, \frac{\xi^\Lambda_{[+]}}{\xi^0_{[+]}}, \] (5.40)
in such a way that $Z^0 = 1$. To obtain the perturbed quaternionic-Kähler metric, we should also calculate the leading Laurent coefficients of the contact potentials $\Phi_{[\pm]}$ given in (2.84). It turns out that one can actually compute the full contact potentials, using the gluing conditions
\[ e^{-\Phi_{[i]}} - e^{-\Phi_{[j]}} = e^{-\phi}\, \partial_{\tilde\xi^{[j]}_\flat} \hat H^{[ij]}_{(1)}, \] (5.41)
where φ is defined by (3.38) in the unperturbed geometry. These conditions follow from the gluing conditions for $\tilde\nu_{[i]}$ in (2.69), using the results for the transition functions $\hat f^2_{ij}$ (5.21) and the unperturbed $\tilde\nu_{[i]}$ (3.45). Repeating again the same steps as above, one easily arrives at
\[ e^{\Phi_{[i]}} = e^{\phi} \left( 1 + \frac{1}{2} \sum_j \oint_{\tilde C_j} \frac{dz'}{2\pi i\, z'}\, \frac{z' + z}{z' - z}\, \partial_{\tilde\xi^{[j]}_\flat} \hat H^{[0j]}_{(1)}(z') \right), \] (5.42)
where φ is defined in the perturbed case as
\[ \phi \equiv \mathrm{Re}\left[ \Phi_{[+]}(z = 0) \right] = \frac{1}{2} \left( \phi^0_{[+]} + \phi^0_{[-]} \right) . \] (5.43)
Note that this definition coincides with (3.38) in the unperturbed case. Using (2.84), the leading Laurent coefficient of the contact potentials are given by ⎛ ⎞ 0 dz dz 1 1 1 1 [0 j] [+] ⎝ eφ[+] = Y+ T[0j] ⎠ + c[+] , A + T˜ + c 2 2 2 2 2 C˜ j 2π iz C˜ j 2π i z j j ⎛ ⎞ (5.44) 0 dz dz 1 1 1 1 [0 j] [−] ⎝ eφ[−] = − Y− T[0j] ⎠ − c[−] . A − T˜ − c 2 2 2 2 C˜ j 2π i C˜ j 2π i z j
j
Inserting in (5.43) we therefore obtain dz −1 1 1 [0 j] φ z Y+ − zY− T˜ + c[+−] AI , e = I 4 2π i z 4 C˜ j j
(5.45)
and the full contact potentials via (5.42). As a useful consistency check, note that the difference of (5.44) can be rewritten, after some considerable work, as
\[ \phi^0_{[+]} - \phi^0_{[-]} = \sum_j \oint_{\tilde C_j} \frac{dz}{2\pi i\, z}\, \partial_{\tilde\xi^{[j]}_\flat} \hat H^{[0j]}_{(1)}, \] (5.46)
consistently with (5.42). Altogether these results allow us to extract the metric following the procedure outlined at the end of Sect. 2.5. It is straightforward to relate this contact construction on $\mathcal{Z}_{\mathcal{M}}$ to the symplectic construction on $\mathcal{Z}_{\mathcal{S}}$. For this purpose, one needs to apply the change of variable (2.61), where $\zeta_\pm$ denote the location of the zeros of the perturbed section $\nu^\flat$, to all contour integrals in the z plane. Under this change of variable, the integration measure becomes
\[ \frac{dz}{2\pi i\, z} = \frac{(\zeta_+ - \zeta_-)}{(\zeta - \zeta_+)(\zeta - \zeta_-)}\, \frac{d\zeta}{2\pi i} . \] (5.47)
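The measure (5.47) is the logarithmic derivative of a Möbius map with a zero at ζ+ and a pole at ζ−, so any overall constant in the change of variable (2.61) drops out. The following sketch checks this by finite differences at one sample point. We assume here that (2.61) has the form z ∝ (ζ − ζ+)/(ζ − ζ−), as suggested by (C.12); the function names and sample values are illustrative only.

```python
# Finite-difference check of (5.47), assuming z is proportional to
# (zeta - zeta_plus)/(zeta - zeta_minus). Names and values are illustrative.

def log_derivative_of_z(zeta, zeta_plus, zeta_minus, h=1e-6):
    """Central-difference estimate of (dz/dzeta)/z; a constant prefactor in z cancels."""
    def z(x):
        return (x - zeta_plus) / (x - zeta_minus)
    return (z(zeta + h) - z(zeta - h)) / (2 * h * z(zeta))


def measure_density(zeta, zeta_plus, zeta_minus):
    """Right-hand side of (5.47) without the common factor 1/(2 pi i)."""
    return (zeta_plus - zeta_minus) / ((zeta - zeta_plus) * (zeta - zeta_minus))


zeta, zeta_plus, zeta_minus = 0.4 + 0.7j, 1.5 - 0.2j, -0.6 + 0.9j
mismatch = abs(log_derivative_of_z(zeta, zeta_plus, zeta_minus)
               - measure_density(zeta, zeta_plus, zeta_minus))
print(mismatch)
```

The density has residue +1 at ζ+ and −1 at ζ−, which is why the orientation of the contour around ζ− matters when the measure is rewritten patch by patch.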
The expression of this measure to first order in the deformation can be found in (C.9). However, in some cases it can be simplified. For example, integrated against a function which is regular at ζ±, this may be rewritten as
r± dζ dz = , 2 ν 2π iz 2π i f 0± [±]
1
≡
r±
C+
dζ
2 ν 2π i f 0± [±]
,
(5.48)
where the factor r± ensures that the residue at ζ = ζ± is equal to one. Integrating a function with a simple pole at ζ± , (5.48) must be generalized to r± s± 1 dz dζ = + + , (5.49) 2π iz 2π i f 2 ν ζ− − ζ+ r± 0± [±]
where $s_\pm$ is the second coefficient in the Taylor expansion
\[ f_{0\pm}^{2}\, \nu^\flat_{[\pm]} = r_\pm (\zeta - \zeta_\pm) + s_\pm (\zeta - \zeta_\pm)^2 + \cdots . \] (5.50)
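To see where a constant correction of the type appearing in (5.49) comes from, one can compare the exact measure (5.47) with the pole term built from the Taylor expansion (5.50) near ζ+. The sketch below does this numerically for the + patch only; we have not tried to reproduce the sign and orientation conventions of the − patch. The truncation of the expansion to its first two terms, the function names and the sample values are all assumptions made for illustration.

```python
# Near zeta_plus, compare the exact density of (5.47) with r/(f^2 nu), using the
# two-term truncation of (5.50). Only the "+" patch is considered; all names and
# numbers are illustrative assumptions.

def exact_density(zeta, zeta_plus, zeta_minus):
    """(zeta_+ - zeta_-)/((zeta - zeta_+)(zeta - zeta_-)), as in (5.47)."""
    return (zeta_plus - zeta_minus) / ((zeta - zeta_plus) * (zeta - zeta_minus))


def pole_term(zeta, zeta_plus, r, s):
    """r / (r d + s d^2) with d = zeta - zeta_plus; unit residue at zeta_plus."""
    d = zeta - zeta_plus
    return r / (r * d + s * d * d)


def constant_correction(zeta_plus, zeta_minus, r, s):
    """Finite remainder at zeta_plus: s/r + 1/(zeta_- - zeta_+)."""
    return s / r + 1 / (zeta_minus - zeta_plus)


zeta_plus, zeta_minus = 0.5 + 0.2j, -1.0 + 0.4j
r, s = 2.0 - 1.0j, 0.3 + 0.7j
zeta = zeta_plus + 1e-5 * (1 + 1j)

remainder = exact_density(zeta, zeta_plus, zeta_minus) - pole_term(zeta, zeta_plus, r, s)
print(abs(remainder - constant_correction(zeta_plus, zeta_minus, r, s)))
```

The remainder stays finite as ζ → ζ+, so it only matters when the integrand itself has a pole there, in line with the statement that (5.48) suffices for functions regular at ζ±.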
Note that the correction term in round brackets in (5.49) vanishes in the case where ν remains a global O(2) section. In this way, we may rewrite the invariant coordinates introduced above as follows: ν[±] dζ Y± ≡ r± z±1 , 2 C± 2π i f 0± (ν[±] )2 ν[±] r± s± 1 dζ A± ≡ ± + + , (5.51) 2 ζ− − ζ+ r± C± 2π i f 0± ν[±] ν[±] dζ [±] [±] ± 0 µ . B I ≡ ∓ir± + c log ν [±] I I 2 ν C± 2π i f 0± [+] From the computation of the deformed hyperkähler potential (C.22) in Appendix C, one may check that the relation (3.38) continues to hold after perturbation, provided φ
is defined by (5.43) and r = (r+ + r− )/2. From the general equation (2.78), this implies that 2 1 + z z¯ Re [[+] (x r = |v fˆ0+ e | |z|
µ ,z)−φ 0 (x µ )] [+]
.
(5.52)
We have not attempted to check this relation directly. In [19], we shall apply this general framework to the hypermultiplet moduli space in compactifications of type II string theory on a Calabi-Yau three-fold. Acknowledgements. We are grateful to A. Neitzke for discussions and former collaboration on related topics. The research of S.A. is supported by CNRS and by the contract ANR-05-BLAN-0029-01. The research of B.P. is supported in part by ANR(CNRS-USAR) contract no.05-BLAN-0079-01. F.S. acknowledges financial support from the ANR grant BLAN06-3-137168. S.V. thanks the Federation de Recherches “Interactions Fondamentales” and LPTHE at Jussieu for hospitality and financial support. Part of this work is also supported by the EU-RTN network MRTN-CT-2004-005104 “Constituents, Fundamental Forces and Symmetries of the Universe”. Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
A. Infinitesimal SU(2) Transformations
In this appendix, we study the infinitesimal action of SU(2) on the local sections introduced in the main text. We parametrize the Lie algebra of SU(2) by $\epsilon_\pm = (\epsilon_\mp)^*$ and $\epsilon_3 = (\epsilon_3)^*$ such that
\[ \begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix} = \begin{pmatrix} 1 - \frac{i}{2}\epsilon_3 & \epsilon_+ \\ -\epsilon_- & 1 + \frac{i}{2}\epsilon_3 \end{pmatrix} + O(\epsilon^2). \] (A.1)
1 3 −+ π2 π2 π π 2 = · . δ −π¯ 2 π¯ 1 −π¯ 2 π¯ 1 − − 2i 3
(A.2)
The finite action of SU(2) on O(2n) sections was discussed in Sect. 2.4. At the infinitesimal level, (2.41) reduces to
\[ \delta\zeta \equiv \zeta' - \zeta = \epsilon_+ - i \epsilon_3\, \zeta + \epsilon_-\, \zeta^2 + O(\epsilon^2). \] (A.3)
In the patch i = 0, the O(2n) transformation rule (2.44) then leads to
I I I δν[0] (ζ ) ≡ ν[0] (ζ ) − ν[0] (ζ )
I (ζ ) + O( 2 ). (A.4) = + ∂ζ − i3 ζ ∂ζ − n + − ζ 2 ∂ζ − 2nζ ν[0]
Thus, the Taylor coefficients of $\nu^I_{[0]} = \sum_m \nu^I_m \zeta^m$ around ζ = 0 vary under an infinitesimal SU(2) action by
\[ \delta\nu^I_m = (m+1)\, \nu^I_{m+1}\, \epsilon_+ - i (m - n)\, \nu^I_m\, \epsilon_3 + (m - 2n - 1)\, \nu^I_{m-1}\, \epsilon_- . \] (A.5)
The variation of $\mu^{[0]}_I$ and its Laurent coefficients $\mu_{I,m}$ is obtained by replacing $\nu^I \to \mu_I$, n → 1 − n in these expressions.
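The coefficient rule (A.5) can be cross-checked against the differential operator of (A.4) acting on a polynomial section. In the sketch below we read that operator as ε+ ∂ζ − i ε3 (ζ ∂ζ − n) + ε− (ζ² ∂ζ − 2nζ), which is our interpretation of the flattened formula and should be treated as an assumption; the function names and test data are hypothetical. For n = 1 and ν = v + xζ − v̄ζ² the rule reproduces δx = −2(ε− v + ε+ v̄) from (C.23) and generates no cubic term.

```python
# Compare the operator form (A.4), under the reading stated above, with the
# coefficient rule (A.5). Names and sample data are illustrative assumptions.

def apply_operator(coeffs, n, ep, e3, em):
    """Apply ep*d/dz - i*e3*(z d/dz - n) + em*(z^2 d/dz - 2 n z) to sum_m coeffs[m] z^m."""
    out = [0j] * (len(coeffs) + 1)
    for m, c in enumerate(coeffs):
        if m >= 1:
            out[m - 1] += ep * m * c           # ep * d/dz lowers the power by one
        out[m] += -1j * e3 * (m - n) * c       # -i e3 (z d/dz - n) keeps the power
        out[m + 1] += em * (m - 2 * n) * c     # em (z^2 d/dz - 2 n z) raises it by one
    return out


def coefficient_rule(coeffs, n, ep, e3, em):
    """Right-hand side of (A.5), evaluated for m = 0, ..., len(coeffs)."""
    def nu(m):
        return coeffs[m] if 0 <= m < len(coeffs) else 0j
    return [(m + 1) * nu(m + 1) * ep
            - 1j * (m - n) * nu(m) * e3
            + (m - 2 * n - 1) * nu(m - 1) * em
            for m in range(len(coeffs) + 1)]


v, x = 1 + 2j, 0.5
ep, e3 = 0.3 + 0.1j, 0.2
em = ep.conjugate()
section = [v, x, -v.conjugate()]   # zeta * eta for an O(2) multiplet, n = 1

lhs = apply_operator(section, 1, ep, e3, em)
rhs = coefficient_rule(section, 1, ep, e3, em)
print(max(abs(a - b) for a, b in zip(lhs, rhs)), abs(lhs[3]))
```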
In an arbitrary patch Ui , the SU (2) action (2.44) is most easily expressed in terms I (ζ ), which formally transforms in the same way as ν I (ζ ). Similarly, the of f i0−2n (ζ ) ν[i] [0] SU (2) action (2.54) is most easily stated in terms of f i0−2 (ζ ) exp(−µ[i] /c[i] I ), which also I (ζ ). After the gauge transformation (3.13), formally transforms in the same way as ν[0]
the transformation rules of µ[i] I are changed to
+ [i] 2 ∂ δµ[i] (ζ ) = − i ζ + ζ µ (ζ ) + + ζ c[0] − i + 3 − ζ T ;I 3 − I T ;I ζ
+ − − ζ c[i] − I , ζ
(A.6)
consistently with the fact that (3.18) transforms like a non-anomalous O(0) section. The variation (A.5) applies for the Laurent coefficients of any local section of O(2n). In particular, one may consider a homogeneous function $G(\nu_\alpha)$ of $O(2n_\alpha)$ multiplets $\nu_\alpha$, of homogeneity degree n when each $\nu_\alpha$ is scaled with homogeneity degree $n_\alpha$:
\[ \sum_\alpha n_\alpha\, \nu_\alpha\, \partial_{\nu_\alpha} G = n\, G . \] (A.7)
Then, for an arbitrary contour (not necessarily surrounding the origin), the SU(2) variation of the integrals
\[ G_m \equiv \oint \frac{d\zeta}{2\pi i\, \zeta^{m+1}}\, G(\nu_\alpha), \] (A.8)
is given by
\[ \delta G_m = (m+1)\, G_{m+1}\, \epsilon_+ - i (m - n)\, G_m\, \epsilon_3 + (m - 1 - 2n)\, G_{m-1}\, \epsilon_- , \] (A.9)
as one can check from explicit calculation. In particular, for n = m = −1, we recover the remark in [39], according to which a contour integral of a section of O(−2) is SU(2)-invariant. For n = −3/2, the contour integral of a section of O(−3) with m = −1, −2 instead produces an SU(2) doublet. These observations are central to the superconformal quotient discussed in Sect. 3.2.
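The two special cases quoted above follow directly from the numerical coefficients in (A.9). The short sketch below tabulates them with exact rational arithmetic; the helper name `variation_weights` is hypothetical and the code only evaluates the three prefactors, not the integrals themselves.

```python
from fractions import Fraction

# Prefactors of (A.9): delta G_m = a G_{m+1} eps_+ - i b G_m eps_3 + c G_{m-1} eps_-.
# The helper name is an illustrative assumption.

def variation_weights(m, n):
    """Return (a, b, c) = (m + 1, m - n, m - 1 - 2n) as exact fractions."""
    m, n = Fraction(m), Fraction(n)
    return (m + 1, m - n, m - 1 - 2 * n)


# n = m = -1: all three prefactors vanish, so G_{-1} is SU(2) invariant.
print(variation_weights(-1, -1))

# n = -3/2: G_{-1} and G_{-2} only mix with each other, forming a doublet.
print(variation_weights(-1, Fraction(-3, 2)))
print(variation_weights(-2, Fraction(-3, 2)))
```

For n = −3/2 the coefficient of $G_0$ in $\delta G_{-1}$ and the coefficient of $G_{-3}$ in $\delta G_{-2}$ both vanish, which is what closes the pair into a two-dimensional representation.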
B. An alternative Formulation for Hypermultiplet Moduli Spaces In this appendix we explain the relation between the formulation of the hypermultiplet space used in Sect. 4, and the one introduced in [28] using a different contour prescription, and establish their equivalence up to a local symplectomorphism. Aiming for a Lagrangian L whose limit v → 0 is regular, [28] considered the contour integral L (v, v, ¯ x) = Im
C
dζ 2π iζ
F(η ) , + 4ic η log η η
(B.1)
where the contour C appearing in (4.1) encircles the poles ζ = 0, ζ+ in the counter-clockwise direction and the logarithmic branch cuts connect 0, ζ+ and ζ− , ∞, respectively. The resulting Lagrangian L =
1 F(v) F (v) Im F(η+ ) − x Im 2 + x Im r (v ) v
x +r , +4c x − r + x log 2
(B.2)
differs from (4.3), (4.14) by terms linear in x I only and therefore describes the same metric. In terms of our general discussion, the contour prescription (B.1) arises from the transition functions i F(η) H [0+] = 0, H [0i] = , 2 η (B.3) ¯ i F(η) − F(η) [0−] [0∞] =H = − 4cη log η , H 2 η where i labels the patches where F(η) is singular. These transition functions are related to the ones given in (4.2), (4.15) by the gauge transformation generated by G [0] =
i F(η) − 2c η log(η ζ ), 2 η
G [∞] =
¯ i F(η) + 2c η log(η /ζ ), 2 η (B.4)
G [+] = G [−] = G [i] = −2c η log(ζ ). The new non-vanishing quasi-homogeneity coefficients (3.16) are c[0] = c[+] = −2c,
c[∞] = c[−] = 2c.
(B.5)
Note that in contrast to the description in Sect. 4.1, the coefficient c[0] does not vanish, corresponding to the Lagrangian (B.2) being quasi-homogeneous. We now discuss the twistor lines arising from this new contour prescription. The global O(2) sections η(ζ ) are unchanged while the gauge transformation (B.4) induces i F (η) , 2 η i F(η) + 2c (log(η ζ ) + 1). µ[0] (ζ ) = µ[0] (ζ ) + 2 (η )2
µ[0] (ζ ) = µ[0] (ζ ) −
(B.6)
Consequently, the coordinates w I become w = w −
i F (v) , 2 v
w = w +
i F(v) + 2c (log(v ) + 1). 2 (v )2
(B.7)
The corresponding map between the coordinates $\varrho_I$ and $\varrho'_I$ is readily obtained using $\varrho_I = -i(w_I - \bar w_I)$. Furthermore, the base coordinates (3.36)–(3.39) are identical,
\[ e^{\phi'} = e^{\phi}, \qquad Z'^a = Z^a, \qquad A'^\Lambda = A^\Lambda, \qquad B'_I = B_I, \] (B.8)
where the last equality follows from (3.39) upon a brief computation. Taking into account also that only $c^{[+-]}_I$, which are equal in the two formulations, contribute to the general expressions (3.53), this, in turn, implies that both contour prescriptions give rise to the same twistor lines (4.23).
C. Deformed Superconformal Quotient
In this appendix we generalize the superconformal quotient procedure of Sect. 3.2 to include deformations. While conceptually straightforward, this procedure is laborious compared to the contact geometry approach of Sect. 5.2. Nevertheless, we include it here for completeness, as it provides useful consistency checks on our formalism.
Coordinates on the deformed base. As in the undeformed case described in Sect. 3.2, SU (2) invariant functions on the Swann bundle S can be obtained by contour-integrating O(−2) sections on ZS . In the presence of deformations, the global sections η I in (3.36) I , leading to the definitions26 must be replaced by the deformed local sections ν[+] 1 ≡ Re r Z ≡r a
C+
C+
dζ 1 , 2 2π i f 0+ ν[+]
ν[+] dζ , 2 0 2π i f 0+ ν[+] ν[+]
A ≡ r Re
C+
a
B I ≡ 2r Im C+
ν[+] dζ , 2 2π i f 0+ (ν[+] )2
(C.1) [+] 0 µ[+] dζ I + c I log ν[+] . 2 2π i f 0+ ν[+]
To first order in the deformation, this gives v¯ 1 νˆ ζ,+ − νˆ ζ,− + νˆ + + νˆ − , 2 r a a 0 ˘ νˆ − A νˆ + νˆ + − A˘ 0 νˆ + ˘ a − Z a = Z˘ a + + Z , ζ+ η+0 ζ+ η+0 (C.2)
1 2v¯ ˘ A = A + νˆ ζ,+ − νˆ ζ,− + νˆ + + νˆ − 2r r 1 − 2 Re ζ+ η+ νˆ ζ ζ,+ + 2r A˘ −x + 2v¯ ζ+ νˆ ζ,+ +2 2v¯ A˘ − v¯ νˆ + , (r ) r = r˘ +
where ˘ marks the unperturbed quantities defined in (3.36) and we introduced 2 I I 2 I 2 I νˆ ±I = f 0± νˆ [±] (ζ± ), νˆ ζ,± =∂ζ f 0± νˆ [±] (ζ± ), νˆ ζI ζ,±=∂ζ2 f 0± νˆ [±] (ζ± ). (C.3) We omitted the expansion of B I since it will not be needed. The SU (2) invariant R defined in (3.40) may also be extended to the deformed case as 0 − A 0− A ˘ 0 νˆ − ˘ 0 νˆ + νˆ − 1 ν ˆ + ˘ 1+ R=R + 0 2 ζ+ η+0 ζ− η −
1 2v¯ . (C.4) − νˆ ζ,+ − νˆ ζ,− + νˆ + + νˆ − 2r r Using the formulae given at the end of this appendix, one can check explicitly that the above expressions are indeed SU (2) invariant. 26 Note that dζ / f 2 = dζ [+] is the natural integration measure in the patch U . + 0+
Coordinates on the C2 /Z2 fiber. The coordinates π A on the fiber of S, (3.41) can be similarly generalized to the deformed case as follows: ζ dζ 1 π =C 1/2 2π i C+ 2 ν 2 ν0 f f 0+ 0+ [+] [+]
0 νˆ ζ,+ νˆ + 4v¯ x 0 + 2v 0 /ζ+ νˆ + 1 = π˘ 1 − − − + , r 2r r 2ζ+ η+0 ζ+ η+0 (C.5) ζ dζ 2 ¯ π = −C 1/2 C− 2π i f 2 ν 2 0 0− [−] − f 0− ν[−] 0 νˆ ζ,− νˆ − νˆ − 4v¯ x 0 + 2v 0 /ζ− 2 = π˘ 1 − + − − . 0 0 r 2r r 2ζ− η− ζ− η − The conjugate variables π¯ A can be obtained from (C.5) using ζ+ = −1/ζ− ,
η+ = η− ,
νˆ +I = −ˆν−I /ζ−2 ,
In particular, one has very simple relations π1 νˆ + = ζ˘+ 1 − , ζ+ ≡ − π¯ 2 r ζ+
I I = −ˆ νˆ ζ,+ νζ,− + 2νˆ −I /ζ− .
νˆ − π2 = ζ˘− 1 + ζ− ≡ , π¯ 1 r ζ−
whereas the variable z = π 1 /π 2 parametrizing the fiber of ZM is given by 0 νˆ ζ,+ + νˆ ζ,− νˆ − νˆ +0 + − z = z˘ 1 − 0 r 2ζ+ η+0 2ζ− η−
νˆ − 4v¯ νˆ + 4v¯ x 0 + 2v 0 /ζ+ x 0 + 2v 0 /ζ− + − + − . 0 2r r 2r r ζ+ η+0 ζ− η −
(C.6)
(C.7)
(C.8)
To first order in the perturbation, the two quantities in (C.7) provide the zeros ζ± of the deformed section ν , and (2.57),(2.61) continue to hold. Using the explicit expressions (C.5) for π A , one finds νˆ − r νˆ + dz 1 = − + dζ. (C.9) z ζ η r (ζ − ζ+ )2 (ζ − ζ− )2 Relation to the contact geometric approach. Using this relation, one may relate the coordinates defined here to those defined in Sect. 5.2: ⎞ ⎛ d˘ z ∂ρ H [0 j] (ξ˘ (˘z), ρ(˘z))⎠ , Y+ = R Z ⎝1 − 3i z (1) ˜ j 2π i˘ C j d˘z −1 [0 j] ˘ ˘ z˘ Z + z˘ Z¯ ∂ρ H(1) A = A + i R (ξ (˘z), ρ(˘z)), (C.10) z C˜ j 2π i˘ j
1 r = (r+ + r− ), 2
BI = BI ,
where
˘ z˘ −1 Z˘ − z˘ Z˘¯ , ξ˘ (˘z) = A˘ + R d˘z z˘ + z˘ [i j] ρ I (˘z) = B˘ I − i H (ξ˘ (˘z )) ˘ z˘ − z˘ I ˜ j 2π i z C j [i0] ˘ −i H I (ξ (˘z)) + H I[i∞] (ξ˘ (˘z)) − ic[+−] log z˘ I
(C.11)
parametrize the unperturbed twistor lines and z˘ is related to ζ through the (undeformed) relation (2.61), z˘ = −
1 ζ − ζ˘+ . z¯˘ ζ − ζ˘−
(C.12)
With these definitions and relations, it is tedious but straightforward to compute the deformed complex coordinate ξ = u /u on Z in terms of the coordinates on M × CP 1 , d˘z z˘ + z −1 [0 j] ˘ [0 j] ξ = A +z Y+ −zY− +i ∂ρ H(1) . − ξ (˘z)∂ρ H(1) z z˘ − z C˜ j 2π i˘ j
(C.13) Taking into account the relation
[i0] ˘ [i∞] ˘ ρ I = −2iξ˘˜ [i] − i H ( ξ ) + H ( ξ ) , I I I
(C.14)
one verifies that this expression coincides with the result (5.31) from contact geometry, at ζ = 0, z = z. A similar derivation of ξ˜ I[i] in (5.39) ought to be possible but we have not attempted to carry it through. Deformed hyperkähler potential. Having defined an appropriate set of coordinates on the Swann bundle, we now generalize the representation (3.56) for the hyperkähler potential to the deformed case, and relate it to the contact potential [+] of contact geometry. For this purpose, we note that the hyperkähler potential (5.12) can be written as dζ
1 − x I ∂ η I + ∂ x I ρ J ∂ρ J χ = 1 + νˆ 0I ∂v I + νˆ¯ 0I ∂v¯ I C 2π i ζ (C.15) × (H (η) + H(1) (η, ρ)) , where we omitted the summation over patches and used the relation (5.11) between the undeformed and deformed complex coordinates v I and u I ≡ v I + νˆ 0I . Following the same steps (3.30) which led to the representation (3.56), one finds d˘z −1 ˘ ˘ z˘ Z − z˘ Z˘¯ (H + H(1) ) ξ˘ (˘z), ρ(˘z) χ = 1 + νˆ 0I ∂v I + ν¯ˆ 0I ∂v¯ I r R z C˜ 2π i˘ (C.16) dζ
v ζ −1 + v¯ ζ [+−] ˘ I I x ∂x I − ζ ∂ζ ρ J ∂ρ J H(1) , +r c I A − η C˜ 2π iζ where ξ˘ (˘z) and ρ(˘z) are given in (C.11).
The unperturbed quantities A˘ , Z˘ , etc., are not SU (2) invariant. To arrive at the desired form of the hyperkähler potential, we replace them in the first term of (C.16) by their deformed SU(2) invariant counterparts and collect the remaining terms which are all of order O(H(1) ). The first term in (C.16) then reads
d˘z −1 z˘ Z − z˘ Z¯ (H + H(1) ) ξ(0) (˘z), ρ(˘z) + r c[+−] A I , (C.17) r R I 2π i˘ z ˜ C where ξ(0) (˘z) ≡ A + z˘ −1 Y+ − z˘ Y− .
(C.18)
The remaining terms are of three types: (i) the second term in (C.16), (ii) the terms coming from the derivatives with respect to v I and v¯ I in the first term in (C.16) and (iii) the terms coming from the difference between deformed and undeformed invariants in (C.17). Altogether they should combine in an invariant expression written as a contour integral of a O(−2) section. After a long calculation, one obtains d˘z −1 [0 j] [0 j] χ =r R z˘ Z − z˘ Z¯ (H + H(1) ) + r c[+−] AI I 2π i˘ z ˜ C d˘z −1 [ j0] [ j∞] z˘ Z − z˘ Z¯ H + H − ir R z C˜ 2π i˘ [0 j] [0 j] ∂ρ H(1) − ξ(0) ∂ρ H(1) d˘z d˘z z˘ + z˘ −1 ¯ H [i j] ˘ ˘ + ir R z Z − z Z z C˜ 2π i˘z z˘ − z˘ C˜ 2π i˘ [0i] [0i] ∂ρ H(1) , (C.19) − ξ(0) (˘z )∂ρ H(1) [i j]
[0i] are functions of ξ (˘ where H are functions of ξ(0) (˘z), whereas H(1) z) (or, (0) z) and ρ(˘ in the last term, functions of z˘ ). This expression can be further simplified. First, taking into account (C.14) and [i j] [i j] [i j] [i j] H (ξ ) = Hˆ [i j] (ξ ) − ξ Hˆ (ξ ) + c ξ + c ,
(C.20)
it is easy to check that the first and third terms in (C.19) combine into one with the [0 j] derivatives of the transition functions replaced by T˜ (ξ(0) , ξ˜ [ j] ) (5.23). Moreover, the last term in (C.19) can be rewritten as d˘z −1 [0 j] z˘ Z − z˘ Z¯ H r R 2π i˘ z ˜ Cj j d˘z z˘ + z˘ 1 T × −T[0 j] + , (C.21) 2 z z˘ − z˘ [0i] C˜ i 2π i˘ i
where for i = j the variable z˘ lies inside the contour for z˘ and the first term appears since in (C.19) the situation was opposite. Then the expression in the square brackets (˘ (˘ is just ξ[0] z) − ξ(0) z) for z˘ ∈ U j . Finally, Y and RZ differ by a phase factor (see
(C.10)) which can be absorbed into a redefinition of the integration variable z˘ . As a result, the hyperkähler potential can be written more compactly as χ =r
j
C˜ j
d˘z −1 [0 j] z˘ Y+ − z˘ Y− T˜ (ξ[0] , ξ˜ [ j] ) + r c[+−] AI . I 2π i˘z
(C.22)
Comparing with the contact potential (5.45), one finds that the relation (3.38) continues to hold in the perturbed case.
Deformed SU (2) transformations. The SU (2) invariance of the quantities defined in this appendix can be checked using the following transformation rules, which follow from the general discussion in Appendix A: 3 ¯I 3 − νˆ 3 , δ v¯ I = −i3 v¯ I + − x I − + νˆ 3I , 2 2
δ I = i − Lu¯ I − + Lu I + 3 c I , (C.23) δx I = −2(− v I + + v¯ I ), i i δw I = + Lu I + 3 c I , δ w¯ I = − Lu¯ I − 3 c I , 2 2 3 ¯I I I δ νˆ 0 = i3 νˆ 0 + i+ L I + − νˆ 3 , 2 3 2 I I I δ νˆ ± = (i3 − 2− ζ± )ˆν± − + ζ± νˆ 3 − − ν¯ˆ 3I , (C.24) 2 I I I δ νˆ ζ,± = −2− νˆ ± − 3+ ζ± νˆ 3 , δv I = i3 v I + + x I −
I δ νˆ ζI ζ,± = −(i3 − 2− ζ± )ˆνζI ζ,± − 2− νˆ ζ,± − 3+ νˆ 3I ,
where νˆ 0I
= C
dζ I H , 2π (1)
νˆ 3I
= C
dζ HI , π ζ 3 (1)
ν¯ˆ 3I = −
C
dζ HI π ζ −1 (1)
(C.25)
are Laurent coefficients of the deformation νˆ 0I given in (5.7) (as usual, we omitted the sum over contours). The following properties, valid in the absence of perturbations, are also useful: δζ± = −+ − − ζ±2 + i3 ζ± , = −(+ ζ¯∓ + − ζ± )η± , δη± 1 + z z¯ v 1 + z z¯ v¯ z + , δ z¯ = z¯ − , δz = |z| v |z| v¯ δρ I = + + − ζ 2 − i3 ζ ∂ζ ρ I .
(C.26) (C.27) (C.28) (C.29)
References
1. Alexandrov, S., Pioline, B., Saueressig, F., Vandoren, S.: Linear perturbations of Hyperkähler metrics. Lett. Math. Phys. 87, 225 (2009)
2. Swann, A.: Hyper-Kähler and quaternionic Kähler geometry. Math. Ann. 289(3), 421–450 (1991)
3. de Wit, B., Kleijn, B., Vandoren, S.: Superconformal hypermultiplets. Nucl. Phys. B568, 475–502 (2000)
4. de Wit, B., Roček, M., Vandoren, S.: Hypermultiplets, hyperkähler cones and quaternion-Kähler geometry. JHEP 02, 039 (2001)
5. Bergshoeff, E.A., Cucu, S., de Wit, T., Gheerardyn, J., Van Proeyen, A., Vandoren, S.: The map between conformal hypercomplex/hyper-Kähler and quaternionic(-Kähler) geometry. Commun. Math. Phys. 262, 411–457 (2006)
6. de Wit, B., Roček, M., Vandoren, S.: Gauging isometries on hyperkaehler cones and quaternion-kaehler manifolds. Phys. Lett. B511, 302–310 (2001)
7. Ketov, S.V.: Superconformal hypermultiplets in superspace. Nucl. Phys. B582, 95–118 (2000)
8. Kuzenko, S.M.: On superconformal projective hypermultiplets. JHEP 12, 010 (2007)
9. Kuzenko, S.M., Lindström, U., Roček, M., Tartaglino-Mazzucchelli, G.: 4D N = 2 Supergravity and Projective Superspace. JHEP 0809, 051 (2008)
10. de Wit, B., Saueressig, F.: Off-shell N = 2 tensor supermultiplets. JHEP 09, 062 (2006)
11. de Wit, B., Saueressig, F.: Tensor supermultiplets and toric quaternion-Kaehler geometry. Fortsch. Phys. 55, 699–704 (2007)
12. Salamon, S.M.: Quaternionic Kähler manifolds. Invent. Math. 67(1), 143–171 (1982)
13. LeBrun, C.: Quaternionic-Kähler manifolds and conformal geometry. Math. Ann. 284(3), 353–376 (1989)
14. LeBrun, C.: Fano manifolds, contact structures, and quaternionic geometry. Internat. J. Math. 6(3), 419–437 (1995)
15. LeBrun, C.: A Rigidity Theorem for Quaternionic-Kahler Manifolds. Proc. Amer. Math. Soc. 103(4), 1205–1208 (1988)
16. LeBrun, C., Salamon, S.: Strong rigidity of positive quaternion-Kähler manifolds. Invent. Math. 118(1), 109–132 (1994)
17. Neitzke, A., Pioline, B., Vandoren, S.: Twistors and black holes. JHEP 04, 038 (2007)
18. Bagger, J., Witten, E.: Matter couplings in N = 2 supergravity. Nucl. Phys. B222, 1 (1983)
19. Alexandrov, S., Pioline, B., Saueressig, F., Vandoren, S.: D-instantons and twistors. JHEP 0903, 044 (2009)
20. Boyer, C.P., Galicki, K.: 3-Sasakian manifolds. Surv. Diff. Geom. 7, 123–184 (1999)
21. Atiyah, M.F., Hitchin, N.J., Singer, I.M.: Self-duality in four-dimensional Riemannian geometry. Proc. Roy. Soc. London Ser. A 362(1711), 425–461 (1978)
22. Hitchin, N.J., Karlhede, A., Lindström, U., Roček, M.: Hyperkähler metrics and supersymmetry. Commun. Math. Phys. 108, 535 (1987)
23. Hitchin, N.: The self-duality equations on a Riemann surface. Proc. London Math. Soc. (3) 55(1), 59–126 (1987)
24. Ivanov, I.T., Roček, M.: Supersymmetric sigma models, twistors, and the Atiyah-Hitchin metric. Commun. Math. Phys. 182, 291–302 (1996)
25. Lindström, U., Roček, M.: Properties of Hyperkähler manifolds and their twistor spaces. Commun. Math. Phys. 293, 257–278 (2010)
26. Galicki, K.: A generalization of the momentum mapping construction for quaternionic Kähler manifolds. Commun. Math. Phys. 108(1), 117–138 (1987)
27. Neitzke, A.: Private communication
28. Alexandrov, S.: Quantum covariant c-map. JHEP 05, 094 (2007)
29. Cecotti, S., Ferrara, S., Girardello, L.: Geometry of type II superstrings and the moduli of superconformal field theories. Int. J. Mod. Phys. A4, 2475 (1989)
30. Ferrara, S., Sabharwal, S.: Quaternionic manifolds for type II superstring vacua of Calabi-Yau spaces. Nucl. Phys. B332, 317 (1990)
31. de Wit, B., Van Proeyen, A.: Potentials and symmetries of general gauged N = 2 supergravity: Yang-Mills models. Nucl. Phys. B245, 89 (1984)
32. Cremmer, E., de Wit, B., Derendinger, J.P., Ferrara, S., Girardello, L., Kounnas, C., Van Proeyen, A.: Vector multiplets coupled to N = 2 supergravity: SuperHiggs effect, flat potentials and geometric structure. Nucl. Phys. B250, 385 (1985)
33. Roček, M., Vafa, C., Vandoren, S.: Hypermultiplets and topological strings. JHEP 02, 062 (2006)
34. Roček, M., Vafa, C., Vandoren, S.: Quaternion-Kähler spaces, hyperkähler cones, and the c-map. http://arXiv.org/abs/math/0603048v3 [math.DG], 2006
35. Antoniadis, I., Ferrara, S., Minasian, R., Narain, K.S.: $R^4$ couplings in M- and type II theories on Calabi-Yau spaces. Nucl. Phys. B507, 571–588 (1997)
36. Robles-Llana, D., Saueressig, F., Vandoren, S.: String loop corrected hypermultiplet moduli spaces. JHEP 03, 081 (2006)
37. Antoniadis, I., Minasian, R., Theisen, S., Vanhove, P.: String loop corrections to the universal hypermultiplet. Class. Quant. Grav. 20, 5079–5102 (2003)
38. Günther, H., Herrmann, C., Louis, J.: Quantum corrections in the hypermultiplet moduli space. Fortsch. Phys. 48, 119–123 (2000)
39. Ionas, R.A.: Elliptic constructions of Hyperkähler metrics II: The quantum mechanics of a Swann bundle. http://arXiv.org/abs/0712.3600v1 [math.DG], 2007

Communicated by N.A. Nekrasov
Commun. Math. Phys. 296, 405–428 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1024-9
Communications in
Mathematical Physics
The ADHM Construction and Non-local Symmetries of the Self-dual Yang–Mills Equations James D. E. Grant Fakultät für Mathematik, Universität Wien, Nordbergstrasse 15, 1090 Wien, Austria. E-mail:
[email protected] Received: 16 December 2008 / Accepted: 18 December 2009 Published online: 4 March 2010 – © Springer-Verlag 2010
To Nicola Ramsay

Abstract: We consider the action on instanton moduli spaces of the non-local symmetries of the self-dual Yang–Mills equations on $\mathbb{R}^4$ discovered by Chau and coauthors. Beginning with the ADHM construction, we show that a sub-algebra of the symmetry algebra generates the tangent space to the instanton moduli space at each point. We explicitly find the subgroup of the symmetry group that preserves the one-instanton moduli space. This action simply corresponds to a scaling of the moduli space.

1. Introduction

The self-dual Yang–Mills equations have been investigated from two rather distinct points of view in the last few decades. The first direction is in the study of the topology of four-manifolds, and the work of Donaldson (see, e.g., [13,14]). In this approach, a fundamental role is played by the analysis of moduli spaces of solutions of the self-dual Yang–Mills equations with $L^2$ curvature (so-called "instanton solutions") on given four-manifolds. The analysis of such moduli spaces then yields powerful information concerning differentiable structures on the underlying four-manifold. The second, seemingly unrelated, development is in the theory of integrable systems. In particular, it has been shown that many known integrable systems of differential equations may be derived as symmetry reductions of the self-dual Yang–Mills equations (see, e.g., [21]).

The purpose of the current paper, and its companion [15] which studies reducible connections, is to investigate whether properties of the self-dual Yang–Mills equations that follow from its integrable nature may be used to yield global information about instanton moduli spaces. In particular, it is known that the self-dual Yang–Mills equations on $\mathbb{R}^4$ admit an infinite-dimensional algebra of non-local symmetries [6–8]. In this paper, we investigate the action of these symmetries on the instanton moduli spaces on $\mathbb{R}^4$ and, in particular, investigate, on the one-instanton moduli space, the sub-algebra of symmetries that preserve the $L^2$ condition on the curvature of the connection. Thinking of
such symmetries as generating flows on the moduli space of all self-dual connections, $\mathcal{M}$, and of the $k$-instanton moduli space, $\mathcal{M}_k$, as a subspace of $\mathcal{M}$, then it is known that the non-local symmetries in general do not lie tangent to the subspaces $\mathcal{M}_k$ and, therefore, do not preserve the $L^2$ nature of the curvature (see, e.g., [9, Chap. V], but also Remark 4.4 below).

Our results are rather double-edged. In Theorem 4.1, we show that the full tangent space to the moduli spaces $\mathcal{M}_k$ is generated by the fundamental vector fields of the symmetry algebra acting on the moduli space of self-dual connections $\mathcal{M}$. When we attempt to "exponentiate" these tangent vectors into a group action on $\mathcal{M}_k$, however, our conclusion is that the family of transformations that preserves the $L^2$ nature of the connection is rather small. In particular, the symmetries have orbits of rather high codimension in the moduli spaces. More specifically, in Theorem 5.1, we deduce that the only symmetries of the self-dual Yang–Mills equations that act on the five-dimensional one-instanton moduli space $\mathcal{M}_1$ correspond to a scaling of the instanton solutions. Such a collapse to orbits of large codimension is not unfamiliar from the theory of harmonic maps into Lie groups [1,17,20,28], where one has similar non-local symmetry algebras [10].

The paper is organised as follows. In Sect. 2, we summarise the relevant background material that we require from both the integrable systems approach to the self-dual Yang–Mills equations and the ADHM approach to the instanton problem. Since our considerations are aimed at making a connection between the local aspects of the self-dual Yang–Mills equations and the global aspects, and the literature in these fields generally uses completely different notation, it was deemed necessary to give an integrated, relatively detailed description of both approaches in a consistent notation. This accounts for the length of this section.$^1$ In Sect. 3, we show how the ADHM construction may be used to yield explicit patching matrices for holomorphic bundles over subsets of $\mathbb{CP}^3$, to which we may apply the results of [9] concerning the action of the symmetry algebra of the self-dual Yang–Mills equations. In Sect. 4, we show that one-parameter families of ADHM data yield transformations that fall into the category of transformations considered in [9], with the important proviso that these transformations are significantly restricted by the assumption that they are generated by flows on the full moduli space of self-dual connections. In Sect. 5, we show that our constructions can be carried out explicitly on the one-instanton moduli space, and that the only symmetries (consistent with a particular technical assumption) that have a well-defined action on the one-instanton moduli space are scalings. Finally, in an Appendix, we give a direct derivation of the action of the non-local symmetries of the self-dual Yang–Mills equations on the twistorial patching matrix from the action on the self-dual connection.

Note also that the symmetries that we investigate can be constructed, by the same methods, on hyper-complex manifolds. Since we wish to consider symmetries that generalise to manifolds other than $\mathbb{R}^4$, and there exist hyper-complex manifolds with no continuous (conformal) isometries, we will not consider symmetries (such as those discussed in [26]) that follow from the existence of a non-trivial conformal group on our manifold.

$^1$ For more information on the local aspects of the self-dual Yang–Mills equations that are relevant to us, see [9, Chaps. II & III]. For more information concerning the ADHM formalism see either [4] or [2, Chaps. II-IV].
2. Preliminaries

2.1. Quaternions and twistor spaces. We will deal exclusively with the self-dual Yang–Mills equations on $\mathbb{R}^4$ and $S^4$, and make constant use of isomorphisms $\mathbb{R}^4 \cong \mathbb{C}^2 \cong \mathbb{H}$, which we first fix. As such, let $x := (x^1, x^2, x^3, x^4) \in \mathbb{R}^4$. We may then view $u := x^1 + i x^2$, $v := x^3 - i x^4$ as coordinates on $\mathbb{C}^2$, defining an isomorphism $\mathbb{R}^4 \to \mathbb{C}^2$; $x \mapsto (u, v)$. In terms of these coordinates, the flat metric on $\mathbb{R}^4$ takes the form
\[ g = \tfrac{1}{2} \left( du \otimes d\bar{u} + d\bar{u} \otimes du + dv \otimes d\bar{v} + d\bar{v} \otimes dv \right), \]
and the corresponding volume form is
\[ \epsilon = dx^1 \wedge dx^2 \wedge dx^3 \wedge dx^4 = \tfrac{1}{4}\, du \wedge d\bar{u} \wedge dv \wedge d\bar{v}. \]
Let $P \to \mathbb{R}^4$ be a principal $SU_2$ bundle$^2$ over $\mathbb{R}^4$, and $E \to \mathbb{R}^4$ the rank-two complex vector bundle associated to $P$ via the fundamental representation of $SU_2$. (We will switch between the principal bundle and vector bundle pictures without comment.) An $SU_2$ connection on $E$, $A \in \Omega^1(\mathbb{R}^4, \mathfrak{su}_2)$, is a solution of the self-dual Yang–Mills equations if its curvature satisfies $F = *F$. In terms of the complex coordinates introduced above, this equation is equivalent to
\[ F_{uv} = 0, \qquad F_{u\bar{u}} + F_{v\bar{v}} = 0, \qquad F_{\bar{u}\bar{v}} = 0. \tag{2.1} \]
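We introduce complex vector fields $X(z), Y(z) \in C^\infty(\mathbb{R}^4, TM \otimes \mathbb{C})$ depending on a parameter $z \in \mathbb{C} \cup \{\infty\} \equiv \mathbb{CP}^1$:
\[ X(z) := \partial_{\bar{u}} - z\,\partial_v, \qquad Y(z) := \partial_{\bar{v}} + z\,\partial_u. \tag{2.2} \]
Then Eqs. (2.1) are equivalent to the requirement that
\[ F(X(z), Y(z)) = 0, \qquad \forall z \in \mathbb{CP}^1. \tag{2.3} \]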
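The equivalence of (2.1) and (2.3) can be made explicit by expanding in powers of $z$. The following worked expansion is a sketch that uses only the bilinearity and antisymmetry of $F$, with the vector fields read as $X(z) = \partial_{\bar u} - z\partial_v$ and $Y(z) = \partial_{\bar v} + z\partial_u$:

```latex
\begin{align*}
F(X(z), Y(z))
  &= F(\partial_{\bar u} - z\,\partial_v,\; \partial_{\bar v} + z\,\partial_u) \\
  &= F_{\bar u \bar v} + z\,F_{\bar u u} - z\,F_{v \bar v} - z^2 F_{v u} \\
  &= F_{\bar u \bar v} - z\,\bigl(F_{u \bar u} + F_{v \bar v}\bigr) + z^2 F_{u v}.
\end{align*}
```

A quadratic polynomial in $z$ vanishes for every $z$ precisely when its three coefficients vanish, and these coefficients are the three expressions in (2.1).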
$^2$ We restrict to $SU_2$ for simplicity. All of our considerations are equally valid for any classical Lie group.

Since $\mathbb{R}^4 \subset \mathbb{R}^4 \cup \{\infty\} \cong S^4 \cong \mathbb{HP}^1$, a point $x \in \mathbb{R}^4 \cong \mathbb{C}^2$ determines a quaternionic line in $\mathbb{H}^2$. In particular, we define $x := u + jv \in \mathbb{H}$. In terms of homogeneous coordinates $(p, q) \in \mathbb{H}^2$ on $\mathbb{HP}^1$, the point $x$ determines the quaternionic line $l_x := \{ (q, p) \in \mathbb{H}^2 \mid q = x p \}$ in $\mathbb{H}^2$. Now, let $p = z_1 + j z_2$, $q = z_3 + j z_4$ with $z := (z_1, \dots, z_4) \in \mathbb{C}^4$, and view $z$ as homogeneous coordinates on $\mathbb{CP}^3$. Right-multiplication by $j$ on $\mathbb{H}^2$ defines an anti-linear anti-involution
\[ \sigma : \mathbb{C}^4 \to \mathbb{C}^4; \qquad (z_1, z_2, z_3, z_4) \mapsto (-\bar{z}_2, \bar{z}_1, -\bar{z}_4, \bar{z}_3), \tag{2.4} \]
which descends to define an involution on the projective space:
\[ \sigma : \mathbb{CP}^3 \to \mathbb{CP}^3; \qquad [z_1, z_2, z_3, z_4] \mapsto [-\bar{z}_2, \bar{z}_1, -\bar{z}_4, \bar{z}_3]. \]
The image of the quaternionic line $l_x$ in $\mathbb{CP}^3$ corresponding to $x \in \mathbb{R}^4$ is given by the embedded projective line
\[ L_x \equiv L_{(u,v)} = \left\{ [z_1, z_2, z_3, z_4] \in \mathbb{CP}^3 \;\middle|\; z_3 = z_1 u - z_2 \bar{v},\; z_4 = z_1 v + z_2 \bar{u} \right\}. \tag{2.5} \]
Similarly, the embedded line corresponding to the point $\infty \in S^4$ is
\[ L_\infty := \left\{ [0, 0, z_3, z_4] \;\middle|\; (z_3, z_4) \in \mathbb{C}^2 \setminus \{(0, 0)\} \right\}. \]
The lines $L_p$, for $p \in S^4$, are invariant under the action of $\sigma$, and are referred to as real lines. We will make particular use of the projection
\[ \pi : \mathbb{CP}^3 \setminus L_\infty \to \mathbb{R}^4; \qquad L_x \mapsto x. \]
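The two properties of $\sigma$ used here, that it squares to $-1$ on $\mathbb{C}^4$ and that it maps each real line $L_x$ to itself, can be checked numerically. The following is a minimal sketch, reading (2.4) with complex conjugates on the right-hand side (as anti-linearity requires) and (2.5) as $z_3 = z_1 u - z_2 \bar v$, $z_4 = z_1 v + z_2 \bar u$; the helper names and the sample point are illustrative and not taken from the paper.

```python
# Sketch: the anti-linear map sigma of Eq. (2.4) and the real line L_x of Eq. (2.5).
# Helper names (sigma, line_point, on_line) are illustrative only.

def sigma(z):
    """(z1, z2, z3, z4) -> (-conj(z2), conj(z1), -conj(z4), conj(z3))."""
    z1, z2, z3, z4 = z
    return (-z2.conjugate(), z1.conjugate(), -z4.conjugate(), z3.conjugate())

def line_point(u, v, z1, z2):
    """Homogeneous coordinates of the point of L_(u,v) with fibre coordinates (z1, z2)."""
    return (z1, z2, z1 * u - z2 * v.conjugate(), z1 * v + z2 * u.conjugate())

def on_line(z, u, v, tol=1e-12):
    """True if z satisfies the two linear equations defining L_(u,v)."""
    z1, z2, z3, z4 = z
    return (abs(z3 - (z1 * u - z2 * v.conjugate())) < tol
            and abs(z4 - (z1 * v + z2 * u.conjugate())) < tol)

# Arbitrary sample data: a point x = (u, v) and a point p on the line L_x.
u, v = 0.3 + 1.1j, -0.7 + 0.2j
p = line_point(u, v, 1.0 + 0.5j, -0.25 + 2.0j)

minus_p = tuple(-c for c in p)
sigma_squared_is_minus_one = all(
    abs(a - b) < 1e-12 for a, b in zip(sigma(sigma(p)), minus_p)
)
line_is_preserved = on_line(sigma(p), u, v)
```

Since $\sigma^2 = -1$ acts trivially on projective space, the induced map on $\mathbb{CP}^3$ is a genuine involution, which is the sense in which the lines $L_x$ are called real.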
On a fixed real line, $L_x$, $x \in \mathbb{R}^4$, we introduce the affine coordinate $z = z_2/z_1 \in \mathbb{CP}^1$. Finally, on the subset $U_1 := \{ [z_1, z_2, z_3, z_4] \in \mathbb{CP}^3 \mid z_1 \neq 0 \}$, we may introduce coordinates $w_1 := z_3/z_1$, $w_2 := z_4/z_1$, $w_3 := z_2/z_1 \equiv z$. By definition, the coordinates $(w_1, w_2, w_3)$, viewed as functions on $U_1$, are holomorphic with respect to the complex structure that $U_1$ inherits as an open subset of $\mathbb{CP}^3$. From Eq. (2.5), the intersection $L_x \cap U_1$ consists of the set of points with $(w_1, w_2, w_3) = (u - z\bar{v}, v + z\bar{u}, z)$. We will therefore often view $(u, v, z)$ as coordinates on the set $U_1 \cong \mathbb{C}^2 \times \mathbb{C}$, with the functions $(u - z\bar{v}, v + z\bar{u}, z)$ being holomorphic with respect to the complex structure on $U_1$. One may then check that, in this coordinate system, a basis for anti-holomorphic vector fields on the set $U_1$ is given by the vector fields $\{X(z), Y(z), \partial_{\bar{z}}\}$, with $X(z)$, $Y(z)$ as in Eq. (2.2). A similar argument may be carried out on the set $U_2 := \{ [z_1, z_2, z_3, z_4] \in \mathbb{CP}^3 \mid z_2 \neq 0 \}$. In practice, the construction on $U_2$ means that we may use the formulae for $X(z)$, $Y(z)$ with $z$ taking values in $\mathbb{CP}^1$. As such, we will often, henceforth, identify the set $\mathbb{CP}^3 \setminus L_\infty$ with the set $\mathbb{C}^2 \times \mathbb{CP}^1$. When we speak of a function being, for example, "holomorphic" on $\mathbb{C}^2 \times \mathbb{C} \subset \mathbb{C}^2 \times \mathbb{CP}^1$, we will mean holomorphic on the set $U_1$ with the induced complex structure mentioned above.$^3$

Notation. Given $\epsilon > 0$, we define the following open subsets of $\mathbb{CP}^1$:
\[ V^0_\epsilon := \left\{ z \in \mathbb{CP}^1 \;\middle|\; |z| < 1 + \epsilon \right\}, \qquad V^\infty_\epsilon := \left\{ z \in \mathbb{CP}^1 \;\middle|\; |z| > \frac{1}{1 + \epsilon} \right\}, \]
and their intersection
\[ V_\epsilon := V^0_\epsilon \cap V^\infty_\epsilon = \left\{ z \in \mathbb{CP}^1 \;\middle|\; \frac{1}{1 + \epsilon} < |z| < 1 + \epsilon \right\}. \]
We define the involution on the projective line
\[ \sigma : \mathbb{CP}^1 \to \mathbb{CP}^1; \qquad z \mapsto -\frac{1}{\bar{z}}, \]
which, geometrically, is simply the anti-podal map. Note that $\sigma(V^0_\epsilon) = V^\infty_\epsilon$ and $\sigma(V_\epsilon) = V_\epsilon$.

$^3$ From a $\mathbb{CP}^1$ point of view, we are viewing $U_1 \cup U_2 = \mathbb{CP}^3 \setminus L_\infty$ as the total space of the normal bundle $\mathcal{O}(1) \oplus \mathcal{O}(1)$ of the rational curve $L_0 \subset \mathbb{CP}^3 \setminus L_\infty$, where $0$ denotes the origin in $\mathbb{R}^4$. $X$ and $Y$ are then linearly independent sections of this normal bundle. Unfortunately, this picture is not particularly well-suited to the calculations that we wish to perform.
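The stated properties of the anti-podal map on $\mathbb{CP}^1$ are easy to test on sample points. The sketch below assumes the map is $\sigma(z) = -1/\bar z$ and works only with finite, non-zero $z$ (the points $0$ and $\infty$, which $\sigma$ exchanges, are not modelled); the function names and sample values are illustrative.

```python
# Sketch: the antipodal map sigma(z) = -1/conj(z) on finite, non-zero z, and the open sets
# V^0_eps = {|z| < 1 + eps}, V^inf_eps = {|z| > 1/(1 + eps)}.
# Function names are illustrative; z = 0 and z = infinity are not handled.

def sigma(z):
    return -1.0 / z.conjugate()

def in_V0(z, eps):
    return abs(z) < 1.0 + eps

def in_Vinf(z, eps):
    return abs(z) > 1.0 / (1.0 + eps)

eps = 0.5
samples = [0.2 + 0.1j, 1.0j, -0.9 + 0.8j, 1.3 - 0.4j]

# sigma is an involution on these points.
involution = all(abs(sigma(sigma(z)) - z) < 1e-12 for z in samples)
# sigma(z) = z would force |z|^2 = -1, so there are no fixed points.
no_fixed_points = all(abs(sigma(z) - z) > 1e-6 for z in samples)
# sigma maps V^0_eps into V^inf_eps ...
maps_V0_to_Vinf = all(in_Vinf(sigma(z), eps) for z in samples if in_V0(z, eps))
# ... and preserves the annulus V_eps.
preserves_annulus = all(
    in_V0(sigma(z), eps) and in_Vinf(sigma(z), eps)
    for z in samples
    if in_V0(z, eps) and in_Vinf(z, eps)
)
```

This map is the one induced on the affine coordinate $z = z_2/z_1$ by the anti-involution (2.4), since $z_2/z_1 \mapsto \bar z_1/(-\bar z_2)$.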
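Given any subset $V \subset \mathbb{CP}^1$ and a map $g : V \to SL_2(\mathbb{C})$, we define a corresponding map $g^* : \sigma(V) \to SL_2(\mathbb{C})$ by $g^*(z) := (g(\sigma(z)))^\dagger$. (Throughout, $\dagger : SL_2(\mathbb{C}) \to SL_2(\mathbb{C})$ will denote complex-conjugate transpose.) Similarly, given any map $f : U \times V \to SL_2(\mathbb{C})$, we define a corresponding map $f^* : U \times \sigma(V) \to SL_2(\mathbb{C})$ by $f^*(x, z) := (f(x, \sigma(z)))^\dagger$.

2.2. Holomorphic bundles. An important property of the self-dual Yang–Mills equations is that they are the compatibility condition for the following overdetermined system of equations [9, Theorem 1]:
\[ (\partial_{\bar{u}} - z\,\partial_v)\, \Psi(x, z) = -(A_{\bar{u}} - z A_v)\, \Psi(x, z), \tag{2.6a} \]
\[ (\partial_{\bar{v}} + z\,\partial_u)\, \Psi(x, z) = -(A_{\bar{v}} + z A_u)\, \Psi(x, z), \tag{2.6b} \]
\[ \partial_{\bar{z}} \Psi(x, z) = 0, \tag{2.6c} \]
for a map $\Psi : \mathbb{R}^4 \times V \to SL_2(\mathbb{C})$, where $V$ is a subset of $\mathbb{CP}^1$. In particular, given $\epsilon > 0$, there exists a solution, $\Psi_0 : \mathbb{R}^4 \times V^0_\epsilon \to SL_2(\mathbb{C})$, that is analytic in $z$ for $z \in V^0_\epsilon$. This solution is unique up to right multiplication
\[ \Psi_0(x, z) \mapsto \tilde{\Psi}_0(x, z) := \Psi_0(x, z)\, R(u - z\bar{v}, v + z\bar{u}, z), \]
where $R : \mathbb{C}^2 \times V^0_\epsilon \to SL_2(\mathbb{C})$ is holomorphic with respect to the complex structure that $\mathbb{C}^2 \times V^0_\epsilon$ inherits as a subset of $U_1$. Given $\Psi_0(x, z)$, we define
\[ \Psi_\infty(x, z) := \left( \Psi_0^*(x, z) \right)^{-1}. \]
It is straightforward to check that $\Psi_\infty(x, z)$ is also a solution of (2.6) that is analytic in $z$ on $\mathbb{R}^4 \times V^\infty_\epsilon$. Defining the fields
\[ \psi_0(x) := \Psi_0(x, 0), \qquad \psi_\infty(x) := \Psi_\infty(x, \infty), \]
then Eqs. (2.6) imply that we may write the components of the connection in the form
\[ A_u = -(\partial_u \psi_\infty(x))\, \psi_\infty(x)^{-1}, \qquad A_v = -(\partial_v \psi_\infty(x))\, \psi_\infty(x)^{-1}, \tag{2.7a} \]
\[ A_{\bar{u}} = -(\partial_{\bar{u}} \psi_0(x))\, \psi_0(x)^{-1}, \qquad A_{\bar{v}} = -(\partial_{\bar{v}} \psi_0(x))\, \psi_0(x)^{-1}. \tag{2.7b} \]
We then define the Yang $J$-function $J : \mathbb{R}^4 \to SL_2(\mathbb{C})$ by
\[ J := \psi_\infty(x)^{-1} \cdot \psi_0(x). \tag{2.8} \]
Noting that, from the definition of $\Psi_\infty$, we have $\psi_\infty(x) = \left( \psi_0(x)^\dagger \right)^{-1}$, it then follows that $J^\dagger = J$. A short calculation shows that the remaining part of the anti-self-dual part of the curvature may be written in the form
\[ F_{u\bar{u}} + F_{v\bar{v}} = -\psi_\infty \left[ \partial_u \left( J_{\bar{u}} J^{-1} \right) + \partial_v \left( J_{\bar{v}} J^{-1} \right) \right] \psi_\infty^{-1} = -\psi_0 \left[ \partial_{\bar{u}} \left( J^{-1} J_u \right) + \partial_{\bar{v}} \left( J^{-1} J_v \right) \right] \psi_0^{-1}. \]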
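The "short calculation" can be sketched as follows. The conventions used here are assumptions that the text does not state explicitly: the covariant derivative is $d + A$ (consistent with the form of (2.6)), the curvature is $F_{ab} = \partial_a A_b - \partial_b A_a + [A_a, A_b]$, and a gauge transformation by $g$ acts as $A' = g^{-1} A g + g^{-1} dg$, $F' = g^{-1} F g$. Taking $g = \psi_\infty$:

```latex
\begin{align*}
A'_u &= \psi_\infty^{-1}\bigl(A_u \psi_\infty + \partial_u \psi_\infty\bigr) = 0,
  \qquad A'_v = 0 \qquad \text{by (2.7a)}, \\
A'_{\bar u} &= \psi_\infty^{-1}\,\partial_{\bar u}\psi_\infty
  - \psi_\infty^{-1}\,(\partial_{\bar u}\psi_0)\,\psi_0^{-1}\,\psi_\infty
  = -J_{\bar u} J^{-1} \qquad \text{by (2.7b) and (2.8)}, \\
F'_{u\bar u} + F'_{v\bar v}
  &= \partial_u A'_{\bar u} + \partial_v A'_{\bar v}
  = -\partial_u\bigl(J_{\bar u} J^{-1}\bigr) - \partial_v\bigl(J_{\bar v} J^{-1}\bigr).
\end{align*}
```

Conjugating back, $F = \psi_\infty F' \psi_\infty^{-1}$, gives the first expression for $F_{u\bar u} + F_{v\bar v}$. The second expression follows in the same way with $g = \psi_0$, for which $A'_{\bar u} = A'_{\bar v} = 0$ and $A'_u = J^{-1} J_u$, $A'_v = J^{-1} J_v$.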
If the connection, $A$, satisfies the self-dual Yang–Mills equations it therefore follows that the field $J$ satisfies the Yang–Pohlmeyer equation
\[ \partial_u \left( J_{\bar{u}} J^{-1} \right) + \partial_v \left( J_{\bar{v}} J^{-1} \right) = 0. \tag{2.9} \]
Conversely, given $J : \mathbb{R}^4 \to SL_2(\mathbb{C})$ that satisfies the Yang–Pohlmeyer equation and admits a splitting of the form (2.8) for some $\psi_0$ and $\psi_\infty$ such that $\psi_\infty^\dagger = \psi_0^{-1}$, then the connection constructed as in Eqs. (2.7) will satisfy the self-dual Yang–Mills equations.

Finally, the quantity
\[ G(x, z) := (\Psi_\infty(x, z))^{-1} \cdot \Psi_0(x, z) \tag{2.10} \]
will be referred to as the patching matrix. It defines a holomorphic map $\mathbb{C}^2 \times V_\epsilon \subset \mathbb{CP}^3 \setminus L_\infty \to SL_2(\mathbb{C})$, and hence the transition function of a holomorphic vector bundle over $\mathbb{CP}^3 \setminus L_\infty$. The splitting (2.10) implies that this bundle is trivial on restriction to real lines. The above is an explicit version of the Ward correspondence [30], which defines a one-to-one correspondence between self-dual Yang–Mills fields and holomorphic bundles over appropriate subsets of $\mathbb{CP}^3$ that are trivial on restriction to real lines.$^4$ Given such a bundle, the transition functions necessarily admit a splitting of the form (2.10), and the connection may then be reconstructed from the resulting fields $\Psi_0$, $\Psi_\infty$ via Eqs. (2.7).

$^4$ See [9] for more details of the Ward correspondence from this point of view.

2.3. Non-local symmetries of the self-dual Yang–Mills equations. If we consider a one-parameter family of solutions, $J(t)$, of (2.9), depending in a $C^1$ fashion on a parameter $t \in (-\varepsilon, \varepsilon)$, then we deduce that $\dot{J} := \frac{d}{dt} J(t)$ must satisfy the linearisation of (2.9):
\[ \partial_u \left( J\, \partial_{\bar{u}} \left( J^{-1} \dot{J} \right) J^{-1} \right) + \partial_v \left( J\, \partial_{\bar{v}} \left( J^{-1} \dot{J} \right) J^{-1} \right) = 0. \tag{2.11} \]
Such a $\dot{J}$ defines a symmetry of the self-dual Yang–Mills equations. It is known that the only local symmetries$^5$ of the self-dual Yang–Mills equations on flat $\mathbb{R}^4$ are gauge transformations and those generated by the action of the conformal group (see, e.g., [25]). On the other hand, there exists a non-trivial family of non-local symmetries of the self-dual Yang–Mills equations [6–8]. To describe these, we define the auxiliary maps $\chi_0 : \mathbb{R}^4 \times V^0_\epsilon \to SL_2(\mathbb{C})$, $\chi_\infty : \mathbb{R}^4 \times V^\infty_\epsilon \to SL_2(\mathbb{C})$,
\[ \chi_0(x, z) := \psi_0(x)^{-1} \cdot \Psi_0(x, z), \qquad \chi_\infty(x, z) := \psi_\infty(x)^{-1} \cdot \Psi_\infty(x, z), \]
which are analytic in $z$ for $z \in V^0_\epsilon$ and $z \in V^\infty_\epsilon$, respectively. These maps have the property that $\chi_0(x, 0) = \chi_\infty(x, \infty) = \mathrm{Id}$. The following result, based on the results of [6–8], may then be extracted from Sect. III.A of [9]:

$^5$ i.e. depending only on the connection and its derivatives pointwise.

Proposition 2.1. Let $T : \mathbb{R}^4 \times S^1 \to SL_2(\mathbb{C})$ be a map that extends continuously to a map $T : \mathbb{R}^4 \times V_\epsilon \to SL_2(\mathbb{C})$ (for some $\epsilon > 0$) that is analytic in $z$ for $z \in V_\epsilon$ and satisfies
\[ (\partial_{\bar{v}} + z\,\partial_u)\, T(x, z) = (\partial_{\bar{u}} - z\,\partial_v)\, T(x, z) = 0 \]
for $(x, z) \in \mathbb{R}^4 \times V_\epsilon$. Then, given any $\lambda \in V_\epsilon$, the quantity
\begin{align*}
\dot{J} &:= \chi_\infty(x, \lambda)\, T(x, \lambda)\, \chi_\infty(x, \lambda)^{-1} \cdot J + J \cdot \chi_0(x, \sigma(\lambda))\, T(x, \lambda)^\dagger\, \chi_0(x, \sigma(\lambda))^{-1} \\
&= \psi_\infty(x)^{-1} \left[ \Psi_\infty(x, \lambda)\, T(x, \lambda)\, \Psi_\infty(x, \lambda)^{-1} + \Psi_0(x, \sigma(\lambda))\, T(x, \lambda)^\dagger\, \Psi_0(x, \sigma(\lambda))^{-1} \right] \psi_0(x) \tag{2.12}
\end{align*}
is a solution of the linearisation (2.11).

In the case where the function $T$ is independent of $x$, it defines an element of the loop group $\Lambda SL_2(\mathbb{C})$ that admits a holomorphic extension to an open neighbourhood of $S^1$ in $\mathbb{C}^*$. The algebra of symmetries generated by such $T$ is then isomorphic to the Kac-Moody algebra of $\mathfrak{sl}_2(\mathbb{C})$. The action of such symmetries on the patching matrix is given by the following result:

Theorem 2.1. Let $T : \mathbb{R}^4 \times S^1 \to SL_2(\mathbb{C})$ be as in the previous proposition. The induced flow on the patching matrix of the corresponding bundle over $\mathbb{CP}^3 \setminus L_\infty$ is given by
\[ \dot{G}(x, z) = -T(x, z)\, G(x, z) - G(x, z)\, T^*(x, z) + \rho_\infty(x, z)\, G(x, z) + G(x, z)\, \rho_0(x, z), \tag{2.13} \]
for $(x, z) \in \mathbb{R}^4 \times V_\epsilon$. In this equation, $\rho_0 : \mathbb{R}^4 \times V^0_\epsilon \to \mathfrak{sl}_2(\mathbb{C})$ and $\rho_\infty : \mathbb{R}^4 \times V^\infty_\epsilon \to \mathfrak{sl}_2(\mathbb{C})$ are analytic functions of $z$ on the respective regions and satisfy
\[ (\partial_{\bar{v}} + z\,\partial_u)\, \rho_0(x, z) = (\partial_{\bar{u}} - z\,\partial_v)\, \rho_0(x, z) = 0, \]
\[ (\partial_{\bar{v}} + z\,\partial_u)\, \rho_\infty(x, z) = (\partial_{\bar{u}} - z\,\partial_v)\, \rho_\infty(x, z) = 0. \]
Moreover, the functions $\rho_0$, $\rho_\infty$ may be absorbed into holomorphic changes of bases on the regions $V^0_\epsilon$ and $V^\infty_\epsilon$. When this absorption process is carried out, the transformation (2.13) takes the simpler form
\[ \dot{G}(x, z) = -T(x, z)\, G(x, z) - G(x, z)\, T^*(x, z). \tag{2.14} \]

Remark 2.1. These transformations have been investigated from the viewpoint of twistor-theory and have a natural sheaf-theoretic interpretation [18,19,24,25]. In the literature, it is standard to assume (2.13) (and the group-theoretic version (2.15) below) as the transformation law of the patching matrix, and to work backwards to derive the transformation law of $J$ and the connection (see, e.g., [18,19,25]). Since a direct proof of (2.13), starting from (2.12), does not appear in the literature, we have included a proof in Appendix A.

Remark 2.2. The transformation (2.14) is independent of the parameter $\lambda$ that appears in Eq. (2.12). As such, the transformation depends only on the function $T$. In Eq. (2.13), the functions $\rho_0$, $\rho_\infty$ will generally depend on the parameter $\lambda$, but the corresponding dependence of $\dot{G}$ on $\lambda$ may be removed by a holomorphic change of frame. This situation is different from that in, for example, the theory of harmonic maps from a domain $X \subseteq \mathbb{R}^2$ to a Lie group $G$. In this case, one has a similar family of non-local symmetries [10] depending on a function $T(\lambda)$. There, however, the transformation properties of the extended harmonic map depend explicitly on the value of the parameter $\lambda$ (see, e.g., [28, §3]). Power-series expanding in $\lambda$ then gives a family of flows acting on the extended solution, and hence on the space of harmonic maps.
The exponentiated form of the transformation law (2.14) is given by the following:

Theorem 2.2 [9, Chap. IV.C]. Let $g : \mathbb{R}^4 \times S^1 \to SL_2(\mathbb{C})$ be a smooth map that admits a continuous extension to a holomorphic map $g : \mathbb{C}^2 \times V_\epsilon \subset \mathbb{CP}^3 \setminus L_\infty \to SL_2(\mathbb{C})$, for some $\epsilon > 0$. Then we define the action of $g$ on the patching matrix $G(x, z)$ by the law
\[ G(x, z) \mapsto g(x, z) \cdot G(x, z) \cdot g^*(x, z). \tag{2.15} \]
If $g$ extends holomorphically to $\mathbb{R}^4 \times V^0_\epsilon$, then the corresponding transformation is a holomorphic change of basis on the bundle over $\mathbb{CP}^3 \setminus L_\infty$, which leaves the self-dual connection, $A$, unchanged.
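The relation between the group action (2.15) and the flow (2.14) can be illustrated numerically at a single fixed point $(x, z)$. The sketch below takes $g = \exp(-tT)$ and $g^* = \exp(-tT^*)$ and compares a central finite difference of $g\,G\,g^*$ at $t = 0$ with $-TG - GT^*$. The matrices `T`, `T_star` and `G` are arbitrary sample values chosen for illustration (at a single point, $T(z)$ and $T^*(z) = T(\sigma(z))^\dagger$ are independent data), and the 2×2 helpers are hypothetical, not part of any library.

```python
# Sketch: finite-difference check that G -> g G g* with g = exp(-t T), g* = exp(-t T*)
# reproduces dG/dt = -T G - G T* at t = 0, at one fixed point (x, z).
# T, T_star and G below are arbitrary sample matrices, not data from the paper.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(a, b, scale=1.0):
    """Entrywise a + scale * b."""
    return [[a[i][j] + scale * b[i][j] for j in range(2)] for i in range(2)]

def scaled(a, s):
    return [[s * entry for entry in row] for row in a]

def expm(a, terms=30):
    """Matrix exponential of a 2x2 matrix via its power series."""
    result = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    term = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    for n in range(1, terms):
        # term now holds a^n / n!
        term = [[entry / n for entry in row] for row in matmul(term, a)]
        result = add(result, term)
    return result

T = [[0.3 + 0.1j, 1.0 - 0.5j], [0.2j, -0.3 - 0.1j]]      # traceless sample
T_star = [[-0.4j, 0.7 + 0.0j], [0.1 - 0.9j, 0.4j]]        # traceless sample
G = [[2.0 + 0j, 1.0 + 1.0j], [1.0 - 1.0j, 1.5 + 0j]]      # det G = 3 - 2 = 1

def act(t):
    """The transformed patching matrix g(t) G g*(t) at the chosen point."""
    g = expm(scaled(T, -t))
    g_star = expm(scaled(T_star, -t))
    return matmul(matmul(g, G), g_star)

h = 1e-6
plus, minus = act(h), act(-h)
numeric = [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
exact = add(scaled(matmul(T, G), -1.0), matmul(G, T_star), scale=-1.0)
max_error = max(abs(numeric[i][j] - exact[i][j]) for i in range(2) for j in range(2))
```

The difference `max_error` is limited only by the finite-difference step and rounding, which is the numerical counterpart of the statement that (2.14) is the derivative of (2.15) at the identity.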
Remark 2.3. The infinitesimal form of (2.15), where $g(x, z) = \exp(-t\, T(x, z))$, is Eq. (2.14).

2.4. The ADHM construction. We wish to study the action of the symmetries mentioned above on the moduli spaces of instanton solutions of the self-dual Yang–Mills equations on $\mathbb{R}^4$ or, equivalently, $S^4$. As such, we are concerned with connections whose curvatures are $L^2$, in which case we have
\[ \int_{\mathbb{R}^4} |F|^2 \, d^4x = -8\pi^2 k, \]
where $k \in \mathbb{N}_0$ is the second Chern number, $c_2(E)$, of the bundle $E$ (also called the instanton number of the connection). A self-dual connection with $L^2$ curvature on $\mathbb{R}^4$ necessarily extends to a self-dual connection on $S^4$ [29]. We will refer to such a connection, defined on either $\mathbb{R}^4$ or $S^4$, as an instanton. The moduli space of instanton solutions, with instanton number $k$, modulo gauge transformations is a manifold of dimension $(8k - 3)$ (away from singularities due to reducible connections), which we denote by $\mathcal{M}_k$. For later considerations, it will be important to think of the $\mathcal{M}_k$ as being finite-dimensional submanifolds of the (infinite-dimensional) space of all self-dual connections on $\mathbb{R}^4$, not necessarily having $L^2$ curvature, which we denote by $\mathcal{M}$. The symmetries of the self-dual Yang–Mills equations that we consider may be viewed as defining flows on the space $\mathcal{M}$, and our main question is when these flows preserve the sub-manifolds $\mathcal{M}_k$.

Via the Ward correspondence [5,30], self-dual connections of instanton number $k$ correspond to holomorphic bundles over $\mathbb{CP}^3$ that are trivial on real lines. All such bundles may be constructed directly in terms of quaternionic linear algebra by the ADHM construction [4], which we now briefly recall. For each $z = (z_1, z_2, z_3, z_4) \in \mathbb{C}^4$, we define a linear map $A(z) : W \to V$, between complex vector spaces $W$, $V$ of dimension $k$, $2k + 2$ respectively, which is of the form
\[ A(z) = \sum_{i=1}^{4} z_i A_i. \]
The space $W$ is assumed to admit an anti-linear involution $\sigma_W : W \to W$. The space $V$ is assumed to have a symplectic form $(\cdot, \cdot)$ and an anti-linear anti-involution $\sigma_V : V \to V$ that are compatible in the sense that
\[ (\sigma_V u, \sigma_V v) = \overline{(u, v)}, \qquad \forall\, u, v \in V. \]
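One concrete realisation of such a pair (symplectic form, anti-linear anti-involution) may help fix ideas. The model below is an assumption made purely for illustration and is not taken from the paper: $V = \mathbb{C}^{2n}$, $(u, v) = u^{T} \Omega v$ with $\Omega$ the standard symplectic matrix, and $\sigma_V(v) = \Omega \bar v$. The sketch checks, on sample vectors with $2n = 4$ (the dimension $2k + 2$ for $k = 1$), that the form is antisymmetric, that $\sigma_V^2 = -1$, and that the compatibility condition holds with a complex conjugate on the right-hand side.

```python
# Sketch of one standard model (an assumption, not taken from the paper):
# V = C^(2n), (u, v) = u^T Omega v with Omega = [[0, I], [-I, 0]], sigma_V(v) = Omega conj(v).

def omega_apply(v):
    """Apply Omega to a vector of even length 2n: (a, b) -> (b, -a)."""
    n = len(v) // 2
    return v[n:] + [-c for c in v[:n]]

def form(u, v):
    """Bilinear symplectic form (u, v) = u^T Omega v."""
    return sum(a * b for a, b in zip(u, omega_apply(v)))

def sigma_V(v):
    """Anti-linear map v -> Omega conj(v)."""
    return omega_apply([c.conjugate() for c in v])

# Arbitrary sample vectors in C^4.
u = [1.0 + 2.0j, -0.5j, 0.3 + 0.0j, 2.0 - 1.0j]
v = [0.7 - 0.2j, 1.0 + 1.0j, -1.5j, 0.25 + 0.5j]

antisymmetric = abs(form(u, v) + form(v, u)) < 1e-12
anti_involution = all(abs(a + b) < 1e-12 for a, b in zip(sigma_V(sigma_V(u)), u))
compatible = abs(form(sigma_V(u), sigma_V(v)) - form(u, v).conjugate()) < 1e-12
```

In this model the conjugate in the compatibility condition is forced: both sides are anti-linear in $u$ and $v$, so the relation could not hold for all complex vectors without it.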
We require that the map $A(z)$ satisfies the compatibility condition
\[ \sigma_V(A(z) w) = A(\sigma(z))\, \sigma_W(w), \qquad \forall\, w \in W, \quad \forall z \in \mathbb{C}^4, \tag{2.16} \]
where $\sigma : \mathbb{C}^4 \to \mathbb{C}^4$ is as in Eq. (2.4). Finally, we impose the following conditions:
– For all $z \in \mathbb{C}^4$, the space $U_z := A(z)(W) \subset V$ is of dimension $k$;
– For all $z \in \mathbb{C}^4$, $U_z$ is isotropic with respect to $(\cdot, \cdot)$, i.e. $U_z \subseteq U_z^\perp$, where $\perp$ denotes the complement with respect to the form $(\cdot, \cdot)$.

If we then define the quotient $E_z := U_z^\perp / U_z$, then the collection of $E_z$ defines a holomorphic, rank-2 complex vector bundle $E \to \mathbb{CP}^3$ with structure group $SL_2(\mathbb{C})$. The reality condition (2.16) then implies that the bundle is trivial on restriction to any real line and that the self-dual connection on $S^4$ determined by the Ward correspondence is an $SU_2$ connection.

3. Patching Matrix Description of ADHM Construction

In order to make contact between the action of the non-local symmetries of the self-dual Yang–Mills equations in the form of (2.13) and the ADHM construction, we first need to reformulate the ADHM construction in terms of patching matrices. We assume given an instanton solution of the self-dual Yang–Mills equations on $S^4$, with corresponding holomorphic bundle $E \to \mathbb{CP}^3$. We then consider (without any loss of information [29]) the restriction of this solution to $\mathbb{R}^4 \subset S^4$ and the restriction of the bundle $E$ to $\pi^{-1}(\mathbb{R}^4) \equiv \mathbb{CP}^3 \setminus L_\infty$, which, for convenience, we also denote by $E \to \mathbb{CP}^3 \setminus L_\infty$. We split the set $\mathbb{CP}^3 \setminus L_\infty$ as the union of two regions
\[ S_0 := \left\{ ((u, v), z) \in \mathbb{C}^2 \times \mathbb{CP}^1 \;\middle|\; |z| < 1 + \epsilon \right\} = \mathbb{C}^2 \times V^0_\epsilon, \]
\[ S_\infty := \left\{ ((u, v), z) \in \mathbb{C}^2 \times \mathbb{CP}^1 \;\middle|\; |z| > \frac{1}{1 + \epsilon} \right\} = \mathbb{C}^2 \times V^\infty_\epsilon. \]
Since $S_0, S_\infty \cong \mathbb{C}^3$, the bundle $E$ restricted to either of these regions is holomorphically trivial [9,23]. The bundle $E$ is therefore characterised by the transition function $G : S_0 \cap S_\infty \to SL_2(\mathbb{C})$, which is the patching matrix from Sect. 2.2.

The map $G$ may be constructed directly from the ADHM data, at the expense of fixing bases on the spaces $V$ and $W$. In particular, let $\{a_i\}_{i=1}^{k}$ be a basis of vectors in $W$ that are real with respect to $\sigma_W$, in the sense that
\[ \sigma_W(a_i) = a_i, \qquad i = 1, \dots, k. \]
(So, in practice, we are looking on $W$ as being the complexification of the fixed point set of $\sigma_W$.) The vectors
\[ v_i(z) := A(z) a_i \in V, \qquad i = 1, \dots, k \]
define a collection of $k$ vectors that span the space $U_z \subset V$. Due to the reality of the vectors $a_i$, these vectors obey the reality condition
\[ \sigma_V(v_i(z)) = v_i(\sigma(z)), \qquad i = 1, \dots, k, \quad \forall z \in \mathbb{C}^4. \tag{3.1} \]
Since $U_z$ is isotropic with respect to the symplectic form, we deduce that
\[ \left( v_i(z), v_j(z) \right) = 0, \qquad i, j = 1, \dots, k, \quad z \in \mathbb{C}^4. \tag{3.2} \]
We now view $z$ as homogeneous coordinates on $\mathbb{CP}^3$, and construct bases for $[z] \in S_0 \subset \mathbb{CP}^3 \setminus L_\infty$. Given $[z] \in S_0$, the annihilator $U_z^\perp$ is spanned by $\{v_i(z)\}$ along with two vectors $\{ e_A(z) \mid A = 1, 2 \}$ that span $U_z^\perp / U_z$. We therefore have
\[ (v_i(z), e_A(z)) = 0, \qquad i = 1, \dots, k, \quad A = 1, 2, \quad [z] \in S_0, \tag{3.3} \]
and, without loss of generality, may assume that
\[ (e_1(z), e_2(z)) = -(e_2(z), e_1(z)) = 1. \tag{3.4} \]
Although not strictly necessary, it will sometimes be useful to extend the vectors $\{e_A(z), v_i(z)\}$ to a full basis for $V$ by adding a set of vectors $\{ w^i(z) \mid i = 1, \dots, k \}$ with the property that
\[ \left( w^i(z), w^j(z) \right) = 0, \qquad \left( v_i(z), w^j(z) \right) = \delta_i^j, \qquad \left( w^i(z), e_A(z) \right) = 0. \tag{3.5} \]
We may also define a basis $\{ f_A(z) \mid A = 1, 2 \}$ for $U_z^\perp / U_z$ for $[z] \in S_\infty$ by the relations
\[ f_1(z) := -\sigma_V(e_2(\sigma(z))), \qquad f_2(z) := \sigma_V(e_1(\sigma(z))). \]
This basis automatically has the property that
\[ (f_1(z), f_2(z)) = 1, \qquad (v_i(z), f_A(z)) = 0. \tag{3.6} \]
Given that $\{e_A(z)\}$ and $\{f_A(z)\}$ are both bases for $U_z^\perp / U_z$ for $[z] \in S_0 \cap S_\infty$, there exist functions $G_A{}^B(z)$, $\lambda_A{}^i(z)$ defined on this region with the property that
\[ f_A(z) = G_A{}^B(z)\, e_B(z) + \lambda_A{}^i(z)\, v_i(z). \tag{3.7} \]
(From now on, the summation convention will be assumed over repeated indices.) The matrix $G(z)$, defined for $[z] \in S_0 \cap S_\infty$, is then the transition function of our bundle $E$.

Before deriving some properties of the patching matrix that we will require, we define the $SL_2(\mathbb{C})$-invariant tensor $\epsilon$ by $\epsilon^{AB} = -\epsilon^{BA}$ with $\epsilon^{12} = 1$ and the $SO_2(\mathbb{C})$-invariant tensor $\delta$ with components
\[ \delta_{AB} = \begin{cases} 1 & A = B, \\ 0 & A \neq B. \end{cases} \]

Proposition 3.1. The patching matrix, $G$, as defined above obeys the conditions
\[ \det G(z) = 1, \qquad G^*(z) = G(z), \]
for $[z] \in S_0 \cap S_\infty$, where $G^*(z) := G(\sigma(z))^\dagger$. The functions $\lambda_A{}^i$ obey the reality condition
\[ \overline{\lambda_A{}^i(\sigma(z))} = \delta_{AB}\, G_C{}^B\, \epsilon^{CD}\, \lambda_D{}^i(z), \qquad \lambda_A{}^i(z) = -G_A{}^B\, \delta_{BC}\, \epsilon^{CD}\, \overline{\lambda_D{}^i(\sigma(z))} \]
for $[z] \in S_0 \cap S_\infty$.
Proof. Firstly, we have
$$1 = (f_1(z), f_2(z)) = \left(G_1{}^B(z) e_B(z) + \lambda_1{}^i(z) v_i(z),\; G_2{}^B(z) e_B(z) + \lambda_2{}^i(z) v_i(z)\right) = \left(G_1{}^1(z) G_2{}^2(z) - G_1{}^2(z) G_2{}^1(z)\right) (e_1(z), e_2(z)) = \det G(z),$$
where the four equalities follow from Eqs. (3.6), (3.7), (3.3) and (3.4), respectively. Therefore, $\det G(z) = 1$, as required. The definition of the vectors $f_A(z)$ may be rewritten in the form
$$f_A(z) = -\delta_{AB}\, \epsilon^{BC} \sigma_V(e_C(\sigma(z))). \tag{3.8}$$
We now apply $\sigma_V$ to this equation, substitute Eqs. (3.7) and (3.1), and use the antilinear, anti-involutive nature of $\sigma_V$. After some manipulation of $\delta$ and $\epsilon$ tensors, and using the fact that $\det G = 1$, we then find that
$$f_A(z) = G^*(z)_A{}^B \left[ e_B(z) - \delta_{BC}\, \epsilon^{CD}\, \overline{\lambda_D{}^i(\sigma(z))}\, v_i(z) \right].$$
Comparing with (3.7) then gives the required equalities.
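For the reader's convenience, the middle step of the determinant computation can be spelled out as follows. This is only a sketch: it assumes, as we read the relations (3.2) and (3.3), that $(v_i, v_j) = 0$ and $(v_i, e_A) = 0$, and it uses the normalisation (3.4) in the form $(e_B, e_C) = \epsilon_{BC}$.

```latex
\begin{align*}
(f_1, f_2)
  &= G_1{}^B G_2{}^C (e_B, e_C)
   + G_1{}^B \lambda_2{}^i (e_B, v_i)
   + \lambda_1{}^i G_2{}^C (v_i, e_C)
   + \lambda_1{}^i \lambda_2{}^j (v_i, v_j) \\
  &= G_1{}^B G_2{}^C \epsilon_{BC}
   \qquad \text{(the last three terms vanish under the stated assumptions)} \\
  &= G_1{}^1 G_2{}^2 - G_1{}^2 G_2{}^1
   = \det G .
\end{align*}
```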
Remark 3.1. We will be primarily interested in Eq. (3.7) when it is restricted to a real line $L_x \subset \mathbb{CP}^3 \setminus L_\infty$. Since the patching matrix, $G$, defined above is holomorphic on $\mathbb{CP}^3 \setminus L_\infty$, when restricted to a neighbourhood of the line $L_x \equiv L_{(u,v)}$, $G$ will restrict to a function (which we denote by $G(x, z)$) that is holomorphic in $(u - z\bar v, v + z\bar u, z)$ for $z \in V_\epsilon$, for some $\epsilon > 0$.

4. One-Parameter Families of ADHM Data

We now consider a one-parameter family of ADHM data $A(z) := A(t : z)$, with $t \in I$ a parameter, $I$ a sub-interval of the real line containing the origin. We assume that $A(t : z)$ is a $C^1$ function of $t$. We wish to investigate how the elements of the above explicit construction depend on $A(t : z)$. The image $A(t : z)(W)$ is now spanned by the vectors $\{v_i(t : z)\}$, and $U_z^\perp/U_z$ is spanned by $\{e_A(t : z)\}$, which we assume normalised such that (3.4) is satisfied for each $t \in I$. Constructing the vectors $\{f_A(t : z)\}$, we then define the patching matrix $G_A{}^B(t : z)$ and the functions $\lambda_A{}^i(t : z)$ as in (3.7).

Proposition 4.1. Given a one-parameter family of ADHM data, $A(t : z)$, and patching matrices as defined in (3.7), there exists a matrix-valued function $d(t : z)$ with the property that
$$\dot G(t : z) = d(t : z)\, G(t : z) + G(t : z)\, d^*(t : z). \tag{4.1}$$
Proof. To investigate the $t$-dependence of these quantities, we consider their derivatives with respect to $t$. The derivatives of the relevant vectors are given as follows:6
$$\dot v_i = A_i{}^j v_j + B_{ij} w^j + \epsilon^{AB} s_{Ai}\, e_B, \tag{4.2a}$$
$$\dot w^i = C^{ij} v_j - A_j{}^i w^j - \epsilon^{AB} r_A{}^i\, e_B, \tag{4.2b}$$
$$\dot e_A = c_A{}^B e_B + r_A{}^i v_i + s_{Ai} w^i, \tag{4.2c}$$
6 Everything depends on $(t : z)$, but we drop explicit mention of this dependence for the moment.
where $A_i{}^j, \dots, s_{Ai}$ are functions of $(t, z)$ that satisfy the relationships
$$B_{ij} = B_{ji}, \qquad C^{ij} = C^{ji}, \qquad c_A{}^A = 0.$$
It is straightforward to check that these are the most general forms of $\dot v_i$, $\dot w^i$, $\dot e_A$ that preserve the relations (3.2), (3.3), (3.4) and (3.5). We also define functions that characterise the time-dependence of the vector fields $f_A$:
$$\dot f_A = d_A{}^B f_B + t_A{}^i v_i + u_{Ai} w^i. \tag{4.3}$$
From this expression and Eq. (3.7), we deduce that
$$\dot G_A{}^B = d_A{}^C G_C{}^B - G_A{}^C c_C{}^B + \lambda_A{}^i \epsilon^{BC} s_{Ci}, \tag{4.4}$$
along with the relations
$$\dot\lambda_A{}^i = d_A{}^C \lambda_C{}^i + t_A{}^i - G_A{}^B r_B{}^i - \lambda_A{}^j A_j{}^i, \qquad u_{Ai} = G_A{}^B s_{Bi} + \lambda_A{}^j B_{ji}.$$
Also, equating $\dot f_1$ with $-\dot{\bar e}_2$, and $\dot f_2$ with $\dot{\bar e}_1$, we find that
$$d_A{}^B = u_{Ai} \lambda^{Bi} + \epsilon_{AC}\, \delta^{CD}\, \bar c_D{}^E\, \delta_{EF}\, \epsilon^{BF}, \qquad u_{Ai} = \epsilon_{AB}\, \delta^{BC}\, \bar s_{Ci}.$$
These equations, along with (4.4), imply that the $t$-derivative of the patching matrix obeys the relation (4.1) with $d = u_i \otimes \lambda^i + \epsilon_{AC}\, \delta^{CD}\, \bar c_D{}^E\, \delta_{EF}\, \epsilon^{BF}$, as required.
Remark 4.1. The quantities that occur in Eq. (4.1) may all be constructed directly from the vector fields $e_A$, $v_i$ since
$$(v_i, \dot e_A) = s_{Ai}, \qquad (\dot e_A, e_B) = c_A{}^C \epsilon_{CB}.$$
Therefore the construction does not actually require the introduction of the basis vectors $\{w^i\}$.

Corollary 4.1. Given a one-parameter family of ADHM data and patching matrix defined as above, there exists a map $d : I \times \mathbb{C}^2 \times V_\epsilon \to SL_2(\mathbb{C})$ that is holomorphic in $(u - z\bar v, v + z\bar u, z)$ for $z \in V_\epsilon$ such that the restriction of the patching matrix to real lines $L_x$ evolves according to
$$\dot G(t : x, z) = d(t : x, z)\, G(t : x, z) + G(t : x, z)\, d^*(t : x, z), \tag{4.5}$$
for $(x, z) \in \mathbb{C}^2 \times V_\epsilon$.

Proof. Restrict (4.1) to $L_x$.
Remark 4.2. Let $\alpha(t : x, z)$ satisfy the first order ordinary differential equation
$$\dot\alpha(t : x, z) = d(t : x, z)\, \alpha(t : x, z), \qquad \alpha(0 : x, z) = \mathrm{Id}.$$
Given an initial patching matrix $G(x, z)$, it follows that the one-parameter family of patching matrices
$$G(t : x, z) := \alpha(t : x, z)\, G(x, z)\, \alpha^*(t : x, z) \tag{4.6}$$
satisfies Eq. (4.1) with initial conditions $G(0 : x, z) = G(x, z)$. Conversely, by uniqueness of solutions of (4.1), it follows that $G(t : x, z)$, as defined in Eq. (4.6), is the unique one-parameter family of patching matrices determined by the flow (4.1) with initial data $G(x, z)$.

Note that these transformations (4.5) and (4.6) are of the same form as those generated by the symmetries of the self-dual Yang–Mills equations given in Eq. (2.13) and Theorem 2.2, with the important proviso that the function $d(t : x, z)$ occurring in (4.5) depends explicitly on the parameter $t$. The symmetries (2.13) should be viewed as defining a flow on the space, $\mathcal{M}$, of self-dual connections defined by the map $T$. In solving (2.13), we are simply constructing the integral curves of this flow, with $t$ a parameter along the integral curve. As such, in (2.13), it is important that the function $T(x, z)$ is independent of the parameter $t$. Viewing the function $T$ as defining a flow on $\mathcal{M}$ and the instanton moduli spaces $\mathcal{M}_k$ as submanifolds of $\mathcal{M}$, we directly deduce:

Theorem 4.1. Let $A \in \mathcal{M}_k$ be a $k$-instanton self-dual connection (modulo gauge transformation) on $\mathbb{R}^4$, with $\mathcal{M}_k$ viewed as a submanifold of the space, $\mathcal{M}$, of all self-dual connections on $\mathbb{R}^4$. Then for each vector $v \in T_A \mathcal{M}_k$, there exists a function $T$ such that the fundamental vector field on $\mathcal{M}$ corresponding to $T$ via Eq. (2.13) coincides with $v$ at the point $A \in \mathcal{M}_k$.

Proof. Any element $v \in T_A \mathcal{M}_k$ is generated by a one-parameter family of ADHM data, $A(t : z)$, with $A(0 : z)$ corresponding to the connection $A$. This one-parameter family of ADHM data then gives rise to a one-parameter family of patching matrices $G(t : x, z)$ evolving according to (4.5), where $G(0 : x, z)$ is the patching matrix corresponding to $A$ and $\dot G(0 : x, z)$ corresponds to the tangent vector $v$. Taking $T(x, z) := -d(0 : x, z)$ gives a symmetry that, via (2.13) (with $\rho_0 = \rho_\infty = 0$), generates the tangent vector $v$.

Remark 4.3.
Theorem 2.1 states that, given a function T , there is a corresponding fundamental vector field on M, the space of self-dual connections, corresponding to T . We shall denote this fundamental vector field by XT . Theorem 4.1 states that, given a connection A ∈ Mk and a tangent vector v ∈ TA Mk , then there exists such a function T such that XT |A = v. It is important to note, however, that the integral curve of XT starting at A ∈ Mk will, generally, not remain within the sub-manifold Mk of M. In order to determine which one-parameter groups of symmetries gives flows that remain in the moduli space Mk , we need to determine which transformations of the form (4.5) are generated by transformations of the form (2.13), with T (x, z) independent of t. From the form of (4.5) and (2.13), it appears natural to identify d(t : x, z) with −T (x, z) + ρ∞ (t, x, z). We impose that T is independent of t. The map ρ∞ simply generates a change of holomorphic frame for z ∈ V∞ . At this point, we should recall that we have partially fixed our holomorphic frames in deriving our patching matrix from the
ADHM data. If we wish to employ our approach with one-parameter families of ADHM data, we must therefore allow for one-parameter families of changes of frame in order to compensate for this fixing of frames. Accordingly, we should allow $\rho_\infty$ to be $t$-dependent (i.e. $\rho_\infty = \rho_\infty(t, x, z)$). Note that such a $t$-dependent change of frame does not affect the corresponding self-dual connection $A(t)$. We may therefore use $\rho_\infty$ to absorb any part of $d(t : x, z)$ that is holomorphic on $\mathbb{C}^2 \times V_\infty$, leaving an irreducible part of $d(t : x, z)$, denoted $d_0(t : x, z)$, that has singularities in the region $V_\infty$ that cannot be removed by absorption into $\rho_\infty$. In order to arise from a symmetry of the self-dual Yang–Mills equations, $d_0(t : x, z)$ must then be independent of $t$. Since $d(t : x, z)$ is determined by first $t$-derivatives of the ADHM data, $A(t : z)$, imposing that $d_0(t : x, z)$ is constant in $t$ will impose conditions on the first $t$-derivatives of the $A(t : z)$ data that must be satisfied in order for this one-parameter family of data (and corresponding self-dual connections) to arise from a symmetry of the self-dual Yang–Mills equations. Explicit calculations, in the next section, suggest that these conditions are quite restrictive.

Remark 4.4. The fact that the flow on the moduli space does not generally preserve the $L^2$ nature of the curvature is well-known (see, e.g., [6–8] where this effect is mentioned). In [9, Chap. V], an explicit example of a transformation acting on a one-instanton patching matrix is given to demonstrate this phenomenon. In the notation of (4.6), this transformation takes the form
$$\alpha(t : x, z) = \frac{1}{\sqrt{1 - t^2}} \begin{pmatrix} 1 & t/z \\ tz & 1 \end{pmatrix}.$$
From this expression, we deduce that
$$d(t : x, z) = \frac{1}{(1 - t^2)^{3/2}} \begin{pmatrix} t & 1/z \\ z & t \end{pmatrix}.$$
Following the programme of the previous remark, we then isolate the part of $d$ that has singularities in the region $z \in V_\infty$, namely
$$d_0(t : x, z) = \frac{1}{(1 - t^2)^{3/2}} \begin{pmatrix} 0 & 0 \\ z & 0 \end{pmatrix}.$$
Since $d_0$ depends explicitly on $t$, we deduce that the counterexample provided in [9, Chap. V] falls outside of the class of transformations generated by transformations (2.13) with $T$ independent of $t$.

Remark 4.5. If one drops the reality condition that our connections are $SU_2$ connections, rather than $SL_2(\mathbb{C})$ connections, then Takasaki has argued [27] that the action of the non-local symmetry group generated by transformations of the form
$$\dot J(x) = \chi_\infty(x, \lambda)\, T(x, \lambda)\, \chi_\infty(x, \lambda)^{-1} \cdot J$$
is transitive on the space of $SL_2(\mathbb{C})$ solutions of the self-dual Yang–Mills equations. If, as here, we restrict to symmetries of the form (2.12) that explicitly preserve the $SU_2$ nature of the connection, then the symmetry group need not act transitively on the moduli space of solutions, even though the symmetries have been shown to generate the tangent space at each point. Moreover, if we explicitly impose that we only consider symmetries that preserve the $L^2$ nature of the connection, then the explicit calculations carried out in the next section for the one-instanton moduli space suggest that the orbits of the symmetry group are actually of high codimension in the moduli space.
5. The One-Instanton Solution

In the case of a one-instanton solution, it is straightforward to carry out the ADHM construction and the construction of deformations explicitly. We find that the one-parameter subgroups of ADHM data with $d(t : x, z)$ of the form $-T(x, z) + \rho_\infty(t, x, z)$ are rather small.

First, we fix some notation. In the case $k = 1$, we may write
$$v(z) := A(z) = \begin{pmatrix} A_1(z) \\ A_2(z) \\ A_3(z) \\ A_4(z) \end{pmatrix} \in \mathbb{C}^4,$$
where $A_i(z) = \sum_{j=1}^{4} A_i{}^j z_j$, $i = 1, \dots, 4$. Letting
$$\sigma_V \begin{pmatrix} \alpha \\ \beta \\ \gamma \\ \delta \end{pmatrix} := \begin{pmatrix} -\bar\beta \\ \bar\alpha \\ -\bar\delta \\ \bar\gamma \end{pmatrix},$$
then (3.1) implies that the functions $A_i(z)$ must satisfy the reality conditions:
$$\overline{A_1(\sigma(z))} = -A_2(z), \qquad \overline{A_2(\sigma(z))} = A_1(z), \qquad \overline{A_3(\sigma(z))} = -A_4(z), \qquad \overline{A_4(\sigma(z))} = A_3(z).$$
In particular, using the symmetry transformations inherent in the ADHM construction [2, Chap. II], we may fix
$$A_1(z) = \lambda z_1, \qquad A_2(z) = \lambda z_2, \tag{5.1a}$$
$$A_3(z) = \alpha z_1 - \bar\beta z_2 - z_3, \qquad A_4(z) = \beta z_1 + \bar\alpha z_2 - z_4, \tag{5.1b}$$
where $\lambda$ is a positive, real number and $\alpha$, $\beta$ are complex numbers. Finally, we may take the symplectic form on $V \cong \mathbb{C}^4$ to be
$$(a, b) = a^1 b^2 - a^2 b^1 + a^3 b^4 - a^4 b^3, \qquad a, b \in \mathbb{C}^4.$$
Theorem 5.1. The only transformations of the ADHM data $(\lambda, \alpha, \beta)$ that arise from a non-local symmetry of the self-dual Yang–Mills equations (2.12) according to (4.5) with $d(t : x, z)$ of the form $-T(x, z) + \rho_\infty(t : x, z)$ are of the form
$$\lambda \mapsto \lambda(t) := \frac{\lambda}{\sqrt{1 - k\lambda^2 t}}, \qquad \alpha, \beta \ \text{constant}, \tag{5.2}$$
where $k \in \mathbb{R}$ is a real constant.

Proof. On a region with $A_1(z) \neq 0$ (and hence $A_2(z) \neq 0$), we find that $U_z^\perp/U_z = v(z)^\perp/v(z)$ is spanned by the vectors
$$e_1(z) = \left(0, \frac{A_4(z)}{A_1(z)}, 1, 0\right), \qquad e_2(z) = \left(0, -\frac{A_3(z)}{A_1(z)}, 0, 1\right),$$
which have the property that $(e_1, e_2) = 1$. Such a basis, including the normalisation property, is unique up to a translation $e_A \mapsto e_A + \lambda_A v$, and an $SL_2(\mathbb{C})$ rotation of the vectors $e_A(z)$. Taking the conjugates of these vectors, we find that
$$f_1(z) = -\bar e_2(z) = \left(-\frac{A_4(z)}{A_2(z)}, 0, 1, 0\right), \qquad f_2(z) = \bar e_1(z) = \left(\frac{A_3(z)}{A_2(z)}, 0, 0, 1\right).$$
These expressions imply that, on the overlap of the two regions above, we have the patching matrix (see [9, Chap. V])
$$G = \begin{pmatrix} 1 + \dfrac{A_3(z) A_4(z)}{A_1(z) A_2(z)} & \dfrac{A_4(z)^2}{A_1(z) A_2(z)} \\[2ex] -\dfrac{A_3(z)^2}{A_1(z) A_2(z)} & 1 - \dfrac{A_3(z) A_4(z)}{A_1(z) A_2(z)} \end{pmatrix}$$
and
$$\lambda_1(z) = -\frac{A_4(z)}{A_1(z) A_2(z)}, \qquad \lambda_2(z) = \frac{A_3(z)}{A_1(z) A_2(z)}.$$
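As a sanity check on the expressions above, the following minimal Python sketch verifies, for sample rational values of $A_1, \dots, A_4$, that the stated $G$ and $\lambda_A$ satisfy the decomposition (3.7), that $\det G = 1$, and that the bases are normalised with respect to the symplectic form. The function names `symp`, `patching_data` and `check` are our own, hypothetical choices; the check is purely algebraic and does not use the reality conditions.

```python
from fractions import Fraction

def symp(a, b):
    """Symplectic form (a, b) = a1 b2 - a2 b1 + a3 b4 - a4 b3 on C^4."""
    return a[0] * b[1] - a[1] * b[0] + a[2] * b[3] - a[3] * b[2]

def patching_data(A1, A2, A3, A4):
    """Build v, e_A, f_A, G and lambda_A for given values of A_1..A_4 (A1, A2 nonzero)."""
    v = (A1, A2, A3, A4)
    e = ((0, A4 / A1, 1, 0), (0, -A3 / A1, 0, 1))
    f = ((-A4 / A2, 0, 1, 0), (A3 / A2, 0, 0, 1))
    D = A1 * A2
    G = ((1 + A3 * A4 / D, A4 * A4 / D),
         (-A3 * A3 / D, 1 - A3 * A4 / D))
    lam = (-A4 / D, A3 / D)
    return v, e, f, G, lam

def check(A1, A2, A3, A4):
    """Return True if (3.7), det G = 1 and the normalisations all hold exactly."""
    v, e, f, G, lam = patching_data(A1, A2, A3, A4)
    # Eq. (3.7): f_A = G_A^B e_B + lambda_A v, checked component by component.
    for A in range(2):
        rhs = tuple(G[A][0] * e[0][c] + G[A][1] * e[1][c] + lam[A] * v[c]
                    for c in range(4))
        if rhs != f[A]:
            return False
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    return (det == 1
            and symp(e[0], e[1]) == 1
            and symp(f[0], f[1]) == 1
            and all(symp(v, x) == 0 for x in e + f))

if __name__ == "__main__":
    print(check(Fraction(2), Fraction(3), Fraction(5), Fraction(7)))
```

Exact rational arithmetic is used so that the identities are tested without rounding error.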
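We may take the vector $w(z)$ to be
$$w = \left(0, \frac{1}{A_1(z)}, 0, 0\right),$$
which is unique up to $w \mapsto w + \phi v$. If we now let $v(z)$ depend smoothly on a parameter $t \in (-\epsilon, \epsilon)$, then we may calculate the parameters of the deformation $A, B, C, \dots$ as defined in (4.2) and (4.3). The parameter $d$ is the one that we primarily require and a straightforward calculation shows that
$$d(t : z) = \frac{\partial}{\partial t} \begin{pmatrix} A_4(t : z)/A_2(t : z) \\ -A_3(t : z)/A_2(t : z) \end{pmatrix} \times \begin{pmatrix} A_3(t : z)/A_1(t : z) & A_4(t : z)/A_1(t : z) \end{pmatrix}.$$
Taking $A_i(t : z)$ as in (5.1), with $\lambda$ replaced by $\lambda(t)$, etc., then, restricted to the line $L_x$, the deformation parameter that we require takes the form
$$d(x, z) = \frac{1}{z}\, \frac{\partial}{\partial t} \begin{pmatrix} \dfrac{(\beta - v) + z(\bar\alpha - \bar u)}{\lambda} \\[2ex] -\dfrac{(\alpha - u) - z(\bar\beta - \bar v)}{\lambda} \end{pmatrix} \times \begin{pmatrix} \dfrac{(\alpha - u) - z(\bar\beta - \bar v)}{\lambda} & \dfrac{(\beta - v) + z(\bar\alpha - \bar u)}{\lambda} \end{pmatrix}.$$
This expression may be written in the form
$$d(x, z) = \frac{1}{z} \left[ A (u - z\bar v)^2 + B (u - z\bar v)(v + z\bar u) + C (v + z\bar u)^2 \right] + \left( \frac{D}{z} + E \right) (u - z\bar v) + \left( \frac{F}{z} + G \right) (v + z\bar u) + \frac{H}{z} + I + J z,$$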
where
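$$A = \frac{\dot\lambda}{\lambda^3} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad B = \frac{\dot\lambda}{\lambda^3} \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad C = \frac{\dot\lambda}{\lambda^3} \begin{pmatrix} 0 & -1 \\ 0 & 0 \end{pmatrix},$$
$$D = \frac{1}{\lambda^3} \begin{pmatrix} -\lambda\dot\beta + \beta\dot\lambda & 0 \\ \lambda\dot\alpha - 2\alpha\dot\lambda & -\beta\dot\lambda \end{pmatrix}, \qquad E = \frac{1}{\lambda^3} \begin{pmatrix} -\lambda\dot{\bar\alpha} + \bar\alpha\dot\lambda & 0 \\ -\lambda\dot{\bar\beta} + 2\bar\beta\dot\lambda & -\bar\alpha\dot\lambda \end{pmatrix},$$
$$F = \frac{1}{\lambda^3} \begin{pmatrix} \alpha\dot\lambda & -\lambda\dot\beta + 2\beta\dot\lambda \\ 0 & \lambda\dot\alpha - \alpha\dot\lambda \end{pmatrix}, \qquad G = \frac{1}{\lambda^3} \begin{pmatrix} -\bar\beta\dot\lambda & -\lambda\dot{\bar\alpha} + 2\bar\alpha\dot\lambda \\ 0 & -\lambda\dot{\bar\beta} + \bar\beta\dot\lambda \end{pmatrix},$$
$$H = \frac{1}{\lambda^3} \begin{pmatrix} \alpha(\lambda\dot\beta - \beta\dot\lambda) & \beta(\lambda\dot\beta - \beta\dot\lambda) \\ \alpha(-\lambda\dot\alpha + \alpha\dot\lambda) & \beta(-\lambda\dot\alpha + \alpha\dot\lambda) \end{pmatrix},$$
$$I = \frac{1}{\lambda^3} \begin{pmatrix} \alpha\lambda\dot{\bar\alpha} - \bar\beta\lambda\dot\beta - \alpha\bar\alpha\dot\lambda + \beta\bar\beta\dot\lambda & \beta\lambda\dot{\bar\alpha} + \bar\alpha\lambda\dot\beta - 2\bar\alpha\beta\dot\lambda \\ \bar\beta\lambda\dot\alpha + \alpha\lambda\dot{\bar\beta} - 2\alpha\bar\beta\dot\lambda & \bar\alpha(-\lambda\dot\alpha + \alpha\dot\lambda) + \beta(\lambda\dot{\bar\beta} - \bar\beta\dot\lambda) \end{pmatrix},$$
$$J = \frac{1}{\lambda^3} \begin{pmatrix} \bar\beta(-\lambda\dot{\bar\alpha} + \bar\alpha\dot\lambda) & \bar\alpha(\lambda\dot{\bar\alpha} - \bar\alpha\dot\lambda) \\ \bar\beta(-\lambda\dot{\bar\beta} + \bar\beta\dot\lambda) & \bar\alpha(\lambda\dot{\bar\beta} - \bar\beta\dot\lambda) \end{pmatrix},$$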
where a dot denotes differentiation with respect to $t$. According to the philosophy of Remark 4.3, we note that the coefficients $D$, $F$, $H$ and $I$ correspond to terms that are analytic for $z \in V_\infty$, and therefore may be absorbed into the $\rho_\infty$ term. The remaining part of the parameter $d$ is then
$$d_0(x, z) = \frac{1}{z} \left[ A (u - z\bar v)^2 + B (u - z\bar v)(v + z\bar u) + C (v + z\bar u)^2 \right] + E (u - z\bar v) + G (v + z\bar u) + J z.$$
All of the terms in $d_0$ have singularities at $z = \infty \in V_\infty$. In order for such transformations to arise from a $T$ that is independent of $t$ with $d = -T + \rho_\infty$, we therefore require that the remaining coefficients $A$, $B$, $C$, $E$, $G$ and $J$ must be independent of $t$ (i.e. constant). An analysis of the explicit form of these coefficients given above shows that this condition is only possible if
$$\frac{\dot\lambda}{\lambda^3} = \frac{k}{2}, \qquad \dot\alpha = \dot\beta = 0,$$
where $k$ is a constant. Integrating these equations yields (5.2). Therefore the only transformation on the one-instanton moduli space that arises from a symmetry of the form (2.12) with $d(t : x, z) = -T(x, z) + \rho_\infty(t, x, z)$ is a scaling of the moduli space.

Remark 5.1. The group of transformations on the one-instanton moduli space is therefore only one-dimensional. Such a collapse to a finite-dimensional action is familiar from the theory of harmonic maps (see, e.g., [1,17,20,28]), where the orbits of the group action are also, generically, of high codimension.

6. Final Remarks

Our first main result is Theorem 4.1, which states that the tangent spaces to the instanton moduli spaces, $\mathcal{M}_k$, are generated by symmetries of the self-dual Yang–Mills equations. Nevertheless, our second main result, based on an analysis of the one-instanton moduli
space, is that the subgroup of the symmetry group that preserves the $L^2$ nature of the connection, and hence has orbits that lie in a particular $\mathcal{M}_k$, is rather small. In particular, the orbits of this subgroup on the space $\mathcal{M}_k$ are of high codimension.

We have restricted ourselves to one-parameter families of ADHM data that arise from transformations of the form (2.13) and (4.5) with $d(t : x, z) = -T(x, z) + \rho_\infty(t, x, z)$. Note that this is a sufficient, but not necessary, condition for Eqs. (2.13) and (4.5) to be consistent. It is conceivable that there might be a larger group of transformations acting on the moduli spaces, $\mathcal{M}_k$, consistent with these equations, but we have not investigated this possibility.

It is hoped that there is a more elegant way of carrying out the calculations in the previous section. In particular (also regarding the remark in the previous paragraph) one would like to pull the infinitesimal action on the patching matrix (2.13) directly up to the space of ADHM data. An alternative approach to extending our analysis would be to investigate our approach from the point of view of Donaldson’s reformulation of the ADHM construction [12], where one views instantons as defining holomorphic bundles over $\mathbb{CP}^2$. Restricting our constructions to the $\mathbb{CP}^2$ picture is straightforward, but it is again difficult to directly calculate the action of the symmetry transformations on the data. Work of Nakamura [22] concerning dynamical systems defined on the space of data of the Donaldson construction may be relevant in this regard. The approach where one might expect the symmetries to have the simplest form would be within Atiyah’s reformulation [3] of the instanton moduli spaces in terms of holomorphic maps $\mathbb{CP}^1 \to \Omega G$. In this case, the connection with harmonic map theory is quite strong.
In the case of the self-dual Yang–Mills equations, however, one expects the symmetry group to act directly on the map in the Atiyah construction, whereas for harmonic maps the “dressing action” acts purely on the space $\Omega G$. It is also quite difficult to see directly how the action on the patching matrix or ADHM data transfers to the Atiyah picture, due to the non-holomorphic transformations required in passing from the ADHM construction to this approach.

More broadly, thinking of $(\lambda, \alpha, \beta)$ as coordinates on the five-dimensional ball (with $(\alpha, \beta)$ compactified to the four-sphere and $\lambda$ the radial coordinate), the flow in (5.2) is simply a radial scaling. In particular, for $k > 0$, the flow converges to the fixed point $\lambda = 0$ as $t \to -\infty$, and diverges to $+\infty$ as $t \to \left(\frac{1}{k\lambda^2}\right)^-$. Such flows are, in some respects, reminiscent of Morse flows, and it would be of interest to know whether our approach has a Morse-theoretic interpretation. In addition, it would be interesting to relate our work to other examples of systems where one has a symmetry algebra, but no corresponding group action, e.g. Teichmüller theory.7

As mentioned in the Introduction, the original motivation for this work was to determine whether the integrable systems approach to the self-dual Yang–Mills equations could give information about instanton moduli spaces as used in the more topological context of Donaldson theory. In this regard, the results of this paper should be viewed alongside the results of the companion paper [15]. In [15], reducible connections were studied on open subsets of $\mathbb{R}^4$, and were found to bear a strong resemblance to harmonic maps of finite type (see, e.g., [16, Chap. 24]). In particular, all reducible connections lie in the orbit, under flows (2.13), of the flat connection on $\mathbb{R}^4$. Therefore instanton solutions on $\mathbb{R}^4$ and reducible connections (which are necessarily not $L^2$ on $\mathbb{R}^4$) appear to have quite different behaviour under the symmetry group of the self-dual Yang–Mills equations.
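The radial flow (5.2) discussed in this section can be checked numerically. The sketch below is a minimal illustration, not part of the original argument: the function names `lam` and `ode_residual` and the sample values of $\lambda$ and $k$ are our own assumptions. It confirms by central differences that $\lambda(t) = \lambda/\sqrt{1 - k\lambda^2 t}$ satisfies $\dot\lambda/\lambda^3 = k/2$, and illustrates the two limits $t \to -\infty$ and $t \to 1/(k\lambda^2)$ for $k > 0$.

```python
import math

def lam(t, lam0, k):
    """lambda(t) = lam0 / sqrt(1 - k lam0^2 t), Eq. (5.2); needs 1 - k lam0^2 t > 0."""
    s = 1.0 - k * lam0 ** 2 * t
    if s <= 0.0:
        raise ValueError("t is at or beyond the blow-up time 1/(k lam0^2)")
    return lam0 / math.sqrt(s)

def ode_residual(t, lam0, k, h=1e-6):
    """Central-difference estimate of (d lambda/dt) / lambda^3 - k/2 at time t."""
    dlam = (lam(t + h, lam0, k) - lam(t - h, lam0, k)) / (2.0 * h)
    return dlam / lam(t, lam0, k) ** 3 - k / 2.0

if __name__ == "__main__":
    lam0, k = 1.5, 0.8                 # illustrative values only
    t_star = 1.0 / (k * lam0 ** 2)     # blow-up time for k > 0
    print(ode_residual(0.3, lam0, k))  # should be close to zero
    print(lam(-1e8, lam0, k))          # small: flow approaches lambda = 0
    print(lam(t_star - 1e-4, lam0, k)) # large: flow diverges near t_star
```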
Since reducible and irreducible connections play a different role in Donaldson’s work [11], corresponding to the smooth and singular parts of the moduli space respectively, it is striking that such connections also seem to have different behaviour from the point of view of integrable systems. In this respect, it would be of particular interest to investigate the one-instanton moduli space on $\mathbb{CP}^2$, where one has $L^2$ and reducible connections in the same moduli space.

7 The author is grateful to Prof. K. Ono for this suggestion.

Acknowledgements. This work was supported by START-project Y237–N13 of the Austrian Science Fund and a Visiting Professorship at the University of Vienna. The author is grateful to the anonymous referee, whose detailed comments led to the significant improvement of this paper.
A. Action of Symmetries on the Patching Matrix

It appears that the direct derivation of the infinitesimal flow, (2.13), from the flow of the $J$-function, (2.12), has not appeared in the literature. We therefore give a proof of this result here. The closest to our derivation that we have found is the corresponding construction for harmonic maps into Lie groups given in [28, §3-4]. For ease of notation, we define the quantities
$$\alpha(x, \lambda) := \Psi_\infty(x, \lambda)\, T(x, \lambda)\, \Psi_\infty(x, \lambda)^{-1}, \tag{A.1a}$$
$$\alpha(x, \lambda)^\dagger := \Psi_0(x, \sigma(\lambda))\, T(x, \lambda)^\dagger\, \Psi_0(x, \sigma(\lambda))^{-1}, \tag{A.1b}$$
and recall the solution of the linearisation equation, (2.12), in this notation:
$$\dot J(x) = \psi_\infty(x)^{-1} \left[ \alpha(x, \lambda) + \alpha(x, \lambda)^\dagger \right] \psi_0(x). \tag{A.2}$$
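Proposition A.1. There exists a function $h_\infty(x, z) \equiv h_\infty(u - z\bar v, v + z\bar u, z)$ with the property that
$$\dot\Psi_\infty(x, z) \Psi_\infty(x, z)^{-1} - \dot\psi_\infty(x) \psi_\infty(x)^{-1} = \frac{\lambda}{\lambda - z} \left( \alpha(x, \lambda) - \alpha(x, z) \right) + \frac{1}{1 + z\bar\lambda} \left( \alpha(x, \lambda)^\dagger - \alpha(x, \sigma(z))^\dagger \right) - \Psi_\infty(x, z)\, h_\infty(z)\, \Psi_\infty(x, z)^{-1}, \tag{A.3}$$
for all $z \in \mathbb{CP}^1$ such that $z \neq 0, \lambda, -1/\bar\lambda$. Similarly, there exists a function $h_0(x, z) \equiv h_0(u - z\bar v, v + z\bar u, z)$ such that
$$\dot\Psi_0(x, z) \Psi_0(x, z)^{-1} - \dot\psi_0(x) \psi_0(x)^{-1} = \frac{z}{\lambda - z} \left( \alpha(x, \lambda) - \alpha(x, z) \right) - \frac{z\bar\lambda}{1 + z\bar\lambda} \left( \alpha(x, \lambda)^\dagger - \alpha(x, \sigma(z))^\dagger \right) + \Psi_0(x, z)\, h_0(z)\, \Psi_0(x, z)^{-1}, \tag{A.4}$$
for all $z \in \mathbb{CP}^1$ such that $z \neq \infty, \lambda, -1/\bar\lambda$.

Proof. From (A.2), we deduce that
$$\dot\psi_0(x) \psi_0(x)^{-1} - \dot\psi_\infty(x) \psi_\infty(x)^{-1} = \alpha(x, \lambda) + \alpha(x, \lambda)^\dagger. \tag{A.5}$$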
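From the defining relations for $\Psi_0(x, z)$, $\Psi_\infty(x, z)$ we deduce that the derivatives of the components of the connection are given by
$$\dot A_{\bar u} - z \dot A_v = -(D_{\bar u} - z D_v) \left[ \dot\Psi_0(x, z) \Psi_0(x, z)^{-1} \right] = -(D_{\bar u} - z D_v) \left[ \dot\Psi_\infty(x, z) \Psi_\infty(x, z)^{-1} \right],$$
$$\dot A_{\bar v} + z \dot A_u = -(D_{\bar v} + z D_u) \left[ \dot\Psi_0(x, z) \Psi_0(x, z)^{-1} \right] = -(D_{\bar v} + z D_u) \left[ \dot\Psi_\infty(x, z) \Psi_\infty(x, z)^{-1} \right].$$
These expressions imply that
$$(D_{\bar u} - z D_v) \left[ \dot\Psi_\infty(x, z) \Psi_\infty(x, z)^{-1} \right] = D_{\bar u} \left[ \dot\psi_0(x) \psi_0(x)^{-1} \right] - z D_v \left[ \dot\psi_\infty(x) \psi_\infty(x)^{-1} \right],$$
$$(D_{\bar v} + z D_u) \left[ \dot\Psi_\infty(x, z) \Psi_\infty(x, z)^{-1} \right] = D_{\bar v} \left[ \dot\psi_0(x) \psi_0(x)^{-1} \right] + z D_u \left[ \dot\psi_\infty(x) \psi_\infty(x)^{-1} \right].$$
We need to solve these equations for $\Psi_\infty(x, z)$ with the boundary condition that $\dot\Psi_\infty(x, z) \to \dot\psi_\infty(x)$ as $z \to \infty$. These equations may be rewritten in the form
$$(D_{\bar u} - z D_v) \left[ \dot\Psi_\infty(x, z) \Psi_\infty(x, z)^{-1} - \dot\psi_\infty(x) \psi_\infty(x)^{-1} \right] = D_{\bar u} \left[ \alpha(x, \lambda) + \alpha(x, \lambda)^\dagger \right],$$
$$(D_{\bar v} + z D_u) \left[ \dot\Psi_\infty(x, z) \Psi_\infty(x, z)^{-1} - \dot\psi_\infty(x) \psi_\infty(x)^{-1} \right] = D_{\bar v} \left[ \alpha(x, \lambda) + \alpha(x, \lambda)^\dagger \right].$$
We now note that
$$(D_{\bar u} - \lambda D_v)\, \alpha(x, \lambda) = (D_{\bar v} + \lambda D_u)\, \alpha(x, \lambda) = 0.$$
Therefore, for all $z \neq \lambda$,
$$D_{\bar u}\, \alpha(x, \lambda) = \frac{\lambda}{\lambda - z} (D_{\bar u} - z D_v)\, \alpha(x, \lambda), \qquad D_{\bar v}\, \alpha(x, \lambda) = \frac{\lambda}{\lambda - z} (D_{\bar v} + z D_u)\, \alpha(x, \lambda).$$
Similarly,
$$(D_v + \bar\lambda D_{\bar u})\, \alpha(x, \lambda)^\dagger = (D_u - \bar\lambda D_{\bar v})\, \alpha(x, \lambda)^\dagger = 0,$$
from which we deduce that, for all $z \neq -1/\bar\lambda$,
$$D_{\bar u}\, \alpha(x, \lambda)^\dagger = \frac{1}{1 + z\bar\lambda} (D_{\bar u} - z D_v)\, \alpha(x, \lambda)^\dagger, \qquad D_{\bar v}\, \alpha(x, \lambda)^\dagger = \frac{1}{1 + z\bar\lambda} (D_{\bar v} + z D_u)\, \alpha(x, \lambda)^\dagger.$$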
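Hence,
$$(D_{\bar u} - z D_v) \left[ \dot\Psi_\infty(x, z) \Psi_\infty(x, z)^{-1} - \dot\psi_\infty(x) \psi_\infty(x)^{-1} - \frac{\lambda}{\lambda - z}\, \alpha(x, \lambda) - \frac{1}{1 + z\bar\lambda}\, \alpha(x, \lambda)^\dagger \right] = 0,$$
and, similarly, $(D_{\bar v} + z D_u) [\dots] = 0$. It then follows that there exists a function $H_\infty(u - z\bar v, v + z\bar u, z)$ with the property that
$$\dot\Psi_\infty(x, z) \Psi_\infty(x, z)^{-1} - \dot\psi_\infty(x) \psi_\infty(x)^{-1} = \frac{\lambda}{\lambda - z}\, \alpha(x, \lambda) + \frac{1}{1 + z\bar\lambda}\, \alpha(x, \lambda)^\dagger - \Psi_\infty(x, z)\, H_\infty(z)\, \Psi_\infty(x, z)^{-1}. \tag{A.6}$$
Taking
$$H_\infty(x, z) = h_\infty(x, z) - \Psi_\infty(x, z)^{-1} \left[ \frac{\lambda}{\lambda - z}\, \alpha(x, z) + \frac{1}{1 + z\bar\lambda}\, \alpha(x, \sigma(z))^\dagger \right] \Psi_\infty(x, z)$$
cancels the poles in the first two terms on the right-hand side of (A.6), and yields Eq. (A.3). A similar argument for $\Psi_0(x, z)$ yields Eq. (A.4).

Lemma A.1.
$$\dot G(z) = T(z) G(z) + G(z) T^*(z) + h_\infty(z) G(x, z) + G(x, z) h_0(z).$$

Proof. Firstly,
$$\dot G(z) = \frac{\partial}{\partial s} \left[ \Psi_\infty(x, z)^{-1} \cdot \Psi_0(x, z) \right] = \Psi_\infty(x, z)^{-1} \left[ \dot\Psi_0(x, z) \cdot \Psi_0(x, z)^{-1} - \dot\Psi_\infty(x, z) \cdot \Psi_\infty(x, z)^{-1} \right] \Psi_0(x, z).$$
Now use Eqs. (A.1), (A.3), (A.4) and (A.5).

The left-hand side of (A.4) is analytic for $|z| < 1 + \epsilon$. Any singularities in this region that occur in the first two terms on the right-hand side must therefore be cancelled by corresponding singularities in the function $h_0$. It turns out that this consideration is enough to determine $h_0$ up to addition of a function of $(u - z\bar v, v + z\bar u, z)$ that is holomorphic on the region $|z| < 1 + \epsilon$. Similar remarks apply to $h_\infty$ and Eq. (A.3).

Proposition A.2. There exists a function $\rho_0(x, z) \equiv \rho_0(u - z\bar v, v + z\bar u, z)$, holomorphic for $|z| < 1 + \epsilon$, with the property that on the region $\frac{1}{1+\epsilon} < |z| < 1 + \epsilon$ we have
$$\dot\Psi_0(x, z) \Psi_0(x, z)^{-1} - \dot\psi_0(x) \psi_0(x)^{-1} = \frac{1}{\lambda - z} \left( z\, \alpha(x, \lambda) - \lambda\, \alpha(x, z) \right) - \frac{1}{1 + z\bar\lambda} \left( z\bar\lambda\, \alpha(x, \lambda)^\dagger + \alpha(x, \sigma(z))^\dagger \right) + \Psi_0(x, z)\, \rho_0(z)\, \Psi_0(x, z)^{-1}. \tag{A.7}$$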
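Proof. Rearranging Eq. (A.4) yields
$$h_0(z) = \chi_0(z)^{-1} \dot\chi_0(z) - \frac{z}{\lambda - z}\, \Psi_0(z)^{-1} \left( \alpha(x, \lambda) - \alpha(x, z) \right) \Psi_0(z) + \frac{z\bar\lambda}{1 + z\bar\lambda}\, \Psi_0(z)^{-1} \left( \alpha(x, \lambda)^\dagger - \alpha(x, \sigma(z))^\dagger \right) \Psi_0(z).$$
Since $\Psi_0$ is analytic for $|z| < 1 + \epsilon$ and the poles at $z = \lambda, \sigma(\lambda)$ have been cancelled, it follows that $h_0$ is analytic for $\frac{1}{1+\epsilon} < |z| < 1 + \epsilon$. We may therefore split $h_0(z) = h_0^{(0)}(z) + h_0^{(\infty)}(z)$, where $h_0^{(0)}$ is analytic for $|z| < 1 + \epsilon$ and $h_0^{(\infty)}$ is analytic for $|z| > \frac{1}{1+\epsilon}$. For $|z| > \frac{1}{1+\epsilon}$, we have
$$h_0^{(\infty)}(z) = -\frac{1}{2\pi i} \oint_{\gamma_-} \frac{h_0(w)}{w - z}\, dw,$$
where $\gamma_- = \{ w \in \mathbb{C} : |w| = \frac{1}{1+\epsilon'} \}$, where $\epsilon' < \epsilon$ is chosen such that $|z| > \frac{1}{1+\epsilon'}$. Using the fact that $\chi_0$ and $\Psi_0$ are analytic for $|z| < \frac{1}{1+\epsilon'}$, we find that
$$h_0^{(\infty)}(z) = \frac{1}{2\pi i} \oint_{\gamma_-} \frac{1}{w - z} \left[ \frac{w\bar\lambda}{1 + w\bar\lambda}\, T^*(w) - \frac{w}{\lambda - w}\, G(w)^{-1} T(w) G(w) \right] dw$$
for $|z| > \frac{1}{1+\epsilon}$. Differentiating under the integral sign, we find that
$$(\partial_{\bar u} - z \partial_v)\, h_0^{(\infty)}(z) = \partial_{\bar u} K(x), \qquad (\partial_{\bar v} + z \partial_u)\, h_0^{(\infty)}(z) = \partial_{\bar v} K(x),$$
where
$$K(x) := \frac{1}{2\pi i} \oint_{\gamma_-} \left[ \frac{1}{w - \sigma(\lambda)}\, T^*(w) + \frac{1}{w - \lambda}\, G(w)^{-1} T(w) G(w) \right] dw.$$
Note that this expression is independent of $z$. In order to construct the function $h_0$, we must find a function $h_0^{(0)}$, holomorphic (in $z$) for $|z| < 1 + \epsilon$, with the property that
$$(\partial_{\bar u} - z \partial_v)\, h_0^{(0)}(z) = -\partial_{\bar u} K(x), \qquad (\partial_{\bar v} + z \partial_u)\, h_0^{(0)}(z) = -\partial_{\bar v} K(x).$$
To construct such a function, we define the contour $\gamma_+ = \{ w \in \mathbb{C} : |w| = 1 + \epsilon \}$ and deduce that
$$K(x) = \frac{1}{2\pi i} \oint_{\gamma_+} \left[ \frac{1}{w - \sigma(\lambda)}\, T^*(w) + \frac{1}{w - \lambda}\, G(w)^{-1} T(w) G(w) \right] dw - T^*(\sigma(\lambda)) - G(\lambda)^{-1} T(\lambda) G(\lambda).$$
We then find that, for $|z| < 1 + \epsilon$,
$$-\partial_{\bar u} K(x) = -\frac{1}{2\pi i} \oint_{\gamma_+} \left[ \frac{1}{w - \sigma(\lambda)}\, \partial_{\bar u} T^*(w) + \frac{1}{w - \lambda}\, \partial_{\bar u} \left( G(w)^{-1} T(w) G(w) \right) \right] dw + \partial_{\bar u} T^*(\sigma(\lambda)) + \partial_{\bar u} \left( G(\lambda)^{-1} T(\lambda) G(\lambda) \right) = (\partial_{\bar u} - z \partial_v)\, \Phi(x, \lambda, z),$$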
where
$$\Phi(x, \lambda, z) := -\frac{1}{2\pi i} \oint_{\gamma_+} \frac{w}{w - z} \left[ \frac{1}{w - \sigma(\lambda)}\, T^*(w) + \frac{1}{w - \lambda}\, G(w)^{-1} T(w) G(w) \right] dw + \frac{\sigma(\lambda)}{\sigma(\lambda) - z}\, T^*(\sigma(\lambda)) + \frac{\lambda}{\lambda - z}\, G(\lambda)^{-1} T(\lambda) G(\lambda),$$
with a similar expression for $-\partial_{\bar v} K(x)$. Again cancelling the poles at $z = \lambda, \sigma(\lambda)$, we deduce that, for $|z| < 1 + \epsilon$, we may take
$$h_0^{(0)}(z) = \rho_0(z) + \frac{\lambda}{\lambda - z} \left[ G(\lambda)^{-1} T(\lambda) G(\lambda) - G(z)^{-1} T(z) G(z) \right] - \frac{1}{2\pi i} \oint_{\gamma_+} \frac{w}{w - z} \left[ \frac{1}{w - \sigma(\lambda)}\, T^*(w) + \frac{1}{w - \lambda}\, G(w)^{-1} T(w) G(w) \right] dw + \frac{\sigma(\lambda)}{\sigma(\lambda) - z} \left[ T^*(\sigma(\lambda)) - T^*(z) \right],$$
where $\rho_0 = \rho_0(u - z\bar v, v + z\bar u, z)$ is analytic for $|z| < 1 + \epsilon$. Finally, we note that, in the region $\frac{1}{1+\epsilon} < |z| < 1 + \epsilon$, we have
$$h_0(z) = h_0^{(0)}(z) + h_0^{(\infty)}(z) = \frac{z}{\lambda - z}\, G(z)^{-1} T(z) G(z) - \frac{z\bar\lambda}{1 + z\bar\lambda}\, T^*(z) + \rho_0(z). \tag{A.8}$$
Substituting this expression into (A.4) yields (A.7).

Theorem A.1. On the region $\frac{1}{1+\epsilon} < |z| < 1 + \epsilon$ we have
$$\dot G(z) = -T(z) G(z) - G(z) T^*(z) + \rho_\infty(z) G(x, z) + G(x, z) \rho_0(z).$$

Proof. The reality conditions for $\Psi_0$ and $\Psi_\infty$ imply that $h_\infty(z) = h_0^*(z)$. The result then follows from Lemma A.1 and Eq. (A.8).

Remark A.1. Since the functions $\rho_0$, $\rho_\infty$ are holomorphic in $(u - z\bar v, v + z\bar u, z)$ and analytic for $|z| < 1 + \epsilon$, $|z| > \frac{1}{1+\epsilon}$, respectively, they simply generate holomorphic changes of basis on these regions. As such, modulo holomorphic changes of basis, the symmetry (2.12) generates the flow
$$\dot G(z) = -T(z) G(z) - G(z) T^*(z)$$
for the patching matrix. Since $T$ is independent of $t$, the corresponding one-parameter group of transformations determined by $T$ with initial conditions the patching matrix $G_0(x, z)$ is of the form
$$G(t; x, z) = \exp\left(-t\, T(x, z)\right) G_0(x, z) \exp\left(-t\, T^*(x, z)\right).$$
In particular, we recover the group action constructed on heuristic grounds by Crane [9]: Given a map $h : X \times S^1 \to SL_2(\mathbb{C})$ that extends to a holomorphic map $\tilde h : X \times V_\epsilon \to SL_2(\mathbb{C})$ (where holomorphic means with respect to the complex structure of $X \times V_\epsilon$ as a subset of $\mathbb{CP}^3$), the group action on the patching matrix is of the form
$$G(x, z) \mapsto (h \cdot G)(x, z) := \tilde h(x, z)\, G(x, z)\, \tilde h^*(x, z).$$
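The exponentiated flow in Remark A.1 can be illustrated numerically at a single fixed point $(x, z)$. In the sketch below, which is only an illustration under stated assumptions, $T$ and $T^*$ are treated as constant $2 \times 2$ complex matrices (with $T^*$ modelled simply by the conjugate transpose of $T$), the matrix exponential is computed by a truncated power series, and the helper names (`matmul`, `expm`, `flow`, `flow_residual`) are our own. It checks by central differences that $G(t) = \exp(-tT)\, G_0 \exp(-tT^*)$ satisfies $\dot G = -TG - GT^*$.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def dagger(a):
    """Conjugate transpose, used here as a stand-in for T^* at a fixed point."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def expm(a, terms=40):
    """Matrix exponential by truncated power series (adequate for small norms)."""
    result = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    term = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    for n in range(1, terms):
        term = scale(1.0 / n, matmul(term, a))
        result = add(result, term)
    return result

def flow(t, T, Ts, G0):
    """G(t) = exp(-t T) G0 exp(-t T^*)."""
    return matmul(matmul(expm(scale(-t, T)), G0), expm(scale(-t, Ts)))

def flow_residual(t, T, Ts, G0, h=1e-5):
    """Largest entry of dG/dt + T G + G T^*, with dG/dt from central differences."""
    Gp, Gm, G = flow(t + h, T, Ts, G0), flow(t - h, T, Ts, G0), flow(t, T, Ts, G0)
    dG = scale(1.0 / (2.0 * h), add(Gp, scale(-1.0, Gm)))
    rhs = scale(-1.0, add(matmul(T, G), matmul(G, Ts)))
    return max(abs(dG[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

if __name__ == "__main__":
    T = [[0.3 + 0.1j, 0.5 + 0j], [-0.2j, -0.3 - 0.1j]]   # traceless sample
    G0 = [[2 + 0j, 1 + 0j], [1 + 0j, 1 + 0j]]            # Hermitian, det 1
    print(flow_residual(0.7, T, dagger(T), G0))
```

With a traceless $T$ and a Hermitian $G_0$ of unit determinant, the flowed matrix should remain Hermitian with unit determinant, mirroring the conditions of Proposition 3.1 at this single point.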
References

1. Arsenault, G., Jacques, M., Saint-Aubin, Y.: Collapse and exponentiation of infinite symmetry algebras of Euclidean projective and Grassmannian σ models. J. Math. Phys. 29, 1465–1471 (1988)
2. Atiyah, M.F.: Geometry on Yang–Mills fields. Scuola Normale Superiore Pisa, Pisa, 1979
3. Atiyah, M.F.: Instantons in two and four dimensions. Commun. Math. Phys. 93, 437–451 (1984)
4. Atiyah, M.F., Hitchin, N.J., Drinfel'd, V.G., Manin, Y.I.: Construction of instantons. Phys. Lett. A 65, 185–187 (1978)
5. Atiyah, M.F., Hitchin, N.J., Singer, I.M.: Self-duality in four-dimensional Riemannian geometry. Proc. Roy. Soc. London Ser. A 362, 425–461 (1978)
6. Chau, L.L., Ge, M.L., Sinha, A., Wu, Y.S.: Hidden-symmetry algebra for the self-dual Yang–Mills equation. Phys. Lett. B 121, 391–396 (1983)
7. Chau, L.L., Ge, M.L., Wu, Y.S.: Kac–Moody algebra in the self-dual Yang–Mills equation. Phys. Rev. D (3) 25, 1086–1094 (1982)
8. Chau, L.-L., Wu, Y.S.: More about hidden-symmetry algebra for the self-dual Yang–Mills system. Phys. Rev. D (3) 26, 3581–3592 (1982)
9. Crane, L.: Action of the loop group on the self-dual Yang–Mills equation. Commun. Math. Phys. 110, 391–414 (1987)
10. Dolan, L.: Kac–Moody algebra is hidden symmetry of chiral models. Phys. Rev. Lett. 47, 1371–1374 (1981)
11. Donaldson, S.K.: An application of gauge theory to four-dimensional topology. J. Diff. Geom. 18, 279–315 (1983)
12. Donaldson, S.K.: Instantons and geometric invariant theory. Commun. Math. Phys. 93, 453–460 (1984)
13. Donaldson, S.K., Kronheimer, P.B.: The Geometry of Four-Manifolds. Oxford Mathematical Monographs, New York: The Clarendon Press/Oxford University Press, 1990
14. Freed, D.S., Uhlenbeck, K.K.: Instantons and Four-Manifolds, Vol. 1 of Mathematical Sciences Research Institute Publications, New York: Springer-Verlag, Second ed., 1991
15. Grant, J.D.E.: Reducible connections and non-local symmetries of the self-dual Yang–Mills equations. Commun. Math. Phys. doi:10.1007/s00220-010-1025-8
16. Guest, M.A.: Harmonic Maps, Loop Groups, and Integrable Systems, Vol. 38 of London Mathematical Society Student Texts, Cambridge: Cambridge University Press, 1997
17. Guest, M.A., Ohnita, Y.: Group actions and deformations for harmonic maps. J. Math. Soc. Japan 45, 671–704 (1993)
18. Ivanova, T.A.: On infinite-dimensional algebras of symmetries of the self-dual Yang–Mills equations. J. Math. Phys. 39, 79–87 (1998)
19. Ivanova, T.A.: On infinitesimal symmetries of the self-dual Yang–Mills equations. J. Nonlinear Math. Phys. 5, 396–404 (1998)
20. Jacques, M., Saint-Aubin, Y.: Infinite-dimensional Lie algebras acting on the solution space of various σ models. J. Math. Phys. 28, 2463–2479 (1987)
21. Mason, L.J., Woodhouse, N.M.J.: Integrability, Self-duality, and Twistor Theory. Vol. 15 of London Mathematical Society Monographs. New Series, New York: The Clarendon Press/Oxford University Press, 1996
22. Nakamura, Y.: Nonlinear integrable flow on the framed moduli space of instantons. Lett. Math. Phys. 20, 135–140 (1990)
23. Okonek, C., Schneider, M., Spindler, H.: Vector Bundles on Complex Projective Spaces, Vol. 3 of Progress in Mathematics, Boston, M.A.: Birkhäuser, 1980
24. Park, Q.-H.: 2D sigma model approach to 4D instantons. Int. J. Mod. Phys. A 7, 1415–1447 (1992)
25. Popov, A.D.: Self-dual Yang–Mills: symmetries and moduli space. Rev. Math. Phys. 11, 1091–1149 (1999)
26. Popov, A.D., Preitschopf, C.R.: Extended conformal symmetries of the self-dual Yang–Mills equations. Phys. Lett. B 374, 71–79 (1996)
27. Takasaki, K.: A new approach to the self-dual Yang–Mills equations. Commun. Math. Phys. 94, 35–59 (1984)
28. Uhlenbeck, K.: Harmonic maps into Lie groups: classical solutions of the chiral model. J. Diff. Geom. 30, 1–50 (1989)
29. Uhlenbeck, K.K.: Removable singularities in Yang–Mills fields. Commun. Math. Phys. 83, 11–29 (1982)
30. Ward, R.S.: On self-dual gauge fields. Phys. Lett. A 61, 81–82 (1977)

Communicated by N.A. Nekrasov
Commun. Math. Phys. 296, 429–446 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1025-8
Communications in
Mathematical Physics
Reducible Connections and Non-local Symmetries of the Self-dual Yang–Mills Equations James D. E. Grant Fakultät für Mathematik, Universität Wien, Nordbergstrasse 15, 1090 Wien, Austria. E-mail:
[email protected] Received: 16 December 2008 / Accepted: 18 December 2009 Published online: 4 March 2010 – © Springer-Verlag 2010
To David E. Williams

Abstract: We construct the most general reducible connection that satisfies the self-dual Yang–Mills equations on a simply-connected, open subset of flat $\mathbb{R}^4$. We show how all such connections lie in the orbit of the flat connection on $\mathbb{R}^4$ under the action of non-local symmetries of the self-dual Yang–Mills equations. Such connections fit naturally inside a larger class of solutions to the self-dual Yang–Mills equations that are analogous to harmonic maps of finite type.

1. Introduction

Reducible connections play an important rôle in Donaldson’s study of four-manifold topology [11]. In particular, the singular ends of the moduli space of global solutions of the self-dual Yang–Mills equations on a four-manifold are due to the existence of connections on which the group of gauge transformations (modulo its centre) does not act freely. In the current paper, we study reducible connections from a different point of view, that of integrable systems theory.

In this paper and its companion [13] we investigate the non-local symmetry algebra of the self-dual Yang–Mills equations on $\mathbb{R}^4$ discussed in [7–9] and corresponding group actions on spaces of solutions of the self-dual Yang–Mills equations on open subsets of $\mathbb{R}^4$. In [13] we studied instanton moduli spaces, as explicitly described by the ADHM construction [1,2], and group actions that preserved the $L^2$ nature of the curvature of the connection. In the current work, we investigate reducible connections defined on a simply-connected, open subset of $\mathbb{R}^4$. Given that reducible and irreducible connections play different rôles in Donaldson’s work, our main motivation was to study whether such connections have different properties from an integrable systems point of view. This does, indeed, seem to be the case. In the case of instanton solutions on $\mathbb{R}^4$, it was argued in [13] that the symmetry group that acts on the moduli space has orbits of high codimension in the moduli space.
(In other words, the orbits are quite small.) Our conclusion for reducible connections, however, is quite different. After explicitly constructing the most general reducible self-dual
connection on a simply-connected, open subset of $\mathbb{R}^4$ (there are no reducible self-dual connections on $\mathbb{R}^4$ with $L^2$ curvature), we deduce that all reducible connections lie in the orbit of the flat connection on $\mathbb{R}^4$ under the action of the non-local symmetry group of the self-dual Yang–Mills equations. Also, the reducible connections lie within a larger class of solutions that arise quite naturally from the symmetries of the self-dual Yang–Mills equations. Solutions in this larger family are determined by a holomorphic function, $T$, defined on an open subset of $\mathbb{CP}^3$ that obeys the condition $[T, T^*] = 0$. Such formulae bear a strong resemblance to those arising in the theory of harmonic maps of finite type (see, e.g., [14, Chap. 24]). This result is distinct from the instanton case discussed in [13], which bears more of a resemblance to the theory of harmonic maps of finite uniton number [25].

The organisation of this paper is as follows. In the following section, we set up notation and recall the non-local symmetries of the self-dual Yang–Mills equations on $\mathbb{R}^4$ constructed in [7–9]. We also recall the main results of [10] concerning the twistorial interpretation of these symmetries in terms of their action on the patching matrix of holomorphic bundles over open subsets of $\mathbb{CP}^3$. In Sect. 3, we determine the most general reducible self-dual connection on a simply-connected, open subset of $\mathbb{R}^4$. We show that these may be constructed directly from harmonic functions. We then show that the patching matrix of such connections may be constructed directly. In Sect. 4, we deduce that all such patching matrices, and therefore all reducible self-dual connections, lie in the orbit of the flat connection on $\mathbb{R}^4$. In particular, we see there is a larger class of patching matrices that appear quite naturally from the group action of [10] that contains all reducible connections. Analogies between this larger class of solutions and harmonic maps of finite type are briefly investigated in Sect.
5. After some final remarks, in an Appendix we study some properties of a simplified version of the group action of [10].

As in the companion paper [13], we study only the non-local symmetries of the self-dual Yang–Mills equations constructed in [7–9], and not symmetries that require the existence of a non-trivial conformal group on our manifold. We also specialise to the case of $SU_2$ structure group, although the generalisation to any classical Lie group is straightforward.

2. Preliminaries

2.1. The self-dual Yang–Mills equations on $\mathbb{R}^4$. Let $U$ be a connected, simply-connected open subset of $\mathbb{R}^4$ with its flat metric. From the Cartesian coordinates $(t, x, y, z)$ on $\mathbb{R}^4$, we define complex coordinates
\[ u := t + ix, \qquad v := y - iz \]
on $\mathbb{R}^4 \cong \mathbb{C}^2$. In terms of these coordinates, the metric on $\mathbb{R}^4$ is
\[ g = \tfrac{1}{2}\left( du \otimes d\bar{u} + d\bar{u} \otimes du + dv \otimes d\bar{v} + d\bar{v} \otimes dv \right), \]
and the standard volume form is
\[ \epsilon = dt \wedge dx \wedge dy \wedge dz = \tfrac{1}{4}\, du \wedge d\bar{u} \wedge dv \wedge d\bar{v}. \]
In terms of these coordinates, a local basis for the bundle of anti-self-dual two-forms (with respect to the above metric and volume form) on $U$ is given by $\{ du \wedge dv,\ \tfrac{1}{2}(du \wedge d\bar{u} + dv \wedge d\bar{v}),\ d\bar{u} \wedge d\bar{v} \}$.
Let $\pi : P \to U$ be a principal $SU_2$ bundle over $U$. Since we are, essentially, working locally, a connection on $P$ may be represented by an $\mathfrak{su}_2$-valued one-form $A \in \Omega^1(U, \mathfrak{su}_2)$, with curvature $F \in \Omega^2(U, \mathfrak{su}_2)$. In terms of the complex coordinates above, the connection satisfies the self-dual Yang–Mills equations on $U$ if and only if
\[ F_{uv} = 0, \tag{2.1a} \]
\[ F_{u\bar{u}} + F_{v\bar{v}} = 0, \tag{2.1b} \]
\[ F_{\bar{u}\bar{v}} = 0. \tag{2.1c} \]
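Since $U$ is simply-connected, (2.1a) and (2.1c) imply the existence of maps $\psi_0, \psi_\infty : U \to SL_2(\mathbb{C})$ with the property that
\[ A_u = -(\partial_u \psi_\infty)\,\psi_\infty^{-1}, \qquad A_v = -(\partial_v \psi_\infty)\,\psi_\infty^{-1}, \tag{2.2a} \]
\[ A_{\bar{u}} = -(\partial_{\bar{u}} \psi_0)\,\psi_0^{-1}, \qquad A_{\bar{v}} = -(\partial_{\bar{v}} \psi_0)\,\psi_0^{-1}. \tag{2.2b} \]
The fields $\psi_0, \psi_\infty$ are determined by Eqs. (2.2) up to transformations
\[ \psi_0(x) \mapsto \widetilde{\psi}_0(x) := \psi_0(x) R(u, v), \qquad \psi_\infty(x) \mapsto \widetilde{\psi}_\infty(x) := \psi_\infty(x) S(\bar{u}, \bar{v}), \]
where $R$, $S$ are arbitrary analytic functions of $(u, v)$, $(\bar{u}, \bar{v})$ respectively. We may use this freedom to set, without loss of generality, $\psi_\infty(x) = \left(\psi_0(x)^{-1}\right)^\dagger$ for all $x \in U$. The remaining freedom in the choice of $\psi_0, \psi_\infty$ is then of the form
\[ \psi_0(x) \mapsto \widetilde{\psi}_0(x) := \psi_0(x) R(u, v), \qquad \psi_\infty(x) \mapsto \widetilde{\psi}_\infty(x) := \psi_\infty(x) \left(R(u, v)^{-1}\right)^\dagger. \tag{2.3} \]
In terms of these fields we define the Yang $J$-function, $J : U \to SL_2(\mathbb{C})$, by
\[ J(x) := \psi_\infty(x)^{-1} \cdot \psi_0(x), \qquad x \in U. \]
It follows from the reality properties of $\psi_0, \psi_\infty$ that $J$ is Hermitian,
\[ J(x) = J(x)^\dagger, \qquad x \in U, \]
and that, under the transformation (2.3), $J$ transforms according to the rule
\[ J(x) \mapsto \widetilde{J}(x) := R(u, v)^\dagger J(x) R(u, v). \tag{2.4} \]
Substituting into Eq. (2.1b), we find that the connection, $A$, satisfies the self-dual Yang–Mills equations if and only if $J$ satisfies the two (equivalent) versions of the Yang–Pohlmeyer equation
\[ \partial_u\!\left(J_{\bar{u}} J^{-1}\right) + \partial_v\!\left(J_{\bar{v}} J^{-1}\right) = 0, \tag{2.5a} \]
\[ \partial_{\bar{u}}\!\left(J^{-1} J_u\right) + \partial_{\bar{v}}\!\left(J^{-1} J_v\right) = 0. \tag{2.5b} \]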
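2.2. Associated linear problem. Let $\mathcal{V}$ be a non-empty, open subset of $\mathbb{CP}^1 := \mathbb{C} \cup \{\infty\}$, and consider the following overdetermined system of equations for a map $\Psi : U \times \mathcal{V} \to SL_2(\mathbb{C})$:
\[ (\partial_{\bar{v}} + z\partial_u)\,\Psi(x, z) = -(A_{\bar{v}} + z A_u)\,\Psi(x, z), \tag{2.6a} \]
\[ (\partial_{\bar{u}} - z\partial_v)\,\Psi(x, z) = -(A_{\bar{u}} - z A_v)\,\Psi(x, z), \tag{2.6b} \]
\[ \partial_{\bar{z}}\Psi(x, z) = 0. \tag{2.6c} \]
An important property of the self-dual Yang–Mills equations is that they are the integrability condition for this system. In particular, if the connection $A$ satisfies the self-dual Yang–Mills equations on $U$, then there exists $\epsilon > 0$ and a solution $\Psi_0 : U \times V_0^\epsilon \to SL_2(\mathbb{C})$ of this system that is analytic in $z$ for $z \in V_0^\epsilon$, where $V_0^\epsilon := \{ z \in \mathbb{CP}^1 : |z| < 1 + \epsilon \}$.

Notation. We define the involution $\sigma : \mathbb{CP}^1 \to \mathbb{CP}^1$ by $\sigma(z) = -1/\bar{z}$. Given a subset $\mathcal{V} \subset \mathbb{CP}^1$ and a map $g : \mathcal{V} \to SL_2(\mathbb{C})$, we define a corresponding map $g^* : \sigma(\mathcal{V}) \to SL_2(\mathbb{C})$ by $g^*(z) := (g(\sigma(z)))^\dagger$. Similarly, given any map $f : U \times \mathcal{V} \to SL_2(\mathbb{C})$, we define a corresponding map $f^* : U \times \sigma(\mathcal{V}) \to SL_2(\mathbb{C})$ by $f^*(x, z) := (f(x, \sigma(z)))^\dagger$.

Given the solution, $\Psi_0$, of (2.6), we may now construct another solution $\Psi_\infty : U \times V_\infty^\epsilon \to SL_2(\mathbb{C})$, where $V_\infty^\epsilon := \{ z \in \mathbb{CP}^1 : |z| > \tfrac{1}{1+\epsilon} \}$, by $\Psi_\infty(x, z) := \Psi_0^*(x, z)^{-1}$. The solution $\Psi_\infty$ is analytic in $z$ for $z \in V_\infty^\epsilon$.

Remark 2.1. Note that, for the construction of the connection in Eq. (2.2), we may take $\psi_0(x) := \Psi_0(x, 0)$ and $\psi_\infty(x) := \Psi_\infty(x, \infty)$. We will assume, from now on, that $\psi_0$ and $\psi_\infty$ are defined in this way.

Definition 2.1. The patching matrix (or clutching function, in Uhlenbeck’s terminology [25]), $G : U \times V^\epsilon \to SL_2(\mathbb{C})$, is defined by
\[ G(x, z) = \Psi_\infty(x, z)^{-1} \cdot \Psi_0(x, z), \tag{2.7} \]
where $V^\epsilon := V_0^\epsilon \cap V_\infty^\epsilon = \{ z \in \mathbb{CP}^1 : \tfrac{1}{1+\epsilon} < |z| < 1 + \epsilon \}$.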
Remark 2.2. Viewing U × V as a subset of π −1 (U ) ⊆ CP 3 , the patching matrix is the transition function of the holomorphic vector bundle over CP 3 corresponding to our self-dual connection A [3,27]. Since U × V0 and U × V∞ are open subsets of C3 , any holomorphic bundle over them is trivial. As such, the bundle over π −1 (U ) is completely determined by the patching matrix G (see, e.g., [10]). The fact that the patching matrix splits as in (2.7) implies that the bundle is trivial on restriction to a line π −1 (x), for each x ∈ U [27]. Since the patching matrix obeys the reality condition G(t, z) = G ∗ (t, z), the bundle admits a Hermitian structure, and the corresponding self-dual connection is an SU2 connection, rather than an SL2 (C) connection.
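The involution $\sigma(z) = -1/\bar{z}$ and the star operation $g^*(z) = g(\sigma(z))^\dagger$ used in the reality condition can be experimented with numerically. The following is only an illustrative sketch: the helper names (`sigma`, `dagger`, `star`, `sample_patching`) and the sample point are assumptions made here, and the sample matrix is the diagonal one, $\exp(\varphi\tau_3)$ with $\varphi = \tfrac{1}{z}(u - z\bar{v})(v + z\bar{u})$, that arises for the algebraically special connection of Sect. 3.

```python
import cmath

def sigma(z):
    """Antipodal involution sigma(z) = -1 / conj(z) (for finite, non-zero z)."""
    return -1 / z.conjugate()

def dagger(m):
    """Conjugate transpose of a 2x2 matrix stored as nested tuples."""
    return ((m[0][0].conjugate(), m[1][0].conjugate()),
            (m[0][1].conjugate(), m[1][1].conjugate()))

def star(g):
    """Return the map g* defined by g*(z) = g(sigma(z))^dagger."""
    return lambda z: dagger(g(sigma(z)))

def sample_patching(u, v, z):
    """Sample diagonal patching matrix exp(phi tau_3) (illustrative choice)."""
    phi = (u - z * v.conjugate()) * (v + z * u.conjugate()) / z
    return ((cmath.exp(phi), 0j), (0j, cmath.exp(-phi)))

def max_diff(m, n):
    """Largest entrywise distance between two 2x2 matrices."""
    return max(abs(m[i][j] - n[i][j]) for i in range(2) for j in range(2))

u, v = 0.3 + 0.2j, -0.1 + 0.4j          # arbitrary sample point of C^2
z = 1.05 * cmath.exp(0.7j)              # a point of the annulus around |z| = 1

G = lambda w: sample_patching(u, v, w)
reality_defect = max_diff(G(z), star(G)(z))    # measures G - G*
involution_defect = abs(sigma(sigma(z)) - z)   # sigma is an involution
```

Both defects should vanish up to rounding error, reflecting $\sigma \circ \sigma = \mathrm{id}$ and the reality condition $G = G^*$ for this particular sample.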
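2.3. Non-local symmetries. In order to study symmetries of the self-dual Yang–Mills equations, we let $J(\cdot, s) : U_s \to SL_2(\mathbb{C})$ be a one-parameter family of solutions of the Yang–Pohlmeyer equations (2.5). Here, $s \in I$ with $I$ an open interval in $\mathbb{R}$ containing the origin, $J$ is assumed to depend in a $C^1$ fashion on the parameter $s$, and $U_s \subseteq \mathbb{R}^4$ is the open subset of $\mathbb{R}^4$ on which the solution is well defined (i.e. non-singular).^1 Taking the derivative with respect to $s$ of (2.5), we find that $J(\cdot, s)$ must satisfy the linearised equation
\[ \partial_u\!\left( J\, \partial_{\bar{u}}\!\left(J^{-1}\dot{J}\right) J^{-1} \right) + \partial_v\!\left( J\, \partial_{\bar{v}}\!\left(J^{-1}\dot{J}\right) J^{-1} \right) = 0, \tag{2.8} \]
where $\dot{J} := \partial J / \partial s$.

1 We will often notationally suppress the dependence of the domain $U$ on $s$.

It is known that the only local symmetries of the self-dual Yang–Mills equations on flat $\mathbb{R}^4$ are gauge transformations and those generated by the action of the conformal group (see, e.g., [22]). However, there exists a non-trivial family of non-local symmetries of the self-dual Yang–Mills equations [7–9], defined as follows. We define maps $\chi_0 : U \times V_0^\epsilon \to SL_2(\mathbb{C})$, $\chi_\infty : U \times V_\infty^\epsilon \to SL_2(\mathbb{C})$ by the relations
\[ \chi_0(x, z) := \psi_0(x)^{-1} \cdot \Psi_0(x, z), \qquad (x, z) \in U \times V_0^\epsilon, \]
\[ \chi_\infty(x, z) := \psi_\infty(x)^{-1} \cdot \Psi_\infty(x, z), \qquad (x, z) \in U \times V_\infty^\epsilon. \]
$\chi_0$ is analytic in $z$ for all $z \in V_0^\epsilon$, with $\chi_0(x, 0) = \mathrm{Id}$, for all $x \in U$, and is a solution of the system
\[ \left( \partial_{\bar{v}} - J_{\bar{v}} J^{-1} + z\partial_u \right) \chi_0(x, z) = 0, \]
\[ \left( \partial_{\bar{u}} - J_{\bar{u}} J^{-1} - z\partial_v \right) \chi_0(x, z) = 0, \]
for all $(x, z) \in U \times V_0^\epsilon$. Similarly, $\chi_\infty$ is analytic in $z$ for all $z \in V_\infty^\epsilon$, with $\chi_\infty(x, \infty) = \mathrm{Id}$, for all $x \in U$. Note that we have $\chi_\infty(x, \lambda) = (\chi_0(x, \sigma(\lambda)))^{-\dagger}$, for all $(x, \lambda) \in U \times V_\infty^\epsilon$. Based on the work of [7–9], we have the following result from [10]:

Proposition 2.1. Let $T : U \times V^\epsilon \to \mathfrak{sl}_2(\mathbb{C})$ obey the relations $(\partial_{\bar{v}} + \lambda\partial_u)\, T(x, \lambda) = (\partial_{\bar{u}} - \lambda\partial_v)\, T(x, \lambda) = 0$, and be analytic in $\lambda$ on a neighbourhood, $V^\epsilon$, of the unit circle $S^1 \subset \mathbb{C}$. Then
\[ \begin{aligned} \dot{J}(x, s) &= \chi_\infty(x, \lambda) T(x, \lambda) \chi_\infty(x, \lambda)^{-1} \cdot J + J \cdot \chi_0(x, \sigma(\lambda)) T(x, \lambda)^\dagger \chi_0(x, \sigma(\lambda))^{-1} \\ &= \psi_\infty(x)^{-1} \left[ \Psi_\infty(x, \lambda) T(x, \lambda) \Psi_\infty(x, \lambda)^{-1} + \Psi_0(x, \sigma(\lambda)) T(x, \lambda)^\dagger \Psi_0(x, \sigma(\lambda))^{-1} \right] \psi_0(x) \end{aligned} \tag{2.9} \]
is a solution of the linearisation equation (2.8), for all $x \in U$, and all $\lambda \in V^\epsilon$.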
Remark 2.3. In the case where the function $T$ is independent of $(u, v)$, it defines an element of the loop group $\Lambda SL_2(\mathbb{C})$ with a holomorphic extension to an open neighbourhood of $S^1$ in $\mathbb{C}^*$. The algebra of symmetries generated by such $T$ is then isomorphic to the Kac-Moody algebra of $\mathfrak{sl}_2(\mathbb{C})$ [7–9].

The natural question is how to exponentiate the above algebra into a group action on the space of solutions of the self-dual Yang–Mills equations. A solution to this problem is given by the following result.

Theorem 2.1. [10, Chap. IV.C]. Let $g : X \times S^1 \to SL_2(\mathbb{C})$ be a smooth map that admits a continuous extension to a holomorphic map $g : X \times V^\epsilon \subset \mathbb{CP}^3 \to SL_2(\mathbb{C})$, for some $\epsilon > 0$. Then the action of $g$ on the patching matrix, $G(x, z)$, is defined by
\[ G(x, z) \mapsto (g \cdot G)(x, z) := g(x, z) \cdot G(x, z) \cdot g^*(x, z). \tag{2.10} \]
This equation defines an action of the set of such maps $g$ on the space of self-dual connections on $X$. If $g$ extends holomorphically to $z \in V_0^\epsilon$, then the corresponding transformation is a holomorphic change of basis on the bundle over $\pi^{-1}(X)$, which leaves the self-dual connection, $A$, unchanged.

Remark 2.4. The above group action on the solution space has been given a cohomological description by Park (see [20] and references therein), which has been further investigated by Popov and Ivanova (see [17,18,22] and references therein).

Remark 2.5. The group action (2.10) is slightly unusual since, in integrable systems theory, it is usually adjoint or coadjoint orbits of Lie groups that turn out to be relevant. If we consider the case where $G$ and $g$ are constant, and study the action of $SL_2(\mathbb{C})$ on itself by $(g, G) \mapsto g \cdot G := g G g^\dagger$, then the generic orbits of this action are five-dimensional. As such, the orbits do not carry the invariant symplectic structures that one would associate with, for example, coadjoint orbits. A brief investigation of this orbit structure is given in Appendix A.

The connection between Theorem 2.1 and the transformation (2.9) is given by the following:

Theorem 2.2. Given $T : U \times V^\epsilon \to \mathfrak{sl}_2(\mathbb{C})$ as in Proposition 2.1, the corresponding flow on the space of patching matrices is given by
\[ \dot{G}(x, z) = -T(x, z) G(x, z) - G(x, z) T^*(x, z) + \rho_\infty(x, z) G(x, z) + G(x, z) \rho_0(x, z) \tag{2.11} \]
for $(x, z) \in \mathbb{R}^4 \times V^\epsilon$. In this equation, $\rho_0 : \mathbb{R}^4 \times V_0^\epsilon \to \mathfrak{sl}_2(\mathbb{C})$ and $\rho_\infty : \mathbb{R}^4 \times V_\infty^\epsilon \to \mathfrak{sl}_2(\mathbb{C})$ are analytic functions of $z$ on the respective regions and satisfy
\[ (\partial_{\bar{v}} + z\partial_u)\, \rho_0(x, z) = (\partial_{\bar{u}} - z\partial_v)\, \rho_0(x, z) = 0, \]
\[ (\partial_{\bar{v}} + z\partial_u)\, \rho_\infty(x, z) = (\partial_{\bar{u}} - z\partial_v)\, \rho_\infty(x, z) = 0. \]

Remark 2.6. It follows from a similar argument to that given in the proof of Proposition 1 (b) of [10] that the terms $\rho_0$ and $\rho_\infty$ in the above formula may be absorbed into a change of holomorphic frame on the sets $z \in V_0^\epsilon$ and $V_\infty^\epsilon$, respectively.

Remark 2.7.
The group action (2.10) was derived in [10], arguing by analogy with the action of dressing transformations in harmonic map theory. It has been investigated further in, for example, [17,18,22]. The first direct derivation of the infinitesimal result (2.11) from the generator (2.9), to my knowledge, appears in [13].
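3. Reducible Connections

Recall [12, Chap. 3] that a connection, $D$, on an $SU_2$ bundle $\pi : E \to U$ is reducible if the group of gauge transformations $\mathcal{G} := C^\infty(U, SU_2)$ modulo its centre does not act freely on the connection $D$. We now proceed to derive the most general form of a reducible connection on a simply-connected, open subset of $\mathbb{R}^4$. In doing so, we make extensive use of the classical Pauli matrices, which we define as follows:
\[ \tau \equiv (\tau_1, \tau_2, \tau_3) = \left( \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \right). \]

Proposition 3.1. Let $U$ be a simply-connected, open subset of $\mathbb{R}^4$, and $a \in C^\infty(U, \mathbb{R})$ a harmonic function. We define the connection
\[ A = \tfrac{1}{2}\left( \partial a - \bar{\partial} a \right) \tau_3 \in \Omega^1(U, \mathfrak{su}_2). \tag{3.1} \]
Then the connection, $A$, is reducible and satisfies the self-dual Yang–Mills equations on $U$. Conversely, up to gauge transformation, all reducible self-dual connections on $U$ arise in this way.

Proof. A connection is reducible if and only if there exists a parallel section, $\eta$, of the adjoint bundle $\mathrm{ad}\, \mathfrak{su}_2$ [12, Theorem 3.1]. In terms of local coordinates $(u, v) \in \mathbb{C}^2 \cong \mathbb{R}^4$, and relative to a local trivialisation of the bundle $E$, this condition implies that
\[ \frac{\partial \eta}{\partial x^a} + [A_a, \eta] = 0. \]
It follows from Eqs. (2.2) that, for a parallel section $\eta$, the map $\mathcal{A} : \mathbb{C}^2 \to \mathfrak{sl}_2(\mathbb{C})$ defined by the equation $\eta = \psi_0(x) \mathcal{A}(u, v) \psi_0(x)^{-1}$ is holomorphic. In a similar fashion, using (2.2) and the relationship between $\psi_0$ and $\psi_\infty$, one may verify that $\eta = -\psi_\infty(x) \mathcal{A}(u, v)^\dagger \psi_\infty(x)^{-1}$. These relations imply that
\[ J \mathcal{A}(u, v) + \mathcal{A}(u, v)^\dagger J = 0. \tag{3.2} \]
Note that this equation and the fact that $\det J \neq 0$ imply that $\det \mathcal{A}(u, v) = \left(\det \mathcal{A}(u, v)\right)^*$. Therefore $\det \mathcal{A}(u, v)$ is real. Since $\mathcal{A}$ is holomorphic in $u$ and $v$, it follows that $\det \mathcal{A}$ is a real constant.

Let $\mathcal{A} = (\mathbf{R} + i\mathbf{I}) \cdot \tau$, with $\mathbf{R}, \mathbf{I} : U \to \mathbb{R}^3$. The fact that $\mathcal{A}$ depends holomorphically on $(u, v)$ implies that
\[ \partial_t \mathbf{R} = \partial_x \mathbf{I}, \qquad \partial_x \mathbf{R} = -\partial_t \mathbf{I}, \qquad \partial_y \mathbf{R} = -\partial_z \mathbf{I}, \qquad \partial_z \mathbf{R} = \partial_y \mathbf{I}. \tag{3.3} \]
Moreover, we find that $\det \mathcal{A} = |\mathbf{I}|^2 - |\mathbf{R}|^2 - 2i \langle \mathbf{R}, \mathbf{I} \rangle$. Since $\det \mathcal{A}$ is real, we deduce that $\langle \mathbf{R}, \mathbf{I} \rangle = 0$. We now let $J = \Lambda\, \mathrm{Id} + \lambda \cdot \tau$, with $\Lambda : U \to \mathbb{R}$ and $\lambda : U \to \mathbb{R}^3$; the condition that $\det J = 1$ implies that we require $\Lambda^2 = 1 + |\lambda|^2$. Imposing (3.2), we find that we require
\[ \Lambda \mathbf{R} = \lambda \times \mathbf{I}. \tag{3.4} \]
This equation implies that
\[ \Lambda^2 |\mathbf{R}|^2 = |\lambda \times \mathbf{I}|^2 = |\lambda|^2 |\mathbf{I}|^2 - \langle \mathbf{I}, \lambda \rangle^2. \]
Therefore
\[ \det \mathcal{A} = |\mathbf{I}|^2 - |\mathbf{R}|^2 = \frac{1}{|\lambda|^2}\left[ \Lambda^2 |\mathbf{R}|^2 + \langle \mathbf{I}, \lambda \rangle^2 \right] - |\mathbf{R}|^2 = \frac{1}{|\lambda|^2}\left[ |\mathbf{R}|^2 + \langle \mathbf{I}, \lambda \rangle^2 \right] \geq 0. \]
For $\lambda \neq 0$, equality occurs in this inequality if and only if $\mathbf{R} = 0$ and $\langle \mathbf{I}, \lambda \rangle = 0$. From (3.4), it follows that, in this case, $\mathbf{R} = \mathbf{I} = 0$, so $\mathcal{A} = 0$. Moreover, if $\lambda = 0$, then (3.2) implies that $\mathbf{R} = 0$, so $\det \mathcal{A} = |\mathbf{I}|^2$, which is, again, strictly positive unless $\mathbf{I} = 0$ and, hence, $\mathcal{A} = 0$.

To summarise, the fact that $\det \mathcal{A}$ is real, along with (3.2), implies that $\det \mathcal{A}$ is a non-negative constant. Moreover, $\det \mathcal{A} = 0$ if and only if $\mathcal{A} = 0$. Since we are assuming $\mathcal{A} \neq 0$ and since any constant multiple of a parallel section is also parallel we may, without loss of generality, assume that $\det \mathcal{A} = 1$. In this case, it follows that the eigenvalues of $\mathcal{A}$ are $\pm i$.

Note that we still have the freedom to rotate the $\psi$’s, as given in (2.3). It follows that $\mathcal{A}$ transforms under the adjoint action of $R^{-1}$:
\[ \mathcal{A}(u, v) \mapsto \widetilde{\mathcal{A}}(u, v) := R(u, v)^{-1} \mathcal{A}(u, v) R(u, v) = \mathrm{Ad}_{R^{-1}} \mathcal{A}. \tag{3.5} \]
We now write $\mathcal{A}$ in the form
\[ \mathcal{A}(u, v) = \begin{pmatrix} a(u, v) & b(u, v) \\ c(u, v) & -a(u, v) \end{pmatrix}, \]
where $a, b, c$ are holomorphic functions of $(u, v)$. On a neighbourhood of any point $(u, v)$ at which $a(u, v) \neq -i$, the holomorphic change of frame
\[ R_+(u, v) = \begin{pmatrix} a + i & -b \\ c & a + i \end{pmatrix} \]
has the property that $R_+(u, v)^{-1} \mathcal{A}(u, v) R_+(u, v) = i\tau_3$. Similarly, for $a(u, v) \neq +i$,
\[ R_-(u, v) = \begin{pmatrix} -b & a - i \\ a - i & c \end{pmatrix} \]
gives a holomorphic change of frame with the property that $R_-(u, v)^{-1} \mathcal{A}(u, v) R_-(u, v) = i\tau_3$. As such, given any point $p \in X$, there exists a neighbourhood of $p$ and a holomorphic frame such that $\mathcal{A} = i\tau_3$ in that frame.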
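The algebra in this part of the proof can be sanity-checked at a single point with plain complex arithmetic. The sketch below rests on assumptions made here for illustration: the identity coefficient of $J$ is called `Lam` (so that $\det J = 1$ reads `Lam**2 == 1 + |lam|**2`), relation (3.4) is read as `Lam * R = lam x I`, the frame $R_+$ is taken with columns $(a+i, c)$ and $(-b, a+i)$, and all numerical values are arbitrary samples.

```python
import math

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pauli(s, w):
    """s*Id + w.tau as a 2x2 nested list; entries of w may be complex."""
    return [[s + w[2], w[0] - 1j * w[1]], [w[0] + 1j * w[1], s - w[2]]]

def mul(m, n):
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(m):
    return [[complex(m[j][i]).conjugate() for j in range(2)] for i in range(2)]

def max_abs_diff(m, n):
    return max(abs(m[i][j] - n[i][j]) for i in range(2) for j in range(2))

# Part 1: relation (3.2) and the determinant formula.
lam, I = [0.4, -1.1, 0.7], [0.3, 0.5, -0.2]     # arbitrary sample vectors
Lam = math.sqrt(1 + dot(lam, lam))              # enforces det J = 1
R = [c / Lam for c in cross(lam, I)]            # Lam * R = lam x I
J = pauli(Lam, lam)
A = pauli(0, [r + 1j * i for r, i in zip(R, I)])
JA, AdJ = mul(J, A), mul(dagger(A), J)
defect_32 = max(abs(JA[i][j] + AdJ[i][j]) for i in range(2) for j in range(2))
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
det_formula = (dot(R, R) + dot(I, lam) ** 2) / dot(lam, lam)

# Part 2: the frames R_+ and R_- conjugate a unit-determinant A to i*tau_3.
a, b = 0.2 + 0.5j, 1.3 - 0.4j
c = -(1 + a * a) / b                            # enforces -a^2 - b*c = 1
M = [[a, b], [c, -a]]
i_tau3 = [[1j, 0], [0, -1j]]
R_plus = [[a + 1j, -b], [c, a + 1j]]
R_minus = [[-b, a - 1j], [a - 1j, c]]
defect_plus = max_abs_diff(mul(M, R_plus), mul(R_plus, i_tau3))
defect_minus = max_abs_diff(mul(M, R_minus), mul(R_minus, i_tau3))
```

The check compares $\mathcal{A} R_\pm$ with $R_\pm\, i\tau_3$ rather than inverting $R_\pm$, which avoids dividing by $\det R_\pm = 2i(a \pm i)$.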
From (3.2), it follows that there exist real functions $\alpha, \beta$ such that $J = \alpha\, \mathrm{Id} + \beta \tau_3$. Since $\det J = 1$, we have $\alpha^2 = 1 + \beta^2$. Since $J$ is continuous, $\alpha$ will have constant sign, so we assume that $\alpha > 0$. Therefore, since $U$ is assumed simply-connected, we may consistently define a real-valued function $a$ with the property that $\alpha = \cosh a$, $\beta = \sinh a$. It then follows that
\[ J = \exp(a \tau_3). \tag{3.6} \]
We therefore have
\[ J^{-1} J_u = a_u \tau_3, \qquad J^{-1} J_v = a_v \tau_3. \]
Imposing the Yang–Pohlmeyer equation implies that $a$ is harmonic: $(\partial_u \partial_{\bar{u}} + \partial_v \partial_{\bar{v}})\, a = 0$. From (3.6), we see that, up to a gauge transformation, we may take
\[ \psi_0 = \exp\left( \frac{a}{2} \tau_3 \right), \qquad \psi_\infty = \exp\left( -\frac{a}{2} \tau_3 \right). \]
The form of the connection given in Eq. (3.1) then follows from Eq. (2.2). The parallel section, $\eta$, is equal to $i\tau_3$.

Example 1. The case
\[ J(x) = \exp\left( \left( |u|^2 - |v|^2 \right) \tau_3 \right) \]
corresponds to $a = |u|^2 - |v|^2$ and, therefore, defines a reducible connection. In this case, the connection is non-singular on $\mathbb{R}^4$. However, the curvature is not $L^2$, and therefore the connection cannot be extended to $S^4$ [26]. In this particular case the connection is algebraically special, in the sense that, in addition to satisfying the self-dual Yang–Mills equations (2.1), the curvature satisfies
\[ F_{u\bar{v}} = 0, \qquad F_{v\bar{u}} = 0. \]
It can be shown that all algebraically special connections arise in this way, and are thus reducible.

Recall [16] that there is a 1–1 correspondence between harmonic functions on $U \subseteq \mathbb{R}^4$ and sheaf cohomology classes in $H^1(\widehat{U}, \mathcal{O}(-2))$, where $\widehat{U} := \pi^{-1}(U) \subseteq \mathbb{CP}^3$. In the present case, such a cohomology class may be represented by a holomorphic function^2 $f : U \times \mathbb{C}^* \to \mathbb{C}$. In terms of homogeneous coordinates on $\mathbb{CP}^3$, we have $f(\lambda z) = \lambda^{-2} f(z)$. The corresponding harmonic function on $U \subseteq \mathbb{R}^4$ is then given by the contour integral
\[ a(x) = \frac{1}{2\pi i} \oint_\gamma f(u - w\bar{v},\, v + w\bar{u},\, w)\, dw, \tag{3.7} \]
where $\gamma := \{ w \in \mathbb{C} \subset \mathbb{CP}^1 : |w| = 1 \}$.

2 By holomorphic, we mean with respect to the complex structure induced on $U \times \mathbb{C}^*$ as a subset of $\mathbb{CP}^3$.
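Proposition 3.2. Given the connection as in Proposition 3.1, the patching matrix for the holomorphic bundle on $\widehat{U} := \pi^{-1}(U)$ may be taken as
\[ G(x, z) = \exp\left( \tfrac{1}{2}\left[ F(x, z) + F^*(x, z) \right] \tau_3 \right), \]
where
\[ F(x, z) := \frac{1}{2\pi i} \oint_\gamma \frac{w + z}{w - z}\, f(u - w\bar{v},\, v + w\bar{u},\, w)\, dw, \tag{3.8} \]
and the holomorphic function $f$ is a representative of the cohomology class in $H^1(\widehat{U}, \mathcal{O}(-2))$ corresponding to the harmonic function $a$.

Proof. We assume that there exists $\Psi_0(x, z)$ of the form $\exp\left( \tfrac{1}{2} F(x, z) \tau_3 \right)$, with $F(x, 0) = a(x)$. From (2.6) for $\Psi_0$ and the explicit form of the connection, we deduce that $F$ must be analytic in $z$ and satisfy the relations
\[ (\partial_{\bar{u}} - z\partial_v)\, F(x, z) = (\partial_{\bar{u}} + z\partial_v)\, a(x), \qquad (\partial_{\bar{v}} + z\partial_u)\, F(x, z) = (\partial_{\bar{v}} - z\partial_u)\, a(x). \]
From (3.7) we then calculate
\[ \begin{aligned} (\partial_{\bar{u}} + z\partial_v)\, a(x) &= (\partial_{\bar{u}} + z\partial_v)\, \frac{1}{2\pi i} \oint_\gamma f(u - w\bar{v},\, v + w\bar{u},\, w)\, dw \\ &= \frac{1}{2\pi i} \oint_\gamma (w + z)\, (\partial_2 f)(u - w\bar{v},\, v + w\bar{u},\, w)\, dw \\ &= \frac{1}{2\pi i} \oint_\gamma \frac{w + z}{w - z}\, (w - z)\, (\partial_2 f)(u - w\bar{v},\, v + w\bar{u},\, w)\, dw \\ &= \frac{1}{2\pi i} \oint_\gamma \frac{w + z}{w - z}\, (\partial_{\bar{u}} - z\partial_v)\, f(u - w\bar{v},\, v + w\bar{u},\, w)\, dw \\ &= (\partial_{\bar{u}} - z\partial_v)\, \frac{1}{2\pi i} \oint_\gamma \frac{w + z}{w - z}\, f(u - w\bar{v},\, v + w\bar{u},\, w)\, dw. \end{aligned} \]
Along with a similar calculation for $(\partial_{\bar{v}} - z\partial_u)\, a(x)$, we have the above candidate for $F$ given in (3.8). By its definition as a contour integral, it follows that $F$ is analytic in $z$, for $z$ in the interior of $\gamma$. Since the function $\frac{w + z}{w - z} f(u - w\bar{v},\, v + w\bar{u},\, w)$ is continuous for $w \in \gamma$, and $\gamma$ is compact, the Dominated Convergence Theorem implies that $\lim_{z \to 0} F(x, z) = a(x)$. As such, $F$ has all of the required properties.

Remark 3.1. Note that the patching matrix in Proposition 3.2 takes the form $G(x, z) = \exp(\varphi(x, z) \tau_3)$, where $\varphi$ is a holomorphic function of $(u - z\bar{v},\, v + z\bar{u},\, z)$ that satisfies $\varphi^*(x, z) = \varphi(x, z)$.

Example 2. In the case of our algebraically special connection, where $a = |u|^2 - |v|^2$, we may take $f(x, z) = \frac{1}{z^2}(u - z\bar{v})(v + z\bar{u})$. We then find that $F(x, z) = |u|^2 - |v|^2 - 2z\bar{u}\bar{v}$ and
\[ G(x, z) = \exp\left( \frac{1}{z}(u - z\bar{v})(v + z\bar{u})\, \tau_3 \right). \]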
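The contour integrals (3.7) and (3.8) can be approximated numerically for the representative $f$ of Example 2, which gives a quick check of the closed forms. The sketch below is illustrative only: the function names, the sample point and the number of quadrature nodes are choices made here, and the value used for $F$ is the one obtained by evaluating the residues at $w = 0$ and $w = z$, namely $|u|^2 - |v|^2 - 2z\bar{u}\bar{v}$.

```python
import cmath
import math

def f(u, v, w):
    """Representative from Example 2: (u - w conj(v)) (v + w conj(u)) / w^2."""
    return (u - w * v.conjugate()) * (v + w * u.conjugate()) / w**2

def contour(g, n=512):
    """Approximate (1 / (2 pi i)) * integral of g(w) dw over |w| = 1.

    With w = exp(i theta), dw = i w dtheta, so the integral becomes the
    average of g(w) * w over equally spaced nodes (periodic trapezoid rule).
    """
    total = 0j
    for k in range(n):
        w = cmath.exp(2j * math.pi * k / n)
        total += g(w) * w
    return total / n

u, v = 0.3 + 0.2j, -0.1 + 0.4j      # arbitrary sample point of C^2
z = 0.25 - 0.1j                     # sample spectral parameter inside gamma

a_num = contour(lambda w: f(u, v, w))
F_num = contour(lambda w: (w + z) / (w - z) * f(u, v, w))

a_exact = abs(u)**2 - abs(v)**2
F_exact = a_exact - 2 * z * u.conjugate() * v.conjugate()
```

Because the integrands are analytic in an annulus around the unit circle, the equally spaced rule converges geometrically, so a few hundred nodes are far more than needed at this sample point.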
4. Orbit of the Flat Connection

Let $T : U \times S^1 \to \mathfrak{sl}_2(\mathbb{C})$ be analytic on a neighbourhood of $U \times S^1$ in $U \times \mathbb{C}^* \subset \mathbb{CP}^3$, with the property that
\[ [T, T^*] = 0. \tag{4.1} \]
Given a solution of the self-dual Yang–Mills equations, described by patching matrix $G(x, z)$, we may consider the one-parameter family of connections generated by $T$ via the flow (2.11). Since the functions $\rho_0, \rho_\infty$ can be removed by a holomorphic change of basis on the regions $U \times V_0^\epsilon$ and $U \times V_\infty^\epsilon$ we may, without loss of generality, fix the holomorphic bases by setting $\rho_0 = \rho_\infty = 0$. The unique solution of (2.11) with initial conditions $G(x, z)$ is then
\[ G_t(x, z) = \exp\left( -t\, T(x, z) \right) G(x, z) \exp\left( -t\, T^*(x, z) \right). \]
In general, one would perform a Birkhoff splitting of $G_t$, which would yield connections $A_t$ generated from the connection, $A$, corresponding to the patching matrix $G$. (Generally, there will be jumping points at which $G$ does not admit a splitting of the form (2.7), so we will need to shrink the set $U$ accordingly. The set of such points will, generically, be of strictly positive codimension in $U$.)

A case of particular interest to us is when the initial connection is flat, in which case we may take $G(x, z) = 1$. We then have
\[ G_t(x, z) = \exp\left( -t \left[ T(x, z) + T^*(x, z) \right] \right) \]
as the patching matrix of connections that lie in this orbit of the flat connection. As a special case of this construction, let $T = -\tfrac{1}{2} \varphi(x, z) \tau_3$, where $\varphi : U \times V^\epsilon \to \mathbb{C}$ satisfies $(\partial_{\bar{u}} - z\partial_v)\, \varphi = (\partial_{\bar{v}} + z\partial_u)\, \varphi = 0$ and is analytic in $z \in V^\epsilon$ for some $\epsilon > 0$. Then, assuming that $\epsilon$ is chosen sufficiently small that $G(x, z)$ is analytic on $U \times V^\epsilon$, we deduce that
\[ G_t(x, z) = \exp\left( t \varphi(x, z) \tau_3 \right), \qquad (x, z) \in U \times V^\epsilon. \]
In light of this construction, and the classification of patching matrices arising from reducible connections in the previous section, we deduce: Theorem 4.1. Let A be a reducible connection on an open subset U ⊂ R4 . Then A lies on the orbit of the flat connection on U under the action of the non-local symmetry group of the self-dual Yang–Mills equations. Remark 4.1. As mentioned earlier, the set U on which the self-dual connection is defined will generally shrink under the action of the symmetry group. In the case of reducible connections, where one may start from the flat connection on R4 , then there do exist reducible connections defined on the whole of R4 . (Our algebraically special connection is an example of such.) Since there are no non-trivial reducible connections on S 4 , however, Uhlenbeck’s theorem [26] implies that the curvature of such a connection cannot be L 2 . (This property may also be checked directly from the explicit form of the connection.)
Remark 4.2. Takasaki [24] has argued that, if we drop the reality conditions on self-dual connections and consider $SL_2(\mathbb{C})$ connections rather than $SU_2$ ones, then the group action generated by transformations of the form
\[ \dot{J}(x, s) = \chi_\infty(x, \lambda) T(x, \lambda) \chi_\infty(x, \lambda)^{-1} \cdot J \tag{4.2} \]
is transitive on the space of local solutions of the $SL_2(\mathbb{C})$ self-dual Yang–Mills equations. In the current context, we are explicitly restricting ourselves (via the form of Eq. (2.9)) to transformations that preserve the $SU_2$ nature of the connection, in which case there is no reason to believe that the group action should be transitive. In an analogous situation in the theory of harmonic maps into Lie groups, one can show that transformations analogous to (4.2) map real extended harmonic maps to real harmonic maps if and only if the action is trivial [4, Prop. 3.4] (i.e. $g \cdot \Phi = \Phi$). It is, similarly, expected that transformations of the form (4.2) will map $SU_2$ connections to $SU_2$ connections if and only if the connections coincide.

5. Harmonic Maps of Finite Type

It is clear from the discussion in the previous section that the reducible connections on a simply-connected, open subset $U \subseteq \mathbb{R}^4$ are a special case of a more general type of connection. In particular, given a map $T : U \times V^\epsilon \to \mathfrak{sl}_2(\mathbb{C})$ satisfying the commutator condition (4.1), then the patching matrix
\[ G(x, z) = \exp\left( T(x, z) + T^*(x, z) \right) \tag{5.1} \]
will generate a solution of the self-dual Yang–Mills equations on a subset of $U$. The forms of condition (4.1) and the patching matrix (5.1) are reminiscent of formulae that appear when one considers harmonic maps of finite type into Lie groups (see, e.g., [14, Chap. 24] and [5,6] for harmonic maps into $k$-symmetric spaces).

Recall that, in this context, we consider Lie groups $\mathcal{G}, \mathcal{G}_1, \mathcal{G}_2$ such that $\mathcal{G} = \mathcal{G}_1 \cdot \mathcal{G}_2$ (in the sense that, given $g \in \mathcal{G}$, there exist unique $g_1 \in \mathcal{G}_1$, $g_2 \in \mathcal{G}_2$ such that $g = g_1 g_2$). At the Lie algebra level, we have a direct sum decomposition $\mathfrak{g} = \mathfrak{g}_1 + \mathfrak{g}_2$, and we denote the projections onto the two summands by $\pi_1, \pi_2$. In the case where $\mathrm{Ad}_{\mathcal{G}_1} \mathfrak{g}_2 \subseteq \mathfrak{g}_2$, then various Lax flows on $\mathfrak{g}$ can be solved explicitly. In particular, let $J_1, J_2$ be invariant vector fields on $\mathfrak{g}$^3 and consider the Lax equations
\[ \partial_s X(s, t) = \left[ X(s, t),\, (\pi_1 \circ J_1)(X(s, t)) \right], \tag{5.2a} \]
\[ \partial_t X(s, t) = \left[ X(s, t),\, (\pi_1 \circ J_2)(X(s, t)) \right], \tag{5.2b} \]
for a map $X : \mathbb{R}^2 \to \mathfrak{g}$ with initial conditions $X(0, 0) = V \in \mathfrak{g}$. These equations are compatible, and the solution to this problem may be written in the form $X(s, t) = \mathrm{Ad}_{F(s, t)^{-1}} V$, where $F : \mathbb{R}^2 \to \mathcal{G}_1$ takes the form
\[ F(s, t) = \exp\left( s\, (\pi_1 \circ J_1)(V) + t\, (\pi_1 \circ J_2)(V) \right), \qquad (s, t) \in \mathbb{R}^2. \]

3 I.e. $J_1, J_2 : \mathfrak{g} \to \mathfrak{g}$ satisfy $J_1(\mathrm{Ad}_g v) = \mathrm{Ad}_g J_1(v)$ for all $g \in \mathcal{G}$ and $v \in \mathfrak{g}$, and similarly for $J_2$.
The connection with harmonic maps arises if we let $G$ be a compact Lie group and use the standard loop-group decompositions (see, e.g., [14, Chap. 12] and [23, Chap. 8]) to take $\mathcal{G} := \Lambda G^{\mathbb{C}}$, $\mathcal{G}_1 := \Omega G$, $\mathcal{G}_2 := \Lambda^+ G^{\mathbb{C}}$. If we now impose that the initial conditions for the Lax equation (5.2) correspond to an element of the loop group of $G$ (rather than $G^{\mathbb{C}}$) and that their Laurent expansion lies between degrees $-d$ and $d$:
\[ V = \left( \lambda \mapsto f(\lambda) \equiv \sum_{n=-d}^{d} \alpha_n \lambda^n \right) \in \Lambda G^{\mathbb{C}}, \]
then it turns out that the map $X$ also has Laurent expansion that lies between degrees $-d$ and $d$. Moreover, $F : \mathbb{R}^2 \to \Omega G$ is automatically the extended solution corresponding to a harmonic map $\varphi : \mathbb{R}^2 \to G$. (In particular, $\varphi(s, t) = F(s, t)|_{\lambda = -1}$.) Such harmonic maps are of finite type.

In the context of self-dual Yang–Mills connections, the analogue of harmonic maps of finite type for self-dual Yang–Mills fields would appear to be patching matrices of the form $G(x, z) = \exp \Phi(x, z)$, where $\Phi : U \times V^\epsilon \to \mathfrak{sl}_2(\mathbb{C})$ satisfies
1) $(\partial_{\bar{u}} - z\partial_v)\, \Phi(x, z) = (\partial_{\bar{v}} + z\partial_u)\, \Phi(x, z) = 0$;
2) $\Phi^*(x, z) = \Phi(x, z)$;
3) $\Phi(x, z)$ is analytic in $z$ for $z \in V^\epsilon$ for some $\epsilon > 0$, and there exists $d \in \mathbb{N}_0$ such that $\Phi$ has a finite Laurent expansion of the form
\[ \Phi(x, z) = \sum_{n=-d}^{d} a_n(x) z^n, \qquad (x, z) \in U \times V^\epsilon, \]
for some $a_i : U \to \mathfrak{g}^{\mathbb{C}}$, $i = -d, \ldots, d$ on this set.

In particular, fixing a point $p \in U$, then the finite Laurent expansion at $p$ is analogous to the initial condition $V$ having finite Laurent expansion. Moreover, Condition 1) above is then the analogue of the Lax equations (5.2) satisfied by the map $X$ in the harmonic map case.

Definition 5.1. We will call a solution of the self-dual Yang–Mills equations for which there exists a patching matrix that satisfies the above criteria a self-dual connection of finite type^4.

4 Or, of type $d$, when we wish to be more specific.

Remark 5.1. The conditions above imply that the maps $a_i$ satisfy the conditions
\[ \partial_{\bar{u}} a_{-d} = 0, \qquad \partial_{\bar{v}} a_{-d} = 0, \tag{5.3a} \]
\[ \partial_{\bar{u}} a_{n+1} = \partial_v a_n, \qquad \partial_{\bar{v}} a_{n+1} = -\partial_u a_n, \qquad n = -d, \ldots, d - 1, \tag{5.3b} \]
\[ \partial_v a_d = 0, \qquad \partial_u a_d = 0. \tag{5.3c} \]
For $d = 0$, we deduce that $G$ is constant, and therefore the corresponding self-dual connection $A$ is flat. For $d \geq 1$, the algebraic condition (4.1) imposes non-trivial restrictions on the coefficients $a_i$. Note that our algebraically special connection is a self-dual connection of type 1.
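As a concrete check of the recursion in Remark 5.1, one can take the scalar coefficients of the algebraically special example, where the exponent $\tfrac{1}{z}(u - z\bar{v})(v + z\bar{u})$ expands as $uv\, z^{-1} + (|u|^2 - |v|^2) - \bar{u}\bar{v}\, z$. The sketch below is an illustration under stated assumptions: relations (5.3) are read with the bars placed as Condition 1) dictates (i.e. $\partial_{\bar{u}} a_{n+1} = \partial_v a_n$ and $\partial_{\bar{v}} a_{n+1} = -\partial_u a_n$), the common factor $\tau_3$ is dropped, and the helper names and the finite-difference step are choices made here.

```python
def wirtinger(fn, u, v, var, bar=False, h=1e-3):
    """Central-difference Wirtinger derivative of fn(u, v) with respect to
    u or v (bar=True gives the derivative in the conjugate variable)."""
    def shifted(delta):
        return fn(u + delta, v) if var == "u" else fn(u, v + delta)
    d_re = (shifted(h) - shifted(-h)) / (2 * h)            # d / d(real part)
    d_im = (shifted(1j * h) - shifted(-1j * h)) / (2 * h)  # d / d(imag part)
    return 0.5 * (d_re + 1j * d_im) if bar else 0.5 * (d_re - 1j * d_im)

# Scalar Laurent coefficients a_{-1}, a_0, a_1 of the type-1 example.
coeffs = {
    -1: lambda u, v: u * v,
     0: lambda u, v: abs(u)**2 - abs(v)**2,
     1: lambda u, v: -u.conjugate() * v.conjugate(),
}

def worst_residual(a, d, u, v):
    """Largest absolute residual of relations (5.3a)-(5.3c) at the point (u, v)."""
    D = lambda n, var, bar=False: wirtinger(a[n], u, v, var, bar)
    res = [D(-d, "u", True), D(-d, "v", True)]             # (5.3a)
    for n in range(-d, d):
        res.append(D(n + 1, "u", True) - D(n, "v"))        # (5.3b), first relation
        res.append(D(n + 1, "v", True) + D(n, "u"))        # (5.3b), second relation
    res += [D(d, "v"), D(d, "u")]                          # (5.3c)
    return max(abs(r) for r in res)

worst = worst_residual(coeffs, 1, 0.3 + 0.2j, -0.1 + 0.4j)
```

Since the coefficients are quadratic polynomials, central differences are exact up to rounding, so `worst` should be negligibly small.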
Remark 5.2. Let $A$ be a reducible connection defined by a harmonic function $a$ as in (3.1). Letting $a_0(x) := a(x)$ then $A$ is a self-dual connection of finite type if and only if there exists $d > 0$ and functions $a_{-d}, \ldots, a_d$ such that Eqs. (5.3) hold. In particular, this condition implies that
\[ \frac{\partial^d a}{\partial u^r\, \partial v^s} = 0, \qquad \text{for all } r, s \text{ such that } r + s = d. \]
Therefore, $a$ is necessarily a polynomial of degree less than or equal to $d$ in $(u, \bar{u}, v, \bar{v})$. As such, the space of self-dual connections of type $d$ is necessarily finite-dimensional.

Remark 5.3. The most restrictive case is when we impose that $G$ splits in the form (2.7) with
\[ \Psi_0(x, z) = \exp\left( \frac{a_0(x)}{2} + \sum_{n=1}^{d} a_n(x) z^n \right), \qquad \Psi_\infty(x, z) = \exp\left( -\frac{a_0(x)}{2} - \sum_{n=-d}^{-1} a_n(x) z^n \right). \]
Such a splitting only occurs if $[a_i(x), a_j(x)] = 0$, for all $i, j = -d, \ldots, d$. Since $SL_2(\mathbb{C})$ is of rank one, it follows that there exists a constant element $\alpha \in \mathfrak{sl}_2(\mathbb{C})$ such that $a_i(x) = \varphi_i(x) \alpha$, for functions $\varphi_i : U \to \mathbb{C}$. A change of basis (rotating so that $\alpha \mapsto \tau_3$) implies that such patching matrices give rise to reducible connections when $a_0$ is real.

Remark 5.4. One of the main differences between the integrable systems approach to harmonic maps and the self-dual Yang–Mills equations is the form of the symmetry group action on the solutions. In the case of harmonic map equations from a domain $X \subseteq \mathbb{R}^2$ to a Lie group $G$, one interprets the harmonic map equations as implying the existence of a holomorphic map $E : X \to \Omega G$ into the based loop group of $G$. The “dressing action” on the space of harmonic maps is then induced by the action of various groups on the group $\Omega G$ [15,25]. In particular, the symmetry group acts only on the space where the map $E$ takes its values, rather than on the map $E$ itself. In the case of the self-dual Yang–Mills equations, the object of study is the patching matrix $G : U \times V^\epsilon \to SL_2(\mathbb{C})$, and the group action (2.10) acts non-trivially on the map $G$. This difference is the main issue that makes the case of self-dual Yang–Mills equations more complicated. As remarked earlier, the particular form of the group action (2.10) implies that many of the techniques used to study orbits in the harmonic map case have no direct analogue in the self-dual Yang–Mills case.

6. Final Remarks

Our main result is that reducible connections that satisfy the self-dual Yang–Mills equations on simply-connected, open subsets of $\mathbb{R}^4$ lie in the orbit of the flat connection under the action of the non-local symmetry group of these equations found in [7–9]. In particular, such connections lie within a larger class of solutions, discussed in Sect. 4, defined by a holomorphic function $T(x, z)$ with the property that $[T(x, z), T^*(x, z)] = 0$.
This condition defines a class of solutions of the self-dual Yang–Mills equations that seem quite natural from the integrable systems point of view,
and suggests a connection with the theory of harmonic maps of finite type. Whether the analogy with such harmonic maps may be extended, and techniques developed in, for example [5,6], may be adapted to the study of our class of self-dual connections, is under investigation.

It is clear that the work here (and in the sister paper [13]) may be extended in several ways. The investigation of the symmetry group on the one-instanton moduli space on the four-manifold $\mathbb{CP}^2$ would be of particular interest, since, in this case, the standard reducible connection is also $L^2$, so we have reducible and irreducible connections in the same moduli space. Such an investigation would yield further information concerning the different behaviour of reducible connections studied here and the instanton connections studied in [13] under the symmetry group. In a different direction, given that the reducible connections and the class discussed in Sect. 4 seem quite a natural family of solutions to investigate from the point of view of integrable systems, it would be of interest to investigate whether there are similar families of self-dual Ricci-flat four-manifolds (for example, those with algebraically special self-dual Weyl tensor) that arise naturally from the symmetries of, for example, Plebanski’s equations [21].

We should also point out that we have exclusively considered the self-dual Yang–Mills equations on Riemannian manifolds, due to the original motivation of Donaldson theory. It is more usual to investigate the integrable systems aspects of the self-dual Yang–Mills equations on manifolds of signature $(--++)$ (see, e.g., [19] for an extensive treatment of this topic). It would be of interest to investigate the action of symmetries on, for example, reducible connections in the case of signature $(--++)$.
Viewed in conjunction with the results of the companion paper [13], where instanton solutions of the self-dual Yang–Mills equations were investigated, it appears that the action of the non-local symmetry group on the space of solutions of the self-dual Yang–Mills equations is quite different in the two cases. In the case of instanton moduli spaces, evidence was found that the orbits of the symmetry group that preserve the $L^2$ nature of the curvature of the connection are rather small. In the present case, however, all reducible connections are contained in a single orbit. It appears that the distinction between instanton connections and reducible connections for the self-dual Yang–Mills equations is, in this sense, similar to the distinction between harmonic maps of finite uniton number [25] and harmonic maps of finite type. Since the original motivation for the current work (and [13]) was to investigate connections between integrable systems theory and Donaldson’s use of the self-dual Yang–Mills equations in connection with four-dimensional topology [11], it is rather striking that the behaviour of reducible connections and irreducible connections should be so different from the integrable systems point of view. Whether these results point to a deeper relationship between integrable systems theory and topological field theory would certainly seem worthy of further investigation.

Acknowledgements. This work was supported by START-project Y237–N13 of the Austrian Science Fund and a Visiting Professorship at the University of Vienna. The author is grateful to Prof. M.A. Guest for discussions and, in particular, for drawing his attention to harmonic maps of finite type. He would also like to thank the anonymous referee for a detailed critique of the original version of this paper.
A. Constant Group Action

The group action $G(x, z) \mapsto h(x, z)\,G(x, z)\,h^*(x, z)$ is a little unusual. In order to gain some insight into this action, we consider some similar actions on simpler groups, analogous to the case where $G$ and $h$ are constant.
A.1. $SL_2(\mathbb{R})$. Consider the action of $SL_2(\mathbb{R})$ on itself given by

$$SL_2(\mathbb{R}) \times SL_2(\mathbb{R}) \to SL_2(\mathbb{R}); \qquad (h, g) \mapsto h \cdot g := h g h^t,$$

where $t$ denotes transpose. The subgroup $PSL_2(\mathbb{R}) \cong SO^0_{2,1}$ acts effectively. We decompose $g$ into symmetric and skew-symmetric parts

$$g = U + \alpha\epsilon,$$

where $U$ is symmetric, $\alpha \in \mathbb{R}$ and $\epsilon = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. It then follows that $\alpha$ is invariant under the action of $h$ and $U$ transforms according to $U \mapsto h \cdot U := h U h^t$. The fact that $g$ lies in $SL_2(\mathbb{R})$ implies that $\det U = 1 - \alpha^2$. Writing

$$U = \begin{pmatrix} t + x & y \\ y & t - x \end{pmatrix},$$

then $\det U = -\mathbf{u}^2$, where $\mathbf{u} = (t, x, y) \in \mathbb{R}^{2,1}$. The orbits of the group action are then parametrised by $\alpha \in \mathbb{R}$ and consist of vectors $\mathbf{u} \in \mathbb{R}^{2,1}$ with $\mathbf{u}^2 = \alpha^2 - 1$. Since the restriction on $\mathbf{u}$ is insensitive to the sign of $\alpha$, we consider the orbits for $\alpha \ge 0$:

$\alpha = 0$. Here there are two orbits consisting of symmetric elements of $SL_2(\mathbb{R})$. We have $\mathbf{u}^2 = -1$, so $\mathbf{u}$ lies on the two-sheeted hyperboloid in $\mathbb{R}^{2,1}$, with each sheet constituting an orbit. In this case, giving the orbits the induced hyperbolic metric, the group $SL_2(\mathbb{R})$ acts isometrically.

$0 < \alpha < 1$. In this case, there are two orbits, i.e. the two components of the hyperboloid $\mathbf{u}^2 = -1 + \alpha^2 \in (-1, 0)$ in $\mathbb{R}^{2,1}$. Again, the group $SL_2(\mathbb{R})$ acts isometrically with respect to the induced metric on the orbits.

$\alpha = 1$. In this case, $\mathbf{u}^2 = 0$, so either $\mathbf{u} = 0$ or $\mathbf{u}$ is null. In the first case, the group orbit consists of the point $\mathbf{u} = 0$. In the latter case, the future and past null-cones of the origin give two distinct group orbits.

$\alpha > 1$. In this case, there is one orbit, consisting of the one-sheeted hyperboloid $\mathbf{u}^2 = -1 + \alpha^2 \in (0, \infty)$ in $\mathbb{R}^{2,1}$. In this case, $SL_2(\mathbb{R})$ acts isometrically with respect to the induced (Lorentzian) metric on the orbit.

A.2. $SL_2(\mathbb{C})$. In particular, we consider the action of $SL_2(\mathbb{C})$ on itself given by

$$SL_2(\mathbb{C}) \times SL_2(\mathbb{C}) \to SL_2(\mathbb{C}); \qquad (h, g) \mapsto h \cdot g := h g h^*, \tag{A.1}$$

where $*$ denotes complex-conjugate transpose. The subgroup $PSL_2(\mathbb{C}) \cong SO_{3,1}$ acts effectively. It is straightforward to check that

$$I[g] := \frac{1}{2}\,\mathrm{tr}\left(g^{\dagger} g^{-1}\right)$$

is invariant under the transformation $g \mapsto h \cdot g$. It is useful to split $g$ into Hermitian and skew-Hermitian parts:
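$$g = U + V, \qquad \text{where } U^* = U, \quad V^* = -V,$$

and to note that this decomposition is preserved under (A.1) (i.e. $(h \cdot g)_U = h \cdot g_U$, etc.). A straightforward calculation implies that

$$\det U = \frac{1}{2}(I + 1).$$

In particular, letting $U = t + x\tau_1 + y\tau_2 + z\tau_3$ and $\mathbf{u} := (t, x, y, z) \in \mathbb{R}^{3,1}$, we deduce that

$$\mathbf{u}^2 = -\frac{1}{2}(I + 1). \tag{A.2}$$

Similarly, letting $V = i\,(T + X\tau_1 + Y\tau_2 + Z\tau_3)$ and $\mathbf{v} := (T, X, Y, Z) \in \mathbb{R}^{3,1}$, we find that $\det g = -\mathbf{u}^2 - 2i\langle \mathbf{u}, \mathbf{v}\rangle + \mathbf{v}^2$. Since $g \in SL_2(\mathbb{C})$, we therefore deduce that

$$\mathbf{v}^2 = -\frac{1}{2}(I - 1), \tag{A.3}$$

$$\langle \mathbf{u}, \mathbf{v}\rangle = 0. \tag{A.4}$$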
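The relations (A.2)–(A.4) lend themselves to a quick numerical sanity check. The following is only a sketch: it assumes that the $\tau_i$ are the standard Pauli matrices (so that $\mathbf{u}^2 = -\det U$ and $\mathbf{v}^2 = \det V$), and every helper name (`random_sl2c`, `act`, `invariant`, `minkowski_data`) is hypothetical, introduced here for illustration.

```python
import random

def mul(a, b):
    """Product of two 2x2 complex matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def dagger(a):
    """Complex-conjugate transpose."""
    return tuple(tuple(a[j][i].conjugate() for j in range(2)) for i in range(2))

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def trace(a):
    return a[0][0] + a[1][1]

def inverse(a):
    d = det(a)
    return ((a[1][1] / d, -a[0][1] / d), (-a[1][0] / d, a[0][0] / d))

def random_sl2c(rng):
    """Random element of SL_2(C); the upper-left entry is kept away from zero."""
    z = lambda: complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
    a, b, c = z() + 2, z(), z()
    return ((a, b), (c, (1 + b * c) / a))

def act(h, g):
    """The action (A.1): g -> h g h^*."""
    return mul(mul(h, g), dagger(h))

def invariant(g):
    """I[g] = (1/2) tr(g^dagger g^{-1})."""
    return 0.5 * trace(mul(dagger(g), inverse(g)))

def minkowski_data(g):
    """Return (u^2, v^2, <u, v>), assuming the tau_i are Pauli matrices."""
    gd = dagger(g)
    U = tuple(tuple((g[i][j] + gd[i][j]) / 2 for j in range(2)) for i in range(2))
    W = tuple(tuple((g[i][j] - gd[i][j]) / 2j for j in range(2)) for i in range(2))
    u2 = -det(U).real                      # U = t + x.tau  =>  det U = -u^2
    v2 = -det(W).real                      # V = iW, W Hermitian
    uv = (0.5 * trace(mul(U, W)) - 0.5 * trace(U) * trace(W)).real
    return u2, v2, uv

rng = random.Random(0)
g, h = random_sl2c(rng), random_sl2c(rng)
I = invariant(g)
u2, v2, uv = minkowski_data(g)
```

Under these assumptions `invariant(act(h, g))` should agree with `invariant(g)` up to rounding, and `u2`, `v2`, `uv` should reproduce (A.2), (A.3) and (A.4) respectively.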
If $I > 1$ then $\mathbf{u}$ and $\mathbf{v}$ would be non-zero, orthogonal, time-like vectors in $\mathbb{R}^{3,1}$. Since this cannot occur, we deduce that $I \le 1$. We investigate the distinct cases separately:

$I = 1$. In this case, $\mathbf{u}^2 = -1$ and $\mathbf{v}^2 = 0$. As such, $\mathbf{u}$ lies on the two-sheeted hyperboloid in $\mathbb{R}^{3,1}$. The condition that $\mathbf{v}^2 = 0$ and is orthogonal to the non-zero, time-like vector $\mathbf{u}$ then implies that $\mathbf{v} = 0$. As such, we have two distinct orbits, corresponding to the two components of the two-sheeted hyperboloid. These orbits correspond to the Hermitian elements of $SL_2(\mathbb{C})$. Giving the orbits the hyperbolic metric induced from $\mathbb{R}^{3,1}$, the group $SL_2(\mathbb{C})$ acts isometrically.

Note that for $I < 1$, the vector $\mathbf{v}$ is always space-like, and lies on the one-sheeted hyperboloid $\Sigma_I := \left\{ \mathbf{w} \in \mathbb{R}^{3,1} : \mathbf{w}^2 = \frac{1}{2}(1 - I) \right\}$ in $\mathbb{R}^{3,1}$.

$-1 < I < 1$. We have $\mathbf{u}^2 = -(I + 1)/2$ in $\mathbb{R}^{3,1}$. Since $\mathbf{u}$ is orthogonal to $\mathbf{v}$, we may view $\mathbf{u}$ as a time-like vector of length $\sqrt{(I + 1)/2}$ lying in the two-sheeted hyperboloid in $T_{\mathbf{v}}\Sigma_I$. As such, we have two distinct orbits, consisting of the two components of the two-sheeted hyperboloid bundle in $T\Sigma_I$. In this case, the group action on the orbit is the action induced by the isometric action of $SL_2(\mathbb{R})$ on the Lorentzian metric induced on the one-sheeted hyperboloid. Alternatively, we may view $\mathbf{u}$ as a time-like vector lying on the two-sheeted hyperboloid $\mathbf{u}^2 = -(I + 1)/2$ in $\mathbb{R}^{3,1}$. We then view $\mathbf{v}$ as a tangent vector to the hyperboloid of length $\sqrt{(1 - I)/2}$, as dictated by (A.3). Therefore the orbits in this case may be identified with the radius $\sqrt{(1 - I)/2}$ sphere sub-bundle of the tangent bundle of the hyperbolic space of radius $\sqrt{(I + 1)/2}$. Again, there are two orbits corresponding to the two components of the hyperboloid. In this case, the group action on the orbit is the action induced by the isometric action of $SL_2(\mathbb{R})$ on the induced metric on the two-sheeted hyperboloid.

$I = -1$. In this case, $\mathbf{u}^2 = 0$ and $\mathbf{v}^2 = 1$. As such, we may view $\mathbf{u}$ as a null vector in $T_{\mathbf{v}}\Sigma_{-1}$. There are then three distinct orbits. The first consists of $\mathbf{u} = 0$, and is simply the hyperboloid $\Sigma_{-1}$. This orbit consists of the skew-Hermitian elements of $SL_2(\mathbb{C})$.
The other orbits consist of the sub-bundle of $T\Sigma_{-1}$ consisting of the past and future null cone of the origin in each tangent space.

$I < -1$. In this case, $\mathbf{u}^2 = -(I + 1)/2 = (|I| - 1)/2 > 0$ in $\mathbb{R}^{3,1}$. Therefore there is one orbit, consisting of the one-sheeted hyperboloid sub-bundle of $T\Sigma_I$. The $SL_2(\mathbb{C})$ action is that induced by the isometric action on the induced Lorentzian metric on $\Sigma_I$.

References

1. Atiyah, M.F.: Geometry on Yang–Mills Fields, Scuola Normale Superiore Pisa, Pisa, 1979
2. Atiyah, M.F., Hitchin, N.J., Drinfel'd, V.G., Manin, Y.I.: Construction of instantons. Phys. Lett. A 65, 185–187 (1978)
3. Atiyah, M.F., Hitchin, N.J., Singer, I.M.: Self-duality in four-dimensional Riemannian geometry. Proc. Roy. Soc. London Ser. A 362, 425–461 (1978)
4. Bergvelt, M.J., Guest, M.A.: Actions of loop groups on harmonic maps. Trans. Amer. Math. Soc. 326, 861–886 (1991)
5. Burstall, F.E., Pedit, F.: Harmonic maps via Adler-Kostant-Symes theory. In: Harmonic Maps and Integrable Systems, Aspects Math., E23, Braunschweig: Vieweg, 1994, pp. 221–272
6. Burstall, F.E., Pedit, F.: Dressing orbits of harmonic maps. Duke Math. J. 80, 353–382 (1995)
7. Chau, L.L., Ge, M.L., Sinha, A., Wu, Y.S.: Hidden-symmetry algebra for the self-dual Yang–Mills equation. Phys. Lett. B 121, 391–396 (1983)
8. Chau, L.L., Ge, M.L., Wu, Y.S.: Kac–Moody algebra in the self-dual Yang–Mills equation. Phys. Rev. D (3) 25, 1086–1094 (1982)
9. Chau, L.-L., Wu, Y.S.: More about hidden-symmetry algebra for the self-dual Yang–Mills system. Phys. Rev. D (3) 26, 3581–3592 (1982)
10. Crane, L.: Action of the loop group on the self-dual Yang–Mills equation. Commun. Math. Phys. 110, 391–414 (1987)
11. Donaldson, S.K.: An application of gauge theory to four-dimensional topology. J. Diff. Geom. 18, 279–315 (1983)
12. Freed, D.S., Uhlenbeck, K.K.: Instantons and Four-Manifolds, Vol. 1 of Mathematical Sciences Research Institute Publications, New York: Springer-Verlag, Second ed., 1991
13. Grant, J.D.E.: The ADHM construction and non-local symmetries of the self-dual Yang–Mills equations. Commun. Math. Phys. doi:10.1007/s00220-010-1024-9
14. Guest, M.A.: Harmonic Maps, Loop Groups, and Integrable Systems, Vol. 38 of London Mathematical Society Student Texts, Cambridge: Cambridge University Press, 1997
15. Guest, M.A., Ohnita, Y.: Group actions and deformations for harmonic maps. J. Math. Soc. Japan 45, 671–704 (1993)
16. Hitchin, N.J.: Linear field equations on self-dual spaces. Proc. Roy. Soc. London Ser. A 370, 173–191 (1980)
17. Ivanova, T.A.: On infinite-dimensional algebras of symmetries of the self-dual Yang–Mills equations. J. Math. Phys. 39, 79–87 (1998)
18. Ivanova, T.A.: On infinitesimal symmetries of the self-dual Yang–Mills equations. J. Nonlinear Math. Phys. 5, 396–404 (1998)
19. Mason, L.J., Woodhouse, N.M.J.: Integrability, Self-Duality, and Twistor Theory, Vol. 15 of London Mathematical Society Monographs. New Series, New York: The Clarendon Press/Oxford University Press, 1996
20. Park, Q.-H.: 2D sigma model approach to 4D instantons. Int. J. Mod. Phys. A 7, 1415–1447 (1992)
21. Plebański, J.F.: Some solutions of complex Einstein equations. J. Math. Phys. 16, 2395–2402 (1975)
22. Popov, A.D.: Self-dual Yang–Mills: symmetries and moduli space. Rev. Math. Phys. 11, 1091–1149 (1999)
23. Pressley, A., Segal, G.: Loop Groups, Oxford Mathematical Monographs, New York: The Clarendon Press/Oxford University Press, 1986
24. Takasaki, K.: A new approach to the self-dual Yang–Mills equations. Commun. Math. Phys. 94, 35–59 (1984)
25. Uhlenbeck, K.: Harmonic maps into Lie groups: classical solutions of the chiral model. J. Diff. Geom. 30, 1–50 (1989)
26. Uhlenbeck, K.K.: Removable singularities in Yang–Mills fields. Commun. Math. Phys. 83, 11–29 (1982)
27. Ward, R.S.: On self-dual gauge fields. Phys. Lett. A 61, 81–82 (1977)

Communicated by N.A. Nekrasov
Commun. Math. Phys. 296, 447–474 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1018-7
Communications in
Mathematical Physics
Random Current Representation for Transverse Field Ising Model Nicholas Crawford1 , Dmitry Ioffe2 1 Department of Statistics, UC Berkeley, Berkeley, CA 94720, USA 2 Faculty of Industrial Engineering, Technion, Haifa 3200, Israel.
E-mail:
[email protected] Received: 8 February 2009 / Accepted: 14 December 2009 Published online: 2 March 2010 – © Springer-Verlag 2010
Abstract: Random current representation (RCR) for transverse field Ising models (TFIM) has been introduced in [14]. This representation is a space-time version of the classical RCR exploited by Aizenman et al. [1,3,4]. In this paper we formulate and prove corresponding space-time versions of the classical switching lemma and show how they generate various correlation inequalities. In particular we prove exponential decay of truncated two-point functions at positive magnetic fields in the z-direction and address the issue of the sharpness of the phase transition.

1. The Model and the Results

In what follows, we shall, for brevity, consider translation invariant models on $\mathbb{Z}^d$. Specifically, let $\mathbb{T}_N$ be the $d$-dimensional lattice torus of linear size $N$ and let $J = \{J_{ij} = J_{i-j}\}$ be a finite range irreducible translation invariant interaction. Let $h \ge 0$, $\rho > 0$, $\lambda \ge 0$ and $0 \le \beta \le \infty$. The quantum Hamiltonian we are going to consider is of the form

$$-H_N = \frac{\rho}{2} \sum_{i,j} J_{ij}\, \hat\sigma^z_i \hat\sigma^z_j + h \sum_i \hat\sigma^z_i + \lambda \sum_i \hat\Sigma^x_i. \tag{1.1}$$

Above $\hat\Sigma^x_i = \left(I + \hat\sigma^x_i\right)/2$, and $\hat\sigma^z$ and $\hat\sigma^x$ are the usual Pauli matrices,

$$\hat\sigma^z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \quad \text{and} \quad \hat\sigma^x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

Let us introduce the partition function

$$Z_{\beta,N}(h, \rho, \lambda) = e^{-N^d \beta (\rho \bar{J} + h + \lambda)}\, \mathrm{Tr}\, e^{-\beta H_N},$$
This research was supported by a grant from G.I.F., the German Israeli Foundation for Scientific Research and Development and by a grant from BSF, the United States—Israel Binational Science Foundation.
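where $\bar{J} = \sum_j J_{ij}$ and we remark that this choice of normalization is made so as to seamlessly introduce certain stochastic integral representations below. Mean values of various local observables are denoted as $\langle \cdot \rangle_{\beta,N}$. For instance,

$$\langle \hat\sigma^z_i \rangle_{\beta,N} = \frac{\mathrm{Tr}\left(\hat\sigma^z_i\, e^{-\beta H_N}\right)}{\mathrm{Tr}\, e^{-\beta H_N}}, \qquad \langle \hat\Sigma^x_i \rangle_{\beta,N} = \frac{\mathrm{Tr}\left(\hat\Sigma^x_i\, e^{-\beta H_N}\right)}{\mathrm{Tr}\, e^{-\beta H_N}},$$

or, for $i \ne j$,

$$\langle \hat\sigma^z_j \hat\Sigma^x_i \rangle_{\beta,N} = \frac{\mathrm{Tr}\left(\hat\sigma^z_j \hat\Sigma^x_i\, e^{-\beta H_N}\right)}{\mathrm{Tr}\, e^{-\beta H_N}}.$$

Most of the results which we shall derive in the sequel hold uniformly in $\beta < \infty$ and/or in $N$. Whenever this is the case we shall omit the corresponding sub-index. Note that in many cases uniformity in $\beta < \infty$ implies extensions of the corresponding properties to the ground state $\beta = \infty$. Important quantities to be considered here are the z-magnetization:

$$M_{\beta,N}(h, \rho, \lambda) = \langle \hat\sigma^z_i \rangle_{\beta,N},$$

and the truncated two-point functions,

$$\langle \hat\sigma^z_i ; \hat\sigma^z_j \rangle_{\beta,N} = \langle \hat\sigma^z_i \hat\sigma^z_j \rangle_{\beta,N} - \langle \hat\sigma^z_i \rangle_{\beta,N} \langle \hat\sigma^z_j \rangle_{\beta,N}, \qquad \langle \hat\sigma^z_i ; \hat\Sigma^x_j \rangle_{\beta,N} \quad \text{and} \quad \langle \hat\Sigma^x_i ; \hat\Sigma^x_j \rangle_{\beta,N}.$$

Our two main results are:

Theorem A. For every $h > 0$, $\lambda \ge 0$ and $\rho \ge 0$ there exist $c_1 = c_1(h, \lambda, \rho) > 0$ and $c_2 = c_2(h, \lambda, \rho) < \infty$, such that

$$0 \le \langle \hat\sigma^z_i ; \hat\sigma^z_j \rangle \le c_2 e^{-c_1 |j - i|}, \qquad 0 \le \langle \hat\Sigma^x_i ; \hat\Sigma^x_j \rangle \le c_2 e^{-c_1 |j - i|},$$
$$\text{and, for } i \ne j, \qquad -c_2 e^{-c_1 |j - i|} \le \langle \hat\sigma^z_i ; \hat\Sigma^x_j \rangle \le 0. \tag{1.2}$$

By our convention the above results are claimed to be uniform in the torus size $N$ and in $\beta < \infty$.

Theorem B. Uniformly in $h > 0$, $\rho > 0$ and $\lambda > 0$ the following differential inequalities hold:

$$M(h, \rho, \lambda) \le h \frac{\partial M}{\partial h} + M^3 + M^2 \rho \frac{\partial M}{\partial \rho} - 2\lambda M^2 \frac{\partial M}{\partial \lambda}, \tag{1.3}$$

and,

$$-\frac{\partial M}{\partial \lambda} \le \frac{M}{1 - M^2} \frac{\partial M}{\partial h} \quad \text{and} \quad \frac{\partial M}{\partial \rho} \le \bar{J} M \frac{\partial M}{\partial h}. \tag{1.4}$$

Again, by convention, the above inequalities are claimed to hold uniformly in $N$ and in $\beta < \infty$.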
In view of the fundamental techniques developed in [2,3], differential inequalities (1.3) and (1.4) imply a certain sharpness of phase transition as the transverse field $\lambda$ and/or the inverse temperature $\beta$ are varied. In particular, since $\hat\sigma^z$ and $\hat\Sigma^x$ do not commute, the uniformity of our estimates in $\beta$ implies that, taking $\beta \to \infty$, these inequalities still hold and can be used to derive a genuine quantum phase transition, despite the fact that we derive it using a somewhat classical re-interpretation of the model (see Sect. 5). In principle, since the model in question could be considered as the strong coupling limit of $(d + 1)$-dimensional classical Ising models [5,9], Theorem B could be attempted as a limiting conclusion from the result of [3]. The point of this paper, however, is to try to understand something new; that is, to develop a general and robust stochastic geometric description of quantum systems, hopefully also yielding simpler, or at least alternative, proofs even in the classical case of $\lambda = 0$. In particular, the conclusions of both theorems above will become rather transparent in the stochastic geometric context which we develop here.

The rest of this paper is organized as follows. Section 2 introduces a recasting of the transverse Ising model in a useful probabilistic language. Further, we set down various geometric notions for this recasting which form the basis of our proofs of Theorems A and B. Section 3 applies these notions to the truncated correlation functions appearing in Theorem A. The resulting expressions may be seen as generalizing the results of the classical Switching Lemma employed in [1,3,4]. Section 4 provides a derivation of Theorem B. Section 5 analyzes expressions for truncated correlations to obtain a proof of Theorem A. Finally, at the end of Sect. 5 we briefly address the implications for a quantum phase transition in the ground state $\beta = \infty$.

A Bibliographical Remark.
Shortly after the first draft of this work was posted on the web, there appeared a preprint of [8]. The authors of [8] draw motivation from a parity calculus via strong coupling limits for classical RCR, and they develop what they call “random-parity representation” for TFIM. The paper [8] contains very similar formulations and proofs of the corresponding switching lemma and of the differential inequalities. The following bibliographical remark is due: (a) Although it might look ostensibly different, the random-parity representation of [8] can be readily derived (see Remark 1 below) from the RCR which was introduced in [14] and which we use here. [14] is a transcript of lectures given at Prague’s Probability school in 2006. (b) A simple example of the application to TFIM of the classical switching lemma via limiting parity calculus appears in the Appendix of [10].
2. Stochastic Geometry of the Model The stochastic geometric approach to quantum models via the Lie-Trotter product expansion in the imaginary time variable (additional dimension) and a subsequent classical re-interpretation was introduced in [11]. An important milestone along these lines is the seminal paper [6]. The approach expounded upon in that paper has many degrees of freedom in the sense that one can experiment with numerous decompositions of the Hamiltonian and with the basis in which the Lie-Trotter expansion is performed to achieve different representations. We shall skip the derivation of the representation of interest in the present context and proceed directly to its probabilistic description. We refer the interested reader to [14] where the quantum random current representation we are using here was introduced
and where various other stochastic geometric descriptions of the transverse field Ising model are discussed at length.

To each site $i \in \mathbb{T}_N$ one attaches a copy $S^i_\beta$ of the circle $S_\beta$ of circumference $\beta$. In the ground state case $\beta = \infty$, $S_\infty = \mathbb{R}$. The resulting $(d + 1)$-dimensional state space of the model is $S_N \cup g$, where

$$S_N = \cup_{i \in \mathbb{T}_N} S^i_\beta,$$

and $g$ is an artificial “ghost site”. The parameters $h$, $J$ and $\lambda$ enter the picture in the following fashion: Consider graphs $G_N = (V_N, E_N)$ with the vertex set $V_N = \mathbb{T}_N \cup g$, and edge set $E_N = E^0_N \cup E^g_N$ which comprise either edges $e = (i, j) \in E^0_N$ with $i, j \in \mathbb{T}_N$ and $J_{i-j} > 0$, or $e = (i, g) \in E^g_N$ with $i \in \mathbb{T}_N$. As above, we omit the sub-index $N$ whenever it has no impact on the corresponding definition or claim. Let us define the following families of independent Poisson point processes on $S_\beta$:

Processes of flips. With each $e \in E_N$ we associate a Poisson process $\xi_e$ which has intensity $\rho J_{i-j}$ if $e = (i, j)$ and intensity $h$ if $e = (i, g)$.

Processes of marks. With each $i \in \mathbb{T}_N$ we associate a Poisson process $m_i$ of intensity $\lambda$.

In the sequel we shall denote the corresponding product measure as $P(d\xi, dm)$. In particular, for notational convenience, whenever there is no confusion the dependence on $(\beta, J, h, \rho, \lambda)$ will be suppressed. To write down the random current representation we still need to introduce the notion of labels:

Labels. Labels $\nu$ are piece-wise constant maps $\nu : S_N \to \{r, l\}$. Here $r$ and $l$ are just two symbols, which, if one traces the original derivation of [14], are related to the one particle eigenfunctions in the transverse x-basis. Given a realization $(\xi, m)$ of the Poisson point processes and a finite subset $A \subset S$, let us say that a label $\nu$ is compatible (see Fig. 1), which will be denoted by $\nu \overset{A}{\sim} (\xi, m)$, if

(1) $\nu_i$ has a jump at $u$ for every $u \in A$.
(2) All other jumps of $\nu$ happen at arrival times of $\xi$: For $e = (i, g)$, an arrival of $\xi_e$ enforces a flip of $\nu_i$, and, similarly, an arrival of $\xi_{ij}$ enforces a simultaneous flip of $\nu_i$ and $\nu_j$.
(3) For each $i$, $\nu_i(t) = r$ at each arrival time $t$ of $m_i$.

To facilitate the notation we shall drop $A$ from $\nu \overset{A}{\sim} (\xi, m)$ whenever $A = \emptyset$.

Representation Formulas. The following formulas are established in [14]: For the partition function (and $\beta < \infty$),

$$Z_N = \int P(d\xi, dm) \sum_{\nu \sim (\xi, m)} 1. \tag{2.1}$$
Remark 1. Integrating out the process of marks m and calling r “even” and l “odd”, one recovers the “random parity” representation of [8].
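The compatibility rules (1)–(3) can be made concrete on the simplest possible geometry. The sketch below is illustrative only and assumes a single site attached to the ghost site, so that every flip comes from the ghost edge and the label lives on one circle of circumference `beta`, parametrised by $[0, \beta)$. The function names `sample_poisson_points` and `count_compatible_labels` are hypothetical. For a fixed realization, the inner sum in (2.1) is then just the number of compatible labels, which can only be 0, 1 or 2.

```python
import random

def sample_poisson_points(rate, beta, rng):
    """Arrival times of a homogeneous Poisson process of the given rate on [0, beta)."""
    points = []
    if rate <= 0:
        return points
    t = rng.expovariate(rate)
    while t < beta:
        points.append(t)
        t += rng.expovariate(rate)
    return points

def count_compatible_labels(flips, marks, sources=()):
    """Number of labels on one circle compatible with the given flips and marks.

    flips   -- arrival times of the ghost-edge flip process (each forces a jump)
    marks   -- arrival times of the mark process (the label must equal 'r' there)
    sources -- the finite set A of additional prescribed jump points
    """
    jumps = sorted(list(flips) + list(sources))
    if len(jumps) % 2 == 1:
        return 0  # an odd number of jumps cannot close up around the circle
    count = 0
    for start in ("r", "l"):  # label on the arc starting at time 0
        other = "l" if start == "r" else "r"
        ok = True
        for t in marks:
            jumps_before = sum(1 for s in jumps if s < t)
            label = start if jumps_before % 2 == 0 else other
            if label != "r":
                ok = False
                break
        if ok:
            count += 1
    return count

rng = random.Random(1)
flips = sample_poisson_points(rate=0.7, beta=2.0, rng=rng)   # intensity h
marks = sample_poisson_points(rate=0.4, beta=2.0, rng=rng)   # intensity lambda
n_labels = count_compatible_labels(flips, marks)
```

Averaging `n_labels` over many independent samples would give a Monte Carlo estimate of the right-hand side of (2.1) in this toy single-site setting; passing `sources=[t]` counts the labels entering the numerator of the one-point function instead.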
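Fig. 1. Poisson processes of arrivals and compatible labels on $S = \cup_{i=1}^{6} S^i_\beta$: (a) $\nu \sim (\xi, m)$ (b) $\nu \overset{(4,t)}{\sim} (\xi, m)$

Given $u = (i, t)$ define $\hat\sigma^z_u = e^{-tH} \hat\sigma^z_i e^{tH}$ and, accordingly, $\hat\Sigma^x_u = e^{-tH} \hat\Sigma^x_i e^{tH}$. Note here that the signs match the imaginary time rotation of the quantum evolution. For one- and two-point functions in the z component of spin:

$$\langle \hat\sigma^z_u \rangle = \frac{1}{Z} \int P(d\xi, dm) \sum_{\nu \overset{u}{\sim} (\xi, m)} 1. \tag{2.2}$$

For the two-point function,

$$\langle \hat\sigma^z_u \hat\sigma^z_v \rangle = \frac{1}{Z} \int P(d\xi, dm) \sum_{\nu \overset{u,v}{\sim} (\xi, m)} 1. \tag{2.3}$$

In fact, it is straightforward to check that similar formulas hold for x-observables and mixed two-point functions (see [14] for details): Namely,

$$\langle \hat\Sigma^x_u \rangle = \frac{1}{Z} \int P(d\xi, dm) \sum_{\nu \sim (\xi, m)} \mathbf{1}_{\{\nu(u) = r\}}, \qquad \langle \hat\Sigma^x_u \hat\Sigma^x_v \rangle = \frac{1}{Z} \int P(d\xi, dm) \sum_{\nu \sim (\xi, m)} \mathbf{1}_{\{\nu(u) = r\}} \mathbf{1}_{\{\nu(v) = r\}}, \tag{2.4}$$

and, for $u \ne v$,

$$\langle \hat\sigma^z_u \hat\Sigma^x_v \rangle = \frac{1}{Z} \int P(d\xi, dm) \sum_{\nu \overset{u}{\sim} (\xi, m)} \mathbf{1}_{\{\nu(v) = r\}}. \tag{2.5}$$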
Note that once these formulas are available with $u \ne v$, they may be extended by continuity to the appropriate limiting correlation functions. We do not state them here as they will not appear in our derivations below.

Intervals, paths and replicas. Let $(\xi, m)$ be a realization of the Poisson processes introduced in the previous section, $A$ a finite subset of $S$ and let $\nu$ be a compatible label $\nu \overset{A}{\sim} (\xi, m)$. An interval of $\nu$ is a maximal connected component $I = (u, v)$ of some $S^i_\beta$ on which $\nu_i$ is constant. A path $P$ of $(\nu, \xi, m)$ is an ordered sequence $I_1, I_2, \dots, I_n$, where $I_l$ is either an interval or a ghost site $g$ and,

(1) If $I_l = (u_l, v_l)$ and $I_{l+1} = (u_{l+1}, v_{l+1})$ then either $v_l = u_{l+1}$ or $v_l = (i, t)$, $u_{l+1} = (j, t)$ and $t$ is an arrival time of $\xi_{ij}$.
(2) If $I_l = (u_l, v_l)$, $v_l = (j, t)$ and $I_{l+1} = g$, then $t$ is an arrival time of $\xi_{j,g}$.
(3) If $I_l = g$, $I_{l+1} = (u_{l+1}, v_{l+1})$ and $u_{l+1} = (j, t)$, then $t$ is an arrival time of $\xi_{j,g}$.
(4) There cannot be two successive ghost sites $g$ in a path.

A path $P = \{I_1, \dots, I_n\}$ is said to be ground if it does not contain $g$, except possibly at the last step $I_n$. Finally, a path $P$ is said to be left if all the ground intervals of $P$ bear $\nu$-label $l$. Let us define the set $\{u \longleftrightarrow v\}$ to be the collection of triples $(\xi, m, \nu)$ so that there exists a left path with endpoints at $u$ and $v$, and the set $\{u \overset{t}{\longleftrightarrow} v\}$ to be the collection of triples $(\xi, m, \nu)$ so that there exists a ground left path from $u$ to $v$. Note that ground left paths are self-avoiding and that there is a unique ground left path from $u$ to $g$ whenever $\nu \overset{u}{\sim} (\xi, m)$. We shall denote this path by $C_l(u, g)$ and we shall use $\check{C}_l(u, g)$ for the union of its ground intervals, that is for $C_l(u, g) \setminus g$.

Consider now two finite (and not necessarily disjoint) subsets $A, B \subset S$ and two copies $(\xi^1, m^1, \nu^1)$ and $(\xi^2, m^2, \nu^2)$ such that $\nu^1 \overset{A}{\sim} (\xi^1, m^1)$ and $\nu^2 \overset{B}{\sim} (\xi^2, m^2)$. We shall denote the combined processes of flips and marks as $(\eta, n) = (\xi^1 \cup \xi^2, m^1 \cup m^2)$, where the union is understood in the coordinate-wise sense, e.g. $\eta_{ij} = \xi^1_{ij} \cup \xi^2_{ij}$. In all considerations below the processes $(\xi^1, m^1)$ and $(\xi^2, m^2)$ are independent. Consequently, $(\eta, n)$ is just a collection of independent Poisson processes of arrivals with double intensities. Furthermore, given a realization $(\eta, n)$, the conditional distribution of $(\xi^1, m^1) \subseteq (\eta, n)$ is uniform with point mass

$$\left(\frac{1}{2}\right)^{\sum_e \eta_e[S_\beta] + \sum_i n_i[S_\beta]} = \left(\frac{1}{2}\right)^{\#(\eta) + \#(n)}. \tag{2.6}$$
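The uniform conditional law in (2.6) amounts to assigning each arrival of the combined process to the first or the second replica independently with probability 1/2. A minimal enumeration, assuming a finite list of distinct arrival times and using the hypothetical helper names `splittings` and `point_mass`, could look as follows.

```python
from itertools import product

def splittings(points):
    """Yield every way to split the arrivals of the combined process into
    (replica-1 arrivals, replica-2 arrivals)."""
    for assignment in product((1, 2), repeat=len(points)):
        first = tuple(p for p, k in zip(points, assignment) if k == 1)
        second = tuple(p for p, k in zip(points, assignment) if k == 2)
        yield first, second

def point_mass(points):
    """Conditional weight (1/2)^{#points} of each individual splitting."""
    return 0.5 ** len(points)

combined = (0.1, 0.4, 0.9)          # arrivals of flips and marks, pooled
all_splits = list(splittings(combined))
total_mass = point_mass(combined) * len(all_splits)
```

With three pooled arrivals there are $2^3$ splittings, each of weight $(1/2)^3$, so the weights sum to one, as they must for a conditional distribution.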
Note that given η and the locations of the discontinuities of (ν 1 , ν 2 ), the arrivals of (ξ 1 , ξ 2 ) may be recovered. However it is not usually possible to reconstruct (m1 , m2 ) from n even knowing the values of (ν 1 , ν 2 ). Let us introduce geometric notions for pairs of configurations, extending our previous definitions. It will be convenient to make definitions relative to a fixed finite subset G ⊂ S. An interval I of (ν 1 , ν 2 ) is a maximal connected component I = (u, v) of some Siβ , on which both labels ν 1 and ν 2 are constant and which does not contain points from G. A path P of (ν 1 , ν 2 , η, n) is an ordered sequence I1 , I2 , . . . , In , where Il is either an interval of (ν 1 , ν 2 ) or a ghost site g and,
Fig. 2. Special set $G = \{(1, t), (4, s)\}$: Blocked intervals for two replicas $(\xi^1, m^1)$, $(\xi^2, m^2)$ and two compatible labels $\nu^1 \overset{(1,t)}{\sim} (\xi^1, m^1)$, $\nu^2 \overset{(4,s)}{\sim} (\xi^2, m^2)$
(1) If $I_l = (u_l, v_l)$ and $I_{l+1} = (u_{l+1}, v_{l+1})$, then either $v_l = u_{l+1} = (i, t)$, and then either $(i, t) \in G$ or $t$ is an arrival time of $\eta_{i,g}$; or, otherwise, $v_l = (i, t)$, $u_{l+1} = (j, t)$ and $t$ is an arrival time of $\eta_{ij}$.
(2) If $I_l = (u_l, v_l)$, $v_l = (j, t)$ and $I_{l+1} = g$, then $t$ is an arrival time of $\eta_{j,g}$.
(3) If $I_l = g$, $I_{l+1} = (u_{l+1}, v_{l+1})$ and $u_{l+1} = (j, t)$, then $t$ is an arrival time of $\eta_{j,g}$.
(4) There cannot be two successive ghost sites $g$ in a path.
(5) All ground intervals $I_l \subset S$ are disjoint.

As before, a path $P = \{I_1, \dots, I_n\}$ is said to be ground if it does not contain $g$, with a possible exception of the last step $I_n$. A path $P = \{I_1, I_2, \dots, I_n\}$ is said to be a loop if either $I_1 = I_n = g$ or $v_n = u_1$. It is useful to keep in mind that the above notions do not depend on the values of compatible labels $(\nu^1, \nu^2)$ or arrivals of marks $n$. Rather, they only depend on the arrivals of flips $\eta$. On the other hand, we also consider an important notion which very much depends on the pair of configurations: Let us say that the interval $I$ is blocked if (see Fig. 2) both $\nu^1$ and $\nu^2$ are equal to $r$ on $I$ and, in addition, $n(I) > 0$. A path $P = \{I_1, \dots, I_n\}$ is said to be unblocked if it does not contain blocked intervals. We shall say that $u$ and $v$ are $*$-connected, $\{u \overset{*}{\longleftrightarrow} v\}$, if, for $G = \{u, v\}$, there exists an unblocked path with end-points at $u$ and $v$, and we shall write $\{u \overset{*t}{\longleftrightarrow} v\}$ whenever there exists a ground unblocked path from $u$ to $v$.

Basic Transformation. Let $P = (I_1, \dots, I_n)$ be an unblocked path of $(\nu^1, \nu^2, \eta, n)$ from $u$ to $v$. Obviously the labels $\nu^1$ and $\nu^2$ unambiguously define the splitting $\eta = \xi^1 \cup \xi^2$. Moreover, since $P$ is unblocked, $\nu^1$ and $\nu^2$ unambiguously define the splitting of marks $n = m^1 \cup m^2$ along $P$. Make the following transformation of labels and marks on each of the ground intervals $I$ of $P$:
(1) If the $(\nu^1, \nu^2)$ label of $I$ is $(l, r)$, then flip it to $(r, l)$ and transfer all marks accordingly: set $m^1(I) = m^2(I)$ and set $m^2(I) = 0$. Perform the analogous procedure if the label is $(r, l)$.
(2) If the label is $(l, l)$ then flip it to $(r, r)$. Accordingly, if the label is $(r, r)$, then flip it to $(l, l)$. Note that in the latter case, since we are moving along an unblocked path, $n(I)$ has to be equal to zero, and no incompatibility arises.
(3) Adjust $\xi^1$ and $\xi^2$ accordingly; those are, of course, completely defined by the labels (flips of the labels, to be precise).

The above transformation, let us call it $\Phi_P$, defines a map

$$\left( (\nu^1, \xi^1, m^1), (\nu^2, \xi^2, m^2) \right) \mapsto \left( (\tilde\nu^1, \tilde\xi^1, \tilde m^1), (\tilde\nu^2, \tilde\xi^2, \tilde m^2) \right).$$

The map $\Phi_P$ enjoys the following set of properties:

(1) It is invertible: Indeed just apply $\Phi_P$ once more to recover the original data.
(2) It does not change $\nu^1$ and $\nu^2$ labels and $m^1$, $m^2$-marks on intervals which do not belong to $P$. In addition, the original and modified configurations have the same set of intervals (defined by $\eta$, $u$ and $v$), and $\Phi_P$ does not change the blocked/unblocked status of any of those.
(3) If $\nu^1 \overset{A}{\sim} (\xi^1, m^1)$ and $\nu^2 \overset{B}{\sim} (\xi^2, m^2)$, then

$$\tilde\nu^1 \overset{A \Delta \{u,v\}}{\sim} (\tilde\xi^1, \tilde m^1) \quad \text{and} \quad \tilde\nu^2 \overset{B \Delta \{u,v\}}{\sim} (\tilde\xi^2, \tilde m^2). \tag{2.7}$$

(4) It is measure preserving: In view of (2.6), $(\xi^1, m^1)$ and $(\tilde\xi^1, \tilde m^1)$ have the same conditional weights.
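The action of the basic transformation on a single ground interval, and the fact that it is its own inverse, can be illustrated by a small sketch. The representation below is an assumption made for illustration: an interval is described by its pair of labels and by the tuples of mark arrival times of each replica on that interval, and the function name `switch_interval` is hypothetical. Since a replica labelled $l$ on the interval carries no marks there, all three label cases of the transformation reduce to flipping both labels and exchanging the two mark tuples.

```python
FLIP = {"r": "l", "l": "r"}

def switch_interval(label1, label2, marks1, marks2):
    """Apply the switching rules to one unblocked ground interval.

    label1, label2 -- labels ('r' or 'l') of the two replicas on the interval
    marks1, marks2 -- tuples of mark arrival times of each replica on the interval
    """
    if (label1 == "l" and marks1) or (label2 == "l" and marks2):
        raise ValueError("incompatible input: a label must equal 'r' at every mark")
    if label1 == "r" and label2 == "r" and (marks1 or marks2):
        raise ValueError("blocked interval: both labels are 'r' and marks are present")
    # Flip both labels and hand the marks over to the other replica.
    return FLIP[label1], FLIP[label2], tuple(marks2), tuple(marks1)

state = ("l", "r", (), (0.2, 0.5))
switched = switch_interval(*state)
restored = switch_interval(*switched)
```

Here `switched` carries labels `("r", "l")` with the two marks moved to the first replica, and applying the function a second time returns the original state, in line with the invertibility property (1).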
Minimal paths. Most of the transformations we are going to perform will be along minimal unblocked paths, often satisfying additional geometric constraints. Let us, therefore, define what we mean by minimal. First of all, given an unblocked path $P = (I_1, \dots, I_n)$ define its length as $|P| = \sum_{l=1}^{n} |I_l|$, where $|I|$ is the Euclidean length if $I$ is a ground interval, and, by definition, $|g| = 0$. Consider now two replicas $(\xi^1, m^1)$, $(\xi^2, m^2)$ and a pair of compatible labels $\nu^1, \nu^2$. Let $u, v \in S \cup g$ and assume that there are unblocked paths from $u$ to $v$. Then the minimal path $C^*(u, v)$ satisfies,

$$|C^*(u, v)| \le |P| \quad \text{for any unblocked path } P \text{ from } u \text{ to } v. \tag{2.8}$$

It is easy to see that in general (2.8) alone does not define $C^*(u, v)$ uniquely, and one needs to impose an additional rule in order to choose the minimal path from a set of paths with the same minimal length. For example the following rule will do: Write a coarse grained description of $P(u, v) = R_1, \dots, R_m$, where $R_l$ is either a ghost site $g$ or a maximal collection of successive ground intervals of $P$ on some $S^i_\beta$. Then for two unblocked paths $P = (R_1, \dots, R_m)$ and $P' = (R'_1, \dots, R'_k)$ we shall say that $P \prec P'$ if either $|P| < |P'|$, or if the lengths are equal, there exists $l$ such that

$$|R_i| = |R'_i| \ \text{for } i = 1, \dots, l - 1, \quad \text{but} \quad |R_l| > |R'_l|. \tag{2.9}$$

Then $C^*(u, v)$ is unambiguously defined as the unique unblocked path from $u$ to $v$ which is $\prec$-less than any other unblocked path from $u$ to $v$. In other words, the minimal path, as we define it, is the most conservative of all the paths of the same minimal length: it tries to stay as much as possible on each subsequent spatial circle $S_\beta$. The important feature of the path transformation which was introduced above is (see Fig. 3): If $C^*(u, v)$ is the minimal path, then it remains so after $\Phi_{C^*(u,v)}$ is performed. As a result, transformations along minimal paths are well defined and invertible.
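The tie-breaking rule (2.9) is a comparison of coarse-grained length sequences, which the following sketch spells out. It assumes that each path is summarised by the list of lengths $|R_1|, |R_2|, \dots$ of its coarse-grained pieces, with ghost sites contributing length zero; the function name `precedes` and the tolerance parameter are illustrative choices, not part of the construction above.

```python
def precedes(p, q, tol=1e-12):
    """Return True if the path with segment lengths p is strictly smaller than
    the path with segment lengths q in the order described by (2.8)-(2.9)."""
    total_p, total_q = sum(p), sum(q)
    if abs(total_p - total_q) > tol:
        return total_p < total_q          # strictly shorter paths come first
    for a, b in zip(p, q):
        if abs(a - b) > tol:
            return a > b                  # equal length: stay longer on the earlier circle
    return False                          # indistinguishable at this level of description

long_first = [1.0, 0.5]
short_first = [0.5, 1.0]
```

In this example both paths have total length 1.5, and `long_first` precedes `short_first` because it spends more length on its first coarse-grained piece, which is the "most conservative" choice described above.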
Fig. 3. Two replicas $(\xi^1, m^1)$, $(\xi^2, m^2)$ and two compatible labels $\nu^1 \overset{u}{\sim} (\xi^1, m^1)$, $\nu^2 \overset{v}{\sim} (\xi^2, m^2)$, where $u = (1, t)$ and $v = (4, s)$. (a) Minimal unblocked path $C^*(u, g)$ from $u$ to the ghost site $g$. (b) Basic transformation: New labels $\tilde\nu^1 \sim (\tilde\xi^1, \tilde m^1)$ and $\tilde\nu^2 \overset{u,v}{\sim} (\tilde\xi^2, \tilde m^2)$. Labels which are switched along the minimal path are shaded. Note that the flips and the marks are switched accordingly
3. Switching Lemmas and Related Correlation Inequalities

Recall that $(\xi^1, m^1)$ and $(\xi^2, m^2)$ are independent copies of our Poisson processes of flips and marks, and that we use $\eta = \xi^1 \cup \xi^2$, $n = m^1 \cup m^2$ for the combined processes. Let $\mathbb{E}$ denote the expectation with respect to the two independent replicas $(\xi^1, m^1)$ and $(\xi^2, m^2)$ of Poisson processes of flips and marks. In this section, we give exact formulae for the truncated correlations appearing in (1.2) and discuss the term $\partial M/\partial\rho$ which appears in Theorem B.
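Representation of $\langle \hat\sigma^z_u ; \hat\sigma^z_v \rangle$. In view of (2.6) we can record (2.2) in terms of two replicas as

$$ \langle\hat\sigma^z_u\rangle \langle\hat\sigma^z_v\rangle = \frac{1}{Z^2} \int P(d\eta, dn) \left(\frac{1}{2}\right)^{\#(\eta)+\#(n)} \sum_{\substack{\xi^1\cup\xi^2=\eta \\ m^1\cup m^2=n}} \ \sum_{\substack{\nu^1 \overset{u}{\sim} (\xi^1, m^1) \\ \nu^2 \overset{v}{\sim} (\xi^2, m^2)}} 1. \tag{3.1} $$

Similarly, we can record (2.1) and (2.3) as

$$ \langle\hat\sigma^z_u \hat\sigma^z_v\rangle = \langle\hat\sigma^z_u \hat\sigma^z_v\rangle \frac{Z}{Z} = \frac{1}{Z^2} \int P(d\eta, dn) \left(\frac{1}{2}\right)^{\#(\eta)+\#(n)} \sum_{\substack{\xi^1\cup\xi^2=\eta \\ m^1\cup m^2=n}} \ \sum_{\substack{\nu^1 \sim (\xi^1, m^1) \\ \nu^2 \overset{u,v}{\sim} (\xi^2, m^2)}} 1. \tag{3.2} $$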
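One way to read the factor $(1/2)^{\#(\eta)+\#(n)}$ in (3.1) and (3.2) is as the probability of one particular assignment of the arrivals of the combined processes to the two replicas. Under that reading (an interpretation assumed here, not spelled out in this section), the weights of all splittings of a fixed realization sum to one. The sketch below checks this by brute force; `splittings` and `total_weight` are hypothetical helper names and the arrival times are made-up sample data.

```python
from itertools import product


def splittings(points):
    """Yield all pairs (first, second) with first ∪ second = points.

    Each point is assigned to replica 1 or replica 2 (illustrative encoding).
    """
    for assignment in product((1, 2), repeat=len(points)):
        first = [p for p, a in zip(points, assignment) if a == 1]
        second = [p for p, a in zip(points, assignment) if a == 2]
        yield first, second


def total_weight(flips, marks):
    """Sum of (1/2)^(#flips + #marks) over all splittings of flips and marks."""
    weight = 0.5 ** (len(flips) + len(marks))
    return sum(weight for _ in splittings(flips) for _ in splittings(marks))


# Made-up arrival times of the combined processes eta (flips) and n (marks).
sample_flips = [0.1, 0.7, 1.3]
sample_marks = [0.4, 0.9]
normalisation = total_weight(sample_flips, sample_marks)
```

With three flips and two marks there are $2^5$ splittings, each of weight $2^{-5}$.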
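Let us have a closer look at (3.1). The constraint $\nu^1 \overset{u}{\sim} (\xi^1, m^1)$ implies that there is a path $P$ from $u$ to $g$ such that $\nu^1 \equiv l$ on $P$. In particular this path $P$ must be unblocked. An analogous statement also applies with respect to $v$ in the second replica. Therefore, one can rewrite (3.1) as

$$ \langle\hat\sigma^z_u\rangle \langle\hat\sigma^z_v\rangle = \frac{1}{Z^2} \int P(d\eta, dn) \left(\frac{1}{2}\right)^{\#(\eta)+\#(n)} \sum_{\substack{\xi^1\cup\xi^2=\eta \\ m^1\cup m^2=n}} \ \sum_{\substack{\nu^1 \overset{u}{\sim} (\xi^1, m^1) \\ \nu^2 \overset{v}{\sim} (\xi^2, m^2)}} \mathbf{1}_{\{u \overset{*}{\longleftrightarrow} g\}} \, \mathbf{1}_{\{v \overset{*}{\longleftrightarrow} g\}}. \tag{3.3} $$

Similarly, one can rewrite (3.2) as

$$ \langle\hat\sigma^z_u \hat\sigma^z_v\rangle = \frac{1}{Z^2} \int P(d\eta, dn) \left(\frac{1}{2}\right)^{\#(\eta)+\#(n)} \sum_{\substack{\xi^1\cup\xi^2=\eta \\ m^1\cup m^2=n}} \ \sum_{\substack{\nu^1 \sim (\xi^1, m^1) \\ \nu^2 \overset{u,v}{\sim} (\xi^2, m^2)}} \mathbf{1}_{\{u \overset{*}{\longleftrightarrow} v\}}. \tag{3.4} $$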
Let us fix a realization of $(\eta, n)$. Define $A^g_{u,v} = A^g_{u,v}(\eta, n)$ to be the set of pairs of objects $(\nu^1, \xi^1, m^1)$, $(\nu^2, \xi^2, m^2)$ which contribute to the double sum on the right-hand side of (3.3). Similarly let $A_{u,v}$ be the set of pairs of objects (currents and labels) which contribute to the double sum on the right-hand side of (3.4). Each of the objects in $A^g_{u,v}$ contains an unblocked path, and hence the minimal unblocked path $C^*(u, g)$ from $u$ to $g$. We claim that the Basic Transformation along $C^*(u, g)$, viewed as a map $A^g_{u,v} \to A_{u,v}$, is a measure preserving injection. This follows immediately from the properties of basic transformations along minimal paths. However, this map is not onto: any couple of objects $(\nu^1, \xi^1, m^1)$, $(\nu^2, \xi^2, m^2)$ in the image of $A^g_{u,v}$ necessarily contains an unblocked path from $u$ to $g$. We have proved:

Theorem 3.1. Truncated z-correlation functions satisfy the following version of the Switching Lemma:

$$ \langle\hat\sigma^z_u ; \hat\sigma^z_v\rangle = \frac{1}{Z^2} \int P(d\eta, dn) \left(\frac{1}{2}\right)^{\#(\eta)+\#(n)} \sum_{\substack{\xi^1\cup\xi^2=\eta \\ m^1\cup m^2=n}} \ \sum_{\substack{\nu^1 \sim (\xi^1, m^1) \\ \nu^2 \overset{u,v}{\sim} (\xi^2, m^2)}} \mathbf{1}_{\{u \overset{*}{\nleftrightarrow} g\}} = \frac{1}{Z^2} \mathbb{E} \Bigg( \sum_{\substack{\nu^1 \sim (\xi^1, m^1) \\ \nu^2 \overset{u,v}{\sim} (\xi^2, m^2)}} \mathbf{1}_{\{u \overset{*}{\nleftrightarrow} g\}} \Bigg). \tag{3.5} $$
ˆ vx . Consider two independent replicas (ξ 1 , m1 ), (ξ 2 , m2 ) and ˆ ux ; Representation of two labels ν 1 ∼ (ξ 1 , m1 ) and ν 2 ∼ (ξ 2 , m2 ). Let us say that a couple of labels (ν 1 , ν 2 ) ∈ 1 2 1 2 [(r, 1r )u2, (r, l)v ] if ν (u) = r = ν (u), whereas ν (v) =r and ν (v) = l. The events, (ν , ν ) ∈ [(r, l)u , (r, l)v ] , (ν 1 , ν 2 ) ∈ [(l, l)u , (r, l)v ] etc. (all together 16 events) are defined in a completely similar fashion. In terms of two replicas, the representation
formulas (2.4) read: 1 ˆ vx = ˆ ux E Z2
1I{(ν 1 ,ν 2 )∈[(r,r )u ,(r,r )v ]} + 1I{(ν 1 ,ν 2 )∈[(r,r )u ,(r,l)v ]}
ν 1 ∼(ξ 1 ,m 1 ) ν 2 ∼(ξ 2 ,m2 )
+ 1I{(ν 1 ,ν 2 )∈[(r,l)u ,(r,r )v ]} + 1I{(ν 1 ,ν 2 )∈[(r,l)u ,(r,l)v ]} . Similarly, 1 ˆ ux ˆ vx = E Z2 1 1
(3.6)
1I{(ν 1 ,ν 2 )∈[(r,r )u ,(r,r )v ]} + 1I{(ν 1 ,ν 2 )∈[(r,l)u ,(r,r )v ]}
ν ∼(ξ ,m 1 )
ν 2 ∼(ξ 2 ,m2 )
+ 1I{(ν 1 ,ν 2 )∈[(r,r )u ,(l,r )v ]} + 1I{(ν 1 ,ν 2 )∈[(r,l)u ,(l,r )v ]} .
(3.7)
Evidently, E
1I{(ν 1 ,ν 2 )∈[(r,r )u ,(r,l)v ]} = E
ν 1 ∼(ξ 1 ,m 1 ) ν 2 ∼(ξ 2 ,m2 )
1I{(ν 1 ,ν 2 )∈[(r,r )u ,(l,r )v ]} .
ν 1 ∼(ξ 1 ,m 1 ) ν 2 ∼(ξ 2 ,m2 )
Consequently, we arrive to the following representation for the truncated two point function: 1 ˆ ux ; ˆ vx = 1 I E I 1 ,ν 2 )∈[(r,l) ,(r,l) ]} − 1 1 ,ν 2 )∈[(r,l) ,(l,r ) ]} . (3.8) (ν (ν { { u v u v Z2 1 1 1 ν ∼(ξ ,m )
ν 2 ∼(ξ 2 ,m2 )
At this stage we proceed much along the lines of our proof of Theorem 3.1. Fix a realization of $(\eta, n)$ and let $B_+(\eta, n)$ be the set of pairs of objects $(\nu^1, \xi^1, m^1)$, $(\nu^2, \xi^2, m^2)$ which contribute to the sum

$$ \sum_{\substack{\xi^1\cup\xi^2=\eta \\ m^1\cup m^2=n}} \ \sum_{\substack{\nu^1 \sim (\xi^1, m^1) \\ \nu^2 \sim (\xi^2, m^2)}} \mathbf{1}_{\{(\nu^1, \nu^2) \in [(r,l)_u, (r,l)_v]\}}. $$

Similarly, let $B_-(\eta, n)$ be the set of pairs of objects $(\nu^1, \xi^1, m^1)$, $(\nu^2, \xi^2, m^2)$ which contribute to the sum

$$ \sum_{\substack{\xi^1\cup\xi^2=\eta \\ m^1\cup m^2=n}} \ \sum_{\substack{\nu^1 \sim (\xi^1, m^1) \\ \nu^2 \sim (\xi^2, m^2)}} \mathbf{1}_{\{(\nu^1, \nu^2) \in [(r,l)_u, (l,r)_v]\}}. $$

An injective map from $B_-(\eta, n)$ to $B_+(\eta, n)$ is constructed as follows: Any pair $(\nu^1, \xi^1, m^1)$, $(\nu^2, \xi^2, m^2) \in B_-(\eta, n)$ contains an unblocked loop $L$ from $v$ to $v$ such that $u \notin L$. Indeed, such a loop may be constructed with $\nu^1 \equiv l$. Now just choose the minimal such loop (in the sense discussed above) and perform on this minimal loop the very same surgery as in the Basic Transformation. Again, the property that the loop is minimal is not changed under the surgery and hence the map is invertible. On the other hand, the image of $B_-(\eta, n)$ is a subset of $B_+(\eta, n)$.
Geometrically, it is evident that B+\B− is characterized by the following condi tion: A pair (ν 1 , ξ 1 , m1 ), (ν 2 , ξ 2 , m2 ) from B+ belongs to B+ \B− if and only if any unblocked loop containing v also contains u. In this case, let us say that u is loop-pivotal for v. We conclude: Theorem 3.2. Truncated x-correlation functions satisfy the following version of the Switching Lemma: 1 ˆ ux ; ˆ vx = E 1I{(ν 1 ,ν 2 )∈[(r,l)u ,(r,l)v ]} 1I{u is loop pivotal for v }. (3.9) Z2 1 1 1 ν ∼(ξ ,m )
ν 2 ∼(ξ 2 ,m2 )
Representation of cross-correlations. As before, let E denote the expectation with respect to two independent replicas of Poisson processes of flips and marks; ξ 1 , m1 and ξ 2 , m2 . With this notation we have (from (2.2), the first of (2.4) and (2.5)), 1 ˆ vx = σˆ uz E 1I{ν 1 (v)=r } , (3.10) 2 Z u ν 1 ∼(ξ 1 ,m 1 )
ν 2 ∼(ξ 2 ,m2 )
and, accordingly,
x 1 ˆv = E σˆ uz Z2
1I{ν 2 (v)=r } .
(3.11)
u ν 1 ∼(ξ 1 ,m 1 ) ν 2 ∼(ξ 2 ,m2 )
Fix a realization of $(\eta, n)$ and let $D_+(\eta, n)$ be the set of pairs of objects $(\nu^1, \xi^1, m^1)$, $(\nu^2, \xi^2, m^2)$ which contribute to the sum

$$ \sum_{\substack{\xi^1\cup\xi^2=\eta \\ m^1\cup m^2=n}} \ \sum_{\substack{\nu^1 \overset{u}{\sim} (\xi^1, m^1) \\ \nu^2 \sim (\xi^2, m^2)}} \mathbf{1}_{\{\nu^2(v) = r\}}. $$

Similarly, let $D_-(\eta, n)$ be the set of pairs of objects $(\nu^1, \xi^1, m^1)$, $(\nu^2, \xi^2, m^2)$ which contribute to the sum

$$ \sum_{\substack{\xi^1\cup\xi^2=\eta \\ m^1\cup m^2=n}} \ \sum_{\substack{\nu^1 \overset{u}{\sim} (\xi^1, m^1) \\ \nu^2 \sim (\xi^2, m^2)}} \mathbf{1}_{\{\nu^1(v) = r\}}. $$

Note now that any pair of objects $(\nu^1, \xi^1, m^1)$, $(\nu^2, \xi^2, m^2) \in D_-$ contains an unblocked path (e.g. with $\nu^1 \equiv l$) and hence the minimal unblocked path $C^{*,v}(u, g)$ from $u$ to $g$ which avoids $v$. An injective map from $D_-(\eta, n)$ to $D_+(\eta, n)$ is then constructed as follows:

(1) Perform the Basic Transformation along the minimal path $C^{*,v}(u, g)$.
(2) Using the symmetry of replicas, rename the resulting $(\nu^1, \xi^1, m^1) \leftrightarrow (\nu^2, \xi^2, m^2)$.
It is evident that $D_+ \setminus D_-$ is characterized by the following condition: A pair of objects $(\nu^1, \xi^1, m^1)$, $(\nu^2, \xi^2, m^2)$ from $D_+$ belongs to $D_+ \setminus D_-$ if and only if any unblocked path from $u$ to the ghost site $g$ contains $v$. Let us say that $v$ is pivotal for $\{u \overset{*}{\longleftrightarrow} g\}$ if the latter condition holds. We have proved: Theorem 3.3. Truncated cross-correlation functions satisfy the following version of the Switching Lemma:
1 ˆ vx = − σˆ uz ; E Z2
1I{ν 2 (v)=r } 1I
u ν 1 ∼(ξ 1 ,m 1 ) 2 ν ∼(ξ 2 ,m2 )
∗
v is pivotal for u ←→ g
.
(3.12)
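The pivotality condition in (3.12) has a simple combinatorial analogue on a finite graph: a vertex is pivotal for a connection if the connection exists and is destroyed once that vertex is deleted. The sketch below illustrates only this discrete analogue; the adjacency dictionary, the vertex names and the helpers `connected` and `is_pivotal` are hypothetical, and the requirement that a connection exists at all is an assumption made for the toy setting.

```python
from collections import deque


def connected(adj, src, dst, removed=frozenset()):
    """Breadth-first search: is dst reachable from src avoiding `removed`?"""
    if src in removed or dst in removed:
        return False
    seen = {src}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        if x == dst:
            return True
        for y in adj.get(x, ()):
            if y not in seen and y not in removed:
                seen.add(y)
                queue.append(y)
    return False


def is_pivotal(adj, v, u, g):
    """v is pivotal for {u <-> g}: u reaches g, but every path passes through v."""
    return connected(adj, u, g) and not connected(adj, u, g, removed={v})


# Toy graph: two routes from u merge at v, which is the only gate to g.
toy = {
    "u": ["a", "b"],
    "a": ["u", "v"],
    "b": ["u", "v"],
    "v": ["a", "b", "g"],
    "g": ["v"],
}
```

In the toy graph, `v` is pivotal for the connection between `u` and `g`, whereas `a` is not, since the route through `b` survives its removal.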
Note that the following (straightforward) generalization of Theorem 3.3 holds. Let G = {v1 , . . . , vl , vl+1 , . . . , vl+k } be a finite subset of S which is time-ordered in the following sense: The coordinates vq = (i q , tq ) satisfy tq < t p whenever q < p. Let u= (i, t) be ˆx such that tl < t < tl+1 . Then, the truncated cross-correlation σˆ uz ; l+k 1 vq is defined as l+k l l+k l+k z z x x z x x ˆv = ˆ v σˆ u ˆ v − σˆ u ˆv . σˆ u ; q q q q 1
1
1
l+1
We have, σˆ uz ;
l+k
ˆ vx = − q
1
1 E Z2
l+k
u ν 1 ∼(ξ 1 ,m 1 ) ν 2 ∼(ξ 2 ,m2 )
1I{ν 2 (vq )=r } 1I
1
∗
G is pivotal for u ←→ g
.
(3.13)
Further correlation inequalities. In the classical case (see e.g. [1,3,4,16]) random current representations of correlations generate a variety of correlation inequalities. In fact, the morphology in the quantum case is even richer and this issue will be systematically addressed elsewhere. Here we shall focus only on such inequalities which are needed for proving our main results. Partial derivatives with respect to the parameters (h, λ, ρ) of the magnetization M = σˆ 0z are related to truncated correlations in the following way: Fix the origin 0 of T N and let 0 ∈ S be the point with the space time coordinates 0 = (0, 0). In view of (space and time) translation invariance it is of course inessential how we fix 0. Then, β ∂M z = σˆ 0z ; σˆ (i,t) dt, ∂h 0 i∈T N
and
∂M = ∂λ
i∈T N
β 0
∂M = ∂ρ
ˆ x dt. σˆ 0z ; (i,t)
(i, j):Ji− j >0
Ji j 2
β 0
z σˆ 0z ; σˆ (i,t) σˆ (zj,t) dt, (3.14)
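Random current representations for the z and cross-correlations were already given above. Let us therefore turn to the $\langle\hat\sigma^z_0 ; \hat\sigma^z_{(i,t)} \hat\sigma^z_{(j,t)}\rangle$ terms. In order to facilitate the notation, set $w = (i, t)$ and $z = (j, t)$. The random current representation of

$$ \langle\hat\sigma^z_0 \hat\sigma^z_w \hat\sigma^z_z\rangle = \langle\hat\sigma^z_0 \hat\sigma^z_w \hat\sigma^z_z\rangle \frac{Z}{Z} = \frac{1}{Z^2} \mathbb{E} \Bigg( \sum_{\substack{\nu^1 \overset{\{0,w,z\}}{\sim} (\xi^1, m^1) \\ \nu^2 \sim (\xi^2, m^2)}} 1 \Bigg) $$

is straightforward. Consider now

$$ \langle\hat\sigma^z_w \hat\sigma^z_z\rangle \langle\hat\sigma^z_0\rangle = \frac{1}{Z^2} \mathbb{E} \Bigg( \sum_{\substack{\nu^1 \overset{\{w,z\}}{\sim} (\xi^1, m^1) \\ \nu^2 \overset{0}{\sim} (\xi^2, m^2)}} 1 \Bigg). $$

Each pair of triples $(\nu^1, \xi^1, m^1)$, $(\nu^2, \xi^2, m^2)$ which contributes to the latter integral contains an unblocked path from $0$ to $g$. Performing our Basic Transformation along the minimal such path, we infer

$$ \langle\hat\sigma^z_w \hat\sigma^z_z\rangle \langle\hat\sigma^z_0\rangle = \frac{1}{Z^2} \mathbb{E} \Bigg( \sum_{\substack{\nu^1 \overset{\{0,w,z\}}{\sim} (\xi^1, m^1) \\ \nu^2 \sim (\xi^2, m^2)}} \mathbf{1}_{\{0 \overset{*}{\longleftrightarrow} g\}} \Bigg). $$

Consequently,

$$ \langle\hat\sigma^z_0 ; \hat\sigma^z_w \hat\sigma^z_z\rangle = \frac{1}{Z^2} \mathbb{E} \Bigg( \sum_{\substack{\nu^1 \overset{\{0,w,z\}}{\sim} (\xi^1, m^1) \\ \nu^2 \sim (\xi^2, m^2)}} \mathbf{1}_{\{0 \overset{*}{\nleftrightarrow} g\}} \Bigg). \tag{3.15} $$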
In particular, $\partial M/\partial\rho \ge 0$. One can readily generalize the latter conclusion to a system with inhomogeneous flip rates in the following fashion: Let $\rho_e : S_\beta \to \mathbb{R}_+$; $e \in E^0$ be a collection of non-negative (and, say, piece-wise smooth) functions. Let us view the $\rho_e$'s as time-inhomogeneous rates of arrivals of (ground) flips corresponding to the endpoints of $e$. In this way, we may introduce an analog of (2.2), defining z-expectation values $M_u(\rho(\cdot)) = M_u(h, \rho(\cdot), \lambda)$; $u \in S$, via the right-hand side of (2.2) but using the inhomogeneous arrival rates $(\rho_e(t))_{e \in E^0}$. Then, for every $u \in S$, the functional $M_u(\cdot)$ is non-decreasing in $\rho$, that is

$$ \forall e \ \ \rho_e \le \rho'_e \ \ t\text{-a.e.} \quad \Longrightarrow \quad \forall u \ \ M_u(\rho) \le M_u(\rho'). \tag{3.16} $$
It is worth noting that this may be seen as a special case of Griffiths' second inequality [12]. Obviously, we may use the random current representation to introduce time-inhomogeneous versions of all correlations we have already encountered in this paper. With that in mind, the following combination of (3.16) with (3.13) will be useful in the sequel: Let $A$ be a disjoint union,

$$ A = I_1 \cup \dots \cup I_n \cup S^{i_1}_\beta \cup \dots \cup S^{i_m}_\beta, $$
where Il -s are ground segments of the form Il = (wl , zl ) with both wl and zl lying on some circle Siβ (and being time ordered to avoid notational ambiguities). Define Ac = S\A. Finally define the reduced arrival rates ρ A, ⎧ ⎪ ⎨ ρ, if the corresponding flip is either between two points in A ρeA(t) = (3.17) or between two points in Ac ⎪ ⎩ 0, otherwise. In other words we suppress arrivals of flips between A and Ac . Let u ∈ Ac and let v1 , . . . , vl , u, vl+1 , . . . , v2n be the time ordering of the set {w1 , . . . , zn , u}. Then, exactly as in (3.13), l 2n 2n 2n
ˆ vx σˆ uz ˆ vx (ρ A) ≤ σˆ uz (ρ A) ˆ vx (ρ A) ≤ M(ρ) ˆ vx (ρ A), q q q q 1
1
l+1
1
(3.18)

where the expectations are understood in terms of the corresponding (generalized to time-inhomogeneous rates) random current representations, and the second inequality follows from (3.16). In view of how the rates $\rho^A$ were defined, fixing labels at the end-points of $I_1, \dots, I_n$ completely decouples the two regions $A$ and $A^c$. As a result, (3.18) implies the following inequality:

$$ \mathbb{E}_{\rho^A} \Bigg( \sum_{\nu|_{A^c} \overset{u}{\sim} (\xi, m)} \ \prod_{q=1}^{2n} \mathbf{1}_{\{\nu(v_q) = r\}} \Bigg) \le M(\rho) \, \mathbb{E}_{\rho^A} \Bigg( \sum_{\nu|_{A^c} \sim (\xi, m)} \ \prod_{q=1}^{2n} \mathbf{1}_{\{\nu(v_q) = r\}} \Bigg), \tag{3.19} $$
where the expectation above is with respect to $\rho^A$-arrival rates and the summation is over all reduced labels $\nu|_{A^c} : A^c \to \{r, l\}$.

4. Differential Inequalities

The following is an adaptation of the ideas of [2,3] to the quantum case. It is worth noting that the space-time techniques we develop here yield simplified proofs even in the classical case. A fruitful idea of [3] is to work with three replicas in order to control the above quantities. In our case these will be three independent replicas $(\xi^1, m^1)$, $(\xi^2, m^2)$, $(\xi^3, m^3)$ of Poisson processes of flips and marks and, respectively, three sets of compatible labels $\nu^1, \nu^2, \nu^3$. We shall always indicate in sub-indices which replicas we are talking about, e.g. we shall talk about left $l_1$ paths in the first replica or about unblocked $*_{23}$-paths in the replicas 2 and 3. In the sequel $P$ is the product measure for all three independent replicas and $\mathbb{E}$ denotes the corresponding expectation. Let us go back to the representations (2.2) and (2.1),

$$ \langle\hat\sigma^z_0\rangle = \langle\hat\sigma^z_0\rangle \frac{Z^2}{Z^2} = \frac{1}{Z^3} \mathbb{E} \Bigg( \sum_{\substack{\nu^1 \overset{0}{\sim} (\xi^1, m^1) \\ \nu^2 \sim (\xi^2, m^2) \\ \nu^3 \sim (\xi^3, m^3)}} 1 \Bigg). \tag{4.1} $$
Fig. 4. The ground left 1-path $\check{C}^l_1(0, g)$ contains three intervals. $r$ and $l$ are $\nu^1$-labels of the first replica. The unblocked 23-cluster $C^*_{23}(g)$ is depicted schematically. Case 1: $\check{C}^l_1(0, g)$ is disjoint from $C^*_{23}(g)$. Case 2: $0 \in C^*_{23}(g)$. Case 3: $0 \notin C^*_{23}(g)$, but $\check{C}^l_1(0, g) \cap C^*_{23}(g) \ne \emptyset$.
Let $C^*_{23}(g)$ be the set of all points $v \in S$ which are $*_{23}$-connected to $g$ and let us denote by $\check{C}^l_1(0, g)$ the set of ground ($S$) points on the unique ground left path from $0$ to $g$. We shall distinguish three cases which exhaust all possible contributions to the right-hand side of (4.1) and lead to the various terms in (1.3):

(1) $\check{C}^l_1(0, g) \cap C^*_{23}(g) = \emptyset$.
(2) $0 \in C^*_{23}(g)$.
(3) $0 \notin C^*_{23}(g)$ but $\check{C}^l_1(0, g) \cap C^*_{23}(g) \ne \emptyset$.

Below we consider these cases in turn (see Fig. 4). During our exposition of Case 3, we also derive the pair of inequalities (1.4).

Case 2. If $0 \in C^*_{23}(g)$ then there exist $*_{23}$-paths from $0$ to $g$. Hence the notion of the minimal path $P^* = C^*_{23}(0, g)$ from $0$ to $g$ is well defined. Applying the Basic Transformation along $P^*$ on 23-labels, we readily conclude

$$ \frac{1}{Z^3} \mathbb{E} \Bigg( \sum_{\substack{\nu^1 \overset{0}{\sim} (\xi^1, m^1) \\ \nu^2 \sim (\xi^2, m^2) \\ \nu^3 \sim (\xi^3, m^3)}} \mathbf{1}_{\{0 \in C^*_{23}(g)\}} \Bigg) = \frac{1}{Z^3} \mathbb{E} \Bigg( \sum_{\substack{\nu^1 \overset{0}{\sim} (\xi^1, m^1) \\ \nu^2 \overset{0}{\sim} (\xi^2, m^2) \\ \nu^3 \overset{0}{\sim} (\xi^3, m^3)}} 1 \Bigg) = M^3. \tag{4.2} $$
Case 1. By construction, $\check{C}^l_1 = (I_1, \dots, I_n)$. All the intervals in this sequence are ground, and the last interval $I_n = (w, v = (i, t))$ satisfies $t \in \xi^1_{i,g}$. Let us modify the realization of the 1-process of flips by removing the corresponding arrival, keeping the configuration $(\nu^1, \xi^1, m^1)$ otherwise intact. Obviously, the relative weight of removing this arrival contributes a factor $h\,dt$, and one can recover the original $\xi^1$ by adding a flip from $v$ to the ghost site $g$. Formally, fixing realizations of the second and third replicas and fixing compatible values of $\nu^2$ and $\nu^3$, taking expectations only with respect to $(\xi^1, m^1)$ and summing only with respect to compatible $\nu^1$-labels we obtain

$$ \mathbb{E}_1 \Bigg[ \sum_{\nu^1 \overset{0}{\sim} (\xi^1, m^1)} \mathbf{1}_{\{\check{C}^l_1(0,g) \cap C^*_{23}(g) = \emptyset\}} \Bigg] = \sum_{i \in T_N} \int_0^\beta h\,dt \ \mathbb{E}_1 \Bigg[ \sum_{\nu^1 \overset{0}{\sim} (\xi^1, m^1)} \mathbf{1}_{\{\check{C}^l_1(0,g) \cap C^*_{23}(g) = \emptyset\}} \mathbf{1}_{\{(i,t) \in \check{C}^l_1(0,g)\}} \ \Bigg| \ \xi^1_{i,g}(t) = 1 \Bigg]. \tag{4.3} $$

Now,

$$ \mathbb{E}_1 \Bigg[ \sum_{\nu^1 \overset{0}{\sim} (\xi^1, m^1)} \mathbf{1}_{\{\check{C}^l_1(0,g) \cap C^*_{23}(g) = \emptyset\}} \mathbf{1}_{\{(i,t) \in \check{C}^l_1(0,g)\}} \ \Bigg| \ \xi^1_{i,g}(t) = 1 \Bigg] = \mathbb{E}_1 \Bigg[ \sum_{\nu^1 \overset{0,v}{\sim} (\xi^1, m^1)} \mathbf{1}_{\{C^l_1(0,v) \cap C^*_{23}(g) = \emptyset\}} \Bigg] \tag{4.4} $$

with $v = (i, t)$ on the right-hand side. Taking into account replicas 2 and 3, let us determine the properties of the resulting triple of configurations from the joint integration on the right-hand side of (4.4). Since $C^l_1(0, v) \cap C^*_{23}(g) = \emptyset$, there exist $*_{12}$-paths from $0$ to $v$ which are disjoint from $C^*_{23}(g)$. Let $P^*$ be the minimal such path. Consider the Basic Transformation along $P^*$ on 12-labels. It produces a new collection $(\hat\nu^1, \hat\xi^1, \hat m^1)$, $(\hat\nu^2, \hat\xi^2, \hat m^2)$, which satisfies the following set of conditions:

(1) $P^*$ is still the minimal $*_{12}$ path from $0$ to $v$ which avoids $C^*_{23}(g)$. In particular, the transformation is invertible.
(2) $\hat\nu^1 \sim (\hat\xi^1, \hat m^1)$ and $\hat\nu^2 \overset{0,v}{\sim} (\hat\xi^2, \hat m^2)$. In particular, $0 \overset{*_{23}}{\longleftrightarrow} v$ and $0 \overset{*_{23}}{\nleftrightarrow} g$.

Comparing with (3.5) (applied to 2 and 3 labels) and with the first of (3.14), we conclude

$$ \frac{1}{Z^3} \mathbb{E} \Bigg( \sum_{\substack{\nu^1 \overset{0}{\sim} (\xi^1, m^1) \\ \nu^2 \sim (\xi^2, m^2) \\ \nu^3 \sim (\xi^3, m^3)}} \mathbf{1}_{\{\check{C}^l_1(0,g) \cap C^*_{23}(g) = \emptyset\}} \Bigg) \le \sum_i \int_0^\beta h\,dt \ \langle\hat\sigma^z_0 ; \hat\sigma^z_{(i,t)}\rangle = h \frac{\partial M}{\partial h}. \tag{4.5} $$

Case 3. This is the most difficult case. In fact it contains two sub-cases, which we proceed to describe: The left ground path from $0$ to $g$, denoted by $\check{C}^l_1(0, g)$, is a ground path which may be naturally written as an ordered collection of ground intervals, $\check{C}^l_1(0, g) = \cup_1^n I_l$: Each interval $I_l = [z_l, w_l]$ is also naturally oriented with respect to the direction of the path towards $g$. Therefore, in the case under consideration we can speak of the first interval
$I_{l^*} = [z_{l^*}, w_{l^*}]$, where $\check{C}^l_1(0, g)$ hits $C^*_{23}(g)$ and, furthermore, about the first hitting point $u^* \in I_{l^*}$.

Fig. 5. Double transformation in Case 3(a): The 23-mark at $u^*$ is removed at the cost $2\lambda\,dt$. The point $u^*$ is pivotal for the $\{0 \overset{*_{23}}{\longleftrightarrow} g\}$-connection in the modified configuration. $A^*_{23}(0, u^*)$ is the set of all the points $u \in S$ which can be reached from $0$ via unblocked 23-paths avoiding $u^*$. $A^*_{23}(g, u^*)$ is the set of all the points $u \in S$ which can be reached from $g$ via unblocked 23-paths avoiding $u^*$.

Case 3(a). Pivotal Marks: In this sub-case $z_{l^*} \overset{*_{23}}{\nleftrightarrow} g$ or, equivalently, $z_{l^*} \ne u^*$. Since $u^*$ is in the boundary of $C^*_{23}(g)$, there is necessarily a 23-mark at $u^*$. Also, both the 2 and 3 labels are necessarily $r$ at $u^*$. By construction (if we understand the interval $(z_{l^*}, u^*) \subset C^l_1(0, u^*)$ as being topologically open),

$$ \check{C}^l_1(0, u^*) \cap C^*_{23}(g) = \emptyset. \tag{4.6} $$
Hence there exist $*_{12}$-paths from $0$ to $u^*$ which avoid $C^*_{23}(g)$. Let $P^*_{12}(0, u^*)$ be the minimal such path. Let also $P^*_{23}(u^*, g)$ be the minimal $*_{23}$-path from $u^*$ to $g$. These paths are disjoint. Let us make the following double transformation on all three collections of replicas and compatible labels:

(1) Remove the 23-mark at $u^*$. This yields the weight $2\lambda\,dt$.
(2) Perform the Basic Transformation along $P^*_{12}(0, u^*)$ on 12-labels.
(3) Perform the Basic Transformation along $P^*_{23}(u^*, g)$ on 23-labels.

Since the Basic Transformations are on disjoint paths the latter two operations are well defined, commute and moreover do not change the minimal character of $P^*_{12}(0, u^*)$ and $P^*_{23}(u^*, g)$. In other words, they are invertible. The resulting set of triples $(\hat\nu^1, \hat\xi^1, \hat m^1)$, $(\hat\nu^2, \hat\xi^2, \hat m^2)$, $(\hat\nu^3, \hat\xi^3, \hat m^3)$ satisfies the following conditions (see Fig. 5):

(1) $\hat\nu^1 \overset{u^*}{\sim} (\hat\xi^1, \hat m^1)$, $\hat\nu^2 \overset{0}{\sim} (\hat\xi^2, \hat m^2)$ and $\hat\nu^3 \overset{u^*}{\sim} (\hat\xi^3, \hat m^3)$.
(2) $\hat\nu^2(u^*) = l$.
(3) $u^*$ is pivotal for $\{0 \overset{*_{23}}{\longleftrightarrow} g\}$.

Note that (2) is a consequence of (1) and (3) and, therefore, can be omitted. We claim that

$$ \mathbb{E} \Bigg( \sum_{\substack{\nu^2 \overset{0}{\sim} (\xi^2, m^2) \\ \nu^3 \overset{u^*}{\sim} (\xi^3, m^3)}} \mathbf{1}_{\{u^* \text{ is pivotal for } 0 \overset{*_{23}}{\longleftrightarrow} g\}} \Bigg) \le M \, \mathbb{E} \Bigg( \sum_{\substack{\nu^2 \overset{0}{\sim} (\xi^2, m^2) \\ \nu^3 \sim (\xi^3, m^3)}} \mathbf{1}_{\{u^* \text{ is pivotal for } 0 \overset{*_{23}}{\longleftrightarrow} g\}} \mathbf{1}_{\{\nu^3(u^*) = r\}} \Bigg). \tag{4.7} $$

Assuming (4.7) for the moment, a comparison with (3.12) and with the third of (3.14) reveals that the total contribution to $M$ which comes from the Case 3(a) is bounded above by

$$ \frac{M^2}{Z^2} \sum_i \int_0^\beta 2\lambda\,dt \ \mathbb{E} \Bigg( \sum_{\substack{\nu^2 \overset{0}{\sim} (\xi^2, m^2) \\ \nu^3 \sim (\xi^3, m^3)}} \mathbf{1}_{\{u^* = (i,t) \text{ is pivotal for } 0 \overset{*_{23}}{\longleftrightarrow} g\}} \mathbf{1}_{\{\nu^3(u^*) = r\}} \Bigg) = -2\lambda M^2 \frac{\partial M}{\partial\lambda}. \tag{4.8} $$
To check (4.7) let $A^*_{23}(0, u^*)$ be the set of all the points $u \in S$ which can be reached from $0$ via unblocked 23-paths avoiding $u^*$. Evidently, $A^*_{23}(0, u^*)$ can be written as a union $A^*_{23}(0, u^*) = \cup R_j$ of sets which satisfy the following properties:

(1) Each $R_j$ is either a full circle, or it is an interval $R_j = (p_j, q_j)$ (which, formally speaking, is a union of successive ground intervals on some $S^i_\beta$) and then it bears 23-marks at its endpoints $p_j$ and $q_j$, except, of course, for the interval which contains $u^*$ as one of its endpoints; recall that the 23-mark at $u^*$ was removed. Moreover, both labels $\nu^2$ and $\nu^3$ equal $r$ at such end-points.
(2) Let $R_{j^*} = (p_{j^*}, u^*)$ be the remaining interval which contains $u^*$ as one of its endpoints. Then $\nu^3(u^*-) = \lim_{z \in R_{j^*},\, z \to u^*} \nu^3(z) = r$.
(3) There are no arrivals of 23-flips between points in $A^*_{23}(0, u^*)$ and points in $S \setminus A^*_{23}(0, u^*)$.

The inequality (4.7) is then proved as follows: Conditioning on $A^*_{23}(0, u^*)$ with realizations of all the processes and values of both 2 and 3 labels on it, we integrate with respect to marks on $S \setminus A^*_{23}(0, u^*)$, flips on $S \setminus A^*_{23}(0, u^*) \cup g$ and compatible 2 and 3 labels. The constrained integration clearly decouples the two configurations on $S \setminus A^*_{23}(0, u^*) \cup g$ and so we can integrate the restricted 2 and 3 quantities independently. We arrive at a situation where (3.19) applies (for the restriction of $\nu^3$). More precisely, what we use is actually a limiting case of (3.19), with the z component of spin in the expectation occurring at the point $u^*$ on the boundary of $S \setminus A^*_{23}(0, u^*) = A^*_{23}(0, u^*)^c$. Putting things together concludes Step 3(a).

Before proceeding to Case 3(b), let us prove the first of (1.4) by techniques similar to those of the previous paragraph. Consider the expression for $\partial M/\partial\lambda$ as it appears in (4.8). Given two labels $\nu^2 \overset{0}{\sim} (\xi^2, m^2)$ and $\nu^3 \sim (\xi^3, m^3)$, the condition that $u^*$ is pivotal for $\{0 \overset{*_{23}}{\longleftrightarrow} g\}$ is equivalent to $A^*_{23}(0, u^*) \cap A^*_{23}(g, u^*) = \emptyset$, where $A^*_{23}(g, u^*)$ is the set of points which can be reached from $g$ by unblocked paths avoiding $u^*$. Consequently,
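by (3.19) (or, more precisely, by the limiting case of the latter, applied this time to the restriction of $\nu^2$ to $A^*_{23}(0, u^*)^c$ at the point $u^*$ on the boundary of $A^*_{23}(0, u^*)^c$),

$$ \mathbb{E} \Bigg( \sum_{\substack{\nu^2 \overset{0}{\sim} (\xi^2, m^2) \\ \nu^3 \sim (\xi^3, m^3)}} \mathbf{1}_{\{u^* \text{ is pivotal for } 0 \overset{*_{23}}{\longleftrightarrow} g\}} \mathbf{1}_{\{\nu^3(u^*) = r\}} \Bigg) \le M \, \mathbb{E} \Bigg( \sum_{\substack{\nu^2 \overset{0,u^*}{\sim} (\xi^2, m^2) \\ \nu^3 \sim (\xi^3, m^3)}} \mathbf{1}_{\{A^*_{23}(0,u^*) \cap A^*_{23}(g,u^*) = \emptyset\}} \mathbf{1}_{\{\nu^3(u^*) = r\}} \Bigg). $$

In order to estimate the latter expression we shall separately consider whether $u^* \overset{*_{23}}{\longleftrightarrow} g$ or not. First of all,

$$ \mathbb{E} \Bigg( \sum_{\substack{\nu^2 \overset{0,u^*}{\sim} (\xi^2, m^2) \\ \nu^3 \sim (\xi^3, m^3)}} \mathbf{1}_{\{A^*_{23}(0,u^*) \cap A^*_{23}(g,u^*) = \emptyset\}} \mathbf{1}_{\{\nu^3(u^*) = r\}} \mathbf{1}_{\{u^* \overset{*_{23}}{\nleftrightarrow} g\}} \Bigg) \le \mathbb{E} \Bigg( \sum_{\substack{\nu^2 \overset{0,u^*}{\sim} (\xi^2, m^2) \\ \nu^3 \sim (\xi^3, m^3)}} \mathbf{1}_{\{u^* \overset{*_{23}}{\nleftrightarrow} g\}} \Bigg), $$

and the right-hand side is $Z^2 \langle\hat\sigma^z_0 ; \hat\sigma^z_{u^*}\rangle$ (see (3.5)). On the other hand

$$ \mathbb{E} \Bigg( \sum_{\substack{\nu^2 \overset{0,u^*}{\sim} (\xi^2, m^2) \\ \nu^3 \sim (\xi^3, m^3)}} \mathbf{1}_{\{A^*_{23}(0,u^*) \cap A^*_{23}(g,u^*) = \emptyset\}} \mathbf{1}_{\{\nu^3(u^*) = r\}} \mathbf{1}_{\{u^* \overset{*_{23}}{\longleftrightarrow} g\}} \Bigg) = \mathbb{E} \Bigg( \sum_{\substack{\nu^2 \overset{0}{\sim} (\xi^2, m^2) \\ \nu^3 \overset{u^*}{\sim} (\xi^3, m^3)}} \mathbf{1}_{\{A^*_{23}(0,u^*) \cap A^*_{23}(g,u^*) = \emptyset\}} \Bigg) \tag{4.9} $$

as can be seen by performing our Basic Transformation on the minimal $*_{23}$ path from $u^*$ to $g$ (which would necessarily lie in $A^*_{23}(0, u^*)^c$). Since the constraints appearing on the right-hand side imply that $u^*$ is pivotal, we may use (4.7) and bound the right-hand side in (4.9) by $-M Z^2$ times the truncated cross-correlation between $\hat\sigma^z_0$ and the x-operator at $u^*$, in the form given by (3.12). The inequality (1.4) then follows easily.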
Case 3(b). Pivotal Flips: Assume now that $z_{l^*} \overset{*_{23}}{\longleftrightarrow} g$ or, equivalently, that $z_{l^*} \in C^*_{23}(g)$. In order to simplify notation set $z^* = z_{l^*}$ and $w^* = w_{l^*-1}$. Under the above assumption $C^*_{23}(g)$ is disjoint from the left path $C^l_1(0, w^*)$. Hence there exist $*_{12}$-paths from $0$ to $w^*$ which avoid $C^*_{23}(g)$. Let $P^*_{12}(0, w^*)$ be the minimal such path. Let also $P^*_{23}(z^*, g)$ be the minimal $*_{23}$-path from $z^*$ to $g$. These paths are disjoint. Let us make now the following transformation on all three replicas and labels:

(1) Remove the arrival of $\xi^1$ between $w^*$ and $z^*$, yielding the weight $\rho J_{i,j}\,dt$.
(2) Perform the Basic Transformation along $P^*_{12}(0, w^*)$ on 12-labels.
(3) Perform the Basic Transformation along $P^*_{23}(z^*, g)$ on 23-labels.

Again, since the Basic Transformations are on disjoint paths they are well defined and do not change the minimal character of $P^*_{12}(0, w^*)$ and $P^*_{23}(z^*, g)$. Thus, they are invertible and the resulting collection of configurations $(\hat\nu^1, \hat\xi^1, \hat m^1)$, $(\hat\nu^2, \hat\xi^2, \hat m^2)$, $(\hat\nu^3, \hat\xi^3, \hat m^3)$ satisfies the following set of conditions (see Fig. 6):

(1) $\hat\nu^1 \overset{z^*}{\sim} (\hat\xi^1, \hat m^1)$, $\hat\nu^2 \overset{\{0,w^*,z^*\}}{\sim} (\hat\xi^2, \hat m^2)$ and $\hat\nu^3 \overset{z^*}{\sim} (\hat\xi^3, \hat m^3)$.
(2) $C^*_{23}(0, w^*)$ and $C^*_{23}(g)$ are disjoint.

Fig. 6. Double transformation in Case 3(b): The 1-flip between $w^*$ and $z^*$ is removed at the cost $\rho J_{ij}\,dt$. In the modified configuration $w^* \in C^*_{23}(0)$ and the clusters $C^*_{23}(0)$ and $C^*_{23}(g)$ are disjoint.

Therefore, the contribution to $M$ which comes from Case 3(b) is bounded by

$$ M \frac{1}{Z^2} \sum_{i,j} \int_0^\beta \rho J_{ij}\,dt \ \mathbb{E} \Bigg( \sum_{\substack{\nu^2 \overset{\{0,(i,t),(j,t)\}}{\sim} (\xi^2, m^2) \\ \nu^3 \overset{(j,t)}{\sim} (\xi^3, m^3)}} \mathbf{1}_{\{0 \overset{*_{23}}{\longleftrightarrow} (i,t)\}} \mathbf{1}_{\{C^*_{23}(0,(i,t)) \cap C^*_{23}(g) = \emptyset\}} \Bigg). \tag{4.10} $$
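We claim that the latter expression is bounded above by

$$ M^2 \frac{1}{Z^2} \sum_{i,j} \int_0^\beta \rho J_{ij}\,dt \ \mathbb{E} \Bigg( \sum_{\substack{\nu^2 \overset{\{0,(i,t),(j,t)\}}{\sim} (\xi^2, m^2) \\ \nu^3 \sim (\xi^3, m^3)}} \mathbf{1}_{\{0 \overset{*_{23}}{\longleftrightarrow} (i,t)\}} \mathbf{1}_{\{C^*_{23}(0,(i,t)) \cap C^*_{23}(g) = \emptyset\}} \Bigg). \tag{4.11} $$

The proof is the same as that of (4.7) and is omitted here. The expression in (4.11) is exactly $M^2 \rho\, \partial M/\partial\rho$. Indeed, just compare it with (3.15): If we define $w = (i, t)$ and $z = (j, t)$, then $0 \overset{*_{23}}{\nleftrightarrow} g$ precisely means that either $0 \overset{*_{23}}{\longleftrightarrow} w$, $z \overset{*_{23}}{\longleftrightarrow} g$ and $C^*_{23}(0, w) \cap C^*_{23}(z, g) = \emptyset$ or, the other way around, $0 \overset{*_{23}}{\longleftrightarrow} z$, $w \overset{*_{23}}{\longleftrightarrow} g$ and $C^*_{23}(0, z) \cap C^*_{23}(w, g) = \emptyset$.

The second inequality of (1.4) is also an immediate consequence. From a (by now) standard application of the Basic Transformation,

$$ \mathbb{E} \Bigg( \sum_{\substack{\nu^2 \overset{\{0,w,z\}}{\sim} (\xi^2, m^2) \\ \nu^3 \sim (\xi^3, m^3)}} \mathbf{1}_{\{C^*_{23}(0,w) \cap C^*_{23}(z,g) = \emptyset\}} \Bigg) = \mathbb{E} \Bigg( \sum_{\substack{\nu^2 \overset{\{0,w\}}{\sim} (\xi^2, m^2) \\ \nu^3 \overset{z}{\sim} (\xi^3, m^3)}} \mathbf{1}_{\{C^*_{23}(0,w) \cap C^*_{23}(z,g) = \emptyset\}} \Bigg). $$

By (3.19) and in view of the representation (3.5),

$$ \mathbb{E} \Bigg( \sum_{\substack{\nu^2 \overset{\{0,w\}}{\sim} (\xi^2, m^2) \\ \nu^3 \overset{z}{\sim} (\xi^3, m^3)}} \mathbf{1}_{\{0 \overset{*_{23}}{\nleftrightarrow} g\}} \Bigg) \le M \, \mathbb{E} \Bigg( \sum_{\substack{\nu^2 \overset{\{0,w\}}{\sim} (\xi^2, m^2) \\ \nu^3 \sim (\xi^3, m^3)}} \mathbf{1}_{\{0 \overset{*_{23}}{\nleftrightarrow} g\}} \Bigg) = M Z^2 \langle\hat\sigma^z_0 ; \hat\sigma^z_w\rangle. $$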
The analogous statement holds if the roles of $z$ and $w$ are interchanged. The conclusion follows by collecting terms.

5. Proof of Theorem A: Exponential Decay

In the sequel we shall continue to use $P$ and, respectively, $\mathbb{E}$ for the product probability and expectation for two independent replicas $(\xi^1, m^1)$ and $(\xi^2, m^2)$. As before $n = m^1 \cup m^2$ and $\eta = \xi^1 \cup \xi^2$. The proof is given in three subsections, corresponding to each of the three truncated correlations. The proof for z-correlations is given in some detail; since the proofs of the other two inequalities require only small modifications of this argument, we shall be briefer there.

Proof of Theorem A for z-correlations. Let $i, j \in T_N$, $s, t \in S_\beta$ be fixed and let $u = (i, t)$, $v = (j, s)$. We shall prove the following generalization of the first of (1.2):

Lemma 5.1. There exist $c_1 = c_1(h, \lambda, \rho) > 0$ and $c_2 = c_2(h, \lambda, \rho) < \infty$ such that

$$ \langle\hat\sigma^z_u ; \hat\sigma^z_v\rangle \le c_2 e^{-c_1 d(u,v)}, \tag{5.1} $$

where $d(u, v) = |j - i| + |t - s|$. The above inequality is uniform in $N$, $\beta$, $u$ and $v$.

Proof. The starting point for our analysis is the formula (3.5) reproduced here:

$$ \langle\hat\sigma^z_u ; \hat\sigma^z_v\rangle = \frac{1}{Z^2} \mathbb{E} \Bigg( \sum_{\substack{\nu^1 \sim (\xi^1, m^1) \\ \nu^2 \overset{u,v}{\sim} (\xi^2, m^2)}} \mathbf{1}_{\{u \overset{*t}{\longleftrightarrow} v\}} \mathbf{1}_{\{u \overset{*}{\nleftrightarrow} g\}} \Bigg). \tag{5.2} $$
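There is a simple reason to include a redundant constraint $\{u \overset{*t}{\longleftrightarrow} v\}$: Given a realization of $\xi^1$ and $\xi^2$, the function

$$ (m^1, m^2) \mapsto \sum_{\substack{\nu^1 \sim (\xi^1, m^1) \\ \nu^2 \overset{u,v}{\sim} (\xi^2, m^2)}} \mathbf{1}_{\{u \overset{*t}{\longleftrightarrow} v\}} $$

is monotone non-increasing. Consequently, for any non-decreasing $F(m^1, m^2)$, the FKG property of the pair of Poisson processes $m^1$ and $m^2$ implies:

$$ \mathbb{E} \Bigg( F(m^1, m^2) \sum_{\substack{\nu^1 \sim (\xi^1, m^1) \\ \nu^2 \overset{u,v}{\sim} (\xi^2, m^2)}} \mathbf{1}_{\{u \overset{*t}{\longleftrightarrow} v\}} \Bigg) \le \mathbb{E} \big( F(m^1, m^2) \big) \, \mathbb{E} \Bigg( \sum_{\substack{\nu^1 \sim (\xi^1, m^1) \\ \nu^2 \overset{u,v}{\sim} (\xi^2, m^2)}} \mathbf{1}_{\{u \overset{*t}{\longleftrightarrow} v\}} \Bigg). \tag{5.3} $$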
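The negative-correlation form of the FKG inequality used in (5.3) can be checked by hand in a finite toy model. The sketch below replaces the Poisson processes of marks by a few independent Bernoulli variables (a discrete stand-in chosen here for illustration, not the setting of the paper) and verifies $E(FG) \le E(F)\,E(G)$ for one non-decreasing $F$ and one non-increasing $G$. The functions `F`, `G` and the parameters `n_sites`, `p_mark` are hypothetical.

```python
from itertools import product


def expectation(f, p, n):
    """Exact expectation of f over n independent Bernoulli(p) variables."""
    total = 0.0
    for omega in product((0, 1), repeat=n):
        weight = 1.0
        for x in omega:
            weight *= p if x else 1.0 - p
        total += weight * f(omega)
    return total


n_sites, p_mark = 4, 0.3


def F(omega):
    """Non-decreasing: at least two marks are present."""
    return float(sum(omega) >= 2)


def G(omega):
    """Non-increasing: the first two sites carry no mark."""
    return float(omega[0] == 0 and omega[1] == 0)


lhs = expectation(lambda w: F(w) * G(w), p_mark, n_sites)
rhs = expectation(F, p_mark, n_sites) * expectation(G, p_mark, n_sites)
fkg_holds = lhs <= rhs
```

Here `G` plays the role of the monotone non-increasing sum of connection indicators and `F` the role of the non-decreasing function of the marks.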
For every $\delta > 0$ fixed (for convenience we'll assume that $\delta$ divides $\beta$) let $Z^\beta_\delta = \delta\mathbb{Z} / ((\beta/\delta)\mathbb{Z})$ be the rescaled one-dimensional lattice torus which is just an equal $\delta$-spacing embedding of $\beta/\delta$ sites into $S_\beta$.

We construct non-decreasing functions $F_\delta(m^1, m^2) = F^{u,v}_\delta(m^1, m^2)$ as follows: First of all let us map $S$ onto $Z^\beta_\delta \times \mathbb{Z}^d$: A point $p = (\delta k, j) \in Z^\beta_\delta \times \mathbb{Z}^d$ corresponds to the interval $[(k-1)\delta, k\delta)$ of $S^j_\beta$. Two points $p = (\delta k, j)$ and $q = (\delta l, m)$ are said to be connected if either $j = m$ and $|k - l| \le 1 \bmod (\beta/\delta)$ or $k = l$ and $(j, m) \in E$. Consider the following Bernoulli site percolation process $X_\delta$ on $Z^\beta_\delta \times \mathbb{Z}^d$, which is generated by the combined process of marks $n$:

$$ X_\delta(p) = \begin{cases} 0, & \text{if } n\big(j \times [(k-1)\delta, k\delta)\big) > 0, \\ \delta, & \text{otherwise.} \end{cases} $$

Clearly, $P(X_\delta = \delta)$ tends to one as $\delta$ tends to zero. For $p, q \in Z^\beta_\delta \times \mathbb{Z}^d$ we can define the minimal passage time

$$ T_\delta(p, q) = \min_{\gamma_\delta : p \to q} \ \sum_{r \in \gamma} X_\delta(r). $$

Then, there exist $c_1, c_2 > 0$ such that

$$ P\Big( T_\delta(p, q) < \frac{\delta}{2} d_\delta(p, q) \Big) \le c_2 e^{-c_1 d_\delta(p,q)}, \tag{5.4} $$

uniformly in $0 \le \delta \le \delta_0$ small enough and in $p, q \in Z^\beta_\delta \times \mathbb{Z}^d$. Moreover, our choice of $\delta_0$ may be made independent of $\beta$. Here, $d_\delta(p, q)$ is the minimal possible number of points in connected paths $\gamma_\delta : p \to q$. Note that if $p_u$ and $p_v$ label $\delta$-intervals containing $u$ and $v$, then $d_\delta(p_u, p_v) \ge c_3 d(u, v)$ uniformly in $\delta$ small and, say, $d(u, v) \ge 1$. For such $p_u$ and $p_v$, assume that $\delta > 0$ is chosen so that (5.4) holds. If we define

$$ D^c_\delta = \big(D^{u,v}_\delta\big)^c = \Big\{ T_\delta(p_u, p_v) < \frac{\delta}{2} d_\delta(p_u, p_v) \Big\}, $$

then since $F_\delta = \mathbf{1}_{D^c_\delta}$ is non-decreasing, the FKG inequality (5.3) along with (5.4) imply that for all $\delta$ small there exist $c_1 = c_1(\delta)$, $c_2 > 0$, such that

$$ \mathbb{E} \Bigg( \mathbf{1}_{D^c_\delta} \sum_{\substack{\nu^1 \sim (\xi^1, m^1) \\ \nu^2 \overset{u,v}{\sim} (\xi^2, m^2)}} \mathbf{1}_{\{u \overset{*t}{\longleftrightarrow} v\}} \Bigg) \le c_2 e^{-c_1 d(u,v)} \, \mathbb{E} \Bigg( \sum_{\substack{\nu^1 \sim (\xi^1, m^1) \\ \nu^2 \overset{u,v}{\sim} (\xi^2, m^2)}} \mathbf{1}_{\{u \overset{*t}{\longleftrightarrow} v\}} \Bigg). \tag{5.5} $$
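The minimal passage time $T_\delta$ introduced above is a shortest-path quantity with site weights, so on a small grid it can be computed with Dijkstra's algorithm. The sketch below is illustrative only: it assumes a time torus with `n_time` cells and, for simplicity, a one-dimensional open chain of `n_space` sites in place of the spatial graph of the paper. The set `occupied` of cells carrying at least one mark and the function name `passage_time` are hypothetical.

```python
import heapq


def passage_time(occupied, n_time, n_space, delta, src, dst):
    """Minimal sum of X_delta over nearest-neighbour paths from src to dst.

    Cells are pairs (k, j): k indexes the time torus, j the spatial chain.
    X_delta(p) = 0 if p is in `occupied` (carries a mark), delta otherwise.
    The weights of both endpoints are included in the sum.
    """
    def x(p):
        return 0.0 if p in occupied else delta

    def neighbours(p):
        k, j = p
        yield ((k + 1) % n_time, j)   # periodic in time
        yield ((k - 1) % n_time, j)
        if j + 1 < n_space:           # open chain in space (assumption)
            yield (k, j + 1)
        if j - 1 >= 0:
            yield (k, j - 1)

    dist = {src: x(src)}
    heap = [(dist[src], src)]
    while heap:
        d, p = heapq.heappop(heap)
        if p == dst:
            return d
        if d > dist.get(p, float("inf")):
            continue                  # stale heap entry
        for q in neighbours(p):
            nd = d + x(q)
            if nd < dist.get(q, float("inf")):
                dist[q] = nd
                heapq.heappush(heap, (nd, q))
    return float("inf")
```

Calling the same routine with `delta = 1.0` and no occupied cells returns the minimal number of points on a connecting path, that is $d_\delta$ in this toy geometry, so the event $D^c_\delta$ can be tested by comparing the two outputs.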
In view of (5.5) it suffices to check that, perhaps by adjusting further $c_1, c_2 > 0$,

$$ \mathbb{E} \Bigg( \mathbf{1}_{D_\delta} \sum_{\substack{\nu^1 \sim (\xi^1, m^1) \\ \nu^2 \overset{u,v}{\sim} (\xi^2, m^2)}} \mathbf{1}_{\{u \overset{*t}{\longleftrightarrow} v\}} \mathbf{1}_{\{u \overset{*}{\nleftrightarrow} g\}} \Bigg) \le c_2 e^{-c_1 d(u,v)} \, \mathbb{E} \Bigg( \mathbf{1}_{D_\delta} \sum_{\substack{\nu^1 \sim (\xi^1, m^1) \\ \nu^2 \overset{u,v}{\sim} (\xi^2, m^2)}} \mathbf{1}_{\{u \overset{*t}{\longleftrightarrow} v\}} \Bigg). \tag{5.6} $$

Consider now the set $C^{*t}(u, v)$ of all the points $z \in S$ which are $*t$-connected to both $u$ and $v$. The set $C^{*t}(u, v)$ is non-empty on the event $\{u \overset{*t}{\longleftrightarrow} v\}$, and it is represented as a union of intervals $C^{*t}(u, v) = \cup_l R_l$. Each interval $R_l \subset S^l_\beta$ is either a full circle, or $R_l = (z_l, w_l) \subset S^{i_l}_\beta$ with combined $n$ marks placed at both end-points. Note that these endpoints must also have $\nu^1, \nu^2 = r$. Let us say that

$$ p = (\delta k, l) \in G_\delta(R_l) \quad \text{if } p \in R_l \text{ and } X_\delta(p) = \delta. $$

Note that $p \in G_\delta(R_l)$ implies in particular that $[(k-1)\delta, k\delta) \times l \subseteq R_l$. The crucial property is that on the event $D^{u,v}_\delta$ the following happens: The number of all $\delta$-intervals associated with points $p \in \cup_l G_\delta(R_l)$ is bounded below as

$$ \sum_l \ \sum_{p \in G_\delta(R_l)} 1 \ \ge \ \frac{1}{\delta} T_\delta(p_u, p_v) \ > \ \frac{1}{2} c_3 d(u, v). \tag{5.7} $$
Let us condition on realizations of $C^{*t}(u, v)$ which are compatible with $\{u \overset{*t}{\longleftrightarrow} v\}$ and $D^{u,v}_\delta$. As before, such a conditioning rules out simultaneous flips between points in $C^{*t}(u, v)$ and $S \setminus C^{*t}(u, v)$. Therefore, the corresponding conditional integration and summation over compatible flips, marks and labels inside and outside $C^{*t}(u, v)$ decouples over the two regions. In other words, to establish (5.6) it is enough to prove the following statement ((5.9) below): Let $A = \cup R_l$ be a collection of full circles and disjoint intervals, such that $u$ and $v$ are interior points of $A$. Further, suppose that $A$ contains at least $\frac{1}{2} c_3 d(u, v)$ disjoint sub-intervals each with length at least $\delta$ and let us say that $D^{u,v}_\delta(A)$ occurs for the realization of the combined process of marks $n$ whenever (5.7) holds. Let $\rho^A$ denote the reduced time-inhomogeneous rates of arrivals of flips (associated to edges on the torus) as in (3.17),

$$ \rho^A_e(t) = \begin{cases} \rho, & \text{if the corresponding flip is either between two points in } A \text{ or between two points in } S \setminus A, \\ 0, & \text{otherwise.} \end{cases} \tag{5.8} $$

Then,

$$ \mathbb{E}_{\rho^A} \Bigg( \sum_{\substack{\check\nu^1 \sim (\xi^1, m^1) \\ \check\nu^2 \overset{u,v}{\sim} (\xi^2, m^2)}} \mathbf{1}_{\{C^{*t}(u,v) = A\}} \mathbf{1}_{D^{u,v}_\delta(A)} \mathbf{1}_{\{u \overset{*}{\nleftrightarrow} g\}} \Bigg) \le c_2 e^{-c_1 d(u,v)} \, \mathbb{E}_{\rho^A} \Bigg( \sum_{\substack{\check\nu^1 \sim (\xi^1, m^1) \\ \check\nu^2 \overset{u,v}{\sim} (\xi^2, m^2)}} \mathbf{1}_{\{C^{*t}(u,v) = A\}} \mathbf{1}_{D^{u,v}_\delta(A)} \Bigg), \tag{5.9} $$
where ν̌^1, ν̌^2 are restrictions of the labels to A = ∪R_l which are compatible with the marks, and in particular with r-boundary conditions, at the end-points of the R_l's. The inequality (5.9) is established by the following embedding procedure: Let (ν̌^1, ξ^1, m^1), (ν̌^2, ξ^2, m^2) be a pair of configurations which contribute to the left-hand side of (5.9). All such configurations have no arrivals of g-induced flips on A. At this stage it is convenient to introduce the following separate notation for processes of flips: let ξ̌^k_e, k = 1, 2, denote arrivals for e = (i, j) ∈ E^0 and ξ^{g,k}_e, k = 1, 2, denote arrivals for e = (i, g) ∈ E^g. A similar notation η̌ = ξ̌^1 ∪ ξ̌^2 and η^g = ξ^{g,1} ∪ ξ^{g,2} is introduced for combined processes of flips. Then, on the event η^g(A) = ∅, the compatibility conditions on the left-hand side of (5.9) read as ν̌^1 ∼ (ξ̌^1, m^1) and, accordingly, ν̌^2 ∼^{u,v} (ξ̌^2, m^2), and the expression on the left-hand side of (5.9) equals
\[
e^{-2h\sum_l |R_l|}\, E_{\rho^A}\Bigl(\sum_{\substack{\check\nu^1 \sim (\check\xi^1, m^1)\\ \check\nu^2 \overset{u,v}{\sim} (\check\xi^2, m^2)}} \mathbf{1}_{\{C^*_t(u,v)=A\}}\,\mathbf{1}_{D^{u,v}_\delta(A)}\Bigr).
\tag{5.10}
\]
Fix now a realization of (ξ̌^1, m^1), (ξ̌^2, m^2) and compatible labels ν̌^1 ∼ (ξ̌^1, m^1) and ν̌^2 ∼^{u,v} (ξ̌^2, m^2). Consider the following event:
\[
E(A) = \bigcap_l \bigcap_{p\in G_\delta(R_l)} \bigl\{\xi^{g,i}(I_p) \text{ is even for } i = 1, 2\bigr\} \cap \bigl\{\eta^g(A\setminus A_\delta) = 0\bigr\},
\]
where, for p = (kδ, l) ∈ δZ^β_δ × Z^d, we set I_p = [(k − 1)δ, kδ) × l and A_δ = ∪_l ∪_{p∈G_δ(R_l)} I_p. Evidently,
\[
P(E(A)) = e^{-2h\sum_l |R_l|} \prod_l \prod_{p\in G_\delta(R_l)} (\cosh(\delta h))^2 \ge e^{-2h\sum_l |R_l|} (\cosh(\delta h))^{c_3 d(u,v)},
\tag{5.11}
\]
where the inequality follows from (5.7). Each E(A)-realization of ξ^{g,1}, ξ^{g,2} gives rise to compatible labels ν̌^1[ξ^{g,1}] ∼ (ξ̌^1, ξ^{g,1}, m^1) and ν̌^2[ξ^{g,2}] ∼^{u,v} (ξ̌^2, ξ^{g,2}, m^2) which are unambiguously constructed from the original ν̌^1 ∼ (ξ̌^1, m^1) and ν̌^2 ∼^{u,v} (ξ̌^2, m^2) by the appropriate even number of flips on each of the intervals I_p ⊆ A_δ (see Fig. 7).
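The factor (cosh(δh))^2 in (5.11) can be traced to a standard Poisson computation. The sketch below assumes, as the prefactor e^{-2h Σ_l |R_l|} suggests, that ξ^{g,1} and ξ^{g,2} are independent Poisson processes of arrivals with rate h per unit length; N denotes the number of arrivals of one such process on an interval I_p of length δ (the symbol N is introduced here only for illustration).

```latex
\begin{align*}
P(N \text{ is even})
  &= \sum_{j\ge 0} e^{-\delta h}\,\frac{(\delta h)^{2j}}{(2j)!}
   = e^{-\delta h}\cosh(\delta h),\\
P\bigl(\xi^{g,1}(I_p) \text{ and } \xi^{g,2}(I_p) \text{ both even}\bigr)
  &= e^{-2\delta h}\,(\cosh(\delta h))^{2},\\
P\bigl(\eta^{g}(A\setminus A_\delta) = 0\bigr)
  &= e^{-2h\,|A\setminus A_\delta|}.
\end{align*}
% Multiplying over the intervals I_p and using |A_\delta| + |A\setminus A_\delta| = \sum_l |R_l|
% gives the prefactor e^{-2h\sum_l |R_l|} together with one factor (\cosh(\delta h))^2 per interval.
```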
Fig. 7. Compatible labels which are constructed from ν̌^1, ν̌^2 and even number of arrivals of ξ^{g,1} and ξ^{g,2} on intervals from G_δ: (a) Original configurations (ν̌^1, ξ̌^1, m^1) and (ν̌^2, ξ̌^2, m^2). (b) Example of admissible (in the sense of event E) even number of arrivals of ξ^{g,1}, ξ^{g,2}: the circled numbers indicate total number of arrivals on the corresponding intervals
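As a result, the expectation on the right-hand side of (5.9) is bounded below by
\[
(\cosh(\delta h))^{c_3 d(u,v)}\, e^{-2h\sum_l |R_l|}\, E_{\rho^A}\Bigl(\sum_{\substack{\check\nu^1 \sim (\check\xi^1, m^1)\\ \check\nu^2 \overset{u,v}{\sim} (\check\xi^2, m^2)}} \mathbf{1}_{\{C^*_t(u,v)=A\}}\,\mathbf{1}_{D^{u,v}_\delta(A)}\Bigr),
\]
and (5.9) follows.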
Proof of Theorem A for xz-correlations. Recall the expression (3.12), 1 . ˆ xx = − σˆ uz ; E 1I{ν 2 (v)=r } 1I ∗ 2 v is pivotal for u ←→ g Z u
(5.12)
ν 1 ∼(ξ 1 ,m 1 )
ν 2 ∼(ξ 2 ,m2 ) ∗
Observe that if v is pivotal for u ←→ g then A∗ (u, v) (recall that the latter notation stands for the set of points which are ∗-connected to u by paths avoiding v) does not contain g, which means that there are no arrivals of ηg on A∗ (u, v). At this point we may proceed exactly as in the proof of Theorem A for z-correlations. Proof of Theorem A for x-correlations. Recall the expression (3.9), 1 ˆ ux ; ˆ vx = E 1I{(ν 1 ,ν 2 )∈[(r,l),(r,l)]} 1I{v is loop pivotal for u} . 2 Z ν 1 ∼(ξ 1 ,m 1 ) ν 2 ∼(ξ 2 ,m2 )
(5.13)
Observe that under the constraints on the right-hand side, if v is loop-pivotal for u then the set A∗ (u, v)\{u, v} contains at least two disjoint components. Hence at least one of these components should be disjoint from g. Again, at this point we may proceed exactly as in the proof of Theorem A for z-correlations. Implications for the ground state β = ∞. As was proved above, exponential decay of truncated two-point functions is uniform in β < ∞. Consequently, for every N < ∞, the limit
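\[
M_{\infty,N}(h, \rho, \lambda) = \lim_{\beta\to\infty} M_{\beta,N}(h, \rho, \lambda)
\]
also satisfies (1.3) and (1.4). On the other hand, by an obvious time scaling, M_{∞,N}(αh, αρ, αλ) = M_{∞,N}(h, ρ, λ) for every α > 0. Hence,
\[
\rho\,\frac{\partial M_{\infty,N}}{\partial\rho} = -\lambda\,\frac{\partial M_{\infty,N}}{\partial\lambda} - h\,\frac{\partial M_{\infty,N}}{\partial h} \le -\lambda\,\frac{\partial M_{\infty,N}}{\partial\lambda}.
\]
Therefore, (1.3) implies that
\[
M_{\infty,N} \le h\,\frac{\partial M_{\infty,N}}{\partial h} + M_{\infty,N}^3 - 3M_{\infty,N}^2\,\lambda\,\frac{\partial M_{\infty,N}}{\partial\lambda}.
\tag{5.14}
\]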
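The scaling step used on the way to (5.14) can be spelled out as follows. This is a short derivation sketch, assuming that M_{∞,N} is differentiable in (h, ρ, λ) and that h ∂M_{∞,N}/∂h ≥ 0, which is what the inequality in the display preceding (5.14) relies on; the abbreviation M for M_{∞,N} is used here only for brevity.

```latex
% Differentiate M(\alpha h, \alpha\rho, \alpha\lambda) = M(h, \rho, \lambda) in \alpha at \alpha = 1.
\begin{align*}
0 &= \frac{d}{d\alpha}\,M(\alpha h, \alpha\rho, \alpha\lambda)\Big|_{\alpha=1}
   = h\,\frac{\partial M}{\partial h} + \rho\,\frac{\partial M}{\partial\rho} + \lambda\,\frac{\partial M}{\partial\lambda},\\
\rho\,\frac{\partial M}{\partial\rho}
  &= -\lambda\,\frac{\partial M}{\partial\lambda} - h\,\frac{\partial M}{\partial h}
   \le -\lambda\,\frac{\partial M}{\partial\lambda}
   \qquad\text{(using } h\,\partial M/\partial h \ge 0\text{)}.
\end{align*}
```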
Together with the first of (1.4) (for M∞,N ) the inequality (5.14) sets up the stage for an analysis of sharpness of the σˆ z phase transition literally along the lines of [2,3]. Acknowledgements. Our proof of exponential decay is based on an argument which was developed in the classical setting together with Roberto Fernandez and Yvan Velenik (see [14]). We are grateful to Anna Levit for useful remarks and a very careful reading of the first draft of this paper.
References
1. Aizenman, M.: Geometric analysis of φ^4 fields and Ising models. Commun. Math. Phys. 86(1), 1–48 (1982)
2. Aizenman, M., Barsky, D.J.: Sharpness of the phase transition in percolation models. Commun. Math. Phys. 108(3), 489–529 (1987)
3. Aizenman, M., Barsky, D.J., Fernández, R.: The phase transition in a general class of Ising-type models is sharp. J. Stat. Phys. 47(3-4), 343–374 (1987)
4. Aizenman, M., Fernández, R.: On the critical behavior of the magnetization in high-dimensional Ising models. J. Stat. Phys. 44(3-4), 393–454 (1986)
5. Aizenman, M., Klein, A., Newman, C.: Percolation methods for disordered quantum Ising models. In: Kotecky, R., ed., Phase Transitions: Mathematics, Physics, Biology, ..., Singapore: World Scientific, 1993, pp. 1–26
6. Aizenman, M., Nachtergaele, B.: Geometric aspects of quantum spin states. Commun. Math. Phys. 164, 17–63 (1994)
7. Biskup, M., Chayes, L., Crawford, N.: Mean-field driven first-order phase transitions in systems with long-range interactions. J. Stat. Phys. 119(6), 1139–1193 (2006)
8. Björnberg, J.E., Grimmett, G.: The phase transition of the quantum Ising model is sharp. J. Stat. Phys. 136(2), 231–273 (2009)
9. Campanino, M., Klein, A., Perez, J.F.: Localization in the ground state of the Ising model with a random transverse field. Commun. Math. Phys. 135, 499–515 (1991)
10. Chayes, L., Crawford, N., Ioffe, D., Levit, A.: The phase diagram of the quantum Curie-Weiss model. J. Stat. Phys. 133(1), 131–149 (2008)
11. Ginibre, J.: Existence of phase transitions for quantum lattice systems. Commun. Math. Phys. 14, 205–234 (1969)
12. Griffiths, R.: Correlations in Ising Ferromagnets. II. J. Math. Phys. 8, 484 (1967)
13. Griffiths, R., Hurst, C., Sherman, S.: Concavity of magnetization of an Ising ferromagnet in a positive external field. J. Math. Phys. 11, 790 (1970)
14. Ioffe, D.: Stochastic geometry of classical and quantum Ising models. Lecture Notes in Mathematics 1970, Berlin-Heidelberg: Springer, 2000
15. Ioffe, D., Levit, A.: Long range order and giant components of quantum random graphs. Markov. Proc. Rel. Fields 13(3), 469–492 (2007)
16. Shlosman, S.: Signs of Ursell's functions. Commun. Math. Phys. 102(4), 679–686 (1985)
Communicated by M. Aizenman
Commun. Math. Phys. 296, 475–523 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1026-7
Communications in
Mathematical Physics
ℏ-adic Quantum Vertex Algebras and Their Modules Haisheng Li∗ Department of Mathematical Sciences, Rutgers University, Camden, NJ 08102, USA. E-mail:
[email protected] Received: 9 March 2009 / Accepted: 22 December 2009 Published online: 7 March 2010 – © Springer-Verlag 2010
Abstract: This is a paper in a series to study vertex algebra-like structures arising from various algebras including quantum affine algebras and Yangians. In this paper, we study notions of ℏ-adic nonlocal vertex algebra and ℏ-adic (weak) quantum vertex algebra, slightly generalizing Etingof-Kazhdan's notion of quantum vertex operator algebra. For any topologically free C[[ℏ]]-module W, we study ℏ-adically compatible subsets and ℏ-adically S-local subsets of (End W)[[x, x^{-1}]]. We prove that any ℏ-adically compatible subset generates an ℏ-adic nonlocal vertex algebra with W as a module and that any ℏ-adically S-local subset generates an ℏ-adic weak quantum vertex algebra with W as a module. A general construction theorem of ℏ-adic nonlocal vertex algebras and ℏ-adic quantum vertex algebras is obtained. As an application we associate the centrally extended double Yangian of sl_2 to ℏ-adic quantum vertex algebras.
1. Introduction

In [EK], one of an important series of papers, Etingof and Kazhdan introduced a fundamental notion of quantum vertex operator algebra and they constructed a family of quantum vertex operator algebras which are formal deformations of vertex operator algebras associated with affine Lie algebras \widehat{sl}_{n+1}. For a quantum vertex operator algebra in this sense, the underlying space is a topologically free C[[ℏ]]-module V = V^0[[ℏ]] with V^0 a vector space over C, and the vertex operator map Y is a C[[ℏ]]-module map from V to Hom(V, V^0((x))[[ℏ]]), where the key axioms are a quasi commutativity property, called S-locality, and an associativity property. Furthermore, the S-locality by assumption is governed by a unitary rational quantum Yang-Baxter operator S(x) on V. It follows from the definition that V/ℏV is an ordinary vertex algebra (over C), so quantum vertex operator algebras in this sense are formal deformations of vertex algebras. As was mentioned therein, a generalization of this theory to the super case is straightforward.

∗ Partially supported by NSF grant DMS-0600189.

Inspired by [EK], in a series of papers ([Li4,Li5,Li6,KL]) we have extensively studied a notion of (weak) quantum vertex algebra, as a generalization of the notions of vertex algebra and vertex superalgebra. For a (weak) quantum vertex algebra V in this sense, the underlying space is a vector space over C and the vertex operator map Y is a linear map from V to Hom(V, V((x))), which satisfies a certain braided Jacobi identity, or equivalently an S-locality and associativity. This theory of (weak) quantum vertex algebras has many of the features of the theory of ordinary vertex (super-)algebras. For example, as was proved in [Li4], weak quantum vertex algebras and their modules can be constructed from what were called S-local sets of vertex operators on an arbitrarily given vector space, just as vertex (super)algebras and modules can be constructed from "mutually local" vertex operators (see [Li1]). Examples of quantum vertex algebras and modules were constructed in [Li5] from Zamolodchikov-Faddeev algebras of a certain type and in [Li6] from q-versions of double Yangian DY_ℏ(sl_2) with q a nonzero complex number.

In this paper, we come back to Etingof-Kazhdan's notion of quantum vertex operator algebra with a slight generalization such that the classical limits are more general quantum vertex algebras. More specifically, we systematically study notions of ℏ-adic nonlocal vertex algebra and ℏ-adic (weak) quantum vertex algebra, and we establish general construction theorems, with the ultimate goal to associate such ℏ-adic quantum vertex algebras to centrally extended double Yangians essentially in the same way that affine Lie algebras were associated with vertex operator algebras.
An ℏ-adic nonlocal vertex algebra will be a topologically free C[[ℏ]]-module V equipped with a C[[ℏ]]-module map Y from V to (End V)[[x, x^{-1}]] and a vector 1 ∈ V such that for every positive integer n, V/ℏ^n V is a nonlocal vertex algebra over C, while an ℏ-adic weak quantum vertex algebra is an ℏ-adic nonlocal vertex algebra V that satisfies S-locality in the sense of [EK] with S(x) only a C[[ℏ]]-module map without any other assumption. Furthermore, an ℏ-adic quantum vertex algebra is an ℏ-adic weak quantum vertex algebra V such that the S-locality operator S(x) is a unitary rational quantum Yang-Baxter operator and satisfies the shift condition and hexagon identity as in [EK].

For each finite-dimensional simple Lie algebra g, Drinfeld (see [Dr1]) introduced a Hopf algebra Y(g), called Yangian, as a deformation of the universal enveloping algebra U(g[t]) of Lie algebra g[t]. Then the double DY_ℏ(g) of Y(g) in the sense of Drinfeld was studied in [KT]. Furthermore, centrally extended double Yangian \widehat{DY_ℏ(g)} was studied in [Kh] and [IK], as a deformation of the universal enveloping algebra U(ĝ) of the affine Lie algebra ĝ, where a vertex operator representation was also given. Our objective is to establish a canonical association of the centrally extended double Yangians (in which the parameter ℏ is a formal variable, instead of a complex number) with vertex algebra-like structures. This is our main motivation to study ℏ-adic (weak) quantum vertex algebras.

In this paper we build the foundation for this theory of ℏ-adic (weak) quantum vertex algebras. As one of the main results, we establish a general construction of ℏ-adic weak quantum vertex algebras and their modules. This is an ℏ-adic version of the general construction in [Li5] of weak quantum vertex algebras and their modules, and we here extensively use the results therein. Let W be a general C[[ℏ]]-module. Consider formal series
\[
a(x) = \sum_{m\in\mathbb{Z}} a_m x^{-m-1} \in (\mathrm{End}\,W)[[x, x^{-1}]]
\]
satisfying the condition that for every w ∈ W and for every positive integer n, there exists an integer k such that a_m w ∈ ℏ^n W for m ≥ k, and let E(W) consist of all such a(x). In the case that W = W^0[[ℏ]] is topologically free (with W^0 a vector space over C), we have E(W) = E(W^0)[[ℏ]] which is also topologically free. (Recall that for a vector space U over C, E(U) = Hom(U, U((x))).) We then study what we call ℏ-adically compatible subsets and ℏ-adically S-local subsets of E(W). We prove that any ℏ-adically compatible subset of E(W) generates an ℏ-adic nonlocal vertex algebra with W as a canonical module, while an ℏ-adically S-local subset of E(W) generates an ℏ-adic weak quantum vertex algebra with W as a canonical module.

The fact is that the generating functions of a centrally extended double Yangian on a highest weight module W together with the identity operator 1_W form an ℏ-adically S-local subset of E(W), and hence one has an ℏ-adic weak quantum vertex algebra generated by those generating functions. It was known (see [Li1,LL]) that if W is a highest weight module for the affine Lie algebra ĝ of level ℓ ∈ C, then the canonical generating functions of ĝ generate a vertex algebra, which can be identified as a so-called vacuum ĝ-module of the same level ℓ. For centrally extended double Yangians, the situation is different; the generated ℏ-adic weak quantum vertex algebra is not a module for \widehat{DY_ℏ(g)}, but it is a module for a certain cover of \widehat{DY_ℏ(g)}. This is mainly due to the fact that the field associated to a Cartan element is broken into two fields in the quantum case. In this paper, we pick up the simplest case with g = sl_2 and work out the details.
More specifically, we introduce a cover \widetilde{DY_ℏ(sl_2)} of \widehat{DY_ℏ(sl_2)} and by using our general construction we show that on a universal vacuum \widetilde{DY_ℏ(sl_2)}-module of a generic level (which is defined suitably), there exists a canonical ℏ-adic quantum vertex algebra structure with every highest weight \widehat{DY_ℏ(sl_2)}-module of the same level as a module. In principle, a generalization to the centrally extended double Yangian of a general finite-dimensional simple Lie algebra can be done in a similar way, but one has to deal with the complicated Serre type relations. We plan to study this in a future publication.

In this paper, we also construct a family of ℏ-adic quantum vertex algebras as deformations of certain quantum vertex algebras which were studied in [KL]. Those quantum vertex algebras were constructed by using certain generalized Weyl-Clifford algebras, or Zamolodchikov-Faddeev algebras. In one special case, we obtain a quantum βγ-system, and in another we obtain a formal deformation of the vertex operator superalgebra V_L associated with the lattice L = Zα with ⟨α, α⟩ = 1.

There is a very interesting paper [AB], in which Anguelova and Bergvelt studied a broad class of vertex algebra-like structures called H_D-quantum vertex algebras, using some ideas of Borcherds from [B2], and they constructed certain interesting examples by employing Borcherds' bicharacter construction. This notion of H_D-quantum vertex algebra generalizes the notion of braided vertex operator algebra in [EK] in several aspects. For example, the braiding operator S (describing quasi locality) is allowed to have two (independent) spectral parameters, instead of one. What we here call ℏ-adic quantum vertex algebras can be considered as a subfamily of H_D-quantum vertex algebras.
A drawback of this generality is that general H_D-quantum vertex algebras, just as Etingof-Kazhdan's braided vertex operator algebras, fail to satisfy the usual associativity for vertex algebras, though they do satisfy a braided associativity. On the other hand, weak quantum vertex algebras in the sense of [Li4] and ℏ-adic weak quantum vertex algebras all satisfy the usual associativity, which promises a transparent representation theory. Especially, examples of ℏ-adic quantum vertex algebras (and their modules) are
constructed by using vertex operators on potential modules from a representation point of view.

This paper is organized as follows: In Sect. 2, we study ℏ-adic nonlocal vertex algebras and ℏ-adic (weak) quantum vertex algebras and present some basic results. In Sect. 3, we present some technical results. In particular we discuss ℏ-adic nonlocal vertex subalgebras. In Sect. 4, we give a general construction of ℏ-adic (weak) quantum vertex algebras and their modules using ℏ-adic S-local sets of (formal) vertex operators. In Sect. 5, we construct some ℏ-adic quantum vertex algebras which are deformations of certain quantum vertex algebras. In Sect. 6, as an application of the general construction, we associate the centrally extended double Yangian of sl_2 with ℏ-adic quantum vertex algebras.

2. ℏ-adic Nonlocal Vertex Algebras and ℏ-adic Weak Quantum Vertex Algebras

In this section we study the notions of ℏ-adic nonlocal vertex algebra and ℏ-adic (weak) quantum vertex algebra, and we present basic properties of ℏ-adic nonlocal vertex algebras. The notion of ℏ-adic (weak) quantum vertex algebra slightly generalizes Etingof-Kazhdan's notion of quantum vertex operator algebra.

In this paper, we use the standard formal variable notations and conventions as established in [FLM] (cf. [LL]). The scalar field will be the field C of complex numbers, and N and Z_+ denote the set of nonnegative integers and the set of positive integers, respectively.

We start by recalling the notion of nonlocal vertex algebra (cf. [BK,Li2]). A nonlocal vertex algebra is a vector space V, equipped with a linear map
\[
Y : V \to \mathrm{Hom}(V, V((x))) \subset (\mathrm{End}\,V)[[x, x^{-1}]], \qquad
v \mapsto Y(v, x) = \sum_{n\in\mathbb{Z}} v_n x^{-n-1} \quad (v_n \in \mathrm{End}\,V),
\]
and equipped with a distinguished vector 1 ∈ V, satisfying the conditions that
\[
Y(\mathbf{1}, x) = 1,
\tag{2.1}
\]
\[
Y(v, x)\mathbf{1} \in V[[x]] \quad\text{and}\quad \lim_{x\to 0} Y(v, x)\mathbf{1}\ (= v_{-1}\mathbf{1}) = v \quad\text{for } v \in V,
\tag{2.2}
\]
and that for u, v, w ∈ V, there exists l ∈ N such that
\[
(x_0 + x_2)^l\, Y(u, x_0 + x_2) Y(v, x_2) w = (x_0 + x_2)^l\, Y(Y(u, x_0)v, x_2) w
\tag{2.3}
\]
in V[[x_0^{±1}, x_2^{±1}]] (the weak associativity). For a nonlocal vertex algebra V, define a linear operator D on V by
\[
Dv = v_{-2}\mathbf{1} = \lim_{x\to 0} \frac{d}{dx} Y(v, x)\mathbf{1} \quad\text{for } v \in V.
\tag{2.4}
\]
We have ([Li2], Prop. 2.6)
\[
[D, Y(v, x)] = Y(Dv, x) = \frac{d}{dx} Y(v, x) \quad\text{for } v \in V,
\tag{2.5}
\]
\[
e^{xD}\, Y(v, x_1)\, e^{-xD} = Y(e^{xD} v, x_1) = Y(v, x_1 + x),
\tag{2.6}
\]
and furthermore,
\[
Y(v, x)\mathbf{1} = e^{xD} v.
\tag{2.7}
\]
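As a quick consistency check (a sketch using only (2.7) and the mode expansion Y(v, x) = Σ v_n x^{-n-1}), comparing coefficients of powers of x recovers the creation property (2.2) and the formula (2.4) for D:

```latex
% Expand both sides of (2.7) in powers of x and compare coefficients.
\begin{align*}
Y(v,x)\mathbf{1} &= \sum_{n\in\mathbb{Z}} v_n\mathbf{1}\,x^{-n-1},
\qquad
e^{xD}v = \sum_{j\ge 0}\frac{x^{j}}{j!}\,D^{j}v,\\
v_n\mathbf{1} &= 0 \quad (n\ge 0),
\qquad
v_{-j-1}\mathbf{1} = \frac{1}{j!}\,D^{j}v \quad (j\ge 0).
\end{align*}
% j = 0 gives v_{-1}1 = v, as in (2.2); j = 1 gives v_{-2}1 = Dv, as in (2.4).
```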
In [Li4], the following class of nonlocal vertex algebras was singled out:

Definition 2.1. A weak quantum vertex algebra is a nonlocal vertex algebra V, satisfying S-locality: For u, v ∈ V, there exist
\[
\sum_{i=1}^{r} u^{(i)} \otimes v^{(i)} \otimes f_i(x) \in V \otimes V \otimes \mathbb{C}((x))
\]
and a nonnegative integer k such that
\[
(x_1 - x_2)^k\, Y(u, x_1) Y(v, x_2) = (x_1 - x_2)^k \sum_{i=1}^{r} f_i(x_2 - x_1)\, Y(v^{(i)}, x_2) Y(u^{(i)}, x_1),
\tag{2.8}
\]
where f_i(x_2 − x_1) is to be expanded in the nonnegative powers of x_1, i.e., in view of the formal Taylor theorem,
\[
f_i(x_2 - x_1) = e^{-x_1 \frac{\partial}{\partial x_2}} f_i(x_2) \in \mathbb{C}((x_2))[[x_1]].
\]
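To illustrate the expansion convention, here is a worked instance with the hypothetical choice f(x) = x^{-1} (an example chosen here for illustration, not taken from the paper). It also shows that the two possible expansions of the same rational function differ by a formal delta function, which is why the convention has to be fixed.

```latex
% Expansion in nonnegative powers of x_1, as prescribed after (2.8):
\begin{align*}
(x_2 - x_1)^{-1}
  &= e^{-x_1\frac{\partial}{\partial x_2}}\,x_2^{-1}
   = \sum_{j\ge 0}\frac{(-x_1)^{j}}{j!}\Bigl(\frac{\partial}{\partial x_2}\Bigr)^{j} x_2^{-1}
   = \sum_{j\ge 0} x_1^{\,j}\,x_2^{-j-1}.
\end{align*}
% The opposite convention expands in nonnegative powers of x_2:
\begin{align*}
(-x_1 + x_2)^{-1} &= -\sum_{j\ge 0} x_1^{-j-1}\,x_2^{\,j},\\
(x_2 - x_1)^{-1} - (-x_1 + x_2)^{-1}
  &= \sum_{n\in\mathbb{Z}} x_1^{\,n}\,x_2^{-n-1}
   = x_2^{-1}\,\delta\!\Bigl(\frac{x_1}{x_2}\Bigr).
\end{align*}
```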
Remark 2.2. Let V be a weak quantum vertex algebra. Let u, v, w ∈ V and assume that (2.8) holds. Then weak associativity relation (2.3) and (2.8) imply
\[
\begin{aligned}
& x_0^{-1}\delta\Bigl(\frac{x_1 - x_2}{x_0}\Bigr) Y(u, x_1) Y(v, x_2) w
 - x_0^{-1}\delta\Bigl(\frac{x_2 - x_1}{-x_0}\Bigr) \sum_{i=1}^{r} f_i(-x_0)\, Y(v^{(i)}, x_2) Y(u^{(i)}, x_1) w \\
&\qquad = x_2^{-1}\delta\Bigl(\frac{x_1 - x_0}{x_2}\Bigr) Y(Y(u, x_0)v, x_2) w
\end{aligned}
\tag{2.9}
\]
(the S-Jacobi identity). In fact, the notion of weak quantum vertex algebra can be alternatively defined by using all the axioms that define the notion of nonlocal vertex algebra except that the weak associativity axiom is replaced by the S-Jacobi identity.

Definition 2.3. Let U be a vector space. A unitary rational quantum Yang-Baxter operator (with one parameter) on U is a linear map S(x) : U ⊗ U → U ⊗ U ⊗ C((x)) satisfying the conditions that
\[
S^{21}(x) S(-x) = 1,
\]
\[
S^{12}(x_1) S^{13}(x_1 + x_2) S^{23}(x_2) = S^{23}(x_2) S^{13}(x_1 + x_2) S^{12}(x_1),
\]
where S^{21}(x) = P S(x) P with P the permutation operator on U ⊗ U.
The following notion is essentially due to Etingof and Kazhdan [EK]:

Definition 2.4. A quantum vertex algebra is a nonlocal vertex algebra V equipped with a unitary rational quantum Yang-Baxter operator S(x) on V, satisfying the conditions that
\[
[D \otimes 1, S(x)] = -\frac{dS(x)}{dx} \quad\text{(the shift condition)},
\tag{2.10}
\]
and that for any u, v ∈ V, there exists a nonnegative integer k such that
\[
(x_1 - x_2)^k\, Y(x_1)(1 \otimes Y(x_2))\bigl(S(x_1 - x_2)(u \otimes v) \otimes w\bigr) = (x_1 - x_2)^k\, Y(x_2)(1 \otimes Y(x_1))(v \otimes u \otimes w)
\tag{2.11}
\]
for all w ∈ V, and that
\[
S(x)(Y(z) \otimes 1) = (Y(z) \otimes 1)\, S^{23}(x)\, S^{13}(x + z)
\tag{2.12}
\]
(the hexagon identity), where Y(x) : V ⊗ V → V((x)) is the map associated to the vertex operator map Y(·, x) : V → Hom(V, V((x))).

The following notion is due to Etingof and Kazhdan [EK]:

Definition 2.5. Let V be a nonlocal vertex algebra. For each positive integer n, define a linear map
\[
Z_n : \mathbb{C}((x_1))\cdots((x_n)) \otimes V^{\otimes n} \to V((x_1))\cdots((x_n))
\tag{2.13}
\]
by
\[
Z_n(f \otimes v^{(1)} \otimes \cdots \otimes v^{(n)}) = f(x_1, \ldots, x_n)\, Y(v^{(1)}, x_1) \cdots Y(v^{(n)}, x_n)\mathbf{1}
\tag{2.14}
\]
for v^{(1)}, ..., v^{(n)} ∈ V, f ∈ C((x_1))···((x_n)). If all the linear maps Z_n for n ≥ 1 are injective, V is said to be nondegenerate.

The following proposition ([Li4], Theorems 4.8 and 5.11) was lifted from [EK]:

Proposition 2.6. Let V be a weak quantum vertex algebra. Assume that V is nondegenerate. Then there exists a unique linear map S(x) : V ⊗ V → V ⊗ V ⊗ C((x)) satisfying the condition that for any u, v ∈ V, there exists a nonnegative integer k such that (2.11) holds with
\[
S(x)(v \otimes u) = \sum_{i=1}^{r} v^{(i)} \otimes u^{(i)} \otimes f_i(x).
\]
Furthermore, V equipped with S(x) is a quantum vertex algebra.

Definition 2.7. Let V be a nonlocal vertex algebra. A V-module is a vector space W equipped with a linear map
\[
Y_W : V \to \mathrm{Hom}(W, W((x))) \subset (\mathrm{End}\,W)[[x, x^{-1}]], \qquad
v \mapsto Y_W(v, x) = \sum_{n\in\mathbb{Z}} v_n x^{-n-1} \quad (v_n \in \mathrm{End}\,W)
\]
satisfying the conditions that
\[
Y_W(\mathbf{1}, x) = 1_W \quad\text{(where } 1_W \text{ denotes the identity operator on } W\text{)},
\tag{2.15}
\]
and that for any u, v ∈ V, w ∈ W, there exists l ∈ N such that
\[
(x_0 + x_2)^l\, Y_W(u, x_0 + x_2) Y_W(v, x_2) w = (x_0 + x_2)^l\, Y_W(Y(u, x_0)v, x_2) w.
\tag{2.16}
\]
A quasi V-module is defined by using all the above axioms except that the last weak associativity axiom is replaced by a weaker axiom: For any u, v ∈ V, w ∈ W, there exists 0 ≠ p(x_1, x_2) ∈ C[x_1, x_2] such that
\[
p(x_0 + x_2, x_2)\, Y_W(u, x_0 + x_2) Y_W(v, x_2) w = p(x_0 + x_2, x_2)\, Y_W(Y(u, x_0)v, x_2) w.
\tag{2.17}
\]

Next, we study ℏ-adic analogues. Let ℏ be a formal variable throughout this paper. A C[[ℏ]]-module W is said to be torsion-free if ℏw ≠ 0 for every 0 ≠ w ∈ W, and is said to be separated if ∩_{n≥1} ℏ^n W = 0. For a C[[ℏ]]-module W, using subsets w + ℏ^n W for w ∈ W, n ≥ 1 as the basis of open sets one obtains a topology on W, which is called the ℏ-adic topology. A C[[ℏ]]-module W is said to be ℏ-adically complete if every Cauchy sequence in W with respect to this ℏ-adic topology has a limit in W. A C[[ℏ]]-module W is topologically free if W = W^0[[ℏ]] for some vector space W^0 over C. It is a fact that a C[[ℏ]]-module is topologically free if and only if it is torsion-free, separated, and ℏ-adically complete (cf. [K]).

Definition 2.8. Let W be a C[[ℏ]]-module. Define E(W) to consist of each formal series
\[
a(x) = \sum_{m\in\mathbb{Z}} a_m x^{-m-1} \in (\mathrm{End}\,W)[[x, x^{-1}]]
\]
such that for every w ∈ W, a_m w → 0 as m → ∞, that is, for every positive integer n, a_m w ∈ ℏ^n W for m sufficiently large.
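As an illustration of Definition 2.8, consider the following hypothetical example (chosen here, not taken from the paper): take W = C[[ℏ]] and let a_m act as multiplication by ℏ^m for m ≥ 0 and as zero for m < 0. The sketch below checks the defining condition and notes that a(x) involves infinitely many negative powers of x, so it is not of the classical form Hom(W, W((x))).

```latex
% Hypothetical example: W = \mathbb{C}[[\hbar]], a_m = \hbar^m (m \ge 0), a_m = 0 (m < 0).
\begin{align*}
a(x) &= \sum_{m\ge 0}\hbar^{m}x^{-m-1} \in (\mathrm{End}\,W)[[x,x^{-1}]],\\
a_m w &= \hbar^{m}w \in \hbar^{n}W \quad\text{for all } w\in W \text{ and all } m\ge n,
\end{align*}
% so a(x) satisfies the condition of Definition 2.8.
% On the other hand a_m 1 = \hbar^m \ne 0 for every m \ge 0, so a(x)1 has infinitely many
% negative powers of x and a(x) \notin \mathrm{Hom}(W, W((x))).
% Modulo \hbar^n only the terms with m < n survive:
\[
a(x) \equiv \sum_{m=0}^{n-1}\hbar^{m}x^{-m-1} \pmod{\hbar^{n}}.
\]
```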
For every C[[ℏ]]-endomorphism F of W, F preserves the submodules ℏ^n W for n ∈ N, so that F gives rise to an endomorphism of W/ℏ^n W for each n ∈ N. In this way, we have natural C[[ℏ]]-module homomorphisms π̃_n : End W → End(W/ℏ^n W) for n ∈ N. We also use π̃_n for its canonical extensions, the C[[ℏ]]-module homomorphisms from (End W)[[x_1^{±1}, ..., x_r^{±1}]] to (End(W/ℏ^n W))[[x_1^{±1}, ..., x_r^{±1}]] for r ≥ 1. In terms of the maps π̃_n we have
\[
E(W) = \bigl\{\, a(x) \in (\mathrm{End}\,W)[[x, x^{-1}]] \;\big|\; \tilde\pi_n(a(x)) \in E(W/\hbar^n W) \text{ for all } n \in \mathbb{N} \,\bigr\}.
\tag{2.18}
\]

Definition 2.9. An ℏ-adic nonlocal vertex algebra is a topologically free C[[ℏ]]-module V, equipped with a C[[ℏ]]-module map
\[
Y : V \to E(V) \subset (\mathrm{End}\,V)[[x, x^{-1}]], \qquad
v \mapsto Y(v, x) = \sum_{n\in\mathbb{Z}} v_n x^{-n-1},
\]
and equipped with a distinguished vector 1 ∈ V, satisfying the conditions that
\[
Y(\mathbf{1}, x) = 1, \qquad Y(v, x)\mathbf{1} \in V[[x]] \quad\text{and}\quad \lim_{x\to 0} Y(v, x)\mathbf{1}\ (= v_{-1}\mathbf{1}) = v \quad\text{for } v \in V,
\]
and that for u, v, w ∈ V and for n ∈ N, there exists l ∈ N such that
\[
(x_0 + x_2)^l\, Y(u, x_0 + x_2) Y(v, x_2) w \equiv (x_0 + x_2)^l\, Y(Y(u, x_0)v, x_2) w
\tag{2.19}
\]
modulo ℏ^n V[[x_0^{±1}, x_2^{±1}]] (the ℏ-adic weak associativity).

Remark 2.10. Notice that for r, s ∈ Z, the coefficient of the monomial x_0^r x_2^s in the expression Y(u, x_0 + x_2)Y(v, x_2)w is
\[
\sum_{i\ge 0} \binom{r+i}{i}\, u_{-r-1-i}\, v_{-s-1+i}\, w,
\]
which is an infinite sum in general, though it converges to an element of V. This is one of the few places where V needs to be ℏ-adically complete.

The following is a characterization of an ℏ-adic nonlocal vertex algebra in terms of nonlocal vertex algebras over C:

Proposition 2.11. Let V be a topologically free C[[ℏ]]-module, equipped with a vector 1 ∈ V and a C[[ℏ]]-module map Y from V to (End V)[[x, x^{-1}]]. Then (V, Y, 1) carries the structure of an ℏ-adic nonlocal vertex algebra if and only if for every n ∈ N, (V/ℏ^n V, Y^{(n)}, 1 + ℏ^n V) is a nonlocal vertex algebra over C, where Y^{(n)} : V/ℏ^n V → (End(V/ℏ^n V))[[x, x^{-1}]] is the canonical map reduced from Y.

Proof. From the definition it is clear that if (V, Y, 1) is an ℏ-adic nonlocal vertex algebra, V/ℏ^n V is a nonlocal vertex algebra over C for every n ∈ N. Now, assume that for every n ∈ N, V/ℏ^n V is a nonlocal vertex algebra over C. For v ∈ V, we have π̃_n(Y(v, x)) ∈ E(V/ℏ^n V) for n ∈ N. Thus Y(v, x) ∈ E(V). On the other hand, for each n ∈ N, with 1 + ℏ^n V being the vacuum vector of V/ℏ^n V, we have Y(1, x)v − v ∈ ℏ^n V[[x, x^{-1}]] for v ∈ V. Because V is separated, we must have Y(1, x)v − v = 0. Similarly, we can show Y(v, x)1 ∈ V[[x]] and lim_{x→0} Y(v, x)1 = v. The weak associativity of V/ℏ^n V for n ∈ N exactly amounts to the ℏ-adic weak associativity. Then (V, Y, 1) is an ℏ-adic nonlocal vertex algebra.

Remark 2.12. Let V be an ℏ-adic nonlocal vertex algebra. We have a projective inverse system of nonlocal vertex algebras over C (or over C[[ℏ]]):
\[
0 \leftarrow V/\hbar V \leftarrow V/\hbar^2 V \leftarrow V/\hbar^3 V \leftarrow \cdots,
\tag{2.20}
\]
and the ℏ-adic nonlocal vertex algebra V can be considered as an inverse limit.

Using Proposition 2.11 (and the arguments in the proof) we immediately have:

Lemma 2.13. Let V be an ℏ-adic nonlocal vertex algebra. Define D ∈ End V by
\[
D(v) = v_{-2}\mathbf{1} \quad\text{for } v \in V.
\tag{2.21}
\]
Then
\[
[D, Y(v, x)] = Y(Dv, x) = \frac{d}{dx} Y(v, x) \quad\text{for } v \in V.
\]
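The coefficient formula in Remark 2.10 can be checked by a direct expansion. The following is a sketch, assuming the binomial convention of [FLM] in which (x_0 + x_2)^m is expanded in nonnegative powers of the second variable x_2:

```latex
\begin{align*}
Y(u,x_0+x_2)Y(v,x_2)w
 &= \sum_{m,k\in\mathbb{Z}} u_m v_k w\,(x_0+x_2)^{-m-1}x_2^{-k-1}\\
 &= \sum_{m,k\in\mathbb{Z}}\sum_{i\ge 0}\binom{-m-1}{i}\,u_m v_k w\;x_0^{-m-1-i}\,x_2^{\,i-k-1}.
\end{align*}
% The monomial x_0^r x_2^s forces m = -r-1-i and k = -s-1+i, so \binom{-m-1}{i} = \binom{r+i}{i}
% and the coefficient is \sum_{i\ge 0}\binom{r+i}{i} u_{-r-1-i} v_{-s-1+i} w.
% Since Y(v,x) \in E(V), v_{-s-1+i}w tends to 0 in the \hbar-adic topology as i grows,
% and each u_m is \mathbb{C}[[\hbar]]-linear, so the sum converges.
```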
Let V be an ℏ-adic nonlocal vertex algebra. Following [EK], let
\[
Y(x) : V \hat\otimes V \to V[[x, x^{-1}]]
\]
be the C[[ℏ]]-module map associated to the vertex operator map Y : V → (End V)[[x, x^{-1}]], where here and henceforth V ⊗̂ V and V ⊗̂ V ⊗̂ C((x))[[ℏ]] stand for the ℏ-adically completed tensor products. If V = V^0[[ℏ]] with V^0 a C-vector space, we have
\[
V \hat\otimes V = (V^0 \otimes V^0)[[\hbar]] \quad\text{and}\quad V \hat\otimes V \hat\otimes \mathbb{C}((x))[[\hbar]] = (V^0 \otimes V^0 \otimes \mathbb{C}((x)))[[\hbar]].
\]

Definition 2.14. Let V be an ℏ-adic nonlocal vertex algebra. Define a C[[ℏ]]-module map
\[
Y(x_1, x_2) : V \hat\otimes V \to (\mathrm{End}\,V)[[x_1^{\pm 1}, x_2^{\pm 1}]]
\]
by
\[
Y(x_1, x_2)(u \otimes v)(w) = Y(x_1)(1 \otimes Y(x_2))(u \otimes v \otimes w) = Y(u, x_1) Y(v, x_2) w
\tag{2.22}
\]
for u, v, w ∈ V.

Definition 2.15. An ℏ-adic weak quantum vertex algebra is an ℏ-adic nonlocal vertex algebra V which satisfies ℏ-adic S-locality: For u, v ∈ V, there exists F(u, v, x) ∈ V ⊗̂ V ⊗̂ C((x))[[ℏ]] satisfying the condition that for every positive integer n, there exists k ∈ N such that
\[
(x_1 - x_2)^k\, Y(u, x_1) Y(v, x_2) w \equiv (x_1 - x_2)^k\, Y(x_2)(1 \otimes Y(x_1))\bigl(F(u, v, x_2 - x_1) \otimes w\bigr)
\tag{2.23}
\]
modulo ℏ^n V[[x_1^{±1}, x_2^{±1}]] for all w ∈ V.

Remark 2.16. Let V be an ℏ-adic weak quantum vertex algebra. We see that for every positive integer n, V/ℏ^n V is a weak quantum vertex algebra over C. For u, v, w ∈ V, by Remark 2.2 we have
\[
\begin{aligned}
& x_0^{-1}\delta\Bigl(\frac{x_1 - x_2}{x_0}\Bigr) Y(u, x_1) Y(v, x_2) w
 - x_0^{-1}\delta\Bigl(\frac{x_2 - x_1}{-x_0}\Bigr) Y(x_2)(1 \otimes Y(x_1))\bigl(F(u, v, -x_0) \otimes w\bigr) \\
&\qquad \equiv x_2^{-1}\delta\Bigl(\frac{x_1 - x_0}{x_2}\Bigr) Y(Y(u, x_0)v, x_2) w
\end{aligned}
\tag{2.24}
\]
modulo ℏ^n V[[x_0^{±1}, x_1^{±1}, x_2^{±1}]]. Since V is ℏ-adically complete, the coefficient of each monomial x_0^r x_1^p x_2^q for r, p, q ∈ Z in each of the three main terms is an element of V. As ∩_{n≥1} ℏ^n V = 0, we obtain
\[
\begin{aligned}
& x_0^{-1}\delta\Bigl(\frac{x_1 - x_2}{x_0}\Bigr) Y(u, x_1) Y(v, x_2) w
 - x_0^{-1}\delta\Bigl(\frac{x_2 - x_1}{-x_0}\Bigr) Y(x_2)(1 \otimes Y(x_1))\bigl(F(u, v, -x_0) \otimes w\bigr) \\
&\qquad = x_2^{-1}\delta\Bigl(\frac{x_1 - x_0}{x_2}\Bigr) Y(Y(u, x_0)v, x_2) w
\end{aligned}
\tag{2.25}
\]
(the S-Jacobi identity). Clearly, the S-Jacobi identity is equivalent to ℏ-adic weak associativity and ℏ-adic S-locality. In view of this, the notion of ℏ-adic weak quantum vertex algebra can be defined alternatively by using the S-Jacobi identity.

Definition 2.17. Let U be a C[[ℏ]]-module and let r be a positive integer. For A(x_1, ..., x_r), B(x_1, ..., x_r) ∈ U[[x_1^{±1}, ..., x_r^{±1}]], we write A ∼_± B if for every positive integer n, there exists a (nonzero) polynomial p(x_1, ..., x_r) ∈ ⟨x_i ± x_j | 1 ≤ i < j ≤ r⟩ ⊂ C[x_1, ..., x_r] such that
\[
p(x_1, \ldots, x_r)\bigl(A(x_1, \ldots, x_r) - B(x_1, \ldots, x_r)\bigr) \in \hbar^n U[[x_1^{\pm 1}, \ldots, x_r^{\pm 1}]].
\tag{2.26}
\]
Clearly, relations ∼_± on U[[x_1^{±1}, ..., x_r^{±1}]] are equivalence relations. It is also clear that the left multiplication by a Laurent polynomial and the formal partial differential operators ∂/∂x_1, ..., ∂/∂x_r preserve the equivalence relations. For convenience, we shall also use the notation ∼ for ∼_−.

If V is an ℏ-adic nonlocal vertex algebra, for u, v, w ∈ V we have
\[
Y(u, x_0 + x_2) Y(v, x_2) w \sim_+ Y(Y(u, x_0)v, x_2) w
\]
in V[[x_0^{±1}, x_2^{±1}]]. Furthermore, if V is an ℏ-adic weak quantum vertex algebra, for any u, v ∈ V, there exists F(u, v, x) ∈ V ⊗̂ V ⊗̂ C((x))[[ℏ]] such that
\[
Y(u, x_1) Y(v, x_2) \sim Y(x_2, x_1) F(u, v, x_2 - x_1)
\]
in (End V)[[x_1^{±1}, x_2^{±1}]].

Remark 2.18. Note that the equivalence relations ∼_± when restricted to certain subspaces of U[[x_1^{±1}, ..., x_r^{±1}]] amount to the equality relation. For example, let U = U^0[[ℏ]] be a topologically free C[[ℏ]]-module. For A(x_1, ..., x_r), B(x_1, ..., x_r) ∈ (U^0((x_1))···((x_r)))[[ℏ]], if A ∼_± B, we must have A = B. This is simply because (U^0((x_1))···((x_r)))[[ℏ]] is a vector space over the field C((x_1))···((x_r)) which contains x_i ± x_j for 1 ≤ i < j ≤ r.

Proposition 2.19. Let V be an ℏ-adic nonlocal vertex algebra and let u, v ∈ V, A(x) ∈ V ⊗̂ V ⊗̂ C((x))[[ℏ]]. Then
\[
Y(u, x_1) Y(v, x_2) \sim Y(x_2, x_1)(A(x_2 - x_1))
\]
if and only if
\[
Y(u, x)v = e^{xD}\, Y(-x)(A(-x)).
\tag{2.27}
\]
Furthermore, V is an ℏ-adic weak quantum vertex algebra if and only if there exists a C[[ℏ]]-module map
\[
S(x) : V \hat\otimes V \to V \hat\otimes V \hat\otimes \mathbb{C}((x))[[\hbar]]
\]
such that
\[
Y(u, x)v = e^{xD}\, Y(-x) S(-x)(v \otimes u) \quad\text{for } u, v \in V.
\tag{2.28}
\]

Proof. We only need to prove the first part. Note that for every positive integer n, V/ℏ^n V is a nonlocal vertex algebra over C. By Corollary 5.3 of [Li4], there exists a nonnegative integer k such that
\[
(x_1 - x_2)^k\, Y(u, x_1) Y(v, x_2) w \equiv (x_1 - x_2)^k\, Y(x_2, x_1)(A(x_2 - x_1)) w \mod \hbar^n V
\]
for all w ∈ V if and only if
\[
Y(u, x)v \equiv e^{xD}\, Y(-x)(A(-x)) \mod \hbar^n V.
\]
Since V is ℏ-adically complete, the coefficient of each power of x in e^{xD} Y(−x)(A(−x)) is an element of V. As ∩_{n≥1} ℏ^n V = 0, the relations Y(u, x)v ≡ e^{xD} Y(−x)(A(−x)) mod ℏ^n V for all n ≥ 1 amount to (2.27).

The following is a reformulation and a slight generalization of Etingof and Kazhdan's notion of quantum vertex operator algebra (see [EK]):

Definition 2.20. An ℏ-adic quantum vertex algebra is an ℏ-adic nonlocal vertex algebra V equipped with a C[[ℏ]]-module map
\[
S(x) : V \hat\otimes V \to V \hat\otimes V \hat\otimes \mathbb{C}((x))[[\hbar]],
\tag{2.29}
\]
which satisfies the shift condition:
\[
[D \otimes 1, S(x)] = -\frac{dS(x)}{dx},
\tag{2.30}
\]
the quantum Yang-Baxter equation:
\[
S^{12}(x_1) S^{13}(x_1 + x_2) S^{23}(x_2) = S^{23}(x_2) S^{13}(x_1 + x_2) S^{12}(x_1),
\tag{2.31}
\]
and the unitarity condition:
\[
S^{21}(x) S(-x) = 1,
\tag{2.32}
\]
subject to the following axioms:

(QA1) The ℏ-adic S-locality: For any u, v ∈ V and for any positive integer n, there exists k ≥ 0 such that for any w ∈ V the series (x_1 − x_2)^k Y(x_1)(1 ⊗ Y(x_2))(S(x_1 − x_2)(u ⊗ v) ⊗ w) and (x_1 − x_2)^k Y(x_2)(1 ⊗ Y(x_1))(v ⊗ u ⊗ w) coincide modulo ℏ^n V[[x_1^{±1}, x_2^{±1}]].

(QA4) The hexagon identity:
\[
S(x_1)(Y(x_2) \otimes 1) = (Y(x_2) \otimes 1)\, S^{23}(x_1)\, S^{13}(x_1 + x_2).
\tag{2.33}
\]
Let V be an -adic nonlocal vertex algebra. For a positive integer n, we define a C[[]]-module map ˆ ±1 ±1 ˆ Z nV : V ⊗n ⊗C((x 1 )) · · · ((x n ))[[]] → V [[x 1 , . . . , x n ]]
as in Definition 2.5. Recall that V /V is a nonlocal vertex algebra over C. Lemma 2.21. Let V = V 0 [[]] be an -adic nonlocal vertex algebra such that the nonlocal vertex algebra V /V over C is nondegenerate. Then for every positive integer n, the C[[]]-module map Z nV is injective. ˆ ˆ Proof. Let n be any positive integer and let 0 = A ∈ V ⊗n ⊗C((x1 )) · · · ((xn ))[[]]. Then
A = k (A0 + A1 + 2 A2 + · · · ) for some k ∈ N, Ai ∈ (V 0 )⊗n ⊗ C((x1 )) · · · ((xn )) with A0 = 0. Writing Y (x) = Y0 (x) + Y1 (x) + 2 Y2 (x) + · · · with Yi ∈ (EndV 0 )[[x, x −1 ]] for i ≥ 0, we have Y (x1 )(1 ⊗ Y (x2 )) · · · (1⊗(n−1) ⊗ Y (xn ))A = k Y0 (x1 )(1 ⊗ Y0 (x2 )) · · · (1⊗(n−1) ⊗ Y0 (xn ))(A0 ) + O(k+1 ). As V /V is nondegenerate we have Y0 (x1 )(1 ⊗ Y0 (x2 )) · · · (1⊗(n−1) ⊗ Y0 (xn ))(A0 ) = 0, so that Z nV (A) = 0. This proves that Z nV is injective.
The following is a reformulation of Proposition 1.11 of [EK]: Proposition 2.22. Let V be an -adic weak quantum vertex algebra such that the nonlocal vertex algebra V /V over C is nondegenerate. Then S-locality defines a unique C[[]]-module map ˆ → V ⊗V ˆ ⊗C((x)))[[]] ˆ S(x) : V ⊗V
(2.34)
with S(x)(u ⊗ v) = F(v, u, x) for u, v ∈ V as in Definition 2.15 and (V, Y, 1, S) carries the structure of an -adic quantum vertex algebra. Next, we study modules and quasi modules for -adic nonlocal vertex algebras. Definition 2.23. Let V be an -adic nonlocal vertex algebra. A V -module is a topologically free C[[]]-module W , equipped with a C[[]]-module map, YW : V → E(W ) ⊂ (EndW )[[x, x −1 ]], satisfying the conditions that YW (1, x) = 1W and that for u, v ∈ V, w ∈ W and for every positive integer n, there exists l ∈ N such that (x0 + x2 )l YW (u, x0 + x2 )YW (v, x2 )w ≡ (x0 + x2 )l YW (Y (u, x0 )v, x2 )w
modulo $\hbar^n W[[x_0^{\pm 1}, x_2^{\pm 1}]]$. (2.35)
We define a notion of quasi $V$-module by replacing the $\hbar$-adic weak associativity with the following axiom: For $u, v \in V$, $w \in W$ and for every positive integer $n$, there exists $0 \ne p(x_1, x_2) \in \mathbb{C}[x_1, x_2]$ such that
\[ p(x_0 + x_2, x_2) Y_W(u, x_0 + x_2) Y_W(v, x_2) w \equiv p(x_0 + x_2, x_2) Y_W(Y(u, x_0)v, x_2) w \tag{2.36} \]
modulo $\hbar^n W[[x_0^{\pm 1}, x_2^{\pm 1}]]$.
We have the following straightforward analogue of Proposition 2.11:
Proposition 2.24. Let $V$ be an $\hbar$-adic nonlocal vertex algebra, let $W$ be a topologically free $\mathbb{C}[[\hbar]]$-module, and let $Y_W$ be a $\mathbb{C}[[\hbar]]$-module map from $V$ to $(\mathrm{End}\,W)[[x, x^{-1}]]$. Then $(W, Y_W)$ is a (quasi) $V$-module if and only if for every positive integer $n$, $W/\hbar^n W$ is a (quasi) $V/\hbar^n V$-module.
The following is an $\hbar$-adic version of a result of [Li4]:
Proposition 2.25. Let $V$ be an $\hbar$-adic nonlocal vertex algebra, let
\[ m \in \mathbb{Z}, \quad u, v, c^{(0)}, c^{(1)}, \dots \in V, \quad A(x) \in V \hat{\otimes} V \hat{\otimes} \mathbb{C}((x))[[\hbar]] \]
such that $\lim_{j \to \infty} c^{(j)} = 0$, and let $(W, Y_W)$ be a $V$-module. If
\[ (x_1 - x_2)^m Y(u, x_1) Y(v, x_2) - (-x_2 + x_1)^m Y(x_2, x_1)(A(x)) = \sum_{j \ge 0} Y(c^{(j)}, x_2) \frac{1}{j!} \left(\frac{\partial}{\partial x_2}\right)^{j} x_2^{-1} \delta\!\left(\frac{x_1}{x_2}\right), \tag{2.37} \]
then
\[ (x_1 - x_2)^m Y_W(u, x_1) Y_W(v, x_2) - (-x_2 + x_1)^m Y_W(x_2, x_1)(A(x)) = \sum_{j \ge 0} Y_W(c^{(j)}, x_2) \frac{1}{j!} \left(\frac{\partial}{\partial x_2}\right)^{j} x_2^{-1} \delta\!\left(\frac{x_1}{x_2}\right). \tag{2.38} \]
If $(W, Y_W)$ is a faithful $V$-module, the converse also holds.
Proof. For any positive integer $n$, $V/\hbar^n V$ is a nonlocal vertex algebra over $\mathbb{C}$ and $W/\hbar^n W$ is a $(V/\hbar^n V)$-module. It follows from [Li4] (Prop. 6.7) that after being applied to a vector in $W$, (2.38) holds modulo $\hbar^n W$. With $W$ topologically free, $W$ is $\hbar$-adically complete and separated. Then (2.38) must hold. For the converse, for any positive integer $n$, denote by $\rho_n$ the $\mathbb{C}[[\hbar]]$-module map from $V$ to $\mathcal{E}(W/\hbar^n W)$. Clearly, $\hbar^n V \subset \ker \rho_n$. Then $V/\ker \rho_n$ is a nonlocal vertex algebra over $\mathbb{C}$ and $W/\hbar^n W$ is a faithful $(V/\ker \rho_n)$-module. Again, from [Li4] (Prop. 6.7), (2.37) modulo $\ker \rho_n$ holds. For $v \in \cap_{n \ge 1} \ker \rho_n$, with $W$ separated we have $Y_W(v, x) = 0$. As $W$ is faithful, we must have $\cap_{n \ge 1} \ker \rho_n = 0$. Since $V$ is $\hbar$-adically complete, (2.37) holds on $V$.
We shall also need the following variation:
Proposition 2.26. Let $V$ be an $\hbar$-adic nonlocal vertex algebra, let
\[ m \in \mathbb{Z}, \quad u, v, c^{(0)}, c^{(1)}, \dots \in V, \quad A(x) \in V \hat{\otimes} V \hat{\otimes} \mathbb{C}((x))[[\hbar]] \]
such that $\lim_{j \to \infty} c^{(j)} = 0$, and let $(W, Y_W)$ be a $V$-module. If
\[ (x_1 - x_2)^m Y(u, x_1) Y(v, x_2) - (-x_2 + x_1)^m Y(x_2, x_1)(A(x)) = \sum_{j \ge 0} Y(c^{(j)}, x_1) \frac{1}{j!} \left(\frac{\partial}{\partial x_2}\right)^{j} x_2^{-1} \delta\!\left(\frac{x_1}{x_2}\right) \tag{2.39} \]
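(note that the only change is in the variable $x$ of the vertex operators $Y(c^{(j)}, x)$), then
\[ (x_1 - x_2)^m Y_W(u, x_1) Y_W(v, x_2) - (-x_2 + x_1)^m Y_W(x_2, x_1)(A(x)) = \sum_{j \ge 0} Y_W(c^{(j)}, x_1) \frac{1}{j!} \left(\frac{\partial}{\partial x_2}\right)^{j} x_2^{-1} \delta\!\left(\frac{x_1}{x_2}\right). \tag{2.40} \]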
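If $(W, Y_W)$ is a faithful $V$-module, the converse also holds.
Proof. Let $(U, Y_U)$ be a $V$-module, e.g., $U = V$ or $U = W$. For any $v \in V$, $w \in U$ and for any $n \ge 1$, with $U/\hbar^n U$ a module for $V/\hbar^n V$ viewed as a nonlocal vertex algebra over $\mathbb{C}$, we have
\[ Y_U(Dv, x)w \equiv \frac{d}{dx} Y_U(v, x)w \mod \hbar^n U. \]
Since $U$ is separated, we have
\[ Y_U(Dv, x)w = \frac{d}{dx} Y_U(v, x)w. \]
Using this we get
\begin{align*}
\sum_{j \ge 0} Y_U(c^{(j)}, x_1) \frac{1}{j!} \left(\frac{\partial}{\partial x_2}\right)^{j} x_2^{-1} \delta\!\left(\frac{x_1}{x_2}\right)
&= \sum_{j \ge 0} \frac{1}{j!} \left(\frac{\partial}{\partial x_2}\right)^{j} \left( Y_U(c^{(j)}, x_2)\, x_2^{-1} \delta\!\left(\frac{x_1}{x_2}\right) \right) \\
&= \sum_{j \ge 0} \sum_{i=0}^{j} \frac{1}{(j-i)!} \left( \left(\frac{\partial}{\partial x_2}\right)^{j-i} Y_U(c^{(j)}, x_2) \right) \frac{1}{i!} \left(\frac{\partial}{\partial x_2}\right)^{i} x_2^{-1} \delta\!\left(\frac{x_1}{x_2}\right) \\
&= \sum_{j \ge 0} \sum_{i=0}^{j} \frac{1}{(j-i)!}\, Y_U(D^{j-i} c^{(j)}, x_2)\, \frac{1}{i!} \left(\frac{\partial}{\partial x_2}\right)^{i} x_2^{-1} \delta\!\left(\frac{x_1}{x_2}\right) \\
&= \sum_{i \ge 0} \sum_{r \ge 0} \frac{1}{r!}\, Y_U(D^{r} c^{(r+i)}, x_2)\, \frac{1}{i!} \left(\frac{\partial}{\partial x_2}\right)^{i} x_2^{-1} \delta\!\left(\frac{x_1}{x_2}\right).
\end{align*}
Notice that for $i \ge 0$, $\sum_{r \ge 0} \frac{1}{r!} D^{r} c^{(r+i)} \in V$ (as $V$ is $\hbar$-adically complete). Then the assertion follows from Proposition 2.25.

3. Some Technical Results
In this section we present certain technical results which we need in Sect. 4. In particular, we study $\hbar$-adic nonlocal vertex subalgebras and subalgebras generated by subsets of an $\hbar$-adic nonlocal vertex algebra.
Definition 3.1. Let $V$ be an $\hbar$-adic nonlocal vertex algebra. An $\hbar$-adic nonlocal vertex subalgebra is a $\mathbb{C}[[\hbar]]$-submodule $U$ containing $1$ such that $(U, Y, 1)$ carries the structure of an $\hbar$-adic nonlocal vertex algebra. In particular, $U$ is a topologically free submodule.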
We say that a C[[]]-submodule U of an -adic nonlocal vertex algebra V is Y -closed if u m v ∈ U for all u, v ∈ U, m ∈ Z. (This is to distinguish the algebraic closedness from the topological closedness.) Remark 3.2. Let U be a C[[]]-submodule of a topologically free C[[]]-module V . With n U ⊂ U ∩ n V for n ∈ N, we see that the induced topology on U from V (with the -adic topology) coincides with the -adic topology of U if and only if for any n ∈ N, there exists k ∈ N such that U ∩ k V ⊂ n U . Proposition 3.3. Let V be an -adic nonlocal vertex algebra and let U be a C[[]]submodule satisfying the conditions that 1 ∈ U , U is Y -closed, and that the induced topology on U from V coincides with its own -adic topology. In addition we assume that U is -adically complete. Then U is an -adic nonlocal vertex subalgebra of V . Proof. Notice that as a submodule of V , U is torsion-free and separated. Since U is also -adically complete, U is topologically free. Let u, v ∈ U and n ∈ N. From assumption, there exists k ∈ N such that U ∩ k V ⊂ n U . With u, v ∈ U ⊂ V and k ∈ N being fixed, there exists r ∈ N such that u m v ∈ k V for m ≥ r . Then u m v ∈ U ∩ k V ⊂ n U
for m ≥ r.
That is, Y (u + n U, x)(v + n U ) ∈ (U/n U )((x)). Furthermore, let w ∈ V . By -adic weak associativity, there exists l ∈ N such that (x0 + x2 )l Y (u, x0 + x2 )Y (v, x2 )w ≡ (x0 + x2 )l Y (Y (u, x0 )v, x2 )w (mod k V ). Because U is Y -closed and -adically complete, and because U ∩ k V ⊂ n U , we have (x0 + x2 )l Y (u, x0 + x2 )Y (v, x2 )w ≡ (x0 + x2 )l Y (Y (u, x0 )v, x2 )w (mod n U ). Now, (U, Y, 1) satisfies all the axioms for an -adic nonlocal vertex algebra.
Definition 3.4. Let M be a C[[]]-module. For any C[[]]-submodule K , we define [K ] = {w ∈ M | n w ∈ K for some n ∈ N}.
(3.1)
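To see what the bracket operation of Definition 3.4 does, the following sketch works out two small cases. The module $M = \mathbb{C}[[\hbar]]$ and the submodules $K_1$, $K_2$ are illustrative assumptions, not taken from the text.

```latex
% Illustrative example (assumed, not from the source): M = C[[\hbar]].
\begin{align*}
K_1 &= \hbar M: & [K_1] &= \{ w \in M \mid \hbar^n w \in \hbar M \text{ for some } n \in \mathbb{N} \} = M \ne K_1, \\
    &           & K_1 \cap \hbar^n M &= \hbar^n M \ne \hbar^{n+1} M = \hbar^n K_1 \quad (n \ge 1); \\
K_2 &= \mathbb{C}[\hbar]: & [K_2] &= K_2, \qquad K_2 \cap \hbar^n M = \hbar^n K_2 \quad (n \ge 0).
\end{align*}
```

In words, the condition $[K] = K$ says that $K$ is closed under division by $\hbar$ inside $M$. The second case also suggests that this condition is independent of completeness: the polynomial submodule $K_2$ satisfies $[K_2] = K_2$ but is not $\hbar$-adically complete.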
The following two lemmas are straightforward:
Lemma 3.5. Let $M$ be a $\mathbb{C}[[\hbar]]$-module and let $K$ be a $\mathbb{C}[[\hbar]]$-submodule such that $[K] = K$. Then $K \cap \hbar^n M = \hbar^n K$ for all $n \in \mathbb{N}$. In particular, the induced topology on $K$ from $M$ (with the $\hbar$-adic topology) coincides with the $\hbar$-adic topology of $K$.
Lemma 3.6. Let $V$ be an $\hbar$-adic nonlocal vertex algebra and let $K$ be a $Y$-closed $\mathbb{C}[[\hbar]]$-submodule. Then both $[K]$ and the $\hbar$-adic completion $\overline{K}$ of $K$ in $V$ are $Y$-closed.
Furthermore, we have:
Proposition 3.7. Let $V$ be an $\hbar$-adic nonlocal vertex algebra and let $K$ be a $Y$-closed $\mathbb{C}[[\hbar]]$-submodule containing $1$. Then $[\overline{[K]}] = \overline{[K]}$ and $\overline{[K]}$ is an $\hbar$-adic nonlocal vertex subalgebra of $V$.
Proof. Since $[[K]] = [K]$, by Lemma 3.5 we have $[K] \cap \hbar^n V = \hbar^n [K]$ for $n \in \mathbb{N}$. Then $\overline{[K]}$ is $\hbar$-adically complete with respect to its own $\hbar$-adic topology. Then $\overline{[K]}$ is topologically free. From Lemma 3.6, $\overline{[K]}$ is $Y$-closed. If we can prove $[\overline{[K]}] = \overline{[K]}$, then by Proposition 3.3, $\overline{[K]}$ is an $\hbar$-adic nonlocal vertex subalgebra of $V$. Let $u \in [\overline{[K]}]$. By definition, there exists $k \in \mathbb{N}$ such that $\hbar^k u \in \overline{[K]}$. Furthermore, there exists a Cauchy sequence $\{a_n\}$ in $[K]$ with $\hbar^k u$ as the limit. Then there exists $r \ge 1$ such that $a_n - \hbar^k u \in \hbar^k V$ for $n \ge r$. As $V$ is torsion-free, for each $n \ge r$, there exists a unique $b_n \in V$ such that $a_n = \hbar^k b_n$. As $[[K]] = [K]$ and $\hbar^k b_n = a_n \in [K]$, we have $b_n \in [K]$ for $n \ge r$. Using the fact that $V$ is torsion-free, we see that $\{b_n\}_{n \ge r}$ is a Cauchy sequence in $[K]$, converging to $u$. Thus $u \in \overline{[K]}$. This proves $[\overline{[K]}] = \overline{[K]}$, concluding the proof.
Now, let $U$ be a subset of an $\hbar$-adic nonlocal vertex algebra $V$. Let $U^{(1)}$ be the $\mathbb{C}[[\hbar]]$-span of $U \cup \{1\}$ and then inductively define $U^{(n+1)}$ for $n \ge 1$ to be the $\mathbb{C}[[\hbar]]$-span of the vectors $a_m b$ for $a, b \in U^{(n)}$, $m \in \mathbb{Z}$. From the definition we have $1 \in U^{(n)} \subset U^{(n+1)}$ for $n \ge 1$. Set
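\[ U^{o} = \Big[ \bigcup_{n \ge 1} U^{(n)} \Big] \subset V \tag{3.2} \]
and furthermore, set
\[ \langle U \rangle = \overline{U^{o}} \quad (\text{the } \hbar\text{-adic completion}). \tag{3.3} \]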
We have:
Proposition 3.8. Let $V$ be an $\hbar$-adic nonlocal vertex algebra and let $U$ be a subset of $V$. Then $\langle U \rangle$ is an $\hbar$-adic nonlocal vertex subalgebra satisfying the condition that $U \subset \langle U \rangle$ and $[\langle U \rangle] = \langle U \rangle$. Furthermore, any $\hbar$-adic nonlocal vertex subalgebra $H$, which satisfies the condition that $U \subset H$ and $[H] = H$, contains $\langle U \rangle$.
Proof. It follows from the definition that $\cup_{n \ge 1} U^{(n)}$ is $Y$-closed and contains $\{1\} \cup U$. Then the first assertion follows from Proposition 3.7. Let $H$ be an $\hbar$-adic nonlocal vertex subalgebra such that $[H] = H$ and $U \subset H$. Then $\cup_{n \ge 1} U^{(n)} \subset H$. Furthermore, we have
\[ U^{o} = \Big[ \bigcup_{n \ge 1} U^{(n)} \Big] \subset [H] = H. \]
Since $[U^{o}] = U^{o}$, the induced topology on $U^{o}$ (from $H$ or $V$) coincides with its own $\hbar$-adic topology. Then $\langle U \rangle \subset H$ as $H$ is $\hbar$-adically complete.
We shall need the following result later:
Lemma 3.9. Let $V$ be an $\hbar$-adic nonlocal vertex algebra with a generating subset $U$ in the sense that $V = \langle U \rangle$ and let $(W, Y_W)$ be a quasi $V$-module equipped with a $\mathbb{C}[[\hbar]]$-linear operator $D$ such that
\[ [D, Y_W(u, x)] = \frac{d}{dx} Y_W(u, x) \quad \text{for } u \in U. \tag{3.4} \]
Assume that $w$ is a vector of $W$ such that $Dw = 0$. Then $Y_W(v, x)w \in W[[x]]$ for all $v \in V$ and the linear map $\phi$ defined by $\phi(v) = v_{-1} w$ for $v \in V$ is a $V$-module homomorphism.
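Proof. For every positive integer $n$, $V/\hbar^n V$ is a nonlocal vertex algebra over $\mathbb{C}$ and $W/\hbar^n W$ is a quasi $(V/\hbar^n V)$-module. Set
\[ K_n = \big( \cup_{k \ge 1} U^{(k)} + \hbar^n V \big) / \hbar^n V \subset V/\hbar^n V. \]
It is clear that $K_n$ is a nonlocal vertex subalgebra of $V/\hbar^n V$. Let $\phi_n : K_n \to W/\hbar^n W$ be the map induced from $\phi$. From [Li4] (Prop. 6.2), we have
\[ v_r w \in \hbar^n W \quad \text{for } v \in \cup_{k \ge 1} U^{(k)},\ r \ge 0, \]
and $\phi_n$ is a $K_n$-module homomorphism, which amounts to $\phi(u_m v) \equiv u_m \phi(v) \mod \hbar^n W$ for $u, v \in \cup_{k \ge 1} U^{(k)}$, $m \in \mathbb{Z}$. As $\cap_{n \ge 1} \hbar^n W = 0$, we have $v_r w = 0$ for $v \in \cup_{k \ge 1} U^{(k)}$, $r \ge 0$ and
\[ \phi(u_m v) = u_m \phi(v) \quad \text{for } u, v \in \cup_{k \ge 1} U^{(k)},\ m \in \mathbb{Z}. \]
Now, let $u, v \in [\cup_{k \ge 1} U^{(k)}]$. There exists $t \in \mathbb{N}$ such that $\hbar^t u, \hbar^t v \in \cup_{k \ge 1} U^{(k)}$. Then $\hbar^t v_r w = 0$ for $r \ge 0$ and
\[ \hbar^{2t} \phi(u_m v) = \hbar^{2t} u_m \phi(v) \quad \text{for } m \in \mathbb{Z}. \]
Since $W$ is torsion-free, we have $v_r w = 0$ for $r \ge 0$ and $\phi(u_m v) = u_m \phi(v)$ for $m \in \mathbb{Z}$. Recall that $\langle U \rangle$ is the completion of $[\cup_{k \ge 1} U^{(k)}]$. Let $u^{(i)}, v^{(i)}$ ($i \ge 1$) be sequences in $[\cup_{k \ge 1} U^{(k)}]$, converging to $u, v \in V$, respectively. We have
\[ v^{(i)}_r w = 0 \ \text{ for } r \ge 0 \quad \text{and} \quad \phi(u^{(i)}_m v^{(j)}) = u^{(i)}_m \phi(v^{(j)}) \quad \text{for } i, j \ge 1,\ m \in \mathbb{Z}. \]
From this we see that for every positive integer $n$, $v_r w \in \hbar^n W$ for $r \ge 0$ and
\[ \phi(u_m v) - u_m \phi(v) \in \hbar^n W \quad \text{for } m \in \mathbb{Z}. \]
Again as $\cap_{n \ge 1} \hbar^n W = 0$, we get $v_r w = 0$ for $r \ge 0$ and $\phi(u_m v) = u_m \phi(v)$ for $m \in \mathbb{Z}$. Now we have proved $Y_W(v, x)w \in W[[x]]$ and
\[ \phi(u_m v) = u_m \phi(v) \quad \text{for all } u, v \in \langle U \rangle,\ m \in \mathbb{Z}, \]
as needed.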
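Let $W$ be a topologically free $\mathbb{C}[[\hbar]]$-module and let $A$ be a $\mathbb{C}$-subspace of $W$. Notice that for any sequence $\{a_n\}_{n \ge 0}$ in $A$, $\sum_{n \ge 0} a_n \hbar^n \in W$. Set
\[ A[[\hbar]] = \Big\{ \sum_{n \ge 0} a_n \hbar^n \;\Big|\; a_n \in A \Big\}, \tag{3.5} \]
which is a $\mathbb{C}[[\hbar]]$-submodule of $W$.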
Definition 3.10. Let A and B be subsets of an -adic nonlocal vertex algebra V . We say that the ordered pair (A, B) is -adically S-local if for any a ∈ A, b ∈ B, there exists ˆ ⊗C((x)))[[]] ˆ P(a, b; x) ∈ ((CB) ⊗ (CA) ⊗ C((x))) [[]] ⊂ V ⊗V such that Y (a, x1 )Y (b, x2 ) ∼ Y (x2 , x1 )P(a, b; x2 − x1 ).
(3.6)
We say that a subset A of V is -adically S-local if (A, A) is -adically S-local. We have the following technical result: Lemma 3.11. Let A and B be C-subspaces of V such that (A, B) is -adically S-local. Then (A, B (2) ) and (A(2) , B) are -adically S-local. Proof. From definition, there exists a C[[]]-module map ˆ ˆ ˆ → B[[]] ⊗A[[]] ⊗C((x))[[]] S(x) : B[[]] ⊗A[[]]
such that for a ∈ A, b ∈ B, Y (a, x1 )Y (b, x2 ) ∼ Y (x2 , x1 )S(x2 − x1 )(b ⊗ a).
(3.7)
We have the maps ˆ ˆ ˆ ˆ ˆ S 32 (x) : B[[]] ⊗A[[]] ⊗B[[]] → B[[]] ⊗A[[]] ⊗B[[]] ⊗C((x))[[]], 13 ˆ ˆ ˆ ˆ ˆ S (x) : B[[]] ⊗B[[]] ⊗A[[]] → B[[]] ⊗B[[]] ⊗A[[]] ⊗C((x))[[]].
Note that by Proposition 2.19, (3.7) is equivalent to Y (a, x)b = e x D Y (−x)S(−x)(b ⊗ a). Let a ∈ A, u, v ∈ B. Using Lemma 2.13 we get Y (a, x)Y (u, z)v ∼ Y (z)(1 ⊗ Y (x))(S(z − x)(u ⊗ a) ⊗ v) = Y (z)(1 ⊗ e x D Y (−x))S 32 (−x)(S(z − x)(u ⊗ a) ⊗ v) = e x D Y (z − x)(1 ⊗ Y (−x))S 32 (−x)(S(z − x)(u ⊗ a) ⊗ v) ∼ e x D Y (−x)(Y (z) ⊗ 1)S 32 (−x)(S(z − x)(u ⊗ a) ⊗ v) ∼ e x D Y (−x)(Y (z) ⊗ 1)S 32 (−x)(S(−x + z)(u ⊗ a) ⊗ v) in V [[x ±1 , z ±1 ]]. In view of Remark 2.18 we have Y (a, x)Y (u, z)v = e x D Y (−x)(Y (z) ⊗ 1)S 32 (−x)(S(−x + z)(u ⊗ a) ⊗ v). It follows that (A, B (2) ) is -adically S-local. Similarly, for a, b ∈ A, u ∈ U , we have Y (Y (a, x0 )b, x)u ∼+ Y (a, x0 + x)Y (b, x)u = Y (a, x0 + x)e x D Y (−x)S(−x)(u ⊗ b) = e x D Y (a, x0 )Y (−x)S(−x)(u ⊗ b) ∼+ e x D Y (−x)(1 ⊗ Y (x0 ))S 13 (−x − x0 )(S(−x)(u ⊗ b) ⊗ a) in V [[x0±1 , x ±1 ]], which by Remark 2.18 implies Y (Y (a, x0 )b, x)u = e x D Y (−x)(1 ⊗ Y (x0 ))S 13 (−x − x0 )(S(−x)(u ⊗ b) ⊗ a). It follows that (A(2) , B) is -adically S-local.
Now, we have (cf. [Li5,LTW]):
Proposition 3.12. Let $V$ be an $\hbar$-adic nonlocal vertex algebra and let $U$ be an $\hbar$-adically $S$-local subset such that $V = (\cup_{n \ge 1} U^{(n)})[[\hbar]]$. Then $V$ is an $\hbar$-adic weak quantum vertex algebra.
Proof. We must prove that $V$ as a subset of $V$ is $\hbar$-adically $S$-local. Because $U$ is $\hbar$-adically $S$-local, it follows from Lemma 3.11 (and induction) that $\cup_{n \ge 1} U^{(n)}$ is $\hbar$-adically $S$-local. Let $u, v \in V$. From the assumption, we have
\[ u = \sum_{i \ge 0} a(i) \hbar^{i}, \quad v = \sum_{j \ge 0} b(j) \hbar^{j} \quad \text{with } a(i), b(j) \in \cup_{n \ge 1} U^{(n)}. \]
For any $i, j \in \mathbb{N}$, there exists $A_{i,j}(x) \in V \hat{\otimes} V \hat{\otimes} \mathbb{C}((x))[[\hbar]]$ such that $Y(a(i), x_1) Y(b(j), x_2) \sim Y(x_2, x_1) A_{i,j}(x_2 - x_1)$. Notice that
\[ \sum_{i, j \in \mathbb{N}} A_{i,j}(x) \hbar^{i+j} \in V \hat{\otimes} V \hat{\otimes} \mathbb{C}((x))[[\hbar]] \]
and
\[ Y(u, x_1) Y(v, x_2) \sim Y(x_2, x_1) \Big( \sum_{i, j \in \mathbb{N}} A_{i,j}(x_2 - x_1) \hbar^{i+j} \Big). \]
This proves that $V$ is $\hbar$-adically $S$-local.
Using the proof of Proposition 2.8 in [Li5] and Proposition 3.12, we immediately have:
Proposition 3.13. Let $V$ be an $\hbar$-adic nonlocal vertex algebra, let $U$ be an $\hbar$-adically $S$-local subset, and let $W$ be a $V$-module with $e \in W$ such that $Y_W(u, x)e \in W[[x]]$ for $u \in U$. Set $K = (\cup_{n \ge 1} U^{(n)})[[\hbar]] \subset V$. Then $Y_W(v, x)e \in W[[x]]$ for all $v \in K$. Furthermore, the $\mathbb{C}[[\hbar]]$-module map $\theta : K \to W$, defined by $\theta(v) = v_{-1} e$ for $v \in K$, satisfies
\[ \theta(Y(u, x)v) = Y_W(u, x) \theta(v) \quad \text{for } u, v \in K. \]
4. A General Construction of -adic Quantum Vertex Algebras In this section we give a general construction of -adic nonlocal vertex algebras and -adic weak quantum vertex algebras by using what we call -adic quasi-compatible sets of vertex operators on topologically free C[[]]-modules. We start by recalling from [Li4] (cf. [Li2]) the general construction of nonlocal vertex algebras from quasi-compatible sets of formal vertex operators. Let W 0 be a vector space over C. Set E(W 0 ) = Hom(W 0 , W 0 ((x))).
(4.1)
The identity operator on $W^0$, denoted by $1_{W^0}$, is a typical element of $\mathcal{E}(W^0)$, and the formal differential operator $\frac{d}{dx}$ is an endomorphism of $\mathcal{E}(W^0)$.
Definition 4.1. An (ordered) sequence $(\psi^{(1)}(x), \dots, \psi^{(r)}(x))$ in $\mathcal{E}(W^0)$ is said to be quasi-compatible if there exists $0 \ne p(x_1, x_2) \in \mathbb{C}[x_1, x_2]$ such that
\[ \Big( \prod_{1 \le i < j \le r} p(x_i, x_j) \Big) \psi^{(1)}(x_1) \cdots \psi^{(r)}(x_r) \in \mathrm{Hom}(W^0, W^0((x_1, \dots, x_r))). \tag{4.2} \]
A subset U of E(W 0 ) is said to be quasi-compatible if every (ordered) finite sequence in U is quasi-compatible. We also define a notion of compatibility by assuming that p(x1 , x2 ) is of the form (x1 − x2 )k with k ∈ N. Assume that (a(x), b(x)) is a quasi-compatible pair in E(W 0 ). By definition, there exists 0 = p(x1 , x2 ) ∈ C[x1 , x2 ] such that p(x1 , x2 )a(x1 )b(x2 ) ∈ Hom(W 0 , W 0 ((x1 , x2 ))).
(4.3)
Recall from [Li4] that $\iota_{x_1, x_2} : \mathbb{C}_*(x_1, x_2) \to \mathbb{C}((x_1))((x_2))$ is the algebra-embedding that preserves each element of $\mathbb{C}[[x_1, x_2]]$, where $\mathbb{C}_*(x_1, x_2)$ denotes the algebra extension of $\mathbb{C}[[x_1, x_2]]$ by inverting every nonzero polynomial. We have
\[ \iota_{x, x_0}\big(1/p(x_0 + x, x)\big) \big( p(x_1, x) a(x_1) b(x) \big) \big|_{x_1 = x + x_0} \in \big( \mathrm{Hom}(W^0, W^0((x))) \big)((x_0)). \]
Definition 4.2. Let $(a(x), b(x))$ be a quasi-compatible pair in $\mathcal{E}(W^0)$. Define $a(x)_n b(x)$ for $n \in \mathbb{Z}$, elements of $\mathcal{E}(W^0)$, in terms of the generating function
\[ Y_{\mathcal{E}}(a(x), x_0) b(x) = \sum_{n \in \mathbb{Z}} a(x)_n b(x)\, x_0^{-n-1} \]
by
\[ Y_{\mathcal{E}}(a(x), x_0) b(x) = \iota_{x, x_0}\big(1/p(x_0 + x, x)\big) \big( p(x_1, x) a(x_1) b(x) \big) \big|_{x_1 = x + x_0}, \]
where $p(x_1, x_2)$ is any nonzero polynomial such that (4.3) holds.
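As a concrete instance of the expansion maps used in Definition 4.2, the following standard computation may help. The particular polynomials are chosen only for illustration and do not come from the text.

```latex
% \iota_{x_1,x_2} expands in nonnegative powers of the second variable x_2.
\begin{align*}
\iota_{x_1, x_2} \frac{1}{x_1 - x_2} &= \sum_{n \ge 0} x_1^{-n-1} x_2^{n}, &
\iota_{x_2, x_1} \frac{1}{x_1 - x_2} &= -\sum_{n \ge 0} x_2^{-n-1} x_1^{n}, \\
% Assumed example p(x_1,x_2) = x_1 - x_2: substituting x_1 = x + x_0 gives p(x_0+x,x) = x_0.
\iota_{x, x_0} \frac{1}{p(x_0 + x, x)} &= x_0^{-1} & &\text{for } p(x_1, x_2) = x_1 - x_2, \\
% Assumed example p(x_1,x_2) = x_1 + x_2: substituting gives p(x_0+x,x) = 2x + x_0.
\iota_{x, x_0} \frac{1}{2x + x_0} &= \sum_{n \ge 0} (-1)^{n} (2x)^{-n-1} x_0^{n} & &\text{for } p(x_1, x_2) = x_1 + x_2.
\end{align*}
```

The two expansions of $1/(x_1 - x_2)$ in the first row differ by the formal delta function $x_2^{-1}\delta(x_1/x_2)$, which is why the order of the variables in $\iota$ matters.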
A quasi-compatible subspace $U$ of $\mathcal{E}(W^0)$ is said to be $Y_{\mathcal{E}}$-closed if
\[ a(x)_n b(x) \in U \quad \text{for } a(x), b(x) \in U,\ n \in \mathbb{Z}. \tag{4.4} \]
We have ([Li4], Theorem 2.19; cf. [Li2]):
Theorem 4.3. Let $W^0$ be a vector space over $\mathbb{C}$ and let $U$ be any quasi-compatible subset of $\mathcal{E}(W^0)$. There exists a (unique) smallest $Y_{\mathcal{E}}$-closed quasi-compatible subspace $\langle U \rangle$ of $\mathcal{E}(W^0)$, containing $U$ and $1_{W^0}$, and $(\langle U \rangle, Y_{\mathcal{E}}, 1_{W^0})$ carries the structure of a nonlocal vertex algebra with $U$ as a generating subset and $W^0$ is a quasi-module for $\langle U \rangle$ with $Y_{W^0}(\alpha(x), x_0) = \alpha(x_0)$ for $\alpha(x) \in \langle U \rangle$. Furthermore, if $U$ is compatible, $W^0$ is a module for $\langle U \rangle$.
Definition 4.4. Let $W^0$ be a vector space over $\mathbb{C}$. A subset $U$ of $\mathcal{E}(W^0)$ is said to be $S$-local if for any $a(x), b(x) \in U$, there exist $a^{(i)}(x), b^{(i)}(x) \in U$, $f_i(x) \in \mathbb{C}((x))$ ($i = 1, \dots, r$) such that
\[ (x_1 - x_2)^k a(x_1) b(x_2) = (x_1 - x_2)^k \sum_{i=1}^{r} f_i(x_2 - x_1)\, b^{(i)}(x_2)\, a^{(i)}(x_1) \tag{4.5} \]
for some k ∈ N. From [Li4] (Lemma 3.2), every S-local subset of E(W 0 ) is quasi-compatible. In fact, the same proof shows that every S-local subset is compatible. Furthermore, we have (see [Li4], Theorem 5.8): Theorem 4.5. Let W 0 be a vector space over C and let U be an S-local subset of E(W 0 ). Then the nonlocal vertex algebra U generated by U is a weak quantum vertex algebra with W 0 as a faithful module. Now, let W be a C[[]]-module. Recall from Sect. 2 that E(W ) is the C[[]]-submodule of (EndW )[[x, x −1 ]], consisting of each a(x) = m∈Z am x −m−1 satisfying the condition that for any w ∈ W, n ∈ N, there exists k ∈ Z such that a m w ∈ n W
for m ≥ k.
For the rest of this section we assume that $W = W^0[[\hbar]]$ is a fixed topologically free $\mathbb{C}[[\hbar]]$-module. Then $W$ is torsion-free, separated in the sense that $\cap_{n \ge 1} \hbar^n W = 0$, and $\hbar$-adically complete. We have the following projective inverse system:
\[ 0 \leftarrow W/\hbar W \leftarrow W/\hbar^2 W \leftarrow W/\hbar^3 W \leftarrow \cdots \tag{4.6} \]
(equipped with the canonical maps from $W/\hbar^{n+1} W$ to $W/\hbar^n W$ for $n \ge 0$) with $W$ as an inverse limit. Let $F$ be an endomorphism of $W$. For every nonnegative integer $n$, $F$ gives rise to an endomorphism $F_n$ of $W/\hbar^n W$. Then we have an endomorphism $\{F_n\}$ of the projective inverse system (4.6). Conversely, given any endomorphism, a sequence $\{f_n\}$, of the inverse system (4.6), we have an endomorphism $f$ of $W$. The $\mathbb{C}[[\hbar]]$-module $\mathrm{End}\,W$ can be naturally identified with $(\mathrm{End}\,W^0)[[\hbar]]$ and we have
\[ (\mathrm{End}\,W)[[x, x^{-1}]] = (\mathrm{End}\,W^0)[[x, x^{-1}]][[\hbar]], \]
which is topologically free. Furthermore, we have
\[ \mathcal{E}(W) = \mathcal{E}(W^0)[[\hbar]], \tag{4.7} \]
which is also topologically free.
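A small example may make the difference between $\mathcal{E}(W)$ and $\mathrm{Hom}(W, W((x)))$ concrete. It is only a sketch: the choice $W^0 = \mathbb{C}$ (so that $W = \mathbb{C}[[\hbar]]$) and the series $a(x)$ are assumptions made for illustration.

```latex
% Assumed example: W = C[[\hbar]], with a(x) acting on W by multiplication.
\begin{align*}
a(x) &= \sum_{m \ge 0} \hbar^{m} x^{-m-1}
  \quad \Bigl( \text{the expansion of } \tfrac{1}{x - \hbar} \text{ in nonnegative powers of } \hbar \Bigr), \\
a_m &= \hbar^{m} \in \hbar^{n}\, \mathrm{End}\, W \quad \text{for } m \ge n,
  \qquad \text{so } a(x) \in \mathcal{E}(W), \\
a(x) \bmod \hbar^{n} &= \sum_{m=0}^{n-1} \hbar^{m} x^{-m-1} \in \mathcal{E}(W/\hbar^{n} W).
\end{align*}
```

Here $a(x)1$ involves infinitely many negative powers of $x$, so $a(x)$ does not lie in $\mathrm{Hom}(W, W((x)))$; nevertheless the coefficient of each $\hbar^{m}$, namely $x^{-m-1}$, lies in $\mathcal{E}(W^0)$, in line with (4.7).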
Lemma 4.6. For a(x) ∈ (EndW )[[x, x −1 ]], if k a(x) ∈ E(W ) for some k ∈ N, then a(x) ∈ E(W ). Proof. For any n ∈ N, w ∈ W , with k a(x) ∈ E(W ), there exists q ∈ Z such that k am w ∈ k+n W for m ≥ q, where a(x) = m∈Z am x −m−1 . Since W is torsion-free, we have am w ∈ n W for m ≥ q. This proves a(x) ∈ E(W ). Remark 4.7. For each n ∈ N, we have a canonical C[[]]-module map π˜ n : (EndW )[[x, x −1 ]] → (End(W/n W ))[[x, x −1 ]].
(4.8)
As W is torsion-free, we have ker π˜ n = n (EndW )[[x, x −1 ]]. Recall from Sect. 2 that an element a(x) of (EndW )[[x, x −1 ]] lies in E(W ) if and only if π˜ n (a(x)) ∈ E(W/n W ) for n ∈ N. Then we have canonical C[[]]-module maps πn : E(W ) → E(W/n W )
(4.9)
for n ∈ N, where by Lemma 4.6, ker πn = E(W ) ∩ n (EndW )[[x, x −1 ]] = n E(W ). For every n ∈ N, we have a canonical C[[]]-module map
(4.10)
θn : E(W/n+1 W ) → E(W/n W ). We have the following projective inverse system 0 ← E(W/W ) ← E(W/2 W ) ← E(W/3 W ) ← · · ·
(4.11)
with E(W ) equipped with C[[]]-module maps πn as an inverse limit. Then for any sequence {ψn (x)} with ψn (x) ∈ E(W/n W ) for n ∈ N, satisfying the condition that θn (ψn+1 (x)) = ψn (x), there exists a unique ψ(x) ∈ E(W ) such that πn (ψ(x)) = ψn (x) for n ∈ N. Definition 4.8. A finite sequence a 1 (x), . . . , a r (x) in E(W ) is said to be -adically quasi-compatible if for every positive integer n, the sequence πn (a 1 (x)), . . . , πn (a r (x)) in E(W/n W ) is quasi-compatible. A subset U of E(W ) is said to be -adically quasicompatible if every finite sequence in U is -adically quasi-compatible. Correspondingly, we define notions of -adically compatible sequence and -adically compatible subset. Let r be a positive integer. For each n ∈ N, we have a canonical C[[]]-module map π˜ n(r ) : (EndW )[[x1±1 , . . . , xr±1 ]] → (End(W/n W ))[[x1±1 , . . . , xr±1 ]], (1)
where π˜ n = π˜ n defined in (4.8). We also have a canonical C[[]]-module map: θ˜n(r ) : (End(W/n+1 W ))[[x1±1 , . . . , xr±1 ]] → (End(W/n W ))[[x1±1 , . . . , xr±1 ]]. It is clear that for n ∈ N, (r ) . π˜ n(r ) = θ˜n(r ) ◦ π˜ n+1
(4.12)
For any vector space U over C, we set E (r ) (U ) = Hom(U, U ((x1 , . . . , xr ))), which is naturally a C((x1 , . . . , xr ))-module.
(4.13)
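Definition 4.9. Let $r$ be a positive integer. For every $n \in \mathbb{N}$, define $\mathcal{E}_n^{(r)}(W)$ to be the $\mathbb{C}[[\hbar]]$-submodule of $(\mathrm{End}\,W)[[x_1^{\pm 1}, \dots, x_r^{\pm 1}]]$, consisting of each formal series
\[ \psi(x_1, \dots, x_r) = \sum_{m_1, \dots, m_r \in \mathbb{Z}} \psi(m_1, \dots, m_r)\, x_1^{-m_1 - 1} \cdots x_r^{-m_r - 1}, \]
satisfying the condition that $\tilde{\pi}_n^{(r)}(\psi(x_1, \dots, x_r)) \in \mathcal{E}^{(r)}(W/\hbar^n W)$, or equivalently, for every $w \in W$, there exists $k \in \mathbb{Z}$ such that
\[ \psi(m_1, \dots, m_r) w \in \hbar^n W \quad \text{whenever } m_i \ge k \text{ for some } 1 \le i \le r. \]
We see that $\mathcal{E}_n^{(r)}(W)$ are also $\mathbb{C}((x_1, \dots, x_r))$-modules and we have
\[ \mathcal{E}_0^{(r)}(W) \supset \mathcal{E}_1^{(r)}(W) \supset \cdots \supset \mathcal{E}_n^{(r)}(W) \supset \mathcal{E}_{n+1}^{(r)}(W) \supset \cdots. \]
In terms of $\mathcal{E}_n^{(r)}(W)$, a sequence $\psi^1(x), \dots, \psi^r(x)$ in $\mathcal{E}(W)$ is $\hbar$-adically quasi-compatible if and only if for every $n \in \mathbb{N}$, there exists $0 \ne p(x, y) \in \mathbb{C}[x, y]$ such that
\[ \Big( \prod_{1 \le i < j \le r} p(x_i, x_j) \Big) \psi^1(x_1) \cdots \psi^r(x_r) \in \mathcal{E}_n^{(r)}(W). \]
Generalizing the maps $\pi_n$ and $\theta_n$, we have canonical $\mathbb{C}[[\hbar]]$-module maps for $n \in \mathbb{N}$:
\[ \pi_n^{(r)} : \mathcal{E}_n^{(r)}(W) \to \mathcal{E}^{(r)}(W/\hbar^n W), \qquad \theta_n^{(r)} : \mathcal{E}^{(r)}(W/\hbar^{n+1} W) \to \mathcal{E}^{(r)}(W/\hbar^n W), \]
which satisfy $\pi_n^{(r)} = \theta_n^{(r)} \circ \pi_{n+1}^{(r)}$.
Set
\[ \mathcal{E}_{\hbar}^{(r)}(W) = \cap_{n \ge 1} \mathcal{E}_n^{(r)}(W) \subset (\mathrm{End}\,W)[[x_1^{\pm 1}, \dots, x_r^{\pm 1}]]. \tag{4.14} \]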
Note that if (a(x), b(x)) is an -adically quasi-compatible pair in E(W ), then for every n ∈ N, (πn (a(x)), πn (b(x))) is a quasi-compatible pair in E(W/n W ) and hence πn (a(x))m πn (b(x)) are defined for all m ∈ Z. Lemma 4.10. Let (a(x), b(x)) be an -adically quasi-compatible pair in E(W ). We have θn+1 (πn+1 (a(x))m πn+1 (b(x))) = πn (a(x))m πn (b(x)) for n ∈ N, m ∈ Z.
(4.15)
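Proof. For any fixed $n \in \mathbb{N}$, let $p(x, y) \in \mathbb{C}[x, y]$ be a nonzero polynomial such that
\[ p(x_1, x_2) a(x_1) b(x_2) \in \mathcal{E}_{n+1}^{(2)}(W) \subset \mathcal{E}_n^{(2)}(W). \]
From Definition 4.2, we have
\begin{align*}
p(x_0 + x, x) Y_{\mathcal{E}}(\pi_n(a(x)), x_0) \pi_n(b(x)) &= \big( p(x_1, x) \pi_n(a(x_1)) \pi_n(b(x)) \big) \big|_{x_1 = x + x_0} \\
&= \pi_n^{(2)}\big( p(x_1, x) a(x_1) b(x) \big) \big|_{x_1 = x + x_0}, \\
p(x_0 + x, x) Y_{\mathcal{E}}(\pi_{n+1}(a(x)), x_0) \pi_{n+1}(b(x)) &= \big( p(x_1, x) \pi_{n+1}(a(x_1)) \pi_{n+1}(b(x)) \big) \big|_{x_1 = x + x_0} \\
&= \pi_{n+1}^{(2)}\big( p(x_1, x) a(x_1) b(x) \big) \big|_{x_1 = x + x_0}.
\end{align*}
With (4.12) it follows that
\[ p(x_0 + x, x) Y_{\mathcal{E}}(\pi_n(a(x)), x_0) \pi_n(b(x)) = p(x_0 + x, x)\, \theta_{n+1}\big( Y_{\mathcal{E}}(\pi_{n+1}(a(x)), x_0) \pi_{n+1}(b(x)) \big), \]
which implies
\[ Y_{\mathcal{E}}(\pi_n(a(x)), x_0) \pi_n(b(x)) = \theta_{n+1}\big( Y_{\mathcal{E}}(\pi_{n+1}(a(x)), x_0) \pi_{n+1}(b(x)) \big), \]
as desired.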
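Using Lemma 4.10 we define the following partial operations on $\mathcal{E}(W)$:
Definition 4.11. Let $(a(x), b(x))$ be an $\hbar$-adically quasi-compatible (ordered) pair in $\mathcal{E}(W)$. For $m \in \mathbb{Z}$, we define
\[ a(x)_m b(x) = \varprojlim_{n} \pi_n(a(x))_m \pi_n(b(x)) \in \mathcal{E}(W). \tag{4.16} \]
Form the generating function
\[ Y_{\mathcal{E}}(a(x), x_0) b(x) = \sum_{m \in \mathbb{Z}} \big( a(x)_m b(x) \big) x_0^{-m-1}. \tag{4.17} \]
From the definition, for every positive integer $n$ we have
\[ \pi_n\big( a(x)_m b(x) \big) = \pi_n(a(x))_m \pi_n(b(x)) \tag{4.18} \]
for $m \in \mathbb{Z}$. Namely,
\[ \pi_n\big( Y_{\mathcal{E}}(a(x), x_0) b(x) \big) = Y_{\mathcal{E}}(\pi_n(a(x)), x_0) \pi_n(b(x)). \tag{4.19} \]
Recall that $\mathcal{E}_{\hbar}^{(2)}(W)$ consists of each $\psi(x_1, x_2) \in (\mathrm{End}\,W)[[x_1^{\pm 1}, x_2^{\pm 1}]]$ such that for every $n \in \mathbb{N}$, $\tilde{\pi}_n^{(2)}(\psi) \in \mathcal{E}^{(2)}(W/\hbar^n W)$.
Proposition 4.12. Let $a(x), b(x) \in \mathcal{E}(W)$. Assume that there exists $p(x_1, x_2, \hbar) \in \mathbb{C}[x_1, x_2, \hbar]$ with $p(x_1, x_2, 0) \ne 0$ such that
\[ p(x_1, x_2, \hbar) a(x_1) b(x_2) \in \mathcal{E}_{\hbar}^{(2)}(W). \]
Then $(a(x), b(x))$ is $\hbar$-adically quasi-compatible and
\[ Y_{\mathcal{E}}(a(x), x_0) b(x) = \iota_{x, x_0, \hbar}\big( 1/p(x_0 + x, x, \hbar) \big) \big( p(x_1, x, \hbar) a(x_1) b(x) \big) \big|_{x_1 = x + x_0}. \]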
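Proof. Set
\[ f(x_1, x_2) = p(x_1, x_2, 0) \in \mathbb{C}[x_1, x_2], \qquad A = p(x_1, x_2, \hbar) a(x_1) b(x_2) \in \mathcal{E}_{\hbar}^{(2)}(W). \]
Then $p(x_1, x_2, \hbar) = f(x_1, x_2) - \hbar g(x_1, x_2, \hbar)$ for some $g(x_1, x_2, \hbar) \in \mathbb{C}[x_1, x_2, \hbar]$. We have
\[ a(x_1) b(x_2) = \iota_{x_1, x_2, \hbar}\big( 1/p(x_1, x_2, \hbar) \big) A = \sum_{k \ge 0} \iota_{x_1, x_2}\big( f(x_1, x_2)^{-k-1} \big) g(x_1, x_2, \hbar)^{k} \hbar^{k} A. \]
For any positive integer $n$, we have
\[ f(x_1, x_2)^{n} a(x_1) b(x_2) \equiv \sum_{k=0}^{n-1} f(x_1, x_2)^{n-k-1} g(x_1, x_2, \hbar)^{k} \hbar^{k} A \mod \hbar^{n} (\mathrm{End}\,W)[[x_1^{\pm 1}, x_2^{\pm 1}]], \]
so that
\[ \tilde{\pi}_n^{(2)}\big( f(x_1, x_2)^{n} a(x_1) b(x_2) \big) \in \mathcal{E}^{(2)}(W/\hbar^n W), \tag{4.20} \]
as $\tilde{\pi}_n^{(2)}(A) \in \mathcal{E}^{(2)}(W/\hbar^n W)$. This proves that $(a(x), b(x))$ is $\hbar$-adically quasi-compatible. Furthermore, for $n \in \mathbb{N}$ we have
\[ f(x_0 + x, x)^{n} \pi_n\big( Y_{\mathcal{E}}(a(x), x_0) b(x) \big) = \big( f(x_1, x)^{n} \pi_n(a(x_1)) \pi_n(b(x)) \big) \big|_{x_1 = x + x_0} = \pi_n^{(2)}\big( f(x_1, x)^{n} a(x_1) b(x) \big) \big|_{x_1 = x + x_0}. \]
Then
\begin{align*}
& \pi_n^{(2)}\, \iota_{x, x_0, \hbar}\big( p(x_0 + x, x, \hbar)^{-1} \big) \big( p(x_1, x, \hbar) a(x_1) b(x) \big) \big|_{x_1 = x + x_0} \\
&= \pi_n^{(2)}\, \iota_{x, x_0, \hbar}\big( p(x_0 + x, x, \hbar)^{-1} f(x_0 + x, x)^{-n} \big) \big( p(x_1, x, \hbar) f(x_1, x)^{n} a(x_1) b(x) \big) \big|_{x_1 = x + x_0} \\
&= \pi_n^{(2)}\, \iota_{x, x_0, \hbar}\big( p(x_0 + x, x, \hbar)^{-1} f(x_0 + x, x)^{-n} \big)\, p(x_0 + x, x, \hbar) \big( f(x_1, x)^{n} a(x_1) b(x) \big) \big|_{x_1 = x + x_0} \\
&= \iota_{x, x_0, \hbar}\big( f(x_0 + x, x)^{-n} \big)\, \pi_n^{(2)}\big( f(x_1, x)^{n} a(x_1) b(x) \big) \big|_{x_1 = x + x_0} \\
&= \pi_n\big( Y_{\mathcal{E}}(a(x), x_0) b(x) \big),
\end{align*}
from which the second part follows.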
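The geometric-series step in the proof can be checked on the simplest nontrivial case. The polynomial below is an assumed example, chosen only to illustrate the expansion; it is not taken from the text.

```latex
% Assumed example: p(x_1,x_2,\hbar) = x_1 - x_2 - \hbar, so f = x_1 - x_2 and g = 1.
\begin{align*}
\iota_{x_1, x_2, \hbar} \frac{1}{x_1 - x_2 - \hbar}
  &= \sum_{k \ge 0} \hbar^{k}\, \iota_{x_1, x_2} (x_1 - x_2)^{-k-1}, \\
(x_1 - x_2)^{n} \cdot \iota_{x_1, x_2, \hbar} \frac{1}{x_1 - x_2 - \hbar}
  &\equiv \sum_{k=0}^{n-1} \hbar^{k} (x_1 - x_2)^{n-k-1} \pmod{\hbar^{n}}.
\end{align*}
```

Modulo $\hbar^{n}$ the right-hand side of the second line is a polynomial, so multiplication by $f^{n}$ clears the denominator at level $n$. The power of $f$ that is needed grows with $n$, which is consistent with the polynomial in the definition of $\hbar$-adic quasi-compatibility being allowed to depend on $n$.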
An $\hbar$-adically quasi-compatible $\mathbb{C}[[\hbar]]$-submodule $K$ of $\mathcal{E}(W)$ is said to be $Y_{\mathcal{E}}$-closed if
\[ a(x)_m b(x) \in K \quad \text{for } a(x), b(x) \in K,\ m \in \mathbb{Z}. \]
Proposition 4.13. Let V be a YE -closed -adically quasi-compatible C[[]]-submodule of E(W ), containing 1W . Suppose that [V ] = V and V is -adically complete. Then (V, YE , 1W ) carries the structure of an -adic nonlocal vertex algebra and W is a faithful quasi V -module with YW (a(x), x0 ) = a(x0 ) for a(x) ∈ V . Furthermore, if V is -adically compatible, W is a module (instead of a quasi module).
Proof. Recall that E(W ) is topologically free. As a submodule of E(W ), V is torsionfree and separated. Being assumed to be -adically complete, V is topologically free. For any n ∈ N, as πn (a(x))m πn (b(x)) = πn (a(x)m b(x)) for a(x), b(x) ∈ V, m ∈ Z, we see that πn (V ) is a YE -closed quasi-compatible C[[]]-submodule of E(W/n W ), containing 1W . It follows from Theorem 4.3 that πn (V ) is a nonlocal vertex algebra over C with W/n W as a quasi-module. Now we prove that the map πn from V to πn (V ) reduces to a C[[]]-isomorphism from V /n V onto πn (V ), so that V /n V is a nonlocal vertex algebra over C. Let a(x) ∈ V be such that πn (a(x)) = 0 in E(W/n W ). Then a(x)W ⊂ n W [[x, x −1 ]]. So a(x) = n b(x) for some b(x) ∈ (EndW )[[x, x −1 ]]. By Lemma 4.6, b(x) ∈ E(W ). Then we have b(x) ∈ [V ] = V . Thus a(x) = n b(x) ∈ n V . This proves that V ∩ ker πn = n V , which implies V /n V πn (V ) ⊂ E(W/n W ). Consequently, V /n V is a nonlocal vertex algebra over C. By Propositions 2.11 and 2.24, V is an -adic nonlocal vertex algebra with W as a quasi V -module. The last part follows from Theorem 4.3 and Proposition 2.24. For convenience, we call any -adic nonlocal vertex algebra V in Proposition 4.13 an -adic nonlocal vertex subalgebra of E(W ). Lemma 4.14. Let U be an -adically quasi-compatible C[[]]-submodule of E(W ). Then [U ] is -adically quasi-compatible. If U is YE -closed, so is [U ]. Proof. Notice that [U ] ⊂ E(W ) by Lemma 4.6. As W is torsion-free, for any w ∈ W, s, n ∈ N, the relation s w ∈ s+n W implies w ∈ n W . Furthermore, for ψ ∈ (r ) (W ) implies ψ ∈ En(r ) (W ). (EndW )[[x1±1 , . . . , xr±1 ]], s, n ∈ N, the relation s ψ ∈ En+s Let a 1 (x), . . . , a r (x) ∈ [U ]. There exists k ∈ N such that k a i (x) ∈ U for i = 1, . . . , r . As the sequence k a 1 (x), . . . , k a r (x) in U is -adically quasi-compatible, for every n ∈ N, there exists 0 = p(x, y) ∈ C[x, y] such that ⎞ ⎛ (r ) r k ⎝ p(xi , x j )⎠ a 1 (x1 ) · · · a r (xr ) ∈ En+r k (W ), 1≤i< j≤r
which gives
\[ \Big( \prod_{1 \le i < j \le r} p(x_i, x_j) \Big) a^{1}(x_1) \cdots a^{r}(x_r) \in \mathcal{E}_n^{(r)}(W). \]
This proves that the sequence a 1 (x), . . . , a r (x) is -adically quasi-compatible. Therefore, [U ] is -adically quasi-compatible. Assume that U is YE -closed. Let a(x), b(x) ∈ [U ], m ∈ Z. By definition, there exists k ∈ N such that k a(x), k b(x) ∈ U . Then 2k (a(x)m b(x)) = (k a(x))m (k b(x)) ∈ U. Thus a(x)m b(x) ∈ [U ]. This proves that [U ] is YE -closed.
Theorem 4.15. Let K be a maximal -adically quasi-compatible C[[]]-submodule of E(W ). Then [K ] = K , K is -adically topologically free and YE -closed. Furthermore, (K , YE , 1W ) carries the structure of an -adic nonlocal vertex algebra with W as a quasi-module with YW (α(x), x0 ) = α(x0 ) for α(x) ∈ K . If K is -adically compatible, W is a module (instead of a quasi-module).
Proof. By Lemma 4.14, $[K]$ is $\hbar$-adically quasi-compatible. As $K$ is maximal, we have $[K] \subset K$. Thus $[K] = K$.

Let $a(x), b(x) \in K$, $m \in \mathbb{Z}$. For every $n \in \mathbb{N}$, $\pi_n(K)$ is quasi-compatible, so by Theorem 4.3, $\pi_n(K)$ generates a nonlocal vertex algebra $\langle \pi_n(K) \rangle$ over $\mathbb{C}$ and we have

$$\pi_n(K + \mathbb{C}a(x)_m b(x)) = \pi_n(K) + \mathbb{C}\pi_n(a(x))_m \pi_n(b(x)) \subset \langle \pi_n(K) \rangle,$$

a quasi-compatible $\mathbb{C}$-subspace of $E(W/\hbar^n W)$. This proves that $K + \mathbb{C}a(x)_m b(x)$ is $\hbar$-adically quasi-compatible in $E(W)$. Again, with $K$ maximal, we have $a(x)_m b(x) \in K$. Thus $K$ is $Y_E$-closed.

Now, let $\{\psi_m(x)\}$ be a sequence in $K$, satisfying the condition that for any $r \ge 0$, there exists $k \ge 0$ such that $\psi_m(x) - \psi_n(x) \in \hbar^r K$ whenever $m, n \ge k$. Since $E(W)$ is $\hbar$-adically complete, the sequence $\{\psi_m(x)\}$ has a limit, say $\psi(x)$, in $E(W)$. For any $n \in \mathbb{N}$, there exists $m \in \mathbb{N}$ such that $\psi_m(x) - \psi(x) \in \hbar^n E(W)$, which implies $\pi_n(\psi_m(x)) = \pi_n(\psi(x))$. Thus $\pi_n(\psi(x)) = \pi_n(\psi_m(x)) \in \pi_n(K)$. Consequently, $\pi_n(K + \mathbb{C}\psi(x)) \subset \pi_n(K)$, which is quasi-compatible. This proves that $K + \mathbb{C}\psi(x)$ is $\hbar$-adically quasi-compatible in $E(W)$ and then it follows that $\psi(x) \in K$. Thus $K$ is $\hbar$-adically complete, so that it is topologically free. Now, in view of Proposition 4.13, $K$ is an $\hbar$-adic nonlocal vertex algebra with $W$ as a quasi module. Furthermore, $W$ is a module if $K$ is $\hbar$-adically compatible.

Now, let $U$ be an $\hbar$-adically quasi-compatible subset of $E(W)$. In view of Zorn's lemma, there exists a maximal $\hbar$-adically quasi-compatible $\mathbb{C}[[\hbar]]$-submodule $K$ of $E(W)$, containing $U$ and $1_W$. Set $U^{(1)} = \mathbb{C}[[\hbar]]U + \mathbb{C}[[\hbar]]1_W$. Define $U^{(2)}$ to be the $\mathbb{C}[[\hbar]]$-span of the vectors $a(x)_m b(x)$ for $a(x), b(x) \in U^{(1)}$, $m \in \mathbb{Z}$. Since $K$ is $Y_E$-closed by Theorem 4.15, $U^{(2)} \subset K$. Then $U^{(2)}$ is $\hbar$-adically quasi-compatible. For $n \ge 1$, we inductively define $U^{(n+1)} = (U^{(n)})^{(2)}$. In this way, we obtain an increasing sequence of $\hbar$-adically quasi-compatible $\mathbb{C}[[\hbar]]$-submodules: $U^{(1)} \subset U^{(2)} \subset U^{(3)} \subset \cdots$. Set

$$U^{o} = \{a(x) \in E(W) \mid \hbar^k a(x) \in \cup_{n\ge 2} U^{(n)} \text{ for some } k \ge 1\}. \tag{4.21}$$

That is, $U^{o} = [\cup_{n\ge 2} U^{(n)}]$. In view of Lemma 3.5 we have

$$U^{o} \cap \hbar^n E(W) = \hbar^n U^{o} \quad \text{for } n \ge 1. \tag{4.22}$$
In particular, the induced topology of $U^{o}$ from $E(W)$ coincides with the $\hbar$-adic topology of $U^{o}$. Then we define $\langle U \rangle$ to be the $\hbar$-adic completion of $U^{o}$.

Theorem 4.16. Let $U$ be an $\hbar$-adically quasi-compatible subset of $E(W)$. Then $[\langle U \rangle] = \langle U \rangle$, $\langle U \rangle$ is topologically free, $\hbar$-adically quasi-compatible, and $Y_E$-closed, and $(\langle U \rangle, Y_E, 1_W)$ carries the structure of an $\hbar$-adic nonlocal vertex algebra and $W$ is a faithful quasi-$\langle U \rangle$-module with $Y_W(\alpha(x), x_0) = \alpha(x_0)$ for $\alpha(x) \in \langle U \rangle$. Furthermore, for any $\hbar$-adic nonlocal vertex subalgebra $V$ of $E(W)$, containing $U$, such that $[V] = V$, we have $\langle U \rangle \subset V$.

Proof. To prove $[\langle U \rangle] = \langle U \rangle$, let $a(x) \in [\langle U \rangle]$. By definition, $a(x) \in E(W)$ and there exists $k \ge 0$ such that $\hbar^k a(x) \in \langle U \rangle$. As $\langle U \rangle$ is the $\hbar$-adic completion of $U^{o}$, there exists a Cauchy sequence $\{\psi_m(x)\}$ in $U^{o}$ with $\hbar^k a(x)$ as a limit. There exists $r \ge 1$ such that

$$\psi_m(x) - \hbar^k a(x) \in \hbar^k E(W) \quad \text{for } m \ge r.$$
H. Li
Then $\psi_m(x) \in \hbar^k E(W)$ for $m \ge r$. Set $\psi_m(x) = \hbar^k \phi_m(x)$ for $m \ge r$ with $\phi_m(x) \in E(W)$. Noticing that $[U^{o}] = U^{o}$, we have $\phi_m(x) \in U^{o}$ for $m \ge r$. We see that $\{\phi_m(x)\}_{m \ge r}$ is a Cauchy sequence in $U^{o}$ with $a(x)$ as a limit. Thus $a(x) \in \langle U \rangle$. This proves $[\langle U \rangle] = \langle U \rangle$. As $\langle U \rangle$ is torsion-free, separated, and $\hbar$-adically complete by definition, $\langle U \rangle$ is topologically free.

It follows from definition that $\cup_{n\ge 2} U^{(n)}$ is $\hbar$-adically quasi-compatible and $Y_E$-closed. By Lemma 4.14, $U^{o}$ $(= [\cup_{n\ge 2} U^{(n)}])$ is $\hbar$-adically quasi-compatible and $Y_E$-closed. Let $\psi_1(x), \dots, \psi_r(x)$ be a sequence in $\langle U \rangle$ and let $n$ be any positive integer. For $1 \le i \le r$, there exists a sequence $\{\psi_{im}(x)\}$ in $U^{o}$, which converges to $\psi_i(x)$. Let $k$ be a positive integer such that

$$\psi_{im}(x) - \psi_i(x) \in \hbar^n E(W) \quad \text{for } 1 \le i \le r,\ m \ge k.$$

As $U^{o}$ is $\hbar$-adically quasi-compatible, there exists $0 \ne p(x, y) \in \mathbb{C}[x, y]$ such that

$$\pi_n\Bigl(\prod_{1\le i<j\le r} p(x_i, x_j)\Bigr)\psi_{1k}(x_1)\cdots\psi_{rk}(x_r) \in \mathrm{Hom}\bigl(W/\hbar^n W,\ (W/\hbar^n W)((x_1, \dots, x_r))\bigr).$$

Then

$$\pi_n\Bigl(\prod_{1\le i<j\le r} p(x_i, x_j)\Bigr)\psi_1(x_1)\cdots\psi_r(x_r) \in \mathrm{Hom}\bigl(W/\hbar^n W,\ (W/\hbar^n W)((x_1, \dots, x_r))\bigr).$$

This proves that $\psi_1(x), \dots, \psi_r(x)$ is $\hbar$-adically quasi-compatible. It follows from Lemma 3.6 that $\langle U \rangle$ is $Y_E$-closed. Now, by Proposition 4.13, $(\langle U \rangle, Y_E, 1_W)$ carries the structure of an $\hbar$-adic nonlocal vertex algebra and $W$ is a faithful quasi-$\langle U \rangle$-module. Let $V$ be an $\hbar$-adic nonlocal vertex subalgebra of $E(W)$ satisfying the condition that $U \subset V$ and $[V] = V$. It is straightforward to see that $\langle U \rangle \subset V$.

For a topologically free $\mathbb{C}[[\hbar]]$-module $W$, we say a subset $T$ spans $W$ $\hbar$-adically if $W = (\mathbb{C}T)[[\hbar]]$. The following is an $\hbar$-adic version of ([Li4], Theorem 6.3) which is an analogue of a theorem of Frenkel-Kac-Radul-Wang [FKRW] and of Meurman-Primc [MP] (cf. [LL]):

Theorem 4.17. Let $V$ be a topologically free $\mathbb{C}[[\hbar]]$-module, $U$ a subset of $V$, $\mathbf{1}$ a vector in $V$, $D$ a $\mathbb{C}[[\hbar]]$-module endomorphism of $V$, and $Y^0$ a map

$$Y^0 : U \to E(V); \quad u \mapsto Y^0(u, x) = u(x) = \sum_{m \in \mathbb{Z}} u_m x^{-m-1}.$$
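Assume that all the following conditions hold:

$$D\mathbf{1} = 0, \tag{4.23}$$

$$Y^0(u, x)\mathbf{1} \in V[[x]] \quad \text{and} \quad \lim_{x \to 0} Y^0(u, x)\mathbf{1} = u, \tag{4.24}$$

$$[D, Y^0(u, x)] = \frac{d}{dx} Y^0(u, x) \quad \text{for } u \in U, \tag{4.25}$$

$U(x) = \{u(x) \mid u \in U\}$ is $\hbar$-adically compatible, and $V$ is $\hbar$-adically spanned by vectors

$$u^{(1)}_{m_1} \cdots u^{(r)}_{m_r} \mathbf{1} \tag{4.26}$$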
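for $r \ge 0$, $u^{(i)} \in U$, $m_i \in \mathbb{Z}$. In addition we assume that there exists a $\mathbb{C}[[\hbar]]$-module morphism $\psi$ from $V$ to $\langle U(x) \rangle \subset E(V)$ such that $\psi(\mathbf{1}) = 1_V$ and

$$\psi(u_m v) = u(x)_m \psi(v) \quad \text{for } u \in U,\ v \in V,\ m \in \mathbb{Z}. \tag{4.27}$$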
Then $Y^0$ extends uniquely to a $\mathbb{C}[[\hbar]]$-module map $Y$ from $V$ to $E(V)$ such that $(V, Y, \mathbf{1})$ carries the structure of an $\hbar$-adic nonlocal vertex algebra.

Proof. The uniqueness is obvious, so it remains to establish the existence. As $U(x)$ is an $\hbar$-adically compatible subset of $E(V)$, by Theorem 4.16 we have an $\hbar$-adic nonlocal vertex algebra $\langle U(x) \rangle$ with $V$ as a faithful $\langle U(x) \rangle$-module, where $Y_V(\alpha(x), x_0) = \alpha(x_0)$ for $\alpha(x) \in \langle U(x) \rangle$. For $u \in U$, we have

$$Y_V(u(x), x_0)\mathbf{1} = u(x_0)\mathbf{1} \in V[[x_0]],$$

$$[D, Y_V(u(x), x_0)] = [D, Y^0(u, x_0)] = \frac{d}{dx_0} Y^0(u, x_0) = \frac{d}{dx_0} Y_V(u(x), x_0).$$
By Lemma 3.9, the map $\phi$ from $\langle U(x) \rangle$ to $V$, defined by $\phi(\alpha(x)) = \mathrm{Res}_x x^{-1} \alpha(x)\mathbf{1}$, is a $\langle U(x) \rangle$-module homomorphism. We see that $\phi(1_V) = \mathbf{1}$ and that for $u \in U$, $\alpha(x) \in \langle U(x) \rangle$,

$$\phi(Y_E(u(x), x_0)\alpha(x)) = Y_V(u(x), x_0)\phi(\alpha(x)) = u(x_0)\phi(\alpha(x)),$$

which amounts to

$$\phi(u(x)_m \alpha(x)) = u_m \phi(\alpha(x)) \quad \text{for } m \in \mathbb{Z}.$$

It follows that $\phi \circ \psi = 1_V$. Thus $\psi$ is a $\mathbb{C}[[\hbar]]$-module isomorphism from $V$ onto $\psi(V) \subset \langle U(x) \rangle$. For $u \in U$, we have

$$\psi(u) = \psi(u_{-1}\mathbf{1}) = u(x)_{-1} 1_V = u(x) \quad \text{and} \quad \phi(u(x)) = \phi(\psi(u)) = u.$$

Inside $\langle U(x) \rangle$, $\psi(V)$ is $\hbar$-adically spanned by vectors $u^{(1)}(x)_{m_1} \cdots u^{(r)}(x)_{m_r} 1_V$ for $r \ge 0$, $u^{(i)} \in U$, $m_i \in \mathbb{Z}$. For $a \in V$, we define $Y(a, x) \in (\mathrm{End}\,V)[[x, x^{-1}]]$ by

$$Y(a, x_0)b = \phi\bigl(Y_E(\psi(a)(x), x_0)\psi(b)(x)\bigr) \quad \text{for } b \in V.$$

As $\phi$ is a $\langle U(x) \rangle$-module homomorphism with $\phi \circ \psi = 1$, we have

$$Y(a, x_0)b = Y_V(\psi(a)(x), x_0)b = \psi(a)(x_0)b,$$

so that $Y(a, x) = \psi(a)(x) \in E(V)$. In particular, for $u \in U$,

$$Y(u, x_0) = Y_V(\psi(u), x_0) = Y_V(u(x), x_0) = u(x_0) = Y^0(u, x_0),$$

so the map $Y$ extends $Y^0$. For $v \in V$, we have

$$Y(\mathbf{1}, x_0)v = Y_V(1_V, x_0)v = 1_V(v) = v, \qquad Y(v, x_0)\mathbf{1} = Y_V(\psi(v)(x), x_0)\mathbf{1} \in V[[x_0]],$$
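and

$$\lim_{x_0 \to 0} Y(v, x_0)\mathbf{1} = \lim_{x_0 \to 0} Y_V(\psi(v)(x), x_0)\mathbf{1} = \phi(\psi(v)) = v.$$

To prove that $(V, Y, \mathbf{1})$ is an $\hbar$-adic nonlocal vertex algebra, we show that for every positive integer $n$, $V/\hbar^n V$ with the reduced structures is a nonlocal vertex algebra over $\mathbb{C}$. For any $n \ge 1$, $\langle U(x) \rangle/\hbar^n \langle U(x) \rangle$ is a nonlocal vertex algebra over $\mathbb{C}$ and $\phi$ reduces to a homomorphism $\bar{\phi}$ from $\langle U(x) \rangle/\hbar^n \langle U(x) \rangle$ to $V/\hbar^n V$. On the other hand, the map $\psi$ reduces to a map $\bar{\psi}$ from $V/\hbar^n V$ to $\langle U(x) \rangle/\hbar^n \langle U(x) \rangle$ such that $\bar{\phi} \circ \bar{\psi} = 1$. We see that the image $\overline{\psi(V)}$ of $\psi(V)$ in $\langle U(x) \rangle/\hbar^n \langle U(x) \rangle$ is a nonlocal vertex subalgebra and $\overline{\psi(V)} \simeq V/\hbar^n V$ through the maps $\bar{\phi}$ and $\bar{\psi}$. It follows that $V/\hbar^n V$ is a nonlocal vertex algebra over $\mathbb{C}$. Therefore, $(V, Y, \mathbf{1})$ is an $\hbar$-adic nonlocal vertex algebra. This establishes the existence, concluding the proof.

Definition 4.18. Let $W$ be a topologically free $\mathbb{C}[[\hbar]]$-module as before. We define a $\mathbb{C}[[\hbar]]$-module map

$$Z(x_1, x_2) : E(W) \hat{\otimes} E(W) \hat{\otimes} \mathbb{C}((x))[[\hbar]] \to (\mathrm{End}\,W)[[x_1^{\pm 1}, x_2^{\pm 1}]]$$

by

$$Z(x_1, x_2)(a(x) \otimes b(x) \otimes f(x)) = f(x_1 - x_2)a(x_1)b(x_2). \tag{4.28}$$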
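As an illustration of (4.28), the following worked case is a sketch added for orientation and is not taken from the paper. It picks the hypothetical input $f(x) = x^{-1}$ and assumes the usual formal-calculus convention that a binomial such as $(x_1 - x_2)^{-1}$ is expanded in nonnegative powers of its second variable. Under that assumption the two orderings of the variables differ by a delta function, which is why the later proofs distinguish $(x_1 - x_2)^k$ from $(-x_2 + x_1)^k$.

```latex
% Assumed convention: expand in nonnegative powers of the second variable.
\begin{aligned}
Z(x_1,x_2)\bigl(a(x)\otimes b(x)\otimes x^{-1}\bigr)
  &= (x_1-x_2)^{-1}a(x_1)b(x_2)
   = \sum_{i\ge 0} x_1^{-1-i}x_2^{\,i}\,a(x_1)b(x_2),\\
(x_1-x_2)^{-1} - (-x_2+x_1)^{-1}
  &= \sum_{i\ge 0} x_1^{-1-i}x_2^{\,i} + \sum_{i\ge 0} x_2^{-1-i}x_1^{\,i}
   = \sum_{n\in\mathbb{Z}} x_1^{\,n}x_2^{-n-1}
   = x_2^{-1}\delta\!\left(\frac{x_1}{x_2}\right).
\end{aligned}
```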
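Lemma 4.19. Let $a(x), b(x) \in E(W)$, $B(x) \in E(W) \hat{\otimes} E(W) \hat{\otimes} \mathbb{C}((x))[[\hbar]]$ such that $a(x_1)b(x_2) \sim Z(x_2, x_1)(B(x))$. Then $(a(x), b(x))$ is $\hbar$-adically compatible and

$$Y_E(a(x), x_0)b(x) = \mathrm{Res}_{x_1}\left( x_0^{-1}\delta\!\left(\frac{x_1 - x}{x_0}\right) a(x_1)b(x) - x_0^{-1}\delta\!\left(\frac{x - x_1}{-x_0}\right) Z(x, x_1)(B(x)) \right). \tag{4.29}$$

Proof. By definition, for any positive integer $n$, there exists $k \in \mathbb{N}$ such that

$$(x_1 - x_2)^k \pi_n(a(x_1))\pi_n(b(x_2)) = (x_1 - x_2)^k \pi_n^{(2)}\bigl(Z(x_2, x_1)(B(x))\bigr).$$

From [Li4] we have

$$Y_E(\pi_n(a(x)), x_0)\pi_n(b(x)) = \mathrm{Res}_{x_1}\left( x_0^{-1}\delta\!\left(\frac{x_1 - x}{x_0}\right) \pi_n(a(x_1))\pi_n(b(x)) - x_0^{-1}\delta\!\left(\frac{x - x_1}{-x_0}\right) \pi_n^{(2)} Z(x, x_1)(B(x)) \right).$$

Then it follows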
Definition 4.20. Let $V$ be an $\hbar$-adic nonlocal vertex subalgebra of $E(W)$. Define a $\mathbb{C}[[\hbar]]$-module map $Y_E(x_2, x_1)$ from $V \hat{\otimes} V \hat{\otimes} \mathbb{C}((x))[[\hbar]]$ to $(\mathrm{End}\,V)[[x_1^{\pm 1}, x_2^{\pm 1}]]$ by

$$Y_E(x_2, x_1)(a(x) \otimes b(x) \otimes f(x)) = f(x_2 - x_1)Y_E(a(x), x_2)Y_E(b(x), x_1) \tag{4.30}$$

for $a(x), b(x) \in V$, $f(x) \in \mathbb{C}((x))[[\hbar]]$.
Lemma 4.21. Let $V$ be an $\hbar$-adic nonlocal vertex subalgebra of $E(W)$, and let $u(x), v(x) \in V$, $A(x) \in V \hat{\otimes} V \hat{\otimes} \mathbb{C}((x))[[\hbar]]$. Suppose that

$$u(x_1)v(x_2) \sim Z(x_2, x_1)(A(x)) \quad \text{in } (\mathrm{End}\,W)[[x_1^{\pm 1}, x_2^{\pm 1}]]. \tag{4.31}$$

Then

$$Y_E(u(x), x_1)Y_E(v(x), x_2) \sim Y_E(x_2, x_1)(A(x)) \tag{4.32}$$

in $(\mathrm{End}\,V)[[x_1^{\pm 1}, x_2^{\pm 1}]]$ and

$$x_0^{-1}\delta\!\left(\frac{x_1 - x_2}{x_0}\right) Y_E(u(x), x_1)Y_E(v(x), x_2) - x_0^{-1}\delta\!\left(\frac{x_2 - x_1}{-x_0}\right) Y_E(x_2, x_1)(A(x)) = x_2^{-1}\delta\!\left(\frac{x_1 - x_0}{x_2}\right) Y_E(Y_E(u(x), x_0)v(x), x_2). \tag{4.33}$$

Proof. Let $n$ be a positive integer. We have a nonlocal vertex algebra $\pi_n(V) \subset E(W/\hbar^n W)$ over $\mathbb{C}$ with $\ker \pi_n = V \cap \hbar^n E(W)$ $(= \hbar^n V)$. From assumption, there exists a nonnegative integer $k$ such that

$$(x_1 - x_2)^k \pi_n(u(x_1))\pi_n(v(x_2)) = (x_1 - x_2)^k Z(x_2, x_1)(\pi_n(A(x))).$$

From [Li4] (Prop. 3.13), we have

$$(x_1 - x_2)^k Y_E(\pi_n(u(x)), x_1)Y_E(\pi_n(v(x)), x_2) = (x_1 - x_2)^k Y_E(x_2, x_1)\pi_n(A(x)),$$

as desired.
Definition 4.22. A subset $U$ of $E(W)$ is said to be $\hbar$-adically S-local if for any $a(x), b(x) \in U$, there exists $A(x) \in (\mathbb{C}U \otimes \mathbb{C}U \otimes \mathbb{C}((x)))[[\hbar]]$ such that

$$a(x_1)b(x_2) \sim Z(x_2, x_1)(A(x)). \tag{4.34}$$

Lemma 4.23. Every $\hbar$-adically S-local subset of $E(W)$ is $\hbar$-adically compatible.

Proof. Let $U$ be an $\hbar$-adically S-local subset of $E(W)$. For every positive integer $n$, we see that $\pi_n(\mathbb{C}[[\hbar]]U)$ is an S-local subset of $E(W/\hbar^n W)$, so that $\pi_n(\mathbb{C}[[\hbar]]U)$ is compatible. By definition, $\mathbb{C}[[\hbar]]U$ is an $\hbar$-adically compatible subset of $E(W)$. Thus $U$ is an $\hbar$-adically compatible subset.

The following is a refinement of Theorem 4.17 (cf. [Li5], Theorem 2.9):

Theorem 4.24. Let $V$ be a topologically free $\mathbb{C}[[\hbar]]$-module, $U$ a subset of $V$, $\mathbf{1}$ a vector in $V$, and $Y^0$ a map

$$Y^0 : U \to E(V); \quad u \mapsto Y^0(u, x) = u(x) = \sum_{m \in \mathbb{Z}} u_m x^{-m-1}.$$

Assume that all the following conditions hold:

$$Y^0(u, x)\mathbf{1} \in V[[x]] \quad \text{and} \quad \lim_{x \to 0} Y^0(u, x)\mathbf{1} = u \quad \text{for } u \in U, \tag{4.35}$$
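$U(x) = \{u(x) \mid u \in U\}$ is $\hbar$-adically S-local, and $V$ is $\hbar$-adically spanned by vectors

$$u^{(1)}_{m_1} \cdots u^{(r)}_{m_r} \mathbf{1} \tag{4.36}$$

for $r \ge 0$, $u^{(i)} \in U$, $m_i \in \mathbb{Z}$. In addition we assume that there exists a $\mathbb{C}[[\hbar]]$-module morphism $\psi$ from $V$ to $\langle U(x) \rangle \subset E(V)$ such that $\psi(\mathbf{1}) = 1_V$ and

$$\psi(u_m v) = u(x)_m \psi(v) \quad \text{for } u \in U,\ v \in V,\ m \in \mathbb{Z}. \tag{4.37}$$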
Then the map $Y^0$ extends uniquely to a $\mathbb{C}[[\hbar]]$-module map $Y$ from $V$ to $E(V)$ such that $(V, Y, \mathbf{1})$ carries the structure of an $\hbar$-adic weak quantum vertex algebra.

Proof. We shall slightly modify the proof of Theorem 4.17. By Lemma 4.23, $U(x)$ is $\hbar$-adically compatible, so it generates an $\hbar$-adic nonlocal vertex algebra $\langle U(x) \rangle$. Set $K = (\cup_{n\ge 1} U(x)^{(n)})[[\hbar]] \subset \langle U(x) \rangle \subset E(V)$. By Proposition 3.13, we have a $\mathbb{C}[[\hbar]]$-map $\phi : K \to V$ such that $\phi(1_V) = \mathbf{1}$,

$$\phi(u(x)_n a(x)) = u_n \phi(a(x)) \quad \text{for } u \in U,\ n \in \mathbb{Z},\ a(x) \in K.$$

Then continue with the proof of Theorem 4.17 to see that $Y^0$ extends uniquely to a $\mathbb{C}[[\hbar]]$-module map $Y$ from $V$ to $E(V)$ such that $(V, Y, \mathbf{1})$ carries the structure of an $\hbar$-adic nonlocal vertex algebra. From Lemma 4.21, $U$ is an $\hbar$-adically S-local subset of $V$ and then by Proposition 3.12, $V$ is an $\hbar$-adic weak quantum vertex algebra.

Notice that compared with the corresponding theorem in [FKRW] and [MP], Theorems 4.17 and 4.24 have an extra assumption on the existence of the map $\psi$. By using classical linear algebra, it is not hard to see that in the general noncommutative situation, an assumption like this is indeed necessary. This assumption means that $V$ is a universal vacuum module for a certain algebra.

The following results are companions in practical applications:

Lemma 4.25. Let $U$ be a subset of $E(W)$ satisfying the condition that for $a(x), b(x) \in U$, there exist $B(x) \in (\mathbb{C}U \otimes \mathbb{C}U \otimes \mathbb{C}((x)))[[\hbar]]$ and $p(x, \hbar) \in \mathbb{C}[x, \hbar]$ with $p(x, 0) \ne 0$ such that

$$p(x_1 - x_2, \hbar)a(x_1)b(x_2) = p(x_1 - x_2, \hbar)Z(x_2, x_1)(B(x)). \tag{4.38}$$
Then $U$ is $\hbar$-adically S-local.

Proof. Let $a(x), b(x) \in U$. By assumption there exist $B$ and $p(x, \hbar)$ with all the assumed properties. With $p(x, 0) \ne 0$, we have $p(x, \hbar) = f(x) - \hbar g(x, \hbar)$, where $0 \ne f(x) \in \mathbb{C}[x]$, $g(x, \hbar) \in \mathbb{C}[x, \hbar]$. Expand $p(x, \hbar)^{-1}$ in the nonnegative powers of $\hbar$ as follows:

$$p(x, \hbar)^{-1} = \sum_{i \ge 0} f(x)^{-1-i} g(x, \hbar)^i \hbar^i \in \mathbb{C}((x))[[\hbar]],$$

where $f(x)^{-i-1}$ is understood as an element of $\mathbb{C}((x))$. Let $n$ be a positive integer. Then

$$p(x, \hbar)^{-1} \equiv \sum_{i=0}^{n-1} f(x)^{-1-i} g(x, \hbar)^i \hbar^i \pmod{\hbar^n \mathbb{C}((x))[[\hbar]]}.$$
Let $k$ be a nonnegative integer such that $x^k f(x)^{-n} \in \mathbb{C}[[x]]$, so that

$$(x_1 - x_2)^k f(x_1 - x_2)^{-1-i} = (-x_2 + x_1)^k f(-x_2 + x_1)^{-1-i}$$

for all $i = 0, \dots, n - 1$. Set $A = p(x_1 - x_2, \hbar)a(x_1)b(x_2)$, the common quantity of both sides of (4.38). We have

$$\begin{aligned}
(x_1 - x_2)^k a(x_1)b(x_2) &= (x_1 - x_2)^k p(x_1 - x_2, \hbar)^{-1} A \\
&\equiv (x_1 - x_2)^k \sum_{i=0}^{n-1} f(x_1 - x_2)^{-1-i} g(x_1 - x_2, \hbar)^i \hbar^i A \pmod{\hbar^n} \\
&= (-x_2 + x_1)^k \sum_{i=0}^{n-1} f(-x_2 + x_1)^{-1-i} g(-x_2 + x_1, \hbar)^i \hbar^i A \\
&\equiv (-x_2 + x_1)^k \sum_{i \ge 0} f(-x_2 + x_1)^{-1-i} g(-x_2 + x_1, \hbar)^i \hbar^i A \pmod{\hbar^n} \\
&= (-x_2 + x_1)^k Z(x_2, x_1)B(x).
\end{aligned}$$

This proves $a(x_1)b(x_2) \sim Z(x_2, x_1)B(x)$. Thus $U$ is $\hbar$-adically S-local.
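The truncated geometric expansion of $p(x, \hbar)^{-1}$ used in this proof can be checked numerically. The following is only an illustrative sketch, not part of the paper: it fixes a numerical value for $x$ (so that $f(x)$ becomes a nonzero rational number `f0`), stores power series in $\hbar$ as coefficient lists, and verifies that $p$ times the truncated sum is $1$ modulo $\hbar^n$. The helper names `mul_trunc`, `truncated_inverse`, `check` and the sample choice $p(x, \hbar) = x + \hbar$ at $x = 3$ are assumptions made for the example.

```python
from fractions import Fraction


def mul_trunc(a, b, n):
    """Multiply two power series in h (coefficient lists), dropping h^n and higher."""
    out = [Fraction(0)] * n
    for i, ai in enumerate(a[:n]):
        for j, bj in enumerate(b[:n - i]):
            out[i + j] += ai * bj
    return out


def truncated_inverse(f0, g, n):
    """Return sum_{i<n} f0^(-1-i) * g^i * h^i modulo h^n, where p = f0 - h*g."""
    total = [Fraction(0)] * n
    g_pow = [Fraction(1)] + [Fraction(0)] * (n - 1)  # g^0 as a series in h
    for i in range(n):
        coeff = Fraction(1) / f0 ** (i + 1)
        # Multiplying by h^i shifts the coefficients of g^i by i places.
        for j in range(n - i):
            total[i + j] += coeff * g_pow[j]
        g_pow = mul_trunc(g_pow, g, n)
    return total


def check(f0, g, n):
    """Return p * (truncated inverse) modulo h^n; the expected value is 1."""
    p = [f0] + [-c for c in g]  # coefficients of p = f0 - h*g
    return mul_trunc(p, truncated_inverse(f0, g, n), n)


# p(x, h) = x + h evaluated at x = 3, i.e. f0 = 3 and g = -1.
print(check(Fraction(3), [Fraction(-1)], 5))
```

The check works for any polynomial `g` in $\hbar$, since $(f - \hbar g)\sum_{i<n} f^{-1-i}(\hbar g)^i = 1 - (\hbar g/f)^n$, and the last term is divisible by $\hbar^n$.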
Proposition 4.26. Let $V$ be an $\hbar$-adic nonlocal vertex subalgebra of $E(W)$. Suppose that $a(x), b(x) \in V$, $B(x) \in V \hat{\otimes} V \hat{\otimes} \mathbb{C}((x))[[\hbar]]$, and $p(x, \hbar) \in \mathbb{C}[x, \hbar]$ with $p(x, 0) \ne 0$ satisfy

$$p(x_1 - x_2, \hbar)a(x_1)b(x_2) = p(x_1 - x_2, \hbar)Z(x_2, x_1)(B(x)). \tag{4.39}$$

Then

$$p(x_1 - x_2, \hbar)Y_E(a(x), x_1)Y_E(b(x), x_2) = p(x_1 - x_2, \hbar)Y_E(x_2, x_1)(B(x)). \tag{4.40}$$

Proof. In view of Lemma 4.25, we have $a(x_1)b(x_2) \sim Z(x_2, x_1)(B(x))$. Furthermore, by Lemma 4.21 we have $Y_E(a(x), x_1)Y_E(b(x), x_2) \sim Y_E(x_2, x_1)(B(x))$. Then the following Jacobi identity holds:

$$x_0^{-1}\delta\!\left(\frac{x_1 - x_2}{x_0}\right) Y_E(a(x), x_1)Y_E(b(x), x_2) - x_0^{-1}\delta\!\left(\frac{x_2 - x_1}{-x_0}\right) Y_E(x_2, x_1)(B(x)) = x_1^{-1}\delta\!\left(\frac{x_2 + x_0}{x_1}\right) Y_E(Y_E(a(x), x_0)b(x), x_2).$$

By Proposition 4.12 we have

$$p(x_0, \hbar)Y_E(a(x), x_0)b(x) = \bigl(p(x_1 - x, \hbar)a(x_1)b(x)\bigr)\big|_{x_1 = x + x_0},$$

which involves only nonnegative integer powers of $x_0$. Multiplying both sides of the Jacobi identity by $p(x_0, \hbar)$, and then applying $\mathrm{Res}_{x_0}$ we obtain the desired identity.
We also have:

Proposition 4.27. Let $W$ be given as before and let $V$ be an $\hbar$-adic nonlocal vertex subalgebra of $E(W)$ such that $V$ is $\hbar$-adically compatible, and let

$$m \in \mathbb{Z},\quad u(x), v(x), c^0(x), c^1(x), \dots \in V,\quad A(x) \in V \hat{\otimes} V \hat{\otimes} \mathbb{C}((x))[[\hbar]],$$

satisfying the condition that for every positive integer $n$, there exists a nonnegative integer $r$ such that $c^j(x) \in \hbar^n V$ for $j \ge r$. Suppose that

$$(x_1 - x_2)^m u(x_1)v(x_2) - (-x_2 + x_1)^m Z(x_2, x_1)(A(x)) = \sum_{j \ge 0} c^j(x_2) \frac{1}{j!}\left(\frac{\partial}{\partial x_2}\right)^{j} x_2^{-1}\delta\!\left(\frac{x_1}{x_2}\right). \tag{4.41}$$

Then

$$(x_1 - x_2)^m Y_E(u(x), x_1)Y_E(v(x), x_2) - (-x_2 + x_1)^m Y_E(x_2, x_1)(A(x)) = \sum_{j \ge 0} Y_E(c^j(x), x_2) \frac{1}{j!}\left(\frac{\partial}{\partial x_2}\right)^{j} x_2^{-1}\delta\!\left(\frac{x_1}{x_2}\right). \tag{4.42}$$

Proof. Since $V$ is $\hbar$-adically compatible, $W$ is a faithful $V$-module with $Y_W(\alpha(x), x_0) = \alpha(x_0)$ for $\alpha(x) \in V$. Then it follows immediately from Proposition 2.25.

5. ℏ-deformations of Quantum Vertex Algebras $V_Q$

In this section we construct a family of $\hbar$-adic quantum vertex algebras as $\hbar$-deformations of certain quantum vertex algebras which were studied in [KL]. One special case gives rise to an $\hbar$-deformed $\beta\gamma$-system, while another special case gives rise to an $\hbar$-deformation of the vertex operator superalgebra $V_L$ associated to the rank-one lattice $L = \mathbb{Z}\alpha$ with $\langle \alpha, \alpha \rangle = 1$. We essentially deal with the same algebras as in [KL] with a formal parameter $\hbar$, instead of a nonzero complex number $q$.

We first recall the quantum vertex algebras of Zamolodchikov-Faddeev type from [KL]. Let $l$ be a positive integer and let $Q = (q_{ij})_{i,j=1}^{l}$ be a complex matrix such that

$$q_{ij}q_{ji} = 1 \quad \text{for } 1 \le i, j \le l. \tag{5.1}$$
Define $A_Q$ to be the associative algebra with identity (over $\mathbb{C}$) with generators

$$X_{i,n},\ Y_{i,n} \quad (i = 1, \dots, l,\ n \in \mathbb{Z}),$$

subject to relations

$$X_{i,m}X_{j,n} = q_{ij}X_{j,n}X_{i,m}, \quad Y_{i,m}Y_{j,n} = q_{ij}Y_{j,n}Y_{i,m}, \quad X_{i,m}Y_{j,n} - q_{ji}Y_{j,n}X_{i,m} = \delta_{i,j}\delta_{m+n+1,0} \tag{5.2}$$

for $1 \le i, j \le l$, $m, n \in \mathbb{Z}$. A vector $w$ in an $A_Q$-module is called a vacuum vector if

$$X_{i,n}w = Y_{i,n}w = 0 \quad \text{for } 1 \le i \le l,\ n \ge 0,$$

and an $A_Q$-module $W$ equipped with a vacuum vector that generates $W$ is called a vacuum $A_Q$-module.
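To make (5.1) and (5.2) concrete, it may help to specialise to $l = 1$. The short computation below is an illustration added here and uses only the relations above; the abbreviations $q = q_{11}$, $X_m = X_{1,m}$, $Y_n = Y_{1,n}$ are introduced just for this sketch. It shows why the only possible values are $q = \pm 1$, the two cases taken up in Examples 5.5 and 5.6.

```latex
% Rank-one specialisation of (5.1) and (5.2), with q = q_{11}.
\begin{aligned}
&\text{(5.1) with } i=j=1: && q^2 = 1, \ \text{hence } q = \pm 1;\\
&\text{(5.2):} && X_mX_n = qX_nX_m,\quad Y_mY_n = qY_nY_m,\quad
   X_mY_n - qY_nX_m = \delta_{m+n+1,0};\\
&q = 1: && [X_m,X_n] = [Y_m,Y_n] = 0,\quad [X_m,Y_n] = \delta_{m+n+1,0};\\
&q = -1: && X_mX_n + X_nX_m = 0 \ (\text{so } X_m^2 = 0),\quad
   X_mY_n + Y_nX_m = \delta_{m+n+1,0}.
\end{aligned}
```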
Let $J_Q$ be the left ideal of $A_Q$, generated by $X_{i,n}, Y_{i,n}$ $(1 \le i \le l,\ n \ge 0)$, that is,

$$J_Q = \sum_{i=1}^{l}\sum_{n \ge 0} (A_Q X_{i,n} + A_Q Y_{i,n}).$$

Set

$$V_Q = A_Q/J_Q, \tag{5.3}$$

a left $A_Q$-module, and set $\mathbf{1} = 1 + J_Q \in V_Q$. Then $\mathbf{1}$ is a vacuum vector and $V_Q$ equipped with $\mathbf{1}$ is a vacuum $A_Q$-module. For $1 \le i \le l$, set

$$u^{(i)} = X_{i,-1}\mathbf{1}, \quad v^{(i)} = Y_{i,-1}\mathbf{1} \in V_Q \tag{5.4}$$

and set

$$X_i(x) = \sum_{n \in \mathbb{Z}} X_{i,n}x^{-n-1}, \quad Y_i(x) = \sum_{n \in \mathbb{Z}} Y_{i,n}x^{-n-1} \in A_Q[[x, x^{-1}]]. \tag{5.5}$$

It was proved in [KL] that there exists a unique quantum vertex algebra structure on $V_Q$ with $\mathbf{1}$ as the vacuum vector and with

$$Y(u^{(i)}, x) = X_i(x), \quad Y(v^{(i)}, x) = Y_i(x) \quad \text{for } 1 \le i \le l.$$
It was also proved that VQ is nondegenerate. Next, we are going to construct a family of -adic quantum vertex algebras by deforming VQ . For this purpose we shall need the following notion (cf. [Li3]): Definition 5.1. Let V be a general nonlocal vertex algebra. An -adic pseudo-endomorphism of V is a linear map
(x) : V → (V ⊗ C((x)))[[]] satisfying the condition that (x)1 = 1 ⊗ 1,
(x1 )Y (v, x2 ) = Y ( (x1 − x2 )v, x2 ) (x1 )
for v ∈ V,
(5.6)
where the map Y is canonically extended. An -adic pseudo-endomorphism (x) is called a pseudo-automorphism if there exists an -adic pseudo-endomorphism (x) such that (x)(x)v = v = (x) (x)v for v ∈ V . We say that -adic pseudo-endomorphisms (x) and (x) commute if (x1 )(x2 ) = (x2 ) (x1 ). The following is an -adic version of Lemma 3.15 of [KL]: Lemma 5.2. Let Q = (qi j ) be an l × l matrix as before and let p1 (x, ), . . . , pl (x, ) be any sequence in C((x))[[]] with pi (x, 0) = 0 for 1 ≤ i ≤ l (so that pi (x, ) are invertible). Then there exists an -adic pseudo-automorphism (x) of VQ such that
(x)(u (i) ) = u (i) ⊗ pi (x, ),
(x)(v (i) ) = v (i) ⊗ pi (x, )−1
Furthermore, all such pseudo-automorphisms mutually commute.
for 1 ≤ i ≤ l.
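Proof. Note that $V_Q \otimes \mathbb{C}((x))[[\hbar]] \subset (V_Q \otimes \mathbb{C}((x)))[[\hbar]]$. As $\mathbb{C}((x))[[\hbar]]$ is a commutative associative algebra over $\mathbb{C}$ with $-\frac{d}{dx}$ as a derivation, $\mathbb{C}((x))[[\hbar]]$ becomes a vertex algebra with $1$ as the vacuum vector and with

$$Y(f, z)g = \bigl(e^{-z\frac{d}{dx}} f\bigr)g \quad \text{for } f, g \in \mathbb{C}((x))[[\hbar]].$$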
We then equip VQ ⊗ C((x))[[]] with the tensor product vertex algebra structure with Yten denoting the vertex operator map. Then Yten (v ⊗ f (x), z) = Y (v, z) ⊗ f (x − z) for v ∈ V, f (x) ∈ C((x))[[]]. For A(x) ∈ V ⊗ C((x))[[]], we have Yten (A(x), z) = Y (A(x − z), z), noting that as in (5.6), Y is C((x))[[]]-bilinear. An -adic pseudo-endomorphism from VQ to VQ ⊗ C((x))[[]] exactly amounts to a vertex algebra homomorphism. It is straightforward to check that with X i (z) and Yi (z) (1 ≤ i ≤ l) acting on VQ ⊗ C((x))[[]] as Y (u (i) ⊗ pi (x, ), z) and Y (v (i) ⊗ pi (x, )−1 , z), respectively, VQ ⊗ C((x))[[]] becomes an AQ -module with 1 ⊗ 1 as a vacuum vector. Since VQ is a universal vacuum AQ -module, it follows that there exists an AQ -module homomorphism θ from VQ to VQ ⊗ C((x))[[]] such that θ (1) = 1 ⊗ 1. Because VQ as a quantum vertex algebra is generated by u (i) , v (i) (1 ≤ i ≤ l), it follows that θ is a vertex algebra homomorphism. We have θ (u (i) ) = u (i) ⊗ pi (x, ), θi (v (i) ) = v (i) ⊗ pi (x, )−1 for 1 ≤ i ≤ l. Denoting θ alternatively by (x), we see that (x) satisfies all the properties.
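Set

$$G(x, \hbar) = \left\{ \frac{p(x, \hbar)}{q(x, \hbar)} \;\middle|\; p(x, \hbar), q(x, \hbar) \in \mathbb{C}[x, \hbar] \text{ with } p(x, 0), q(x, 0) \ne 0 \right\}, \tag{5.7}$$

an abelian group. We also consider $G(x, \hbar)$ as a (multiplicative) subgroup of $\mathbb{C}((x))[[\hbar]]$.

Theorem 5.3. Let $Q = (q_{ij})_{i,j=1}^{l}$ be given as before and let

$$p_{ij}(x, \hbar) \in G(x, \hbar) \subset \mathbb{C}((x))[[\hbar]] \tag{5.8}$$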
such that pi j (x, 0) = 1 for 1 ≤ i, j ≤ l. For 1 ≤ i ≤ l, let i (x) be the pseudo-automorphism of VQ such that
i (x)(u ( j) ) = u ( j) ⊗ pi j (x, ),
i (x)(v ( j) ) = v ( j) ⊗ pi j (x, )−1
for 1 ≤ j ≤ l,
obtained in Lemma 5.2. Then there exists a unique -adic quantum vertex algebra structure on VQ [[]] with 1 as the vacuum vector and with Y(u (i) , x) = Y (u (i) , x) i (x),
Y(v (i) , x) = Y (v (i) , x) i (x)−1
for 1 ≤ j ≤ l.
Furthermore, VQ [[]] is non-degenerate and generated by u (i) , v (i) (1 ≤ i ≤ l), and the following relations hold for 1 ≤ i, j ≤ l: pi j (x1 − x2 , )−1 Y(u (i) , x1 )Y(u ( j) , x2 ) = q ji p ji (x2 − x1 , )−1 Y(u ( j) , x2 )Y(u (i) , x1 ), pi j (x1 − x2 , )−1 Y(v (i) , x1 )Y(v ( j) , x2 ) = qi j p ji (x2 − x1 , )−1 Y(v ( j) , x2 )Y(v (i) , x1 ), pi j (x1 − x2 , )Y(u (i) , x1 )Y(v ( j) , x2 ) − q ji p ji (x2 − x1 , )Y(v ( j) , x2 )Y(u (i) , x1 ) x1 = δi j x2−1 δ . x2 Proof. Note that from Lemma 5.2, pseudo-automorphisms i (x) (1 ≤ i ≤ l) are mutually commuting. For 1 ≤ i ≤ l, set a (i) (x) = Y (u (i) , x) i (x),
b(i) (x) = Y (v (i) , x) i−1 (x).
We have pi j (x1 − x2 , )−1 a (i) (x1 )a ( j) (x2 ) = pi j (x1 − x2 , )−1 Y (u (i) , x1 ) i (x1 )Y (u ( j) , x2 ) j (x2 ) = Y (u (i) , x1 )Y (u ( j) , x2 ) i (x1 ) j (x2 ) = qi j Y (u ( j) , x2 )Y (u (i) , x1 ) j (x2 ) i (x1 ) = qi j p ji (x2 − x1 , )−1 a ( j) (x2 )a (i) (x1 ), pi j (x1 − x2 , )−1 b(i) (x1 )b( j) (x2 ) = pi j (x1 − x2 , )−1 Y (v (i) , x1 ) i−1 (x1 )Y (v ( j) , x2 ) −1 j (x 2 ) = Y (v (i) , x1 )Y (v ( j) , x2 ) i−1 (x1 ) −1 j (x 2 ) −1 = qi j Y (v ( j) , x2 )Y (v (i) , x1 ) −1 j (x 2 ) i (x 1 )
= qi j p ji (x2 − x1 , )−1 b( j) (x2 )b(i) (x1 ), pi j (x1 − x2 , )a (i) (x1 )b( j) (x2 ) − q ji p ji (x2 − x1 , )b( j) (x2 )a (i) (x1 ) −1 ( j) (i) = Y (u (i) , x1 )Y (v ( j) , x2 ) i (x1 ) −1 j (x 2 )−q ji Y (v , x 2 )Y (u , x 1 ) j (x 2 ) i (x 1 ) x1
−1 = δi j x2−1 δ j (x 2 ) i (x 1 ) x2 x1
−1 = δi j x2−1 δ j (x 2 ) i (x 2 ) x2 x1 . = δi j x2−1 δ x2
Set T = {a (i) (x), b(i) (x) | 1 ≤ i ≤ l} ⊂ E(VQ [[]]). From Lemma 4.25, T is -adically S-local and hence -adically compatible by Lemma 4.23. By Theorem 4.16, T generates an -adic nonlocal vertex algebra T inside E(VQ [[]]) with VQ [[]] as a module.
We are going to apply Theorem 4.24 with V = VQ [[]], U = {u (i) , v (i) | 1 ≤ i ≤ l}, and Y0 (u (i) , x) = a (i) (x), Y0 (v (i) , x) = b(i) (x). We claim that VQ [[]] is generated from 1 by field operators a (i) (x), b(i) (x) (1 ≤ i ≤ l). Let W be the C[[]]-submodule of VQ [[]] generated from 1 by field operators a (i) (x), b(i) (x) (1 ≤ i ≤ l). We have i (x)1 = 1 ⊗ 1,
i (x)a ( j) (x1 ) = pi j (x − x1 , )a ( j) (x1 ) i (x),
i (x)b( j) (x1 ) = pi j (x − x1 , )−1 b( j) (x1 ) i (x) for 1 ≤ i, j ≤ l. It follows from induction that i (x)W ⊂ W [[x, x −1 ]] for 1 ≤ i ≤ l. Similarly, we have i (x)−1 W ⊂ W [[x, x −1 ]]. As Y (u (i) , x) = a (i) (x) i (x)−1 , Y (v (i) , x) = b(i) (x) i (x), it follows that W is closed under vertex operators Y (u (i) , x) and Y (v (i) , x) for 1 ≤ i ≤ l. Consequently, W = VQ [[]], as claimed. By Proposition 3.13, there exists a K -module homomorphism π from K to VQ [[]], sending 1VQ [[]] to 1. We are going to prove that π is in fact an isomorphism. First, we see that π is surjective and it gives rise to a surjective linear map π0 : K /K → VQ (= VQ [[]]/VQ [[]]). By Lemma 5.4, which follows next, we have pi j (x1 − x2 , )YE (a (i) (x), x1 )YE (a ( j) (x), x2 ) = qi j p ji (x2 − x1 , )YE (a ( j) (x), x2 )YE (a (i) (x), x1 ), pi j (x1 − x2 , )YE (b(i) (x), x1 )YE (b( j) (x), x2 ) = qi j p ji (x2 − x1 , )YE (b( j) (x), x2 )YE (b(i) (x), x1 ), pi j (x1 − x2 , )YE (a (i) (x), x1 )YE (b( j) (x), x2 ) −q ji p ji (x2 − x1 , )YE (b( j) (x), x2 )YE (a (i) (x), x1 ) x1 = δi j x2−1 δ x2 for 1 ≤ i, j ≤ l. From this, it follows that K /K is a vacuum AQ -module with X i (z) and Yi (z) (1 ≤ i ≤ l) acting as YE (a (i) (x), z) and YE (b(i) (x), z), respectively. Furthermore, we see that π0 is a surjective AQ -module homomorphism from K /K to VQ . As every nonzero vacuum AQ -module is irreducible from [KL], π0 must be an isomorphism. With K separated and with VQ [[]] torsion-free, it follows from a result of Enriquez ([En], Lemma 3.14) that π is an isomorphism. Now, by Theorem 4.24, there exists an -adic weak quantum vertex algebra structure on VQ [[]] with 1 as the vacuum vector and with Y(u (i) , x) = a (i) (x),
Y(v (i) , x) = b(i) (x)
for 1 ≤ i ≤ l.
Then the last assertion follows immediately. As VQ is nondegenerate, VQ [[]] is an -adic quantum vertex algebra. Now, the proof is complete.
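The following is the result we need in the proof of Theorem 5.3:

Lemma 5.4. Let $W$ be a topologically free $\mathbb{C}[[\hbar]]$-module and let $V$ be an $\hbar$-adic nonlocal vertex subalgebra of $E(W)$. Assume that $a(x), b(x) \in V$, $p(x, \hbar), q(x, \hbar) \in \mathbb{C}[x, \hbar]$, $f(x, \hbar) \in \mathbb{C}((x))[[\hbar]]$ with $p(x, 0), q(x, 0) \ne 0$ such that

$$\frac{p(x_1 - x_2, \hbar)}{q(x_1 - x_2, \hbar)} a(x_1)b(x_2) - f(x_2 - x_1, \hbar)b(x_2)a(x_1) = \lambda x_2^{-1}\delta\!\left(\frac{x_1}{x_2}\right), \tag{5.9}$$

where $\lambda$ is a complex number. Then

$$\frac{p(x_1 - x_2, \hbar)}{q(x_1 - x_2, \hbar)} Y_E(a(x), x_1)Y_E(b(x), x_2) - f(x_2 - x_1, \hbar)Y_E(b(x), x_2)Y_E(a(x), x_1) = \lambda x_2^{-1}\delta\!\left(\frac{x_1}{x_2}\right). \tag{5.10}$$

Proof. From (5.9), we have

$$\begin{aligned}
(x_1 - x_2)p(x_1 - x_2, \hbar)a(x_1)b(x_2) &= (x_1 - x_2)q(x_1 - x_2, \hbar)f(x_2 - x_1, \hbar)b(x_2)a(x_1) \\
&= (x_1 - x_2)p(x_1 - x_2, \hbar) \times \frac{q(x_1 - x_2, \hbar)f(x_2 - x_1, \hbar)}{p(-x_2 + x_1, \hbar)} b(x_2)a(x_1).
\end{aligned}$$

In view of Lemma 4.25, we have

$$a(x_1)b(x_2) \sim \frac{q(x_1 - x_2, \hbar)f(x_2 - x_1, \hbar)}{p(-x_2 + x_1, \hbar)} b(x_2)a(x_1).$$

By Lemma 4.19, we have

$$Y_E(a(x), x_0)b(x) = \mathrm{Res}_{x_1} x_0^{-1}\delta\!\left(\frac{x_1 - x}{x_0}\right) a(x_1)b(x) - \mathrm{Res}_{x_1} x_0^{-1}\delta\!\left(\frac{x - x_1}{-x_0}\right) \frac{q(x_1 - x, \hbar)f(x - x_1, \hbar)}{p(-x + x_1, \hbar)} b(x)a(x_1).$$

Multiplying both sides by $\frac{p(x_0, \hbar)}{q(x_0, \hbar)}$ (viewed as an element of $\mathbb{C}((x_0))[[\hbar]]$) we obtain

$$\frac{p(x_0, \hbar)}{q(x_0, \hbar)} Y_E(a(x), x_0)b(x) = \mathrm{Res}_{x_1} x_0^{-1}\delta\!\left(\frac{x_1 - x}{x_0}\right) \frac{p(x_1 - x, \hbar)}{q(x_1 - x, \hbar)} a(x_1)b(x) - \mathrm{Res}_{x_1} x_0^{-1}\delta\!\left(\frac{x - x_1}{-x_0}\right) f(x - x_1, \hbar)b(x)a(x_1).$$

Furthermore, for $n \ge 0$, applying $\mathrm{Res}_{x_0} x_0^n$ to both sides and then using (5.9) we get

$$\mathrm{Res}_{x_0} x_0^n \frac{p(x_0, \hbar)}{q(x_0, \hbar)} Y_E(a(x), x_0)b(x) = \lambda\,\mathrm{Res}_{x_1} (x_1 - x)^n x^{-1}\delta\!\left(\frac{x_1}{x}\right) = \delta_{n,0}\lambda.$$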
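On the other hand, by Lemma 4.21 we have the following Jacobi identity in $V$:

$$x_0^{-1}\delta\!\left(\frac{x_1 - x_2}{x_0}\right) Y_E(a(x), x_1)Y_E(b(x), x_2) - x_0^{-1}\delta\!\left(\frac{x_2 - x_1}{-x_0}\right) \frac{q(x_1 - x_2, \hbar)f(x_2 - x_1, \hbar)}{p(-x_2 + x_1, \hbar)} Y_E(b(x), x_2)Y_E(a(x), x_1) = x_1^{-1}\delta\!\left(\frac{x_2 + x_0}{x_1}\right) Y_E(Y_E(a(x), x_0)b(x), x_2).$$

Multiplying both sides by $\frac{p(x_0, \hbar)}{q(x_0, \hbar)}$ and then taking $\mathrm{Res}_{x_0}$ we obtain

$$\begin{aligned}
&\frac{p(x_1 - x_2, \hbar)}{q(x_1 - x_2, \hbar)} Y_E(a(x), x_1)Y_E(b(x), x_2) - f(x_2 - x_1, \hbar)Y_E(b(x), x_2)Y_E(a(x), x_1) \\
&\quad = \mathrm{Res}_{x_0} x_1^{-1}\delta\!\left(\frac{x_2 + x_0}{x_1}\right) \frac{p(x_0, \hbar)}{q(x_0, \hbar)} Y_E(Y_E(a(x), x_0)b(x), x_2) \\
&\quad = \lambda x_1^{-1}\delta\!\left(\frac{x_2}{x_1}\right),
\end{aligned}$$

proving (5.10).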
Example 5.5. Consider the special case with $l = 1$, $q_{11} = 1$, and $p_{11}(x, \hbar) = \frac{x + \hbar}{x}$. In this case (with $Q = q_{11} = 1$), $A_Q$ is a Weyl algebra and $V_Q$ is a vertex algebra. By Theorem 5.3, there exists an $\hbar$-adic quantum vertex algebra $V_Q[[\hbar]]$ with generators $u, v$ such that

$$\frac{x_1 - x_2}{x_1 - x_2 + \hbar} Y(u, x_1)Y(u, x_2) = \frac{x_2 - x_1}{x_2 - x_1 + \hbar} Y(u, x_2)Y(u, x_1),$$

$$\frac{x_1 - x_2}{x_1 - x_2 + \hbar} Y(v, x_1)Y(v, x_2) = \frac{x_2 - x_1}{x_2 - x_1 + \hbar} Y(v, x_2)Y(v, x_1),$$

$$\frac{x_1 - x_2 + \hbar}{x_1 - x_2} Y(u, x_1)Y(v, x_2) - \frac{x_2 - x_1 + \hbar}{x_2 - x_1} Y(v, x_2)Y(u, x_1) = x_2^{-1}\delta\!\left(\frac{x_1}{x_2}\right).$$

This gives an $\hbar$-deformed $\beta\gamma$-system (cf. [EFK]).

Example 5.6. Consider the case with $l = 1$, $q_{11} = -1$, and $p_{11}(x, \hbar) = \frac{x + \hbar}{x}$. In this case, $A_Q$ is a Clifford algebra and $V_Q$ is a vertex superalgebra (cf. [FFR]). Theorem 5.3 asserts that there exists an $\hbar$-adic quantum vertex algebra $V_Q[[\hbar]]$ with generators $u, v$ such that

$$\frac{x_1 - x_2}{x_1 - x_2 + \hbar} Y(u, x_1)Y(u, x_2) = -\frac{x_2 - x_1}{x_2 - x_1 + \hbar} Y(u, x_2)Y(u, x_1),$$

$$\frac{x_1 - x_2}{x_1 - x_2 + \hbar} Y(v, x_1)Y(v, x_2) = -\frac{x_2 - x_1}{x_2 - x_1 + \hbar} Y(v, x_2)Y(v, x_1),$$

$$\frac{x_1 - x_2 + \hbar}{x_1 - x_2} Y(u, x_1)Y(v, x_2) + \frac{x_2 - x_1 + \hbar}{x_2 - x_1} Y(v, x_2)Y(u, x_1) = x_2^{-1}\delta\!\left(\frac{x_1}{x_2}\right).$$

Let $L = \mathbb{Z}\alpha$ be a rank-one lattice with $\langle \alpha, \alpha \rangle = 1$. Associated to $L$, one has a vertex superalgebra $V_L$ (cf. [DL]). It is known that vertex superalgebras $V_Q$ and $V_L$ are isomorphic with $u = e^{\alpha}$ and $v = e^{-\alpha}$. In view of this, $V_Q[[\hbar]]$ is an $\hbar$-deformation of $V_L$.
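The coefficient functions in Examples 5.5 and 5.6 can be written out explicitly as elements of $\mathbb{C}((x))[[\hbar]]$. The following expansion is a sketch added for illustration; it reads $p_{11}(x, \hbar)$ as $(x + \hbar)/x$, which is the form consistent with the fractions appearing in the displayed relations, and uses the geometric expansion from the proof of Lemma 4.25 with $f(x) = x$ and $g(x, \hbar) = -1$.

```latex
\begin{aligned}
p_{11}(x,\hbar) &= \frac{x+\hbar}{x} = 1 + \hbar x^{-1}, \qquad p_{11}(x,0) = 1,\\
p_{11}(x,\hbar)^{-1} &= \frac{x}{x+\hbar}
   = x\sum_{i\ge 0} x^{-1-i}(-1)^i\hbar^i
   = \sum_{i\ge 0} (-1)^i \hbar^i x^{-i}
   = 1 - \hbar x^{-1} + \hbar^2 x^{-2} - \cdots .
\end{aligned}
```

Both series reduce to $1$ at $\hbar = 0$, so the relations reduce to those of $V_Q$ in that limit.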
6. ℏ-adic Quantum Vertex Algebras Associated with Double Yangians

In this section, we shall associate the centrally extended double Yangian of $sl_2$ (see [Kh]) with $\hbar$-adic quantum vertex algebras. This can be viewed as an $\hbar$-adic version of the corresponding result of [Li6] for the centerless double Yangian of $sl_2$ with $\hbar$ evaluated as a nonzero complex number. The following is a variant of the centrally extended double Yangian $DY(sl_2)$ in [Kh]:

Definition 6.1. Define $DY(sl_2)$ to be the topological associative algebra over $\mathbb{C}[[\hbar]]$ with generators $e_m, f_m, h_m, c, d$ $(m \in \mathbb{Z})$, which are grouped together in terms of generating functions

$$e(x) = \sum_{m \in \mathbb{Z}} e_m x^{-m-1}, \quad f(x) = \sum_{m \in \mathbb{Z}} f_m x^{-m-1},$$
$$h^{+}(x) = 1 + \hbar\sum_{m \ge 0} h_m x^{-m-1}, \quad h^{-}(x) = 1 - \hbar\sum_{m < 0} h_m x^{-m-1},$$

contradicting the definition of a.

4.11. Proof of Theorem 4. For a heuristic discussion see § 5.9. The proof is by rigorous WKB. The fact that there are two competing potentially large variables, k and 1/r, makes it necessary to rigorously match two regimes. First, note that (37) implies $g_{n_0-k}(r) = i^k m_k(r)h_k(r)$.
(82)
We need a few more preliminary results. Lemma 31. For any 1 > 0, there exists C3 > 0 independent of k and 1 so that for k k0 = C3 1−1 , and for r ∈ [1 , 1], 1/2 k0 sup |h k | C4 k0 , (83) k 1 ≤r ≤1 where C4 is independent of 1 and k. The proof of Lemma 31 is given in §4.13. Definition 32. For fixed , we define L = αC3 (2C4 C3 /)2 , with C3 and C4 defined in Lemma 31, and ζ = αkr , where α is given in (32). We will take small enough so that L ≥ C3 α. Finally, in what follows, c∗ is a positive “generic” constant, the value of which is immaterial. Lemma 33. For > 0 small enough and kαr = ζ ∈ [L , kα], we have |h k (r ) − 1| . The proof of Lemma 33 is given in § 4.14.
(84)
Ionization of Coulomb Systems in R3
Definition 34. Let h˜ k (ζ ) = h k (ζ /(αk)). Lemma 35. For any small > 0, there exists a subsequence S = {h˜ k j } j∈N that con˜ ), we verges to a continuous function h˜ for ζ ∈ [0, L ]. For the limiting function h(ζ ˜ have |h(ζ ) − 1| ≤ 4 for ζ ∈ [0, L ]. The proof of this proposition is given in § 4.15. Proposition 36. For any r ∈ [0, 1], lim j→∞ h k, j (r ) = 1. Proof. From Lemma 35 and Lemma 33 it follows that for any r ∈ [0, 1] and any > 0 we have lim j→∞ |h k j (r ) − 1| ≤ 4. The proof of Theorem 4 now follows from the definition of h k in (36), Remark 24, Note 5 and Proposition 36. 4.12. Further results on gn 0 −k and h k . Lemma 37. For any j, k ∈ N ∪ {0} we have, at r = 1, i.e. at s = 0, ∂ j+τ gn 0 −k |s=0 = δ j,2k i k for 0 j 2k ∂s j+τ Proof. In case (i) (corresponding to τ = 0), note that (80) may be rewritten, cf. (31), as (gn 0 −k )ss −
Qk gn 0 −k = i gn 0 −k+1 − gn 0 −k−1 , (gn 0 −k )s + 3/2 2
(85)
where Qk =
b l(l + 1) + (k − n 0 )ω + i p1 − r r2
(86)
Since gn 0 −k (1) = 0 = gn 0 −k (1) for all k 1, while gn 0 (1) = 1, the statement follows from (85) for any 0 j 2, if 2k j. Assuming the statement holds for some j 2 for 2k j, we prove it for ( j + 1) for 2k ( j + 1). Taking ( j − 1) derivatives in s of (85) at s = 0, we obtain ∂ j+1 gn 0 −k ∂ j−1 ∂ j−1 = i g − i gn −(k+1) + L n −(k−1) 0 ∂s j+1 ∂s j−1 ∂s j−1 0 where L is a linear combination of derivatives of gn 0 −k up to order j, which are all zero since 2k ( j + 1) > j. The first two terms on the rhs give a contribution of ii k δ( j−1),2(k−1) + 0 since 2k ( j + 1) implies 2(k − 1) ( j − 1) and 2(k + 1) > ( j − 1) completing the inductive step. In case (ii) (corresponding to τ = 1): since gn 0 (1) = 0 and gn 0 −k (1) = 0 = gn 0 −k (1) for all k 1, it follows from (85) that gn0 −k = 0 for all k 1 implying the conclusion for j = 0 and j = 1. By taking an additional derivative of (85) with respect to s and evaluating at s = 0, we obtain gn (1) ∂ ∂ 3 gn 0 −k = iδ2,2k = iδ2,2k gn 0 |s=0 = iδ2,2k √0 3 ∂s ∂s − (1) so the statement holds for j = 2 and any k with 2k j. The rest of the proof is very similar to that for τ = 0.
O. Costin, J. L. Lebowitz, S. Tanveer
Let $\psi_{1,k}$, $\psi_{2,k}$ be two independent solutions of $L_k\psi = 0$; and

$$W_k = \psi_{1,k}(r)\psi_{2,k}'(r) - \psi_{2,k}(r)\psi_{1,k}'(r) \tag{87}$$

where

$$L_k\psi = \psi'' + Q_k\psi. \tag{88}$$
From the form of the equation we see that Wk is independent of r . Lemma 38. For n = n 0 − k, k 1, the system (80) is equivalent to 1 gn 0 −k (r ) = i (s) gn 0 −k+1 (s) − gn 0 −k−1 (s) G k (r, s)ds k 1
(89)
r
where G k (r, s) = Wk−1 [ψ1,k (r )ψ2,k (s) − ψ2,k (r )ψ1,k (s)]
(90)
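As a quick consistency check on the kernel $G_k$ in (90), the following worked lines are a sketch added for illustration. They assume that (87) is the usual Wronskian $\psi_{1,k}\psi_{2,k}' - \psi_{2,k}\psi_{1,k}'$ and that (88) reads $L_k\psi = \psi'' + Q_k\psi$ (the primes appear to have been lost in extraction), with derivatives taken in the second argument $s$. Under these assumptions $G_k$ vanishes on the diagonal, has unit slope there, and solves the homogeneous equation in $s$, which are the properties used when (89) is obtained by variation of parameters.

```latex
% Assumes W_k = \psi_{1,k}\psi_{2,k}' - \psi_{2,k}\psi_{1,k}' and
% \psi_{j,k}'' = -Q_k \psi_{j,k} for j = 1, 2.
\begin{aligned}
G_k(r,r) &= W_k^{-1}\bigl[\psi_{1,k}(r)\psi_{2,k}(r) - \psi_{2,k}(r)\psi_{1,k}(r)\bigr] = 0,\\
\partial_s G_k(r,s)\big|_{s=r} &= W_k^{-1}\bigl[\psi_{1,k}(r)\psi_{2,k}'(r) - \psi_{2,k}(r)\psi_{1,k}'(r)\bigr] = 1,\\
\partial_s^2 G_k(r,s) &= W_k^{-1}\bigl[\psi_{1,k}(r)\psi_{2,k}''(s) - \psi_{2,k}(r)\psi_{1,k}''(s)\bigr] = -Q_k(s)\,G_k(r,s).
\end{aligned}
```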
Proof. The proof simply follows from variation of parameters, the two boundary conditions at r = 1 and gn 0 −k (1) = gn 0 −k (1) = 0. Definition 39. Define jk =
s
Lk m k − m k−1 mk
(91)
Lemma 40. For k ≥ 1, there exist constants C1 , C2 and c∗ , independent k so that for of any r ∈ (0, 1] we have | jk | ≤ c∗ . For r ≥ k1 , we have | jk (r )| ≤ C1 / kr 2 + C2 Proof. In the Appendix, (253), we obtain an explicit expression for jk . Routine asymptotics for large k in different regimes of r ∈ (0, 1], discussed in the Appendix §5.8, show that k 2 jk(2) + k jk(1) = O(1) in all cases and hence jk = O(1). In fact, as r → 0 and k → ∞ with ζ = kαr = O(1) fixed, we have jk → g(ζ ), where g(ζ ) is bounded. Also taking the r - derivative of jk for r = O(1) not small, we get jk (r ) = O(1). When r 1, the asymptotics in the regime k1 r 1 gives jk = O ζ −1 = O (1/(kr )). Since the asymptotics is differentiable, we have jk (r ) = O 1/(kr 2 ) . Finally, we look d d jk = k dζ jk ∼ kg (ζ ), where ζ 2 g (ζ ) is bounded for all at ζ = O(1), ζ ≥ 1. Since dr ζ , it follows that | jk (r )| ≤ C1 / kr 2 + C2 for r ≥ 1/k. Lemma 41. For k 1, h k (r ) defined in (82) satisfies the system of differential equations: m k−1 jk m k−1 m k+1 mk hk = + + h k−1 (r ) + h k+1 (r ) , (92) h k + 2h k mk mk s mk mk and the system of integral equations (89) is equivalent, for k 1, to 1 (s)m k−1 (s) h k (r ) = G k (r, s)h k−1 (s)ds m k (r ) r 1 (s)m k+1 (s) G k (r, s)h k+1 (s)ds := Ak h k−1 + Hk h k+1 . + m k (r ) r
(93)
Ionization of Coulomb Systems in R3
Proof. This simply follows by substituting gn 0 −k (r ) = i k m k (r )h k (r ) into (80) and (89), and using m k Lk m k m k−1 jk + Qk = = + , mk mk mk s in turn a consequence of Lemma 40.
Remark 42. Let now r ∈ [ˆ , 1], where ˆ C2 k −1 for sufficiently large C2 independent of k. It is convenient to rewrite Ak and Hk in (93) in terms of s (see (3.11.1)). Furthermore, changing the variable of integration from s to t = s(s)/s(r ), we obtain 1 [Ak h k−1 ](s) = (2k + τ )(2k + τ − 1) t 2k−2+τ Tk (s, t)h k−1 (st)dt, (94) 0
where, using (36), we get Tk (s, t) = and
√ (r (st))Fk−1 (r (st) G k (r (s), r (st)) sFk (r (s))
1 s3 (r (st)t 2k+2+τ [Hk h k+1 ](s) = (2k + 2 + τ )(2k + 1 + τ ) 0 Fk+1 (r (st) × G k (r (s), r (st))h k+1 (st)dt. Fk (r (s))
(95)
(96)
In evaluating Ak for large k, it is useful to calculate the Taylor expansion of Tk (s, t) and its s derivative at t = 1. To do so, we first note that (r ) Fk−1 ∂ Tk (r )Fk−1 (r ) Fk−1 (r ) ∂G k = − − G (r, r ) − (r, r ), (97) k ∂t 2(r )Fk (r ) Fk (r ) Fk (r ) ∂r where, simplify notation, we wrote r (s) = r and r (ts) = r and used ∂r s(r ) = √ to − (r ). From (87) and (90) we get G k (r, r ) = 0 and ∂r G k (r, r ) = 1 at r = r ; (97) implies ∂ Tk Fk−1 (r ) . (98) =− ∂t t=1 Fk (r ) Using (97), taking an additional derivative with respect to t, using also (86) and (88) to see that ∂r r G k = −Q k G k , we obtain (r ) 2ξ Fk−1 ∂ 2 Tk ξ (r ) Fk−1 (r ) + . (99) = √ ∂t 2 t=1 Fk (r ) 23/2 (r ) (r )Fk−1 (r ) A similar calculation can be carried out for the third derivative. We only write down the potentially largest term in the regime kr ≥ C2 (for large k and small r ) ξ 2 Fk−1 (r ) 1 l(l + 1 ξ 2 Fk−1 (r )Q k (r ) ∂ 3 Tk = + O 1, kω − = ∂t 3 t=1 (r )Fk (r ) kr 3 (r )Fk (r ) r2 1 (100) +O 1, 3 . kr
Note that if kr is sufficiently large, (35) gives Fk−1 (r ) H (α(k − 1)r )H (αk) = = 1 + O(k −2 r −1 ) Fk (r ) H (αkr )H (α(k − 1)
(101)
and (r ) Fk−1
l(l + 1) +O 2αkr 2
1 k 2r 3
. (102) Fk−1 (r ) √ Note also that (32) implies α − 2 (r )/ (s(r )) = O(r ) for small r . Including all terms that become important when r is small, we note that in the regime when kr is sufficiently large, we have 2 k f2 (1 − t)2 (1 − t)3 − Tk = (1 − t) + − f 1 + 2 4 r 3 k 4 3 3 2 (1 − t) (1 − t) (1 − t) (1 − t) (1 − t)2 (1 − t) +O , , , (103) , , r3 kr 3 r kr k 2r 3 k 2r 2 ∂ Tk k f3 (1 − t)2 3 = − f1 + 3 (1 − t) − ∂s 4 r 3 k 4 3 3 (1 − t) (1 − t) (1 − t) (1 − t)2 (1 − t)2 (1 − t) , (104) +O , , , , , 2 2 r4 kr 4 r2 kr 2 k 2r 4 k r ∼−
where ωs2 , (r (s)) l(l + 1)s2 , f 2 (s) = 4 l(l + 1)s2 f 3 (s) = . 23/2 f 1 (s) =
(105) (106) (107)
When r ∈ [0, ], ˆ for ˆ = C2 /k, it is sometimes more convenient to express Ak in terms of ζ = kαr . For that purpose, we define ! 1 r (0) 4 1 s(0)−s ωs(r ) − log exp dr √ Q(ζ ) = −2k log 1 − , (108) s(0) (r ) 4 0 (r ) where we recall the relation (31) between s and r = ζ /(kα), ζ ∈ [0, kα]. A series expansion in k −1 leads to 3 2 ω ζ2 (0) ζ ζ (0) ζ . (109) + 1 + + O − , Q(ζ ) = ζ − k 2α 2 4(0)α 4k α(0) k2 k2 We choose ˆ1 = C˜ 2 k −1 log k, for some k-independent C˜ 2 (chosen more precisely later). We define δˆ1 , dependent of r , so that (1 − δˆ1 )s(r ) = s(1 ).
(110)
From (31), it follows that for sufficiently large C˜ 2 we have δˆ1 1 −
(5 + l) log k s(1 ) . s() (4k + 2τ )k
(111)
It follows from the definition of Ak in (93) that for r ∈ [0, ], ˆ i.e. ζ ∈ [0, kα ˆ ], kα ˆ1 a H (η(1−k −1 ))
1 Ak h k−1 (ζ ) = e−Q(η)+Q(ζ ) 1+ G(ζ, η)h k−1 (η(1−k −1 ))dη k H (ζ ) ζ +(2k + τ )(2k + τ − 1) 1 2k−2+τ Fk−1 (r ) s(r ) h k−1 (r )dr × (r )G k r, r 2 s(r ) s Fk (r ) ˆ1 =: [A0k h k−1 ](ζ ) + [A1k h k−1 ](r ),
(112)
where G(ζ, η) is defined by G(ζ, η) = kαG k (r (ζ ), r (η)), while
(113)
a1 (η, ζ ) H (αk) τ τ − 1 s2 (0)(η/(kα)) = 1+ 1+ k H (α(k − 1)) 2k 2k s2 (η/(kα))(0) τ s(η/(kα)) − 1, × s(ζ /(kα))
while for large k and 0 < ζ ≤ η ≤ ˆ1 α we have 2 (0) τ η η 1 η + (ζ − η) + O , . a1 (η, ζ ) = τ − + 1 + 2 α(0) 2 k k
(114)
(115)
Similarly, for kαr = ζ ∈ (0, k ˆ1 α), we define b1 (η, ζ ) H (αk) s2 (η/(kα))(η/(kα)) = k H (α(k + 1)) s2 (0)(0)
s(η/(kα)) s(ζ /(kα))
τ
− 1.
(116)
We then have [Hk h k+1 ] (ζ ) =
(0)s2 (0) α 2 k 2 (2k + 2 + τ )(2k + 1 + τ ) kα ˆ1 b1 H (η(1+k −1 )) −Q(η)+Q(ζ ) 1+ G(ζ, η)h k+1 (η(1+k −1 ))dη × e k H (ζ ) ζ s2 (2k + 2)(2k + 1 + 2τ ) 1−δˆ1 Fk+1 (r (st)) h k+1 (st)dt × (r (st))G k (r (s), r (st)) t 2k+2+τ Fk (r (s)) 0 +
=: [Hk0 h k+1 ] + [Hk1 h k+1 ].
(117)
Lemma 43. For k 2 and k1 ∈ {k − 1, k, k + 1} we have
(1) If r ∈ (0, 1) and s ∈ (r, r + δ), where δ ≤ min C2 k −1 log k, 1 − r , then G k (r, s) Fk1 (s) c∗ , ∂ G k (r, s) Fk1 (s) < c∗ k 1/2 . Fk (r ) k 1/2 ∂r Fk (r ) (2) If r ∈ (0, 1), δ ≤ C2 k −1 log k with r + δ < 1, then for s ∈ (r + δ, 1), G k (r, s) Fk1 (s) < c∗ k l/2−1/2 , ∂ G k (r, s) Fk1 (s) < c∗ k l/2+1/2 . Fk (r ) ∂r Fk (r ) Proof. It suffices to find bounds for G k (r, s)H (αk1 s)/H (αkr ) since the other functions involved are regular everywhere for r, s ∈ [0, 1], see (36). We first consider k → +∞. It is easily verified that G(ζ, η), defined in (113), is the Green’s function (see (86), (88) ) for l(l + 1) ω b L := → − + + (118) + 2 2 [i p1 − n 0 ω] 2 2 ζ k α αζ k α and is given by G(ζ, η) := kαG k (r (ζ ), r (η)) =
1 (ζ )2 (η) − 2 (ζ )1 (η) , W
(119)
where 1 , 2 are two independent solutions of L = 0 and W = 1 (ζ )2 (ζ ) − 2 (ζ )1 (ζ ) is their constant Wronskian. Standard asymptotic results show there exist √ two independent solutions 1 , 2 such that for large k, we have uniformly in z ∈ [0, ωk], √ ω 2l l! π z Yl+1/2 (z) ; where z = ζ = ωkr, 1 ∼ − (120) 2 (2l)! 2 α k 2−l−1 (2l + 2)! π z Jl+1/2 (z). (121) 2 ∼ (l + 1)! 2 √ √ The Wronskian W is asymptotic, for large k, to (2l + 1) ω/ α 2 k. The expressions (120) and (121) may also be used to determine the asymptotics of 1 and 2 . Using (119), (120), (121) and (36) and the bounds on W , with l1 = l + 21 it follows that ' k1 H α z ω Fk1 (s) c∗ |zz |1/2 G (r, s) (z)J (z ) − J (z)Y (z )] [Y l l l l ' 1 1 1 1 , (122) F (r ) k k 1/2 k k H α ωz √ √ √ where z = η ω/ α 2 k = ωks. A similar bound holds for ∂ Fk1 (s) ∂r F (r ) G k (r, s) . k We now prove part (1). We break this case up into two subcases: (a) r ∈ [k −2/3 , 1] and (b) r ∈ [0, k −2/3 ]. In case (a), we note that s ∈ [r, r + δ] implies s/r and therefore
1 ≤ z /z = O(1). The√ function H in (122) √ is close to 1 because its argument is large. Furthermore, note that zYl+1/2 (z) and z Jl+1/2 (z) are bounded for large z, while they are asymptotic to constant multiples of z −l and z l+1 for small z. Using (122), part 1 of the lemma follows by inspection in case (a). For case (b), (122) further simplifies since z,z are small and H (k1 η/k) H (k1 η/k) G k (r (ζ ), r (η)) = G(ζ, η) H (ζ ) kα H (ζ ) H (k1 η/k) ηl+1 ζ −l − ζ l+1 η−l . ∼ kα H (ζ )(2l + 1)
(123)
When ζ ∈ [log k, αk 1/3 ] and η ∈ [ζ, ζ + αkδ], we have 1 ≤ [η/ζ ]l ≤ c∗ and therefore H (k1 η/k) H (k1 η/k) c∗ H (ζ ) G k (r (ζ ), r (η)) = kα H (ζ ) G(ζ, η) k 1/2 . The same inequality holds if ζ ∈ [0, log k], since η ∈ [ζ, (C2 + 1) log k] since in this regime ζ −l /H (ζ ) is bounded and the logarithmic growth in k of terms involving η can be bounded by, say, k 1/2 , while for small η, ηl H (k1 η/k) is bounded. The bounds on d d derivatives follow in a similar manner using dr = kα dζ . Part 2 (which is only relevant for r + δ 1) follows similarly on careful inspection of (122), from the asymptotic behavior in different regimes of z and z . Lemma 44. Let r ∈ (0, ], ˆ with ˆ1 = Ck2 log k. We choose C2 large enough so that (5+l) log k s(1 ) 1 l/2−1/2 (1 − δ )2k−2+τ f 1 ∞ s(r ) = (1 − δ1 ) ≤ 4k+2τ . Then |[Ak f ](r )| c∗ k d 1 −3 l/2+1/2 2k−2+τ −2 c∗ k f ∞ and | dr [Ak f ](r )| c∗ k (1 − δ1 ) f ∞ c∗ k f ∞ . Proof. Consider A1k given by (112). We note that s−2 (s) and its r −derivative are bounded, while G k (s, r )Fk (s)/Fk (r ) and its r -derivative are bounded by c∗ k l/2−1/2 and c∗ k l/2+1/2 respectively for any τ (cf. Lemma 43). Further |s(s)/s(r )| (1 − δ1 ) and from (111), we have (1 − δ1 )2k−2+τ and the lemma follows.
c∗ , l/2+5/2 k
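The decay rate used at the end of this proof can be illustrated numerically. The following is only a sketch under an assumption: we read (111) as giving $\delta_1 = (5+l)\log k/(4k+2\tau)$, and the function name and sample values are ours, not part of the argument. Under that reading, the product $(1-\delta_1)^{2k-2+\tau}\,k^{l/2+5/2}$ should remain bounded in $k$, which is what a bound of the form $c_*/k^{l/2+5/2}$ requires.

```python
import math


def decay_ratio(k, l=0, tau=0):
    """Return (1 - delta)^(2k - 2 + tau) * k^(l/2 + 5/2).

    Assumption (our reading of (111)): delta = (5 + l) * log(k) / (4k + 2*tau).
    """
    delta = (5 + l) * math.log(k) / (4 * k + 2 * tau)
    if not 0.0 < delta < 1.0:
        raise ValueError("k is too small for this choice of delta")
    # Work with logarithms to avoid underflow for large k.
    log_ratio = (2 * k - 2 + tau) * math.log1p(-delta) + (l / 2 + 2.5) * math.log(k)
    return math.exp(log_ratio)


if __name__ == "__main__":
    for k in (10, 100, 1000, 10**5):
        print(k, decay_ratio(k, l=1, tau=1))
```

The ratio approaches 1 as $k$ grows because $(2k-2+\tau)\,\delta_1 \approx \tfrac{1}{2}(5+l)\log k$; a larger constant in front of $\log k$ would give a faster power decay, as in the proof of Lemma 48.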
Remark 45. Since for r ∈ (0, ], ˆ the bound in Lemma 44 on A1k is O(k −2 ), we will see later that Ak is dominated by A0k (defined in (112)) as k → ∞. Lemma 46. Define G0 (ζ, η) = limk→∞ G(ζ, η) and H0 (ζ ) = limk→∞ H (ζ ), where ζ, η k 1/2 as k → ∞. Then, ∞ H0 (η) dη = 1, (124) e−η+ζ G0 (ζ, η) H 0 (ζ ) ζ ∞ H (ζ ) H0 (η) e−η+ζ G0ζ (ζ, η) dη = −1 + 0 . (125) H0 (ζ ) H0 (ζ ) ζ
Proof. Using (120) and (121) and the behavior of Bessel functions for small argument, [1], it follows that for ζ, η k 1/2 we have G0 (ζ, η) = lim G(ζ, η) = k→∞
2 1/2 ζ e K l+1/2 (ζ ). Now, using the πζ verified that f (ζ ) = e−ζ H0 (ζ ) satisfies
and H0 (ζ ) = limk→∞ H (ζ ) = function equation, it is easily
'
ηl+1 ζ −l − ζ l+1 η−l 2l + 1
f −
(126) modified Bessel
l(l + 1) f = f ζ2
with f (ζ ) ∼ e−ζ as ζ → ∞. Using variation of parameters to invert the left hand side of the above equation, and using the boundary conditions at ∞ we obtain ∞ G0 (ζ, η) f (η)dη. f (ζ ) = ζ
Dividing through by f (ζ ), the first identity in the lemma follows. By differentiating the first identity with respect to ζ , and using the first identity in the resulting expression, we obtain the second identity. Lemma 47. For any r ∈ (0, 1), Ak [1](r ) − 1 =
r
For
1 k
1
c∗ m k−1 (s) G k (r, s)ds − 1 2 . (s) m k (r ) k
≤ r ≤ 21 we get 1 d c∗ m k−1 (s) c∗ Ak [1](r ) = d G k (r, s)ds 2 + 3 2 , (s) dr dr m k (r ) k k r r
while for any r ∈ [0, 21 ], 1 ∂ m k−1 (s) ds c∗ k. G k (r, s) (s) ∂r m k (r ) r
(127)
(128)
(129)
Proof. Recalling the definition (93), it follows from (39) and Lemma 40 that Lk m k − m k−1 =
jk (r ) mk , s
(130)
where jk (r ) = O(1) as k → +∞ for any r ∈ [0, 1]. We can check from (36) that m k (1) = 0, m k (1) = 0 for k 1. From (130), inversion of Lk yields 1 jk (s) m k (s) ds. G k (r, s) (s)m k−1 (s) + (131) m k (r ) = s(s) r Therefore, r
1
G k (r, s)
(s)m k−1 (s) ds = 1 − m k (r )
r
1
G k (r, s)
jk (s)m k (s) ds. s(s)m k (r )
(132)
First, we choose δˆ1 so that 1 − δˆ1 = (5 + l) log k/ (4k + 2τ ). We then define ˆ It is clear that for large k we have δˆ ∼ δˆ so that (1 − δˆ1 )s(r) = s(r + δ). (5/2 + l/2) s(r ) log k/ (2k + τ ) (r ) . Lemma 43, and the fact that k l/2−1/2 (1 − δˆ1 )2k+1+τ /(2k + 1 + τ ) 13 give k
ˆ jk (s)m k (s) 1−δ1 2k+τ G k (r, s) t ds s(s)m k (r ) 0 r +δˆ
1
Fk (r (st) G k (r (s), r (st)) jk (r (st))dt (r (st))Fk (r (s) c∗ c∗ 3 jk ∞ 3 . (133) k k × √
r +δˆ Now, consider the contribution from r . There are again two cases: (i) 1 r k −2/3 and (ii) 0 < r k −2/3 . In the√first case, Taylor expanding G k (r, s) near s = r we get G k = (s−r )+O((s−r )3 Q k ) = (r )s(1 − t) + O(k 4/3 (1 − t)3 , (1 − t)2 ). Hence, 1 r +δˆ jk (s)m k (s) c∗ ds c∗ jk ∞ G k (r, s) t 2k+τ −1 (1 − t)dt 2 . (134) r s(s)m k (r ) k 1−δˆ1 For the case (ii), we rewrite the integral in terms of ζ = kαr , to obtain ζ +kα δˆ r +δˆ jk (s)m k (s) c∗ H (η) ds 2 jk ∞ dη G k (r, s) e−Q(η)+Q(ζ ) G(ζ, η) r s(s)m k (r ) k H (ζ ) ζ ˆ c∗ ζ +kα δ H0 (η) 2 dηe−η+ζ G0 (ζ, η) k ζ H0 (ζ ) c∗ (η) H c∗ ∞ 0 2 dηe−η+ζ G0 (ζ, η) 2 k ζ H0 (ζ ) k
(135)
by Lemma 46. Using (132) and (135), the first part follows. To prove (128), we note that if C3 is large and r k > C3 , Taylor expansion gives Fk (r (st) U1 (s, t) := √ G k (r (s), r (st)) (r (st))Fk (r (s) 3 (1 − t)2 2 3 (1 − t) (136) = f 4 (s)(1 − t) + O (1 − t) , k(1 − t) , , r2 kr 2 for f 4 = −s/ (r (s)), while ∂ (1 − t)3 (1 − t)2 . U1 (s, t) = f 4 (s)(1 − t) + O (1 − t)2 , k(1 − t)3 , , ∂s r3 kr 3
From (132) we note that 1 j (r (st)) d U1 (s, t)dt t 2k+τ +1 √k Ak [1](r ) = − (r (s)) dr (r (st)) 1−δˆ1 1 t 2k+τ jk (r (st))U1,s(s, t)dt − 1−δˆ1
−
d dr
1
r +δˆ
ξ(s) ξ(r )
2k+τ
jk (s)Fk (s) G k (r, s)ds. ξ(s)Fk (r )
(137)
We note further that
ξ(s) 2k+τ jk (s)Fk (s) G k (r, s)ds ξ(s)Fk (r ) r +δˆ ξ(r ) 1 ξ(s) 2k+τ jk (s) ∂ Fk (s) G k (r, s) ds =− ξ(s) ∂r Fk (r ) r +δˆ ξ(r ) ξ (r ) 1 ξ(s) 2k+τ jk (s)Fk (s) G k (r, s)ds +(2k + τ ) ξ(r ) r +δˆ ξ(r ) ξ(s)Fk (r ) " #2k+τ ˆ ˆ k (r + δ) ˆ ξ(r + δ) jk (r + δ)F ˆ 1 + δˆ (r ) . + G k (r, r + δ) ˆ k (r ) ξ(r ) ξ(r + δ)F
−
d dr
1
(138)
From the bounds in Lemmas 40 and 43 and the fact that ξ(s)/ξ(r ) ≤ (1 − δˆ1 ), we easily 1 d Ak [1](r ) is O(1/k 2 ). conclude that the contribution of r +δˆ in (138) to dr Since Lemma 40 implies | jk (r )| < c∗ and | jk (r )| < c∗ + c∗ /(kr 2 ) for 21 ≥ r ≥ k1 , it follows from the local expansion of U1 (s, t) and its s-derivative in a neighborhood of t = 1 in the first integral in (137) that d r +δˆ c∗ jk (s)m k (s) c∗ ds G k (r, s) + dr r s(s)m k (r ) k 3r 2 k 2 and (128) follows. ˆ from (90), We now prove (129). We first note that for r ≥ k −2/3 , s ∈ (r, r + δ), ∂r G(r, s) = −1 at s = r and therefore, from (120), (121), it follows that for s − r = ˆ The same is true O(k −1 log k) k −1/2 , ∂r G(r, s) ∼ −1 < 0 for s ∈ (r, r + δ). −2/3 for r ∈ [0, k ] since in this regime, ∂r G k (r, s) ∼ ∂ζ G0 (ζ, η) (see (126)), with ζ = r/(αk), η = s/(αk). Therefore, from (31) and (36), we get −
∂ ∂r
m k−1 (r )G k (r, s) m k (r )
√ F (r ) m k−1 (r )G k (r, s) (r ) = − (2k + τ − 2) − k ξ(r ) Fk (r ) m k (r ) m k−1 (s) . (139) −∂r G k (r, s) m k (r )
1 Since the contributions to the integrals from r +δˆ is O( k12 ), and the first term on the 1 right on (139) is negative for large k, while the second is positive, it follows that 1 d ∂ m k−1 (r ) ds ≤ Ak [1](r ) (s) ∂r m k (r )G k (r, s) dr r √ Fk (r ) (r ) 1 +2 (2k + τ − 2) ≤ c∗ k − |Ak [1](r )| + O (140) ξ(r ) Fk (r ) k2 1
for r ∈ C2 k −1 , 1 . For r ∈ 0, C2 k −1 , we note that since the contribution from r +δˆ d for dr Ak [1](r ) is negligible, we have d d Ak [1](r ) ∼ kα dr dζ
ζ +kα δˆ1
ζ
a1 H (η(1 − 1/k)) e−Q(η)+Q(ζ ) 1 + G(ζ, η) k H (ζ )
ˆ
ζ +kα δ1 H0 (η) d G0 (ζ, η) e−η+ζ dζ ζ H0 (ζ ) ∞ H0 (η) d G0 (ζ, η); ∼ kα e−η+ζ dζ ζ H0 (ζ )
∼ kα
(141)
d Ak [1](r )| ≤ c∗ k. Hence it follows immediately from Lemma 46 that in this case , | dr the inequality in (140) is valid for all r ∈ [0, 1/2].
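The two identities of Lemma 46, which close the estimate above, can be checked numerically. The sketch below assumes our reading of (126), namely $G_0(\zeta,\eta) = (\eta^{l+1}\zeta^{-l} - \zeta^{l+1}\eta^{-l})/(2l+1)$ and $H_0(\zeta) = (2\zeta/\pi)^{1/2} e^{\zeta} K_{l+1/2}(\zeta)$, and uses the standard finite expansion of $K_{l+1/2}$ for integer $l$. The helper names, the cutoff and the Simpson quadrature are illustrative choices only.

```python
import math


def H0(zeta, l):
    """sqrt(2*zeta/pi) * exp(zeta) * K_{l+1/2}(zeta) via the finite half-integer formula."""
    return sum(
        math.factorial(l + j) / (math.factorial(j) * math.factorial(l - j))
        * (2.0 * zeta) ** (-j)
        for j in range(l + 1)
    )


def dH0(zeta, l):
    """Derivative of H0 with respect to zeta (term-by-term)."""
    return sum(
        -j * math.factorial(l + j) / (math.factorial(j) * math.factorial(l - j))
        * 2.0 ** (-j) * zeta ** (-j - 1)
        for j in range(l + 1)
    )


def G0(zeta, eta, l):
    return (eta ** (l + 1) * zeta ** (-l) - zeta ** (l + 1) * eta ** (-l)) / (2 * l + 1)


def dG0_dzeta(zeta, eta, l):
    return (-l * eta ** (l + 1) * zeta ** (-l - 1)
            - (l + 1) * zeta ** l * eta ** (-l)) / (2 * l + 1)


def simpson(f, a, b, n=8000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def lhs_124(zeta, l, cutoff=60.0):
    """Left side of (124), with eta = zeta + u and the integral truncated at u = cutoff."""
    return simpson(
        lambda u: math.exp(-u) * G0(zeta, zeta + u, l) * H0(zeta + u, l) / H0(zeta, l),
        0.0, cutoff)


def lhs_125(zeta, l, cutoff=60.0):
    """Left side of (125), same substitution and truncation."""
    return simpson(
        lambda u: math.exp(-u) * dG0_dzeta(zeta, zeta + u, l) * H0(zeta + u, l) / H0(zeta, l),
        0.0, cutoff)


def rhs_125(zeta, l):
    return -1.0 + dH0(zeta, l) / H0(zeta, l)
```

For $l = 0$ both identities can be done by hand, since then $H_0 \equiv 1$ and $G_0 = \eta - \zeta$; the code is mainly useful for $l \ge 1$.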
Lemma 48. For any f ∈ L ∞ [0, 1],
c∗ For r ∈ [0, 1], Ak f ∞ 1 + 2 f ∞ , k ( ( (d ( 1 ( , ( (b) For r ∈ 0, ( dr [Ak f ](r )( c∗ k f ∞ . 2 ∞ (a)
(142) (143)
Proof. Consider the expression for Ak f from (93). We break up the integral into 1 and r +δ , where δ = C2 k −1 log k, with C2 large enough so that (1 − δ1 )2k−2+τ
1 k l/2+7/2
; (1 − δ1 ) :=
r +δ r
s(r + δ) . s(r )
From (36) and Lemma 43, part (2), transforming the integration variable to t, it follows that 1 c∗ m k (s) f ∞ . G (s) (r, s) f (s)ds (144) k k2 m k (r ) r +δ r +δ In r (we replace the upper limit r + δ by 1 if r + δ > 1). Since δ1 = O k −1 log k and t ∈ (1 − δ1 , 1) then Tk (s, t) ≥ 0 and G k (r, s) 0 for r ∈ [k −2/3 , 1]. Therefore, r +δ (s)m k−1 (s) c∗ (145) G k (r, s) + 2 . Ak f ∞ f ∞ m k (r ) k r From (144) we get
c∗ m k−1 (s) G k (r, s)ds 2 . (s) m k (r ) k r +δ 1
Hence r +δ r
(s)
m k−1 (s) G k (r, s)ds = m k (r )
1
(s)
r
m k−1 (s) G k (r, s)ds + O m k (r )
1 k2
.
Using Lemma 47, (142) (a) follows. For (b) we write 1 m k−1 (s) d (s) G k (r, s) f (s)ds dr r m k (r ) 1 ∂ m k−1 (s) G k (r, s) f (s)ds. = (s) ∂r m k (r ) r By Lemma 47, the quantity above is bounded by c∗ k f ∞ .
(146)
(147)
Lemma 49. For any f ∈ L∞ [0, 1], c∗ f ∞ , k2 c∗ 2 f ∞ . k
Hk f ∞
(d ( ( [Hk f ](r )∞ dr
Proof. As before, we choose δ = C2 k −1 log k large C2 independent of k. Using Lemma 43, it follows that 1 (s)m k+1 (s) G k (r, s) f (s)ds c∗ (1 − δ1 )2k+2 k l/2−5/2 f ∞ m k(r ) r +δ c∗ 4 f ∞ , (148) k 1 ∂ (s)m k+1 (s) G (r, s) f (s)ds k ∂r m r +δ
c∗ (1 − δ1 )
k(r )
2k+2 l/2−3/2
k
f ∞
c∗ f ∞ . k3
Now, Lemma 43 implies r +δ 1 (s)m k+1 (s) c∗ f ∞ G (r, s) f (s)ds t 2k+2+τ dt k m k(r ) k2 r 0 c∗ f ∞ , k3 r +δ c∗ f ∞ 1 2k+2+τ ∂ (s)m k+1 (s) G k (r, s) f (s)ds t dt ∂r m k(r ) k r 0 c∗ f ∞ . k2
(149)
(150)
(151)
Lemma 50. There exist $k_0$ and $c_*$, independent of $k$, so that for $k > k_0$, over the $r$-interval $(0, 1)$,
\[ \|h_k\|_\infty < c_*. \tag{152} \]
Proof. First we note that for $k_0$ sufficiently large, $\|h_{k_0}\|_\infty$ exists since $g_{k_0}$ is continuous for $r \in [0, 1]$ and the expression for $m_k$ in (36) shows that $1/m_{k_0}$ is bounded as well for sufficiently large $k_0$, since $K_{l+1/2}$ has no zeros in the region of interest. Define $r_k = H_k h_{k+1}$. Note that
\[ h_k = A_k\left(A_{k-1} h_{k-2} + r_{k-1}\right) + r_k. \tag{153} \]
In $k - k_0$ inductive steps we get
\[ h_k = A_k A_{k-1} \cdots A_{k_0+1} h_{k_0} + H_k h_{k+1} + \sum_{m=1}^{k-k_0-1} \Bigl( \prod_{j=1}^{m} A_{k-j+1} \Bigr) H_{k-m} h_{k-m+1}. \tag{154} \]
We write this abstractly as
\[ \mathbf{h} = \mathbf{h}^0 + N\mathbf{h}, \tag{155} \]
where
\[ h^0_k = A_k A_{k-1} \cdots A_{k_0+1} h_{k_0}; \qquad [N\mathbf{h}]_k = H_k h_{k+1} + \sum_{m=1}^{k-k_0-1} \Bigl( \prod_{j=1}^{m} A_{k-j+1} \Bigr) H_{k-m} h_{k-m+1}, \tag{156} \]
and $N$ is defined on the space $S$ of sequences $\mathbf{h} = \{h_k\}_{k=k_0+1}^{\infty}$ in the norm
\[ \|\mathbf{h}\| = \sup_{k \ge k_0+1} \|h_k\|_\infty. \tag{157} \]
Lemmas 48 and 49 imply
\[ |[N\mathbf{h}]_k| \le \|\mathbf{h}\| \Biggl( \frac{c_*}{k_0^2} + c_* \sum_{m=1}^{k-k_0-1} \Bigl\{ \prod_{j=1}^{m} \Bigl[ 1 + \frac{c_*}{(k-j+1)^2} \Bigr] \Bigr\} \frac{1}{(k-m)^2} \Biggr) < \nu \|\mathbf{h}\|, \tag{158} \]
where, if $k_0$ is large, $\nu < 1$ is independent of $k$. Thus, $N$ is contractive and there is a unique solution of (155) in $S$.
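The size of the contraction constant can be illustrated numerically. The sketch below evaluates the bracket in (158) as we read it, that is $c_*/k_0^2 + c_* \sum_{m=1}^{k-k_0-1} \prod_{j=1}^{m} [1 + c_*/(k-j+1)^2]\,(k-m)^{-2}$. The value of $c_*$ is a hypothetical placeholder, since the constants of Lemmas 48 and 49 are not computed here, and the function name is ours.

```python
def nu_bound(k, k0, c_star):
    """Evaluate the bracket in (158) for given k > k0 and a hypothetical constant c_star."""
    total = c_star / k0 ** 2
    prod = 1.0
    for m in range(1, k - k0):  # m = 1, ..., k - k0 - 1
        # Multiply in the factor with j = m of the product over j = 1..m.
        prod *= 1.0 + c_star / (k - m + 1) ** 2
        total += c_star * prod / (k - m) ** 2
    return total


if __name__ == "__main__":
    c_star = 5.0  # hypothetical value, for illustration only
    for k0 in (2, 10, 100):
        worst = max(nu_bound(k, k0, c_star) for k in range(k0 + 1, k0 + 2000))
        print(k0, worst)
```

Since $\sum_{n > k_0} n^{-2} \le 1/k_0$, the bracket is at most about $c_*/k_0^2 + c_* e^{c_*/k_0}/k_0$, which is why a large $k_0$ gives $\nu < 1$ uniformly in $k$.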
Lemma 51. For any $r \in [0, \tfrac12]$ and for large enough $k$ we have $\bigl\| \tfrac{d}{dr} h_k \bigr\|_\infty \le c_* k$.
Proof. Since by Lemma 50 $h_k$ is bounded, Lemmas 49 and 48 imply
\[ |h_k'(r)| \le \Bigl| \frac{d}{dr} [A_k h_{k-1}](r) \Bigr| + \Bigl| \frac{d}{dr} [H_k h_{k+1}](r) \Bigr| \le c_* k. \]
Lemma 52. For all $k \ge 1$, $h_k(1) = 1$.
Proof. In case (i), a simple computation shows that
\[ \frac{\partial^{2k} g_{n_0-k}}{\partial s^{2k}} \Big|_{s=0} = i^k h_k(1); \qquad (g_{n_0-k} := i^k m_k h_k). \]
(By the differential equation for $h_k$, all derivatives exist.) Lemma 37 with $j = 2k$ gives
\[ i^k = \frac{\partial^{2k}}{\partial s^{2k}} \Big|_{s=0} g_{n_0-k} = i^k h_k(1), \]
implying the result in case (i). In case (ii), using Lemma 38, a similar computation shows that
\[ i^k = \frac{\partial^{2k+1}}{\partial s^{2k+1}} \Big|_{s=0} g_{n_0-k} = i^k h_k(1) \qquad (g_{n_0-k} := i^k m_k h_k). \]
Definition 53. Let Tˆk (s, s) = s −2k+1−τ
s
t 2k−2+τ s
0
∂ Tk (s, t)dt, ∂s
(159)
where Tk (s, t) is defined in (95). Lemma 54. Let δ = k −1 log k and Sk (s) := ∂∂s s ∈ (0, δ) and r (s) k −1 C2 , we have
1 0
t 2k−2 Tk (s, t)dt. If C2 is large enough,
s f (s) s f 3 (s) Tˆk (s, s) = sSk (s) − 1 (1 − s)3 + (1 − s)3 12 3kr 3 (1 − s)4 (1 − s)3 (1 − s)2 (1 − s) (1 − s)3 (1 − s)2 . (160) +O , , , , , kr 4 k 2r 4 k 3r 4 k 4r 3 kr 2 k 2r 2 Proof. This simply follows by integrating (103) from t = 1 to s of Tk and the fact that Tˆk (s, 1) = sSk (s). 4.13. Proof of Lemma 31. First choose 1 > 0. From Lemma 49, it follows that ( ( (d ( ( [Hk h k+1 ]( ( ds (
∞
c∗ c∗ h k+1 ∞ 2 , k2 k
where we applied Lemma 50. Further, we note that d 1 Ak h k−1 (s) (2k + τ )(2k + τ − 1) ds 1 1 ∂ Tk (s, t)h k−1 (st)dt + t 2k+τ −2 t 2k+τ −1 Tk (s, t)h k−1 (st)dt. = ∂s 0 0
(161)
We have
2k+τ −2 ∂ Tk
∂ Tk (s, t) dt t 2k+τ −2 s ∂s 0 0 s 1 1 2k+τ −2 ∂ Tk × (s, t)dt ds h k−1 (ss)ds = h k−1 (s)Sk (s) − h k−1 (ss) t s ∂s t 0 0 1 = h k−1 (s)Sk (s) − h k−1 (ss)s 2k−1+τ Tˆk (s, s)ds = (2k + τ − 1)Sk (s) 1
t
∂s
(s, t)h k−1 (st)dt = h k−1 (s)Sk (s) −
0
1
×
1
s 2k−2+τ h k−1 (ss)ds −
0
0
1
s 2k−1+τ [Tˆk (s, s) − Tˆk (s, 1)]h k−1 (ss)ds. (162)
Therefore, d d s Ak [h k−1 ](s)
(2k + τ )(2k + τ − 1)
1
=
[Tk (s, s) − Tˆk (s, s) + sSk (s)]s 2k+τ −1
0
×h k−1 (ss)ds + (2k + τ − 1)Sk (s)
1
s 2k+τ −2 h k−1 (ss)ds.
0
(163) We note that ∂ (2k + τ )(2k + τ − 1)Sk (s) = [Ak [1](s)] = O ∂s
"
1 1 , 2 2 3 k 1 k
#
1 and that (2k + τ − 1) 0 s 2k+τ −2 h k−1 (ss)ds has a bound independent of k. Combining (103) with Lemma 54, if k is large so that k1 is large, then f2 k f1 ˆ + 2 Tk (s, s) − [Tk (s, s) − sSk (s)] = (1 − s) + − 4 r f (1 − s)2 2 f 3 × − (1 − s)3 + (1 − s)3 − s − 1 + k 3 12 3kr 3 (1 − s)4 (1 − s)3 (1 − s)2 (1 − s) (1 − s)3 (1 − s)2 +O , , , 4 3 , , , kr 4 k 2r 4 k 3r 4 k r kr 2 k 2r 2 (1 − s)4 (1 − s)3 (1 − s)3 (1 − s)2 , . , , × r3 kr 3 r kr
(164)
From (164), it is clear that Tk (s, s) −Tˆk (s, s) + sSk (s) > 0 if s ∈ (1 − δ, 1) and k1 is sufficiently large. Now, s f 3 / 3kr 3 (1 − s)3 > 0 exceeds any term following it in (164), except possibly when 1 − r , i.e. s is small. Thus, if we define Mk =
sup
r (s)∈[1 ,1]
|h k (s)|
(165)
we get
k f1 f2 + 2 s 2k+τ −1 (1−s)+ − 4 r 1−δ1 f c∗ 2 c∗ 1 × [− (1 − s)2 + (1 − s)3 ] + s 1 (1 − s)3 ds + 2 + 3 2 . k 3 12 k k 1
|h k (s)| (2k + τ )(2k +τ −1)Mk−1
1
(166)
When (1 − r ) (and thus s) is small, we can replace the term s f 1 /(12)(1 − s)3 on the right side of the above equation simply by (1 − s)3 , which is clearly bigger. From the 1 fact that 1−δ s 2k−1 [−k −1 (1 − s)2 + (2/3)(1 − s)3 ]ds = O(k −5 ), it follows that " # 2k − 1 + τ c∗ c∗ c∗ c∗ + (167) Mk Mk−1 + 2 + 3 2. + 2k + 1 + τ k 2 k 3 12 k k 1 Let C3 be large enough and define k0 (1 ) = C3 /1 , so that for k k0 we have " # 2k + τ − 1 c∗ c∗ k − 1 1/2 + 2 3+ 2 . 2k + τ + 1 1 k k k Then for k k0 ,
Mk
implying Mk Mk 0 +
k0 k
k−1 k
1/2 Mk−1 +
c∗ c∗ + 3 2, 2 k k 1
(168)
1/2
k 3/2 k0 1 c∗ 1 c∗ c∗ c∗ + c + . ∗ 1/2 + 1/2 3/2 5/2 2 1/2 1/2 k 1/2 j 3/2 k 1 /2 k j k k0 k k0 12 1 j=k0 j=k0
(169) The result follows from the definition of Mk and noting that last two terms in (169) are 3/2 O(c∗ k0 k −1/2 ). 4.14. Proof of Lemma 33. From Lemma 31 and the definition of k0 , it follows that |h k (1 )|
3/2
C4 C3
3/2
k 1/2 1
for k C3 1 −1 = k0 . Using h k (1) = 1, it follows that for k ≥ C3 /r , 1 3/2 C4 C3 |h k (r )|dr 1 . |h k (r ) − 1| 1/2 r 2 (kr ) " #2 3/2 C4 C3 α 1/2 7 = L then |h k (r ) − 1| . Additionally, if αkr 1 2 7 It is to be noted that for small enough the inequality αkr ≥ L always implies k ≥ C /r . 3
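The decay in $k$ used here comes from iterating the recursion (168) of the proof of Lemma 31. The sketch below is a numerical illustration under our reading of (168), $M_k \le ((k-1)/k)^{1/2} M_{k-1} + c_*/k^2 + c_*/(k^3 \epsilon_1^2)$, taken with equality as the worst case. Multiplying by $k^{1/2}$ makes the recursion telescope, which gives the sums over $j^{-3/2}$ and $j^{-5/2}$ appearing in (169). All numerical values and function names are hypothetical.

```python
import math


def iterate_recursion(M_k0, k0, k_max, c_star, eps1):
    """Iterate (168) with equality, starting from M_{k0}; returns {k: M_k}."""
    values = {k0: M_k0}
    M = M_k0
    for k in range(k0 + 1, k_max + 1):
        M = math.sqrt((k - 1) / k) * M + c_star / k ** 2 + c_star / (k ** 3 * eps1 ** 2)
        values[k] = M
    return values


def telescoped_value(M_k0, k0, k, c_star, eps1):
    """Closed form obtained by summing the recursion for sqrt(k) * M_k."""
    s32 = sum(j ** -1.5 for j in range(k0 + 1, k + 1))
    s52 = sum(j ** -2.5 for j in range(k0 + 1, k + 1))
    return (math.sqrt(k0) * M_k0 + c_star * s32 + c_star * s52 / eps1 ** 2) / math.sqrt(k)


def uniform_bound(M_k0, k0, k, c_star, eps1):
    """Upper bound using sum_{j>k0} j^(-3/2) <= 2 k0^(-1/2) and sum j^(-5/2) <= (2/3) k0^(-3/2)."""
    tail = 2.0 * c_star / math.sqrt(k0) + (2.0 / 3.0) * c_star / (eps1 ** 2 * k0 ** 1.5)
    return (math.sqrt(k0) * M_k0 + tail) / math.sqrt(k)
```

With these helpers one can compare `iterate_recursion(...)[k]` with `telescoped_value(...)` for any `k`, and confirm that both stay below `uniform_bound(...)`, which decays like $k^{-1/2}$.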
4.15. Proof of Lemma 35. For ζ ∈ [0, L ], using the a priori boundedness of h k in k and Lemma 51, we note that both h˜ k (ζ ) := h k (r (ζ )) and (h˜ k )ζ are bounded independently of k. Hence the sequence {h˜ k }k 2 is bounded and equicontinuous. By Ascoli-Arzelà’s ˜ The theorem, there exists a subsequence h˜ k j (ζ ) converging to a continuous function h. ˜ )−1| ≤ 4. Now, from Lemma 33, first part of the result is proved. We first prove that |h(ζ |h˜ k (ζ ) − 1| for ζ ∈ [L , αk] for sufficiently large k.
(170)
Let h˜ k, j be a subsequence that converges to h˜ for ζ ∈ [0, L ]. Let ζm , ζ M be a minimum, and a maximum point of h˜ on [0, L ] and the corresponding minimum and maximum values are denoted by m and M respectively. Continuity at the endpoint ζ = L implies that M ≥ 1 − , m ≤ 1 + . If both M − 1 − < 0 and m − 1 + > 0, there is ˜ ) − 1| ≤ 2. Now, consider the nothing to prove because in that case it is clear that |h(ζ possibility that (i): M > 1 + . In a similar manner, we will also consider the possibility ˜ ) < 1 + , (ii): m < 1 − . Consider (i) first. Since at the end point of the interval, h(L from continuity there exists an interval [a, b] ⊂ [ζ M , L ] of nonzero length for which 1 ˜ h(η) ≤ (M + 1 + ) < M for η ∈ [a, b]. 2
(171)
For some Lˆ > L , independent of k (to be determined shortly), we write " ˆ # $ % kα1 L 0 Ak f (ζ ) = + K (ζ, η) f (η(1 − k −1 ))dη Lˆ
ζ
a1 H (η(1 − k −1 ) G(ζ, η)dη with K (ζ, η) := e−Q(η)+Q(ζ ) 1 + k H (ζ ) 01 =: [A00 k f ](ζ ) + [Ak f ](ζ ).
(172)
For fixed ζ and η we have lim K (ζ, η) = K 0 (ζ, η) = e−η+ζ
k→∞
H0 (η) G0 (ζ, η). H0 (ζ )
(173)
On our interval we have η ζ . Thus G0 0 (see (126)); G0 can vanish only if η = ζ . Furthermore, by (171) we have ζ M ∈ [a, b]. We can then define J=
˜ 3 sup[0,L ] |h| , where K m = min K 0 (ζ M , η) > 0. η∈[a,b] (b − a)K m
Note that Q(η) ∼ η for large k and, aside from the exponential term, K is algebraically bounded. We can thus choose Lˆ > L large enough independently of k, so that −1 f ∞,[L ,kα1 ] . |[A01 k f ](ζ )| J
(174)
ˆ for simplicity, There is a subsequence of h˜ k j that converges uniformly on ∈ [0, L]; ˜ ˜ ) we will use the same notation h k, j for the subsequence. It is clear that the limit is h(ζ ˜ ˆ if ζ ∈ [0, L ]. We keep the notation h for the limit on [0, L]. We note that (170) implies ˜ ) − 1| for ζ ∈ [L , L]. ˆ |h(ζ
(175)
ˆ h(ζ ˜ ) ≤ 1 + < M. Now choose a small 2 > 0. It is clear that in the interval [L , L], ˜ ), we have For sufficiently large k j , using continuity of h(ζ ˜ [A00 k, j h(ζ M )] ≤
ˆ η∈[ζ M , L]\[a,b]
˜ K (ζ M , η)h(η)dη +
b
˜ K (ζ, η)h(η)dη + M2
a
b 1 M K (ζ M , η)dη + (M + 1 + ) K (ζ M , η)dη + 2 M 2 ˆ a η∈[ζ M , L]\[a,b] Lˆ b 1 =M K (ζ M , η)dη − (M − 1 − ) K (ζ M , η)dη + M2 2 0 a (b − a) (M − 1 − )K m + M2 . MA00 k j [1](ζ M ) − 3
01 1 Since Ak j [1] = A00 k j [1] + Ak j [1] + Ak j [1] (see (112) and (172)) Lemmas 47, 44 and (174) imply that for large k j we have
[A00 k j [1]](ζ M ) 1 +
+ 2 . J
Hence, for large k j we have K m ˜ [A00 + 22 − (M − 1 − )(b − a). k j h](ζ M ) M 1 + J 3
(176)
˜ Now, there exists N so that if j N , h˜ k j − h ˆ < 2 and j = Ak j+1 ...Ak j +1 ∞,[0, L] satisfies j − I ∞ 2 while r j+1 := Bk j+1 +
k j+1 −k j −1 m ) m=1
Ak j+1 −l+1 Bk j+1 −m ,
l=1
where Bl = Hl h l+1 , satisfies the estimate |r j+1 | < 2 . Therefore, from h˜ k j+1 = j Ak j h˜ k j + r j+1 it follows that ˜ M ) − 2 = M − 2 . h˜ k j+1 (ζ M ) h(ζ On the other hand, at ζ = ζ M we have Km (M − 1 − )(b − a) + 2 . j Ak j h˜ k j + r j+1 (1 + 2 ) M(1 + + 22 ) + 2 − J 3 (177)
Thus,
Km M − 2 (1 + 2 ) M(1 + + 22 ) + 2 − (M − 1 − )(b − a) + 2 . J 3
This is true for any 2 , hence as 2 ↓ 0. Thus, Km − (M − 1 − )(b − a) . M M 1+ J 3 However, from the definition of J , this implies M − 1 − . We note that for (ii), ˜ which has a maximum at ζm , to conclude that we repeat the above argument for −h, either (−m) − (−1 + ) ≤ 0 or (−m) − (−1 + ) = 1 − − m . Therefore, 1 − 2 m M 1 + 2, implying that |h˜ − 1| ≤ 4. 5. Appendix 5.1. Short proof of the regularity of the unitary propagator. Theorem 5. Assume that H1 = H + V (x, t), where H is time independent and selfadjoint, and V (·, t) is in L ∞ (Rn ) for every t and is differentiable in time, with integrable derivative. Consider the Schrödinger problem iψt = H1 ψ; ψ(x, 0) ∈ D(H ).
(178)
Then there exists a strongly differentiable unitary propagator $U(t)$ on $L^2(\mathbb{R}^n)$ so that $\psi(x, t) = U(t)\psi_0 \in D(H)$ for all $t$ and $\psi(x, t)$ solves (178).
Proof. We note that it is enough to prove this property on a finite interval $[0, \epsilon]$, since the problem can be restarted at $t = \epsilon$. Let $y = \psi - e^{-t}\psi_0$. Then $y$ satisfies the inhomogeneous Schrödinger equation
\[ i y_t = y_0 e^{-t} + H y + V y; \qquad y_0 := i\psi_0 + H\psi_0 + V\psi_0, \quad y(0) = 0. \tag{179} \]
We transform this equation into an integral equation, formally for now. Straightforward calculations show that
\[ i\,(e^{iHt} y)_t = e^{iHt} e^{-t} y_0 + e^{iHt} V y \tag{180} \]
or (still formally)
\[ i e^{iHt} y = \int_0^t e^{(iH-1)s}\,ds\; y_0 + \int_0^t e^{iHs} V(s) y(s)\,ds = (iH - 1)^{-1}\bigl(e^{iHt - t} - 1\bigr) y_0 + \int_0^t e^{iHs} V(s) y(s)\,ds \tag{181} \]
or, equivalently,
\[ i y = (iH - 1)^{-1}\bigl(e^{-t} - e^{-iHt}\bigr) y_0 + e^{-iHt} \int_0^t e^{iHs} V(s) y(s)\,ds. \tag{182} \]
It is clear that (182) is contractive in the norm $\sup_{t \in [0, \epsilon]} \| \cdot \|_{L^2(\mathbb{R}^3)}$ for small $\epsilon$, and has a unique solution. Clearly, the first term on the right side of (182) is differentiable in time and the derivative is continuous since $e^{-iHt}$ is; let $u_0$ denote this derivative. We now write a formal equation for $u = y_t$. We have
\[ i u = u_0 + \int_0^t e^{-iHs}\, V'(t - s) \int_0^{t-s} u(s')\,ds'\,ds + \int_0^t e^{-iHs}\, V(t - s)\, u(t - s)\,ds. \tag{183} \]
This equation is also contractive, and has a unique solution, in the same space. Thus both sides of (183) are integrable in time. By integration and appropriate changes of variables and order of integration, we see that $\int_0^t u(s)\,ds$ satisfies the same equation as $y$, which has a unique solution. Thus $y = \int_0^t u(s)\,ds$ is strongly differentiable. Since both $y$ and $e^{iHt} y$ are strongly differentiable (the latter by inspection from (181)), $y \in D(H)$ for all $t$ and is strongly differentiable. It is clear that $\psi \in D(H)$ and easy to check that it is differentiable and satisfies (178).
5.2. Laplace transform of the Schrödinger equation. We look more generally at equations of the form
\[ i\psi_t = H\psi + V(t, x)\psi, \tag{184} \]
where $H$ is self-adjoint and time independent, and $V(x, t)$ is bounded on $\mathbb{R}^3$ and differentiable and bounded in $t$, and $\psi(x, 0) \in D(H)$. The conditions on $V$ can be relaxed. (For the purpose of this paper, $H$ would be taken to be $H_C$.)
Proposition 55. Under the assumptions above, the Laplace transform $\hat\psi(p, \cdot)$ of $\psi(t, \cdot)$ exists for $\mathrm{Re}\, p > 0$; it is in $D(H)$ and satisfies
\[ (p + iH)\hat\psi = \psi_0 - i\,\widehat{V\psi}. \tag{185} \]
Proof. We take the unitary propagator of the time-independent problem, $U = e^{-iHt}$, and apply $U^*(t) = U^{-1}(t)$ to both sides of (184). Since (cf. § 1.2) $U^{-1}$ is strongly differentiable, with derivative $iU^{-1}H$, and $\psi$ is $t$-differentiable in $L^2$, $U^{-1}\psi$ is differentiable and we get
\[ (U^{-1}\psi)_t = iU^{-1}H\psi + U^{-1}\psi_t = -iU^{-1}V\psi. \tag{186} \]
Since $U^{-1}V\psi$ is continuous in $t$, we can integrate both sides and get, after multiplication by $U$ and using the fact that $U^{-1}(t) = U(-t)$,
\[ \psi = U\psi_0 - iU \int_0^t U^{-1} V\psi(s)\,ds = U\psi_0 - i \int_0^t U(t - s)(V\psi)(s)\,ds = U\psi_0 - iU * (V\psi), \tag{187} \]
where $*$ is the usual Laplace convolution. Taking the Laplace transform (which clearly exists) in (187) and using standard functional calculus we get
\[ \hat\psi = (p + iH)^{-1}\psi_0 - i(p + iH)^{-1}\,\widehat{V\psi}, \tag{188} \]
and thus $\hat\psi$ is a $D(H)$ solution of (185).
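Relation (185) can be checked on a toy problem. The sketch below is only an illustration under simplifying assumptions that are ours: $H$ is replaced by a real scalar $h$ and $V$ by $a\cos t$, so that $\psi(t) = \psi_0 \exp[-i(ht + a\sin t)]$ solves $i\psi_t = (h + a\cos t)\psi$ exactly. The Laplace transforms are approximated by truncated Simpson quadrature, and all names and parameter values are hypothetical.

```python
import cmath
import math

H_SCALAR = 1.3   # scalar stand-in for the self-adjoint operator H (assumption)
AMPLITUDE = 0.8  # amplitude a of the toy potential V(t) = a cos t (assumption)
PSI0 = 1.0


def V(t):
    return AMPLITUDE * math.cos(t)


def psi(t):
    """Exact solution of i psi_t = (h + a cos t) psi with psi(0) = PSI0."""
    return PSI0 * cmath.exp(-1j * (H_SCALAR * t + AMPLITUDE * math.sin(t)))


def simpson(f, a, b, n):
    """Composite Simpson rule for a complex-valued integrand; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def laplace(f, p, cutoff=50.0, n=20000):
    """Approximate the Laplace transform of f at p (Re p > 0), truncating at t = cutoff."""
    return simpson(lambda t: cmath.exp(-p * t) * f(t), 0.0, cutoff, n)


def residual_185(p):
    """(p + iH) psi_hat - (psi_0 - i * (V psi)_hat); should vanish up to quadrature error."""
    psi_hat = laplace(psi, p)
    v_psi_hat = laplace(lambda t: V(t) * psi(t), p)
    return (p + 1j * H_SCALAR) * psi_hat - (PSI0 - 1j * v_psi_hat)


if __name__ == "__main__":
    print(abs(residual_185(1.0 + 0.5j)))
```

The truncation is harmless here because $|\psi| = 1$ and $\mathrm{Re}\, p > 0$, so the integrand decays like $e^{-t\,\mathrm{Re}\, p}$.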
Now, from Eq. (7), it follows that yˆ satisfies (9). Furthermore, using (188) and the fact that y 0 and j are compactly supported, we see that yˆ also satisfies ⎡ ⎤ yˆ ( p, ·) = R0 χ B yˆ 0 ( p, ·) − R0 χ B ⎣ j yˆ ( pˆ − i jω, ·)⎦ , (189) j∈Z
where $R_0 = (H_C - ip)^{-1}$.
5.3. Analyticity of $(I - C_{l,m})^{-1}$ in $X$. This is standard, and can be seen directly from analytic functional calculus. We provide a self-contained argument, for completeness. We write $C_X$ to emphasize the $X$-dependence of $C$, and for simplicity of notation we drop the $(l, m)$ subscript. We have
\[ (I - C_{X_1})^{-1} - (I - C_X)^{-1} = (I - C_X)^{-1}(C_{X_1} - C_X)(I - C_{X_1})^{-1} \]
and
\[ (I - C_X)^{-1}\bigl[ I + (C_{X_1} - C_X)(I - C_{X_1})^{-1} \bigr] = (I - C_{X_1})^{-1}. \tag{190} \]
We fix $X_1$ and let $X \to X_1$. Since $(I - C_{X_1})^{-1}$ is bounded, $(C_{X_1} - C_X)(I - C_{X_1})^{-1} \to 0$ as $X \to X_1$ and
\[ I + (C_{X_1} - C_X)(I - C_{X_1})^{-1} \tag{191} \]
is invertible when $X_1$ and $X$ are close enough, and $[I + (C_{X_1} - C_X)(I - C_{X_1})^{-1}]^{-1} \to I$ in operator norm as $X \to X_1$. Thus
\[ (I - C_X)^{-1} \to (I - C_{X_1})^{-1} \ \text{in operator norm, as } X \to X_1. \tag{192} \]
Now differentiability in $X$ follows from (190).
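The two operator identities used above are purely algebraic and can be tested on small matrices. The sketch below does so for a hypothetical $2 \times 2$ family `C(X)` depending analytically on a complex parameter; the family is an arbitrary stand-in, not the operator $C_{l,m}$ of the paper, and the helper names are ours.

```python
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def mat_sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]


def mat_inv(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def max_abs(a):
    return max(abs(a[i][j]) for i in range(2) for j in range(2))


def C(X):
    """Hypothetical analytic matrix family standing in for C_X."""
    return [[0.2 * X, 0.1], [X * X, 0.3]]


def resolvent(X):
    """(I - C_X)^{-1}."""
    return mat_inv(mat_sub(IDENTITY, C(X)))


def resolvent_identity_defect(X, X1):
    """Max entry of (I-C_X1)^-1 - (I-C_X)^-1 - (I-C_X)^-1 (C_X1 - C_X) (I-C_X1)^-1."""
    lhs = mat_sub(resolvent(X1), resolvent(X))
    rhs = mat_mul(mat_mul(resolvent(X), mat_sub(C(X1), C(X))), resolvent(X1))
    return max_abs(mat_sub(lhs, rhs))


def factorization_defect(X, X1):
    """Max entry of (I-C_X)^-1 [I + (C_X1 - C_X)(I-C_X1)^-1] - (I-C_X1)^-1, cf. (190)."""
    bracket = mat_add(IDENTITY, mat_mul(mat_sub(C(X1), C(X)), resolvent(X1)))
    return max_abs(mat_sub(mat_mul(resolvent(X), bracket), resolvent(X1)))


def resolvent_gap(X, X1):
    """Max entry of the difference of the two resolvents."""
    return max_abs(mat_sub(resolvent(X), resolvent(X1)))
```

In this finite-dimensional setting `resolvent_gap(X, X1)` shrinks linearly with `abs(X - X1)`, which is the matrix analogue of the convergence statement (192).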
5.4. Coulomb Green's function representation. The retarded Green's function $G = G^{+}$ is defined as the solution of the equation
\[ A_0 G(x, x'; k) = \delta(x - x') \tag{193} \]
in distributions, satisfying the radiation condition
\[ G(x, x'; k) \sim F(\theta, \phi)\, e^{ikr} r^{-1 - i\nu} \quad \text{as } r \to \infty, \tag{194} \]
where
\[ k = \sqrt{ip} \ \ (\mathrm{Im}\, k > 0 \text{ if } \mathrm{Re}\, p > 0), \qquad \nu = \frac{b}{2k}. \tag{195} \]
Equivalently, $G$ is the $\mathbb{R}^3 \setminus \{0\}$ solution of (193) with zero right hand side, satisfying (194) and $|x - x'|\, G(x, x'; k) \to (4\pi)^{-1}$ as $x - x' \to 0$.
Proposition 56.
\[ R_0 \chi_B g = \int_B G(x, x'; k)\, g(x')\,dx'. \tag{196} \]
Proof. The function
\[ f := \int_B G(x, x'; k)\, g(x')\,dx' \tag{197} \]
solves, as can be checked, the equation
\[ A_0 f = \chi_B g \tag{198} \]
with the radiation condition (194). Such a solution is unique since the difference of two solutions satisfies the equation $A_0 f = 0$ (with the radiation condition (194)). Multiplying by $G(x, x'; k)$, integrating over a volume and passing to the limit where the volume approaches $\mathbb{R}^3$, we see that $f \equiv 0$.
Symmetries of the Coulomb potential $-b/r$ allow for a closed form of $G$ (cf. [26], where the sign is chosen differently) in terms of Whittaker functions $W$ and $M$,
\[ G(x; x'; k) = -\frac{\Gamma(1 - i\nu)}{4\pi i k |x - x'|} \left( \frac{\partial}{\partial \xi} - \frac{\partial}{\partial \eta} \right) W_{i\nu, \frac12}(-ik\xi)\, M_{i\nu, \frac12}(-ik\eta), \tag{199} \]
where $\mathrm{Im}\, k > 0$, $2k\nu = b$ and
\[ \xi = |x| + |x'| + |x - x'|, \qquad \eta = |x| + |x'| - |x - x'|. \tag{200} \]
The Whittaker functions are defined in terms of the Kummer functions $M$ and $U$ by the relations, see [1], Chap. 13,
\[ M_{\kappa,\mu}(z) = e^{-\frac{z}{2}} z^{\frac12 + \mu}\, M\bigl(\tfrac12 + \mu - \kappa,\, 1 + 2\mu,\, z\bigr), \quad -\pi < \arg z \le \pi, \]
\[ W_{\kappa,\mu}(z) = e^{-\frac{z}{2}} z^{\frac12 + \mu}\, U\bigl(\tfrac12 + \mu - \kappa,\, 1 + 2\mu,\, z\bigr), \quad -\pi < \arg z \le \pi. \tag{201} \]
The following integral representation follows from [1], Chap. 13, for the values we are interested in, $z_1 = -ik\xi$, $z_2 = -ik\eta$, $a = 1 - i\nu$, $b = 2$ (a different "$b$" than the one in our Coulomb potential):
\[ M_{i\nu; \frac12}(z) = \frac{e^{-\frac12 z} z\, J(z)}{\Gamma(1 - i\nu)\Gamma(1 + i\nu)}; \qquad W_{i\nu; \frac12}(z) = \frac{e^{-\frac12 z} z\, I(z)}{\Gamma(1 - i\nu)}, \tag{202} \]
where $I$ and $J$ are as defined in (51) and the expression is valid in the regions where the integrals converge (in particular, $|\mathrm{Im}\, \nu| < 1$). For other values of $\nu$ of interest, the integrals can be replaced by appropriate contour integrals. For instance $J$ would be replaced by
\[ \bigl[ 1 - e^{-2\pi\nu} \bigr]^{-1} \oint_C e^{zt}\, t^{-i\nu} (1 - t)^{i\nu}\,dt, \]
where $C$ is a smooth simple curve encircling $[0, 1]$, as it can be checked by calculating the jump across the cut of the integrand. It follows from these integral representations that the Green's function is analytic at any (small) $p$, $\mathrm{Re}\, p = 0$. Substituting (202) into (199), we obtain (49).
Ionization of Coulomb Systems in R3
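5.5. Dependence of A in Eq. (56) on Z, p. We now seek to determine the asymptotics of $A$ in (56) in the resolvent $\chi_B R_\beta \chi_B$ in terms of $\lambda = \sqrt{-ip}$ and $Z = \exp[i\pi b/(2\lambda)]$ for $X = (\sqrt{p}, Z) \in D^+ \times D$ for sufficiently small $\epsilon$. Recall the expression $A$ in (56). Note that since
$\alpha = \sqrt{\lambda^2 - ic} \sim e^{-i\pi/4} c^{1/2} \left(1 + O(\lambda^2)\right) \equiv \alpha_0 + \lambda^2 \alpha_1 + \cdots$, (203)
$\kappa_1 = \frac{b}{2\alpha} = \frac{b e^{i\pi/4}}{2\sqrt{c}} \left[1 + O(\lambda^2)\right] \equiv \kappa_{1,0} + \lambda^2 \kappa_{1,2} + \cdots$, (204)
each of $m_1(a)$ and $w_1(a)$ is analytic in $\lambda$ for small $\lambda$, with the expansion
$m_1(a) = \frac{1}{a} M_{\kappa_1, l+1/2}(2\alpha a) \sim m_{1,0}(a) + \lambda m_{1,1}(a) + \cdots$, (205)
$w_1(a) = \frac{1}{a} W_{\kappa_1, l+1/2}(2\alpha a) \sim w_{1,0}(a) + \lambda w_{1,1}(a) + \cdots$. (206)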
The asymptotics in this case is also differentiable with respect to $a$ and we get similar expressions as above for $m_1'(a)$ and $w_1'(a)$. It follows that the expression for $f_0$ in (55) also possesses a regular series expansion in $\lambda$:
$f_0(a) = f_{0,0}(a) + \lambda f_{0,1}(a) + \cdots$. (207)
To simplify $A$ as in (56) for small $\lambda$, we now consider the asymptotics of $w_2(a)$ and $w_2'(a)$ for small $\lambda$.
5.6. Asymptotics of $w_2(a)$, $w_2'(a)$ for small $\lambda$. Since $w_2(a) = \frac{1}{a} W_{\kappa, l+1/2}(2\lambda a)$, with $\kappa = b/(2\lambda)$, it follows from formula (13.1.33) and analytic continuation to larger values of $\kappa$ of (13.2.5) of [1], p. 505 and the identity $\Gamma(x)\Gamma(1 - x) = \pi/\sin[\pi x]$ that
$w_2(a) = -\frac{e^{-i\pi(l-\kappa)} e^{-\lambda a} (2\lambda a)^{l+1}\, \Gamma(\kappa - l)}{2\pi i a}\, H(2\lambda a; \kappa, l)$, where $H(z; \kappa, l) = \int_C e^{-zt} t^{l-\kappa} (1 + t)^{l+\kappa}\, dt$, (208)
where the contour $C$ starts at $\infty e^{i0}$, circles around the origin once counter-clockwise to the right of $t = -1$ and goes to $\infty e^{i2\pi}$. In defining the integrand, we choose $\arg t \in [0, 2\pi]$, $\arg(1 + t) \in (-\pi, \pi]$ so that there is no branch cut on the real axis between $-1$ and $0$. It follows from (208) that
$w_2'(a) = \left(-\lambda + \frac{l}{a}\right) w_2(a) + \frac{e^{-i\pi(l-\kappa)} e^{-\lambda a} (2\lambda)^{l+2} a^l\, \Gamma(\kappa - l)}{2\pi i}\, H_1(2\lambda a; \kappa, l)$, where $H_1(z; \kappa, l) = \int_C e^{-zt} t^{l-\kappa+1} (1 + t)^{l+\kappa}\, dt$. (209)
We now seek to determine $H(2\lambda a; b/(2\lambda), l)$ and $H_1(2\lambda a; b/(2\lambda), l)$ asymptotically for small $\lambda$. For that purpose it is convenient to define
$\epsilon_2 = 2\lambda \left(\frac{a}{b}\right)^{1/2}$, $\tau = \epsilon_2 t$, $P(\tau; \epsilon_2) = -\frac{1}{\epsilon_2} \log\left(1 + \frac{\epsilon_2}{\tau}\right) + \tau$, (210)
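where we use the principal branch of log in defining $P(\tau; \epsilon_2)$ above. Then, noting that in the definition of $\log \tau$ and $\log(\tau + \epsilon_2)$, $\arg \tau \in [0, 2\pi)$ and $\arg(\tau + \epsilon_2) \in (-\pi, \pi]$, we have
$\log \tau - \log(\tau + \epsilon_2) = -\log\left(1 + \frac{\epsilon_2}{\tau}\right)$
for $\tau$ in the upper-half plane, while for $\tau$ in the lower-half plane, we have
$\log \tau - \log(\tau + \epsilon_2) = i2\pi - \log\left(1 + \frac{\epsilon_2}{\tau}\right)$.
It is readily checked that
$H\left(2\lambda a; \frac{b}{2\lambda}, l\right) = \frac{b^{l+1/2}}{2^{2l+1} \lambda^{2l+1} a^{l+1/2}} \left[ \int_{C_1} \tau^l (\tau + \epsilon_2)^l \exp\left[-\sqrt{ab}\, P(\tau; \epsilon_2)\right] d\tau + \exp\left(-\frac{i\pi b}{\lambda}\right) \int_{C_2} \tau^l (\tau + \epsilon_2)^l \exp\left[-\sqrt{ab}\, P(\tau; \epsilon_2)\right] d\tau \right]$, (211)
$H_1\left(2\lambda a; \frac{b}{2\lambda}, l\right) = \frac{b^{l+1}}{2^{2l+2} \lambda^{2l+2} a^{l+1}} \left[ \int_{C_1} \tau^{l+1} (\tau + \epsilon_2)^l \exp\left[-\sqrt{ab}\, P(\tau; \epsilon_2)\right] d\tau + \exp\left(-\frac{i\pi b}{\lambda}\right) \int_{C_2} \tau^{l+1} (\tau + \epsilon_2)^l \exp\left[-\sqrt{ab}\, P(\tau; \epsilon_2)\right] d\tau \right]$. (212)
Here $C_1$ is a contour in the upper-half complex $\tau$-plane from $+\infty$ to $-\epsilon_2$ along a steepest descent line, passing through the saddle point $\tau_{s,1} = i(1 + o(1))$, where $P'(\tau_{s,1}; \epsilon_2) = 0$. The contour $C_2$ is the steepest descent line in the lower-half $\tau$-plane from $\tau = -\epsilon_2$ to $+\infty$ through the saddle point $\tau_{s,2} = -i(1 + o(1))$, where $P'(\tau_{s,2}; \epsilon_2) = 0$. We rewrite $w_2$ and $w_2'$ as
$w_2(a) = \frac{(-1)^{l+1} e^{-\lambda a} b^{l+1/2}\, \Gamma(\kappa - l)}{2^{l+1} \sqrt{a}\, \lambda^l} \left[ Z M_1(\sqrt{ab}, \epsilon_2) + Z^{-1} M_2(\sqrt{ab}, \epsilon_2) \right]$, (213)
where
$M_1(\zeta, \epsilon_2) = \frac{1}{\pi i} \int_{C_1} e^{-\zeta P(\tau; \epsilon_2)} \tau^l (\tau + \epsilon_2)^l\, d\tau$, $M_2(\zeta, \epsilon_2) = \frac{1}{\pi i} \int_{C_2} e^{-\zeta P(\tau; \epsilon_2)} \tau^l (\tau + \epsilon_2)^l\, d\tau$, (214)
$w_2'(a) = \left(-\lambda + \frac{l}{a}\right) w_2(a) + \frac{(-1)^l e^{-\lambda a} b^{l+1}\, \Gamma(\kappa - l)}{2^{l+1} a \lambda^l} \left[ Z M_3(\sqrt{ab}, \epsilon_2) + Z^{-1} M_4(\sqrt{ab}, \epsilon_2) \right]$, (215)
where
$M_3(\zeta, \epsilon_2) = \frac{1}{\pi i} \int_{C_1} e^{-\zeta P(\tau; \epsilon_2)} \tau^{l+1} (\tau + \epsilon_2)^l\, d\tau$ and $M_4(\zeta, \epsilon_2) = \frac{1}{\pi i} \int_{C_2} e^{-\zeta P(\tau; \epsilon_2)} \tau^{l+1} (\tau + \epsilon_2)^l\, d\tau$. (216)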
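It follows that, with $\epsilon_2 = 2\lambda\sqrt{a/b}$, we have
$\frac{w_2'(a)}{w_2(a)} = -\lambda + \frac{l}{a} - \frac{b^{1/2}}{a^{1/2}} \left[ \frac{Z^2 M_3(\sqrt{ab}, \epsilon_2) + M_4(\sqrt{ab}, \epsilon_2)}{Z^2 M_1(\sqrt{ab}, \epsilon_2) + M_2(\sqrt{ab}, \epsilon_2)} \right]$. (217)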
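The saddle points through which $C_1$ and $C_2$ pass can be checked numerically. Differentiating the definition of $P$ in (210) gives $P'(\tau) = 1/(\tau(\tau + \epsilon_2)) + 1$, so $P' = 0$ is the quadratic $\tau^2 + \epsilon_2 \tau + 1 = 0$. The following Python sketch (function names and sample values are our own, chosen only for illustration) compares the closed form of $P'$ with a finite difference of $P$ and verifies that the two roots of the quadratic are zeros of $P'$ close to $\pm i$.

```python
import cmath

def P(tau, eps):
    """P(tau; eps_2) of Eq. (210), with the principal branch of log."""
    return -cmath.log(1 + eps / tau) / eps + tau

def dP(tau, eps):
    """Closed form of dP/dtau, namely 1/(tau (tau + eps_2)) + 1."""
    return 1 / (tau * (tau + eps)) + 1

def saddle_points(eps):
    """Roots of tau^2 + eps_2 tau + 1 = 0, i.e. the zeros of dP/dtau."""
    root = cmath.sqrt(1 - eps * eps / 4)
    return 1j * root - eps / 2, -1j * root - eps / 2

eps = 0.1                 # illustrative small value of eps_2
tau = 0.3 + 0.7j          # arbitrary test point away from the branch cut
h = 1e-6
finite_difference = (P(tau + h, eps) - P(tau - h, eps)) / (2 * h)
print(abs(finite_difference - dP(tau, eps)))

for tau_s in saddle_points(eps):
    print(tau_s, abs(dP(tau_s, eps)))
```

For small $\epsilon_2$ the roots are $\pm i - \epsilon_2/2$ up to second order, in agreement with $\tau_{s,1} = i(1 + o(1))$ and $\tau_{s,2} = -i(1 + o(1))$.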
5.6.1. Analyticity in $\epsilon_2$
Proposition 57. The functions $M_i(\sqrt{ab}, \cdot)$, $i = 1, ..., 4$, are analytic near zero.
Proof. We look at $M_1$, the others being similar. We can make a change of variable
$q = P(\tau; \epsilon_2) - P(\tau_{s,1}; \epsilon_2)$, (218)
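where the function $q$ is real on the steepest descent contour and changes monotonically from $\infty$ to $0$, as we move from $+\infty$ to $\tau = \tau_{s,1}$, and then increases monotonically again from $0$ to $\infty$ as we move along the steepest descent path from $\tau = \tau_{s,1}$ to $\tau = -\epsilon_2$. We denote the two branches of the inverse function $\tau(q)$ in (218) by $\tau_1(q)$ and $\tau_2(q)$. Noting that
$\frac{d}{d\tau} P(\tau; \epsilon_2) = \frac{1}{\tau(\tau + \epsilon_2)} + 1$,
we have
$M_1(\zeta, \epsilon_2) = e^{-\zeta \tau_{s,1}} \left[ \int_0^\infty e^{-\zeta q} \left[ \frac{\tau_2^{l+1} (\tau_2 + \epsilon_2)^{l+1}}{\tau_2^2 + 1 + \epsilon_2 \tau_2} \right] dq - \int_0^\infty e^{-\zeta q} \left[ \frac{\tau_1^{l+1} (\tau_1 + \epsilon_2)^{l+1}}{\tau_1^2 + 1 + \epsilon_2 \tau_1} \right] dq \right]$. (219)
It is easy to check that $(\tau_i - \tau_{s,1})^2$ is analytic for small $\epsilon_2$, regular in $q$ and nonzero at $\epsilon_2 = 0$ for all $q$. Furthermore, the integrands in (219) are clearly bounded by an $L^1$ function uniformly in $\epsilon_2$ (see (210) and (218)), ensuring $\epsilon_2$-analyticity of the integrals. Returning to the original variable $\tau$ we get
$M_1(\sqrt{ab}, 0) = \frac{1}{\pi i} \int_{C_{1,0}} e^{-\sqrt{ab}\left(-\frac{1}{\tau} + \tau\right)} \tau^{2l}\, d\tau = \frac{1}{\pi} \int_0^{\pi} \exp\left[ i(2l + 1)\theta - 2i\sqrt{ab} \sin\theta \right] d\theta = J_{2l+1}(2\sqrt{ab}) - i\left[ Y_{2l+1}(2\sqrt{ab}) + G_{2l+1}(2\sqrt{ab}) \right]$ (220)
and
$M_2(\sqrt{ab}, 0) = \frac{1}{\pi i} \int_{C_{2,0}} e^{-\sqrt{ab}\left(-\frac{1}{\tau} + \tau\right)} \tau^{2l}\, d\tau = \frac{1}{\pi} \int_{-\pi}^{0} \exp\left[ i(2l + 1)\theta - 2i\sqrt{ab} \sin\theta \right] d\theta = J_{2l+1}(2\sqrt{ab}) + i\left[ Y_{2l+1}(2\sqrt{ab}) + G_{2l+1}(2\sqrt{ab}) \right]$, (221)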
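where $J_{2l+1}$ and $Y_{2l+1}$ are the usual Bessel functions of order $2l + 1$ and
$G_{2l+1}(\nu) \equiv \frac{1}{\pi} \int_0^\infty \left[ \exp[(2l + 1)t] + (-1)^{2l+1} \exp[-(2l + 1)t] \right] e^{-\nu \sinh t}\, dt = \frac{2}{\pi} \int_0^\infty \sinh((2l + 1)t)\, e^{-\nu \sinh t}\, dt$, (222)
$\tau_{s,1} = i\sqrt{1 - \frac{\epsilon_2^2}{4}} - \frac{\epsilon_2}{2} = i + \text{Series in } \epsilon_2$. (223)
Thus, asymptotically, to the leading order in $\lambda$, we have with $\nu = 2\sqrt{ab}$,
$-\sqrt{\frac{a}{b}}\, \frac{w_2'(a)}{w_2(a)} = \frac{Z^2 \left(J_{2l+2}(\nu) - iY_{2l+2}(\nu) - iG_{2l+2}(\nu)\right) + \left(J_{2l+2}(\nu) + iY_{2l+2}(\nu) + iG_{2l+2}(\nu)\right)}{Z^2 \left(J_{2l+1}(\nu) - iY_{2l+1}(\nu) - iG_{2l+1}(\nu)\right) + \left(J_{2l+1}(\nu) + iY_{2l+1}(\nu) + iG_{2l+1}(\nu)\right)} \times (1 + O(\lambda))$. (224)
The discussion on $w_2'(a)/w_2(a)$ shows that
$A = \frac{f_0(a) w_2'(a) - f_0'(a) w_2(a)}{m_1(a) w_2'(a) - m_1'(a) w_2(a)}$ (225)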
is an analytic function of the extended parameter set $X$ for $X = (\sqrt{p}, Z) \in D^+ \times D$ as long as the denominator for $A$ is nonvanishing as $\lambda \to 0$. We can prove it is nonvanishing by simplifying the leading order expression in $\lambda$ for $w_2'(a)/w_2(a)$, defined as $w_{2,0}'(a)/w_{2,0}(a)$, under the further assumption that $a$ and $c$ (as in the definition of $\beta$) are sufficiently large.
5.6.2. Further simplification for large a. For large $a$, there is additional simplification since
$J_{2l+1}(2\sqrt{ab}) \pm iY_{2l+1}(2\sqrt{ab}) \sim \frac{(-1)^{l+1}}{\pi^{1/2} a^{1/4} b^{1/4}} \exp\left[ \pm i\left( 2\sqrt{ab} + \frac{\pi}{4} \right) \right]$, (226)
$J_{2l+2}(2\sqrt{ab}) \pm iY_{2l+2}(2\sqrt{ab}) \sim \frac{(-1)^{l+1}}{\pi^{1/2} a^{1/4} b^{1/4}} \exp\left[ \pm i\left( 2\sqrt{ab} - \frac{\pi}{4} \right) \right]$, (227)
and from Watson’s Lemma, we get
$G_{2l+1}(2\sqrt{ab}) = O(1/a)$, $G_{2l+2}(2\sqrt{ab}) = O(1/\sqrt{a})$. (228)
It follows that for large $a$,
$\frac{w_{2,0}'(a)}{w_{2,0}(a)} \sim \frac{r_2}{a} \left[ \frac{n_1 - Z^2}{Z^2 + n_1} \right] \left( 1 + O(a^{-1/2}) \right)$, (229)
where
$n_1 = i e^{4i\sqrt{ba}}$, $Z = \exp\left( \frac{i\pi b}{2\lambda} \right)$, $r_2 = i\sqrt{ab}$. (230)
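The passage from (224) to (229) is pure algebra once the large-$a$ forms (226) and (227) are inserted and the $G$ terms of (228) are dropped. The short Python sketch below (our own illustration; the function names and test values are arbitrary) evaluates both sides for sample parameters and confirms that they coincide. The common prefactor $(-1)^{l+1}/(\pi^{1/2} a^{1/4} b^{1/4})$ cancels in the ratio and is therefore omitted.

```python
import cmath
import math

def ratio_from_bessel_asymptotics(a, b, Z):
    """w2'/w2 from (224), with J +/- iY replaced by the phases in (226)-(227)."""
    nu = 2 * math.sqrt(a * b)
    num = Z**2 * cmath.exp(-1j * (nu - math.pi / 4)) + cmath.exp(1j * (nu - math.pi / 4))
    den = Z**2 * cmath.exp(-1j * (nu + math.pi / 4)) + cmath.exp(1j * (nu + math.pi / 4))
    return -math.sqrt(b / a) * num / den

def ratio_closed_form(a, b, Z):
    """Leading term of (229), with n_1 and r_2 as in (230)."""
    n1 = 1j * cmath.exp(4j * math.sqrt(a * b))
    r2 = 1j * math.sqrt(a * b)
    return (r2 / a) * (n1 - Z**2) / (Z**2 + n1)

a, b, Z = 50.0, 1.3, 0.4 + 0.3j   # illustrative values with |Z| < 1
lhs = ratio_from_bessel_asymptotics(a, b, Z)
rhs = ratio_closed_form(a, b, Z)
print(lhs, rhs)
```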
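5.6.3. Nonvanishing of the denominator of A in (225). Now, defining $m = m_1(a)$, $m' = m_1'(a)$, $f = f_0(a)$, $f' = f_0'(a)$, we have to the leading order in $\lambda$, for large $a$,
$-A = \frac{f \left[ n_1 - Z^2 + O(\lambda) \right] - \frac{a f'}{r_2} \left[ n_1 + Z^2 + O(\lambda, a^{-1/2}) \right]}{m \left[ n_1 - Z^2 + O(\lambda) \right] - \frac{a m'}{r_2} \left[ n_1 + Z^2 + O(\lambda, a^{-1/2}) \right]}$ (231)
$= \frac{f \left[ 4r_2 (n_1 - Z^2) + O(\lambda) \right] - 4a f' \left[ n_1 + Z^2 + O(\lambda) \right]}{m \left[ 4r_2 (n_1 - Z^2) + O(\lambda) \right] - 4a m' \left[ n_1 + Z^2 + O(\lambda) \right]}$. (232)
The denominator of $A$ is
$D = -m \left[ Z^2 \left( 4a \frac{m'}{m} + 4r_2 + O(\lambda) \right) + n_1 \left( 4a \frac{m'}{m} - 4r_2 + O(\lambda) \right) \right]$. (233)
We note that
$m \equiv m_1(a) = \frac{1}{a} M_{\frac{b}{2\alpha}, l+1/2}(2\alpha a) = e^{-\alpha a} (2\alpha)^{l+1} a^l M\left( l + 1 - \frac{b}{2\alpha}, 2l + 2, 2\alpha a \right) \sim (2\alpha)^{l+1} a^l \frac{e^{\alpha a}}{\Gamma\left( l + 1 - \frac{b}{2\alpha} \right)} \left[ 1 + O\left( (\alpha a)^{-1} \right) \right]$ for $\alpha$ large, (234)
and for large $\alpha$ in the fourth quadrant
$m' \equiv m_1'(a) \sim (2\alpha)^{l+1} a^l \alpha \frac{e^{\alpha a}}{\Gamma\left( l + 1 - \frac{b}{2\alpha} \right)} \left[ 1 + O\left( (\alpha a)^{-1} \right) \right]$, (235)
$\alpha = \sqrt{\lambda^2 - ic} \to c^{1/2} \exp\left[ -i\frac{\pi}{4} \right]$ as $\lambda \to 0$. (236)
Therefore, $D$ can be zero for large enough $c$ (i.e. large $\beta$) only if
$Z^2 \left[ 2\sqrt{2}\, a c^{1/2} (1 - i) \left( 1 + O\left( (ca)^{-1} \right) \right) + 4i\sqrt{ba} \right] = -n_1 \left[ 2\sqrt{2}\, a c^{1/2} (1 - i) \left( 1 + O\left( (ca)^{-1} \right) \right) - 4i\sqrt{ba} \right] + O(\lambda)$.
Taking the absolute square of both sides, we obtain,
$|Z|^2 \left\{ [2a\sqrt{2c}]^2 \left( 1 + O(c^{-1} a^{-1}) \right) + \left[ 4\sqrt{ab} - 2a\sqrt{2c} \left( 1 + O(c^{-1} a^{-1}) \right) \right]^2 \right\} = [2a\sqrt{2c}]^2 \left( 1 + O(c^{-1} a^{-1}) \right) + \left[ 4\sqrt{ab} + 2a\sqrt{2c} \left( 1 + O(c^{-1} a^{-1}) \right) \right]^2 + O(\lambda)$.
This is impossible, since $|Z| \le 1$. This means that for large enough $a$ and $c$ (that is, $\beta$ large), $D$ cannot be zero. It means that the resolvent is well-defined as $p = 0$ is approached from the closure of $H$.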
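The impossibility argument can be illustrated numerically. Dropping all error terms and using $|n_1| = 1$ for real $a, b > 0$, the condition $D = 0$ forces $|Z^2| = |A - B|/|A + B|$ with $A = 2\sqrt{2}\, a c^{1/2}(1 - i)$ and $B = 4i\sqrt{ba}$. The sketch below (a hedged illustration under exactly these simplifications; the function name is ours) shows that this quantity exceeds 1, which is incompatible with $|Z| \le 1$.

```python
import math

def required_abs_Z_squared(a, b, c):
    """|Z^2| forced by D = 0 at leading order, error terms dropped, |n_1| = 1."""
    A = 2 * math.sqrt(2) * a * math.sqrt(c) * (1 - 1j)   # leading form of 4 a m'/m
    B = 4j * math.sqrt(a * b)                            # 4 r_2
    return abs(A - B) / abs(A + B)

# Illustrative parameter triples (a, b, c); any positive values behave the same way.
samples = [(10.0, 1.0, 5.0), (100.0, 0.5, 50.0), (3.0, 2.0, 1.0)]
values = [required_abs_Z_squared(a, b, c) for a, b, c in samples]
print(values)
```

The reason is visible in the two moduli: $|A - B|^2$ and $|A + B|^2$ share the term $(2a\sqrt{2c})^2$ and differ only through $(2a\sqrt{2c} \pm 4\sqrt{ab})^2$.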
Note 58. Note that the denominator of A in (232) vanishes at points in the region |Z | > 1, where, as a result, the resolvent Rβ has poles. From the relation between Z and p, it follows that p = 0 is an accumulation point of a sequence of poles in the left half plane approaching zero tangentially to iR.
5.7. Stationary phase analysis needed to calculate the ionization rate. We know that the solution $\hat{y}(is, x)$ is analytic in the extended parameter $(\sqrt{is}, Z)$, where
$Z = \exp\left[ i\pi b/(2\sqrt{s}) \right]$. (237)
So, for $X = (\sqrt{is}, Z) \in D^+ \times D$,
$\hat{y}(is, x) = \sum_{l=0}^{\infty} s^{l/2} F_l(Z)$. (238)
Consider
$G(s) \equiv \sum_{l=4}^{\infty} s^{l/2} F_l\left( \exp\left[ \frac{i\pi b}{2\sqrt{s}} \right] \right)$. (239)
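It is clear that $G(s)$ is a $C^1$ function of $s$ in $[-a, a]$. Integration by parts gives
$\int_{-a}^{a} G(s) e^{ist}\, ds = O(t^{-1})$. (240)
Now note that
$F_l\left( \exp\left[ \frac{i\pi b}{2\sqrt{s}} \right] \right) = \sum_{j \ge 0} D_{j,l} \exp\left[ i \frac{\pi b j}{2\sqrt{s}} \right]$ (241)
with $D_{j,l}$ decreasing exponentially with $j$, because of analyticity of $F_l(Z)$ for $|Z| \le 1$. For $0 \le l \le 3$, it follows that there exist constants $c$ and $C$ independent of $j$ so that
$\sum_{l=0}^{3} |D_{j,l}| \le C e^{-cj}$. (242)
It follows that for large $t$, we have
$\left| \sum_{l=1}^{3} \sum_{j=[\sqrt{t}]+1}^{\infty} D_{j,l} \int_{-a}^{a} \exp\left[ \frac{i\pi b j}{2\sqrt{s}} \right] e^{ist} s^{l/2}\, ds \right| \le C_1 e^{-c\sqrt{t}}$. (243)
Further, for large $t$,
$\left| \sum_{l=0}^{3} D_{0,l} \int_{-a}^{a} e^{ist} s^{l/2}\, ds \right| \le \frac{C_2}{t}$. (244)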
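Therefore,
$\int_{-a}^{a} \sum_{l=0}^{3} s^{l/2} F_l(Z)\, e^{ist}\, ds = \sum_{0 \le l \le 3} \int_{-a}^{a} s^{l/2} F_l(Z) e^{ist}\, ds \sim \sum_{0 \le l \le 3} \sum_{j=1}^{[\sqrt{t}]} D_{j,l} \int_{-a}^{a} s^{l/2} \exp\left[ i\left( st + j \frac{\pi b}{2\sqrt{s}} \right) \right] ds + O\left( \frac{1}{t} \right)$. (245)
We first evaluate the terms of the form
$\int_{-a}^{a} s^{l/2} e^{its + i d_j/\sqrt{s}}\, ds$ (246)
for large $t$, where
$d_j = \frac{j\pi b}{2}$.
The contribution from $\int_{-a}^{0}$ is obviously small, at most $O(1/t)$, uniformly for all $t$, since the integrand vanishes exponentially as $s \to 0^-$. So we only consider, for $1 \le j \le \sqrt{t}$,
$\int_0^a s^{l/2} e^{its + i d_j s^{-1/2}}\, ds$. (247)
We have a point of stationary phase at $s = s_{0,j}$, where
$s_{0,j} = \left( \frac{d_j}{2t} \right)^{2/3}$. (248)
Note that $s_{0,j} \ll 1$ for $t$ large since $j$ is restricted to $j \le \sqrt{t}$. It is then convenient to rescale $s = s_{0,j}\, q$, to obtain
$s_{0,j}^{1+l/2} \int_0^{a/s_{0,j}} \exp\left[ i\nu_j \left( q + 2q^{-1/2} \right) \right] q^{l/2}\, dq$, where $\nu_j = 2^{-2/3} d_j^{2/3} t^{1/3}$. (249)
Using standard stationary phase arguments we obtain that, for large $t$, and hence large $\nu_j$,
$\left| s_{0,j}^{1+l/2} \int_0^{a/s_{0,j}} \exp\left[ i\nu_j \left( q + 2q^{-1/2} \right) \right] q^{l/2}\, dq - \frac{\sqrt{2\pi}\, s_{0,j}^{1+l/2}\, e^{i\nu_j} e^{-i\pi/4}}{\sqrt{\nu_j}} \right| \le C\, \frac{s_{0,j}^{1+l/2}}{\nu_j}$. (250)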
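The stationary point (248) and the rescaled frequency $\nu_j$ in (249) can be sanity-checked in a few lines. With phase $\varphi(s) = ts + d_j s^{-1/2}$, the condition $\varphi'(s) = 0$ reads $t = \tfrac{1}{2} d_j s^{-3/2}$, and after the substitution $s = s_{0,j} q$ the two terms of the phase become $\nu_j q$ and $2\nu_j q^{-1/2}$. The Python sketch below (the function name and the numerical values are our own illustrative choices) verifies these three identities in floating point.

```python
def stationary_point(d_j, t):
    """Return (s0, nu): the stationary point of (248) and nu_j of (249)."""
    s0 = (d_j / (2 * t)) ** (2 / 3)
    nu = 2 ** (-2 / 3) * d_j ** (2 / 3) * t ** (1 / 3)
    return s0, nu

d_j, t = 3.0, 1.0e4                       # illustrative values
s0, nu = stationary_point(d_j, t)

phase_derivative = t - 0.5 * d_j * s0 ** (-1.5)   # phi'(s0), should vanish
linear_term = t * s0                              # coefficient of q, should equal nu
singular_term = d_j * s0 ** (-0.5)                # coefficient of q^(-1/2), should equal 2 nu
print(phase_derivative, linear_term - nu, singular_term - 2 * nu)
```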
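For large $t$, the dominant contribution comes from the term with $l = 0$ and so
$\left| \int_{-a}^{a} \hat{y}(is, x) e^{ist}\, ds - \sum_{j=0}^{\sqrt{t}} D_{j,0} \frac{\sqrt{2\pi}\, s_{0,j}\, e^{i\nu_j} e^{-i\pi/4}}{\sqrt{\nu_j}} \right| \le C t^{-1} \sum_{j=0}^{\sqrt{t}} e^{-cj} \le C_1 t^{-1}$. (251)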
The sum over $j$ is clearly convergent because of the exponential decay of $D_{j,0}$; hence $\sqrt{t}$ in the upper limit can be replaced by $\infty$. From the definition of $s_{0,j}$ and $\nu_j$, it follows that
$\int_{-a}^{a} \hat{y}(is, x) e^{ist}\, ds = O\left( t^{-5/6} \right)$. (252)
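The exponent $-5/6$ is a matter of power counting: for $l = 0$ each term in the sum has size $s_{0,j}/\sqrt{\nu_j}$, with $s_{0,j} \propto t^{-2/3}$ and $\nu_j \propto t^{1/3}$, so it scales as $t^{-2/3 - 1/6}$. A minimal numerical check of this scaling follows (a sketch; the helper names and the values of $d_j$ and $t$ are ours and purely illustrative).

```python
import math

def leading_term_size(d_j, t):
    """Size of the l = 0 stationary phase term, sqrt(2 pi) s0 / sqrt(nu)."""
    s0 = (d_j / (2 * t)) ** (2 / 3)
    nu = 2 ** (-2 / 3) * d_j ** (2 / 3) * t ** (1 / 3)
    return math.sqrt(2 * math.pi) * s0 / math.sqrt(nu)

def decay_exponent(d_j, t1, t2):
    """Log-log slope of the term size between times t1 and t2."""
    ratio = leading_term_size(d_j, t2) / leading_term_size(d_j, t1)
    return math.log(ratio) / math.log(t2 / t1)

slope = decay_exponent(1.5, 1.0e3, 1.0e6)
print(slope)
```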
At all other singular points, p = inω, n ∈ Z, the behavior is similar, and a similar calculation gives a einωt t −5/6 contribution. Since Y ∈ H, there is sufficient decay in n to ensure that the sum over all such contributions is convergent. 5.8. Calculation of jk . Substituting the explicit expressions for m k (r ) and m (k−1) (r ), it may be checked that in both cases, τ = 0 and τ = 1, corresponding to (i) and (ii) respectively (2)
(1)
(0)
jk = k 2 α 2 s(r ) jk + kα 2 s(r ) jk + jk , where (253) √ H (αk)H (ζ − ζ /k) H (ζ ) l(l + 1) 4 (r ) H (ζ ) 4 + − jk(2) = 2 2 1 − − , α s H (α(k − 1))H (ζ ) H (ζ ) ζ2 αs(r ) H (ζ ) H (αk)H (ζ − ζ /k)) b 2(1 − 2τ ) (1) 1 − + jk = − 2 2 α s H (α(k − 1))H (ζ ) αζ ! √ 2τ ωs H (ζ ) + − − + √ , αs 2α H (ζ ) 2 α (1 + 2τ ) ω2 s3 5s 2 ωs2 s − ωs + − (ωn 0 − i p1 )s, − − 162 43/2 4 4 16 1√ where s(r ) = r (s)ds, ζ = kαr and jk := s[Lk m k − m k−1 ]/m k . Recall that H (ζ ) satisfies ω (0)(1 + 2ζ ) τ l(l + 1) b + H H, (254) H = 2 1 − + + − 2kα 2 4kα(0) 2k ζ2 αζ k jk(0) =
where
√ α=2
(0) , s(0)
(255)
and that H (ζ ) has the following asymptotic behavior: l(l +1) b log ζ 1 , (ζ, k → ∞, , ζ ≤ kα) . H (ζ ) ∼ 1+ + log ζ + O , 2ζ 2kα kζ ζ 2 (2)
(1)
(256)
Now, we claim that for any r ∈ (0, 1), | jk + k −1 jk | ≤ Ck −2 . In the regime r 1, we use Taylor expansion: 1 1 ζ ζ , s = s(0) − (0) (257) = (0) + (0) + O + O kα k2 kα k2 √ (2) (1) and substitute r = ζ /(kα) in jk + k −1 jk ; we then use α = 2 (0)/s(0), (254) (2) and the asymptotic behavior (256) to evaluate H (αk) and H (α(k − 1)) to find jk +
$k^{-1} j_k^{(1)} \sim k^{-2} g(\zeta)$ for some bounded differentiable function $g(\zeta)$, with asymptotic behavior $g(\zeta) \sim \mathrm{const.}/\zeta$ for large $\zeta$. When $r$ is not small, we use the asymptotic behavior (256) to evaluate all terms involving the function $H$ and to find the same inequality $|j_k^{(2)} + k^{-1} j_k^{(1)}| \le Ck^{-2}$. Therefore, $j_k(r) = O(1)$ in all regimes. Further, it is easily checked that in the regime $k \gg \zeta \gg 1$, $j_k(r) = O(1, \zeta^{-1}) = O(1/(kr), 1)$. Since the asymptotics is differentiable (since $H$ satisfies a second order differential equation), it follows that $j_k'(r) = O(k^{-1} r^{-2}, 1)$. When $r$ is not small, using (256), it is readily checked that $j_k' = O(1)$.
5.9. Generalizations. In fact, the same asymptotic arguments hold more generally if
$V(t, x) = \sum_{j=-M}^{M} e^{ij\omega t}\, \Omega_j(r)$
with $\Omega_j(r)$ satisfying the conditions we used for $\Omega$. We substitute for $r = O(1)$,
$g_{n_0 - k}(r) = \frac{c_*}{\Gamma(2k/M + 1)} \exp\left[ k \log f_0(r) + \sum_{j=1}^{M} k^{1 - j/M} f_j(r) \right]$,
and calculate the error term Rk as before. By requiring that the O(k 2−2 j/M ) terms vanish for j = 0, .., M, we obtain (M + 1) first order differential equations for f j . To leading order 1 2/M f 0 (r ) = −M (s)ds . r
The expressions for f j (r ) for j 1 are more complicated and involve arbitrary constants to be determined from the information for small k at r = 1. Again because of the presence of r −2 l(l + 1) in Lk , the remainder is O(r −2 ), which is O(k 2 ) when r = O(k −1 ). We write ⎡ ⎤ M H (αkr ) . (258) gn 0 −k (r ) ∼ c∗ exp ⎣k log f 0 (r ) + k 1− j/M f j (r )⎦ (2k/M + 1) j=1
Then, if ζ = O(1), we find to leading order H (ζ ) ∼ H0 (ζ ), where l(l + 1) H0 = 0, ζ2 1 where now α = 2 −M (0)/s(0) and s(r ) = r −M (s). As for M = 1, we have to require H0 (ζ ) ∼ 1 as ζ → ∞. This leads to 2 ζ 1/2 H0 (ζ ) = e ζ K l+ 1 (ζ ). 2 π H0 − 2H0 −
For nonzero gn 0 −k , the constant multiple in (258) is expected to be nonzero. On the other hand, the asymptotic behavior as ζ ↓ 0, H0 (ζ ) ∼ c∗ ζ −l implies that the behavior at r = 0 of gn 0 −k /r is not acceptable unless every gn vanishes identically.
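The properties of $H_0$ used here can be verified directly for integer $l$, because $K_{l+1/2}$ is elementary: $K_{l+1/2}(\zeta) = \sqrt{\pi/(2\zeta)}\, e^{-\zeta} \sum_{k=0}^{l} \frac{(l+k)!}{k!\,(l-k)!\,(2\zeta)^k}$. The sketch below assumes that the leading-order equation reads $H_0'' - 2H_0' - l(l+1)\zeta^{-2} H_0 = 0$ and that $H_0(\zeta) = \sqrt{2/\pi}\, \zeta^{1/2} e^{\zeta} K_{l+1/2}(\zeta)$, so that $H_0$ is a polynomial in $1/\zeta$; the function names are our own. It checks the differential equation, the normalization $H_0 \to 1$ as $\zeta \to \infty$, and the behavior $\zeta^l H_0(\zeta) \to \mathrm{const}$ as $\zeta \downarrow 0$.

```python
from math import factorial

def coefficients(l):
    """Coefficients c_k of H0(zeta) = sum_k c_k zeta^(-k), k = 0..l."""
    return [factorial(l + k) / (factorial(k) * factorial(l - k) * 2**k)
            for k in range(l + 1)]

def H0(zeta, l):
    """sqrt(2/pi) zeta^(1/2) e^zeta K_{l+1/2}(zeta), written as a polynomial in 1/zeta."""
    return sum(c * zeta ** (-k) for k, c in enumerate(coefficients(l)))

def ode_residual(zeta, l):
    """H0'' - 2 H0' - l(l+1)/zeta^2 H0, using exact termwise derivatives."""
    c = coefficients(l)
    h = sum(ck * zeta ** (-k) for k, ck in enumerate(c))
    h1 = sum(-k * ck * zeta ** (-k - 1) for k, ck in enumerate(c))
    h2 = sum(k * (k + 1) * ck * zeta ** (-k - 2) for k, ck in enumerate(c))
    return h2 - 2 * h1 - l * (l + 1) / zeta**2 * h

l = 3
print(ode_residual(2.0, l))            # vanishes up to rounding
print(H0(1.0e6, l))                    # tends to 1 for large zeta
print(1.0e-6**l * H0(1.0e-6, l))       # tends to the top coefficient c_l for small zeta
```

For $l \ge 1$ the top coefficient is nonzero, which is the $\zeta^{-l}$ singularity invoked in the text.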
* The analysis is likely to extend to systems with $H_C$ replaced by $H_W = -\Delta - b/r + W(r)$, where $b$ may be zero and $W(r) = O(r^{-1-\epsilon})$ for large $r$ and is in $L^\infty(\mathbb{R}^3)$. Under these assumptions, $W(r)$ does not participate in the asymptotics, to the orders relevant to the proofs.
5.10. Further remarks on the asymptotics.
Remark 59. A weaker statement than Theorem 4 suffices to complete the proof of Theorem 1. For instance, it suffices to show that for sufficiently large $j$, $|R_{k_j}| < 1$, where $r^{l+1} v_{n_0 - k_j}(r) = i^{k_j} r^l m_{k_j}(r) [1 + R_{k_j}(r)]$.
Remark 60. Stronger results than those in Proposition 36 hold. Noting that for any integer $q \ge 0$ we have
$\left\| A_{k_j+q} \cdots A_{k_j+2} A_{k_j+1} A_{k_j} [\tilde{h} - 1] \right\|_\infty \le \prod_{q'=0}^{\infty} \left( 1 + \frac{c_*}{(k_j + q')^2} \right) \| \tilde{h} - 1 \|_\infty$,
while
$A_{k_j+q} \cdots A_{k_j+2} A_{k_j+1} A_{k_j} [1] = 1 + O(k_j^{-1})$,
and the fact that $\| H_{k_j+q} \tilde{h} \|_\infty \le c_* k_j^{-2}$, it follows that the sequence $\tilde{h}_k$, satisfying
$\tilde{h}_k = A_k \tilde{h}_{k-1} + H_k \tilde{h}_{k+1}$,
has the property $\lim_{k \to \infty} \tilde{h}_k = 1$. Indeed, this is in accordance with the heuristic arguments presented in § 5.9. While these results completely justify the formal asymptotics, they are not needed in the proofs and we omit the details.
Acknowledgements. We thank R. D. Costin, S. Goldstein, W. Schlag, A. Soffer and C. Stucchio for very useful discussions. We are very grateful to Kenji Yajima for many useful comments and suggestions on earlier drafts of this paper. Work supported in part by NSF Grants DMS-0100495, DMS-0406193, DMS-0600369, DMS-0100490, DMS-0807266, DMR 01-279-26 and AFOSR grant AF-FA9550-04. O. C. and J. L. L. acknowledge the partial support from IAS and IHES and S.T. acknowledges support by the EPSRC and the Mathematics Institute at Imperial College during his 2005-2006 stay. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
References
1. Abramowitz, M., Stegun, I.A.: Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. New York: Wiley-Interscience, 1984
2. Agmon, S.: Spectral properties of Schrödinger operators and scattering theory. Ann. Scuola. Norm. Sup. Pisa, Ser. IV 2, 151–218 (1975)
3. Agmon, S.: Analyticity properties in scattering and spectral theory for Schrödinger operators with long-range radial potentials. Duke Math. J. 68(2), 337–399 (1992)
4. Belissard, J.: Stability and instability in quantum mechanics. In: Trends and Developments in the Eighties, Albeverio, S., Blanchard, Ph. (eds.) Singapore: World Scientific, 1985, pp. 1–106
5. Bourgain, J.: On long-time behaviour of solutions of linear Schrödinger equations with smooth time-dependent potential. In: Geometric Aspects of Functional Analysis, Lecture Notes in Math. 1807, Berlin: Springer, 2003, pp. 99–113
6. Bourgain, J.: Growth of Sobolev norms in linear Schrödinger equations with quasi-periodic potential. Commun. Math. Phys. 204(1), 207–240 (1999)
7. Bourgain, J.: On growth of Sobolev norms in linear Schrödinger equations with smooth time-dependent potential. J. Anal Math. 77, 315–348 (1999)
8. Bourgain, J.: Fourier transform restriction phenomena for certain lattice subsets and applications to nonlinear evolution equations. I. Schrödinger equations. Geom. Funct. Anal 3(2), 107–156 (1993)
9. Buchholz, H.: The Confluent Hypergeometric Function. Berlin-Heidelberg-New York: Springer-Verlag, 1969
10. Costin, O., Costin, R.D., Lebowitz, J.L.: Transition to the Continuum of a Particle in Time-Periodic Potentials, Advances in Differential Equations and Mathematical Physics, AMS Contemporary Mathematics 327 ed. Karpeshina, Yu., Stolz, C., Weikard, R., Zeng, Y. Providence, RI: Amer. Math. Soc., 2003, pp. 75–86
11. Costin, O., Lebowitz, J.L., Rokhlenko, A.: Exact results for the ionization of a model quantum system. J. Phys. A: Math. Gen. 33, 1–9 (2000)
12. Costin, O., Costin, R.D., Lebowitz, J.L., Rokhlenko, A.: Evolution of a model quantum system under time periodic forcing: conditions for complete ionization. Commun. Math. Phys. 221(1), 1–26 (2001)
13. Costin, O., Rokhlenko, A., Lebowitz, J.L.: On the Complete Ionization of a Periodically Perturbed Quantum System. CRM Proceedings and Lecture Notes 27, Providence, RI: Amer. Math. Soc., 2001, pp. 51–61
14. Costin, O., Soffer, A.: Resonance theory for Schrödinger operators. Commun. Math. Phys. 224, 133–152 (2001)
15. Costin, O., Costin, R.D., Lebowitz, J.L.: Time asymptotics of the Schrödinger wave function in time-periodic potentials. J. Stat. Phys. 116(1–4), 283–310 (2004)
16. Costin, O., Lebowitz, J.L., Stucchio, C.: Ionization in a one-dimensional dipole model. Rev. Math. Phys. 7, 835–872 (2008)
17. Treves, F.: Basic Linear Partial Differential Equations. London-New York: Academic Press, 1975
18. Costin, O., Lebowitz, J.L., Stucchio, C., Tanveer, S.: Exact results for ionization of model atomic systems. J. Math Phys. 51, 015211 (2010)
19. Cycon, H.L., Froese, R.G., Kirsch, W., Simon, B.: Schrödinger Operators. Berlin-Heidelberg-New York: Springer-Verlag, 1987
20. Galtbayar, A., Jensen, A., Yajima, K.: Local time-decay of solutions to Schrödinger equations with time-periodic potentials. J. Stat. Phys. 116(1–4), 231–282 (2004)
21. Goldberg, M.: Strichartz estimates for the Schrödinger equation with time-periodic $L^{n/2}$ potentials. J. Funct. Anal. 256(3), 718–746 (2009)
22. Hislop, P.D., Sigal, I.M.: Introduction to Spectral Theory with Applications to Schrödinger Operators. Applied Mathematical Sciences 113, Berlin-Heidelberg-New York: Springer, 1996
23. Hörmander, L.: Linear Partial Differential Operators. Berlin-Heidelberg-New York: Springer, 1963
24. Howland, J.S.: Stationary scattering theory for time dependent Hamiltonians. Math. Ann. 207, 315–335 (1974)
25. Jauslin, H.R., Lebowitz, J.L.: Spectral and stability aspects of quantum Chaos. Chaos 1, 114–121 (1991)
26. Hostler, L., Pratt, R.H.: Coulomb’s Green’s function in closed form. Phys. Rev. Lett. 10(11), 469–470 (1963)
27. Jensen, A.: High energy resolvent estimates for generalized many-body Schrödinger operators. Publ. RIMS, Kyoto U. 25, 155–167 (1989)
28. Kato, T.: Perturbation Theory for Linear Operators. Berlin-Heidelberg-New York: Springer Verlag, 1995
29. Koch, P.M., van Leeuven, K.A.H.: The importance of resonances in microwave “Ionization” of excited hydrogen atoms. Phys. Repts. 255, 289–403 (1995)
30. Miller, P.D., Soffer, A., Weinstein, M.I.: Metastability of breather modes of time dependent potentials. Nonlinearity 13, 507–568 (2000)
31. Reed, M., Simon, B.: Methods of Modern Mathematical Physics. New York: Academic Press, 1972
32. Möller, J.S., Skibsted, E.: Spectral theory of time-periodic many-body systems. Adv. Math. 188(1), 137–221 (2004)
33. Möller, J.S.: Two-body short-range systems in a time-periodic electric field. Duke Math. J. 105(1), 135–166 (2000)
34. Rodnianski, I., Tao, T.: Long-time Decay Estimates for Schrödinger Equations on Manifolds. Ann. of Math. Stud. 163, Princeton, NJ: Princeton Univ. Press, 2007
35. Rokhlenko, A., Costin, O., Lebowitz, J.L.: Decay versus survival of a local state subjected to harmonic forcing: exact results. J. Phys. A: Mathematical and General 35, 8943 (2002)
36. Schlag, W., Rodnianski, I.: Time decay for solutions of Schrödinger equations with rough and time-dependent potentials. Invent. Math 3, 451–513 (2004)
37. Herbst, I., Möller, J.S., Skibsted, E.: Asymptotic completeness for N-body Stark Hamiltonians. Commun. Math. Phys. 174(3), 509–535 (1996)
38. Merzbacher, E.: Quantum Mechanics. 3rd ed., New York: Wiley, 1998
39. Simon, B.: Schrödinger operators in the twentieth century. J. Math. Phys. 41, 3523 (2000)
40. Slater, L.J.: Confluent hypergeometric functions. Cambridge: Cambridge University Press, 1960
41. Soffer, A., Weinstein, M.I.: Nonautonomous Hamiltonians. J. Stat. Phys. 93, 359–391 (1998)
42. Wasow, W.: Asymptotic Expansions for Ordinary Differential Equations. New York: Interscience Publishers, 1968
43. Yajima, K.: Resonances for the AC-Stark effect. Commun. Math. Phys. 87(3), 331–352 (1982/83)
44. Graffi, S., Yajima, K.: Exterior complex scaling and the AC-Stark effect in a Coulomb field. Commun. Math. Phys. 89(2), 277–301 (1983)
45. Yajima, K.: Scattering theory for Schrödinger equations with potentials periodic in time. J. Math. Soc. Japan 29, 729 (1977)
46. Yajima, K.: Existence of solutions of Schrödinger evolution equations. Commun. Math. Phys. 110, 415 (1987)
Communicated by M. Aizenman
Commun. Math. Phys. 296, 739–767 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1027-6
Communications in
Mathematical Physics
Statistical Stability and Continuity of SRB Entropy for Systems with Gibbs-Markov Structures José F. Alves, Maria Carvalho, Jorge Milhazes Freitas Centro de Matemática da Universidade do Porto, Rua do Campo Alegre 687, 4169-007 Porto, Portugal. E-mail:
[email protected];
[email protected];
[email protected] Received: 7 May 2009 / Accepted: 22 November 2009 Published online: 12 March 2010 – © Springer-Verlag 2010
Abstract: We present conditions on families of diffeomorphisms that guarantee statistical stability and SRB entropy continuity. They rely on the existence of horseshoe-like sets with infinitely many branches and variable return times. As an application we consider the family of Hénon maps within the set of Benedicks-Carleson parameters.
Contents
1. Introduction
1.1 Gibbs-Markov structure
1.2 Uniform families
1.3 Statement of results
2. Quotient Dynamics and Lifting Back
2.1 The natural measure
2.2 Lifting to the Gibbs-Markov structure
2.3 Entropy formula
3. Statistical Stability
3.1 Convergence of the densities on the reference leaf
3.2 Continuity of the SRB measures
4. Entropy Continuity
4.1 Auxiliary results
4.2 Convergence of metric entropies
References
1. Introduction A physical measure for a smooth map f : M → M on a manifold M is a Borel probability measure µ on M for which there is a positive Lebesgue measure set of points
$x \in M$, called the basin of $\mu$, such that
$\frac{1}{n} \sum_{j=0}^{n-1} \delta_{f^j(x)} \xrightarrow[n \to \infty]{} \mu$ (1.1)
in the weak* topology, where δz stands for the Dirac measure on z ∈ M. Sinai, Ruelle and Bowen showed the existence of physical measures for Axiom A smooth dynamical systems. These were obtained as equilibrium states for the logarithm of the Jacobian along the unstable direction. Besides, such probability measures exhibit positive Lyapunov exponents and conditionals which are absolutely continuous with respect to Lebesgue measure on local unstable leaves; probability measures with the latter properties are nowadays known as Sinai-Ruelle-Bowen measures (SRB measures, for short). Statistical properties and their stability have met with wide interest, particularly in the context of dynamical systems which do not satisfy classical structural stability. This may be checked through the continuous variation of the SRB measures, referred to in [AV] as statistical stability. Another characterization of stability addresses the continuity of the metric entropy of SRB measures. Although an old issue, going back to [N] and [Y1] for example, this continuity (topological or metric) is in general a hard problem. Notice that for families of smooth diffeomorphisms verifying the entropy formula, see [LY2], and whose Jacobian along the unstable direction depends continuously on the map, the entropy continuity is an immediate consequence of the statistical stability. This holds for instance in the setting of Axiom A attractors whose statistical stability was established in [R] and [M]. The regularity of the SRB entropy for Axiom A flows was proved in [C]. Analyticity of metric entropy for Anosov diffeomorphisms was proved in [P]. More recently, statistical stability for families of partially hyperbolic diffeomorphisms with non-uniformly expanding centre-unstable direction was established in [V]. Due to the continuous variation of the centre-unstable direction in the partial hyperbolicity context, the entropy continuity follows as in the Axiom A case. 
Statistical stability for Hénon maps within Benedicks-Carleson parameters has been proved in [ACF]; the entropy continuity for this family is a more delicate issue, since the lack of partial hyperbolicity, mostly due to the presence of “critical” points, originates a highly irregular behavior of the unstable direction. In the endomorphism setting, many advances have been obtained for important families of maps, for instance in [RS,T2,T1,AV,A,F,FT] concerning statistical stability, and in [AOT] for the entropy continuity. Actually, our main theorem may be regarded as a version for diffeomorphisms of the entropy continuity result in [AOT]. In this work we give sufficient conditions on families of smooth diffeomorphisms for the statistical stability and the continuous variation of the SRB entropies. The families we study here, though having directions of non-uniform expansion, do not allow the approach of the hyperbolic case, since no continuity assumptions on these directions with the map will be assumed. Instead, we consider diffeomorphisms admitting Gibbs-Markov structures as in [Y2] that may be thought of as “horseshoes” with infinitely many branches and variable return times. This is mainly motivated by the important class of Hénon maps presented in the next paragraph. Our assumptions, which have a geometrical and dynamical nature, ensure in particular the existence of SRB measures. Gibbs-Markov structures were used in [Y2] to derive decay of correlations and the validity of the Central Limit Theorem for the SRB measure. Here we prove that under some additional uniformity requirements on the family we obtain statistical stability and SRB entropy continuity.
The major application of our main result concerns the Benedicks-Carleson family of Hénon maps,
$f_{a,b} : \mathbb{R}^2 \to \mathbb{R}^2$, $(x, y) \mapsto (1 - ax^2 + y, bx)$. (1.2)
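For concreteness, the map (1.2) and its fixed points are easy to compute. The following Python sketch is only an illustration: the function names are ours, the parameter values $a = 1.4$, $b = 0.3$ are the classical ones and are not claimed to belong to the Benedicks-Carleson set, and we assume that the fixed point of interest is the one with $x > 0$. A fixed point satisfies $y = bx$ and $ax^2 + (1 - b)x - 1 = 0$.

```python
import math

def henon(a, b, x, y):
    """One iterate of the Henon map f_{a,b}(x, y) = (1 - a x^2 + y, b x) of (1.2)."""
    return 1 - a * x * x + y, b * x

def fixed_point(a, b):
    """Fixed point of f_{a,b} with x > 0: root of a x^2 + (1 - b) x - 1 = 0, y = b x."""
    x = (-(1 - b) + math.sqrt((1 - b) ** 2 + 4 * a)) / (2 * a)
    return x, b * x

a, b = 1.4, 0.3                 # classical illustrative parameters
x_star, y_star = fixed_point(a, b)
image = henon(a, b, x_star, y_star)
print((x_star, y_star), image)
```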
For small $b > 0$ values, $f_{a,b}$ is strongly dissipative, and may be seen as an “unfolded” version of a quadratic interval map. It is known that for small $b$ there is a trapping region whose topological attractor coincides with the closure of the unstable manifold $W$ of a fixed point $z^*_{a,b}$ of $f_{a,b}$. In [BC] it was shown that for each sufficiently small $b > 0$ there is a positive Lebesgue measure set of parameters $a \in [1, 2]$ for which $f_{a,b}$ has a dense orbit in $W$ with a positive Lyapunov exponent, which makes this a non-trivial and strange attractor. We denote by $BC$ the set of those parameters $(a, b)$ and call it the Benedicks-Carleson family of Hénon maps. As shown in [BY1], each of these nonhyperbolic attractors supports a unique SRB measure $\mu_{a,b}$, whose main features were further studied in [BY2,BV1,BV2]. In [BY2] a Gibbs-Markov structure was built for each $f_{a,b}$ with $(a, b) \in BC$, which has been used to obtain statistical behavior of Hölder observables. These structures have also been used in [ACF] to deduce the statistical stability of this family. In this work we add the metric entropy continuity with respect to these measures.
1.1. Gibbs-Markov structure. Let $f : M \to M$ be a $C^k$ diffeomorphism ($k \ge 2$) defined on a finite dimensional Riemannian manifold $M$, endowed with a normalized volume form on the Borel sets that we denote by Leb and call Lebesgue measure. Given a submanifold $\gamma \subset M$ we use $\mathrm{Leb}_\gamma$ to denote the measure on $\gamma$ induced by the restriction of the Riemannian structure to $\gamma$. An embedded disk $\gamma \subset M$ is called an unstable manifold if $\mathrm{dist}(f^{-n}(x), f^{-n}(y)) \to 0$ as $n \to \infty$ for every $x, y \in \gamma$. Similarly, $\gamma$ is called a stable manifold if $\mathrm{dist}(f^n(x), f^n(y)) \to 0$ as $n \to \infty$ for every $x, y \in \gamma$.
Definition 1. Let $D^u$ be the unit disk in some Euclidean space and $\mathrm{Emb}^1(D^u, M)$ be the space of $C^1$ embeddings from $D^u$ into $M$.
We say that $\Gamma^u = \{\gamma^u\}$ is a continuous family of $C^1$ unstable manifolds if there is a compact set $K^s$ and $\Phi^u : K^s \times D^u \to M$ such that
i) $\gamma^u = \Phi^u(\{x\} \times D^u)$ is an unstable manifold;
ii) $\Phi^u$ maps $K^s \times D^u$ homeomorphically onto its image;
iii) $x \mapsto \Phi^u|(\{x\} \times D^u)$ defines a continuous map from $K^s$ into $\mathrm{Emb}^1(D^u, M)$.
Continuous families of $C^1$ stable manifolds are defined similarly.

Definition 2. We say that $\Lambda \subset M$ has a hyperbolic product structure if there exist a continuous family of unstable manifolds $\Gamma^u = \{\gamma^u\}$ and a continuous family of stable manifolds $\Gamma^s = \{\gamma^s\}$ such that
i) $\Lambda = (\cup \gamma^u) \cap (\cup \gamma^s)$;
ii) $\dim \gamma^u + \dim \gamma^s = \dim M$;
iii) each $\gamma^s$ meets each $\gamma^u$ in exactly one point;
iv) stable and unstable manifolds meet with angles larger than some $\theta > 0$.
J. F. Alves, M. Carvalho, J. M. Freitas
Let $\Lambda \subset M$ have a hyperbolic product structure, whose defining families are $\Gamma^s$ and $\Gamma^u$. A subset $\Upsilon_0 \subset \Lambda$ is called an s-subset if $\Upsilon_0$ also has a hyperbolic product structure and its defining families $\Gamma_0^s$ and $\Gamma_0^u$ can be chosen with $\Gamma_0^s \subset \Gamma^s$ and $\Gamma_0^u = \Gamma^u$; u-subsets are defined analogously. Given $x \in \Lambda$, let $\gamma^*(x)$ denote the element of $\Gamma^*$ containing $x$, for $* = s, u$. For each $n \ge 1$, let $(f^n)^u$ denote the restriction of the map $f^n$ to $\gamma^u$-disks, and let $\det D(f^n)^u$ be the Jacobian of $D(f^n)^u$. In the sequel $C > 0$ and $0 < \beta < 1$ are constants, and we require the following properties from the hyperbolic product structure $\Lambda$:

(P0). Positive measure: for every $\gamma \in \Gamma^u$ we have $\mathrm{Leb}_\gamma(\Lambda \cap \gamma) > 0$.

(P1). Markovian: there are pairwise disjoint s-subsets $\Upsilon_1, \Upsilon_2, \dots \subset \Lambda$ such that
(a) $\mathrm{Leb}_\gamma\bigl((\Lambda \setminus \cup \Upsilon_i) \cap \gamma\bigr) = 0$ on each $\gamma \in \Gamma^u$;
(b) for each $i \in \mathbb{N}$ there is $\tau_i \in \mathbb{N}$ such that $f^{\tau_i}(\Upsilon_i)$ is a u-subset, and for all $x \in \Upsilon_i$, $f^{\tau_i}(\gamma^s(x)) \subset \gamma^s(f^{\tau_i}(x))$ and $f^{\tau_i}(\gamma^u(x)) \supset \gamma^u(f^{\tau_i}(x))$;
(c) for each $n \in \mathbb{N}$ there are finitely many $i$'s with $\tau_i = n$.

(P2). Contraction on stable leaves: for each $\gamma^s \in \Gamma^s$ and each $y \in \gamma^s(x)$, $\mathrm{dist}(f^n(y), f^n(x)) \le C\beta^n$, $\forall n \ge 1$.

For the last two properties we introduce the return time $R : \Lambda \to \mathbb{N}$ and the induced map $F = f^R : \Lambda \to \Lambda$, which are defined for each $i \in \mathbb{N}$ as $R|_{\Upsilon_i} = \tau_i$ and $f^R|_{\Upsilon_i} = f^{\tau_i}|_{\Upsilon_i}$, and, for each $x, y \in \Lambda$, the separation time $s(x, y)$ is given by
$$s(x, y) = \min\bigl\{ n \ge 0 : (f^R)^n(x) \text{ and } (f^R)^n(y) \text{ lie in distinct } \Upsilon_i\text{'s} \bigr\}.$$

(P3). Regularity of the stable foliation:
(a) for $y \in \gamma^s(x)$ and $n \ge 0$,
$$\log \prod_{i=n}^{\infty} \frac{\det Df^u(f^i(x))}{\det Df^u(f^i(y))} \le C\beta^n;$$
(b) given $\gamma, \gamma' \in \Gamma^u$ we define $\Theta : \gamma' \cap \Lambda \to \gamma \cap \Lambda$ by $\Theta(x) = \gamma^s(x) \cap \gamma$. Then $\Theta$ is absolutely continuous and
$$\frac{d(\Theta_* \mathrm{Leb}_{\gamma'})}{d\,\mathrm{Leb}_\gamma}(x) = \prod_{i=0}^{\infty} \frac{\det Df^u(f^i(x))}{\det Df^u(f^i(\Theta^{-1}(x)))};$$
(c) letting $v(x)$ denote the density in item (b), we have
$$\log \frac{v(x)}{v(y)} \le C\beta^{s(x,y)}, \quad \text{for } x, y \in \gamma \cap \Lambda.$$

(P4). Bounded distortion: for $\gamma \in \Gamma^u$ and $x, y \in \Lambda \cap \gamma$,
$$\log \frac{\det D(f^R)^u(x)}{\det D(f^R)^u(y)} \le C\beta^{s(f^R(x), f^R(y))}.$$
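To make the separation time concrete, here is a minimal sketch on a toy model that is not taken from the paper: the doubling map $x \mapsto 2x \bmod 1$ stands in for the induced map $f^R$, and the two half-intervals $[0, 1/2)$, $[1/2, 1)$ stand in for the sets $\Upsilon_i$. All names are hypothetical. Exact rational arithmetic is used so that no rounding error affects the itineraries.

```python
from fractions import Fraction

def branch(x):
    """Index of the partition element containing x (toy stand-in for the Upsilon_i)."""
    return 0 if x < Fraction(1, 2) else 1

def induced(x):
    """Toy stand-in for the induced map F = f^R: the doubling map on [0, 1)."""
    return (2 * x) % 1

def separation_time(x, y, max_iter=64):
    """Smallest n >= 0 such that F^n(x) and F^n(y) lie in distinct elements."""
    for n in range(max_iter):
        if branch(x) != branch(y):
            return n
        x, y = induced(x), induced(y)
    return max_iter  # orbits not separated within max_iter steps

def distortion_weight(x, y, beta):
    """The quantity beta^{s(x, y)} that controls the bounds in (P3)(c) and (P4)."""
    return beta ** separation_time(x, y)
```

For instance, the points $1/4$ and $3/8$ share their first two symbols and separate at $n = 2$, while their images under the toy induced map separate one step earlier, which is the mechanism behind the exponent $s(f^R(x), f^R(y))$ in (P4).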
Statistical Stability and Continuity of SRB Entropy
Remark 1.1. We do not assume uniform backward contraction along unstable leaves as (P4)(a) in [Y2]. Properties (P3 )(c) and (P4 ) are new if comparing our setup to that in [Y2]. However, these are a consequence of (P4) and (P5) of [Y2] as done in [Y2, Lemma 1]. In spite of the uniform contraction on stable leaves demanded in (P2 ), this is not too restrictive in systems having regions where the contraction fails to be uniform, since we are allowed to remove stable leaves, provided a subset with positive measure of leaves remains in the end. This has been carried out for Hénon maps in [BY2].
1.2. Uniform families. Let $\mathcal{F}$ be a family of $C^k$ maps ($k \ge 2$) from the finite dimensional Riemannian manifold $M$ into itself, and endow $\mathcal{F}$ with the $C^k$ topology. Assume that each map $f \in \mathcal{F}$ admits a Gibbs-Markov structure $\Lambda_f$ as described in Sect. 1.1. Let $\Gamma_f^u = \{\gamma_f^u\}$ and $\Gamma_f^s = \{\gamma_f^s\}$ be its defining families of unstable and stable curves. Denote by $R_f : \Lambda_f \to \mathbb{N}$ the corresponding return time function. Given $f_0 \in \mathcal{F}$, take a sequence $f_n \in \mathcal{F}$ such that $f_n \to f_0$ in the $C^1$ topology as $n \to \infty$. For the sake of notational simplicity, for each $n \ge 0$ we will indicate the dependence of the previous objects on $f_n$ just by means of the index or superscript $n$. If $\gamma_n^u \in \Gamma_n^u$ is sufficiently close to $\gamma_0^u \in \Gamma_0^u$ in the $C^k$ topology, we may define a projection by sliding through the stable manifolds of $\Lambda_0$,
$$H_n : \gamma_n^u \cap \Gamma_0^s \to \gamma_0^u, \qquad z \mapsto \gamma_0^s(z) \cap \gamma_0^u,$$
and set
$$\Omega_0 = \gamma_0^u \cap \Lambda_0, \quad \Omega_0^n = H_n^{-1}(\Omega_0), \quad \Omega_n = \gamma_n^u \cap \Lambda_n, \quad \Omega_n^0 = H_n(\Omega_n \cap \Omega_0^n). \tag{1.3}$$
Given $k \in \mathbb{N}$ and positive integers $i_1, \dots, i_k$, we denote by $\Upsilon^n_{i_1,\dots,i_k}$ the s-sublattice that satisfies $F_n^j(\Upsilon^n_{i_1,\dots,i_k}) \subset \Upsilon^n_{i_j}$ for every $1 \le j < k$ and $F_n^k(\Upsilon^n_{i_1,\dots,i_k}) = \Upsilon^n_{i_k}$.

Definition 3. $\mathcal{F}$ is called a uniform family if the conditions (U0)–(U5) below hold:

(U0). Absolute constants: The constants $C$ and $\beta$ in (P2), (P3) and (P4) can be chosen the same for all $f \in \mathcal{F}$.

(U1). Proximity of unstable leaves: There are unstable leaves $\hat\gamma_0 \in \Gamma_0^u$ and $\hat\gamma_n \in \Gamma_n^u$ such that $\hat\gamma_n \to \hat\gamma_0$ in the $C^1$ topology as $n \to \infty$.

(U2). Matching of structures: Defining the objects of (1.3) with $\hat\gamma_n$ replacing $\gamma_n^u$, we have
$$\mathrm{Leb}_{\hat\gamma_n}\bigl(\Omega_n \,\triangle\, \Omega_0^n\bigr) \to 0, \quad \text{as } n \to \infty.$$

(U3). Proximity of stable directions: For every $z \in \Omega_n^0 \cap \Omega_0$ we have $\gamma_n^s(z) \to \gamma_0^s(z)$ in the $C^1$ topology as $n \to \infty$.

(U4). Matching of s-sublattices: Given $N, k \in \mathbb{N}$ and $\Upsilon^0_{i_1,\dots,i_k}$ with $R_0(\Upsilon^0_{i_j}) \le N$ for $1 \le j \le k$, there is $\Upsilon^n_{\ell_1,\dots,\ell_k}$ such that $R_n(\Upsilon^n_{\ell_j}) = R_0(\Upsilon^0_{i_j})$ for $1 \le j \le k$ and
$$\mathrm{Leb}_{\hat\gamma_0}\Bigl( H_n\bigl(\Upsilon^n_{\ell_1,\dots,\ell_k} \cap \hat\gamma_n\bigr) \,\triangle\, \bigl(\Upsilon^0_{i_1,\dots,i_k} \cap \hat\gamma_0\bigr) \Bigr) \to 0, \quad \text{as } n \to \infty.$$

(U5). Uniform tail: Given $\varepsilon > 0$, there are $N = N(\varepsilon)$ and $J = J(\varepsilon, N)$ such that
$$\sum_{j=N}^{\infty} j\, \mathrm{Leb}_{\hat\gamma_n}\{R_n = j\} < \varepsilon, \quad \forall n > J.$$
This last property ensures in particular that $\int_{\hat\gamma_n} R_n \, d\mathrm{Leb}_{\hat\gamma_n} < \infty$ for large $n$, which by [Y2, Theorem 1] implies the existence of an SRB measure for each $f_n$.

Remark 1.2. Using that stable and unstable manifolds of $f_0$ meet with angles uniformly bounded away from zero at points in $\Lambda_0$, and the proximities given by (U1) and (U3), it follows that there is some $\theta > 0$ such that, for $n$ large enough, the stable manifolds through points in $\Omega_0^n$ meet $\hat\gamma_n$ with an angle bigger than $\theta$. Together with (P3) and (U1), this implies that:
i) $(H_n)_* \mathrm{Leb}_{\hat\gamma_n} \ll \mathrm{Leb}_{\hat\gamma_0}$ with uniformly bounded density;
ii) $\dfrac{d(H_n)_* \mathrm{Leb}_{\hat\gamma_n}}{d\,\mathrm{Leb}_{\hat\gamma_0}} \to 1$ on $L^1(\mathrm{Leb}_{\hat\gamma_0})$, as $n \to \infty$.
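The tail condition (U5) can be explored numerically on a model return-time law. The sketch below is only an illustration under an assumed geometric law $\mathrm{Leb}\{R = j\} = 2^{-j}$, which is not taken from the paper; the helper names are hypothetical. It computes the weighted tail $\sum_{j \ge N} j\,\mathrm{Leb}\{R = j\}$ (truncated at a large index) and searches for the smallest cutoff $N$ for which it drops below a given $\varepsilon$.

```python
def geometric_law(j):
    """Assumed toy law for the return time: Leb{R = j} = 2^{-j}, j >= 1."""
    return 2.0 ** -j

def tail_sum(law, N, j_max=200):
    """Weighted tail sum_{j=N}^{j_max} j * law(j), as in condition (U5)."""
    return sum(j * law(j) for j in range(N, j_max + 1))

def smallest_cutoff(law, eps, j_max=200):
    """Smallest N with tail_sum(law, N) < eps, or None if none is found."""
    for N in range(1, j_max + 1):
        if tail_sum(law, N, j_max) < eps:
            return N
    return None
```

For the geometric law the tail has the closed form $(N+1)\,2^{-(N-1)}$, so the cutoff grows only logarithmically in $1/\varepsilon$. In (U5) the point is that one such $N$ works simultaneously for all large $n$.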
1.3. Statement of results. Consider a family $\mathcal{F}$ such that each $f \in \mathcal{F}$ admits a unique SRB measure $\mu_f$. Letting $\mathbb{P}(M)$ denote the space of probability measures on $M$ endowed with the weak* topology, we say that $\mathcal{F}$ is statistically stable if the map
$$\mathcal{F} \to \mathbb{P}(M), \qquad f \mapsto \mu_f,$$
is continuous. In the sequel $h_{\mu_f}$ denotes the metric entropy of $f$ with respect to the measure $\mu_f$.

Theorem A. Let $\mathcal{F}$ be a uniform family such that each $f \in \mathcal{F}$ admits a unique SRB measure. Then
(1) $\mathcal{F}$ is statistically stable;
(2) $\mathcal{F} \ni f \mapsto h_{\mu_f}$ is continuous.

Corollary B. The family $\mathcal{BC}$ is statistically stable and the map $\mathcal{BC} \ni (a, b) \mapsto h_{\mu_{a,b}}$ is continuous.

This corollary follows immediately after building Gibbs-Markov structures satisfying (P0)–(P4), as was done in [BY2], and verifying the uniformity conditions (U0)–(U5), as in [ACF]. For the sake of clarity, the following list specifies exactly where each property is obtained.
(P0)      [BY2, Proposition A(3)]
(P1)      [BY2, Proposition A(1),(2)]
(P2)      [BY2, Proposition A(2)]
(P3)(a)   [BY2, Sublemma 8]
(P3)(b)   [BY2, Sublemma 10]
(P3)(c)   [BY2, Sublemma 11]
(P4)      [BY2, Sublemma 9]
(U0)      [ACF, Sects. 6,7,8]
(U1)      Hyperbolicity of the fixed point z*
(U2)      [ACF, Sect. 6, in particular Corollary 6.4]
(U3)      [ACF, Sect. 7, in particular Proposition 7.3]
(U4)      [ACF, Sect. 8, in particular Proposition 8.9]
(U5)      [BY2, Proposition A(4)]
Concerning (U0) and (U5), observe that the constants depend exclusively on the maximum value for $b > 0$ and the minimum for $a < 2$ in the choice of Benedicks-Carleson parameters.

2. Quotient Dynamics and Lifting Back

In this section we shall analyze some dynamical features of a diffeomorphism $f$ admitting a Gibbs-Markov structure $\Lambda$ that verifies properties (P0)–(P4). Consider a quotient space $\bar\Lambda$ obtained by collapsing the stable curves of $\Lambda$; i.e. $\bar\Lambda = \Lambda/\!\sim$, where $z \sim z'$ if and only if $z' \in \gamma^s(z)$. Since by (P1)(b) the induced map $F = f^R : \Lambda \to \Lambda$ takes $\gamma^s$ leaves to $\gamma^s$ leaves, the quotient induced map $\bar F : \bar\Lambda \to \bar\Lambda$ is well defined and if $\bar\Upsilon_i$ is the quotient of $\Upsilon_i$, then $\bar F$ takes the sets $\bar\Upsilon_i$ homeomorphically onto $\bar\Lambda$. Given an unstable leaf $\gamma$, the set $\gamma \cap \Lambda$ is suited as a model for $\bar\Lambda$ through the canonical projection $\bar\pi : \Lambda \to \bar\Lambda$. We will see in Sect. 2.1 that we may define a natural reference measure $\bar m$ on $\bar\Lambda$. Besides, $\bar F$ is an expanding Markov map (see Lemma 2.1), thus having an absolutely continuous (w.r.t. $\bar m$), $\bar F$-invariant probability measure $\bar\mu$. Moreover, if $\tilde\mu$ denotes the $F$-invariant measure supported on $\Lambda$ then $\bar\mu = \bar\pi_*(\tilde\mu)$. To build an SRB measure $\mu$ out of $\tilde\mu$ is just a matter of saturating the measure $\tilde\mu$. The existence of the measures $\bar\mu$, $\tilde\mu$ and the fact that $\bar\mu = \bar\pi_*(\tilde\mu)$ follow from standard methods, which can be found for instance in [Y2]. For the sake of completeness we will present the construction of the SRB measure, also having in mind how some properties can be carried up through the lifting. We will accomplish this by adapting some ideas used in the construction of Gibbs states; see [B].

2.1. The natural measure. The purpose of this subsection is to introduce a natural probability measure $\bar m$ on $\bar\Lambda$ and establish some properties of the Jacobian of $\bar F$ with respect to $\bar m$. Moreover, we show the existence of an $\bar F$-invariant density $\bar\rho$ with respect to the measure $\bar m$.

Fix an arbitrary $\hat\gamma \in \Gamma^u$. The restriction of $\bar\pi$ to $\hat\gamma \cap \Lambda$ gives a homeomorphism that we denote by $\hat\pi : \hat\gamma \cap \Lambda \to \bar\Lambda$. Given $\gamma \in \Gamma^u$ and $x \in \gamma \cap \Lambda$, let $\hat x$ be the point in $\gamma^s(x) \cap \hat\gamma$. Defining for $x \in \gamma \cap \Lambda$,
$$\hat u(x) = \prod_{i=0}^{\infty} \frac{\det Df^u(f^i(x))}{\det Df^u(f^i(\hat x))}, \tag{2.1}$$
we have that $\hat u$ satisfies the bounded distortion property (P3)(c). For each $\gamma \in \Gamma^u$ let $m_\gamma$ be the measure in $\gamma$ such that
$$\frac{dm_\gamma}{d\,\mathrm{Leb}_\gamma} = \hat u\, 1_{\gamma \cap \Lambda},$$
where $1_{\gamma \cap \Lambda}$ is the characteristic function of the set $\gamma \cap \Lambda$. These measures have been defined in such a way that if $\gamma, \gamma' \in \Gamma^u$ and $\Theta$ is obtained by sliding along stable leaves from $\gamma \cap \Lambda$ to $\gamma' \cap \Lambda$, then
$$\Theta_* m_\gamma = m_{\gamma'}. \tag{2.2}$$
To verify this let us show that the densities of these two measures with respect to $\mathrm{Leb}_{\gamma'}$ coincide. Take $x \in \gamma \cap \Lambda$ and $x' \in \gamma' \cap \Lambda$ such that $\Theta(x) = x'$. By (P3)(b) one has
$$\frac{d\Theta_* \mathrm{Leb}_\gamma}{d\,\mathrm{Leb}_{\gamma'}}(x') = \frac{\hat u(x')}{\hat u(x)},$$
which implies that
$$\frac{d\Theta_* m_\gamma}{d\,\mathrm{Leb}_{\gamma'}}(x') = \hat u(x)\, \frac{d\Theta_* \mathrm{Leb}_\gamma}{d\,\mathrm{Leb}_{\gamma'}}(x') = \hat u(x') = \frac{dm_{\gamma'}}{d\,\mathrm{Leb}_{\gamma'}}(x').$$
Conditions (P0) and (2.2) allow us to define the reference probability measure $\bar m$ whose representative in each unstable leaf $\gamma \in \Gamma^u$ is exactly $\frac{1}{\mathrm{Leb}_{\hat\gamma}(\Lambda)}\, m_\gamma$.

Let $T : (X_1, m_1) \to (X_2, m_2)$ be a measurable bijection between two probability measure spaces. $T$ is called nonsingular if it maps sets of zero $m_1$ measure to sets of zero $m_2$ measure. For a nonsingular transformation $T$ we define the Jacobian of $T$ with respect to $m_1$ and $m_2$, denoted by $J_{m_1,m_2}(T)$, as the Radon-Nikodym derivative $\frac{dT_*^{-1}(m_2)}{dm_1}$. By assertion (1) of the following lemma it makes sense to consider the Jacobian of the quotient map $\bar F : (\bar\Lambda, \bar m) \to (\bar\Lambda, \bar m)$ that we simply denote $J\bar F$.

Lemma 2.1. Assuming that $F(\gamma \cap \Upsilon_i) \subset \gamma'$ for $\gamma, \gamma' \in \Gamma^u$, let $JF(x)$ denote the Jacobian of $F$ with respect to the measures $m_\gamma$ and $m_{\gamma'}$. Then
(1) $JF(x) = JF(y)$ for every $y \in \gamma^s(x)$;
(2) there is $C_0 > 0$ such that for every $x, y \in \gamma \cap \Upsilon_i$,
$$\left| \frac{JF(x)}{JF(y)} - 1 \right| \le C_0 \beta^{s(F(x),F(y))};$$
(3) for every $k \in \mathbb{N}$ and any $k$ positive integers $i_1, \dots, i_k$, there is $C_1 > 0$ such that for every $x, y \in \Upsilon_{i_1,\dots,i_k} \cap \gamma$,
$$\frac{JF^k(x)}{JF^k(y)} \le C_1.$$

Proof. (1) For $\mathrm{Leb}_\gamma$ almost every $x \in \gamma \cap \Lambda$ we have
$$JF(x) = \det DF^u(x) \cdot \frac{\hat u(F(x))}{\hat u(x)}. \tag{2.3}$$
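Denoting $\varphi(x) = \log |\det Df^u(x)|$ we may write
$$\begin{aligned}
\log JF(x) &= \sum_{i=0}^{R-1} \varphi(f^i(x)) + \sum_{i=0}^{\infty} \Bigl( \varphi(f^i(F(x))) - \varphi\bigl(f^i(\widehat{F(x)})\bigr) \Bigr) - \sum_{i=0}^{\infty} \Bigl( \varphi(f^i(x)) - \varphi(f^i(\hat x)) \Bigr) \\
&= \sum_{i=0}^{R-1} \varphi(f^i(\hat x)) + \sum_{i=0}^{\infty} \Bigl( \varphi(f^i(F(\hat x))) - \varphi\bigl(f^i(\widehat{F(x)})\bigr) \Bigr).
\end{aligned}$$
Thus we have shown that $JF(x)$ can be expressed just in terms of $\hat x$ and $\widehat{F(x)}$, which is enough for proving the first part of the lemma.

(2) It follows from (2.3) that
$$\log \frac{JF(x)}{JF(y)} = \log \frac{\det DF^u(x)}{\det DF^u(y)} + \log \frac{\hat u(F(x))}{\hat u(F(y))} + \log \frac{\hat u(y)}{\hat u(x)}.$$
Observing that $s(x, y) > s(F(x), F(y))$ the conclusion follows from (P3)(c) and (P4).

(3) Again, from (2.3), we obtain
$$\log \frac{JF^k(x)}{JF^k(y)} = \log \frac{\det D(F^k)^u(x)}{\det D(F^k)^u(y)} + \log \frac{\hat u(F^k(x))}{\hat u(F^k(y))} + \log \frac{\hat u(y)}{\hat u(x)}.$$
By (P4) we have
$$\log \frac{\det D(F^k)^u(x)}{\det D(F^k)^u(y)} \le \sum_{l=1}^{k} C\beta^{s(F^l(x),F^l(y))} \le C \sum_{l=0}^{\infty} \beta^l < \infty.$$
The remaining terms are easily controlled once again due to (P3)(c).

Lemma 2.2. The map $\bar F : \bar\Lambda \to \bar\Lambda$ has an invariant probability measure $\bar\mu$ with $d\bar\mu = \bar\rho\, d\bar m$, where $K^{-1} \le \bar\rho \le K$, for some $K = K(C_1, \beta) > 0$.

Proof. We construct $\bar\rho$ as the density with respect to $\bar m$ of an accumulation point of $\bar\mu^{(n)} = \frac{1}{n} \sum_{i=0}^{n-1} \bar F^i_*(\bar m)$. Let $\bar\rho^{(n)}$ denote the density of $\bar\mu^{(n)}$ and $\bar\rho^i$ the density of $\bar F^i_*(\bar m)$. Also, let $\bar\rho^i = \sum_j \bar\rho^i_j$, where $\bar\rho^i_j$ is the density of $\bar F^i_*(\bar m|\sigma^i_j)$ and the $\sigma^i_j$'s range over all components of $\bar\Lambda$ such that $\bar F^i(\sigma^i_j) = \bar\Lambda$.

Consider the normalized density $\tilde\rho^i_j = \bar\rho^i_j / \bar m(\sigma^i_j)$. We have for $\bar x' \in \sigma^i_j$ such that $\bar x = \bar F^i(\bar x')$ and for some $\bar y' \in \sigma^i_j$,
$$\tilde\rho^i_j(\bar x) = \frac{J\bar F^i(\bar y')}{J\bar F^i(\bar x')}\, (\bar m(\bar\Lambda))^{-1} = \prod_{k=1}^{i} \frac{J\bar F(\bar F^{k-1}(\bar y'))}{J\bar F(\bar F^{k-1}(\bar x'))}.$$
By Lemma 2.1(2) we have for every $k = 1, \dots, i$,
$$\frac{J\bar F(\bar F^{k-1}(\bar y'))}{J\bar F(\bar F^{k-1}(\bar x'))} \le \exp\Bigl\{ C_1 \beta^{s(\bar F^k(\bar y'), \bar F^k(\bar x'))} \Bigr\} \le \exp\Bigl\{ C_1 \beta^{(i-k)+s(\bar x,\bar y)} \Bigr\},$$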
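from where we conclude that
$$\tilde\rho^i_j(\bar x) \le \exp\Bigl\{ C_1 \beta^{s(\bar x,\bar y)} \sum_{j \ge 0} \beta^j \Bigr\} \le \exp\{ C_1/(1-\beta) \} = K.$$
Observe that we also get
$$\frac{1}{\tilde\rho^i_j(\bar x)} = \frac{J\bar F^i(\bar x')}{J\bar F^i(\bar y')}\, (\bar m(\bar\Lambda)) \le K,$$
which yields $\tilde\rho^i_j(\bar x) \ge K^{-1}$. Now, since $\bar\rho^i = \sum_j \bar m(\sigma^i_j)\, \tilde\rho^i_j$, we have $K^{-1} \le \bar\rho^i \le K$, which implies that $K^{-1} \le \bar\rho^{(n)} \le K$, from where we obtain that $K^{-1} \le \bar\rho \le K$.

2.2. Lifting to the Gibbs-Markov structure. We now adapt standard techniques for lifting the $\bar F$-invariant measure on the quotient space to an $F$-invariant measure on the initial Gibbs-Markov structure.

Given an $\bar F$-invariant probability measure $\bar\mu$, we define a probability measure $\tilde\mu$ on $\Lambda$ as follows. For each bounded $\phi : \Lambda \to \mathbb{R}$ consider its discretizations $\phi^\bullet : \hat\gamma \cap \Lambda \to \mathbb{R}$ and $\phi^* : \bar\Lambda \to \mathbb{R}$ defined by
$$\phi^\bullet(x) = \inf\{\phi(z) : z \in \gamma^s(x)\}, \quad \text{and} \quad \phi^* = \phi^\bullet \circ \hat\pi^{-1}. \tag{2.4}$$
If $\phi$ is continuous, as its domain is compact, we may define
$$\mathrm{var}\,\phi(k) = \sup\bigl\{ |\phi(z) - \phi(\zeta)| : |z - \zeta| \le C\beta^k \bigr\},$$
in which case $\mathrm{var}\,\phi(k) \to 0$ as $k \to \infty$.

Lemma 2.3. Given any continuous $\phi : \Lambda \to \mathbb{R}$, for all $k, l \in \mathbb{N}$ we have
$$\left| \int (\phi \circ F^k)^* \, d\bar\mu - \int (\phi \circ F^{k+l})^* \, d\bar\mu \right| \le \mathrm{var}\,\phi(k).$$

Proof. Since $\bar\mu$ is $\bar F$-invariant,
$$\begin{aligned}
\left| \int (\phi \circ F^k)^* \, d\bar\mu - \int (\phi \circ F^{k+l})^* \, d\bar\mu \right|
&= \left| \int (\phi \circ F^k)^* \circ \bar F^l \, d\bar\mu - \int (\phi \circ F^{k+l})^* \, d\bar\mu \right| \\
&\le \int \bigl| (\phi \circ F^k)^* \circ \bar F^l - (\phi \circ F^{k+l})^* \bigr| \, d\bar\mu.
\end{aligned}$$
By definition of the discretization we have
$$(\phi \circ F^k)^* \circ \bar F^l(x) = \inf\bigl\{ \phi(z) : z \in F^k\bigl(\gamma^s(\bar F^l(x))\bigr) \bigr\}$$
and
$$(\phi \circ F^{k+l})^*(x) = \inf\bigl\{ \phi(\zeta) : \zeta \in F^{k+l}(\gamma^s(x)) \bigr\}.$$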
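Observe that $F^{k+l}(\gamma^s(x)) \subset F^k\bigl(\gamma^s(\bar F^l(x))\bigr)$ and by (P2), $\mathrm{diam}\, F^k\bigl(\gamma^s(\bar F^l(x))\bigr) \le C\beta^k$. Thus, $\bigl| (\phi \circ F^k)^* \circ \bar F^l - (\phi \circ F^{k+l})^* \bigr| \le \mathrm{var}\,\phi(k)$.

By the Cauchy criterion the sequence $\bigl( \int (\phi \circ F^k)^* \, d\bar\mu \bigr)_{k \in \mathbb{N}}$ converges. Hence, the Riesz Representation Theorem yields a probability measure $\tilde\mu$ on $\Lambda$,
$$\int \phi \, d\tilde\mu := \lim_{k \to \infty} \int (\phi \circ F^k)^* \, d\bar\mu \tag{2.5}$$
for every continuous function $\phi : \Lambda \to \mathbb{R}$.

Proposition 2.4. The probability measure $\tilde\mu$ is $F$-invariant and has absolutely continuous conditional measures on $\gamma^u$ leaves. Moreover, given any continuous $\phi : \Lambda \to \mathbb{R}$ we have
(1) $\left| \int \phi \, d\tilde\mu - \int (\phi \circ F^k)^* \, d\bar\mu \right| \le \mathrm{var}\,\phi(k)$;
(2) if $\phi$ is constant in each $\gamma^s$, then $\int \phi \, d\tilde\mu = \int \bar\phi \, d\bar\mu$, where $\bar\phi : \bar\Lambda \to \mathbb{R}$ is defined by $\bar\phi(x) = \phi(z)$, where $z \in \bar\pi^{-1}(x)$;
(3) if $\phi$ is constant in each $\gamma^s$ and $\psi : \Lambda \to \mathbb{R}$ is continuous, then
$$\left| \int \psi \cdot \phi \, d\tilde\mu - \int (\psi \circ F^k)^* (\phi \circ F^k)^* \, d\bar\mu \right| \le \|\phi\|_1\, \mathrm{var}\,\psi(k);$$
(4) $\tilde\mu$ is ergodic.

Proof. Regarding the $F$-invariance property, note that for any continuous $\phi : \Lambda \to \mathbb{R}$,
$$\int \phi \circ F \, d\tilde\mu = \lim_{k \to \infty} \int \bigl( \phi \circ F^{k+1} \bigr)^* \, d\bar\mu = \int \phi \, d\tilde\mu,$$
by Lemma 2.3. Assertion (1) is an immediate consequence of Lemma 2.3. Property (2) follows from
$$\int \phi \, d\tilde\mu = \lim_{k \to \infty} \int \bigl( \phi \circ F^k \bigr)^* \, d\bar\mu = \lim_{k \to \infty} \int \bar\phi \circ \bar F^k \, d\bar\mu = \int \bar\phi \, d\bar\mu,$$
which holds by definition of $\tilde\mu$, and the $\bar F$-invariance of $\bar\mu$. For statement (3) let $\bar\phi : \bar\Lambda \to \mathbb{R}$ be defined by $\bar\phi(x) = \phi(z)$, where $z \in \bar\pi^{-1}(x)$. For any $k, l$ positive integers observe that
$$\int (\psi \cdot \phi \circ F^k)^* \, d\bar\mu = \int (\psi \circ F^k)^* (\phi \circ F^k)^* \, d\bar\mu$$
and
$$\begin{aligned}
\left| \int (\psi\phi \circ F^{k+l})^* \, d\bar\mu - \int (\psi\phi \circ F^k)^* \, d\bar\mu \right|
&= \left| \int (\psi \circ F^{k+l})^* \, \bar\phi \circ \bar F^{k+l} \, d\bar\mu - \int (\psi \circ F^k)^* \, \bar\phi \circ \bar F^k \, d\bar\mu \right| \\
&\le \int \bigl| (\psi \circ F^{k+l})^* - (\psi \circ F^k)^* \circ \bar F^l \bigr| \, |\bar\phi \circ \bar F^{k+l}| \, d\bar\mu \\
&\le \mathrm{var}\,\psi(k)\, \|\phi\|_1.
\end{aligned}$$
Inequality (3) follows letting $l$ go to $\infty$.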
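Remark 2.5. Since the continuous functions are a dense subset of $L^1$-functions, properties (2) and (3) also hold when $\phi \in L^1$, by dominated convergence. In particular, this gives that $\bar\pi_* \tilde\mu = \bar\mu$.

In order to prove item (4), let $\tilde E$ denote the set of points $z \in \Lambda$ such that for every $g \in L^1(\tilde\mu)$ we have
$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} g \circ F^i(z) = \int g \, d\tilde\mu. \tag{2.6}$$
We define similarly $\bar E$ with respect to $\bar\mu$, points in $\bar\Lambda$ and $g \in L^1(\bar\mu)$. Recall that ergodicity of $\bar\mu$ and $\tilde\mu$ is equivalent to $\bar\mu(\bar E) = 1$ and $\tilde\mu(\tilde E) = 1$, respectively. Actually, it is enough to consider continuous functions in the previous definitions. We will show that $\bar\pi^{-1}(\bar E) \subset \tilde E$, which then by Remark 2.5 implies that $\tilde\mu$ is ergodic. Let $\bar x \in \bar E$, $z \in \bar\pi^{-1}(\bar x)$ and consider a continuous function $g : \Lambda \to \mathbb{R}$. Then for every $k \in \mathbb{N}$ we have
$$\begin{aligned}
\left| \frac{1}{n} \sum_{i=0}^{n-1} g(F^i(z)) - \int g \, d\tilde\mu \right|
&\le \left| \frac{1}{n} \sum_{i=0}^{k-1} g(F^i(z)) \right| + \left| \frac{1}{n} \sum_{i=0}^{n-1} \Bigl( g(F^{k+i}(z)) - (g \circ F^k)^*(\bar F^i(\bar x)) \Bigr) \right| + \left| \frac{1}{n} \sum_{i=n}^{n+k-1} g(F^i(z)) \right| \\
&\quad + \left| \frac{1}{n} \sum_{i=0}^{n-1} (g \circ F^k)^*(\bar F^i(\bar x)) - \int (g \circ F^k)^* \, d\bar\mu \right| + \left| \int (g \circ F^k)^* \, d\bar\mu - \int g \, d\tilde\mu \right| \\
&\le \frac{2k\|g\|_\infty}{n} + 2\,\mathrm{var}\,g(k) + \left| \frac{1}{n} \sum_{i=0}^{n-1} (g \circ F^k)^*(\bar F^i(\bar x)) - \int (g \circ F^k)^* \, d\bar\mu \right| \xrightarrow[n \to \infty]{} 2\,\mathrm{var}\,g(k),
\end{aligned}$$
and the conclusion follows by letting $k \to \infty$.

We are then left to verify the absolute continuity of $\tilde\mu$. We already know that $\Lambda$ supports an $F$-invariant ergodic measure $\mu$ with absolutely continuous conditional measures on $\gamma^u$ leaves; see e.g. [Y2, Sect. 2]. In fact, we know that on a.e. $\gamma^u$, the conditional measure $\mu_{\gamma^u}$ is equivalent to the conditional Lebesgue measure $\mathrm{Leb}_{\gamma^u}$, when restricted to $\Lambda$. We are going to show that $\tilde\mu = \mu$. Consider $E$ the set of points $z \in \Lambda$ for which (2.6) holds with $\mu$ instead of $\tilde\mu$. Our goal is to show that $E \cap \tilde E \ne \emptyset$, which gives the desired equality of the two measures. As $\mu(E) = 1$, by the equivalence (on a.e. $\gamma^u$) between $\mu_{\gamma^u}$ and $\mathrm{Leb}_{\gamma^u}$ restricted to $\Lambda$, there exists an unstable leaf $\gamma^u$ such that $\mathrm{Leb}_{\gamma^u}((\Lambda \setminus E) \cap \gamma^u) = 0$. By (4) we have $\tilde\mu(\tilde E) = 1$, which implies that $\bar\mu(\bar\pi(\Lambda \setminus \tilde E)) = 0$ because $\bar\pi_* \tilde\mu = \bar\mu$ by Remark 2.5. Since $\bar\mu$ is equivalent to $\bar m$, it follows that $\bar m(\bar\pi(\Lambda \setminus \tilde E)) = 0$. Now, since the representative of $\bar m$ on $\gamma^u$ is also equivalent to $\mathrm{Leb}_{\gamma^u}$ restricted to $\Lambda$, we also have that $\mathrm{Leb}_{\gamma^u}((\Lambda \setminus \tilde E) \cap \gamma^u) = 0$. Consequently, we have $\mathrm{Leb}_{\gamma^u}((E \cap \tilde E) \cap \gamma^u) > 0$, which proves that $E \cap \tilde E \ne \emptyset$.

2.3. Entropy formula. Let $\tilde\mu$ be the SRB measure for $F$ obtained from $\bar\mu = \bar\rho\, \bar m$ as in (2.5). We define the saturation of $\tilde\mu$ by
$$\mu^* = \sum_{l=0}^{\infty} f^l_*\bigl( \tilde\mu | \{R > l\} \bigr). \tag{2.7}$$
It is well known that $\mu^*$ is $f$-invariant and that the finiteness of $\mu^*$ is equivalent to $\int R \, d\tilde\mu = \int R \, d\bar\mu < \infty$. By construction of $\Lambda$, $\bar m$ and $\bar\mu$, the finiteness of $\mu^*$ is also equivalent to $\int_{\gamma \cap \Lambda} R \, d\mathrm{Leb}_\gamma < \infty$. Clearly, each $f^l_*(\tilde\mu | \{R > l\})$ has absolutely continuous conditional measures on $\{f^l \gamma^u\}$, which are Pesin unstable manifolds. Consequently
$$\mu = \frac{1}{\mu^*(M)}\, \mu^*$$
is an SRB measure for $f$.

Lemma 2.6. If $\lambda$ is a Lyapunov exponent of $\tilde\mu$, then $\lambda/\sigma$ is a Lyapunov exponent of $\mu$, where $\sigma = \int R \, d\tilde\mu$.

Proof. As $\mu$ is obtained by saturating $\tilde\mu$ in (2.7), one easily gets $\mu^*(\Lambda) \ge \tilde\mu(\Lambda) = 1$, and so $\mu(\Lambda) > 0$. By ergodicity, it is enough to compare the Lyapunov exponents for points $z \in \Lambda$. Let $n$ be a positive integer. We have for each $z \in \Lambda$,
$$F^n(z) = f^{S_n(z)}(z), \quad \text{where} \quad S_n(z) = \sum_{i=0}^{n-1} R(F^i(z)).$$
As $S_n(z) = S_n(\zeta)$ for Lebesgue almost every $z \in \Lambda$ and $\zeta$ close to $z$, we have for $v \in T_z M$,
$$\frac{1}{S_n(z)} \log \bigl\| Df^{S_n(z)}(z) v \bigr\| = \frac{n}{S_n(z)} \cdot \frac{1}{n} \log \bigl\| DF^n(z) v \bigr\|. \tag{2.8}$$
Since $\tilde\mu$ is ergodic, the Birkhoff ergodic theorem yields
$$\lim_{n \to \infty} \frac{S_n(z)}{n} = \int R \, d\tilde\mu = \sigma \tag{2.9}$$
for $\tilde\mu$ almost every $z \in \Lambda$. Combining (2.8) and (2.9), whenever $\frac{1}{n} \log \| DF^n(z) v \|$ converges to $\lambda$, the left-hand side of (2.8) converges to $\lambda/\sigma$.

Proposition 2.7. Let $J\bar F$ be the Jacobian of $\bar F$ with respect to the measure $\bar m$ on $\bar\Lambda$. Then
$$h_\mu = \sigma^{-1} \int_{\bar\Lambda} \log J\bar F \, d\bar m.$$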
Proof. By [LY2, Cor. 7.4.2] we have
$$h_\mu = \int \sum_{\lambda_i > 0} \lambda_i \dim E_i, \tag{2.10}$$
where $\lambda_i$ are Lyapunov exponents of $\mu$ and $E_i$ the corresponding linear spaces given by Oseledets' decomposition. By Lemma 2.6 we have
$$h_\mu = \sigma^{-1} \int \sum_{\tilde\lambda_i > 0} \tilde\lambda_i \dim E_i,$$
where $\tilde\lambda_i$ are Lyapunov exponents of $\tilde\mu$. As a consequence of Oseledets theorem we may also write
$$\int \sum_{\tilde\lambda_i > 0} \tilde\lambda_i \dim E_i = \int \log \det DF^u \, d\tilde\mu.$$
According to (2.3),
$$\int \log JF \, d\tilde\mu = \int \log \det DF^u \, d\tilde\mu + \int \log \hat u \circ F \, d\tilde\mu - \int \log \hat u \, d\tilde\mu = \int \log \det DF^u \, d\tilde\mu,$$
where the last equality follows from the $F$-invariance of $\tilde\mu$. Finally, since by Lemma 2.1 $JF$ is constant in each $\gamma^s$-leaf it follows from Proposition 2.4 (2) that
$$\int_{\Lambda} \log JF \, d\tilde\mu = \int_{\bar\Lambda} \log J\bar F \, d\bar m.$$
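The shape of the entropy formula in Proposition 2.7 can be checked by hand on a one-dimensional toy model, which is an assumption of this sketch and not an example from the paper. Take the doubling map $f(x) = 2x \bmod 1$ with Lebesgue measure, induce on $[1/2, 1)$ by the first return time $R$, so that normalized Lebesgue measure on $[1/2,1)$ is invariant for the induced map (reference and invariant measure coincide). Then $\{R = n\}$ has normalized measure $2^{-n}$ and the induced map has Jacobian $2^n$ there. The sketch below sums the two series and recovers the known entropy $\log 2$ of the doubling map; the function names are hypothetical.

```python
import math

def return_time_law(n_max):
    """Normalized Lebesgue measure of {R = n} for the toy model: 2^{-n}."""
    return {n: 2.0 ** -n for n in range(1, n_max + 1)}

def entropy_via_inducing(n_max=60):
    """Return (sigma, h), where sigma is the mean return time and
    h = (1 / sigma) * (integral of log JF) for the induced toy map.

    On {R = n} the induced map F = f^n has derivative 2^n,
    hence log JF = n * log 2 there.
    """
    law = return_time_law(n_max)
    sigma = sum(n * p for n, p in law.items())
    int_log_jac = sum(p * n * math.log(2.0) for n, p in law.items())
    return sigma, int_log_jac / sigma
```

Here $\sigma = 2$, in agreement with Kac's formula for a base of measure $1/2$, and the quotient equals $\log 2$. The toy model has no stable direction, so it only illustrates the role of the normalization by $\sigma$.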
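3. Statistical Stability

Let $\mathcal{F}$ be a uniform family of maps. Fix $f_0 \in \mathcal{F}$ and take any sequence $(f_n)_{n \ge 1}$ in $\mathcal{F}$ such that $f_n \to f_0$, as $n \to \infty$, in the $C^k$ topology. For each $n \ge 0$, let $\mu_n$ denote the (unique) SRB measure for $f_n$. Given $n \ge 0$, the map $f_n \in \mathcal{F}$ admits a Gibbs-Markov structure $\Lambda_n$ with $\Gamma_n^u = \{\gamma_n^u\}$ and $\Gamma_n^s = \{\gamma_n^s\}$ its defining families of unstable and stable leaves. Consider $R_n : \Lambda_n \to \mathbb{N}$ the return time, $F_n : \Lambda_n \to \Lambda_n$ the induced map, $\hat\gamma_n$ the special unstable leaf given by condition (U1) and $H_n : \hat\gamma_n \cap \Gamma_0^s \to \hat\gamma_0$ obtained by sliding through the stable leaves of $\Lambda_0$. Recall that $\Omega_n^0 = H_n(\hat\gamma_n \cap \Lambda_n)$ and $\Omega_0 = \hat\gamma_0 \cap \Lambda_0$.

Remark 3.1. Since $f_n \to f_0$, as $n \to \infty$, in the $C^k$ topology and (U1) holds, then for every $\varepsilon > 0$ and $\ell \in \mathbb{N}$, there exists $N_0 \in \mathbb{N}$ such that for every $n \ge N_0$ we have $\|\hat\gamma_n - \hat\gamma_0\|_1 < \varepsilon$,
$$\max_{x \in \Omega_0 \cap \Omega_n^0} \Bigl\{ \bigl| (f_n \circ H_n^{-1} - f_0)(x) \bigr|, \dots, \bigl| (f_n^\ell \circ H_n^{-1} - f_0^\ell)(x) \bigr| \Bigr\} < \varepsilon,$$
and
$$\max_{x \in \Omega_0 \cap \Omega_n^0} \left\{ \left| \log \frac{\det Df_n^u(f_n \circ H_n^{-1}(x))}{\det Df_0^u(f_0(x))} \right|, \dots, \left| \log \frac{\det Df_n^u(f_n^\ell \circ H_n^{-1}(x))}{\det Df_0^u(f_0^\ell(x))} \right| \right\} < \varepsilon.$$

Our goal is to show that $\mu_n \to \mu_0$ in the weak* topology, i.e. for each continuous function $g : M \to \mathbb{R}$ the sequence $\int g \, d\mu_n$ converges to $\int g \, d\mu_0$. We will show that given any continuous $g : M \to \mathbb{R}$, each subsequence of $\int g \, d\mu_n$ admits a subsequence converging to $\int g \, d\mu_0$.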
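3.1. Convergence of the densities on the reference leaf. In Sect. 2.1 we built a family of holonomy invariant measures on unstable leaves that gives rise to a measure $\bar m_n$ on $\bar\Lambda_n$. Moreover,
$$(\hat\pi_n)_* m_{\hat\gamma_n} = \bar m_n \quad \text{and} \quad m_{\hat\gamma_n} = 1_{\hat\gamma_n \cap \Lambda_n} \mathrm{Leb}_{\hat\gamma_n}, \tag{3.1}$$
where $1_{(\cdot)}$ stands for the indicator function. By Lemma 2.2, for each $n \ge 0$ there is an $\bar F_n$-invariant measure $\bar\mu_n = \bar\rho_n \bar m_n$ with $\|\bar\rho_n\|_\infty \le K$ for all $n \ge 0$. We define the sequence $(\varrho_n)_{n \ge 0}$ of functions in $\hat\gamma_0$ as
$$\varrho_n = \bar\rho_n \circ \hat\pi_n \circ H_n^{-1} \cdot 1_{\Omega_n^0}, \tag{3.2}$$
which in particular gives $\varrho_0 = \bar\rho_0 \circ \hat\pi_0$. The main purpose of this section is to prove that the sequence $(\varrho_n)_{n \in \mathbb{N}}$ converges to $\varrho_0$ in the weak* topology. By the Banach-Alaoglu theorem there is a subsequence $(\varrho_{n_i})_{i \in \mathbb{N}}$ converging to some $\varrho_\infty \in L^\infty(\mathrm{Leb}_{\hat\gamma_0})$ in the weak* topology, i.e.
$$\int \phi\, \varrho_{n_i} \, d\mathrm{Leb}_{\hat\gamma_0} \xrightarrow[i \to \infty]{} \int \phi\, \varrho_\infty \, d\mathrm{Leb}_{\hat\gamma_0}, \quad \forall \phi \in L^1(\mathrm{Leb}_{\hat\gamma_0}). \tag{3.3}$$
The following lemma establishes that integration with respect to $\bar m_n$ is close to integration with respect to $\varrho_n \mathrm{Leb}_{\hat\gamma_0}$, up to a small error.

Lemma 3.2. Let $\bar\phi \in L^\infty(\bar m_n)$. If $n$ is sufficiently large, then
$$\left| \int_{\bar\Lambda_n} \bar\phi\, \bar\rho_n \, d\bar m_n - \int_{\Omega_n^0} (\bar\phi \circ \hat\pi_n \circ H_n^{-1})\, \varrho_n \, d\mathrm{Leb}_{\hat\gamma_0} \right| \le K \|\bar\phi\|_\infty Q_n,$$
where
$$Q_n = \mathrm{Leb}_{\hat\gamma_n}(\Omega_0^n \,\triangle\, \Omega_n) + \left| \int_{\Omega_n^0} d(H_n)_* \mathrm{Leb}_{\hat\gamma_n} - \int_{\Omega_n^0} d\mathrm{Leb}_{\hat\gamma_0} \right|.$$

Proof. By (3.1), we have
$$\int_{\bar\Lambda_n} \bar\phi\, \bar\rho_n \, d\bar m_n = \int_{\Omega_n} (\bar\phi \circ \hat\pi_n)(\bar\rho_n \circ \hat\pi_n) \, d\mathrm{Leb}_{\hat\gamma_n}.$$
It follows that
$$\begin{aligned}
&\left| \int_{\bar\Lambda_n} \bar\phi\, \bar\rho_n \, d\bar m_n - \int_{\Omega_n^0} (\bar\phi \circ \hat\pi_n \circ H_n^{-1})\, \varrho_n \, d\mathrm{Leb}_{\hat\gamma_0} \right| \\
&\quad \le \left| \int_{\Omega_0^n \triangle \Omega_n} (\bar\phi \circ \hat\pi_n)(\bar\rho_n \circ \hat\pi_n) \, d\mathrm{Leb}_{\hat\gamma_n} \right| \\
&\qquad + \left| \int_{\Omega_0^n \cap \Omega_n} (\bar\phi \circ \hat\pi_n)(\bar\rho_n \circ \hat\pi_n) \, d\mathrm{Leb}_{\hat\gamma_n} - \int_{\Omega_n^0} (\bar\phi \circ \hat\pi_n \circ H_n^{-1})\, \varrho_n \, d\mathrm{Leb}_{\hat\gamma_0} \right| \\
&\quad \le K \|\bar\phi\|_\infty \mathrm{Leb}_{\hat\gamma_n}(\Omega_0^n \triangle \Omega_n) \\
&\qquad + \left| \int_{\Omega_n^0} (\bar\phi \circ \hat\pi_n \circ H_n^{-1})\, \varrho_n \, d(H_n)_* \mathrm{Leb}_{\hat\gamma_n} - \int_{\Omega_n^0} (\bar\phi \circ \hat\pi_n \circ H_n^{-1})\, \varrho_n \, d\mathrm{Leb}_{\hat\gamma_0} \right| \\
&\quad \le K \|\bar\phi\|_\infty \mathrm{Leb}_{\hat\gamma_n}(\Omega_0^n \triangle \Omega_n) + K \|\bar\phi\|_\infty \left| \int_{\Omega_n^0} d(H_n)_* \mathrm{Leb}_{\hat\gamma_n} - \int_{\Omega_n^0} d\mathrm{Leb}_{\hat\gamma_0} \right|.
\end{aligned}$$

Consider the maps $G_0 : \hat\gamma_0 \to \hat\gamma_0$ and $G_n : \hat\gamma_0 \to \hat\gamma_n$ defined by
$$G_0 = \hat\pi_0^{-1} \circ \bar F_0 \circ \hat\pi_0 \quad \text{and} \quad G_n = \hat\pi_n^{-1} \circ \bar F_n \circ \hat\pi_n \circ H_n^{-1}.$$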
Fig. 1.
Lemma 3.3. For every $\varepsilon > 0$, $n \in \mathbb{N}$ sufficiently large and $\mathrm{Leb}_{\hat\gamma_0}$-almost every $x \in \Omega_0 \cap \Omega_n^0 \cap \{R_n = \ell\} \cap \{R_0 = \ell\}$ we have $|G_n(x) - G_0(x)| < \varepsilon$.

Proof. Consider a point $x \in \Omega_0 \cap \Omega_n^0 \cap \{R_n = \ell\} \cap \{R_0 = \ell\}$. We may assume that $G_n(x)$ is a Lebesgue density point of $\Omega_n$. Then, using (U2) and the continuity of the stable foliation (see Definition 1 (iii)), for sufficiently large $n \in \mathbb{N}$ we may guarantee the existence of a point $\tilde y \in \Omega_0^n \cap \Omega_n$ such that $\gamma_n^s(\tilde y)$ is at most $\varepsilon \sin(\theta)/4$ apart from $\gamma_n^s(G_n(x))$ in the $C^1$-norm; recall Remark 1.2 (see Fig. 1). Using (U3) we may assume that $n \in \mathbb{N}$ is also sufficiently large so that the distance in the $C^1$-norm between $\gamma_n^s(\tilde y)$ and $\gamma_0^s(\tilde y)$ is at most $\varepsilon \sin(\theta)/4$. Taking into account Remark 3.1 and the continuity of the stable foliation, we may assume that $n \in \mathbb{N}$ is large enough so that $|f_n^\ell(H_n^{-1}(x)) - f_0^\ell(x)|$ is sufficiently small in order for $\gamma_0^s(f_0^\ell(x))$ to belong to an $\varepsilon \sin(\theta)/4$-neighborhood of $\gamma_0^s(\tilde y)$, in the $C^1$-norm. It follows that $\gamma_n^s(f_n^\ell(H_n^{-1}(x)))$ and $\gamma_0^s(f_0^\ell(x))$ are at most $3\varepsilon \sin(\theta)/4$ apart, in the $C^1$-norm. Finally, observing that $G_n(x) = \gamma_n^s(f_n^\ell(H_n^{-1}(x))) \cap \gamma_n^u$, $G_0(x) = \gamma_0^s(f_0^\ell(x)) \cap \gamma_0^u$ and $\gamma_n^u$ can be made arbitrarily close to $\gamma_0^u$, in the $C^1$-norm (by (U1)), then, as long as $n$ is sufficiently large, we have $|G_n(x) - G_0(x)| < \varepsilon$.

Proposition 3.4. The measure $(\varrho_\infty \circ \hat\pi_0^{-1})\, \bar m_0$ is $\bar F_0$-invariant.

Proof. We just have to verify that for every continuous $\varphi : \bar\Lambda_0 \to \mathbb{R}$,
$$\int (\varphi \circ \bar F_0)(\varrho_\infty \circ \hat\pi_0^{-1}) \, d\bar m_0 = \int \varphi\, (\varrho_\infty \circ \hat\pi_0^{-1}) \, d\bar m_0.$$
Given such $\varphi$, consider a continuous function $\phi : M \to \mathbb{R}$ such that $\|\phi\|_\infty \le \|\varphi\|_\infty$ and $\phi|_{\Omega_0} = \varphi \circ \hat\pi_0$. Since $\bar\mu_{n_i} = \bar\rho_{n_i} d\bar m_{n_i}$ is $\bar F_{n_i}$-invariant we have
$$\int (\phi \circ \hat\pi_{n_i}^{-1} \circ \bar F_{n_i})\, \bar\rho_{n_i} \, d\bar m_{n_i} = \int (\phi \circ \hat\pi_{n_i}^{-1})\, \bar\rho_{n_i} \, d\bar m_{n_i}. \tag{3.4}$$
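Recalling definitions (3.1), (3.2), the fact that $\varrho_{n_i}$ is supported on $\Omega_{n_i}^0 \subset \Omega_0$ and applying Lemmas 3.2 and 2.2 we get
$$\begin{aligned}
&\left| \int (\phi \circ \hat\pi_{n_i}^{-1})\, \bar\rho_{n_i} \, d\bar m_{n_i} - \int \varphi\, (\varrho_\infty \circ \hat\pi_0^{-1}) \, d\bar m_0 \right| \\
&\quad \le \left| \int (\phi \circ H_{n_i}^{-1})\, \varrho_{n_i} \, d\mathrm{Leb}_{\hat\gamma_0} - \int (\varphi \circ \hat\pi_0)\, \varrho_\infty \, d\mathrm{Leb}_{\hat\gamma_0} \right| + Q_{n_i} \\
&\quad = \left| \int (\phi \circ H_{n_i}^{-1})\, \varrho_{n_i} \, d\mathrm{Leb}_{\hat\gamma_0} - \int \phi\, \varrho_\infty \, d\mathrm{Leb}_{\hat\gamma_0} \right| + Q_{n_i} \\
&\quad \le \left| \int (\phi \circ H_{n_i}^{-1})\, \varrho_{n_i} \, d\mathrm{Leb}_{\hat\gamma_0} - \int \phi\, \varrho_{n_i} \, d\mathrm{Leb}_{\hat\gamma_0} \right| + \left| \int \phi\, \varrho_{n_i} \, d\mathrm{Leb}_{\hat\gamma_0} - \int \phi\, \varrho_\infty \, d\mathrm{Leb}_{\hat\gamma_0} \right| + Q_{n_i} \\
&\quad \le K \int \bigl| \phi \circ H_{n_i}^{-1} - \phi \bigr| \, d\mathrm{Leb}_{\hat\gamma_0} + \left| \int \phi\, \varrho_{n_i} \, d\mathrm{Leb}_{\hat\gamma_0} - \int \phi\, \varrho_\infty \, d\mathrm{Leb}_{\hat\gamma_0} \right| + Q_{n_i}.
\end{aligned}$$
Therefore, using (U1) for the first term on the right, (3.3) for the second and (U2) plus Remark 1.2 for the $Q$ term, we conclude that
$$\int (\phi \circ \hat\pi_{n_i}^{-1})\, \bar\rho_{n_i} \, d\bar m_{n_i} \xrightarrow[i \to \infty]{} \int \varphi\, (\varrho_\infty \circ \hat\pi_0^{-1}) \, d\bar m_0. \tag{3.5}$$
Once we prove the next claim, then equality (3.4), the limit (3.5) and the uniqueness of the limit give the desired result.

Claim 3.1. $\displaystyle \int (\phi \circ \hat\pi_{n_i}^{-1} \circ \bar F_{n_i})\, \bar\rho_{n_i} \, d\bar m_{n_i} \xrightarrow[i \to \infty]{} \int \varphi \circ \bar F_0\, (\varrho_\infty \circ \hat\pi_0^{-1}) \, d\bar m_0.$

Let
$$E_1 := \left| \int (\phi \circ \hat\pi_{n_i}^{-1} \circ \bar F_{n_i})\, \bar\rho_{n_i} \, d\bar m_{n_i} - \int \varphi \circ \bar F_0\, (\varrho_\infty \circ \hat\pi_0^{-1}) \, d\bar m_0 \right|.$$
Again, using definitions (3.1), (3.2) and applying Lemma 3.2 we get
$$E_1 \le \left| \int (\phi \circ G_{n_i})\, \varrho_{n_i} \, d\mathrm{Leb}_{\hat\gamma_0} - \int (\phi \circ G_0)\, \varrho_\infty \, d\mathrm{Leb}_{\hat\gamma_0} \right| + Q_{n_i}.$$
Now, observe that by (U2) and Remark 1.2 the term $Q_{n_i}$ can be made arbitrarily small for large $i$. This leaves us with the first term on the right that we denote by $E_2$. Using Lemma 2.2 we have
$$\begin{aligned}
E_2 &\le \int \bigl| \phi \circ G_{n_i} - \phi \circ G_0 \bigr|\, \varrho_{n_i} \, d\mathrm{Leb}_{\hat\gamma_0} + \left| \int (\phi \circ G_0)\, \varrho_{n_i} \, d\mathrm{Leb}_{\hat\gamma_0} - \int (\phi \circ G_0)\, \varrho_\infty \, d\mathrm{Leb}_{\hat\gamma_0} \right| \\
&\le K \int \bigl| \phi \circ G_{n_i} - \phi \circ G_0 \bigr| \, d\mathrm{Leb}_{\hat\gamma_0} + \left| \int (\phi \circ G_0)\, \varrho_{n_i} \, d\mathrm{Leb}_{\hat\gamma_0} - \int (\phi \circ G_0)\, \varrho_\infty \, d\mathrm{Leb}_{\hat\gamma_0} \right|.
\end{aligned}$$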
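According to Eq. (3.3) it is clear that the last term on the right can be made arbitrarily small provided $i$ is large enough. So, denote by $E_3$ the first term on the right. Recalling the fact that $\varrho_{n_i}$ is supported on $\Omega_{n_i}^0 \subset \Omega_0$, we have for any $N$,
$$\begin{aligned}
E_3 &\le K \|\phi\|_\infty \sum_{\ell=N+1}^{\infty} \Bigl( \mathrm{Leb}_{\hat\gamma_0}(\{R_{n_i} = \ell\}) + \mathrm{Leb}_{\hat\gamma_0}(\{R_0 = \ell\}) \Bigr) \\
&\quad + K \|\phi\|_\infty \sum_{\ell=1}^{N} \mathrm{Leb}_{\hat\gamma_0}\bigl( \{R_{n_i} = \ell\} \,\triangle\, \{R_0 = \ell\} \bigr) \\
&\quad + K \sum_{\ell=1}^{N} \int_{\{R_{n_i} = \ell\} \cap \{R_0 = \ell\} \cap \Omega_0 \cap \Omega_{n_i}^0} \bigl| \phi \circ G_{n_i} - \phi \circ G_0 \bigr| \, d\mathrm{Leb}_{\hat\gamma_0}.
\end{aligned}$$
Denote by $E_4$, $E_5$ and $E_6$ respectively the terms in the last sum. Having in mind (U5) and Remark 1.2, we may choose $N \in \mathbb{N}$ sufficiently large so that $E_4$ is small for large $i$. For this choice of $N$, by (U4), we also have that $E_5$ is small for large $i$. We now turn our attention to $E_6$. For $\ell = 1, \dots, N$, let
$$E_6^\ell = \int \bigl| \phi \circ G_{n_i} - \phi \circ G_0 \bigr|\, 1_{\{R_{n_i} = \ell\} \cap \{R_0 = \ell\} \cap \Omega_0 \cap \Omega_{n_i}^0} \, d\mathrm{Leb}_{\hat\gamma_0}.$$
Since $\phi$ is continuous and $M$ is compact then each $E_6^\ell$ can be made arbitrarily small by Lemma 3.3.

Corollary 3.5. Given $\phi \in L^1(\mathrm{Leb}_{\hat\gamma_0})$, we have
$$\int \phi\, \varrho_n \, d\mathrm{Leb}_{\hat\gamma_0} \xrightarrow[n \to \infty]{} \int \phi\, \varrho_0 \, d\mathrm{Leb}_{\hat\gamma_0}.$$

Proof. By uniqueness of the absolutely continuous invariant measure for $\bar F_0$, it follows from Proposition 3.4 that $\bar\rho_0 = \varrho_\infty \circ \hat\pi_0^{-1}$, which immediately yields $\varrho_\infty = \varrho_0$. Hence
$$\int \phi\, \varrho_{n_i} \, d\mathrm{Leb}_{\hat\gamma_0} \xrightarrow[i \to \infty]{} \int \phi\, \varrho_0 \, d\mathrm{Leb}_{\hat\gamma_0}, \quad \text{for all } \phi \text{ continuous}. \tag{3.6}$$
The same argument proves that any subsequence of $(\varrho_n)_n$ has a weak* convergent subsequence with limit also equal to $\varrho_0$. This shows that $(\varrho_n)_n$ itself converges to $\varrho_0$ in the weak* topology. Since continuous functions are dense in $L^1(\mathrm{Leb}_{\hat\gamma_0})$, using that the densities $\varrho_n$ are uniformly bounded, by Lemma 2.2, the result follows easily from (3.6).

3.2. Continuity of the SRB measures. For each $n \ge 0$ let $\tilde\mu_n$ be the $F_n$-invariant measure lifted from $\bar\mu_n$ as in (2.5), $\mu_n^*$ the saturation of $\tilde\mu_n$ as in (2.7), and $\mu_n = \mu_n^*/\mu_n^*(M)$ the SRB measure. The main goal of this section is to prove the following result.

Proposition 3.6. For every continuous $g : M \to \mathbb{R}$,
$$\int g \, d\mu_n^* \xrightarrow[n \to \infty]{} \int g \, d\mu_0^*.$$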
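Proof. As $M$ is compact, then $g$ is uniformly continuous and $\|g\|_\infty < \infty$. Recalling (2.7) we may write for all $n \in \mathbb{N}_0$ and every integer $N_0$,
$$\mu_n^* = \sum_{\ell=0}^{N_0 - 1} \mu_n^\ell + \eta_n,$$
where $\mu_n^\ell = (f_n^\ell)_*(\tilde\mu_n | \{R_n > \ell\})$ and $\eta_n = \sum_{\ell \ge N_0} (f_n^\ell)_*(\tilde\mu_n | \{R_n > \ell\})$. By (U5), we may choose $N_0$ so that $\eta_n(M)$ is as small as we want, for all $n \in \mathbb{N}_0$. We are left to show that for every $\ell < N_0$, if $n$ is large enough then
$$\left| \int (g \circ f_n^\ell)\, 1_{\{R_n > \ell\}} \, d\tilde\mu_n - \int (g \circ f_0^\ell)\, 1_{\{R_0 > \ell\}} \, d\tilde\mu_0 \right|$$
is arbitrarily small. We fix $\ell < N_0$ and take $k \in \mathbb{N}$ large so that $\mathrm{var}\,g(k)$ is sufficiently small. Then, we use Proposition 2.4 (3) and its Remark 2.5 to reduce our problem to controlling the following error term:
$$E := \left| \int (g \circ f_n^\ell \circ F_n^k)^* (1_{\{R_n > \ell\}} \circ F_n^k)^* \, d\bar\mu_n - \int (g \circ f_0^\ell \circ F_0^k)^* (1_{\{R_0 > \ell\}} \circ F_0^k)^* \, d\bar\mu_0 \right|.$$
Let $\varrho_0 : \hat\gamma_0 \to \mathbb{R}$ be such that $\varrho_0 = \bar\rho_0 \circ \hat\pi_0 \cdot 1_{\Omega_0}$ and define
$$E_0 = \left| \int \Bigl[ (g \circ f_n^\ell \circ F_n^k)^\bullet (1_{\{R_n > \ell\}} \circ F_n^k)^\bullet \Bigr] \circ H_n^{-1}\, \varrho_n \, d\mathrm{Leb}_{\hat\gamma_0} - \int (g \circ f_0^\ell \circ F_0^k)^\bullet (1_{\{R_0 > \ell\}} \circ F_0^k)^\bullet \varrho_0 \, d\mathrm{Leb}_{\hat\gamma_0} \right|.$$
By Lemma 3.2, we have $E \le E_0 + K \|g\|_\infty Q_n$. Observe that by (U2) and Remark 1.2 we may consider $n$ large enough so that $K \|g\|_\infty Q_n$ is negligible. Applying the triangle inequality we get
$$\begin{aligned}
E_0 &\le K \int \Bigl| (g \circ f_n^\ell \circ F_n^k)^\bullet \circ H_n^{-1} - (g \circ f_0^\ell \circ F_0^k)^\bullet \Bigr|\, 1_{\Omega_0 \cap \Omega_n^0} \, d\mathrm{Leb}_{\hat\gamma_0} \\
&\quad + K \|g\|_\infty \int \Bigl| (1_{\{R_n > \ell\}} \circ F_n^k)^\bullet \circ H_n^{-1} - (1_{\{R_0 > \ell\}} \circ F_0^k)^\bullet \Bigr|\, 1_{\Omega_0 \cap \Omega_n^0} \, d\mathrm{Leb}_{\hat\gamma_0} \\
&\quad + \left| \int (g \circ f_0^\ell \circ F_0^k)^\bullet (1_{\{R_0 > \ell\}} \circ F_0^k)^\bullet\, 1_{\Omega_0 \cap \Omega_n^0}\, (\varrho_n - \varrho_0) \, d\mathrm{Leb}_{\hat\gamma_0} \right|.
\end{aligned}$$
By Corollary 3.5 the term
$$\left| \int (g \circ f_0^\ell \circ F_0^k)^\bullet (1_{\{R_0 > \ell\}} \circ F_0^k)^\bullet\, 1_{\Omega_0 \cap \Omega_n^0}\, (\varrho_n - \varrho_0) \, d\mathrm{Leb}_{\hat\gamma_0} \right|$$
is as small as we want as long as $n$ is large enough. The analysis of the remaining terms
$$\int \Bigl| (g \circ f_n^\ell \circ F_n^k)^\bullet \circ H_n^{-1} - (g \circ f_0^\ell \circ F_0^k)^\bullet \Bigr|\, 1_{\Omega_0 \cap \Omega_n^0} \, d\mathrm{Leb}_{\hat\gamma_0}$$
and
$$\int \Bigl| (1_{\{R_n > \ell\}} \circ F_n^k)^\bullet \circ H_n^{-1} - (1_{\{R_0 > \ell\}} \circ F_0^k)^\bullet \Bigr|\, 1_{\Omega_0 \cap \Omega_n^0} \, d\mathrm{Leb}_{\hat\gamma_0}$$
is left to Lemmas 3.8 and 3.9, respectively.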
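In the proofs of Lemmas 3.8 and 3.9 we have to produce a suitable positive integer $N$ so that returns that take longer than $N$ iterations are negligible. The next lemma provides the tools for an adequate choice. We consider the sequence of consecutive return times for $z \in \Lambda$,
$$R^1(z) = R(z) \quad \text{and} \quad R^n(z) = R\bigl( f^{R^1 + R^2 + \dots + R^{n-1}}(z) \bigr). \tag{3.7}$$

Lemma 3.7. Given $k, N \in \mathbb{N}$,
$$\bar m\bigl( \{ z \in \Lambda : \exists t \in \{1, \dots, k\} \text{ such that } R^t(z) > N \} \bigr) \le k C_1 \bar m(\{R > N\}).$$

Proof. We may write
$$\bigl\{ z \in \Lambda : \exists t \in \{1, \dots, k\} \text{ such that } R^t(z) > N \bigr\} = \bigcup_{t=0}^{k-1} B_t,$$
where
$$B_t = \bigl\{ z \in \Lambda : R(z) \le N, \dots, R^t(z) \le N, R^{t+1}(z) > N \bigr\}.$$
If $R(z) \le N, \dots, R^t(z) \le N$ then there exist $j_1, \dots, j_t$ with $R(\Upsilon_{j_l}) \le N$ for every $l = 1, \dots, t$ and $z \in \Upsilon_{j_1,\dots,j_t}$. Observe that $\bar F^t(\bar\Upsilon_{j_1,\dots,j_t}) = \bar\Lambda$ and there is $y \in \Upsilon_{j_1,\dots,j_t}$ such that $\bar m(\bar\Lambda) \le J\bar F^t(y) \cdot \bar m(\Upsilon_{j_1,\dots,j_t})$. Also, there exists $x \in \Upsilon_{j_1,\dots,j_t} \cap \bar F^{-t}(\{R > N\})$ such that $\bar m(\{R > N\}) \ge J\bar F^t(x) \cdot \bar m\bigl(\Upsilon_{j_1,\dots,j_t} \cap \bar F^{-t}(\{R > N\})\bigr)$. Then, using bounded distortion we obtain
$$\frac{\bar m\bigl(\Upsilon_{j_1,\dots,j_t} \cap \bar F^{-t}(\{R > N\})\bigr)}{\bar m(\Upsilon_{j_1,\dots,j_t})} \le \frac{J\bar F^t(y)}{J\bar F^t(x)} \cdot \frac{\bar m(\{R > N\})}{\bar m(\bar\Lambda)} \le C_1 \bar m(\{R > N\}).$$
Finally, we conclude that
$$\begin{aligned}
|B_t| &= \sum_{\substack{j_1,\dots,j_t : \\ R(\Upsilon_{j_l}) \le N,\ l = 1, \dots, t}} \bar m\bigl(\Upsilon_{j_1,\dots,j_t} \cap \bar F^{-t}(\{R > N\})\bigr) \\
&\le C_1 \bar m(\{R > N\}) \sum_{\substack{j_1,\dots,j_t : \\ R(\Upsilon_{j_l}) \le N,\ l = 1, \dots, t}} \bar m(\Upsilon_{j_1,\dots,j_t}) \\
&\le C_1 \bar m(\{R > N\}).
\end{aligned}$$

Lemma 3.8. Given $\ell, k \in \mathbb{N}$ and $\varepsilon > 0$ there is $J \in \mathbb{N}$ such that for every $n > J$,
$$\int \Bigl| (g \circ f_n^\ell \circ F_n^k)^\bullet \circ H_n^{-1} - (g \circ f_0^\ell \circ F_0^k)^\bullet \Bigr|\, 1_{\Omega_0 \cap \Omega_n^0} \, d\mathrm{Leb}_{\hat\gamma_0} < \varepsilon.$$

Proof. We split the argument into three steps:
(1) We appeal to Lemma 3.7 to choose $N \in \mathbb{N}$ sufficiently large so that the set
$$L := \bigl\{ x \in \Omega_0 \cap \Omega_n^0 : \exists t \in \{1, \dots, k\},\ R_0^t(x) > N \text{ or } R_n^t(x) > N \bigr\}$$
has sufficiently small mass.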
Statistical Stability and Continuity of SRB Entropy
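(2) We pick $J\in\mathbb N$ large enough to guarantee that, according to condition (U4), for every $k$ positive integers $j_1,\dots,j_k$ such that $R_0(\Upsilon^0_{j_l})\le N$ for all $l=1,\dots,k$, each set $\Upsilon^0_{j_1,\dots,j_k}$ and its corresponding $\Upsilon^n_{j_1,\dots,j_k}$ satisfy the condition: $\Upsilon^0_{j_1,\dots,j_k}\,\triangle\,H_n(\Upsilon^n_{j_1,\dots,j_k})$ has sufficiently small conditional Lebesgue measure.
(3) Finally, in each set $\Upsilon^0_{j_1,\dots,j_k}\cap H_n(\Upsilon^n_{j_1,\dots,j_k})$ we control $\left|(g\circ f_n^{\ell}\circ F_n^k)^{\bullet}\circ H_n^{-1}-(g\circ f_0^{\ell}\circ F_0^k)^{\bullet}\right|$.

Step (1). From Lemma 3.7 we have
$$|L|\le kC_1\left(\mathrm{Leb}_{\hat\gamma_0}(\{R_0>N\})+\mathrm{Leb}_{\hat\gamma_n}(\{R_n>N\})\right).$$
So, by assumption (U5), we may choose $N$ and $J$ large enough so that
$$2\|g\|_\infty\,kC_1\left(\mathrm{Leb}_{\hat\gamma_0}(\{R_0>N\})+\mathrm{Leb}_{\hat\gamma_n}(\{R_n>N\})\right)<\frac{\varepsilon}{3},$$
which implies that
$$\int_L\left|(g\circ f_n^{\ell}\circ F_n^k)^{\bullet}\circ H_n^{-1}-(g\circ f_0^{\ell}\circ F_0^k)^{\bullet}\right|\mathbf 1_{\Lambda_0\cap\Lambda_n^0}\,d\mathrm{Leb}_{\hat\gamma_0}<\frac{\varepsilon}{3}.$$

Step (2). By (P1)(c) it is possible to define $V=V(N,k)$ as the total number of sets $\Upsilon_{j_1,\dots,j_k}$ such that $R(\Upsilon_{j_l})\le N$ for all $l=1,\dots,k$. Now, using (U4), we may choose $J$ so that for every $n>J$ and $\Upsilon^0_{j_1,\dots,j_k}$ such that $R_0(\Upsilon^0_{j_l})\le N$ for all $l=1,\dots,k$, the corresponding $\Upsilon^n_{j_1,\dots,j_k}$ is such that
$$\mathrm{Leb}_{\hat\gamma_0}\left(\Upsilon^0_{j_1,\dots,j_k}\,\triangle\,H_n(\Upsilon^n_{j_1,\dots,j_k})\right)<\frac{\varepsilon}{3}\,V^{-1}\left(2\max\{1,\|g\|_\infty\}\right)^{-1}.$$
Under these circumstances we have
$$\sum_{\substack{j_1,\dots,j_k:\\ R_0(\Upsilon^0_{j_l})\le N,\ l=1,\dots,k}}\ \int_{\Upsilon^0_{j_1,\dots,j_k}\triangle H_n(\Upsilon^n_{j_1,\dots,j_k})}\left|(g\circ f_n^{\ell}\circ F_n^k)^{\bullet}\circ H_n^{-1}-(g\circ f_0^{\ell}\circ F_0^k)^{\bullet}\right|\mathbf 1_{\Lambda_0\cap\Lambda_n^0}\,d\mathrm{Leb}_{\hat\gamma_0}<\frac{\varepsilon}{3}.$$

Step (3). For each $i=1,\dots,k$, let $\tau_{j_i}=R_0(\Upsilon^0_{j_i})$. In each set $\Upsilon^0_{j_1,\dots,j_k}\cap\Upsilon^n_{j_1,\dots,j_k}$ we have that $F_0^k=f_0^{\tau_1+\dots+\tau_k}$ and $F_n^k=f_n^{\tau_1+\dots+\tau_k}$. Since $M$ is compact, each $f_n$ is $C^k$ and $f_n\to f_0$, as $n\to\infty$, in the $C^k$ topology, then
• there exists $\vartheta>0$ such that $|z-\zeta|<\vartheta\Rightarrow|g(z)-g(\zeta)|<\frac{\varepsilon}{3}V^{-1}$;
• there exists $J_1$ such that for all $n>J_1$ and $z\in M$ we have $\max\left\{|f_0(z)-f_n(z)|,\dots,|f_0^{kN+\ell}(z)-f_n^{kN+\ell}(z)|\right\}<\frac{\vartheta}{2}$;
• there exists $\eta>0$ such that for all $z,\zeta\in M$ and $f\in\mathcal F$, $|z-\zeta|<\eta\Rightarrow\max\left\{|f(z)-f(\zeta)|,\dots,|f^{kN+\ell}(z)-f^{kN+\ell}(\zeta)|\right\}<\frac{\vartheta}{2}$.
Furthermore, according to (U3),
• there is $J_2$ such that for every $n>J_2$ and $x\in\Lambda_0\cap\Lambda_n^0$ we have $\left\|\gamma_n^s(x)-\gamma_0^s(x)\right\|_{C^1}<\eta$.
Let $n>\max\{J_1,J_2\}$, $z\in\gamma_0^s(x)$ and take $\zeta\in\gamma_n^s(x)$ such that $|z-\zeta|<\eta$. This together with the choices of $\eta$ and $J_1$ implies
$$\left|f_0^{\ell}\circ F_0^k(z)-f_n^{\ell}\circ F_n^k(\zeta)\right|\le\left|f_0^{\tau_1+\dots+\tau_k+\ell}(z)-f_0^{\tau_1+\dots+\tau_k+\ell}(\zeta)\right|+\left|f_0^{\tau_1+\dots+\tau_k+\ell}(\zeta)-f_n^{\tau_1+\dots+\tau_k+\ell}(\zeta)\right|<\vartheta/2+\vartheta/2=\vartheta.$$
Finally, the above considerations and the choice of $\vartheta$ allow us to conclude that for every $n>\max\{J_1,J_2\}$, $x\in\Lambda_0\cap\Lambda_n^0$ and $z\in\gamma_0^s(x)$, there exists $\zeta\in\gamma_n^s(x)$ such that
$$\left|g(f_n^{\ell}\circ F_n^k(\zeta))-g(f_0^{\ell}\circ F_0^k(z))\right|<\frac{\varepsilon}{3}V^{-1}.\qquad(3.8)$$
Attending to (2.4), (3.8) and the fact that we can interchange the roles of $z$ and $\zeta$ in the latter, we obtain that for every $n>\max\{J_1,J_2\}$,
$$\left|(g\circ f_n^{\ell}\circ F_n^k)^{\bullet}\circ H_n^{-1}-(g\circ f_0^{\ell}\circ F_0^k)^{\bullet}\right|<\frac{\varepsilon}{3}V^{-1},$$
from which we deduce that
$$\sum_{\substack{j_1,\dots,j_k:\\ R_0(\Upsilon^0_{j_l})\le N,\ 1\le l\le k}}\ \int_{\Upsilon^0_{j_1,\dots,j_k}\cap H_n(\Upsilon^n_{j_1,\dots,j_k})}\left|(g\circ f_n^{\ell}\circ F_n^k)^{\bullet}\circ H_n^{-1}-(g\circ f_0^{\ell}\circ F_0^k)^{\bullet}\right|\mathbf 1_{\Lambda_0\cap\Lambda_n^0}\,d\mathrm{Leb}_{\hat\gamma_0}<\frac{\varepsilon}{3}.$$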
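The consecutive return times of (3.7) and the union bound of Lemma 3.7 can be illustrated on a toy model. The sketch below is not the setting of this paper: it assumes a hypothetical full-branch, piecewise linear induced map on $(0,1)$ with $R=j$ on $[2^{-j},2^{-j+1})$, for which Lebesgue measure is invariant, distortion is trivial ($C_1=1$) and successive return times are independent. All function names are illustrative.

```python
from fractions import Fraction


def return_time(z):
    """Toy return time (an assumption): R(z) = j on [2**-j, 2**-(j-1))."""
    if not 0 < z < 1:
        raise ValueError("z must lie in (0, 1)")
    j = 1
    while z < Fraction(1, 2**j):
        j += 1
    return j


def induced_map(z):
    """Toy induced map F = f^R, with f the doubling map: F(z) = 2**R(z) * z - 1."""
    return 2 ** return_time(z) * z - 1


def consecutive_return_times(z, k):
    """Return [R^1(z), ..., R^k(z)] as in (3.7): R^t(z) = R(F^(t-1)(z))."""
    times = []
    for _ in range(k):
        times.append(return_time(z))
        z = induced_map(z)
    return times


def tail_measure(N):
    """Lebesgue measure of {R > N} in the toy model: sum over j > N of 2**-j."""
    return Fraction(1, 2**N)


def measure_some_return_exceeds(k, N):
    """Exact measure of {z : R^t(z) > N for some t in 1..k} in the toy model.

    The return times are independent here, so the complement has
    measure (1 - 2**-N)**k.
    """
    return 1 - (1 - tail_measure(N)) ** k


if __name__ == "__main__":
    print(consecutive_return_times(Fraction(5, 7), 4))
    k, N = 3, 2
    print(measure_some_return_exceeds(k, N), "<=", k * tail_measure(N))
```

In this toy case the left-hand side of Lemma 3.7 equals $1-(1-2^{-N})^k$, which is bounded by $k\,2^{-N}$, matching the form $kC_1\bar m(\{R>N\})$ with $C_1=1$.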
Lemma 3.9. Given $\ell,k\in\mathbb N$ and $\varepsilon>0$ there exists $J\in\mathbb N$ such that for every $n>J$,
$$\int\left|(\mathbf 1_{\{R_n>\ell\}}\circ F_n^k)^{\bullet}\circ H_n^{-1}-(\mathbf 1_{\{R_0>\ell\}}\circ F_0^k)^{\bullet}\right|\mathbf 1_{\Lambda_0\cap\Lambda_n^0}\,d\mathrm{Leb}_{\hat\gamma_0}<\varepsilon.$$

Proof. As in the proof of Lemma 3.8, we divide the argument into three steps.
(1) The condition on $N$: Consider the set
$$L_1=\left\{x\in\Lambda_0\cap\Lambda_n^0:\ \exists t\in\{1,\dots,k+1\}\text{ such that }R_0^t(x)>N\text{ or }R_n^t(x)>N\right\}.$$
From Lemma 3.7 we have
$$|L_1|\le(k+1)C_1\left(\mathrm{Leb}_{\hat\gamma_0}(\{R_0>N\})+\mathrm{Leb}_{\hat\gamma_n}(\{R_n>N\})\right).$$
So we choose $N$ large enough so that
$$2\|g\|_\infty(k+1)C_1\left(\mathrm{Leb}_{\hat\gamma_0}(\{R_0>N\})+\mathrm{Leb}_{\hat\gamma_n}(\{R_n>N\})\right)<\frac{\varepsilon}{3},$$
which implies that
$$\int_{L_1}\left|(\mathbf 1_{\{R_n>\ell\}}\circ F_n^k)^{\bullet}\circ H_n^{-1}-(\mathbf 1_{\{R_0>\ell\}}\circ F_0^k)^{\bullet}\right|\mathbf 1_{\Lambda_0\cap\Lambda_n^0}\,d\mathrm{Leb}_{\hat\gamma_0}<\frac{\varepsilon}{3}.$$
(2) Let as before $V=V(N,k+1)$ be the total number of sets $\Upsilon_{j_1,\dots,j_{k+1}}$ such that $R(\Upsilon_{j_i})\le N$ for all $i=1,\dots,k+1$. Now, using (U4), we may choose $J$ so that for every $n>J$ and $\Upsilon^0_{j_1,\dots,j_{k+1}}$ such that $R_0(\Upsilon^0_{j_i})\le N$ for all $i=1,\dots,k+1$, the corresponding $\Upsilon^n_{j_1,\dots,j_{k+1}}$ is such that
$$\mathrm{Leb}_{\hat\gamma_0}\left(\Upsilon^0_{j_1,\dots,j_{k+1}}\,\triangle\,H_n(\Upsilon^n_{j_1,\dots,j_{k+1}})\right)<\frac{\varepsilon}{3}\,V^{-1}\left(2\max\{1,\|g\|_\infty\}\right)^{-1}.$$
Let $L_2=\Upsilon^0_{j_1,\dots,j_{k+1}}\,\triangle\,H_n(\Upsilon^n_{j_1,\dots,j_{k+1}})$ and observe that
$$\sum_{\substack{j_1,\dots,j_{k+1}:\\ R_0(\Upsilon^0_{j_l})\le N,\ l=1,\dots,k+1}}\ \int_{L_2}\left|(\mathbf 1_{\{R_n>\ell\}}\circ F_n^k)^{\bullet}\circ H_n^{-1}-(\mathbf 1_{\{R_0>\ell\}}\circ F_0^k)^{\bullet}\right|\mathbf 1_{\Lambda_0\cap\Lambda_n^0}\,d\mathrm{Leb}_{\hat\gamma_0}<\frac{\varepsilon}{3}.$$
(3) At last, notice that in each set $\Upsilon^0_{j_1,\dots,j_{k+1}}\cap H_n(\Upsilon^n_{j_1,\dots,j_{k+1}})$ we have
$$(\mathbf 1_{\{R_n>\ell\}}\circ F_n^k)^{\bullet}\circ H_n^{-1}-(\mathbf 1_{\{R_0>\ell\}}\circ F_0^k)^{\bullet}=0,$$
which gives the result.

4. Entropy Continuity

In Proposition 2.7 we have seen that the SRB entropy can be written just in terms of the quotient dynamics. Our aim now is to show that the integrals appearing in that formula are close for nearby dynamics, and this is the content of Proposition 4.4. Notice that since the integrands are not necessarily continuous functions, the continuity of the integrals is not an immediate consequence of the statistical stability.
4.1. Auxiliary results.

Lemma 4.1. Let $(\varphi_n)_{n\in\mathbb N}$ be a bounded sequence of $m$-measurable functions defined on $M$ belonging to $L^\infty(m)$. If $\varphi_n\to\varphi$ in the $L^1(m)$-norm and $\psi\in L^1(m)$, then $\int\psi(\varphi_n-\varphi)\,dm\to0$, when $n\to\infty$.

Proof. Take any $\varepsilon>0$. Let $C>0$ be an upper bound for $\|\varphi_n\|_\infty$. Since $\psi\in L^1(m)$, there is $\delta>0$ such that for any Borel set $B\subset M$,
$$m(B)<\delta\ \Rightarrow\ \int_B|\psi|\,dm<\frac{\varepsilon}{4C}.\qquad(4.1)$$
Define for each $n\ge1$,
$$B_n=\left\{x\in M:\ |\varphi_n(x)-\varphi(x)|>\frac{\varepsilon}{2\|\psi\|_1}\right\}.$$
Since $\|\varphi_n-\varphi\|_1\to0$ when $n\to\infty$, there is $n_0\in\mathbb N$ such that $m(B_n)<\delta$ for every $n\ge n_0$. Taking into account the definition of $B_n$, we may write
$$\int|\psi||\varphi_n-\varphi|\,dm=\int_{B_n}|\psi||\varphi_n-\varphi|\,dm+\int_{M\setminus B_n}|\psi||\varphi_n-\varphi|\,dm\le2C\int_{B_n}|\psi|\,dm+\frac{\varepsilon}{2\|\psi\|_1}\int_{M\setminus B_n}|\psi|\,dm.$$
Then, using (4.1), this last sum is upper bounded by $\varepsilon$, as long as $n\ge n_0$.

Lemma 4.2. There is $C_2>0$ such that $\log J\bar F_n\le C_2R_n$ for every $n\ge0$.

Proof. Define $L_n=\max_{x\in M}\{|\det Df_n^u(x)|\}$, for each $n\ge0$. By the compactness of $M$ and the continuity on the first order derivative, there is $L>1$ such that $L_n\le L$ for all $n\ge0$. We have
$$|\det D(F_n)^u(x)|=\prod_{j=0}^{R_n(x)-1}|\det Df_n^u(f_n^j(x))|\le L^{R_n(x)}.$$
By (2.3) it follows that
$$\log J(F_n)(x)=\log|\det DF_n^u(x)|+\log\hat u(F_n(x))-\log\hat u(x).$$
Observing that by (P3)(a) it follows that $|\log\hat u(F_n(x))-\log\hat u(x)|\le2C\beta^0=2C$, we have
$$\log J(F_n)(x)\le R_n(x)\log L+2C.$$
To conclude, we take $C_2=\log L+2C$.

Lemma 4.3. Given $\varepsilon>0$, there is $J\in\mathbb N$ such that for all $n>J$,
$$\int_{\Lambda_n^0\cap\Lambda_0}|R_n-R_0|\,d\mathrm{Leb}_{\hat\gamma_0}\le\varepsilon.$$
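Proof. Let $\varepsilon>0$ be given. Using condition (U5) and Remark 1.2, take $N\ge1$ and $J=J(N,\varepsilon)>0$ in such a way that $\sum_{j=N}^{\infty}j\,\mathrm{Leb}_{\hat\gamma_0}\{R_n=j\}<\varepsilon/3$ and $\sum_{j=N}^{\infty}j\,\mathrm{Leb}_{\hat\gamma_0}\{R_0=j\}<\varepsilon/3$. Since
$$R_n=\sum_{j=0}^{\infty}\mathbf 1_{\{R_n>j\}},$$
we may write
$$\|R_n-R_0\|_1=\left\|\Big(R_n-\sum_{j=0}^{N-1}\mathbf 1_{\{R_n>j\}}\Big)+\sum_{j=0}^{N-1}\big(\mathbf 1_{\{R_n>j\}}-\mathbf 1_{\{R_0>j\}}\big)+\Big(\sum_{j=0}^{N-1}\mathbf 1_{\{R_0>j\}}-R_0\Big)\right\|_1$$
$$\le\sum_{j=N}^{\infty}\left\|\mathbf 1_{\{R_n>j\}}\right\|_1+\sum_{j=0}^{N-1}\left\|\mathbf 1_{\{R_n>j\}}-\mathbf 1_{\{R_0>j\}}\right\|_1+\sum_{j=N}^{\infty}\left\|\mathbf 1_{\{R_0>j\}}\right\|_1$$
$$=\sum_{j=N}^{\infty}\left\|\mathbf 1_{\{R_n>j\}}\right\|_1+\sum_{j=0}^{N-1}\left\|\mathbf 1_{\{R_n\le j\}}-\mathbf 1_{\{R_0\le j\}}\right\|_1+\sum_{j=N}^{\infty}\left\|\mathbf 1_{\{R_0>j\}}\right\|_1.$$
By the choices of $N$ and $J$, the first and third terms in this last sum are smaller than $\varepsilon/3$. By (U4), increasing $J$ if necessary, we can make $\mathrm{Leb}_{\hat\gamma_0}(\{R_n=j\}\,\triangle\,\{R_0=j\})$ sufficiently small in order to have the second term smaller than $\varepsilon/3$.

4.2. Convergence of metric entropies.

Our aim is to show that $h_{\mu_n}\to h_{\mu_0}$ as $n\to\infty$, which by Proposition 2.7 can be rewritten as
$$\sigma_n^{-1}\int_{\bar\Lambda_n}\log J\bar F_n\,d\bar\mu_n\longrightarrow\sigma_0^{-1}\int_{\bar\Lambda_0}\log J\bar F_0\,d\bar\mu_0,\quad\text{as }n\to\infty.\qquad(4.2)$$
Observing that $\sigma_n=\int_{\Lambda_n}R_n\,d\tilde\mu_n=\mu_n^*(M)$, then by Proposition 3.6 we have $\sigma_n\to\sigma_0$, as $n\to\infty$. Hence, (4.2) is a consequence of the next result.

Proposition 4.4.
$$\int_{\bar\Lambda_n}\log J\bar F_n\,d\bar\mu_n\longrightarrow\int_{\bar\Lambda_0}\log J\bar F_0\,d\bar\mu_0\quad\text{as }n\to\infty.$$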
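Proof. The convergence above will follow if we show that the following term is arbitrarily small for large $n\in\mathbb N$:
$$E:=\left|\int_{\Lambda_n}(\log J\bar F_n\circ\hat\pi_n)(\bar\rho_n\circ\hat\pi_n)\,d\mathrm{Leb}_{\hat\gamma_n}-\int_{\Lambda_0}(\log J\bar F_0\circ\hat\pi_0)\,\varrho_0\,d\mathrm{Leb}_{\hat\gamma_0}\right|.$$
Recall that $\varrho_0=\bar\rho_0\circ\hat\pi_0$ and $\varrho_n=\bar\rho_n\circ\hat\pi_n\circ H_n^{-1}$, for every $n\in\mathbb N$. Define
$$E_0:=\left|\int_{\Lambda_n^0\cap\Lambda_0}(\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1})\,\varrho_n\,d(H_n)_*\mathrm{Leb}_{\hat\gamma_n}-\int_{\Lambda_n^0\cap\Lambda_0}(\log J\bar F_0\circ\hat\pi_0)\,\varrho_0\,d\mathrm{Leb}_{\hat\gamma_0}\right|.$$
By Lemmas 2.2 and 4.2 we have
$$E\le E_0+KC_2\int_{\Lambda_n\setminus\Lambda_0^n}R_n\,d\mathrm{Leb}_{\hat\gamma_n}+KC_2\int_{\Lambda_0\setminus\Lambda_n^0}R_0\,d\mathrm{Leb}_{\hat\gamma_0}.$$
Since $R_0\in L^1(\mathrm{Leb}_{\hat\gamma_0})$, then, by (U2) and Remark 1.2, for large $n$, we may have $\mathrm{Leb}_{\hat\gamma_0}(\Lambda_0\,\triangle\,\Lambda_n^0)$ small so that $\int_{\Lambda_0\setminus\Lambda_n^0}R_0\,d\mathrm{Leb}_{\hat\gamma_0}$ becomes negligible. Now, for each $N\in\mathbb N$,
$$\int_{\Lambda_n\setminus\Lambda_0^n}R_n\,d\mathrm{Leb}_{\hat\gamma_n}\le N\int_{\Lambda_n\setminus\Lambda_0^n}d\mathrm{Leb}_{\hat\gamma_n}+\int_{\{R_n>N\}}R_n\,d\mathrm{Leb}_{\hat\gamma_n}.$$
Using condition (U5) we may choose $N$ so that for all $n\in\mathbb N$ large enough the quantity $\int_{\{R_n>N\}}R_n\,d\mathrm{Leb}_{\hat\gamma_n}=\sum_{j=N+1}^{\infty}j\,\mathrm{Leb}_{\hat\gamma_n}\{R_n=j\}$ is arbitrarily small. Again, using (U2), if $n\in\mathbb N$ is sufficiently large then $\int_{\Lambda_n\setminus\Lambda_0^n}d\mathrm{Leb}_{\hat\gamma_n}$ is as small as we want. Therefore, we are reduced to estimating $E_0$.

Note that by definition $\Lambda_n^0\subset\Lambda_0$. Having this in mind, we split $E_0$ into the next three terms that we call $E_1$, $E_2$, $E_3$ respectively:
$$E_0\le\left|\int_{\Lambda_n^0}(\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1})\,\varrho_n\,d(H_n)_*\mathrm{Leb}_{\hat\gamma_n}-\int_{\Lambda_n^0}(\log J\bar F_0\circ\hat\pi_0)\,\varrho_n\,d(H_n)_*\mathrm{Leb}_{\hat\gamma_n}\right|$$
$$\qquad+\left|\int_{\Lambda_n^0}(\log J\bar F_0\circ\hat\pi_0)\,\varrho_n\,d(H_n)_*\mathrm{Leb}_{\hat\gamma_n}-\int_{\Lambda_n^0}(\log J\bar F_0\circ\hat\pi_0)\,\varrho_n\,d\mathrm{Leb}_{\hat\gamma_0}\right|$$
$$\qquad+\left|\int_{\Lambda_n^0}(\log J\bar F_0\circ\hat\pi_0)\,\varrho_n\,d\mathrm{Leb}_{\hat\gamma_0}-\int_{\Lambda_n^0}(\log J\bar F_0\circ\hat\pi_0)\,\varrho_0\,d\mathrm{Leb}_{\hat\gamma_0}\right|.$$
Concerning $E_2$, using Lemma 2.2 and Lemma 4.2 we have
$$E_2\le\int_{\Lambda_n^0}|\log J\bar F_0|\,|\varrho_n|\left|\frac{d(H_n)_*\mathrm{Leb}_{\hat\gamma_n}}{d\mathrm{Leb}_{\hat\gamma_0}}-1\right|d\mathrm{Leb}_{\hat\gamma_0}\le KC_2\int_{\Lambda_n^0}R_0\left|\frac{d(H_n)_*\mathrm{Leb}_{\hat\gamma_n}}{d\mathrm{Leb}_{\hat\gamma_0}}-1\right|d\mathrm{Leb}_{\hat\gamma_0}.$$
Now, Remark 1.2 and Lemma 4.1 guarantee that $E_2$ can be made arbitrarily small for large $n\in\mathbb N$. Using Corollary 3.5, $E_3$ can also be made small for large $n$. We are left with $E_1$. By Lemma 2.2 and Remark 1.2 we only need to control
$$\int_{\Lambda_n^0\cap\Lambda_0}\left|(\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1})-(\log J\bar F_0\circ\hat\pi_0)\right|d\mathrm{Leb}_{\hat\gamma_0},$$
whose estimation we leave to Lemma 4.6.

Remark 4.5. Assume that $\gamma_n$ is a compact unstable manifold of the map $f_n$ for $n\ge0$ and $\gamma_n\to\gamma_0$, in the $C^1$ topology. The convergence of $f_n$ to $f_0$ in the $C^1$ topology ensures that given $\ell\in\mathbb N$ and $\epsilon>0$ there exist $\delta=\delta(\ell,\epsilon)>0$ and $J=J(\delta)\in\mathbb N$ such that for every $n>J$, $x\in\gamma_0$ and $y\in\gamma_n$ with $|x-y|<\delta$,
$$\max_{j=1,\dots,\ell}\left\{|f_n^j(y)-f_0^j(x)|,\ \left|\log\det(Df_n^j)^u(y)-\log\det(Df_0^j)^u(x)\right|\right\}<\epsilon.$$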
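Lemma 4.6. Given any $\varepsilon>0$ there exists $J\in\mathbb N$ such that for every $n>J$,
$$\int_{\Lambda_n^0\cap\Lambda_0}\left|(\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1})-(\log J\bar F_0\circ\hat\pi_0)\right|d\mathrm{Leb}_{\hat\gamma_0}<\varepsilon.$$

Proof. Let $\varepsilon>0$ be given. For $n,N\in\mathbb N$ define $A_{n,N}=\{R_n\le N\}\cap\{R_0\le N\}$ and $A^c_{n,N}=\{R_n>N\}\cup\{R_0>N\}$. By Lemma 4.2 we have
$$\int_{\Lambda_n^0\cap A^c_{n,N}}\left|(\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1})-(\log J\bar F_0\circ\hat\pi_0)\right|d\mathrm{Leb}_{\hat\gamma_0}\le C_2\int_{\Lambda_n^0\cap A^c_{n,N}}R_n\,d\mathrm{Leb}_{\hat\gamma_0}+C_2\int_{\Lambda_n^0\cap A^c_{n,N}}R_0\,d\mathrm{Leb}_{\hat\gamma_0}.$$
Since $R_0\in L^1(\mathrm{Leb}_{\hat\gamma_0})$, there is $\delta>0$ such that if a measurable set $A$ has $\mathrm{Leb}_{\hat\gamma_0}(A)<\delta$, then $\int_AR_0\,d\mathrm{Leb}_{\hat\gamma_0}<\varepsilon/(4C_2)$. According to (U5), we may pick $N\in\mathbb N$ and choose $J\in\mathbb N$ such that for every $n>J$ we get $\mathrm{Leb}_{\hat\gamma_0}(A^c_{n,N})<\delta$. This implies that the second term on the right hand side of the inequality above is smaller than $\varepsilon/4$. The same argument and Lemma 4.3 allow us to conclude that for a convenient choice of $N\in\mathbb N$ and for $J\in\mathbb N$ sufficiently large,
$$C_2\int_{\Lambda_n^0\cap A^c_{n,N}}R_n\,d\mathrm{Leb}_{\hat\gamma_0}\le C_2\int_{\Lambda_n^0\cap A^c_{n,N}}R_0\,d\mathrm{Leb}_{\hat\gamma_0}+C_2\int_{\Lambda_n^0}|R_n-R_0|\,d\mathrm{Leb}_{\hat\gamma_0}\le\frac{\varepsilon}{4}.$$
So, assuming that $N$ has been chosen and $J$ is sufficiently large so that
$$\int_{\Lambda_n^0\cap A^c_{n,N}}\left|(\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1})-(\log J\bar F_0\circ\hat\pi_0)\right|d\mathrm{Leb}_{\hat\gamma_0}\le\varepsilon/2,$$
we are left to deal with
$$\int_{\Lambda_n^0\cap A_{n,N}}\left|(\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1})-(\log J\bar F_0\circ\hat\pi_0)\right|d\mathrm{Leb}_{\hat\gamma_0}$$
$$\le\sum_{i:\,R_0(\Upsilon_i^0)\le N}\int_{\Upsilon_i^0\cap\Upsilon_i^n}\left|(\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1})-(\log J\bar F_0\circ\hat\pi_0)\right|\mathbf 1_{\Lambda_n^0\cap\Lambda_0}\,d\mathrm{Leb}_{\hat\gamma_0}$$
$$\quad+\sum_{i:\,R_0(\Upsilon_i^0)\le N}\int_{\Upsilon_i^0\triangle\Upsilon_i^n}\left|(\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1})-(\log J\bar F_0\circ\hat\pi_0)\right|\mathbf 1_{\Lambda_n^0\cap\Lambda_0\cap A_{n,N}}\,d\mathrm{Leb}_{\hat\gamma_0}.$$
Denote by $S_1$ and $S_2$ respectively the first and second sums above, and by $v$ the number of terms in $S_1$ and $S_2$. By Lemma 4.2 we have
$$S_2\le C_2\sum_{i:\,R_0(\Upsilon_i^0)\le N}\int_{\Upsilon_i^0\triangle\Upsilon_i^n}(R_n+R_0)\,\mathbf 1_{\Lambda_n^0\cap\Lambda_0\cap A_{n,N}}\,d\mathrm{Leb}_{\hat\gamma_0}\le2C_2N\sum_{i:\,R_0(\Upsilon_i^0)\le N}\mathrm{Leb}_{\hat\gamma_0}(\Upsilon_i^0\,\triangle\,\Upsilon_i^n).$$
Hence, using (U4) we consider $J\in\mathbb N$ large enough to have $\mathrm{Leb}_{\hat\gamma_0}(\Upsilon_i^0\,\triangle\,\Upsilon_i^n)<\varepsilon/(8C_2Nv)$, and so $S_2\le\varepsilon/4$. Let $\tau_i=R_0(\Upsilon_i^0)=R_n(\Upsilon_i^n)\le N$. We want to see that for all $n$ large enough and all $x\in\Upsilon_i^0\cap\Upsilon_i^n$ with $\tau_i\le N$,
$$\left|(\log J\bar F_n\circ\hat\pi_n\circ H_n^{-1})(x)-(\log J\bar F_0\circ\hat\pi_0)(x)\right|\le\varepsilon/(4v),\qquad(4.3)$$
which yields $S_1\le\varepsilon/4$. Using (2.3) and observing that the curves $\hat\gamma_n$, $\hat\gamma_0$ are the leaves we chose to define the reference measures $\bar m_n$, $\bar m_0$, then we easily get for $y=H_n^{-1}(x)$,
$$\left|\log J\bar F_n\circ\hat\pi_n(y)-\log J\bar F_0\circ\hat\pi_0(x)\right|\le\left|\log\det(Df_n^{\tau_i})^u(y)-\log\det(Df_0^{\tau_i})^u(x)\right|+\left|\log\hat u_n(f_n^{\tau_i}(y))-\log\hat u_0(f_0^{\tau_i}(x))\right|.$$
Using Remark 4.5 with $\ell=N$ and $\varepsilon/(8v)$ instead of $\epsilon$, and recalling that $\tau_i\le N$, we may find $\delta>0$ and $J\in\mathbb N$ so that for all $n>J$,
$$\left|\log\det(Df_n^{\tau_i})^u(y)-\log\det(Df_0^{\tau_i})^u(x)\right|<\varepsilon/(8v).\qquad(4.4)$$
Observe that $|x-y|<\delta$ as long as $J$ is sufficiently large, since $x=H_n(y)$.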
For every $n,k\in\mathbb N_0$ and $t\in\Lambda_n$, let
$$\hat u_n^k(t)=\prod_{j=0}^{k}\frac{\det Df_n^u(f_n^j(t))}{\det Df_n^u(f_n^j(\hat t))}.$$
By definition of $\hat u_n$ (see (2.1)) and by (P3)(a), there is $k\in\mathbb N$ such that for every $n\in\mathbb N_0$ and $t\in\Lambda_n$ we have $|\log\hat u_n(t)-\log\hat u_n^k(t)|<\varepsilon/(48v)$. Thus,
$$\left|\log\hat u_n(f_n^{\tau_i}(y))-\log\hat u_0(f_0^{\tau_i}(x))\right|\le\left|\log\hat u_n(f_n^{\tau_i}(y))-\log\hat u_n^k(f_n^{\tau_i}(y))\right|+\left|\log\hat u_n^k(f_n^{\tau_i}(y))-\log\hat u_0^k(f_0^{\tau_i}(x))\right|+\left|\log\hat u_0^k(f_0^{\tau_i}(x))-\log\hat u_0(f_0^{\tau_i}(x))\right|$$
$$\le\sum_{j=0}^{k}\left|\log\det Df_n^u(f_n^j(\zeta))-\log\det Df_0^u(f_0^j(z))\right|+\sum_{j=0}^{k}\left|\log\det Df_n^u(f_n^j(\hat\zeta))-\log\det Df_0^u(f_0^j(\hat z))\right|+\frac{\varepsilon}{24v},$$
where $z=f_0^{\tau_i}(x)$, $\zeta=f_n^{\tau_i}(y)$, $\hat z$ is the only point on the set $\gamma_0^s(z)\cap\hat\gamma_0$ and $\hat\zeta$ is the unique point on the set $\gamma_n^s(\zeta)\cap\hat\gamma_n$. Observe that since $\hat\gamma_n\to\hat\gamma_0$ and $f_n\to f_0$ in the $C^1$ topology, and $\tau_i\le N$, then $\gamma_n^u(\zeta)\to\gamma_0^u(z)$, in the $C^1$ topology. Besides, using Lemma 3.3 we also have $|\hat z-\hat\zeta|$ as small as we want for $J$ large enough. Consequently, by Remark 4.5, we may find $J\in\mathbb N$ sufficiently large so that for all $n>J$, we have
$$\sum_{j=0}^{k}\left|\log\det Df_n^u(f_n^j(\zeta))-\log\det Df_0^u(f_0^j(z))\right|<\varepsilon/(24v),\qquad(4.5)$$
and
$$\sum_{j=0}^{k}\left|\log\det Df_n^u(f_n^j(\hat\zeta))-\log\det Df_0^u(f_0^j(\hat z))\right|<\varepsilon/(24v).\qquad(4.6)$$
Estimates (4.4), (4.5) and (4.6) yield (4.3).
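The tail splitting behind Lemma 4.3, which is reused in the proofs of Proposition 4.4 and Lemma 4.6, rests on the identity $R=\sum_{j\ge0}\mathbf 1_{\{R>j\}}$ for integer-valued return times. The following sketch checks the resulting three-term bound on a finite weighted sample. The sample values, the weights and the function name are illustrative assumptions, not data from the paper.

```python
def l1_split(Rn, R0, weights, N):
    """Return (lhs, tail_n, middle, tail_0) for integer-valued Rn, R0.

    lhs    = ||Rn - R0||_1
    tail_n = sum over j >= N of ||1_{Rn > j}||_1
    middle = sum over j <  N of ||1_{Rn > j} - 1_{R0 > j}||_1
    tail_0 = sum over j >= N of ||1_{R0 > j}||_1
    The L1 norms are taken with respect to the given point weights.
    """
    lhs = sum(w * abs(a - b) for a, b, w in zip(Rn, R0, weights))
    # For an integer a, the number of j >= N with a > j is max(a - N, 0).
    tail_n = sum(w * max(a - N, 0) for a, w in zip(Rn, weights))
    tail_0 = sum(w * max(b - N, 0) for b, w in zip(R0, weights))
    middle = sum(
        w * abs(int(a > j) - int(b > j))
        for j in range(N)
        for a, b, w in zip(Rn, R0, weights)
    )
    return lhs, tail_n, middle, tail_0


if __name__ == "__main__":
    lhs, tail_n, middle, tail_0 = l1_split(
        Rn=[1, 3, 5], R0=[2, 3, 1], weights=[0.25, 0.5, 0.25], N=2
    )
    print(lhs, "<=", tail_n + middle + tail_0)
```

In the proof, the two tails are controlled uniformly in $n$ through (U5), while the middle term involves only finitely many levels $j<N$ and is handled through (U4).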
References
[A] Alves, J.F.: Strong statistical stability of non-uniformly expanding maps. Nonlinearity 17(4), 1193–1215 (2004)
[ACF] Alves, J.F., Carvalho, V., Freitas, J.M.: Statistical stability for Hénon maps of Benedicks-Carleson type. Ann. Inst. H. Poincaré Anal. Non Linéaire 27, 595–637 (2010). doi:10.1016/j.anihpc.2009.09.009
[AOT] Alves, J.F., Oliveira, K., Tahzibi, A.: On the continuity of the SRB entropy for endomorphisms. J. Stat. Phys. 123(4), 763–785 (2006)
[AV] Alves, J.F., Viana, M.: Statistical stability for robust classes of maps with non-uniform expansion. Ergod. Th. & Dyn. Sys. 22, 1–32 (2002)
[BC] Benedicks, M., Carleson, L.: The dynamics of the Hénon map. Ann. Math. 133, 73–169 (1991)
[BDV] Bonatti, C., Díaz, L.J., Viana, M.: Dynamics Beyond Uniform Hyperbolicity. Berlin-Heidelberg-New York: Springer-Verlag, 2005
[B] Bowen, R.: Equilibrium States and Ergodic Theory of Anosov Diffeomorphisms. Volume 470 of Lecture Notes in Mathematics, Berlin-Heidelberg-New York: Springer-Verlag, 1975
[BV1] Benedicks, M., Viana, M.: Solution of the basin problem for Hénon-like attractors. Invent. Math. 143, 375–434 (2001)
[BV2] Benedicks, M., Viana, M.: Random perturbations and statistical properties of Hénon-like maps. Ann. Inst. H. Poincaré Anal. Non Linéaire 23(5), 713–752 (2006)
[BY1] Benedicks, M., Young, L.S.: Sinai-Bowen-Ruelle measures for certain Hénon maps. Invent. Math. 112, 541–576 (1993)
[BY2] Benedicks, M., Young, L.S.: Markov extensions and decay of correlations for certain Hénon maps. Astérisque 261, 13–56 (2000)
[C] Contreras, G.: Regularity of topological and metric entropy of hyperbolic flows. Math. Z. 210(1), 97–111 (1992)
[F] Freitas, J.M.: Continuity of SRB measure and entropy for Benedicks-Carleson quadratic maps. Nonlinearity 18, 831–854 (2005)
[FT] Freitas, J.M., Todd, M.: Statistical stability of equilibrium states for interval maps. Nonlinearity 22, 259–281 (2009)
[LY1] Ledrappier, F., Young, L.S.: The metric entropy of diffeomorphisms. I. Characterization of measures satisfying Pesin's entropy formula. Ann. of Math. (2) 122(3), 509–539 (1985)
[LY2] Ledrappier, F., Young, L.S.: The metric entropy of diffeomorphisms. II. Relations between entropy, exponents and dimension. Ann. of Math. (2) 122(3), 540–574 (1985)
[M] Mañé, R.: The Hausdorff dimension of horseshoes of diffeomorphisms of surfaces. Bol. Soc. Bras. Mat, Nova Série 20(2), 1–24 (1990)
[N] Newhouse, S.: Continuity properties of entropy. Ann. Math. 129(2), 215–235 (1989)
[O] Oseledets, V.I.: A multiplicative ergodic theorem. Lyapunov characteristic numbers for dynamical systems. Trans. Moscow. Math. Soc. 19, 197–231 (1968)
[P] Pollicott, M.: Zeta functions and analyticity of metric entropy for Anosov systems. Israel J. Math. 75, 257–263 (1991)
[RS] Rychlik, M., Sorets, E.: Regularity and other properties of absolutely continuous invariant measures for the quadratic family. Commun. Math. Phys. 150, 217–236 (1992)
[R] Ruelle, D.: Differentiation of SRB states. Commun. Math. Phys. 187(1), 227–241 (1997)
[T1] Thunberg, H.: Unfolding of chaotic unimodal maps and the parameter dependence of natural measures. Nonlinearity 14, 323–337 (2001)
[T2] Tsujii, M.: On continuity of Bowen-Ruelle-Sinai measures in families of one dimensional maps. Commun. Math. Phys. 177, 1–11 (1996)
[V] Vásquez, C.H.: Statistical stability for diffeomorphisms with dominated splitting. Erg. Th. Dyn. Sys. 27(1), 253–283 (2007)
[Y1] Yomdin, Y.: Volume growth and entropy. Israel J. Math. 57(3), 285–300 (1987)
[Y2] Young, L.S.: Statistical properties of dynamical systems with some hyperbolicity. Ann. Math. 147, 585–650 (1998)
Communicated by G. Gallavotti
Commun. Math. Phys. 296, 769–826 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1016-9
Communications in
Mathematical Physics
Equilibrium Fluctuations for a Model of Coagulating-Fragmenting Planar Brownian Particles Mojtaba Ranjbar1 , Fraydoun Rezakhanlou2, 1 Mathematics and Computer Science Faculty, Amirkabir University, Tehran, Iran 2 Department of Mathematics, University of California, Berkeley,
California 94720–3830, USA. E-mail:
[email protected] Received: 29 May 2009 / Accepted: 24 November 2009 Published online: 18 February 2010 – © The Author(s) 2010. This article is published with open access at Springerlink.com
Abstract: We study a model of mass-bearing coagulating-fragmenting planar Brownian particles. Coagulation occurs when two particles are within a distance of order ε. Our model is macroscopically described by an inhomogeneous Smoluchowski's equation in the low ε limit provided that the initial number of particles N is of order | log ε|. When a detailed balance condition is satisfied, we establish a central limit theorem by showing that in the low ε limit, the particle density fluctuation fields obey an Ornstein-Uhlenbeck stochastic differential equation.

1. Introduction

One of the main purposes of statistical mechanics is to explain the macroscopic behavior of various phenomena in terms of the statistics of their microscopic structures. Macroscopically we often have a PDE involving a small number of parameters. The microscopic description however involves a large number of components that are evolving by either deterministic or stochastic rules. Let us name three reasons to justify our interest in understanding the connection between the microscopic and macroscopic descriptions.

As our first reason, we remark that historically the macroscopic equation is formally derived from the microscopic description of the phenomenon under study. It is an important task of statistical mechanics to justify such a derivation rigorously and verify the validity of the macroscopic PDE.

For our second reason, we note that we often have simple dynamical rules for the microscopic model and the main challenging feature of the model has to do with its large size. On the other hand, the macroscopic evolution involves only a few variables but is dictated by rather sophisticated nonlinear rules. It is the case for many examples that the macroscopic equation is not fully understood. Hopefully by exploring its relation with its microscopic counterpart we may discover new tools and techniques for the macroscopic equation.

This work is supported in part by NSF grant DMS-0707890.
M. Ranjbar, F. Rezakhanlou
As our third reason, we should mention that even though the macroscopic equation is preferred because of its dependence on a small number of variables, it is only a reduced description of the microscopic phenomenon at hand and we would like to find practical ways of recovering some of the lost information as we switch to the macroscopic world. Since the passage from the microscopic details to macroscopic parameters can be recast as a law of large numbers for some conserved quantities in many scenarios, probability theory suggests some standard routes for going beyond the law of large numbers and gaining new information. The celebrated central limit theorem and large deviations for classical examples are guidelines for producing some vital information for the microscopic model under study. It is the latter reason which is the chief motivation for the present article.

Our microscopic model is a system of coagulating-fragmenting Brownian particles which is macroscopically described by an inhomogeneous Smoluchowski equation. This equation is derived as a law of large numbers. The main contribution of this article is a central limit theorem for the aforementioned law of large numbers when the system is in equilibrium.

In our model, we start with $N$ particles with each particle traveling in $\mathbb R^d$ as a Brownian motion. Each particle has a size $m\in\mathbb Z$ and a radius $r\in(0,\infty)$. In fact our interpretation of the location $x$ of a particle is that $x$ is the center of a ball of radius $r$ and in some sense only a small fraction of the ball is occupied by the true particle. It turns out that in reality each particle is a cluster of smaller objects and the cluster is a complex fractal like entity that is too complicated to be treated with the existing techniques. That is why we simplify the model by replacing the cluster with a ball of radius $r(m)=m^{\chi}$ so that when $\chi>\frac1d$, we are taking into account the fact that the mass of the particle comes from a small portion of the ball which is occupied by the cluster. This may appear somewhat naive and not too realistic from a physical point of view. Nevertheless, as was explained in [HR1–3] and [R2], the model does exhibit some expected features of the underlying physics. For example, the condition $\chi<\frac1{d-2}$ guarantees that no gelation occurs in finite time. That is, no particle of size infinity is formed in finite time at the macroscopic level. We also conjecture that an instantaneous gelation would occur if $\chi>\frac1{d-2}$.

In fact the true radius of a particle is $\varepsilon r$ with $\varepsilon$ very small. A calculation involving Wiener sausages reveals that if $N=\varepsilon^{2-d}$ when $d\ge3$ and $N=|\log\varepsilon|$ when $d=2$, then the expected value of the number of times a particle coagulates with other particles in one unit of time stays positive and finite as $\varepsilon\to0$. This property allows us to obtain the Smoluchowski's equation for the evolution of cluster densities in the low $\varepsilon$ limit. To further simplify the involved mathematical technicalities, we forget about balls representing each particle and regard each particle as a point. Now the coagulation occurs stochastically only when particles of positions $x$ and $y$ and masses $m$ and $n$ satisfy $|x-y|\le c_0\varepsilon(m^{\chi}+n^{\chi})$, for a constant $c_0$.

In the preceding works [HR1–2], [R2] and [HRY], we were able to derive the macroscopic equation as a law of large numbers; if we label the locations and masses of particles as $(x_i,m_i)$, $i\in I$, then our law of large numbers asserts
$$\frac1{K_\varepsilon}\sum_{i\in I}\delta_{x_i(t)}(dx)\,\mathbb 1(m_i(t)=n)\to f_n(x,t)\,dx$$
with $f_n$ solving Smoluchowski equation (2.6) of Sect. 2. Here
$$K_\varepsilon=\begin{cases}\varepsilon^{2-d}&\text{if }d\ge3,\\ |\log\varepsilon|&\text{if }d=2.\end{cases}\qquad(1.1)$$
Equilibrium Fluctuations of Coagulating-Fragmenting Planar Brownian Particles
As a central limit theorem, we are interested in the limit of the fluctuation fields
$$\xi_n^{\varepsilon}(dx,t)=\sqrt{K_\varepsilon}\left(K_\varepsilon^{-1}\sum_i\delta_{x_i(t)}(dx)\,\mathbb 1(m_i(t)=n)-f_n(x,t)\,dx\right)\qquad(1.2)$$
as $\varepsilon\to0$. In Sect. 2, we state a conjecture regarding the evolution of $\xi_n^{\varepsilon}$ in the low $\varepsilon$ limit. According to this conjecture, the limit $\xi$ solves an Ornstein-Uhlenbeck stochastic differential equation in an infinite dimensional setting with $\xi$ living in a negative Sobolev space. The conjecture is formulated using the so-called fluctuation–dissipation principle of non-equilibrium statistical mechanics. In this article, we establish the conjecture only when the dimension is 2 and the model satisfies a detailed-balance condition. Some steps of our proof do not apply to higher dimensions. The case $d\ge3$ is more challenging and is left for a future investigation.

We continue this Introduction by mentioning some previous work related to our model. Smoluchowski's equation was introduced by Smoluchowski in the seminal work [Sm]. The first mathematically rigorous derivation of Smoluchowski's equation from a model of coagulating Brownian particles was carried out by Lang–Nguyen [LN] when $d=3$ and all particles have the same diffusion coefficient. A related problem has been studied by Sznitman [Sz] when $d=2$. A completely different approach has been employed in [HR1-2 and YRH] to treat the model in general. A thorough survey on related models and their applications can be found in Aldous [A]. In fact Open Problem 16 in [A] is exactly our central limit theorem when there is no spatial dependence. We refer to the monograph [Sp] for an introduction to related questions in statistical mechanics and a discussion of the fluctuation–dissipation principle. An equilibrium fluctuation result has been studied in [R1] for a model of the colliding particles associated with the discrete Boltzmann equation.

We end this section with an outline of the paper. In Sect. 2, we state a conjecture for the macroscopic evolution of the fluctuation fields. In Sect. 3, a family of reversible invariant measures for the microscopic model is constructed. In this section, the conjecture is restated as the main result of this paper under the assumption that the model starts from one of the reversible measures and that the dimension is 2. In Sect. 4, the strategy of the proof is described. The first step of the proof is a regularity of the coagulation term and is carried out in Sects. 5–7. The proof of the main result is completed in Sects. 8 and 9.

2. A Conjecture

We start with the description of our model. The configuration space $\Omega$ consists of pairs $\omega=(\mathbf x,\mathbf m)$ with $\mathbf x$ a subset of $\mathbb R^d$ with no accumulation point and $\mathbf m:\mathbf x\to\mathbb N=\{1,2,3,\dots\}$ a map that assigns a positive integer to each element of $\mathbf x$. Throughout this section we assume that $d\ge2$. It is often convenient to write $\omega=(x_i,m_i)_{i\in I}$ with $x_i\in\mathbb R^d$ and $m_i\in\mathbb N$, where $I=I(\omega)$ is a countable index set. We regard $\mathbf x$ as a collection of cluster positions in $\mathbb R^d$ with no accumulation point and $\mathbf m$ assigns a size to each such position. We may also identify $\omega=(\mathbf x,\mathbf m)$ as a discrete measure $\hat\omega=\sum_{i\in I}\delta_{(x_i,m_i)}$ on $\mathbb R^d\times\mathbb N$. Using this identification we equip the space $\Omega$ with the topology of vague convergence.

We now describe the evolution of coagulating and fragmenting Brownian clusters as a Markov process on the configuration space $\Omega$. For this, functions $d:\mathbb N\to(0,\infty)$, $\alpha:\mathbb N\times\mathbb N\to(0,\infty)$ and $\beta:\mathbb N\times\mathbb N\to(0,\infty)$ are given which represent the diffusion
coefficient, the coagulation rate and the fragmentation rate respectively. We assume that both $\alpha$ and $\beta$ are symmetric. Also a parameter $\chi\in[0,\infty)$ and a continuously differentiable function $V:\mathbb R^d\to[0,\infty)$ are given for our model. We then define a Markov process $\omega(t)$ with infinitesimal generator $\mathcal A=\mathcal A_0+\mathcal A_c+\mathcal A_f$, where $\mathcal A_0$ represents the "Brownian motion" part of the dynamics, and $\mathcal A_c$ and $\mathcal A_f$ represent the coagulation and fragmentation parts of the evolution. For the "Brownian motion" part, we use the representation $\omega=(x_i,m_i)_{i\in I}$ to write
$$\mathcal A_0F(\omega)=\sum_{i\in I}d(m_i)\,\Delta_{x_i}F(\omega),\qquad(2.1)$$
for any $C^2$ function $F$. Here $\Delta_{x_i}$ represents the Laplace operator which acts on the variable $x_i$.

As for the coagulation part, we write $\mathcal A_cF(\omega)=\mathcal A_c^+F(\omega)-\mathcal A_c^-F(\omega)$ with $\mathcal A_c^+$ and $\mathcal A_c^-$ given by
$$\mathcal A_c^+F(\omega)=\frac12\sum_{\substack{i,j\in I\\ i\ne j}}\alpha(m_i,m_j)\,V_\varepsilon(x_i-x_j;m_i,m_j)\left[\frac{m_i}{m_i+m_j}F(S^1_{ij}\omega)+\frac{m_j}{m_i+m_j}F(S^2_{ij}\omega)\right],$$
$$\mathcal A_c^-F(\omega)=\frac12\sum_{\substack{i,j\in I\\ i\ne j}}\alpha(m_i,m_j)\,V_\varepsilon(x_i-x_j;m_i,m_j)\,F(\omega).$$
Here,
(i) $\varepsilon>0$ is a small parameter that represents the range of interaction.
(ii) The function $V_\varepsilon(x;m,n)=\lambda(\varepsilon)V\left(\frac x\varepsilon;m,n\right)$, where
$$\lambda(\varepsilon)=\begin{cases}|\log\varepsilon|^{-1}\varepsilon^{-2}&\text{if }d=2,\\ \varepsilon^{-2}&\text{if }d\ge3,\end{cases}\qquad(2.2)$$
and
$$V(x;m,n)=r(m,n)^{-2}\,V\left(\frac{x}{r(m,n)}\right),\qquad(2.3)$$
with $r(m,n)=r(m)+r(n)$, for $r(n)=n^{\chi}$, and $V$ is a symmetric Hölder continuous function of compact support and total integral 1.
(iii) We denote by $S^1_{ij}\omega$ the configuration formed from $\omega$ by removing $x_j$ from $\mathbf x$ and assigning the size $m_i+m_j$ to the surviving cluster at $x_i$. The configuration $S^2_{ij}\omega$ is defined in the same way, except that we remove $x_i$ from $\mathbf x$ and assign the size $m_i+m_j$ to the cluster at $x_j$. We note that the cluster at $x_i$ survives the coagulation with probability $\frac{m_i}{m_i+m_j}$.

Before describing the fragmentation part of the dynamics, let us explain the form of the function $V_\varepsilon$. Note that $V_\varepsilon(x_i-x_j;m_i,m_j)\ne0$ only if $x_i-x_j$ is of order $\varepsilon(r(m_i)+r(m_j))$ with $r(m)=m^{\chi}$. This means that we regard a particle of size $m$ to be roughly a ball of radius $\varepsilon r(m)$ so that a pair of clusters of centers $x_i$ and $x_j$ coagulate when their corresponding balls overlap. If we assume that the mass of the $i$-th cluster is distributed evenly in the ball $B_{r(m_i)}(x_i)$, then we expect to have $\chi=\frac1d$. However, in reality a cluster is far from being a round ball and is expected to be a fractal like object. By allowing $\chi\in(0,\infty)$ we are hoping to have a more physically relevant model. In particular, the case $\chi>\frac1d$ represents a scenario in which the ball $B_{r(m_i)}(x_i)$ contains the true cluster and only a fraction of the ball is occupied with the cluster. We believe that the case $\chi>\frac1{d-2}$ corresponds to the occurrence of "gelation". We refer to [HR1, HR3 and R2] for more discussions. (Note that no finite $\chi$ can cause gelation when $d=2$; we guess that the radius must grow exponentially with the mass in order to have a gel in this case.)

The occurrence of the factor $\lambda(\varepsilon)$ in the definition of $V_\varepsilon$ is to guarantee that when two clusters collide, then they coagulate with a probability that stays away from 0 as $\varepsilon\to0$. Indeed $B=x_i-x_j$ is a Brownian motion that spends a time of order $\varepsilon^2r(m_i,m_j)^2$ (respectively $\varepsilon^2r(m_i,m_j)^2|\log\varepsilon|$) in the support of $V_\varepsilon$ when $d\ge3$ (respectively $d=2$). We also multiply the sum in the definition of $\mathcal A_c$ by 1/2 to ensure that the summation is over unordered pairs $\{i,j\}$.

As for the fragmentation part, $\mathcal A_fF(\omega)=\mathcal A_f^+F(\omega)-\mathcal A_f^-F(\omega)$ is given by
m i −1 1 y,m β(m, m i − m) V ε (xi − y; m i − m, m)(F(Si ω) − F(ω))dy, 2 i
m=1
with A+f F(ω) =
m i −1 1 y,m β(m, m i − m) V ε (xi − y; m i − m, m)F(Si ω)dy. 2 i
m=1
Here, V ε (a; m, n) = ε−d V
a ε
; m, n ,
(2.4)
y,m
and Si is that configuration formed from ω by replacing (xi , m i ) with a pair of clusters of positions xi and y and sizes m and m i − m. The central object to study is the cluster density of a given size. Microscopically we are interested in the empirical measures δxi (t) (d x)11(m i (t) = n), gnε (d x, t) = K ε−1 i
where $K_\varepsilon$ was defined by (1.1). If for example we select $(x_1(0), m_1(0)), \dots, (x_N(0), m_N(0))$ randomly and independently with the law

$$P(x_i(0) \in A,\ m_i(0) = n) = \frac{1}{Z}\int_A f_n^0(x)\,dx, \tag{2.5}$$

with $Z = \sum_n \int f_n^0\,dx$, then by the law of large numbers, $g_n^\varepsilon(dx, 0)$ converges weakly to $f_n^0(x)\,dx$ provided that $N = K_\varepsilon Z$ and $\varepsilon \to 0$. Note that such a choice of initial condition implies that on average there are $K_\varepsilon f_n^0(x)\,dx$ many particles of size $n$. A Wiener sausage calculation reveals that on average, each particle in our model experiences finitely many coagulations per unit time. This explains our reason for choosing
$K_\varepsilon$ as above. The main result of [HR1,HR2 and R1] states that if $\chi < 1/(d-2)$, there is no fragmentation, and $\alpha$ satisfies some technical conditions, then the empirical density $g_n^\varepsilon(dx, t)$ converges to $f_n(x, t)\,dx$ where $f_n$ is a solution to Smoluchowski’s equation, subject to the initial condition $f_n(x, 0) = f_n^0(x)$. It is shown in [HR3] that this solution is unique. Smoluchowski’s equation has the form

$$\frac{\partial f_n}{\partial t}(x,t) = d(n)\Delta_x f_n(x,t) + Q_n^{+,c}(\mathbf{f}) - Q_n^{-,c}(\mathbf{f}) + Q_n^{+,f}(\mathbf{f}) - Q_n^{-,f}(\mathbf{f}),$$

where $\mathbf{f} = (f_n : n \in \mathbb{N})$, and

$$\begin{aligned}
Q_n^{+,c}(\mathbf{f}) &= \frac{1}{2}\sum_{m=1}^{n-1} \hat\alpha(m, n-m)\, f_m f_{n-m}, &
Q_n^{-,c}(\mathbf{f}) &= \sum_{m=1}^{\infty} \hat\alpha(m, n)\, f_m f_n, \\
Q_n^{+,f}(\mathbf{f}) &= \sum_{m=1}^{\infty} \hat\beta(m, n)\, f_{n+m}, &
Q_n^{-,f}(\mathbf{f}) &= \frac{1}{2}\sum_{m=1}^{n-1} \hat\beta(m, n-m)\, f_n,
\end{aligned} \tag{2.6}$$

with

$$\hat\alpha(m,n) = \eta(m,n)\,\alpha(m,n), \qquad \hat\beta(m,n) = \eta(m,n)\,\beta(m,n). \tag{2.7}$$
The function $\eta(m,n)$ is calculated in terms of the microscopic details of the model. We start with the case $d = 2$. In this case $\eta$ is independent of the function $V$ and the parameter $\chi$, and is simply given by

$$\eta(m,n) = \frac{2\pi(d(m) + d(n))}{2\pi(d(m) + d(n)) + \alpha(m,n)}. \tag{2.8}$$
The formula for $\eta(m,n)$ is more complicated when $d \ge 3$ and does depend on both $V$ and $\chi$. Here is the recipe for $\eta$: First we find the unique solution to the equation

$$(d(m) + d(n))\,\Delta u_{m,n}(x) = \alpha(m,n)\,V(x; m, n)\,(1 + u_{m,n}(x)) \tag{2.9}$$

with $u(x; m, n) = u_{m,n}(x)$ satisfying $u_{m,n}(x) \to 0$ as $|x| \to \infty$. Then we set

$$\eta(m,n) = \int V(x; m, n)\,(1 + u_{m,n}(x))\,dx. \tag{2.10}$$
Remark 2.1.
• For the purposes of this section, we have assumed that $\sum_n \int f_n^0\,dx < \infty$, which implies that there are finitely many particles almost surely. However in Sect. 3, when the main result of this article is discussed, the density $f_n^0$ is constant and the system involves infinitely many particles. The existence of such a particle system is no longer obvious, and in Remark 3.5 we will explain how such a particle system is constructed.
• Note that we deliberately choose a mechanism for the fragmentation that is, in some sense, dual to the coagulation mechanism. This allows us to easily construct reversible invariant measures for the process $\omega(t)$. In other words the fragmentation is defined in such a way that if we reverse time after a coagulation, we obtain a fragmentation. For the kinetic limit however, we can use a kernel $W$ for the fragmentation that is different from $V$, or even choose two new locations $y_1$ and $y_2$ near $x_i$ for the locations of new clusters of a fragmented cluster. However, for this fragmentation mechanism, the macroscopic coagulation and fragmentation rates read $\hat\alpha = \alpha\eta$, $\hat\beta = \beta\eta'$ with possibly $\eta \neq \eta'$.
• Let us write $Q_n = Q_n^{+,c} - Q_n^{-,c} + Q_n^{+,f} - Q_n^{-,f}$. We then have the following useful formula: For any sequence $(J_n : n \in \mathbb{N})$,

$$\sum_n J_n Q_n = \frac{1}{2}\sum_{n,m} \big(\hat\alpha(m,n)\, f_m f_n - \hat\beta(m,n)\, f_{m+n}\big)\,(J_{m+n} - J_m - J_n).$$
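The weak-form identity of the last item can be checked numerically on a truncated system. The sketch below is only an illustration under assumptions that are not part of the paper: cluster sizes are cut off at a hypothetical maximal size `N` (pairs with $m + n > N$ do not interact), the spatial variable is dropped, and the symmetric kernels, densities and test sequence are random numbers.

```python
import random


def collision_terms(alpha_hat, beta_hat, f, N):
    """Q_n for n = 1..N (1-based lists, index 0 unused), sizes truncated at N."""
    Q = [0.0] * (N + 1)
    for n in range(1, N + 1):
        gain_c = 0.5 * sum(alpha_hat[m][n - m] * f[m] * f[n - m] for m in range(1, n))
        # Truncation assumption: a pair (m, n) interacts only if m + n <= N.
        loss_c = sum(alpha_hat[m][n] * f[m] * f[n] for m in range(1, N - n + 1))
        gain_f = sum(beta_hat[m][n] * f[n + m] for m in range(1, N - n + 1))
        loss_f = 0.5 * sum(beta_hat[m][n - m] * f[n] for m in range(1, n))
        Q[n] = gain_c - loss_c + gain_f - loss_f
    return Q


def random_symmetric_kernel(N, rng):
    """Hypothetical symmetric kernel with entries in [0.1, 2]."""
    kernel = [[0.0] * (N + 1) for _ in range(N + 1)]
    for m in range(1, N + 1):
        for n in range(m, N + 1):
            kernel[m][n] = kernel[n][m] = rng.uniform(0.1, 2.0)
    return kernel


def weak_form_sides(N=8, seed=0):
    """Return both sides of the identity for random data."""
    rng = random.Random(seed)
    alpha_hat = random_symmetric_kernel(N, rng)
    beta_hat = random_symmetric_kernel(N, rng)
    f = [0.0] + [rng.uniform(0.1, 1.0) for _ in range(N)]
    J = [0.0] + [rng.uniform(-1.0, 1.0) for _ in range(N)]
    Q = collision_terms(alpha_hat, beta_hat, f, N)
    lhs = sum(J[n] * Q[n] for n in range(1, N + 1))
    rhs = 0.5 * sum(
        (alpha_hat[m][n] * f[m] * f[n] - beta_hat[m][n] * f[m + n])
        * (J[m + n] - J[m] - J[n])
        for m in range(1, N + 1)
        for n in range(1, N + 1)
        if m + n <= N
    )
    return lhs, rhs


lhs, rhs = weak_form_sides()
```

Choosing $J_n = n$ makes the factor $J_{m+n} - J_m - J_n$ vanish, which is the usual way to read off conservation of mass from this identity.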
The main goal of this article is to derive an equation for the evolution of the density fluctuations about the solution to Smoluchowski’s equation. To this end, recall the fluctuation fields $\xi_n^\varepsilon(dx, t)$ that were defined by (1.2). Let us assume that $\chi < (d-2)^{-1}$ and that the total mass $\sum_n n \int f_n^0\,dx$ is finite.

Conjecture 2.1. As $\varepsilon \to 0$, the process $\xi_n^\varepsilon$ converges to $\xi_n$, where $\xi_n$ is the unique solution to the Ornstein–Uhlenbeck equation

$$\frac{\partial \xi_n}{\partial t} = d(n)\Delta_x \xi_n + \mathcal{L}_n^c \xi + \mathcal{L}_n^f \xi + \gamma_n, \qquad \xi_n(x, 0) = \bar\xi_n(x), \tag{2.11}$$

where $\xi = (\xi_n : n \in \mathbb{N})$, and

$$\mathcal{L}_n^c = \mathcal{L}_n^{+,c} - \mathcal{L}_n^{-,c}, \qquad \mathcal{L}_n^f = \mathcal{L}_n^{+,f} - \mathcal{L}_n^{-,f}, \tag{2.12}$$

with

$$\mathcal{L}_n^{+,c}\xi = \sum_{m=1}^{n-1} \hat\alpha(m, n-m)\, f_m\, \xi_{n-m}, \qquad \mathcal{L}_n^{-,c}\xi = \sum_{m=1}^{\infty} \hat\alpha(m, n)\,(f_m \xi_n + f_n \xi_m), \tag{2.13}$$

$$\mathcal{L}_n^{+,f}\xi = \sum_{m=1}^{\infty} \hat\beta(m, n)\, \xi_{n+m}, \qquad \mathcal{L}_n^{-,f}\xi = \frac{1}{2}\sum_{m=1}^{n-1} \hat\beta(m, n-m)\, \xi_n, \tag{2.14}$$

and $\gamma_n$ is a space-time white noise with variance given by

$$\begin{aligned}
\mathbb{E}\Big(\sum_n \int J_n \gamma_n\,dx\,dt\Big)^2 ={}& 2\sum_n \int d(n)\, f_n\, |\nabla J_n|^2\,dx\,dt \\
&+ \frac{1}{2}\sum_{m,n} \int \hat\alpha(m,n)\, f_n f_m\, (J_{n+m} - J_n - J_m)^2\,dx\,dt \\
&+ \frac{1}{2}\sum_{m,n} \int \hat\beta(m,n)\, f_{n+m}\, (J_{n+m} - J_n - J_m)^2\,dx\,dt
\end{aligned} \tag{2.15}$$

for any smooth test function $J = (J_n : n \in \mathbb{N})$ of compact support in $\mathbb{R}^d \times (0, \infty)$. In fact $\gamma$ belongs to a suitable negative Sobolev space and the integral of $J_n \gamma_n$ must be understood as the value of the distribution $\gamma_n$ at the smooth test function $J_n$. See the next section or the beginning of Sect. 8 for the precise definition of $\xi$ and $\gamma$ and the meaning of Eq. (2.11). The main result of this paper asserts that Conjecture 2.1 is valid if the initial distribution of the clusters is chosen according to a reversible equilibrium state and $d = 2$.
3. Equilibrium Fluctuations

We start with constructing reversible invariant measures for the process $\omega(t)$. For this we take a collection of positive numbers $\lambda = (\lambda_n : n \in \mathbb{N})$ such that $\sum_n \lambda_n < \infty$, and

$$\alpha(m,n)\,\lambda_n \lambda_m = \beta(m,n)\,\lambda_{n+m} \tag{3.1}$$

for every $m, n \in \mathbb{N}$. Note that for such a collection, the functions $f_n(x,t) \equiv \lambda_n$ do solve Smoluchowski’s equation because by (2.7) and (3.1),

$$\hat\alpha(m,n)\,\lambda_n \lambda_m = \hat\beta(m,n)\,\lambda_{n+m}, \tag{3.2}$$

and this in turn implies

$$Q_n^{+,c}(\lambda) = Q_n^{-,f}(\lambda), \qquad Q_n^{-,c}(\lambda) = Q_n^{+,f}(\lambda). \tag{3.3}$$
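The step from detailed balance to the two identities in (3.3) can be checked numerically. The sketch below is an illustration only: the kernel `alpha`, the weights `lam` and the cutoff on the infinite sums are hypothetical choices, and $\beta$ is defined from them through (3.1). Because $\eta(m,n)$ multiplies both sides of (3.1), it cancels, so the check is written with the microscopic rates.

```python
import math


def lam(n, c=0.7):
    """Hypothetical summable weights lambda_n = exp(-c n) / n."""
    return math.exp(-c * n) / n


def alpha(m, n):
    """Hypothetical symmetric coagulation kernel."""
    return 1.0 + m * n


def beta(m, n):
    """Fragmentation kernel fixed by the detailed balance relation (3.1)."""
    return alpha(m, n) * lam(m) * lam(n) / lam(m + n)


def detailed_balance_residuals(n, cutoff=50):
    """Residuals of the two identities in (3.3); infinite sums cut at `cutoff`."""
    gain_c = 0.5 * sum(alpha(m, n - m) * lam(m) * lam(n - m) for m in range(1, n))
    loss_f = 0.5 * sum(beta(m, n - m) * lam(n) for m in range(1, n))
    loss_c = sum(alpha(m, n) * lam(m) * lam(n) for m in range(1, cutoff + 1))
    gain_f = sum(beta(m, n) * lam(n + m) for m in range(1, cutoff + 1))
    return gain_c - loss_f, loss_c - gain_f


residuals = [detailed_balance_residuals(n) for n in range(1, 11)]
```

Both residuals vanish up to rounding, term by term in $m$, which is why the constant profile $f_n \equiv \lambda_n$ is stationary for this choice of rates.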
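Given such $\lambda$, we construct a reversible invariant measure $\mu_\lambda$ for our process $\omega(t)$: Let $\mathbf{x}^n$ be a Poisson point process with intensity $K_\varepsilon \lambda_n$. Assume that $(\mathbf{x}^n, n \in \mathbb{N})$ are independent and define $\omega = (\mathbf{x}, m)$ by $\mathbf{x} = \bigcup_{n=1}^{\infty} \mathbf{x}^n$ and $m(a) = n$ for $a \in \mathbf{x}^n$. In words, particles of size $n$ form a Poisson point process of intensity $K_\varepsilon \lambda_n$ and these processes are independent for different choices of $n$. We note that if $\Lambda$ is a bounded subset of $\mathbb{R}^d$, then

$$\int M_\Lambda\, d\mu_\lambda = |\Lambda|\, K_\varepsilon \sum_n \lambda_n,$$

where

$$M_\Lambda^n(\omega) = M_\Lambda^n(\mathbf{x}, m) = \#\{a \in \mathbf{x} : a \in \Lambda,\ m(a) = n\}, \qquad M_\Lambda = \sum_{n=1}^{\infty} M_\Lambda^n.$$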
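The construction of $\mu_\lambda$ restricted to a bounded planar region can be sketched as follows. This is a minimal illustration under assumptions that are not in the paper: the region is a square, only finitely many sizes carry weight, and the helper names and numerical values are hypothetical.

```python
import random


def sample_poisson(mean, rng):
    """Poisson(mean) sample: count unit-rate exponential arrivals before `mean`."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def sample_configuration(K_eps, lam, side, rng):
    """Sample clusters in the square [0, side]^2; lam[n - 1] plays the role of lambda_n.

    Sizes are sampled independently: size n has a Poisson number of clusters
    with mean K_eps * lambda_n * side**2, placed uniformly in the square.
    """
    clusters = []
    for n, lam_n in enumerate(lam, start=1):
        count = sample_poisson(K_eps * lam_n * side ** 2, rng)
        clusters += [((rng.uniform(0, side), rng.uniform(0, side)), n) for _ in range(count)]
    return clusters


rng = random.Random(1)
K_eps, lam, side = 50.0, [0.5, 0.25, 0.125], 2.0
samples = [sample_configuration(K_eps, lam, side, rng) for _ in range(200)]
mean_count = sum(len(s) for s in samples) / len(samples)
expected_count = side ** 2 * K_eps * sum(lam)  # |Lambda| * K_eps * sum_n lambda_n
```

With these hypothetical values the expected number of clusters in the square is 175, and the empirical mean over the 200 samples should be close to it.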
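Hence, if we assume that $\sum_n \lambda_n < \infty$, then there are finitely many clusters in a bounded domain almost surely with respect to $\mu_\lambda$. We now assert that $\mu_\lambda$ is indeed reversible. To explain this, let us take two bounded local $C^2$ functions $F$ and $G$ of the configuration $\omega$. By a local function $F$ we mean that there exists a positive constant $c_0$ such that $F$ depends only on particles $(x_i, m_i)$ such that $|x_i|, m_i \le c_0$. We then have

$$\int G\, AF\, d\mu_\lambda = \int F\, AG\, d\mu_\lambda. \tag{3.4}$$

Indeed,

$$\int G\, A_0 F\, d\mu_\lambda = -\int \sum_i d(m_i)\, \nabla_{x_i} F \cdot \nabla_{x_i} G\, d\mu_\lambda, \tag{3.5}$$

$$\int G\, A_c^+ F\, d\mu_\lambda = \int F\, A_f^- G\, d\mu_\lambda, \qquad \int G\, A_c^- F\, d\mu_\lambda = \int F\, A_f^+ G\, d\mu_\lambda. \tag{3.6}$$

Note that (3.6) is the microscopic analog of (3.3), and together with (3.5) implies (3.4). The proof of (3.5) follows from an integration by parts. As for (3.6), observe that for any bounded set $\Lambda$,

$$\mu_\lambda^\Lambda(d\omega_\Lambda) = \sum_{L_1, L_2, \dots}\ \prod_{n=1}^{\infty} \left[ e^{-\lambda_n K_\varepsilon |\Lambda|}\, \frac{(\lambda_n K_\varepsilon)^{L_n}}{L_n!}\ \prod_{i=1}^{L_n} \mathbb{1}(m_{ni} = n,\ x_{ni} \in \Lambda)\, dx_{ni} \right], \tag{3.7}$$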
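where $\omega_\Lambda$ is the configuration in the set $\Lambda$ and $\mu_\lambda^\Lambda$ is the law of $\omega_\Lambda$ under $\mu_\lambda$. Here we have labeled particles of size $n$ by $n1, n2, \dots, nL_n$ and $L_n = M_\Lambda^n$ is the number of such particles. Using the representation (3.7), one can readily verify (3.6).

Let us write $\mathbb{P}_\omega^\varepsilon$ and $\mathbb{E}_\omega^\varepsilon$ for the probability and the expectation with respect to the process $\omega(\cdot)$ subject to the initial condition $\omega(0) = \omega$. When $\omega(0)$ is distributed according to an invariant measure $\mu_\lambda$, we write $\mathbb{P}_\varepsilon^{eq}$ and $\mathbb{E}_\varepsilon^{eq}$ instead. Given $\omega(\cdot)$, we define

$$\xi_n^\varepsilon(t, J) = K_\varepsilon^{1/2} \left[ \frac{1}{K_\varepsilon} \sum_i J(x_i(t))\, \mathbb{1}(m_i(t) = n) - \lambda_n \int J(x)\,dx \right] \tag{3.8}$$

for every smooth $J : \mathbb{R}^d \to \mathbb{R}$ of compact support. Let $D$ denote the space of smooth functions of compact support and let $D'$ denote the space of distributions (the dual of $D$). We regard $\xi_n^\varepsilon$ as an element of the Skorohod space $\mathcal{D} = D([0, T], (D')^{\mathbb{N}})$. The transformation $\omega(\cdot) \mapsto \xi^\varepsilon$ induces a probability measure $P^\varepsilon$ on $\mathcal{D}$. We regard $\xi_n^\varepsilon(t, J)$ as the value of the distribution $\xi_n^\varepsilon(t)$ at $J$.

To state our assumptions, take a nondecreasing function $a \ge 1$ such that $\alpha'(m,n) = \alpha(m,n)/(d(m) + d(n)) \le a(n) + a(m)$, and set $\beta'(n) = \sum_{m=1}^{n-1} \beta(n-m, m)$.

Hypothesis 3.1. The function $d(\cdot)$ is bounded. Moreover for some $\theta > 1/2$,

$$\lim_{\varepsilon \to 0} \tau(\varepsilon) := \lim_{\varepsilon \to 0} K_\varepsilon^{1/2} \sum_{2\varepsilon r(n) > \delta(\varepsilon)} a(n)\lambda_n = 0, \tag{3.9}$$

where $\delta(\varepsilon) = |\log\varepsilon|^{-\theta}$, and

$$\sum_n \big[ a(n)\,(r(n) + \beta'(n)\log n) + a(n)^2\,(a(n) + \log n) \big]\lambda_n < \infty. \tag{3.10}$$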
Remark 3.1. Note that by detailed balance, we have that $\beta(n,m) = \alpha(m,n)\lambda_n\lambda_m/\lambda_{m+n}$. Hence, if $\alpha$ and $\lambda$ are known, then $\beta$ is determined. As an example, consider the case with $\lambda_n$ decaying like $e^{-cn}$ as $n \to \infty$. In this case, we can readily see that if $a(n)$ is growing at most like a polynomial as $n$ gets large, then both (3.9) and (3.10) are satisfied.

Theorem 3.1. Assume Hypothesis 3.1 and that the dimension $d = 2$. Then the finite dimensional marginals of the sequence $P^\varepsilon$ converge to the finite dimensional marginals of $P$, where $P$ is the distribution of a stationary Ornstein–Uhlenbeck Gaussian process with covariance

$$\int \sum_{n=1}^{\infty} \xi_n(t, J_n)\,\xi_n(0, H_n)\,P(d\xi) = \sum_{n=1}^{\infty} \int (T_t J_n)(x)\, H_n(x)\, \lambda_n\,dx. \tag{3.11}$$

Here $J_n, H_n \in D$ for $n \in \mathbb{N}$ and $T_t$ is the semigroup generated by the linear Smoluchowski’s operator

$$\begin{aligned}
(\mathcal{L}J)_n ={}& d(n)\Delta_x J_n + \sum_{m=1}^{\infty} \hat\beta(m,n)\, J_{n+m} - \frac{1}{2}\sum_{m=1}^{n-1} \hat\beta(m, n-m)\, J_n \\
&+ \sum_{m=1}^{n-1} \hat\alpha(m, n-m)\, J_{n-m}\lambda_m - \sum_{m=1}^{\infty} \hat\alpha(m,n)\,(J_n\lambda_m + J_m\lambda_n).
\end{aligned} \tag{3.12}$$
Remark 3.2. Note that the macroscopic coagulation and fragmentation rates $\hat\alpha$ and $\hat\beta$ are strictly smaller than their microscopic counterparts $\alpha$ and $\beta$. We refer the reader to Sect. 4 for a heuristic explanation and how a fundamental auxiliary function $u^\varepsilon$ would allow us to switch from the microscopic rates $\alpha$ and $\beta$ to macroscopic rates $\hat\alpha$ and $\hat\beta$. Note also that even though the “strengths” of the noises associated with the coagulation and fragmentation are given by $\alpha$ and $\beta$, the corresponding macroscopic “strengths” are given by $\hat\alpha$ and $\hat\beta$ as the expression (2.15) indicates. In fact this reduction in the strength happens in a very curious way:
– The auxiliary function $u^\varepsilon$ corrects the original noises coming from the coagulation and fragmentation by reducing their strengths to $\tilde\alpha = \alpha\eta^2$ and $\tilde\beta = \beta\eta^2$. (See formulas (8.33) and (8.37) and the definitions of Ac0 and A f 0 which are given right after (8.26) and (8.34).)
– The Brownian part of the dynamics uses the corrector $u^\varepsilon$ and produces some noise which enhances the reduced strengths $\tilde\alpha$ and $\tilde\beta$ to their final values $\hat\alpha$ and $\hat\beta$. (See formula (8.24), expression A02121111 which is defined right before (8.23), and the final step of the proof of (8.4).)

Remark 3.3. In fact what we can prove is somewhat stronger than what has appeared in the statement of Theorem 3.1. We will show that the process $\xi^\varepsilon$ can be written as $\xi^\varepsilon = \xi' - \xi''$ with both $\xi'$ and $\xi''$ stationary processes in time, where the law of $\xi'$ under $\mathbb{P}_\varepsilon^{eq}$ converges to $P$, and

$$\lim_{\varepsilon \to 0} \mathbb{E}_\varepsilon^{eq}\, |\xi''_n(t, J)| = 0$$

for every $t$, $n \in \mathbb{N}$, and test function $J$. We refer the reader to Sect. 9 for the details. An alternative description of the law $P$ is the martingale formulation of Holley and Stroock [HS] that will be defined in Sect. 8. It is this formulation which we use for the proof of Theorem 3.1.

Remark 3.4. The intuition behind (3.11) is the standard dissipation-fluctuation principle. This principle is used to predict the form of the diffusion coefficient once the drift and the invariant measure for the fluctuation fields are known. In fact (3.11) is equivalent to saying that the process $\xi$ is a solution to the stochastic differential equation

$$d\xi = \mathcal{L}\xi\,dt + B\,dW_t, \tag{3.13}$$

where $dW_t = (dW_{1,t}, \dots, dW_{n,t}, \dots)$ with $(dW_n : n \in \mathbb{N})$ independent space-time white noises and the operator $B$ is determined by

$$\begin{aligned}
\int \sum_{n=1}^{\infty} (B\mathbf{J})_n (B\mathbf{H})_n\,dx ={}& 2\sum_{n=1}^{\infty} d(n)\lambda_n \int \nabla_x J_n \cdot \nabla_x H_n\,dx \\
&+ \frac{1}{2}\sum_{m,n=1}^{\infty} \int \hat\alpha(m,n)\,\lambda_n\lambda_m\,(J_{n+m} - J_n - J_m)(H_{n+m} - H_n - H_m)\,dx \\
&+ \frac{1}{2}\sum_{m,n=1}^{\infty} \int \hat\beta(m,n)\,\lambda_{n+m}\,(J_{n+m} - J_n - J_m)(H_{n+m} - H_n - H_m)\,dx.
\end{aligned}$$
Indeed if we start with the ansatz that $\xi$ satisfies an Ornstein–Uhlenbeck equation of the form (3.13), then we have an obvious guess for the linear drift $\mathcal{L}\xi$, namely the linearization of the right-hand side of the macroscopic equation (2.6). We also have a candidate for its invariant measure, namely the measure $P^0$ given by (3.11) at $t = 0$:

$$\int \sum_{n=1}^{\infty} \xi_n^0(J_n)\,\xi_n^0(H_n)\,P^0(d\xi^0) = \sum_{n=1}^{\infty} \int J_n(x)\,H_n(x)\,\lambda_n\,dx.$$

We then select the diffusion operator $B$ to be compatible with what we have for the drift and the invariant measure of the process $\xi$.

Remark 3.5. As our final remark, we comment that it is not obvious that our Markov process $\omega(\cdot)$ exists because we are dealing with infinitely many interacting diffusions. However, since we are only interested in the process $\omega(\cdot)$ at equilibrium, its existence can be shown by rather standard arguments which we now sketch.

(i) Observe that if initial macroscopic densities $(f_n^0 : n \in \mathbb{N})$ satisfy $\sum_n \int f_n^0\,dx < \infty$, then we can construct our process by starting from $N$ independent particles $(x_1, m_1), \dots, (x_N, m_N)$ satisfying (2.5), where $N$ and $\varepsilon$ are related by the equation $N = K_\varepsilon \sum_n \int f_n^0\,dx$. In other words, if the total density is finite macroscopically, then initially we are dealing with finitely many particles almost surely and the existence of the process $\omega(\cdot)$ is obvious. However, since at equilibrium $f_n^0 \equiv \lambda_n$ is not integrable, we need to consider a Poisson point process with infinitely many particles.

(ii) We now argue that we can construct our process if we make two assumptions:

$$\sum_n n \int_{|x| \le r} f_n^0(x)\,dx < \infty, \tag{3.14}$$

$$\alpha(m,n) = \beta(m,n) = 0 \quad \text{if } m + n > \ell, \tag{3.15}$$

for every $r > 0$ and some $\ell > 0$. In other words, we assume that locally the total mass is finite macroscopically but now we assume that no interaction occurs if particles are large. To construct $\omega(\cdot)$ in this case, we first replace $f^0$ with $f^0\,\mathbb{1}(|x| \le k)$. Our process exists for such an initial macroscopic density by (i). The corresponding process is denoted by $\omega_k(\cdot)$. We now want to send $k$ to infinity and show that the sequence $(\omega_k : k \in \mathbb{N})$ is tight and that any of its limit points $\omega$ is a solution to the martingale problem associated with the generator $A$. That is, $F(\omega(t)) - \int_0^t AF(\omega(s))\,ds$ is a martingale for every $C^2$ local function $F$. This can be readily achieved by establishing a control on the total number of particles in a ball $\{x : |x| \le r\}$. Here is a way of establishing such a control uniformly in $k$: Pick a positive smooth function $J$ which equals $\exp(-|x|)$ for large $x$, and set $H(x) = -\int_{|y| \le 1} \log|y|\, J(x - y)\,dy$. We can readily show that $H > 0$ and that $\Delta H \le c_0 H$ for a constant $c_0$. Then use the martingale $M(t) = F(\omega_k(t)) - \int_0^t AF(\omega_k(s))\,ds$ for $F(\omega) = \sum_i H(x_i)\,m_i$ to show

$$\sup_k\ \mathbb{E} \sup_{t \in [0,T]} F(\omega_k(t))^2 < \infty, \tag{3.16}$$

for every $T$. This can be used to establish the tightness of $\omega_k$ and the existence of our process provided that (3.14) and (3.15) are true.
(iii) It remains to relax the restriction (3.15). We now would like to take advantage of the fact that we only need to consider $f_n^0 \equiv \lambda_n$. More precisely, by (ii), we know that $\mathbb{P}^{eq}$ exists if we assume that (3.15) is true. Let us write $\omega^\ell$ for our process when $\alpha$ and $\beta$ are replaced with $\alpha^\ell(m,n) = \alpha(m,n)\,\mathbb{1}(m + n \le \ell)$, $\beta^\ell(m,n) = \beta(m,n)\,\mathbb{1}(m + n \le \ell)$. Again, we need to show the tightness of $\omega^\ell$ and pass to the limit in the martingale formulation of our process. For this, we need to show something like (3.16) for the sequence $\omega^\ell$. This can be readily achieved by bounding various terms that appear in the martingale $M(\cdot)$, using the fact that the process $\omega^\ell$ is stationary in time.

4. A Sketch of the Proof

We aim to show that the expression

$$X_\varepsilon(\omega(t)) = K_\varepsilon^{-1/2} \sum_i J(x_i(t), m_i(t)), \tag{4.1}$$

with $J(x, n) = J_n(x)$ satisfying $\int J_n\,dx = 0$, is close to $\sum_n \xi_n(t, J_n)$, with the distributions $(\xi_n : n \in \mathbb{N})$ solving (3.13) in the weak sense. To derive (3.13), we use the Markov property of the process $\omega(t)$ to write

$$\begin{aligned}
X_\varepsilon(\omega(t)) ={}& X_\varepsilon(\omega(0)) + \int_0^t A_0 X_\varepsilon(\omega(s))\,ds + \int_0^t A_c X_\varepsilon(\omega(s))\,ds \\
&+ \int_0^t A_f X_\varepsilon(\omega(s))\,ds + M_\varepsilon(t) \\
=:{}& Y_\varepsilon^1 + Y_\varepsilon^2(t) + Y_\varepsilon^3(t) + Y_\varepsilon^4(t) + M_\varepsilon(t),
\end{aligned} \tag{4.2}$$

with $M_\varepsilon$ a martingale for which

$$N_\varepsilon(t) = M_\varepsilon(t)^2 - \int_0^t (A X_\varepsilon^2 - 2 X_\varepsilon A X_\varepsilon)(\omega(s))\,ds \tag{4.3}$$

is a martingale. The identity (4.2) should be compared to what we have as the weak form of (3.13), namely

$$\begin{aligned}
\sum_n \xi_n(J_n, t) ={}& \sum_n \xi_n(J_n, 0) + \int_0^t \sum_n d(n)\,\xi_n(\Delta J_n, s)\,ds \\
&+ \int_0^t \sum_{m,n} \hat\alpha(m,n)\,\lambda_m\,\xi_n(J_{n+m} - J_m - J_n, s)\,ds \\
&- \frac{1}{2}\int_0^t \sum_{m,n} \hat\beta(m,n)\,\xi_{n+m}(J_{n+m} - J_m - J_n, s)\,ds + M(t) \\
=:{}& Y^1 + Y^2(t) + Y^3(t) + Y^4(t) + M(t),
\end{aligned} \tag{4.4}$$
where the process $M(t)$ is a martingale for which

$$\begin{aligned}
N(t) ={}& M(t)^2 - 2t \sum_n d(n)\lambda_n \int |\nabla J_n(x)|^2\,dx \\
&- \frac{t}{2} \sum_{m,n} \hat\alpha(m,n)\,\lambda_m\lambda_n \int (J_{n+m} - J_n - J_m)^2(x)\,dx \\
&- \frac{t}{2} \sum_{m,n} \hat\beta(m,n)\,\lambda_{n+m} \int (J_{n+m} - J_n - J_m)^2(x)\,dx
\end{aligned}$$

is a martingale. To establish Theorem 3.1, we may try to show

$$Y_\varepsilon^j \to Y^j, \qquad M_\varepsilon \to M,$$

for $j = 1, \dots, 4$. It turns out that this is not what is going on! Firstly, it is rather straightforward to show that $Y_\varepsilon^1 \to Y^1$ by the classical central limit theorem with $Y^1$ a Gaussian random variable with variance $\sum_n \lambda_n \int J_n^2\,dx$. Also, virtually by definition, we have that if $\xi^\varepsilon$ converges to $\xi$, then $Y_\varepsilon^2 \to \sum_n d(n) \int_0^t \xi_n(\Delta J_n, s)\,ds$. This stems from the fact that $Y_\varepsilon^2$ corresponds to the “non-interacting” part of the evolution, namely the Laplacian operator $\Delta$. However we need to split the “interacting” part of the microscopic evolution into 3 distinct parts of completely different natures. Indeed, we have a decomposition

$$Y_\varepsilon^3 = Y_\varepsilon^{3,1} + Y_\varepsilon^{3,2} + Y_\varepsilon^{3,3}, \tag{4.5}$$

where $Y_\varepsilon^{3,1} \to Y^3$ as $\varepsilon \to 0$, the term $Y_\varepsilon^{3,2}$ contributes to the fragmentation term so that $Y_\varepsilon^{3,2} + Y_\varepsilon^4 \to Y^4$, and $Y_\varepsilon^{3,3}$ contributes to the martingale part. That is, $Y_\varepsilon^{3,3} + M_\varepsilon \to M$. It is as if a part of the microscopic “drift” becomes some type of “white noise” as $\varepsilon \to 0$. Perhaps this is the most surprising aspect of the present work and is in complete contrast with some earlier works on the equilibrium and non-equilibrium fluctuations on models with diffusive scaling [CY,C] and a stochastic model with kinetic scaling [R1]. This ramification of the diffusion coefficient by the “drift” is reminiscent of a similar phenomenon for the tagged particles in the exclusion processes (see Kipnis–Varadhan [KV]). In our setting however, the ramification of the noise happens in a rather curious way as we explained in Remark 3.2.

To explain the decomposition (4.5), and sketch our method of proof further, we need to recall how Smoluchowski’s equation has been derived from our microscopic model in the articles [HR1,HR2,R2 and HRY]. For this derivation, we need to understand how the microscopic coagulation (respectively fragmentation) rate $\alpha(m,n)$ (respectively $\beta(m,n)$) leads to the macroscopic coagulation (respectively fragmentation) rate $\hat\alpha(m,n)$ (respectively $\hat\beta(m,n)$). For the derivation of (2.6), we start from the expression

$$\hat X_\varepsilon(\omega(t)) = K_\varepsilon^{-1/2} X_\varepsilon(\omega(t)) = K_\varepsilon^{-1} \sum_i J(x_i(t), m_i(t)),$$

and study the corresponding identity, which we obtain by multiplying both sides of (4.2) by $K_\varepsilon^{-1/2}$. Since $K_\varepsilon^{-1/2} M_\varepsilon \to 0$, we only need to concentrate on $K_\varepsilon^{-1/2} Y_\varepsilon^3$ and $K_\varepsilon^{-1/2} Y_\varepsilon^4$. The term $K_\varepsilon^{-1/2} Y_\varepsilon^4$ is in some sense linear and all challenges come from $K_\varepsilon^{-1/2} Y_\varepsilon^3$. It turns out that there is a splitting $K_\varepsilon^{-1/2} Y_\varepsilon^3 = Z_\varepsilon^1 + Z_\varepsilon^2$ with
$Z_\varepsilon^1$ converging to $\int_0^t \sum_n \int J_n(x)\,Q_n^c(\mathbf{f})(x, s)\,dx\,ds$ and $Z_\varepsilon^2 + K_\varepsilon^{-1/2} Y_\varepsilon^4$ converging to $\int_0^t \sum_n \int J_n(x)\,Q_n^f(\mathbf{f})(x, s)\,dx\,ds$. This splitting is not hard to justify; when a fragmentation occurs, a pair of particles are produced which are within a distance of order $O(\varepsilon)$ and prone to coagulate. Of course such a coagulation undoes the fragmentation that has just occurred. Indeed $Z_\varepsilon^2$ is negative, which results in a macroscopic fragmentation rate $\hat\beta$ strictly less than $\beta$. To describe the decomposition (4.5), let us observe

$$\begin{aligned}
Y_\varepsilon^3 &= \frac{1}{2} K_\varepsilon^{-1/2} \sum_{i,j} \alpha(m_i, m_j)\,V_\varepsilon(x_i - x_j; m_i, m_j)\,\tilde J(x_i, m_i, x_j, m_j) \\
&= \frac{1}{2} K_\varepsilon^{-3/2} \sum_{i,j} \alpha(m_i, m_j)\,V^\varepsilon(x_i - x_j; m_i, m_j)\,\tilde J(x_i, m_i, x_j, m_j),
\end{aligned} \tag{4.6}$$

where $V^\varepsilon = K_\varepsilon V_\varepsilon$ and $\tilde J(x_i, m_i, x_j, m_j)$ is given by

$$\frac{m_i}{m_i + m_j}\,J(x_i, m_i + m_j) + \frac{m_j}{m_i + m_j}\,J(x_j, m_i + m_j) - J(x_i, m_i) - J(x_j, m_j). \tag{4.7}$$

Our goal would be a decomposition of the form

$$\int_0^t Y_\varepsilon^3(s)\,ds = \int_0^t B_\varepsilon^z(\omega(s))\,ds + \int_0^t C_\varepsilon(\omega(s))\,ds + D_\varepsilon(t) + \mathrm{Error}, \tag{4.8}$$

where

$$B_\varepsilon^z(\omega) = \frac{1}{2} K_\varepsilon^{-3/2} \sum_{i,j} \alpha(m_i, m_j)\,W^\varepsilon(x_i - x_j + z; m_i, m_j)\,\tilde J(x_i, m_i, x_j, m_j), \tag{4.9}$$

for a suitable function $W^\varepsilon$ which will be defined shortly, and Error represents a term that will go to zero as $\varepsilon \to 0$ and $|z| \to 0$. The form of $W^\varepsilon$ would allow us to replace $\alpha$ with its macroscopic counterpart $\hat\alpha$. The term $C_\varepsilon$ is given by

$$K_\varepsilon^{-3/2} \sum_i \sum_{m=1}^{m_i - 1} \beta(m, m_i - m) \int V^\varepsilon(x_i - y; m, m_i - m)\,u^\varepsilon(x_i - y; m, m_i - m)\,\tilde J(x_i, m, y, m_i - m)\,dy,$$

and the term $D_\varepsilon(t)$ is a martingale. It is the decomposition (4.8) that leads to the decomposition (4.5). To achieve the decomposition (4.8), fix $z$ and start from the expression

$$G_\varepsilon(\omega) = K_\varepsilon^{-3/2} \sum_{i,j} \hat u^\varepsilon(x_i - x_j; m_i, m_j)\,\tilde J(x_i, m_i, x_j, m_j), \tag{4.10}$$

where $\hat u^\varepsilon(a; m, n) = u^\varepsilon(a + z; m, n) - u^\varepsilon(a; m, n)$, with $u^\varepsilon(a; m, n)$ satisfying the equation

$$(d(m) + d(n))\,\Delta u^\varepsilon(x; m, n) = \alpha(m,n)\big[V_\varepsilon(x; m, n)\,u^\varepsilon(x; m, n) + V^\varepsilon(x; m, n)\big]. \tag{4.11}$$
(The functions $V^\varepsilon$ and $V_\varepsilon$ were defined by (2.4) and right before (2.2) respectively.) We then apply the martingale decomposition as in (4.2) to assert

$$G_\varepsilon(\omega(t)) = G_\varepsilon(\omega(0)) + \int_0^t A G_\varepsilon(\omega(s))\,ds + E_\varepsilon(t), \tag{4.12}$$

with $E_\varepsilon(t)$ a martingale. This involves various terms as we apply the operators $A_0$, $A_c$ and $A_f$ on $G_\varepsilon$. As it turns out, the first term $G_\varepsilon(\omega(0))$ and many other terms on the right-hand side of (4.12) are small if $|z|$ is sufficiently small. However, the choice of $u^\varepsilon$ results in a component in $(A_0 + A_c)G_\varepsilon$, which is exactly our $2(B_\varepsilon^z - Y_\varepsilon^3)$ in (4.9), and a component in $A_f G_\varepsilon$ which is exactly $C_\varepsilon$. The function $W^\varepsilon$ in (4.9) is given by

$$W^\varepsilon(a; m, n) = V^\varepsilon(a; m, n)\,\big(1 + K_\varepsilon^{-1} u^\varepsilon(a; m, n)\big). \tag{4.13}$$
Of course we need to show that all other components in $(A_0 + A_c)G_\varepsilon$ and $A_f G_\varepsilon$ are small if $\varepsilon$ and $|z|$ are small. This can be achieved by rather straightforward reasoning if we require

$$K_\varepsilon^{1/2}\,|z|\,|\log|z|| \to 0, \qquad K_\varepsilon^{-1/2}\,|\log|z|| \to 0. \tag{4.14}$$
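The two requirements in (4.14) can be explored numerically. The sketch below is an illustration under two assumptions that are not stated in this passage: it takes $K_\varepsilon = |\log\varepsilon|$ in dimension $d = 2$ (the definition (1.1) lies outside this section), and it uses the trial scale $|z| = |\log\varepsilon|^{-\theta}$. To avoid floating-point underflow in $\varepsilon$, the hypothetical function `condition_4_14` takes $|\log\varepsilon|$ itself as input.

```python
import math


def condition_4_14(log_inv_eps: float, theta: float):
    """Return the two quantities of (4.14) for |z| = |log eps|^(-theta).

    Assumption (not from this passage): K_eps = |log eps| in dimension d = 2.
    """
    K = log_inv_eps
    z = log_inv_eps ** (-theta)
    first = math.sqrt(K) * z * abs(math.log(z))
    second = abs(math.log(z)) / math.sqrt(K)
    return first, second


scales = [1e2, 1e4, 1e8, 1e16]  # values of |log eps|
above_half = [condition_4_14(L, 0.75) for L in scales]
at_half = [condition_4_14(L, 0.5) for L in scales]
```

Under these assumptions both quantities decrease along the sequence when $\theta = 0.75$, whereas for $\theta = 1/2$ the first quantity equals $\tfrac{1}{2}\log|\log\varepsilon|$ and grows, which illustrates why an exponent strictly larger than $1/2$ is needed.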
(In higher dimensions, the second condition is replaced with $K_\varepsilon^{-1/2}|z|^{2-d} \to 0$, which is inconsistent with the first condition if $d \ge 3$.) These two conditions are satisfied if $|z| = |\log\varepsilon|^{-\theta}$, for some $\theta > 1/2$. At this stage, we simply use the smallness of $\hat u^\varepsilon$ for z satisfying ε 1/2. In order to figure out a successful way of going beyond $|\log\varepsilon|^{-\theta}$ and reach a density δ f with $\delta$ satisfying $K_\varepsilon \delta^d \to \infty$, we need to review what has been achieved so far and what to learn from it.

Basically our goal is a central limit theorem (CLT) for the particle density (4.1) and for this we need to perform some type of CLT for the time average of (4.6). Note that $Y_\varepsilon^3$ is in some sense singular because the function $V^\varepsilon$ is a delta-type expression. That is, in a region of volume $O(\varepsilon^d)$, $V^\varepsilon$ is of order $O(\varepsilon^{-d})$. In fact if we calculate $\mathbb{E}_\varepsilon^{eq}\, Y_\varepsilon^3(\omega)^2$, we get an expression that blows up as $\varepsilon \to 0$. All this ultimately stems from the fact that the coagulation occurs when particles are microscopically close. We wish to replace $V^\varepsilon$ with a smoother kernel and this is exactly what purpose (4.8) serves. We try to replace $x_i - x_j$, the argument of $V^\varepsilon$, with $x_i - x_j + z$. That is, we try to figure out the coagulation rate when particles $x_i$ and $x_j$ are not microscopically close but only macroscopically close, i.e., $x_i - x_j = z + O(\varepsilon)$ with $|z| \to 0$ after sending $\varepsilon \to 0$. (For example $|z|$ could be as “large” as $|\log\varepsilon|^{-\theta}$.) However there is a price to pay for such a replacement; we need to replace $V^\varepsilon$ with $W^\varepsilon$ and modify the fragmentation term (we are referring to the term $C_\varepsilon$), and even the noise is modified (the term $D_\varepsilon$). To carry this out, we encountered various additional terms which are presumably small. We have a relatively easy ride, if |z| > | log ε|−1/2 , we have already achieved three important tasks:
(i) The correctors $C_\varepsilon$ and $D_\varepsilon$ would modify the fragmentation and martingale terms as required in the proof of the main result, Theorem 3.1.
(ii) The term $W^\varepsilon$ would allow us to replace $\alpha$ with $\hat\alpha$ because $\lim \int W^\varepsilon = \eta$ as $\varepsilon \to 0$.
(iii) We have been able to replace the singular term $V^\varepsilon(a; m, n)$ with a less singular term

$$\bar W^\varepsilon(a; m, n) = \int W^\varepsilon(a + z; m, n)\,\zeta^{\delta(\varepsilon)}(z)\,dz,$$

where $\delta(\varepsilon) = |\log\varepsilon|^{-\theta}$ for some $\theta > 1/2$.

We are now in a position to explain the central role of Eq. (4.11). Because of the time average in $\int_0^t Y_\varepsilon^3(s)\,ds$, we are dealing with an expression which is almost as smooth as $A^{-1} Y_\varepsilon^3$. Of course $A^{-1}$ is too complicated to use. The message behind Eq. (4.11) and its use is that we only need to consider 2-particle dynamics. Namely, $x_i - x_j$ is a diffusion with generator $(d(m_i) + d(m_j))\Delta$, and once a coagulation occurs with rate $\alpha(m_i, m_j)$ between the $i$-th and $j$-th particles, $(x_i, x_j)$ as a pair no longer exists. Hence the dynamics of $x_i - x_j$ has the infinitesimal generator of a killed Brownian motion:

$$\mathcal{L}^\varepsilon = (d(m) + d(n))\Delta - \alpha(m,n)\,V_\varepsilon(\cdot\,; m, n),$$

with $m = m_i$ and $n = m_j$. Now the function $u^\varepsilon = (\mathcal{L}^\varepsilon)^{-1} V^\varepsilon$ is smoother than $V^\varepsilon$ and this allows us to perturb its argument by a small vector $z$ and obtain (4.8). By (iii), we are now dealing with $\bar W^\varepsilon$ in place of $V^\varepsilon$. We note that $\bar W^\varepsilon = O(\delta(\varepsilon)^{-d})$, and has a support of diameter $O(\delta(\varepsilon))$. To replace $\bar W^\varepsilon$ with

$$\tilde W^\varepsilon(a; m, n) = \int W^\varepsilon(a + z; m, n)\,\zeta^{\delta'(\varepsilon)}(z)\,dz,$$

for some $\delta'(\varepsilon) \gg |\log\varepsilon|^{-1/2}$, we almost repeat the formula (4.12) where $V^\varepsilon$ is replaced with $\bar W^\varepsilon$, and $u^\varepsilon$ is replaced with $v^\varepsilon$ which now solves

$$(d(m) + d(n))\,\Delta v^\varepsilon(x; m, n) = \alpha(m,n)\,\bar W^\varepsilon(x; m, n). \tag{4.17}$$

This time we can show that various terms that appear in $A G_\varepsilon$ are small provided that $|z| \le \delta'(\varepsilon)$, where $\delta'$ can now be chosen as large as $|\log\log\varepsilon|^{-\theta}$ for any $\theta > 1/2$. For this step of the proof we show that all the error terms have small second moments; in other words, a CLT is taking place and the errors have small variances. (See Sect. 7.)
As a consequence of the main result of Sect. 7, we have the decomposition (4.15) where δ(ε) is replaced with δ (ε). We can now rigorously show that 3 is small by ignoring the time integration and showing that the integrand has a small second moment with respect to the equilibrium measure. As for 1 , we first carry out dz 2 integration and use the fact that lim W ε (a; m, n)da = η(m, n), as ε → 0. (This was proved as Theorem 3.2 in [HR2].) After some straightforward manipulations, t
t
ξn (t, J )ds η(m, n)λm + Error 2 (ε). 1 dz 1 dz 2 ds = 0
0
As for 2 , we first replace J (z 1 ) with J (z 2 ) for a small error because |z 1 − z 2 | = O(ε). We then integrate with respect to z 1 and repeat our reasoning for 1 to obtain t
t
ξm (s, J )ds η(m, n)λn + Error 3 (ε). 2 dz 1 dz 2 ds = 0
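In summary

$$\int_0^t Y_\varepsilon^3(s)\,ds = \frac{1}{2}\sum_{m,n} \hat\alpha(m,n) \int_0^t \big(\xi_m(s, J)\lambda_n + \xi_n(s, J)\lambda_m\big)\,ds + D_\varepsilon(t) + \mathrm{Error}(\varepsilon) \tag{4.18}$$

for an error $\mathrm{Error}(\varepsilon)$ that goes to 0 as $\varepsilon \to 0$.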
5. Regularity of the Coagulation Term, Part I

As we mentioned in Sect. 4, the main ingredient for the proof of Theorem 3.1 is the statement (4.18). In this section this statement is partially established and the full proof of (4.18) will be achieved in Sect. 7. We now prepare for the main result of this section, which will appear as Theorem 5.1 at the end of the section. The proof of Theorem 5.1 will be given in Sect. 6.

Note that the function $J$ in (4.1) is of compact support and satisfies $\int J(x, n)\,dx = 0$ for every $n$. In fact we only need to consider $J(x, n) = \bar J(x)\,\mathbb{1}(n = \bar m)$ with $\int \bar J(x)\,dx = 0$. Evidently for such a function $J$, we have $\int J(x, n)\,dx = 0$ for every $n$. Note that $\tilde J$ of (4.7) is not of compact support. However, for some positive $l$, we have that $\tilde J(x, m, y, n) = 0$ if either $m, n \ge l$ or $|x|, |y| \ge l$. Because of the $V_\varepsilon$ term in the definition of $Y_\varepsilon^3$, we may replace $\tilde J$ with

$$\hat J(x_i, m_i, x_j, m_j) = \tilde J(x_i, m_i, x_j, m_j)\,K(x_i - x_j),$$

for a smooth symmetric function $K(a)$ of compact support which is 1 whenever $|a| \le 1$. The advantage of $\hat J$ over $\tilde J$ is that the former is of compact support in the spatial variables. Note however that the term $V_\varepsilon$ only implies that $|x_i - x_j| \le c_0\,\varepsilon\, r(m_i, m_j)$ for a constant $c_0$. Hence such a replacement is valid only if $c_0\,\varepsilon\, r(m_i, m_j) \le 1$. This causes an error that can be readily handled with the aid of our hypothesis (3.9). (See the first step of the proof of Theorem 8.1 in Sect. 8.) Recall that $u^\varepsilon(x; m, n)$ solves

$$(d(m) + d(n))\,\Delta u^\varepsilon(x; m, n) = \alpha(m,n)\big[V_\varepsilon(x; m, n)\,u^\varepsilon(x; m, n) + V^\varepsilon(x; m, n)\big],$$

where $V^\varepsilon(x; m, n) = \varepsilon^{-2} V(x/\varepsilon; m, n)$ and $V_\varepsilon(x; m, n) = K_\varepsilon^{-1}\varepsilon^{-2} V(x/\varepsilon; m, n)$. Given such a function $u^\varepsilon$, we define

$$G(\omega; z) = G(\omega) = K_\varepsilon^{-3/2} \sum_{i,j} \hat u^\varepsilon(x_i - x_j; m_i, m_j)\,\hat J(x_i, m_i, x_j, m_j), \tag{5.1}$$

where $\hat u^\varepsilon(a; m, n) = u^\varepsilon(a + z; m, n) - u^\varepsilon(a; m, n)$. We have

$$G(\omega(t)) = G(\omega(0)) + \int_0^t AG(\omega(s))\,ds + M_t,$$

where $M_t$ is a martingale. We write

$$AG = A_0 G + A_c G + A_f G =: H_1 + H_2 + H_3. \tag{5.2}$$

We now study various terms which appeared on the right-hand side. We write $\hat J_x$ and $\hat J_y$ for the derivatives of $\hat J$ with respect to its first and second spatial arguments. We then write $H_1 = H_{11} + H_{12} + H_{13}$,
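with

$$H_{11}(\omega) = K_\varepsilon^{-3/2} \sum_{i,j} \hat u^\varepsilon(x_i - x_j; m_i, m_j)\,\big[(d(m_i)\Delta_{x_i} + d(m_j)\Delta_{x_j})\,\hat J(x_i, m_i, x_j, m_j)\big],$$

$$\begin{aligned}
H_{12}(\omega) ={}& K_\varepsilon^{-3/2} \sum_{i,j} d(m_i)\,\hat u_x^\varepsilon(x_i - x_j; m_i, m_j) \cdot \hat J_x(x_i, m_i, x_j, m_j) \\
&- K_\varepsilon^{-3/2} \sum_{i,j} d(m_j)\,\hat u_x^\varepsilon(x_i - x_j; m_i, m_j) \cdot \hat J_y(x_i, m_i, x_j, m_j) \\
=:{}& H_{121}(\omega) - H_{122}(\omega),
\end{aligned}$$

$$\begin{aligned}
H_{13}(\omega) &= K_\varepsilon^{-3/2} \sum_{i,j} (d(m_i) + d(m_j))\,\Delta\hat u^\varepsilon(x_i - x_j; m_i, m_j)\,\hat J(x_i, m_i, x_j, m_j) \\
&=: H_{13}^z(\omega) - H_{131}^0(\omega) - H_{132}^0(\omega),
\end{aligned}$$

where

$$H_{13}^z(\omega) = K_\varepsilon^{-3/2} \sum_{i,j} \alpha(m_i, m_j)\,W^\varepsilon(x_i - x_j + z; m_i, m_j)\,\hat J(x_i, m_i, x_j, m_j)$$

with $W^\varepsilon(a; m, n) = u^\varepsilon(a; m, n)\,V_\varepsilon(a; m, n) + V^\varepsilon(a; m, n)$, and

$$H_{131}^0(\omega) = K_\varepsilon^{-3/2} \sum_{i,j} \alpha(m_i, m_j)\,u^\varepsilon(x_i - x_j; m_i, m_j)\,V_\varepsilon(x_i - x_j; m_i, m_j)\,\hat J(x_i, m_i, x_j, m_j),$$

$$H_{132}^0(\omega) = K_\varepsilon^{-3/2} \sum_{i,j} \alpha(m_i, m_j)\,V^\varepsilon(x_i - x_j; m_i, m_j)\,\hat J(x_i, m_i, x_j, m_j).$$

We also write

$$H_2 = H_{21} + H_{22}, \qquad H_{21} = H_{21}^z - H_{21}^0,$$

with $H_{21}^z(\omega)$ given by

$$\begin{aligned}
&-\frac{1}{2} K_\varepsilon^{-3/2} \sum_{i,j} \alpha(m_i, m_j)\,V_\varepsilon(x_i - x_j; m_i, m_j)\,\big[u^\varepsilon(x_i - x_j + z; m_i, m_j)\,\hat J(x_i, m_i, x_j, m_j) \\
&\qquad\qquad + u^\varepsilon(x_j - x_i + z; m_i, m_j)\,\hat J(x_j, m_j, x_i, m_i)\big] \\
&= -K_\varepsilon^{-3/2} \sum_{i,j} \alpha(m_i, m_j)\,V_\varepsilon(x_i - x_j; m_i, m_j)\,u^\varepsilon(x_i - x_j + z; m_i, m_j)\,\hat J(x_i, m_i, x_j, m_j).
\end{aligned}$$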
Moreover,

$$\begin{aligned}
H_{22}(\omega) = \frac{1}{2}\sum_{i,j} &\alpha(m_i, m_j)\,V_\varepsilon(x_i - x_j; m_i, m_j)\,K_\varepsilon^{-3/2} \sum_k \Big\{ \\
&\frac{m_i}{m_i + m_j}\big[\hat u^\varepsilon(x_i - x_k; m_i + m_j, m_k)\,\hat J(x_i, m_i + m_j, x_k, m_k) \\
&\qquad + \hat u^\varepsilon(x_k - x_i; m_k, m_i + m_j)\,\hat J(x_k, m_k, x_i, m_i + m_j)\big] \\
&+ \frac{m_j}{m_i + m_j}\big[\hat u^\varepsilon(x_j - x_k; m_i + m_j, m_k)\,\hat J(x_j, m_i + m_j, x_k, m_k) \\
&\qquad + \hat u^\varepsilon(x_k - x_j; m_k, m_i + m_j)\,\hat J(x_k, m_k, x_j, m_i + m_j)\big] \\
&- \big[\hat u^\varepsilon(x_i - x_k; m_i, m_k)\,\hat J(x_i, m_i, x_k, m_k) + \hat u^\varepsilon(x_k - x_i; m_k, m_i)\,\hat J(x_k, m_k, x_i, m_i)\big] \\
&- \big[\hat u^\varepsilon(x_j - x_k; m_j, m_k)\,\hat J(x_j, m_j, x_k, m_k) + \hat u^\varepsilon(x_k - x_j; m_k, m_j)\,\hat J(x_k, m_k, x_j, m_j)\big] \Big\}.
\end{aligned}$$

The expression $H_{22}$ arises from the changes in the function $G$ when a coagulation occurs due to the influence of the appearance and disappearance of particles on other particles that are not directly involved. The expression $H_{21}$ represents those terms in $G$ that are absent after a coagulation. Note that for our formula for $H_{21}$, we used the fact that $K$ is symmetric and, since $V$ is symmetric, the function $u^\varepsilon$ is also symmetric.

As for the fragmentation part of the dynamics, we have $H_3 = H_{31} + H_{32} + H_{33}$, where $H_{31} = H_{311} + H_{312}$, with

$$\begin{aligned}
H_{311}(\omega) = \frac{1}{2} K_\varepsilon^{-3/2} \sum_{i,j} \sum_{m=1}^{m_i - 1} &\beta(m, m_i - m) \int V^\varepsilon(x_i - y; m, m_i - m)\,\big[\hat u^\varepsilon(x_i - x_j; m, m_j)\,\hat J(x_i, m, x_j, m_j) \\
&- \hat u^\varepsilon(x_i - x_j; m_i, m_j)\,\hat J(x_i, m_i, x_j, m_j)\big]\,dy,
\end{aligned}$$

$$\begin{aligned}
H_{312}(\omega) = \frac{1}{2} K_\varepsilon^{-3/2} \sum_{i,j} \sum_{m=1}^{m_j - 1} &\beta(m, m_j - m) \int V^\varepsilon(x_j - y; m, m_j - m)\,\big[\hat u^\varepsilon(x_i - x_j; m_i, m)\,\hat J(x_i, m_i, x_j, m) \\
&- \hat u^\varepsilon(x_i - x_j; m_i, m_j)\,\hat J(x_i, m_i, x_j, m_j)\big]\,dy.
\end{aligned}$$
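We carry out the $dy$ integration and use symmetry to obtain that $H_{31} = 2H_{311}$, where
\[
H_{311}(\omega) = \tfrac12 K_\varepsilon^{-3/2} \sum_{i,j} \sum_{m=1}^{m_i - 1} \beta(m, m_i - m) \big[ \hat u^\varepsilon(x_i - x_j; m, m_j) \hat J(x_i, m, x_j, m_j) - \hat u^\varepsilon(x_i - x_j; m_i, m_j) \hat J(x_i, m_i, x_j, m_j) \big].
\]
Also, $H_{32} = H_{321} + H_{322}$, with
\[
H_{321}(\omega) = \tfrac12 K_\varepsilon^{-3/2} \sum_{i,j} \sum_{m=1}^{m_i - 1} \beta(m, m_i - m) \int V^\varepsilon(x_i - y; m, m_i - m)\, \hat u^\varepsilon(y - x_j; m, m_j)\, \hat J(y, m, x_j, m_j)\, dy,
\]
\[
H_{322}(\omega) = \tfrac12 K_\varepsilon^{-3/2} \sum_{i,j} \sum_{m=1}^{m_j - 1} \beta(m, m_j - m) \int V^\varepsilon(x_j - y; m, m_j - m)\, \hat u^\varepsilon(x_i - y; m_i, m)\, \hat J(x_i, m_i, y, m)\, dy,
\]
and $H_{33} = H^z_{33} - H^0_{33}$, with
\[
H^z_{33}(\omega) = K_\varepsilon^{-3/2} \sum_i \sum_{m=1}^{m_i - 1} \beta(m, m_i - m) \int V^\varepsilon(x_i - y; m, m_i - m)\, u^\varepsilon(x_i - y + z; m, m_i - m)\, \hat J(x_i, m, y, m_i - m)\, dy.
\]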
Note that $H^0_{131} + H^0_{21} = 0$. We may rewrite (5.2) as
\[
AG + \big[ H^0_{132} - H^z_{13} + H^0_{33} \big] = (H_{11} + H_{12}) + H^z_{21} + (H_{22} + H_{31} + H_{32}) + H^z_{33}. \tag{5.3}
\]
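We are now ready to state the main result of this section.

Theorem 5.1. Let $\hat J$ be as above and assume that $\sqrt{\varepsilon} < |z| < 1$. Then
\[
\mathbb{E}^{eq}_\varepsilon \left| \int_0^t AG(\omega(s))\,ds + \int_0^t \big[ H^0_{132}(\omega(s)) - H^z_{13}(\omega(s)) + H^0_{33}(\omega(s)) \big]\,ds \right| \le C_0\, t \left( K_\varepsilon^{1/2} |z|\,|\log|z|| + K_\varepsilon^{-1/2} |\log|z|| \right). \tag{5.4}
\]
We establish Theorem 5.1 by examining various terms that appeared on the right-hand side of (5.3). Indeed we show
\begin{align}
\mathbb{E}^{eq}_\varepsilon |G(\omega(t))| = \mathbb{E}^{eq}_\varepsilon |G(\omega(0))| &\le C_0 K_\varepsilon^{1/2} |z|, \tag{5.5} \\
\mathbb{E}^{eq}_\varepsilon |H_{11}(\omega(s))| &\le C_0 K_\varepsilon^{1/2} |z|, \tag{5.6} \\
\mathbb{E}^{eq}_\varepsilon |H_{12}(\omega(s))| &\le C_0 K_\varepsilon^{1/2} |z|\,|\log|z||, \tag{5.7} \\
\mathbb{E}^{eq}_\varepsilon |H_{22}(\omega(s))| &\le C_0 K_\varepsilon^{1/2} |z|, \tag{5.8} \\
\mathbb{E}^{eq}_\varepsilon |H_{31}(\omega(s))| &\le C_0 K_\varepsilon^{1/2} |z|, \tag{5.9} \\
\mathbb{E}^{eq}_\varepsilon |H_{32}(\omega(s))| &\le C_0 K_\varepsilon^{1/2} |z|, \tag{5.10} \\
\mathbb{E}^{eq}_\varepsilon |H^z_{21}(\omega(s))| &\le C_0 K_\varepsilon^{-1/2} |\log|z||, \tag{5.11} \\
\mathbb{E}^{eq}_\varepsilon |H^z_{33}(\omega(s))| &\le C_0 K_\varepsilon^{-1/2} |\log|z||. \tag{5.12}
\end{align}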
Theorem 5.1 is an immediate consequence of (5.3) and (5.5)–(5.12). The bound (5.5) will be used for the proof of Theorem 3.1. As we mentioned in Sect. 4, our method of proof can be used to establish a law of large numbers (LLN) for the expression $\int_0^t K_\varepsilon^{1/2} Y_{\varepsilon 3}(s)\,ds$ with $Y_{\varepsilon 3}$ as in (4.4). This can be achieved as in [HR2] by using the regularity of the coagulation term, and this time $z$ can be chosen to be any small vector. Moreover, for $\tilde J$ we may choose any smooth function of compact support. Note that since we are at equilibrium, the proof of the LLN is much easier than what we have in [HR2], because all the correlation bounds needed for the proof are trivially true. This would allow us to find the limit of $\int_0^t K_\varepsilon^{1/2} Y_{\varepsilon 3}(s)\,ds$ as $\varepsilon \to 0$. Since this limit is not random, it can be calculated by passing to the limit in $\mathbb{E}^{eq}_\varepsilon \int_0^t K_\varepsilon^{1/2} Y_{\varepsilon 3}(s)\,ds = t\,\mathbb{E}^{eq}_\varepsilon K_\varepsilon^{1/2} Y_{\varepsilon 3}(0)$. In summary,

Lemma 5.1. Let $K(x, m, y, n)$ be any smooth function of compact support. Then
\[
\lim_{\varepsilon \to 0} \mathbb{E}^{eq}_\varepsilon \left| \int_0^t Z^\varepsilon(\omega(s))\,ds - t \bar Z \right| = 0, \tag{5.13}
\]
where
\[
Z^\varepsilon(\omega) = K_\varepsilon^{-2} \sum_{i,j} \alpha(m_i, m_j) V^\varepsilon(x_i - x_j; m_i, m_j) K(x_i, m_i, x_j, m_j), \qquad
\bar Z = \sum_{m,n} \lambda_m \lambda_n \alpha(m, n) \int K(x, m, x, n)\,dx.
\]
Lemma 5.1 will be needed in Sect. 8. In Sect. 8 we also need another LLN which can be established with a similar argument. This time our $Z^\varepsilon(\omega)$ is given by
\[
K_\varepsilon^{-2} \sum_{i,j} d(m_i)\, K_\varepsilon^{-1} |\nabla u^\varepsilon(x_i - x_j; m_i, m_j)|^2\, \tilde J(x_i, m_i, x_i, m_j)^2\, \mathbb{1}(|x_i - x_j| \le 1). \tag{5.14}
\]
As we will see in Lemma 6.1 of Sect. 6, the function
\[
W^\varepsilon(a; m, n) = K_\varepsilon^{-1} |\nabla u^\varepsilon(a; m, n)|^2\, \mathbb{1}(|a| \le 1)
\]
is almost as singular as $V^\varepsilon(a; m, n)$, because $W^\varepsilon(a; m, n) = O(\varepsilon^{-2} K_\varepsilon^{-1})$ when $|a| \le \varepsilon$. However, $\int W^\varepsilon\,da$ stays bounded as $\varepsilon \to 0$. We will calculate $\gamma = \lim_{\varepsilon \to 0} \int W^\varepsilon\,da$ in Sect. 8 (see the final step of the proof of (8.4)). We have,
Lemma 5.2. Let $Z^\varepsilon$ be as in (5.14). Then (5.13) is true for
\[
\bar Z = \sum_{m,n} \lambda_m \lambda_n\, d(m)\, \gamma(m, n) \int \tilde J(x, m, x, n)^2\,dx.
\]
This lemma can be proved in a similar way. This time we start with a function $w^\varepsilon(x; m, n)$ that now solves
\[
(d(m) + d(n))\, \Delta w^\varepsilon(x; m, n) = \alpha(m, n) V^\varepsilon(x; m, n)\, w^\varepsilon(x; m, n) + d(m)\, W^\varepsilon(x; m, n),
\]
and define
\[
G(\omega) = K_\varepsilon^{-2} \sum_{i,j} \hat w^\varepsilon(x_i - x_j; m_i, m_j)\, \tilde J(x_i, m_i, x_i, m_j)^2, \tag{5.15}
\]
where $\hat w^\varepsilon(a; m, n) = w^\varepsilon(a + z; m, n) - w^\varepsilon(a; m, n)$. Again, using the same method of proof as in [HR2], we can show that the limit in (5.13) exists, and then, by taking the expectation of $Z^\varepsilon$, we identify the limit.
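The two properties of $W^\varepsilon$ quoted before Lemma 5.2 (pointwise size of order $\varepsilon^{-2}K_\varepsilon^{-1}$, bounded integral) can be read off from the gradient bound (6.2) of Lemma 6.1. The following is only a sketch for fixed $m, n$. It assumes that $K_\varepsilon$ is bounded below by a multiple of $|\log \varepsilon|$, a normalization that is not restated in this section, and it writes $r = r(m,n)$, $\alpha' = \alpha(m,n)/(d(m)+d(n))$ and assumes $r\varepsilon \le 1$.

```latex
% Pointwise bound: (6.2) gives |\nabla u^\varepsilon(a)| \le C_1 \alpha' (r\varepsilon)^{-1}, hence
W^\varepsilon(a; m, n) \;\le\; C_1^2\, \alpha'^2\, K_\varepsilon^{-1} (r\varepsilon)^{-2}
   \;=\; O\!\left(\varepsilon^{-2} K_\varepsilon^{-1}\right).

% Integral bound: (6.2) gives |\nabla u^\varepsilon(a)|^2 \le C_1^2 \alpha'^2 \min\{|a|^{-2}, (r\varepsilon)^{-2}\};
% integrate over |a| \le 1 in planar polar coordinates.
\int W^\varepsilon(a; m, n)\, da
   \;\le\; C_1^2\, \alpha'^2\, K_\varepsilon^{-1}
      \left( \pi + 2\pi \log \frac{1}{r\varepsilon} \right),
% which stays bounded as \varepsilon \to 0 under the assumption K_\varepsilon \ge c\,|\log \varepsilon|.
```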
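6. Proof of Theorem 5.1

In this section, we establish (5.5)–(5.12). As a preliminary step, we state a lemma about the regularity of the function $u^\varepsilon$. Recall that $u^\varepsilon$ satisfies (4.11), or equivalently
\[
(d(m) + d(n))\, \Delta_x u^\varepsilon(x; m, n) = \alpha(m, n) V^\varepsilon(x; m, n) \left[ |\log \varepsilon|^{-1} u^\varepsilon(x; m, n) + 1 \right].
\]
In fact $\log \varepsilon \le u^\varepsilon$, and $u^\varepsilon$ is given by
\[
\frac{1}{2\pi}\, \alpha'(m, n) \int \log|x - y|\, V^\varepsilon(y; m, n) \left[ |\log \varepsilon|^{-1} u^\varepsilon(y; m, n) + 1 \right] dy,
\]
where $\alpha'(m, n) = \alpha(m, n)/(d(m) + d(n))$. To ease the notation, we do not display the dependence of $\alpha'(m, n)$ and $r(m, n)$ on $m$ and $n$.

Lemma 6.1. There exist positive constants $C_1$ and $C_2$ such that for all $x$,
\begin{align}
|u^\varepsilon(x; m, n)| &\le C_1 \alpha' \min\left\{ 1 + \left| \log \frac{|x|}{r} \right|,\ |\log \varepsilon| \right\}, \tag{6.1} \\
|\nabla u^\varepsilon(x; m, n)| &\le C_1 \alpha' \min\left\{ |x|^{-1},\ (r\varepsilon)^{-1} \right\}, \tag{6.2}
\end{align}
and for $|x| \ge 2|z| + C_2 r \varepsilon$,
\[
|\nabla u^\varepsilon(x + z; m, n) - \nabla u^\varepsilon(x; m, n)| \le C_1 \alpha' |x|^{-2} |z|. \tag{6.3}
\]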
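Also,
\begin{align}
\int_{|a| \le l} |\nabla u^\varepsilon(a; m, n)|\,da &\le C_1 \alpha' l, \tag{6.4} \\
\int_{|a| \le l} |\hat u^\varepsilon(a; m, n)|\,da &\le C_1 \alpha' (l + |z|)\,|z|, \tag{6.5} \\
\int_{|a| \le l} |\nabla \hat u^\varepsilon(a; m, n)|\,da &\le C_1 \alpha' \left[ |z| \left( |\log(|z| + r\varepsilon)| + 1 + \log^+ l \right) + r\varepsilon \right], \tag{6.6} \\
\int_{|a| \le l} u^\varepsilon(a; m, n)^2\,da &\le C_1 \alpha'^2 \left[ r^2 \varepsilon^2 |\log \varepsilon|^2 + l^2 \left( \Big( \log^+ \frac{l}{r} \Big)^2 + 1 \right) \right], \tag{6.7} \\
\int_{|a| \le l} |\nabla u^\varepsilon(a; m, n)|^2\,da &\le C_1 \alpha'^2 \left[ 1 + \log^+ \frac{l}{r\varepsilon} + r^2 \right]. \tag{6.8}
\end{align}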
Proof. The proofs of (6.1), (6.2) and (6.3) are omitted and can be found in Sect. 2.2 of [HR2]. Note however that in [HR2] we are assuming that $\chi = 0$ and that we were dealing with $V^\varepsilon(x) = \varepsilon^{-2} V(x/\varepsilon)$ instead of $(\varepsilon r)^{-2} V(x/(r\varepsilon))$. Since we have $u^\varepsilon(x; m, n) = v^\varepsilon(x/r)$ for $r = r(m, n)$ and $v^\varepsilon$ solving
\[
(d(n) + d(m))\, \Delta v^\varepsilon(x) = \alpha(m, n) V^\varepsilon(x) \left[ |\log \varepsilon|^{-1} v^\varepsilon(x) + 1 \right],
\]
we can readily use the results of [HR2] to obtain (6.1), (6.2) and (6.3). As for (6.4), we apply (6.2) to assert
\[
\int_{|a| \le l} |\nabla u^\varepsilon(a; m, n)|\,da \le c_1 \alpha' \int_{|a| \le l} \min\left\{ |a|^{-1}, (r\varepsilon)^{-1} \right\} da \le c_2 \alpha' l.
\]
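The last inequality is a direct planar computation in polar coordinates. As a sketch (the explicit constants $\pi$ and $2\pi$ below are not stated in the text and only serve to show that one may take $c_2 = 2\pi c_1$):

```latex
% Case l \le r\varepsilon: the minimum equals (r\varepsilon)^{-1} on the whole disc.
\int_{|a| \le l} (r\varepsilon)^{-1}\, da \;=\; \frac{\pi l^2}{r\varepsilon} \;\le\; \pi l .

% Case l > r\varepsilon: split the disc at radius r\varepsilon and use da = 2\pi\rho\, d\rho.
\int_{|a| \le r\varepsilon} (r\varepsilon)^{-1}\, da
  + \int_{r\varepsilon < |a| \le l} |a|^{-1}\, da
  \;=\; \pi r\varepsilon + 2\pi (l - r\varepsilon)
  \;=\; 2\pi l - \pi r\varepsilon \;\le\; 2\pi l .
```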
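As for (6.5), we simply write
\[
\int_{|a| \le l} |\hat u^\varepsilon(a; m, n)|\,da = \int_{|a| \le l} \left| \int_0^1 \nabla u^\varepsilon(a + tz; m, n) \cdot z\,dt \right| da \le |z| \int_{|a| \le l + |z|} |\nabla u^\varepsilon(a; m, n)|\,da,
\]
and apply (6.4). As for (6.6), we use (6.3) and (6.4) to write
\[
\begin{aligned}
\int_{|a| \le l} |\nabla \hat u^\varepsilon(a; m, n)|\,da
&\le \int_{|a| \le 2|z| + C_2 r\varepsilon} |\nabla \hat u^\varepsilon(a; m, n)|\,da + C_1 \alpha' \int_{2|z| + C_2 r\varepsilon \le |a| \le l} |a|^{-2} |z|\,da \\
&\le c_2 \alpha' \left[ (|z| + r\varepsilon) + |z|\,|\log(|z| + r\varepsilon)| + |z|\,|\log l| \right].
\end{aligned}
\]
For the proof of (6.7), let us write $A(l; m, n)$ for the left-hand side of (6.7). We use (6.1) to assert that if $l \le \varepsilon r$, then
\[
A(l; m, n) \le c_2\, l^2 \alpha'^2 |\log \varepsilon|^2 \le c_2\, \alpha'^2 r^2 \varepsilon^2 |\log \varepsilon|^2,
\]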
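and if $\varepsilon r < l$, then $A(l; m, n)$ is bounded above by
\[
c_3 \alpha'^2 \left[ r^2 \varepsilon^2 |\log \varepsilon|^2 + \int \mathbb{1}(|a| \in (\varepsilon r, l)) \Big( \log \frac{|a|}{r} \Big)^2 da \right]
\le c_4 \alpha'^2 \left[ r^2 \varepsilon^2 |\log \varepsilon|^2 + l^2 \left( \Big( \log \frac{l}{r} \Big)^2 + 1 \right) \right],
\]
completing the proof of (6.7). In the same fashion, we can readily establish (6.8).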
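For (6.8), the logarithmic term comes from squaring the gradient bound (6.2) and integrating in polar coordinates. A sketch of that computation, with explicit constants that are not taken from the text:

```latex
% By (6.2): |\nabla u^\varepsilon(a)|^2 \le C_1^2 \alpha'^2 \min\{|a|^{-2}, (r\varepsilon)^{-2}\}.

% Case l \le r\varepsilon:
\int_{|a| \le l} (r\varepsilon)^{-2}\, da \;=\; \frac{\pi l^2}{(r\varepsilon)^2} \;\le\; \pi .

% Case l > r\varepsilon, using da = 2\pi\rho\, d\rho on the annulus:
\int_{|a| \le r\varepsilon} (r\varepsilon)^{-2}\, da
  + \int_{r\varepsilon < |a| \le l} |a|^{-2}\, da
  \;=\; \pi + 2\pi \log \frac{l}{r\varepsilon} .

% In both cases:
\int_{|a| \le l} |\nabla u^\varepsilon(a; m, n)|^2\, da
  \;\le\; 2\pi\, C_1^2\, \alpha'^2 \left( 1 + \log^+ \frac{l}{r\varepsilon} \right).
```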
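Proof of (5.5), (5.6) and (5.7). We omit the proof of (5.6) because it is very similar to the proof of (5.5). Evidently
\[
\mathbb{E}^{eq}_\varepsilon |G(\omega(0))| \le c_1 K_\varepsilon^{1/2} \sum_{m,n} \lambda_m \lambda_n \int_{|a| \le 1} |\hat u^\varepsilon(a; m, n)|\,da
\le c_2 K_\varepsilon^{1/2} |z| \sum_{m,n} \alpha'(m, n) \lambda_m \lambda_n \le c_3 K_\varepsilon^{1/2} |z|,
\]
where we used (6.5) and (3.10) for the second and third inequalities respectively. This proves (5.5). We now turn to the proof of (5.7). We certainly have
\[
\begin{aligned}
\mathbb{E}^{eq}_\varepsilon |H_{12}(\omega(0))| &\le c_1 K_\varepsilon^{1/2} \sum_{m,n} \lambda_m \lambda_n \int_{|a| \le 1} |\nabla \hat u^\varepsilon(a; m, n)|\,da \\
&\le c_2 K_\varepsilon^{1/2} \sum_{m,n} \alpha'(m, n) \big( r(m, n)\varepsilon + |z|\,|\log(|z| + r(m, n)\varepsilon)| \big) \lambda_m \lambda_n \\
&\le c_3 K_\varepsilon^{1/2} |z|\,|\log|z||,
\end{aligned}
\]
by (6.6) of Lemma 6.1. We now use (3.10) to deduce (5.7).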
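Proof of (5.8). Evidently the expression $\mathbb{E}^{eq}_\varepsilon |H_{22}(\omega(0))|$ is bounded by
\[
\begin{aligned}
& c_1 K_\varepsilon^{1/2} \sum_{m,n,p} \int\!\!\int_{|a| \le 1} \alpha(m, n) V^\varepsilon(b; m, n) \big[ |\hat u^\varepsilon(a; m, p)| + |\hat u^\varepsilon(a; m + n, p)| \big] \lambda_m \lambda_n \lambda_p\,da\,db \\
&\quad \le c_2 K_\varepsilon^{1/2} |z| \sum_{m,n,p} \alpha(m, n) \big( \alpha'(n, p) + \alpha'(m + n, p) \big) \lambda_m \lambda_n \lambda_p \le c_3 K_\varepsilon^{1/2} |z|,
\end{aligned}
\]
where we used (6.5) and (3.10) for the second and third inequalities respectively. This proves (5.8).

Proof of (5.9) and (5.10). We start with the proof of (5.9). We have $H_{311} = H_{3111} - H_{3112}$, where
\[
H_{3112}(\omega) = \tfrac12 K_\varepsilon^{-3/2} \sum_{i,j} \beta'(m_i)\, \hat u^\varepsilon(x_i - x_j; m_i, m_j)\, \hat J(x_i, m_i, x_j, m_j),
\]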
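with
\[
\beta'(m_i) = \sum_{m=1}^{m_i - 1} \beta(m, m_i - m).
\]
Repeating the proof of (5.5) yields that $\mathbb{E}^{eq}_\varepsilon |H_{3112}(\omega(0))| \le c_1 K_\varepsilon^{1/2} |z|$. The term $H_{3111}$ is treated in the same fashion:
\[
\mathbb{E}^{eq}_\varepsilon |H_{3111}(\omega(0))| \le c_2 K_\varepsilon^{1/2} |z| \sum_{n,p} \sum_{m=1}^{n-1} \beta(m, n - m) (a(p) + a(m)) \lambda_p \lambda_n \le c_3 K_\varepsilon^{1/2} |z|.
\]
This completes the proof of (5.9). We now turn to the proof of (5.10). The terms $H_{321}$ and $H_{322}$ are similar and both can be treated as in (5.9). We only treat the former. We certainly have that $\mathbb{E}^{eq}_\varepsilon |H_{321}(\omega(0))|$ is bounded by
\[
\begin{aligned}
& c_1 K_\varepsilon^{1/2} \sum_{n,p} \sum_{m=1}^{n-1} \beta(m, n - m) \int\!\!\int\!\!\int V^\varepsilon(a - y; m, n - m)\, |\hat u^\varepsilon(y - b; m, p)|\, \mathbb{1}(|y - b| \le 1,\ |y|, |b| \le l)\,da\,db\,dy \\
&\quad = c_1 K_\varepsilon^{1/2} \sum_{n,p} \sum_{m=1}^{n-1} \beta(m, n - m) \int\!\!\int |\hat u^\varepsilon(y - b; m, p)|\, \mathbb{1}(|y - b| \le 1,\ |y| \le l)\,db\,dy \\
&\quad \le c_2 K_\varepsilon^{1/2} |z| \sum_{n,p} \sum_{m=1}^{n-1} \beta(m, n - m) (a(p) + a(m)) \lambda_p \lambda_n \le c_3 K_\varepsilon^{1/2} |z|,
\end{aligned}
\]
completing the proof of (5.10).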
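Proof of (5.11) and (5.12). We certainly have that the term $\mathbb{E}^{eq}_\varepsilon |H^z_{21}(\omega)|$ is bounded above by
\begin{align*}
& c_1 \mathbb{E}^{eq}_\varepsilon K_\varepsilon^{-3/2} \sum_{i,j} \alpha(m_i, m_j) V^\varepsilon(x_i - x_j; m_i, m_j)\, |u^\varepsilon(x_i - x_j + z; m_i, m_j)|\, \mathbb{1}(|x_i| \le l) \\
&\quad \le c_2 K_\varepsilon^{-1/2} \sum_{m,n} \alpha(m, n) \alpha'(m, n) \left| \log \frac{c_2 |z|}{r(m, n)} \right| \mathbb{1}(\varepsilon r(m, n) \le |z|)\, \lambda_m \lambda_n
 + c_2 K_\varepsilon^{1/2} \sum_{m,n} \alpha(m, n) \alpha'(m, n) \lambda_m \lambda_n\, \mathbb{1}(\varepsilon r(m, n) > |z|) \\
&\quad \le c_3 K_\varepsilon^{-1/2} |\log|z|| + c_3 K_\varepsilon^{-1/2} \sum_{m,n} \alpha(m, n) \alpha'(m, n) \log r(m, n)\, \lambda_m \lambda_n
 + c_3 K_\varepsilon^{1/2} \sum_n a(n)^2 \lambda_n\, \mathbb{1}(\varepsilon r(n) > |z|) \\
&\quad \le c_3 K_\varepsilon^{-1/2} |\log|z|| + c_4 K_\varepsilon^{-1/2} \sum_n a(n)^2 \log n\, \lambda_n
 + c_4 K_\varepsilon^{1/2} \sum_n a(n)^2\, \mathbb{1}(\varepsilon r(n) > |z|)\, \lambda_n \\
&\quad \le c_3 K_\varepsilon^{-1/2} |\log|z|| + c_5 K_\varepsilon^{1/2} \Big( \log \frac{|z|}{\varepsilon} \Big)^{-1} \sum_n a(n)^2 \log n\, \lambda_n \\
&\quad \le c_3 K_\varepsilon^{-1/2} |\log|z|| + c_5 K_\varepsilon^{-1},
\end{align*}
where we used Lemma 6.1 for the first inequality. This completes the proof of (5.11). Similarly the term $\mathbb{E}^{eq}_\varepsilon |H^z_{33}(\omega)|$ is bounded above by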
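\begin{align*}
& c_1 \mathbb{E}^{eq}_\varepsilon K_\varepsilon^{-3/2} \sum_i \sum_{m=1}^{m_i - 1} \beta(m, m_i - m) \int V^\varepsilon(x_i - y; m, m_i - m)\, |u^\varepsilon(x_i - y + z; m, m_i - m)|\, \mathbb{1}(|x_i| \le l)\,dy \\
&\quad \le c_2 K_\varepsilon^{-1/2} \sum_n \sum_{m=1}^{n-1} \beta(m, n - m) \alpha'(m, n - m) \left| \log \frac{c_2 |z|}{r(m, n - m)} \right| \mathbb{1}(\varepsilon r(m, n - m) \le |z|)\, \lambda_n \\
&\qquad + c_2 K_\varepsilon^{1/2} \sum_n \sum_{m=1}^{n-1} \beta(m, n - m) \alpha'(m, n - m)\, \mathbb{1}(\varepsilon r(m, n - m) > |z|)\, \lambda_n \\
&\quad \le c_3 K_\varepsilon^{-1/2} |\log|z|| + c_3 K_\varepsilon^{-1}.
\end{align*}
This completes the proof of (5.12).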
7. Regularity of the Coagulation Term, Part II

As we explained in Sect. 4, one of the main steps of the proof of Theorem 3.1 is the replacement of the expression $V^\varepsilon(\cdot)$ in the collision term $H^0_{132}$ with a more manageable expression $W^\varepsilon(\cdot + z)$ for small $z$. Ultimately we average out $W^\varepsilon(\cdot + z)$ over $z$ and apply a CLT. For this to succeed, we need to make sure that we can afford a small $z$ which is as big as $|\log \varepsilon|^{-a}$ for some $a < 1/2$. In Sect. 5, we used the auxiliary function $G$ in order to relate $H^0_{132}$ to $H^z_{13}$, provided that $|z|$ is of order $\delta(\varepsilon) = |\log \varepsilon|^{-\theta}$ for $\theta > 1/2$. In this section, we would like to fill the gap by showing that in fact $z$ can be chosen so that $|z|$ is as large as $\delta'(\varepsilon) = |\log \log \varepsilon|^{-\theta'}$, provided that $\theta' \in (0, 1/2)$. To achieve this, we fix a $\theta' \in (0, 1/2)$ and set $\bar H_{13}(\omega)$ to be equal to
\[
\int H^z_{13}(\omega)\, \zeta^{\delta(\varepsilon)}(z)\,dz = K_\varepsilon^{-3/2} \sum_{i,j} \alpha(m_i, m_j)\, \bar W^\varepsilon(x_i - x_j; m_i, m_j)\, \hat J(x_i, m_i, x_j, m_j),
\]
where $\bar W^\varepsilon(a; m, n) = \int W^\varepsilon(a + z; m, n)\, \zeta^{\delta(\varepsilon)}(z)\,dz$ and $W^\varepsilon$ was defined by (4.13). First observe that there exists a constant $c_1$ such that the function $\bar W^\varepsilon$ has a support that is contained in a ball of center 0 and radius $\delta(\varepsilon; m, n) = c_1 \delta(\varepsilon) + r(m, n)\varepsilon$. For our purposes, it is more convenient to assume that $r(m, n)\varepsilon \le \delta(\varepsilon)$, so that for a constant $c_2$ the support of $\bar W^\varepsilon$ is contained in a ball of center 0 and radius $c_2 \delta(\varepsilon)$, with $c_2 = c_1 + 1$, and that $|\bar W^\varepsilon| \le c_2 \delta(\varepsilon)^{-2}$. Such a restriction causes a small error. Indeed, if we set
\[
\bar H'_{13}(\omega) := K_\varepsilon^{-3/2} \sum_{i,j} \alpha(m_i, m_j)\, \bar W^\varepsilon(x_i - x_j; m_i, m_j)\, \bar J(x_i, m_i, x_j, m_j), \tag{7.1}
\]
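with $\bar J(x_i, m_i, x_j, m_j) = \tilde J(x_i, m_i, x_j, m_j)\, \mathbb{1}(r(m_i, m_j)\varepsilon \le \delta(\varepsilon))$, then
\[
\mathbb{E}^{eq}_\varepsilon |\bar H_{13}(\omega) - \bar H'_{13}(\omega)| \le c_1 K_\varepsilon^{1/2} \sum_{m,n} \alpha(m, n) \lambda_m \lambda_n\, \mathbb{1}(r(m, n)\varepsilon > \delta(\varepsilon))
\le c_2 K_\varepsilon^{1/2} \sum_n a(n) \lambda_n\, \mathbb{1}(2 r(n)\varepsilon > \delta(\varepsilon)), \tag{7.2}
\]
which goes to 0 by our assumption (3.9). Define $v^\varepsilon$ by
\[
v^\varepsilon(x; m, n) = \frac{1}{2\pi} \int \log|x - y|\, \bar W^\varepsilon(y; m, n)\,dy. \tag{7.3}
\]
We then set
\[
G'(\omega; z) = G'(\omega) = K_\varepsilon^{-3/2} \sum_{i,j} \hat q^\varepsilon(x_i - x_j; m_i, m_j)\, \bar J(x_i, m_i, x_j, m_j), \tag{7.4}
\]
where $\hat q^\varepsilon(a; m, n) = v^\varepsilon(a + z; m, n) K(a + z) - v^\varepsilon(a; m, n) K(a)$. We have
\[
G'(\omega(t)) = G'(\omega(0)) + \int_0^t AG'(\omega(s))\,ds + M_t,
\]
where $M_t$ is a martingale. Note that $G'$ is very similar to $G$ of Sect. 5; $\hat u$ is replaced with $\hat q$ and $\bar J$ is replaced with $\tilde J$. The latter difference has to do with the fact that now the function $K$ appears in the definition of $\hat q$ and we no longer need to multiply $\tilde J$ with a cut-off function. We write
\[
AG' = A_0 G' + A_c G' + A_f G' =: H'_1 + H'_2 + H'_3. \tag{7.5}
\]
We now study various terms which appeared on the right-hand side. We write $H'_1 = H'_{11} + H'_{12} + H'_{13}$.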
We do not repeat the definition of the various $H'$-expressions, which all correspond to $H$-expressions of Sect. 5. However, since $v^\varepsilon$ satisfies (7.3), we have a different decomposition for $H'_{13}$. The decomposition
\[
\begin{aligned}
\Delta \hat q^\varepsilon(a; m, n) ={}& \Delta v^\varepsilon(a + z; m, n) K(a + z) - \Delta v^\varepsilon(a; m, n) K(a) \\
& + 2\big[ \nabla v^\varepsilon(a + z; m, n) \cdot \nabla K(a + z) - \nabla v^\varepsilon(a; m, n) \cdot \nabla K(a) \big] \\
& + v^\varepsilon(a + z; m, n) \Delta K(a + z) - v^\varepsilon(a; m, n) \Delta K(a) \\
=:{}& \hat q^\varepsilon_1(a; m, n) + \hat q^\varepsilon_2(a; m, n) + \hat q^\varepsilon_3(a; m, n)
\end{aligned}
\]
results in a decomposition $H'_{13} = H'_{131} + H'_{132} + H'_{133}$, where
\[
H'_{13r} = K_\varepsilon^{-3/2} \sum_{i,j} \hat q^\varepsilon_r(x_i - x_j; m_i, m_j)\, \tilde J(x_i, m_i, x_j, m_j).
\]
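The three groups in this decomposition are the three terms of the product rule for the Laplacian applied to $v^\varepsilon K$. As a sketch (the identification $\Delta v^\varepsilon = \bar W^\varepsilon$ uses the standard fact that $\frac{1}{2\pi}\log|x|$ is the fundamental solution of the planar Laplacian, which is assumed here rather than quoted from the text):

```latex
% Product rule for the Laplacian of a product of two smooth functions:
\Delta (v K) \;=\; K\, \Delta v \;+\; 2\, \nabla v \cdot \nabla K \;+\; v\, \Delta K .

% From (7.3), v^\varepsilon is the logarithmic potential of \bar W^\varepsilon, so
\Delta v^\varepsilon(x; m, n) \;=\; \bar W^\varepsilon(x; m, n),

% and the first group of terms in \Delta \hat q^\varepsilon becomes
\bar W^\varepsilon(a + z; m, n)\, K(a + z) \;-\; \bar W^\varepsilon(a; m, n)\, K(a).
```

This is why the term $H'_{131}$ carries $\bar W^\varepsilon$, while $H'_{132}$ and $H'_{133}$ only involve $\nabla v^\varepsilon$ and $v^\varepsilon$, which are less singular.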
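We may rewrite (7.5) as
\[
\int_0^t H'_{131}(\omega(s))\,ds = G'(\omega(t)) - G'(\omega(0)) - M_t - \int_0^t \big( H'_{11} + H'_{12} + H'_{132} + H'_{133} + H'_2 + H'_3 \big)(\omega(s))\,ds. \tag{7.6}
\]
We are now ready to state the main result of this section.

Theorem 7.1. Assume that $\delta(\varepsilon) < |z| < 1$. Then
\[
\mathbb{E}^{eq}_\varepsilon \left| \int_0^t H'_{131}(\omega(s))\,ds \right| \le C_0 (t + 1) \left[ |z| + K_\varepsilon^{-1/2} |\log \delta(\varepsilon)|^{1/2} \right].
\]
Remark 7.1. With the aid of this theorem, we can readily improve the $z$-average from $|z| = O(\delta(\varepsilon))$ to $|z| = O(\delta'(\varepsilon))$. Indeed $H'_{131}$ is given by
\[
K_\varepsilon^{-3/2} \sum_{i,j} \bar W^\varepsilon(x_i - x_j + z; m_i, m_j)\, K(x_i - x_j + z)\, \bar J(x_i, m_i, x_j, m_j) - \bar H'_{13},
\]
and by (7.2), the term $\bar H'_{13}$ can be replaced with $\bar H_{13}$, for an error that goes to 0 as $\varepsilon \to 0$. From this and Theorem 7.1 we deduce that $\bar H_{13} = \int H^z_{13}(\omega)\, \zeta^{\delta(\varepsilon)}(z)\,dz$ can be replaced with
\[
K_\varepsilon^{-3/2} \sum_{i,j} \bar W^\varepsilon(x_i - x_j + z; m_i, m_j)\, K(x_i - x_j + z)\, \tilde J(x_i, m_i, x_j, m_j),
\]
so long as $|z| = \delta'(\varepsilon)$.

We establish Theorem 7.1 by examining various terms that appeared on the right-hand side of (7.6). Indeed we show
\begin{align}
\mathbb{E}^{eq}_\varepsilon |G'(\omega(t))| &\le C_0 \big[ K_\varepsilon^{-1/2} + |z| \big], \tag{7.7} \\
\mathbb{E}^{eq}_\varepsilon \big| (H'_{11} + H'_{133})(\omega(s)) \big| &\le C_0 \big[ K_\varepsilon^{-1/2} + |z| \big], \tag{7.8} \\
\mathbb{E}^{eq}_\varepsilon \big| (H'_{12} + H'_{132})(\omega(s)) \big| &\le C_0 K_\varepsilon^{1/2} |z|\,|\log|z||, \tag{7.9} \\
\mathbb{E}^{eq}_\varepsilon \big| H'_{22}(\omega(s)) \big| &\le C_0 |z|\,|\log \delta(\varepsilon)|^{1/2}, \tag{7.10} \\
\mathbb{E}^{eq}_\varepsilon \big| (H'_{31} + H'_{32})(\omega(s)) \big| &\le C_0 \big[ K_\varepsilon^{-1/2} + |z| \big], \tag{7.11} \\
\mathbb{E}^{eq}_\varepsilon \big| (H'_{21} + H'_{33})(\omega(s)) \big| &\le C_0 K_\varepsilon^{-1/2} |\log \delta(\varepsilon)|, \tag{7.12} \\
\mathbb{E}^{eq}_\varepsilon M_t^2 &\le C_0\, t \big[ K_\varepsilon^{-1} |\log \delta(\varepsilon)| + |z|^2 (\log|z|)^2 \big]. \tag{7.13}
\end{align}
To prepare for the proof of Theorem 7.1, we start with an elementary lemma.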
Lemma 7.1. Assume that $\int G(x, y, m, n)\,dx\,dy = 0$ for every $m, n \in \mathbb{N}$. Then $\int Y^2\,d\nu_\lambda = Z_1 + Z_2 + Z_3$, where
\begin{align*}
Z_1 &= N(N-1)(N-2) \sum_{n_1, n_2, n_3} \int G(y_1, y_2, n_1, n_2)\, G(y_1, y_3, n_1, n_3)\, \lambda_{n_1} \lambda_{n_2} \lambda_{n_3}\,dy_1\,dy_2\,dy_3, \\
Z_2 &= N(N-1)(N-2) \sum_{n_1, n_2, n_3} \int G(y_1, y_2, n_1, n_2)\, G(y_3, y_2, n_3, n_1)\, \lambda_{n_1} \lambda_{n_2} \lambda_{n_3}\,dy_1\,dy_2\,dy_3, \\
Z_3 &= N(N-1) \sum_{n_1, n_2} \int G(y_1, y_2, n_1, n_2)^2\,dy_1\,dy_2\, \lambda_{n_1} \lambda_{n_2}.
\end{align*}
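The straightforward proof of Lemma 7.1 is omitted. See also Lemma 3.3 of [R1], where a similar lemma is proved. As our next lemma we state some bounds on the function $v^\varepsilon$. The proof of this lemma is omitted because it is identical to the proof of Lemma 6.1.

Lemma 7.2. There exist positive constants $C_1$ and $C_2$ such that for all $x$,
\begin{align}
|v^\varepsilon(x; m, n)| &\le C_1 \alpha' \min\{ 1 + |\log|x||,\ |\log \delta(\varepsilon)| \}, \tag{7.14} \\
|\nabla v^\varepsilon(x; m, n)| &\le C_1 \alpha' \min\{ |x|^{-1},\ \delta(\varepsilon)^{-1} \}, \tag{7.15}
\end{align}
and for $|x| \ge 2|z| + C_2 \delta(\varepsilon)$,
\[
|\nabla v^\varepsilon(x + z; m, n) - \nabla v^\varepsilon(x; m, n)| \le C_1 \gamma(\varepsilon; m, n)\, \alpha' |x|^{-2} |z|. \tag{7.16}
\]
Also,
\begin{align}
\int |\nabla q^\varepsilon(a; m, n)|\,da &\le C_1 \alpha', \tag{7.17} \\
\int |\hat q^\varepsilon(a; m, n)|\,da &\le C_1 \alpha' |z|, \tag{7.18} \\
\int |\nabla \hat q^\varepsilon(a; m, n)|\,dx &\le C_1 \alpha' \big\{ |z| \big[ |\log(|z| + \delta(\varepsilon))| + 1 \big] + \delta(\varepsilon) \big\}, \tag{7.19} \\
\int q^\varepsilon(a; m, n)^2\,da &\le C_1 \alpha'^2, \tag{7.20} \\
\int |\nabla q^\varepsilon(a; m, n)|^2\,da &\le C_1 \alpha' |\log \delta(\varepsilon)|. \tag{7.21}
\end{align}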
Proof of (7.7) and (7.8). We only prove (7.7) because (7.8) can be proved by a verbatim argument. To apply Lemma 7.1, we need to check that for every $n_1$ and $n_2$,
\[
\int \hat q^\varepsilon(y_1 - y_2; n_1, n_2)\, \tilde J(y_1, n_1, y_2, n_2)\,dy_1\,dy_2 = 0. \tag{7.22}
\]
We certainly have
\[
\int\!\!\int \hat q^\varepsilon(y_1 - y_2; n_1, n_2)\, J(y_1, n_1)\,dy_1\,dy_2 = \int J(y, n_1)\,dy \int \hat q^\varepsilon(a; n_1, n_2)\,da = 0.
\]
The same is true if we replace $J(y_1, n_1)$ with $J(y_1, n_1 + n_2)$. This completes the proof of (7.22). In view of Lemma 7.1,
\[
\mathbb{E}^{eq}_\varepsilon G'(\omega(0))^2 = \int G'^2(\omega)\, \nu_\lambda(d\omega) = R_1 + R_2 + R_3,
\]
with
\begin{align*}
R_1 &= K_\varepsilon (K_\varepsilon - 1)(K_\varepsilon - 2) K_\varepsilon^{-3} \sum_{n_1, n_2, n_3} \int \hat q^\varepsilon(y_1 - y_2; n_1, n_2)\, \hat q^\varepsilon(y_1 - y_3; n_1, n_3)\, \bar J(y_1, n_1, y_2, n_2)\, \bar J(y_1, n_1, y_3, n_3)\, \lambda_{n_1} \lambda_{n_2} \lambda_{n_3}\,dy_1\,dy_2\,dy_3, \\
R_3 &= K_\varepsilon (K_\varepsilon - 1) K_\varepsilon^{-3} \sum_{n_1, n_2} \int \hat q^\varepsilon(y_1 - y_2; n_1, n_2)^2\, \bar J^2(y_1, n_1, y_2, n_2)\, \lambda_{n_1} \lambda_{n_2}\,dy_1\,dy_2,
\end{align*}
and $R_2$ is given by an expression similar to $R_1$. We start with bounding $R_3$. We certainly have
\[
R_3 \le c_1 K_\varepsilon^{-1} \sum_{n_1, n_2} \lambda_{n_1} \lambda_{n_2} \int \big[ q^\varepsilon(a; n_1, n_2)^2 + q^\varepsilon(a + z; n_1, n_2)^2 \big]\,da.
\]
By Lemma 7.2,
\[
R_3 \le c_2 K_\varepsilon^{-1} \sum_{n_1, n_2} \alpha'(n_1, n_2)^2 \lambda_{n_1} \lambda_{n_2} \le c_3 K_\varepsilon^{-1} \sum_n a(n)^2 \lambda_n \le c_4 K_\varepsilon^{-1}. \tag{7.23}
\]
We now turn to $R_1$. First observe that $R_1 \le R'_1$, where
\[
R'_1 = c_1 \sum_{n_1, n_2, n_3} \lambda_{n_1} \lambda_{n_2} \lambda_{n_3} \int |\hat q^\varepsilon(a; n_1, n_2)|\,da \int |\hat q^\varepsilon(a; n_1, n_3)|\,da.
\]
By Lemma 7.2 we deduce
\[
R'_1 \le c_1 |z|^2 \sum_n a(n)^2 \lambda_n \le c_2 |z|^2. \tag{7.24}
\]
From this and (7.23) we deduce (7.7).
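Proof of (7.9). Since $H'_{12}$ is very similar to $H'_{132}$, we only establish (7.9) for $H'_{12}$. In view of Lemma 7.1,
\[
\mathbb{E}^{eq}_\varepsilon H'_{12}(\omega(0))^2 = R_1 + R_2 + R_3,
\]
with
\begin{align*}
R_1 &= K_\varepsilon (K_\varepsilon - 1)(K_\varepsilon - 2) K_\varepsilon^{-3} \sum_{n_1, n_2, n_3} \int \nabla \hat q^\varepsilon(y_1 - y_2; n_1, n_2) \cdot \nabla \hat q^\varepsilon(y_1 - y_3; n_1, n_3)\, \bar J(y_1, n_1, y_2, n_2)\, \bar J(y_1, n_1, y_3, n_3)\, \lambda_{n_1} \lambda_{n_2} \lambda_{n_3}\,dy_1\,dy_2\,dy_3, \\
R_3 &= K_\varepsilon (K_\varepsilon - 1) K_\varepsilon^{-3} \sum_{n_1, n_2} \int |\nabla \hat q^\varepsilon(y_1 - y_2; n_1, n_2)|^2\, \bar J^2(y_1, n_1, y_2, n_2)\, \lambda_{n_1} \lambda_{n_2}\,dy_1\,dy_2,
\end{align*}
and $R_2$ is given by an expression similar to $R_1$. We start with bounding $R_3$. We certainly have
\[
R_3 \le c_1 K_\varepsilon^{-1} \sum_{n_1, n_2} \lambda_{n_1} \lambda_{n_2} \int \big[ |\nabla q^\varepsilon(a; n_1, n_2)|^2 + |\nabla q^\varepsilon(a + z; n_1, n_2)|^2 \big]\,da.
\]
By Lemma 7.2,
\[
R_3 \le c_2 |\log \delta(\varepsilon)| K_\varepsilon^{-1} \sum_{n_1, n_2} \alpha'(n_1, n_2)^2 \lambda_{n_1} \lambda_{n_2} \le c_3 K_\varepsilon^{-1} |\log \delta(\varepsilon)| \sum_n a(n)^2 \lambda_n \le c_4 K_\varepsilon^{-1} |\log \delta(\varepsilon)|. \tag{7.25}
\]
We now turn to $R_1$. First observe that $R_1 \le R'_1$, where
\[
R'_1 = c_1 \sum_{n_1, n_2, n_3} \lambda_{n_1} \lambda_{n_2} \lambda_{n_3} \int |\hat q^\varepsilon(a; n_1, n_2)|\,da \int |\hat q^\varepsilon(a; n_1, n_3)|\,da.
\]
By Lemma 7.2 we deduce
\[
R'_1 \le c_1 |z|^2 \sum_{n_1, n_2, n_3} \alpha'(n_1, n_2) \alpha'(n_1, n_3) \lambda_{n_1} \lambda_{n_2} \lambda_{n_3} \le c_2 |z|^2. \tag{7.26}
\]
From this and (7.25) we deduce (7.9).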
Proof of (7.11). As in the proof of (5.8) and (5.9), we have that $H'_{31} = 2H'_{311}$, $H'_{311} = H'_{3111} - H'_{3112}$, where
\[
H'_{3112}(\omega) = \tfrac12 K_\varepsilon^{-3/2} \sum_{i,j} \beta'(m_i)\, \hat q^\varepsilon(x_i - x_j; m_i, m_j)\, \tilde J(x_i, m_i, x_j, m_j).
\]
Repeating the proof of (7.9) yields
\[
\mathbb{E}^{eq}_\varepsilon H'_{3112}(\omega(s))^2 \le c_3 (|z|^2 + K_\varepsilon^{-1}).
\]
The term $H'_{3111}$ is handled in just the same way we handle $H'_{32}$ below. We now turn to $H'_{32}$. The terms $H'_{321}$ and $H'_{322}$ are similar and both can be treated as in (7.11). We only treat the latter. We apply Lemma 7.1 for $G(x_i, x_j, m_i, m_j)$ given by
\[
\tfrac12 \sum_{m=1}^{m_j - 1} \beta(m, m_j - m) \int V^\varepsilon(x_j - y; m, m_j - m)\, \hat q^\varepsilon(x_i - y; m_i, m)\, \bar J(x_i, m_i, y, m)\,dy.
\]
As a result,
\[
\mathbb{E}^{eq}_\varepsilon [H'_{322}(\omega(0))]^2 = R_1 + R_2 + R_3,
\]
with $R_1$, $R_2$, and $R_3$ corresponding to $Z_1$, $Z_2$, and $Z_3$ in Lemma 7.1. We first treat $R_3$. For this term we need to bound $|G(x_i, x_j, m_i, m_j)|$. In this case we simply move the absolute value inside the summation and replace $|\hat q^\varepsilon(a; m_i, m)|$ with a constant multiple of $|q^\varepsilon(a; m_i, m)| + |q^\varepsilon(a + z; m_i, m)|$. We then apply Lemma 6.1 to assert
\[
|G(x_i, x_j, m_i, m_j)| \le S(x_i - x_j, m_i, m_j) + S(x_i - x_j + z, m_i, m_j),
\]
with $S(a, p, n)$ given by
\[
c_2 \sum_{m=1}^{n-1} \int \beta(m, n - m) V^\varepsilon(y; m, n - m)\, \alpha'(p, m) \min\{ |\log \delta(\varepsilon)|,\ |\log|a + y|| \}\, \mathbb{1}(|a| \le 2)\,dy.
\]
We have that there exist constants $c_3$ and $c_4$ such that
\[
S(a, p, n) \le
\begin{cases}
c_3 \sum_{m=1}^{n-1} \beta(m, n - m)\, \alpha'(p, m) (|\log|a|| + 1), & \text{if } |a| \ge c_4 (r(n)\varepsilon + \delta(\varepsilon)), \\
c_3 \sum_{m=1}^{n-1} \beta(m, n - m)\, \alpha'(p, m) |\log \delta(\varepsilon)|, & \text{otherwise.}
\end{cases}
\]
From this we can readily deduce that $R_3 \le c_5 K_\varepsilon^{-1}$, as in the proof of (7.11). (Note that $\sum_{m=1}^{n-1} \beta(m, n - m)\, \alpha'(p, m) \le (a(n) + a(p)) \beta'(n)$ because, by our choice, the function $a$ is non-decreasing.) We now turn to $R_1$ and $R_2$. We certainly have that $|G(x_i, x_j, m_i, m_j)|$ is bounded above by
\[
|z| \int_0^1 \sum_{m=1}^{m_j - 1} \int \beta(m, m_j - m) V^\varepsilon(x_j - y; m, m_j - m)\, |\nabla q^\varepsilon(x_i - y + tz; m_i, m)\, \bar J(x_i, m_i, y, m)|\,dy\,dt.
\]
We then apply Lemma 7.2 to assert
\[
|G(x_i, x_j, m_i, m_j)| \le |z| \int_0^1 L(x_i - x_j + tz, m_i, m_j)\,dt,
\]
with $L(a, p, n)$ given by
\[
c_5 \sum_{m=1}^{n-1} \int \beta(m, n - m) V^\varepsilon(y; m, n - m)\, \alpha'(p, m) \min\{ \delta(\varepsilon)^{-1},\ |a + y|^{-1} \}\, \mathbb{1}(|a| \le 2)\,dy.
\]
Again we can readily show
\[
L(a, n) \le
\begin{cases}
c_3 \sum_{m=1}^{n-1} \alpha'(m, n) \beta(m, n - m) |a|^{-1}, & \text{if } |a| \ge c_4 (r(n)\varepsilon + \delta(\varepsilon)), \\
c_3 \sum_{m=1}^{n-1} \alpha'(m, n) \beta(m, n - m)\, \delta(\varepsilon)^{-1}, & \text{otherwise.}
\end{cases}
\]
Repeating the proof of (7.7) yields that $R_1 + R_2 \le c_7 |z|^2$, completing the proof of (7.11).

Proof of (7.12). We only establish (7.12) for $H'_{21}$ because $H'_{33}$ can be treated by an identical argument. Choose $c_1$ so that $V(a) = 0$ if $|a| > c_1$. We certainly have that the expression $\mathbb{E}^{eq}_\varepsilon |H^{\prime z}_{21}(\omega)|$ is bounded above by
\[
\begin{aligned}
& \mathbb{E}^{eq}_\varepsilon K_\varepsilon^{-3/2} \sum_{i,j} \alpha(m_i, m_j) V^\varepsilon(x_i - x_j; m_i, m_j)\, |\hat q^\varepsilon(x_i - x_j + z; m_i, m_j)| \\
&\quad \le c_1 |\log \delta(\varepsilon)| K_\varepsilon^{-1/2} \sum_{m,n} \alpha(m, n) \alpha'(m, n) \lambda_m \lambda_n \le c_2 |\log \delta(\varepsilon)| K_\varepsilon^{-1/2},
\end{aligned}
\]
where we used Lemma 6.1 for the first inequality. This completes the proof of (7.12).
Proof of (7.10). We note that $H'_{22}$ is a sum of eight terms $H'_{22i}$, $i = 1, \dots, 8$, and we establish (7.10) by showing the analogous bound for each $H'_{22i}$. Since all the eight terms can be treated in the same way, we only treat the sixth term, which is given by
\[
H'_{226}(\omega) = \tfrac12 K_\varepsilon^{-3/2} \sum_{i,j,k} \alpha(m_i, m_j) V^\varepsilon(x_i - x_j; m_i, m_j)\, \hat q^\varepsilon(x_k - x_i; m_k, m_i)\, \bar J(x_k, m_k, x_i, m_i).
\]
We note that $\bar J$ is a sum of 4 terms, which yields a decomposition
\[
H'_{226} = H'_{2261} + H'_{2262} - H'_{2263} - H'_{2264}. \tag{7.27}
\]
Again all the 4 terms can be treated in the same way, so we only treat $H'_{2264}$, which is given by
\[
\tfrac12 K_\varepsilon^{-3/2} \sum_{i,j,k} \alpha(m_i, m_j) V^\varepsilon(x_i - x_j; m_i, m_j)\, \tilde q^\varepsilon(x_k - x_i; m_k, m_i)\, J(x_i, m_i),
\]
where $\tilde q(a; m_k, m_i) = \hat q(a; m_k, m_i)\, \mathbb{1}(r(m_i, m_k)\varepsilon \le \delta(\varepsilon))$. We use the elementary inequality $|a| \le \delta + \delta^{-1} a^2$ to assert
\[
|H'_{2264}| \le H'_{22641} + H'_{22642}, \tag{7.28}
\]
where $H'_{22641}$ and $H'_{22642}$ are respectively given by
\[
\frac{\delta}{2} K_\varepsilon^{-1} \sum_{i,j} \alpha(m_i, m_j) V^\varepsilon(x_i - x_j; m_i, m_j)\, |J(x_i, m_i)|,
\]
\[
\frac{\delta^{-1}}{2} K_\varepsilon^{-1} \sum_{i,j} \alpha(m_i, m_j) V^\varepsilon(x_i - x_j; m_i, m_j)\, |J(x_i, m_i)| \left[ K_\varepsilon^{-1/2} \sum_k \tilde q^\varepsilon(x_k - x_i; m_k, m_i) \right]^2.
\]
Evidently, $\mathbb{E}^{eq}_\varepsilon H'_{22641} \le c_1 \delta$ for some constant $c_1$. Moreover, by squaring the expression in the brackets, we learn that $H'_{22642} = H'_{226421} + H'_{226422}$, where
\begin{align*}
H'_{226421} &= \delta^{-1} K_\varepsilon^{-2} \sum_{i,j,k} \alpha(m_i, m_j) V^\varepsilon(x_i - x_j; m_i, m_j)\, |J(x_i, m_i)|\, \tilde q^\varepsilon(x_k - x_i; m_k, m_i)^2, \\
H'_{226422} &= \delta^{-1} K_\varepsilon^{-2} \sum_{i,j,\, k \ne l} \alpha(m_i, m_j) V^\varepsilon(x_i - x_j; m_i, m_j)\, |J(x_i, m_i)|\, \tilde q^\varepsilon(x_k - x_i; m_k, m_i)\, \tilde q^\varepsilon(x_l - x_i; m_l, m_i).
\end{align*}
Because of our choice of $\tilde q$, we have that $\int \tilde q(a; m, n)\,da = 0$. As a consequence,
\[
\mathbb{E}^{eq}_\varepsilon H'_{226422} = 0. \tag{7.29}
\]
We certainly have that $\mathbb{E}^{eq}_\varepsilon H'_{226421}$ is bounded above by
\[
c_2 \delta^{-1} K_\varepsilon \sum_{n_1, n_2, n_3} \alpha(n_1, n_2) \lambda_{n_1} \lambda_{n_2} \lambda_{n_3} \int V^\varepsilon(a; n_1, n_2)\,da \int \hat q^\varepsilon(b; n_3, n_1)^2\,db
= c_2 \delta^{-1} \sum_{n_1, n_2, n_3} \alpha(n_1, n_2) \lambda_{n_1} \lambda_{n_2} \lambda_{n_3} \int \hat q^\varepsilon(b; n_3, n_1)^2\,db.
\]
On the other hand, by Lemma 7.2,
\[
\int \hat q^\varepsilon(b; n_3, n_1)^2\,db \le |z|^2 \int |\nabla q^\varepsilon(b; n_3, n_1)|^2\,db \le c_3\, \alpha'(n_3, n_1)^2 |\log \delta(\varepsilon)|\, |z|^2.
\]
As a result, $\mathbb{E}^{eq}_\varepsilon H'_{226421}$ is bounded above by
\[
c_4 \delta^{-1} |z|^2 |\log \delta(\varepsilon)| \sum_{n_1, n_2, n_3} \alpha(n_1, n_2)\, \alpha'(n_3, n_1)^2 \lambda_{n_1} \lambda_{n_2} \lambda_{n_3} \le c_5 \delta^{-1} |z|^2 |\log \delta(\varepsilon)|.
\]
In summary, from this, (7.28), and (7.29) we deduce
\[
\mathbb{E}^{eq}_\varepsilon |H'_{2264}| \le c_1 \delta + c_5 \delta^{-1} |z|^2 |\log \delta(\varepsilon)|.
\]
By choosing $\delta = |z|\,|\log \delta(\varepsilon)|^{1/2}$ we deduce (7.10).
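Two elementary facts are used in the last step, and the following sketch spells them out. Here $\delta > 0$ is the free parameter of the proof (not to be confused with $\delta(\varepsilon)$), and the shorthand $B = |z|^2 |\log \delta(\varepsilon)|$ is introduced only for this illustration.

```latex
% (i) The inequality |a| \le \delta + \delta^{-1} a^2 for every real a and every \delta > 0:
%     if |a| \le \delta it is immediate; if |a| > \delta then
|a| \;=\; \frac{a^2}{|a|} \;<\; \frac{a^2}{\delta} \;\le\; \delta + \delta^{-1} a^2 .

% (ii) Balancing the two terms of the bound c_1 \delta + c_5 \delta^{-1} B:
%      with \delta = B^{1/2} = |z|\, |\log \delta(\varepsilon)|^{1/2},
c_1 \delta + c_5 \delta^{-1} B \;=\; (c_1 + c_5)\, B^{1/2}
   \;=\; (c_1 + c_5)\, |z|\, |\log \delta(\varepsilon)|^{1/2},
% which is the right-hand side of (7.10) with C_0 = c_1 + c_5.
```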
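Proof of (7.13). As is well known,
\[
\mathbb{E}^{eq}_\varepsilon [M_t^2] = \mathbb{E}^{eq}_\varepsilon \int_0^t (AG'^2 - 2G' AG')(\omega(s))\,ds = t (Z_1 + Z_2 + Z_3),
\]
where
\[
Z_1 = \mathbb{E}^{eq}_\varepsilon (A_0 G'^2 - 2G' A_0 G')(\omega), \quad
Z_2 = \mathbb{E}^{eq}_\varepsilon (A_c G'^2 - 2G' A_c G')(\omega), \quad
Z_3 = \mathbb{E}^{eq}_\varepsilon (A_f G'^2 - 2G' A_f G')(\omega).
\]
We start with bounding $Z_1$:
\[
Z_1 = K_\varepsilon^{-3}\, \mathbb{E}^{eq}_\varepsilon \sum_i d(m_i) |\nabla_{x_i} G'(\omega)|^2 \le Z_{11} + Z_{12} + Z_{13} + Z_{14},
\]
where
\begin{align*}
Z_{11} &= 4 K_\varepsilon^{-3}\, \mathbb{E}^{eq}_\varepsilon \sum_i d(m_i) \Big| \sum_j \nabla \hat q^\varepsilon(x_i - x_j; m_i, m_j)\, \bar J(x_i, m_i, x_j, m_j) \Big|^2, \\
Z_{12} &= 4 K_\varepsilon^{-3}\, \mathbb{E}^{eq}_\varepsilon \sum_i d(m_i) \Big| \sum_j \hat q^\varepsilon(x_i - x_j; m_i, m_j)\, \nabla_{x_i} \bar J(x_i, m_i, x_j, m_j) \Big|^2.
\end{align*}
The terms $Z_{13}$ and $Z_{14}$ are given by a similar expression; $x_i$ and $x_j$ are swapped inside the absolute values. We only bound $Z_{11}$ because $Z_{11}$ involves $\nabla \hat q^\varepsilon$, which is more singular than $\hat q^\varepsilon$. The remaining $Z_{1r}$ can be bounded in a similar way. Squaring yields
\begin{align*}
Z_{11} \le{}& 4 K_\varepsilon^{-3}\, \mathbb{E}^{eq}_\varepsilon \sum_i d(m_i) \sum_{j \ne k} \nabla \hat q^\varepsilon(x_i - x_j; m_i, m_j) \cdot \nabla \hat q^\varepsilon(x_i - x_k; m_i, m_k)\, \bar J(x_i, m_i, x_j, m_j)\, \bar J(x_i, m_i, x_k, m_k) \\
& + 4 K_\varepsilon^{-3}\, \mathbb{E}^{eq}_\varepsilon \sum_i d(m_i) \sum_j |\nabla \hat q^\varepsilon(x_i - x_j; m_i, m_j)|^2\, \bar J(x_i, m_i, x_j, m_j)^2 \\
=:{}& Z_{111} + Z_{112}.
\end{align*}
Use Lemma 7.2 to deduce
\[
Z_{112} \le c_1 K_\varepsilon^{-1} |\log \delta(\varepsilon)| \sum_{n_1, n_2} d(n_1)\, \alpha'(n_1, n_2)^2 \lambda_{n_1} \lambda_{n_2} \le c_2 K_\varepsilon^{-1} |\log \delta(\varepsilon)|.
\]
We now turn to $Z_{111}$. By Lemma 7.2,
\[
Z_{111} \le c_1 \sum_{n_1, n_2, n_3} d(n_1) \big[ |z|\,|\log|z|| + \delta(\varepsilon) \big]^2 \alpha'(n_1, n_2)\, \alpha'(n_1, n_3) \lambda_{n_1} \lambda_{n_2} \lambda_{n_3}
\le c_2 \big[ |z|\,|\log|z|| + \delta(\varepsilon) \big]^2 \le 4 c_2 \big[ |z|\,|\log|z|| \big]^2.
\]
In summary,
\[
Z_1 \le c_3 K_\varepsilon^{-1} |\log \delta(\varepsilon)| + c_3 \big[ |z|\,|\log|z|| \big]^2. \tag{7.30}
\]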
We now look at Z 2 . We have Z2 =
1 −3 eq K E α(m i , m j )Vε (xi − x j ; m i , m j ) 2 ε ε i, j ⎧ ⎡ ⎤⎫2 8 ⎬ ⎨ ⎣i, j (0) + i, j,k ( p)⎦ , × ⎭ ⎩ p=1
k
where 8p=1 i, j,k ( p) represents the eight terms that appeared in the definition of H¯ 22 ε and i, j (0) = −qˆ (xi − x j ; m i , m j ) J¯(xi , m i , x j , m j ). An application of the inequality ⎞2 ⎛ 8 8 ⎠ ⎝ ap ≤9 a 2p , p=0
p=0
yields that Z 2 is bounded by
⎡ 2 ⎤ 8 9 −3 eq K E α(m i , m j )Vε (xi − x j ; m i , m j ) ⎣i, j (0)2 + i, j,k ( p) ⎦ 2 ε ε i, j
=:
8 p=0
Z2 p
p=1
k
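with for example,
\[
Z_{28} \le \frac{9}{2}\, \mathbb{E}^{eq}_\varepsilon K_\varepsilon^{-3} \sum_{i,j} \alpha(m_i, m_j) V^\varepsilon(x_i - x_j; m_i, m_j) \Big[ \sum_k \hat q^\varepsilon(x_k - x_j; m_k, m_j)\, \bar J(x_k, m_k, x_j, m_j) \Big]^2.
\]
We only treat $Z_{20}$ and $Z_{28}$, as the other terms $Z_{2r}$ for $r = 1, \dots, 7$ can be treated as $Z_{28}$. We have $Z_{28} = Z_{281} + Z_{282}$, where
\begin{align*}
Z_{281} &= \frac{9}{2}\, \mathbb{E}^{eq}_\varepsilon K_\varepsilon^{-3} \sum_{i,j} V^\varepsilon(x_i - x_j; m_i, m_j)\, \alpha(m_i, m_j) \sum_{k \ne l} \hat q^\varepsilon(x_k - x_j; m_k, m_j)\, \hat q^\varepsilon(x_l - x_j; m_l, m_j)\, \bar J(x_k, m_k, x_j, m_j)\, \bar J(x_l, m_l, x_j, m_j), \\
Z_{282} &= \frac{9}{2}\, \mathbb{E}^{eq}_\varepsilon K_\varepsilon^{-3} \sum_{i,j} V^\varepsilon(x_i - x_j; m_i, m_j)\, \alpha(m_i, m_j) \sum_k \hat q^\varepsilon(x_k - x_j; m_k, m_j)^2\, \bar J(x_k, m_k, x_j, m_j)^2.
\end{align*}
We start with the former:
\[
Z_{281} \le c_1 \sum_{n_1, n_2, n_3, n_4} \alpha(n_1, n_2) \lambda_{n_1} \lambda_{n_2} \lambda_{n_3} \lambda_{n_4} \int |\hat q^\varepsilon(a; n_3, n_2)|\,da \int |\hat q^\varepsilon(a; n_4, n_2)|\,da \le c_2 |z|^2.
\]
As for $Z_{282}$ we have
\[
Z_{282} \le c_1 K_\varepsilon^{-1} \sum_{n_1, n_2, n_3} \alpha(n_1, n_2) \int \hat q^\varepsilon(a; n_3, n_2)^2\,da\, \lambda_{n_1} \lambda_{n_2} \lambda_{n_3} \le c_2 K_\varepsilon^{-1}.
\]
Finally
\[
Z_{20} \le c_1 K_\varepsilon^{-2} \sum_{n_1, n_2} \alpha(n_1, n_2) \lambda_{n_1} \lambda_{n_2} \int \hat q^\varepsilon(a; n_1, n_2)^2\,da \le c_2 K_\varepsilon^{-2}.
\]
In summary,
\[
Z_2 \le c_1 \big[ K_\varepsilon^{-1} + |z|^2 \big].
\]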
We now turn to Z 3 . We have Z3 =
m i −1 1 eq −3 Eε K ε β(m, m i − m) V ε (xi − y; m i − m, m) 2 i m=1 ⎡ ⎤2 4 ⎣i (y, m) + i ( p; y, m)⎦ dy, p=1
(7.31)
where for example i (y, m) = qˆ ε (xi − y; m, m i − m) J¯(xi , m, m i − m, y), q¯ ε (y − x j ; m, m j ) J¯(y, m, x j , m j ). i (3; y, m) = j
Again Z 3 ≤ 5 Z 33
4 0
Z 3r with for example
m i −1 5 eq −3 = Eε K ε β(m, m i − m) V ε (xi − y; m i − m, m)i (3; y, m)2 dy. 2 i
m=1
We can now repeat the line of argument we had for Z 2 by squaring out i and use Lemma 7.2 to get Z 3 ≤ c1 K ε−1 + |z|2 . (7.32) From (7.30), (7.31) and (7.32) we deduce (7.13).
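The numerical constants 9 and 5 that appear in the bounds for $Z_2$ and $Z_3$ come from the Cauchy–Schwarz inequality applied to a finite sum. As a sketch, for real numbers $a_0, \dots, a_{n-1}$ (generic placeholders, not objects from the text):

```latex
% Cauchy--Schwarz with the constant vector (1, \dots, 1):
\Big( \sum_{p=0}^{n-1} a_p \Big)^2
  \;=\; \Big( \sum_{p=0}^{n-1} 1 \cdot a_p \Big)^2
  \;\le\; \Big( \sum_{p=0}^{n-1} 1^2 \Big) \Big( \sum_{p=0}^{n-1} a_p^2 \Big)
  \;=\; n \sum_{p=0}^{n-1} a_p^2 .

% n = 9 (indices p = 0, \dots, 8) gives the factor 9 used for Z_2;
% n = 5 (indices p = 0, \dots, 4) gives the factor 5 used for Z_3.
```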
8. Kinetic Limit

In this section we establish the main claim of Theorem 3.1. We now state the martingale formulation of the Ornstein–Uhlenbeck diffusion which uniquely determines the solution of Eq. (3.13).

Definition 8.1. We say $\xi$ is a solution of (3.13) if for any smooth function $J$ of compact support with $\int J = 0$, the following processes are martingales:
\[
M^J(t) = M(t) = \xi(t, J) - \xi(0, J) - \int_0^t \xi(s, \mathcal{L} J)\,ds, \qquad
N^J(t) = N(t) = M(t)^2 - t A(J).
\]
Here $J = (J_n : n \in \mathbb{N})$ with $J_n : \mathbb{R}^d \to \mathbb{R}$ and $\int J_n(x)\,dx = 0$, $\mathcal{L} = \mathcal{L}_0 + \mathcal{L}_c + \mathcal{L}_f$, and
\begin{align*}
\xi(t, J) &= \sum_n \xi_n(t, J_n), \\
\xi(t, \mathcal{L}_0 J) &= \sum_n d(n)\, \xi_n(t, \Delta_x J_n), \\
\xi(t, \mathcal{L}_c J) &= \sum_{m,n} \hat\alpha(m, n) \lambda_n\, \xi_n(t, J_{n+m} - J_n - J_m), \\
\xi(t, \mathcal{L}_f J) &= \sum_{m,n} \hat\beta(m, n)\, \xi_n(t, J_n + J_m - J_{n+m}), \\
A(J) &= 2 \sum_n d(n) \lambda_n \int |\nabla_x J_n|^2\,dx + \frac12 \sum_{m,n} \hat\alpha(m, n) \lambda_n \lambda_m \int (J_{n+m} - J_n - J_m)^2\,dx + \frac12 \sum_{m,n} \hat\beta(m, n) \lambda_{n+m} \int (J_{n+m} - J_n - J_m)^2\,dx.
\end{align*}
We note that the last two terms in the definition of A(J ) are equal by the detailed balance assumption. Ideally, we would like to show that the family P ε is tight as ε → 0 and that any limit point solves (3.13). Unfortunately we have not been able to establish the tightness and the difficulty comes from two error terms which go to 0 as ε → 0 for each t. More precisely, let us define ξ (t, J ) = ξ(t, J ) + ξ (t, J ), where J (xi (t), m i (t)), ξ(t, J ) = K ε1/2 i 1 ¯ 2 G ε (ω(t)),
and ξ (t, J ) = with
G (ω; z 2 − z 1 )ζ δ (ε) (z 2 )ζ δ (ε) (z 1 )dz 1 dz 2 . G¯ ε (ω) = G(ω; z)ζ δ(ε) (z)dz + (8.1) Here G and G are as in (5.1) and (7.3), and ζ δ (a) = δ −2 ζ (a/δ) with ζ a smooth non-negative symmetric function of compact support satisfying ζ (a)da = 1. We take a countable dense subset D0 of smooth functions of compact support and write H = L 1 ([0, T ]; R)D0 . The transformation ω(·) → (ξ (·, J ) : J ∈ D0 ) induces a probability measure Pˆ ε on H. Let us write P for the distribution of a process ξ which solves (3.13) and is subject to the following initial condition: ξ(0, J ) is a Gaussian random variable with variance
\[
\int \xi(0, J)^2\, P(d\xi) = \sum_n \lambda_n \int J^2(x, n)\,dx.
\]
Note that ξ(·, J ) is stationary under P. Note also that P can be regarded as a probability measure on H. It turns out that the tightness of the sequence Pˆ ε can be shown by standard arguments. Theorem 8.1. The sequence Pˆ ε converges to P as ε → 0. Moreover, lim Eeq ε ξ (t, J ) = 0, ε
(8.2)
for every t. We note that (8.2) is an immediate consequence of (5.5) and (7.7). The proof of the convergence of Pˆ ε is naturally divided into two steps. The first step is devoted to the proof of the tightness of the family Pˆ ε . This step will be carried out in Sect. 9. For the second step, we show that any limit point solves (3.13). This is a rather straight forward consequence of Theorem 8.2 below. This theorem is also the main ingredient for the proof of Theorem 3.1. We note that by a celebrated result of Holley and Stroock [HS], (3.13) has a unique solution in the sense of Definition 8.1. Theorem 8.2. There exist martingales Mε and Nε , and processes Err 1,ε and Err 2,ε such that
t ξ (t, J ) − ξ (0, J ) − ξ (s, J )ds = Mε (t) + Err 1,ε (t), (8.3) 0
Mε (t)2 − t A(J ) = Nε (t) + Err 2,ε ,
(8.4)
where A(J ) was defined in Definition 8.1, and 1,ε eq 2,ε lim Eeq ε Err (t) = lim Eε Err (t) = 0. ε→0
ε→0
The proof of Theorem 8.2 is naturally divided into two parts. Proof of (8.3). Step 1: Let us write X¯ ε (ω) = X ε (ω)+ 21 G¯ ε (ω), where X ε (ω) and G¯ ε (ω) were defined by (4.1) and (8.1) respectively. As it is a well-known fact for Markov processes, the following process is a martingale:
t M¯ ε (t) := X¯ ε (ω(t)) − X¯ ε (ω(0)) − A X¯ ε (ω(s))ds. 0
Note that by definition, X¯ ε (ω(t)) = ξ (t, J ). Let us study the term A X¯ ε . We certainly have 1 A X¯ ε = A0 X ε + Ac X ε + A f X ε + AG¯ ε . 2 Note that the term AX ε involves J˜ whereas AG¯ ε involves Jˆ. We replace J˜ of AX ε with eq Jˆ. This causes an error Err 0 which is small because Eε | Err 0 | is bounded above by c1 K ε1/2 α(m, n)11(c0 εr (m, n) ≥ 1)λm λn ≤ c2 K ε1/2 a(n)11(2c0 εr (n) ≥ 1)λn . n,m
n
As a result, we may use (3.9) to deduce lim Eeq ε | Err 0 | = 0.
ε→0
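As a consequence of Theorems 5.1 and 7.1 (see Remark 7.1), we have
\[
\int_0^t \Big[ A_c X_\varepsilon + \tfrac12 A \bar G_\varepsilon \Big](\omega(s))\,ds = \int_0^t Q_\varepsilon(\omega(s))\,ds - \int_0^t H^0_{33}(\omega(s))\,ds + \int_0^t \mathrm{Err}_1\,ds,
\]
where $Q_\varepsilon(\omega)$ equals
\[
\tfrac12 K_\varepsilon^{-3/2} \sum_{i,j} \alpha(m_i, m_j) \int\!\!\int \bar W^\varepsilon(x_i - x_j + z_2 - z_1; m_i, m_j)\, \hat J(x_i, m_i, x_j, m_j)\, \zeta^{\delta'}(z_1)\, \zeta^{\delta'}(z_2)\,dz_1\,dz_2,
\]
and
\[
\mathbb{E}^{eq}_\varepsilon |\mathrm{Err}_1| \le c_1 \big[ K_\varepsilon^{1/2} \delta(\varepsilon) + K_\varepsilon^{-1/2} |\log \delta(\varepsilon)| \big] + c_1 \big[ \delta'(\varepsilon) + K_\varepsilon^{-1/2} |\log \delta(\varepsilon)|^{1/2} \big], \tag{8.5}
\]
which goes to 0 in the small $\varepsilon$ limit.

Step 2: Recall that the summation is over distinct $i$ and $j$ by our overall convention. However, one can readily check that if we allow $i = j$ in the summation, then the discrepancy is of order $O(K_\varepsilon^{-1/2})$. Also, if we replace $\hat J$ with $\tilde J$, the error is of order $O(\tau(\varepsilon))$. The sum of these two errors is denoted by $\mathrm{Err}_2$, and we have
\[
\mathbb{E}^{eq}_\varepsilon |\mathrm{Err}_2| \le c_1 \big( K_\varepsilon^{-1/2} + \tau(\varepsilon) \big), \tag{8.6}
\]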
Equilibrium Fluctuations of Coagulating-Fragmenting Planar Brownian Particles
809
which goes to 0 in small ε limit. Because of the form of J˜, we may write Q ε (ω) = Q 1ε + Q 2ε − Q 3ε − Q 4ε + Err2 , where for example Q 4ε (ω), given by
1 K ε−3/2 α(m i , m j )W¯ ε (xi − x j + z 2 − z 1 ; m i , m j ) 2 i, j
J (x j , m j )ζ δ (ε) (z 1 )ζ δ (ε) (z 2 )dz 1 dz 2 , with the summation over all i and j. We make a change of variables xi − z 1 = a1 , x j − z 2 = a2 to write that Q 4ε (ω) equals
1 −3/2 Kε α(m i , m j )W¯ ε (a1 − a2 ; m i , m j ) 2 i, j
J (x j , m j )ζ δ (ε) (xi − a1 )ζ δ (ε) (x j − a2 )da1 da2 = K ε1/2 α(m, n)W¯ ε (a1 − a2 ; m, n) f ε (a1 , m; ω) f ε (a2 , n; ω; J )da1 da2 , m,n
where f ε (a, m; ω) = K ε−1
ζ δ (ε) (xi − a)11(m i = m),
i
ε
f (a, m; ω; J ) =
K ε−1
ζ δ (ε) (xi − a)J (xi , m)11(m i = m).
i 42 We then have that Q 4ε (ω) = Q 41 ε (ω) + Q ε (ω) + Err 3 , where
1 Q 41 (ω) = α(m, n)W¯ ε (a1 − a2 , m, n)λm f ε (a2 , n; ω; J )da1 da2 , K ε1/2 ε 2 m,n
1 Q 42 α(m, n)W¯ ε (a1 − a2 ; m, n)λn f ε (a1 , m; ω) J¯ε (a2 , n)da1 da2 , K ε1/2 ε (ω) = 2 m,n
where J¯ε (a, n) = ζ δ (ε) (x − a)J (x, n)d x and Err 3 is given by
1 α(m, n)W¯ ε (a1 − a2 ; m, n) K ε1/2 2 m,n ( f ε (a1 , m; ω) − λm )( f ε (a2 , n; ω; J ) − λn J¯ε (a2 , m))da1 da2
1 = α(m i , m j )W¯ ε (a1 − a2 ; m i , m j ) K ε−3/2 2 i, j
(xi − a1 ) − λm i )(ζ δ (ε) (x j − a2 )J (x j , m j ) − λm j J¯ε (a2 , m j ))da1 da2 . Here we have used the assumption J = 0. We wish to show that Err 3 is small. We first observe that we can write Err3 = Err31 + Err32 , where Err31 is what we obtain
(ζ
δ (ε)
by restricting the summation to indices i = j, and Err32 corresponds to the case i = j. −1/2 It is not hard to show that Err32 is of order O(K ε ). On the other hand, 2 −1 −2 Eeq ε Err 31 ≤ c1 K ε δ (ε) .
(8.7)
To see this, observe that Err 231 equals
1 α(m i , m j )α(m p , m q ) da1 da2 db1 db2 K ε−3 2 i, j, p,q (a1 − a2 ; m i , m j )W¯ ε (b1
W¯
ε
(ζ
δ (ε)
(xi − a1 ) − λm i )(ζ
(ζ
δ (ε)
(x j − a2 )J (x j , m j ) − λm j J¯ε (a2 , m j ))
− b2 ; m p , m q )
δ (ε)
(x p − b1 ) − λm p )
(ζ δ (ε) (xq − b2 )J (xq , m q ) − λm q J¯ε (b2 , m q )) =: E 1 + E 2 + E 3 , where E s represents the above summation with (i, j, p, q) ∈ I (s) with I (1) corresponding to the cases i = p, q or p = i, j or j = p, q or q = i, j, I (2) corresponds to the case i = p and q = j, and I (3) corresponding to the case i = q and p = j. (Recall that the summation in our expression for Err 231 is over i = j and q = p.) We can readily check Eeq ε E 1 = 0.
(8.8)
eq Eε E 2
equals On the other hand
1 −1 Kε W¯ ε (a1 − a2 ; m, n)W¯ ε (b1 − b2 ; m, n)α(m, n)2 (λm γ δ (ε) (a1 − b1 ) − λ2m ) 2 m,n (λn ζ δ (ε) (y − a2 )ζ δ (ε) (y − b2 )J 2 (y, n) − λ2n J¯ε (a2 , n) J¯ε (b2 , n))dyda1 da2 db1 db2 1 = (E 21 + E 22 + E 23 + E 24 ), 2 where γ δ (ε) (a) = δ (ε)−2 γ (a/δ (ε)), for γ (a) = ζ (a + b)ζ (b)db, and E 2r for r = 1, . . . , 4, are given by
−1 E 21 = K ε λm λn W¯ ε (a1 − a2 ; m, n)W¯ ε (b1 − b2 ; m, n)α(m, n)2 γ δ (ε) (a1 − b1 )
m,n
(y − a2 )ζ δ (ε) (y − b2 )J 2 (y, n)dyda1 da2 db1 db2 ,
= K ε−1 λ2m λn W¯ ε (a1 − a2 ; m, n)W¯ ε (b1 − b2 ; m, n)α(m, n)2 ζ
E 22
δ (ε)
m,n δ (ε)
E 23
(y − a2 )ζ δ (ε) (y − b2 )J 2 (y, n)dyda1 da2 db1 db2 ,
= K ε−1 λm λ2n W¯ ε (a1 − a2 ; m, n)W¯ ε (b1 − b2 ; m, n)α(m, n)2 γ δ (ε) (a1 − b1 )
E 24
J¯ε (a2 , n) J¯ε (b2 , n)dyda1 da2 db1 db2 ,
−1 2 2 = Kε λm λn W¯ ε (a1 − a2 ; m, n)W¯ ε (b1 − b2 ; m, n)α(m, n)2
ζ
m,n
m,n
J (a2 , n) J¯ε (b2 , n)dyda1 da2 db1 db2 . ¯ε
We can readily see |E 22 | + |E 23 | + |E 24 | ≤ c1 K ε−1 , for a constant c1 . As for E 21 we have, E 21 ≤ c2 K ε−1 δ (ε)
−2
λm λn
W¯ ε (a1 − a2 ; m, n)W¯ ε
m,n
(b1 − b2 ; m, n)α(m, n)2 γ δ (ε) (a1 − b1 )
ζ δ (ε) (y − a2 )J 2 (y, n)dyda1 da2 db1 db2 ≤ c3 K ε−1 δ (ε)
−2
. −1/2
δ (ε)−2 for a constant c5 . Hence Eε E 2 ≤ c4 K ε−1 δ (ε)−2 . Similarly Eε E 3 ≤ c2 K ε This and (8.8) yield (8.7). Step Note that αˆ = α limε→0 W ε (this was proved in [HR1] as Theorem 3.2), 3: and W¯ ε = W ε . Hence
1 Q 41 (ω) = α(m, ˆ n)λm f ε (a2 , n; ω, J )da2 + Err4 K ε1/2 ε 2 m,n 1 −1/2 = Kε α(m, ˆ n)λm J (x j , n)11(m j = n) + Err41 , (8.9) 2 m,n eq
eq
j
where Err4 is the error we get by replacing W¯ ε = W ε with its limit as ε → 0. Since ⎤ ⎡ ⎣ K ε−1/2 Eeq α(m, ˆ n)λm J (x j , n)11(m j = n)⎦ ≤ c1 , ε j
m,n
for a constant c1 independent of ε, we deduce 2 lim Eeq ε |Err41 | = 0.
ε→0
(8.10)
Moreover, since J¯ε (a, n) = λn J (a, n) + O(δ (ε)), we have that Q 42 ε (ω) equals
1 α(m, n)W¯ ε (a1 − a2 ; m, n)J (a2 , m) f δ (a1 , n; ω)da1 da2 + O(δ) K ε1/2 2 m,n
1 = α(m, ˆ n)λn J (a1 , n) f δ (a1 , m; ω)da1 + Err 42 (8.11) K ε1/2 2 m,n
1 = α(m, ˆ n)λn J (xi , n)11(m i = m) + Err 42 K ε−1/2 2 m,n i
with 2 lim Eeq ε |Err42 | = 0.
ε→0
(8.12)
The terms $Q^j_\varepsilon$ for $j = 1, 2, 3$ can be treated likewise. From (8.6), (8.7), (8.9), (8.10), (8.11), (8.12) and (8.2) we deduce
t
t
t Q δε (ω(s))ds = c ξ(s, J )ds + Err 5 ds, (8.13) 0
0
0
with lim Eeq ε |Err5 | = 0.
(8.14)
ε→0
Step 4: We now study the term $H^0_{33}$. Recall
\[ H^0_{33}(\omega) = \frac12 K_\varepsilon^{-3/2}\sum_i\sum_{m=1}^{m_i-1}\beta(m,m_i-m)\int V^\varepsilon(x_i-y;m,m_i-m)\,u^\varepsilon(x_i-y;m,m_i-m)\,\hat J(x_i,m,y,m_i-m)\,dy. \]
As we discussed in Step 1, we have replaced $\hat J$ with $\tilde J$ for an error which vanishes in small ε limit. Moreover
\[ \tilde J(x_i,m,y,m_i-m) = \tilde J(x_i,m,x_i,m_i-m) + O\bigl(r(m,m_i-m)\varepsilon\bigr), \]
whenever $V^\varepsilon(x_i-y;m,m_i-m)\neq 0$. Hence $-H^0_{33}(\omega) + \mathcal{A}_f X_\varepsilon$ equals
\[ -\frac12 K_\varepsilon^{-1/2}\sum_i\sum_{m=1}^{m_i-1}\beta(m,m_i-m)\int W^\varepsilon(x_i-y;m,m_i-m)\,\tilde J(x_i,m,x_i,m_i-m)\,dy + \mathrm{Err}_6 \]
\[ = -\frac12 K_\varepsilon^{-1/2}\sum_i\sum_{m=1}^{m_i-1}\hat\beta(m,m_i-m)\,\tilde J(x_i,m,x_i,m_i-m) + \mathrm{Err}_7 = f(\xi,J) + \mathrm{Err}_7, \]
with
\[ \lim_{\varepsilon\to0}\mathbb{E}^{\mathrm{eq}}_\varepsilon|\mathrm{Err}_7| = 0. \tag{8.15} \]
This is proved as in the previous step. For example, we have used
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon\Bigl|K_\varepsilon^{-1/2}\sum_i\sum_{m=1}^{m_i-1}\beta(m,m_i-m)\int W^\varepsilon(x_i-y;m,m_i-m)\int_0^1\nabla\hat J\bigl(x_i,m,x_i+\theta(y-x_i),m_i-m\bigr)\cdot(y-x_i)\,d\theta\,dy\Bigr|^2 = O(\varepsilon^2), \]
where $\nabla$ denotes the derivative with respect to the second spatial variable.
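The integral over $\theta$ in the last display is the usual first-order Taylor remainder. A minimal sketch of that step, assuming (as the $O(r\varepsilon)$ estimate above suggests, but as is not restated in this section) that $W^\varepsilon(x_i-y;m,n)$ vanishes unless $|y-x_i|\le c\,r(m,n)\varepsilon$ for some constant $c$:

```latex
% Fundamental theorem of calculus along the segment from x_i to y,
% applied in the second spatial variable of \hat J (n stands for m_i - m):
\[
  \hat J(x_i,m,y,n) - \hat J(x_i,m,x_i,n)
  = \int_0^1 \nabla\hat J\bigl(x_i,m,x_i+\theta(y-x_i),n\bigr)\cdot(y-x_i)\,d\theta .
\]
% On the (assumed) support |y - x_i| <= c r(m,n) eps this gives
\[
  \bigl|\hat J(x_i,m,y,n) - \hat J(x_i,m,x_i,n)\bigr|
  \le \|\nabla\hat J\|_\infty\, c\, r(m,n)\,\varepsilon .
\]
```

The constant $c$ and the support condition are assumptions made for this illustration only.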
Final Step: From (8.4-6) and (8.13-15) we deduce that the martingale M¯ ε (t) satisfies M¯ ε (t) = ξ (t, J ) − ξ (0, J ) − = ξ (t, J ) − ξ (0, J ) −
t
0
t
ξ(s, J )ds +
Err 8 (ω(s))ds
0
t
ξ (s, J )ds +
0
t
Err 9 (ω(s))ds
0
eq lim Eeq ε | Err 8 (ω(0))| = lim Eε | Err 9 (ω(0))| = 0.
ε→0
ε→0
(8.16)
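Proof of (8.4). Step 1: Define $\hat G_\varepsilon(\omega) = \int G(\omega;z)\,\zeta^{\delta(\varepsilon)}(z)\,dz$ and let us write $\hat X_\varepsilon(\omega) = X_\varepsilon(\omega) + \frac12\hat G_\varepsilon(\omega)$. Set
\[ \hat M_\varepsilon(t) := \hat X_\varepsilon(\omega(t)) - \hat X_\varepsilon(\omega(0)) - \int_0^t \mathcal{A}\hat X_\varepsilon(\omega(s))\,ds. \]
Note that $\hat M_\varepsilon(t) = \bar M_\varepsilon(t) + M_t$, where $M_t$ was defined in Sect. 7. By (7.13),
\[ \lim_{\varepsilon\to0}\mathbb{E}^{\mathrm{eq}}_\varepsilon\bigl[\hat M_\varepsilon(t) - \bar M_\varepsilon(t)\bigr]^2 = 0. \tag{8.17} \]
As is well known for Markov processes, the following process is also a martingale:
\[ \hat N_\varepsilon(t) = \hat M_\varepsilon(t)^2 - \int_0^t \bigl(\mathcal{A}(\hat X_\varepsilon)^2 - 2\hat X_\varepsilon\mathcal{A}\hat X_\varepsilon\bigr)(\omega(s))\,ds. \]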
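For orientation, the integrand of this second martingale can be computed by hand for a pure diffusion. The sketch below assumes that the diffusive part of the generator (written $\mathcal{A}_0$ here) acts as $\sum_i d(m_i)\Delta_{x_i}$; that normalization is an assumption of this illustration, not something restated in this section.

```latex
% For a smooth function F of the configuration and
% \mathcal{A}_0 = \sum_i d(m_i) \Delta_{x_i}, the product rule gives
\[
  \Delta_{x_i}(F^2) = 2F\,\Delta_{x_i}F + 2\,|\nabla_{x_i}F|^2 ,
\]
% so that
\[
  \mathcal{A}_0(F^2) - 2F\,\mathcal{A}_0F = 2\sum_i d(m_i)\,|\nabla_{x_i}F|^2 .
\]
% If \nabla_{x_i}F = K_\varepsilon^{-1/2}(\nabla_x J(x_i,m_i) + B_i), this becomes
\[
  2K_\varepsilon^{-1}\sum_i d(m_i)\,\bigl|\nabla_x J(x_i,m_i) + B_i\bigr|^2 .
\]
```

Under these assumptions the result has the same shape as the term $A_0$ in the decomposition that follows.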
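We certainly have
\[ A := \mathcal{A}(\hat X_\varepsilon)^2 - 2\hat X_\varepsilon\mathcal{A}\hat X_\varepsilon = A_0 + A_c + A_f, \tag{8.18} \]
with
\[ A_0 = 2K_\varepsilon^{-1}\sum_i d(m_i)\,\bigl|\nabla_x J(x_i,m_i) + B_i(\omega)\bigr|^2, \]
where
\[ B_i(\omega) = \frac12K_\varepsilon^{-1}\sum_j\nabla_{x_i}\bigl[\bar u^\varepsilon(x_i-x_j;m_i,m_j)\hat J(x_i,m_i,x_j,m_j)\bigr] + \frac12K_\varepsilon^{-1}\sum_j\nabla_{x_i}\bigl[\bar u^\varepsilon(x_j-x_i;m_j,m_i)\hat J(x_j,m_j,x_i,m_i)\bigr], \]
with $\bar u^\varepsilon = \tilde u^\varepsilon - u^\varepsilon$, where $\tilde u^\varepsilon(a;m,n) = \int u^\varepsilon(a+z;m,n)\,\zeta^{\delta(\varepsilon)}(z)\,dz$. The exact form of $A_c$ and $A_f$ will be given in Steps 2 and 4 respectively. We have $B_i = B_{i1} + B_{i2}$, where $B_{i1}(\omega)$ and $B_{i2}(\omega)$ are given by
\[ \frac12K_\varepsilon^{-1}\sum_j\Bigl[\bar u^\varepsilon_x(x_i-x_j;m_i,m_j)\hat J(x_j,m_j,x_i,m_i) - \bar u^\varepsilon_x(x_j-x_i;m_j,m_i)\hat J(x_i,m_i,x_j,m_j)\Bigr], \]
\[ \frac12K_\varepsilon^{-1}\sum_j\Bigl[\bar u^\varepsilon(x_i-x_j;m_i,m_j)\hat J_x(x_i,m_i,x_j,m_j) - \bar u^\varepsilon(x_j-x_i;m_j,m_i)\hat J_y(x_j,m_j,x_i,m_i)\Bigr], \]
respectively. We have $A_0 = 2A_{00} + 2A_{01} + 2A_{02}$, with
\[ A_{00} = K_\varepsilon^{-1}\sum_i d(m_i)\,|\nabla_x J(x_i,m_i)|^2, \]
\[ A_{01} = 2K_\varepsilon^{-1}\sum_i d(m_i)\,\bigl(B_{i1}(\omega)+B_{i2}(\omega)\bigr)\cdot\nabla_x J(x_i,m_i), \]
\[ A_{02} = K_\varepsilon^{-1}\sum_i d(m_i)\,\bigl(B_{i1}(\omega)+B_{i2}(\omega)\bigr)^2. \]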
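We first show that the term $A_{01}$ is small. Note that $|A_{01}|\le A_{011} + A_{012}$, for
\[ A_{01r} = c_1K_\varepsilon^{-1}\sum_i d(m_i)\,|B_{ir}(\omega)|, \]
with $c_1 = 2\|\nabla J\|_\infty$. By Lemma 6.1,
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon A_{011} \le c_2\sum_{m,n}\lambda_m\lambda_n\int|\nabla\bar u^\varepsilon(a;m,n)|\,da \le c_3\,\delta(\varepsilon)\,|\log\delta(\varepsilon)|, \]
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon A_{012} \le c_2\sum_{m,n}\lambda_m\lambda_n\int_{|a|\le1}|\bar u^\varepsilon(a;m,n)|\,da \le c_3\,\delta(\varepsilon), \]
as in the proof of (5.5) and (5.7). As a result, $\mathbb{E}^{\mathrm{eq}}_\varepsilon|A_{01}|\le 2c_3\,\delta(\varepsilon)\,|\log\delta(\varepsilon)|$. We now turn to $A_{02}$. We may write $A_{02} = A_{021} + A_{022} + A_{023}$, with
\[ A_{021} = K_\varepsilon^{-1}\sum_i d(m_i)B_{i1}^2,\qquad A_{022} = K_\varepsilon^{-1}\sum_i d(m_i)B_{i2}^2,\qquad A_{023} = 2K_\varepsilon^{-1}\sum_i d(m_i)B_{i1}B_{i2}. \tag{8.19} \]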
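We now use Lemma 6.2 to show that $A_{022}$ and $A_{023}$ are small. Indeed after squaring,
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon A_{022} \le c_4\,\mathbb{E}^{\mathrm{eq}}_\varepsilon K_\varepsilon^{-3}\sum_{i,j,k}d(m_i)\,|\bar u^\varepsilon(x_i-x_j;m_i,m_j)|\,|\bar u^\varepsilon(x_i-x_k;m_i,m_k)| = A_{0221} + A_{0222}, \]
where $A_{0221}$ and $A_{0222}$ correspond to the cases $k\neq j$ and $k = j$ respectively. By Lemma 6.2 and (3.10),
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon A_{0221} \le c_5\sum_{m,n,p}\lambda_m\lambda_n\lambda_p\int_{|a|\le1}|\bar u^\varepsilon(a;m,n)|\,da\int_{|a|\le1}|\bar u^\varepsilon(a;m,p)|\,da \le c_6\,\delta(\varepsilon)^2, \]
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon A_{0222} \le c_5K_\varepsilon^{-1}\sum_{m,n}\lambda_m\lambda_n\int_{|a|\le1}|\bar u^\varepsilon(a;m,n)|^2\,da \le c_6K_\varepsilon^{-1}. \]
In the same fashion,
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon A_{023} \le c_7\,\mathbb{E}^{\mathrm{eq}}_\varepsilon K_\varepsilon^{-3}\sum_{i,j,k}d(m_i)\,|\bar u^\varepsilon(x_i-x_j;m_i,m_j)|\,|\nabla\bar u^\varepsilon(x_i-x_k;m_i,m_k)| = A_{0231} + A_{0232}, \]
where $A_{0231}$ and $A_{0232}$ correspond to the cases $k\neq j$ and $k = j$ respectively. By Lemma 6.2 and (3.10),
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon A_{0231} \le c_8\sum_{m,n,p}\lambda_m\lambda_n\lambda_p\int_{|a|\le1}|\bar u^\varepsilon(a;m,n)|\,da\times\int_{|a|\le1}|\nabla\bar u^\varepsilon(a;m,p)|\,da \le c_9\,\delta(\varepsilon)^2\,|\log\delta(\varepsilon)|, \]
\begin{align*}
\mathbb{E}^{\mathrm{eq}}_\varepsilon A_{0232} &\le c_{10}K_\varepsilon^{-1}\sum_{m,n}\lambda_m\lambda_n\int_{|a|\le1}|\bar u^\varepsilon(a;m,n)|\,|\nabla\bar u^\varepsilon(a;m,n)|\,da \\
&\le c_{10}K_\varepsilon^{-1}\sum_{m,n}\lambda_m\lambda_n\Bigl[\int_{|a|\le1}|\bar u^\varepsilon(a;m,n)|^2\,da\Bigr]^{1/2}\Bigl[\int_{|a|\le1}|\nabla\bar u^\varepsilon(a;m,n)|^2\,da\Bigr]^{1/2} \\
&\le c_{11}K_\varepsilon^{-1}\,\delta(\varepsilon)\,K_\varepsilon^{1/2}K_\varepsilon^{1/2} = c_{11}\,\delta(\varepsilon).
\end{align*}
As a result,
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon|A_{022} + A_{023}| \le c_{12}\bigl[\delta(\varepsilon) + K_\varepsilon^{-1}\bigr]. \tag{8.20} \]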
We now concentrate on $A_{021}$. First observe that since $V$ and $\zeta$ are symmetric, we learn that $\bar u^\varepsilon$ is symmetric. From this and symmetry of $\tilde J$ and $K$ we learn
\[ B_{i1} = K_\varepsilon^{-1}\sum_j\bar u^\varepsilon_x(x_i-x_j;m_i,m_j)\,\hat J(x_i,m_i,x_j,m_j). \]
After squaring, we obtain $A_{021} = A_{0211} + A_{0212}$, where
\[ A_{0211} = 2K_\varepsilon^{-3}\sum_{i,j,k}d(m_i)\,\bar u^\varepsilon_x(x_i-x_j;m_i,m_j)\cdot\bar u^\varepsilon_x(x_i-x_k;m_i,m_k)\,\hat J(x_i,m_i,x_j,m_j)\,\hat J(x_i,m_i,x_k,m_k), \]
\[ A_{0212} = K_\varepsilon^{-3}\sum_{i,j}d(m_i)\,|\bar u^\varepsilon_x(x_i-x_j;m_i,m_j)|^2\,\hat J(x_i,m_i,x_j,m_j)^2, \]
where $j\neq k$ in $A_{0211}$. Using Lemma 6.1 we deduce
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon|A_{0211}| \le c_1\sum_{m,n,p}\lambda_m\lambda_n\lambda_p\int_{|a|\le1}|\nabla\bar u^\varepsilon(a;m,n)|\,da\times\int_{|a|\le1}|\nabla\bar u^\varepsilon(a;m,p)|\,da \le c_2\,\delta(\varepsilon)^2\,(\log\delta(\varepsilon))^2. \]
As for $A_{0212}$, we first write $A_{0212} = A_{02121} + A_{02122} + A_{02123}$, where
\[ A_{02121} = K_\varepsilon^{-3}\sum_{i,j}d(m_i)\,|u^\varepsilon_x(x_i-x_j;m_i,m_j)|^2\,\hat J(x_i,m_i,x_j,m_j)^2, \]
\[ A_{02122} = K_\varepsilon^{-3}\sum_{i,j}d(m_i)\,|\tilde u^\varepsilon_x(x_i-x_j;m_i,m_j)|^2\,\hat J(x_i,m_i,x_j,m_j)^2, \]
\[ A_{02123} = -2K_\varepsilon^{-3}\sum_{i,j}d(m_i)\,\bigl(\tilde u^\varepsilon_x\cdot u^\varepsilon_x\bigr)(x_i-x_j;m_i,m_j)\,\hat J(x_i,m_i,x_j,m_j)^2. \]
We now argue that A02122 and A02123 are small. To see this, first observe that by Lemma 6.2, ε ε δ(ε) ∇u (a + z; m, n)ζ (z)dz |∇ u˜ (a; m, n)| =
−2 ≤ c3 δ(ε) |∇u ε (a + z; m, n)|dz |z|≤c3 δ(ε)
−2 |a + z|−1 dz ≤ c4 δ(ε) α (m, n) |z|≤c3 δ(ε) ≤ c5 α (m, n) min |a|−1 , δ(ε)−1 . As a result,
|a|≤1
|a|≤1
|∇ u˜ ε (a; m, n)|2 da ≤ c6 α (m, n)2 | log δ(ε)|,
|∇ u˜ ε (a; m, n) · ∇u ε (a; m, n)|da ≤ c6 α (m, n)2 | log δ(ε)|1/2 | log ε|1/2 .
From this we learn |A02122 | ≤ c7 α (m, n)2
| log δ(ε)| | log δ(ε)|1/2 , |A02123 | ≤ c7 α (m, n)2 . | log ε| | log ε|1/2
(8.21)
From (8.18)-(8.21) we deduce that $A_0 = 2A_{00} + 2A_{02121} + \mathrm{Err}_1$, with
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon|\mathrm{Err}_1| \le c\bigl[\delta(\varepsilon)\,|\log\delta(\varepsilon)| + |\log\varepsilon|^{-1/2}\,|\log\delta(\varepsilon)|^{1/2}\bigr]. \tag{8.22} \]
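Furthermore, if we pick a small $\delta > 0$ and write $A_{02121} = A_{021211} + A_{021212}$, with
\[ A_{021211} = K_\varepsilon^{-3}\sum_{i,j}d(m_i)\,|u^\varepsilon_x(x_i-x_j;m_i,m_j)|^2\,\hat J(x_i,m_i,x_j,m_j)^2\,\mathbb{1}(|x_i-x_j|\le\delta), \]
\[ A_{021212} = K_\varepsilon^{-3}\sum_{i,j}d(m_i)\,|u^\varepsilon_x(x_i-x_j;m_i,m_j)|^2\,\hat J(x_i,m_i,x_j,m_j)^2\,\mathbb{1}(|x_i-x_j|>\delta), \]
then we have that $\mathbb{E}^{\mathrm{eq}}_\varepsilon|A_{021212}|\le K_\varepsilon^{-1}\delta^{-2}$, and $A_{021211} = A_{0212111} + A_{0212112}$, where in the first term we replace the $x_j$ argument of $\hat J$ with $x_i$, and the second term is the error caused by such a replacement. More precisely,
\[ A_{0212111} = K_\varepsilon^{-3}\sum_{i,j}d(m_i)\,|u^\varepsilon_x(x_i-x_j;m_i,m_j)|^2\,\tilde J(x_i,m_i,x_i,m_j)^2\,\mathbb{1}(|x_i-x_j|\le\delta), \]
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon|A_{0212112}| \le K_\varepsilon^{-1}\,\delta\,c_1\sum_{m,n}\lambda_m\lambda_n\int_{|a|\le\delta}|\nabla u^\varepsilon(a;m,n)|^2\,da \le c_2\,\delta, \]
where we used the smoothness of $J$ for the first inequality and (6.8) for the second inequality. Now that we have replaced $x_j$ with $x_i$ in $\hat J$, we can drop the condition $|x_i-x_j|\le\delta$. Indeed $A_{0212111} = A_{02121111} + A_{02121112}$, with
\[ A_{02121111} = K_\varepsilon^{-3}\sum_{i,j}d(m_i)\,|\nabla u^\varepsilon(x_i-x_j;m_i,m_j)|^2\,\tilde J(x_i,m_i,x_i,m_j)^2\,\mathbb{1}(|x_i-x_j|\le1), \]
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon|A_{02121112}| \le c_3K_\varepsilon^{-1}\delta^{-2}. \]
We now choose $\delta = |\log\varepsilon|^{-1/3}$. In summary, $A_{02121} = A_{02121111} + \mathrm{Err}_2$, where
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon|\mathrm{Err}_2| \le c_4\,|\log\varepsilon|^{-1/3}. \tag{8.23} \]
On the other hand, by Lemma 5.1,
\[ \lim_{\varepsilon\to0}\mathbb{E}^{\mathrm{eq}}_\varepsilon\Bigl|\int_0^t 2A_{00}(\omega(s))\,ds - tA_0(J)\Bigr| = 0,\qquad \lim_{\varepsilon\to0}\mathbb{E}^{\mathrm{eq}}_\varepsilon\Bigl|\int_0^t 2A_{02121111}(\omega(s))\,ds - tA_0'(J)\Bigr| = 0, \tag{8.24} \]
where
\[ A_0(J) = 2\sum_n d(n)\lambda_n\int|\nabla J_n|^2\,dx,\qquad A_0'(J) = \sum_{m,n}\lambda_m\lambda_n\,d(m)\,\gamma(m,n)\int|J_{m+n}-J_n-J_m|^2\,dx, \tag{8.25} \]
with
\[ \gamma(m,n) = \lim_{\varepsilon\to0}|\log\varepsilon|^{-1}\int_{|a|\le1}|\nabla u^\varepsilon(a;m,n)|^2\,da. \]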
Step 2: We now study $A_c$. Recall that $\bar u^\varepsilon = \tilde u^\varepsilon - u^\varepsilon$. We have
\[ A_c(\omega) = \frac12K_\varepsilon^{-1}\sum_{i,j}\alpha(m_i,m_j)\,V_\varepsilon(x_i-x_j;m_i,m_j)\Bigl[S(i,j,\omega) + \sum_{r=0}^{8}R_r(i,j,\omega)\Bigr]^2, \]
where
\begin{align*}
S(i,j,\omega) &= \tilde J(x_i,m_i,x_j,m_j) + \hat J(x_i,m_i,x_j,m_j)\,K_\varepsilon^{-1}u^\varepsilon(x_i-x_j;m_i,m_j), \\
R_0(i,j,\omega) &= -K_\varepsilon^{-1}\hat J(x_i,m_i,x_j,m_j)\,\tilde u^\varepsilon(x_i-x_j;m_i,m_j), \\
R_1(i,j,\omega) &= K_\varepsilon^{-1}\frac{m_i}{m_i+m_j}\sum_k\bar u^\varepsilon(x_i-x_k;m_i+m_j,m_k)\,\hat J(x_i,m_i+m_j,x_k,m_k), \\
R_2(i,j,\omega) &= K_\varepsilon^{-1}\frac{m_i}{m_i+m_j}\sum_k\bar u^\varepsilon(x_k-x_i;m_k,m_i+m_j)\,\hat J(x_k,m_k,x_i,m_i+m_j), \\
R_3(i,j,\omega) &= K_\varepsilon^{-1}\frac{m_j}{m_i+m_j}\sum_k\bar u^\varepsilon(x_j-x_k;m_i+m_j,m_k)\,\hat J(x_j,m_i+m_j,x_k,m_k), \\
R_4(i,j,\omega) &= K_\varepsilon^{-1}\frac{m_j}{m_i+m_j}\sum_k\bar u^\varepsilon(x_k-x_j;m_k,m_i+m_j)\,\hat J(x_k,m_k,x_j,m_i+m_j), \\
R_5(i,j,\omega) &= -K_\varepsilon^{-1}\sum_k\bar u^\varepsilon(x_i-x_k;m_i,m_k)\,\hat J(x_i,m_i,x_k,m_k), \\
R_6(i,j,\omega) &= -K_\varepsilon^{-1}\sum_k\bar u^\varepsilon(x_k-x_i;m_k,m_i)\,\hat J(x_k,m_k,x_i,m_i), \\
R_7(i,j,\omega) &= -K_\varepsilon^{-1}\sum_k\bar u^\varepsilon(x_j-x_k;m_j,m_k)\,\hat J(x_j,m_j,x_k,m_k), \\
R_8(i,j,\omega) &= -K_\varepsilon^{-1}\sum_k\bar u^\varepsilon(x_k-x_j;m_k,m_j)\,\hat J(x_k,m_k,x_j,m_j),
\end{align*}
where the summation is over $k$ with $k\neq i,j$. Let us write $T(i,j,\omega) = \sum_{r=0}^{8}R_r(i,j,\omega)$. We then write
\[ A_c(\omega) = A_{c0}(\omega) + A_{c1}(\omega) + A_{c2}(\omega) \tag{8.26} \]
with
\begin{align*}
A_{c0}(\omega) &= \frac12K_\varepsilon^{-1}\sum_{i,j}\alpha(m_i,m_j)\,V_\varepsilon(x_i-x_j;m_i,m_j)\,S(i,j,\omega)^2, \\
A_{c1}(\omega) &= K_\varepsilon^{-1}\sum_{i,j}\alpha(m_i,m_j)\,V_\varepsilon(x_i-x_j;m_i,m_j)\,S(i,j,\omega)\,T(i,j,\omega), \\
A_{c2}(\omega) &= \frac12K_\varepsilon^{-1}\sum_{i,j}\alpha(m_i,m_j)\,V_\varepsilon(x_i-x_j;m_i,m_j)\,T(i,j,\omega)^2.
\end{align*}
Our goal is to show that both $A_{c1}$ and $A_{c2}$ are small. We start with the latter. We use the inequality $\bigl(\sum_0^8R_r\bigr)^2\le 9\sum_0^8R_r^2$ to assert that $A_{c2}\le 9\sum_0^8A_{c2r}$. We only bound $A_{c20}$ and $A_{c25}$, as the remaining $A_{c2r}$ are similar to $A_{c25}$. To bound $A_{c25}$, we simply square out the summation in $k$ to obtain
\[ A_{c25} = \frac12K_\varepsilon^{-3}\sum_{i,j,k,l}\alpha(m_i,m_j)\,V_\varepsilon(x_i-x_j;m_i,m_j)\,\bar u^\varepsilon(x_i-x_k;m_i,m_k)\,\bar u^\varepsilon(x_i-x_l;m_i,m_l)\,\hat J(x_i,m_i,x_k,m_k)\,\hat J(x_i,m_i,x_l,m_l) =: A_{c251} + A_{c252}, \]
where $A_{c251}$ and $A_{c252}$ represent the cases $k\neq l$ and $k = l$ respectively. We then have
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon A_{c251} \le c_1\sum_{m,n,p,q}\alpha(m,n)\lambda_m\lambda_n\lambda_p\lambda_q\int_{|a|\le1}|\bar u^\varepsilon(a;m,p)|\,da\int_{|a|\le1}|\bar u^\varepsilon(a;m,q)|\,da \le c_2\,\delta(\varepsilon)^2 \]
by (6.5). In the same fashion, we may use (6.7) to show that $\mathbb{E}^{\mathrm{eq}}_\varepsilon A_{c252}\le c_2K_\varepsilon^{-1}$. Treating other $A_{c2r}$, $r = 1,\dots,8$, in the same way yields
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon\sum_{r=1}^{8}A_{c2r} \le c_3\bigl[\delta(\varepsilon)^2 + K_\varepsilon^{-1}\bigr]. \tag{8.27} \]
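The elementary inequality invoked at the start of this bound is the Cauchy-Schwarz inequality applied to nine terms. A short sketch, assuming that $\alpha V_\varepsilon\ge0$ and that $A_{c2r}$ denotes $A_{c2}$ with $T$ replaced by the single term $R_r$ (this section does not spell out the definition of $A_{c2r}$, so that reading is an assumption):

```latex
% Cauchy-Schwarz with the constant vector (1,...,1) in R^9:
\[
  \Bigl(\sum_{r=0}^{8}R_r\Bigr)^2
  = \Bigl(\sum_{r=0}^{8}1\cdot R_r\Bigr)^2
  \le \Bigl(\sum_{r=0}^{8}1^2\Bigr)\Bigl(\sum_{r=0}^{8}R_r^2\Bigr)
  = 9\sum_{r=0}^{8}R_r^2 .
\]
% Multiplying by the nonnegative weight (1/2) K^{-1} \alpha V_\varepsilon
% and summing over i, j gives
\[
  A_{c2} \le 9\sum_{r=0}^{8}A_{c2r},
  \qquad
  A_{c2r} := \frac12K_\varepsilon^{-1}\sum_{i,j}\alpha(m_i,m_j)\,
  V_\varepsilon(x_i-x_j;m_i,m_j)\,R_r(i,j,\omega)^2 .
\]
```

The same argument with five terms gives the constant 5 in the corresponding bound for the fragmentation term in Step 4.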
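As for $A_{c20}$ we have that $\mathbb{E}^{\mathrm{eq}}_\varepsilon A_{c20}$ is bounded by
\[ \frac12\,\mathbb{E}^{\mathrm{eq}}_\varepsilon K_\varepsilon^{-3}\sum_{i,j}\alpha(m_i,m_j)\,V_\varepsilon(x_i-x_j;m_i,m_j)\,\tilde u^\varepsilon(x_i-x_j;m_i,m_j)^2\,\hat J(x_i,m_i,x_j,m_j)^2 \le c_4K_\varepsilon^{-1}\sum_{m,n}\alpha(m,n)\lambda_m\lambda_n\int V_\varepsilon(a;m,n)\,\tilde u^\varepsilon(a;m,n)^2\,da. \]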
If we restrict the summation to those m and n such that r (m, n)ε ≥ δ(ε), then we simply use u˜ ε ≤ c1 K ε to show that the sum is bounded by a constant multiple of τ (ε), which is small by our assumption (3.9). On the other hand, when Vε (a; m, n) = 0 and r (m, n)ε ≤ δ(ε), ε ε δ(ε) u (a + z; m, n)ζ (z)dz |u˜ (a; m, n)| =
≤ c5 δ(ε)−2 |u ε (a + z; m, n)|dz |z|≤c5 δ(ε)
−2 | log |z||dz ≤ c6 δ(ε) α (m, n)
|z|≤c5 δ(ε)+c6 εr (m,n)
≤ c7 α (m, n)| log δ(ε)|.
(8.28)
Hence −2 2 Eeq ε Ac20 ≤ c8 | log ε| | log δ(ε)| + c8 τ (ε).
(8.29)
From this and (8.27) we deduce
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon A_{c2} \le c_9\bigl[\delta(\varepsilon)^2 + |\log\varepsilon|^{-1} + |\log\varepsilon|^{-2}\,|\log\delta(\varepsilon)|^2 + \tau(\varepsilon)\bigr]. \tag{8.30} \]
Step 3: We now turn to Ac1 . By Lemma 6.2, the expression K ε−1 u ε (a; m, n) is uniformly bounded whenever Vε (a; m, n) = 0. Hence |Ac1 | ≤ Ac1 = c1 K ε−2 α(m i , m j )Vε (xi − x j ; m i , m j )|T (i, j, ω)|. i, j,k
Again using the decomposition of T , we write Ac1 = r8=0 Ac1r , with for example α(m i , m j )Vε (xi − x j ; m i , m j ) Ac15 (ω) = c1 K ε−2 i, j,k
ε u¯ (xi − xk ; m i , m k ) Jˆ(xi , m i , xk , m k ) . By (6.5), Eeq ε Ac15
≤ c2
α(m, n)λm λn λ p
|a|≤1
m,n, p
|u¯ ε (a; m, p)|da ≤ c2 δ(ε).
(8.31)
Similarly Ac10 (ω) = c1 K ε−2
α(m i , m j )Vε (xi − x j ; m i , m j )
i, j
ε u˜ (xi − x j ; m i , m j ) Jˆ(xi , x j , m i , m j ) , which yields Eeq ε Ac10 (ω)
≤ c3
α(m, n)λm λn
m,n
|a|≤1
Vε (a; m, n)|u˜ ε (a; m, n)|da.
Again using (8.28) we obtain −1 Eeq ε Ac10 (ω) ≤ c4 | log ε| | log δ(ε)| + c4 τ (ε),
in the same way we obtained (8.28). From this and (8.30) we deduce −1 Eeq ε Ac1 ≤ c5 δ(ε) + | log ε| | log δ(ε)| + τ (ε) . From this, (8.26) and (8.30) we deduce
t
Ac (ω(s))ds = 0
with
t
Ac0 (ω(s))ds + Err 2 ,
0
−1 Eeq + | log ε|−1 | log δ + τ (ε)| . ε | Err 2 | ≤ c δ(ε) + | log ε|
(8.32)
On the other hand, by the Law of Large Numbers,
\[ \lim_{\varepsilon\to0}\mathbb{E}^{\mathrm{eq}}_\varepsilon\Bigl|\int_0^tA_{c0}(\omega(s))\,ds - tA_c(J)\Bigr| = 0, \tag{8.33} \]
where
\[ A_c(J) = \frac12\sum_{m,n}\tilde\alpha(m,n)\lambda_n\lambda_m\int(J_{n+m}-J_n-J_m)^2\,dx \]
and $\tilde\alpha(m,n) = \alpha(m,n)\eta(m,n)^2 = \hat\alpha(m,n)\eta(m,n)$, where $\eta = \hat\alpha/\alpha$. Here we have used $\lim_\varepsilon K_\varepsilon^{-1}u^\varepsilon = \eta - 1$ uniformly in the support of $V_\varepsilon$. (See Theorem 3.2 of [HR].)
Step 4: We now concentrate on $A_f$. We have
\[ A_f(\omega) = \frac12K_\varepsilon^{-1}\sum_i\sum_{m=1}^{m_i-1}\beta(m,m_i-m)\int V^\varepsilon(x_i-y;m_i-m,m)\Bigl[S(i;y,m;\omega) + \sum_{r=0}^{4}R_r(i;y,m;\omega)\Bigr]^2dy, \]
where
\begin{align*}
S(i;y,m;\omega) &= J(x_i,m) + J(y,m_i-m) - J(x_i,m_i) - K_\varepsilon^{-1}u^\varepsilon(x_i-y;m,m_i-m)\,\hat J(x_i,m,y,m_i-m), \\
R_0(i;y,m;\omega) &= K_\varepsilon^{-1}\tilde u^\varepsilon(x_i-y;m,m_i-m)\,\hat J(x_i,m,y,m_i-m), \\
R_1(i;y,m;\omega) &= K_\varepsilon^{-1}\sum_j\bigl[\bar u^\varepsilon(x_i-x_j;m,m_j)\,\hat J(x_i,m,x_j,m_j) - \bar u^\varepsilon(x_i-x_j;m_i,m_j)\,\hat J(x_i,m_i,x_j,m_j)\bigr], \\
R_2(i;y,m;\omega) &= K_\varepsilon^{-1}\sum_j\bigl[\bar u^\varepsilon(x_i-x_j;m_i,m)\,\hat J(x_i,m_i,x_j,m) - \bar u^\varepsilon(x_i-x_j;m_i,m_j)\,\hat J(x_i,m_i,x_j,m_j)\bigr], \\
R_3(i;y,m;\omega) &= K_\varepsilon^{-1}\sum_j\bar u^\varepsilon(y-x_j;m,m_j)\,\hat J(y,m,x_j,m_j), \\
R_4(i;y,m;\omega) &= K_\varepsilon^{-1}\sum_j\bar u^\varepsilon(x_j-y;m_j,m)\,\hat J(x_j,m_j,y,m).
\end{align*}
Let us write $T = \sum_0^4R_r$, and
\[ A_f = A_{f0} + A_{f1} + A_{f2}, \tag{8.34} \]
with
\begin{align*}
A_{f0} &= \frac12K_\varepsilon^{-1}\sum_i\sum_{m=1}^{m_i-1}\beta(m,m_i-m)\int V^\varepsilon(x_i-y;m_i-m,m)\,S(i;y,m;\omega)^2\,dy, \\
A_{f1} &= K_\varepsilon^{-1}\sum_i\sum_{m=1}^{m_i-1}\beta(m,m_i-m)\int V^\varepsilon(x_i-y;m_i-m,m)\,(ST)(i;y,m;\omega)\,dy, \\
A_{f2} &= \frac12K_\varepsilon^{-1}\sum_i\sum_{m=1}^{m_i-1}\beta(m,m_i-m)\int V^\varepsilon(x_i-y;m_i-m,m)\,T(i;y,m;\omega)^2\,dy.
\end{align*}
We have that $A_{f2}\le\frac52\sum_0^4A_{f2r}$ with
\[ A_{f2r}(\omega) = K_\varepsilon^{-1}\sum_i\sum_{m=1}^{m_i-1}\beta(m,m_i-m)\int V^\varepsilon(x_i-y;m_i-m,m)\,R_r(i;y,m;\omega)^2\,dy. \]
We then use (8.27) to learn that A f 20 (ω) is bounded above by c1 K ε−3
i −1 m
i
β(m, m i − m)
V ε (xi − y; m i − m, m)u˜ ε (xi − y; m, m i − m)2 dy
m=1
≤ c2 K ε−3 | log δ(ε)|2
i −1 m
i
β(m, m i − m)α (m, m i − m)2 .
m=1
We then use (3.10) to deduce −2 2 Eeq ε A f 20 (ω) ≤ c3 | log ε| | log δ(ε)| .
(8.35)
On the other hand,
\[ A_{f21}(\omega) = K_\varepsilon^{-1}\sum_i\sum_{m=1}^{m_i-1}\beta(m,m_i-m)\,R_1(i;y,m;\omega)^2 \]
is bounded above by $A_{f211} + A_{f212}$, with
\[ A_{f211} = 2K_\varepsilon^{-3}\sum_i\sum_{m=1}^{m_i-1}\beta(m,m_i-m)\Bigl[\sum_j\bigl|\bar u^\varepsilon(x_i-x_j;m,m_j)\,\hat J(x_i,m,x_j,m_j)\bigr|\Bigr]^2, \]
\[ A_{f212} = 2K_\varepsilon^{-3}\sum_i\sum_{m=1}^{m_i-1}\beta(m,m_i-m)\Bigl[\sum_j\bigl|\bar u^\varepsilon(x_i-x_j;m_i,m_j)\,\hat J(x_i,m_i,x_j,m_j)\bigr|\Bigr]^2. \]
By squaring out the expression inside brackets, we can readily see
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon\bigl[A_{f211} + A_{f212}\bigr] \le c_1\bigl[\delta(\varepsilon)^2 + K_\varepsilon^{-1}\bigr] \]
by Lemma 6.1. The term $A_{f22}$ is treated likewise. Hence,
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon\bigl[A_{f21}(\omega) + A_{f22}(\omega)\bigr] \le c_2\bigl[\delta(\varepsilon)^2 + K_\varepsilon^{-1}\bigr]. \tag{8.36} \]
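We now study $A_{f23}$:
\[ A_{f23}(\omega) = K_\varepsilon^{-3}\sum_{i,j,k}\sum_{m=1}^{m_i-1}\beta(m,m_i-m)\int V^\varepsilon(x_i-y;m_i-m,m)\,\bar u^\varepsilon(y-x_j;m,m_j)\,\hat J(y,m,x_j,m_j)\,\bar u^\varepsilon(y-x_k;m,m_k)\,\hat J(y,m,x_k,m_k)\,dy = A_{f231} + A_{f232}, \]
with $A_{f231}$ and $A_{f232}$ corresponding to the cases $k\neq j$ and $k = j$. With the aid of (6.5) and (6.8), we can readily deduce
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon A_{f231}\le c_3\,\delta(\varepsilon)^2,\qquad \mathbb{E}^{\mathrm{eq}}_\varepsilon A_{f232}\le c_3\,|\log\varepsilon|^{-1}. \]
The term $A_{f24}$ is treated likewise. In summary
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon A_{f2}(\omega) \le c_4\bigl[\delta(\varepsilon)^2 + |\log\varepsilon|^{-1}\bigr]. \]
We can readily bound $A_{f1}$ as in the previous step:
\[ \mathbb{E}^{\mathrm{eq}}_\varepsilon|A_{f1}(\omega)| \le c_5\bigl[\delta(\varepsilon) + |\log\varepsilon|^{-1}\,|\log\delta(\varepsilon)| + \tau(\varepsilon)\bigr]. \]
From all this we conclude
\[ \int_0^tA_f(\omega(s))\,ds = \int_0^tA_{f0}(\omega(s))\,ds + \mathrm{Err}_3 \]
with $\mathrm{Err}_3$ satisfying (8.32). On the other hand
\[ S(i;y,m;\omega) = -J(x_i,m,x_i,m_i-m)\bigl(1 + K_\varepsilon^{-1}u^\varepsilon(x_i-y;m,m_i-m)\bigr) + O\bigl(\varepsilon r(m,m_i-m)\bigr) \]
whenever $V^\varepsilon(y-x_i;m,m_i-m)\neq0$. From this and a law of large numbers, we can readily deduce
\[ \lim_{\varepsilon\to0}\mathbb{E}^{\mathrm{eq}}_\varepsilon\Bigl|\int_0^tA_{f0}(\omega(s))\,ds - tA_f(J)\Bigr| = 0, \tag{8.37} \]
where
\[ A_f(J) = \frac12\sum_{m,n}\tilde\beta(m,n)\lambda_{n+m}\int(J_{n+m}-J_n-J_m)^2\,dx, \]
where $\tilde\beta(m,n) = \beta(m,n)\eta(m,n)^2 = \hat\beta(m,n)\eta(m,n)$.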
Final Step: From (8.22), (8.23), (8.24), (8.32), (8.35) and (8.37) we learn that the process $\bar M_\varepsilon(t)^2 - tA'(J)$ is a sum of a martingale and a small error, where
\begin{align*}
A'(J) ={}& 2\sum_n\lambda_nd(n)\int|\nabla_xJ_n|^2\,dx + \sum_{n,m}\lambda_n\lambda_md(n)\gamma(n,m)\int(J_{n+m}-J_n-J_m)^2\,dx \\
&+ \frac12\sum_{m,n}\tilde\alpha(m,n)\lambda_n\lambda_m\int(J_{n+m}-J_n-J_m)^2\,dx + \frac12\sum_{m,n}\tilde\beta(m,n)\lambda_{n+m}\int(J_{n+m}-J_n-J_m)^2\,dx.
\end{align*}
It remains to show that $A(J) = A'(J)$. Since
\[ \lambda_m\lambda_n\tilde\alpha(m,n) = \tilde\beta(m,n)\lambda_{n+m},\qquad \lambda_m\lambda_n\hat\alpha(m,n) = \hat\beta(m,n)\lambda_{n+m}, \]
it suffices to show
\[ (d(m)+d(n))\gamma(m,n) = \lim_{\varepsilon\to0}(d(m)+d(n))K_\varepsilon^{-1}\int_{|a|\le1}|\nabla u^\varepsilon(a;m,n)|^2\,da = \hat\alpha(m,n) - \tilde\alpha(m,n) = \alpha(m,n)\bigl(\eta(m,n)-\eta(m,n)^2\bigr). \tag{8.38} \]
We have
\begin{align*}
(d(m)+d(n))K_\varepsilon^{-1}\int_{|a|\le1}|\nabla u^\varepsilon(a;m,n)|^2\,da
&= -(d(m)+d(n))K_\varepsilon^{-1}\int_{|a|\le1}u^\varepsilon(a;m,n)\,\Delta u^\varepsilon(a;m,n)\,da + O(K_\varepsilon^{-1}) \\
&= -\alpha(m,n)\int_{|a|\le1}V^\varepsilon(a;m,n)\,K_\varepsilon^{-1}u^\varepsilon(a;m,n)\bigl(1 + K_\varepsilon^{-1}u^\varepsilon(a;m,n)\bigr)\,da + O(K_\varepsilon^{-1}),
\end{align*}
where we integrated by parts and used (4.11). This and $\lim_\varepsilon K_\varepsilon^{-1}u^\varepsilon = \eta - 1$ (Theorem 3.2 of [HR1]) imply (8.38).

9. Proofs of Theorems 8.1 and 3.1

In this section, we complete the proof of Theorems 8.1 and 3.1. We first show that the process $\xi(t,J)$ is tight. More precisely,
Theorem 9.1. For every smooth function $J$ of compact support and positive $T$, there exists a constant $c(J,T)$ such that
\[ \limsup_{\varepsilon\to0}\mathbb{E}^{\mathrm{eq}}_\varepsilon\int_0^T\sup_{0\le h\le\delta}|\xi(t+h,J) - \xi(t,J)|\,dt \le c(J,T)\,\delta^{1/2}. \tag{9.1} \]
Proof. Recall that by (8.16),
\[ \xi(t,J) - \xi(s,J) = \int_s^t\xi(\theta,J)\,d\theta + \bar M_\varepsilon(t) - \bar M_\varepsilon(s) - \int_s^t\mathrm{Err}_8(\omega(\theta))\,d\theta, \tag{9.2} \]
where
\[ \limsup_{\varepsilon\to0}\mathbb{E}^{\mathrm{eq}}_\varepsilon\sup_{0\le t\le T}\Bigl|\int_0^t\mathrm{Err}_8(\omega(\theta))\,d\theta\Bigr| \le T\limsup_{\varepsilon\to0}\mathbb{E}^{\mathrm{eq}}_\varepsilon|\mathrm{Err}_8(\omega(0))| = 0. \tag{9.3} \]
ε
we can readily deduce Eeq ε
sup 0≤s≤t≤s+δ≤T
t ξ(θ, J )dθ s
1/2
t 2 ≤ (T + δ)1/2 Eeq ξ(θ, J ) dθ ≤ c1 δ 1/2 . ε
(9.4)
s
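One way to see where the factor $\delta^{1/2}$ in (9.4) comes from is the Cauchy-Schwarz inequality in the time variable. The sketch below abbreviates $\xi(\theta) = \xi(\theta,J)$ and uses only the stationarity identity $\mathbb{E}^{\mathrm{eq}}_\varepsilon\xi(\theta)^2 = \mathbb{E}^{\mathrm{eq}}_\varepsilon\xi(0)^2$ quoted above; the explicit constant it produces is a by-product of this particular argument and need not coincide with the constant $c_1$ of the original estimate.

```latex
% For 0 <= s <= t <= s + delta <= T, Cauchy-Schwarz in theta gives
\[
  \Bigl|\int_s^t\xi(\theta)\,d\theta\Bigr|
  \le (t-s)^{1/2}\Bigl(\int_s^t\xi(\theta)^2\,d\theta\Bigr)^{1/2}
  \le \delta^{1/2}\Bigl(\int_0^T\xi(\theta)^2\,d\theta\Bigr)^{1/2}.
\]
% The right-hand side no longer depends on (s,t), so it also bounds the
% supremum. Taking expectations and using Jensen's inequality for the
% concave function x -> x^{1/2}, followed by stationarity:
\[
  \mathbb{E}^{\mathrm{eq}}_\varepsilon
  \sup_{0\le s\le t\le s+\delta\le T}\Bigl|\int_s^t\xi(\theta)\,d\theta\Bigr|
  \le \delta^{1/2}\Bigl(\int_0^T\mathbb{E}^{\mathrm{eq}}_\varepsilon\xi(\theta)^2\,d\theta\Bigr)^{1/2}
  = \delta^{1/2}\bigl(T\,\mathbb{E}^{\mathrm{eq}}_\varepsilon\xi(0)^2\bigr)^{1/2}.
\]
```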
See Sect. 5 and (5.7) of [R] for more details. It remains to establish the tightness of the martingale M¯ ε . For this, it suffices to show lim sup
2 ¯ ¯ sup Eeq ε [ Mε (t + h) − Mε (t)] ≤ c0 (T )δ.
ε→0 0≤t≤T 0 λ2k , then ch V 1 = 2 ch V (λ). The “queer” version of the classical limit theorems are Theorem 5.14 and Theorem 5.16. With the aid of the classical limit theorems we obtain another important result in the last section: the category Oq≥0 is semisimple. The organization of the paper is as follows. In Sect. 1 we recall some definitions and basic results about q(n). The realization of Uq (q(n)) and its triangular decomposition is provided in Sect. 2. Section 3 is devoted to the study of the quantum Clifford superalgebra and its modules. In Sect. 4 we introduce the notion of highest weight modules and Weyl modules. In particular, we show that every Weyl module W q (λ) has a unique irreducible quotient V q (λ). The classical limit theorem for the category Oq≥0 is proved in Sect. 5 and the complete reducibility of Uq (q(n))-modules in Oq≥0 is established in the last section.
Highest Weight Modules Over the Quantum Queer Superalgebra Uq (q(n))
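1. The Lie Superalgebra q(n) and its Representations

The ground field in this section will be $\mathbb{C}$. By $\mathbb{Z}_{\ge0}$ and $\mathbb{Z}_{>0}$ we denote the nonnegative integers and strictly positive integers, respectively. We set $\mathbb{Z}_2 = \mathbb{Z}/2\mathbb{Z}$. Every vector space $V = V_{\bar0}\oplus V_{\bar1}$ over $\mathbb{C}$ is $\mathbb{Z}_2$-graded with even part $V_{\bar0}$ and odd part $V_{\bar1}$. We will write $\dim V = m|n$ if $\dim V_{\bar0} = m$ and $\dim V_{\bar1} = n$. By $\Pi$ we denote the parity change functor; i.e., $\Pi V$ is a vector space for which $(\Pi V)_{\bar0} = V_{\bar1}$ and $(\Pi V)_{\bar1} = V_{\bar0}$. The direct sum of $r$ copies of a vector space $V$ will be written as $V^{\oplus r}$. The Lie subsuperalgebra $\mathfrak{g} = \mathfrak{q}(n)$ of $\mathfrak{gl}(n|n)$ is defined in matrix form by
\[ \mathfrak{g} = \mathfrak{q}(n) := \left\{\begin{pmatrix}A & B\\ B & A\end{pmatrix}\ \middle|\ A, B\in\mathfrak{gl}_n\right\}. \]
By definition, a subsuperalgebra $\mathfrak{h} = \mathfrak{h}_{\bar0}\oplus\mathfrak{h}_{\bar1}$ of $\mathfrak{g}$ is a Cartan subsuperalgebra if it is a self-normalizing nilpotent subsuperalgebra. Every such $\mathfrak{h}$ has a nontrivial odd part $\mathfrak{h}_{\bar1}$. We fix $\mathfrak{h}$ to be the standard Cartan subsuperalgebra, namely the one for which $\mathfrak{h}_{\bar0}$ has a basis $\{k_1,\dots,k_n\}$ and $\mathfrak{h}_{\bar1}$ has a basis $\{k_{\bar1},\dots,k_{\bar n}\}$, where
\[ k_i := \begin{pmatrix}E_{i,i} & 0\\ 0 & E_{i,i}\end{pmatrix},\qquad k_{\bar i} := \begin{pmatrix}0 & E_{i,i}\\ E_{i,i} & 0\end{pmatrix}, \]
and $E_{i,j}$ is the $n\times n$ matrix having 1 in the $(i,j)$ position and 0 elsewhere. One should note that all Cartan subsuperalgebras of $\mathfrak{g}$ are conjugate to $\mathfrak{h}$. Let $\{\epsilon_1,\dots,\epsilon_n\}$ be the basis of $\mathfrak{h}_{\bar0}^*$ dual to $\{k_1,\dots,k_n\}$. We denote $k_i - k_{i+1}$ by $h_i$ for $i = 1,2,\dots,n-1$. The root system $\Delta = \Delta_{\bar0}\cup\Delta_{\bar1}$ of $\mathfrak{g}$ has identical even and odd parts. Namely, $\Delta_{\bar0} = \Delta_{\bar1} = \{\epsilon_i-\epsilon_j \mid 1\le i\ne j\le n\}$. In particular, the root space decomposition $\mathfrak{g} = \mathfrak{h}\oplus\bigoplus_{\alpha\in\Delta}\mathfrak{g}_\alpha$ has the property that $\mathfrak{g}_\alpha$ has dimension $1|1$ for every $\alpha\in\Delta$.

Set $\alpha_i := \epsilon_i - \epsilon_{i+1}$. Let $Q = \bigoplus_{i=1}^{n-1}\mathbb{Z}\alpha_i$ be the root lattice and $Q_+ = \sum_{i=1}^{n-1}\mathbb{Z}_{\ge0}\alpha_i$ be the positive root lattice. The notation $Q_- = -Q_+$ will also be used. There is a partial ordering on $\mathfrak{h}_{\bar0}^*$ defined by $\lambda\ge\mu$ if and only if $\lambda-\mu\in Q_+$ for $\lambda,\mu\in\mathfrak{h}_{\bar0}^*$. The root space $\mathfrak{g}_{\alpha_i}$ is spanned by
\[ e_i := \begin{pmatrix}E_{i,i+1} & 0\\ 0 & E_{i,i+1}\end{pmatrix}\quad\text{and}\quad e_{\bar i} := \begin{pmatrix}0 & E_{i,i+1}\\ E_{i,i+1} & 0\end{pmatrix}, \]
while $\mathfrak{g}_{-\alpha_i}$ is spanned by
\[ f_i := \begin{pmatrix}E_{i+1,i} & 0\\ 0 & E_{i+1,i}\end{pmatrix}\quad\text{and}\quad f_{\bar i} := \begin{pmatrix}0 & E_{i+1,i}\\ E_{i+1,i} & 0\end{pmatrix}. \]
Let $P := \bigoplus_{i=1}^n\mathbb{Z}\epsilon_i$ be the weight lattice of $\mathfrak{g}$ and denote by $P^\vee := \bigoplus_{i=1}^n\mathbb{Z}k_i$ the dual weight lattice. Let $I := \{1,2,\dots,n-1\}$ and $J := \{1,2,\dots,n\}$.

Proposition 1.1. [LS]. The Lie superalgebra $\mathfrak{g}$ is generated by the elements $e_i$, $e_{\bar i}$, $f_i$, $f_{\bar i}$ ($i\in I$), $\mathfrak{h}_{\bar0}$ and $k_{\bar l}$ ($l\in J$) with the following defining relations:
\begin{align*}
&[h,h'] = 0 \quad\text{for } h,h'\in\mathfrak{h}_{\bar0},\\
&[h,e_i] = \alpha_i(h)e_i,\quad [h,e_{\bar i}] = \alpha_i(h)e_{\bar i} \quad\text{for } h\in\mathfrak{h}_{\bar0},\ i\in I,\\
&[h,f_i] = -\alpha_i(h)f_i,\quad [h,f_{\bar i}] = -\alpha_i(h)f_{\bar i} \quad\text{for } h\in\mathfrak{h}_{\bar0},\ i\in I,\\
&[h,k_{\bar l}] = 0 \quad\text{for } h\in\mathfrak{h}_{\bar0},\ l\in J,\\
&[e_i,f_j] = \delta_{ij}(k_i-k_{i+1}),\quad [e_i,f_{\bar j}] = \delta_{ij}(k_{\bar i}-k_{\overline{i+1}}) \quad\text{for } i,j\in I,\\
&[e_{\bar i},f_j] = \delta_{ij}(k_{\bar i}-k_{\overline{i+1}}),\quad [k_{\bar l},e_i] = \alpha_i(k_l)e_{\bar i} \quad\text{for } i,j\in I,\ l\in J,\\
&[k_{\bar l},f_i] = -\alpha_i(k_l)f_{\bar i},\quad [e_{\bar i},f_{\bar j}] = \delta_{ij}(k_i+k_{i+1}) \quad\text{for } i,j\in I,\ l\in J,\\
&[k_{\bar l},e_{\bar i}] = \begin{cases}e_i & \text{if } l = i, i+1,\\ 0 & \text{otherwise}\end{cases} \quad\text{for } i\in I,\ l\in J,
\end{align*}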
D. Grantcharov, J. H. Jung, S.-J. Kang, M. Kim
\begin{align*}
&[k_{\bar l},f_{\bar i}] = \begin{cases}f_i & \text{if } l = i, i+1,\\ 0 & \text{otherwise}\end{cases} \quad\text{for } i\in I,\ l\in J,\\
&[e_i,e_{\bar j}] = [e_{\bar i},e_{\bar j}] = [f_i,f_{\bar j}] = [f_{\bar i},f_{\bar j}] = 0 \quad\text{for } i,j\in I,\ |i-j|\ne1,\\
&[e_i,e_j] = [f_i,f_j] = 0 \quad\text{for } i,j\in I,\ |i-j|>1,\\
&[e_i,e_{i+1}] = [e_{\bar i},e_{\overline{i+1}}],\quad [e_i,e_{\overline{i+1}}] = [e_{\bar i},e_{i+1}],\\
&[f_{i+1},f_i] = [f_{\overline{i+1}},f_{\bar i}],\quad [f_{i+1},f_{\bar i}] = [f_{\overline{i+1}},f_i],\\
&[k_{\bar i},k_{\bar j}] = \delta_{ij}\,2k_i \quad\text{for } i,j\in J,\\
&[e_i,[e_i,e_j]] = [e_{\bar i},[e_i,e_j]] = 0 \quad\text{for } i,j\in I,\ |i-j| = 1,\\
&[f_i,[f_i,f_j]] = [f_{\bar i},[f_i,f_j]] = 0 \quad\text{for } i,j\in I,\ |i-j| = 1.
\end{align*}
Remark. We modified the relations given in [LS]. More precisely, we replaced the relations
\[ [e_{\bar i},[e_i,e_{\bar j}]] = 0,\qquad [f_{\bar i},[f_i,f_{\bar j}]] = 0 \quad\text{for } i,j\in I,\ |i-j| = 1 \tag{1.1} \]
by
\[ [e_i,e_{i+1}] = [e_{\bar i},e_{\overline{i+1}}],\quad [e_i,e_{\overline{i+1}}] = [e_{\bar i},e_{i+1}],\quad [f_{i+1},f_i] = [f_{\overline{i+1}},f_{\bar i}],\quad [f_{i+1},f_{\bar i}] = [f_{\overline{i+1}},f_i]. \tag{1.2} \]
Since (1.1) can be derived from (1.2) (and other ones), we can easily see that these two presentations are equivalent.

The universal enveloping algebra $U(\mathfrak{g})$ is obtained from the tensor algebra $T(\mathfrak{g})$ by factoring out by the ideal generated by the elements $[u,v] - u\otimes v + (-1)^{\alpha\beta}v\otimes u$, where $\alpha,\beta\in\mathbb{Z}_2$, $u\in\mathfrak{g}_\alpha$, $v\in\mathfrak{g}_\beta$. Let $U^+$ (respectively, $U^0$ and $U^-$) be the subalgebra of $U(\mathfrak{g})$ generated by the elements $e_i$, $e_{\bar i}$ ($i\in I$) (respectively, by $k_i$, $k_{\bar i}$ ($i\in J$) and by $f_i$, $f_{\bar i}$ ($i\in I$)). By the Poincaré-Birkhoff-Witt theorem, the universal enveloping algebra has the triangular decomposition:
\[ U(\mathfrak{g})\cong U^-\otimes U^0\otimes U^+. \tag{1.3} \]
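Some of the relations in Proposition 1.1 can be checked directly in the matrix realization, which is a useful sanity check on the placement of the bars. The computation below uses only the block matrices $k_i$, $k_{\bar i}$, $e_{\bar i}$, $f_{\bar i}$ defined above and the fact that the superbracket of two odd elements is the anticommutator.

```latex
% Product of two odd block matrices:
\[
  \begin{pmatrix}0 & A\\ A & 0\end{pmatrix}
  \begin{pmatrix}0 & B\\ B & 0\end{pmatrix}
  = \begin{pmatrix}AB & 0\\ 0 & AB\end{pmatrix}.
\]
% With A = E_{i,i}, B = E_{j,j} and E_{i,i}E_{j,j} = \delta_{ij}E_{i,i}:
\[
  [k_{\bar i},k_{\bar j}]
  = k_{\bar i}k_{\bar j} + k_{\bar j}k_{\bar i}
  = 2\delta_{ij}\begin{pmatrix}E_{i,i} & 0\\ 0 & E_{i,i}\end{pmatrix}
  = 2\delta_{ij}\,k_i .
\]
% With A = E_{i,i+1}, B = E_{i+1,i}: E_{i,i+1}E_{i+1,i} = E_{i,i}
% and E_{i+1,i}E_{i,i+1} = E_{i+1,i+1}, so
\[
  [e_{\bar i},f_{\bar i}]
  = e_{\bar i}f_{\bar i} + f_{\bar i}e_{\bar i}
  = k_i + k_{i+1}.
\]
```

The plus sign in $k_i + k_{i+1}$, in contrast with $[e_i,f_i] = k_i - k_{i+1}$, is exactly the effect of replacing the commutator by the anticommutator.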
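A $\mathfrak{g}$-module $V$ is called a weight module if it admits a weight space decomposition
\[ V = \bigoplus_{\mu\in\mathfrak{h}_{\bar0}^*}V_\mu,\quad\text{where } V_\mu = \{v\in V \mid hv = \mu(h)v \text{ for all } h\in\mathfrak{h}_{\bar0}\}. \]
For a weight $\mathfrak{g}$-module $M$ denote by $\mathrm{wt}(M)$ the set of weights $\lambda\in\mathfrak{h}_{\bar0}^*$ for which $M_\lambda\ne0$. Every submodule of a weight module is also a weight module. If $\dim_{\mathbb{C}}V_\mu<\infty$ for all $\mu\in\mathfrak{h}_{\bar0}^*$, the character of $V$ is defined to be
\[ \operatorname{ch}V = \sum_{\mu\in\mathfrak{h}_{\bar0}^*}(\dim_{\mathbb{C}}V_\mu)\,e^\mu, \]
where $e^\mu$ are formal basis elements of the group algebra $\mathbb{C}[\mathfrak{h}_{\bar0}^*]$ with the multiplication given by $e^\lambda e^\mu = e^{\lambda+\mu}$ for all $\lambda,\mu\in\mathfrak{h}_{\bar0}^*$. Denote by $\mathfrak{b}_+$ the standard Borel subsuperalgebra of $\mathfrak{g}$ generated by $k_l$, $k_{\bar l}$ ($l\in J$) and $e_i$, $e_{\bar i}$ ($i\in I$). A weight module $V$ is called a highest weight module if it is generated over $\mathfrak{g}$ by a finite dimensional irreducible $\mathfrak{b}_+$-submodule (see [PS1, Def. 4]).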
Proposition 1.2. [P]. Let $\mathbf{v}$ be a finite dimensional irreducible $\mathbb{Z}_2$-graded $\mathfrak{b}_+$-module.
(1) The maximal nilpotent subsuperalgebra $\mathfrak{n}$ of $\mathfrak{b}_+$ acts on $\mathbf{v}$ trivially.
(2) For any weight $\mu\in\mathfrak{h}_{\bar0}^*$, consider the symmetric bilinear form $F_\mu(u,v) := \mu([u,v])$ on $\mathfrak{h}_{\bar1}$ and let $\mathrm{Cliff}(\mu)$ be the Clifford superalgebra of the quadratic space $(\mathfrak{h}_{\bar1},F_\mu)$. Then there exists a unique weight $\lambda\in\mathfrak{h}_{\bar0}^*$ such that $\mathbf{v}$ is endowed with a canonical $\mathbb{Z}_2$-graded $\mathrm{Cliff}(\lambda)$-module structure and $\mathbf{v}$ is determined by $\lambda$ up to $\Pi$.
(3) $\mathfrak{h}_{\bar0}$ acts on $\mathbf{v}$ by the weight $\lambda$ determined in (2).

From the above proposition, we know that the dimension of the highest weight space of a highest weight $\mathfrak{g}$-module with highest weight $\lambda$ is the same as the dimension of an irreducible $\mathrm{Cliff}(\lambda)$-module. On the other hand all irreducible $\mathrm{Cliff}(\lambda)$-modules have the same dimension (see, for example, [ABS, Table 2]). Thus the dimension of the highest weight space is constant for all highest weight modules with highest weight $\lambda$.

Definition 1.3. Let $\mathbf{v}(\lambda)$ be the irreducible $\mathfrak{b}_+$-module determined by $\lambda$ up to $\Pi$. The Weyl module $W(\lambda)$ of $\mathfrak{g}$ with highest weight $\lambda$ is defined to be
\[ W(\lambda) := U(\mathfrak{g})\otimes_{U(\mathfrak{b}_+)}\mathbf{v}(\lambda). \]
Note that the structure of $W(\lambda)$ is determined by $\lambda$ up to $\Pi$.

Remark. One may define the Verma module corresponding to $\lambda$ by $M(\lambda) := U(\mathfrak{g})\otimes_{U(\mathfrak{b}_+)}\mathrm{Cliff}(\lambda)$. Since the Verma modules are not highest weight modules, they will not be considered in this paper.

We will denote by $\Lambda^+_{\bar0}$ and $\Lambda^+$ the set of $\mathfrak{gl}_n$-dominant integral weights and the set of $\mathfrak{g}$-dominant integral weights, respectively. These are given by
\begin{align*}
\Lambda^+_{\bar0} &:= \{\lambda_1\epsilon_1+\dots+\lambda_n\epsilon_n\in\mathfrak{h}_{\bar0}^* \mid \lambda_i-\lambda_{i+1}\in\mathbb{Z}_{\ge0} \text{ for all } i\in I\},\\
\Lambda^+ &:= \{\lambda_1\epsilon_1+\dots+\lambda_n\epsilon_n\in\Lambda^+_{\bar0} \mid \lambda_i = \lambda_{i+1}\Rightarrow\lambda_i = \lambda_{i+1} = 0 \text{ for all } i\in I\}.
\end{align*}
Proposition 1.4. [P].
(1) For any weight $\lambda$, $W(\lambda)$ has a unique maximal submodule $N(\lambda)$.
(2) For each finite dimensional irreducible $\mathfrak{g}$-module $V$, there exists a unique weight $\lambda\in\Lambda^+_{\bar0}$ such that $V$ is a homomorphic image of $W(\lambda)$.
(3) $V(\lambda) := W(\lambda)/N(\lambda)$ is finite dimensional if and only if $\lambda\in\Lambda^+$.
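To make the bilinear form in Proposition 1.2 concrete, it can be evaluated on the basis $k_{\bar1},\dots,k_{\bar n}$ of $\mathfrak{h}_{\bar1}$ using the relation $[k_{\bar i},k_{\bar j}] = 2\delta_{ij}k_i$ of Proposition 1.1. The example weight at the end is chosen here purely for illustration.

```latex
% For \lambda = \lambda_1\epsilon_1 + \dots + \lambda_n\epsilon_n:
\[
  F_\lambda(k_{\bar i},k_{\bar j})
  = \lambda\bigl([k_{\bar i},k_{\bar j}]\bigr)
  = 2\delta_{ij}\,\lambda(k_i)
  = 2\delta_{ij}\,\lambda_i .
\]
% Hence, in the basis k_{\bar 1},...,k_{\bar n}, the form is diagonal,
\[
  F_\lambda = \operatorname{diag}(2\lambda_1,\dots,2\lambda_n),
  \qquad
  \operatorname{rank}F_\lambda = \#\{\,i \mid \lambda_i\ne0\,\}.
\]
% Illustration (n = 2, \lambda = 2\epsilon_1):
% F_\lambda = diag(4, 0), so k_{\bar 2} spans the radical of the form.
```

In particular the quadratic space $(\mathfrak{h}_{\bar1},F_\lambda)$ is nondegenerate exactly when all $\lambda_i$ are nonzero.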
Now we restrict our attention to the following subcategory of the category of finite dimensional $\mathfrak{g}$-modules.

Definition 1.5. Set $P_{\ge0} := \{\lambda = \lambda_1\epsilon_1+\dots+\lambda_n\epsilon_n\in P \mid \lambda_j\ge0 \text{ for all } j = 1,\dots,n\}$. The category $\mathcal{O}^{\ge0}$ consists of finite dimensional $U(\mathfrak{g})$-modules $M$ with weight space decomposition $M = \bigoplus_{\lambda\in P}M_\lambda$ such that $\mathrm{wt}(M)\subset P_{\ge0}$.

Clearly, $\mathcal{O}^{\ge0}$ is closed under finite direct sum, tensor product and taking submodules and quotient modules. Because a $\mathfrak{q}(n)$-module in $\mathcal{O}^{\ge0}$ can be decomposed into a direct sum of irreducible highest weight $\mathfrak{gl}_n$-modules, one can easily prove the following proposition (see, for example, [HK, Theorem 7.2.3]).

Proposition 1.6. For each $\lambda\in\Lambda^+\cap P_{\ge0}$, $V(\lambda)$ is an irreducible $U(\mathfrak{g})$-module in the category $\mathcal{O}^{\ge0}$. Conversely, every irreducible $U(\mathfrak{g})$-module in the category $\mathcal{O}^{\ge0}$ has the form $V(\lambda)$ for some $\lambda\in\Lambda^+\cap P_{\ge0}$.
In [Se1], Sergeev presented an explicit set of generators of $Z = Z(U(\mathfrak{g}))$, the center of $U(\mathfrak{g})$, and showed that each Weyl module $W(\lambda)$ ($\lambda \in \mathfrak{h}_{\bar 0}^*$) admits a central character. Let $\chi_\lambda \in \mathrm{Hom}_{\mathbb{C}}(Z, \mathbb{C})$ be the central character afforded by $W(\lambda)$; i.e., every element $z \in Z$ acts on $W(\lambda)$ as scalar multiplication by $\chi_\lambda(z)$. Following [B, (2.12)], to each weight $\lambda = \lambda_1\epsilon_1 + \cdots + \lambda_n\epsilon_n \in P$, one can assign a formal symbol $\delta(\lambda) := \delta_{\lambda_1} + \cdots + \delta_{\lambda_n}$ such that $\delta_0 = 0$ and $\delta_{-i} = -\delta_i$.

Proposition 1.7. [B, Theorem 4.19], [PS2, Proposition 1.1]. For $\lambda, \mu \in P$, $\chi_\lambda = \chi_\mu$ if and only if $\delta(\lambda) = \delta(\mu)$.

The following proposition will be very useful in Sect. 5.

Proposition 1.8. Let $V$ be a finite dimensional highest weight module over $\mathfrak{g}$ with highest weight $\lambda \in \Lambda^+ \cap P_{\ge 0}$. Then $V$ is isomorphic to an irreducible highest weight module $V(\lambda)$.

Proof. If $V$ is reducible, since it is finite dimensional, it contains a nonzero proper irreducible submodule $W$. Then $W$ is isomorphic to an irreducible highest weight module $V(\mu)$ for some weight $\mu \in \Lambda^+ \cap P_{\ge 0}$ by Proposition 1.4. We know that $\mu \lneq \lambda$ and $\chi_\lambda = \chi_\mu$. Hence, by Proposition 1.7, $\delta(\lambda) = \delta(\mu)$. Since $\lambda, \mu \in \Lambda^+ \cap P_{\ge 0}$, we have $\lambda = \mu$, which is a contradiction. Thus $V$ is irreducible and by Proposition 1.4, it must be isomorphic to the irreducible highest weight module $V(\lambda)$ up to $\Pi$.

The next proposition gives a sufficient condition for the finite dimensionality of a highest weight $\mathfrak{g}$-module.

Proposition 1.9. Let $V$ be a highest weight module over $\mathfrak{g}$ with highest weight $\lambda \in \Lambda^+$. If $f_i^{\lambda(h_i)+1} v = 0$ for all $v \in V_\lambda$ and $i \in I$, then $V$ is finite dimensional.

Proof. Let $\{x_1, x_2, \dots, x_r\}$ and $\{y_1, y_2, \dots, y_r\}$ be bases of $\mathfrak{g}_{\bar 0}$ and $\mathfrak{g}_{\bar 1}$, respectively. Then by the Poincaré-Birkhoff-Witt theorem, $U(\mathfrak{g})$ has a basis consisting of elements of the form $y_1^{\epsilon_1} y_2^{\epsilon_2} \cdots y_r^{\epsilon_r} x_1^{n_1} x_2^{n_2} \cdots x_r^{n_r}$, where $\epsilon_j = 0$ or $1$ and $n_j \in \mathbb{N} \cup \{0\}$. Because $\{y_1^{\epsilon_1} y_2^{\epsilon_2} \cdots y_r^{\epsilon_r} \mid \epsilon_j = 0, 1\}$ is a finite set, it is enough to show that $U(\mathfrak{g}_{\bar 0}) V_\lambda$ is finite dimensional.
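For any $v \in V_\lambda$, we know that $U(\mathfrak{g}_{\bar 0}) v$ is a highest weight module over $\mathfrak{g}_{\bar 0}$ with highest weight $\lambda$ satisfying $f_i^{\lambda(h_i)+1} v = 0$ for all $i \in I$. Thus it is finite dimensional. Since $U(\mathfrak{g}_{\bar 0}) V_\lambda \subset \sum_{v \in V_\lambda} U(\mathfrak{g}_{\bar 0}) v$, we have the desired result.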
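The criterion of Proposition 1.7 can be checked by direct bookkeeping. The sketch below assumes integral weights given as tuples of integers and stores the formal symbol $\delta(\lambda)$ as a dictionary mapping each $m > 0$ to the coefficient of $\delta_m$, using $\delta_0 = 0$ and $\delta_{-m} = -\delta_m$; the names `delta` and `same_central_character` are illustrative only.

```python
from collections import Counter


def delta(lam):
    """Formal symbol delta(lam) as {m: coefficient of delta_m}, with m > 0."""
    coeffs = Counter()
    for x in lam:
        if x > 0:
            coeffs[x] += 1
        elif x < 0:
            coeffs[-x] -= 1  # delta_{-m} = -delta_m; delta_0 contributes nothing
    return {m: c for m, c in coeffs.items() if c != 0}


def same_central_character(lam, mu):
    """Proposition 1.7: chi_lam = chi_mu iff delta(lam) = delta(mu)."""
    return delta(lam) == delta(mu)
```

For example, $(2, 1, -1)$ and $(2, 0, 0)$ both have $\delta = \delta_2$, so they afford the same central character, while two distinct weights with strictly decreasing positive coordinates never do.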
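We say that a weight $\lambda = \lambda_1\epsilon_1 + \cdots + \lambda_n\epsilon_n \in \mathfrak{h}_{\bar 0}^*$ is $\alpha$-typical if $\alpha = \epsilon_i - \epsilon_j$ and $\lambda_i + \lambda_j \neq 0$. In [Se2], Sergeev proved the following character formula for $V(\lambda)$ ($\lambda \in \Lambda^+ \cap P_{\ge 0}$):
$$\mathrm{ch}\, V(\lambda) = \frac{\dim \mathbf{v}_\lambda}{D} \sum_{w \in W} \mathrm{sgn}\, w \; w\Biggl( e^{\lambda + \rho_0} \prod_{\substack{\alpha \in \Delta^+_{\bar 0},\\ \lambda \text{ is } \alpha\text{-typical}}} (1 + e^{-\alpha}) \Biggr), \tag{1.4}$$
where $\mathbf{v}_\lambda$ is an irreducible $\mathrm{Cliff}(\lambda)$-module, $W$ is the Weyl group of $\mathfrak{g}_{\bar 0} = \mathfrak{gl}_n$, $\rho_0 = \frac{1}{2} \sum_{\alpha \in \Delta^+_{\bar 0}} \alpha$ and $D = \sum_{w \in W} \mathrm{sgn}\, w \; e^{w(\rho_0)}$ is the Weyl denominator. In [PS2], the formula (1.4) is called the generic character formula and an explicit algorithm for computing the character of an arbitrary finite dimensional irreducible $\mathfrak{g}$-module is presented.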
2. The Quantum Superalgebra $U_q(\mathfrak{q}(n))$

In [O], Olshanski constructed the quantum deformation $U_q(\mathfrak{q}(n))$ of the universal enveloping algebra of $\mathfrak{q}(n)$. The quantum superalgebra $U_q(\mathfrak{q}(n))$ is defined to be the associative algebra over $\mathbb{C}(q)$ generated by $L_{ij}$, $i \le j$, with defining relations
$$\begin{aligned}
& L_{ii} L_{-i,-i} = L_{-i,-i} L_{ii} = 1, \\
& (-1)^{p(i,j)p(k,l)} q^{\varphi(j,l)} L_{ij} L_{kl} + \{k \le j < l\}\,\theta(i,j,k)(q - q^{-1}) L_{il} L_{kj} \\
& \qquad + \{i \le -l < j \le -k\}\,\theta(-i,-j,k)(q - q^{-1}) L_{i,-l} L_{k,-j} \\
& \quad = q^{\varphi(i,k)} L_{kl} L_{ij} + \{k < i \le l\}\,\theta(i,j,k)(q - q^{-1}) L_{il} L_{kj} \\
& \qquad + \{-l \le i < -k \le j\}\,\theta(-i,-j,k)(q - q^{-1}) L_{-i,l} L_{-k,j},
\end{aligned} \tag{2.1}$$
where $\varphi(i,j) = \delta_{|i|,|j|}\, \mathrm{sgn}(j)$, $\theta(i,j,k) = \mathrm{sgn}(\mathrm{sgn}(i) + \mathrm{sgn}(j) + \mathrm{sgn}(k))$,
$$p(i,j) = \begin{cases} 0 & \text{if } ij > 0, \\ 1 & \text{if } ij < 0, \end{cases}$$
for any indices $i \le j$, $k \le l$ in $\{\pm 1, \dots, \pm n\}$, and the symbol $\{\cdots\}$ (the dots stand for some inequalities) is equal to 1 if all of these inequalities are fulfilled and 0 otherwise.

Following [O, Remark 7.3], we consider the set of generators of $U_q(\mathfrak{g}) = U_q(\mathfrak{q}(n))$ as follows:
$$\begin{aligned}
& q^{k_i} := L_{i,i}, \quad q^{-k_i} := L_{-i,-i}, \quad e_i := -\frac{1}{q - q^{-1}} L_{-i-1,-i}, \quad f_i := \frac{1}{q - q^{-1}} L_{i,i+1}, \\
& e_{\bar i} := -\frac{1}{q - q^{-1}} L_{-i-1,i}, \quad f_{\bar i} := -\frac{1}{q - q^{-1}} L_{-i,i+1}, \quad k_{\bar i} := -\frac{1}{q - q^{-1}} L_{-i,i}.
\end{aligned} \tag{2.2}$$
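The combinatorial symbols entering (2.1) are simple enough to write down as functions, which can help when checking individual cases of the relation by hand. This is only an illustrative sketch: indices are nonzero integers in $\{\pm 1, \dots, \pm n\}$, and the helper names `sgn`, `phi`, `theta`, `parity` and `bracket` are assumptions made here, not notation from [O].

```python
def sgn(x):
    """Sign of an integer: -1, 0 or 1."""
    return (x > 0) - (x < 0)


def phi(i, j):
    """phi(i, j) = delta_{|i|,|j|} * sgn(j)."""
    return sgn(j) if abs(i) == abs(j) else 0


def theta(i, j, k):
    """theta(i, j, k) = sgn(sgn(i) + sgn(j) + sgn(k))."""
    return sgn(sgn(i) + sgn(j) + sgn(k))


def parity(i, j):
    """p(i, j) = 0 if ij > 0 and 1 if ij < 0 (indices are nonzero)."""
    return 0 if i * j > 0 else 1


def bracket(*conditions):
    """The symbol {...}: 1 if every inequality holds, 0 otherwise."""
    return 1 if all(conditions) else 0
```

A call such as `bracket(k <= j < l)` then evaluates the first bracket symbol in (2.1) for concrete indices.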
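Our first main result is the following presentation of $U_q(\mathfrak{g})$.

Theorem 2.1. The quantum superalgebra $U_q(\mathfrak{g})$ is isomorphic to the unital associative algebra over $\mathbb{C}(q)$ generated by the elements $e_i$, $f_i$, $e_{\bar i}$, $f_{\bar i}$ ($i = 1, \dots, n-1$), $k_{\bar l}$ ($l = 1, \dots, n$), and $q^h$ ($h \in P^\vee$), satisfying the following relations:
$$\begin{aligned}
& q^0 = 1, \quad q^{h_1 + h_2} = q^{h_1} q^{h_2} \quad \text{for } h_1, h_2 \in P^\vee, \\
& q^h e_i q^{-h} = q^{\alpha_i(h)} e_i, \quad q^h f_i q^{-h} = q^{-\alpha_i(h)} f_i \quad \text{for } h \in P^\vee, \\
& q^h k_{\bar i} q^{-h} = k_{\bar i}, \quad q^h e_{\bar i} q^{-h} = q^{\alpha_i(h)} e_{\bar i}, \quad q^h f_{\bar i} q^{-h} = q^{-\alpha_i(h)} f_{\bar i} \quad \text{for } h \in P^\vee, \\
& e_i f_i - f_i e_i = \frac{q^{k_i - k_{i+1}} - q^{-k_i + k_{i+1}}}{q - q^{-1}}, \\
& q e_{i+1} f_i - f_i e_{i+1} = e_i f_{i+1} - q f_{i+1} e_i = e_i f_j - f_j e_i = 0 \quad \text{if } |i - j| > 1, \\
& e_i f_{\bar i} - f_{\bar i} e_i = q^{-k_{i+1}} k_{\bar i} - k_{\overline{i+1}}\, q^{-k_i}, \\
& q e_{i+1} f_{\bar i} - f_{\bar i} e_{i+1} = e_i f_{\overline{i+1}} - q f_{\overline{i+1}} e_i = e_i f_{\bar j} - f_{\bar j} e_i = 0 \quad \text{if } |i - j| > 1, \\
& e_{\bar i} f_i - f_i e_{\bar i} = q^{k_{i+1}} k_{\bar i} - k_{\overline{i+1}}\, q^{k_i}, \\
& q e_{\overline{i+1}} f_i - f_i e_{\overline{i+1}} = e_{\bar i} f_{i+1} - q f_{i+1} e_{\bar i} = e_{\bar i} f_j - f_j e_{\bar i} = 0 \quad \text{if } |i - j| > 1, \\
& k_{\bar i} e_i - q e_i k_{\bar i} = e_{\bar i}\, q^{-k_i}, \quad q k_{\bar i} e_{i-1} - e_{i-1} k_{\bar i} = -q^{-k_i} e_{\overline{i-1}}, \\
& k_{\bar i} e_j - e_j k_{\bar i} = 0 \quad \text{for } j \neq i \text{ and } j \neq i - 1, \\
& k_{\bar i} f_i - q f_i k_{\bar i} = -f_{\bar i}\, q^{k_i}, \quad q k_{\bar i} f_{i-1} - f_{i-1} k_{\bar i} = q^{k_i} f_{\overline{i-1}}, \\
& k_{\bar i} f_j - f_j k_{\bar i} = 0 \quad \text{for } j \neq i \text{ and } j \neq i - 1, \\
& k_{\bar i}^2 = \frac{q^{2k_i} - q^{-2k_i}}{q^2 - q^{-2}}, \quad k_{\bar i} k_{\bar j} = -k_{\bar j} k_{\bar i} \quad \text{for } i \neq j, \\
& e_{\bar i} f_{\bar i} + f_{\bar i} e_{\bar i} = \frac{q^{k_i + k_{i+1}} - q^{-k_i - k_{i+1}}}{q - q^{-1}} + (q - q^{-1}) k_{\bar i} k_{\overline{i+1}}, \\
& q e_{\overline{i+1}} f_{\bar i} + f_{\bar i} e_{\overline{i+1}} = e_{\bar i} f_{\overline{i+1}} + q f_{\overline{i+1}} e_{\bar i} = e_{\bar i} f_{\bar j} + f_{\bar j} e_{\bar i} = 0 \quad \text{if } |i - j| > 1, \\
& k_{\bar i} e_{\bar i} + q e_{\bar i} k_{\bar i} = e_i\, q^{-k_i}, \quad q k_{\bar i} e_{\overline{i-1}} + e_{\overline{i-1}} k_{\bar i} = q^{-k_i} e_{i-1}, \\
& k_{\bar i} e_{\bar j} + e_{\bar j} k_{\bar i} = 0 \quad \text{for } j \neq i \text{ and } j \neq i - 1, \\
& k_{\bar i} f_{\bar i} + q f_{\bar i} k_{\bar i} = f_i\, q^{k_i}, \quad q k_{\bar i} f_{\overline{i-1}} + f_{\overline{i-1}} k_{\bar i} = q^{k_i} f_{i-1}, \\
& k_{\bar i} f_{\bar j} + f_{\bar j} k_{\bar i} = 0 \quad \text{for } j \neq i \text{ and } j \neq i - 1, \\
& e_{\bar i}^2 = -\frac{q - q^{-1}}{q + q^{-1}}\, e_i^2, \quad f_{\bar i}^2 = \frac{q - q^{-1}}{q + q^{-1}}\, f_i^2, \\
& e_i e_j - e_j e_i = f_i f_j - f_j f_i = e_{\bar i} e_{\bar j} + e_{\bar j} e_{\bar i} = f_{\bar i} f_{\bar j} + f_{\bar j} f_{\bar i} = 0 \quad \text{if } |i - j| > 1, \\
& e_i e_{\bar j} - e_{\bar j} e_i = f_i f_{\bar j} - f_{\bar j} f_i = 0 \quad \text{if } |i - j| \neq 1, \\
& e_i e_{i+1} - e_{i+1} e_i = e_{\bar i} e_{\overline{i+1}} + e_{\overline{i+1}} e_{\bar i}, \quad f_{i+1} f_i - f_i f_{i+1} = f_{\bar i} f_{\overline{i+1}} + f_{\overline{i+1}} f_{\bar i}, \\
& e_i e_{\overline{i+1}} - e_{\overline{i+1}} e_i = e_{\bar i} e_{i+1} - e_{i+1} e_{\bar i}, \quad f_{\overline{i+1}} f_i - f_i f_{\overline{i+1}} = f_{i+1} f_{\bar i} - f_{\bar i} f_{i+1}, \\
& q e_i^2 e_{i+1} - (q + q^{-1}) e_i e_{i+1} e_i + q^{-1} e_{i+1} e_i^2 = 0, \\
& q f_i^2 f_{i+1} - (q + q^{-1}) f_i f_{i+1} f_i + q^{-1} f_{i+1} f_i^2 = 0, \\
& q e_i e_{i+1}^2 - (q + q^{-1}) e_{i+1} e_i e_{i+1} + q^{-1} e_{i+1}^2 e_i = 0, \\
& q f_i f_{i+1}^2 - (q + q^{-1}) f_{i+1} f_i f_{i+1} + q^{-1} f_{i+1}^2 f_i = 0, \\
& q e_i^2 e_{\overline{i+1}} - (q + q^{-1}) e_i e_{\overline{i+1}} e_i + q^{-1} e_{\overline{i+1}} e_i^2 = 0, \\
& q f_i^2 f_{\overline{i+1}} - (q + q^{-1}) f_i f_{\overline{i+1}} f_i + q^{-1} f_{\overline{i+1}} f_i^2 = 0, \\
& q e_{\bar i} e_{i+1}^2 - (q + q^{-1}) e_{i+1} e_{\bar i} e_{i+1} + q^{-1} e_{i+1}^2 e_{\bar i} = 0, \\
& q f_{\bar i} f_{i+1}^2 - (q + q^{-1}) f_{i+1} f_{\bar i} f_{i+1} + q^{-1} f_{i+1}^2 f_{\bar i} = 0.
\end{aligned} \tag{2.3}$$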
Proof. Let U be the unital associative algebra over C(q) generated by the elements ei , f i , ei¯ , f i¯ (i = 1, . . . , n − 1), kl¯ (l = 1, . . . , n), and q h (h ∈ P ∨ ) with defining relations given in (2.3). Using (2.1) and (2.2), the relations in (2.3) can be derived easily. Thus there is a well-defined algebra homomorphism φ : U −→ Uq (g).
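From the relation (2.1), we obtain
$$\begin{aligned}
L_{i,i+j} &= (q - q^{-1})\, q^{-\sum_{h=1}^{j-1} k_{i+h}} \prod_{h=1}^{j-1} \mathrm{ad} f_{i+h}\, (f_i), \\
L_{-i,i+j} &= -(q - q^{-1})\, q^{-\sum_{h=1}^{j-1} k_{i+h}} \prod_{h=1}^{j-1} \mathrm{ad} f_{i+h}\, (f_{\bar i}), \\
L_{-i-j,i} &= (-1)^j (q - q^{-1})\, q^{\sum_{h=1}^{j-1} k_{i+h}} \prod_{h=1}^{j-1} \mathrm{ad}\, e_{i+h}\, (e_{\bar i}), \\
L_{-i-j,-i} &= (-1)^j (q - q^{-1})\, q^{\sum_{h=1}^{j-1} k_{i+h}} \prod_{h=1}^{j-1} \mathrm{ad}\, e_{i+h}\, (e_i),
\end{aligned} \tag{2.4}$$
where $\mathrm{ad}\, b_i (b_j) := b_i b_j - b_j b_i$, $\prod_{h=1}^{j} \mathrm{ad}\, b_{i+h} (b_i) := \mathrm{ad}\, b_{i+j} \cdots \mathrm{ad}\, b_{i+1} (b_i)$ and $\prod_{h=1}^{0} \mathrm{ad}\, b_{i+h} (b_i) = b_i$ for $b_i = e_i, e_{\bar i}, f_i, f_{\bar i}$ ($i = 1, \dots, n-1$, $j > 0$). It follows that the homomorphism $\phi$ must be surjective.

It remains to prove that $\phi$ is injective. For this purpose, we will show that the relations in (2.1) can be derived from the ones in (2.3). The proof of this assertion is lengthy, but the basic idea is a case-by-case check. We define the sets
$$\begin{aligned}
\Omega &= \{(i,j) \in \mathbb{Z} \setminus \{0\} \times \mathbb{Z} \setminus \{0\} \mid -n \le i \le j \le n\}, \\
\Omega_1 &= \{(i,j) \in \Omega \mid i > 0,\ j > 0 \text{ and } i < j\}, \\
\Omega_2 &= \{(i,j) \in \Omega \mid i < 0,\ j > 0 \text{ and } |i| < |j|\}, \\
\Omega_3 &= \{(i,j) \in \Omega \mid i < 0,\ j > 0 \text{ and } |i| > |j|\}, \\
\Omega_4 &= \{(i,j) \in \Omega \mid i < 0,\ j < 0 \text{ and } |i| > |j|\}, \\
\Omega_5 &= \{(i,j) \in \Omega \mid i < 0,\ j > 0 \text{ and } |i| = |j|\}.
\end{aligned}$$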
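As a quick consistency check on the five index sets just defined (called $\Omega_1, \dots, \Omega_5$ here, with $\Omega$ for the full set of pairs), one can enumerate them for small $n$ and confirm that they are pairwise disjoint and cover exactly the off-diagonal pairs $i < j$. The sketch below is only a sanity check under that reading of the definitions; `omega` and `omega_class` are hypothetical helper names.

```python
def omega(n):
    """All pairs (i, j) of nonzero integers with -n <= i <= j <= n."""
    idx = [i for i in range(-n, n + 1) if i != 0]
    return [(i, j) for i in idx for j in idx if i <= j]


def omega_class(i, j):
    """Return s in 1..5 with (i, j) in Omega_s, or None for a diagonal pair."""
    if i > 0 and j > 0 and i < j:
        return 1
    if i < 0 < j and abs(i) < abs(j):
        return 2
    if i < 0 < j and abs(i) > abs(j):
        return 3
    if i < 0 and j < 0 and abs(i) > abs(j):
        return 4
    if i < 0 < j and abs(i) == abs(j):
        return 5
    return None
```

Because `omega_class` returns at most one label per pair, disjointness is automatic; coverage amounts to checking that only diagonal pairs get `None`.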
For $((i,j),(k,l)) \in \Omega \times \Omega$, let $a = \min\{|i|, |j|\}$, $b = \max\{|i|, |j|\}$, $c = \min\{|k|, |l|\}$, $d = \max\{|k|, |l|\}$. We list all possible subsets of $\Omega \times \Omega$. For $r = 1, \dots, 13$, the set $C_r$ consists of all $((i,j),(k,l)) \in \Omega \times \Omega$ satisfying the following condition:
C1: $c < d < a < b$,
C2: $c < d = a < b$,
C3: $c < a < d < b$,
C4: $c < a < d = b$,
C5: $c < a < b < d$,
C6: $c = a < d < b$,
C7: $c = a < d = b$,
C8: $c = a < b < d$,
C9: $a < c < d < b$,
C10: $a < c < d = b$,
C11: $a < c < b < d$,
C12: $a < b = c < d$,
C13: $a < b < c < d$.
For $r = 1, \dots, 5$, the set $D_r$ consists of all $((i,j),(k,l)) \in \Omega_5 \times \Omega$ satisfying:
D1: $|i| < c < d$,
D2: $|i| = c < d$,
D3: $c < |i| < d$,
D4: $c < |i| = d$,
D5: $c < d < |i|$.
For $r = 6, \dots, 10$, the set $D_r$ consists of all $((i,j),(k,l)) \in \Omega \times \Omega_5$ satisfying:
D6: $|k| < a < b$,
D7: $|k| = a < b$,
D8: $a < |k| < b$,
D9: $a < b = |k|$,
D10: $a < b < |k|$.
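We consider all cases for $\Omega_s \times \Omega_t \cap C_i$ ($1 \le s, t \le 4$, $1 \le i \le 13$) and $\Omega_s \times \Omega_t \cap D_i$ ($s = 5$, $1 \le t \le 4$ or $1 \le s \le 4$, $t = 5$ and $1 \le i \le 10$). Since the remaining cases can be checked similarly, we just prove:
$$L_{i,i} L_{k,l} L_{i,i}^{-1} = q^{\varphi(l,i) - \varphi(k,i)} L_{k,l} \quad \text{if } (k,l) \in \Omega_1 \cup \Omega_2, \tag{2.5}$$
$$L_{i,j} L_{k,l} - L_{k,l} L_{i,j} = 0 \quad \text{if } ((i,j),(k,l)) \in \Omega_1 \times \Omega_1 \cap C_1, \tag{2.6}$$
$$L_{i,j} L_{k,l} - L_{k,l} L_{i,j} = (q - q^{-1}) L_{i,l} L_{k,j} \quad \text{if } ((i,j),(k,l)) \in \Omega_1 \times \Omega_1 \cap C_2, \tag{2.7}$$
$$(L_{i,j})^2 = \frac{q - q^{-1}}{q + q^{-1}} (L_{-i,j})^2 \quad \text{if } (i,j) \in \Omega_2. \tag{2.8}$$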
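From (2.4), we obtain
$$L_{i,j} = \frac{L_{j-1,j-1}^{-1}}{q - q^{-1}} \bigl(L_{j-1,j} L_{i,j-1} - L_{i,j-1} L_{j-1,j}\bigr) \quad \text{if } (i,j) \in \Omega_1 \cup \Omega_2,$$
$$L_{i,j} = \frac{L_{-i-1,-i-1}}{q - q^{-1}} \bigl(L_{i,i+1} L_{i+1,j} - L_{i+1,j} L_{i,i+1}\bigr) \quad \text{if } (i,j) \in \Omega_3 \cup \Omega_4.$$
To prove (2.5), we use induction on $l - k$:
$$\begin{aligned}
L_{i,i} L_{k,l} L_{i,i}^{-1} &= \frac{L_{l-1,l-1}^{-1}}{q - q^{-1}}\, L_{i,i} \bigl(L_{l-1,l} L_{k,l-1} - L_{k,l-1} L_{l-1,l}\bigr) L_{i,i}^{-1} \\
&= q^{\varphi(l,i) - \varphi(l-1,i) + \varphi(l-1,i) - \varphi(k,i)}\, \frac{L_{l-1,l-1}^{-1}}{q - q^{-1}} \bigl(L_{l-1,l} L_{k,l-1} - L_{k,l-1} L_{l-1,l}\bigr) \\
&= q^{\varphi(l,i) - \varphi(k,i)} L_{k,l}.
\end{aligned}$$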
From (2.3), we know that $f_i f_j - f_j f_i = 0$ if $|i - j| > 1$. By using induction on $j - i$ and (2.5), one can show that $L_{i,j} L_{k,k+1} - L_{k,k+1} L_{i,j} = 0$ when $((i,j),(k,k+1)) \in \Omega_1 \times \Omega_1 \cap C_1$. Similarly, one can prove $L_{i,j} L_{k,l} - L_{k,l} L_{i,j} = 0$ by induction on $l - k$. The proof of (2.7) is analogous (we use induction on $l - k$ and (2.5), (2.6)):
$$\begin{aligned}
L_{i,j} L_{k,l} &= \frac{L_{l-1,l-1}^{-1}}{q - q^{-1}}\, L_{i,j} \bigl(L_{l-1,l} L_{k,l-1} - L_{k,l-1} L_{l-1,l}\bigr) \\
&= \frac{L_{l-1,l-1}^{-1}}{q - q^{-1}} \bigl(L_{l-1,l} L_{i,j} L_{k,l-1} + (q - q^{-1}) L_{i,l} L_{l-1,j} L_{k,l-1} - L_{k,l-1} L_{i,j} L_{l-1,l}\bigr) \\
&= \frac{L_{l-1,l-1}^{-1}}{q - q^{-1}} \bigl(L_{l-1,l} L_{k,l-1} L_{i,j} + (q - q^{-1}) L_{i,l} L_{l-1,j} L_{k,l-1} - L_{k,l-1} L_{l-1,l} L_{i,j} \\
&\qquad\qquad\qquad - (q - q^{-1}) L_{k,l-1} L_{i,l} L_{l-1,j}\bigr) \\
&= L_{k,l} L_{i,j} + L_{l-1,l-1}^{-1} L_{i,l} \bigl(L_{l-1,j} L_{k,l-1} - L_{k,l-1} L_{l-1,j}\bigr) \\
&= L_{k,l} L_{i,j} + (q - q^{-1}) L_{i,l} L_{k,j}.
\end{aligned}$$
To verify the relation (2.8), it suffices to show that
$$\bigl(L_{j-1,j} L_{i,j-1} - L_{i,j-1} L_{j-1,j}\bigr)^2 = \frac{q - q^{-1}}{q + q^{-1}} \bigl(L_{j-1,j} L_{-i,j-1} - L_{-i,j-1} L_{j-1,j}\bigr)^2.$$
For this purpose, we need the following formulas for $(i,j) \in \Omega_2$, which can be derived using induction:
$$L_{j-1,j} L_{i,j-1} L_{j-1,j} = \frac{1}{q + q^{-1}} \bigl(q L_{i,j-1} L_{j-1,j}^2 + q^{-1} L_{j-1,j}^2 L_{i,j-1}\bigr),$$
$$q L_{-i,j-1} L_{j-1,j}^2 - (q + q^{-1}) L_{j-1,j} L_{-i,j-1} L_{j-1,j} + q^{-1} L_{j-1,j}^2 L_{-i,j-1} = 0.$$
Using these formulae, we can verify the desired relation:
$$\begin{aligned}
& \bigl(L_{j-1,j} L_{i,j-1} - L_{i,j-1} L_{j-1,j}\bigr)^2 \\
&= (L_{j-1,j} L_{i,j-1} L_{j-1,j}) L_{i,j-1} - \frac{q - q^{-1}}{q + q^{-1}} L_{j-1,j} L_{-i,j-1}^2 L_{j-1,j} - L_{i,j-1} L_{j-1,j}^2 L_{i,j-1} \\
&\qquad + L_{i,j-1} (L_{j-1,j} L_{i,j-1} L_{j-1,j}) \\
&= \frac{q - q^{-1}}{q + q^{-1}} \Bigl( \frac{q^{-1}}{q + q^{-1}} L_{j-1,j}^2 L_{-i,j-1}^2 + \frac{q}{q + q^{-1}} L_{-i,j-1}^2 L_{j-1,j}^2 - L_{j-1,j} L_{-i,j-1}^2 L_{j-1,j} \Bigr) \\
&= \frac{q - q^{-1}}{q + q^{-1}} \Bigl( \bigl(L_{j-1,j} L_{-i,j-1} L_{j-1,j} - \frac{q}{q + q^{-1}} L_{-i,j-1} L_{j-1,j}^2\bigr) L_{-i,j-1} \\
&\qquad + L_{-i,j-1} \bigl(L_{j-1,j} L_{-i,j-1} L_{j-1,j} - \frac{q^{-1}}{q + q^{-1}} L_{j-1,j}^2 L_{-i,j-1}\bigr) - L_{j-1,j} L_{-i,j-1}^2 L_{j-1,j} \Bigr) \\
&= \frac{q - q^{-1}}{q + q^{-1}} \bigl(L_{j-1,j} L_{-i,j-1} - L_{-i,j-1} L_{j-1,j}\bigr)^2.
\end{aligned}$$
Set $\deg f_i = \deg f_{\bar i} = -\alpha_i$, $\deg q^h = \deg k_{\bar l} = 0$, $\deg e_i = \deg e_{\bar i} = \alpha_i$. Since all the defining relations of the quantum superalgebra $U_q(\mathfrak{g})$ are homogeneous, it has a root space decomposition
$$U_q(\mathfrak{g}) = \bigoplus_{\alpha \in Q} (U_q)_\alpha,$$
where $(U_q)_\alpha = \{u \in U_q(\mathfrak{g}) \mid q^h u q^{-h} = q^{\alpha(h)} u \text{ for all } h \in P^\vee\}$.

Remark. If we define $F_i = f_i q^{-k_{i+1}}$, $E_i = q^{k_{i+1}} e_i$, one can see that the relations involving $E_i$, $F_i$ and $q^h$ are the same as the standard relations for $U_q(\mathfrak{gl}_n)$ (see, for example, [HK, Def. 7.1.1]). Hence $U_q(\mathfrak{gl}_n)$ is a subalgebra of $U_q(\mathfrak{g})$.

The comultiplication $\Delta$ of $U_q(\mathfrak{g})$ is given by the formula
$$\Delta(L_{i,j}) = \sum_{k=i}^{j} L_{i,k} \otimes L_{k,j} \tag{2.9}$$
(see §4 in [O]). In terms of the new generators we have:
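$$\begin{aligned}
\Delta(q^h) &= q^h \otimes q^h \quad \text{for every } h \in P^\vee, \\
\Delta(e_i) &= q^{-k_{i+1}} \otimes e_i + e_i \otimes q^{-k_i}, \\
\Delta(f_i) &= q^{k_i} \otimes f_i + f_i \otimes q^{k_{i+1}}, \\
\Delta(e_{\bar i}) &= q^{-k_{i+1}} \otimes e_{\bar i} - (q - q^{-1})\, e_i \otimes k_{\bar i} \\
&\quad + (q - q^{-1}) \sum_{j=1}^{i-1} (-1)^{j+1} q^{\sum_{h=1}^{j} k_{i-j+h}} \prod_{h=1}^{j} \mathrm{ad}\, e_{i-j+h} (e_{i-j}) \otimes q^{-\sum_{h=1}^{j-1} k_{i-j+h}} \prod_{h=1}^{j-1} \mathrm{ad} f_{i-j+h} (f_{\overline{i-j}}) \\
&\quad + (q - q^{-1}) \sum_{j=1}^{i-1} (-1)^{j} q^{\sum_{h=1}^{j} k_{i-j+h}} \prod_{h=1}^{j} \mathrm{ad}\, e_{i-j+h} (e_{\overline{i-j}}) \otimes q^{-\sum_{h=1}^{j-1} k_{i-j+h}} \prod_{h=1}^{j-1} \mathrm{ad} f_{i-j+h} (f_{i-j}) \\
&\quad + e_{\bar i} \otimes q^{k_i}, \\
\Delta(f_{\bar i}) &= q^{-k_i} \otimes f_{\bar i} \\
&\quad + (q - q^{-1}) \sum_{j=1}^{i-1} (-1)^{j} q^{\sum_{h=1}^{j-1} k_{i-j+h}} \prod_{h=1}^{j-1} \mathrm{ad}\, e_{i-j+h} (e_{i-j}) \otimes q^{-\sum_{h=1}^{j} k_{i-j+h}} \prod_{h=1}^{j} \mathrm{ad} f_{i-j+h} (f_{\overline{i-j}}) \\
&\quad + (q - q^{-1}) \sum_{j=1}^{i-1} (-1)^{j+1} q^{\sum_{h=1}^{j-1} k_{i-j+h}} \prod_{h=1}^{j-1} \mathrm{ad}\, e_{i-j+h} (e_{\overline{i-j}}) \otimes q^{-\sum_{h=1}^{j} k_{i-j+h}} \prod_{h=1}^{j} \mathrm{ad} f_{i-j+h} (f_{i-j}) \\
&\quad + (q - q^{-1})\, k_{\bar i} \otimes f_i + f_{\bar i} \otimes q^{k_{i+1}}, \\
\Delta(k_{\bar i}) &= q^{-k_i} \otimes k_{\bar i} \\
&\quad + (q - q^{-1}) \sum_{j=1}^{i-1} (-1)^{j} q^{\sum_{h=1}^{j-1} k_{i-j+h}} \prod_{h=1}^{j-1} \mathrm{ad}\, e_{i-j+h} (e_{i-j}) \otimes q^{-\sum_{h=1}^{j-1} k_{i-j+h}} \prod_{h=1}^{j-1} \mathrm{ad} f_{i-j+h} (f_{\overline{i-j}}) \\
&\quad + (q - q^{-1}) \sum_{j=1}^{i-1} (-1)^{j+1} q^{\sum_{h=1}^{j-1} k_{i-j+h}} \prod_{h=1}^{j-1} \mathrm{ad}\, e_{i-j+h} (e_{\overline{i-j}}) \otimes q^{-\sum_{h=1}^{j-1} k_{i-j+h}} \prod_{h=1}^{j-1} \mathrm{ad} f_{i-j+h} (f_{i-j}) \\
&\quad + k_{\bar i} \otimes q^{k_i}.
\end{aligned}$$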
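As an illustration of how these formulas arise, the two simplest ones can be read off directly from (2.9) and the definitions (2.2), writing $\Delta$ for the comultiplication. The short derivation below is a worked check rather than part of the original argument; it uses only that the sum in (2.9) runs over the nonzero indices $k$ with $i \le k \le j$.
$$\begin{aligned}
\Delta(L_{i,i+1}) &= L_{i,i} \otimes L_{i,i+1} + L_{i,i+1} \otimes L_{i+1,i+1} \\
&\Longrightarrow\ \Delta(f_i) = q^{k_i} \otimes f_i + f_i \otimes q^{k_{i+1}}, \\
\Delta(L_{-i-1,-i}) &= L_{-i-1,-i-1} \otimes L_{-i-1,-i} + L_{-i-1,-i} \otimes L_{-i,-i} \\
&\Longrightarrow\ \Delta(e_i) = q^{-k_{i+1}} \otimes e_i + e_i \otimes q^{-k_i}.
\end{aligned}$$
In both cases the common factor $\pm (q - q^{-1})^{-1}$ from (2.2) is divided out on each side. The formulas for the odd generators follow the same pattern, with the intermediate terms $L_{-i,\pm(i-j)}$ rewritten through (2.4).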
Let $U_q^+$ (respectively, $U_q^-$) be the subalgebra of $U_q(\mathfrak{g})$ generated by the elements $e_i$, $e_{\bar i}$ (respectively, $f_i$, $f_{\bar i}$) for $i = 1, \dots, n-1$, and let $U_q^0$ be the subalgebra of $U_q(\mathfrak{g})$ generated by $q^h$ ($h \in P^\vee$) and $k_{\bar l}$ for $l = 1, \dots, n$. In addition, let $U_q^{\ge 0}$ (respectively, $U_q^{\le 0}$) be the subalgebra of $U_q(\mathfrak{g})$ generated by $U_q^+$ and $U_q^0$ (respectively, by $U_q^-$ and $U_q^0$). We will show that the quantum superalgebra $U_q(\mathfrak{g})$ has a triangular decomposition. For this purpose, we need the following lemma.

Lemma 2.2. $U_q^{\ge 0} \cong U_q^0 \otimes U_q^+$, $\quad U_q^{\le 0} \cong U_q^- \otimes U_q^0$.
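Proof. We will prove the second part. Let $\{f_\zeta \mid \zeta \in \Gamma\}$ be a basis of $U_q^-$ consisting of monomials in $f_i$ and $f_{\bar i}$'s ($i \in I$). Consider the set $\Gamma' = \{(a_1, \dots, a_n) \mid a_i = 0 \text{ or } 1 \text{ for all } i \in J\}$. Then $\{q^h k_\eta \mid h \in P^\vee, \eta \in \Gamma'\}$ is a basis of $U_q^0$ by [O, Theorem 6.2], where $k_\eta = k_{\bar 1}^{a_1} \cdots k_{\bar n}^{a_n}$ for $\eta = (a_1, \dots, a_n)$. By the defining relations of $U_q(\mathfrak{g})$, it is easy to see that the elements $f_\zeta q^h k_\eta$ ($\zeta \in \Gamma$, $h \in P^\vee$, $\eta \in \Gamma'$) span $U_q^{\le 0}$. Thus there is a surjective $\mathbb{C}(q)$-linear map $U_q^- \otimes U_q^0 \to U_q^{\le 0}$ given by $f_\zeta \otimes q^h k_\eta \mapsto f_\zeta q^h k_\eta$. To show that this map is injective, it suffices to show that the elements $f_\zeta q^h k_\eta$ ($\zeta \in \Gamma$, $h \in P^\vee$, $\eta \in \Gamma'$) are linearly independent over $\mathbb{C}(q)$. Suppose
$$\sum_{\zeta \in \Gamma,\ h \in P^\vee,\ \eta \in \Gamma'} C_{\zeta,h,\eta}\, f_\zeta q^h k_\eta = 0 \quad \text{for some } C_{\zeta,h,\eta} \in \mathbb{C}(q).$$
We may write
$$\sum_{\beta \in Q_+} \Biggl( \sum_{\substack{\deg f_\zeta = -\beta, \\ h \in P^\vee,\ \eta \in \Gamma'}} C_{\zeta,h,\eta}\, f_\zeta q^h k_\eta \Biggr) = 0.$$
Since $U_q(\mathfrak{g}) = \bigoplus_{\beta \in Q} (U_q)_\beta$, we have
$$\sum_{\substack{\deg f_\zeta = -\beta, \\ h \in P^\vee,\ \eta \in \Gamma'}} C_{\zeta,h,\eta}\, f_\zeta q^h k_\eta = 0 \quad \text{for each } \beta \in Q_+.$$
Write $\beta = \sum_{i=1}^{n-1} m_i \alpha_i$ ($m_i \in \mathbb{Z}_{\ge 0}$), and let $h_\beta = \sum_{i=1}^{n-1} m_i k_{i+1}$. Since $f_\zeta$ is a monomial in $f_i$ and $f_{\bar i}$'s, the term of degree $(-\beta, 0)$ in $\Delta(f_\zeta)$ is $f_\zeta \otimes q^{h_\beta}$. We consider the terms of degree $(0,0)$ in $\Delta(k_\eta)$, where $\eta = (a_1, \dots, a_n)$. These terms can be written as
$$\begin{aligned}
& (q^{-k_1} \otimes k_{\bar 1} + k_{\bar 1} \otimes q^{k_1})^{a_1} \cdots (q^{-k_n} \otimes k_{\bar n} + k_{\bar n} \otimes q^{k_n})^{a_n} \\
&= \prod_{i=1}^{n} \Biggl( \sum_{j=0}^{a_i} q^{-(a_i - j) k_i} k_{\bar i}^{\,j} \otimes q^{j k_i} k_{\bar i}^{\,a_i - j} \Biggr) \\
&= \sum_{\substack{(j_1, \dots, j_n) \in \Gamma' \\ j_i \le a_i,\ i \in J}} \prod_{i=1}^{n} q^{-(a_i - j_i) k_i} k_{\bar i}^{\,j_i} \otimes q^{j_i k_i} k_{\bar i}^{\,a_i - j_i}.
\end{aligned}$$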
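Since the terms of degree $(-\beta, 0)$ of $\sum C_{\zeta,h,\eta}\, \Delta(f_\zeta q^h k_\eta)$ must sum to zero, we have
$$0 = \sum_{\eta = (a_1, \dots, a_n) \in \Gamma'}\ \sum_{\substack{(j_1, \dots, j_n) \in \Gamma' \\ j_i \le a_i,\ i \in J}}\ \sum_{\substack{\deg f_\zeta = -\beta, \\ h \in P^\vee}} \Biggl( C_{\zeta,h,\eta}\, f_\zeta q^h \prod_{i=1}^{n} q^{-(a_i - j_i) k_i} k_{\bar i}^{\,j_i} \Biggr) \otimes q^{h_\beta + h} \prod_{i=1}^{n} q^{j_i k_i} k_{\bar i}^{\,a_i - j_i}. \tag{2.10}$$
For all $(a_1 - j_1, \dots, a_n - j_n) \in \Gamma'$ and $h \in P^\vee$, the elements $q^h \prod_{i=1}^{n} k_{\bar i}^{\,a_i - j_i}$ are linearly independent. Set $\eta_1 := (1, \dots, 1)$. Since there is only one pair of $(a_1, \dots, a_n)$ and $(j_1, \dots, j_n)$ such that $\prod_{i=1}^{n} k_{\bar i}^{\,a_i - j_i} = k_{\eta_1}$ in the above sum, we obtain
$$\begin{aligned}
0 &= \sum_{h \in P^\vee} \Biggl( \sum_{\deg f_\zeta = -\beta} C_{\zeta,h,\eta_1}\, f_\zeta q^{h - \sum_{i=1}^{n} k_i} \Biggr) \otimes q^{h_\beta + h} k_{\eta_1} \\
&\quad + \sum_{\substack{\eta = (a_1, \dots, a_n),\ (j_1, \dots, j_n),\ h \in P^\vee, \\ (a_1 - j_1, \dots, a_n - j_n) \neq \eta_1}} \Biggl( \sum_{\deg f_\zeta = -\beta} C_{\zeta,h,\eta}\, f_\zeta q^h \prod_{i=1}^{n} q^{-(a_i - j_i) k_i} k_{\bar i}^{\,j_i} \Biggr) \otimes q^{h_\beta + h} \prod_{i=1}^{n} q^{j_i k_i} k_{\bar i}^{\,a_i - j_i}.
\end{aligned}$$
Thus we have
$$\sum_{\deg f_\zeta = -\beta} C_{\zeta,h,\eta_1}\, f_\zeta q^{h - \sum_{i=1}^{n} k_i} = 0 \quad \text{for all } h \in P^\vee.$$
Multiplying by $q^{-h + \sum_{i=1}^{n} k_i}$ from the right, we obtain
$$\sum_{\deg f_\zeta = -\beta} C_{\zeta,h,\eta_1}\, f_\zeta = 0 \quad \text{for all } h \in P^\vee.$$
Using the linear independence of the $f_\zeta$, we conclude that $C_{\zeta,h,\eta_1} = 0$ for all $\zeta \in \Gamma$, $h \in P^\vee$.

Now consider a general $\eta = (a_1, \dots, a_n) \in \Gamma'$. Assume that for all $\eta' = (a_1', \dots, a_n')$ such that $a_i' \ge a_i$ for all $i \in J$ and $\eta' \neq \eta$, we have $C_{\zeta,h,\eta'} = 0$ for all $\zeta \in \Gamma$, $h \in P^\vee$. Then there is only one pair of $(a_1, \dots, a_n)$ and $(j_1, \dots, j_n)$ such that $(a_1 - j_1, \dots, a_n - j_n) = \eta$ in (2.10). Repeating the above argument, we conclude that $C_{\zeta,h,\eta} = 0$ for all $\zeta \in \Gamma$, $h \in P^\vee$. For example, consider $\eta_2 = (0, 1, \dots, 1)$. Since $C_{\zeta,h,\eta_1} = 0$, there is only one pair of $(a_1, \dots, a_n)$ and $(j_1, \dots, j_n)$ such that $(a_1 - j_1, \dots, a_n - j_n) = (0, 1, \dots, 1)$ in (2.10). Thus we have
$$\sum_{\deg f_\zeta = -\beta} C_{\zeta,h,\eta_2}\, f_\zeta q^{h - \sum_{i=2}^{n} k_i} = 0 \quad \text{for all } h \in P^\vee.$$
Multiplying by $q^{-h + \sum_{i=2}^{n} k_i}$ and using the linear independence of the $f_\zeta$, we obtain $C_{\zeta,h,\eta_2} = 0$ for all $\zeta \in \Gamma$, $h \in P^\vee$.

We are now ready to prove the triangular decomposition for $U_q(\mathfrak{g})$.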
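Theorem 2.3. There is a $\mathbb{C}(q)$-linear isomorphism
$$U_q(\mathfrak{g}) \cong U_q^- \otimes U_q^0 \otimes U_q^+.$$

Proof. Let $\{f_\zeta \mid \zeta \in \Gamma\}$, $\{q^h k_\eta \mid h \in P^\vee, \eta \in \Gamma'\}$, and $\{e_\tau \mid \tau \in \Gamma\}$ be monomial bases of $U_q^-$, $U_q^0$ and $U_q^+$ respectively, where $\Gamma$ and $\Gamma'$ are the index sets as in the proof of Lemma 2.2. It suffices to show that the elements $f_\zeta q^h k_\eta e_\tau$ ($\zeta, \tau \in \Gamma$, $h \in P^\vee$, $\eta \in \Gamma'$) are linearly independent over $\mathbb{C}(q)$. Suppose
$$\sum_{\zeta,h,\eta,\tau} C_{\zeta,h,\eta,\tau}\, f_\zeta q^h k_\eta e_\tau = 0 \quad \text{for some } C_{\zeta,h,\eta,\tau} \in \mathbb{C}(q).$$
The root space decomposition of $U_q(\mathfrak{g})$ yields
$$\sum_{\substack{h,\ \eta, \\ \deg f_\zeta + \deg e_\tau = \gamma}} C_{\zeta,h,\eta,\tau}\, f_\zeta q^h k_\eta e_\tau = 0 \quad \text{for all } \gamma \in Q.$$
Using the partial ordering on $\mathfrak{h}_{\bar 0}^*$, we can choose $\alpha = \deg f_\zeta$ and $\beta = \deg e_\tau$, which are minimal and maximal, respectively, among those for which $\alpha + \beta = \gamma$ and $C_{\zeta,h,\eta,\tau}$ is nonzero. If $\alpha = -\sum m_i \alpha_i$, set $h_\alpha = \sum m_i k_{i+1}$, and if $\beta = \sum n_i \alpha_i$, set $h_\beta = \sum n_i k_{i+1}$. The term of degree $(0, \beta)$ in $\Delta(e_\tau)$ is $q^{-h_\beta} \otimes e_\tau$ and the term of degree $(\alpha, 0)$ of $\Delta(f_\zeta)$ is $f_\zeta \otimes q^{h_\alpha}$. Since the terms of degree $(\alpha, \beta)$ of $\sum C_{\zeta,h,\eta,\tau}\, \Delta(f_\zeta q^h k_\eta e_\tau)$ must sum to zero, we have
$$\sum_{\substack{\deg f_\zeta = \alpha,\ \deg e_\tau = \beta, \\ h,\ \eta = (a_1, \dots, a_n)}}\ \sum_{\substack{(j_1, \dots, j_n) \in \Gamma' \\ j_i \le a_i,\ i \in J}} C_{\zeta,h,\eta,\tau}\, f_\zeta q^h \Biggl( \prod_{i=1}^{n} q^{-(a_i - j_i) k_i} k_{\bar i}^{\,j_i} \Biggr) q^{-h_\beta} \otimes q^{h_\alpha + h} \Biggl( \prod_{i=1}^{n} q^{j_i k_i} k_{\bar i}^{\,a_i - j_i} \Biggr) e_\tau = 0.$$
The elements $f_\zeta q^h \prod_{i=1}^{n} k_{\bar i}^{\,j_i}$ are linearly independent for $\zeta \in \Gamma$, $h \in P^\vee$, $(j_1, \dots, j_n) \in \Gamma'$, by Lemma 2.2. By an argument similar to the one in the proof of Lemma 2.2, we obtain
$$\sum_{\deg e_\tau = \beta} C_{\zeta,h,\eta,\tau}\, e_\tau = 0 \quad \text{for all } h \in P^\vee,\ \zeta \in \Gamma, \text{ and } \eta \in \Gamma'.$$
Using the linear independence of the $e_\tau$, we conclude that $C_{\zeta,h,\eta,\tau} = 0$ for all $\zeta \in \Gamma$, $h \in P^\vee$, $\eta \in \Gamma'$, and $\tau \in \Gamma$, as desired.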
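3. The Quantum Clifford Superalgebra $\mathrm{Cliff}_q(\lambda)$

We first introduce some notation that will be used in this section only. Let $\mathbb{K}$ be a field of characteristic zero and $A$ be an associative $\mathbb{K}$-algebra. Denote by $\mathrm{Mat}_n(A)$ the associative $\mathbb{K}$-algebra of $n \times n$ matrices with entries in $A$. If $A$ is a superalgebra, then $\mathrm{Mat}_n(A)$ is a superalgebra as well by setting $\mathrm{Mat}_n(A)_{\bar i} = \mathrm{Mat}_n(A_{\bar i})$. By $\mathrm{sMat}_{n|n}(\mathbb{K})$ we denote the associative superalgebra of $2n \times 2n$ matrices $\begin{pmatrix} A & B \\ C & D \end{pmatrix}$, where $A$, $B$, $C$, and $D$ are in $\mathrm{Mat}_n(\mathbb{K})$ and
$$\mathrm{sMat}_{n|n}(\mathbb{K})_{\bar 0} = \left\{ \begin{pmatrix} A & 0 \\ 0 & D \end{pmatrix} \right\}, \quad \mathrm{sMat}_{n|n}(\mathbb{K})_{\bar 1} = \left\{ \begin{pmatrix} 0 & B \\ C & 0 \end{pmatrix} \right\}.$$
Let $Q_n(\mathbb{K})$ be the subsuperalgebra of $\mathrm{sMat}_{n|n}(\mathbb{K})$ with elements $\begin{pmatrix} A & B \\ B & A \end{pmatrix}$. In particular, $Q_n(\mathbb{K})_{\bar 0} = Q_n(\mathbb{K})_{\bar 1} = \mathrm{Mat}_n(\mathbb{K})$. There are $\mathbb{K}$-superalgebra isomorphisms
$$\mathrm{Mat}_r(\mathrm{sMat}_{1|1}(\mathbb{K})) \cong \mathrm{sMat}_{r|r}(\mathbb{K}), \quad \mathrm{Mat}_r(Q_1(\mathbb{K})) \cong Q_r(\mathbb{K}).$$
Note that if $\mathbb{K} = \mathbb{C}$, then the superalgebra $Q_n(\mathbb{C})$ coincides with $\mathfrak{g}$ as a complex vector space. Another example of a $\mathbb{K}$-superalgebra is any extension $\mathbb{K}(\alpha)$ of $\mathbb{K}$ of degree 2, considering $\alpha$ as an odd element. If $\alpha^2 = \beta \in \mathbb{K}$, we will denote $\mathbb{K}(\alpha)$ by $\mathbb{K}(\sqrt{\beta})$.

In this section, we set $\mathbb{F} = \mathbb{C}(q)$. For every $\lambda \in P$ we define $I^q(\lambda)$ to be the left ideal of $U_q^0$ generated by $q^h - q^{\lambda(h)} 1$, $h \in P^\vee$. Set $\mathrm{Cliff}_q(\lambda) := U_q^0 / I^q(\lambda)$. We may consider $\mathrm{Cliff}_q(\lambda)$ as the associative $\mathbb{F}$-algebra generated by the identity $1 = 1 + I^q(\lambda)$ and $t_{\bar i} := k_{\bar i} + I^q(\lambda)$ satisfying the relations
$$t_{\bar i} t_{\bar j} + t_{\bar j} t_{\bar i} = \delta_{ij}\, \frac{2(q^{2\lambda_i} - q^{-2\lambda_i})}{q^2 - q^{-2}}\, 1, \quad i, j = 1, \dots, n.$$
Furthermore, $\mathrm{Cliff}_q(\lambda)$ has an obvious $\mathbb{Z}_2$-grading (and thus a superalgebra structure) by assuming that the $t_{\bar i}$ are odd. More precisely, $\mathrm{Cliff}_q(\lambda)_{\bar 0}$ is spanned by 1 and the monomials $t_{\bar i_1} \cdots t_{\bar i_{2k}}$ of even degree, while $\mathrm{Cliff}_q(\lambda)_{\bar 1}$ is spanned by those of odd degree.

In this section we will describe the structure of $\mathrm{Cliff}_q(\lambda)$ and will classify its irreducible modules. Because of its superalgebra structure, $\mathrm{Cliff}_q(\lambda)$ has both $\mathbb{Z}_2$-graded and nongraded modules, and both cases will be addressed. The results in this section may be derived from more general statements about quadratic forms and Clifford superalgebras over arbitrary fields (see, for example, [Lam] and [Sh]). For the sake of completeness we will give an outline of the proofs. The results and the proofs in this section will also help us to describe explicitly the action of $U_q^0$ on the highest weight vectors of an irreducible highest weight module over $U_q(\mathfrak{g})$. This is demonstrated in Example 3.10 for the case $n = 3$ and $\lambda = (4, 2, 1)$.

In this section, we fix $V := \bigoplus_{i=1}^{n} \mathbb{F} t_{\bar i}$ and $\Delta := (\Delta_1, \dots, \Delta_n) \in \mathbb{F}^n$ and denote by $B_\Delta : V \times V \to \mathbb{F}$ the symmetric bilinear form defined by $B_\Delta(t_{\bar i}, t_{\bar j}) = \delta_{ij} \Delta_i$. Let $\mathrm{Cliff}_q(\Delta)$ be the Clifford algebra associated to $V$ and $B_\Delta$, which is unique up to isomorphism. If $\Delta_i = \dfrac{q^{2\lambda_i} - q^{-2\lambda_i}}{q^2 - q^{-2}}$, then we have $\mathrm{Cliff}_q(\Delta) \cong \mathrm{Cliff}_q(\lambda)$. Define $V(\Delta) := V / \ker B_\Delta$, where $\ker B_\Delta := \{v \in V \mid B_\Delta(v, u) = 0 \text{ for every } u \in V\}$, and denote by $\beta_\Delta$ the restriction of $B_\Delta$ on $V(\Delta)$. Let $N_\Delta = \{i \mid \Delta_i \neq 0\}$, $Z_\Delta = \{j \mid \Delta_j = 0\}$, and $|\Delta| = \# N_\Delta$. Set $\Delta_N := (\Delta_{i_1}, \dots, \Delta_{i_{|\Delta|}})$, $0_Z :=$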
$(\Delta_{j_1}, \dots, \Delta_{j_{n - |\Delta|}}) = (0, \dots, 0)$, where $N_\Delta = \{i_1, \dots, i_{|\Delta|}\}$, $Z_\Delta = \{j_1, \dots, j_{n - |\Delta|}\}$, and $i_1 < \cdots < i_{|\Delta|}$. It is clear that $\ker B_\Delta = \bigoplus_{j \in Z_\Delta} \mathbb{F} t_{\bar j}$, that $V(\Delta) \cong \bigoplus_{i \in N_\Delta} \mathbb{F} t_{\bar i}$, and that $\mathrm{Cliff}_q(\Delta_N)$ is the Clifford algebra corresponding to $(V(\Delta), \beta_\Delta)$. Furthermore,
$$\mathrm{Cliff}_q(\Delta) \cong \mathrm{Cliff}_q(\Delta_N) \otimes_{\mathbb{F}} \mathrm{Cliff}_q(0_Z) \cong \mathrm{Cliff}_q(\Delta_N) \otimes_{\mathbb{F}} \Lambda \ker B_\Delta.$$
Here $\Lambda W$ denotes the exterior algebra of the vector space $W$. Thanks to the above isomorphisms, every $\mathrm{Cliff}_q(\Delta)$-module can be considered as a $\mathrm{Cliff}_q(\Delta_N)$-module under the embedding $\mathrm{Cliff}_q(\Delta_N) = \mathrm{Cliff}_q(\Delta_N) \otimes_{\mathbb{F}} 1 \to \mathrm{Cliff}_q(\Delta_N) \otimes_{\mathbb{F}} \mathrm{Cliff}_q(0_Z)$. The class $\overline{\mathrm{disc}(\Delta)}$ of $\mathrm{disc}(\Delta) = \prod_{i \in N_\Delta} \Delta_i$ in $\dot{\mathbb{F}} / \dot{\mathbb{F}}^2$ is called the discriminant of $(V, B_\Delta)$.

The following lemma is standard and the proof is left to the reader.

Lemma 3.1. Let $M$ be an irreducible $\mathrm{Cliff}_q(\Delta)$-module. Then $M$ is an irreducible $\mathrm{Cliff}_q(\Delta_N)$-module and $t_{\bar i} v = 0$ for every $i \in Z_\Delta$ and $v \in M$. Conversely, if $M_0$ is an irreducible $\mathrm{Cliff}_q(\Delta_N)$-module, then $M_0$ considered as a $\mathrm{Cliff}_q(\Delta)$-module with trivial action of $\mathrm{Cliff}_q(0_Z)$ is irreducible as well.

Since our goal in this section is to classify the irreducible representations of $\mathrm{Cliff}_q(\Delta)$, thanks to the above lemma, we may assume that the $\Delta_i$ are nonzero. So, for simplicity we fix $Z_\Delta = \emptyset$, and thus $B_\Delta = \beta_\Delta$ and $V(\Delta) = V$, in all statements preceding Corollary 3.9.

Recall that a vector $v$ in $V$ is called $\beta_\Delta$-isotropic (or simply isotropic) if $\beta_\Delta(v, v) = 0$. A subspace $W$ of $V$ is a $\beta_\Delta$-isotropic subspace if $\beta_\Delta(u, w) = 0$ for every $u$ and $w$ in $W$. A subspace $W$ of $V$ is anisotropic if it contains no nonzero $\beta_\Delta$-isotropic vector. An isotropic subspace $W$ of $V$ is maximal isotropic if there is no larger $\beta_\Delta$-isotropic subspace containing $W$.

Lemma 3.2. Let $W$ be an isotropic subspace of $V$. Then there exist an isotropic subspace $W^*$ and a subspace $Z$ of $V$ such that
$$V = Z \oplus W \oplus W^*, \quad \dim W = \dim W^*, \quad \beta_\Delta(z, w) = \beta_\Delta(z, w^*) = 0 \ \text{ for every } z \in Z,\ w \in W,\ w^* \in W^*.$$
Moreover, there exist bases $\{w_1, \dots, w_m\}$ and $\{w_1^*, \dots, w_m^*\}$ of $W$ and $W^*$, respectively, such that $\beta_\Delta(w_i, w_j^*) = \delta_{ij}$.

Proof. The lemma follows by induction on $\dim W$. If $\dim W = 1$, then $W^*$ is spanned by $w_1^* = x - \frac{1}{2} \beta_\Delta(x, x) w_1$, where $x \in V$ is arbitrarily chosen so that $\beta_\Delta(w_1, x) = 1$. Then we define $Z$ to be $Z = \{z \in V \mid \beta_\Delta(z, w_1) = \beta_\Delta(z, w_1^*) = 0\}$. For the complete proof, see [Sh, Lemma 1.3].

The decomposition $V = Z \oplus W \oplus W^*$ in Lemma 3.2 is called a weak Witt decomposition of $V$. For any weak Witt decomposition $V = Z \oplus W \oplus W^*$, we denote by $\mathrm{Cliff}_q(\Delta_Z)$ the Clifford algebra corresponding to $(Z, \beta_\Delta|_Z)$. If $V = Z \oplus W \oplus W^*$ is a weak Witt decomposition for which $Z$ is anisotropic (or, equivalently, $W$ is maximal isotropic), we call it a Witt decomposition. We may identify $W^*$ with the dual space of $W$ via the nondegenerate form $\beta_\Delta$. If $V = Z \oplus W \oplus W^*$ is a Witt decomposition, the dimension of $W$ is an invariant of $(V, \beta_\Delta)$ (see [Sh, Lemma 1.4]) and is known as the
Witt index of the form $\beta_\Delta$. We say that the Witt index is maximal if $\dim Z \le 1$. Recall that if the ground field is $\mathbb{C}$, the Witt index is always maximal. In the case of arbitrary $\mathbb{F}$ though, the Witt index is generally not maximal, as we verify in Lemma 3.6. In order to find a Witt decomposition and the Witt index of $(V, \beta_\Delta)$ we need some preparatory statements.

Lemma 3.3. Let $V = Z \oplus W \oplus W^*$ be a weak Witt decomposition and let $m = 2^{\dim W}$. Then $\mathrm{Cliff}_q(\Delta) \cong \mathrm{Mat}_m(\mathrm{Cliff}_q(\Delta_Z))$. Moreover, we have
$$\mathrm{Cliff}_q(\Delta)_{\bar 0} \cong \begin{cases} \mathrm{Mat}_m(\mathrm{Cliff}_q(\Delta_Z)_{\bar 0}) & \text{if } Z \neq 0, \\ \mathrm{Mat}_{m/2}(\mathbb{F}) \oplus \mathrm{Mat}_{m/2}(\mathbb{F}) & \text{if } Z = 0. \end{cases}$$

Proof. For the complete proof, see [Sh, Theorem 2.6]. The proof follows by induction on $\dim W$. We sketch the proof for $\dim W = 1$. In this case there is an isomorphism $\Phi : \mathrm{Cliff}_q(\Delta) \to \mathrm{Mat}_m(\mathrm{Cliff}_q(\Delta_Z))$ defined by its restriction $\Phi|_V$ on $V$:
$$z + r w_1 + s w_1^* \mapsto \begin{pmatrix} z & r \\ s & -z \end{pmatrix}.$$
Notice that if $Z \neq 0$, $\Phi$ is not necessarily parity preserving. In such a case we choose the isomorphism $\Phi' : \mathrm{Cliff}_q(\Delta) \to \mathrm{Mat}_m(\mathrm{Cliff}_q(\Delta_Z))$ defined by $\Phi'(\alpha) = D^{-1} \Phi(\alpha) D$, where $D = \begin{pmatrix} g & 0 \\ 0 & 1 \end{pmatrix}$ and $g$ is any element of $Z$ with $\beta_\Delta(g, g) \neq 0$.

Lemma 3.4. The nondegenerate Legendre's equation always has a nontrivial solution in $\mathbb{F}$: for every nonzero $A, B, C$ in $\mathbb{F}$, there exist $X, Y, Z \in \mathbb{F}$ with $(X, Y, Z) \neq (0, 0, 0)$ such that $A X^2 + B Y^2 + C Z^2 = 0$.

Proof. We modify the proof of the classical Legendre's Theorem (see, for example, [IR, §17.3]). We first assume that $A, B, C, X, Y, Z$ are polynomials in $\mathbb{C}[q]$, where $A, B, C$ are square free. We may fix $C = -1$, since if $(X, Y, Z)$ is a solution of $AC X^2 + BC Y^2 = Z^2$, then $(X, Y, \sqrt{-1}\, Z / C)$ is a solution of $A X^2 + B Y^2 + C Z^2 = 0$. We prove that $A X^2 + B Y^2 = Z^2$ has a nontrivial solution by induction on $N := \max\{\deg A, \deg B\}$. If $N = 0$, i.e., $A$ and $B$ are constant polynomials, then $A X^2 + B Y^2 = Z^2$ has a solution (in constant polynomials). Assume that $\deg B \le \deg A$ and $\deg A \ge 1$. Recall that every polynomial $R \in \mathbb{C}[q]$ is a quadratic residue modulo any square free polynomial $S$. Indeed, if $S$ is constant, our assertion is obvious. Otherwise, let $S(q) = \prod_{i=1}^{r} (q - z_i)$ with $z_i \neq z_j$ for $i \neq j$, and let $y_i \in \mathbb{C}$ be such that $y_i^2 = R(z_i)$. Then $y_i^2 \equiv R \pmod{(q - z_i)}$. Using the Chinese Remainder Theorem, we find $y \in \mathbb{C}[q]$ for which $y \equiv y_i \pmod{(q - z_i)}$. But then $y^2 \equiv R \pmod{(q - z_i)}$ for every $i$ and thus $y^2 \equiv R \pmod{S}$. We fix $C_1$ with $\deg C_1 < \deg A$ such that $C_1^2 \equiv B \pmod{A}$. Then $C_1^2 - B = AT = A A_1 M^2$ for some square free polynomial $A_1$. Since $\deg A + \deg A_1 \le \deg(A A_1 M^2) = \deg(C_1^2 - B) < 2 \deg A$, we have $0 \le \deg A_1 < \deg A$. Now we observe that if $(X_1, Y_1, Z_1)$ is a solution of $A_1 X^2 + B Y^2 = Z^2$, then $(A_1 X_1 M,\ C_1 Y_1 + Z_1,\ Z_1 C_1 + B Y_1)$ is a solution of $A X^2 + B Y^2 = Z^2$. Using the induction hypothesis, we complete the proof.

Remark. Lemma 3.4 may be proved with a standard algebro-geometric argument using dimensions, see, for example, [Har, Exercise 11.6]. The lemma is also a particular case
of the following theorem of Tsen-Lang: if $K$ is a field of transcendence degree $n$ over an algebraically closed field $k$, then any quadratic form over $K$ of dimension bigger than $2^n$ is isotropic. For details, see [Lam, Chap. XI].

In what follows, we assume $\Delta_i = \dfrac{q^{2\lambda_i} - q^{-2\lambda_i}}{q^2 - q^{-2}}$. For simplicity, we will write $\beta_\lambda$, $|\lambda|$, and $\mathrm{disc}(\lambda)$ for $\beta_\Delta$, $|\Delta|$, and $\mathrm{disc}(\Delta)$, respectively. The following technical lemma can be easily verified.
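The descent step at the end of the proof of Lemma 3.4 rests on a polynomial identity that can be verified by direct expansion. The computation below is a worked check, using only the relation $A A_1 M^2 = C_1^2 - B$ from that proof; it is valid for arbitrary $X_1, Y_1, Z_1$, not only for solutions.
$$\begin{aligned}
& A (A_1 X_1 M)^2 + B (C_1 Y_1 + Z_1)^2 - (Z_1 C_1 + B Y_1)^2 \\
&= (C_1^2 - B) A_1 X_1^2 + B C_1^2 Y_1^2 + B Z_1^2 - C_1^2 Z_1^2 - B^2 Y_1^2 \\
&= (C_1^2 - B) \bigl( A_1 X_1^2 + B Y_1^2 - Z_1^2 \bigr).
\end{aligned}$$
Here the mixed terms $\pm 2 B C_1 Y_1 Z_1$ cancel in the first step. Hence the right-hand side vanishes whenever $A_1 X_1^2 + B Y_1^2 = Z_1^2$, which is exactly the claim that $(A_1 X_1 M,\ C_1 Y_1 + Z_1,\ Z_1 C_1 + B Y_1)$ solves $A X^2 + B Y^2 = Z^2$.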
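Lemma 3.5. Define an equivalence relation $\sim$ on $\{\lambda_i \mid i = 1, \dots, n\}$ by $\lambda_i \sim \lambda_j$ if $\lambda_i^2 = \lambda_j^2$, and denote by $o(\lambda_i)$ the orbit of $\lambda_i$ relative to $\sim$. Then $\overline{\mathrm{disc}(\lambda)} = \bar 1$ (or, equivalently, $\mathrm{disc}(\lambda)$ is a square in $\mathbb{F}$) if and only if the orbit $o(\lambda_i)$ of every $\lambda_i \neq \pm 1$ contains an even number of elements.

Lemma 3.6. The space $V$ is anisotropic if and only if $\dim V = 1$, or $\dim V = 2$ and $\overline{\mathrm{disc}(\lambda)} \neq \bar 1$. If $V$ is isotropic, there is a Witt decomposition $V = Z \oplus W \oplus W^*$ of $V$ such that
(1) $\dim W = k$ if $\dim V = 2k + 1$, $k \ge 1$ (maximal Witt index);
(2) $\dim W = k - 1$ if $\dim V = 2k$ and $\overline{\mathrm{disc}(\lambda)} \neq \bar 1$;
(3) $\dim W = k$ if $\dim V = 2k$ and $\overline{\mathrm{disc}(\lambda)} = \bar 1$ (maximal Witt index).
In particular, if $\lambda_1 > \lambda_2 > \cdots > \lambda_n > 0$, then $\dim W = \left\lfloor \frac{n-1}{2} \right\rfloor$.

Proof. The proof consists of several steps.

Step 1. The case $\dim V = 1$. This case is straightforward.

Step 2. The case $\dim V = 2$. In this case, $v = a_1 t_{\bar 1} + a_2 t_{\bar 2}$ is $\beta_\lambda$-isotropic if and only if $a_1^2 \Delta_1 + a_2^2 \Delta_2 = 0$. The latter equation has a nontrivial solution for $a_1$ and $a_2$ if and only if $\frac{\Delta_1}{\Delta_2}$ is a square (or, equivalently, $\Delta_1 \Delta_2$ is a square).

Step 3. If $\dim V \ge 3$, then $V \cong \mathbb{F} w \oplus \mathbb{F} w^* \oplus \mathbb{F} v_3 \oplus \cdots \oplus \mathbb{F} v_n$, where $\beta_\lambda(w, w) = \beta_\lambda(w^*, w^*) = \beta_\lambda(w, v_i) = \beta_\lambda(w^*, v_i) = 0$ for $i \ge 3$, $\beta_\lambda(w, w^*) = 1$, $\beta_\lambda(v_3, v_3) = \Delta_1 \Delta_2 \Delta_3$, and $\beta_\lambda(v_i, v_i) = \Delta_i$ if $i \ge 4$.

Let us first consider the case $\dim V = 3$. We use Lemma 3.4 to find $w = x_1 t_{\bar 1} + x_2 t_{\bar 2} + x_3 t_{\bar 3}$ such that $\beta_\lambda(w, w) = 0$. Applying Lemma 3.2 to $W = \mathbb{F} w$, we find $w^* = y_1 t_{\bar 1} + y_2 t_{\bar 2} + y_3 t_{\bar 3}$ and $z = z_1 t_{\bar 1} + z_2 t_{\bar 2} + z_3 t_{\bar 3}$ such that $\beta_\lambda(w^*, w^*) = \beta_\lambda(w^*, z) = \beta_\lambda(w, z) = 0$ and $\beta_\lambda(w, w^*) = 1$. The choice of $z$ is unique up to multiplication by a nonzero constant in $\mathbb{F}$. A simple calculation shows that the $z_i$ may be chosen as follows:
$$\begin{aligned}
z_1 &= \sqrt{-1}\, \Delta_2 \Delta_3 (x_2 y_3 - x_3 y_2), \\
z_2 &= \sqrt{-1}\, \Delta_1 \Delta_3 (x_3 y_1 - x_1 y_3), \\
z_3 &= \sqrt{-1}\, \Delta_1 \Delta_2 (x_1 y_2 - x_2 y_1).
\end{aligned}$$
Then one can easily verify that $\beta_\lambda(z, z) = \Delta_1 \Delta_2 \Delta_3$.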
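In the case $\dim V > 3$, write $V = \mathbb{F} t_{\bar 1} \oplus \mathbb{F} t_{\bar 2} \oplus \mathbb{F} t_{\bar 3} \oplus \bigoplus_{i \ge 4} \mathbb{F} t_{\bar i}$. Fix $w, w^*, z \in \mathbb{F} t_{\bar 1} \oplus \mathbb{F} t_{\bar 2} \oplus \mathbb{F} t_{\bar 3}$ as above, and set $v_3 = z$ and $v_i = t_{\bar i}$ for $i \ge 4$.

Step 4. If $\dim V \ge 3$, then $V$ has a Witt decomposition $V \cong Z \oplus W \oplus W^*$, where
$$\dim Z = \begin{cases} 0 & \text{if } \dim V \text{ is even and } \Delta_1 \Delta_2 \cdots \Delta_n \text{ is a square}, \\ 1 & \text{if } \dim V \text{ is odd}, \\ 2 & \text{if } \dim V \text{ is even and } \Delta_1 \Delta_2 \cdots \Delta_n \text{ is not a square}. \end{cases}$$
This follows from an inductive argument using Step 1, Step 2, and Step 3.

Lemma 3.7. (1) Assume that $\dim V = 1$. Then
$$\mathrm{Cliff}_q(\lambda) \cong \begin{cases} Q_1(\mathbb{F}) & \text{if } \overline{\mathrm{disc}(\lambda)} = \bar 1 \text{ (equivalently, } \Delta_1 \text{ is a square in } \mathbb{F}), \\ \mathbb{F}(\sqrt{\Delta_1}) & \text{if } \overline{\mathrm{disc}(\lambda)} \neq \bar 1 \text{ (equivalently, } \Delta_1 \text{ is not a square in } \mathbb{F}). \end{cases}$$
(2) Assume that $\dim V = 2$. Then $\mathrm{Cliff}_q(\lambda) \cong \mathrm{Mat}_2(\mathbb{F})$ as (nongraded) algebras and $\mathrm{Cliff}_q(\lambda)_{\bar 0} \cong \mathrm{Cliff}_q(\Delta_1 \Delta_2)$.

Proof. The case (1) corresponds to the "classical case" (Clifford superalgebra over $\mathbb{C}$) and can be easily verified. (2) Let $A = \mathrm{Cliff}_q(\lambda)$. Then $A$ is a quaternion algebra over $\mathbb{F}$. Since it is not a division algebra, by Wedderburn's Theorem, we have $A \cong \mathrm{Mat}_2(\mathbb{F})$ (see [Lam, Theorem 2.7] for details). The isomorphism $A_{\bar 0} \cong \mathrm{Cliff}_q(\Delta_1 \Delta_2)$ is straightforward.

Remark. The superalgebraic structure of $\mathrm{Cliff}_q(\lambda)$ for $\dim V = 2$ is "explicit" only when $\overline{\mathrm{disc}(\lambda)} = \bar 1$. In this case, one can show that $\mathrm{Cliff}_q(\lambda) \cong \mathrm{sMat}_{1|1}(\mathbb{F})$.

We are now ready to describe the superalgebra structure of $\mathrm{Cliff}_q(\lambda)$.

Proposition 3.8. (1) If $n$ is even, then $\mathrm{Cliff}_q(\lambda) \cong \mathrm{Mat}_r(A)$, where $A = \mathrm{Cliff}_q((\mathrm{disc}(\lambda), 1))$ and $r = 2^{\frac{n}{2} - 1}$. Furthermore, $\mathrm{Cliff}_q(\lambda) \cong \mathrm{Mat}_{2r}(\mathbb{F})$ as (nongraded) algebras and
$$\mathrm{Cliff}_q(\lambda)_{\bar 0} \cong \begin{cases} \mathrm{Mat}_r(\mathbb{F}) \oplus \mathrm{Mat}_r(\mathbb{F}) & \text{if } \overline{\mathrm{disc}(\lambda)} = \bar 1, \\ \mathrm{Mat}_r(\mathbb{F}(\sqrt{\mathrm{disc}(\lambda)})) & \text{if } \overline{\mathrm{disc}(\lambda)} \neq \bar 1. \end{cases}$$
(2) If $n$ is odd, then $\mathrm{Cliff}_q(\lambda) \cong \mathrm{Mat}_r(B)$, where $B = \mathrm{Cliff}_q(\mathrm{disc}(\lambda))$ and $r = 2^{\frac{n-1}{2}}$. Furthermore,
$$\begin{aligned}
& \mathrm{Cliff}_q(\lambda) \cong Q_r(\mathbb{F}), \quad \mathrm{Cliff}_q(\lambda)_{\bar 0} \cong \mathrm{Mat}_r(\mathbb{F}) \quad \text{if } \overline{\mathrm{disc}(\lambda)} = \bar 1, \\
& \mathrm{Cliff}_q(\lambda) \cong \mathrm{Mat}_r(\mathbb{F}(\sqrt{\mathrm{disc}(\lambda)})), \quad \mathrm{Cliff}_q(\lambda)_{\bar 0} \cong \mathrm{Mat}_r(\mathbb{F}) \quad \text{if } \overline{\mathrm{disc}(\lambda)} \neq \bar 1.
\end{aligned}$$
In particular, $\mathrm{Cliff}_q(\lambda)$ is a simple superalgebra which is isomorphic to
• a direct sum of two isomorphic simple algebras if $n$ is odd and $\overline{\mathrm{disc}(\lambda)} = \bar 1$;
• a simple algebra otherwise.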
Highest Weight Modules Over the Quantum Queer Superalgebra Uq (q(n))
Proof. We first consider the case when n is even and let r = 2 2 −1 . If 1 . . . n is a ¯ square in F, then (1) is proved by Lemma 3.6 (3) and Lemma 3.3. Now if (λ) = 1, ∼ by Lemma 3.3 and Step 3 in the proof of Lemma 3.6, we have Cliff q ( ) = Matr (A), where A = Cliff q ( 1 . . . n−1 , n ). We now apply Lemma 3.7 (1),(2) and prove (1). n−1 Next, assume that n is odd and let r = 2 2 . By Lemma 3.3 and Step 3 in the proof ∼ of Lemma 3.6, we have Cliff q (λ) = Matr (B), where B is the 2-dimensional Clifford superalgebra Cliff q ( 1 . . . n ). We use Lemma 3.7 (1) to complete the proof. In the statement of the following corollary we allow λi to be zero for some i. Recall that |λ| is the number of nonzero λi . We also set λ N := (λi1 , . . . , λi|λ| ), where Nλ = {i 1 , . . . , i |λ| } and i 1 < · · · < i |λ| . Corollary 3.9. Every Z2 -graded Cliff q (λ N )-module is completely reducible. Furthermore, the superalgebra Cliff q (λ) has up to isomorphism (1) two simple modules E q (λ) and (E q (λ)) of dimension 2k−1 |2k−1 if |λ| = 2k and ¯ (λ) = 1; (2) one simple module E q (λ) ∼ = (E q (λ)) of dimension 2k |2k if |λ| = 2k and (λ) = 1¯ (in particular, if λ1 > · · · . > λ2k > 0); (3) one simple module E q (λ) ∼ = (E q (λ)) of dimension 2k |2k if |λ| = 2k + 1. Proof. Thanks to Lemma 3.1, we may assume that λi = 0; i.e., |λ| = n. The category of all Z2 -graded Cliff q (λ)-modules is equivalent to the category of all nongraded Cliff q (λ)0¯ -modules. Indeed, the reverse correspondence is obtained by V0 → Cliff q (λ) ⊗Cliff q (λ)0¯ V0 . The corollary follows from Proposition 3.8 and the characterization of the simple and indecomposable (nongraded) modules of Matr (F) ⊕ Matr (F), Matr (F), and √ Matr (F( (λ))). (This characterization may be found, for example, in [Lang, Chap. XVII].) Example 3.10. Let n = 3 and λ = (4, 2, 1). We describe the action of ti¯ (i = 1, 2, 3) on E q (λ). We have 1 = (q 2 + q −2 )(q 4 + q −4 ), 2 = q 2 + q −2 , 3 = 1. 
For simplicity, let t = q 2 + q −2 . We first find a solution of Legendre’s equation 1 X 2 + 2 Y 2 + 3 Z 2 = 0.
√
(3.1)
We follow the proof of Lemma 3.4. Let Z = t Z and Y = −1Y . In order to solve the equation (t 2 − 2)X 2 + t Z 2 = Y 2 we find C1 ∈ C[t] for which C12 − t is a multiple of t 2 − 2. Using the Chinese Remainder Theorem, we choose √ √ 4 4 √ √ 8 2 (1 − −1)t + (1 + −1). C1 = 4 2 √ √ Then we solve the equation A1 X 12 + B Z 12 = Y12 for A1 = − 42 −1 and B = t. A solution for this is √ 4 √ 8 (1 − −1), 0). (X 1 , Y1 , Z 1 ) = (1, 4
Then (3.1) has a solution √ (A1 X 1 , −1(Y1 C1 + B Z 1 ), t (C1 Z 1 + Y1 )) √ √ √ 4 √ 1√ 2√ 2 8 t+ (1 − −1)t . −1, −1, = − 4 4 2 4 Multiplying by an appropriate constant and changing signs, we fix the following solution of (3.1): √ √ √ √ 4 w = (X, Y, Z ) = (1, −1t − 2, 2(1 + −1)t). We consider w as an element in V relative to the basis {t1¯ , t2¯ , t3¯ }. We use Lemma 3.2 to find a Witt decomposition V = Fw ⊕ Fw∗ ⊕ Fz. As mentioned in the proof of Lemma 3.2, we find w ∗ = c(X, Y, −Z ), √
where c = √−1 t −2 such that βλ (w, w ∗ ) = 1 and βλ (w ∗ , w ∗ ) = 0. Then, as pointed out 4 2 in Step 3 of the proof of Lemma 3.6, we can find z = c(tY Z , −t (t 2 − 2)X Z , 0) such that βλ (z, w) = βλ (z, w ∗ ) = 0 and βλ (z, z) = − 41 1 2 3 = − 41 t 2 (t 2 − 2). Set ! √ α = (λ) = t 2 (t 2 − 2). Using Lemma 3.3, we define an isomorphism : Cliff q (λ) → Mat 2 (F(α)) by α 0 0 0 0 α −1 . , z → w → , w∗ → 0 −α α 0 0 0 From Proposition 3.8 and Corollary 3.9 we find that E q (λ) = F(α)⊕2 . Let v1 and v2 be the standard basis vectors of the F(α)-vector space E q (λ), and let v¯i = αvi (i = 1, 2). The action of Cliff q (λ) on E q (λ) is given by z(v1 ) = v¯1 , z(v2 ) = −v¯2 , z(v¯1 ) = t 2 (t 2 − 2)v1 , z(v¯2 ) = −t 2 (t 2 − 2)v2 , w(v1 ) = 0, w(v2 ) = (t 2 (t 2 − 2))−1 v¯1 , w(v¯1 ) = 0, w(v¯2 ) = v1 , w ∗ (v1 ) = v¯2 , w ∗ (v2 ) = 0, w ∗ (v¯1 ) = t 2 (t 2 − 2)v2 , w ∗ (v¯2 ) = 0. In order to determine the action of ti¯ (i = 1, 2, 3) on E q (λ), we need to express t1¯ , t2¯ , t3¯ in terms of z, w, w ∗ . With simple computations we find: √ √ √ √ √ 4 −1 t 2 − 2 8(1 − −1) −1t − 2 2 ∗ w + t (t − 2)w + z, t1¯ = √ 2 t 4 2 t √ √ √ √ √ 4 √ √ 8( −1 − 1) 1 −1 −1t − 2 w + ( −1t 2 − 2t)w ∗ + z, t2¯ = √ t 2 t 4 2 √ √ √ 2(1 + −1)t ∗ 1 − −1 t3¯ = w . w+ √ √ 4 2 4 4 2t
4. Highest Weight Representation Theory of $U_q(\mathfrak{g})$

A $U_q(\mathfrak{g})$-module $V^q$ is called a weight module if it admits a weight space decomposition

$$V^q = \bigoplus_{\mu \in P} V^q_\mu, \quad \text{where } V^q_\mu = \{ v \in V^q \mid q^h v = q^{\mu(h)} v \text{ for all } h \in P^\vee \}.$$

For a weight $U_q(\mathfrak{g})$-module $V^q$, we set $\operatorname{wt} V^q = \{\lambda \in P \mid V^q_\lambda \neq 0\}$. By the same argument as in [HK, Ch.3], it can be verified that every submodule of a weight $U_q(\mathfrak{g})$-module is also a weight module. If $\dim_{\mathbb{C}(q)} V^q_\mu < \infty$ for all $\mu \in P$, then the character of $V^q$ is defined to be

$$\operatorname{ch} V^q = \sum_{\mu \in P} (\dim_{\mathbb{C}(q)} V^q_\mu)\, e^\mu,$$

where $e^\mu$ are formal basis elements of the group algebra $\mathbb{C}(q)[P]$ with the multiplication given by $e^\lambda e^\mu = e^{\lambda+\mu}$ for all $\lambda, \mu \in P$. A weight module $V^q$ is called a highest weight module if it is generated over $U_q(\mathfrak{g})$ by a finite dimensional irreducible $U_q^{\geq 0}$-module $\mathbf{v}^q$. Note that $\mathbf{v}^q$ also admits a weight space decomposition. We call a vector in $\mathbf{v}^q$ a highest weight vector of $V^q$. Combining Lemma 2.2 and the triangular decomposition of $U_q(\mathfrak{g})$ (Theorem 2.3), we obtain $V^q = U_q^- \mathbf{v}^q$.

Proposition 4.1. If $\mathbf{v}^q$ is a finite dimensional irreducible $U_q^{\geq 0}$-module with a weight space decomposition $\mathbf{v}^q = \bigoplus_{\mu \in P} \mathbf{v}^q_\mu$, then $\mathbf{v}^q$ is irreducible as a $U_q^0$-module and $\mathbf{v}^q = \mathbf{v}^q_\lambda$ for some $\lambda \in P$. Conversely, if $\mathbf{v}^q$ is an irreducible $U_q^0$-module on which the even part of $U_q^0$ acts by a weight $\lambda$, then $\mathbf{v}^q$ can be endowed with the structure of an irreducible $U_q^{\geq 0}$-module by letting $U_q^+$ act trivially on $\mathbf{v}^q$.
Proof. Because $\mathbf{v}^q$ is finite dimensional, there exists a weight $\lambda \in P$ such that $\mathbf{v}^q_\lambda \neq 0$ and $\mathbf{v}^q_{\lambda+\alpha_i} = 0$ for all $i \in I$. Then we have $U_q^+ \mathbf{v}^q_\lambda = \mathbf{v}^q_\lambda$ and $U_q^0 \mathbf{v}^q_\lambda = \mathbf{v}^q_\lambda$. Thus $\mathbf{v}^q_\lambda$ is a $U_q^{\geq 0}$-submodule of $\mathbf{v}^q$ and hence $\mathbf{v}^q_\lambda = \mathbf{v}^q$. The other direction is obvious from the defining relations of $U_q(\mathfrak{g})$ in Theorem 2.1.

Remark. If $\mathbf{v}^q$ is a finite dimensional irreducible $U_q^{\geq 0}$-module which generates a highest weight module $V^q$ of highest weight $\lambda$, then, by Proposition 4.1, we know that $\mathbf{v}^q$ is an irreducible $U_q^0$-module of weight $\lambda$. Thus $\mathbf{v}^q$ is a finite dimensional irreducible module over $\mathrm{Cliff}_q(\lambda) = U_q^0 / I^q(\lambda)$. Conversely, if $E^q$ is a finite dimensional irreducible $\mathrm{Cliff}_q(\lambda)$-module, then it is clear that $E^q$ is an irreducible $U_q^0$-module of weight $\lambda$. By Corollary 3.9, we know that, up to isomorphism, $\mathrm{Cliff}_q(\lambda)$ has at most two simple modules: $E^q(\lambda)$ and $\Pi(E^q(\lambda))$. The $U_q(\mathfrak{g})$-module $W^q(\lambda) = U_q(\mathfrak{g}) \otimes_{U_q^{\geq 0}} E^q(\lambda)$ is called the Weyl module of $U_q(\mathfrak{g})$ corresponding to $\lambda$ (defined up to $\Pi$).

Proposition 4.2. (1) $W^q(\lambda)$ is a free $U_q^-$-module of rank $\dim E^q(\lambda)$. (2) Every highest weight $U_q(\mathfrak{g})$-module with highest weight $\lambda$ is a homomorphic image of $W^q(\lambda)$. (3) Every Weyl module $W^q(\lambda)$ has a unique maximal submodule $N^q(\lambda)$.
Proof. (1) This is clear from the definition. (2) Let V q be a highest weight module with highest weight λ generated by the irreducible Uq≥0 -module vq . Because vq is irreducible over Cliff q (λ), it is isomorphic to E q (λ) up to . Thus the map φ : W q (λ) −→ V q induced by E q (λ) → vq is a surjective Uq (g)-module homomorphism. (3) Since E q (λ) is an irreducible Cliff q (λ)-module, any proper submodule N q of W q (λ) does notcontain highest weight vectors (the vectors in E q (λ)). That is, N q must lie in µ 1, ei¯ f i¯ = − f i¯ ei¯ + [q ki +ki+1 ; 0]q + (q − q −1 )ki ki+1 , ei+1 f i¯ = −q −1 f i¯ ei+1 , ei¯ f i+1 = −q f i+1 ei¯ , ei¯ f j¯ = − f j¯ ei¯ = 0 for |i − j| > 1. Together with Lemma 5.2, one can show that the image of the canonical isomorphism lies inside UA−1 ⊗ UA0 1 ⊗ UA+1 when restricted to UA1 . Its inverse map is given by multiplication. Hence the two spaces are isomorphic as A1 -modules. In what follows, V q is a highest weight module over Uq (g) with highest weight λ ∈ P generated by a finite dimensional irreducible Uq≥0 -submodule vq . Then vq is a finite dimensional irreducible Cliff q (λ)-module. Since it is irreducible, it is generated by a nonzero vector v ∈ (vq )0¯ ; i.e., vq = Cliff q (λ)v. Note that q 2n − q −2n = q 2n−2 + q 2n−6 + · · · + q −2n+6 + q −2n+2 ∈ A1 for n ∈ Z>0 . q 2 − q −2 We denote by CliffA1 (λ) the A1 -subalgebra of Cliff q (λ) generated by {ti¯ | i ∈ J }. Definition 5.4. Let V q be a highest weight Uq (g)-module generated by a finite dimensional irreducible Uq≥0 -module vq and let E A1 (λ) be the Cliff A1 (λ)-submodule of vq ∼ = E q (λ) generated by a nonzero element v ∈ (vq )0¯ . The A1 -form of V q is defined to be the UA1 -submodule VA1 of V q generated by E A1 (λ). In what follows, V q will denote a highest weight Uq (g)-module. Proposition 5.5. VA1 = UA−1 E A1 (λ). Proof. In view of Proposition 5.3, it suffices to show that UA+1 E A1 (λ) = E A1 (λ) and UA0 1 E A1 (λ) = E A1 (λ). 
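The first assertion is clear by the definition of highest weight modules. For the second assertion, we observe that

$$q^h w = q^{\lambda(h)} w, \qquad (q^h; 0)_q\, w = \frac{q^{\lambda(h)} - 1}{q - 1}\, w \quad \text{for all } w \in E_{\mathbf{A}_1}(\lambda).$$

Hence we obtain $V_{\mathbf{A}_1} = U_{\mathbf{A}_1} E_{\mathbf{A}_1}(\lambda) = U^-_{\mathbf{A}_1} E_{\mathbf{A}_1}(\lambda)$.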
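The identity noted before Definition 5.4, namely that $(q^{2n} - q^{-2n})/(q^2 - q^{-2})$ equals the Laurent polynomial $q^{2n-2} + q^{2n-6} + \cdots + q^{-2n+2}$, can be spot-checked with exact rational arithmetic. The following is only a minimal sketch: the function names and the sample value of $q$ are illustrative choices and are not taken from the paper. It also checks that the three values listed in Example 3.10 for $\lambda = (4, 2, 1)$ agree with this expression at $n = 4, 2, 1$, with $t = q^2 + q^{-2}$.

```python
from fractions import Fraction


def q2_integer(n, q):
    """Return (q^(2n) - q^(-2n)) / (q^2 - q^(-2)) as an exact fraction."""
    return (q ** (2 * n) - q ** (-2 * n)) / (q ** 2 - q ** (-2))


def q2_integer_expanded(n, q):
    """Return q^(2n-2) + q^(2n-6) + ... + q^(-2n+2), a sum of n terms."""
    return sum(q ** (2 * n - 2 - 4 * j) for j in range(n))


# Hypothetical sample value; any rational q with q^4 != 1 would do.
q = Fraction(3, 2)

# The quotient and the Laurent polynomial agree for n = 1, ..., 7.
for n in range(1, 8):
    assert q2_integer(n, q) == q2_integer_expanded(n, q)

# Values from Example 3.10 for lambda = (4, 2, 1), written with t = q^2 + q^-2.
t = q ** 2 + q ** (-2)
assert q2_integer(4, q) == t * (t * t - 2)
assert q2_integer(2, q) == t
assert q2_integer(1, q) == 1
```

A check at one rational value of $q$ is of course not a proof; it only guards against sign or exponent slips when the formula is transcribed.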
For each µ ∈ P, let us denote by (VA1 )µ the space VA1 ∩ Vµ . The following assertion can be proved using the same arguments as in [HK, Prop. 3.3.6]. Proposition 5.6. VA1 has the weight space decomposition VA1 = µ≤λ (VA1 )µ . Proposition 5.7. For each µ ∈ P, the weight space (VA1 )µ is a free A1 -module with q rank A1 (VA1 )µ = dimC(q) Vµ . In particular, rank A1 E A1 (λ) = dimC(q) E q (λ). Proof. Because A1 is a principal ideal domain, every finitely generated torsion free module over A1 is free. Furthermore, since C(q) is the field of quotients of the integral domain A1 , a finite subset of a C(q)-vector space is linearly independent over C(q) if q and only if it is linearly independent over A1 . Thus it is enough to show that each Vµ has q a C(q)-basis which is also contained in (VA1 )µ . The highest weight space v = E q (λ) has a linearly independent subset of {t1¯1 t2¯2 · · · tn¯ n v | j = 0 or 1} which generates E q (λ) over C(q), since E q (λ) = Cliff q (λ)v. By definition, this subset is contained in E A1 (λ). q q For Vµ , it is easy to show that there is a basis of Vµ whose elements are of the form n 1 2 f ζ t1¯ t2¯ · · · tn¯ v, where f ζ are monomials in f i and f j¯ . This basis is also contained in (VA1 )µ , which proves the proposition. Corollary 5.8. The map φ : C(q) ⊗A1 VA1 −→ V q given by f ⊗ v −→ f v ( f ∈ C(q), v ∈ VA1 ) is a C(q)-linear isomorphism. Let J1 be the ideal of A1 generated by q − 1. Then there is a canonical isomorphism of fields ∼
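$$\mathbf{A}_1/\mathbf{J}_1 \xrightarrow{\ \sim\ } \mathbb{C} \quad \text{given by} \quad f(q) + \mathbf{J}_1 \longmapsto f(1).$$

Define the $\mathbb{C}$-linear vector spaces

$$U_1 = (\mathbf{A}_1/\mathbf{J}_1) \otimes_{\mathbf{A}_1} U_{\mathbf{A}_1}, \qquad V^1 = (\mathbf{A}_1/\mathbf{J}_1) \otimes_{\mathbf{A}_1} V_{\mathbf{A}_1}.$$

Then $V^1$ is naturally a $U_1$-module. Note that $U_1 \cong U_{\mathbf{A}_1}/\mathbf{J}_1 U_{\mathbf{A}_1}$ and $V^1 \cong V_{\mathbf{A}_1}/\mathbf{J}_1 V_{\mathbf{A}_1}$. We use the bar notation for the images under these maps. The passage under these maps is referred to as taking the classical limit. Since $V_{\mathbf{A}_1} = U_{\mathbf{A}_1} E_{\mathbf{A}_1}(\lambda)$, we have:

$$V^1 \cong V_{\mathbf{A}_1}/\mathbf{J}_1 V_{\mathbf{A}_1} = U_{\mathbf{A}_1} E_{\mathbf{A}_1}(\lambda) / \mathbf{J}_1 U_{\mathbf{A}_1} E_{\mathbf{A}_1}(\lambda) = (U_{\mathbf{A}_1}/\mathbf{J}_1 U_{\mathbf{A}_1}) \cdot (E_{\mathbf{A}_1}(\lambda)/\mathbf{J}_1 E_{\mathbf{A}_1}(\lambda)).$$

Hence $V^1$ is generated by $E_{\mathbf{A}_1}(\lambda)/\mathbf{J}_1 E_{\mathbf{A}_1}(\lambda)$ over $U_1$. For each $\mu \in P$, denote by $V^1_\mu$ the space $(\mathbf{A}_1/\mathbf{J}_1) \otimes_{\mathbf{A}_1} (V_{\mathbf{A}_1})_\mu \cong (V_{\mathbf{A}_1})_\mu / \mathbf{J}_1 (V_{\mathbf{A}_1})_\mu$.

Proposition 5.9. (1) $V^1 = \bigoplus_{\mu \leq \lambda} V^1_\mu$. (2) For each $\mu \in P$, $\dim_{\mathbb{C}} V^1_\mu = \operatorname{rank}_{\mathbf{A}_1} (V_{\mathbf{A}_1})_\mu$.

Proof. The first assertion follows from Proposition 5.6. Using the same argument as in [HK, Lemma 3.4.1], we can prove the second assertion.

Let $\bar{h} \in U_1$ be the classical limit of $(q^h; 0)_q \in U_{\mathbf{A}_1}$. Using [HK, Lemma 3.4.3], we have:

Lemma 5.10. (1) For all $h \in P^\vee$, we have $\overline{q^h} = 1$. (2) For any $h, h' \in P^\vee$, $\overline{h + h'} = \bar{h} + \bar{h'}$.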
Theorem 5.11. (1) The elements $\bar{e}_i$, $\bar{e}_{\bar{i}}$, $\bar{f}_i$, $\bar{f}_{\bar{i}}$ $(i \in I)$, $\bar{k}_{\bar{l}}$ $(l \in J)$ and $\bar{h}$ $(h \in P^\vee)$ satisfy the defining relations of $U(\mathfrak{g})$. Hence there exists a surjective $\mathbb{C}$-algebra homomorphism $\psi : U(\mathfrak{g}) \to U_1$ and the $U_1$-module $V^1$ has a $U(\mathfrak{g})$-module structure. (2) For each $\mu \in P$ and $h \in P^\vee$, the element $\bar{h}$ acts on $V^1_\mu$ as scalar multiplication by $\mu(h)$. So $V^1_\mu$ is the $\mu$-weight space of the $U(\mathfrak{g})$-module $V^1$. (3) There is an isomorphism $\mathrm{Cliff}(\lambda) \xrightarrow{\ \sim\ } \mathrm{Cliff}_1(\lambda) := \mathrm{Cliff}_{\mathbf{A}_1}(\lambda)/\mathbf{J}_1 \mathrm{Cliff}_{\mathbf{A}_1}(\lambda)$. (4) As a $U(\mathfrak{g})$-module, $V^1$ is a highest weight module or the sum of two highest weight modules with highest weight $\lambda \in P$.

Proof. (1) The first relation for $U(\mathfrak{g})$ is trivial. Since

$$(q^h; 0)_q\, e_i - e_i\, (q^h; 0)_q = e_i\, (q^h; \alpha_i(h))_q - e_i\, (q^h; 0)_q = \frac{q^{\alpha_i(h)} - 1}{q - 1}\, e_i q^h,$$

we obtain $[\bar{h}, \bar{e}_i] = \alpha_i(h) \bar{e}_i$ by letting $q \to 1$. Similarly, $[\bar{h}, \bar{e}_{\bar{i}}] = \alpha_i(h) \bar{e}_{\bar{i}}$, $[\bar{h}, \bar{f}_i] = -\alpha_i(h) \bar{f}_i$, $[\bar{h}, \bar{f}_{\bar{i}}] = -\alpha_i(h) \bar{f}_{\bar{i}}$ and $[\bar{h}, \bar{k}_{\bar{l}}] = 0$. We have

$$e_i f_i - f_i e_i = [q^{h_i}; 0]_q = \frac{q}{q+1}\,(1 + q^{-h_i})\,(q^{h_i}; 0)_q.$$

Taking the classical limit of both sides above leads to $\bar{e}_i \bar{f}_i - \bar{f}_i \bar{e}_i = \frac{1}{2} \cdot 2\bar{h}_i = \bar{h}_i$. Also

$$k_{\bar{i}}^2 = [q^{2k_i}; 0]_{q^2} = q^2\, \frac{q^2 - 1}{q^4 - 1}\, \frac{1}{q+1}\, (1 + q^{-2k_i})\, (q^{2k_i}; 0)_q.$$

When we take $q \to 1$, we obtain $\bar{k}_{\bar{i}}^2 = \bar{k}_i$. Since we can obtain the following relations in $U(\mathfrak{g})$ by the Jacobi identity,

$$[e_{\bar{i}}, [e_i, e_j]] = [[e_{\bar{i}}, e_i], e_j] + [e_i, [e_{\bar{i}}, e_j]] = [e_i, [e_{\bar{i}}, e_j]] \quad \text{for } |i - j| = 1,$$

in order to prove the corresponding relations in $U_1$, it suffices to show that $[\bar{e}_i, [\bar{e}_{\bar{i}}, \bar{e}_j]] = 0$. The latter relation can be checked easily by letting $q \to 1$. The rest of the relations can be derived in a similar manner. Therefore, there exists a surjective algebra homomorphism $\psi : U(\mathfrak{g}) \to U_1$ defined by $e_i \mapsto \bar{e}_i$, $e_{\bar{i}} \mapsto \bar{e}_{\bar{i}}$, $f_i \mapsto \bar{f}_i$, $f_{\bar{i}} \mapsto \bar{f}_{\bar{i}}$, $h \mapsto \bar{h}$, $k_{\bar{l}} \mapsto \bar{k}_{\bar{l}}$ $(i \in I,\ l \in J)$, which can be used to define a $U(\mathfrak{g})$-module structure on $V^1$.

(2) For $v \in (V_{\mathbf{A}_1})_\mu$ and $h \in P^\vee$, we have

$$(q^h; 0)_q\, v = \frac{q^{\mu(h)} - 1}{q - 1}\, v.$$

Taking the classical limit of both sides yields our assertion.
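The two factorizations used in part (1) of this proof, and the limits they lead to, can be sanity-checked on a single weight vector, where $q^h$ acts as the scalar $q^m$ for an integer $m$. The sketch below is illustrative only. It assumes the conventions $(q^h; 0)_q = (q^h - 1)/(q - 1)$, which matches the action displayed in the text, and $[q^h; 0]_q = (q^h - q^{-h})/(q - q^{-1})$, which is inferred from the factorization stated in the proof rather than quoted from a definition. All function names are hypothetical.

```python
from fractions import Fraction


def round_bracket(m, q):
    """Scalar by which (q^h; 0)_q acts when q^h acts as q^m."""
    return (q ** m - 1) / (q - 1)


def square_bracket(m, q):
    """Scalar by which [q^h; 0]_q acts when q^h acts as q^m (assumed convention)."""
    return (q ** m - q ** (-m)) / (q - q ** (-1))


def ef_factored(m, q):
    """The factored form q/(q+1) * (1 + q^-h) * (q^h; 0)_q."""
    return q / (q + 1) * (1 + q ** (-m)) * round_bracket(m, q)


def k_square(m, q):
    """Scalar of [q^(2k); 0]_{q^2} when q^k acts as q^m (assumed convention)."""
    return (q ** (2 * m) - q ** (-2 * m)) / (q ** 2 - q ** (-2))


def k_square_factored(m, q):
    """q^2 * (q^2-1)/(q^4-1) * 1/(q+1) * (1 + q^-2k) * (q^2k; 0)_q."""
    return (q ** 2 * (q ** 2 - 1) / (q ** 4 - 1) / (q + 1)
            * (1 + q ** (-2 * m)) * round_bracket(2 * m, q))


q = Fraction(5, 4)                      # hypothetical sample value
q_near_one = 1 + Fraction(1, 10 ** 6)   # stands in for the limit q -> 1
tolerance = Fraction(1, 10 ** 4)

for m in range(-4, 5):
    # Exact agreement of each bracket with its factored form.
    assert square_bracket(m, q) == ef_factored(m, q)
    assert k_square(m, q) == k_square_factored(m, q)
    # Near q = 1 each scalar is close to the classical eigenvalue m.
    assert abs(round_bracket(m, q_near_one) - m) < tolerance
    assert abs(square_bracket(m, q_near_one) - m) < tolerance
    assert abs(k_square(m, q_near_one) - m) < tolerance
```

Evaluating close to $q = 1$ only approximates the classical limit; the exact statement is the one obtained by passing to $\mathbf{A}_1/\mathbf{J}_1$ as in the text.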
(3) Note that ti¯ t j¯ + t j¯ ti¯ = 2δi j λi in Cliff 1 (λ) and Cliff(λ) is the associative C-algebra with 1 generated by {ki¯ | i ∈ J } with defining relations ki¯ k j¯ + k j¯ ki¯ = 2δi j λi . Thus we have a surjective C-algebra homomorphism Cliff(λ) → Cliff 1 (λ). Observe that dimC Cliff 1 (λ) = rank A1 Cliff A1 (λ) = dimC(q) Cliff q (λ) = dimC Cliff(λ). The first two equalities follow by using the same reasoning as in Proposition 5.9 and Proposition 5.7, respectively. It is well known that the dimension of the Clifford algebra associated with a symmetric bilinear form on a vector space of dimension k is 2k . This result holds for any base field of characteristic different from 2. Thus we proved the last equality. (4) V q is generated by a finite dimensional irreducible Uq≥0 -submodule vq ∼ = E q (λ) up to . By Corollary 3.9, ⎧ k ¯ ⎪ if |λ| = 2k and (λ) = 1, ⎨2 q k+1 ¯ dim E (λ) = 2 if |λ| = 2k and (λ) = 1, ⎪ ⎩2k+1 if |λ| = 2k + 1. It is"well #known that the dimension of the Z2 -graded irreducible Cliff(λ)-modules " # |λ|−1
|λ|−1
is 2 2 |2 2 (see, for example, [ABS]). With this in mind we deduce that E A1 (λ)/J1 E A1 (λ) is an irreducible Cliff(λ)-module when |λ| = 2k + 1 or |λ| = 2k and (λ) = 1, and the direct sum of two irreducible Cliff(λ)-modules otherwise. Since E q (λ) is a parity invariant module over Cliff q (λ) for |λ| = 2k and ¯ E A1 (λ)/J1 E A1 (λ) is a parity invariant Cliff(λ)-module as well. Hence (λ) = 1, E A1 (λ)/J1 E A1 (λ) = v(λ)⊕v(λ) for some irreducible Cliff(λ)-module v(λ). By definition, V 1 is a highest weight U (g)-module generated by E A1 (λ)/J1 E A1 (λ) or the sum of two highest weight modules generated by v(λ) and v(λ) for some irreducible Cliff(λ)-module v(λ). By Propositions 5.7 and 5.9 and Theorem 5.11, we obtain the following identity between the characters of a highest weight U (g)-module and a highest weight Uq (g)module. Proposition 5.12. ch V 1 = ch V q . Corollary 5.13. V q (λ) is finite dimensional if and only if λ ∈ + . Proof. Let V q = V q (λ). If λ ∈ + , then we have f iλ(h i )+1 v = 0 for all v ∈ Vλ by Proposition 4.4. Taking the classical limit, we have f¯iλ(h i )+1 v¯ = 0 for all v¯ ∈ Vλ1 . Because V 1 is a highest weight module or the sum of two highest weight modules, it is finite dimensional by Proposition 1.9, and hence V q is finite dimensional by Proposition 5.12. Conversly, assume that λ is not in + . Then V 1 has a submodule which is a highest weight module and whose irreducible quotient is isomorphic to an irreducible highest weight module with highest weight λ. It is not finite dimensional by (2) of Proposition 1.4. Again by Proposition 5.12, V q cannot be finite dimensional. q
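The classical Clifford relations used in part (3) of the proof of Theorem 5.11, $k_{\bar{i}} k_{\bar{j}} + k_{\bar{j}} k_{\bar{i}} = 2\delta_{ij}\lambda_i$, together with the dimension count $2^n$ for $n$ generators, can be illustrated by a small model. The sketch below is not the authors' construction. It stores an element as a dictionary from bitmasks (subsets of generators, multiplied in increasing index order) to coefficients, and uses the standard sign rule for orthogonal generators. The helper names and the choice $\lambda = (4, 2, 1)$ are assumptions made for illustration.

```python
from itertools import product


def basis_product(a, b, lam):
    """Multiply basis monomials t_a * t_b given as bitmasks.

    Uses t_i^2 = lam[i] and t_i t_j = -t_j t_i for i != j.
    Returns (coefficient, bitmask of the resulting monomial).
    """
    coeff = 1
    for j in range(len(lam)):
        if (b >> j) & 1:
            # Moving t_j into sorted position passes each generator of a with index > j.
            if bin(a >> (j + 1)).count("1") % 2:
                coeff = -coeff
            # A generator occurring in both factors collapses to the scalar lam[j].
            if (a >> j) & 1:
                coeff *= lam[j]
    return coeff, a ^ b


def multiply(x, y, lam):
    """Multiply two elements stored as {bitmask: coefficient} dictionaries."""
    out = {}
    for (a, ca), (b, cb) in product(x.items(), y.items()):
        c, mask = basis_product(a, b, lam)
        out[mask] = out.get(mask, 0) + ca * cb * c
    return {m: c for m, c in out.items() if c != 0}


def add(x, y):
    """Add two elements, dropping zero coefficients."""
    out = dict(x)
    for m, c in y.items():
        out[m] = out.get(m, 0) + c
    return {m: c for m, c in out.items() if c != 0}


lam = (4, 2, 1)                       # illustrative weight
n = len(lam)
t = [{1 << i: 1} for i in range(n)]   # the generators

# Check t_i t_j + t_j t_i = 2 * delta_ij * lam[i].
for i in range(n):
    for j in range(n):
        anticommutator = add(multiply(t[i], t[j], lam), multiply(t[j], t[i], lam))
        assert anticommutator == ({0: 2 * lam[i]} if i == j else {})

# The ordered monomials t_1^{e_1} ... t_n^{e_n} with e_j in {0, 1} are 2^n distinct basis elements.
monomials = set()
for exps in product((0, 1), repeat=n):
    m = {0: 1}
    for i, e in enumerate(exps):
        if e:
            m = multiply(m, t[i], lam)
    monomials.add(tuple(sorted(m.items())))
assert len(monomials) == 2 ** n
```

Because the basis is indexed by subsets of the generators, the model has dimension $2^n$ by construction; the loop above only confirms that the ordered monomials hit each basis element exactly once.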
Theorem 5.14. If λ ∈ + ∩ P≥0 and V q is the irreducible highest weight Uq (g)-module V q (λ) with highest weight λ, then V 1 is isomorphic to ¯ (1) V (λ) or V (λ) if |λ| = 2k and (λ) = 1, ¯ (2) V (λ) ⊕ V (λ) if |λ| = 2k and (λ) = 1 (in particular, if λ1 > . . . . > λ2k > 0), (3) V (λ) ∼ = V (λ) if |λ| = 2k + 1. ⎧ ¯ ⎪ if |λ| = 2k and (λ) = 1, ⎨ch V (λ) q ¯ Hence, ch V (λ) = 2 ch V (λ) if |λ| = 2k and (λ) = 1, ⎪ ⎩ch V (λ) if |λ| = 2k + 1. Proof. By Theorem 5.11 (4), V 1 is a highest weight module or the sum of two highest weight modules over U (g) with highest weight λ. By Proposition 1.8, we have ⎧ ¯ ⎪ if |λ| = 2k and (λ) = 1, ⎨V (λ) or V (λ) 1 ∼ ¯ V = V (λ) ⊕ V (λ) if |λ| = 2k and (λ) = 1, ⎪ ⎩V (λ) ∼ V (λ) if |λ| = 2k + 1. = The second assertion follows from Proposition 5.12. Remark. The main reason we restrict our attention in Theorem 5.14 to the dominant set of weights + ∩ P≥0 is the statement of Proposition 1.8. We still believe that the theorem holds in a more general setting and conjecture that it is true for any weight λ ∈ + for which the generic character formula (1.4) holds. Corollary 5.15. If V q is a finite dimensional highest weight module over Uq (g) with highest weight λ ∈ + ∩ P≥0 , then V q is isomorphic to V q (λ) up to . Proof. Note that V 1 is a highest weight module or the sum of two highest weight modules over U (g) with highest weight λ and it is finite dimensional by Proposition 5.12. From Proposition 1.8, we know that V 1 is an irreducible module or the direct sum of two irreducible modules. Thus we get ch V q = ch V 1 = ch V q (λ) by Theorem 5.14 and hence V q ∼ = V q (λ). Define the subalgebras U1± := A1 /J1 ⊗A1 UA±1 and U10 := A1 /J1 ⊗A1 UA0 1 of U1 . Theorem 5.16. The classical limit U1 of Uq (g) is isomorphic to the universal enveloping algebra U (g). Proof. 
By Theorem 5.11 (1), there exists a surjective algebra homomorphism $\psi : U(\mathfrak{g}) \to U_1$ defined by $e_i \mapsto \bar{e}_i$, $e_{\bar{i}} \mapsto \bar{e}_{\bar{i}}$, $f_i \mapsto \bar{f}_i$, $f_{\bar{i}} \mapsto \bar{f}_{\bar{i}}$, $h \mapsto \bar{h}$, $k_{\bar{l}} \mapsto \bar{k}_{\bar{l}}$ for $i \in I$, $h \in P^\vee$ and $l \in J$. From (1.3), $U(\mathfrak{g}) \cong U^- \otimes U^0 \otimes U^+$. We first show that $U^0$ is isomorphic to $U_1^0$. Consider the restriction $\psi_0$ of $\psi$ to $U^0$. Note that $\mathrm{Cliff}_{\mathbf{A}_1}(\lambda)$ is a $U^0_{\mathbf{A}_1}$-module. Indeed, as in the proof of Proposition 5.5, we know that

$$q^h w = q^{\lambda(h)} w, \qquad (q^h; 0)_q\, w = \frac{q^{\lambda(h)} - 1}{q - 1}\, w \quad \text{for all } w \in \mathrm{Cliff}_{\mathbf{A}_1}(\lambda).$$

In particular, the action of $k_{\bar{i}}$ is just the left multiplication by $t_{\bar{i}}$. Let $g \in \ker \psi_0$. By the Poincaré-Birkhoff-Witt theorem, we can write $g = \sum_{i=1}^{2^n} g_i k_{\eta_i}$, where $k_{\eta_i} = k_{\bar{1}}^{a_1} \cdots k_{\bar{n}}^{a_n}$, $0 \leq a_j \leq 1$ for all $j \in J$ and each $g_i$ is a polynomial in $k_1, \ldots, k_n$. For each $\lambda \in P$ we have

$$0 = \psi_0(g) \cdot 1 = \sum_{i=1}^{2^n} \lambda(g_i)\, t_{\eta_i} \in \mathrm{Cliff}_1(\lambda),$$
where λ(gi ) denotes the polynomial in λ j corresponding to gi . Since {tηi } is a linearly independent subset of Cliff 1 (λ) ∼ = Cliff(λ), we have λ(gi ) = 0 for all i = 1, . . . , 2n. Since we may take any integer value for λ j , gi must be zero for all i = 1, . . . , 2n and hence g is identically zero. Thus ψ0 is injective. − − − Next we show that the restriction of ψ− of ψ to U is an isomorphism of U onto U1 . Suppose ker ψ− = 0 and u = bζ f ζ ∈ ker ψ− , where bζ ∈ C and f ζ are monomials in f i and f i¯ ’s. Let N be the maximal length of the monomials f ζ in the expression of u, and choose λ ∈ + ∩ P≥0 satisfying λ(h i ) > N and |λ| = 2k and (λ) = 1¯ or |λ| = 2k+1 for all i ∈ I . By Theorem 5.14, the classical limit V 1 of V q (λ) is isomorphic to the irreduc|λ|+1 ¯ or |λ| = 2k + 1. Set r = 2[ 2 ] . ible U (g)-module V (λ) |λ| = 2k and (λ) = 1, when ⊕r Consider the map φ : U − −→ V 1 , given by (x1 , . . . , xn ) −→ r1=1 ψ(xi ) · vi for a basis {vi | i = 1, . . . , r } of Vλ1 . Then by Proposition 1.8 and Proposition 1.9, ker φ is ⊕r λ(h )+1 λ(h )+1 generated by( f i i , 0, . . . , 0), . . ., (0, . . . , 0, f i i ) for the left ideal of U − i ∈ I . In particular, (u, 0, . . . , 0) = ( bζ f ζ , 0, . . . , 0) ∈ ker φ. That is ψ− (u)v1 = 0, which is a contradiction. So ker ψ− = 0 and U − is isomorphic to U1− . Similarly, we can show that U + ∼ = U1+ . By the triangular decomposition we have U (g) ∼ = U1− ⊗ U10 ⊗ U1+ ∼ = U1 . = U− ⊗ U0 ⊗ U+ ∼ It can be checked easily that this isomorphism is an algebra isomorphism. Theorem 5.17. Let λ ∈ P. If V q is the Weyl module W q (λ) over Uq (g) with highest weight λ, then its classical limit V 1 is isomorphic to ¯ (1) W (λ) or W (λ) if |λ| = 2k and (λ) = 1, (2) W (λ) ⊕ W (λ) if |λ| = 2k and (λ) = 1¯ (in particular, if λ1 > · · · > λ2k > 0), (3) W (λ) ∼ = W (λ) if |λ| = 2k + 1. Proof. Let v(λ) be a finite dimensional irreducible b+ -module of weight λ which generates W (λ). 
Since U − ∼ = U1− and E A1 (λ)/J1 E A1 (λ) is isomorphic to v(λ) or v(λ) ⊕ v(λ) as a Cliff(λ)-module, it suffices to show that V 1 is a free U1− -module whose rank is dimC v(λ) or 2 dimC v(λ). By Proposition 4.2 we know that W q (λ) is a free Uq− -module generated by q E (λ). Since VA1 is a subspace of V q , taking Proposition 5.7 into account, VA1 is a free UA−1 -module generated by E A1 (λ). Taking the classical limit, we see that V 1 = U1− · E A1 (λ)/J1 E A1 (λ) and dimC E A1 (λ)/J1 E A1 (λ) = dimC(q) E q (λ) = dimC v(λ) or 2 dimC v(λ). By a similar argument as in [HK, Prop. 3.4.10], we can show that V 1 is a free ¯ E q (λ) is parity invariant. Hence we have When |λ| = 2k and (λ) = 1, ⎧ ¯ ⎪ if |λ| = 2k and (λ) = 1, ⎨W (λ) or W (λ) ¯ V1 ∼ if |λ| = 2k and (λ) = 1, = W (λ) ⊕ W (λ) ⎪ ⎩W (λ) ∼ if |λ| = 2k + 1. = W (λ)
U1− -module.
6. Complete Reducibility of the Category $\mathcal{O}_q^{\geq 0}$
In this section, we prove the complete reducibility theorem for Uq (g)-modules in the category Oq≥0 . Definition 6.1. The category Oq≥0 consists of finite dimensional Uq (g)-modules M with a weight space decomposition M = λ∈P Mλ such that wt(M) ⊂ P≥0 . Remark. The complete reducibility theorem for Oq≥0 , which we establish at the end of this section, implies that Oq≥0 is isomorphic to the category Tq of tensor modules, i.e., submodules of a tensor power of the natural representation C(q)n|n . Indeed, using the description of Tq provided by Olshanski and Sergeev we first check that every simple object of Oq≥0 is a tensor module. Then, by the complete reducibility result for Tq , obtained again by Sergeev and Olshanski, we conclude that the two categories are isomorphic. One can easily prove the following proposition (see, for example, [HK, Theorem 7.2.3]). Proposition 6.2. For each λ ∈ + ∩ P≥0 , V q (λ) is an irreducible Uq (g)-module in the category Oq≥0 . Conversely, every finite dimensional irreducible Uq (g)-module in the category Oq≥0 has the form V q (λ) for some λ ∈ + ∩ P≥0 . Let S be the antipode on Uq (g) defined in [O, Sect. 4]. We have S(q h ) = q −h for all h ∈ P ∨ . Because S is an anti-automorphism on Uq (g), one can define two Uq (g)-module structures on the dual vector space of a Uq (g)-module V ∈ Oq≥0 by x · φ, v := φ, S(x) · v and x · φ, v := φ, S −1 (x) · v for each x ∈ Uq (g) and linear functional φ on V . We denote by V ∗ these modules ∗ and V , respectively. As vector spaces both modules are just µ∈P Vµ , where Vµ∗ = HomC(q) (Vµ , C(q)). The following lemma is an immediate consequence of the definitions. Lemma 6.3. Suppose that V is a Uq (g)-module in the category Oq≥0 . (1) There exist canonical Uq (g)-module isomorphisms (V ∗ ) ∼ =V ∼ = (V )∗ . ∗ (2) The space Vµ is a weight space of weight −µ. Since q h S(ei )q −h = q αi (h) S(ei ), we have S(ei )Vµ ⊂ Vµ+αi , which implies ei Vµ∗ ⊂ ∗ Vµ−αi . By Lemma 6.3, we get ei (V ∗ )−µ ⊂ (V ∗ )−µ+αi . 
Similarly, we also have ei¯ (V ∗ )−µ ⊂ (V ∗ )−µ+αi , f i (V ∗ )−µ ⊂ (V ∗ )−µ−αi , f i¯ (V ∗ )−µ ⊂ (V ∗ )−µ−αi for all i ∈ I and ki¯ (V ∗ )−µ ⊂ (V ∗ )−µ for all i ∈ J . A weight module M is called a lowest weight module with lowest weight λ ∈ P if it is generated over Uq (g) by an irreducible finite dimensional Uq≤0 -module. By a similar argument as in Proposition 4.1, one can show that (V q (λ)λ )∗ is an irreducible Uq≤0 -module so that V q (λ)∗ and V q (λ) are lowest
weight modules of lowest weight −λ. Suppose that V is a Uq (g)-module in the category Oq≥0 . Because V is finite dimensional, we may choose a maximal weight λ ∈ wt(V ) with the property that λ + αi is not a
weight of V for any i ∈ I . Then the weight space Vλ is a Uq≥0 -module. Fix an irreducible Uq≥0 -submodule v of Vλ and set L = Uq (g)v. Then L is a highest weight Uq (g)-module with highest weight λ. By the assumption, λ ∈ + ∩ P≥0 and from Corollary 5.15 we know L ∼ = V q (λ) up to . Now consider v¯ = HomC(q) (v, C(q)) ⊂ V ∗ , and set L¯ = Uq (g)¯v ⊂ V ∗ . It is easy to show that v¯ is an irreducible Uq≤0 (g)-module and L¯ is a lowest weight module with lowest weight −λ. Translating Corollary 5.15 to the case of lowest weight modules, we get the following lemma. Lemma 6.4. The Uq (g)-module L¯ is isomorphic to the irreducible lowest weight module V q (λ)∗ with lowest weight −λ and lowest weight space v¯ . Now we can prove the completely reducibility theorem for Uq (g)-modules in the category Oq≥0 . Theorem 6.5. Every Uq (g)-module V in the category Oq≥0 is completely reducible. Proof. Take a maximal weight λ and consider a submodule of V , say L, generated by an irreducible Uq≥0 -submodule of Vλ . We want to show V ∼ = L ⊕ V /L. Taking dual with −1 ∗ ¯ respect to S of the inclusion L → V , we obtain a Uq (g)-module homomorphism ¯ . Thus we have a map: V ∼ = (V ∗ ) → ( L) ¯ . ψ : L → V → ( L) ¯ are It is easy to check that ψ is a nontrivial homomorphism. Since both L and ( L) irreducible, ψ is an isomorphism by Schur’s lemma and we see that the following short exact sequence splits: 0 → L → V → V /L → 0. Since V /L ∈ Oq≥0 , using induction on the dimension of V , we complete the proof. Corollary 6.6. The tensor product of a finite number of Uq (g)-modules in the category Oq≥0 is completely reducible. Remark. The same argument can be applied to prove the complete reducibility of O≥0 . In that case, the antipode is given by S(x) = −x for all x ∈ g (see [N, Sect. 4]) and Proposition 1.8 plays the same role as Proposition 5.15. Acknowledgements. We would like to thank Ivan Penkov and Vera Serganova for stimulating discussions. D.G. 
gratefully acknowledges the hospitality and excellent working conditions at the Seoul National University where most of this work was completed.
References

[ABS] Atiyah, M.F., Bott, R., Shapiro, A.: Clifford modules. Topology 3, 3–38 (1964)
[B] Brundan, J.: Kazhdan-Lusztig polynomials and character formulae for the Lie superalgebra q(n). Adv. Math. 182, 28–77 (2004)
[BKM] Benkart, G., Kang, S.-J., Melville, D.: Quantized enveloping algebras for Borcherds superalgebras. Trans. Amer. Math. Soc. 350, 3297–3319 (1998)
[Dr] Drinfel'd, V.: Quantum groups. In: Proceedings of the International Congress of Mathematicians, Vol. 1 (Berkeley, Calif., 1986), Providence, RI: Amer. Math. Soc., 1987, pp. 798–820
[G] Gorelik, M.: Shapovalov determinants of q-type Lie superalgebras. Int. Math. Res. Pap., Article ID 96895, 1–71 (2006)
[Har] Harris, J.: Algebraic Geometry, A first course. Corrected reprint of the 1992 original. Graduate Texts in Mathematics 133, New York: Springer-Verlag, 1995
[HK] Hong, J., Kang, S.-J.: Introduction to Quantum Groups and Crystal Bases. Graduate Studies in Mathematics 42, Providence, RI: Amer. Math. Soc., 2002
[IR] Ireland, K., Rosen, M.: A Classical Introduction to Modern Number Theory. 2nd ed., Graduate Texts in Mathematics 84, New York: Springer-Verlag, 1990
[K] Kac, V.: Lie superalgebras. Adv. Math. 26, 8–96 (1977)
[Lam] Lam, T.Y.: Introduction to Quadratic Forms over Fields. Graduate Studies in Mathematics 67, Providence, RI: Amer. Math. Soc., 2005
[Lang] Lang, S.: Algebra. Revised third edition, Graduate Texts in Mathematics 211, New York: Springer-Verlag, 2002
[LS] Leites, D., Serganova, V.: Defining relations for classical Lie superalgebras I. Superalgebras with Cartan matrix or Dynkin-type diagram. In: Proc. Topological and Geometrical Methods in Field Theory, eds. J. Mickelson, et al., Singapore: World Sci., 1992, pp. 194–201
[N] Nazarov, M.: Capelli identities for Lie superalgebras. Ann. Sci. Ecole Norm. Sup. (4) 30, 6, 847–872 (1997)
[O] Olshanski, G.: Quantized universal enveloping superalgebra of type q and a super-extension of the Hecke algebra. Lett. Math. Phys. 24, 93–102 (1992)
[P] Penkov, I.: Characters of typical irreducible finite-dimensional q(n)-modules. Funct. Anal. Appl. 20, 30–37 (1986)
[PS1] Penkov, I., Serganova, V.: Generic irreducible representations of finite-dimensional Lie superalgebras. Int. J. Math. 5, 389–419 (1994)
[PS2] Penkov, I., Serganova, V.: Characters of irreducible G-modules and cohomology of G/P for the Lie supergroup G = Q(N). J. Math. Sci. (New York) 84, 1382–1412 (1997)
[PS3] Penkov, I., Serganova, V.: Characters of finite-dimensional irreducible q(n)-modules. Lett. Math. Phys. 40, 147–158 (1997)
[RTF] Reshetikhin, N., Takhtadzhyan, L., Faddeev, L.: Quantization of Lie groups and Lie algebras. (Russian) Algebra i Analiz 1, 178–206 (1989); translation in Leningrad Math. J. 1, 193–225 (1990)
[Se1] Sergeev, A.: The centre of enveloping algebra for Lie superalgebra Q(n, C). Lett. Math. Phys. 7, 177–179 (1983)
[Se2] Sergeev, A.: Tensor algebra of the identity representation as a module over the Lie superalgebras GL(n, m) and Q(n). (Russian) Mat. Sb. (N.S.) 123(165), 422–430 (1984)
[Sh] Shimura, G.: Arithmetic and Analytic Theories of Quadratic Forms and Clifford Groups. Mathematical Surveys and Monographs 109, Providence, RI: Amer. Math. Soc., 2004
Communicated by Y. Kawahigashi
Commun. Math. Phys. 296, 861–880 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1017-8
Communications in
Mathematical Physics
Global Solution to the Three-Dimensional Incompressible Flow of Liquid Crystals Xianpeng Hu, Dehua Wang Department of Mathematics, University of Pittsburgh, Pittsburgh, PA 15260, USA. E-mail:
[email protected];
[email protected] Received: 25 June 2009 / Accepted: 29 November 2009 Published online: 6 February 2010 – © Springer-Verlag 2010
Abstract: The equations for the three-dimensional incompressible flow of liquid crystals are considered in a smooth bounded domain. The existence and uniqueness of the global strong solution with small initial data are established. It is also proved that when the strong solution exists, all the global weak solutions constructed in [16] must be equal to the unique strong solution.
1. Introduction
Liquid crystals are substances that exhibit a phase of matter with properties between those of a conventional liquid and those of a solid crystal. For instance, a liquid crystal may flow like a liquid, but its molecules may be oriented in a crystal-like way. There are many different types of liquid crystal phases, which can be distinguished based on their different optical properties. The various liquid crystal phases can be characterized by the type of ordering that is present. One can distinguish positional order and orientational order, and moreover order can be either short-range or long-range. Liquid crystals may have an isotropic phase at high temperature, or anisotropic orientational structure at lower temperature. The diverse phases of liquid crystals have wide applications, ranging from liquid crystal displays to biology. (In particular, biological membranes and cell membranes are a form of liquid crystal.) In the 1960s, the theoretical physicist P.-G. de Gennes found fascinating analogies between liquid crystals and superconductors as well as magnetic materials, work that was rewarded with the Nobel Prize in Physics in 1991. One of the most common liquid crystal phases is the nematic, where the molecules have no positional order, but they have long-range orientational order. For more details on the physics, we refer the readers to the two books of de Gennes-Prost [5] and Chandrasekhar [3].
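The three-dimensional flow of nematic liquid crystals can be governed by the following system of partial differential equations ([5,14–16]):
\[ \frac{\partial u}{\partial t} + u\cdot\nabla u - \mu\Delta u + \nabla P = -\lambda\,\mathrm{div}(\nabla d \odot \nabla d), \tag{1.1a} \]
\[ \frac{\partial d}{\partial t} + u\cdot\nabla d = \gamma(\Delta d - f(d)), \tag{1.1b} \]
\[ \mathrm{div}\,u = 0, \tag{1.1c} \]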
where $u \in \mathbb{R}^3$ denotes the velocity, $d \in \mathbb{R}^3$ the director field for the averaged macroscopic molecular orientations, and $P \in \mathbb{R}$ the pressure arising from the incompressibility; they all depend on the spatial variable $x = (x_1, x_2, x_3) \in \mathbb{R}^3$ and the time variable $t > 0$. The positive constants $\mu, \lambda, \gamma$ stand for viscosity, the competition between kinetic energy and potential energy, and microscopic elastic relaxation time or the Deborah number for the molecular orientation field, respectively. We set these three constants to be one since their sizes do not play any role in our analysis. The symbol $\nabla d \odot \nabla d$ denotes a matrix whose $ij$-th entry is $\langle \partial_{x_i} d, \partial_{x_j} d \rangle$, and it is easy to see that $\nabla d \odot \nabla d = (\nabla d)^\top \nabla d$, where $(\nabla d)^\top$ denotes the transpose of the $3 \times 3$ matrix $\nabla d$. In (1.1), $f(d)$ is the penalty function, which will be assumed to be zero as in [16] for the three-dimensional problem. However, the approach of this paper can be applied to the general case.

The system (1.1) is a simplified version, but still retains most of the essential features, of the Ericksen-Leslie equations ([7,8,10–13]) for the hydrodynamics of nematic liquid crystals; see [16,19,20] for more discussions on the relations of the two models. Both the Ericksen-Leslie system and the simplified one (1.1) describe the time evolution of liquid crystal materials under the influence of both the velocity field $u$ and the director field $d$. In many situations, the flow velocity field does disturb the alignment of the molecule, and, in turn, a change in the alignment will induce velocity.

We consider the initial-boundary value problem of system (1.1) in a bounded domain $\Omega \subset \mathbb{R}^3$ with $C^3$ boundary under the initial-boundary conditions:
\[ d|_{t=0} = d_0, \quad u|_{t=0} = u_0, \tag{1.2} \]
and
\[ u|_{\partial\Omega} = 0, \quad d|_{\partial\Omega} = d_0, \tag{1.3} \]
with $\mathrm{div}\,u_0 = 0$ in $\Omega$, and $d_0 \in C^1(\overline{\Omega})$ satisfying $\nabla d_0 = 0$ on the boundary $\partial\Omega$. We introduce a $3 \times 3$ matrix
\[ F = \nabla d, \tag{1.4} \]
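and take the gradient of (1.1b) to rewrite (1.1), with $f(d) = 0$ and $\mu = \lambda = \gamma = 1$, as:
\[ \frac{\partial u}{\partial t} + u\cdot\nabla u - \Delta u + \nabla P = -\mathrm{div}(F^\top F), \tag{1.5a} \]
\[ \frac{\partial F}{\partial t} + u\cdot\nabla F + F\nabla u = \Delta F, \tag{1.5b} \]
\[ \mathrm{div}\,u = 0, \tag{1.5c} \]
where we used, for all $i, j, k = 1, 2, 3$ (with summation over the repeated index $j$),
\[ \frac{\partial}{\partial x_k}\left(u_j \frac{\partial d_i}{\partial x_j}\right) = \frac{\partial u_j}{\partial x_k}\frac{\partial d_i}{\partial x_j} + u_j \frac{\partial}{\partial x_j}\frac{\partial d_i}{\partial x_k} = (F\nabla u + u\cdot\nabla F)_{ik}. \]
Notice that (1.5a) is the incompressible Navier-Stokes equation with the source term $-\mathrm{div}(F^\top F)$, while (1.5b) is a parabolic equation of $F$. The initial-boundary conditions (1.2) and (1.3) become
\[ u|_{t=0} = u_0, \quad F|_{t=0} = F_0 := \nabla d_0, \tag{1.6} \]
and
\[ u|_{\partial\Omega} = 0, \quad F|_{\partial\Omega} = 0. \tag{1.7} \]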
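The passage from (1.1b) to (1.5b) can be sketched componentwise as follows. This is only an illustrative derivation, assuming $f(d) = 0$, $\gamma = 1$, and the index conventions $F_{ik} = \partial_{x_k} d_i$ and $(\nabla u)_{jk} = \partial_{x_k} u_j$; these conventions are our assumption, chosen so that they agree with the identity stated above. Repeated indices are summed.

```latex
\begin{align*}
% component i of (1.1b) with f = 0 and gamma = 1
\partial_t d_i + u_j\,\partial_{x_j} d_i &= \Delta d_i, \\
% apply \partial_{x_k} and use the product rule on the transport term
\partial_t(\partial_{x_k} d_i) + (\partial_{x_k} u_j)(\partial_{x_j} d_i)
  + u_j\,\partial_{x_j}(\partial_{x_k} d_i) &= \Delta(\partial_{x_k} d_i), \\
% substitute F_{ik} = \partial_{x_k} d_i and (\nabla u)_{jk} = \partial_{x_k} u_j
\partial_t F_{ik} + F_{ij}(\nabla u)_{jk} + u_j\,\partial_{x_j} F_{ik} &= \Delta F_{ik}.
\end{align*}
```

The last line is the $(i,k)$ entry of (1.5b).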
There have been some studies on system (1.1). In Lin-Liu [16], the global existence of weak solutions with large initial data was proved under the condition that the orientational configuration $d(x,t)$ belongs to $H^2$, and the global existence of classical solutions was also obtained if the coefficient $\mu$ is large enough in three dimensional spaces. Similar results were obtained also in [20] for a different but similar model. When weak solutions are discussed, the regularity of the weak solution was investigated in [17] (and also [11]).

In this paper, we are interested in strong solutions of (1.5) in the Sobolev space $W^{2,q}(\Omega)$ with $q > 3$. It is worth pointing out that if $F$ belongs to $W^{2,q}(\Omega)$, it is equivalent to saying that $d$ should be in $W^{3,q}(\Omega)$ according to (1.4). By a Strong Solution, we mean a triplet $(u, F, P)$ satisfying (1.5) almost everywhere with the initial condition (1.6) and the boundary condition (1.7). Our strategy to consider (1.5) in $W^{2,q}(\Omega)$ is to linearize (1.5) as
\[ \frac{\partial u}{\partial t} - \Delta u + \nabla P = -v\cdot\nabla v - \mathrm{div}(G^\top G), \tag{1.8a} \]
\[ \frac{\partial F}{\partial t} - \Delta F = -v\cdot\nabla G - G\nabla v, \tag{1.8b} \]
\[ \mathrm{div}\,u = 0, \tag{1.8c} \]
for some given $v \in \mathbb{R}^3$ and $G \in M^{3\times 3}$. One of the motivations for making such a linearization is that we can use the maximal regularity of Stokes equations ([4]) and the parabolic equation ([1]). We first use an iteration method to establish the local existence and uniqueness of a strong solution with general initial data. Then we prove the global existence by establishing some global estimates under the condition that the initial data is small in some sense.

The global weak solution was obtained in Lin-Liu [16], but the uniqueness is still an open problem. We shall prove that when the strong solution exists, all the global weak solutions constructed in [16] must be equal to the unique strong solution, which is called the weak-strong uniqueness. Similar results were obtained by Danchin [4] for the density-dependent incompressible Navier-Stokes equations. We shall establish our results in the spirit of [4], while developing new estimates for the director field $d$.

The rest of the paper is organized as follows. In Sect. 2, we state our main results on local and global existence of strong solutions, as well as the weak-strong uniqueness. In Sect. 3, we recall the maximal regularity for Stokes equations and the parabolic equation, and also some $L^\infty$ estimates. In Sect. 4, we give the proof of the local existence. In Sect. 5, we prove the global existence. Finally in Sect. 6, we show the weak-strong uniqueness.
2. Main Results

In this section, we state our main results. If $k > 0$ is an integer and $p \ge 1$, we denote by $W^{k,p}$ the set of functions in $L^p(\Omega)$ whose derivatives of up to order $k$ belong to $L^p(\Omega)$. For $T > 0$ and a function space $X$, denote by $L^p(0,T;X)$ the set of Bochner measurable $X$-valued time dependent functions $f$ such that $t \mapsto \|f\|_X$ belongs to $L^p(0,T)$. Let us define the functional spaces in which the existence of solutions is going to be obtained:

Definition 2.1. For $T > 0$ and $1 < p, q < \infty$, we denote by $M_T^{p,q}$ the set of triplets $(u, F, P)$ such that
\[ u \in C([0,T]; D_{A_q}^{1-\frac1p,p}) \cap L^p(0,T; W^{2,q}(\Omega) \cap W_0^{1,q}(\Omega)), \quad \partial_t u \in L^p(0,T; L^q), \quad \mathrm{div}\,u = 0, \]
\[ F \in C([0,T]; B_{q,p}^{2(1-\frac1p)}) \cap L^p(0,T; W^{2,q}(\Omega)), \quad \partial_t F \in L^p(0,T; L^q(\Omega)), \]
and
\[ P \in L^p(0,T; W^{1,q}(\Omega)), \quad \int_\Omega P\,dx = 0. \]
The corresponding norm is denoted by $\|\cdot\|_{M_T^{p,q}}$.
We remark that the condition
\[ \int_\Omega P\,dx = 0 \]
in Definition 2.1 holds automatically if we replace $P$ by
\[ P - \frac{1}{|\Omega|}\int_\Omega P\,dx \]
in (1.1) and (1.5). Also, in the above definition, the space $D_{A_q}^{1-\frac1p,p}$ stands for some fractional domain of the Stokes operator in $L^q$ (cf. Sect. 2.3 in [4]). Roughly, the vector-fields of $D_{A_q}^{1-\frac1p,p}$ are vectors which have $2 - \frac2p$ derivatives in $L^q$, are divergence-free, and vanish on $\partial\Omega$. The Besov space (for a definition, see [2]) $B_{q,p}^{2(1-\frac1p)}$ can be regarded as the interpolation space between $L^q$ and $W^{2,q}$, that is,
\[ B_{q,p}^{2(1-\frac1p)} = (L^q, W^{2,q})_{1-\frac1p,\,p}. \]
We note that, from Proposition 2.5 in [4],
\[ D_{A_q}^{1-\frac1p,p} \hookrightarrow B_{q,p}^{2(1-\frac1p)} \cap L^q(\Omega). \tag{2.1} \]
The local existence will be shown by using an iterative method, and if the initial data is sufficiently small in some suitable function spaces, the solution is indeed global in time. More precisely, our existence results read:
Theorem 2.1. Let $\Omega$ be a bounded domain in $\mathbb{R}^3$ with $C^3$ boundary. Assume $1 \le p, q \le \infty$ with $\frac{p}{2}\left(1 - \frac{3}{q}\right) \in (0,1)$ and $u_0 \in D_{A_q}^{1-\frac1p,p}$, $F_0 \in B_{q,p}^{2(1-\frac1p)} \cap L^q$. Then,
(1) there exists a $T_0 > 0$, such that system (1.5) with the initial-boundary conditions (1.6)–(1.7) has a unique local strong solution $(u, F, P) \in M_{T_0}^{p,q}$ in $\Omega \times (0, T_0)$;
(2) moreover, there exists a $\delta_0 > 0$, such that, if the initial data satisfies
\[ \|u_0\|_{D_{A_q}^{1-\frac1p,p}} \le \delta_0, \quad \|F_0\|_{B_{q,p}^{2(1-\frac1p)} \cap L^q} \le \delta_0, \]
then system (1.5) with (1.6)–(1.7) has a unique global strong solution $(u, F, P) \in M_T^{p,q}$ in $\Omega \times (0, T)$ for all $T > 0$.
Remark 2.1. The above theorem gives us the global strong solution near $u = 0$, $F = 0$. An argument similar to the proof of Theorem 2.1 below will also enable us to show the global existence of a strong solution to (1.5) near the equilibrium state: $u = 0$, $F = I$ (the $3 \times 3$ identity matrix).

According to Lin-Liu [16], for the given initial-boundary conditions (1.6) and (1.7), there exists at least one Weak Solution to (1.5). But its uniqueness is still an open question. More precisely, a triplet $(v, E, \Pi)$ is called a weak solution to (1.5) with (1.6) and (1.7) in $\Omega \times (0, T)$ if $(v, E, \Pi)$ satisfies the system (1.5) in the sense of distributions, i.e., for all $\psi \in (C_0^\infty(\Omega \times (0,T)))^3$ with $\mathrm{div}\,\psi = 0$ and $\phi \in (C_0^\infty(\Omega \times (0,T)))^9$, we have
\[ \int_0^T\!\!\int_\Omega v\cdot\partial_t\psi\,dx\,dt + \int_0^T\!\!\int_\Omega v\otimes v : \nabla\psi\,dx\,dt - \int_0^T\!\!\int_\Omega \nabla v : \nabla\psi\,dx\,dt = -\int_0^T\!\!\int_\Omega E^\top E : \nabla\psi\,dx\,dt, \]
and
\[ \int_0^T\!\!\int_\Omega E : \partial_t\phi\,dx\,dt - \int_0^T\!\!\int_\Omega v\cdot\nabla E : \phi\,dx\,dt - \int_0^T\!\!\int_\Omega E\nabla v : \phi\,dx\,dt = \int_0^T\!\!\int_\Omega \nabla E : \nabla\phi\,dx\,dt, \]
with the energy inequality:
\[ \int_\Omega (|v(t)|^2 + |E(t)|^2)\,dx + \int_0^t\!\!\int_\Omega (|\nabla v|^2 + |\nabla E|^2)\,dx\,ds \le \int_\Omega (|v_0|^2 + |E_0|^2)\,dx. \]
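As a sketch of where the first weak identity comes from, one can formally test the momentum equation (1.5a), written for the weak solution, against a divergence-free $\psi$ with compact support in $\Omega \times (0,T)$. The computation below assumes enough smoothness to integrate by parts, writes $\Pi$ for the pressure of the weak solution (the symbol is our assumption), and uses $v\cdot\nabla v = \mathrm{div}(v\otimes v)$, which holds because $\mathrm{div}\,v = 0$.

```latex
\begin{align*}
0 &= \int_0^T\!\!\int_\Omega \big(\partial_t v + \mathrm{div}(v\otimes v) - \Delta v
      + \nabla\Pi + \mathrm{div}(E^\top E)\big)\cdot\psi \,dx\,dt \\
% integrate by parts in t and x; all boundary terms vanish because
% \psi has compact support, and the pressure term vanishes since div \psi = 0
  &= -\int_0^T\!\!\int_\Omega v\cdot\partial_t\psi \,dx\,dt
     - \int_0^T\!\!\int_\Omega v\otimes v : \nabla\psi \,dx\,dt \\
  &\quad + \int_0^T\!\!\int_\Omega \nabla v : \nabla\psi \,dx\,dt
     - \int_0^T\!\!\int_\Omega E^\top E : \nabla\psi \,dx\,dt .
\end{align*}
```

Multiplying by $-1$ and moving the last term to the right-hand side gives the first identity in the definition of a weak solution.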
In this weak formulation, the pressure $\Pi$ can be determined as in the Navier-Stokes equations, see Galdi [9]. We state here the existence of weak solutions in Theorem A of [16]:

Proposition 2.1. Assume that $u_0 \in L^2$ and $F_0 \in L^2$. Then the system (1.5) with the initial condition (1.6) and the boundary condition (1.7) has a global weak solution $(v, E, \Pi)$ such that $v \in L^2(0,T; H^1) \cap L^\infty(0,T; L^2)$, and $E \in L^2(0,T; H^1) \cap L^\infty(0,T; L^2)$, for all $T \in (0, \infty)$.

For the same initial-boundary conditions, the relation between its weak solution and its strong solution can be formulated as:

Theorem 2.2. Assume that $u_0 \in D_{A_q}^{1-\frac1p,p}$ and $F_0 \in B_{q,p}^{2(1-\frac1p)} \cap L^q$. Then its corresponding weak solution to (1.5) with (1.6) and (1.7) is unique and indeed is equal to its unique strong solution.

Usually, we call this kind of uniqueness Weak-Strong Uniqueness. For similar results on the compressible Navier-Stokes equations, we refer readers to [6,18].

3. Maximal Regularity

In this section, we recall the maximal regularities for the parabolic operator and the Stokes operator, as well as some $L^\infty$ estimates. For $T > 0$, $1 < p, q < \infty$, denote
\[ \mathcal{W}(0,T) := W^{1,p}(0,T; (L^q(\Omega))^3) \cap L^p(0,T; (W^{2,q}(\Omega))^3). \]
Throughout this paper, $C$ stands for a generic positive constant. We first recall the maximal regularity for the parabolic operator (cf. Theorem 4.10.7 and Remark 4.10.9 in [1]):
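Theorem 3.1. Given $1 < p, q < \infty$, $\omega_0 \in B_{q,p}^{2(1-\frac1p)}$ and $f \in L^p(0,T; L^q(\mathbb{R}^3)^3)$, the Cauchy problem
\[ \frac{d\omega}{dt} - \Delta\omega = f, \quad t \in (0,T), \quad \omega(0) = \omega_0, \]
has a unique solution $\omega \in \mathcal{W}(0,T)$, and
\[ \|\omega\|_{\mathcal{W}(0,T)} \le C\left(\|f\|_{L^p(0,T;L^q(\mathbb{R}^3))} + \|\omega_0\|_{B_{q,p}^{2(1-\frac1p)}}\right), \]
where $C$ is independent of $\omega_0$, $f$ and $T$. Moreover, there exists a positive constant $c_0$ independent of $f$ and $T$ such that
\[ \|\omega\|_{\mathcal{W}(0,T)} \ge c_0 \sup_{t\in(0,T)} \|\omega(t)\|_{B_{q,p}^{2(1-\frac1p)}}. \]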
Now we recall the maximal regularity for the Stokes equations (cf. Theorem 3.2 in [4]):

Theorem 3.2. Let $\Omega$ be a bounded domain with a $C^3$ boundary in $\mathbb{R}^3$ and $1 < p, q < \infty$. Assume that $u_0 \in D_{A_q}^{1-\frac1p,p}$ and $f \in L^p(\mathbb{R}^+; L^q)$. Then the system
\[ \begin{cases} \partial_t u - \Delta u + \nabla P = f, \quad \int_\Omega P\,dx = 0, \\ \mathrm{div}\,u = 0, \quad u|_{\partial\Omega} = 0, \\ u|_{t=0} = u_0, \end{cases} \]
has a unique solution $(u, P)$ satisfying the following inequality for all $T > 0$:
\[ \|u(T)\|_{D_{A_q}^{1-\frac1p,p}} + \left(\int_0^T \|(\nabla P, \Delta u, \partial_t u)\|_{L^q}^p\,dt\right)^{\frac1p} \le C\left(\|u_0\|_{D_{A_q}^{1-\frac1p,p}} + \left(\int_0^T \|f(t)\|_{L^q}^p\,dt\right)^{\frac1p}\right) \tag{3.1} \]
with $C = C(q, p, \Omega)$.

Remark 3.1. We notice that (3.1) does not include the estimate for $\|u\|_{L^p(0,T;L^q)}$. Indeed, thanks to $u|_{\partial\Omega} = 0$, Poincaré's inequality, and the fact $\int_\Omega \nabla u\,dx = 0$, we have
\[ \|u\|_{W^{2,q}} \le C\|\Delta u\|_{L^q}, \]
and then (3.1) can be rewritten as
\[ \|u(T)\|_{D_{A_q}^{1-\frac1p,p}} + \left(\int_0^T \|(\nabla P, u, \Delta u, \partial_t u)\|_{L^q}^p\,dt\right)^{\frac1p} \le C\left(\|u_0\|_{D_{A_q}^{1-\frac1p,p}} + \left(\int_0^T \|f(t)\|_{L^q}^p\,dt\right)^{\frac1p}\right). \tag{3.2} \]
We have the L ∞ estimate in the spatial variable as follows (cf. Lemma 4.1 in [4]). Lemma 3.1. Let 1 < p, q, r, s < ∞ satisfy 0
\[ T \ge \left(\frac{3}{8C(H_0 + 4CH_0)}\right)^{\frac{2q}{q-3}}. \]
This implies that the maximal time of existence will go to infinity when the initial data approaches zero. More precisely, we can show that, if the initial data is sufficiently small, the solution exists globally in time. To this end, we need some other estimates for the terms on the right side of (5.1). Indeed, by the imbedding $W^{1,q} \hookrightarrow L^\infty$, as $q > 3$, we have
\[ \|u\cdot\nabla u\|_{L^p(0,t;L^q)} \le \|u\|_{L^\infty(0,t;L^q(\Omega))}\|\nabla u\|_{L^p(0,t;L^\infty)} \le C(\|u_0\|_{L^q} + H(t))\|u\|_{L^p(0,t;W^{2,q})} \le C(H_0 + H(t))H(t). \]
Similarly, we have
\[ \|\mathrm{div}(F^\top F)\|_{L^p(0,t;L^q)} \le C(H_0 + H(t))H(t), \]
and
\[ \|u\cdot\nabla F + F\nabla u\|_{L^p(0,t;L^q)} \le C(H_0 + H(t))H(t). \]
Thus, (5.1) turns out to be
\[ H(t) \le C(H_0 + (H_0 + H(t))H(t)). \tag{5.6} \]
By the Cauchy-Schwarz inequality, (5.6) becomes
\[ H(t) \le C(H_0 + H_0^2 + 2H^2(t)), \tag{5.7} \]
for all $t \in [0, T^*)$. Now we take $H_0$ sufficiently small such that
\[ H_0 + H_0^2 \le \delta := \frac{1}{8C^2}. \tag{5.8} \]
Then, under the assumption (5.8), we compute directly from (5.7) and the continuity of $H(t)$ that
\[ H(t) \le \frac{1 - \sqrt{1 - 8C^2(H_0 + H_0^2)}}{4C} \le \frac{1}{4C}, \tag{5.9} \]
for all $t \in [0, T^*)$. In particular, this implies that
\[ \|(u, F, P)\|_{M_{T^*}^{p,q}} \le \frac{1}{4C} < \infty. \]
Hence, according to the local existence in the previous section, we can extend the solution on [0, T ∗ ) to some larger interval [0, T ∗ + T0 ) with T0 > 0. This is impossible since T ∗ is already the maximal time of existence. Hence, when the initial data satisfies (5.8), the strong solution is indeed global in time. The proof of Theorem 2.1 is complete.
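The step from (5.7) to (5.9) is a quadratic-inequality argument, which can be sketched as follows. The shorthand $a := H_0 + H_0^2$ and the names $H_\pm$ are ours, not the paper's; the sketch assumes that $H$ starts on the lower branch, $H(0) \le H_-$, and that $8C^2 a < 1$ so that the two roots are distinct.

```latex
\begin{align*}
% rewrite (5.7) as a quadratic inequality in H = H(t)
H \le C(a + 2H^2)
  &\iff 2C\,H^2 - H + C\,a \ge 0, \qquad a := H_0 + H_0^2, \\
% roots of the quadratic; the discriminant is 1 - 8 C^2 a, nonnegative by (5.8)
H_\pm &= \frac{1 \pm \sqrt{1 - 8C^2 a}}{4C}, \\
% the quadratic is nonnegative only outside the open interval (H_-, H_+)
H(t) \le H_- \quad &\text{or} \quad H(t) \ge H_+ \qquad \text{for each } t \in [0, T^*), \\
% a continuous H starting below H_- cannot jump across the gap (H_-, H_+)
H(t) \le H_- &= \frac{1 - \sqrt{1 - 8C^2 a}}{4C} \le \frac{1}{4C}.
\end{align*}
```

The final bound is exactly (5.9); it is uniform in $t$, which is what allows the continuation argument above.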
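6. Weak-Strong Uniqueness

The purpose of this section is to show Weak-Strong Uniqueness in Theorem 2.2. To this end, we need to obtain first an energy estimate for the strong solution to the system (1.5). More precisely, we have

Lemma 6.1. Let $p, q$ satisfy the same conditions as Theorem 2.1 and $(u, F, P) \in M_{T_0}^{p,q}$ be the unique solution to (1.5) on $\Omega \times [0, T_0]$. Then, one has,
\[ \frac12\int_\Omega (|u(t)|^2 + |F(t)|^2)\,dx + \int_0^t\!\!\int_\Omega (|\nabla u|^2 + |\nabla F|^2)\,dx\,ds = \frac12\int_\Omega (|u_0|^2 + |F_0|^2)\,dx. \]

Proof. Note that
\[ u \in C([0,T_0]; D_{A_q}^{1-\frac1p,p}) \cap L^p(0,T_0; W^{2,q}) \quad \text{with } q > 3. \]
Then, we have $u \in C([0,T_0]; L^2) \cap L^2(0,T_0; H^{1+\alpha})$ for some $\alpha \ge 0$, using
\[ D_{A_q}^{1-\frac1p,p} \hookrightarrow B_{q,p}^{2(1-\frac1p)} \cap L^q(\Omega) \hookrightarrow L^2(\Omega), \]
Sobolev's embedding $W^{2,q}(\Omega) \hookrightarrow H^2(\Omega)$ as $q > 3$, and the standard interpolation inequality. Similarly, $F \in C([0,T_0]; L^2) \cap L^2(0,T_0; H^{1+\alpha})$. Taking the $L^2$ scalar product in (1.5a) with $u$ and performing integration by parts, we obtain
\[ \frac12\frac{d}{dt}\int_\Omega |u|^2\,dx + \int_\Omega |\nabla u|^2\,dx = \int_\Omega F^\top F : \nabla u\,dx, \tag{6.1} \]
where the notation $A : B$ means the inner product between two matrices, i.e. $A : B = \sum_{i,j} A_{ij}B_{ij}$. Similarly, taking the $L^2$ inner product in (1.5b) with $F$ and performing integration by parts, we obtain
\[ \frac12\frac{d}{dt}\int_\Omega |F|^2\,dx + \int_\Omega |\nabla F|^2\,dx = -\int_\Omega F\nabla u : F\,dx - \int_\Omega F : (u\cdot\nabla F)\,dx, \tag{6.2} \]
where $|F|^2 = F : F$ and
\[ |\nabla F|^2 = \sum_{i,j,k}\left|\frac{\partial F_{ij}}{\partial x_k}\right|^2. \]
Notice that
\[ \int_\Omega F : (u\cdot\nabla F)\,dx = \frac12\int_\Omega u\cdot\nabla|F|^2\,dx = -\frac12\int_\Omega \mathrm{div}\,u\,|F|^2\,dx = 0, \]
and, due to $AB : C = A : CB^\top = B : A^\top C$,
\[ \int_\Omega F\nabla u : F\,dx = \int_\Omega \nabla u : F^\top F\,dx. \]
Hence, adding (6.1) and (6.2) together, we have
\[ \frac12\frac{d}{dt}\int_\Omega (|u|^2 + |F|^2)\,dx + \int_\Omega (|\nabla u|^2 + |\nabla F|^2)\,dx = 0. \]
Integrating the above equality over the time interval $[0, t]$, we obtain the energy equality of this lemma.

Now, we recall that for the weak solution $(v, E, \Pi)$ obtained in [16], we have for (almost) all $t \in (0, T)$,
\[ \frac12\int_\Omega (|v(t)|^2 + |E(t)|^2)\,dx + \int_0^t\!\!\int_\Omega (|\nabla v|^2 + |\nabla E|^2)\,dx\,ds \le \frac12\int_\Omega (|u_0|^2 + |F_0|^2)\,dx. \tag{6.3} \]
We remark that, in view of the regularity of $u$, we deduce from the weak formulation of (1.5) the following equalities:
\[ \int_\Omega v\cdot u\,dx + \int_0^t\!\!\int_\Omega \nabla u : \nabla v\,dx\,ds = \int_\Omega |u_0|^2\,dx + \int_0^t\!\!\int_\Omega E^\top E : \nabla u\,dx\,ds + \int_0^t\!\!\int_\Omega v\cdot\left(\frac{\partial u}{\partial t} + v\cdot\nabla u\right)dx\,ds, \tag{6.4} \]
and
\[ \int_\Omega F : E\,dx + \int_0^t\!\!\int_\Omega \nabla F : \nabla E\,dx\,ds = \int_\Omega |F_0|^2\,dx - \int_0^t\!\!\int_\Omega v\cdot\nabla E : F\,dx\,ds - \int_0^t\!\!\int_\Omega E\nabla v : F\,dx\,ds + \int_0^t\!\!\int_\Omega E : \frac{\partial F}{\partial t}\,dx\,ds, \tag{6.5} \]
for a.e. $t \in (0, T)$. Here, we used the identity
\[ \int_\Omega v\cdot\nabla u\cdot w\,dx = -\int_\Omega v\cdot\nabla w\cdot u\,dx, \]
for a vector $w$, if $\mathrm{div}\,v = 0$. Since $F$ satisfies Eq. (1.5b), we substitute (1.5b) into (6.5), and use the following two facts:
\[ \int_0^t\!\!\int_\Omega (v\cdot\nabla E : F + v\cdot\nabla F : E)\,dx\,ds = \int_0^t\!\!\int_\Omega v\cdot\nabla(E : F)\,dx\,ds = 0, \]
and
\[ E\nabla u : F + E : F\nabla u = \nabla u : (E^\top F + F^\top E), \]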
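to obtain
\[ \int_\Omega F : E\,dx + 2\int_0^t\!\!\int_\Omega \nabla F : \nabla E\,dx\,ds = \int_\Omega |F_0|^2\,dx - \int_0^t\!\!\int_\Omega \nabla u : (E^\top F + F^\top E)\,dx\,ds + \int_0^t\!\!\int_\Omega (v-u)\cdot\nabla F : E\,dx\,ds - \int_0^t\!\!\int_\Omega F : E\nabla(v-u)\,dx\,ds. \tag{6.6} \]
On the other hand, we can write the equation for $u$ as
\[ \frac{\partial u}{\partial t} + v\cdot\nabla u - \Delta u + \nabla P = (v-u)\cdot\nabla u - \mathrm{div}(F^\top F). \tag{6.7} \]
Multiplying (6.7) by $v$ and integrating over $\Omega \times (0, t)$, we get
\[ \int_0^t\!\!\int_\Omega v\cdot\left(\frac{\partial u}{\partial t} + v\cdot\nabla u\right)dx\,ds = -\int_0^t\!\!\int_\Omega \nabla u : \nabla v\,dx\,ds + \int_0^t\!\!\int_\Omega (v-u)\cdot\nabla u\cdot v\,dx\,ds + \int_0^t\!\!\int_\Omega F^\top F : \nabla v\,dx\,ds. \tag{6.8} \]
Substituting (6.8) into (6.4), we obtain
\[ \int_\Omega u\cdot v\,dx + 2\int_0^t\!\!\int_\Omega \nabla u : \nabla v\,dx\,ds = \int_\Omega |u_0|^2\,dx + \int_0^t\!\!\int_\Omega E^\top E : \nabla u\,dx\,ds + \int_0^t\!\!\int_\Omega (v-u)\cdot\nabla u\cdot v\,dx\,ds + \int_0^t\!\!\int_\Omega F^\top F : \nabla v\,dx\,ds. \tag{6.9} \]
Also, according to Lemma 6.1, we have
\[ \frac12\int_\Omega (|u|^2 + |F|^2)\,dx + \int_0^t\!\!\int_\Omega (|\nabla F|^2 + |\nabla u|^2)\,dx\,ds = \frac12\int_\Omega (|u_0|^2 + |F_0|^2)\,dx. \tag{6.10} \]
Summing (6.3), (6.10) and subtracting the sum of (6.6) and (6.9), we obtain for almost all $t \in (0, T)$,
\begin{align*}
&\frac12\int_\Omega (|u(t)-v(t)|^2 + |F(t)-E(t)|^2)\,dx + \int_0^t\!\!\int_\Omega (|\nabla u - \nabla v|^2 + |\nabla F - \nabla E|^2)\,dx\,ds \\
&\quad\le -\int_0^t\!\!\int_\Omega (F-E)^\top(F-E) : \nabla u\,dx\,ds - \int_0^t\!\!\int_\Omega (v-u)\cdot\nabla u\cdot v\,dx\,ds \\
&\qquad - \int_0^t\!\!\int_\Omega (v-u)\cdot\nabla F : E\,dx\,ds + \int_0^t\!\!\int_\Omega F : E\nabla(v-u)\,dx\,ds - \int_0^t\!\!\int_\Omega F^\top F : \nabla(v-u)\,dx\,ds \tag{6.11} \\
&\quad= -\int_0^t\!\!\int_\Omega (F-E)^\top(F-E) : \nabla u\,dx\,ds - \int_0^t\!\!\int_\Omega (v-u)\cdot\nabla u\cdot(v-u)\,dx\,ds \\
&\qquad - \int_0^t\!\!\int_\Omega (v-u)\cdot\nabla F : (E-F)\,dx\,ds + \int_0^t\!\!\int_\Omega (E-F)^\top F : \nabla(v-u)\,dx\,ds := I,
\end{align*}
where we used twice the fact
\[ \int_\Omega v\cdot\nabla u\cdot u\,dx = 0, \]
if $\mathrm{div}\,v = 0$. For $I$, we have, by Hölder's inequality,
\[ |I| \le \int_0^t (\|\nabla u\|_{L^\infty(\Omega)} + \|\nabla F\|_{L^\infty(\Omega)})\int_\Omega (|F-E|^2 + |u-v|^2)\,dx\,ds + \frac12\int_0^t\!\!\int_\Omega |\nabla v - \nabla u|^2\,dx\,ds + C\int_0^t \|F\|_{L^\infty}^2\|E-F\|_{L^2}^2\,ds. \tag{6.12} \]
Substituting (6.12) back to (6.11), one has
\[ \frac12\int_\Omega (|u(t)-v(t)|^2 + |F(t)-E(t)|^2)\,dx + \frac12\int_0^t\!\!\int_\Omega (|\nabla u - \nabla v|^2 + |\nabla F - \nabla E|^2)\,dx\,ds \le \int_0^t (\|\nabla u\|_{L^\infty(\Omega)} + \|\nabla F\|_{L^\infty(\Omega)} + C\|F\|_{L^\infty(\Omega)}^2)\int_\Omega (|F-E|^2 + |u-v|^2)\,dx\,ds. \tag{6.13} \]
Notice that $\|\nabla u\|_{L^\infty(\Omega)} + \|\nabla F\|_{L^\infty(\Omega)} + \|F\|_{L^\infty(\Omega)}^2 \in L^1(0, T)$. Therefore, using (6.13) together with Grönwall's inequality, we finally conclude that $u = v$, $F = E$ a.e., and thus $P = \Pi$ up to a constant in $\Omega \times (0, T)$. The proof of Theorem 2.2 is complete.

Acknowledgements. Xianpeng Hu's research was supported in part by the National Science Foundation grant DMS-0604362 and by the Mellon Predoctoral Fellowship of the University of Pittsburgh. Dehua Wang's research was supported in part by the National Science Foundation under grants DMS-0604362 and DMS-0906160, and by the Office of Naval Research under grant N00014-07-1-0668.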
References
1. Amann, H.: Linear and Quasilinear Parabolic Problems. Vol. I. Abstract linear theory. Boston, MA: Birkhäuser Boston, Inc., 1995
2. Bergh, J., Löfström, J.: Interpolation Spaces. An Introduction. Grundlehren der Mathematischen Wissenschaften, Berlin-New York: Springer-Verlag, 1976
3. Chandrasekhar, S.: Liquid Crystals. 2nd ed., Cambridge: Cambridge University Press, 1992
4. Danchin, R.: Density-dependent incompressible fluids in bounded domains. J. Math. Fluid Mech. 8, 333–381 (2006)
5. de Gennes, P.G., Prost, J.: The Physics of Liquid Crystals. New York: Oxford University Press, 1993
6. Desjardins, B.: Regularity of weak solutions of the compressible isentropic Navier-Stokes equations. Comm. Part. Diff. Eqs. 22, 977–1008 (1997)
7. Ericksen, J.L.: Conservation laws for liquid crystals. Trans. Soc. Rheology 5, 23–34 (1961)
8. Ericksen, J.L.: Continuum theory of nematic liquid crystals. Res. Mechanica 21, 381–392 (1987)
9. Galdi, G.P.: An Introduction to the Mathematical Theory of the Navier-Stokes Equations. Vol. I. Linearized steady problems. New York: Springer-Verlag, 1994
10. Hardt, R., Kinderlehrer, D.: Mathematical Questions of Liquid Crystal Theory. The IMA Volumes in Mathematics and its Applications 5, New York: Springer-Verlag, 1987
11. Hardt, R., Kinderlehrer, D., Lin, F.: Existence and partial regularity of static liquid crystal configurations. Commun. Math. Phys. 105, 547–570 (1986)
12. Leslie, F.: Some constitutive equations for liquid crystals. Arch. Rat. Mech. Anal. 28, 265–283 (1968)
13. Leslie, F.: Theory of flow phenomenum in liquid crystals. In: The Theory of Liquid Crystals, 4, London-New York: Academic Press, 1979, pp. 1–81
14. Lin, F.-H.: Nonlinear theory of defects in nematic liquid crystals; phase transition and flow phenomena. Commun. Pure. Appl. Math. 42, 789–814 (1989)
15. Lin, F.-H.: Mathematics theory of liquid crystals. In: Applied Mathematics at the Turn of the Century, Lecture Notes of the 1993 Summer School, Universidad Complutense de Madrid, Madrid: Editorial Complutense, 1995
16. Lin, F.-H., Liu, C.: Nonparabolic dissipative systems modeling the flow of liquid crystals. Comm. Pure Appl. Math. 48, 501–537 (1995)
17. Lin, F.-H., Liu, C.: Partial regularity of the dynamic system modeling the flow of liquid crystals. Disc. Cont. Dyn. Sys. 2, 1–22 (1996)
18. Lions, P.-L.: Mathematical Topics in Fluid Mechanics. Vol. 1. Incompressible models. Oxford Lecture Series in Mathematics and its Applications, 3. Oxford Science Publications. New York: The Clarendon Press/Oxford University Press, 1996
19. Liu, C., Walkington, N.J.: Approximation of liquid crystal flow. SIAM J. Numer. Anal. 37, 725–741 (2000)
20. Sun, H., Liu, C.: On energetic variational approaches in modeling the nematic liquid crystal flows. Disc. Contin. Dyn. Syst. 23, 455–475 (2009)

Communicated by P. Constantin
Commun. Math. Phys. 296, 881–898 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-0995-x
Communications in
Mathematical Physics
Ambient Metrics for n-Dimensional pp-Waves
Thomas Leistner1, Pawel Nurowski2,3
1 School of Mathematical Sciences, The University of Adelaide, Adelaide, SA 5005, Australia. E-mail: [email protected]
2 Instytut Fizyki Teoretycznej, Uniwersytet Warszawski, ul. Hoża 69, 00-681 Warszawa, Poland. E-mail: [email protected]
3 Instytut Matematyczny PAN, ul. Sniadeckich 8, 00-956 Warszawa, Poland
Received: 1 July 2009 / Accepted: 2 November 2009 Published online: 4 February 2010 – © Springer-Verlag 2010
Abstract: We provide an explicit formula for the Fefferman-Graham ambient metric of an n-dimensional conformal pp-wave in those cases where it exists. In even dimensions we calculate the obstruction explicitly. Furthermore, we describe all 4-dimensional pp-waves that are Bach-flat, and give a large class of Bach-flat examples which are conformally Cotton-flat, but not conformally Einstein. Finally, as an application, we use the obtained ambient metric to show that even-dimensional pp-waves have vanishing critical Q-curvature.

1. Introduction

Plane fronted gravitational waves, called pp-waves, are Lorentzian 4-manifolds $(M, g)$ admitting a covariantly constant null vector field $K$. In addition, their Ricci tensor $Ric$ satisfies
\[ Ric = \Phi\,\kappa \otimes \kappa, \tag{1} \]
where $\kappa$ is the 1-form on $M$ defined by $\kappa := K \lrcorner\, g$. Physicists require also that the function $\Phi$ is nonnegative for a pp-wave. This is because $\Phi$, via the Einstein field equations, is directly related to the energy momentum tensor of its gravitational field. pp-waves are important in general relativity theory since they generalize the concept of a plane wave of classical electrodynamics [41], as well as because of the fact that every 4-dimensional spacetime has a special pp-wave as a well defined limit [40], the Penrose limit, as it is called.

Higher dimensional generalizations of the 4-dimensional pp-waves were studied in [42], appeared in Kaluza-Klein theory [28,25,29,9], and later in string theory [5,6,4,35,11,36,12,18,3,37]. Their property of possessing a covariantly constant null vector field $K$ implies that they have reduced Lorentzian holonomy from the full orthogonal group $SO(1, n-1)$ to the subgroup preserving the null vector $K$. In fact, they can be characterised by having Abelian holonomy $\mathbb{R}^{n-2}$ [30,32]. As such they admit many parallel spinors: The dimension of the space of parallel spinors on an n-dimensional pp-wave is at least half of the dimension of the spinor module, [30].

This work was supported in part by the Polish Ministerstwo Nauki i Informatyzacji grant nr: 1 P03B 07529 and by the Sonderforschungsbereich 676 of the German Research Foundation.

In local coordinates $(x_i, u, r)_{i=1,\ldots,n-2}$ in $\mathbb{R}^n$, the n-dimensional pp-wave metric can be written as
\[ g = \sum_{i=1}^{n-2} (dx_i)^2 + 2\,du\,(dr + h\,du). \]
Here $h$ is an arbitrary smooth real function of the first $(n-1)$ coordinates, $h = h(x_i, u)$. The covariantly constant null vector field is $K = \partial_r$. Another property of this metric is that it has vanishing scalar curvature. Hence, if it is Einstein then it is Ricci flat. This happens if and only if
\[ \Delta h = \sum_{i=1}^{n-2} \frac{\partial^2 h}{\partial (x_i)^2} = 0. \]

Conformal classes of pp-wave metrics have remarkable properties. One of them was described by their discoverer H. W. Brinkmann already in 1925. In his seminal paper [8] Brinkmann not only studied spaces that were later called Brinkmann waves, namely Lorentzian manifolds with parallel null vector field, but he also showed the following [8, Theorems IV and VIII]: A 4-dimensional, not locally conformally flat Einstein manifold $(M, g)$ locally admits a function $\Upsilon$ such that the conformally rescaled metric $e^{2\Upsilon} g$ is again Einstein, but not homothetic to $g$, if and only if $(M, g)$ is a Ricci-flat pp-wave (or its counterpart in neutral signature1). In this case, the rescaled metric is also Ricci-flat and the gradient of $\Upsilon$ is a null vector. This occurs because the Weyl tensor $W$ of a pp-wave is null and aligned with $K$, i.e. $K \lrcorner\, W = 0$, which makes these metrics not weakly generic in the terminology of [20].

In this paper we discuss another remarkable conformal property of n-dimensional pp-wave metrics, which is related to the ambient metric construction of Fefferman and Graham [15,16], a construction that provides the geometric framework of AdS/CFT correspondence2. The ambient metric construction mimics the situation in the flat model of conformal geometry: Here the n-dimensional sphere equipped with the flat conformal structure can be viewed as the projectivisation of the light-cone in (n + 2)-dimensional Minkowski space. Letting the spheres wander along the light cone recovers the metrics in the conformal class.

1 Be aware that the coordinates in the relevant Sect. 4.2 of Brinkmann's paper [8] have to be understood as complex and complex conjugate in order to obtain Lorentzian metrics. If they are considered as real coordinates the resulting metric has neutral signature.
2 Note that in some papers from the physics literature the term Fefferman-Graham metric has a different meaning than ours. What physicists call Fefferman-Graham metric, e.g. in [2 or 13], is a related concept that Fefferman and Graham call the Poincaré-Einstein metric. How to obtain one from another is well known and we shall explain it in Sect. 7.

For a conformal class $[g]$ in signature $(p, q)$ on an $n = (p+q)$-dimensional manifold $M$ the ambient metric is a metric $\tilde g$ of signature $(p+1, q+1)$ on the product of $M$ with two intervals, $\tilde M := (-\varepsilon, \varepsilon) \times M \times (1-\delta, 1+\delta)$, $\varepsilon > 0$, $\delta > 0$, that is compatible with the conformal structure (for details see Definition 1) and, moreover, is Ricci flat. The Ricci-flat condition ensures that the ambient metric depends uniquely on the conformal structure and encodes all properties of the conformal class $[g]$ but has the downside that the ambient metric does not always exist. Starting with a formal power series
\[ \tilde g = 2\,(t\,d\rho + \rho\,dt)\,dt + t^2\left(g + \sum_{k=1}^{\infty} \rho^k \mu_k\right) \tag{2} \]
with $\rho \in (-\varepsilon, \varepsilon)$, $t \in (1-\delta, 1+\delta)$, Fefferman and Graham showed that if $n$ is odd, the Ricci-flatness of the ambient metric gives equations for $\mu_1, \mu_2, \ldots$ that can be solved in principle, but the calculations have been carried out only for very special conformal classes, mainly those that are related to Einstein spaces [34,31,19]. If $n = 2s$ is even, there is a conformally invariant obstruction to the existence of a Ricci-flat ambient metric, called the Fefferman-Graham obstruction. This obstruction is the nonvanishing of the obstruction tensor $O$, given by the term $\mu_s$. In $n = 4$ this obstruction tensor is the Bach tensor for $g$. In higher dimensions the leading term of $O$ is $\Delta_g^s(g)$, but there are a lot of lower order terms, which, again, are determined in principle, but whose calculation is very cumbersome.

One important feature of the ambient metric is that if the metric $g$ is real analytic then its corresponding ambient metric $\tilde g$ (if it exists) is also real analytic [15,16,27]. Another feature of the ambient metric is that if the conformal class of $g$ includes an Einstein metric $g_E$, then the power series in the ambient metric $\tilde g_E$ truncates at $k = 2$; in particular, for $n > 3$, even the obstruction tensor vanishes. In such a case the metric is given as a second order polynomial in each of the variables $t$ and $\rho$. However, if the metric $g$ is not conformally Einstein, then, except for a few examples [19,39], no explicit formulae for $\mu_k$, $k > 3$ are known.

In this context our main result is the following remarkable conformal property of n-dimensional pp-waves: for them all the coefficients $\mu_k$ in the ambient metric, the obstruction tensor in even dimensions, and hence, the condition under which the ambient metric truncates at a given order can be calculated explicitly. In Sect. 4 we prove

Theorem 1. Let $g = \sum_{i=1}^{n-2} (dx_i)^2 + 2\,du\,(dr + h\,du)$ be an n-dimensional pp-wave metric with a real analytic function $h = h(x_1, \ldots, x_{n-2}, u)$. Then the Fefferman-Graham ambient metric for the conformal class $[g]$ exists if and only if $n$ is odd and $h$ is arbitrary, or if $n = 2s$ is even and $\Delta^s h = 0$. In both cases the ambient metric is given by a formal power series
\[ \tilde g = 2\,d(t\rho)\,dt + t^2\left(g + \left(\sum_{k=1}^{\infty} \frac{\Delta^k h}{k!\,p_k}\,\rho^k\right) du^2\right), \]
with $p_k := \prod_{j=1}^{k} (2j - n)$ and $\Delta := \sum_{i=1}^{n-2} \partial_i^2$. In particular, if $n = 2s$ is even, the obstruction tensor $O$ is given by $O = \Delta^s h\,du^2$.

Thus if $n = 2s$ is even, the ambient metric $\tilde g$ is a polynomial of order $s - 1$ in the variable $\rho$. If $n$ is odd, since the metric $g$ is real analytic, the Fefferman-Graham result guarantees that the above metric $\tilde g$ is also real analytic. This in particular means that the power series $\sum_{k=1}^{\infty} \frac{\Delta^k h}{k!\,p_k}\,\rho^k$ converges to a real analytic function in the variable $\rho$.

Theorem 1 provides us with a variety of examples of conformal structures with explicit ambient metrics and which, in general, are not conformally Einstein. For example, every polynomial $h$ in the $x_i$'s of order lower than $k$, with coefficients being functions of $u$, represents a pp-wave with ambient metric truncated at order lower than $k/2$. In Sect. 6 we construct more general examples than those defined by $h$ being polynomials in the $x_i$'s. In particular, in dimension four we find all Bach-flat 4-dimensional pp-waves and we prove that most of them are not conformally Einstein. They are defined by quite general functions $h$ and have ambient metrics which are linear in the variable $\rho$. It is interesting to note that these pp-waves, although Bach-flat and conformal to Cotton-flat, are not conformally Einstein. Theorem 1 implies also another interesting feature of the pp-waves: their obstruction tensor $O$ (in even dimensions) involves only the terms of the highest possible order in the derivatives of their metric; since all the lower order terms that are usually present in the obstruction tensor are vanishing, the pp-waves are, in a sense, the closest cousins of the conformally Einstein metrics.

Using the explicit form of the ambient metric and the main result of [24], in Sect. 7 we show that for even-dimensional pp-waves the critical Q-curvature vanishes. This result is in correspondence with the fact that for a pp-wave all scalar invariants constructed from the curvature tensor vanish (for the proof in arbitrary dimension see [10]). In the final Sect. 8 we study the holonomy of the ambient metric of a pp-wave in relation to results in [31]. We show that it is contained in the stabiliser of a totally null plane.

2. The Fefferman-Graham Ambient Metric

An important tool in order to construct invariants in conformal geometry is the so-called Fefferman-Graham ambient metric or ambient space (see [15 and 16]). Let $(M, [g])$ be a smooth n-dimensional manifold $M$ with conformal structure $[g]$ of signature $(p, q)$ with the conformal frame bundle $P^0$. It can also be characterised by a principal $\mathbb{R}^+$-fibre bundle $\pi : Q \to M$ defined as the ray sub-bundle in the bundle of metrics of signature $(p, q)$ given by metrics in the conformal class $[g]$. The action of $\mathbb{R}^+$ on $Q$ shall be denoted by $\varphi$: $\varphi(t, g_x) = t^2 g_x$. From [16] we adopt the following notation.

Definition 1. Let $(M, [g])$ be a conformal structure of signature $(p, q)$ over an n-dimensional manifold $M$, and $\pi : Q \to M$ the corresponding ray bundle.
A semi-Riemannian manifold $(\widetilde{M}, \widetilde{g})$ of signature (p + 1, q + 1) is called a pre-ambient space if
(1) there is a free $\mathbb{R}^+$-action $\widetilde{\varphi}$ on $\widetilde{M}$, and
(2) an embedding $\iota : Q \to \widetilde{M}$ that is $\mathbb{R}^+$-equivariant.
(3) If F is the fundamental vector field of $\widetilde{\varphi}$, and $\mathcal{L}$ denotes the Lie derivative, then $\mathcal{L}_F \widetilde{g} = 2\widetilde{g}$, i.e. the metric $\widetilde{g}$ is homogeneous of degree 2 with respect to the $\mathbb{R}^+$-action.
(4) Any $g_x \in Q$ satisfies the equality $(\iota^*\widetilde{g})_{g_x} = g_x(d\pi(.), d\pi(.))$ in $\odot^2 T^*_{g_x} Q$.
A pre-ambient space is called an ambient space if its Ricci curvature vanishes.

Under the assumption that the conformal structure is given by a real analytic metric, in odd dimensions a Ricci-flat ambient metric always exists and is also real analytic. In even dimensions n ≥ 4, the existence of a Ricci-flat ambient metric is obstructed by the nonvanishing of the obstruction tensor O, [16, p. 22]. This is a symmetric, trace-free and divergence-free (2, 0)-tensor, which is conformally invariant of weight (2 − n), i.e. if $\hat{g} = e^{2\varphi} g \in [g]$, then $\hat{O} = e^{(2-n)\varphi} O$. It is given by
$$ O = \Delta_g^{n/2-2}\left(\Delta_g P - \nabla^2 J\right) + \text{lower order terms}, $$
Ambient Metrics for n-Dimensional pp-Waves
where $P = \frac{1}{n-2}\left(\mathrm{Ric} - \frac{\mathrm{scal}}{2(n-1)}\, g\right)$ is the Schouten tensor, J its trace, and $\Delta_g$ denotes the Laplacian of g ∈ [g]. For a conformal class in even dimension that is given by a real analytic metric with vanishing obstruction tensor, the ambient metric exists and is also real analytic.

Fixing a metric g in the conformal class, in [15,16] it is shown that an ambient space near M can be written as
$$ \widetilde{M} = (-\epsilon, \epsilon) \times M \times (1-\delta, 1+\delta) $$
with the ambient metric
$$ \widetilde{g} = 2t\,d\rho\,dt + 2\rho\,dt^2 + t^2 g(\rho), $$
in which g(ρ) is a one-parameter family of metrics on M with g(0) = g. This is referred to as $\widetilde{g}$ being in normal form. As the ambient metric is analytic, one can write the family g(ρ) as a power series in ρ,
$$ \widetilde{g} = 2t\,d\rho\,dt + 2\rho\,dt^2 + t^2\left(g + \rho g' + \tfrac{1}{2}\rho^2 g'' + \tfrac{1}{6}\rho^3 g''' + \dots\right), $$
with $g' = \partial_\rho g(0)$, and similarly for the higher derivatives. We summarise the results for the ambient metric in

Theorem 2 ([15,16,27]). Let (M, [g]) be a real analytic manifold M of dimension n ≥ 2 equipped with a conformal structure defined by a real analytic semi-Riemannian metric g.
(1) If n is odd, or if n is even with O = 0, then there exists an ambient space $(\widetilde{M}, \widetilde{g})$ with real analytic Ricci-flat metric $\widetilde{g}$.
(2) If n is odd the ambient space is unique modulo diffeomorphisms that restrict to the identity along $Q \subset \widetilde{M}$ and commute with $\widetilde{\varphi}$. If n is even with O = 0, the ambient space is unique, modulo the same set of diffeomorphisms and modulo terms of order ≥ n/2 in ρ, where ρ is the coordinate in the normal form of the ambient metric.

The Ricci-flat condition then determines symmetric (2, 0)-tensors $\mu_k$ such that
$$ \widetilde{g} = 2t\,d\rho\,dt + 2\rho\,dt^2 + t^2\left(g + \sum_{k=1}^{\infty} \rho^k \mu_k\right). $$
In [16] the first $\mu_k$ are determined explicitly:
$$
\begin{aligned}
(\mu_1)_{ab} &= 2P_{ab},\\
(n-4)(\mu_2)_{ab} &= -B_{ab} + (n-4)P_a{}^c P_{bc},\\
3(n-4)(n-6)(\mu_3)_{ab} &= \Delta_g B_{ab} - 2W_{cabd}B^{cd} - 4(n-6)P_{c(a}B_{b)}{}^{c} - 4P_c{}^c B_{ab}\\
&\quad + 4(n-4)P^{cd}\nabla_d C_{(ab)c} - 2(n-4)C^c{}_a{}^d C_{dbc} + (n-4)C_a{}^{cd}C_{bcd}\\
&\quad + 2(n-4)\nabla_d P_c{}^c\, C_{(ab)}{}^{d} - 2(n-4)W_{cabd}P^c{}_e P^{ed},
\end{aligned}
\tag{3}
$$
where $W_{abcd}$ is the Weyl tensor, $P_{ab}$ is the Schouten tensor, $C_{abc} := \nabla_c P_{ab} - \nabla_b P_{ac}$ is the Cotton tensor, and $B_{ab} = \nabla^c C_{abc} - P^{cd}W_{cabd}$ is the Bach tensor.
3. pp-Waves and Their Curvature

A pp-wave is a Lorentzian manifold with a parallel null vector field K, i.e. ∇K = 0, K ≠ 0, and g(K, K) = 0, whose curvature tensor satisfies the trace condition
$$ R_{ab}{}^{ef} R_{efcd} = 0. \tag{4} $$
If we denote by κ the one-form given by $\kappa := K \lrcorner\, g$, the curvature condition (4) is equivalent to each of the following, in which [ab] denotes the skew symmetrisation with respect to a and b, [42]:
(1) $\kappa_{[a}R_{bc]de} = 0$;
(2) there is a symmetric (2, 0)-tensor $\Xi$ with $K \lrcorner\, \Xi = 0$, such that $R_{abcd} = \kappa_{[a}\Xi_{b][c}\kappa_{d]}$;
(3) there is a function ϕ, such that $R_a{}^e{}_b{}^f R_{ecdf} = \varphi\,\kappa_a\kappa_b\kappa_c\kappa_d$.

The Ricci tensor of a pp-wave is given by $\mathrm{Ric} = \Phi\,\kappa \otimes \kappa$, for a smooth function $\Phi$. In dimension n = 4 this is even equivalent to the curvature condition (4). In [31] we gave another equivalent definition, without using coordinates or traces, but identifying a pp-wave as a Lorentzian manifold with parallel null vector field K, whose curvature satisfies
$$ \mathrm{Im}\, R(U, V)|_{K^\perp} \subset \mathbb{R}\cdot K \quad \text{for all } U, V \in TM. \tag{5} $$
This equivalence allows for several generalisations [32] and for an easy proof of another equivalence that is related to holonomy: An n-dimensional Lorentzian manifold is a pp-wave if and only if its holonomy group is contained in the Abelian subgroup $\mathbb{R}^{n-2}$ of the stabiliser in SO(1, n − 1) of a null vector [30].

Locally, an n-dimensional pp-wave admits coordinates $(x^1, \dots, x^{n-2}, u, r)$ such that the metric is given by
$$ g = \sum_{i=1}^{n-2} (dx^i)^2 + 2du\,(dr + h\,du), \tag{6} $$
with h being a smooth real function of the first (n − 1) coordinates, $h = h(x^i, u)$, [42]. In these coordinates the parallel null vector field K is given by $\partial_r$ and, up to symmetries, the only non-vanishing curvature terms of a pp-wave are
$$ R(\partial_i, \partial_u, \partial_j, \partial_u) = \partial_i\partial_j h. $$
Here we use the obvious notation $\partial_r := \frac{\partial}{\partial r}$, $\partial_u := \frac{\partial}{\partial u}$ and $\partial_i := \frac{\partial}{\partial x^i}$, i = 1, . . . , n − 2. Hence, the function determining the Ricci tensor is given by $\Phi = -\Delta h$ with $\Delta h = \sum_{i=1}^{n-2}\partial_i^2 h$, i.e.
$$ \mathrm{Ric} = -\Delta h\, du^2. \tag{7} $$
Hence, the image of the Ricci tensor is totally null, and the scalar curvature vanishes. With this at hand, one can easily calculate the tensors related to the conformal geometry of a pp-wave. First, there is the Schouten tensor
$$ P = \frac{1}{n-2}\,\mathrm{Ric} = -\frac{\Delta h}{n-2}\, du^2. \tag{8} $$
Secondly, the Weyl tensor is given by
$$ W(\partial_i, \partial_u, \partial_j, \partial_u) = \partial_i\partial_j h - \delta_{ij}\frac{\Delta h}{n-2}, \tag{9} $$
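and for n > 3 we obtain that $\partial_i\partial_j h = \delta_{ij}\frac{\Delta h}{n-2}$ as an equivalent condition on h for g being conformally flat. Next, we calculate the Cotton tensor C. As $\nabla P = -\frac{1}{n-2}\, d(\Delta h) \otimes du^2$ one obtains that
$$ C(\partial_u, \partial_i, \partial_u) = -C(\partial_u, \partial_u, \partial_i) = \frac{\partial_i \Delta h}{n-2} \tag{10} $$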
are the only non-vanishing components of the Cotton tensor. Hence, $\partial_i \Delta h = 0$ is the condition on h for 3-dimensional conformally flat pp-waves. Furthermore, we obtain the Bach tensor B,
$$ B = -\frac{\Delta^2 h}{n-2}\, du^2. \tag{11} $$
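The step from the Cotton tensor (10) to the Bach tensor (11) can be spelled out as follows. This is only a sketch under the conventions stated in Sect. 2, $C_{abc} = \nabla_c P_{ab} - \nabla_b P_{ac}$ and $B_{ab} = \nabla^c C_{abc} - P^{cd}W_{cabd}$, and it assumes (as one checks in the coordinates (6)) that the Christoffel corrections to $\nabla^c C_{uuc}$ drop out, because they only multiply components of C with an index along $\partial_r$, which vanish.

```latex
% Sketch: the only component of B that can be non-zero is B_{uu}.
\begin{align*}
C_{uui} &= -\frac{\partial_i \Delta h}{n-2}
  && \text{by (10)},\\
\nabla^c C_{uuc} &= \sum_{i=1}^{n-2} \partial_i C_{uui}
  = -\frac{1}{n-2}\sum_{i=1}^{n-2} \partial_i^2\, \Delta h
  = -\frac{\Delta^2 h}{n-2},\\
P^{cd} W_{cuud} &= P^{rr}\, W_{ruur} = 0
  && \text{since } P \propto du^2 \text{ and } W(\partial_r,\cdot,\cdot,\cdot) = 0,\\
B_{uu} &= -\frac{\Delta^2 h}{n-2}.
\end{align*}
```

The second term vanishes because raising both indices of $P = P_{uu}\,du^2$ leaves only the component $P^{rr}$, which is contracted with Weyl components along the parallel null direction.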
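This enables us to calculate the next terms in the ambient metric expansion in Eqs. (3) beyond $\mu_1 = 2P = -\frac{2\Delta h}{n-2}\, du^2$, namely
$$
\begin{aligned}
\mu_2 &= -\frac{1}{n-4}\, B = \frac{\Delta^2 h}{(n-2)(n-4)}\, du^2,\\
\mu_3 &= \frac{1}{3(n-4)(n-6)}\, \Delta B = -\frac{\Delta^3 h}{3(n-2)(n-4)(n-6)}\, du^2.
\end{aligned}
$$
Here we used that all terms in (3) that are quadratic in P, B, C and W vanish for a pp-wave, since each of these tensors is annihilated by the null vector field $\partial_r$.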
The very simple structure of $\mu_1$, $\mu_2$, and $\mu_3$ above, and in particular the appearance of the consecutive powers of the Laplacian, suggests that this pattern may also be present in the next terms in the ambient metric expansion. That this is really the case will be proven in the next section.

4. The pp-Wave Ambient Metric

Looking at the very simple form of the pp-wave metric (6) and the general formula for the ambient metrics (2), our ansatz for the ambient metric for this g is
$$ \bar{g} = 2\,d(\rho t)\,dt + t^2\left(2du\,(dr + (h + H)\,du) + \sum_{i=1}^{n-2}(dx^i)^2\right), \tag{12} $$
where
$$ H = H(\rho, x^i, u), \quad \text{and} \quad H(\rho, x^i, u)|_{\rho=0} = 0. \tag{13} $$
If we were able to find an analytic function H satisfying (13) and for which the metric (12) was Ricci flat then, by the uniqueness of the Fefferman-Graham Theorem 2, we would conclude that g¯ with this H is the ambient metric for (6). Thus to check our guess it is enough to calculate the Ricci tensor for (12) and to check if its vanishing is possible for the function H in the postulated form (13).
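Lemma 1. The Ricci tensor of the metric (12) is
$$ \mathrm{Ric}(\bar{g}) = \left((2-n)H_\rho + 2\rho H_{\rho\rho} - \Delta H - \Delta h\right) du^2. $$
Here $\Delta H = \sum_{i=1}^{n-2}\frac{\partial^2 H}{\partial (x^i)^2}$, $H_\rho = \frac{\partial H}{\partial \rho}$, etc.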
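Proof. We start with a coframe
$$ \theta^0 = d(\rho t), \quad \theta^i = t\,dx^i, \quad \theta^{n-1} = t^2(dr + (h + H)\,du), \quad \theta^n = du, \quad \theta^{n+1} = dt, \tag{14} $$
in which the metric $\bar{g}$ reads:
$$ \bar{g} = \bar{g}_{\mu\nu}\theta^\mu\theta^\nu = 2\theta^0\theta^{n+1} + 2\theta^{n-1}\theta^n + \sum_{i=1}^{n-2}(\theta^i)^2, \qquad \mu, \nu = 0, 1, \dots, n+1. $$
It has the following differentials:
$$
\begin{aligned}
d\theta^0 &= 0,\\
d\theta^i &= -t^{-1}\theta^i \wedge \theta^{n+1}, \qquad \forall i = 1, \dots, n-2,\\
d\theta^{n-1} &= tH_\rho\, \theta^0 \wedge \theta^n + t\sum_{i=1}^{n-2}(h_i + H_i)\,\theta^i \wedge \theta^n - 2t^{-1}\theta^{n-1} \wedge \theta^{n+1} + \rho t H_\rho\, \theta^n \wedge \theta^{n+1},\\
d\theta^n &= 0,\\
d\theta^{n+1} &= 0.
\end{aligned}
$$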
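In this coframe the Levi-Civita connection 1-forms, i.e. matrix-valued 1-forms satisfying $d\theta^\mu + \Gamma^\mu{}_\nu \wedge \theta^\nu = 0$, $\Gamma_{\mu\nu} + \Gamma_{\nu\mu} = 0$, $\Gamma_{\mu\nu} = \bar{g}_{\mu\sigma}\Gamma^\sigma{}_\nu$, are:
$$
\begin{aligned}
\Gamma_{0\,n} &= -tH_\rho\, \theta^n,\\
\Gamma_{i\,n} &= -t(h_i + H_i)\,\theta^n,\\
\Gamma_{n-1\,n} &= t^{-1}\theta^{n+1},\\
\Gamma_{i\,n+1} &= t^{-1}\theta^i,\\
\Gamma_{n-1\,n+1} &= t^{-1}\theta^n,\\
\Gamma_{n\,n+1} &= t^{-1}\theta^{n-1} - \rho t H_\rho\, \theta^n.
\end{aligned}
\tag{15}
$$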
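Modulo the symmetry $\Gamma_{\mu\nu} = -\Gamma_{\nu\mu}$ all other connection 1-forms are zero. The curvature 2-forms $\Omega_{\mu\nu} = d\Gamma_{\mu\nu} + \Gamma_{\mu\rho} \wedge \Gamma^\rho{}_\nu$ have the following nonvanishing components:
$$
\begin{aligned}
\Omega_{0\,n} &= -H_{\rho\rho}\,\theta^0 \wedge \theta^n - \sum_{i=1}^{n-2} H_{i\rho}\,\theta^i \wedge \theta^n - \rho H_{\rho\rho}\,\theta^n \wedge \theta^{n+1},\\
\Omega_{i\,n} &= -H_{i\rho}\,\theta^0 \wedge \theta^n - \sum_{k=1}^{n-2}(\delta_{ik}H_\rho + H_{ik} + h_{ik})\,\theta^k \wedge \theta^n - \rho H_{i\rho}\,\theta^n \wedge \theta^{n+1},\\
\Omega_{n\,n+1} &= -\rho H_{\rho\rho}\,\theta^0 \wedge \theta^n - \sum_{i=1}^{n-2}\rho H_{i\rho}\,\theta^i \wedge \theta^n - \rho^2 H_{\rho\rho}\,\theta^n \wedge \theta^{n+1},
\end{aligned}
\tag{16}
$$
together with the components that are implied by the symmetry $\Omega_{\mu\nu} = -\Omega_{\nu\mu}$. The Riemann tensor $R_{\mu\nu\rho\sigma}$, defined by $\Omega_{\mu\nu} = \frac{1}{2}R_{\mu\nu\rho\sigma}\theta^\rho \wedge \theta^\sigma$, can be read off from Eqs. (16). Using this and the inverse of the metric $\bar{g}^{\mu\nu}$, $\bar{g}_{\mu\rho}\bar{g}^{\rho\nu} = \delta_\mu^\nu$, we calculate the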
Ricci tensor $R_{\mu\nu} = \bar{g}^{\rho\sigma}R_{\rho\mu\sigma\nu}$. It turns out that it has
$$ R_{nn} = -2R_{0\,n\,n\,n+1} + \sum_{i=1}^{n-2} R_{i\,n\,i\,n} $$
as its only nonvanishing component. Explicitly:
$$ R_{nn} = 2\rho H_{\rho\rho} - (n-2)H_\rho - \Delta H - \Delta h. $$
This finishes the proof of the lemma.

The lemma shows that the metric $\bar{g}$ is Ricci flat if and only if the function H satisfies the following PDE:
$$ (2-n)H_\rho + 2\rho H_{\rho\rho} - \Delta H = \Delta h. \tag{17} $$
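As a consistency check on the connection 1-forms (15), written here as $\Gamma_{\mu\nu}$, one can verify the first structure equation for the only non-trivial differential, $d\theta^{n-1}$. The following is a sketch, not part of the original proof. It uses only the coframe (14), the forms listed in (15), their antisymmetry, and the fact that $\bar{g}_{0\,n+1} = \bar{g}_{n-1\,n} = 1$, so that $\Gamma^{n-1}{}_\nu = \Gamma_{n\nu}$.

```latex
\begin{align*}
\Gamma^{n-1}{}_{\nu} \wedge \theta^{\nu}
 &= \Gamma_{n\,0} \wedge \theta^{0}
  + \sum_{i=1}^{n-2} \Gamma_{n\,i} \wedge \theta^{i}
  + \Gamma_{n\,n-1} \wedge \theta^{n-1}
  + \Gamma_{n\,n+1} \wedge \theta^{n+1} \\
 &= t H_\rho\, \theta^{n} \wedge \theta^{0}
  + t \sum_{i=1}^{n-2} (h_i + H_i)\, \theta^{n} \wedge \theta^{i}
  - t^{-1}\, \theta^{n+1} \wedge \theta^{n-1}
  + \left(t^{-1} \theta^{n-1} - \rho t H_\rho\, \theta^{n}\right) \wedge \theta^{n+1} \\
 &= -\Big( t H_\rho\, \theta^{0} \wedge \theta^{n}
  + t \sum_{i=1}^{n-2} (h_i + H_i)\, \theta^{i} \wedge \theta^{n}
  - 2 t^{-1}\, \theta^{n-1} \wedge \theta^{n+1}
  + \rho t H_\rho\, \theta^{n} \wedge \theta^{n+1} \Big) \\
 &= - d\theta^{n-1}.
\end{align*}
```

The remaining structure equations follow in the same way and involve only the terms proportional to $t^{-1}$.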
For $\bar{g}$ to be the ambient metric for (6) we in addition require the initial condition (13). By looking for the solution of the initial value problem (17), (13) in the form of a power series
$$ H = \sum_{k=0}^{\infty} a_k \rho^k, \tag{18} $$
we immediately get $a_0 = 0$ from the initial condition (13). Then inserting (18) in (17), we easily arrive at

Proposition 1. If n = 2s + 1, s ≥ 1, then the initial value problem (17), (13) has a unique power series solution. It is given by:
$$ H = \sum_{k=1}^{\infty} \frac{\Delta^k h}{k!\,\prod_{i=1}^{k}(2i-n)}\, \rho^k. \tag{19} $$
If n = 2s the power series solution exists only if $\Delta^s h = 0$. If this is the case, the solution is also unique and given by the power series (19), which truncates to a polynomial of order (s − 1) in the variable ρ.

This proposition proves our Theorem 1 of the Introduction. Note that the solution we found is a solution to Eq. 3.17 in [16] that was derived for the Taylor expansion of the ambient metric, here specified for a pp-wave. In particular, for n = 2s the obstruction tensor of an n-dimensional pp-wave is given by $O = \Delta^s h\, du^2$. With this result at hand, every polynomial h in the $x^i$'s of order lower than 2k, with coefficients being functions of u, gives an example of a pp-wave for which the ambient metric truncates to a polynomial of order lower than k. This gives plenty of examples of explicit ambient metrics, also in even dimensions. Moreover, choosing h properly, one gets examples for which the conformal class does not contain an Einstein metric. This will be the aim of Sect. 6. But first we address the issue of convergence of H in odd dimensions.
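The recursion behind the series (19) can be made explicit. The following sketch inserts the power series ansatz with $a_0 = 0$ into the PDE (17) and compares coefficients; the coefficient functions $a_k = a_k(x^i, u)$ are the only notation used beyond that of the text.

```latex
\begin{align*}
(2-n)H_\rho + 2\rho H_{\rho\rho} - \Delta H
 &= \sum_{k \ge 1} \big[(2-n)k + 2k(k-1)\big]\, a_k\, \rho^{k-1}
  - \sum_{k \ge 1} \Delta a_k\, \rho^{k} \\
 &= \sum_{k \ge 1} k(2k-n)\, a_k\, \rho^{k-1}
  - \sum_{k \ge 1} \Delta a_k\, \rho^{k}
  \;=\; \Delta h . \\[4pt]
\rho^{0}:\quad & (2-n)\, a_1 = \Delta h, \\
\rho^{k-1},\ k \ge 2:\quad & k(2k-n)\, a_k = \Delta a_{k-1}
 \quad\Longrightarrow\quad
 a_k = \frac{\Delta^k h}{k!\, \prod_{i=1}^{k} (2i-n)} . \\[4pt]
n = 2s,\ k = s:\quad & 0 \cdot a_s = \Delta a_{s-1} \propto \Delta^{s} h
 \quad\Longrightarrow\quad \Delta^{s} h = 0 \text{ is necessary.} \\[4pt]
n = 4:\quad & H = -\tfrac{1}{2}\, \rho\, \Delta h
 \quad \text{provided } \Delta^2 h = 0 .
\end{align*}
```

Since (12) gives $g(\rho) = g + 2H\,du^2$, the coefficients of the ambient metric expansion are $\mu_k = 2a_k\,du^2$; for k = 1 this reproduces $\mu_1 = 2P = -\frac{2\Delta h}{n-2}\,du^2$ from (8).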
5. Convergence in Three Dimensions

In odd dimensions the solution to the Ricci-flat equation, H in (19), may be given by an infinite series. Since H contains only natural powers of ρ, general arguments as in [16] ensure that H converges for an analytic function h and is analytic as well, [21]. Here we give a simple argument that proves convergence for n = 3:

Proposition 2. Let h be a function on $\mathbb{C} \times \mathbb{R}$ of variables (z, u) which is an entire holomorphic function in $z = x + iy \in \mathbb{C}$, is continuous in $u \in \mathbb{R}$, and is real for $z = x \in \mathbb{R}$. Then the series
$$ H(x, u, \rho) = \sum_{k=1}^{\infty} \frac{(\Delta^k h)(x, u)}{k!\,\prod_{i=1}^{k}(2i-3)}\, \rho^k \tag{20} $$
converges uniformly on compact subsets of $\mathbb{R}^3$.

Proof. Let R > 1, ε > 0 and ν > 0 be real numbers and let $C = \sup\{|h(z, u)|\}$ over all values of (z, u) such that $|z - x| \le R + 2\epsilon$, $|u| \le \nu$, and $|x| \le \epsilon$. Then by the Cauchy estimates for holomorphic functions, the k-th derivative of h at every real point $(x, u) \in [-\epsilon, \epsilon] \times [-\nu, \nu]$ satisfies $|h^{(k)}(x, u)| \le \frac{C\,k!}{R^k}$. This provides the following estimate for the values of the powers of the Laplacian $\Delta^k h = \frac{d^{2k}h}{dz^{2k}}$: for all $(x, u) \in [-\epsilon, \epsilon] \times [-\nu, \nu]$ we have
$$ |(\Delta^k h)(x, u)| \le \frac{C\,(2k)!}{R^{2k}}. \tag{21} $$
Now we rewrite (20) in the equivalent form
$$ H = -\rho\,\Delta h - \sum_{k=1}^{\infty} \frac{\Delta^{k+1}h}{(k+1)! \cdot 1 \cdot 3 \cdots (2k-1)}\, \rho^{k+1}. $$
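To show that H converges it is enough to show the convergence of the power series above. This can be done by using the estimate (21):
$$
\begin{aligned}
\sum_{k=1}^{\infty}\left|\frac{\Delta^{k+1}h}{(k+1)! \cdot 1 \cdot 3 \cdots (2k-1)}\, \rho^{k+1}\right|
&\le C \sum_{k=1}^{\infty} \frac{(2k+2)!}{(k+1)! \cdot 1 \cdot 3 \cdots (2k-1)} \left(\frac{|\rho|}{R^2}\right)^{k+1}\\
&= C \sum_{k=1}^{\infty} \frac{(2 \cdot 4 \cdots 2k)\,(2k+1)(2k+2)}{(k+1)!} \left(\frac{|\rho|}{R^2}\right)^{k+1}
= C \sum_{k=1}^{\infty} b_k \left(\frac{|\rho|}{R^2}\right)^{k+1}.
\end{aligned}
$$
Since
$$ \frac{|b_{k+1}|}{|b_k|} = \frac{2(k+1)(2k+3)(2k+4)}{(k+2)(2k+1)(2k+2)} \longrightarrow 2 \quad \text{as } k \to \infty, $$
this series converges for $|\rho| < \frac{R^2}{2}$. As h is entire in z, R can be chosen arbitrarily large. This finishes the proof.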
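The coefficients $b_k$ in the estimate above have a simple closed form, which makes the ratio test transparent. This is a short worked computation, using only the definition of $b_k$ from the proof and the abbreviation $x := |\rho|/R^2$, which is introduced here for convenience.

```latex
\begin{align*}
b_k &= \frac{(2 \cdot 4 \cdots 2k)\,(2k+1)(2k+2)}{(k+1)!}
     = \frac{2^k\, k!\,(2k+1)\cdot 2(k+1)}{(k+1)!}
     = 2^{k+1}(2k+1), \\
\frac{b_{k+1}}{b_k} &= \frac{2\,(2k+3)}{2k+1} \;\longrightarrow\; 2
     \quad (k \to \infty), \\
\sum_{k \ge 1} b_k\, x^{k+1} &= \sum_{k \ge 1} (2k+1)\,(2x)^{k+1}
     \quad \text{converges if and only if } 2x < 1 .
\end{align*}
```

In particular the majorant series diverges at $|\rho| = R^2/2$, so the estimate yields convergence on the open interval $|\rho| < R^2/2$, which is still sufficient because R is arbitrary.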
6. Bach Flat Metrics that are not Conformally Einstein

With Eq. (11) it is obvious how to obtain Bach-flat pp-waves. It is more difficult to find those that are not conformally Einstein. In this section we want to give examples of 4-dimensional pp-waves that are both Bach flat and not conformal to Einstein. But first we have to review some necessary conditions for being conformal to Einstein given in [20] for any dimension. In this section, when we write ‘conformal to’ we mean ‘locally conformal to’. From the formulae for the transformation of the Schouten tensor under conformal changes of the metric one obtains that a metric is conformal to an Einstein metric if and only if there exists a scaling function ϒ such that
$$ P - \nabla d\Upsilon + (d\Upsilon)^2 \ \text{ is pure trace.} \tag{22} $$
In the following we write Y for the gradient of ϒ. In [20, Prop. 2.1] the following necessary conditions for the metric to be conformal to Einstein were derived from Eq. (22):
$$ C + W(Y, ., ., .) = 0, \tag{23} $$
$$ B + (n-4)\,W(Y, ., ., Y) = 0. \tag{24} $$
Note that the first condition is satisfied for a gradient Y if and only if the metric is conformally equivalent to a metric with vanishing Cotton tensor, i.e. if it is conformally Cotton-flat. We further mention that the property of being conformally Cotton-flat is also necessary for the metric to be conformally Einstein [20]. For a pp-wave conditions (23) and (24) are equivalent to the following:

Proposition 1. If the pp-wave (6) is conformally Einstein but not conformally flat and n > 3, then there is a vector field Y on M, whose components $Y^i := dx^i(Y)$, i = 1, . . . , n − 2, and $Y^{n-1} := du(Y)$ satisfy the equations
$$ \partial_i \Delta h - Y^i \Delta h + (n-2)\sum_{k=1}^{n-2} Y^k \partial_k\partial_i h = 0, \tag{25} $$
$$ \Delta^2 h - (n-4)\,\Delta h \sum_{k=1}^{n-2}\left(Y^k\right)^2 + (n-2)(n-4)\sum_{k,l=1}^{n-2} Y^k Y^l \partial_k\partial_l h = 0, \tag{26} $$
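for i = 1, . . . , n − 2, and $Y^{n-1} = 0$.

Proof. Writing $Y = Y^k\partial_k + Y^{n-1}\partial_u + dr(Y)\,\partial_r$, Eq. (23) and the formulae in Sect. 3 give
$$
\begin{aligned}
0 &= Y^{n-1}\, W(\partial_u, \partial_i, \partial_u, \partial_j),\\
0 &= \frac{\partial_i \Delta h}{n-2} + Y^k\left(\partial_k\partial_i h - \delta_{ki}\frac{\Delta h}{n-2}\right).
\end{aligned}
$$
These, when n > 3, imply both $Y^{n-1} = 0$ and Eq. (25). Equation (24) gives that
$$ 0 = -\frac{\Delta^2 h}{n-2} - (n-4)\,Y^k Y^l\left(\partial_k\partial_l h - \delta_{kl}\frac{\Delta h}{n-2}\right), \tag{27} $$
which implies Eq. (26).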
Writing Y as the gradient of ϒ,
$$ Y = \sum_{k=1}^{n-2} \partial_k\Upsilon\, \partial_k + \partial_r\Upsilon\, \partial_u + (\partial_u\Upsilon - h\,\partial_r\Upsilon)\, \partial_r, $$
the proposition implies that $du(Y) = \partial_r\Upsilon = 0$. Hence, $\partial_r(dr(Y)) = \partial_r(\partial_u\Upsilon - h\,\partial_r\Upsilon) = 0$, and we obtain

Corollary 1. Let g be a pp-wave that is conformally Einstein but not conformally flat in dimension n > 3, and let Y be the gradient of the scaling function ϒ satisfying Eq. (22). Then the function $Y^n = dr(Y)$ does not depend on the r-variable.

Example 1. For n = 3 a third order polynomial h in x with coefficients being functions of u defines a pp-wave with non-vanishing Cotton tensor. Hence, it is not conformally flat and therefore not conformally Einstein.

Example 2. Set $M = \mathbb{R}^n$ and $h = (x^1)^4 + \dots + (x^{n-2})^4$. Then, $\partial_i\partial_j h \neq \delta_{ij}\frac{\Delta h}{n-2}$ on open sets in M and hence, g is not conformally flat. On the other hand, Eq. (26) can never be satisfied in 0 ∈ M, because here all second order derivatives of h vanish, but $\Delta^2 h = 24(n-2)$. Thus, the pp-wave defined by $h = (x^1)^4 + \dots + (x^{n-2})^4$ is not conformally Einstein.
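The derivatives used in Example 2 can be computed directly. The following sketch assumes n ≥ 4, so that there are at least two transverse coordinates, and evaluates Eq. (26) at the origin.

```latex
\begin{align*}
h &= \sum_{i=1}^{n-2} (x^i)^4, \qquad
\partial_k \partial_l h = 12\,(x^k)^2\, \delta_{kl}, \\
\Delta h &= 12 \sum_{i=1}^{n-2} (x^i)^2, \qquad
\Delta^2 h = 12 \cdot 2 \cdot (n-2) = 24\,(n-2), \\
\text{(26) at } x = 0:\quad &
\underbrace{\Delta^2 h}_{24(n-2)}
 - (n-4)\underbrace{\Delta h}_{0} \sum_{k} (Y^k)^2
 + (n-2)(n-4) \sum_{k,l} Y^k Y^l \underbrace{\partial_k \partial_l h}_{0}
 = 24\,(n-2) \neq 0 .
\end{align*}
```

Since the left-hand side is independent of Y at the origin, no choice of the vector field Y can satisfy (26) there.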
Now we turn to dimension n = 2s = 4. Here the formula (19) makes sense only if $\Delta^2 h = 0$. In such a case the formula truncates to $H = -\frac{1}{2}\rho\,\Delta h$. Thus it is clear that for the 4-dimensional pp-waves the Fefferman-Graham obstruction is precisely $\Delta^2 h$, which is a multiple of the Bach tensor, and does not involve any lower order terms in the derivatives of the metric functions. In order to write down all such metrics, it is convenient to pass to the complex notation by introducing coordinates $z = \frac{x^1 + i x^2}{\sqrt{2}}$, $\bar{z} = \frac{x^1 - i x^2}{\sqrt{2}}$. In this notation the most general 4-dimensional pp-wave metric satisfying $\Delta^2 h = 0$ is given by
$$ g_4 = 2du\left(dr + \left(\bar{z}\alpha + z\bar{\alpha} + \beta + \bar{\beta}\right)du\right) + 2\,dz\,d\bar{z}. $$
Here α = α(z, u), β = β(z, u) are holomorphic functions of z. This metric is Bach-flat, and in some cases, such as when $\alpha_z + \bar{\alpha}_{\bar{z}} = \mathrm{const}$, is conformal to an Einstein metric. Its ambient metric is given by
$$ \widetilde{g}_4 = 2\,d(\rho t)\,dt + t^2\left(2du\left[dr + \left(\bar{z}\alpha + z\bar{\alpha} + \beta + \bar{\beta} - \rho\,(\alpha_z + \bar{\alpha}_{\bar{z}})\right)du\right] + 2\,dz\,d\bar{z}\right), $$
and by construction is Ricci flat. We get

Proposition 2. A 4-dimensional pp-wave $g_4$ is Bach flat if and only if
$$ g_4 = 2du\left(dr + \left(\bar{z}\alpha + z\bar{\alpha} + \beta + \bar{\beta}\right)du\right) + 2\,dz\,d\bar{z}, $$
with α = α(z, u), β = β(z, u) functions of a complex variable z and a real variable u which are holomorphic in z.

In general, this Bach-flat metric is not conformally Einstein:
Theorem 3. A 4-dimensional Bach-flat pp-wave
$$ g_4 = 2du\,(dr + (\bar{z}\alpha + z\bar{\alpha})\,du) + 2\,dz\,d\bar{z} \tag{28} $$
with β ≡ 0 is conformally equivalent to a metric with vanishing Cotton tensor. Moreover, the following three properties are equivalent:
(1) $\partial_z^2\alpha \equiv 0$,
(2) $g_4$ is conformally flat,
(3) $g_4$ is conformally Einstein.
In particular, any such metric with $\partial_z^2\alpha \not\equiv 0$ is not conformally Einstein.

Proof. First, in the complex coordinates $(z, \bar{z})$ we have: $\Delta h = 2\,(\partial_z\alpha + \partial_{\bar{z}}\bar{\alpha})$. Next, using
$$ \partial_1 = \frac{1}{\sqrt{2}}(\partial_z + \partial_{\bar{z}}), \qquad \partial_2 = \frac{i}{\sqrt{2}}(\partial_z - \partial_{\bar{z}}), $$
in the formula (9) we see that the Weyl tensor vanishes if and only if $\partial_z^2\alpha = 0$. This proves the equivalence of (1) and (2). For the remaining statements we try to find a vector field Y that solves the necessary condition (23) for g to be conformally Einstein. We use this equation in the form (25), as in Proposition 1. Recall that in this proposition we proved that such a vector does not have a $\partial_u$-component. Thus we look for Y of the form $Y = F\partial_z + \bar{F}\partial_{\bar{z}} + f\partial_r$, where $F = F(z, \bar{z}, r, u)$ is a complex and $f = f(z, \bar{z}, r, u)$ is a real function. Equation (25) gives
$$ 0 = \partial_z^2\alpha\,(1 + \bar{z}F) + \partial_{\bar{z}}^2\bar{\alpha}\,(1 + z\bar{F}), \tag{29} $$
$$ 0 = \partial_z^2\alpha\,(1 + \bar{z}F) - \partial_{\bar{z}}^2\bar{\alpha}\,(1 + z\bar{F}), \tag{30} $$
which immediately implies $\partial_z^2\alpha\,(1 + \bar{z}F) = 0$. Assuming that $g_4$ is not conformally flat, i.e. $\partial_z^2\alpha \not\equiv 0$, we get $F = -1/\bar{z}$. Thus we found that the vector Y solves (23) if and only if $Y = -\frac{1}{\bar{z}}\partial_z - \frac{1}{z}\partial_{\bar{z}} + f\partial_r$. Now, $g_4$ is conformally Cotton-flat if we find f such that this Y is a gradient. Setting
$$ Y^\flat = g_4(Y, .) = -\frac{1}{z}\,dz - \frac{1}{\bar{z}}\,d\bar{z} + f\,du, $$
we see that Y is locally a gradient, i.e. $dY^\flat = 0$, if and only if f is a function of the variable u alone. Every f = f(u) gives a solution to the conformally Cotton-flat equation.

To prove that (3) implies (2), assume that $g_4$ is not conformally flat but conformally Einstein. Then we plug in the vector Y we have obtained as a solution of Eq. (25), and its corresponding
$$ \nabla Y^\flat = df \otimes du - \left(\frac{\alpha + z\,\partial_{\bar{z}}\bar{\alpha}}{z} + \frac{\bar{\alpha} + \bar{z}\,\partial_z\alpha}{\bar{z}}\right)du^2 + \frac{1}{z^2}\,dz^2 + \frac{1}{\bar{z}^2}\,d\bar{z}^2 $$
into $P - \nabla Y^\flat + (Y^\flat)^2$. According to Eq. (22) this must be pure trace if the metric $g_4$ is conformally Einstein. But this cannot happen since $P - \nabla Y^\flat + (Y^\flat)^2$ has a nowhere vanishing $dz\,d\bar{z}$-term given by $\frac{2}{z\bar{z}}\,dz\,d\bar{z}$, and an identically vanishing $dr\,du$-term. Thus $P - \nabla Y^\flat + (Y^\flat)^2$ is never proportional to $g_4$, which, in turn, cannot be conformally Einstein.

In the light of discussions in [20], the metrics (28) provide interesting examples because, apart from being Bach-flat, they are conformally Cotton-flat, but not conformally Einstein even though the necessary conditions (23) and (24) are both satisfied for a gradient. This phenomenon is special to Lorentzian and probably to other indefinite signature metrics. We strongly believe that a similar argument works in any dimension, even though one might not be able to describe the functions with $\Delta^s h = 0$. But under certain assumptions it might be possible to deduce a contradiction between Eqs. (25)–(26) and the fact that the function dr(Y) is independent of the r-coordinate, as it occurs for n = 4.

We want to conclude this section by returning to the result of Brinkmann in [8] mentioned in the Introduction. If a 4-dimensional pp-wave is Einstein, and hence Ricci-flat, the function h is given by $\alpha + \bar{\alpha}$ for a holomorphic function α. Again, this metric is conformally flat if and only if $\partial_z^2\alpha = 0$. If it is not conformally flat but conformally Einstein, then the vector field Y is null and a multiple of $\partial_r$, namely $Y = f\partial_r$ with a function f = f(u) that depends on the variable u only. As P = 0, Eq. (22) then is equivalent to $f' = f^2$. Hence, any such function yields a conformal rescaling of a Ricci-flat pp-wave to another Einstein metric that is in fact Ricci-flat. The new metric may be isometric to the original one but in general this is not the case (see also [14]).
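The ordinary differential equation for f in the Ricci-flat case can be derived and solved in a few lines. This is a sketch under the assumptions stated in the text: P = 0, $Y = f(u)\,\partial_r$, and $du$ parallel. The integration constant c is introduced here for illustration.

```latex
\begin{align*}
Y^\flat &= g(Y, \cdot) = f\, du, \qquad
\nabla Y^\flat = f'\, du^2, \qquad
(Y^\flat)^2 = f^2\, du^2, \\
P - \nabla Y^\flat + (Y^\flat)^2 &= (f^2 - f')\, du^2
 \ \text{ is pure trace}
 \iff f' = f^2
 \quad (\text{as } du^2 \text{ is trace-free and not a multiple of } g), \\
f' = f^2 &\iff f \equiv 0 \quad \text{or} \quad f(u) = \frac{1}{c - u}, \qquad c \in \mathbb{R} .
\end{align*}
```

Every non-zero solution therefore has a pole at u = c, so it exists only on a half-line in the u-variable.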
Finally, note that a non-trivial solution of $f' = f^2$ is not defined on all of $\mathbb{R}$, and thus, in general, f does not yield a global rescaling to another Einstein metric.

7. The Critical Q-Curvature of a pp-Wave

For a semi-Riemannian manifold (M, g) of even dimension n = 2s, in [7] T. Branson introduced a series $\{Q_{2k}\}_{k=1\ldots s}$ of scalar invariants constructed from the curvature tensor involving 2k derivatives of the metric.³ As such, for a pp-wave all $Q_{2k}$ are zero. This follows from the general fact that all scalar invariants constructed from the Riemannian curvature tensor of a pp-wave vanish (for a proof in arbitrary dimension see [10]). However, as an application of Theorem 1, in this section we will use the pp-wave ambient metric in order to show that the critical Q-curvature $Q_n$ of a pp-wave vanishes.

The so-called subcritical Q-curvatures $Q_2, \dots, Q_{n-2}$ are defined by the inhomogeneous part of the GJMS-operators $P_{2k}$, namely
$$ P^g_{2k}(1) = (s - k)\,Q_{2k}. $$
The GJMS-operators $P_{2k}$ introduced in [23] are conformally covariant operators. We will not give a definition of the critical Q-curvature $Q_n$ here (please refer to [17], for example). Instead we will explain a formula for the critical Q-curvature given in [24] that expresses it in terms of the volume of the Poincaré metric.

³ Regarding this section, we would like to thank Andreas Juhl for explaining to us some facts about Q-curvature.
Let (M, [g]) be a smooth manifold of even dimension n = 2s with conformal class [g]. To this manifold one can assign a Poincaré metric $g_+$. $g_+$ is a metric on $M_+ = M \times (0, a)$ given by
$$ g_+ = \frac{1}{x^2}\left(dx^2 + g_x\right), $$
where $g_x$ is a 1-parameter family of metrics with the same signature as g and with initial condition $g_0 = g$ such that $g_+$ is asymptotically Einstein, which means that $\mathrm{Ric}(g_+) + n g_+$ vanishes up to terms of order (n − 2) in x. The Poincaré metric is unique up to addition of terms of the form $x^n S_x$, where $S_x$ is a 1-parameter family of symmetric (2, 0)-tensors such that $S_0$ is trace-free (for details see [15,16]). For a Poincaré metric one can show, see [22] for details, that $\sqrt{\det(g_x)/\det(g)}$ has the Taylor expansion
$$ \sqrt{\frac{\det(g_x)}{\det(g)}} = 1 + v^{(2)}x^2 + v^{(4)}x^4 + \dots + v^{(n-2)}x^{n-2} + v^{(n)}x^n + \dots, \tag{31} $$
defining smooth functions $v^{(2k)}$. Then in [24] it is shown that the critical Q-curvature $Q_n$ of (M, [g]) is given as
$$ 2n\,c_{n/2}\,Q_n = n\,v^{(n)} + \sum_{k=1}^{s-1}(n - 2k)\,A^*_{2k}\,v^{(n-2k)}. \tag{32} $$
Here $A_{2k}$ are the linear differential operators that appear in the expansion of a harmonic function for a Poincaré metric, the star denotes the formal adjoint, and $c_{n/2}$ is a constant. Furthermore, one has to recall how the Poincaré metric can be obtained from the ambient metric. Assume that $\widetilde{g} = 2\,d(\rho t)\,dt + t^2 g(\rho)$ is a pre-ambient metric for [g] that is Ricci-flat up to terms of order s and higher. Such a metric always exists and is unique up to terms of order n/2 in ρ. Now, on
$$ M_+ = \{(\rho, p, t) \in \widetilde{M} \mid p \in M,\ t^2\rho = -1\}, $$
the Poincaré metric is given by
$$ g_+ = \frac{1}{x^2}\left(dx^2 + \frac{1}{2}\, g(x^2)\right). $$
Note that if the pre-ambient metric is Ricci-flat, then the Poincaré metric obtained in this way is Einstein. We can use the ambient metric of a pp-wave to prove

Theorem 4. The critical Q-curvature of an even-dimensional pp-wave vanishes.

Proof. Let (M, g) be a pp-wave of even dimension n = 2s. In Sect. 4 we have also shown that its pre-ambient metric that is Ricci-flat up to terms of order n/2 is given by formula (12) with H as in (19). Using the coframe in (14) we can write down the volume form ω(ρ) of the ρ-dependent family of pp-waves,
$$ g(\rho) = 2du\,(dr + (h + H)\,du) + \sum_{i=1}^{n-2}(dx^i)^2, $$
namely
$$ \omega(\rho) = dx^1 \wedge \dots \wedge dx^{n-2} \wedge (dr + (h + H)\,du) \wedge du = \omega(0). $$
For the family $g_x = \frac{1}{2}g(x^2)$ defining the Poincaré metric this implies that $\det(g_x) = \det(g_0)$. Hence, all the $v^{(2k)}$ in (31) are zero and so is the critical Q-curvature by the result of [24] given in formula (32).

Recall that for a pp-wave (M, g) the vanishing of the scalar curvature implies that the Laplacian $\Delta_g$ is conformally covariant. Calculations using formulae in [26] show that the first GJMS-operators $P_2$, $P_4$ and $P_6$ are equal to the corresponding powers of the Laplacian $\Delta_g$, $\Delta_g^2$ and $\Delta_g^3$. We conjecture that for pp-waves this is also the case for the higher $P_{2k}$.

8. Conformal and Ambient Holonomy

We conclude with a brief remark about the holonomy of the ambient metric and the holonomy of the normal conformal Cartan connection, also called the conformal holonomy, of a pp-wave. Holonomy groups describe the reduction of generic structures down to more special structures, in the semi-Riemannian, the conformal, and in other geometric settings. For a conformal manifold of signature (r, s) the conformal holonomy is contained in SO(r + 1, s + 1). If it is a proper subgroup, then the conformal structure is reduced to a more special structure. Examples are Lorentzian Fefferman spaces, for an overview see [1], where the conformal holonomy reduces to the special unitary group, or conformal structures in signature (2, 3) with non-compact $G_2$ as structure group, [38,39]. In [31] it is proven that the conformal holonomy of an n-dimensional Lorentzian conformal class that is given by a metric with parallel null line and totally null Ricci tensor is contained in the stabiliser in SO(2, n) of a totally null plane N. Of course, pp-waves are special examples of such metrics and hence, their conformal holonomy reduces to this stabiliser. But we get the same result also for the holonomy of the ambient metric of a pp-wave.

Proposition 3. The metric $\bar{g}$ defined in Eq. (12) admits a holonomy invariant distribution of totally null planes N spanned by $\partial_r$ and $\partial_\rho$. In particular, all curvature operators $\bar{R}(V, W)$, $V, W \in T\bar{M}$, leave invariant the fibres of N and of $N^\perp$, which is spanned by $\partial_r$, $\partial_\rho$, and $\partial_i$.

Proof. The easiest way to see this is to consider the dual frame to the coframe in (14) given by
$$ E_0 = \frac{1}{t}\,\partial_\rho, \quad E_i = \frac{1}{t}\,\partial_i, \quad E_{n-1} = \frac{1}{t^2}\,\partial_r, \quad E_n = \partial_u - (h + H)\,\partial_r, \quad E_{n+1} = \partial_t - \frac{\rho}{t}\,\partial_\rho. $$
Using the relation $\bar{g}(\bar{\nabla}E_\mu, E_\nu) = \Gamma_{\mu\nu}$ one can read off from the formulae for the connection 1-forms in (15) that $N = \mathrm{span}(E_0, E_{n-1}) = (\mathrm{span}(E_0, E_i, E_{n-1}))^\perp$ is invariant under the Levi-Civita connection.
Corollary 2. Let G be the holonomy group of the ambient metric of a pp-wave in odd dimension or in dimension 2s with $\Delta^s h = 0$. Then G is contained in the stabiliser in SO(2, n) of a totally null plane in $\mathbb{R}^{2,n}$.

In general, it is possible to show that the conformal holonomy is always contained in the ambient holonomy [33]. For a conformal class with an Einstein metric or a Ricci-flat metric both holonomy groups are the same [31,34]. For a pp-wave, not necessarily conformally Einstein, we have just seen that both are contained in the isotropy group of a totally null plane. Hence, it is very likely that the conformal holonomy is actually equal to the ambient holonomy. But to give a proof of this is beyond the scope of this paper.

References
1. Baum, H.: The conformal analog of Calabi-Yau manifolds. In: Handbook of Pseudo-Riemannian Geometry, IRMA Lectures in Mathematics and Theoretical Physics. Zürich: European Mathematical Society, 2007, in press
2. Bautier, K., Englert, F., Rooman, M., Spindel, P.: The Fefferman-Graham ambiguity and AdS black holes. Phys. Lett. B 479(1-3), 291–298 (2000)
3. Bena, I., Roiban, R.: Supergravity pp solutions with 28 and 24 supercharges. Phys. Rev. D 67, 125014 (2003)
4. Berenstein, D., Maldacena, J., Nastase, H.: Strings in flat space and pp waves from N = 4 super Yang Mills. J. High Energy Phys. (4):No. 13, 30 (2002)
5. Blau, M., Figueroa-O’Farrill, J., Hull, C., Papadopoulos, G.: A new maximally supersymmetric background of type IIB superstring theory. J. High Energy Phys. 01, 047 (2002)
6. Blau, M., Figueroa-O’Farrill, J., Hull, C., Papadopoulos, G.: Penrose limits and maximal supersymmetry. Class. Quant. Grav. 19, L87–L95 (2002)
7. Branson, T.P.: The Functional Determinant. Volume 4 of Lecture Notes Series. Seoul: Seoul National University Research Institute of Mathematics Global Analysis Research Center, 1993
8. Brinkmann, H.W.: Einstein spaces which are mapped conformally on each other. Math. Ann. 94, 119–145 (1925)
9. Chruściel, P.T., Kowalski-Glikman, J.: The isometry group and Killing spinors for the pp wave space-time in D = 11 supergravity. Phys. Lett. B 149(1-3), 107–110 (1984)
10. Coley, A., Milson, R., Pelavas, N., Pravda, V., Pravdová, A., Zalaletdinov, R.: Generalizations of pp-wave spacetimes in higher dimensions. Phys. Rev. D (3), 67(10):104020, 4, 2003
11. Cvetič, M., Lü, H., Pope, C.N.: Penrose limits, pp-waves and deformed M2-branes. Phys. Rev. D 69, 046003 (2004)
12. Cvetič, M., Lü, H., Pope, C.N.: M-theory pp-waves, Penrose limits and supernumerary supersymmetries. Nuclear Phys. B 644(1-2), 65–84 (2002)
13. de Haro, S., Skenderis, K., Solodukhin, S.N.: Holographic reconstruction of spacetime and renormalization in the AdS/CFT correspondence. Commun. Math. Phys. 217, 595 (2001)
14. Ehlers, J., Kundt, W.: Exact solutions of the gravitational field equations. In: Gravitation: An Introduction to Current Research. New York: Wiley, 1962, pp. 49–101
15. Fefferman, C., Graham, C.R.: Conformal invariants. In: Élie Cartan et les mathématiques d’aujourd’hui, Astérisque, (Numéro Hors Série):95–116 (1985)
16. Fefferman, C., Graham, C.R.: The ambient metric. http://arxiv.org/abs/0710.0919v2 [math.DG], 2008
17. Fefferman, C., Hirachi, K.: Ambient metric construction of Q-curvature in conformal and CR geometries. Math. Res. Lett. 10(5-6), 819–831 (2003)
18. Gauntlett, J.P., Hull, C.M.: pp-waves in 11-dimensions with extra supersymmetry. J. High Energy Phys. 6(13), 13 (2002)
19. Gover, A.R., Leitner, F.: A sub-product construction of Poincaré-Einstein metrics. Int. J. Math. 20, 1263–1287 (2009)
20. Gover, A.R., Nurowski, P.: Obstructions to conformally Einstein metrics in n dimensions. J. Geom. Phys. 56(3), 450–484 (2006)
21. Graham, C.R.: Personal communication
22. Graham, C.R.: Volume and area renormalizations for conformally compact Einstein metrics. In: The Proceedings of the 19th Winter School “Geometry and Physics” (Srni, 1999), Rend. Circ. Mat. Palermo (2) Suppl. No. 63, 31–42 (2000)
23. Graham, C.R., Jenne, R., Mason, L.J., Sparling, G.A.J.: Conformally invariant powers of the Laplacian. I. Existence. J. London Math. Soc. (2) 46(3), 557–565 (1992)
24. Graham, C.R., Juhl, A.: Holographic formula for Q-curvature. Adv. Math. 216(2), 841–853 (2007)
25. Hull, C.M.: Exact pp-wave solutions of eleven-dimensional supergravity. Phys. Lett. 139B, 39–41 (1984)
26. Juhl, A.: Families of Conformally Covariant Differential Operators, Q-curvature and Holography. Progress in Mathematics 275. Basel: Birkhäuser, 2009
27. Kichenassamy, S.: On a conjecture of Fefferman and Graham. Adv. Math. 184(2), 268–288 (2004)
28. Kowalski-Glikman, J.: Vacuum states in supersymmetric Kaluza-Klein theory. Phys. Lett. B 134(3-4), 194–196 (1984)
29. Kowalski-Glikman, J.: A nontrivial vacuum state in D = 10, N = 1 supergravity. Phys. Lett. B 134(3-4), 159–160 (1984)
30. Leistner, T.: Lorentzian manifolds with special holonomy and parallel spinors. In: Proceedings of the 21st Winter School “Geometry and Physics” (Srni, 2001), Rend. Circ. Mat. Palermo Suppl. 69, 131–159 (2002)
31. Leistner, T.: Conformal holonomy of C-spaces, Ricci-flat, and Lorentzian manifolds. Diff. Geom. Appl. 24(5), 458–478 (2006)
32. Leistner, T.: Screen bundles of Lorentzian manifolds and some generalisations of pp-waves. J. Geom. Phys. 56(10), 2117–2134 (2006)
33. Leistner, T., Nurowski, P.: Conformal classes with $G_{2(2)}$-ambient metrics. http://arxiv.org/abs/0904.0186v2 [math.DG], 2009
34. Leitner, F.: Conformal Killing forms with normalisation condition. Rend. Circ. Mat. Palermo (2) Suppl. 75, 279–292 (2005)
35. Meessen, P.: A small eprint on pp-wave vacua in 6 and 5 dimensions. Phys. Rev. D 65, 087501 (2002)
36. Michelson, J.: (Twisted) toroidal compactification of pp-waves. Phys. Rev. D 66, 066002 (2002)
37. Michelson, J.: A pp-wave with 26 supercharges. Class. Quant. Grav. 19(23), 5935–5949 (2002)
38. Nurowski, P.: Differential equations and conformal structures. J. Geom. Phys. 43(4), 327–340 (2005)
39. Nurowski, P.: Conformal structures with explicit ambient metrics and conformal $G_2$ holonomy. In: Symmetries and Overdetermined Systems of Partial Differential Equations. Volume 144 of IMA Vol. Math. Appl., New York: Springer, 2008, pp. 515–526
40. Penrose, R.: Any space-time has a plane wave as a limit. In: Differential Geometry and Relativity, Mathematical Phys. and Appl. Math., Vol. 3. Dordrecht: Reidel, 1976, pp. 271–275
41. Robinson, I.: A solution of the Maxwell-Einstein equations. Bull. Acad. Polon. Sci. Sér. Sci. Math. Astr. Phys. 7, 351–352 (unbound insert), (1959)
42. Schimming, R.: Riemannsche Räume mit ebenfrontiger und mit ebener Symmetrie. Math. Nach. 59, 128–162 (1974)

Communicated by P.T. Chruściel