Commun. Math. Phys. 221, 1 – 26 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Evolution of a ...

Author:
M. Aizenman (Chief Editor)



Evolution of a Model Quantum System Under Time Periodic Forcing: Conditions for Complete Ionization

O. Costin, R. D. Costin, J. L. Lebowitz, A. Rokhlenko

Department of Mathematics, Rutgers University, Piscataway, NJ 08854-8019, USA

Received: 1 November 2000 / Accepted: 5 February 2001

Abstract: We analyze the time evolution of a one-dimensional quantum system with an attractive delta function potential whose strength is subjected to a time periodic (zero mean) parametric variation η(t). We show that for generic η(t), which includes the sum of any finite number of harmonics, the system, started in a bound state, will get fully ionized as t → ∞. This is irrespective of the magnitude or frequency (resonant or not) of η(t). There are however exceptional, very non-generic η(t), that do not lead to full ionization, which include rather simple explicit periodic functions. For these η(t) the system evolves to a nontrivial localized stationary state which is related to eigenfunctions of the Floquet operator.

1. Introduction and Results

We are interested in the qualitative long time behavior of a quantum system evolving under a time dependent Hamiltonian $H(t) = H_0 + H_1(t)$, i.e. in the nature of the solutions of the Schrödinger equation

$$i\hbar\,\partial_t\psi = [H_0 + H_1(t)]\psi. \tag{1}$$

Here ψ is the wavefunction of the system, belonging to some Hilbert space $\mathcal{H}$, $H_0$ and $H_1$ are Hermitian operators, and Eq. (1) is to be solved subject to some initial condition $\psi_0$. Such questions about the solutions of (1) belong to what Simon [1] calls “second level foundation” problems of quantum mechanics. They are of particular practical interest for the ionization of atoms and/or dissociation of molecules, in the case when $H_0$ has both a discrete and a continuous spectrum corresponding respectively to spatially localized (bound) and scattering (free) states in $\mathbb{R}^d$. Starting at time zero with the system in a bound state and then “switching on” at t = 0 an external potential $H_1(t)$, we want to know the “probability of survival”, P(t), of the bound states, at times t > 0:

$$P(t) = \sum_j |\langle\psi(t), u_j\rangle|^2,$$

where the sum is over all the bound states $u_j$ [2–6, 8, 9].


This problem has been investigated both analytically and numerically for the case $H_1(t) = \eta(t)V_1(x)$ with $\eta(t) = r\sin(\omega t + \theta)$ and $V_1$ a time independent potential, $x \in \mathbb{R}^d$. When ω is sufficiently large for “one photon” ionization to take place, i.e., when $\hbar\omega > -E_0$, with $E_0$ the energy of the bound (e.g. ground) state of $H_0$, and r is “small enough” for $H_1$ to be treated as a perturbation of $H_0$, this is a problem discussed extensively in the literature ([8, 9]). Starting with the system in its ground state, the long time behavior of P(t) is there asserted to be given by $P(t) \sim \exp[-\Gamma_F t]$. The rate constant $\Gamma_F$ is computed from first order perturbation theory according to Fermi’s golden rule. It is proportional to the square of the matrix element between the bound and free states, multiplied by the appropriate density of continuum states in the vicinity of the final state, which will have energy $\hbar\omega - E_0$ [6, 8–10].

Going from perturbation theory to an exponential decay involves heuristics based on deep physical insights requiring assumptions which seem very hard to prove. It is therefore very gratifying that many features of this scenario have been recently made mathematically rigorous by Soffer and Weinstein [6] (their analysis was generalized by Soffer and Costin [7]). They considered the case when $H_0 = -\nabla^2 + V_0(x)$, $x \in \mathbb{R}^3$, $V_0$ compactly supported and such that there is exactly one bound state with energy $-\omega_0$ (from now on we use units in which $\hbar = 2m = 1$) and a continuum of quasi-energy states with energies $k^2$ for all $k \in \mathbb{R}^3$. The perturbing potential is $H_1(t) = r\cos(\omega t)V_1(x)$ with $V_1(x)$ also of compact support and satisfying some technical conditions. They then showed that for $\omega > \omega_0$ and r small enough there is indeed an intermediate time regime where P(t) has a dominant exponential form with the Fermi exponent $\Gamma_F$. This regime is followed for longer times by an inverse power law decay.
Some of these restrictions can presumably be relaxed, but the requirement that r be small is crucial to their method, which is essentially perturbative.

The behavior of P(t) becomes much more difficult to analyze when the strength of $H_1(t)$ is not small and perturbation theory is no longer a useful guide. This became clear in the seventies with the beautiful experiments by Bayfield and Koch, cf. [11] for a review, on the ionization of highly excited Rydberg (e.g. hydrogen) atoms by intense microwave electric fields. These experiments showed quite unexpected nonlinear behavior of P(t) as a function of the initial state, field strength E and the frequency ω. These results as well as other multiphoton ionizations of hydrogen atoms have been (and continue to be) analyzed by various authors using a variety of methods. Prominent among these are semi-classical phase-space analysis, numerical integration of the Schrödinger equation, Floquet theory, complex dilation, etc. While the results obtained so far are not rigorous, they do give physical insights and quite good agreement with experiments, although many questions still remain open even on the physical level [11–15].

In addition to the above experiments on Rydberg atoms there are also many experiments which use strong laser fields to produce multiphoton ($\omega < -E_0$) ionization of multielectron atoms and/or dissociation of molecules [16, 17]. These systems are more complex than Rydberg atoms and their analysis is correspondingly less developed. One unexpected result of certain studies is that an increase in the intensity of the field may reduce the degree of ionization, i.e., P(t) can be non-monotone in the field strength E at large values of E. This phenomenon, which is often called “stabilization”, can be observed in some numerical simulations, analyzed rigorously in some models, and is claimed to have been seen experimentally, cf. [5] and [18–21].
It turns out that many features observed for Rydberg atoms and also stabilization are already present in a simple model system which we have recently begun to investigate analytically [22–24]. This somewhat surprising finding is based on comparisons between


experimental and model results described in detail in [23]. In fact the phenomenon of ionization by periodic fields is very complex once one goes beyond the perturbative regime, even in the simplest model. This will become clear from the new results about this model presented here.

2. The Model

We consider a very simple quantum system where we can analyze rigorously many of the phenomena expected to occur in more realistic systems described by (1). This is a one dimensional system with an attractive delta function potential. The unperturbed Hamiltonian $H_0$ has, in suitable units, the form

$$H_0 = -\frac{d^2}{dx^2} - 2\delta(x), \qquad -\infty < x < \infty. \tag{2}$$

The zero range (delta-function) attractive potential is much used in the literature to model short range attractive potentials [25–28]. It belongs, in one dimension, to the class $K_1$ [2]. $H_0$ has a single bound state $u_b(x) = e^{-|x|}$ with energy $-\omega_0 = -1$. It also has continuous uniform spectrum on the positive real line, with generalized eigenfunctions

$$u(k, x) = \frac{1}{\sqrt{2\pi}}\left(e^{ikx} - \frac{1}{1 + i|k|}\,e^{i|kx|}\right), \qquad -\infty < k < \infty,$$

and energies $k^2$. Beginning at t = 0, we apply a parametric perturbing potential, i.e. for t > 0 we have

$$H(t) = H_0 - 2\eta(t)\delta(x) \tag{3}$$

and solve the time dependent Schrödinger equation (1) for ψ(x, t), with $\psi(x, 0) = \psi_0(x)$. Expanding ψ in eigenstates of $H_0$ we write

$$\psi(x, t) = \theta(t)u_b(x)e^{it} + \int_{-\infty}^{\infty}\Theta(k, t)u(k, x)e^{-ik^2 t}\,dk \qquad (t \ge 0) \tag{4}$$

with initial values $\theta(0) = \theta_0$, $\Theta(k, 0) = \Theta_0(k)$ suitably normalized,

$$\langle\psi_0, \psi_0\rangle = |\theta_0|^2 + \int_{-\infty}^{\infty}|\Theta_0(k)|^2\,dk = 1. \tag{5}$$

We then have that the survival probability of the bound state is $P(t) = |\theta(t)|^2$, while $|\Theta(k, t)|^2\,dk$ gives the “fraction of ejected particles” with (quasi-) momentum in the interval dk. This problem can be reduced to the solution of an integral equation in a single variable [22, 23]. Setting

$$Y(t) = \psi(x = 0, t)\,\eta(t)\,e^{it} \tag{6}$$


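we have

$$\theta(t) = \theta_0 + 2i\int_0^t Y(s)\,ds, \tag{7}$$

$$\Theta(k, t) = \Theta_0(k) + \frac{2|k|}{\sqrt{2\pi}\,(1 - i|k|)}\int_0^t Y(s)e^{i(1+k^2)s}\,ds. \tag{8}$$

Y(t) satisfies the integral equation

$$Y(t) = \eta(t)\left\{I(t) + \int_0^t [2i + M(t - t')]\,Y(t')\,dt'\right\} = \eta(t)\bigl\{I(t) + (2i + M) * Y\bigr\}, \tag{9}$$

where the inhomogeneous term is

$$I(t) = \theta_0 + \frac{i}{\sqrt{2\pi}}\int_0^{\infty}\frac{\Theta_0(k) + \Theta_0(-k)}{1 + ik}\,e^{-i(k^2+1)t}\,dk,$$

and

$$M(s) = \frac{2i}{\pi}\int_0^{\infty}\frac{u^2 e^{-is(1+u^2)}}{1 + u^2}\,du = \frac{1 + i}{2\sqrt{2\pi}}\int_s^{\infty}\frac{e^{-iu}}{u^{3/2}}\,du,$$

with

$$f * g = \int_0^t f(s)g(t - s)\,ds.$$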
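The equality of the two expressions for M(s) above can be checked by a short formal computation. The sketch below is not taken from the source: it differentiates under the integral sign and treats the Fresnel-type integral as the boundary value of a Gaussian moment, both of which are assumptions of this illustration.

```latex
% Formal differentiation of the first expression for M(s), s > 0:
M'(s) = \frac{2i}{\pi}\int_0^\infty \frac{u^2}{1+u^2}\,\bigl(-i(1+u^2)\bigr)\,e^{-is(1+u^2)}\,du
      = \frac{2}{\pi}\,e^{-is}\int_0^\infty u^2 e^{-isu^2}\,du .
% Gaussian moment, continued to a = is:
\int_0^\infty u^2 e^{-a u^2}\,du = \frac{\sqrt{\pi}}{4}\,a^{-3/2},
\qquad (is)^{-3/2} = s^{-3/2}e^{-3\pi i/4} = -\frac{1+i}{\sqrt{2}}\,s^{-3/2}.
% Therefore
M'(s) = -\frac{1+i}{2\sqrt{2\pi}}\,\frac{e^{-is}}{s^{3/2}}
      = \frac{d}{ds}\left[\frac{1+i}{2\sqrt{2\pi}}\int_s^\infty \frac{e^{-iu}}{u^{3/2}}\,du\right].
```

Both expressions tend to zero as s → ∞, so having the same derivative they coincide. The second form also makes the decay $M(s) = O(s^{-3/2})$ for large s visible, which is the rate used later in the proof of Proposition 16.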

In our previous works we considered the case where $\Theta_0(k) = 0$ and η(t) is a finite sum of harmonics with period $2\pi\omega^{-1}$. In particular, we showed in [23] how to compute the survival probability P(t) as a function of the strength r and frequency ω when $\eta(t) = r\sin\omega t$. Here we study the general periodic case and write

$$\eta = \sum_{j=0}^{\infty}\left(C_j e^{i\omega jt} + C_{-j}e^{-i\omega jt}\right).$$

Our assumptions on the $C_j$ are

(a) $0 \not\equiv \eta \in L^{\infty}(\mathbb{T})$,
(b) $C_0 = 0$,
(c) $C_{-j} = \overline{C_j}$.

Genericity condition (g). Consider the right shift operator T on $l_2(\mathbb{N})$ given by

$$T(C_1, C_2, \ldots, C_n, \ldots) = (C_2, C_3, \ldots, C_{n+1}, \ldots).$$

We say that $C \in l_2(\mathbb{N})$ is generic with respect to T if the Hilbert space generated by all the translates of C contains the vector $e_1 = (1, 0, 0, \ldots)$ (which spans the kernel of T):

$$e_1 \in \bigvee_{n=0}^{\infty} T^n C \tag{10}$$

(where the right side of (10) denotes the closure of the space generated by the $T^n C$ with n ≥ 0). This condition is generically satisfied, and is obviously weaker than the “cyclicity” condition $l_2(\mathbb{N}) \ominus \bigvee_{n=0}^{\infty} T^n C = \{0\}$, which is also generic [29] (Appendix B discusses in more detail the rather subtle cyclicity condition). An important case, which satisfies (10) (but fails the cyclicity condition), corresponds to η being a trigonometric polynomial, namely $C \not\equiv 0$ but $C_n = 0$ for all large enough n. (We can in fact replace $e_1$ in (10) by $e_k$ with any k ≥ 1.) A simple example which fails (10) is

$$\eta(t) = 2r\lambda\,\frac{\lambda - \cos(\omega t)}{1 + \lambda^2 - 2\lambda\cos(\omega t)} \tag{11}$$

for some λ ∈ (0, 1), for which $C_n = -r\lambda^n$ for n ≥ 1. In this case the space generated by the $T^n C$ is one-dimensional. We will prove that there are values of r and λ for which the ionization is incomplete, i.e. θ(t) does not go to zero for large t.

3. Results and Remarks

Theorem 1. Under assumptions (a)–(c) and (g), the survival probability $P(t) = |\theta(t)|^2$ of the bound state $u_b$ tends to zero as t → ∞.

Theorem 2. For $\psi_0(x) = u_b(x)$ there exist values of λ, ω and r in (11) for which $|\theta(t)| \not\to 0$ as t → ∞.

Remarks. 1. Theorem 1 can be extended to show that $\int_D |\psi(x, t)|^2\,dx \to 0$ for any compact interval $D \subset \mathbb{R}$. This means that the initially localized particle really wanders off to infinity, since by unitarity of the evolution $\int_{\mathbb{R}} |\psi(x, t)|^2\,dx = 1$. Theorem 2 can be extended to show that for some fixed r and ω in (11) there are infinitely many λ, accumulating at 1, for which $\theta(t) \not\to 0$. In these cases, it can also be shown that for large t, θ approaches a quasiperiodic function.

2. While Theorem 1 holds for arbitrary $\psi_0$, care has to be taken with the initial conditions for Theorem 2. In particular we cannot have an initial state such that in (9) I(t) = 0 for all t. This would occur, for example, if $\psi_0(x)$ is an odd function of x. In that case the evolution takes place as if the particle was entirely free – never feeling the delta function potential. There may also be other special $\psi_0$ for which $\theta_0 \ne 0$ but for which θ(t) → 0 as t → ∞. We have therefore stated Theorem 2 for the case $\psi_0 = u_b$. We shall also, for simplicity, use this choice of $\psi_0$ in the proofs of Theorem 1. For this case, which is natural from the physical point of view, I(t) = 1 in (9). The extension to general $\psi_0$ is immediate and is given at the end of Sect. 5.

3. In [23] we gave a detailed picture of how the decay of θ(t) depends on r and ω when $\eta(t) = r\sin(\omega t)$, $\theta_0 = 1$. For small r and $\omega^{-1}$ not too close to an integer we get an exponential decay with a decay rate $\Gamma(r, \omega) \sim r^{2(1 + \lfloor\omega^{-1}\rfloor)}$, where $\lfloor\omega^{-1}\rfloor$ is the integer part of $\omega^{-1}$. (For ω > 1, this corresponds to $\Gamma \sim \Gamma_F$.) At times large compared to $\Gamma^{-1}$, |θ(t)| decays as $t^{-3/2}$. The picture becomes much more complicated when r is large and/or $\omega^{-1}$ is an integer. In particular there is no monotonicity in |θ(t)| as a function of r. In [24] we proved complete ionization for the case where $C_n = 0$ for n > N, N ≥ 1.

4. We note here that Pillet [3] proved complete ionization for quite general $H_0$ under the assumption that $H_1(t)$ is “very random”, in fact a Markov process. Our results are not only consistent with this but support the expectation that generic perturbations will lead to complete ionization for general $H_0$. This is what we expect from entropic considerations – there is just too much phase space “out there”. The surprising thing is that even for our simple example one can readily find exceptions to the rule.


We should also mention here the work of Martin et al. [31, 32] who consider the case where $H_0$ has an isolated eigenvalue $E_0$ plus an absolutely continuous spectrum in the interval $[0, E_{\max}]$. They show that if the frequency ω of the periodic, small, perturbation $H_1(t)$ is larger than $E_0$ then the bound state is stable. This can be understood in terms of Fermi’s golden rule by noting that the density of states at the energy $E_0 + \omega > E_{\max}$ is zero so that $\Gamma_F$ would be zero.

5. There is a direct connection between our results and Floquet theory where, for a time-periodic Hamiltonian H(t) with period T = 2π/ω, one constructs a quasienergy operator (QEO) [2, 33, 34]

$$K = -i\frac{\partial}{\partial\theta} + H(\theta).$$

K acts on functions of x and θ, periodic in θ, i.e. on the extended Hilbert space $\mathcal{H} \otimes L^2(S, T^{-1}d\theta)$. Let now φ(x, θ) be an eigenfunction satisfying

$$K\phi = \mu\phi, \qquad \phi(x, \theta + T) = \phi(x, \theta); \tag{12}$$

then

$$\psi(x, t) = e^{-i\mu t}\phi(x, t)$$

is a solution of the Schrödinger equation $i\,\partial\psi/\partial t = H(t)\psi$. The existence of a real eigenvalue µ of the QEO with an associated $\phi(x, \theta) \in L^2(\mathbb{R}^d \otimes S)$ is thus seen to imply the existence of a solution of the time-dependent Schrödinger equation which is, in absolute value, periodic. This shows that for appropriate initial conditions, the particle has a nonvanishing probability of staying in a compact domain and thus, for the case considered here, that ionization is incomplete. We also note that for each such µ there is actually a whole set $\mu_n = \mu + n\omega$ of eigenvalues of K. For the specific model considered here, (12) takes the form

$$K\phi = -\frac{\partial^2\phi(x, \theta)}{\partial x^2} - 2(1 + \eta(\theta))\delta(x)\phi - i\frac{\partial\phi}{\partial\theta} = \mu\phi. \tag{13}$$
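The claim that an eigenfunction of the quasienergy operator yields a solution of the Schrödinger equation can be verified in one line. The computation below is a sketch added for illustration; it only uses the definition of K and the eigenvalue equation (12), with θ evaluated at t.

```latex
% With \psi(x,t) = e^{-i\mu t}\phi(x,t) and K = -i\,\partial_\theta + H(\theta):
i\,\partial_t\psi
  = e^{-i\mu t}\bigl(\mu\phi + i\,\partial_\theta\phi\bigr)\Big|_{\theta = t}
% K\phi = \mu\phi  means  H(\theta)\phi = \mu\phi + i\,\partial_\theta\phi, hence
  = e^{-i\mu t}\,H(t)\,\phi(x,t)
  = H(t)\,\psi(x,t).
% Periodicity of \phi in \theta gives |\psi(x,t+T)| = |\psi(x,t)| when \mu is real.
```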

We can now look for solutions of (13) in the form

$$\phi_\mu(x, \theta) = \sum_{n \in \mathbb{Z}} y_n e^{in\omega\theta}e^{\alpha_n x}$$

with $\alpha_n^{\pm} = \pm\sqrt{\mu - n\omega}$. Such a solution is in $L^2$ only if $\Re(\alpha_n x) < 0$, a condition which obviously selects different roots $\alpha_n$ depending on whether x > 0 or x < 0. The requirement that $\phi_\mu$ be in $L^2(\mathbb{R})$ leads to a set of matching conditions which determine whether such eigenvalues µ can exist. It is easy to see that $\phi_\mu$ has to be continuous at zero and satisfy the condition

$$\partial_x\phi_\mu(0^-, \theta) - \partial_x\phi_\mu(0^+, \theta) = 2(1 + \eta(\theta))\phi_\mu(0, \theta).$$

This implies, after taking the Fourier coefficients of both sides of the above equality, the recurrence relation

$$y_n(2 - \alpha_n^+ + \alpha_n^-) = 2\sum_{j \ne 0} C_j y_{n-j} \tag{14}$$

for which a (nontrivial) solution $y_n \in l_2$ is sought. This is effectively the same equation as (20) below, which is at the core of our analysis. Complete ionization thus corresponds to the absence of a discrete spectrum of the QEO and, conversely, stabilization implies the existence of such a discrete spectrum. In fact, an extension of Theorem 2 shows that for the initial condition $\psi_0 = u_b$, $\psi_t$ approaches such a function with $\mu = -s_0$. More details about Floquet theory and stability can be found in [33, 34].

6. We are currently investigating extensions of our results to the case where $H_0 = -\nabla^2 + V_0(x)$, $x \in \mathbb{R}^d$, has a finite number of bound states and the perturbation is of the form $\eta(t)V_1(x)$ and both $V_0$ and $V_1$ have compact support. Preliminary results indicate that, with much labor, we shall be able to generalize Theorem 1 to generic $V_1(x)$. The definition of genericity will, however, depend strongly on $V_0$. The physically important case of an external electric dipole field, $V_1(x) = -Ex$, can be transformed into the solution of a Schrödinger equation of the form $H(t) = -\nabla^2 + V_0(x - g(t))$, see [2]. This should, in principle, also be amenable to our methods but so far we have no results for that case.

Outline of the technical strategy. The method of proof relies on the properties of the Laplace transform of Y, $y(p) = \mathcal{L}Y(p) = \int_0^{\infty} e^{-pt}Y(t)\,dt$. Since the time evolution of ψ is unitary, |θ(t)| ≤ 1. This gives some a priori control on Y. For our purposes however it is useful to characterize directly the solution of the convolution equation (9). (We restrict ourselves to $\Theta_0(k) = 0$ and I(t) = 1 there.) We show that this equation has a unique solution in suitable norms. This solution is Laplace transformable and the Laplace transform y satisfies a linear functional equation.
The solution of the functional equation satisfied by the transform of Y is unique in the right half plane provided it satisfies the additional property that $y(p_0 + is)$ is square integrable in s for any $p_0 > 0$. Any such solution y transforms back (by the standard properties of the inverse Laplace transform) into a solution of our integral equation with no faster than exponential growth; however there is a unique locally integrable solution of this equation, and this solution is exponentially bounded. This must thus be our Y. We can thus use the functional equation to determine the analytic properties of y(p). This is done using (appropriately refined versions of) the Fredholm alternative. After some transformations, the functional equation reduces to a linear inhomogeneous recurrence equation in $l_2$, involving a compact operator depending parametrically on p, see e.g. (17). The dependence is analytic except for a finite set of poles and square-root branch-points on the imaginary axis, and we show that the associated homogeneous equation has no nontrivial solution. We then show that the poles in the coefficients do not create poles of y, while the branch points are inherited by y. The decay of y(p) when $|\Im(p)| \to \infty$, and the degree of regularity on the imaginary axis, give us the needed information about the decay of Y(t) for large t.

4. Behavior of y(p) in the Open Right Half Plane H

Lemma 3. (i) Equation (9) has a unique solution $Y \in L^1_{loc}(\mathbb{R}^+)$ and $|Y(t)| < Ke^{Bt}$ for some $K, B \in \mathbb{R}$.
(ii) The function $y(p) = \mathcal{L}Y$ exists and is analytic in $H_B = \{p : \Re(p) > B\}$.
(iii) In $H_B$, the function y(p) satisfies the functional equation

$$y = \sum_{j=-\infty}^{\infty} C_j T^j(h + by) \tag{15}$$

with

$$(Tf)(p) = f(p + i\omega), \qquad h(p) = -p^{-1} \qquad \text{and} \qquad b(p) = -\frac{i}{p}\left(1 + \sqrt{1 - ip}\right).$$

The branch of the square root is such that for $p \in H = \{p : \Re(p) > 0\}$, the real part of $\sqrt{1 - ip}$ is nonnegative and the imaginary part nonpositive.
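The branch condition just stated is the one realized by the principal square root. The following minimal sketch (an illustration, not part of the original argument) samples points with Re p > 0 and checks the sign conditions using Python's `cmath.sqrt`; the sample grid is an arbitrary choice.

```python
import cmath

def sqrt_branch(p: complex) -> complex:
    """Principal square root of 1 - i*p."""
    return cmath.sqrt(1 - 1j * p)

def branch_ok(p: complex) -> bool:
    """True if sqrt(1 - i*p) has nonnegative real and nonpositive imaginary part."""
    w = sqrt_branch(p)
    return w.real >= 0 and w.imag <= 0

# Sample points in the open right half plane Re(p) > 0 (arbitrary grid).
samples = [complex(re, im)
           for re in (1e-6, 0.1, 1.0, 10.0)
           for im in (-50.0, -1.0, 0.0, 0.3, 7.0)]
all_ok = all(branch_ok(p) for p in samples)
```

The reason is elementary: for Re p > 0 the number 1 - ip has imaginary part -Re p < 0, so its principal square root lies in the fourth quadrant.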

The straightforward proofs of this lemma are done in Appendix A. (Some of the results can also be gotten directly from standard results on the Schrödinger operators and on integral equations.)

Remark 4. It is clear that the functional equation (15) only links points on the one dimensional lattice $\{p + i\mathbb{Z}\omega\}$. It is convenient to take $p_0$ such that

$$p = p_0 + in\omega \quad \text{with} \quad \Re(p_0) = \Re(p) \quad \text{and} \quad \Im(p_0) \in [0, \omega). \tag{16}$$

The functions y, h, b in (15) will now depend parametrically on $p_0$. We set $y = \{y_j\}_{j\in\mathbb{Z}}$, $h = \{h_j\}_{j\in\mathbb{Z}}$, $b = \{b_j\}_{j\in\mathbb{Z}}$ with $y_n = y(p_0 + in\omega) = y(p)$ (and similarly for h(p) and b(p)). It is convenient to define the operator $(\hat{H}y)_n = b_n y_n$. Let $(Ty)_n = y_{n+1}$ be the right shift on $l_2(\mathbb{Z})$ (which we denote for simplicity by $l_2$) and rewrite (15) as

$$y = \sum_{j=-\infty}^{\infty} C_j T^j h + \sum_{j=-\infty}^{\infty} C_j T^j\hat{H}y \equiv f + Jy. \tag{17}$$

Proposition 5. For $\Re(p_0) > 0$ there exists a unique solution of (17) in $l_2$. This solution is analytic in $p_0$, $\Re(p_0) > 0$. Thus y(p) is analytic in p ∈ H and inverse Laplace transformable there with $\mathcal{L}^{-1}(y) = Y$.

Proof. The proof uses the Fredholm alternative. We first prove the following results.

Lemma 6. The operator J is compact on $l_2$ if $p_0 \ne 0$.

Proof. The proof uses standard compact operator results, see e.g. [30]. First note that the operator $\hat{H}$ is compact. This is straightforward: since $b_j \to 0$ as $|j| \to \infty$, it follows that $\hat{H}$ is the norm limit as N → ∞ of the finite rank operators defined by $(\hat{H}_N y)_j = b_j y_j$ for |j| ≤ N and $(\hat{H}_N y)_j = 0$ otherwise, and thus is compact. The operator J is the composition between the “convolution” operator C given by $(Cv)_n := (C * v)_n := \sum_{j\in\mathbb{Z}} C_j v_{n+j}$, which is continuous on $l_2$, and the compact operator $\hat{H}$. Thus J is compact. □

Remarks. 1. Note that $f \in l_2$ if $p_0 \ne 0$ (a straightforward consequence of the fact that C and h in (17) are in $l_2$).
2. The operator J is analytic in $p_0$, except for $p_0 = 0$, where the coefficients have poles, and for an additional value on the imaginary axis (possibly also 0), where the coefficients have square root branch points.
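The compactness argument in Lemma 6 rests on the decay of the diagonal entries $b_j = b(p_0 + ij\omega)$: for a diagonal operator, the distance in operator norm to its truncation $\hat{H}_N$ is the supremum of $|b_j|$ over $|j| > N$. The sketch below evaluates that tail supremum numerically. The values of `p0` and `omega` and the finite cutoff `J` are illustrative assumptions, and the cutoff only approximates the true supremum from below.

```python
import cmath

def b(p: complex) -> complex:
    """b(p) = -(i/p) * (1 + sqrt(1 - i p)), principal branch as in Lemma 3."""
    return -1j * (1 + cmath.sqrt(1 - 1j * p)) / p

def tail_sup(p0: complex, omega: float, N: int, J: int = 5000) -> float:
    """max of |b(p0 + i j omega)| over N < |j| <= J (finite proxy for ||H - H_N||)."""
    return max(abs(b(p0 + 1j * j * omega))
               for j in range(-J, J + 1) if abs(j) > N)

p0, omega = 0.3 + 0.2j, 1.7          # illustrative values with Re(p0) > 0
tails = [tail_sup(p0, omega, N) for N in (1, 10, 100, 1000)]
```

Since $|b_j|$ behaves like $|j\omega|^{-1/2}$ for large $|j|$, the entries of `tails` shrink slowly but steadily, which is all the norm-limit argument needs.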

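Remark 7. Setting, for $p_0 \ne 0$,

$$y_l = \left(\sqrt{1 - i(p_0 + il\omega)} - 1\right)z_l \tag{18}$$

the homogeneous equation

$$y = Jy \tag{19}$$

clearly has a (nontrivial) $l_2$ solution y only if

$$\left(\sqrt{1 - ip_0 + l\omega} - 1\right)z_l = -\sum_{k=1}^{\infty}\left(C_k z_{l+k} + \overline{C_k}\,z_{l-k}\right) \tag{20}$$

has a (nontrivial) $l_2$ solution z with

$$\left\{\left(\sqrt{1 - ip_0 + j\omega} - 1\right)z_j\right\}_{j\in\mathbb{Z}} \in l_2. \tag{21}$$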
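The substitution (18) works because of the identity $q_l\,(\sqrt{1 - ip_0 + l\omega} - 1) = 1$, with $q_l$ as defined in (24) below, so that $q_l y_l = z_l$ and the homogeneous equation turns into (20). A quick numerical check of this identity, with illustrative values of `p0` and `omega` chosen for this sketch:

```python
import cmath

def q(p0: complex, n: int, omega: float) -> complex:
    """q_n = (1 + sqrt(1 - i p0 + n omega)) / (-i p0 + n omega), cf. (24)."""
    s = -1j * p0 + n * omega
    return (1 + cmath.sqrt(1 + s)) / s

def identity_gap(p0: complex, n: int, omega: float) -> float:
    """| q_n * (sqrt(1 - i p0 + n omega) - 1) - 1 |, which should vanish."""
    root = cmath.sqrt(1 - 1j * p0 + n * omega)
    return abs(q(p0, n, omega) * (root - 1) - 1)

p0, omega = 0.4 + 0.3j, 0.7          # illustrative values with Re(p0) > 0
max_gap = max(identity_gap(p0, n, omega) for n in range(-20, 21))
```

Algebraically this is just $(1 + w)(w - 1) = w^2 - 1$ with $w^2 = 1 - ip_0 + l\omega$.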

Lemma 8. For any η under assumptions (a) to (c), if $p_0 \in H$ there is no nonzero $l_2$ solution of (20) such that (21) holds.

Proof. To get a contradiction, assume $z \in l_2$, $z \not\equiv 0$, satisfying (21), is a solution of (20). Multiplying (20) by $\overline{z_l}$, and summing with respect to l from −∞ to +∞ we get

$$\sum_{l=-\infty}^{\infty}\left(\sqrt{1 - ip_0 + l\omega} - 1\right)|z_l|^2 = -\sum_{l=-\infty}^{\infty}\sum_{k=1}^{\infty}\left(C_k z_{l+k}\overline{z_l} + \overline{C_k}\,z_{l-k}\overline{z_l}\right)$$
$$= -\sum_{l=-\infty}^{\infty}\sum_{k=1}^{\infty}\left(C_k z_l\overline{z_{l-k}} + \overline{C_k}\,z_{l-k}\overline{z_l}\right) = -2\,\Re\sum_{l=-\infty}^{\infty}\sum_{k=1}^{\infty} C_k z_l\overline{z_{l-k}}. \tag{22}$$

If $p_0 \in H$ the imaginary part of $\sqrt{1 - ip_0 + l\omega}$ is negative (see Remark 24) and thus, if some $z_l$ is nonzero then the left side of (22) has strictly negative imaginary part, which is impossible since the right side is real. □

Proof of Proposition 5. The existence of the analytic solution follows now immediately from the analytic Fredholm alternative and the analyticity of the coefficients, for $p_0 \in H$. The fact that $\{y_n\} \in l_2$ together with the stated analyticity imply that the function $\mathcal{L}^{-1}y(p)$ exists and satisfies the integral equation of Y, and thus coincides with Y. □

5. Behavior of y(p) in the Neighborhood of $\Re(p) = 0$ in the Generic Case

Discussion of methods. We start again from relation (17). This has the form

$$y_n = i\sum_j \frac{C_j}{-ip_0 + (n + j)\omega} - \sum_j C_j q_{n+j}y_{n+j}, \qquad C_0 = 0, \tag{23}$$

where

$$q_n = \frac{1 + \sqrt{1 - ip_0 + n\omega}}{-ip_0 + n\omega}. \tag{24}$$

As the imaginary axis $\Re(p_0) = 0$ is approached, two types of potential singularities in the coefficients need attention: the poles in the coefficients due to the presence of $p^{-1}$, and the square root singularities. It will turn out that by cancellation effects, the poles play no role, generically. The square root singularities will be manifested in the solution y. The study of these questions requires further regularization of the functional Eq. (23). It is convenient to separate out the terms in (23) which are singular at $p_0 = 0$. Using (from now on) the notation $s_0 = -ip_0$ we have

$$y_n = i\frac{C_{-n}}{s_0} - \frac{C_{-n}}{s_0}\left(1 + \sqrt{1 + s_0}\right)y_0 + i\sum_{j \ne -n}\frac{C_j}{s_0 + (n + j)\omega} - \sum_{j \ne -n} C_j q_{n+j}y_{n+j}, \qquad n \ne 0,$$
$$y_0 = i\sum_{j \ne 0}\frac{C_j}{s_0 + j\omega} - \sum_{j \ne 0} C_j q_j y_j. \tag{25}$$

We break up the proof into two parts, the non-resonant and resonant case. We start with the former.

5.1. The non-resonant case, $\omega^{-1} \notin \mathbb{N}$.

Proposition 9. If condition (g) is satisfied, and $\omega^{-1} \notin \mathbb{N}$, then the solution y of (25) is analytic in a small neighborhood of $s_0 = 0$.

For the proof we write $y_0 = i/2 + s_0u_0$, and for n ≠ 0 we make the substitution $y_n = v_n + d_nu_0$, where we will choose $d_n$ according to (26) in order to eliminate $u_0$ from all equations with n ≠ 0.

Lemma 10. (i) For $s_0 \in \mathbb{R}$ there exists a unique solution $d \in l_2(\mathbb{Z} \setminus \{0\})$ of the system

$$d_n = -C_{-n}\left(1 + \sqrt{1 + s_0}\right) - \sum_{k \ne 0} C_{k-n}q_kd_k, \qquad n \ne 0. \tag{26}$$

This solution is analytic at $s_0 = 0$.

(ii) With this choice of d, the system (25) becomes

$$v_n = f_n - \sum_{k \ne 0} C_{k-n}q_kv_k,$$
$$\Bigl(s_0 + \sum_{j \ne 0} C_jq_jd_j\Bigr)u_0 = f_0 - \sum_{j \ne 0} C_jq_jv_j, \tag{27}$$

where

$$f_0 = -\frac{i}{2} + i\sum_{j \ne 0}\frac{C_j}{s_0 + j\omega}, \qquad f_n = iC_{-n}\frac{1 - \sqrt{1 + s_0}}{2s_0} + i\sum_{k \ne 0}\frac{C_{k-n}}{s_0 + k\omega}. \tag{28}$$


(iii) For small $s_0$ we have $\sum_{j \ne 0} C_jq_jd_j \ne 0$, and the system (27) has a unique solution with $v \in l_2(\mathbb{Z} \setminus \{0\})$, and $v_n$, $u_0$ are analytic at $s_0 = 0$.

Proof. (i) Equation (26) is of the form $(I - J)d = c$ in $l_2(\mathbb{Z} \setminus \{0\})$, where $c_n = -(1 + \sqrt{1 + s_0})\,\overline{C_n}$ and

$$(Jd)_n = -\sum_{k \ne 0} C_{k-n}q_kd_k \qquad (n \ne 0).$$

We show first that Ker(I − J) = {0}. Indeed, assume d = Jd and set $D_k = q_kd_k$. Then we see that

$$q_n^{-1}D_n + \sum_{k \ne 0} C_{k-n}D_k = 0 \tag{29}$$

and, by multiplying with $\overline{D_n}$ and summing over n, we get

$$\sum_{n \ne 0} q_n^{-1}|D_n|^2 + \sum_{n,k \ne 0} C_{k-n}D_k\overline{D_n} = 0. \tag{30}$$

Note that, because $C_{-n} = \overline{C_n}$, the following quantity is real:

$$\overline{\sum_{n,k \ne 0} C_{k-n}D_k\overline{D_n}} = \sum_{n,k \ne 0} C_{n-k}\overline{D_k}D_n = \sum_{n,k \ne 0} C_{k-n}D_k\overline{D_n}, \tag{31}$$

implying that

$$\sum_{n \ne 0} q_n^{-1}|D_n|^2 \in \mathbb{R}$$

with (cf. (24))

$$q_n^{-1} = -1 + \sqrt{1 + s_0 + n\omega}.$$

Let $N_0 = -(1 + s_0)\omega^{-1} \in \mathbb{R}$. Obviously $q_n^{-1} \in \mathbb{R}$ for $n \ge N_0$ while for $n < N_0$ we have, by Remark 24,

$$\Im(q_n^{-1}) < 0.$$

Thus it is necessary that $D_n = 0$ for all $n < N_0$. Assume D ≠ 0. Let $N \in \mathbb{N}$ be such that $D_n = 0$ for all n < N and $D_N \ne 0$ (thus $N_0 \le N$). Then from (29),

$$\sum_{k \ge N;\,k \ne 0} C_{k-n}D_k = 0 \qquad \text{for any } n < N$$

or, setting k = N − 1 + j,

$$\sum_{j \ge 1,\,j \ne 1-N} C_{j+n}D_{N-1+j} = 0 \qquad \text{for } n \ge 0. \tag{32}$$

It is here that we use the genericity condition on C. In fact we will show that (32) implies D = 0 if condition (g) is satisfied. To see this define $\tilde{D} \in l_2(\mathbb{N})$ as $\tilde{D}_j = D_{N-1+j}$ if j ≥ 1, j ≠ 1 − N and, if 1 − N ≥ 1, $\tilde{D}_{1-N} = 0$. Then by (32) $\tilde{D}$ is orthogonal in $l_2(\mathbb{N})$ to all $T^nC$, n ≥ 0. By the genericity condition (g) then $\langle\tilde{D}, e_1\rangle = D_N = 0$, which is a contradiction. Thus D = 0. Since J is analytic in $s_0$ for small enough $s_0$, and compact by the same simple arguments as in Lemma 6, it follows that $(I - J)^{-1}$ exists and is analytic in $s_0$ at $s_0 = 0$.

(ii) This part is an immediate calculation.

(iii) Note first that $f \in l_2(\mathbb{Z} \setminus \{0\})$, because

$$\|f\| \le \left|\frac{1 - \sqrt{1 + s_0}}{2s_0}\right|\|c\| + \Biggl(\sum_{n \ne 0}\Bigl|\sum_{k \ne 0}\frac{C_{k-n}}{s_0 + k\omega}\Bigr|^2\Biggr)^{1/2} < \infty,$$

where the last term is finite since the convolution with C is continuous on $l_2$ (cf. the proof of Lemma 6) and

$$\sum_{k \ne 0}\frac{1}{|s_0 + k\omega|^2} < \infty.$$

Also, formula (28) expresses f in terms of a discrete measure integral with respect to k of a function which depends analytically on the (small) parameter $s_0$, and which is uniformly in $l_1$. Therefore f depends analytically on $s_0$. The rest of the proof of (iii) closely follows that of part (i), using the following result.

Lemma 11. For $s_0 = 0$ we have $\sum_{j \ne 0} C_jq_jd_j \ne 0$.

Proof. Assume the contrary was true. At $s_0 = 0$, with $D_n^0 = D_n|_{s_0=0}$ and $q_n^0 = q_n|_{s_0=0}$, relation (29), using (26), gives

$$\frac{D_n^0}{q_n^0} = -\sum_{k \ne 0} C_{k-n}D_k^0 - 2C_{-n} \qquad (n \ne 0). \tag{33}$$

Multiplying with $\overline{D_n^0}$ and summing over n ≠ 0 we would get

$$\sum_{n \ne 0}\left(-1 + \sqrt{1 + n\omega}\right)|D_n^0|^2 = -\sum_{k,n \ne 0} C_{k-n}D_k^0\overline{D_n^0} - 2\sum_{n \ne 0} C_{-n}\overline{D_n^0}, \tag{34}$$

and since we assumed $\sum_n C_nD_n^0 = 0$ then, as in the proof of Lemma 10 (i), it follows that $D_n^0 = 0$ for all $n < N_0 = -\omega^{-1}$. This gives, using (33), that

$$\sum_{k \ge N_0;\,k \ne 0} C_{k-n}D_k^0 + 2C_{-n} = 0. \tag{35}$$

Denote by $D^1 \in l_2$ the sequence $D_k^1 = D_k^0$ if k ≠ 0 and $D_0^1 = 2$. As in the proof of Lemma 10 (i), using the genericity condition (g), we get $D^1 = 0$, an obvious contradiction. □

This concludes the proof of Proposition 9: for generic η the solution y of (17) has, for $\omega^{-1} \notin \mathbb{N}$, analytic components $y_n$ when p = 0.

Square root singularities. We now study the behavior at the square root singularities of the coefficients of the equation of y. Let $k_0$ be the unique integer such that for some $s_r \in [0, \omega)$ we have $1 + s_r + k_0\omega = 0$ (then $s_r$ is a branch point in the coefficient q). The following proposition describes the analytic structure of y(p) near the imaginary axis.


Proposition 12. We have the decomposition $y_n = u_n + \sqrt{s_0 - s_r}\,v_n$, where $u_n$ and $v_n$ are analytic in $s_0$ in a complex neighborhood of the segment [0, ω).

Proof. The substitution $y_n = u_n + \sqrt{s_0 - s_r}\,v_n$, and

$$U_k = q_ku_k; \quad V_k = q_kv_k \qquad (k \ne k_0)$$

and

$$U_{k_0} = \frac{u_{k_0}}{s_0 + k_0\omega}; \qquad V_{k_0} = \frac{v_{k_0}}{s_0 + k_0\omega}$$

leads to the following system of equations for $U_n$ and $V_n$:

$$q_n^{-1}U_n = i\sum_k\frac{C_{k-n}}{s_0 + k\omega} - \sum_k C_{k-n}U_k - C_{k_0-n}(s_0 - s_r)V_{k_0} \qquad (n \ne k_0),$$
$$q_n^{-1}V_n = -\sum_k C_{k-n}V_k - C_{k_0-n}(s_0 - s_r)V_{k_0} - C_{k_0-n}U_{k_0} \qquad (n \ne k_0), \tag{36}$$
$$(s_0 + k_0\omega)U_{k_0} = i\sum_k\frac{C_{k-k_0}}{s_0 + k\omega},$$
$$(s_0 + k_0\omega)V_{k_0} = -\sum_k C_{k-k_0}V_k.$$

We now let $Q_{k_0} = s_0 + k_0\omega$ and, for $n \ne k_0$, $Q_n = q_n^{-1} = -1 + \sqrt{1 + s_0 + n\omega}$. We use again the Fredholm alternative and, as in the previous proofs, we need only to show the absence of a solution of the homogeneous equation at $s_0 = s_r$. We thus multiply the homogeneous equations associated to (36) in the following manner: the equation for $U_j$ by $\overline{U_j}$ and the equation for $V_j$ by $\overline{V_j}$, then sum over all j. As in the previous proofs, from the reality of the r.h.s. and then from the genericity condition (g), U ≡ 0. Then, similarly, V ≡ 0. The rest is immediate. □

5.2. The resonant case: $\omega^{-1} = M \in \mathbb{N}$.

In this case when $s_0 = 0$ there are poles in the coefficients of (23) when n + j = 0 and branch points when n + j = −M. The proof is a combination of the two regularization techniques used in the previous case.

Proposition 13. We can set $y(s_0) = A(s_0) + B(s_0)\sqrt{s_0}$ with A and B analytic in a complex neighborhood of the segment [0, ω).

Proof. Special care is only needed near $s_0 = 0$. The system (26)–(28) now reads

$$d_n = -C_{-n}\left(1 + \sqrt{1 + s_0}\right) - \sum_{k \notin \{0,-M\}} C_{k-n}q_kd_k - C_{-M-n}\frac{1 + \sqrt{s_0}}{s_0 - 1}\,d_{-M},$$
$$v_n = f_n - \sum_{k \notin \{0,-M\}} C_{k-n}q_kv_k - C_{-M-n}\frac{1 + \sqrt{s_0}}{s_0 - 1}\,v_{-M}. \tag{37}$$


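We take $d_n = \alpha_n + \sqrt{s_0}\,\beta_n$ and $v_n = \gamma_n + \sqrt{s_0}\,\delta_n$. The system becomes

$$\alpha_n = -C_{-n}\left(1 + \sqrt{1 + s_0}\right) - \sum_{k \notin \{0,-M\}} C_{k-n}q_k\alpha_k - C_{-M-n}\frac{1}{s_0 - 1}(\alpha_{-M} + s_0\beta_{-M}),$$
$$\beta_n = -\sum_{k \notin \{0,-M\}} C_{k-n}q_k\beta_k - C_{-M-n}\frac{1}{s_0 - 1}(\alpha_{-M} + \beta_{-M}), \tag{38}$$

$$\gamma_n = f_n - \sum_{k \notin \{0,-M\}} C_{k-n}q_k\gamma_k - C_{-M-n}\frac{1}{s_0 - 1}(\gamma_{-M} + s_0\delta_{-M}),$$
$$\delta_n = -\sum_{k \notin \{0,-M\}} C_{k-n}q_k\delta_k - C_{-M-n}\frac{1}{s_0 - 1}(\delta_{-M} + \gamma_{-M}). \tag{39}$$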

The system (38) is of the form

$$\begin{pmatrix}\alpha \\ \beta\end{pmatrix} = S(s_0)\begin{pmatrix}\alpha \\ \beta\end{pmatrix} + \begin{pmatrix}F_1 \\ F_2\end{pmatrix},$$

where α, β, $F_1$, $F_2$ are in $l_2$. We prove that the homogeneous equation has no nontrivial solutions:

Lemma 14. $(I - S(0))\begin{pmatrix}\alpha \\ \beta\end{pmatrix} = 0$ implies $\begin{pmatrix}\alpha \\ \beta\end{pmatrix} = 0$.

Proof. Let $Q_n = q_n$, $A_n = q_n\alpha_n$, $B_n = q_n\beta_n$ for n ≠ 0, −M and $Q_{-M} = -1$, $A_{-M} = -\alpha_{-M}$ and $B_{-M} = -\beta_{-M}$. The system (38) becomes

$$Q_n^{-1}A_n = -\sum_{k \ne 0} C_{k-n}A_k,$$
$$Q_n^{-1}B_n = -\sum_{k \ne 0} C_{k-n}B_k - C_{-M-n}A_{-M}. \tag{40}$$

As in the proofs in Case I, multiplying the first equation by $\overline{A_n}$ and summing over n, we first get from the reality of the r.h.s. that $A_n = 0$ for n < −M, and then by the condition (g) we get that A ≡ 0. The conclusion B ≡ 0 now follows in the same way. □

End of proof of Proposition 13. The operator S is compact on $l_2 \oplus l_2$, and S and $(F_1, F_2)$ are analytic in a complex neighborhood of 0. We saw in Lemma 14 that the kernel of I − S(0) is trivial, and by the analytic Fredholm alternative it follows that $(I - S(s_0))^{-1}$ exists and is analytic in a small neighborhood of $s_0 = 0$. Hence (α, β) are analytic. Similarly, γ, δ are analytic in the same region. □

5.3. Proof of Theorem 1.

Combining the above results we have the following conclusion:

Proposition 15. If condition (g) is fulfilled, then y(p) is analytic in a neighborhood of $i\mathbb{R} \setminus \{is_r + i\omega\mathbb{Z}\}$. For any $j \in \mathbb{Z}$, in a neighborhood of $p = is_r + ij\omega$ ($s_r \in \mathbb{R}$) y has the form $y(p) = A_j(p) + B_j(p)\sqrt{-ip - s_r - j\omega}$, where $A_j$ and $B_j$ are analytic. In particular, y is Lipschitz continuous of exponent 1/2 in the closed right half plane. Thus $Y(t) = O(t^{-3/2})$ for large t.


Proof. All but the last claim have already been shown. The last statement is a standard Tauberian theorem (note that $\mathcal{L}^{-1}$ is the Fourier transform along the imaginary line). □

Proposition 16. We have $\theta(t) \to 0$ as $t \to \infty$.

Proof. We can write (9) (with $I(t) = 1$) as

\[
Y = \eta\,(\theta + M * Y).
\tag{41}
\]

It is easy to check, in view of the fact that $M$ and $Y$ are $O(t^{-3/2})$, that $M * Y \to 0$. Furthermore $1 + 2i\int_0^t Y(s)\,ds$ is convergent as $t \to \infty$. Thus $\theta(t) \to \mathrm{const}$ as $t \to \infty$. Since now the l.h.s. of (9) converges to zero and $\eta$ does not, the equality (41) is only consistent if $\theta(t) \to 0$. □

This completes the proof of Theorem 1 for the case $\psi_0 = u_b = e^{-|x|}$. The general case follows by noting that the inhomogeneous term does not affect the main argument, using the Fredholm alternative. Hence we will still have $|\theta(t)| \to 0$ but the rate of decay may be different.

6. A Nongeneric Example

Let $\eta$ be given by (11), for which

\[
C_n = -r\lambda^n \ \text{ for } n \ge 1, \qquad C_n = C_{-n}.
\tag{42}
\]

As in Sect. 5 set $-ip_0 = s_0$ and let $q_n$ be given by (24). Denote

\[
a_n = a_n(s_0) = \frac{1}{r\,q_n} = \frac{1}{r}\left(\sqrt{1 + s_0 + n\omega} - 1\right).
\tag{43}
\]

For $r \in (0, 1)$, $\omega > 1$, $\omega^{-1} \in \mathbb{N}$ such that $(1 - r)^2 < \omega - 1$, let $s_r$ and $s_p$ be the unique numbers in $(0, \omega)$ so that $1 + s_r \in \omega\mathbb{Z}$ and $1 + a_{-1}(s_p) = 0$. We choose $r, \omega$ such that $s_r \neq s_p$.

6.1. The homogeneous equation.

Lemma 17. Let $s_{0,0}$ be a point in $(0, s_r) \cup (s_r, \omega)$. Consider $s_0$ in a small enough neighborhood of $s_{0,0}$. The linear operator $J = J(s_0)$ of (17) depends analytically on $s_0$, and is compact on $l^2$. For $s_0 \neq s_p$, $(I - J(s_0))^{-1}$ exists and is analytic.

Lemma 18. Denote for short $J_0 = J(s_r)$. There exists a value $\lambda = \lambda_s \in (0, 1)$ such that

\[
\dim \mathrm{Ker}\,(I - J_0) = 1.
\tag{44}
\]

Denote by $A$ the diagonal (unbounded) operator $(Az)_n = a_n z_n$ in $l^2$; $A^{-1}$ is bounded.

Lemma 19. For $\lambda = \lambda_s$ as in Lemma 18 we have

\[
\mathrm{Ker}\,(I - J_0) = A\,\mathrm{Ker}\,(I - J_0^*).
\tag{45}
\]


6.2. Proof of Lemma 17. The operator $J$ is compact by Lemma 6. To show that $I - J$ is invertible we prove this for any point $s_0 \in (0, \omega)$, $s_0 \neq s_p, s_r$; by the analytic Fredholm theorem it will follow that $I - J$ is invertible in a small enough neighborhood of any such point, thus proving the lemma.

Let $s_0 \in (0, \omega)$, $s_0 \neq s_p, s_r$. As in Remark 7 in Sect. 5, the substitution $y_n = a_n z_n$ ($n \in \mathbb{Z}$) transforms the homogeneous equation (19) to

\[
a_n z_n = \sum_{j=1}^{\infty} \lambda^j \left(z_{n+j} + z_{n-j}\right), \qquad n \in \mathbb{Z}.
\tag{46}
\]

Note that $\Im a_n < 0$ for $n < -1$ for $s_0 \in [\omega - 1, \omega)$ and $\Im a_n < 0$ for $n < 0$ for $s_0 \in (0, \omega - 1)$. We will discuss only the first case, $s_0 \ge \omega - 1$, since the second one is completely analogous. As in the proof of Lemma 8, it follows that

\[
z_n = 0 \quad \text{for } n < -1.
\tag{47}
\]

Then Eqs. (46) for $n < -1$ become

\[
\sum_{k=1}^{\infty} \lambda^k z_{k-2} = 0.
\tag{48}
\]

For $n = -1$, (46) gives

\[
(a_{-1} + 1)\,z_{-1} = 0,
\tag{49}
\]

and for $n \ge 0$, using (48), we get

\[
(1 + a_n)\,z_n = \sum_{j=1}^{n+1} \left(\lambda^j - \lambda^{-j}\right) z_{n-j}, \qquad n \ge -1.
\tag{50}
\]

Since $s_0 \neq s_p$, (49) gives $z_{-1} = 0$, and it follows by induction, from (50), that $z_n = 0$ for all $n$. By the Fredholm alternative theorem then $I - J(s_0)$ is invertible.

6.3. Proof of Lemma 18. In what follows $s_0 = s_r$.

6.3.1. An auxiliary lemma. We show that if $z \in l^2$ then Eq. (48) is redundant.

Lemma 20. If $z$ is an $l^2$ solution of (50) with $z_n = 0$ for $n < -1$ then $z$ satisfies (48).

Proof. Let $z \in l^2$ be a solution of (50). Then

\[
Z^{[n+1]} \equiv \sum_{k=1}^{n+1} \lambda^k z_{k-2}
\tag{51}
\]

is the truncation of a convergent series, since there is a constant $M$ with $|z_n| < M$ for all $n$. Note that

\[
(1 + a_n)\,z_n = \sum_{j=1}^{n+1} \lambda^j z_{n-j} - \lambda^{-n-2} Z^{[n+1]},
\]

hence

\[
Z^{[n+1]} = \lambda^{n+2} \sum_{j=1}^{n+1} \lambda^j z_{n-j} - \lambda^{n+2}(1 + a_n)\,z_n,
\]

so that

\[
\left|Z^{[n+1]}\right| \le \lambda^{n+2}\,\frac{M\lambda}{1-\lambda} + \lambda^{n+2} M\left(1 + \mathrm{const}\,\sqrt{n}\right) \longrightarrow 0 \quad \text{as } n \to \infty.
\tag{52}
\]

Since (51) are truncations of the series in the LHS of (48), (52) implies (48). □

6.3.2. Behavior of the general solution of (50). A direct calculation shows that the sequence $z_n$ satisfying the infinite order recurrence (50) and the initial condition $z_{-1} = 1$ satisfies, in fact, the three step recurrence

\[
(1 + a_{n+1})\,z_{n+1} + (1 + a_{n-1})\,z_{n-1} = \left[\lambda(1 + a_n) + \lambda + a_n\lambda^{-1}\right] z_n \qquad (n \ge 0)
\tag{53}
\]

with the initial condition

\[
z_{-1} = 1, \qquad z_0 = \frac{\lambda - \lambda^{-1}}{1 + a_0}.
\tag{54}
\]
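The passage from the infinite order recurrence (50) to the three step recurrence (53) is stated without the computation. The following Python sketch checks it numerically rather than proving it: it generates $z_n$ from (50) with $z_{-1} = 1$ and measures the relative defect of (53). The parameter values ($\lambda = 0.5$, $\omega = 1.1$, $r = 0.9$), the function names, and the formula $s_p = \omega - 1 + (1-r)^2$ (obtained here by solving $1 + a_{-1}(s_p) = 0$ with $a_n$ as in (43)) are assumptions made for this illustration, not values taken from the paper.

```python
import math


def a_coeff(n, s0, omega, r):
    """a_n(s0) as in (43); only used here where 1 + s0 + n*omega >= 0."""
    return (math.sqrt(1.0 + s0 + n * omega) - 1.0) / r


def solve_recurrence_50(lam, s0, omega, r, n_max):
    """Return {n: z_n} for n = -1..n_max from (50), starting with z_{-1} = 1."""
    z = {-1: 1.0}
    for n in range(n_max + 1):
        rhs = sum((lam**j - lam**(-j)) * z[n - j] for j in range(1, n + 2))
        z[n] = rhs / (1.0 + a_coeff(n, s0, omega, r))
    return z


def max_relative_defect_53(lam, omega, r, n_max):
    """Largest relative defect of the three step recurrence (53) for 0 <= n < n_max."""
    # Assumed value of s_p: it makes 1 + a_{-1}(s_p) = 0 (needs 0 < r < 1 < omega).
    s0 = omega - 1.0 + (1.0 - r) ** 2
    z = solve_recurrence_50(lam, s0, omega, r, n_max)
    worst = 0.0
    for n in range(n_max):
        a_prev = a_coeff(n - 1, s0, omega, r)
        a_n = a_coeff(n, s0, omega, r)
        a_next = a_coeff(n + 1, s0, omega, r)
        term_next = (1.0 + a_next) * z[n + 1]
        term_prev = (1.0 + a_prev) * z[n - 1]
        rhs = (lam * (1.0 + a_n) + lam + a_n / lam) * z[n]
        scale = abs(term_next) + abs(term_prev) + abs(rhs)
        worst = max(worst, abs(term_next + term_prev - rhs) / scale)
    return worst


print(max_relative_defect_53(lam=0.5, omega=1.1, r=0.9, n_max=15))
```

A defect at the level of floating-point rounding is what one expects if (53) follows from (50); the check says nothing about which solutions are in $l^2$.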

Denote

\[
z_n = \frac{\lambda - \lambda^{-1}}{1 + a_n}\,V_{n-1},
\tag{55}
\]

then (53) becomes

\[
V_n + V_{n-2} = \left(\lambda + \frac{\lambda^2 + a_n}{\lambda(1 + a_n)}\right) V_{n-1}, \qquad n \ge 1.
\tag{56}
\]

We are looking for $l^2$ solutions. Recent rigorous WKB estimates (see e.g. [35]) would imply there are solutions of the discrete equation (56) behaving like $\lambda^{-n}e^{-\sqrt{n/\omega}}$ and like $\lambda^{n}e^{\sqrt{n/\omega}}$. We will prove this in our context and find special values of $\lambda$ for which the solution decaying for large $n$ satisfies the initial condition. We will show that this solution is obtained by taking

\[
V_{n-2} = g_{n-1}V_{n-1}
\tag{57}
\]

in (56) and iterating:

\[
g_{n-1} = G_n - \frac{1}{g_n} \quad \text{with} \quad G_n = \lambda + \frac{\lambda^2 + a_n}{\lambda(1 + a_n)},
\tag{58}
\]

i.e., $g_0$ is given by the continued fraction

\[
g_0 = G_1 - \cfrac{1}{G_2 - \cfrac{1}{G_3 - \cdots}},
\]

which needs to match the initial condition (see (54)):

\[
g_0 = g_0(\lambda) = \frac{1}{\lambda + \lambda^{-1} + (1 + a_0)^{-1}(\lambda - \lambda^{-1})}.
\tag{59}
\]
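The matching problem between the continued fraction from (58) and the initial condition (59) can be explored numerically. The Python sketch below is only an illustration under stated assumptions: the continued fraction is truncated at a finite depth and seeded with $g_{\mathrm{depth}} = \lambda^{-1}$ (the large-$n$ value of $G_n - \lambda$ as $a_n \to \infty$), and the parameters $s_0 = 0.11$, $\omega = 1.1$, $r = 0.9$ as well as all function names are choices made here, not the paper's. It evaluates both sides and their difference; sign changes of `mismatch` in $\lambda$ are candidates for solutions of the matching condition, and no specific value is claimed.

```python
import math


def a_coeff(n, s0, omega, r):
    """a_n(s0) as in (43), for 1 + s0 + n*omega >= 0."""
    return (math.sqrt(1.0 + s0 + n * omega) - 1.0) / r


def big_g(n, lam, s0, omega, r):
    """G_n from (58)."""
    a = a_coeff(n, s0, omega, r)
    return lam + (lam**2 + a) / (lam * (1.0 + a))


def g0_continued_fraction(lam, s0, omega, r, depth):
    """Truncated continued fraction for g_0, seeded with g_depth = 1/lam (an assumption)."""
    g = 1.0 / lam
    for n in range(depth, 0, -1):
        # Backward step of (58): g_{n-1} = G_n - 1/g_n.
        # Raises ZeroDivisionError if some g_n is exactly zero (a pole of g_{n-1}).
        g = big_g(n, lam, s0, omega, r) - 1.0 / g
    return g


def g0_initial_condition(lam, s0, omega, r):
    """Right-hand side of (59)."""
    a0 = a_coeff(0, s0, omega, r)
    return 1.0 / (lam + 1.0 / lam + (lam - 1.0 / lam) / (1.0 + a0))


def mismatch(lam, s0, omega, r, depth=400):
    """Difference between the two expressions for g_0; zero at a matching value of lam."""
    return g0_continued_fraction(lam, s0, omega, r, depth) - g0_initial_condition(lam, s0, omega, r)
```

The backward direction is used because the map $g \mapsto G_n - 1/g$ contracts near $g = \lambda^{-1}$ (its derivative there is $\lambda^2 < 1$), so the truncation error introduced by the seed is damped as the index decreases.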


Lemma 21. (i) Let $\lambda \in (0, 1)$. The recurrence (58) has a solution such that $g_n \to \lambda^{-1}$ as $n \to \infty$.
(ii) $g_0$ is meromorphic in $\lambda$ on $[0, 1)$ and has poles.
(iii) There exists $\lambda_s \in (0, 1)$ such that $g_0(\lambda_s)$ satisfies (59).
(iv) Let $\lambda = \lambda_s$. To the solution of (i) there corresponds a solution $V^{[s]}$ of the recurrence (56) such that $V_n^{[s]} \sim \lambda_s^{n+o(n)}$ as $n \to \infty$. The corresponding solution $z^{[s]}$ of (50) satisfies $z_n \to 0$ as $n \to \infty$.
(v) Let $\lambda = \lambda_s$. There exists a solution of (56) with the asymptotic behavior $V_n^{[l]} \sim \lambda_s^{-n+o(n)}$.

Thus, for $\lambda = \lambda_s$, there exists a unique (up to a multiplicative constant) "small" solution of (56), with the behavior $V_n^{[s]} \sim \lambda_s^{n+o(n)}$ for large $n$, while the general solution behaves like $V_n \sim \lambda_s^{-n+o(n)}$. As a consequence, a similar statement holds for the recurrence (53).

Remark. The proof of (iii) can be refined to show that, in fact, there is a countable set of points $\lambda_s$ for which $g_0$ satisfies the initial condition, and that these values accumulate to 1.

Proof. (i) With the substitution

\[
g_n = G_{n+1} - \lambda + \delta_n,
\tag{60}
\]

the recurrence (58) becomes

\[
\delta_n = \lambda - \frac{1}{G_{n+2} - \lambda + \delta_{n+1}} \equiv (S\delta)_n, \qquad n \ge 0.
\tag{61}
\]

For $n_0 \ge 0$ and $\epsilon$ small, positive, define $\lambda_{n_0} = a_{n_0+2}\,(2 + a_{n_0+2})^{-1} - \epsilon$. Let $N_{n_0}$ be a small neighborhood of the interval $I_{n_0} = [0, \lambda_{n_0}]$. Consider the Banach space $B_{n_0}$ of sequences $\{\delta_n\}_{n \ge n_0}$ with $\delta_n = \delta_n(\lambda)$ analytic on $N_{n_0}$ and continuous up to the boundary, with the norm $\|\delta\| = \sup_{n \ge n_0}\sup_{\lambda \in N_{n_0}} |\delta_n(\lambda)|$. Direct estimates show that the operator $S$ defined by (61) takes the ball of radius $\rho_{n_0} = 2/(2 + a_{n_0+2}) + \epsilon'$ in $B_{n_0}$ into itself (if $\epsilon$, $\epsilon'$ and $N_{n_0}$ are small enough), and is a contraction in this ball. Therefore the equation $\delta = S(\delta)$ has a unique solution in $B_{n_0}$, of norm less than $\rho_{n_0}$. Then $|\delta_n(\lambda)| < \mathrm{const}\,(n + 2)^{-1/2}$ for all $\lambda \in I_n$ and all $n \ge 0$. Since the sequence $\lambda_n$ increases to 1, (i) follows.

(ii) Step I: All $g_n$ are meromorphic on $[0, 1)$. Since $\delta_n$ is analytic on $I_n$, then from (60), $g_n$ is analytic on $I_n \setminus \{0\}$, having a pole at $\lambda = 0$: $g_n \sim \lambda^{-1}a_{n+1}(1 + a_{n+1})^{-1}$ ($\lambda \to 0$). Iterating (58) it follows that $g_{n-1}, g_{n-2}, \ldots, g_0$ are meromorphic on $I_n$. Since the intervals $I_n$ increase toward $[0, 1)$ it follows that $g_0, g_1, \ldots, g_n, \ldots$ are meromorphic on $[0, 1)$.

Step II: There exist $n_1$ and $\lambda_0 \in (0, 1)$ such that $g_{n_1}(\lambda_0) \le 0$. Define $F_n = (1 + a_n)^{-1}$; we have (see (43))

\[
F_{n_0} \sim r\,(n_0\omega)^{-1/2}, \qquad n_0 \to \infty.
\tag{62}
\]

Let $n_0$ be large and denote $\lambda_0 = 1 - F_{n_0}$. Let $N_0$ be large enough so that $\lambda_0$ is in the domain of analyticity of $g_{N_0}$. Iterating (58) starting from $N_0$ (and decreasing indices) we get the value $g_{n_0}(\lambda_0)$. If for some $n \in \{n_0, n_0 + 1, \ldots, N_0\}$ we get $g_n(\lambda_0) \le 0$, Step II is proved. Then assume that $g_{n_0}(\lambda_0) > 0$.


Consider the recurrence

\[
\tilde r_{n-1} = G_{n_0}(\lambda_0) - \frac{1}{\tilde r_n} \quad \text{for } n \le n_0, \qquad \tilde r_{n_0} = g_{n_0}(\lambda_0),
\tag{63}
\]

where, in fact, $G_{n_0}(\lambda_0) = 2 - F_{n_0}^2$. The recurrence (63) can be solved explicitly (it is a discrete Riccati equation and a substitution $\tilde r_{n-1} = x_{n-1}/x_n$ transforms it into a linear recurrence with constant coefficients). It has the solution

\[
\tilde r_n = \frac{\cos\left((n - n_0)\phi + \theta\right)}{\cos\left((n + 1 - n_0)\phi + \theta\right)},
\tag{64}
\]

where $\cos\phi = 1 - F_{n_0}^2/2$, $\sin\phi > 0$, and $\tan\theta = (\cos\phi - \lambda)/\sin\phi$ so that $\theta \sim \frac{\pi}{4} - \frac{1}{4}F_{n_0}$ ($F_{n_0} \to 0$). We assume, to get a contradiction, that $g_n(\lambda_0) > 0$ for all $n = 0, 1, \ldots, n_1$. Then

\[
g_n(\lambda_0) \le \tilde r_n \quad \text{for } n \le n_0,
\tag{65}
\]

which follows immediately by induction using (58), (63), noting that $G_n$ is increasing in $n$. Note that there is an $n_1 \in \{1, 2, \ldots, n_0 - 1\}$ so that

\[
\tilde r_n > 0 \ \text{ for } n \in \{n_1 + 1, \ldots, n_0\} \quad \text{and} \quad \tilde r_{n_1} < 0.
\tag{66}
\]

Indeed (from (62)) when $n$ decreases from $n_0$ the numerator and denominator in (64) increase up to 1, then decrease, until the numerator becomes negative, when $n$ equals $n_1 = n_0 - k_1$, where $k_1$ is the integer with $k_1 - 1 < (\pi/2 + \theta)/\phi \le k_1$. Since $\phi \sim F_{n_0}$ ($F_{n_0} \to 0$) then $k_1 \sim (3\pi)/(4F_{n_0})$, and, using (62), clearly $k_1 \in \{1, \ldots, n_0 - 1\}$ (if $n_0$ is sufficiently large). Then (65) and (66) contradict the assumption that $g_{n_1}(\lambda_0) > 0$, and Step II is proved.

Step III. The function $g_{n_1}$ is meromorphic on $[0, 1)$, with $g_{n_1}(0+) = +\infty$. There is a smallest value of $\lambda$ in $(0, \lambda_0)$ where $g_{n_1}$ changes sign: this is either a zero, or a pole. Assume it was a pole. Let $p \in (0, \lambda_0)$ be the first pole of $g_{n_1}$. Then $g_{n_1}$ is positive and analytic on $(0, p)$, and $g_{n_1}(p-) = +\infty$, $g_{n_1}(p+) = -\infty$. Since $g_{n+1} = 1/(G_{n+1} - g_n)$ (see (58)) then $g_{n_1+1}(p-) = 0-$, hence $g_{n_1+1}$ changes sign in $(0, p)$. But $g_{n_1+1}$ has no zero in $(0, p)$ (otherwise at that zero $g_{n_1}$ would have had a pole, from (58)). Then $g_{n_1+1}$ has a pole, with a change of sign, from $+$ to $-$, in $(0, p)$. Now the argument can be repeated. It follows that for any $k > 0$, $g_{n_1+k}$ has a pole in $(0, p)$, which contradicts the fact that the domain of analyticity of $g_n$ increases to $(0, 1)$ as $n \to \infty$. Therefore, the first change of sign of $g_{n_1}$ is at a zero.

Let $\zeta_1$ be the smallest value in $(0, 1)$ such that $g_{n_1}(\zeta_1-) = 0+$, $g_{n_1}(\zeta_1+) = 0-$. Then from (58) we have $g_{n_1-1}(\zeta_1-) = -\infty$ and $g_{n_1-1}$ changes sign in $(0, \zeta_1)$. Now the argument can be repeated. It follows that $g_0$ has a pole at a point $\zeta_{n_1}$ with $g_0(\zeta_{n_1}-) = -\infty$.

(iii) Since $g_0(\lambda)$ takes all the values when $\lambda \in (0, \zeta_{n_1})$ there exists $\lambda = \lambda_s \in (0, 1)$ such that (59) holds.

(iv) For $\lambda = \lambda_s$, since the solution of (i) satisfies $g_n(\lambda) = \lambda^{-1} + O(n^{-1/2})$ we have from (57), with the notation $V^{[s]} = V(\lambda_s)$, that $V_n^{[s]} = \prod_{k=0}^{n} g_k(\lambda_s)^{-1}\,V_0^{[s]} = O(\lambda_s^{n+o(n)})$ and thus $V_n^{[s]} - V_{n-1}^{[s]} = O(\lambda_s^{n+o(n)})$; then from (55) $z_n^{[s]} = O(\lambda_s^{n+o(n)})$.

(v) The substitution (variation of constants) $V_n = V_n^{[s]}v_n$ brings the recurrence (56) to a first order one: with the notation $I_n = v_n - v_{n-1}$ we have $I_n = \left(V_{n-2}^{[s]}/V_n^{[s]}\right) I_{n-1}$ and the rest of the argument consists of straightforward estimates. □
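The explicit solution (64) of the constant-coefficient Riccati recurrence (63), and the location $n_1 = n_0 - k_1$ of its first sign change used in Step II, can be checked numerically. The sketch below is an illustration only: the values $F = 0.1$ and $n_0 = 50$, the function names, and the use of the leading-order value $\theta = \pi/4 - F/4$ in place of the exact $\theta$ are assumptions made here. Since (64) solves (63) for any phase $\theta$, the recurrence check does not depend on that approximation.

```python
import math


def riccati_solution(n, n0, phi, theta):
    """r~_n from (64)."""
    return math.cos((n - n0) * phi + theta) / math.cos((n + 1 - n0) * phi + theta)


def check_riccati(F, n0):
    """Return (max defect of (63), n1, whether the sign pattern (66) holds)."""
    phi = math.acos(1.0 - F**2 / 2.0)      # cos(phi) = 1 - F^2/2 with sin(phi) > 0
    G = 2.0 - F**2                          # G_{n0}(lambda_0) = 2 - F^2 = 2*cos(phi)
    theta = math.pi / 4.0 - F / 4.0         # leading-order phase (an approximation)
    k1 = math.ceil((math.pi / 2.0 + theta) / phi)
    n1 = n0 - k1
    max_defect = 0.0
    for n in range(n0, n1, -1):             # verify r~_{n-1} = G - 1/r~_n down to n = n1 + 1
        lhs = riccati_solution(n - 1, n0, phi, theta)
        rhs = G - 1.0 / riccati_solution(n, n0, phi, theta)
        max_defect = max(max_defect, abs(lhs - rhs))
    positive_above = all(riccati_solution(n, n0, phi, theta) > 0 for n in range(n1 + 1, n0 + 1))
    negative_at_n1 = riccati_solution(n1, n0, phi, theta) < 0
    return max_defect, n1, positive_above and negative_at_n1


print(check_riccati(F=0.1, n0=50))
```

With these sample values $k_1$ is close to $3\pi/(4F) \approx 23.6$, in line with the asymptotic statement in the text.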

Fig. 1. Graph of g0 given by (58) (discontinuous graph) and by (59) in a region near λ = 1, as functions of λ

6.3.3. Proof of Lemma 18. Lemma 21(v) shows that Eq. (53) has a unique (up to a multiplicative constant) small solution, $z_n^{[s]} \sim \lambda_s^{n+o(n)}$ ($n \to \infty$), while the general solution behaves like $z_n \sim \lambda_s^{-n+o(n)}$. Since $y_n \sim \sqrt{n}\,z_n$ the uniqueness of the $l^2$ solution is proven.

6.3.4. Examples of solutions. We will show next how concrete values $\lambda_s$ satisfying Lemma 21 (iii) are relatively straightforwardly, and rigorously, found. One method is as follows. Note that the minimum/maximum of the function $a - b/x$, where $x$ varies in an interval not containing zero, is achieved at the endpoints. We thus take the recurrence (58) with initial conditions $g_{n_0} = \lambda^{-1} \pm \frac{1 - \lambda^2}{\sqrt{n\omega}}$ and compute $g_0$ from these. The actual graph will be between these two, unless the condition mentioned is violated in between $n_0$ and 0. This graph is to be intersected with the graph of the initial condition (59). We take for instance $\omega = 1.1$, $r = 0.45$, $s_p = 0.11$, $n_0 = 10$, for which the rigorous control is not too involved. The two graphs are very close to each other (within about $3 \cdot 10^{-6}$ for $\lambda \in (0.3, 0.4)$) and cannot be distinguished from each other in Fig. 1. A first intersection is seen at $\lambda \approx 0.327$; see Fig. 2.

6.4. Proof of Lemma 19. Denote $B = (I - J_0)A$; we have $B = A - S$. Hence $B^* = \overline{A} - S$. Then $\mathrm{Ker}(B) = \mathrm{Ker}(B^*)$ (since $Az = Sz$ implies (47), so $Az = \overline{A}z$, and similarly, $\overline{A}z = Sz$ implies $Az = \overline{A}z$). So $\mathrm{Ker}[(I - J_0)A] = \mathrm{Ker}[\overline{A}(I - J_0^*)]$ so that (since $\overline{A}$ is one-to-one) $A^{-1}\mathrm{Ker}(I - J_0) = \mathrm{Ker}(I - J_0^*)$, which proves the lemma.

6.5. Discussion of the singularities of solutions of (17). Let $\lambda = \lambda_s$. We have that $I - J$ is invertible for $\Re p_0 > 0$, and is not invertible at $p_0 = is_p$ (Lemma 18). By the analytic Fredholm theorem (see e.g. [30]) $(I - J)^{-1}$ is meromorphic on a small neighborhood of $is_p$, therefore there exist $m \ge 1$ and operators $S_m, \ldots, S_1, R(p_0)$ so that:

\[
(I - J)^{-1} = \frac{1}{(p_0 - is_p)^m}\,S_m + \cdots + \frac{1}{p_0 - is_p}\,S_1 + R(p_0),
\tag{67}
\]

Fig. 2. Graphs of g0 (steeper graph) and of the initial condition for g0 (59)

where $R(p_0)$ is analytic at $is_p$, and $S_m \neq 0$ (since $I - J_0$ is not invertible). Multiplying (67) by $I - J$ to the left, respectively to the right, and writing $J = J_0 + (p_0 - is_p)J_1(p_0)$ (where $J_1(p_0)$ is analytic at $is_p$) we get that

\[
\begin{aligned}
R_1(p_0) &= \frac{1}{(p_0 - is_p)^m}\,(I - J_0)\,S_m + O\!\left((p_0 - is_p)^{-m+1}\right),\\
R_2(p_0) &= \frac{1}{(p_0 - is_p)^m}\,S_m\,(I - J_0) + O\!\left((p_0 - is_p)^{-m+1}\right),
\end{aligned}
\]

where $R_{1,2}$ are analytic at $p_0 = is_p$. By the uniqueness of the series of the analytic functions (Banach space valued) $R_{1,2}$ we must then have

\[
(I - J_0)\,S_m = 0 = S_m\,(I - J_0).
\tag{68}
\]

The first equality in (68) implies $\mathrm{Ran}(S_m) \subset \mathrm{Ker}(I - J_0) = \{y_{\mathrm{Ker}}\}$ and since $S_m \neq 0$ then $\mathrm{Ran}(S_m) = \{y_{\mathrm{Ker}}\}$, therefore $S_m y = \langle y, u\rangle\,y_{\mathrm{Ker}}$ for some $u \in l^2 \setminus \{0\}$. The second equality in (68) means $u \in \mathrm{Ran}(I - J_0)^{\perp} = \mathrm{Ker}(I - J_0^*)$. By Lemma 19 then (up to a multiplicative constant) $u = A^{-1}y_{\mathrm{Ker}} = z_{\mathrm{Ker}}$, where $z_{\mathrm{Ker}}$ satisfies (46), hence (53), (54). The solution $y = (I - J)^{-1}f$ of (17) is then singular at $p_0 = is_p$ if $c = \langle f, z_{\mathrm{Ker}}\rangle \neq 0$. For the example of Sect. 6.3.4 this latter condition can be checked by explicit calculation of the truncations to 10 terms and estimation of the remainder based on the contractivity bounds in the previous section. The result is $c = -1.953 \pm 0.001$. Thus the inhomogeneous equation has poles.

Lemma 22. Let $Y(t)$ be analytic on $[0, \infty)$, with $\lim_{t\to\infty} Y(t) = 0$. Let $s \in \mathbb{R}$. Then

\[
\lim_{a \downarrow 0}\, a\int_0^\infty e^{-(a+is)t}\,Y(t)\,dt = 0.
\tag{69}
\]

Corollary 23. Let $Y(t)$ be as in Lemma 22. Let $y(p) = \int_0^\infty e^{-pt}\,Y(t)\,dt$. Assume that $y(p)$ is analytic on $i\mathbb{R}_+$, except for a set of isolated points. Then $y(p)$ does not have poles on $i\mathbb{R}_+$.
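The step from Lemma 22 to Corollary 23 is not written out in the text. One way to see it, given here as a sketch under the assumption that $y$ has a pole of order $m \ge 1$ at $p = is$ with leading Laurent coefficient $c_{-m} \neq 0$, is to approach the pole horizontally from the right half plane:

\[
a\,y(a + is) = a\int_0^\infty e^{-(a+is)t}\,Y(t)\,dt = \frac{c_{-m}}{a^{m-1}} + O\!\left(a^{-m+2}\right) \qquad (a \downarrow 0).
\]

For $m = 1$ the right-hand side tends to $c_{-1} \neq 0$, and for $m > 1$ it is unbounded; either case contradicts (69). As a simple contrast, the constant function $Y \equiv 1$, which does not tend to zero, has $y(p) = 1/p$ and $a\,y(a) = 1$ for all $a > 0$, so the hypothesis $\lim_{t\to\infty} Y(t) = 0$ in Lemma 22 cannot be dropped.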


Proof. I. We first show (69) for $s = 0$. Separating the positive and negative parts of $\Re Y(t)$, $\Im Y(t)$, write $Y(t) = Y^{[1]}(t) - Y^{[2]}(t) + iY^{[3]}(t) - iY^{[4]}(t)$ with $Y^{[k]}(t)$ nonnegative, continuous, nonanalytic only on a discrete set, where the left and right derivatives exist, with $\lim_{t\to\infty} Y^{[k]}(t) = 0$. It is enough to show (69) for each $Y^{[k]}$. Let then $Y$ be one of the $Y^{[k]}$'s. Denote $H(t) = \sup_{\tau \ge t} Y(\tau)$. The function $H$ on $[0, \infty)$ has the same properties as $Y$ and in addition is decreasing. Then $H'$ exists a.e. and $H' \in L^1[0, \infty)$, since $\int_0^\infty |H'(\tau)|\,d\tau = -\lim_{t\to\infty}\int_0^t H'(\tau)\,d\tau = \lim_{t\to\infty}\left(-H(t) + H(0)\right) = H(0)$. Then

\[
a\int_0^\infty e^{-at}\,Y(t)\,dt \le a\int_0^\infty e^{-at}\,H(t)\,dt = -\int_0^\infty \left(\frac{d}{dt}e^{-at}\right) H(t)\,dt = H(0) + \int_0^\infty e^{-at}\,H'(t)\,dt,
\]

therefore

\[
\lim_{a \downarrow 0}\, a\int_0^\infty e^{-at}\,Y(t)\,dt \le H(0) + \lim_{a \downarrow 0}\int_0^\infty e^{-at}\,H'(t)\,dt = 0,
\]

which proves the lemma in this case.

II. Let now $s \in \mathbb{R}$ be arbitrary. Then (69) follows from the result for $s = 0$ applied to the function $\tilde Y(t) = e^{-ist}\,Y(t)$. □

Proof of Theorem 2. In conclusion $Y(t)$ cannot tend to zero as $t \to \infty$ and complete ionization fails. □

Acknowledgements. The authors would like to thank A. Soffer and M. Weinstein for interesting discussions and suggestions. Work of O. C. was supported by NSF Grant 9704968, of R.D.C. by NSF Grant 0074924, and that of O. C., J. L. L. and A. R. by AFOSR Grant F49620-98-1-0207 and NSF Grant DMR-9813268.

Appendix A. Proof of Lemma 3

(i) Consider $L^1_{\mathrm{loc}}[0, A]$ endowed with the norm $\|F\|_\nu := \int_0^A |F(s)|\,e^{-\nu s}\,ds$, where $\nu > 0$. If $f$ is continuous and $F, G \in L^1_{\mathrm{loc}}[0, A]$, a straightforward calculation shows that

\[
\|fF\|_\nu < \|F\|_\nu \sup_{[0,A]} |f|,
\tag{A1}
\]

\[
\|F * G\|_\nu < \|F\|_\nu\,\|G\|_\nu,
\tag{A2}
\]

\[
\|F\|_\nu \to 0 \quad \text{as } \nu \to \infty,
\tag{A3}
\]

where the last relation follows from the Riemann–Lebesgue lemma. The integral equation (9) can be written as

\[
Y = \eta + JY \quad \text{where} \quad JF := \eta\,(2i + M) * F.
\tag{A4}
\]

Since $M$ is locally in $L^1$ and bounded for large $x$ it is clear that for large enough $B_2$, (9) is contractive if $\nu > B_2$, for any $A$.
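The role of the weighted norm in the contraction argument can be illustrated on a grid. The Python sketch below replaces the integrals by left Riemann sums on $s_k = kh$, for which the analogue of (A2) holds exactly (with non-strict inequality), and shows the weighted norm shrinking as $\nu$ grows, as in (A3). The sample functions, grid size and function names are assumptions chosen for the illustration; nothing here is specific to the kernel $M$ of the paper.

```python
import math


def weighted_norm(values, h, nu):
    """Riemann-sum version of ||F||_nu = int_0^A |F(s)| exp(-nu*s) ds on the grid s_k = k*h."""
    return h * sum(abs(v) * math.exp(-nu * k * h) for k, v in enumerate(values))


def convolve(f, g, h):
    """Riemann-sum version of (F*G)(t) = int_0^t F(s) G(t-s) ds on the same grid."""
    return [h * sum(f[j] * g[k - j] for j in range(k + 1)) for k in range(len(f))]


def sample(func, length, points):
    """Sample func on [0, length) at `points` equally spaced nodes; return (values, step)."""
    h = length / points
    return [func(k * h) for k in range(points)], h


f_vals, step = sample(lambda s: math.cos(3.0 * s), 2.0, 200)
g_vals, _ = sample(lambda s: 1.0 / math.sqrt(s + 0.1), 2.0, 200)
fg_vals = convolve(f_vals, g_vals, step)

for nu in (0.5, 5.0, 50.0):
    lhs = weighted_norm(fg_vals, step, nu)
    rhs = weighted_norm(f_vals, step, nu) * weighted_norm(g_vals, step, nu)
    print(nu, lhs <= rhs, weighted_norm(g_vals, step, nu))
```

Because $\|G\|_\nu$ can be made smaller than any prescribed constant by increasing $\nu$, an operator of the form $F \mapsto G * F$ becomes a contraction in this norm, which is the mechanism invoked for (A4).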


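(ii) This is an immediate consequence of Lemma 3 and of elementary properties of the Laplace transform.

(iii) We have in $\mathbb{H}$,

\[
\mathcal{L}M = \lim_{a \downarrow 0}\frac{2i}{\pi}\int_0^\infty dx\,e^{-px}\int_0^\infty \frac{u^2\,e^{-i(x - ia)(1 + u^2)}}{1 + u^2}\,du
\tag{A5}
\]

\[
\phantom{\mathcal{L}M} = \frac{i}{\pi}\int_{-\infty}^{\infty}\frac{u^2}{(1 + u^2)\left(p + i(1 + u^2)\right)}\,du.
\tag{A6}
\]

For $\Re(p) > 0$ we push the integration contour through the upper half plane. At the two poles in the upper half plane $u^2 + 1$ equals 0 and $ip$ respectively, so that

\[
\frac{i}{\pi}\int_{-\infty}^{\infty}\frac{u^2\,du}{(1 + u^2)\left(p + i(1 + u^2)\right)}
= \frac{i}{\pi}\left[\frac{(-1)}{(2i)(p)}\oint\frac{ds}{s} + \frac{u_0^2}{(ip)(2iu_0)}\oint\frac{ds}{s}\right]
= -\frac{i}{p} + \frac{u_0}{p},
\tag{A7}
\]

where $u_0$ is the root of $p + i(1 + u^2) = 0$ in the upper half plane. Thus

\[
\mathcal{L}M = -\frac{i}{p} + \frac{i\sqrt{1 - ip}}{p}
\tag{A8}
\]

with the branch satisfying $\sqrt{1 - ip} \to 1$ as $p \to 0$ in $\mathbb{H}$. Thus, the analytic continuation of $\sqrt{1 - ip}$ in $\mathbb{H} \cup \partial\mathbb{H}$ in our calculations is as follows:

Remark 24. As $p$ varies in $\mathbb{H}$, $1 - ip$ belongs to the lower half plane $-i\mathbb{H}$ so that $\sqrt{1 - ip}$ varies in the fourth quadrant, and in particular $\Im\sqrt{1 - ip} < 0$. If $p \in i\mathbb{R}$ and $-ip \ge -1$ then $\sqrt{1 - ip}$ is real and nonnegative, while if $-ip < -1$ then $\sqrt{1 - ip}$ has zero real part and negative imaginary part.

To show (15) note that for $\Re(p) > 0$, $\omega > 0$ we have

\[
\mathcal{L}\left(e^{\pm i\omega t}M\right) = -\frac{i}{p \mp i\omega} + \frac{i\sqrt{1 - ip \mp \omega}}{p \mp i\omega}
\qquad \left(\text{with } \sqrt{1 - ip - \omega} = -i\sqrt{\omega - 1 + ip} \ \text{ if } \omega > 1\right).
\tag{A9}
\]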
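The closed form (A8) can be compared numerically with the integral (A6). The sketch below is a sanity check under stated assumptions, not part of the proof: the substitution $u = \tan\varphi$ (chosen here) turns (A6) into $\frac{i}{\pi}\int_{-\pi/2}^{\pi/2}\frac{\sin^2\varphi}{p\cos^2\varphi + i}\,d\varphi$, a smooth periodic integrand for which the midpoint rule converges quickly, and `cmath.sqrt` is used as the branch of $\sqrt{1 - ip}$, which lies in the fourth quadrant for $\Re p > 0$ as required. The test points and function names are illustrative.

```python
import cmath
import math


def laplace_m_closed_form(p):
    """Right-hand side of (A8), with the principal square root (tends to 1 as p -> 0)."""
    return -1j / p + 1j * cmath.sqrt(1 - 1j * p) / p


def laplace_m_quadrature(p, points=400):
    """Integral (A6) after u = tan(phi), evaluated with the midpoint rule on (-pi/2, pi/2)."""
    h = math.pi / points
    total = 0j
    for k in range(points):
        phi = -math.pi / 2.0 + (k + 0.5) * h
        total += math.sin(phi) ** 2 / (p * math.cos(phi) ** 2 + 1j)
    return 1j / math.pi * total * h


for p in (1.0, 0.3 + 2.0j, 2.5 - 1.0j):
    print(p, abs(laplace_m_quadrature(p) - laplace_m_closed_form(p)))
```

Replacing $p$ by $p \mp i\omega$ in `laplace_m_closed_form` gives the expression in (A9), with the caveat that for $\omega > 1$ the branch must then be chosen as stated there rather than taken from `cmath.sqrt` blindly.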

The branch of the square root was discussed in Remark 24. This concludes the proof of Lemma 3 (iii).

Appendix B. Discussion of the Genericity Condition (g)

A thorough analysis of the properties of the shift operator is provided by the treatise [29]. We provide here an independent discussion, meant to give an insight into the interesting analytic properties involved in this condition. Let $C = (C_0, C_1, \ldots, C_n, \ldots) \in l^2(\mathbb{N})$ and let the operator $T$ be defined as before by $TC = (C_1, C_2, \ldots)$. We want to see for which such vectors the system of equations

\[
(z, T^jC) = 0, \qquad j = 0, 1, \ldots
\tag{B1}
\]

has nontrivial solutions $z$ in $l^2$. We can associate to $z$ and $C$ analytic functions in the unit disk, $Z(x)$ and $C(x)$, by

\[
C(x) = \sum_{k=0}^{\infty} C_k x^k, \qquad Z(x) = \sum_{k=0}^{\infty} z_k x^k.
\tag{B2}
\]

These functions extend to $L^2$ functions on the unit circle. The system of equations (B1) can be written as

\[
z_0\,C(x) + z_1 x^{-1}\left(C(x) - C(0)\right) + \cdots + z_n\left[x^{-n}C(x) - x^{-n}\sum_{k=0}^{n-1}\frac{x^k}{k!}\,C^{(k)}(0)\right] + \cdots = 0.
\tag{B3}
\]

Using Cauchy's formula, we can write the difference in square brackets in (B3) as

\[
\frac{1}{2\pi i}\oint_{|s|=1}\frac{C(s)}{s^n(s - x)}\,ds,
\tag{B4}
\]

and thus (B1) becomes

\[
\oint_{|s|=1}\frac{C(s)\,Z(1/s)}{s - x}\,ds = 0.
\tag{B5}
\]

The functions $C$ for which this equation has nontrivial solutions $Z$ relate to the Beurling inner functions [29] and are very "rare".

Examples. (i) Let $|\lambda| < 1$ and $C_n = \lambda^n$, i.e. $C(x) = (1 - \lambda x)^{-1}$. This is related to the function (11). Taking advantage of the analyticity of $Z$ outside the unit circle, we can push the contour of integration towards $s = \infty$, collecting the residue at $s = \lambda^{-1}$; we see that Eq. (B5) holds iff $Z(\lambda) = 0$, i.e., for a space of analytic functions of codimension one.

(ii) Let $C_n = 1/n$. Then $C(x) = \ln(1 - x)$, and by taking $s = 1/t$ in (B5) we get

\[
\frac{1}{x}\oint_{|t|=1}\frac{Z(t)\ln(t - 1)}{(t - x^{-1})\,t}\,dt - \frac{1}{x}\oint_{|t|=1}\frac{\ln(t)\,Z(t)}{t\,(t - x^{-1})}\,dt = 0.
\tag{B6}
\]

By making a cut on $[1, \infty)$ for the log we see that the integrand in the first integral is analytic in the unit circle and thus the integral vanishes. We decompose the second integral by partial fractions and we get

\[
\oint_{|t|=1}\frac{\ln(t)\,Z(t)}{t}\,dt - \oint_{|t|=1}\frac{\ln(t)\,Z(t)}{t - y}\,dt = 0,
\tag{B7}
\]

where $y = x^{-1}$. The first integral is a constant, $C$. By pushing the contour of integration inwards, we see that the second integral extends analytically for small $y \neq 0$. For such $y$ we thus have

\[
\oint_{|t|=1}\frac{\ln(t)\left(Z(t) - Z(y)\right)}{t - y}\,dt + Z(y)\oint_{|t|=1}\frac{\ln(t)}{t - y}\,dt = -C.
\tag{B8}
\]


Now the contour of integration can be pushed to the sides of the interval $[0, 1]$ collecting the difference between the branches of the log. We get

\[
\int_0^1\frac{Z(t) - Z(y)}{t - y}\,dt + Z(y)\int_0^1\frac{1}{t - y}\,dt = 0.
\tag{B9}
\]

Thus $\phi(y) + Z(y)\ln(-y) = C$ with $\phi$ and $Z$ analytic in the unit circle, thus $\ln(-y)$ is analytic unless $Z = 0$. This shows $C_n = 1/n$ is generic.

References

1. Simon, B.: Schrödinger Operators in the Twentieth Century. J. Math. Phys. 41, 3523 (2000)
2. Cycon, H.L., Froese, R.G., Kirsch, W. and Simon, B.: Schrödinger Operators. Berlin–Heidelberg–New York: Springer-Verlag, 1987
3. Pillet, C.-A.: Some results on the quantum dynamics of a particle in a Markovian potential. Commun. Math. Phys. 102, 237–254 (1985) and 105, 259 (1986)
4. Yajima, K.: Existence of Solutions for Schrödinger Evolution Equations. Commun. Math. Phys. 110, 415–426 (1987)
5. Fring, A., Kostrykin, V. and Schrader, R.: Ionization Probabilities through Ultra-Intense Fields in the Extreme Limit. J. Phys. A: Math. Gen. 30, 8599–8610 (1997)
6. Soffer, A. and Weinstein, M.I.: Nonautonomous Hamiltonians. J. Stat. Phys. 93, 359–391 (1998)
7. Soffer, A. and Costin, O.: Resonance Theory for Schrödinger Operators. Submitted to Commun. Math. Phys.
8. Landau, L.D. and Lifshitz, E.M.: Quantum Mechanics – Nonrelativistic Theory. 2nd ed. New York: Pergamon, 1965
9. Cohen-Tannoudji, C., Dupont-Roc, J. and Grynberg, G.: Atom-Photon Interactions. New York: Wiley, 1992
10. Fermi, E.: Notes on Quantum Mechanics. Chicago: The University of Chicago Press, 1961, p. 100
11. Koch, P.M. and van Leeuwen, K.A.H.: The Importance of Resonances in Microwave "Ionization" of Excited Hydrogen Atoms. Phys. Repts. 255, 289–403 (1995)
12. Casati, G., Chirikov, B. and Shepelyansky, D.L.: Relevance of Classical Chaos in Quantum Mechanics: The Hydrogen Atom in a Monochromatic Field. Phys. Repts. 154, 77–123 (1987)
13. Potvliege, R.M. and Shakeshaft, R.: Multiphoton Processes in an Intense Laser Field: Harmonic Generation and Total Ionization Rates for Atomic Hydrogen. Phys. Rev. A 40, 3061–3079 (1990)
14. Buchleitner, A., Delande, D. and Gay, J.-C.: Microwave Ionization of Three Dimensional Hydrogen Atoms in a Realistic Numerical Experiment. J. Opt. Soc. Am. B 12, 505–519 (1995)
15. Benenti, G., Casati, G., Maspero, G. and Shepelyansky, D.L.: Quantum Poincaré Recurrences for Hydrogen Atom in a Microwave Field. Preprint, physics/9911200
16. Schwendner, P., Seyl, F. and Schinke, R.: Photodissociation of Ar2+ in Strong Laser Fields. Chem. Phys. 217, 233–244 (1997)
17. Guerin, S. and Jauslin, H.-R.: Laser-Enhanced Tunneling through Resonant Intermediate Levels. Phys. Rev. A 55, 1262–1275 (1997)
18. Eberly, J.H. and Kulander, K.C.: Atomic-Stabilization by Super-Intense Lasers. Science 262, 1233 (1993)
19. Su, Q., Irving, B.P. and Eberly, J.H.: Ionization Modulation in Dynamic Stabilization. Laser Physics 7, 568 (1997)
20. Figueira de Morisson Faria, C., Fring, A. and Schrader, R.: Analytical Treatment of Stabilization. Preprint, physics/9808047, v2
21. Barash, D., Orel, A.E. and Vemuri, V.R.: A Genetic Search in Frequency Space for Stabilizing Atoms by High-Intensity Laser Fields. J. Comp. Info. Tech. CIT 8, 2, 103–113 (2000)
22. Rokhlenko, A. and Lebowitz, J.L.: Ionization of a Model Atom by Perturbation of the Potential. J. Math. Phys. 41, 3511–3523 (2000)
23. Costin, O., Lebowitz, J.L. and Rokhlenko, A.: Exact Results for the Ionization of a Model Quantum System. J. Phys. A 33, 6311–6319 (2000), physics/9905038, and work in preparation
24. Costin, O., Lebowitz, J.L. and Rokhlenko, A.: To appear in: Proceedings of the CRM meeting "Nonlinear Analysis and Renormalization Group", American Mathematical Society Publications (2000), math-ph/0002003
25. Demkov, Yu.N. and Ostrovskii, V.N.: Zero Range Potentials and Their Application in Atomic Physics. New York: Plenum, 1988; Albeverio, S., Gesztesy, F., Høegh-Krohn, R. and Holden, H.: Solvable Models in Quantum Mechanics. Berlin–Heidelberg–New York: Springer-Verlag, 1988
26. Susskind, R.M., Cowley, S.C. and Valeo, E.J.: Multiphoton Ionization in a Short Range Potential: A Nonperturbative Approach. Phys. Rev. A 42, 3090–3106 (1990)
27. Scharf, G., Sonnemoser, K. and Wreszinski, W.F.: Sensitive Multiphoton Ionization. Phys. Rev. A 44, 3250–3265 (1991)
28. LaGattuta, K.J.: Laser-Assisted Scattering from a One-Dimensional δ-function potential: An Exact Solution. Phys. Rev. A 49, No. 3, 1745–1751 (1994)
29. Nikol'skii, N.K.: Treatise on the Shift Operator. Berlin–Heidelberg–New York: Springer-Verlag, 1986
30. Reed, M. and Simon, B.: Methods of Modern Mathematical Physics, Vol. I. New York: Academic Press, 1979
31. Martin, P.A.: Scattering with Time Periodic Potentials and Cyclic States. Preprint 1999, Texas
32. Kovar, T. and Martin, P.A.: Scattering with a periodically kicked interaction and cyclic states. Preprint (1999)
33. Bellissard, J.: Stability and Instability in Quantum Mechanics. In: Trends and Developments in the Eighties, Albeverio, S. and Blanchard, Ph. eds., Singapore: World Scientific, 1985, pp. 1–106
34. Jauslin, H.R. and Lebowitz, J.L.: Spectral and Stability Aspects of Quantum Chaos. Chaos 1, 114–121 (1991)
35. Costin, O. and Costin, R.D.: Rigorous WKB for Discrete Schemes with Smooth Coefficients. SIAM J. Math. Anal. 27, no. 1, 110–134 (1996)

Communicated by B. Simon

Commun. Math. Phys. 221, 27 – 56 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

How to Prove Dynamical Localization Serguei Tcheremchantsev MAPMO-CNRS, Département des Mathématiques, Université d’Orléans, BP 6759, 45067 Orléans Cedex 2, France. E-mail: [email protected] Received: 16 November 2000 / Accepted: 14 February 2001

Abstract: Let $H$ be a self-adjoint operator on $l^2(\mathbb{Z}^d)$ or $L^2(\mathbb{R}^d)$ with pure point spectrum on some interval $I$. We establish general necessary and sufficient conditions for dynamical localization for a given vector and on the interval of energies $I$. The sufficient conditions we obtain improve the existing ones such as SULE or WULE and can be useful in applications.

1. Introduction

Let $H$ be a self-adjoint operator on the Hilbert space $\mathcal{H}$ with pure point spectrum on some interval $I \subset \mathbb{R}$. We shall consider the case $\mathcal{H} = l^2(\mathbb{Z}^d)$ as well as $\mathcal{H} = L^2(\mathbb{R}^d)$. Consider the subspace $\mathcal{H}(I)$ of $\mathcal{H}$, $\mathcal{H}(I) = P_I(H)\mathcal{H}$, where $P_I(H)$ is the spectral projector of $H$ onto $I$. It is natural to say that the operator $H$ has dynamical localization on $I$ if for any $p > 0$ and well localised $\psi \in \mathcal{H}$,

\[
\sup_t r^p(t) \equiv \sup_t\,\langle \exp(-itH)P_I(H)\psi,\ |X|^p\exp(-itH)P_I(H)\psi\rangle < +\infty,
\tag{1.1}
\]

where $X$ is the usual position operator. The problem of dynamical localization has been intensively studied in recent years [1–8], especially in the case of random discrete and continuous Schrödinger operators (in particular, the Anderson model). The aim of the present paper is not to give a review of the results obtained for concrete models. We are rather interested in general mathematical methods which can be used to prove dynamical localization (1.1) for any self-adjoint operator $H$. We hope, however, that the obtained results (especially sufficient conditions for dynamical localization) will be useful in applications.

Let $\{e_k\}$ be any orthonormal set of eigenfunctions of $H$ complete in $\mathcal{H}(I)$. For any $k$ we have $He_k = \lambda_k e_k$ with $\lambda_k \in I$. Suppose that $\mathcal{H} = l^2(\mathbb{Z}^d)$ (in the last section of the paper we discuss also the continuous case). Let $\psi = \delta_m$ for some $m \in \mathbb{Z}^d$. Then

\[
\psi_I(t, n) \equiv \left(\exp(-itH)P_I(H)\psi\right)(n) = \sum_k \exp(-it\lambda_k)\,e_k(n)\,\overline{e_k(m)}
\]

and

\[
\sup_t |\psi_I(t, n)| \le W(n, m), \qquad \sup_t r^p(t) \le \sum_{n \in \mathbb{Z}^d} |n|^p\,W^2(n, m),
\tag{1.2}
\]

where

\[
W(n, m) = \sum_k |e_k(n)\,e_k(m)|.
\]
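The bounds (1.2) follow from the triangle inequality applied to the eigenfunction expansion, and can be seen at work on a small finite model. The Python sketch below is a toy illustration under assumptions made here: the operator is the nearest-neighbour hopping operator on a chain of `size` sites with Dirichlet ends (whose eigenvalues $2\cos\frac{\pi k}{N+1}$ and real eigenvectors are known in closed form), $I$ is the whole spectrum, and sites are numbered $1, \ldots, N$. Its eigenfunctions are extended, so $W(n, m)$ does not decay in this example; only the inequalities are being illustrated, not localization.

```python
import cmath
import math


def chain_eigenpairs(size):
    """Eigenvalues and orthonormal eigenvectors of the hopping operator on sites 1..size."""
    scale = math.sqrt(2.0 / (size + 1))
    eigenvalues = [2.0 * math.cos(math.pi * k / (size + 1)) for k in range(1, size + 1)]
    eigenvectors = [[scale * math.sin(math.pi * k * n / (size + 1)) for n in range(1, size + 1)]
                    for k in range(1, size + 1)]
    return eigenvalues, eigenvectors


def evolved_delta(t, n, m, eigenvalues, eigenvectors):
    """psi_I(t, n) for psi = delta_m, from the eigenfunction expansion."""
    return sum(cmath.exp(-1j * t * lam) * e[n - 1] * e[m - 1]
               for lam, e in zip(eigenvalues, eigenvectors))


def w_bound(n, m, eigenvectors):
    """W(n, m) = sum_k |e_k(n) e_k(m)|."""
    return sum(abs(e[n - 1] * e[m - 1]) for e in eigenvectors)


def check_bounds(size, m, p, times):
    """True if both inequalities in (1.2) hold at every sampled time."""
    eigenvalues, eigenvectors = chain_eigenpairs(size)
    sites = range(1, size + 1)
    w = {n: w_bound(n, m, eigenvectors) for n in sites}
    moment_bound = sum(n**p * w[n] ** 2 for n in sites)
    for t in times:
        psi = {n: evolved_delta(t, n, m, eigenvalues, eigenvectors) for n in sites}
        if any(abs(psi[n]) > w[n] + 1e-12 for n in sites):
            return False
        if sum(n**p * abs(psi[n]) ** 2 for n in sites) > moment_bound + 1e-12:
            return False
    return True


print(check_bounds(size=12, m=3, p=2.0, times=[0.0, 0.7, 5.0, 40.0]))
```

The point of the surrounding discussion is that for an infinite system the usefulness of (1.2) depends entirely on how fast $W(n, m)$ decays in $n$, which the finite toy model cannot show.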

To prove dynamical localization (1.1) for any ψ = δm , it is sufficient to show that the function W (n, m) is fast decaying as |n| → ∞ for all m ∈ Zd . What one usually proves for “concrete” Schrödinger operators is the so-called exponential (or mathematical) localization on the interval I . Namely, for some α > 0 and any eigenfunction ek ∈ H(I ), |ek (n)| ≤ C(k) exp(−α|n|),

(1.3)

where $C(k) < +\infty$. If the sum is finite, it is evident that $W(n, m)$ is exponentially decaying in $n$ for any $m$ and (1.1) holds. However, typically the sum has infinitely many terms. In this case the bounds (1.3) give nothing about the behaviour of the sum $W(n, m)$. The well known example of [5] shows that it is quite possible that (1.3) holds, but there is no dynamical localization for $\psi = \delta_0$ and $r^2(t_n)/t_n^{2-\delta} \to +\infty$ for any $\delta > 0$ for some sequence $t_n \to \infty$. It is clear that to prove dynamical localization, one should control the decay of $|e_k(n)e_k(m)|$ both in $n$ and in $k$. Or, equivalently, one should control the constants $C(k)$ in (1.3). Two approaches have been developed to solve this problem. The first is to estimate rather directly $|e_k(n)e_k(m)|$ and to prove that the sum $W(n, m)$ is fast decaying as $|n| \to \infty$ [7, 8]. For example, one shows that

\[
|e_k(n)\,e_k(m)| \le C(m)\,a_k\exp(-\alpha|n - m|),
\tag{1.4}
\]

where $\sum_k a_k = D < +\infty$. Clearly, (1.4) yields

\[
W(n, m) \le C(m)\,D\exp(-\alpha|n - m|)
\]

and (1.1) holds. In particular, the condition called WULE was considered [7]. This is (1.4), where $a_k = \|Be_k\|^2$ and $B$ is the operator of multiplication by $(1 + |n|^2)^{-\delta/2}$ with $\delta > d/2$. Obviously, as $\sum_k |e_k(n)|^2 \le 1$ for any $n$,

\[
\sum_k a_k \le \sum_n (|n|^2 + 1)^{-\delta} < +\infty.
\]
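The summability of the WULE weights $a_k = \|Be_k\|^2$ rests only on orthonormality, which can be checked on a finite example. The sketch below is an illustration with choices made here: $d = 1$, sites $n = 0, \ldots, N-1$, $\delta = 0.75 > d/2$, and the orthonormal cosine (DCT-II) basis of $\mathbb{R}^N$ standing in for the eigenfunctions $e_k$. Because this basis is complete, $\sum_k |e_k(n)|^2 = 1$ and the inequality becomes an equality.

```python
import math


def cosine_basis(size):
    """Orthonormal DCT-II basis of R^size; row k is the vector e_k."""
    basis = []
    for k in range(size):
        norm = math.sqrt((1.0 if k == 0 else 2.0) / size)
        basis.append([norm * math.cos(math.pi * (n + 0.5) * k / size) for n in range(size)])
    return basis


def wule_weights(basis, delta):
    """a_k = ||B e_k||^2, where (B f)(n) = (1 + n^2)^(-delta/2) f(n)."""
    return [sum((1.0 + n * n) ** (-delta) * e[n] ** 2 for n in range(len(e))) for e in basis]


size, delta = 40, 0.75
basis = cosine_basis(size)
weights = wule_weights(basis, delta)
weight_sum = sum(weights)
lattice_sum = sum((1.0 + n * n) ** (-delta) for n in range(size))
print(weight_sum, lattice_sum)
```

On the infinite lattice the right-hand sum converges precisely when $2\delta > d$, which is where the condition $\delta > d/2$ enters.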

Another possibility is to have some control of constants C(k) in (1.3). Namely, the following condition called SULE was introduced in [5]. The operator H has SULE on I , if H has a complete set {ek } in H(I ) of orthonormal eigenfunctions and there exist α > 0 and nk ∈ Zd such that for any δ > 0, |ek (n)| ≤ C(δ) exp(δ|nk | − α|n − nk |)

(1.5)

with some finite constants C(δ) uniform in k, n. One shows that if (1.5) holds, then ek (m) are fast decaying in k for all m. So, it is easy to see that the function W (n, m) is exponentially decaying in n for all m and (1.1) holds (in fact, one controls also the behaviour in m, so one proves (1.1) for any exponentially decaying ψ). This result


was extended to the continuous case in [6] and applied in [6, 4] to prove dynamical localization for some concrete models. Similar ideas are used in [3] to prove strong dynamical localization.

In the present paper we propose a new approach which considerably improves the existing methods. Surprisingly, one can show that if the function $|e_k(n)e_k(m)|$ decays sufficiently fast in $n$ uniformly in $k$, then one need not take special care of the decay in $k$ necessary for the convergence of the sum $W(n, m)$. The key point is the following. Due to the fact that the system $\{e_k\}$ is orthonormal in $\mathcal{H}$, the decay in $n$ and the decay in $k$ of $|e_k(n)e_k(m)|$ are related. Therefore, one can "sacrifice" some decay in $n$ to obtain a decay in $k$ sufficient for the convergence of the sum. One can even allow some growth in $k$ in the bounds for $|e_k(n)e_k(m)|$ provided the decay in $n$ is fast enough (see Theorem 5.3 and Theorem 5.4). One should say that there is a deep relation between our approach and that of [5].

The main technical ingredient in our consideration is the following result (Theorem 2.2 and Theorem 6.1). Let $\{e_k\}$ be any orthonormal system in $\mathcal{H}$. For any $p > 0$ define the positive numbers

\[
d_k(p) = \sum_n |e_k(n)|^2(|n| + 1)^p, \qquad \eta_k(p) = \sup_n\left(|e_k(n)|^2\exp(p|n|)\right)
\]

(where it is possible that $d_k(p) = +\infty$, $\eta_k(p) = +\infty$). For any $p > 0$, one can reorder $d_k(p)$, $\eta_k(p)$ so that

\[
d_k(p) \ge D(p, d)\,k^{p/d}, \qquad \eta_k(p) \ge C(p, d)\exp(\beta k^{1/d})
\tag{1.6}
\]

with some positive universal constants $D(p, d)$, $C(p, d)$, $\beta(p, d)$. Considering the SULE condition, one can easily verify the fact that $|n_k| \ge Ck^{1/d}$ (after reordering) implies $d_k(p) \ge Dk^{p/d}$ and $\eta_k(p) \ge C\exp(\beta k^{1/d})$ for any $p > 0$. So, the growth of $|n_k|$ for the systems $\{e_k\}$ with SULE, which plays a key role in the proof of [5], can be considered as a manifestation of a more general result (1.6) valid for any orthonormal system. When proving (1.6), we don't need any assumptions about the form of eigenfunctions $e_k$ or the notion of "center of localization" $n_k$.

The paper is organised as follows. In Sect. 2 we prove our main technical result about the growth of the moments $d_k(p)$. In Sect. 3 we establish necessary conditions for dynamical localization for a given vector $\psi$. In particular, we show (Theorem 3.4) that if

\[
\sup_t |X|_\psi^p(t) \equiv \sup_t\,\langle \exp(-itH)\psi,\ |X|^p\exp(-itH)\psi\rangle < +\infty
\tag{1.7}
\]

for some $p > 0$, then the coefficients of the spectral measure of $\psi$ decay sufficiently fast. Namely, if $\mu_\psi = \sum_k a_k\delta_{\lambda_k}$, then

\[
\sum_k a_k^{\frac{1}{1+\beta}} < +\infty
\]

for any $0 < \beta < p/d$. In particular, if (1.7) holds for any $p > 0$, then $a_k$ are fast decaying:

\[
\sum_k a_k^\nu < +\infty \quad \forall \nu > 0.
\]

In Sect. 4 we give some sufficient conditions for dynamical localization for a given vector ψ and p > 0 (Theorem 4.2 is the main result of the section). As a result, we obtain


(Corollary 4.3) general necessary and sufficient conditions for dynamical localization for a given vector $\psi$ for all $p > 0$. Namely, let $\psi \in \mathcal{H}_{pp}$. One can always represent it as
$$ \psi = \sum_k \psi_k, \qquad \psi_k \in \mathcal{H}_{\lambda_k}, $$
where $\lambda_k \ne \lambda_s$ for $k \ne s$ and $\mathcal{H}_\lambda$ is an eigenspace of $H$ corresponding to the eigenvalue $\lambda$. Consider the function
$$ R_\psi(n) = \sup_k |\psi_k(n)|. $$
Then (1.7) holds for any $p > 0$ iff $R_\psi(n)$ is fast decaying. We show also that the dynamical localization
$$ \sup_t |X|^p_\psi(t) < +\infty \quad \forall p > 0 $$
is equivalent to the dynamical localization for Cesaro averages:
$$ \sup_T \frac{1}{T} \int_0^T |X|^p_\psi(t)\, dt < +\infty \quad \forall p > 0. $$

The results of Sects. 3 and 4 are well adapted to the case of power law or subexponential decay of eigenfunctions of $H$. In Sect. 5 we discuss the problem of dynamical localization on the interval of energies $I$ (in the sense (1.1)). Theorems 5.3 and 5.4 give sufficient conditions that (1.1) hold for any finite $\psi$ and any fast decaying $\psi$ respectively. We give also some bound (Theorem 5.5) which can be used to prove the strong dynamical localization on $I$ considered in [3, 8] (in the case when there is a family of operators $H$ depending on some parameter). In Sect. 6 we consider exponential dynamical localization:
$$ \sup_t |(\exp(-itH)\psi)(t, n)| \le C \exp(-\gamma |n|), \quad \gamma > 0, \tag{1.8} $$

(denoted as EDL($\psi$)) which is a special case of dynamical localization. In particular, we show that if (1.8) holds for some vector $\psi$, then the coefficients of the spectral measure of $\psi$ decay (after reordering) as follows: $a_k \le C \exp(-\beta k^{1/d})$ with some $\beta > 0$. We give also (Theorem 6.5) some sufficient conditions for exponential localization on the interval of energies $I$. In particular, if
$$ |e_k(n)e_k(m)| \le C \exp(-\alpha|n - m| + \beta|m|) \tag{1.9} $$
for some $\alpha > 0$, $\beta > 0$, then EDL($P_I(H)\psi$) holds for any exponentially decaying $\psi$. The condition (1.9) is similar to (1.4), but is much easier to prove in concrete cases because one does not need the decreasing factors $a_k$. In particular, the SULE condition (1.5) implies immediately (1.9). In Sect. 7 we consider the continuous case $\mathcal{H} = L^2(\mathbb{R}^d)$. We show that most of the results proved in the discrete case remain true under some additional assumptions about eigenfunctions of $H$. In particular, it is the case if $H = -\Delta + V(x)$ with $V(x)$ bounded from below and $I = (-\infty, K]$, $K \in \mathbb{R}$. One should stress that practically all results of the paper hold regardless of the multiplicity of the spectrum of $H$.


2. Lower Bounds for the Moments of Eigenfunctions

Let $\{a_{kn}\}$ be a double sequence of nonnegative numbers labelled by $k \in \mathbb{N}$, $n \in \mathbb{Z}^d$. We shall suppose that there exist two positive constants $A$, $B$ such that
$$ \forall n \in \mathbb{Z}^d, \quad \sum_{k=1}^{\infty} a_{kn} \le A < +\infty, \tag{2.1} $$
$$ \forall k \in \mathbb{N}, \quad \sum_{n \in \mathbb{Z}^d} a_{kn} \ge B > 0. \tag{2.2} $$
For $p > 0$ define the numbers
$$ d_k(p) = \sum_{n \in \mathbb{Z}^d} a_{kn} (|n| + 1)^p $$
with the understanding that $d_k(p)$ may be equal to $+\infty$. One can also remark that $d_k(p) \ge B$ for any $k$, $p$.

Lemma 2.1. Let $p > 0$. There exists a positive constant $D(A, B, p, d)$ such that for any $\{a_{kn}\}$ satisfying (2.1), (2.2), one can reorder $d_k(p)$ so that
$$ d_k(p) \ge D k^{p/d}. \tag{2.3} $$

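Proof. For any $N > 0$ consider the following set in $\mathbb{N}$:
$$ J(N) = \Big\{ k \in \mathbb{N} \ \Big|\ \sum_{n:\,|n| \ge N} a_{kn} \le B/2 \Big\} $$
and the sum
$$ S(N) = \sum_{k \in J(N)} \sum_{|n| < N} a_{kn}. $$
[...] $> 0$ such that $L = \frac{B}{2}(N + 1)^p$. It follows from the definition of the set $I(N, p)$ and (2.7) that
$$ \mathrm{Card}(\{k \in \mathbb{N} \mid d_k(p) \le L\}) \le C(d)\, \frac{A}{B} \left( \frac{2L}{B} \right)^{d/p}. \tag{2.8} $$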

The bound (2.8) implies, in particular, that the set $\{k \in \mathbb{N} \mid d_k(p) \le L\}$ is finite for any $L > 0$. Reordering $d_k(p)$ in such a way that they increase with $k$, we obtain the result of the lemma.

As a direct application of this lemma, we obtain the following important result.

Theorem 2.2. Let $\{e_k\}$, $k \in \mathbb{N}$, be any orthonormal system in $l^2(\mathbb{Z}^d)$ (not necessarily complete). For any $p > 0$ define the moments
$$ d_k(p) = \sum_{n \in \mathbb{Z}^d} |e_k(n)|^2 (|n| + 1)^p. $$
One can reorder $d_k(p)$ so that
$$ d_k(p) \ge D(p, d)\, k^{p/d}, $$
where the positive constants $D$ are the same for any system $\{e_k\}$.

Proof. Let $a_{kn} = |e_k(n)|^2$. Since the system is orthonormal,
$$ \forall n \in \mathbb{Z}^d, \quad \sum_{k=1}^{\infty} a_{kn} \le 1, \tag{2.9} $$
$$ \forall k \in \mathbb{N}, \quad \sum_{n \in \mathbb{Z}^d} a_{kn} = 1. $$
(One has the equality in (2.9) if the system is complete.) The result of the theorem follows directly from Lemma 2.1, where $A = B = 1$.
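The growth rate $k^{p/d}$ can be observed numerically on the simplest orthonormal system, the canonical basis $e_k = \delta_{n_k}$ of $l^2(\mathbb{Z}^d)$, for which $d_k(p) = (|n_k| + 1)^p$. The following Python sketch is only an illustration: the function names are hypothetical, and $|n|$ is taken to be the max-norm, which is an assumption (the choice of norm only affects the constants).

```python
import itertools


def canonical_moments(d, radius, p):
    """Sorted moments d_k(p) = (|n_k| + 1)^p of the canonical basis
    e_k = delta_{n_k}, for all lattice points n in Z^d with |n| <= radius.
    |n| is taken to be the max-norm here (an assumption of this sketch)."""
    points = itertools.product(range(-radius, radius + 1), repeat=d)
    return sorted((max(abs(c) for c in n) + 1) ** p for n in points)


def ratio_range(d, radius, p):
    """Smallest and largest value of d_k(p) / k^(p/d) after reordering."""
    moments = canonical_moments(d, radius, p)
    ratios = [m / k ** (p / d) for k, m in enumerate(moments, start=1)]
    return min(ratios), max(ratios)


lo, hi = ratio_range(d=2, radius=10, p=3.0)
print(lo, hi)  # both stay between 2**-3 and 2**3 for this basis
```

With the max-norm, the points with $|n| \le r$ fill a box of $(2r+1)^d$ sites, so the sorted list is exactly the beginning of the reordered sequence and the ratio $d_k(p)/k^{p/d}$ stays between $2^{-p}$ and $2^p$.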


Remark 2.3. The result of the theorem is optimal since there exist orthonormal systems such that $C_1 k^{p/d} \le d_k(p) \le C_2 k^{p/d}$. The simplest example is the canonical basis in $l^2(\mathbb{Z}^d)$ or, more generally, complete systems with SULE [5], where the functions $e_k(n)$ are well localised and fast decaying at infinity. One can observe that if the system is not complete, then $d_k(p)$ can grow arbitrarily fast (taking, for example, $e_k = \delta_{m(k)}$ with fast growing $|m(k)|$). Even if the system is complete but the functions $e_k$ decay not too fast at infinity, it is possible that $d_k(p)$ are fast growing (in particular, $d_k(p) = +\infty$ for any $k$). An interesting problem: is it possible that $d_k(p)$ grow faster than $k^{p/d}$ for some complete systems where all the functions $e_k(n)$ decay fast (for example, exponentially) as $|n| \to +\infty$?

3. Localization for a Given Vector: Necessary Conditions

Let $H$ be a self-adjoint operator in $\mathcal{H} = l^2(\mathbb{Z}^d)$, $\psi \in \mathcal{H}$, $\|\psi\| = 1$, $\psi(t) = \exp(-itH)\psi$. For any $p > 0$, $t \in \mathbb{R}$, define the moments of the position operator
$$ |X|^p_\psi(t) = \sum_{n \in \mathbb{Z}^d} |\psi(t, n)|^2 (|n| + 1)^p. $$
We prefer to take $(|n| + 1)^p$ rather than $|n|^p$ to avoid some technical problems in the proofs.

Definition 3.1. One has dynamical localization for $\psi$ and the moment of order $p$ if
$$ \sup_t |X|^p_\psi(t) < +\infty. $$
We shall denote it as DL($\psi$, $p$). One has Cesaro dynamical localization CDL($\psi$, $p$) if
$$ \sup_T \langle |X|^p_\psi \rangle(T) \equiv \sup_T \frac{1}{T} \int_0^T |X|^p_\psi(t)\, dt < +\infty. $$
Clearly, DL($\psi$, $p$) $\Rightarrow$ CDL($\psi$, $p$).

Definition 3.2. One has dynamical localization (Cesaro dynamical localization) for $\psi$ if DL($\psi$, $p$) (respectively, CDL($\psi$, $p$)) holds for any $p > 0$. We shall write DL($\psi$) and CDL($\psi$) respectively. Again, DL($\psi$) $\Rightarrow$ CDL($\psi$).

Let $\mathcal{H}_c$ be the subspace of continuous spectrum of $H$ and $P_c$ be the orthogonal projection on it. It is well known that if $P_c \psi \ne 0$, then $|X|^p_\psi(t) \to +\infty$ as $t \to \infty$ for any $p > 0$. So, dynamical localization is possible only if $\psi \in \mathcal{H}_{pp}$, the subspace of pure point spectrum of $H$. We shall denote by $\lambda$ the eigenvalues of $H$ and by $\mathcal{H}_\lambda$ the corresponding eigenspaces: $\mathcal{H}_\lambda = \{\varphi \mid H\varphi = \lambda\varphi\}$.


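Clearly, the subspaces $\mathcal{H}_\lambda$ and $\mathcal{H}_\mu$ are mutually orthogonal for $\lambda \ne \mu$ and $\mathcal{H}_{pp} = \oplus_\lambda \mathcal{H}_\lambda$. We shall denote by $P_\lambda$ the orthogonal projection on $\mathcal{H}_\lambda$. For any $\psi \in \mathcal{H}_{pp}$ consider the set (at most countable) $I(\psi) = \{\lambda \mid \psi_\lambda \equiv P_\lambda \psi \ne 0\}$. Then $\psi$ can be written as
$$ \psi = \sum_{k=1}^{+\infty} \psi_k, \quad \psi_k \ne 0,\ \psi_k \in \mathcal{H}_{\lambda_k},\ \lambda_k \in I(\psi), $$
where $\lambda_k \ne \lambda_s$ for $k \ne s$. (It is possible that the sum is finite; in this case the problem of dynamical localization for the vector $\psi$ is rather trivial.) For any $k$ define $e_k = \|\psi_k\|^{-1} \psi_k$. As $\mathcal{H}_{\lambda_k}$ and $\mathcal{H}_{\lambda_s}$ are mutually orthogonal for $k \ne s$, the system $M(\psi) \equiv \{e_k\}$ is orthonormal in $\mathcal{H}_{pp}$. Finally, any $\psi \in \mathcal{H}_{pp}$ can be written as
$$ \psi = \sum_k \gamma_k e_k, $$
where $\gamma_k = \langle \psi, e_k \rangle$, $H e_k = \lambda_k e_k$ and $M(\psi) = \{e_k\}$ is some orthonormal system of eigenfunctions of $H$ depending on $\psi$. It is obvious that
$$ \psi(t) = \sum_k \exp(-it\lambda_k) \gamma_k e_k. \tag{3.1} $$
Let $d_k(p)$ be the moments of the functions $e_k$:
$$ d_k(p) = \sum_{n \in \mathbb{Z}^d} |e_k(n)|^2 (|n| + 1)^p. $$
One can also note that the spectral measure of $\psi$ is equal to
$$ \mu_\psi = \sum_k a_k \delta_{\lambda_k}, \tag{3.2} $$
where $a_k = |\gamma_k|^2 > 0$, $\sum_k a_k = \|\psi\|^2 = 1$. The first result we shall prove is a necessary condition for dynamical localization.

Theorem 3.3.
1. For any $p > 0$,
$$ \liminf_{T \to \infty} \langle |X|^p_\psi \rangle(T) \ge \sum_{k=1}^{\infty} a_k d_k(p) $$
(with the convention that if $d_k(p) = +\infty$ for some $k$, then $\sum_k = +\infty$).
2. DL($\psi$, $p$) $\Rightarrow$ CDL($\psi$, $p$) $\Rightarrow$ $d_k(p) < +\infty$ for any $k$ and $\sum_k a_k d_k(p) < +\infty$.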
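To make the first statement of Theorem 3.3 concrete, one can look at a finite-dimensional toy model where the bound becomes an equality in the limit. The sketch below is an illustration under stated assumptions, not part of the proof: it uses the sine eigenbasis of the nearest-neighbour hopping matrix on a chain of $N$ sites (simple spectrum), takes $d = 1$ with sites $n = 1, \dots, N$, assumes a real initial vector, and evaluates the Cesaro average in closed form through the coefficients $b_{km}(T) = (\exp(-iT(\lambda_k - \lambda_m)) - 1)/(-iT(\lambda_k - \lambda_m))$ for $k \ne m$ and $b_{kk}(T) = 1$. All function names are hypothetical.

```python
import cmath
import math


def sine_basis(N):
    """Orthonormal vectors e_k(n) = sqrt(2/(N+1)) sin(pi k n/(N+1)), n = 1..N,
    with pairwise distinct eigenvalues lambda_k = 2 cos(pi k/(N+1))."""
    c = math.sqrt(2.0 / (N + 1))
    vecs = [[c * math.sin(math.pi * k * n / (N + 1)) for n in range(1, N + 1)]
            for k in range(1, N + 1)]
    lams = [2.0 * math.cos(math.pi * k / (N + 1)) for k in range(1, N + 1)]
    return vecs, lams


def cesaro_moment(psi, p, T):
    """(1/T) int_0^T sum_n |psi(t, n)|^2 (|n| + 1)^p dt for a real vector psi,
    computed in closed form through the coefficients b_km(T)."""
    N = len(psi)
    vecs, lams = sine_basis(N)
    gam = [sum(x * e for x, e in zip(psi, ek)) for ek in vecs]
    weight = [(n + 1) ** p for n in range(1, N + 1)]
    total = 0.0
    for k in range(N):
        for m in range(N):
            d_km = sum(u * v * w for u, v, w in zip(vecs[k], vecs[m], weight))
            if k == m:
                b_km = 1.0
            else:
                x = -1j * T * (lams[k] - lams[m])
                b_km = (cmath.exp(x) - 1.0) / x
            total += (b_km * gam[k] * gam[m] * d_km).real
    return total


def limit_moment(psi, p):
    """The lower bound of Theorem 3.3: sum_k a_k d_k(p)."""
    vecs, _ = sine_basis(len(psi))
    weight = [(n + 1) ** p for n in range(1, len(psi) + 1)]
    total = 0.0
    for ek in vecs:
        a_k = sum(x * e for x, e in zip(psi, ek)) ** 2
        d_k = sum(e * e * w for e, w in zip(ek, weight))
        total += a_k * d_k
    return total
```

For $\psi = \delta_1$ on six sites and $p = 2$, `cesaro_moment(psi, 2.0, T)` starts near the initial moment $4$ for small `T` and approaches `limit_moment(psi, 2.0)` as `T` grows, since the off-diagonal coefficients $b_{km}(T)$ decay like $1/T$.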


Proof. It follows from (3.1) that
$$ |\psi(t, n)|^2 = \sum_{k,m=1}^{\infty} \exp(-it(\lambda_k - \lambda_m))\, \gamma_k \overline{\gamma_m}\, e_k(n) \overline{e_m(n)}. \tag{3.3} $$
The sum in (3.3) is absolutely converging for any $n$ because
$$ \sum_k |\gamma_k|^2 = 1, \qquad \sum_k |e_k(n)|^2 \le 1. $$
Therefore, for any $N > 0$, $T \in \mathbb{R}$,
$$ A(N, T) \equiv \frac{1}{T} \int_0^T dt \sum_{|n| \le N} |\psi(t, n)|^2 (|n| + 1)^p = \sum_{k,m=1}^{\infty} b_{km}(T)\, \gamma_k \overline{\gamma_m}\, d_{km}(p, N), \tag{3.4} $$
where $d_{km}(p, N) = \sum_{|n| \le N} e_k(n) \overline{e_m(n)} (|n| + 1)^p$, $b_{km}(T) = 1$ for $k = m$ and $b_{km}(T) = (\exp(-iT(\lambda_k - \lambda_m)) - 1)/(-iT(\lambda_k - \lambda_m))$ for $k \ne m$. As $A(N, T) \le \langle |X|^p_\psi \rangle(T)$, for any $N > 0$ we have the inequality
$$ \liminf_{T \to \infty} \langle |X|^p_\psi \rangle(T) \ge \liminf_{T \to \infty} A(N, T). \tag{3.5} $$
On the other hand, due to the dominated convergence theorem, one can take the limit in (3.4) for a fixed $N$ as $T \to \infty$:
$$ \lim_{T \to \infty} A(N, T) = \sum_k |\gamma_k|^2 \sum_{|n| \le N} |e_k(n)|^2 (|n| + 1)^p. \tag{3.6} $$
As $a_k = |\gamma_k|^2$, it follows from (3.5) and (3.6) that
$$ \liminf_{T \to \infty} \langle |X|^p_\psi \rangle(T) \ge \sum_k a_k \sum_{|n| \le N} |e_k(n)|^2 (|n| + 1)^p $$
for any $N > 0$. Taking the limit $N \to +\infty$, we obtain the first statement of the theorem. The second statement follows directly from the first.

As a corollary of Theorem 3.3, we shall prove a necessary condition for dynamical localization for the vector $\psi$ in terms of its spectral measure $\mu_\psi$ given by (3.2).

Theorem 3.4. The following statements hold:
1. For any $p > 0$,
$$ \mathrm{DL}(\psi, p) \Rightarrow \mathrm{CDL}(\psi, p) \Rightarrow \sum_k a_k^{\frac{1}{1+\beta}} < +\infty $$
for all $0 < \beta < p/d$.
2.
$$ \mathrm{DL}(\psi) \Rightarrow \mathrm{CDL}(\psi) \Rightarrow \sum_k a_k^{\nu} < +\infty $$
for all $\nu > 0$.


Proof. Suppose that CDL($\psi$, $p$) holds. Theorem 3.3 implies
$$ \sum_k a_k d_k(p) < +\infty. $$
One can apply Theorem 2.2 to the orthonormal system $\{e_k\}$ and reorder $d_k(p)$ so that $d_k(p) \ge D k^{p/d}$. Therefore, after reordering,
$$ \sum_k a_k k^{p/d} < +\infty. $$
Let $0 < \beta < p/d$, $r = 1 + \beta$, $r' = 1 + 1/\beta$. Applying the Hölder inequality, one can easily see that
$$ \sum_k a_k^{\frac{1}{1+\beta}} \le \Big( \sum_k a_k k^{p/d} \Big)^{\frac{1}{1+\beta}} \Big( \sum_k k^{-p/(\beta d)} \Big)^{\beta/(1+\beta)} < +\infty. $$
The fact that $\sum_k a_k^{\frac{1}{1+\beta}} < +\infty$ does not depend on the reordering of $\{a_k\}$. The second part of the theorem is obvious.
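The Hölder step can be checked numerically on a truncated sequence. In the sketch below the splitting $a_k^{1/(1+\beta)} = (a_k k^{p/d})^{1/(1+\beta)} \cdot k^{-p/(d(1+\beta))}$ is used with the exponents $r = 1 + \beta$ and $r' = 1 + 1/\beta$; the test sequence $a_k = k^{-5}$ and the function name are assumptions made only for the illustration, and `s` stands for $p/d$.

```python
def holder_sides(a, s, beta):
    """Both sides of the Hölder step for a finite sequence a = [a_1, ..., a_K],
    with s standing for p/d and 0 < beta < s."""
    K = len(a)
    lhs = sum(x ** (1.0 / (1.0 + beta)) for x in a)
    weighted = sum(x * k ** s for k, x in enumerate(a, start=1))
    tail = sum(k ** (-s / beta) for k in range(1, K + 1))
    rhs = weighted ** (1.0 / (1.0 + beta)) * tail ** (beta / (1.0 + beta))
    return lhs, rhs


a = [k ** -5.0 for k in range(1, 2001)]
lhs, rhs = holder_sides(a, s=2.0, beta=1.0)
print(lhs <= rhs)  # True: Hölder's inequality holds for every truncation
```

The second factor converges as $K \to \infty$ precisely because $p/(\beta d) > 1$, which is where the restriction $\beta < p/d$ enters.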

Corollary 3.5. If CDL($\psi$) holds, then the numbers $a_k$ are fast decaying: for any $s > 0$, one can reorder $\{a_k\}$ so that $a_k \le C(s) k^{-s}$.

One should stress that the statements of Theorem 3.4 and Corollary 3.5 are weaker than that of Theorem 3.3, because they do not depend on the system $\{e_k\}$, and inevitably, one loses some information about the moments $d_k(p)$. If $d_k(p)$ grow very fast as $k \to \infty$, it is even possible that $\sum_k a_k^{\nu} < +\infty$ for all $\nu > 0$, but $\sum_k a_k d_k(p) = +\infty$ for all $p > 0$. Finally, in this section we shall give necessary conditions for DL($\psi$, $p$) in terms of projections of $\psi$ on $\mathcal{H}_{\lambda_k}$.

Lemma 3.6. Let $M = \{e_k\}$ be any orthonormal system in $\mathcal{H}$ (in particular, the system $M(\psi)$), $\psi \in \mathcal{H}$. Define the following function:
$$ R_{\psi,M}(n) = \sup_k |\gamma_k e_k(n)|, \qquad \gamma_k = \langle \psi, e_k \rangle. $$
Then
$$ \sum_{n \in \mathbb{Z}^d} R^2_{\psi,M}(n) \le \|\psi\|^2. $$

Proof. As the system $\{e_k\}$ is orthonormal, $\sum_k |\gamma_k|^2 \le \|\psi\|^2$. Therefore,
$$ S = \sum_{k,n} |\gamma_k e_k(n)|^2 = \sum_k |\gamma_k|^2 \le \|\psi\|^2. \tag{3.7} $$
On the other hand,
$$ S = \sum_n \sum_k |\gamma_k e_k(n)|^2 \ge \sum_n \sup_k |\gamma_k e_k(n)|^2 = \sum_n R^2_{\psi,M}(n). \tag{3.8} $$
The result of the lemma follows from (3.7)–(3.8).
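The bound of Lemma 3.6 is easy to test on a finite orthonormal system. The sketch below is an illustration only; it assumes the sine basis $e_k(n) = \sqrt{2/(N+1)} \sin(\pi k n/(N+1))$ on $N$ sites and a real test vector, and the function name is hypothetical.

```python
import math


def r_squared_sum(psi):
    """sum_n R(n)^2 with R(n) = sup_k |gamma_k e_k(n)| for the orthonormal
    sine basis e_k(n) = sqrt(2/(N+1)) sin(pi k n/(N+1)) on N sites."""
    N = len(psi)
    c = math.sqrt(2.0 / (N + 1))
    basis = [[c * math.sin(math.pi * k * n / (N + 1)) for n in range(1, N + 1)]
             for k in range(1, N + 1)]
    gam = [sum(x * e for x, e in zip(psi, ek)) for ek in basis]
    return sum(max(abs(g * ek[i]) for g, ek in zip(gam, basis)) ** 2
               for i in range(N))


psi = [1.0, 2.0, 0.0, -1.0, 0.5]
norm_sq = sum(x * x for x in psi)
print(r_squared_sum(psi) <= norm_sq)  # True, as Lemma 3.6 predicts
```

The inequality is just the observation that a supremum over $k$ of nonnegative terms never exceeds their sum.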


The function $R_{\psi,M}(n)$ (especially its decay properties) will play an important role in the next part of the paper. Lemma 3.6 implies that $R_{\psi,M}$ always lies in $l^2(\mathbb{Z}^d)$. We shall see below that if DL($\psi$, $p$) holds, then $R_{\psi,M(\psi)}$ decays faster at infinity. On the other hand, in the next section we shall prove that a sufficiently fast decay of $R_{\psi,M}$ for some $M$ implies DL($\psi$, $p$).

Definition 3.7. We shall say that a function $f : \mathbb{Z}^d \to \mathbb{C}$ ($f : \mathbb{Z}^d \to \mathbb{R}$) is fast decaying if for any $s > 0$,
$$ \sup_n |f(n)| (|n| + 1)^s < +\infty. $$

Theorem 3.8. The following statements hold:
1. DL($\psi$, $p$) $\Rightarrow$ CDL($\psi$, $p$) $\Rightarrow$ $\sum_n R^2_{\psi,M(\psi)}(n) (|n| + 1)^p < +\infty$.
2. DL($\psi$) $\Rightarrow$ CDL($\psi$) $\Rightarrow$ $R_{\psi,M(\psi)}$ is fast decaying.

Proof. Suppose that CDL($\psi$, $p$) holds. Theorem 3.3 implies that $d_k(p) < +\infty$ for any $k$ and
$$ S(p) = \sum_k a_k d_k(p) \le C(p) < +\infty. $$
On the other hand, it follows from the definition of $d_k(p)$ that
$$ S(p) = \sum_n (|n| + 1)^p \sum_k |\gamma_k e_k(n)|^2 \ge \sum_n (|n| + 1)^p \sup_k |\gamma_k e_k(n)|^2 = \sum_n (|n| + 1)^p R^2_{\psi,M(\psi)}(n), $$
so we obtain the first statement of the theorem. The second statement follows directly from the first.

4. Localization for a Given Vector: Sufficient Conditions

In this section we shall give some sufficient conditions for DL($\psi$, $p$) and DL($\psi$). Let $M = \{e_k\}$ be any orthonormal system of eigenfunctions of $H$ and $\mathcal{H}_M$ the subspace of $\mathcal{H}_{pp}$ spanned by $M$. For any $\psi \in \mathcal{H}_M$, we have the identity
$$ \psi = \sum_k \gamma_k e_k, \qquad \gamma_k = \langle \psi, e_k \rangle. \tag{4.1} $$
We shall denote as $\lambda_k$ and $d_k(p)$ the eigenvalue and the moments of $e_k(n)$ respectively. In this section we shall consider any systems $M$ of eigenfunctions of $H$, not necessarily $M(\psi)$, so it is possible that $\lambda_k = \lambda_m$ for $k \ne m$ if the spectrum of $H$ is not simple. The decomposition
$$ \psi = \sum_k \psi_k $$
of the previous section is a special case of (4.1), when $M = M(\psi)$. The simplest sufficient condition for DL($\psi$, $p$) can be given in terms of $d_k(p)$ and $a_k = |\gamma_k|^2$.


Lemma 4.1. Let $p > 0$. The following statement holds:
$$ \sup_t |X|^p_\psi(t) \le \sum_n (|n| + 1)^p \sup_t |\psi(t, n)|^2 \le \Big( \sum_k \sqrt{a_k d_k(p)} \Big)^2. $$
So, if the last sum converges, DL($\psi$, $p$) holds.

Proof. Since
$$ \sup_t |\psi(t, n)| = \sup_t \Big| \sum_k \exp(-it\lambda_k) \gamma_k e_k(n) \Big| \le \sum_k |\gamma_k e_k(n)|, \tag{4.2} $$
the Cauchy–Schwarz inequality yields
$$ \sum_n (|n| + 1)^p \sup_t |\psi(t, n)|^2 \le \sum_{k,m} |\gamma_k \gamma_m| \sum_n |e_k(n) e_m(n)| (|n| + 1)^p \le \sum_{k,m} |\gamma_k \gamma_m| \sqrt{d_k(p) d_m(p)} = \Big( \sum_k \sqrt{a_k d_k(p)} \Big)^2. \tag{4.3} $$
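The chain of inequalities in Lemma 4.1 can be observed on a small example. The following sketch is an illustration under assumptions that are not in the text: the eigenbasis is the sine basis of the hopping matrix on $N$ sites with eigenvalues $\lambda_k = 2\cos(\pi k/(N+1))$, the vector $\psi$ is real, $d = 1$, and the supremum over $t$ is replaced by a maximum over a finite sample of times. The function name is hypothetical.

```python
import cmath
import math


def lemma_4_1_check(psi, p, times):
    """Return (max over sampled t of |X|^p_psi(t), (sum_k sqrt(a_k d_k(p)))^2)
    for the sine eigenbasis of the hopping matrix on N = len(psi) sites."""
    N = len(psi)
    c = math.sqrt(2.0 / (N + 1))
    sites = range(1, N + 1)
    basis = [[c * math.sin(math.pi * k * n / (N + 1)) for n in sites]
             for k in sites]
    lams = [2.0 * math.cos(math.pi * k / (N + 1)) for k in sites]
    gam = [sum(x * e for x, e in zip(psi, ek)) for ek in basis]
    weight = [(n + 1) ** p for n in sites]

    def moment(t):
        # |X|^p_psi(t) = sum_n |psi(t, n)|^2 (|n| + 1)^p, using (3.1)
        total = 0.0
        for i in range(N):
            amp = sum(cmath.exp(-1j * t * lam) * g * ek[i]
                      for lam, g, ek in zip(lams, gam, basis))
            total += abs(amp) ** 2 * weight[i]
        return total

    d = [sum(e * e * w for e, w in zip(ek, weight)) for ek in basis]
    bound = sum(abs(g) * math.sqrt(dk) for g, dk in zip(gam, d)) ** 2
    return max(moment(t) for t in times), bound
```

For example, `lemma_4_1_check([1.0, 0.0, 0.0, 0.0, 0.0, 0.0], 2.0, [0.1 * j for j in range(200)])` returns a pair whose first entry does not exceed the second. The bound is usually far from sharp, which is the motivation for the refined estimates that follow.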

If the functions $e_k$ and $e_m$ have only a small overlap for $k \ne m$, one can better estimate the sums
$$ d_{km}(p) = \sum_n |e_k(n) e_m(n)| (|n| + 1)^p. $$
If $d_{km}(p)$ decay fast when $|k - m| \to \infty$, the sum on the r.h.s. of (4.3) can be majorated by $C \sum_k a_k d_k(p)$ (or by something close to it). In this case one obtains a sufficient condition which is close to (or even identical with) the necessary condition given by Theorem 3.3. In particular, this is the case when $M$ is the canonical basis in $l^2(\mathbb{Z}^d)$. The sufficient condition of Lemma 4.1 is, however, difficult to apply in concrete cases, because one should control at the same time the growth of $d_k(p)$ and the decay of $a_k$. A more practical sufficient condition can be given in terms of the function $R_{\psi,M}(n)$, defined by
$$ R_{\psi,M}(n) = \sup_k |\gamma_k e_k(n)|, \qquad \gamma_k = \langle \psi, e_k \rangle, \qquad \sum_k |\gamma_k|^2 = \|\psi\|^2. $$
Later on in this section, $\psi$ and $M$ are fixed and we omit them in notations. To prove DL($\psi$, $p$), one shall use the trivial bound (4.2):
$$ |\psi(t, n)| \le \sum_k |\gamma_k e_k(n)|. $$
Clearly, any term in the sum is majorated by $R(n)$, and if the sum has a finite number of terms, the sufficiently fast decay of $R(n)$ implies DL($\psi$, $p$). However, typically it is not the case, and one needs some decay in $k$ so that the sum converges. The key point is the following: one can sacrifice some decay in $n$ to obtain the necessary decay in $k$. This is possible due to the growth of the moments $d_k(p)$ given by Theorem 2.2. Surprisingly,


one can even allow some growth in $k$ in the bounds for $\gamma_k e_k(n)$. Namely, suppose that for some $\alpha > 0$ the moments $d_k(\alpha)$ are finite for any $k$. Consider the function
$$ R(n, \alpha) = \sup_k \frac{|\gamma_k e_k(n)|}{d_k(\alpha)}. $$
For $\alpha = 0$ one has $d_k(0) = 1$ (so, $d_k(\alpha)$ are always finite) and $R(n, 0)$ coincides with the function $R(n)$ considered above. For any $s > 0$ define the moments
$$ L_\alpha(s) = \sum_n R^2(n, \alpha) (|n| + 1)^s, $$
where $L_\alpha(s)$ depend also on $\psi$, $M$ and it is possible that $L_\alpha(s) = +\infty$.

Theorem 4.2. Let $\psi \in \mathcal{H}_M$, $\alpha \ge 0$. Suppose that $d_k(\alpha) < +\infty$ for any $k$. The following statements hold:
1. Let $\delta \in (0, 1)$, $\varepsilon > 0$. Then
$$ \sup_t |\psi(t, n)| \le C(\varepsilon, \delta, d)\, L_\alpha^{\delta/2}\Big( \frac{2\alpha + d(2 - \delta)}{\delta} + \varepsilon \Big) R^{1-\delta}(n, \alpha), \tag{4.4} $$
where the constants $C(\varepsilon, \delta, d)$ are universal, i.e. do not depend on $H$, $M$ or $\psi$.
2. Let $p > 0$, $\varepsilon > 0$. There exist universal constants $C(\varepsilon, p, d)$ such that
$$ \sup_t |X|^p_\psi(t) \le \sum_n (|n| + 1)^p \sup_t |\psi(t, n)|^2 \le C(\varepsilon, p, d)\, L_\alpha(2\alpha + p + 2d + \varepsilon). \tag{4.5} $$
3. If $R(n, \alpha)$ is fast decaying in $n$, then so is $\sup_t |\psi(t, n)|$ and DL($\psi$) holds.

Proof. Let $r = \frac{2\alpha + d(2 - \delta)}{\delta} + \varepsilon$. If $L_\alpha(r) = +\infty$, the bound (4.4) is trivially true with any finite constant $C$. Suppose that $L_\alpha(r) < +\infty$. It follows from the definitions of $d_k(r)$ and $R(n, \alpha)$ that for any $r > 0$, $k \in \mathbb{N}$,
$$ a_k d_k(r) = \sum_n |\gamma_k e_k(n)|^2 (|n| + 1)^r \le d_k^2(\alpha) \sum_n R^2(n, \alpha) (|n| + 1)^r \equiv d_k^2(\alpha) L_\alpha(r). $$
As $L_\alpha(r) < +\infty$, $d_k(r) < +\infty$ for any $k$ such that $a_k = |\gamma_k|^2 \ne 0$. Therefore,
$$ |\gamma_k| \le \big( L_\alpha(r)\, d_k^2(\alpha) / d_k(r) \big)^{1/2}. \tag{4.6} $$
At the same time,
$$ |\gamma_k e_k(n)| \le d_k(\alpha) R(n, \alpha), \tag{4.7} $$
directly from the definition of $R(n, \alpha)$. Using the bounds (4.6)–(4.7), one can estimate:
$$ |\psi(t, n)| \le \sum_k |\gamma_k e_k(n)| \le \sum_k (d_k(\alpha) R(n, \alpha))^{1-\delta} |\gamma_k e_k(n)|^{\delta} \le R^{1-\delta}(n, \alpha)\, L_\alpha^{\delta/2}(r) \sum_k |e_k(n)|^{\delta} d_k(\alpha) d_k^{-\delta/2}(r), \tag{4.8} $$


where the summation is carried only over $k$ such that $a_k > 0$, so $d_k(r) < +\infty$. Let $s = 2/(2 - \delta)$, $s' = 2/\delta$. Applying the Hölder inequality, and using the fact that $\sum_k |e_k(n)|^2 \le 1$, one obtains:
$$ S \equiv \Big( \sum_k |e_k(n)|^{\delta} d_k(\alpha) d_k^{-\delta/2}(r) \Big)^{s} \le \sum_k d_k^{2/(2-\delta)}(\alpha)\, d_k^{-\delta/(2-\delta)}(r). \tag{4.9} $$
Let $w < q$. Applying the Hölder inequality with $s = q/w$, $s' = q/(q - w)$, one can estimate:
$$ d_k(w) = \sum_n |e_k(n)|^{2/s} (|n| + 1)^{w} |e_k(n)|^{2/s'} \le \Big( \sum_n |e_k(n)|^2 (|n| + 1)^{ws} \Big)^{1/s} \Big( \sum_n |e_k(n)|^2 \Big)^{1/s'} = (d_k(q))^{w/q}. \tag{4.10} $$
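The interpolation bound (4.10), $d_k(w) \le (d_k(q))^{w/q}$ for $w < q$, holds for any normalized function and is easy to test. The sketch below is an illustration with $d = 1$; the test function $e(n) \propto 1/(1 + n^2)$ on $|n| \le 50$ and the helper names are assumptions made only for this example.

```python
def moment(e, w):
    """d(w) = sum_n |e(n)|^2 (|n| + 1)^w for e given as a dict {n: e(n)}."""
    return sum(abs(v) ** 2 * (abs(n) + 1) ** w for n, v in e.items())


def normalized(e):
    """Rescale e so that sum_n |e(n)|^2 = 1."""
    norm = sum(abs(v) ** 2 for v in e.values()) ** 0.5
    return {n: v / norm for n, v in e.items()}


e = normalized({n: 1.0 / (1 + n * n) for n in range(-50, 51)})
w, q = 1.0, 3.0
print(moment(e, w) <= moment(e, q) ** (w / q))  # True by (4.10)
```

The normalization matters: the second Hölder factor in (4.10) equals $1$ only because $\sum_n |e_k(n)|^2 = 1$.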

The bound (4.10) with $w = \alpha$, $q = r$ and (4.9) imply
$$ S \le \sum_k d_k(r)^{(2\alpha/r - \delta)/(2 - \delta)}. \tag{4.11} $$
Now we shall use the result of Theorem 2.2. One can reorder $d_k(r)$ so that
$$ d_k(r) \ge D(r, d)\, k^{r/d}. \tag{4.12} $$
The choice of $r$ and (4.11)–(4.12) yield
$$ S \le C(\varepsilon, \delta, d) < +\infty \tag{4.13} $$
with some universal constants $C$. The first statement of the theorem follows from (4.8) and (4.13).

To prove the second statement, we shall use the inequality (4.6) with $r = 2\alpha + p + 2d + \varepsilon$ and the bound of Lemma 4.1. Again, if $L_\alpha(r) = +\infty$, there is nothing to prove. Suppose that $L_\alpha(r) < +\infty$. One obtains
$$ \sum_n (|n| + 1)^p \sup_t |\psi(t, n)|^2 \le L_\alpha(r) \Big( \sum_k \sqrt{\frac{d_k^2(\alpha)\, d_k(p)}{d_k(r)}} \Big)^2. \tag{4.14} $$
Applying twice the bound (4.10), we obtain
$$ \sum_n (|n| + 1)^p \sup_t |\psi(t, n)|^2 \le L_\alpha(r) \Big( \sum_k d_k(r)^{\frac{2\alpha + p - r}{2r}} \Big)^2. \tag{4.15} $$
Again, by Theorem 2.2 and the choice of $r$, the sum converges and the second statement of the theorem follows directly from (4.15). The third statement follows directly from the first and the second.

As a direct application of Theorem 4.2 we obtain the necessary and sufficient conditions for DL($\psi$).

Corollary 4.3. Let $\psi \in \mathcal{H}_{pp}$ and $R(n) = R(n, 0)$ be defined with the system $M(\psi)$ as in the previous section. Then CDL($\psi$) $\Leftrightarrow$ DL($\psi$) $\Leftrightarrow$ $R(n)$ is fast decaying.

The result follows from Theorem 3.8 and Theorem 4.2.


5. Localization on the Interval of Energies

In the previous parts of the paper the vector $\psi \in \mathcal{H}_{pp}$ was fixed and we did not suppose anything about decay of $\psi$ or about the set of $\lambda_k$ such that $\psi_k \ne 0$. In this section we shall consider the set of functions $\psi$ with some decaying properties at infinity and with the energies $\lambda_k$ from some interval $I \subset \mathbb{R}$ (bounded or not). First of all, if DL($\psi$) holds, then, in particular, for any $p > 0$,
$$ |X|^p_\psi(0) = \sum_n |\psi(n)|^2 (|n| + 1)^p = C(p) < +\infty, $$
so $\psi$ is fast decaying. Therefore, the largest set of $\psi$ for which one could prove DL($\psi$) is the set of fast decaying functions: $\mathcal{A} = \{\psi \in \mathcal{H} \mid \psi \text{ fast decaying}\}$. We shall also consider the set of finite functions $\mathcal{B} = \{\psi \in \mathcal{H} \mid \psi \text{ finite}\}$, which is a subset of $\mathcal{A}$. The set of $\psi$ exponentially decaying at infinity is intermediate between $\mathcal{A}$ and $\mathcal{B}$ and will be considered in the next section. Let $I \subset \mathbb{R}$ be some interval (bounded or not). We shall denote by $\mathcal{H}(I)$ the subspace of $\mathcal{H}_{pp}$ with the energies from $I$,
$$ \mathcal{H}(I) = \oplus_{\lambda \in I} \mathcal{H}_\lambda, $$
and by $P_I(H)$ the orthogonal projection on $\mathcal{H}(I)$.

Definition 5.1. The operator $H$ has an $\mathcal{A}$-dynamical localization on $I$ if for any $\psi \in \mathcal{A}$, we have DL($P_I(H)\psi$). $H$ has a $\mathcal{B}$-dynamical localization on $I$ if for any $\psi \in \mathcal{B}$ we have DL($P_I(H)\psi$).

Let $M = \{e_k\}$ be any orthonormal system of eigenfunctions of $H$ complete in $\mathcal{H}(I)$. One can obtain such systems choosing for all eigenvalues $\lambda \in I$ orthonormal systems $M_\lambda$ complete in $\mathcal{H}_\lambda$ and then taking $M = \cup_{\lambda \in I} M_\lambda$. Clearly, $M$ is unique if the spectrum of $H$ is simple on $I$. For any $\varphi \in \mathcal{H}(I)$ the identity holds:
$$ \varphi = \sum_k \gamma_k e_k, \qquad \gamma_k = \langle \varphi, e_k \rangle. $$
Suppose that for some $\alpha \ge 0$,
$$ \sum_n |g(n)|^2 (|n| + 1)^{\alpha} < +\infty $$
for any eigenfunction $g$ of $H$ from $\mathcal{H}(I)$ (if $\alpha = 0$, this is always true). Define as in the previous section
$$ R_{\varphi,M}(n, \alpha) = \sup_k \frac{|\gamma_k e_k(n)|}{d_k(\alpha)}. $$


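Define also three functions from $\mathbb{Z}^{2d}$ to $\mathbb{R}_+$:
$$ F_\alpha(n, m) = \sup_k \frac{|e_k(n) e_k(m)|}{d_k(\alpha)}, \qquad G_\alpha(n, m) = \sup_{g \in L} \frac{|g(n) g(m)|}{d_g(\alpha)}, $$
where
$$ L = \{g \in \mathcal{H} \mid Hg = \lambda g,\ \lambda \in I,\ \|g\| = 1\}, \qquad d_g(\alpha) = \sum_n |g(n)|^2 (|n| + 1)^{\alpha}, $$
and
$$ U_\alpha(n, m) = \sup_{\tilde g \in K} |\tilde g(n) \tilde g(m)|, $$
where
$$ K = \{\tilde g \mid H\tilde g = \lambda \tilde g,\ \lambda \in I, \text{ and } \forall n \in \mathbb{Z}^d,\ |\tilde g(n)| \le (|n| + 1)^{-\alpha/2}\}. $$
One can see that
$$ F_\alpha(n, m) \le G_\alpha(n, m) \le U_\alpha(n, m). \tag{5.1} $$
The first inequality in (5.1) is obvious. To prove the second, for any $g \in L$ define $\tilde g(n) = (d_g(\alpha))^{-1/2} g(n)$, so that
$$ \frac{|g(n) g(m)|}{d_g(\alpha)} = |\tilde g(n) \tilde g(m)|. $$
One verifies that
$$ \sum_n |\tilde g(n)|^2 (|n| + 1)^{\alpha} = 1, $$
so $\tilde g \in K$ and the second inequality in (5.1) holds.

Lemma 5.2. Let $\alpha \ge 0$, $\psi \in \mathcal{H}$, $\varphi = P_I(H)\psi$. The inequality holds:
$$ R_{\varphi,M}(n, \alpha) \le \sum_{m \in \mathbb{Z}^d} N_\alpha(n, m) |\psi(m)|, \tag{5.2} $$
where $N_\alpha$ is one of the three functions $F_\alpha$, $G_\alpha$, $U_\alpha$.

Proof. As $\varphi = P_I(H)\psi$ and the system $M$ is complete in $\mathcal{H}(I)$,
$$ \varphi = \sum_k \gamma_k e_k, \qquad \gamma_k = \langle \varphi, e_k \rangle = \langle \psi, e_k \rangle. $$
Therefore,
$$ \gamma_k = \sum_m \psi(m) \overline{e_k(m)} $$
and
$$ \frac{|\gamma_k e_k(n)|}{d_k(\alpha)} \le \sum_m \frac{|e_k(n) e_k(m)|}{d_k(\alpha)} |\psi(m)|. \tag{5.3} $$
Taking in (5.3) the supremum over $k$, we obtain the statement of the lemma for $F_\alpha$. The inequality (5.1) yields the result for $G_\alpha$ and $U_\alpha$.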


The bound (5.2) combined with Theorem 4.2 allows us to give sufficient conditions for $\mathcal{B}$- and $\mathcal{A}$-dynamical localization on $I$.

Theorem 5.3. The following statements hold:
1. Let $\alpha \ge 0$. If one of three functions $F_\alpha(n, m)$, $G_\alpha(n, m)$, $U_\alpha(n, m)$ is fast decaying in $n$ for all fixed $m \in \mathbb{Z}^d$, then $\mathcal{B}$-dynamical localization holds on $I$. In particular, $P_I(H)\psi \in \mathcal{A}$ for any $\psi \in \mathcal{B}$.
2. If the spectrum of $H$ is simple on $I$ and $\mathcal{B}$-dynamical localization holds on $I$, then the function $F_0(n, m) = G_0(n, m)$ is fast decaying in $n$ for all fixed $m \in \mathbb{Z}^d$.

Proof. Let $\psi \in \mathcal{B}$ and $N_\alpha$ be one of three functions $F_\alpha$, $G_\alpha$, $U_\alpha$. As the function $\psi$ is finite and $N_\alpha(n, m)$ is fast decaying in $n$ for any $m$, (5.2) implies that $R_{\varphi,M}(n, \alpha)$ is fast decaying in $n$. The third statement of Theorem 4.2 implies DL($\varphi$), so $\mathcal{B}$-dynamical localization holds on $I$.

To prove the second statement of the theorem, observe that since the spectrum of $H$ is simple on $I$, the system $M$ is unique and coincides with the set of normalised eigenfunctions of $H$ with eigenvalues from $I$. Therefore, $F_\alpha(n, m) = G_\alpha(n, m)$. Moreover, one sees easily that for any $\varphi \in \mathcal{H}(I)$, $M(\varphi)$ is a subset of $M$, where $M(\varphi)$ was defined in Sect. 3. Namely, $M(\varphi) = \{e_k\}_{k \in J}$, $J = \{k \mid \gamma_k \ne 0\}$. Since $\gamma_k = 0$ for any $k \notin J$,
$$ R_{\varphi,M(\varphi)}(n, 0) = \sup_{k \in J} |\gamma_k e_k(n)| = \sup_k |\gamma_k e_k(n)| = R_{\varphi,M}(n, 0). $$
Let $\psi \in \mathcal{B}$, so that DL($\varphi$) $\equiv$ DL($P_I(H)\psi$) holds. By the second statement of Theorem 3.8, the function $R_{\varphi,M}(n, 0)$ is fast decaying in $n$. In particular, if $\psi = \delta_m$, $m \in \mathbb{Z}^d$, then $\gamma_k = \overline{e_k(m)}$ and $R_{\varphi,M}(n, 0) = F_0(n, m)$ is fast decaying in $n$, so the second statement of the theorem holds.

As to the $\mathcal{A}$-dynamical localization on $I$, there are many possible sufficient conditions to propose. For example, the following result holds.

Theorem 5.4. Let $\alpha \ge 0$ and let $N_\alpha$ be one of three functions $F_\alpha$, $G_\alpha$, $U_\alpha$. Assume that one of the two conditions holds:
1. For any $s > 0$ there exist two finite positive constants $k(s)$, $C(s)$ such that
$$ N_\alpha(n, m) \le C(s) (|m| + 1)^{k(s)} (|n| + 1)^{-s}. \tag{5.4} $$
2. For any $s > 0$ there exist two finite positive constants $k(s)$, $C(s)$ such that
$$ N_\alpha(n, m) \le C(s) (|m| + 1)^{k(s)} (|n - m| + 1)^{-s}. \tag{5.5} $$
Then $\mathcal{A}$-dynamical localization holds on $I$.

Proof. Let $\psi \in \mathcal{A}$, so
$$ |\psi(m)| \le C(r) (|m| + 1)^{-r} $$
for any $r > 0$. For any $s > 0$ the bounds (5.2) and (5.4) yield
$$ R_{\varphi,M}(n, \alpha) \le C(r) C(s) (|n| + 1)^{-s} \sum_m (|m| + 1)^{k(s) - r}. $$


Taking $r = k(s) + 2d$, we see that $R_{\varphi,M}(n, \alpha)$ is fast decaying in $n$. The first statement of the theorem follows from the third statement of Theorem 4.2. In the case of (5.5) the proof is similar.

Up to now, the operator $H$ was fixed. Suppose that there is a family of self-adjoint operators $H(\theta)$ depending on some parameter $\theta \in$ [...] $> 0$ such that
$$ \sup_t |\psi(t, n)| \le C \exp(-\gamma |n|). $$
We shall denote it as EDL($\psi$). Clearly, EDL($\psi$) implies DL($\psi$). To establish necessary and sufficient conditions for EDL($\psi$) we shall need the following version of Theorem 2.2.


Theorem 6.1. Let $\{e_k\}$ be any orthonormal system in $\mathcal{H}$, $\gamma > 0$. For any $k$ define the numbers
$$ \eta_k(\gamma) = \sup_n \left( |e_k(n)|^2 \exp(\gamma |n|) \right). $$
One can reorder $\eta_k(\gamma)$ so that
$$ \eta_k(\gamma) \ge D \exp(\beta k^{1/d}) $$
with some universal positive constants $D(\gamma, d)$, $\beta(\gamma, d)$.

Proof. We shall follow the proof of Lemma 2.1 with $a_{kn} = |e_k(n)|^2$ and $A = B = 1$. Let $N > 0$, then for the set
$$ J(N) = \Big\{ k \ \Big|\ \sum_{|n| > N} |e_k(n)|^2 \le 1/2 \Big\} $$
we have
$$ \mathrm{Card}(J(N)) \le K (N + 1)^d. $$
Let $L > 0$. Consider the following set in $\mathbb{N}$: $I(L) = \{k \mid \eta_k(\gamma) \le L\}$. It follows from the definition of $\eta_k(\gamma)$ that for any $k \in I(L)$, $|e_k(n)|^2 \le L \exp(-\gamma |n|)$. Therefore, for any $\nu > 0$,
$$ \sum_{|n| > N} |e_k(n)|^2 \le C(\nu, d)\, L \exp(-(\gamma - \nu) N). $$
Let $L$ be such that $C(\nu, d)\, L \exp(-(\gamma - \nu) N) = 1/2$. Then $I(L) \subset J(N)$ and
$$ \mathrm{Card}(I(L)) \le \mathrm{Card}(J(N)) \le K (N + 1)^d \le C(\gamma, \nu, d) \log^d L \tag{6.1} $$
for any $L \ge L_0(\gamma, \nu, d)$. The result of the theorem follows directly from (6.1).
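The counting bound (6.1) can be made explicit for the canonical basis $e_k = \delta_{n_k}$ of $l^2(\mathbb{Z}^d)$, where $\eta_k(\gamma) = \exp(\gamma |n_k|)$. The sketch below is only an illustration: it assumes that $|n|$ is the max-norm, so that the set $\{n : |n| \le \rho\}$ is a box, and the function names are hypothetical.

```python
import math


def count_eta_below(L, gamma, d):
    """Card{k : eta_k(gamma) <= L} for the canonical basis of l^2(Z^d),
    where eta_k(gamma) = exp(gamma |n_k|) and |n| is the max-norm."""
    if L < 1.0:
        return 0
    radius = math.floor(math.log(L) / gamma)
    return (2 * radius + 1) ** d


def log_power_bound(L, gamma, d):
    """Explicit bound (2 log(L)/gamma + 1)^d, of the form C log^d L for large L."""
    return (2.0 * math.log(L) / gamma + 1.0) ** d


for L in (1.0, 10.0, 1e3, 1e6):
    print(L, count_eta_below(L, 0.5, 2), log_power_bound(L, 0.5, 2))
```

Inverting the count gives $\eta_k(\gamma) \ge \exp(-\gamma/2) \exp((\gamma/2) k^{1/d})$ after reordering in this example, which has the form stated in Theorem 6.1.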

With this result we can obtain a necessary condition for EDL($\psi$) in terms of projections $\psi_k$ and in terms of coefficients of the spectral measure of $\psi$. Let $M(\psi)$ be the orthonormal system of eigenfunctions of $H$ defined in Sect. 3 and $R_{\psi,M(\psi)}(n) = \sup_k |\gamma_k e_k(n)|$, where $\gamma_k = \langle \psi, e_k \rangle$, $H e_k = \lambda_k e_k$. The spectral measure of $\psi$ can be written as
$$ \mu_\psi = \sum_k a_k \delta_{\lambda_k}. $$

Theorem 6.2. Suppose that
$$ \sup_t |\psi(t, n)| \le C \exp(-\alpha |n|) \tag{6.2} $$
for some $\alpha > 0$. Then
$$ \sup_k |\gamma_k|^2 \eta_k(2\alpha) < +\infty, \tag{6.3} $$
or, equivalently,
$$ R_{\psi,M(\psi)}(n) \le C \exp(-\alpha |n|). \tag{6.4} $$
One can reorder $a_k$ so that
$$ a_k \le C \exp(-\beta k^{1/d}) $$
with some positive $C$, $\beta$.


Proof. The proof of the first statement is made in [5]. Since
$$ \psi(t, n) = \sum_s \exp(-it\lambda_s) \gamma_s e_s(n), $$
then for any $k$, $n$ we have
$$ \frac{1}{T} \int_0^T \psi(t, n) \exp(it\lambda_k)\, dt \to \gamma_k e_k(n) \tag{6.5} $$
as $T \to \infty$. The bound (6.2) and (6.5) yield $|\gamma_k e_k(n)| \le C \exp(-\alpha |n|)$ for any $k$, $n$, which implies (6.4) and (6.3). Next, it follows from (6.3) and Theorem 6.1 that after reordering
$$ a_k \equiv |\gamma_k|^2 \le C (\eta_k(2\alpha))^{-1} \le C \exp(-\beta k^{1/d}) $$
with some positive $C$, $\beta$.

In the following statement we shall use the same notations as in Theorem 4.2. As usual, $M = \{e_k\}$ is any orthonormal system of eigenfunctions of $H$. Moreover, for $\delta \ge 0$ we define
$$ R_{\psi,M}(n, \delta) = \sup_k \frac{|\gamma_k e_k(n)|}{\eta_k(\delta)}, \tag{6.6} $$
supposing that $\eta_k(\delta) < +\infty$ for any $k$ (it is always true for $\delta = 0$ because $\eta_k(0) = 1$).

Theorem 6.3. Let $\psi \in \mathcal{H}_M$. The following statements hold with universal constants $C$:
1. If $R_{\psi,M}(n, 0) \le C \exp(-\alpha |n|)$ for some $\alpha > 0$, then
$$ \sup_t |\psi(t, n)| \le C(\alpha, d) (|n|^d + 1) \exp(-\alpha |n|). $$
2. Suppose that $\eta_k(\delta) < +\infty$ for some $\delta > 0$ for any $k$ and $R_{\psi,M}(n, \delta) \le C \exp(-\alpha |n|)$, where $\alpha > \delta$. Then for any $\nu$: $0 < \nu < \alpha - \delta$,
$$ \sup_t |\psi(t, n)| \le C(\nu, \alpha, d) \exp(-\nu |n|). $$
In particular, in both cases EDL($\psi$) holds.


Proof. Since
$$ \sup_k |\gamma_k e_k(n)| \le C \exp(-\alpha |n|), $$
we obtain
$$ \sup_k \left( |\gamma_k|^2 \eta_k(2\alpha) \right) < +\infty. $$
The result of Theorem 6.1 yields after reordering $|\gamma_k| \le C \exp(-\beta k^{1/d})$ with some $C$, $\beta > 0$. Now $\psi(t, n)$ can be estimated in the usual way. For any $n \in \mathbb{Z}^d$ and $B > 0$,
$$ |\psi(t, n)| \le \sum_k |\gamma_k e_k(n)| \le \sum_{k \le B|n|^d} C \exp(-\alpha |n|) + \sum_{k > B|n|^d} |\gamma_k| \le C B |n|^d \exp(-\alpha |n|) + K \exp\big(-\tfrac{\beta}{2} (B |n|^d)^{1/d}\big). \tag{6.7} $$
Taking in (6.7) $B$ so that $\frac{\beta}{2} B^{1/d} = \alpha$, we obtain the first statement of the theorem.

To prove the second statement of the theorem we shall need a bound relating $\eta_k(\alpha)$ and $\eta_k(\nu)$ for $\nu \le \alpha$. It follows from the definition of $\eta_k(\alpha)$ that $|e_k(n)|^2 \le \eta_k(\alpha) \exp(-\alpha |n|)$ for any $k$, $n$. At the same time $|e_k(n)|^2 \le 1$. Therefore,
$$ |e_k(n)|^2 \le \min\{1, \eta_k(\alpha) \exp(-\alpha |n|)\}. \tag{6.8} $$
We shall use the elementary inequality
$$ \min\{1, z\} \le z^s, \qquad z \ge 0,\ 0 < s < 1. \tag{6.9} $$
The bounds (6.8) and (6.9) where $s = \nu/\alpha$ yield $|e_k(n)|^2 \le \eta_k^s(\alpha) \exp(-\nu |n|)$, and finally
$$ \eta_k(\nu) \le (\eta_k(\alpha))^{\nu/\alpha}. \tag{6.10} $$
This bound is very similar to the bound (4.10) for the moments $d_k(r)$. Now we can end the proof. For any $k$, $n$, $|\gamma_k e_k(n)| \le C \eta_k(\delta) \exp(-\alpha |n|)$. Therefore,
$$ |\gamma_k|\, \eta_k^{1/2}(2\alpha) \le C \eta_k(\delta). \tag{6.11} $$
Next, as
$$ |\psi(t, n)| \le \sum_k |\gamma_k e_k(n)|, $$


one has
$$ A \equiv \sup_n \exp(\nu |n|) \sup_t |\psi(t, n)| \le \sum_k |\gamma_k|\, \eta_k^{1/2}(2\nu). \tag{6.12} $$
The bounds (6.11) and (6.12) imply
$$ A \le C \sum_k \eta_k(\delta)\, \eta_k^{1/2}(2\nu)\, \eta_k^{-1/2}(2\alpha). \tag{6.13} $$
Using twice the bound (6.10), we obtain from (6.13):
$$ A \le C \sum_k (\eta_k(2\alpha))^{(\delta + \nu - \alpha)/(2\alpha)}. $$
As $\delta + \nu < \alpha$, by Theorem 6.1 the sum converges and is bounded by some universal constant, so the second statement of the theorem holds.

The result of the theorem can be used to give sufficient conditions for exponential dynamical localization on the interval of energies $I$. Consider the set of exponentially decaying functions in $\mathcal{H}$: $\mathcal{C} = \{\psi \mid \exists r > 0 : |\psi(n)| \le C \exp(-r|n|)\}$. Clearly, $\mathcal{B} \subset \mathcal{C} \subset \mathcal{A}$, where $\mathcal{A}$ and $\mathcal{B}$ were defined in the previous section.

Definition 6.4. The operator $H$ has exponential dynamical localization on $I$ if for any $\psi \in \mathcal{C}$, we have EDL($P_I(H)\psi$).

Using the result of Theorem 6.3, one can give sufficient conditions for EDL on the interval $I$. For the sake of simplicity, we restrict ourselves to the first statement of this theorem. One could, however, if necessary, give also a more general sufficient condition based on the second statement of Theorem 6.3. As in the previous section, $M = \{e_k\}$ is some orthonormal system of eigenfunctions of $H$ complete in $\mathcal{H}(I)$ and
$$ F_0(n, m) = \sup_k |e_k(n) e_k(m)|, \qquad G_0(n, m) = \sup_{g \in L} |g(n) g(m)|, \qquad F_0(n, m) \le G_0(n, m). $$

Theorem 6.5. Let $N$ be one of two functions $F_0$, $G_0$. Let $\psi \in \mathcal{C}$ so that $|\psi(m)| \le C \exp(-r|m|)$ for some $r > 0$. As usual, let $\varphi = P_I(H)\psi$, $\varphi(t) = \exp(-itH)\varphi$. The following statements hold:


1. Suppose that there exist $\alpha > 0$, $\beta > 0$ such that $N(n, m) \le C \exp(-\alpha |n| + \beta |m|)$. Then
$$ \sup_t |\varphi(t, n)| \le C(\gamma) \exp(-\gamma |n|), $$
where $0 < \gamma < \min\{\alpha, \alpha r/\beta\}$.
2. Suppose that there exist $\alpha > 0$, $\beta > 0$ such that $N(n, m) \le C \exp(-\alpha |n - m| + \beta |m|)$. Then
$$ \sup_t |\varphi(t, n)| \le C(\gamma) \exp(-\gamma |n|), \tag{6.14} $$
where $0 < \gamma < \min\{\alpha, \alpha r/(\alpha + \beta)\}$.
In particular, in both cases EDL holds on $I$.

Proof. We shall give it for the second statement of the theorem; for the first the argument is similar. As in the proof of Lemma 5.2, we have the bound
$$ R_{\varphi,M}(n, 0) \le \sum_m N(n, m) |\psi(m)|. $$
Since $N(n, m) \le 1$ for any $n$, $m$,
$$ N(n, m) \le N^s(n, m) \le C^s \exp(-s\alpha |n - m| + s\beta |m|) $$
for all $s \in [0, 1]$ (the argument is similar to (6.8)–(6.9)). Therefore,
$$ R_{\varphi,M}(n, 0) \le C(s) \sum_m \exp(-s\alpha |n - m| - (r - s\beta)|m|). $$
If $r \ge \alpha + \beta$, we take $s = 1$, and for $r < \alpha + \beta$, we take $s = r/(\alpha + \beta)$. In both cases we obtain the bound
$$ R_{\varphi,M}(n, 0) \le C(\gamma) \exp(-\gamma |n|) \tag{6.15} $$
for all $0 < \gamma < \min\{\alpha, \alpha r/(\alpha + \beta)\}$. The bound (6.14) follows directly from (6.15) and the first statement of Theorem 6.3.

As an example where this theorem can be directly applied consider operators with SULE on $I$. Namely, assume that there exists an orthonormal system $M = \{e_k\}$ of eigenfunctions of $H$ complete in $\mathcal{H}(I)$ such that for some $n_k \in \mathbb{Z}^d$ and for any $\delta > 0$,
$$ |e_k(n)| \le C(\delta) \exp(\delta |n_k| - \alpha |n - n_k|), \tag{6.16} $$
where $\alpha > 0$ and the constants $C(\delta)$ are uniform in $k$, $n$. It follows from (6.16) that
$$ |e_k(n) e_k(m)| \le C^2(\delta) \exp(2\delta |n_k| - \alpha(|n - n_k| + |m - n_k|)). $$
Using the elementary inequalities
$$ |n_k| \le |m| + |m - n_k|, \qquad |n - n_k| + |m - n_k| \ge |m - n|, $$


one can easily show that F0 (n, m) = sup |ek (n)ek (m)| ≤ C 2 (δ) exp(2δ|m| − (α − 2δ)|n − m|) k

for any δ > 0. The second statement of Theorem 6.5 implies EDL on I . Moreover, if |ψ(m)| ≤ C exp(−r|m|), then sup |ϕ(t, n)| ≤ C(γ ) exp(−γ |n|), t

where 0 < γ < min{α, r}. 7. Adaptation to the Continuous Case Most of results of the previous sections remain valid in the case of L2 (Rd ) provided the result of Theorem 2.2 holds. However, one cannot expect that Theorem 2.2 is true in the continuous case in such a generality. For example, in the case of L2 (R), define the moments dk (p) = |ek (x)|2 (|x| + 1)p dx. R

It is sufficient to take any orthonormal system $\{e_k(x)\}$ in $L^2([-1,1])$ and to put $e_k(x) = 0$ for $|x| > 1$. For such a system $d_k(p) \le 2^p$ for any $k$. However, if the functions $e_k(x)$ do not oscillate fast, the same phenomenon of "repulsion" of eigenfunctions occurs and one can show a result similar to that of Theorem 2.2. The main result of this section is the following.

Theorem 7.1. Let $\{e_k\}$, $k \in \mathbb{N}$, be an orthonormal system in $L^2(\mathbb{R}^d)$ such that
\[
\lim_{R\to+\infty} \sup_k \int_{|u|>R} |\hat e_k(u)|^2 \, du = 0, \tag{7.1}
\]
where $\hat e_k$ is the Fourier transform of $e_k$. Then for any $p > 0$ one can reorder the moments
\[
d_k(p) = \int_{\mathbb{R}^d} (|x|+1)^p |e_k(x)|^2 \, dx
\]
so that
\[
d_k(p) \ge D(p,d)\, k^{p/d}
\]
with some positive constants $D(p,d)$ depending on the system $\{e_k\}$.

Proof. To prove the theorem, we shall discretize the problem and use the same technical Lemma 2.1 as in the discrete case. For any $n = (n_1, \dots, n_d) \in \mathbb{Z}^d$, $\varepsilon > 0$, define the cube of size $\varepsilon$ in $\mathbb{R}^d$:
\[
K_n(\varepsilon) = \{x = (x_1, \dots, x_d) \in \mathbb{R}^d \mid x_j \in [\varepsilon n_j, \varepsilon(n_j+1)),\ j = 1, \dots, d\}.
\]
It is clear that the cubes are disjoint and $\mathbb{R}^d = \cup_n K_n(\varepsilon)$. Let $x \in K_n(\varepsilon)$. Then $C_1(|n|+1) \le |x|+1 \le C_2(|n|+1)$ with some constants $C_1(\varepsilon)$, $C_2(\varepsilon)$. As
\[
d_k(p) = \int_{\mathbb{R}^d} (|x|+1)^p |e_k(x)|^2 \, dx = \sum_{n\in\mathbb{Z}^d} \int_{K_n(\varepsilon)} (|x|+1)^p |e_k(x)|^2 \, dx,
\]
we obtain that
\[
C_1^p(\varepsilon)\, w_k(p) \le d_k(p) \le C_2^p(\varepsilon)\, w_k(p), \tag{7.2}
\]
where
\[
w_k(p) = \sum_n (|n|+1)^p \int_{K_n(\varepsilon)} |e_k(x)|^2 \, dx \equiv \sum_n (|n|+1)^p |g_k(n)|^2.
\]

One could try to apply Lemma 2.1 taking $a_{kn} = |g_k(n)|^2$. It is obvious that $\sum_n |g_k(n)|^2 = \|e_k\|^2 = 1$, so the condition (2.2) is satisfied. However, it is not clear whether $\sum_k |g_k(n)|^2 \le A < +\infty$. To avoid this problem, we shall consider instead the quantities
\[
h_k(n) = \int_{K_n(\varepsilon)} e_k(x) \, dx.
\]
By the Cauchy–Schwarz inequality,
\[
|h_k(n)|^2 \le \varepsilon^d |g_k(n)|^2. \tag{7.3}
\]
Therefore, (7.2) implies
\[
d_k(p) \ge \varepsilon^{-d} C_1^p(\varepsilon) \sum_n (|n|+1)^p |h_k(n)|^2, \tag{7.4}
\]
and to prove the theorem it is sufficient to show that the numbers $a_{kn} = |h_k(n)|^2$ verify the conditions of Lemma 2.1 for some $\varepsilon > 0$. One can represent $h_k(n)$ as $h_k(n) = \langle e_k, \eta_n \rangle_{L^2(\mathbb{R}^d)}$, where $\eta_n$ is the characteristic function of $K_n(\varepsilon)$. Since the system $\{e_k\}$ is orthonormal,
\[
\sum_k |h_k(n)|^2 \le \|\eta_n\|^2 = \varepsilon^d,
\]
so (2.1) holds with $A = \varepsilon^d$. To prove (2.2) is more difficult. We shall show that the numbers $\varepsilon^{-d} |h_k(n)|^2$ are close to $|g_k(n)|^2$ for $\varepsilon$ small enough if the condition (7.1) is satisfied. Using $\sum_n |g_k(n)|^2 = 1$, we shall prove (2.2) for some $B(\varepsilon) > 0$ if $\varepsilon$ is small enough. We need the following technical result.

Lemma 7.2. Let $\psi \in L^2(\mathbb{R}^d)$, $\|\psi\| = 1$. For any $n \in \mathbb{Z}^d$, $\varepsilon > 0$ define
\[
\Delta_n(\varepsilon) = \int_{K_n(\varepsilon)} |\psi(x)|^2 \, dx - \varepsilon^{-d} \Bigl| \int_{K_n(\varepsilon)} \psi(x) \, dx \Bigr|^2
\]
($\Delta_n(\varepsilon) \ge 0$ by the Cauchy–Schwarz inequality). There exists some universal constant $C(d)$ such that for any $\varepsilon > 0$, $R > 0$,
\[
0 \le \sum_{n\in\mathbb{Z}^d} \Delta_n(\varepsilon) \le C(d) \Bigl( R^2 \varepsilon^2 + \int_{|u|>R} |\hat\psi(u)|^2 \, du \Bigr)^{1/2}.
\]

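Proof. One can represent $\Delta_n(\varepsilon)$ as
\[
\Delta_n(\varepsilon) = \varepsilon^{-d} \int_{K_n(\varepsilon)} dx \, \overline{\psi(x)} \int_{K_n(\varepsilon)} dy \, (\psi(x) - \psi(y)). \tag{7.5}
\]
Applying the Cauchy–Schwarz inequality twice (to the integral over $y$ and to the integral over $x$), we obtain from (7.5):
\[
\Delta_n^2(\varepsilon) \le \varepsilon^{-d} \int_{K_n(\varepsilon)} dx \, |\psi(x)|^2 \int_{K_n(\varepsilon)} \int_{K_n(\varepsilon)} dx \, dy \, |\psi(x) - \psi(y)|^2.
\]
Therefore,
\[
\sum_n \Delta_n(\varepsilon) \le \varepsilon^{-d/2} \Bigl( \sum_n \int_{K_n(\varepsilon)} dx \, |\psi(x)|^2 \Bigr)^{1/2} \cdot \Bigl( \sum_n \int_{K_n(\varepsilon)} \int_{K_n(\varepsilon)} dx \, dy \, |\psi(x) - \psi(y)|^2 \Bigr)^{1/2}
= \varepsilon^{-d/2} \Bigl( \sum_n \int_{K_n(\varepsilon)} \int_{K_n(\varepsilon)} dx \, dy \, |\psi(x) - \psi(y)|^2 \Bigr)^{1/2}. \tag{7.6}
\]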

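One can now observe that $|x - y| \le \varepsilon\sqrt{d}$ for any $x, y \in K_n(\varepsilon)$. Therefore,
\[
\sum_n \int_{K_n(\varepsilon)} \int_{K_n(\varepsilon)} dx \, dy \, |\psi(x) - \psi(y)|^2
\le \sum_n \int_{K_n(\varepsilon)} dx \int_{\mathbb{R}^d} dy \, |\psi(x) - \psi(y)|^2 F(|x-y| \le \varepsilon\sqrt{d})
= \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} dx \, dy \, |\psi(x) - \psi(y)|^2 F(|x-y| \le \varepsilon\sqrt{d}), \tag{7.7}
\]
where $F$ is the characteristic function of the set $\{(x,y) \mid |x-y| \le \varepsilon\sqrt{d}\}$. The bounds (7.6)–(7.7) imply
\[
\sum_n \Delta_n(\varepsilon) \le \varepsilon^{-d/2} L^{1/2}(\delta), \tag{7.8}
\]
where $L(\delta) = \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} dx \, dy \, |\psi(x) - \psi(y)|^2 F(|x-y| \le \delta)$ and $\delta = \varepsilon\sqrt{d}$. Changing the variable $z = y - x$ in the integral, we obtain in Fourier representation
\[
L(\delta) = \int_{|z|\le\delta} dz \int_{\mathbb{R}^d} du \, |\hat\psi(u)|^2 \, |1 - e^{i\langle z,u\rangle}|^2. \tag{7.9}
\]
Let $R > 0$. The integral over $u$ can be written as
\[
\int_{\mathbb{R}^d} du \, |\hat\psi(u)|^2 \, |1 - e^{i\langle z,u\rangle}|^2 = I_1(z) + I_2(z),
\]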

where in $I_1(z)$ and $I_2(z)$ one integrates over $u : |u| \le R$ and over $u : |u| > R$ respectively. Using the elementary bound $|e^{iw} - 1| \le C|w|$, $w \in \mathbb{R}$, we estimate
\[
I_1(z) \le C|z|^2 \int_{|u|\le R} du \, |u|^2 |\hat\psi(u)|^2 \le C|z|^2 R^2 \|\psi\|^2 = C|z|^2 R^2. \tag{7.10}
\]
As to $I_2(z)$, trivially
\[
I_2(z) \le 4 \int_{|u|>R} du \, |\hat\psi(u)|^2. \tag{7.11}
\]
The bounds (7.9)–(7.11) imply
\[
L(\delta) \le C R^2 \delta^{d+2} + C \delta^d \int_{|u|>R} du \, |\hat\psi(u)|^2. \tag{7.12}
\]
Finally, (7.8) and (7.12) yield the statement of the lemma.
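The quantity $\Delta_n(\varepsilon)$ of Lemma 7.2 can be explored numerically. The following is only an illustrative sketch, not part of the proof: it assumes $d = 1$, replaces the integrals by Riemann sums on a uniform grid, and uses a Gaussian wave packet chosen purely as an example (the function name `delta_per_cell` and all parameter values are hypothetical). It checks that each $\Delta_n(\varepsilon)$ is non-negative and that $\sum_n \Delta_n(\varepsilon)$ decreases as $\varepsilon$ decreases for this smooth $\psi$.

```python
import numpy as np


def delta_per_cell(psi, dx, samples_per_cell):
    """Riemann-sum version of Delta_n(eps) in d = 1, with eps = samples_per_cell * dx.

    Delta_n = int_{K_n} |psi|^2 dx - eps^{-1} |int_{K_n} psi dx|^2 for each cell K_n.
    """
    n_cells = len(psi) // samples_per_cell
    cells = psi[: n_cells * samples_per_cell].reshape(n_cells, samples_per_cell)
    eps = samples_per_cell * dx
    mass = np.sum(np.abs(cells) ** 2, axis=1) * dx          # int |psi|^2 over each cell
    mean_part = np.abs(np.sum(cells, axis=1) * dx) ** 2 / eps  # eps^{-1} |int psi|^2
    return mass - mean_part


# Example wave packet (an assumption for illustration): normalized Gaussian times a phase.
dx = 0.001
x = np.arange(-10.0, 10.0, dx)
psi = np.pi ** -0.25 * np.exp(-x ** 2 / 2) * np.exp(2j * x)

totals = []
for samples_per_cell in (500, 100, 20):   # eps = 0.5, 0.1, 0.02
    deltas = delta_per_cell(psi, dx, samples_per_cell)
    totals.append(float(np.sum(deltas)))
    print(samples_per_cell * dx, totals[-1], float(deltas.min()))
```

For a smooth, slowly oscillating $\psi$ the printed sums shrink with $\varepsilon$, which is the behaviour the lemma quantifies through the high-frequency tail of $\hat\psi$.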

Now we can finish the proof of the theorem. Let $\{e_k\}$ be an orthonormal system verifying (7.1). The bound of Lemma 7.2 applied to $e_k$ yields
\[
\sum_n \bigl( |g_k(n)|^2 - \varepsilon^{-d} |h_k(n)|^2 \bigr) \le C(d) \Bigl( R^2 \varepsilon^2 + \int_{|u|>R} du \, |\hat e_k(u)|^2 \Bigr)^{1/2}. \tag{7.13}
\]
Using the condition (7.1), it is easy to see that one can choose $R > 0$ big enough and $\varepsilon > 0$ small enough so that the r.h.s. of (7.13) is smaller than $1/2$ for any $k \in \mathbb{N}$. As $\sum_n |g_k(n)|^2 = 1$ for any $k$, (7.13) yields for such $\varepsilon$:
\[
\sum_n |h_k(n)|^2 \ge \varepsilon^d / 2
\]
and (2.2) holds for $a_{kn} = |h_k(n)|^2$ with $B = \varepsilon^d/2$. The proof of the theorem is completed.

Remark. One can note that the choice of $\varepsilon$ depends on the system $\{e_k\}$, so, unlike the discrete case, the constants $D(p,d)$ are not necessarily universal in the continuous case.

An important example where the condition (7.1) is satisfied is given by the following theorem.

Theorem 7.3. Let $H = -\Delta + V(x)$ be an operator in $L^2(\mathbb{R}^d)$, self-adjoint on $H^2(\mathbb{R}^d)$, where $V(x)$ is a real function bounded from below: $V(x) \ge -M$ for a.e. $x \in \mathbb{R}^d$. Let $K \in \mathbb{R}$ and let $\{e_k\}$ be any orthonormal family of eigenfunctions of $H$ with eigenvalues $\lambda_k \le K$ for all $k$. Then for any $p > 0$ one can reorder the moments $d_k(p)$ so that
\[
d_k(p) \ge D(p, d, K+M)\, k^{p/d}
\]
with a universal positive constant $D$ depending only on $p$, $d$ and $K + M$.

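Proof. For any $k \in \mathbb{N}$ we have $H e_k(x) = -\Delta e_k(x) + V(x) e_k(x) = \lambda_k e_k(x)$. Therefore,
\[
\langle -\Delta e_k, e_k \rangle = \int_{\mathbb{R}^d} dx \, (\lambda_k - V(x)) |e_k(x)|^2 \le (K+M) \|e_k\|^2 = K + M. \tag{7.14}
\]
On the other hand,
\[
\langle -\Delta e_k, e_k \rangle = \int_{\mathbb{R}^d} du \, |u|^2 |\hat e_k(u)|^2 \ge R^2 \int_{|u|>R} du \, |\hat e_k(u)|^2 \tag{7.15}
\]
for any $R > 0$. The bounds (7.14)–(7.15) imply
\[
\sup_k \int_{|u|>R} du \, |\hat e_k(u)|^2 \le (K+M)/R^2,
\]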

so (7.1) is satisfied. Moreover, it is clear from the proof of Theorem 7.1 that one can choose $R > 0$ and $\varepsilon > 0$ depending only on $d$ and $K + M$ so that the r.h.s. of (7.13) is smaller than $1/2$ for any $k$. That means that the constants $A = \varepsilon^d$ and $B = \varepsilon^d/2$ in Lemma 2.1 depend only on $d$ and $K + M$ but not on the choice of the system $\{e_k\}$. This gives us the result of the theorem.

All the results of Sect. 3 hold if the orthonormal system $M(\psi)$ satisfies the condition (7.1). The proof of Theorem 3.3 is essentially the same (one considers $\int_{|x|\le N} dx$ instead of $\sum_{|n|\le N}$ in the proof). The proofs of Theorem 3.4 and Corollary 3.5 do not change. The results of Lemma 3.6 and Theorem 3.8 hold with the function $R_{\psi,M}(n)$ defined as follows:
\[
R_{\psi,M}(n) = \sup_k |\gamma_k g_k(n)|, \qquad \gamma_k = \langle \psi, e_k \rangle, \quad n \in \mathbb{N},
\]
where $|g_k(n)|^2 \equiv \int_{K_n(1)} dx \, |e_k(x)|^2$. The sufficient conditions for DL($\psi$, $p$) and DL($\psi$) in the continuous case are based on the following version of Lemma 4.1:
\[
\langle |X|^p \rangle_\psi(t) \le C \sum_{n\in\mathbb{Z}^d} (|n|+1)^p \int_{K_n(1)} |\psi(t,x)|^2 \le \Bigl( \sum_k a_k w_k(p) \Bigr)^2, \tag{7.16}
\]
where $w_k(p) = \sum_n (|n|+1)^p |g_k(n)|^2$. The numbers $w_k(p)$ are equivalent to the moments $d_k(p)$ due to (7.2), so the lower bounds $w_k(p) \ge D k^{p/d}$ hold. A result similar to Theorem 4.2 can be easily obtained. One should only replace $R(n,\alpha)$ by $\sup_k |\gamma_k g_k(n)| / w_k(\alpha)$, $\sup_t |\psi(t,n)|$ by $\sum_k |\gamma_k g_k(n)|$, $d_k(p)$ by $w_k(p)$, and $e_k(n)$ by $g_k(n)$. The only difference is the following: one does not have the bound $\sum_k |g_k(n)|^2 \le 1$ which was valid for $e_k(n)$ in the discrete case. Therefore the bounds in Statements 1 and 2 of the theorem that one can prove are slightly weaker than in the discrete case. Statement 3 of the theorem and the result of Corollary 4.3 remain true. For the sake of completeness, let us give a direct proof of the third statement of Theorem 4.2 in the continuous case (this proof is valid also in the discrete case). For simplicity we shall suppose that $\alpha = 0$.

Theorem 7.4. Let $M = \{e_k\}$ be some orthonormal system of eigenfunctions of $H$ in $L^2(\mathbb{R}^d)$ verifying the conditions of Theorem 7.1. For a given vector $\psi \in H_M$ consider the function $R(n) = \sup_k |\gamma_k g_k(n)|$, where $|g_k(n)|^2 = \int_{K_n(1)} |e_k(x)|^2 \, dx \le 1$, $n \in \mathbb{Z}^d$, and $\gamma_k = \langle \psi, e_k \rangle$. If the function $R(n)$ is fast decaying, then DL($\psi$) holds.

Proof. As the function $R(n)$ is fast decaying, for any $r > 0$,
\[
|\gamma_k|^2 w_k(r) = \sum_n (|n|+1)^r |\gamma_k g_k(n)|^2 \le \sum_n (|n|+1)^r R^2(n) \le C(r) < +\infty.
\]
On the other hand, $w_k(r) \ge D k^{r/d}$ after reordering. Therefore,
\[
|\gamma_k| \le C(m)\, k^{-m} \quad \text{for any } m > 0. \tag{7.17}
\]
Next, as
\[
|\psi(t,x)| \le \sum_{k=1}^{\infty} |\gamma_k e_k(x)|,
\]
for any $n \in \mathbb{Z}^d$ the following bound holds:
\[
\int_{K_n(1)} |\psi(t,x)|^2 \, dx \le \Bigl( \sum_{k=1}^{\infty} |\gamma_k g_k(n)| \Bigr)^2 \equiv S^2(n). \tag{7.18}
\]
Reorder the terms in the sum so that (7.17) holds. Then
\[
S(n) = \Bigl( \sum_{k=1}^{|n|^2} + \sum_{k=|n|^2+1}^{\infty} \Bigr) |\gamma_k g_k(n)|
\le |n|^2 R(n) + \sum_{k=|n|^2+1}^{\infty} |\gamma_k| \le |n|^2 R(n) + C(m) (|n|^2+1)^{1-m} \tag{7.19}
\]
for any $m > 0$. The bounds (7.16), (7.18) and (7.19) yield DL($\psi$, $p$) for all $p > 0$. The proof is completed.

Most of the results of Sect. 5 can be adapted to the continuous case. It is sufficient to take $g_k(n)$ instead of $e_k(n)$ and $\bigl( \int_{K_n(1)} |\psi(x)|^2 \bigr)^{1/2}$ instead of $\psi(n)$. The results of Theorem 5.3 and Theorem 5.4 are true if the system $M$, complete in $H(I)$, satisfies the conditions of Theorem 7.1. In particular, this is the case if $H = -\Delta + V(x)$ with $V(x)$ bounded from below and $I = (-\infty, K]$. A result similar to that of Theorem 5.5 can be proved in the case $H(\theta) = -\Delta + V(x,\theta)$, where $V(x,\theta) \ge -M$ for $\mu$-a.e. $\theta$ and a.e. $x$. The constants in the bounds will depend on $\varepsilon$, $p$, $d$ and $K + M$. The main results of Sect. 6 can also be generalized to the continuous case in a similar way.

Acknowledgements. I thank F. Germinet for stimulating discussions on the subject of the paper.

References

1. Aizenman, M.: Localization at weak disorder: Some elementary bounds. Rev. Math. Phys. 6, 1163–1182 (1994)
2. Aizenman, M., Schenker, J.H., Friedrich, R.M., Hundertmark, D.: Finite-volume fractional-moment criteria for Anderson localization. To appear in Commun. Math. Phys.
3. Damanik, D. and Stollman, P.: Multi-scale analysis implies strong dynamical localization. Preprint (1999)
4. De Bièvre, S. and Germinet, F.: Dynamical localization for the random dimer Schrödinger operator. J. Stat. Phys. 98, 1135–1148 (2000)
5. Del Rio, R., Jitomirskaya, S., Last, Y. and Simon, B.: Operators with singular continuous spectrum IV: Hausdorff dimensions, rank one perturbation and localization. J. d'Analyse Math. 69, 153–200 (1996)
6. Germinet, F. and De Bièvre, S.: Dynamical localization for discrete and continuous random Schrödinger operators. Commun. Math. Phys. 194, 323–341 (1998)
7. Germinet, F.: Dynamical localization II with an application to the almost Mathieu operator. J. Stat. Phys. 95, 273–286 (1999)
8. Germinet, F. and Jitomirskaya, S.: Strong dynamical localization for the almost Mathieu model. Preprint (2000)

Communicated by B. Simon

Commun. Math. Phys. 221, 57 – 76 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Conformal and Quasiconformal Realizations of Exceptional Lie Groups

M. Günaydin1, K. Koepsell2, H. Nicolai2

1 CERN, Theory Division, 1211 Geneva 23, Switzerland. E-mail: [email protected]
2 Max-Planck-Institut für Gravitationsphysik, Albert-Einstein-Institut, Mühlenberg 1, 14476 Potsdam, Germany. E-mail: [email protected]; [email protected]

Received: 12 August 2000 / Accepted: 2 March 2001

Abstract: We present a nonlinear realization of E8(8) on a space of 57 dimensions, which is quasiconformal in the sense that it leaves invariant a suitably defined “light cone” in R 57 . This realization, which is related to the Freudenthal triple system associated with the unique exceptional Jordan algebra over the split octonions, contains previous conformal realizations of the lower rank exceptional Lie groups on generalized space times, and in particular a conformal realization of E7(7) on R 27 which we exhibit explicitly. Possible applications of our results to supergravity and M-Theory are briefly mentioned.

1. Introduction

It is an old idea to define generalized space-times by association with Jordan algebras $J$, in such a way that the space-time is coordinatized by the elements of $J$, and that its rotation, Lorentz, and conformal group can be identified with the automorphism, reduced structure, and the linear fractional group of $J$, respectively [11–13]. The aesthetic appeal of this idea rests to a large extent on the fact that key ingredients for formulating relativistic quantum field theories over four dimensional Minkowski space extend naturally to these generalized space-times; in particular, the well-known connection between the positive energy unitary representations of the four dimensional conformal group $SU(2,2)$ and the covariant fields transforming in finite dimensional representations of the Lorentz group $SL(2,\mathbb{C})$ [29, 28] extends to all generalized space-times defined by Jordan algebras [16]. The appearance of exceptional Lie groups and algebras in extended supergravities and their relevance to understanding the non-perturbative regime of string theory have provided new impetus; indeed, possible applications to string and M-Theory constitute the main motivation for the present investigation.

This work was supported in part by the NATO collaborative research grant CRG. 960188.

Work supported in part by the National Science Foundation under grant number PHY-9802510.

Permanent address: Physics Department, Penn State University, University Park, PA 16802, USA.

In this paper, we will present a novel construction involving the maximally extended Lie group E8(8) . This construction of E8(8) together with the corresponding construction of E8(−24) contain all previous examples of generalized space-times based on exceptional Lie groups, and at the same time goes beyond the framework of Jordan algebras. More precisely, we show that there exists a quasiconformal nonlinear realization of E8(8) on a space of 57 dimensions1 . This space may be viewed as the quotient of E8(8) by its maximal parabolic subgroup [18, 19]; there is no Jordan algebra directly associated with it, but it can be related to a certain Freudenthal triple system which itself is associated with the “split” exceptional Jordan algebra J3O S , where O S denote the split real form of the octonions O. It furthermore admits an E7(7) invariant norm form N4 , which gets multiplied by a (coordinate dependent) factor under the nonlinearly realized “special conformal” transformations. Therefore the light cone, defined by the condition N4 = 0, is actually invariant under the full E8(8) , which thus plays the role of a generalized conformal group. By truncation we obtain quasiconformal realizations of other exceptional Lie groups. Furthermore, we recover previous conformal realizations of the lower rank exceptional groups (some of which correspond to Jordan algebras). In particular, we give a completely explicit conformal Möbius-like nonlinear realization of E7(7) on the 27-dimensional space associated with the exceptional Jordan algebra J3O S , with linearly realized subgroups F4(4) (the “rotation group”) and E6(6) (the “Lorentz group”). Although in part this result is implicitly contained in the existing literature on Jordan algebras, the relevant transformations have not been exhibited explicitly so far, and are here presented in the basis that is also used in maximal supergravity theories. 
1 A nonlinear realization will be referred to as "quasiconformal" if it is based on a five graded decomposition of the underlying Lie algebra (as for $E_{8(8)}$); it will be called "conformal" if it is based on a three graded decomposition (as e.g. for $E_{7(7)}$).

The basic concepts are best illustrated in terms of a simple and familiar example, namely the conformal group in four dimensions [29], and its realization via the Jordan algebra $J_2^{\mathbb{C}}$ of hermitian $2 \times 2$ matrices with the hermiticity preserving commutative (but non-associative) product
\[
a \circ b := \tfrac{1}{2}(ab + ba) \tag{1}
\]
(basic properties of Jordan algebras are summarized in Appendix A). As is well known, these matrices are in one-to-one correspondence with four-vectors $x^\mu$ in Minkowski space via the formula $x \equiv x_\mu \sigma^\mu$, where $\sigma^\mu := (1, \vec\sigma)$. The "norm form" on this algebra is just the ordinary determinant, i.e.
\[
N_2(x) := \det x = x_\mu x^\mu \tag{2}
\]
(it will be a higher order polynomial in the general case). Defining $\bar x := x_\mu \bar\sigma^\mu$ (where $\bar\sigma^\mu := (1, -\vec\sigma)$) we introduce the Jordan triple product on $J_2^{\mathbb{C}}$:
\[
\{a\, b\, c\} := (a \circ \bar b) \circ c + (c \circ \bar b) \circ a - (a \circ c) \circ \bar b
= \tfrac{1}{2}(a \bar b c + c \bar b a) = \langle a, b \rangle c + \langle c, b \rangle a - \langle a, c \rangle b \tag{3}
\]
with the standard Lorentz invariant bilinear form $\langle a, b \rangle := a_\mu b^\mu$. However, it is not generally true that the Jordan triple product can be thus expressed in terms of a bilinear form. The automorphism group of $J_2^{\mathbb{C}}$, which is by definition compatible with the Jordan product, is just the rotation group $SU(2)$; the structure group, defined as the invariance of the norm form up to a constant factor, is the product $SL(2,\mathbb{C}) \times D$, i.e. the Lorentz group and dilatations. The conformal group associated with $J_2^{\mathbb{C}}$ is the group leaving invariant the light-cone $N_2(x) = 0$. As is well known, the associated Lie algebra is $su(2,2)$, and possesses a three-graded structure
\[
\mathfrak{g} = \mathfrak{g}_{-1} \oplus \mathfrak{g}_0 \oplus \mathfrak{g}_{+1}, \tag{4}
\]

where the grade $-1$ and grade $+1$ spaces correspond to generators of translations $P^\mu$ and special conformal transformations $K^\mu$, respectively, while the grade 0 subspace is spanned by the Lorentz generators $M^{\mu\nu}$ and the dilatation generator $D$. The subspaces $\mathfrak{g}_{+1}$ and $\mathfrak{g}_{-1}$ can each be associated with the Jordan algebra $J_2^{\mathbb{C}}$, such that their elements are labeled by elements $a = a_\mu \sigma^\mu$ of $J_2^{\mathbb{C}}$. The precise correspondence is
\[
U_a := a_\mu P^\mu \in \mathfrak{g}_{-1} \quad \text{and} \quad \tilde U_a := a_\mu K^\mu \in \mathfrak{g}_{+1}. \tag{5}
\]
By contrast, the generators in $\mathfrak{g}_0$ are labeled by two elements $a, b \in J_2^{\mathbb{C}}$, viz.
\[
S_{ab} := a_\mu b_\nu (M^{\mu\nu} + \eta^{\mu\nu} D). \tag{6}
\]
The conformal group is realized non-linearly on the space of four-vectors $x \in J_2^{\mathbb{C}}$, with a Möbius-like infinitesimal action of the special conformal transformations
\[
\delta x^\mu = 2 \langle c, x \rangle x^\mu - \langle x, x \rangle c^\mu \tag{7}
\]
with parameter $c^\mu$. All variations acquire a very simple form when expressed in terms of the above generators: we have
\[
U_a(x) = a, \qquad S_{ab}(x) = \{a\, b\, x\}, \qquad \tilde U_c(x) = -\tfrac{1}{2} \{x\, c\, x\}, \tag{8}
\]
where $\{\dots\}$ is the Jordan triple product introduced above. From these transformations it is elementary to deduce the commutation relations
\[
[U_a, \tilde U_b] = S_{ab}, \qquad [S_{ab}, U_c] = U_{\{abc\}}, \qquad [S_{ab}, \tilde U_c] = \tilde U_{\{bac\}}, \qquad [S_{ab}, S_{cd}] = S_{\{abc\} d} - S_{\{bad\} c} \tag{9}
\]
(of course, these could have been derived directly from those of the conformal group). As one can also see, the Lie algebra $\mathfrak{g}$ admits an involutive automorphism $\iota$ exchanging $\mathfrak{g}_{-1}$ and $\mathfrak{g}_{+1}$ (hence, $\iota(K^\mu) = P^\mu$).

The above transformation rules and commutation relations exemplify the structure that we will encounter again in Sect. 3 of this paper: the conformal realization of $E_{7(7)}$ on $\mathbb{R}^{27}$ presented there has the same form, except that $J_2^{\mathbb{C}}$ is replaced by the exceptional Jordan algebra $J_3^{\mathbb{O}_S}$ over the split octonions $\mathbb{O}_S$. While our form of the nonlinear variations appears to be new, the concomitant construction of the Lie algebra itself by means of the Jordan triple product has been known in the literature as the Tits–Kantor–Koecher construction [32, 21, 25], and as such generalizes to other Jordan algebras. The generalized linear fractional (Möbius) groups of Jordan algebras can be abstractly defined in an analogous manner [26], and shown to leave invariant certain generalized $p$-angles defined by the norm form of degree $p$ of the underlying Jordan algebra [22, 14]. However, to our knowledge, explicit formulas of the type derived here have not appeared in the literature before.

While this construction works for the exceptional Lie algebras $E_{6(6)}$ and $E_{7(7)}$, as well as other Lie algebras admitting a three graded structure, it fails for $E_{8(8)}$, $F_{4(4)}$, and $G_{2(2)}$, for which a three grading does not exist. These algebras possess only a five graded structure
\[
\mathfrak{g} = \mathfrak{g}_{-2} \oplus \mathfrak{g}_{-1} \oplus \mathfrak{g}_0 \oplus \mathfrak{g}_{+1} \oplus \mathfrak{g}_{+2}. \tag{10}
\]
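Returning to the $J_2^{\mathbb{C}}$ example: the norm form (2) and the three expressions for the Jordan triple product in (3) can be checked numerically. The sketch below is an illustration only. It assumes the standard Pauli matrices, represents an element $a = a_\mu \sigma^\mu$ by its four real coefficients, and uses helper names (`to_matrix`, `bar`, `minkowski`, and so on) that are hypothetical and not taken from the paper.

```python
import numpy as np

# sigma^mu = (1, sigma) with the standard Pauli matrices.
IDENTITY = np.eye(2, dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def to_matrix(v):
    """Hermitian 2x2 matrix v0*1 + v1*s1 + v2*s2 + v3*s3 for real coefficients v."""
    return v[0] * IDENTITY + sum(v[i + 1] * PAULI[i] for i in range(3))


def bar(v):
    """Coefficients of the conjugate element: the spatial part changes sign."""
    return np.array([v[0], -v[1], -v[2], -v[3]])


def minkowski(a, b):
    """Lorentz invariant bilinear form <a, b> = a0*b0 - a1*b1 - a2*b2 - a3*b3."""
    return a[0] * b[0] - np.dot(a[1:], b[1:])


def jordan(A, B):
    """Jordan product (1)."""
    return 0.5 * (A @ B + B @ A)


def triple_jordan(a, b, c):
    """First form in (3), built from Jordan products only."""
    A, Bb, C = to_matrix(a), to_matrix(bar(b)), to_matrix(c)
    return jordan(jordan(A, Bb), C) + jordan(jordan(C, Bb), A) - jordan(jordan(A, C), Bb)


def triple_matrix(a, b, c):
    """Second form in (3): (a bbar c + c bbar a) / 2."""
    A, Bb, C = to_matrix(a), to_matrix(bar(b)), to_matrix(c)
    return 0.5 * (A @ Bb @ C + C @ Bb @ A)


def triple_bilinear(a, b, c):
    """Third form in (3): <a,b> c + <c,b> a - <a,c> b."""
    return to_matrix(minkowski(a, b) * c + minkowski(c, b) * a - minkowski(a, c) * b)


rng = np.random.default_rng(0)
a, b, c = rng.normal(size=(3, 4))
print(np.allclose(np.linalg.det(to_matrix(a)), minkowski(a, a)))
print(np.allclose(triple_jordan(a, b, c), triple_matrix(a, b, c)))
print(np.allclose(triple_matrix(a, b, c), triple_bilinear(a, b, c)))
```

All three comparisons rely only on the anticommutation relations of the Pauli matrices, so the check is insensitive to the random vectors chosen.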

Our main result, to be described in Sect. 2, states that a “quasiconformal” realization is still possible on a space of dimension dim(g1 ) + 1 if the top grade spaces g±2 are one-dimensional. Five graded Lie algebras with this property are closely related to the so-called Freudenthal Triple Systems [9, 30], which were originally invented to provide alternative constructions of the exceptional Lie groups2 . This relation will be made very explicit in the present paper. The novel realization of E8(8) which we will arrive at, together with its natural extension to E8(−24) , contains various other constructions of exceptional Lie algebras by truncation, including the conformal realizations based on a three graded structure. For this reason, we describe it first in Sect. 2, and then show how the other cases can be obtained from it. Whereas previous attempts to construct generalized space-times mainly focused on generalizing Minkowski space-time and its symmetries, the physical applications that we have in mind here are of a somewhat different nature, and inspired by recent developments in superstring and M-Theory. Namely, the generalized “space-times” presented here could conceivably be identified with certain internal spaces arising in supergravity and superstring theory, which are related to the appearance of central charges in the associated superalgebras. Central charges and their solitonic carriers have been much discussed in the recent literature because it is hoped that they may provide a window on M-Theory and its non-perturbative degrees of freedom. More specifically, it has been argued in [5] that a proper description of the non-perturbative M-Theory degrees of freedom might require supplementing ordinary space-time coordinates by central charge coordinates. 
Solitonic charges also play an important role in the microscopic description of black hole entropy: for maximally extended $N = 8$ supergravity, the latter is conjectured to be given by an $E_{7(7)}$ invariant formula [20, 8]. The corresponding formula for the entropy in maximally extended supergravity in five dimensions is $E_{6(6)}$ invariant and involves a cubic form. In [7] an invariant classification of orbits of $E_{7(7)}$ and $E_{6(6)}$ actions on their fundamental representations that classify BPS states in $d = 4$ and $d = 5$ was given. The entropy formula in [20, 8] is identical to the equation for a vector with vanishing norm in 57 dimensions (see Eq. (27)), provided we use the $SL(8,\mathbb{R})$ form of the quartic $E_{7(7)}$ invariant. This suggests that the 57th component of our $E_{8(8)}$ realization should be interpreted as the entropy. However, we should stress that the quartic invariant can assume both positive and negative values, cf. the simple examples given in Appendix B. In order to avoid imaginary entropy, one must therefore restrict oneself to the positive semidefinite values of the quartic invariant, corresponding to the "time-like" and "light-like" orbits of $E_{7(7)}$ in the language of [7]. With the 57th coordinate interpreted as the entropy and the remaining 56 coordinates as the electric and magnetic charges, it is natural from our point of view to define a distance in this "entropy-charge space" between any two black hole solutions using our Eqs. (25), (26). If two black hole solutions are light-like separated in this space, they will remain so under the action of $E_{8(8)}$.3 We should also point out that it is not entirely clear from the existing black hole literature whether it is the $SU(8)$ or the $SL(8,\mathbb{R})$ form of the invariant that should be used here (the detailed relation between the two is worked out in Appendix B). The $SU(8)$ basis is relevant for the central charges, which appear in the superalgebra via surface integrals at spatial infinity and determine the structure (and length) of BPS multiplets. By contrast, the 28 electric and 28 magnetic charges carried by the solitons of $d = 4$, $N = 8$ supergravity transform separately under $SL(8,\mathbb{R})$ [4], and therefore the $SL(8,\mathbb{R})$ form of the invariant appears more appropriate in this context.

2 The more general Kantor triple systems, for which $\mathfrak{g}_{\pm 2}$ have more than one dimension, will not be discussed in this paper.

For applications to M-Theory it would be important to obtain the exponentiated version of our realization. One might reasonably expect that modular forms with respect to a fractional linear realization of the arithmetic group $E_{8(8)}(\mathbb{Z})$ will have a role to play. We expect that our results will pave the way for the explicit construction of such modular forms. According to [19] these would depend on $28 + 1 = 29$ variables, such that the 57-dimensional Heisenberg subalgebra of $E_{8(8)}$ exhibited here would be realized in terms of 28 "coordinates" and 28 "momenta". Consequently, the 57 dimensions in which $E_{8(8)}$ acts might alternatively be interpreted as a generalized Heisenberg group, in which case the 57th component would play the role of a variable parameter $\hbar$. The action of $E_{8(8)}(\mathbb{Z})$ on the 57 dimensional Heisenberg group would then constitute the invariance group of a generalized Dirac quantization condition. This observation is also in accord with the fact that the term modifying the vector space addition in $\mathbb{R}^{57}$ (cf.
Eq. (25)), which is required by $E_{8(8)}$ invariance, is just the cocycle induced by the standard canonical commutation relations on a $(28+28)$-dimensional phase space.

3 For the exceptional $N = 2$ Maxwell–Einstein supergravity [17] defined by the exceptional Jordan algebra, the U-duality groups in five and four dimensions are $E_{6(-26)}$ and $E_{7(-25)}$, respectively. The quasi-conformal symmetry of the exceptional supergravity in four dimensions is hence $E_{8(-24)}$, with the maximal compact subgroup $E_7 \times SU(2)$.

2. Quasiconformal Realization of $E_{8(8)}$

2.1. $E_{7(7)}$ decomposition of $E_{8(8)}$. We will start with the maximal case, the exceptional Lie group $E_{8(8)}$, and its quasiconformal realization on $\mathbb{R}^{57}$, because this realization contains all others by truncation. Our results are based on the following five graded decomposition of $E_{8(8)}$ with respect to its $E_{7(7)} \times D$ subgroup
\[
\begin{array}{ccccccccc}
\mathfrak{g}_{-2} & \oplus & \mathfrak{g}_{-1} & \oplus & \mathfrak{g}_0 & \oplus & \mathfrak{g}_{+1} & \oplus & \mathfrak{g}_{+2} \\
\mathbf{1} & \oplus & \mathbf{56} & \oplus & (\mathbf{133} \oplus \mathbf{1}) & \oplus & \mathbf{56} & \oplus & \mathbf{1}
\end{array} \tag{11}
\]
with the one-dimensional group $D$ consisting of dilatations. $D$ itself is part of an $SL(2,\mathbb{R})$ group, and the above decomposition thus corresponds to the decomposition $\mathbf{248} \to (\mathbf{133}, \mathbf{1}) \oplus (\mathbf{56}, \mathbf{2}) \oplus (\mathbf{1}, \mathbf{3})$ of $E_{8(8)}$ under its subgroup $E_{7(7)} \times SL(2,\mathbb{R})$. In order to write out the $E_{7(7)}$ generators, it is convenient to further decompose them w.r.t. the subgroup $SL(8,\mathbb{R})$ of $E_{7(7)}$. In this basis, the Lie algebra of $E_{7(7)}$ is spanned by the $SL(8,\mathbb{R})$ generators $G^i{}_j$, and the antisymmetric generators $G^{ijkl}$, transforming in the $\mathbf{63}$ and $\mathbf{70}$ representations of $SL(8,\mathbb{R})$, respectively. We also define
\[
G_{ijkl} := \tfrac{1}{24} \epsilon_{ijklmnpq} G^{mnpq}
\]
with $SL(8,\mathbb{R})$ indices $1 \le i, j, \dots \le 8$. The commutation relations are
\[
\begin{aligned}
[G^i{}_j, G^k{}_l] &= \delta^k_j G^i{}_l - \delta^i_l G^k{}_j, \\
[G^i{}_j, G^{klmn}] &= -4\, \delta^{[k}_j G^{lmn]i} - \tfrac{1}{2} \delta^i_j G^{klmn}, \\
[G^{ijkl}, G^{mnpq}] &= \tfrac{1}{36} \epsilon^{ijkls[mnp} G^{q]}{}_s.
\end{aligned}
\]

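The fundamental $\mathbf{56}$ representation of $E_{7(7)}$ is spanned by the two antisymmetric real tensors $X^{ij}$ and $X_{ij}$, and the action of $E_{7(7)}$ is given by$^4$
\[
\begin{aligned}
\delta X^{ij} &= \Lambda^i{}_k X^{kj} - \Lambda^j{}_k X^{ki} + \Sigma^{ijkl} X_{kl}, \\
\delta X_{ij} &= \Lambda^k{}_i X_{jk} - \Lambda^k{}_j X_{ik} + \Sigma_{ijkl} X^{kl},
\end{aligned} \tag{12}
\]
where
\[
\Sigma_{ijkl} = \tfrac{1}{24} \epsilon_{ijklmnpq} \Sigma^{mnpq}. \tag{13}
\]
In order to extend $E_{7(7)} \times D$ to the full $E_{8(8)}$, we must enlarge $D$ to an $SL(2,\mathbb{R})$ with generators $(E, F, H)$ in the standard Chevalley basis, together with $2 \times 56$ further real generators $(E_{ij}, E^{ij})$ and $(F_{ij}, F^{ij})$. Under hermitian conjugation, we have
\[
E^{ij} = F_{ij}^\dagger, \qquad F^{ij} = -E_{ij}^\dagger, \qquad \text{and} \qquad E = -F^\dagger.
\]
The grade $-2$, $-1$, $1$ and $2$ subspaces in the above decomposition correspond to the subspaces $\mathfrak{g}_{-2}$, $\mathfrak{g}_{-1}$, $\mathfrak{g}_{1}$, and $\mathfrak{g}_{2}$ in (11), respectively:
\[
E \oplus \{E^{ij}, E_{ij}\} \oplus \{G^{ijkl}, G^i{}_j; H\} \oplus \{F^{ij}, F_{ij}\} \oplus F. \tag{14}
\]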
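The dimensions quoted in the decompositions (11) and (14) can be cross-checked with a few lines of arithmetic. The snippet below is a bookkeeping sketch only; the variable names are illustrative and the representation dimensions are those stated in the text.

```python
from math import comb

# SL(8,R) pieces of E7(7): traceless 8x8 generators and antisymmetric 4-index generators.
dim_sl8 = 8 * 8 - 1        # 63
dim_four_form = comb(8, 4)  # 70
dim_e7 = dim_sl8 + dim_four_form

# The 56 consists of two antisymmetric 8x8 tensors X^{ij} and X_{ij}.
dim_56 = 2 * comb(8, 2)

# Five grading (11): 1 + 56 + (133 + 1) + 56 + 1.
grades = {-2: 1, -1: dim_56, 0: dim_e7 + 1, 1: dim_56, 2: 1}
dim_e8 = sum(grades.values())

# Decomposition 248 -> (133,1) + (56,2) + (1,3) under E7(7) x SL(2,R).
dim_e8_alt = dim_e7 * 1 + dim_56 * 2 + 1 * 3

print(dim_e7, dim_56, dim_e8, dim_e8_alt)
```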

The grading may be read off from the commutators with $H$:
\[
\begin{aligned}
[H, E] &= -2E, & [H, F] &= 2F, \\
[H, E^{ij}] &= -E^{ij}, & [H, F^{ij}] &= F^{ij}, \\
[H, E_{ij}] &= -E_{ij}, & [H, F_{ij}] &= F_{ij}.
\end{aligned}
\]
The new generators $(E_{ij}, E^{ij})$ and $(F_{ij}, F^{ij})$ form two (maximal) Heisenberg subalgebras of dimension 28
\[
[E^{ij}, E_{kl}] = 2\, \delta^{ij}_{kl} E, \qquad [F^{ij}, F_{kl}] = 2\, \delta^{ij}_{kl} F,
\]
and they transform under $SL(8,\mathbb{R})$ as
\[
\begin{aligned}
[G^i{}_j, E^{kl}] &= \delta^k_j E^{il} - \delta^l_j E^{ik} - \tfrac{1}{4} \delta^i_j E^{kl}, \\
[G^i{}_j, E_{kl}] &= \delta^i_k E_{lj} - \delta^i_l E_{kj} + \tfrac{1}{4} \delta^i_j E_{kl}, \\
[G^i{}_j, F^{kl}] &= \delta^k_j F^{il} - \delta^l_j F^{ik} - \tfrac{1}{4} \delta^i_j F^{kl}, \\
[G^i{}_j, F_{kl}] &= \delta^i_k F_{lj} - \delta^i_l F_{kj} + \tfrac{1}{4} \delta^i_j F_{kl}.
\end{aligned}
\]

4 We emphasize that $X^{ij}$ and $X_{ij}$ are independent. This convention differs from the one used for the $SU(8)$ basis in the appendix.

The remaining non-vanishing commutation relations are given by
\[
[E, F] = H
\]
and
\[
\begin{aligned}
[G^{ijkl}, E^{mn}] &= -\tfrac{1}{24} \epsilon^{ijklmnpq} E_{pq}, & [G^{ijkl}, E_{mn}] &= -\delta^{[ij}_{mn} E^{kl]}, \\
[G^{ijkl}, F^{mn}] &= -\tfrac{1}{24} \epsilon^{ijklmnpq} F_{pq}, & [G^{ijkl}, F_{mn}] &= -\delta^{[ij}_{mn} F^{kl]}, \\
[E^{ij}, F^{kl}] &= 12\, G^{ijkl}, & [E_{ij}, F_{kl}] &= -12\, G_{ijkl}, \\
[E^{ij}, F_{kl}] &= 4\, \delta^{[i}_{[k} G^{j]}{}_{l]} - \delta^{ij}_{kl} H, & [E_{ij}, F^{kl}] &= 4\, \delta^{[k}_{[i} G^{l]}{}_{j]} + \delta^{kl}_{ij} H, \\
[E, F^{ij}] &= -E^{ij}, & [E, F_{ij}] &= -E_{ij}, \\
[F, E^{ij}] &= F^{ij}, & [F, E_{ij}] &= F_{ij}.
\end{aligned}
\]
To see that we are really dealing with the maximally split form of $E_{8(8)}$, let us count the number of compact generators: the antisymmetric part $(G^i{}_j - G^j{}_i)$ of $G^i{}_j$ and $(G^{ijkl} - G_{ijkl})$ correspond to the 63 generators of the maximal compact subalgebra $SU(8)$ of $E_{7(7)}$ [4]. The remaining compact generators are the $28 + 28 + 1$ anti-hermitian generators $(E_{ij} + F^{ij})$, $(E^{ij} - F_{ij})$, and $(E + F)$, giving a total of 120 generators which close into the maximal compact subgroup $SO(16) \supset SU(8)$ of $E_{8(8)}$.

An important role is played by the symplectic invariant of two $\mathbf{56}$ representations. It is given by
\[
\langle X, Y \rangle := X^{ij} Y_{ij} - X_{ij} Y^{ij}. \tag{15}
\]
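The invariance of (15) under the $E_{7(7)}$ action (12) can be checked numerically. The sketch below is illustrative and rests on stated assumptions: indices run over 0..7, `lam[i, k]` stands for $\Lambda^i{}_k$, the parameters are random, the Levi-Civita symbol is taken with $\epsilon_{01234567} = +1$, and all function names are hypothetical. It builds $\Sigma_{ijkl}$ from $\Sigma^{ijkl}$ via (13) and verifies that $\langle \delta X, Y \rangle + \langle X, \delta Y \rangle$ vanishes.

```python
import itertools
import numpy as np


def perm_sign(perm):
    """Sign of a permutation of 0..n-1, computed by sorting with transpositions."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign


def random_antisym(rng):
    """Random antisymmetric 8x8 tensor."""
    m = rng.normal(size=(8, 8))
    return m - m.T


def random_four_form(rng):
    """Random totally antisymmetric tensor Sigma^{ijkl}."""
    t = rng.normal(size=(8, 8, 8, 8))
    out = np.zeros_like(t)
    for p in itertools.permutations(range(4)):
        out += perm_sign(p) * np.transpose(t, p)
    return out / 24.0


def dual_four_form(s_up):
    """Sigma_{ijkl} = (1/24) eps_{ijklmnpq} Sigma^{mnpq}, cf. (13)."""
    s_dn = np.zeros_like(s_up)
    for p in itertools.permutations(range(8)):
        s_dn[p[0], p[1], p[2], p[3]] += perm_sign(p) * s_up[p[4], p[5], p[6], p[7]] / 24.0
    return s_dn


def variation(lam, s_up, s_dn, x):
    """Variation (12) of X = (X^{ij}, X_{ij})."""
    x_up, x_dn = x
    lx = lam @ x_up   # Lambda^i_k X^{kj}
    xl = x_dn @ lam   # X_{jk} Lambda^k_i, stored with indices [j, i]
    d_up = lx - lx.T + np.einsum("ijkl,kl->ij", s_up, x_dn)
    d_dn = xl.T - xl + np.einsum("ijkl,kl->ij", s_dn, x_up)
    return d_up, d_dn


def symplectic(x, y):
    """Symplectic invariant (15), with full index contractions."""
    return np.sum(x[0] * y[1]) - np.sum(x[1] * y[0])


rng = np.random.default_rng(1)
lam = rng.normal(size=(8, 8))
lam -= np.trace(lam) / 8.0 * np.eye(8)   # traceless, i.e. an sl(8) parameter
s_up = random_four_form(rng)
s_dn = dual_four_form(s_up)
X = (random_antisym(rng), random_antisym(rng))
Y = (random_antisym(rng), random_antisym(rng))

change = symplectic(variation(lam, s_up, s_dn, X), Y) + symplectic(X, variation(lam, s_up, s_dn, Y))
print(abs(change))
```

The $\Lambda$ terms cancel between the upper and lower tensors, while the $\Sigma$ terms cancel because a totally antisymmetric four-index tensor is symmetric under exchange of its two index pairs.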

The second structure which we need to introduce is the triple product. This is a trilinear map $\mathbf{56} \times \mathbf{56} \times \mathbf{56} \to \mathbf{56}$, which associates to three elements $X$, $Y$ and $Z$ another element transforming in the $\mathbf{56}$ representation, denoted by $(X, Y, Z)$, and defined by
\[
\begin{aligned}
(X, Y, Z)^{ij} := {} & -8\, X^{ik} Y_{kl} Z^{lj} - 8\, Y^{ik} X_{kl} Z^{lj} - 8\, Y^{ik} Z_{kl} X^{lj} \\
& - 2\, Y^{ij} X^{kl} Z_{kl} - 2\, X^{ij} Y^{kl} Z_{kl} - 2\, Z^{ij} Y^{kl} X_{kl} \\
& + \tfrac{1}{2} \epsilon^{ijklmnpq} X_{kl} Y_{mn} Z_{pq}, \\
(X, Y, Z)_{ij} := {} & 8\, X_{ik} Y^{kl} Z_{lj} + 8\, Y_{ik} X^{kl} Z_{lj} + 8\, Y_{ik} Z^{kl} X_{lj} \\
& + 2\, Y_{ij} Z^{kl} X_{kl} + 2\, X_{ij} Z^{kl} Y_{kl} + 2\, Z_{ij} X^{kl} Y_{kl} \\
& - \tfrac{1}{2} \epsilon_{ijklmnpq} X^{kl} Y^{mn} Z^{pq}.
\end{aligned} \tag{16}
\]
A somewhat tedious calculation$^5$ shows that this triple product obeys the relations
\[
\begin{aligned}
(X, Y, Z) &= (Y, X, Z) + 2 \langle X, Y \rangle Z, \\
(X, Y, Z) &= (Z, Y, X) - 2 \langle X, Z \rangle Y, \\
\langle (X, Y, Z), W \rangle &= \langle (X, W, Z), Y \rangle - 2 \langle X, Z \rangle \langle Y, W \rangle, \\
(X, Y, (V, W, Z)) &= (V, W, (X, Y, Z)) + ((X, Y, V), W, Z) + (V, (Y, X, W), Z).
\end{aligned} \tag{17}
\]

5 Which relies heavily on the Schouten identity $\epsilon^{[ijklmnpq} X^{r]s} = 0$.

We note that the triple product (16) could be modified by terms involving the symplectic invariant, such as $\langle X, Y \rangle Z$; the above choice has been made in order to obtain agreement with the formulas of [6]. While there is no (symmetric) quadratic invariant of $E_{7(7)}$ in the $\mathbf{56}$ representation, a real quartic invariant $I_4$ can be constructed by means of the above triple product and the bilinear form; it reads
\[
\begin{aligned}
I_4(X^{ij}, X_{ij}) := {} & \tfrac{1}{48} \langle (X, X, X), X \rangle \\
\equiv {} & X^{ij} X_{jk} X^{kl} X_{li} - \tfrac{1}{4} X^{ij} X_{ij} X^{kl} X_{kl} \\
& + \tfrac{1}{96} \epsilon^{ijklmnpq} X_{ij} X_{kl} X_{mn} X_{pq} + \tfrac{1}{96} \epsilon_{ijklmnpq} X^{ij} X^{kl} X^{mn} X^{pq}.
\end{aligned} \tag{18}
\]
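The equality of the two expressions in (18) can be verified numerically from (15) and (16). The sketch below is an illustration under stated assumptions: it only uses the diagonal case $X = Y = Z$ of (16), takes all index contractions as full sums, uses the same Levi-Civita convention ($\epsilon_{01234567} = +1$) for upper and lower indices, and its function names are hypothetical.

```python
import itertools
import numpy as np


def perm_sign(perm):
    """Sign of a permutation of 0..n-1, computed by sorting with transpositions."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign


def eps_cubic(a):
    """out[i, j] = eps_{ijklmnpq} a[k, l] a[m, n] a[p, q], summed over k..q."""
    out = np.zeros((8, 8))
    for p in itertools.permutations(range(8)):
        out[p[0], p[1]] += perm_sign(p) * a[p[2], p[3]] * a[p[4], p[5]] * a[p[6], p[7]]
    return out


def cubic_map(x_up, x_dn):
    """(X, X, X)^{ij} and (X, X, X)_{ij} from (16) with X = Y = Z."""
    s = np.sum(x_up * x_dn)   # X^{kl} X_{kl}
    t_up = -24.0 * x_up @ x_dn @ x_up - 6.0 * s * x_up + 0.5 * eps_cubic(x_dn)
    t_dn = 24.0 * x_dn @ x_up @ x_dn + 6.0 * s * x_dn - 0.5 * eps_cubic(x_up)
    return t_up, t_dn


def quartic_from_triple(x_up, x_dn):
    """I_4 = <(X, X, X), X> / 48, using the symplectic form (15)."""
    t_up, t_dn = cubic_map(x_up, x_dn)
    return (np.sum(t_up * x_dn) - np.sum(t_dn * x_up)) / 48.0


def quartic_explicit(x_up, x_dn):
    """Explicit polynomial on the right-hand side of (18)."""
    s = np.sum(x_up * x_dn)
    tr = np.trace(x_up @ x_dn @ x_up @ x_dn)   # X^{ij} X_{jk} X^{kl} X_{li}
    eps_terms = np.sum(eps_cubic(x_dn) * x_dn) + np.sum(eps_cubic(x_up) * x_up)
    return tr - 0.25 * s * s + eps_terms / 96.0


rng = np.random.default_rng(2)
m1, m2 = rng.normal(size=(2, 8, 8))
x_up, x_dn = m1 - m1.T, m2 - m2.T   # independent antisymmetric tensors
i4_a = quartic_from_triple(x_up, x_dn)
i4_b = quartic_explicit(x_up, x_dn)
print(i4_a, i4_b)
```

Because $I_4$ is homogeneous of degree four, rescaling both tensors by a factor $\lambda$ should multiply the result by $\lambda^4$, which gives a second quick consistency check.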

2.2. Quasiconformal nonlinear realization of $E_{8(8)}$. We will now exhibit a nonlinear realization of $E_{8(8)}$ on the 57-dimensional real vector space with coordinates $\mathcal{X} := (X^{ij}, X_{ij}, x)$, where $x$ is also real. While $x$ is an $E_{7(7)}$ singlet, the remaining 56 variables transform linearly under $E_{7(7)}$. Thus $\mathcal{X}$ forms the $\mathbf{56} \oplus \mathbf{1}$ representation of $E_{7(7)}$. In writing the transformation rules we will omit the transformation parameters in order not to make the formulas (and notation) too cumbersome. To recover the infinitesimal variations, one must simply contract the formulas with the appropriate transformation parameters. The $E_{7(7)}$ subalgebra acts linearly by

\[
\begin{aligned}
G^i{}_j(X^{kl}) &= 2\,\delta^k_j X^{il} - \tfrac14\,\delta^i_j X^{kl}, &
G^{ijkl}(X^{mn}) &= \tfrac{1}{24}\,\varepsilon^{ijklmnpq}X_{pq}, \\
G^i{}_j(X_{kl}) &= -2\,\delta^i_k X_{jl} + \tfrac14\,\delta^i_j X_{kl}, &
G^{ijkl}(X_{mn}) &= \delta^{[ij}_{mn} X^{kl]}, \\
G^i{}_j(x) &= 0, &
G^{ijkl}(x) &= 0.
\end{aligned}
\tag{19}
\]

$H$ generates scale transformations

\[
H(X^{ij}) = X^{ij}, \qquad H(X_{ij}) = X_{ij}, \qquad H(x) = 2\,x,
\tag{20}
\]

and the $E$ generators act as translations; we have

\[
E(X^{ij}) = 0, \qquad E(X_{ij}) = 0, \qquad E(x) = 1
\tag{21}
\]

and

\[
\begin{aligned}
E^{ij}(X^{kl}) &= 0, & E^{ij}(X_{kl}) &= \delta^{ij}_{kl}, & E^{ij}(x) &= -X^{ij}, \\
E_{ij}(X^{kl}) &= \delta^{kl}_{ij}, & E_{ij}(X_{kl}) &= 0, & E_{ij}(x) &= X_{ij}.
\end{aligned}
\tag{22}
\]

Conformal Realizations of Exceptional Lie Groups


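By contrast, the $F$ generators are realized nonlinearly:

\[
\begin{aligned}
F(X^{ij}) &= -\tfrac16\,(X,X,X)^{ij} + X^{ij}x \\
&\equiv 4\,X^{ik}X_{kl}X^{lj} + X^{ij}X^{kl}X_{kl}
  - \tfrac{1}{12}\,\varepsilon^{ijklmnpq}X_{kl}X_{mn}X_{pq} + X^{ij}x, \\
F(X_{ij}) &= -\tfrac16\,(X,X,X)_{ij} + X_{ij}x \\
&\equiv -4\,X_{ik}X^{kl}X_{lj} - X_{ij}X^{kl}X_{kl}
  + \tfrac{1}{12}\,\varepsilon_{ijklmnpq}X^{kl}X^{mn}X^{pq} + X_{ij}x, \\
F(x) &= 4\,I_4(X^{ij}, X_{ij}) + x^2 \\
&\equiv 4\,X^{ij}X_{jk}X^{kl}X_{li} - X^{ij}X_{ij}X^{kl}X_{kl} \\
&\quad + \tfrac{1}{24}\,\varepsilon^{ijklmnpq}X_{ij}X_{kl}X_{mn}X_{pq}
  + \tfrac{1}{24}\,\varepsilon_{ijklmnpq}X^{ij}X^{kl}X^{mn}X^{pq} + x^2.
\end{aligned}
\tag{23}
\]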

Observe that the form of the r.h.s. is dictated by the requirement of $E_{7(7)}$ covariance: $(F(X^{ij}), F(X_{ij}))$ and $F(x)$ must still transform as the $\mathbf{56}$ and $\mathbf{1}$ of $E_{7(7)}$, respectively. The action of the remaining generators is likewise $E_{7(7)}$ covariant:

\[
\begin{aligned}
F^{ij}(X^{kl}) &= -4\,X^{i[k}X^{l]j} + \tfrac14\,\varepsilon^{ijklmnpq}X_{mn}X_{pq}, \\
F^{ij}(X_{kl}) &= +8\,\delta^{[i}_{k}X^{j]m}X_{ml} + \delta^{ij}_{kl}\,X^{mn}X_{mn} + 2\,X^{ij}X_{kl} - \delta^{ij}_{kl}\,x, \\
F_{ij}(X^{kl}) &= -8\,\delta^{k}_{[i}X_{j]m}X^{ml} + \delta^{kl}_{ij}\,X^{mn}X_{mn} - 2\,X_{ij}X^{kl} - \delta^{kl}_{ij}\,x, \\
F_{ij}(X_{kl}) &= 4\,X_{k[i}X_{j]l} - \tfrac14\,\varepsilon_{ijklmnpq}X^{mn}X^{pq}, \\
F^{ij}(x) &= 4\,X^{ik}X_{kl}X^{lj} + X^{ij}X^{kl}X_{kl} - \tfrac{1}{12}\,\varepsilon^{ijklmnpq}X_{kl}X_{mn}X_{pq} + X^{ij}x, \\
F_{ij}(x) &= 4\,X_{ik}X^{kl}X_{lj} + X_{ij}X^{kl}X_{kl} - \tfrac{1}{12}\,\varepsilon_{ijklmnpq}X^{kl}X^{mn}X^{pq} - X_{ij}x.
\end{aligned}
\tag{24}
\]

Although $E_{7(7)}$ covariance considerably constrains the expressions that can appear on the r.h.s., it does not fix them uniquely: as for the triple product (16) one could add further terms involving the symplectic invariant. However, all ambiguities are removed by imposing closure of the algebra, and we have checked by explicit computation that the above variations do close into the full $E_{8(8)}$ algebra in the basis given in the previous section. This is the crucial consistency check.

The term “quasiconformal realization” is motivated by the existence of a norm form that is left invariant up to a (possibly coordinate dependent) factor under all transformations. To write it down we must first define a nonlinear “difference” between two points $\mathcal{X} \equiv (X^{ij}, X_{ij}; x)$ and $\mathcal{Y} \equiv (Y^{ij}, Y_{ij}; y)$; curiously, the standard difference is not invariant under the translations $(E^{ij}, E_{ij})$. Rather, we must choose

\[
\delta(\mathcal{X}, \mathcal{Y}) := \bigl(X^{ij} - Y^{ij},\; X_{ij} - Y_{ij};\; x - y + \langle X, Y\rangle\bigr).
\tag{25}
\]


This difference still obeys $\delta(\mathcal{X}, \mathcal{Y}) = -\delta(\mathcal{Y}, \mathcal{X})$ and thus $\delta(\mathcal{X}, \mathcal{X}) = 0$, and is now invariant under $(E^{ij}, E_{ij})$ as well as $E$; however, it is no longer additive. In fact, with the sum of two vectors being defined as $\delta(\mathcal{X}, -\mathcal{Y})$, the extra term involving $\langle X, Y\rangle$ can be interpreted as the cocycle induced by the standard canonical commutation relations. The relevant invariant is a linear combination of $x^2$ and the quartic $E_{7(7)}$ invariant $I_4$, viz.

\[
N_4(\mathcal{X}) \equiv N_4(X^{ij}, X_{ij}; x) := 4\,I_4(X) - x^2.
\tag{26}
\]
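The stated properties of the nonlinear difference (25) can be checked numerically. The following is a minimal sketch under explicit assumptions, not the authors' computation: the $\mathbf{56}$ is modelled as a pair of 28-component integer tuples (upper and lower index components), the symplectic form is assumed to be $\langle X, Y\rangle = X^{ij}Y_{ij} - X_{ij}Y^{ij}$ up to normalization (definition (15) is not repeated in this section), and the translations are assumed to act as $X \to X + A$, $x \to x + c + \langle A, X\rangle$, which is how (21) and (22) read once contracted with parameters. The names `symp`, `delta`, `translate` and `negate` are hypothetical helpers.

```python
import random

def symp(X, Y):
    """Antisymmetric form <X, Y>; sign and normalization are assumptions."""
    (Xu, Xl), (Yu, Yl) = X, Y
    return sum(a * b for a, b in zip(Xu, Yl)) - sum(a * b for a, b in zip(Xl, Yu))

def delta(P, Q):
    """Nonlinear difference (25) of the points P = (X, x) and Q = (Y, y)."""
    (X, x), (Y, y) = P, Q
    diff = tuple(tuple(a - b for a, b in zip(s, t)) for s, t in zip(X, Y))
    return diff, x - y + symp(X, Y)

def translate(P, A, c=0):
    """Assumed translation: X -> X + A, x -> x + c + <A, X>."""
    X, x = P
    shifted = tuple(tuple(a + b for a, b in zip(s, t)) for s, t in zip(X, A))
    return shifted, x + c + symp(A, X)

def negate(D):
    """Componentwise negative of a difference vector."""
    (Du, Dl), d = D
    return (tuple(-a for a in Du), tuple(-a for a in Dl)), -d

def random_vec(rng, n=28):
    """Random integer element of the 56 = 28 + 28 (exact arithmetic)."""
    return (tuple(rng.randint(-5, 5) for _ in range(n)),
            tuple(rng.randint(-5, 5) for _ in range(n)))

rng = random.Random(0)
X, Y, A = random_vec(rng), random_vec(rng), random_vec(rng)
P, Q = (X, 3), (Y, -7)

antisymmetric = delta(P, Q) == negate(delta(Q, P))
vanishes_on_diagonal = delta(P, P)[1] == 0
translation_invariant = delta(translate(P, A, 2), translate(Q, A, 2)) == delta(P, Q)
```

Integer components keep all comparisons exact. The invariance holds because the shift $\langle A, X\rangle - \langle A, Y\rangle$ of $x - y$ is cancelled by the change of $\langle X, Y\rangle$, which is the cocycle structure mentioned above.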

In order to ensure invariance under the translation generators, we consider the expression $N_4(\delta(\mathcal{X}, \mathcal{Y}))$, which is manifestly invariant under the linearly realized subgroup $E_{7(7)}$. Remarkably, it also transforms into itself up to an overall factor under the action of the nonlinearly realized generators. More specifically, we find

\[
\begin{aligned}
F\,N_4(\delta(\mathcal{X}, \mathcal{Y})) &= 2\,(x + y)\,N_4(\delta(\mathcal{X}, \mathcal{Y})), \\
F^{ij}\,N_4(\delta(\mathcal{X}, \mathcal{Y})) &= 2\,(X^{ij} + Y^{ij})\,N_4(\delta(\mathcal{X}, \mathcal{Y})), \\
H\,N_4(\delta(\mathcal{X}, \mathcal{Y})) &= 4\,N_4(\delta(\mathcal{X}, \mathcal{Y})).
\end{aligned}
\]

Therefore, for every $\mathcal{Y} \in \mathbb{R}^{57}$ the “light cone” with base point $\mathcal{Y}$, defined by the set of $\mathcal{X} \in \mathbb{R}^{57}$ obeying

\[
N_4(\delta(\mathcal{X}, \mathcal{Y})) = 0,
\tag{27}
\]

is preserved by the full $E_{8(8)}$ group, and in this sense, $N_4$ is a “conformal invariant” of $E_{8(8)}$. We note that the light cones defined by the above equation are not only curved hypersurfaces in $\mathbb{R}^{57}$, but get deformed as one varies the base point $\mathcal{Y}$. As we will show in Appendix B, the quartic invariant $I_4$ can take both positive and negative values, but in the latter case Eq. (27) does not have real solutions. However, we can remedy this problem by extending the representation space to $\mathbb{C}^{57}$ and using the same formulas to get a realization of the complexified Lie algebra $E_8(\mathbb{C})$ on $\mathbb{C}^{57}$. The existence of a fourth order conformal invariant of $E_{8(8)}$ is noteworthy in view of the fact that no irreducible fourth order invariant exists for the linearly realized $E_{8(8)}$ group (the next invariant after the quadratic Casimir being of order eight).

2.3. Relation with Freudenthal Triple Systems. We will now rewrite the nonlinear transformation rules in another form in order to establish contact with mathematical literature. Both the bilinear form (15) and the triple product (16) already appear in [6], albeit in a very different guise. That work starts from 2 × 2 “matrices” of the form

\[
A = \begin{pmatrix} \alpha_1 & x_1 \\ x_2 & \alpha_2 \end{pmatrix},
\tag{28}
\]

where $\alpha_1$, $\alpha_2$ are real numbers and $x_1$, $x_2$ are elements of a simple Jordan algebra $J$ of degree three. There are only four simple Jordan algebras $J$ of this type, namely the 3 × 3 hermitian matrices over the four division algebras, $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$ and $\mathbb{O}$. The associated matrices are then related to non-compact forms of the exceptional Lie algebras $F_4$, $E_6$, $E_7$, and $E_8$, respectively. For simplicity, let us concentrate on the maximal case $J_3^{\mathbb{O}_S}$, when the matrix $A$ carries 1+1+27+27 = 56 degrees of freedom. This counting suggests


an obvious relation with the $\mathbf{56}$ of $E_{7(7)}$ and its decomposition under $E_{6(6)}$, but more work is required to make the connection precise. To this aim, [6] defines a symplectic invariant $\langle A, B\rangle$, and a trilinear product mapping three such matrices $A$, $B$ and $C$ to another one, denoted by $(A, B, C)$. This triple system differs from a Jordan triple system in that it is not derivable from a binary product. The formulas for the triple product in terms of the matrices $A$, $B$ and $C$ given in [6] are somewhat cumbersome, lacking manifest $E_{7(7)}$ covariance. For this reason, instead of directly verifying that our prescription (16) and the one of [6] coincide, we have checked that they satisfy identical relations: a quick glance shows that the relations (T1)–(T4) [6] are indeed the same as our relations (17), which are manifestly $E_{7(7)}$ covariant.

To rewrite the transformation formulas we introduce Lie algebra generators $U_A$ and $\tilde U_A$ labeled by the above matrices, as well as generators $S_{AB}$ labeled by a pair of such matrices. For the grade ±2 subspaces we would in general need another set of generators $K_{AB}$ and $\tilde K_{AB}$ labeled by two matrices, but since these subspaces are one-dimensional in the present case, we have only two more generators $K_a$ and $\tilde K_a$ labelled by one real number $a$. In the same vein, we reinterpret the 57 coordinates $\mathcal{X}$ as a pair $(X, x)$, where $X$ is a 2 × 2 matrix of the type defined above. The variations then take the simple form

\[
\begin{aligned}
K_a(X) &= 0, & K_a(x) &= 2\,a, \\
U_A(X) &= A, & U_A(x) &= \langle A, X\rangle, \\
S_{AB}(X) &= (A, B, X), & S_{AB}(x) &= 2\,\langle A, B\rangle\,x, \\
\tilde U_A(X) &= \tfrac12\,(X, A, X) - A\,x, & \tilde U_A(x) &= -\tfrac16\,\langle (X,X,X), A\rangle + \langle X, A\rangle\,x, \\
\tilde K_a(X) &= -\tfrac16\,a\,(X,X,X) + a\,X\,x, & \tilde K_a(x) &= \tfrac16\,a\,\langle (X,X,X), X\rangle + 2\,a\,x^2.
\end{aligned}
\tag{29}
\]

From these formulas it is straightforward to determine the commutation relations of the transformations. To expose the connection with the more general Kantor triple systems we write

\[
K_{AB} \equiv K_{\langle A, B\rangle}
\tag{30}
\]

in the formulas below. The consistency of this specialization is ensured by the relations (17). By explicit computation one finds

\[
\begin{aligned}
[U_A, \tilde U_B] &= S_{AB}, \\
[U_A, U_B] &= -K_{AB}, \\
[\tilde U_A, \tilde U_B] &= -\tilde K_{AB}, \\
[S_{AB}, U_C] &= -U_{(A,B,C)}, \\
[S_{AB}, \tilde U_C] &= -\tilde U_{(B,A,C)}, \\
[K_{AB}, \tilde U_C] &= U_{(A,C,B)} - U_{(B,C,A)}, \\
[\tilde K_{AB}, U_C] &= \tilde U_{(B,C,A)} - \tilde U_{(A,C,B)}, \\
[S_{AB}, S_{CD}] &= -S_{(A,B,C)D} - S_{C(B,A,D)}, \\
[S_{AB}, K_{CD}] &= K_{A(C,B,D)} - K_{A(D,B,C)}, \\
[S_{AB}, \tilde K_{CD}] &= \tilde K_{(D,A,C)B} - \tilde K_{(C,A,D)B}, \\
[K_{AB}, \tilde K_{CD}] &= S_{(B,C,A)D} - S_{(A,C,B)D} - S_{(B,D,A)C} + S_{(A,D,B)C}.
\end{aligned}
\tag{31}
\]


For general $K_{AB}$, these are the defining commutation relations of a Kantor triple system, and, with the further specification (30), those of a Freudenthal triple system (FTS). Freudenthal introduced these triple systems in his study of the metasymplectic geometries associated with exceptional groups [10]; these geometries were further studied in [1, 6, 30, 24]$^6$. A classification of FTS’s may be found in [24], where it is also shown that there is a one-to-one correspondence between simple Lie algebras and simple FTS’s with a non-degenerate bilinear form. Hence there is a quasiconformal realization of every Lie group acting on a generalized lightcone.

3. Truncations of $E_{8(8)}$

For the lower rank exceptional groups contained in $E_{8(8)}$, we can derive similar conformal or quasiconformal realizations by truncation. In this section, we will first give the list of quasiconformal realizations contained in $E_{8(8)}$. In the second part of this section, we consider truncations to a three graded structure, which will yield conformal realizations. In particular, we will work out the conformal realization of $E_{7(7)}$ on a space of 27 dimensions as an example, which is again the maximal example of its kind.

3.1. More quasiconformal realizations. All simple Lie algebras (except for SU(2)) can be given a five graded structure (10) with respect to some subalgebra of maximal rank and one can associate a triple system with the grade +1 subspace [23, 2]. Conversely, one can construct every simple Lie algebra over the corresponding triple system. The realization of $E_8$ over the FTS defined by the exceptional Jordan algebra can be truncated to the realizations of $E_7$, $E_6$, and $F_4$ by restricting oneself to subalgebras defined by quaternionic, complex, and real Hermitian 3 × 3 matrices. Analogously the non-linear realization of $E_{8(8)}$ given in the previous section can be truncated to nonlinear realizations of $E_{7(7)}$, $E_{6(6)}$, and $F_{4(4)}$. These truncations preserve the five grading.
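The sizes of the truncated triple systems follow from simple counting, which can be sketched as below. This is an illustration only: it assumes that the grade +1 subspace is spanned by the 2 × 2 “matrices” (28), i.e. two real numbers and two elements of the Jordan algebra of hermitian 3 × 3 matrices over a division algebra of real dimension $k$; the helper names `jordan_dim` and `fts_dim` are hypothetical.

```python
# Real dimensions of the four division algebras R, C, H, O.
DIVISION_ALGEBRA_DIM = {"R": 1, "C": 2, "H": 4, "O": 8}

def jordan_dim(k):
    """Dimension of the hermitian 3x3 matrices over an algebra of real dimension k:
    3 real diagonal entries plus 3 independent off-diagonal entries."""
    return 3 + 3 * k

def fts_dim(k):
    """Dimension of the matrices (28): two real numbers and two Jordan elements."""
    return 2 + 2 * jordan_dim(k)

jordan_dims = {name: jordan_dim(k) for name, k in DIVISION_ALGEBRA_DIM.items()}
fts_dims = {name: fts_dim(k) for name, k in DIVISION_ALGEBRA_DIM.items()}

for name in DIVISION_ALGEBRA_DIM:
    print(name, jordan_dims[name], fts_dims[name])
```

Under these assumptions the octonionic case gives 27 and 56, in agreement with the counting 1+1+27+27 = 56 quoted above, while the real, complex and quaternionic cases give the smaller grade +1 subspaces of the truncations.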
More specifically we find that the Lie algebra of $E_{7(7)}$ has a five grading of the form:

\[
E_{7(7)} = \mathbf{1} \oplus \mathbf{32} \oplus (SO(6,6) \oplus D) \oplus \mathbf{32} \oplus \mathbf{1}.
\tag{32}
\]

Hence this truncation leads to a nonlinear realization of $E_{7(7)}$ on a 33 dimensional space. Note that this is not a minimal realization of $E_{7(7)}$. Further truncation to the $E_{6(6)}$ subgroup preserving the five grading leads to:

\[
E_{6(6)} = \mathbf{1} \oplus \mathbf{20} \oplus (SL(6,\mathbb{R}) \oplus D) \oplus \mathbf{20} \oplus \mathbf{1}.
\tag{33}
\]

This yields a nonlinear realization of $E_{6(6)}$ on a 21 dimensional space, which again is not the minimal realization. Further reduction to $F_{4(4)}$ preserving the five grading

\[
F_{4(4)} = \mathbf{1} \oplus \mathbf{14} \oplus (Sp(6,\mathbb{R}) \oplus D) \oplus \mathbf{14} \oplus \mathbf{1}
\tag{34}
\]

leads to a minimal realization of $F_{4(4)}$ on a fifteen dimensional space. One can further truncate $F_4$ to a subalgebra $G_{2(2)}$ while preserving the five grading

\[
G_{2(2)} = \mathbf{1} \oplus \mathbf{4} \oplus (SL(2,\mathbb{R}) \oplus D) \oplus \mathbf{4} \oplus \mathbf{1},
\tag{35}
\]

6 FTS’s have also been used in [3] to give a classification and a unified realization of non-linear quasisuperconformal algebras and in the realizations of nonlinear N = 4 superconformal algebras in two dimensions [15].


which then yields a nonlinear realization over a five dimensional space. One can go even further and truncate $G_2$ to its subalgebra $SL(3,\mathbb{R})$

\[
SL(3,\mathbb{R}) = \mathbf{1} \oplus \mathbf{2} \oplus (SO(1,1) \oplus D) \oplus \mathbf{2} \oplus \mathbf{1},
\tag{36}
\]

which is the smallest simple Lie algebra admitting a five grading. We should perhaps stress that the nonlinear realizations given above are minimal for $G_{2(2)}$, $F_{4(4)}$, and $E_{8(8)}$ which are the only simple Lie algebras that do not admit a three grading and hence do not have unitary representations of the lowest weight type.

The above nonlinear realizations of the exceptional Lie algebras can also be truncated to subalgebras with a three graded structure, in which case our nonlinear realization reduces to the standard nonlinear realization over a JTS. This truncation we will describe in Sect. 3.2 in more detail. With respect to $E_{6(6)}$ the quasiconformal realization of $E_{8(8)}$ (11) decomposes as follows:

[Figure: the first line shows the five grading $\mathbf{1} \oplus \mathbf{56} \oplus (\mathbf{133} \oplus \mathbf{1}) \oplus \mathbf{56} \oplus \mathbf{1}$; the lines below show each summand decomposed into the representations $\mathbf{1}$, $\mathbf{27}$, $\overline{\mathbf{27}}$ and $\mathbf{78} \oplus \mathbf{1}$, with diagonal lines connecting 27-dimensional subspaces of different grades.]

The numbers in the first line are the dimensions of $E_{7(7)}$, whereas the remaining numbers correspond to representations of USp(8) which is the maximal compact subgroup of $E_{6(6)}$. The $\mathbf{27}$ of grade −1 subspace and the $\overline{\mathbf{27}}$ of grade +1 subspace close into the $E_{6(6)} \oplus D$ subalgebra of grade zero subspace and generate the Lie algebra of $E_{7(7)}$. Similarly $\overline{\mathbf{27}}$ of grade −1 subspace together with the $\mathbf{27}$ of grade +1 subspace form another $E_{7(7)}$ subalgebra of $E_{8(8)}$. Hence we have four different $E_{7(7)}$ subalgebras of $E_{8(8)}$:

i) $E_{7(7)}$ subalgebra of grade zero subspace which is realized linearly.
ii) $E_{7(7)}$ subalgebra preserving the 5-grading, which is realized nonlinearly over a 33 dimensional space.
iii) $E_{7(7)}$ subalgebra that acts on the $\mathbf{27}$ dimensional subspace as the generalized conformal generators.
iv) $E_{7(7)}$ subalgebra that acts on the $\overline{\mathbf{27}}$ dimensional subspace as the generalized conformal generators.
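A quick dimension count provides a sanity check on the five gradings (32)–(36) and on the grading of $E_{8(8)}$ itself. The sketch below is illustrative only: the dimensions of the Lie algebras and of their grade zero subalgebras are standard values supplied here, not quantities quoted in the text, and the table layout is a hypothetical choice.

```python
# (grade -2, grade -1, grade-0 subalgebra, dilatation D, grade +1, grade +2)
FIVE_GRADINGS = {
    "E8(8)":   (1, 56, 133, 1, 56, 1),  # grade 0: E7(7) + D
    "E7(7)":   (1, 32, 66, 1, 32, 1),   # grade 0: SO(6,6) + D
    "E6(6)":   (1, 20, 35, 1, 20, 1),   # grade 0: SL(6,R) + D
    "F4(4)":   (1, 14, 21, 1, 14, 1),   # grade 0: Sp(6,R) + D
    "G2(2)":   (1, 4, 3, 1, 4, 1),      # grade 0: SL(2,R) + D
    "SL(3,R)": (1, 2, 1, 1, 2, 1),      # grade 0: SO(1,1) + D
}

# Standard dimensions of the corresponding Lie algebras.
ALGEBRA_DIM = {"E8(8)": 248, "E7(7)": 133, "E6(6)": 78,
               "F4(4)": 52, "G2(2)": 14, "SL(3,R)": 8}

grading_sums = {name: sum(parts) for name, parts in FIVE_GRADINGS.items()}

# The nonlinear realization acts on the grade +1 subspace plus one extra coordinate.
realization_dim = {name: parts[4] + parts[5] for name, parts in FIVE_GRADINGS.items()}

# Three grading of E7(7) under E6(6) x D and the E6(6) content of the 56.
three_grading_e7 = 27 + (78 + 1) + 27
content_of_56 = 1 + 27 + 27 + 1
```

The resulting realization dimensions 57, 33, 21, 15 and 5 agree with the ones stated in the text for $E_{8(8)}$, $E_{7(7)}$, $E_{6(6)}$, $F_{4(4)}$ and $G_{2(2)}$.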


Similarly for $E_{7(7)}$ under the $SL(6,\mathbb{R})$ subalgebra of the grade zero subspace the 32 dimensional grade +1 subspace decomposes as $\mathbf{32} = \mathbf{1} + \mathbf{15} + \overline{\mathbf{15}} + \mathbf{1}$. The 15 from grade +1 (−1) subspace together with 15 (15) of grade −1 (+1) subspace generate a nonlinearly realized SO(6, 6) subalgebra that acts as the generalized conformal algebra on the 15 (15) dimensional subspace. For $E_{6(6)}$, $F_{4(4)}$, $G_{2(2)}$, and $SL(3,\mathbb{R})$ the analogous truncations lead to nonlinear conformal subalgebras $SL(6,\mathbb{R})$, $Sp(6,\mathbb{R})$, SO(2, 2), and $SL(2,\mathbb{R})$, respectively.

3.2. Conformal Realization of $E_{7(7)}$. As a special truncation the quasiconformal realization of $E_{8(8)}$ contains a conformal realization of $E_{7(7)}$ on a space of 27 dimensions, on which the $E_{6(6)}$ subgroup of $E_{7(7)}$ acts linearly. The main difference is that the construction is now based on a three-graded decomposition (4) of $E_{7(7)}$ rather than (10); hence the realization is “conformal” rather than “quasiconformal”. The relevant decomposition can be directly read off from the figure: we simply truncate to an $E_{7(7)}$ subalgebra in such a way that the grade ±2 subspace can no longer be reached by commutation. This requirement is met only by the two truncations corresponding to the diagonal lines in the figure; adding a singlet we arrive at the desired three graded decomposition of $E_{7(7)}$

\[
\mathbf{133} = \mathbf{27} \oplus (\mathbf{78} \oplus \mathbf{1}) \oplus \overline{\mathbf{27}}
\tag{37}
\]

under its $E_{6(6)} \times D$ subgroup.

The Lie algebra $E_{6(6)}$ has USp(8) as its maximal compact subalgebra. It is spanned by a symmetric tensor $\tilde G^{ij}$ in the adjoint representation $\mathbf{36}$ of USp(8) and a fully antisymmetric symplectic traceless tensor $\tilde G^{ijkl}$ transforming under the $\mathbf{42}$ of USp(8); indices $1 \le i, j, \ldots \le 8$ are now USp(8) indices and all tensors with a tilde transform under USp(8) rather than $SL(8,\mathbb{R})$. $\tilde G^{ijkl}$ is traceless with respect to the real symplectic metric $\Omega_{ij} = -\Omega_{ji} = -\Omega^{ij}$ (thus $\Omega_{ik}\Omega^{kj} = \delta_i^j$). The symplectic metric also serves to pull up and down indices, with the convention that this is always to be done from the left.

The remaining part of $E_{7(7)}$ is spanned by an extra dilatation generator $\tilde H$, translation generators $\tilde E^{ij}$ and the nonlinearly realized generators $\tilde F^{ij}$, transforming as $\mathbf{27}$ and $\overline{\mathbf{27}}$, respectively. Unlike for $E_{8(8)}$, there is no need here to distinguish the generators by the position of their indices, since the corresponding generators are linearly related by means of the symplectic metric. The fundamental $\mathbf{27}$ of $E_{6(6)}$ (on which we are going to realize a nonlinear action of $E_{7(7)}$) is given by the traceless antisymmetric tensor $\tilde Z^{ij}$ transforming as

\[
\tilde G^i{}_j(\tilde Z^{kl}) = 2\,\delta^k_j\,\tilde Z^{il}, \qquad
\tilde G^{ijkl}(\tilde Z^{mn}) = \tfrac{1}{24}\,\varepsilon^{ijklmnpq}\,\tilde Z_{pq},
\tag{38}
\]

where $\tilde Z_{ij} := \Omega_{ik}\Omega_{jl}\tilde Z^{kl} = (\tilde Z^{ij})^*$ and $\Omega_{ij}\tilde Z^{ij} = 0$.


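Likewise, the $\overline{\mathbf{27}}$ representation transforms as

\[
\tilde G^i{}_j(\bar Z^{kl}) = 2\,\delta^k_j\,\bar Z^{il}, \qquad
\tilde G^{ijkl}(\bar Z^{mn}) = -\tfrac{1}{24}\,\varepsilon^{ijklmnpq}\,\bar Z_{pq}.
\tag{39}
\]

Because the product of two $\mathbf{27}$’s contains no singlet, there exists no quadratic invariant of $E_{6(6)}$; however, there is a cubic invariant given by

\[
N_3(\tilde Z) := \tilde Z^{ij}\,\tilde Z_{jk}\,\tilde Z^{kl}\,\Omega_{il}.
\tag{40}
\]

We are now ready to give the conformal realization of $E_{7(7)}$ on the 27 dimensional space spanned by the $\tilde Z^{ij}$. As the action of the linearly realized $E_{6(6)}$ subgroup has already been given, we list only the remaining variations. As before $\tilde E^{ij}$ acts by translations:

\[
\tilde E^{ij}(\tilde Z^{kl}) = -\Omega^{i[k}\Omega^{l]j} - \tfrac18\,\Omega^{ij}\Omega^{kl}
\tag{41}
\]

and $\tilde H$ by dilatations

\[
\tilde H(\tilde Z^{ij}) = \tilde Z^{ij}.
\tag{42}
\]

The 27 generators $\tilde F^{ij}$ are realized nonlinearly:

\[
\begin{aligned}
\tilde F^{ij}(\tilde Z^{kl}) := {} & -2\,\tilde Z^{ij}(\tilde Z^{kl}) + \Omega^{i[k}\Omega^{l]j}\,(\tilde Z^{mn}\tilde Z_{mn}) + \tfrac18\,\Omega^{ij}\Omega^{kl}\,(\tilde Z^{mn}\tilde Z_{mn}) \\
& + 8\,\tilde Z^{km}\tilde Z_{mn}\,\Omega^{n[i}\Omega^{j]l} - \Omega^{kl}\,(\tilde Z^{im}\,\Omega_{mn}\,\tilde Z^{nj}).
\end{aligned}
\tag{43}
\]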

The norm form needed to define the $E_{7(7)}$ invariant “light cones” is now constructed from the cubic invariant of $E_{6(6)}$. Then $N_3(\tilde X - \tilde Y)$ is manifestly invariant under $E_{6(6)}$ and under the translations $\tilde E^{ij}$ (observe that there is no need to introduce a nonlinear difference unlike for $E_{8(8)}$). Under $\tilde H$ it transforms by a constant factor, whereas under the action of $\tilde F^{ij}$ we have

\[
\tilde F^{ij}\,N_3(\tilde X - \tilde Y) = (\tilde X^{ij} + \tilde Y^{ij})\,N_3(\tilde X - \tilde Y).
\tag{44}
\]

Thus the light cones in $\mathbb{R}^{27}$ with base point $\tilde Y$

\[
N_3(\tilde X - \tilde Y) = 0
\tag{45}
\]

are indeed invariant under $E_{7(7)}$. They are still curved hypersurfaces, but in contrast to the $E_{8(8)}$ light-cones constructed before, they are no longer deformed as one varies the base point $\tilde Y$.

The connection to the Jordan Triple Systems of Appendix A can now be made quite explicit, and the formulas that we arrive at in this way are completely analogous to the ones given in the introduction. We first of all notice that we can again define a triple product in terms of the $E_{6(6)}$ representations; it reads

\[
\begin{aligned}
\{\tilde X\,\tilde Y\,\tilde Z\}^{ij} = {} & 16\,\tilde X^{ik}\tilde Z_{kl}\tilde Y^{lj} + 16\,\tilde Z^{ik}\tilde X_{kl}\tilde Y^{lj} + 4\,\Omega^{ij}\,(\tilde X^{kl}\tilde Y_{lm}\tilde Z^{mn}\Omega_{kn}) \\
& + 4\,\tilde X^{ij}\,\tilde Y^{kl}\tilde Z_{kl} + 4\,\tilde Y^{ij}\,\tilde X^{kl}\tilde Z_{kl} + 2\,\tilde Z^{ij}\,\tilde X^{kl}\tilde Y_{kl}.
\end{aligned}
\tag{46}
\]


This triple product can be used to rewrite the conformal realization. Recalling that a triple product with identical properties exists for the 27-dimensional Jordan algebra $J_3^{\mathbb{O}_S}$, we now consider $\tilde Z$ as an element of $J_3^{\mathbb{O}_S}$. Next we introduce generators labeled by elements of $J_3^{\mathbb{O}_S}$, and define the variations

\[
U_a(\tilde Z) = a, \qquad
S_{ab}(\tilde Z) = \{a\,b\,\tilde Z\}, \qquad
\tilde U_c(\tilde Z) = -\tfrac12\,\{\tilde Z\,c\,\tilde Z\},
\tag{47}
\]

for $a, b, c \in J_3^{\mathbb{O}_S}$. It is straightforward to check that these reproduce the commutation relations listed in the introduction with the only difference that $J_2^{\mathbb{C}}$ has been replaced by $J_3^{\mathbb{O}_S}$.

Acknowledgements. We are very grateful to R. Kallosh for poignant questions and comments on the first version of this paper. We would also like to thank B. de Wit and B. Pioline for enlightening discussions.

Appendix A. Jordan Triple Systems

Let us first recall the defining properties of a Jordan algebra. By definition these are algebras equipped with a commutative (but non-associative) binary product $a \circ b = b \circ a$ satisfying the Jordan identity

\[
(a \circ b) \circ a^2 = a \circ (b \circ a^2).
\tag{A.1}
\]

A Jordan algebra with such a product defines a so-called Jordan triple system (JTS) under the Jordan triple product

\[
\{a\,b\,c\} = a \circ (\tilde b \circ c) + (a \circ \tilde b) \circ c - \tilde b \circ (a \circ c),
\]

where $\tilde{\ }$ denotes a conjugation in $J$ corresponding to the operation † in $\mathfrak{g}$. The triple product satisfies the identities (which can alternatively be taken as the defining identities of the triple system)

\[
\begin{aligned}
& \{a\,b\,c\} = \{c\,b\,a\}, \\
& \{a\,b\,\{c\,d\,x\}\} - \{c\,d\,\{a\,b\,x\}\} - \{a\,\{d\,c\,b\}\,x\} + \{\{c\,d\,a\}\,b\,x\} = 0.
\end{aligned}
\tag{A.2}
\]
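The identities (A.1) and (A.2) can be checked explicitly in a simple special Jordan algebra. The sketch below is an illustration under stated assumptions, not part of the construction above: it uses real symmetric 2 × 2 matrices with the product $a \circ b = \tfrac12(ab + ba)$ and takes the conjugation to be trivial; the helpers `matmul`, `add`, `jordan`, `triple` and `sym` are hypothetical names, and exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction

def matmul(a, b):
    """Ordinary product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(r, s)] for r, s in zip(a, b)]

def jordan(a, b):
    """Jordan product a o b = (ab + ba)/2."""
    s = add(matmul(a, b), matmul(b, a))
    return [[Fraction(1, 2) * x for x in row] for row in s]

def triple(a, b, c):
    """{a b c} = a o (b o c) + (a o b) o c - b o (a o c), trivial conjugation assumed."""
    first = jordan(a, jordan(b, c))
    second = jordan(jordan(a, b), c)
    third = jordan(b, jordan(a, c))
    return add(add(first, second), third, -1)

def sym(p, q, r):
    """Real symmetric 2x2 matrix [[p, q], [q, r]] with exact entries."""
    return [[Fraction(p), Fraction(q)], [Fraction(q), Fraction(r)]]

a, b, c, d, x = sym(1, 2, 3), sym(0, -1, 4), sym(2, 5, -3), sym(-2, 1, 1), sym(3, 0, 7)

a2 = jordan(a, a)
jordan_identity = jordan(jordan(a, b), a2) == jordan(a, jordan(b, a2))   # (A.1)
symmetry = triple(a, b, c) == triple(c, b, a)                            # first line of (A.2)

lhs = add(add(triple(a, b, triple(c, d, x)), triple(c, d, triple(a, b, x)), -1),
          add(triple(triple(c, d, a), b, x), triple(a, triple(d, c, b), x), -1))
fundamental_identity = all(entry == 0 for row in lhs for entry in row)   # second line of (A.2)
```

For this product the triple product reduces to $\{a\,b\,c\} = \tfrac12(abc + cba)$, which makes both identities in (A.2) a matter of cancelling monomials; the exceptional algebra $J_3^{\mathbb{O}_S}$ used in the main text is not of this special form and is not covered by the sketch.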

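The Tits–Kantor–Koecher (TKK) construction [32, 21, 25] associates every JTS with a 3-graded Lie algebra

\[
\mathfrak{g} = \mathfrak{g}^{-1} \oplus \mathfrak{g}^{0} \oplus \mathfrak{g}^{+1},
\tag{A.3}
\]

satisfying the formal commutation relations:

\[
[\mathfrak{g}^{+1}, \mathfrak{g}^{-1}] = \mathfrak{g}^{0}, \qquad
[\mathfrak{g}^{+1}, \mathfrak{g}^{+1}] = 0, \qquad
[\mathfrak{g}^{-1}, \mathfrak{g}^{-1}] = 0.
\]

With the exception of the Lie algebras $G_2$, $F_4$, and $E_8$ every simple Lie algebra $\mathfrak{g}$ can be given a three graded decomposition with respect to a subalgebra $\mathfrak{g}^0$ of maximal rank.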


By the TKK construction the elements $U_a$ of the $\mathfrak{g}^{+1}$ subspace of the Lie algebra are labelled by the elements $a \in J$. Furthermore every such Lie algebra $\mathfrak{g}$ admits an involutive automorphism $\iota$, which maps the elements of the grade +1 space onto the elements of the subspace of grade −1:

\[
\iota(U_a) =: \tilde U_a \in \mathfrak{g}^{-1}.
\tag{A.4}
\]

To get a complete set of generators of $\mathfrak{g}$ we define

\[
[U_a, \tilde U_b] = S_{ab}, \qquad [S_{ab}, U_c] = U_{\{abc\}},
\tag{A.5}
\]

where $S_{ab} \in \mathfrak{g}^0$ and $\{abc\}$ is the Jordan triple product under which the space $J$ is closed. The remaining commutation relations are

\[
[S_{ab}, \tilde U_c] = \tilde U_{\{bac\}}, \qquad
[S_{ab}, S_{cd}] = S_{\{abc\}d} - S_{c\{bad\}},
\tag{A.6}
\]

and the closure of the algebra under commutation follows from the defining identities of a JTS given above. The Lie algebra generated by $S_{ab}$ is called the structure algebra of the JTS $J$, under which the elements of $J$ transform linearly. The traceless elements of this action of $S_{ab}$ generate the reduced structure algebra of $J$. There exist four infinite families of hermitian JTS’s and two exceptional ones [31, 27]. The latter are listed in the table below (where $M_{1,2}(\mathbb{O})$ denotes 1 × 2 matrices over the octonions, i.e. the octonionic plane)

    J                        G             H
    M_{1,2}(O_S)             E6(6)         SO(5, 5)
    M_{1,2}(O)               E6(−14)       SO(8, 2)
    J_3^{O_S}                E7(7)         E6(6)
    J_3^{O}                  E7(−25)       E6(−26)

Here we are mainly interested in the real form $J_3^{\mathbb{O}_S}$, which corresponds to the split octonions $\mathbb{O}_S$ and has $E_{7(7)}$ and $E_{6(6)}$ as its conformal and reduced structure group, respectively.

Appendix B. The Quartic $E_{7(7)}$ Invariant

In the $SL(8,\mathbb{R})$ basis of $E_{7(7)}$ the quartic invariant is given by (18), which we here repeat for convenience

\[
\begin{aligned}
I_4^{SL(8,\mathbb{R})} = {} & X^{ij}X_{jk}X^{kl}X_{li} - \tfrac14\,X^{ij}X_{ij}X^{kl}X_{kl} \\
& + \tfrac{1}{96}\,\varepsilon^{ijklmnpq}X_{ij}X_{kl}X_{mn}X_{pq}
  + \tfrac{1}{96}\,\varepsilon_{ijklmnpq}X^{ij}X^{kl}X^{mn}X^{pq}.
\end{aligned}
\tag{B.1}
\]


Another very useful form of $E_{7(7)}$ makes the maximal compact subgroup SU(8) manifest. The fundamental $\mathbf{56}$ representation then is spanned by the complex tensors $Z_{AB}$ which are related to the $SL(8,\mathbb{R})$ basis by [4]

\[
Z^{AB} = (Z_{AB})^* = \frac{1}{4\sqrt{2}}\,(X^{ij} - i\,X_{ij})\,\Gamma^{ij}_{AB},
\tag{B.2}
\]

where $\Gamma^{ij}_{AB}$ are the SO(8) gamma matrices. In this basis the quartic invariant takes the form

\[
\begin{aligned}
I_4^{SU(8)} = {} & Z^{AB}Z_{BC}Z^{CD}Z_{DA} - \tfrac14\,Z^{AB}Z_{AB}Z^{CD}Z_{CD} \\
& + \tfrac{1}{96}\,\varepsilon^{ABCDEFGH}Z_{AB}Z_{CD}Z_{EF}Z_{GH}
  + \tfrac{1}{96}\,\varepsilon_{ABCDEFGH}Z^{AB}Z^{CD}Z^{EF}Z^{GH}.
\end{aligned}
\tag{B.3}
\]

The precise relation between $I_4^{SU(8)}$ and $I_4^{SL(8,\mathbb{R})}$ has never been spelled out in the literature although it is claimed in [4] that they should be proportional. In fact, we have

\[
I_4^{SU(8)} = -I_4^{SL(8,\mathbb{R})}.
\tag{B.4}
\]

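To prove this claim, one needs the identities

\[
\begin{aligned}
\mathrm{Tr}(\Gamma^{ij}\Gamma^{kl}\Gamma^{mn}\Gamma^{pq}) = {} & -128\,\delta^{ij}_{p[k}\delta^{mn}_{l]q} + 128\,\delta^{ij}_{p[m}\delta^{kl}_{n]q} + 128\,\delta^{ij}_{k[m}\delta^{pq}_{n]l} \\
& + 96\,(\delta^{ij}_{kl}\delta^{mn}_{pq})_{\rm sym} \mp 8\,\varepsilon^{ijklmnpq},
\end{aligned}
\tag{B.5}
\]

and

\[
\varepsilon^{ABCDEFGH}\,\Gamma^{ij}_{AB}\Gamma^{kl}_{CD}\Gamma^{mn}_{EF}\Gamma^{pq}_{GH}
= -128\,\bigl(12\,\delta^{ij}_{kl}\delta^{mn}_{pq} + 48\,\delta^{ij}_{p[k}\delta^{mn}_{l]q}\bigr)_{\rm sym} \mp \varepsilon^{ijklmnpq},
\tag{B.6}
\]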

where $(\ldots)_{\rm sym}$ denotes symmetrization w.r.t. the pairs of indices $(ij)$, $(kl)$, $(mn)$, $(pq)$, and the signs ∓ depend on whether the spinor representation or the conjugate spinor representation of the gamma matrices is used: $\Gamma^{ijklmnpq} = \mp\,\varepsilon^{ijklmnpq}$.

To see that $I_4$ can assume both positive and negative values it is sufficient to consider configurations in the SU(8) basis of the form [8]

\[
Z_{AB} =: \begin{pmatrix} z_1 & & \\ & \ddots & \\ & & z_4 \end{pmatrix} \otimes \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},
\tag{B.7}
\]

with complex parameters $z_1, \ldots, z_4$. For this configuration the quartic invariant becomes

\[
I_4^{SU(8)} = \sum_{\alpha} |z_\alpha|^4 - 2 \sum_{\beta > \alpha} |z_\alpha|^2 |z_\beta|^2 + 4\,z_1 z_2 z_3 z_4 + 4\,z_1^* z_2^* z_3^* z_4^*.
\tag{B.8}
\]
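Formula (B.8) is easy to evaluate numerically. The sketch below is only an illustration: the function name `quartic_invariant` and the sample values are choices made here, and the phase $\pi/4$ in the last sample is picked purely as an example of a complex value.

```python
import cmath
import math
from itertools import combinations

def quartic_invariant(z):
    """Right-hand side of (B.8) for parameters z = (z1, z2, z3, z4)."""
    z = [complex(w) for w in z]
    quartic = sum(abs(w) ** 4 for w in z)
    mixed = sum(abs(u) ** 2 * abs(v) ** 2 for u, v in combinations(z, 2))
    product = z[0] * z[1] * z[2] * z[3]
    # 4 z1 z2 z3 z4 + 4 (z1 z2 z3 z4)^* equals 8 Re(z1 z2 z3 z4).
    return quartic - 2 * mixed + 8 * product.real

M = 1.5
single = quartic_invariant((2, 0, 0, 0))                  # only one nonvanishing parameter
equal_real = quartic_invariant((M, M, M, M))              # all parameters equal and real
equal_imag = quartic_invariant((1j * M,) * 4)             # all parameters equal and imaginary
phase = cmath.exp(1j * math.pi / 4)
equal_complex = quartic_invariant((M * phase,) * 4)       # common complex value (sample phase)

print(single, equal_real, equal_imag, equal_complex)
```

For four equal parameters $z_\alpha = M e^{i\varphi}$ the expression reduces to $-8M^4(1 - \cos 4\varphi)$, so it vanishes for real or imaginary values and is negative otherwise.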

Using this formula, one can easily see that both positive and negative values are possible for $I_4$:

i) We find positive values for $I_4$ when all but one parameter vanish:

\[
I_4^{SU(8)} = |z_1|^4 > 0 \quad \text{for} \quad z_1 \neq 0,\ z_2 = z_3 = z_4 = 0.
\]

ii) $I_4$ vanishes when all parameters take the same real (electric) or imaginary (magnetic) value:

\[
I_4^{SU(8)} = 0 \quad \text{for} \quad z_1 = z_2 = z_3 = z_4 = M \text{ or } iM,\ M \in \mathbb{R}.
\]

This is the example considered in [20] corresponding to maximally BPS black hole solutions in d = 4, N = 8 supergravity with vanishing entropy and vanishing area of the horizon.

iii) $I_4$ is negative when all parameters take the same complex “dyonic” value. For instance, $I_4^{SU(8)}$

x0 }, the map $A\Omega \mapsto A^*\Omega$, $A \in \mathcal{A}(W_1)''$, defines an antilinear operator $S_{W_1}: \mathcal{A}(W_1)''\Omega \to \mathcal{A}(W_1)''\Omega$ which is closable. Its closure is called the Tomita operator of $\Omega$ and $\mathcal{A}(W_1)''$ and admits a unique polar decomposition $S_{W_1} = J_{W_1}\Delta_{W_1}^{1/2}$ into an antiunitary conjugation $J_{W_1}$ (the “phase” of $S_{W_1}$) which is called the modular conjugation of $(\Omega, \mathcal{A}(W_1)'')$, and a positive operator $\Delta_{W_1}^{1/2}$ (the “modulus” of $S_{W_1}$) whose square $\Delta_{W_1}$ is referred to as the modular operator of $(\Omega, \mathcal{A}(W_1)'')$. The main theorem of Tomita–Takesaki theory [46] now implies that the adjoint actions of the operators $\Delta_{W_1}^{it}$ map the algebras $\mathcal{A}(W_1)''$ and $\mathcal{A}(W_1)'$ onto themselves, whereas the adjoint action of the conjugation $J_{W_1}$ maps the two algebras onto one another.

Bisognano and Wichmann showed that for finite-component Wightman fields, the unitary $\Delta_{W_1}^{it}$ coincides with the unitary representing the 0-1-boost by $-2\pi t$ for all $t \in \mathbb{R}$, whereas $J_{W_1}$ implements a charge conjugation together with a time reflection and a spatial reflection in the 1-direction; this combination of discrete transformations will be referred to as a P1CT-symmetry. For the algebraic setting, Borchers proved in [11]$^2$ that the spectrum condition (without assuming Lorentz covariance) implies the commutation relations

\[
\begin{aligned}
\text{(i)} \quad & J_{W_1}\,U(a)\,J_{W_1} = U(j_1 a), \\
\text{(ii)} \quad & \Delta_{W_1}^{it}\,U(a)\,\Delta_{W_1}^{-it} = U(\Lambda_1(-2\pi t)\,a) \quad \text{for all } t \in \mathbb{R},
\end{aligned}
\]

where $\Lambda_1(-2\pi t)$ denotes the Lorentz boost by $-2\pi t$ in the 1-direction, while $j_1$ is the reflection defined by $j_1 x := (-x_0, -x_1, x_2, \ldots, x_s)$. Wiesbrock noted that Borchers’ relations are not only a necessary, but also a sufficient condition for the spectrum condition ([52], cf. also [25]).

For 1+1 dimensions, Borchers’ relations immediately imply [11] that the net of observables may be enlarged to a local net which generates the same wedge algebras (and hence the same corresponding modular operator and conjugation) as the original one and which has the property that $J_{W_1}$ is a P1CT-operator (modular P1CT-symmetry), whereas $\Delta_{W_1}^{it}$ implements the Lorentz boost by $-2\pi t$ for each $t \in \mathbb{R}$ (modular Lorentz symmetry).

The first uniqueness theorem for modular symmetries states that even in higher dimensions, $J_{W_1}$ or $\Delta_{W_1}^{it}$, $t \in \mathbb{R}$, can be shown to be a P1CT-operator or a 0-1-Lorentz boost, respectively, provided that $J_{W_1}$ or $\Delta_{W_1}^{it}$ implement any geometric action on the net. The first step towards it is the following lemma. In this lemma and in what follows, $\mathcal{K}$ will denote the class of all double cones of the form $O := (a + V_+) \cap (b - V_+)$, $a, b \in \mathbb{R}^{1+s}$.

Lemma 2.1. Let $K$ be a unitary or antiunitary operator with the property that for every double cone $O$ there are open sets $M_O$ and $N_O$ such that

\[
K \mathcal{A}(O) K^* = \mathcal{A}(M_O), \qquad K^* \mathcal{A}(O) K = \mathcal{A}(N_O),
\]

2 For a considerably simpler proof found recently, see [28].

Two Uniqueness Results on Unruh Effect and on PCT-Symmetry


and let $\kappa$ be a causal automorphism$^3$ of $\mathbb{R}^{1+s}$ such that $K U(a) K^* = U(\kappa a)$ for all $a \in \mathbb{R}^{1+s}$. Then there is a unique $\xi \in \mathbb{R}^{1+s}$ such that

\[
K \mathcal{A}(O) K^* = \mathcal{A}(\kappa O + \xi) \quad \text{for all } O \in \mathcal{K}.
\]

A first proof of Lemma 2.1 was published in [37], but both the statement and the proof given there were more general, which made the formulation somewhat technical. For the reader’s convenience a less general, but more accessible formulation is used here, and a more detailed version of the proof is given below. The following theorem is a consequence of Lemma 2.1 and Borchers’ commutation relations.

Theorem 2.2 (First Uniqueness Theorem).
(i) If for every double cone $O \in \mathcal{K}$ there is an open set $M_O$ such that $J_{W_1} \mathcal{A}(O) J_{W_1} = \mathcal{A}(M_O)$, then

\[
J_{W_1} \mathcal{A}(O) J_{W_1} = \mathcal{A}(j_1 O) \quad \text{for all } O \in \mathcal{K}.
\]

(ii) If for every $t \in \mathbb{R}$ and for every $O \in \mathcal{K}$ there is an open set $M_O^t$ such that $\Delta_{W_1}^{it} \mathcal{A}(O) \Delta_{W_1}^{-it} = \mathcal{A}(M_O^t)$, then

\[
\Delta_{W_1}^{it} \mathcal{A}(O) \Delta_{W_1}^{-it} = \mathcal{A}(\Lambda_1(-2\pi t)\,O) \quad \text{for all } O \in \mathcal{K}.
\]
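The geometric maps appearing in Theorem 2.2 can be made concrete with a small numerical sketch. It rests on assumptions that are not fixed by the text above: points are tuples $(x_0, x_1, x_2, x_3)$ in $1+3$ dimensions, the boost with parameter $s$ is taken to act by $\cosh s$ and $\sinh s$ on the 0-1 plane (the sign convention is a choice), and the Rindler wedge is taken to be $W_1 = \{x : x_1 > |x_0|\}$. All function names are hypothetical.

```python
import math

def boost1(s, x):
    """Boost with parameter s in the 0-1 plane (sign convention assumed)."""
    c, sh = math.cosh(s), math.sinh(s)
    return (c * x[0] + sh * x[1], sh * x[0] + c * x[1]) + tuple(x[2:])

def j1(x):
    """Reflection j1 x = (-x0, -x1, x2, ..., xs)."""
    return (-x[0], -x[1]) + tuple(x[2:])

def in_forward_cone(x):
    """x in V+: future directed and timelike."""
    return x[0] > 0 and x[0] ** 2 - sum(xi ** 2 for xi in x[1:]) > 0

def in_double_cone(x, a, b):
    """x in O = (a + V+) intersected with (b - V+)."""
    lower = tuple(p - q for p, q in zip(x, a))
    upper = tuple(p - q for p, q in zip(b, x))
    return in_forward_cone(lower) and in_forward_cone(upper)

def in_wedge(x):
    """x in the assumed Rindler wedge W1 = {x1 > |x0|}."""
    return x[1] > abs(x[0])

a, b = (-1.0, 0.2, 0.0, 0.0), (1.5, 0.4, 0.1, 0.0)   # vertices of a double cone O
x = (0.3, 0.3, 0.05, 0.0)                            # a point of O
w = (0.5, 2.0, 1.0, -3.0)                            # a point of W1
s = -2 * math.pi * 0.1                               # boost parameter -2*pi*t with t = 0.1

point_in_cone = in_double_cone(x, a, b)
# The boost maps O onto the double cone with boosted vertices.
boosted_in_cone = in_double_cone(boost1(s, x), boost1(s, a), boost1(s, b))
# j1 reverses time, so the roles of lower and upper vertex are exchanged.
reflected_in_cone = in_double_cone(j1(x), j1(b), j1(a))
# Boosts preserve W1; j1 maps it onto the opposite wedge.
wedge_preserved = in_wedge(w) and in_wedge(boost1(s, w))
wedge_flipped = in_wedge(tuple(-c for c in j1(w))) and not in_wedge(j1(w))
```

The check only illustrates the point transformations $O \mapsto \Lambda_1(-2\pi t)O$ and $O \mapsto j_1 O$; it says nothing about the operator statements of the theorem.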

The statement of part (ii) implies the statement of part (i) [30], i.e., the Unruh effect implies modular P1CT-symmetry. Further results relating the above statements to each other and to similar conditions can be found in [26]. Assuming that $\Omega$ is separating with respect to the algebra $\mathcal{A}(V_+)$, Borchers also found commutation relations for the corresponding modular conjugation and unitaries: for each $a \in \mathbb{R}^{1+s}$, he found that

\[
\begin{aligned}
J_+\,U(a)\,J_+ &= U(-a); \\
\Delta_+^{it}\,U(a)\,\Delta_+^{-it} &= U(e^{-2\pi t} a) \quad \text{for all } t \in \mathbb{R}.
\end{aligned}
\]

These relations, together with Lemma 2.1, imply the following corollary:

Corollary 2.3 (Uniqueness Theorem “1a”). Assume $\mathcal{A}$ to be Poincaré covariant, and assume that the vacuum vector $\Omega$ is separating with respect to the algebra $\mathcal{A}(V_+)''$, and let $\Delta_{V_+}^{it}$ and $J_{V_+}$ be the corresponding modular operator and conjugation, respectively.

3 Recall that a causal automorphism of $\mathbb{R}^{1+s}$ is a bijection $f: \mathbb{R}^{1+s} \to \mathbb{R}^{1+s}$ which preserves the causal structure of $\mathbb{R}^{1+s}$, i.e., $f(x)$ and $f(y)$ are timelike with respect to each other if and only if $x$ and $y$ are timelike with respect to each other. Without assuming linearity or continuity, one can show that the group of all causal automorphisms of $\mathbb{R}^{1+s}$ is generated by the elements of the Poincaré group and the dilatations [1, 3, 2, 54, 15]. Since the transformations implemented on the translations by Borchers’ commutation relations happen to be causal in all applications discussed below, this assumption means no loss of generality.


B. Kuckert

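(i) If for every double cone $O$ there is an open set $M_O$ such that $J_{V_+} \mathcal{A}(O) J_{V_+} = \mathcal{A}(M_O)$, then

\[
J_{V_+} \mathcal{A}(O) J_{V_+} = \mathcal{A}(-O) \quad \text{for all } O \in \mathcal{K}.
\]

(ii) If for every $t \in \mathbb{R}$ and every double cone $O$ there is an open set $M_O^t$ such that $\Delta_{V_+}^{it} \mathcal{A}(O) \Delta_{V_+}^{-it} = \mathcal{A}(M_O^t)$, then

\[
\Delta_{V_+}^{it} \mathcal{A}(O) \Delta_{V_+}^{-it} = \mathcal{A}(e^{-2\pi t} O) \quad \text{for all } O \in \mathcal{K}.
\]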
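That the maps $a \mapsto e^{-2\pi t}a$ and $a \mapsto -a$ occurring here are causal in the sense of footnote 3 can be illustrated with a few sample points. This is a sketch with hypothetical helper names, working in $1+3$ dimensions; it checks only that the relation “timelike with respect to each other” is unchanged for the chosen points.

```python
import math

def minkowski_sq(x):
    """Minkowski square x0^2 - |x_spatial|^2."""
    return x[0] ** 2 - sum(xi ** 2 for xi in x[1:])

def timelike_related(x, y):
    """True if x and y are timelike with respect to each other."""
    return minkowski_sq(tuple(p - q for p, q in zip(x, y))) > 0

def dilate(lam, x):
    """Multiplication of a point by the real factor lam."""
    return tuple(lam * xi for xi in x)

t = 0.25
lam = math.exp(-2 * math.pi * t)   # dilatation factor e^{-2 pi t}

x, y, z = (0.0, 0.0, 0.0, 0.0), (2.0, 1.0, 0.5, 0.0), (0.5, 3.0, 0.0, 0.0)
pairs = [(x, y), (x, z), (y, z)]

before = [timelike_related(p, q) for p, q in pairs]
after_dilatation = [timelike_related(dilate(lam, p), dilate(lam, q)) for p, q in pairs]
after_reflection = [timelike_related(dilate(-1.0, p), dilate(-1.0, q)) for p, q in pairs]
```

Both maps multiply the Minkowski square of every difference vector by a positive factor, which is why the three lists coincide.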

Since massive theories cannot be dilation invariant unless their mass spectrum is dilation invariant (cf., e.g., [42]), the models concerned by part (ii) of this corollary are massless theories. But it follows from the scattering theory for massless fermions and bosons in 1+3 or 1+1 dimensions (see [17–19]) that either of the symmetry properties found in part (i) and part (ii) of the corollary implies a massless theory to be free (i.e., its S-matrix is trivial) (see [18, 20, 23]). Note that for the 1+1-dimensional case, all modular symmetries considered in Thm. 2.2 and Cor. 2.3 have been established in [11].

It is assumed above that the adjoint actions of $J_{W_1}$ and $\Delta_{W_1}^{it}$, $t \in \mathbb{R}$, map each local algebra $\mathcal{A}(O)$, $O \in \mathcal{K}$, onto the algebra $\mathcal{A}(M_O)$ associated with some open region $M_O$ in Minkowski space. This means that, essentially, the net structure has to be preserved. This is the restrictive aspect of the assumption. On the other hand, the shape of the region $M_O$ is left completely arbitrary, the map $\mathcal{K} \ni O \mapsto M_O$ is not even assumed to be induced by a point transformation. In this aspect, the above assumptions are rather weak.

But there are, of course, other ways to specify what a “geometric action” is. Denote by $\mathcal{W}$ the class of all wedges, i.e., all images of the Rindler wedge $W_1$ under Poincaré transformations. For $M \subset \mathbb{R}^{1+s}$, define the causal complement $M^c$ to be the set of all points that are spacelike to $M$, and let $M'$ denote the interior of $M^c$. It has been shown in [38, 39] that one can define a nonempty localization region for each local observable $A \notin \mathbb{C}\,\mathrm{id}$ by

\[
L(A) := \bigcap \{\overline{W} : W \in \mathcal{W},\ A \in \mathcal{A}(W')'\}.
\]

This localization prescription will be said to satisfy locality if any two local observables $A$ and $B$ with the property that $L(A)$ and $L(B)$ are spacelike separated commute. This property does not follow from the locality property of the net alone, but with the following additional assumptions one can derive it for the present setting [39]:

(E) Wedge duality. $\mathcal{A}(W')'' = \mathcal{A}(W)'$ for each wedge $W \in \mathcal{W}$.

(F) Wedge additivity. For each wedge $W \in \mathcal{W}$ and each double cone $O \in \mathcal{K}$ with $W \subset W + O$ one has

\[
\mathcal{A}(W)'' \subset \Bigl(\bigcup_{a \in W} \mathcal{A}(a + O)\Bigr)''.
\]

Wedge duality is a property of all finite-component Wightman fields by the Bisognano–Wichmann theorem, and wedge additivity is a standard property of Wightman


fields as well. Condition (F) is slightly stronger than the definition of wedge additivity used in [47, 39], where the algebras A(a + O) in Condition (F) are replaced by the larger algebras A(a + O ) , but as this difference is not expected to be substantial for physics, we use the same term for convenience, which is in harmony with the other existing notions of additivity used in algebraic quantum field theory.

Assume now that the localization region of the observable $A_t := \Delta_{W_1}^{it} A \Delta_{W_1}^{-it}$ depends continuously on $t$, i.e., that for every sequence $(t_\nu)_{\nu \in \mathbb{N}}$ which converges to some $t_\infty \in \mathbb{R}$, the localization region $L(A_{t_\infty})$ consists precisely of all accumulation points of sequences $(x_\nu)_{\nu \in \mathbb{N}}$ with $x_\nu \in L(A_{t_\nu})$. Then the following lemma establishes a first restriction on how the localization region can depend on $t$.

Lemma 2.4. With Assumptions (A)–(E), suppose the localization prescription $L$ defined above satisfies locality. Let $A$ be a local observable in $\mathcal{A}(W_1)$, and assume that there exists an $\varepsilon > 0$ such that all $A_t$, $t \in [0, \varepsilon]$, are local observables and such that the function $[0, \varepsilon] \ni t \mapsto L(A_t)$ is continuous in the above sense. Then

(i) $L(A_\varepsilon) \subset \Lambda_1(-2\pi\varepsilon)\bigl((L(A) + W_1)^{cc} \cap (L(A) - W_1)^{cc}\bigr)$;
(ii) $L(A_\varepsilon) \subset L(A) - \overline{V}_+$;
(iii) $L(A) \subset L(A_\varepsilon) + \overline{V}_+$.

It is shown in the Appendix that the continuity assumption made on $t \mapsto L(A_t)$ is equivalent to continuity with respect to a metric first considered by Hausdorff, and that $\bigcup_{t \in [0, \varepsilon]} L(A_t)$ is compact.

Next suppose that $t \mapsto L(A_t)$ is continuous not only for sufficiently small $t$, but for all $t \in \mathbb{R}$, and assume wedge additivity in addition. With these slightly strengthened assumptions one can now prove the following:

Theorem 2.5 (Second Uniqueness Theorem). With Assumptions (A)–(F), assume that $\Delta_{W_1}^{it} \mathcal{A}_{\rm loc} \Delta_{W_1}^{-it} = \mathcal{A}_{\rm loc}$, and suppose that $L(A_t)$ depends continuously on $t$ for all $t \in \mathbb{R}$ and for all $A \in \mathcal{A}_{\rm loc}$. Then

\[
L(\Delta_{W_1}^{it} A \Delta_{W_1}^{-it}) = \Lambda_1(-2\pi t)\,L(A) \quad \text{for all } A \in \mathcal{A}_{\rm loc}.
\]

By the result of Guido and Longo, the conclusion of this theorem also implies modular P1CT-symmetry, but Theorem 2.5 does not provide a proper parallel to the P1CT-part of the first uniqueness theorem, which may also apply if the modular group does not act in any geometric way.

The assumption that every local observable $A$ is mapped onto some other local observable under the adjoint action of the modular group prevents $A$ from being mapped onto an observable localized in an unbounded region. For every bounded open region $O$ there are conformal transformations which map $O$ onto an unbounded region; these transformations are excluded a priori. In contrast, the assumptions of the first uniqueness theorem do not exclude these symmetries explicitly, while it is evident from this theorem that the modular objects under consideration cannot implement these symmetries. Another restrictive assumption of the second uniqueness theorem is that wedge duality is assumed there, whereas the first one can be used to derive wedge duality. On the other hand the assumptions made in the second uniqueness theorem admit the situation that the net structure of $\mathcal{A}$ is destroyed completely under the action of the modular group.


3. Proofs For every algebra ℳ ⊂ B(H), define its localization region L(ℳ) with respect to the net A by L(ℳ) := ⋃{O ∈ K : A(O) ⊂ ℳ}. The only reason to use the class K of double cones in this definition is convenience; one could replace K by the larger class T of all open sets in R^{1+s} without affecting the definition. To see this, denote the localization region obtained this way by L_T(ℳ); it is trivial that L(ℳ) ⊂ L_T(ℳ) as K ⊂ T, while from isotony of the net and the fact that each open region M is the union of all double cones O ⊂ M, one finds

L_T(ℳ) = ⋃{M ∈ T : A(M) ⊂ ℳ} = ⋃{O ∈ K : ∃M ∈ T : O ⊂ M, A(M) ⊂ ℳ} ⊂ ⋃{O ∈ K : A(O) ⊂ ℳ} = L(ℳ),

which is the converse inclusion. It is obvious from the definitions that L(A(M)) ⊃ M. For causally complete and convex regions one can prove the converse inclusion, which we recall without proof from [39] (Cor. 5.4) for later use. Here a causally complete region is a region R such that (R c )c = R. Lemma 3.1. Let R ⊂ R1+s be a causally complete convex open region. (i) For every open region M ⊂ R1+s , one has A(M) ⊂ A(R ) if and only if M ⊂ R. (ii) L(A(R)) = R. One also checks that for any such R, one has L(A(R)) = L(A(R) ) = L(A(R ) ). We emphasize that the above assumption s ≥ 2 is crucial for this lemma; in 1+1 dimensions, there are chiral theories which do not obey the statement of the lemma. The repeated use of this lemma in the proofs is the main reason why s ≥ 2 is assumed throughout this paper. Proof of Lemma 2.1. In what follows, K and κ are defined as in Lemma 2.1. As before, K will denote the class of double cones. For any open region M ⊂ R1+s , we denote by KM the class of all double cones O ∈ K with O ⊂ M, and for each subalgebra M of B(H), we denote by KM the class of all double cones O such that A(O) ⊂ M. The proof will be subdivided into five lemmas. The first implies that for every O ∈ K, the regions MO and NO are bounded. It uses the fact that a region M is bounded if and only if its difference region M − M is bounded, and that difference sets can be expressed in terms of translations. Since the behaviour of translations under the action of the symmetry K is known by assumption, one can prove the following lemma. Lemma 3.2. For every double cone O ∈ K, one has L(KA(O)K ∗ ) − L(KA(O)K ∗ ) = κ(O − O).


Proof. Using the assumptions of Theorem 2.1, one obtains

L(KA(O)K*) − L(KA(O)K*) = L(A(M_O)) − L(A(M_O))
= {a ∈ R^{1+s} : ∃P ∈ K_{A(M_O)} : A(P + a) ⊂ A(M_O)}
= {a ∈ R^{1+s} : ∃P ∈ K_{A(M_O)} : KU(κ^{-1}a)K* A(P) KU(−κ^{-1}a)K* ⊂ A(M_O)}
= κ{a ∈ R^{1+s} : ∃P ∈ K_{A(M_O)} : U(a) K*A(P)K U(−a) ⊂ K*A(M_O)K}
⊂ κ{a ∈ R^{1+s} : ∃P ∈ K_{A(M_O)} : ∃Q ∈ K_{A(N_P)} : A(Q + a) ⊂ A(O)},

where K*A(P)K = A(N_P) and K*A(M_O)K = A(O) have been used in the last step. Since the definitions and isotony imply

K_{A(N_P)} = K_{K*A(P)K} ⊂ K_{K*A(M_O)K} = K_{A(O)},

and since, as remarked above, K_{A(O)} = K_O, one obtains

L(KA(O)K*) − L(KA(O)K*) ⊂ κ{a ∈ R^{1+s} : ∃Q ∈ K_O : A(Q + a) ⊂ A(O)} = κ(O − O).

Conversely,

κ(O − O) = κ{a ∈ R^{1+s} : ∃P ∈ K_O : A(P + a) ⊂ A(O)}
= {a ∈ R^{1+s} : ∃P ∈ K_O : A(P + κ^{-1}a) ⊂ A(O)}
= {a ∈ R^{1+s} : ∃P ∈ K_O : K*U(a)K A(P) K*U(−a)K ⊂ A(O)}
= {a ∈ R^{1+s} : ∃P ∈ K_O : A(M_P + a) ⊂ A(M_O)}
⊂ {a ∈ R^{1+s} : ∃P ∈ K_O : ∃Q ∈ K_{A(M_P)} : A(Q + a) ⊂ A(M_O)},

and since

K_{A(M_P)} = K_{KA(P)K*} ⊂ K_{KA(O)K*} = K_{A(M_O)},

one obtains

κ(O − O) ⊂ {a ∈ R^{1+s} : ∃Q ∈ K_{A(M_O)} : A(Q + a) ⊂ A(M_O)} = L(A(M_O)) − L(A(M_O)).

The next lemma proves that strict inclusions of double cones are preserved under the adjoint action of the operator K. Again, this boils down to translating local algebras up and down Minkowski space and using the commutation relations between K and the translation operators. One uses the fact that O ⊂ P if and only if O can be translated within P into all directions.

Lemma 3.3. For any two double cones O, P ∈ K with O ⊂ P, one has L(KA(O)K*) ⊂ L(KA(P)K*).


Proof. O ⊂ P if and only if the set {a ∈ R1+s : O + a ⊂ P } is a neighbourhood of the origin of R1+s . After using Lemma 3.1, elementary transformations yield {a ∈ R1+s : O + a ⊂P } = {a ∈ R1+s : A(O + a) ⊂ A(P )} = {a ∈ R1+s : K ∗ U (κa)KA(O)K ∗ U (−κa)K ⊂ A(P )} = {a ∈ R1+s : A(MO + κa) ⊂ A(MP )} = κ −1 {a ∈ R1+s : A(MO + a) ⊂ A(MP )} ⊂ κ −1 {a ∈ R1+s : L(A(MO )) + a ⊂ L(A(MP ))}. Since κ is a linear automorphism of R1+s , it follows that O can be a subset of P only if {a ∈ R1+s : L(A(MO )) + a ⊂ L(A(MP ))} is a neighbourhood of the origin. This implies the statement.

The next lemma proves that the maps K ∋ O → L(KA(O)K*) and K ∋ O → L(K*A(O)K) are induced by continuous functions κ̃ : R^{1+s} → R^{1+s} and κ̂ : R^{1+s} → R^{1+s}.

Lemma 3.4. Let x ∈ R^{1+s} be arbitrary, and let (O_ν)ν∈N be a neighbourhood base of x consisting of double cones O_ν ∈ K. Then (L(KA(O_ν)K*))ν∈N is a neighbourhood base of a (naturally, unique) point κ̃(x) ∈ R^{1+s}, and (L(K*A(O_ν)K))ν∈N is a neighbourhood base of a point κ̂(x) ∈ R^{1+s}. The functions x → κ̃(x) and x → κ̂(x) are continuous.

Proof. Without loss of generality, one may assume that O_{ν+1} ⊂ O_ν for all ν ∈ N. It follows from L(A(O)) = O for all O ∈ K and Lemma 3.2 that all L(KA(O_ν)K*), ν ∈ N, are bounded sets, and it follows from Lemma 3.3 that L(KA(O_{ν+1})K*) ⊂ L(KA(O_ν)K*). Therefore, the intersection of this family is nonempty, and Lemma 3.2 implies that the diameter of L(KA(O_ν)K*) tends to zero as ν tends to infinity. This implies that the intersection contains precisely one point κ̃(x), as stated. The corresponding statements for K* are proved analogously. This proves that x → κ̃(x) is a bijective point transformation. Let (x_ν)ν∈N be a sequence in R^{1+s} that converges to a point x_∞. Then there is a neighbourhood base (O_ν)ν∈N of x_∞ with x_ν ∈ O_ν for all ν ∈ N. But since κ̃(x_ν) ∈ κ̃(O_ν) for all ν ∈ N, and since κ̃(O_ν) is a neighbourhood base of κ̃(x_∞), it follows that κ̃(x_ν) tends to κ̃(x_∞) as ν → ∞. This line of argument applies to κ̂ as well.

The next lemma determines the functions κ̃ and κ̂ up to a constant translation.

Lemma 3.5. For every x ∈ R^{1+s}, one has κ̃(x) = κ̃(0) + κx, and κ̂(x) = κ̂(0) + κ^{-1}x.


Proof. Let (O_ν)ν∈N be a neighbourhood base of 0. Then (O_ν + x)ν∈N is a neighbourhood base of x, and

⋂_{ν∈N} L(KA(O_ν + x)K*) = ⋂_{ν∈N} κ̃(O_ν + x) = {κ̃(x)}.

On the other hand,

⋂_{ν∈N} L(KA(O_ν + x)K*) = ⋂_{ν∈N} L(U(κx)KA(O_ν)K*U(−κx))
= κx + ⋂_{ν∈N} κ̃(O_ν)
= κx + {κ̃(0)}.

The corresponding reasoning also leads to the statement made on κ̂.

It has been shown now that L(KA(O)K*) = κ̃(O) for each double cone O ∈ K, and since KA(O)K* = A(M_O) by assumption, one concludes from M_O ⊂ L(A(M_O)) and isotony that

KA(O)K* ⊂ A(κ̃(O)) for all O ∈ K

and that

K*A(O)K ⊂ A(κ̂(O)) for all O ∈ K.

Using this, one can now prove that κ̃ and κ̂ are inverse to each other.

Lemma 3.6. κ̂ = κ̃^{-1}, and in particular, κ̃ and κ̂ are homeomorphisms.

Proof. For every double cone O, it follows from the preceding results that A(O) = K*KA(O)K*K ⊂ K*A(κ̃(O))K ⊂ A(κ̂(κ̃(O))), and since κ̂(κ̃(O)) is a double cone by Lemma 3.5, one can use Lemma 3.1 to conclude that O ⊂ κ̂(κ̃(O)). On the other hand, it follows from Lemma 3.2 that the radii of the double cones O and κ̂(κ̃(O)) are equal, so these double cones coincide, and as this applies for any double cone O, it follows that κ̂ = κ̃^{-1}, as stated.

The proof of Lemma 2.1 is now almost complete. For each O ∈ K, one has KA(O)K* ⊂ A(κ̃(O)), and conversely,

A(κ̃(O)) = KK*A(κ̃(O))KK* ⊂ KA(κ̃^{-1}(κ̃(O)))K* = KA(O)K*,

so

KA(O)K* = A(κ̃(O)),

and with ξ := κ̃(0) it follows from Lemma 3.5 that

KA(O)K* = A(κO + ξ) for all O ∈ K.

That ξ is unique follows immediately from Lemma 3.1, so the proof of Lemma 2.1 is complete.


Proof of Theorem 2.2 (i). It follows from Lemma 2.1 that there is a unique ι ∈ R^{1+s} such that

J_{W1} A(O) J_{W1} = A(j1 O + ι) for all O ∈ K.

It remains to be shown that ι = 0. Since J is an involution, one has

x = j1(j1 x + ι) + ι = x + j1 ι + ι for all x ∈ R^{1+s},

which gives ι = −j1 ι, hence ι2 = · · · = ιs = 0. Furthermore, one has A(W1 + ι) = JW1 A(W1 ) JW1 = A(W1 ) from Lemma 2.1 and the Tomita–Takesaki theorem, so on the one hand, it follows from Lemma 3.1 that W1 + ι ⊂ W1 , and on the other hand, locality implies A(W1 ) ⊂ A(W1 ) = A(W1 + ι) ⊂ A(W1 + ι) , so using Lemma 3.1 once more one finds W1 ⊂ W1 + ι, arriving at W1 + ι = W1 and ι0 = ι1 = 0, as stated.
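The first step of this proof, namely that ι = −j1 ι forces all components of ι other than the 0- and 1-components to vanish, can be checked on sample points. The following is a minimal sketch, assuming s = 3 and assuming that j1 is the reflection that flips the signs of the 0- and 1-components and leaves the remaining components unchanged; the function names are illustrative and do not come from the paper.

```python
def j1(x):
    """Reflection in the 0- and 1-directions (assumed form of j1)."""
    return (-x[0], -x[1]) + tuple(x[2:])


def conjugated_twice(x, iota):
    """Apply x -> j1 x + iota twice, i.e. compute j1(j1 x + iota) + iota."""
    once = tuple(a + b for a, b in zip(j1(x), iota))
    return tuple(a + b for a, b in zip(j1(once), iota))


def is_involution(iota, samples):
    """True if x -> j1 x + iota squares to the identity on all sample points."""
    return all(conjugated_twice(x, iota) == x for x in samples)


samples = [(1.0, 2.0, 3.0, 4.0), (0.0, -1.0, 0.5, 2.0)]

# Only the 0- and 1-components of iota are nonzero: the map is an involution.
ok = is_involution((0.25, -0.5, 0.0, 0.0), samples)

# A nonzero 2-component violates iota = -j1 iota, so the identity fails.
bad = is_involution((0.0, 0.0, 1.0, 0.0), samples)
```

Here `ok` is true and `bad` is false. The 0- and 1-components of ι are not constrained by this step; they are handled by the wedge argument in the text.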

In what follows, a well-known generalization of Asgeirsson’s Lemma will be used repeatedly. It is called the double cone theorem of Borchers and Vladimirov [50, 9, 51, 12]. Below, it will be applied together with the edge of the wedge theorem due to Bogoliubov (cf., e.g., [45, 51, 12]). For the reader’s convenience, both theorems are recalled here. For ε > 0, Bε will denote the open ε-ball centered at the origin of R^2, and n will denote some natural number.

Theorem 3.7 (Edge of the Wedge Theorem). Let C be a nonempty, open and convex cone in R^n. For some ε > 0, assume that g+ is a function analytic in the tube R^n + i(C ∩ Bε), and that g− is a function analytic in the tube R^n − i(C ∩ Bε). If there is an open region γ ⊂ R^n where g+ and g− have a common boundary value in the sense of distributions, then g+ and g− are branches of a function g which is analytic in a complex neighbourhood of γ.

Theorem 3.8. Given the assumptions and notation of Theorem 3.7, let c be any smooth curve in γ which has all its tangent vectors in C. Then g is analytic in a complex neighbourhood of the double cone (c + C) ∩ (c − C).

Another well-known lemma that will be used repeatedly is the following (cf., e.g., part (i) of Lemma 2.4.1 in [39]).

Lemma 3.9. Let R ⊂ R^{1+s} be a region that contains an open cone, and let A ∈ Aloc be a local observable such that ⟨Ω, ABΩ⟩ = ⟨Ω, BAΩ⟩ for all B ∈ A(R). Then A ∈ A(R)′.


Proof of Theorem 2.2 (ii). In what follows, e0 and e1 denote the unit vectors pointing into the 0- and the 1-direction, respectively. For every t ∈ R, Theorem 2.1 implies the existence of a unique ξ(t) ∈ R^{1+s} with

Δ^{it}_{W1} A(O) Δ^{−it}_{W1} = A(ξ(t) + Λ1(−2πt)O) for all O ∈ K.

By Corollary 3.1 it is clear that ξ(t) + W1 = W1, so for all s ∈ R, one has Λ1(−2πs)ξ(t) = ξ(t) and

A(ξ(s + t) + Λ1(−2π(t + s))O) = Δ^{is}_{W1} Δ^{it}_{W1} A(O) Δ^{−it}_{W1} Δ^{−is}_{W1}
= A(ξ(s) + Λ1(−2πs)(ξ(t) + Λ1(−2πt)O))
= A(ξ(s) + Λ1(−2πs)ξ(t) + Λ1(−2π(t + s))O)
= A(ξ(s) + ξ(t) + Λ1(−2π(t + s))O),

so ξ(s + t) = ξ(s) + ξ(t) follows from Lemma 3.1. One now concludes that ξ(λt) = λξ(t) for λ ∈ Q, so t → ξ(t) is Q-linear.

Next we prove that the function R ∋ t → ξ(t) is continuous and, hence, R-linear. As ξ is additive, it is sufficient to prove continuity at t = 0. Assume ξ were not continuous there; then there would exist a sequence (t_ν)ν∈N in R that tends to zero, while |ξ(t_ν)| > ε for some ε > 0. Define the double cone O := (−(ε/3)e0 + V+) ∩ ((ε/3)e0 − V+). By the above results and locality, there is an Nε ∈ N such that for any A, B ∈ A(O), one has [Δ^{it_ν}_{W1} A Δ^{−it_ν}_{W1}, B] = 0 for all ν > Nε. But as Δ^{it}_{W1} depends strongly continuously on t, one concludes that A and B commute, and since A and B are arbitrary elements of A(O), it follows that A(O) is abelian. Additivity implies that A is abelian as well, so H = C by irreducibility, which contradicts the assumption that H is infinite-dimensional.

It follows that ξ is continuous and, hence, R-linear, so there is a ξ ∈ R^{1+s} with ξ(t) = ξt for all t ∈ R. It remains to be shown that ξ = 0. To this end, define the double cone O := (ρe1 + V+) ∩ (ρe1 + ρe0 − V+) ⊂ W1 for some ρ > 0. If one chooses ρ sufficiently small, there are a ∈ R^{1+s} and ε, δ > 0 such that

(1) Λ1(−2πt)O + tξ − δte0 ⊂ a + V+ for all t ∈ [0, ε];
(2) O ⊂ a + V+.

As an example, choose a := ρe1 + ξ − |ξ|e0, where |ξ| := √|ξ²|. Defining

f(t) := (Λ1(−2πt)ρe1 + tξ − δte0 − a)²,

one computes

f′(0) = 2|ξ|(−2πρ + |ξ| − δ).

If one chooses ρ < |ξ|/(2π), one can choose δ such that 0 < δ < −2πρ + |ξ|. With this choice one has f′(0) > 0, and as f is smooth and satisfies f(0) = 0, there is an ε > 0 such that f(t) ≥ 0 for all t ∈ [0, ε], which immediately implies Condition (1), whereas Condition (2) follows from f(0) = 0.
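The derivative of f at t = 0 used above can be cross-checked numerically. The following is a minimal sketch under stated assumptions: s = 3, the Lorentz square is x² = x0² − x1² − x2² − x3², the boost acts as Λ1(τ)e1 = sinh(τ)e0 + cosh(τ)e1, and ξ has vanishing 0- and 1-components (as follows from ξ(t) + W1 = W1). The sample values of ξ, ρ and δ and all function names are illustrative only.

```python
import math


def minkowski_square(v):
    """Lorentz square with signature (+, -, -, -)."""
    return v[0] ** 2 - v[1] ** 2 - v[2] ** 2 - v[3] ** 2


def boost1(tau, v):
    """Boost in the 0-1 plane (assumed sign convention)."""
    c, s = math.cosh(tau), math.sinh(tau)
    return (c * v[0] + s * v[1], s * v[0] + c * v[1], v[2], v[3])


def make_f(xi, rho, delta):
    """Return f(t) = (boost1(-2 pi t) rho e1 + t xi - delta t e0 - a)^2 and |xi|."""
    norm_xi = math.sqrt(abs(minkowski_square(xi)))
    # a := rho e1 + xi - |xi| e0, as chosen in the text
    a = (xi[0] - norm_xi, rho + xi[1], xi[2], xi[3])

    def f(t):
        b = boost1(-2.0 * math.pi * t, (0.0, rho, 0.0, 0.0))
        v = (b[0] + t * xi[0] - delta * t - a[0],
             b[1] + t * xi[1] - a[1],
             b[2] + t * xi[2] - a[2],
             b[3] + t * xi[3] - a[3])
        return minkowski_square(v)

    return f, norm_xi


xi = (0.0, 0.0, 3.0, 4.0)   # hypothetical spacelike xi with |xi| = 5
rho, delta = 0.5, 1.0       # rho < |xi| / (2 pi) and 0 < delta < |xi| - 2 pi rho
f, norm_xi = make_f(xi, rho, delta)

h = 1e-6
numeric_slope = (f(h) - f(-h)) / (2.0 * h)   # central difference at t = 0
closed_form = 2.0 * norm_xi * (-2.0 * math.pi * rho + norm_xi - delta)
```

With these sample values f vanishes at t = 0, and the finite-difference slope agrees with the closed form 2|ξ|(−2πρ + |ξ| − δ) up to discretization error. With the opposite sign convention for the boost the two values would differ, which is why the convention is listed as an assumption.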

Fig. 1. The double cone P in the proof of Thm. 2.2 (ii)

As the set ⋃_{0≤t≤ε} (Λ1(−2πt)O + tξ) is bounded, there is a b ∈ R^{1+s} such that

(3) Λ1(−2πt)O + tξ ⊂ b − V+ for all t ∈ [0, ε].

Now denote P := (a + V+) ∩ (b − V+) (Fig. 1), choose A ∈ A(O) and B ∈ A(P), denote by e0 the unit vector in the time direction, and consider the function g_{A,B} defined by

R^2 ∋ (t, s) → g_{A,B}(t, s) := ⟨Ω, [B, U(se0) Δ^{it}_{W1} A Δ^{−it}_{W1} U(−se0)] Ω⟩.

By Conditions (1) and (3), this function vanishes in the closure of the open triangle γ with corners (0, 0), (ε, 0) and (ε, −δε) (Fig. 2). Clearly, γ contains a smooth curve that joins (0, 0) to (ε, −δε) and that has tangent vectors in the cone C := {(t, s) ∈ R^2 : t > 0, s < 0}. It will be shown that by the double cone theorem, g_{A,B} vanishes in the whole open rectangle ]0, ε[ × ]−δε, 0[. Since g_{A,B} is continuous, it follows that it even vanishes in the closed rectangle [0, ε] × [−δε, 0]. Since B ∈ A(P) and A ∈ A(O) are arbitrary, Lemma 3.9 implies that A(O − δεe0) ⊂ A(P)′. But since by Condition (2), the double cone O − δεe0 cannot be contained in P no matter how small δε is, this is in conflict with Lemma 3.1, so it follows that ξ = 0, which completes the proof.

Fig. 2. Where gA,B vanishes in the proof of Thm. 2.2 (ii) (axes: t and s)

It remains to be shown that the function gA,B fulfills the assumptions of the double cone theorem. To this end, first note that gA,B = , BU (se0 )it A − , A−it U (−se0 )B = , BU (se0 )it A − , B ∗ U (se0 )it A∗ =: g+ (t, s) − g− (t, s). Using elementary arguments from spectral theory it can be shown that given any ρ > 0, any vector φ in the domain of ρ and any ψ ∈ H, the function R t → ψ, it φ has an extension to a function that is continuous on the strip {t ∈ C : −ρ ≤ Im t ≤ 0} and analytic on the interior of this strip (cf. [40], Lemma 8.1.10 (p. 351)). 1 As O ⊂ W1 , the vectors A and A∗ are in the domain of 2 , and it follows that for every ψ ∈ H, the functions R t → ψ, it A and R t → ψ, it A∗ have extensions that are continuous in the strips {t ∈ C : − 21 ≤ Im t ≤ 0} and {t ∈ C : 0 ≤ Im ≤ 21 }, respectively, and that are analytic in the interior of these strips. On the other hand, it follows from the spectrum condition that for any two vectors φ, ψ ∈ H, the functions R s → ψ, U (se0 )φ and R s → ψ, U (se0 )φ have extensions that are continuous in the (complex) closed upper and lower half plane, respectively, and analytic in the interior of these half planes. This proves that the function g+ has a continuous extension to the tube T+ := {(t, s) ∈ C2 : −1/2 ≤ Im t ≤ 0, Im s ≥ 0} and that at every interior point of this strip, this extension is analytic separately in t and in s. Using Hartogs’ fundamental theorem stating that a function of several complex variables is holomorphic if and only if it is holomorphic separately in each of these variables [33, 51], it follows that g+ , as a function in two complex variables, is analytic in the interior of T+ . It follows in the same way that g− has the corresponding properties for the tube −T+ =: T− . The tubes T+ and T− contain the smaller tubes R2 − iC ∩ B 1 and R2 + iC ∩ B 1 . 2

2


Since g+ and g− coincide as continuous functions in the closure of γ , they coincide as distributions in the open region γ , and it follows from the edge of the wedge theorem that they are branches of a function g that is analytic in a complex neighbourhood . of γ . But since γ contains a smooth curve joining the points (0, 0) and (ε, −δε) with tangent vectors in C, it follows from the double cone theorem that the function g is analytic in the region ((0, 0) + C) ∩ ((ε, −δε) − C) =]0, ε[ × ] − δε, 0[. This implies that gA,B vanishes in this region, which is all that remained to be shown, so the proof is complete. Proof of Corollary 2.3. If J+ or it+ behave the way assumed in (i) or (ii), respectively, the commutation relations recalled in the remark preceding the corollary, together with Lemma 2.1, imply that its geometrical action can differ from the stated symmetry at most by a translation. Since V+ is Lorentz-invariant, J+ and it+ , t ∈ R, commute with ↑ all U (g), g ∈ L+ . However, there are no nontrivial translations that commute with all ↑ g ∈ L+ . Proof of Lemma 2.4. It follows from the Tomita–Takesaki Theorem that the modular group under consideration leaves the algebras A(W1 ) and A(W1 ) invariant. By wedge duality, it also leaves the algebra A(W1 ) = A(−W1 ) invariant. Borchers’commutation relations now imply −iε iε W1 A(a ± W1 ) W1 = A(1 (−2π ε)a ± W1 ) .

L(A) + W1 is a union of translates of W1 , so (L(A) + W1 )c , being an intersection of translates of −W 1 , is a translate of −W 1 . It follows that (L(A) + W1 )cc is a translate of W 1 . In particular, (L(A) + W1 )cc = {a + W 1 : a ∈ R1+s , (L(A) + W1 )cc ⊂ a + W1 }. But if a ∈ R1+s is chosen such that (L(A) + W1 )cc ⊂ a + W1 , Lemma 3.1 above and wedge duality imply A ∈ A(a + W1 ) = A(a + W1 ) , so one finds {a + W 1 : a ∈ R1+s , A ∈ A(a + W1 ) } ⊂ (L(A) + W1 )cc , and one concludes

−iε {a + W 1 : a ∈ R1+s , iε W1 AW1 ∈ A(a + W1 ) } iε = {a + W 1 : a ∈ R1+s , A ∈ −iε W1 A(a + W1 ) W1 } = {a + W 1 : a ∈ R1+s , A ∈ A(1 (2π ε)a + W1 ) } = 1 (−2πε) {a + W 1 : a ∈ R1+s , A ∈ A(a + W1 ) }

L(Aε ) ⊂

⊂ 1 (−2πt)(L(A) + W1 )cc . The proof that L(Aε ) ⊂ 1 (−2π t)(L(A) − W1 )cc is completely analogous, so the proof of (i) is complete.


It remains to prove (ii) and (iii). We prove (iii); (ii) can be established along precisely the same line of argument by replacing itW1 by −it W1 and by exchanging, respectively, V+ and −V+ , A and Aε with one another. Due to Borchers’ commutation relations it suffices to consider A ∈ A(W1 ) , which, as in the proof of Theorem 2.2 (ii), will ensure that A ∈ D(1/2 ) in the following argument. Assume that L(A) ⊂ L(Aε ) + V + . Then one finds an a ∈ R1+s such that (1) L(Aε ) ⊂ a + V+ , while (2) L(A) ⊂ a + V+ . This can be seen as follows. The assumption that L(A) ⊂ L(Aε ) + V + and Statement (i) just proved imply that there is a double cone O ⊂ L(A) such that O and L(Aε ) are spacelike separated, so there is a double cone P ⊃ L(Aε ) such that O and P are spacelike separated (cf., e.g., Prop. 3.8 (b) in [47]); choosing a to be the lower tip of P , one arrives at both Conditions (1) and Condition (2). By Condition (1), L(Aε ) is a compact subset of the open set a + V+ , and as L(At ) depends continuously on t by assumption, there exist σ 7 > 0 and δ > 0 such that (1’) L(At ) − σ 7 e0 ⊂ a + V+

for all t ∈ [ε − δ, ε],

and this condition is, of course, equivalent to Condition (1). Since L(At ) depends continuously on t ∈ [0, ε], the set 0≤t≤ε L(At ) is bounded, so one finds a σ 8 ≥ 0 such that (3) L(At ) + σ 8 e0 ⊂ a + V+ for all t ∈ [0, ε], and for the same reason there is a b ∈ R1+s such that (4) L(At ) + 2σ 8 e0 ⊂ b − V+ for all t ∈ [0, ε]. Now define P := (a + V+ ) ∩ (b − V+ ), and for any B ∈ A(P ), consider – as in the proof of Proposition 2.2 – the function gA,B defined by R2 (t, s) → gA,B (t, s) := , [B, U (se0 )At U (−se0 )] . Locality and Conditions (3) and (4) imply that this function vanishes in the rectangle [0, ε] × [σ 8 , 2σ 8 ], and Condition (1’) implies that it also vanishes in the rectangle [ε −δ, ε]×[−σ 7 , σ 8 ]. By the double cone theorem, gA,B vanishes throughout the whole rectangle [0, ε] × [−σ 7 , 2σ 8 ] (Fig. 3). In particular, one obtains gA,B (0, −σ 7 ) = 0 for all B ∈ A(P ), so one can use Lemma 3.9 to conclude that A ∈ A(σ 7 e0 + P ) . By the definition of L(A), one finds L(A) − σ 7 e0 ⊂ P ⊂ a + V + , and as σ 7 > 0, this implies L(A) ⊂ a + V+ , which is in conflict with Condition (2) above and completes the proof. Proof of Theorem 2.5. Fix any ρ > 0, and define the double cones O1 := (ρ(2e1 + e0 ) + V+ ) ∩ (ρ(2e1 + 2e0 ) − V+ ), O2 := (ρ(2e1 − 2e0 ) + V+ ) ∩ (ρ(2e1 + 2e0 ) − V+ ), and O3 := (ρ(2e1 − 3e0 ) + V+ ) ∩ (ρ(2e1 + 3e0 ) − V+ ), (Fig. 4) and choose A ∈ A(O1 ). As L(A) ⊂ O1 , it follows from Lemma 2.4 (i) and (ii)

Fig. 3. Where gA,B vanishes in the proof of Lemma 2.4 (axes: t and s)

that

L(A_t) ⊂ (Λ1(−2πt)ρ((3/2)e1 + (3/2)e0) + W̄1) ∩ (Λ1(−2πt)ρ((5/2)e1 + (3/2)e0) − W̄1) ∩ (ρ(2e1 + 2e0) − V̄+) =: R_t,

and there is an ε > 0 such that

R_t ⊂ O2 for all t ∈ [0, ε].

Note that by the linearity of the Lorentz boosts, ε does not depend on ρ. One now has L(A_t) ⊂ O2 for all A ∈ A(O1), and with Corollary 5.4 in [39], it follows that

Δ^{it}_{W1} A(O1) Δ^{−it}_{W1} ⊂ A(O3) for all t ∈ [0, ε].

Using Borchers’ commutation relations, one finds

Δ^{it}_{W1} A(a + O1) Δ^{−it}_{W1} ⊂ A(Λ1(−2πt)a + O3)

for all a ∈ R^{1+s} and all t ∈ [0, ε]. Defining x := ρ(2e1 + e0), P1 := O1 − x, and P3 := O3 − x, one obtains

Δ^{it}_{W1} A(a + P1) Δ^{−it}_{W1} ⊂ A(Λ1(−2πt)a + (x − Λ1(−2πt)x) + P3).

Fig. 4. The double cones O1, O2, and O3 in the proof of Thm. 2.5 (axes: x0 and x1; W1 drawn with dashed lines)

Note that the euclidean length of the vector x − 1 (−2π t)x is ≤ 3ρ for all t ∈ [0, ε], as 1 (−2πt)x ∈ Rt ⊂ O2 by the above choice of ε. Now choose any wedge W ∈ W. As W ⊂ W + P1 , it follows from wedge additivity that

A(W ) ⊂

A(a + P1 )

.

a∈W

Define, for δ > 0, the wedges W (δ) := Bδ (W ) , where Bδ (W ) denotes the euclidean δ-ball around W , and W (−δ) := ((W )(δ) ) , then it follows from isotony and wedge duality that

A(a

+ P3 )

⊂ A(W (4ρ) ) ,

a∈W

and as the euclidean length of the vector (1 (−2π t)x − x) is ≤ 3ρ, one arrives at

a∈W

A(a + (x

− 1 (−2π t)x) + P3 )

⊂ A(W (7ρ) ) .


For t ∈ [0, ε], one now obtains

itW1 A(1 (2π t)W ) −it W1 ⊂

a∈1 (2πt)W

⊂

itW1 A(a + P1 )−it W1

A(1 (−2π t)a + (x − 1 (−2π t)x) + P3 )

a∈1 (2πt)W

⊂ A(W (7ρ) ) , and as W = (W (−7ρ) )(7ρ) , this can be rewritten it (−7ρ) ) ). −it W1 A(W ) W1 ⊃ A(1 (2π t)W

Using the fact that the transformations 1 (2π t) are linear and, hence, bounded maps in R1+s , which map the euclidean 7ρ-ball onto some bounded set with radius proportional to ρ, and using the facts that this radius continuously depends on t ∈ [0, ε], that the interval [0, ε] is compact, and that ε does not depend on the choice of ρ, one concludes that there is an M > 0 which is independent from ρ and satisfies 1 (2πt)W (−7ρ) ⊃ (1 (2π t)W )(−Mρ)

for all t ∈ [0, ε],

so with the above specifications of ε and M, one obtains it (−Mρ) ) −it W1 A(W ) W1 ⊃ A((1 (2π t)W )

for all wedges W ∈ W and all ρ > 0. For each A ∈ Aloc , one now concludes {W : W ∈ W, itW1 A−it L(At ) = W1 ∈ A(W ) } it = {W : W ∈ W, A ∈ −it W1 A(W ) W1 } ⊂ {W : W ∈ W, A ∈ A((1 (2π t)W )(−Mρ) ) } ρ>0

=

{1 (−2π t)X : X ∈ W, A ∈ A(X (−Mρ) ) }

ρ>0

= 1 (−2πt)

{X : X ∈ W, A ∈ A(X (−Mρ) ) }

ρ>0

= 1 (−2πt)

{X (Mρ) : X ∈ W, A ∈ A(X) }

ρ>0

= 1 (−2π t)L(A). To prove the converse inclusion, one proves L(At ) ⊂ 1 (−2π t) for t ∈ [−ε, 0] by mimicking the above argument: one defines the double cone O1 := ρ(2e1 − 2e0 ) + V+ ) ∩ (ρ(2e1 − e0 ) − V+ ), keeps O2 and O3 as before, defines x := ρ(2e1 − e0 ) and proceeds like above with t ∈ [−ε, 0], using Lemma 2.4 (iii) instead of Part (ii) of the same lemma. Now having proved L(At ) ⊂ 1 (−2π t)L(A) for all t ∈ [−ε, ε] and for all A ∈ Aloc , one concludes L(At ) = 1 (−2π t)L(A) for all t ∈ [−ε, ε] and for all A ∈ Aloc . As this immediately implies the statement for all t ∈ R and all A ∈ Aloc , the proof is complete.


4. Conclusion By the above results, the modular group of a theory that does not exhibit the Unruh effect acts in a completely “non-geometric” fashion, in the sense that it can neither preserve the net structure nor act on the local observables in such a way that localization regions evolve continuously. In particular, it cannot implement any equilibrium dynamics in this case. The above results imply that the only observer who can possibly experience the vacuum in thermodynamical equilibrium is the uniformly accelerated one (whose acceleration may, of course, be zero). Physically, this result reflects the fact that any nonuniformly accelerated observer would feel nonstationary inertial forces destroying any thermodynamical equilibrium, while the constant acceleration felt by a uniformly accelerated observer does not affect thermodynamical equilibrium provided the theory exhibits the Unruh effect. The first results similar to the above ones have been obtained by Araki and by Keyl [4, 35]. These authors avoid the spectrum condition and assume stronger a priori restrictions on the possible geometric behaviour instead. Recently, more results in this spirit have been found by Buchholz et al. and by Trebels [21, 27, 29, 48]. One aim of these approaches is to obtain new insight on quantum fields on curved spacetimes by avoiding the spectrum condition. So far, results have been obtained for de Sitter, Anti-de Sitter, and certain Robertson–Walker spacetimes [21, 22, 24]. For the vacuum states in Minkowski space considered above, the spectrum condition is a reasonable physical assumption. The assumptions made above on the possible geometric behaviour of the modular objects (in particular those made in the first uniqueness theorem) are less restrictive than those made in any of the other approaches, since a small class of regions, namely, the double cones, is assumed to be mapped into an extremely large class of regions, namely, the open sets. 
In this sense the above results are, at present, the most general uniqueness results in Minkowski space that point towards the Unruh effect and modular P1CT-symmetry. Even more than a uniqueness result can be found if conformal symmetry holds in addition to our above Conditions (A) through (C). In this case, the whole representation of the conformal group arises from the modular objects of the theory, and in particular, the Bisognano–Wichmann symmetries can be established [16].

Appendix. A Remark on the Continuity of t → L(A_t)

In the discussion of the second uniqueness theorem it was assumed that L(A_t) depends continuously on t for t ∈ [0, ε] in the sense that for each sequence (t_ν)ν∈N tending to a t_∞ ∈ [0, ε], the localization region L(A_{t_∞}) consists precisely of all accumulation points of sequences (x_ν)ν∈N with x_ν ∈ L(A_{t_ν}). In this appendix we show that this notion of convergence, which we refer to as pointwise convergence, is equivalent to the convergence according to a metric first considered by Hausdorff, which one can introduce on the set C of compact convex subsets of R^{1+s} by defining, for any two such sets K, L ∈ C,

δH(K, L) := inf{δ > 0 : K ⊂ Bδ(L) and L ⊂ Bδ(K)}

(cf. Problem 4D (p. 131) in [34]). It is evident that continuity of [0, ε] ∋ t → L(A_t) with respect to this metric, which we refer to as uniform continuity, implies the pointwise continuity for this map. Conversely, one can also show that pointwise continuity implies uniform continuity for t → L(A_t).

To prove this indirectly, assume that t → L(A_t) is pointwise continuous for t ∈ [0, ε] and that this map is not continuous with respect to Hausdorff’s metric. Then there exists a ρ > 0 and a sequence (t_ν)ν∈N of points in [0, ε] which converges to a point t_∞ ∈ [0, ε] and has the property that δH(L(A_{t_ν}), L(A_{t_∞})) ≥ ρ. On the other hand, there is a subsequence (s_ν)ν∈N of (t_ν)ν∈N with the property that all L(A_{s_ν}) have nonempty intersection with Bρ(L(A_{t_∞})), as otherwise L(A_{t_∞}) would be empty by the assumption of pointwise continuity. As δH(L(A_{s_ν}), L(A_{t_∞})) ≥ ρ, there exists a sequence (x_ν)ν∈N such that the euclidean distance δ(x_ν, L(A_{t_∞})) between x_ν and L(A_{t_∞}) is ≥ ρ/2 for all ν ∈ N, and as all L(A_{s_ν}) are convex sets with a nonempty intersection with Bρ(L(A_{t_∞})), this sequence can be chosen such that it is bounded and, hence, has an accumulation point x̃. As δ(x_ν, L(A_{t_∞})) ≥ ρ/2 for all ν ∈ N, one finds δ(x̃, L(A_{t_∞})) ≥ ρ/2, so x̃ ∉ L(A_{t_∞}). But this contradicts the assumption that t → L(A_t) is pointwise continuous and proves that this map is pointwise continuous if and only if it is uniformly continuous, as stated.

It is now easy to see that ⋃_{t∈[0,ε]} L(A_t) is bounded, as stated in the text. Namely, the function [0, ε] ∋ t → δH(L(A), L(A_t)) is continuous and, hence, has a maximum ρ > 0 in the compact interval [0, ε]. It follows that ⋃_{t∈[0,ε]} L(A_t) ⊂ Bρ(L(A)), which is a bounded set.

Acknowledgements. It was an important help that D. Arlt and N. P. Landsman read the manuscript carefully. This research was funded by the Deutsche Forschungsgemeinschaft, a Feodor–Lynen grant of the Alexander von Humboldt foundation, and a Hendrik Casimir–Karl Ziegler award of the Nordrhein-Westfälische Akademie der Wissenschaften.
The idea to reinitiate the project originated during a stay in 1997 at the Erwin-Schrödinger Institute for Mathematical Physics at Vienna. Helpful discussions there with S. Trebels and D. Guido are gratefully acknowledged.

References

1. Alexandrov, A. D.: On Lorentz transformations. Uspekhi Mat. Nauk. 5 No. 3 (37), 187 (1950)
2. Alexandrov, A. D.: Mappings of Spaces with Families of Cones and Space-Time Transformations. Annali di matematica 103, 229–257 (1975)
3. Alexandrov, A. D., Ovchinnikova, V. V.: Notes on the foundations of relativity theory. Vestnik Leningrad Univ. 14, 95 (1953)
4. Araki, H.: Symmetries in a Theory of Local Observables and the Choice of the Net of Local Algebras. Rev. Math. Phys. Special Issue, 1–14 (1992)
5. Araki, H.: Mathematical Theory of Quantum Fields. Oxford: Oxford University Press, 1999
6. Baumgärtel, H., Wollenberg, M.: Causal Nets of Operator Algebras. Berlin: Akademie-Verlag, 1992
7. Bisognano, J. J., Wichmann, E. H.: On the Duality Condition for a Hermitian Scalar Field. J. Math. Phys. 16, 985–1007 (1975)
8. Bisognano, J. J., Wichmann, E. H.: On the Duality Condition for Quantum Fields. J. Math. Phys. 17, 303 (1976)
9. Borchers, H.-J.: Über die Vollständigkeit lorentzinvarianter Felder in einer zeitartigen Röhre. Nuovo Cimento 19, 787–796 (1961)
10. Borchers, H.-J.: On the Vacuum State in Quantum Field Theory, II. Commun. Math. Phys. 1, 57 (1965)
11. Borchers, H.-J.: The CPT-Theorem in Two-Dimensional Theories of Local Observables. Commun. Math. Phys. 143, 315–332 (1992)
12. Borchers, H.-J.: Translation Group and Particle Representations in Quantum Field Theory. Berlin–Heidelberg: Springer, 1996
13. Borchers, H.-J.: On Poincaré transformations and the modular group of the algebra associated with a wedge. Lett. Math. Phys. 46, 295–301 (1998)
14. Borchers, H.-J.: On the Revolutionization of Quantum Field Theory by Tomita’s Modular Theory. J. Math. Phys. 41, 3604–3673 (2000)
15. Borchers, H.-J., Hegerfeldt, G. C.: The Structure of Space-Time Transformations. Commun. Math. Phys. 28, 259–266 (1972)
16. Brunetti, R., Guido, D., Longo, R.: Modular Structure and Duality in Conformal Quantum Field Theory. Commun. Math. Phys. 156, 201–219 (1993)
17. Buchholz, D.: Collision Theory for Massless Fermions. Commun. Math. Phys. 42, 269–279 (1975)
18. Buchholz, D.: Collision Theory for Waves in Two Dimensions and a Characterization of Models with Trivial S-Matrix. Commun. Math. Phys. 45, 1–8 (1975)
19. Buchholz, D.: Collision Theory of Massless Bosons. Commun. Math. Phys. 52, 147–173 (1977)
20. Buchholz, D.: On the Structure of Local Quantum Fields with Non-Trivial Interaction. In: Proceedings of the International Conference on Operator Algebras, Ideals and Their Applications in Theoretical Physics, Leipzig, 1977. Stuttgart: Teubner, 1978
21. Buchholz, D., Dreyer, O., Florig, M., Summers, S. J.: Geometric Modular Action and spacetime Symmetry Groups. Rev. Math. Phys. 12, 475–560 (2000)
22. Buchholz, D., Florig, M., Summers, S. J.: Hawking–Unruh Temperature and Einstein Causality in Anti-de Sitter Space-Time. Class. Quant. Grav. 17, L31–L37 (2000)
23. Buchholz, D., Fredenhagen, K.: Dilations and interaction. J. Math. Phys. 18, 1107–1111 (1977)
24. Buchholz, D., Mund, J., Summers, S. J.: Transplantation of Local Nets and Geometric Modular Action on Robertson–Walker Space-Times. Preprint, hep-th/0011237
25. Buchholz, D., Summers, S. J.: An Algebraic Characterization of Vacuum States in Minkowski Space. Commun. Math. Phys. 155, 449–458 (1993)
26. Davidson, D. R.: Modular Covariance and the Algebraic PCT/Spin-Statistics Theorem. Preprint, hep-th/9511216
27. Dreyer, O.: Das Prinzip der geometrischen modularen Wirkung im de Sitter-Raum. Diploma thesis, University of Hamburg, 1996
28. Florig, M.: On Borchers’ Theorem. Lett. Math. Phys. 46, 289–293 (1998)
29. Florig, M.: Geometric Modular Action. PhD-thesis, University of Florida, Gainesville, 1999
30. Guido, D., Longo, R.: An Algebraic Spin and Statistics Theorem. Commun. Math. Phys. 172, 517–534 (1995)
31. Guido, D., Longo, R.: The Conformal Spin and Statistics Theorem. Commun. Math. Phys. 181, 11–36 (1996)
32. Haag, R.: Local Quantum Physics. Berlin: Springer, 1992
33. Hartogs, F.: Zur Theorie der Funktionen mehrerer komplexer Veränderlicher, insbesondere über die Darstellung derselben durch Reihen, welche nach Potenzen einer Veränderlichen fortschreiten. Math. Ann. 62, 1–88 (1906)
34. Kelley, J. L.: General Topology. New York: van Nostrand, 1955
35. Keyl, M.: Remarks on the relation between causality and quantum fields. Class. Quantum Grav. 10, 2353–2362 (1993)
36. Kuckert, B.: A New Approach to Spin & Statistics. Lett. Math. Phys. 35, 319–335 (1995)
37. Kuckert, B.: Borchers’ Commutation Relations and Modular Symmetries in Quantum Field Theory. Lett. Math. Phys. 41, 307–320 (1997)
38. Kuckert, B.: Spin & Statistics, Localization Regions, and Modular Symmetries in Quantum Field Theory. PhD-thesis, Hamburg 1998, DESY-thesis 1998-026
39. Kuckert, B.: Localization Regions of Local Observables. Commun. Math. Phys. 215, 197–216 (2000)
40. Li Bing-Ren: Introduction to Operator Algebras. Singapore: World Scientific, 1992
41. Longo, R.: On the spin-statistics relation for topological charges. In: Doplicher, S., Longo, R., Roberts, J. E., Zsido, L. (eds.): Operator Algebras and Quantum Field Theory. Proceedings of the conference at the Accademia Nazionale dei Lincei, Rome 1996. Cambridge, MA: International Press, 1997
42. Mack, G., Salam, A.: Finite-Component Field Representations of the Conformal Group. Ann. Phys. 53, 174–202 (1969)
43. Mund, J.: Quantum Field Theory of Particles with Braid Group Statistics in 2+1 dimensions. PhD-thesis, Freie Universität Berlin, 1998
44. Reeh, H., Schlieder, S.: Bemerkungen zur Unitäräquivalenz von lorentzinvarianten Feldern. Nuovo Cimento 22, 1051 (1961)
45. Streater, R. F., Wightman, A. S.: PCT, Spin & Statistics, and All That. New York: Benjamin, 1964
46. Takesaki, M.: Tomita’s Theory of Modular Hilbert Algebras and Its Applications. Lecture Notes in Mathematics 128, New York: Springer, 1970
47. Thomas, L. J., Wichmann, E. H.: Standard forms of local nets in quantum field theory. J. Math. Phys. 39, 2643–2681 (1998)
48. Trebels, S.: PhD-thesis. Göttingen 1997, cf. also [14]
49. Unruh, W. G.: Notes on black hole evaporation. Phys. Rev. D 14, 870–892 (1976)
50. Vladimirov, V. S.: The construction of envelopes of holomorphy for domains of a special type. (in Russian) Doklady Akad. Nauk SSSR 134, 251–254 (1960)
51. Vladimirov, V. S.: Methods of the Theory of Functions of Many Complex Variables. Cambridge, MA: M. I. T. Press, 1966
52. Wiesbrock, H.-W.: A Comment on a Recent Work of Borchers. Lett. Math. Phys. 25, 157–159 (1992)
53. Yngvason, J.: A Note on Essential Duality. Lett. Math. Phys. 31, 127–141 (1994)
54. Zeeman, E. C.: Causality Implies the Lorentz Group. J. Math. Phys. 5, 490–493 (1964)

Communicated by H. Araki

Commun. Math. Phys. 221, 101 – 140 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Renormalization Group and the Melnikov Problem for PDE’s

Jean Bricmont1, Antti Kupiainen2, Alain Schenkel2

1 UCL, FYMA, 2 chemin du Cyclotron, 1348 Louvain-la-Neuve, Belgium
2 Department of Mathematics, Helsinki University, P.O. Box 4, 00014 Helsinki, Finland

Partially supported by ESF/PRODYN.
Partially supported by EC grant FMRX-CT98-0175.

Received: 29 January 2001 / Accepted: 8 March 2001

Abstract: We give a new proof of persistence of quasi-periodic, low dimensional elliptic tori in infinite dimensional systems. The proof is based on a renormalization group iteration that was developed recently in [BGK] to address the standard KAM problem, namely, persistence of invariant tori of maximal dimension in finite dimensional, near integrable systems. Our result covers situations in which the so-called normal frequencies are multiple. In particular, it provides a new proof of the existence of small-amplitude, quasi-periodic solutions of nonlinear wave equations with periodic boundary conditions.

1. Introduction

In this paper, we address the persistence problem of quasi-periodic, low dimensional, elliptic tori in infinite dimensional systems. A typical example that we will consider is the nonlinear wave equation (NLW) on a bounded interval,

$$\partial_t^2 u = \partial_x^2 u - V u - f(u), \tag{1.1}$$

with Dirichlet or periodic boundary conditions and $f(u) = O(u^3)$. The first results concerning the Melnikov problem (i.e., the persistence of elliptic invariant tori of dimension lower than the number of degrees of freedom, [M, E]) for infinite dimensional Hamiltonian systems were obtained independently by Kuksin, Pöschel and Wayne, [K2, P1, W]. In particular, existence of quasi-periodic solutions of (1.1) was shown in [K1, W]. Based on the Kolmogorov–Arnold–Moser (KAM) approach, these results were restricted to Dirichlet or Neumann boundary conditions and to specific classes of adjustable potentials $V$, excluding, in particular, arbitrary constant potentials. This latter case was covered in [BK] by using the sine-Gordon PDE as the unperturbed integrable system, and, following a different approach, in [P2]. In [P2], the existence of a Birkhoff normal form for the Hamiltonian of (1.1) is exploited in order to control the torus frequencies via amplitude-frequency modulation, and therefore to dispense with outer parameters provided by an adjustable potential. This approach was applied in [KP] to the persistence of quasi-periodic solutions for the nonlinear Schrödinger equation (NLS) subject to Dirichlet (or Neumann) boundary conditions.

The case of periodic boundary conditions is more delicate due to the fact that the eigenvalues of the Sturm-Liouville operator $L = -d^2/dx^2 + V$ are degenerate. This leads to resonances between pairs of frequencies corresponding to motion in directions normal to the torus (the so-called normal frequencies). These additional resonances prevent one from controlling quadratic terms in the Hamiltonian of the system. (This difficulty also appears in finite-dimensional Melnikov situations.) Developing new techniques based on the Lyapunov-Schmidt method, Craig and Wayne proved in [CW] persistence of periodic solutions of the NLW with periodic boundary conditions. Later, their approach was significantly improved by Bourgain in [B1-2] who constructed quasi-periodic solutions of the NLW and NLS with periodic boundary conditions. Most notably, it is shown in [B2] that solutions of this type can be constructed, in particular, for the NLS on two-dimensional domains.

The usual Melnikov nonresonance condition reads, with $\omega \in \mathbb R^d$ and $\mu \in \mathbb R^n$ denoting the torus and, respectively, the normal frequencies ($n$ is possibly infinite),

$$\langle k, \omega\rangle + \langle l, \mu\rangle \neq 0, \qquad k \in \mathbb Z^d,\ l \in \mathbb Z^n \text{ with } |k| + |l| \neq 0,\ |l| \leq 2. \tag{1.2}$$
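To make the role of the index restrictions concrete, the following is a minimal sketch (not taken from the paper) that scans a finite window of integer vectors and reports the smallest divisor $|\langle k, \omega\rangle + \langle l, \mu\rangle|$. The function names, the cutoff `k_max`, the use of the $\ell^1$ norm for $|k|$ and $|l|$, and the sample frequencies are all assumptions made for illustration. With a doubly degenerate normal frequency, the pair $k = 0$, $l = e_1 - e_2$ produces an exact resonance, which is the difficulty with periodic boundary conditions described above; it disappears once $k \neq 0$ is required.

```python
import itertools
import math


def integer_vectors_l1_at_most_2(n):
    """All l in Z^n with |l|_1 <= 2: zero, +-e_i, and +-e_i +- e_j."""
    vectors = {tuple([0] * n)}
    for i in range(n):
        for si in (1, -1):
            e = [0] * n
            e[i] = si
            vectors.add(tuple(e))
            for j in range(n):
                for sj in (1, -1):
                    f = list(e)
                    f[j] += sj
                    vectors.add(tuple(f))
    return vectors


def smallest_divisor(omega, mu, k_max, allow_k_zero=True):
    """Smallest |<k,omega> + <l,mu>| over |k|_1 <= k_max and |l|_1 <= 2.

    The pair (k, l) = (0, 0) is always skipped. Setting allow_k_zero=False
    additionally skips every pair with k = 0 (hypothetical switch).
    """
    ls = integer_vectors_l1_at_most_2(len(mu))
    best = math.inf
    for k in itertools.product(range(-k_max, k_max + 1), repeat=len(omega)):
        if sum(abs(c) for c in k) > k_max:
            continue
        if not allow_k_zero and not any(k):
            continue
        for l in ls:
            if not any(k) and not any(l):
                continue
            value = sum(a * b for a, b in zip(k, omega))
            value += sum(a * b for a, b in zip(l, mu))
            best = min(best, abs(value))
    return best


# Illustrative frequencies: two torus frequencies, one double normal frequency.
omega = [1.0, math.sqrt(2.0)]
mu = [math.sqrt(3.0), math.sqrt(3.0)]
print(smallest_divisor(omega, mu, 3))                      # k = 0 allowed
print(smallest_divisor(omega, mu, 3, allow_k_zero=False))  # k = 0 excluded
```

A finite scan of this kind can only exhibit resonances; it cannot certify a nonresonance condition, which concerns all integer vectors.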

In Bourgain’s approach and at the price of a considerable technical effort, condition (1.2) is reduced to

$$\langle k, \omega\rangle + \mu_s \neq 0, \qquad k \in \mathbb Z^d,\ s = 1, \dots, n,$$

i.e., all nonresonance conditions on pairs of normal frequencies are absent. More recently, Chierchia and You, see [Y, CY], showed that persistence of quasi-periodic solutions of the NLW with periodic boundary conditions is tractable by KAM techniques. Their nonresonance condition,

$$\langle k, \omega\rangle + \langle l, \mu\rangle \neq 0, \qquad k \in \mathbb Z^d \setminus \{0\},\ l \in \mathbb Z^n \text{ with } |l| \leq 2, \tag{1.3}$$

is stronger than Bourgain’s condition but weaker than (1.2). However, their result does not cover the case of constant potential.

In the present paper, we give a new proof of Bourgain’s result for the NLW with periodic boundary conditions in the case of constant potential. Our proof is based on a renormalization group procedure recently developed in [BGK] for standard KAM problems. The nonresonance condition that we will impose is the same as Chierchia and You’s condition, but our technique could in principle accommodate Bourgain’s condition.

In order to describe our result further, we start by specifying the infinite dimensional Hamiltonians we will consider. For $d_k$, $k \geq 1$, a sequence of strictly positive integers uniformly bounded by some $\bar d < \infty$, let $\mathbb R^\infty$ denote the set of infinite sequences $x = (x_1, x_2, \dots)$ with $x_k \in \mathbb R^{d_k}$. For an integer $d \geq 1$, let $P = \mathbb T^d \times \mathbb R^d \times \mathbb R^\infty \times \mathbb R^\infty$, where $\mathbb T^d$ is the torus $\mathbb R^d/(2\pi\mathbb Z^d)$. Denoting the coordinates in $P$ by $(\phi, I, x, y)$ and endowing $P$ with the symplectic structure $d\phi \wedge dI + dx \wedge dy$, we consider perturbations of integrable Hamiltonians of the form

$$H(\phi, I, x, y) = \omega \cdot I + \tfrac12 I \cdot gI + \tfrac12 \sum_{k\geq 1} \big(\mu_k^2 |x_k|^2 + |y_k|^2\big) + \lambda U(\phi, I, x), \tag{1.4}$$


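where $\mu_k \in \mathbb R$, $k \geq 1$, $\omega \in \mathbb R^d$, and $g$ is a real symmetric, invertible $d \times d$ matrix. Above, $|v|^2$ for $v \in \mathbb R^m$ denotes $\sum_{i=1}^m v_i^2$. The Hamiltonian flow generated by (1.4) is given by the equations of motion

$$\dot I = -\lambda \partial_\phi U, \qquad \dot\phi = \omega + gI + \lambda \partial_I U, \tag{1.5}$$

and

$$\ddot x_k = -\mu_k^2 x_k - \lambda \partial_{x_k} U. \tag{1.6}$$

For $\lambda = 0$ and the initial condition $I^0 = \phi^0 = x^0 = y^0 = 0$, the flow $\phi(t) = \omega t$, $I(t) = 0$, and $x(t) = 0$, is quasi-periodic and spans a $d$-dimensional torus in $\mathbb T^d \times \mathbb R^d \times \mathbb R^\infty \times \mathbb R^\infty$. In order to study the case for which the perturbation is turned on, we consider a quasi-periodic solution of the form $(\phi(t), I(t), x(t)) = (\omega t + \Theta(\omega t), J(\omega t), Z(\omega t))$. Then, (1.5) and (1.6) require that $T \equiv (\Theta, J, Z) : \mathbb T^d \to \mathbb R^d \times \mathbb R^d \times \mathbb R^\infty$ satisfies the equation

$$\mathcal D T(\varphi) = -\lambda \partial U(\varphi + \Theta(\varphi), J(\varphi), Z(\varphi)), \tag{1.7}$$

where $\partial = (\partial_\phi, \partial_I, \partial_x)$ and, setting $\mu \equiv \mathrm{diag}(\mu_1 1_{d_1}, \mu_2 1_{d_2}, \dots)$, together with

$$D \equiv \omega \cdot \partial_\phi, \tag{1.8}$$

$$\mathcal D = \begin{pmatrix} 0 & D & 0 \\ -D & g & 0 \\ 0 & 0 & D^2 + \mu^2 \end{pmatrix}. \tag{1.9}$$

Note that if $T$ is a solution of Eq. (1.7), then so is $T_\beta$ for $\beta \in \mathbb R^d$, where

$$T_\beta(\varphi) = T(\varphi - \beta) - (\beta, 0, 0). \tag{1.10}$$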

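We now state the two hypotheses under which we shall prove existence of a solution $T$ of Eq. (1.7), first introducing the following family of Banach spaces $\mathbb R^\infty_s$, $s \in \mathbb R$,

$$\mathbb R^\infty_s = \Big\{ Z \in \mathbb R^\infty \ \Big|\ |Z|_s \equiv \sum_{k\geq 1} k^s |Z_k|_{\mathbb R^{d_k}} < \infty \Big\}. \tag{1.11}$$

(H1) Asymptotics of eigenvalues. The sequence $\{\mu_k\}_{k\geq 1}$ satisfies $\mu_k > 0$ and $\mu_k \neq \mu_l$ for all $k \neq l \geq 1$, and there exist $\gamma \geq 1$ and $c > 0$ such that

$$\mu_k \geq c k^\gamma \quad \text{for all } k \geq 1. \tag{1.12}$$

Furthermore, if $\gamma > 1$ then

$$\mu_k - \mu_{k'} \geq c\,(k^\gamma - k'^\gamma) \quad \text{for all } k > k' \geq 1. \tag{1.13}$$

If $\gamma = 1$, then there exist constants $\xi > 0$ and $c_l > 0$ such that

$$\mu_k - \mu_{k'} = c_l\,(1 + O(k^{-\xi})) \quad \text{for all } k - k' = l \geq 1. \tag{1.14}$$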
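As a quick numerical illustration of (1.12) and (1.14) in the case $\gamma = 1$, the sketch below uses the sequence $\mu_k = \sqrt{k^2 + m}$, which is the form the frequencies take for the wave equation with constant potential. Taking $c = 1$ and $c_l = l$ is our reading of the hypotheses for this particular sequence, and the function names are hypothetical.

```python
import math


def mu(k, m):
    """Model normal frequency mu_k = sqrt(k^2 + m)."""
    return math.sqrt(k * k + m)


def gap_ratio(k, l, m):
    """(mu_k - mu_{k-l}) / l, expected to tend to 1 as k grows."""
    return (mu(k, m) - mu(k - l, m)) / l


m = 1.0
for k in (2, 10, 100, 1000):
    deviation = abs(gap_ratio(k, 1, m) - 1.0)
    # k * deviation staying bounded is consistent with xi = 1 in (1.14).
    print(k, deviation, k * deviation)
```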


(H2) Regularity of the perturbation. The map $(\phi, I, x) \mapsto U(\phi, I, x)$ is assumed to be real analytic in $\phi \in \mathbb T^d$ and real analytic in $I$ and $x$ in a neighborhood of the origin of $\mathbb R^d$ and $\mathbb R^\infty_0$. In addition, we assume that there exist an $s > 0$ and a $\xi > 0$ such that for some neighborhoods of the origin $O_I \subset \mathbb R^d$ and $O_x \subset \mathbb R^\infty_s$, the gradient $\partial_x U$ is bounded as a map from $\mathbb T^d \times O_I \times O_x$ to $\mathbb R^\infty_{s+\xi-\gamma}$. In the sequel, we will often use the short notation $s' \equiv s + \xi - \gamma$.

Theorem 1.1. Let $\{\mu_k\}$ satisfy (H1) and $U$ satisfy (H2). Then, there exists a set $\Omega^* = \Omega^*(U, \mu) \subset \mathbb R^d$ such that for $\omega \in \Omega^*$, Eq. (1.7) has a unique solution (up to translations (1.10)) which is real analytic in $\lambda$ and $\phi$ provided that $|\lambda|$ is small enough. Furthermore, for all bounded $\Omega \subset \mathbb R^d$ the set $\Omega^*$ of admissible frequencies satisfies $\mathrm{meas}(\Omega \setminus \Omega^*) \to 0$ as $\lambda \to 0$.

The proof of Theorem 1.1 is based on an inductive procedure developed in [BGK] for standard KAM problems. This renormalization group iteration can be viewed as an iterative resummation of the Lindstedt series, as is explained in more detail in [BGK], and was directly inspired by the quantum field theory analogy with KAM problems forcefully emphasized by Gallavotti et al. [G, GGM]. Melnikov type problems require dealing with the additional resonances arising from the normal frequencies $\mu_k$, and the goal of the present paper is to explain how the procedure of [BGK] can be applied in such cases. In contrast to standard KAM problems, the set $\Omega^*$ of admissible frequencies depends for Melnikov type problems on the perturbation $U$. In our approach, this dependence expresses itself by the fact that under iteration, the normal frequencies are renormalized in a $U$-dependent way and that the set $\Omega^*$ is defined according to the renormalized normal frequencies. As usual, the set $\Omega^*$ is constructed in such a way that nonresonance conditions are fulfilled in order for the inductive scheme to converge. Our scheme is technically simplified if one imposes the nonresonance condition of the form (1.3), i.e., conditions involving pairs of normal frequencies. Hypothesis (H1) ensures that $\Omega^*$ has large measure under these conditions, and hypothesis (H2) ensures that the asymptotic properties of the normal frequencies stated in (H1) are preserved under renormalization. The requirement $\xi > 0$ is needed both in (H1) when $\gamma = 1$, and, for $\gamma > 1$, in (H2) in order to cover the case of degenerate normal frequencies (more precisely the case where $d_k > 1$ for infinitely many $k$).

In Sect. 2, we show how Theorem 1.1 provides a proof of the existence of quasi-periodic solutions of the 1D NLW with periodic boundary conditions. In particular, $\gamma = 1$ in (H1) and we will see that (H2) is satisfied with $\xi = 1$. In contrast, one has for the 1D NLS $\gamma = 2$ and $\xi = 0$. Thus, the scheme presented here only applies to NLS with Dirichlet boundary conditions (namely $d_k = 1$ for all $k$) or to the persistence of periodic solutions of NLS (namely $d = 1$). In order to cover the other situations, one must be able to dispense with nonresonance conditions involving certain pairs of normal frequencies.

The remainder of the paper is organized as follows. Section 2 is devoted to the NLW. In Sect. 3 we explain the renormalization group scheme that will be used to prove Theorem 1.1. Section 4 is devoted to the definition of the spaces we will consider. In Sect. 5, we state some crucial inductive bounds, which will be shown to hold in Sect. 6. Section 7 is concerned with the measure estimate of $\Omega^*$, whereas the proof of Theorem 1.1 is carried out in Sect. 8. Finally, we have collected in the appendix some technical and intermediary results.


2. The 1D Wave Equation

In this section, we show how Theorem 1.1 implies the existence of small amplitude quasi-periodic solutions of nonlinear 1D wave equations of the form

$$\partial_t^2 u = \partial_x^2 u - mu - f(u), \qquad t > 0,\ x \in [0, 2\pi], \tag{2.1}$$

with periodic boundary conditions $u(0, t) = u(2\pi, t)$, $\partial_t u(0, t) = \partial_t u(2\pi, t)$. Here, $m > 0$ is a real parameter and $f$ is a real analytic function of the form $f(u) = u^3 + O(u^4)$. For $f \equiv 0$, Eq. (2.1) becomes

$$\partial_t^2 u = \partial_x^2 u - mu \equiv -Lu. \tag{2.2}$$

The operator $L$ with periodic boundary conditions admits a complete orthonormal basis of eigenfunctions $\psi_n \in L^2([0, 2\pi])$, $n \in \mathbb Z$, with corresponding eigenvalues

$$\zeta_n = n^2 + m, \tag{2.3}$$

if one sets $\psi_0 = 1/\sqrt{2\pi}$ and for $n \geq 1$,

$$\psi_n(x) = \frac{1}{\sqrt\pi} \cos(nx), \qquad \psi_{-n}(x) = \frac{1}{\sqrt\pi} \sin(nx). \tag{2.4}$$
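The basis (2.4) and the eigenvalues (2.3) can be checked numerically. The sketch below is an illustration only: the helper names, the grid size, and the finite-difference step are assumptions, and the second derivative is approximated rather than computed exactly.

```python
import math


def psi(n, x):
    """Eigenfunctions (2.4): constant for n = 0, cosines for n > 0, sines for n < 0."""
    if n == 0:
        return 1.0 / math.sqrt(2.0 * math.pi)
    if n > 0:
        return math.cos(n * x) / math.sqrt(math.pi)
    return math.sin(-n * x) / math.sqrt(math.pi)


def inner(i, j, samples=2048):
    """L^2([0, 2 pi]) inner product of psi_i and psi_j on a uniform periodic grid."""
    h = 2.0 * math.pi / samples
    return h * sum(psi(i, a * h) * psi(j, a * h) for a in range(samples))


def apply_L(n, x, m, h=1e-3):
    """(L psi_n)(x) = -psi_n''(x) + m psi_n(x), with a central second difference."""
    second = (psi(n, x + h) - 2.0 * psi(n, x) + psi(n, x - h)) / (h * h)
    return -second + m * psi(n, x)


m = 1.5
print(inner(2, 2), inner(2, -2), inner(1, 3))     # close to 1, 0, 0
print(apply_L(2, 0.7, m), (2 ** 2 + m) * psi(2, 0.7))  # the two values nearly agree
```

Note that $\psi_n$ and $\psi_{-n}$ share the eigenvalue $\zeta_n$, which is the degeneracy responsible for multiple normal frequencies.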

Every solution of the linear wave Eq. (2.2) can be written as a superposition of the basic modes $\psi_n$, namely, for $\mathcal I$ any subset of $\mathbb Z$ and $\mu_n \equiv \sqrt{\zeta_n}$,

$$u(x, t) = \sum_{n\in\mathcal I} a_n \cos(\mu_n t + \theta_n)\, \psi_n(x), \tag{2.5}$$

with amplitudes $a_n > 0$ and initial phases $\theta_n$. Regarding existence of solutions for the nonlinear wave equation (2.1), we will prove

Theorem 2.1. Let $1 \leq d < \infty$ and $\mathcal I = \{n_1, \dots, n_d\} \subset \mathbb Z$ satisfying $|n_i| \neq |n_j|$ for $i \neq j$. Then, for $\lambda > 0$ small enough there is a set $A \subset \{a = (a_1, \dots, a_d) \mid 0 < a_i < \lambda\}$ of positive measure such that for $a \in A$ Eq. (2.1) has a solution

$$u(x, t) = \sum_{i=1}^d a_i \cos(\mu'_{n_i} t + \theta_i)\, \psi_{n_i}(x) + O(|a|^3), \tag{2.6}$$

with frequencies $\mu'_{n_i} = \mu_{n_i} + O(|a|^2)$. Furthermore, the set $A$ is of asymptotically full measure as $|a| \to 0$.

As is well known, the nonlinear wave Eq. (2.1) can be studied as an infinite dimensional Hamiltonian system by taking the phase space to be the product of the Sobolev spaces $H^1_0([0, 2\pi]) \times L^2([0, 2\pi])$ with coordinates $u$ and $v = \partial_t u$. The Hamiltonian for (2.1) is then

$$H = \tfrac12 (v, v) + \tfrac12 (Lu, u) + \int_0^{2\pi} g(u)\, dx, \tag{2.7}$$

where $L = -d^2/dx^2 + m$, $g = \int f\, ds$, and $(\cdot, \cdot)$ denotes the usual scalar product in $L^2([0, 2\pi])$. In order to prove existence of solutions of type (2.6) by means of Theorem 1.1, we would like to write (2.7) in the form (1.4). This turns out to be possible, through amplitude-frequency modulation, due to the availability of a (partial) normal form theory for (2.7). As we shall see, the requirement for the parameter $m$ to be non zero is crucial for this part of the argument. In the sequel, we will closely follow the exposition of Pöschel in [P2].

Introducing the coordinates $q = (q_0, q_1, q_{-1}, \dots)$ and $p = (p_0, p_1, p_{-1}, \dots)$ by setting

$$u(x) = \sum_{n\in\mathbb Z} q_n \psi_n(x), \qquad v(x) = \sum_{n\in\mathbb Z} p_n \psi_n(x), \tag{2.8}$$

one rewrites the Hamiltonian (2.7) in the coordinates $(q, p)$,

$$H = \frac12 \sum_{n\in\mathbb Z} \big(\mu_n^2 q_n^2 + p_n^2\big) + G(q), \tag{2.9}$$

where

$$G(q) = \int_0^{2\pi} g\Big(\sum_{n\in\mathbb Z} q_n \psi_n(x)\Big)\, dx. \tag{2.10}$$

The Hamiltonian flow generated by (2.9) is given by the equations of motion

$$\ddot q_n = -\mu_n^2 q_n - \partial_{q_n} G(q), \tag{2.11}$$

and one can show that a solution $q$ of (2.11) yields a solution of the nonlinear wave Eq. (2.1) if $q$ has some decay properties. More precisely, defining $\ell^s_b$ to be the Banach space of all real valued bi-infinite sequences $w = (w_0, w_1, w_{-1}, \dots)$ with norm

$$\|w\|_s = \sum_{n\in\mathbb Z} [n]^s |w_n|,$$

where $[n] = \max(1, |n|)$, one has the

Lemma 2.2. Let $s \geq 2$. If a curve $I \to \ell^s_b$, $t \mapsto q(t)$, is a solution of (2.11), then

$$u(x, t) = \sum_{n\in\mathbb Z} q_n(t)\, \psi_n(x)$$

is a classical solution of (2.1).

For the proof of Lemma 2.2, see [CY]. Before turning to the normal form analysis of the Hamiltonian (2.9), we state a result concerning the regularity of the gradient $\partial_q G$.

Lemma 2.3. For all $s > 0$, the gradient $\partial_q G$ is real analytic as a map from some neighborhood of the origin in $\ell^s_b$ into $\ell^s_b$, with

$$\|\partial_q G(q)\|_s = O(\|q\|_s^3). \tag{2.12}$$


Proof. We first note that $\ell^s_b$ is a Banach algebra with respect to convolution of sequences, with

$$\|q * p\|_s \leq \sum_{i,j\in\mathbb Z} [i]^s |q_{j-i}||p_j| \leq \sup_{i,j\in\mathbb Z} \Big(\frac{[i]}{[j-i][j]}\Big)^s \|q\|_s \|p\|_s \leq 2^s \|q\|_s \|p\|_s. \tag{2.13}$$

Therefore, using the analyticity of $f(u) = u^3 + O(u^4)$, one computes that in a sufficiently small neighborhood of the origin,

$$\|f(u)\|_s \leq C \|q\|_s^3. \tag{2.14}$$
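The algebra estimate (2.13) rests on the elementary bound $[a + b] \leq 2[a][b]$. The following sketch tests the resulting inequality on random finitely supported sequences. It uses the standard convolution $(q * p)_i = \sum_j q_{i-j} p_j$, which is an assumption about the indexing convention, and all names are hypothetical.

```python
import random


def bracket(n):
    """[n] = max(1, |n|)."""
    return max(1, abs(n))


def norm_s(w, s):
    """Weighted l^1 norm sum_n [n]^s |w_n| of a finitely supported sequence (dict)."""
    return sum(bracket(n) ** s * abs(v) for n, v in w.items())


def convolve(q, p):
    """Discrete convolution of two finitely supported sequences indexed by Z."""
    out = {}
    for a, qa in q.items():
        for b, pb in p.items():
            out[a + b] = out.get(a + b, 0.0) + qa * pb
    return out


random.seed(0)
q = {n: random.uniform(-1.0, 1.0) for n in range(-5, 6)}
p = {n: random.uniform(-1.0, 1.0) for n in range(-5, 6)}
for s in (0.5, 1.0, 2.0, 3.0):
    lhs = norm_s(convolve(q, p), s)
    rhs = 2.0 ** s * norm_s(q, s) * norm_s(p, s)
    print(s, lhs <= rhs)
```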

On the other hand, since

$$\partial_{q_n} G(q) = \int_0^{2\pi} f(u)\, \psi_n(x)\, dx,$$

the components of $\partial_q G(q)$ are the Fourier components of $f(u)$ and (2.12) follows from the estimate (2.14). The regularity of $\partial_q G$ follows from the regularity of its components and its local boundedness, cf. [PT], p. 138.

We now turn to the normal form analysis of (2.9). First, since $g(u) = \tfrac14 u^4 + O(u^5)$, we find that

$$G(q) = \frac14 \sum_{i,j,k,l} g_{ijkl}\, q_i q_j q_k q_l + O(|q|^5),$$

where

$$g_{ijkl} = \int_0^{2\pi} \psi_i \psi_j \psi_k \psi_l\, dx. \tag{2.15}$$

An easy computation shows that $g_{ijkl} = 0$ unless $i \pm j \pm k \pm l = 0$ for at least one combination of plus and minus signs. This will play an important role later on. Next, given a finite subset of indices $\mathcal I_d = \{n_1, \dots, n_d\} \subset \mathbb Z$ with $|n_i| \neq |n_j|$ if $i \neq j$, we decompose the Hamiltonian (2.9) as $H = H_d + H_\infty$, where

$$H_d(q, p) = \frac12 \sum_{n\in\mathcal I_d} \big(\mu_n^2 q_n^2 + p_n^2\big) + \frac14 \sum_{i,j,k,l\in\mathcal I_d} g_{ijkl}\, q_i q_j q_k q_l \equiv \Lambda_d(q, p) + G_d(q), \tag{2.16}$$

$$H_\infty(q, p) = \frac12 \sum_{n\notin\mathcal I_d} \big(\mu_n^2 q_n^2 + p_n^2\big) + G(q) - G_d(q) \equiv \Lambda_\infty(q, p) + G_\infty(q). \tag{2.17}$$
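The selection rule for the coefficients (2.15), namely that $g_{ijkl}$ vanishes unless $i \pm j \pm k \pm l = 0$ for some choice of signs, can be verified numerically for a few index quadruples. The sketch below is illustrative only; the helper names and the grid size are assumptions, and the integral is approximated on a uniform periodic grid.

```python
import itertools
import math


def psi(n, x):
    """Eigenfunctions (2.4) of L with periodic boundary conditions."""
    if n == 0:
        return 1.0 / math.sqrt(2.0 * math.pi)
    if n > 0:
        return math.cos(n * x) / math.sqrt(math.pi)
    return math.sin(-n * x) / math.sqrt(math.pi)


def quartic_coupling(i, j, k, l, samples=4096):
    """g_ijkl = integral over [0, 2 pi] of psi_i psi_j psi_k psi_l."""
    h = 2.0 * math.pi / samples
    return h * sum(
        psi(i, a * h) * psi(j, a * h) * psi(k, a * h) * psi(l, a * h)
        for a in range(samples)
    )


def has_zero_combination(i, j, k, l):
    """True if i +- j +- k +- l = 0 for at least one choice of signs."""
    return any(
        i + s1 * j + s2 * k + s3 * l == 0
        for s1, s2, s3 in itertools.product((1, -1), repeat=3)
    )


for quad in [(1, 1, 2, 2), (1, 2, 3, 4), (1, 2, 3, 7)]:
    print(quad, has_zero_combination(*quad), quartic_coupling(*quad))
```

For indices with distinct absolute values the computed diagonal couplings agree with the closed form $g_{iijj} = (2 + \delta_{ij})/4\pi$ used later in the normal form computation.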


Introducing the complex coordinates $z_j$, $j = 1, \dots, d$, by

$$z_j = \frac{1}{\sqrt{2\mu_{n_j}}}\, (\mu_{n_j} q_{n_j} + i\, p_{n_j}),$$

one obtains the Hamiltonian $H_d(z, \bar z) = \sum_j \mu_{n_j} |z_j|^2 + G_d(z, \bar z)$ on $\mathbb C^d$ with symplectic structure $i \sum_j dz_j \wedge d\bar z_j$. For the remaining coordinates, one introduces the notation, for $k \geq 1$,

$$x_k = \begin{cases} (q_k, q_{-k}) \in \mathbb R^2 & \text{if } k, -k \notin \mathcal I_d, \\ q_{-\tilde k} \in \mathbb R & \text{if } k = |\tilde k| \text{ for some } \tilde k \in \mathcal I_d, \end{cases}$$

and similarly for $p_n$, $n \notin \mathcal I_d$, denoted in terms of $y_k \in \mathbb R^{d_k}$, $k \geq 1$, with $d_k$ as above, namely, $d_k = 2$ if both $k, -k \notin \mathcal I_d$ and $d_k = 1$ otherwise. Clearly, for $q, p \in \ell^s_b$ one has $x, y \in \mathbb R^\infty_s$, where $\mathbb R^\infty_s$ is defined in (1.11), and $H_\infty$ reads in these notations

$$H_\infty(z, \bar z, x, y) = \frac12 \sum_{k\geq 1} \big(\mu_k^2 |x_k|^2 + |y_k|^2\big) + G_\infty(z, \bar z, x),$$

with $|G_\infty| = O\big(\sum_{l=0}^3 |z|^l \|x\|_s^{4-l} + |z|^5 + \|x\|_s^5\big)$. The next proposition establishes the existence of a symplectic change of coordinates that transforms the Hamiltonian $H_d$ into a Birkhoff normal form. As will be clear from the proof, this normal form is not available for $H = H_d + H_\infty$, since most frequencies in $H_\infty$ are degenerate. This is the main difference with [P2] in the present discussion.

Proposition 2.4. For each $m > 0$ and each subset $\mathcal I_d$, $d < \infty$, satisfying $|n_i| \neq |n_j|$ when $i \neq j$, there exists a near identity, real analytic, symplectic change of coordinates $\Gamma_d$ in some neighborhood of the origin in $\mathbb C^d$ that takes the Hamiltonian (2.16) into

$$H_d \circ \Gamma_d = \Lambda_d + \bar G_d + K_d,$$

where $|K_d| = O(|z|^6)$ and

$$\bar G_d(z, \bar z) = \frac12 \sum_{i,j=1}^d \bar g_{ij}\, |z_i|^2 |z_j|^2 \quad \text{with} \quad \bar g_{ij} = \frac{3}{8\pi}\, \frac{4 - \delta_{ij}}{\mu_{n_i} \mu_{n_j}}. \tag{2.18}$$

Furthermore, setting $\Gamma_\infty = \Gamma_d \oplus 1_{\mathbb R^\infty_s \times \mathbb R^\infty_s}$, one has $H_\infty \circ \Gamma_\infty = \Lambda_\infty + K_\infty$ with $|K_\infty| = O\big(\sum_{l=0}^3 |z|^l \|x\|_s^{4-l} + |z|^5 + \|x\|_s^5\big)$.
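The matrix $\bar g = (\bar g_{ij})$ of (2.18) can be assembled explicitly for a given index set, and its determinant compared with a closed form. The sketch below is a sanity check under our own assumptions: the closed form $\det \bar g = (3/8\pi)^d (4d - 1)(-1)^{d-1} / \prod_i \mu_{n_i}^2$ is our derivation (it follows from the eigenvalues $4d - 1$ and $-1$ of the matrix with entries $4 - \delta_{ij}$), and the function names are hypothetical.

```python
import math


def mu(n, m):
    """Frequency mu_n = sqrt(n^2 + m)."""
    return math.sqrt(n * n + m)


def g_bar(indices, m):
    """Matrix with entries (3 / 8 pi) (4 - delta_ij) / (mu_{n_i} mu_{n_j})."""
    c = 3.0 / (8.0 * math.pi)
    return [
        [c * (4.0 - (1.0 if a == b else 0.0)) / (mu(ni, m) * mu(nj, m))
         for b, nj in enumerate(indices)]
        for a, ni in enumerate(indices)
    ]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def closed_form_determinant(indices, m):
    """(3/8pi)^d (4d - 1) (-1)^(d-1) / prod mu_{n_i}^2 (our derivation)."""
    d = len(indices)
    product = 1.0
    for n in indices:
        product *= n * n + m
    return (3.0 / (8.0 * math.pi)) ** d * (4 * d - 1) * (-1) ** (d - 1) / product


indices, m = [1, 2, 5], 1.0
print(determinant(g_bar(indices, m)), closed_form_determinant(indices, m))
```

Since $4d - 1 \neq 0$ for every $d \geq 1$, the determinant never vanishes, in line with the nondegeneracy of $\bar g$.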

Proof. Modulo straightforward modifications, the proof is carried out in [P2] and we restrict ourselves here to a quick overview. Proceeding as in [P2] and using that $|n| \neq |n'|$ for $n \neq n' \in \mathcal I_d$, one can show that for integers $i, j, k, l \in \mathcal I_d$ satisfying $i \pm j \pm k \pm l = 0$ and $\{i, j, k, l\} \neq \{n, n, n', n'\}$, one has for all combinations of plus and minus signs,

$$|\mu_i \pm \mu_j \pm \mu_k \pm \mu_l| \geq c\, \frac{m}{(N^2 + m)^{3/2}} > 0, \tag{2.19}$$

with $c$ some absolute constant and $N = \min\{|i|, \dots, |l|\}$. This allows one to eliminate all terms in $G_d(z, \bar z)$ that are not of the form $|z_i|^2 |z_j|^2$. To see this, it is convenient to adopt the notation $z_j = w_j$ and $\bar z_j = w_{-j}$ in which $G_d$ reads

$$G_d = \frac{1}{16} {\sum_{i,j,k,l}}' \tilde g_{ijkl}\, w_i w_j w_k w_l, \qquad \tilde g_{ijkl} = \frac{g_{n_{|i|} \dots n_{|l|}}}{\sqrt{\mu_{n_{|i|}} \cdots \mu_{n_{|l|}}}},$$

where the prime indicates that the sum runs over all indices $i, j, k, l \in \{1, -1, \dots, d, -d\}$ with $n_{|i|} \pm n_{|j|} \pm n_{|k|} \pm n_{|l|} = 0$ for at least one combination of plus and minus signs. Defining the transformation $\Gamma_d$ as the time-1 map of the flow of the vector field $X_F$ given by a Hamiltonian $F(z, \bar z)$ of order four, namely, $\Gamma_d = X_F^t|_{t=1}$ and $F = \sum_{i,j,k,l} F_{ijkl}\, w_i w_j w_k w_l$, one obtains using Taylor’s formula

$$H_d \circ \Gamma_d = \Lambda_d + G_d + \{\Lambda_d, F\} + O(|z|^6)$$

with

$$\{\Lambda_d, F\} = -i \sum_{i,j,k,l} (\hat\mu_i + \hat\mu_j + \hat\mu_k + \hat\mu_l)\, F_{ijkl}\, w_i w_j w_k w_l,$$

where $\hat\mu_i \equiv \mathrm{sign}(i)\, \mu_{n_{|i|}}$. With (2.19), one easily checks that if $\{i, j, k, l\} \neq \{a, -a, b, -b\}$ then $|\hat\mu_i + \hat\mu_j + \hat\mu_k + \hat\mu_l| > 0$. Therefore, choosing $F_{ijkl}$ suitably, one finally obtains, using $g_{iijj} = (2 + \delta_{ij})/4\pi$ and counting multiplicities,

$$G_d + \{\Lambda_d, F\} = \frac{3}{16\pi} \sum_{i,j=1}^d \frac{4 - \delta_{ij}}{\mu_{n_i} \mu_{n_j}}\, |z_i|^2 |z_j|^2 \equiv \bar G_d.$$

For the rest of the proof, we refer the reader to [P2].
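The role of the assumption $m > 0$ in the divisor bound (2.19) can be seen on a single quadruple. The sketch below, with hypothetical names and an example chosen by us, evaluates the smallest value of $|\mu_i \pm \mu_j \pm \mu_k \pm \mu_l|$ over all sign choices for $(i, j, k, l) = (1, 1, 1, 3)$, which satisfies $1 + 1 + 1 - 3 = 0$ and is not of the form $\{n, n, n', n'\}$.

```python
import itertools
import math


def mu(n, m):
    """Frequency mu_n = sqrt(n^2 + m)."""
    return math.sqrt(n * n + m)


def min_divisor(i, j, k, l, m):
    """Smallest |mu_i +- mu_j +- mu_k +- mu_l| over the eight sign choices."""
    return min(
        abs(mu(i, m) + s1 * mu(j, m) + s2 * mu(k, m) + s3 * mu(l, m))
        for s1, s2, s3 in itertools.product((1, -1), repeat=3)
    )


for m in (0.0, 0.01, 0.1, 1.0):
    # Comparison scale m / (N^2 + m)^(3/2) with N = 1; the constant c is not specified.
    scale = m / (1.0 + m) ** 1.5
    print(m, min_divisor(1, 1, 1, 3, m), scale)
```

At $m = 0$ the frequencies are integers and the divisor $3\mu_1 - \mu_3$ vanishes exactly, so the corresponding quartic term could not be removed; for small $m > 0$ the divisor grows linearly in $m$, in line with the right-hand side of (2.19).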

The Hamiltonian $\Lambda_d + \bar G_d$ is integrable with integrals $|z_i|^2$, $i = 1, \dots, d$. Furthermore, the matrix $\bar g = (\bar g_{ij})_{i,j}$ is non degenerate, as can be checked from the explicit formula (2.18). Hence, introducing the standard action-angle variables $(I, \phi) \in \mathbb R^d \times \mathbb T^d$ and linearizing $H$ around a given value for the action, namely, by setting for some $a = (a_1, \dots, a_d) \in \mathbb R^d$, $z_i \bar z_i = I_i + a_i^2$, one finally obtains

$$H_a = \omega \cdot I + \tfrac12 I \cdot \bar g I + \sum_{k\geq 1} \big(\mu_k^2 x_k^2 + y_k^2\big) + U_a(I, \phi, x), \tag{2.20}$$

where $U_a$ is just $K_d + K_\infty$ with the variables $z_i, \bar z_i$, $i = 1, \dots, d$, expressed in terms of $I, \phi$, and where $\omega = (\omega_1, \dots, \omega_d)$ is given by

$$\omega_i = \mu_{n_i} + \sum_{j=1}^d \bar g_{ij}\, a_j^2,$$

and covers a cone at $(\mu_{n_1}, \dots, \mu_{n_d})$ as $a$ varies in a neighborhood of the origin of $\mathbb R^d$. Furthermore, $U_a$ is real analytic in $\phi \in \mathbb T^d$ and real analytic in $I$ in a sufficiently small neighborhood $O_I$ of the origin of $\mathbb R^d$. As a function of $x$, $U_a$ is real analytic in a neighborhood $O_x \subset \mathbb R^\infty_s$ and by Lemma 2.3, its gradient $\partial_x U_a$ is bounded as a map from $\mathbb T^d \times O_I \times O_x$ to $\mathbb R^\infty_s$. Therefore, since hypothesis (H1) is satisfied with $\gamma = 1$, $U_a$ satisfies (H2) with $\xi = 1$. Finally, the small parameter $\lambda$ is given in terms of $|a| = \delta$. In Hamilton’s equations for $H_a$, rescaling $a$ by $\delta$, $x$ and $y$ by $\delta^2$, and $I$ by $\delta^4$, one obtains a Hamiltonian system given by the rescaled Hamiltonian

$$\tilde H_a(\phi, I, x, y) = \delta^{-4} H_{\delta a}(\phi, \delta^4 I, \delta^2 x, \delta^2 y) = \omega \cdot I + \frac{\delta^4}{2}\, I \cdot \bar g I + \sum_{k\geq 1} \big(\mu_k^2 x_k^2 + y_k^2\big) + \tilde U_a(I, \phi, x),$$

with $\tilde U_a$ analytic in $\delta$ and, as a function of $I$, $\tilde U_a = O(\delta + \delta^3 |I| + \delta^5 |I|^2)$. Hence, Theorem 1.1 implies the existence of quasi-periodic solutions $I$, $x$ and $y$ with frequency vector $\omega$, real analytic in $\phi$ and $\lambda$. Tracing the coordinate transformations back to the original variables $q_n(t)$ in the expression (2.8) for $u(x, t)$ completes, by Lemma 2.2, the proof of Theorem 2.1 with $u(x, t)$ given by (2.6).

3. The Renormalization Group Scheme

Equation (1.7) consists of a system of equations for the variables $(\Theta, J)$ and $Z$ which are coupled through the perturbation $U$ only. Adopting the notation

$$V(\Theta, J, Z)(\varphi) = \lambda \begin{pmatrix} \partial_\phi U \\ \partial_I U \end{pmatrix} (\varphi + \Theta(\varphi), J(\varphi), Z(\varphi)), \tag{3.1}$$

$$W(\Theta, J, Z)(\varphi) = \lambda \partial_x U(\varphi + \Theta(\varphi), J(\varphi), Z(\varphi)), \tag{3.2}$$

one rewrites Eq. (1.7) as

$$\begin{pmatrix} 0 & D \\ -D & g \end{pmatrix} \begin{pmatrix} \Theta \\ J \end{pmatrix} = -V(\Theta, J, Z), \tag{3.3}$$

$$(D^2 + \mu^2) Z = -W(\Theta, J, Z). \tag{3.4}$$

Our strategy will be to consider (3.3) and (3.4) separately, treating the functions $Z$ and $(\Theta, J)$, respectively, as parameters. As we will see in Sect. 8, existence of a (unique) solution of the original equation (1.7) can then be proved by using the implicit function theorem. Note that (3.3) involves only the torus frequencies $\omega$ and is equivalent to a standard KAM problem. Existence of a solution for such equations is well known and has been established by various means. One important feature we will use is the regular dependence of the solution $(\Theta, J)$ on the function $Z$. A precise result about the solution of (3.3) will be stated in Sect. 4, Theorem 4.1, once the required Banach spaces of functions have been introduced.

We now focus our attention on Eq. (3.4), and will suppress from the notation the dependence of the vector field $W$ on the parameters $\Theta$ and $J$. Most of our analysis will be conducted in Fourier space, and we will denote by lower case letters the Fourier transforms of functions of $\varphi$, the latter being denoted by capital letters, namely,

$$F(\varphi) = \sum_{q\in\mathbb Z^d} e^{-iq\cdot\varphi} f(q), \quad \text{where} \quad f(q) = \int_{\mathbb T^d} e^{iq\cdot\varphi} F(\varphi)\, d\varphi,$$

where $d\varphi$ stands for the normalized Lebesgue measure on $\mathbb T^d$. For $Z(\varphi) \in \mathbb R^\infty$, note that $z(q) \in \hat{\mathbb R}^\infty$ with $\overline{z_{ki}(q)} = z_{ki}(-q)$, where $\hat{\mathbb R}^\infty$ stands for $\prod_{k\geq 1} \mathbb C^{d_k}$ and $ki$ refers to the $i$th component of $\mathbb C^{d_k}$. Similarly, $\hat{\mathbb R}^\infty_s$ will denote the complexification of the Banach space $\mathbb R^\infty_s$ defined in (1.11). Finally, we will denote the vector space of functions $z(q) \in \hat{\mathbb R}^\infty$ by $h$,

$$h = \{z = (z(q)) \mid z(q) \in \hat{\mathbb R}^\infty,\ q \in \mathbb Z^d\}.$$

In terms of the Fourier transform of $W$, namely,

$$w_0(z)(q) \equiv \lambda \int_{\mathbb T^d} e^{iq\cdot\varphi}\, \partial_x U(\varphi + \Theta(\varphi), J(\varphi), Z(\varphi))\, d\varphi, \tag{3.5}$$

Eq. (3.4) becomes

$$K_0 z = w_0(z), \tag{3.6}$$

where the operator $K_0$ is given by the diagonal kernel

$$K_0(q, q') = \big(|\omega \cdot q|^2 - \mu^2\big)\, \delta_{qq'}. \tag{3.7}$$
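Since $K_0$ is diagonal, its inverse is controlled entirely by the numbers $|\omega \cdot q|^2 - \mu_k^2$. The sketch below lists, for a finite window of Fourier modes, the pairs $(q, k)$ for which $|\omega \cdot q|$ is within a threshold of $\mu_k$. The threshold `eta`, the cutoff `q_max`, the sample frequencies, and the function names are assumptions introduced for illustration and are not the scales used in the paper.

```python
import itertools


def k0_diagonal(omega, q, mu_k):
    """Diagonal entry |omega . q|^2 - mu_k^2 of the kernel (3.7)."""
    wq = sum(a * b for a, b in zip(omega, q))
    return wq * wq - mu_k * mu_k


def small_denominator_modes(omega, mus, q_max, eta):
    """Pairs (q, k) with | |omega . q| - mu_k | <= eta, for |q_i| <= q_max."""
    modes = []
    for q in itertools.product(range(-q_max, q_max + 1), repeat=len(omega)):
        wq = abs(sum(a * b for a, b in zip(omega, q)))
        for k, mu_k in enumerate(mus, start=1):
            if abs(wq - mu_k) <= eta:
                modes.append((q, k))
    return modes


# One torus frequency and two normal frequencies, chosen to show an exact resonance.
omega, mus = [1.0], [2.0, 3.5]
print(small_denominator_modes(omega, mus, 4, 0.1))
print(k0_diagonal(omega, (2,), 2.0))
```

On the listed modes the diagonal entry is small or zero, so $K_0$ cannot be inverted there with a useful bound; on all other modes in the window the inverse is bounded in terms of the threshold.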

Solving Eq. (3.6) requires to invert the operator K0 . Although the inverse of K0 is unbounded for generic frequencies, restricting ω to a set of admissible frequencies gives sufficient control on the inverse of K0 to prove existence of a solution. As is well known for Melnikov problems, this set depends on the perturbation U . In order to prove existence of a solution to Eq. (3.6), we will follow a strategy developed in [BGK] for standard KAM problems, namely, for equations of the type (3.3). This strategy basically consists in inductively reducing (3.6) to a sequence of effective equations involving denominators of decreasing size. One inductive step, say the nth step, consists in splitting the effective equation obtained at the previous step into two equations involving only large and, respectively, small denominators, where large and small are defined with respect to a scale of order ηn for some fixed η < 1. This splitting is done in such a way that the nonlinear operator involved in the large denominators equation is a contraction, and this equation can thus be solved by a simple application of the contraction mapping principle. This, in turn, allows to map the small denominators equation into a new effective equation of type (3.6), with a new righthand side wn and (eventually) a new linear operator Kn . In [BGK], it was shown that for equations of type (3.3), the above mentioned contraction property follows naturally from symmetries specific to this case. In contrast, Eq. (3.4) involves in addition the normal frequencies µk and does not possess such symmetry. In order to obtain the required contraction, we must make at every inductive step an additional preparation step. As we shall see below, this amounts to renormalizing the linear operator Kn−1 obtained at the previous step into a new operator Kn , which, in effect, corresponds to renormalizing the normal frequencies. 
Furthermore, we will see that the renormalized normal frequencies converge to a $U$-dependent set $\{\mu^*_\alpha\}$, $\alpha \ge 1$, as $n \to \infty$. Therefore, since the Diophantine conditions imposed on $\omega$ will eventually be defined relative to this set, one obtains in a constructive way the dependence of the set of admissible frequencies on the perturbation $U$.

We now describe how the renormalization group approach is implemented in practice for Melnikov type problems. First, we proceed with the above-mentioned preparation step by decomposing $w_0$ as
$$ w_0(z) = \tilde w_0(z) + A_0 z, $$


where the linear operator $A_0$ is the dominant part of $Dw_0(z)$ evaluated at $z = 0$. With $K_1 \equiv K_0 - A_0$, Eq. (3.6) now reads
$$ K_1 z = \tilde w_0(z). \tag{3.8} $$

As explained in more detail below, $A_0$ can be chosen in such a way that $K_1$ is of the same form as $K_0$, cf. (3.7), but now given in terms of a new set of frequencies $\tilde\mu_{k_i} \in \mathbb{R}$ which are perturbations of order $\lambda$ of the original normal frequencies $\mu_k$. The notation $\tilde\mu_{k_i}$ reflects the fact that the perturbation $A_0$ may lift some of the degeneracies. Therefore, when inverting $K_1$, denominators smaller than $O(\eta)$ occur for $q$ such that $\bigl||\omega \cdot q| - \tilde\mu_{k_i}\bigr| \le O(\eta)$ for some $k_i$. Furthermore, these small denominators only occur, for such $q$, in a specific subspace $h^q_{k_i}$ of $\mathbb{C}^{d_k}$ depending on which $\tilde\mu_{k_j}$, if any, has been separated from $\tilde\mu_{k_i}$ by more than $O(\eta)$. Introducing $P_1$ as the projection of $h$ onto $h^q_{k_i}$ for $q$ such that $\bigl||\omega \cdot q| - \tilde\mu_{k_i}\bigr| \le O(\eta)$ and defining $Q_1 \equiv 1 - P_1$, one thus expects that the restriction of $K_1$ to $Q_1 h$ is invertible with an inverse of order $O(\eta^{-1})$. Multiplying (3.8) by $Q_1$ and $P_1$ leads to the large and small denominators equations for $\tilde z_1 \equiv Q_1 z$ and $z_1 \equiv P_1 z$, respectively,
$$ K_1 \tilde z_1 = Q_1 \tilde w_0(\tilde z_1 + z_1), \tag{3.9} $$
$$ K_1 z_1 = P_1 \tilde w_0(\tilde z_1 + z_1), \tag{3.10} $$
and by definition of $Q_1$, the first equation can be rewritten as a fixed point equation for the functional $R_1$ defined as $R_1(z_1) \equiv \tilde z_1$, namely,
$$ R_1(z) = K_1^{-1} Q_1 \tilde w_0(z + R_1(z)). \tag{3.11} $$

By choice of $A_0$, the nonlinear operator $K_1^{-1} Q_1 \tilde w_0$ is a contraction and one can solve Eq. (3.11) for $R_1$ using the Banach fixed point theorem. (See point (a) of Theorem 5.1 for this part of the inductive step.) Next, with $w_1$ defined as $w_1(z) \equiv \tilde w_0(z + R_1(z))$, Eq. (3.10) reads
$$ K_1 z_1 = P_1 w_1(z_1), \tag{3.12} $$

and the solution $z = z_1 + \tilde z_1$ of the original Eq. (3.6) is now given by $z = z_1 + R_1(z_1) \equiv F_1(z_1)$. Hence, the problem of solving (3.6) is reduced to solving the effective Eq. (3.12). To solve this equation one proceeds similarly, starting with our preparation step. After $n$ steps of this inductive process, the solution of (3.6) is given by
$$ z = F_{n-1}(z_n + R_n(z_n)) \equiv F_n(z_n), \tag{3.13} $$
where $R_n$ solves the functional equation
$$ R_n(z) = \Gamma_n \tilde w_{n-1}(z + R_n(z)), \tag{3.14} $$
with
$$ \Gamma_n \equiv K_n^{-1} Q_n P_{n-1}, \tag{3.15} $$


and, for some linear operator $A_{n-1}$,
$$ \tilde w_{n-1}(z) \equiv w_{n-1}(z) - A_{n-1} z, \tag{3.16} $$
$$ K_n \equiv K_{n-1} - P_{n-1} A_{n-1}, \tag{3.17} $$
whereas $z_n$ solves the effective equation
$$ K_n z_n = P_n w_n(z_n), \tag{3.18} $$
with $w_n$ defined as
$$ w_n(z) \equiv \tilde w_{n-1}(z + R_n(z)). \tag{3.19} $$

Remark 3.1. The point of this inductive procedure is that $P_n w_n(z)$ becomes effectively linear in $z$ for large $n$. More precisely, we will show, cf. Theorem 5.1 below, that the rescaled maps $w_n^r$ defined by $w_n^r(z) = \eta^{-n} r^{-n} w_n(r^n z)$ satisfy for $r < \eta$,
$$ P_n w_n^r(z) = P_n Dw_n^r(0) z + O(\lambda r^{2n} \eta^{-n}) \quad \text{with} \quad P_n Dw_n^r(0) = O(\lambda), $$
in some appropriate Banach space. Thus, $z_n = 0$ becomes a better and better approximation to the solution of (3.18), and we shall construct the solution $z$ of the original Eq. (3.6) as the limit of the approximate solutions
$$ z = \lim_{n \to \infty} F_n(0). \tag{3.20} $$
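One step of the splitting (3.9) to (3.12) can be illustrated on a finite-dimensional toy problem. The sketch below is only an illustration under stated assumptions: the diagonal entries `K_DIAG`, the nonlinearity `w_toy`, the coupling `LAM` and the threshold `scale` are all hypothetical, and the small-denominator equation is simply iterated to convergence, which works here only because the toy perturbation is tiny. In the scheme described above, that equation is instead treated by repeating the splitting at smaller scales.

```python
def split_solve(k_diag, w, scale, tol=1e-13, max_iter=500):
    """Solve K z = w(z), K diagonal, by a large/small denominator splitting."""
    n = len(k_diag)
    # P selects the modes with small denominators, Q = 1 - P the others.
    small = [abs(k) <= scale for k in k_diag]

    def iterate(update):
        # Plain Banach fixed-point iteration started at zero.
        z = [0.0] * n
        for _ in range(max_iter):
            z_next = update(z)
            if max(abs(a - b) for a, b in zip(z_next, z)) < tol:
                return z_next
            z = z_next
        raise RuntimeError("fixed-point iteration did not converge")

    def masked_step(z, keep_small):
        # K^{-1} P w(z) if keep_small is True, K^{-1} Q w(z) otherwise.
        wz = w(z)
        return [wz[i] / k_diag[i] if small[i] == keep_small else 0.0
                for i in range(n)]

    def R(z1):
        # Large-denominator equation: R = K^{-1} Q w(z1 + R), cf. (3.11).
        return iterate(lambda r: masked_step([a + b for a, b in zip(z1, r)], False))

    def effective_step(z1):
        # Effective equation K z1 = P w(z1 + R(z1)), cf. (3.12).
        return masked_step([a + b for a, b in zip(z1, R(z1))], True)

    z1 = iterate(effective_step)
    return [a + b for a, b in zip(z1, R(z1))]


LAM = 0.01  # hypothetical small coupling


def w_toy(z):
    # Hypothetical nonlinearity: constant term, nearest-neighbour coupling, quadratic term.
    n = len(z)
    out = []
    for i in range(n):
        left = z[i - 1] if i > 0 else 0.0
        right = z[i + 1] if i < n - 1 else 0.0
        out.append(LAM * (1.0 + left + right + z[i] ** 2))
    return out


K_DIAG = [2.0, -1.5, 0.1, 3.0]  # one small denominator (0.1)
z = split_solve(K_DIAG, w_toy, scale=0.5)
residual = max(abs(k * zi - wi) for k, zi, wi in zip(K_DIAG, z, w_toy(z)))
```

The quantity `residual` measures how well the recombined vector `z1 + R(z1)` solves the original toy equation.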

We now give a precise description of the operators $P_n$. Note that in order to obtain (3.14) and (3.18), we have tacitly assumed that $P_n P_{n-1} = P_n$. The possibility to define $P_n$ satisfying such a property follows from the convergence of the normal frequencies under renormalization. Recall that renormalization occurs because at every inductive step one turns the nonlinear map $w_n$ of the effective functional equation (3.18) into a contraction by subtracting some linear operator $A_n$. Delaying to subsequent sections the discussion of the appropriate choice for the family $A_m$, $m \ge 0$, it suffices to point here to the properties of $A_m$ that will ensure convergence of the renormalized normal frequencies. As will be shown, cf. point (c) of Theorem 5.1 for a precise statement, $A_m$ is a perturbation of order $\lambda\eta^m$ and is given by a constant kernel $A_m(q, q') = a_m \delta_{qq'}$ with $a_m : \hat{\mathbb{R}}^\infty \to \hat{\mathbb{R}}^\infty$ linear and hermitian. As a consequence, the operator $K_n = K_0 - \sum_{m=0}^{n-1} P_m A_m$ has a kernel of the form (3.7) with $\mu^2$ essentially replaced by the positive definite matrix
$$ \tilde\mu_n^2 \equiv \mu^2 + \sum_{m=0}^{n-1} a_m, \tag{3.21} $$
with $\tilde\mu_n$ having a discrete spectrum $\sigma(\tilde\mu_n) \subset \mathbb{R}_+$. One easily checks that the singularities of $K_n^{-1}$ are given by the eigenvalues of $\tilde\mu_n$, which therefore correspond to renormalized normal frequencies. Since $a_m$ is of order $\lambda\eta^m$, one expects the eigenvalues of $\tilde\mu_n$ to converge as $n \to \infty$ with $|\nu_{n+1} - \nu_n| \le O(\lambda\eta^n)$ for $\nu_{n+1} \in \sigma(\tilde\mu_{n+1})$ and $\nu_n \in \sigma(\tilde\mu_n)$. This, in turn, allows us to define scales of denominators in a consistent way by carefully keeping track of the separation properties of $\sigma(\tilde\mu_n)$ as $n$ increases. To this end, one groups the normal frequencies into a hierarchy of clusters satisfying gap conditions that are preserved by the renormalization procedure. We first introduce some notation. For


$x \in \mathbb{R}$ and $C$ a finite collection of points in $\mathbb{R}$, let $d(x, C)$ denote the distance between $x$ and the smallest interval containing all points in $C$, and for two finite collections $C_1, C_2 \subset \mathbb{R}$, let
$$ d(C_1, C_2) \equiv \inf_{x \in C_1} d(x, C_2). $$
Then, one can uniquely decompose $\sigma(\tilde\mu_n)$ into a maximal number of disjoint clusters $C^n_{k,i}$, $k \ge 1$, $i = 1, \dots, M_k^n$, satisfying $d(\mu_k, C^n_{k,i}) = O(\lambda)$ and the gap condition
$$ d(C^n_{k,i}, C^n_{k,j}) > \eta^n \quad \text{if } i \ne j. \tag{3.22} $$
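The decomposition into a maximal number of clusters obeying a gap condition of the type (3.22) can be sketched for a finite list of real numbers. Since the distance used above is measured between covering intervals, clusters are runs of consecutive points in sorted order, and the maximal decomposition cuts at every consecutive gap exceeding the threshold. The function name `decompose_into_clusters` and the sample spectrum are hypothetical, and the sketch ignores the additional labelling by the original frequency $\mu_k$.

```python
def decompose_into_clusters(spectrum, gap):
    """Split a finite set of reals into the maximal number of clusters
    whose covering intervals are pairwise separated by more than `gap`."""
    points = sorted(spectrum)
    if not points:
        return []
    clusters = [[points[0]]]
    for x in points[1:]:
        if x - clusters[-1][-1] > gap:
            clusters.append([x])      # gap large enough: start a new cluster
        else:
            clusters[-1].append(x)    # too close: stays in the current cluster
    return clusters


# Hypothetical renormalized frequencies near mu = 1 and mu = 2.
SAMPLE = [1.0, 1.04, 1.3, 2.0, 2.02]
coarse = decompose_into_clusters(SAMPLE, gap=0.1)
fine = decompose_into_clusters(SAMPLE, gap=0.01)
```

With a threshold playing the role of $\eta^n$, decreasing it (increasing $n$) can only refine the decomposition, which mirrors the statement that degeneracies are progressively lifted.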

Note that $M_k^n \le d_k$, where $d_k$ denotes the multiplicity of the original normal frequency $\mu_k$, and that by requiring $M_k^n$ to be maximal, the decomposition
$$ \sigma(\tilde\mu_n) = \bigcup_{k \ge 1} \bigcup_{i=1}^{M_k^n} C^n_{k,i} \tag{3.23} $$
is unique. The above observation about the rate of convergence of $\sigma(\tilde\mu_n)$ as $n \to \infty$ ensures that eigenvalues belonging to different clusters will remain separated. Generically, one expects all degeneracies to be lifted eventually, so that $M_k^n = d_k$ for $n$ sufficiently large and each cluster $C^n_{k,i}$ contains a single eigenvalue. Next, defining $S_n \subset \mathbb{Z}^d$ as
$$ S_n = \bigcup_{k \ge 1} \bigcup_{i=1}^{M_k^n} S^n_{k,i}, \tag{3.24} $$
where
$$ S^n_{k,i} = \{ q \in \mathbb{Z}^d \mid d(|\omega \cdot q|, C^n_{k,i}) < \tfrac14 \eta^n \}, \tag{3.25} $$
one is ensured that all $q \in \mathbb{Z}^d \setminus S_n$ satisfy $d(|\omega \cdot q|, \sigma(\tilde\mu_{n'})) \ge O(\eta^n)$ for $n' \ge n$. Hence, such $q$ can be safely "integrated out" in the large denominators equation. Remark that due to (3.22), the sets $S^n_{k,i}$ are pairwise disjoint. In order to achieve the construction of $P_n$, one must isolate for every $q \in S_n$ the subspace of $\mathbb{R}^\infty$ in which small denominators will occur. For $q \in S^n_{k,i}$, the latter is given by the eigenspace of $\tilde\mu_n$ associated with the eigenvalues belonging to $C^n_{k,i}$. This eigenspace will be denoted by $J^n_{k,i}$, whereas the projector onto $J^n_{k,i}$ will be denoted by $P^n_{k,i}$. Thus, one defines $P_n$ to be the diagonal operator acting on $h$ given by the kernel
$$ P_n(q) = \sum_{k \ge 1} \sum_{i=1}^{M_k^n} \chi^n_{k,i}(\omega \cdot q)\, P^n_{k,i}, $$
where $\chi^n_{k,i}$ denotes a function in $C^1(\mathbb{R})$ which satisfies
$$ \chi^n_{k,i}(\kappa) = \begin{cases} 1 & \text{if } d(|\kappa|, C^n_{k,i}) \le \tfrac18 \eta^n, \\ 0 & \text{if } d(|\kappa|, C^n_{k,i}) \ge \tfrac14 \eta^n, \end{cases} \tag{3.26} $$


and interpolates monotonically between 0 and 1 otherwise, with
$$ \sup_{\kappa \in \mathbb{R}} |\partial_\kappa \chi^n_{k,i}(\kappa)| \le C\eta^{-n}, \tag{3.27} $$
whereas $Q_n$ is defined as
$$ Q_n = 1 - P_n. \tag{3.28} $$
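A cutoff with the properties required of $\chi^n_{k,i}$ can be sketched with a cubic "smoothstep" profile. This is only one possible choice, not necessarily the one used by the authors: the names `chi` and `dist_to_interval`, the interval endpoints `lo`, `hi` (the covering interval of a cluster) and the parameter `scale` (standing for $\eta^n$) are hypothetical. The profile is $C^1$, equals 1 within distance `scale/8`, vanishes beyond distance `scale/4`, and has slope at most `12/scale`, so it obeys a bound of the form $C\eta^{-n}$ on its derivative, provided the cluster stays away from zero so that $|\kappa|$ is smooth on the support.

```python
def dist_to_interval(x, lo, hi):
    """Distance from x to the closed interval [lo, hi]."""
    return max(lo - x, 0.0, x - hi)


def chi(kappa, lo, hi, scale):
    """C^1 cutoff in the variable |kappa| around the interval [lo, hi]."""
    d = dist_to_interval(abs(kappa), lo, hi)
    inner, outer = scale / 8.0, scale / 4.0
    if d <= inner:
        return 1.0
    if d >= outer:
        return 0.0
    # t runs from 0 at d = outer to 1 at d = inner.
    t = (outer - d) / (outer - inner)
    # Cubic smoothstep: value and first derivative match at both ends.
    return t * t * (3.0 - 2.0 * t)


SCALE = 0.1                      # plays the role of eta**n
LO, HI = 0.9, 1.1                # hypothetical cluster interval
inside = chi(1.0, LO, HI, SCALE)
outside = chi(1.2, LO, HI, SCALE)
halfway = chi(HI + 0.01875, LO, HI, SCALE)
```

Because the function depends on `kappa` only through `abs(kappa)`, it takes the same value at `kappa` and `-kappa`, as in (3.26).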

Note that $P_n$ and $Q_n$ are not projectors. The smooth functions $\chi^n_{k,i}$ have been introduced in order to ensure the continuity of the diagonal kernels $\Gamma_n(q, q)$, cf. the discussion preceding Lemma 5.3 below. However, we will make use later of the projector
$$ \hat P_n(q) = \sum_{k \ge 1} \sum_{i=1}^{M_k^n} I_{S^n_{k,i}}(q)\, P^n_{k,i}, \tag{3.29} $$
where $I_O$ denotes the indicator function of a set $O$. Note that $P_n \hat P_n = P_n$, whereas $Q_n \hat P_n \ne 0$.

We conclude this section by a few remarks related to the convergence of the inductive scheme. First, setting $I^n_{k,i} \subset \mathbb{R}$ to be the smallest interval covering $C^n_{k,i}$, one easily checks that $|I^n_{k,i}| \le (d_k - 1)\eta^n$. Hence, since the multiplicities of the normal frequencies $\mu_k$ were assumed to be uniformly bounded in $k$, i.e., $d_k \le \bar d$ for all $k \ge 1$, one obtains for all $n \ge 1$, $k \ge 1$, and $i = 1, \dots, M_k^n$,
$$ |I^n_{k,i}| \le \bar d\, \eta^n. \tag{3.30} $$

Next, it follows from the gap condition (3.22) being preserved that for all $m < n$ the eigenvalues in a given cluster $C^n_{k,i}$ are perturbations of all or some eigenvalues belonging to a single cluster $C^m_{k,j}$, denoted by $C^m_{k,j_i^n}$. Furthermore, $C^n_{k,i}$ remains close to $C^m_{k,j_i^n}$. More precisely, we will show that
$$ \sup_{x \in I^n_{k,i}} \; \inf_{y \in I^m_{k,j_i^n}} d(x, y) \le \eta^{m+1} \quad \text{for } 1 \le m < n. \tag{3.31} $$
Finally, we consider the properties of the eigenspaces $J^n_{k,i}$. One has by construction $P^n_{k,i} P^n_{l,j} = \delta_{kl}\delta_{ij} P^n_{k,i}$. However, it will be possible to choose $a_m$ in (3.21) in such a way that each $J^m_{k,i}$ is an invariant subspace for $a_m$. Hence, by definition of $\tilde\mu_n$ and $J^n_{k,i}$, every $J^n_{k,i}$ is a subspace of some $J^{n-1}_{k,j}$, and by recursion, of some $J^m_{k,j}$ for all $m < n$. The (unique) eigenspace $J^m_{k,j}$ containing $J^n_{k,i}$ will be denoted by $J^m_{k,j_i^n}$. Therefore, one has for all $1 \le m \le n$, $k \ge 1$, and $i = 1, \dots, M_k^n$,
$$ P^n_{k,i} P^m_{l,j} = \delta_{kl}\, \delta_{j j_i^n}\, P^n_{k,i}, \tag{3.32} $$
which, in particular, implies that
$$ P_n P_{n-1} = P_{n-1} P_n = P_n. \tag{3.33} $$


Notations. For most of the subsequent analysis, it will not be necessary to distinguish between indices $(k, i)$ and $(l, j)$ with $k = l$ or $k \ne l$. This intervenes only in the description of the asymptotic behavior of the spectrum $\sigma(\tilde\mu_n)$ and the measure estimate of $\Omega^*$. For notational convenience, we thus introduce the index sets
$$ I^n = \{ (k, i) \mid k \ge 1,\ i = 1, \dots, M_k^n \}, \quad n \ge 1, \tag{3.34} $$
and will reserve bold letters for indices in $I^n$. With this convention, $\{ C^n_{\mathbf{k}} \mid \mathbf{k} \in I^n \}$ denotes for instance the collection of all clusters $C^n_{k,i}$, $k \ge 1$, $i = 1, \dots, M_k^n$.

4. Spaces

For the Fourier transform $z$ of the solution $Z$ of our original Eq. (3.4), we consider the Banach space $h_s$, $s \in \mathbb{R}$, defined by
$$ h_s = \Bigl\{ z = (z(q)) \;\Big|\; z(q) \in \hat{\mathbb{R}}^\infty_s,\ \|z\|_s \equiv \sum_{q \in \mathbb{Z}^d} |z(q)|_s < \infty \Bigr\}. \tag{4.1} $$
For $s \ge t$, one has the natural embedding $h_s \to h_t$ with $\|\cdot\|_t \le \|\cdot\|_s$. We will denote by $h^n_s$ the subspace $\hat P_n h_s$. In particular, one has for $z \in h^n_s$,
$$ \|z\|_s = \sum_{\mathbf{k} \in I^n} \sum_{q \in S^n_{\mathbf{k}}} |P^n_{\mathbf{k}} z(q)|_s. \tag{4.2} $$
The operator norm in $L(h^n_s, h^m_t)$ will be denoted by $\|\cdot\|^{(n,m)}_{s,t}$, and by $\|\cdot\|^{(n)}_s$ when $n = m$ and $s = t$.

Let us now turn to the spaces we will consider for the functions $w_n$. Recall that in our analysis of (3.4), the functions $\Theta$ and $J$ only appear as parameters. In the sequel, we consider $\Theta, J : \mathbb{T}^d \to \mathbb{R}^d$ as (fixed) real analytic maps belonging to a small neighborhood of the origin $O_B$ in the Banach space
$$ B = \Bigl\{ (F, G) : \mathbb{T}^d \to \mathbb{R}^d \times \mathbb{R}^d \;\Big|\; \|(F, G)\|_B \equiv \sum_{q \in \mathbb{Z}^d} \bigl( |f(q)| + |g(q)| \bigr) < \infty \Bigr\}. \tag{4.3} $$

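Next, it follows from assumption (H2) that the gradient $\partial_x U$ is real analytic as a map from $\mathbb{T}^d \times O_I \times O_x$ to $\mathbb{R}^\infty_{s'}$, cf. [PT] p. 138. (Recall that $O_I \subset \mathbb{R}^d$ and $O_x \subset \mathbb{R}^\infty_s$ are neighborhoods of the origin and that $s' \equiv s + \xi - a$.) This implies that for $(\Theta, J) \in O_B$ small enough, one can write the Taylor expansion of $\partial_x U(\varphi + \Theta(\varphi), J(\varphi), Z) = \partial_x U((\varphi + \Theta(\varphi), J(\varphi), 0) + (0, 0, Z))$ as
$$ \partial_x U((\varphi + \Theta(\varphi), J(\varphi), 0) + (0, 0, Z)) = \sum_{m=0}^{\infty} \frac{1}{m!}\, U^{\Theta,J}_{m+1}(\varphi)(Z, \dots, Z), \tag{4.4} $$
where the coefficients $U^{\Theta,J}_{m+1}(\varphi)$ belong to the space of $m$-linear maps $L(\mathbb{R}_s, \dots, \mathbb{R}_s; \mathbb{R}_{s'})$, are real analytic in $\varphi \in \mathbb{T}^d$ and analytic in $(\Theta, J) \in O_B$. Hence, there exist $\rho > 0$, $\alpha > 0$ and $b < \infty$ such that the Fourier transforms of $U^{\Theta,J}_{m+1}(\varphi)$ satisfy
$$ \sum_{q \in \mathbb{Z}^d} e^{\alpha |q|}\, \| u^{\Theta,J}_{m+1}(q) \|_{L(\hat{\mathbb{R}}_s, \dots, \hat{\mathbb{R}}_s; \hat{\mathbb{R}}_{s'})} \le b\, m!\, \rho^{-m}. \tag{4.5} $$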


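Inserting the Fourier series for $Z$ into (4.4), one obtains the expansion for $w_0$ as defined in (3.5),
$$ w_0(z)(q) = \lambda \sum_{m=0}^{\infty} \frac{1}{m!} \sum_{\mathbf{q}} u^{\Theta,J}_{m+1}\Bigl( q - \sum_{i=1}^{m} q_i \Bigr)(z(q_1), \dots, z(q_m)) \equiv \sum_{m=0}^{\infty} \sum_{\mathbf{q}} w_0^{(m)}(q; q_1, \dots, q_m)(z(q_1), \dots, z(q_m)), \tag{4.6} $$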

where $\mathbf{q} = (q_1, \dots, q_m) \in \mathbb{Z}^{md}$. This formula suggests considering $w_0$ as an analytic function of $z \in h_s$. Let $B(r_0)$ be the open ball of radius $r_0$ in $h_s$ centered at the origin and let $H^\infty(B(r_0), h_s)$ denote the Banach space of analytic functions $w : B(r_0) \to h_s$ equipped with the supremum norm, which we shall denote by $|||w|||$. Then, bound (4.5) implies that $w_0 \in H^\infty(B(r_0), h_s)$ for $r_0$ small enough.

It will be convenient to encode the decay property of the kernels $w_0^{(m)}$ inherited from the estimate (4.5) as a property of the functional $w_0$. Let $\tau_\beta$ denote the translation by $\beta \in \mathbb{R}^d$, i.e., $(\tau_\beta Z)(\varphi) = Z(\varphi - \beta)$. On $h_s$, $\tau_\beta$ is realized by $(\tau_\beta z)(q) = e^{i\beta \cdot q} z(q)$, and it induces a map $w \to w_\beta$ from $H^\infty(B(r_0), h_s)$ to itself if we define
$$ w_\beta(z) = \tau_\beta w(\tau_{-\beta} z). \tag{4.7} $$
On the kernels $w_0^{(m)}$, this is given by
$$ w_\beta^{(m)}(q; q_1, \dots, q_m) = e^{i\beta \cdot (q - \sum q_i)}\, w^{(m)}(q; q_1, \dots, q_m), $$
and makes sense also for $\beta \in \mathbb{C}^d$. Since
$$ |||w_{0\beta}||| \le \sum_{m=0}^{\infty} r_0^m \sup_{\mathbf{q}} \sum_{q \in \mathbb{Z}^d} e^{-\operatorname{Im}\beta \cdot (q - \sum q_i)}\, \| w_0^{(m)}(q; q_1, \dots, q_m) \|_{L(\hat{\mathbb{R}}_s, \dots, \hat{\mathbb{R}}_s; \hat{\mathbb{R}}_{s'})}, $$
it thus follows from (4.5) that there exist $r_0 > 0$, $\alpha > 0$, and $D < \infty$, such that $w_{0\beta}$ belongs to $H^\infty(B(r_0), h_s)$ and extends to an analytic function of $\beta$ in the strip $|\operatorname{Im}\beta| < \alpha$ with values in $H^\infty(B(r_0), h_s)$ satisfying the bound
$$ |||w_{0\beta}||| \le D|\lambda|. \tag{4.8} $$

Let us now come back to the existence of a solution for Eq. (3.3), namely for the standard KAM problem. One has the classical result (see for instance [BGK]):

Theorem 4.1. Let $U$ satisfy hypothesis (H2) and let $g$ be an invertible matrix. Then, there is a $\lambda_1 > 0$ small enough such that for $|\lambda| < \lambda_1$ and $\omega$ satisfying a Diophantine condition of the form $|\omega \cdot q| > K|q|^{-\nu}$ for $q \in \mathbb{Z}^d$, $q \ne 0$, Eq. (3.3) has a solution $(\Theta, J) \in B$ which is real analytic in $\varphi$, analytic in $\lambda$, and vanishes for $\lambda = 0$. Furthermore, this solution is unique up to translations $(\Theta, J)(\varphi) \to (\Theta - \beta, J)(\varphi - \beta)$ and depends analytically on $Z$, for $Z$ in a small ball centered at the origin of the Banach space $h_s$.
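The Diophantine condition appearing in Theorem 4.1 can be probed numerically for a given frequency vector, up to a finite cutoff. The sketch below is only a finite sanity check and cannot prove the condition for all $q$; the function name `satisfies_diophantine`, the use of the Euclidean norm for $|q|$, the cutoff `max_norm` and the sample frequencies are assumptions made for illustration.

```python
import itertools
import math


def satisfies_diophantine(omega, K, nu, max_norm):
    """Check |omega . q| > K |q|^(-nu) for all integer q with 0 < |q| <= max_norm."""
    d = len(omega)
    for q in itertools.product(range(-max_norm, max_norm + 1), repeat=d):
        norm = math.sqrt(sum(c * c for c in q))   # Euclidean norm (assumption)
        if norm == 0.0 or norm > max_norm:
            continue
        if abs(sum(w * c for w, c in zip(omega, q))) <= K * norm ** (-nu):
            return False
    return True


GOLDEN = (math.sqrt(5.0) - 1.0) / 2.0
golden_ok = satisfies_diophantine((1.0, GOLDEN), K=0.1, nu=1.0, max_norm=40)
resonant_ok = satisfies_diophantine((1.0, 0.5), K=0.1, nu=1.0, max_norm=40)
```

The golden-mean frequency vector passes the check, while the resonant vector `(1.0, 0.5)` fails because $\omega \cdot q = 0$ for $q = (1, -2)$.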


To conclude this section, we list some standard properties of bounded analytic functions defined on open balls in Banach spaces. Let $h$, $h'$, $h''$ be Banach spaces, $B(r) \subset h$, $B(r') \subset h'$, and $w_i \in H^\infty(B(r), h')$, $w \in H^\infty(B(r'), h'')$. First, one has the composition property: If $|||w_i||| < r'$ then $w \circ w_i \in H^\infty(B(r), h'')$ and
$$ |||w \circ w_i||| < |||w|||. \tag{4.9} $$
Next, one deduces from the Cauchy estimate that for $r_1 < r'$,
$$ \sup \|Dw(x)\|_{L(h', h'')} \le (r' - r_1)^{-1} |||w|||. $$

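||x|| 0 and $\{C^n_{\mathbf{k}}\}_{\mathbf{k} \in I^n}$ the clusters described in the previous section,
$$ \Omega_n(K) = \bigl\{ \omega \in \mathbb{R}^d \;\big|\; d(|\omega \cdot q|, C^n_{\mathbf{k}}),\ d(|\omega \cdot q|, |C^n_{\mathbf{k}} \pm C^n_{\mathbf{k}'}|) > K|q|^{-\nu} \ \ \forall\, |q| < K\eta^{-n/\nu},\ q \ne 0, \text{ and } \mathbf{k}, \mathbf{k}' \in I^n \bigr\}, \tag{5.3} $$
where $C^n_{\mathbf{k}} \pm C^n_{\mathbf{k}'}$ denotes the set $\{ \nu \pm \nu' \mid \nu \in C^n_{\mathbf{k}},\ \nu' \in C^n_{\mathbf{k}'} \}$. Note that $\Omega_n(K) \subset \Omega_n(K')$ whenever $K > K'$. Furthermore, one introduces for $\omega \in \mathbb{R}^d$ the subsets of $\mathbb{Z}^d$,
$$ Q^+_\omega = \{ q \in \mathbb{Z}^d \mid \omega \cdot q > 0 \}, \qquad Q^-_\omega = \{ q \in \mathbb{Z}^d \mid \omega \cdot q < 0 \}. \tag{5.4} $$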


Proposition 5.1. There exist positive constants $r$ and $\lambda_0$ small enough such that the following is true for $|\lambda| < \lambda_0$, $n \ge 1$, and $|\operatorname{Im}\beta| < \alpha_n$, where $\alpha_1 = \alpha$ and, for $n \ge 2$,
$$ \alpha_n = (1 - n^{-2})\,\alpha_{n-1}. \tag{5.5} $$
There exists $K_\lambda > 0$ satisfying $K_\lambda \to 0$ as $\lambda \to 0$ such that one has for $\omega \in \Omega_n(K_\lambda)$ arbitrary but fixed,

(a) Equation (5.1) has a solution $R_{n\beta}$ in $H^\infty(B_n, h^{n-1}_s)$ analytic in $|\lambda| < \lambda_0$ and $(\Theta, J) \in O_B$.

(b) Defining $w_{n\beta}$ according to (5.2), one has $w_{n\beta} \in A_n$ and, writing $w_{n\beta}(z) \equiv w_n(z) = w_n(0) + Dw_n(0)z + \delta_2 w_n(z)$,
$$ \|\hat P_n w_n(0)\|_s \le \varepsilon r^{2n}, \tag{5.6} $$
$$ |||\hat P_n \delta_2 w_n|||_{A_n} \le \varepsilon r^{2n}, \tag{5.7} $$
where $\varepsilon \to 0$ as $\lambda \to 0$.

(c) There exists $A_n \in L(h_s, h_s)$ such that $\tilde w_n \equiv w_n - A_n$ obeys for all $z \in B_n$,
$$ \|\hat P_n D\tilde w_n(z)\|^{(n)}_{s,s} \le \varepsilon\eta^n. \tag{5.8} $$
Furthermore, $A_n(q, q') = 0$ if $q \ne q'$ and
$$ \|A_n\|_{s,s} \le 3\varepsilon\eta^{n-1}, \tag{5.9} $$
$$ A_n(q, q) = a_n I_{Q^+_\omega}(q) + \bar a_n I_{Q^-_\omega}(q), \tag{5.10} $$
where $a_n \in L(\hat{\mathbb{R}}^\infty_s, \hat{\mathbb{R}}^\infty_s)$ is hermitian, i.e., $a_n^T = \bar a_n$, and satisfies for all $\mathbf{k} \in I^n$,
$$ a_n J^n_{\mathbf{k}} = J^n_{\mathbf{k}}. \tag{5.11} $$

(d) The matrix $\tilde\mu^2_{n+1} \equiv \mu^2 + \sum_{m=0}^{n} a_m$ is positive definite and the spectrum of $\tilde\mu_{n+1}$ can be uniquely decomposed into a maximal family of pairwise disjoint clusters $C^{n+1}_{k,i}$, $k \ge 1$, $i = 1, \dots, M_k^{n+1}$, with $M_k^{n+1} \ge M_k^n$, satisfying for all $k \ge 1$ the gap condition
$$ d(C^{n+1}_{k,i}, C^{n+1}_{k,j}) > \eta^{n+1} \quad \text{if } i \ne j, \tag{5.12} $$
and
$$ \nu = \mu_k + O(\varepsilon k^{-\xi}) \quad \text{for all } \nu \in C^{n+1}_{k,i},\ i = 1, \dots, M_k^{n+1}. \tag{5.13} $$
Furthermore, the sets $S^{n+1}_{k,i}$ defined according to (3.25) are pairwise disjoint, and (3.31), (3.32) and (3.33) hold with $n$ replaced by $n + 1$.
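The recursion (5.5) for the widths $\alpha_n$ of the analyticity strips can be solved in closed form, which shows that the strips do not shrink to zero. The following short computation is a worked check, using only (5.5) and $\alpha_1 = \alpha$:

```latex
\alpha_n
  = \alpha \prod_{m=2}^{n} \Bigl(1 - \frac{1}{m^2}\Bigr)
  = \alpha \prod_{m=2}^{n} \frac{m-1}{m} \cdot \prod_{m=2}^{n} \frac{m+1}{m}
  = \alpha \cdot \frac{1}{n} \cdot \frac{n+1}{2}
  = \alpha\,\frac{n+1}{2n}
  \;\longrightarrow\; \frac{\alpha}{2}
  \quad (n \to \infty),
\qquad
\alpha_{n-1} - \alpha_n = n^{-2}\,\alpha_{n-1}.
```

In particular $\alpha_n > \alpha/2$ for all $n$, while the amount $n^{-2}\alpha_{n-1}$ of analyticity given up at step $n$ is what produces the exponential factors in the shift arguments of Sect. 6.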

Let us briefly comment on Proposition 5.1, whose proof will be carried out in Sect. 6. First, we note that point (d) ensures, in particular, that the new set of clusters $C^{n+1}_{k,i}$ enjoys the properties required for proceeding to the next step of the induction, cf. the discussion at the end of Sect. 3. The asymptotic behavior (5.13) concerns the measure estimate of the set $\Omega^*$ of admissible frequencies in Theorem 1.1. Such an asymptotic behavior is required in order to obtain a set of large measure because one imposes Diophantine conditions with respect to differences of the normal frequencies. We will show in Sect. 7 that (5.13) implies the following.


Proposition 5.2. For $\nu = \nu(d, \xi)$ sufficiently large, the set
$$ \Omega^*(K) \equiv \bigcap_{n \ge 1} \Omega_n(K) \tag{5.14} $$
satisfies for all bounded $\Omega \subset \mathbb{R}^d$, $\operatorname{meas}(\Omega \setminus \Omega^*(K)) \to 0$ as $K \to 0$.

Note that $\omega \in \Omega^*$ satisfies a Diophantine condition with respect to zero. Therefore, one has for such $\omega$, $\mathbb{Z}^d \setminus \{0\} = Q^+_\omega \cup Q^-_\omega$. Next, we turn to bound (5.8), the most delicate estimate to establish. To treat the off-diagonal part $Dw_n(q, q')$, $q \ne q'$, we will rely on the fact that the exponential decay of the kernel $Dw_0(q, q')$ in the size of $|q - q'|$ is preserved due to the introduction of the parameter $\beta$. We note that imposing Diophantine conditions on $\omega$ with respect to the differences $C^n_{\mathbf{k}} \pm C^n_{\mathbf{k}'}$ ensures that $|q - q'|$ is of order $O(\eta^{-n/\nu})$ for $q \ne q' \in S_n$. To treat the diagonal part, we will use that $Dw_n(q, q)$ depends on $q$ through $\omega \cdot q$ only, and is, in some sense, continuous in this variable. More precisely, defining $t_p : L(h_s, h_s) \to L(h_s, h_s)$, $p \in \mathbb{Z}^d$, by
$$ (t_p L)(q, q') = L(q + p, q' + p), \tag{5.15} $$
and setting
$$ T_p \equiv t_p - 1, \tag{5.16} $$
we will show that $T_p Dw_n$ is of order $O(\varepsilon|\omega \cdot p|)$ on the diagonal. Therefore, since $p = q - q'$ satisfies $|\omega \cdot p| \le \eta^n$ for $q, q' \in S^n_{\mathbf{k}}$ such that $\operatorname{sign}(\omega \cdot q) = \operatorname{sign}(\omega \cdot q')$, one has for $q \in S^n_{\mathbf{k}}$, $\hat P_n Dw_n(q, q)\hat P_n = \hat a_{\mathbf{k}} + O(\varepsilon\eta^n)$, where $\hat a_{\mathbf{k}} : J^n_{\mathbf{k}} \to J^n_{\mathbf{k}}$ depends only on the sign of $\omega \cdot q$. The continuity of $Dw_n(q, q)$ ultimately follows from the fact that $\Gamma_n(q)$ is continuous in $\omega \cdot q$, as stated in the following lemma, whose proof can be found in the Appendix.

Lemma 5.3. Let $\sigma \in \mathbb{R}$ and $p \in \mathbb{Z}^d$. Then the operator $\Gamma_n = K_n^{-1} Q_n P_{n-1}$ obeys
$$ \|\Gamma_n\|_{\sigma,\sigma+\gamma} \le C\eta^{-n}, \tag{5.17} $$
$$ \|T_p \Gamma_n\|_{\sigma,\sigma+\gamma} \le C\eta^{-2n} |\omega \cdot p|. \tag{5.18} $$

Finally, the perturbation $a_n$ being hermitian will essentially follow from the reality of the original Eq. (3.4). More precisely, the derivative $Dw_n$ satisfies
$$ Dw_n^{ij}(q, q') = \overline{Dw_n^{ij}(-q, -q')}, \tag{5.19} $$
$$ Dw_n^{ij}(q, q') = Dw_n^{ji}(-q', -q). \tag{5.20} $$
Thus, the diagonal element $Dw_n(q, q) : \hat{\mathbb{R}}^\infty \to \hat{\mathbb{R}}^\infty$ is given by a hermitian matrix for all $q$, and $a_n$ hermitian will follow since, as was mentioned above, $a_n$ will be chosen in such a way that its action on each $J^n_{\mathbf{k}}$ is the constant approximation of $Dw_n(q, q)$ for $q \in S^n_{\mathbf{k}}$. Note that due to (5.19), one expects $Dw_n(-q, -q)$ to be approximated by $\bar a_n$, which explains the decomposition in formula (5.10). Identities (5.19) and (5.20) are easily checked to hold for $n = 0$. Indeed, the perturbation $U$ in the Hamiltonian (1.4) being real analytic ensures (5.19), whereas (5.20) follows from the fact that $Dw_0$ is the symmetric second derivative of the functional $Z \to \lambda \int U(\varphi + \Theta(\varphi), J(\varphi), Z(\varphi))\, d\varphi$, cf. (3.5). Using the recursive relations (3.19) and (3.16), one obtains (5.19) and (5.20) for $n \ge 1$ by iteration.


Remark 5.4. The choice of constants is as follows. We first fix $\eta$ small enough according, essentially, to the constants entering the asymptotics of the frequencies $\mu_k$ in (H1), cf. Sect. 6.4. Given $\eta$, $\varepsilon$ and $r$ are chosen small enough, and $\lambda_0$ is chosen in turn according to $\varepsilon$. The latter choice plays a role only in ensuring that the inductive hypotheses of Proposition 5.1 are satisfied for $n = 0$, cf. the introduction in Sect. 6. Finally, $K_\lambda$ is chosen large enough in order for the estimate
$$ C e^{-C n^{-2} K_\lambda^{1/\nu} \eta^{-n/\nu}} \le r^{2n}, \tag{5.21} $$
to hold for all $n \ge 1$. This will be needed in order to iterate the bound (5.6) in Sect. 6.2. Note that due to the double exponential, the dependence of $K_\lambda$ on $\eta$ and $r$ is given by the behavior at small $n$ of the expressions entering (5.21). That $K_\lambda$ can be taken smaller as $\lambda$ goes to zero will follow from the fact that $r$ and $\varepsilon$, and thus ultimately $\eta$, can be taken smaller. Finally, we denote by $C$ a generic constant, independent of $n$, $r$, and $\varepsilon$, which may vary from place to place.

6. Proof of Proposition 5.1

We proceed by induction and assume that Proposition 5.1 holds up to $n - 1 \ge 1$. Regarding the inductive hypothesis in the case $n = 1$, we simply choose $A_0 \equiv 0$, so that the bounds for $w_0$ in points (b) and (c) of Proposition 5.1 are a simple consequence of (4.8). Furthermore, $\tilde\mu_1 = \mu$ and point (d) follows immediately from (H1). We note that in Sect. 6.1 below, point (a) is established for $n = 1$ by taking $\varepsilon$, namely $\lambda$, small enough. At some point in the induction, however, one is forced to consider nontrivial $A_n$ in order for the inductive bounds to hold uniformly in $n$ for a given $\lambda$. In the sequel, we adopt the convention, for $B$ a ball of radius $r$ centered at the origin, to denote by $\gamma B$ the ball of radius $\gamma r$ centered at the origin.

6.1. Existence of the functional $R_{n\beta}$. With the notations $R = R_{n\beta}$, $\Gamma = \Gamma_n$ and $\tilde w = \tilde w_{(n-1)\beta}$, Eq. (5.1) reads
$$ R(z) = \Gamma \tilde w(z + R(z)). \tag{6.1} $$

To prove existence in $H^\infty(B_n, h^{n-1}_s)$ of a solution $R$ to Eq. (6.1), one starts, using the identities $\tilde w(0) = w(0)$ and $\delta_2 \tilde w = \delta_2 w$, by decomposing $\tilde w$ as
$$ \tilde w(z) = w(0) + D\tilde w(0)z + \delta_2 w(z), \tag{6.2} $$
to obtain from (6.1),
$$ R(z) = \Gamma w(0) + \Gamma D\tilde w(0)(z + R(z)) + \Gamma \delta_2 w(z + R(z)). \tag{6.3} $$
Defining
$$ H = \bigl(1 - \Gamma D\tilde w(0)\bigr)^{-1}, \tag{6.4} $$
and using the identity $1 + H\Gamma D\tilde w(0) = H$, one rewrites (6.3) as
$$ R(z) = H\Gamma w(0) + H\Gamma D\tilde w(0)z + u(z), \tag{6.5} $$


where
$$ u(z) = H\Gamma \delta_2 w(\tilde z) \equiv G(u)(z), \tag{6.6} $$
and
$$ \tilde z \equiv z + R(z) = H\bigl(z + \Gamma w(0)\bigr) + u(z). \tag{6.7} $$
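The passage from (6.3) to (6.5) is a resolvent manipulation that can be spelled out. In the following worked step, $X$ is shorthand (introduced here only for brevity) for the linear operator multiplying $z + R(z)$ in (6.3), so that $H = (1 - X)^{-1}$ by (6.4), and $G_0$ stands for the operator applied to $w(0)$ and to $\delta_2 w$ in (6.3):

```latex
% Identity used in the text: H (1 - X) = 1 gives H = 1 + H X.
H(1 - X) = 1 \;\Longrightarrow\; H = 1 + HX .
% Move the term X R(z) of (6.3) to the left-hand side:
(1 - X)\,R(z) = G_0\, w(0) + X z + G_0\, \delta_2 w\bigl(z + R(z)\bigr).
% Apply H = (1 - X)^{-1} on the left:
R(z) = H G_0\, w(0) + H X z + H G_0\, \delta_2 w\bigl(z + R(z)\bigr),
% which is (6.5) with u(z) the last term, as in (6.6).
```

The gain is that the linear part of the equation has been inverted once and for all, so that the remaining fixed point problem (6.6) for $u$ involves only the quadratic remainder $\delta_2 w$.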

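Since $\Gamma = \Gamma \hat P_{n-1} = \hat P_{n-1}\Gamma$, (5.17) (with $\sigma = s + \xi - \gamma$) and the recursive bound (5.8) (with $n$ replaced by $n - 1$) imply
$$ \|\Gamma D\tilde w(0)\|^{(n-1)}_s \le \|\Gamma D\tilde w(0)\|^{(n-1)}_{s,s+\xi} \le C\varepsilon\eta^{-1}. \tag{6.8} $$
Hence,
$$ \|H\|^{(n-1)}_s \le 2, \tag{6.9} $$
for $\varepsilon = \varepsilon(\eta)$ small enough. Since $B_n \subset B_{n-1}$, $\tilde w(0) = w(0)$, and since bounds (5.6) (with $n$ replaced by $n - 1$), (5.17) and (6.8) hold, the existence of $R$ in $H^\infty(B_n, h^{n-1}_s)$ follows from the existence of $u$ in $H^\infty(B_n, h^{n-1}_s)$. For reasons that will become clear in the next section, we actually show that (6.6) has a solution $u$ in the ball
$$ \mathcal{B} = \bigl\{ u \in H^\infty(\tfrac18 B_{n-1}, h^{n-1}_s) \;\big|\; |||u||| \le \sqrt{\varepsilon}\,\eta^{-n} r^{2(n-1)} \bigr\}. \tag{6.10} $$
This result is stronger, since $B_n \subset \tfrac18 B_{n-1}$ for $r$ small enough. Let us first check that $G$ maps $\mathcal{B}$ into itself. From (6.9) and the recursive bound (5.6), it follows that for all $z \in \tfrac18 B_{n-1}$ and $u \in \mathcal{B}$, $\tilde z \in h^{n-1}_s$ with
$$ \|\tilde z\|_s \le 2\bigl(\tfrac18 r^n + C\varepsilon\eta^{-n} r^{2(n-1)}\bigr) + \sqrt{\varepsilon}\,\eta^{-n} r^{2(n-1)} \le \tfrac12 r^n, $$
for $\varepsilon = \varepsilon(r, \eta)$ and $r = r(\eta)$ small enough. Hence,
$$ \tilde z \in \tfrac12 B_{n-1} \subset B_{n-1} \quad \text{for all } z \in \tfrac18 B_{n-1}, \tag{6.11} $$
and one uses the bound (5.7) to conclude that for all $u \in \mathcal{B}$,
$$ |||G(u)||| \le 2C\eta^{-n}\varepsilon r^{2(n-1)} \le \sqrt{\varepsilon}\,\eta^{-n} r^{2(n-1)}, $$
for $\varepsilon$ small enough. To show that $G$ is a contraction in $\mathcal{B}$, we apply the estimate (4.11) to the functions $\tilde z_i$ given by (6.7) in terms of $u_i \in \mathcal{B}$, $i = 1, 2$. Noting that $|||\tilde z_i||| \le \tfrac12 r^n$, which follows from (6.11), and using in addition (5.7), one obtains,
$$ |||G(u_1) - G(u_2)||| \le 2C\eta^{-n} \sup_{z \in \frac18 B_{n-1}} \|\hat P_{n-1}\delta_2 w(\tilde z_1) - \hat P_{n-1}\delta_2 w(\tilde z_2)\|_s $$
$$ \le 4C\eta^{-n} r^{-n}\, |||\hat P_{n-1}\delta_2 w|||_{A_{n-1}} \sup_{z \in \frac18 B_{n-1}} \|\tilde z_1 - \tilde z_2\|_s $$
$$ \le 4C\varepsilon\eta^{-n} r^{-n} r^{2(n-1)} \sup_{z \in \frac18 B_{n-1}} \|u_1(z) - u_2(z)\|_s $$
$$ \le \tfrac12 |||u_1 - u_2|||, $$
for $r = r(\eta)$ and $\varepsilon = \varepsilon(r, \eta)$ small enough.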


Before turning to part (b) of Proposition 5.1, we make some remarks that shall be useful later. First note that (6.11) means
$$ z + R_n(z) \in \tfrac12 B_{n-1} \quad \text{for all } z \in \tfrac18 B_{n-1}. \tag{6.12} $$
Therefore, with
$$ \tilde R_m(z) \equiv z + R_m(z), \tag{6.13} $$
and
$$ F^m_n(z) \equiv \tilde R_m \circ \tilde R_{m+1} \circ \cdots \circ \tilde R_n(z), \tag{6.14} $$
it follows recursively that for $m = 1, \dots, n$,
$$ F^m_n(z) \in \tfrac12 B_{m-1} \quad \text{for all } z \in B_n. \tag{6.15} $$
Furthermore, since $F^1_n = F_n$, where $F_n$ is defined in (3.13), one has $F_n \in A_n$, together with the uniform bound
$$ |||F_n|||_{A_n} \le |||\tilde R_1|||_{A_1} \le \varepsilon. \tag{6.16} $$

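6.2. Bounds on the functional $w_n$. According to (5.2), one defines $w_{n\beta}(z) = \tilde w_{(n-1)\beta}(z + R_{n\beta}(z))$. Since $R_{n\beta} \in H^\infty(B_n, h^{n-1}_s)$, it follows from (6.12) and the inductive bounds that for all $\beta$ with $|\operatorname{Im}\beta| < \alpha_{n-1}$, $w_{n\beta}$ is well defined as a map from $B_n$ to $h_s$, with $w_{n\beta} \in A_n$. In the sequel, we adopt the simplified notation $R = R_{n\beta}$, $w = w_{(n-1)\beta}$ and $w' = w_{n\beta}$. We proceed with proving (5.6). Using the decomposition (6.2) at $z = 0$, one may write $w'(0) = w(0) + D\tilde w(0)R(0) + \delta_2 w(R(0))$. Since (6.12) implies that $R(0) \in \tfrac12 B_{n-1}$, one obtains using the bounds (5.6), (5.7) and (5.8),
$$ \|\hat P_n w'(0)\|_s \le \varepsilon r^{2(n-1)} + \tfrac12 \varepsilon\eta^{n-1} r^n + \varepsilon r^{2(n-1)} \le 3\varepsilon. \tag{6.17} $$
This leads to
$$ |P^n_{\mathbf{k}} w'(0)(q)|_s \le 3\varepsilon, \tag{6.18} $$
for all $\mathbf{k} \in I^n$ and $q \in S^n_{\mathbf{k}}$. The latter is valid for all $\beta$ with $|\operatorname{Im}\beta| < \alpha_{n-1}$. Let now $\beta'$ with $|\operatorname{Im}\beta'| < \alpha_n$. Then, shifting $\beta'$ to $\beta = \beta' - i(\alpha_{n-1} - \alpha_n)\,q/|q|$ and using the recursive relation (5.5) for $\alpha_n$, one obtains
$$ w'_{\beta'}(0)(q) = e^{i(\beta' - \beta)\cdot q}\, w'_\beta(0)(q) = e^{-n^{-2}\alpha_{n-1}|q|}\, w'_\beta(0)(q). \tag{6.19} $$
Since for such $\beta$ one has $|\operatorname{Im}\beta| < \alpha_{n-1}$, it follows from (6.18) and (6.19) that
$$ \|\hat P_n w'(0)\|_s \le 3\varepsilon \sum_{\mathbf{k} \in I^n} \sum_{q \in S^n_{\mathbf{k}}} e^{-n^{-2}\alpha_{n-1}|q|}. \tag{6.20} $$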
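The exponential factor gained by shifting the translation parameter can be checked directly. The following worked step assumes that $|q|$ denotes the Euclidean norm, so that $q \cdot q / |q| = |q|$, and writes $\beta'$ for the parameter in the narrower strip and $\beta$ for the shifted one:

```latex
% Shift: beta = beta' - i (alpha_{n-1} - alpha_n) q/|q|, hence
i(\beta' - \beta)\cdot q
  = i \cdot i\,(\alpha_{n-1} - \alpha_n)\,\frac{q \cdot q}{|q|}
  = -(\alpha_{n-1} - \alpha_n)\,|q|
  = -\,n^{-2}\,\alpha_{n-1}\,|q| ,
% where the last equality is the recursion alpha_n = (1 - n^{-2}) alpha_{n-1}.
% The shifted parameter stays in the wider strip:
|\operatorname{Im}\beta|
  \le |\operatorname{Im}\beta'| + (\alpha_{n-1} - \alpha_n)
  < \alpha_n + (\alpha_{n-1} - \alpha_n)
  = \alpha_{n-1} .
```

Thus the bound valid in the strip of width $\alpha_{n-1}$ can be used for the shifted parameter, at the price of shrinking the strip to width $\alpha_n$, and each Fourier mode $q$ acquires the decay factor $e^{-n^{-2}\alpha_{n-1}|q|}$.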


From the Diophantine conditions satisfied by $\omega \in \Omega_n(K)$, one infers for $q \in S^n_{\mathbf{k}}$ that $|q| > \min(K\eta^{-n/\nu}, (4K)^{1/\nu}\eta^{-n/\nu})$, cf. (3.25) and (5.3). Therefore, bound (5.6) finally follows by choosing $K$ appropriately, cf. (5.21). We now iterate bound (5.7). Using again the decomposition (6.2), one has $\delta_2 w'(z) = D\tilde w(0)\delta_2 R(z) + \delta_2 \delta_2 w(z + R(z))$. The first term on the right-hand side is estimated by using $\delta_2 R(z) = \delta_2 u(z)$ together with (4.12) applied to $u \in \mathcal{B}$ with $\gamma = 8r$, since $B_n \subset \tfrac18 B_{n-1}$, to obtain
$$ |||\hat P_n D\tilde w(0)\delta_2 R|||_{A_n} \le \|\hat P_{n-1} D\tilde w(0)\|^{n-1}_{s,s} \sup_{z \in B_n} \|\delta_2 u(z)\|_s \le \varepsilon\eta^n\, \frac{(8r)^2}{1 - (8r)^2}\, |||u||| \le \varepsilon r^{2n}\, \frac{\sqrt{\varepsilon}\, 8^2}{1 - (8r)^2} \le \tfrac12 \varepsilon r^{2n}, $$
for $\varepsilon$ small enough. In a similar way, one estimates, using (6.12), that
$$ \sup_{z \in B_n} \|\hat P_n \delta_2 \delta_2 w(z + R(z))\|_s \le \tfrac12 \varepsilon r^{2n}, $$
which finally leads to (5.7).

6.3. Bounds on the derivative. In this section, we prove the estimates stated in part (c) of Proposition 5.1. The main difficulty consists in controlling the diagonal part of the kernel of the derivative $Dw_n$ evaluated at zero, namely $Dw_n(0)(q, q)$, $q \in \mathbb{Z}^d$. To address this problem, as mentioned at the end of Sect. 5, we will use the fact that $Dw_n(0)(q, q)$ depends on $q$ through $\omega \cdot q$ only, and satisfies some continuity property when viewed as a function of $\omega \cdot q$. We start by deriving an a priori bound on the norm of $Dw_n$. From (3.14), one infers that
$$ DR_n(z) = H_n(\tilde z)\Gamma_n D\tilde w_{n-1}(\tilde z), \tag{6.21} $$
where
$$ H_n(\tilde z) = \bigl(1 - \Gamma_n D\tilde w_{n-1}(\tilde z)\bigr)^{-1}, \tag{6.22} $$
$$ \tilde z = z + R_n(z). \tag{6.23} $$
Since by definition, cf. (3.19), one has $Dw_n(z) = D\tilde w_{n-1}(\tilde z)\bigl(1 + DR_n(z)\bigr)$, (6.21) and the identity $H_n(\tilde z) = 1 + H_n(\tilde z)\Gamma_n D\tilde w_{n-1}(\tilde z)$ imply the recursive relation
$$ Dw_n(z) = D\tilde w_{n-1}(\tilde z)H_n(\tilde z). \tag{6.24} $$
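The steps leading to (6.21) and (6.24) can be written out explicitly. In the following worked derivation, $Y$ is shorthand (introduced here only for brevity) for the linear operator obtained by composing the operator in front of $\tilde w_{n-1}$ in (3.14) with $D\tilde w_{n-1}(\tilde z)$, so that $H_n(\tilde z) = (1 - Y)^{-1}$ by (6.22):

```latex
% Differentiate the fixed point equation (3.14), R_n(z) = (...) \tilde w_{n-1}(z + R_n(z)),
% with the chain rule:
DR_n(z) = Y\,\bigl(1 + DR_n(z)\bigr)
\;\Longrightarrow\;
(1 - Y)\,DR_n(z) = Y
\;\Longrightarrow\;
DR_n(z) = (1 - Y)^{-1}\,Y = H_n(\tilde z)\,Y ,
% which is (6.21). Next, (1 - Y)^{-1}(1 - Y) = 1 gives
1 + H_n(\tilde z)\,Y = H_n(\tilde z),
% so that 1 + DR_n(z) = H_n(\tilde z), and differentiating (3.19) yields
Dw_n(z) = D\tilde w_{n-1}(\tilde z)\,\bigl(1 + DR_n(z)\bigr)
        = D\tilde w_{n-1}(\tilde z)\,H_n(\tilde z),
% which is (6.24).
```

The recursion (6.24) therefore expresses the new derivative as the old one dressed by the resolvent factor $H_n$, which is bounded by 2 under the inductive assumptions.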


As in the previous section, it follows from (5.17), (6.12), and the inductive bounds, that $\|H_n(\tilde z)\|^{(n-1)}_s \le 2$ for all $\tilde z \in B_{n-1}$. Therefore, one obtains for all $z \in \tfrac18 B_{n-1}$, using again the inductive bound (5.8),
$$ \|\hat P_n Dw_n(z)\|^{(n)}_{s,s} \le \|\hat P_{n-1} Dw_n(z)\|^{(n-1)}_{s,s} \le 2\varepsilon\eta^{n-1}. \tag{6.25} $$
In order to iterate bounds (5.8), we decompose $Dw_n(z)$ as follows:
$$ Dw_n(z) = \sigma_n + \tau_n + \delta_1 Dw_n(z), \tag{6.26} $$
where $\sigma_n + \tau_n = Dw_n(0)$ and $\sigma_n(q, q') = Dw_n(0)(q, q')\delta_{qq'}$. Let us consider first the last two terms on the right-hand side of (6.26). One has

Lemma 6.1. Let $r$ and $\varepsilon$ be the positive constants of Proposition 5.1. Then, one has for all $n \ge 0$ and all $z \in B_n$,
$$ \|\hat P_n \delta_1 Dw_n(z)\|^{(n)}_{s,s} \le \tfrac12 \varepsilon r^{n/2}, \tag{6.27} $$
$$ \|\hat P_n \tau_n\|^{(n)}_{s,s} \le \varepsilon r^n. \tag{6.28} $$

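Proof. Proceeding by induction, we suppose that Proposition 5.1 and Lemma 6.1 are true up to some $n - 1$, $n \ge 1$. We start with (6.27) and compute from $\delta_1 Dw_n(z) = Dw_n(z) - Dw_n(0)$ and the recursive relation (6.24) that
$$ \delta_1 Dw_n(z) = \tilde H_n(\tilde z_0)\bigl( D\tilde w_{n-1}(\tilde z) - D\tilde w_{n-1}(\tilde z_0) \bigr) H_n(\tilde z), $$
where $\tilde z_0 = R_n(0)$ and $\tilde H_n(\tilde z_0) = 1 + Dw_{n-1}(\tilde z_0)H_n(\tilde z_0)\Gamma_n$. As previously, the inductive bound (5.8) implies $\|\hat P_n \tilde H_n(\tilde z_0)\|^{(n-1)}_s \le 2$. Using (6.12) and $\hat P_n \tilde H_n = \hat P_n \tilde H_n \hat P_{n-1}$, one infers from the identity $D\tilde w_{n-1}(\tilde z) - D\tilde w_{n-1}(\tilde z_0) = \delta_1 D\tilde w_{n-1}(\tilde z) - \delta_1 D\tilde w_{n-1}(\tilde z_0)$ that for all $z \in \tfrac18 B_{n-1}$,
$$ \|\hat P_n \delta_1 Dw_n(z)\|^{(n-1,n)}_{s,s} \le C \sup_{z' \in \frac12 B_{n-1}} \|\hat P_{n-1}\delta_1 D\tilde w_{n-1}(z')\|^{(n-1)}_{s,s}. $$
Since $\delta_1 D\tilde w_{n-1} = \delta_1 Dw_{n-1}$, the recursive bound (6.27) leads to
$$ \|\hat P_n \delta_1 Dw_n(z)\|^{(n-1,n)}_{s,s} \le C\varepsilon r^{(n-1)/2}, $$
for all $z \in \tfrac18 B_{n-1}$. Finally, iterating bound (6.27) is completed by restricting $z$ to $B_n \subset \tfrac18 B_{n-1}$ and using (4.12) with $\gamma = 8r$. Next, we turn to (6.28), the estimate for the off-diagonal part of $Dw_n(0)$. The norm of $\tau_n$ reads
$$ \|\hat P_n \tau_n\|^{(n)}_{s,s} = \sup_{\mathbf{k}' \in I^n} \sup_{q' \in S^n_{\mathbf{k}'}} \sum_{\mathbf{k} \in I^n} \sum_{q \in S^n_{\mathbf{k}}} |P^n_{\mathbf{k}} \tau_n(q, q') P^n_{\mathbf{k}'}|_{s,s}, $$
and one infers from (6.27) and the a priori bound (6.25) that
$$ |P^n_{\mathbf{k}} \tau_n(q, q') P^n_{\mathbf{k}'}|_{s,s} \le 2\varepsilon\eta^{n-1} + \tfrac12 \varepsilon r^{n/2} \le 3\varepsilon. \tag{6.29} $$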


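The latter is valid for all $\beta$ with $|\operatorname{Im}\beta| < \alpha_{n-1}$. Let now $\beta'$ with $|\operatorname{Im}\beta'| < \alpha_n$. Then, shifting $\beta'$ to $\beta = \beta' - i(\alpha_{n-1} - \alpha_n)(q - q')/|q - q'|$, one obtains
$$ \tau_{n\beta'}(q, q') = e^{i(\beta' - \beta)\cdot(q - q')}\, \tau_{n\beta}(q, q') = e^{-n^{-2}\alpha_{n-1}|q - q'|}\, \tau_{n\beta}(q, q'). \tag{6.30} $$
Hence, since $|\operatorname{Im}\beta| < \alpha_{n-1}$ for such $\beta$, (6.29) and (6.30) lead to
$$ \|\hat P_n \tau_n\|^{n}_{s,s} \le 3\varepsilon \sup_{\mathbf{k}' \in I^n} \sup_{q' \in S^n_{\mathbf{k}'}} \sum_{\mathbf{k} \in I^n} \sum_{\substack{q \in S^n_{\mathbf{k}} \\ q \ne q'}} e^{-n^{-2}\alpha_{n-1}|q - q'|}. \tag{6.31} $$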

We now show that every term in the previous sum yields a super-exponentially small factor. Let $q \in S^n_{\mathbf{k}}$ and $q' \in S^n_{\mathbf{k}'}$ for some $\mathbf{k} \in I^n$, $\mathbf{k}' \in I^n$. Then, one estimates using (3.25) and (3.30) that if $\operatorname{sign}(\omega \cdot q) \ne \operatorname{sign}(\omega \cdot q')$,
$$ d\bigl(|\omega \cdot (q - q')|, C^n_{\mathbf{k}} + C^n_{\mathbf{k}'}\bigr) \le \tfrac12 \eta^n + |I^n_{\mathbf{k}}| + |I^n_{\mathbf{k}'}| \le 3\bar d\,\eta^n, $$
and that otherwise
$$ d\bigl(|\omega \cdot (q - q')|, |C^n_{\mathbf{k}} - C^n_{\mathbf{k}'}|\bigr) \le \tfrac12 \eta^n + |I^n_{\mathbf{k}}| + |I^n_{\mathbf{k}'}| \le 3\bar d\,\eta^n. $$
Therefore, since $q \ne q'$, it follows from (5.3) and $\omega \in \Omega_n(K)$ that
$$ |q - q'| \ge \min\Bigl( \Bigl(\frac{K}{3\bar d}\Bigr)^{1/\nu},\ K \Bigr)\, \eta^{-n/\nu}. $$
Hence, the contribution of each term in (6.31) is super-exponentially small, and (6.28) follows for some $r \ll \eta < 1$.

Finally, we turn to $\sigma_n$, the diagonal part of $Dw_n(0)$ in the decomposition (6.26). We first state a result about the continuity properties of the kernel $\sigma_n(q, q)$, namely that $T_p \sigma_n = t_p \sigma_n - \sigma_n$ is of order $|\omega \cdot p|$. More precisely, one has the

Proposition 6.2. Suppose that Proposition 5.1 is valid up to $n - 1$ for some $n \ge 1$. Then, the diagonal part $\sigma_n(z)$ of $Dw_n(z)$ satisfies for all $z \in B_n$ and all $p$ such that $|\omega \cdot p| < \tfrac{1}{16}\eta^{n-1}$,
$$ \|\hat P_n T_p \sigma_n(z)\|^{n}_{s,s} \le \varepsilon^{3/2} |\omega \cdot p|. \tag{6.32} $$

Delaying the proof of the above proposition to the end of this section, we now construct a diagonal operator $A_n \in L(h_s, h_s)$ such that $\tilde\sigma_n \equiv \sigma_n - A_n$ obeys
$$ \|\hat P_n \tilde\sigma_n\|^{n}_{s,s} = \sup_{\mathbf{k} \in I^n} \sup_{q \in S^n_{\mathbf{k}}} |P^n_{\mathbf{k}} \tilde\sigma_n(q, q) P^n_{\mathbf{k}}|_{s,s} \le \tfrac12 \varepsilon\eta^n. \tag{6.33} $$
The equality above follows from the sets $S^n_{\mathbf{k}}$ being pairwise disjoint. This will conclude the proof of iterating (5.8), since (6.27), (6.28) and (6.33) imply that the derivative of $\tilde w_n \equiv w_n - A_n$ satisfies the required bound for $r = r(\eta)$ small enough. In order to obtain bound (6.33) by using the continuity property (6.32), we would like to construct $A_n$ as an approximation of $\sigma_n(q, q)$ for $\omega \cdot q$ close to the normal frequencies in $C^n_{\mathbf{k}}$, $\mathbf{k} \in I^n$. To this end, we set $\bar\mu_{\mathbf{k}}$ to be the center of the interval $I^n_{\mathbf{k}}$ and, using that $\{\omega \cdot q \mid q \in \mathbb{Z}^d\}$ is dense in $\mathbb{R}$, we choose a sequence $\{q_{l,\mathbf{k}}\}_{l \ge 1} \subset S^n_{\mathbf{k}}$ such that $\omega \cdot q_{l,\mathbf{k}} > 0$ for all $l \ge 1$ and
$$ \lim_{l \to \infty} \omega \cdot q_{l,\mathbf{k}} = \bar\mu_{\mathbf{k}}. $$

Renormalization Group and Melnikov Problem for PDE’s


Next, one defines the matrix \hat{a}_{n,k} ∈ L(J_k^n) by

    \hat{a}_{n,k} \equiv \lim_{l \to \infty} P_k^n \sigma_n(q_{l,k}, q_{l,k}) P_k^n.    (6.34)

Due to (6.32), the limit in (6.34) exists and does not depend on the particular choice of the sequence {ql,k }l≥1 . Finally, setting aˆ n,k , (6.35) an ≡ k∈I n

we define the operator An : h → h as given by the diagonal kernel An (q, q) = an IQ+ω (q) + an IQ−ω (q)

(6.36)

for all q ∈ Z^d. We note that by construction, (5.11) is clearly satisfied. Furthermore, it follows from (5.19) and (5.20) that a_n is indeed hermitian. Let us check that definition (6.36) leads to the required bound (6.33). By construction, one has for all k ∈ I^n,

    \lim_{l \to \infty} P_k^n \tilde{\sigma}_n(q_{l,k}, q_{l,k}) P_k^n = 0.    (6.37)

On the other hand, since Tp An = 0, bound (6.32) is also satisfied by σ˜ n , which by definition of the norm implies that |Pkn Tp σ˜ n (q, q)Pkn |s,s ≤ ε 2 |ω · p|, 3

(6.38)

1 n−1 η . The definition of Skn for all q ∈ Skn , k ∈ I n , and p ∈ Zd with |ω · p| < 16 1 n−1 n ¯ together with (3.30) implies that |ω · (q − q )| ≤ 2dη ≤ 16 η for all q, q ∈ Skn with sign (ω · q) = sign (ω · q ) and η small enough. Therefore, using

σ˜ n (q, q) = σ˜ n (q , q ) + Tq−q σ˜ n (q , q ), one infers from (6.38) that for all ql,k and q ∈ Skn with ω · q > 0, |Pkn σ˜ n (q, q)Pkn |s,s ≤ |Pkn σ˜ n (ql,k , ql,k )Pkn |s,s + ε 2 |ω · (q − ql,k )|, 3

(6.39)

which, with (6.37), leads to ¯ 2 ηn . |Pkn σ˜ n (q, q)Pkn |s,s ≤ 2dε 3

(6.40)

For q ∈ Skn with ω · q < 0, we note that (6.39) is also valid if one replaces ql,k by −ql,k , and, due to (5.19), that the same is true of (6.37). Therefore, (6.40) holds for all q ∈ Skn , k ∈ I n , and bound (6.33) follows by taking ε small enough. Finally, we check that An obeys (5.9). The a priori bound (6.25) together with (6.33) imply that (n) ||Pˆn An ||s,s ≤ 3εηn−1 , which, with (5.11) and definition (6.36), leads to (5.9). To complete the proof of part (c) of Proposition 5.1, we are left with the Proof of Proposition 6.2. Denoting Dwn (z) = σn (z) + τn (z), with σn (z)(q, q ) = Dwn (z)(q, q )δqq , one computes from (6.24) the recursive relation σn (z) = σ˜ n−1 (˜z) + Tn (z) Hn (˜z), (6.41)


J. Bricmont, A. Kupiainen, A. Schenkel

where −1 Hn (˜z) = 1 − 9n σ˜ n−1 (˜z) , Tn (z)(q, q ) = τn (z)9n τn−1 (˜z) (q, q )δqq . Setting Rn (z) ≡ σ˜ n−1 Hn (˜z) − 1 ,

Tn (z) ≡ Tn (z)Hn (˜z),

and using T_p σ̃_{n−1} = T_p σ_{n−1} together with the identity T_p σ_0 = 0, one applies (6.41) recursively to obtain

    T_p \sigma_n(z) = \sum_{m=1}^{n} T_p \bigl( R_m(z_m) + T_m(z_m) \bigr),    (6.42)

where zm = Fnm+1 (z), cf (6.14), with Fnn+1 ≡ 1. Note that Rm (z) is diagonal and can be rewritten as Rm (z) = σ˜ m−1 (˜z)9m σ˜ m−1 (˜z)Hm (˜z).

(6.43)

As shown below, each term in (6.42) is easily seen to be of order ε 2 |ω · p|. Thus, the main issue in obtaining (6.32) is to ensure that taking the sum will deteriorate the bound only slightly. Let us first consider the terms involving the quantities Tp Tm . They are higher order terms, since Tm is quadratic in the off-diagonal part τm which, as shown in Lemma 6.1, are bounded by powers of r. Indeed, as carried out in the Appendix, one has for all m = 1, . . . , n and z ∈ Bm , (m) ||Pˆm Tp Tm (z)||s,s ≤ ε2 ηm |ω · p|,

(6.44)

so that

    \Bigl\| \hat{P}_n \sum_{m=1}^{n} T_p T_m(z) \Bigr\|_{s,s}^{(n)} \le \sum_{m=1}^{n} \bigl\| \hat{P}_m T_p T_m(z) \bigr\|_{s,s}^{(m)} \le \varepsilon^2 |\omega \cdot p|.    (6.45)

On the other hand, the terms involving Tp Rm are not higher order terms. Since

Tp Hm (˜z) = tp Hm (˜z) Tp 9m tp σ˜ m−1 (˜z) + 9m Tp σ˜ m−1 (˜z) Hm (˜z), (5.18) with σ = s + ξ − γ and n replaced by m yields with the recursive bound (6.32) ≤ η−m |ω · p|. ||Tp Hm (˜z)||(m−2) s

(6.46)

Thus, using in addition the recursive bounds (5.8) and (6.32), together with ||Hm (˜z) − 1||(m−1) = ||9m σ˜ m−1 (˜z)Hm (˜z)||(m−1) ≤ Cε, s s one obtains for all m = 1, . . . , n and z ∈ Bm , (n)

(m)

||Pˆn Tp Rm (z)||s,s ≤ ||Pˆm Tp Rm (z)||s,s ≤ Cε 2 |ω · p|,

(6.47)


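to be compared with (6.44). However, one can actually show that

    \Bigl\| \hat{P}_n \sum_{m=1}^{n} T_p R_m(z) \Bigr\|_{s,s}^{(n)} \le \sup_{k \in I^n} \sup_{q \in S_k^n} \sum_{m=1}^{n} \bigl| T_p R_m(z)(q) \bigr|_{s,s}    (6.48)
        \le C \varepsilon^2 |\omega \cdot p|,    (6.49)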

with another n-independent constant C. Although (6.47) yields the a priori bound |Tp Rm (z)(q)|s,s ≤ Cε 2 |ω · p| for all q ∈ Skn and k ∈ I n , (6.49) will follow from the fact that all but a finite number of terms in (6.48) are identically zero. More precisely, there is for all k ∈ I n a set Zkn ⊂ {1, . . . , n} with #Zkn uniformly bounded in n and k such that for all q ∈ Skn , |Tp Rm (z)(q)|s,s ≡ 0

if m ∉ Z_k^n.    (6.50)

This leads to (6.49) and concludes the proof of bound (6.32), since (6.42), (6.45) and (6.49) lead to (6.32) by taking ε small enough and by noting that zm ∈ Bm for all z ∈ Bn , cf. (6.15). Identity (6.50) for some finite set Zkn follows from the expression (6.43) for −1 Q P Rm since by localization of scales 9m (q) = (Km m m−1 )(q) = 0 for most m ≤ n if n q ∈ Sk . More precisely, one computes that 1 − χk˜m (q) χ ˜m−1 (q)Pkm Qm (q)Pm−1 (q) = ˜ , km−1

˜ Im k∈

where the index k˜ m−1 serves to denote the (unique) subspace J ˜m−1 containing J ˜m . Fix km−1

now some k ∈ I n . Then one has for all q ∈ Skn and all m < n, χ ˜m−1 (q)Pkm Qm (q)Pm−1 (q) = ˜ = PJ m−1 \J m , m ˜ k∈I k˜ =km

km−1

km−1

k

km

since by construction χkmm (q) = 1 for such m and q. Therefore, Qm (q)Pm−1 (q) = 0 . On the other hand, Jkmm is a strict for all q ∈ Skn if m < n is such that Jkmm = Jkm−1 m−1

only if #Ckmm < #Ckm−1 , i.e., if the eigenvalues contained in Ckm−1 subspace of Jkm−1 m−1 m−1 m−1 have been divided after perturbation by am−1 into two (or more) clusters. But this can be true only for finitely many m since the original eigenvalues µk are finitely many times degenerate. Hence, there is an L < ∞ such that for all n ≥ 1 and all 1 ≤ m ≤ n, one has Pˆn Rm (q) = 0, except for some m1 , . . . , mL . Since the same is true of Pˆn tp Rm (q) provided that p satisfies |ω · p| < ηn−1 /16, (6.50) follows.

6.4. The cluster decomposition. We now check that point (d) of Proposition 5.1 holds. First, (5.9), (5.10) and (5.11) lead to, for k = (k, ·) ∈ I n , |an Pkn |L(Jkn ) ≤ 3k γ −ξ εηn−1 ,

(6.51)

which, since µ2k ≥ ck 2γ by hypothesis (H1), implies that µ2 + nm=0 am ≡ µ˜ 2n+1 is positive definite for ε = ε(c, η) small enough. Next, it follows from an being hermitian that σ (µ˜ n+1 ) ⊂ R+ . Furthermore, using (5.11) and the fact that Jkn is by definition an


invariant subspace for µ˜ n , one infers from µk ≥ ck γ , the asymptotic (5.13) for µ˜ n , and the estimate (6.51), that n −1 −ξ n−1 |an µ˜ −1 . n Pk |L(Jkn ) ≤ 3c k εη

ˆ∞ = Therefore, denoting by Pk the projector onto the k th component of R one obtains

1 µ˜ n+1 Pk = µ˜ 2n + an 2 Pk = µ˜ n Pk + O(k −ξ εηn−1 ),

k≥1 C

dk ,

(6.52)

which, since µPk = µk 1dk , implies by recursion that µ˜ n+1 Pk = µk 1dk + O(εk −ξ ). Hence, the asymptotic (5.13) holds, where for each k ≥ 1 the sequence of clusters n+1 Ck,i , i = 1, . . . , Mkn+1 , forms a partition of the component σ (µ˜ n+1 Pk ) satisfying n+1 n+1 d(Ck,i , Ck,j ) > ηn+1 for i = j . This partition is unique if Mkn+1 is required to be maximal. Furthermore, it follows from (1.13) and (1.14) in (H1) that for ε = ε(c) small enough, the components σ (µ˜ n+1 Pk ) are well separated. Therefore, the sets Skn+1 , k ∈ I n+1 , defined according to (3.25) are pairwise disjoint. Next, (6.52) and the gap condition (5.12) with n + 1 replaced by n imply that for ε = ε(c, η) small enough, n+1 every cluster Ck,i is composed of perturbed eigenvalues belonging to a unique C n n+1 . k,ji

The distance between these two clusters is at most of order O(k −ξ εηn−1 ), so that (3.31) follows for n + 1 by induction. In order to iterate (3.32), we note that by definition, Jkn+1 is the eigenspace of µ˜ n+1 associated with Ckn+1 , k ∈ I n+1 , and that every Jkn , k ∈ I n , n+1 is also an invariant subspace for µ˜ n+1 by (5.11). Therefore, each Jk,i is contained in n n a unique J n+1 , namely, the eigenspace associated with C n+1 . Finally, we check that k,ji

k,ji

n+1 (3.33) iterates. This is a simple consequence of (3.32) and Sk,i ⊂ Sn

k,jin+1

following from (6.52) for ε small enough.

, the latter

7. Measure Estimate

In this section, we prove Proposition 5.2, namely, that +^*(K) = \bigcap_{n \ge 1} +_n(K) satisfies

    \lim_{K \to 0} \mathrm{meas}\bigl( + \setminus +^*(K) \bigr) = 0,    (7.1)

for all bounded + ⊂ Rd . The strategy is standard and consists in studying the complementary sets of +n (K). For n ≥ 1, b > 0, and q ∈ Zd , let us define

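    O_{q,b}^{n} \equiv \bigcup_{k \in I^n} O_{q,b}^{n;k} \;\cup \bigcup_{k,k' \in I^n} O_{q,b}^{n;k,k'},

where

    O_{q,b}^{n;k} = \{ \omega \in \mathbb{R}^d \mid d(|\omega \cdot q|, C_k^n) \le b \},
    O_{q,b}^{n;k,k'} = \{ \omega \in \mathbb{R}^d \mid d(|\omega \cdot q|, |C_k^n \pm C_{k'}^n|) \le b \}.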


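Next, with

    Z_n \equiv \{ q \in \mathbb{Z}^d \mid K^{1/\nu} \eta^{-(n-1)/\nu} \le |q| < K^{1/\nu} \eta^{-n/\nu} \},

and

    O^*(K) \equiv \bigcup_{n \ge 1} \bigcup_{q \in Z_n} O_{q,\, 2K|q|^{-\nu}}^{n},

one shows first, that for all bounded + ⊂ R^d,

    \mathrm{meas}\bigl( + \cap O^*(K) \bigr) \le C_+ K^{\xi/(\xi+1)},    (7.2)

for some constant C_+ depending on + only, and, second, that

    \bigl( O^*(K) \bigr)^c \subseteq +^*(K).    (7.3)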

Obviously, (7.1) follows from (7.2) and (7.3). Below, C+ will denote a generic constant that may change from place to place but depends on + only. Let us start with the bound (7.2). One has n n meas + ∩ O ∗ (K) ≤ Tq,2K|q|−ν + Tˆq,2K|q| (7.4) −ν , n≥1 q∈Zn

where n Tq,b =

k∈I n

n;k ˆ n n;k,k . , Tq,b = meas + ∩ meas + ∩ Oq,b Oq,b

(7.5)

k,k ∈I n

n , we first To treat the terms on the right-hand side of (7.4) involving the quantities Tq,b use (3.30) to estimate, n;k ¯ n ). meas + ∩ Oq,b ≤ C+ (b + dη

Next, we note that the asymptotic behavior of the clusters Ckn , cf. (1.12) and (5.13), n;k is empty if k = (k, ·) satisfies k ≥ C+ |q| for some constant C+ . implies that + ∩ Oq,b Hence, since the number of indices k of the form (k, ·) is uniformly bounded in k, the n is proportional to |q|, and number of terms which are non-zero in the sum defining Tq,b n n ¯ ). Finally, the fact that q ∈ Zn satisfies one obtains the estimate Tq,b ≤ C+ |q|(b + dη n −ν η ≤ K|q| leads to n ¯ Tq,2K|q| |q|1−ν ≤ C+ K, (7.6) −ν ≤ C+ 2K + dK n≥1 q∈Zn

q∈Zd

for ν = ν(d) large enough. To treat the remaining terms in (7.4), we first note that, as above, n;k,k ¯ n ). (7.7) ≤ C+ (b + 2dη meas + ∩ Oq,b Next, one distinguishes the cases γ = 1 and γ > 1. If γ > 1, then for k > k the n;k,k is empty inequality k γ − k γ > k γ −1 and the asymptotic (1.13) imply that + ∩ Oq,b


for k = (k, ·) and k = (k , ·) such that k ≥ C+ |q|1/(γ −1) ≡ kq . Furthermore, it follows from (5.13) that for kb = b− ξ +1 ,

n;(k,i),(k,j ) −ξ meas Oq,b ≤ Ckb . 1

k>Ckb i,j

Therefore, one obtains with (7.7), ξ

n ≤ Cb ξ +1 + Tˆq,b

Ckb k=1

n;(k,i),(k,j )

meas(+ ∩ Oq,b

)+

kq k =2 k 0 and ν = ν(d, ξ ) large enough. We now consider the case γ = 1. From n;k,k is empty for (5.13) and the asymptotic behavior (1.14), it follows first that + ∩ Oq,b k = (k, ·) and k = (k , ·) with k − k = l ≥ C|q|, and second that for all l ≥ 0,

n;(k,i),(k+l,j ) −ξ meas Oq,b ≤ Ckb , k>Ckb i,j

where kb = b− ξ +1 . Therefore, (7.7) leads to 1

ξ

n ¯ n ), Tˆq,b ≤ C|q|b ξ +1 + C+ b− ξ +1 |q|(b + 2dη 1

and one finally obtains for ν = ν(d, ξ ) large enough, n≥1 q∈Zn

ξ

n 1+ξ Tˆq,2K|q| −ν ≤ C+ K

ξ

ξ

|q|1−ν 1+ξ ≤ C+ K 1+ξ .

(7.9)

q∈Zd

Inserting (7.6) and (7.9) into (7.4) yields (7.2). We now check that (7.3) holds. If ω ∉ O^*(K), then the following is true for all n ≥ 1, q ∈ Z_n and k, k' ∈ I^n:

    d(|\omega \cdot q|, C_k^n) > 2K |q|^{-\nu},    (7.10)
    d(|\omega \cdot q|, |C_k^n \pm C_{k'}^n|) > 2K |q|^{-\nu}.    (7.11)

Next, we verify that for such ω, this implies that bounds (7.10) and (7.11) hold for all q ∈ \bigcup_{m=1}^{n} Z_m provided one replaces the constant 2K on the right-hand side by K. This in turn implies that ω ∈ +n (K) for all n ≥ 1, so that ω ∈ +∗ (K). Let m < n and fix some k ∈ I^n. Then, recalling (3.31), namely that there is at least one k' ∈ I^m for which

    \sup_{x \in I_k^n} \inf_{y \in I_{k'}^m} d(x, y) \le \eta^{m+1},

and since, on the other hand, η^m < K|q|^{−ν} whenever q ∈ Z_m, one infers from (7.10) with n replaced by m that for q ∈ Z_m and η < 1,

    d(|\omega \cdot q|, C_k^n) \ge d(|\omega \cdot q|, C_{k'}^m) - \eta^{m+1} \ge (2K - \eta K) |q|^{-\nu} > K |q|^{-\nu}.    (7.12)

Since (7.12) holds for all q ∈ Zm , 1 ≤ m ≤ n, one concludes that d(|ω · q|, Ckn ) > K|q|−ν whenever 0 < |q| < Kη−n/ν . In a similar way, one derives an identical lower bound on d(|ω · q|, |Ckn ± Ckn |), thus achieving the proof of (7.3) and (7.1). 8. Proof of Theorem 1.1 Defining zn ≡ Fn (0), we now show that zn converges in hs , as n → ∞, to a function z whose Fourier transform is real analytic and provides a solution of Eq. (3.4). Using Fn (0) = Fn−1 (Rn (0)), cf. (3.13), one computes that zn − zn−1 = δ1 Fn−1 Rn (0) . According to (6.5), Rn (0) = Hn 9n wn−1 (0) + u(0), so that (5.6), (5.17), (6.9), (6.10) and the identity 9n = 9n Pˆn−1 lead to ||Rn (0)||hn−1 ≤ η−n r 2(n−1) . s Therefore, since, Fn−1 ∈ An−1 = H ∞ (Bn−1 , hs ), one can apply (4.12) to δ1 Fn−1 with γ = η−n r n−2 to obtain ||zn − zn−1 ||s ≤ Cη−n r n−2 |||Fn−1 |||An−1 , and the convergence of zn in hs follows from the uniform bound (6.16) by taking r = r(η) small enough. Bound (6.16) also implies ||zβ || ≤ ε uniformly in the strip | Im β| < α = −2 α ∞ n=2 (1 − n ). This yields the pointwise estimate

|z(q)| ≤ εe−α |q| , and, consequently, ensures the real analyticity of the Fourier transform of z. In order to prove that the limit z solves Eq. (3.6), namely, K0 z = w0 (z), we will show below that K0 zn = Qn w0 (zn ) + A 1, but with the transformation q → 1/q, α → α ∗ , β → qβ and z → z, we get a sphere which is C ∗ -isomorphic to one for |q| < 1. It is clear that the quotient of the C ∗ -algebra Aq by the ideal generated by z can be identified with the C ∗ -algebra of the compact quantum group SUq (2). However, in this paper we shall not make any use of additional structures (like coproduct, counit, and antipode) coming from SUq (2). In [19] it was shown that for q ∈ (−1, 0) ∪ (0, 1) the spaces SUq (2) are all homeomorphic in the sense that the corresponding C ∗ -algebras are isomorphic. Then, for q ∈ (−1, 0) ∪ (0, 1), all our C ∗ -algebras Aq are isomorphic as well and all corresponding spheres are homeomorphic. For the generic situation when −1 < q < 0 or 0 < q < 1 any character χ of Aq has to satisfy the equations χ (α ∗ ) = χ (α), χ (β) = 0

and

χ (β ∗ ) = χ (β),

χ (z∗ ) = χ (z),

|χ (α)|2 + (χ (z))2 = 1.

(4)

To show that the space of all characters is homeomorphic to the two dimensional sphere S^2, we take a generic α ∈ C and z ∈ R such that |α|^2 + z^2 = 1. Then, from the general considerations presented above, there is a 1-dimensional representation (that is, a character) χ of A_q such that χ(α) = α, χ(β) = 0 and χ(z) = z, and this proves the homeomorphism in question. Hence, for −1 < q < 0 or 0 < q < 1 the space of (nonzero) characters of A_q, which can be thought of as the space of “classical points” of S_q^4, is homeomorphic to the classical S^2. For the particular case q = 1 the algebra of the sphere S_q^4 is commutative. The associated space of characters is homeomorphic to the 4-dimensional sphere S^4. Indeed any character χ of A_{q=1} satisfies the equations

    \chi(\alpha^*) = \overline{\chi(\alpha)}, \quad \chi(\beta^*) = \overline{\chi(\beta)}, \quad \chi(z^*) = \chi(z),

and

    |\chi(\alpha)|^2 + |\chi(\beta)|^2 + (\chi(z))^2 = 1.    (5)

To show that any element of S 4 arises in this way, similarly to what we did before we take generic α , β ∈ C and z ∈ R such that |α |2 + |β |2 + z2 = 1. Thus they satisfy relations (5) (or relations (1) for q = 1) and there is a 1-dimensional representation χ of Aq (q = 1) such that χ (α) = α , χ (β) = β and χ (z) = z . This proves the homeomorphism in question and shows that the algebra Aq for q = 1 can be identified with the algebra of all continuous functions on the 4-dimensional sphere S 4 . It is in this sense that Sq4 provides a deformation of the classical S 4 . Next, we describe irreducible representations of the algebra Aq (for −1 < q < 0 or 0 < q < 1) as bounded operators on an infinite dimensional Hilbert space H with an orthonormal basis {ψn , n = 0, 1, 2, · · · }. With λ ∈ C, |λ| ≤ 1, we get two families of


L. D¸abrowski, G. Landi, T. Masuda

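representations π_{λ,±} : A_q → B(H) given by

    \pi_{\lambda,\pm}(z) \psi_n = \pi_{\lambda,\pm}(z^*) \psi_n = \pm \sqrt{1 - |\lambda|^2} \, \psi_n,
    \pi_{\lambda,\pm}(\alpha) \psi_n = \lambda \sqrt{1 - q^{2(n+1)}} \, \psi_{n+1},
    \pi_{\lambda,\pm}(\alpha^*) \psi_n = \bar{\lambda} \sqrt{1 - q^{2n}} \, \psi_{n-1},    (6)
    \pi_{\lambda,\pm}(\beta) \psi_n = \lambda q^n \psi_n,
    \pi_{\lambda,\pm}(\beta^*) \psi_n = \bar{\lambda} q^n \psi_n.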

To be precise, for λ such that |λ| = 1, the two representations πλ,+ and πλ,− are identical so that, in fact, we have a family of representations parametrized by points on a classical sphere S 2 , similarly to what happens for one dimensional representations (characters) as described before. As mentioned already, the quotient of the C ∗ -algebra Aq by the ideal generated by z is the C ∗ -algebra of the compact quantum group SUq (2). Then, with |λ| = 1, the representations πλ,+ = πλ,− =: πλ yield representations of SUq (2) which are unitary equivalent to the ones constructed by Woronowicz (see for instance ([21])).
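The formulas (6) can be sanity-checked with a few lines of arithmetic. The sketch below assumes that the factors 1 − |λ|^2, 1 − q^{2(n+1)} and 1 − q^{2n} in (6) stand under square roots, and it computes the coefficient of ψ_n under the two combinations αα* + ββ* + z^2 and α*α + q^2 ββ* + z^2. These two combinations are chosen here as natural deformations of the sphere relation in (5); relations (1) are not reproduced in this section, so this is only a consistency check, not a verification of (1). The function name is hypothetical.

```python
import math


def diag_coefficients(lam, q, sign, n):
    """Coefficient of psi_n under two quadratic combinations of the operators in (6).

    Assumed (square-root) reading of (6):
      z  psi_n = sign * sqrt(1 - |lam|^2) psi_n
      a  psi_n = lam * sqrt(1 - q^(2(n+1))) psi_{n+1}
      a* psi_n = conj(lam) * sqrt(1 - q^(2n)) psi_{n-1}
      b  psi_n = lam * q^n psi_n,  b* psi_n = conj(lam) * q^n psi_n
    """
    r2 = abs(lam) ** 2
    z2 = (sign * math.sqrt(1.0 - r2)) ** 2
    b_bstar = r2 * q ** (2 * n)                 # beta beta* on psi_n
    a_astar = r2 * (1.0 - q ** (2 * n))         # alpha alpha* on psi_n
    astar_a = r2 * (1.0 - q ** (2 * (n + 1)))   # alpha* alpha on psi_n
    return a_astar + b_bstar + z2, astar_a + q ** 2 * b_bstar + z2


if __name__ == "__main__":
    for n in range(4):
        print(n, diag_coefficients(0.3 + 0.4j, 0.5, +1, n))
```

With this reading both combinations act as the identity on every basis vector, for either sign, which is what one would expect of a representation of a deformed sphere.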

3. The Instanton and Its Classes

Consider now the following element e in the algebra Mat_4(A_q) ≃ Mat_4(C) ⊗ A_q:

    e = \frac{1}{2} \begin{pmatrix}
        1+z      & 0        & \alpha    & \beta    \\
        0        & 1+z      & -q\beta^* & \alpha^* \\
        \alpha^* & -q\beta  & 1-z       & 0        \\
        \beta^*  & \alpha   & 0         & 1-z
    \end{pmatrix}.    (7)
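In the classical case q = 1 the generators commute, so the matrix (7) can be tested numerically at a point of S^4. The following is a minimal sketch under two assumptions: (7) is read row by row as (1+z, 0, α, β), (0, 1+z, −qβ*, α*), (α*, −qβ, 1−z, 0), (β*, α, 0, 1−z) with overall factor 1/2, and α, β, z are replaced by complex and real numbers with |α|^2 + |β|^2 + z^2 = 1. All function names are hypothetical, and the check says nothing about q ≠ 1, where relations (1) are needed.

```python
import math
import random


def instanton_projector(alpha, beta, z):
    """Matrix (7) at q = 1, with alpha, beta complex numbers and z real."""
    ac, bc = alpha.conjugate(), beta.conjugate()
    rows = [
        [1 + z, 0, alpha, beta],
        [0, 1 + z, -bc, ac],
        [ac, -beta, 1 - z, 0],
        [bc, alpha, 0, 1 - z],
    ]
    return [[0.5 * complex(entry) for entry in row] for row in rows]


def matmul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def dagger(a):
    size = len(a)
    return [[a[j][i].conjugate() for j in range(size)] for i in range(size)]


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def random_point_on_s4(rng):
    """Return (alpha, beta, z) with |alpha|^2 + |beta|^2 + z^2 = 1."""
    v = [rng.gauss(0.0, 1.0) for _ in range(5)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    return complex(v[0], v[1]), complex(v[2], v[3]), v[4]


if __name__ == "__main__":
    alpha, beta, z = random_point_on_s4(random.Random(0))
    e = instanton_projector(alpha, beta, z)
    print("idempotent defect:", max_abs_diff(matmul(e, e), e))
    print("hermiticity defect:", max_abs_diff(dagger(e), e))
    print("trace:", sum(e[i][i] for i in range(4)).real)
```

At such a point the matrix is a hermitian idempotent of trace 2 up to rounding, in line with the rank-two statement and with the vanishing of the 0th component of the character in (15).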

Using the relations (1) it can be verified that e is a selfadjoint idempotent (projection), e^2 = e = e^*. It operates on the right A_q-module A_q^4 = A_q ⊗ C^4 and its range may be thought of as sections of a vector bundle over S_q^4. It is easy to see that eA_q^4 is a deformation of the classical instanton bundle over S^4 in the sense that for q = 1, the module eA_q^4 is the module of sections of the complex rank two instanton bundle over S^4 [1]. Next, we compute the Chern–Connes Character of the idempotent e given in (7). If ⟨ ⟩ is the projection on the commutant of 4 × 4 matrices, up to normalization the components of the (reduced) Chern–Connes Character are given by

    ch_n(e) = \Bigl\langle \bigl( e - \tfrac{1}{2} \bigr) \otimes \underbrace{e \otimes \cdots \otimes e}_{2n} \Bigr\rangle, \quad n = 0, 1, 2, \ldots,    (8)

and they are elements of

    A_q \otimes \underbrace{\bar{A}_q \otimes \cdots \otimes \bar{A}_q}_{2n},    (9)

where Ā_q = A_q/CI is the quotient of the algebra A_q by the scalar multiples of the unit.

Instantons on the Quantum 4-Spheres Sq4


The crucial property of the components ch_n(e) is that they define a cycle in the (b, B) bicomplex of cyclic homology [3, 12], that is,

    B \, ch_n(e) = b \, ch_{n+1}(e).    (10)

The operator b is defined by

    b(a_0 \otimes a_1 \otimes \cdots \otimes a_m) = \sum_{j=0}^{m-1} (-1)^j a_0 \otimes \cdots \otimes a_j a_{j+1} \otimes \cdots \otimes a_m + (-1)^m a_m a_0 \otimes a_1 \otimes \cdots \otimes a_{m-1},    (11)

while the operator B is written as

    B = A B_0,    (12)

where

    B_0(a_0 \otimes a_1 \otimes \cdots \otimes a_m) = I \otimes a_0 \otimes a_1 \otimes \cdots \otimes a_m,    (13)
    A(a_0 \otimes a_1 \otimes \cdots \otimes a_m) = \frac{1}{m} \sum_{j=0}^{m} (-1)^{mj} a_j \otimes a_{j+1} \otimes \cdots \otimes a_{j-1},    (14)

with the obvious cyclic identification m + 1 = 0. To be precise, in formulæ (11), (13) and (14), all elements in the tensor products but the first one should be taken modulo complex multiples of the unit I, that is, one has to project onto Ā_q = A_q/CI. For the 0th component of the Chern–Connes Character of the idempotent (7) on the spheres S_q^4 we find

    ch_0(e) = \bigl\langle e - \tfrac{1}{2} \bigr\rangle = 0.    (15)

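This could be interpreted as saying that the idempotent and the corresponding module (the “vector bundle”) have complex rank equal to 2. Next, for the 1st component we have

    ch_1(e) = \bigl\langle \bigl( e - \tfrac{1}{2} \bigr) \otimes e \otimes e \bigr\rangle
            = \frac{1}{8} (1 - q^2) \bigl[ z \otimes (\beta \otimes \beta^* - \beta^* \otimes \beta) + \beta^* \otimes (z \otimes \beta - \beta \otimes z) + \beta \otimes (\beta^* \otimes z - z \otimes \beta^*) \bigr].    (16)

It is straightforward to check that

    b \, ch_1(e) = 0 = B \, ch_0(e).    (17)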
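The boundary operator b of (11) is easy to experiment with on a computer. The sketch below is not the computation behind (17) or (25), which needs the relations (1) of A_q; it implements (11) in a free algebra, where elements are words and the product is concatenation (an assumption made purely for illustration), and checks the basic property b∘b = 0 on a sample chain. The function name is hypothetical.

```python
from collections import defaultdict


def hochschild_b(chain):
    """Apply the boundary b of (11) to a chain {(a0, ..., am): coefficient}.

    Algebra elements are words (strings) of a free algebra, so the product
    a_j a_{j+1} is string concatenation (an illustrative assumption).
    """
    result = defaultdict(int)
    for tensor, coeff in chain.items():
        m = len(tensor) - 1
        if m == 0:
            continue  # b vanishes on 0-chains
        # Terms (-1)^j a0 x ... x a_j a_{j+1} x ... x a_m for j = 0, ..., m-1.
        for j in range(m):
            merged = tensor[:j] + (tensor[j] + tensor[j + 1],) + tensor[j + 2:]
            result[merged] += (-1) ** j * coeff
        # Cyclic term (-1)^m a_m a_0 x a_1 x ... x a_{m-1}.
        wrapped = (tensor[m] + tensor[0],) + tensor[1:m]
        result[wrapped] += (-1) ** m * coeff
    return {t: c for t, c in result.items() if c != 0}


if __name__ == "__main__":
    print(hochschild_b({("x", "y"): 1}))
    print(hochschild_b(hochschild_b({("x", "y", "z", "w"): 1})))
```

On a 1-chain x ⊗ y the operator returns the commutator xy − yx, and applying it twice to x ⊗ y ⊗ z ⊗ w gives the empty chain. To reproduce (17) one would have to replace concatenation by multiplication in A_q and reduce all tensor factors but the first modulo the unit.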


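Finally, the 2nd component

    ch_2(e) = \bigl\langle \bigl( e - \tfrac{1}{2} \bigr) \otimes e \otimes e \otimes e \otimes e \bigr\rangle    (18)

can be written as a sum of five terms

    ch_2(e) = \frac{1}{32} \bigl( z \otimes c_z + \alpha \otimes c_\alpha + \alpha^* \otimes c_{\alpha^*} + \beta \otimes c_\beta + \beta^* \otimes c_{\beta^*} \bigr),    (19)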

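with

    c_z = (1 - q^4) (\beta \otimes \beta^* \otimes \beta \otimes \beta^* - \beta^* \otimes \beta \otimes \beta^* \otimes \beta)
        + (1 - q^2) \bigl[ z \otimes z \otimes (\beta \otimes \beta^* - \beta^* \otimes \beta)
        + (\beta \otimes z \otimes z \otimes \beta^* - \beta^* \otimes z \otimes z \otimes \beta)
        + (\beta \otimes \beta^* - \beta^* \otimes \beta) \otimes z \otimes z
        + z \otimes (\beta \otimes \beta^* - \beta^* \otimes \beta) \otimes z
        - z \otimes (\beta \otimes z \otimes \beta^* - \beta^* \otimes z \otimes \beta)
        - (\beta \otimes z \otimes \beta^* - \beta^* \otimes z \otimes \beta) \otimes z \bigr]
        + (\alpha \otimes \alpha^* - q^2 \alpha^* \otimes \alpha) \otimes (\beta \otimes \beta^* - \beta^* \otimes \beta)
        + (\beta \otimes \beta^* - \beta^* \otimes \beta) \otimes (\alpha \otimes \alpha^* - q^2 \alpha^* \otimes \alpha)
        + (\beta \otimes \alpha - q \alpha \otimes \beta) \otimes (\alpha^* \otimes \beta^* - q \beta^* \otimes \alpha^*)
        + (\alpha^* \otimes \beta^* - q \beta^* \otimes \alpha^*) \otimes (\beta \otimes \alpha - q \alpha \otimes \beta)
        + (\alpha^* \otimes \beta - q \beta \otimes \alpha^*) \otimes (q \alpha \otimes \beta^* - \beta^* \otimes \alpha)
        + (q \alpha \otimes \beta^* - \beta^* \otimes \alpha) \otimes (\alpha^* \otimes \beta - q \beta \otimes \alpha^*);    (20)

    c_\alpha = (z \otimes \alpha^* - \alpha^* \otimes z) \otimes (\beta^* \otimes \beta - \beta \otimes \beta^*)
        + q^2 (\beta^* \otimes \beta - \beta \otimes \beta^*) \otimes (z \otimes \alpha^* - \alpha^* \otimes z)
        + q (z \otimes \beta - \beta \otimes z) \otimes (\alpha^* \otimes \beta^* - q \beta^* \otimes \alpha^*)
        + (\alpha^* \otimes \beta^* - q \beta^* \otimes \alpha^*) \otimes (z \otimes \beta - \beta \otimes z)
        + q (\beta^* \otimes z - z \otimes \beta^*) \otimes (\alpha^* \otimes \beta - q \beta \otimes \alpha^*)
        + (\alpha^* \otimes \beta - q \beta \otimes \alpha^*) \otimes (\beta^* \otimes z - z \otimes \beta^*);    (21)

    c_{\alpha^*} = q^2 (z \otimes \alpha - \alpha \otimes z) \otimes (\beta \otimes \beta^* - \beta^* \otimes \beta)
        + (\beta \otimes \beta^* - \beta^* \otimes \beta) \otimes (z \otimes \alpha - \alpha \otimes z)
        + (\beta^* \otimes z - z \otimes \beta^*) \otimes (\beta \otimes \alpha - q \alpha \otimes \beta)
        + q (\beta \otimes \alpha - q \alpha \otimes \beta) \otimes (\beta^* \otimes z - z \otimes \beta^*)
        + (z \otimes \beta - \beta \otimes z) \otimes (\beta^* \otimes \alpha - q \alpha \otimes \beta^*)
        + q (\beta^* \otimes \alpha - q \alpha \otimes \beta^*) \otimes (z \otimes \beta - \beta \otimes z);    (22)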


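    c_\beta = (1 - q^4) \bigl[ (\beta^* \otimes z - z \otimes \beta^*) \otimes \beta \otimes \beta^* + \beta^* \otimes \beta \otimes (\beta^* \otimes z - z \otimes \beta^*) \bigr]
        + (1 - q^2) \bigl[ \beta^* \otimes z \otimes z \otimes z - z \otimes \beta^* \otimes z \otimes z + z \otimes z \otimes \beta^* \otimes z - z \otimes z \otimes z \otimes \beta^* \bigr]
        + (\beta^* \otimes z - z \otimes \beta^*) \otimes (\alpha \otimes \alpha^* - q^2 \alpha^* \otimes \alpha)
        + (\alpha \otimes \alpha^* - q^2 \alpha^* \otimes \alpha) \otimes (\beta^* \otimes z - z \otimes \beta^*)
        + (\alpha \otimes z - z \otimes \alpha) \otimes (\alpha^* \otimes \beta^* - q \beta^* \otimes \alpha^*)
        + q (\alpha^* \otimes \beta^* - q \beta^* \otimes \alpha^*) \otimes (\alpha \otimes z - z \otimes \alpha)
        + (\beta^* \otimes \alpha - q \alpha \otimes \beta^*) \otimes (\alpha^* \otimes z - z \otimes \alpha^*)
        + q (\alpha^* \otimes z - z \otimes \alpha^*) \otimes (\beta^* \otimes \alpha - q \alpha \otimes \beta^*);    (23)

    c_{\beta^*} = (1 - q^4) \bigl[ (z \otimes \beta - \beta \otimes z) \otimes \beta^* \otimes \beta + \beta \otimes \beta^* \otimes (z \otimes \beta - \beta \otimes z) \bigr]
        + (1 - q^2) \bigl[ - \beta \otimes z \otimes z \otimes z + z \otimes \beta \otimes z \otimes z - z \otimes z \otimes \beta \otimes z + z \otimes z \otimes z \otimes \beta \bigr]
        + (z \otimes \beta - \beta \otimes z) \otimes (\alpha \otimes \alpha^* - q^2 \alpha^* \otimes \alpha)
        + (\alpha \otimes \alpha^* - q^2 \alpha^* \otimes \alpha) \otimes (z \otimes \beta - \beta \otimes z)
        + q (z \otimes \alpha^* - \alpha^* \otimes z) \otimes (\beta \otimes \alpha - q \alpha \otimes \beta)
        + (\beta \otimes \alpha - q \alpha \otimes \beta) \otimes (z \otimes \alpha^* - \alpha^* \otimes z)
        + q (\alpha^* \otimes \beta - q \beta \otimes \alpha^*) \otimes (z \otimes \alpha - \alpha \otimes z)
        + (z \otimes \alpha - \alpha \otimes z) \otimes (\alpha^* \otimes \beta - q \beta \otimes \alpha^*).    (24)

By using the relations (1) for our algebra, and remembering that we need to project on Ā_q in all terms of the tensor product but the first one, a long (one needs to compute 750 terms) but straightforward computation gives

    b \, ch_2(e) = \frac{1}{16} (1 - q^2) \bigl[ I \otimes z \otimes (\beta \otimes \beta^* - \beta^* \otimes \beta) + I \otimes \beta \otimes (\beta^* \otimes z - z \otimes \beta^*) + I \otimes \beta^* \otimes (z \otimes \beta - \beta \otimes z) \bigr],    (25)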

and this is exactly equal to B ch_1(e).

4. Final Remarks

There are several directions in which one can proceed and we just mention some of them. It would be clearly very interesting to study differential calculi on our quantum 4-sphere and develop Yang–Mills theory. Another natural question is to what extent the sphere S_q^4 could be endowed with a structure of a metric noncommutative manifold which fulfills (some of) the related axioms [5, 6]. In particular one should construct an appropriate Dirac operator. This will probably be possible along the lines of [8] where it was suggested that the true Dirac operator D for the quantum SU_q(2) (and also for the quantum Podleś 2-sphere S_q^2 [16]) should satisfy an equation of the form

    \frac{q^{2D} - q^{-2D}}{q^2 - q^{-2}} = Q,    (26)

where Q is some q-analogue of the Dirac operator like the ones found in [2, 13]. Once the operator D is found, one would easily “suspend” it to the 4-sphere S_q^4. Finally, we mention that it will be interesting to study if there is any relation with the sheaf-theoretic construction of a q-deformed instanton in [15].

Acknowledgements. We are grateful to Alain Connes for several enlightening conversations. This work has been partially supported by the Regione Friuli-Venezia-Giulia via the Research Project “Noncommutative geometry: algebraic, analytical and probabilistic aspects and applications to mathematical physics”.

References

1. Atiyah, M.F.: Geometry of Yang–Mills fields. Accad. Naz. Dei Lincei, Scuola Norm. Sup. Pisa, 1979
2. Bibikov, P.N., Kulish, P.P.: Dirac operators on quantum SU(2) group and quantum sphere. q-alg/9608012
3. Connes, A.: Noncommutative differential geometry. Inst. Hautes Etudes Sci. Publ. Math. 62, 257–360 (1985)
4. Connes, A.: Noncommutative geometry. London–New York: Academic Press, 1994
5. Connes, A.: Gravity coupled with matter and foundation of noncommutative geometry. Commun. Math. Phys. 182, 155–176 (1996)
6. Connes, A.: Noncommutative geometry: The spectral aspect. Les Houches Session LXIV, London–New York: Elsevier, 1998, pp. 643–685
7. Connes, A.: A short survey of noncommutative geometry. J. Math. Phys. 41, 3832–3866 (2000)
8. Connes, A., Landi, G.: Noncommutative manifolds, the instanton algebra and isospectral deformations. math.QA/0011194, Commun. Math. Phys. 221, 141–159 (2001)
9. Dąbrowski, L., Landi, G.: Instanton algebras and quantum 4-spheres. math.QA/0101177
10. Furuuchi, K.: Instantons on noncommutative R^4 and projection operators. Prog. Theor. Phys. 103, 1043 (2000)
11. Kapustin, A., Kuznetsov, A., Orlov, D.: Noncommutative instantons and twistor transform. hep-th/0002193
12. Loday, J.L.: Cyclic homology. Berlin–Heidelberg–New York: Springer, 1998
13. Majid, S.: Riemannian geometry of quantum groups and finite groups with nonuniversal differentials. math.QA/0006150
14. Nekrasov, N., Schwarz, A.: Instantons on noncommutative R^4 and (2,0) superconformal six dimensional theory. Commun. Math. Phys. 198, 689–703 (1998)
15. Pflaum, M.J.: Quantum groups on fibre bundles. Commun. Math. Phys. 166, 279–316 (1994)
16. Podleś, P.: Quantum spheres. Lett. Math. Phys. 14, 521–531 (1987)
17. Rieffel, M.: Vector bundles over higher dimensional noncommutative tori. Lect. Notes. Math. 1132, Berlin–Heidelberg–New York: Springer-Verlag, 1985, pp. 456–467
18. Rieffel, M., Schwarz, A.: Morita equivalence of multidimensional noncommutative tori. Int. J. Math. 10, 289–299 (1999)
19. Woronowicz, S.L.: Twisted SU(2) group. An example of a non-commutative differential calculus. Publications of RIMS Kyoto University, Vol. 23 No. 1, 117–181 (1987)
20. Woronowicz, S.L.: Compact matrix pseudogroup. Commun. Math. Phys. 111, 613–665 (1987)
21. Woronowicz, S.L., Dąbrowski, L., Nurowski, P.: Compact and non-compact quantum groups. I. Preprint 153/95/FM, SISSA, Trieste, 1995

Communicated by A. Connes

Commun. Math. Phys. 221, 169 – 196 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Hyperelliptic Prym Varieties and Integrable Systems

Rui Loja Fernandes1, Pol Vanhaecke2

1 Departamento de Matemática, Instituto Superior Técnico, 1049-001 Lisboa, Portugal. E-mail: [email protected]
2 Université de Poitiers, Département de Mathématiques, Téléport 2, Boulevard Marie et Pierre Curie, BP 30179, 86962 Futuroscope Chasseneuil Cedex, France. E-mail: [email protected]

Received: 12 December 2000 / Accepted: 26 March 2001

Abstract: We introduce two algebraic completely integrable analogues of the Mumford systems which we call hyperelliptic Prym systems, because every hyperelliptic Prym variety appears as a fiber of their momentum map. As an application we show that the general fiber of the momentum map of the periodic Volterra lattice

    \dot{a}_i = a_i (a_{i-1} - a_{i+1}), \quad i = 1, \ldots, n, \quad a_{n+1} = a_1,

is an affine part of a hyperelliptic Prym variety, obtained by removing n translates of the theta divisor, and we conclude that this integrable system is algebraic completely integrable.

Contents
1. Introduction
2. Hyperelliptic Prym Varieties
3. The Hyperelliptic Prym Systems
4. The Periodic Toda Lattices and KM Systems
5. Painlevé Analysis
6. Example: n = 5

Supported in part by FCT-Portugal through the Research Units Pluriannual Funding Program, European Research Training Network HPRN-CT-2000-00101 and grant POCTI/1999/MAT/33081.

1. Introduction

In this paper we introduce two algebraic completely integrable (a.c.i.) systems, similar to the even and odd Mumford systems (see [12] for the odd systems and [15] for the even systems). By a.c.i. we mean that the general level set of the momentum map is

isomorphic to an affine part of an Abelian variety and that the integrable flows are linearized by this isomorphism ([16]). The phase space of these systems is described by triplets of polynomials (u(x), v(x), w(x)), as in the case of the Mumford system, but now we have the extra constraints that u, w are even and v odd for the first system (the “odd” case), and with the opposite parities for the other system (the “even” case). We show that in the odd case the general fiber of the momentum map is an affine part of a Prym variety, obtained by removing three translates of its theta divisor, while in the even case the general fiber has two affine parts of the above form. We call these systems the odd and the even hyperelliptic Prym system because every hyperelliptic Prym variety (more precisely an affine part of it) appears as the fiber of their momentum map. Thus we find the same universality as in the Mumford system: in the latter every hyperelliptic Jacobian appears as the fiber of its momentum map. To show that the hyperelliptic Prym systems are a.c.i. we exhibit a family of compatible (linear) Poisson structures, making these systems multi-Hamiltonian. These structures are not just restrictions of the Poisson structures on the Mumford system. Rather they can be identified as follows: the hyperelliptic Prym systems are fixed point varieties of a Poisson involution (with respect to certain Poisson structures of the Mumford system) and we prove a general proposition stating that such a subvariety always inherits a Poisson structure (Prop. 3.4). As an application we study the algebraic geometry and the Hamiltonian structure of the periodic Volterra lattice a˙ i = ai (ai−1 − ai+1 )

i = 1, . . . , n;    an+1 = a1.    (1)
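The lattice (1) is simple to integrate numerically. As a minimal sketch, the code below treats the indices cyclically (so that a_0 = a_n as well as a_{n+1} = a_1, which is how the periodicity condition is read here) and uses a classical fourth-order Runge–Kutta step. It then monitors two quantities that follow directly from (1): the sum of the a_i and the product of the a_i, both constant along the flow. The function names, step size and initial data are illustrative choices, not taken from the paper.

```python
def volterra_rhs(a):
    """Right-hand side of (1) with cyclic indices: da_i/dt = a_i (a_{i-1} - a_{i+1})."""
    n = len(a)
    return [a[i] * (a[i - 1] - a[(i + 1) % n]) for i in range(n)]


def rk4_step(a, dt):
    """One classical Runge-Kutta step of size dt."""
    k1 = volterra_rhs(a)
    k2 = volterra_rhs([x + 0.5 * dt * k for x, k in zip(a, k1)])
    k3 = volterra_rhs([x + 0.5 * dt * k for x, k in zip(a, k2)])
    k4 = volterra_rhs([x + dt * k for x, k in zip(a, k3)])
    return [x + dt * (p + 2 * q + 2 * r + s) / 6.0
            for x, p, q, r, s in zip(a, k1, k2, k3, k4)]


def integrate(a, dt, steps):
    for _ in range(steps):
        a = rk4_step(a, dt)
    return a


def product(values):
    result = 1.0
    for v in values:
        result *= v
    return result


if __name__ == "__main__":
    a0 = [1.0, 0.5, 2.0, 1.5, 0.8]   # n = 5, arbitrary positive initial data
    a1 = integrate(a0, 1e-3, 1000)
    print("sum drift:", abs(sum(a1) - sum(a0)))
    print("product drift:", abs(product(a1) - product(a0)))
```

The sum is conserved because the terms a_i a_{i−1} and a_i a_{i+1} cancel in pairs around the cycle, and the product because the logarithmic derivatives a_{i−1} − a_{i+1} sum to zero. These are only the two most elementary constants of motion, not the full set entering the momentum map.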

Although systems of this form go back to Volterra’s work on population dynamics ([20]), they first appear (in an equivalent form) in the modern theory of integrable systems in the pioneering work of Kac and van Moerbeke ([10]), who constructed this system as a discretization of the Korteweg–de Vries equation and who discovered its integrability. Though those authors only considered the non-periodic case, we shall refer to (1) as the n-body KM system. In the second part of the paper we give a precise description of the fibers of the momentum map of the KM systems and we prove their algebraic complete integrability. We can summarize our results as follows:

Theorem. Denote by M, P, T, K the phase spaces of the (even) Mumford system, the hyperelliptic Prym system (odd or even), the (periodic) sl Toda lattice and the (periodic) KM system. Then there exists a commutative diagram of a.c.i. systems

    T -----> M
    ^        ^
    |        |
    K -----> P

where the horizontal maps are morphisms of integrable systems, and the vertical maps correspond to a Dirac type reduction. We stress that the vertical arrows are natural inclusion maps exhibiting for both spaces the subspace as fixed points varieties, but they are not Poisson maps. On the other hand, the horizontal arrows are injective maps that map every fiber of the momentum map on the left injectively into (but not onto) a fiber of the momentum map on the right. In order


to make these into morphisms of integrable systems, we construct a pencil of quadratic brackets making Toda → Mumford a Poisson map. For one bracket in this pencil the induced map for the KM systems is also Poisson, so it follows that the diagram is also valid in the Poisson category. A description of the general fiber of the momentum map of the KM systems as an affine part of a hyperelliptic Prym variety follows. Since the flows of the KM systems are restrictions of certain linear flows of the Toda lattices this enables us to show that the KM systems are a.c.i.; moreover the map leads to an explicit linearization of the KM systems. In order to determine precisely which divisors are missing from the affine varieties that appear in the momentum map we use Painlevé analysis, since it is difficult to read this off from the map Φ. The result is that n (= the number of KM particles) translates of the theta divisor are missing from these affine parts. We also show that each hyperelliptic Prym variety that we get is canonically isomorphic to the Jacobian of a related hyperelliptic Riemann surface, which can be computed explicitly, thereby providing an alternative, simpler description of the geometry of the KM systems.

The plan of this paper is as follows. In Sect. 2 we recall the definition of a Prym variety and specialize it to the case of a hyperelliptic Riemann surface with an involution (different from the hyperelliptic involution). We show that such a Prym variety is canonically isomorphic to a hyperelliptic Jacobian and we use this result to describe the affine parts that show up in Sect. 3, in which the hyperelliptic Prym systems are introduced and in which their algebraic complete integrability is proved. In Sect. 4 we establish the precise relation between the KM systems and the Toda lattices and we construct the injective morphism Φ. We use it to give a first description of the general fiber of the momentum map of the KM systems and we derive its algebraic complete integrability. A more precise description of these fibers is given in Sect. 5 by using Painlevé analysis. We finish the paper with a worked-out example (n = 5) in which we find a configuration of five genus two curves on an Abelian surface that looks very familiar (Fig. 2).

As a final note we remark that the (periodic) KM systems have received much less attention than the (periodic) Toda lattices, another family of discretizations of the Korteweg–de Vries equation, which besides admitting a Lie algebraic generalization, is also interesting from the point of view of representation theory. It is only recently that the interest in the KM systems has revived (see e.g. [6, 18], and the references therein). We hope that the present work clarifies the connections between these systems and the master systems (Mumford and Prym systems). It was pointed out to us by Vadim Kuznetsov that an embedding of the KM systems in the Heisenberg magnet was constructed by Volkov in [19].

2. Hyperelliptic Prym Varieties

In this section we recall the definition of a Prym variety and specialize it to the case of a hyperelliptic Riemann surface Γ, equipped with an involution σ. We construct an explicit isomorphism between the Prym variety of (Γ, σ) and the Jacobian of a related hyperelliptic Riemann surface. We use this isomorphism to give a precise description of the affine part of the Prym variety that will appear as the fiber of the momentum map of an integrable system related to the KM system.

2.1. The Prym variety of a hyperelliptic Riemann surface. Let Γ be a compact Riemann surface of genus G, equipped with an involution σ with p fixed points. The quotient


surface Γ_σ = Γ/σ has genus g′, with G = 2g′ + p/2 − 1, and the quotient map Γ → Γ_σ is a double covering map which is ramified at the p fixed points of σ. We assume that g′ > 0, i.e., σ is not the hyperelliptic involution on a hyperelliptic Riemann surface Γ. The group of divisors of degree 0 on Γ, denoted by Div^0(Γ), carries a natural equivalence relation, which is compatible with the group structure and which is defined by D ∼ 0 iff D is the divisor of zeros and poles of a meromorphic function on Γ. The quotient group Div^0(Γ)/∼ is a compact complex algebraic torus (Abelian variety) of dimension G, called the Jacobian of Γ and denoted by Jac(Γ) ([9], Ch. 2.7); its elements are denoted by [D], where D ∈ Div^0(Γ), and we write ⊗ for the group operation in Jac(Γ). Notice that σ induces an involution on Div^0(Γ) and hence on Jac(Γ); we use the same notation σ for these involutions.

Definition 2.1. The Prym variety of (Γ, σ) is the (G − g′)-dimensional subtorus of Jac(Γ) given by

\[
\mathrm{Prym}(\Gamma/\sigma) = \{[D - \sigma(D)] \mid D \in \mathrm{Div}^0(\Gamma)\}.
\]

We will be interested in the case in which Γ is the Riemann surface of a hyperelliptic curve Γ^(0) : y² = f(x), where f is a monic even polynomial of degree 2n without multiple roots (in particular 0 is not a root of f), so that the curve is non-singular. The Riemann surface Γ has genus G = n − 1 and is obtained from Γ^(0) by adding two points, which are denoted by ∞1 and ∞2. The two points of Γ^(0) for which x = 0 are denoted by O1 and O2. The 2n Weierstrass points of Γ (the points (x, y) of Γ^(0) for which y = 0) come in pairs (X, 0) and (−X, 0); fixing some order we denote them by Wi = (Xi, 0) and −Wi = (−Xi, 0), where i = 1, . . . , n. The Riemann surface Γ admits a group of order four of involutions, whose action on Γ^(0) and on the Weierstrass points (Xi, 0) and whose fixed point set are described in Table 1 for n odd, n = 2g + 1, and in Table 2 for n even, n = 2g + 2 (ı is the hyperelliptic involution).

Table 1. n odd

        (x, y)      O1   O2   ∞1   ∞2   Wi    −Wi   Fix
  ı     (x, −y)     O2   O1   ∞2   ∞1   Wi    −Wi   Wi, −Wi
  σ     (−x, y)     O1   O2   ∞2   ∞1   −Wi   Wi    O1, O2
  τ     (−x, −y)    O2   O1   ∞1   ∞2   −Wi   Wi    ∞1, ∞2

Table 2. n even

        (x, y)      O1   O2   ∞1   ∞2   Wi    −Wi   Fix
  ı     (x, −y)     O2   O1   ∞2   ∞1   Wi    −Wi   Wi, −Wi
  σ     (−x, −y)    O2   O1   ∞2   ∞1   −Wi   Wi    –
  τ     (−x, y)     O1   O2   ∞1   ∞2   −Wi   Wi    O1, O2, ∞1, ∞2
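The fixed-point counts in Tables 1 and 2 determine the genera of the two quotient surfaces through the relation G = 2g′ + p/2 − 1 recalled above. The following minimal sketch tabulates them; the function names and the dictionary `FIXED_POINTS` are illustrative choices and not notation from the paper.

```python
# Number of fixed points of sigma and tau, read off from Tables 1 and 2.
FIXED_POINTS = {
    "odd": {"sigma": 2, "tau": 2},    # Table 1: O1, O2 resp. infinity_1, infinity_2
    "even": {"sigma": 0, "tau": 4},   # Table 2: none resp. O1, O2, infinity_1, infinity_2
}


def quotient_genus(G, p):
    """Solve G = 2*g' + p/2 - 1 for g', the genus of the quotient surface."""
    four_g_prime = 2 * G + 2 - p
    if four_g_prime % 4 != 0:
        raise ValueError("no integer quotient genus for these data")
    return four_g_prime // 4


def quotient_genera(n):
    """Genera of Gamma/sigma and Gamma/tau and dim Prym(Gamma/sigma) for y^2 = f(x), deg f = 2n."""
    G = n - 1
    parity = "odd" if n % 2 else "even"
    g_sigma = quotient_genus(G, FIXED_POINTS[parity]["sigma"])
    g_tau = quotient_genus(G, FIXED_POINTS[parity]["tau"])
    return {"G": G, "g_sigma": g_sigma, "g_tau": g_tau, "dim_prym": G - g_sigma}
```

For n = 2g + 1 and n = 2g + 2 with the same g, the sketch returns the same value for `g_tau` and for `dim_prym`, in line with the fact that both have dimension g.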

For future use we also point out that for points P ∈ Γ which are not indicated on these tables, neither σ(P) nor τ(P) coincides with ı(P). The involutions σ and τ lead to two quotient Riemann surfaces Γ_σ := Γ/σ and Γ_τ := Γ/τ, and to two covering maps π_σ : Γ → Γ_σ and π_τ : Γ → Γ_τ. It follows from Tables 1 and 2 that the genus of Γ_τ equals g, while the genus g′ of Γ_σ is g or g + 1 depending on whether n is odd or even. Also, the dimension G − g′ of Prym(Γ/σ) equals g (whether n is odd or even). If the equation of Γ^(0) is written as y² = g(x²) then for n odd, Γ_σ^(0) has an equation v² = g(u) while Γ_τ^(0) has an equation v² = u g(u); for n even the roles of Γ_σ^(0) and Γ_τ^(0) are interchanged.

In order to describe Prym(Γ/σ), which we will call a hyperelliptic Prym variety, we need the following classical results about hyperelliptic Riemann surfaces and their Jacobians (for proofs, see [12], Ch. IIIa).

Lemma 2.2. Let D be a divisor of degree H > G on Γ, where G is the genus of Γ, and let P be any point on Γ. There exists an effective divisor E of degree G on Γ such that D ∼ E + (H − G)P.

Corollary 2.3. For any fixed divisor D_0 of degree G, Jac(Γ) is given by

\[
\mathrm{Jac}(\Gamma) = \Big\{ \Big[ \sum_{i=1}^{G} P_i - D_0 \Big] \;\Big|\; P_i \in \Gamma \Big\}.
\]

Lemma 2.4. Let D be a divisor on Γ of the form D = \sum_{i=1}^{H} (P_i − Q_i), where H ≤ G and P_i ≠ Q_j for all i and j. Then [D] = 0 if and only if H is even and D is of the form

\[
D = \sum_{i=1}^{H/2} \big( R_i + \imath(R_i) - S_i - \imath(S_i) \big),
\]

for some points R_i, S_i ∈ Γ.

2.2. Hyperelliptic Prym varieties as Jacobians. In the following theorem we show that for any n the Prym variety Prym(Γ/σ) associated with the hyperelliptic Riemann surface Γ is canonically isomorphic to the Jacobian of Γ_τ. This result was first proven by D. Mumford (see [13]) for the case in which π_σ : Γ → Γ_σ is unramified (n even) and by S. Dalaljan (see [7]) for the case in which π_σ : Γ → Γ_σ has two ramification points (n odd). Our proof, which is valid in both cases, is different and has the advantage of allowing us to describe explicitly the affine parts of the Prym varieties that we will encounter as affine parts of the corresponding Jacobians.

Theorem 2.5. Let π_τ^* denote the homomorphism Div^0(Γ_τ) → Div^0(Γ) which sends every point of Γ_τ to the divisor on Γ which consists of its two antecedents (under π_τ). The induced map

\[
\Pi : \mathrm{Jac}(\Gamma_\tau) \to \mathrm{Prym}(\Gamma/\sigma), \qquad [D] \mapsto [\pi_\tau^* D]
\]

is an isomorphism.

Proof. It is clear that the homomorphism Π is well defined: if [D] = 0 then D is the divisor of zeros and poles of a meromorphic function f on Γ_τ, hence π_τ^* D is the divisor of zeros and poles of f ∘ π_τ and [π_τ^* D] = 0. To see that the image of Π is contained in Prym(Γ/σ), just notice that π_τ^*(D) can be written as E + τ(E) for some E ∈ Div^0(Γ), so that

\[
[\pi_\tau^*(D)] = [E + \tau(E)] = [E - \sigma(E)] \in \mathrm{Prym}(\Gamma/\sigma).
\]


Since Jac(Γ_τ) and Prym(Γ/σ) both have dimension g it suffices to show that Π is injective. Suppose that [π_τ^* D] = 0 for some D ∈ Div^0(Γ_τ). We need to show that this implies [D] = 0. It follows from Corollary 2.3 that we may assume that D is of the form \sum_{i=1}^{g} p_i − g π_τ(∞1), where p_i ∈ Γ_τ. Then π_τ^* D = \sum_{i=1}^{g} (P_i + τ(P_i)) − 2g ∞1, where π_τ(P_i) = p_i. Since 2g ≤ G and ∞1 ≠ ı(∞1), Lemma 2.4 implies that P_i = ∞1, i.e., p_i = π_τ(∞1) for all i.

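2.3. The theta divisor. We introduce two divisors on Jac(Γ) by

\[
\Theta_1 = \Big\{ \Big[ \sum_{i=1}^{G-1} P_i - (G-1)\infty_1 \Big] \;\Big|\; P_i \in \Gamma \Big\}, \tag{2}
\]

\[
\Theta_2 = \Big\{ \Big[ \sum_{i=1}^{G-1} P_i + \infty_2 - G\infty_1 \Big] \;\Big|\; P_i \in \Gamma \Big\}. \tag{3}
\]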

These two divisors are both translates of the theta divisor and they differ by a shift over [∞2 − ∞1]. Since ∞2 = ı(∞1) they are tangent along their intersection locus, which is given by

\[
\Omega = \Big\{ \Big[ \sum_{i=1}^{G-2} P_i + \infty_2 - (G-1)\infty_1 \Big] \;\Big|\; P_i \in \Gamma \Big\}.
\]

Proposition 2.6. When n is odd, Prym(Γ/σ) ∩ (Θ1 ∪ Θ2) consists of three translates of the theta divisor of Jac(Γ_τ), intersecting as in the following figure.

Fig. 1. [Figure: the three translates of the theta divisor and their intersections]

Proof. We use the isomorphism Π to determine which divisors of Jac(Γ_τ) get mapped into Θ1 and Θ2. Since O1 and O2 are the only points of Γ on which ı and τ coincide, Lemma 2.4 implies that the only divisors D = \sum_{i=1}^{g} p_i − g π_τ(∞1) ∈ Div(Γ_τ) for which π_τ^* D contains, up to linear equivalence, ∞1 or ∞2 are those for which at least one of the p_i equals π_τ(∞1) or π_τ(∞2) or π_τ(O1) (= π_τ(O2)). Denoting O = π_τ(O1) we find that these points constitute the following three divisors on Jac(Γ_τ):

\[
\begin{aligned}
\theta_1 &= \Big\{ \Big[ \sum_{i=1}^{g-1} p_i - (g-1)\pi_\tau(\infty_1) \Big] \;\Big|\; p_i \in \Gamma_\tau \Big\},\\
\theta_2 &= \Big\{ \Big[ \sum_{i=1}^{g-1} p_i + \pi_\tau(\infty_2) - g\,\pi_\tau(\infty_1) \Big] \;\Big|\; p_i \in \Gamma_\tau \Big\},\\
\theta &= \Big\{ \Big[ \sum_{i=1}^{g-1} p_i + O - g\,\pi_\tau(\infty_1) \Big] \;\Big|\; p_i \in \Gamma_\tau \Big\}.
\end{aligned}
\]

They all pass through

\[
\omega = \Big\{ \Big[ \sum_{i=1}^{g-2} p_i + \pi_\tau(\infty_2) - (g-1)\pi_\tau(\infty_1) \Big] \;\Big|\; p_i \in \Gamma_\tau \Big\},
\]

which is the tangency locus of θ1 and θ2, and θ_i intersects θ in addition in

\[
\omega_i = \Big\{ \Big[ \sum_{j=1}^{g-2} p_j + \pi_\tau(\infty_i) + O - g\,\pi_\tau(\infty_1) \Big] \;\Big|\; p_j \in \Gamma_\tau \Big\},
\]

which is a translate of ω.

When n is even then clearly Prym(Γ/σ) is contained in Θ1, but the following result, similar to Prop. 2.6, holds for an appropriate translate of Prym(Γ/σ). The proof is left to the reader.

Proposition 2.7. When n is even and i ∈ {1, 2} then (Prym(Γ/σ) ⊗ [O1 − ∞i]) ∩ (Θ1 ∪ Θ2) consists of three translates of the theta divisor of Jac(Γ_τ), intersecting as in Fig. 1 (in which O should now be replaced by O2).

We will see in the next section how in both cases (n odd/even) the affine variety obtained by removing these three translates of the theta divisor from Prym(Γ/σ) can be described by simple, explicit equations.

3. The Hyperelliptic Prym Systems

In this section we introduce two families of integrable systems, whose members we call the odd and the even hyperelliptic Prym systems, where the adjective “odd/even” refers to the parity of n, as in the previous section, and where “hyperelliptic Prym” refers to the fact that the fibers of the momentum map of these systems are precisely the affine parts of the hyperelliptic Prym varieties that were considered in the previous section. These systems are intimately related to the even Mumford systems, constructed by the second author (see [15]), as even analogs of the (odd) Mumford systems, constructed by Mumford (see [12]).


3.1. The Mumford systems. We first recall the definition of the g-dimensional odd and even Mumford systems and we describe their geometry. Details, generalizations and applications can be found in [16]. The phase space of each of these systems is an affine space C^N, which is most naturally described as an affine space of triples (u(x), v(x), w(x)) of polynomials, often represented as Lax operators

\[
L(x) = \begin{pmatrix} v(x) & w(x) \\ u(x) & -v(x) \end{pmatrix},
\]

where u(x), v(x) and w(x) are subject to certain constraints. Denoting by M_g (resp. M′_g) the phase space of the g-th odd (resp. even) Mumford system these constraints are indicated in the following table:

Table 3.

  Phase space   dim      u(x)              v(x)      w(x)
  M_g           3g + 1   monic, deg = g    deg < g   monic, deg = g + 1
  M′_g          3g + 2   monic, deg = g    deg < g   monic, deg = g + 2

It is natural to use the coefficients of the three polynomials u(x), v(x), w(x) as coordinates on M_g and on M′_g: for M′_g for example, which will be most important for this paper, we write

\[
\begin{aligned}
u(x) &= x^{g} + u_{g-1}x^{g-1} + \cdots + u_0,\\
v(x) &= v_{g-1}x^{g-1} + \cdots + v_0,\\
w(x) &= x^{g+2} + w_{g+1}x^{g+1} + \cdots + w_0,
\end{aligned}
\]

or, in terms of the Lax operator L(x), as

\[
L(x) = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} x^{g+2}
+ \begin{pmatrix} 0 & w_{g+1} \\ 0 & 0 \end{pmatrix} x^{g+1}
+ \begin{pmatrix} 0 & w_{g} \\ 1 & 0 \end{pmatrix} x^{g}
+ \sum_{0 \le i < g} \begin{pmatrix} v_i & w_i \\ u_i & -v_i \end{pmatrix} x^{i}.
\]

[...] where su(A_0) denotes the strictly upper triangular part of A_0. The vector fields X_i are also Hamiltonian with respect to {·, ·}^x_T and their flows are linear on the general fiber of the momentum map K : T_n → C[x], which is defined by

\[
\det(x\,\mathrm{Id} - L(h)) = -h - \frac{1}{h} + K(x)/2;
\]

since the general fiber of K is an affine part of a hyperelliptic Jacobian, the n-body Toda lattice is an a.c.i. system (see [3] for details). For higher order brackets for the Toda lattices, see [5].

We now turn to the n-body, periodic, Kac–van Moerbeke system (n-body KM system, for short). Its phase space K_n is the subspace of T_n consisting of all Lax operators (10) with zeros on the diagonal. K_n is not a Poisson subspace of T_n. However, K_n is the fixed manifold of the involution of T_n defined by

\[
((a_1, a_2, \ldots, a_n), (b_1, b_2, \ldots, b_n)) \mapsto ((a_1, a_2, \ldots, a_n), (-b_1, -b_2, \ldots, -b_n)),
\]

which is a Poisson automorphism of (T_n, {·, ·}^x_T). Therefore, by Theorem 3.4, K_n inherits a Poisson bracket {·, ·}_K from {·, ·}^x_T, which is given by

\[
\{a_i, a_j\}_K = a_i a_j (\delta_{i,j+1} - \delta_{i+1,j}).
\]

It follows that the restriction of the momentum map K to K_n is a momentum map for the n-body KM system. Notice that I_j = 0 for even j, while for j odd the Lax equations (11) lead to Lax equations for the n-body KM system, merely by putting b_1 = · · · = b_n = 0. Taking j = 1 we find the vector field

\[
\dot a_i = a_i (a_{i-1} - a_{i+1}), \qquad i = 1, \ldots, n, \tag{13}
\]

which was already mentioned in the introduction. More generally, taking j odd we find a family of commuting Hamiltonian vector fields on K_n which are restrictions of the Toda vector fields, while for j even the Toda vector fields X_j are not tangent to K_n. In order to conclude that the KM systems are a.c.i. we need to describe the fibers of the momentum map K : K_n → C[x]. This will be done in the next paragraph.
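As a small illustration of the vector field (13) and of the bracket {a_i, a_j}_K = a_i a_j (δ_{i,j+1} − δ_{i+1,j}), the sketch below evaluates both in exact rational arithmetic. It checks, at a sample point, that the Hamiltonian vector field of the function a_1 + · · · + a_n with respect to this bracket reproduces (13), and that (13) preserves both this sum and the product a_1 a_2 · · · a_n. The function names are illustrative, and taking a_1 + · · · + a_n as Hamiltonian is an assumption made for this check (it agrees with the n = 5 example of Sect. 6); n ≥ 3 is assumed so that the two neighbours of an index are distinct.

```python
from fractions import Fraction


def km_field(a):
    """Right-hand side of (13): da_i/dt = a_i (a_{i-1} - a_{i+1}), indices mod n."""
    n = len(a)
    return [a[i] * (a[i - 1] - a[(i + 1) % n]) for i in range(n)]


def km_bracket(a, i, j):
    """{a_i, a_j}_K = a_i a_j (delta_{i,j+1} - delta_{i+1,j}), indices mod n (0-based)."""
    n = len(a)
    delta = int(i == (j + 1) % n) - int((i + 1) % n == j)
    return a[i] * a[j] * delta


def field_of_sum(a):
    """Hamiltonian vector field of a_1 + ... + a_n: i-th component is sum_j {a_i, a_j}_K."""
    n = len(a)
    return [sum(km_bracket(a, i, j) for j in range(n)) for i in range(n)]


def derivative_of_sum(a):
    """Time derivative of a_1 + ... + a_n along (13)."""
    return sum(km_field(a))


def log_derivative_of_product(a):
    """Time derivative of log(a_1 ... a_n) along (13); requires all a_i nonzero."""
    return sum(f / x for f, x in zip(km_field(a), a))
```

Both derivatives vanish identically because the sums telescope around the cycle; the code only confirms this at chosen rational points.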


4.2. Algebraic integrability of KM. We first define a map Φ : T_n → M′_{n−1} which maps the n-body Toda system to the even Mumford system. The following identity, valid for tridiagonal matrices, will be needed.

Lemma 4.1. Let M be a tridiagonal matrix,

\[
M = \begin{pmatrix}
\beta_1 & \alpha_1 & 0 & \cdots & \cdots & 0\\
\gamma_1 & \beta_2 & \alpha_2 & & & \vdots\\
0 & \gamma_2 & \beta_3 & \ddots & & \vdots\\
\vdots & & \ddots & \ddots & \ddots & 0\\
\vdots & & & \ddots & \beta_{n-1} & \alpha_{n-1}\\
0 & \cdots & \cdots & 0 & \gamma_{n-1} & \beta_n
\end{pmatrix},
\]

and denote by Δ_{i_1,...,i_k} the determinant of the minor of M obtained by removing from M the rows i_1, . . . , i_k and the columns i_1, . . . , i_k. Then:

\[
\Delta_1 \Delta_n - \Delta\, \Delta_{1,n} = \prod_{i=1}^{n-1} \alpha_i \gamma_i. \tag{14}
\]

Proof. For n = 2 this is obvious. For n > 2 one proceeds by induction, using the following formula for calculating the determinant Δ of M,

\[
\Delta = \beta_n \Delta_n - \alpha_{n-1}\gamma_{n-1} \Delta_{n-1,n}. \tag{15}
\]
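The identity (14) is easy to test on concrete matrices. The sketch below, written with exact rational arithmetic from the standard library, builds a tridiagonal matrix from its three diagonals and compares both sides of (14); the helper names `det`, `minor`, `tridiagonal` and `lemma_4_1_holds` are illustrative only, and the determinant of the empty minor (which occurs for n = 2) is taken to be 1.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination; the empty matrix has determinant 1."""
    m = [[Fraction(x) for x in row] for row in matrix]
    size = len(m)
    result = Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result          # a row swap flips the sign
        result *= m[c][c]
        for r in range(c + 1, size):
            factor = m[r][c] / m[c][c]
            for k in range(c, size):
                m[r][k] -= factor * m[c][k]
    return result


def minor(matrix, removed):
    """Delete the rows and columns whose 0-based indices lie in `removed`."""
    keep = [i for i in range(len(matrix)) if i not in removed]
    return [[matrix[i][j] for j in keep] for i in keep]


def tridiagonal(alpha, beta, gamma):
    """beta on the diagonal, alpha on the superdiagonal, gamma on the subdiagonal."""
    n = len(beta)
    m = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = Fraction(beta[i])
    for i in range(n - 1):
        m[i][i + 1] = Fraction(alpha[i])
        m[i + 1][i] = Fraction(gamma[i])
    return m


def lemma_4_1_holds(alpha, beta, gamma):
    """Compare Delta_1 Delta_n - Delta Delta_{1,n} with prod alpha_i gamma_i."""
    m = tridiagonal(alpha, beta, gamma)
    n = len(beta)
    lhs = (det(minor(m, {0})) * det(minor(m, {n - 1}))
           - det(m) * det(minor(m, {0, n - 1})))
    rhs = Fraction(1)
    for a, c in zip(alpha, gamma):
        rhs *= Fraction(a) * Fraction(c)
    return lhs == rhs
```

A numerical check of this kind does not replace the induction in the proof; it is only a safeguard against sign or index slips when the identity is used below.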

In the sequel we use the notation Δ_{i_1,...,i_k} from the above lemma taking as M the tridiagonal matrix obtained from x Id − L(h) in the obvious way, i.e., by removing the two terms that depend on h. In this notation the characteristic polynomial of L(h) is given by

\[
\det(x\,\mathrm{Id} - L(h)) = -h - h^{-1} + \Delta - a_n \Delta_{1,n}. \tag{16}
\]

Proposition 4.2. For any m = 1, . . . , n the map Φ_m : T_n → M′_{n−1} defined by

\[
\begin{aligned}
u(x) &= \Delta_m,\\
v(x) &= a_{m-1}\Delta_{m-1,m} - a_m \Delta_{m,m+1},\\
w(x) &= (x - b_m)^2 \Delta_m - 2(x - b_m)\big(a_{m-1}\Delta_{m-1,m} + a_m \Delta_{m,m+1}\big) + 4 a_m a_{m-1} \Delta_{m-1,m,m+1},
\end{aligned} \tag{17}
\]

maps each fiber of the momentum map K : T_n → C[x] into a fiber of the momentum map H : M′_{n−1} → C[x]. The restriction of Φ_m to K_n takes values in P_{(n−1)/2} when n is odd and in P′_{n/2−1} when n is even, mapping in both cases the fiber of the momentum map K : K_n → C[x] into the fiber of the momentum map H : P_{(n−1)/2} → C[x²] (or H : P′_{n/2−1} → C[x²]). As a consequence the general fiber of the momentum map of the KM systems is an affine part of a hyperelliptic Jacobian.
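The relation u(x)w(x) + v(x)² = p(x)² − 4 behind this proposition can be spot-checked numerically for m = n. The sketch below is an illustration under explicit assumptions: the tridiagonal part of x Id − L(h) is taken with diagonal entries x − b_i, superdiagonal entries −a_i and subdiagonal entries −1 (so that α_i γ_i = a_i in Lemma 4.1), the constraint a_1 · · · a_n = 1 is imposed, n ≥ 3, and the middle term of w is written as −2(x − b_n)(a_{n−1}Δ_{n−1,n} + a_nΔ_{1,n}), the sign that is compatible with p(x) = (x − b_n)Δ_n − a_nΔ_{1,n} − a_{n−1}Δ_{n−1,n}. All function names are hypothetical.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination; the empty matrix has determinant 1."""
    m = [[Fraction(x) for x in row] for row in matrix]
    size = len(m)
    result = Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, size):
            factor = m[r][c] / m[c][c]
            for k in range(c, size):
                m[r][k] -= factor * m[c][k]
    return result


def minor(matrix, removed):
    """Delete the rows and columns whose 0-based indices lie in `removed`."""
    keep = [i for i in range(len(matrix)) if i not in removed]
    return [[matrix[i][j] for j in keep] for i in keep]


def toda_tridiagonal(a, b, x):
    """Tridiagonal part of x*Id - L(h): the two h-dependent corner entries are dropped."""
    n = len(a)
    m = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = x - b[i]
    for i in range(n - 1):
        m[i][i + 1] = -a[i]
        m[i + 1][i] = Fraction(-1)
    return m


def mumford_triple(a, b, x):
    """Values at x of (u, v, w, p) for m = n; assumes n >= 3 and a_1 ... a_n = 1."""
    n = len(a)
    m = toda_tridiagonal(a, b, x)
    last = n - 1

    def delta(*removed):
        return det(minor(m, set(removed)))

    u = delta(last)
    A = a[n - 2] * delta(n - 2, last)      # a_{n-1} Delta_{n-1,n}
    B = a[last] * delta(0, last)           # a_n Delta_{1,n}
    X = x - b[last]
    v = A - B
    w = X * X * u - 2 * X * (A + B) + 4 * a[last] * a[n - 2] * delta(n - 2, last, 0)
    p = det(m) - a[last] * delta(0, last)  # p = Delta - a_n Delta_{1,n}, cf. (16)
    return u, v, w, p


def momentum_relation_holds(a, b, x):
    u, v, w, p = mumford_triple(a, b, x)
    return u * w + v * v == p * p - 4
```

Since both sides are polynomials in x of bounded degree, agreement at sufficiently many rational values of x for fixed (a, b) would already force the polynomial identity at that point of phase space.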


Proof. Since the momentum map is equivariant with respect to the Z/n action on T_n it suffices to prove the proposition for m = n. It is easy to see that the triple (u, v, w), defined by (17), satisfies the constraints u, w monic, deg w = deg u + 2 = n + 1 and deg v < n − 1, so that Φ_n takes values in M′_{n−1}. Moreover, taking β_1 = · · · = β_n = x in (15) implies that when all entries on the diagonal of L(h) are zero then Δ_{i_1,...,i_p} has the same parity as n − p, so that the triples (u, v, w) which correspond to points in K_n have the additional property that v has the same parity as n while u and w have the opposite parity. Therefore the restriction of Φ_n to K_n takes values in P_{(n−1)/2} when n is odd and in P′_{n/2−1} when n is even.

For p(x) a monic polynomial of degree n, let L(h) ∈ K^{−1}(2p(x)), i.e.,

\[
p(x) = (x - b_n)\Delta_n - a_n \Delta_{1,n} - a_{n-1}\Delta_{n-1,n}. \tag{18}
\]

Proving that Φ_n(L(h)) belongs to H^{−1}(p²(x) − 4) amounts to showing that u(x)w(x) + v²(x) = p²(x) − 4, which follows from a direct computation, using (14). The commutativity of the following diagram follows:

\[
\begin{array}{ccc}
T_n & \xrightarrow{\ \Phi_n\ } & M'_{n-1}\\
{\scriptstyle K}\downarrow & & \downarrow{\scriptstyle H}\\
\mathbf{C}[x] & \xrightarrow{\ \varphi\ } & \mathbf{C}[x]
\end{array}
\]

where φ is defined by φ(q) = (q/2)² − 4, for q ∈ C[x].

To show that the map Φ_n is injective let (u(x), v(x), w(x)) ∈ Φ_n(T_n). We show that the matrix L(h) ∈ T_n which is mapped to this point is unique. First observe that the monic polynomial p(x) = Δ − a_nΔ_{1,n} can be recovered from u(x)w(x) + v(x)² = p(x)² − 4. We can then determine b_n from the following two formulas:

\[
p(x) = x^{n} - \Big(\sum_{i=1}^{n} b_i\Big) x^{n-1} + \cdots, \qquad
u(x) = \Delta_n = x^{n-1} - \Big(\sum_{i=1}^{n-1} b_i\Big) x^{n-2} + \cdots.
\]

Next, the second relation in (17) and (18) lead to the system:

\[
\begin{aligned}
a_{n-1}\Delta_{n-1,n} - a_n \Delta_{1,n} &= v(x),\\
a_{n-1}\Delta_{n-1,n} + a_n \Delta_{1,n} &= (x - b_n)u(x) - p(x).
\end{aligned}
\]

This linear system completely determines the products a_nΔ_{1,n} and a_{n−1}Δ_{n−1,n}. Because the determinants of the principal minors of x Id − L(h) are monic polynomials, this means that we know a_n, a_{n−1}, Δ_{1,n} and Δ_{n−1,n} separately. From Δ = p(x) + a_nΔ_{1,n} we also obtain Δ. We have now shown how b_n, a_n, Δ, Δ_n and Δ_{n−1,n} are determined. We proceed by induction, showing how to determine b_{n−k−1}, a_{n−k−1}, Δ_{n−k−1,...,n} once we know b_{n−i}, a_{n−i} and Δ_{n−i,...,n} for i = 0, . . . , k. We use (15) to obtain the recursive relation:

\[
\Delta_{n-k+1,\ldots,n} = (x - b_{n-k})\Delta_{n-k,\ldots,n} - a_{n-k-1}\Delta_{n-k-1,\ldots,n}.
\]


This determines the product a_{n−k−1}Δ_{n−k−1,...,n}, but also a_{n−k−1} and Δ_{n−k−1,...,n} separately, again because Δ_{n−k−1,...,n} is monic. Now from Δ_{n−k−1,...,n} and Δ_{n−k,...,n} we know, as above, the sums \sum_{i=1}^{n−k−2} b_i and \sum_{i=1}^{n−k−1} b_i. Hence, b_{n−k−1} is determined.

We saw in Prop. 3.3 that the fibers of the momentum map of the even Prym system are reducible (two isomorphic pieces), so there remains the question whether the same is true for the n-body KM system for even n. To check that this is so, note that the highest degree coefficient of the characteristic polynomial of L(h) gives, for n even, the first integral I = a_1 a_3 a_5 · · · a_{n−1} + a_2 a_4 a_6 · · · a_n. Since a_1 a_2 · · · a_n = 1, for generic values of I, the variety defined by

\[
a_1 a_3 a_5 \cdots a_{n-1} = \text{constant}, \qquad a_2 a_4 a_6 \cdots a_n = \text{constant},
\]

is reducible, and the claim follows. Note however that both a_1 a_3 a_5 · · · a_{n−1} and a_2 a_4 a_6 · · · a_n are first integrals themselves, so we can construct a momentum map using these integrals (instead of their sum and product) and then the general fiber is irreducible.

The map Φ_m : T_n → M′_{n−1} not only maps fibers to fibers of the momentum maps, but it maps the whole hierarchy of Toda flows to the Mumford flows defined by (7). To see this, we construct a family of quadratic Poisson brackets {·, ·}^ϕ_{M,q} on M′_{n−1} which make this map Poisson. First observe that there exist unique polynomials p(x) and r(x), with p(x) monic of degree n and r(x) of degree less than n, such that

\[
u(x)w(x) + v(x)^2 = p(x)^2 + r(x). \tag{19}
\]

The coefficients of p(x) and r(x) are regular functions of u_i, v_i and w_i. Hence, we can define a skew-symmetric biderivation on the space of regular functions of M′_{n−1} by setting, for any ϕ ∈ C[x] of degree at most 1,

\[
\begin{aligned}
\{u(x), u(x')\}^{\phi}_{M,q} &= \{v(x), v(x')\}^{\phi}_{M,q} = 0,\\
\{u(x), v(x')\}^{\phi}_{M,q} &= \{u(x), v(x')\}^{p\phi}_{M} + \alpha^{\phi}(x + x')\,u(x)u(x'),\\
\{u(x), w(x')\}^{\phi}_{M,q} &= \{u(x), w(x')\}^{p\phi}_{M} - 2\alpha^{\phi}(x + x')\,u(x)v(x'),\\
\{v(x), w(x')\}^{\phi}_{M,q} &= \{v(x), w(x')\}^{p\phi}_{M} + \alpha^{\phi}(x + x')\,u(x)w(x'),\\
\{w(x), w(x')\}^{\phi}_{M,q} &= \{w(x), w(x')\}^{p\phi}_{M} + 2\alpha^{\phi}(x + x')\big(w(x)v(x') - w(x')v(x)\big),
\end{aligned}
\]

where α^ϕ(x) = ϕ(α(2x)/2). Notice that the polynomial pϕ, used in the definition of the bracket, depends on the phase variables.

Proposition 4.3. Let ϕ be a polynomial of degree at most 1. Then

(i) {·, ·}^ϕ_{M,q} is a Poisson bracket on M′_{n−1} and the maps

\[
\Phi_m : (T_n, \{\cdot,\cdot\}^{\phi}_{T}) \to (M'_{n-1}, \{\cdot,\cdot\}^{\phi}_{M,q})
\]

are Poisson and map the Toda flows to the Mumford flows;


(ii) For ϕ odd, the bracket {·, ·}^ϕ_{M,q} induces a Poisson bracket {·, ·}_{P,q} on P_{(n−1)/2} (resp. on P′_{n/2−1}), and the maps

\[
\Phi_m : (K_n, \{\cdot,\cdot\}_K) \to (P_{(n-1)/2}, \{\cdot,\cdot\}_{P,q}), \qquad
\Phi_m : (K_n, \{\cdot,\cdot\}_K) \to (P'_{n/2-1}, \{\cdot,\cdot\}_{P,q})
\]

are Poisson and map the flows of the n-body KM system to the flows of the hyperelliptic Prym systems.

Proof. We take the bracket of both sides of (19) with u(x) to obtain

\[
2p(y)\phi(y)\,\frac{u(x)v(y) - u(y)v(x)}{x - y} = 2p(y)\,\{u(x), p(y)\}^{\phi}_{M,q} + \{u(x), r(y)\}^{\phi}_{M,q}.
\]

It follows that {u(x), r(y)}^ϕ_{M,q} is divisible by p(y). Since {u(x), r(y)}^ϕ_{M,q} is of degree less than n in y and since p(y) is monic of degree n we must have {u(x), r(y)}^ϕ_{M,q} = 0 and

\[
\{u(x), p(y)\}^{\phi}_{M,q} = \phi(y)\,\frac{u(x)v(y) - u(y)v(x)}{x - y}.
\]

Similarly, we find {v(x), r(y)}^ϕ_{M,q} = {w(x), r(y)}^ϕ_{M,q} = 0 and also that:

\[
\begin{aligned}
\{v(x), p(y)\}_{M,q} &= \frac{\phi(y)}{2}\left( \frac{w(x)u(y) - u(x)w(y)}{x - y} - \alpha(x + y)\,u(x)u(y) \right),\\
\{w(x), p(y)\}_{M,q} &= \phi(y)\left( \frac{v(x)w(y) - w(x)v(y)}{x - y} + \alpha(x + y)\,v(x)u(y) \right).
\end{aligned}
\]

These expressions also allow one to compute the brackets of u(x), v(x), w(x) and p(x) with α(y), and the check of the Jacobi identity follows easily from it. Therefore, {·, ·}^ϕ_{M,q} is a Poisson bracket for which the coefficients of r(x) are Casimirs. If we compare the expressions above for the brackets with p(y) with expressions (7) for the Mumford vector fields, we conclude that they are Hamiltonian with respect to {·, ·}^1_{M,q} with Hamiltonian function K.

Checking that Φ_m is Poisson can be done by a straightforward (but rather long) computation using the following expressions for the derivatives of Δ_{i_1,...,i_k}:

\[
\frac{\partial \Delta_{i_1,\ldots,i_k}}{\partial a_i} =
\begin{cases}
-\Delta_{i,i+1,i_1,\ldots,i_k}, & i,\, i+1 \notin \{i_1, \ldots, i_k\},\\
0 & \text{otherwise},
\end{cases}
\qquad
\frac{\partial \Delta_{i_1,\ldots,i_k}}{\partial b_i} =
\begin{cases}
-\Delta_{i,i_1,\ldots,i_k}, & i \notin \{i_1, \ldots, i_k\},\\
0 & \text{otherwise}.
\end{cases}
\]

For the second statement, one easily checks that when ϕ is odd then the involution is a Poisson involution, so that there is an induced bracket on P_{(n−1)/2} or on P′_{n/2−1}. Explicit formulas for this bracket are computed as in the proof of Proposition 3.5. The other statements in (ii) then follow from (i).

It is easy to check that the Poisson brackets {·, ·}^ϕ_{M,q} and {·, ·}^ψ_M on M′_{n−1} are compatible, when ϕ and ψ have degree at most 1. This is however not true when ψ is of higher degree.


5. Painlevé Analysis

The results in the previous section show that the general fiber of the momentum map of the KM systems is an affine part of a hyperelliptic Prym variety (or two copies of it), which can also be described as a hyperelliptic Jacobian. In order to describe precisely which affine part, we determine the divisor which needs to be adjoined to each affine part in order to complete it into an Abelian variety. Since it is difficult to do this by using the maps Φ_m we do this by performing Painlevé analysis of the KM systems. The method that we use is based on the bijective correspondence between the principal balances of an integrable vector field (Laurent solutions depending on the maximal number of free parameters) and the irreducible components of the divisor which is missing from the fibers of the momentum map (see [1]). We look for all Laurent solutions

\[
a_i(t) = \frac{1}{t^{r}} \sum_{j=0}^{\infty} a_i^{(j)} t^{j}, \tag{20}
\]

to the vector field (13) of the n-body KM system. The following lemma shows that any such Laurent solution of (13) can have at most simple poles. We may suppose that r in (20) is maximal, i.e., a_i^{(0)} ≠ 0 for at least one i, and we call r the order of the Laurent solution. The order of pole (or zero) of a_i(t) is denoted by r_i, so r = max_i r_i.

Lemma 5.1. Let the Laurent series a_i(t), i = 1, . . . , n, given by (20) be a solution to the vector field (13) of the n-body KM system. If at least one of the a_i has a pole (for t = 0) then it is a Laurent solution of order 1. Moreover the orders of the pole (or zero) of each a_i(t) satisfy

\[
r_i = a_{i+1}^{(0)} - a_{i-1}^{(0)}. \tag{21}
\]

Proof. For s ∈ N we find from (20):

\[
\operatorname*{Res}_{t=0} \frac{\dot a_i(t)}{a_i(t)}\, t^{s} =
\begin{cases}
-r_i, & s = 0,\\
0, & s > 0.
\end{cases}
\]

On the other hand, if we use (13) then we find

\[
\operatorname*{Res}_{t=0} \frac{\dot a_i(t)}{a_i(t)}\, t^{s}
= \operatorname*{Res}_{t=0}\, (a_{i-1}(t) - a_{i+1}(t))\, t^{s}
= a_{i-1}^{(r-s-1)} - a_{i+1}^{(r-s-1)}.
\]

We conclude that

\[
a_{i-1}^{(k)} - a_{i+1}^{(k)} =
\begin{cases}
-r_i, & k = r - 1,\\
0, & 0 \le k \le r - 2.
\end{cases} \tag{22}
\]

Now substituting (20) into (13) and comparing the coefficient of 1/t^{r+1} the following equation (the indicial equation) is obtained:

\[
-r\, a_i^{(0)} = a_i^{(0)} \big(a_{i-1}^{(0)} - a_{i+1}^{(0)}\big), \qquad i = 1, \ldots, n. \tag{23}
\]

If a_i has a pole of order r > 0 then a_i^{(0)} ≠ 0 and (23) implies a_{i−1}^{(0)} − a_{i+1}^{(0)} = −r. Comparing with (22) we see that we must have r = 1 and that (21) holds.


Notice that in view of the periodicity of the indices (a_{i+n} = a_i) the linear system

\[
1 = a_{i+1}^{(0)} - a_{i-1}^{(0)}, \qquad i = 1, \ldots, n,
\]

has no solutions, so that at least one of the a_i^{(0)} vanishes. If, say, a_0^{(0)} = a_{k+1}^{(0)} = 0 while a_i^{(0)} ≠ 0 for i = 1, . . . , k for some k in the range 1, . . . , n − 1 (this includes the case of a single i for which a_i^{(0)} = 0) then the indicial equation specializes to

\[
a_2^{(0)} = 1, \qquad a_{i+1}^{(0)} - a_{i-1}^{(0)} = 1 \quad (i = 2, \ldots, k-1), \qquad a_{k-1}^{(0)} = -1,
\]

which has no solution for k odd, and which has a unique solution (a_1^{(0)}, . . . , a_k^{(0)}) = (−l, 1, 1 − l, 2, . . . , −1, l) for even k, k = 2l. The other variables a_{k+1}^{(0)}, . . . , a_n^{(0)} can either be all zero, or they can constitute one or several other solutions of this type (with varying k = 2l), separated by zeroes. Using periodicity the other solutions to the indicial equation are obtained by cyclic permutation.

Thus we are led to the following combinatorial description of the solutions to the indicial equation of the n-body KM system. For a subset A of Z/n, and for p ∈ Z/n let us denote by A(p) ⊂ Z/n the largest subset of A that contains p and that consists of consecutive elements (with the understanding that A(p) = ∅ when p ∉ A). If we define

\[
F_n = \{A \subset \mathbf{Z}/n \mid p \in A \Rightarrow \#A(p) \text{ is even}\},
\]

then we see that the solutions to the indicial equation are in one to one correspondence with the elements of F_n. In the sequel we freely use this bijection. For A ∈ F_n we call the integer #A/2 its order, denoted by ord A.

For each solution to the indicial equation (i.e., for each A ∈ F_n) we compute the eigenvalues of the Kowalevski matrix M, whose entries are given by

\[
M_{ij} = \frac{\partial F_i}{\partial a_j}\big(a_1^{(0)}, \ldots, a_n^{(0)}\big) + \delta_{ij},
\]

where F_i = a_i(a_{i−1} − a_{i+1}), the i-th component of (13). The number of non-negative integer eigenvalues of this matrix is precisely the number of free parameters of the family of Laurent solutions whose leading term is given by (a_1^{(0)}, . . . , a_n^{(0)}) (see [1]), hence we can deduce from it which strata of the Abelian variety, whose affine part appears as a fiber of the momentum map, are parameterized by it.

Proposition 5.2. For a solution of the indicial equation corresponding to A ∈ F_n the Kowalevski matrix M has n − ord A non-negative integer eigenvalues.

Proof. In view of (21) the entries of M can be written in the form

\[
M_{ij} =
\begin{cases}
(1 - r_i)\,\delta_{i,j}, & \text{if } a_i^{(0)} = 0,\\
a_i^{(0)} (\delta_{i,j+1} - \delta_{i,j-1}), & \text{if } a_i^{(0)} \neq 0.
\end{cases}
\]
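The count in Proposition 5.2 can be confirmed on small cases by building the Kowalevski matrix directly from its definition M_ij = ∂F_i/∂a_j + δ_ij. The sketch below assembles the leading terms (−l, 1, 1 − l, 2, . . . , −1, l) for a given list of blocks, checks the indicial equation (23) with r = 1, and counts non-negative integer eigenvalues with multiplicity. It assumes n ≥ 3; the characteristic polynomial is computed by the Faddeev–LeVerrier recursion and integer roots are detected by synthetic division, both of which are choices made for this illustration rather than the method of the paper, and all function names are hypothetical.

```python
from fractions import Fraction


def leading_terms(n, blocks):
    """Leading coefficients a^(0); blocks = [(start, l)] with 1-based start and length 2l."""
    a = [0] * n
    for start, l in blocks:
        for j in range(1, l + 1):
            a[(start - 1 + 2 * (j - 1)) % n] = j - 1 - l   # -l, 1-l, ..., -1
            a[(start - 1 + 2 * j - 1) % n] = j             # 1, 2, ..., l
    return a


def solves_indicial(a):
    """Indicial equation (23) with r = 1: -a_i = a_i (a_{i-1} - a_{i+1})."""
    n = len(a)
    return all(-a[i] == a[i] * (a[i - 1] - a[(i + 1) % n]) for i in range(n))


def kowalevski(a):
    """M_ij = dF_i/da_j + delta_ij at a, where F_i = a_i (a_{i-1} - a_{i+1})."""
    n = len(a)
    m = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        m[i][i] += a[i - 1] - a[(i + 1) % n] + 1
        m[i][(i - 1) % n] += a[i]
        m[i][(i + 1) % n] -= a[i]
    return m


def char_poly(m):
    """Coefficients of det(lambda*Id - m), highest degree first (Faddeev-LeVerrier)."""
    n = len(m)
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    coeffs = [Fraction(1)]
    b = identity
    for k in range(1, n + 1):
        mb = [[sum(m[i][t] * b[t][j] for t in range(n)) for j in range(n)]
              for i in range(n)]
        c = -sum(mb[i][i] for i in range(n)) / k
        coeffs.append(c)
        b = [[mb[i][j] + c * identity[i][j] for j in range(n)] for i in range(n)]
    return coeffs


def multiplicity(coeffs, root):
    """Multiplicity of `root` as a zero of the polynomial, by repeated synthetic division."""
    coeffs = list(coeffs)
    count = 0
    while len(coeffs) > 1:
        quotient, acc = [], Fraction(0)
        for c in coeffs:
            acc = acc * root + c
            quotient.append(acc)
        if quotient.pop() != 0:       # last entry is the remainder
            break
        coeffs = quotient
        count += 1
    return count


def count_nonnegative_integer_eigenvalues(a):
    coeffs = char_poly(kowalevski(a))
    bound = 1 + int(max(abs(c) for c in coeffs))   # Cauchy bound on the roots
    return sum(multiplicity(coeffs, r) for r in range(bound + 1))
```

For instance, with n = 5 the single block of length 2 gives the leading terms (−1, 1, 0, 0, 0), and the count returned is n − 1, the number of free parameters of a principal balance.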


Note also that, by using the Z/n action, we can assume that 1 ∈ A, n ∉ A, and that A is a disjoint union of A(p_1), . . . , A(p_s), with p_1 < p_2 < · · · < p_s. Let l_i = ord A(p_i). Then M has the following form:

\[
M = \begin{pmatrix}
C_1 & E_1 & & & & & -l_1\\
 & D_1 & & & & & \\
 & & C_2 & E_2 & & & \\
 & & & D_2 & & & \\
 & & & & \ddots & & \\
 & 0 & & & & C_s & E_s\\
 & & & & & & D_s
\end{pmatrix}.
\]

On the upper right corner the matrix has entry −l_1, and the blocks C_i, D_i and E_i, i = 1, . . . , s, are matrices as follows:

• C_i is a tridiagonal matrix of size 2l_i of the form:

\[
C_i = \begin{pmatrix}
0 & l_i & & & & & & \\
1 & 0 & -1 & & & & & \\
 & 1 - l_i & 0 & l_i - 1 & & & & \\
 & & 2 & 0 & -2 & & & \\
 & & & \ddots & \ddots & \ddots & & \\
 & & & & l_i - 1 & 0 & 1 - l_i & \\
 & & & & & -1 & 0 & 1\\
 & & & & & & l_i & 0
\end{pmatrix};
\]

• D_i is a diagonal matrix of the form D_i = diag(1 + l_i, 1, . . . , 1, 1 + l_{i+1}), with the convention that if D_i is 1 × 1 then its only entry is 1 + l_i + l_{i+1};

• E_i is a matrix with only one non-zero entry −l_i in the lower left corner.

It is clear that the set of eigenvalues of M is the union of the set of eigenvalues of the C_i's and D_i's. Now we have:

Lemma 5.3. The eigenvalues of the matrix C_i are {±1, ±2, . . . , ±l_i}.

Assuming the lemma to hold we find that the number of negative eigenvalues of M is equal to \sum_{i=1}^{s} l_i = \sum_{i=1}^{s} ord A(p_i) = ord A, so the proposition follows. So we are left with the proof of the lemma. We write l for l_i and we denote by e_j the j-th vector of the standard basis of C^{2l}. In the basis e_1, e_3, . . . , e_{2l−3}, e_{2l−1}, e_{2l}, e_{2l−2}, . . . , e_4, e_2 the matrix C_i takes the form

\[
\begin{pmatrix} 0 & A \\ A & 0 \end{pmatrix},
\]

where A is the transpose of the matrix

\[
I = \begin{pmatrix}
0 & \cdots & \cdots & 0 & 1\\
\vdots & & 0 & 2 & -1\\
\vdots & & 3 & -2 & 0\\
0 & & & & \vdots\\
l & 1 - l & 0 & \cdots & 0
\end{pmatrix},
\]

that is, row k of I has the entry k in column l − k + 1 and the entry 1 − k in column l − k + 2. We show that this matrix has eigenvalues 1, −2, 3, . . . , (−1)^{l−1} l. Then the result follows because the eigenvalues of C_i are ± the eigenvalues of A. For j = 1, . . . , l, let f_j = [1^{j−1}, 2^{j−1}, . . . , l^{j−1}]^T and let V_j denote the span of f_1, . . . , f_j. For v = [v_1, . . . , v_l]^T ∈ C^l we have that v ∈ V_j if and only if there exists a polynomial P of degree less than j such that v_k = P(k) for k = 1, . . . , l. Since the k-th component of I f_j is given by

\[
k(l - k + 1)^{j-1} + (1 - k)(l - k + 2)^{j-1} = (-1)^{j-1} j\, k^{j-1} \Big(1 + O\Big(\frac{1}{k}\Big)\Big),
\]

we have that I f_j ∈ V_j, more precisely I f_j ∈ (−1)^{j−1} j f_j + V_{j−1}. This means that in terms of the basis f_j the matrix I is upper triangular, with the integers 1, −2, 3, . . . , (−1)^{l−1} l on the diagonal.

By the proposition above we can have a Laurent solution depending on n − 1 free parameters (a principal balance) only for the n choices of A given by (a_1^{(0)}, . . . , a_n^{(0)}) = (−1, 1, 0, . . . , 0) and their cyclic permutations. Let us check that these lead indeed to asymptotic expansions which formally solve (13). By §2 in [1], these solutions are actually convergent and so they define convergent Laurent solutions.

It suffices to do this for the solution (a_1^{(0)}, . . . , a_n^{(0)}) = (−1, 1, 0, . . . , 0) of the indicial equation. By (21) we know that the orders of the singularities of this solution are (r_1, . . . , r_n) = (1, 1, −1, 0, . . . , 0, −1) so we have the following ansatz for the formal expansions:

\[
\begin{aligned}
a_1(t) &= -\frac{1}{t} + \alpha_1 + \beta_1 t + O(t^2),\\
a_2(t) &= \frac{1}{t} + \alpha_2 + \beta_2 t + O(t^2),\\
a_3(t) &= \beta_3 t + O(t^2),\\
a_j(t) &= \alpha_j + \beta_j t + O(t^2), \qquad 4 \le j \le n - 1,\\
a_n(t) &= \beta_n t + O(t^2).
\end{aligned}
\]


If we substitute these expansions in Eq. (13) defining the n-body KM system we obtain the consistency equations:

\[
\begin{aligned}
\alpha_1 - \alpha_2 &= 0,\\
2\beta_1 - \beta_2 &= -\alpha_1\alpha_2 - \beta_n,\\
\beta_1 - 2\beta_2 &= -\alpha_1\alpha_2 + \beta_3,\\
\beta_j &= \alpha_j(\alpha_{j-1} - \alpha_{j+1}), \qquad 4 \le j \le n - 1.
\end{aligned}
\]

They give exactly the n − 1 free parameters α_1, α_4, . . . , α_{n−1}, β_3, β_n. The coefficients a^{(k)} = (a_1^{(k)}, . . . , a_n^{(k)}) for k > 2 are then completely determined since they satisfy an equation of the form

\[
(M - k\,\mathrm{Id}) \cdot a^{(k)} = \text{some polynomial in the } a_i^{(j)} \text{ with } j < k,
\]

and the eigenvalues of the Kowalevski matrix M are −1, 1, 2, by the proof above. This leads to the following result.

Theorem 5.4. When n is odd the general fiber of the momentum map of the n-body KM system is an affine part of a hyperelliptic Prym variety, obtained by removing n translates of its theta divisor. When n is even the general fiber consists of two isomorphic components which admit the same description as in the odd case. In both cases the Prym variety admits an alternative description as a hyperelliptic Jacobian.

6. Example: n = 5

In this section we study the 5-body KM system in more detail. Its phase space is four-dimensional and is given by K_5 = {(a_1, a_2, a_3, a_4, a_5) | a_1 a_2 a_3 a_4 a_5 = 1}, with Lax operator

\[
L = \begin{pmatrix}
0 & a_1 & 0 & 0 & h^{-1}\\
1 & 0 & a_2 & 0 & 0\\
0 & 1 & 0 & a_3 & 0\\
0 & 0 & 1 & 0 & a_4\\
h a_5 & 0 & 0 & 1 & 0
\end{pmatrix}.
\]

The spectral curve det(x Id − L) = 0 is explicitly given by

\[
h + \frac{1}{h} = x^5 - Kx^3 + Lx,
\]

where

\[
K = a_1 + a_2 + a_3 + a_4 + a_5, \qquad L = a_1 a_3 + a_2 a_4 + a_3 a_5 + a_4 a_1 + a_5 a_2.
\]

These functions are in involution with respect to the quadratic Poisson structure, given by {a_i, a_j} = (δ_{i,j+1} − δ_{i+1,j}) a_i a_j. It follows from the previous section that for generic k, l the affine surface P_{kl} defined by K = k, L = l is an affine part of the Jacobian


of the genus two Riemann surface τ minus five translates of its theta divisor, which is (0) isomorphic to τ . As we have seen, an equation for τ is given by τ(0) : y 2 = (u3 − ku2 + lu)2 − 4u.

(24)
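The spectral curve of the 5-body Lax operator can be spot-checked numerically. The sketch below is an illustration, not part of the original argument: it evaluates $\det(x\,\mathrm{Id} - L)$ in exact rational arithmetic at one arbitrarily chosen point with $a_1a_2a_3a_4a_5 = 1$ and compares it with $x^5 - Kx^3 + Lx - h - 1/h$. The helper names (`det`, `lax_matrix`, `char_poly_at`, `curve_rhs`) and the sample values are assumptions made for this example.

```python
from fractions import Fraction as F


def det(m):
    """Determinant by cofactor expansion along the first row (fine for 5x5)."""
    if len(m) == 1:
        return m[0][0]
    total = F(0)
    for j, entry in enumerate(m[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def lax_matrix(a, h):
    """Lax operator of the 5-body KM system with spectral parameter h."""
    a1, a2, a3, a4, a5 = a
    return [
        [0, a1, 0, 0, 1 / h],
        [1, 0, a2, 0, 0],
        [0, 1, 0, a3, 0],
        [0, 0, 1, 0, a4],
        [h * a5, 0, 0, 1, 0],
    ]


def char_poly_at(a, h, x):
    """Value of det(x*Id - L) at the given point."""
    lax = lax_matrix(a, h)
    n = len(lax)
    return det([[(x if i == j else 0) - lax[i][j] for j in range(n)]
                for i in range(n)])


def curve_rhs(a, x):
    """x^5 - K x^3 + L x with K = sum a_i and L = sum a_i a_{i+2} (cyclic)."""
    K = sum(a)
    L = sum(a[i] * a[(i + 2) % 5] for i in range(5))
    return x ** 5 - K * x ** 3 + L * x


# Arbitrary sample point on the phase space: the product of the a_i equals 1.
a = [F(2), F(3), F(1, 2), F(1, 3), F(1)]
h, x = F(5), F(7)

lhs = char_poly_at(a, h, x)
rhs = curve_rhs(a, x) - h - 1 / h
print(lhs == rhs)
```

The check uses the constraint $a_1 \cdots a_5 = 1$ only through the two terms $h$ and $1/h$, which come from the two cyclic permutations in the determinant expansion.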

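The two commuting Hamiltonian vector fields $X_K$ and $X_L$ are given by (dots denote derivatives along $X_K$, primes derivatives along $X_L$)
\[
\begin{aligned}
\dot a_1 &= a_1(a_5 - a_2), & a_1' &= a_1(a_3a_5 - a_2a_4),\\
\dot a_2 &= a_2(a_1 - a_3), & a_2' &= a_2(a_4a_1 - a_3a_5),\\
\dot a_3 &= a_3(a_2 - a_4), & a_3' &= a_3(a_5a_2 - a_4a_1),\\
\dot a_4 &= a_4(a_3 - a_5), & a_4' &= a_4(a_1a_3 - a_5a_2),\\
\dot a_5 &= a_5(a_4 - a_1), & a_5' &= a_5(a_2a_4 - a_1a_3).
\end{aligned}
\]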
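That $K$, $L$ and the product $a_1a_2a_3a_4a_5$ are constant along both vector fields can be verified pointwise. The following is a minimal sketch under stated assumptions: the gradients are written out by hand, the function names are hypothetical, and the sample point is arbitrary (the identities are polynomial, so the point need not satisfy $a_1 \cdots a_5 = 1$).

```python
from fractions import Fraction as F

N = 5


def X_K(a):
    """Field X_K: a_i -> a_i (a_{i-1} - a_{i+1}), indices taken mod 5."""
    return [a[i] * (a[(i - 1) % N] - a[(i + 1) % N]) for i in range(N)]


def X_L(a):
    """Field X_L: a_i -> a_i (a_{i+2} a_{i-1} - a_{i+1} a_{i-2}), indices mod 5."""
    return [a[i] * (a[(i + 2) % N] * a[(i - 1) % N]
                    - a[(i + 1) % N] * a[(i - 2) % N]) for i in range(N)]


def grad_K(a):
    """Gradient of K = a_1 + ... + a_5."""
    return [F(1)] * N


def grad_L(a):
    """Gradient of L = sum_i a_i a_{i+2}: dL/da_i = a_{i+2} + a_{i-2}."""
    return [a[(i + 2) % N] + a[(i - 2) % N] for i in range(N)]


def grad_casimir(a):
    """Gradient of the product a_1 a_2 a_3 a_4 a_5 (all a_i nonzero)."""
    p = F(1)
    for x in a:
        p *= x
    return [p / x for x in a]


def derivative_along(field, grad, a):
    """Directional derivative sum_i (dF/da_i) * field_i at the point a."""
    return sum(g * v for g, v in zip(grad(a), field(a)))


a = [F(2), F(-3), F(1, 2), F(5), F(7, 3)]   # arbitrary sample point
results = {
    (field.__name__, grad.__name__): derivative_along(field, grad, a)
    for field in (X_K, X_L)
    for grad in (grad_K, grad_L, grad_casimir)
}
print(all(value == 0 for value in results.values()))
```

A single sample point is only a spot check, not a proof; evaluating at several random rational points makes an accidental cancellation very unlikely.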

The principal balance of $X_K$ for which $a_1$ and $a_2$ have a pole corresponds, according to Sect. 5, to the following solution of the indicial equations:
\[
(a_1^{(0)}, a_2^{(0)}, a_3^{(0)}, a_4^{(0)}, a_5^{(0)}) = (-1, 1, 0, 0, 0),
\]
and its first few terms are given by
\[
\begin{aligned}
a_1 &= -\frac{1}{t} + \alpha - \frac{1}{3}(\alpha^2 + 2\beta + \gamma)\,t + O(t^2),\\
a_2 &= \frac{1}{t} + \alpha + \frac{1}{3}(\alpha^2 - \beta - 2\gamma)\,t + O(t^2),\\
a_3 &= \gamma t + O(t^2),\\
a_4 &= \delta + O(t^2),\\
a_5 &= \beta t + O(t^2).
\end{aligned}
\tag{25}
\]
Here $\alpha$, $\beta$, $\gamma$ and $\delta$ are the free parameters. If we look for Laurent solutions that correspond to the divisor to be added to $P_{kl}$ we find, by substituting the above Laurent solution in $K = k$, $L = l$, $a_1a_2a_3a_4a_5 = 1$,
\[
2\alpha + \delta = k, \qquad 2\alpha\delta + \beta - \gamma = l, \qquad \gamma\beta\delta = -1,
\]
which means that the Laurent solution depends on two parameters $\beta$ and $\delta$, bound by the relation
\[
(k - \delta)\delta + \beta + \frac{1}{\beta\delta} = l, \tag{26}
\]
which is an (affine) equation for the theta divisor, i.e., for $\Gamma_\tau$; it is easy to see that this curve is birational to the curve (24). The other four principal balances are obtained by cyclic permutation from (25). $P_{kl}$ can be embedded explicitly in projective space by using the functions with a pole of order at most 3 along one of the translates of the theta divisor and no other poles. Since the theta divisor defines a principal polarization on its Jacobian, the vector space of such functions has dimension $3^2 = 9$, giving an embedding in $\mathbb{P}^8$. One checks by


direct computation that the following functions $z_0, \dots, z_8$ form a basis for the space of functions with a pole of order at most 3 along the divisor associated with the Laurent solution (25) (the first two functions are obvious choices from the expression (25), while the others can be obtained from them by taking the derivative along the two flows):
\[
\begin{aligned}
z_0 &= 1,\\
z_1 &= a_1a_2,\\
z_2 &= a_1a_2a_4,\\
z_3 &= a_1a_2(a_1 + a_5),\\
z_4 &= a_1a_2a_4(a_3 + a_4 + a_5),\\
z_5 &= a_1a_2a_4(a_1 - a_2),\\
z_6 &= a_1a_2a_4\bigl((a_3 + a_4)a_1 - (a_4 + a_5)a_2\bigr),\\
z_7 &= a_1^2a_2^2a_4a_5,\\
z_8 &= a_1a_2^2a_4\bigl((a_4 + a_5)^2 + a_3a_4\bigr).
\end{aligned}
\]
The corresponding embedding of the Jacobian in $\mathbb{P}^8$ is then given explicitly on the affine surface $P_{kl}$ by $(a_1, \dots, a_5) \mapsto (z_0 : \cdots : z_8)$. By substituting the five principal balances in this embedding and letting $t \to 0$ we find an embedding of the five curves $\Gamma_1, \dots, \Gamma_5$ (in that order) which constitute the divisor $\mathrm{Jac}(\Gamma_\tau) \setminus P_{kl}$:
\[
(\beta, \delta) \mapsto
\begin{cases}
(0 : 0 : 0 : 1 : 0 : 2\delta : 2\delta^2 : \beta\delta : -\delta^3)\\
(\beta\delta^2 : -\beta^2\delta^2 : 0 : -\beta^2\delta^3 : \beta\delta : \beta\delta : \beta\delta^2 : 0 : 1 - \beta\delta^3)\\
(1 : 0 : \beta\delta : 0 : \beta\delta(k - \delta) : \beta\delta^2 : -\beta\delta(\beta + \delta^2 - k\delta) : 0 : \beta^2\delta(k - \delta))\\
(\beta^2\delta : 0 : \beta\delta : -\beta\delta : \beta\delta(k - \delta) : -\beta\delta^2 : 1 + \beta\delta^2(\delta - k) : -\delta : -\beta\delta^2(\beta - (\delta - k)^2))\\
(\beta\delta^2 : -\delta : 0 : \delta(\delta - k) : \beta\delta : -\beta\delta : -\beta\delta^2 : -1 : 1).
\end{cases}
\]
The points on the divisor that correspond to the above Laurent solutions are the ones for which $\beta$ and $\delta$ are finite; notice that all these points in $\mathbb{P}^8$ are different. In order to determine the coordinates of the other points and the incidence relations between these points and the curves $\Gamma_i$ we choose a local parameter $u$ around each of the three points needed to complete (26) into a compact Riemann surface:

(a) $\delta = 1/u$, $\beta = u^{-2}(1 + O(u))$;
(b) $\delta = 1/u$, $\beta = u^{3}(1 + O(u))$;
(c) $\beta = 1/u$, $\delta = -u^{2}(1 + O(u))$.

Substituting these in the equations of the five embedded curves we find the following 5 points (each one is found 3 times because it belongs to three of the curves $\Gamma_i$):
\[
\begin{aligned}
p_1 &= (0 : 0 : 0 : 1 : 0 : 0 : 0 : 0 : 0),\\
p_2 &= (0 : 0 : 0 : 0 : 0 : 0 : 0 : 0 : 1),\\
p_3 &= (1 : 0 : 0 : 0 : 0 : 0 : 1 : 0 : -k),\\
p_4 &= (1 : 0 : 0 : 0 : 0 : 0 : -1 : 0 : 0),\\
p_5 &= (0 : 0 : 0 : 0 : 0 : 0 : 0 : 1 : -1).
\end{aligned}
\]

Fig. 2. [Diagram: the five curves, labeled 1 to 5, and the points p1, ..., p5 lying on them; p3 and p4 each appear twice.]

With this labeling of the points $p_i$ we have that $\Gamma_i$ contains the points $p_{i-1}$, $p_i$ and $p_{i+1}$. As a corollary we find a $5_3$ configuration on the Jacobian, where the incidence pattern of the 5 Painlevé divisors and the 5 points $p_i$ is as in Fig. 2 (to make the picture exact one has to identify the two points labeled $p_3$, as well as the two points labeled $p_4$, in such a way that the curves $\Gamma_2$ and $\Gamma_4$ are tangent, as well as the curves $\Gamma_3$ and $\Gamma_5$). Obviously the order 5 automorphism $(a_1, a_2, a_3, a_4, a_5) \mapsto (a_2, a_3, a_4, a_5, a_1)$ preserves the affine surfaces $P_{kl}$ and maps every curve $\Gamma_i$ and every point $p_i$ to its neighbor. Since this automorphism does not have any fixed points it is a translation on $\mathrm{Jac}(\Gamma_\tau)$, and since its order is 5 it is a translation over 1/5 of a period. Notice also that with the above labeling of points and divisors the intersection point between $\Gamma_i$ and $\Gamma_{i+2}$ is $p_{i+1}$ (so they are tangent), while the intersection points between $\Gamma_i$ and $\Gamma_{i+1}$ are $p_i$ and $p_{i+1}$. Dually, the divisors that pass through $p_i$ are precisely $\Gamma_{i-1}$, $\Gamma_i$ and $\Gamma_{i+1}$. The usual Olympic rings are nothing but an asymmetric projection of this most beautiful Platonic configuration!

References

1. Adler, M. and van Moerbeke, P.: The complex geometry of the Kowalewski–Painlevé analysis. Invent. Math. 97, 3–51 (1989)
2. Adler, M. and van Moerbeke, P.: Algebraic completely integrable systems: A systematic approach. Perspectives in Mathematics, Academic Press (to appear)
3. Adler, M. and van Moerbeke, P.: The Toda lattice, Dynkin diagrams, singularities and Abelian varieties. Invent. Math. 103, 223–278 (1991)
4. Courant, T.: Dirac manifolds. Trans. Am. Math. Soc. 319, 631–661 (1990)
5. Fernandes, R.L.: On the master symmetries and bi-Hamiltonian structure of the Toda lattice. J. Phys. A: Math. Gen. 26, 3797–3803 (1993)
6. Fernandes, R.L. and Santos, J.P.: Integrability of the periodic KM system. Rep. Math. Phys. 40, 475–484 (1997)
7. Dalaljan, S.G.: The Prym variety of a two-sheeted covering of a hyperelliptic curve with two branch points. (Russian) Mat. Sb. (N.S.), 98 (140), no. 2 1(10), 255–267, 334 (1975)
8. Griffiths, P.A.: Linearizing flows and a cohomological interpretation of Lax equations. Am. J. Math. 107, 1445–1484 (1985)
9. Griffiths, P.A. and Harris, J.: Principles of algebraic geometry. New York: Wiley-Interscience, 1978
10. Kac, M. and van Moerbeke, P.: On an explicitly soluble system of nonlinear differential equations related to certain Toda lattices. Adv. in Math. 3, 160–169 (1975)
11. Kuznetsov, V. and Vanhaecke, P.: Bäcklund transformations for finite-dimensional integrable systems: A geometric approach. nlin.SI/0004003
12. Mumford, D.: Tata lectures on theta. II. Boston: Birkhäuser Boston Inc., 1984
13. Mumford, D.: Prym varieties I. In: Contributions to analysis, Ahlfors L.V., Kra I., Maskit B., Nirenberg L., Eds., New York: Academic Press, 1974, pp. 325–350
14. Pedroni, M. and Vanhaecke, P.: A Lie algebraic generalization of the Mumford system, its symmetries and its multi-Hamiltonian structure. J. Moser at 70, Regul. Chaotic Dyn. 3, 132–160 (1998)
15. Vanhaecke, P.: Linearising two-dimensional integrable systems and the construction of action-angle variables. Math. Z. 211, 265–313 (1992)
16. Vanhaecke, P.: Integrable systems in the realm of algebraic geometry. Berlin–Heidelberg–New York: Springer-Verlag, 1996
17. Vanhaecke, P.: Integrable systems and symmetric products of curves. Math. Z. 227, 93–127 (1998)
18. Veselov, A.P. and Penskoï, A.V.: On algebro-geometric Poisson brackets for the Volterra lattice. Regul. Chaotic Dyn. 3, 3–9 (1998)
19. Volkov, A.: Hamiltonian interpretation of the Volterra model. J. Soviet Math. 46, 1576–1581 (1989)
20. Volterra, V.: Leçons sur la Théorie Mathématique de la Lutte pour la Vie. Paris: Gauthier-Villars et Cie., 1931
21. Weinstein, A.: The local structure of Poisson manifolds. J. Differ. Geom. 18, 523–557 (1983)

Communicated by M. Aizenman

Commun. Math. Phys. 221, 197 – 227 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

The Complex Geometry of Weak Piecewise Smooth Solutions of Integrable Nonlinear PDE’s of Shallow Water and Dym Type

Mark S. Alber 1,2, Roberto Camassa 3,4, Yuri N. Fedorov 5, Darryl D. Holm 4,†, Jerrold E. Marsden 6,‡

1 Department of Mathematics, Stanford University, Building 380, MC 2125, Stanford, CA 94305, USA
2 Department of Mathematics, University of Notre Dame, Notre Dame, IN 46556, USA
3 Department of Mathematics, University of North Carolina, Chapel Hill, NC 27599, USA
4 Center for Nonlinear Studies and Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
5 Department of Mathematics and Mechanics, Moscow Lomonosov University, Moscow 119 899, Russia
6 Control and Dynamical Systems 107-81, California Institute of Technology, Pasadena, CA 91125, USA

Received: 16 February 1999 / Accepted: 10 April 2001

To the 70th birthday of Solomon Alber

Abstract: An extension of the algebraic-geometric method for nonlinear integrable PDE’s is shown to lead to new piecewise smooth weak solutions of a class of N-component systems of nonlinear evolution equations. This class includes, among others, equations from the Dym and shallow water equation hierarchies. The main goal of the paper is to give explicit theta-functional expressions for piecewise smooth weak solutions of these nonlinear PDE’s, which are associated to nonlinear subvarieties of hyperelliptic Jacobians. The main results of the present paper are twofold. First, we exhibit some of the special features of integrable PDE’s that admit piecewise smooth weak solutions, which make them different from equations whose solutions are globally meromorphic, such as the KdV equation. Second, we blend the techniques of algebraic geometry and weak solutions of PDE’s to gain further insight into, and explicit formulas for, piecewise-smooth finite-gap solutions. The basic technique used to achieve these aims is rather different from earlier papers dealing with peaked solutions. First, profiles of the finite-gap piecewise smooth solutions are linked to certain finite dimensional billiard dynamical systems and ellipsoidal billiards. Second, after reducing the solution of certain finite dimensional Hamiltonian systems on Riemann surfaces to the solution of a nonstandard Jacobi inversion problem, this is resolved by introducing new parametrizations. Amongst other natural consequences of the algebraic-geometric approach, we find finite dimensional integrable Hamiltonian dynamical systems describing the motion of peaks in the finite-gap as well as the limiting (soliton) cases, and solve them exactly. The dynamics of the peaks is also obtained by using Jacobi inversion problems. Finally, we relate our method to the shock wave approach for weak solutions of wave equations by determining jump conditions at the peak location.

Research partially supported by NSF grant DMS 9626672 and NATO grant CRG 950897.
Research supported in part by US DOE CCPP and BES programs and NATO grant CRG 950897.
Research supported by INTAS grant 97-10771 and, in part, by the Center for Applied Mathematics, University of Notre Dame.
† Research supported in part by US DOE CCPP and BES programs.
‡ Research partially supported by the California Institute of Technology and NSF grant DMS 9802106.

Contents

1. Introduction
2. Finite-Gap Solutions
3. Flows on n-Dimensional Quadrics and Stationary n-Gap Solutions of the (HD) and (SW) Equations
4. Billiard Dynamical Systems and Piecewise-Smooth Weak Solutions of PDE’s
5. Kinematics of Peaks
6. The Dynamics of Peaks and Weak Solutions

1. Introduction

An important feature of many integrable nonlinear evolution equations is the nature of their soliton solutions. There are many examples of such solutions found in a variety of physical applications, such as nonlinear optics and water wave equations. Nonsmooth soliton solutions of integrable equations are now well known, and include solutions of the shallow water equation (SW) with peaks, the points at which their spatial derivative changes sign (see Camassa and Holm [1993] and Camassa, Holm and Hyman [1994]). It was noted in Alber et al. [1994, 1995, 1999] that the spatial structure of these “peakon” and finite-gap piecewise smooth weak solutions is closely related to finite dimensional integrable billiard systems.

Some history. Camassa and Holm [1993] described classes of n-peakon solutions for an integrable equation in the context of a model for shallow water theory. This work (see also Camassa, Holm and Hyman [1994]) contains many other facts about these equations as well, such as a Hamiltonian derivation of the equation, the associated linear isospectral eigenvalue problem and its discrete spectrum corresponding to the peakons, a steepening lemma important for understanding how solutions lose regularity, numerical stability, etc. Of particular interest to us is their description of the dynamics of the peakons in terms of a finite-dimensional completely integrable Hamiltonian system. In other words, each peakon solution can be associated with a mechanical system of moving particles. Calogero [1995] and Calogero and Francoise [1996] further extended the class of mechanical systems of this type.

It is well known (see, for example, Ablowitz and Segur [1981]) that solitons and quasi-periodic solutions of most classical integrable equations can be obtained by using the inverse scattering transform (IST) method. This is done by establishing a connection with an isospectral eigenvalue problem for an associated operator that is often a Schrödinger operator. In some cases it involves a potential in the form of an entire function of the spectral parameter. Such an operator is called an energy-dependent Schrödinger operator. The scattering problem for operators of this type was studied by Jaulent [1972] and Jaulent and Jean [1976]. On the other hand, in connection with certain N-component systems of integrable evolution equations, Antonowicz and Fordy [1989] investigated certain energy-dependent scalar Schrödinger operators. Using this formalism, they obtained multi-Hamiltonian structures for this class of systems.

Later, Alber et al. [1994, 1995, 1999] showed that in the case of certain potentials, a limiting procedure can be applied to generic solutions, which results in solutions with peaks. The latter were related to finite dimensional integrable dynamical systems with reflections and were termed piecewise-smooth solutions, a terminology that we adopt hereafter. This relation provides an efficient route to the study of finite-gap and piecewise soliton solutions of nonlinear PDE’s. The approach is based on studying finite dimensional Hamiltonian systems on certain Riemann surfaces and can be used for a number of equations including the shallow water equation, the Dym type equation, as well as certain N-component systems and equations in their hierarchies.

Finite-gap solutions of the Dym equation were studied in Dmitrieva [1993a] and Novikov [1999] by making use of a connection with the KdV equation and with the aid of additional phase functions. Soliton solutions of Dym type equations were studied in Dmitrieva [1993b]. Periodic solutions of the shallow water equation were discussed in McKean and Constantin [1999]. The papers by Beals et al. [1998, 1999, 2000] used Stieltjes’ theorem on continued fractions and the classical moment problem for studying multi-peakon solutions of the (SW) equation. Multi-peakon solutions have also been derived in Camassa [2000] by Gram–Schmidt orthogonalization.

The main results of this paper.
While our techniques are rather general and can be applied to large classes of N-component systems, we shall illustrate them in detail for two specific integrable PDE’s. One of these equations is a member of the Dym hierarchy that has been studied by, amongst others, Kruskal [1975], Cewen [1990], Hunter and Zheng [1994] and Alber et al. [1995, 1999]. Using subscript notation for partial derivatives, this equation is
\[
U_{xxt} + 2U_xU_{xx} + UU_{xxx} - 2\kappa U_x = 0. \tag{HD}
\]
The other equation, derived from the Euler equations of hydrodynamics in a shallow water framework by Camassa and Holm [1993], is
\[
U_t + 3UU_x = U_{xxt} + 2U_xU_{xx} + UU_{xxx} - 2\kappa U_x. \tag{SW}
\]

In both equations, the dependent variable U(x, t) may be interpreted as a horizontal fluid velocity and κ is a parameter. Under appropriate boundary conditions, applying the limit κ → 0 to (SW) leads to an equation that has peaked solutions. For equation (HD), such solutions exist also for κ ≠ 0 (for example periodic and finite-gap peaked solutions). By using the method of generating equations for nonlinear integrable PDE’s, we reduce the equations to a Jacobi inversion problem associated with hyperelliptic curves. The solutions U(x, t) themselves are given by trace formulae, i.e., sums of coordinates of points on such curves. An important feature is that the corresponding Abel–Jacobi mapping is not a standard one. First of all, the holomorphic differentials that are involved do not form a complete set of such differentials on a hyperelliptic curve. Second, it involves a meromorphic differential. As a result, the image of the mapping turns out to be a non-Abelian subvariety

(a stratum) of a generalized Jacobian. This also implies that the x- and t-flows of (HD) and (SW) are essentially nonlinear, i.e., they are not translationally invariant. Seen from the viewpoint of algebraic geometry, these nonstandard aspects constitute the main difference between shallow water and Dym type equations, and equations of KdV type and more generally equations from the whole KP hierarchy which lead to standard Abel–Jacobi mappings. The basic technique of the present paper is rather different from earlier papers dealing with peaked solutions. First, profiles of the finite-gap piecewise-smooth solutions are linked to certain finite dimensional billiard dynamical systems and ellipsoidal billiards in the field of Hooke potentials. Second, after reducing the solution of the finite dimensional Hamiltonian systems on Riemann surfaces to the solution of a nonstandard Jacobi inversion problem, it is resolved by introducing new parametrizations. The philosophy that “justifies” procedures of this sort is that, in the end, by using the trace formulae, we obtain weak solutions of the PDE’s (HD) and (SW) in the spacetime sense. This is regarded as equivalent to the validity of Hamilton’s principle for these PDE’s and is taken as a fundamental criterion for the definition of their solutions. It is worth emphasizing that Hamilton’s principle naturally leads to weak solutions in the spacetime sense (and not in the spatial sense alone). We might also remark that even for billiards, one has to be careful about the sense in which solutions are interpreted. In the case of a point particle bouncing off a wall, for example, the equations of motion themselves do not rigorously make sense at the collision; what does make sense is the fundamental principle of Hamilton. This point of view of course is not new – see, e.g., Young [1969] and Kane et al. [1999]. The contents of the paper. In Sect. 
2, basic trace formulae and µ-variable representations are used to establish a connection between solutions of the nonlinear equations and finite dimensional Hamiltonian systems on Riemann surfaces. These representations describe finite-gap and soliton type solutions, as well as mixed soliton–finite-gap solutions. Then, solving the Hamiltonian systems is reduced to Jacobi inversion problems with meromorphic differentials. These inversion problems are solved by introducing a new parameterization that yields a Hamiltonian flow on a nonlinear subvariety of the Jacobi variety. The approach of recurrence chains used in this section is demonstrated in detail in the case of Dym-type equations.

In Sect. 3 the geodesic motion and motion in the field of a Hooke potential on an ellipsoid are linked, at any fixed time t, to finite-gap solutions of (HD) and (SW) equations respectively through trace formulae.

In Sect. 4 it is shown how peaked finite-gap solutions of (HD) and (SW) equations arise in the particular limit of smooth solutions. Based on this, a connection to ellipsoidal and hyperbolic billiards is used to construct the peak solutions of equations (HD) and (SW) in the form of an infinite sequence of pieces, corresponding to the segments between impacts, glued together along peaks. The motion between impacts in the billiard problems is made linear on generalized Jacobians of hyperelliptic curves. By solving the corresponding generalized Jacobi inversion problem, we find theta-function solutions to the billiards, which enables us to describe explicit peak solutions for the above PDE. We then extend the analysis from fixed-time peak solutions to time-dependent ones and show that the latter are described by an infinite number of meromorphic pieces in x and t that are glued along peak lines (surfaces) where the solution has discontinuous derivatives in the dependent variables. We give theta-function expressions for the pieces and the peak surfaces. These formulae may be useful for stability analysis as well as for numerical investigations of the perturbed nonlinear PDE’s.

In Sect. 5 the Hamiltonian structure for the motion of the peaks of the finite-gap piecewise-smooth solutions is obtained by using algebraic-geometric methods. Lastly, in Sect. 6 we relate our method to the shock wave approach for weak solutions of wave equations by determining jump conditions at the shock location.

2. Finite-Gap Solutions

In this section we will show that even on the level of finite-gap solutions, there are crucial differences between the KdV equation case and equations (HD) or (SW). The same method can be applied to other equations forming the HD and SW hierarchy as well as to N-component systems of nonlinear evolution equations which have associated with them energy-dependent Schrödinger operators (see Alber et al. [1997]). We will start by describing the algebraic-geometric structure of finite-gap solutions of equations (HD) and (SW) related to a hyperelliptic curve of genus n, also called n-gap solutions. For the HD equation such solutions were obtained in terms of theta-functions by Dmitrieva [1993a] (see also Dmitrieva [1993b]) and Novikov [1999]. For equation (SW) on a circle, the problem was discussed in Constantin and McKean [1999].

Lax pairs and recurrence chains. We now use the recurrence chain approach to develop a basic trace formula which establishes a connection between solutions of equation (HD) and finite dimensional Hamiltonian systems on Riemann surfaces, written in the so-called µ-variables representation. This representation describes finite-gap solutions, as well as their limiting forms of soliton type. It also yields the existence of peakons in a special limiting case. For definiteness, we concentrate here on equation (HD). Analogous results are available in the case of equation (SW) (for details see Alber et al. [1994, 1995]).
The hierarchy of Dym equations is obtained from the Lax equations
\[
\frac{\partial L}{\partial t_n} = [L, A_n], \qquad n \in \mathbb{N}, \qquad L = -\frac{\partial^2}{\partial x^2} + V(E, x, t_n),
\]
where the potential $V(E, x, t_n)$ is written in terms of a complex parameter $E$ in the form
\[
V(x, t_n, E) = \frac{M(x, t_n)}{2E}, \tag{2.1}
\]
for a function $M(x, t_n)$ to be determined below. Assuming $[L, A_n]$ to be a scalar operator, we choose $A_n = B_n\partial_x - \tfrac{1}{2}B_n'$ for some function $B_n(E, x, t_n)$ and obtain the following sequence of equations for $V$,
\[
\frac{\partial V}{\partial t_n} = -\frac{1}{2}\frac{\partial^3 B_n}{\partial x^3} + 2\frac{\partial B_n}{\partial x}V + B_n\frac{\partial V}{\partial x}. \tag{2.2}
\]
Now we choose $B_n$ to be a polynomial in $E$ of degree $n$:
\[
B_n(x, t, E) = b_0\prod_{k=1}^{n}\bigl(E - \mu_k(x, t)\bigr) = \sum_{k=0}^{n} b_{n-k}(x, t)\,E^k. \tag{2.3}
\]
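The two forms of $B_n$ in (2.3) are related by expanding the product over the roots $\mu_k$. As a small illustration (a sketch with a hypothetical helper name, treating the $\mu_k$ as plain numbers at a fixed point $(x, t)$), the coefficients $b_0, \dots, b_n$ can be computed by multiplying out one factor at a time; in particular $b_1 = -b_0(\mu_1 + \cdots + \mu_n)$.

```python
from fractions import Fraction


def expand_from_roots(roots, b0=1):
    """Return [b_0, b_1, ..., b_n] such that
    b0 * prod_k (E - mu_k) = sum_k b_{n-k} E^k  (highest power first)."""
    coeffs = [Fraction(b0)]
    for mu in roots:
        # Multiply the current polynomial by (E - mu):
        # appending 0 shifts every coefficient up one power of E,
        # then -mu times each old coefficient is added one slot lower.
        shifted = coeffs + [Fraction(0)]
        for i, c in enumerate(coeffs):
            shifted[i + 1] -= mu * c
        coeffs = shifted
    return coeffs


b = expand_from_roots([1, 2, 3])
print(b)                      # coefficients of (E - 1)(E - 2)(E - 3)
print(b[1] == -(1 + 2 + 3))   # b_1 equals minus the sum of the roots when b_0 = 1
```

Up to sign, the coefficient $b_j$ is the $j$th elementary symmetric function of the roots, which is the form in which the $b_j$ enter the trace formula later in this section.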


Substituting the expressions (2.1) and (2.3) into the generating equation (2.2) and equating like powers of E, we obtain a recurrence chain for coefficients of B(x, t) which yields the nth equation of the Dym hierarchy. For example, putting t1 = t and choosing n = 1, B1 (x, t, E) = b0 (x, t)E + b1 (x, t) yields the following chain

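\[
\begin{aligned}
E^{1} &: \; -b_0''' = 0,\\
E^{0} &: \; -b_1''' + 2b_0'M + b_0M' = 0,\\
E^{-1} &: \; 2b_1'M + b_1M' = \frac{\partial M}{\partial t},
\end{aligned}
\tag{2.4}
\]
where primes denote derivatives with respect to $x$.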
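For the reader who wants to see where the three relations of the chain come from, the substitution can be written out in one step. This is a sketch that assumes primes stand for $\partial/\partial x$, inserts $B_1 = b_0E + b_1$ and $V = M/(2E)$ into the generating equation (2.2), and multiplies the result by 2:

```latex
\begin{align*}
\frac{1}{E}\,\frac{\partial M}{\partial t}
  &= -\bigl(b_0'''\,E + b_1'''\bigr)
     + \frac{2}{E}\bigl(b_0'\,E + b_1'\bigr)M
     + \frac{1}{E}\bigl(b_0\,E + b_1\bigr)M' \\
  &= -\,b_0'''\,E
     \;+\; \bigl(-b_1''' + 2b_0'M + b_0M'\bigr)
     \;+\; \frac{1}{E}\bigl(2b_1'M + b_1M'\bigr).
\end{align*}
```

Comparing the coefficients of $E^{1}$, $E^{0}$ and $E^{-1}$ on the two sides gives the three lines of the chain, in that order.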

After setting $b_0 = 1$ and using (2.1), we get
\[
M' = b_1''', \qquad 2b_1'M + b_1M' = \frac{\partial M}{\partial t}. \tag{2.5}
\]
The first equation defines $b_1$ in terms of $M$: $M = b_1'' + \kappa$, with $\kappa$ a constant. Renaming $b_1 = -U$, so that
\[
M = -U'' + \kappa, \tag{2.6}
\]
and putting this into the second equation of the set (2.5) results in equation (HD). (For further details about the hierarchies of (HD) and (SW), see for example Alber et al. [1994, 1995, 1999].) The method of generating equations is due to S. Alber [1979] and another exposition of it can be found in Alber et al. [1985, 1997]. We call (2.2) the “dynamical generating equation”, because it generates a hierarchy of equations governing the dynamics of the dependent variable M(x, t).

Remark. The flows where $B_n$ is a polynomial in $E$, as in the definition (2.3) and in the example above, will in general lead to nonlocal equations, i.e., the evolution equation for M involves terms that depend on nonlocal operators acting on combinations of M and its derivatives. This can be seen, for instance, in Eq. (2.5), where both $b_1$ and $b_1'$ require inverting (2.6) to write U in terms of M. Thus, flows generated by polynomials $B_n$ in $E$ should be properly classified as integro-differential evolution equations, rather than PDE’s. In contrast, the choice of polynomials in $1/E$ for $B_n$ leads to flows that are local, i.e., $M_t$ only depends on combinations of M and its (spatial) derivatives, and these flows are proper PDE’s. This feature of equations of Dym (HD) or shallow water (SW) type is somewhat different from other completely integrable PDE’s like the KdV or Sine-Gordon equation. Equations (HD) and (SW) possess “open ended” hierarchies: the recurrence chain can be extended from negative to positive powers of E, by choosing $B_n$ in (2.2) to be a rational function of the parameter E. The case when the chain includes only negative powers of E is in fact the one most studied in the literature (see, e.g., Dmitrieva [1993a], Novikov [1999] for the case of the Dym equation).

Now let us consider the stationary flow for the nth equation of the hierarchy, which is obtained by dropping the time derivative of V in the left-hand side of (2.2). By definition

a stationary equation describes a finite-dimensional system for the coefficients of $B_n$ and is equivalent to the $2 \times 2$ Lax pair
\[
\begin{gathered}
\frac{\partial}{\partial x}W_n(E) = -[W_n(E), L(E)], \quad\text{or}\quad \Bigl[\frac{\partial}{\partial x} + L(E),\, W_n(E)\Bigr] = 0,\\
W_n(E) = \begin{pmatrix} -\tfrac{1}{2}B_n' & B_n\\[2pt] -\tfrac{1}{2}B_n'' + B_n\dfrac{M}{E} & \tfrac{1}{2}B_n' \end{pmatrix}, \qquad
L = \begin{pmatrix} 0 & 1\\[2pt] \dfrac{M}{E} & 0 \end{pmatrix}.
\end{gathered}
\tag{2.7}
\]
The matrix $W_n(E)$ undergoes an isospectral deformation. Hence the spectral curve $\Gamma = \{|W_n(E) - zI| = 0\}$ is an invariant of the stationary flow. The curve is hyperelliptic and can be represented in the form
\[
\Gamma = \{w^2 = \mu\,C(\mu)\}, \tag{2.8}
\]
where $z = w/E$ and
\[
C(E) = E\Bigl(-B_nB_n'' + \frac{1}{2}B_n'^{\,2}\Bigr) + B_n^2M. \tag{2.9}
\]
Since $B_n$ is a polynomial of degree $n$, $C(E)$ becomes a polynomial of degree (at most) $2n$:
\[
C(E) = \sum_{j=0}^{2n} C_jE^j = C_{2n}\prod_{k=1}^{2n}(E - m_k), \tag{2.10}
\]
for some constants $m_k$, $k = 1, \dots, 2n$. In this case the curve has genus $n$ and we set the coefficient $C_{2n}$ to be a negative number: $C_{2n} \equiv -L_0^2$. We shall refer to (2.9) as the stationary generating equation. Equating like powers of $E$ on both sides of the stationary generating equation yields first integrals
\[
\begin{aligned}
E^{2n} &: \; C_{2n} = -b_1'' + M,\\
E^{2n-1} &: \; C_{2n-1} = -b_1b_1'' - b_2'' + \frac{1}{2}(b_1')^2 + 2b_1M,\\
E^{j} &: \; \cdots\\
E^{0} &: \; C_0 = b_n^2M.
\end{aligned}
\tag{2.11}
\]

Let us consider the divisor of points $P_1 = (\mu_1, w_1), \dots, P_n = (\mu_n, w_n)$ on $\Gamma$. Substituting (2.3) into (2.9) and setting $E = \mu_1, \dots, \mu_n$ successively, one gets the following system of equations describing the evolution of the points under the stationary flow:
\[
\frac{\partial\mu_i}{\partial x} \equiv \mu_i' = \frac{\sqrt{R(\mu_i)}}{\mu_i\prod_{j \ne i}^{n}(\mu_i - \mu_j)}, \tag{2.12}
\]
where
\[
R(\mu) = \mu\,C(\mu) = -L_0^2\,\mu\prod_{r=1}^{2n}(\mu - m_r). \tag{2.13}
\]
In the case of equation (SW), this should be replaced by
\[
R(\mu) = \mu\prod_{r=1}^{2n+1}(\mu - m_r). \tag{2.14}
\]
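To make the structure of the system (2.12) concrete, here is a minimal sketch of its right-hand side for the (HD) case, with $R(\mu)$ taken from (2.13). The function names, the default $L_0 = 1$ and the use of the principal branch of the square root are assumptions of this illustration; in an actual integration the sign of the root must be tracked as a variable crosses a branch point, which this sketch does not do.

```python
import cmath


def R_hd(mu, m, L0=1.0):
    """R(mu) = -L0^2 * mu * prod_r (mu - m_r), as in (2.13)."""
    prod = 1.0
    for m_r in m:
        prod *= (mu - m_r)
    return -L0 ** 2 * mu * prod


def mu_x(mus, m, L0=1.0):
    """Right-hand side of (2.12) for every mu_i (principal square root)."""
    rhs = []
    for i, mu_i in enumerate(mus):
        denom = mu_i
        for j, mu_j in enumerate(mus):
            if j != i:
                denom *= (mu_i - mu_j)
        rhs.append(cmath.sqrt(R_hd(mu_i, m, L0)) / denom)
    return rhs


# One-gap example (n = 1): branch points m = (1, 10), root variable mu_1 = 2.
# Here R(2) = -2 * (2 - 1) * (2 - 10) = 16, so mu_1' = 4 / 2 = 2.
print(mu_x([2.0], [1.0, 10.0]))
```

With real $\mu_i$ placed inside gaps where $R(\mu_i) \ge 0$, the returned values have zero imaginary part.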

r=1

We now proceed to describe finite-gap solutions of equation (HD) and the other equations from its hierarchy. According to a general theory (see, e.g., Dubrovin [1981], Belokolos et al. [1994]), for any fixed t, the x-profile of an n-gap solution of an integrable PDE satisfies the nth stationary equation of the hierarchy. Hence, n-gap solutions $M(x, t_k)$ of the kth equation of the HD hierarchy must satisfy the stationary generating equation (2.9) represented by the Lax pair (2.7), as well as the dynamical generating equation
\[
\frac{\partial V}{\partial t_k} = -\frac{1}{2}\frac{\partial^3 B_k}{\partial x^3} + 2\frac{\partial B_k}{\partial x}V + B_k\frac{\partial V}{\partial x}, \qquad V = \frac{M(x, t_k)}{2E}, \tag{2.15}
\]
where the coefficients of $B_k(E)$ are found recursively. Notice that the latter equation is equivalent to the matrix commutativity relation
\[
\Bigl[\frac{\partial}{\partial x} + L,\; \frac{\partial}{\partial t_k} + W_k\Bigr] = 0, \tag{2.16}
\]
where
\[
W_k(E) = \begin{pmatrix} -\tfrac{1}{2}B_k' & B_k\\[2pt] -\tfrac{1}{2}B_k'' + B_k\dfrac{M}{E} & \tfrac{1}{2}B_k' \end{pmatrix}, \tag{2.17}
\]
and $L$ is defined in (2.7). The compatibility of conditions (2.16) and (2.7) leads to the following Lax pair:
\[
\frac{\partial}{\partial t_k}W_n(E) = -[W_n(E), W_k(E)], \qquad k \in \mathbb{N}, \quad k \ne n. \tag{2.18}
\]
For $k = n$, we replace (2.18) with the Lax pair (2.7), thus identifying $t_n$ with $x$. The (1,2)-entry of the matrix equation (2.18) implies the following $t_k$-evolution of the polynomial $B_n(E)$:
\[
\frac{\partial B_n}{\partial t_k} = B_k\frac{\partial B_n}{\partial x} - B_n\frac{\partial B_k}{\partial x}, \qquad k \ne n. \tag{2.19}
\]
In case $k = n$ this relation is replaced by
\[
\frac{\partial B}{\partial x} = v\,b_0\,\frac{\partial B}{\partial t_n},
\]
where $v$ is a constant, which can always be eliminated by rescaling $t_n$.

Expanding the right-hand side of (2.19) in $E$ and using the condition that it must be a polynomial of degree $n - 1$, we find
\[
B_k(E) = \Bigl[\frac{B_n(E)}{E^{\,n-k}}\Bigr]_+, \tag{2.20}
\]
where $[\;]_+$ denotes the polynomial part of the expression. As follows from the first equation in (2.11), $M = C_{2n} + b_1''$. On the other hand, according to formula (2.3), $b_1 = -\sum_{i=1}^{n}\mu_i$. Finally, using the definition (2.6) of $M$ in terms of the solution $U$ and integrating twice with respect to $x$, we obtain
\[
U = \sum_{i=1}^{n}\mu_i + \frac{1}{2}(\kappa - C_{2n})\,x^2 + K_1x + K_2, \tag{2.21}
\]
where $K_1$ and $K_2$ are constants of integration. If we assume that all the variables $\mu_i$ are bounded, which is related to the choice of sign of the leading order coefficient $C_{2n}$, then $b_1$ is a bounded function of $x$. To find bounded solutions $U(x, t)$ of the PDE, we set
\[
C_{2n} = \kappa \qquad\text{and}\qquad K_1 = 0.
\]
Hence, when the above requirements are imposed, we see that the leading order coefficient of the polynomial $C(E)$ must coincide with the parameter $\kappa$ of the PDE. The Dym equation (HD) is invariant under the Galilean transformation
\[
\hat{x} = x + K_2t, \qquad \hat{t} = t, \qquad \hat{U} = U - K_2,
\]
so that the constant $K_2$ can always be eliminated from expression (2.21). Therefore, under the boundedness conditions above, and up to a Galilean transformation, we assume that the finite-gap and soliton solutions of the Dym equation (HD) are reconstructed in terms of the root variables $\mu$'s by the “trace” formula, which in the case of equations (HD) and (SW) has the form
\[
U(x, t) = \sum_{i=1}^{n}\mu_i - m. \tag{2.22}
\]
Here $m$ is a constant, which equals zero in the case of equation (HD). Through (2.22), a solution of the system (2.12) allows one to construct the instantaneous profile of $U(x, \cdot)$ from a set of initial conditions $\mu_i(x, \cdot) = \mu_i(0, \cdot) \in [m_{2i}, m_{2i+1}]$, $i = 1, \dots, n$. Here the “dot” notation stresses the fact that time $t$ is just a parameter in this system. On the other hand, substitution of (2.3) into (2.19), setting $E = \mu_1, \dots, \mu_n$ successively, and taking into account expressions (2.12) results in the following $t_k$-evolution equations for $\mu_i$,
\[
\frac{\partial\mu_i}{\partial t_k} = B_k(\mu_i)\frac{\partial\mu_i}{\partial x} = B_k(\mu_i)\frac{\sqrt{R(\mu_i)}}{\mu_i\prod_{j \ne i}^{n}(\mu_i - \mu_j)}, \qquad i = 1, \dots, n, \tag{2.23}
\]
where, in view of (2.3) and (2.20), for $k = 1, \dots, n - 1$,
\[
B_k(\mu_i) = \operatorname{Res}_{s=0}\,\frac{1}{s^{\,n-k}}\,\frac{(s - \mu_1)\cdots(s - \mu_n)}{s - \mu_i},
\]

206

M. S. Alber, R. Camassa, Y.N. Fedorov, D. D. Holm, J. E. Marsden

i.e., up to the sign, the k th elementary symmetric function of {µ1 , . . . µn } \ µi . In the case k = 1, √ ∂µi (µi − &) R(µi ) µ˙ i ≡ = , i = 1, . . . , n, ∂t1 µi nj=i (µi − µj )

& = µ1 + · · · + µn ,

(2.24)

the solution of which produces the $\mu$'s, and hence the PDE's solution $U$, at any (later) time $t$. We notice that for $k > n$, the derivatives $\partial/\partial t_k$ are linear combinations of $\partial/\partial t_1,\dots,\partial/\partial t_n$.

Expressions (2.12), (2.23), and (2.24) provide the so-called $\mu$-variables representation for the finite-gap solutions of an evolution equation. They are the analogs of the Drach–Dubrovin equations which describe the evolution of points on the spectral hyperelliptic curve in the case of the KdV equation. (For further details see Dubrovin [1975], Drach [1919], Alber et al. [1994, 1995, 1999], Gesztesy et al. [1996], and Alber and Fedorov [2001].) With the initial conditions chosen, the right-hand side of system (2.12) is real, and the derivative of $\mu_i$ changes sign when $\mu_i$ reaches the end points of its gap, $\mu_i = m_{2i}$ or $\mu_i = m_{2i+1}$, corresponding to a change of the sheet of the spectral curve $\Gamma$. Thus each variable $\mu$ undergoes (real) oscillations between the end points of a gap (so that the resulting PDE solution $U(x,t)$ remains real).

Remark. The condition that the root variables $\mu$ are real (or, equivalently, that their initial conditions are chosen as described above), while certainly sufficient to assure reality of the PDE's solution $U$ resulting from (2.21), is clearly not necessary (namely, some of the $\mu$'s could occur in conjugate pairs). A wider class of real solutions $U$ could be constructed by relaxing the reality assumption on the $\mu$-variables. However, a thorough discussion of the reality condition for $U$ and its implications for the root variables, while certainly desirable, lies beyond the scope of the present paper, and it will be addressed in future work.

By rearranging and summing up Eqs. (2.12), (2.23), and (2.24), one obtains the following nonstandard Abel–Jacobi equations

\[
\sum_{i=1}^{n} \frac{\mu_i^{k}\, d\mu_i}{\sqrt{R(\mu_i)}} =
\begin{cases}
dt_k , & k = 1,\dots,n-1,\\
dx , & k = n,
\end{cases} \tag{2.25}
\]

which contain $(n-1)$ holomorphic differentials and one meromorphic differential on $\Gamma$. Thus, the number of holomorphic differentials is less than the genus of the Riemann surface, which implies that the corresponding inversion problem cannot be solved in terms of meromorphic functions of $x$ and $t_1,\dots,t_{n-1}$ (see e.g., Markushevich [1977]).

Finite-gap stationary flows in x. Let us first consider the $x$-flow by fixing time variables in (2.25): $t_k = t_k^0 = \text{const}$, $k = 1,\dots,n-1$, so that $dt_k = 0$. Now introduce a new spatial variable $x_1$ defined as follows:

\[
x = \int_0^{x_1} \frac{1}{L_0}\, \mu_1\cdots\mu_n \, dx_1 . \tag{2.26}
\]

In view of the well-known Jacobi identities

\[
\sum_{i=1}^{n} \frac{\mu_i^{k}}{\prod_{j\ne i}(\mu_i-\mu_j)} =
\begin{cases}
1/(\mu_1\cdots\mu_n), & k = -1,\\
0, & k = 0,\dots,n-2,\\
1, & k = n-1,\\
\Sigma, & k = n,
\end{cases} \tag{2.27}
\]

Eqs. (2.12) give rise to the following system:

\[
\sum_{i=1}^{n} \int_{\mu_0}^{\mu_i} \frac{\mu^{k-1}\, d\mu}{\sqrt{R(\mu)}} =
\begin{cases}
x_1 + \phi_1 , & k = 1,\\
\phi_k , & k = 2,\dots,n,
\end{cases} \tag{2.28}
\]

where $\phi_1,\dots,\phi_n$ are constant phases which depend on $t_k^0$ as parameters. Equations (2.28) include $n$ holomorphic differentials on $\Gamma$ and determine the standard Abel–Jacobi map of the symmetric product $\Gamma^{(n)}$ of $n$ copies of $\Gamma$ to the Jacobi variety (Jacobian) $\mathrm{Jac}(\Gamma)$. Thus, the flow generated by the system (2.12) is made linear on $\mathrm{Jac}(\Gamma)$ after introducing the reparametrization (2.26).

By using standard methods (see e.g., Dubrovin [1981] or Mumford [1983]), the map can be inverted, resulting in expressions for algebraic symmetric functions of the $\mu$-variables in terms of theta-functions of $n$ arguments which depend linearly on $x_1$ and, in a transcendental way, on $t_k^0$ as parameters. Then, by using the trace formula (2.22), one obtains a theta-functional expression for $U$ as a function of $x_1$, $t_k^0$: $U = \tilde U(x_1, t_k^0)$. On the other hand, substituting the theta-functional expression for the product $\mu_1\cdots\mu_n$ into (2.26) yields a quadrature. By solving it, one finds $x$ as a meromorphic function of $x_1$ which depends on $t^0$ as a parameter. However, the inverse function $x_1(x, t^0)$ is no longer meromorphic in $x$. Finally, the composition $U(x, t^0) = \tilde U(x_1(x, t^0), t_k^0)$ gives a profile of the finite-gap solutions of the (HD) or (SW) equation (for explicit theta-functional expressions $\tilde U(x_1, t^0)$, $x(x_1, t^0)$ see Alber and Fedorov [2001]).

Notice that, as seen from (2.26) and (2.28), the original $x$-flow is also made linear on $\mathrm{Jac}(\Gamma)$. However, the straight line motion is not uniform. The transformation (2.26) involving $x$ and $x_1$ coincides with a change of variable in the well-known Liouville transformation (see, e.g., Verhulst [1996]).

Finite-gap flows in $t_k$. Now let us fix the coordinate $x = x_0$ as well as all the times $t_1,\dots,t_{n-1}$ but $t_k$. Then introduce a new time variable $\tilde t$ defined by

\[
dt_k = \frac{\mu_1\cdots\mu_n}{L_0\,(\Sigma_{k-1})}\, d\tilde t , \tag{2.29}
\]

where $\Sigma_{k-1}$ are the elementary symmetric functions of $\mu_1,\dots,\mu_n$ such that $(s-\mu_1)\cdots(s-\mu_n) = s^n + s^{n-1}\Sigma_1 + \cdots + s^0\Sigma_n$. Applying again the identities (2.27), from (2.24) and (2.23) we arrive at the following canonical Abel–Jacobi mapping

\[
\sum_{i=1}^{n} \int_{\mu_0}^{\mu_i} \frac{\mu^{s-1}\, d\mu}{2\sqrt{R(\mu)}} =
\begin{cases}
\psi_1 = \tilde t + \phi_1 , & s = 1,\\
\psi_s = \delta_{s,k}\, t_k + \phi_s , & s = 2,\dots,n,
\end{cases} \tag{2.30}
\]

where $\phi_1,\dots,\phi_n$ are constant phases which depend on $x_0$ and the rest of the times $t_l$ as parameters, and $\delta_{ij}$ is the Kronecker delta. As a result of inversion of (2.30), elementary symmetric functions of the $\mu$'s, and therefore the solution of equations (HD) and (SW), can be found in terms of theta-functions of $n$ arguments which depend linearly on $\psi_s$. This means that the arguments depend linearly on $\tilde t$, as well as on the original time $t_k$. However, $\tilde t$ itself depends on $t_k$ in a nonlinear way. Indeed, to describe the relation between $\tilde t$ and $t_k$, we substitute the theta-functional expressions for the symmetric functions $\Sigma_n = \mu_1\cdots\mu_n$ and $\Sigma_{k-1}$ into (2.29). As a result, in contrast to the quadrature (2.26) relating $x$ and $x_1$, we now get a differential equation of the form

\[
\frac{dt_k}{d\tilde t} = F(t_k, \tilde t \,|\, x_0),
\]

where $F$ is a transcendental function of $t_k$, $\tilde t$ and the parameter $x_0$. It can be shown that the equation involves a transcendental integral.

Remarks. 1. In contrast to the $x_1$- and $x$-flows considered above, the flows generated by (2.23) ($t_k$-flows), including (2.24), are nonlinear flows on the Jacobi variety $\mathrm{Jac}(\Gamma)$. From the point of view of algebraic geometry, this phenomenon constitutes the main difference between solutions of such well-known equations as the KdV and sine-Gordon equations and equations of (HD) or (SW) type.

2. The problem of inversion of the full nonstandard Abel mapping defined by (2.25) can also be studied by using a generalized Jacobian of the curve $\Gamma$. Namely, one has to extend the mapping by including an extra holomorphic differential on $\Gamma$ to get a complete set of such differentials. As a result of this procedure, one gets a flow on nonlinear subvarieties (strata) of generalized Jacobians. The complete algebraic geometrical description and explicit formulae are presented in Alber and Fedorov [2001].

3. Flows on n-Dimensional Quadrics and Stationary n-Gap Solutions of the (HD) and (SW) Equations

Consider a family of confocal quadrics in $\mathbb{R}^{n+1} = (X_1,\dots,X_{n+1})$,

\[
\tilde Q(s) = \left\{ \frac{X_1^2}{a_1 - s} + \cdots + \frac{X_{n+1}^2}{a_{n+1} - s} = 1 \right\}, \qquad s \in \mathbb{R}, \qquad 0 < a_{n+1} < a_1 < \cdots < a_n . \tag{3.1}
\]

The elliptic coordinates $\mu_1,\dots,\mu_{n+1}$ can be defined in $\mathbb{R}^{n+1}$ in a standard way (see, e.g., Jacobi [1884a]) as follows. The condition $s = c$ determines the quadric $\tilde Q(c)$ on which one of the coordinates, say $\mu_{n+1}$, equals $c$, and the other coordinates $\mu_1,\dots,\mu_n$ are elliptic coordinates on $\tilde Q(c)$ defined by the relations

\[
X_j^2 = (a_j - c)\, \frac{\prod_{l=1}^{n} (a_j - \mu_l)}{\prod_{k=1,\,k\ne j}^{n+1} (a_j - a_k)}, \qquad j = 1,\dots,n+1. \tag{3.2}
\]

In the sequel, without loss of generality, we assume $c = 0$.

It is well known that the problem of geodesics on the ellipsoid $\tilde Q = \tilde Q(0)$ is completely integrable (Jacobi [1884a,b]). Moreover, as noticed by Jacobi himself and later by many other authors (see e.g. Rauch-Wojciechowski [1995]), there exists an infinite sequence of integrable generalizations of the problem describing a motion on $\tilde Q$ in the force field of certain polynomial potentials $V_p(X_1,\dots,X_{n+1})$, $p \in \mathbb{N}$, of degree $2p$. The simplest integrable potential is the quadratic Hooke potential, or the potential of an elastic string joining the center of the ellipsoid $\tilde Q$ to the point mass on it:

\[
V_1 = \frac{\sigma}{2}\,(X_1^2 + \cdots + X_{n+1}^2), \qquad \sigma = \text{const}.
\]

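In this case, in terms of the ellipsoidal coordinates, the total energy (Hamiltonian) takes the Stäckel form:

\[
H = \frac{1}{8} \sum_{i=1}^{n} \frac{\prod_{j\ne i}(\mu_i - \mu_j)\,\mu_i}{\Phi(\mu_i)} \left( \frac{d\mu_i}{dx} \right)^{2} + \frac{\sigma}{2} \sum_{i=1}^{n} \mu_i + \text{const},
\]

where

\[
\Phi(\mu) = (\mu - a_1)\cdots(\mu - a_{n+1})
\]

and $x$ denotes time. After fixing constants of motion, the system is reduced to the Abel–Jacobi equations

\[
\sum_{k=1}^{n} \int_{\mu_0}^{\mu_k} \frac{\mu^{i}\, d\mu}{2\sqrt{R(\mu)}} = \delta_{in}\, x + \phi_i , \qquad i = 1,\dots,n, \tag{3.3}
\]
\[
R(\mu) = -\mu\,\Phi(\mu)\,\bigl[ c_0 (\mu - c_1)\cdots(\mu - c_{n-1}) - \sigma \mu^{n} \bigr], \qquad c_0,\dots,c_{n-1} = \text{const},
\]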

where $\phi_1,\dots,\phi_n$ are constant phases and $c_1,\dots,c_{n-1}$ are constants of motion. Notice that for $\sigma = 0$ the order of the polynomial $R(\mu)$ is $2n+1$, whereas for $\sigma \ne 0$ it is $2n+2$. The case $\sigma = 0$ corresponds to the free (geodesic) motion on $\tilde Q$. Here $c_0$ is the constant in the first integral $(\dot X, \dot X)$ and the remaining constants admit a clear geometric interpretation: the tangent line to a geodesic is also tangent to the fixed confocal quadrics $\tilde Q(c_1),\dots,\tilde Q(c_{n-1})$ (Chasles theorem).

Now notice that Eqs. (3.3) are equivalent to the system (2.25) with $dt = 0$ describing stationary (HD) and (SW) equations, provided we identify the roots of the polynomial $R(\mu)$ with those of the odd order polynomial (2.13) (for $\sigma = 0$ and $L_0 = 1$) and of the even order polynomial (2.14) (for $\sigma = 1$) respectively. The equivalence also holds when some of the parameters $a_i$ in (3.3) are negative, which corresponds to the motion on a hyperboloid. For concreteness, we shall consider only the case of ellipsoids. Taking into account the trace formula (2.22), we arrive at the following theorem:

Theorem 3.1. The geodesic motion and the motion in the field of a Hooke potential on the ellipsoid $\tilde Q$ are linked, at any fixed time $t$, to the $n$-gap solutions of the (HD) and (SW) equations respectively through the trace formula (2.22). Namely, if the roots of the polynomials $R(\mu)$ in (2.13) or (2.14) coincide with the roots of $R(\mu)$ in (3.3), the profiles of such solutions are given by the sum of the elliptic coordinates of the moving point on $\tilde Q$, with the addition of $(-m)$ in the case of equation (SW).

For the geodesic flow on $\tilde Q$ ($\sigma = 0$) and equation (HD), this result was obtained in Alber and Alber [1985], Cewen [1990], and Alber et al. [1995].

As with Eq. (2.25), under the change of parameter (2.26), Eqs. (3.3) reduce to those containing holomorphic differentials only and having the same structure as (2.28). By inverting the corresponding Abel–Jacobi mapping, one obtains explicit expressions for elementary symmetric functions of $\mu_i$ and, in view of (3.2), for the Cartesian coordinates $X_1,\dots,X_{n+1}$ in terms of theta-functions of the new parameter $x_1$ (for the case of the geodesic flow, see Weierstrass [1844], Moser [1978], and Knörrer [1982]). In the case $n = 2$, the change of parameter (2.26) was first applied by Weierstrass [1844] to solve the classical Jacobi geodesic problem on a triaxial ellipsoid (Jacobi [1884a,1884b]).
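The relations (3.2) between elliptic and Cartesian coordinates can be checked on a small example. The following is only an illustrative sketch, not part of the original construction: the function names `cartesian_squares` and `quadric_value` and the sample values of $a_j$ and $\mu_l$ are assumptions chosen for the demonstration ($n = 2$, $c = 0$, with $0 < a_3 < a_1 < a_2$ and one $\mu$ in each gap). It evaluates $X_j^2$ from (3.2) in exact rational arithmetic and confirms that the resulting point lies on $\tilde Q(0)$ and on the confocal quadrics $\tilde Q(\mu_1)$, $\tilde Q(\mu_2)$ of the family (3.1).

```python
from fractions import Fraction
from math import prod


def cartesian_squares(a, mu, c=0):
    """Squares X_j^2 given by relations (3.2).

    a  : the n+1 parameters a_1, ..., a_{n+1} of the confocal family (3.1)
    mu : the n elliptic coordinates mu_1, ..., mu_n on the quadric Q~(c)
    """
    squares = []
    for j, aj in enumerate(a):
        num = (aj - c) * prod(aj - m for m in mu)
        den = prod(aj - ak for k, ak in enumerate(a) if k != j)
        squares.append(Fraction(num) / den)
    return squares


def quadric_value(squares, a, s):
    """Left-hand side of the equation of Q~(s) in (3.1)."""
    return sum(x2 / (aj - s) for x2, aj in zip(squares, a))


if __name__ == "__main__":
    # Hypothetical sample data: a_3 < a_1 < a_2, mu_1 in (a_3, a_1), mu_2 in (a_1, a_2).
    a = [Fraction(2), Fraction(3), Fraction(1)]
    mu = [Fraction(3, 2), Fraction(5, 2)]
    X2 = cartesian_squares(a, mu)
    print("X_j^2 =", X2)
    for s in [Fraction(0)] + mu:
        print("s =", s, "->", quadric_value(X2, a, s))
```

Exact fractions are used so that the identities hold without rounding; the same functions accept floating-point input if only approximate values are needed.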

4. Billiard Dynamical Systems and Piecewise-Smooth Weak Solutions of PDE's

In this section it is first shown how peaked finite-gap solutions of the (HD) and (SW) equations arise in the limit $m_1 \to 0$, where $m_1$ is the smallest root of the polynomial $R(E)$ in Eqs. (2.12)–(2.24). Then a connection to ellipsoidal and hyperbolic billiards is established.

Ellipsoidal billiards and generalized Jacobians. Suppose that one of the semi-axes of the ellipsoid $\tilde Q$ tends to zero, namely, $a_{n+1} \to 0$. In the limit, $\tilde Q$ passes into the interior of the $(n-1)$-dimensional ellipsoid

\[
Q = \{ X_1^2/a_1 + \cdots + X_n^2/a_n = 1 \} \subset \mathbb{R}^n , \qquad \mathbb{R}^n = (X_1,\dots,X_n).
\]

The elliptic coordinates $\mu_1,\dots,\mu_n$ on $\tilde Q$ transform to elliptic coordinates in $\mathbb{R}^n$, giving

\[
X_j^2 = \frac{\prod_{l=1}^{n} (a_j - \mu_l)}{\prod_{k=1,\,k\ne j}^{n} (a_j - a_k)}, \qquad j = 1,\dots,n, \tag{4.1}
\]

which appear as the corresponding limits of (3.2).

Then the motion on $\tilde Q$ gets transformed into billiard motion inside the ellipsoid $Q$. Geodesics on $\tilde Q$ pass into straight line segments inside $Q$, whereas the points of intersection of the geodesics with the plane $\{X_{n+1} = 0\}$ are mapped into impact points on $Q$ with elastic reflection. Also, the motion on $\tilde Q$ under the Hooke force passes to the motion inside $Q$ under the action of the Hooke force with the potential $V = \sigma (X_1^2 + \cdots + X_n^2)/2$. However, in contrast to the cases $\sigma = 0$ or $\sigma < 0$, for $\sigma > 0$ (an attracting Hooke potential), for the trajectory to reach $Q$ the total energy $h$ must be sufficiently large. Namely, there ought to exist a positive $\varepsilon$ such that inside $Q$ the following double inequality holds: $h + \sigma (X_1^2 + \cdots + X_n^2)/2 > \varepsilon > 0$. Under this condition, the motion on $\tilde Q$ transforms to billiard motion inside the ellipsoid $Q$, again having impacts and elastic reflections along $Q$.

Thus, we have "an ellipsoidal billiard with the Hooke potential" which is described by the mapping $B : (x, v) \to (\tilde x, \tilde v)$, where $x, v \in \mathbb{R}^n$ are the Cartesian coordinates of a point on $Q$ and the starting velocity vector respectively, while $(\tilde x, \tilde v)$ are the coordinates and the starting velocity at the next impact point. Following Fedorov [2001], the mapping has the form

\[
\begin{aligned}
\tilde x &= -\frac{1}{\nu}\bigl[ (\sigma - (v, a^{-1}v))\,x + 2 (x, a^{-1}v)\, v \bigr],\\
\tilde v &= -\frac{1}{\nu}\bigl[ (\sigma - (v, a^{-1}v))\,v - 2\sigma (x, a^{-1}v)\, x \bigr] + \Lambda a^{-1}\tilde x\\
&= -\frac{1}{\nu}\bigl[ (\sigma - (v, a^{-1}v))(v + \Lambda a^{-1}x) + 2 (x, a^{-1}v)(\Lambda a^{-1}v - \sigma x) \bigr],\\
\nu &= \sqrt{ 4\sigma (x, a^{-1}v)^2 + (\sigma - (v, a^{-1}v))^2 }, \qquad
\Lambda = \frac{2(\tilde v, a^{-1}\tilde x)}{(\tilde x, a^{-2}\tilde x)} .
\end{aligned} \tag{4.2}
\]

Notice that in the limit $\sigma \to 0$ this reduces to the standard billiard mapping given in Veselov [1988],

\[
\tilde x = x - \frac{2(x, a^{-1}v)}{(v, a^{-1}v)}\, v , \qquad
\tilde v = v + \frac{2(\tilde v, a^{-1}\tilde x)}{(\tilde x, a^{-2}\tilde x)}\, a^{-1}\tilde x .
\]
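The mapping (4.2) can be tried out numerically. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the function name `hooke_billiard_map`, the plain-list representation of vectors and the sample data are hypothetical. The multiplier $\Lambda$ is evaluated explicitly as $-2(\bar v, a^{-1}\tilde x)/(\tilde x, a^{-2}\tilde x)$, where $\bar v$ is the velocity just before the impact; this is what one obtains by solving $\tilde v = \bar v + \Lambda a^{-1}\tilde x$ together with the implicit definition of $\Lambda$ in (4.2).

```python
import math


def dot(u, v):
    """Euclidean scalar product of two vectors given as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def hooke_billiard_map(x, v, a, sigma):
    """One step (x, v) -> (x~, v~) of the mapping (4.2).

    x     : point on Q = {sum X_i^2 / a_i = 1}
    v     : velocity just after the impact at x (pointing inside Q)
    a     : list [a_1, ..., a_n] of the parameters of Q
    sigma : Hooke constant; sigma = 0 gives the straight-line billiard
    """
    ainv_v = [vi / ai for vi, ai in zip(v, a)]
    A = dot(v, ainv_v)                      # (v, a^{-1} v)
    B = dot(x, ainv_v)                      # (x, a^{-1} v)
    nu = math.sqrt(4 * sigma * B ** 2 + (sigma - A) ** 2)
    x_new = [-((sigma - A) * xi + 2 * B * vi) / nu for xi, vi in zip(x, v)]
    # Velocity just before the next impact (first term of v~ in (4.2)).
    v_in = [-((sigma - A) * vi - 2 * sigma * B * xi) / nu for xi, vi in zip(x, v)]
    # Elastic reflection along the normal direction a^{-1} x~.
    normal = [xi / ai for xi, ai in zip(x_new, a)]
    lam = -2 * dot(v_in, normal) / dot(normal, normal)
    v_new = [vi + lam * ni for vi, ni in zip(v_in, normal)]
    return x_new, v_new


if __name__ == "__main__":
    a = [4.0, 1.0]                          # hypothetical ellipse x^2/4 + y^2 = 1
    x, v = [2.0, 0.0], [-1.0, 0.5]
    for sigma in (0.0, 0.1):
        xn, vn = hooke_billiard_map(x, v, a, sigma)
        on_Q = sum(xi ** 2 / ai for xi, ai in zip(xn, a))
        print(sigma, xn, vn, on_Q)
```

For $\sigma = 0$ the step reproduces the straight-line chord of the limiting formula above. For $\sigma \ne 0$ one can check that the new point again lies on $Q$ and that the quantity $(v, v)/2 + \sigma (x, x)/2$ is unchanged by the step.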

The mapping (4.2), as well as the billiard limits of the motion on $\tilde Q$ with the higher order potentials $V_p(X_1,\dots,X_n,X_{n+1})$ ($X_{n+1} = 0$), are completely integrable.

In the limit $a_{n+1} \to 0$ and after using the change of variable (2.26), the Abel–Jacobi equations (3.3) are transformed as follows:

\[
\begin{aligned}
\sum_{k=1}^{n} \int_{\mu_0}^{\mu_k} \frac{\mu^{i-1}\, d\mu}{2\sqrt{\rho(\mu)}} &= \phi_i = \text{const}, \qquad i = 1,\dots,n-1,\\
\sum_{k=1}^{n} \int_{\mu_0}^{\mu_k} \frac{d\mu}{2\mu\sqrt{\rho(\mu)}} &= x_1 + \phi_n ,\\
\rho(\mu) &= -(\mu - a_1)\cdots(\mu - a_n)\,\bigl[ c_0(\mu - c_1)\cdots(\mu - c_{n-1}) - \sigma\mu^{n} \bigr].
\end{aligned} \tag{4.3}
\]

This system contains $n-1$ holomorphic differentials on the Riemann surface $C = \{w^2 = \rho(\mu)\}$ of genus $g = n-1$ and one differential of the third kind having a pair of simple poles $Q_-$, $Q_+$ on $C$ with $\mu(Q_\pm) = 0$. Here again $\phi_1,\dots,\phi_n$ are constant phases and $c_0,\dots,c_{n-1}$ are constants of motion. The elliptic coordinates $\mu_1,\dots,\mu_n$ represent the divisor of $n$ points $P_i = (\mu_i, w_i)$ on $C$.

Equations (4.3) describe a well defined mapping of the symmetric product $C^{(g+1)}$ to $\mathrm{Jac}(C, Q_-, Q_+)$, the $(g+1)$-dimensional generalized Jacobian of the curve $C$ with two distinguished points $Q_\pm$. The latter is obtained from the genus $n$ curve $w^2 = R(\mu)$ in (3.3) as a result of confluence of two Weierstrass points ($a_{n+1} \to 0$) and regularization: cutting out the double point and gluing $Q_-$, $Q_+$. The generalized Jacobian is a noncompact algebraic variety which is topologically equivalent to the product of the customary $g$-dimensional Jacobian variety $\mathrm{Jac}(C)$ with complex angle coordinates $\phi_1,\dots,\phi_g$ and the cylinder $\mathbb{C}^* = \mathbb{C} \setminus \{0\}$ (for the definition and description of generalized Jacobians see, among others, Serre [1959], Previato [1985], Gavrilov [1999], and Fedorov [1999]). As follows from (4.3), the geodesic and the potential billiard motion parameterized by $x_1$ is represented by a straight line flow on $\mathrm{Jac}(C, Q_-, Q_+)$, which is directed along the real section of $\mathbb{C}^*$ and leaves the coordinates $\phi$ on $\mathrm{Jac}(C)$ invariant.
As we shall see below, the solutions to the generalized inversion problem (4.3) have different structures, depending on whether R(µ) is an even or an odd order polynomial.
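The passage from (3.3) to (4.3) under the change of variable (2.26) rests on the Jacobi identities (2.27). As a sanity check, the sums entering these identities can be evaluated exactly for sample values of the $\mu$'s. The sketch below is illustrative only; the function name `jacobi_sum` and the sample values are assumptions. For $k = 0,\dots,n$ it reproduces the values $0$, $1$ and $\mu_1 + \cdots + \mu_n$. For $k = -1$ the sum evaluates to $(-1)^{n-1}/(\mu_1\cdots\mu_n)$, which agrees with the value $1/(\mu_1\cdots\mu_n)$ when $n$ is odd, as in the first sample below.

```python
from fractions import Fraction
from math import prod


def jacobi_sum(mu, k):
    """Exact value of sum_i mu_i^k / prod_{j != i} (mu_i - mu_j)."""
    total = Fraction(0)
    for i, mi in enumerate(mu):
        den = prod(mi - mj for j, mj in enumerate(mu) if j != i)
        total += Fraction(mi) ** k / den
    return total


if __name__ == "__main__":
    mu = [1, 2, 4]                      # hypothetical distinct values, n = 3
    n = len(mu)
    for k in range(-1, n + 1):
        print("k =", k, "->", jacobi_sum(mu, k))
    print("n = 2, k = -1 ->", jacobi_sum([1, 2], -1))
```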


Solutions in terms of generalized theta-functions. First we concentrate on straight line billiards corresponding to the case $\sigma = 0$, when the curve $C$ has one infinite point $\infty$. Fix a canonical basis of cycles $A_1,\dots,A_g$, $B_1,\dots,B_g$ on $C$, and let $\bar\omega_1,\dots,\bar\omega_g$ be the dual basis of normalized holomorphic differentials on $C$ and $z_1,\dots,z_g$ be the corresponding coordinates on the universal covering of $\mathrm{Jac}(C)$. There exists a unique $g \times g$ constant normalizing matrix $D$ such that

\[
\bar\omega_k = \sum_{j=1}^{g} D_{kj}\, \frac{\mu^{j-1}\, d\mu}{\sqrt{\rho(\mu)}}, \qquad
z_k = \sum_{j=1}^{g} D_{kj}\,\phi_j , \qquad k = 1,\dots,g = n-1. \tag{4.4}
\]

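Let us also introduce a normalized differential of the third kind $\Omega_0$ having simple poles at $Q_\pm$ with residues $\pm 1$ respectively:

\[
\Omega_0 = \frac{\sqrt{\rho(0)}\; d\mu}{\mu\sqrt{\rho(\mu)}} + \sum_{k=1}^{g} \beta_k \bar\omega_k , \qquad
\sqrt{\rho(0)} = \sqrt{a_1\cdots a_n \cdot c_1 \cdots c_{n-1}} , \tag{4.5}
\]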

where $\beta_k$ are unique constants such that $\Omega_0$ has zero $A$-periods on $C$. Then the last equation in (4.3) can be represented in the following form:

\[
\sum_{k=1}^{n} \int_{\mu_0}^{\mu_k} \Omega_0 = Z , \qquad Z = 2\sqrt{\rho(0)}\, x_1 + \text{const}. \tag{4.6}
\]

Notice that in the case of the ellipsoidal billiards $\sqrt{\rho(0)}$ is always real, and hence $Z$ is also real. Let us also choose the base point $(\mu_0, w_0)$ of the mapping (4.3) to be an infinite point $\infty \in C$. According to Fedorov [1999], the solution of the problem of inversion (4.3) together with (4.1) yields the following expressions for the Cartesian coordinates $X_i$ of the point moving inside the ellipsoid $Q$:

\[
X_i(x_1, z) = \kappa_i\, \frac{ e^{-Z/2}\,\theta[\Delta + \eta^{(i)}](z - q/2) + e^{Z/2}\,\theta[\Delta + \eta^{(i)}](z + q/2) }{ e^{-Z/2}\,\theta[\Delta](z - q/2) + e^{Z/2}\,\theta[\Delta](z + q/2) } , \qquad i = 1,\dots,n, \tag{4.7}
\]
\[
z = (z_1,\dots,z_{n-1})^T , \qquad Z = 2\sqrt{\rho(0)}\, x_1 + Z_0 , \qquad z, Z_0 = \text{const},
\]
\[
q = 2 \left( \int_{\infty}^{Q_+} \bar\omega_1 , \dots, \int_{\infty}^{Q_+} \bar\omega_g \right)^{T} \in \mathbb{C}^g , \qquad \kappa_i = \text{const}.
\]

These expressions involve quotients of generalized theta-functions, where $\theta[\Delta + \eta^{(i)}](z)$ and $\theta[\Delta](z)$ are customary theta-functions associated with the Riemann surface $C$ with appropriately chosen half-integer theta-characteristics $\eta^{(i)}$ ($\Delta$ is the half-integer theta-characteristic corresponding to the vector of Riemann's constants). The vector $q$ coincides with the vector of $B$-periods of the meromorphic differential $\Omega_0$. The constant factors $\kappa_i$ depend on the parameters of the curve $C$ only. (For the definition and properties of the generalized theta-functions see e.g., Belokolos et al. [1994], Gagnon et al. [1992], Ercolani [1987], and Fedorov [1999].)

The expressions (4.7) describe a straight line segment in $\mathbb{R}^n$ ($\mathbb{C}^n$) with $z$ playing the role of a constant phase vector which defines the position of the segment. When one of the $\mu$-variables, say $\mu_1$, equals zero, the corresponding point $P_1 = (\mu_1, \sqrt{\rho(\mu_1)})$ on

the curve $C$ coincides with one of the poles $Q_-$, $Q_+$ of the differential $\Omega_0$. Then, as follows from the mapping (4.3) and (4.6), $x_1$ and $Z$ become infinite. On the other hand, in view of (4.1), at this moment the moving point in $\mathbb{R}^n$ meets the ellipsoid $Q$. It follows that as $x_1$ and $Z$ change from $-\infty$ to $\infty$ along the real axis, the expressions (4.7) have finite limits, giving the coordinates of two subsequent impact points on $Q$. Notice that $X_i(\infty, z)$ have the same values as $X_i(-\infty, z + q)$. Hence the next segment of the billiard trajectory is given by (4.7) with $z$ being replaced by $z + q$. This yields the following algebraic-geometrical description of the billiard motion (see also Fedorov [1999]).

Theorem 4.1. As the point mass inside $Q$ approaches the ellipsoid, the point $P_1$ on $C$ tends to the pole $Q_+$. At the moment of impact, $P_1$ jumps from $Q_+$ back to $Q_-$, whereas the phase vector $z$ is increased by $q$ defined in formulas (4.7). The process repeats itself for each impact.

Using this property and applying induction, from (4.7) the coordinates of the whole sequence of impact points are found in the form

\[
x_i(N) = \kappa_i\, \frac{\theta[\Delta + \eta^{(i)}](z_0 + Nq)}{\theta[\Delta](z_0 + Nq)} , \qquad i = 1,\dots,n, \tag{4.8}
\]

where $N \in \mathbb{N}$ is the number of impacts and the phase vector $z_0 = (z_{10},\dots,z_{g0})^T$ is the same for all the segments of the billiard trajectory. These expressions depend on customary theta-functions only and, as functions of $z_0$, are meromorphic on a covering of the Jacobian variety of $C$. They have also been obtained by Veselov [1988] by using a factorization of matrix polynomials (see also Moser and Veselov [1991]). The work of Veselov is closely related to the discretization of mechanics that preserves the integrable structure. The numerical implementation of Veselov's procedures was given in Wendlandt and Marsden [1997], a discrete reduction procedure in Marsden, Pekarsky and Shkoller [1999], Bobenko and Suris [1999] and an extension to PDE's in Marsden, Patrick and Shkoller [1999].

The generalized Abel map (4.3) yields expressions in terms of generalized theta-functions for the elementary symmetric functions of the variables $\mu$. In particular, following Fedorov [1999], one obtains

\[
\mu_1\cdots\mu_n = \partial_{x_1}\partial_V \log\tilde\theta[\Delta](z, Z)
= 2\sqrt{\rho(0)}\;\frac{\partial}{\partial Z}\, \frac{ e^{-Z/2}\,\partial_V\theta[\Delta](z - q/2) + e^{Z/2}\,\partial_V\theta[\Delta](z + q/2) }{ e^{-Z/2}\,\theta[\Delta](z - q/2) + e^{Z/2}\,\theta[\Delta](z + q/2) } , \tag{4.9}
\]

where

\[
\tilde\theta[\Delta](z, Z) = e^{-Z/2}\,\theta[\Delta](z - q/2) + e^{Z/2}\,\theta[\Delta](z + q/2), \qquad
Z = 2\sqrt{\rho(0)}\, x_1 + Z_0 , \qquad
\partial_V = V_1 \frac{\partial}{\partial z_1} + \cdots + V_g \frac{\partial}{\partial z_g} , \tag{4.10}
\]

and where $V$ is the last column of the normalizing matrix $D$ defined in (4.4): $V = (D_{1g},\dots,D_{gg})^T$. The phases $z$ and $Z_0$ are the same as in (4.7). As follows from (4.9),

for $x_1, Z \to \pm\infty$, the product $\mu_1\cdots\mu_n$ tends to zero, as expected. Taking the integral (2.26) with $L_0 = 1$ yields

\[
x(x_1, z) = \int \mu_1\cdots\mu_n \, dx_1 = \partial_V \log\tilde\theta(z, Z) + \text{const}
= \frac{ e^{-Z/2}\,\partial_V\theta[\Delta](z - q/2) + e^{Z/2}\,\partial_V\theta[\Delta](z + q/2) }{ e^{-Z/2}\,\theta[\Delta](z - q/2) + e^{Z/2}\,\theta[\Delta](z + q/2) } + \text{const}. \tag{4.11}
\]

It follows from this expression that the original parameter $x$ has finite values as $x_1 \to \pm\infty$ and $x(\infty, z)$ has the same value as $x(-\infty, z + q)$. Now, substituting $Z = -\infty$, $Z = \infty$ in (4.11), by induction, we find the length of the $N$th segment of the billiard trajectory in the form

\[
x(N) - x(N-1) = \frac{\partial_V\theta[\Delta](z_0 + Nq)}{\theta[\Delta](z_0 + Nq)} - \frac{\partial_V\theta[\Delta](z_0 + Nq - q)}{\theta[\Delta](z_0 + Nq - q)} , \qquad N \in \mathbb{N}, \tag{4.12}
\]

$z_0$ being the same as in (4.8). As a result, the solution $X_i(x)$, $x \in \mathbb{R}$, of the continuous geodesic billiard problem should be viewed as consisting of an infinite number of pieces, each parameterized by $x_1 \in (-\infty, \infty)$ and given by (4.7) and (4.11). These pieces are obtained by iteratively adding the vector $q$ to the phase $z$ in (4.7) and (4.11), and they are glued together at the impact points corresponding to $x_1 = \pm\infty$.

Now we turn to the ellipsoidal billiard with the Hooke potential ($\sigma = 1$). In this case the curve $C$ appearing in (4.3) has two infinite points $\infty_\pm$. We again introduce normalized differentials $\bar\omega_k$, $\Omega_0$, and coordinates $z_k$, $Z$ according to (4.4) and (4.6). Let the base point of the mapping (4.3) be one of the Weierstrass points of $C$, say $\mu_0 = a_n$. Then, instead of (4.7), the inversion of the generalized mapping (4.3) yields the following expressions for the squares of the Cartesian coordinates of the mass point moving inside the ellipsoid $Q$:

\[
X_i^2(x_1, z) = \kappa_i\, \frac{ \tilde\theta^{2}[\Delta + \eta^{(i)}](z, Z) }{ \tilde\theta[\Delta](z - \hat q/2,\, Z - \hat S/2)\;\tilde\theta[\Delta](z + \hat q/2,\, Z + \hat S/2) } , \qquad i = 1,\dots,n, \tag{4.13}
\]
\[
z = (z_1,\dots,z_{n-1})^T , \qquad Z = 2\sqrt{\rho(0)}\, x_1 + Z_0 , \qquad z, Z_0 = \text{const},
\]
\[
q = \int_{Q_-}^{Q_+} (\bar\omega_1,\dots,\bar\omega_g)^T , \qquad
\hat q = 2\int_{a_n}^{\infty_+} (\bar\omega_1,\dots,\bar\omega_g)^T , \qquad
\hat S = \int_{\infty_-}^{\infty_+} \Omega_0 ,
\]

where $\tilde\theta[\Delta](z, Z)$ is defined in (4.10) and

\[
\tilde\theta[\Delta + \eta^{(i)}](z, Z) = e^{-Z/2}\,\theta[\Delta + \eta^{(i)}](z - q/2) + e^{Z/2}\,\theta[\Delta + \eta^{(i)}](z + q/2). \tag{4.14}
\]

Here $\kappa_i$ are constants, and $\sqrt{\rho(0)}$ is the same as in (4.5). Similarly to (4.7), as $x_1$ and $Z$ pass from $-\infty$ to $\infty$, $X_i^2(x_1, z)$ tend to finite values resulting in the squares of the coordinates of subsequent impact points on $Q$. Thus, expressions (4.13) describe a segment of trajectory of the billiard in the field of the Hooke potential between two

impacts. After each impact the phase vector $z$ changes according to Theorem 4.1. Then, by using induction, the sequence of impact points is described as follows:

\[
x_i^2(N) = \kappa_i\, \frac{ \theta^{2}[\Delta + \eta^{(i)}](z_0 + Nq) }{ \theta[\Delta](z_0 - \hat q + Nq)\;\theta[\Delta](z_0 + \hat q + Nq) } , \qquad N \in \mathbb{N}, \quad i = 1,\dots,n, \tag{4.15}
\]
\[
z_0 = (z_{10},\dots,z_{g0})^T = \text{const}.
\]

Apparently, this theta-functional solution for the billiard with the Hooke potential was not previously known. Lastly, we find the following expression for $x$:

\[
x(x_1, z) = \text{const} + \log \frac{ \tilde\theta[\Delta](z - \hat q/2,\, Z - \hat S/2) }{ \tilde\theta[\Delta](z + \hat q/2,\, Z + \hat S/2) } , \qquad Z = 2\sqrt{\rho(0)}\, x_1 + \text{const}, \tag{4.16}
\]

which, for $x_1 \to \pm\infty$ and $Z \to \pm\infty$, has finite limits determining $x$ for two subsequent impacts. Then, using the expression (4.10), by induction, we express the $x$-interval between the impacts in terms of the customary theta-function:

\[
x(N) - x(N-1) = \log \frac{ \theta[\Delta](z_0 - \hat q/2 + Nq) }{ \theta[\Delta](z_0 + \hat q/2 + Nq) }
- \log \frac{ \theta[\Delta](z_0 - \hat q/2 + Nq - q) }{ \theta[\Delta](z_0 + \hat q/2 + Nq - q) } - \log \hat S . \tag{4.17}
\]
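The gluing rule used to pass from the segment formulas (4.7), (4.13) to the impact sequences (4.8), (4.15) only relies on the limits of the generalized theta-functions (4.10), (4.14) as $Z \to \pm\infty$. It can be illustrated by a toy computation in genus one. The following sketch rests on several assumptions that are not taken from the text: the theta-series convention $\sum_m \exp\{B(m+\alpha)^2/2 + (m+\alpha)z\}$ with real $B < 0$, the use of $\alpha = 1/2$ as a stand-in for a shifted characteristic, and arbitrary sample values of $B$, $q$ and the phase. The numbers are not derived from an actual billiard; the point is only that the end of one segment ($Z \to +\infty$) coincides with the start of the next one ($Z \to -\infty$, phase shifted by $q$) and with a quotient of customary theta-functions.

```python
import math


def theta(z, B, alpha=0.0, terms=40):
    """Truncated genus-one theta series sum_m exp(B (m+alpha)^2 / 2 + (m+alpha) z), B < 0."""
    return sum(math.exp(0.5 * B * (m + alpha) ** 2 + (m + alpha) * z)
               for m in range(-terms, terms + 1))


def theta_tilde(z, Z, q, B, alpha=0.0):
    """Generalized theta-function of the form (4.10) / (4.14)."""
    return (math.exp(-Z / 2) * theta(z - q / 2, B, alpha)
            + math.exp(Z / 2) * theta(z + q / 2, B, alpha))


def segment_coordinate(z, Z, q, B, kappa=1.0):
    """Toy analogue of (4.7): quotient of two generalized theta-functions."""
    return kappa * theta_tilde(z, Z, q, B, 0.5) / theta_tilde(z, Z, q, B, 0.0)


def impact_coordinate(z0, N, q, B, kappa=1.0):
    """Toy analogue of (4.8): quotient of customary theta-functions at z0 + N q."""
    w = z0 + N * q
    return kappa * theta(w, B, 0.5) / theta(w, B, 0.0)


if __name__ == "__main__":
    B, q, zs = -2.0, 0.7, 0.3            # hypothetical sample parameters
    big = 80.0                           # stands in for Z -> +infinity
    for N in (1, 2, 3):
        end_of_segment = segment_coordinate(zs + (N - 1) * q, big, q, B)
        start_of_next = segment_coordinate(zs + N * q, -big, q, B)
        print(N, end_of_segment, start_of_next, impact_coordinate(zs - q / 2, N, q, B))
```

With real $B < 0$ and real arguments all terms of both series are positive, so the denominators never vanish in this toy setting; for an actual curve $C$ the period matrix, characteristics and the vector $q$ would have to be computed from (4.4) and (4.7).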

We emphasize that, in contrast to the geodesic billiard, for the billiard in the potential field the "time" $x$ is not proportional to the length of a trajectory.

Stationary finite-gap peaked solutions. Now we return to the finite-gap solutions of equations (HD) and (SW). Notice that under the limit $m_1 \to 0$ the mapping (2.28) takes the form (4.3) with $\rho(\mu)$ being a polynomial of degree $2n-1$ and $2n$ respectively. The trace formula (2.22) and relations (4.1) yield

\[
U = \sum_{j=1}^{n} X_j^2 + \sum_{i=1}^{n} a_i + m .
\]

Then the solutions (4.7)–(4.17) of the billiard problems provide solutions $U(x, t_0)$ for the above equations which consist of infinite sequences of smooth pieces, each one corresponding to a segment between two impacts. The impacts themselves give peaks of $U(x, t_0)$. This leads to the following theorem.

Theorem 4.2. 1) At any fixed time $t = t_0$, a finite-gap peaked solution of the equation (HD) consists of an infinite number of pieces $U_N(x, t_0)$, $N \in \mathbb{Z}$, glued at peak points. Let $\rho(\mu)$ be any polynomial with distinct roots $a_1,\dots,a_n$. Then, for any $N$, every piece is given by the following pair of theta-functional expressions parameterized by $x_1 \in \mathbb{R}$,

\[
U_N = \sum_{j=1}^{n} X_j^2(x_1, z_N) + \sum_{i=1}^{n} a_i , \tag{4.18}
\]
\[
x(x_1, z) = \frac{ e^{-Z/2}\,\partial_V\theta[\Delta](z_N - q/2) + e^{Z/2}\,\partial_V\theta[\Delta](z_N + q/2) }{ e^{-Z/2}\,\theta[\Delta](z_N - q/2) + e^{Z/2}\,\theta[\Delta](z_N + q/2) } + x_0 , \tag{4.19}
\]
\[
z_N = z_0 + Nq \in \mathbb{C}^{n-1} , \qquad Z = 2\sqrt{\rho(0)}\, x_1 + Z_0 ,
\]

where $X_j^2(x_1, z)$ and $q$ are given by (4.7) and $z_0$, $Z_0$, $x_0$ are constant phases of the solution depending on $t_0$, which are the same for any piece. The length of the $N$th piece equals

\[
\frac{\partial_V\theta[\Delta](z_0 + Nq)}{\theta[\Delta](z_0 + Nq)} - \frac{\partial_V\theta[\Delta](z_0 + Nq - q)}{\theta[\Delta](z_0 + Nq - q)} . \tag{4.20}
\]

2) At any fixed time $t = t_0$, a finite-gap peaked solution of equation (SW) consists of an infinite number of pieces $U_N(x, t_0)$, $N \in \mathbb{Z}$, which are glued at peak points. The pieces are given in the following parametric form:

\[
U_N = \sum_{j=1}^{n} X_j^2(x_1, z_N) + \sum_{i=1}^{n} a_i + m , \qquad z_N = z_0 + Nq \in \mathbb{C}^{n-1} , \tag{4.21}
\]
\[
x(x_1, t_0) = \log \frac{ \tilde\theta[\Delta](z_N - \hat q/2,\, Z - \hat S/2) }{ \tilde\theta[\Delta](z_N + \hat q/2,\, Z + \hat S/2) } + x_0 , \qquad Z = 2\sqrt{\rho(0)}\, x_1 + Z_0 , \tag{4.22}
\]

where $X_j^2(x_1, z)$ are given by (4.13) and $z_0$, $Z_0$, $x_0$ are constant phases which depend on $t_0$. The $x$-length of the $N$th piece equals

\[
\log \frac{ \theta[\Delta](z_0 - \hat q/2 + Nq) }{ \theta[\Delta](z_0 + \hat q/2 + Nq) }
- \log \frac{ \theta[\Delta](z_0 - \hat q/2 + Nq - q) }{ \theta[\Delta](z_0 + \hat q/2 + Nq - q) } - \log \hat S . \tag{4.23}
\]

When in the polynomials (2.13) or (2.14) $m_1 = 0$ and $m_2$ tends to zero, the distance between subsequent peaks of a profile tends to zero and in the limit the peaks coalesce. (Notice that this is done for a fixed $t$.) The solution $U(x, t_0)$ for this limiting case is smooth.

Remark. It is known (see, for instance, Fedorov [1999]) that there are special degenerate umbilic billiard solutions of the classical billiard problem (without a potential) that have straight line segments meeting $n-1$ fixed focal conics of $Q$ between any subsequent impacts and, as $x \to \pm\infty$, the billiard motion converges to simple oscillations along the largest axis of the ellipsoid. This corresponds to the confluence of the roots of the polynomial $\rho(\mu)$ in (4.3),

\[
c_1 = a_1 , \quad \dots , \quad c_{n-1} = a_{n-1} .
\]

As a result, the hyperelliptic curve $C$ becomes singular of arithmetic genus zero and the asymptotic billiard motion is described in terms of tau-functions. The corresponding asymptotic peaked solutions of equations (HD) and (SW) are given in Alber and Fedorov [2001].

Time-dependent piecewise-meromorphic solutions. Now we pass to the global algebraic geometrical description of the finite-gap peaked solutions. After setting $m_1 \to 0$, the system (2.25) is formally reduced to the following Abel–Jacobi mapping:

\[
\int_{\mu_0}^{\mu_1} \frac{\mu^{k-1}\, d\mu}{2\sqrt{\rho(\mu)}} + \cdots + \int_{\mu_0}^{\mu_n} \frac{\mu^{k-1}\, d\mu}{2\sqrt{\rho(\mu)}} =
\begin{cases}
t_k + \phi_k , & k = 1,\dots,n-1,\\
x + \phi_n , & k = n,
\end{cases} \tag{4.24}
\]

where

\[
\rho(\mu) = -L_0^2 \prod_{r=2}^{2n} (\mu - m_r) \qquad\text{and}\qquad \rho(\mu) = \prod_{r=2}^{2n+1} (\mu - m_r)
\]

in the case of equations (HD) or (SW) respectively. Here $\phi_1,\dots,\phi_n$ are constant phases. This system contains $n-1$ independent holomorphic differentials defined on the genus $g = n-1$ Riemann surface $\{w^2 = \rho(\mu)\}$, which can be identified with the curve $C$ described above. However, in contrast to the system (4.3), in the case of a polynomial $\rho(\mu)$ of odd order, which corresponds to equation (HD), the last equation in (4.24) contains a meromorphic differential of the second kind having a double pole at the infinite point $\infty$ on $C$. In the case of a polynomial $\rho(\mu)$ of even order, corresponding to equation (SW), the last equation includes a meromorphic differential of the third kind with a pair of simple poles at the infinite points $\infty_-$, $\infty_+$ on $C$.

According to Clebsch and Gordon [1866] and Gavrilov [1999], in the odd order case such a system describes a well defined and invertible mapping of the symmetric product $C^{(g+1)}$ to $\mathrm{Jac}(C, \infty)$, the generalized Jacobian of the curve $C$ with one distinguished point at $\infty$. The set $\mathrm{Jac}(C, \infty)$ is a noncompact algebraic variety which is topologically equivalent to the product $\mathrm{Jac}(C) \times \mathbb{C}$. To describe this case we introduce a normalized differential of the second kind having a double pole at $\infty$,

\[
\Omega_\infty^{(1)} = \frac{\sqrt{-1}\, L_0\, \mu^{g}\, d\mu}{2\sqrt{\rho(\mu)}} + \sum_{k=1}^{g} d_k \bar\omega_k , \qquad g = n-1, \tag{4.25}
\]

where $\bar\omega_k$ are the normalized holomorphic differentials specified in (4.4) and $d_k$ are normalizing constants such that all $A$-periods of $\Omega_\infty^{(1)}$ on $C$ are zero. Then the last equation in (4.24) implies that

\[
\sum_{i=1}^{n} \int_{\mu_0}^{\mu_i} \Omega_\infty^{(1)} = Z , \qquad Z = \sqrt{-1}\, L_0\, x + (d, Dt) + \text{const}, \tag{4.26}
\]
\[
d = (d_1,\dots,d_{n-1})^T , \qquad t = (t_n,\dots,t_2)^T ,
\]

where $D$ is the $(n-1) \times (n-1)$ normalizing matrix defined in (4.4).

Since $\infty$ is now a pole of $\Omega_\infty^{(1)}$, we choose the base point $P_0 = (\mu_0, w_0)$ to be a finite Weierstrass point on $C$. For concreteness we choose $P_0 = (m_{2n}, 0)$. Applying the residue theorem to the generalized theta-function associated with $\mathrm{Jac}(C, \infty)$, we solve the inversion problem (4.24) and find the following expression:

\[
\sum_{i=1}^{n} \mu_i = C_1 - Z^2 + \frac{ 2Z\,\partial_V\theta[\Delta + \eta_{2n}](z) - \partial_V^2\theta[\Delta + \eta_{2n}](z) }{ \theta[\Delta + \eta_{2n}](z) } , \tag{4.27}
\]
\[
Z = \sqrt{-1}\, L_0\, x + (d, Dt) + Z_0 , \qquad z = Dt + z_0 \in \mathbb{C}^{n-1} ,
\]
\[
C_1 = \sum_{k=1}^{g} \oint_{A_k} \mu\, \bar\omega_k + m_{2n} , \qquad Z_0, z_0 = \text{const},
\]

where the half-integer characteristic $\eta_{2n}$ labels the point $(m_{2n}, 0)$, the vector $V = (D_{1g},\dots,D_{gg})^T$ is specified in (4.4), and the constant $C_1$ contains the sum of integrals along the canonical cycles $A_1,\dots,A_g$ on $C$. Notice that in the above formula $\partial_V = \partial_{t_2}$. Expression (4.27) is meromorphic in $x$ and $t_1,\dots,t_{n-1}$ and can be regarded as a generalization of the Matveev–Its formula to the case of the noncompact variety $\mathrm{Jac}(C, \infty)$.

In the case of an even order curve $C$, corresponding to finite-gap peaked solutions of equation (SW), system (4.24) defines a mapping of the symmetric product $C^{(g+1)}$ to the generalized Jacobian $\mathrm{Jac}(C, \infty_\pm)$, which is topologically equivalent to the product $\mathrm{Jac}(C) \times \mathbb{C}^*$. As above, we set $P_0$ to be the last Weierstrass point $(m_{2n+1}, 0)$ and introduce the normalized differential of the third kind having a pair of simple poles at $\infty_-, \infty_+ \in C$, as well as the corresponding variable $Z$:

\[
\Omega_{\infty_\pm} = \frac{\mu^{g}\, d\mu}{2\sqrt{\rho(\mu)}} + \sum_{k=1}^{g} \bar d_k \bar\omega_k , \qquad
Z = \sum_{i=1}^{n} \int_{m_{2n+1}}^{\mu_i} \Omega_{\infty_\pm} , \tag{4.28}
\]

where (d¯1 , . . . , d¯g ) = d¯ are chosen such that all the A-periods of ?∞± are zeros. Then, applying the residue theorem to the generalized theta-function associated with the Jac(C, ∞± ) yields n

ˆ + eZ θ [D](z + q) ˆ e−Z θ[D](z − q) , θ [D](z)

(4.29)

¯ Dt) + Z0 , z = Dt + z0 ∈ Cg , Z = x + (d, ∞+ ∞+ T qˆ = ω¯ 1 , . . . , ω¯ g ∈ Cg , Z0 , z0 = const.

(4.30)

µi + m = const −

i=1

where, in view of (4.28),

∞−

∞−

Remark. According to the formula (2.22), expressions (4.27) and (4.29) describe formal solutions to equations (HD) and (SW) respectively. However, while treating these solutions, one needs to take into account the reflection phenomenon described in Theorem 4.1. Namely, when a certain variable $\mu_i$ passes zero, the point $P_i = (\mu_i, \sqrt{\rho(\mu_i)})$ jumps from one sheet of the Riemann surface $C$ to another or, in other words, from the pole $Q_+$ of the differential of the third kind $\Omega_0$ to another pole $Q_-$. Therefore, the above expressions do not provide global solutions to the equations. Instead, the following theorem holds.

Theorem 4.3. 1) The time-dependent finite-gap peaked solution $U(x, t)$ of (HD) consists of an infinite number of pieces in $\mathbb{R}^n = (t_1, \dots, t_{n-1}, x)$ described by meromorphic functions

\[ U_N(x, t) = C_1 - Z_N^2 + \frac{2 Z_N\,\partial_V \theta[D + \eta_{2n}](z_N) - \partial_V^2 \theta[D + \eta_{2n}](z_N)}{\theta[D + \eta_{2n}](z_N)}, \qquad N \in \mathbb{Z}, \]
\[ z_N = Dt + N q + z_0, \qquad Z_N = \sqrt{-1}\,L_0\,x + (d, z_N) + N h + Z_0, \qquad Z_0, z_0 = \mathrm{const}, \qquad t = (t_1, \dots, t_{n-1}), \tag{4.31} \]
\[ h = \int_{Q_-}^{Q_+} \Omega^{(1)}_{\infty}, \qquad q = \Bigl( \int_{Q_-}^{Q_+} \bar\omega_1, \dots, \int_{Q_-}^{Q_+} \bar\omega_g \Bigr)^T, \]

where $C_1$ is the constant specified in (4.27).


For a fixed $N$ the corresponding piece $U_N(x, t)$ is bounded by nonintersecting surfaces $S_{N-1}$ and $S_N$ in $\mathbb{R}^n$ given by the equations

\[ S_N = \{x = p_N(t)\}, \qquad p_N(t) = \frac{1}{\sqrt{-1}\,L_0} \bigl( \partial_V \log \theta[D + \eta_{2n}](z_N + q/2) - (d, z_N) - N h \bigr). \tag{4.32} \]

The adjacent pieces $U_N(x, t)$ and $U_{N+1}(x, t)$ are thus glued to each other along $S_N$, where

\[ U(p_N(t), t) = C_1 - \partial_V^2 \log \theta[D + \eta_{2n}](z_N + q/2). \tag{4.33} \]

2) The finite-gap peaked solution $U(x, t)$ of (SW) consists of an infinite number of pieces given by meromorphic functions

\[ U_N(x, t) = \mathrm{const} - \frac{e^{-Z_N}\,\theta[D](z_N - \hat q) + e^{Z_N}\,\theta[D](z_N + \hat q)}{\theta[D](z_N)}, \qquad N \in \mathbb{Z}, \]
\[ z_N = Dt + q N + z_0, \qquad Z_N = x + (\bar d, z_N) + N \bar h + Z_0, \qquad \bar h = \int_{Q_-}^{Q_+} \Omega_{\infty\pm}, \qquad t = (t_n, \dots, t_2), \qquad Z_0, z_0 = \mathrm{const}, \tag{4.34} \]

where the vector $\hat q$ is described in (4.30). The piece $U_N(x, t)$ is bounded by peak surfaces $\bar S_{N-1}$ and $\bar S_N$ defined as follows:

\[ \bar S_N = \{x = \bar p_N(t)\}, \qquad \bar p_N(t) = \mathrm{const} - \log \frac{\theta[D](z_N - \hat q + q/2)}{\theta[D](z_N + \hat q + q/2)}. \tag{4.35} \]

The adjacent pieces $U_N(x, t)$ and $U_{N+1}(x, t)$ are glued together along $\bar S_N$, where

\[ U(p_N(t), t) = \mathrm{const} - \partial_V \log \frac{\theta[D](z_N - \hat q)}{\theta[D](z_N + \hat q)}. \tag{4.36} \]

Notice that along the peak surfaces, the solutions described in 1) and 2) have discontinuous partial derivatives with respect to $x$ and $t_1, \dots, t_{n-1}$.

Remark. By fixing all the times but $t_k$ in the above expressions, one obtains 2-dimensional piecewise solutions $U_N(x, t_k)$, whereas the corresponding sections of $S_N, \bar S_N \subset (x, t) = \mathbb{R}^n$ describe peak lines in the $(x, t_k)$-plane. As follows from (4.32) and (4.35), the motion of the $N$th peak $p_N(t_k)$ along the $x$-axis is described by the sum of a function linear in $t_k$ and a quasi-periodic one. The latter function becomes periodic in the case $g = 1$. Finally, after fixing all the times without exception, expressions (4.31) and (4.34) provide pieces of the stationary finite-gap peaked solution already described in Theorem 4.2.

Proof of Theorem 4.3. According to Theorem 4.2, the profiles of finite-gap peaked solutions are associated with geodesic ellipsoidal billiards and billiards in the field of a Hooke potential. An impact point on the boundary of a billiard trajectory corresponds to a peak of the profile $U(x, t_0)$, and this happens when one of the $\mu_i$ passes zero. Hence, the solution (4.27) is valid until one of the points $P_1, \dots, P_n$ on $C$ coincides with $Q_-$ or $Q_+$, the poles of the differential $\Omega_0$ in (4.5). Putting, for example, $P_n \equiv Q_+$ ($\mu_n \equiv 0$)


in (4.24), one arrives at the following relations involving the normalized differentials defined in (4.4) and (4.25):

\[ \sum_{i=1}^{g} \int_{P_0}^{\mu_i} \bar\omega_k = z_k - q_k/2, \qquad k = 1, \dots, g, \tag{4.37} \]
\[ \sum_{i=1}^{g} \int_{P_0}^{\mu_i} \Bigl( \Omega^{(1)}_{\infty} - \sum_{k=1}^{g} d_k\,\bar\omega_k \Bigr) = \sqrt{-1}\,L_0\,x - \int_{P_0}^{Q_+} \Bigl( \Omega^{(1)}_{\infty} - \sum_{k=1}^{g} d_k\,\bar\omega_k \Bigr), \tag{4.38} \]

where $P_0 = (m_{2n}, 0)$. Notice that Eqs. (4.37) form a closed system for the variables $\mu_1, \dots, \mu_{n-1}$ and describe the standard Abel–Jacobi mapping $C^{(g)} \to \mathrm{Jac}(C)$. Hence, the first symmetric polynomial has the following standard form in terms of theta-functions in the odd order case:

\[ \mu_1 + \cdots + \mu_{n-1} = c_1 - \partial_V^2 \log \theta[D + \eta_{2n}](z - q/2), \qquad c_1 = \sum_{k=1}^{g} \oint_{A_k} \mu\,\bar\omega_k. \tag{4.39} \]

On the other hand, Eq. (4.38) implies that at a peak point the coordinate $x$ becomes a function of $z$ and therefore of $t$: $x = p_0(t)$. In the odd order case, this equation contains a sum of Abelian integrals of the second kind, the so-called Abelian transcendent. By making use of the following standard expression for the normalized transcendent (Clebsch and Gordon [1866])

\[ \sum_{i=1}^{g} \int_{\mu_0}^{\mu_i} \Omega^{(1)}_{\infty} = -\partial_V \log \theta[D + \eta_{2n}] \Bigl( \sum_{i=1}^{g} \int_{P_0}^{\mu_i} \bar\omega \Bigr), \]

from (4.37) and (4.38) we find

\[ p_0(t) = \frac{1}{\sqrt{-1}\,L_0} \bigl( \partial_V \log \theta[D + \eta_{2n}](z - q/2) - (d, z) + h/2 \bigr), \qquad h = \int_{Q_-}^{Q_+} \Omega^{(1)}_{\infty}. \tag{4.40} \]

Using the trace formula for the solution $U(p_0(t), t) = \mu_1 + \cdots + \mu_{n-1}$ and expression (4.39), it follows that the equation $x = p_0(t)$ determines a surface $S_0$ in $\mathbb{C}^n$ along which the solution $U$ has a peak. Now setting $P_n \equiv Q_-$ in (4.24) and taking into account (4.4), (4.25) and

\[ \int_{P_0}^{Q_-} \Omega^{(1)}_{\infty} = -\int_{P_0}^{Q_+} \Omega^{(1)}_{\infty}, \qquad \int_{P_0}^{Q_-} \bar\omega = -\int_{P_0}^{Q_+} \bar\omega, \]

we obtain an expression for another peak surface $S_1$ determined by the equation $\{x = p_1(t)\}$ with

\[ p_1(t) = \frac{1}{\sqrt{-1}\,L_0} \bigl( \partial_V \log \theta[D + \eta_{2n}](z + q/2) - (d, z) - h/2 \bigr), \tag{4.41} \]

along which

\[ U(p_1(t), t) = \mu_1 + \cdots + \mu_{n-1} = C_1 - \partial_V^2 \log \theta[D + \eta_{2n}](z + q/2). \tag{4.42} \]


Under the reality condition, the surfaces $S_0$ and $S_1$ do not intersect and therefore determine a connected domain in $\mathbb{C}^n = (x, t)$ where the solution (4.27) is applicable. We denote this piece of the solution by $U_1(x, t)$. As follows from (4.40) and (4.41), $S_1$ is obtained from $S_0$ by changing the phase as follows:

\[ Z \to Z + h, \qquad z \to z + q, \qquad \text{that is,} \qquad x \to x + \frac{1}{\sqrt{-1}\,L_0} \bigl( h - (d, Dt) \bigr), \qquad t \to t + D^{-1} q. \tag{4.43} \]

In addition, according to (4.42) and (4.39), at any two points on $S_0$ and $S_1$ which are equivalent modulo the shift, $U_1(x, t)$ takes the same values:

\[ U_1(q_1(t), t) = U_1\Bigl( q_0(t) + \frac{1}{\sqrt{-1}\,L_0} \bigl( h - (d, Dt) \bigr),\; t + D^{-1} q \Bigr). \tag{4.44} \]

Now let us define the function $U_2(x, t) = U_1\bigl( x + (h - (d, Dt))/(\sqrt{-1}\,L_0),\; t + D^{-1} q \bigr)$, which is also a local solution to (HD). In view of (4.44), $U_1$ and $U_2$ take the same values along $S_1$, which ensures a correct gluing of the two pieces. By iterating with respect to both positive and negative $N$, we construct a complete sequence of peak surfaces and obtain the formulae given in part 1) of the theorem.

Similarly, solution (4.29) of (SW) is valid until one of the points $P_1, \dots, P_n$ on $C$ coincides with $Q_-$ or $Q_+$, the poles of $\Omega_0$. Setting $P_n \equiv Q_+$ in (4.24) for the case of an even order curve $C$, and using (4.4) and (4.28), yields

\[ \sum_{i=1}^{n-1} \int_{P_0}^{\mu_i} \bar\omega_k = z_k - q_k/2, \qquad k = 1, \dots, n-1, \tag{4.45} \]
\[ \sum_{i=1}^{n-1} \int_{P_0}^{\mu_i} \Bigl( \Omega_{\infty\pm} - \sum_{k=1}^{n-1} \bar d_k\,\bar\omega_k \Bigr) = x - \int_{P_0}^{Q_+} \Bigl( \Omega_{\infty\pm} - \sum_{k=1}^{n-1} \bar d_k\,\bar\omega_k \Bigr), \tag{4.46} \]

where $P_0 = (m_{2n+1}, 0)$. Inverting (4.45) results in the following expression for a symmetric polynomial (see e.g., Clebsch and Gordon [1866]):

\[ \mu_1 + \cdots + \mu_{n-1} = \mathrm{const} - \partial_V \log \frac{\theta[D](z - \hat q - q/2)}{\theta[D](z + \hat q - q/2)}. \tag{4.47} \]

After applying the theta-functional formula for the normalized transcendent of the third kind (Clebsch and Gordon [1866]),

\[ \sum_{i=1}^{g} \int_{P_0}^{\mu_i} \Omega_{\infty\pm} = \mathrm{const} - \log \frac{\theta[D](s - \hat q)}{\theta[D](s + \hat q)}, \qquad s = \sum_{i=1}^{g} \int_{P_0}^{\mu_i} \bar\omega, \qquad g = n-1, \]

from (4.46) and (4.45) we obtain

\[ x = p_0(t) = \mathrm{const} - \log \frac{\theta[D](z - \hat q + q/2)}{\theta[D](z + \hat q + q/2)} - (z, \bar d) + \bar h/2. \tag{4.48} \]

By choosing $P_n \equiv Q_-$ in (4.24), one arrives at the expressions (4.47) and (4.48) with $q/2$, $\bar h/2$ replaced by $-q/2$, $-\bar h/2$. Then, following similar arguments and applying induction, the piecewise solution of part 2) is constructed.


We emphasize that although the different pieces $U_N(x, t)$ of the solution are obtained by iteratively shifting the phases $z$, $Z$ by the same vector, the pieces $U_N(x, t_0)$ of the solution ($t_0$ being fixed) are all distinct because the shift occurs in both the $x$- and $t$-directions.

Remark. If we omit the reality condition above, then the hypersurfaces $S_N, \bar S_N$ in $\mathbb{C}^n$ intersect, bounding a set of $n$-dimensional domains adjacent to each other in a rather complicated manner. Then the procedure of gluing different pieces of the functions $U_N(x, t)$, meromorphic inside each domain, cannot be defined uniquely. As a result, the generic complex solution $U(x, t)$ branches along the peak surfaces.

5. Kinematics of Peaks

Now we obtain expressions for the velocity of the $N$th peak $p_N(t)$ of the piecewise solution of (HD) with respect to time $t_k$. As was shown above, the solution has a peak when one of the $\mu$-variables passes zero, implying that $P_n = Q_-$ or $P_n = Q_+$.

Theorem 5.1. Let $y_1, \dots, y_{n-1}$ denote the $\mu$-coordinates of the points $P_1, \dots, P_{n-1}$ at the moment in time when one of the $\mu$-variables passes zero. The following system of equations holds:

\[ \frac{\partial p_N(t)}{\partial t_k} = -\Sigma_{k-1}(y_1, \dots, y_{n-1}), \tag{5.1} \]

where $\Sigma_k$ is the $k$th symmetric function of $y_1, \dots, y_{n-1}$. In particular, we have

\[ \frac{\partial p_N(t)}{\partial t_2} = y_1 + \cdots + y_{n-1} = U(p_N(t), t), \tag{5.2} \]

i.e., the $t_2$-velocity of the peak coincides with its height.

Proof. After applying the limit $m_1 \to 0$, Eqs. (2.12) and (2.23) for the derivatives of $\mu_n$ take the form

\[ \frac{\partial \mu_n}{\partial x} = \frac{\sqrt{\rho(\mu_n)}}{(\mu_n - \mu_1) \cdots (\mu_n - \mu_{n-1})}, \tag{5.3} \]
\[ \frac{\partial \mu_n}{\partial t_k} = \Sigma_{k-1}(\mu_1, \dots, \mu_{n-1})\, \frac{\sqrt{\rho(\mu_n)}}{(\mu_n - \mu_1) \cdots (\mu_n - \mu_{n-1})}. \tag{5.4} \]

On the other hand, along the peak line $\{x = p_N(t_k)\}$, we have

\[ \frac{d}{d t_k}\, \mu_n(p_N(t_k), t_k) \equiv \frac{\partial \mu_n}{\partial x}\, \frac{d p_N(t_k)}{d t_k} + \frac{\partial \mu_n}{\partial t_k} = 0, \]

which, in view of (5.3) and (5.4) and after setting $\mu_n \equiv 0$, yields

\[ \frac{\sqrt{\rho(0)}}{\mu_1 \cdots \mu_{n-1}} \Bigl( \frac{\partial p_N}{\partial t_k} + \Sigma_{k-1}(y_1, \dots, y_{n-1}) \Bigr) = 0. \]

Since $\rho(0) \neq 0$ and $\mu_1 = y_1, \dots, \mu_{n-1} = y_{n-1}$ are finite, the latter relation gives (5.1).
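The symmetric functions entering (5.1) can be made concrete with a few lines of Python. This is only a minimal sketch under one assumption: following the remark below, which identifies them with the coefficients of the polynomial $(\lambda - \mu_1) \cdots (\lambda - \mu_g)$, the $j$th function is taken to be the signed coefficient of $\lambda^{g-j}$ in $\prod_i (\lambda - y_i)$. With that convention, minus the first coefficient is $y_1 + \cdots + y_{n-1}$, which is the peak velocity in (5.2). The function names are illustrative and do not come from the paper.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(ys, j):
    """e_j(ys): sum of the products over all j-element subsets (e_0 = 1)."""
    return sum(prod(c) for c in combinations(ys, j))


def monic_coefficients(ys):
    """Coefficients [c_0, ..., c_g] with prod_i (lam - y_i) = sum_j c_j * lam**(g - j)."""
    coeffs = [1]
    for y in ys:
        # Multiply the current polynomial by (lam - y): new c_j = c_j - y * c_{j-1}.
        coeffs = [a - y * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


ys = [1, 2, 3]                      # sample values of y_1, ..., y_{n-1} (assumed data)
coeffs = monic_coefficients(ys)     # [1, -6, 11, -6]
# Assumed sign convention: c_j = (-1)**j * e_j(ys)
signed = [(-1) ** j * elementary_symmetric(ys, j) for j in range(len(ys) + 1)]
t2_velocity = -coeffs[1]            # equals y_1 + ... + y_{n-1}
print(coeffs, signed, t2_velocity)
```

Building the coefficients by repeated multiplication avoids enumerating subsets and so scales linearly in the number of factors per step, while the subset formula is kept as an independent cross-check.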


Remark. The relations (5.1) can also be found by direct differentiation of the expression for the $N$th peak surface (4.35) with respect to $t_k$. Namely, putting without loss of generality $N = 0$, and taking into account $\partial_V = \partial_{t_2}$, we write

\[ \frac{\partial p_0(t)}{\partial t_k} = \partial_{t_k} \partial_{t_2} \log \theta[D + \eta_{2n}] \Bigl( \sum_{i=1}^{n-1} \int_{P_0}^{\mu_i} \bar\omega \Bigr). \tag{5.5} \]

According to Mumford [1983], in the case of odd order hyperelliptic curves, this gives a theta-functional expression for the coefficient in front of $\lambda^{n-k}$ in the polynomial $(\lambda - \mu_1) \cdots (\lambda - \mu_g)$, which coincides with $\Sigma_{k-1}(y_1, \dots, y_{n-1})$.

6. The Dynamics of Peaks and Weak Solutions

Expression (5.2) states that for equations in the hierarchies of (HD) or (SW), every peak in the solution profile moves with velocity determined by the local value of the solution. In this section, we derive this property without recourse to tools related to the complete integrability of the evolution equation. Thus, this property of peak motion can hold in general for equations that admit piecewise-smooth weak solutions, with jumps in the first spatial derivative at isolated points in the solution's support. In this case, the derivative discontinuity can be viewed as a "shock" in the appropriate weak form of the evolution equation. We will take the weak form of the equation (HD) or (SW) to be

\[ \int_{\Omega} \nabla\phi(x, t) \cdot V(x, t)\, dx\, dt = 0, \tag{6.1} \]

where the equality is satisfied for all test functions $\phi(x, t)$ that are $C^\infty$ with compact support in a domain $\Omega$ in the $(x, t)$ plane. Here $\nabla\phi = (\phi_t, \phi_x)$, the dot denotes the $\mathbb{R}^2$ inner product, and the vector function $V(x, t) = (V_1, V_2)$ is defined by

\[ V_1 = U_x, \qquad V_2 = \partial_x \Bigl( \frac{1}{2} U^2 - \frac{1}{4} \int_{-\infty}^{\infty} |x - y|\, (U_y^2 - 2\kappa U)\, dy \Bigr), \tag{6.2} \]

for equation (HD) and

\[ V_1 = U_x, \qquad V_2 = \partial_x \Bigl( \frac{1}{2} U^2 + \frac{1}{4} \int_{-\infty}^{\infty} e^{-|x - y|}\, (2U^2 + U_y^2 - 2\kappa U)\, dy \Bigr), \tag{6.3} \]

for equation (SW), respectively.

We will look for jump conditions satisfied by the solutions of Eq. (6.1). If the jump discontinuities are isolated, by adjusting the support of the test functions $\phi(x, t)$ we only need to consider the case of a single discontinuity. Let us suppose that the function $U(x, t)$ is infinitely differentiable almost everywhere in $\Omega$, except along the curve $x = q(t)$ where the first derivative $U_x$ has a discontinuity. If we partition the domain $\Omega$ into $\Omega = \Omega_1 \cup \Omega_2$ by cutting along the portion of the


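discontinuity curve $x - q(t) = 0$ in $\Omega$, the divergence theorem and the choice of test functions $\phi(x, t)$ vanishing on the boundary $\partial\Omega$ allow us to write Eq. (6.1) as

\[ 0 = \int_{\Omega} dx\, dt\, \nabla\phi \cdot V = \int_{\Omega_1} dx\, dt\, \phi\, \nabla \cdot V + \int_{\Omega_2} dx\, dt\, \phi\, \nabla \cdot V + \int_{\partial\Omega_1 \cap \partial\Omega_2} dl\, \phi\, n \cdot [V]^+_-. \tag{6.4} \]

Here the unit vector $n$ is directed along the normal $[-\dot q, 1]$ to the discontinuity curve $\partial\Omega_1 \cap \partial\Omega_2$ in $\Omega$, and $[V]^+_-$ denotes the jump of the vector $V$ across this curve,

\[ [V]^+_- \equiv \lim_{x \to q(t)^+} V(x, t) - \lim_{x \to q(t)^-} V(x, t). \]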

By the arbitrariness of $\phi(x, t)$, each integrand term on the right-hand side of (6.4) has to vanish separately. Thus, from the first two terms,

\[ \nabla \cdot V = 0, \qquad \text{or} \qquad \frac{\partial V_1}{\partial t} + \frac{\partial V_2}{\partial x} = 0, \tag{6.5} \]

in $\Omega_1$ or $\Omega_2$, where $U(x, t)$ is smooth. This smoothness and zero divergence condition, by the definition (6.2) or (6.3) for (HD) or (SW) respectively, imply that $U(x, t)$ is a solution of these equations in $\Omega_1$ or $\Omega_2$. For instance, (6.5) becomes

\[ U_{xt} + \partial_{xx} \Bigl( \frac{1}{2} U^2 - \frac{1}{4} \int_{-\infty}^{\infty} |x - y|\, (U_y^2 - 2\kappa U)\, dy \Bigr) = 0, \]

which is the integrated form of the Harry-Dym equation (HD). The last (jump) condition in (6.4), $n \cdot [V]^+_- = 0$ along the curve $\partial\Omega_1 \cap \partial\Omega_2$, implies

\[ \dot q\, [V_1]^+_- = [V_2]^+_-. \tag{6.6} \]

The left-hand side of this expression is simply

\[ \dot q\, [V_1]^+_- = \dot q\, [U_x]^+_-. \tag{6.7} \]

As to the right-hand side, the second (integral) term in the definitions (6.2) or (6.3) of $V_2(x, t)$ is a continuous function of $x$, as the integral wipes out the discontinuity $\mathrm{sgn}(x - y)$ as well as additional ones that $U_y^2$ might have. Hence the integral terms do not contribute to the right-hand side of (6.6). Since $U$ itself is continuous across the curve, the jump of $V_2(x, t)$ across the discontinuity curve $x = q(t)$ then reduces to

\[ [V_2]^+_- = \Bigl[ \bigl( \tfrac{1}{2} U^2 \bigr)_x \Bigr]^+_- = U(q, t)\, [U_x]^+_-. \tag{6.8} \]

If

\[ [U_x]^+_- \equiv U_x(q^+, t) - U_x(q^-, t) \neq 0, \]

Eqs. (6.7) and (6.8) yield

\[ \dot q = U(q, t), \tag{6.9} \]

i.e., the location of the discontinuity (shock) in $U_x$ moves at the local speed $U(q, t)$. We have thus proved the following theorem.
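The jump argument can be checked numerically on a simple example. The sketch below assumes the single-peakon profile $U = c\,e^{-|x - ct|}$ (the peaked soliton of Camassa and Holm [1993] for $\kappa = 0$), whose corner sits at $q(t) = ct$ and whose height there is $c$. It estimates the one-sided $x$-derivatives by finite differences and evaluates the ratio $[V_2]^+_- / [V_1]^+_-$ from (6.6), keeping only the local part $(\tfrac{1}{2} U^2)_x$ of $V_2$ because the integral term is continuous across the corner. The helper names, the value of $c$ and the step size are illustrative choices, not taken from the paper.

```python
import math


def peakon(x, t, c=1.5):
    """Assumed example profile U = c * exp(-|x - c t|) with its corner at q(t) = c t."""
    return c * math.exp(-abs(x - c * t))


def one_sided_dx(f, x, t, side, h=1e-6):
    """One-sided finite-difference x-derivative; side = +1 (from the right) or -1 (left)."""
    return side * (f(x + side * h, t) - f(x, t)) / h


def jump(f, q, t):
    """Jump of the x-derivative of f at x = q: right-hand limit minus left-hand limit."""
    return one_sided_dx(f, q, t, +1) - one_sided_dx(f, q, t, -1)


def shock_speed(U, q, t):
    """Speed from (6.6), [V2]/[V1], using only the local part (U^2/2)_x of V2."""
    def half_U_squared(x, s):
        return 0.5 * U(x, s) ** 2
    return jump(half_U_squared, q, t) / jump(U, q, t)


c, t = 1.5, 0.7
q = c * t                       # corner position of the assumed peakon
speed = shock_speed(peakon, q, t)
height = peakon(q, t)           # U(q, t), equal to c for this profile
print(speed, height)            # both close to c, up to finite-difference error
```

The computed speed agrees with the height of the peak to within the finite-difference error, which is what (6.9) predicts for any profile with a continuous $U$ and a jump in $U_x$.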


Theorem 6.1. Let $U(x, t)$ be a solution of Eq. (6.1), with the vector $V(x, t)$ defined in terms of $U(x, t)$ by the nonlinear, nonlocal operators (6.2) and (6.3) respectively for equations (HD) and (SW). Let $U(x, t)$ be a smooth function of $(x, t)$ in the domain $\Omega \subseteq \mathbb{R}^2$, except along the curve $x = q(t)$, where $U$ is continuous while the first derivative $U_x$ has a jump discontinuity (peak), $U_x(q^+, t) \neq U_x(q^-, t)$. Then $U(x, t)$ is a solution of equations (HD) and (SW) in each of the domains $\Omega_1$ and $\Omega_2$ into which the curve $x = q(t)$ partitions $\Omega$, and the location of the peak $q(t)$ moves with velocity equal to its height, $\dot q = U(q, t)$.

Conclusions. In this paper, profiles of the weak finite-gap piecewise-smooth solutions of the integrable nonlinear equations of shallow water and Dym type are linked to billiard dynamical systems and geodesic flows with reflections described in terms of finite dimensional Hamiltonian systems on Riemann surfaces. After reducing the solution of these systems to that of a nonstandard Jacobi inversion problem, solutions are found by introducing new parametrizations. The extension of the algebraic-geometric method for nonlinear integrable PDE's given in this paper leads to a description of piecewise-smooth weak solutions of a class of $N$-component systems of nonlinear evolution equations and their associated energy dependent Schrödinger operators.

Acknowledgements. Mark Alber and Roberto Camassa would like to thank Francesco Calogero and Al Osborne for helpful discussions. The authors would like to thank R. Beals, D. Sattinger and J. Szmigielski for pointing out their recent work and for making it available.

References

Abenda, S., Fedorov, Yu. [2000]: On the weak Kowalevski–Painlevé property for hyperelliptically separable systems. Acta Appl. Math. 60 (2), 137–178
Ablowitz, M.J., Segur, H. [1981]: Solitons and the Inverse Scattering Transform. Philadelphia: SIAM
Alber, S.J. [1979]: Investigation of equations of Korteweg de Vries type by the method of recurrence relations. (Russian) J. Lond. Math. Soc. (2) 19, no. 3, 467–480
Alber, M.S., Alber, S.J. [1985]: Hamiltonian formalism for finite-zone solutions of integrable equations. C. R. Acad. Sci. Paris Ser. I Math. 301, 777–781
Alber, M.S., Camassa, R., Holm, D.D. and Marsden, J.E. [1994]: The geometry of peaked solitons and billiard solutions of a class of integrable PDE's. Lett. Math. Phys. 32, 137–151
Alber, M.S., Camassa, R., Holm, D.D. and Marsden, J.E. [1995]: On the link between umbilic geodesics and soliton solutions of nonlinear PDE's. Proc. Roy. Soc. 450, 677–692
Alber, M.S., Camassa, R., Fedorov, Yu.N., Holm, D.D. and Marsden, J.E. [1999]: On billiard solutions of nonlinear PDE's. Phys. Lett. A 264, 171–178
Alber, M.S. and Miller, C. [2001]: On peakon solutions of the shallow water equation. Appl. Math. Lett. 14, 93–98
Alber, M.S. and Fedorov, Yu.N. [2000]: Wave solutions of evolution equations and Hamiltonian flows on nonlinear subvarieties of generalized Jacobians. J. Phys. A: Math. Gen. 33, 8409–8425
Alber, M.S. and Fedorov, Yu.N. [2001]: Algebraic geometric solutions for nonlinear evolution equations and flows on the nonlinear subvarieties of Jacobians. Inverse Problems (to appear)
Alber, M.S., Luther, G.G., and Marsden, J.E. [1997]: Complex billiard Hamiltonian systems and nonlinear waves. In: Fokas, A.S., Gelfand, I.M. (eds.) Algebraic Aspects of Integrable Systems, 1–16, Progr. Nonlinear Differential Equations Appl., 26. Boston: Birkhäuser
Antonowicz, M. and Fordy, A.P. [1989]: Factorization of energy dependent Schrödinger operators: Miura maps and modified systems. Commun. Math. Phys. 124, no. 3, 465–486
Beals, R., Sattinger, D., and Szmigielski, J. [1998]: Acoustic scattering and the extended Korteweg de Vries hierarchy. Adv. Math. 140, 190–206
Beals, R., Sattinger, D.H., and Szmigielski, J. [1999]: Multi-peakons and a theorem of Stieltjes. Inverse Problems 15, L1–L4
Beals, R., Sattinger, D.H. and Szmigielski, J. [2000]: Multipeakons and the classical moment. Adv. in Math. 154, no. 2, 229–257
Belokolos, E.D., Bobenko, A.I., Enol'skii, V.Z., Its, A.R., and Matveev, V.B. [1994]: Algebro-Geometric Approach to Nonlinear Integrable Equations. Springer-Verlag series in Nonlinear Dynamics. New York: Springer-Verlag
Bobenko, A.I. and Suris, Y.B. [1999]: Discrete Lagrangian reduction, discrete Euler-Poincaré equations, and semidirect products. Lett. Math. Phys. 49, no. 1, 79–93
Bulla, W., Gesztesy, F., Holden, H. and Teschl, G. [1998]: Algebro-geometric quasi-periodic finite-gap solutions of the Toda and Kac–van Moerbeke hierarchies. Mem. Am. Math. Soc. 135
Camassa, R. [2000]: Characteristic variables for a completely integrable shallow water equation. In: Boiti, M. et al. (eds.) Nonlinearity, Integrability and All That: Twenty Years After NEEDS '79. Singapore: World Scientific
Camassa, R., Holm, D.D. [1993]: An integrable shallow water equation with peaked solitons. Phys. Rev. Lett. 71, 1661–1664
Camassa, R., Holm, D.D. and Hyman, J.M. [1994]: A new integrable shallow water equation. Adv. Appl. Mech. 31, 1–33
Calogero, F. [1995]: An integrable Hamiltonian system. Phys. Lett. A 201, 306–310
Calogero, F. and Francoise, J.-P. [1996]: Solvable quantum version of an integrable Hamiltonian system. J. Math. Phys. 37 (6), 2863–2871
Cewen, C. [1990]: Stationary Harry-Dym's equation and its relation with geodesics on ellipsoid. Acta Math. Sinica 6, 35–41
Clebsch, A. and Gordon, P. [1866]: Theorie der abelschen Funktionen. Leipzig: Teubner
Dmitrieva, L.A. [1993a]: Finite-gap solutions of the Harry Dym equation. Phys. Lett. A 182, 65–70
Dmitrieva, L.A. [1993b]: The higher-times approach to multisoliton solutions of the Harry Dym equation. J. Phys. A 26, 6005–6020
Drach, M. [1919]: Sur l'integration par quadratures de l'equation $d^2y/dx^2 = [\phi(x) + h]y$. Comptes rendus 168, 337–340
Dubrovin, B.A. [1975]: Periodic problems for the Korteweg–de Vries equation in the class of finite-band potentials. Funct. Anal. Appl. 9, 215–223
Dubrovin, B.A. [1981]: Theta-functions and nonlinear equations. Russ. Math. Surv. 36 (2), 11–92
Ercolani, N. [1987]: Generalized theta functions and homoclinic varieties. In: Ehrenpreis, L., Gunning, R.C. (eds.) Theta functions. Proceedings, Bowdoin. 87–100, Providence, R.I.: American Mathematical Society
Fedorov, Yu. [1999]: Classical integrable systems related to generalized Jacobians. Acta Appl. Math. 55, 3, 151–201
Fedorov, Yu. [2001]: Ellipsoidal billiard with the quadratic potential. Funct. Anal. Appl. (Russian) (to appear)
Gavrilov, L. [1999]: Generalized Jacobians of spectral curves and completely integrable systems. Math. Z. 230, 487–508
Gesztesy, F. [1995]: New trace formulas for Schrödinger operators. Evolution equations (Baton Rouge, LA, 1992), Lecture Notes in Pure and Appl. Math., 168, New York: Dekker, pp. 201–221
Gesztesy, F. and Holden, H. [1994]: Trace formulas and conservation laws for nonlinear evolution equations. Rev. Math. Phys. 6, 51–95 and 673
Gesztesy, F. and Holden, H. [2001]: Algebraic-geometric solutions of the Camassa–Holm equation. Preprint
Gesztesy, F. and Holden, H. [2001]: Dubrovin equations and integrable systems on hyperelliptic curves. Math. Scand. (to appear)
Gesztesy, F., Ratnaseelan, R., and Teschl, G. [1996]: The KdV hierarchy and associated trace formulas. Recent developments in operator theory and its applications (Winnipeg, MB, 1994), 125–163, Oper. Theory Adv. Appl. 87, Basel: Birkhäuser
Hunter, J.K. and Zheng, Y.X. [1994]: On a completely integrable nonlinear hyperbolic variational equation. Physica D 79, 361–386
Jacobi, C.G.J. [1884a]: Vorlesungen über Dynamik, Gesammelte Werke. Berlin: Supplementband
Jacobi, C.G.J. [1884b]: Solution nouvelle d'un probleme fondamental de geodesie. Gesammelte Werke Bd. 2, Berlin
Jaulent, M. [1972]: On an inverse scattering problem with an energy dependent potential. Ann. Inst. H. Poincare A 17, 363–372
Jaulent, M. and Jean, C. [1976]: The inverse problem for the one-dimensional Schrödinger operator with an energy dependent potential. Ann. Inst. H. Poincare A. I, II 25, 105–118, 119–137
Kane, C., Repetto, E.A., Ortiz, M., and Marsden, J.E. [1999]: Finite element analysis of nonsmooth contact. Comput. Methods Appl. Mech. and Engrg. 180, 1–26
Klingenberg, W. [1982]: Riemannian Geometry. New York: de Gruyter
Knörrer, H. [1982]: Geodesics on quadrics and mechanical problem of C. Neumann. J. Reine Angew. Math. 334, 69–78
Kozlov, V.V. and Treschev, D.V. [1991]: Billiards, a Genetic Introduction to Systems with Impacts. AMS Translations of Math. Monographs 89. New York
Kruskal, M.D. [1975]: Nonlinear wave equations. In: Moser, J. (eds.) Dynamical Systems, Theory and Applications. Lecture Notes in Physics 38, New York: Springer
Markushevich, A.I. [1977]: Introduction to the Theory of Abelian Functions. English version: Translations of Mathematical Monographs, 96. Providence, RI: American Mathematical Society, 1992
Marsden, J.E., Patrick, G.W., and Shkoller, S. [1998]: Multisymplectic geometry, variational integrators and nonlinear PDEs. Commun. Math. Phys. 199, 351–395
Marsden, J.E., Pekarsky, S., and Shkoller, S. [1999]: Discrete Euler-Poincare and Lie-Poisson equations. Nonlinearity 12, 1647–1662
Marsden, J.E. and Ratiu, T.S. [1999]: Introduction to Mechanics and Symmetry. Texts in Applied Mathematics, 17, Berlin–Heidelberg–New York: Springer-Verlag
McKean, H.P. and Constantin, A. [1999]: A shallow water equation on the circle. Comm. Pure Appl. Math. Vol LII, 949–982
Moser, J. and Veselov, A.P. [1991]: Discrete versions of some classical integrable systems and factorization of matrix polynomials. Commun. Math. Phys. 139, 217–243
Mumford, D. [1983]: Tata Lectures on Theta I and II. Boston: Birkhäuser-Verlag
Novikov, D.P. [1999]: Algebraic-geometrical solutions of the Harry–Dym equations. Sibirskii Matematicheskii Zhurnal, 40, 159–163, (Russian) English transl. in: Siberian Math. Journal, 40, 136–140
Previato, E. [1995]: Hyperelliptic quasi-periodic and soliton solutions of the nonlinear Schrödinger equation. Duke Math. J. 52, 329–377
Rauch-Wojciechowski, S. [1995]: Mechanical systems related to the Schrödinger spectral problem. Chaos, Solitons & Fractals 5, 2235–2259
Serre, J.-P. [1959]: Groupes algébriques et corps de classes. Paris: Hermann
Vanhaecke, P. [1995]: Stratification of hyperelliptic Jacobians and the Sato Grassmannian. Acta Appl. Math. 40, 143–172
Veselov, A.P. [1988]: Integrable discrete-time systems and difference operators. Funct. An. and Appl. 22, 83–94
Verhulst, F. [1996]: Nonlinear Differential Equations and Dynamical Systems. Second Edition, Berlin–Heidelberg–New York: Springer-Verlag
Wadati, M., Ichikawa, Y.H., and Shimizu, T. [1980]: Cusp soliton of a new integrable nonlinear evolution equation. Prog. Theor. Phys. 64, 1959–1967
Weierstrass, K. [1884]: Über die geodätischen Linien auf dreiachsigen Ellipsoid, Math. Werke I, 257–266
Wendlandt, J.M. and Marsden, J.E. [1997]: Mechanical integrators derived from a discrete variational principle. Physica D 106, 223–246
Whittaker, E.T. [1937]: A Treatise on the Analytical Dynamics of Particles and Rigid Bodies, Cambridge: Cambridge University Press, 1904; 4th Ed., 1937 (reprinted by Dover 1944, and Cambridge University 1988)
Young, L.C. [1969]: Lectures on the Calculus of Variations and Optimal Control Theory. Corrected printing, Chelsea: Saunders, 1980

Communicated by T. Miwa

Commun. Math. Phys. 221, 229 – 254 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

The Absolute Continuity of the Integrated Density of States for Magnetic Schrödinger Operators with Certain Unbounded Random Potentials

Thomas Hupfer (1), Hajo Leschke (1), Peter Müller (2), Simone Warzel (1)

(1) Institut für Theoretische Physik, Universität Erlangen-Nürnberg, Staudtstraße 7, 91058 Erlangen, Germany
(2) Institut für Theoretische Physik, Georg-August-Universität, 37073 Göttingen, Germany

Received: 20 October 2000 / Accepted: 8 March 2001

Dedicated to the memory of Kurt Broderix (26 April 1962 – 12 May 2000)

Abstract: The object of the present study is the integrated density of states of a quantum particle in multi-dimensional Euclidean space which is characterized by a Schrödinger operator with a magnetic field and a random potential which may be unbounded from above and below. If the magnetic field is constant and the random potential is ergodic and admits a so-called one-parameter decomposition, we prove the absolute continuity of the integrated density of states and provide explicit upper bounds on its derivative, the density of states. This local Lipschitz continuity of the integrated density of states is derived by establishing a Wegner estimate for finite-volume Schrödinger operators which holds for rather general magnetic fields and different boundary conditions. Examples of random potentials to which the results apply are certain alloy-type and Gaussian random potentials. In addition, we show a diamagnetic inequality for Schrödinger operators with Neumann boundary conditions.

Contents

1. Introduction
2. Random Schrödinger Operators with Magnetic Fields
   2.1 Basic notation
   2.2 Basic assumptions
   2.3 Definition of the operators
   2.4 The integrated density of states
3. Existence of the Density of States for Certain Random Potentials
   3.1 A Wegner estimate
   3.2 Upper bounds on the density of states
4. Examples Illustrating the Results of Section 3
   4.1 Alloy-type random potentials
   4.2 Gaussian random potentials
   4.3 Two space dimensions: random Landau Hamiltonians
A. On Finite-Volume Schrödinger Operators with Magnetic Fields
   A.1 Definition of magnetic Neumann Schrödinger operators
   A.2 Diamagnetic inequality
   A.3 Some consequences
References

1. Introduction The integrated density of states is a quantity of primary interest in the theory [34, 10, 49] and application [54, 7, 40, 2, 37] of Schrödinger operators for a particle in d-dimensional Euclidean space Rd (d = 1, 2, 3, . . . ) subject to a random potential. Its knowledge allows one to compute the free energy and hence all basic thermostatic quantities of the corresponding non-interacting many-particle system. It also enters formulae for transport coefficients. The goal of the present paper is to prove the absolute continuity of the integrated density of states N for certain unbounded random potentials, thereby generalizing a result in [23] for zero magnetic field to the case of a constant magnetic field. Examples of random potentials to which our result applies are certain alloy-type and Gaussian random potentials. In particular, we consider the situation of two space dimensions and a perpendicular constant magnetic field where N is not absolutely continuous without random potential. For the proof of absolute continuity of N , we use the abstract one-parameter spectralaveraging estimate of [11] to derive what is called a Wegner estimate [65]. Such estimates provide upper bounds on the averaged number of eigenvalues of finite-volume random Schrödinger operators in a given energy regime. They play a major rôle in proofs of Anderson localization for multi-dimensional random Schrödinger operators [10, 49, 11, 24, 61]. In contrast to the Wegner estimates with magnetic fields which are available so far, we are neither restricted to the case of a constant magnetic field [12, 5, 64] nor to the existence of gaps in the spectrum of the magnetic Schrödinger operator without random potential [4]. In fact, the Wegner estimate in the present paper holds for magnetic vector potentials whose components are locally square integrable. 
Its proof involves techniques for (non-random) magnetic Neumann Schrödinger operators among them Dirichlet–Neumann bracketing and a diamagnetic inequality. Appendix A provides the definition of these operators and proofs of the latter techniques in greater generality than actually needed for the main body of the present paper. 2. Random Schrödinger Operators with Magnetic Fields 2.1. Basic notation. As usual, let N := {1, 2, 3, . . . } denote the set of natural numbers. Let R, respectively C, denote the algebraic field of real, respectively complex numbers and let Zd be the simple cubic lattice in d dimensions, d ∈ N. An open cube in ddimensional Euclidean space Rd is a translate of the d-fold Cartesian product I × . . . × I of an open interval I ⊆ R. The open unit cube in Rd which is centered at site y ∈ Rd and whose edges are oriented parallel to the co-ordinate axes is denoted by (y). The d 2 1/2 . Euclidean norm of x ∈ Rd is |x| := j =1 xj d The volume of a Borel subset d ⊆ R with respect to the d-dimensional Lebesgue d χ measure is || := d x = Rd d x (x), where χ is the indicator function of . In particular, if is the strictly positive half-line, := χ ] 0,∞[ is the left-continuous

Density of States for Random Schrödinger Operators with Magnetic Fields


Heaviside unit-step function. The Banach space $L^p(\Lambda)$, $p \in [1, \infty]$, consists of the Borel-measurable complex-valued functions $f : \Lambda \to \mathbb{C}$ which are identified if their values differ only on a set of Lebesgue measure zero and which obey $\int_\Lambda d^dx\, |f(x)|^p < \infty$ if $p < \infty$ and $\|f\|_\infty := \operatorname{ess\,sup}_{x \in \Lambda} |f(x)| < \infty$ if $p = \infty$. We recall that $L^2(\Lambda)$ is a separable Hilbert space with scalar product $\langle \cdot, \cdot \rangle$ given by $\langle f, g \rangle = \int_\Lambda d^dx\, \overline{f(x)}\, g(x)$. Here the overbar denotes complex conjugation. We write $f \in L^p_{\rm loc}(\mathbb{R}^d)$, if $f \chi_\Lambda \in L^p(\mathbb{R}^d)$ for any bounded Borel set $\Lambda \subset \mathbb{R}^d$. Finally, $C_0^\infty(\Lambda)$ is the vector space of functions $f : \Lambda \to \mathbb{C}$ which are arbitrarily often differentiable and have compact supports.

2.2. Basic assumptions. Let $(\Omega, \mathcal{A}, \mathbb{P})$ be a complete probability space and $\mathbb{E}\{\cdot\} := \int_\Omega \mathbb{P}(d\omega)\,(\cdot)$ be the expectation induced by the probability measure $\mathbb{P}$. By a random potential we mean a (scalar) random field $V : \Omega \times \mathbb{R}^d \to \mathbb{R}$, $(\omega, x) \mapsto V^{(\omega)}(x)$ which is assumed to be jointly measurable with respect to the product of the sigma-algebra $\mathcal{A}$ of event sets in $\Omega$ and the sigma-algebra $\mathcal{B}(\mathbb{R}^d)$ of Borel sets in $\mathbb{R}^d$. We will always assume $d \ge 2$, because magnetic fields in one space dimension may be “gauged away” and are therefore of no physical relevance. Furthermore, for d = 1 far more is known [10, 49] thanks to methods which only work for one dimension. We list four properties which V may have or not:

(F) There exists some real $p \in ]1, \infty[$ with $p > 1$ if $d = 2$ and $p \ge d/2$ if $d \ge 3$ such that for $\mathbb{P}$-almost each $\omega \in \Omega$ the realization $V^{(\omega)} : x \mapsto V^{(\omega)}(x)$ of V belongs to $L^p_{\rm loc}(\mathbb{R}^d)$.

(S) There exists some pair of reals $p_1 > p(d)$ and $p_2 > p_1 d / [2(p_1 - p(d))]$ such that

$$\sup_{y \in \mathbb{Z}^d} \mathbb{E}\Big\{ \Big( \int_{\Lambda(y)} d^dx\, |V(x)|^{p_1} \Big)^{p_2/p_1} \Big\} < \infty. \tag{2.1}$$

Here p(d) is defined as follows: $p(d) := 2$ if $d \le 3$, $p(d) := d/2$ if $d \ge 5$ and $p(4) > 2$, otherwise arbitrary.

(E) V is $\mathbb{Z}^d$-ergodic or $\mathbb{R}^d$-ergodic.

(I) The finiteness condition

$$\sup_{y \in \mathbb{Z}^d} \mathbb{E}\Big\{ \int_{\Lambda(y)} d^dx\, |V(x)|^{2\vartheta + 1} \Big\} < \infty \tag{2.2}$$

holds, where $\vartheta \in \mathbb{N}$ is the smallest integer with $\vartheta > d/4$.

Remark 2.1. (i) Property (E) requires the existence of a group $T_x$, $x \in \mathbb{Z}^d$ or $\mathbb{R}^d$, of probability-preserving and ergodic transformations on $\Omega$ such that V is $\mathbb{Z}^d$- or $\mathbb{R}^d$-homogeneous in the sense that $V^{(T_x \omega)}(y) = V^{(\omega)}(y - x)$ for all $x \in \mathbb{Z}^d$ or $\mathbb{R}^d$, all $y \in \mathbb{R}^d$ and all $\omega \in \Omega$.

(ii) Since property (S) assures that the realization $V^{(\omega)}$ belongs to $L^{p(d)}_{\rm loc}(\mathbb{R}^d)$ for $\mathbb{P}$-almost each $\omega \in \Omega$, property (S) implies property (F). Property (I) also implies property (F).

We proceed by listing two properties either of which a random potential may additionally have or not and which characterize two examples of random potentials, which we will consider in the present paper.


T. Hupfer, H. Leschke, P. Müller, S. Warzel

(A) V is an alloy-type random field, that is, a random field with realizations given by

$$V^{(\omega)}(x) = \sum_{j \in \mathbb{Z}^d} \lambda_j^{(\omega)}\, u_0(x - j). \tag{2.3}$$

The coupling strengths $\{\lambda_j\}$ form a family of random variables which are $\mathbb{P}$-independent and identically distributed according to the common probability measure $\mathcal{B}(\mathbb{R}) \ni I \mapsto \mathbb{P}\{\lambda_0 \in I\}$. Moreover, we suppose that the single-site potential $u_0 : \mathbb{R}^d \to \mathbb{R}$ satisfies the Birman–Solomyak condition $\sum_{j \in \mathbb{Z}^d} \big( \int_{\Lambda(j)} d^dx\, |u_0(x)|^{p_1} \big)^{1/p_1} < \infty$ with some real $p_1 \ge 2\vartheta + 1$ and that $\mathbb{E}(|\lambda_0|^{p_2}) < \infty$ for some real $p_2$ satisfying $p_2 \ge 2\vartheta + 1$ and $p_2 > p_1 d / [2(p_1 - p(d))]$. [The constants p(d) and $\vartheta$ are defined in properties (S) and (I).]

(G) V is a Gaussian random field [1, 41] which is $\mathbb{R}^d$-homogeneous. It has zero mean, $\mathbb{E}[V(0)] = 0$, and its covariance function $x \mapsto C(x) := \mathbb{E}[V(x) V(0)]$ is continuous at the origin where it obeys $0 < C(0) < \infty$.

Remark 2.2. (i) Consider an alloy-type random potential V, that is, a random potential with property (A). Then V has properties (E), (I), (S) and (F), see, for example [29].

(ii) Consider a random field with the Gaussian property (G). Then its covariance function C is bounded and uniformly continuous on $\mathbb{R}^d$. Consequently, [22, Thm. 3.2.2] implies the existence of a separable version V of this field which is jointly measurable. Speaking about a Gaussian random potential, we tacitly assume that only this version will be dealt with. By the Bochner–Khintchine theorem [51, Thm. IX.9] there is a one-to-one correspondence between finite positive (and even) Borel measures on $\mathbb{R}^d$ and Gaussian random potentials. An explicit calculation shows that a Gaussian random potential enjoys properties (I), (S) and (F). A simple sufficient criterion for the ergodicity property (E) is the mixing condition $\lim_{|x| \to \infty} C(x) = 0$.

By a vector potential we mean a (non-random) Borel-measurable vector field $A : \mathbb{R}^d \to \mathbb{R}^d$, $x \mapsto A(x)$ which we assume to possess either the property

(B) $|A|^2$ belongs to $L^1_{\rm loc}(\mathbb{R}^d)$,

or the property

(C) A has continuous partial derivatives which give rise to a magnetic field (tensor) with constant components given by $B_{jk} := \partial_j A_k - \partial_k A_j$, where $j, k \in \{1, \dots, d\}$.

Remark 2.3. (i) Property (C) implies property (B).

(ii) Given property (C), we may exploit the gauge freedom to choose the vector potential in the symmetric gauge in which the components of A are given by $A_k(x) = \sum_{j=1}^{d} x_j B_{jk}/2$, where $k \in \{1, \dots, d\}$.

2.3. Definition of the operators. We are now prepared to precisely define magnetic Schrödinger operators with random potentials on the Hilbert spaces $L^2(\Lambda)$ and $L^2(\mathbb{R}^d)$. The finite-volume case is treated in

Proposition 2.1. Let $\Lambda \subset \mathbb{R}^d$ be a bounded open cube, A be a vector potential with the property (B) and V be a random potential with the property (F). Then


(i) the sesquilinear form

$$h^{A,0}_{\Lambda,N}(\varphi, \psi) := \frac{1}{2} \sum_{j=1}^{d} \big\langle (i\nabla + A)_j\, \varphi\,,\, (i\nabla + A)_j\, \psi \big\rangle, \tag{2.4}$$

with $\varphi, \psi$ in the form domain $Q\big(h^{A,0}_{\Lambda,N}\big) := \big\{ \phi \in L^2(\Lambda) : (i\nabla + A)\phi \in (L^2(\Lambda))^d \big\}$ and $\nabla - iA$ denoting the gauge-covariant gradient in the sense of distributions on $C_0^\infty(\Lambda)$, uniquely defines a self-adjoint positive operator on $L^2(\Lambda)$, which we denote by $H_{\Lambda,N}(A, 0)$. The closure $h^{A,0}_{\Lambda,D}$ of the restriction of $h^{A,0}_{\Lambda,N}$ to the domain $C_0^\infty(\Lambda)$ uniquely defines another self-adjoint positive operator on $L^2(\Lambda)$, which we denote by $H_{\Lambda,D}(A, 0)$.

(ii) The two operators

$$H_{\Lambda,X}(A, V^{(\omega)}) := H_{\Lambda,X}(A, 0) + V^{(\omega)}, \qquad X = D \ \text{or} \ X = N, \tag{2.5}$$

are self-adjoint and bounded below on $L^2(\Lambda)$ as form sums for all $\omega$ in some subset $\Omega_F \in \mathcal{A}$ of $\Omega$ with full probability, in symbols, $\mathbb{P}(\Omega_F) = 1$.

(iii) The mapping $H_{\Lambda,X}(A, V) : \Omega_F \ni \omega \mapsto H_{\Lambda,X}(A, V^{(\omega)})$ is measurable. We call it the finite-volume magnetic Schrödinger operator with random potential V and Dirichlet or Neumann boundary condition if X = D or X = N, respectively.

(iv) The spectrum of $H_{\Lambda,X}(A, V^{(\omega)})$ is purely discrete for all $\omega \in \Omega_F$.

(v) The (random) finite-volume density-of-states measure, defined by the trace

$$\nu^{(\omega)}_{\Lambda,X}(I) := \operatorname{Tr}\Big[ \chi_I\big( H_{\Lambda,X}(A, V^{(\omega)}) \big) \Big], \tag{2.6}$$

is a positive Borel measure on the real line $\mathbb{R}$ for all $\omega \in \Omega_F$. Here $\chi_I\big( H_{\Lambda,X}(A, V^{(\omega)}) \big)$ is the spectral projection operator of $H_{\Lambda,X}(A, V^{(\omega)})$ associated with the energy regime $I \in \mathcal{B}(\mathbb{R})$. Moreover, the (unbounded left-continuous) distribution function

$$N^{(\omega)}_{\Lambda,X}(E) := \nu^{(\omega)}_{\Lambda,X}\big( ]-\infty, E[ \big) = \operatorname{Tr}\Big[ \Theta\big( E - H_{\Lambda,X}(A, V^{(\omega)}) \big) \Big] < \infty \tag{2.7}$$

of $\nu^{(\omega)}_{\Lambda,X}$, called the finite-volume integrated density of states, is finite for all energies $E \in \mathbb{R}$.

Proof. The proofs of assertions (i), (ii) and (iv) are contained in Appendix A because (B) and (F) imply (A.1) and (A.2). Assertion (iii) is a consequence of considerations in [35], see also Sect. V.1 in [10], and of a straightforward generalization to non-zero vector potentials. Assertion (v) follows from (ii) and (iv).

Remark 2.4. Counting multiplicity, $\nu^{(\omega)}_{\Lambda,X}(I)$ is just the number of eigenvalues of the operator $H_{\Lambda,X}(A, V^{(\omega)})$ in the Borel set $I \subseteq \mathbb{R}$. Since this number is almost-surely finite if I is bounded, the mapping $\nu_{\Lambda,X} : \omega \mapsto \nu^{(\omega)}_{\Lambda,X}$ is a random Borel measure.

The precise definition of the infinite-volume magnetic Schrödinger operator on $L^2(\mathbb{R}^d)$ and a compilation of its basic properties are given in

Proposition 2.2. Assume that the vector potential A and the random potential V enjoy the properties (C) and (S). Then


(i) the operator $C_0^\infty(\mathbb{R}^d) \ni \psi \mapsto \frac{1}{2} \sum_{j=1}^{d} (i\partial_j + A_j)^2 \psi + V^{(\omega)} \psi$ is essentially self-adjoint for all $\omega$ in some subset $\Omega_S \in \mathcal{A}$ of $\Omega$ with full probability, $\mathbb{P}(\Omega_S) = 1$. Its self-adjoint closure on $L^2(\mathbb{R}^d)$ is denoted by $H(A, V^{(\omega)})$.

(ii) The mapping $H(A, V) : \Omega_S \ni \omega \mapsto H(A, V^{(\omega)})$ is measurable. We call it the infinite-volume magnetic Schrödinger operator with random potential V.

Proof. For assertion (i) see [24, Prop. 2.3], which generalizes [10, Prop. V.3.2] to the case of continuously differentiable vector potentials $A \neq 0$. Note that the assumption of a vanishing divergence, $\sum_{j=1}^{d} \partial_j A_j = 0$, in [24, Prop. 2.3] is not needed in the argument. Assertion (ii) is a straightforward generalization of [10, Prop. V.3.1] to continuously differentiable $A \neq 0$, see also [34, Prop. 2 on p. 288].

Remark 2.5. For alternative or weaker criteria instead of (S) guaranteeing the almost-sure self-adjointness of $H(0, V)$, see [49, Thm. 5.8] or [34, Thm. 1 on p. 299].

If A has the property (C), the infinite-volume magnetic Schrödinger operator without scalar potential, $H(A, 0)$, is unitarily invariant under so-called magnetic translations [67]. The latter form a family of unitary operators $\{T_x\}_{x \in \mathbb{R}^d}$ on $L^2(\mathbb{R}^d)$ defined by

$$(T_x \psi)(y) := e^{i \Phi_x(y)}\, \psi(y - x), \qquad \psi \in L^2(\mathbb{R}^d), \tag{2.8}$$

where $\Phi_x(y) := \int_{K(x,y)} dr \cdot \big( A(r) - A(r - x) \big)$ is an integral along some smooth curve $K(x,y)$ with initial point $x \in \mathbb{R}^d$ and terminal point $y \in \mathbb{R}^d$. Since A and its x-translate $A(\,\cdot - x)$ give rise to the same magnetic field and $\mathbb{R}^d$ is simply connected, the integral $\Phi_x(y)$ is actually independent of K.

Remark 2.6. (i) For the vector potential in the symmetric gauge (see Remark 2.3 (ii)) one has $\Phi_x(y) = \sum_{j,k=1}^{d} x_j B_{jk} (y_k - x_k)/2$.

(ii) For a discussion in the case of more general configuration spaces and magnetic fields, see for example [44].

(iii) In the situation of Prop. 2.2 and if the random potential V has property (E), we have

$$T_x\, H(A, V^{(\omega)})\, T_x^\dagger = H(A, V^{(T_x \omega)}) \tag{2.9}$$

for all $\omega \in \Omega_S$ and all $x \in \mathbb{Z}^d$ or $x \in \mathbb{R}^d$, depending on whether V is $\mathbb{Z}^d$- or $\mathbb{R}^d$-ergodic. Hence, following standard arguments, $H(A, V)$ is an ergodic operator and its spectral components are non-random, see [62, Thm. 2.1]. Moreover, the discrete spectrum of $H(A, V^{(\omega)})$ is empty for $\mathbb{P}$-almost all $\omega \in \Omega$, see [34, 10, 62].

2.4. The integrated density of states. The quantity of main interest in the present paper is the integrated density of states and its corresponding measure, called the density-of-states measure. The next theorem, which we recall from [29], deals with its definition and its representation as an infinite-volume limit of the suitably scaled finite-volume counterparts (2.7).

Proposition 2.3. Let $\chi_{\Lambda(0)}$ denote the multiplication operator associated with the indicator function of the unit cube $\Lambda(0)$. Assume that the potentials A and V have the properties (C), (S), (I) and (E). Then the (infinite-volume) integrated density of states

$$N(E) := \mathbb{E}\Big\{ \operatorname{Tr}\Big[ \chi_{\Lambda(0)}\, \Theta\big( E - H(A, V) \big)\, \chi_{\Lambda(0)} \Big] \Big\} < \infty \tag{2.10}$$


is well defined for all energies $E \in \mathbb{R}$ in terms of the (spatially localized) spectral family of the infinite-volume operator $H(A, V)$. It is the (unbounded left-continuous) distribution function of some positive Borel measure $\nu$ on the real line $\mathbb{R}$. Moreover, let $\Lambda \subset \mathbb{R}^d$ stand for bounded open cubes centered at the origin. Then there is a set $\Omega_0 \in \mathcal{A}$ of full probability, $\mathbb{P}(\Omega_0) = 1$, such that the limit relation

$$N(E) = \lim_{\Lambda \uparrow \mathbb{R}^d} \frac{N^{(\omega)}_{\Lambda,X}(E)}{|\Lambda|} \tag{2.11}$$

holds for both boundary conditions X = D and X = N, all $\omega \in \Omega_0$ and all $E \in \mathbb{R}$ except for the (at most countably many) discontinuity points of N.

Proof. See [29].

Remark 2.7. (i) A proof of the existence of the integrated density of states N under slightly different hypotheses was outlined in [43]. It uses functional-analytic arguments first presented in [36] for the case A = 0. A different approach to the existence of the density-of-states measure $\nu$ for $A \neq 0$, using Feynman–Kac(-Itô) functional-integral representations of Schrödinger semigroups [58, 9], can be found in [62, 8]. The latter approach dates back to [47, 46] for the case A = 0. To our knowledge, it works straightforwardly in the case $A \neq 0$ for X = D only. For $A \neq 0$ the independence of the infinite-volume limit in (2.11) of the boundary condition X (previously claimed without proof in [43]) follows from [45] if the random potential V is bounded and from [19] if V is bounded from below. So the main new point about Prop. 2.3 is that it also applies to a wide class of V unbounded from below. Even for A = 0, Prop. 2.3 is partially new in that the corresponding result [49, Thm. 5.20] only shows vague convergence of the underlying measures, see the next remark.

(ii) An immediate corollary of Prop. 2.3 is the vague convergence [6, Def. 30.1] of the spatial eigenvalue concentrations $|\Lambda|^{-1} \nu^{(\omega)}_{\Lambda,X}$ in the infinite-volume limit $\Lambda \uparrow \mathbb{R}^d$ to the non-random positive Borel measure $\nu$ uniquely corresponding to the integrated density of states (2.10) in the sense that $N(E) = \nu(]-\infty, E[)$ for all $E \in \mathbb{R}$, that is,

$$\nu = \lim_{\Lambda \uparrow \mathbb{R}^d} \frac{\nu^{(\omega)}_{\Lambda,X}}{|\Lambda|} \qquad \text{(vaguely)} \tag{2.12}$$

for both X = D and X = N and $\mathbb{P}$-almost all $\omega \in \Omega$.

One may relate properties of the density-of-states measure $\nu$ to simple spectral properties of the infinite-volume magnetic Schrödinger operator. Examples are the support of $\nu$ and the location of the almost-sure spectrum of $H(A, V^{(\omega)})$ or the absence of a point component in the Lebesgue decomposition of $\nu$ and the absence of “immobile eigenvalues” of $H(A, V^{(\omega)})$. This is the content of

Corollary 2.1. Under the assumptions of Prop. 2.3 and letting $I \in \mathcal{B}(\mathbb{R})$, the following equivalence holds: $\nu(I) = 0$ if and only if $\chi_I\big( H(A, V^{(\omega)}) \big) = 0$ for $\mathbb{P}$-almost all $\omega \in \Omega$. This immediately implies:

(i) $\operatorname{supp} \nu = \operatorname{spec} H(A, V^{(\omega)})$ for $\mathbb{P}$-almost all $\omega \in \Omega$. [Here $\operatorname{spec} H(A, V^{(\omega)})$ denotes the spectrum of $H(A, V^{(\omega)})$ and $\operatorname{supp} \nu := \{ E \in \mathbb{R} : \nu(]E - \varepsilon, E + \varepsilon[) > 0 \text{ for all } \varepsilon > 0 \}$ is the topological support of $\nu$.]


(ii) $0 = \nu(\{E\}) = \lim_{\varepsilon \downarrow 0} \big[ N(E + \varepsilon) - N(E) \big]$ if and only if $E \in \mathbb{R}$ is not an eigenvalue of $H(A, V^{(\omega)})$ for $\mathbb{P}$-almost all $\omega \in \Omega$.

Proof. See [29].

The equivalence (ii) of the above corollary is a continuum analogue of [15, Prop. 1.1], see also [49, Thm. 3.3]. In the one-dimensional case [48] and the multi-dimensional lattice case [18], the equivalence has been exploited to show for A = 0 the (global) continuity of the integrated density of states N under practically no further assumptions on the random potential beyond those ensuring the existence of N. The proof of such a statement in the multi-dimensional continuum case is considered an important open problem [60]. For $A \neq 0$ one certainly needs additional assumptions as [20] illustrates, see Remark 4.3(ii) below. Under the additional assumptions of Corollary 3.1 below, we will show that the integrated density of states is not only continuous, but even absolutely continuous in the case of a constant magnetic field of arbitrary strength.

3. Existence of the Density of States for Certain Random Potentials

In this section we provide conditions under which the integrated density of states N (or, equivalently, its measure $\nu$) is absolutely continuous with respect to the Lebesgue measure. As a by-product, we get rather explicit upper bounds on the resulting Lebesgue density $dN(E)/dE = \nu(dE)/dE$, called the density of states. Results of this genre date back to [65] and go nowadays under the name Wegner estimates.

3.1. A Wegner estimate. The main aim of this subsection is to extend the Wegner estimate in [23] to the case with magnetic fields. For this purpose we recall from there

Definition 3.1. A random potential $V : \Omega \times \mathbb{R}^d \to \mathbb{R}$ admits a $(U, \lambda, u, \varrho)$-decomposition if there exists a random potential $U : \Omega \times \mathbb{R}^d \to \mathbb{R}$, a random variable $\lambda : \Omega \to \mathbb{R}$ and a real-valued $u \in L^\infty_{\rm loc}(\mathbb{R}^d)$ such that

(i) $V^{(\omega)} = U^{(\omega)} + \lambda^{(\omega)} u$ for $\mathbb{P}$-almost all $\omega \in \Omega$,

(ii) the conditional probability distribution of $\lambda$ relative to the sub-sigma-algebra generated by the family of random variables $\{U(x)\}_{x \in \mathbb{R}^d}$ has a jointly measurable density $\varrho : \Omega \times \mathbb{R} \to [0, \infty[$ with respect to the Lebesgue measure on $\mathbb{R}$.

The condition $u \in L^\infty_{\rm loc}(\mathbb{R}^d)$ was missed out in [23, Def. 2]. We now state the following generalization of [23, Thm. 2] which in its turn relies on a result in [11].

Theorem 3.1. Let $\Lambda \subset \mathbb{R}^d$ be a bounded open cube. Let $\Lambda = \big( \overline{\bigcup_{j=1}^{J} \Lambda_j} \big)^{\rm int}$ be decomposed into the interior of the closure of finitely many, $J \in \mathbb{N}$, pairwise disjoint bounded open cubes $\Lambda_j \subset \mathbb{R}^d$. Let the potentials A and V be supplied with the properties (B) and (F), respectively. Assume that for each $j \in \{1, \dots, J\}$ the random potential V admits a $(U_j, \lambda_j, u_j, \varrho_j)$-decomposition subject to the following three conditions: there exist five strictly positive constants $v_1, v_2, \beta, R, Z > 0$ such that for all $j \in \{1, \dots, J\}$,

(i) $v_1\, \chi_{\Lambda_j}(x) \le u_j(x)$ and $u_j(x)\, \chi_{\Lambda_j}(x) \le v_2$ for Lebesgue-almost all $x \in \mathbb{R}^d$,

(ii) $\operatorname{ess\,sup}_{\xi \in \mathbb{R}} \varrho_j^{(\omega)}(\xi)\, \max\{ e^{-\beta v_1 \xi}, e^{-\beta v_2 \xi} \} \le R$ for $\mathbb{P}$-almost all $\omega \in \Omega$,

(iii) $\mathbb{E}\big\{ \operatorname{Tr}\, e^{-\beta H_{\Lambda_j,N}(A, U_j)} \big\} \le |\Lambda_j|\, Z$.


Then the averaged number of eigenvalues of the finite-volume operator $H_{\Lambda,X}(A, V)$ in any non-empty energy regime $I \in \mathcal{B}(\mathbb{R})$ of finite Lebesgue measure $|I|$ is bounded from above according to

$$\mathbb{E}\big\{ \nu_{\Lambda,X}(I) \big\} \le |\Lambda|\, |I|\, \frac{R Z}{v_1}\, e^{\beta \sup I} \tag{3.1}$$

for both boundary conditions X. [Here $\sup I$ denotes the least upper bound of $I \subset \mathbb{R}$.]

Remark 3.1. The (Chebyshev–Markov) inequality $\chi_{[1,\infty[}(|\xi|) \le |\xi|$ implies

$$\mathbb{P}\big\{ I \cap \operatorname{spec} H_{\Lambda,X}(A, V) \neq \emptyset \big\} = \mathbb{E}\big\{ \chi_{[1,\infty[}\big( \nu_{\Lambda,X}(I) \big) \big\} \le \mathbb{E}\big\{ \nu_{\Lambda,X}(I) \big\}. \tag{3.2}$$

Therefore the Wegner estimate (3.1) in particular bounds the probability of finding at least one eigenvalue of $H_{\Lambda,X}(A, V)$ in a given energy regime $I \in \mathcal{B}(\mathbb{R})$. Such bounds are a key ingredient of proofs of Anderson localization for multi-dimensional random Schrödinger operators, see [10, 49, 11, 24, 61] and references therein.

Proof (of Theorem 3.1). Since we follow exactly the strategy of the proof of [23, Thm. 2], we only remark that the two main steps in this proof remain valid in the presence of a vector potential A. The first step, used in inequality (27) of [23], concerns the lowering of the eigenvalues of the operator $H_{\Lambda,X}(A, V^{(\omega)})$ by so-called Dirichlet–Neumann bracketing in case X = D and by the (subsequent) insertion of interfaces in $\Lambda$ with the requirement of Neumann boundary conditions. For $A \neq 0$, supplied with property (B), the validity of these two techniques is established in Appendix A. The second step is an application of a spectral-averaging estimate of [11], which is re-phrased as Lemma 3.1 below. Since there the operator L is only required to be self-adjoint and does not enter the r.h.s. of (3.3), it makes no difference if L is taken as $H_{\Lambda,X}(0, U_j)$ (as is done in [23]) or as $H_{\Lambda,X}(A, U_j)$ for each $j \in \{1, \dots, J\}$.

An essential tool in the preceding proof is the (simple extension of the) abstract one-parameter spectral-averaging estimate of [11]; in this context see also [13].

Lemma 3.1. Let K, L and M be three self-adjoint operators acting on a Hilbert space $\mathcal{H}$ with K and M bounded such that $\kappa := \inf_{K\varphi \neq 0} \langle \varphi, M \varphi \rangle / \langle \varphi, K^2 \varphi \rangle > 0$ is strictly positive. Moreover, let $g \in L^\infty(\mathbb{R})$. Then the inequality

$$\int_{\mathbb{R}} d\xi\, |g(\xi)|\, \big\langle \psi\,,\, K\, \chi_I(L + \xi M)\, K\, \psi \big\rangle \le |I|\, \frac{\|g\|_\infty}{\kappa}\, \langle \psi, \psi \rangle \tag{3.3}$$

holds for all $\psi \in \mathcal{H}$ and all $I \in \mathcal{B}(\mathbb{R})$.

Proof. Since the assumption $\kappa > 0$ implies the operator inequality $\kappa K^2 \le M$, the lemma is proven as Cor. 4.2 in [11] for any positive bounded functions g with compact supports. It extends to positive bounded functions with arbitrary support by a monotone-convergence argument.
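As a sanity check of (3.3), which is not taken from [11], it may help to look at the one-dimensional Hilbert space $\mathcal{H} = \mathbb{C}$. There K, L and M reduce to real numbers, here called $k \neq 0$, $l$ and $m > 0$ (these scalars are introduced purely for illustration), so that $\kappa = m/k^2$. Substituting $\eta = l + \xi m$ in the integral gives:

```latex
\begin{align*}
\int_{\mathbb{R}} d\xi\, |g(\xi)|\, \langle \psi, K \chi_I(L + \xi M) K \psi \rangle
  &= k^2 |\psi|^2 \int_{\mathbb{R}} d\xi\, |g(\xi)|\, \chi_I(l + \xi m) \\
  &\le k^2 |\psi|^2\, \|g\|_\infty \int_{\mathbb{R}} d\xi\, \chi_I(l + \xi m) \\
  &= k^2 |\psi|^2\, \|g\|_\infty\, \frac{|I|}{m}
   = |I|\, \frac{\|g\|_\infty}{\kappa}\, \langle \psi, \psi \rangle ,
  \qquad \kappa = \frac{m}{k^2} .
\end{align*}
```

In this scalar case the number $l$ drops out of the final bound, in line with the observation in the proof of Theorem 3.1 that L does not enter the r.h.s. of (3.3).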


3.2. Upper bounds on the density of states. If the fraction $RZ/v_1$ on the r.h.s. of the Wegner estimate (3.1) is independent of $\Lambda$ for sufficiently large $|\Lambda|$, this estimate enables one to prove the absolute continuity of the infinite-volume density-of-states measure with a magnetic field.

Corollary 3.1. Let A and V have the properties (C), (S), (I) and (E). Suppose furthermore:

(i) there exists a sequence $(\Lambda)$ of bounded open cubes $\Lambda \subset \mathbb{R}^d$ with $\Lambda \uparrow \mathbb{R}^d$ such that infinitely many of them admit a decomposition $\Lambda = \big( \overline{\bigcup_{j=1}^{J} \Lambda_j} \big)^{\rm int}$ into a finite number J (depending on $\Lambda$) of pairwise disjoint open cubes $\Lambda_1, \dots, \Lambda_J$.

(ii) V obeys the assumptions of Theorem 3.1 for every such decomposition with constants $\beta, v_1, R, Z > 0$, all of them not depending on $\Lambda$.

Then the density-of-states measure $\nu$ is absolutely continuous with respect to the Lebesgue measure. Moreover, its Lebesgue density w, called the density of states, is locally bounded according to

$$w(E) := \frac{\nu(dE)}{dE} = \frac{dN(E)}{dE} \le \frac{R Z}{v_1}\, e^{\beta E} =: W(E) \tag{3.4}$$

for Lebesgue-almost all energies $E \in \mathbb{R}$.

Proof. Let $I \subset \mathbb{R}$ be bounded and open. Then (2.12) together with [6, Satz 30.2] implies that $\nu(I) \le \liminf_{\Lambda \uparrow \mathbb{R}^d} |\Lambda|^{-1} \nu^{(\omega)}_{\Lambda,X}(I)$ for $\mathbb{P}$-almost all $\omega \in \Omega$. Therefore, by the non-randomness of the density-of-states measure $\nu$ and Fatou’s lemma we have

$$\nu(I) \le \liminf_{\Lambda \uparrow \mathbb{R}^d} \frac{\mathbb{E}\big\{ \nu_{\Lambda,X}(I) \big\}}{|\Lambda|} \le |I|\, \frac{R Z}{v_1}\, e^{\beta \sup I}. \tag{3.5}$$

Here we used (3.1) and the assumption that the constants involved there do not depend on $\Lambda$. Now the Radon–Nikodým theorem yields the claimed absolute continuity of $\nu$.

4. Examples Illustrating the Results of Section 3

Assumption (iii) of Theorem 3.1 may be checked in various ways. For example, by the diamagnetic inequality (A.24) of Appendix A for Neumann partition functions one sees that a possible choice of Z in (3.1) is

$$Z_1 := \max_{1 \le j \le J} |\Lambda_j|^{-1}\, \mathbb{E}\big\{ \operatorname{Tr}\, e^{-\beta H_{\Lambda_j,N}(0, U_j)} \big\}. \tag{4.1}$$

This yields an upper bound on $\mathbb{E}\{\nu_{\Lambda,X}(I)\}$ in (3.1) which is independent of the magnetic field and, in particular, coincides with the one in [23, Thm. 2]. Rather weak conditions on the random potential $U_j$ assuring the finiteness of the expectation value in (4.1) can be found in [21]. Another choice of Z results from applying the following averaged Golden–Thompson inequality.


Lemma 4.1. Let $\Lambda \subset \mathbb{R}^d$ be a bounded open cube and assume that A and V enjoy properties (B) and (F). Then the averaged partition function of $H_{\Lambda,X}(A, V)$ is bounded for all $\beta > 0$ according to

$$\mathbb{E}\big\{ \operatorname{Tr}\, e^{-\beta H_{\Lambda,X}(A, V)} \big\} \le \operatorname{Tr}\, e^{-\beta H_{\Lambda,X}(A, 0)}\ \operatorname{ess\,sup}_{x \in \Lambda} \mathbb{E}\big\{ e^{-\beta V(x)} \big\}, \tag{4.2}$$

provided that the essential supremum on the r.h.s. is finite.

Proof. We proceed as in the proof of [36, Thm. 3.4(ii)] and define $V_n^{(\omega)}(x) := \max\{-n, V^{(\omega)}(x)\}$ for $n \in \mathbb{N}$ and $\omega \in \Omega_F$. The Golden–Thompson inequality [53] yields

$$\operatorname{Tr}\, e^{-\beta H_{\Lambda,X}(A, V_n^{(\omega)})} \le \operatorname{Tr}\Big[ e^{-\beta H_{\Lambda,X}(A, 0)}\, e^{-\beta V_n^{(\omega)}} \Big]. \tag{4.3}$$

We then evaluate the trace on the r.h.s. in an orthonormal eigenbasis of $H_{\Lambda,X}(A, 0)$. Using Fubini’s theorem, the probabilistic expectation of the quantum-mechanical expectation of $\exp(-\beta V_n)$ with respect to a normalized eigenfunction of $H_{\Lambda,X}(A, 0)$ is estimated by $\operatorname{ess\,sup}_{x \in \Lambda} \mathbb{E}\{\exp(-\beta V_n(x))\}$, which is smaller than the second factor on the r.h.s. of (4.2) since $V \le V_n$. The proof is completed by noting that the l.h.s. of (4.3) converges for $n \to \infty$ to the trace on the l.h.s. of (4.2) by monotone convergence of forms [51, Thm. S.16], similar to the proof of [36, Prop. 2.1(e)].

Using (4.2) one gets

$$Z_2 := \max_{1 \le j \le J}\ |\Lambda_j|^{-1}\ \operatorname{Tr}\, e^{-\beta H_{\Lambda_j,N}(A, 0)}\ \operatorname{ess\,sup}_{x \in \Lambda_j} \mathbb{E}\big\{ e^{-\beta U_j(x)} \big\} \tag{4.4}$$

as another choice for Z in (3.1). By (A.24) one may further estimate the magnetic Neumann partition function in (4.4) according to

$$\operatorname{Tr}\, e^{-\beta H_{\Lambda,N}(A, 0)} \le \operatorname{Tr}\, e^{-\beta H_{\Lambda,N}(0, 0)} \le |\Lambda| \Big( |\Lambda|^{-1/d} + (2\pi\beta)^{-1/2} \Big)^{d}. \tag{4.5}$$

The second inequality follows from the explicitly known [53, p. 266] spectrum of $H_{\Lambda,N}(0, 0)$. Applying (4.5) to (4.4) one weakens $Z_2$ to a rather explicit choice of Z in (3.1) given by

$$Z_3 := \max_{1 \le j \le J} \Big( |\Lambda_j|^{-1/d} + (2\pi\beta)^{-1/2} \Big)^{d}\ \operatorname{ess\,sup}_{x \in \Lambda_j} \mathbb{E}\big\{ e^{-\beta U_j(x)} \big\}. \tag{4.6}$$
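The second inequality in (4.5) can be checked numerically. The sketch below assumes the standard spectrum of the Neumann operator $H_{\Lambda,N}(0,0) = -\Delta/2$ on a cube of edge length $|\Lambda|^{1/d}$, namely the eigenvalues $(\pi^2 / (2|\Lambda|^{2/d})) \sum_j n_j^2$ with $n_j \in \{0, 1, 2, \dots\}$. The function names and the cut-off `n_max` are illustrative choices and do not appear in the paper.

```python
import math

def neumann_partition_function(beta: float, edge: float, d: int, n_max: int = 2000) -> float:
    """Tr exp(-beta * H_{Lambda,N}(0,0)) for a cube with the given edge length.

    Assumes eigenvalues (pi^2 / (2 edge^2)) * (n_1^2 + ... + n_d^2), n_j = 0, 1, 2, ...
    The trace factorizes into the d-th power of a one-dimensional sum,
    which is truncated at n_max (an illustrative cut-off).
    """
    a = beta * math.pi ** 2 / (2.0 * edge ** 2)
    one_dim = sum(math.exp(-a * n * n) for n in range(n_max + 1))
    return one_dim ** d

def neumann_upper_bound(beta: float, edge: float, d: int) -> float:
    """Right-hand side of (4.5): |Lambda| * (|Lambda|^(-1/d) + (2 pi beta)^(-1/2))^d."""
    volume = edge ** d
    return volume * (1.0 / edge + (2.0 * math.pi * beta) ** -0.5) ** d

if __name__ == "__main__":
    for d in (2, 3):
        for edge in (1.0, 5.0):
            for beta in (0.1, 1.0, 10.0):
                trace = neumann_partition_function(beta, edge, d)
                bound = neumann_upper_bound(beta, edge, d)
                print(f"d={d} edge={edge} beta={beta}: trace={trace:.4f} bound={bound:.4f}")
```

Under the stated assumption on the spectrum, the bound can be understood by comparing the one-dimensional sum over $n \ge 0$ with $1 + \int_0^\infty dx\, \exp(-\beta \pi^2 x^2 / (2 |\Lambda|^{2/d})) = 1 + |\Lambda|^{1/d} (2\pi\beta)^{-1/2}$.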

4.1. Alloy-type random potentials. The existence of a $(U, \lambda, u, \varrho)$-decomposition of V as required in Theorem 3.1 is immediate for alloy-type random potentials whose coupling strengths are distributed according to a Borel probability measure on the real line with a bounded Lebesgue density. To illustrate the essentials of Theorem 3.1 we first consider the case of positive potentials.


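Corollary 4.1. Let A and V have the properties (B) and (A). Assume that $u_0 \in L^\infty_{\rm loc}(\mathbb{R}^d)$ and that the probability distribution of $\lambda_0$ has a Lebesgue density $g \in L^\infty(\mathbb{R})$ with support in the positive half-line $[0, \infty[$. Furthermore, suppose that there exist two strictly positive constants $v_1, v_2 > 0$ such that

$$v_1\, \chi_{\Lambda(0)}(x) \le u_0(x) \quad \text{and} \quad u_0(x)\, \chi_{\Lambda(0)}(x) \le v_2 \tag{4.7}$$

for Lebesgue-almost all $x \in \mathbb{R}^d$. Then for each bounded open cube of the form

$$\Lambda = \Big( \overline{\bigcup_{j \in \Lambda \cap \mathbb{Z}^d} \Lambda(j)} \Big)^{\rm int}, \tag{4.8}$$

one has

$$\mathbb{E}\big\{ \nu_{\Lambda,X}(I) \big\} \le |\Lambda|\, |I|\, W_A(\sup I) \tag{4.9}$$

for both X = D and X = N and all $I \in \mathcal{B}(\mathbb{R})$. Here $W_A$ is the function

$$\mathbb{R} \ni E \mapsto W_A(E) := \Big[ 1 + (2\pi\beta)^{-1/2} \Big]^{d}\, \frac{\|g\|_\infty}{v_1}\, e^{\beta E} \tag{4.10}$$

with $\beta \in ]0, \infty[$ serving as a variational parameter.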
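Since $\beta$ in (4.10) is a free variational parameter, the bound can be tightened for a given energy $E > 0$ by minimizing over $\beta$. The following is a minimal sketch of such a minimization, assuming that d, $\|g\|_\infty$ and $v_1$ are known numbers. The function names, the search interval and the use of a ternary search are illustrative choices made here, not part of the paper; the search relies on the fact that $\ln W_A(E)$ is a convex function of $\beta$ for fixed $E > 0$.

```python
import math

def wegner_bound_alloy(energy: float, beta: float, d: int, g_sup: float, v1: float) -> float:
    """W_A(E) from (4.10) for a fixed variational parameter beta > 0."""
    prefactor = (1.0 + (2.0 * math.pi * beta) ** -0.5) ** d
    return prefactor * (g_sup / v1) * math.exp(beta * energy)

def optimize_beta(energy: float, d: int, lo: float = 1e-8, hi: float = 1e8,
                  iterations: int = 200) -> float:
    """Ternary search (in log beta) for the beta minimizing W_A(E) when energy > 0.

    The interval [lo, hi] and the iteration count are illustrative choices.
    The factor g_sup / v1 does not depend on beta and is therefore omitted.
    """
    if energy <= 0:
        raise ValueError("for energy <= 0 the bound decreases in beta; no finite minimizer")

    def log_w(log_beta: float) -> float:
        beta = math.exp(log_beta)
        return beta * energy + d * math.log1p((2.0 * math.pi * beta) ** -0.5)

    a, b = math.log(lo), math.log(hi)
    for _ in range(iterations):
        m1 = a + (b - a) / 3.0
        m2 = b - (b - a) / 3.0
        if log_w(m1) < log_w(m2):
            b = m2
        else:
            a = m1
    return math.exp(0.5 * (a + b))

if __name__ == "__main__":
    best = optimize_beta(energy=1.0, d=2)
    print(best, wegner_bound_alloy(1.0, best, d=2, g_sup=1.0, v1=1.0))
```

For $E \le 0$ the expression in (4.10) is decreasing in $\beta$, so no finite minimizer exists and the sketch raises an error instead.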

Proof. For each $j \in \Lambda \cap \mathbb{Z}^d$, the choice $u_j(x) := u_0(x - j)$ and $U_j^{(\omega)}(x) := V^{(\omega)}(x) - \lambda_j^{(\omega)} u_j(x)$ yields a $(U_j, \lambda_j, u_j, g)$-decomposition of V in the sense of Definition 3.1. It remains to verify the three assumptions of Theorem 3.1. Assumption (i) is guaranteed by (4.7). Assumption (ii) is fulfilled with $R = \|g\|_\infty$. To verify assumption (iii), we make use of (4.6) and observe that $U_j^{(\omega)} \ge 0$.

Remark 4.1. (i) The estimates in the proof of Corollary 4.1, when specializing the fraction $RZ/v_1$ of Theorem 3.1 to $W_A$, were unnecessarily rough for the sake of simplicity. In specific examples the upper bound $W_A$ may be improved. Moreover, more general alloy-type random potentials are also covered by Theorem 3.1. In particular, the random potential may be unbounded from below, see the next corollary. Furthermore, one may allow for correlated coupling strengths $\{\lambda_j\}$ as long as the relevant conditional probabilities have bounded Lebesgue densities.

(ii) Apart from the existence of a bounded Lebesgue density for the coupling strength $\lambda_0$ one further restrictive assumption of Corollary 4.1 is the fact that the single-site potential $u_0$ must possess a definite sign. The latter may be slightly weakened such that one may treat certain $u_0$ taking on values of both signs by choosing a more complicated decomposition different from the natural one used in the proof of Corollary 4.1. This basically corresponds to the linear-transformation technique introduced in [63] which turns certain given alloy-type random potentials into ones with positive single-site potentials and correlated coupling strengths, see the previous Remark 4.1(i). In any case, the fact that $u_0$ must possess a sufficiently large support is believed to be important for the absolute continuity of the integrated density of states in the presence of a magnetic field, see Remark 4.3(ii).


(iii) We only know of [12, 4, 5, 64] where Wegner estimates for magnetic Schrödinger operators with alloy-type random potentials have been derived.¹ The Wegner estimate of [4] is proven for energies in pre-supposed gaps of the spectrum of $H(A, 0)$. The other three works consider the case of two space dimensions and a perpendicular constant magnetic field, see Subsect. 4.3, especially Remark 4.3(iii) and 4.3(iv).

We close this subsection by considering the example of an alloy-type random potential which is unbounded from below, with exponentially decaying probability density for its (independent) coupling strengths. This example is marginal in the sense that any such density has to fall off at minus infinity at least as fast as exponentially in order to ensure the applicability of Theorem 3.1.

Corollary 4.2. Let A and V have the properties (B) and (A). Assume a Laplace distribution for $\lambda_0$, that is

$$\mathbb{P}\{\lambda_0 \in I\} = \frac{1}{2\alpha} \int_I d\xi\, e^{-|\xi|/\alpha}, \qquad I \in \mathcal{B}(\mathbb{R}), \tag{4.11}$$

with some $\alpha > 0$. Furthermore, suppose that $u_0 \in L^\infty_{\rm loc}(\mathbb{R}^d)$ and that (4.7) holds with some $v_1, v_2 > 0$ and let

$$K_\beta := - \operatorname{ess\,inf}_{x \in \Lambda(0)} \sum_{j \in \mathbb{Z}^d} \ln\Big( 1 - \big[ \beta \alpha\, u_0(x - j) \big]^2 \Big) < \infty \tag{4.12}$$

be finite for some $\beta \in ]0, (\alpha \|u_0\|_\infty)^{-1}[$. Finally, let $\Lambda$ be of the form (4.8). Then (4.9) holds where $W_A$ may be taken as the function

$$E \mapsto W_A(E) := \Big[ 1 + (2\pi\beta)^{-1/2} \Big]^{d}\, \frac{1 - (\beta \alpha v_1)^2}{2 \alpha v_1}\, e^{\beta E + K_\beta} \tag{4.13}$$

with $\beta \in \big\{ \beta \in ]0, (\alpha \|u_0\|_\infty)^{-1}[ \; : \; K_\beta < \infty \big\}$ serving as a variational parameter.

Proof. The proof is analogous to that of Corollary 4.1. To verify the assumptions of Theorem 3.1 we note that assumption (i) is guaranteed by (4.7). Assumption (ii) is fulfilled with $R = (2\alpha)^{-1}$ if $\beta \in ]0, (\alpha v_2)^{-1}]$. As for assumption (iii), we make use of (4.6) and explicitly compute the involved expectation if $\beta \in ]0, (\alpha \|u_0\|_\infty)^{-1}[$.

4.2. Gaussian random potentials. As another application of Theorem 3.1 we note that the Wegner estimate derived previously [23, Thm. 1] for certain Gaussian random potentials and the case without magnetic field remains valid in the present setting. The reason for this is the fact that every Wegner estimate stemming from [23, Thm. 2] is also one in the presence of a magnetic field thanks to the diamagnetic inequality.

Corollary 4.3. Let A and V have the properties (B) and (G). Moreover, assume that there exist a finite signed Borel measure $\mu$ on $\mathbb{R}^d$, which is normalized in the sense that $\int_{\mathbb{R}^d} \mu(d^dx) \int_{\mathbb{R}^d} \mu(d^dy)\, C(x - y) = C(0)$, an open subset $\Gamma \subset \mathbb{R}^d$ with volume $|\Gamma| > 0$ and a constant $\gamma > 0$ such that the covariance function C of V obeys

$$\gamma\, \chi_\Gamma(x) \le (C(0))^{-1} \int_{\mathbb{R}^d} \mu(d^dy)\, C(x - y) =: (C(0))^{-1/2}\, u(x) \tag{4.14}$$

1 See, however, note added in proof.


for all $x \in \mathbb{R}^d$. Then for each $\ell > 0$, for which there exists a bounded open cube $\Lambda(\ell) \subseteq \Gamma$ with edges of length $\ell$ parallel to the co-ordinate axes, and each bounded open cube $\Lambda \subset \mathbb{R}^d$ satisfying the matching condition $|\Lambda|^{1/d} / \ell \in \mathbb{N}$, one has

$$\mathbb{E}\big\{ \nu_{\Lambda,X}(I) \big\} \le |\Lambda|\, |I|\, W_G(\sup I) \tag{4.15}$$

for both X = D and X = N and all $I \in \mathcal{B}(\mathbb{R})$. Here $W_G$ is the function

$$E \mapsto W_G(E) := \Big( 2\ell^{-1} + (2\pi\beta)^{-1/2} \Big)^{d}\, \frac{\exp\big( \beta E + \beta^2 C_\ell / 2 \big)}{\sqrt{2\pi C(0)}\; b_\ell} \tag{4.16}$$

where we introduced the constants $C_\ell := C(0) \big( 1 + B_\ell^2 - b_\ell^2 \big)$, $B_\ell := (C(0))^{-1/2} \sup_{x \in \Lambda(\ell)} u(x)$ and $b_\ell := (C(0))^{-1/2} \inf_{x \in \Lambda(\ell)} u(x) \ge \gamma$. Finally, $\beta \in ]0, \infty[$ serves, besides $\ell$, as a second variational parameter.

Proof. The key input is the fact that every Gaussian random potential V admits a $(U, \lambda, u, \varrho)$-decomposition in the sense of Definition 3.1. More precisely, $\lambda^{(\omega)} := (C(0))^{-1/2} \int_{\mathbb{R}^d} \mu(d^dx)\, V^{(\omega)}(x)$ is a standard Gaussian random variable with Lebesgue density $\varrho(\xi) := (2\pi)^{-1/2} \exp(-\xi^2/2)$. This random variable and the Gaussian random field $U^{(\omega)}(x) := V^{(\omega)}(x) - \lambda^{(\omega)} u(x)$, where u is defined in (4.14), are stochastically independent. For details see the proof of [23, Thm. 1]. To obtain the specific form $W_G$, which is independent of the magnetic field, we used (4.6).

Remark 4.2. (i) Without loss of generality, every measure $\mu$ yielding (4.14) may be normalized in the sense of the assumption in the above corollary. The measure $\mu$ allows one to apply Corollary 4.3 to Gaussian random potentials with certain covariance functions taking on also negative values. Examples are given in [23, 30].

(ii) If $C(x) \ge 0$ for all $x \in \mathbb{R}^d$, we may choose $\mu$ equal to Dirac’s point measure at the origin. Due to the continuity of C and since $C(0) > 0$, condition (4.14) is then fulfilled with some sufficiently small cube $\Gamma$ containing the origin and $\gamma = \inf_{x \in \Gamma} C(x)/C(0)$. Under stronger conditions on the vector potential A the Wegner estimate for this case has been stated in [24, Prop. 2.14] where it serves as one input for a proof of Anderson localization by certain Gaussian random potentials, see Remark 3.1.

(iii) Choosing $\ell = |E|^{-1/4}$ and $\beta = (2 C_\ell)^{-1} \big( \sqrt{E^2 + 2 d\, C_\ell} - E \big)$ we obtain the following leading low- and high-energy behaviour:

$$\lim_{E \to -\infty} \frac{\ln W_G(E)}{E^2} = - \frac{1}{2 C(0)}, \qquad \lim_{E \to \infty} \frac{W_G(E)}{E^{d/2}} = \frac{(e/(\pi d))^{d/2}}{\sqrt{2\pi}\; u(0)}. \tag{4.17}$$

Since $W_G$ provides an upper bound on the density of states (see Corollary 3.1), its low-energy behaviour is optimal in the sense that it coincides with that of the derivative of the known low-energy behaviour of the integrated density of states [43, 62, 8]. This is not true for the high-energy behaviour. It is known [43, 62] that the high-energy growth of the integrated density of states is neither affected by the random potential nor by the magnetic field and proportional to $E^{d/2}$ for $E \to \infty$ in analogy to Weyl’s celebrated asymptotics for the free particle [66]. Note that the constant on the r.h.s. of the second equation in (4.17) is smaller than the one given by [23, Eq. (14)].
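The choice of $\beta$ in Remark 4.2(iii) can be motivated as follows; this is a heuristic reading offered here, not a statement from the source. If the $\ell$-dependent term in the prefactor of (4.16) is neglected against $(2\pi\beta)^{-1/2}$ and $C_\ell$ is treated as fixed, then the $\beta$-dependence of $\ln W_G(E)$ is $\beta E + \beta^2 C_\ell/2 - (d/2) \ln \beta$ up to a constant, and the stated $\beta$ is the positive stationary point of this expression:

```latex
\begin{align*}
0 &= \partial_\beta \Big( \beta E + \frac{\beta^2 C_\ell}{2} - \frac{d}{2} \ln \beta \Big)
   = E + \beta\, C_\ell - \frac{d}{2\beta} \\
\Longleftrightarrow \quad 0 &= C_\ell\, \beta^2 + E\, \beta - \frac{d}{2} \\
\Longrightarrow \quad \beta &= \frac{1}{2 C_\ell} \Big( \sqrt{E^2 + 2 d\, C_\ell} - E \Big) > 0 .
\end{align*}
```

For $E \to \infty$ this gives $\beta \approx d/(2E)$, and for $E \to -\infty$ it gives $\beta \approx |E|/C_\ell$, which is consistent with the two limits in (4.17).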

4.3. Two space dimensions: random Landau Hamiltonians. In this subsection we consider the special case of two space dimensions and a perpendicular constant magnetic field of strength $B := B_{12} > 0$. Accordingly, the vector potential in the symmetric gauge is given by
$$A(x) = \frac{B}{2} \begin{pmatrix} -x_2 \\ x_1 \end{pmatrix}, \qquad x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \in \mathbb{R}^2. \tag{4.18}$$
This case has received considerable attention during the last three decades [2, 37] in the physics of low-dimensional electronic structures. The magnetic Schrödinger operator on $L^2(\mathbb{R}^2)$ modelling the non-relativistic motion of a particle with unit charge on the Euclidean plane $\mathbb{R}^2$ under the influence of this magnetic field is the Landau Hamiltonian. Its spectral resolution dates back to Fock [25] and Landau [38] and is given by the strong-limit relation
$$H(A, 0) = \frac{B}{2} \sum_{l=0}^{\infty} (2l + 1)\, P_l. \tag{4.19}$$
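To make the spectral resolution (4.19) concrete, the following Python sketch lists the Landau levels (l + 1/2)B and counts how many lie strictly below a given energy. Weighting that count by B/(2π), the standard degeneracy per unit area of a Landau level (used here as a known fact, in the units of the text), gives a staircase in E. The function names and sample values are illustrative assumptions.

```python
import math

def landau_level(l, B):
    """Energy (l + 1/2) B of the l-th Landau level, read off from (4.19)."""
    return (2 * l + 1) * B / 2.0

def levels_below(E, B):
    """Number of Landau levels with energy strictly below E."""
    if E <= B / 2.0:
        return 0
    return math.ceil(E / B - 0.5)

def staircase(E, B):
    """Level count weighted by the degeneracy per area B / (2 pi)."""
    return B / (2.0 * math.pi) * levels_below(E, B)

if __name__ == "__main__":
    B = 1.0  # hypothetical field strength
    print([landau_level(l, B) for l in range(4)])
    for E in (0.4, 1.0, 2.0, 5.3):
        print(E, levels_below(E, B), staircase(E, B))
```

Away from the level energies the staircase stays within B/(4π) of E/(2π), the value obtained by integrating the free two-dimensional density of states Θ(E)/2π.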

The energy eigenvalue (l + 1/2)B is called the l th Landau level and the corresponding orthogonal eigenprojection $P_l$ is an integral operator with continuous complex-valued kernel
$$P_l(x, y) := \frac{B}{2\pi} \exp\Big[ i \frac{B}{2} (x_2 y_1 - x_1 y_2) - \frac{B}{4} |x - y|^2 \Big]\, L_l\Big( \frac{B}{2} |x - y|^2 \Big), \tag{4.20}$$
given in terms of the l th Laguerre polynomial $\xi \mapsto L_l(\xi) := \frac{1}{l!}\, e^{\xi} \frac{d^l}{d\xi^l} \big( \xi^l e^{-\xi} \big)$, ξ ≥ 0, [27, Sect. 8.97]. The diagonal $P_l(x, x) = B/(2\pi)$ is naturally interpreted as the degeneracy per area of the l th Landau level. Using definition (2.10) with V = 0, the integrated density of states of the Landau Hamiltonian (4.19) turns out to be the well-known “staircase” function
$$N(E) = \frac{B}{2\pi} \sum_{l=0}^{\infty} \Theta\Big( E - \big( l + \tfrac{1}{2} \big) B \Big), \qquad V = 0, \tag{4.21}$$
which is obviously not absolutely continuous with respect to the Lebesgue measure. For the derivation of (4.21) one may apply [51, Thm. VI.23] because the operator $P_l\, \chi_{\Lambda(0)}$ is Hilbert-Schmidt, more precisely $\mathrm{Tr}[\chi_{\Lambda(0)} P_l\, \chi_{\Lambda(0)}] = B/(2\pi) < \infty$. Alternatively one may compute [45, App. B] the infinite-area limit $\lim_{\Lambda \uparrow \mathbb{R}^2} |\Lambda|^{-1}\, \mathrm{Tr}\, \Theta\big( E - H_{\Lambda,X}(A, 0) \big)$ for some boundary condition X. The result coincides with (4.21) by Prop. 2.3. Informally, the density of states associated with (4.21) is a series of Dirac delta functions supported at the Landau levels. The corresponding infinities are indicated by vertical lines in Fig. 4.1 and together form what might be called a “Dirac half-comb”. By adding a random potential V to (4.19), the delta peaks are expected to be smeared out. In fact, under the assumptions of Corollary 3.1 they are smeared out completely in the sense that the density of states w of the arising random Landau Hamiltonian H(A, V) = H(A, 0) + V is shown there to be locally bounded. For example, in the presence of a Gaussian random potential with the Gaussian covariance function $C(x) = C(0) \exp\big( -|x|^2/(2\tau^2) \big) > 0$, τ > 0, Fig. 4.1 contains the graph of the upper bound $W_G$ on w given in (4.16) after (numerically) minimizing with

[Figure 4.1: vertical axis $W_G$ with a tick mark at 1/2π; horizontal axis E with tick marks at 0, B/2, 3B/2 and 5B/2.]

Fig. 4.1. Plot of the upper bound $W_G(E)$ on w(E) as a function of the energy E. Here w is the density of states of the Landau Hamiltonian with a Gaussian random potential with Gaussian covariance function. The dashed line is a plot of the graph of an approximation to w. The exact w is unknown. Vertical lines indicate the delta peaks which reflect the non-existence of the density of states without random potential V. The step function Θ(E)/2π (not shown) is the free density of states characterized by B = 0 and V = 0. (See text)

respect to β, ℓ and a certain one-parameter subclass of possible decompositions of V. Here we picked a (small) disorder parameter, $C(0) = (B/5)^2$, and a (large) correlation length, $\tau = 100\, B^{-1/2}$. We recall that the function $W_G$ is independent of B due to our application of the diamagnetic inequality, but nevertheless provides an upper bound on w for all B ≥ 0. Therefore $W_G(E)$ is a rather rough estimate of w(E) already for energies E < B/2 and, in particular, starts increasing significantly at too low energies. Nevertheless, the upper bound shows that the density of states w has no infinities for arbitrarily weak disorder, that is, for arbitrarily small C(0) > 0. In fact, in the above situation we believe the graph of w to look similar to the dashed line in Fig. 4.1. We conclude this subsection with several remarks:

Remark 4.3. (i) Unfortunately, our upper bound W in (3.4) is never sharp enough to reflect the expected “magneto-oscillations” of w. Instead, by construction W is always increasing. (ii) The assumptions of Corollary 3.1 guarantee in particular that there occurs no point component in the Lebesgue decomposition of the density-of-states measure ν. Using Corollary 2.1, this implies that any given energy E ∈ R, in particular any Landau-level energy, is P-almost surely no eigenvalue under these assumptions. This stands in contrast to a certain situation with random point impurities, in which case the authors of [20] show that finitely many Landau-level energies remain infinitely degenerate eigenvalues if B is sufficiently large. (iii) Exploiting the existence of spectral gaps of H(A, 0), a Wegner estimate for Landau Hamiltonians with alloy-type random potentials is derived in [12, 4, 5] which proves that ν is absolutely continuous when restricted to intervals between the Landau-level energies. For this result to hold the authors were able to weaken the assumption (4.7) on the size of the support of the single-site potential which our Corollary 4.1 requires.
On the other hand, absolute continuity of ν at all energies is proven in [12] only for bounded random potentials under the present assumptions on the support. (iv) In [64] a Wegner estimate for alloy-type random potentials is derived without assuming a definite sign of the single-site potential. However, this estimate holds only between the Landau-level energies for sufficiently strong magnetic field and does not enable one to deduce the existence of the density of states. (v) In [30] the integrated density of states associated with the restricted random Landau Hamiltonian $P_l H(A, V) P_l$ of a single but arbitrary Landau level is shown to be absolutely continuous for Gaussian random potentials satisfying the assumptions of Corollary 4.3 (for d = 2).

A. On Finite-Volume Schrödinger Operators with Magnetic Fields

For convenience of the reader (and the authors), this appendix defines non-random magnetic Schrödinger operators with Neumann boundary conditions and compiles some of their basic properties. In passing, the more familiar basic properties of the corresponding operators with Dirichlet boundary conditions are briefly recalled, see for example [42, 9]. In particular, we prove a diamagnetic inequality for Neumann Schrödinger operators and Dirichlet–Neumann bracketing for a wide class of vector potentials including singular ones. Altogether, this appendix may be understood to extend some of the results in the key papers [31, 32, 3, 57] to the case of Neumann boundary conditions. Throughout this appendix, $\Lambda \subseteq \mathbb{R}^d$ denotes a non-empty open, not necessarily proper subset of d-dimensional Euclidean space $\mathbb{R}^d$ with d ∈ N. Moreover, $a : \mathbb{R}^d \to \mathbb{R}^d$ stands for a vector potential and $v : \mathbb{R}^d \to \mathbb{R}$ for a scalar potential with $v_\pm := (|v| \pm v)/2$ denoting its positive respectively negative part. We will assume throughout that
$$|a|^2,\ v_+ \in L^1_{\mathrm{loc}}(\mathbb{R}^d). \tag{A.1}$$

The negative part $v_-$ is assumed to be a form perturbation either of $H_{\Lambda,N}(a, 0)$ or even of $H_{\Lambda,N}(0, 0)$. By this we mean that $v_-$ is form-bounded [52, Def. p. 168] with form bound strictly smaller than one either relative to $H_{\Lambda,N}(a, 0)$ or even to $H_{\Lambda,N}(0, 0)$. Both operators will be defined in Lemma A.1 below. The operator $H_{\Lambda,N}(0, 0)$ is the usual Neumann Laplacian, up to a factor of −1/2.

Remark A.1. By the diamagnetic inequality, see Prop. A.2 below, we will see that $v_-$ is a form perturbation of $H_{\Lambda,N}(a, 0)$ if it is one of $H_{\Lambda,N}(0, 0)$. If Λ is a bounded open cube, an easy-to-check sufficient criterion for $v_-$ to be even infinitesimally form-bounded [52, Def. p. 168] relative to $H_{\Lambda,N}(0, 0)$ can be taken from [36, Lemma 2.1] and reads
$$v_- \in L^p_{\mathrm{loc}}(\mathbb{R}^d) \tag{A.2}$$
with p = 1 if d = 1, some p > 1 if d = 2 and some p ≥ d/2 if d ≥ 3.

A.1. Definition of magnetic Neumann Schrödinger operators. In a first step, we consider the case v = 0 and $|a|^2 \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ or, equivalently, $a \in \big( L^2_{\mathrm{loc}}(\mathbb{R}^d) \big)^d$, that is, $a_j \in L^2_{\mathrm{loc}}(\mathbb{R}^d)$ for all j ∈ {1, . . . , d}. We define the sesquilinear form
$$h^{a,0}_{\Lambda,N}(\varphi, \psi) := \frac{1}{2} \sum_{j=1}^{d} \big\langle (i\nabla + a)_j \varphi,\ (i\nabla + a)_j \psi \big\rangle \tag{A.3}$$

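for all ϕ and ψ in its form domain
$$W^{1,2}_a(\Lambda) := \big\{ \phi \in L^2(\Lambda) : (i\nabla + a)\phi \in \big( L^2(\Lambda) \big)^d \big\}, \tag{A.4}$$
which might be called a magnetic Sobolev space, see [39, Sect. 7.20] in case $\Lambda = \mathbb{R}^d$. Here and in the following, ∇ − ia denotes the gauge-covariant gradient in the sense of distributions on $C_0^\infty(\Lambda)$. In particular, this means
$$W^{1,2}_a(\Lambda) = \bigcap_{j=1}^{d} \big\{ \phi \in L^2(\Lambda) : \text{there is } \phi_j \in L^2(\Lambda) \text{ such that } \langle \phi,\ i\partial_j \eta + a_j \eta \rangle = \langle \phi_j, \eta \rangle \text{ for all } \eta \in C_0^\infty(\Lambda) \big\}. \tag{A.5}$$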

Remark A.2. We emphasize that the condition $\psi \in W^{1,2}_a(\Lambda)$ allows for the case that neither ∇ψ nor aψ belongs to $\big( L^2(\Lambda) \big)^d$. In general, $\psi \in W^{1,2}_a(\Lambda)$ only implies $\nabla\psi \in \big( L^1_{\mathrm{loc}}(\Lambda) \big)^d$ and $|\psi| \in W^{1,2}(\Lambda) := \big\{ \phi \in L^2(\Lambda) : \nabla\phi \in \big( L^2(\Lambda) \big)^d \big\}$, the usual first-order Sobolev space of $L^2$-type. The latter statement is a consequence of the diamagnetic inequality, see Remark A.5(iv) below and [59]. If even $|a|^2 \in L^\infty(\mathbb{R}^d)$, the magnetic Sobolev space coincides with the usual one, $W^{1,2}_a(\Lambda) = W^{1,2}(\Lambda)$, up to equivalence of norms. Basic facts about $h^{a,0}_{\Lambda,N}$ are summarized in

Lemma A.1. The form $h^{a,0}_{\Lambda,N}$ is densely defined on $L^2(\Lambda)$, symmetric, positive and closed. It therefore uniquely defines a self-adjoint positive operator $H_{\Lambda,N}(a, 0)$ on $L^2(\Lambda)$ which, up to a factor of −1/2, is called magnetic Neumann Laplacian.

Proof. Since $C_0^\infty(\Lambda) \subset W^{1,2}_a(\Lambda) \subset L^2(\Lambda)$ and $C_0^\infty(\Lambda)$ is dense in $L^2(\Lambda)$, the form $h^{a,0}_{\Lambda,N}$ is densely defined. Its symmetry and positivity are obvious from the definition. To prove that $h^{a,0}_{\Lambda,N}$ is also closed we have to show that the space $W^{1,2}_a(\Lambda)$ is complete with respect to the (metric induced by the form-) norm
$$\sqrt{ \langle \phi, \phi \rangle + h^{a,0}_{\Lambda,N}(\phi, \phi) }. \tag{A.6}$$
To this end, we proceed along the lines of Sects. 7.20 and 7.3 in [39] and let $(\phi_n)_{n \in \mathbb{N}}$ be a sequence in $W^{1,2}_a(\Lambda)$ which is Cauchy with respect to the norm (A.6). By completeness of $L^2(\Lambda)$, there exist functions $\phi, \psi_j \in L^2(\Lambda)$, j ∈ {1, . . . , d}, such that $\phi_n \to \phi$ and $(i\nabla + a)_j \phi_n \to \psi_j$ strongly in $L^2(\Lambda)$ as n → ∞. Since $(i\nabla + a)_j \phi_n \to (i\nabla + a)_j \phi$ in the sense of distributions on $C_0^\infty(\Lambda)$ as n → ∞, we have $(i\nabla + a)_j \phi = \psi_j$ and hence $\phi \in W^{1,2}_a(\Lambda)$. The existence and uniqueness of $H_{\Lambda,N}(a, 0)$ follow now from the one-to-one correspondence between densely defined, symmetric, bounded below, closed forms and self-adjoint, bounded below operators, see [51, Thm. VIII.15].

Remark A.3. (i) We recall that the operator $H_{\Lambda,N}(a, 0)$ has the subspace
$$D\big( H_{\Lambda,N}(a, 0) \big) := \big\{ \psi \in W^{1,2}_a(\Lambda) : \text{there is } \widetilde{\psi} \in L^2(\Lambda) \text{ such that } h^{a,0}_{\Lambda,N}(\varphi, \psi) = \langle \varphi, \widetilde{\psi} \rangle \text{ for all } \varphi \in W^{1,2}_a(\Lambda) \big\} \tag{A.7}$$

of its underlying form domain as its operator domain and acts according to $H_{\Lambda,N}(a, 0)\, \psi = \widetilde{\psi}$. (ii) Let $D_j(a)$ denote the closure of the symmetric operator $C_0^\infty(\Lambda) \ni \psi \mapsto (i\nabla + a)_j \psi \in L^2(\Lambda)$. Being the closure of a symmetric operator, $D_j(a)$ is symmetric. The domain of its adjoint $D_j^\dagger(a)$ is given by
$$D\big( D_j^\dagger(a) \big) := \big\{ \psi \in L^2(\Lambda) : (i\nabla + a)_j \psi \in L^2(\Lambda) \big\}, \tag{A.8}$$
because the adjoint of $C_0^\infty(\Lambda) \ni \psi \mapsto (i\nabla + a)_j \psi$ coincides with that of its closure. While for a proper subset $\Lambda \neq \mathbb{R}^d$ the operator $D_j(a)$ is not self-adjoint, it is so for $\Lambda = \mathbb{R}^d$ [57, Lemma 2.5]. In the latter case it may physically be interpreted, up to a sign, as the j th component of the velocity (operator). By construction the magnetic Neumann Laplacian is a form sum of d operators in accordance with
$$H_{\Lambda,N}(a, 0) = \frac{1}{2} \sum_{j=1}^{d} D_j(a)\, D_j^\dagger(a), \tag{A.9}$$

where the self-adjoint positive operator $D_j(a)\, D_j^\dagger(a)$ comes from the closed form $\langle D_j^\dagger(a)\, \varphi,\ D_j^\dagger(a)\, \psi \rangle$ with form domain (A.8). Note that (A.8) is just the j th set of the intersection on the r.h.s. of (A.5). See also Thm. X.25 in [52]. (iii) Restricting the form $h^{a,0}_{\Lambda,N}$ to the domain $C_0^\infty(\Lambda) \subset W^{1,2}_a(\Lambda)$, one obtains a form which is closable in $W^{1,2}_a(\Lambda)$ with respect to the norm (A.6), see [57, 42, 9]. Its closure $h^{a,0}_{\Lambda,D}$ is uniquely associated with another self-adjoint positive operator $H_{\Lambda,D}(a, 0)$ on $L^2(\Lambda)$ which, up to a factor of −1/2, is called magnetic Dirichlet Laplacian. For general $a \in \big( L^2_{\mathrm{loc}}(\mathbb{R}^d) \big)^d$ the space $C_0^\infty(\Lambda)$ is not contained in $D\big( H_{\Lambda,N}(a, 0) \big)$, see (A.7). As a consequence, $H_{\Lambda,N}(a, 0)$ in general cannot be restricted to $C_0^\infty(\Lambda)$. This stands in contrast to the case a = 0 where $H_{\Lambda,D}(0, 0)$ is the Friedrichs extension of the restriction of $H_{\Lambda,N}(0, 0)$ to $C_0^\infty(\Lambda)$. As the Dirichlet counterpart of (A.9) we only have the inequality $H_{\Lambda,D}(a, 0) \le \frac{1}{2} \sum_{j=1}^{d} D_j^\dagger(a)\, D_j(a)$ which is meant in the sense of forms [53, Def. on p. 269]. The operators $H_{\mathbb{R}^d,N}(a, 0)$ and $H_{\mathbb{R}^d,D}(a, 0)$ are equal, see [57]. (iv) In the free case, which is characterized by a = 0 and v = 0, the just defined operators $H_{\Lambda,D}(0, 0)$ and $H_{\Lambda,N}(0, 0)$ coincide, up to a factor of −1/2, with the usual Dirichlet- and Neumann-Laplacian [53, p. 263], respectively.

In a second and final step, we let $v_+ \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and assume $v_-$ to be a form perturbation of $H_{\Lambda,N}(a, 0)$. As a consequence, the sesquilinear form
$$h^{a,v}_{\Lambda,N}(\varphi, \psi) := h^{a,0}_{\Lambda,N}(\varphi, \psi) + \big\langle v_+^{1/2} \varphi,\ v_+^{1/2} \psi \big\rangle - \big\langle v_-^{1/2} \varphi,\ v_-^{1/2} \psi \big\rangle \tag{A.10}$$
is well defined for all ϕ and ψ in its form domain $Q\big( h^{a,v}_{\Lambda,N} \big) := W^{1,2}_a(\Lambda) \cap Q(v_+)$, where
$$Q(v_+) := \big\{ \phi \in L^2(\Lambda) : v_+^{1/2} \phi \in L^2(\Lambda) \big\}. \tag{A.11}$$
Basic facts about $h^{a,v}_{\Lambda,N}$ are summarized in

Lemma A.2. The form $h^{a,v}_{\Lambda,N}$ is densely defined on $L^2(\Lambda)$, symmetric, bounded below and closed. It therefore uniquely defines a self-adjoint, bounded below operator $H_{\Lambda,N}(a, v)$ on $L^2(\Lambda)$ which is called magnetic Neumann Schrödinger operator.

Proof. The domain $W^{1,2}_a(\Lambda) \cap Q(v_+)$ of $h^{a,v_+}_{\Lambda,N}$ is dense in $L^2(\Lambda)$, because both $W^{1,2}_a(\Lambda)$ and $Q(v_+)$ contain $C_0^\infty(\Lambda)$. Hence $H_{\Lambda,N}(a, v_+)$ is well defined as a form sum of $H_{\Lambda,N}(a, 0)$ and $v_+$. Moreover, $h^{a,v_+}_{\Lambda,N}$ is symmetric, positive and closed, because it is the sum of two such forms. Since $H_{\Lambda,N}(a, 0) \le H_{\Lambda,N}(a, v_+)$, the negative part $v_-$ of v is also a form perturbation of $H_{\Lambda,N}(a, v_+)$. The proof of the lemma is then completed by the KLMN-theorem [52, Thm. X.17].

Remark A.4. Since the form domain of $h^{a,0}_{\Lambda,D}$ is contained in $W^{1,2}_a(\Lambda)$, the negative part $v_-$ of v is also a form perturbation of $H_{\Lambda,D}(a, 0) \le H_{\Lambda,D}(a, v_+)$. Hence one may apply the KLMN-theorem to define, similarly to $H_{\Lambda,N}(a, v)$, what is called the magnetic Dirichlet Schrödinger operator and denoted as $H_{\Lambda,D}(a, v)$.

An immediate consequence of the definition of $H_{\Lambda,X}(a, v)$ is the fact that so-called decoupling and Dirichlet–Neumann bracketing continues to hold for a ≠ 0 as in the case a = 0, see Props. 3 and 4 in Sect. XIII.15 of [53], and [14, 45] for smooth a ≠ 0.

Proposition A.1. Let $|a|^2, v_+ \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and $v_-$ be a form perturbation of $H_{\Lambda,N}(a, 0)$. Moreover, let $\Lambda_1, \Lambda_2 \subset \mathbb{R}^d$ be a disjoint pair of non-empty open sets.

(i) Then the orthogonal decomposition
$$H_{\Lambda_1 \cup \Lambda_2, X}(a, v) = H_{\Lambda_1, X}(a, v) \oplus H_{\Lambda_2, X}(a, v) \tag{A.12}$$
holds for both X = D and X = N on $L^2(\Lambda_1 \cup \Lambda_2) = L^2(\Lambda_1) \oplus L^2(\Lambda_2)$.

(ii) Let $\Lambda := \overline{\Lambda_1 \cup \Lambda_2}^{\,\mathrm{int}}$ be defined as the interior of the closure of the union of $\Lambda_1$ and $\Lambda_2$, and suppose that the interface $\Lambda \setminus (\Lambda_1 \cup \Lambda_2)$ is of d-dimensional Lebesgue measure zero. Then the inequalities
$$H_{\Lambda_1 \cup \Lambda_2, N}(a, v) \le H_{\Lambda, N}(a, v) \le H_{\Lambda, D}(a, v) \le H_{\Lambda_1 \cup \Lambda_2, D}(a, v) \tag{A.13}$$
hold in the sense of forms.

Proof. The proofs of Props. 3 and 4 in Sect. XIII.15 of [53] for the free case carry over to the case a ≠ 0 and v ≠ 0. In particular, the inclusion relations between the various form domains for a = 0 and v = 0 hold analogously for the form domains in the case a ≠ 0 and v ≠ 0.

A.2. Diamagnetic inequality. A useful tool in the study of Schrödinger operators with magnetic fields is

Proposition A.2. Let $\Lambda \subseteq \mathbb{R}^d$ be open, $|a|^2, v_+ \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and $v_-$ be a form perturbation of $H_{\Lambda,N}(0, 0)$. Then $v_-$ is a form perturbation of $H_{\Lambda,N}(a, 0)$ with form bound not exceeding the one for a = 0 and the inequality
$$\big| e^{-t H_{\Lambda,X}(a, v)} \psi \big| \le e^{-t H_{\Lambda,X}(0, v)} |\psi| \tag{A.14}$$
holds for all $\psi \in L^2(\Lambda)$, all t ≥ 0 and both X = D and X = N.
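A finite-dimensional analogue may help to see what (A.14) asserts. The following Python sketch does not use the continuum operators of Prop. A.2. It assumes a hypothetical nearest-neighbour hopping matrix on a ring whose off-diagonal entries carry phases exp(±i·flux/n), with flux = 0 standing in for the case a = 0. For such matrices the pointwise domination of the semigroup follows from comparing the power series of the two exponentials entry by entry. All function names and parameter values are illustrative.

```python
import cmath

def ring_hamiltonian(n_sites, flux):
    """Hermitian hopping matrix on a ring (n_sites >= 3) threaded by `flux`.

    Diagonal entries are 1; nearest-neighbour entries are
    -(1/2) exp(+-i flux / n_sites). flux = 0 models the field-free case.
    """
    theta = flux / n_sites
    h = [[0j] * n_sites for _ in range(n_sites)]
    for j in range(n_sites):
        k = (j + 1) % n_sites
        h[j][j] += 1.0
        h[j][k] += -0.5 * cmath.exp(1j * theta)
        h[k][j] += -0.5 * cmath.exp(-1j * theta)
    return h

def apply_semigroup(h, psi, t, terms=80):
    """Return exp(-t h) psi from a truncated power series (small matrices only)."""
    n = len(psi)
    result = list(psi)
    term = list(psi)
    for m in range(1, terms):
        term = [-t / m * sum(h[i][j] * term[j] for j in range(n)) for i in range(n)]
        result = [r + x for r, x in zip(result, term)]
    return result

def diamagnetic_gap(n_sites, flux, psi, t):
    """Minimum over sites of (exp(-t H(0)) |psi|) - |exp(-t H(flux)) psi|."""
    lhs = apply_semigroup(ring_hamiltonian(n_sites, flux), psi, t)
    rhs = apply_semigroup(ring_hamiltonian(n_sites, 0.0), [abs(x) for x in psi], t)
    return min(r.real - abs(l) for l, r in zip(lhs, rhs))

if __name__ == "__main__":
    psi = [1 + 2j, -0.5j, 0.3, -1.0, 2j, 0.7 - 0.2j]  # arbitrary test vector
    print(diamagnetic_gap(6, 2.0, psi, 0.7))  # non-negative up to rounding
```

A non-negative gap means that, site by site, the modulus of the evolved vector with flux is dominated by the field-free evolution of the modulus, which is the matrix counterpart of (A.14).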

Remark A.5. (i) For the Dirichlet version X = D of the diamagnetic inequality (A.14) to hold, it would be sufficient that $v_-$ is a form perturbation of $H_{\Lambda,D}(0, 0)$. (ii) Inequality (A.14) for $\Lambda = \mathbb{R}^d$ dates back to [31, 56, 28, 32, 3, 59, 57]. It is also known to hold for $\Lambda \neq \mathbb{R}^d$ and X = D, even under the weaker assumptions $|a|^2, v_+ \in L^1_{\mathrm{loc}}(\Lambda)$, see [50, 42]. These assumptions still guarantee that the operators $H_{\Lambda,D}(a, v)$ and $H_{\Lambda,N}(a, v)$ are definable as self-adjoint operators via forms. However, for arbitrary open $\Lambda \neq \mathbb{R}^d$ the proof of (A.14) for X = N would be more complicated than the one which we will give under the stronger assumptions of Prop. A.2. The reason is that a fancier gauge function than that in Lemma A.3 would be needed in order to avoid integration of $a_j$ across the boundary of Λ. For a “simply shaped” Λ, like a cube, such complications do not arise, which implies that our proof would go through for cubes under the weaker assumptions. (iii) If a = 0 inequality (A.14) is equivalent to the assertion that $H_{\Lambda,X}(0, v)$ is the (negative of the) generator of a positivity-preserving one-parameter operator semigroup on $L^2(\Lambda)$, see [52, pp. 186]. For general $a \in \big( L^2_{\mathrm{loc}}(\mathbb{R}^d) \big)^d$ inequality (A.14) asserts that the semigroup generated by $H_{\Lambda,X}(0, v)$ dominates the one generated by $H_{\Lambda,X}(a, v)$. (iv) It follows from [28, 59] that (A.14) is equivalent to the following pair of statements: (a) $\psi \in D\big( H_{\Lambda,X}(a, v) \big)$ implies $|\psi| \in Q\big( h^{0,v}_{\Lambda,X} \big)$, (b) $h^{0,v}_{\Lambda,X}(\varphi, |\psi|) \le \mathrm{Re}\, \big\langle \varphi\, \mathrm{sgn}\, \psi,\ H_{\Lambda,X}(a, v)\, \psi \big\rangle$ for all $\varphi \in Q\big( h^{0,v}_{\Lambda,X} \big)$ with ϕ ≥ 0 and all $\psi \in D\big( H_{\Lambda,X}(a, v) \big)$, where the signum function associated with ψ is defined by (sgn ψ)(x) := ψ(x)/|ψ(x)| ∈ C if ψ(x) ≠ 0 and zero otherwise. If a = 0 these statements boil down to a Beurling–Deny criterion [17, Thm. 1.3.2] for $H_{\Lambda,X}(0, v)$ which guarantees that it generates a positivity-preserving semigroup. Inequality (b) with X = N and v = 0 basically corresponds to the germinal distributional inequality of Kato, which he proved [31] for $a \in \big( C^1(\mathbb{R}^d) \big)^d$.

In case $\Lambda \neq \mathbb{R}^d$ and X = N, we are not aware of a reference proving (A.14) or (a) and (b) for singular a. Our proof of the diamagnetic inequality (A.14) for X = N will mimic the proof in [57], where the case $\Lambda = \mathbb{R}^d$ is considered, see also Sect. 1.3 in [16]. It relies on the fact that for one dimension the vector potential can be removed by a gauge transformation. More precisely, for each j ∈ {1, . . . , d} the operator $D_j^\dagger(a)$ is unitarily equivalent to $D_j^\dagger(0)$.

Lemma A.3. Let $|a|^2 \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and define a (gauge) function $\lambda_j : \mathbb{R}^d \to \mathbb{R}$ through
$$\lambda_j(x) := \int_0^{x_j} dy_j\, a_j\big( x_1, \ldots, x_{j-1}, y_j, x_{j+1}, \ldots, x_d \big). \tag{A.15}$$
For open $\Lambda \subseteq \mathbb{R}^d$ it induces a densely defined and self-adjoint multiplication operator $\lambda_j$ on $L^2(\Lambda)$. The corresponding unitary operator $e^{-i\lambda_j}$ maps $D\big( D_j^\dagger(a) \big)$ onto $D\big( D_j^\dagger(0) \big)$, recall (A.8), and one has
$$D_j^\dagger(a)\, \psi = e^{i\lambda_j}\, D_j^\dagger(0)\, e^{-i\lambda_j}\, \psi \quad \text{for all } \psi \in D\big( D_j^\dagger(a) \big). \tag{A.16}$$
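The gauge relation (A.16) can be checked numerically in one dimension, where $D^\dagger(a)$ acts as i d/dx + a. The following Python sketch assumes a smooth, purely illustrative vector potential a(x) = cos x + 0.3x with gauge function λ(x) = sin x + 0.15x² (its integral from 0, as in (A.15)) and a hypothetical test function ψ. The left-hand side uses the exact derivative of ψ, the right-hand side a central finite difference, so agreement is only up to discretization error.

```python
import cmath
import math

def a(x):
    """Illustrative smooth vector potential (an assumption, not from the text)."""
    return math.cos(x) + 0.3 * x

def lam(x):
    """Gauge function: integral of a from 0 to x, as in (A.15) for d = 1."""
    return math.sin(x) + 0.15 * x * x

def psi(x):
    """Hypothetical complex-valued test function."""
    return cmath.exp(-0.5 * x * x) * (1 + 1j * x)

def dpsi(x):
    """Exact derivative of psi."""
    return cmath.exp(-0.5 * x * x) * (1j - x * (1 + 1j * x))

def lhs(x):
    """(i d/dx + a) psi, evaluated exactly."""
    return 1j * dpsi(x) + a(x) * psi(x)

def rhs(x, h=1e-5):
    """e^{i lam} (i d/dx) (e^{-i lam} psi), with a central difference for d/dx."""
    g = lambda y: cmath.exp(-1j * lam(y)) * psi(y)
    return cmath.exp(1j * lam(x)) * 1j * (g(x + h) - g(x - h)) / (2 * h)

def max_gauge_error(points):
    """Largest |lhs - rhs| over the given sample points."""
    return max(abs(lhs(x) - rhs(x)) for x in points)

if __name__ == "__main__":
    grid = [-2.0 + 0.25 * k for k in range(17)]
    print(max_gauge_error(grid))
```

The residual is at the level of the finite-difference error, in line with the statement that the vector potential can be gauged away in each single coordinate direction.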

Proof. Fubini’s theorem and the Cauchy–Schwarz inequality show that $\lambda_j \in L^2_{\mathrm{loc}}(\mathbb{R}^d)$. Therefore, the induced multiplication operator on its maximal domain $D(\lambda_j) := \big\{ \psi \in L^2(\Lambda) : \lambda_j \psi \in L^2(\Lambda) \big\} \supset C_0^\infty(\Lambda)$ is densely defined and self-adjoint. Moreover, since $\psi \in D\big( D_j^\dagger(a) \big)$ implies $\nabla_j \psi \in L^1_{\mathrm{loc}}(\Lambda)$, we are allowed to use the product and chain rule for distributional derivatives [26, pp. 150] which yield $\nabla_j \big( e^{-i\lambda_j} \psi \big) = e^{-i\lambda_j} \nabla_j \psi - e^{-i\lambda_j}\, i a_j \psi$.

Proof of Prop. A.2. For X = D see [50, 42, 9]. The proof for X = N consists of three steps.

In the first step, we assume $v \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ to be bounded from below. In this case $H_{\Lambda,N}(a, v)$ is a form sum of d + 1 operators each of which is bounded from below, recall Remark A.3(ii) and Lemma A.2. Hence we may employ the strong Lie–Trotter product formula generalized to form sums of several operators [33] and write
$$e^{-t H_{\Lambda,N}(a, v)} = \operatorname*{s-lim}_{n \to \infty} \Big( e^{-t D_1(a) D_1^\dagger(a)/2n} \cdots e^{-t D_d(a) D_d^\dagger(a)/2n}\, e^{-t v/n} \Big)^n. \tag{A.17}$$
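The convergence expressed by a Lie–Trotter product formula can be observed for matrices. The following Python sketch is only a finite-dimensional illustration with two summands: the 2×2 positive semi-definite matrices in the usage lines are hypothetical, and nothing here reproduces the form-sum setting of (A.17). The matrix exponential is computed with a simple scaling-and-squaring Taylor series written for this example.

```python
def mat_mul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(x, terms=40, squarings=10):
    """exp(x) via a truncated Taylor series of x / 2^squarings, then repeated squaring."""
    n = len(x)
    scale = 2.0 ** squarings
    small = [[v / scale for v in row] for row in x]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[v / m for v in row] for row in mat_mul(term, small)]
        result = [[r + v for r, v in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def trotter(a, b, t, n):
    """(exp(-t a / n) exp(-t b / n))^n."""
    step = mat_mul(mat_exp([[-t * v / n for v in row] for row in a]),
                   mat_exp([[-t * v / n for v in row] for row in b]))
    out = [[float(i == j) for j in range(len(a))] for i in range(len(a))]
    for _ in range(n):
        out = mat_mul(out, step)
    return out

def trotter_error(a, b, t, n):
    """Largest entrywise deviation of the Trotter product from exp(-t (a + b))."""
    total = [[-t * (u + v) for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]
    exact = mat_exp(total)
    approx = trotter(a, b, t, n)
    return max(abs(p - q) for rp, rq in zip(approx, exact) for p, q in zip(rp, rq))

if __name__ == "__main__":
    a = [[1.0, 0.0], [0.0, 3.0]]  # hypothetical, positive semi-definite
    b = [[1.0, 1.0], [1.0, 1.0]]  # hypothetical, does not commute with a
    for n in (1, 10, 100, 1000):
        print(n, trotter_error(a, b, 1.0, n))
```

For non-commuting summands the error of this unsymmetrized product decreases roughly like 1/n, while for commuting summands the product is exact up to rounding.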

Gauge equivalence (A.16) now shows that
$$e^{-t D_j(a) D_j^\dagger(a)/2n} = e^{i\lambda_j}\, e^{-t D_j(0) D_j^\dagger(0)/2n}\, e^{-i\lambda_j} \tag{A.18}$$
for all j ∈ {1, . . . , d} and all t ≥ 0. By the distributional inequality $\big| \nabla_j |\psi| \big| \le \big| \nabla_j \psi \big|$, valid for all $\psi \in D\big( D_j^\dagger(0) \big)$ [39, Thm. 6.17], the operator $D_j(0)\, D_j^\dagger(0)$ obeys a Beurling–Deny criterion [17, Thm. 1.3.2] and hence is the generator of a positivity-preserving semigroup. It follows that
$$\big| e^{-t D_j(a) D_j^\dagger(a)/2n} \psi \big| \le e^{-t D_j(0) D_j^\dagger(0)/2n} |\psi| \tag{A.19}$$
for all $\psi \in L^2(\Lambda)$ and all t ≥ 0. This together with (A.17) implies the assertion (A.14) (with X = N) for scalar potentials $v \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ which are bounded from below.

In the second step, we prove that if $v_-$ is a form perturbation of $H_{\Lambda,X}(0, 0)$ then it is also one of $H_{\Lambda,X}(a, 0)$ with form bound not exceeding the one for a = 0 (see [3] or [58, Thm. 15.10] for the case $\Lambda = \mathbb{R}^d$). This follows from (A.23) below with v = 0 and α = 1/2 together with the fact that the form bound of $v_-$ relative to $H_{\Lambda,X}(a, 0)$ can be expressed as
$$\lim_{E \to \infty} \Big\| \big( H_{\Lambda,X}(a, 0) + E \big)^{-1/2}\, v_-\, \big( H_{\Lambda,X}(a, 0) + E \big)^{-1/2} \Big\|, \tag{A.20}$$
see [16, Prop. 1.3(ii)]. Here ‖·‖ denotes the (uniform) norm of bounded operators on $L^2(\Lambda)$.

In the third step, we extend the validity of (A.14) (with X = N) to scalar potentials v with $v_+ \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and $v_-$ being a form perturbation of $H_{\Lambda,N}(0, 0)$. To this end, we approximate v by $v_n$ defined through $v_n(x) := \max\{-n, v(x)\}$, $x \in \mathbb{R}^d$, n ∈ N. Monotone convergence for forms [51, Thm. S.16] yields the convergence of $H_{\Lambda,N}(a, v_n)$ to $H_{\Lambda,N}(a, v)$ in the strong resolvent sense as n → ∞. It follows that
$$\operatorname*{s-lim}_{n \to \infty} e^{-t H_{\Lambda,N}(a, v_n)} = e^{-t H_{\Lambda,N}(a, v)} \tag{A.21}$$
for all t ≥ 0. Since (A.14) (with X = N) holds for each $v_n$ by the first step, the proof is complete.

A.3. Some consequences. We list some immediate consequences of the diamagnetic inequality. For this purpose, we assume the situation of Prop. A.2.

(i) Powers of the resolvent of the self-adjoint operator $H_{\Lambda,X}(a, v)$ may be expressed in terms of its semigroup by using the functional calculus. This gives the integral representation
$$\big( H_{\Lambda,X}(a, v) - z \big)^{-\alpha} = \frac{1}{(\alpha - 1)!} \int_0^\infty dt\, t^{\alpha - 1}\, e^{tz}\, e^{-t H_{\Lambda,X}(a, v)}, \tag{A.22}$$
which is valid for all α > 0, all z ∈ C with Re z < inf spec $H_{\Lambda,X}(a, v)$ and both X = D and X = N. Here α → (α − 1)! denotes Euler’s gamma function [27]. Inequality (A.14) then implies the diamagnetic inequality for powers of the resolvent
$$\Big| \big( H_{\Lambda,X}(a, v) - z \big)^{-\alpha} \psi \Big| \le \big( H_{\Lambda,X}(0, v) - \mathrm{Re}\, z \big)^{-\alpha} |\psi|, \tag{A.23}$$
valid for all $\psi \in L^2(\Lambda)$ and all z ∈ C with Re z < inf spec $H_{\Lambda,X}(0, v)$. We recall [55] that the ground-state energy goes up when the magnetic field is turned on, in symbols, inf spec $H_{\Lambda,X}(0, v)$ ≤ inf spec $H_{\Lambda,X}(a, v)$. This follows from Remark A.5(iv)(b) or inequality (A.24) below if its r.h.s. is finite.

(ii) If $H_{\Lambda,X}(0, v)$ has purely discrete spectrum or, equivalently [53, Thm. XIII.64], has compact resolvent, the Dodds-Fremlin-Pitt theorem [3, Thm. 2.2] together with (A.23) implies that $H_{\Lambda,X}(a, v)$ has also compact resolvent and hence purely discrete spectrum. In turn, $H_{\Lambda,X}(0, v)$ has purely discrete spectrum if the free operator $H_{\Lambda,X}(0, 0)$ has and if v is a form perturbation of $H_{\Lambda,X}(0, 0)$ [53, Thm. XIII.68]. While $H_{\Lambda,D}(0, 0)$ has purely discrete spectrum for arbitrary bounded open $\Lambda \subset \mathbb{R}^d$, $H_{\Lambda,N}(0, 0)$ only does so if Λ possesses an additional property, for example the segment property, see [53, pp. 255]. For example, if Λ is a bounded open cube the spectra of $H_{\Lambda,D}(a, -v_-)$ and $H_{\Lambda,N}(a, -v_-)$ are both purely discrete. Moreover, by the min-max principle the addition of the positive multiplication operator $v_+$ to $H_{\Lambda,X}(a, -v_-)$ cannot create essential spectrum. As a consequence, $H_{\Lambda,X}(a, v)$ has purely discrete spectrum for both X = D and X = N if Λ is a bounded open cube.

(iii) The diamagnetic inequality (A.14) together with Lemma 15.11 in [58] implies the diamagnetic inequality for partition functions
$$\mathrm{Tr}\, e^{-t H_{\Lambda,X}(a, v)} \le \mathrm{Tr}\, e^{-t H_{\Lambda,X}(0, v)} \tag{A.24}$$
for all t > 0 and both X = D and X = N, provided that the r.h.s. is finite. The latter is the case if Λ is a bounded open cube, for example. This follows from Dirichlet–Neumann bracketing (see (A.13) with a = 0), the facts that $v_+ \ge 0$ and $v_-$ is a form perturbation of $H_{\Lambda,N}(0, 0)$, and the finiteness of the free Neumann partition function (see [36, Prop. 2.1(c)] or (4.5)).

Acknowledgement. It is a pleasure to thank Kurt Broderix, Dirk Hundertmark, Thomas Hoffmann-Ostenhof and Georgi D. Raikov for helpful remarks and stimulating discussions. This work was supported by the Deutsche Forschungsgemeinschaft under grant nos. Le 330/10 and Le 330/12. The latter is a project within the Schwerpunktprogramm “Interagierende stochastische Systeme von hoher Komplexität” (DFG Priority Programme SPP 1033).

Note added in proof. After submission of the present paper we learned of the interesting paper The $L^p$-theory of the spectral shift function, the Wegner estimate, and the integrated density of states for some random operators, Commun. Math. Phys. 218, 113–130 (2001), by J. M. Combes, P. D. Hislop and S. Nakamura. Among other things, their approach yields Wegner estimates for rather general magnetic fields and certain bounded random potentials. While these estimates do not imply absolute continuity of the integrated density of states, they yield Hölder continuity of arbitrary order strictly smaller than one. The recent preprint The integrated density of states for some random operators with nonsign definite potentials, mp_arc 01-139 (2001), by P. D. Hislop and F. Klopp extends part of this result to single-site potentials taking values of both signs.

References 1. Adler, R.J.: The geometry of random fields. Chichester: Wiley, 1981 2. Ando, T., Fowler, A.B., Stern, F.: Electronic properties of two-dimensional systems. Rev. Mod. Phys. 54, 437–672 (1982) 3. Avron, J., Herbst, I., Simon, B.: Schrödinger operators with magnetic fields. I. General interactions. Duke Math. J. 45, 847–883 (1978) 4. Barbaroux, J.-M., Combes, J.M., Hislop, P.D.: Localization near band edges for random Schrödinger operators. Helv. Phys. Acta 70, 16–43 (1997) 5. Barbaroux, J.-M., Combes, J.M., Hislop, P.D.: Landau Hamiltonians with unbounded random potentials. Lett. Math. Phys. 40, 335–369 (1997) 6. Bauer, H.: Maß- und Integrationstheorie. 2. Auflage, Berlin: de Gruyter, 1992 [in German] English translation to appear 7. Bonch-Bruevich,V.L., Enderlein, R., Esser, B., Keiper, R., Mironov,A.G., Zvyagin, I.P.: Elektronentheorie ungeordneter Halbleiter. Berlin: VEB Deutscher Verlag der Wissenschaften, 1984 [in German. Russian original: Moscow: Nauka, 1981] 8. Broderix, K., Hundertmark, D., Leschke, H.: Self-averaging, decomposition and asymptotic properties of the density of states for random Schrödinger operators with constant magnetic field. In: Path integrals from meV to MeV: Tutzing ’92. Grabert, H., Inomata, A., Schulman, L.S., Weiss, U. (eds.), Singapore: World Scientific, 1993, pp. 98–107 9. Broderix, K., Hundertmark, D., Leschke, H.: Continuity properties of Schrödinger semigroups with magnetic fields. Rev. Math. Phys. 12, 181–225 (2000) 10. Carmona, R., Lacroix, J.: Spectral theory of random Schrödinger operators. Boston: Birkhäuser, 1990 11. Combes, J.M., Hislop, P.D.: Localization for some continuous, random Hamiltonians in d-dimensions. J. Funct. Anal. 124, 149–180 (1994) 12. Combes, J.M., Hislop, P.D.: Landau Hamiltonians with random potentials: Localization and the density of states. Commun. Math. Phys. 177, 603–629 (1996) 13. 
Combes, J.M., Hislop, P.D., Mourre, E.: Spectral averaging, perturbation of singular spectra, and localization. Trans. Am. Math. Soc. 348, 4883–4894 (1996) 14. Combes, J.M., Schrader, R., Seiler, R.: Classical bounds and limits for energy distributions of Hamilton operators in electromagnetic fields. Ann. Phys. (N.Y.) 111, 1–18 (1978) 15. Craig, W., Simon, B.: Log Hölder continuity of the integrated density of states for stochastic Jacobi matrices. Commun. Math. Phys. 90, 207–218 (1983) 16. Cycon, H.L., Froese, R.G., Kirsch, W., Simon, B.: Schrödinger operators. Berlin: Springer, 1987 17. Davies, E.B.: Heat kernels and spectral theory. Paperback edition, Cambridge: Cambridge Univ. Press, 1990 18. Delyon, F., Souillard, B.: Remark on the continuity of the density of states of ergodic finite difference operators. Commun. Math. Phys. 94, 289–291 (1984) 19. Doi, S., Iwatsuka, A., Mine, T.: The uniqueness of the integrated density of states for the Schrödinger operators with magnetic fields. Math. Z. 237, 335–371 (2001) 20. Dorlas, T.C., Macris, N., Pulé, J.V.: Characterization of the spectrum of the Landau Hamiltonian with delta impurities. Commun. Math. Phys. 204, 367–396 (1999) 21. Droese, J., Kirsch, W.: The effect of boundary conditions on the density of states for random Schrödinger operators. Stochastic Processes Appl. 23, 169–175 (1986) 22. Fernique, X.M.: Regularité des trajectoires des fonctions aléatoires Gaussiennes. In: Ecole d’Eté de Probabilités de Saint-Flour IV - 1974. Hennequin, P.-L. (ed.), Lecture Notes in Mathematics 480, Berlin: Springer, 1975, pp. 1–96 [in French] 23. Fischer, W., Hupfer, T., Leschke, H., Müller, P.: Existence of the density of states for multi-dimensional continuum Schrödinger operators with Gaussian random potentials. Commun. Math. Phys. 190, 133–141 (1997)

Density of States for Random Schrödinger Operators with Magnetic Fields

253

24. Fischer, W., Leschke, H., Müller, P.: Spectral localization by Gaussian random potentials in multidimensional continuous space. J. Stat. Phys. 101, 935–985 (2000)
25. Fock, V.: Bemerkung zur Quantelung des harmonischen Oszillators im Magnetfeld. Z. Physik 47, 446–448 (1928) [in German]
26. Gilbarg, D., Trudinger, N.S.: Elliptic partial differential equations of second order. 2nd edition, Berlin: Springer, 1983
27. Gradshteyn, I.S., Ryzhik, I.M.: Table of integrals, series, and products. Corrected and enlarged edition, San Diego: Academic, 1980
28. Hess, H., Schrader, R., Uhlenbrock, D.A.: Domination of semigroups and generalization of Kato’s inequality. Duke Math. J. 44, 893–904 (1977)
29. Hupfer, T., Leschke, H., Müller, P., Warzel, S.: Existence and uniqueness of the integrated density of states for Schrödinger operators with magnetic fields and unbounded random potentials. e-print math-ph/0010013 (2000)
30. Hupfer, T., Leschke, H., Warzel, S.: Upper bounds on the density of states of single Landau levels broadened by Gaussian random potentials. e-print math-ph/0011010 (2000)
31. Kato, T.: Schrödinger operators with singular potentials. Israel J. Math. 13, 135–148 (1972)
32. Kato, T.: Remarks on Schrödinger operators with vector potentials. Integral Equations Oper. Theory 1, 103–113 (1978)
33. Kato, T., Masuda, K.: Trotter’s product formula for nonlinear semigroups generated by the subdifferentials of convex functionals. J. Math. Soc. Japan 30, 169–178 (1978)
34. Kirsch, W.: Random Schrödinger operators: A course. In: Schrödinger operators. Holden, H., Jensen, A. (eds.), Lecture Notes in Physics 345, Berlin: Springer, 1989, pp. 264–370
35. Kirsch, W., Martinelli, F.: On the ergodic properties of the spectrum of general random operators. J. Reine Angew. Math. 334, 141–156 (1982)
36. Kirsch, W., Martinelli, F.: On the density of states of Schrödinger operators with a random potential. J. Phys. A 15, 2139–2156 (1982)
37. Kukushkin, I.V., Meshkov, S.V., Timofeev, V.B.: Two-dimensional electron density of states in a transverse magnetic field. Sov. Phys. Usp. 31, 511–534 (1988) [Russian original: Usp. Fiz. Nauk 155, 219–264 (1988)]
38. Landau, L.: Diamagnetismus der Metalle. Z. Physik 64, 629–637 (1930) [in German]
39. Lieb, E.H., Loss, M.: Analysis. Providence, Rhode Island: Am. Math. Soc., 1997
40. Lifshits, I.M., Gredeskul, S.A., Pastur, L.A.: Introduction to the theory of disordered systems. New York: Wiley, 1988 [Russian original: Moscow: Nauka, 1982]
41. Lifshits, M.A.: Gaussian random functions. Dordrecht: Kluwer, 1995
42. Liskevitch, V., Manavi, A.: Dominated semigroups with singular complex potentials. J. Funct. Anal. 151, 281–305 (1997)
43. Matsumoto, H.: On the integrated density of states for the Schrödinger operators with certain random electromagnetic potentials. J. Math. Soc. Japan 45, 197–214 (1993)
44. Mohamed, A., Raikov, G.D.: On the spectral theory of the Schrödinger operator with electromagnetic potential. In: Pseudo-differential calculus and mathematical physics. Demuth, M., Schrohe, E., Schulze, B.-W. (eds.), Berlin: Akademie, 1994, pp. 298–390
45. Nakamura, S.: A remark on the Dirichlet–Neumann decoupling and the integrated density of states. J. Funct. Anal. 179, 136–152 (2001)
46. Nakao, S.: On the spectral distribution of the Schrödinger operator with random potential. Japan. J. Math. 3, 111–139 (1977)
47. Pastur, L.: On the Schrödinger equation with a random potential. Theor. Math. Phys. 6, 299–306 (1971) [Russian original: Teor. Mat. Fiz. 6, 415–424 (1971)]
48. Pastur, L.: Spectral properties of disordered systems in the one-body approximation. Commun. Math. Phys. 75, 179–196 (1980)
49. Pastur, L., Figotin, A.: Spectra of random and almost-periodic operators. Berlin: Springer, 1992
50. Perelmuter, M.A., Semenov, Yu.A.: On decoupling of finite singularities in the scattering theory for the Schrödinger operator with a magnetic field. J. Math. Phys. 22, 521–533 (1981)
51. Reed, M., Simon, B.: Methods of modern mathematical physics I: Functional analysis. Revised and enlarged edition, San Diego: Academic, 1980
52. Reed, M., Simon, B.: Methods of modern mathematical physics II: Fourier analysis, self-adjointness. New York: Academic, 1975
53. Reed, M., Simon, B.: Methods of modern mathematical physics IV: Analysis of operators. New York: Academic, 1978
54. Shklovskii, B.I., Efros, A.L.: Electronic properties of doped semiconductors. Berlin: Springer, 1984 [Russian original: Moscow: Nauka, 1979]
55. Simon, B.: Universal diamagnetism of spinless Bose systems. Phys. Rev. Lett. 36, 1083–1084 (1976)
56. Simon, B.: An abstract Kato’s inequality for generators of positivity preserving semigroups. Ind. Math. J. 26, 1067–1073 (1977)


57. Simon, B.: Maximal and minimal Schrödinger forms. J. Operator Theory 1, 37–47 (1979)
58. Simon, B.: Functional integration and quantum physics. New York: Academic, 1979
59. Simon, B.: Kato’s inequality and the comparison of semigroups. J. Funct. Anal. 32, 97–101 (1979)
60. Simon, B.: Schrödinger operators in the twenty-first century. In: Mathematical Physics 2000. Fokas, A., Grigoryan, A., Kibble, T., Zegarlinski, B. (eds.), London: Imperial College Press, 2000, pp. 283–288
61. Stollmann, P.: Caught by disorder: Bound states in random media. Boston: Birkhäuser, 2001
62. Ueki, N.: On spectra of random Schrödinger operators with magnetic fields. Osaka J. Math. 31, 177–187 (1994)
63. Veselić, I.: Wegner estimate for some indefinite Anderson-type Schrödinger operators. e-print mp_arc 00-373 (2000)
64. Wang, W.-M.: Microlocalization, percolation, and Anderson localization for the magnetic Schrödinger operator with a random potential. J. Funct. Anal. 146, 1–26 (1997)
65. Wegner, F.: Bounds on the density of states in disordered systems. Z. Phys. B 44, 9–15 (1981)
66. Weyl, H.: Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller Differentialgleichungen (mit einer Anwendung auf die Theorie der Hohlraumstrahlung). Math. Ann. 71, 441–479 (1912) [in German]
67. Zak, J.: Magnetic translation group. Phys. Rev. 134, A1602–A1606 (1964)

Communicated by B. Simon

Commun. Math. Phys. 221, 255 – 265 (2001)


Eigenvalues of the Dirac Operator on Manifolds with Boundary

Oussama Hijazi (1), Sebastián Montiel (2), Xiao Zhang (3)

1 Institut Élie Cartan, Université Henri Poincaré, Nancy I, B.P. 239, 54506 Vandœuvre-Lès-Nancy Cedex, France.
2 Departamento de Geometría y Topología, Universidad de Granada, 18071 Granada, Spain.
3 Institute of Mathematics, Academy of Mathematics and Systems Sciences, Chinese Academy of Sciences, Beijing 100080, P.R. China.

Received: 22 August 2000 / Accepted: 15 March 2001

Abstract: Under standard local boundary conditions or certain global APS boundary conditions, we get lower bounds for the eigenvalues of the Dirac operator on compact spin manifolds with boundary. For the local boundary conditions, limiting cases are characterized by the existence of real Killing spinors and the minimality of the boundary.

1. Introduction

It is well known that the spectrum of the Dirac operator on closed spin manifolds detects subtle information on the geometry and the topology of such manifolds (see for example [6, 8]). In [31, 33, 27, 30], basic properties of the hypersurface Dirac operator are established. This hypersurface Dirac operator appears as the boundary term in the integral Schrödinger–Lichnerowicz formula (2.3) for compact spin manifolds with compact boundary. In fact, the hypersurface Dirac operator is, up to a zero order operator, the intrinsic Dirac operator of the boundary.

In this paper, we examine the classical local boundary conditions and certain Atiyah–Patodi–Singer boundary conditions for the Dirac operator. Here, the spectral resolution of the intrinsic Dirac operator of the boundary is used to define the APS boundary conditions. We first prove self-adjointness and ellipticity of such conditions. Then, systematic use of the modified Levi–Civita connections, introduced in [10, 24, 33, 11, 27, 30], is made (see also [28, 15] for the Dirac operators on submanifolds). Under appropriate curvature assumptions, these modified connections, combined with formula (2.3), yield the corresponding estimates for compact spin manifolds with boundary. The limiting cases are then studied.

Such estimates are obtained in Sects. 3 and 4. In Sect. 3 we consider both the local and the above mentioned APS boundary conditions. We first introduce the modified connection (3.1), which allows us to establish a Friedrich-type inequality in case the mean curvature of the boundary is nonnegative. Under the local boundary conditions, the limiting case is then characterized by the existence of a Killing spinor on the compact manifold with minimal boundary (see (3.5)). Then the energy-momentum tensor is used to define the modified connection (3.7), from which one can deduce inequality (3.9). Finally, in Sect. 4, under the local boundary conditions, the conformal aspect is examined. For example, generalizations of the conformal lower bounds in [22, 24] are obtained (see Remark 9).

It might be useful to mention that local and global boundary conditions are introduced in [25] to get optimal extrinsic lower bounds for the first nonnegative eigenvalue of the intrinsic Dirac operator of the boundary. Moreover, in [26], the conformal aspect of this setup is examined where a conformal extrinsic lower bound is given.

2. The Elliptic Boundary Conditions

Let M be an n-dimensional Riemannian spin manifold with boundary ∂M endowed with its induced Riemannian and spin structures. Denote by S the spinor bundle of M. Let $\nabla$ (resp. $\nabla^{\partial M}$) be the Levi–Civita connection of M (resp. ∂M) and denote by the same symbol their corresponding lift to the spinor bundle S. Consider the Dirac operator D of M defined by $\nabla$ on S. It is known [29] that there exists a positive definite Hermitian metric on S which satisfies, for any covector field $X^* \in \Gamma(T^*M)$, and any spinor fields $\varphi, \psi \in \Gamma(S)$, the relation
\[ (X^*\cdot\varphi,\ X^*\cdot\psi) = |X^*|^2(\varphi,\psi), \tag{2.1} \]

where “·” denotes Clifford multiplication. The connection $\nabla$ is compatible with the metric ( , ). Fix a point $p \in \partial M$ and an orthonormal basis $\{e_\alpha\}$ of $T_pM$ with $e_0$ the outward normal to ∂M and $e_i$ tangent to ∂M such that for $1 \le i, j \le n$,
\[ (\nabla^{\partial M}_i e_j)_p = (\nabla_0 e_j)_p = 0. \]
Let $\{e^\alpha\}$ be the dual coframe. Then, for $1 \le i, j \le n$,
\[ (\nabla_i e^j)_p = -h_{ij}\,e^0, \qquad (\nabla_i e^0)_p = h_{ij}\,e^j, \]
where $h_{ij} = (\nabla_i e_0, e_j)$ are the components of the second fundamental form at p, and we have
\[ \nabla_i = \nabla^{\partial M}_i + \frac12\, h_{ij}\, e^0\cdot e^j\cdot . \tag{2.2} \]
Let $H = \sum_i h_{ii}$ be the unnormalized mean curvature of M. In the above notation, the standard sphere $S^n_r = \partial B^{n+1}_r$ has positive mean curvature $H = \frac{n}{r}$. By (2.1), $(e^0\cdot e^j\cdot\varphi, \psi) = (\varphi, e^j\cdot e^0\cdot\psi)$. Therefore (2.2) implies
\[ d(\varphi,\psi)\wedge * e^i = \big[(\nabla_i\varphi,\psi) + (\varphi,\nabla_i\psi)\big] * 1 = \big[(\nabla^{\partial M}_i\varphi,\psi) + (\varphi,\nabla^{\partial M}_i\psi)\big] * 1. \]
Hence the connection $\nabla^{\partial M}$ is also compatible with the metric ( , ). Denote by $D^{\partial M}$ the Dirac operator of ∂M. In the above orthonormal coframe $\{e^i\}$ of M, $D^{\partial M} = e^i\cdot\nabla^{\partial M}_i$. Thus $D^{\partial M}$ is self-adjoint with respect to the metric ( , ). The relation (2.2) implies that
\[ \nabla^{\partial M}_i(e^0\cdot\varphi) = e^0\cdot\nabla^{\partial M}_i\varphi. \]

Hence
\[ D^{\partial M}(e^0\cdot\varphi) = -e^0\cdot D^{\partial M}\varphi. \]
Consider the integral form of the Schrödinger–Lichnerowicz formula for a compact manifold with compact boundary
\[ \int_{\partial M}(\varphi,\ e^0\cdot D^{\partial M}\varphi) - \frac12\int_{\partial M} H|\varphi|^2 = \int_M\Big(|\nabla\varphi|^2 + \frac{R}{4}|\varphi|^2 - |D\varphi|^2\Big). \tag{2.3} \]
It is well-known that there are basically two types of elliptic boundary conditions for the Dirac operator: the local boundary condition and the (global) Atiyah–Patodi–Singer (APS) boundary condition. Such boundary conditions are used in the positive mass theorem for black holes, the Penrose conjecture in general relativity and the index theory in topology [13, 14, 20, 21, 34]. The APS boundary condition exists on any spin manifold with boundary [2–4] (see also [16–19]), while the local boundary condition requires certain additional structures on manifolds such as the existence of a Lorentzian structure or a chirality operator, etc. [12, 13, 21].

Now we shall show that the local boundary condition exists on certain spin manifolds with a “boundary chirality operator”. An operator $\Gamma$ defined on $C^\infty(\partial M, S|_{\partial M})$ is said to be a boundary chirality operator if it satisfies the following conditions:
\[ \Gamma^2 = \mathrm{Id}, \tag{2.4} \]
\[ \nabla^{\partial M}_{e_i}\Gamma = 0, \tag{2.5} \]
\[ e^0\cdot\Gamma = -\Gamma\cdot e^0, \tag{2.6} \]
\[ e^i\cdot\Gamma = \Gamma\cdot e^i, \tag{2.7} \]
\[ (\Gamma\cdot\varphi,\ \Gamma\cdot\psi) = (\varphi,\psi). \tag{2.8} \]
If M is a spacelike hypersurface of a spacetime manifold with timelike covector T, then we can let $\Gamma = T\cdot e^0$, where $e^0$ is the normal covector on ∂M. Recall that (see [12] for example) an operator F defined on $C^\infty(M, S)$ is called a chirality operator on M if for all $X^* \in \Gamma(T^*M)$, and any spinor fields $\varphi, \psi \in \Gamma(S)$, one has
\[ F^2 = \mathrm{Id}, \qquad \nabla_X F = 0, \qquad X^*\cdot F = -F\cdot X^*, \qquad (F\cdot\varphi,\ F\cdot\psi) = (\varphi,\psi). \]
Note that such an operator exists if the spin manifold M is even dimensional. It is easy to see that if M has a chirality operator F, then $\Gamma = F|_{\partial M}\cdot e^0$ is a boundary chirality operator.

In this paper, we consider the following boundary conditions:

• The local boundary condition. As the eigenvalues of the chirality operator $\Gamma$ are ±1, the corresponding eigenspaces
\[ \Gamma^{loc}_+ = \{\varphi \in C^\infty(\partial M, S|_{\partial M}),\ \Gamma\cdot\varphi = \varphi\}, \qquad \Gamma^{loc}_- = \{\varphi \in C^\infty(\partial M, S|_{\partial M}),\ \Gamma\cdot\varphi = -\varphi\} \]
provide local boundary conditions.


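• The APS type boundary condition. The operator $e^0\cdot D^{\partial M}$ is self-adjoint with respect to the induced metric ( , ) on ∂M. Therefore it has a discrete (real) spectrum. Let $(\varphi_k)_{k\in\mathbb{N}}$ be the spectral resolution of $e^0\cdot D^{\partial M}$, i.e.,
\[ e^0\cdot D^{\partial M}\varphi_k = \lambda_k\varphi_k, \]
and consider the corresponding $L^2$-orthogonal subspaces $\Gamma^{APS}_\pm$ spanned by the positive and negative eigenspaces of $e^0\cdot D^{\partial M}$, i.e.,
\[ \Gamma^{APS}_+ = \Big\{\varphi \in C^\infty(\partial M, S|_{\partial M}),\ \varphi = \sum_{\lambda_k>0} c_k\varphi_k\Big\}, \qquad \Gamma^{APS}_- = \Big\{\varphi \in C^\infty(\partial M, S|_{\partial M}),\ \varphi = \sum_{\lambda_k<0} c_k\varphi_k\Big\}. \]

[…] $\delta > 0$, there exists $C_{k,\delta}$ such that
\[ \|\varphi\|^2_{H^k} \le (1+\delta)\,\|D\varphi\|^2_{L^2} + C_{k,\delta}\,\|\varphi\|^2_{H^{k-1}}. \tag{2.9} \]

Proof. Note that for any $\varphi \in \Gamma^{loc}_+$ or $\varphi \in \Gamma^{loc}_-$, $D^{\partial M}(\Gamma\cdot\varphi) = \Gamma\cdot D^{\partial M}\varphi$, thus
\[ (\varphi,\ e^0\cdot D^{\partial M}\varphi) = \big(\Gamma\cdot\varphi,\ e^0\cdot D^{\partial M}(\Gamma\cdot\varphi)\big) = (\Gamma\cdot\varphi,\ e^0\cdot\Gamma\cdot D^{\partial M}\varphi) = -(\varphi,\ e^0\cdot D^{\partial M}\varphi). \]
Therefore $(\varphi, e^0\cdot D^{\partial M}\varphi) = 0$. If $\varphi \in \Gamma^{APS}_-$, then
\[ \int_{\partial M}(\varphi,\ e^0\cdot D^{\partial M}\varphi) = \sum_{\lambda_k<0}|c_k|^2\lambda_k \le 0. \]
[…] $\varepsilon > 0$, there exists a constant $C_\varepsilon > 0$ such that
\[ \|\varphi\|^2_{L^2(\partial M)} \le \varepsilon\,\|\varphi\|^2_{H^1} + C_\varepsilon\,\|\varphi\|^2_{L^2}, \]
thus (2.3) implies
\[ \|\varphi\|^2_{H^1} \le (1+\delta)\,\|D\varphi\|^2_{L^2} + C_\delta\,\|\varphi\|^2_{L^2}. \tag{2.10} \]
Then a standard argument gives (2.9).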

The following corollary is a direct consequence of the Sobolev embedding theorem
\[ \|\varphi\|^2_{C^k} \le C\,\|\varphi\|^2_{H^{k+\frac n2}}. \]

Corollary 2. Any eigenspinor of the Dirac operator which satisfies either the local boundary condition $\varphi \in \Gamma^{loc}_\pm$ or the (negative) APS boundary condition $\varphi \in \Gamma^{APS}_-$ is smooth.

3. Lower Bounds for the Eigenvalues

In this section, we adapt the arguments used in [27] to the case of spin compact manifolds with boundary. In particular, we get generalizations of basic inequalities on the eigenvalues of the Dirac operator D under the local boundary conditions $\Gamma^{loc}_\pm$ or the negative APS boundary condition $\Gamma^{APS}_-$. For this, we use the integral identity (2.3) together with an appropriate modification of the Levi–Civita connection. Let $D\varphi = \lambda\varphi$, where λ is a real constant or a real function. For any real functions a and u, we define
\[ \nabla^{a,u}_i = \nabla_i + a\nabla_i u + \frac{a}{n}\nabla_j u\ e^i\cdot e^j\cdot + \frac{\lambda}{n}\,e^i\cdot. \tag{3.1} \]
Then
\[ \begin{aligned} |\nabla^{a,u}\varphi|^2 &= |\nabla\varphi|^2 + \frac{\lambda^2}{n}|\varphi|^2 + a^2\Big(1-\frac1n\Big)|du|^2|\varphi|^2 + 2a\nabla_i u\,(\nabla_i\varphi,\varphi) + \frac{2\lambda}{n}(\nabla_i\varphi,\ e^i\cdot\varphi) \\ &= |\nabla\varphi|^2 - \frac{\lambda^2}{n}|\varphi|^2 + a^2\Big(1-\frac1n\Big)|du|^2|\varphi|^2 + a\nabla_i u\,\nabla_i|\varphi|^2. \end{aligned} \]
Define the functions $R_{a,u}$ by
\[ R_{a,u} = R - 4a\Delta u + 4\nabla a\nabla u - 4\Big(1-\frac1n\Big)a^2|du|^2, \tag{3.2} \]
where $\Delta$ is the positive scalar Laplacian. Then we have
\[ \int_M|\nabla^{a,u}\varphi|^2 = \int_M\Big[|\nabla\varphi|^2 - \frac{\lambda^2}{n}|\varphi|^2 - \Big(\frac{R_{a,u}}{4} - \frac{R}{4}\Big)|\varphi|^2\Big] + \int_{\partial M} a\,du(e_0)\,|\varphi|^2. \]
Therefore (2.3) yields
\[ \int_M|\nabla^{a,u}\varphi|^2 = \int_M\Big[\Big(1-\frac1n\Big)\lambda^2 - \frac{R_{a,u}}{4}\Big]|\varphi|^2 + \int_{\partial M}\Big[(\varphi,\ e^0\cdot D^{\partial M}\varphi) + \Big(a\,du(e_0) - \frac{H}{2}\Big)|\varphi|^2\Big]. \tag{3.3} \]
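A brief sketch of how identity (3.3) produces an eigenvalue bound may be helpful at this point. It assumes that λ is constant, that the eigenspinor satisfies one of the boundary conditions of Sect. 2 (so that the boundary Dirac term is nonpositive after integration), and that $H \ge 2a\,du(e_0)$ on ∂M, which is the hypothesis appearing in Theorem 4. The chain of inequalities is written out only for illustration.

```latex
% Sketch: from (3.3) to a lower bound on \lambda^2, under the assumptions stated above.
% The boundary integral is <= 0 because
%   \int_{\partial M} (\varphi, e^0 \cdot D^{\partial M}\varphi) <= 0   (it is = 0 for the local condition)
%   a\,du(e_0) - H/2 <= 0                                              (assumed hypothesis)
\begin{align*}
0 \;\le\; \int_M |\nabla^{a,u}\varphi|^2
  &\le \int_M \Big[\Big(1-\frac1n\Big)\lambda^2 - \frac{R_{a,u}}{4}\Big]|\varphi|^2 \\
  &\le \Big[\Big(1-\frac1n\Big)\lambda^2 - \frac14\inf_M R_{a,u}\Big]\int_M |\varphi|^2 .
\end{align*}
% Since \varphi is not identically zero and n >= 2, dividing by (1 - 1/n) gives
\[
\lambda^2 \;\ge\; \frac{n}{4(n-1)}\,\inf_M R_{a,u}.
\]
```

Equality forces $\nabla^{a,u}\varphi = 0$, which is exactly the situation analysed in Lemma 3.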

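Now we generalize Lemma 2.3 in [11] to the case where a is a real function.

Lemma 3. Suppose there exist a spinor field $\varphi \in \Gamma(S)$, a real number λ and real functions a and u on M such that for all i, $1 \le i \le n$,
\[ \nabla_i\varphi = -\frac{\lambda}{n}\,e^i\cdot\varphi - a\nabla_i u\,\varphi - \frac{a}{n}\nabla_j u\ e^i\cdot e^j\cdot\varphi. \tag{3.4} \]
Then ϕ is a real Killing spinor, i.e., either a = 0 or du = 0. In particular, the manifold is Einstein.

Proof. First, observe that (3.4) implies $D\varphi = \lambda\varphi$. By the Ricci identity (see [11]), we have
\[ \begin{aligned} \frac12 R_{ij}\,e^i\cdot e^j\cdot\varphi &= e^i\cdot D(\nabla_i\varphi) - D^2\varphi \\ &= e^i\cdot e^j\cdot\nabla_j\Big[\Big(-\frac{\lambda}{n}e^i\cdot - a\nabla_i u - \frac{a}{n}\nabla_k u\ e^i\cdot e^k\cdot\Big)\varphi\Big] - \lambda^2\varphi \\ &= \frac{\lambda}{n}\,e^i\cdot(e^i\cdot e^j\cdot + 2\delta_{ij})\nabla_j\varphi - \lambda^2\varphi - du\cdot da\cdot\varphi + a\Delta u\,\varphi - a\lambda\,du\cdot\varphi \\ &\quad + \frac1n\,e^i\cdot(e^i\cdot e^j\cdot + 2\delta_{ij})\,e^k\cdot\big[\nabla_j a\,\nabla_k u\,\varphi + a\nabla_j\nabla_k u\,\varphi + a\nabla_k u\,\nabla_j\varphi\big] \\ &= \Big[\frac{2(1-n)}{n}\lambda^2 + \frac{2a}{n}\Delta u - \frac{2(2-n)}{n}\nabla a\nabla u\Big]\varphi - \frac{2}{n}\,du\cdot da\cdot\varphi + \frac{4a\lambda}{n^2}\,du\cdot\varphi. \end{aligned} \]
This implies either a = 0 or du = 0.

By (3.3) and Lemma 3, we obtain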


Theorem 4. Let $M^n$ be a compact Riemannian spin manifold of dimension $n \ge 2$, with boundary ∂M, and let λ be any eigenvalue of D under either the local boundary condition $\Gamma^{loc}_\pm$ or the (negative) APS boundary condition $\Gamma^{APS}_-$. If there exist real functions a, u on M such that
\[ H \ge 2a\,du(e_0) \]
on ∂M, where H is the mean curvature of ∂M, then
\[ \lambda^2 \ge \frac{n}{4(n-1)}\,\sup_{a,u}\,\inf_M R_{a,u}, \tag{3.5} \]
where $R_{a,u}$ is given in (3.2). In the limiting case with the local boundary conditions, the associated eigenspinor is a real Killing spinor and ∂M is minimal.

Note that by [25], under the APS boundary conditions equality in (3.5) could not hold. Now we make use of the energy-momentum tensor (see [24]) to get lower bounds for the eigenvalues of D. For any spinor field ϕ, we define the associated energy-momentum 2-tensor $Q_\varphi$ on the complement of its zero set by
\[ Q_{\varphi,ij} = \frac12\big(e^i\cdot\nabla_j\varphi + e^j\cdot\nabla_i\varphi,\ \varphi/|\varphi|^2\big). \tag{3.6} \]
If ϕ is an eigenspinor of D, the tensor $Q_\varphi$ is well-defined in the sense of distributions. Let
\[ \nabla^{Q,a,u}_i = \nabla_i + a\nabla_i u + \frac{a}{n}\nabla_j u\ e^i\cdot e^j\cdot + Q_{\varphi,ij}\,e^j\cdot. \tag{3.7} \]
It is easy to prove that (see [27])
\[ |\nabla^{Q,a,u}\varphi|^2 = |\nabla\varphi|^2 - |Q_\varphi|^2|\varphi|^2 + a^2\Big(1-\frac1n\Big)|du|^2|\varphi|^2 + a\nabla_i u\,\nabla_i|\varphi|^2. \]
Therefore
\[ \int_M|\nabla^{Q,a,u}\varphi|^2 = \int_M\Big[\lambda^2 - \Big(\frac{R_{a,u}}{4} + |Q_\varphi|^2\Big)\Big]|\varphi|^2 + \int_{\partial M}\Big[(\varphi,\ e^0\cdot D^{\partial M}\varphi) + \Big(a\,du(e_0) - \frac{H}{2}\Big)|\varphi|^2\Big]. \tag{3.8} \]
Thus we have

Theorem 5. Let $M^n$ be a compact Riemannian spin manifold of dimension $n \ge 2$, with boundary ∂M, and let λ be any eigenvalue of D under either the local boundary condition $\Gamma^{loc}_\pm$ or the (negative) APS boundary condition $\Gamma^{APS}_-$. If there exist real functions a, u on M such that
\[ H \ge 2a\,du(e_0) \]
on ∂M, where H is the mean curvature of ∂M, then
\[ \lambda^2 \ge \sup_{a,u}\,\inf_M\Big(\frac{R_{a,u}}{4} + |Q_\varphi|^2\Big). \tag{3.9} \]
In the limiting case, one has $H = 2a\,du(e_0)$ on ∂M.


Remark 6. Under either the local boundary condition $\Gamma^{loc}_\pm$ or the (negative) APS boundary condition $\Gamma^{APS}_-$, assume that $H \ge 0$. Take a = 0 or u constant in (3.5) and (3.9), then one gets Friedrich’s inequality [10]
\[ \lambda^2 \ge \frac{n}{4(n-1)}\,\inf_M R \tag{3.10} \]
and the following inequality [24]
\[ \lambda^2 \ge \inf_M\Big(\frac{R}{4} + |Q_\varphi|^2\Big). \tag{3.11} \]
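As a purely illustrative evaluation of (3.10), one may assume that M is a domain with $H \ge 0$ in the unit round sphere, for instance a closed hemisphere, whose boundary is totally geodesic so that H = 0. This example is an assumption made here for concreteness and is not taken from the text.

```latex
% Illustration (assumed example): M a hemisphere of the unit round sphere S^n, n >= 2.
% Scalar curvature of the unit sphere: R = n(n-1); boundary totally geodesic: H = 0.
\[
\lambda^2 \;\ge\; \frac{n}{4(n-1)}\,\inf_M R
        \;=\; \frac{n}{4(n-1)}\cdot n(n-1)
        \;=\; \frac{n^2}{4},
\qquad\text{i.e.}\qquad |\lambda| \;\ge\; \frac n2 .
\]
```

The value n/2 is the one realised by real Killing spinors on the closed unit sphere, which is consistent with the description of the limiting case in Theorem 4.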

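4. Conformal Lower Bounds

As in the previous section and under the local boundary conditions $\Gamma^{loc}_\pm$, we show that the conformal arguments used in [27] combined with the integral formula (2.3) yield generalizations of all known lower bounds for the eigenvalues of the Dirac operator. Let g be the metric of M. For any real function u on M, consider a conformal metric $\bar g = e^{2u}g$. Denote by $\bar D$ the Dirac operator with respect to this conformal metric. If $D\varphi = \lambda\varphi$, then $\bar D\psi = \lambda e^{-u}\psi$, where $\psi = e^{-\frac{n-1}{2}u}\varphi$. Note that
\[ \begin{aligned} \bar\nabla^{a,u}_{\bar e_i} &= \bar\nabla_{\bar e_i} + a\,e^{-u}\nabla_i u + \frac{a}{n}\,e^{-u}\nabla_j u\ \bar e^i\cdot\bar e^j\cdot + \frac{\lambda}{n}\,e^{-u}\,\bar e^i\cdot, \\ \bar\Delta u &= -e^{-u}\big(\bar\nabla_{\bar e_i}(e^{-u}\nabla_{e_i}u)\big) = e^{-2u}(\Delta u + |du|^2), \\ \bar R\,e^{2u} &= R + 2(n-1)\Delta u - (n-1)(n-2)|du|^2, \end{aligned} \]
also, on ∂M,
\[ \bar D^{\partial M}\big(e^{-\frac{n-2}{2}u}\varphi\big) = e^{-\frac n2 u}\,D^{\partial M}\varphi, \qquad \bar H = e^{-u}\big(H + (n-1)\,du(e_0)\big). \]
Define the function $\bar R_{a,u}$ by
\[ \bar R_{a,u} = R + 4\Big(\frac{n-1}{2} - a\Big)\Delta u + 4\nabla a\nabla u - \Big[(n-1)(n-2) + 4(2-n)a + 4\Big(1-\frac1n\Big)a^2\Big]|du|^2, \tag{4.1} \]
where $\Delta$ is the positive scalar Laplacian. Then apply (3.3) to the conformal metric $\bar g$, to get
\[ \int_M\big|\bar\nabla^{a,u}\psi\big|^2_{\bar g}\,\bar v_g = \int_M e^{-u}\Big[\Big(1-\frac1n\Big)\lambda^2 - \frac{\bar R_{a,u}}{4}\Big]|\varphi|^2\,v_g + \int_{\partial M}\Big[(\psi,\ \bar e^0\cdot\bar D^{\partial M}\psi)_{\bar g} + \Big(a\,d\bar u(\bar e_0) - \frac{\bar H}{2}\Big)|\psi|^2_{\bar g}\Big]\bar v_g, \]
hence
\[ \int_M\big|\bar\nabla^{a,u}\psi\big|^2_{\bar g}\,\bar v_g = \int_M e^{-u}\Big[\Big(1-\frac1n\Big)\lambda^2 - \frac{\bar R_{a,u}}{4}\Big]|\varphi|^2\,v_g + \int_{\partial M} e^{-u}\Big[(\varphi,\ e^0\cdot D^{\partial M}\varphi) + \Big(\Big(a - \frac{n-1}{2}\Big)du(e_0) - \frac{H}{2}\Big)|\varphi|^2\Big]v_g. \tag{4.2} \]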
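The factor $e^{-u}$ in (4.2) comes from how each quantity scales under $\bar g = e^{2u}g$. The bookkeeping below is a sketch that assumes the standard conformal weights for the volume elements and for the norm of the rescaled spinor, and that $\bar R_{a,u}$ in (4.1) equals $e^{2u}$ times the function (3.2) computed in the metric $\bar g$. These weights are not all written out in the text. M is n-dimensional, so ∂M is (n−1)-dimensional.

```latex
% Assumed conformal weights for \bar g = e^{2u} g and \psi = e^{-\frac{n-1}{2}u}\varphi
\begin{align*}
\bar v_g &= e^{nu}\,v_g \ \text{ on } M, &
\bar v_g &= e^{(n-1)u}\,v_g \ \text{ on } \partial M, &
|\psi|^2_{\bar g} &= e^{-(n-1)u}\,|\varphi|^2 .
\end{align*}
% Interior term: eigenvalue \lambda e^{-u}, curvature function e^{-2u}\bar R_{a,u}
\[
\Big[\Big(1-\tfrac1n\Big)(\lambda e^{-u})^2 - \tfrac14\,e^{-2u}\bar R_{a,u}\Big]\,
|\psi|^2_{\bar g}\,\bar v_g
 \;=\; e^{-u}\Big[\Big(1-\tfrac1n\Big)\lambda^2 - \tfrac14\,\bar R_{a,u}\Big]|\varphi|^2\,v_g .
\]
% Boundary zero-order term: uses \bar H = e^{-u}(H + (n-1)du(e_0)) and \bar e_0 = e^{-u}e_0
\[
\Big(a\,e^{-u}du(e_0) - \tfrac12\,e^{-u}\big(H + (n-1)\,du(e_0)\big)\Big)\,
|\psi|^2_{\bar g}\,\bar v_g
 \;=\; e^{-u}\Big[\Big(a - \tfrac{n-1}{2}\Big)du(e_0) - \tfrac H2\Big]|\varphi|^2\,v_g .
\]
```

The boundary Dirac term is handled separately by the covariance relation for $\bar D^{\partial M}$ quoted above.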

Note that $\bar\nabla^{a,u}\psi = 0$ implies
\[ \nabla_i\varphi = -\frac{\lambda}{n}\,e^i\cdot\varphi - \Big(a - \frac n2\Big)\nabla_i u\,\varphi - \frac1n\Big(a - \frac n2\Big)\nabla_j u\ e^i\cdot e^j\cdot\varphi \]
(see [11]), we thus have either $a = \frac n2$ or du = 0 by Lemma 3. Thus we obtain:

Theorem 7. Let $M^n$ be a compact Riemannian spin manifold of dimension $n \ge 2$, with boundary ∂M, and let λ be any eigenvalue of D under the local boundary condition $\Gamma^{loc}_\pm$. If there exist real functions a, u on M such that
\[ H \ge (2a - n + 1)\,du(e_0) \]
on ∂M, where H is the mean curvature of ∂M, then
\[ \lambda^2 \ge \frac{n}{4(n-1)}\,\sup_{a,u}\,\inf_M \bar R_{a,u}, \tag{4.3} \]
where the function $\bar R_{a,u}$ is given in (4.1). In the limiting case, the associated eigenspinor is a real Killing spinor and either $H = du(e_0)$ or H = 0 on ∂M.

Since $\bar Q_{\psi,\bar i\bar j} = e^{-u}Q_{\varphi,ij}$ under the conformal transformation $\bar g = e^{2u}g$, we apply (3.8) to the conformal metric $\bar g$, to get
\[ \int_M\big|\bar\nabla^{Q,a,u}\psi\big|^2_{\bar g}\,\bar v_g = \int_M e^{-u}\Big[\lambda^2 - \Big(\frac{\bar R_{a,u}}{4} + |Q_\varphi|^2\Big)\Big]|\varphi|^2\,v_g + \int_{\partial M} e^{-u}\Big[(\varphi,\ e^0\cdot D^{\partial M}\varphi) + \Big(\Big(a - \frac{n-1}{2}\Big)du(e_0) - \frac{H}{2}\Big)|\varphi|^2\Big]v_g. \tag{4.4} \]
Thus we have

Theorem 8. Let $M^n$ be a compact Riemannian spin manifold of dimension $n \ge 2$, with boundary ∂M, and let λ be any eigenvalue of D under the local boundary condition $\Gamma^{loc}_\pm$. If there exist real functions a, u on M such that
\[ H \ge (2a - n + 1)\,du(e_0) \]
on ∂M, where H is the mean curvature of ∂M, then
\[ \lambda^2 \ge \sup_{a,u}\,\inf_M\Big(\frac{\bar R_{a,u}}{4} + |Q_\varphi|^2\Big). \tag{4.5} \]
In the limiting case one has $H = (2a - n + 1)\,du(e_0)$ on ∂M.


Remark 9. If $n \ge 3$, take a = 0 and $u = -\frac{2}{n-2}\log h$ in (4.3) and (4.5), where h is a positive eigenfunction of the first eigenvalue $\mu_1$ of the conformal Laplacian
\[ L := 4\,\frac{n-1}{n-2}\,\Delta + R \]
under the boundary condition
\[ dh(e_0) - \frac{(n-2)H}{2(n-1)}\,h = 0. \]
Then, one gets the lower bounds [22, 24]
\[ \lambda^2 \ge \frac{n}{4(n-1)}\,\mu_1 \tag{4.6} \]
and
\[ \lambda^2 \ge \inf_M\Big(\frac{\mu_1}{4} + |Q_\varphi|^2\Big) \tag{4.7} \]
under the local boundary condition $\Gamma^{loc}_\pm$. In the limiting case of (4.6), the associated eigenspinor is a real Killing spinor and ∂M is minimal.

Acknowledgements. Research of S.M. is partially supported by a DGICYT grant No. PB97-0785. Research of X.Z. is partially supported by the Chinese NSF and mathematical physics program of CAS. This work is partially done during the visit of the last two authors to the Institut Élie Cartan, Université Henri Poincaré, Nancy 1. They would like to thank the institute for its hospitality.

References

1. Adams, R.A.: Sobolev spaces. New York: Academic Press, 1978
2. Atiyah, M.F., Patodi, V.K., Singer, I.M.: Spectral asymmetry and Riemannian geometry, I. Math. Proc. Cambr. Phil. Soc. 77, 43–69 (1975)
3. Atiyah, M.F., Patodi, V.K., Singer, I.M.: Spectral asymmetry and Riemannian geometry, II. Math. Proc. Cambr. Phil. Soc. 78, 405–432 (1975)
4. Atiyah, M.F., Patodi, V.K., Singer, I.M.: Spectral asymmetry and Riemannian geometry, III. Math. Proc. Cambr. Phil. Soc. 79, 71–99 (1976)
5. Bär, C.: Lower eigenvalue estimates for Dirac operators. Math. Ann. 293, 39–46 (1992)
6. Baum, H., Friedrich, T., Grunewald, R., Kath, I.: Twistor and Killing Spinors on Riemannian Manifolds. Seminarbericht 108, Humboldt-Universität zu Berlin, 1990
7. Bourguignon, J.P., Gauduchon, P.: Spineurs, Opérateurs de Dirac et Variations de Métriques. Commun. Math. Phys. 144, 581–599 (1992)
8. Bourguignon, J.P., Hijazi, O., Milhorat, J.-L., Moroianu, A.: A Spinorial Approach to Riemannian and Conformal Geometry. Monograph (In preparation)
9. Botvinnik, B., Gilkey, P., Stolz, S.: The Gromov Lawson Rosenberg conjecture for groups with periodic cohomology. J. Diff. Geo. 46, 374–405 (1997)
10. Friedrich, T.: Der erste Eigenwert des Dirac-Operators einer kompakten, Riemannschen Mannigfaltigkeit nicht negativer Skalarkrümmung. Math. Nachr. 97, 117–146 (1980)
11. Friedrich, Th., Kim, E.-C.: Some remarks on the Hijazi inequality and generalizations of the Killing equation for spinors. To appear in J. Geom. Phys.
12. Farinelli, S., Schwarz, G.: On the spectrum of the Dirac operator under boundary conditions. J. Geom. Phys. 28, 67–84 (1998)
13. Gibbons, G., Hawking, S., Horowitz, G., Perry, M.: Positive mass theorems for black holes. Commun. Math. Phys. 88, 295–308 (1983)
14. Gilkey, P.B.: Invariance theory, the heat equation, and the Atiyah–Singer index theorem. 2nd ed., Boca Raton: CRC Press, 1995
15. Ginoux, N., Morel, B.: Eigenvalue Estimates for the Submanifold Dirac Operator. Preprint IÉCN, Nancy, n° 44 (2000)
16. Grubb, G.: Heat operator trace expansions and index for general Atiyah–Patodi–Singer boundary problems. Commun. Part. Diff. Equat. 17, 2031–2077 (1992)
17. Grubb, G., Seeley, R.: Développements asymptotiques pour l’opérateur d’Atiyah–Patodi–Singer. C. R. Acad. Sci. Paris, Sér. I 317, 1123–1126 (1993)
18. Grubb, G., Seeley, R.: Weakly parametric pseudodifferential operators and Atiyah–Patodi–Singer boundary problems. Invent. Math. 121, 481–529 (1995)
19. Grubb, G., Seeley, R.: Zeta and eta functions for Atiyah–Patodi–Singer operators. J. Geom. Anal. 6, 31–77 (1996)
20. Herzlich, M.: A Penrose-like inequality for the mass of Riemannian asymptotically flat manifolds. Commun. Math. Phys. 188, 121–133 (1998)
21. Herzlich, M.: The positive mass theorem for black holes revisited. J. Geom. Phys. 26, 97–111 (1998)
22. Hijazi, O.: A conformal lower bound for the smallest eigenvalue of the Dirac operator and Killing spinors. Commun. Math. Phys. 104, 151–162 (1986)
23. Hijazi, O.: Première valeur propre de l’opérateur de Dirac et nombre de Yamabe. C. R. Acad. Sci. Paris 313, 865–868 (1991)
24. Hijazi, O.: Lower bounds for the eigenvalues of the Dirac operator. J. Geom. Phys. 16, 27–38 (1995)
25. Hijazi, O., Montiel, S., Zhang, X.: Dirac operator on embedded hypersurfaces. Math. Res. Lett. 8, 195–208 (2001)
26. Hijazi, O., Montiel, S., Zhang, X.: Conformal Lower Bounds for the Dirac Operator of Embedded Hypersurfaces. Asian J. Math., to appear
27. Hijazi, O., Zhang, X.: Lower bounds for the Eigenvalues of the Dirac Operator, Part I. The hypersurface Dirac Operator. To appear in Ann. Glob. Anal. Geom.
28. Hijazi, O., Zhang, X.: Lower bounds for the Eigenvalues of the Dirac Operator, Part II. The Submanifold Dirac Operator. To appear in Ann. Glob. Anal. Geom.
29. Lawson, H., Michelsohn, M.: Spin geometry. Princeton, NJ: Princeton Univ. Press, 1989
30. Morel, B.: Eigenvalue Estimates for the Dirac-Schrödinger Operators. J. Geom. Phys. 38, 1–18 (2001)
31. Trautman, A.: The Dirac Operator on Hypersurfaces. Acta Phys. Polon. B 26, 1283–1310 (1995)
32. Witten, E.: A new proof of the positive energy theorem. Commun. Math. Phys. 80, 381–402 (1981)
33. Zhang, X.: Lower bounds for eigenvalues of hypersurface Dirac operators. Math. Res. Lett. 5, 199–210 (1998); A remark on: Lower bounds for eigenvalues of hypersurface Dirac operators. Math. Res. Lett. 6, 465–466 (1999)
34. Zhang, X.: Angular momentum and positive mass theorem. Commun. Math. Phys. 206, 137–155 (1999)

Communicated by M. Aizenman

Commun. Math. Phys. 221, 267 – 292 (2001)


Boundary Layer Stability in Real Vanishing Viscosity Limit

Denis Serre (1), Kevin Zumbrun (2)

1 ENS Lyon, UMPA (UMR 5669 CNRS), 46, allée d’Italie, 69364 Lyon Cedex 07, France.
2 Department of Mathematics, Indiana University, Rawles Hall, Bloomington, IN 47405, USA.

Received: 27 November 2000 / Accepted: 16 March 2001

Abstract: In the previous paper [20], an Evans function machinery for the study of boundary layer stability was developed. There, the analysis was restricted to strongly parabolic perturbations, that is to an approximation of the form $u_t + (F(u))_x = \nu(B(u)u_x)_x$ (ν […] > 0.    (1)

Here, F is a given flux, a $C^2$-vector field on a convex open subset $\mathcal{U}$ of $\mathbb{R}^n$. The diffusion tensor B is of class $C^2$; its eigenvalues need to have non-negative real parts, but it is important not to assume the invertibility of B(u). We assume that the rank of B does not depend on u, and we denote it by r ($1 \le r \le n$). The positive constant ν is small. We therefore are interested in the limit as $\nu \to 0^+$, expecting that the solutions $u^\nu$ of (1) converge boundedly almost everywhere to solutions of the inviscid system
\[ u_t + F(u)_x = 0, \qquad x, t > 0. \tag{2} \]
The local well-posedness of the Cauchy problem (where $x \in \mathbb{R}$ replaces x > 0) for (1) is a difficult problem, first addressed by Kawashima in his unpublished thesis [15]. A natural hypothesis (see [16]), that we shall adopt here, is that there exists a smooth change of variables $u \mapsto v(u)$ (with inverse u = g(v)), in which the system rewrites as
\[ g(v)_t + f(v)_x = \nu\,(b(v)v_x)_x, \tag{3} \]

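with the following properties:

(H1) b(v) is block-diagonal:
\[ b(v) = \begin{pmatrix} 0 & 0 \\ 0 & b_1(v) \end{pmatrix}, \]
with $b_1(v) \in GL_r(\mathbb{R})$.

(H2) dg(v) is lower block-triangular:
\[ dg(v) = \begin{pmatrix} \gamma(v) & 0 \\ \cdot & \delta(v) \end{pmatrix}, \]
with, necessarily, $\gamma(v) \in GL_{n-r}$, $\delta(v) \in GL_r$.

(H3) For each $\bar v \in \mathcal{V} := v(\mathcal{U})$, the linear operator
\[ \delta(\bar v)\,\partial_t - b_1(\bar v)\,\partial_x^2 \]
is strongly parabolic.

(H4) In the block decomposition of df,
\[ df(v) = \begin{pmatrix} h(v) & \cdot \\ \cdot & \cdot \end{pmatrix}, \]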

the matrix $\gamma(v)^{-1}h(v)$ is diagonalisable with real eigenvalues.

In this list, (H1) is a little bit restrictive. But it only needs the eigenvalue λ = 0 of B to be semi-simple. We shall see later on that a natural assumption (see (H9)) ensures this property. The last hypothesis means that the system obtained from (3) by removing the second order equations, and freezing the corresponding variables, is an (n − r) × (n − r) hyperbolic system. In view of these hypotheses, we shall denote by $(w, z)^T$ the block decomposition of v. Defining also $f =: (f_0, f_1)$, $g =: (g_0, g_1)$, we have $h = d_w f_0$ and $\gamma = d_w g_0$.

Since we are concerned with initial-boundary value problems (IBVP), we need to distinguish between two cases, the boundary {x = 0} being characteristic or not. We say that it is characteristic at some state $u \in \mathcal{U}$ (corresponding to $v \in \mathcal{V}$) if one of the signal velocities of the system under consideration vanishes at u. For the perturbed system (1), or equivalently (3), this means that h(v) is singular. For the “inviscid” system (2) this means that dF(u) is singular. Characteristic IBVPs may be difficult to attack. But overall, the status of the boundary layer is much different according to the nature of the boundary. The example of a gas flow, where (2) is the Euler equations and (1) is the Navier–Stokes equations, is enlightening. A natural assumption is that the boundary is impermeable; then it is characteristic, for both (2) and (1). The width of the boundary layer is about the square root $\sqrt{\nu}$ of the viscosity. Its profile is expected to obey the Prandtl equation. Very little is known at a rigorous level in this case. On the contrary, an inflow (or outflow) boundary condition makes the boundary non-characteristic in most cases (see footnote 1). In that case, the width of the boundary layer is of order ν and its profile obeys an ODE (see (4) below).

A suitable analysis of this case was carried out by Gisclon and one of us [9, 10] in one space dimension, and by Grenier & Guès [12] in several space dimensions. These references deal with boundary layers of moderate amplitude, when (1) is strictly parabolic, that is r = n.

Let us assume that (2) is hyperbolic, that is dF(u) is diagonalisable with real eigenvalues. When dF(u) is invertible, an IBVP needs q independent scalar boundary conditions, where q is the number of incoming characteristic curves, that is the number of positive eigenvalues of dF(u). Similarly, assuming (H1–H4) and that h(v) is invertible, an IBVP for (1) needs r + p independent scalar boundary conditions, where p is the number of incoming characteristics for the reduced system $g_0(w, \bar z)_t + f_0(w, \bar z)_x = 0$ ($\bar z$ constant), that is the number of positive eigenvalues of $\gamma(v)^{-1}h(v)$. We shall see that $p \le q \le r + p$ (Corollary 1); a boundary layer occurs when q < p + r. For the sake of simplicity, we shall restrict to a set of Dirichlet-type boundary conditions for (1).

In [9, 10, 12], the convergence of (1) towards (2) was proved under natural assumptions, as long as the amplitude of the boundary layer remains smaller than some threshold.

Footnote 1: In- or out-flow data are of interest in problems with apertures, such as occur in oil recovery.
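To make the counting of boundary conditions concrete, here is a small worked illustration. It assumes the one-dimensional isentropic Navier–Stokes system in Eulerian variables, which is not treated in this passage and is used only as an example. The symbols σ (fluid velocity), µ > 0 (viscosity coefficient), p(ρ) (pressure law) and $c = \sqrt{p'(\rho)}$ (sound speed) are notation introduced for the example.

```latex
% Assumed example: isentropic gas, v = (w, z) = (\rho, \sigma), n = 2, r = 1.
\[
\rho_t + (\rho\sigma)_x = 0, \qquad
(\rho\sigma)_t + \big(\rho\sigma^2 + p(\rho)\big)_x = \nu\,(\mu\,\sigma_x)_x .
\]
% Here g = (\rho, \rho\sigma), so dg is lower triangular with \gamma = 1, \delta = \rho;
% b = diag(0, \mu); f_0 = \rho\sigma, so h = d_w f_0 = \sigma and \gamma^{-1} h = \sigma.
% Inviscid characteristic speeds: \sigma - c and \sigma + c.
\[
\begin{array}{l|c|c|c|c}
\text{regime at the boundary} & q & p & r+p & \text{layer } (q < r+p)\,? \\ \hline
\text{supersonic inflow } (\sigma > c)        & 2 & 1 & 2 & \text{no}  \\
\text{subsonic inflow } (0 < \sigma < c)      & 1 & 1 & 2 & \text{yes} \\
\text{subsonic outflow } (-c < \sigma < 0)    & 1 & 0 & 1 & \text{no}  \\
\text{supersonic outflow } (\sigma < -c)      & 0 & 0 & 1 & \text{yes}
\end{array}
\]
```

In every row the inequality $p \le q \le r + p$ holds, and the boundary is non-characteristic for both systems exactly when σ ≠ 0 and σ ≠ ±c.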


For $x \gg \nu$, the solution $u^\nu$ of the IBVP for (1) behaves like the solution $\bar u$ of an appropriate IBVP for (2). For x = O(ν), it behaves as a layer U(x/ν; t). Here, the time variable acts just as a parameter and U(·; t) solves an ODE
\[ B(U)\,U' = F(U) - F(\bar u(0,t)), \tag{4} \]
with $U(+\infty; t) = \bar u(0,t)$ and U(0; t) satisfying the boundary condition of (1). Given $\bar u(0,t)$, this is an overdetermined problem, which admits a solution if and only if $\bar u(0,t)$ belongs to some subset C(t) (the reference to t is there because the boundary data might depend on t). The relation $\bar u(0,t) \in C(t)$ then plays the rôle of a boundary condition for (2), called the residual boundary condition. Under suitable assumptions, C(t) is a submanifold of codimension q (see [9, 10, 19]) and gives rise to a locally well-posed IBVP for (2) (see [17] for a theory of such IBVP), which determines $\bar u$ on some strip $\mathbb{R}^+ \times (0, T)$. The restriction r = n, assumed in [9, 10], is not essential here, as pointed out by H. Freistühler (personal communication). We point out that U(·; t) is nothing but a steady solution of the IBVP for the rescaled problem
\[ u_\tau + F(u)_y = (B(u)u_y)_y. \tag{5} \]
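The layer equation (4) and the residual set C(t) can be made explicit in the simplest scalar case. The sketch below assumes the Burgers flux $F(u) = u^2/2$ with $B \equiv 1$ (so n = r = 1), a Dirichlet datum U(0) = a, and a far-field state $u_+ = \bar u(0,t) \neq 0$. This model is chosen here purely for illustration and is not taken from the paper.

```latex
% Assumed model: F(u) = u^2/2, B = 1, n = r = 1; datum U(0) = a; far field u_+ \neq 0.
\[
U' = \tfrac12\big(U^2 - u_+^2\big), \qquad U(0) = a, \qquad U(+\infty) = u_+ .
\]
% Case u_+ < 0 (F'(u_+) < 0, so q = 0): u_+ is an attracting rest point as y -> +infinity.
\[
U(y) = u_+\tanh\!\Big(\tfrac{|u_+|}{2}\,(y + y_0)\Big) \ \ \text{if } |a| < |u_+|,
\qquad
U(y) = u_+\coth\!\Big(\tfrac{|u_+|}{2}\,(y + y_0)\Big),\ y_0 > 0, \ \ \text{if } a < u_+ ,
\]
% and U \equiv u_+ if a = u_+. A layer exists iff a < |u_+|, so the admissible set
% { u_+ < \min(0, -a) } is open: codimension 0 = q.
%
% Case u_+ > 0 (F'(u_+) > 0, so q = 1): u_+ is a repelling rest point, hence only the
% constant profile U \equiv u_+ = a is possible: the admissible set {a} has codimension 1 = q.
```

In both cases the codimension of the admissible set of far-field states equals the number q of incoming characteristics, in line with the statement above about C(t).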

Similarly, U(·/ν; $t_0$) is a steady solution of the IBVP for (1). The restriction of moderate strength in [9, 10, 12] is actually relevant. We do not exclude that some strong layers become linearly unstable, which would forbid the convergence as ν → 0. The instability mechanism may be described as follows. Let us consider the linearized problems about U(·/ν; $t_0$) and U(·; $t_0$):
\[ u_t = L_\nu u + \text{linearized boundary conditions}, \tag{6} \]
\[ u_\tau = L u + \text{linearized boundary conditions}. \tag{7} \]

Clearly, the linearized boundary conditions are the same for both problems; therefore L and $L_\nu$ have the same domain D. One easily checks that $L_\nu$ is conjugated to $\nu^{-1}L$, through the rescaling $u \mapsto \tilde u$, $\tilde u(y) = u(\nu y)$. Let us now suppose that the spectrum of L contains some complex number λ with real part ω > 0. Then $L_\nu$ admits the spectral value λ/ν and the boundary layer is more and more unstable as ν → 0: disturbances are amplified by a factor exp(ωt/ν) and the layer is completely destroyed on a time scale O(ν). In other words, such a boundary layer may not be observed in practice and is irrelevant.

As a matter of fact, the analysis in [9, 10, 12] implies that layers of moderate size, with r = n, are linearly stable. Conversely, a recent work by Grenier and Rousset [13] shows that spectral stability of the boundary layer implies non-linear stability, under the condition that r = n. Let us give a short description of their result. Being given a Dirichlet boundary data a(t) for (1), let u be a smooth solution of the hyperbolic system (2) with initial data $u_0(x)$ and residual boundary condition $u(0,t) \in C(t)$ associated to a. Let U(·; t) be the boundary layer, determined by (4) and U(0; t) = a(t) (and therefore U(+∞; t) = u(0, t)). Finally, let $u^\nu$ be the solution of the IBVP for (1). Assume that for every $t \in [0, T[$, the boundary layer U(·; t) is spectrally stable. Then $u^\nu$ converges strongly towards u.

This motivates our study of the “spectral stability of the boundary layer”. By this, we mean that the spectrum of L is included in the left (say, stable) half-space $\{\lambda \in \mathbb{C};\ \mathrm{Re}\,\lambda \le 0\}$. To decide whether a given boundary layer is spectrally stable or not is a difficult task, which cannot be solved explicitly by quadrature. We shall see that, under reasonable assumptions, the essential spectrum of L lies in the stable half-space.
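The conjugation between $L_\nu$ and $\nu^{-1}L$ used in the instability mechanism can be checked by a direct change of variables. The sketch below assumes that $L_\nu$ and L are the linearizations of (1) and (5) about U(x/ν) and U(y) respectively, with the frozen time $t_0$ suppressed. The explicit form of the operators is written here as an assumption, since it is not displayed in this passage.

```latex
% Assumed explicit form of the linearized operators (frozen time t_0 suppressed):
\begin{align*}
L_\nu u &= \nu\,\partial_x\big(B(U)\,u_x + (dB(U)\,u)\,U_x\big) - \partial_x\big(dF(U)\,u\big),
        && U = U(x/\nu), \\
L\,\tilde u &= \partial_y\big(B(U)\,\tilde u_y + (dB(U)\,\tilde u)\,U'\big) - \partial_y\big(dF(U)\,\tilde u\big),
        && U = U(y).
\end{align*}
% With \tilde u(y) = u(\nu y): \partial_x = \nu^{-1}\partial_y and U_x = \nu^{-1}U', so every term
% of L_\nu u picks up exactly one factor \nu^{-1}:
\[
(L_\nu u)(\nu y) = \nu^{-1}\,(L\tilde u)(y).
\]
% Consequently an eigenpair of L yields an eigenpair of L_\nu:
\[
L\tilde w = \lambda\,\tilde w
\ \Longrightarrow\
L_\nu w = \frac{\lambda}{\nu}\,w, \quad w(x) = \tilde w(x/\nu),
\qquad
\big|e^{(\lambda/\nu)t}\big| = e^{\omega t/\nu}, \quad \omega = \mathrm{Re}\,\lambda .
\]
```

The linearized boundary conditions are unaffected by the rescaling, which is why the two operators share the domain D.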


Therefore, instability can occur only when L admits an eigenvalue λ with Re λ > 0. This yields the eigenvalue problem

$$(L - \lambda)u = 0, \qquad u \in D. \qquad (8)$$

The difficulty then comes from the fact that, since L is a differential operator with variable coefficients, we are not able to solve explicitly the ODE (L − λ)u = 0. The information obtained by differentiating (4) is clearly not enough:

$$L U' = 0. \qquad (9)$$

We point out in passing that U′ does not satisfy the linearized boundary condition in general, so that (9) does not mean that zero is an eigenvalue of L, contrary to the case of travelling waves (see for instance [7]).

In the sequel, we first focus on the stability analysis of one single given layer U. We denote by u+ its limit at +∞ and we define V := v ◦ U. We only assume that (H1–H4) hold on a neighbourhood $\mathcal{U}$ of the range of U. In order to have minimal hypotheses, we complete (H1–H4) by

(H5) The boundary is non-characteristic for (1), that is

$$h(v) \in GL_{n-r}(\mathbb{R}), \qquad \forall v \in \mathcal{V} := v(\mathcal{U}).$$

(H6) Strict hyperbolicity of (2) near u+: for u in some neighbourhood of u+, the matrix dF(u) is diagonalisable with real eigenvalues of constant multiplicities.

(H7) The boundary is non-characteristic for (2) at u+, that is dF(u+) ∈ GLn(R).

(H8) The state u+ is linearly L²-stable for the Cauchy problem of (1): for all ξ ∈ R*, the eigenvalues of the matrix ξ²B(u+) + iξ dF(u+) have strictly positive real parts.

For the sake of simplicity, we shall denote K+ := K(u+) (for functions of the variable u ∈ $\mathcal{U}$) or k+ = lim_{x→+∞} k(x) (for functions of the variable x > 0). When (H8) holds, there exists a positive θ such that Re κ ≥ θξ² holds for every eigenvalue κ of ξ²B+ + iξ dF+ and |ξ| < 1. This estimate is not uniformly valid for ξ ∈ R when r < n. We point out that (H8) implies that dF+ has real eigenvalues (examine the limit as ξ → 0), a slightly weaker property than (H6). Also, (H8) follows from stronger, but rather natural, hypotheses:

(H9) There is a dissipative symmetrizer S+ at u+, that is a positive definite symmetric matrix such that S+ dF+ is symmetric and

$$(S_+B_+X, X) \ge \beta\,\|B_+X\|^2, \qquad \forall X \in \mathbb{R}^n,$$

where (· , ·) denotes the scalar product in Rⁿ and β > 0 is a constant.

(H10) The hyperbolic and parabolic modes do couple: the kernel of B+ does not contain eigenvectors of dF+.

Lemma 1. Hypotheses (H9, H10) imply (H8).
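Before the proof, Lemma 1 can be sanity-checked numerically on a small example. The sketch below is only an illustration under assumed data, not part of the original argument: it takes S+ = I, a symmetric dF+ and B+ = diag(0, 1), which satisfy (H9) with β = 1 and (H10), and evaluates the smallest real part of the eigenvalues of ξ²B+ + iξ dF+ over a sample of non-zero frequencies. The matrices and the helper name `min_real_part` are hypothetical.

```python
import numpy as np

# Hypothetical toy data (an assumption, not taken from the paper):
# with S_+ = I, (H9) reduces to dF_+ symmetric and (B_+ X, X) >= beta |B_+ X|^2.
dF = np.array([[1.0, 1.0],
               [1.0, 0.0]])   # symmetric and invertible
B = np.diag([0.0, 1.0])       # rank r = 1, ker B_+ = span(e_1)

def min_real_part(xi):
    """Smallest real part among the eigenvalues of xi^2 B_+ + i xi dF_+."""
    eigenvalues = np.linalg.eigvals(xi**2 * B + 1j * xi * dF)
    return eigenvalues.real.min()

# (H10) holds here: dF_+ e_1 = (1, 1)^T is not parallel to e_1.
xis = [x for x in np.linspace(-20.0, 20.0, 401) if abs(x) > 1e-12]
worst = min(min_real_part(xi) for xi in xis)
print("all real parts positive on the sample:", worst > 0.0)
```

On this example the real parts stay positive but become small like ξ² near ξ = 0, in line with the estimate Re κ ≥ θξ² for |ξ| < 1.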


Proof. Let ξ ∈ R* and let (λ, X) be an eigenpair of ξ²B+ + iξ dF+. Then (ξ²B+ + iξ dF+ − λ)X = 0. Multiplying by X*S+, and taking the real part, we obtain

$$(\Re\lambda)\,X^*S_+X = \xi^2\,(S_+B_+X, X^*) \ge \xi^2\beta\,\|B_+X\|^2.$$

Therefore Re λ is non-negative. It is strictly positive, because otherwise B+X = 0, so that X would be an eigenvector of dF+, which contradicts (H10).

Such a symmetrizer usually comes as the Hessian matrix of an entropy for (2), which is strongly convex at u+ and dissipative for (1). The role of (H9) in the computation of a “stability index” has been explained in [2]. Let us point out that assumption (H9) immediately implies that the range R(B+) is S+-orthogonal to ker B+. This shows that zero is a semi-simple eigenvalue of B+, a property which was implicit in assumption (H1). We also remark that instability does not occur in scalar problems (n = 1), even at a non-linear level, as shown in [4].

Our paper is organized as follows. In the next section, we study the boundary layer equation in a geometrical setting and we show that the stability analysis reduces to the search of ordinary eigenvalues. In Sect. 3, we build our Evans function, following [20], and focus on its crucial estimate at λ = 0. In Sect. 4, we consider a richer situation, where the boundary layer is parametrized in such a way that it is a piece of a maximal solution of the layer equation. When this solution is a viscous shock profile and the piece is almost the whole, then we show that the stability index is the sign of an algebraic expression. This sign can be computed in several cases. The remaining sections are devoted to full as well as to isentropic gas dynamics. For full gas dynamics (Sect. 5), we show that for an adiabatic constant γ > 2 and when the viscosity coefficient ν dominates the heat diffusion κ (for instance, ν > κ works), there exist unstable boundary layers with inflow. As explained above, such instability is only shown for layers of large amplitude, which are almost heteroclinic orbits of (4).
This result is the main application of our analysis. Finally, an appendix shows that weak boundary layers are spectrally stable, thanks to generalized energy inequalities.

2. Linear and Non-Linear Dynamical Systems

We begin with the non-linear equation (4), that we rewrite as

$$B(U)U' = F(U) - F(u_+). \qquad (10)$$

When r < n, this is not an ODE in the strict sense, but a “differential-algebraic” equation. It may be better to see it under the form

$$b(V)V' = f(V) - f(v_+), \qquad v_+ = v(u_+). \qquad (11)$$

We split this system into two pieces:

$$f_0(W, Z) = f_0(v_+), \qquad b_1(W, Z)Z' = f_1(W, Z) - f_1(v_+).$$

From (H5), the identity f0(v) = f0(v+) allows us to determine w in terms of (z, v+) in a neighbourhood of the range of V: w = ŵ(z, v+). Therefore, the differential part becomes an ODE in z, well-defined in a neighbourhood of the z-projection of this range. Let us write it as

$$z' = G(z; v_+). \qquad (12)$$

We know that ŵ(z+, v+) = w+ and therefore G(z+, v+) = 0.


Lemma 2. Under (H7, H9, H10), the rest point z+ of the dynamical system (12) is hyperbolic. Its stable manifold is of dimension r + p − q.

Proof. One easily computes

$$d_zG(z_+, v_+) = b_1(v_+)^{-1}\,(d_wf_1\,d_z\hat w + d_zf_1)_+ .$$

Let us consider eigenvalues σ of dzG(z+, v+). We have det(dwf1 dzŵ + dzf1 − σb1)+ = 0. However, dzŵ = −(dwf0)⁻¹ dzf0. Thus, using Schur’s formula, we arrive at det(df+ − σb+) = 0, or equivalently det(dF+ − σB+) = 0. Up to a non-zero constant, this determinant is the characteristic polynomial of dG+. According to (H7, H8), σ may not be purely imaginary. Therefore, z+ is a hyperbolic rest point.

We now proceed by homotopy. For m > 0, we define Pm(σ) := det(dF+ − σ(B+ + mIn)). Since the pair (dF+, B+ + mIn) satisfies the assumptions (H7, H8), we again see that Pm does not vanish on the imaginary axis. Since its degree n is constant for m > 0, we deduce that the number of roots of negative real part does not depend on m > 0. Letting m → +∞, we find that this number is n − q. Since the degree of Pm drops to r as m reaches 0+, its roots split into two parts. One set contains those which tend to the roots of P0 with negative real parts. The cardinality of this set is the dimension of the stable manifold. The other set consists of those roots which tend to infinity as m → 0+. To prove the lemma, we need to show that its cardinality is n − r − p, the number of negative eigenvalues of γ+⁻¹h+. By density, we may assume that this matrix has only simple eigenvalues.

We first show that, if Pm(σm) = 0 and σm tends to infinity, then mσm tends to such an eigenvalue. Indeed, Pm(σm) = 0 means that there is an Xm, say of unit norm, such that df+Xm = σm(B+ + m dg+)Xm. Dividing by σm, we first obtain that Xm has a cluster point X̄ in ker B+. Obviously, X̄ has unit norm. Next, retaining only the p first rows of the equality, we have h+X̄ ∼ σm m γ+X̄, which proves the claim.

Conversely, let µ0 < 0 be a negative eigenvalue of γ+⁻¹h+. This means that there exists a non-zero pair (Y0, Z0) with Y0 ∈ ker B+, Z0 ∈ R(B+) and (dF+ − µ0)Y0 = B+Z0. From (H5), µ0 ≠ 0 and we may redefine Z0 so as to have (dF+ − µ0)Y0 = µ0B+Z0. This means in particular that (dF+ − µ0)(ker B+) ∩ R(B+) ≠ {0}. Since the sum of the dimensions of these spaces is n (hypothesis (H10)), this means that their sum has codimension one. Let l0 be a non-trivial linear form vanishing on it. From the simplicity of µ0, we know that l0Y0 ≠ 0; we therefore normalize l0 by l0Y0 = 1. We now define the following non-linear mapping:

$$N : \mathbb{R}^2 \times \ker B_+ \times R(B_+) \to \mathbb{R}\times\mathbb{R}^n, \qquad (m, \mu, Y, Z) \mapsto N(m,\mu,Y,Z) := \begin{pmatrix} l_0Y - 1 \\ (dF_+ - \mu)(Y + mZ) - \mu B_+Z \end{pmatrix}.$$

We already have N(0, µ0, Y0, Z0) = 0. We check easily that the differential d_{µ,Y,Z}N, computed at (0, µ0, Y0, Z0), is injective, thus invertible. From the implicit function theorem, we obtain a locally defined smooth function m → (µ, Y, Z), whose graph is the zero set of N near (0, µ0, Y0, Z0). Then X := Y(m) + mZ(m) and σ := µ(m)/m satisfy (dF+ − σ(B+ + m))X = 0, so that Pm(σ) = 0, with σ < 0.
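The root count used in this proof can be observed on a small example. The following sketch is an illustration under assumed data only: n = 2, r = 1, u = v (so that dg+ = I and γ+ = 1), B+ = diag(0, 1) and dF+ = [[a, 1], [1, 0]], for which h+ = a and q = 1. For m > 0 the roots of Pm are computed as the eigenvalues of (B+ + mI)⁻¹dF+; the function name `roots_Pm` and the values of a and m are hypothetical choices.

```python
import numpy as np

B = np.diag([0.0, 1.0])  # assumed viscosity matrix at u_+, of rank r = 1

def roots_Pm(a, m):
    """Roots sigma of P_m(sigma) = det(dF_+ - sigma (B_+ + m I)), for m > 0."""
    dF = np.array([[a, 1.0],
                   [1.0, 0.0]])  # assumed flux Jacobian: h_+ = a, det dF_+ = -1
    # det(dF - sigma (B + m I)) = 0  <=>  sigma is an eigenvalue of (B + m I)^{-1} dF
    return np.linalg.eigvals(np.linalg.solve(B + m * np.eye(2), dF))

for a in (2.0, -2.0):
    for m in (1.0, 1e-2, 1e-4):
        sigma = roots_Pm(a, m)
        negative = int(np.sum(sigma.real < 0))     # number of roots with Re < 0
        large = sigma[np.argmax(np.abs(sigma))]    # root escaping to infinity as m -> 0+
        print(f"a={a:+.0f} m={m:.0e} negative roots={negative} m*sigma_large={m * large.real:+.4f}")
```

In this toy case the number of negative roots is the same for every m > 0, and m times the escaping root approaches h+ = a as m → 0+. For a > 0 (p = 1) the negative root stays finite, giving a one-dimensional stable manifold; for a < 0 (p = 0) it is the negative root that escapes, and the stable manifold is trivial, consistent with r + p − q.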


Corollary 1. Under (H7, H9, H10), one has p ≤ q ≤ p + r.

We point out that both inequalities in this corollary are equivalent to each other, in the following sense. Let us for instance assume that p ≤ q is true under (H7, H9, H10). Then (−F, B) satisfy (H7, H9, H10) too, with (p, q) replaced by (n − r − p, n − q). Therefore, n − r − p ≤ n − q, or equivalently q ≤ r + p.

We deduce from the lemma that the profile U tends exponentially fast to its limit u+:

$$|U(y) - u_+| + |U'(y)| = O(e^{-\alpha_+ y}), \qquad \alpha_+ > 0. \qquad (13)$$

This is actually clear for Z, then for W, using the formula W = ŵ(Z, v+). We now turn to the linear operator

$$Lu = \{B(U)u' + (dB(U)u)U' - dF(U)u\}'.$$

Its boundary conditions are given by r + p linear forms D1, ..., Dr+p:

$$D_1u(0) = \cdots = D_{r+p}u(0) = 0. \qquad (14)$$

The linear transform u → dv(U)u shows that L is conjugate to l, where

$$lv := dg(V)^{-1}\{b(V)v' + (db(V)v)V' - df(V)v\}'.$$

The boundary conditions transform accordingly:

$$d_1v(0) = \cdots = d_{r+p}v(0) = 0, \qquad (15)$$

where dj ◦ dv(U(0)) = Dj. The operator l is a list of r second-order differential operators and n − r first-order ones. Its domain is

$$D_l = \{(w, z) \in H^1(\mathbb{R}^+)^{n-r}\times H^2(\mathbb{R}^+)^r \;;\; d_j(w(0), z(0)) = 0,\ 1 \le j \le r+p\}.$$

For instance, $D_l = H^1_0(\mathbb{R}^+)^{n-r}\times(H^2(\mathbb{R}^+)\cap H^1_0(\mathbb{R}^+))^r$ when r + p = n, that is when all the eigenvalues of (γ⁻¹h)(V(0)) are positive.

Let us now introduce the constant coefficient operator on the whole line

$$l_+v = (dg_+)^{-1}(b_+v'' - df_+v'),$$

with domain H¹(R)ⁿ⁻ʳ × H²(R)ʳ. It is obtained from l by taking the limit as x → +∞. Its spectrum, computable from the Fourier transform, is given by

$$\sigma_+ = \{\lambda \;;\; \det(\mu^2b_+ + i\mu\,df_+ + \lambda\,dg_+) = 0 \text{ for some } \mu\in\mathbb{R}\}.$$

From (H8), we know that σ+ consists of numbers of strictly negative real part, apart from λ = 0. We shall denote by A the connected component of C \ σ+ which contains the right half-plane {Re λ > 0}. As usual, the following lemma is crucial.

Lemma 3. For all λ ∈ A, the operator λ − l : Dl → L²(R+) is Fredholm with index zero. The eigenvalues of l in A are isolated.

Therefore,

Corollary 2. The unstable spectrum of L, or equivalently l, consists only of isolated eigenvalues of finite multiplicities.


Proof. This follows similarly as in the case of an asymptotically constant-coefficient operator on the whole line, by a now-standard argument of Henry [14]. Specifically, the result for constant-coefficient operators can be established by direct computation, similarly as in [14, p. 138]; this can then be extended to the asymptotically constant case by a version of Weyl’s Lemma (Theorem A.1 of [14, p. 136]) stating that, except for isolated eigenvalues of constant multiplicity, the spectrum of an operator is unchanged by relatively compact perturbation. Indeed, it is readily verified that an asymptotically constant-coefficient operator is a relatively compact perturbation of the corresponding constant-coefficient operator with limiting coefficients at x → +∞, see Exercise 2, p. 137 of [14]. Alternatively, following the approach of [21] for operators on the line, one can establish the result directly, by explicit construction of the Green’s function in terms of the Evans function, followed by a direct computation showing that the location and multiplicity of eigenvalues of L correspond exactly to the location and multiplicity of zeroes of the (analytic) Evans function.

3. The Evans Function

Following the general theory set up in [1], we construct a holomorphic function B : A → C, whose zeroes are the unstable eigenvalues of L. This extends Serre’s construction [20] to the case of a non-invertible matrix B. We call this function the “Evans function” of L. Following Gardner & Zumbrun [7], we show that it extends analytically to a neighbourhood of the origin.

Let λ be a complex number with Re λ > 0, or more generally an element of A. We first rewrite the differential equation (l − λ)v = 0 as a linear first-order system of n + r ordinary differential equations:

$$\begin{pmatrix} w \\ z \\ z' \end{pmatrix}' = M(x;\lambda)\begin{pmatrix} w \\ z \\ z' \end{pmatrix}, \qquad x > 0. \qquad (16)$$

The boundary conditions are rewritten as d̂j(w, z, z′) := dj(w, z) = 0. The matrix M+(λ) = M(+∞; λ) is hyperbolic, that is its eigenvalues have non-zero real parts. These are the zeroes of the polynomial Pλ(µ) := det(µ²b+ − µ df+ − λ dg+). By a continuation argument, there are r + p eigenvalues of negative real part $\mu_1^+(\lambda), \ldots, \mu_{r+p}^+(\lambda)$, counting with multiplicities. The corresponding (generalized) eigenvectors span the “stable subspace” E+(λ) of M+(λ). It follows that the set of bounded solutions of (16) is a vector space of dimension r + p, that we denote by E(λ). Such solutions actually decay exponentially fast as x → +∞. The space E+(λ) is the limit of the trace E(λ; x), as x → +∞. The space E(λ) depends holomorphically on λ. If it were possible to select a holomorphic basis B(λ) = {φ1(·; λ), ..., φp+r(·; λ)} of E(λ), then one would define the Evans function B(λ) directly by

$$B(\lambda) := \det\big(\hat d_j(\phi_k(0;\lambda))\big)_{1\le j,k\le p+r}. \qquad (17)$$

The vanishing of such a number is clearly equivalent to the existence of a linear combination of the φk’s, on which the d̂j’s vanish simultaneously. This amounts to saying that there exists a φ in E(λ), such that d̂1φ = ··· = d̂r+pφ = 0. Equivalently, there is a v ∈ Dl such that (l − λ)v = 0: λ is an eigenvalue of l. Reciprocally, B vanishes at every eigenvalue of l in A. This procedure is possible when r + p = 1. It is also possible in every small open ball in A. However, in the general case, it raises serious difficulties because of the existence of branching points in A, where M+(λ) fails to be diagonalisable. At such points, the natural choice of the basis B, given by prescribed asymptotic behaviour of the φk’s, is meaningless.

To overcome this difficulty, one commonly works in the exterior algebra Λ^{r+p}(C^{n+r}). For 1 ≤ m ≤ n + r, there is a unique homomorphism M^{(m)}(x; λ) in Λ^m(C^{n+r}), such that m solutions φ1, ..., φm of (16) always satisfy

$$\frac{d}{dx}\,(\phi_1\wedge\cdots\wedge\phi_m) = M^{(m)}(x;\lambda)\,\phi_1\wedge\cdots\wedge\phi_m.$$

When m = r + p, $M_+^{(p+r)}(\lambda)$ has the nice property that it has only one eigenvalue µ+(λ) of minimal real part and that it is simple. Actually,

$$\mu_+(\lambda) = \mu_1^+(\lambda) + \cdots + \mu_{r+p}^+(\lambda).$$

The corresponding eigenvector has the form y1 ∧ ··· ∧ yr+p, where {y1, ..., yr+p} is any basis of E+(λ). Since λ → $M_+^{(r+p)}(\lambda)$ is holomorphic and µ+ is simple, µ+(λ) is holomorphic too. Therefore, one may select a holomorphic section λ → Y(λ) of the eigen-bundle:

$$(M_+^{(r+p)}(\lambda) - \mu_+(\lambda))\,Y(\lambda) = 0, \qquad Y(\lambda) \ne 0.$$

In addition, noticing that µ+(λ) is real when λ ∈ ]0, +∞[, we infer that one may choose Y(λ) so that it is real when λ is. In particular, $Y(\bar\lambda) = \overline{Y(\lambda)}$. Next, there is a unique solution y(·; λ) of

$$y' = M^{(r+p)}(x;\lambda)\,y, \qquad y(x;\lambda) \sim (\exp\mu_+(\lambda)x)\,Y(\lambda) \ \text{ as } x\to+\infty. \qquad (18)$$
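The property that the eigenvalues of M^{(m)} are sums of m eigenvalues of M can be checked on a small case. The sketch below is an illustration only: it builds, for m = 2 and an arbitrary constant 3 × 3 matrix (a hypothetical example, unrelated to the actual M+(λ)), the matrix of the map φ1 ∧ φ2 → (Mφ1) ∧ φ2 + φ1 ∧ (Mφ2) in the basis e_i ∧ e_j, i < j. The function name `additive_compound_2` is an assumption.

```python
import itertools
import numpy as np

def additive_compound_2(M):
    """Matrix of phi1 ^ phi2 -> (M phi1) ^ phi2 + phi1 ^ (M phi2) on Lambda^2(C^n),
    written in the basis e_i ^ e_j with i < j."""
    n = M.shape[0]
    pairs = list(itertools.combinations(range(n), 2))
    index = {pair: k for k, pair in enumerate(pairs)}
    C = np.zeros((len(pairs), len(pairs)), dtype=complex)
    for (i, j), col in index.items():
        for a in range(n):
            if a != j:  # term M[a, i] e_a ^ e_j coming from (M e_i) ^ e_j
                key, sign = ((a, j), 1.0) if a < j else ((j, a), -1.0)
                C[index[key], col] += sign * M[a, i]
            if a != i:  # term M[a, j] e_i ^ e_a coming from e_i ^ (M e_j)
                key, sign = ((i, a), 1.0) if i < a else ((a, i), -1.0)
                C[index[key], col] += sign * M[a, j]
    return C

# Hypothetical matrix with two stable eigenvalues (-1, -3) and one unstable (2).
M = np.array([[-1.0, 2.0, 0.5],
              [0.0, -3.0, 1.0],
              [0.0, 0.0, 2.0]])
mu = np.linalg.eigvals(M)
pair_sums = sorted((mu[i] + mu[j]).real for i, j in itertools.combinations(range(3), 2))
compound_eigs = sorted(np.linalg.eigvals(additive_compound_2(M)).real)
print(pair_sums)
print(compound_eigs)
```

Here the eigenvalue of minimal real part of the compound matrix is the sum of the two stable eigenvalues of M, and it is simple, which is the mechanism that makes the section Y(λ) well defined even where M+(λ) itself is not diagonalisable.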

For every λ ∈ A, y(·; λ) equals, up to a constant, a wedge product of a basis of E(λ). Moreover, it inherits the holomorphy of Y(λ). Similarly, $y(\bar\lambda) = \overline{y(\lambda)}$. Defining the (p + r)-form d̂ := d̂1 ∧ ··· ∧ d̂p+r, we may define our Evans function as

$$B(\lambda) = \langle \hat d,\, y(0;\lambda)\rangle.$$

Besides all the above-mentioned properties, we point out that it takes real values on the real positive semi-axis. Given any point λ0 ∈ A, it admits a form (17) in a vicinity of λ0. This is the way we compute its local behaviour in practice.

We now point out that λ → (µ+(λ), Y(λ)) admits an analytic extension in a neighbourhood of the origin. Then, thanks to the exponentially fast convergence of M^{(p+r)}(x; λ) towards $M_+^{(p+r)}(\lambda)$, we deduce (“gap lemma”, see [7]):

Proposition 1. The spaces E+(λ), E(λ), the eigenvector Y(λ), the eigen-function y(·; λ) and the Evans function B(λ) extend analytically to a neighbourhood Â of the origin.

Let us point out that, however, these extensions no longer obey the same definitions. For instance, E+(λ) is no longer the stable subspace of M+(λ), and so on.

Let us describe E+(λ) when |λ| ≪ 1. The eigenvalues of M+(λ) are found by looking at decaying modes e^{µx}v̂ of l+ − λ: these are roots of Pλ. Since

$$P_\lambda(\mu) = \det(-\mu\,df_+ - \lambda\,dg_+) + O(\mu^2),$$

we see that q roots vanish as λ → 0 in A. They behave as −λ/aj, where $a_{n-q+1}, \ldots, a_n$ are the positive eigenvalues of (dg+)⁻¹df+, or of dF+ = df+(dg+)⁻¹. The corresponding eigenvectors are (rj, 0)ᵀ + O(λ), where rj is an eigenvector of (dg+)⁻¹df+ associated to aj. In terms of eigenvectors Rj of dF+, one has Rj = df+rj. The fact that µj extends analytically near the origin is clear when aj is simple, from the implicit function theorem. It is still true when aj is semi-simple (assumption (H6)). The other roots tend to non-zero limits µj(0) as λ → 0. These limits are roots of det(µb+ − df+) = 0. These are the eigenvalues of negative real part of the matrix

$$M_1 := b_1^{-1}\big(d_zf_1 - d_wf_1(d_wf_0)^{-1}d_zf_0\big), \qquad v = v_+.$$

Given a (generalized) eigenvector ẑ of this matrix, one builds a (generalized) eigenvector of M+(0) through

$$\varphi := \begin{pmatrix} -(d_wf_0)^{-1}d_zf_0\,\hat z \\ \hat z \\ \mu\hat z \end{pmatrix}. \qquad (19)$$

In summary, a basis {ϕ1, ..., ϕp+r} of E+(0) is given by

$$\varphi_j = \begin{pmatrix} r_{n-q+j} \\ 0 \end{pmatrix} \ \text{ if } j \le q, \qquad \varphi_j = \begin{pmatrix} -(d_wf_0)^{-1}d_zf_0\,\hat z_j \\ \hat z_j \\ \mu_j(0)\hat z_j \end{pmatrix} \ \text{ if } q < j \le p+r.$$

Hereabove, {ẑq+1, ..., ẑp+r} is a basis of the stable subspace of M1, in which M1 has a Jordan form, diagonal if possible. The µj(0) are the corresponding eigenvalues of M1.

We now turn to the elements of E(0). These are solutions φ = (v, z′)ᵀ of (16) with λ = 0. This amounts to lv = 0, or {b(V)v′ + (db(V)v)V′ − df(V)v}′ = 0. Integrating once, we obtain a first-order differential-algebraic equation:

$$b(V)v' + (db(V)v)V' - df(V)v = \text{constant} =: q. \qquad (20)$$

Though E(λ) is made up of functions decaying at +∞ when λ ∈ A, this is not true any more for λ = 0. However, E(0) certainly contains all the exponentially decaying solutions of (16). These correspond to the decaying solutions of the homogeneous equation

$$b(V)v' + (db(V)v)V' - df(V)v = 0. \qquad (21)$$

Such solutions form a vector subspace of dimension p + r − q, a basis of which is {φq+1(0), ..., φp+r(0)}, where φj(0) solves (16) and

$$\phi_j(x;0) \sim (\exp\mu_j(0)x)\,\varphi_j, \qquad x\to+\infty, \qquad q < j \le p+r.$$

The remaining elements of E(0) actually do not decay, but have finite limits ϕ ∈ Span{ϕ1, ..., ϕq} (the case ϕ = 0 corresponds to the decaying solutions, already considered). The constant in (20) is computed by letting x → +∞. We thus complete a basis of E(0) by choosing φj(0), solutions of (16) with λ = 0, according to

$$\lim_{x\to+\infty}\phi_j(x;0) = \varphi_j, \qquad 1 \le j \le q.$$


With φj =: (vj, zj′)ᵀ, we obtain from (20)

$$b(V)v_j' + (db(V)v_j)V' - df(V)v_j = -R_{n-q+j}, \qquad 1 \le j \le q.$$

Once a basis B(0) = {φ1(0), ..., φp+r(0)} of E(0) is chosen according to the above requirements, it is extendable in an analytic way in Â, as a basis of E(λ).

Two remarks. First, we point out that there remains much room in the choice of B(0). Second, as mentioned above, lV′ = 0 and V′ decays at infinity. Therefore, V′ ∈ E(0). Moreover, the asymptotic behaviour of V′ is generically

$$V'(x) \sim e^{\mu_j(0)x}\,r,$$

for some index j > q, with (µj(0)b − df)+ r = 0. In the case where µj(0) is real, we may choose φj = (V′, 0)ᵀ.

3.1. Discussion of (20). We now show that (20) may be viewed as a traditional ODE, instead of a differential-algebraic equation. Let us denote the constant right-hand side by q ∈ Rⁿ. We first split the equation into two parts. With q = (q0, q1)ᵀ and v = (w, z)ᵀ:

$$df_0(V)v = -q_0, \qquad (22)$$

$$b_1(V)z' + (db_1(V)v)Z' - df_1(V)v = q_1. \qquad (23)$$

We now differentiate (22) and keep (23) unchanged. This yields

$$\tilde b(x)v' - \tilde a(x)v = \tilde q, \qquad (24)$$

with

$$\tilde b = \begin{pmatrix} h & \cdot \\ 0 & b_1 \end{pmatrix}, \qquad \tilde a = \begin{pmatrix} -(df_0)' \\ \cdots \end{pmatrix}, \qquad \tilde q = \begin{pmatrix} 0 \\ q_1 \end{pmatrix}.$$

Thanks to (H5), b̃ is invertible. Therefore, (24) is a linear ODE in the traditional form. Every solution of (20) solves (24). Conversely, let v be a solution of (24), with a constant vector q̃ and q̃0 = 0, and assume that (v(x), v′(x)) → (v∞, 0) as x → +∞. Then (20) holds true, with q1 := q̃1 and q0 := −df0(v+)v∞.

4. Parametrized IBVPs

Let U : R → $\mathcal{U}$ be a given solution of the differential-algebraic system (10). We emphasize that it is defined on the whole line R, instead of on the semi-axis R+. We now consider the initial-boundary value problem for (1), on the space domain I := ]x0, +∞[, instead of R+. For this, we provide (1) with p + r suitable boundary conditions, possibly depending on the choice of x0, and we assume that the restriction U|I satisfies these conditions. We are now concerned with the linear stability of U|I for the corresponding IBVP. For this, we construct the Evans function B(x0; λ). We easily see that it can be built as a continuous function with respect to x0. Here, we focus on the sign of W(x0) := B(x0; 0), versus the sign of B(x0; ·) near +∞. By the intermediate value theorem, opposite signs imply the existence of at least one real positive root. In particular, U|I would be unstable. More precisely, opposite signs mean
that the number of unstable eigenvalues of L is odd, while same signs mean that this number is even, keeping in mind that non-real eigenvalues come in complex conjugate pairs. In the sequel, we shall call this parity the stability index of L. Since it can be checked that B(x0; ·) does not vanish near infinity, a consequence of a Gårding estimate (see [2]), its sign does not depend on x0. Therefore, the only ingredient in the computation of the stability index of L is the sign of W(x0), which may vary with x0.

Though the exact computation of W is not easy, we may expect to obtain some results by means of a qualitative study of W. Notice that, in this case, a suitable choice of solutions φ1, ..., φp+r of φ′ = M(x; 0)φ gives a coherent basis of E(x0; 0), for all x0. That is, {φ1|I, ..., φp+r|I} form a basis B(x0; 0). As in the previous section, we choose the φj’s so that φj decays exponentially fast as x → +∞ if j > q, and φj tends to ϕj as x → +∞ if j ≤ q. We also decompose φj =: (vj, zj′)ᵀ.

A convenient tool for this study is a differential equation that W should satisfy. Let us consider for instance the easiest case where p + r = n, that is when the boundary condition for (1) is a pure Dirichlet one (in other cases, it will often depend on x0 when U is not constant). Then, W(x0) = det(v1(x0), ..., vn(x0)). Let us differentiate, using the matrix K := b̃⁻¹ã:

$$W' = \sum_j \det(\ldots, v_{j-1}, Kv_j + \tilde b^{-1}\tilde q_j, v_{j+1}, \ldots)$$

$$\phantom{W'} = (\mathrm{Tr}\,K)\,W + \sum_{j=1}^{q}\det(\ldots, v_{j-1}, \tilde b^{-1}\tilde q_j, v_{j+1}, \ldots)$$

$$\phantom{W'} = (\mathrm{Tr}\,K)\,W + \frac{1}{\det\tilde b}\sum_{j=1}^{q}\det(\ldots, \tilde bv_{j-1}, \tilde q_j, \tilde bv_{j+1}, \ldots).$$

Lemma 4. The spectrum of K+ = K(+∞) is made up of the eigenvalues of dG+ (with the same multiplicities), plus µ = 0, with multiplicity n − r.

Proof. Because the first n − r rows of ã+ vanish, we easily see that

$$(K_+ - \mu)\begin{pmatrix} w \\ z \end{pmatrix} = 0$$

is equivalent to either µ = 0 or to dG+z = µz with appropriate w. The case where eigenvalues of dG+ are simple is done. The case of higher multiplicity follows by a density argument, as in, e.g., [6, 5].

Two subcases, q = p or p + 1 (recall that we already know that q ≥ p), are of strong interest for applications. We point out that the p first components of b̃vk consist of the vector df0(V)vk, that is of −qk,0, with qk =: (qk,0, qk,1)ᵀ. For k > q, this is zero, since qk = 0. Since q̃j,0 vanishes too, we see that each of the above n × n determinants contains a null block of size p × (n − q + 1). Let us first consider the case q = p. From p + (n − q + 1) = n + 1, we conclude that each of these determinants vanishes, so that W′ = (Tr K)W. Therefore, W does not vanish; it keeps a constant sign. Since we can
prove (see Appendix A, below) that weak boundary layers are linearly stable, we already know that the signs of B(x0, 0) and B(x0, λ) for λ ≫ 1 agree for x0 ≫ 1. We conclude that they do, for all x0:

Theorem 1. Let U be a boundary layer, with p + r = n (so that the boundary condition is a pure Dirichlet one) and q = p. Then the stability index of the linearized operator L is even.

From this alone, we cannot conclude anything regarding the linear stability of U. The next subcase comes with q = p + 1. The same arguments as above show that each of the determinants is block-triangular, since p + (n − q + 1) = n. Therefore, they may be written as products of two determinants, of respective sizes p × p and (n − p) × (n − p). However, we may rewrite the differential equation in a simpler form, with the next lemma.

Lemma 5. If r + p = n and q = p + 1, then

$$\sum_{j=1}^{q}\det(\ldots, \tilde bv_{j-1}, \tilde q_j, \tilde bv_{j+1}, \ldots) = (-1)^p\det(q_1, \ldots, q_{p+1}, bv_{q+1}, \ldots, bv_n).$$

Proof. Let us define Qj := qj − q̃j = (qj,0, 0)ᵀ. We use qj = q̃j + Qj and the linearity of the determinant to rewrite det(q1, ..., qp+1, bvq+1, ..., bvn) as a sum of 2^{p+1} terms. Those containing two (or more) q̃j vanish, since they contain a null block of size p × (n − p + 1). The term with only Qj’s vanishes too, since it contains a null block of size (n − p) × (p + 1). There remain only those terms with exactly one q̃j. These ones are block diagonal and are not changed when replacing the lower null block (of size (n − p) × p) by a non-zero block. Such a change is performed when replacing the corresponding Ql’s by −b̃vl, since their first p components agree. We finally obtain the expected formula by noticing that bvk = b̃vk for k > q.

We now face the linear ODE

$$W' = (\mathrm{Tr}\,K)\,W + (-1)^p\det(q_1, \ldots, q_{p+1}, bv_{q+1}, \ldots, bv_n) =: \tau(x)W + s(x), \qquad (25)$$

for which we can write

$$W(x)\exp\left(-\int_0^x\tau(y)\,dy\right) = c + \int_0^x s(y)\exp\left(-\int_0^y\tau(\xi)\,d\xi\right)dy,$$

where c is a constant. Then the sign of W equals that of the right-hand side. This may be evaluated for x → −∞, independently of c, each time the integral

$$\int_{-\infty}^{x} s(y)\exp\left(-\int_0^y\tau(\xi)\,d\xi\right)dy \qquad (26)$$

diverges.
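The integrating-factor representation above, and the way the sign of W becomes independent of c when (26) diverges, can be illustrated numerically. The sketch below is an assumption-laden toy: it takes constant τ = µ + σ and s(y) = S e^{µy}, mimicking the asymptotics used in Sect. 4.1 but with arbitrary numbers, and evaluates W with the trapezoidal rule. The function name `solve_by_integrating_factor` and all numerical values are hypothetical.

```python
import numpy as np

def solve_by_integrating_factor(tau, s, x, c):
    """Evaluate W on the grid x (with x[0] = 0) from
    W(x) exp(-int_0^x tau) = c + int_0^x s(y) exp(-int_0^y tau) dy,
    using the trapezoidal rule for both integrals."""
    dx = np.diff(x)

    def cumulative(f):
        # cumulative trapezoidal integral from x[0] to each grid point
        return np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * dx)))

    T = cumulative(tau(x))              # int_0^x tau
    I = cumulative(s(x) * np.exp(-T))   # int_0^x s exp(-int_0^y tau)
    return (c + I) * np.exp(T)

# Assumed model data: tau = mu + sigma (constant), s(y) = S exp(mu y).
mu, sigma, S = 1.0, 2.0, 1.0
x = np.linspace(0.0, -10.0, 20001)      # grid running from 0 towards -infinity
for c in (5.0, -5.0):
    W = solve_by_integrating_factor(lambda y: (mu + sigma) + 0.0 * y,
                                    lambda y: S * np.exp(mu * y), x, c)
    print(c, np.sign(W[-1]), np.sign(-S / sigma))
```

With these choices the integrand of (26) behaves like S e^{−σy}, which is not integrable at −∞, so the sign of W far to the left is the sign of −S/σ whatever the constant c.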


4.1. Boundary layers from 1-shock profiles. A rather interesting case consists in choosing for U a viscous shock profile of a steady shock wave, that is a solution of (10), which admits a limit u− as x → −∞. In order to have some control on the behaviour of U as x → −∞, we ask that this shock be non-characteristic for (2): df− := df(v−) is invertible. Again, z− is a hyperbolic point for the ODE (12), and Z takes values in its unstable manifold. Then, generically, there exists a pair (µ−, r−), such that

$$V'(x) \sim e^{\mu_-x}\,r_-, \qquad x\to-\infty, \qquad (27)$$

and (µb − df)− r− = 0. We shall focus on the most frequent case of a Lax shock, that is an (n − q)-Lax shock. Similarly to Lemma 2, we have

Lemma 6. For an (n − q)-Lax shock, the unstable manifold of z− for (12) is of dimension 1 + q − p.

Example (fundamental). In full gas dynamics, (1) is the Navier–Stokes model, with viscosity and heat conduction. Then, n = 3, r = 2. Let us assume that p = 1 and q = 2; that is 0 < u+ < c+, where c denotes the sound speed. Then the stable manifold of z+ for (12) is a curve, which splits into two trajectories. For reasonable state laws (such as the perfect gas law p = (γ − 1)ρe), there is a single other state u− such that F(u−) = F(u+), and the pair (u−, u+) is a 1-Lax shock (let us remark that 2-shocks do not exist in gas dynamics, since the second characteristic field is linearly degenerate). From Gilbarg’s study [8], we know that a shock profile exists for every positive choice of the viscosity and heat conductivity. Therefore, one of the two possible trajectories U such that U(+∞) = u+ is actually such a shock profile. This argument certainly admits generalisations to many systems and states u+ such that q = n − 1. Our assumption that U is a piece of a shock profile is thus not too restrictive.

Let us assume that (u−, u+) is such a shock and that U is its profile, with as before r + p = n, q = p + 1. Since we do not know explicitly the vq+1, ..., vn, except for vn = V′, we cannot in general evaluate s(x). We therefore assume q = n − 1, that is r = 2, p = n − 2. Then, noticing that f(v−) = f(v+) because of the Rankine–Hugoniot condition,

$$s = (-1)^n\det(q_1, \ldots, q_{n-1}, bV') = (-1)^n\det(q_1, \ldots, q_{n-1}, f(V) - f(v_\pm)).$$

Since (u−, u+) is a 1-shock, Lemma 6 shows that z− is a source for (12). Thus, all eigenvalues of dG−, and from Lemma 4, both non-zero eigenvalues of K−, have positive real parts, one of them being µ−, the other one denoted by σ−. Thus we have Tr K− = µ− + σ− > µ−. Then the integrand in (26) is equivalent to Se^{−σ−x}, with S := (−1)^p det(q1, ..., qn−1, b−r−). This proves that (26) diverges.

Then there are two generic pictures. Either µ− is not real, and σ− = µ̄−. Then W oscillates, as shown in [20], so that the stability index is odd when x0 belongs to a denumerable union of intervals, implying the instability of the corresponding boundary layers. Or µ−, σ− are real and simple. Then the sign of W as
x → −∞ is the one of −S/σ− , an explicit quantity ! This is made even simpler when choosing, as it is possible, qj = −Rj +1 (recall that (dF+ − aj )Rj = 0 and aj > 0): S = − det(R2 , . . . , Rn , b− r− ) = − det(R2 , . . . , Rn , B− R− ), with R− = df− r− . Choosing appropriately an eigenform L1 of dF+ (that is a non-trivial solution of L1 (dF+ − a1 ) = 0), we also have S = L1 B− R− = L1 b− r− . At this stage, one may wonder about the consistency of this analysis, since a change in the choice of the basis B(0) could result, for instance, in a flip of the sign of S. This is without taking into account the need to define continuously the bases B(λ) along R+ . Such a modification would also change the sign of B(x0 ; λ) for λ >> 1, but it would not affect the sign of the product B(x0 ; 0)B(x0 ; +∞), of course since it is intrinsic. For the moment, let us say that, considering a one-parameter family of steady shock waves L → (u− , u+ ), endowed with a smoothly varying family of profiles L → UL , we obtain a presumably smooth function L → S(L), instead of a single number. Then, detecting a value, for instance L = 0, where S vanishes with dS/dL = 0, we conclude that for, say L > 0, S and B(x0 ; λ) have the same sign, for all x0 and all λ >> 1. Therefore, there is a point X(L) such that, for L > 0 and x0 < X(L), the corresponding profile is unstable (let us point out that limL→0 X(L) = −∞). To summarize this analysis, we write For r = 2 and a profile U of a steady 1-shock wave, the vanishing of S detects the instability of some boundary layers associated to nearby steady shocks, with x0 1 is the adiabatic constant. Its value is 5/3 for a mono-atomic gas, 7/5 for the air.


Table 1. Numbers of boundary conditions

v+            v+ < −c+    −c+ < v+ < 0    0 < v+ < c+    c+ < v+
q                 0             1               2             3
[p, p + r]     [0, 2]        [0, 2]          [1, 3]        [1, 3]

From now on, we consider only perfect gases. Without loss of generality, we may assume that θ ≡ e. We therefore have

$$f(v) = \begin{pmatrix} \rho v \\ \rho v^2 + (\gamma-1)\rho e \\ \left(\tfrac12v^2 + \gamma e\right)\rho v \end{pmatrix}, \qquad b(v) = \begin{pmatrix} 0 & 0 & 0 \\ 0 & \nu & 0 \\ 0 & \nu v & \kappa \end{pmatrix}.$$

We notice that r = 2. We easily check (H1–H4, H6) for the Navier–Stokes system. Assumption (H5) asks that v(0) ≠ 0. Since ρv is a constant along a boundary layer (because of the first line of (11)), this amounts to v+ ≠ 0. Next, denoting the sound speed by c := √(γ(γ − 1)e), the characteristic speeds in the Euler equations (2) are v − c, v, v + c, so that (H7) asks that v+(v+² − c+²) ≠ 0. The number of boundary conditions that we need for (1) and (2) is given by Table 1. From the existence of a strictly convex, dissipative entropy, we know that (H9) holds. Since (H10) holds trivially, Lemma 1 shows that (H8) is satisfied. Therefore, the construction of the Evans function and the analysis done in the previous section apply when v+ > 0. When v+ > c+, the boundary layer is trivial (that is, constant) and the linearized IBVP has a full Dirichlet boundary condition. An obvious a priori estimate shows that such a layer is stable. The situation is less clear when v+ < 0, since then a choice has to be made concerning the boundary condition (we need only two scalar data). Since the Evans function strongly depends on the linearized ones d1, d2, we anticipate that the stability index of a layer will depend not only on the layer but also on the linearized boundary conditions. We shall illustrate this dependence in the simpler case of an isentropic flow (see Sect. 6).
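The content of Table 1 can be restated as a small function. This is only a sketch of the bookkeeping described above, for the Navier–Stokes system with n = 3 and r = 2; the function name `boundary_condition_counts` is hypothetical, and the rule p = 1 for v+ > 0, p = 0 for v+ < 0 is read off from the last row of the table.

```python
def boundary_condition_counts(v_plus, c_plus):
    """Return (q, p, p + r) for the Navier-Stokes system (n = 3, r = 2), following Table 1.

    q     : number of positive characteristic speeds among v - c, v, v + c,
            i.e. the number of boundary conditions for the Euler system (2);
    p + r : number of boundary conditions for the viscous system (1).
    """
    if c_plus <= 0:
        raise ValueError("the sound speed must be positive")
    speeds = (v_plus - c_plus, v_plus, v_plus + c_plus)
    if any(speed == 0 for speed in speeds):
        raise ValueError("characteristic boundary: (H5) or (H7) fails")
    r = 2
    q = sum(1 for speed in speeds if speed > 0)
    p = 1 if v_plus > 0 else 0
    return q, p, p + r

for v in (-2.0, -0.5, 0.5, 2.0):
    print(v, boundary_condition_counts(v, 1.0))
```

In each of the four regimes the result satisfies p ≤ q ≤ p + r, as required by Corollary 1.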

5.1. The case 0 < v+ < c+. The case considered in the previous section (r = 2, q = n − 1 = 2, p = n − 2 = 1) corresponds to the choice 0 < v+ < c+. From Gilbarg [8], we know that, given such a state v⁺, there is a unique state v⁻ with f(v⁻) = f(v⁺), and that the pair (v⁻, v⁺) is a 1-Lax steady shock. In particular, v− > c−. Also, it is proved that this shock admits a viscous profile V, for every choice of the positive functions ν and κ. We first compute the expression S = L1 b− r−. The differential form L1 is the eigenform for dF+, associated to v+ − c+, its first eigenvalue. We have

\[
L_1 = (\gamma-1)(e_+\,d\rho + \rho_+\,de) - \rho_+ c_+\,dv
    = \left(\frac{\gamma-1}{2}v^2 + vc\right)_+ du_1 - \bigl(c + (\gamma-1)v\bigr)_+ du_2 + (\gamma-1)\,du_3 .
\]


Dropping for a moment the minus indices, r = r− obeys the eigen-equation (µb − df)r = 0. With r =: (x, y, z)^T, this reads

\[
\mu \begin{pmatrix} 0 \\ \nu y \\ \nu v y + \kappa z \end{pmatrix}
= \begin{pmatrix} vx + \rho y \\ (v^2 + (\gamma-1)e)x + 2\rho v y + (\gamma-1)\rho z \\ \left(\tfrac12 v^2 + \gamma e\right)vx + \left(\tfrac32 v^2 + \gamma e\right)\rho y + \gamma\rho v z \end{pmatrix}.
\]

From vx + ρy = 0, we eliminate x. Making also a linear combination of the two last equalities, we arrive at

\[
\mu\nu v y = (\gamma-1)\rho v z + \rho\bigl(v^2 - (\gamma-1)e\bigr)y, \qquad \mu\kappa z = \rho\bigl((\gamma-1)e y + v z\bigr).
\]

Since r and v are non-zero, we have (y, z) ≠ 0, which implies

\[
\begin{vmatrix} \mu\nu v + \rho\bigl((\gamma-1)e - v^2\bigr) & -(\gamma-1)\rho v \\ -(\gamma-1)\rho e & \mu\kappa - \rho v \end{vmatrix} = 0. \tag{28}
\]

Defining

\[
\zeta := \frac{\mu\kappa}{\rho v} - 1,
\]

this is rewritten as

\[
\frac{\nu}{\kappa}\,\zeta(\zeta+1) + \zeta\left(\frac{1}{\gamma M^2} - 1\right) + \frac{1-\gamma}{\gamma M^2} = 0, \tag{29}
\]

where M = v−/c− > 1 is the Mach number. This quadratic equation has two real solutions of opposite signs, the negative one corresponding to the smallest “eigenvalue” µ. That is precisely the one which governs the asymptotic behaviour of V near −∞ (see [8]). In passing, this shows that these eigenvalues are simple and real, so that the behaviour (27) is correct. From now on, by ζ we mean the negative root of (29). The corresponding eigenvector is given by the formula

\[
r = \begin{pmatrix} -\rho\zeta \\ v\zeta \\ (\gamma-1)e \end{pmatrix}, \qquad
br = \begin{pmatrix} 0 \\ \nu v \zeta \\ \nu v^2 \zeta + \kappa(\gamma-1)e \end{pmatrix},
\]

everything being evaluated at v⁻. Finally,

S = L1 b− r− = −(c+ + (γ − 1)(v+ − v−))ν− v− ζ + κ−(γ − 1)e−.

We now investigate the possible vanishing of S. First of all, since ν, κ, v, e are positive and ζ is negative, ℵ := c+ + (γ − 1)[v] needs to be negative. This is far from true in general. For instance, a weak shock (v⁻ being close to v⁺) yields ℵ ∼ c+ > 0. Next, it can be shown that, as long as γ ≤ 2, all the 1-shocks satisfy ℵ > 0. However, when γ > 2, strong 1-shocks satisfy ℵ < 0. For instance, so-called “maximal” shocks (see


[3]), for which e− vanishes, or equivalently M = +∞, give

\[
\aleph = \left(\sqrt{\frac{2\gamma}{\gamma-1}} - 2\right) v_+ ,
\]

and the parenthesis is negative if and only if γ > 2. We therefore restrict to the case γ > 2 and select a strong enough steady 1-shock, that is one for which ℵ < 0. Having fixed the state v⁻ in such a way, S appears to be a homogeneous function of degree one of (ν−, κ−). Thus its vanishing depends only on the ratio ν/κ. However, we easily obtain the following asymptotics:

ν/κ → 0+: then S ∼ (γ − 1)e−κ− > 0,
ν/κ → +∞: then S ∼ ℵv−ν− < 0.

By continuity we see that S vanishes for some value of the ratio ν/κ. In order to have a qualitative feeling for this value, let us consider the example of a maximal 1-shock. Using β := ν/κ and L := 1/(γM²) > 0, L β∗(γ)κ, particularly when ν ≥ κ, then S < 0 for sufficiently strong shocks, that is for sufficiently large M. Let us point out that the value of M > 1 determines (v−², v+², e−, e+), up to a single multiplicative positive constant, which just factorizes in S.
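The sign change of S in the ratio ν/κ can be observed numerically. The sketch below is only an illustration under stated assumptions: it normalizes v− = 1 and κ− = 1, uses the classical Rankine–Hugoniot relations of a perfect gas to recover (e−, v+, c+) from γ and M, takes ζ as the negative root of (29), and evaluates S with the expression printed above. The function names shock_state, aleph, negative_root and stability_quantity are ours.

```python
import math


def shock_state(gamma, mach):
    """Perfect-gas steady shock, normalized by v_- = 1 (our assumption).

    Returns (v_minus, e_minus, v_plus, c_plus), using c^2 = gamma*(gamma-1)*e,
    the density-ratio formula and conservation of v^2/2 + gamma*e.
    """
    v_minus = 1.0
    e_minus = v_minus ** 2 / (gamma * (gamma - 1.0) * mach ** 2)
    v_plus = v_minus * ((gamma - 1.0) * mach ** 2 + 2.0) / ((gamma + 1.0) * mach ** 2)
    c_plus_sq = (gamma - 1.0) * (0.5 * (v_minus ** 2 - v_plus ** 2) + gamma * e_minus)
    return v_minus, e_minus, v_plus, math.sqrt(c_plus_sq)


def aleph(gamma, mach):
    """The quantity c_+ + (gamma - 1) * (v_+ - v_-)."""
    v_minus, _, v_plus, c_plus = shock_state(gamma, mach)
    return c_plus + (gamma - 1.0) * (v_plus - v_minus)


def negative_root(beta, gamma, mach):
    """Negative root of (29): beta*z*(z+1) + z*(L-1) + (1-gamma)*L = 0."""
    ell = 1.0 / (gamma * mach ** 2)
    a, b, c = beta, beta + ell - 1.0, (1.0 - gamma) * ell
    return (-b - math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)


def stability_quantity(beta, gamma, mach):
    """S / kappa_- with nu_- / kappa_- = beta, from the printed formula for S."""
    v_minus, e_minus, _, _ = shock_state(gamma, mach)
    zeta = negative_root(beta, gamma, mach)
    return -aleph(gamma, mach) * beta * v_minus * zeta + (gamma - 1.0) * e_minus


if __name__ == "__main__":
    gamma, mach = 3.0, 10.0  # gamma > 2 and a strong shock
    print("aleph =", aleph(gamma, mach))
    for beta in (0.01, 0.1, 1.0, 10.0, 100.0):
        print(f"nu/kappa = {beta:7.2f}: S/kappa = {stability_quantity(beta, gamma, mach):+.5f}")
```

For γ = 3 and M = 10 the quantity ℵ is negative, and the computed value of S is positive for small ν/κ and negative for large ν/κ, in agreement with the two asymptotics above.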


Existence of unstable boundary layers. We now give the conclusion. Because the 1-shock curves are connected, and because of Galilean invariance, the set of steady 1-shocks is connected too. Then we may consider the Evans function as a continuous function of, altogether, λ, x0, ν, κ and the shock itself. Consequently, W is a continuous function of x0, ν, κ and the shock. Thanks to the Gårding inequality, we know that the stability index depends only on the sign of W. Since the “eigenvalues” µ are distinct and real, we know that the sign of W(x) becomes constant as x → −∞, and that it is opposite to the sign of S. In turn, S̃ := S/ν− is a continuous function of ν−/κ− and the shock. On one hand, we know that for small shocks, S̃ is positive. On the other hand, we know that the corresponding boundary layers, being small whatever x0 is chosen, are stable (this comes from a direct entropy estimate on the linearized system). By continuity, we therefore conclude that the stability index of a boundary layer V|(x0,+∞) is even when W(x0) is negative and odd when it is positive. Now, let γ be larger than 2, assume that ν > β∗(γ)κ, for instance ν > κ, and let the shock (v⁻, v⁺) be strong enough. Then S > 0, which means that W(x0) is negative for x0 < 0, large enough. Then the corresponding boundary layer is unstable.

6. Isentropic Gas Dynamics

In one-space dimensional isentropic gas dynamics, the flow is described by v = (ρ, v) only. Therefore, n = 2 and the conservation laws express the mass and momentum balances. We have

\[
g(v) = \begin{pmatrix} \rho \\ \rho v \end{pmatrix}, \qquad
f(v) = \begin{pmatrix} \rho v \\ \rho v^2 + p(\rho) \end{pmatrix}, \qquad
b(v) = \begin{pmatrix} 0 & 0 \\ 0 & \nu(\rho) \end{pmatrix}.
\]

Hereabove, ν > 0 (thus r = 1) and the pressure satisfies (hyperbolicity) p′ > 0. The sound speed is here c := √p′. The eigenvalues of dF are v ± c(ρ) and the one of dg⁻¹h (a 1 × 1 matrix!) is v. Contrary to the case of full gas dynamics, both matrices do not share a common eigenvalue.
We summarize the number of boundary conditions for the viscous (p + r) and the inviscid (q) problems in Table 2.

Table 2. Isentropic case

  v+            (−∞, −c+)   (−c+, 0)   (0, c+)   (c+, +∞)
  q                 0           1         1          2
  [p, p + r]      [0, 1]      [0, 1]    [1, 2]     [1, 2]

There are four distinct cases:

v+ > c+: then q = r + p, so that every boundary layer is trivial (that is, constant). Since r + p = n, the linearized boundary condition is homogeneous Dirichlet. The “layer” is linearly stable, from an obvious a priori estimate.

0 < v+ < c+: then q = p and r + p = n, so that W is a Wronskian. It cannot vanish. Thus the sign of B(x0; 0)B(x0; +∞) is constant, therefore positive. The stability index is even.

−c+ < v+ < 0: then q = p + r, so every boundary layer is constant. Since r + p = 1, there is only one boundary condition for the Navier–Stokes system. Let αρ + βv = 0 be the linearized boundary condition, with β ≠ 0 for the viscous IBVP being locally well-posed. The subspace E(0) is spanned by the constant v = (−ρ+, c+). Then B(0) = −αρ+ + βc+. From an obvious energy estimate, it is stable when α = 0. By continuity, we conclude that the stability index is even (resp. odd) when (α/β)ρ+ < c+ (resp. > c+).

v+ < −c+: then q = p and r + p = 1. Here, E(0) is spanned by V′ (if V is not trivial). With the same notations as above, B(0) = αρ′(0) + βv′(0). Since ρv ≡ j in the layer, the vanishing of B(0) is equivalent to αρ² = βj. By continuity, the stability index depends only on the sign of

\[
\frac{\alpha}{\beta}\,\rho(0)^2 - j.
\]

For α = 0, we find that a constant layer (seen as a limit case) is stable, from an obvious energy estimate. Therefore the stability index is even (resp. odd) when the above quantity is positive (resp. negative).

7. Appendix A

In this appendix, we establish the stability of weak boundary layers in the case of degenerate viscosity, under the simplifying hypothesis r + p = n followed in Sects. 4–5; as discussed in the introduction, this corresponds to the case of full Dirichlet boundary conditions. This extends prior results of [10] and [12] in the one-dimensional and multidimensional case, respectively, obtained for strictly parabolic viscosities. Similarly as in [10, 12], a key ingredient in the proof is the following elementary Poincaré estimate.

Lemma 7. Let w ∈ C¹[0, +∞) vanish at x = 0. Then, for any weighting function α(·) > 0, we have

\[
\int_0^{+\infty} \alpha(y)|w(y)|^2\,dy \le \left(\int_0^{+\infty} \alpha(y)|y|\,dy\right)\left(\int_0^{+\infty} |w'(y)|^2\,dy\right). \tag{30}
\]

Proof. Cauchy–Schwarz inequality, applied to w(x) = ∫₀ˣ w′(y) dy, gives

\[
|w(x)|^2 \le \left(\int_0^x 1\,dy\right)\left(\int_0^x |w'(y)|^2\,dy\right) = |x| \int_0^x |w'(y)|^2\,dy,
\]

whence

\[
\int_0^{+\infty} \alpha(y)|w(y)|^2\,dy \le \int_0^{+\infty} \alpha(y)|y| \int_0^y |w'(z)|^2\,dz\,dy
= \int_0^{+\infty} |w'(z)|^2 \left(\int_z^{+\infty} \alpha(y)|y|\,dy\right) dz,
\]

proving the claim.
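The estimate (30) is easy to test numerically on a concrete example. The sketch below is purely illustrative: the choices w(y) = y e^{−y} and α(y) = e^{−y}, the truncation of the half-line at y = 40 and the trapezoidal quadrature are our assumptions, not part of the argument.

```python
import math


def trapezoid(f, a, b, n=20000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h


def poincare_sides(w, dw, alpha, upper=40.0):
    """Return (lhs, rhs) of estimate (30), truncating the integrals at `upper`."""
    lhs = trapezoid(lambda y: alpha(y) * w(y) ** 2, 0.0, upper)
    weight = trapezoid(lambda y: alpha(y) * abs(y), 0.0, upper)
    energy = trapezoid(lambda y: dw(y) ** 2, 0.0, upper)
    return lhs, weight * energy


# Sample data: w vanishes at 0, and dw is its exact derivative.
def w(y):
    return y * math.exp(-y)


def dw(y):
    return (1.0 - y) * math.exp(-y)


def alpha(y):
    return math.exp(-y)


if __name__ == "__main__":
    lhs, rhs = poincare_sides(w, dw, alpha)
    print(f"lhs = {lhs:.6f}, rhs = {rhs:.6f}, inequality holds: {lhs <= rhs}")
```

For this choice the three integrals can be computed by hand (2/27, 1 and 1/4 respectively), so the left-hand side is about 0.074 and the right-hand side is 0.25.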


Proposition 2. Fixing u+, assume that (H1)–(H10) hold in some neighborhood 𝒰 of u+. Further, suppose that r + p = n, where r and p are defined as in the introduction. Then, boundary layers U : U(+∞) = u+ that are sufficiently “weak”, in the sense that the entire profile U(·) is contained in a sufficiently small ball about u+, are spectrally stable.

Proof. The result follows by a combination of two energy estimates. The first is the one used to establish stability in the strictly parabolic case: Writing the linearized eigenvalue equation in original coordinates, we have

\[
\lambda u + (Au)' = (Bu')', \tag{31}
\]

where B := B(U) is singular, and Aα := dF(U)α − dB(U)(α, U′) for any vector α. From (H6) and (H9), we find by continuity that, for sufficiently weak profiles, there exists a symmetric, positive definite symmetrizer S(x) ≥ s0 > 0 such that S(x)dF(U(x)) is symmetric for all x (recall, existence of a symmetrizer is equivalent to semisimple, real spectrum for dF), and, moreover (dissipativity):

\[
X^* S B X \ge \beta |BX|^2 . \tag{32}
\]

Taking the real part of the (complex) L² inner product of Su against Eq. (31), carrying out various integrations by parts, and rearranging, we obtain the basic energy estimate:

\[
\mathrm{Re}\,\lambda \langle Su, u\rangle = \langle O(|U'|)u, u\rangle + \langle O(|U'|)u, Bu'\rangle - \langle u', SBu'\rangle - \langle Su', (dB\cdot u)U'\rangle,
\]

which, through (32), implies

\[
\mathrm{Re}\,\lambda \langle Su, u\rangle + \beta\|Bu'\|^2 \le \langle O(|U'|)u, u\rangle + \langle O(|U'|)u, Bu'\rangle - \langle Su', (dB\cdot u)U'\rangle. \tag{33}
\]

Let us denote by π the projection from Cⁿ to Cʳ, where we just retain the r last components. Then the last term in (33) is bounded by cst ‖πSu′‖ · ‖u‖, since the n − r first rows of dB · u, like those of B, vanish. We now use Lemma 8 below in order to bound this term by cst ‖Bu′‖ · ‖u‖. We then derive from (33) and Young’s inequality the following estimate,

\[
s_0\,\mathrm{Re}\,\lambda \|u\|^2 + \beta\|Bu'\|^2 \le \langle O(|U'|)u, u\rangle, \tag{34}
\]

or, in (w, z) coordinates:

\[
s_1\,\mathrm{Re}\,\lambda \bigl(\|w\|^2 + \|z\|^2\bigr) + \beta_1\|z'\|^2 \le \langle O(|U'|)w, w\rangle + \langle O(|U'|)z, z\rangle, \tag{35}
\]

where s1 and β1 denote modified, positive constants. Applying (30) with α = O(|U′(y)|), and observing that

\[
\int_0^{+\infty} |U'||y|\,dy
\]


can be made as small as desired by enforcing sufficiently weak layer strength, we find that we can absorb the z term in the right-hand side of (35) to obtain, finally, the desired first energy estimate:

\[
s_2\,\mathrm{Re}\,\lambda \bigl(\|w\|^2 + \|z\|^2\bigr) + \beta_2\|z'\|^2 \le \langle O(|U'|)w, w\rangle, \tag{36}
\]

where s2 and β2 denote further modified, positive constants. Note that, were there no w term, this would already establish a contradiction to the assumption that Re λ ≥ 0, proving spectral stability. In the case of degenerate viscosity, however, we require also a second energy estimate controlling the term ⟨O(|U′|)w, w⟩. Restrict attention, now, to the (n − r)-dimensional reduced eigenvalue equation

\[
\lambda\gamma w + (dh\,w)' + (dk\,z)' = 0 \tag{37}
\]

arising in the (w, z) coordinates, where γ and h are as defined in (H2)–(H4), and γ := γ(U), dh := dh(U), dk := dk(U). By (H4), the matrix γ⁻¹dh is diagonalisable, with real eigenvalues, for all x. Moreover, by the assumption p = n − r, we have at x = +∞ that all eigenvalues are positive; for sufficiently weak boundary layers, then, this property holds for all x ∈ [0, +∞). It follows that there is a symmetric, positive definite symmetrizer S_w(x) such that

\[
P := S_w \gamma^{-1} dh \ge h_0 > 0 \tag{38}
\]

is symmetric, positive definite for all x ∈ [0, +∞). That is, (37) features purely upwind propagation. This allows us to apply to the system of equations (37) a weighted energy estimate like that applied by Goodman [11] to individual characteristic fields, in the context of shock stability. Precisely, following Goodman, define the scalar weight α(x) > 0 by the ODE

\[
\alpha' = -C|U'|\alpha/h_0, \tag{39}
\]

with initial condition α(0) = 1, where C > 0 is a sufficiently large constant. Then, multiplying (37) by αγ⁻¹ and taking the real part of the (complex) L² inner product against S_w w, we obtain after rearrangement the basic energy estimate:

\[
\langle \alpha w, w\rangle - (1/2)\langle w, (\alpha P)' w\rangle = \langle O(\alpha|U'|)w, w\rangle + \langle O(\alpha|U'|)z, z\rangle. \tag{40}
\]

But, from (39), (38) we easily obtain as in [11] that −(1/2)⟨w, (αP)′w⟩ ≥ (C/2)⟨α|U′|w, w⟩. Thus, summing (34) and (36), and taking C sufficiently large, we obtain the final estimate

\[
s_2\,\mathrm{Re}\,\lambda \bigl(\|w\|^2 + \|z\|^2\bigr) + (\beta_2/2)\|z'\|^2 + (C/4)\langle O(|U'|)w, w\rangle \le 0, \tag{41}
\]

where we have as before used (30) to absorb the term ⟨O(|U′|)z, z⟩ in β2‖z′‖². Evidently, (41) implies spectral stability, since Re λ ≥ 0 gives an immediate contradiction. (Note: above, we have used freely the fact that α is bounded away from zero.) It remains to prove the following

Lemma 8. Under assumption (32), there exists a finite number c(u) such that |πSX| ≤ c|BX|.


Proof. Let Y be a vector such that BY = 0. Then, applying (32) to X + sY, we see that the affine function s ↦ (S(X + sY), BX) − β|BX|² is non-negative, therefore constant. Thus, BY = 0 implies (SY, BX) = 0 for every X. In other words, πY = 0 implies (πSY, πBX) = 0 for every X. However, πB is onto, therefore πY = 0 implies πSY = 0. This means that there exists a matrix S̃ such that πS = S̃πB. A simple computation gives S̃ = S0 B0⁻¹, where S0 and B0 are the lower right blocks of S and B. (Let us recall that, zero being a semi-simple eigenvalue of B, the block B0 is invertible.) Then the result holds, with c := ‖S̃‖.

We remark that essentially the same argument, rephrased as an energy estimate on the time-evolutionary equations, establishes linearized and not only spectral stability. In the case r = n, Gisclon and Serre [10] were able to establish a full nonlinear stability result. It would be interesting to determine whether or not stability of weak boundary layers holds also in the case that p < n − r, when there are fewer than n Dirichlet conditions. In this case, it seems conceivable that the nature of the boundary conditions may strongly affect stability, even in the weak layer limit. For related work, see [18].

References

1. Alexander, J.C., Gardner, R. and Jones, C.K.R.T.: A topological invariant arising in the stability analysis of travelling waves. J. Reine Angew. Math. 410, 167–212 (1990)
2. Benzoni-Gavage, S., Serre, D. and Zumbrun, K.: Alternate Evans functions and viscous shock waves. SIAM J. Math. Anal. 32, 929–962 (2001)
3. Bultelle, M., Grassin, M. and Serre, D.: Unstable Godunov discrete profiles for steady shock waves. SIAM J. Numer. Anal. 35, 2272–2297 (1998)
4. Freistühler, H. and Serre, D.: The L1-stability of boundary layers for scalar viscous conservation laws. J. Dynamics & Diff. Eqns., to appear
5. Gardner, R.A. and Jones, C.K.R.T.: Traveling waves of a perturbed diffusion equation arising in a phase field model. Indiana Univ. Math. J. 39, 4, 1197–1222 (1990)
6. Gardner, R.A. and Jones, C.K.R.T.: A stability index for steady state solutions of boundary value problems for parabolic systems. J. Differential Eqs. 91, 2, 181–203 (1991)
7. Gardner, R.A. and Zumbrun, K.: The gap lemma and geometric criteria for instability of viscous shock profiles. Comm. Pure Appl. Math. 51, 7, 797–855 (1998)
8. Gilbarg, D.: The existence and limit behavior of the one-dimensional shock layer. Am. J. Math. 73, 256–274 (1951)
9. Gisclon, M.: Etude des conditions aux limites pour un système hyperbolique, via l’approximation parabolique. J. Maths. Pures & Appl. 75, 485–508 (1996)
10. Gisclon, M. and Serre, D.: Etude des conditions aux limites pour un système hyperbolique, via l’approximation parabolique. C. R. Acad. Sci. Paris, Série I 319, 377–382 (1994)
11. Goodman, J.: Remarks on the stability of viscous shock waves. In: Viscous profiles and numerical methods for shock waves (Raleigh, NC, 1990), Philadelphia: SIAM, 1991, pp. 66–72
12. Grenier, E. and Guès, O.: Boundary layers for viscous perturbations of noncharacteristic quasilinear hyperbolic problems. J. Diff. Eqs. 143, 110–146 (1998)
13. Grenier, E. and Rousset, F.: Stability of one dimensional boundary layers using Green’s functions. Comm. Pure Applied Math., to appear
14. Henry, D.: Geometric theory of semilinear parabolic equations. Lecture Notes in Maths. 840, Berlin: Springer-Verlag, 1981
15. Kawashima, S.: Systems of a hyperbolic–parabolic composite type, with applications to the equations of magnetohydrodynamics. PhD thesis, Kyoto University (1983)
16. Kawashima, S. and Shizuta, Y.: On the normal form of the symmetric hyperbolic-parabolic systems associated with the conservation laws. Tohoku Math. J. 40, 449–464 (1988)
17. Li, Ta-tsien and Yu, Wen-ci: Boundary value problems for quasilinear hyperbolic systems. Durham: Duke Univ., 1985
18. Matsumura, A. and Mei, M.: Convergence to travelling fronts of solutions of the p-system with viscosity in the presence of a boundary. Arch. Ration. Mech. Anal. 146, 1–22 (1999)
19. Rousset, F.: Inviscid boundary conditions and stability of viscous boundary layers. Asymptotic Analysis, 2001, to appear
20. Serre, D.: Sur la stabilité des couches limites de viscosité. Ann. Inst. Fourier 51, 109–129 (2001)
21. Zumbrun, K. and Howard, P.: Pointwise semigroup methods and stability of viscous shock waves. Indiana Univ. Math. J. 47, 3, 741–871 (1998)

Communicated by P. Constantin

Commun. Math. Phys. 221, 293 – 304 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Symplectic Structures of Moduli Space of Higgs Bundles over a Curve and Hilbert Scheme of Points on the Canonical Bundle

Indranil Biswas1, Avijit Mukherjee2

1 School of Mathematics, Tata Institute of Fundamental Research, Homi Bhabha Road, Bombay 400005, India. E-mail: [email protected]
2 Scuola Internazionale Superiore di Studi Avanzati, via Beirut 4, 34014 Trieste, Italy. E-mail: avijit@sissa.it

Received: 15 January 2000 / Accepted: 25 March 2001

Abstract: The moduli space of triples of the form (E, θ, s) is considered, where (E, θ) is a Higgs bundle on a fixed Riemann surface X, and s is a nonzero holomorphic section of E. Such a moduli space admits a natural map to the moduli space of Higgs bundles simply by forgetting s. If (Y, L) is the spectral data for the Higgs bundle (E, θ), then s defines a section of the line bundle L over Y. The divisor of this section gives a point of a Hilbert scheme, parametrizing 0-dimensional subschemes of the total space of the canonical bundle KX, since Y is a curve on KX. The main result says that the pullback of the symplectic form on the moduli space of Higgs bundles to the moduli space of triples coincides with the pullback of the natural symplectic form on the Hilbert scheme using the map that sends any triple (E, θ, s) to the divisor of the corresponding section of the line bundle on the spectral curve.

1. Introduction

A Higgs bundle over a compact connected hyperbolic Riemann surface X is a pair of the form (E, θ), where E is a holomorphic vector bundle over X and θ is a holomorphic section of KX ⊗ End(E). Higgs bundles were introduced in [Hi1]. Let MH denote a moduli space of stable Higgs bundles. We assume that for a Higgs bundle in MH, the degree of the underlying vector bundle E satisfies the inequality degree(E) > rank(E)(g − 1). So the Riemann–Roch theorem for E ensures that E admits a nonzero section. It is known that MH is a connected complex manifold of dimension 2 rank(E)²(g − 1) + 2 [Hi2]. Here we consider triples of the form (E, θ, s), where (E, θ) is a Higgs bundle in MH and s ∈ H⁰(X, E) is a nonzero section. We recall that triples of this kind are considered in [BG, Li].


Let MT denote the moduli space of isomorphism classes of triples. It can be shown that MT exists as an analytic space. Since for every (E, θ, s) ∈ MT we have (E, θ) ∈ MH, there is a surjective morphism F : MT −→ MH. The surjectivity of F is a consequence of the earlier observation that dim H⁰(X, E) > 0. A holomorphic symplectic form on the moduli space MH was constructed in [Hi1]. We will denote this symplectic form on MH by . Therefore, F ∗ is a holomorphic closed two-form on MT. If E is of rank n, then it is possible to evaluate on θ a GL(n, C)-invariant polynomial on the Lie algebra M(n, C), namely polynomials of the form A ↦ trace(Aⁱ). This yields an element of the vector space H⁰(X, KX^{⊗i}). The resulting map

\[
P : M_H \longrightarrow V := \bigoplus_{i=1}^{n} H^0(X, K_X^{\otimes i}),
\]

which is known as the Hitchin map, is proper [Hi1]. Any element of V defines a spectral curve, which is a curve in the total space of KX defined by a polynomial constructed from the given element of V. Given any point p ∈ MH, a holomorphic line bundle L can be constructed on the spectral curve Yp corresponding to the point P(p). If p = (E, θ), then the fibers of L are basically the eigenvectors of θ. Let π denote the projection of Yp to X obtained from the obvious projection of KX to X. It turns out that π∗L = E [Hi2]. Therefore, a section s of E defines a section s of L over Yp. Since Yp is embedded in KX, the divisor of s defines a 0-dimensional subscheme of KX. In other words, we have a map : MT −→ Hilb^l(KX) to a Hilbert scheme of points on KX; the integer l is the degree of the line bundle L. The natural symplectic structure on KX induces a holomorphic symplectic structure on Hilb^l(KX) [Be]. This symplectic form on Hilb^l(KX) will be denoted by C . In Theorem 3.1 we prove that the 2-form F ∗ on MT coincides with ∗ C . Since both and C are exact, Theorem 3.1 reduces to an equality of two given 1-forms. We show that the difference of these two 1-forms in question descends to V. The properness of the Hitchin map is very useful for this step. Then we show that a twisted version of the descended 1-form on V further descends as a meromorphic 1-form on a projective space with appropriate poles. Finally, the proof of Theorem 3.1 is completed using the result that no nonzero meromorphic form of the given type exists.

2. Higgs Bundles and Triples

Let X be a compact connected Riemann surface of genus g, with g ≥ 2. The holomorphic cotangent bundle of X will be denoted by KX. A Higgs bundle over X is a pair of the form (E, θ), where E is a holomorphic vector bundle over X and θ ∈ H⁰(X, KX ⊗ End(E)) is known as a Higgs field.
A Higgs bundle (E, θ) is called stable if for every proper subbundle F ⊂ E of positive rank and with θ(F) ⊆ KX ⊗ F, the inequality degree(F)/rank(F) < degree(E)/rank(E)


is valid [Hi1]. Stability is an open condition. In other words, given any algebraic (respectively, analytic) family of Higgs bundles, the points of the parameter space over which the Higgs bundle is stable form a Zariski (respectively, analytic) open subset. Given a Higgs field θ ∈ H⁰(X, KX ⊗ End(E)) on a vector bundle E of rank n and an integer i ∈ [1, n], consider pᵢ(θ) := trace(θⁱ) ∈ H⁰(X, KX^{⊗i}), which is defined using the associative algebra structure of the fibers of End(E). The map which sends a Higgs bundle (E, θ) to

\[
\sum_{i=1}^{n} p_i(\theta) = \sum_{i=1}^{n} \mathrm{trace}(\theta^i) \in \bigoplus_{i=1}^{n} H^0(X, K_X^{\otimes i})
\]

is known as the Hitchin map. By p₀(θ) we will mean the section of KX^{⊗0} = OX given by the constant function 1. The total space of the line bundle KX will also be denoted by KX. Let π denote the projection of KX to X. For a Higgs bundle (E, θ), the subscheme of KX defined by the solution of the polynomial

\[
P_\theta(t) = \sum_{i=0}^{n} t^{n-i} p_i(\theta) = \sum_{i=0}^{n} t^{n-i}\,\mathrm{trace}(\theta^i) = 0
\]

is called the spectral curve associated to (E, θ) [Hi2, BNR]. We will denote this spectral curve by Yθ. The natural projection from the spectral curve Yθ to X obtained by restricting π, which we will also denote by π, is a finite morphism. Furthermore, there is a torsionfree sheaf L of rank one on Yθ such that

\[
\pi_* L \cong E. \tag{2.1}
\]

The fiber Ly of L over a point y ∈ Yθ can be considered as the eigenvector of θ(π(y)) for the eigenvalue y [Hi2]. The pair (Yθ , L) is called the spectral data for the Higgs bundle (E, θ ). Since Yθ ⊂ KX , there is a natural homomorphism fθ : L −→ π ∗ KX ⊗ L,

(2.2)

which sends any vector v ∈ Ly , where y ∈ Yθ , to the tensor product y ⊗l ∈ (KX )|π(y) ⊗ Ly . Its direct image π∗ fθ : π∗ L −→ π∗ (π ∗ KX ⊗ L) ∼ = KX ⊗ π ∗ L coincides with the Higgs field θ [Hi2]; the last isomorphism is obtained from the projection formula. The equivalence between Higgs bundles and spectral data was used in [Si] to give a construction of moduli space of semistable Higgs bundles. Given a Higgs bundle (E, θ ), let C. denote the following two term complex of sheaves on X: [−,θ] C. : C0 = End(E)−→C1 = KX ⊗ End(E), where End(E) is at the 0th position, and if θ = dz ⊗ A in a local coordinate function z and s is a local section of End(E), then [s, θ ] = dz ⊗ (sA − As).


The space of infinitesimal deformations of the Higgs bundle (E, θ ) is parametrized by the hypercohomology H1 (C. ) [BR]. There is a canonical element in the dual vector space H1 (C. )∗ . To construct this element, first consider the diagram [−,θ]

End(E) −→ KX ⊗ End(E) End(E) −→ 0

(2.3)

This diagram gives a homomorphism δ : H1 (C. ) −→ H 1 (X, End(E)). Now, given any α ∈ H1 (C. ), the pairing trace(δ(α) ∪ θ) ∈ H 1 (X, KX ) = C defines the canonical element in H1 (C. )∗ [Hi1]; we will denote this canonical element in H1 (C. )∗ by &θ . Let KX [1] denote the complex of sheaves with only one nonzero term KX at the first position. We have the following homomorphism from the complex C. ⊗ C. to KX [1]: 0 End(E) ⊗ End(E) [−, θ ] ⊕ [−, θ ]

0 −→ 0 trace

End(E) ⊗ (KX ⊗ End(E)) ⊕ (KX ⊗ End(E)) ⊗ End(E) −→ [−, θ ] + [−, θ ] (KX ⊗ End(E)) ⊗ −→ (KX ⊗ End(E)) 0

KX 0 0

where the middle homomorphism, namely trace, is defined using the trace map End(E) ⊗ End(E) −→ OX of endomorphisms. This homomorphism of complexes gives the following homomorphism of hypercohomologies: H1 (C. ) ⊗ H1 (C. ) −→ H2 (C. ⊗ C. ) −→ H2 (KX [1]) = H 1 (X, KX ) = C.

(2.4)

This bilinear form on H1 (C. ) is evidently anti-symmetric and nondegenerate. In other words, it defines a symplectic form on H1 (C. ). This symplectic form on H1 (C. ) will be denoted by θ . Given a holomorphic family of Higgs bundles parametrized by a complex manifold T , the pointwise construction of &θ gives a holomorphic one-form, which we will denote by &T on T . The exterior derivative d&T coincides with the one obtained from pointwise construction of θ [BR]. In fact, the pairing θ defines a symplectic form on the smooth locus of any moduli space of Higgs bundles over X [Hi2].


Let MH denote the moduli space of stable Higgs bundles over X of rank n and degree d. See [Si] for the construction of MH . We assume that d > n(g − 1). So from Riemann–Roch, dim H 0 (X, E) = d − n(g − 1) + dim H 1 (X, E) > 0. In other words, E admits nonzero sections. Definition 2.1. A triple over X is a data of the form (E, θ, s), where (E, θ ) is a stable Higgs bundle of rank n and degree d, and s ∈ H 0 (X, E) − 0 is a nonzero holomorphic section. Let MT denote the moduli space of triples that parametrizes isomorphism classes of triples. Two triples (E, θ, s) and (E , θ , s ) are called isomorphic if there is a holomorphic isomorphism E −→ E that takes θ to θ and s to s . We will show that MT exists as an analytic space. Let (ET , θT ) be a family of stable Higgs bundles of rank n and degree d over X parametrized by a complex space T . Let ψT denote the projection of X × T to T . If (ET , θT ) is another such family such that for every point t ∈ T , the Higgs bundle (Et , θt ) is isomorphic to (Et , θt ), then it can be shown that there is a holomorphic line bundle ξ over T such that ET = ET ⊗ ψT∗ ξ and θT = θT ⊗ I dψT∗ ξ . Indeed, this is an immediate consequence of the fact that given a stable Higgs bundle (E, θ ), the only automorphisms of E that takes θ to itself are the scalar multiplications. Now note that any automorphism of E defined by a scalar multiplication acts trivially on the projective space PH 0 (X, E) of lines in H 0 (X, E). From this it follows that MT exists as an analytic space, and the fiber of the forgetful map, that assigns (E, θ ) to (E, θ, s), is PH 0 (X, E) over the point of MH represented by (E, θ ). In fact, MT can be constructed locally over MH , that is over sufficiently small analytic open subsets of MH . The earlier remarks ensure that these local constructions patch together to define MT . It was remarked in (2.1) that π∗ L is isomorphic to E. 
Since π −1 (π∗ L) is a subsheaf of L, there is a canonical injective homomorphism π∗ : H 0 (Yθ , L) −→ H 0 (X, π∗ L) = H 0 (X, E). The finiteness of the map π implies that this homomorphism π∗ is actually an isomorphism. Now, given a section s ∈ H 0 (X, E), let s := π∗−1 (s) ∈ H 0 (Yθ , L)

(2.5)

be the section of L that corresponds to it by the isomorphism π∗ defined above. For a nonzero section s ∈ H 0 (X, E)−0, consider the divisor div( s) on Yθ . Using the inclusion map of Yθ in the surface KX , the zero-dimensional subscheme div( s) of Yθ defines a zero-dimensional subscheme of KX . The genus of the spectral curve Yθ is n2 (g − 1) + 1 [Hi2]. Therefore, from the Riemann–Roch theorem it follows that degree(div( s)) = d + n(n − 1)(g − 1),

(2.6)

where d = degree(E). In the next section we will construct a morphism from a moduli space of triples to a Hilbert scheme parametrizing 0-dimensional subschemes of the total space KX of the canonical bundle.
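The degree count (2.6) can be checked with a short computation. The sketch below is an illustration only: the function names are ours, and the argument used is the standard one that a finite morphism preserves Euler characteristics, so that the Euler characteristic of L on Yθ equals that of E = π∗L on X, combined with Riemann–Roch on both curves.

```python
def spectral_genus(n, g):
    """Genus of the spectral curve for rank n over a curve of genus g."""
    return n * n * (g - 1) + 1


def divisor_degree(n, g, d):
    """Degree of the rank-one sheaf L, hence of the divisor in (2.6).

    Riemann-Roch on X gives chi(E) = d + n*(1 - g); since the projection is
    finite, chi(L) = chi(E); Riemann-Roch on the spectral curve then gives
    deg(L) = chi(L) - (1 - genus of the spectral curve).
    """
    chi_e = d + n * (1 - g)
    return chi_e - (1 - spectral_genus(n, g))


if __name__ == "__main__":
    # Sample values with d > n*(g - 1), as assumed in the text.
    for n, g, d in [(1, 2, 2), (2, 2, 3), (3, 4, 10)]:
        assert divisor_degree(n, g, d) == d + n * (n - 1) * (g - 1)
        print(f"n = {n}, g = {g}, d = {d}: degree = {divisor_degree(n, g, d)}")
```

For rank n = 1 the spectral curve is X itself and the degree reduces to d, as expected.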


3. Morphism from Triples to Hilbert Scheme For any integer j ≥ 1, let Hilbj (KX ) denote the Hilbert scheme, which is the moduli space parametrizing 0-dimensional subschemes of length j of the quasi-projective surface KX . Since the spectral curve Yθ is embedded in KX , the divisor div( s) (defined in (2.5)) on Yθ defines a point of Hilbl (KX ), where l = d + n(n − 1)(g − 1) is the degree of s as obtained in (2.6). As before, let MT denote the moduli space of triples of the form (E, θ, s) considered in Definition 2.1. Recall that d = degree(E) > rank(E)(g − 1) = r(g − 1). Associating to any triple (E, θ, s) the element of the Hilbert scheme Hilbl (KX ) defined by the divisor div( s), we obtain a morphism : MT −→ Hilbl (KX ).

(3.1)

As before, let MH denote the moduli space of stable Higgs bundles over X of degree d and rank n. Let (3.2) F : MT −→ MH denote the forgetful map which sends any triple (E, θ, s) to the Higgs bundle (E, θ ). Recall that we have assumed that d > n(g − 1). Therefore, the map F in (3.2) is surjective. We note that there are different notions of stability of a triple. If notions of moduli spaces of triples different from the one given here is used, then F is usually not everywhere defined. Therefore, MT is a projective bundle over MH . For any point p = (E, θ ) on MH , the inverse image F −1 (p) is P (H 0 (X, L)). Let denote the symplectic form on MT whose pointwise construction has been described in (2.4). This symplectic form was introduced in [Hi1]. The surface KX has a natural symplectic structure. This symplectic form induces a symplectic structure on any Hilbert scheme Hilbj (KX ) of 0-dimensional subschemes of KX of length j [Be, pp. 766–767]. Let C denote the canonical symplectic form on Hilbl (KX ). Therefore, on the moduli space MT we have two holomorphic 2-forms, namely F ∗ and ∗ C . The following theorem says that F ∗ = ∗ C . Theorem 3.1. The two holomorphic 2-forms F ∗ and ∗ C on MT coincide. The proof of this theorem will be carried out in the next section. In the rest of this section we will reduce the equality F ∗ = ∗ C to an equality of 1-forms. In Sect. 2 a one-form &θ was constructed on the space of infinitesimal deformations of a Higgs bundle (E, θ ). Let & denote the holomorphic 1-form on MH such that for any point (E, θ ) ∈ MH the form & at the point (E, θ ) coincides with the 1-form &θ . We already noted in Sect. 2 that d& = . (3.3) Take a point p ∈ Hilbl (KX ) representing the collection of points {p1 , p2 , · · · , pl }, where pi ∈ KX and pi are distinct. In other words, pi = pj if i = j . We have Tp Hilbl (KX ) =

l i=1

Tpi KX .

(3.4)


This decomposition is immediate from the fact that a neighborhood of p in Hilbl (KX ) is identified with a neighborhood in the l th symmetric product of KX of the point represented by the collection {pi }. Let ω denote the canonical 1-form on KX . The exterior derivative dω is the canonical symplectic form on KX . Since ω(pi ) is a 1-form on Tpi KX , using the decomposition (3.4) we have an element &C (p) ∈ Tp∗ Hilbl (KX ). It is easy to check that this defines a 1-form on Hilbl (KX ). Let &C denote the 1-form on Hilbl (KX ) whose evaluation at any point p coincides with &C (p) constructed above. Proposition 3.2. The exterior derivative d&C coincides with the symplectic form C on Hilbl (KX ). Proof. Let v := {v1 , v2 , · · · vl } and w = {w1 , w2 , · · · wl } be two tangent vectors in Tp Hilbl (KX ), where vi , wi ∈ Tpi KX ; the decomposition of Tp Hilbl (KX ) used here is the one obtained in (3.4). The evaluation of the symplectic form C on the pair {v, w} is described by the following identity ([Be]; the construction in Prop. 5 (pp. 766)): C (p)(v, w) =

$\sum_{i=1}^{l} d\omega(p_i)(v_i, w_i)$,    (3.5)

where dω, as before, is the canonical symplectic form on KX . The equality (3.5) immediately implies that the decomposition (3.4) of Tp Hilbl (KX ) is orthogonal with respect to the symplectic form C (p). The decomposition (3.4) is obviously orthogonal with respect to the skew-symmetric form d&C (p). Consequently, it suffices to check that the restriction of d&C (p) to the subspace Tpi KX ⊂ Tp Hilbl (KX ) coincides with the restriction of C (p). But clearly both these restrictions coincide with the symplectic form dω(pi ) on Tpi KX . This completes the proof of the proposition. In view of Proposition 3.2 and the equality (3.3), the Theorem 3.1 is an immediate consequence of the following lemma. Lemma 3.3. The two holomorphic 1-forms F ∗ & and ∗ &C on MT coincide. The following section will be devoted to the proof of Lemma 3.3. 4. Proof of the Lemma Let Y be a connected Riemann surface, and let π : Y −→ X be a covering map, possibly ramified, of degree n. Fix a holomorphic section β ∈ H 0 (Y, π ∗ KX ). Using the natural homomorphism (dπ )∗ : π ∗ KX −→ KY , the section β gives a section of KY . This section of KY will also be denoted by β.


For any holomorphic line bundle L on Y , the direct image π∗ L is a holomorphic vector bundle of rank n over X. Considering the infinitesimal deformations, we have a homomorphism π : H 1 (Y, OY ) −→ H 1 (X, End(π∗ L)),

(4.1)

as the spaces of infinitesimal deformations of π∗ L (respectively, L) are parametrized by H^1(X, End(π∗ L)) (respectively, H^1(Y, OY)). Since β ∈ H^0(Y, KY), we have

\[ f_L : H^1(Y, O_Y) \xrightarrow{\;\cup\beta\;} H^1(Y, K_Y) = \mathbb{C}. \tag{4.2} \]

On the other hand, since β ∈ H 0 (Y, π ∗ KX ), taking the direct image of the multiplication map β⊗

L −→ π ∗ KX ⊗ L we have a homomorphism φL : π∗ L −→ π∗ (π ∗ KX ⊗ L) ∼ = KX ⊗ π ∗ L of vector bundles, where the last isomorphism is obtained from the projection formula. So, φL defines a holomorphic section βL ∈ H 0 (X, KX End(π∗ L)). Let f L denote the composition βL ∪

H 1 (X, End(π∗ L)) −→ H 1 (X, KX ⊗ End(π∗ L) ⊗ End(π∗ L)) trace

−→ H 1 (X, KX ) = C. Here trace, as before, is the homomorphism End(π∗ L) End(π∗ L) −→ OX constructed using the Killing form on GL(n, C) defined by A B −→ trace(AB). Proposition 4.1. The following diagram commutes H 1 (Y, OY ) π

fL

−→ C fL

H 1 (X, End(π∗ L)) −→ C where π and fL are defined in (4.1) and (4.2) respectively. Proof. The proposition follows immediately by unraveling the definitions of the above homomorphisms. Take any cohomology class v ∈ H 1 (Y, OY ). Let α be a (0, 1)-form on Y which is a Dolbeault representative of v. Let U ⊂ X be the complement of the finite subset of X consisting of points over which π is ramified. Consider the π∗ Oπ −1 (U ) -valued (0, 1)-form on U defined by α. It is the restriction of a Dolbeault representative of π (α) to U . Since the canonical isomorphism H 1 (X, KX ) ∼ = C is defined using the integration of (1, 1)-forms on X, and similarly for Y , the proposition follows easily.


In Sect. 2 we briefly described the identification of Higgs bundles and the spectral data constructed in [Hi2]. Set Y to be a spectral curve. So, in particular, it is a curve embedded in KX . Set π to be the natural projection of the spectral curve to X. Let JY denote the subvariety of MH consisting of all Higgs bundles such that the corresponding spectral curve coincides with Y . We know that JY is identified with the component Picl (Y ) of the Picard group of Y , where l, as before, is d + n(n − 1)(g − 1) [Hi2]. The inverse image F −1 (JY ) ⊂ MT will be denoted by AY . So AY is a projective bundle over Picl (Y ). More precisely, it is the projectivized Picard bundle over Picl (Y ). Let ν : AY 5→ MT be the inclusion map. Proposition 4.2. The two 1-forms (F ◦ ν)∗ & and ( ◦ ν)∗ &C on AY coincide. Proof. This is actually a consequence of Proposition 4.1. Let f denote the inclusion of Y in KX . As before, the canonical 1-form on KX is denoted by ω. Set β in Proposition 4.1 to be the 1-form f ∗ ω on Y . Note that f ∗ ω is indeed a section of π ∗ KX . For any point (L, s) ∈ AY , the homomorphism fL constructed in (4.2) gives the oneform ( ◦ ν)∗ &C . On the other hand, (F ◦ ν)∗ & is constructed from the homomorphism f L . Now, Proposition 4.1 finishes the proof. We will note a simple observation. Let g : Z1 −→ Z2 be a proper smooth holomorphic map between connected complex manifolds. For any u ∈ Z2 , the inverse image g −1 (u) is assumed to be connected. Let τ be a holomorphic 1-form on Z1 such that the contraction of τ with any vertical tangent vector v ∈ Tu Z1 vanishes. By a vertical vector we mean a vector in the kernel of the differential dg : T Z1 −→ g ∗ T Z2 of the map g. Then there is a 1-form on Z2 , say τ , such that g ∗ τ = τ . To see this first note that given any tangent vector w ∈ Tz Z2 , using the above condition, namely the contraction of τ with any vertical vector vanishes, a holomorphic function on g −1 (z) is obtained. 
Now the existence of such a form τ is an immediate consequence of the fact that there is no nonconstant holomorphic function on a compact connected complex manifold. As we know from [Hi2], the space of spectral curves is parametrized by the vector space n V := H 0 (X, KX⊗i ). i=1

Let P : MH −→ V denote the Hitchin map defined in Sect. 2 which sends any Higgs bundle to the spectral curve associated to it. The morphism P is proper [Hi2]. Now, from Proposition 4.2 coupled with the above observation it follows that there is a holomorphic 1-form γ on V such that F ∗ & − ∗ &C = (P ◦ F )∗ γ .

(4.3)

Indeed, taking Z1 in the above observation to be the open subset of MT over which P ◦ F is smooth and setting τ to be F ∗ & − ∗ &C , the existence of a form γ satisfying (4.3) is ensured by the above observation. Lemma 3.3 will be proved by showing that the form γ in (4.3) vanishes identically.


Consider the action of the group C∗ = C\{0} on V defined by

\[ t \cdot \{v_1, v_2, \dots, v_n\} = \{t v_1, t^2 v_2, \dots, t^i v_i, \dots, t^n v_n\}, \tag{4.4} \]

where t ∈ C∗ and vi ∈ H 0 (X, KX⊗i ). This action will be denoted by ρ. The quotient of V\{0} by this action ρ is a weighted projective space. This weighted projective space . Let will be denoted by P q : V\{0} −→ P denote the quotient map. Proposition 4.3. The evaluation of the 1-form γ , obtained in (4.3), on any vertical vector for the projection q vanishes. Proof. Take a point v ∈ V\{0}. For any z ∈ C∗ , let fz denote the automorphism of KX that sends any vector α ∈ KX to zα. It is easy to see that the automorphism fz of KX takes the spectral curve Yv defined by v to the spectral curve Yρ(z)v defined by ρ(z)v ∈ V, where ρ, as before, is the action defined in (4.4). The isomorphism of Yρ(z)v with Yv obtained from fz will be denoted by f z . Fix a line bundle L over the spectral curve Yv and also fix a holomorphic section s ∗ ∗ of L. Now f z L is a holomorphic line bundle over the spectral curve Yρ(z)v and f z s is ∗ a holomorphic section of f z L. Since the following diagram commutes fz

Yρ(z)v −→ Yv π π X = X ∗

∗

the two vector bundles π∗ L and π∗ f z L over X are isomorphic. Let (π∗ f z L, θz ) denote ∗ the Higgs bundle over X corresponding to the spectral data (Yρ(z)v , f z L). Since π∗ L ∗ and π∗ f z L are isomorphic, it follows that the evaluation of the 1-form & on MH to the ∗ tangent vector at the point (π∗ L, θ1 ) defined by the curve z −→ (π∗ f z L, θz ) vanishes. Indeed, this is immediate from the construction of &θ done following (2.3). ∗ −1 The divisor of the section f z s is simply f z (div(s)), the image of the divisor of −1

s by the isomorphism f z which is the inverse of f z . Furthermore, the evaluation of the canonical 1-form ω on KX on any vertical vector for the projection π vanishes. Consequently, the evaluation of the 1-form ∗ &C on the tangent vector at the point (π∗ L, θ1 , π∗ s) ∈ MT ∗

∗

defined by the curve z −→ (π∗ f z L, θz , π∗ f z s) vanishes. Combining the above two observations on & and ∗ &C respectively, we conclude that γ vanishes on any vertical vector for the projection q. This completes the proof of the proposition.


For each integer i ∈ [1, n], fix a basis {β_{i,1}, β_{i,2}, · · · , β_{i,n_i}} of the vector space H^0(X, KX^{⊗i}). So n_1 = g and n_i = (2i − 1)(g − 1) if i ≥ 2. This collection of bases gives us a basis for the vector space V. Let N denote the dimension of V. So we have $N = \sum_{i=1}^{n} n_i$. Let P denote the projective space consisting of all lines in C^N. We will define a morphism from V − {0} to P using the basis of V. Take any nonzero vector

\[ v = \sum_{i=1}^{n} \sum_{j=1}^{n_i} c_{i,j}\,\beta_{i,j} \in V - \{0\}, \]

where c_{i,j} ∈ C. Let

\[ q : V - \{0\} \longrightarrow P \tag{4.5} \]

be the map that sends any such vector v to the point of P represented by {c_{i,j}}. Let

\[ Q : V - \{0\} \longrightarrow V - \{0\} \tag{4.6} \]

be the holomorphic map that sends any vector v as above to the vector

\[ \sum_{i=1}^{n} \sum_{j=1}^{n_i} (c_{i,j})^{i}\,\beta_{i,j} \in V - \{0\}. \]
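As a quick consistency check on the dimension count above (a worked computation added for illustration, assuming g ≥ 2 so that the stated values of n_i apply), the sum defining N has a closed form:

```latex
N \;=\; \sum_{i=1}^{n} n_i
  \;=\; g + (g-1)\sum_{i=2}^{n} (2i-1)
  \;=\; g + (g-1)(n^{2}-1)
  \;=\; n^{2}(g-1) + 1 .
```

For instance, for n = 2 this gives N = 4g − 3.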

Now we have a commutative diagram Q

V − {0} −→ V − {0} q q P −→ P is the one induced by the map Q; the map q is defined in where the morphism P −→ P (4.5). In view of the commutativity of the above diagram, Proposition 4.3 implies that the evaluation of the 1-form Q∗ γ for any vertical vector for the projection q vanishes. Take a nonzero vector v ∈ V − {0}. Let α be a holomorphic tangent vector to the manifold V −{0} at v. Take a nonzero complex number λ. Let λ denote the automorphism of V which sends any vector w to λw. Therefore, d λ(α) is a tangent vector at λ(v), where d λ is the differential of the map λ. Choose a holomorphic map α from the unit disk in C to V such that α (0) = v and λ ◦ α ) (0) = d λ(α). α (0) = α. Therefore, we have ( For any point t in the unit disk, let Yt denote the spectral curve corresponding to the point α (t) of V. It is easy to see that the spectral curve corresponding to the point λ ◦ α (t) ∈ V is the image fλ (Yt ), where fλ , as before, is the automorphism of KX defined by multiplication with λ. If Lt is a line bundle over Yt and st is a holomorphic section of Lt , ∗ ∗ ∗ then f λ Lt is a line bundle on fλ (Yt ) and f λ s is a holomorphic section of f λ Lt ; here f λ : fλ (Yt ) −→ Yt , as before, is the isomorphism defined by fλ .
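The commutativity of the diagram above rests on the fact that Q turns ordinary scalar multiplication into the weighted action (4.4), that is, Q(λv) = ρ(λ)Q(v). The following is a minimal numerical sketch of that identity, not part of the original argument. The coordinate representation (nested lists of coefficients c_{i,j}) and the helper names `scale`, `weighted_action`, `Q` and `max_difference` are illustrative assumptions.

```python
# Illustrative sketch: c[i-1][j] stands for the coefficient c_{i,j} of a vector
# v in V with respect to the chosen basis; block i carries weight i.

def scale(lam, c):
    """Ordinary scalar multiplication v -> lam * v."""
    return [[lam * x for x in block] for block in c]


def weighted_action(t, c):
    """Weighted action (4.4): the block of weight i is multiplied by t**i."""
    return [[t ** i * x for x in block] for i, block in enumerate(c, start=1)]


def Q(c):
    """The map (4.6): each coefficient c_{i,j} is raised to the power i."""
    return [[x ** i for x in block] for i, block in enumerate(c, start=1)]


def max_difference(c1, c2):
    """Largest absolute difference between corresponding coefficients."""
    return max(abs(x - y) for b1, b2 in zip(c1, c2) for x, y in zip(b1, b2))


# Arbitrary sample data (hypothetical block sizes 2, 3, 1).
c = [[1 + 2j, -0.5j], [0.3 - 1j, 2.0, 1j], [0.7 + 0.1j]]
lam = 0.8 - 0.6j

lhs = Q(scale(lam, c))            # Q(lam * v)
rhs = weighted_action(lam, Q(c))  # rho(lam) applied to Q(v)
print(max_difference(lhs, rhs))   # should be at the level of rounding error
```

The check only uses (λc)^i = λ^i c^i, which is why Q maps each line through the origin into a single orbit of the weighted action.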


From the above observations it readily follows that λ · Q∗ γ (v)(α) = Q∗ γ (λv)(d λ(α)),

(4.7)

where Q is defined in (4.6). We already noted that the evaluation of the 1-form Q∗ γ for any vertical vector for the projection q (defined in (4.5)) vanishes. In view of this, the equality (4.7) implies OP (1) on P. This that the one-form Q∗ γ descends to a holomorphic section of 1P OP (1) such that means that there is a section γ of 1P q ∗ γ ∈ H 0 (V\{0}, 1V \{0} ⊗ q ∗ OP (1)) realized as a one-form on V\{0} using the canonical trivialization of q ∗ OP (1) coincides with the one-form Q∗ γ . It is known that H 0 (P, 1P ⊗ OP (1)) = 0 [SS, pp. 71, Theorem 4.3(a)]. Therefore, we obtain that γ = 0. Since γ = 0, the identity (4.3) completes the proof of Lemma 3.3. We already noted in Sect. 3 that Lemma 3.3 implies Theorem 3.1. Therefore, the proof of Theorem 3.1 is complete. We note that exactly imitating the construction of the 1-form & on MH we may construct a 1-form on MT . Indeed, there is a natural projection from the space of infinitesimal deformations of a triple (E, θ, s) to the space of infinitesimal deformations of the Higgs bundle (E, θ ). Similarly, exactly imitating the construction of the 2-form on MH a 2-form on MT can be constructed. If the parameter σ in the stability condition of a triple is not sufficiently small, then F is not defined everywhere. Even in this case, these 1-form and 2-form coincide on the domain of F with F ∗ & and F ∗ respectively. Theorem 3.1 remains valid if F ∗ is replaced by this 2-form on MT . Acknowledgements. We thank the referee for clarifying remarks.

References

[Be] Beauville, A.: Variétés Kähleriennes dont la première classe de Chern est nulle. J. Diff. Geom. 18, 755–782 (1983)
[BNR] Beauville, A., Narasimhan, M.S., Ramanan, S.: Spectral curves and the generalised theta divisor. J. Reine Angew. Math. 398, 169–179 (1989)
[BG] Bradlow, S.B., García-Prada, O.: Stable triples, equivariant bundles and dimension reduction. Math. Ann. 304, 225–252 (1996)
[BR] Biswas, I., Ramanan, S.: An infinitesimal study of the moduli of Hitchin pairs. J. Lond. Math. Soc. 49, 219–231 (1994)
[G] García-Prada, O.: Dimensional reduction of stable bundles, vortices and stable pairs. Internat. J. Math. 5, 1–52 (1994)
[Hi1] Hitchin, N.J.: The self-duality equations on a Riemann surface. Proc. Lond. Math. Soc. 55, 59–126 (1987)
[Hi2] Hitchin, N.J.: Stable bundles and integrable systems. Duke Math. J. 54, 91–114 (1987)
[Li] Lin, T.-R.: Hermitian-Yang–Mills–Higgs metrics and stability for holomorphic vector bundles with Higgs fields. Preprint
[Si] Simpson, C.T.: Moduli of representations of the fundamental group of a smooth projective variety. II. Inst. Hautes Études Sci. Publ. Math. 80, 5–79 (1994)
[SS] Shiffman, B., Sommese, A.: Vanishing Theorems on Complex Manifolds. Progress in Math., Vol. 56. Boston: Birkhäuser, 1985

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 221, 305 – 333 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Long Time Behavior of the Continuum Limit of the Toda Lattice, and the Generation of Infinitely Many Gaps from C^∞ Initial Data

A.B.J. Kuijlaars1, K. T.-R. McLaughlin2,3

1 Department of Mathematics, Katholieke Universiteit Leuven, Celestijnenlaan 200 B, 3001 Leuven, Belgium.

E-mail: [email protected]

2 Department of Mathematics, University of Arizona, Tucson, Arizona 85721, USA.

E-mail: [email protected]

3 Department of Mathematics, University of North Carolina, Chapel Hill, North Carolina 27599, USA.

E-mail: [email protected]

Received: 8 May 2000 / Accepted: 27 March 2001

Abstract: We analyze a continuum limit of the finite non-periodic Toda lattice through an associated constrained maximization problem over spectral density functions. The maximization problem was derived by Deift and McLaughlin using the Lax–Levermore approach, initially developed for the zero dispersion limit of the KdV equation. It encodes the evolution of the continuum limit for all times, including evolution through shocks. The formation of gaps in the support of the maximizer is indicative of oscillations in the Toda lattice and the lack of strong convergence of the continuum limit. For large times, the maximizer tends to have zero gaps, which is the continuum analogue of the sorting property of the finite lattice. Using methods from logarithmic potential theory, we show that this behavior depends crucially on the initial data. We exhibit initial data for which the zero gap ansatz holds uniformly in the spatial parameter (at large times), and other initial data for which this uniformity fails at all times. We then construct an example of C ∞ smooth initial data generating, at a later time, infinitely many gaps in the support of the maximizer, while for even larger times, all gaps have closed.

1. Introduction

The finite, non-periodic Toda lattice is a dynamical system that is given in Flaschka coordinates by

\[ \frac{da_n}{dt} = 2\,(b_n^2 - b_{n-1}^2), \qquad n = 1, \dots, N, \tag{1.1} \]
\[ \frac{db_n}{dt} = b_n\,(a_{n+1} - a_n), \qquad n = 1, \dots, N-1, \tag{1.2} \]

with $b_0 = b_N = 0$. The Toda lattice is completely integrable and is solved explicitly by the inverse spectral transform [12, 23, 24].
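The system (1.1)–(1.2) is easy to explore numerically. The sketch below is an illustration only, not the inverse spectral solution referred to above. It integrates the equations with a classical fourth-order Runge–Kutta scheme for a small lattice and records two conserved quantities (the trace Σ a_n and Σ a_n² + 2Σ b_n²) together with the sorting property mentioned in the abstract. The step size, the final time, the initial data and all function names are assumptions chosen for the example.

```python
def toda_rhs(a, b):
    """Right-hand side of (1.1)-(1.2); a = [a_1..a_N], b = [b_1..b_{N-1}]."""
    n = len(a)
    bp = [0.0] + list(b) + [0.0]  # bp[k] = b_k, with b_0 = b_N = 0
    da = [2.0 * (bp[k + 1] ** 2 - bp[k] ** 2) for k in range(n)]
    db = [b[k] * (a[k + 1] - a[k]) for k in range(n - 1)]
    return da, db


def shifted(u, du, s):
    """Return u + s * du componentwise."""
    return [x + s * dx for x, dx in zip(u, du)]


def rk4_step(a, b, h):
    """One classical Runge-Kutta step of size h."""
    k1a, k1b = toda_rhs(a, b)
    k2a, k2b = toda_rhs(shifted(a, k1a, h / 2), shifted(b, k1b, h / 2))
    k3a, k3b = toda_rhs(shifted(a, k2a, h / 2), shifted(b, k2b, h / 2))
    k4a, k4b = toda_rhs(shifted(a, k3a, h), shifted(b, k3b, h))
    new_a = [x + h / 6 * (p + 2 * q + 2 * r + s)
             for x, p, q, r, s in zip(a, k1a, k2a, k3a, k4a)]
    new_b = [x + h / 6 * (p + 2 * q + 2 * r + s)
             for x, p, q, r, s in zip(b, k1b, k2b, k3b, k4b)]
    return new_a, new_b


def integrate(a, b, t_end, h=0.002):
    """Integrate from t = 0 to t = t_end with fixed step h."""
    for _ in range(int(round(t_end / h))):
        a, b = rk4_step(a, b, h)
    return a, b


def energy(a, b):
    """Trace of L^2 for the tridiagonal matrix with diagonal a, off-diagonal b."""
    return sum(x * x for x in a) + 2.0 * sum(y * y for y in b)


# Sample data (an assumption for the example): N = 4, diagonal sorted increasingly.
a0 = [0.0, 1.0, 2.0, 3.0]
b0 = [1.0, 1.0, 1.0]
a_end, b_end = integrate(a0, b0, t_end=20.0)

print("a(T) =", a_end)
print("b(T) =", b_end)
print("trace drift  =", abs(sum(a_end) - sum(a0)))
print("energy drift =", abs(energy(a_end, b_end) - energy(a0, b0)))
```

With these data the off-diagonal entries decay and the diagonal entries end up in decreasing order, in line with the sorting property; a fixed-step explicit scheme is adequate here only because the lattice is small and the time interval moderate.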


The continuum limit of the Toda lattice is studied by Deift and McLaughlin in [7]. For $\varepsilon > 0$, they choose $N = [1/\varepsilon]$ and initial values

\[ a_n = a_0(\varepsilon n), \qquad b_n = b_0(\varepsilon n) \tag{1.3} \]

in which $a_0(x)$ and $b_0(x)$ are given continuous functions for $x \in [0, 1]$ such that $b_0(x) > 0$ on $(0, 1)$. For small $\varepsilon > 0$, the initial data (1.3) vary only very little with the index n. Since the constant functions $a_n(t) \equiv a \in \mathbb{R}$ and $b_n(t) \equiv b > 0$ clearly satisfy (1.1)–(1.2), one may then reasonably expect that the solutions with initial data (1.3) are approximately constant in time, and vary noticeably only over large time scales. Putting

\[ a_n(t) = a^{\varepsilon}(\varepsilon n, \varepsilon t), \qquad b_n(t) = b^{\varepsilon}(\varepsilon n, \varepsilon t), \]

one may then make the ansatz about the asymptotic form of the functions $a^{\varepsilon}$ and $b^{\varepsilon}$:

\[ a^{\varepsilon}(x, t) \sim a(x, t) + \varepsilon\, a_1(x, t) + \cdots, \tag{1.4} \]
\[ b^{\varepsilon}(x, t) \sim b(x, t) + \varepsilon\, b_1(x, t) + \cdots. \tag{1.5} \]

Inserting the asymptotic form (1.4)–(1.5) into (1.1)–(1.2) and equating the leading order terms, one arrives at the formal continuum limit:

\[ \frac{\partial a}{\partial t} = 4b\,\frac{\partial b}{\partial x}, \qquad \frac{\partial b}{\partial t} = b\,\frac{\partial a}{\partial x}, \tag{1.6} \]

for $0 < x < 1$ and $t > 0$, with boundary conditions $b(0) = b(1) = 0$. The system (1.6) is hyperbolic with Riemann invariants

\[ \alpha = a - 2b, \qquad \beta = a + 2b. \]

The Riemann invariant form of (1.6) is

\[ \frac{\partial \alpha}{\partial t} = -\frac{\beta - \alpha}{2}\,\frac{\partial \alpha}{\partial x}, \qquad \frac{\partial \beta}{\partial t} = \frac{\beta - \alpha}{2}\,\frac{\partial \beta}{\partial x}, \tag{1.7} \]

with initial values

\[ \alpha_0(x) := a_0(x) - 2b_0(x), \qquad \beta_0(x) := a_0(x) + 2b_0(x). \tag{1.8} \]
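For the reader's convenience, the passage from (1.6) to (1.7) can be spelled out as follows. This is a routine verification added as a sketch; it uses only the definitions α = a − 2b and β = a + 2b, so that β − α = 4b.

```latex
\begin{aligned}
\partial_t \alpha &= \partial_t a - 2\,\partial_t b
  = 4b\,\partial_x b - 2b\,\partial_x a
  = -2b\,\partial_x (a - 2b)
  = -\frac{\beta-\alpha}{2}\,\partial_x \alpha, \\
\partial_t \beta &= \partial_t a + 2\,\partial_t b
  = 4b\,\partial_x b + 2b\,\partial_x a
  = 2b\,\partial_x (a + 2b)
  = \frac{\beta-\alpha}{2}\,\partial_x \beta .
\end{aligned}
```

The characteristic speeds are therefore ∓2b, which are real and distinct wherever b > 0; this is the hyperbolicity asserted above.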

A rigorous justification of this procedure was provided by Deift and McLaughlin in [7] for certain initial data (see below), modified to correspond to WKB data, and for times t up to the shock time of the system (1.6); see also the recent work [2]. For times beyond shock time, a more complicated description was found in certain cases.

The analysis of the continuum limit of the Toda lattice has much in common with the analysis of Lax and Levermore [22] of the zero-dispersion limit of the Korteweg–de Vries (KdV) equation $u_t - 6uu_x + \varepsilon^2 u_{xxx} = 0$ as $\varepsilon \to 0$. In both cases, essential use is made of the inverse spectral (or scattering) transforms for the KdV equation and the Toda lattice, respectively. A fundamental step in the analysis is the formulation of a quadratic maximization problem in the spectral variable in which the space-time variables x and t appear as parameters. The maximization problem arises from an asymptotic analysis of the spectral transform.


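For the continuum limit of the Toda lattice, the spectral data are the eigenvalues $\lambda_k$, $k = 1, 2, \dots, N$, of the tridiagonal matrix

\[ L = \begin{pmatrix} a_1 & b_1 & & & 0 \\ b_1 & a_2 & b_2 & & \\ & b_2 & a_3 & \ddots & \\ & & \ddots & \ddots & b_{N-1} \\ 0 & & & b_{N-1} & a_N \end{pmatrix} \tag{1.9} \]

constructed from the initial values (1.3), together with the "norming constants" $w_k > 0$, $k = 1, \dots, N$, where $w_k$ is the first component of the normalized eigenvector of L corresponding to the eigenvalue $\lambda_k$.

The following restriction is put on the initial Riemann invariants $\alpha_0$ and $\beta_0$ from (1.8). We assume as in [7, 21]

• $\alpha_0$ has exactly one local minimum in [0, 1], and
• $\beta_0$ has exactly one local maximum in [0, 1].

In addition, we assume that $\alpha_0(x) < \beta_0(x)$ for $x \in (0, 1)$, $\alpha_0(0) = \beta_0(0)$ and $\alpha_0(1) = \beta_0(1)$. We put

\[ A := \min_{0 \le x \le 1} \alpha_0(x), \qquad B := \max_{0 \le x \le 1} \beta_0(x). \]

It follows from the above assumptions that, for every $\lambda \in [A, B]$, the set $\{x \in [0, 1] : \alpha_0(x) \le \lambda \le \beta_0(x)\}$ is an interval. We denote the endpoints of this interval by $x_-(\lambda)$ and $x_+(\lambda)$.

Under these hypotheses, Deift and McLaughlin [7] show that the eigenvalues $\lambda_k$ have an asymptotic density φ, i.e., for all $\lambda_1 < \lambda_2$,

\[ \lim_{\varepsilon \to 0} \varepsilon\, \#\{\lambda_k \in (\lambda_1, \lambda_2)\} = \int_{\lambda_1}^{\lambda_2} \varphi(\lambda)\, d\lambda, \]

and φ is given by

\[ \varphi(\lambda) = \frac{1}{\pi} \int_{x_-(\lambda)}^{x_+(\lambda)} \frac{1}{\sqrt{(\beta_0(x) - \lambda)(\lambda - \alpha_0(x))}}\, dx, \qquad \lambda \in [A, B]. \tag{1.10} \]

Furthermore, for every fixed $\lambda^* \in [A, B]$, and eigenvalues $\lambda_{k(\varepsilon)}$ of L such that $\lambda_{k(\varepsilon)} \to \lambda^*$ as $\varepsilon \to 0$, the limit

\[ \lim_{\varepsilon \to 0} \varepsilon \log w_{k(\varepsilon)} = -V(\lambda^*) \]

exists, and V is given by

\[ V(\lambda) = \int_0^{x_-(\lambda)} \log \left| \frac{\lambda - a_0(x)}{2b_0(x)} + \sqrt{\left( \frac{\lambda - a_0(x)}{2b_0(x)} \right)^{2} - 1} \,\right| dx. \tag{1.11} \]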


In (1.11) that branch of the square root is chosen which is positive for $\lambda > \beta_0(x)$. Following [7], we are led to the maximization problem

\[ Q(x, t) := \max \left[ (L\psi, \psi) - 2(V - t\lambda, \psi) \right], \tag{1.12} \]

where the maximum is taken over those functions ψ on [A, B] satisfying

\[ 0 \le \psi \le \varphi, \qquad \int_A^B \psi(\lambda)\, d\lambda = x. \tag{1.13} \]

Here

\[ L\psi(\lambda) = \int_A^B \log|\lambda - \mu|\, \psi(\mu)\, d\mu, \]

and the inner product (· , ·) is the $L^2$ inner product on [A, B]:

\[ (f, g) = \int_A^B f(\lambda) g(\lambda)\, d\lambda. \]

The maximization problem (1.12)–(1.13) is an extremal problem for logarithmic potentials. The function V (λ) − tλ in the right-hand side of (1.12) is known as an external field, see [27]. The external field changes with time, and initially, at time t = 0, it is given by the spectral function V . The other spectral function φ is a constant of the motion and appears in (1.13) as an upper constraint for the maximizer. The spatial coordinate x appears as a normalization in (1.13). It is important to note that (1.12)–(1.13) provides a global description of the continuum limit of the Toda lattice. Indeed, the maximizer ψ(λ) = ψ(λ; x, t) exists and is unique for every x ∈ (0, 1) and every t ∈ R. In case, for some range of the parameters x and t in space-time, the “free part” of ψ, i.e., the set of λ ∈ [A, B] where 0 < ψ(λ) < φ(λ), is an interval (α(x, t), β(x, t)) then the endpoints α(x, t) and β(x, t) satisfy Eqs. (1.7) and the asymptotic form (1.4) and (1.5) is believed to be valid. In such a case one says that a zero gap ansatz holds. In case the set 0 < ψ(λ) < φ(λ) consists of several intervals, separated by a number of gaps, then the continuum limit exists only in a weak, averaged sense. The endpoints of the intervals then evolve according to a system of PDEs, which is recognized as a Whitham-type system of modulation equations, see again [7]. It is generally believed that the lack of strong convergence is due to the development of oscillations in the Toda lattice, and that the oscillations can be modeled by theta functions built out of a Riemann surface with genus equal to the number of gaps (for the analogous case of the KdV equation with small dispersion, see [32] and [13] for genus 1 and arbitrary genus, respectively). The connection between the existence of gaps and the development of oscillations in the continuum limit of the Toda lattice has not been established rigorously. 
However, experience from the interplay between whole line scattering theory and periodic spectral theory indicates that the existence of gaps implies the development of oscillations. For the small dispersion limit of the KdV equation, Deift, Venakides, and Zhou [8] have shown, under real analyticity assumptions on the spectral data, that the existence of gaps implies oscillations. Also, for the so-called Toda shock problem, the connection between a gap and oscillations is well known [15, 31, 17]. As already indicated, the formulation and analysis of a maximization problem like (1.12)–(1.13) lies at the heart of the Lax–Levermore approach to the zero-dispersion limit of the KdV equation [22, 30, 8, 11]. In a similar way, other singular limits of integrable


systems have been treated recently. We mention the semiclassical limit of the defocussing nonlinear Schrödinger equation [16] and the continuum limit of a discrete NLS chain [28]. The long time asymptotic behavior was considered in [22] for the zero-dispersion limit of the KdV equation, and in [1] for the semiclassical limit of the defocussing NLS equation.

In this paper, we consider two questions related to the continuum limit of the Toda lattice. The first concerns the long time behavior. It is well known that the Toda lattice (1.1)–(1.2) has the sorting property, i.e., for fixed $\varepsilon > 0$, the tridiagonal matrix (1.9) converges as $t \to \infty$ to a diagonal matrix with the eigenvalues $\lambda_k$ on the diagonal, sorted from large to small. This sorting property was discussed for the continuum limit in [3]. Deift and McLaughlin [7, Chapter 11] study the long time behavior in terms of the so-called right ansatz. The right ansatz holds for x and t if there exist $\alpha(x, t)$ and $\beta(x, t)$ in [A, B] such that the maximizer ψ of the maximization problem (1.12)–(1.13) satisfies

\[ 0 < \psi(\lambda) < \varphi(\lambda), \qquad \lambda \in (\alpha(x, t), \beta(x, t)), \tag{1.14} \]
\[ \psi = 0, \qquad \lambda \in [A, \alpha(x, t)], \tag{1.15} \]
\[ \psi = \varphi, \qquad \lambda \in [\beta(x, t), B]. \tag{1.16} \]

Theorem 1.1 (Deift–McLaughlin). For initial data as above, there exists a time $\bar t$ such that for $t > \bar t$ there exist $x_0(t)$ and $x_1(t)$ in [0, 1] satisfying

\[ \lim_{t \to \infty} x_0(t) = 0 \quad \text{and} \quad \lim_{t \to \infty} x_1(t) = 1 \]

such that the right ansatz holds for every $t > \bar t$ and every $x \in (x_0(t), x_1(t))$.

Proof. See [7, Theorems 11.1 and 11.2], where this result was proved for the case that $\alpha_0$ is strictly decreasing. [In that case one may take $x_1(t) = 1$.] The more general case follows from similar arguments.

It follows from Theorem 1.1 that, for fixed $x \in (0, 1)$, the functions $\alpha(x, t)$ and $\beta(x, t)$ exist for t sufficiently large. The fact that they have a common limit as $t \to \infty$ represents the continuum analogue of the sorting property of the Toda lattice. We complement this long time result in two ways.

• Firstly, we describe a general class of initial data for which the right ansatz holds for large enough times t, and for all $x \in (0, 1)$, see Theorem 4.1. For these initial data, one may therefore write $x_0(t) = 0$ and $x_1(t) = 1$ in Theorem 1.1.
• Secondly, we describe a different class of initial data for which the right ansatz does not hold near x = 0 no matter how large t is. See Theorem 5.3 for the precise conditions. For these initial data one therefore necessarily has $x_0(t) > 0$ in Theorem 1.1.

The difference between the two cases lies in the smoothness of $\beta_0$ at its maximum.

The second problem we consider in this paper is the formation of an infinite gap solution, i.e., an infinite number of intervals in the support of the maximizer.

• We present an example in which $C^\infty$ initial data evolve into an infinite gap solution at a certain time $t_0$ and position $x_0$, see Lemma 6.2 and Theorem 6.3. The construction makes essential use of the result on long time behavior.


• At the critical time $t_0$, we show that for $x < x_0$, and for $x \in (x_0, x_0 + \delta)$, the constraint is not active, and the maximizer is supported on finitely many intervals. As x approaches $x_0$, the number of gaps increases without bound. This indicates that the oscillations in the continuum limit of the Toda lattice become more and more complicated. It would be of interest to show that the oscillations are described by theta functions associated to Riemann surfaces whose genera grow as x tends to $x_0$.

We remark that for real analytic spectral data the formation of an infinite gap solution is not possible, see [18].

The outline of the rest of the paper is as follows. Sections 2 and 3 contain preliminary material that is needed for the long time result of Theorem 4.1. In Sect. 2 we introduce an external field $\tilde V$ which is dual to V. In Sect. 3 we obtain a uniformly valid long time result from certain smoothness properties of both V and $\tilde V$. Then in Sect. 4 we give a general class of initial data $\alpha_0$ and $\beta_0$ which give rise to external fields V and $\tilde V$ with the required smoothness properties, so that for large enough time, the right ansatz holds uniformly in x. In Sect. 5 we consider a different class of initial data for which the right ansatz does not hold uniformly in x. This class includes $C^2$ initial data. In our final Sect. 6 we describe the generation of an infinite gap solution from $C^\infty$ initial data.

2. The Dual Problem

The Euler–Lagrange relations associated with the maximization problem (1.12)–(1.13) are

\[ L\psi(\lambda) - V(\lambda) + t\lambda = \ell, \qquad \text{if } 0 < \psi(\lambda) < \varphi(\lambda), \tag{2.1} \]
\[ L\psi(\lambda) - V(\lambda) + t\lambda \le \ell, \qquad \text{if } \psi(\lambda) = 0, \tag{2.2} \]
\[ L\psi(\lambda) - V(\lambda) + t\lambda \ge \ell, \qquad \text{if } \psi(\lambda) = \varphi(\lambda), \tag{2.3} \]

where $\ell$ is a constant, which may depend on x and t. The maximizer is the only function on [A, B] that satisfies (1.13) and the relations (2.1)–(2.3) for some constant $\ell$.

We need some notions from logarithmic potential theory. Good references are [26, 27]. The Green function with pole at infinity of an unbounded domain Ω in the complex λ-plane is the unique continuous function of λ that is harmonic in Ω, vanishes on $\mathbb{C} \setminus \Omega$ and behaves like $\log|\lambda|$ as $|\lambda| \to \infty$. We use $g_{[\alpha,\beta]}$ to denote the Green function with pole at infinity of $\mathbb{C} \setminus [\alpha, \beta]$. It is known that

\[ g_{[\alpha,\beta]}(\lambda) = \log \left| \frac{\lambda - a}{2b} + \sqrt{\left( \frac{\lambda - a}{2b} \right)^{2} - 1} \,\right|, \]

where $a = (\alpha + \beta)/2$ and $b = (\beta - \alpha)/4$. Thus from (1.8) and (1.11) we see that the external field V is given as an integral of Green functions

\[ V(\lambda) = \int_0^{x_-(\lambda)} g_{[\alpha_0(x), \beta_0(x)]}(\lambda)\, dx. \tag{2.4} \]

Also the function φ(λ) of (1.10) has a potential theoretic interpretation. Let $\omega_{[\alpha,\beta]}$ be the density of the equilibrium measure of the interval [α, β], that is,

\[ \omega_{[\alpha,\beta]}(\lambda) = \begin{cases} \dfrac{1}{\pi \sqrt{(\beta - \lambda)(\lambda - \alpha)}} & \text{if } \lambda \in [\alpha, \beta], \\[2mm] 0 & \text{elsewhere.} \end{cases} \]


Then by (1.10)

\[ \varphi(\lambda) = \int_0^1 \omega_{[\alpha_0(x), \beta_0(x)]}(\lambda)\, dx. \tag{2.5} \]

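We recall the following relation between the equilibrium measure and the Green function:

\[ L\omega_{[\alpha,\beta]} = g_{[\alpha,\beta]} + \log \frac{\beta - \alpha}{4}. \tag{2.6} \]

We define a second external field

\[ \tilde V(\lambda) = \int_{x_+(\lambda)}^{1} g_{[\alpha_0(x), \beta_0(x)]}(\lambda)\, dx, \qquad \lambda \in [A, B]. \tag{2.7} \]

Lemma 2.1. Assume that $C = \int_0^1 \log b_0(x)\, dx > -\infty$. Then

\[ L\varphi = V + \tilde V + C. \]

Proof. Using (2.5) and (2.6), and noting that $(\beta_0(x) - \alpha_0(x))/4 = b_0(x)$, we find

\[ L\varphi = \int_0^1 g_{[\alpha_0(x), \beta_0(x)]}\, dx + \int_0^1 \log b_0(x)\, dx. \]

The Green function $g_{[\alpha_0(x), \beta_0(x)]}(\lambda)$ vanishes for $\lambda \in [\alpha_0(x), \beta_0(x)]$. Thus for fixed λ, it vanishes for $x \in [x_-(\lambda), x_+(\lambda)]$, and the first integral splits into the integrals over $[0, x_-(\lambda)]$ and $[x_+(\lambda), 1]$, which are V(λ) and $\tilde V(\lambda)$ by (2.4) and (2.7). This gives the lemma.

Remark 2.2. From now on we always assume $\int_0^1 \log b_0(x)\, dx > -\infty$. This is related to the assumption that Lφ is a bounded function.

We put $\tilde\psi = \varphi - \psi$, so that $\tilde\psi$ satisfies

\[ 0 \le \tilde\psi \le \varphi, \qquad \int_A^B \tilde\psi(\lambda)\, d\lambda = 1 - x. \]

Then $L\tilde\psi = L\varphi - L\psi$, so that in view of (2.1)–(2.3) and Lemma 2.1,

\[ L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda = \tilde\ell, \qquad \text{if } 0 < \tilde\psi(\lambda) < \varphi(\lambda), \tag{2.8} \]
\[ L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda \le \tilde\ell, \qquad \text{if } \tilde\psi(\lambda) = 0, \tag{2.9} \]
\[ L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda \ge \tilde\ell, \qquad \text{if } \tilde\psi(\lambda) = \varphi(\lambda), \tag{2.10} \]

with $\tilde\ell = C - \ell$. Thus we have proved the following theorem.

Theorem 2.3. If ψ is the maximizer for the maximization problem (1.12)–(1.13), then $\tilde\psi = \varphi - \psi$ is the maximizer for the maximization problem with external field $\tilde V(\lambda) + t\lambda$, constraint φ, and normalization $\int \tilde\psi\, d\lambda = 1 - x$.

We will refer to the maximization problem with external field $\tilde V(\lambda) + t\lambda$ as the dual problem. The simultaneous study of a maximization problem and its dual was done earlier by Dragnev and Saff [10] in their work on the zero asymptotics of Krawtchouk polynomials.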
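The relation (2.6) between the equilibrium measure and the Green function, which underlies Lemma 2.1, can be checked numerically for real λ outside [α, β]. The sketch below is an illustration under stated assumptions, not part of the original argument. It evaluates Lω by Gauss–Chebyshev quadrature, for which integration against ω reduces to averaging over Chebyshev nodes mapped to [α, β], and compares the result with the closed formula for the Green function on the real line. The function names, the number of nodes and the sample interval are choices made for this example.

```python
import math


def green(lam, alpha, beta):
    """Green function of C \\ [alpha, beta] with pole at infinity, for real lam."""
    a = 0.5 * (alpha + beta)
    b = 0.25 * (beta - alpha)
    z = abs(lam - a) / (2.0 * b)   # |(lam - a) / (2b)|
    if z <= 1.0:
        return 0.0                 # lam lies in [alpha, beta]
    return math.log(z + math.sqrt(z * z - 1.0))


def log_potential(lam, alpha, beta, nodes=200):
    """Approximate the integral of log|lam - mu| against the equilibrium density.

    Gauss-Chebyshev rule: the integral equals the average of log|lam - mu_k|
    over the Chebyshev nodes mu_k mapped to [alpha, beta]. Intended for lam
    outside [alpha, beta], where the integrand is smooth.
    """
    mid = 0.5 * (alpha + beta)
    half = 0.5 * (beta - alpha)
    total = 0.0
    for k in range(1, nodes + 1):
        mu = mid + half * math.cos((2 * k - 1) * math.pi / (2 * nodes))
        total += math.log(abs(lam - mu))
    return total / nodes


alpha, beta = 0.0, 2.0
for lam in (3.0, -1.5, 2.25):
    lhs = log_potential(lam, alpha, beta)
    rhs = green(lam, alpha, beta) + math.log((beta - alpha) / 4.0)
    print(lam, lhs, rhs)
```

For λ inside [α, β] the integrand has a logarithmic singularity and this simple rule converges only slowly, so the check is restricted to points outside the interval.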


3. The Right Ansatz and the Left Ansatz

The right ansatz was formulated in (1.14)–(1.16). If the right ansatz is valid then the support of ψ is equal to the interval [α, B] and the support of $\tilde\psi = \varphi - \psi$, the maximizer for the dual problem, is [A, β]. It is easy to see that also the converse holds. That is, if the support of ψ is an interval [α, B] and the support of $\tilde\psi$ is an interval [A, β], then α < β and (1.14)–(1.16) hold.

The following lemma gives sufficient conditions for the supports to be intervals. Part of the lemma is covered by [9, Theorem 2.16 (b)] of Dragnev and Saff. For completeness and convenience of the reader, we give here the full proof.

Lemma 3.1. Let $t \in \mathbb{R}$.
(a) If $(\lambda - A)(\tilde V'(\lambda) + t)$ is increasing for $\lambda \in [A, B]$, then, for every $x \in (0, 1)$, there is β = β(x, t) such that the support of $\tilde\psi$ is [A, β]. In addition, β(x, t) depends continuously on x.
(b) If $(B - \lambda)(V'(\lambda) - t)$ is increasing for $\lambda \in [A, B]$, then, for every $x \in (0, 1)$, there is α = α(x, t) such that the support of ψ is [α, B]. In addition, α(x, t) depends continuously on x.

Proof. (a) If the support of $\tilde\psi$ is not an interval, then there exist $\lambda_1 < \lambda_2$ in $\mathrm{supp}(\tilde\psi)$ such that $\tilde\psi(\lambda) = 0$ for $\lambda \in (\lambda_1, \lambda_2)$. By (2.9) we then have

\[ L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda \le \tilde\ell, \qquad \lambda \in (\lambda_1, \lambda_2), \]

while by (2.8) there is equality for $\lambda = \lambda_1$ and for $\lambda = \lambda_2$. Thus $L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda$ assumes its minimum on $[\lambda_1, \lambda_2]$ at an internal point, say at $\lambda^* \in (\lambda_1, \lambda_2)$. Then

\[ (L\tilde\psi)'(\lambda^*) - \tilde V'(\lambda^*) - t = 0. \tag{3.1} \]

Next, we note that for $\lambda \in (\lambda_1, \lambda_2)$,

\[ (\lambda - A)(L\tilde\psi)'(\lambda) = (\lambda - A) \int_A^B \frac{1}{\lambda - \mu}\, \tilde\psi(\mu)\, d\mu. \tag{3.2} \]

It is easy to see that for every $\mu \in (A, B] \setminus (\lambda_1, \lambda_2)$, the function $(\lambda - A)/(\lambda - \mu)$ is strictly decreasing on $[\lambda_1, \lambda_2]$. Since $\tilde\psi$ is non-negative on [A, B] and vanishes on $(\lambda_1, \lambda_2)$, we find that the left-hand side of (3.2) is strictly decreasing on $[\lambda_1, \lambda_2]$. Then we get, using the assumption that $(\lambda - A)(\tilde V'(\lambda) + t)$ increases on [A, B], that

\[ (\lambda - A)\left[ (L\tilde\psi)'(\lambda) - \tilde V'(\lambda) - t \right] \text{ is strictly decreasing on } [\lambda_1, \lambda_2]. \tag{3.3} \]

In view of (3.1) this implies that

\[ (L\tilde\psi)'(\lambda) - \tilde V'(\lambda) - t > 0, \qquad \lambda \in (\lambda_1, \lambda^*), \]
\[ (L\tilde\psi)'(\lambda) - \tilde V'(\lambda) - t < 0, \qquad \lambda \in (\lambda^*, \lambda_2), \]

which means that $L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda$ has a local maximum at $\lambda^*$. This contradiction shows that the support of $\tilde\psi$ is an interval. Let β = β(x, t) be the right endpoint of this interval.

Continuum Limit of the Toda Lattice


For later reference we note that for λ ∈ (β, B], we have the strict inequality

\[ L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda < \tilde\ell. \]

Indeed, if equality held, then we could repeat the same arguments as above, with λ1 = β and λ2 = λ, and we would obtain a contradiction again.

Next, we show that A is in the support of ψ̃. Assuming A is not in the support of ψ̃, we let λ2 > A be the smallest number in the support. Then

\[ L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda \le \tilde\ell, \qquad \lambda \in (A, \lambda_2), \]

with equality for λ = λ2. Then in the same way we obtained (3.3), we get that

\[ (\lambda - A)\bigl[(L\tilde\psi)'(\lambda) - \tilde V'(\lambda) - t\bigr] \ \text{is strictly decreasing on } [A, \lambda_2]. \]

As the above expression vanishes for λ = A, we see that it is negative for λ ∈ (A, λ2). Thus Lψ̃(λ) − Ṽ(λ) − tλ is strictly decreasing on [A, λ2]. This is again a contradiction, since

\[ L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda \le \tilde\ell \]

for λ = A and equality holds for λ = λ2. Thus the support of ψ̃ is equal to the interval [A, β(x, t)].

To prove that β(x, t) depends continuously on x, we note first that as x increases, the normalization ∫ψ̃ dλ = 1 − x decreases, and therefore, by Proposition 4.1(a) of [18] (see also Lemma 5.1 below) the maximizer decreases as x increases. In addition, the measures ψ̃ dλ depend continuously on x in the weak* topology of measures on [A, B], see [18, Proposition 4.1(b)]. This immediately implies that the endpoint β(x, t) is right-continuous in x.

What remains is to show that it is also continuous from the left. The limit from the left exists since β is a decreasing function of x. We denote the left limit by β(x−, t). Clearly β(x−, t) ≥ β(x, t). So we certainly have β(x−, t) = β(x, t) if β(x, t) = B. Assuming β(x, t) < B, we recall that we have the strict inequality

\[ L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda < \tilde\ell \]

for all λ ∈ (β(x, t), B]. By [18, Proposition 4.1(b)], Lψ̃(λ) and ℓ̃ both depend continuously on x. It follows that for any given λ ∈ (β(x, t), B] we can find sufficiently small δ > 0 such that also for x − δ strict inequality holds at λ. Then β(x − δ, t) < λ for all sufficiently small δ, and it follows that β(x−, t) < λ. Since λ can be taken arbitrarily close to β(x, t), we arrive at β(x−, t) ≤ β(x, t). This proves the left-continuity, since we already know that β(x−, t) ≥ β(x, t). This completes the proof of part (a).

The proof of part (b) is similar.

Combining Lemma 3.1 with the remarks immediately preceding it, we arrive at the following result.

Theorem 3.2. Let t ∈ R. If both (λ − A)(Ṽ′(λ) + t) and (B − λ)(V′(λ) − t) are increasing functions on the interval [A, B], then the right ansatz holds for the maximization problem (1.12)–(1.13) for every x ∈ (0, 1). Furthermore, the functions α(x, t) and β(x, t) appearing in (1.14)–(1.16) are continuous in x.


A. B. J. Kuijlaars, K. T.-R. McLaughlin

It is clear that if the conditions of the theorem hold for some t, then they hold for all later times as well. Thus in that case, the right ansatz continues to hold for all later times. It is important to note that in Theorem 3.2 the right ansatz holds for every x ∈ (0, 1).

Dual to the right ansatz, we have the so-called left ansatz. By this we mean that the maximizer ψ of (1.12)–(1.13) satisfies

\begin{align}
0 < \psi(\lambda) &< \phi(\lambda), & \lambda &\in (\alpha(x,t), \beta(x,t)), \tag{3.4} \\
\psi &= \phi, & \lambda &\in [A, \alpha(x,t)], \tag{3.5} \\
\psi &= 0, & \lambda &\in [\beta(x,t), B] \tag{3.6}
\end{align}

for some values α(x, t), β(x, t) in [A, B]. The following analogue of Theorem 3.2 holds for the left ansatz.

Theorem 3.3. Let t ∈ R. If both (λ − A)(V′(λ) − t) and (B − λ)(Ṽ′(λ) + t) are increasing functions on [A, B], then the left ansatz holds for the maximization problem (1.12)–(1.13) for every x ∈ (0, 1), and the functions α(x, t) and β(x, t) appearing in (3.4)–(3.6) are continuous in x.

If the conditions of Theorem 3.3 hold at some time t, then they hold for all earlier times.

Corollary 3.4. Suppose V′ and Ṽ′ are differentiable functions on (A, B).
(a) If there is a constant T such that for every λ ∈ (A, B),

\[ \frac{d}{d\lambda}\bigl[(\lambda - A)\tilde V'(\lambda)\bigr] \ge -T \quad\text{and}\quad \frac{d}{d\lambda}\bigl[(B - \lambda)V'(\lambda)\bigr] \ge -T, \]

then the right ansatz holds for every time t ≥ T and every x ∈ (0, 1).
(b) If there is a constant T such that for every λ ∈ (A, B),

\[ \frac{d}{d\lambda}\bigl[(\lambda - A)V'(\lambda)\bigr] \ge T \quad\text{and}\quad \frac{d}{d\lambda}\bigl[(B - \lambda)\tilde V'(\lambda)\bigr] \ge T, \]

then the left ansatz holds for every time t ≤ T and every x ∈ (0, 1).
(c) If V and Ṽ are C² functions on [A, B], then there exist Tr and Tl such that the right ansatz holds for t ≥ Tr and x ∈ (0, 1), and the left ansatz for t ≤ Tl and x ∈ (0, 1).
In all three cases, the functions α(x, t) and β(x, t) are continuous in x.

Proof. This follows immediately from Theorems 3.2 and 3.3.
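The step from the derivative bounds in Corollary 3.4 to the monotonicity hypotheses of Theorems 3.2 and 3.3 is just the product rule. The following lines are a sketch of that step, reading primes as d/dλ and using only the inequalities stated in parts (a) and (b).

```latex
% Part (a): t >= T gives the hypotheses of Theorem 3.2.
\begin{align*}
\frac{d}{d\lambda}\Bigl[(\lambda-A)\bigl(\tilde V'(\lambda)+t\bigr)\Bigr]
  &= \frac{d}{d\lambda}\Bigl[(\lambda-A)\tilde V'(\lambda)\Bigr] + t \;\ge\; -T + t \;\ge\; 0, \\
\frac{d}{d\lambda}\Bigl[(B-\lambda)\bigl(V'(\lambda)-t\bigr)\Bigr]
  &= \frac{d}{d\lambda}\Bigl[(B-\lambda)V'(\lambda)\Bigr] + t \;\ge\; -T + t \;\ge\; 0 .
\end{align*}
% Part (b): t <= T gives the hypotheses of Theorem 3.3.
\begin{align*}
\frac{d}{d\lambda}\Bigl[(\lambda-A)\bigl(V'(\lambda)-t\bigr)\Bigr]
  &= \frac{d}{d\lambda}\Bigl[(\lambda-A)V'(\lambda)\Bigr] - t \;\ge\; T - t \;\ge\; 0, \\
\frac{d}{d\lambda}\Bigl[(B-\lambda)\bigl(\tilde V'(\lambda)+t\bigr)\Bigr]
  &= \frac{d}{d\lambda}\Bigl[(B-\lambda)\tilde V'(\lambda)\Bigr] - t \;\ge\; T - t \;\ge\; 0 .
\end{align*}
```

For part (c), one natural reading (our gloss, not spelled out in the text) is that for C² data the four λ-derivatives above are bounded on [A, B], so suitable constants Tr and Tl exist.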

Remark 3.5. Under the conditions of Corollary 3.4, one may also show that the functions α(x, t) and β(x, t) are continuous in t. However, as we will not use this in the rest of the paper, we will not show it here.


4. Long Time Behavior: Right Ansatz Holds Uniformly

In many important cases the functions V and Ṽ are not C² functions and special attention is required for points where differentiability fails. Our goal in this section is to find conditions on the initial Riemann invariants α0 and β0 such that Theorem 3.2 can be applied for t sufficiently large. Recall that V and Ṽ are determined by α0 and β0 through Eqs. (2.4) and (2.7), respectively. We study initial data satisfying the following conditions (see Fig. 1 below).

I. The function β0(x) is continuous on [0, 1] and assumes its maximum at xB ∈ (0, 1). It is a C² function on [0, 1] \ {xB} with β0′ > 0 on [0, xB) and β0′ < 0 on (xB, 1]. At the point xB the left and right limits of β0′ and β0″ exist with

\[ \beta_0'(x_B-) > 0, \qquad \beta_0'(x_B+) < 0. \tag{4.1} \]

II. The function α0(x) is continuous on [0, 1] and assumes its minimum at xA ∈ (0, 1). It is a C² function on [0, 1] \ {xA} with α0′ < 0 on [0, xA) and α0′ > 0 on (xA, 1]. At the point xA the left and right limits of α0′ and α0″ exist with

\[ \alpha_0'(x_A-) < 0, \qquad \alpha_0'(x_A+) > 0. \tag{4.2} \]

[Figure: graphs of β0(x) (upper curve) and α0(x) (lower curve) over the x-axis from 0 to 1, with the points xB and xA marked.]

Fig. 1. Example of initial Riemann invariants satisfying the Conditions I and II

Theorem 4.1. If the initial data α0, β0 satisfy Conditions I and II, then for t large enough, the right ansatz holds for all x ∈ (0, 1), and the functions α(x, t) and β(x, t) are continuous in x.

Proof. The proof is based on Theorem 3.2, so we have to show that, for t large enough, both (B − λ)(V′(λ) − t) and (λ − A)(Ṽ′(λ) + t) increase on [A, B]. We discuss here in detail the behavior of (B − λ)(V′(λ) − t), the other function being similar.


Without loss of generality we may assume that α0(0) = β0(0) = 0. Then A < 0 < B. We start with some remarks on the function V. It is a non-negative continuous function on [A, B] with V(0) = 0. It is differentiable on (A, 0) and on (0, B) with derivative

\begin{align}
V'(\lambda) &= \int_0^{x_-(\lambda)} \frac{dx}{\sqrt{(\lambda - \alpha_0(x))(\lambda - \beta_0(x))}}, & \lambda &\in (0, B), \tag{4.3} \\
V'(\lambda) &= -\int_0^{x_-(\lambda)} \frac{dx}{\sqrt{(\lambda - \alpha_0(x))(\lambda - \beta_0(x))}}, & \lambda &\in (A, 0). \tag{4.4}
\end{align}

We see that V′(λ) is positive for λ > 0 and negative for λ < 0. Hence V assumes its minimum only at 0. [We could have taken (4.3) as the formula for V′ for all λ, if we would consider √((λ − α0(x))(λ − β0(x))) as a complex function of the complex variable λ, which would take positive values for λ > β0(x) and negative values for λ < α0(x). However, we chose here not to take that point of view. Instead all formulas are “real” formulas, and all square roots are positive.]

The values V′(B) and V′(A) exist. Indeed, we have

\[ V'(B) = \int_0^{x_B} \frac{dx}{\sqrt{(B - \alpha_0(x))(B - \beta_0(x))}} \]

and the integral converges because of the assumption that β0′(xB−) > 0. Similarly, V′(A) exists. Hence V′ is a continuous function on [A, B] \ {0}. At 0, V′ is not continuous, but the left and right limits at 0 exist. This may be seen from the formula

\[ V'(\lambda) = \int_0^1 \frac{\sqrt{\lambda}\, x_-'(\lambda u)}{\sqrt{\lambda - \alpha_0(x_-(\lambda u))}}\, \frac{du}{\sqrt{1-u}}, \qquad 0 < \lambda < B, \tag{4.5} \]

which is obtained from (4.3) through the change of variables x = x−(λu). As λ → 0+, it is easy to see that

\[ x_-'(\lambda u) \to \frac{1}{\beta_0'(0)} \tag{4.6} \]

and

\[ \frac{\sqrt{\lambda}}{\sqrt{\lambda - \alpha_0(x_-(\lambda u))}} = \Bigl(1 - \frac{\alpha_0(x_-(\lambda u))}{\lambda}\Bigr)^{-1/2} \to \Bigl(1 - \frac{\alpha_0'(0)}{\beta_0'(0)}\, u\Bigr)^{-1/2}. \]

Thus

\[ V'(0+) = \frac{1}{\beta_0'(0)} \int_0^1 \Bigl(1 - \frac{\alpha_0'(0)}{\beta_0'(0)}\, u\Bigr)^{-1/2} \frac{du}{\sqrt{1-u}} = 2\, \frac{\arctan\sqrt{-\alpha_0'(0)/\beta_0'(0)}}{\sqrt{-\alpha_0'(0)\beta_0'(0)}}. \tag{4.7} \]


Similarly

\[ V'(0-) = -2\, \frac{\arctan\sqrt{-\beta_0'(0)/\alpha_0'(0)}}{\sqrt{-\alpha_0'(0)\beta_0'(0)}}. \]

Thus V′ has a jump at 0 of magnitude

\[ V'(0+) - V'(0-) = \frac{\pi}{\sqrt{-\alpha_0'(0)\beta_0'(0)}}. \tag{4.8} \]
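The closed forms in (4.7) and (4.8) can be sanity-checked numerically. The sketch below assumes the reading V′(0+) = (1/β0′(0)) ∫₀¹ (1 − (α0′(0)/β0′(0))u)^(−1/2) du/√(1−u); the slopes a and b are hypothetical sample values standing in for α0′(0) < 0 and β0′(0) > 0, and the function names are ours.

```python
import math

def integral_4_7(c, n=20000):
    """Midpoint rule for I(c) = int_0^1 (1 + c*u)^(-1/2) (1 - u)^(-1/2) du.

    The substitution u = 1 - s**2 removes the endpoint singularity:
    I(c) = int_0^1 2 / sqrt(1 + c*(1 - s**2)) ds.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += 2.0 / math.sqrt(1.0 + c * (1.0 - s * s))
    return total * h

def right_limit_numeric(a, b):
    # a plays the role of alpha0'(0) < 0, b the role of beta0'(0) > 0
    return integral_4_7(-a / b) / b

def right_limit_closed(a, b):
    return 2.0 * math.atan(math.sqrt(-a / b)) / math.sqrt(-a * b)

def left_limit_closed(a, b):
    return -2.0 * math.atan(math.sqrt(-b / a)) / math.sqrt(-a * b)

a, b = -0.7, 1.3  # hypothetical slopes, chosen only for illustration
numeric = right_limit_numeric(a, b)
closed = right_limit_closed(a, b)
jump = closed - left_limit_closed(a, b)
print(numeric, closed)
print(jump, math.pi / math.sqrt(-a * b))
```

The jump formula reduces to the identity arctan(y) + arctan(1/y) = π/2 for y > 0, which is why it does not depend on the ratio of the two slopes.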

To study the differentiability of V′ we restrict ourselves to λ ∈ (0, B], the case λ ∈ [A, 0) being similar. We consider

\[ f(\lambda, u) = \frac{\sqrt{\lambda}\, x_-'(\lambda u)}{\sqrt{\lambda - \alpha_0(x_-(\lambda u))}}, \qquad 0 \le \lambda \le B, \quad 0 \le u \le 1, \tag{4.9} \]

which for λ = 0 is interpreted as the limit from above (which exists by (4.6) and (4.7)). In view of (4.5) we have

\[ V'(\lambda) = \int_0^1 f(\lambda, u)\, \frac{du}{\sqrt{1-u}}. \]

The function f is continuous on the rectangle R = {(λ, u) : 0 ≤ λ ≤ B, 0 ≤ u ≤ 1}. From (4.9) we see that f is differentiable with respect to λ, as all functions in (4.9) are differentiable. Furthermore, at λ = 0, f is differentiable from the right, and at λ = B, it is differentiable from the left. However, there is one exception, which has to do with the fact that α0 is not differentiable at xA. Thus α0(x−(λu)) is not differentiable with respect to λ if x−(λu) = xA. This happens if xA < xB and λu = β0(xA). In that case it follows from the assumptions on α0 that the derivatives from the left and from the right exist. Thus ∂f/∂λ exists on R \ {λu = β0(xA)}, is continuous there, and on the curve λu = β0(xA) its left and right limits exist. Then it easily follows that

\[ \frac{\partial f}{\partial \lambda}(\lambda, u)\, \frac{1}{\sqrt{1-u}} \]

is integrable on R. By Fubini’s theorem, we then have for every λ0 ∈ (0, B],

\begin{align*}
\int_0^{\lambda_0} \int_0^1 \frac{\partial f}{\partial \lambda}(\lambda, u)\, \frac{du}{\sqrt{1-u}}\, d\lambda
&= \int_0^1 \int_0^{\lambda_0} \frac{\partial f}{\partial \lambda}(\lambda, u)\, d\lambda\, \frac{du}{\sqrt{1-u}} \\
&= \int_0^1 \bigl[f(\lambda_0, u) - f(0, u)\bigr] \frac{du}{\sqrt{1-u}} \\
&= V'(\lambda_0) - V'(0+).
\end{align*}

Thus V′ is differentiable on (0, B] and

\[ V''(\lambda) = \int_0^1 \frac{\partial f}{\partial \lambda}(\lambda, u)\, \frac{du}{\sqrt{1-u}} \]


is a continuous function on (0, B] and the limit V″(0+) exists. Similarly, we find that V″ is a continuous function on [A, 0) and V″(0−) exists.

It follows that (B − λ)(V′(λ) − t) is differentiable on [A, B] \ {0} with derivative (B − λ)V″(λ) − V′(λ) + t. The functions V′ and V″ are continuous on [A, B] \ {0} and have left and right limits at 0. Therefore they are bounded. It follows that for

\[ t \ge T_1 := \sup_{\lambda \in [A,B]\setminus\{0\}} \bigl[ V'(\lambda) - (B - \lambda)V''(\lambda) \bigr] \]

the function (B − λ)(V′(λ) − t) increases on [A, 0) and on (0, B]. At λ = 0, V′ has a jump (4.8), from which it follows that

\[ \lim_{\lambda \to 0+} (B - \lambda)(V'(\lambda) - t) \ge \lim_{\lambda \to 0-} (B - \lambda)(V'(\lambda) - t). \]

Thus (B − λ)(V′(λ) − t) increases on the full interval [A, B] if t ≥ T1. In the same way, we obtain that (λ − A)(Ṽ′(λ) + t) increases on [A, B] for t ≥ T2, with

\[ T_2 := \sup_{\lambda \in [A,B]\setminus\{0\}} \bigl[ -\tilde V'(\lambda) - (\lambda - A)\tilde V''(\lambda) \bigr]. \]

Thus for t ≥ max(T1, T2), both (B − λ)(V′(λ) − t) and (λ − A)(Ṽ′(λ) + t) increase on [A, B] and the theorem follows because of Theorem 3.2.

Remark 4.2. The requirements that α0 and β0 are C² functions at 0 imply that φ(λ) becomes infinite at λ = 0 (as in the proof of Theorem 4.1, we assume that α0(0) = β0(0) = 0). In fact one may check that Conditions I and II imply that φ(λ) behaves like log(1/|λ|) for λ near 0. Thus the accumulation of eigenvalues for the original Toda matrix L is greater at zero than at other points of the spectrum. This observation has a noteworthy dynamical consequence. Since φ is a constant of the motion, the blow-up of φ at λ = 0 should be visible in the evolving curves α(x, t), β(x, t) at all times. Since the right ansatz holds for large time by Theorem 4.1, it follows that at large times, either α(·, t) or β(·, t) has a zero derivative at the value of x where it is 0.

Remark 4.3. The inequalities (4.1) in Condition I imply that the function β0 is not C² at the point xB, but rather possesses a corner there. We will show in the next section that the conclusion of Theorem 4.1 does not hold if β0 is a C² function on the full interval [0, 1]. It is similarly true that the conclusion of the theorem does not hold if α0 is a C² function on the full interval. The difference between the two cases has an interesting dynamical interpretation. The point xB where β0 has its maximum moves to the left as t increases. If β0 has a corner at xB, then the top point hits the boundary x = 0 in finite time. This is a consequence of Theorem 4.1. On the other hand, if β0 is a C² function, then the top point will not hit the boundary in finite time. A similar interpretation applies to α0.


5. Long Time Behavior: Right Ansatz Does Not Hold Uniformly

The goal of this section is to prove Theorem 5.3, from which it follows that for C² initial data α0 and β0, the right ansatz does not hold uniformly in x, no matter how large t is. To establish Theorem 5.3 we will use two results on the x-dependence of the maximizer, which will be discussed first. We study the dependence of the maximizer ψ on the spatial parameter x. Lemmas 5.1 and 5.2 hold generally and are not restricted to C² spectral data.

Lemma 5.1. Let V be continuous, and let φ be such that Lφ is continuous. Then for a fixed t, the maximizer ψ of the problem (1.12)–(1.13) increases with x.

Proof. See Proposition 4.1 of [18].

We are also interested in the behavior for x → 0+.

Lemma 5.2. Let V be continuous, and let φ be such that Lφ is continuous. We further assume that supp(φ) = [A, B]. Fix t ∈ R. The following are equivalent for λ0 ∈ [A, B]:
(a) λ0 is in the support of the maximizer ψ for every x ∈ (0, 1);
(b) the function V(λ) − tλ assumes its minimum at λ0.

Proof. Assume (b) holds. Let x > 0 and denote the corresponding maximizer by ψ. The function Lψ is harmonic on C \ supp(ψ). Since it tends to +∞ at infinity, the minimum principle for harmonic functions gives that Lψ assumes its minimum only in supp(ψ). Let λ1 ∈ supp(ψ) be a point where the minimum is assumed. Then, by (2.1) and (2.3),

\[ L\psi(\lambda_1) - V(\lambda_1) + t\lambda_1 \ge \ell. \]

If we assume that λ0 ∉ supp(ψ), then Lψ(λ1) < Lψ(λ0) and by (2.2)

\[ L\psi(\lambda_0) - V(\lambda_0) + t\lambda_0 \le \ell. \]

Combining the last three relations, we find

\[ V(\lambda_0) - t\lambda_0 > V(\lambda_1) - t\lambda_1, \]

which contradicts (b). Thus λ0 ∈ supp(ψ) and (a) holds.

Next, assume that (a) holds. For x ∈ (0, 1), we use ψ(·; x) to denote the maximizer corresponding to x and ℓx to denote the constant appearing in (2.1)–(2.3). Then by (2.1) and (2.3) we have for every x ∈ (0, 1),

\[ L\psi(\lambda_0; x) - V(\lambda_0) + t\lambda_0 \ge \ell_x. \]

Letting x → 0, we have that Lψ(·; x) → 0 by the dominated convergence theorem, and thus

\[ -V(\lambda_0) + t\lambda_0 \ge \limsup_{x \to 0} \ell_x. \]


We are done if we can prove that

\[ \lim_{x \to 0} \ell_x = -\min\{V(\lambda) - t\lambda : \lambda \in [A, B]\}. \tag{5.1} \]

Let λ1 be a point where V(λ) − tλ assumes its minimum. Then λ1 ∈ supp(ψ(·; x)) for every x ∈ (0, 1), by what has been proved before. Thus

\[ L\psi(\lambda_1; x) - V(\lambda_1) + t\lambda_1 \ge \ell_x. \tag{5.2} \]

For every x ∈ (0, 1), the set of λ values such that Lψ(λ; x) − V(λ) + tλ = ℓx is a non-empty closed set, because of the continuity of Lψ(λ; x). We let λx be a point closest to λ1 such that

\[ L\psi(\lambda_x; x) - V(\lambda_x) + t\lambda_x = \ell_x. \tag{5.3} \]

We claim that λx → λ1 as x → 0. Indeed, if this were not the case, there would be a sequence (xn) tending to 0 and an ε > 0 such that |λ_{xn} − λ1| > ε. Then using (5.2) and (5.3) we would find that Lψ(λ; xn) − V(λ) + tλ > ℓ_{xn} for all λ in the interval Δ = (λ1 − ε, λ1 + ε). Then ψ(·; xn) = φ in Δ by (2.3), so that

\[ x_n = \int \psi(\lambda; x_n)\, d\lambda \ge \int_\Delta \psi(\lambda; x_n)\, d\lambda = \int_\Delta \phi(\lambda)\, d\lambda > 0. \]

This contradicts the fact that (xn) tends to 0. Therefore λx → λ1 as x → 0, as claimed. Now letting x → 0 in (5.3), we get

\[ -V(\lambda_1) + t\lambda_1 = \lim_{x \to 0} \ell_x, \]

which proves (5.1) by the definition of λ1. This completes the proof of the lemma.

Theorem 5.3. Suppose the initial upper Riemann invariant β0 is increasing on the interval [0, xB], and decreasing on the interval [xB, 1], and the lower Riemann invariant α0 is decreasing on the interval [0, xA], and increasing on the interval [xA, 1], where xA, xB ∈ (0, 1).
(a) If β0 is a C² function in a neighborhood of xB, then for every t > 0, there exists δ > 0 such that for every x ∈ (0, δ), the maximizer ψ vanishes in a neighborhood of B.
(b) If α0 is a C² function in a neighborhood of xA, then for every t > 0, there exists δ > 0 such that for every x ∈ (1 − δ, 1), the maximizer ψ = φ in a neighborhood of A.
Consequently, in both cases the right ansatz (1.14)–(1.16) is not valid for all x ∈ (0, 1), no matter how large t is. In case (a) it fails for x close to 0, and in case (b) it fails for x close to 1.


Proof. We will restrict our attention to the proof of part (a), as the proof of part (b) is similar. So we assume that β0 is a C² function in a neighborhood of xB. Then as in (4.3), we have

\[ V'(B) = \int_0^{x_B} \frac{dx}{\sqrt{(B - \alpha_0(x))(B - \beta_0(x))}}. \tag{5.4} \]

As β0 is a C² function around xB and xB is the point where β0 has its maximum, we have

\[ B - \beta_0(x) = O((x_B - x)^2), \qquad (x \to x_B). \]

Hence the integral in (5.4) diverges and V′(B) = ∞. It follows that V(λ) − tλ does not assume its minimum at λ = B, no matter how large t is. Consequently, by Lemma 5.2, there is δ > 0 such that the maximizer ψ corresponding to normalization δ vanishes in a neighborhood of B. But then by Lemma 5.1, the maximizer corresponding to any smaller normalization also vanishes in this neighborhood, and thus part (a) of the theorem follows.

Example 5.4. The effect described in Theorem 5.3 is clearly visible in the following explicit solution of the continuous Toda equations (1.7):

\begin{align}
\alpha(x, t) &= \Bigl(\sqrt{(1-x)p} - \sqrt{x(1-p)}\Bigr)^2, \tag{5.5} \\
\beta(x, t) &= \Bigl(\sqrt{(1-x)p} + \sqrt{x(1-p)}\Bigr)^2, \tag{5.6}
\end{align}

where

\[ p = p(t) = \frac{1}{1 + e^{-2t}}. \tag{5.7} \]

A straightforward calculation shows that (5.5) and (5.6) indeed satisfy (1.7). This example is related to Krawtchouk polynomials [10]. The corresponding initial data are

\begin{align*}
\alpha_0(x) &= \tfrac{1}{2}\bigl(\sqrt{1-x} - \sqrt{x}\bigr)^2 = \tfrac{1}{2} - \sqrt{x(1-x)}, \\
\beta_0(x) &= \tfrac{1}{2}\bigl(\sqrt{1-x} + \sqrt{x}\bigr)^2 = \tfrac{1}{2} + \sqrt{x(1-x)}.
\end{align*}

The upper Riemann invariant β is smooth and has its maximum initially at x = 1/2, and at later times at

\[ x = 1 - p = \frac{1}{1 + e^{2t}}. \]

Similarly, α has its minimum at

\[ x = p = \frac{1}{1 + e^{-2t}}. \]

Thus for t > 0 the right ansatz holds for x in the interval

\[ [1 - p, p] = \Bigl[\frac{1}{1 + e^{2t}}, \frac{1}{1 + e^{-2t}}\Bigr], \]

but not for x in [0, 1 − p) ∪ (p, 1].
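The location of the extrema in Example 5.4 is easy to confirm numerically. The sketch below assumes the reading α, β = (√((1−x)p) ∓ √(x(1−p)))² of (5.5)–(5.6) with p(t) from (5.7); the sample time t = 0.7, the grid size and the helper names are our own choices, and the code does not check the Toda equations (1.7) themselves.

```python
import math

def p_of_t(t):
    # p(t) from (5.7)
    return 1.0 / (1.0 + math.exp(-2.0 * t))

def alpha(x, t):
    # lower Riemann invariant, reading of (5.5)
    p = p_of_t(t)
    return (math.sqrt((1.0 - x) * p) - math.sqrt(x * (1.0 - p))) ** 2

def beta(x, t):
    # upper Riemann invariant, reading of (5.6)
    p = p_of_t(t)
    return (math.sqrt((1.0 - x) * p) + math.sqrt(x * (1.0 - p))) ** 2

def grid_extremum(f, pick, n=20001):
    """Return the grid point in [0, 1] where f is largest (pick=max) or smallest (pick=min)."""
    xs = [i / (n - 1) for i in range(n)]
    return pick(xs, key=f)

t = 0.7  # hypothetical sample time
p = p_of_t(t)
x_top = grid_extremum(lambda x: beta(x, t), max)      # expected near 1 - p
x_bottom = grid_extremum(lambda x: alpha(x, t), min)  # expected near p
print(x_top, 1.0 - p)
print(x_bottom, p)
```

At the extrema the invariants reach the endpoints of the spectrum in this example: β(1 − p, t) = 1 and α(p, t) = 0, which the same functions reproduce up to rounding.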


6. Generation of Infinite Gaps from Smooth Initial Data

We show in this section how the previous results can be used to establish the existence of smooth C^∞ initial data such that for some time t0 and some position x0, the maximizer is supported on an infinite union of disjoint intervals.

6.1. Construction of the external field. We start with the construction of a C^∞ external field V0 such that the equilibrium measure in the presence of V0 (with normalization 1, and without upper constraint) is supported on infinitely many intervals.

Lemma 6.1. Define for k = 0, 1, 2, . . . ,

\[ a_k = \frac{1}{3}\Bigl(\frac{1}{2}\Bigr)^k, \qquad b_k = \frac{1}{2}\Bigl(\frac{1}{2}\Bigr)^k, \tag{6.1} \]

and put

\[ \Sigma = \{0\} \cup \bigcup_{k=0}^{\infty} [a_k, b_k]. \]

Then there are functions ψ0 and V0 on R with the following properties:
(a) ψ0 is a non-negative continuous function with support equal to Σ such that

\[ \int \psi_0(\lambda)\, d\lambda = 1. \tag{6.2} \]

(b) V0 is C^∞ on R and real analytic on R \ {0, b0, a0, b1, . . . }.
(c) Lψ0 = V0 on Σ and Lψ0 < V0 on R \ Σ.

Proof. The function ψ0 will be built out of translates and rescalings of the function

\[ f(\lambda) = \begin{cases} \dfrac{2}{\pi}\sqrt{1 - \lambda^2} & \text{for } \lambda \in [-1, 1], \\ 0 & \text{otherwise.} \end{cases} \tag{6.3} \]

It is well known that

\begin{align*}
Lf(\lambda) &= \lambda^2 - 1/2 - \log 2, & &\text{for } \lambda \in [-1, 1], \\
Lf(\lambda) &< \lambda^2 - 1/2 - \log 2, & &\text{for } \lambda \in R \setminus [-1, 1],
\end{align*}

(see, for example, [27, p. 240]). Thus Lf is real analytic on the intervals (−∞, −1), (−1, 1) and (1, ∞). Then there is a function W such that

\[ W \text{ is } C^\infty \text{ on } R \text{ and real analytic on } R \setminus \{-2, -1, 1, 3\}, \tag{6.4} \]


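and

\begin{align}
Lf(\lambda) &= W(\lambda), & &\text{for } \lambda \in (-\infty, -2] \cup [-1, 1] \cup [3, \infty), \tag{6.5} \\
Lf(\lambda) &< W(\lambda), & &\text{for } \lambda \in (-2, -1) \cup (1, 3). \tag{6.6}
\end{align}

Now for k = 0, 1, 2, . . . , we define

\[ c_k = \frac{a_k + b_k}{2} = \frac{5}{12}\Bigl(\frac{1}{2}\Bigr)^k, \qquad r_k = \frac{b_k - a_k}{2} = \frac{1}{12}\Bigl(\frac{1}{2}\Bigr)^k, \]

and

\begin{align}
f_k(\lambda) &= f\Bigl(\frac{\lambda - c_k}{r_k}\Bigr), \tag{6.7} \\
W_k(\lambda) &= r_k \Bigl[ W\Bigl(\frac{\lambda - c_k}{r_k}\Bigr) + \log r_k \Bigr]. \tag{6.8}
\end{align}

Then the function fk is supported on the interval [ak, bk], and by a straightforward calculation,

\[ Lf_k(\lambda) = r_k \Bigl[ Lf\Bigl(\frac{\lambda - c_k}{r_k}\Bigr) + \log r_k \Bigr]. \tag{6.9} \]

From (6.5), (6.6), and the definitions (6.1) of ak and bk, it then follows that

\begin{align}
Lf_k(\lambda) &= W_k(\lambda), & &\text{for } \lambda \in (-\infty, b_{k+1}] \cup [a_k, b_k] \cup [a_{k-1}, \infty), \tag{6.10} \\
Lf_k(\lambda) &< W_k(\lambda), & &\text{for } \lambda \in (b_{k+1}, a_k) \cup (b_k, a_{k-1}), \tag{6.11}
\end{align}

where a−1 = 2/3. Furthermore, Wk is C^∞ on R and real analytic on R \ {bk+1, ak, bk, ak−1} as a result of (6.4).

Taking k = 0, we see that W0 is real analytic, except at the points 1/4, 1/3, 1/2 and 2/3. Then there exists a C^∞ function Ŵ0 such that Ŵ0 = W0 on [0, 1/2], Ŵ0 > W0 on (−∞, 0) ∪ (1/2, ∞), and such that Ŵ0 is real analytic on R \ {0, 1/4, 1/3, 1/2}. It follows from (6.10)–(6.11) that

\begin{align}
Lf_0(\lambda) &= \hat W_0(\lambda), & &\text{for } \lambda \in [0, 1/4] \cup [1/3, 1/2], \tag{6.12} \\
Lf_0(\lambda) &< \hat W_0(\lambda), & &\text{for } \lambda \in (-\infty, 0) \cup (1/4, 1/3) \cup (1/2, \infty). \tag{6.13}
\end{align}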

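Now we form the two infinite series

\begin{align}
\psi_0(\lambda) &= \frac{12}{\sqrt{e}} \sum_{k=0}^{\infty} \frac{f_k(\lambda)}{k!}, \tag{6.14} \\
V_0(\lambda) &= \frac{12}{\sqrt{e}} \Bigl[ \hat W_0(\lambda) + \sum_{k=1}^{\infty} \frac{W_k(\lambda)}{k!} \Bigr]. \tag{6.15}
\end{align}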
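The constant 12/√e in (6.14) can be checked with a few lines of arithmetic. The sketch below assumes fk(λ) = f((λ − ck)/rk) with f the semicircle density of (6.3), so that each fk has mass rk times the mass of f; the truncation level K and all function names are ours.

```python
import math

def semicircle_mass(n=2000):
    """Midpoint rule for int_{-1}^{1} (2/pi) sqrt(1 - x^2) dx, using x = sin(theta)."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -math.pi / 2 + (i + 0.5) * h
        total += (2.0 / math.pi) * math.cos(theta) ** 2
    return total * h

def a(k):
    return (1.0 / 3.0) * 0.5 ** k  # left endpoint a_k from (6.1)

def b(k):
    return 0.5 * 0.5 ** k          # right endpoint b_k from (6.1)

def r(k):
    return (b(k) - a(k)) / 2.0     # half-length r_k of [a_k, b_k]

K = 40  # truncation level; the terms decay like (1/2)^k / k!
mass_f = semicircle_mass()
# Mass of psi_0: each rescaled bump f_k carries mass r_k * mass_f.
total_mass = (12.0 / math.sqrt(math.e)) * sum(
    r(k) * mass_f / math.factorial(k) for k in range(K)
)
# The intervals [a_k, b_k] are ordered and pairwise disjoint: b_{k+1} < a_k < b_k.
gaps_ok = all(b(k + 1) < a(k) < b(k) for k in range(K))
print(mass_f, total_mass, gaps_ok)
```

In closed form the sum is (12/√e) · (1/12) · Σ (1/2)^k / k! = e^(−1/2) · e^(1/2) = 1, which is the normalization (6.2).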

The factor 12/√e was taken in order to guarantee that (6.2) holds. Observe that by construction, the support of ψ0 is Σ, so that property (a) of the lemma holds. We note that V0 is a C^∞ function, since Ŵ0 and each Wk is C^∞ and the series (6.15) is uniformly convergent on compacts, as are the series with the derivatives of any order. Similarly, V0


is real analytic on each of the open intervals (ak, bk) and (bk+1, ak) for k = 1, 2, . . . . Thus property (b) holds. We see from (6.10)–(6.15) that Lψ0 = V0 on Σ and that Lψ0 < V0 on each of the gaps (bk+1, ak). Because of the modification of W0 to Ŵ0, we also have Lψ0 < V0 on (−∞, 0) and on (b0, ∞). Thus (c) holds.

From properties (a)–(c) of Lemma 6.1 it follows that ψ0 is the equilibrium measure in the presence of the external field V0. That is, it maximizes (Lψ, ψ) − 2(V0, ψ) among all non-negative functions ψ satisfying ∫ψ dλ = 1, see [5, 6, 27].

6.2. Construction of initial data. Let ψ0 and V0 be as in Lemma 6.1. We put

\[ \phi(\lambda) = \begin{cases} \dfrac{2}{\pi}\sqrt{1 - \lambda^2} & \text{for } \lambda \in [-1, 1], \\ 0 & \text{otherwise.} \end{cases} \tag{6.16} \]

Since ψ0 is bounded with support Σ ⊂ [0, 1/2], there is an x0 ∈ (0, 1) such that

\[ x_0 \psi_0 < \phi \quad \text{on } (-1, 1). \tag{6.17} \]

We consider the external field x0V0 and the constraint φ on the interval [−1, 1]. The dual external field, see Lemma 2.1, is Lφ − x0V0 − C. Both x0V0 and Lφ are C^∞ on [−1, 1] (in fact, Lφ is real analytic). Thus, by Corollary 3.4 (c), there exists a sufficiently negative time Tl such that the left ansatz holds for every t < Tl and every x ∈ (0, 1), with continuous functions α(·, t) and β(·, t). Choose t0 > −Tl and write

\[ V(\lambda) = x_0 V_0(\lambda) + t_0 \lambda, \qquad \lambda \in [-1, 1]. \tag{6.18} \]

Note that we then have

\[ V'(\lambda) > 0 \quad \text{for all } \lambda \in [-1, 1]. \]

For the external field (6.18) and the constraint φ, the maximizer ψ(·; x, t) for the maximization problem (1.12)–(1.13) at time t ∈ (−∞, t0 + Tl) satisfies the left ansatz (3.4)–(3.6). Thus for every x ∈ (0, 1) and t < t0 + Tl, there exist α(x, t) and β(x, t) in [−1, 1] such that ψ(·; x, t) = φ on [−1, α(x, t)] and 0 < ψ(·; x, t) < φ on (α(x, t), β(x, t)). We take

\[ \alpha_0(x) = \alpha(x, 0), \qquad \beta_0(x) = \beta(x, 0), \tag{6.19} \]

as initial data, whose spectral functions φ and V are given by (6.16) and (6.18), respectively.

Lemma 6.2. Let x0 and Tl be defined as above. Then for every t0 > −Tl the following statements hold for the functions α0(x) and β0(x) from (6.19):
(a) α0 and β0 are continuous increasing functions on (0, 1) with

\begin{align}
-1 < \alpha_0(x) &< \beta_0(x) < 1, \tag{6.20} \\
\lim_{x \to 0+} \alpha_0(x) &= \lim_{x \to 0+} \beta_0(x) = -1, \tag{6.21} \\
\lim_{x \to 1-} \alpha_0(x) &= \lim_{x \to 1-} \beta_0(x) = 1. \tag{6.22}
\end{align}


(b) The maximizer of the maximization problem (1.12)–(1.13) corresponding to x0 and t0 is equal to x0ψ0, and x0ψ0 is supported on an infinite number of intervals.

Proof. We already noted that the maximizer ψ(·; x, 0) at time t = 0 is equal to the constraint φ precisely on [−1, α0(x)], that it vanishes precisely on [β0(x), 1], and that the functions α0(x) and β0(x) are continuous in x. Since the maximizer increases with x by Lemma 5.1, it follows that both α0 and β0 are increasing functions of x.

If α0(x) were equal to −1, then the constraint φ would not be active. In that case, an explicit formula for ψ would be

\[ \psi(\lambda; x) = \frac{1}{\pi\sqrt{(\beta - \lambda)(\lambda + 1)}} \Bigl[ x + \frac{1}{\pi}\, \mathrm{P.V.}\!\int_{-1}^{\beta} \frac{V'(s)}{s - \lambda} \sqrt{(\beta - s)(s + 1)}\, ds \Bigr], \]

where P.V. denotes a Cauchy principal value integral, see e.g. [14, 19]. Since V′(s) > 0, we then see that the maximizer ψ would have a square-root singularity at −1, which is clearly impossible since ψ ≤ φ. Thus α0(x) > −1. Similarly β0(x) < 1.

Now we follow the arguments of Deift and McLaughlin in [7, Chapter 4], where they consider decreasing initial data. If we modify their arguments to the case of increasing initial data, we find that α0(x) and β0(x) satisfy the equations

\[ T(\alpha, \beta) = 0, \qquad X(\alpha, \beta) = x, \tag{6.23} \]

where the functions T and X are defined by

\[ T(\alpha, \beta) = \frac{1}{\pi} \int_\alpha^\beta \frac{V'(\lambda)}{\sqrt{(\beta - \lambda)(\lambda - \alpha)}}\, d\lambda - \int_{-1}^{\alpha} \frac{\phi(\lambda)}{\sqrt{(\beta - \lambda)(\alpha - \lambda)}}\, d\lambda, \tag{6.24} \]

and

\[ X(\alpha, \beta) = \frac{1}{\pi} \int_\alpha^\beta \frac{\bigl(\lambda - \frac{\alpha + \beta}{2}\bigr) V'(\lambda)}{\sqrt{(\beta - \lambda)(\lambda - \alpha)}}\, d\lambda - \int_{-1}^{\alpha} \frac{\bigl(\lambda - \frac{\alpha + \beta}{2}\bigr) \phi(\lambda)}{\sqrt{(\beta - \lambda)(\alpha - \lambda)}}\, d\lambda. \tag{6.25} \]

If we let β → α+, then (6.24) becomes

\[ V'(\alpha) - \int_{-1}^{\alpha} \frac{\phi(\lambda)}{\alpha - \lambda}\, d\lambda = -\infty. \]

Thus α0(x) < β0(x), and we proved (6.20).

As the maximizer is equal to the constraint φ on the interval [−1, α0(x)], it is clear that α0(x) → −1 as x → 0+. Since the maximizer vanishes on [β0(x), 1], and ∫φ(λ)dλ = 1, we also find that β0(x) → +1 as x → 1−.

Suppose that β0(x) → β > −1 as x → 0+. Then taking the limit x → 0+ in the equation T(α0(x), β0(x)) = 0, we find that

\[ \frac{1}{\pi} \int_{-1}^{\beta} \frac{V'(\lambda)}{\sqrt{(\beta - \lambda)(\lambda + 1)}}\, d\lambda = 0, \]

which is clearly impossible, since V′(λ) > 0. Thus β0(x) → −1 as x → 0+. Similar arguments, based on the dual problem, lead to the conclusion that α0(x) → 1 as x → 1−. Hence (6.21) and (6.22) are proved.


Finally, we note that 0 ≤ x0ψ0 ≤ φ by (6.17), and ∫x0ψ0 dλ = x0 by (6.2). By Lemma 6.1 (c) and (6.10) we have

\[ L(x_0\psi_0)(\lambda) = x_0 L\psi_0(\lambda) \le x_0 V_0(\lambda) = V(\lambda) - t_0\lambda \]

with equality on the support of x0ψ0. The support of x0ψ0 is equal to the set Σ of Lemma 6.1 and it consists of an infinite number of intervals. This proves part (b) and completes the proof of Lemma 6.2.

Summarizing, for each choice of t0 > −Tl, we have constructed an external field V(λ) = x0V0(λ) + t0λ out of V0(λ) so that at t = 0, the maximization problem (1.12)–(1.13) is solved by the left ansatz for all x ∈ (0, 1), with α0(x) and β0(x) depending continuously on x.

Next we would like to establish the C^∞ smoothness of the functions α0(x) and β0(x) (so far, we only know that they are continuous). For this, we will require that the parameter t0 be taken sufficiently large. We first observe that if we write T(α, β) from Eq. (6.24) in the form

\begin{align*}
T(\alpha, \beta) ={}& \frac{1}{\pi} \int_{-1}^{1} V'\Bigl(\frac{\alpha + \beta}{2} + \frac{\beta - \alpha}{2}\, u\Bigr) \frac{du}{\sqrt{1 - u^2}} \\
& - \sqrt{\frac{\alpha + 1}{2}} \int_{-1}^{1} \frac{\phi\bigl(\frac{\alpha - 1}{2} + \frac{\alpha + 1}{2}\, u\bigr)}{\sqrt{\bigl(\beta - \frac{\alpha - 1}{2} - \frac{\alpha + 1}{2}\, u\bigr)(1 - u)}}\, du,
\end{align*}

and use the fact that V′ and φ are C^∞ functions on [−1, 1], we find that T is a C^∞ function for −1 < α < β < 1. Similarly, X is C^∞.

Theorem 6.3. Let x0 and Tl be as in Lemma 6.2. Then there is T̂ > −Tl so that t0 > T̂ implies that the functions α0 and β0 corresponding to t0 as in (6.19) are C^∞ smooth.

Proof. We recall from the proof of Lemma 6.2 that for each x and t0 > −Tl, the pair (α0(x), β0(x)) solves the pair of equations (6.23). Using (6.24), together with V′(λ) = x0V0′(λ) + t0, we may rewrite the function T as follows,

\[ T(\alpha, \beta) = t_0 + \int_\alpha^\beta \frac{x_0 V_0'(\lambda)/\pi}{\sqrt{(\beta - \lambda)(\lambda - \alpha)}}\, d\lambda - \int_{-1}^{\alpha} \frac{\phi(\lambda)}{\sqrt{(\beta - \lambda)(\alpha - \lambda)}}\, d\lambda. \tag{6.26} \]

Observe that the first integral on the right-hand side of (6.26) is uniformly bounded for all α < β in [−1, 1]:

\[ \min_{\lambda \in [-1,1]} x_0 V_0'(\lambda) \le \int_\alpha^\beta \frac{x_0 V_0'(\lambda)/\pi}{\sqrt{(\beta - \lambda)(\lambda - \alpha)}}\, d\lambda \le \max_{\lambda \in [-1,1]} x_0 V_0'(\lambda). \tag{6.27} \]


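Similarly, its partial derivatives are uniformly bounded. Differentiating (6.26) with respect to β, we find

\begin{align}
T_\beta &= \frac{\partial}{\partial\beta} \int_\alpha^\beta \frac{x_0 V_0'(\lambda)/\pi}{\sqrt{(\beta - \lambda)(\lambda - \alpha)}}\, d\lambda + \int_{-1}^{\alpha} \frac{\phi(\lambda)}{\sqrt{(\beta - \lambda)(\alpha - \lambda)}}\, \frac{1/2}{\beta - \lambda}\, d\lambda \notag \\
&\ge \frac{\partial}{\partial\beta} \int_\alpha^\beta \frac{x_0 V_0'(\lambda)/\pi}{\sqrt{(\beta - \lambda)(\lambda - \alpha)}}\, d\lambda + \frac{1/2}{\beta + 1} \int_{-1}^{\alpha} \frac{\phi(\lambda)}{\sqrt{(\beta - \lambda)(\alpha - \lambda)}}\, d\lambda \notag \\
&= \frac{\partial}{\partial\beta} \int_\alpha^\beta \frac{x_0 V_0'(\lambda)/\pi}{\sqrt{(\beta - \lambda)(\lambda - \alpha)}}\, d\lambda + \frac{1/2}{\beta + 1} \Bigl[ t_0 + \frac{1}{\pi} \int_\alpha^\beta \frac{x_0 V_0'(\lambda)}{\sqrt{(\beta - \lambda)(\lambda - \alpha)}}\, d\lambda - T(\alpha, \beta) \Bigr]. \tag{6.28}
\end{align}

Now assume α and β satisfy T(α, β) = 0. Then, since the integral

\[ \int_\alpha^\beta \frac{x_0 V_0'(\lambda)/\pi}{\sqrt{(\beta - \lambda)(\lambda - \alpha)}}\, d\lambda \]

and its derivative with respect to β are uniformly bounded, we have Tβ > 0 for t0 sufficiently big.

Similarly, we now show that for t0 sufficiently large, if α and β solve the equation T(α, β) = 0, then Tα < 0. We insert the definition (6.16) of φ into the second integral in (6.26), to obtain

\[ \int_{-1}^{\alpha} \frac{\phi(\lambda)}{\sqrt{(\beta - \lambda)(\alpha - \lambda)}}\, d\lambda = \frac{2}{\pi} \int_{-1}^{\alpha} \frac{\sqrt{1 - \lambda^2}}{\sqrt{(\beta - \lambda)(\alpha - \lambda)}}\, d\lambda. \]

By contour integration this integral may be re-expressed as an integral over the interval [β, 1], which yields the following formula:

\[ \int_{-1}^{\alpha} \frac{\phi(\lambda)}{\sqrt{(\beta - \lambda)(\alpha - \lambda)}}\, d\lambda = \alpha + \beta + \frac{2}{\pi} \int_\beta^1 \frac{\sqrt{1 - \lambda^2}}{\sqrt{(\lambda - \beta)(\lambda - \alpha)}}\, d\lambda. \]

Now arguments quite similar to those used to prove that Tβ > 0 can be used to prove that Tα < 0, if T(α, β) = 0 and t0 is sufficiently large. We thus have shown that if t0 is sufficiently big, and if α and β solve T(α, β) = 0, then Tα < 0 and Tβ > 0.

For the partial derivatives of X, it follows as in [7, Chapter 4] that

\[ X_\alpha = -\frac{\beta - \alpha}{2}\, T_\alpha, \qquad X_\beta = \frac{\beta - \alpha}{2}\, T_\beta, \]

provided that α and β satisfy T(α, β) = 0. Therefore we learn that

\[ \det \begin{pmatrix} T_\alpha & T_\beta \\ X_\alpha & X_\beta \end{pmatrix} = (\beta - \alpha)\, T_\alpha T_\beta \ne 0 \]

for −1 < α < β < 1 solving T(α, β) = 0. Hence the Jacobian of the mapping (α, β) ↦ (T, X) is non-zero for t0 sufficiently large and T(α, β) = 0. Thus, recalling that α0(x) and β0(x) are continuous functions solving T(α, β) = 0 and X(α, β) = x, we deduce from the implicit function theorem that α0(x) and β0(x) are C^∞ functions on (0, 1). This proves the theorem.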
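For convenience, the determinant in the last step can be expanded directly from the two identities for the partial derivatives of X. The lines below only restate that step, under the standing assumption T(α, β) = 0 so that those identities apply.

```latex
\[
\det\begin{pmatrix} T_\alpha & T_\beta \\ X_\alpha & X_\beta \end{pmatrix}
 = T_\alpha X_\beta - T_\beta X_\alpha
 = T_\alpha \,\frac{\beta-\alpha}{2}\, T_\beta + T_\beta \,\frac{\beta-\alpha}{2}\, T_\alpha
 = (\beta-\alpha)\, T_\alpha T_\beta .
\]
```

With Tα < 0 < Tβ and α < β, the right-hand side is strictly negative, so in particular it does not vanish.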


Remark 6.4. Combining Lemma 6.2 and Theorem 6.3, we have constructed an example where an infinite gap solution arises out of $C^\infty$ initial data at a certain position $x_0$ and time $t_0 > 0$. We used the global description provided by the maximization problem (1.12)–(1.13), but we were not able to analyse the support of the maximizer in general for every $x$ and $t$. Hence we do not know whether the infinite gap solution occurs at other values of $x$ and $t$. What we can say is that the conditions of Corollary 3.4 are satisfied. Thus for large enough time (larger than $t_0$) the right ansatz holds uniformly for $x \in (0, 1)$. Therefore, for large enough time, all gaps in the support of the maximizer have disappeared, and we again have $C^\infty$ functions $\alpha(x, t)$ and $\beta(x, t)$. We are also able to analyse the maximizer at the fixed time $t_0$, with varying $x \in (0, 1)$. It turns out that for $x \neq x_0$, we have a finite gap solution provided that the maximizer does not meet the constraint $\phi$. This will be discussed in the next subsection.

6.3. Deformation in the spatial variable x.

We further study the external field $V_0$ constructed in the proof of Lemma 6.1, for which the equilibrium measure is supported on infinitely many intervals. We will consider the equilibrium problem with normalization $x > 0$, and prove that the maximizer is supported on finitely many intervals for every $x$ different from 1. As discussed in the Introduction, the normalization $x$ corresponds to the spatial variable in the continuum limit of the Toda lattice. In this subsection, we consider $V_0$ as an external field defined on $[-1, 1]$. For each $x > 0$, we use $\psi(\cdot; x)$ to denote the maximizer with external field $V_0$ and normalization $x$, i.e., $\int_{-1}^{1} \psi(\lambda; x)\, d\lambda = x$, and no upper constraint. We write $\Sigma_x$ for the support of $\psi(\cdot; x)$. Recall that $\psi(\cdot; x)$ is increasing with $x$ (cf. Lemma 5.1), and that $\psi(\cdot; 1)$ is equal to the function $\psi_0$ from Lemma 6.1. Thus
\[ \Sigma_1 = \{0\} \cup \bigcup_{k=0}^{\infty} [a_k, b_k], \tag{6.29} \]

where $a_k$ and $b_k$ are given by (6.1).

Theorem 6.5. For every $x > 0$, $x \neq 1$, the set $\Sigma_x$ consists of a finite number of intervals.

Proof. We consider first the case $x < 1$. Then $\Sigma_x \subset \Sigma_1$. First, we want to show that for all $k$ sufficiently large, the interval $[a_k, b_k]$ is disjoint from $\Sigma_x$. We use Lemma 5.7 of [29], from which it follows that
\[ \psi_0(\lambda) = \psi(\lambda; 1) \ge (1 - x)\, \frac{d\omega_{\Sigma_1}}{d\lambda}(\lambda), \qquad \text{for } \lambda \in \Sigma_x, \tag{6.30} \]
where $\omega_{\Sigma_1}$ denotes the equilibrium measure without external field of the set $\Sigma_1$, and normalization 1. Enlarging the set $\Sigma_1$ to the interval $[0, 1]$, we decrease the equilibrium measure on $\Sigma_1$, and a fortiori on $\Sigma_x$. This property of equilibrium measures follows for example from Theorem IV.4.5 of [27]. So we have
\[ \frac{d\omega_{\Sigma_1}}{d\lambda}(\lambda) \ge \frac{d\omega_{[0,1]}}{d\lambda}(\lambda) = \frac{1}{\pi\sqrt{\lambda(1 - \lambda)}}, \qquad \text{for } \lambda \in \Sigma_x. \tag{6.31} \]


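For $\lambda \in [a_k, b_k]$, we have
\[ \frac{1}{\pi\sqrt{\lambda(1 - \lambda)}} \ge \frac{1}{\pi\sqrt{b_k(1 - b_k)}} \ge \frac{1}{\pi\sqrt{b_k}} = \frac{2^{(k+1)/2}}{\pi}. \tag{6.32} \]
Now combining inequalities (6.30), (6.31), and (6.32), we find that
\[ \psi_0(\lambda) \ge (1 - x)\, \frac{2^{(k+1)/2}}{\pi}, \qquad \text{for } \lambda \in [a_k, b_k] \cap \Sigma_x. \tag{6.33} \]
On the other hand, from the construction of $\psi_0$ in (6.3), (6.7), and (6.14), it is clear that
\[ \psi_0(\lambda) \le \frac{24}{\pi\sqrt{e}\, k!}, \qquad \text{for } \lambda \in [a_k, b_k]. \tag{6.34} \]
From (6.33) and (6.34), we learn that if
\[ \frac{24}{\pi\sqrt{e}\, k!} < (1 - x)\, \frac{2^{(k+1)/2}}{\pi} \]
then $[a_k, b_k] \cap \Sigma_x$ is empty. This is clearly satisfied for $k$ large enough, since $k!$ grows faster than any exponential in $k$. Thus we have shown that
\[ \Sigma_x \subset \bigcup_{k=1}^{k_x} [a_k, b_k], \]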

for some finite $k_x$. Next, it also follows from (6.30) that the points $a_k$ and $b_k$ do not belong to $\Sigma_x$. Indeed, we know that $\psi_0$ vanishes at these points, and the density of $\omega_{\Sigma_1}$ is infinite at these points. So we see that $a_k \in \Sigma_x$ or $b_k \in \Sigma_x$ would contradict (6.30). Thus $[a_k, b_k] \cap \Sigma_x$ is contained in $[a_k + \delta, b_k - \delta]$ for some $\delta > 0$. Since $V_0$ is real analytic on $(a_k, b_k)$, Theorem 1.38 of [6] gives that $[a_k, b_k] \cap \Sigma_x$ consists of a finite number of intervals (cf. [18]). So it follows that $\Sigma_x$ consists of a finite number of intervals for all $x \in (0, 1)$.

Now we turn to the case that $x$ is bigger than 1. Fix $x > 1$, so that $\Sigma_1 \subset \Sigma_x$. Our first goal is to show that for $k$ large enough, the gaps $(b_{k+1}, a_k)$ of $\Sigma_1$ are fully contained in $\Sigma_x$. To this end, we introduce external fields, one for each $k \in \mathbb{N}$,
\[ Q_k = \begin{cases} V_0 - L\psi_0 & \text{on } [b_{k+1}, a_k], \\ 0 & \text{on } [-1, b_{k+1}] \cup [a_k, 1]. \end{cases} \tag{6.35} \]
This is a $C^{1+\frac{1}{2}}$ external field on $[-1, 1]$. Let
\[ \eta_k(\lambda) = \frac{1}{\pi\sqrt{1 - \lambda^2}} \left[ x - 1 + \frac{1}{\pi}\, \mathrm{P.V.} \int_{b_{k+1}}^{a_k} \frac{Q_k'(s)}{s - \lambda} \sqrt{1 - s^2}\, ds \right], \tag{6.36} \]
where P.V. denotes the Cauchy principal value. Then by standard results on singular integral equations, see e.g. [14, §42.3], we have
\[ L\eta_k = Q_k \qquad \text{on } [-1, 1] \tag{6.37} \]
and
\[ \int_{-1}^{1} \eta_k(\lambda)\, d\lambda = x - 1. \tag{6.38} \]

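We are going to show that $\eta_k$ is non-negative on $[-1, 1]$ if $k$ is sufficiently large. From (6.10)–(6.15) and (6.35) we note that
\[ Q_k(\lambda) = \frac{12}{\sqrt{e}} \sum_{j=k}^{k+1} \frac{1}{j!} \bigl( W_j(\lambda) - L f_j(\lambda) \bigr), \qquad \text{for } \lambda \in [b_{k+1}, a_k]. \tag{6.39} \]
From this and (6.7)–(6.8) we compute that for $\lambda \in [b_{k+1}, a_k]$,
\[ Q_k'(\lambda) = \frac{12}{\sqrt{e}} \sum_{j=k}^{k+1} \frac{1}{j!} \left[ W'\!\left( \frac{\lambda - c_j}{r_j} \right) - (Lf)'\!\left( \frac{\lambda - c_j}{r_j} \right) \right]. \tag{6.40} \]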

Inserting (6.40) into the principal value integral in the right-hand side of (6.36) and making a suitable transformation for each term separately, we arrive at the following principal value integrals: −1 12 W (t) − (Lf ) (t) 1 − (ck + rk t)2 ) dt, (6.41) √ P.V. t −ζ π ek! −3 2 12 W (t) − (Lf ) (t) 1 − (ck+1 + rk+1 t)2 dt, (6.42) P.V. √ t −ζ π e(k + 1)! 1 where in the first integral λ = ck +rk ζ , and in the second integral λ = ck+1 +rk+1 ζ . The functions (6.41) and (6.42) are Hilbert transforms of Hölder continuous functions, and therefore they are also Hölder continuous, and they decay to 0 for |ζ | → ∞, uniformly with respect to k. Using the continuity property of the Hilbert transform ! on Hölder continuous functions, we easily see that both (6.41) and (6.42) are O k!1 , as k → ∞, uniformly in ζ . Then it is clear from the definition (6.36) of ηk , that there exists kx ∈ N such that ηk > 0,

for all k ≥ kx .

(6.43)

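Combining (6.35), (6.37), (6.38), (6.43) with Lemma 6.1, we see that for $k \ge k_x$,
\[ \psi_0 + \eta_k > 0 \qquad \text{on } [-1, 1], \tag{6.44} \]
\[ \int_{-1}^{1} \bigl( \psi_0(\lambda) + \eta_k(\lambda) \bigr)\, d\lambda = x, \tag{6.45} \]
and
\[ L(\psi_0 + \eta_k) \begin{cases} = V_0 & \text{on } [b_{k+1}, a_k], \\ \le V_0 & \text{on } [-1, 1]. \end{cases} \tag{6.46} \]
We also note that $\psi_0 + \eta_k \neq \psi(\cdot; x)$, since strict inequality occurs in (6.46) in each of the gaps $(b_{j+1}, a_j)$ with $j \neq k$, and $\operatorname{supp}(\psi_0 + \eta_k) = [-1, 1]$. From (6.44)–(6.46) it then follows by Lemma 2.2 of [4] that
\[ [b_{k+1}, a_k] \subset \operatorname{supp}(\psi(\cdot; x)) = \Sigma_x \qquad \text{for } k \ge k_x. \]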
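Both halves of the proof rest on a finite threshold index $k_x$. For the case $x < 1$ treated earlier, the threshold comes from the explicit criterion $24/(\pi\sqrt{e}\,k!) < (1 - x)\,2^{(k+1)/2}/\pi$. The following is only an illustrative sketch in Python (the function names are hypothetical and not part of the paper) showing how quickly the factorial wins; it evaluates the stated inequality and nothing else.

```python
import math


def band_excluded(k, x):
    """Check the sufficient criterion 24/(pi*sqrt(e)*k!) < (1-x)*2**((k+1)/2)/pi.

    When it holds, the upper bound (6.34) on psi_0 is below the lower bound
    (6.33), so the band [a_k, b_k] cannot meet the support for this x.
    """
    upper = 24.0 / (math.pi * math.sqrt(math.e) * math.factorial(k))
    lower = (1.0 - x) * 2.0 ** ((k + 1) / 2.0) / math.pi
    return upper < lower


def first_excluded_index(x, k_max=100):
    """Smallest k (up to k_max) satisfying the criterion, for 0 < x < 1.

    The ratio of the two sides decreases in k, so the criterion also holds
    for every larger k once it holds for one k.
    """
    if not 0.0 < x < 1.0:
        raise ValueError("the criterion is only meaningful for 0 < x < 1")
    for k in range(k_max + 1):
        if band_excluded(k, x):
            return k
    raise ValueError("no index found up to k_max")


if __name__ == "__main__":
    for x in (0.5, 0.9, 0.99):
        print(x, first_excluded_index(x))
```

Even for $x$ close to 1 the returned index stays small, which is consistent with the remark that the criterion holds for all sufficiently large $k$.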


Since $\Sigma_1 \subset \Sigma_x$, we conclude that $[0, b_{k_x}] \subset \Sigma_x$. Thus for each $x > 1$, a full interval around 0 is contained in the support $\Sigma_x$. To conclude that $\Sigma_x$ consists of a finite number of intervals we are now left with the intervals $[b_{k_x}, 1]$ and $[-1, 0]$. The bands $[a_k, b_k]$ remain in the support $\Sigma_x$. It is thus enough to show that for each $k < k_x$ the set $\Sigma_x \cap [b_{k+1}, a_k]$ consists of a finite number of intervals, and similarly for $\Sigma_x \cap [-1, 0]$. To this end, we note that $\psi(\cdot; x) - \psi_0$ is a non-negative function with
\[ L(\psi(\cdot; x) - \psi_0) = V_0 - L\psi_0 + \ell_x \qquad \text{on } \Sigma_x, \]
and inequality $\le$ on $[-1, 1]$. Thus $\psi(\cdot; x) - \psi_0$ is the maximizer for the external field $V_0 - L\psi_0$ and normalization $x - 1$. This external field is zero on each interval $[a_k, b_k]$, and convex in a neighborhood of each $a_k$ and $b_k$. It then follows that some interval $[a_k - \delta, b_k + \delta]$ is also contained in $\Sigma_x$. In a neighborhood of the remaining gaps $[b_{k+1} + \delta, a_k - \delta]$, the external field $V_0$ is real analytic, and so by Theorem 1.38 of [6], $\Sigma_x \cap [b_{k+1} + \delta, a_k - \delta]$ consists of a finite union of intervals. Similarly, $V_0 - L\psi_0$ is convex in a left neighborhood of 0, and a left neighborhood $[-\delta, 0]$ is also contained in $\Sigma_x$. The external field $V_0$ is real analytic on $[-1, -\delta]$ and again by [6, Theorem 1.38] it follows that $\Sigma_x \cap [-1, -\delta]$ consists of a finite union of intervals. Thus we have shown that for $x > 1$ the support $\Sigma_x$ is a finite union of disjoint closed intervals.

Remark 6.6. Theorem 6.5 has the following consequence for the continuum limit of the Toda lattice with the initial data $\alpha_0$ and $\beta_0$ considered in Subsect. 6.2. We showed in Lemma 6.2 (b) that $x_0 \psi_0$ is the maximizer at time $t_0$ and position $x_0$. In the same way it follows that $x\psi(\cdot; x)$ is the maximizer at time $t_0$ and position $x$ provided that it satisfies the constraint
\[ x\psi(\cdot; x) \le \phi. \tag{6.47} \]
Using the fact that $x_0 \psi_0 < \phi$ on $(-1, 1)$, see (6.17), we may prove as in [18, Lemma 4.10] that there exists $\delta > 0$ such that (6.47) holds with strict inequality on $(-1, 1)$ for every $x < x_0 + \delta$. This implies that in the example of Subsect. 6.2 the infinite gap solution holds at time $t_0$ at $x_0$, but not at other $x$ values less than $x_0 + \delta$. Going from $x_0$ to $x < x_0$, we have that an infinite number of bands disappear, while going from $x_0$ to $x \in (x_0, x_0 + \delta)$, we have that an infinite number of gaps close. For $x > x_0 + \delta$, the upper constraint becomes active, and we are not able to analyse this more complicated situation.

Acknowledgements. Arno Kuijlaars was supported in part by FWO research project G.0278.97, and a research grant of the Fund for Scientific Research – Flanders. He is grateful to K. T.-R. McLaughlin for the support and hospitality during a visit to the University of Arizona. Kenneth T.-R. McLaughlin was supported in part by NSF postdoctoral fellowship grant # DMS-9508946 and NSF grant # DMS-9970328. He thanks the faculty and staff of the Princeton University Mathematics Department and MSRI for their support and hospitality, and thanks A. Kuijlaars and W. Van Assche for their hospitality and support during visits to K. U. Leuven.


References

1. Bardos, C., Ghidaglia, J.-M. and Kamvissis, S.: Weak convergence and deterministic approach to turbulent diffusion. In: Nonlinear wave equations (Yan Guo ed.), Contemp. Math. 263, Providence RI: AMS, 2000, pp. 1–15
2. Bloch, A., Golse, F., Paul, T. and Uribe, A.: Dispersionless Toda and Toeplitz operators. Preprint
3. Brockett, R.W. and Bloch, A.: Sorting with the dispersionless limit of the Toda lattice. In: Hamiltonian systems, transformation groups and spectral transform methods (Montreal, 1989), Montreal: Univ. Montréal, 1990, pp. 103–112
4. Damelin, S.B. and Kuijlaars, A.B.J.: The support of the equilibrium measure in the presence of a monomial external field on [−1, 1]. Trans. Am. Math. Soc. 351, 4561–4584 (1999)
5. Deift, P.: Orthogonal polynomials and random matrices: a Riemann–Hilbert approach. Courant Lecture Notes in Mathematics 3, New York: Courant Institute, 1999
6. Deift, P., Kriecherbauer, T. and McLaughlin, K.T.-R.: New results on the equilibrium measure for logarithmic potentials in the presence of an external field. J. Approx. Theory 95, 388–475 (1998)
7. Deift, P. and McLaughlin, K.T.-R.: A continuum limit of the Toda lattice. Mem. Am. Math. Soc. 131, no. 624 (1998)
8. Deift, P., Venakides, S. and Zhou, X.: New results in small dispersion KdV by an extension of the steepest descent method for Riemann–Hilbert problems. Internat. Math. Research Notices 6, 286–299 (1997)
9. Dragnev, P.D. and Saff, E.B.: Constrained energy problems with applications to orthogonal polynomials of a discrete variable. J. Anal. Math. 72, 223–259 (1997)
10. Dragnev, P.D. and Saff, E.B.: A problem in potential theory and zero asymptotics of Krawtchouk polynomials. J. Approx. Theory 102, 120–140 (2000)
11. Ercolani, N., Levermore, C.D. and Zhang, T.: The behavior of the Weyl function in the zero-dispersion KdV limit. Commun. Math. Phys. 183, 119–143 (1997)
12. Flaschka, H.: On the Toda lattice II. Prog. Theor. Phys. 51, 703–716 (1974)
13. Flaschka, H., Forest, M.G. and McLaughlin, D.W.: Multiphase averaging and the inverse spectral solutions of the Korteweg–de Vries equation. Comm. Pure Appl. Math. 33, 739–784 (1980)
14. Gakhov, F.: Boundary value problems. Oxford: Pergamon Press, 1966
15. Holian, B.L., Flaschka, H. and McLaughlin, D.W.: Shock waves in the Toda lattice: analysis. Phys. Rev. A 24, 2595–2623 (1981)
16. Jin, S., Levermore, C.D. and McLaughlin, D.W.: The semiclassical limit of the defocusing NLS hierarchy. Comm. Pure Appl. Math. 52, 613–654 (1999)
17. Kamvissis, S.: On the Toda shock problem. Phys. D 65, 242–266 (1993)
18. Kuijlaars, A.B.J.: On the finite gap ansatz in the continuum limit of the Toda lattice. Duke Math. J. 104, 433–462 (2000)
19. Kuijlaars, A.B.J. and Dragnev, P.D.: Equilibrium problems associated with fast decreasing polynomials. Proc. Am. Math. Soc. 127, 1065–1074 (1999)
20. Kuijlaars, A.B.J. and McLaughlin, K.T.-R.: Generic behavior of the density of states in random matrix theory and equilibrium problems in the presence of real analytic external fields. Comm. Pure Appl. Math. 53, 736–785 (2000)
21. Kuijlaars, A.B.J. and Van Assche, W.: The asymptotic zero distribution of orthogonal polynomials with varying recurrence coefficients. J. Approx. Theory 99, 167–197 (1999)
22. Lax, P.D. and Levermore, C.D.: The small dispersion limit of the Korteweg–de Vries equation I, II, III. Comm. Pure Appl. Math. 36, 253–290, 571–593, 809–829 (1983)
23. Manakov, S.V.: Complete integrability and stochastization of discrete dynamical systems. Zh. Exp. Teor. Fiz. 67, 543–555 (1974)
24. Moser, J.: Finitely many mass points on the line under the influence of an exponential potential – an integrable system. In: Dynamical Systems, Theory and Applications (J. Moser ed.), Lect. Notes in Phys. 38, Berlin: Springer, 1975, pp. 467–497
25. Rakhmanov, E.A.: Equilibrium measure and the distribution of zeros of the extremal polynomials of a discrete variable. Mat. Sb. 187, 109–124 (1996); English transl. in Sb. Math. 187, 1213–1228 (1996)
26. Ransford, T.: Potential theory in the complex plane. Cambridge: Cambridge University Press, 1995
27. Saff, E.B. and Totik, V.: Logarithmic Potentials with External Fields. New York: Springer-Verlag, 1997
28. Shipman, S.P.: Modulated waves in a semiclassical continuum limit of an integrable NLS chain. Comm. Pure Appl. Math. 53, 243–279 (2000)
29. Totik, V.: Weighted Approximation with Varying Weight. Lecture Notes in Math. 1569, Berlin: Springer-Verlag, 1994
30. Venakides, S.: Higher order Lax–Levermore theory. Comm. Pure Appl. Math. 43, 335–362 (1990)
31. Venakides, S., Deift, P. and Oba, R.: The Toda shock problem. Comm. Pure Appl. Math. 44, 1171–1242 (1991)
32. Whitham, G.B.: Linear and Nonlinear Waves. New York: Wiley, 1974

Communicated by M. Aizenman

Commun. Math. Phys. 221, 335 – 349 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

A Generic $C^1$ Expanding Map has a Singular S–R–B Measure

James T. Campbell, Anthony N. Quas

Department of Mathematical Sciences, University of Memphis, Memphis, TN 38152-3240, USA. E-mail: [email protected]; [email protected]

Received: 8 December 2000 / Accepted: 27 March 2001

Abstract: We show that for a generic $C^1$ expanding map $T$ of the unit circle, there is a unique equilibrium state for $-\log T'$ that is an S–R–B measure for $T$, and whose statistical basin of attraction has Lebesgue measure 1. We also present some results related to the question of whether a generic $C^1$ expanding map preserves a σ-finite measure, absolutely continuous with respect to Lebesgue measure.

1. Introduction

Let $\mathcal{E}^k$ denote the set of $C^k$ expanding maps of the unit circle $S^1$ onto itself, $k = 1, 2, \ldots$. Expanding maps have been widely studied in ergodic theory. In particular, various cases with $k \ge 2$ have been studied by a large number of authors including Rényi ([17], 1965), Krzyżewski ([9], 1971), Krzyżewski and Szlenk ([11], 1969). A typical result says that an expanding map with $C^2$ regularity has a unique absolutely continuous invariant measure with strong ergodic properties. These results have been extended to the case of $C^{1+\alpha}$ expanding maps of the circle (maps with a Hölder continuous derivative) and even to maps satisfying weaker regularity conditions. More recently Góra ([5], 1994) proved results of this type under the Dini condition.

A later result of Krzyżewski ([10], 1979) gave the first indication that the situation for $C^1$ expanding maps differs from that of the smoother maps. Namely, he showed that within the set of expanding $C^1$ self-maps of any manifold, the set of such maps for which there is an absolutely continuous invariant probability measure, with continuous density bounded away from 0, is meager. (That is, its complement is generic, i.e., contains a dense $G_\delta$ set with respect to the $C^1$ topology.) This theme was taken up by Góra and Schmitt ([4], 1989) who showed that there is an example of an expanding $C^1$ map of the circle that has no absolutely continuous invariant probability measure.
In further studies of $C^1$ expanding maps of the circle by Quas ([15, 13, 14], all 1996), maps with respectively more than one absolutely continuous invariant measure and a non-weak-mixing invariant measure were constructed; and it was shown that a dense set of $C^1$ expanding maps have a unique absolutely continuous invariant probability with unbounded density. In [2] (1998), Bruin and Hawkins constructed an example of an expanding $C^1$ map of the circle with no σ-finite absolutely continuous invariant measure (finite or infinite). In a more recent paper of Quas ([16], 1999) it was shown that a generic $C^1$ expanding map of the circle has no absolutely continuous invariant probability measure. Our main result shows that despite this result, there is (generically) a singular invariant probability from which properties of Lebesgue almost every orbit can be obtained.

J. Campbell is partially supported by NSF Grant #DMS–9801602.

Theorem 1. For a generic $T \in \mathcal{E}^1$, there is a unique equilibrium measure $\mu_T$ for the potential $-\log T'$. This $T$-invariant probability measure has the following properties:
1. There is a set $S \subset S^1$ of Lebesgue measure 1 such that for all $f \in C^0(S^1)$ and all $x \in S$, the averages $\frac{1}{n} \sum_{k=0}^{n-1} f(T^k x)$ converge to $\int f\, d\mu_T$.
2. The measure $\mu_T$ is singular with respect to Lebesgue measure.
3. For each non-empty open set $U$, $\mu_T(U) > 0$.
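The averages in Statement 1 are ordinary Birkhoff averages along an orbit. The sketch below is only an illustration of that quantity, not of the theorem itself: it uses a hypothetical smooth degree-2 map chosen for convenience (a smooth map is not a generic $C^1$ map, and floating-point orbits are only pseudo-orbits), and all function names are assumptions made for the example.

```python
import math


def circle_map(x, eps=0.1):
    """Illustrative degree-2 expanding map of [0, 1) with 0 and 1 identified.

    T(x) = 2x + eps*sin(2*pi*x)/(2*pi) mod 1; this choice is an assumption
    made for the example and does not come from the paper.
    """
    return (2.0 * x + eps * math.sin(2.0 * math.pi * x) / (2.0 * math.pi)) % 1.0


def circle_map_derivative(x, eps=0.1):
    """Derivative T'(x) = 2 + eps*cos(2*pi*x), which is >= 2 - eps > 1."""
    return 2.0 + eps * math.cos(2.0 * math.pi * x)


def birkhoff_average(f, x, n, T=circle_map):
    """Return (1/n) * sum_{k=0}^{n-1} f(T^k x)."""
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = T(x)
    return total / n


if __name__ == "__main__":
    observable = lambda y: math.cos(2.0 * math.pi * y)
    for n in (10, 100, 1000, 10000):
        print(n, birkhoff_average(observable, 0.3, n))
```

For a map to which Theorem 1 applies, such averages would stabilise at $\int f\, d\mu_T$ for Lebesgue-almost every starting point; the script only shows how the averages are formed.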

In other words, a generic $T \in \mathcal{E}^1$ possesses a fully supported singular Sinai–Ruelle–Bowen measure whose statistical basin of attraction has Lebesgue measure 1.

A natural question is whether the result from [16] may be extended from probability measures to σ-finite measures; i.e., is it true that generically in $\mathcal{E}^1$, there is no absolutely continuous invariant measure? At the moment, we do not know the answer, but we include the following trio of results that give some information about this situation.

Silva [19] introduced a notion of recurrence for a measure with respect to a non-singular transformation. To define this in our setting, let $h$ be the density of $\lambda \circ T^{-1}$ with respect to Lebesgue measure ($h = d\lambda \circ T^{-1}/d\lambda$), and set $\omega_n(x) = \prod_{j=1}^{n} \bigl( h \circ T^j(x) \bigr)^{-1}$. Then $\omega_n > 0$ on $S^1$ and $\int \omega_n\, d\lambda = 1$, $n = 1, 2, \ldots$. Lebesgue measure is recurrent for $T$ if the quantity $\sum_{n=1}^{\infty} \omega_n(x)$ is infinite for λ-a.e. $x \in S^1$. (We caution the reader that this notion of recurrence is much stronger than Poincaré recurrence. For example there exist $C^2$ expanding maps of $S^1$ for which Lebesgue measure is not recurrent in this sense.) This recurrence property is relevant to the question of the existence of invariant measures as follows. If one can establish that a measure is recurrent for a non-invertible map, then existence or non-existence of absolutely continuous, σ-finite invariant measures for the map can be decided using a version of Krieger's ratio set (see Hawkins and Silva [6] for a proof of this result).

Theorem 2. For $T$ in a generic subset of $\mathcal{E}^1$, Lebesgue measure is not recurrent for $T$.

Recall that a measure $\mu$ is locally infinite if $\mu(I) = \infty$ for each open interval $I$.

Theorem 3. For a generic $T \in \mathcal{E}^1$, any absolutely continuous invariant measure is locally infinite.

To describe the next result in this direction, let $h_n(x)$ be the density of $\lambda \circ T^{-n}$ with respect to $\lambda$: $h_n(x) = d\lambda \circ T^{-n}/d\lambda(x)$. Set
\[ \mathcal{S}_{\epsilon,n,a} = \bigl\{ T \in \mathcal{E}^1 : \lambda\{x : h_n(x) \in [a, 2a]\} < \epsilon \bigr\}, \]
and consider the collection
\[ \mathcal{S} = \bigcap_{\epsilon > 0}\, \bigcup_{n \in \mathbb{N}}\, \bigcap_{a > 0} \mathcal{S}_{\epsilon,n,a}. \]
If $T \in \mathcal{S}$, we say the densities of $\lambda \circ T^{-n}$ have no characteristic scale. This is because for such a $T$ and for any $\epsilon > 0$, there exists an $n$ such that for each $a > 0$, the set $\{x : h_n(x) \in [a, 2a]\}$ has Lebesgue measure less than $\epsilon$. It is known that there exist mappings with an infinite invariant measure so that the above densities $h_n$, when appropriately rescaled, converge in measure to the invariant density (see Aaronson's book [1] for examples). One can see that when $T$ belongs to the class $\mathcal{S}$ defined above, this is impossible. Therefore, when $T$ belongs to $\mathcal{S}$, a natural way of producing an absolutely continuous invariant measure is lost.

Theorem 4. The set $\mathcal{S}$ constructed above is a dense $G_\delta$ subset of $\mathcal{E}^1$.

In the next section we give some notation and definitions, in Sect. 3 we state and prove some preliminary lemmas, in Sect. 4 we prove Theorems 1, 2, 3, and in Sect. 5 we prove Theorem 4.

2. Notation & Definitions

We work on $S^1 = [0, 1]/\sim$, where $\sim$ identifies 0 with 1. The Borel sigma-algebra is denoted by $\mathcal{B}$. The space of Borel measures on $S^1$ is denoted by $\mathcal{M}$, with $\mathcal{M}^1$ denoting the subspace of probabilities. If $T \in \mathcal{E}^1$, $\mathcal{M}^1_T$ denotes the set of Borel probability measures that are invariant under $T$. For $\nu \in \mathcal{M}^1_T$, the measure-theoretic entropy of $T$ with respect to $\nu$ is denoted by $h_\nu(T)$, or $h_\nu$ if $T$ is understood. For a continuous function $f : S^1 \to \mathbb{R}$, the pressure of $f$ (with respect to $T$) is given by
\[ P_T(f) = \sup_{\nu \in \mathcal{M}^1_T} \left( h_\nu(T) + \int f\, d\nu \right). \]

An equilibrium state for $f$ is an element $\mu \in \mathcal{M}^1_T$ satisfying $P_T(f) = h_\mu + \int f\, d\mu$. Recall that a Borel measure $\mu$ is called a Sinai–Ruelle–Bowen measure for $T \in \mathcal{E}^1$ if there exists a subset $B$ of $S^1$ of positive Lebesgue measure such that for each $f \in C^0(S^1)$ and all $x \in B$,
\[ \frac{1}{n} \sum_{k=0}^{n-1} f(T^k(x)) \to \int f\, d\mu. \]
The set $B$ is called the statistical basin of attraction of $\mu$.

For each $T \in \mathcal{E}^1$, $T'$ is a continuous function whose absolute value is strictly larger than 1. Since $S^1$ is connected, $\mathcal{E}^1$ decomposes into two disjoint open subsets, the first consisting of those $T$'s for which $T' > 1$, the other, those $T$'s for which $T' < -1$. Each of these sets has countably many open components, corresponding to the maps of degree $k$ ($k = 2, 3, \ldots$, and $k = -2, -3, \ldots$, respectively). In some of our arguments, we want to prove, say, that a subset of $\mathcal{E}^1$ with a certain property is generic. We proceed by supposing that $T' > 1$ and the degree is a fixed but arbitrary integer $k > 1$, and proving that within the corresponding component, the set is generic. Since a practically identical argument (with only the obvious minor modifications) will hold for $T' < -1$ and $k \le -2$, and the components partition $\mathcal{E}^1$, the general result will follow.

With these conventions in place we set $\Phi_T = \Phi = -\log(T') < 0$. We define the Perron–Frobenius operator, or transfer operator $\mathcal{L}_T$ by
\[ \mathcal{L}_T f(x) = \sum_{Ty = x} \frac{f(y)}{|T'(y)|}. \]
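To make the definition concrete, here is a minimal sketch for the linear map $E_k(x) = kx \bmod 1$ that appears in Sect. 3. For this map the preimages of $x$ are $(x + j)/k$, $j = 0, \ldots, k-1$, and $|E_k'| = k$ everywhere, so the sum can be written out directly. The function name is hypothetical, and restricting to $E_k$ is an assumption made to keep the example exact; a general $T$ would need a root-finding step to locate preimages.

```python
import math


def transfer_operator_linear(f, k):
    """Return the function L f for the map E_k(x) = k*x mod 1.

    The preimages of x under E_k are (x + j)/k for j = 0..k-1, and the
    derivative has absolute value k at each of them.
    """
    def Lf(x):
        return sum(f((x + j) / k) for j in range(k)) / k
    return Lf


if __name__ == "__main__":
    # The constant function 1 is fixed: Lebesgue measure is E_k-invariant.
    L_one = transfer_operator_linear(lambda y: 1.0, 3)
    print(L_one(0.37))

    # cos(4*pi*y) is mapped to cos(2*pi*x) by the operator of E_2.
    L_cos = transfer_operator_linear(lambda y: math.cos(4.0 * math.pi * y), 2)
    print(L_cos(0.2), math.cos(2.0 * math.pi * 0.2))
```

That $\mathcal{L}_{E_k} 1 = 1$ reflects the fact that Lebesgue measure is invariant for $E_k$; for a general $T$ the function $\mathcal{L}_T 1$ is the density of $\lambda \circ T^{-1}$.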

For now we do not specify the space containing $f$ or $\mathcal{L}_T f$. These will depend upon the context in which they are being used, and will be designated as needed in the development. We repeatedly use the fact (proved in [9]) that for each $T \in \mathcal{E}^2$, there exists a unique, absolutely continuous $\mu \in \mathcal{M}^1_T$, whose density is strictly positive and continuous.

3. Preliminary Lemmas

We state and prove some lemmas that lead to the main results. Following [7], for each natural number $k \ge 2$, let $E_k : S^1 \to S^1$ denote the linear expanding map $E_k(x) = kx \bmod 1$. For $T \in \mathcal{E}^1$ of degree $k$, it is well-known that $E_k$ is conjugate to $T$; that is, there exists a homeomorphism $\gamma$ of $S^1$ such that $T \circ \gamma = \gamma \circ E_k$. In fact, in general there is more than one such homeomorphism (although only finitely many). For a degree $k$ map $T \in \mathcal{E}^1$, we shall write $\mathrm{Conj}(T)$ for the set of conjugacies between $E_k$ and $T$. For our purposes, it will be necessary to study and control the dependence of the conjugacy on the map $T$. To do this, we shall exploit the construction in [7] of such a conjugacy. Specifically, in their construction, they start with a point $p$ that is fixed by $T$ and use the Markov partition of the circle given by the intervals whose endpoints are the points of $T^{-1}\{p\}$. For our modification, we need to control the choice of $p$. For $z \in S^1$, set $U_z = \{T \in \mathcal{E}^1 : T(z) \neq z\}$. Note that $U_z$ is a dense open subset of $\mathcal{E}^1$.

Lemma 1. For each $z \in S^1$, there is a continuous map $\Gamma_z : U_z \to \mathrm{Homeo}(S^1)$ such that $\Gamma_z(T) \in \mathrm{Conj}(T)$ for each $T$. In particular, given $T \in \mathcal{E}^1$ of degree $k$, there is a neighborhood $U$ of $T$ on which there is a continuous choice of conjugacies to the map $E_k$.

Proof. The proof is essentially that given in the proof of Theorem 2.4.6 in [7]. For a map $T \in U_z$, we choose the fixed point $p$ of $T$ that is the first fixed point on the circle "to the right" of $z$. That is, considering the circle to be the set $[0, 1)$, $p$ is chosen to be the first fixed point to the right of $z$ or if there is none, the first fixed point to the right of 0. This choice of fixed point determines a conjugacy $\Gamma_z(T)$. The fixed point may be seen to depend continuously on the map, and so do its preimages. This allows one to show the required continuity of $\Gamma_z$. To show that in a neighborhood of any given map $T \in \mathcal{E}^1$, there is a continuous family of conjugacies, we argue as follows: let $z$ be any point not fixed by $T$; then $U_z$ is the required neighborhood and $\Gamma_z(S)$ is the continuous choice of conjugacy for $S \in U_z$.


Note that if $\gamma \in \mathrm{Conj}(T)$ and $f \in C^0(S^1)$, then $P_{E_k}(f \circ \gamma) = P_T(f)$. Indeed, $\gamma$ induces a bijection between $\mathcal{M}^1_{E_k}$ and $\mathcal{M}^1_T$ by $\nu \mapsto \nu \circ \gamma^{-1}$. Then $\int f \circ \gamma\, d\nu = \int f\, d(\nu \circ \gamma^{-1})$, and since $\gamma$ is a measure-theoretic isomorphism, $h_\nu(E_k) = h_{\nu \circ \gamma^{-1}}(T)$. The pressure equality follows.

Lemma 2. For all $T \in \mathcal{E}^1$, $P_T(\Phi_T) = 0$.

Proof. If $T \in \mathcal{E}^2$, this is well-known as the Ruelle-Ledrappier-Young entropy formula (see [12]). Given a degree $k$ map $T \in \mathcal{E}^1$, by Lemma 1, we may find a neighborhood $V$ of $T$ and a choice of conjugacies $\gamma_S$ for all $S \in V$ so that the map $S \mapsto \gamma_S$ is continuous on $V$. With these choices, if $\{T_i\} \subset \mathcal{E}^2$ and $T_i \to T$ in $\mathcal{E}^1$, then $\Phi_{T_i} \circ \gamma_{T_i} \to \Phi_T \circ \gamma_T$ in $C^0(S^1)$. Since pressure is continuous on $C^0(S^1)$, and $0 = P_{T_i}(\Phi_{T_i}) = P_{E_k}(\Phi_{T_i} \circ \gamma_{T_i})$ for all $i$, it follows by taking limits that $0 = P_{E_k}(\Phi_T \circ \gamma_T) = P_T(\Phi_T)$.

Corollary 1. If $\mu$ is any equilibrium state for $\Phi_T$, then $\mu$ is non-atomic.

Proof. Let $\mu$ be an ergodic equilibrium state; then it must be either purely atomic, or continuous. If it is purely atomic, then $h_\mu(T) = 0$ and $\int \Phi_T\, d\mu < 0$, contradicting $P(\Phi_T) = 0$. The result follows since the equilibrium states form a convex set, of which the extreme points are the ergodic states.

Lemma 3. The set of $T \in \mathcal{E}^1$ for which $\Phi_T$ has a unique equilibrium state is generic.

The lemma is a version of the Gibbs Phase Rule for the class of expanding maps of the circle. The original Gibbs Phase Rule for the case of a shift was proved by Ruelle [18] and Gallavotti and Miracle-Sole [3].

Proof. For any expansive $T$, there is at least one equilibrium state for each $h \in C^0(S^1)$ (see Walters [20], p. 224). Since expanding maps are expansive, every $\Phi_T$ possesses at least one equilibrium state. To prove uniqueness for a dense $G_\delta$, we work with equilibrium states for the map $E_k : S^1 \to S^1$ given by $E_k(x) = kx \bmod 1$. We now show that the set $B$ of potentials for which there is a unique $E_k$-equilibrium state forms a $G_\delta$ set. Theorems 4.3.3 and 4.3.5 of [8] characterize those potentials with unique equilibrium states as the set of $f$ such that for all $g \in C^0(S^1)$, $\lim_{t \to 0} (P_{E_k}(f + tg) - P_{E_k}(f))/t$ exists. For fixed $f$ and $g$, define $H(t) = (P_{E_k}(f + tg) - P_{E_k}(f))/t$. Since the map $t \mapsto P_{E_k}(f + tg)$ is convex, $H$ is an increasing function. The above limit then exists if and only if $\liminf_{t \to 0^+} \bigl( H(t) - H(-t) \bigr) = 0$. Hence $f$ has a unique equilibrium state if and only if
\[ \liminf_{t \to 0^+} \frac{P_{E_k}(f + tg) + P_{E_k}(f - tg) - 2P_{E_k}(f)}{t} = 0 \qquad \text{for all } g \in C^0(S^1). \tag{1} \]
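Criterion (1) is a statement about a convex function of one real variable, namely $t \mapsto P_{E_k}(f + tg)$: the symmetric quotient vanishes in the limit exactly when there is no corner at $t = 0$. The sketch below illustrates this with two hypothetical convex functions standing in for the pressure along a line (they are assumptions for the example, not pressures computed from any map): one has a corner at 0, the other is smooth.

```python
import math


def symmetric_quotient(P, t):
    """Return (P(t) + P(-t) - 2*P(0)) / t, the quantity appearing in (1)."""
    return (P(t) + P(-t) - 2.0 * P(0.0)) / t


def kinked(t):
    """Convex with a corner at 0: left slope -1, right slope +1."""
    return abs(t)


def smooth(t):
    """Convex and differentiable at 0: log(e^t + e^-t)."""
    return math.log(math.exp(t) + math.exp(-t))


if __name__ == "__main__":
    for t in (1e-1, 1e-2, 1e-3):
        print(t, symmetric_quotient(kinked, t), symmetric_quotient(smooth, t))
```

For the kinked function the quotient stays at the jump in slope, 2, while for the smooth one it shrinks roughly like $t$; in the language of the proof, the first behaviour corresponds to several equilibrium states and the second to a unique one.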

To show that these $f$ form a $G_\delta$ set, we need to show that it is sufficient to calculate the lim inf for a collection of $g$ belonging only to a countable set. To this end, let $(g_n)_{n \in \mathbb{N}}$ be a countable collection of continuous functions that is dense in $C^0(S^1)$. We note that
\[ \left| \frac{P_{E_k}(f + tg) + P_{E_k}(f - tg) - 2P_{E_k}(f)}{t} - \frac{P_{E_k}(f + tg_n) + P_{E_k}(f - tg_n) - 2P_{E_k}(f)}{t} \right| \le 2 \|g - g_n\|_\infty. \]
Hence (1) holds if and only if
\[ \liminf_{t \to 0^+} \frac{P_{E_k}(f + tg_n) + P_{E_k}(f - tg_n) - 2P_{E_k}(f)}{t} = 0 \qquad \text{for all } n \in \mathbb{N}. \]
The set $B$ of functions $f$ satisfying this condition may be written as
\[ \bigcap_{n \in \mathbb{N}}\, \bigcap_{p \in \mathbb{N}}\, \bigcap_{m \in \mathbb{N}}\, \bigcup_{t \in (0, \frac{1}{m})} \left\{ f : \left| \frac{P_{E_k}(f + tg_n) + P_{E_k}(f - tg_n) - 2P_{E_k}(f)}{t} \right| < \frac{1}{p} \right\}, \]

which is easily seen to be a $G_\delta$ subset of $C^0$. From Lemma 1, there is a continuous choice of conjugacies for maps in $U_0$. For a map $T \in U_0$, we shall call this choice of conjugacy $\gamma_T$. Letting $\Sigma$ be the map $U_0 \to C^0(S^1)$ defined by $\Sigma(T) = \Phi_T \circ \gamma_T$, we see that $\Sigma$ is continuous. It follows that $\Sigma^{-1}(B)$ is a $G_\delta$ subset of $U_0$. We now have $T \in \Sigma^{-1}(B)$ if and only if $\Phi_T \circ \gamma_T$ has a unique $E_k$-equilibrium state. Since there is a bijection between $E_k$-equilibrium states for $\Phi_T \circ \gamma_T$ and $T$-equilibrium states for $\Phi_T$, we see that $T \in \Sigma^{-1}(B)$ if and only if $\Phi_T$ has a unique $T$-equilibrium state. We have established that the set $S \subset \mathcal{E}^1$ consisting of those $T$ for which $\Phi_T$ has a unique equilibrium state, contains a $G_\delta$ subset of $U_0$. Since $\mathcal{E}^2 \cap U_0$ is a dense subset of $U_0$ that is contained in $S$, it follows that $S$ contains a dense $G_\delta$ subset of $U_0$. Since $U_0$ is a dense open subset of $\mathcal{E}^1$, we conclude that $S$ contains a dense $G_\delta$ subset of $\mathcal{E}^1$.

Set $\Omega = \{T \in \mathcal{E}^1 : \text{there exists a unique equilibrium state for } \Phi_T\}$.

Lemma 4. Equip $\Omega$ with the (relative) $C^1$-topology, and $\mathcal{M}^1$ with the (relative) weak$^*$ topology. Then $M : \Omega \to \mathcal{M}^1$, given by $M(T) = \mu_T$, is continuous.

Proof. Suppose $T_0 \in \Omega$ is of degree $k$, $T_i \in \Omega$ and $T_i \to T_0$ in $C^1$. We shall show that $\mu_{T_0}$ is the limit of the $\mu_{T_i}$. As in Lemma 1, fix a neighborhood $V$ of $T_0$ such that there is a continuous family of conjugacies $\gamma_T$ for $T \in V$. Suppose that $\mu$ is any limit point of the $\mu_{T_i}$. We shall show that $\mu = \mu_{T_0}$, and this is sufficient, by weak$^*$-sequential compactness, to show that the original sequence must converge to $\mu_{T_0}$. Replacing the original sequence with a subsequence if necessary, we suppose that $\mu_{T_i} \to \mu$. Set $\nu_i = \mu_{T_i} \circ \gamma_{T_i}$ and $\nu = \mu \circ \gamma_{T_0}$. Then $\nu$ and the $\nu_i$ are all $E_k$-invariant measures on $S^1$. By continuity of the family of conjugacies, we see that $\nu_i \to \nu$ in the weak$^*$-topology. For each $i$, since $\nu_i$ is an $E_k$-equilibrium state for $\Phi_{T_i} \circ \gamma_{T_i}$ which by Lemma 2 has pressure 0, we have $0 = h_{\nu_i} + \int \Phi_{T_i} \circ \gamma_{T_i}\, d\nu_i$. Since the entropy map is upper semi-continuous, $\limsup h_{\nu_i} \le h_\nu$. Since $T_i \to T_0$ in $C^1$ and $\nu_i \to \nu$ we have
\[ 0 = \limsup \left( h_{\nu_i} + \int \Phi_{T_i} \circ \gamma_{T_i}\, d\nu_i \right) \le h_\nu + \int \Phi_{T_0} \circ \gamma_{T_0}\, d\nu \le 0, \]
where the last inequality is true because the pressure is 0. Thus, all of the inequalities are equalities and $\nu$ is an equilibrium state for $\Phi_{T_0} \circ \gamma_{T_0}$, so that $\mu$ is an equilibrium state for $\Phi_{T_0}$. Since $T_0 \in \Omega$, there is only one such state. Thus any limit point of the $\mu_{T_i}$ is $\mu_{T_0}$, the unique equilibrium state for $\Phi_{T_0}$, and the lemma is proved.

Lemma 5. $\tilde{\Omega} = \{T \in \Omega : \mu_T \text{ is fully supported}\}$ is a generic subset of $\Omega$ (and hence of $\mathcal{E}^1$).

Proof. From Corollary 1, for each $T \in \Omega$, $\mu_T$ must be non-atomic. By Lemma 4, for a non-empty open interval $I \subset S^1$, the map $T \mapsto \mu_T(I)$ is continuous on $\Omega$. Choose any collection $\{I_i\}$ of non-empty open intervals that forms a countable basis for the topology of $S^1$. Then $\tilde{\Omega} = \bigcap_i \{T \in \Omega : \mu_T(I_i) > 0\}$, a $G_\delta$ that contains $\mathcal{E}^2$ (and is therefore dense).

4. Proofs of Theorems 1, 2, and 3

Proof of Theorem 1. Lemma 3 establishes that for $T$ belonging to the residual set $\Omega$, there is a unique equilibrium state $\mu_T$ for the potential $-\log T'$. To prove Statement 1, we use a result of Keller. Any fixed $T \in \mathcal{E}^1$, together with the Markov partition for $T$, forms what Keller [8] calls a continuous $e^{-\psi}$-conformal fibred system. He shows ([8], Theorem 6.1.8; see footnote 1) that in such a system, for λ-almost every $x$, the weak$^*$-limit points of the averages $\frac{1}{k}(\delta_x + \ldots + \delta_{T^{k-1}x})$ are contained in the set of measures satisfying $h_\mu + \int (-\log T')\, d\mu \ge 0$. Since $P_T(-\log T') = 0$, these measures are precisely the equilibrium states. Hence for $T \in \Omega$, for λ-almost every $x$, the sequence $\frac{1}{k}(\delta_x + \ldots + \delta_{T^{k-1}x})$ has at most one weak$^*$ limit point, namely $\mu_T$. By weak$^*$ sequential compactness, the entire sequence must converge to $\mu_T$.

To see that $\mu_T$ must be singular (with respect to $\lambda$), we first note that each $T \in \mathcal{E}^1$ is a non-singular transformation (with respect to $\lambda$). Thus, if $\mu_T = \mu_{si} + \mu_{ac}$ is the decomposition of $\mu_T$ into singular and absolutely continuous components, the map $\mu_T \mapsto \mu_T \circ T^{-1}$ preserves $\mu_{si}$ and $\mu_{ac}$, so that $\mu_{ac}$ is a finite, absolutely continuous $T$-invariant measure. But we have seen that a generic $T \in \mathcal{E}^1$ possesses no such invariant measure ([16]); that is, $\mu_{ac} = 0$. This proves Statement 2. Lemma 5 implies that generically, $\mu_T$ is fully supported, showing Statement 3. This completes the proof of Theorem 1.

Before proving Theorem 2, we state and prove a lemma. There is a reference to a similar lemma in [2] although we have been unable to find the proof in the papers cited there. Recall that if $T \in \mathcal{E}^2$, $\mu_T$ is an absolutely continuous probability measure with strictly positive Radon–Nikodym derivative $\rho = d\mu_T/d\lambda$.

Lemma 6. Suppose $T \in \mathcal{E}^2$. Then $\int \log \mathcal{L}_T 1\, d\mu_T \ge 0$, with equality if and only if $\rho$ is $T^{-1}\mathcal{B}$-measurable.

Proof. Fix $T \in \mathcal{E}^2$. In this case, the equilibrium state $\mu_T$ is absolutely continuous. We write $\rho$ for the density of $\mu_T$ with respect to Lebesgue measure.

1 In fact the quoted theorem, as stated in the book, contains a mistake, although an irrelevant one for the present setting. The interested reader may go to http://www.mi.uni-erlangen/, keller/publications/equibook.html, where the needed correction to the proof of the theorem is given.


Let $P$ denote the Perron–Frobenius operator for $T$ with respect to $\mu_T \in M_T^1$. Then $P(f) = L_T(\rho \cdot f)/\rho$. In particular $L_T(1) = \rho\, P(1/\rho)$. Thus,
$$\int \log L_T(1)\, d\mu_T = \int \log \rho \, d\mu_T + \int \log P(1/\rho)\, d\mu_T = -\int \log(1/\rho)\, d\mu_T + \int \log P(1/\rho)\, d\mu_T = -\int P(\log(1/\rho))\, d\mu_T + \int \log P(1/\rho)\, d\mu_T,$$
where the last equality follows because $P$ preserves $\mu_T$-integrals. It is well known that $P(\cdot)\circ T = E_{\mu_T}(\cdot \mid T^{-1}\mathcal{B})$. Since $T$ preserves $\mu_T$, we may continue the above calculation as follows:
$$-\int P(\log(1/\rho))\, d\mu_T + \int \log P(1/\rho)\, d\mu_T = -\int P(\log(1/\rho))\circ T\, d\mu_T + \int \log P(1/\rho)\circ T\, d\mu_T = -\int E_{\mu_T}\bigl(\log(1/\rho) \mid T^{-1}\mathcal{B}\bigr)\, d\mu_T + \int \log E_{\mu_T}\bigl(1/\rho \mid T^{-1}\mathcal{B}\bigr)\, d\mu_T \ge 0,$$
where the last inequality follows from Jensen's inequality, from which it also follows that equality holds in the last step if and only if $\log(1/\rho)$ is $T^{-1}\mathcal{B}$-measurable, which holds if and only if $\rho$ is $T^{-1}\mathcal{B}$-measurable. This concludes the proof of Lemma 6.

Proof of Theorem 2. Since $\log \omega_n(x) = -\sum_{j=1}^{n} \log L_T 1 \circ T^j(x)$, by Theorem 1 we have that $\frac{1}{n}\log \omega_n(x) \to -\int \log L_T 1\, d\mu_T$ for $\lambda$-a.e. $x \in S^1$ and every $T$ in the residual set of Lemma 3. If $\int \log L_T 1\, d\mu_T > 0$, then for large $n$, $\omega_n(x) = O(a^n)$ for $\lambda$-a.e. $x$, where $a$ is any number such that $-\int \log L_T 1\, d\mu_T < \log a < 0$. That is, the sequence $\omega_n(x)$ is asymptotically comparable to a geometric sequence, and hence summable (for $\lambda$-a.e. $x$), so that Lebesgue measure is not recurrent for $T$.

First we observe that $\{T : \int \log L_T 1\, d\mu_T > 0\}$ is open in that residual set. To see this, if $T$ in the residual set satisfies $\int \log L_T 1\, d\mu_T > 0$ and $S$ in the residual set is $C^1$-close to $T$, then $L_S 1$ is $C^0$-close to $L_T 1$. By Lemma 4, $\mu_S$ is weak$^*$-close to $\mu_T$, proving the observation. Thus by Lemma 6, it is sufficient to show that for maps $T$ belonging to a dense subset of $E^2$ (and hence a dense subset of $E^1$), the invariant density $\rho_T$ is not $T^{-1}\mathcal{B}$-measurable.

Choose $T \in E^2$ for which $\rho_T$ is $T^{-1}\mathcal{B}$-measurable. We shall show that there is an $S \in E^2$ arbitrarily close to $T$ (in the $C^1$ topology) for which $\rho_S$ is not $S^{-1}\mathcal{B}$-measurable. Since $\rho_T$ is $T^{-1}\mathcal{B}$-measurable, $Tx = Ty$ implies that $\rho(x) = \rho(y)$.
Given a Markov partition for $T$, we call the atoms of the partition the branches of $T$. We shall construct a $C^2$-homeomorphism $\pi : S^1 \to S^1$ in such a way that

1. $\pi$ is arbitrarily ($C^1$-)close to the identity, and
2. the map $\tilde T = \pi \circ T \circ \pi^{-1}$ has the property that $\rho_{\tilde T} = \tilde\rho$ is not $\tilde T^{-1}\mathcal{B}$-measurable.

Establishing Items 1 and 2 will finish the proof.

Suppose for the moment that $\pi$ is any $C^2$-homeomorphism of the circle, and $\tilde T(\tilde x) = \pi \circ T \circ \pi^{-1}(\tilde x)$. Then
$$\tilde\rho(\tilde x) = \frac{\rho(\pi^{-1}\tilde x)}{\pi'(\pi^{-1}\tilde x)},$$
so that $\tilde\rho$ will be $\tilde T^{-1}\mathcal{B}$-measurable precisely when $\tilde T(\tilde x) = \tilde T(\tilde y)$ implies that $\tilde\rho(\tilde x) = \tilde\rho(\tilde y)$. Suppose $\tilde x \ne \tilde y$ and $\tilde T(\tilde x) = \tilde T(\tilde y)$. Then, since $\rho$ is $T^{-1}\mathcal{B}$-measurable, $\rho(\pi^{-1}\tilde y) = \rho(\pi^{-1}\tilde x)$. Hence $\tilde\rho(\tilde x)$ will differ from $\tilde\rho(\tilde y)$ precisely when $\pi'(\pi^{-1}\tilde x) \ne \pi'(\pi^{-1}\tilde y)$. Hence, if $\pi$ is chosen so that $\pi'$ is not $T^{-1}\mathcal{B}$-measurable, these terms will be different. Now we specify that $\pi$ is a $C^2$-homeomorphism of $S^1$ with the property that $\pi' \equiv 1$ on one branch of $T$, and different from 1, yet arbitrarily close to 1, on the other branches. This completes the proof of Theorem 2.

Proof of Theorem 3. Suppose that $T$ satisfies the conditions of Theorem 1. We show that in this case, any absolutely continuous invariant measure for $T$ is locally infinite. Suppose $\nu$ is an absolutely continuous invariant measure for $T$. Then $\nu(S^1) = \infty$. Suppose, for the purpose of obtaining a contradiction, that $I$ is any open interval with $\nu(I) < \infty$. Let $f$ be any non-negative continuous function supported on $I$ that is positive on some subinterval of $I$. Clearly $f \in L^1(\nu)$. By Birkhoff's ergodic theorem for an infinite invariant measure, for $\nu$-almost every $x$,
$$\frac{1}{n}\bigl(f(x) + \dots + f(T^{n-1}x)\bigr) \to 0.$$
This holds in particular on a set of positive Lebesgue measure. On the other hand, since $\mu_T$ is a Sinai–Ruelle–Bowen measure, we have for $\lambda$-almost every $x$,
$$\frac{1}{n}\bigl(f(x) + \dots + f(T^{n-1}x)\bigr) \to \int f \, d\mu_T.$$
Since $f$ is strictly positive on a subinterval of $I$ and $\mu_T$ is fully supported, this quantity is strictly positive. This contradiction completes the proof of the theorem.

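5. No Characteristic Scale

In this section we prove that if
$$S_{\varepsilon,n,a} = \bigl\{T \in E^1 : \lambda\{x : L^n 1(x) \in [a, 2a]\} < \varepsilon\bigr\} \quad\text{and}\quad S = \bigcap_{\varepsilon>0}\,\bigcup_{n\in\mathbb{N}}\,\bigcap_{a>0} S_{\varepsilon,n,a},$$
then $S$ is a dense $G_\delta$ subset of $E^1$.

Proof of Theorem 4. We can replace the uncountable intersections in the definition of $S$ by countable intersections over the rationals without changing the set. Define
$$F_n(T) = \lambda\times\lambda\Bigl\{(x, y) : \tfrac12 \le \frac{L_T^n 1(x)}{L_T^n 1(y)} \le 2\Bigr\}.$$
Clearly, $F_n(T) < \varepsilon^2$ implies that for all positive $a$, the measure of the set of points with $L_T^n 1(x) \in [a, 2a]$ is less than $\varepsilon$. Letting $R_{\varepsilon,n} = \{T : F_n(T) < \varepsilon^2\}$, it is clear that $R_{\varepsilon,n} \subset \bigcap_{a>0} S_{\varepsilon,n,a}$. Conversely, for fixed $x$, let $a_1 = L_T^n 1(x)/2$ and $a_2 = 2a_1$. If $T \in \bigcap_{a>0} S_{\varepsilon^2/2,n,a}$, then for each $x$, by considering $\bigcup_{i=1}^{2}\{y : L_T^n 1(y) \in [a_i, 2a_i]\}$ we have $\lambda\{y : L_T^n 1(y) \in [L_T^n 1(x)/2,\, 2L_T^n 1(x)]\} < \varepsilon^2$. By Fubini's theorem, we see that $F_n(T) \le \varepsilon^2$ so that $T \in R_{\varepsilon,n}$. It follows that
$$S = \bigcap_{\varepsilon>0}\,\bigcup_{n\in\mathbb{N}}\,\bigcap_{a>0} S_{\varepsilon,n,a} = \bigcap_{\varepsilon>0}\,\bigcup_{n\in\mathbb{N}} R_{\varepsilon,n}.$$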


We shall show that $F_n : E^1 \to \mathbb{R}$ is an upper semi-continuous map, so that $S$ is a $G_\delta$ set. To prove this, suppose that $F_n(T) < \alpha$. We have
$$\lambda\times\lambda\Bigl\{(x, y) : \frac{L^n 1(x)}{L^n 1(y)} \in \bigl[\tfrac12, 2\bigr]\Bigr\} = \lim_{k\to\infty} \lambda\times\lambda\Bigl\{(x, y) : \frac{L^n 1(x)}{L^n 1(y)} \in \bigl[\tfrac12 - \tfrac1k,\, 2 + \tfrac1k\bigr]\Bigr\}.$$
One can therefore find a $k$ such that $\lambda\times\lambda(\{(x, y) : L_T^n 1(x)/L_T^n 1(y) \in [1/2 - 1/k,\, 2 + 1/k]\}) < \alpha$. Since the map $\Phi : E^1 \to C^0(S^1 \times S^1)$ given by $\Phi(T)(x, y) = L_T^n 1(x)/L_T^n 1(y)$ is continuous (with the $C^1$ and $C^0$ topologies on the respective spaces), there exists a neighborhood $U$ of $T$ such that if $\tilde T \in U$, then $\|\Phi(T) - \Phi(\tilde T)\| < 1/k$. It follows that if $\tilde T \in U$, then $F_n(\tilde T) < \alpha$, proving the upper semi-continuity of $F_n$.

It then remains to demonstrate the density of $S$. To do this, we shall establish that for any $\varepsilon > 0$, any $T_0 \in E^2$ and any neighborhood $U$ of $T_0$ (in the $C^1$ topology), there is a $T \in U$ and an $n \in \mathbb{N}$ such that for each $a$, $\lambda\{x : L^n 1(x) \in [a, 2a]\} < \varepsilon$. This will be accomplished by conjugating $T_0$ using a homeomorphism constructed via a cocycle.

We shall therefore assume $\varepsilon > 0$, $T_0 \in E^2$ and $\delta > 0$ are given. Let $\eta > 0$ be such that $(1 + \eta)/(1 - \eta) < 1 + \delta$. Then we also have $(1 - \eta)/(1 + \eta) > 1 - \delta$. Since $T_0$ belongs to $E^2$, $T_0$ preserves an absolutely continuous invariant probability measure, $\mu$, with a strictly positive continuous density, $\rho$. Let $m$ be such that $\frac1m \le \rho(x) \le m$ for all $x$. Let $\bar T_0 : X \to X$ be a natural extension of $T_0 : S^1 \to S^1$ preserving the measure $\bar\mu$. From [21], $\bar\mu$ is Bernoulli, so we may find a non-trivial independent partition $P = \{A_0, A_1\}$ of $X$. Write $p$ for $\bar\mu(A_0)$ and $q$ for $\bar\mu(A_1)$. We then define a function $\bar G_0$ on $X$ as follows:
$$\bar G_0(x) = \begin{cases} 1 + \eta q & \text{if } x \in A_0, \\ 1 - \eta p & \text{if } x \in A_1. \end{cases}$$
Let $n > 0$ be an integer. We then form the multiplicative cocycle $\bar G_0^{(n)}$ defined by
$$\bar G_0^{(n)}(x) = \bar G_0(x)\,\bar G_0(\bar T_0 x) \cdots \bar G_0(\bar T_0^{\,n-1} x).$$

The function $\bar G_0^{(n)}$ takes on the value $v_k = (1 + \eta q)^k (1 - \eta p)^{n-k}$ on a set of measure $\binom{n}{k} p^k q^{n-k}$. Let $K \in \mathbb{N}$ be the least integer so that
$$\Bigl(\frac{1 + \eta q}{1 - \eta p}\Bigr)^{K} > 2m^2.$$
Since $v_{k+1}/v_k = \frac{1+\eta q}{1-\eta p}$, for each $a$ there are at most $K$ values taken by $\bar G_0^{(n)}$ in $[a, 2m^2 a]$. We then have the estimate
$$\bar\mu\{x : \bar G_0^{(n)}(x) \in [a, 2m^2 a]\} \le K \max_{\{k \,:\, v_k \in [a, 2m^2 a]\}} \binom{n}{k} p^k q^{n-k}.$$


Since the values of $k$ in the range over which the maximum is taken have the property that $v_k \ge a$, we see
$$a\,\bar\mu\{x : \bar G_0^{(n)}(x) \in [a, 2m^2 a]\} \le K \max_{\{k \,:\, v_k \in [a, 2m^2 a]\}} v_k \binom{n}{k} p^k q^{n-k} \le K \max_{0 \le k \le n} \binom{n}{k} (p + \eta p q)^k (q - \eta p q)^{n-k} < \frac{CK}{\sqrt{n}},$$
where $C$ is a constant that depends only on the values of $p$ and $q$.

Now fix an $n$ so that $a\,\bar\mu(\{x : \bar G_0^{(n)}(x) \in [a, 2m^2 a]\}) < \varepsilon/4$ for all $a$. It will turn out that an inequality of this type will be what is needed for the conjugate map to have the desired property. At this point, the function $\bar G_0^{(n)}$ is defined not on the circle, but on the natural extension space. We shall apply a conditional expectation and approximation argument to $\bar G_0^{(n)}$ to obtain a function on the circle as needed.

Let $Q$ be a Markov partition for $T_0$ consisting of intervals. There exists a $k$ such that $\bigvee_{s=0}^{k-1} T_0^{-s} Q$ consists of intervals of length less than $\delta$. Denote these intervals by $I_j$ and write $\bar I_j$ for $\pi^{-1} I_j$, where $\pi$ denotes the natural projection from the natural extension $(X, \bar T_0, \bar\mu)$ to $(S^1, T_0, \mu)$. Write $\bar\rho = \rho \circ \pi$ and define the natural extension of $\lambda$, $\bar\lambda$, by $\bar\lambda(A) = \int_A (1/\bar\rho)\, d\bar\mu$. We then calculate
$$\int_{\bar I_j} \bar G_0^{(n)} \circ \bar T_0^{\,i}\, d\bar\lambda = \int \frac{\chi_{\bar I_j}}{\bar\rho} \cdot \bar G_0^{(n)} \circ \bar T_0^{\,i}\, d\bar\mu.$$
Since $\bar T_0$ is mixing, we see that
$$\lim_{i\to\infty} \int_{\bar I_j} \bar G_0^{(n)} \circ \bar T_0^{\,i}\, d\bar\lambda = \int \frac{\chi_{\bar I_j}}{\bar\rho}\, d\bar\mu \int \bar G_0^{(n)}\, d\bar\mu = \bar\lambda(\bar I_j) \Bigl(\int \bar G_0\, d\bar\mu\Bigr)^{n} = \lambda(I_j),$$
where we used the fact that $P$ is an independent partition to get the second equality. We recall that $n$ is chosen so that
$$\bar\mu(\{x : \bar G_0^{(n)}(x) \in [a, 2m^2 a]\}) < \varepsilon/(4a) \tag{2}$$
for each $a > 0$. We now choose an $i_0$ such that for $i \ge i_0$,
$$\Bigl|\int_{\bar I_j} \bar G_0^{(n)} \circ \bar T_0^{\,i}\, d\bar\lambda - \lambda(I_j)\Bigr| < \frac{\delta}{3}\,\lambda(I_j) \tag{3}$$
for each $j$.

We now show that similar inequalities persist for functions $\bar G^{(n)}$ if $\bar G$ is chosen to be an appropriate perturbation of $\bar G_0$. It is useful to note that because the values taken by $\bar G_0^{(n)}$ are in the range $[(1 - \eta p)^n, (1 + \eta q)^n]$, the inequality (2) holds trivially for $a$ outside this range.
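The binomial structure of the cocycle can be checked numerically. The following is a minimal sketch, not part of the original argument: the parameter values `p = 0.3` and `eta = 0.1` and all function names are illustrative assumptions. It evaluates the values $v_k$ and their weights $\binom{n}{k}p^kq^{n-k}$, confirms that the cocycle has mean 1 (because $p(1+\eta q) + q(1-\eta p) = 1$), and shows that the quantity $\max_k v_k\binom{n}{k}p^kq^{n-k}$ shrinks roughly like $1/\sqrt{n}$.

```python
from math import exp, lgamma, log, sqrt


def binomial_weight(n, k, p):
    """Binomial mass C(n, k) p^k (1-p)^(n-k), computed in log space."""
    log_w = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
             + k * log(p) + (n - k) * log(1.0 - p))
    return exp(log_w)


def cocycle_value(n, k, p, eta):
    """v_k = (1 + eta*q)^k (1 - eta*p)^(n-k) with q = 1 - p."""
    q = 1.0 - p
    return (1.0 + eta * q) ** k * (1.0 - eta * p) ** (n - k)


def cocycle_mean(n, p, eta):
    """Integral of the n-step cocycle: sum over k of v_k times its weight."""
    return sum(cocycle_value(n, k, p, eta) * binomial_weight(n, k, p)
               for k in range(n + 1))


def max_weighted_mass(n, p, eta):
    """max over k of v_k * C(n,k) p^k q^(n-k), the quantity bounded by C / sqrt(n)."""
    return max(cocycle_value(n, k, p, eta) * binomial_weight(n, k, p)
               for k in range(n + 1))


if __name__ == "__main__":
    p, eta = 0.3, 0.1  # illustrative values only
    for n in (100, 400):
        print(n, cocycle_mean(n, p, eta), max_weighted_mass(n, p, eta) * sqrt(n))
```

The product $v_k\binom{n}{k}p^kq^{n-k}$ is itself a binomial mass with success probability $p(1+\eta q)$, which is why its maximum decays at the rate $n^{-1/2}$.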


We define $N$ to be a subset of $L^1(\bar\mu)$ as follows and equip it with the $L^1$ subspace topology:
$$N = \{\bar G : 1 - \eta p \le \bar G \le 1 + \eta q;\ \|\bar G - \bar G_0\|_1 < \zeta\}.$$
Since composition with $\bar T$ is an isometry on $L^1(\bar\mu)$, and because $N$ consists of bounded functions, the map from $N$ to $L^1$ given by $\bar G \mapsto \bar G^{(n)}$ is continuous. Clearly, for $\bar G \in N$, the values taken by $\bar G^{(n)}$ are in the range $[(1 - \eta p)^n, (1 + \eta q)^n]$. By choosing $\zeta$ appropriately small, we can ensure that $|\bar G^{(n)} - \bar G_0^{(n)}| < (1 - \eta p)^n/2$ on a set of measure at least $1 - \varepsilon/(8(1 + \eta q)^n)$.

For a given $a$ in the range $[(1 - \eta p)^n/(2m^2), (1 + \eta q)^n]$, let $a_1 = a/2$ and $a_2 = 2a$. Then
$$\{x : \bar G^{(n)}(x) \in [a, 2m^2 a]\} \subset \{x : \bar G_0^{(n)}(x) \in [a_1, 2m^2 a_1]\} \cup \{x : \bar G_0^{(n)}(x) \in [a, 2m^2 a]\} \cup \{x : \bar G_0^{(n)}(x) \in [a_2, 2m^2 a_2]\} \cup \{x : |\bar G^{(n)}(x) - \bar G_0^{(n)}(x)| > (1 - \eta p)^n/2\}.$$
We shall denote the four sets on the right-hand side by $A_1$, $A_2$, $A_3$ and $A_4$ respectively. By our previous estimates on $\bar G_0^{(n)}$ we have $\bar\mu(A_1) < \varepsilon/(2a)$, $\bar\mu(A_2) < \varepsilon/(4a)$ and $\bar\mu(A_3) < \varepsilon/(4a_2) = \varepsilon/(8a)$. We chose $\zeta$ above to ensure that $\bar\mu(A_4) < \varepsilon/(8(1+\eta q)^n) \le \varepsilon/(8a)$, so that
$$\bar\mu(\{x : \bar G^{(n)}(x) \in [a, 2m^2 a]\}) < \varepsilon/a$$
for each $a$ in the range $[(1 - \eta p)^n/(2m^2), (1 + \eta q)^n]$. As before, the inequality holds trivially for $a$ outside this range, so we have established that for sufficiently small $\zeta$, a similar inequality to (2) persists for all $a$ and functions $\bar G^{(n)}$, if $\bar G$ is chosen from $N$. Since
$$\int_{\bar I_j} |\bar G^{(n)} \circ \bar T_0^{\,i} - \bar G_0^{(n)} \circ \bar T_0^{\,i}|\, d\bar\lambda \le \int |\bar G^{(n)} \circ \bar T_0^{\,i} - \bar G_0^{(n)} \circ \bar T_0^{\,i}|\, d\bar\lambda \le m \int |\bar G^{(n)} \circ \bar T_0^{\,i} - \bar G_0^{(n)} \circ \bar T_0^{\,i}|\, d\bar\mu,$$
we see that provided $\zeta$ is sufficiently small, (3) holds for $\bar G \in N$.

We have therefore shown that there exists a $\zeta > 0$ such that for $\bar G \in N$,
$$\bar\mu(\{x : \bar G^{(n)}(x) \in [a, 2m^2 a]\}) < \varepsilon/a \quad\text{for each } a, \text{ and} \tag{4}$$
$$\Bigl|\int_{\bar I_j} \bar G^{(n)} \circ \bar T_0^{\,i}\, d\bar\lambda - \lambda(I_j)\Bigr| < \frac{\delta}{3}\,\lambda(I_j) \quad\text{for each } j \text{ and } i \ge i_0. \tag{5}$$

We note that since $\bar T_0 : X \to X$ is a natural extension of $T_0 : S^1 \to S^1$, the $\sigma$-algebras $\bar T_0^{\,k} \pi^{-1}\mathcal{B}_{S^1}$ increase to $\mathcal{B}_X$. It follows that $E_{\bar\mu}(\bar G_0 \mid \bar T_0^{\,i} \pi^{-1}\mathcal{B}_{S^1})$ converges to $\bar G_0$ in $L^1$. By the monotonicity of conditional expectation, these functions also satisfy the inequality $1 - \eta p \le E_{\bar\mu}(\bar G_0 \mid \bar T_0^{\,i} \pi^{-1}\mathcal{B}_{S^1}) \le 1 + \eta q$. It follows that for sufficiently large $i \ge i_0$ ($i_0$ as above), (4) and (5) are satisfied with $E_{\bar\mu}(\bar G_0 \mid \bar T_0^{\,i} \pi^{-1}\mathcal{B}_{S^1})$ in place of $\bar G$. Fix some such $i$ and write $\bar G_1$ for $E_{\bar\mu}(\bar G_0 \mid \bar T_0^{\,i} \pi^{-1}\mathcal{B}_{S^1})$.


Now $\bar G_1 \circ \bar T_0^{\,i} = E_{\bar\mu}(\bar G_0 \circ \bar T_0^{\,i} \mid \pi^{-1}\mathcal{B}_{S^1})$, so we see that $\bar G_1 \circ \bar T_0^{\,i}$ may be written as $g_1 \circ \pi$ for some $\mathcal{B}$-measurable function $g_1$ on the circle. Since $C^0(S^1)$ is dense in $L^1(S^1, \mathcal{B}, \mu)$, it follows that there exists a continuous function $g_2$ such that $\|g_1 - g_2\|_1$ is arbitrarily small. Since $\|\bar G_1 - g_2 \circ \pi \circ \bar T_0^{-i}\|_1 = \|g_1 - g_2\|_1$, we see that $g_2$ may be chosen so that $g_2 \circ \pi \circ \bar T_0^{-i}$ lies in $N$. Equations (4) and (5) now yield
$$\Bigl|\int_{I_j} g_2^{(n)}\, d\lambda - \lambda(I_j)\Bigr| < \frac{\delta}{3}\,\lambda(I_j) \quad\text{for each } j; \text{ and}$$
$$\mu(\{x : g_2^{(n)}(x) \in [a, 2m^2 a]\}) < \varepsilon/a \quad\text{for each } a > 0.$$
From the first equation, we see that $1 - \frac{\delta}{3} < \int g_2^{(n)}\, d\lambda < 1 + \frac{\delta}{3}$, so finally we rescale $g_2$ (i.e. multiply by a constant that will, by our above estimates, be very close to 1) to obtain a function $g$ that satisfies $\int g^{(n)}\, d\lambda = 1$. We then have the inequalities
$$\Bigl|\int_{I_j} g^{(n)}\, d\lambda - \lambda(I_j)\Bigr| < \delta\,\lambda(I_j) \quad\text{for each } j; \text{ and} \tag{6}$$
$$\mu(\{x : g^{(n)}(x) \in [a, 2m^2 a]\}) < 2\varepsilon/a \quad\text{for each } a > 0. \tag{7}$$
Set $\theta(x) = \int_0^x g^{(n)}(t)\, dt$ and let $T(x) = \theta \circ T_0 \circ \theta^{-1}(x)$. Then from the above, and since each interval $I_j$ has length less than $\delta$, it may be verified that $|\theta(x) - x| < 2\delta$, and $\sup_{x \in S^1} |T(x) - T_0(x)| < (C + 4)\delta$, where $C = \max_{x \in S^1} |T_0'(x)|$. Hence this quantity can be made arbitrarily small by choosing $\delta$ sufficiently small. Also, differentiating, we see
$$T'(x) = T_0'(\theta^{-1}x)\,\frac{\theta'(T_0(\theta^{-1}x))}{\theta'(\theta^{-1}x)} = T_0'(\theta^{-1}x)\,\frac{g^{(n)}(T_0(\theta^{-1}x))}{g^{(n)}(\theta^{-1}x)} = T_0'(\theta^{-1}x)\,\frac{g(T_0^n(\theta^{-1}x))}{g(\theta^{-1}x)}.$$
Since $g$ is uniformly close to 1 and $T_0'$ is uniformly continuous, we see that $\sup_{x \in S^1} |T'(x) - T_0'(x)|$ can also be made arbitrarily small by controlling $\delta$ and $\eta$. This shows that $T$ can be chosen arbitrarily close to $T_0$ in the $C^1$ norm.

It remains to verify that $T$ has the property that there exists an $n$ such that for each $a$, $\lambda\{x : L^n 1(x) \in [a, 2a]\} < \varepsilon$. Since $T$ is conjugate to $T_0$, there is also a conjugacy relation between their Perron–Frobenius operators given by $L_T = L_\theta \circ L_{T_0} \circ L_{\theta^{-1}}$, where $L_\theta f(x) = f(\theta^{-1}(x))/\theta'(\theta^{-1}x)$. Since $T_0$ is a $C^2$ expanding map, we have that $L_{T_0}^n 1$ converges uniformly to $\rho$. It follows that $L_T^n 1$ converges uniformly to $L_\theta \rho(x) = \rho(\theta^{-1}x)/\theta'(\theta^{-1}x)$. We then estimate
$$\lambda\Bigl(\Bigl\{x : \frac{\rho(\theta^{-1}x)}{g^{(n)}(\theta^{-1}x)} \in [a, 2a]\Bigr\}\Bigr) \le \lambda\Bigl(\Bigl\{x : \frac{1}{g^{(n)}(\theta^{-1}x)} \in \bigl[\tfrac{a}{m}, 2ma\bigr]\Bigr\}\Bigr) = \lambda\bigl(\bigl\{x : g^{(n)}(\theta^{-1}x) \in \bigl[\tfrac{1}{2ma}, \tfrac{m}{a}\bigr]\bigr\}\bigr).$$


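But we see that $\{x : g^{(n)}(\theta^{-1}x) \in [\tfrac{1}{2ma}, \tfrac{m}{a}]\} = \theta(\{y : g^{(n)}(y) \in [\tfrac{1}{2ma}, \tfrac{m}{a}]\})$. Using this, we get
$$\lambda\Bigl(\Bigl\{x : \frac{\rho(\theta^{-1}x)}{g^{(n)}(\theta^{-1}x)} \in [a, 2a]\Bigr\}\Bigr) \le \lambda \circ \theta\bigl(\bigl\{y : g^{(n)}(y) \in \bigl[\tfrac{1}{2ma}, \tfrac{m}{a}\bigr]\bigr\}\bigr) = \int_{\{y \,:\, g^{(n)}(y) \in [\frac{1}{2ma}, \frac{m}{a}]\}} g^{(n)}(y)\, d\lambda$$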

0 there is $C_R > 0$ such that
$$|P_k f(u) - (\mu, f)| \le C_R\, e^{-c\sqrt{k}} \Bigl(\sup_H |f| + \mathrm{Lip}(f)\Bigr) \quad\text{for } k \ge 0, \tag{0.4}$$
where $\|u\| \le R$, $f$ is an arbitrary bounded Lipschitz function on $H$, and $c > 0$ is a constant not depending on $u$, $f$, $R$, and $k$.

Example 0.2. Let us consider the 2D Navier–Stokes (NS) equations perturbed by a random kick-force:
$$\dot u - \nu\Delta u + (u, \nabla)u + \nabla p = \eta(t, x) \equiv \sum_{k=-\infty}^{\infty} \eta_k(x)\,\delta(t - k), \qquad \operatorname{div} u = 0, \qquad \langle u(t, \cdot)\rangle = 0, \tag{0.5}$$
where $u = u(t, x)$, $x \in \mathbb{T}^2$, and $\langle u\rangle = \int_{\mathbb{T}^2} u(x)\, dx$. Let $H$ be the space of divergence-free vector fields $u \in L^2(\mathbb{T}^2, \mathbb{R}^2)$ such that $\langle u\rangle = 0$ and let $\{e_j\}$ be the normalised trigonometric basis in $H$. Assuming that the kicks $\eta_k \in H$ have the form (0.1) and normalising solutions $u(t)$ for (0.5) to be continuous from the right, we observe that (0.5) can be written in the form (0.2), where $u_k = u(k, \cdot) \in H$ and $S : H \to H$ is the time-one shift along trajectories of the free NS system (i.e., of Eqs. (0.5) with $\eta \equiv 0$). As shown in [KS1], the operator $S$ satisfies all the required assumptions, and therefore Theorem 0.1 applies to (0.5).

Randomly Forced Nonlinear PDE’s


Theorem 0.1 can also be applied to many other dissipative nonlinear PDE's perturbed by a random kick-force, in particular, to the complex Ginzburg–Landau equation
$$\dot u - \nu(\Delta - 1)u + i|u|^2 u = \eta(t, x), \qquad x \in \mathbb{T}^n,$$
where $u = u(t, x)$ and $\nu > 0$ (see [KS1, KS2]).

Uniqueness of a stationary measure for (0.2) was first established¹ in [KS1]. The proof in [KS1] is based on a Lyapunov–Schmidt type reduction of the system (0.2) to an $N$-dimensional RDS with delay (the integer $N$ is the same as in Theorem 0.1). Due to this reduction, the problem of uniqueness of a stationary measure for (0.2) reduces to a similar question for an abstract 1D Gibbs system with an $N$-dimensional phase space. The uniqueness for the reduced Gibbs system is then established using a version of the Ruelle–Perron–Frobenius theorem. E, Mattingly, Sinai [EMS] and Bricmont, Kupiainen, Lefevere [BKL] later used similar approaches to show that the NS system (0.5) perturbed by a white (in time) force of the form
$$\eta(t, x) = \sum_{j=1}^{N} b_j \dot\beta_j(t)\, e_j(x), \qquad N < \infty,$$
also has a unique stationary measure $\mu \in \mathcal{P}(H)$, provided that $b_j \ne 0$ for $1 \le j \le N' \le N$ with some sufficiently large $N' = N'(\nu)$. Moreover, it is shown in [BKL] that for the case of white noise the convergence in (0.4) is exponentially fast for $\mu$-almost all $u \in H$. In [KS3] the NS equations (0.5) with an unbounded kick-force $\eta(t, x)$ are studied and the scheme of [KS1] is used to prove the uniqueness and ergodicity of a stationary measure.

The approach presented in this work does not use a Lyapunov–Schmidt type reduction or the Gibbs measure technique. Instead it exploits some ideas from [KS2], interpreting them in terms of coupling. The new approach gives rise to a shorter proof and is more flexible. Coupling is a well-known effective tool for studying finite-dimensional Markov chains (e.g., see [Lin] and the Appendix in [V]) and dynamical systems (e.g., see [Y, BL]). In [EMS] a coupling is used to study the auxiliary finite-dimensional RDS with delay which arises as a result of the Lyapunov–Schmidt reduction. Our work shows that a form of coupling applies directly to infinite-dimensional Markov chains and randomly forced PDE's.

When a preprint of this paper was sent around, we learned from L.-S. Young that a similar approach to prove Theorem 0.1 is developed by her and Nader Masmoudi in their work under preparation.

Notation. We abbreviate a pair of random variables $\xi_1, \xi_2$ or points $u_1, u_2$ to $\xi_{1,2}$ and $u_{1,2}$, respectively. Given a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, for any integer $k \ge 1$ we denote by $\Omega^k$ the space $\Omega \times \cdots \times \Omega$ ($k$ times) endowed with the $\sigma$-algebra $\mathcal{F} \times \cdots \times \mathcal{F}$ and the measure $\mathbb{P} \times \cdots \times \mathbb{P}$. For a random variable $\xi$, we denote by $\mathcal{D}(\xi)$ its distribution. For a Banach space $H$, we shall use the following spaces and sets:

¹ It is shown in [KS1, KS2] that the left-hand side of (0.4) converges to zero as $k \to \infty$ for any $f \in C_b(H)$; however, the rate of convergence is not specified.


S. Kuksin, A. Shirikyan

$C_b(H)$ is the space of bounded continuous functions on $H$ with the supremum norm $\|\cdot\|_\infty$.
$L(H)$ is the space of bounded Lipschitz functions on $H$ endowed with the natural norm $\|\cdot\|_L$ (see Sect. 1).
$\mathcal{M}(H)$ is the space of signed Borel measures on $H$ with bounded variation.
$\mathcal{P}(H)$ is the set of probability measures $\mu \in \mathcal{M}(H)$; this space is endowed with two different metrics described in Sect. 1.
$\mathcal{P}(H, A)$ is the set of measures $\mu \in \mathcal{P}(H)$ with support in a closed set $A$.
$\mu_v(k)$ is the measure $P(k, v, \cdot)$, where $P$ is the Markov transition function for (0.2).
$B_H(R)$ is the closed ball of radius $R > 0$ centred at zero.

1. Measures on Hilbert Spaces

Let $H$ be a separable Hilbert space with the Borel $\sigma$-algebra $\mathcal{B}(H)$ and let $\mathcal{M}(H)$ be the space of signed Borel measures with bounded variation. We denote by $\mathcal{P}(H)$ the set of probability measures $\mu \in \mathcal{M}(H)$ and by $\mathcal{P}(H, A)$ the subset in $\mathcal{P}(H)$ consisting of measures supported by a closed set $A \subset H$. For any measure $\mu \in \mathcal{M}(H)$ and any function $f \in C_b(H)$, we write
$$(\mu, f) = \int_H f(u)\, d\mu(u) = \int_H f(u)\, \mu(du).$$
We shall use two different topologies on $\mathcal{P}(H)$. The first of them is given by the variation norm on $\mathcal{M}(H)$:
$$\|\mu\|_{var} = \sup_{\Gamma \in \mathcal{B}(H)} |\mu(\Gamma)|.$$
The distance defined by this norm on $\mathcal{P}(H)$ can be characterised in terms of densities. Namely, let us assume that $\mu_1, \mu_2 \in \mathcal{P}(H)$ are absolutely continuous with respect to a fixed Borel measure $m$, finite or infinite. (Such a measure always exists; for instance, one can take $m = (\mu_1 + \mu_2)/2$.) In this case, we have
$$\|\mu_1 - \mu_2\|_{var} = \frac12 \int_H |p_1(u) - p_2(u)|\, dm(u), \tag{1.1}$$
where $p_i(u)$, $i = 1, 2$, is the density of $\mu_i$ with respect to $m$. The space $\mathcal{P}(H)$ is complete with respect to $\|\cdot\|_{var}$.

To define a second topology, we denote by $L(H)$ the space of real-valued bounded Lipschitz functions on $H$ with the norm
$$\|f\|_L := \sup_{u \in H} |f(u)| \vee \sup_{u \ne v} \frac{|f(u) - f(v)|}{\|u - v\|}.$$
Let $\|\cdot\|_L^*$ be the dual norm on $\mathcal{M}(H)$:
$$\|\mu\|_L^* = \sup_{\|f\|_L \le 1} |(\mu, f)|.$$
It is clear that the norm $\|\cdot\|_L^*$ defines a metric on $\mathcal{P}(H)$.

Lemma 1.1. The space $\mathcal{P}(H)$ is complete with respect to the metric $\|\cdot\|_L^*$.


Proof. Suppose that $\{\mu_n\} \subset \mathcal{P}(H)$ is a sequence such that $\|\mu_n - \mu_m\|_L^* \to 0$ as $m, n \to \infty$. Let $L^*(H)$ be the space of continuous functionals on $L(H)$. Regarding $\mu_n$ as elements of $L^*(H)$, we conclude that the sequence $\{\mu_n\}$ converges (in the norm $\|\cdot\|_L^*$) to a limit $\ell \in L^*(H)$, and we have
$$\ell(f) = \lim_{n\to\infty} (\mu_n, f), \qquad f \in L(H). \tag{1.2}$$
In view of the corollary² from Theorem 1 in [GS, Chapter VI, §1], there is a measure $\mu \in \mathcal{P}(H)$ such that $\ell(f) = (\mu, f)$. This completes the proof.

Note that, in the case when $H$ is finite-dimensional, the fact that the functional $\ell$ in (1.2) is a measure is implied by the following well-known result (for instance, see [H, Theorem 2.1.7]): any nonnegative distribution is a measure; in particular, any positive functional $\ell \in L^*(H)$ is a measure as well.

Let $P(k, u, \Gamma)$, $k \ge 0$, $u \in H$, $\Gamma \in \mathcal{B}(H)$, be a Markov transition function. A set $A \in \mathcal{B}(H)$ is said to be invariant for $P$ if
$$P(k, u, A) = 1 \quad\text{for all } k \ge 0,\ u \in A.$$

Lemma 1.2. Let $A \in \mathcal{B}(H)$ be an invariant set for $P(k, u, \Gamma)$. Suppose that there is $k_0 \ge 1$ and a sequence $\zeta_k$, $k \ge k_0$, going to zero as $k \to \infty$ such that
$$\|P(k, u, \cdot) - P(k, v, \cdot)\|_L^* \le \zeta_k \quad\text{for } k \ge k_0,\ u, v \in A. \tag{1.3}$$
Then there is a unique measure $\mu \in \mathcal{P}(H, A)$ such that
$$\|P(k, u, \cdot) - \mu\|_L^* \le \zeta_k \quad\text{for } k \ge k_0,\ u \in A. \tag{1.4}$$

Proof. Let $f \in L(H)$, $\|f\|_L \le 1$. Then, by (1.3) and the Chapman–Kolmogorov relation, for $l \ge k \ge k_0$ and $u, v \in A$ we have
$$\bigl|\bigl(P(l, v, \cdot) - P(k, u, \cdot), f\bigr)\bigr| \le \int_H P(l - k, v, dz)\, \Bigl|\int_H P(k, z, dw) f(w) - \int_H P(k, u, dw) f(w)\Bigr| \le \zeta_k \int_H P(l - k, v, dz) = \zeta_k. \tag{1.5}$$
By Lemma 1.1, the space $\mathcal{P}(H)$ is complete with respect to $\|\cdot\|_L^*$. Hence, there is a unique measure $\mu \in \mathcal{P}(H)$ such that $\|P(l, v, \cdot) - \mu\|_L^* \to 0$ as $l \to \infty$. It is clear that $\operatorname{supp} \mu \subset A$ and therefore $\mu \in \mathcal{P}(H, A)$. Passing to the limit in (1.5) as $l \to \infty$, we obtain (1.4).

We now recall that a pair of random variables $(\xi_1, \xi_2)$ defined on the same probability space is called a coupling for given measures $\mu_1, \mu_2 \in \mathcal{P}(H)$ if $\mathcal{D}(\xi_j) = \mu_j$, $j = 1, 2$. For some basic results on the coupling, see [Lin, V] and the Appendix (Sect. 4).

² The corollary of Theorem 1 in [GS, Chapter VI, §1] claims, in fact, that if the limit in (1.2) exists for any $f \in C_b(H)$, then the functional $\ell$ can be represented in the form $\ell(f) = (\mu, f)$, where $\mu \in \mathcal{P}(H)$. However, the same proof works also in the case under study.
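To make the notions of variation distance and coupling concrete, here is a minimal sketch for measures on a finite set rather than on a Hilbert space (a simplifying assumption; the function names are illustrative and not from the paper). It computes the variation distance as half the $\ell^1$ distance between the probability vectors, in the spirit of (1.1), and builds a coupling whose probability of disagreement equals that distance.

```python
def total_variation(mu1, mu2):
    """Half the l1 distance between two probability vectors, as in (1.1)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu1, mu2))


def maximal_coupling(mu1, mu2):
    """Joint law {(i, j): prob} with marginals mu1 and mu2 such that
    P(xi1 != xi2) equals the total variation distance."""
    overlap = [min(a, b) for a, b in zip(mu1, mu2)]
    tv = 1.0 - sum(overlap)
    # Put the common mass on the diagonal, where the two variables agree.
    joint = {(i, i): w for i, w in enumerate(overlap) if w > 0}
    if tv > 0:
        excess1 = [a - w for a, w in zip(mu1, overlap)]
        excess2 = [b - w for b, w in zip(mu2, overlap)]
        # Couple the leftover masses independently; their supports are disjoint.
        for i, e1 in enumerate(excess1):
            for j, e2 in enumerate(excess2):
                if e1 > 0 and e2 > 0:
                    joint[(i, j)] = joint.get((i, j), 0.0) + e1 * e2 / tv
    return joint


def disagreement_probability(joint):
    """P(xi1 != xi2) under the joint law."""
    return sum(w for (i, j), w in joint.items() if i != j)


if __name__ == "__main__":
    mu1, mu2 = [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]
    joint = maximal_coupling(mu1, mu2)
    print(total_variation(mu1, mu2), disagreement_probability(joint))
```

Any coupling satisfies $\mathbb{P}\{\xi_1 \ne \xi_2\} \ge \|\mu_1 - \mu_2\|_{var}$; the construction above attains equality, which is the defining property of a maximal coupling.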


Lemma 1.3. If measures $\mu_1, \mu_2 \in \mathcal{P}(H)$ admit a coupling $(\xi_1, \xi_2)$ such that
$$\mathbb{P}\{\|\xi_1 - \xi_2\| > \varepsilon\} \le \theta, \tag{1.6}$$
where $\varepsilon > 0$ and $\theta > 0$ are some constants, then
$$\|\mu_1 - \mu_2\|_L^* \le 2\theta + \varepsilon. \tag{1.7}$$

Proof. Let $f \in L(H)$, $\|f\|_L \le 1$. Then $(\mu_{1,2}, f) = \mathbb{E} f(\xi_{1,2})$ and, therefore,
$$|(\mu_1 - \mu_2, f)| \le \mathbb{E}\,\chi_Q\, |f(\xi_1) - f(\xi_2)| + \mathbb{E}\,\chi_{Q^c}\, |f(\xi_1) - f(\xi_2)|, \tag{1.8}$$
where $\chi_Q$ and $\chi_{Q^c}$ are characteristic functions of the event $\{\|\xi_1 - \xi_2\| > \varepsilon\}$ and of its complement, respectively. By (1.6), the first term in the right-hand side of (1.8) is bounded by $2\theta$, while the second does not exceed $\varepsilon\|f\|_L \le \varepsilon$. This completes the proof of (1.7).

2. A Class of Random Dynamical Systems

Let $H$ be a Hilbert space with a norm $\|\cdot\|$ and an orthonormal basis $\{e_j\}$ and let $S : H \to H$ be an operator satisfying Conditions (A)–(C) below:

(A) For any $R > r > 0$ there exist positive constants $a = a(R, r) < 1$ and $C = C(R)$ and an integer $n_0 = n_0(R, r) \ge 1$ such that
$$\|S(u_1) - S(u_2)\| \le C(R)\,\|u_1 - u_2\| \quad\text{for all } u_1, u_2 \in B_H(R), \tag{2.1}$$
$$\|S^n(u)\| \le \max\{a\|u\|, r\} \quad\text{for } u \in B_H(R),\ n \ge n_0. \tag{2.2}$$

Let $\eta_k$, $k \ge 1$, be a sequence of i.i.d. $H$-valued random variables that are defined on a probability space $(\Omega_1, \mathcal{F}_1, \mathbb{P}_1)$ and have the form (0.1), where $b_j \ge 0$ are some constants such that
$$\sum_{j=1}^{\infty} b_j^2 < \infty, \tag{2.3}$$
and $\{\xi_{jk}\}$ is a family of independent real-valued random variables such that $|\xi_{jk}| \le 1$ for all $j$, $k$, and $\omega_1 \in \Omega_1$. We consider the following RDS in $H$:
$$u_k = S(u_{k-1}) + \eta_k =: F^{\omega_1}(u_{k-1}), \qquad k \ge 1. \tag{2.4}$$
It follows from (0.1) and (2.3) that the distribution of $\eta_k$ is supported by the Hilbert cube
$$K = \Bigl\{u = \sum_{j=1}^{\infty} u_j e_j : |u_j| \le b_j \text{ for all } j \ge 1\Bigr\}.$$
Therefore, if the initial state $u_0$ of the RDS (2.4) belongs to a set $B \subset H$, then $u_k \in \mathcal{A}_k(B)$ for all $k \ge 1$ and $\omega_1 \in \Omega_1$, where $\mathcal{A}_0(B) = B$ and
$$\mathcal{A}_k(B) = S\bigl(\mathcal{A}_{k-1}(B)\bigr) + K \quad\text{for } k \ge 1.$$
The next condition expresses the property of existence of a bounded absorbing set for the system in question.


(B) There exists $\rho > 0$ such that for any bounded set $B \subset H$ there is an integer $k_0 \ge 1$ such that $\mathcal{A}_k(B) \subset B_H(\rho)$ for $k \ge k_0$.

Clearly, inequality (2.2) and Condition (B) are satisfied if $\|S(u)\| \le \gamma\|u\|$ for all $u \in H$ and some positive constant $\gamma < 1$.

To formulate the last condition, we introduce some notation. For a subspace $E \subset H$, we denote by $E^\perp$ its orthogonal complement in $H$. For an integer $N \ge 1$, let $H_N$ be the finite-dimensional subspace generated by the vectors $e_1, \dots, e_N$ and let $P_N$ and $Q_N$ be the orthogonal projections onto $H_N$ and $H_N^\perp$, respectively.

(C) For any $R > 0$ there is a decreasing sequence $\gamma_N(R) > 0$ tending to zero as $N \to \infty$ such that
$$\|Q_N\bigl(S(u_1) - S(u_2)\bigr)\| \le \gamma_N(R)\,\|u_1 - u_2\| \quad\text{for all } u_1, u_2 \in B_H(R).$$

Finally, we specify the random variables $\{\xi_{jk}\}$:

(D) For any $j$, the random variables $\xi_{jk}$, $k \ge 1$, have the same distribution $\pi_j(dr) = p_j(r)\, dr$, where the densities $p_j(r)$ are functions of bounded variation such that $\operatorname{supp} p_j \subset [-1, 1]$ and $\int_{|r| \le \varepsilon} p_j(r)\, dr > 0$ for all $j \ge 1$ and $\varepsilon > 0$. We normalise the functions $p_j$ to be continuous from the right.

The RDS (2.4) defines a family of Markov chains in $H$ with the transition function $P(k, v, \Gamma) = \mathbb{P}\{u_k \in \Gamma\}$, where $(u_k, k \ge 0)$ is the solution of (2.4) such that $u_0 = v$. Let $P_k$ and $P_k^*$ be the corresponding semigroups (see the Introduction for their definition). Continuity of $S$ (see Condition (A)) and the Lebesgue theorem on dominated convergence imply that the transition function satisfies the Feller condition: if $f \in C_b(H)$, then $P_k f \in C_b(H)$ for all $k \ge 1$.

Let $\rho > 0$ be the constant in Condition (B). We introduce the set
$$\mathcal{A} = \overline{\bigcup_{k \ge 1} \mathcal{A}_k\bigl(B_H(\rho)\bigr)}. \tag{2.5}$$
It is clear that $\mathcal{A}$ is an invariant set for the RDS (2.4): if $u_0 \in \mathcal{A}$, then $u_k \in \mathcal{A}$ for all $k \ge 1$ and $\omega_1 \in \Omega_1$. Moreover, it follows from Condition (C) that the set $\mathcal{A}$ is compact in $H$. (Note that the union in (2.5) is taken over $k \ge 1$ and therefore $B_H(\rho)$ is not a subset of $\mathcal{A}$.) Our goal is to prove the following result:

Theorem 2.1. There is an integer $N \ge 1$ such that if (0.3) holds, then the RDS (2.4) has a unique stationary measure $\mu \in \mathcal{P}(H, \mathcal{A})$. Moreover, for any $R > 0$ there is $C_R > 0$ such that
$$|P_k f(u) - (\mu, f)| \le C_R\, e^{-c\sqrt{k}}\, \|f\|_L \quad\text{for } k \ge 0,\ \|u\| \le R,$$
where $f \in L(H)$ is an arbitrary function and $c > 0$ is a constant not depending on $f$, $u$, $R$, and $k$.


Condition (B) and the definition of $\mathcal{A}$ imply that for any $R > 0$ there is an integer $l \ge 1$ depending on $R$ such that $P(l, u, \mathcal{A}) = 1$ for any $u \in B_H(R)$. Hence, we can restrict our consideration to the invariant set $\mathcal{A}$. In view of Lemma 1.2, Theorem 2.1 will be established if we show that there are positive constants $C$ and $c$ and an integer $k_0 \ge 1$ such that
$$\|P(k, u, \cdot) - P(k, v, \cdot)\|_L^* \le C\, e^{-c\sqrt{k}} \quad\text{for } k \ge k_0,\ u, v \in \mathcal{A}. \tag{2.6}$$
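As an illustration of the setting of this section, the following sketch simulates a finite-dimensional toy version of the RDS (2.4). All concrete choices are assumptions made only for the example: the state space is $\mathbb{R}^N$, the map is the linear contraction $S(u) = \gamma u$ (which satisfies $\|S(u)\| \le \gamma\|u\|$), and the coefficients $\xi_{jk}$ are uniform on $[-1, 1]$, a density compatible with Condition (D). Since every kick lies in the cube $K$, one has $\|\eta_k\| \le (\sum_j b_j^2)^{1/2}$, and the trajectory obeys $\|u_k\| \le \gamma^k\|u_0\| + (\sum_j b_j^2)^{1/2}/(1 - \gamma)$, so a ball of slightly larger radius is absorbing.

```python
import math
import random


def kick(b, rng):
    """One kick eta = sum_j b_j xi_j e_j with |xi_j| <= 1, as coordinates in R^N."""
    return [bj * rng.uniform(-1.0, 1.0) for bj in b]


def step(u, gamma, b, rng):
    """One step u_k = S(u_{k-1}) + eta_k for the toy map S(u) = gamma * u."""
    eta = kick(b, rng)
    return [gamma * uj + ej for uj, ej in zip(u, eta)]


def norm(u):
    """Euclidean norm of a coordinate vector."""
    return math.sqrt(sum(x * x for x in u))


def run(u0, gamma, b, steps, seed=0):
    """Return the norms ||u_1||, ..., ||u_steps|| along one trajectory."""
    rng = random.Random(seed)
    u, norms = list(u0), []
    for _ in range(steps):
        u = step(u, gamma, b, rng)
        norms.append(norm(u))
    return norms


if __name__ == "__main__":
    b = [1.0 / j for j in range(1, 11)]   # square-summable coefficients
    gamma = 0.5
    u0 = [100.0] + [0.0] * 9              # start far outside the absorbing ball
    radius = norm(b) / (1.0 - gamma)
    norms = run(u0, gamma, b, steps=40)
    print(radius, norms[:3], max(norms[20:]))
```

The example only illustrates Condition (B); the contraction used here is far stronger than what Conditions (A)–(C) require.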

3. Proof of the Main Result

We first establish some auxiliary assertions and then use them to prove inequality (2.6), which implies the required result.

3.1. Auxiliary assertions. We begin with a simple observation. Let $R > 0$ be so large that $B_H(R) \supset \mathcal{A}$. To simplify notation, we denote $B = B_H(R)$.

Lemma 3.1. For any $d > 0$ there is an integer $l = l(d) \ge 0$ and a constant $\varkappa = \varkappa(d) > 0$ such that
$$\mathbb{P}\{\|u_l(v)\| \le d/2 \text{ for all } v \in B\} \ge \varkappa. \tag{3.1}$$

Proof. Let $a$ and $n_0$ be the constants in Condition (A) that correspond to the parameters $R$ (the radius of $B$) and $r = d/4$ and let $l = n_0 m$, where $m$ is the smallest integer such that $a^m R \le d/4$. If $\eta_k = 0$ in (2.4) for $1 \le k \le l$, then, in view of (2.2), we have
$$\|u_l(v)\| \le \max\{a^m R, d/4\} = d/4 \quad\text{for all } v \in B.$$
By continuity, there is $\gamma > 0$ such that if
$$\|\eta_k\| \le \gamma \quad\text{for } 1 \le k \le l, \tag{3.2}$$
then
$$\|u_l(v)\| \le d/2. \tag{3.3}$$
It follows from (2.3) and Condition (D) that the event (3.2) has a positive probability $\varkappa$. Inequality (3.1) follows now from (3.3).

To simplify notation, for any $v \in H$ we denote by $\mu_v(k)$ the measure $P(k, v, \cdot) \in \mathcal{P}(H)$. For any measurable space $(X, \mathcal{B}(X))$ and any integer $k \ge 1$, we denote by $X^k$ the direct product $X \times \cdots \times X$ endowed with the product $\sigma$-algebra $\mathcal{B}^k(X) = \mathcal{B}(X) \times \cdots \times \mathcal{B}(X)$.

Lemma 3.2. There is a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, an integer $N \ge 1$, and a constant $C > 0$ such that if (0.3) holds, then for any $u_1, u_2 \in B$ the measures $\mu_{u_{1,2}}(1)$ admit a coupling $V_{1,2} = V_{1,2}(u_1, u_2; \omega)$ that possesses the following properties:

(i) The maps $V_{1,2}$ are measurable with respect to the $\sigma$-algebra $\mathcal{B}^2(H) \times \mathcal{F}$ as functions of $(u_1, u_2, \omega) \in B^2 \times \Omega$.
(ii) Let $d = \|u_1 - u_2\|$. Then
$$\mathbb{P}\{\|V_1 - V_2\| \ge d/2\} \le Cd. \tag{3.4}$$


Let us note that inequality (3.4) is nontrivial only in the case $Cd < 1$.

Proof. Let $(\Omega_1, \mathcal{F}_1, \mathbb{P}_1)$ be the probability space on which the random variables $\{\eta_k\}$ are defined and let $(\Omega_2, \mathcal{F}_2, \mathbb{P}_2)$ be the probability space constructed in Theorem 4.2 for the measures $\nu_{1,2}$ specified below. We shall show that the set $\Omega = \Omega_1 \times \Omega_2$ endowed with the natural $\sigma$-algebra and probability of direct product is the required probability space. The random variables $V_{1,2}$ are sought in the form
$$V_1 = S(u_1) + \xi_1, \qquad V_2 = S(u_2) + \xi_2,$$
where $\xi_{1,2}$ are some random variables on $\Omega$ such that $\mathcal{D}(\xi_1) = \mathcal{D}(\xi_2) = \mathcal{D}(\eta_1)$. It is clear that $\mathcal{D}(V_{1,2}) = \mu_{u_{1,2}}(1)$ and that (i) holds.

To define the random variables $\xi_{1,2}$, we specify their projections $P_N \xi_{1,2}$ and $Q_N \xi_{1,2}$, where $N \ge 1$ is a sufficiently large integer which is chosen below. We set $Q_N \xi_1 = Q_N \xi_2 = Q_N \tilde\eta_1$, where $\tilde\eta_1$ is the natural extension of $\eta_1$ to $\Omega$, i.e., $\tilde\eta_1(\omega) = \eta_1(\omega_1)$ for $\omega = (\omega_1, \omega_2) \in \Omega$. To define $P_N \xi_{1,2}$, let us write $\nu_{1,2} := P_N \mu_{u_{1,2}}(1)$ and assume that we have proved the inequality
$$\|\nu_1 - \nu_2\|_{var} \le Cd, \tag{3.5}$$
where $C > 0$ is a constant not depending on $u_{1,2} \in B$. In view of Theorem 4.2, there is a maximal coupling $\Upsilon_{1,2}(u_1, u_2; \omega_2)$ for the measures $\nu_{1,2}$ that is measurable with respect to $(u_1, u_2, \omega_2) \in B^2 \times \Omega_2$:
$$\mathbb{P}\{\Upsilon_1 \ne \Upsilon_2\} = \|\nu_1 - \nu_2\|_{var} \le Cd. \tag{3.6}$$
Retaining the same notation for the natural extensions of $\Upsilon_1$ and $\Upsilon_2$ to $\Omega$, we now set $P_N \xi_{1,2} = \Upsilon_{1,2} - P_N S(u_{1,2})$ and note that $P_N V_1 = P_N V_2$ if and only if $\Upsilon_1 = \Upsilon_2$. Let $N \ge 1$ be so large that $\gamma_N(R) \le 1/2$ (see Condition (C)). In this case, if $P_N V_1 = P_N V_2$, then
$$\|V_1 - V_2\| = \|Q_N(V_1 - V_2)\| = \|Q_N(S(u_1) - S(u_2))\| \le \|u_1 - u_2\|/2 \le d/2.$$
Inequality (3.4) follows now from (3.6).

Thus, it remains to establish (3.5). To this end, we set $v_{1,2} = P_N S(u_{1,2})$ and note that, in view of (2.1),
$$\|v_1 - v_2\| \le C(R)\, d. \tag{3.7}$$
Since $b_j \ne 0$ for $1 \le j \le N$, Condition (D) implies that $\mathcal{D}(P_N \eta_1) = p(x)\, dx$, where $dx$ is the Lebesgue measure on the finite-dimensional space $H_N$ and
$$p(x) = \prod_{j=1}^{N} q_j(x_j), \qquad q_j(x_j) = b_j^{-1} p_j(x_j/b_j), \qquad x = (x_1, \dots, x_N) \in H_N,$$
is a bounded function with support in the set $P_N K$. It follows that $\nu_{1,2} = \mathcal{D}(v_{1,2} + P_N \eta_1) = p(x - v_{1,2})\, dx$.


Therefore, by (1.1),
$$\|\nu_1 - \nu_2\|_{var} = \frac12 \int_{H_N} |p(x - v_1) - p(x - v_2)|\, dx.$$
We claim that
$$\int_{H_N} |p(x - v_1) - p(x - v_2)|\, dx \le |v_1 - v_2| \sum_{j=1}^{N} b_j^{-1} \operatorname{Var}(p_j), \tag{3.8}$$
where $\operatorname{Var}(p_j)$ stands for the total variation of $p_j$. The required inequality (3.5) follows immediately from (3.7) and (3.8).

To prove (3.8), we first assume that $p_j$ are $C^1$-smooth functions. In this case, we have
$$\int_{H_N} |p(x - v_1) - p(x - v_2)|\, dx \le |v_1 - v_2| \int_{H_N} \int_0^1 \bigl|(\nabla p)(x - \theta v_1 - (1 - \theta) v_2)\bigr|\, d\theta\, dx = |v_1 - v_2| \int_{H_N} |(\nabla p)(x)|\, dx \le |v_1 - v_2| \sum_{j=1}^{N} \int_{\mathbb{R}} |\partial_{x_j} q_j(x_j)|\, dx_j = |v_1 - v_2| \sum_{j=1}^{N} \operatorname{Var}(q_j).$$

It remains to note that $\mathrm{Var}(q_j) = b_j^{-1}\,\mathrm{Var}(p_j)$. Inequality (3.8) in the general case can be easily derived by a standard approximation procedure; we omit the corresponding arguments.

We now combine Lemmas 3.1 and 3.2 to obtain a coupling $U^k_{1,2}(u_1, u_2)$ for the measures $\mu_{u_{1,2}}(k)$, $k \ge 1$. Let $l = l(d)$ and $C > 0$ be the constants in Lemmas 3.1 and 3.2 and let $d_0 > 0$ be so small that $Cd_0 \le 1/8$. We set $d_r = 2^{-r} d_0$, $r \ge 1$. For a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, we shall denote by $(\Omega^k, \mathcal{F}^k, \mathbb{P}^k)$ the direct product of its $k$ independent copies. Points of the latter will be denoted by $\boldsymbol{\omega}^k = (\omega_1, \dots, \omega_k)$.

Lemma 3.3. Suppose that the conditions of Lemma 3.2 are satisfied. Let $u_1, u_2 \in A$ and $d = \|u_1 - u_2\|$. Then for any $k \ge 1$ the measures $\mu_{u_{1,2}}(k)$ admit a coupling $U^k_{1,2} = U^k_{1,2}(u_1, u_2; \boldsymbol{\omega}^k)$, $\boldsymbol{\omega}^k \in \Omega^k$, such that the following assertions hold:

(i) The maps $U^k_{1,2}(u_1, u_2; \boldsymbol{\omega}^k)$ are measurable with respect to $(u_1, u_2, \boldsymbol{\omega}^k) \in A^2 \times \Omega^k$.

(ii) There is a constant $\theta > 0$ not depending on $u_1$, $u_2$, and $k$ such that

$$\mathbb{P}^k\bigl\{\|U_1^k - U_2^k\| \le d_r\bigr\} \ge \theta \quad \text{for all } k \ge r + l(d_0),\ u_1, u_2 \in A. \tag{3.9}$$


(iii) If $\|u_1 - u_2\| \le d_r$, then

$$\mathbb{P}^k\bigl\{\|U_1^k - U_2^k\| \le d_{k+r}\bigr\} \ge 1 - 2^{-r-1} \quad \text{for all } k \ge 1,\ r \ge 0. \tag{3.10}$$

Proof. Let us recall that for any $(u_1, u_2) \in B \times B$ a coupling $V_{1,2}(u_1, u_2; \omega)$ was constructed in Lemma 3.2. We set

$$U_j(u_1, u_2; \omega) = \begin{cases} V_j(u_1, u_2; \omega) & \text{if } \|u_1 - u_2\| \le d_0, \\ F^{\omega}(u_j) & \text{if } \|u_1 - u_2\| > d_0, \end{cases}$$

where $j = 1, 2$ and $F^{\omega}(u)$ is given by (2.4). We define random variables $U^k_{1,2}$ on $(\Omega^k, \mathcal{F}^k)$ by the following rule: if $\|u_1 - u_2\| > d_0$, then

$$U_j^k(u_1, u_2; \boldsymbol{\omega}^k) = F^{\omega_k} \circ \cdots \circ F^{\omega_1}(u_j) \quad \text{for } k \le l(d_0)$$

and

$$U_j^k(u_1, u_2; \boldsymbol{\omega}^k) = U_j\bigl(U_1^{k-1}(u_1, u_2; \boldsymbol{\omega}^{k-1}),\, U_2^{k-1}(u_1, u_2; \boldsymbol{\omega}^{k-1});\, \omega_k\bigr) \tag{3.11}$$

for $k > l(d_0)$, where $\boldsymbol{\omega}^k = (\boldsymbol{\omega}^{k-1}, \omega_k) = (\omega_1, \dots, \omega_k)$ and $U_j^0(u_1, u_2) = u_j$. If $\|u_1 - u_2\| \le d_0$, then $U^0_{1,2}(u_1, u_2) = u_{1,2}$ and for $k \ge 1$ the random variables $U_j^k(u_1, u_2; \boldsymbol{\omega}^k)$ are inductively defined by (3.11).

We claim that $U^k_{1,2}$ satisfy assertions (i)–(iii) of the lemma. Indeed, the measurability of the maps $U^k_{1,2}$ is obvious since they are compositions of measurable maps. To prove (3.9), we first note that it is sufficient to consider the case $k = l + r$, $l = l(d_0)$. We introduce the following events in $\Omega^{l+r}$:

$$Q^+ = \bigl\{\|U_1^l - U_2^l\| \le d_0\bigr\}, \qquad Q^- = \bigl\{\|U_1^l - U_2^l\| > d_0\bigr\}, \qquad Q = \bigl\{\|U_1^{l+r} - U_2^{l+r}\| \le d_r\bigr\}.$$

By Lemma 3.1, we have

$$\mathbb{P}^k(Q) = \mathbb{P}^k(Q \mid Q^+)\,\mathbb{P}(Q^+) + \mathbb{P}^k(Q \mid Q^-)\,\mathbb{P}(Q^-) \ge \varepsilon\,\mathbb{P}^k(Q \mid Q^+). \tag{3.12}$$

If we assume that (3.10) is proved for $r = 0$, then (3.12) will imply the required estimate (3.9) with $\theta = \varepsilon/2$. Thus, it remains to establish (iii). For a fixed $r \ge 0$, we set

$$Q_k^+ = \bigl\{\|U_1^k - U_2^k\| \le d_{k+r}\bigr\}, \qquad Q_k^- = \bigl\{\|U_1^k - U_2^k\| > d_{k+r}\bigr\}$$

and denote by $p_k^+$ and $p_k^-$ the probabilities of $Q_k^+$ and $Q_k^-$, respectively. Using (3.4) with $d = d_{k+r-1}$, we derive

$$p_k^+ = p_{k-1}^+\,\mathbb{P}^k(Q_k^+ \mid Q_{k-1}^+) + p_{k-1}^-\,\mathbb{P}^k(Q_k^+ \mid Q_{k-1}^-) \ge (1 - Cd_{k+r-1})\,p_{k-1}^+.$$

Since $p_0^+ = 1$, iteration of this estimate results in

$$p_k^+ \ge \lambda := \prod_{j=0}^{k-1} (1 - Cd_{j+r}). \tag{3.13}$$


Since $d_m = 2^{-m} d_0$ and $Cd_0 \le 1/8$, we have

$$\log \lambda = \sum_{j=0}^{k-1} \log(1 - Cd_{j+r}) \ge -2C \sum_{j=0}^{k-1} d_{j+r} \ge -2Cd_0 \sum_{j=0}^{\infty} 2^{-(j+r)} = -2^{2-r} Cd_0 \ge -2^{-r-1}.$$

Therefore, $\lambda \ge 1 - 2^{-r-1}$.

3.2. Proof of Theorem 2.1. As was mentioned at the end of Sect. 2, it is sufficient to establish inequality (2.6). In what follows, to simplify notation, we shall write $\mathbb{P}$ instead of $\mathbb{P}^k$.

(1) Let us fix arbitrary $u_1, u_2 \in A$ and set $T_0 = 0$ and $T_r = T_{r-1} + r + l$ for $r \ge 1$, i.e., $T_r = r(r+1)/2 + rl$. We claim that for any integer $r \ge 0$ there is a coupling $y_{1,2}(T_r)$ on $\Omega^{T_r}$ for the measures $\mu_{u_{1,2}}(T_r)$ such that

$$\mathbb{P}\bigl\{\|y_1(T_r) - y_2(T_r)\| > d_r\bigr\} \le C_1 \gamma^r, \tag{3.14}$$

where $C_1$ and $\gamma < 1$ are some positive constants. The construction of $y_{1,2}(T_r) = y_{1,2}(T_r, u_1, u_2; \boldsymbol{\omega}^{T_r})$ and the proof of (3.14) are by induction. For $r = 0$, we set $y_j(0) = u_j$, and inequality (3.14) with $C_1 \ge 1$ is trivial in this case. Assuming that $y_{1,2}(T_i)$ are constructed for $0 \le i \le r$, we set

$$y_j(T_{r+1}, u_1, u_2; \boldsymbol{\omega}^{T_{r+1}}) = U_j^{r+l+1}\bigl(y_1(T_r, u_{1,2}; \boldsymbol{\omega}^{T_r}),\, y_2(T_r, u_{1,2}; \boldsymbol{\omega}^{T_r});\, \boldsymbol{\omega}^{r+l+1}\bigr), \tag{3.15}$$

where $U^k_{1,2}(u_1, u_2; \boldsymbol{\omega}^k)$ are defined in Lemma 3.3 and $\boldsymbol{\omega}^{T_{r+1}} = (\boldsymbol{\omega}^{T_r}, \boldsymbol{\omega}^{r+l+1})$. Let us introduce the events

$$Q_r^+ = \bigl\{\|y_1(T_r) - y_2(T_r)\| \le d_r\bigr\}, \qquad Q_r^- = \bigl\{\|y_1(T_r) - y_2(T_r)\| > d_r\bigr\}$$

and denote by $p_r^+$ and $p_r^-$ their probabilities. Then, in view of (3.9) and (3.10) with $k = r + l$, we have (cf. (3.12))

$$p_{r+1}^- = \mathbb{P}(Q_{r+1}^- \mid Q_r^+)\,\mathbb{P}(Q_r^+) + \mathbb{P}(Q_{r+1}^- \mid Q_r^-)\,\mathbb{P}(Q_r^-) \le 2^{-r-1} p_r^+ + (1 - \theta)\,p_r^- \le 2^{-r-1} + \gamma p_r^-, \tag{3.16}$$

where $\gamma = 1 - \theta$. Without loss of generality, we can assume that $0 < \theta < 1/2$, and therefore $1 < 2\gamma < 2$. Iterating (3.16), we obtain

$$p_{r+1}^- \le 2^{-r-1} \sum_{j=0}^{r} (2\gamma)^j + \gamma^{r+1} p_0^- \le 2^{-r-1}\,\frac{(2\gamma)^{r+1} - 1}{2\gamma - 1} + \gamma^{r+1} \le C_1 \gamma^{r+1}.$$

This completes the induction.
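As a sanity check of the induction step (a sketch only, not part of the original argument), the following Python snippet iterates the worst case of (3.16), namely $p_{r+1}^- = 2^{-r-1} + \gamma p_r^-$ with $p_0^- = 1$, and compares it with $C_1\gamma^r$. The value $\gamma = 3/4$, the explicit constant $C_1 = 1 + 1/(2\gamma - 1)$, and the function names are assumptions made for illustration; the text only asserts that some $C_1$ exists. The helper `T` evaluates the times $T_r$ for an assumed value of $l$.

```python
from fractions import Fraction

def iterate_worst_case(gamma, steps, p0=Fraction(1)):
    """Iterate p_{r+1} = 2^{-(r+1)} + gamma * p_r, the extreme case of (3.16)."""
    values = [p0]
    for r in range(steps):
        values.append(Fraction(1, 2 ** (r + 1)) + gamma * values[-1])
    return values

def T(r, l):
    """T_0 = 0 and T_r = T_{r-1} + r + l, as in step (1) of the proof."""
    total = 0
    for i in range(1, r + 1):
        total += i + l
    return total

gamma = Fraction(3, 4)           # assumed: gamma = 1 - theta with theta = 1/4
C1 = 1 + 1 / (2 * gamma - 1)     # assumed explicit constant, read off the geometric sum
bounds = iterate_worst_case(gamma, 30)

# (3.14): the iterates stay below C1 * gamma^r for every r checked
print(all(p <= C1 * gamma ** r for r, p in enumerate(bounds)))

# closed form T_r = r(r+1)/2 + r*l, here with the assumed value l = 2
print(all(2 * T(r, 2) == r * (r + 1) + 2 * r * 2 for r in range(20)))
```

Exact rational arithmetic is used so that the comparison is not affected by rounding.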


(2) We can now prove (2.6). Let us fix arbitrary positive integers $r$ and $m \le r + l$ and set $k = T_r + m$, so that $T_r + 1 \le k < T_{r+1}$. We define a coupling $y_{1,2}(k) = y_{1,2}(k, u_1, u_2)$ for the measures $\mu_{u_{1,2}}(k)$ by the formula (cf. (3.15))

$$y_j(k, u_1, u_2; \boldsymbol{\omega}^k) = U_j^m\bigl(y_1(T_r, u_1, u_2; \boldsymbol{\omega}^{T_r}),\, y_2(T_r, u_1, u_2; \boldsymbol{\omega}^{T_r});\, \boldsymbol{\omega}^m\bigr).$$

In view of (3.10) and (3.14), we have (cf. (3.16))

$$\mathbb{P}\bigl\{\|y_1(k) - y_2(k)\| > d_{r+1}\bigr\} \le \mathbb{P}(Q_r^-) + 2^{-r-1}\,\mathbb{P}(Q_r^+) \le C_2 \gamma^r, \tag{3.17}$$

where $C_2 > 0$ is a constant. Now note that $r^2/2 \le T_r \le (l + 1) r^2$ for any $r \ge 0$ and therefore there are positive constants $C$ and $c$ such that

$$d_{r+1} \le C e^{-c\sqrt{k}}, \qquad C_2 \gamma^r \le C e^{-c\sqrt{k}} \qquad \text{for } T_r \le k < T_{r+1}.$$

Combining this with (3.17), we derive

$$\mathbb{P}\bigl\{\|y_1(k, u_1, u_2) - y_2(k, u_1, u_2)\| \ge C e^{-c\sqrt{k}}\bigr\} \le C e^{-c\sqrt{k}}. \tag{3.18}$$

By Lemma 1.3, inequality (3.18) implies that

$$\|\mu_{u_1}(k) - \mu_{u_2}(k)\|_L^* \le 3C e^{-c\sqrt{k}},$$

which completes the proof of (2.6) with $k_0 = T_1$. Theorem 2.1 is proved.

4. Appendix: Coupling

In this appendix, we present some results on the coupling in finite-dimensional spaces in the form which we learned from S. Foss. These results are well known (e.g., see [Lin, V] for Lemma 4.1 and [BF] for Lemma 4.3). Let $\nu_1, \nu_2 \in \mathcal{P}(\mathbb{R}^N)$ be two measures absolutely continuous with respect to the Lebesgue measure $dx$: $\nu_{1,2}(dx) = p_{1,2}(x)\,dx$. We set

$$\rho := \|\nu_1 - \nu_2\|_{\mathrm{var}} = \frac{1}{2} \int_{\mathbb{R}^N} |p_1(x) - p_2(x)|\,dx \tag{4.1}$$

and assume first that $0 < \rho < 1$. Let

$$p := (1 - \rho)^{-1}\,p_1 \wedge p_2, \qquad \hat{p}_{1,2} := \rho^{-1}\bigl(p_{1,2} - (1 - \rho)\,p\bigr). \tag{4.2}$$

For $\rho = 1$ or $0$, we define $p(x)$ and $\hat{p}_{1,2}(x)$ as follows:

$$\begin{aligned}
p(x) &\equiv 0, & \hat{p}_{1,2}(x) &\equiv p_{1,2}(x) && \text{if } \rho = 1, \\
p(x) &\equiv p_1(x), & \hat{p}_{1,2}(x) &\equiv 0 && \text{if } \rho = 0.
\end{aligned}$$

It is clear that $p_{1,2}(x) = (1 - \rho)\,p(x) + \rho\,\hat{p}_{1,2}(x)$

almost everywhere.

(4.3) (4.4)


If $(\xi_1, \xi_2)$ is a coupling for the measures $(\nu_1, \nu_2)$, then for any $\Gamma \in \mathcal{B}(\mathbb{R}^N)$ we have

$$\nu_1(\Gamma) - \nu_2(\Gamma) = \mathbb{E}\bigl(\chi_\Gamma(\xi_1) - \chi_\Gamma(\xi_2)\bigr) = \mathbb{E}\,\chi_{\{\xi_1 \neq \xi_2\}}\bigl(\chi_\Gamma(\xi_1) - \chi_\Gamma(\xi_2)\bigr) \le \mathbb{P}\{\xi_1 \neq \xi_2\}.$$

Therefore,

$$\mathbb{P}\{\xi_1 \neq \xi_2\} \ge \rho \equiv \|\nu_1 - \nu_2\|_{\mathrm{var}}.$$

A coupling $(\xi_1, \xi_2)$ for $(\nu_1, \nu_2)$ is said to be maximal if $\mathbb{P}\{\xi_1 \neq \xi_2\} = \rho \equiv \|\nu_1 - \nu_2\|_{\mathrm{var}}$.

Lemma 4.1. Let $\xi_{1,2}$, $\xi$, and $\alpha$ be independent random variables such that

$$\mathbb{P}\{\alpha = 1\} = 1 - \rho, \qquad \mathbb{P}\{\alpha = 0\} = \rho, \qquad \mathcal{D}(\xi) = p(x)\,dx, \qquad \mathcal{D}(\xi_{1,2}) = \hat{p}_{1,2}(x)\,dx. \tag{4.5}$$

Then the random variables

$$\Xi_{1,2} = \alpha \xi + (1 - \alpha)\,\xi_{1,2} \tag{4.6}$$

form a maximal coupling for $\nu_{1,2}$.

Proof. Since $\xi_1$ and $\xi_2$ are independent and their distributions possess densities with respect to the Lebesgue measure, we have $\mathbb{P}\{\xi_1 = \xi_2\} = 0$. Taking into account the relation $\alpha(1 - \alpha) \equiv 0$, we get

$$\mathcal{D}(\Xi_{1,2}) = p_{1,2}(x)\,dx = \nu_{1,2}, \qquad \mathbb{P}\{\Xi_1 \neq \Xi_2\} = \mathbb{P}\{\alpha = 0\} = \rho,$$

which completes the proof.

Let us now assume that $\varphi$ is a random variable in $\mathbb{R}^N$ with the distribution $\mathcal{D}(\varphi) = q(x)\,dx$, where $q \in L^1(\mathbb{R}^N)$. Consider the following family of measures depending on a parameter $v \in \mathbb{R}^N$:

$$\nu_v(dx) = \mathcal{D}(v + \varphi) = q(x - v)\,dx.$$

Let $\rho(v_1, v_2)$ be the variation distance between $\nu_{v_1}$ and $\nu_{v_2}$. It is clear from (4.1) that $\rho(v_1, v_2)$ is measurable with respect to $(v_1, v_2) \in \mathbb{R}^{2N}$. In the construction above, let us take $\nu_{1,2} = \nu_{v_{1,2}}$. Then

$$p(x) = p(x; v_1, v_2), \qquad \hat{p}_{1,2}(x) = \hat{p}_{1,2}(x; v_1, v_2).$$

Clearly, the functions $p(x; v_1, v_2)$ and $\hat{p}_{1,2}(x; v_1, v_2)$ are measurable with respect to $(x, v_1, v_2)$. Using the above observations, we construct a coupling for $(\nu_{v_1}, \nu_{v_2})$ that is measurable with respect to $(v_1, v_2, \omega)$. Namely, we have the following result:

Theorem 4.2. There is a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ such that for any pair $(v_1, v_2) \in \mathbb{R}^{2N}$ there are random variables $\Xi_{1,2} = \Xi_{1,2}(v_1, v_2; \omega)$ satisfying the following properties:

(i) The pair $(\Xi_1, \Xi_2)$ is a maximal coupling for $(\nu_{v_1}, \nu_{v_2})$.

(ii) The map $\Xi(v_1, v_2; \omega) : \mathbb{R}^{2N} \times \Omega \to \mathbb{R}^N$ is measurable with respect to the $\sigma$-algebra $\mathcal{B}(\mathbb{R}^{2N}) \times \mathcal{F}$.
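The construction of Lemma 4.1 can be illustrated on a finite set, where sums replace integrals. The Python sketch below is such a discrete analogue, which is an assumption of this example (the appendix works with densities on $\mathbb{R}^N$). It builds the joint law of the pair obtained from formula (4.6) and lets one check that the marginals are the two given laws and that the pair differs with probability exactly $\rho$. The function name `maximal_coupling` and the sample distributions are hypothetical.

```python
from fractions import Fraction

def maximal_coupling(p1, p2):
    """Discrete analogue of Lemma 4.1.

    p1, p2: dicts mapping points to probabilities.
    Returns (rho, joint), where joint maps pairs (x, y) to probabilities.
    """
    support = sorted(set(p1) | set(p2))
    # p1 ^ p2, i.e. (1 - rho) * p in the notation of (4.2)
    overlap = {x: min(p1.get(x, 0), p2.get(x, 0)) for x in support}
    rho = 1 - sum(overlap.values())
    joint = {}
    # alpha = 1 (probability 1 - rho): both components equal xi
    for x in support:
        if overlap[x] > 0:
            joint[(x, x)] = overlap[x]
    # alpha = 0 (probability rho): independent draws from the normalized residuals
    if rho > 0:
        for x in support:
            r1 = p1.get(x, 0) - overlap[x]      # rho * p-hat_1(x)
            for y in support:
                r2 = p2.get(y, 0) - overlap[y]  # rho * p-hat_2(y)
                if r1 > 0 and r2 > 0:
                    joint[(x, y)] = joint.get((x, y), 0) + r1 * r2 / rho
    return rho, joint

p1 = {0: Fraction(1, 2), 1: Fraction(1, 2)}
p2 = {1: Fraction(1, 4), 2: Fraction(3, 4)}
rho, joint = maximal_coupling(p1, p2)
print(rho, joint)
```

In the discrete setting the two residual laws have disjoint supports, which plays the role of the relation $\mathbb{P}\{\xi_1 = \xi_2\} = 0$ used in the proof of Lemma 4.1.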


To prove the theorem, we shall need the lemma below:

Lemma 4.3. Let $\mu_z \in \mathcal{P}(\mathbb{R}^N)$, $z \in \mathbb{R}^d$, be a family of probability measures such that $\mu_z(dx) = p_z(x)\,dx$, where $p_z \in L^1(\mathbb{R}^N_x)$ for each $z \in \mathbb{R}^d$ and $p_z(x)$ is measurable as a function of $(x, z) \in \mathbb{R}^N \times \mathbb{R}^d$. Then there is a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and a family of random variables $\zeta_z : \Omega \to \mathbb{R}^N$ such that $\mathcal{D}(\zeta_z) = \mu_z$ for all $z \in \mathbb{R}^d$ and $\zeta_z(\omega)$ is measurable with respect to $(z, \omega)$.

Proof. If $N = 1$, then we take $(\Omega, \mathcal{F}, \mathbb{P}) = ([0, 1], \mathcal{B}, dt)$, where $\mathcal{B}$ is the Borel $\sigma$-algebra and $dt$ is the Lebesgue measure. Denoting by $F_z(\lambda)$ the distribution function of the measure $\mu_z$, $F_z(\lambda) = \mu_z((-\infty, \lambda])$, we set

$$\zeta_z(t) = \min\{\lambda : F_z(\lambda) \ge t\}.$$

The map $(t, z) \mapsto \zeta_z(t)$ from $[0, 1] \times \mathbb{R}^d$ to $\mathbb{R}$ is measurable, and the distribution function of $\mathcal{D}(\zeta_z)$ is equal to $F_z$. Thus, for $N = 1$ the lemma is proved.

We now assume that the required assertion is established for $N = L$ and prove it for $N = L + 1$. Let us write $x \in \mathbb{R}^{L+1}$ as $x = (x', y)$, where $x' \in \mathbb{R}^L$ and $y \in \mathbb{R}$. Decomposing $\mu_z$ in terms of the conditional density (see [GS]), we write

$$\mu_z(dx) = p_z(x)\,dx = p_z(x' \mid y)\,dx'\,q_z(y)\,dy. \tag{4.7}$$

Here

$$q_z(y) = \int_{\mathbb{R}^L} p_z(x', y)\,dx', \qquad p_z(x' \mid y) = \frac{p_z(x', y)}{q_z(y)},$$

where we set $0/0 = \infty/\infty = 0$. Applying the induction hypothesis with $z$ replaced by $(z, y)$, we find a probability space $(\Omega', \mathcal{F}', \mathbb{P}')$ and a measurable map

$$\zeta'_z(\omega', y) : \Omega' \times \mathbb{R}^d \times \mathbb{R} \to \mathbb{R}^L$$

such that $\mathcal{D}\bigl(\zeta'_z(\cdot, y)\bigr) = p_z(x' \mid y)\,dx'$ for each $(z, y) \in \mathbb{R}^d \times \mathbb{R}$. Applying the first step of the proof, we construct a measurable map $\xi_z(t) : [0, 1] \times \mathbb{R}^d \to \mathbb{R}$ such that $\mathcal{D}(\xi_z) = q_z(\lambda)\,d\lambda$. We now set $\Omega = \Omega' \times [0, 1]$ and

$$\zeta_z(\omega', t) = \bigl(\zeta'_z(\omega', \xi_z(t)),\, \xi_z(t)\bigr) \in \mathbb{R}^{L+1}.$$

We have constructed a measurable map $\Omega \times \mathbb{R}^d \to \mathbb{R}^{L+1}$ such that, for any fixed $z \in \mathbb{R}^d$, its distribution is given by the right-hand side of (4.7).

Proof of Theorem 4.2. Applying Lemma 4.3 to measures in $\mathbb{R}^N$ given by the densities $p$ and $\hat{p}_{1,2}$, we construct probability spaces $(F_j, S_j, P_j)$, $j = 0, 1, 2$, and random variables $\xi^j_{(v_1, v_2)}$ on $F_j$ such that

$$\mathcal{D}\bigl(\xi^0_{(v_1, v_2)}\bigr) = p(x; v_1, v_2)\,dx, \qquad \mathcal{D}\bigl(\xi^j_{(v_1, v_2)}\bigr) = \hat{p}_j(x; v_1, v_2)\,dx, \quad j = 1, 2. \tag{4.8}$$

We also define a random variable $\alpha_\rho : [0, 1] \to \{0, 1\}$, $\rho = \rho(v_1, v_2)$, by the formula

$$\alpha_\rho(t) = \chi_{[0, 1 - \rho]}(t),$$


where $[0, 1]$ is endowed with the Borel $\sigma$-algebra and the Lebesgue measure, and $\chi_{[0, r]}$ is the characteristic function of the interval $[0, r]$. We now define the required probability space as the set $\Omega = F_0 \times F_1 \times F_2 \times [0, 1]$ with the $\sigma$-algebra and the probability of direct product. The natural extensions³ of $\alpha_\rho$ and $\xi^j_{(v_1, v_2)}$, $j = 0, 1, 2$, to $\Omega$ (for which we retain the same notations) form a quadruple of independent random variables satisfying (4.8) and also the relations

$$\mathbb{P}\{\alpha_\rho = 1\} = 1 - \rho(v_1, v_2), \qquad \mathbb{P}\{\alpha_\rho = 0\} = \rho(v_1, v_2).$$

A maximal coupling $(\Xi_1, \Xi_2)$ for the measures $(\nu_{v_1}, \nu_{v_2})$ that satisfies assertion (ii) of the theorem can now be defined by formula (4.6), in which $\alpha = \alpha_\rho$, $\xi = \xi^0_{(v_1, v_2)}$, and $\xi_j = \xi^j_{(v_1, v_2)}$, $j = 1, 2$.
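The one-dimensional step in the proof of Lemma 4.3, $\zeta_z(t) = \min\{\lambda : F_z(\lambda) \ge t\}$, is the usual generalized inverse of a distribution function. A minimal Python sketch for a law supported on finitely many points follows; the finite support, the function names, and the sample law are assumptions of this example, and the parameter $z$ is suppressed.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate

def quantile(values, probs, t):
    """Smallest value lam with F(lam) >= t; values must be sorted increasingly."""
    for lam, cum in zip(values, accumulate(probs)):
        if cum >= t:
            return lam
    raise ValueError("t must not exceed the total mass")

def law_on_grid(values, probs, m):
    """Law of zeta(t) when t is uniform on the grid {1/m, 2/m, ..., 1}."""
    counts = Counter(quantile(values, probs, Fraction(k, m)) for k in range(1, m + 1))
    return {lam: Fraction(c, m) for lam, c in counts.items()}

values = [0, 1, 2]
probs = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
print(law_on_grid(values, probs, 4))
```

When $m$ is a common denominator of the cumulative probabilities, the grid law coincides with the prescribed one, which is the discrete counterpart of the statement that the distribution function of $\mathcal{D}(\zeta_z)$ equals $F_z$.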

Acknowledgements. The authors thank Roger Tribe and Sergei Foss for fruitful discussions of the coupling approach during the Symposium “Stochastic Fluid Equations” in Warwick on January 19–20, 2001, and at seminars in Heriot-Watt University, respectively. The authors are also grateful to Jan Kristensen for useful remarks on functional analysis. This research was supported by EPSRC, grant GR/N63055/01.

References

[BF] Borovkov, A.A., Foss, S.G.: Stochastically recursive sequences and their generalizations. Siberian Adv. in Math. 2, no. 1, 16–81 (1992)
[BL] Bressaud, X., Liverani, C.: Anosov diffeomorphism and coupling. To appear in Ergodic Theory Dynam. Systems
[BKL] Bricmont, J., Kupiainen, A., Lefevere, R.: Exponential mixing for the 2D stochastic Navier–Stokes dynamics. Preprint
[EMS] E, W., Mattingly, J.C., Sinai, Ya.G.: Gibbsian dynamics and ergodicity for the stochastically forced Navier–Stokes equation. Preprint
[GS] Gihman, I.I., Skorohod, A.V.: The Theory of Stochastic Processes I. Berlin–Heidelberg–New York: Springer-Verlag, 1980
[H] Hörmander, L.: The Analysis of Linear Partial Differential Operators I. Distribution Theory and Fourier Analysis. Berlin: Springer-Verlag, 1983
[KS1] Kuksin, S., Shirikyan, A.: Stochastic dissipative PDE’s and Gibbs measures. Comm. Math. Phys. 213, 291–330 (2000)
[KS2] Kuksin, S., Shirikyan, A.: On dissipative systems perturbed by bounded random kick-forces. Submitted to Ergodic Theory Dynam. Systems (www.ma.hw.ac.uk/kuksin)
[KS3] Kuksin, S., Shirikyan, A.: Ergodicity for the randomly forced 2D Navier–Stokes equations. Preprint. (www.ma.hw.ac.uk/kuksin)
[Lin] Lindvall, T.: Lectures on the Coupling Method. New York: John Wiley & Sons, 1992
[V] Veretennikov, A.Yu.: Parametric and non-parametric estimation of Markov chains. Moscow: Moscow State University Press, 2000 (in Russian)
[Y] Young, L.-S.: Recurrence times and rates of mixing. Israel J. Math. 110, 153–188 (1999)

Communicated by G. Gallavotti

³ For instance, the extension of $\alpha_\rho$ is given by $\alpha_\rho(\omega) = \alpha_\rho(t)$, where $\omega = (\omega_0, \omega_1, \omega_2, t) \in \Omega$.

Commun. Math. Phys. 221, 367 – 384 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Loop Homotopy Algebras in Closed String Field Theory

Martin Markl

Mathematical Institute of the Academy, Žitná 25, 11567 Prague 1, The Czech Republic. E-mail: [email protected]

Received: 10 November 1999 / Accepted: 29 March 2001

Abstract: Barton Zwiebach constructed [20] “string products” on the Hilbert space of a combined conformal field theory of matter and ghosts, satisfying the “main identity”. It has been well known that the “tree level” of the theory gives an example of a strongly homotopy Lie algebra (though, as we will see later, this is not the whole truth). Strongly homotopy Lie algebras are now well-understood objects. On the one hand, a strongly homotopy Lie algebra is given by a square zero coderivation on the cofree cocommutative connected coalgebra [14, 13]; on the other hand, strongly homotopy Lie algebras are algebras over the cobar dual of the operad Com for commutative algebras [9]. As far as we know, no such characterization of the structure of string products for arbitrary genera has been available, though there are two series of papers directly pointing towards the requisite characterization. As far as the characterization in terms of (co)derivations is concerned, we need the concept of higher order (co)derivations, which has been developed, for example, in [2, 3]. These higher order derivations were used in the analysis of the “master identity.” For our characterization we need to understand the behavior of these higher (co)derivations on (co)free (co)algebras. The necessary machinery for the operadic approach is that of modular operads, anticipated in [5] and introduced in [8]. We believe that the modular operad structure on the compactified moduli space of Riemann surfaces of arbitrary genera implies the existence of the structure we are interested in, in the same manner as was explained for the tree level in [11]. We also indicate how to adapt the loop homotopy structure to the case of open string field theory [19].

1. Introduction

Let $\mathcal{H}$ be the Hilbert space of a combined conformal field theory of matter and ghosts and let $\mathcal{H}_{rel} \subset \mathcal{H}$ be the subspace of elements annihilated by $b_0^- := b_0 - \bar{b}_0$ and


$L_0^- := L_0 - \bar{L}_0$ (see, for example, [11, Sect. 4]). Barton Zwiebach constructed in [20], for each “genus” $g \ge 0$ and for each $n \ge 0$, multilinear “string products”

$$\mathcal{H}_{rel}^{\otimes n} \ni B_1 \times \cdots \times B_n \longmapsto [B_1, \dots, B_n]_g \in \mathcal{H}_{rel}.$$

Recall the basic properties of these products. If $\mathrm{gh}(-)$ denotes the ghost number, then [20, (4.8)]

$$\mathrm{gh}([B_1, \dots, B_n]_g) = 3 - 2n + \sum_{i=1}^{n} \mathrm{gh}(B_i).$$

The string products are graded (super) commutative [20, (4.4)]:

$$[B_1, \dots, B_i, B_{i+1}, \dots, B_n]_g = (-1)^{B_i B_{i+1}} [B_1, \dots, B_{i+1}, B_i, \dots, B_n]_g. \tag{1}$$

Here we used the notation $(-1)^{B_i B_{i+1}} := (-1)^{\mathrm{gh}(B_i)\mathrm{gh}(B_{i+1})}$. For $n = 0$ and $g \ge 0$, $[\,\cdot\,]_g \in \mathcal{H}_{rel}$ is just a constant, and the products are constructed in such a way that $[\,\cdot\,]_0 = 0$ [20, (4.6)]. The linear operation $[B]_0 =: QB$ is identified with the BRST differential of the theory. These products satisfy, for all $n$, $g$, the main identity [20, (4.13)]:

$$0 = \sum \sigma(i_l, j_k)\,\bigl[B_{i_1}, \dots, B_{i_l}, [B_{j_1}, \dots, B_{j_k}]_{g_2}\bigr]_{g_1} + \frac{1}{2} \sum_s (-1)^{\Phi_s} [\Phi_s, \Phi^s, B_1, \dots, B_n]_{g-1}. \tag{2}$$

Here the first sum runs over all $g_1 + g_2 = g$, $k + l = n$, and all sequences $i_1 < \cdots < i_l$, $j_1 < \cdots < j_k$ such that $\{i_1, \dots, i_l, j_1, \dots, j_k\} = \{1, \dots, n\}$. Such sequences are called unshuffles (see the terminology introduced at the beginning of Sect. 2). The sign $\sigma(i_l, j_k)$ is picked up by rearranging the sequence $(Q, B_1, \dots, B_n)$ into the order $(B_{i_1}, \dots, B_{i_l}, Q, B_{j_1}, \dots, B_{j_k})$. In the second sum, $\{\Phi_s\}$ is a basis of $\mathcal{H}_{rel}$ and $\{\Phi^s\}$ its dual basis in the sense that $(-1)^{\Phi_r} \langle \Phi^r, \Phi_s \rangle = \delta^r_s$ (Kronecker delta), where $\langle -, - \rangle$ denotes the bilinear inner product on $\mathcal{H}$ [20, (2.44)]. Let us remark that, in the original formulation of [20], $\{\Phi_s\}$ was a basis of the whole $\mathcal{H}$, but the sum in (2) was restricted to $\mathcal{H}_{rel}$. The product satisfies [20, (2.62)]:

$$\langle A, B \rangle = (-1)^{(A+1)(B+1)} \langle B, A \rangle \tag{3}$$

and it is nontrivial only for elements whose ghost numbers add up to five:

$$\text{if } \langle A, B \rangle \neq 0, \text{ then } \mathrm{gh}(A) + \mathrm{gh}(B) = 5. \tag{4}$$

The above two conditions in fact imply that $\langle A, B \rangle = \langle B, A \rangle$. Moreover, the product $\langle -, - \rangle$ is $Q$-invariant [20, (2.63)]:

$$\langle QA, B \rangle = (-1)^{A} \langle A, QB \rangle. \tag{5}$$


Conditions (3) and (4) also imply that the element $(-1)^{\Phi_s} \Phi_s \otimes \Phi^s \in \mathcal{H}_{rel}^{\otimes 2}$ is symmetric in the sense that

$$(-1)^{\Phi_s} \Phi_s \otimes \Phi^s = (-1)^{\Phi_s} \Phi^s \otimes \Phi_s = -(-1)^{\Phi^s} \Phi^s \otimes \Phi_s. \tag{6}$$

We use, in the previous formula as well as at many places in the rest of the paper, the Einstein convention of summing over repeated indices. The last important property of string products is that the element

$$\Phi_s \otimes [\Phi^s, B_1, \dots, B_{n-1}]_g \in \mathcal{H}_{rel}^{\otimes 2} \tag{7}$$

is antisymmetric. This property is not explicitly stated in [20], though it is used in the proof of the identity [20, (4.28)]:

$$\sum_s \bigl[B_1, \dots, B_l, \Phi_s, [\Phi^s, A_1, \dots, A_k]_{g_2}\bigr]_{g_1} = 0, \quad \text{for arbitrary } l \ge 0,\ k \ge 0,$$

which then immediately follows from the antisymmetry (7) by the graded commutativity (1) of string products. Eq. (7) is a consequence of the important fact that the string products are defined with the aid of the multilinear string functions [20, (7.72)]

$$\mathcal{H}_{rel}^{\otimes (n+1)} \ni B_0 \times \cdots \times B_n \longmapsto \{B_0, \dots, B_n\}_g \in \mathbb{C}$$

by [20, (4.33)]

$$[B_1, \dots, B_n]_g := \sum_t (-1)^{\Phi_t}\, \Phi_t \cdot \{\Phi^t, B_1, \dots, B_n\}_g. \tag{8}$$

Let us show that the graded commutativity [20, (4.36)]

$$\{B_0, \dots, B_i, B_{i+1}, \dots, B_n\}_g = (-1)^{B_i B_{i+1}} \{B_0, \dots, B_{i+1}, B_i, \dots, B_n\}_g$$

of the string multilinear functions implies the antisymmetry of the element in (7). Indeed, because of (6), we may write (8) as

$$[B_1, \dots, B_n]_g = \sum_t (-1)^{\Phi_t}\, \Phi^t \cdot \{\Phi_t, B_1, \dots, B_n\}_g,$$

thus the element in (7) takes the form

$$\sum_{s,t} (-1)^{\Phi_t} (\Phi_s \otimes \Phi^t) \cdot \{\Phi_t, \Phi^s, B_1, \dots, B_{n-1}\}_g.$$

The antisymmetry we are proving means that

$$\sum_{s,t} (-1)^{\Phi_t}\, \Phi_s \otimes \Phi^t \cdot \{\Phi_t, \Phi^s, B_1, \dots, B_{n-1}\}_g = -\sum_{s,t} (-1)^{\Phi_t + \Phi_s \Phi^t}\, \Phi^t \otimes \Phi_s \cdot \{\Phi_t, \Phi^s, B_1, \dots, B_{n-1}\}_g.$$

The replacement $t \longleftrightarrow s$ in the right-hand side of the above equation gives

$$-\sum_{s,t} (-1)^{\Phi_s + \Phi_t \Phi^s}\, \Phi^s \otimes \Phi_t \cdot \{\Phi_s, \Phi^t, B_1, \dots, B_{n-1}\}_g$$


which can be further rewritten, using the graded commutativity of string functions, as

$$-\sum_{s,t} (-1)^{\Phi_s + \Phi_t \Phi^s + \Phi_s \Phi^t}\, \Phi^s \otimes \Phi_t \cdot \{\Phi^t, \Phi_s, B_1, \dots, B_{n-1}\}_g. \tag{9}$$

Since $\mathrm{gh}(\Phi_s) \equiv \mathrm{gh}(\Phi^s) + 1 \pmod 2$ and $\mathrm{gh}(\Phi_t) \equiv \mathrm{gh}(\Phi^t) + 1 \pmod 2$, $\mathrm{gh}(\Phi_s)\mathrm{gh}(\Phi_t) \equiv \mathrm{gh}(\Phi^s)\mathrm{gh}(\Phi^t) + \mathrm{gh}(\Phi^s) + \mathrm{gh}(\Phi^t) + 1 \pmod 2$, therefore the sign factor in (9) is $(-1)^{\Phi_t}$. This proves the claim.

2. Sign Interlude and the Definition

In this brief section we rewrite the axioms of string products into a more usual and convenient formalism. All algebraic objects will be considered over a fixed field $\mathbf{k}$ of characteristic zero. This, of course, includes the case $\mathbf{k} = \mathbb{C}$ of the previous section. We will systematically use the Koszul sign convention meaning that whenever we commute two “things” of degrees $p$ and $q$, respectively, we multiply by the sign factor $(-1)^{pq}$. Our conventions concerning graded vector spaces, permutations, shuffles, etc., will follow closely those of [15].

For graded indeterminates $x_1, \dots, x_n$ and a permutation $\sigma \in \Sigma_n$ define the Koszul sign $\varepsilon(\sigma) = \varepsilon(\sigma; x_1, \dots, x_n)$ by

$$x_1 \wedge \cdots \wedge x_n = \varepsilon(\sigma; x_1, \dots, x_n) \cdot x_{\sigma(1)} \wedge \cdots \wedge x_{\sigma(n)},$$

which is to be satisfied in the free graded commutative algebra $\wedge(x_1, \dots, x_n)$. Define also $\chi(\sigma) := \chi(\sigma; x_1, \dots, x_n) := \mathrm{sgn}(\sigma) \cdot \varepsilon(\sigma; x_1, \dots, x_n)$. We say that $\sigma \in \Sigma_n$ is an $(i, j)$-unshuffle, $i + j = n$, if $\sigma(1) < \cdots < \sigma(i)$ and $\sigma(i+1) < \cdots < \sigma(n)$. In this case we write $\sigma \in \mathrm{unsh}(i, j)$. In the obvious similar manner one may introduce $(i, j, k)$-unshuffles, etc.

Let us denote, for a graded vector space $U$, by $\uparrow U$ (resp. $\downarrow U$) the suspension (resp. the desuspension) of $U$, i.e. the graded vector space defined by $(\uparrow U)_p := U_{p-1}$ (resp. $(\downarrow U)_p := U_{p+1}$). We have the obvious natural maps $\uparrow : U \to \uparrow U$ and $\downarrow : U \to \downarrow U$. For a graded vector space $U$, let its reflection $r(U)$ be the graded vector space defined by $r(U)_p := U_{-p}$. There is an obvious natural map $r : U \to r(U)$. Observe that $r^2 = 1$, $r \circ \uparrow = \downarrow \circ\, r$ and $r \circ \downarrow = \uparrow \circ\, r$.

Take now $V := r(\downarrow \mathcal{H}_{rel})$. Define, for each $g \ge 0$ and $n \ge 0$, multilinear maps $l_n^g : V^{\otimes n} \to V$ by

$$l_n^g(v_1, \dots, v_n) := (-1)^{(n-1)v_1 + (n-2)v_2 + \cdots + v_{n-1}} \downarrow [\uparrow r(v_1), \dots, \uparrow r(v_n)]_g,$$

for $v_1, \dots, v_n \in V^{\otimes n}$. Define also the bilinear form $B : V \otimes V \to \mathbb{C}$ by

$$B(u, v) := \langle \uparrow r(u), \uparrow r(v) \rangle \tag{10}$$

and, finally, the element $h = h_s \otimes h^s$ by $h_s := (-1)^{\Phi_s} r(\downarrow \Phi_s)$, $h^s := r(\downarrow \Phi^s)$, which means that $h_s \otimes h^s := (-1)^{\Phi_s} r(\downarrow \Phi_s) \otimes r(\downarrow \Phi^s)$ (Einstein summation convention). A technical, but absolutely straightforward, calculation shows that the above structure is an example of a loop homotopy Lie algebra in the sense of the following definition.


Definition 1. A loop homotopy Lie algebra is a triple $\mathcal{V} = (V, B, \{l_n^g\})$ consisting of

(i) a $\mathbb{Z}$-graded vector space $V$, $V_* = \bigoplus V_i$,

(ii) a graded symmetric nondegenerate bilinear degree $+3$ form $B : V \otimes V \to \mathbf{k}$, and

(iii) the set $\{l_n^g\}_{n, g \ge 0}$ of degree $n - 2$ multilinear antisymmetric operations $l_n^g : V^{\otimes n} \to V$.

These data are supposed to satisfy the following two axioms:

(A1) For any $n, g \ge 0$ and $v_1, \dots, v_n \in V$, the following “main identity”

$$0 = \sum_{\substack{k + l = n + 1 \\ g_1 + g_2 = g}}\ \sum_{\sigma \in \mathrm{unsh}(l, n - l)} \chi(\sigma)(-1)^{l(k-1)}\, l_k^{g_1}\bigl(l_l^{g_2}(v_{\sigma(1)}, \dots, v_{\sigma(l)}), v_{\sigma(l+1)}, \dots, v_{\sigma(n)}\bigr) + \frac{1}{2} \sum_s (-1)^{h_s + n}\, l_{n+2}^{g-1}(h_s, h^s, v_1, \dots, v_n) \tag{11}$$

holds. In the second term, $\{h_s\}$ and $\{h^s\}$ are bases of the vector space $V$ dual to each other in the sense that

$$B(h_s, h^t) = \delta_s^t. \tag{12}$$

(A2) The element

$$(-1)^{(n+1) h_s}\, h_s \otimes l_n^g(h^s, v_1, \dots, v_{n-1}) \in V \otimes V \tag{13}$$

is symmetric, for all $g \ge 0$, $n \ge 0$, and $v_1, \dots, v_{n-1} \in V$.

Remark 1. To give a reasonable meaning to the “basis $\{h_s\}$ of $V$”, we must suppose either that $V$ is finite dimensional, or that it has a suitable topology, as in the case of string products. We will always tacitly assume that assumptions of this form have been made. In the “main identity” for $g = 0$ we put, by definition, $l_n^{-1} = 0$. Because $\deg(h_s) + \deg(h^s) = -3$, $\deg(h_s)\deg(h^s)$ is even. The graded symmetry of $B$ then implies that, besides (12), also $B(h^s, h_t) = \delta^s_t$. The element $h = h_s \otimes h^s$ is easily seen to be symmetric, $h_s \otimes h^s = (-1)^{h_s h^s} h^s \otimes h_s = h^s \otimes h_s$. For $n = 0$ axiom (A1) gives

$$0 = \sum_{g_1 + g_2 = g} l_1^{g_1}\bigl(l_0^{g_2}(\cdot)\bigr) + \frac{1}{2} \sum_s (-1)^{h_s}\, l_2^{g-1}(h_s, h^s),$$

while for $n = 1$ it gives

$$0 = \sum_{g_1 + g_2 = g} \Bigl(l_1^{g_1}\bigl(l_1^{g_2}(v)\bigr) + l_2^{g_1}\bigl(l_0^{g_2}(\cdot), v\bigr)\Bigr) - \frac{1}{2} \sum_s (-1)^{h_s}\, l_3^{g-1}(h_s, h^s, v), \tag{14}$$

for all $v \in V$. From this moment on, we will assume that $l_0^g = 0$, for all $g \ge 0$, that is, the theory has “no constants”. This assumption is not really necessary, but it will considerably simplify our exposition.
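The unshuffles and the signs $\chi(\sigma)$ that index the first sum of the main identity (11) can be enumerated mechanically. The following Python sketch is only an illustration; the function names and the 0-based indexing of permutations are choices of this example, not notation from the text. It computes the Koszul sign as a product over inversions, which is the standard way to evaluate the defining relation in a free graded commutative algebra.

```python
from itertools import combinations

def koszul_sign(sigma, degrees):
    """epsilon(sigma; x_1..x_n): one factor (-1)^{|x_a||x_b|} per inversion of sigma.

    sigma is a tuple of 0-based indices, degrees[i] is the degree of x_{i+1}.
    """
    sign = 1
    n = len(sigma)
    for a in range(n):
        for b in range(a + 1, n):
            if sigma[a] > sigma[b] and (degrees[sigma[a]] * degrees[sigma[b]]) % 2 == 1:
                sign = -sign
    return sign

def sgn(sigma):
    """Ordinary signature of the permutation sigma."""
    inversions = sum(1 for a in range(len(sigma))
                     for b in range(a + 1, len(sigma)) if sigma[a] > sigma[b])
    return -1 if inversions % 2 else 1

def chi(sigma, degrees):
    """chi(sigma) = sgn(sigma) * epsilon(sigma; x_1, ..., x_n)."""
    return sgn(sigma) * koszul_sign(sigma, degrees)

def unshuffles(i, n):
    """All (i, n - i)-unshuffles of {0, ..., n - 1}, as tuples."""
    for first in combinations(range(n), i):
        rest = tuple(k for k in range(n) if k not in first)
        yield first + rest

for sigma in unshuffles(2, 3):
    print(sigma, chi(sigma, (1, 0, 1)))
```

For two odd indeterminates the transposition has $\varepsilon = -1$ and hence $\chi = +1$, while for two even ones $\chi = -1$, in line with the antisymmetry required of the operations $l_n^g$.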


Exercise 1. Let us denote $\partial := l_1^0$. Equation (14) implies that $\partial^2 = 0$ (recall our assumption $l_0^g = 0$!). Thus $\partial$ is a degree $-1$ differential on the space $V$. The symmetry of $h_s \otimes \partial(h^s)$ (Axiom (A2) with $n = 1$ and $g = 1$) is equivalent to the $\partial$-invariance of the form $B$, $B(\partial u, v) + (-1)^{u} B(u, \partial v) = 0$, for $u, v \in V$.

The tree level. Let us discuss the “tree level” ($g = 0$) specialization of the above structure. The only nontrivial $l_n^g$'s are $l_n := l_n^0$, $n \ge 1$. The main identity (11) for $g = 0$ reduces to

$$0 = \sum_{k + l = n + 1}\ \sum_{\sigma \in \mathrm{unsh}(l, n - l)} \chi(\sigma)(-1)^{l(k-1)}\, l_k\bigl(l_l(v_{\sigma(1)}, \dots, v_{\sigma(l)}), v_{\sigma(l+1)}, \dots, v_{\sigma(n)}\bigr) \tag{15}$$

while, for $g = 1$ it gives (after forgetting the overall factor $\frac{(-1)^n}{2}$)

$$0 = \sum_s (-1)^{h_s}\, l_{n+2}(h_s, h^s, v_1, \dots, v_n). \tag{16}$$

Axiom (A2) says that the elements

$$(-1)^{(n+1) h_s}\, h_s \otimes l_n(h^s, v_1, \dots, v_n) \tag{17}$$

are symmetric. We immediately recognize (15) as the defining axiom for strongly homotopy Lie algebras [13, Def. 2.1]. Thus the tree level loop homotopy Lie algebra is a strongly homotopy Lie algebra $(V, \{l_n\})$ with an additional structure given by a bilinear form $B$ such that the element $h = h_s \otimes h^s$, uniquely determined by $B$, satisfies (16) and (17). We see that the “tree-level” specialization is a richer structure than just a strongly homotopy Lie algebra as it is usually understood. A proper name for such a structure would be a cyclic strongly homotopy Lie algebra.

3. Higher Order (Co)derivations

In this section we investigate properties of higher order coderivations of cofree cocommutative coalgebras. Because this paper is meant for humans, not for robots, we derive necessary properties for derivations on free commutative algebras, and then simply dualize the results. This is an absolutely correct procedure, except for one fine point related to the cofreeness, see Remark 3. The following definitions were taken from [1, 3]. Let $A$ be a graded (super) commutative algebra and $\nabla : A \to A$ a homogeneous degree $k$ linear map. We define inductively, for each $n \ge 1$, degree $k$ linear deviations $\Phi^n_\nabla : A^{\otimes n} \to A$ by

$$\begin{aligned}
\Phi^1_\nabla(a) &:= \nabla(a), \\
\Phi^2_\nabla(a, b) &:= \nabla(ab) - \nabla(a) b - (-1)^{ka} a \nabla(b), \\
\Phi^3_\nabla(a, b, c) &:= \nabla(abc) - \nabla(ab) c - (-1)^{a(b+c)} \nabla(bc) a - (-1)^{c(a+b)} \nabla(ca) b \\
&\qquad + \nabla(a) bc + (-1)^{a(b+c)} \nabla(b) ca + (-1)^{c(a+b)} \nabla(c) ab, \\
&\ \ \vdots \\
\Phi^{n+1}_\nabla(a_1, \dots, a_{n+1}) &:= \Phi^n_\nabla(a_1, \dots, a_n a_{n+1}) - \Phi^n_\nabla(a_1, \dots, a_n)\, a_{n+1} \\
&\qquad - (-1)^{a_n \cdot a_{n+1}}\, \Phi^n_\nabla(a_1, \dots, a_{n-1}, a_{n+1})\, a_n.
\end{aligned}$$


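As a matter of fact, it is possible to give a non-inductive formula for $\Phi^n_\nabla$, namely

$$\Phi^n_\nabla(a_1, \dots, a_n) = \sum_{1 \le i \le n}\ \sum_{\sigma \in \mathrm{unsh}(i, n - i)} (-1)^{n-i}\, \varepsilon(\sigma)\, \nabla(a_{\sigma(1)} \cdots a_{\sigma(i)})\, a_{\sigma(i+1)} \cdots a_{\sigma(n)}. \tag{18}$$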
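Formula (18) is easy to evaluate mechanically in the ungraded case, where all Koszul signs equal $+1$. The Python sketch below does this for polynomials in one variable, stored as coefficient lists, with $\nabla = d^2/dx^2$ as a sample operator. The choice of algebra, of operator, and all function names are assumptions of this example. It lets one observe that the second deviation of $d^2/dx^2$ does not vanish, while the third deviation does.

```python
from itertools import combinations

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists [c0, c1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Add polynomials of possibly different lengths."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def product(polys):
    """Product of a list of polynomials (the empty product is 1)."""
    out = [1]
    for p in polys:
        out = poly_mul(out, p)
    return out

def second_derivative(p):
    """The sample operator nabla = d^2/dx^2."""
    return [k * (k - 1) * c for k, c in enumerate(p)][2:] or [0]

def deviation(nabla, args):
    """Phi^n_nabla(a_1, ..., a_n) via (18), ungraded case (all Koszul signs +1)."""
    n = len(args)
    total = [0]
    for i in range(1, n + 1):
        for first in combinations(range(n), i):
            rest = [k for k in range(n) if k not in first]
            term = poly_mul(nabla(product([args[k] for k in first])),
                            product([args[k] for k in rest]))
            sign = (-1) ** (n - i)
            total = poly_add(total, [sign * c for c in term])
    return total

def is_zero(p):
    return all(c == 0 for c in p)

x = [0, 1]
print(is_zero(deviation(second_derivative, [x, x])))     # second deviation
print(is_zero(deviation(second_derivative, [x, x, x])))  # third deviation
```

Each choice of the first $i$ indices by `combinations` corresponds to exactly one $(i, n - i)$-unshuffle, so the double loop runs over the same index set as the double sum in (18).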

We say that $\nabla$ is a derivation of order $r$ if $\Phi^{r+1}_\nabla$ is identically zero. In this case we write $\nabla \in \mathrm{Der}^r_k(A)$, where $k = \deg(\nabla)$. In the following proposition, which was stated in [1], $[-, -]$ denotes the graded anticommutator of endomorphisms.

Proposition 1. The subspaces $\mathrm{Der}^r_k(A)$ satisfy:

(i) $\mathrm{Der}^1_k(A) \subset \mathrm{Der}^2_k(A) \subset \mathrm{Der}^3_k(A) \subset \cdots$,

(ii) $\mathrm{Der}^r_k(A) \circ \mathrm{Der}^s_l(A) \subset \mathrm{Der}^{r+s}_{k+l}(A)$, and

(iii) $[\mathrm{Der}^r_k(A), \mathrm{Der}^s_l(A)] \subset \mathrm{Der}^{r+s-1}_{k+l}(A)$.

Let now $A = \wedge X$ be the free graded commutative algebra on the graded vector space $X$. Let us prove the following useful proposition.

Proposition 2. Let $\nabla \in \mathrm{Der}^r_k(\wedge X)$. Then $\nabla$ is uniquely determined by its values on the products $x_1 \cdots x_s$, $s \le r$, $x_i \in X$ for $1 \le i \le s$. In particular, $\nabla = 0$ if and only if $\nabla(x_1 \cdots x_s) = 0$, for $x_1 \cdots x_s$ as above.

Proof. Since $\nabla \in \mathrm{Der}^r_k(\wedge X)$ is linear, it is enough to prove that $\nabla(x_1 \cdots x_s) = 0$ for all $s \le r$ implies that $\nabla(x_1 \cdots x_n) = 0$ for each $n$. This we prove inductively. Suppose we already know $\nabla(x_1 \cdots x_k) = 0$, for each $k \le n$, $n \ge r$, and consider $\nabla(x_1 \cdots x_{n+1})$. We compute from (18) that

$$\Phi^{n+1}_\nabla(x_1, \dots, x_{n+1}) = \nabla(x_1 \cdots x_{n+1}) + \sum_{1 \le i \le n}\ \sum_{\sigma \in \mathrm{unsh}(i, n - i + 1)} (-1)^{n-i+1}\, \varepsilon(\sigma)\, \nabla(x_{\sigma(1)} \cdots x_{\sigma(i)})\, x_{\sigma(i+1)} \cdots x_{\sigma(n+1)}.$$

Since $\nabla \in \mathrm{Der}^r_k(\wedge X)$ and $n \ge r$, $\Phi^{n+1}_\nabla(x_1, \dots, x_{n+1}) = 0$, while the terms in the sum are zero by the inductive assumption. Thus $\nabla(x_1 \cdots x_{n+1}) = 0$ and the induction may go on.

Remark 2. 1-derivations are ordinary derivations, $\mathrm{Der}^1_k(A) = \mathrm{Der}_k(A)$. Proposition 2 then states the standard fact that derivations on free algebras are given by their restrictions to the space of generators.

For a fixed $n$, we denote by $\wedge^n X$ the subspace of $\wedge X$ spanned by the products $x_1 \cdots x_n$, $x_i \in X$, $1 \le i \le n$; we put, by definition, $\wedge^0 X := \mathbf{k}$. Let $\iota_n : \wedge^n X \hookrightarrow \wedge X$ be the inclusion. The following proposition says that $r$-derivations of the free algebra $\wedge X$ are in one-to-one correspondence with $r$-tuples of linear maps, $\{f_s : \wedge^s X \to \wedge X\}_{1 \le s \le r}$.

Proposition 3. Suppose we are given homogeneous degree $k$ linear maps $f_s : \wedge^s X \to \wedge X$, for $1 \le s \le r$. Then there exists a unique order $r$ derivation $\nabla \in \mathrm{Der}^r_k(\wedge X)$ such that

$$\nabla \circ \iota_s = f_s, \quad \text{for } 1 \le s \le r. \tag{19}$$


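Proof. The uniqueness follows immediately from Proposition 2. To prove the existence, observe first that, given degree $k$ linear maps $g_s : \wedge^s X \to \wedge X$, $1 \le s \le r$, the formula

$$\nabla(x_1 \cdots x_n) := \sum_{1 \le s \le \min(r, n)}\ \sum_{\sigma \in \mathrm{unsh}(s, n - s)} \varepsilon(\sigma)\, g_s(x_{\sigma(1)} \cdots x_{\sigma(s)})\, x_{\sigma(s+1)} \cdots x_{\sigma(n)}$$

defines an order $r$ derivation. Condition (19) then leads to the following system of equations:

$$\begin{aligned}
f_1(x_1) &= g_1(x_1), \\
f_2(x_1 x_2) &= g_2(x_1 x_2) + g_1(x_1) x_2 + (-1)^{x_1 x_2} g_1(x_2) x_1, \\
&\ \ \vdots \\
f_r(x_1 \cdots x_r) &= \sum_{1 \le s \le r}\ \sum_{\sigma \in \mathrm{unsh}(s, r - s)} \varepsilon(\sigma)\, g_s(x_{\sigma(1)} \cdots x_{\sigma(s)})\, x_{\sigma(s+1)} \cdots x_{\sigma(r)}.
\end{aligned}$$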

This system can obviously be solved for gs , 1 ≤ s ≤ r. Let us turn our attention to coalgebras. Suppose that C = (C, +) is a cocommutative coassociative coalgebra. To define higher-order coderivations of C, we need analogs of the deviations r∇ introduced above. By duality, we define, for any homogeneous degree n : C → C ⊗n inductively k linear endomorphism , of C, degree k multilinear maps -, as 1 -, := ,, 2 -, := + ◦ , − (,⊗1) ◦ + − (1⊗,) ◦ +,

3 -, := +[3] ◦, − (+⊗1)◦(,⊗1)◦+ − T312 ◦(+⊗1)◦(,⊗1)◦+

− T231 ◦(+⊗1)◦(,⊗1)◦+ + (,⊗12 )◦+[3] + T312 ◦(,⊗12 )◦+[3]

+ T231 ◦(,⊗12 )◦+[3] .. . n+1 n n n -, := (1n−1 ⊗+) ◦ -, − (-, ⊗1) ◦ + − T1,2,... ,n−1,n+1,n ◦ (-, ⊗1) ◦ +,

where +[3] := (+⊗1)+ (= (1⊗+)+ by the coassociativity) and, for σ ∈ n , Tσ (1)···σ (n) : C ⊗n → C ⊗n is defined by Tσ (1)···σ (n) (x1 ⊗ · · · ⊗ xn ) := (σ )(xσ (1) ⊗ · · · ⊗ xσ (n) . r+1 is identically We say that a linear map , : C → C is an order r coderivation, if -, r zero. Let coDer k (C) be the space of all such maps. The following proposition is an exact dual of Proposition 1.


Proposition 4. The subspaces coDer rk (C) satisfy: (i) coDer 1k (C) ⊂ coDer 2k (C) ⊂ coDer 3k (C) ⊂ · · · , (ii) coDer rk (C) ◦ coDer sl (C) ⊂ coDer r+s k+l (C), and s r (iii) [coDer k (C), coDer l (C)] ⊂ coDer r+s−1 k+l (C). Let W be a graded vector space and consider again the free graded commutative algebra ∧W on W . We introduce on ∧W a cocommutative coassociative comultiplication + = 1⊗1 + + + 1⊗1 by defining the reduced diagonal + as +(w1 · · · wn ) = (σ )(wσ (1) · · · wσ (i) ) ⊗ (wσ (i+1) · · · wσ (n) ), 1≤i≤n−1 σ

w1 · · · wn ∈ ∧n W , where σ runs through all (i, n − i) unshuffles. We denote the coalgebra (∧W, +) by c∧W . Remark 3. Here it must be pointed out that c∧W is not the cofree cocommutative coassociative coalgebra cogenerated by W , as it is generally supposed to be. It is the cofree coalgebra in the category of connected coalgebras, see the discussion in [13, p. 2150]. Denote by πn : c∧W → ∧n W the natural projection of vector spaces. The following theorem is the exact dual of Proposition 3. Proposition 5. For each r-tuple us : c∧W → ∧s W , 1 ≤ s ≤ r, of homogeneous degree k linear maps there exists a unique order r coderivation , ∈ coDer rk (c∧W ) such that πs ◦, = us , for 1 ≤ s ≤ r.

(20)

4. Loop Homotopy Lie Algebras – 1st Description

We already observed at the end of Sect. 2 that strongly homotopy Lie algebras are closely related to the “tree level” specializations of loop homotopy Lie algebras. Recall [13, Theorem 2.3] that strongly homotopy Lie algebras have the following characterization.

Proposition 6. There exists a one-to-one correspondence between strongly homotopy Lie algebra structures on a graded vector space $V$ and degree $-1$ coderivations $\delta \in \mathrm{coDer}_{-1}({}^c\!\wedge W)$, $W := \uparrow V$, with the property $\delta^2 = 0$.

In this section we give a similar characterization for loop homotopy Lie algebras. Suppose that the vector space $V$ and the bilinear form $B$ is the same as in Def. 1. Let $h = h_s \otimes h^s \in (V \otimes V)_{-3}$ be as in (12) (of course, $h$ is uniquely determined by the nondegenerate form $B$). Let $W := \uparrow V$ and $y = y_s \otimes y^s := \uparrow h_s \otimes \uparrow h^s \in (W \otimes W)_{-1}$. Because $h$ is symmetric, $y$ is symmetric as well, thus, in fact, $y = y_s y^s \in (\wedge^2 W)_{-1}$. Let us consider the extension ${}^c\!\wedge W[t]$ of ${}^c\!\wedge W$ over the polynomial ring $\mathbf{k}[t]$, ${}^c\!\wedge W[t] := {}^c\!\wedge W \otimes_{\mathbf{k}} \mathbf{k}[t]$. By Proposition 5, there exists a unique coderivation $\theta \in \mathrm{coDer}^2_{-1}({}^c\!\wedge W[t])$ such that

$$\pi_1(\theta) = 0 \quad \text{and} \quad \pi_2(\theta)(w) = \begin{cases} 0, & w \in \wedge^n W[t],\ n > 0, \\ \frac{1}{2}\, t y, & w = 1 \in \wedge^0 W \cdot t^0 \cong \mathbf{k}. \end{cases} \tag{21}$$

The rôle of $\theta$ is to incorporate the form $B$ into our theory. In the rest of this section we prove the following theorem.


Theorem 1. Under the above notation, there is a one-to-one correspondence between loop homotopy Lie algebra structures on the graded vector space $V$ and degree $-1$ coderivations $\delta \in \mathrm{coDer}^{-1}_1({}^c\wedge W[t])$ such that $(\delta + \theta)^2 = 0$.

(22)

Let us analyze Eq. (22). It is, of course, equivalent to $\delta^2 + \theta\delta + \delta\theta + \theta^2 = 0$.

(23)

Sublemma 1. Under the above notation, $\theta^2 = 0$, $\delta^2 \in \mathrm{coDer}^{-2}_1({}^c\wedge W[t])$, and $(\theta\delta + \delta\theta) \in \mathrm{coDer}^{-2}_2({}^c\wedge W[t])$.

Proof. For $w_1 \cdots w_n \in \wedge^n W$ obviously

$$\theta(w_1 \cdots w_n) = \tfrac{1}{2}\, t\, y_s y^s w_1 \cdots w_n, \qquad (24)$$

thus

$$\theta^2(w_1 \cdots w_n) = \tfrac{1}{4}\, t^2\, y_s y^s y_t y^t w_1 \cdots w_n. \qquad (25)$$

The graded commutativity implies that

$$y_s y^s y_t y^t = (-1)^{(y_s + y^s)(y_t + y^t)}\, y_t y^t y_s y^s = -\, y_t y^t y_s y^s.$$

On the other hand, the substitution $s \leftrightarrow t$ gives $y_s y^s y_t y^t = y_t y^t y_s y^s$, therefore $y_t y^t y_s y^s = 0$, and $\theta^2 = 0$ by (25). The remaining two statements follow from Proposition 4(iii) and the observation that $\delta^2 = \tfrac{1}{2}[\delta, \delta]$ and $\theta\delta + \delta\theta = [\delta, \theta]$.

By Sublemma 1, (23) reduces to $\delta^2 + \theta\delta + \delta\theta = 0$.

(26)

By the same sublemma and Proposition 1(i), $\delta^2 + \theta\delta + \delta\theta$ is an order 2 coderivation. Thus (26) is, by Proposition 5, equivalent to

$$\pi_1(\delta^2 + \theta\delta + \delta\theta) = 0, \qquad (27)$$

and

$$\pi_2(\delta^2 + \theta\delta + \delta\theta) = 0. \qquad (28)$$

Because, by (21), $\pi_1(\theta) = 0$, Eq. (27) further reduces to $\pi_1(\delta^2 + \delta\theta) = 0$.

(29)

To understand better the meaning of this equation, let us introduce, for any $g \ge 0$ and $n \ge 0$, linear maps $\delta_n^g : \wedge^n W \to W$ by

$$\delta_n^g(w_1 \cdots w_n) := \mathrm{Coef}_g(\pi_1 \delta(w_1 \cdots w_n)), \quad w_1 \cdots w_n \in \wedge^n W, \qquad (30)$$

where $\mathrm{Coef}_g(-)$ is the coefficient at $t^g$. By Proposition 5, the set $\{\delta_n^g\}_{n,g \ge 0}$ uniquely determines the coderivation $\delta$. The explicit formula is (compare explicit formulas for coderivations acting on coalgebras in [14]):

$$\delta(w_1 \cdots w_n) = \sum_{0 \le i \le n} \sum \epsilon(\sigma)\, t^g\, \delta_i^g(w_{\sigma(1)} \cdots w_{\sigma(i)})\, w_{\sigma(i+1)} \cdots w_{\sigma(n)}, \qquad (31)$$

where the summation is taken over all $g \ge 0$ and all $\sigma \in \mathrm{unsh}(i, n-i)$. From this and (24) we obtain

$$\begin{aligned} \pi_1(\delta^2 + \delta\theta)(w_1 \cdots w_n) = {} & \sum_{k+l=n+1}\ \sum_{g_1+g_2=g}\ \sum_{\sigma \in \mathrm{unsh}(l,\, n-l)} \epsilon(\sigma)\, t^g\, \delta_k^{g_1}\bigl(\delta_l^{g_2}(w_{\sigma(1)} \cdots w_{\sigma(l)})\, w_{\sigma(l+1)} \cdots w_{\sigma(n)}\bigr) \\ & + \frac{1}{2} \sum_{s,\, g \ge 0} t^{g+1}\, \delta_{n+2}^{g}(y_s, y^s, w_1, \ldots, w_n). \end{aligned} \qquad (32)$$

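We formulate the result as:

Sublemma 2. Eq. (29) means that, for all $n \ge 0$, $w_1 \cdots w_n \in \wedge^n W$ and $g \ge 0$,

$$\begin{aligned} 0 = {} & \sum_{k+l=n+1}\ \sum_{g_1+g_2=g}\ \sum_{\sigma \in \mathrm{unsh}(l,\, n-l)} \epsilon(\sigma)\, \delta_k^{g_1}\bigl(\delta_l^{g_2}(w_{\sigma(1)} \cdots w_{\sigma(l)})\, w_{\sigma(l+1)} \cdots w_{\sigma(n)}\bigr) \\ & + \frac{1}{2} \sum_s \delta_{n+2}^{g-1}(y_s, y^s, w_1, \ldots, w_n). \end{aligned} \qquad (33)$$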
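The step from (32) to (33) is a comparison of coefficients in $t$. The sketch below spells it out, writing $\epsilon(\sigma)$ for the sign attached to the unshuffle $\sigma$ (a notational assumption) and adopting the convention $\delta_{n+2}^{-1} := 0$, which is also an assumption made here only to cover the case $g = 0$.

```latex
% Requires amsmath. Coefficient of t^g in (32): the first sum already carries t^g,
% while the second sum carries t^{g'+1}, so its t^g-coefficient comes from g' = g - 1.
\begin{align*}
\operatorname{Coef}_g\bigl(\pi_1(\delta^2+\delta\theta)(w_1\cdots w_n)\bigr)
  &= \sum_{k+l=n+1}\ \sum_{g_1+g_2=g}\ \sum_{\sigma\in\mathrm{unsh}(l,n-l)}
     \epsilon(\sigma)\,
     \delta_k^{g_1}\bigl(\delta_l^{g_2}(w_{\sigma(1)}\cdots w_{\sigma(l)})\,
     w_{\sigma(l+1)}\cdots w_{\sigma(n)}\bigr) \\
  &\quad + \frac12 \sum_s \delta_{n+2}^{\,g-1}(y_s, y^s, w_1,\dots,w_n),
  \qquad \delta_{n+2}^{-1} := 0 .
\end{align*}
% Eq. (29) holds if and only if this coefficient vanishes for every g >= 0,
% which is the statement of Sublemma 2.
```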

We will see that Eq. (33) will correspond to the “main identity” (11). Let us make a similar analysis of Eq. (28). Because clearly $\pi_2(\theta\delta) = 0$, it reduces to $\pi_2(\delta^2 + \delta\theta) = 0$.

(34)

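Using similar arguments as above, we obtain, for any $g \ge 0$ and $w_1 \cdots w_n \in \wedge^n W$,

$$\begin{aligned} \mathrm{Coef}_g(\pi_2(\delta^2)(w_1 \cdots w_n)) = {} & \sum_{k+l=n}\ \sum_{g_1+g_2=g}\ \sum_{\sigma \in \mathrm{unsh}(l,\, n-l-1,\, 1)} \epsilon(\sigma)\, \delta_k^{g_1}\bigl(\delta_l^{g_2}(w_{\sigma(1)} \cdots w_{\sigma(l)})\, w_{\sigma(l+1)} \cdots w_{\sigma(n-1)}\bigr)\, w_{\sigma(n)} \\ & + \sum_{p+q=n}\ \sum_{\sigma \in \mathrm{unsh}(p,q)}\ \sum_{g_1+g_2=g} (-1)^{w_{\sigma(1)} + \cdots + w_{\sigma(p)}}\, \epsilon(\sigma)\, \delta_p^{g_1}(w_{\sigma(1)} \cdots w_{\sigma(p)})\, \delta_q^{g_2}(w_{\sigma(p+1)} \cdots w_{\sigma(n)}). \end{aligned} \qquad (35)$$

Here $k + l = n$ because $\delta_k^{g_1}$ is applied to $\delta_l^{g_2}(\cdots)$ together with the $n - l - 1$ elements $w_{\sigma(l+1)}, \ldots, w_{\sigma(n-1)}$. Similarly, we have

$$\begin{aligned} \mathrm{Coef}_g(\pi_2(\delta\theta)(w_1 \cdots w_n)) = {} & \frac{1}{2} \sum_{1 \le i \le n} (-1)^{w_i(w_{i+1} + \cdots + w_n)}\, \delta_{n+1}^{g-1}(y_s y^s w_1 \cdots w_{i-1} w_{i+1} \cdots w_n)\, w_i \\ & + \frac{1}{2} \sum_s (-1)^{y_s(y^s + w_1 + \cdots + w_n)}\, \delta_{n+1}^{g-1}(y^s w_1 \cdots w_n)\, y_s \\ & + \frac{1}{2} \sum_s (-1)^{y^s(w_1 + \cdots + w_n)}\, \delta_{n+1}^{g-1}(y_s w_1 \cdots w_n)\, y^s. \end{aligned} \qquad (36)$$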


Now, assuming (33), it is immediate to see that the first term on the right-hand side of (35) is minus the first term on the right-hand side of (36). The symmetry $y_s y^s = (-1)^{y_s y^s} y^s y_s$ implies that the second and third terms on the right-hand side of (36) are the same, both equal to $\frac{1}{2} \sum_s (-1)^{y_s}\, y_s\, \delta_{n+1}^{g-1}(y^s w_1 \cdots w_n)$. We formulate these observations as

Sublemma 3. Assuming (33), Eq. (34) is equivalent to

$$\frac{1}{2} \sum_s (-1)^{y_s}\, y_s\, \delta_{n+1}^{g-1}(y^s w_1 \cdots w_n) = 0. \qquad (37)$$

Since we work in the free commutative algebra, (37) is equivalent to the antisymmetry of

$$\frac{1}{2} \sum_s (-1)^{y_s}\, y_s \otimes \delta_{n+1}^{g-1}(y^s w_1 \cdots w_n) \in W \otimes W. \qquad (38)$$

Proof of Theorem 1. Recall that $W = \uparrow V$. The correspondence between the structure operations $\{l_n^g\}_{g,n \ge 0}$ of a loop homotopy Lie algebra and coderivations $\delta$ of Theorem 1 is given by

$$l_n^g(v_1, \ldots, v_n) = (-1)^{(n-1)v_1 + \cdots + v_{n-1}} \downarrow \delta_n^g(\uparrow v_1 \cdots \uparrow v_n), \quad v_1, \ldots, v_n \in V,$$

with the inverse formula

$$\delta_n^g(w_1 \cdots w_n) = (-1)^{n(n-1)/2}\, (-1)^{(n-1)w_1 + \cdots + w_{n-1}} \uparrow l_n^g(\downarrow w_1, \ldots, \downarrow w_n), \quad w_1 \cdots w_n \in \wedge^n W,$$

where the multilinear maps $\{\delta_n^g\}$ were introduced in (30). Observe the sign $(-1)^{n(n-1)/2}$ in the second formula; it is typical for formulas of this type, see [15, Example 1.6]. A routine calculation shows that the substitution $l_n^g \leftrightarrow \delta_n^g$ converts (33) to (11) and that the symmetry of the element in (13) is equivalent to the antisymmetry of the element of (38).

5. Loop Homotopy Lie Algebras – Operadic Approach

In this section we give an operadic characterization of loop homotopy Lie algebras. We will not repeat here all details of the necessary definitions concerning operads, because it would stretch the paper beyond any reasonable limit. Operads are introduced in the classical book [17]. The (co)bar construction over a (co)operad is defined in [9], see also [6]. Cyclic operads are introduced in [7], while modular operads and the corresponding modular (co)bar construction (called the Feynman transform) are introduced in [8]. There is also a nice overview [10]. These sources are easily available; we will thus rely on them and indicate only basic ideas.

Recall that a collection is a system $E = \{E(n)\}_{n \ge 1}$ of graded vector spaces such that each $E(n)$ possesses a right action of the symmetric group $\Sigma_n$. Any collection $E$ extends to a functor (denoted by the same symbol) from the category of finite sets to the category of graded vector spaces with the property that $E(n) = E(\{1, \ldots, n\})$ [6, 1.3]. Let $\mathrm{Tr}_n$ denote the set of rooted (= directed) trees with $n$ labelled leaves. For a tree $T \in \mathrm{Tr}_n$ and a collection $E$, denote ([9, 1.2.13])

$$E(T) := \bigotimes_{v \in \mathrm{Vert}(T)} E(\mathrm{In}(v)),$$

where $\mathrm{Vert}(T)$ is the set of the vertices of $T$ and $\mathrm{In}(v)$ the set of incoming edges of $v$. The free operad on $E$ [9, 2.1.1] is then the collection

$$F(E)(n) := \bigoplus_{T \in \mathrm{Tr}_n} E(T), \quad n \ge 1,$$

with the operadic structure induced by the grafting of underlying trees.

Let $P$ be an operad. Consider the free operad $F(\downarrow sP^*)$ on the collection $(\downarrow sP^*)(n) := \uparrow^{n-2} P^*(n)$, $n \ge 1$, where $(-)^*$ is the linear dual. As proved in [9, 3.2], structure operations of the operad $P$ induce a differential $\partial_D$ on $F(\downarrow sP^*)$. The differential operad $D(P) := (F(\downarrow sP^*), \partial_D)$ is called the (operadic) cobar dual of the operad $P$. It is well-known [9, 4.2.14] that “classical” strongly homotopy Lie algebras are characterized as follows.

Proposition 7. Strongly homotopy Lie algebras are algebras over the cobar dual $D(\mathrm{Com})$ of the operad $\mathrm{Com}$ for commutative algebras.

The above proposition means that a strongly homotopy Lie algebra structure on a differential graded vector space $V = (V, \partial)$ is the same as a morphism $a : D(\mathrm{Com}) \to \mathrm{End}_V$ from the operad $D(\mathrm{Com})$ to the endomorphism operad $\mathrm{End}_V$ of $V$ [9, 1.2.9]. Our aim is to give a similar characterization of loop homotopy Lie algebras, based on a certain generalization of operads, called modular operads.

An intermediate step between ordinary operads and modular operads are cyclic operads, whose definition we briefly recall. A cyclic collection is a system $E = \{E((n))\}_{n \ge 1}$ of graded vector spaces such that each $E((n))$ has a right $\Sigma_{n+1}$-action. Each cyclic collection $E$ induces a functor from the category of finite sets into the category of graded vector spaces (denoted again by $E$) such that $E((\{0, \ldots, n\})) = E((n))$. This notation differs from that of [7] and [5], where $E((\{0, \ldots, n\})) = E((n+1))$. Let $\mathrm{Tur}_n$ denote the set of unrooted trees $T$ with leaves indexed by $\{0, \ldots, n\}$. For a cyclic collection $E$ and a tree $T \in \mathrm{Tur}_n$, let

$$E((T)) := \bigotimes_{v \in \mathrm{Vert}(T)} E((\mathrm{Leg}(v))),$$

where $\mathrm{Leg}(v)$ is the set of all edges of $T$ adjacent to the vertex $v$. A cyclic operad is then a cyclic collection $C = \{C((n))\}_{n \ge 1}$ together with a “coherent” system of “contractions”

$$\alpha_T : C((T)) \to C((n)), \quad T \in \mathrm{Tur}_n,\ n \ge 1, \qquad (39)$$

see [7, Def. 2.1].

Modular operads, anticipated in [5], were introduced by Getzler and Kapranov [8] for the study of moduli spaces of Riemann surfaces of arbitrary genera. Recall that a modular collection is a cyclic collection $E$ with a second grading by the “genus” $g \ge 0$, $E = \{E((g, n))\}_{n \ge 1}$. A modular operad $A$ is then a modular collection which possesses, besides a cyclic operadic structure, also operations $A((g, n+2)) \to A((g+1, n))$. These operations are abstractions of the “self-gluing” which produces, from a surface of genus $g$ with $(n+2)$ punctures, a new surface of genus $g+1$ with $n$ punctures, as indicated in Fig. 1.

Fig. 1. An example of “self-gluing”. The surface on the right has 2 punctures and genus 2. It is obtained from the surface on the left with 4 punctures and genus 1 by sewing along the punctures marked by 3 and 4.

As cyclic operads are characterized by a system of contractions (39) indexed by unrooted trees, there is a similar characterization of modular operads, but based on labelled (or “modular”) graphs rather than trees.

Following [5, 12], by a graph $\Gamma$ we mean a finite set $\mathrm{Flag}(\Gamma)$ (whose elements are called flags or half-edges) together with an involution $\sigma$ and a partition $\lambda$. The vertices $\mathrm{Vert}(\Gamma)$ of a graph $\Gamma$ are the blocks of the partition $\lambda$. The edges $\mathrm{Edg}(\Gamma)$ are pairs of flags forming a two-cycle of $\sigma$ relative to the decomposition of a permutation into disjoint cycles. The legs $\mathrm{Leg}(\Gamma)$ are the fixed points of $\sigma$. We also denote by $\mathrm{Leg}(v)$ the flags belonging to the block $v$ or, in common speech, half-edges adjacent to the vertex $v$.

Each graph $\Gamma$ has its geometric realization, a finite one-dimensional cell complex $|\Gamma|$, obtained by taking one copy of $[0, \tfrac{1}{2}]$ for each flag and imposing the following equivalence relation: the points $0 \in [0, \tfrac{1}{2}]$ are identified for all flags in a block of the partition $\lambda$, and the points $\tfrac{1}{2} \in [0, \tfrac{1}{2}]$ are identified for pairs of flags exchanged by the involution $\sigma$. We will usually make no distinction between a graph and its geometric realization.

A modular or labelled graph is a connected graph $\Gamma$ together with a map $g : \mathrm{Vert}(\Gamma) \to \{0, 1, 2, \ldots\}$. The genus $g(\Gamma)$ of a modular graph $\Gamma$ is the number

$$g(\Gamma) := \dim H_1(|\Gamma|) + \sum_{v \in \mathrm{Vert}(\Gamma)} g(v).$$

Let $\Gamma((g, S))$ be the category whose objects are pairs $(|\Gamma|, \rho)$ consisting of a modular graph $\Gamma$ of genus $g$ and an isomorphism $\rho : \mathrm{Leg}(\Gamma) \to S$ labeling the legs of $\Gamma$ by elements of a finite set $S$. As usual, we write $\Gamma((g, n)) := \Gamma((g, \{0, \ldots, n\}))$. For a modular collection $A = \{A((g, n))\}_{n \ge 1}$ and a modular graph $\Gamma$, let $A((\Gamma))$ be the tensor product

$$A((\Gamma)) := \bigotimes_{v \in \mathrm{Vert}(\Gamma)} A((g(v), \mathrm{Leg}(v))). \qquad (40)$$

A modular operad structure on $A$ is then given by a coherent system of contractions [8, 2.10] $\alpha_\Gamma : A((\Gamma)) \to A((g, S))$, for any $\Gamma \in \Gamma((g, S))$, $g \ge 0$ and a finite set $S$.
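The combinatorial description of a graph by flags, an involution and a partition translates directly into a small data structure. The following is a minimal sketch, not taken from the paper, that computes the legs and the genus of a modular graph. It relies on the standard fact that a connected graph has first Betti number (number of edges) minus (number of vertices) plus 1, legs being contractible and therefore not contributing. The function name `modular_graph_genus` and the dictionary encoding are hypothetical choices made for illustration.

```python
from collections import deque


def modular_graph_genus(sigma, blocks, vertex_genus):
    """Return (genus, legs) of a modular graph.

    sigma        : dict flag -> flag, the involution on flags
    blocks       : dict vertex -> set of flags, the blocks of the partition
    vertex_genus : dict vertex -> non-negative int, the labelling g(v)
    """
    flags = set(sigma)
    if any(sigma[sigma[f]] != f for f in flags):
        raise ValueError("sigma must be an involution")
    covered = set().union(*blocks.values())
    if covered != flags or sum(len(b) for b in blocks.values()) != len(flags):
        raise ValueError("blocks must partition the set of flags")

    owner = {f: v for v, block in blocks.items() for f in block}
    edges = {frozenset((f, sigma[f])) for f in flags if sigma[f] != f}
    legs = {f for f in flags if sigma[f] == f}

    # Breadth-first search over vertices to check connectedness.
    start = next(iter(blocks))
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for f in blocks[v]:
            w = owner[sigma[f]]
            if w not in seen:
                seen.add(w)
                queue.append(w)
    if seen != set(blocks):
        raise ValueError("a modular graph must be connected")

    first_betti = len(edges) - len(blocks) + 1  # dim H_1 of a connected graph
    return first_betti + sum(vertex_genus[v] for v in blocks), legs


if __name__ == "__main__":
    # The self-gluing of Fig. 1: one vertex of genus 1 with flags 1..4,
    # flags 3 and 4 sewn together into a loop.
    print(modular_graph_genus({1: 1, 2: 2, 3: 4, 4: 3}, {"v": {1, 2, 3, 4}}, {"v": 1}))
```

In the example, the single loop contributes 1 to $\dim H_1$ and the vertex contributes its own genus 1, which reproduces the genus 2 surface with 2 punctures described in the caption of Fig. 1.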


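Example 1. Let $V = (V, B)$ be a differential graded vector space with a graded symmetric inner product $B : V \otimes V \to k$. Let us define, for each $g \ge 0$ and a finite set $S$, $\mathrm{End}_V((g, S)) := V^{\otimes S}$ (the tensor product of copies of $V$ indexed by $S$). It follows from the definition that, for any $\Gamma \in \Gamma((g, S))$, $\mathrm{End}_V((\Gamma)) = V^{\otimes \mathrm{Flag}(\Gamma)}$. Let $B^{\otimes \mathrm{Edg}(\Gamma)} : V^{\otimes \mathrm{Flag}(\Gamma)} \to V^{\otimes \mathrm{Leg}(\Gamma)}$ be the multilinear form which contracts the factors of $V^{\otimes \mathrm{Flag}(\Gamma)}$ corresponding to the flags which are paired up as edges of $\Gamma$. Then we define $\alpha_\Gamma : \mathrm{End}_V((\Gamma)) \to \mathrm{End}_V((g, S))$ to be the map

$$\alpha_\Gamma : \mathrm{End}_V((\Gamma)) \cong V^{\otimes \mathrm{Flag}(\Gamma)} \xrightarrow{\ B^{\otimes \mathrm{Edg}(\Gamma)}\ } V^{\otimes \mathrm{Leg}(\Gamma)} \cong V^{\otimes S} = \mathrm{End}_V((g, S)). \qquad (41)$$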

It is easy to show that the contractions $\{\alpha_\Gamma \mid \Gamma \in \Gamma((g, S))\}$ define on $\mathrm{End}_V$ the structure of a modular operad.

We would like to modify Example 1 to the situation when the degree of the form $B$ is $+3$, as in the definition of a loop homotopy Lie algebra. Formula (41) does not work, among other things also because $\alpha_\Gamma$ will not be of degree zero. For this modification we need to introduce “twisted” modular operads.

If $X$ is a finite set with $\mathrm{card}(X) = s$, let $\mathrm{Det}(X) := \wedge^s((\downarrow k)^{\oplus X})$, the top dimensional piece of the $s$-fold exterior power of the direct sum of the copies of $\downarrow k$ indexed by elements of $X$. Clearly $\mathrm{Det}(X)$ is a one-dimensional vector space concentrated in degree $-s$. The determinant of a graph $\Gamma \in \Gamma((g, S))$ is defined by $\mathrm{Det}(\Gamma) := \mathrm{Det}(\mathrm{Edg}(\Gamma))$. A twisted modular operad ([5, p. 293], also called a K-modular operad in [7]) is then a modular collection $A$ together with a coherent system of contractions $\tilde{\alpha}_\Gamma : A((\Gamma)) \otimes \mathrm{Det}(\Gamma) \to A((g, S))$, for any $\Gamma \in \Gamma((g, S))$, $g \ge 0$ and a finite set $S$.

Example 2. Let $W = (W, H)$ be a graded vector space with a nondegenerate degree $-1$ symmetric bilinear form $H$. Define the modular collection $\widetilde{\mathrm{End}}_W$ by $\widetilde{\mathrm{End}}_W((g, S)) := W^{\otimes S}$, for $g \ge 0$ and a finite set $S$. For $\Gamma \in \Gamma((g, S))$, the twisted modular contraction $\tilde{\alpha}_\Gamma : \widetilde{\mathrm{End}}_W((\Gamma)) \otimes \mathrm{Det}(\Gamma) \to \widetilde{\mathrm{End}}_W((g, S))$ is defined as follows. Let us choose labels $s_e, t_e$ such that $e = \{s_e, t_e\}$ for each edge $e \in \mathrm{Edg}(\Gamma)$ and define $\tilde{\alpha}_\Gamma$ to be the composition:

$$\begin{aligned} \widetilde{\mathrm{End}}_W((\Gamma)) \otimes \mathrm{Det}(\Gamma) & \cong W^{\otimes \mathrm{Flag}(\Gamma)} \otimes \mathrm{Det}(\Gamma) \\ & \cong W^{\otimes S} \otimes \bigotimes_{e \in \mathrm{Edg}(\Gamma)} \bigl(W^{\otimes \{s_e, t_e\}} \otimes \mathrm{Span}(\downarrow e)\bigr) \\ & \cong W^{\otimes S} \otimes \bigotimes_{e \in \mathrm{Edg}(\Gamma)} \bigl(W_{s_e} \otimes W_{t_e} \otimes \mathrm{Span}(\downarrow e)\bigr) \\ & \xrightarrow{\ 1 \otimes \bigotimes_e H_e\ } W^{\otimes S} \otimes k^{\otimes \mathrm{Edg}(\Gamma)} \cong \widetilde{\mathrm{End}}_W((g, S)), \end{aligned}$$

where $H_e$ is the map that sends $u \otimes v \otimes \downarrow e \in W_{s_e} \otimes W_{t_e} \otimes \mathrm{Span}(\downarrow e)$ to $H(u, v) \in k$. The symmetry of $H$ assures that the definition of $\tilde{\alpha}_\Gamma$ does not depend on the choice of labels. The system $\{\tilde{\alpha}_\Gamma \mid \Gamma \in \Gamma((g, S))\}$ induces on $\widetilde{\mathrm{End}}_W$ the structure of a twisted modular operad.

If $V = (V, B)$ is a graded vector space with a nondegenerate degree $+3$ bilinear symmetric form $B$, then $W = (W, H)$ with $W := \uparrow^2 V$ and the form $H$ defined by $H(u, v) := B(\downarrow^2 u, \downarrow^2 v)$, $u, v \in W$, form the data as in Example 2, so we may consider the twisted modular operad $\widetilde{\mathrm{End}}_{\uparrow^2 V}$.

Another example of a twisted modular operad is provided by the Feynman transform of a modular operad. Recall [8, 4.2] that the free twisted modular operad $\widetilde{M}(E)$ on a modular collection $E$ is given by

$$\widetilde{M}(E)((g, n)) := \operatorname*{colim}_{\Gamma \in \mathrm{Iso}\, \Gamma((g, n))} E((\Gamma)) \otimes \mathrm{Det}(\Gamma),$$

where $\mathrm{Iso}\, \Gamma((g, n))$ is the full subcategory of isomorphisms in $\Gamma((g, n))$. The twisted modular operad structure is induced by the “grafting” of underlying graphs. If $A$ is a modular operad, then $\widetilde{M}(A)((g, n))$ carries a natural differential $\partial_F$ [5, Theorem 4.4]. The twisted differential modular operad $F(A) := (\widetilde{M}(A), \partial_F)$ is called the Feynman transform of the modular operad $A$.

Let us consider the “forgetful” functor $\mathrm{For} : \mathrm{MOp} \to \mathrm{COp}$ from the category of modular operads to the category of cyclic operads given by $\mathrm{For}(A)((S)) := A((0, S))$, for any finite set $S$. It is not difficult to show [16] that this functor has a left adjoint $\mathrm{Mod} : \mathrm{COp} \to \mathrm{MOp}$.

Definition 2. The modular operad $\mathrm{Mod}(P)$ is called the modular operadic completion of the cyclic operad $P$.

An easy calculation shows that $\mathrm{Mod}(\mathrm{Com})((g, n)) \cong k$, for each $g \ge 0$, $n \ge 1$,

(42)

with the trivial action of the symmetric group $\Sigma_{n+1}$. The key role in our characterization is played by the Feynman transform $F(\mathrm{Mod}(\mathrm{Com}))$ of the modular completion of the operad $\mathrm{Com}$. It follows from (42) that, as a nondifferential operad, $F(\mathrm{Mod}(\mathrm{Com}))$ is the free twisted modular operad on the generators $\omega_n^g$,

$$\widetilde{M}(\mathrm{Mod}(\mathrm{Com})) \cong \widetilde{M}(\{\omega_n^g;\ n \ge 1,\ g \ge 0\}), \qquad (43)$$

where $\omega_n^g$ corresponds to the dual of $1 \in k \cong \mathrm{Mod}(\mathrm{Com})((g, n))$. The central result of this section reads as follows.

Theorem 2. There exists a natural one-to-one correspondence between twisted modular $F(\mathrm{Mod}(\mathrm{Com}))$-algebra structures on $(\uparrow^2 V, B(\downarrow^2 -, \downarrow^2 -))$, i.e. morphisms

$$a : \bigl(F(\mathrm{Mod}(\mathrm{Com})), \partial_F\bigr) \to \bigl(\widetilde{\mathrm{End}}_{\uparrow^2 V}, \partial = 0\bigr) \qquad (44)$$

of differential twisted modular operads, and loop homotopy algebra structures on $(V, B)$ in the sense of Def. 1.


Sketch of proof. Description (43) shows that a map $a$ of (44) is determined by its values $\xi_n^g := a(\omega_n^g) \in \widetilde{\mathrm{End}}_{\uparrow^2 V}((g, n))$ on the generators. Moreover, the map $a$ ought to commute with the differentials, so the equation

$$a(\partial_F(\omega_n^g)) = 0 \qquad (45)$$

must be satisfied, for each $g \ge 0$ and $n \ge 1$. Observe that $\xi_n^g \in \widetilde{\mathrm{End}}_{\uparrow^2 V}((g, n))$ can be interpreted as a degree $-2(n+1)$ element of the graded vector space $V^{\otimes (n+1)}$. Let us introduce a map $\Phi : V^{\otimes (n+1)} \to \mathrm{Hom}(V^{\otimes n}, V)$ by

$$\Phi(x_0 \otimes \cdots \otimes x_n)(v_1, \ldots, v_n) := (-1)^{n x_0 + (n-1) x_1 + \cdots + x_{n-1}}\, x_0\, B(x_1, v_1) B(x_2, v_2) \cdots B(x_n, v_n), \qquad (46)$$

for $x_0 \otimes \cdots \otimes x_n \in V^{\otimes (n+1)}$ and $v_1, \ldots, v_n \in V$. The map $\Phi$ is clearly a degree $3n$ isomorphism of $V^{\otimes (n+1)}$ and $\mathrm{Hom}(V^{\otimes n}, V)$. Finally, let $l_n^g : V^{\otimes n} \to V$ be a homogeneous degree $n - 2$ map given by

$$l_n^g(v_1, \ldots, v_n) := (-1)^{\frac{n(n+1)}{2} + n(v_1 + \cdots + v_n)}\, \Phi(\xi_n^g)(v_1, \ldots, v_n), \quad \text{for } v_1, \ldots, v_n \in V.$$

A long but straightforward calculation shows that $l_n^g$ are antisymmetric operations satisfying (13) and that (45) translates to the main identity (11). On the other hand, all steps above can clearly be reversed, thus a loop homotopy Lie algebra structure induces a map (44).

Remark 4. Observe that Theorem 2 is formulated in such a way that the differential $\partial$ on $V$ is a part of the structure, namely $\partial := a(\omega_1^0)$.

6. Possible Generalizations (Open Strings)

Let $P$ be an operad. It is now well-understood what a “strongly homotopy $P$-algebra” is. In the case when $P$ is Koszul, it is an algebra over the cobar construction on the quadratic dual $P^!$ of $P$ [9, Def. 4.2.14]. An alternative characterization is that a homotopy $P$-algebra is a square zero differential on the cofree connected $P^!$-coalgebra. The equivalence of these two characterizations follows for example from [9, Prop. 4.2.15]. The quadratic dual of the operad $\mathrm{Lie}$ for Lie algebras is $\mathrm{Com}$, the operad for commutative associative algebras, and the above characterization gives Proposition 6, resp. Proposition 7. Another example is $P = \mathrm{Ass}$, the operad for associative algebras. It is quadratic self-dual, $P^! = \mathrm{Ass}$, and the corresponding strongly homotopy algebras are called strongly homotopy associative or $A_\infty$-algebras [18, 15].

Let us look for possible generalizations to the loop case. If $P$ is a cyclic operad (recall that both $\mathrm{Lie}$ and $\mathrm{Ass}$ are cyclic), the quadratic dual $P^!$ is again cyclic [7], so it makes sense to consider the modular completion $\mathrm{Mod}(P^!)$ (Def. 2). We suggest the following definition.

Definition 3. Let $P$ be a Koszul cyclic operad. A loop homotopy $P$-algebra is a modular algebra over the twisted differential modular operad $F(\mathrm{Mod}(P^!))$.


For $P = \mathrm{Lie}$ we get Theorem 2. It would be interesting to write out explicitly the axioms of loop homotopy associative algebras, because these structures should play an important rôle in the higher-genera open string field theory, as suggested by [19]. While in the Lie case we had, for each $n$ and $g$, only one antisymmetric operation $l_n^g : V^{\otimes n} \to V$, in the loop homotopy associative case we expect to have

$$\frac{(n+1)!}{2^g \cdot g! \cdot (n+1-2g)!}$$

operations $V^{\otimes n} \to V$, due to the dimension of $\mathrm{Mod}(\mathrm{Ass})((g, n))$.

A seemingly easier approach would be the one based on coderivations. We would like to say that a loop homotopy $P$-algebra is an order 2 coderivation of the cofree connected $P^!$-coalgebra, having properties analogous to (22). This works nicely for $P = \mathrm{Lie}$, because we know what a higher order coderivation of a cocommutative coalgebra is. But we are not sure whether there exists a reasonable concept of higher-order coderivations without the cocommutativity, though the paper [4] seems to suggest this.

Acknowledgement. I would like to express my gratitude to Jim Stasheff for reading the manuscript and many helpful remarks and suggestions.

References

1. Akman, F.: On some generalizations of Batalin–Vilkovisky algebras. Preprint q-alg/9506027, June 1995
2. Akman, F.: Multibraces on the Hochschild complex. Preprint q-alg/9702010, February 1997
3. Alfaro, J., Bering, K., Damgaard, P.H.: Algebra of higher antibrackets. Preprint hep-th/9604027, April 1996
4. Alfaro, J., Damgaard, P.H.: Non-Abelian antibrackets. Preprint hep-th/9511066, November 1995
5. Behrend, K., Manin, Yu.: Stacks of stable maps and Gromov-Witten invariants. Preprint alg-geom/9506023, June 1995
6. Getzler, E., Jones, J.D.S.: Operads, homotopy algebra, and iterated integrals for double loop spaces. Preprint, 1993
7. Getzler, E., Kapranov, M.M.: Cyclic operads and cyclic homology. In: S.-T. Yau, editor, Geometry, Topology and Physics for Raoul Bott, Volume 4 of Conf. Proc. Lect. Notes. Geom. Topol., Cambridge, MA: International Press, 1995, pp. 167–201
8. Getzler, E., Kapranov, M.M.: Modular operads. Compositio Math. 110 (1), 65–126 (1998)
9. Ginzburg, V., Kapranov, M.M.: Koszul duality for operads. Duke Math. J. 76 (1), 203–272 (1994)
10. Kapranov, M.M.: Operads in algebraic geometry. Documenta Mathematica Extra Volume ICM, pp. 277–286 (1998)
11. Kimura, T., Stasheff, J.D., Voronov, A.A.: On operad structures of moduli spaces and string theory. Commun. Math. Phys. 171, 1–25 (1995)
12. Kontsevich, M.: Graphs, homotopical algebra and low-dimensional topology. Preprint, 1994
13. Lada, T., Markl, M.: Strongly homotopy Lie algebras. Communications in Algebra 23 (6), 2147–2161 (1995)
14. Lada, T., Stasheff, J.D.: Introduction to sh Lie algebras for physicists. International J. Theor. Phys. 32 (7), 1087–1103 (1993)
15. Markl, M.: A cohomology theory for A(m)-algebras and applications. J. Pure Appl. Algebra 83, 141–175 (1992)
16. Markl, M., Shnider, S., Stasheff, J.D.: Operads in algebra, topology and mathematical physics. Book, work in progress
17. May, J.P.: The Geometry of Iterated Loop Spaces. Lecture Notes in Mathematics, Vol. 271. Berlin–Heidelberg–New York: Springer-Verlag, 1972
18. Stasheff, J.D.: Homotopy associativity of H-spaces I, II. Trans. Am. Math. Soc. 108, 275–312 (1963)
19. Stasheff, J.D.: Higher homotopy algebras: String field theory and Drinfel’d quasi-Hopf algebras. In: Proceedings of the XXth International Conference on Differential Geometric Methods in Theoretical Physics, Baruch College, CUNY, June 1991, Singapore: World Scientific, 1992, pp. 408–425
20. Zwiebach, B.: Closed string field theory: Quantum action and the Batalin–Vilkovisky master equation. Nucl. Phys. B 390, 33–152 (1993)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 220, 385 – 432 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Noncommutative Instantons and Twistor Transform

Anton Kapustin1, Alexander Kuznetsov2, Dmitri Orlov3

1 School of Natural Sciences, Institute for Advanced Study, Olden Lane, Princeton, NJ 08540, USA
2 Institute for Problems of Information Transmission, Russian Academy of Sciences, 19 Bolshoi Karetnyi, Moscow 101447, Russia
3 Algebra Section, Steklov Mathematical Institute, Russian Academy of Sciences, 8 Gubkin str., GSP-1, Moscow 117966, Russia

Supported by DOE grant DE-FG02-90ER4054442.
Supported by NSF grant DMS97-29992 and RFFI grants 99-01-01144, 99-01-01204.
Supported by NSF grant DMS97-29992 and RFFI grant 99-01-01144.

Received: 3 May 2000 / Accepted: 3 April 2001

Dedicated to A.N. Tyurin on his 60th birthday

Abstract: Recently N. Nekrasov and A. Schwarz proposed a modification of the ADHM construction of instantons which produces instantons on a noncommutative deformation of $\mathbb{R}^4$. In this paper we study the relation between their construction and algebraic bundles on noncommutative projective spaces. We exhibit one-to-one correspondences between three classes of objects: framed bundles on a noncommutative $\mathbb{P}^2$, certain complexes of sheaves on a noncommutative $\mathbb{P}^3$, and the modified ADHM data. The modified ADHM construction itself is interpreted in terms of a noncommutative version of the twistor transform. We also prove that the moduli space of framed bundles on the noncommutative $\mathbb{P}^2$ has a natural hyperkähler metric and is isomorphic as a hyperkähler manifold to the moduli space of framed torsion free sheaves on the commutative $\mathbb{P}^2$. The natural complex structures on the two moduli spaces do not coincide but are related by an SO(3) rotation. Finally, we propose a construction of instantons on a more general noncommutative $\mathbb{R}^4$ than the one considered by Nekrasov and Schwarz (a q-deformed $\mathbb{R}^4$).

1. Physical Motivation

In this section we explain the physical motivation for studying instantons on a noncommutative $\mathbb{R}^4$. Readers uninterested in the motivation may skip most of this section and proceed directly to Subsect. 1.5. Likewise, readers familiar with the way noncommutative instantons arise in string theory may start with Subsect. 1.5.

1.1. Instanton equations. Let $E$ be a vector bundle with structure group $G$ on an oriented Riemannian 4-manifold $X$, and let $A$ be a connection on $E$. The instanton equation is the equation

$$F_A^+ = 0, \qquad (1)$$

where $F_A$ is the curvature of $A$, and $F_A^+$ denotes the self-dual (SD) part of $F_A$. Solutions of this equation are called instantons, or anti-self-dual (ASD) connections. The second Chern class of $E$ is known in the physics literature as the instanton number. Instantons automatically satisfy the Yang–Mills equation $d_A(*F_A) = 0$, where $d_A : \Omega^p \otimes \mathrm{End}(E) \to \Omega^{p+1} \otimes \mathrm{End}(E)$ is the covariant differential, and $* : \Omega^p \to \Omega^{4-p}$ is the Hodge star operator.

There are several physical reasons to be interested in instantons. If one is studying quantum gauge theory on a Riemannian 3-manifold $M$ (space), then instantons on $X = M \times \mathbb{R}$ describe quantum-mechanical tunneling between different classical vacua. The possibility of such tunneling has drastic physical effects, some of which can be experimentally observed. If one is studying classical gauge theory on a 5-dimensional space-time $X \times \mathbb{R}$, then instantons on $X$ can be interpreted as solitons, i.e. as static solutions of the Yang–Mills equations of motion. In fact, instantons are the absolute minima of the Yang–Mills energy function of the 5-dimensional theory (with fixed second Chern class). Both interpretations arise in string theory, but to explain this we need to make a digression and discuss D-branes.
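The claim that instantons automatically satisfy the Yang–Mills equation follows in one line from the Bianchi identity. The following worked step is a sketch using the standard decomposition of the curvature into self-dual and anti-self-dual parts; the symbol $F_A^-$ for the anti-self-dual part is notation introduced here for illustration.

```latex
% Requires amsmath. On an oriented Riemannian 4-manifold, * squares to +1 on 2-forms,
% so F_A = F_A^+ + F_A^- with *F_A^\pm = \pm F_A^\pm.
\begin{align*}
F_A^+ = 0
  \;&\Longrightarrow\; {*F_A} = {*F_A^-} = -F_A^- = -F_A \\
  \;&\Longrightarrow\; d_A({*F_A}) = -\,d_A F_A = 0 ,
\end{align*}
% where the last equality is the Bianchi identity d_A F_A = 0,
% which holds for the curvature of any connection.
```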

1.2. D-branes. It has been discovered in the last few years that string theory describes, besides strings, extended objects (branes) of various dimensions. These extended objects should be regarded as static solutions of (as yet poorly understood) stringy equations of motion. D-branes are a particularly manageable class of branes. Recall that ordinary closed oriented superstrings, known as Type II strings, are described by maps from a Riemann surface $\Sigma$ (“worldsheet”) to a 10-dimensional manifold $Z$ (“target”). The physical definition of a D-brane is “a submanifold of $Z$ on which strings can end”. This means that if a D-brane is present, then one needs to consider maps from a Riemann surface with boundaries to $Z$ such that the boundaries are mapped to a certain submanifold $X \subset Z$. In this case one says that there is a D-brane wrapped on $X$. If $X$ is connected and has dimension $p + 1$, then one says that one is dealing with a D$p$-brane. In general, $X$ can have several components with different dimensions, and then each component corresponds to a D-brane.

In perturbative string theory, the role of equations of motion is played by the condition that a certain auxiliary quantum field theory on the Riemann surface $\Sigma$ is conformally invariant. When D-branes are present, $\Sigma$ has boundaries, and the auxiliary theory must be supplemented with boundary conditions. The requirement that the boundary conditions preserve conformal invariance imposes constraints on the submanifold $X$. These constraints should be regarded as equations of motion for D-branes. For example, if we consider a D0-brane wrapped on a 1-dimensional submanifold $X$, then conformal invariance requires that $X$ be a geodesic in $Z$. This is the usual equation of motion for a relativistic particle moving in $Z$.

An important subtlety is that to specify fully the boundary conditions for the auxiliary theory on $\Sigma$ it is not sufficient to specify $X$; one should also specify a unitary vector bundle $E$ on $X$ and a connection on it. In the simplest case this bundle has rank 1, but one can also have “multiple” D-branes, described by bundles of rank $r > 1$. Such bundles describe $r$ coincident D-branes wrapped on the same submanifold $X$. Using the requirement of conformal invariance of the auxiliary two-dimensional quantum field theory, one can derive equations of motion for the Yang–Mills connection on $E$. In the low-energy approximation, the equations of motion are the usual Yang–Mills equations $d_A(*F_A) = 0$. In particular, instantons are solutions of these equations.

1.3. Instantons and D-branes. Let $Z$ be $\mathbb{R}^{10}$ with a flat metric, and let $X \to Z$ be $\mathbb{R}^5 = \mathbb{R}^4 \times \mathbb{R}$ linearly embedded in $Z$. We regard $\mathbb{R}^4$ as space and $\mathbb{R}$ as time. Consider $r$ D4-branes wrapped on $X$. This physical system is described by the Yang–Mills action on $\mathbb{R}^5 = \mathbb{R}^4 \times \mathbb{R}$. If one is looking for static solutions of the equations of motion, one needs to consider the minima of the Yang–Mills energy function

$$W[A] = \int_{\mathbb{R}^4} \|F_A\|^2,$$

where $F_A$ is the curvature of a $U(r)$ connection $A$, and $\|F_A\|^2 = -\mathrm{Tr}(F_A \wedge *F_A)$. The instanton number of $A$ is defined by

$$c_2 = \frac{1}{8\pi^2} \int_{\mathbb{R}^4} \mathrm{Tr}(F_A \wedge F_A). \qquad (2)$$

If the Yang–Mills energy evaluated on $A$ is finite, then the bundle $E$ and the connection $A$ extend to $S^4$, the one-point compactification of $\mathbb{R}^4$ (see [4] for details). In this case $c_2$ is the second Chern class of $E$ and is therefore an integer. Solutions of instanton equations on $\mathbb{R}^4$ are precisely the absolute minima of the Yang–Mills energy function. These solutions should be regarded as composed of identical particle-like objects (instantons) on $X$, their number being $c_2$. Since the energy of the instanton is proportional to $c_2$, all “particles” have the same mass. Since the solution is static, the particles neither repel nor attract. This is actually a consequence of supersymmetry: Type II string theory is supersymmetric, and D4-branes with instantons on them leave part of supersymmetry unbroken.

In string theory one may also consider $k$ D0-branes present simultaneously with $r$ D4-branes. More specifically, we will consider D0-branes which are at rest, i.e. the corresponding one-dimensional manifolds are straight lines parallel to the time axis. Such a configuration of branes is also supersymmetric, and consequently there are no forces between any of the branes. The positions of D0-branes are not constrained by anything, so their moduli space is $(\mathbb{R}^9)^k$. More precisely, since D0-branes are indistinguishable, the moduli space is $\mathrm{Sym}^k(\mathbb{R}^9)$.

It turns out that an instanton with instanton number $k$ and $k$ D0-branes are related: they can be deformed into each other without any cost in energy. A convenient point of view is the following. In the presence of D4-branes wrapped on $X$ the moduli space of D0-branes has two branches: a branch where their positions are unconstrained and D0-branes are point-like (this branch is isomorphic to $\mathrm{Sym}^k(\mathbb{R}^9)$), and the branch where they are constrained to lie on $X$. The latter branch is isomorphic to the moduli space $M_{r,k}$ of $U(r)$ instantons on $X = \mathbb{R}^4$ with $c_2 = k$. The dimension of $M_{r,k}$ is known to be $4rk$ for $r > 1$ (see for example [4]). For $r = 1$ instantons do not exist. The translation group of $\mathbb{R}^4$ acts freely on $M_{r,k}$, and the quotient space describes the relative positions and sizes of instantons. Thus D0-branes are point-like objects when they are away from D4-branes, but when they bind to D4-branes they can acquire finite size.
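The statement that instantons are the absolute minima of the energy, with energy proportional to $c_2$, can be checked directly from the definitions of $W[A]$ and $c_2$ given in the text. The following is a sketch of that standard argument; it assumes the decomposition $F_A = F_A^+ + F_A^-$ with $*F_A^\pm = \pm F_A^\pm$, and it uses the fact that self-dual and anti-self-dual 2-forms are pointwise orthogonal, so the cross term $\mathrm{Tr}(F_A^+ \wedge F_A^-)$ vanishes.

```latex
% Requires amsmath. Conventions from the text:
%   \|F\|^2 = -Tr(F \wedge *F),   c_2 = (1/8\pi^2) \int Tr(F_A \wedge F_A).
\begin{align*}
\|F_A\|^2 &= \|F_A^+\|^2 + \|F_A^-\|^2, \\
\operatorname{Tr}(F_A \wedge F_A)
  &= \operatorname{Tr}(F_A^+ \wedge F_A^+) + \operatorname{Tr}(F_A^- \wedge F_A^-)
   = -\|F_A^+\|^2 + \|F_A^-\|^2, \\
W[A] &= \int_{\mathbb{R}^4} \bigl(\|F_A^+\|^2 + \|F_A^-\|^2\bigr)
      = 8\pi^2 c_2 + 2 \int_{\mathbb{R}^4} \|F_A^+\|^2 \;\ge\; 8\pi^2 c_2 .
\end{align*}
% Equality holds exactly when F_A^+ = 0, i.e. for solutions of the instanton equation (1),
% and then W[A] = 8\pi^2 c_2.
```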


The “instanton” branch touches the “point-like” branch at submanifolds where some or all of the instantons shrink to zero size. These are the submanifolds where the instanton moduli space is singular. At these submanifolds the point-like instantons can detach from D4-branes and start a new life as D0-branes. This lowers the second Chern class of the bundle on D4-branes. Thus from the string theory perspective it is natural to glue together the moduli spaces of instantons with different Chern classes along singular submanifolds. 1.4. Noncommutative geometry and D-branes. Instanton equations (and, more generally, Yang–Mills equations) arise in the low-energy limit of string theory, or equivalently for large string tension. Recently, another kind of low-energy limit of string theory was discussed in the literature [32]. Consider a trivial U (r)-bundle on X = R4 with a connection A whose curvature FA is of the form 1⊗f where 1 is the unit section of End(E), and f is a constant nondegenerate 2-form. For small f the D4-branes are described by the ordinary Yang–Mills action, but for large FA the stringy equations of motion get complicated. It turns out that the equations of motion simplify again in the limit when both FA and the string tension are taken to infinity, with a certain combination of the two kept fixed (one also has to scale the metric appropriately, see [32]). We will call this limit the Seiberg–Witten limit. In this limit the D4-branes are described by Yang– Mills equations on a certain noncommutative deformation of R4 (see [32] and references therein). There is another description of the Seiberg–Witten limit, which is gauge-equivalent to the previous one. Type II string theory reduces at low energies to Type II supergravity in 10 dimensions. The bosonic fields of this low-energy theory include a symmetric ranktwo tensor (metric) and a 2-form B. 
R10 with a flat Lorentzian metric and a constant B is a solution of the supergravity equations of motion, as well as of the full stringy equations of motion. A constant B can be gauged away, so this is not a very interesting solution. Life gets more interesting if there are D-branes present. For example, consider r coincident flat D4-branes embedded in R10 with a constant B-field. It turns out that one can gauge away a constant B-field only at the expense of introducing a constant FA of the form 1 ⊗ f, where f is equal to the pull-back of B to the worldvolume of the D4-branes. Thus the solution with zero FA and nonzero B is equivalent to the solution with nonzero FA and zero B. Therefore the Seiberg–Witten limit can be described as the limit in which both the B-field and the string tension become infinite. The idea that D-branes in a nonzero B-field are described by Yang–Mills theory on a noncommutative space was first put forward in [13] for the case of D-branes wrapped on tori.

1.5. Instanton equations on a noncommutative R4. The deformed R4 that one obtains in the Seiberg–Witten limit is completely characterized by its algebra of functions A. It is a noncommutative algebra whose underlying space is a certain subspace of C∞ functions on R4. The product is the so-called Wigner–Moyal product formally given by

\[ (f \star g)(x) = \lim_{y \to x} \exp\left( \frac{1}{2}\,\hbar\,\theta_{ij}\, \frac{\partial^2}{\partial x_i\, \partial y_j} \right) f(x)\, g(y). \tag{3} \]

Here θ is a purely imaginary matrix, and ħ is a real parameter (“Planck constant”) which is introduced to emphasize that the Wigner–Moyal product is a deformation of the usual product. In the string theory context θ is proportional to $f^{-1}$.
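On polynomials the exponential in (3) terminates after finitely many terms, so the product can be evaluated exactly. The following is a minimal sketch of that computation, not the authors' construction: polynomials are stored as dictionaries from exponent tuples to coefficients, and the matrix `HT` is a hypothetical stand-in for ħθ with rational entries (chosen only to keep the arithmetic exact; the algebraic identities being checked do not depend on θ being imaginary).

```python
from fractions import Fraction
from itertools import product

# A polynomial in n variables is a dict {exponent tuple: coefficient}.

def poly_add(p, q, sign=1):
    """Return p + sign*q, dropping zero coefficients."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + sign * c
    return {e: c for e, c in out.items() if c != 0}

def poly_mul(p, q):
    """Ordinary commutative product of two polynomials."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def poly_diff(p, i):
    """Partial derivative with respect to the i-th variable (0-based)."""
    out = {}
    for e, c in p.items():
        if e[i] > 0:
            e2 = e[:i] + (e[i] - 1,) + e[i + 1:]
            out[e2] = out.get(e2, 0) + c * e[i]
    return out

def moyal(f, g, ht):
    """Star product (3) for the constant matrix ht = hbar*theta.

    Order-n contribution: (1/n!) (1/2)^n ht_{i1 j1}...ht_{in jn}
    times (d_{i1}...d_{in} f) (d_{j1}...d_{jn} g).
    """
    n = len(ht)
    result = {}
    terms = [(Fraction(1), f, g)]  # (coefficient, derivative of f, derivative of g)
    order = 0
    while terms:
        for c, df, dg in terms:
            scaled = {e: c * v for e, v in poly_mul(df, dg).items()}
            result = poly_add(result, scaled)
        order += 1
        nxt = []
        for c, df, dg in terms:
            for i, j in product(range(n), repeat=2):
                if ht[i][j] != 0:
                    a, b = poly_diff(df, i), poly_diff(dg, j)
                    if a and b:
                        nxt.append((c * ht[i][j] / (2 * order), a, b))
        terms = nxt
    return result

def star_commutator(f, g, ht):
    return poly_add(moyal(f, g, ht), moyal(g, f, ht), sign=-1)

def var(i, n=4):
    """The coordinate function x_{i+1} as a polynomial."""
    return {tuple(1 if k == i else 0 for k in range(n)): Fraction(1)}

# Hypothetical rational values standing in for hbar*theta (an assumption of this sketch).
HT = [[Fraction(v) for v in row] for row in
      [[0, 3, 0, 0], [-3, 0, 0, 0], [0, 0, 0, 5], [0, 0, -5, 0]]]

if __name__ == "__main__":
    # The star commutator of x_1 and x_2 is the constant polynomial (hbar*theta)_{12}.
    print(star_commutator(var(0), var(1), HT))
```

In this sketch the star commutator of two coordinate functions is the constant ħθ_ij, which is the defining relation of the Weyl algebra, and the product remains associative on the sample polynomials tried.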


Of course, to make sense of this definition we must specify a subspace in the space of C∞ functions which is closed under the Wigner–Moyal product. Leaving this question aside for a moment [footnote 1], one can define the exterior differential calculus over A. Differential geometry of noncommutative spaces has been developed by A. Connes [12]. In our situation Connes’ general theory is greatly simplified. For example, the sheaf of 1-forms $\Omega^1(A)$ is simply a bimodule $A^{\oplus 4}$ (the relation of this definition with the general theory is explained in Subsect. 8.11). The elements of $\Omega^1(A)$ will be denoted $\sum_i f^i(x)\,dx_i$, or simply $f^i(x)\,dx_i$. The exterior differential d is a vector space morphism

\[ d : A \to \Omega^1(A), \qquad f \mapsto \frac{\partial f}{\partial x_i}\, dx_i . \]

The exterior differential d satisfies the Leibniz rule $d(f_1 \star f_2) = df_1 \star f_2 + f_1 \star df_2$. This makes sense because $\Omega^1(A)$ is a bimodule. The sheaf of 2-forms over A is a bimodule $\Omega^2(A) = A^{\oplus 6}$ (see Subsect. 8.11). The definition of the exterior differential can be extended to $\Omega^1(A)$ in an obvious manner.

Complex conjugation acts as an anti-linear anti-homomorphism of A, i.e. $\overline{f \star g} = \bar{g} \star \bar{f}$. Thus A has a natural structure of a ∗-algebra. We will denote the ∗-conjugate of f ∈ A by $f^\dagger$.

A trivial bundle over the noncommutative R4 is defined as a free A-module E. A trivial unitary bundle over the noncommutative R4 is defined as a free module $V \otimes_{\mathbb{C}} A$, where V is a Hermitian vector space. A connection on a trivial bundle E is defined as a map $\nabla : E \to E \otimes_A \Omega^1(A)$, which is a vector space morphism satisfying the Leibniz rule

\[ \nabla(m \star f) = \nabla(m) \star f + m \otimes df. \]

This formula makes use of the bimodule structure on $\Omega^1(A)$. The curvature $F_\nabla = [\nabla, \nabla]$ is a morphism of A-modules $F_\nabla : E \to E \otimes_A \Omega^2(A)$. As in the commutative case, a connection on a trivial bundle E can be written in terms of a connection 1-form $A \in \mathrm{End}_A(E) \otimes_A \Omega^1(A)$:

\[ \nabla(m) = dm + A \star m. \]

This formula uses the bimodule structure on m. If E is a unitary bundle, and we have $A^\dagger = -A$, then we say that A is a unitary connection. The curvature is given in terms of A by the usual formula

\[ F_\nabla := F_A = dA + A \wedge A. \]

Here it is understood that $f^i\,dx_i \wedge g^j\,dx_j = f^i \star g^j\; dx_i \wedge dx_j$.

Footnote 1. String theory considerations do not shed light on this problem.


The instanton equation on A is again given by (1), and the instanton number is defined by (2).

The most obvious choice of the space of functions closed under the Wigner–Moyal product is the space of polynomial functions. However, this choice is not suitable for our purposes because it precludes the decrease of FA at infinity which is necessary for the instanton action to converge. In the commutative case, components of an instanton connection are rational functions [4], so we would like our class of functions to include rational functions on R4. A possible choice for the underlying set of A is the set of C∞ functions on R4 all of whose derivatives are polynomially bounded. Then we face the question of the convergence of the series (3). To avoid dealing with this issue, we modify our definition of the Wigner–Moyal product (see the Appendix for details). The modified product makes the space of C∞ functions all of whose derivatives are polynomially bounded into an algebra over C, and agrees with (3) on polynomial functions. Polynomial functions form a subalgebra of A. This subalgebra is isomorphic to the algebra generated by four variables $x_i$, i = 1, 2, 3, 4, with relations $[x_i, x_j] = \hbar\theta_{ij}$. This algebra is usually called the Weyl algebra.

To summarize, there is a limit of string theory in which D4-branes are described by Yang–Mills equations on the noncommutative R4 (= A). D0-branes bound to D4-branes are described in this limit by the instanton equations on the noncommutative R4. One can show that, unlike in the commutative case, instantons cannot be deformed to point-like D0-branes without a cost in energy. Thus it is natural to suspect that the moduli space of instantons on the noncommutative R4 is metrically complete.

2. Review of the ADHM Construction and Summary

All instantons on the commutative R4 arise from the so-called ADHM construction. Recently N. Nekrasov and A. Schwarz [29] introduced a modification of this construction which produces instantons on the noncommutative R4 [footnote 2]. In the commutative case the completeness of the ADHM construction can be proved using the twistor transform of R. Penrose, so one could hope that the same approach could work in the noncommutative case. In this paper we show that the deformed ADHM data of [29] describe holomorphic bundles on certain noncommutative algebraic varieties and interpret the deformed ADHM construction in terms of a noncommutative twistor transform. In this subsection we review both ordinary and deformed ADHM constructions and make a summary of our results.

2.1. Review of the ADHM construction of instantons. First let us outline the ADHM construction of U(r) instantons on the commutative R4 following [15]. We assume that the constant metric G on R4 has been brought to the standard form G = diag(1, 1, 1, 1) by a linear change of basis. To construct a U(r) instanton with c2 = k one starts with two Hermitian vector spaces $V \simeq \mathbb{C}^k$ and $W \simeq \mathbb{C}^r$. The ADHM data consist of four linear maps B1, B2 ∈ Hom(V, V), I ∈ Hom(W, V), J ∈ Hom(V, W) which satisfy the following two conditions:

Footnote 2. As in the commutative case, one may consider different classes of functions on the noncommutative R4: polynomial, C∞ functions rapidly decreasing at infinity, C∞ functions all of whose derivatives are polynomially bounded, etc. Our class of functions differs somewhat from that adopted in [29].


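(i) $\mu_c = [B_1, B_2] + IJ = 0$, $\quad \mu_r = [B_1, B_1^\dagger] + [B_2, B_2^\dagger] + II^\dagger - J^\dagger J = 0$.

(ii) For any $\xi = (\xi_1, \xi_2) \in \mathbb{C}^2 \cong \mathbb{R}^4$ the linear map $D_\xi \in \mathrm{Hom}(V \oplus V \oplus W,\, V \oplus V)$ defined by

\[ D_\xi = \begin{pmatrix} B_1 - \xi_1 & -B_2 + \xi_2 & I \\ B_2^\dagger - \bar{\xi}_2 & B_1^\dagger - \bar{\xi}_1 & J^\dagger \end{pmatrix} \tag{4} \]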

is surjective. The equations $\mu_c = \mu_r = 0$ are called the ADHM equations. They are invariant with respect to the action of the group of unitary transformations of V. Solutions of these equations are called ADHM data. The space of ADHM data modulo U(V) transformations has dimension 4rk and carries a natural hyperkähler metric. The ADHM construction identifies this moduli space with the moduli space of U(r) instantons with c2 = k and fixed trivialization at infinity. The role of the condition (ii) above is to remove submanifolds in this moduli space where the hyperkähler metric becomes singular (these are the point-like instanton singularities mentioned in Subsect. 1.3). As a result the moduli space of the ADHM data is metrically incomplete.

The instanton connection can be reconstructed from the ADHM data as follows. The condition (ii) implies that the family $\mathrm{Ker}\, D_\xi$ forms a trivial subbundle of V ⊕ V ⊕ W of rank r. Let v(ξ) be its trivialization, i.e. a linear map $v(\xi) : \mathbb{C}^r \to V \oplus V \oplus W$ smoothly depending on ξ such that $D_\xi v(\xi) = 0$ for all ξ, and $\rho(\xi) = v(\xi)^\dagger v(\xi)$ is an isomorphism for all ξ. We set

\[ A(\xi) = \rho(\xi)^{-1} v(\xi)^\dagger\, dv(\xi). \]

The matrix-valued one-form A is a connection on a trivial unitary bundle of rank r. One can show that its curvature FA is ASD (see [4]). However, it does not satisfy $A^\dagger = -A$, because we are not using a unitary gauge. Instead A satisfies

\[ A^\dagger(\xi) = -\left( \rho(\xi) A(\xi) \rho(\xi)^{-1} + \rho(\xi)\, d\rho(\xi)^{-1} \right). \]

To go to a unitary gauge, we must make a gauge transformation

\[ A'(\xi) = g(\xi) A(\xi) g(\xi)^{-1} + g(\xi)\, dg(\xi)^{-1}, \]

where g(ξ) is a function taking values in Hermitian r × r matrices and satisfying $g(\xi)^2 = \rho(\xi)$.

We now explain, following [29], how to modify the ADHM construction so that it produces rank r instantons on the noncommutative R4 defined in the previous section. It proves convenient to apply an orthogonal transformation which brings the matrix θ in (3) to the standard form

\[ \theta = \sqrt{-1} \begin{pmatrix} 0 & a & 0 & 0 \\ -a & 0 & 0 & 0 \\ 0 & 0 & 0 & b \\ 0 & 0 & -b & 0 \end{pmatrix}. \]


We will assume that a + b ≠ 0. Since θ enters only in the combination ħθ, we can set a + b = 1 without loss of generality. The relation between the affine coordinates ξ1, ξ2 on C2 and the affine coordinates x1, x2, x3, x4 on R4 is chosen as follows:

\[ \xi_1 = x_4 - \sqrt{-1}\, x_3, \qquad \xi_2 = -x_2 + \sqrt{-1}\, x_1. \]

Then $\xi_1, \xi_2, \bar{\xi}_1, \bar{\xi}_2$ obey the Weyl algebra relations

\[ [\xi_1, \bar{\xi}_1] = 2\hbar b, \qquad [\xi_2, \bar{\xi}_2] = 2\hbar a, \qquad [\xi_1, \xi_2] = [\xi_1, \bar{\xi}_2] = [\bar{\xi}_1, \xi_2] = [\bar{\xi}_1, \bar{\xi}_2] = 0. \]
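These relations follow from $[x_i, x_j] = \hbar\theta_{ij}$ by bilinearity of the commutator. The sketch below checks them numerically; it assumes the block form of θ with the overall factor $\sqrt{-1}$, stores each linear form as its tuple of coefficients, and uses hypothetical sample values for a, b and ħ (the normalization a + b = 1 is not imposed).

```python
# A linear form sum_i u_i x_i is stored as the coefficient tuple (u_1, ..., u_4).
# By bilinearity, [u.x, v.x] = hbar * sum_{i,j} u_i v_j theta_ij.

def theta_matrix(a, b):
    """theta = sqrt(-1) * blockdiag([[0, a], [-a, 0]], [[0, b], [-b, 0]])."""
    return [[0, 1j * a, 0, 0],
            [-1j * a, 0, 0, 0],
            [0, 0, 0, 1j * b],
            [0, 0, -1j * b, 0]]

def commutator(u, v, theta, hbar):
    return hbar * sum(u[i] * v[j] * theta[i][j]
                      for i in range(4) for j in range(4))

# xi_1 = x4 - sqrt(-1) x3 and xi_2 = -x2 + sqrt(-1) x1, with their conjugates.
XI1, XI1_BAR = (0, 0, -1j, 1), (0, 0, 1j, 1)
XI2, XI2_BAR = (1j, -1, 0, 0), (-1j, -1, 0, 0)

def weyl_relations(a, b, hbar):
    """Return the six commutators among xi_1, xi_2 and their conjugates."""
    th = theta_matrix(a, b)
    return {
        "xi1,xi1bar": commutator(XI1, XI1_BAR, th, hbar),
        "xi2,xi2bar": commutator(XI2, XI2_BAR, th, hbar),
        "xi1,xi2": commutator(XI1, XI2, th, hbar),
        "xi1,xi2bar": commutator(XI1, XI2_BAR, th, hbar),
        "xi1bar,xi2": commutator(XI1_BAR, XI2, th, hbar),
        "xi1bar,xi2bar": commutator(XI1_BAR, XI2_BAR, th, hbar),
    }

if __name__ == "__main__":
    # Hypothetical sample values a = 3, b = 5, hbar = 2.
    for name, value in weyl_relations(3, 5, 2).items():
        print(name, value)
```

With these sample values the first two commutators come out as 2ħb and 2ħa, and the remaining four vanish.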

The modified ADHM data consist of the same four maps which now satisfy

\[ \mu_c = 0, \qquad \mu_r = -2\hbar(a + b) \cdot 1_{k \times k}. \]

The instanton connection is given by essentially the same formulas as in the commutative case. The operator D is given by the same formula as $D_\xi$, but is now regarded as an element of $\mathrm{Hom}_A((V \oplus V \oplus W) \otimes_{\mathbb{C}} A,\ (V \oplus V) \otimes_{\mathbb{C}} A)$. The module Ker D is a projective module over A. Following [10], we assume that it is isomorphic to a free module of rank r, and v is the corresponding isomorphism $v : A^{\oplus r} \to \mathrm{Ker}\, D$. We further assume [10] that the morphism $\Delta = DD^\dagger \in \mathrm{End}_A((V \oplus V) \otimes A)$ is an isomorphism [footnote 3]. Then it is easy to see that $\rho = v^\dagger v \in \mathrm{End}_A(\mathbb{C}^r \otimes A)$ is an isomorphism too. We set

\[ A = \rho^{-1} v^\dagger\, dv. \tag{5} \]

(The multiplication here and below is understood to be the Wigner–Moyal multiplication.) This formula defines a connection 1-form on a trivial unitary bundle on A of rank r. The curvature of this connection is given by

\[ F_A = \rho^{-1}\, dv^\dagger \wedge (1 - v \rho^{-1} v^\dagger)\, dv. \]

A short computation (essentially the same as in the commutative case) shows that the curvature can be written in the form

\[ F_A = \rho^{-1} v^\dagger\, dD^\dagger\, \Delta^{-1} \wedge dD\, v. \]

Furthermore, since D and $D^\dagger$ are linear in $\xi_i, \bar{\xi}_i$, their exterior derivatives have a very simple form:

\[ dD = \begin{pmatrix} -d\xi_1 & d\xi_2 & 0 \\ -d\bar{\xi}_2 & -d\bar{\xi}_1 & 0 \end{pmatrix}, \qquad dD^\dagger = \begin{pmatrix} -d\bar{\xi}_1 & -d\xi_2 \\ d\bar{\xi}_2 & -d\xi_1 \\ 0 & 0 \end{pmatrix}. \]

Footnote 3. One can show that the latter assumption is always valid provided ħ ≠ 0. As for the former one, it is not known what constraints should be imposed on the deformed ADHM data to ensure that Ker D is a free A-module of rank r. For r = 1 Ker D is never free [16].
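The assumption that $DD^\dagger$ is an isomorphism can be made concrete in the commutative case (ħ = 0), where D is an ordinary matrix depending on ξ. The following sketch uses a hypothetical choice of data with k = 1 and r = 2 (B1 = B2 = 0, I = (ρ, 0), J = (0, ρ)), which is not taken from the text; it is meant only to illustrate that the ADHM equations hold and that $DD^\dagger$ is a positive multiple of the identity, so that $D_\xi$ is surjective.

```python
def adhm_charge_one(rho):
    """Hypothetical commutative data with V = C, W = C^2 (an assumption of this sketch)."""
    B1, B2 = 0, 0
    I = (rho, 0)   # I : W -> V, written as a 1x2 row
    J = (0, rho)   # J : V -> W, written as a 2x1 column
    return B1, B2, I, J

def moment_maps(B1, B2, I, J):
    """mu_c and mu_r for k = 1, where all commutators of the B's vanish."""
    mu_c = sum(i * j for i, j in zip(I, J))                            # I J
    mu_r = sum(abs(i) ** 2 for i in I) - sum(abs(j) ** 2 for j in J)   # I I^+ - J^+ J
    return mu_c, mu_r

def D(xi1, xi2, B1, B2, I, J):
    """The 2x4 matrix of formula (4) for k = 1, r = 2."""
    conj = lambda z: complex(z).conjugate()
    row1 = [B1 - xi1, -B2 + xi2, I[0], I[1]]
    row2 = [conj(B2) - conj(xi2), conj(B1) - conj(xi1), conj(J[0]), conj(J[1])]
    return [row1, row2]

def D_D_dagger(Dm):
    """The 2x2 matrix D D^dagger."""
    return [[sum(a * complex(b).conjugate() for a, b in zip(r, s)) for s in Dm]
            for r in Dm]

if __name__ == "__main__":
    data = adhm_charge_one(2)
    print(moment_maps(*data))
    print(D_D_dagger(D(1 + 2j, 3 - 1j, *data)))
```

For these data $DD^\dagger$ equals $(|\xi_1|^2 + |\xi_2|^2 + \rho^2)$ times the identity, which is invertible whenever ρ ≠ 0.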


Note also that by virtue of the deformed ADHM equations Δ has a block-diagonal form:

\[ \Delta = \begin{pmatrix} \delta & 0 \\ 0 & \delta \end{pmatrix}, \]

where $\delta \in \mathrm{End}_A(V \otimes A)$ is an isomorphism. Using this fact, one can easily see that FA is proportional to the 2-forms

\[ d\xi_1 \wedge d\bar{\xi}_1 + d\xi_2 \wedge d\bar{\xi}_2, \qquad d\xi_1 \wedge d\bar{\xi}_2, \qquad d\xi_2 \wedge d\bar{\xi}_1, \]

which are anti-self-dual. As in the commutative case, the connection A does not satisfy $A^\dagger = -A$. To go to a unitary gauge one has to perform a gauge transformation $A' = g \star A \star g^{-1} + g \star dg^{-1}$. Here $g \in \mathrm{Aut}_A(\mathbb{C}^r \otimes A)$ should be found from the conditions $g^\dagger = g$, $g \star g = \rho$. The existence of such g is an additional assumption.

2.2. Summary of results. In the commutative case there is a one-to-one correspondence between the following four classes of objects:

A. Rank r holomorphic bundles on P2 with c2 = k and a fixed trivialization on the line at infinity.
B. The set of ADHM data modulo the natural action of U(k).
C. Rank r holomorphic bundles on P3 with c2 = k, a trivialization on a fixed line, vanishing $H^1(E(-2))$, and satisfying a certain reality condition.
D. U(r) instantons on R4 with c2 = k.

The correspondence between C and D is a particular instance of the twistor transform [6]. The correspondence between B and C has been proved by Atiyah, Hitchin, Drinfeld, and Manin [5, 4]. Together these two results imply that all instantons on R4 arise from the ADHM construction. The correspondence between A and B has been proved by Donaldson [15]. One can also prove the correspondence between A and D directly [7, 11, 18].

The goal of this paper is to extend some of these results to the noncommutative case. We show that there is a natural one-to-one correspondence between the isomorphism classes of the following objects:

A′. Algebraic bundles on a noncommutative deformation of P2 with c2 = k and a fixed trivialization on the line at infinity.
B′. Deformed ADHM data of Nekrasov and Schwarz modulo the natural U(k) action.
C′. Certain complexes of sheaves on a noncommutative deformation of P3 satisfying reality conditions.

The moduli space of the deformed ADHM data has a natural hyperkähler metric, and the other two moduli spaces inherit this metric.
Furthermore, we reinterpret the deformed ADHM construction of Nekrasov and Schwarz in terms of a noncommutative deformation of the twistor transform. It is interesting to note that H. Nakajima [27] studied the same linear algebra data as Nekrasov and Schwarz and showed that their moduli space coincides with the moduli


space of torsion free sheaves on a commutative P2 with a trivialization on a fixed line. On the other hand, we show that the same data describe algebraic bundles on a noncommutative P2. As shown below, the interpretation in terms of complexes of sheaves on a noncommutative P3 provides a geometric reason for this “coincidence”. We prove that the two moduli spaces are isomorphic as hyperkähler manifolds, but the natural complex structures on them differ by an SO(3) rotation.

The rest of the paper is organized as follows. In Sect. 3 we define noncommutative deformations of certain commutative projective varieties (P2, P3, and a quadric in P5). Section 4 is an algebraic preparation for the study of bundles on noncommutative projective spaces. In Sect. 5 we study the cohomological properties of sheaves on noncommutative P2 and P3 and define locally free sheaves (i.e. bundles). In Sect. 6 we show that any bundle on a noncommutative P2 trivial on the commutative line at infinity arises as a cohomology of a monad. In Sect. 7 we exhibit bijections between A′, B′, and C′ and explain the relation with Nakajima’s results. In Sect. 8 we construct a noncommutative deformation of Grassmannians and flag manifolds and describe a noncommutative version of the twistor transform. We also describe a nice class of noncommutative projective varieties associated with a Yang–Baxter operator and define differential forms on these varieties. In Sect. 9 we consider a more general deformation of R4 (a q-deformed R4) whose physical significance is obscure at present. We propose an ADHM-like construction of instantons on this space and outline its relation to noncommutative algebraic geometry. In the Appendix we define the Wigner–Moyal product on the space of C∞ functions on Rn all of whose derivatives are polynomially bounded, and prove that the Wigner–Moyal product provides this space with a structure of an algebra over C.

Note added in proof.
After this paper was submitted to the electronic archive, we learned that coherent sheaves on the noncommutative projective plane and their moduli spaces have been studied by L. Le Bruyn [21].

3. Geometry of Noncommutative Varieties

3.1. Algebraic preliminaries. Let k be a base field (we will be dealing only with k = C or k = R in this paper). Let A be an algebra over k. It is called right (left) noetherian if every right (left) ideal is finitely generated, and it is called noetherian if it is both right and left noetherian. Let $A = \bigoplus_{i \ge 0} A_i$ be a graded noetherian algebra. We denote by mod(A) the category of finitely generated right A-modules, by gr(A) the category of finitely generated graded right A-modules, and by tors(A) the full subcategory of gr(A) which consists of finite dimensional graded A-modules. An important role will be played by the quotient category qgr(A) = gr(A)/tors(A). It has the following explicit description. The objects of qgr(A) are the objects of gr(A) (we denote by $\widetilde{M}$ the object in qgr(A) which corresponds to a module M). The morphisms in qgr(A) are given by

\[ \mathrm{Hom}_{\mathrm{qgr}}(\widetilde{M}, \widetilde{N}) = \varinjlim_{M'} \mathrm{Hom}_{\mathrm{gr}}(M', N), \]

where M′ runs over submodules of M such that M/M′ is finite dimensional.

On the category gr(A) there is a shift functor: for a given graded module $M = \bigoplus_{i \ge 0} M_i$ the shifted module M(r) is defined by $M(r)_i = M_{r+i}$. The induced shift functor on the quotient category qgr(A) sends $\widetilde{M}$ to $\widetilde{M}(r) = \widetilde{M(r)}$.


Similarly, we can consider the category Gr(A) of all graded right A-modules. It contains the subcategory Tors(A) of torsion modules. Recall that a module M is called torsion if for any element x ∈ M one has $x A_{\ge s} = 0$ for some s, where $A_{\ge s} = \bigoplus_{i \ge s} A_i$. We denote by QGr(A) the quotient category Gr(A)/Tors(A). The category QGr(A) contains qgr(A) as a full subcategory. Sometimes it is convenient to work in QGr(A) instead of qgr(A).

Henceforth, all graded algebras will be noetherian algebras generated by the first component $A_1$ with $A_0 = k$. Sometimes we use subscripts R or L for categories gr(A), qgr(A), etc., to specify whether right or left modules are considered. If the subscript is omitted, the modules are taken to be right modules. For the same reason, for an A-bimodule M we sometimes write $M_A$ or ${}_A M$ to specify whether the right or left module structure is considered.

3.2. Noncommutative varieties. A variety in commutative geometry is a topological space with a sheaf of functions (continuous, smooth, analytic, algebraic, etc.) which is, obviously, a sheaf of algebras. One of the main objects in geometry (algebraic or differential) is a bundle or, more generally, a sheaf. To any variety X we can associate an abelian category of sheaves of modules (maybe with some additional properties) over the sheaf of algebras of functions. Given a sheaf of modules on X, the space of its global sections is a module over the algebra of global functions on X. Thus the functor of global sections associates to every X an algebra and a certain category of modules over it. Under favorable circumstances, much of the information about the geometry of X is contained in this purely algebraic datum. Let us give a few examples.

If X is a compact Hausdorff topological space, then the category of vector bundles over X is equivalent to the category of finitely generated projective modules over the algebra of continuous functions on X [34, 36]. The equivalence is given by the functor which maps a vector bundle to the module of its global sections.

It is well known that if A is a commutative noetherian algebra, the category of coherent sheaves on the noetherian affine scheme Spec(A) is equivalent to the category of finitely generated modules over A. The equivalence is again given by the functor which attaches to a coherent sheaf the module of its global sections.

In the case of projective varieties the only global functions are constants, so one has to act somewhat differently. Since a projective variety X is by definition a subvariety of a projective space, it inherits from it the line bundle $O_X(1)$ and its tensor powers $O_X(i)$. We can consider a graded algebra

\[ \Gamma(X) = \bigoplus_{i \ge 0} H^0(X, O_X(i)). \]

This algebra is called the homogeneous coordinate algebra of X. Furthermore, for any sheaf F we can define a graded Γ(X)-module

\[ \Gamma(F) = \bigoplus_{i \ge 0} H^0(X, F(i)). \]

It can be checked that Γ is a functor from the category coh(X) of coherent sheaves on X to gr(Γ(X)). In a brilliant paper [33], J.-P. Serre described the category of coherent sheaves on a projective scheme X in terms of graded modules over the graded algebra Γ(X). He proved that the category coh(X) is equivalent to the quotient category qgr(Γ(X)) = gr(Γ(X))/tors(Γ(X)). The equivalence is given by the composition of the functor Γ


with the projection π : gr(A) → qgr(A). On the other hand, let $A = \bigoplus_{i \ge 0} A_i$ be a graded commutative algebra generated over k by the first component (which is assumed to be finite dimensional). We can associate to A a projective scheme X = Proj(A). Serre proved that the category coh(X) is equivalent to the category qgr(A). The equivalence also holds for the category of quasicoherent sheaves on X and the category QGr(A) = Gr(A)/Tors(A).

In all of the above examples it turned out that the natural category of sheaves or bundles on a variety is equivalent to a certain category defined in terms of (graded) modules over some (graded) algebra. On the other hand, “as A. Grothendieck taught us, to do geometry you really don’t need a space, all you need is a category of sheaves on this would-be space” ([25], p. 83). For this reason, in the realm of algebraic geometry it is natural to regard a noncommutative noetherian algebra as a coordinate algebra of a noncommutative affine variety; then the category of finitely generated right modules over this algebra is identified with the category of coherent sheaves on the corresponding variety. Similarly, a noncommutative graded noetherian algebra is regarded as a homogeneous coordinate algebra of a noncommutative projective variety. The category of finitely generated graded right modules over this algebra modulo torsion modules is identified with the category of coherent sheaves on this variety (see [3, 25, 35]). A different approach to noncommutative geometry has been pursued by Connes [12].

3.3. Noncommutative deformations of commutative varieties. Many important noncommutative varieties arise as deformations of commutative ones. Let X be a commutative variety (affine or projective). Let A be the corresponding commutative (graded) algebra. A noncommutative deformation of X is a deformation of the algebra structure on A, that is, a deformation of the multiplication law. Usually it is not easy to write down an explicit formula for the deformed product. There is a more algebraic way to describe noncommutative deformations of commutative varieties. Assume that the algebra A is given in terms of generators and relations. This means that A is given as a quotient $A = T(V)/\langle R \rangle$, where V is the vector space spanned by the generators, T(V) is the tensor algebra of V, and $\langle R \rangle$ is the two-sided ideal in T(V) generated by a subspace of relations R ⊂ T(V). Assume that $R_\hbar \subset T(V)$ is a one-parameter deformation of the subspace R. Then $A_\hbar = T(V)/\langle R_\hbar \rangle$ is a one-parameter deformation of A. (If A is graded, then we assume that R is a graded subspace, and the deformation preserves the grading.) We denote by $X_\hbar$ the noncommutative variety corresponding to the algebra $A_\hbar$. Thus $X_\hbar$ is a noncommutative one-parameter deformation of X. If X is projective and A is a graded algebra, then we denote by $\mathrm{coh}(X_\hbar)$ the category $\mathrm{qgr}(A_\hbar)$. Furthermore, as in the commutative case, we will write O(r) for the object $\widetilde{A_\hbar}(r)$.

Now we define noncommutative varieties which are going to be used in this paper.

3.4. Noncommutative C4. Denote by A(C4) the algebra of polynomial functions on C4. Let θ be a skew-symmetric 4 × 4 matrix. Let us define the algebra $A(\mathbb{C}^4_\hbar)$ as an algebra over C generated by $x_i$ (i = 1, 2, 3, 4) with relations $[x_i, x_j] = \hbar\theta_{ij}$:

\[ A(\mathbb{C}^4_\hbar) = T(x_1, x_2, x_3, x_4) / \langle [x_i, x_j] = \hbar\theta_{ij} \rangle_{1 \le i, j \le 4}. \tag{6} \]


We will regard $A(\mathbb{C}^4_\hbar)$ as the algebra of polynomial functions on a noncommutative affine variety $\mathbb{C}^4_\hbar$.

3.5. Noncommutative 4-dimensional quadric. Let G be a 4 × 4 symmetric nondegenerate matrix. Consider a graded algebra $Q_\hbar = \bigoplus_{i \ge 0} Q_i$ over C generated by the elements $X_1, X_2, X_3, X_4, D, T$ of degree 1 with the following quadratic relations:

\[ \begin{aligned} {} [T, D] &= [T, X_i] = 0, \\ [X_i, X_j] &= \hbar\theta_{ij} T^2, \\ [D, X_i] &= 2\hbar \sum_{l,k} \theta_{il} G_{lk} X_k T, \\ \sum_{i,j} G_{ij} X_i X_j &= DT. \end{aligned} \tag{7} \]

We denote by $Q^4_\hbar$ the noncommutative projective variety corresponding to the algebra $Q_\hbar$. It is evident that $Q^4_\hbar$ is a deformation of a 4-dimensional commutative quadric $Q^4 = \{ \sum_{ij} G_{ij} X_i X_j = DT \} \subset \mathbb{CP}^5$.

3.6. Embedding $\mathbb{C}^4_\hbar \to Q^4_\hbar$. Let $Q_\hbar[T^{-1}]$ be the localization of the algebra $Q_\hbar$ with respect to T. Elements of degree 0 in $Q_\hbar[T^{-1}]$ form a subalgebra which will be denoted by $Q_\hbar[T^{-1}]_0$.

Lemma 3.1. The map $x_i \mapsto T^{-1} X_i$ (i = 1, 2, 3, 4) induces an isomorphism of the algebra $A(\mathbb{C}^4_\hbar)$ with the algebra $Q_\hbar[T^{-1}]_0$.

Proof. Obvious. □
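The computation behind Lemma 3.1 can be sketched as follows; this is a reading of the relations (7) and not the authors' written proof. Since T commutes with all generators, $T^{-1}$ is central in the localization, and the images $T^{-1}X_i$ satisfy the relations (6), while the remaining degree-0 generator $T^{-1}D$ is expressed through them:

```latex
% Sketch, assuming only the relations (7) and the centrality of T.
\begin{aligned}
[T^{-1}X_i,\; T^{-1}X_j] &= T^{-2}\,[X_i, X_j] = T^{-2}\,\hbar\theta_{ij}\,T^{2} = \hbar\theta_{ij},\\
T^{-1}D &= T^{-2}\,DT = \sum_{i,j} G_{ij}\,(T^{-1}X_i)(T^{-1}X_j).
\end{aligned}
```

The first line shows that the map is a well-defined homomorphism, and the second indicates why it reaches every element of degree 0; injectivity is not addressed by this sketch.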

This means that $\mathbb{C}^4_\hbar$ can be identified with the open subset {T ≠ 0} in $Q^4_\hbar$. For this reason, $Q^4_\hbar$ may be regarded as a compactification of $\mathbb{C}^4_\hbar$ which is compatible with the bilinear form G. Note also that the complement of $\mathbb{C}^4_\hbar$ in $Q^4_\hbar$ corresponds to the algebra

\[ Q_\hbar / \langle T \rangle = T(X_1, X_2, X_3, X_4, D) \Big/ \Big\langle [X_i, X_j] = [D, X_i] = 0,\ \sum_{i,j} G_{ij} X_i X_j = 0 \Big\rangle. \]

Since this algebra is commutative, the complement is the usual 3-dimensional commutative quadratic cone. Thus one may say that $Q^4_\hbar$ is obtained from $\mathbb{C}^4_\hbar$ by adding a cone “at infinity”. This is in complete analogy with the commutative case.

3.7. Noncommutative $\mathbb{P}^2_\hbar$ and $\mathbb{P}^3_\hbar$. Noncommutative deformations of the projective plane have been classified in [1, 2, 9]. We will need one of them, namely the one whose homogeneous coordinate algebra is a graded algebra $PP_\hbar = \bigoplus_{i \ge 0} (PP_\hbar)_i$ over C generated by the elements $w_1, w_2, w_3$ of degree 1 with the relations:

\[ [w_3, w_i] = 0 \ \text{ for any } i = 1, 2, 3, \qquad [w_1, w_2] = 2\hbar w_3^2. \tag{8} \]


We will also need a noncommutative deformation of the 3-dimensional projective space, whose homogeneous coordinate algebra will be denoted $PS_\hbar = \bigoplus_{i \ge 0} (PS_\hbar)_i$. It is a graded algebra over C generated by $(PS_\hbar)_1 = U$, where the vector space U is spanned by elements $z_1, z_2, z_3, z_4$ obeying the relations

\[ [z_3, z_i] = [z_4, z_i] = 0 \ \text{ for any } i = 1, 2, 3, 4, \qquad [z_1, z_2] = 2\hbar z_3 z_4. \tag{9} \]

The noncommutative projective varieties corresponding to $PP_\hbar$ and $PS_\hbar$ will be denoted $\mathbb{P}^2_\hbar$ and $\mathbb{P}^3_\hbar$, respectively. Note that for ħ ≠ 0 all algebras $PS_\hbar$ are isomorphic, and therefore the varieties $\mathbb{P}^3_\hbar$ are the same for all ħ ≠ 0. The same is true for $\mathbb{P}^2_\hbar$.

3.8. Subvarieties in $\mathbb{P}^3_\hbar$ and $\mathbb{P}^2_\hbar$. If $I \subset PS_\hbar$ is a graded two-sided ideal, then the quotient algebra $PS_\hbar/I$ corresponds to a closed subvariety $X(I) \subset \mathbb{P}^3_\hbar$. Let us describe some of them. Let J be the graded two-sided ideal generated by $z_3$ and $z_4$. Then

\[ PS_\hbar / J = T(z_1, z_2) / \langle [z_1, z_2] = 0 \rangle, \]

hence X(J) is the commutative projective line. For each point $p = (\lambda : \mu) \in \mathbb{P}^1$ let $J_p$ denote the graded two-sided ideal generated by $\lambda z_3 + \mu z_4$. If p = (0 : 1) or p = (1 : 0), then it is easy to see that $X(J_p)$ is the commutative projective plane. For all other $p \in \mathbb{P}^1$ we have

\[ PS_\hbar / J_p = T(z_1, z_2, z_3) \Big/ \Big\langle [z_1, z_3] = [z_2, z_3] = 0,\ [z_1, z_2] = -2\hbar \frac{\lambda}{\mu} z_3^2 \Big\rangle, \]

hence $X(J_p)$ is a noncommutative projective plane isomorphic to $\mathbb{P}^2_\hbar$. We have $J_p \subset J$ for all $p \in \mathbb{P}^1$, hence all planes $X(J_p)$ pass through the line X(J). Thus we see that $\mathbb{P}^3_\hbar$ is a pencil of noncommutative projective planes passing through a fixed commutative projective line. Similarly, the two-sided ideal generated by $w_3$ in $PP_\hbar$ corresponds to a commutative projective line $l = \{ w_3 = 0 \} \subset \mathbb{P}^2_\hbar$.

4. Properties of Algebras $PS_\hbar$ and $PP_\hbar$ and the Resolution of the Diagonal

This section is a preparation for the study of sheaves on $\mathbb{P}^3_\hbar$ and $\mathbb{P}^2_\hbar$. We show that the algebras $PS_\hbar$ and $PP_\hbar$ are regular and Koszul and construct the resolution of the diagonal, which will enable us to associate monads to certain bundles on $\mathbb{P}^2_\hbar$.

4.1. Quadratic algebras. A graded algebra $A = \bigoplus_{i \ge 0} A_i$ over a field k is called quadratic if it is connected (i.e. $A_0 = k$), is generated by the first component $A_1$, and the ideal of relations is generated by the subspace of quadratic relations $R(A) \subset A_1 \otimes A_1$. Therefore the algebra A can be represented as $T(A_1)/\langle R(A) \rangle$, where $T(A_1)$ is a free tensor algebra generated by the space $A_1$. The algebras $PS_\hbar$ and $PP_\hbar$ are quadratic algebras. For example, $PS_\hbar$ can be represented as $T(U)/\langle W \rangle$, where $U = (PS_\hbar)_1$ is a 4-dimensional vector space and W is the 6-dimensional subspace of U ⊗ U spanned by the relations (9).


4.2. The dual algebra. For any quadratic algebra $A = T(A_1)/\langle R(A) \rangle$ we can define its dual algebra, which is also quadratic. Let us identify $A_1^* \otimes A_1^*$ with $(A_1 \otimes A_1)^*$ by $(l \otimes m)(a \otimes b) = m(a)\, l(b)$. Denote by $R(A)^\perp$ the annihilator of R(A) in $A_1^* \otimes A_1^*$, i.e. the subspace which consists of those $q \in (A_1^*)^{\otimes 2}$ for which q(r) = 0 for any r ∈ R(A).

Definition 4.1 ([25]). The algebra $A^! = T(A_1^*)/\langle R(A)^\perp \rangle$ is called the dual algebra of A.

Example 4.2. Let $\{\check{z}_i\}$, i = 1, 2, 3, 4, be the basis of $(PS_\hbar^!)_1 = U^*$ which is dual to $\{z_i\}$. By definition, $PS_\hbar^!$ is generated by $\{\check{z}_i\}$ with defining relations

\[ \begin{aligned} \check{z}_i^2 &= 0 && \text{for all } i = 1, \dots, 4; \\ \check{z}_i \check{z}_j + \check{z}_j \check{z}_i &= 0 && \text{for all } i < j,\ (i, j) \ne (3, 4); \\ \check{z}_3 \check{z}_4 + \check{z}_4 \check{z}_3 &= \hbar [\check{z}_1, \check{z}_2] = 2\hbar \check{z}_1 \check{z}_2. \end{aligned} \]

In the commutative case the dual algebra of the symmetric algebra $S^\cdot(U)$ is isomorphic to the exterior algebra $\Lambda^\cdot(U^*)$. Obviously, the algebras $PS_\hbar^!$ and $PP_\hbar^!$ are deformations of exterior algebras. For example, the vector space $(PS_\hbar^!)_k$ is spanned by the elements $\check{z}_{i_1} \cdots \check{z}_{i_k}$ with $i_1 < \cdots < i_k$. In particular, the dimension of the vector space $(PS_\hbar^!)_k$ is equal to $\binom{4}{k}$. Similarly, the dimension of $(PP_\hbar^!)_k$ is equal to $\binom{3}{k}$.

Proposition 4.3. Let A be $PS_\hbar$ or $PP_\hbar$, and let n be 4 or 3, respectively. The multiplication map $A^!_k \otimes A^!_{n-k} \to A^!_n$ is a non-degenerate pairing. Hence the dual algebra $A^!$ is a Frobenius algebra, i.e. $(A^!)_{A^!} \cong ({}_{A^!} A^!)^*$ as right $A^!$-modules.

Proof. The proposition holds for the exterior algebra, and therefore also for the algebra $A^!$, since the latter is a “small” deformation of the exterior algebra. □

4.3. The Koszul complex. Consider right A-modules $(A^!_k)^* \otimes A$. The following complex $K_\cdot(A)$ is called the (right) Koszul complex of a quadratic algebra:

\[ \cdots \xrightarrow{d} (A^!_3)^* \otimes A(-3) \xrightarrow{d} (A^!_2)^* \otimes A(-2) \xrightarrow{d} (A^!_1)^* \otimes A(-1) \xrightarrow{d} (A^!_0)^* \otimes A \to 0, \]

where the map $d : (A^!_k)^* \otimes A \to (A^!_{k-1})^* \otimes A$ is a composition of two natural maps:

\[ (A^!_k)^* \otimes A \to (A^!_k)^* \otimes A^!_1 \otimes A_1 \otimes A \to (A^!_{k-1})^* \otimes A. \]

Here the first arrow sends α ⊗ a to α ⊗ e ⊗ a with e defined as

\[ e = \sum_i y_i \otimes x_i \in A^!_1 \otimes A_1, \]

with $\{x_i\}$ and $\{y_i\}$ being the dual bases of $A_1$ and $A^!_1$, respectively. The second map is determined by the algebra structures on $A^!$ and A. It is a well-known fact that $d^2 = 0$ (see, for example, [25]).

Let $k_A$ be the trivial right A-module. The Koszul complex $K_\cdot(A)$ possesses a natural augmentation $K_\cdot \xrightarrow{\varepsilon} k_A \to 0$.


Definition 4.4 (see [31]). A quadratic algebra $A = \bigoplus_{i \ge 0} A_i$ is called a Koszul algebra if the augmented Koszul complex $K_\cdot(A) \xrightarrow{\varepsilon} k_A \to 0$ is exact.

In the same manner one can define the left Koszul complex of a quadratic algebra. It is well known that the exactness of the right Koszul complex is equivalent to the exactness of the left Koszul complex (see, for example, [22]).

Proposition 4.5. The algebras $PS_\hbar$ and $PP_\hbar$ are Koszul algebras.

Proof. For ħ = 0 this is a well-known fact about the symmetric algebra $S^\cdot(U)$. Since the augmented Koszul complex is exact for ħ = 0, it is also exact for small ħ, and consequently for all ħ. □

Since the dual algebras $PS_\hbar^!$ and $PP_\hbar^!$ are finite, the Koszul resolutions for the algebras $PS_\hbar$ and $PP_\hbar$ are finite too and have the same form as the resolutions for ordinary symmetric algebras. For example, the Koszul resolution for $A = PP_\hbar$ is:

\[ \{ 0 \to (A^!_3)^* \otimes A(-3) \to (A^!_2)^* \otimes A(-2) \to (A^!_1)^* \otimes A(-1) \to (A^!_0)^* \otimes A \} \to \mathbb{C}. \]

4.4. Resolution of the diagonal. Consider a bigraded vector space

\[ K^2_{\cdot\cdot}(A) = \bigoplus_{k, l \ge 0} K^2_{k,l}(A) \quad \text{with} \quad K^2_{k,l}(A) = A(k) \otimes (A^!_{l-k})^* \otimes A(-l). \]

Consider morphisms $d_R : K^2_{k,l} \to K^2_{k,l-1}$ and $d_L : K^2_{k,l} \to K^2_{k+1,l}$ given by the following compositions:

\[ \begin{aligned} d_R &: A \otimes (A^!_k)^* \otimes A \to A \otimes (A^!_k)^* \otimes A^!_1 \otimes A_1 \otimes A \to A \otimes (A^!_{k-1})^* \otimes A, \\ d_L &: A \otimes (A^!_k)^* \otimes A \to A \otimes A_1 \otimes A^!_1 \otimes (A^!_k)^* \otimes A \to A \otimes (A^!_{k-1})^* \otimes A. \end{aligned} \]

Here the leftmost maps are given by

\[ e_R = \sum_i y_i \otimes x_i \in A^!_1 \otimes A_1 \quad \text{and} \quad e_L = \sum_i x_i \otimes y_i \in A_1 \otimes A^!_1, \]

where $\{x_i\}$ and $\{y_i\}$ are the dual bases of $A_1$ and $A^!_1$, respectively, while the rightmost maps are induced by the algebra structures of $A^!$ and A. It is easy to show that

\[ d_R^2 = d_L^2 = 0 \quad \text{and} \quad d_R d_L = d_L d_R, \]

hence $K^2_{\cdot\cdot}(A)$ is a bicomplex. It is called the double Koszul bicomplex of the quadratic algebra A. The topmost part of the bicomplex looks as follows:

\[ \begin{array}{ccccccc} \cdots & \xrightarrow{d_R} & A \otimes (A^!_{l+1})^* \otimes A(-1-l) & \xrightarrow{d_R} & A \otimes (A^!_l)^* \otimes A(-l) & \xrightarrow{d_R} & \cdots \\ & & \downarrow{\scriptstyle d_L} & & \downarrow{\scriptstyle d_L} & & \\ \cdots & \xrightarrow{d_R} & A(1) \otimes (A^!_l)^* \otimes A(-1-l) & \xrightarrow{d_R} & A(1) \otimes (A^!_{l-1})^* \otimes A(-l) & \xrightarrow{d_R} & \cdots \end{array} \]


Each term of the bicomplex $K^2_{\cdot\cdot}(A)$ has an obvious structure of a bigraded $A$-bimodule, and it is clear that the differentials are morphisms of bigraded $A$-bimodules. Let
\[
K_l(A) = \mathrm{Ker}\bigl(d_L : K^2_{0,l}(A) \to K^2_{1,l}(A)\bigr).
\]
Then $K_\cdot(A)$ is a complex of bigraded $A$-bimodules (with respect to the differential $d_R$). Consider a bigraded algebra $\Delta = \bigoplus_{i,j} \Delta_{ij}$ with $\Delta_{ij} = A_{i+j}$ and with the multiplication induced from $A$. The algebra $\Delta$ is called the diagonal bigraded algebra of $A$. Note that the multiplication map induces a surjective morphism of $A$-bimodules $\delta : A \otimes A \to \Delta$.

Lemma 4.6. The map $\delta : K_0(A) = A \otimes A \to \Delta$ gives an augmentation of the complex $K_\cdot(A)$.

Proof. We have to check that $\delta \cdot d_R : K_1(A) \to \Delta$ vanishes. Note that $K^2_{0,1}(A) = A \otimes A_1 \otimes A(-1)$, and that the differentials $d_R$ and $d_L$ restricted to $K^2_{0,1}(A)$ coincide with the multiplication maps $m_{1,2}$ and $m_{2,3}$, respectively. Thus we have the following commutative diagram:
\[
\begin{array}{ccccc}
K_1(A) & \xrightarrow{\ d_R\ } & K_0(A) & \xrightarrow{\ \delta\ } & \Delta \\
\big\downarrow & & \big\| & & \big\| \\
A \otimes A_1 \otimes A(-1) & \xrightarrow{\ m_{1,2}\ } & A \otimes A & \xrightarrow{\ \delta\ } & \Delta \\
\big\downarrow{\scriptstyle m_{2,3}} & & & & \\
A(1) \otimes A(-1) & & & &
\end{array}
\]
Now the lemma follows because $\delta \cdot m_{1,2} = \delta \cdot m_{2,3}$ (associativity) obviously annihilates $\mathrm{Ker}\, m_{2,3} = K_1(A)$. □

Proposition 4.7. If $A$ is Koszul, then $K_\cdot(A) \xrightarrow{\ \delta\ } \Delta$ is exact.

Proof. The $(p, q)$-bigraded component of $K^2_{k,l}(A)$ is equal to $A_{p+k} \otimes (A^!_{l-k})^* \otimes A_{q-l}$, hence the $(p, q)$-bigraded component of the bicomplex $K^2_{\cdot\cdot}(A)$ vanishes for $l < k$ or $l > q$. Thus the $(p, q)$-bigraded component of the bicomplex $K^2_{\cdot\cdot}(A)$ is bounded. Therefore both spectral sequences of the bicomplex $K^2_{\cdot\cdot}(A)$ converge to the cohomology of the total complex $\mathrm{Tot}(K^2_{\cdot\cdot}(A))$. The first term of the first spectral sequence reads
\[
E^1_{k,l} = \begin{cases} A(l) \otimes k(-l), & \text{if } k = l, \\ 0, & \text{otherwise.} \end{cases}
\]
Hence the spectral sequence degenerates in the first term, and we have
\[
H^0(\mathrm{Tot}(K^2_{\cdot\cdot}(A))) = \bigoplus_{l=0}^{\infty} A(l) \otimes k(-l), \qquad H^{\neq 0}(\mathrm{Tot}(K^2_{\cdot\cdot}(A))) = 0.
\]


On the other hand, the first term of the second spectral sequence reads
\[
E^1_{k,l} = \begin{cases} k(l) \otimes A(-l), & \text{if } k = l > 0, \\ K_l(A), & \text{if } k = 0, \\ 0, & \text{otherwise.} \end{cases}
\]
Hence the spectral sequence degenerates in the second term, and we have
\[
H^0(\mathrm{Tot}(K^2_{\cdot\cdot}(A))) = H^0(K_\cdot(A)) \oplus \Bigl( \bigoplus_{l=1}^{\infty} k(l) \otimes A(-l) \Bigr), \qquad
H^l(\mathrm{Tot}(K^2_{\cdot\cdot}(A))) = H^l(K_\cdot(A)).
\]
Therefore $H^{\neq 0}(K_\cdot(A)) = 0$, and we have an exact sequence
\[
0 \to H^0(K_\cdot(A)) \to \bigoplus_{l=0}^{\infty} A(l) \otimes k(-l) \to \bigoplus_{l=1}^{\infty} k(l) \otimes A(-l) \to 0.
\]
Looking at the $(p, q)$-bigraded component of this sequence we see that
\[
\bigl(H^0(K_\cdot(A))\bigr)_{p,q} = \begin{cases} A_{p+q}, & \text{if } p, q \ge 0, \\ 0, & \text{otherwise.} \end{cases}
\]
Thus $H^0(K_\cdot(A)) = \Delta$. □

Definition 4.8. Define the left $A$-module $\Omega^k$ as the cohomology of the left Koszul complex, truncated in the term $K_k$. In particular, $\Omega^1$ is defined by the so-called Euler sequence
\[
0 \to \Omega^1 \to A(-1) \otimes A_1 \xrightarrow{\ m\ } A \xrightarrow{\ \varepsilon\ } k \to 0. \tag{10}
\]

In Sect. 8.11 we will show that for noncommutative projective spaces the sheaves corresponding to the modules $\Omega^k$ can be regarded as sheaves of differential forms.

Proposition 4.9. We have $K_k(A) = \Omega^k(k) \otimes A(-k)$.

Proof. This follows immediately from the definition of $\Omega^k$ and $K_k(A)$. □

Combining Propositions 4.7 and 4.9, we obtain the following resolution of the diagonal:
\[
\cdots \longrightarrow \Omega^2(2) \otimes A(-2) \longrightarrow \Omega^1(1) \otimes A(-1) \longrightarrow A \otimes A \longrightarrow \Delta \longrightarrow 0. \tag{11}
\]


4.5. Cohomological properties of the algebras $PS_\hbar$ and $PP_\hbar$. First we note that both algebras $PS_\hbar$ and $PP_\hbar$ are noetherian. This follows from the fact that they are Ore extensions of commutative polynomial algebras (see, for example, [26]). For the same reason the algebras $PS_\hbar$ and $PP_\hbar$ have finite right (and left) global dimension, which is equal to 4 and 3, respectively (see [26], p. 273). We recall that the global dimension of a ring $A$ is the minimal number $n$ (if it exists) such that for any two modules $M$ and $N$ we have $\mathrm{Ext}^{n+1}_A(M, N) = 0$. In the paper [1] the notion of a regular algebra has been introduced. Regular algebras have many good properties (see [3, 2, 40], etc.).

Definition 4.10. A graded algebra $A$ is called regular of dimension $d$ if it satisfies the following conditions:
(1) $A$ has global dimension $d$,
(2) $A$ has polynomial growth, i.e. $\dim A_n \le c\,n^\delta$ for some $c, \delta \in \mathbb{R}$,
(3) $A$ is Gorenstein, meaning that $\mathrm{Ext}^i_A(k, A) = 0$ if $i \neq d$, and $\mathrm{Ext}^d_A(k, A) = k(l)$ for some $l$.
Here $\mathrm{Ext}_A$ stands for the Ext functor in the category $\mathrm{mod}(A)$.

It is easy to see that these properties are verified for $PS_\hbar$ and $PP_\hbar$. Property (2) holds because our algebras grow as ordinary polynomial algebras. Property (3) follows from the fact that $PS_\hbar$ and $PP_\hbar$ are Koszul algebras and the dual algebras are Frobenius algebras (Proposition 4.3). In this case the Gorenstein parameter $l$ in (3) is equal to the global dimension $d$. Thus we have

Proposition 4.11. The algebras $PS_\hbar$ and $PP_\hbar$ are noetherian regular algebras of global dimension 4 and 3, respectively. For these algebras the Gorenstein parameter $l$ coincides with the global dimension $d$.

5. Cohomological Properties of Sheaves on $\mathbb{P}^2_\hbar$ and $\mathbb{P}^3_\hbar$

5.1. Ampleness and cohomology of $\mathcal{O}(i)$. Let $A$ be a graded algebra and $X$ be the corresponding noncommutative projective variety. Consider the sequence of sheaves $\{\mathcal{O}(i)\}_{i \in \mathbb{Z}}$ in the category $\mathrm{coh}(X) \cong \mathrm{qgr}(A)$, where $\mathcal{O}(i) = \widetilde{A(i)}$.
This sequence is called ample if the following conditions hold:
(a) For every coherent sheaf $\mathcal{F}$ there are integers $k_1, \dots, k_s$ and an epimorphism $\bigoplus_{i=1}^{s} \mathcal{O}(-k_i) \longrightarrow \mathcal{F}$.
(b) For every epimorphism $\mathcal{F} \longrightarrow \mathcal{G}$ the induced map $\mathrm{Hom}(\mathcal{O}(-n), \mathcal{F}) \longrightarrow \mathrm{Hom}(\mathcal{O}(-n), \mathcal{G})$ is surjective for $n \gg 0$.

It is proved in [3] that the sequence $\{\mathcal{O}(i)\}$ is ample in $\mathrm{qgr}(A)$ for a graded right noetherian $k$-algebra $A$ if it satisfies the extra condition
\[
(\chi_1): \qquad \dim_k \mathrm{Ext}^1_A(k, M) < \infty
\]
for any finitely generated graded $A$-module $M$. This condition can be verified for all noetherian regular algebras (see [3], Theorem 8.1). In particular, the categories $\mathrm{coh}(\mathbb{P}^3_\hbar)$, $\mathrm{coh}(\mathbb{P}^2_\hbar)$ have ample sequences.


For any sheaf $\mathcal{F} \in \mathrm{qgr}(A)$ we can define a graded module $\Gamma(\mathcal{F})$ by the rule:
\[
\Gamma(\mathcal{F}) := \bigoplus_{i \ge 0} \mathrm{Hom}(\mathcal{O}(-i), \mathcal{F}).
\]
It is proved in [3] that for any noetherian algebra $A$ that satisfies the condition $\chi_1$ the correspondence $\Gamma$ is a functor from $\mathrm{qgr}(A)$ to $\mathrm{gr}(A)$ and the composition of $\Gamma$ with the natural projection $\pi : \mathrm{gr}(A) \longrightarrow \mathrm{qgr}(A)$ is isomorphic to the identity functor (see [3, Ch. 3,4]).

Now we formulate a result about the cohomology of sheaves on noncommutative projective spaces. This result is proved in [3] for a general regular algebra and parallels the commutative case.

Proposition 5.1 ([3, Theorem 8.1]). Let $A$ be $PS_\hbar$ or $PP_\hbar$, and $X$ be $\mathbb{P}^3_\hbar$ or $\mathbb{P}^2_\hbar$, respectively. Denote by $n$ the dimension of $X$ (in our case $n = 3$ or $n = 2$, respectively). Then
(1) The cohomological dimension of $\mathrm{coh}(X)$ is equal to $\dim(X)$, i.e. for any two coherent sheaves $\mathcal{F}$ and $\mathcal{G}$ the group $\mathrm{Ext}^i(\mathcal{F}, \mathcal{G})$ vanishes if $i > n$.
(2) There are isomorphisms
\[
H^p(X, \mathcal{O}(i)) = \begin{cases} A_i & \text{for } p = 0,\ i \ge 0, \\ A^*_{-i-1-n} & \text{for } p = n,\ i \le -n-1, \\ 0 & \text{otherwise.} \end{cases} \tag{12}
\]

This proposition and the ampleness of the sequence $\{\mathcal{O}(i)\}$ imply the following corollary:

Corollary 5.2. Let $X$ be either $\mathbb{P}^3_\hbar$ or $\mathbb{P}^2_\hbar$. Then for any sheaf $\mathcal{F} \in \mathrm{coh}(X)$ and for all sufficiently large $i \ge 0$ we have $\mathrm{Hom}(\mathcal{F}, \mathcal{O}(-i)) = 0$.

Proof. By ampleness a sheaf $\mathcal{F}$ can be covered by a finite sum of sheaves $\mathcal{O}(k_j)$. Now the statement follows from the proposition, because $\mathrm{Hom}(\mathcal{O}(k_j), \mathcal{O}(-i)) = 0$ for all $-i < k_j$. □

Corollary 5.3. Let $X$ be either $\mathbb{P}^3_\hbar$ or $\mathbb{P}^2_\hbar$. Then for any sheaf $\mathcal{F} \in \mathrm{coh}(X)$ and for all sufficiently large $i \ge 0$ we have $H^k(X, \mathcal{F}(i)) = 0$ for all $k \ge 1$.

Proof. The group $H^k(X, \mathcal{F}(i))$ coincides with $\mathrm{Ext}^k(\mathcal{O}(-i), \mathcal{F})$. Let $k$ be the maximal integer (it exists because the global dimension is finite) such that for some $\mathcal{F}$ there exists arbitrarily large $i$ such that $\mathrm{Ext}^k(\mathcal{O}(-i), \mathcal{F}) \neq 0$. Assume that $k \ge 1$. Choose an epimorphism $\bigoplus_{j=1}^{s} \mathcal{O}(-k_j) \to \mathcal{F}$. Let $\mathcal{F}_1$ denote its kernel. Then for $i > \max\{k_j\}$ we have $\mathrm{Ext}^{>0}(\mathcal{O}(-i), \bigoplus_{j=1}^{s} \mathcal{O}(-k_j)) = 0$, hence $\mathrm{Ext}^k(\mathcal{O}(-i), \mathcal{F}) \neq 0$ implies $\mathrm{Ext}^{k+1}(\mathcal{O}(-i), \mathcal{F}_1) \neq 0$. This contradicts the assumption of the maximality of $k$. □


5.2. Serre duality and the dualizing sheaf. A very useful property of commutative smooth projective varieties is the existence of the dualizing sheaf. Recall that a sheaf $\omega$ is called dualizing if for any $\mathcal{F} \in \mathrm{coh}(X)$ there are natural isomorphisms of $k$-vector spaces
\[
H^i(X, \mathcal{F}) \cong \mathrm{Ext}^{n-i}(\mathcal{F}, \omega)^*,
\]
where $*$ denotes the $k$-dual. The Serre duality theorem asserts the existence of the dualizing sheaf for smooth projective varieties. In this case the dualizing sheaf is a line bundle and coincides with the sheaf of differential forms $\Omega^n_X$ of top degree.

Since the definition of $\omega$ is given in abstract categorical terms, it can be extended to the noncommutative case. More precisely, we will say that $\mathrm{qgr}(A)$ satisfies classical Serre duality if there is an object $\omega \in \mathrm{qgr}(A)$ together with natural isomorphisms
\[
\mathrm{Ext}^i(\mathcal{O}, -) \cong \mathrm{Ext}^{n-i}(-, \omega)^*.
\]
Our noncommutative varieties $\mathbb{P}^3_\hbar$ and $\mathbb{P}^2_\hbar$ satisfy classical Serre duality, with dualizing sheaves being $\mathcal{O}_{\mathbb{P}^3_\hbar}(-4)$ and $\mathcal{O}_{\mathbb{P}^2_\hbar}(-3)$, respectively. This follows from the paper [40], where the existence of a dualizing sheaf in $\mathrm{qgr}(A)$ has been proved for a general class of algebras which includes all noetherian regular algebras. In addition, the authors of [40] showed that the dualizing sheaf coincides with $\widetilde{A}(-l)$, where $l$ is the Gorenstein parameter for $A$ (see condition (3) of Definition 4.10).

5.3. Bundles on noncommutative projective spaces. To any graded right $A$-module $M$ one can attach a left $A$-module $M^\vee = \mathrm{Hom}_A(M, A)$ which is also graded. Note that under this correspondence the right module $A_A(r)$ goes to the left module ${}_A A(-r)$. It is known that if $A$ is a noetherian regular algebra, then $\mathrm{Hom}_A(-, A)$ is a functor from the category $\mathrm{gr}(A)_R$ to the category $\mathrm{gr}(A)_L$. Moreover, its derived functor $R\mathrm{Hom}^\cdot_A(-, A)$ gives an anti-equivalence between the derived categories of $\mathrm{gr}(A)_R$ and $\mathrm{gr}(A)_L$ (see [39, 40, 38]).

If we assume that the composition of the functor $\mathrm{Hom}_A(-, A)$ with the projection $\mathrm{gr}(A)_L \longrightarrow \mathrm{qgr}(A)_L$ factors through the projection $\mathrm{gr}(A)_R \longrightarrow \mathrm{qgr}(A)_R$, then we obtain a functor from $\mathrm{qgr}(A)_R$ to $\mathrm{qgr}(A)_L$ which is denoted by $\underline{\mathrm{Hom}}(-, \mathcal{O})$. This functor is not right exact and has right derived functors $\underline{\mathrm{Ext}}^i(-, \mathcal{O})$, $i > 0$, from $\mathrm{qgr}(A)_R$ to $\mathrm{qgr}(A)_L$.

For a noetherian regular algebra the functor $\underline{\mathrm{Hom}}(-, \mathcal{O})$ and its right derived functors exist. This follows from the fact that the functors $\mathrm{Ext}^i_A(-, A)$ send a finite dimensional module to a finite dimensional module (see condition (3) of Definition 4.10). Moreover, in this case the functor $\underline{\mathrm{Hom}}(-, \mathcal{O})$ can be represented as the composition of the functor $\Gamma : \mathrm{qgr}(A)_R \longrightarrow \mathrm{gr}(A)_R$, the functor $\mathrm{Hom}_A(-, A) : \mathrm{gr}(A)_R \longrightarrow \mathrm{gr}(A)_L$, and the projection $\pi : \mathrm{gr}(A)_L \longrightarrow \mathrm{qgr}(A)_L$. This can be illustrated by the following commutative diagram:
\[
\begin{array}{ccc}
\mathrm{gr}(A)_R & \xrightarrow{\ \mathrm{Hom}_A(-,A)\ } & \mathrm{gr}(A)_L \\
{\scriptstyle \pi}\big\downarrow \big\uparrow{\scriptstyle \Gamma} & & \big\downarrow{\scriptstyle \pi} \\
\mathrm{qgr}(A)_R & \xrightarrow{\ \underline{\mathrm{Hom}}(-,\mathcal{O})\ } & \mathrm{qgr}(A)_L
\end{array} \tag{13}
\]
For a noetherian regular algebra the functor RHom·A (−, A) is an anti-equivalence between the derived categories of gr(A)R and gr(A)L and takes complexes of finite dimensional modules over gr(A)R to complexes of finite dimensional modules over gr(A)L .


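This implies that the functor $R\underline{\mathrm{Hom}}^\cdot(-, \mathcal{O})$ gives an anti-equivalence between the derived categories of $\mathrm{qgr}(A)_R$ and $\mathrm{qgr}(A)_L$. (Note that for derived functors $R\mathrm{Hom}_A(-, A)$ and $R\underline{\mathrm{Hom}}(-, \mathcal{O})$ there is also a commutative diagram like (13).)

The functors $\underline{\mathrm{Ext}}^j(-, \mathcal{O})$ can be described more explicitly. Let $M$ be an $A$-bimodule. Regarding it as a right module, we see that for any $\mathcal{F} \in \mathrm{QGr}(A)_R$ the groups $\mathrm{Ext}^i(\mathcal{F}, \widetilde{M})$ have the structure of left $A$-modules. We can project them to $\mathrm{QGr}(A)_L$. Thus each bimodule $M$ defines functors from $\mathrm{QGr}(A)_R$ to $\mathrm{QGr}(A)_L$, which will be denoted by $\pi\,\mathrm{Ext}^i(-, \widetilde{M})$. Now, using $\pi\Gamma = \mathrm{id}$ and the commutativity of the diagram (13) for the derived functors $\mathrm{Ext}^j_A(-, A)$ and $\underline{\mathrm{Ext}}^j(-, \mathcal{O})$, we obtain isomorphisms
\[
\underline{\mathrm{Ext}}^j(\mathcal{F}, \mathcal{O}) \cong \pi\,\mathrm{Ext}^j_A(\Gamma(\mathcal{F}), A) \cong \pi\,\mathrm{Ext}^j_{\mathrm{gr}(A)}\Bigl(\Gamma(\mathcal{F}), \bigoplus_{i \ge 0} A(i)\Bigr) \cong \pi\,\mathrm{Ext}^j\Bigl(\mathcal{F}, \bigoplus_{i \ge 0} \mathcal{O}(i)\Bigr) \tag{14}
\]
for any sheaf $\mathcal{F} \in \mathrm{qgr}(A)_R$.

Definition 5.4. We call a coherent sheaf $\mathcal{F} \in \mathrm{qgr}(A)_R$ locally free (or a bundle) if $\underline{\mathrm{Ext}}^j(\mathcal{F}, \mathcal{O}) = 0$ for any $j \neq 0$.

Remark. In the commutative case this definition is equivalent to the usual definition of a locally free sheaf.

Definition 5.5. The dual sheaf $\underline{\mathrm{Hom}}(\mathcal{F}, \mathcal{O}) \in \mathrm{qgr}(A)_L$ will be denoted by $\mathcal{F}^\vee$.

If $\mathcal{F} \in \mathrm{qgr}(A)_R$ is a bundle, then the dual sheaf $\mathcal{F}^\vee$ is a bundle in $\mathrm{qgr}(A)_L$, because $R\underline{\mathrm{Hom}}^\cdot(\mathcal{F}^\vee, \mathcal{O}) = \mathcal{F}$ in the derived category, and $\underline{\mathrm{Ext}}^j(\mathcal{F}^\vee, \mathcal{O}) = 0$ for $j \neq 0$. Thus we have a good definition of locally free sheaves on $\mathbb{P}^3_\hbar$ and $\mathbb{P}^2_\hbar$. Since the derived functor $R\underline{\mathrm{Hom}}(-, \mathcal{O})$ gives an anti-equivalence between the derived categories of $\mathrm{qgr}(A)_R$ and $\mathrm{qgr}(A)_L$, there is an isomorphism:
\[
\mathrm{Hom}(\mathcal{F}, \mathcal{G}) \cong \mathrm{Hom}(\mathcal{G}^\vee, \mathcal{F}^\vee) \tag{15}
\]
for any two bundles $\mathcal{F}$ and $\mathcal{G}$ on $\mathbb{P}^3_\hbar$ or $\mathbb{P}^2_\hbar$.

6. Bundles on $\mathbb{P}^2_\hbar$

6.1. Bundles on $\mathbb{P}^2_\hbar$ with a trivialization on the commutative line. In this section we study bundles on $\mathbb{P}^2_\hbar$. By definition, a bundle is an object $\mathcal{E} \in \mathrm{coh}(\mathbb{P}^2_\hbar)$ satisfying the additional condition $\underline{\mathrm{Ext}}^i(\mathcal{E}, \mathcal{O}) = 0$ for all $i > 0$ (see Definition 5.4). The noncommutative plane $\mathbb{P}^2_\hbar$ contains the commutative projective line $l \cong \mathbb{P}^1$ given by the equation $w_3 = 0$. If $M$ is a $PP_\hbar$-module, then the quotient module $M/Mw_3$ is a $PP_\hbar/w_3$-module. This gives a functor $\mathrm{coh}(\mathbb{P}^2_\hbar) \to \mathrm{coh}(\mathbb{P}^1)$, $\mathcal{F} \mapsto \mathcal{F}|_l$. The sheaf $\mathcal{F}|_l$ is referred to as the restriction of $\mathcal{F}$ to the line $l$.

Lemma 6.1. If $\mathcal{F}$ is a bundle, there is an exact sequence:
\[
0 \longrightarrow \mathcal{F}(-1) \xrightarrow{\ \cdot w_3\ } \mathcal{F} \longrightarrow \mathcal{F}|_l \longrightarrow 0. \tag{16}
\]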


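Proof. To prove this we only need to show that multiplication by $w_3$ is a monomorphism. If $\mathcal{F}$ is a bundle, it can be embedded into a direct sum $\bigoplus_{i=1}^{s} \mathcal{O}(k_i)$, because by ampleness the dual bundle $\mathcal{F}^\vee$ is covered by a direct sum of line bundles. Now, since the morphism $\mathcal{O}(k_i - 1) \xrightarrow{\ \cdot w_3\ } \mathcal{O}(k_i)$ is mono for any $i$, the same is true for the morphism $\mathcal{F}(-1) \xrightarrow{\ \cdot w_3\ } \mathcal{F}$. □

Lemma 6.2. Let $\mathcal{E}$ be a bundle on $\mathbb{P}^2_\hbar$ such that its restriction $\mathcal{E}|_l$ to the commutative line $l$ is isomorphic to a trivial bundle $\mathcal{O}_l^{\oplus r}$. Then
\[
H^0(\mathbb{P}^2_\hbar, \mathcal{E}(-1)) = H^0(\mathbb{P}^2_\hbar, \mathcal{E}(-2)) = H^2(\mathbb{P}^2_\hbar, \mathcal{E}(-1)) = H^2(\mathbb{P}^2_\hbar, \mathcal{E}(-2)) = 0.
\]

Proof. We have the following exact sequence in the category $\mathrm{coh}(\mathbb{P}^2_\hbar)$:
\[
0 \longrightarrow \mathcal{E}(-2) \longrightarrow \mathcal{E}(-1) \longrightarrow \mathcal{E}(-1)|_l \longrightarrow 0. \tag{17}
\]
Since $\mathcal{E}(-1)|_l \cong \mathcal{O}_l(-1)^{\oplus r}$, we have $H^0(\mathcal{E}(-1)|_l) = 0$. Assume that $\mathcal{E}(-1)$ has a nontrivial section. Then $\mathcal{E}(-2)$ has a nontrivial section too. For the same reason $\mathcal{E}(-3)$ has a nontrivial section, and so on. Thus for any $n > 0$ the bundle $\mathcal{E}(-n)$ has a nontrivial section. By (15) we have isomorphisms:
\[
H^0(\mathcal{E}(-n)) \cong \mathrm{Hom}(\mathcal{O}(n), \mathcal{E}) \cong \mathrm{Hom}(\mathcal{E}^\vee, \mathcal{O}(-n)).
\]
On the other hand, by Corollary 5.2 the last group is trivial for $n \gg 0$. Hence $H^0(\mathcal{E}(-n)) = 0$ for all $n \gg 0$, and consequently $H^0(\mathcal{E}(-2)) = H^0(\mathcal{E}(-1)) = 0$.

Further, assume that $H^2(\mathcal{E}(-2))$ is nontrivial. Since $H^1(\mathcal{E}(i)|_l) = 0$ for all $i \ge -1$ we have from the exact sequence (16) with $\mathcal{F} = \mathcal{E}(i)$ that $H^2(\mathcal{E}(i))$ is nontrivial too for all $i \ge -1$. But this contradicts Corollary 5.3. Therefore $H^2(\mathcal{E}(-2)) = H^2(\mathcal{E}(-1)) = 0$. This completes the proof. □

6.2. Monads on $\mathbb{P}^2_\hbar$ and $\mathbb{P}^3_\hbar$. As in the commutative case, a non-degenerate monad on $\mathbb{P}^2_\hbar$ or $\mathbb{P}^3_\hbar$ is a complex over $\mathrm{coh}(\mathbb{P}^2_\hbar)$
\[
0 \longrightarrow H \otimes \mathcal{O}(-1) \xrightarrow{\ m\ } K \otimes \mathcal{O} \xrightarrow{\ n\ } L \otimes \mathcal{O}(1) \longrightarrow 0
\]
for which the map $n$ is an epimorphism and $m$ is a monomorphism. (Note that there is another more restrictive definition of a monad, according to which the dual map $(m)^*$ has to be an epimorphism, see [30].) The coherent sheaf $\mathcal{E} = \mathrm{Ker}(n)/\mathrm{Im}(m)$ is called the cohomology of a monad.

A morphism between two monads is a morphism of complexes. The following lemma is proved in [30, Lemma 4.1.3] in the commutative case, but the proof is categorical and applies to the noncommutative case as well.

Lemma 6.3. Let $X$ be either $\mathbb{P}^2_\hbar$ or $\mathbb{P}^3_\hbar$, and let $\mathcal{E}$ and $\mathcal{E}'$ be the cohomology bundles of two monads
\[
\begin{aligned}
M &: 0 \longrightarrow H \otimes \mathcal{O}(-1) \xrightarrow{\ m\ } K \otimes \mathcal{O} \xrightarrow{\ n\ } L \otimes \mathcal{O}(1) \longrightarrow 0, \\
M' &: 0 \longrightarrow H' \otimes \mathcal{O}(-1) \xrightarrow{\ m'\ } K' \otimes \mathcal{O} \xrightarrow{\ n'\ } L' \otimes \mathcal{O}(1) \longrightarrow 0
\end{aligned}
\]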


on $X$. Then the natural mapping $\mathrm{Hom}(M, M') \longrightarrow \mathrm{Hom}(\mathcal{E}, \mathcal{E}')$ is bijective.

The proof is based on the fact that $\mathrm{Ext}^j(\mathcal{O}, \mathcal{O}(-1)) = \mathrm{Ext}^j(\mathcal{O}(1), \mathcal{O}(-1)) = \mathrm{Ext}^j(\mathcal{O}(1), \mathcal{O}) = 0$ for all $j$ (see [30], Lemma 4.1.3).

6.3. Non-degeneracy conditions. In the definition of a monad we require that the map $n$ be an epimorphism. In the commutative case this condition must be verified pointwise. In the noncommutative case the situation is simpler in some sense, because the complement of the commutative line $l$ does not have points.

Lemma 6.4. If the restriction of a sheaf $\mathcal{F} \in \mathrm{coh}(\mathbb{P}^2_\hbar)$ to the projective line $l$ is the zero object, then $\mathcal{F}$ is also the zero object.

Proof. Let $M$ be a finitely generated graded $PP_\hbar$-module such that $\mathcal{F} \cong \widetilde{M}$. Consider an exact sequence:
\[
M \xrightarrow{\ \cdot w_3\ } M(1) \longrightarrow N \longrightarrow 0.
\]
Since $\widetilde{N} = \mathcal{F}(1)|_l = 0$, the module $N$ is finite dimensional. This implies that for $i \gg 0$ the map $M_i \xrightarrow{\ \cdot w_3\ } M_{i+1}$ is surjective. Moreover, these maps are isomorphisms for $i \gg 0$, because all $M_i$ are finite dimensional vector spaces. Let us identify all $M_i$ for $i \gg 0$ with respect to these isomorphisms. Using the $A$-module structure on $M$, we obtain a representation of the Weyl algebra $T(X, Y)/\langle [X, Y] = 2\hbar \rangle$ on the vector space $M_i$. But it is well known that the Weyl algebra does not have finite dimensional representations. Thus $M_i = 0$ for all $i \gg 0$, and $M$ is finite dimensional. Therefore $\mathcal{F} = 0$. □

The following corollary is an immediate consequence of the lemma.

Corollary 6.5. Let $f : \mathcal{F} \longrightarrow \mathcal{G}$ be a morphism in $\mathrm{coh}(\mathbb{P}^2_\hbar)$. Suppose its restriction $\bar f : \mathcal{F}|_l \longrightarrow \mathcal{G}|_l$ is an epimorphism. Then $f$ is an epimorphism too.

6.4. From the resolution of the diagonal to a monad. Let $M$ be an $A$-bimodule. Regarding it as a left module, we see that for any $\mathcal{F} \in \mathrm{QGr}(A)_L$ the groups $\mathrm{Ext}^i(\mathcal{F}, \widetilde{M})$ have the structure of right $A$-modules. We can project them to $\mathrm{QGr}(A)_R$. Thus each bimodule $M$ defines functors $\pi\,\mathrm{Ext}^i(-, \widetilde{M})$ from $\mathrm{QGr}(A)_L$ to $\mathrm{QGr}(A)_R$.

Let $\mathcal{E}$ be a bundle on $\mathbb{P}^2_\hbar$ such that its restriction to the line $l$ is a trivial bundle. Let us consider the bundle $\mathcal{E}^\vee(1) \in \mathrm{qgr}(PP_\hbar)_L$ and the resolution of the diagonal $K_\cdot(PP_\hbar)$, which has only three terms:
\[
\{0 \longrightarrow PP_\hbar(-1) \otimes PP_\hbar(-2) \longrightarrow \Omega^1(1) \otimes PP_\hbar(-1) \longrightarrow PP_\hbar \otimes PP_\hbar\} \longrightarrow \Delta.
\]
The resolution of the diagonal is a complex of bimodules. It induces a complex $\widetilde{K}^\cdot$ over $\mathrm{QGr}(PP_\hbar)_L$:
\[
\{0 \longrightarrow \mathcal{O}(-1) \otimes PP_\hbar(-2) \longrightarrow \widetilde{\Omega}^1(1) \otimes PP_\hbar(-1) \longrightarrow \mathcal{O} \otimes PP_\hbar\} \longrightarrow \widetilde{\Delta}, \tag{18}
\]


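where $\widetilde{\Omega}^1$ is a sheaf on $\mathbb{P}^2_\hbar$ corresponding to the $PP_\hbar$-module $\Omega^1$.

As described above, each $A$-bimodule $M$ gives the functors $\pi\,\mathrm{Ext}^i(-, \widetilde{M})$ from $\mathrm{QGr}(A)_L$ to $\mathrm{QGr}(A)_R$. In particular, each object of the resolution of the diagonal induces such functors.

First we calculate these functors for the object $\widetilde{\Delta}$. Note that the object $\widetilde{\Delta}$ coincides with $\bigoplus_{i \ge 0} \mathcal{O}(i)$. Hence by (14) we have
\[
\pi\,\mathrm{Ext}^j(\mathcal{E}^\vee(1), \widetilde{\Delta}) = 0
\]
if $j > 0$, while $\pi\,\mathrm{Ext}^0(\mathcal{E}^\vee(1), \widetilde{\Delta}) \cong \mathcal{E}(-1)$. The resolution of the diagonal (18) gives us a spectral sequence with the $E_1$ term
\[
E_1^{pq} = \pi\,\mathrm{Ext}^q(\mathcal{E}^\vee(1), \widetilde{K}^{-p}) \Longrightarrow \pi\,\mathrm{Ext}^{p+q}(\mathcal{E}^\vee(1), \widetilde{\Delta}),
\]
which converges to
\[
E_\infty^i = \begin{cases} \mathcal{E}(-1) & \text{if } i = 0, \\ 0 & \text{otherwise.} \end{cases}
\]

Now we describe all terms $E_1^{pq}$ of this spectral sequence. First we have
\[
\pi\,\mathrm{Ext}^j(\mathcal{E}^\vee(1), \mathcal{O} \otimes PP_\hbar) \cong \mathrm{Ext}^j(\mathcal{E}^\vee(1), \mathcal{O}) \otimes \widetilde{PP_\hbar} \cong \mathrm{Ext}^j(\mathcal{E}^\vee(1), \mathcal{O}) \otimes \mathcal{O} \cong H^j(\mathbb{P}^2_\hbar, \mathcal{E}(-1)) \otimes \mathcal{O}.
\]
By Lemma 6.2, these groups are trivial for $j \neq 1$. For the same reason we have
\[
\pi\,\mathrm{Ext}^j(\mathcal{E}^\vee(1), \mathcal{O}(-1) \otimes PP_\hbar(-2)) = H^j(\mathbb{P}^2_\hbar, \mathcal{E}(-2)) \otimes \mathcal{O}(-2) = 0
\]
for $j \neq 1$ and
\[
\pi\,\mathrm{Ext}^1(\mathcal{E}^\vee(1), \mathcal{O}(-1) \otimes PP_\hbar(-2)) \cong H^1(\mathbb{P}^2_\hbar, \mathcal{E}(-2)) \otimes \mathcal{O}(-2).
\]
Now let us consider the functors which are associated with the object $\widetilde{\Omega}^1(1) \otimes PP_\hbar(-1)$. We have
\[
\pi\,\mathrm{Ext}^j(\mathcal{E}^\vee(1), \widetilde{\Omega}^1(1) \otimes PP_\hbar(-1)) \cong \mathrm{Ext}^j(\mathcal{E}^\vee, \widetilde{\Omega}^1) \otimes \mathcal{O}(-1).
\]
It follows from the Koszul complex that the sheaf $\widetilde{\Omega}^1$ can be included in two exact sequences:
\[
\begin{aligned}
&0 \longrightarrow \widetilde{\Omega}^1 \longrightarrow \mathcal{O}(-1) \otimes (PP_\hbar)_1 \longrightarrow \mathcal{O} \longrightarrow 0, \\
&0 \longrightarrow \mathcal{O}(-3) \longrightarrow \mathcal{O}(-2) \otimes ((PP_\hbar)_1)^* \longrightarrow \widetilde{\Omega}^1 \longrightarrow 0.
\end{aligned}
\]
Applying the functor $\mathrm{Hom}(\mathcal{E}^\vee, -)$ to the first sequence and taking into account that $\mathrm{Hom}(\mathcal{E}^\vee, \mathcal{O}(-1)) = 0$, we obtain $\mathrm{Hom}(\mathcal{E}^\vee, \widetilde{\Omega}^1) = 0$. Similarly, we deduce from the second sequence that $\mathrm{Ext}^2(\mathcal{E}^\vee, \widetilde{\Omega}^1) = 0$, because $\mathrm{Ext}^2(\mathcal{E}^\vee, \mathcal{O}(-2)) = 0$. This implies that the object $\pi\,\mathrm{Ext}^j(\mathcal{E}^\vee(1), \widetilde{\Omega}^1(1) \otimes PP_\hbar(-1))$ is trivial for all $j \neq 1$.

Thus our spectral sequence is nothing more than the complex
\[
\pi\,\mathrm{Ext}^1(\mathcal{E}^\vee(1), \widetilde{K}^{-2}) \longrightarrow \pi\,\mathrm{Ext}^1(\mathcal{E}^\vee(1), \widetilde{K}^{-1}) \longrightarrow \pi\,\mathrm{Ext}^1(\mathcal{E}^\vee(1), \widetilde{K}^{0}),
\]
which is isomorphic to the complex
\[
H^1(\mathbb{P}^2_\hbar, \mathcal{E}(-2)) \otimes \mathcal{O}(-2) \longrightarrow \mathrm{Ext}^1(\mathcal{E}^\vee, \widetilde{\Omega}^1) \otimes \mathcal{O}(-1) \longrightarrow H^1(\mathbb{P}^2_\hbar, \mathcal{E}(-1)) \otimes \mathcal{O}.
\]
It has only one cohomology which coincides with $\mathcal{E}(-1)$.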


Theorem 6.6. Let $\mathcal{E}$ be a bundle on $\mathbb{P}^2_\hbar$ such that its restriction to the commutative line $l$ is isomorphic to the trivial bundle $\mathcal{O}_l^{\oplus r}$. Then $\mathcal{E}$ is the cohomology of a monad
\[
0 \longrightarrow H \otimes \mathcal{O}(-1) \xrightarrow{\ m\ } K \otimes \mathcal{O} \xrightarrow{\ n\ } L \otimes \mathcal{O}(1) \longrightarrow 0
\]
with $H = H^1(\mathbb{P}^2_\hbar, \mathcal{E}(-2))$, $L = H^1(\mathbb{P}^2_\hbar, \mathcal{E}(-1))$, and such a monad is unique up to an isomorphism. Moreover, in this case the vector spaces $H$ and $L$ have the same dimension.

Proof. The existence of such a monad was proved above. The uniqueness follows from Lemma 6.3. The equality of dimensions of $H$ and $L$ follows immediately from the exact sequence (17). □

6.5. Barth description of monads. Now following Barth [8], we give a description of the moduli space of vector bundles on $\mathbb{P}^2_\hbar$ trivial on the line $l$ in terms of linear algebra (see also [15]). Denote by $\mathfrak{M}_\hbar(r, 0, k)$ the moduli space of bundles on the noncommutative $\mathbb{P}^2_\hbar$ trivial on the line $l$ and with a fixed trivialization there (i.e. with a fixed isomorphism $\mathcal{E}|_l \cong \mathcal{O}_l^{\oplus r}$). Let $\dim H^1(\mathbb{P}^2_\hbar, \mathcal{E}(-1)) = k$. As in the commutative case, the numbers $r, 0, k$ can be regarded as the rank, first Chern class, and second Chern class of $\mathcal{E}$, respectively. The following theorem gives a description of this moduli space which is similar to the description given by Barth in the commutative case.

Theorem 6.7. Let $\{(b_1, b_2; j, i)\}$ be the set of quadruples of matrices $b_1, b_2 \in M_{k \times k}(\mathbb{C})$, $j \in M_{r \times k}(\mathbb{C})$, $i \in M_{k \times r}(\mathbb{C})$, which satisfy the condition
\[
[b_1, b_2] + ij + 2\hbar \cdot 1_{k \times k} = 0.
\]
Then the space $\mathfrak{M}_\hbar(r, 0, k)$ is the quotient of this set with respect to the following free action of $GL(k, \mathbb{C})$:
\[
b_{1,2} \mapsto g\, b_{1,2}\, g^{-1}, \qquad j \mapsto j g^{-1}, \qquad i \mapsto g i,
\]
where $g \in GL(k, \mathbb{C})$.
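It may help to verify directly that the action in Theorem 6.7 preserves the defining matrix equation. The computation below uses only the formulas in the statement of the theorem; the trace identity at the end is an elementary consequence added here as an illustration.

```latex
\begin{align*}
[g b_1 g^{-1},\, g b_2 g^{-1}] + (g i)(j g^{-1}) + 2\hbar \cdot 1_{k\times k}
  &= g\,[b_1, b_2]\,g^{-1} + g\,(ij)\,g^{-1} + 2\hbar\, g g^{-1} \\
  &= g \bigl( [b_1, b_2] + ij + 2\hbar \cdot 1_{k\times k} \bigr) g^{-1} = 0 .
\end{align*}
% Taking the trace of the defining equation, and using tr[b_1, b_2] = 0:
\[
\operatorname{tr}(ij) = -2\hbar k ,
\]
% so for \hbar \neq 0 and k > 0 the matrices i and j cannot vanish,
% and in particular the equation has no solutions with r = 0.
```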

Proof. Let $\mathcal{E}$ be a bundle on $\mathbb{P}^2_\hbar$ trivial on the line $l$. We showed above that any such bundle comes from a monad unique up to an isomorphism. Conversely, suppose we have a monad
\[
0 \longrightarrow H \otimes \mathcal{O}(-1) \xrightarrow{\ m\ } K \otimes \mathcal{O} \xrightarrow{\ n\ } L \otimes \mathcal{O}(1) \longrightarrow 0 \tag{19}
\]
with $\dim H = \dim L = k$ such that its restriction to the line $l$ is a monad with the cohomology $\mathcal{O}_l^{\oplus r}$. Then the cohomology of this monad is a bundle on $\mathbb{P}^2_\hbar$ which belongs to $\mathfrak{M}_\hbar(r, 0, k)$. Indeed, the cohomologies of the dual complex
\[
0 \longrightarrow \mathcal{O}(-1) \otimes L^* \xrightarrow{\ n^*\ } \mathcal{O} \otimes K^* \xrightarrow{\ m^*\ } \mathcal{O}(1) \otimes H^* \longrightarrow 0
\]
coincide with $\underline{\mathrm{Hom}}(\mathcal{E}, \mathcal{O})$ and $\underline{\mathrm{Ext}}^1(\mathcal{E}, \mathcal{O})$. Hence, to prove that $\mathcal{E}$ is a bundle, it is sufficient to show that the dual complex is a monad too, i.e. that the map $m^*$ is an epimorphism. The restriction of the dual complex to $l$ is a monad which is dual to the restriction of the monad (19) to $l$. Hence the restriction of $m^*$ to $l$ is an epimorphism. Then, by Corollary 6.5, $m^*$ is an epimorphism as well. Thus to describe the moduli space


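$\mathfrak{M}_\hbar(r, 0, k)$ we have to describe the space of all monads (19) modulo isomorphisms preserving trivialization on $l$. Consider a monad
\[
0 \longrightarrow H \otimes \mathcal{O}(-1) \xrightarrow{\ m\ } K \otimes \mathcal{O} \xrightarrow{\ n\ } L \otimes \mathcal{O}(1) \longrightarrow 0
\]
with $\dim H = \dim L = k$ and $\dim K = 2k + r$. Denote by $\mathcal{E}$ its cohomology bundle. The maps $m$ and $n$ can be regarded as elements of $H^* \otimes K \otimes W$ and $K^* \otimes L \otimes W$, respectively, where $W = H^0(\mathbb{P}^2_\hbar, \mathcal{O}(1))$ is the vector space spanned by $w_1, w_2, w_3$. The maps $m$ and $n$ can be written as
\[
m = m_1 w_1 + m_2 w_2 + m_3 w_3, \qquad n = n_1 w_1 + n_2 w_2 + n_3 w_3,
\]
where $m_i : H \to K$ and $n_i : K \to L$ are constant linear maps. Let us restrict the monad to the line $l$. The monadic condition $nm = 0$ implies now:
\[
n_1 m_2 + n_2 m_1 = 0, \qquad n_1 m_1 = 0, \qquad n_2 m_2 = 0.
\]
Moreover, since the restriction of $\mathcal{E}$ to $l$ is trivial, the composition $n_1 m_2$ is an isomorphism (see [30], Lemma 4.2.3). We can choose bases for $H, K, L$ so that $n_1 m_2 = 1_{k \times k}$ (the identity matrix) and
\[
m_1 = \begin{pmatrix} 1_{k \times k} \\ 0_{k \times k} \\ 0_{r \times k} \end{pmatrix}, \quad
m_2 = \begin{pmatrix} 0_{k \times k} \\ 1_{k \times k} \\ 0_{r \times k} \end{pmatrix}, \quad
n_1 = \begin{pmatrix} 0_{k \times k} & 1_{k \times k} & 0_{k \times r} \end{pmatrix}, \quad
n_2 = \begin{pmatrix} -1_{k \times k} & 0_{k \times k} & 0_{k \times r} \end{pmatrix}.
\]
Using the equations $n_3 m_1 + n_1 m_3 = 0$ and $n_3 m_2 + n_2 m_3 = 0$ we can write:
\[
m_3 = \begin{pmatrix} b_1 \\ b_2 \\ j \end{pmatrix}, \qquad n_3 = \begin{pmatrix} -b_2 & b_1 & i \end{pmatrix}.
\]
Now the monadic condition $nm = 0$ can be written as:
\[
(n_3 m_3) \cdot w_3^2 + 1_{k \times k} \cdot [w_1, w_2] = 0.
\]
Therefore we obtain the following matrix equation:
\[
[b_1, b_2] + ij + 2\hbar \cdot 1_{k \times k} = 0.
\]
Note that the last $r$ basis vectors of $K$ give us a trivialization of the restriction of $\mathcal{E}$ to the line $l$. It is easy to check that any isomorphism of a monad which preserves trivialization on $l$ and the choice of the bases of $H, K, L$ made above has the form
\[
b_{1,2} \mapsto g\, b_{1,2}\, g^{-1}, \qquad j \mapsto j g^{-1}, \qquad i \mapsto g i,
\]
where $g \in GL(k, \mathbb{C})$. This proves the theorem. □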
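The linear algebra in the proof can be checked on a small example. The sketch below is purely illustrative: the particular matrices (with $k = 2$, $r = 1$, $\hbar = 1/2$) and all helper names are assumptions chosen here, not data from the text. It builds $m_3$ and $n_3$ in block form, confirms that $n_3 m_3 = [b_1, b_2] + ij$, that the matrix equation holds, and that it still holds after acting with an element of $GL(2)$.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[r][t] * b[t][c] for t in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(s, a):
    return [[s * x for x in row] for row in a]

def identity(n):
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]

def commutator(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))

def residual(b1, b2, i, j, hbar):
    """Left-hand side [b1, b2] + i j + 2*hbar*1 of the matrix equation."""
    return add(add(commutator(b1, b2), matmul(i, j)),
               scale(2 * hbar, identity(len(b1))))

def act(g, g_inv, b1, b2, i, j):
    """The GL(k) action b -> g b g^-1, i -> g i, j -> j g^-1."""
    return (matmul(matmul(g, b1), g_inv), matmul(matmul(g, b2), g_inv),
            matmul(g, i), matmul(j, g_inv))

# Hypothetical sample data with k = 2, r = 1, hbar = 1/2.
hbar = Fraction(1, 2)
b1 = [[0, 1], [0, 0]]
b2 = [[0, 0], [1, 0]]
i = [[-2], [0]]          # k x r
j = [[1, 0]]             # r x k

# Block forms from the proof: m3 is the column (b1, b2, j), n3 the row (-b2, b1, i).
m3 = b1 + b2 + j
n3 = [rb2 + rb1 + ri for rb2, rb1, ri in zip(scale(-1, b2), b1, i)]

zero = [[0, 0], [0, 0]]
g, g_inv = [[1, 1], [0, 1]], [[1, -1], [0, 1]]

print(matmul(n3, m3) == add(commutator(b1, b2), matmul(i, j)))
print(residual(b1, b2, i, j, hbar) == zero)
print(residual(*act(g, g_inv, b1, b2, i, j), hbar) == zero)
```

Exact rational arithmetic is used so that the comparisons with the zero matrix are not affected by rounding.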


7. The Noncommutative Variety $\mathbb{P}^3_\hbar$ as a Twistor Space

7.1. Real structures. A $*$-algebra is, by definition, an algebra over $\mathbb{C}$ with an anti-linear anti-homomorphism $*$ satisfying $*^2 = \mathrm{id}$. A $*$-structure on a (graded) algebra is regarded as a real structure on the corresponding (projective) noncommutative variety.

Let us introduce real structures on the complex varieties $\mathbb{C}^4_\hbar$ and $\mathbb{Q}^4_\hbar$ defined in Sect. 3. Assume that in (6), (7) the skew-symmetric matrix $\theta$ is purely imaginary and $\hbar$ is real. Then there is a unique $*$-structure on the algebra $A(\mathbb{C}^4_\hbar)$ such that $x_i^* = x_i$. We denote the corresponding noncommutative variety by $\mathbb{R}^4_\hbar$. Assume in addition that the symmetric matrix $G$ in (7) is real and positive definite. There is a unique $*$-structure on the algebra $Q_\hbar$ such that $X_i^* = X_i$, $D^* = D$, and $T^* = T$. The corresponding noncommutative real variety will be called the noncommutative sphere and denoted by $\mathbb{S}^4_\hbar$. The embedding of $\mathbb{C}^4_\hbar$ into $\mathbb{Q}^4_\hbar$ induces an embedding $\mathbb{R}^4_\hbar \to \mathbb{S}^4_\hbar$. Recall that the complement of $\mathbb{C}^4_\hbar$ in $\mathbb{Q}^4_\hbar$ is a commutative quadratic cone $\sum_{kl} G^{kl} X_k X_l = 0$ which has only one real point. Thus $\mathbb{S}^4_\hbar$ can be regarded as a one-point compactification of $\mathbb{R}^4_\hbar$.

By a linear change of basis one can bring the pair $(G, \theta)$ to the standard form
\[
G = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad
\theta = \sqrt{-1} \begin{pmatrix} 0 & a & 0 & 0 \\ -a & 0 & 0 & 0 \\ 0 & 0 & 0 & b \\ 0 & 0 & -b & 0 \end{pmatrix}. \tag{20}
\]
Furthermore, since $\hbar$ and $\theta$ enter only in the combination $\hbar \cdot \theta$, and we assume that $a + b \neq 0$, we can set $a + b = 1$ without loss of generality.

7.2. Realification of $\mathbb{P}^3_\hbar$. Recall that the noncommutative projective space $\mathbb{P}^3_\hbar$ corresponds to the algebra $PS_\hbar$ with generators $z_i$, $i = 1, 2, 3, 4$, and relations (9). Consider an algebra $\widetilde{PS}_\hbar$ with generators $z_i, \bar z_i$, $i = 1, 2, 3, 4$, and relations
\[
\begin{aligned}
&[z_1, z_2] = 2\hbar(a + b) z_3 z_4, && [z_1, \bar z_1] = 2\hbar b\, z_3 \bar z_3 - 2\hbar a\, z_4 \bar z_4, && [z_1, \bar z_2] = 0, \\
&[\bar z_1, \bar z_2] = -2\hbar(a + b) \bar z_3 \bar z_4, && [z_2, \bar z_2] = 2\hbar a\, z_3 \bar z_3 - 2\hbar b\, z_4 \bar z_4, && [z_2, \bar z_1] = 0, \\
&[z_i, z_j] = [z_i, \bar z_j] = [\bar z_i, z_j] = [\bar z_i, \bar z_j] = 0 && \text{for all } i = 3, 4;\ j = 1, 2, 3, 4.
\end{aligned} \tag{21}
\]
There is a unique $*$-structure on this algebra such that $z_i^* = \bar z_i$, $\bar z_i^* = z_i$. We denote the corresponding real variety $\mathbb{P}^3_\hbar(\mathbb{R})$. This variety can be considered a realization of $\mathbb{P}^3_\hbar$.

Remark. In contrast to the commutative situation, a noncommutative complex variety in general has many different realizations. We have an ambiguity in the choice of relations involving both $z_i$ and $\bar z_j$. The realization (21) is distinguished by the fact that it is the twistor space of the noncommutative sphere $\mathbb{S}^4_\hbar$, as explained below.

In the commutative case there is a map from $\mathbb{P}^3(\mathbb{R})$ to the sphere $\mathbb{S}^4$ which is a $\mathbb{P}^1$ fibration. The corresponding $\mathbb{P}^1$ bundle is the projectivization of a spinor bundle on $\mathbb{S}^4$. This map is known as the Penrose map. In the noncommutative case we have a


similar picture. The analogue of the Penrose map is a map $\Pi : \mathbb{P}^3_\hbar(\mathbb{R}) \longrightarrow \mathbb{S}^4_\hbar$ which is associated with the homomorphism of $*$-algebras $Q_\hbar \longrightarrow \widetilde{PS}_\hbar$:
\[
\begin{aligned}
X_1 &\mapsto -\tfrac{\sqrt{-1}}{2} (z_1 \bar z_4 - \bar z_1 z_4 - \bar z_2 z_3 + z_2 \bar z_3), &
D &\mapsto -\tfrac{1}{2} (z_1 \bar z_1 + \bar z_1 z_1 + z_2 \bar z_2 + \bar z_2 z_2), \\
X_2 &\mapsto \tfrac{1}{2} (z_1 \bar z_4 + \bar z_1 z_4 - \bar z_2 z_3 - z_2 \bar z_3), &
T &\mapsto -(z_3 \bar z_3 + z_4 \bar z_4), \\
X_3 &\mapsto -\tfrac{\sqrt{-1}}{2} (\bar z_1 z_3 - z_1 \bar z_3 + z_2 \bar z_4 - \bar z_2 z_4), & & \\
X_4 &\mapsto \tfrac{1}{2} (z_1 \bar z_3 + \bar z_1 z_3 + \bar z_2 z_4 + z_2 \bar z_4). & &
\end{aligned}
\]
Note that for $\hbar = 0$ we obtain the homomorphism of commutative algebras which corresponds to the usual Penrose map. This means that $\mathbb{P}^3_\hbar(\mathbb{R})$ is the twistor space of $\mathbb{S}^4_\hbar$.

The variety $\mathbb{P}^3_\hbar(\mathbb{R})$ is a twistor space in yet another sense. For the commutative $\mathbb{R}^4$ the complex structures compatible with the symmetric bilinear form $G$ and orientation are parametrized by points of a $\mathbb{P}^1$. This remains true in the noncommutative case. A complex structure (resp. orientation) on $\mathbb{R}^4_\hbar$ is defined as a complex structure (resp. orientation) on the real vector space $U$ spanned by $x_1, \dots, x_4$. We will choose an orientation on $U$ and require that the complex structure be compatible with it. All such complex structures are parametrized by points of a $\mathbb{P}^1$.

Recall now that $\mathbb{P}^3_\hbar$ is a pencil of noncommutative projective planes passing through the commutative line. Let us pick any one of them. The realification of $\mathbb{P}^3_\hbar$ defined above induces a realification of the noncommutative projective plane. It is easy to see that the complement of the commutative line $w_3 = \bar w_3 = 0$ in the realified projective plane is isomorphic to $\mathbb{R}^4_\hbar$. Furthermore, the complement carries a natural complex structure defined by
\[
w_3^{-1} w_i \mapsto \sqrt{-1}\, w_3^{-1} w_i, \qquad \bar w_3^{-1} \bar w_i \mapsto -\sqrt{-1}\, \bar w_3^{-1} \bar w_i, \qquad i = 1, 2.
\]
The Penrose map induces an identification between the complement and $\mathbb{R}^4_\hbar \subset \mathbb{S}^4_\hbar$, and therefore induces a complex structure on the latter. Varying the noncommutative projective plane, one obtains all possible complex structures on $\mathbb{R}^4_\hbar$ compatible with a particular orientation.
This is completely analogous to the commutative case.
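The commutative limit of the Penrose homomorphism can be probed numerically. The sketch below evaluates the images of $X_1, \dots, X_4, D, T$ at $\hbar = 0$ on random complex numbers $z_1, \dots, z_4$ and checks two things: that the images are real, and that they satisfy $X_1^2 + X_2^2 + X_3^2 + X_4^2 = DT$. The second relation is an assumption made here about the form of the quadric for $G$ in the standard form (20); the defining relation (7) itself is given in Sect. 3 and is not restated in this part of the text. The function names are hypothetical.

```python
import random

def penrose_commutative(z1, z2, z3, z4):
    """Images of X1..X4, D, T at hbar = 0, with each z_i an ordinary complex number."""
    c1, c2, c3, c4 = (z.conjugate() for z in (z1, z2, z3, z4))
    x1 = -0.5j * (z1 * c4 - c1 * z4 - c2 * z3 + z2 * c3)
    x2 = 0.5 * (z1 * c4 + c1 * z4 - c2 * z3 - z2 * c3)
    x3 = -0.5j * (c1 * z3 - z1 * c3 + z2 * c4 - c2 * z4)
    x4 = 0.5 * (z1 * c3 + c1 * z3 + c2 * z4 + z2 * c4)
    d = -0.5 * (z1 * c1 + c1 * z1 + z2 * c2 + c2 * z2)
    t = -(z3 * c3 + z4 * c4)
    return (x1, x2, x3, x4), d, t

def check(z, tol=1e-9):
    """True if all images are real and lie on the assumed quadric sum X_i^2 = D*T."""
    xs, d, t = penrose_commutative(*z)
    is_real = all(abs(v.imag) < tol for v in (*xs, d, t))
    on_quadric = abs(sum(v * v for v in xs) - d * t) < tol
    return is_real and on_quadric

random.seed(0)
samples = [[complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(4)]
           for _ in range(100)]
print(all(check(z) for z in samples))
```

Reality of the images reflects the fact that the map is a homomorphism of $*$-algebras with $X_i^* = X_i$, $D^* = D$, $T^* = T$.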

7.3. Connection between sheaves on commutative and noncommutative planes. In this subsection we are going to connect the moduli space Mh¯ (r, 0, k) of bundles on P2h¯ with a trivialization on the line l with the moduli space M(r, 0, k) of torsion free sheaves on the commutative P2 with a trivialization on a fixed line. The bridge between bundles on P2h¯ and torsion free sheaves on P2 is provided by the twistor variety P3h¯ . This gives a geometrical interpretation of Nakajima’s results (the description of the moduli space M(r, 0, k) by the deformed ADHM data [28, 27]). We will construct a hyperkähler manifold M parametrizing certain complexes on P3h¯ which is isomorphic to M(r, 0, k) (which is also a hyperkähler manifold [28]). The isomorphism is given by the restriction of complexes


to one of the commutative P2’s. On the other hand, the restriction of complexes to a noncommutative plane P2h¯ yields an isomorphism between M with a particular choice of complex structure and the moduli space Mh¯ (r, 0, k). Thus Mh¯ (r, 0, k) can be obtained from M(r, 0, k) by a rotation of complex structure. Consider complexes C · on P3h¯ of the form M

N

0 −→ H ⊗ O(−1) −→ K ⊗ O −→ L ⊗ O(1) −→ 0

(22)

with $\dim H = \dim L = k$, $\dim K = 2k + r$, which satisfies the condition that its restriction to the line $l$ has only one cohomology which is a trivial bundle (with a fixed trivialization). This condition implies that $M$ is a monomorphism. Note that $N$ is not an epimorphism in general, so (22) is not a monad. But the restriction of the complex (22) to any noncommutative plane is a monad by Corollary 6.5. Thus $N$ can fail to be surjective only on the commutative planes $z_3 = 0$ and $z_4 = 0$.

Now we introduce a real structure on $\mathbb{P}^3_\hbar$ (this is different from the real structure on the realification of $\mathbb{P}^3_\hbar$ defined above). Assume that $\hbar$ is a real number. Consider an anti-linear anti-homomorphism $\bar J$ of $PS_\hbar$ defined by

\[ \bar J(z_1) = z_2, \quad \bar J(z_2) = -z_1, \quad \bar J(z_3) = z_4, \quad \bar J(z_4) = -z_3, \quad \bar J(\lambda) = \bar\lambda, \quad \lambda \in \mathbb{C}. \]

Thus $\bar J$ is a homomorphism of $\mathbb{R}$-algebras from $PS_\hbar$ to the opposite algebra $PS_\hbar^{op}$. (The notation $\bar J$ is used by analogy with the commutative case, where this anti-homomorphism is a composition of a complex structure $J$ with complex conjugation [15].) The anti-homomorphism $\bar J$ induces a functor $\bar J^*$ from $\mathrm{qgr}(PS_\hbar)_R$ to $\mathrm{qgr}(PS_\hbar^{op})_R$. The latter category is naturally identified with the category $\mathrm{qgr}(PS_\hbar)_L$. Using this identification we can consider the composition of $\bar J^*$ with the dualization functor $\mathrm{Hom}(-, \mathcal{O})$ as a functor from $\mathrm{qgr}(PS_\hbar)_R$ to itself. For any bundle $E$ we denote by $\bar J^*(E)^\vee$ its image under this functor. The functor can be extended to complexes of bundles. It takes the complex $C^\cdot$ (22) to the complex $\bar J^*(C^\cdot)^\vee$

\[ 0 \longrightarrow \bar L^* \otimes \mathcal{O}(-1) \xrightarrow{\;\bar J^*(N)^\vee\;} \bar K^* \otimes \mathcal{O} \xrightarrow{\;\bar J^*(M)^\vee\;} \bar H^* \otimes \mathcal{O}(1) \longrightarrow 0. \]

Let us consider complexes $C^\cdot$ on $\mathbb{P}^3_\hbar$ with an isomorphism

\[ \bar J^*(C^\cdot)^\vee \cong C^\cdot \tag{23} \]

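and trivialization on the line $l$. Then the space $K$ acquires a hermitian metric and $L$ becomes isomorphic to $\bar H^*$. The reasoning of Sect. 6 shows that we can represent the maps $M$ and $N$ as

\[ M_1 z_1 + M_2 z_2 + M_3 z_3 + M_4 z_4, \qquad N_1 z_1 + N_2 z_2 + N_3 z_3 + N_4 z_4, \]

where $M_i$ and $N_i$ are constant maps. By a suitable choice of bases we can put these maps into the form

\[
\begin{aligned}
M_1 &= \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, &
M_2 &= \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, &
M_3 &= \begin{pmatrix} B_1 \\ B_2 \\ J \end{pmatrix}, &
M_4 &= \begin{pmatrix} B_1' \\ B_2' \\ J' \end{pmatrix}, \\
N_1 &= \begin{pmatrix} 0 & 1 & 0 \end{pmatrix}, &
N_2 &= \begin{pmatrix} -1 & 0 & 0 \end{pmatrix}, &
N_3 &= \begin{pmatrix} -B_2 & B_1 & I \end{pmatrix}, &
N_4 &= \begin{pmatrix} -B_2' & B_1' & I' \end{pmatrix}.
\end{aligned}
\tag{24}
\]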

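Using the reality conditions $\bar J^*(N)^\vee = M$ and $\bar J^*(M)^\vee = -N$ we find that the blocks $B_1', B_2', I', J'$ of $M_4$ and $N_4$ are determined by those of $M_3$ and $N_3$:

\[ B_1' = -B_2^\dagger, \qquad B_2' = B_1^\dagger, \qquad J' = I^\dagger, \qquad I' = -J^\dagger. \tag{25} \]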

Finally the condition $NM = 0$ gives

a) $\mu_c = [B_1, B_2] + IJ = 0$,
b) $\mu_r = [B_1, B_1^\dagger] + [B_2, B_2^\dagger] + II^\dagger - J^\dagger J = -2\hbar \cdot 1_{k\times k}$.

These matrix equations are invariant under the following action of $U(k)$:

\[ B_i \mapsto g B_i g^{-1}, \qquad I \mapsto gI, \qquad J \mapsto J g^{-1}, \qquad \text{where } g \in U(k). \tag{26} \]

Denote by $\mathbf{M}$ the vector space of complex matrices $(B_1, B_2, I, J)$. It has a structure of a quaternionic vector space defined by $(B_1, B_2, I, J) \mapsto (-B_2^\dagger, B_1^\dagger, -J^\dagger, I^\dagger)$, and, moreover, it is a flat hyperkähler manifold (see [28]). The map $\mu = (\mu_r, \mu_c)$ is a hyperkähler moment map for the action of $U(k)$ defined in (26) (see [19]). Since the action of $U(k)$ on $\mu_c^{-1}(0) \cap \mu_r^{-1}(-2\hbar \cdot 1)$ is free, the quotient $\mathcal{M} = \mu_c^{-1}(0) \cap \mu_r^{-1}(-2\hbar \cdot 1)/U(k)$ is a smooth hyperkähler manifold. This manifold parametrizes complexes (22) with a real structure (23) and a trivialization on the line $l$.

On the other hand, it was proved in [28, 27] that the moduli space $\mathcal{M}(r, 0, k)$ of torsion free sheaves on the commutative $\mathbb{P}^2$ with a trivialization on a fixed line can be identified with $\mathcal{M}$. This identification can be described geometrically as follows. Let us assume that $\hbar$ is positive. It can be checked that in this case the map $N$ can fail to be surjective only on the plane $z_4 = 0$. We can restrict the complex (22) to the commutative plane $z_3 = 0$. The restriction is a monad and its cohomology sheaf is a torsion free sheaf. It is easy to see that this yields a complex isomorphism from $\mathcal{M}$ to $\mathcal{M}(r, 0, k)$.

The restriction of the complex (22) to a noncommutative plane is a monad as well. This yields a map from $\mathcal{M}$ to the moduli space $\mathcal{M}_\hbar(r, 0, k)$ of bundles on the noncommutative plane. Let us show that this map is an isomorphism. To this end we note that on the level of the linear algebra data this map sends a quadruple $(B_1, B_2, I, J)$ to the quadruple $(b_1, b_2, i, j)$ with

\[ b_1 = B_1 - B_2^\dagger, \qquad b_2 = B_2 + B_1^\dagger, \qquad i = I - J^\dagger, \qquad j = J + I^\dagger. \]
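The relation between the quadruple $(b_1, b_2, i, j)$ and the maps $\mu_c$, $\mu_r$ can be made explicit by expanding the definitions. The two identities below are a direct computation and are not quoted from the text; the second one assumes that the moment map for the $U(k)$-action on quadruples $(b_1, b_2, i, j)$ is the standard real one, $[b_1, b_1^\dagger] + [b_2, b_2^\dagger] + ii^\dagger - j^\dagger j$.

```latex
\begin{align*}
[b_1, b_2] + i\,j
  &= \underbrace{[B_1,B_2] + IJ}_{\mu_c}
   + \underbrace{[B_1,B_1^\dagger] + [B_2,B_2^\dagger] + II^\dagger - J^\dagger J}_{\mu_r}
   - \underbrace{\bigl([B_2^\dagger,B_1^\dagger] + J^\dagger I^\dagger\bigr)}_{\mu_c^\dagger}, \\
[b_1, b_1^\dagger] + [b_2, b_2^\dagger] + i\,i^\dagger - j^\dagger j
  &= -2\,\bigl(\mu_c + \mu_c^\dagger\bigr).
\end{align*}
```

Since $\mu_r$ is Hermitian and $\mu_c - \mu_c^\dagger$ is anti-Hermitian, the first identity shows that $[b_1, b_2] + ij + 2\hbar \cdot 1 = 0$ holds exactly when $\mu_r = -2\hbar \cdot 1$ and $\mu_c = \mu_c^\dagger$. The second shows that the real moment map of $(b_1, b_2, i, j)$ vanishes exactly when $\mu_c = -\mu_c^\dagger$. Taken together, the two conditions amount to $\mu_c = 0$, $\mu_r = -2\hbar \cdot 1$.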

Further, note that the equations µc = 0, µr = −2h¯ · 1 are equivalent to the equation [b1 , b2 ] + i · j + 2h¯ · 1 = 0 and the vanishing of the moment map for the action of the group U (k) on the space of quadruples (b1 , b2 , i, j). Now it follows from the theorem of Kempf and Ness ([28, 20]) that the map M → Mh¯ (r, 0, k) is a diffeomorphism. It becomes a complex isomorphism if we replace the natural complex structure of the space M with another one within the P1 of complex structures on M. Thus we have


Theorem 7.1. The moduli space $\mathcal{M}_\hbar(r, 0, k)$ is a smooth hyperkähler manifold of real dimension $4rk$, and as a hyperkähler manifold it is isomorphic to the moduli space $\mathcal{M}(r, 0, k)$ of torsion free sheaves on the commutative $\mathbb{P}^2$ with a trivialization on a fixed line. As a complex manifold $\mathcal{M}_\hbar(r, 0, k)$ is obtained from $\mathcal{M}(r, 0, k)$ by a rotation of the complex structure.

The above discussion shows that there are natural bijections between

A. Bundles on $\mathbb{P}^2_\hbar$ with a trivialization on the commutative line $l$ and $c_2 = k$.
B. Solutions of the equations $\mu_c = 0$, $\mu_r = -2\hbar \cdot 1$ modulo the action of $U(k)$.
C. Complexes of sheaves on $\mathbb{P}^3_\hbar$ of the form (22) with a trivialization on the commutative line $l$ satisfying the reality condition (23).

One can show that for $r > 1$ a generic complex (22) is a monad and its cohomology is a bundle $E$ on $\mathbb{P}^3_\hbar$ such that

\[ H^1(\mathbb{P}^3_\hbar, E(-2)) = 0, \qquad \bar J^*(E)^\vee \cong E. \tag{27} \]

Moreover, it can be shown that any bundle E satisfying the conditions (27) can be represented as a cohomology of a monad of the form (22).
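The dimension $4rk$ in Theorem 7.1 agrees with a naive parameter count for description B. The count below is only a heuristic sketch: it assumes that the real equations contained in $\mu_c = 0$, $\mu_r = -2\hbar \cdot 1$ are independent and uses the freeness of the $U(k)$-action stated above.

```latex
\begin{align*}
\dim_{\mathbb{R}} \{(B_1, B_2, I, J)\} &= 2\,(2k^2 + 2kr) = 4k^2 + 4kr, \\
\#\{\text{real equations in } \mu_c = 0,\ \mu_r = -2\hbar \cdot 1\} &= 2k^2 + k^2 = 3k^2, \\
\dim_{\mathbb{R}} U(k) &= k^2, \\
\dim_{\mathbb{R}} \mathcal{M} &= 4k^2 + 4kr - 3k^2 - k^2 = 4rk.
\end{align*}
```

As a simple instance, take $k = r = 1$ and $\hbar > 0$. All commutators vanish, so the equations reduce to $IJ = 0$ and $|I|^2 - |J|^2 = -2\hbar$, which force $I = 0$ and $|J|^2 = 2\hbar$. The group $U(1)$ removes the phase of $J$, and what remains is the pair $(B_1, B_2) \in \mathbb{C}^2$, of real dimension $4 = 4rk$.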

8. Noncommutative Twistor Transform

8.1. Review of the twistor transform. In the commutative case the ADHM construction of instantons has the following geometric interpretation. Consider the double fibration

\[ G(2; 4) \xleftarrow{\;p\;} Fl(1, 2; 4) \xrightarrow{\;q\;} \mathbb{P}^3, \tag{28} \]

where G(2; 4) is the Grassmannian and Fl(1, 2; 4) is the partial flag variety. The Grassmannian G(2; 4) has a real structure with S4 as the set of real points. For any bundle E on P3 its twistor transform is defined as a sheaf p∗ q ∗ E on G(2; 4). Given ADHM data we have a monad on P3 whose cohomology is a bundle. It can be shown that the restriction of its twistor transform to the sphere S4 coincides with the instanton bundle corresponding to these ADHM data. The instanton connection can also be reconstructed from the bundle on P3 (see [4, 24] for details). In this section we show that one can consider the noncommutative quadric introduced in Sect. 3 as a noncommutative Grassmannian G(2; 4). We also construct a noncommutative flag variety Fl(1, 2; 4) and projections p, q giving a noncommutative analogue of the twistor diagram (28). The twistor transform can be defined in the same way as above. It produces a bundle on the noncommutative sphere from the deformed ADHM data. We show that this bundle is precisely the kernel of the map D defined in Sect. 2. It should also be possible to construct the instanton connection on the noncommutative R4 from the complex of sheaves on P3h¯ . To do this, one needs to develop the differential geometry of noncommutative affine and projective varieties. We go some way in this direction by defining differential forms and spinors. Since the goal of this section is mainly illustrative, we limit ourselves to stating the results. An interested reader should be able to fill in the proofs.


8.2. Tensor categories. A good way to construct noncommutative varieties with properties similar to those of commutative varieties is to start with a tensor category (see [25, 23]). Let T be an abelian tensor category. Consider a tensor functor O : T → Vect to the abelian tensor category of vector spaces compatible with the associativity constraint but not compatible with the commutativity constraint. If A is a commutative algebra in the tensor category T, then O(A) is a noncommutative algebra in the tensor category Vect. If M ∈ T is a right A-module, then O(M) is a right O(A)-module. Any right A-module (in the category T) has a natural structure of a left A-module (and hence an A-bimodule). Thus any right O(A)-module of the form O(M) has a natural structure of an O(A)-bimodule.

Consider the category $\mathrm{Comm}_T$ of all finitely generated (graded) commutative algebras in the tensor category T. Then under O the category $\mathrm{Comm}_T$ is mapped to a subcategory of the category of finitely generated (graded) algebras. This subcategory enjoys many properties of the category of commutative (graded) algebras. For example, for all A, B ∈ $\mathrm{Comm}_T$ there is a natural algebra structure on O(A) ⊗ O(B) coming from the algebra structure on A ⊗ B.

The corresponding subcategory in the category of noncommutative affine (resp. projective) varieties shares a lot of properties with the category of commutative varieties. For example, if X and Y are varieties in this category, then using the tensor product of the corresponding algebras one can define the “Cartesian” product X × Y. More generally, given a pair of morphisms X → Z and Y → Z one can define the fiber product $X \times_Z Y$. Further, starting from the module of differential forms of A one can construct the sheaf of differential forms on the corresponding noncommutative variety. The category qgr(O(A)) has a nice subcategory which consists of modules of the form O(M), where M ∈ T is an A-module.
To any object O(M) of this subcategory one can associate its symmetric and exterior powers. The symmetric powers of O(M) form a noncommutative graded algebra. This enables one to define the projectivization of the sheaf corresponding to the module O(M).

8.3. Yang–Baxter operators. One way to construct an abelian tensor category T with a functor O : T → Vect is to consider a Yang–Baxter operator (see [25, 23]). A Yang–Baxter operator on a vector space $V$ is an operator $R : V \otimes V \to V \otimes V$, such that

\[ R^2 = \mathrm{id}_{V \otimes V}, \qquad (R \otimes \mathrm{id}_V)(\mathrm{id}_V \otimes R)(R \otimes \mathrm{id}_V) = (\mathrm{id}_V \otimes R)(R \otimes \mathrm{id}_V)(\mathrm{id}_V \otimes R). \tag{29} \]

A Yang–Baxter operator induces an action of the permutation group $S_n$ on the tensor power $V^{\otimes n}$, where the transposition $(i, i+1) \in S_n$ acts as the operator $R_{i,i+1} = \mathrm{id}_V^{\otimes (i-1)} \otimes R \otimes \mathrm{id}_V^{\otimes (n-i-1)} : V^{\otimes n} \to V^{\otimes n}$. Equations (29) ensure that operators $R_{i,i+1}$ satisfy the relations between the transpositions $(i, i+1)$ in the group $S_n$. If $R$ is a Yang–Baxter operator on a vector space $V$, then the dual operator $R^\vee : V^* \otimes V^* \to V^* \otimes V^*$ is also a Yang–Baxter operator.

Given a Yang–Baxter operator $R : V \otimes V \to V \otimes V$, one can construct an abelian tensor category $T_R$ and a functor $O_R : T_R \to \mathrm{Vect}$ such that $V$ is a $O_R$-image of some object of $T_R$, and the commutativity morphism in the category $T_R$ is mapped by $O_R$ to $R$ [23]. As mentioned above, given any two objects $A$, $B$ of the category $\mathrm{Comm}_{T_R}$, one has a natural algebra structure on the vector space $O(A) \otimes O(B)$. This algebra will be denoted $O(A) \otimes_R O(B)$ and called the R-tensor product of $O(A)$ and $O(B)$.

It is well known that there is a one-to-one correspondence between irreducible representations of the group $S_n$ and partitions of $n$ (Young diagrams). Under this correspondence the trivial partition $(n)$ corresponds to the sign representation, while the maximal partition $(\underbrace{1, 1, \dots, 1}_{n \text{ times}})$ corresponds to the identity representation. Given a partition $(k_1, \dots, k_r)$ of $n$ ($k_1 \ge k_2 \ge \dots \ge k_r$) we denote by $\Sigma^{(k_1, \dots, k_r)}$ the corresponding irreducible representation and by $\Sigma_R^{(k_1, \dots, k_r)} V$ (resp. $\Sigma_R^{(k_1, \dots, k_r)} V^*$) the $(k_1, \dots, k_r)$-isotypical component of $V^{\otimes n}$ (resp. $(V^*)^{\otimes n}$), i.e. the sum of all subrepresentations of $V^{\otimes n}$ (resp. $(V^*)^{\otimes n}$) isomorphic to $\Sigma^{(k_1, \dots, k_r)}$. We also put $\Lambda_R^n V = \Sigma_R^{(n)} V$, $\Lambda_R^n V^* = \Sigma_R^{(n)} V^*$ for brevity.

Remark. The subspaces $\Sigma_R^\lambda V \subset V^{\otimes n}$ are the $O_R$-images of some objects of the category $T_R$.

Let $\lambda$, $\mu$ be partitions of $n$ and $m$ respectively. It is clear that the action of the permutation $\sigma_{n,m} \in S_{n+m}$,

\[ \sigma_{n,m}(i) = \begin{cases} i + m, & \text{if } 1 \le i \le n, \\ i - n, & \text{if } n + 1 \le i \le n + m, \end{cases} \]

gives an isomorphism

\[ R_{n,m} : \Sigma_R^\lambda V \otimes \Sigma_R^\mu V \to \Sigma_R^\mu V \otimes \Sigma_R^\lambda V. \]

Remark. This isomorphism is the image of an isomorphism in the category $T_R$.

The trivial example of a Yang–Baxter operator is the usual transposition $R_0(v_1 \otimes v_2) = v_2 \otimes v_1$. We will say that $R$ is a deformation-trivial Yang–Baxter operator if $R$ is an algebraic deformation of $R_0$ in the class of Yang–Baxter operators. For a deformation-trivial Yang–Baxter operator $R$ we have $\dim \Sigma_R^\lambda V = \dim \Sigma_{R_0}^\lambda V$ for any partition $\lambda$.

8.4. The noncommutative projective space. Let $R$ be a deformation-trivial Yang–Baxter operator on the vector space $V^*$. Then the graded algebra

\[ S_R^\cdot V^* = T(V^*) \big/ \big\langle \Lambda_R^2 V^* \big\rangle \]

is a noncommutative deformation of the coordinate algebra of the projective space $\mathbb{P}(V)$. We denote by $\mathbb{P}_R(V)$ the corresponding noncommutative variety. Thus $\mathbb{P}_R(V)$ is a noncommutative deformation of the projective space $\mathbb{P}(V)$.
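Conditions (29) are easy to test on basis tensors. The following is a minimal sketch, not part of the original construction: it checks $R^2 = \mathrm{id}$ and the braid relation for the transposition $R_0$ and for the operator (30) of Example 8.1, written out explicitly in the code. The function names and the numerical values chosen for $\hbar$, $a$, $b$ are illustrative assumptions; tensors are stored as sparse dictionaries with exact rational coefficients.

```python
from fractions import Fraction
from itertools import product

# Arbitrary test values for the parameters of (30) (assumptions, any values work).
HBAR, A, B = Fraction(5, 7), Fraction(1, 3), Fraction(2, 3)

def r_flip(i, j):
    """The transposition R_0 on the basis tensor z_i (x) z_j."""
    return {(j, i): Fraction(1)}

def r_deformed(i, j):
    """The operator (30) on the basis tensor z_i (x) z_j, indices 1..4."""
    out = {(j, i): Fraction(1)}
    if (i, j) == (1, 2):
        out[(3, 4)] = 2 * HBAR * A
        out[(4, 3)] = 2 * HBAR * B
    elif (i, j) == (2, 1):
        out[(3, 4)] = -2 * HBAR * B
        out[(4, 3)] = -2 * HBAR * A
    return out

def apply_at(r, pos, vec):
    """Apply r to tensor slots pos, pos+1 of a sparse vector {index tuple: coeff}."""
    result = {}
    for idx, c in vec.items():
        for (p, q), d in r(idx[pos], idx[pos + 1]).items():
            new = idx[:pos] + (p, q) + idx[pos + 2:]
            result[new] = result.get(new, 0) + c * d
    return {k: v for k, v in result.items() if v != 0}

def is_yang_baxter(r, dim=4):
    """Check both conditions of (29) on all basis tensors."""
    basis = range(1, dim + 1)
    for i, j in product(basis, repeat=2):
        v = {(i, j): Fraction(1)}
        if apply_at(r, 0, apply_at(r, 0, v)) != v:        # R^2 = id
            return False
    for idx in product(basis, repeat=3):
        v = {idx: Fraction(1)}
        lhs = apply_at(r, 0, apply_at(r, 1, apply_at(r, 0, v)))
        rhs = apply_at(r, 1, apply_at(r, 0, apply_at(r, 1, v)))
        if lhs != rhs:                                     # braid relation
            return False
    return True

print(is_yang_baxter(r_flip), is_yang_baxter(r_deformed))
```

The check does not use $a + b = 1$; that normalization only enters through the relations (9) of the coordinate algebra.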


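Example 8.1. The operator

\[
\begin{aligned}
R(z_i \otimes z_j) &= z_j \otimes z_i, \qquad \text{if } (i, j) \ne (1, 2), (2, 1), \\
R(z_1 \otimes z_2) &= z_2 \otimes z_1 + 2\hbar\,(a z_3 \otimes z_4 + b z_4 \otimes z_3), \\
R(z_2 \otimes z_1) &= z_1 \otimes z_2 - 2\hbar\,(b z_3 \otimes z_4 + a z_4 \otimes z_3),
\end{aligned}
\tag{30}
\]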

is a deformation-trivial Yang–Baxter operator on the 4-dimensional vector space $Z^*$ with the basis $\{z_1, z_2, z_3, z_4\}$. By definition the homogeneous coordinate algebra of $\mathbb{P}_R(Z)$ is generated by $z_1, z_2, z_3, z_4$ with relations (9) (we set $a + b = 1$ as before). Hence $\mathbb{P}_R(Z)$ is isomorphic to the noncommutative projective space $\mathbb{P}^3_\hbar$ defined in Sect. 3. The space $Z^*$ was denoted $U$ in that section.

The above example shows that part of the data encoded in the Yang–Baxter operator $R$ is lost in the structure of the corresponding noncommutative projective space. We will see below that this data appears in the structure of other noncommutative varieties associated with $R$.

8.5. Noncommutative Grassmannians. It is well known that the homogeneous coordinate algebra of the Grassmann variety $G(k; V)$ is a graded quadratic algebra with $\Lambda^k V^*$ as the space of generators and

\[ \mathrm{Ker}\big( \Lambda^k V^* \otimes \Lambda^k V^* \to (V^*)^{\otimes 2k} \to \Sigma^{(k,k)} V^* \big) \]

as the space of relations. This description justifies the following definition.

Definition 8.2. Let $R$ be a Yang–Baxter operator on the space $V^*$. The noncommutative Grassmann variety $G_R(k; V)$ is the noncommutative projective variety corresponding to the quadratic algebra

\[ G_R(k; V) = T(\Lambda_R^k V^*) \big/ \big\langle \mathrm{Ker}( \Lambda_R^k V^* \otimes \Lambda_R^k V^* \to \Sigma_R^{(k,k)} V^* ) \big\rangle. \]

The algebra $G_R(k; V)$ is the $O_R$-image of a commutative algebra in the category $T_R$. If $R$ is deformation-trivial, then $G_R(k; V)$ is a noncommutative deformation of $G(k; V)$. Note that $G_R(1; V) = \mathbb{P}_R(V)$ by definition.

Example 8.3. Consider the noncommutative Grassmannian $G_R(2; Z)$ corresponding to the Yang–Baxter operator (30). Let

\[ z_{ij} = \tfrac{1}{2}\big( (z_i \otimes z_j - z_j \otimes z_i) - R(z_i \otimes z_j - z_j \otimes z_i) \big) \in \Lambda_R^2 Z^*. \]

Then it is easy to check that $G_R(2; Z)$ is generated by the elements

\[ Y_1 = z_{13}, \quad Y_2 = -z_{24}, \quad Y_3 = z_{23}, \quad Y_4 = z_{14}, \quad D = -z_{12}, \quad T = z_{34}, \]

with relations

\[
\begin{aligned}
[Y_1, Y_2] &= 2\hbar a T^2, & [Y_3, Y_4] &= 2\hbar b T^2, \\
[D, Y_1] &= -2\hbar a Y_1 T, & [D, Y_2] &= 2\hbar a Y_2 T, \\
[D, Y_3] &= -2\hbar b Y_3 T, & [D, Y_4] &= 2\hbar b Y_4 T, \\
DT &= \tfrac{1}{2}\,(Y_1 Y_2 + Y_2 Y_1 + Y_3 Y_4 + Y_4 Y_3), &&
\end{aligned}
\tag{31}
\]


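\[ [Y_i, Y_j] = 0 \ \text{ for } i = 1, 2,\ j = 3, 4, \qquad [T, Y_j] = [T, D] = 0 \ \text{ for all } j = 1, 2, 3, 4. \]

Comparing with (7) one can see that the algebra $G_R(2; Z)$ is isomorphic to $Q_\hbar$ with $G$ and $\theta$ given by

\[
G = \frac{1}{2} \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix},
\qquad
\theta = 2\hbar \begin{pmatrix} 0 & a & 0 & 0 \\ -a & 0 & 0 & 0 \\ 0 & 0 & 0 & b \\ 0 & 0 & -b & 0 \end{pmatrix}.
\]

Note that the variables $X_i$, $i = 1, 2, 3, 4$, used in Sect. 7 to describe the quadric are related to $Y_i$, $i = 1, 2, 3, 4$, by the following formulas:

\[
\begin{aligned}
Y_1 &= X_2 + \sqrt{-1}\, X_1, & Y_2 &= -X_2 + \sqrt{-1}\, X_1, \\
Y_3 &= X_4 + \sqrt{-1}\, X_3, & Y_4 &= -X_4 + \sqrt{-1}\, X_3.
\end{aligned}
\tag{32}
\]

8.6. Products of Grassmannians and flag varieties. Let $R$ be a Yang–Baxter operator on the vector space $V^*$. Consider a sequence $k_1, \dots, k_r$ of integers. Let $\mathbb{Z}^r$ be the free abelian group with $r$ generators $e_1, \dots, e_r$. The R-tensor product

\[ G_R(k_1; V) \otimes_R \dots \otimes_R G_R(k_r; V) \]

is a $\mathbb{Z}^r$-graded algebra generated by the vector spaces $\Lambda_R^{k_i} V^*$ in degree $e_i$, with relations

\[ \mathrm{Ker}\big( \Lambda_R^{k_i} V^* \otimes \Lambda_R^{k_i} V^* \to \Sigma_R^{(k_i, k_i)} V^* \big) \]

in degree $2e_i$ for all $i$ and

\[ \mathrm{Ker}\Big( (\Lambda_R^{k_i} V^* \otimes \Lambda_R^{k_j} V^*) \oplus (\Lambda_R^{k_j} V^* \otimes \Lambda_R^{k_i} V^*) \xrightarrow{\;(\mathrm{id},\, -R_{k_j, k_i})\;} \Lambda_R^{k_i} V^* \otimes \Lambda_R^{k_j} V^* \Big) \]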

in degree $e_i + e_j$ for all $i > j$. For any increasing sequence $k_1, \dots, k_r$ we define also a $\mathbb{Z}^r$-graded algebra $FL_R(k_1, \dots, k_r; V)$. It has the same generators as the algebra $G_R(k_1; V) \otimes_R \dots \otimes_R G_R(k_r; V)$, subject to the same relations in degrees $2e_i$ and to relations

\[ \mathrm{Ker}\Big( (\Lambda_R^{k_i} V^* \otimes \Lambda_R^{k_j} V^*) \oplus (\Lambda_R^{k_j} V^* \otimes \Lambda_R^{k_i} V^*) \xrightarrow{\;(\mathrm{id},\, -R_{k_j, k_i})\;} \Lambda_R^{k_i} V^* \otimes \Lambda_R^{k_j} V^* \longrightarrow \Sigma_R^{(k_i, k_j)} V^* \Big) \]

in degree $e_i + e_j$ for all $i > j$. This definition is suggested by the Borel–Weil–Bott theorem (see [14]). In particular, for $R = R_0$ we get the algebra corresponding to the commutative flag variety.

We define the R-Cartesian product $G_R(k_1; V) \times_R \dots \times_R G_R(k_r; V)$ and the noncommutative flag variety $Fl_R(k_1, \dots, k_r; V)$ as the noncommutative varieties corresponding to the algebras $G_R(k_1; V) \otimes_R \dots \otimes_R G_R(k_r; V)$ and $FL_R(k_1, \dots, k_r; V)$ respectively. To make this compatible with our definition of a noncommutative variety, we consider instead of a $\mathbb{Z}^r$-graded algebra its diagonal subalgebra. The diagonal subalgebra is a graded algebra whose $n$th graded component is the $n(e_1 + \dots + e_r)$-graded component of the $\mathbb{Z}^r$-graded algebra. Thus according to Sect. 3 the category of coherent sheaves on


the R-Cartesian product of Grassmannians (or the flag variety) is the category qgr of the corresponding diagonal subalgebra.

The algebra $FL_R(k_1, \dots, k_r; V)$ is the $O_R$-image of a commutative algebra in the category $T_R$. Hence one can define the R-Cartesian product of several flag varieties. If $R$ is deformation-trivial, then $G_R(k_1; V) \times_R \dots \times_R G_R(k_r; V)$ and $Fl_R(k_1, \dots, k_r; V)$ are noncommutative deformations of the corresponding commutative varieties.

Note that we have a canonical embedding of the graded algebra $G_R(k_i; V)$ into the graded algebra $FL_R(k_1, \dots, k_i, \dots, k_r; V)$ inducing the canonical projections

\[ p_i : Fl_R(k_1, \dots, k_i, \dots, k_r; V) \to G_R(k_i; V). \]

On the other hand, by definition $FL_R(k_1, \dots, k_r; V)$ is a quotient algebra of the algebra $G_R(k_1; V) \otimes_R \dots \otimes_R G_R(k_r; V)$. Hence $Fl_R(k_1, \dots, k_r; V)$ can be regarded as a closed subvariety in $G_R(k_1; V) \times_R \dots \times_R G_R(k_r; V)$.

Example 8.4. The algebra $G_R(1; Z) \otimes_R G_R(2; Z)$ corresponding to the Yang–Baxter operator (30) is generated by the elements $z_1, z_2, z_3, z_4, Y_1, Y_2, Y_3, Y_4, D, T$ with relations (9), (31), and

\[
\begin{aligned}
[z_2, Y_1] &= 2\hbar a z_3 T, & [z_1, Y_2] &= -2\hbar a z_4 T, \\
[z_1, Y_3] &= -2\hbar b z_3 T, & [z_2, Y_4] &= -2\hbar b z_4 T, \\
[z_1, D] &= -2\hbar b z_3 Y_4 - 2\hbar a z_4 Y_1, & [z_2, D] &= 2\hbar a z_3 Y_2 - 2\hbar b z_4 Y_3, \\
[z_1, Y_1] &= [z_2, Y_2] = 0, & [z_3, Y_i] &= [z_3, D] = 0, \\
[z_4, Y_i] &= [z_4, D] = 0, & [z_i, T] &= 0
\end{aligned}
\]

for all $i = 1, 2, 3, 4$. The algebra $FL_R(1, 2; Z)$ is given by the same generators subject to the same relations, as well as the additional relations

\[
\begin{pmatrix}
0 & T & Y_2 & Y_3 \\
T & 0 & -Y_4 & Y_1 \\
Y_2 & Y_4 & 0 & D - \hbar(a + b)T \\
Y_3 & -Y_1 & -D - \hbar(a + b)T & 0
\end{pmatrix}
\begin{pmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}.
\tag{33}
\]

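As explained above, we have projections

\[ Q_\hbar = G_R(2; Z) \xleftarrow{\;p\;} Fl_R(1, 2; Z) \xrightarrow{\;q\;} \mathbb{P}_R(Z) = \mathbb{P}^3_\hbar \]

and a closed embedding $Fl_R(1, 2; Z) \subset G_R(2; Z) \times_R \mathbb{P}_R(Z) = Q_\hbar \times_R \mathbb{P}^3_\hbar$.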


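8.7. Tautological bundles. Let $\mathcal{V}$ (resp. $\mathcal{V}^*$, $\Sigma_R^\lambda \mathcal{V}$, $\Sigma_R^\lambda \mathcal{V}^*$) denote the coherent sheaf on $G_R(k; V)$ corresponding to the free right $G_R(k; V)$-module $V \otimes G_R(k; V)$ (resp. $V^* \otimes G_R(k; V)$, $\Sigma_R^\lambda V \otimes G_R(k; V)$, $\Sigma_R^\lambda V^* \otimes G_R(k; V)$). Since the space of global sections of the sheaf $\mathcal{O}(1)$ on the Grassmannian $G_R(k; V)$ is $\Lambda_R^k V^*$, the maps $\Lambda_R^{k-1} V^* \to V \otimes \Lambda_R^k V^*$ and $\Lambda_R^{k+1} V^* \to V^* \otimes \Lambda_R^k V^*$ induce morphisms of sheaves

\[ \Lambda_R^{k-1} \mathcal{V}^*(-1) \xrightarrow{\;\phi\;} \mathcal{V} \qquad \text{and} \qquad \Lambda_R^{k+1} \mathcal{V}^*(-1) \xrightarrow{\;\psi\;} \mathcal{V}^*. \]

We put $S = \mathrm{Im}\,\phi$, $\mathcal{V}/S = \mathrm{Coker}\,\phi$, $\tilde S = \mathrm{Im}\,\psi$, $\mathcal{V}^*/\tilde S = \mathrm{Coker}\,\psi$.

Remark. For $k = 1$ we have $S = \mathcal{O}(-1)$, $\mathcal{V}^*/\tilde S = \mathcal{O}(1)$.

One can show that these sheaves are locally free. We refer to them as tautological bundles. The free $G_R(k; V)$-modules corresponding to the sheaves $\Sigma_R^\lambda \mathcal{V}$, $\Sigma_R^\lambda \mathcal{V}^*$ are the $O_R$-images of free modules over the corresponding algebra in the category $T_R$. Furthermore, the morphisms $\phi$ and $\psi$ are $O_R$-images. This implies that the $G_R(k; V)$-modules corresponding to the tautological bundles are $O_R$-images as well. Therefore they all have a natural structure of $G_R(k; V)$-bimodules. This allows one to define R-symmetric powers $S_R^k(-)$ (resp. R-exterior powers $\Lambda_R^k(-)$) of the tautological bundles as the corresponding $O_R$-images. One can check that we have canonical isomorphisms of bimodules

\[ \mathcal{V}^*/\tilde S \cong S^\vee, \qquad \tilde S \cong (\mathcal{V}/S)^\vee. \]

Example 8.5. Let $R$ be the Yang–Baxter operator (30) and $k = 2$. Let $\check z_1, \check z_2, \check z_3, \check z_4$ be the dual basis of $Z$. Then the twisted maps $\phi(1) : Z^* \otimes \mathcal{O}_{G_R} \to Z \otimes \mathcal{O}_{G_R}(1)$, $\psi(1) : Z \otimes \mathcal{O}_{G_R} \cong \Lambda_R^3 Z^* \otimes \mathcal{O}_{G_R} \to Z^* \otimes \mathcal{O}_{G_R}(1)$ are given by

\[
\phi(1) : \begin{pmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \end{pmatrix} \mapsto
\begin{pmatrix}
0 & D + \hbar(a - b)T & -Y_1 & -Y_4 \\
D - \hbar(a - b)T & 0 & -Y_3 & Y_2 \\
-Y_1 & Y_3 & 0 & -T \\
-Y_4 & -Y_2 & T & 0
\end{pmatrix}
\begin{pmatrix} \check z_1 \\ \check z_2 \\ \check z_3 \\ \check z_4 \end{pmatrix},
\]

\[
\psi(1) : \begin{pmatrix} \check z_1 \\ \check z_2 \\ \check z_3 \\ \check z_4 \end{pmatrix} \mapsto
\begin{pmatrix}
0 & T & Y_2 & Y_3 \\
T & 0 & -Y_4 & Y_1 \\
Y_2 & Y_4 & 0 & D - \hbar(a + b)T \\
Y_3 & -Y_1 & -D - \hbar(a + b)T & 0
\end{pmatrix}
\begin{pmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \end{pmatrix}.
\]

Note that $\psi(1)\phi = 0$ and $\phi(1)\psi = 0$. Hence we have isomorphisms

\[ \tilde S(1) \cong \mathcal{V}/S, \qquad S(1) \cong S^\vee. \]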
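The vanishing of the two composites in Example 8.5 can be sanity-checked in the commutative limit. The sketch below is only a partial check under stated assumptions: it sets $\hbar = 0$, treats $Y_1, \dots, Y_4, T, D$ as commuting numbers on the quadric $DT = Y_1 Y_2 + Y_3 Y_4$, and multiplies the two $4 \times 4$ matrices in both orders. It says nothing about the $\hbar$-dependent terms, and the function names are illustrative.

```python
def tautological_matrices(y1, y2, y3, y4, t, d):
    """Matrices of phi(1) and psi(1) from Example 8.5 with hbar set to 0."""
    phi = [[0,   d,   -y1, -y4],
           [d,   0,   -y3,  y2],
           [-y1, y3,   0,  -t],
           [-y4, -y2,  t,   0]]
    psi = [[0,   t,   y2,  y3],
           [t,   0,  -y4,  y1],
           [y2,  y4,  0,   d],
           [y3, -y1, -d,   0]]
    return phi, psi

def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def composites_vanish(y1, y2, y3, y4, t, d):
    """True if both products of the two matrices are the zero matrix."""
    phi, psi = tautological_matrices(y1, y2, y3, y4, t, d)
    zero = [[0] * 4 for _ in range(4)]
    return matmul(phi, psi) == zero and matmul(psi, phi) == zero

# A point on the commutative quadric: D*T = 7*2 = 14 = 1*2 + 3*4.
print(composites_vanish(1, 2, 3, 4, 2, 7))
```

Moving $D$ off the quadric (for example $D = 5$ with the same other values) makes the products nonzero, which shows that the quadric relation is exactly what the vanishing uses in this limit.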


Note also that on the open subset $T \ne 0$ the elements $(z_3, z_4)$ give a trivialization of the tautological bundle $S^\vee$. More precisely, the restriction of the sections $z_1$, $z_2$ of $S^\vee$ can be expressed as

\[ z_1 = y_4 z_3 - y_1 z_4, \qquad z_2 = -y_2 z_3 - y_3 z_4, \tag{34} \]

where $y_i = T^{-1} Y_i$. Similarly, the elements $(\check z_1, \check z_2)$ give a trivialization of $\mathcal{V}/S$ on $T \ne 0$. Thus the restrictions of all tautological bundles to the open subset $T \ne 0$ correspond to the free rank two bimodule over the Weyl algebra $A(\mathbb{C}^4_\hbar)$.

8.8. Pull-back and push-forward. Recall that we have canonical projections

\[ p_i : Fl_R(k_1, k_2; V) \to G_R(k_i; V) \qquad (i = 1, 2). \]

Given a right graded $G_R(k_i; V)$-module $E$ we consider the right bigraded $FL_R(k_1, k_2; V)$-module $E \otimes_{G_R(k_i; V)} FL_R(k_1, k_2; V)$. The diagonal subspace of this module is a graded module over the diagonal subalgebra of $FL_R(k_1, k_2; V)$. This gives the pull-back functor $p_i^* : \mathrm{coh}(G_R(k_i; V)) \to \mathrm{coh}(Fl_R(k_1, k_2; V))$. The pull-back functor is exact and takes a $O_R$-image to a $O_R$-image. In particular, the pull-backs of the tautological bundles have a canonical bimodule structure.

The pull-back functor $p_i^*$ admits a right adjoint functor $p_{i*} : \mathrm{coh}(Fl_R(k_1, k_2; V)) \to \mathrm{coh}(G_R(k_i; V))$, called the push-forward functor. It also takes a $O_R$-image to a $O_R$-image.

The line bundles $p_1^* \mathcal{O}(i)$ and $p_2^* \mathcal{O}(j)$ on the flag variety $Fl_R(k_1, k_2; V)$ are $O_R$-images, hence they have a canonical bimodule structure. Therefore, we have a well-defined tensor product $\mathcal{O}(i, j) = p_1^* \mathcal{O}(i) \otimes p_2^* \mathcal{O}(j)$. The line bundle $\mathcal{O}(i, j)$ is also a $O_R$-image and has a canonical bimodule structure. The $n$th graded component of the corresponding module over the diagonal subalgebra of $FL_R(k_1, k_2; V)$ is the $((n + i)e_1 + (n + j)e_2)$-graded component of the algebra $FL_R(k_1, k_2; V)$. One can check that the push-forward of the line bundle $\mathcal{O}(j_1, j_2)$ with respect to $p_2$ is given by the formula

\[ p_{2*} \mathcal{O}(j_1, j_2) = S_R^{j_1}(S^\vee)(j_2). \]

8.9. $Fl_R(1, 2; Z)$ as the projectivization of the tautological bundle. The R-symmetric powers of the tautological bundle form a sheaf of graded algebras on the Grassmannian $G_R(k; V)$,

\[ S_R^\cdot(S^\vee) = T(S^\vee) \big/ \big\langle \Lambda_R^2 S^\vee \big\rangle. \]

The corresponding $G_R(k; V)$-module

\[ \bigoplus_{i,j=0}^{\infty} \Gamma\big( G_R(k; V),\, S_R^j(S^\vee)(i) \big) \]


is a bigraded module with a structure of a bigraded algebra. One can check that this bigraded algebra is isomorphic to the bigraded algebra $FL_R(1, k; V)$. Thus we can regard the flag variety $Fl_R(1, k; V)$ as the projectivization of the tautological bundle $S$ on the Grassmannian $G_R(k; V)$. In particular, $Fl_R(1, 2; Z)$ is the projectivization of the tautological bundle $S$ on the Grassmannian $G_R(2; Z)$.

8.10. Noncommutative twistor transform. If $E$ is a coherent sheaf on the noncommutative projective space $\mathbb{P}_R(Z) = \mathbb{P}^3_\hbar$, we define its twistor transform as the sheaf $p_* q^* E$ on $G_R(2; Z) = Q_\hbar$, where $q$ is the projection $Fl_R(1, 2; Z) \to \mathbb{P}_R(Z) = \mathbb{P}^3_\hbar$ and $p$ is the projection $Fl_R(1, 2; Z) \to G_R(2; Z) = Q_\hbar$. Similarly, we can define the twistor transform of a complex of sheaves on $\mathbb{P}^3_\hbar$. Actually, it is more natural to consider the derived twistor transform, i.e. the derived functor of the ordinary twistor transform.

Consider a complex $C^\cdot$ of the form

\[ 0 \longrightarrow H \otimes \mathcal{O}(-1) \xrightarrow{\;M\;} K \otimes \mathcal{O} \xrightarrow{\;N\;} L \otimes \mathcal{O}(1) \longrightarrow 0 \]

on the projective space $\mathbb{P}^3_\hbar$. One can check that under the twistor transform one has

\[ \mathcal{O}_{\mathbb{P}^3_\hbar}(-1) \mapsto 0, \qquad \mathcal{O}_{\mathbb{P}^3_\hbar} \mapsto \mathcal{O}_{G_R}, \qquad \mathcal{O}_{\mathbb{P}^3_\hbar}(1) \mapsto S^\vee. \]

In fact, for these sheaves the derived twistor transform coincides with the ordinary one. Thus the (derived) twistor transform takes the complex $C^\cdot$ to the complex

\[ 0 \longrightarrow K \otimes \mathcal{O} \xrightarrow{\;N\;} L \otimes S^\vee \longrightarrow 0. \]

Let $E$ denote the middle cohomology of the complex $C^\cdot$. It follows that the twistor transform takes $E$ to the kernel of the map $N : K \otimes \mathcal{O} \to L \otimes S^\vee$. One can describe $N$ without reference to the twistor transform. The morphism $N$ is the same thing as a vector space morphism

\[ N_1 z_1 + N_2 z_2 + N_3 z_3 + N_4 z_4 : K \longrightarrow Z^* \otimes L. \tag{35} \]

Here the maps $N_i$ are given in terms of the deformed ADHM data according to (24) and (25). The map $N$ is a composition of two maps

\[ K \otimes \mathcal{O}_{G_R} \longrightarrow L \otimes Z^* \otimes \mathcal{O}_{G_R} \longrightarrow L \otimes S^\vee, \]

where the first map is given by (35), while the second map comes from the canonical projection $Z^* \otimes \mathcal{O}_{G_R} \to S^\vee$. (We recall that $S^\vee$ is the cokernel of the map $\psi : Z \otimes \mathcal{O}_{G_R}(-1) \to Z^* \otimes \mathcal{O}_{G_R}$.) On the open subset $\{T \ne 0\}$ the bundle $S^\vee$ is trivial, and the elements $(z_3, z_4)$ give its trivialization (see (34)). Hence the restriction of the twistor transform of the complex (22) to this open subset is isomorphic to the complex

\[
0 \longrightarrow K \otimes \mathcal{O}
\xrightarrow{\;\begin{pmatrix} N_3 + y_4 N_1 - y_2 N_2 \\ N_4 - y_1 N_1 - y_3 N_2 \end{pmatrix}\;}
(L \oplus L) \otimes \mathcal{O} \longrightarrow 0.
\tag{36}
\]
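The two rows of the map in (36) come from substituting the trivialization (34) into (35). The short computation below spells this out; it assumes that the constant maps $N_i$ commute with the functions $y_i$, since they act on different tensor factors. The last two lines use the explicit form of $N_1, \dots, N_4$ from (24) after the reality conditions (25) have been imposed.

```latex
\begin{align*}
N_1 z_1 + N_2 z_2 + N_3 z_3 + N_4 z_4
  &= N_1 (y_4 z_3 - y_1 z_4) + N_2 (-y_2 z_3 - y_3 z_4) + N_3 z_3 + N_4 z_4 \\
  &= (N_3 + y_4 N_1 - y_2 N_2)\, z_3 + (N_4 - y_1 N_1 - y_3 N_2)\, z_4, \\[4pt]
N_3 + y_4 N_1 - y_2 N_2
  &= \begin{pmatrix} -B_2 + y_2 & B_1 + y_4 & I \end{pmatrix}, \\
N_4 - y_1 N_1 - y_3 N_2
  &= \begin{pmatrix} -B_1^\dagger + y_3 & -B_2^\dagger - y_1 & -J^\dagger \end{pmatrix}.
\end{align*}
```

The coefficients of $z_3$ and $z_4$ are the two components of the map to $(L \oplus L) \otimes \mathcal{O}$, and the explicit rows agree with the matrix form of $N$ stated in the text that follows.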


Assume now that the complex (22) is given by the deformed ADHM data $(B_1, B_2, I, J)$ (see Sect. 7). Applying the formulas (24) and (25), we see that with respect to the chosen bases of $L$ and $K$ the map $N$ is given by the matrix

\[
\begin{pmatrix}
-B_2 + y_2 & B_1 + y_4 & I \\
-B_1^\dagger + y_3 & -B_2^\dagger - y_1 & -J^\dagger
\end{pmatrix}.
\]

It is evident that this operator is related to the operator $D$ in (4) by a change of basis. In particular, the Nekrasov–Schwarz coordinates $\xi_1, \xi_2, \bar\xi_1, \bar\xi_2$ (see Sect. 2) can be expressed through $x_i = T^{-1} X_i$ as follows:

\[
\begin{aligned}
\xi_1 &= -y_4 = x_4 - \sqrt{-1}\, x_3, & \xi_2 &= y_2 = -x_2 + \sqrt{-1}\, x_1, \\
\bar\xi_1 &= y_3 = x_4 + \sqrt{-1}\, x_3, & \bar\xi_2 &= -y_1 = -x_2 - \sqrt{-1}\, x_1.
\end{aligned}
\]

Thus the twistor transform of the complex corresponding to the deformed ADHM data coincides with the instanton bundle corresponding to these data (see Sect. 2). This gives a geometric interpretation of the deformed ADHM construction of the noncommutative instanton bundle.

8.11. Differential forms. Let an algebra $A$ be the $O_R$-image of a commutative algebra in the category $T_R$. This means that there exists an operator $R : A^{\otimes 2} \to A^{\otimes 2}$ compatible with the multiplication law of $A$. Above we have defined the R-tensor product $A \otimes_R A$ which is also an algebra with a Yang–Baxter operator. Explicitly, the multiplication law of $A \otimes_R A$ is defined as follows. Let $m$ be the multiplication map from $A \otimes A$ to $A$. Then the multiplication map from $(A \otimes_R A) \otimes (A \otimes_R A)$ to $A \otimes_R A$ is given by $m_{12} m_{34} R_{23}$ in the obvious notation. It is easy to see that the multiplication map $m$ is a homomorphism of algebras. Let $I$ denote the kernel of the map $m : A \otimes_R A \to A$. Then $I$ is a two-sided ideal of the algebra $A \otimes_R A$.

Definition 8.6. We define the bimodule of R-differential forms of the algebra $A$ by $\Omega_A^1 = I/I^2$.

For a motivation of this definition, see [12]. Furthermore, suppose $A$ is a graded algebra. Consider the total grading of the bigraded algebra $A \otimes_R A$. The two-sided ideal $I$ inherits the grading. Therefore the bimodule $\Omega_A^1$ is graded too.

In the graded case, besides $\Omega_A^1$, we can define the module of projective differential forms of $A$ in the following way. Let $\chi : A \otimes_R A \to A \otimes_R A$ be the linear operator which acts on the $(p, q)$th graded component of the algebra $A \otimes_R A$ as a scalar multiplication by $q$. Since $\chi$ is a derivation, we have $\chi(I^2) \subset I$. Therefore $m(\chi(I^2)) = 0$. Furthermore, the induced map $\Omega_A^1 = I/I^2 \xrightarrow{\;m \cdot \chi\;} A$ is a morphism of graded $A$-bimodules.

Definition 8.7. We define the $A$-bimodule of projective differential forms of the algebra $A$ by

\[ \tilde\Omega_A^1 = \mathrm{Ker}\big( \Omega_A^1 \xrightarrow{\;m \cdot \chi\;} A \big). \]


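First, let us apply this construction of differential forms to the noncommutative affine variety $\mathbb{C}^4_\hbar$ (Subsect. 3.4). The algebra $A(\mathbb{C}^4_\hbar)$ of polynomial functions on $\mathbb{C}^4_\hbar$ is the Weyl algebra:

\[ A(\mathbb{C}^4_\hbar) = T(x_1, x_2, x_3, x_4) \big/ \big\langle [x_i, x_j] = \hbar\,\theta_{ij} \big\rangle_{1 \le i, j \le 4}. \]

Let us define the Yang–Baxter operator on the tensor square of the subspace of $A(\mathbb{C}^4_\hbar)$ spanned by $1, x_1, x_2, x_3, x_4$ by the formula

\[ 1 \otimes x_i \mapsto x_i \otimes 1, \qquad x_i \otimes 1 \mapsto 1 \otimes x_i, \qquad x_i \otimes x_j \mapsto x_j \otimes x_i + \hbar\,\theta_{ij} \cdot 1 \otimes 1 \qquad \text{for all } 1 \le i, j \le 4. \]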
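Two properties of this operator can be read off directly from the formula. The computation below is a sketch under two assumptions not spelled out in the text: that $R(1 \otimes 1) = 1 \otimes 1$, and that $\theta_{ji} = -\theta_{ij}$, as the commutation relations of the Weyl algebra require.

```latex
\begin{align*}
R\bigl(R(x_i \otimes x_j)\bigr)
  &= R(x_j \otimes x_i) + \hbar\,\theta_{ij}\, R(1 \otimes 1)
   = x_i \otimes x_j + \hbar\,(\theta_{ji} + \theta_{ij})\, 1 \otimes 1
   = x_i \otimes x_j, \\
m\bigl(R(x_i \otimes x_j)\bigr)
  &= x_j x_i + \hbar\,\theta_{ij}
   = x_i x_j
   = m(x_i \otimes x_j).
\end{align*}
```

The first line is the involutivity condition in (29) on the generators. The second says that the multiplication $m$ is unchanged by $R$, so that $A(\mathbb{C}^4_\hbar)$ is commutative in the sense defined by $R$.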

This Yang–Baxter operator has a unique extension to the whole $A(\mathbb{C}^4_\hbar)$ compatible with the multiplication law. There is another way to look at this Yang–Baxter operator. Recall that $\mathbb{C}^4_\hbar$ is an open subset $T \ne 0$ in the noncommutative Grassmannian $G_R(2; Z)$, where $R$ is defined by (30). The Yang–Baxter operator on the quadratic algebra $G_R(2; Z)$ has the property that $R(T \otimes a) = a \otimes T$ for any $a \in G_R(2; Z)$. Hence it descends to a Yang–Baxter operator on $A(\mathbb{C}^4_\hbar)$. It is easy to see that it acts on the tensor square of the subspace spanned by $1, x_1, x_2, x_3, x_4$ in the above manner.

We define the sheaf of differential forms $\Omega^1_{\mathbb{C}^4_\hbar}$ as the bimodule of R-differential forms of the algebra $A(\mathbb{C}^4_\hbar)$. It is easy to check that $\Omega^1_{\mathbb{C}^4_\hbar}$ is isomorphic to the bimodule $A(\mathbb{C}^4_\hbar)^{\oplus 4}$. Furthermore, we can take any R-exterior power of $\Omega^1_{\mathbb{C}^4_\hbar}$ and thereby define $\Omega^p_{\mathbb{C}^4_\hbar}$. This enables us to define a connection and its curvature on any bundle on the noncommutative affine space. The relevant formulas were written above (see Subsect. 1.5).

Second, we define the sheaf of differential forms $\Omega^1_{G_R}$ on the noncommutative Grassmannian $G_R(k; V)$ as the sheaf corresponding to the module of projective differential forms $\tilde\Omega^1_{G_R}$. It can be shown that as in the commutative case we have an isomorphism of coherent sheaves on the noncommutative Grassmannian $G_R(k; V)$:

\[ \Omega^1_{G_R} \cong S \otimes \tilde S. \]

It follows that for $k = 1$ we have an exact sequence

\[ 0 \longrightarrow \Omega^1_{\mathbb{P}_R(V)} \longrightarrow \mathcal{V}^*(-1) \longrightarrow \mathcal{O} \longrightarrow 0. \]

Thus this definition of the sheaf of differential forms $\Omega^1_{\mathbb{P}_R(V)}$ is consistent with Definition 4.8.

Similarly, one can define the sheaf of differential forms $\Omega^1_{Fl_R}$ on the noncommutative flag variety $Fl_R(k_1, \dots, k_r; V)$. One can check that the projection $p_i : Fl_R(k_1, \dots, k_i, \dots, k_r; V) \to G_R(k_i; V)$ induces a morphism of bundles $p_i^* : \Omega^1_{G_R} \to \Omega^1_{Fl_R}$.

In the commutative case the ADHM construction of the instanton connection can be interpreted in terms of twistor transform (see [4, 24] for details). We believe that this can be done in the noncommutative case as well. It appears that the most convenient definition of connection on a bundle on a noncommutative projective variety is in terms of jet bundles (see, for example, [24]).


9. Instantons on a q-Deformed $\mathbb{R}^4$

In this paper we have focused on a particular noncommutative deformation of $\mathbb{R}^4$ related to the Wigner–Moyal product (3). This is the only deformation of $\mathbb{R}^4$ which is known to arise in string theory. But most of our constructions work for more general deformations which do not have a clear physical interpretation. For example, let us replace $\mathbb{C}^4_{\hbar}$ with a noncommutative affine variety whose coordinate ring is generated by $z_1, z_2, z_3, z_4$ subject to the following quadratic relations:

$$q z_1 z_2 - q^{-1} z_2 z_1 = \hbar, \qquad q z_3 z_4 - q^{-1} z_4 z_3 = \hbar, \qquad [z_1, z_3] = [z_1, z_4] = [z_2, z_3] = [z_2, z_4] = 0.$$

We will denote this noncommutative affine variety by $\mathbb{C}^4_{q,\hbar}$, and its coordinate algebra by $A_{q,\hbar}$. If $\hbar$ and $q$ are real, we can define a $*$-operation on $A_{q,\hbar}$ by $z_1^* = z_2$, $z_3^* = z_4$. The corresponding real noncommutative affine variety will be denoted by $\mathbb{R}^4_{q,\hbar}$. Consider now the following deformation of the ADHM equations:

$$[B_1, B_2]_{q^{-1}} + I J = 0, \qquad [B_1, B_1^\dagger]_{q^{-1}} + [B_2, B_2^\dagger]_{q} + I I^\dagger - J^\dagger J = -2\hbar \cdot 1_{k\times k}. \tag{37}$$

Here B1 , B2 ∈ Hom(V , V ), I ∈ Hom(W, V ), J ∈ Hom(V , W ), as usual, and by [A, B]q we mean a q-commutator: [A, B]q = qAB − q −1 BA. We claim that solutions of these “q- deformed” ADHM equations can be used as an input for the construction of instantons on R4q,h¯ of rank r = dim W and instanton charge k = dim V . Let us sketch this construction. Define an operator D ∈ HomAq,h¯ ((V ⊕ V ⊕ W ) ⊗C Aq,h¯ , (V ⊕ V ) ⊗C Aq,h¯ ) by the formula

$$D = \begin{pmatrix} B_1 - q z_1 & -q B_2 + q z_2 & I \\ B_2^\dagger - \bar z_2 & q B_1^\dagger - \bar z_1 & J^\dagger \end{pmatrix}.$$
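To see what the deformed equations (37) say in the simplest situation, one can specialize to $k = \dim V = 1$. The following worked case is only a sketch added for orientation. It assumes that $B_1, B_2$ are then complex numbers $b_1, b_2$, that $I$ is a row vector and $J$ a column vector of length $r = \dim W$, and it uses the fact that for commuting scalars the q-commutator reduces to $[a,b]_q = (q - q^{-1})ab$.

```latex
\begin{aligned}
 [B_1,B_2]_{q^{-1}} + IJ &= (q^{-1}-q)\, b_1 b_2 + IJ = 0,\\
 [B_1,B_1^\dagger]_{q^{-1}} + [B_2,B_2^\dagger]_{q} + II^\dagger - J^\dagger J
   &= (q^{-1}-q)\,|b_1|^2 + (q-q^{-1})\,|b_2|^2 + \|I\|^2 - \|J\|^2 = -2\hbar .
\end{aligned}
```

In the limit $q \to 1$ the terms containing $b_1, b_2$ drop out and one is left with $IJ = 0$ and $\|I\|^2 - \|J\|^2 = -2\hbar$, so for $k = 1$ the q-deformation only changes how $b_1, b_2$ enter the constraints.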

Now we can go through the same manipulations as in Sect. 2: assume that $D$ is surjective, and its kernel is a free module, and define a connection 1-form by the expression (5). The same formal computation as in Sect. 2 shows that the curvature of this connection is anti-self-dual. In order to ensure that $D$ is surjective, it is probably necessary to replace the algebra $A_{q,\hbar}$ with some bigger algebra containing $A_{q,\hbar}$ as a subalgebra. This bigger algebra should play the role of the algebra of smooth functions on our noncommutative $\mathbb{R}^4$. For $\hbar = 0$, $q \neq 1$ there is even a natural candidate for this bigger algebra: it should consist of $C^\infty$ functions on $\mathbb{C}^2$ with some suitable growth conditions at infinity and the product defined by

$$(f \star g)(z_1, z_2, \bar z_1, \bar z_2) = \exp\left[-\ln(q)\left( z_1 \bar z_1' \frac{\partial^2}{\partial z_1\, \partial \bar z_1'} + z_2 \bar z_2' \frac{\partial^2}{\partial z_2\, \partial \bar z_2'} - z_1' \bar z_1 \frac{\partial^2}{\partial z_1'\, \partial \bar z_1} - z_2' \bar z_2 \frac{\partial^2}{\partial z_2'\, \partial \bar z_2}\right)\right] f(z_1, z_2, \bar z_1, \bar z_2)\, g(z_1', z_2', \bar z_1', \bar z_2') \Big|_{z_1' = z_1,\ z_2' = z_2}. \tag{38}$$


A. Kapustin, A. Kuznetsov, D. Orlov

Assuming that this formal expression exists, it is easy to check that the product is associative, that polynomial functions form a subalgebra with respect to it, and that this subalgebra is isomorphic to $A_{q,\hbar}$. It is natural to conjecture that all instantons on $\mathbb{R}^4_{q,\hbar}$ arise from this deformed ADHM construction. Note that in this case the deformed ADHM equations are not hyperkähler moment map equations, and one cannot use the hyperkähler quotient construction to infer the existence of a hyperkähler metric on the quotient space.

The algebro-geometric part of the story can also be generalized. We did not go through this carefully, but nevertheless would like to indicate one result. It appears that the q-deformed ADHM data can be interpreted in terms of sheaves on a more general noncommutative $\mathbb{P}^2$ than the one defined in Sect. 3. The graded algebra corresponding to this noncommutative $\mathbb{P}^2$ is generated by degree one elements $z_1, z_2, z_3$ with the quadratic relations

$$q z_1 z_2 - q^{-1} z_2 z_1 = 2\hbar z_3^2, \qquad [z_i, z_3] = 0, \quad i = 1, 2.$$

This algebra is one of the Artin-Schelter regular algebras of dimension three [1, 2]. It is characterized by the fact that the corresponding noncommutative variety $\mathbb{P}^2_{q,\hbar}$ contains as subvarieties a commutative quadric and a noncommutative line. The latter is given by the equation $z_3 = 0$. In the limit $q \to 1$ the plane $\mathbb{P}^2_{q,\hbar}$ reduces to $\mathbb{P}^2_{\hbar}$, and the union of the quadric and the line turns into the triple commutative line $l$ which played such a prominent role in this paper. If $q \neq 1$, then in the limit $\hbar \to 0$ the quadric turns into a union of two intersecting commutative lines $z_1 = 0$ and $z_2 = 0$. For any $q$ the line $z_3 = 0$ should be regarded as “the line at infinity” (which is noncommutative for $q \neq 1$). It is plausible that the q-deformed ADHM data are in one-to-one correspondence with bundles, or maybe torsion-free sheaves, on $\mathbb{P}^2_{q,\hbar}$ with a trivialization on this line.

10. Appendix

In this section we define a $\star$-product on the space of complex-valued $C^\infty$ functions on $\mathbb{R}^n$ whose derivatives of arbitrary order are polynomially bounded. The $\star$-product endows this space with a structure of a $\mathbb{C}$-algebra and reduces to the Wigner–Moyal product (3) on polynomial functions.

Definition 10.1. Let $O$ be a topological vector space which is a subspace of the space of $C^\infty$ functions on $\mathbb{R}^n$, and let $O'$ be the space of distributions on $O$. Let $f$ be a $\mathbb{C}$-valued function on $\mathbb{R}^n$ which simultaneously is a distribution in $O'$. $f$ is called a multiplier if for any $\phi \in O$, $f\phi \in O$. The set of multipliers of $O$ is obviously a subspace of $O'$.

Definition 10.2. Let $f \in O'$. $f$ is called a convolute if for any $\phi \in O$ we have $(f * \phi)(x) \equiv (f(\xi), \phi(x + \xi)) \in O$, and this expression depends continuously on $\phi$. The above expression is called the convolution of $f$ with $\phi$. The set of convolutes is obviously a subspace of $O'$.

We will denote the Fourier duals of $O$ and $O'$ by $\widetilde{O}$ and $\widetilde{O}'$, respectively. If $f \in O$, then $\tilde f \in \widetilde{O}$ will be the Fourier transform of $f$, etc.


Definition 10.3. The Schwartz space $S(\mathbb{R}^n)$ is the space of $\mathbb{C}$-valued $C^\infty$ functions on $\mathbb{R}^n$ such that $\phi \in S$ if and only if all the norms

$$\sup_x \, |x|^k \, |D^m \phi(x)|, \qquad k = 0, 1, 2, \ldots, \tag{39}$$

are finite. Here $m = (m_1, \ldots, m_n)$ is an arbitrary polyindex. Convergence on $S$ is defined using the family of norms (39). Then $S$ becomes a complete countably normed space [17].

Proposition 10.4. A function $f \in S'$ is a multiplier if and only if it is a $C^\infty$ function on $\mathbb{R}^n$ all of whose derivatives are polynomially bounded.

Proof. Obvious. □

The following theorem proved in [37] describes the subspace of convolutes of $S'$:

Theorem 10.5. A distribution $f \in S'$ is a convolute if and only if it has the form f = D α fα (x), |α| 0 u+ (τ, m, η) = j 0, elsewhere.

Hence + + −

u1 0 u2 L2 (R×Z×R) u+ 1 0 u2 L2 (R×Z×R) + u1 0 u2 L2 (R×Z×R)

+ − − + u− 1 0 u2 L2 (R×Z×R) + u1 0 u2 L2 (R×Z×R)

and a use of (18) yields + + −

u1 0 u2 L2 (R×Z×R) u+ 1 0 u2 L2 (R×Z×R) + u1 0 j(u2 ) L2 (R×Z×R)

Since τ − m5 −

η2 m

+ − − + j(u− 1 ) 0 u2 L2 (R×Z×R) + j(u1 ) 0 j(u2 ) L2 (R×Z×R) . 2 is an odd function, one has 0 < m ∼ Mj and τ − m5 − ηm ∼ Kj ,

− j = 1, 2 on the support of u+ j and j(uj ). Hence we can suppose m > 0 on the support of uj , j = 1, 2, when proving Lemma 4. We need to bound the expression ∞ ∞ ∞ ∞ u1 (τ1 , m1 , η1 )u2 (τ − τ1 , m − m1 , η − η1 ) m=0 −∞ −∞ m1 >0,m−m1 >0 −∞ −∞

2 dτ1 dη1 dτ dη.

Periodic KP-I Type Equations


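The Cauchy–Schwarz inequality in $(\tau_1, m_1, \eta_1)$, the support properties of $u_1$ and $u_2$ and the Cauchy–Schwarz inequality in $(\tau, m, \eta)$ yield

$$\|u_1 \star u_2\|^2_{L^2(\mathbb{R}\times\mathbb{Z}_0\times\mathbb{R})} \lesssim \sup_{(\tau,\, m\neq 0,\, \eta)} |A_{\tau m\eta}| \; \|u_1\|^2_{L^2(\mathbb{R}\times\mathbb{Z}\times\mathbb{R})} \, \|u_2\|^2_{L^2(\mathbb{R}\times\mathbb{Z}\times\mathbb{R})}, \tag{19}$$

where $A_{\tau m\eta} \subset \mathbb{R}\times\mathbb{Z}\times\mathbb{R}$ is the set

$$A_{\tau m\eta} = \Big\{(\tau_1, m_1, \eta_1) : 0 < m_1 \sim M_1,\ 0 < (m - m_1) \sim M_2,\ \Big|\tau_1 - m_1^5 - \frac{\eta_1^2}{m_1}\Big| \sim K_1,\ \Big|\tau - \tau_1 - (m - m_1)^5 - \frac{(\eta - \eta_1)^2}{m - m_1}\Big| \sim K_2 \Big\}.$$

Further we obtain via the triangle inequality $|A_{\tau m\eta}| \lesssim (K_1 \wedge K_2)\,|B_{\tau m\eta}|$, where $B_{\tau m\eta} \subset \mathbb{Z}\times\mathbb{R}$ is the set

$$B_{\tau m\eta} = \Big\{(m_1, \eta_1) \in \mathbb{Z}\times\mathbb{R} : 0 < m_1 \sim M_1,\ 0 < (m - m_1) \sim M_2,\ \Big|\tau - m_1^5 - (m - m_1)^5 - \frac{\eta_1^2}{m_1} - \frac{(\eta - \eta_1)^2}{m - m_1}\Big| \lesssim (K_1 \vee K_2) \Big\}.$$

It remains to bound $|B_{\tau m\eta}|$. We shall again use Lemma 1. The projection of $B_{\tau m\eta}$ on the $m_1$ axis is bounded by $c(M_1 \wedge M_2)$. Fix now $m_1$. We need to estimate the Lebesgue measure of $\eta_1$ such that the expression

$$\tau - m_1^5 - (m - m_1)^5 - \frac{\eta_1^2}{m_1} - \frac{(\eta - \eta_1)^2}{m - m_1} \tag{20}$$

ranges in an interval of size $c(K_1 \vee K_2)$. For that purpose we need the following lemma, the proof of which is straightforward.

Lemma 5. Let $a \neq 0$, $b$, $c$ be real numbers and $I$ be an interval on the real line. Then

$$\operatorname{mes}\{x : a x^2 + b x + c \in I\} \lesssim \frac{|I|^{1/2}}{|a|^{1/2}}.$$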
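Lemma 5 can be checked numerically. The sketch below is an illustration added here, not part of the paper's argument: the helper names are hypothetical, and the explicit constant 2 in the bound is an assumption of this sketch (the lemma itself only states the estimate up to a constant). The measure is computed exactly by completing the square.

```python
import math
import random


def quadratic_sublevel_measure(a, b, c, lo, hi):
    """Exact Lebesgue measure of {x : lo <= a*x**2 + b*x + c <= hi}, a != 0."""
    if a < 0:
        # Multiply the inequality by -1 so that the leading coefficient is positive.
        a, b, c, lo, hi = -a, -b, -c, -hi, -lo
    # Complete the square: a*x^2 + b*x + c = a*y^2 + shift with y = x + b/(2a).
    shift = c - b * b / (4.0 * a)
    lo, hi = lo - shift, hi - shift  # now we need a*y^2 in [lo, hi]
    if hi < 0:
        return 0.0
    outer = math.sqrt(hi / a)
    inner = math.sqrt(lo / a) if lo > 0 else 0.0
    # The set is symmetric in y: either one interval or two mirror intervals.
    return 2.0 * (outer - inner)


def lemma5_bound(a, lo, hi):
    """Bound 2*|I|^(1/2)/|a|^(1/2); the constant 2 is specific to this sketch."""
    return 2.0 * math.sqrt((hi - lo) / abs(a))


if __name__ == "__main__":
    random.seed(0)
    for _ in range(5):
        a = random.choice([-1, 1]) * random.uniform(0.1, 10.0)
        b, c = random.uniform(-5, 5), random.uniform(-5, 5)
        lo = random.uniform(-20, 20)
        hi = lo + random.uniform(0.0, 10.0)
        print(quadratic_sublevel_measure(a, b, c, lo, hi), "<=", lemma5_bound(a, lo, hi))
```

In the application that follows, $a$ plays the role of the coefficient of $\eta_1^2$ in (20) and $|I|$ is of size $c(K_1 \vee K_2)$.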

Write the expression (20) as

$$-\frac{m}{m_1(m - m_1)}\,\eta_1^2 + \frac{2\eta}{m - m_1}\,\eta_1 + \tau - m_1^5 - (m - m_1)^5 - \frac{\eta^2}{m - m_1}.$$

Since $m_1$ and $m - m_1$ are both positive we have that

$$\frac{1}{M_1 \wedge M_2} \lesssim \frac{m}{m_1(m - m_1)}.$$

Therefore Lemma 5 implies that the Lebesgue measure of $\eta_1$ such that the expression (20) ranges in an interval of size $c(K_1 \vee K_2)$ is bounded by $c(M_1 \wedge M_2)^{1/2}(K_1 \vee K_2)^{1/2}$. Hence


J.-C. Saut, N. Tzvetkov

using Lemma 1 we obtain $|B_{\tau m\eta}| \lesssim (M_1 \wedge M_2)^{3/2}(K_1 \vee K_2)^{1/2}$ and moreover

$$|A_{\tau m\eta}| \lesssim (K_1 \wedge K_2)(M_1 \wedge M_2)^{3/2}(K_1 \vee K_2)^{1/2}. \tag{21}$$

Substituting (21) in (19) completes the proof of Lemma 4. Consider the dyadic levels

η2 K1 K2 K 5 DM1 M2 M = (τ, m, η, τ1 , m1 , η1 ) : τ − m − ≈ K, |m| ≈ M, m

η2 τ1 − m51 − 1 ≈ K1 , |m1 | ≈ M1 , m1

(η − η1 )2 5 τ − τ1 − (m − m1 ) − ≈ K2 , m − m1

|m − m1 | ≈ M2 , (m, m1 , η, η1 ) ∈ "2 ,

where K1 , K2 , K, M1 , M2 , M are dyadic integers. Denote by J2K1 K2 KM1 M2 M the conK1 K2 K tribution of DM to (6). Then 1 M2 M

J2

J2K1 K2 KM1 M2 M .

K1 ,K2 ,K,M1 ,M2 ,M-dyadic

Define fK1 M1 (τ, m, η) and gK2 M2 (τ, m, η) as in (8) and (9) respectively. In the estimate of J2 we shall perform an additional (comparing to the estimate for J1 ) localization of 2 h near the level set of τ − m5 − ηm . So we set 2 h(τ, m, η), when τ − m5 − ηm ≈ K, |m| ≈ M hKM (τ, m, η) = 0, elsewhere. Then clearly J2K1 K2 KM1 M2 M is bounded by M · fK M (τ1 , m1 , η1 )gK M (τ − τ1 , m − m1 , η − η1 )hKM (τ, m, η) 1 1 2 2 K K K 1 2

*M1 M2 M

1

+

1

K 2 − K12 K22 1

"+

+

,

K1 K2 K ⊂ R4 is defined as where *M 1 M2 M

K1 K2 K *M = (τ, τ1 , η, η1 ) ∈ R4 such that there exists (m, m1 , η, η1 ) ∈ "2 1 M2 M K1 K2 K . with (τ, m, η, τ1 , m1 , η1 ) ∈ DM 1 M2 M Using Lemma 3 one obtains that max {K, K1 , K2 } M1 M2 M 3 . We are in a position to state the following lemma.

(22)


Lemma 6. 1

fK1 M1 L2 gK2 M2 L2 hKM L2 . [max{K, K1 , K2 }]0+ Proof. Via a symmetry argument we can assume that K1 ≥ K2 . We shall consider separately the cases K1 ≥ K and K1 ≤ K. Case K1 ≥ K. Then M J2K1 K2 KM1 M2 M hKM 0 j(gK2 M2 ), fK1 M1 L2 , 1 1 + 1+ K 2 − K12 K22 J2K1 K2 KM1 M2 M

where ·, · L2 connotes the L2 (R × Z × R) scalar product. Using the Cauchy–Schwarz inequality and Lemma 4 we obtain 3

J2K1 K2 KM1 M2 M

1

1

M(M ∧ M2 ) 4 (K ∧ K2 ) 2 (K ∨ K2 ) 4

1

+

1

+

K 2 − K12 K22 · fK1 M1 L2 gK2 M2 L2 hKM L2 . 1

Now (22) yields 1

1

1

3

3

K12 (M1 M2 M 3 ) 2 M22 M 2 M(M ∧ M2 ) 4 . Hence for K1 ≥ K one has J2K1 K2 KM1 M2 M

1 K10+

fK1 M1 L2 gK2 M2 L2 hKM L2 .

Case K1 ≤ K. Then J2K1 K2 KM1 M2 M

M K

1 2−

1

+

1

K12 K22

+

fK1 M1 0 gK2 M2 , hKM L2 .

The Cauchy–Schwarz inequality and Lemma 4 yield 3

J2K1 K2 KM1 M2 M

1

1

M(M1 ∧ M2 ) 4 (K1 ∧ K2 ) 2 (K1 ∨ K2 ) 4 1

+

1

+

K 2 − K12 K22 · fK1 M1 L2 gK2 M2 L2 hKM L2 . 1

Next using (22) we obtain K 2 − (M1 M2 M 3 ) 2 − M(M1 ∧ M2 ) 4 . 1

1

3

Hence for K1 ≤ K one has 1

fK1 M1 L2 gK2 M2 L2 hKM L2 . K 0+ This completes the proof of Lemma 6. J2K1 K2 KM1 M2 M

Now using (22) and Lemma 6 we can sum J2K1 K2 KM1 M2 M over dyadic K1 , K2 , K, M1 , M2 , M and arrive at J2 f L2 g L2 h L2 . This completes the proof of Theorem 2.1.


3. Local Well-Posedness

The goal of this section is to prove a local well-posedness result in the Fourier transform restriction spaces associated to the energy density of the fifth order KP-I equation posed on $\mathbb{T}\times\mathbb{R}$. This well-posedness result is a consequence of a bilinear estimate in the framework of the above spaces. The gain of smoothness is obtained as in the previous section. Because of the specific structure of the energy density, an additional argument is needed in order to deal with the terms containing antiderivatives. This argument was already given in [12]. Here we perform it again with the needed modifications.

We define now the antiderivative operator $\partial_x^{-k}$ which acts on functions defined on $\mathbb{T}\times\mathbb{R}$ with zero $x$ mean value (or equivalently vanishing of some Fourier modes). Let $\phi : \mathbb{T}\times\mathbb{R} \to \mathbb{R}$ be such that $\int_{\mathbb{T}} \phi(x, y)\,dx = 0$ (or equivalently $\hat\phi(0, \eta) = 0$). Define $\partial_x^{-k}\phi$ through its Fourier transform as

$$\widehat{\partial_x^{-k}\phi}(m, \eta) = \begin{cases} (-im)^{-k}\,\hat\phi(m, \eta), & \text{when } m \neq 0,\\ 0, & \text{elsewhere.} \end{cases}$$

Note that $\partial_x^{-1}(\partial_x\phi) = \phi$ for any $\phi$ having zero $x$ mean value. Let $\phi : \mathbb{T}\times\mathbb{R} \to \mathbb{R}$ be such that $\int_{\mathbb{T}} \phi(x, y)\,dx = 0$. Then an integration by parts yields

$$\int_{\mathbb{T}\times\mathbb{R}} \partial_x^{-2}\phi \cdot \phi = -\int_{\mathbb{T}\times\mathbb{R}} |\partial_x^{-1}\phi|^2 .$$

Let $s$ and $k$ be real numbers and $H^{s,k}(\mathbb{T}\times\mathbb{R})$ be the Sobolev-type space (related to the energy density of the KP equation for $s = 2$ and $k = 1$) of functions having zero $x$ mean value equipped with the norm

$$\|\phi\|_{H^{s,k}} = \left( \sum_{m \neq 0} \int_{-\infty}^{\infty} \big(|m|^{2s} + |m|^{-2}|\eta|^{2k}\big)\, |\hat\phi(m, \eta)|^2 \, d\eta \right)^{1/2}.$$
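The antiderivative operator defined above is easy to experiment with on a finite set of Fourier modes in $x$ (with $\eta$ held fixed). The following sketch is illustrative only: the function names are hypothetical, and it assumes that $\partial_x$ acts on the $m$-th coefficient by multiplication with $(-im)$, the convention suggested by the multiplier $(-im)^{-k}$ in the definition. It checks that $\partial_x^{-1}\partial_x$ is the identity on zero-mean data and evaluates both sides of the integration-by-parts identity via Parseval, up to a common normalization.

```python
def dx(coeffs):
    """x-derivative on Fourier coefficients {m: c_m}; assumed symbol (-i m)."""
    return {m: (-1j * m) * c for m, c in coeffs.items()}


def dx_inverse(coeffs, k=1):
    """Antiderivative of order k: multiply by (-i m)^(-k) for m != 0, set mode 0 to 0."""
    return {m: ((-1j * m) ** (-k)) * c if m != 0 else 0.0
            for m, c in coeffs.items()}


def pairing(f, g):
    """Parseval pairing sum_m f_m * conj(g_m), i.e. the integral of f*g for real g."""
    return sum(f[m] * g[m].conjugate() for m in f)


if __name__ == "__main__":
    # Coefficients of a real, zero-mean function: c_{-m} = conj(c_m), c_0 absent.
    phi = {-2: 1 + 2j, -1: 0.5 + 0j, 1: 0.5 + 0j, 2: 1 - 2j}
    print(dx_inverse(dx(phi)))                   # reproduces phi
    print(pairing(dx_inverse(phi, 2), phi))      # integral of (dx^-2 phi) * phi
    d1 = dx_inverse(phi, 1)
    print(-sum(abs(c) ** 2 for c in d1.values()))  # minus integral of |dx^-1 phi|^2
```

The last two printed values agree, which reflects $(-im)^{-2} = -m^{-2}$.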

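Let $b$ and $k$ be real numbers. Since the energy density of the KP equations contains an antiderivative we introduce the Fourier transform restriction space $Y^{b,k}(\mathbb{R}\times\mathbb{T}\times\mathbb{R})$ as

$$Y^{b,k}(\mathbb{R}\times\mathbb{T}\times\mathbb{R}) = \big\{ u \in \mathcal{S}'(\mathbb{R}\times\mathbb{T}\times\mathbb{R}) : \hat u(\tau, 0, \eta) = 0 \text{ and } \|u\|_{Y^{b,k}} < \infty \big\},$$

where

$$\|u\|_{Y^{b,k}} = \left( \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \sum_{m \neq 0} |m|^{-2}|\eta|^{2k} \Big(1 + \Big|\tau - m^5 - \frac{\eta^2}{m}\Big|\Big)^{2b} |\hat u(\tau, m, \eta)|^2 \, d\tau\, d\eta \right)^{1/2}.$$

Define now the space $Z^{b,s,k}(\mathbb{R}\times\mathbb{T}\times\mathbb{R}) := X^{b,s}(\mathbb{R}\times\mathbb{T}\times\mathbb{R}) \cap Y^{b,k}(\mathbb{R}\times\mathbb{T}\times\mathbb{R})$ equipped with the norm

$$\|u\|_{Z^{b,s,k}} = \|u\|_{X^{b,s}} + \|u\|_{Y^{b,k}}.$$

Let $I \subset \mathbb{R}$ be an interval. Then we define a localized Bourgain space $Z^{b,s,k}(I)$ endowed with the norm

$$\|u\|_{Z^{b,s,k}(I)} = \inf_{w \in Z^{b,s,k}} \{ \|w\|_{Z^{b,s,k}},\ w(t) = u(t) \text{ on } I \}.$$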


We have the following local well-posedness result.

Theorem 3.1. Let $s \geq 1$ and $k \geq 0$. Then for any $\phi \in H^{s,k}(\mathbb{T}\times\mathbb{R})$, there exist a positive $T = T(\|\phi\|_{H^{s,k}})$ ($\lim_{\rho\to 0} T(\rho) = \infty$) and a unique solution $u(t, x, y)$ of the initial value problem associated to the fifth order KP-I equation with data on $\mathbb{T}\times\mathbb{R}$ on the time interval $I = [-T, T]$ such that $u \in C(I, H^{s,k}(\mathbb{T}\times\mathbb{R})) \cap Z^{\frac12+,s,k}(I)$.

The proof of Theorem 3.1 results from the following fundamental estimate:

$$\|\partial_x(uv)\|_{Z^{-\frac12+,s,k}} \lesssim \|u\|_{Z^{\frac12+,s,k}} \, \|v\|_{Z^{\frac12+,s,k}}, \qquad s \geq 1,\ k \geq 0. \tag{23}$$

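3.1. Proof of (23). Due to Theorem 2.1 we obtain for $s > 1/2$,

$$\|\partial_x(uv)\|_{X^{-\frac12+,s}} \lesssim \|u\|_{X^{\frac12+,s}} \, \|v\|_{X^{\frac12+,s}} \leq \|u\|_{Z^{\frac12+,s,k}} \, \|v\|_{Z^{\frac12+,s,k}}.$$

Therefore the proof of (23) is reduced to estimating $\|\partial_x(uv)\|_{Y^{-\frac12+,k}}$ by $\|u\|_{Z^{\frac12+,s,k}} \, \|v\|_{Z^{\frac12+,s,k}}$. Actually a stronger estimate holds. More precisely we have the following theorem.

Theorem 3.2. Let $s \geq 1$ and $k \geq 0$. Then

$$\|\partial_x(uv)\|_{Y^{-\frac12+,k}} \lesssim \|u\|_{X^{\frac12+,s}} \, \|v\|_{Y^{\frac12+,k}} + \|u\|_{Y^{\frac12+,k}} \, \|v\|_{X^{\frac12+,s}}.$$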

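Proof of Theorem 3.2. Write

$$\|\partial_x(uv)\|_{Y^{-\frac12+,k}} = \left( \sum_{m \neq 0} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} I^2(\tau, m, \eta)\, d\tau\, d\eta \right)^{1/2},$$

where

$$I(\tau, m, \eta) = \frac{|\eta|^k}{\big(1 + \big|\tau - m^5 - \frac{\eta^2}{m}\big|\big)^{\frac12-}} \sum_{\substack{m_1 \neq 0 \\ m - m_1 \neq 0}} \int_{\mathbb{R}^2} \hat u(\tau_1, m_1, \eta_1)\, \hat v(\tau - \tau_1, m - m_1, \eta - \eta_1)\, d\tau_1\, d\eta_1$$

$$= \frac{|\eta|^k}{\big(1 + \big|\tau - m^5 - \frac{\eta^2}{m}\big|\big)^{\frac12-}} \sum_{\substack{m_1 \neq 0 \\ m - m_1 \neq 0}} \int_{|\eta| \leq 2|\eta_1|} \cdots \; + \; \frac{|\eta|^k}{\big(1 + \big|\tau - m^5 - \frac{\eta^2}{m}\big|\big)^{\frac12-}} \sum_{\substack{m_1 \neq 0 \\ m - m_1 \neq 0}} \int_{|\eta| \geq 2|\eta_1|} \cdots \; := I_1(\tau, m, \eta) + I_2(\tau, m, \eta).$$

Theorem 3.2 is a direct consequence of the next lemma.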


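Lemma 7. The following estimates hold:

$$\left( \sum_{m \neq 0} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} I_1^2(\tau, m, \eta)\, d\tau\, d\eta \right)^{1/2} \lesssim \|u\|_{Y^{\frac12+,k}} \, \|v\|_{X^{\frac12+,s}}, \tag{24}$$

$$\left( \sum_{m \neq 0} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} I_2^2(\tau, m, \eta)\, d\tau\, d\eta \right)^{1/2} \lesssim \|u\|_{X^{\frac12+,s}} \, \|v\|_{Y^{\frac12+,k}}. \tag{25}$$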

Proof of Lemma 7. Since |η| ≤ 2|η1 | on the domain of the integral defining I1 (τ, m, n), a duality argument shows that in order to prove (24) we should bound the expression |m1 ||m − m1 |−s f (τ1 , m1 , η1 )g(τ − τ1 , m − m1 , η − η1 )h(τ, m, η) , η12 1 + η2 1 − (η−η1 )2 1 + 5 5 5 "+ τ − m − m 2 τ1 − m1 − m1 2 τ − τ1 − (m − m1 ) − m−m1 2 (26) by c f L2 (R×Z×R) g L2 (R×Z×R) h L2 (R×Z×R) , where f , g, h are positive L2 functions. Estimate for the contribution of "1 to (26). Denote by J1 the contribution of "1 to the K1 K2 expression (26). Consider the dyadic levels DM , where K1 , K2 , M1 , M2 , M are 1 M2 M

dyadic integers as in the proof of Theorem 2.1. Denote by J1K1 K2 M1 M2 M the contribution K1 K2 of DM to (26). Then 1 M2 M

J1

J1K1 K2 M1 M2 M .

K1 ,K2 ,M1 ,M2 ,M−dyadic

Define fK1 M1 (τ, m, η), gK2 M2 (τ, m, η) and hM (τ, m, η) as in (8), (9), (10). Then clearly J1K1 K2 M1 M2 M is bounded by M1 · fK M (τ1 , m1 , η1 )gK M (τ −τ1 , m−m1 , η−η1 )hM (τ, m, η) 1 1 2 2 dτ dτ1 , 1 K1 K2 + 1+ *M M M + M2s · K12 K22 " 1 2 1 K2 4 where *K M1 M2 M ⊂ R is defined as in the proof of Theorem 2.1. Moreover, similarly to the proof of Theorem 2.1 we obtain that

J1K1 K2 M1 M2 M

M1 M2−s 1 2+

1 2+

K1 K2

sup

(τ,|m|≈M,n)

|Aτ mn | 2 fK1 M1 L2 gK2 M2 L2 hM L2 , 1

where the set Aτ mη is defined as in (12). Again similarly to the proof of Theorem 2.1 we obtain via the triangle inequality that |Aτ mη | (K1 ∧K2 )|Bτ mn |, where Bτ mη ⊂ Z×R is the set defined by (14). We shall estimate |Bτ mη | in a slightly different fashion compared to the proof of Theorem 2.1. The projection of Bτ mη on the m1 axis is contained in a set


of cardinality at most c(M1 ∧ M2 ) since for (m1 , η1 ) ∈ Bτ mη one has |m1 | ≈ M1 and |m − m1 | ≈ M2 . Fix now m1 . Recall that for (m, m1 , η, η1 ) ∈ "1 one has ∂ 2 2 η ) (η − η 1 (τ − m51 − m − m1 )5 − 1 − ∼ |m| m2 − mm1 + m21 ∂η1 m1 m − m1 |mm1 | ∼ M1 M. Hence, due to Lemma 2, the maximum cardinality of the sections of Bτ mη with lines 2) parallel to the η1 axis is bounded by c(KM11∨K M . Now using Lemma 1 we obtain that the cardinality of Bτ mη is bounded by c(M1 ∧ M2 )(M1 M)−1 (K1 ∨ K2 ). Moreover |Aτ mη |

K1 K2 (M1 ∧ M2 ) . M1 M

Hence 1

J1K1 K2 M1 M2 M

1

M12 (M1 ∧ M2 ) 2 1

M 2 M2s K10+ K20+

fK1 M1 L2 gK2 M2 L2 hM L2

1

M12 1

s− 21

M 2 M2

K10+ K20+

fK1 M1 L2 gK2 M2 L2 hM L2

1

M12 1

1

M 2 M22 K10+ K20+

fK1 M1 L2 gK2 M2 L2 hM L2 ,

since s ≥ 1. By the triangle inequality we have that M1 max{M, M2 }. Using a symmetry argument we can suppose that M ≥ M1 and therefore M1 M. Let M = 2l M1 , where l ∈ Z, l ≥ −l0 (l0 is fixed, positive and independent of M1 ). Then we have that 1

l

J1K1 ,K2 ,M1 ,M2 ,2 M1

1

l

K10+ K20+ M22 2 2

fK1 M1 L2 gK2 M2 L2 h2l M1 L2 .

(27)

It remains to sum (27) over K1 , K2 , M1 , M2 , l. First we can easily sum (27) over K1 , K2 , M2 ,

l

J1K1 ,K2 ,M1 ,M2 ,2 M1

K1 ,K2 ,M2 -dyadic

1 l

22

fM1 L2 g L2 h2l M1 L2 ,

where fM1 (τ, m, η) =

f (τ, m, η), when |m| ≈ M1 0, elsewhere.


Next we sum over M1 and l via the Cauchy–Schwarz inequality J1

∞

l=−l0

K1 ,K2 ,M1 ,M2 -dyadic

∞ 1 l 22 l=−l0

M1 -dyadic

l

J1K1 ,K2 ,M1 ,M2 ,2 M1

fM1 2L2

1/2

M1 -dyadic

h2l M1 2L2

1/2

g L2

f L2 g L2 h L2 . Estimate for the contribution of "2 to (26). For (m, m1 ) ∈ "+ and s ≥ 1 one has |m1 ||m − m1 |−s |m|. Hence the contribution of "2 to the sum in the expression (26) is bounded by |m|f (τ1 , m1 , η1 )g(τ − τ1 , m − m1 , η − η1 )h(τ, m, η) 1 1 1+ . 2 2− η12 2 + (η−η1 )2 2 5 5 "+ τ − m5 − η τ1 − m1 − m1 τ − τ1 − (m − m1 ) − m−m1 m Now we remark that the above expression has the same nature as (6) with s = 0. Hence we can use the arguments implemented above when estimating the contribution of "2 to the expression (6). This completes the proof of (24). When |n| ≥ 2|n1 | one has |n| ≤ 2|n − n1 |. Hence a duality argument shows that the proof of (25) is reduced to bound the expression |m − m1 ||m1 |−s f (τ1 , m1 , η1 )g(τ − τ1 , m − m1 , η − η1 )h(τ, m, η) 1 1 1+ , 2 2− η12 2 + (η−η1 )2 2 5 5 "+ τ − m5 − η τ1 − m1 − m1 τ − τ1 − (m − m1 ) − m−m1 m (28) by c f L2 (R×Z×R) g L2 (R×Z×R) h L2 (R×Z×R) , where f , g, h are positive L2 functions. A symmetry argument (m1 → (m − m1 )) shows that we can bound (28) similarly to (26). This completes the proof of Lemma 7.

3.2. The fixed point argument. In this section we perform a fixed point argument for the integral equation corresponding to the fifth order KP-I equation. This argument is standard since the linear estimates in the Fourier transform restriction method of J. Bourgain do not depend on the particular equation in hand. Write the fifth order KP-I equation as an integral equation

$$u(t) = U(t)\phi - \frac{1}{2}\int_0^t U(t - t')\,\partial_x(u^2(t'))\,dt', \tag{29}$$


where $U(t) = \exp(t(\partial_x^5 + \partial_x^{-1}\partial_y^2))$ is the unitary group generating the solutions of the linear problem. We shall apply the contraction mapping principle to a cut-off version of (29). Let $\psi$ be a bump function such that $\psi \in C_0^\infty(\mathbb{R})$, $\operatorname{supp}\psi \subset [-2, 2]$, $\psi = 1$ on the interval $[-1, 1]$. Consider the integral equation

$$u(t) = \psi(t)U(t)\phi - \frac{1}{2}\,\psi(t/T)\int_0^t U(t - t')\,\partial_x(u^2(t'))\,dt'. \tag{30}$$

We shall solve (30) globally in time in the space $Z^{b,s,k}$. To the solutions of (30) correspond local solutions of the fifth order KP-I equation in the time interval $[-T, T]$ in the space $Z^{b,s,k}(I)$, where $I = [-T, T]$. Consider the nonlinear operator $L$ acting on $Z^{b,s,k}$ as

$$Lu := \psi(t)U(t)\phi - \frac{1}{2}\,\psi(t/T)\int_0^t U(t - t')\,\partial_x(u^2(t'))\,dt'.$$

We claim that for small enough $T$ the operator $L$ is a contraction in the space $Z^{\frac12+,s,k}$ for any $\phi \in H^{s,k}(\mathbb{T}\times\mathbb{R})$. This will follow from the next estimates of the two terms in the right-hand side of (30).

Lemma 8 (linear estimates). Let $-\frac12 < b' \leq 0 \leq b \leq b' + 1$, $s \geq 0$ and $k \geq 0$. Then the following inequalities hold:

$$\|\psi(t)U(t)\phi\|_{Z^{b,s,k}} \lesssim \|\phi\|_{H^{s,k}}, \tag{31}$$

$$\Big\|\psi(t/T)\int_0^t U(t - t')\,\partial_x(u^2(t'))\,dt'\Big\|_{Z^{b,s,k}} \lesssim T^{1-b+b'}\,\|u u_x\|_{Z^{b',s,k}}. \tag{32}$$

We refer to [9] for the proof of (31) and (32) (and for a very clear introduction to Bourgain’s method). These estimates are essentially one dimensional and do not depend on the unitary group $U(t)$. Now using (31), (32) and (23) we obtain that

$$\|Lu\|_{Z^{\frac12+,s,k}} \lesssim \|\phi\|_{H^{s,k}} + T^{0+}\,\|u\|^2_{Z^{\frac12+,s,k}}, \qquad s \geq 1,\ k \geq 0.$$

Hence $L$ maps $Z^{\frac12+,s,k}$ into itself for $s \geq 1$, $k \geq 0$. In a similar way we obtain that

$$\|Lu - Lv\|_{Z^{\frac12+,s,k}} \lesssim T^{0+}\,\|u - v\|_{Z^{\frac12+,s,k}} \, \|u + v\|_{Z^{\frac12+,s,k}}.$$

Therefore $L$ is a contraction in $Z^{b,s,k}$ for a small $T$ of order $\|\phi\|^{-a}_{H^{s,k}}$ for some positive constant $a$. It remains to use the contraction mapping principle to solve (30) in $Z^{b,s,k}$. This implies the local well-posedness of (29) in $Z^{b,s,k}(I)$. The embedding of $Z^{b,s,k}(I)$ in $C(I, H^{s,k}(\mathbb{T}\times\mathbb{R}))$ follows from a one dimensional Sobolev inequality. This completes the proof of Theorem 3.1.
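The mechanism of this fixed point argument can be seen on a scalar toy problem. The sketch below is not the KP-I operator $L$: it assumes a hypothetical map $u \mapsto \phi - \tfrac{T}{2}u^2$ on real numbers, chosen only because it satisfies scalar analogues of the two estimates above, namely $|Lu| \leq |\phi| + \tfrac{T}{2}|u|^2$ and $|Lu - Lv| = \tfrac{T}{2}|u - v||u + v|$. On the ball $|u| \leq 2|\phi|$ it is a contraction as soon as $2T|\phi| < 1$, that is, for $T$ of order $|\phi|^{-1}$.

```python
import math


def toy_map(u, phi, T):
    """Scalar stand-in for L: linear part phi minus a quadratic Duhamel-type term."""
    return phi - 0.5 * T * u * u


def picard_iterate(phi, T, tol=1e-14, max_iter=10_000):
    """Iterate u_{n+1} = toy_map(u_n) from u_0 = phi until successive iterates agree."""
    u = phi
    for n in range(max_iter):
        u_next = toy_map(u, phi, T)
        if abs(u_next - u) <= tol:
            return u_next, n + 1
        u = u_next
    raise RuntimeError("Picard iteration did not converge; T is probably too large")


if __name__ == "__main__":
    phi, T = 1.0, 0.25                      # 2*T*|phi| = 0.5 < 1: contraction regime
    u, steps = picard_iterate(phi, T)
    exact = (-1.0 + math.sqrt(1.0 + 2.0 * T * phi)) / T  # root of (T/2)u^2 + u - phi = 0
    print(u, exact, steps)
```

The iterates stay in the ball $|u| \leq 2|\phi|$ and converge to the unique fixed point there, mirroring how a small $T$ (depending on the size of the data) makes $L$ a contraction.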


4. Global Well-Posedness

In this section we extend globally in time the local solutions obtained in Theorem 3.1. This results from the energy conservation. More precisely, applying Theorem 3.1 with $s = 2$ and $k = 1$ we obtain a local solution $u$ of the fifth order KP-I equation on the time interval $[-T, T]$. The local well-posedness implies that the following alternative holds: either $\lim_{t\to T} \|u(t)\|_{H^{2,1}(\mathbb{T}\times\mathbb{R})} = \infty$ or $T = \infty$. Our goal is to show that the second cla

Communications in

Mathematical Physics

© Springer-Verlag 2001

Evolution of a Model Quantum System Under Time Periodic Forcing: Conditions for Complete Ionization O. Costin, R. D. Costin, J. L. Lebowitz, A. Rokhlenko Department of Mathematics, Rutgers University, Piscataway, NJ 08854-8019, USA Received: 1 November 2000 / Accepted: 5 February 2001

Abstract: We analyze the time evolution of a one-dimensional quantum system with an attractive delta function potential whose strength is subjected to a time periodic (zero mean) parametric variation $\eta(t)$. We show that for generic $\eta(t)$, which includes the sum of any finite number of harmonics, the system, started in a bound state, will get fully ionized as $t \to \infty$. This is irrespective of the magnitude or frequency (resonant or not) of $\eta(t)$. There are however exceptional, very non-generic $\eta(t)$, that do not lead to full ionization, which include rather simple explicit periodic functions. For these $\eta(t)$ the system evolves to a nontrivial localized stationary state which is related to eigenfunctions of the Floquet operator.

1. Introduction and Results

We are interested in the qualitative long time behavior of a quantum system evolving under a time dependent Hamiltonian $H(t) = H_0 + H_1(t)$, i.e. in the nature of the solutions of the Schrödinger equation

$$i\hbar\,\partial_t \psi = [H_0 + H_1(t)]\psi. \tag{1}$$

Here $\psi$ is the wavefunction of the system, belonging to some Hilbert space $\mathcal{H}$, $H_0$ and $H_1$ are Hermitian operators and Eq. (1) is to be solved subject to some initial condition $\psi_0$. Such questions about the solutions of (1) belong to what Simon [1] calls “second level foundation” problems of quantum mechanics. They are of particular practical interest for the ionization of atoms and/or dissociation of molecules, in the case when $H_0$ has both a discrete and a continuous spectrum corresponding respectively to spatially localized (bound) and scattering (free) states in $\mathbb{R}^d$. Starting at time zero with the system in a bound state and then “switching on” at $t = 0$ an external potential $H_1(t)$, we want to know the “probability of survival”, $P(t)$, of the bound states, at times $t > 0$: $P(t) = \sum_j |\langle \psi(t), u_j \rangle|^2$, where the sum is over all the bound states $u_j$ [2–6, 8, 9].


This problem has been investigated both analytically and numerically for the case $H_1(t) = \eta(t)V_1(x)$ with $\eta(t) = r\sin(\omega t + \theta)$ and $V_1$ a time independent potential, $x \in \mathbb{R}^d$. When $\omega$ is sufficiently large for “one photon” ionization to take place, i.e., when $\hbar\omega > -E_0$, $E_0$ the energy of the bound (e.g. ground) state of $H_0$, and $r$ is “small enough” for $H_1$ to be treated as a perturbation of $H_0$, then this is a problem discussed extensively in the literature ([8, 9]). Starting with the system in its ground state, the long time behavior of $P(t)$ is there asserted to be given by $P(t) \sim \exp[-\Gamma_F t]$. The rate constant $\Gamma_F$ is computed from first order perturbation theory according to Fermi’s golden rule. It is proportional to the square of the matrix element between the bound and free states, multiplied by the appropriate density of continuum states in the vicinity of the final state which will have energy $\hbar\omega - E_0$ [6, 8–10]. Going from perturbation theory to an exponential decay involves heuristics based on deep physical insights requiring assumptions which seem very hard to prove. It is therefore very gratifying that many features of this scenario have been recently made mathematically rigorous by Soffer and Weinstein [6] (their analysis was generalized by Soffer and Costin [7]). They considered the case when $H_0 = -\nabla^2 + V_0(x)$, $x \in \mathbb{R}^3$, $V_0$ compactly supported and such that there is exactly one bound state with energy $-\omega_0$ (from now on we use units in which $\hbar = 2m = 1$) and a continuum of quasi-energy states with energies $k^2$ for all $k \in \mathbb{R}^3$. The perturbing potential is $H_1(t) = r\cos(\omega t)V_1(x)$ with $V_1(x)$ also of compact support and satisfying some technical conditions. They then showed that for $\omega > \omega_0$ and $r$ small enough there is indeed an intermediate time regime where $P(t)$ has a dominant exponential form with the Fermi exponent $\Gamma_F$. This regime is followed for longer times by an inverse power law decay.
Some of these restrictions can presumably be relaxed but the requirement that r be small is crucial to their method which is essentially perturbative. The behavior of P (t) becomes much more difficult to analyze when the strength of H1 (t) is not small and perturbation theory is no longer a useful guide. This became clear in the seventies with the beautiful experiments by Bayfield and Koch, cf. [11] for a review, on the ionization of highly excited Rydberg (e.g. hydrogen atoms) by intense microwave electric fields. These experiments showed quite unexpected nonlinear behavior of P (t) as a function of the initial state, field strength E and the frequency ω. These results as well as other multiphoton ionizations of hydrogen atoms have been (and continue to be) analyzed by various authors using a variety of methods. Prominent among these are semi-classical phase-space analysis, numerical integration of the Schrödinger equation, Floquet theory, complex dilation, etc. While the results obtained so far are not rigorous, they do give physical insights and quite good agreement with experiments although many questions still remain open even on the physical level [11–15]. In addition to the above experiments on Rydberg atoms there are also many experiments which use strong laser fields to produce multiphoton (ω < −E0 ) ionization of multielectron atoms and/or dissociation of molecules [16, 17]. These systems are more complex than Rydberg atoms and their analysis is correspondingly less developed. One unexpected result of certain studies is that an increase in the intensity of the field may reduce the degree of ionization, i.e., P (t) can be non-monotone in the field strength E at large values of E. This phenomenon, which is often called “stabilization”, can be observed in some numerical simulations, analyzed rigorously in some models and is claimed to have been seen experimentally cf. [5] and [18–21]. 
It turns out that many features observed for Rydberg atoms and also stabilization are already present in a simple model system which we have recently begun to investigate analytically [22–24]. This somewhat surprising finding is based on comparisons between

Ionization of Simple Model


experimental and model results described in detail in [23]. In fact the phenomenon of ionization by periodic fields is very complex indeed once one goes beyond the perturbative regime even in the most simple model. This will become clear from the new results about this model presented here.

2. The Model

We consider a very simple quantum system where we can analyze rigorously many of the phenomena expected to occur in more realistic systems described by (1). This is a one dimensional system with an attractive delta function potential. The unperturbed Hamiltonian $H_0$ has, in suitable units, the form

$$H_0 = -\frac{d^2}{dx^2} - 2\,\delta(x), \qquad -\infty < x < \infty. \tag{2}$$

The zero range (delta-function) attractive potential is much used in the literature to model short range attractive potentials [25–28]. It belongs, in one dimension, to the class $K_1$ [2]. $H_0$ has a single bound state $u_b(x) = e^{-|x|}$ with energy $-\omega_0 = -1$. It also has continuous uniform spectrum on the positive real line, with generalized eigenfunctions

$$u(k, x) = \frac{1}{\sqrt{2\pi}}\left( e^{ikx} - \frac{1}{1 + i|k|}\, e^{i|kx|} \right), \qquad -\infty < k < \infty,$$

and energies $k^2$. Beginning at $t = 0$, we apply a parametric perturbing potential, i.e. for $t > 0$ we have

$$H(t) = H_0 - 2\,\eta(t)\,\delta(x) \tag{3}$$

and solve the time dependent Schrödinger equation (1) for $\psi(x, t)$, with $\psi(x, 0) = \psi_0(x)$. Expanding $\psi$ in eigenstates of $H_0$ we write

$$\psi(x, t) = \theta(t)\,u_b(x)\,e^{it} + \int_{-\infty}^{\infty} \Theta(k, t)\,u(k, x)\,e^{-ik^2 t}\,dk \qquad (t \geq 0) \tag{4}$$

with initial values $\theta(0) = \theta_0$, $\Theta(k, 0) = \Theta_0(k)$ suitably normalized,

$$\langle \psi_0, \psi_0 \rangle = |\theta_0|^2 + \int_{-\infty}^{\infty} |\Theta_0(k)|^2\,dk = 1. \tag{5}$$
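The statement that $u_b(x) = e^{-|x|}$ is a bound state of (2) with energy $-1$ can be verified in two lines. The check below is a sketch added for convenience and uses only the form (2) of $H_0$: away from the origin the delta term vanishes, and integrating $H_0 u = E u$ across $x = 0$ gives the jump condition $u'(0^+) - u'(0^-) = -2u(0)$.

```latex
\begin{aligned}
 -u_b''(x) &= -e^{-|x|} = -u_b(x) \qquad (x \neq 0),\\
 u_b'(0^+) - u_b'(0^-) &= -1 - 1 = -2 = -2\,u_b(0),\\
 \int_{-\infty}^{\infty} |u_b(x)|^2\,dx &= 2\int_0^{\infty} e^{-2x}\,dx = 1 .
\end{aligned}
```

So $H_0 u_b = -u_b$, in agreement with $-\omega_0 = -1$, and $u_b$ is already normalized in $L^2(\mathbb{R})$, which is consistent with the normalization (5).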

We then have that the survival probability of the bound state is $P(t) = |\theta(t)|^2$, while $|\Theta(k, t)|^2\,dk$ gives the “fraction of ejected particles” with (quasi-) momentum in the interval $dk$. This problem can be reduced to the solution of an integral equation in a single variable [22, 23]. Setting

$$Y(t) = \psi(x = 0, t)\,\eta(t)\,e^{it} \tag{6}$$


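we have

$$\theta(t) = \theta_0 + 2i\int_0^t Y(s)\,ds, \tag{7}$$

$$\Theta(k, t) = \Theta_0(k) + \frac{2|k|}{\sqrt{2\pi}\,(1 - i|k|)}\int_0^t Y(s)\,e^{i(1+k^2)s}\,ds. \tag{8}$$

$Y(t)$ satisfies the integral equation

$$Y(t) = \eta(t)\left\{ I(t) + \int_0^t [2i + M(t - t')]\,Y(t')\,dt' \right\} = \eta(t)\Big( I(t) + (2i + M) * Y \Big), \tag{9}$$

where the inhomogeneous term is

$$I(t) = \theta_0 + \frac{i}{\sqrt{2\pi}}\int_0^{\infty} \frac{\Theta_0(k) + \Theta_0(-k)}{1 + ik}\,e^{-i(k^2+1)t}\,dk,$$

and

$$M(s) = \frac{2i}{\pi}\int_0^{\infty} \frac{u^2\,e^{-is(1+u^2)}}{1 + u^2}\,du = \frac{1 + i}{2\sqrt{2\pi}}\int_s^{\infty} \frac{e^{-iu}}{u^{3/2}}\,du$$

with

$$f * g = \int_0^t f(s)\,g(t - s)\,ds.$$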

In our previous works we considered the case where $\Theta_0(k) = 0$ and $\eta(t)$ is a finite sum of harmonics with period $2\pi\omega^{-1}$. In particular, we showed in [23] how to compute the survival probability $P(t)$ as a function of the strength $r$ and frequency $\omega$ when $\eta(t) = r\sin\omega t$. Here we study the general periodic case and write

$$\eta = \sum_{j=0}^{\infty} \left( C_j e^{i\omega j t} + C_{-j} e^{-i\omega j t} \right).$$

Our assumptions on the $C_j$ are

(a) $0 \not\equiv \eta \in L^\infty(\mathbb{T})$,
(b) $C_0 = 0$,
(c) $C_{-j} = \overline{C_j}$.

Genericity condition (g). Consider the right shift operator $T$ on $l^2(\mathbb{N})$ given by

$$T(C_1, C_2, \ldots, C_n, \ldots) = (C_2, C_3, \ldots, C_{n+1}, \ldots).$$

We say that $C \in l^2(\mathbb{N})$ is generic with respect to $T$ if the Hilbert space generated by all the translates of $C$ contains the vector $e_1 = (1, 0, 0, \ldots)$ (which is the kernel of $T$):

$$e_1 \in \bigvee_{n=0}^{\infty} T^n C \tag{10}$$

(where the right side of (10) denotes the closure of the space generated by the $T^n C$ with $n \geq 0$). This condition is generically satisfied, and is obviously weaker than the


n “cyclicity” condition l2 (N) ∞ n=0 T C = {0}, which is also generic [29] (Appendix B discusses in more detail the rather subtle cyclicity condition). An important case, which satisfies (10), (but fails the cyclicity condition) corresponds to η being a trigonometric polynomial, namely C ≡ 0 but Cn = 0 for all large enough n. (We can in fact replace e1 in (10) by ek with any k ≥ 1.) A simple example which fails (10) is η(t) = 2rλ

λ − cos(ωt) 1 + λ2 − 2λ cos(ωt)

(11)

for some λ ∈ (0, 1), for which Cn = −rλn for n ≥ 1. In this case the space generated by T n C is one-dimensional. We will prove that there are values of r and λ for which the ionization is incomplete, i.e. θ(t) does not go to zero for large t. 3. Results and Remarks Theorem 1. Under assumptions (a) . . . (c) and (g), the survival probability P (t) of the bound state ub , |θ (t)|2 tends to zero as t → ∞. Theorem 2. For ψ0 (x) = ub (x) there exist values of λ, ω and r in (11), for which |θ (t)| → 0 as t → ∞. Remarks. 1. Theorem 1 can be extended to show that D |ψ(x, t)|2 dx → 0 for any compact interval D ⊂ R. This means that the initially localized particle really wanders off to infinity since by unitarity of the evolution R |ψ(x, t)|2 dx = 1. Theorem 2 can be extended to show that for some fixed r and ω in (11) there are infinitely many λ, accumulating at 1, for which θ(t) → 0. In these cases, it can also be shown that for large t, θ approaches a quasiperiodic function. 2. While Theorem 1 holds for arbitrary ψ0 , care has to be taken with the initial conditions for Theorem 2. In particular we cannot have an initial state such that in (9) I (t) = 0 for all t. This would occur, for example, if ψ0 (x) is an odd function of x. In that case the evolution takes place as if the particle was entirely free – never feeling the delta function potential. There may also be other special ψ0 for which θ0 = 0 but for which θ(t) → 0 as t → ∞. We have therefore stated Theorem 2 for the case ψ0 = ub . We shall also, for simplicity, use this choice of ψ0 in the proofs of Theorem 1. For this case, which is natural from the physical point of view, I (t) = 1 in (9). The extension to general ψ0 is immediate and is given at the end of Sect. 5. 3. In [23] we gave a detailed picture of how the decay of θ(t) depends on r and ω when η(t) = r sin(ωt), θ0 = 1. 
For small r and $\omega^{-1}$ not too close to an integer we get an exponential decay with a decay rate $\Gamma(r,\omega) \sim r^{2(1+\lfloor\omega^{-1}\rfloor)}$, where $\lfloor\omega^{-1}\rfloor$ is the integer part of $\omega^{-1}$. (For ω > 1, this corresponds to $\Gamma \sim \Gamma_F$.) At times large compared to $\Gamma^{-1}$, |θ(t)| decays as $t^{-3/2}$. The picture becomes much more complicated when r is large and/or $\omega^{-1}$ is an integer. In particular there is no monotonicity in |θ(t)| as a function of r. In [24] we proved complete ionization for the case where $C_n = 0$ for n > N, N ≥ 1.

4. We note here that Pillet [3] proved complete ionization for quite general H0 under the assumption that H1(t) is "very random", in fact a Markov process. Our results are not only consistent with this but support the expectation that generic perturbations will lead to complete ionization for general H0. This is what we expect from entropic considerations: there is just too much phase space "out there". The surprising thing is that even for our simple example one can readily find exceptions to the rule.


We should also mention here the work of Martin et al. [31, 32] who consider the case where H0 has an isolated eigenvalue E0 plus an absolutely continuous spectrum in the interval [0, Emax]. They show that if the frequency ω of the periodic, small, perturbation H1(t) is larger than E0 then the bound state is stable. This can be understood in terms of Fermi's golden rule by noting that the density of states at the energy E0 + ω > Emax is zero so that $\Gamma_F$ would be zero.

5. There is a direct connection between our results and Floquet theory where, for a time-periodic Hamiltonian H(t) with period T = 2π/ω, one constructs a quasienergy operator (QEO) [2, 33, 34]

$$K = -i\frac{\partial}{\partial\theta} + H(\theta).$$

K acts on functions of x and θ, periodic in θ, i.e. on the extended Hilbert space $\mathcal{H} \otimes L^2(S, T^{-1}d\theta)$. Let now φ(x, θ) be an eigenfunction satisfying

$$K\varphi = \mu\varphi, \qquad \varphi(x, \theta + T) = \varphi(x, \theta); \tag{12}$$

then

$$\psi(x, t) = e^{-i\mu t}\varphi(x, t)$$

is a solution of the Schrödinger equation $i\,\partial\psi/\partial t = H(t)\psi$. The existence of a real eigenvalue µ of the QEO with an associated $\varphi(x,\theta) \in L^2(\mathbb{R}^d \otimes S)$ is thus seen to imply the existence of a solution of the time-dependent Schrödinger equation which is, in absolute value, periodic. This shows that for appropriate initial conditions, the particle has a nonvanishing probability of staying in a compact domain and thus, for the case considered here, that ionization is incomplete. We also note that for each such µ there is actually a whole set $\mu_n = \mu + n\omega$ of eigenvalues of K. For the specific model considered here, (12) takes the form

$$K\varphi = -\frac{\partial^2\varphi(x,\theta)}{\partial x^2} - 2(1 + \eta(\theta))\delta(x)\varphi - i\frac{\partial\varphi}{\partial\theta} = \mu\varphi. \tag{13}$$

We can now look for solutions of (13) in the form

$$\varphi_\mu(x,\theta) = \sum_{n\in\mathbb{Z}} y_n e^{in\omega\theta} e^{\alpha_n x}$$

with $\alpha_n^{\pm} = \pm\sqrt{\mu - n\omega}$. Such a solution is in $L^2$ only if $\Re(\alpha_n x) < 0$, a condition which obviously selects different roots $\alpha_n^{\pm}$ depending on whether x > 0 or x < 0. The requirement that $\varphi_\mu$ be in $L^2(\mathbb{R})$ leads to a set of matching conditions which determine whether such eigenvalues µ can exist. It is easy to see that $\varphi_\mu$ has to be continuous at zero and satisfy the condition $\partial_x\varphi_\mu(0^-,\theta) - \partial_x\varphi_\mu(0^+,\theta) = 2(1 + \eta(\theta))\varphi_\mu(0,\theta)$. This implies, after taking the Fourier coefficients of both sides of the above equality, the recurrence relation

$$y_n\left(2 - \alpha_n^+ + \alpha_n^-\right) = 2\sum_{j\neq 0} C_j\, y_{n-j} \tag{14}$$


for which a (nontrivial) solution $y_n \in l_2$ is sought. This is effectively the same equation as (20) below which is at the core of our analysis. Complete ionization thus corresponds to the absence of a discrete spectrum of the QEO operator and conversely stabilization implies the existence of such a discrete spectrum. In fact, an extension of Theorem 2 shows that for the initial condition ψ0 = ub, $\psi_t$ approaches such a function with $\mu = -s_0$. More details about Floquet theory and stability can be found in [33, 34].

6. We are currently investigating extensions of our results to the case where $H_0 = -\nabla^2 + V_0(x)$, $x \in \mathbb{R}^d$, has a finite number of bound states and the perturbation is of the form $\eta(t)V_1(x)$ and both V0 and V1 have compact support. Preliminary results indicate that, with much labor, we shall be able to generalize Theorem 1 to generic V1(x). The definition of genericity will, however, depend strongly on V0. The physically important case of an external electric dipole field, $V_1(x) = -Ex$, can be transformed into the solution of a Schrödinger equation of the form $H(t) = -\nabla^2 + V_0(x - g(t))$, see [2]. This should, in principle, also be amenable to our methods but so far we have no results for that case.

Outline of the technical strategy. The method of proof relies on the properties of the Laplace transform of Y, $y(p) = \mathcal{L}Y(p) = \int_0^{\infty} e^{-pt}Y(t)\,dt$. Since the time evolution of ψ is unitary, |θ(t)| ≤ 1. This gives some a priori control on Y. For our purposes however it is useful to characterize directly the solution of the convolution equation (9). (We restrict ourselves to $\Theta_0(k) = 0$ and I(t) = 1 there.) We show that this equation has a unique solution in suitable norms. This solution is Laplace transformable and the Laplace transform y satisfies a linear functional equation.

The solution of the functional equation satisfied by the transform of Y is unique in the right half plane provided it satisfies the additional property that $y(p_0 + is)$ is square integrable in s for any p0 > 0. Any such solution y transforms back (by the standard properties of the inverse Laplace transform) into a solution of our integral equation with no faster than exponential growth; however there is a unique locally integrable solution of this equation, and this solution is exponentially bounded. This must thus be our Y. We can thus use the functional equation to determine the analytic properties of y(p). This is done using (appropriately refined versions of) the Fredholm alternative. After some transformations, the functional equation reduces to a linear inhomogeneous recurrence equation in $l_2$, involving a compact operator depending parametrically on p, see e.g. (17). The dependence is analytic except for a finite set of poles and square-root branch points on the imaginary axis and we show that the associated homogeneous equation has no nontrivial solution. We then show that the poles in the coefficients do not create poles of y, while the branch points are inherited by y. The decay of y(p) when $|\Im(p)| \to \infty$, and the degree of regularity on the imaginary axis give us the needed information about the decay of Y(t) for large t.

4. Behavior of y(p) in the Open Right Half Plane H

Lemma 3. (i) Equation (9) has a unique solution $Y \in L^1_{loc}(\mathbb{R}^+)$ and $|Y(t)| < Ke^{Bt}$ for some K, B ∈ R. (ii) The function $y(p) = \mathcal{L}Y$ exists and is analytic in $H_B = \{p : \Re(p) > B\}$. (iii) In $H_B$, the function y(p) satisfies the functional equation

$$y = \sum_{j=-\infty}^{\infty} C_j T^j (h + b\,y) \tag{15}$$


with

$$Tf(p) = f(p + i\omega), \qquad h(p) = -p^{-1} \qquad\text{and}\qquad b(p) = -\frac{i}{p}\left(1 + \sqrt{1 - ip}\right).$$

The branch of the square root is such that for $p \in H = \{p : \Re(p) > 0\}$, the real part of $\sqrt{1 - ip}$ is nonnegative and the imaginary part nonpositive.

The straightforward proofs of this lemma are done in Appendix A. (Some of the results can also be obtained directly from standard results on Schrödinger operators and on integral equations.)

Remark 4. It is clear that the functional equation (15) only links points on the one-dimensional lattice $\{p + i\mathbb{Z}\omega\}$. It is convenient to take p0 such that

$$p = p_0 + in\omega \quad\text{with}\quad \Re(p_0) = \Re(p) \ \text{and}\ \Im(p_0) \in [0, \omega). \tag{16}$$

The functions y, h, b in (15) will now depend parametrically on p0. We set $y = \{y_j\}_{j\in\mathbb{Z}}$, $h = \{h_j\}_{j\in\mathbb{Z}}$, $b = \{b_j\}_{j\in\mathbb{Z}}$ with $y_n = y(p_0 + in\omega) = y(p)$ (and similarly for h(p) and b(p)). It is convenient to define the operator $(\hat{H}y)_n = b_n y_n$. Let $(Ty)_n = y_{n+1}$ be the right shift on $l_2(\mathbb{Z})$ (which we denote for simplicity by $l_2$) and rewrite (15) as

$$y = \sum_{j=-\infty}^{\infty} C_j T^j h + \sum_{j=-\infty}^{\infty} C_j T^j \hat{H} y \equiv f + Jy. \tag{17}$$

Proposition 5. For $\Re(p_0) > 0$ there exists a unique solution of (17) in $l_2$. This solution is analytic in p0, $\Re(p_0) > 0$. Thus y(p) is analytic in p ∈ H and inverse Laplace transformable there with $\mathcal{L}^{-1}(y) = Y$.

Proof. The proof uses the Fredholm alternative. We first prove the following results.

Lemma 6. The operator J is compact on $l_2$ if $p_0 \neq 0$.

Proof. The proof uses standard compact operator results, see e.g. [30]. First note that the operator $\hat{H}$ is compact. This is straightforward: since $b_j \to 0$ as $|j| \to \infty$, it follows that $\hat{H}$ is the norm limit as N → ∞ of the finite rank operators defined by $(\hat{H}_N y)_j = b_j y_j$ for |j| ≤ N and $(\hat{H}_N y)_j = 0$ otherwise, and thus is compact. The operator J is the composition of the "convolution" operator C given by $(Cv)_n := (C * v)_n := \sum_{j\in\mathbb{Z}} C_j v_{n+j}$, which is continuous on $l_2$, and the compact operator $\hat{H}$. Thus J is compact. □

Remarks. 1. Note that $f \in l_2$ if $p_0 \neq 0$ (a straightforward consequence of the fact that C and h in (17) are in $l_2$).

2. The operator J is analytic in p0, except for p0 = 0, where the coefficients have poles, and for an additional value on the imaginary axis (possibly also 0), where the coefficients have square root branch points.
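The compactness argument for the diagonal operator in Lemma 6 can be illustrated numerically. The following is a minimal sketch, not part of the original proof: the values of p0 and ω are illustrative assumptions, the lattice is cut off at a finite `j_max`, and NumPy's principal square root is assumed to realize the branch described after (15) (nonnegative real part, nonpositive imaginary part when ℜ(p) > 0), which the code also checks. For a diagonal operator the supremum of |b_j| over |j| > N is the operator-norm distance to the truncation with |j| ≤ N, so its decay in N reflects the norm convergence used in the proof.

```python
import numpy as np

def b(p):
    """b(p) = -(i/p) * (1 + sqrt(1 - i*p)), using the principal square root."""
    p = np.asarray(p, dtype=complex)
    return -(1j / p) * (1.0 + np.sqrt(1.0 - 1j * p))

def lattice(p0, omega, cutoff, j_max):
    """Lattice points p0 + i*j*omega with cutoff < |j| <= j_max."""
    j = np.arange(cutoff + 1, j_max + 1)
    return p0 + 1j * omega * np.concatenate([j, -j])

def tail_sup(p0, omega, cutoff, j_max=200_000):
    """Largest |b_j| over cutoff < |j| <= j_max (finite stand-in for the sup)."""
    return float(np.max(np.abs(b(lattice(p0, omega, cutoff, j_max)))))

# Illustrative parameters (assumptions): Re(p0) > 0 and Im(p0) in [0, omega).
p0, omega = 0.3 + 0.2j, 1.1

# Check the branch: real part >= 0 and imaginary part <= 0 on the lattice.
roots = np.sqrt(1.0 - 1j * lattice(p0, omega, -1, 1000))
branch_ok = bool(np.all(roots.real >= 0.0) and np.all(roots.imag <= 0.0))

# Truncation errors for increasing cutoffs N.
tails = [tail_sup(p0, omega, n) for n in (10, 100, 1000, 10000)]
print(branch_ok, tails)
```

The printed tail values shrink roughly like $(N\omega)^{-1/2}$, in line with the behavior of $b_j$ for large |j|.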


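Remark 7. Setting, for $p_0 \neq 0$,

$$y_l = \left(\sqrt{1 - i(p_0 + il\omega)} - 1\right) z_l, \tag{18}$$

the homogeneous equation

$$y = Jy \tag{19}$$

clearly has a (nontrivial) $l_2$ solution y only if

$$\left(\sqrt{1 - ip_0 + l\omega} - 1\right) z_l = -\sum_{k=1}^{\infty}\left(C_k z_{l+k} + \overline{C_k}\, z_{l-k}\right) \tag{20}$$

has a (nontrivial) $l_2$ solution z with

$$\left\{\left(\sqrt{1 - ip_0 + j\omega} - 1\right) z_j\right\}_{j\in\mathbb{Z}} \in l_2. \tag{21}$$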

Lemma 8. For any η under assumptions (a) to (c), if p0 ∈ H there is no nonzero $l_2$ solution of (20) such that (21) holds.

Proof. To get a contradiction, assume $z \in l_2$, $z \not\equiv 0$, satisfying (21), is a solution of (20). Multiplying (20) by $\overline{z_l}$, and summing with respect to l from −∞ to +∞ we get

$$\sum_{l=-\infty}^{\infty}\left(\sqrt{1 - ip_0 + l\omega} - 1\right)|z_l|^2 = -\sum_{l=-\infty}^{\infty}\sum_{k=1}^{\infty}\left(C_k z_{l+k}\overline{z_l} + \overline{C_k}\, z_{l-k}\overline{z_l}\right) = -\sum_{l=-\infty}^{\infty}\sum_{k=1}^{\infty}\left(C_k z_l\overline{z_{l-k}} + \overline{C_k}\, z_{l-k}\overline{z_l}\right) = -\sum_{l=-\infty}^{\infty}\sum_{k=1}^{\infty} 2\,\Re\!\left(C_k z_l \overline{z_{l-k}}\right). \tag{22}$$

If p0 ∈ H the imaginary part of $\sqrt{1 - ip_0 + l\omega}$ is negative (see Remark 24) and thus, if some $z_l$ is nonzero then the left side of (22) has strictly negative imaginary part, which is impossible since the right side is real. □

Proof of Proposition 5. The existence of the analytic solution follows now immediately from the analytic Fredholm alternative and the analyticity of the coefficients, for p0 ∈ H. The fact that $\{y_n\} \in l_2$ together with the stated analyticity imply that the function $\mathcal{L}^{-1}y(p)$ exists and satisfies the integral equation of Y, and thus coincides with Y. □

5. Behavior of y(p) in the Neighborhood of ℜ(p) = 0 in the Generic Case

Discussion of methods. We start again from relation (17). This has the form

$$y_n = i\sum_j \frac{C_j}{-ip_0 + (n + j)\omega} - \sum_j C_j\, q_{n+j}\, y_{n+j}, \qquad C_0 = 0, \tag{23}$$


where

$$q_n = \frac{1 + \sqrt{1 - ip_0 + n\omega}}{-ip_0 + n\omega}. \tag{24}$$

As the imaginary axis ℜ(p0) = 0 is approached, two types of potential singularities in the coefficients need attention: the poles in the coefficients due to the presence of $p^{-1}$, and the square root singularities. It will turn out that by cancellation effects, the poles play no role, generically. The square root singularities will be manifested in the solution y. The study of these questions requires further regularization of the functional Eq. (23). It is convenient to separate out the terms in (23) which are singular at p0 = 0. Using (from now on) the notation $s_0 = -ip_0$ we have

$$y_n = i\frac{C_{-n}}{s_0} - \frac{C_{-n}}{s_0}\left(1 + \sqrt{1 + s_0}\right) y_0 + i\sum_{j\neq -n}\frac{C_j}{s_0 + (n + j)\omega} - \sum_{j\neq -n} C_j\, q_{n+j}\, y_{n+j}, \quad n \neq 0,$$

$$y_0 = i\sum_{j\neq 0}\frac{C_j}{s_0 + j\omega} - \sum_{j\neq 0} C_j\, q_j\, y_j. \tag{25}$$

We break up the proof into two parts, the non-resonant and resonant case. We start with the former.

5.1. The non-resonant case, $\omega^{-1} \notin \mathbb{N}$.

Proposition 9. If condition (g) is satisfied, and $\omega^{-1} \notin \mathbb{N}$, then the solution y of (25) is analytic in a small neighborhood of s0 = 0.

For the proof we write $y_0 = i/2 + s_0 u_0$, and for $n \neq 0$ we make the substitution $y_n = v_n + d_n u_0$, where we will choose $d_n$ according to (26) in order to eliminate u0 from all equations with $n \neq 0$.

Lemma 10. (i) For s0 ∈ R there exists a unique solution $d \in l_2(\mathbb{Z}\setminus\{0\})$ of the system

$$d_n = -C_{-n}\left(1 + \sqrt{1 + s_0}\right) - \sum_{k\neq 0} C_{k-n}\, q_k\, d_k, \quad n \neq 0. \tag{26}$$

This solution is analytic at s0 = 0.

(ii) With this choice of d, the system (25) becomes

$$v_n = f_n - \sum_{k\neq 0} C_{k-n}\, q_k\, v_k, \qquad \Big(s_0 + \sum_{j\neq 0} C_j q_j d_j\Big) u_0 = f_0 - \sum_{j\neq 0} C_j\, q_j\, v_j, \tag{27}$$

where

$$f_0 = -\frac{i}{2} + i\sum_{j\neq 0}\frac{C_j}{s_0 + j\omega}, \qquad f_n = iC_{-n}\frac{1 - \sqrt{1 + s_0}}{2s_0} + i\sum_{k\neq 0}\frac{C_{k-n}}{s_0 + k\omega}. \tag{28}$$


(iii) For small s0 we have $\sum_{j\neq 0} C_j q_j d_j \neq 0$, and the system (27) has a unique solution with $v \in l_2(\mathbb{Z}\setminus\{0\})$, and $v_n$, $u_0$ are analytic at s0 = 0.

Proof. (i) Equation (26) is of the form (I − J)d = c in $l_2(\mathbb{Z}\setminus\{0\})$, where $c_n = -(1 + \sqrt{1 + s_0})\,\overline{C_n}$ and

$$(Jd)_n = -\sum_{k\neq 0} C_{k-n}\, q_k\, d_k, \quad (n \neq 0).$$

We show first that Ker(I − J) = {0}. Indeed, assume d = Jd and set $D_k = q_k d_k$. Then we see that

$$q_n^{-1} D_n + \sum_{k\neq 0} C_{k-n} D_k = 0 \tag{29}$$

and, by multiplying with $\overline{D_n}$ and summing over n we get

$$\sum_{n\neq 0} q_n^{-1}|D_n|^2 + \sum_{n,k\neq 0} C_{k-n} D_k \overline{D_n} = 0. \tag{30}$$

Note that, because $C_{-n} = \overline{C_n}$, the following quantity is real:

$$\overline{\sum_{n,k\neq 0} C_{k-n} D_k \overline{D_n}} = \sum_{n,k\neq 0} C_{n-k}\overline{D_k} D_n = \sum_{n,k\neq 0} C_{k-n} D_k \overline{D_n}, \tag{31}$$

implying that

$$\sum_{n\neq 0} q_n^{-1}|D_n|^2 \in \mathbb{R}$$

with (cf. (24))

$$q_n^{-1} = -1 + \sqrt{1 + s_0 + n\omega}.$$

Let $N_0 = -(1 + s_0)\omega^{-1} \in \mathbb{R}$. Obviously $q_n^{-1} \in \mathbb{R}$ for n ≥ N0 while for n < N0 we have, by Remark 24,

$$\Im(q_n^{-1}) < 0.$$

Thus it is necessary that $D_n = 0$ for all n < N0. Assume $D \neq 0$. Let N ∈ N be such that $D_n = 0$ for all n < N and $D_N \neq 0$ (thus N0 ≤ N). Then from (29),

$$\sum_{k\geq N;\, k\neq 0} C_{k-n} D_k = 0 \quad\text{for any } n < N$$

or, setting k = N − 1 + j,

$$\sum_{j\geq 1,\ j\neq 1-N} C_{j+n} D_{N-1+j} = 0 \quad\text{for } n \geq 0. \tag{32}$$

It is here that we use the genericity condition on C. In fact we will show that (32) implies D = 0 if condition (g) is satisfied. To see this define $\tilde{D} \in l_2(\mathbb{N})$ as $\tilde{D}_j = D_{N-1+j}$ if j ≥ 1, $j \neq 1 - N$ and, if 1 − N ≥ 1, $\tilde{D}_{1-N} = 0$. Then by (32) $\tilde{D}$ is orthogonal in


˜ e1 >= DN = 0, l2 (N) to all T n C, n ≥ 0. By the genericity condition (g) then < D, which is a contradiction. Thus D = 0. Since J is analytic in s0 for small enough s0 , and compact by the same simple arguments as in Lemma 6, it follows that (I − J )−1 exists and is analytic in s0 at s0 = 0. (ii) This part is an immediate calculation. (iii) Note first that f ∈ l2 (Z \ {0}), because 1/2 √ Ck−n 2 1 − 1 + s0 'c' + 'f ' ≤ 2s s + kω 0

≤ 'c'

k=0

n=0 k=0 0

1 < ∞. |s0 + kω|2

Also, formula (28) expresses f in terms of a discrete measure integral with respect to k of a function which depends analytically on the (small) parameter s0, and which is uniformly in $l_1$. Therefore f depends analytically on s0. The rest of the proof of (iii) closely follows that of part (i), using the following result.

Lemma 11. For s0 = 0 we have $\sum_{j\neq 0} C_j q_j d_j \neq 0$.

Proof. Assume the contrary was true. At s0 = 0, with $D_n^0 = D_n|_{s_0=0}$ and $q_n^0 = q_n|_{s_0=0}$, relation (29), using (26), gives

$$\frac{D_n^0}{q_n^0} = -\sum_{k\neq 0} C_{k-n} D_k^0 - 2C_{-n} \quad (n \neq 0). \tag{33}$$

Multiplying with $\overline{D_n^0}$ and summing over $n \neq 0$ we would get

$$\sum_{n\neq 0}\left(-1 + \sqrt{1 + n\omega}\right)|D_n^0|^2 = -\sum_{k,n\neq 0} C_{k-n} D_k^0 \overline{D_n^0} - 2\sum_{n\neq 0} C_{-n}\overline{D_n^0}, \tag{34}$$

and since we assumed $\sum_n C_n D_n^0 = 0$ then, as in the proof of Lemma 10 (i), it follows that $D_n^0 = 0$ for all $n < N_0 = -\omega^{-1}$. This gives, using (33), that

$$\sum_{k\geq N_0;\, k\neq 0} C_{k-n} D_k^0 + 2C_{-n} = 0. \tag{35}$$

Denote by $D^1 \in l_2$ the sequence $D_k^1 = D_k^0$ if $k \neq 0$ and $D_0^1 = 2$. As in the proof of Lemma 10 (i), using the genericity condition (g), we get $D^1 = 0$, an obvious contradiction. □

This concludes the proof of Proposition 9: for generic η the solution y of (17) has, for $\omega^{-1} \notin \mathbb{N}$, analytic components $y_n$ when p = 0.

Square root singularities. We now study the behavior at the square root singularities of the coefficients of the equation of y. Let k0 be the unique integer such that for some $s_r \in [0, \omega)$ we have $1 + s_r + k_0\omega = 0$ (then $s_r$ is a branch point in the coefficient q). The following proposition describes the analytic structure of y(p) near the imaginary axis.


Proposition 12. We have the decomposition $y_n = u_n + \sqrt{s_0 - s_r}\, v_n$, where $u_n$ and $v_n$ are analytic in s0 in a complex neighborhood of the segment [0, ω).

Proof. The substitution $y_n = u_n + \sqrt{s_0 - s_r}\, v_n$, and

$$U_k = q_k u_k; \quad V_k = q_k v_k \quad (k \neq k_0)$$

and

$$U_{k_0} = \frac{u_{k_0}}{s_0 + k_0\omega}; \quad V_{k_0} = \frac{v_{k_0}}{s_0 + k_0\omega}$$

leads to the following system of equations for Un and Vn : Ck−n Ck−n Uk − Ck0 −n (s0 − sr )Vk0 (n = k0 ), − s0 + kω k k Ck−n Vk − Ck0 −n (s0 − sr )Vk0 − Ck0 −n Uk0 (n = k0 ), (36) qn−1 Vn = −

qn−1 Un = ri

k

(s0 + k0 ω)Uk0 (s0 + k0 ω)Vk0

Ck−k 0 =i , s0 + kω k =− Ck−k0 Vk . k

We now let $Q_{k_0} = s_0 + k_0\omega$ and, for $n \neq k_0$, $Q_n = q_n^{-1} = -1 + \sqrt{1 + s_0 + n\omega}$. We use again the Fredholm alternative and, as in the previous proofs, we need only to show the absence of a solution of the homogeneous equation at $s_0 = s_r$. We thus multiply the homogeneous equations associated to (36) in the following manner: the equation for $U_j$ by $\overline{U_j}$ and the equation for $V_j$ by $\overline{V_j}$, then sum over all j. As in the previous proofs, from the reality of the r.h.s. and then from the genericity condition (g) we get U ≡ 0. Then, similarly, V ≡ 0. The rest is immediate. □

5.2. The resonant case: $\omega^{-1} = M \in \mathbb{N}$. In this case when s0 = 0 there are poles in the coefficients of (23) when n + j = 0 and branch points when n + j = −M. The proof is a combination of the two regularization techniques used in the previous case.

Proposition 13. We can set $y(s_0) = A(s_0) + B(s_0)\sqrt{s_0}$ with A and B analytic in a complex neighborhood of the segment [0, ω).

Proof. Special care is only needed near s0 = 0. The system (26)–(28) now reads

$$d_n = -C_{-n}\left(1 + \sqrt{1 + s_0}\right) - \sum_{k\notin\{0,-M\}} C_{k-n}\, q_k\, d_k - C_{-M-n}\frac{1 + \sqrt{s_0}}{s_0 - 1}\, d_{-M},$$

$$v_n = f_n - \sum_{k\notin\{0,-M\}} C_{k-n}\, q_k\, v_k - C_{-M-n}\frac{1 + \sqrt{s_0}}{s_0 - 1}\, v_{-M}. \tag{37}$$

We take $d_n = \alpha_n + \sqrt{s_0}\,\beta_n$ and $v_n = \gamma_n + \sqrt{s_0}\,\delta_n$. The system becomes

$$\alpha_n = -C_{-n}\left(1 + \sqrt{1 + s_0}\right) - \sum_{k\notin\{0,-M\}} C_{k-n}\, q_k\, \alpha_k - C_{-M-n}\frac{1}{s_0 - 1}\left(\alpha_{-M} + s_0\beta_{-M}\right),$$

$$\beta_n = -\sum_{k\notin\{0,-M\}} C_{k-n}\, q_k\, \beta_k - C_{-M-n}\frac{1}{s_0 - 1}\left(\alpha_{-M} + \beta_{-M}\right), \tag{38}$$

$$\gamma_n = f_n - \sum_{k\notin\{0,-M\}} C_{k-n}\, q_k\, \gamma_k - C_{-M-n}\frac{1}{s_0 - 1}\left(\gamma_{-M} + s_0\delta_{-M}\right),$$

$$\delta_n = -\sum_{k\notin\{0,-M\}} C_{k-n}\, q_k\, \delta_k - C_{-M-n}\frac{1}{s_0 - 1}\left(\delta_{-M} + \gamma_{-M}\right). \tag{39}$$

The system (38) is of the form

$$\begin{pmatrix}\alpha\\ \beta\end{pmatrix} = S(s_0)\begin{pmatrix}\alpha\\ \beta\end{pmatrix} + \begin{pmatrix}F_1\\ F_2\end{pmatrix},$$

where α, β, F1, F2 are in $l_2$. We prove that the homogeneous equation has no nontrivial solutions:

Lemma 14. $(I - S(0))\begin{pmatrix}\alpha\\ \beta\end{pmatrix} = 0$ implies $\begin{pmatrix}\alpha\\ \beta\end{pmatrix} = 0$.

Proof. Let $Q_n = q_n$, $A_n = q_n\alpha_n$, $B_n = q_n\beta_n$ for $n \neq 0, -M$ and $Q_{-M} = -1$, $A_{-M} = -\alpha_{-M}$ and $B_{-M} = -\beta_{-M}$. The system (38) becomes

$$Q_n^{-1}A_n = -\sum_{k\neq 0} C_{k-n}A_k, \qquad Q_n^{-1}B_n = -\sum_{k\neq 0} C_{k-n}B_k - C_{-M-n}A_{-M}. \tag{40}$$

As in the proofs in Case I, multiplying the first equation by $\overline{A_n}$ and summing over n, we first get from the reality of the r.h.s. that $A_n = 0$ for n < −M and then by the condition (g) we get that A ≡ 0. The conclusion B ≡ 0 now follows in the same way. □

End of proof of Proposition 13. The operator S is compact on $l_2 \oplus l_2$ and S and (F1, F2) are analytic in a complex neighborhood of 0. We saw in Lemma 14 that the kernel of I − S(0) is trivial and by the analytic Fredholm alternative it follows that $(I - S(s_0))^{-1}$ exists and is analytic in a small neighborhood of s0 = 0. Hence (α, β) are analytic. Similarly, γ, δ are analytic in the same region. □

5.3. Proof of Theorem 1. Combining the above results we have the following conclusion:

Proposition 15. If condition (g) is fulfilled, then y(p) is analytic in a neighborhood of $i\mathbb{R}\setminus\{is_r + i\omega\mathbb{Z}\}$. For any j ∈ Z, in a neighborhood of $p = is_r + ij\omega$ ($s_r \in \mathbb{R}$), y has the form $y(p) = A_j(p) + B_j(p)\sqrt{-ip - s_r - j\omega}$, where $A_j$ and $B_j$ are analytic. In particular, y is Lipschitz continuous of exponent 1/2 in the closed right half plane. Thus $Y(t) = O(t^{-3/2})$ for large t.

Proof. All but the last claim has already been shown. The last statement is a standard Tauberian theorem (note that $\mathcal{L}^{-1}$ is the Fourier transform along the imaginary line). □

Proposition 16. We have θ(t) → 0 as t → ∞.

Proof. We can write (9) (with I(t) = 1) as

$$Y = \eta\,(\theta + M * Y). \tag{41}$$

It is easy to check, in view of the fact that M and Y are $O(t^{-3/2})$, that M ∗ Y → 0. Furthermore $1 + 2i\int_0^t Y(s)\,ds$ is convergent as t → ∞. Thus θ(t) → const as t → ∞. Since now the l.h.s. of (9) converges to zero and η does not, the equality (41) is only consistent if θ(t) → 0. □

This completes the proof of Theorem 1 for the case $\psi_0 = u_b = e^{-|x|}$. The general case follows by noting that the inhomogeneous term does not affect the main argument, using the Fredholm alternative. Hence we will still have |θ(t)| → 0 but the rate of decay may be different.

6. A Nongeneric Example

Let η be given by (11), for which

$$C_n = -r\lambda^n \ \text{ for } n \geq 1, \qquad C_n = C_{-n}. \tag{42}$$

As in Sect. 5 set $-ip_0 = s_0$ and let $q_n$ be given by (24). Denote

$$a_n = a_n(s_0) = \frac{1}{r}\,\frac{1}{q_n} = \frac{1}{r}\left(\sqrt{1 + s_0 + n\omega} - 1\right). \tag{43}$$

For r ∈ (0, 1), ω > 1, $\omega^{-1} \notin \mathbb{N}$ such that $(1 - r)^2 < \omega - 1$, let $s_r$ and $s_p$ be the unique numbers in (0, ω) so that $1 + s_r \in \omega\mathbb{Z}$ and $1 + a_{-1}(s_p) = 0$. We choose r, ω such that $s_r \neq s_p$.

6.1. The homogeneous equation.

Lemma 17. Let $s_{0,0}$ be a point in $(0, s_r) \cup (s_r, \omega)$. Consider s0 in a small enough neighborhood of $s_{0,0}$. The linear operator J = J(s0) of (17) depends analytically on s0, and is compact on $l_2$. For $s_0 \neq s_p$, $(I - J(s_0))^{-1}$ exists and is analytic.

Lemma 18. Denote for short $J_0 = J(s_r)$. There exists a value $\lambda = \lambda_s \in (0, 1)$ such that

$$\dim \mathrm{Ker}\,(I - J_0) = 1. \tag{44}$$

Denote by A the diagonal (unbounded) operator $(Az)_n = a_n z_n$ in $l_2$; $A^{-1}$ is bounded.

Lemma 19. For $\lambda = \lambda_s$ as in Lemma 18 we have

$$\mathrm{Ker}\,(I - J_0) = A\,\mathrm{Ker}\left(I - J_0^*\right). \tag{45}$$


6.2. Proof of Lemma 17. The operator J is compact by Lemma 6. To show that I − J is invertible we prove this for any points s0 ∈ (0, ω), $s_0 \neq s_p, s_r$; by the analytic Fredholm theorem it will follow that I − J is invertible in a small enough neighborhood of any such point, thus proving the lemma. Let s0 ∈ (0, ω), $s_0 \neq s_p, s_r$. As in Remark 7 in Sect. 5, the substitution $y_n = a_n z_n$ (n ∈ Z) transforms the homogeneous equation (19) to

$$a_n z_n = \sum_{j=1}^{\infty}\lambda^j\left(z_{n+j} + z_{n-j}\right), \quad n \in \mathbb{Z}. \tag{46}$$

Note that $\Im a_n < 0$ for n < −1 for s0 ∈ [ω − 1, ω) and $\Im a_n < 0$ for n < 0 for s0 ∈ (0, ω − 1). We will discuss only the first case, s0 ≥ ω − 1, since the second one is completely analogous. As in the proof of Lemma 8, it follows that

$$z_n = 0 \quad\text{for } n < -1. \tag{47}$$

Then Eqs. (46) for n < −1 become

$$\sum_{k=1}^{\infty}\lambda^k z_{k-2} = 0. \tag{48}$$

For n = −1 (46) gives

$$(a_{-1} + 1)\,z_{-1} = 0, \tag{49}$$

and for n ≥ 0, using (48), we get

$$(1 + a_n)\,z_n = \sum_{j=1}^{n+1}\left(\lambda^j - \lambda^{-j}\right) z_{n-j}, \quad n \geq -1. \tag{50}$$

Since $s_0 \neq s_p$, (49) gives $z_{-1} = 0$, and it follows by induction, from (50), that $z_n = 0$ for all n. By the Fredholm alternative theorem then I − J(s0) is invertible.

6.3. Proof of Lemma 18. In what follows s0 = sr.

6.3.1. An auxiliary lemma. We show that if $z \in l_2$ then Eq. (48) is redundant.

Lemma 20. If z is an $l_2$ solution of (50) with $z_n = 0$ for n < −1 then z satisfies (48).

Proof. Let $z \in l_2$ be a solution of (50). Then

$$Z^{[n+1]} \equiv \sum_{k=1}^{n+1}\lambda^k z_{k-2} \tag{51}$$

is the truncation of a convergent series, since there is a constant M with $|z_n| < M$ for all n. Note that

$$(1 + a_n)\,z_n = \sum_{j=1}^{n+1}\lambda^j z_{n-j} - \lambda^{-n-2} Z^{[n+1]},$$

hence

$$Z^{[n+1]} = \lambda^{n+2}\sum_{j=1}^{n+1}\lambda^j z_{n-j} - \lambda^{n+2}(1 + a_n)\,z_n,$$

so that

$$\left|Z^{[n+1]}\right| \leq \lambda^{n+2}\frac{M\lambda}{1 - \lambda} + \lambda^{n+2} M\left(1 + \mathrm{const}\,\sqrt{n}\right) \longrightarrow 0 \quad\text{as } n \to \infty. \tag{52}$$

Since (51) are truncations of the series in the LHS of (48), then (52) implies (48). □

6.3.2. Behavior of the general solution of (50). A direct calculation shows that the sequence $z_n$ satisfying the infinite order recurrence (50) and the initial condition $z_{-1} = 1$ satisfies, in fact, the three step recurrence

$$(1 + a_{n+1})\,z_{n+1} + (1 + a_{n-1})\,z_{n-1} = \left[\lambda(1 + a_n) + \lambda + a_n\lambda^{-1}\right] z_n \quad (n \geq 0) \tag{53}$$

with the initial condition

$$z_{-1} = 1, \qquad z_0 = \frac{\lambda - \lambda^{-1}}{1 + a_0}. \tag{54}$$
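The "direct calculation" relating (50) to (53) can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the parameter values are arbitrary (not those of the paper's example), $a_n$ is taken from (43) for n ≥ 0, and the check is restricted to n ≥ 1 so that the value of $a_{-1}$, which the case n = 0 would involve, is never needed. It generates $z_n$ from the infinite order recurrence (50) with $z_{-1} = 1$ and evaluates the scaled residual of the three step recurrence (53).

```python
import math

def a(n, s0, omega, r):
    """a_n of (43): (sqrt(1 + s0 + n*omega) - 1) / r, real for n >= 0 and s0 > 0."""
    return (math.sqrt(1.0 + s0 + n * omega) - 1.0) / r

def solve_infinite_order(n_max, lam, s0, omega, r):
    """Return {n: z_n} for -1 <= n <= n_max from (50), starting at z_{-1} = 1."""
    z = {-1: 1.0}
    for n in range(n_max + 1):
        total = sum((lam**j - lam**(-j)) * z[n - j] for j in range(1, n + 2))
        z[n] = total / (1.0 + a(n, s0, omega, r))
    return z

def three_term_residuals(z, lam, s0, omega, r):
    """Scaled residuals of (53) for 1 <= n <= n_max - 1."""
    n_max = max(z)
    residuals = []
    for n in range(1, n_max):
        an = a(n, s0, omega, r)
        left_a = (1.0 + a(n + 1, s0, omega, r)) * z[n + 1]
        left_b = (1.0 + a(n - 1, s0, omega, r)) * z[n - 1]
        right = (lam * (1.0 + an) + lam + an / lam) * z[n]
        scale = abs(left_a) + abs(left_b) + abs(right)
        residuals.append(abs(left_a + left_b - right) / scale)
    return residuals

lam, s0, omega, r = 0.6, 0.4, 1.1, 0.45   # illustrative parameter values
z = solve_infinite_order(25, lam, s0, omega, r)
residuals = three_term_residuals(z, lam, s0, omega, r)
print(max(residuals))
```

The residuals are at rounding level, consistent with (53) being an algebraic consequence of (50) at three consecutive indices.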

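Denote

$$z_n = \frac{\lambda - \lambda^{-1}}{1 + a_n}\,V_{n-1}, \tag{55}$$

then (53) becomes

$$V_n + V_{n-2} = \left[\lambda + \frac{\lambda^2 + a_n}{\lambda(1 + a_n)}\right] V_{n-1}, \quad n \geq 1. \tag{56}$$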

We are looking for l2 solutions. Recent rigorous WKB estimates (see e.g.√[35]) would imply there are solutions of the discrete equation (56) behaving like λ−n e− n/ω and like √ n n/ω . We will prove this in our context and find special values of λ for which the λ e solution decaying for large n satisfies the initial condition. We will show that this solution is obtained by taking Vn−2 = gn−1 Vn−1

(57)

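in (56) and iterating:

$$g_{n-1} = G_n - \frac{1}{g_n} \quad\text{with}\quad G_n = \lambda + \frac{\lambda^2 + a_n}{\lambda(1 + a_n)}, \tag{58}$$

i.e., g0 is given by the continued fraction

$$g_0 = G_1 - \cfrac{1}{G_2 - \cfrac{1}{G_3 - \cdots}},$$

which needs to match the initial condition (see (54)):

$$g_0 = g_0(\lambda) = \frac{1}{\lambda + \lambda^{-1} + (1 + a_0)^{-1}(\lambda - \lambda^{-1})}. \tag{59}$$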


Lemma 21. (i) Let λ ∈ (0, 1). The recurrence (58) has a solution such that $g_n \to \lambda^{-1}$ as n → ∞.
(ii) g0 is meromorphic in λ on [0, 1) and has poles.
(iii) There exists $\lambda_s \in (0, 1)$ such that $g_0(\lambda_s)$ satisfies (59).
(iv) Let $\lambda = \lambda_s$. To the solution of (i) there corresponds a solution $V^{[s]}$ of the recurrence (56) such that $V_n^{[s]} \sim \lambda_s^{n+o(n)}$ as n → ∞. The corresponding solution $z^{[s]}$ of (50) satisfies $z_n \to 0$ as n → ∞.
(v) Let $\lambda = \lambda_s$. There exists a solution of (56) with the asymptotic behavior $V_n^{[l]} \sim \lambda_s^{-n+o(n)}$.

Thus, for $\lambda = \lambda_s$, there exists a unique (up to a multiplicative constant) "small" solution of (56), with the behavior $V_n^{[s]} \sim \lambda_s^{n+o(n)}$ for large n, while the general solution behaves like $V_n \sim \lambda_s^{-n+o(n)}$. As a consequence, a similar statement holds for the recurrence (53).

Remark. The proof of (iii) can be refined to show that, in fact, there is a countable set of points $\lambda_s$ for which g0 satisfies the initial condition, and that these values accumulate to 1.

Proof. (i) With the substitution

$$g_n = G_{n+1} - \lambda + \delta_n, \tag{60}$$

the recurrence (58) becomes

$$\delta_n = \lambda - \frac{1}{G_{n+2} - \lambda + \delta_{n+1}} \equiv (S\delta)_n, \quad n \geq 0. \tag{61}$$

For n0 ≥ 0 and ε small, positive, define $\lambda_{n_0} = a_{n_0+2}\left(2 + a_{n_0+2}\right)^{-1} - \epsilon$. Let $\mathcal{N}_{n_0}$ be a small neighborhood of the interval $I_{n_0} = [0, \lambda_{n_0}]$. Consider the Banach space $B_{n_0}$ of sequences $\{\delta_n\}_{n\geq n_0}$ with $\delta_n = \delta_n(\lambda)$ analytic on $\mathcal{N}_{n_0}$ and continuous up to the boundary, with the norm $\|\delta\| = \sup_{n\geq n_0}\sup_{\lambda\in\mathcal{N}_{n_0}}|\delta_n(\lambda)|$. Direct estimates show that the operator S defined by (61) takes the ball of radius $\rho_{n_0} = 2/(2 + a_{n_0+2}) + \epsilon'$ in $B_{n_0}$ into itself (if ε, ε′ and $\mathcal{N}_{n_0}$ are small enough), and is a contraction in this ball. Therefore the equation δ = S(δ) has a unique solution in $B_{n_0}$, of norm less than $\rho_{n_0}$. Then $|\delta_n(\lambda)| < \mathrm{const}\,(n + 2)^{-1/2}$ for all $\lambda \in I_n$ and all n ≥ 0. Since the sequence $\lambda_n$ increases to 1, (i) follows.

(ii) Step I: All $g_n$ are meromorphic on [0, 1). Since $\delta_n$ is analytic on $I_n$, then from (60), $g_n$ is analytic on $I_n \setminus \{0\}$, having a pole at λ = 0: $g_n \sim \lambda^{-1}a_{n+1}(1 + a_{n+1})^{-1}$ (λ → 0). Iterating (58) it follows that $g_{n-1}, g_{n-2}, \ldots, g_0$ are meromorphic on $I_n$. Since the intervals $I_n$ increase toward [0, 1) it follows that $g_0, g_1, \ldots, g_n, \ldots$ are meromorphic on [0, 1).

Step II: There exists n1 and λ0 ∈ (0, 1) such that $g_{n_1}(\lambda_0) \leq 0$. Define $\epsilon_n = (1 + a_n)^{-1}$; we have (see (43))

$$\epsilon_{n_0} \sim r\,(n_0\omega)^{-1/2}, \quad n_0 \to \infty. \tag{62}$$

Let n0 be large and denote $\lambda_0 = 1 - \epsilon_{n_0}$. Let N0 be large enough so that λ0 is in the domain of analyticity of $g_{N_0}$. Iterating (58) starting from N0 (and decreasing indices) we get the value $g_{n_0}(\lambda_0)$. If for some $n \in \{n_0, n_0 + 1, \ldots, N_0\}$ we get $g_n(\lambda_0) \leq 0$, Step II is proved. Then assume that $g_{n_0}(\lambda_0) > 0$.

Consider the recurrence

$$\tilde{r}_{n-1} = G_{n_0}(\lambda_0) - \frac{1}{\tilde{r}_n} \ \text{ for } n \leq n_0, \qquad \tilde{r}_{n_0} = g_{n_0}(\lambda_0), \tag{63}$$

where, in fact, $G_{n_0}(\lambda_0) = 2 - \epsilon_{n_0}^2$. The recurrence (63) can be solved explicitly (it is a discrete Riccati equation and a substitution $\tilde{r}_{n-1} = x_{n-1}/x_n$ transforms it into a linear recurrence with constant coefficients). It has the solution

$$\tilde{r}_n = \frac{\cos\left((n - n_0)\phi + \theta\right)}{\cos\left((n + 1 - n_0)\phi + \theta\right)}, \tag{64}$$

where $\cos\phi = 1 - \epsilon_{n_0}^2/2$, sin φ > 0, and $\tan\theta = (\cos\phi - \lambda)/\sin\phi$ so that $\theta \sim \frac{\pi}{4} - \frac{1}{4}\epsilon_{n_0}$ ($\epsilon_{n_0} \to 0$). We assume, to get a contradiction, that $g_n(\lambda_0) > 0$ for all n = 0, 1, . . . , n1. Then

$$g_n(\lambda_0) \leq \tilde{r}_n \quad\text{for } n \leq n_0, \tag{65}$$

which follows immediately by induction using (58), (63), noting that $G_n$ is increasing in n. Note that there is an $n_1 \in \{1, 2, \ldots, n_0 - 1\}$ so that

$$\tilde{r}_n > 0 \ \text{ for } n \in \{n_1 + 1, \ldots, n_0\} \quad\text{and}\quad \tilde{r}_{n_1} < 0. \tag{66}$$

Indeed (from (62)) when n decreases from n0 the numerator and denominator in (64) increase up to 1, then decrease, until the numerator becomes negative, when n equals $n_1 = n_0 - k_1$, where k1 is the integer with $k_1 - 1 < (\pi/2 + \theta)/\phi \leq k_1$. Since $\phi \sim \epsilon_{n_0}$ ($\epsilon_{n_0} \to 0$) then $k_1 \sim (3\pi)/(4\epsilon_{n_0})$, and, using (62), clearly $k_1 \in \{1, \ldots, n_0 - 1\}$ (if n0 is sufficiently large). Then (65) and (66) contradict the assumption that $g_{n_1}(\lambda_0) > 0$, and Step II is proved.

Step III. The function $g_{n_1}$ is meromorphic on [0, 1), with $g_{n_1}(0+) = +\infty$. There is a smallest value of λ in (0, λ0), where $g_{n_1}$ changes sign: this is either a zero, or a pole. Assume it was a pole. Let p ∈ (0, λ0) be the first pole of $g_{n_1}$. Then $g_{n_1}$ is positive and analytic on (0, p), and $g_{n_1}(p-) = +\infty$, $g_{n_1}(p+) = -\infty$. Since $g_{n+1} = 1/(G_{n+1} - g_n)$ (see (58)) then $g_{n_1+1}(p-) = 0-$, hence $g_{n_1+1}$ changes sign in (0, p). But $g_{n_1+1}$ has no zero in (0, p) (otherwise at that zero $g_{n_1}$ would have had a pole, from (58)). Then $g_{n_1+1}$ has a pole, with a change of sign, from + to −, in (0, p). Now the argument can be repeated. It follows that for any k > 0, $g_{n_1+k}$ has a pole in (0, p), which contradicts the fact that the domain of analyticity of $g_n$ increases to (0, 1) as n → ∞. Therefore, the first change of sign of $g_{n_1}$ is at a zero. Let ζ1 be the smallest value in (0, 1) such that $g_{n_1}(\zeta_1-) = 0+$, $g_{n_1}(\zeta_1+) = 0-$. Then from (58) we have $g_{n_1-1}(\zeta_1-) = -\infty$ and $g_{n_1-1}$ changes sign in (0, ζ1). Now the argument can be repeated. It follows that g0 has a pole at a point $\zeta_{n_1}$ with $g_0(\zeta_{n_1}-) = -\infty$.

(iii) Since g0(λ) takes all the values when $\lambda \in (0, \zeta_{n_1})$ there exists $\lambda = \lambda_s \in (0, 1)$ such that (59) holds.

(iv) For $\lambda = \lambda_s$, since the solution of (i) satisfies $g_n(\lambda) = \lambda^{-1} + O(n^{-1/2})$ we have from (57), with the notation $V^{[s]} = V(\lambda_s)$, that $V_n^{[s]} = \prod_{k=0}^{n} g_k(\lambda_s)^{-1}\, V_0^{[s]} = O(\lambda_s^{n+o(n)})$ and thus $V_n^{[s]} - V_{n-1}^{[s]} = O(\lambda_s^{n+o(n)})$; then from (55) $z_n^{[s]} = O(\lambda_s^{n+o(n)})$.

(v) The substitution (variation of constants) $V_n = V_n^{[s]} v_n$ brings the recurrence (56) to a first order one: with the notation $I_n = v_n - v_{n-1}$ we have $I_n = \left(V_{n-2}^{[s]}/V_n^{[s]}\right) I_{n-1}$ and the rest of the argument consists of straightforward estimates. □
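The explicit solution (64) of the constant-coefficient Riccati recurrence (63) can be verified numerically. In the sketch below the value `eps` is an illustrative stand-in for $\epsilon_{n_0}$ (an assumption, not a value from the paper), and the starting value is taken to be the one that (64) itself gives at n = n0 with θ chosen as in the text, which works out to $1/\lambda_0$. The code iterates (63) backwards, compares with (64), and locates the first index at which the iterate becomes negative.

```python
import math

def riccati_backward(G, r_start, steps):
    """Iterate r_{n-1} = G - 1/r_n; entry k of the result is r_{n0 - k}."""
    values = [r_start]
    for _ in range(steps):
        values.append(G - 1.0 / values[-1])
    return values

def closed_form(k, phi, theta):
    """Formula (64) evaluated at n = n0 - k."""
    return math.cos(theta - k * phi) / math.cos(theta - (k - 1) * phi)

eps = 0.05                                   # illustrative stand-in for epsilon_{n0}
lam0 = 1.0 - eps
phi = math.acos(1.0 - eps**2 / 2.0)          # cos(phi) = 1 - eps^2/2, sin(phi) > 0
theta = math.atan((math.cos(phi) - lam0) / math.sin(phi))
G = 2.0 - eps**2                             # equals 2*cos(phi)

steps = 60
iterated = riccati_backward(G, closed_form(0, phi, theta), steps)

# Index k1 of the first negative iterate, and agreement with (64) up to that index.
first_negative = next(k for k, value in enumerate(iterated) if value < 0.0)
max_diff = max(abs(iterated[k] - closed_form(k, phi, theta))
               for k in range(first_negative + 1))
predicted_k1 = math.ceil((math.pi / 2.0 + theta) / phi)
print(first_negative, predicted_k1, max_diff)
```

The first negative index agrees with the integer $k_1$ defined by $k_1 - 1 < (\pi/2 + \theta)/\phi \leq k_1$, and is close to $3\pi/(4\epsilon)$ for small ε, as used in Step II.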

Fig. 1. Graph of g0 given by (58) (discontinuous graph) and by (59) in a region near λ = 1, as functions of λ

6.3.3. Proof of Lemma 18. Proof. Lemma 21(v) shows that Eq. (53) has a unique (up to a multiplicative constant) n+o(n) small solution, zn[s] ∼ λs (n → ∞), while the general solution behaves like zn ∼ √ −n+o(n) . Since yn ∼ nzn the uniqueness of the l2 solution is proven. λs 6.3.4. Examples of solutions. We will show next how concrete values λs satisfying Lemma 21 (iii) are relatively straightforwardly, and rigorously, found. One method is as follows. Note that the minimum/maximum of the function a − b/x, where x varies in an interval not containing zero is achieved at the endpoints. We thus take the recurrence 2 √ (58) with initial conditions gn0 = λ−1 ± 1−λ and compute g0 from these. The actual nω graph will be between these two, unless the condition mentioned is violated in between n0 and 0. This graph is to be intersected with the graph of the initial condition (59). We take for instance ω = 1.1, r = 0.45, sp = 0.11, n0 = 10, for which the rigorous control is not too involved. The two graphs are very close to each other (within about 3.10−6 for λ ∈ (0.3, 0.4)) and cannot be distinguished from each-other in Fig. 1. A first intersection is seen at λ ≈ 0.327; see Fig. 2. 6.4. Proof of Lemma 19. Denote B = (I − J0 )A; we have B = A − S. Hence B ∗ = A − S. Then Ker(B) = Ker(B ∗ ) (since Az = Sz implies (47), so Az = Az, and similarly, Az = z implies Az = Az). So Ker[(1 − J0 )A] = Ker[A(1 − J0 ∗ )] so that (since A is one-to-one) A−1 Ker(1 − J0 ) = Ker(1 − J0 ∗ ), which proves the lemma. 6.5. Discussion of the singularities of solutions of (17). Let λ = λs . We have that I − J is invertible for p0 > 0, and is not invertible at p0 = isp (Lemma 18). By the analytic Fredholm theorem (see e.g. [30]) (I − J )−1 is meromorphic on a small neighborhood of isp , therefore there exist m ≥ 1 and operators Sm , . . . , S1 , R(p0 ) so that: 1

\[ (I - J)^{-1} = \frac{1}{(p_0 - is_p)^m} S_m + \cdots + \frac{1}{p_0 - is_p} S_1 + R(p_0), \tag{67} \]

Fig. 2. Graphs of $g_0$ (steeper graph) and of the initial condition for $g_0$ (59), for $\lambda$ between 0.3 and 0.4

where $R(p_0)$ is analytic at $is_p$, and $S_m \neq 0$ (since $I - J_0$ is not invertible). Multiplying (67) by $I - J$ to the left, respectively to the right, and writing $J = J_0 + (p_0 - is_p)J_1(p_0)$ (where $J_1(p_0)$ is analytic at $is_p$) we get that

\[ R_1(p_0) = \frac{1}{(p_0 - is_p)^m}(I - J_0)S_m + O\big((p_0 - is_p)^{-m+1}\big), \]

\[ R_2(p_0) = \frac{1}{(p_0 - is_p)^m}S_m(I - J_0) + O\big((p_0 - is_p)^{-m+1}\big), \]

where $R_{1,2}$ are analytic at $p_0 = is_p$. By the uniqueness of the series of the analytic functions (Banach space valued) $R_{1,2}$ we must then have

\[ (I - J_0)S_m = 0 = S_m(I - J_0). \tag{68} \]

The first equality in (68) implies $\mathrm{Ran}(S_m) \subset \mathrm{Ker}(I - J_0) = \{y_{Ker}\}$ and since $S_m \neq 0$ then $\mathrm{Ran}(S_m) = \{y_{Ker}\}$, therefore $S_m y = \langle y, u\rangle\, y_{Ker}$ for some $u \in l^2 \setminus \{0\}$. The second equality in (68) means $u \in \mathrm{Ran}(I - J_0)^\perp = \mathrm{Ker}(I - J_0^*)$. By Lemma 19 then (up to a multiplicative constant) $u = A^{-1}y_{Ker} = z_{Ker}$, where $z_{Ker}$ satisfies (46), hence (53), (54). The solution $y = (I - J)^{-1}f$ of (17) is then singular at $p_0 = is_p$ if $c = \langle f, z_{Ker}\rangle \neq 0$. For the example of Sect. 6.3.4 this latter condition can be checked by explicit calculation of the truncations to 10 terms and estimation of the remainder based on the contractivity bounds in the previous section. The result is $c = -1.953 \pm 0.001$. Thus the inhomogeneous equation has poles.

Lemma 22. Let $Y(t)$ be analytic on $[0, \infty)$, with $\lim_{t\to\infty} Y(t) = 0$. Let $s \in \mathbb{R}$. Then

\[ \lim_{a \downarrow 0}\, a\int_0^\infty e^{-(a+is)t}\,Y(t)\,dt = 0. \tag{69} \]

Corollary 23. Let $Y(t)$ be as in Lemma 22. Let $y(p) = \int_0^\infty e^{-pt}\,Y(t)\,dt$. Assume that $y(p)$ is analytic on $i\mathbb{R}^+$, except for a set of isolated points. Then $y(p)$ does not have poles on $i\mathbb{R}^+$.
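The limit (69) can be illustrated numerically (this is not part of the proof). The sketch below treats only the case $s = 0$ with the sample function $Y(t) = 1/(1+t)$; the choice of $Y$, the helper name `abel_mean`, the truncation point and the step size are assumptions made for this example. After the substitution $x = at$ the quantity $a\int_0^\infty e^{-at}Y(t)\,dt$ becomes $\int_0^\infty e^{-x}Y(x/a)\,dx$, which is approximated by the trapezoid rule.

```python
import math

def abel_mean(Y, a, x_max=30.0, h=2e-4):
    """Approximate a * int_0^inf exp(-a t) Y(t) dt = int_0^inf exp(-x) Y(x/a) dx.

    Trapezoid rule on [0, x_max]; the neglected tail is at most
    exp(-x_max) * sup|Y|.
    """
    n = int(round(x_max / h))
    total = 0.5 * (Y(0.0) + math.exp(-x_max) * Y(x_max / a))
    for j in range(1, n):
        x = j * h
        total += math.exp(-x) * Y(x / a)
    return total * h

def Y(t):
    # Hypothetical test function: analytic on [0, inf) and tending to 0.
    return 1.0 / (1.0 + t)

values = [abel_mean(Y, a) for a in (1e-1, 1e-2, 1e-3)]
print(values)  # the values shrink as a decreases
```

For this particular $Y$ the decay is slow (of order $a\ln(1/a)$), which shows that the lemma gives convergence to zero but no rate.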


Proof. I. We first show (69) for $s = 0$. Separating the positive and negative parts of $\Re Y(t)$, $\Im Y(t)$ write $Y(t) = Y^{[1]}(t) - Y^{[2]}(t) + iY^{[3]}(t) - iY^{[4]}(t)$ with $Y^{[k]}(t)$ nonnegative, continuous, nonanalytic only on a discrete set, where the left and right derivatives exist, with $\lim_{t\to\infty} Y^{[k]}(t) = 0$. It is enough to show (69) for each $Y^{[k]}$. Let then $Y$ be one of the $Y^{[k]}$'s. Denote $H(t) = \sup_{\tau \ge t} Y(\tau)$. The function $H$ on $[0, \infty)$ has the same properties as $Y$ and in addition is decreasing. Then $H'$ exists a.e. and $H' \in L^1[0, \infty)$, since

\[ \int_0^\infty |H'(\tau)|\,d\tau = -\lim_{t\to\infty}\int_0^t H'(\tau)\,d\tau = \lim_{t\to\infty}\big(-H(t) + H(0)\big) = H(0). \]

Then

\[ a\int_0^\infty e^{-at}\,Y(t)\,dt \le a\int_0^\infty e^{-at}\,H(t)\,dt = -\int_0^\infty H(t)\,\frac{d}{dt}e^{-at}\,dt = H(0) + \int_0^\infty e^{-at}\,H'(t)\,dt, \]

therefore

\[ \lim_{a\downarrow 0}\, a\int_0^\infty e^{-at}\,Y(t)\,dt \le H(0) + \lim_{a\downarrow 0}\int_0^\infty e^{-at}\,H'(t)\,dt = 0, \]

which proves the lemma in this case.

II. Let now $s \in \mathbb{R}$ be arbitrary. Then (69) follows from the result for $s = 0$ applied to the function $\tilde{Y}(t) = e^{-ist}\,Y(t)$. □

Proof of Theorem 2. In conclusion $Y(t)$ cannot tend to zero as $t \to \infty$ and complete ionization fails. □

Acknowledgements. The authors would like to thank A. Soffer and M. Weinstein for interesting discussions and suggestions. Work of O. C. was supported by NSF Grant 9704968, of R.D.C. by NSF Grant 0074924, and that of O. C., J. L. L. and A. R. by AFOSR Grant F49620-98-1-0207 and NSF Grant DMR-9813268.

Appendix A. Proof of Lemma 3

(i) Consider $L^1_{loc}[0, A]$ endowed with the norm $\|F\|_\nu := \int_0^A |F(s)|\,e^{-\nu s}\,ds$, where $\nu > 0$. If $f$ is continuous and $F, G \in L^1_{loc}[0, A]$, a straightforward calculation shows that

\[ \|fF\|_\nu < \|F\|_\nu \sup_{[0,A]}|f|, \tag{A1} \]

\[ \|F * G\|_\nu < \|F\|_\nu\,\|G\|_\nu, \tag{A2} \]

\[ \|F\|_\nu \to 0 \ \text{ as } \ \nu \to \infty, \tag{A3} \]

where the last relation follows from the Riemann–Lebesgue lemma. The integral equation (9) can be written as

\[ Y = \eta + JY, \quad \text{where } JF := \eta\,(2i + M) * F. \tag{A4} \]

Since $M$ is locally in $L^1$ and bounded for large $x$ it is clear that for large enough $B_2$, (9) is contractive if $\nu > B_2$, for any $A$.


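(ii) This is an immediate consequence of Lemma 3 and of elementary properties of the Laplace transform.

(iii) We have in $\mathbb{H}$,

\[ \mathcal{L}M = \lim_{a\downarrow 0}\frac{2i}{\pi}\int_0^\infty dx\, e^{-px}\int_0^\infty \frac{u^2\,e^{-i(x-ia)(1+u^2)}}{1+u^2}\,du \tag{A5} \]

\[ \phantom{\mathcal{L}M} = \frac{i}{\pi}\int_{-\infty}^{\infty}\frac{u^2}{(1+u^2)\big(p + i(1+u^2)\big)}\,du. \tag{A6} \]

For $\Re(p) > 0$ we push the integration contour through the upper half plane. At the two poles in the upper half plane $u^2 + 1$ equals $0$ and $ip$ respectively, so that

\[ \frac{i}{\pi}\int_{-\infty}^{\infty}\frac{u^2\,du}{(1+u^2)\big(p + i(1+u^2)\big)} = \frac{i}{\pi}\,\frac{(-1)}{(2i)(p)}\oint\frac{ds}{s} + \frac{i}{\pi}\,\frac{u_0^2}{(ip)(2iu_0)}\oint\frac{ds}{s} = -\frac{i}{p} + \frac{u_0}{p}, \tag{A7} \]

where $u_0$ is the root of $p + i(1+u^2) = 0$ in the upper half plane. Thus

\[ \mathcal{L}M = -\frac{i}{p} + \frac{i\sqrt{1-ip}}{p} \tag{A8} \]

with the branch satisfying $\sqrt{1-ip} \to 1$ as $p \to 0$ in $\mathbb{H}$. Thus, the analytic continuation of $\sqrt{1-ip}$ in $\mathbb{H} \cup \partial\mathbb{H}$ in our calculations is as follows:

Remark 24. As $p$ varies in $\mathbb{H}$, $1 - ip$ belongs to the lower half plane $-i\mathbb{H}$ so that $\sqrt{1-ip}$ varies in the fourth quadrant, and in particular $\Im\sqrt{1-ip} < 0$. If $p \in i\mathbb{R}$ and $-ip \ge -1$ then $\sqrt{1-ip}$ is real and nonnegative, while if $-ip < -1$ then $\sqrt{1-ip}$ has zero real part and negative imaginary part.

To show (15) note that for $\Re(p) > 0$, $\omega > 0$ we have

\[ \mathcal{L}\big(e^{\pm i\omega t}M\big) = -\frac{i}{p \mp i\omega} + \frac{i\sqrt{1 - ip \mp \omega}}{p \mp i\omega} \quad \big(\text{with } \sqrt{1 - ip - \omega} = -i\sqrt{\omega - 1 + ip} \text{ if } \omega > 1\big). \tag{A9} \]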
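The closed form (A8) can be cross-checked numerically against the integral (A6). The sketch below is only a sanity check under stated assumptions: the function names and sample values of $p$ are hypothetical, and the substitution $u = \tan\theta$ (which turns (A6) into $\frac{i}{\pi}\int_{-\pi/2}^{\pi/2}\frac{\sin^2\theta}{p\cos^2\theta + i}\,d\theta$) is introduced here purely for the numerics. For $\Re(p) > 0$ the principal branch used by `cmath.sqrt` agrees with the branch fixed in Remark 24.

```python
import cmath
import math

def laplace_M_numeric(p, n=4000):
    """Evaluate (i/pi) * int_R u^2 / ((1+u^2)(p + i(1+u^2))) du, cf. (A6).

    With u = tan(theta) the integrand becomes
    sin(theta)^2 / (p cos(theta)^2 + i) on (-pi/2, pi/2),
    a smooth pi-periodic function, so the midpoint rule converges quickly.
    """
    h = math.pi / n
    total = 0.0 + 0.0j
    for j in range(n):
        theta = -math.pi / 2 + (j + 0.5) * h
        total += math.sin(theta) ** 2 / (p * math.cos(theta) ** 2 + 1j)
    return 1j / math.pi * total * h

def laplace_M_closed(p):
    # Closed form (A8); principal square root, which tends to 1 as p -> 0.
    return -1j / p + 1j * cmath.sqrt(1 - 1j * p) / p

# Sample points with Re(p) > 0 (arbitrary choices for the check).
errors = [abs(laplace_M_numeric(p) - laplace_M_closed(p))
          for p in (1.0 + 0.0j, 0.5 + 2.0j, 3.0 - 1.0j)]
print(errors)
```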

The branch of the square root was discussed in Remark 24. This concludes the proof of Lemma 3 (iii). Appendix B. Discussion of the Genericity Condition (g) A thorough analysis of the properties of the shift operator is provided by the treatise [29]. We provide here an independent discussion, meant to give an insight on the interesting analytic properties involved in this condition. Let C = (C0 , C1 , . . . , Cn , . . . ) ∈ l2 (N) and the operator T defined as before by T C = (C1 , C2 , . . . ). We want to see for which such vectors, the system of equations (z, T j C) = 0, j = 0, 1, . . .

(B1)


has nontrivial solutions $z$ in $l^2$. We can associate to $z$ and $C$ analytic functions in the unit disk, $Z(x)$ and $C(x)$, by

\[ C(x) = \sum_{k=0}^{\infty} C_k x^k, \qquad Z(x) = \sum_{k=0}^{\infty} z_k x^k. \tag{B2} \]

These functions extend to $L^2$ functions on the unit circle. The system of equations (B1) can be written as

\[ z_0 C(x) + z_1 x^{-1}\big(C(x) - C(0)\big) + \cdots + z_n x^{-n}\Big[C(x) - \sum_{k=0}^{n-1}\frac{x^k}{k!}\,C^{(k)}(0)\Big] + \cdots = 0. \tag{B3} \]

Using Cauchy's formula, we can write the difference in square brackets in (B3) as

\[ \frac{1}{2\pi i}\oint_{|s|=1}\frac{C(s)}{s^n(s-x)}\,ds, \tag{B4} \]

and thus (B1) becomes

\[ \oint_{|s|=1}\frac{C(s)\,Z(1/s)}{s-x}\,ds = 0. \tag{B5} \]

The functions $C$ for which this equation has nontrivial solutions $Z$ relate to the Beurling inner functions [29] and are very "rare".

Examples. (i) Let $|\lambda| < 1$ and $C_n = \lambda^n$, i.e. $C(x) = (1 - \lambda x)^{-1}$. This is related to the function (11). Taking advantage of the analyticity of $Z$ outside the unit circle, we can push the contour of integration towards $s = \infty$, collecting the residue at $s = \lambda^{-1}$; we see that Eq. (B5) holds iff $Z(\lambda) = 0$, i.e., for a space of analytic functions of codimension one.

(ii) Let $C_n = 1/n$. Then $C(x) = \ln(1 - x)$, and by taking $s = 1/t$ in (B5) we get

\[ \frac{1}{x}\oint_{|t|=1}\frac{Z(t)\ln(t-1)}{(t - x^{-1})\,t}\,dt - \frac{1}{x}\oint_{|t|=1}\frac{\ln(t)\,Z(t)}{t\,(t - x^{-1})}\,dt = 0. \tag{B6} \]

By making a cut on $[1, \infty)$ for the log we see that the integrand in the first integral is analytic in the unit circle and thus the integral vanishes. We decompose the second integral by partial fractions and we get

\[ \oint_{|t|=1}\frac{\ln(t)\,Z(t)}{t}\,dt - \oint_{|t|=1}\frac{\ln(t)\,Z(t)}{t-y}\,dt = 0, \tag{B7} \]

where $y = x^{-1}$. The first integral is a constant, $C$. By pushing the contour of integration inwards, we see that the second integral extends analytically for small $y \neq 0$. For such $y$ we thus have

\[ \oint_{|t|=1}\frac{\ln(t)\,\big(Z(t) - Z(y)\big)}{t-y}\,dt + Z(y)\oint_{|t|=1}\frac{\ln(t)}{t-y}\,dt = -C. \tag{B8} \]


Now the contour of integration can be pushed to the sides of the interval [0, 1] collecting the difference between the branches of the log. We get 1 1 Z(t) − Z(y) 1 dt + Z(y) dt = 0. (B9) t −y 0 0 t −y Thus φ(y) + Z(y) ln(−y) = C with φ and Z analytic in the unit circle, thus ln(−y) is analytic unless Z = 0. This shows Cn = 1/n is generic. References 1. Simon, B.: Schrödinger Operators in the Twentieth Century. J. Math. Phys. 41, 3523 (2000) 2. Cycon, H.L., Froese, R.G., Kirsch, W. and Simon, B.: Schrödinger Operators. Berlin–Heidelberg–New York: Springer-Verlag, 1987 3. Pillet, C.-A.: Some results on the quantum dynamics of a particle in a Markovian potential. Commun. Math. Phys. 102, 237–254 (1985) and 105, 259 (1986) 4. Yajima, K.: Existence of Solutions for Schrödinger Evolution Equations. Commun. Math. Phys. 110, 415–426 (1987) 5. Fring, A., Kastrykin, V. and Schrader, R.: Ionization Probabilities through Ultra-Intense Fields in the Extreme Limit. J. Phys. Math. Gen. 30, 8599–8610 (1997) 6. Soffer, A. and Weinstein, M.I.: Nonautonomous Hamiltonians. J. Stat. Phys. 93, 359–391 (1998) 7. Soffer, A. and Costin, O.: Resonance Theory for Schrödinger Operators. Submitted to Commun. Math. Phys. 8. Landau, L.D. and Lifschitz, E. M.: Quantum Mechanics – Nonrelativistic Theory. 2nd ed. New York: Pergamon, 1965 9. Cohen-Tannoudji, C., Duport-Roc, J. and Arynberg, G.: Atom-Photon Interactions. New York: Wiley, 1992 10. Fermi, E.: Notes on Quantum Mechanics. Chicago: The University of Chicago Press, 1961, p. 100 11. Koch, P.M. and van Leeuven, K.A.H.: The Importance of Resonances in Microwave “Ionization” of Excited Hydrogen Atoms. Phys. Repts. 255, 289–403 (1995) 12. Casatti, G., Chirikovand, B. and Shepelyansky, D. L.: Relevance of Classical Chaos in Quantum Mechanics: The Hydrogen Atom in a Monochromatic Field. Phys. Repts. 154, 77–123 (1987) 13. Patolige, R.M. 
and Shaheshaft, R.: Multiphoton Processes in an Intense Laser Field: Harmonic Generation and Total Ionization Rates for Atomic Hydrogen. Phys. Rev. A 40, 3061–3079 (1990) 14. Buchleitner, A., Delande, D. and Gay, J.-C.: Microwave Ionization of Three Dimensional Hydrogen Atoms in a Realistic Numerical Experiment. J. Opt. Soc. Am. B 12, 505–519 (1995) 15. Benenti, G., Casati, G., Maspero, G. and Shepelyansky, D.L.: Quantum Poincaré Recurrences for Hydrogen Atom in a Microwave Field. Preprint, physics/9911200 16. Schwendner, P., Seyl, F. and Schinke, R.: Photodissociation of Ar2+ in Strong Laser Fields, Chem. Phys. 217, 233–244 (1997) 17. Guerin, S. and Jauslin, H.-R.: Laser-Enhanced Tunneling through Resonant Intermediate Levels. Phys. Rev. A 55, 1262–1275 (1997) 18. Eberly, J.M. and Kulander, K.C.: Atomic-Stabilization by Super-Intense Lasers. Science 262 1233 (1993) 19. Su, Q., Irving, B.P. and Eberly, J.H.: Ionization Modulation in Dynamic Stabilization. Laser Physics 7, 568 (1997) 20. Figueira de Morisson Faria, C., Fring, A. and Schrader, R.: Analytical Treatment of Stabilization. Preprint, physics/9808047, v2 21. Barash, D., Orel, A.E. and Vemuri, V.R.: A Genetic Search in Frequency Space for Stabilizing Atoms by High-Intensity Laser Fields. J. Comp. Info. Tech. CIT 8, 2, 103–113 (2000) 22. Rokhlenko, A. and Lebowitz, J.L.: Ionization of a Model Atom by Perturbation of the Potential. J. Math. Phys. 41, 3511–3523 (2000) 23. Costin, O., Lebowitz, J.L. and Rokhlenko, A.: Exact Results for the Ionization of a Model Quantum System. J. Phys. A. 33, 6311–6319 (2000) physics/9905038, and work in preparation 24. Costin, O., Lebowitz, J.L. and Rokhlenko, A.: To appear in: Proceedings of the CRM meeting “Nonlinear Analysis and Renormalization Group”, American Mathematical Society Publications (2000), mathph/0002003 25. Demkov, Yu.N. and Ostrovskii, V.N.: Zero Range Potentials and Their Application in Atomic Physics. 
New York: Plenum, 1988; Albeverio, S., Gesztesy, F., Høegh-Krohn, R. and Holden, H.: Solvable Models in Quantum Mechanics. Berlin–Heidelberg–New York: Springer-Verlag, 1988


26. Susskind, R.M., Cowley, S.C. and Valeo, E.J.: Multiphoton Ionization in a Short Range Potential: A Nonperturbative Approach. Phys. Rev. A 42, 3090–3106 (1990) 27. Scharf, G., Sonnemoser, K. and Wreszinski, W.F.: Sensitive Multiphoton Ionization. Phys. Rev. A 44, 3250–3265 (1991) 28. LaGattuta, K.J.: Laser-Assisted Scattering from a One-Dimensional δ-function potential: An Exact Solution. Phys. Rev. A 49 No. 3, 1745–1751 (1994) 29. Nikol’skii, N.K.: Treatise on the Shift Operator. Berlin–Heidelberg–New York: Springer-Verlag, 1986 30. Reed, M. and Simon, B.: Methods of Modern Mathematical Physics, Vol. I. New York: Academic Press, 1979 31. Martin, P.A.: Scattering with Time Periodic Potentials and Cyclic States. Preprint 1999, Texas 32. Kovar, T. and Martin, P.A.: Scattering with a periodically kicked interaction and cyclic states. Preprint (1999) 33. Belissard, J.: Stability and Instability in Quantum Mechanics. In: Trends and Developments in the Eighties, Albeverio, S. and Blanchard, Ph. eds., Singapore: World Scientific, 1985, pp. 1–106 34. Jauslin, H.R. and Lebowitz, J.L.: Spectral and Stability Aspects of Quantum Chaos. Chaos 1, 114–121 (1991) 35. Costin, O. and Costin, R.D.: Rigorous WKB for Discrete Schemes with Smooth Coefficients. SIAM J. Math. Anal. 27 no. 1, 110–134 (1996) Communicated by B. Simon

Commun. Math. Phys. 221, 27 – 56 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

How to Prove Dynamical Localization Serguei Tcheremchantsev MAPMO-CNRS, Département des Mathématiques, Université d’Orléans, BP 6759, 45067 Orléans Cedex 2, France. E-mail: [email protected] Received: 16 November 2000 / Accepted: 14 February 2001

Abstract: Let $H$ be a self-adjoint operator on $l^2(\mathbb{Z}^d)$ or $L^2(\mathbb{R}^d)$ with pure point spectrum on some interval $I$. We establish general necessary and sufficient conditions for dynamical localization for a given vector and on the interval of energies $I$. The sufficient conditions we obtain improve the existing ones such as SULE or WULE and can be useful in applications.

1. Introduction

Let $H$ be a self-adjoint operator on the Hilbert space $\mathcal{H}$ with pure point spectrum on some interval $I \subset \mathbb{R}$. We shall consider the case $\mathcal{H} = l^2(\mathbb{Z}^d)$ as well as $\mathcal{H} = L^2(\mathbb{R}^d)$. Consider the subspace $\mathcal{H}(I)$ of $\mathcal{H}$, $\mathcal{H}(I) = P_I(H)\mathcal{H}$, where $P_I(H)$ is the spectral projector of $H$ onto $I$. It is natural to say that the operator $H$ has dynamical localization on $I$ if for any $p > 0$ and well localised $\psi \in \mathcal{H}$,

\[ \sup_t r^p(t) \equiv \sup_t \big\langle \exp(-itH)P_I(H)\psi,\ |X|^p \exp(-itH)P_I(H)\psi \big\rangle < +\infty, \tag{1.1} \]

where $X$ is the usual position operator. The problem of dynamical localization has been intensively studied in recent years [1–8], especially in the case of random discrete and continuous Schrödinger operators (in particular, the Anderson model). The aim of the present paper is not to give a review of the results obtained for concrete models. We are rather interested in general mathematical methods which can be used to prove dynamical localization (1.1) for any self-adjoint operator $H$. We hope, however, that the obtained results (especially sufficient conditions for dynamical localization) will be useful in applications.

Let $\{e_k\}$ be any orthonormal set of eigenfunctions of $H$ complete in $\mathcal{H}(I)$. For any $k$ we have $He_k = \lambda_k e_k$ with $\lambda_k \in I$. Suppose that $\mathcal{H} = l^2(\mathbb{Z}^d)$ (in the last section of the paper we discuss also the continuous case). Let $\psi = \delta_m$ for some $m \in \mathbb{Z}^d$. Then

\[ \psi_I(t, n) \equiv \big(\exp(-itH)P_I(H)\psi\big)(n) = \sum_k \exp(-it\lambda_k)\,e_k(n)\,\overline{e_k(m)} \]

and

\[ \sup_t |\psi_I(t, n)| \le W(n, m), \qquad \sup_t r^p(t) \le \sum_{n\in\mathbb{Z}^d} |n|^p\,W^2(n, m), \tag{1.2} \]

where

\[ W(n, m) = \sum_k |e_k(n)\,e_k(m)|. \]

To prove dynamical localization (1.1) for any ψ = δm , it is sufficient to show that the function W (n, m) is fast decaying as |n| → ∞ for all m ∈ Zd . What one usually proves for “concrete” Schrödinger operators is the so-called exponential (or mathematical) localization on the interval I . Namely, for some α > 0 and any eigenfunction ek ∈ H(I ), |ek (n)| ≤ C(k) exp(−α|n|),

(1.3)

where $C(k) < +\infty$. If the sum is finite, it is evident that $W(n, m)$ is exponentially decaying in $n$ for any $m$ and (1.1) holds. However, typically the sum has infinitely many terms. In this case the bounds (1.3) give nothing about the behaviour of the sum $W(n, m)$. The well known example of [5] shows that it is quite possible that (1.3) holds, but there is no dynamical localization for $\psi = \delta_0$ and $r^2(t_n)/t_n^{2-\delta} \to +\infty$ for any $\delta > 0$ for some sequence $t_n \to \infty$. It is clear that to prove dynamical localization, one should control the decay of $|e_k(n)e_k(m)|$ both in $n$ and in $k$. Or, equivalently, one should control the constants $C(k)$ in (1.3). Two approaches have been developed to solve this problem. The first is to estimate rather directly $|e_k(n)e_k(m)|$ and to prove that the sum $W(n, m)$ is fast decaying as $|n| \to \infty$ [7, 8]. For example, one shows that

\[ |e_k(n)\,e_k(m)| \le C(m)\,a_k \exp(-\alpha|n - m|), \tag{1.4} \]

where $\sum_k a_k = D < +\infty$. Clearly, (1.4) yields

\[ W(n, m) \le C(m)\,D \exp(-\alpha|n - m|) \]

and (1.1) holds. In particular, the condition called WULE was considered [7]. This is (1.4), where $a_k = \|Be_k\|^2$ and $B$ is the operator of multiplication by $(1 + |n|^2)^{-\delta/2}$ with $\delta > d/2$. Obviously, as $\sum_k |e_k(n)|^2 \le 1$ for any $n$,

\[ \sum_k a_k \le \sum_n (|n|^2 + 1)^{-\delta} < +\infty. \]

Another possibility is to have some control of constants C(k) in (1.3). Namely, the following condition called SULE was introduced in [5]. The operator H has SULE on I , if H has a complete set {ek } in H(I ) of orthonormal eigenfunctions and there exist α > 0 and nk ∈ Zd such that for any δ > 0, |ek (n)| ≤ C(δ) exp(δ|nk | − α|n − nk |)

(1.5)

with some finite constants C(δ) uniform in k, n. One shows that if (1.5) holds, then ek (m) are fast decaying in k for all m. So, it is easy to see that the function W (n, m) is exponentially decaying in n for all m and (1.1) holds (in fact, one controls also the behaviour in m, so one proves (1.1) for any exponentially decaying ψ). This result


was extended to the continuous case in [6] and applied in [6, 4] to prove dynamical localization for some concrete models. Similar ideas are used in [3] to prove strong dynamical localization.

In the present paper we propose a new approach which considerably improves the existing methods. Surprisingly, one can show that if the function $|e_k(n)e_k(m)|$ decays sufficiently fast in $n$ uniformly in $k$, then one need not take special care of the decay in $k$ necessary for the convergence of the sum $W(n, m)$. The key point is the following. Due to the fact that the system $\{e_k\}$ is orthonormal in $\mathcal{H}$, the decay in $n$ and decay in $k$ of $|e_k(n)e_k(m)|$ are related. Therefore, one can "sacrifice" some decay in $n$ to obtain a decay in $k$ sufficient for the convergence of the sum. One can even allow some growth in $k$ in the bounds for $|e_k(n)e_k(m)|$ provided the decay in $n$ is fast enough (see Theorem 5.3 and Theorem 5.4). One should say that there is a deep relation between our approach and that of [5].

The main technical ingredient in our consideration is the following result (Theorem 2.2 and Theorem 6.1). Let $\{e_k\}$ be any orthonormal system in $\mathcal{H}$. For any $p > 0$ define the positive numbers

\[ d_k(p) = \sum_n |e_k(n)|^2 (|n| + 1)^p, \qquad \eta_k(p) = \sup_n \big(|e_k(n)|^2 \exp(p|n|)\big) \]

(where it is possible that $d_k(p) = +\infty$, $\eta_k(p) = +\infty$). For any $p > 0$, one can reorder $d_k(p)$, $\eta_k(p)$ so that

\[ d_k(p) \ge D(p, d)\,k^{p/d}, \qquad \eta_k(p) \ge C(p, d)\exp(\beta k^{1/d}) \tag{1.6} \]

with some positive universal constants $D(p, d)$, $C(p, d)$, $\beta(p, d)$. Considering the SULE condition, one can easily verify the fact that $|n_k| \ge Ck^{1/d}$ (after reordering) implies $d_k(p) \ge Dk^{p/d}$ and $\eta_k(p) \ge C\exp(\beta k^{1/d})$ for any $p > 0$. So, the growth of $|n_k|$ for the systems $\{e_k\}$ with SULE, which plays a key role in the proof of [5], can be considered as a manifestation of a more general result (1.6) valid for any orthonormal system. When proving (1.6), we don't need any assumptions about the form of eigenfunctions $e_k$ or the notion of "center of localization" $n_k$.

The paper is organised as follows. In Sect. 2 we prove our main technical result about the growth of the moments $d_k(p)$. In Sect. 3 we establish necessary conditions for dynamical localization for a given vector $\psi$. In particular, we show (Theorem 3.4) that if

\[ \sup_t \langle |X|^p\rangle_\psi(t) \equiv \sup_t \big\langle \exp(-itH)\psi,\ |X|^p \exp(-itH)\psi \big\rangle < +\infty, \tag{1.7} \]

for some $p > 0$, then the coefficients of the spectral measure of $\psi$ decay sufficiently fast. Namely, if $\mu_\psi = \sum_k a_k \delta_{\lambda_k}$, then

\[ \sum_k a_k^{\frac{1}{1+\beta}} < +\infty \]

for any $0 < \beta < p/d$. In particular, if (1.7) holds for any $p > 0$, then $a_k$ are fast decaying:

\[ \sum_k a_k^\nu < +\infty \quad \forall \nu > 0. \]

In Sect. 4 we give some sufficient conditions for dynamical localization for a given vector $\psi$ and $p > 0$ (Theorem 4.2 is the main result of the section). As a result, we obtain (Corollary 4.3) general necessary and sufficient conditions for dynamical localization for a given vector $\psi$ for all $p > 0$. Namely, let $\psi \in \mathcal{H}_{pp}$. One can always represent it as

\[ \psi = \sum_k \psi_k, \qquad \psi_k \in \mathcal{H}_{\lambda_k}, \]

where $\lambda_k \neq \lambda_s$ for $k \neq s$ and $\mathcal{H}_\lambda$ is an eigenspace of $H$ corresponding to the eigenvalue $\lambda$. Consider the function

\[ R_\psi(n) = \sup_k |\psi_k(n)|. \]

Then (1.7) holds for any $p > 0$ iff $R_\psi(n)$ is fast decaying. We show also that the dynamical localization

\[ \sup_t \langle |X|^p\rangle_\psi(t) < +\infty \quad \forall p > 0 \]

is equivalent to the dynamical localization for Cesaro averages:

\[ \sup_T \frac{1}{T}\int_0^T \langle |X|^p\rangle_\psi(t)\,dt < +\infty \quad \forall p > 0. \]

The results of Sects. 3 and 4 are well adapted to the case of power law or subexponential decay of eigenfunctions of $H$. In Sect. 5 we discuss the problem of dynamical localization on the interval of energies $I$ (in the sense (1.1)). Theorems 5.3 and 5.4 give sufficient conditions that (1.1) hold for any finite $\psi$ and any fast decaying $\psi$ respectively. We give also some bound (Theorem 5.5) which can be used to prove the strong dynamical localization on $I$ considered in [3, 8] (in the case when there is a family of operators $H$ depending on some parameter). In Sect. 6 we consider exponential dynamical localization:

\[ \sup_t |(\exp(-itH)\psi)(t, n)| \le C\exp(-\gamma|n|), \qquad \gamma > 0, \tag{1.8} \]

(denoted as EDL($\psi$)) which is a special case of dynamical localization. In particular, we show that if (1.8) holds for some vector $\psi$, then the coefficients of the spectral measure of $\psi$ decay (after reordering) as follows: $a_k \le C\exp(-\beta k^{1/d})$ with some $\beta > 0$. We give also (Theorem 6.5) some sufficient conditions for exponential localization on the interval of energies $I$. In particular, if

\[ |e_k(n)\,e_k(m)| \le C\exp(-\alpha|n - m| + \beta|m|) \tag{1.9} \]

for some $\alpha > 0$, $\beta > 0$, then EDL($P_I(H)\psi$) holds for any exponentially decaying $\psi$. The condition (1.9) is similar to (1.4), but is much easier to prove in concrete cases because one doesn't need the decreasing factors $a_k$. In particular, the SULE condition (1.5) implies immediately (1.9). In Sect. 7 we consider the continuous case $\mathcal{H} = L^2(\mathbb{R}^d)$. We show that most of the results proved in the discrete case remain true under some additional assumptions about eigenfunctions of $H$. In particular, it is the case if $H = -\Delta + V(x)$ with $V(x)$ bounded from below and $I = (-\infty, K]$, $K \in \mathbb{R}$. One should stress that practically all results of the paper hold regardless of the multiplicity of the spectrum of $H$.


2. Lower Bounds for the Moments of Eigenfunctions

Let $\{a_{kn}\}$ be a double sequence of nonnegative numbers labelled by $k \in \mathbb{N}$, $n \in \mathbb{Z}^d$. We shall suppose that there exist two positive constants $A$, $B$ such that

\[ \forall n \in \mathbb{Z}^d, \quad \sum_{k=1}^{\infty} a_{kn} \le A < +\infty, \tag{2.1} \]

\[ \forall k \in \mathbb{N}, \quad \sum_{n\in\mathbb{Z}^d} a_{kn} \ge B > 0. \tag{2.2} \]

For $p > 0$ define the numbers

\[ d_k(p) = \sum_{n\in\mathbb{Z}^d} a_{kn}\,(|n| + 1)^p \]

with the understanding that $d_k(p)$ may be equal to $+\infty$. One can also remark that $d_k(p) \ge B$ for any $k$, $p$.

Lemma 2.1. Let $p > 0$. There exists a positive constant $D(A, B, p, d)$ such that for any $\{a_{kn}\}$ satisfying (2.1), (2.2), one can reorder $d_k(p)$ so that

\[ d_k(p) \ge Dk^{p/d}. \tag{2.3} \]

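Proof. For any $N > 0$ consider the following set in $\mathbb{N}$:

\[ J(N) = \Big\{ k \in \mathbb{N} \ \Big|\ \sum_{n:\,|n|\ge N} a_{kn} \le B/2 \Big\} \]

and the sum

\[ S(N) = \sum_{k\in J(N)}\ \sum_{|n|<N} a_{kn}. \]

[…] $> 0$ such that $L = \frac{B}{2}(N + 1)^p$. It follows from the definition of the set $I(N, p)$ and (2.7) that

\[ \mathrm{Card}\big(\{k \in \mathbb{N} \mid d_k(p) \le L\}\big) \le C(d)\,A/B\,(2L/B)^{d/p}. \tag{2.8} \]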

The bound (2.8) implies, in particular, that the set $\{k \in \mathbb{N} \mid d_k(p) \le L\}$ is finite for any $L > 0$. Reordering $d_k(p)$ in such a way that they increase with $k$, we obtain the result of the lemma.

As a direct application of this lemma, we obtain the following important result.

Theorem 2.2. Let $\{e_k\}$, $k \in \mathbb{N}$, be any orthonormal system in $l^2(\mathbb{Z}^d)$ (not necessarily complete). For any $p > 0$ define the moments

\[ d_k(p) = \sum_{n\in\mathbb{Z}^d} |e_k(n)|^2 (|n| + 1)^p. \]

One can reorder $d_k(p)$ so that

\[ d_k(p) \ge D(p, d)\,k^{p/d}, \]

where the positive constants $D$ are the same for any system $\{e_k\}$.

Proof. Let $a_{kn} = |e_k(n)|^2$. Since the system is orthonormal,

\[ \forall n \in \mathbb{Z}^d, \quad \sum_{k=1}^{\infty} a_{kn} \le 1, \tag{2.9} \]

\[ \forall k \in \mathbb{N}, \quad \sum_{n\in\mathbb{Z}^d} a_{kn} = 1. \]

(One has the equality in (2.9) if the system is complete.) The result of the theorem follows directly from Lemma 2.1, where $A = B = 1$.
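The growth rate $k^{p/d}$ in Theorem 2.2 can be seen on the simplest orthonormal system, the canonical basis $e_k = \delta_{n_k}$ of $l^2(\mathbb{Z}^2)$, for which $d_k(p) = (|n_k| + 1)^p$. The sketch below is an illustration only: it assumes that $|n|$ is the Euclidean norm, takes $d = 2$ and $p = 2$, restricts to lattice points with $|n| \le 30$, and uses a hypothetical helper name. It orders the moments increasingly and prints the smallest and largest ratio $d_k(p)/k^{p/d}$.

```python
import math

def canonical_basis_moments(radius, p):
    """Moments d_k(p) = (|n_k| + 1)^p of e_k = delta_{n_k} in l^2(Z^2).

    Only lattice points with Euclidean norm <= radius are used, listed in
    increasing order of the norm, so the list is the start of the global
    increasing reordering of the moments.
    """
    norms = sorted(
        math.hypot(x, y)
        for x in range(-radius, radius + 1)
        for y in range(-radius, radius + 1)
        if x * x + y * y <= radius * radius
    )
    return [(r + 1.0) ** p for r in norms]

p, d = 2.0, 2
moments = canonical_basis_moments(30, p)
# Ratio d_k(p) / k^(p/d) with k counted from 1.
ratios = [dk / k ** (p / d) for k, dk in enumerate(moments, start=1)]
print(min(ratios), max(ratios))
```

The ratios stay between two positive constants, in line with the two-sided bound $C_1 k^{p/d} \le d_k(p) \le C_2 k^{p/d}$ for the canonical basis. The lower constant here is at least $1/\pi$, since a disc of radius $r + 1$ contains all unit squares centred at lattice points of norm at most $r$.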


Remark 2.3. The result of the theorem is optimal since there exist orthonormal systems such that $C_1 k^{p/d} \le d_k(p) \le C_2 k^{p/d}$. The simplest example is the canonical basis in $l^2(\mathbb{Z}^d)$ or, more generally, complete systems with SULE [5], where the functions $e_k(n)$ are well localised and fast decaying at infinity. One can observe that if the system is not complete, then $d_k(p)$ can grow as fast as one wishes (taking, for example, $e_k = \delta_{m(k)}$ with fast growing $|m(k)|$). Even if the system is complete but the functions $e_k$ decay not too fast at infinity, it is possible that $d_k(p)$ are fast growing (in particular, $d_k(p) = +\infty$ for any $k$). An interesting problem: is it possible that $d_k(p)$ grow faster than $k^{p/d}$ for some complete systems where all the functions $e_k(n)$ decay fast (for example, exponentially) as $|n| \to +\infty$?

3. Localization for a Given Vector: Necessary Conditions

Let $H$ be a self-adjoint operator in $\mathcal{H} = l^2(\mathbb{Z}^d)$, $\psi \in \mathcal{H}$, $\|\psi\| = 1$, $\psi(t) = \exp(-itH)\psi$. For any $p > 0$, $t \in \mathbb{R}$, define the moments of the position operator

\[ \langle |X|^p\rangle_\psi(t) = \sum_{n\in\mathbb{Z}^d} |\psi(t, n)|^2 (|n| + 1)^p. \]

We prefer to take $(|n| + 1)^p$ rather than $|n|^p$ to avoid some technical problems in the proofs.

Definition 3.1. One has dynamical localization for $\psi$ and the moment of order $p$, if

\[ \sup_t \langle |X|^p\rangle_\psi(t) < +\infty. \]

We shall denote it as DL($\psi, p$). One has Cesaro dynamical localization CDL($\psi, p$) if

\[ \sup_T \langle\langle |X|^p\rangle\rangle_\psi(T) \equiv \sup_T \frac{1}{T}\int_0^T \langle |X|^p\rangle_\psi(t)\,dt < +\infty. \]

Clearly, DL($\psi, p$) ⇒ CDL($\psi, p$).

Definition 3.2. One has dynamical localization (Cesaro dynamical localization) for $\psi$ if DL($\psi, p$) (respectively, CDL($\psi, p$)) holds for any $p > 0$. We shall write DL($\psi$) and CDL($\psi$) respectively. Again, DL($\psi$) ⇒ CDL($\psi$).

Let $\mathcal{H}_c$ be the subspace of continuous spectrum of $H$ and $P_c$ be the orthogonal projection on it. It is well known that if $P_c\psi \neq 0$, then $\langle |X|^p\rangle_\psi(t) \to +\infty$ as $t \to \infty$ for any $p > 0$. So, the dynamical localization is possible only if $\psi \in \mathcal{H}_{pp}$, the subspace of pure point spectrum of $H$. We shall denote by $\lambda$ the eigenvalues of $H$ and by $\mathcal{H}_\lambda$ the corresponding eigenspaces: $\mathcal{H}_\lambda = \{\varphi \mid H\varphi = \lambda\varphi\}$.


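Clearly, the subspaces $\mathcal{H}_\lambda$ and $\mathcal{H}_\mu$ are mutually orthogonal for $\lambda \neq \mu$ and $\mathcal{H}_{pp} = \oplus_\lambda \mathcal{H}_\lambda$. We shall denote by $P_\lambda$ the orthogonal projection on $\mathcal{H}_\lambda$. For any $\psi \in \mathcal{H}_{pp}$ consider the set (at most countable) $I(\psi) = \{\lambda \mid \psi_\lambda \equiv P_\lambda\psi \neq 0\}$. Then $\psi$ can be written as

\[ \psi = \sum_{k=1}^{+\infty} \psi_k, \qquad \psi_k \neq 0, \quad \psi_k \in \mathcal{H}_{\lambda_k}, \quad \lambda_k \in I(\psi), \]

where $\lambda_k \neq \lambda_s$ for $k \neq s$. (It is possible that the sum is finite, in this case the problem of dynamical localization for the vector $\psi$ is rather trivial.) For any $k$ define $e_k = \|\psi_k\|^{-1}\psi_k$. As $\mathcal{H}_{\lambda_k}$ and $\mathcal{H}_{\lambda_s}$ are mutually orthogonal for $k \neq s$, the system $M(\psi) \equiv \{e_k\}$ is orthonormal in $\mathcal{H}_{pp}$. Finally, any $\psi \in \mathcal{H}_{pp}$ can be written as

\[ \psi = \sum_k \gamma_k e_k, \]

where $\gamma_k = \langle \psi, e_k\rangle$, $He_k = \lambda_k e_k$ and $M(\psi) = \{e_k\}$ is some orthonormal system of eigenfunctions of $H$ depending on $\psi$. It is obvious that

\[ \psi(t) = \sum_k \exp(-it\lambda_k)\,\gamma_k e_k. \tag{3.1} \]

Let $d_k(p)$ be the moments of the functions $e_k$:

\[ d_k(p) = \sum_{n\in\mathbb{Z}^d} |e_k(n)|^2 (|n| + 1)^p. \]

One can also note that the spectral measure of $\psi$ is equal to

\[ \mu_\psi = \sum_k a_k \delta_{\lambda_k}, \tag{3.2} \]

where $a_k = |\gamma_k|^2 > 0$, $\sum_k a_k = \|\psi\|^2 = 1$. The first result we shall prove is a necessary condition for dynamical localization.

Theorem 3.3. 1. For any $p > 0$,

\[ \liminf_{T\to\infty} \langle\langle |X|^p\rangle\rangle_\psi(T) \ge \sum_{k=1}^{\infty} a_k d_k(p) \]

(with the convention that if $d_k(p) = +\infty$ for some $k$, then $\sum_k = +\infty$).

2. DL($\psi, p$) ⇒ CDL($\psi, p$) ⇒ $d_k(p) < +\infty$ for any $k$ and $\sum_k a_k d_k(p) < +\infty$.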
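A two-site example makes the bound of Theorem 3.3 concrete. The sketch below rests on assumptions introduced only for illustration: $H$ acts on $l^2(\mathbb{Z})$ as the hopping between the sites $0$ and $1$ (and as zero elsewhere), and $\psi = \delta_0$. Then $e_\pm = (\delta_0 \pm \delta_1)/\sqrt{2}$ with $\lambda_\pm = \pm 1$, $a_\pm = 1/2$, $d_\pm(p) = (1 + 2^p)/2$, and $\psi(t) = \cos(t)\,\delta_0 - i\sin(t)\,\delta_1$. The function names are hypothetical.

```python
import math

def moment(t, p):
    """<|X|^p>_psi(t) = sum_n |psi(t,n)|^2 (|n|+1)^p for the two-site example."""
    return math.cos(t) ** 2 * (0 + 1) ** p + math.sin(t) ** 2 * (1 + 1) ** p

def cesaro_average(T, p, steps=20000):
    """Midpoint-rule approximation of (1/T) * int_0^T moment(t, p) dt."""
    h = T / steps
    return sum(moment((j + 0.5) * h, p) for j in range(steps)) * h / T

def lower_bound(p):
    """sum_k a_k d_k(p) with a_+ = a_- = 1/2 and d_+ = d_- = (1 + 2^p)/2."""
    d = 0.5 * (1 ** p + 2 ** p)
    return 0.5 * d + 0.5 * d

p = 2.0
average, bound = cesaro_average(200.0, p), lower_bound(p)
print(average, bound)
```

In this example the Cesaro average converges to $\sum_k a_k d_k(p)$ itself as $T \to \infty$, so the inequality of Theorem 3.3 is attained; the oscillating remainder is of order $1/T$.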


Proof. It follows from (3.1) that

\[ |\psi(t, n)|^2 = \sum_{k,m=1}^{\infty} \exp(-it(\lambda_k - \lambda_m))\,\gamma_k\overline{\gamma_m}\,e_k(n)\overline{e_m(n)}. \tag{3.3} \]

The sum in (3.3) is absolutely converging for any $n$ because

\[ \sum_k |\gamma_k|^2 = 1, \qquad \sum_k |e_k(n)|^2 \le 1. \]

Therefore, for any $N > 0$, $T \in \mathbb{R}$,

\[ A(N, T) \equiv \frac{1}{T}\int_0^T dt \sum_{|n|\le N} |\psi(t, n)|^2 (|n| + 1)^p = \sum_{k,m=1}^{\infty} b_{km}(T)\,\gamma_k\overline{\gamma_m}\,d_{km}(p, N), \tag{3.4} \]

where $d_{km}(p, N) = \sum_{|n|\le N} e_k(n)\overline{e_m(n)}\,(|n| + 1)^p$, $b_{km}(T) = 1$ for $k = m$ and $b_{km}(T) = (\exp(-iT(\lambda_k - \lambda_m)) - 1)/(-iT(\lambda_k - \lambda_m))$ for $k \neq m$. As $A(N, T) \le \langle\langle |X|^p\rangle\rangle_\psi(T)$, for any $N > 0$ we have the inequality

\[ \liminf_{T\to\infty} \langle\langle |X|^p\rangle\rangle_\psi(T) \ge \liminf_{T\to\infty} A(N, T). \tag{3.5} \]

On the other hand, due to the dominated convergence theorem, one can take the limit in (3.4) for a fixed $N$ as $T \to \infty$:

\[ \lim_{T\to\infty} A(N, T) = \sum_k |\gamma_k|^2 \sum_{|n|\le N} |e_k(n)|^2 (|n| + 1)^p. \tag{3.6} \]

As $a_k = |\gamma_k|^2$, it follows from (3.5) and (3.6) that

\[ \liminf_{T\to\infty} \langle\langle |X|^p\rangle\rangle_\psi(T) \ge \sum_k a_k \sum_{|n|\le N} |e_k(n)|^2 (|n| + 1)^p \]

for any $N > 0$. Taking the limit $N \to +\infty$, we obtain the first statement of the theorem. The second statement follows directly from the first.

As a corollary of Theorem 3.3, we shall prove a necessary condition for dynamical localization for the vector $\psi$ in terms of its spectral measure $\mu_\psi$ given by (3.2).

Theorem 3.4. The following statements hold:

1. For any $p > 0$,

\[ \mathrm{DL}(\psi, p) \Rightarrow \mathrm{CDL}(\psi, p) \Rightarrow \sum_k a_k^{\frac{1}{1+\beta}} < +\infty \]

for all $0 < \beta < p/d$.

2.

\[ \mathrm{DL}(\psi) \Rightarrow \mathrm{CDL}(\psi) \Rightarrow \sum_k a_k^\nu < +\infty \]

for all $\nu > 0$.


Proof. Suppose that CDL($\psi, p$) holds. Theorem 3.3 implies

\[ \sum_k a_k d_k(p) < +\infty. \]

One can apply Theorem 2.2 to the orthonormal system $\{e_k\}$ and reorder $d_k(p)$ so that $d_k(p) \ge Dk^{p/d}$. Therefore, after reordering,

\[ \sum_k a_k k^{p/d} < +\infty. \]

Let $0 < \beta < p/d$, $r = 1 + \beta$, $r' = 1 + 1/\beta$. Applying the Hölder inequality, one can easily see that

\[ \sum_k a_k^{\frac{1}{1+\beta}} \le \Big(\sum_k a_k k^{p/d}\Big)^{\frac{1}{1+\beta}} \Big(\sum_k k^{-p/(\beta d)}\Big)^{\beta/(1+\beta)} < +\infty. \]

The fact that $\sum_k a_k^{\frac{1}{1+\beta}} < +\infty$ does not depend on reordering of $\{a_k\}$. The second part of the theorem is obvious.

Corollary 3.5. If CDL($\psi$) holds, then the numbers $a_k$ are fast decaying: for any $s > 0$, one can reorder $\{a_k\}$ so that $a_k \le C(s)k^{-s}$.

One should stress that the statements of Theorem 3.4 and Corollary 3.5 are weaker than that of Theorem 3.3, because they do not depend on the system $\{e_k\}$, and inevitably, one loses some information about the moments $d_k(p)$. If $d_k(p)$ grow very fast as $k \to \infty$, it is even possible that $\sum_k a_k^\nu < +\infty$ for all $\nu > 0$, but $\sum_k a_k d_k(p) = +\infty$ for all $p > 0$. Finally, in this section we shall give necessary conditions for DL($\psi, p$) in terms of projections of $\psi$ on $\mathcal{H}_{\lambda_k}$.

Lemma 3.6. Let $M = \{e_k\}$ be any orthonormal system in $\mathcal{H}$ (in particular, the system $M(\psi)$), $\psi \in \mathcal{H}$. Define the following function:

\[ R_{\psi,M}(n) = \sup_k |\gamma_k e_k(n)|, \qquad \gamma_k = \langle \psi, e_k\rangle. \]

Then

\[ \sum_{n\in\mathbb{Z}^d} R_{\psi,M}^2(n) \le \|\psi\|^2. \]

Proof. As the system $\{e_k\}$ is orthonormal, $\sum_k |\gamma_k|^2 \le \|\psi\|^2$. Therefore,

\[ S = \sum_{k,n} |\gamma_k e_k(n)|^2 = \sum_k |\gamma_k|^2 \le \|\psi\|^2. \tag{3.7} \]

On the other hand,

\[ S = \sum_n \sum_k |\gamma_k e_k(n)|^2 \ge \sum_n \sup_k |\gamma_k e_k(n)|^2 = \sum_n R_{\psi,M}^2(n). \tag{3.8} \]

The result of the lemma follows from (3.7)–(3.8).


The function $R_{\psi,M}(n)$ (especially its decay properties) will play an important role in the next part of the paper. Lemma 3.6 implies that $R_{\psi,M}$ always lies in $l^2(\mathbb{Z}^d)$. We shall see below that if DL($\psi, p$) holds, then $R_{\psi,M(\psi)}$ decays faster at infinity. On the other hand, in the next section we shall prove that a sufficiently fast decay of $R_{\psi,M}$ for some $M$ implies DL($\psi, p$).

Definition 3.7. We shall say that a function $f : \mathbb{Z}^d \to \mathbb{C}$ ($f : \mathbb{Z}^d \to \mathbb{R}$) is fast decaying if for any $s > 0$,

\[ \sup_n |f(n)|\,(|n| + 1)^s < +\infty. \]

Theorem 3.8. The following statements hold:

1. DL($\psi, p$) ⇒ CDL($\psi, p$) ⇒ $\sum_n R_{\psi,M(\psi)}^2(n)\,(|n| + 1)^p < +\infty$.

2. DL($\psi$) ⇒ CDL($\psi$) ⇒ $R_{\psi,M(\psi)}$ is fast decaying.

Proof. Suppose that CDL($\psi, p$) holds. Theorem 3.3 implies that $d_k(p) < +\infty$ for any $k$ and

\[ S(p) = \sum_k a_k d_k(p) \le C(p) < +\infty. \]

On the other hand, it follows from the definition of $d_k(p)$ that

\[ S(p) = \sum_n (|n| + 1)^p \sum_k |\gamma_k e_k(n)|^2 \ge \sum_n (|n| + 1)^p \sup_k |\gamma_k e_k(n)|^2 = \sum_n (|n| + 1)^p R_{\psi,M(\psi)}^2(n), \]

so we obtain the first statement of the theorem. The second statement follows directly from the first.

4. Localization for a Given Vector: Sufficient Conditions

In this section we shall give some sufficient conditions for DL($\psi, p$) and DL($\psi$). Let $M = \{e_k\}$ be any orthonormal system of eigenfunctions of $H$ and $\mathcal{H}_M$ the subspace of $\mathcal{H}_{pp}$ spanned by $M$. For any $\psi \in \mathcal{H}_M$, we have the identity

\[ \psi = \sum_k \gamma_k e_k, \qquad \gamma_k = \langle \psi, e_k\rangle. \tag{4.1} \]

We shall denote as $\lambda_k$ and $d_k(p)$ the eigenvalue and the moments of $e_k(n)$ respectively. In this section we shall consider any systems $M$ of eigenfunctions of $H$, not necessarily $M(\psi)$, so it is possible that $\lambda_k = \lambda_m$ for $k \neq m$ if the spectrum of $H$ is not simple. The decomposition

\[ \psi = \sum_k \psi_k \]

of the previous section is a special case of (4.1), when $M = M(\psi)$. The simplest sufficient condition for DL($\psi, p$) can be given in terms of $d_k(p)$ and $a_k = |\gamma_k|^2$.


S. Tcheremchantsev

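Lemma 4.1. Let $p > 0$. The following statement holds:
$$\sup_t \langle |X|^p \rangle_\psi(t) \le \sum_n (|n| + 1)^p \sup_t |\psi(t, n)|^2 \le \Big( \sum_k \sqrt{a_k d_k(p)} \Big)^2.$$
So, if the last sum converges, $\mathrm{DL}(\psi, p)$ holds.

Proof. Since
$$\sup_t |\psi(t, n)| = \sup_t \Big| \sum_k \exp(-it\lambda_k) \gamma_k e_k(n) \Big| \le \sum_k |\gamma_k e_k(n)|, \tag{4.2}$$
the Cauchy–Schwarz inequality yields
$$\sum_n (|n| + 1)^p \sup_t |\psi(t, n)|^2 \le \sum_{k,m} |\gamma_k \gamma_m| \sum_n |e_k(n) e_m(n)| (|n| + 1)^p \le \sum_{k,m} |\gamma_k \gamma_m| \sqrt{d_k(p) d_m(p)} = \Big( \sum_k \sqrt{a_k d_k(p)} \Big)^2. \tag{4.3}$$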
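The second inequality in (4.3) is the Cauchy–Schwarz inequality applied to the sum over $n$, with the weight $(|n|+1)^p$ split evenly between the two factors. A worked version of this step, assuming only the definition $d_k(p) = \sum_n |e_k(n)|^2 (|n|+1)^p$ used throughout the paper, might read:

```latex
\begin{align*}
\sum_n |e_k(n) e_m(n)| (|n|+1)^p
  &= \sum_n \big( |e_k(n)| (|n|+1)^{p/2} \big) \big( |e_m(n)| (|n|+1)^{p/2} \big) \\
  &\le \Big( \sum_n |e_k(n)|^2 (|n|+1)^p \Big)^{1/2}
       \Big( \sum_n |e_m(n)|^2 (|n|+1)^p \Big)^{1/2}
   = \sqrt{d_k(p)\, d_m(p)} ,
\end{align*}
and, since $|\gamma_k| = \sqrt{a_k}$, the double sum factorizes:
\[
\sum_{k,m} |\gamma_k \gamma_m| \sqrt{d_k(p) d_m(p)}
  = \Big( \sum_k \sqrt{a_k d_k(p)} \Big)^2 .
\]
```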

If the functions $e_k$ and $e_m$ overlap only slightly for $k \ne m$, one can better estimate the sums
$$d_{km}(p) = \sum_n |e_k(n) e_m(n)| (|n| + 1)^p.$$
If $d_{km}(p)$ decay fast when $|k - m| \to \infty$, the sum on the r.h.s. of (4.3) can be majorized by $C \sum_k a_k d_k(p)$ (or by something close to it). In this case one obtains a sufficient condition which is close to (or even identical with) the necessary condition given by Theorem 3.3. In particular, this is the case when $M$ is the canonical basis in $l^2(\mathbb{Z}^d)$. The sufficient condition of Lemma 4.1 is, however, difficult to apply in concrete cases, because one must control at the same time the growth of $d_k(p)$ and the decay of $a_k$. A more practical sufficient condition can be given in terms of the function $R_{\psi,M}(n)$, defined by
$$R_{\psi,M}(n) = \sup_k |\gamma_k e_k(n)|, \qquad \gamma_k = \langle \psi, e_k \rangle, \qquad \sum_k |\gamma_k|^2 = \|\psi\|^2.$$
Later on in this section, $\psi$ and $M$ are fixed and we omit them from the notation. To prove $\mathrm{DL}(\psi, p)$, one uses the trivial bound (4.2):
$$|\psi(t, n)| \le \sum_k |\gamma_k e_k(n)|.$$

Clearly, any term in the sum is majorized by $R(n)$, and if the sum has a finite number of terms, a sufficiently fast decay of $R(n)$ implies $\mathrm{DL}(\psi, p)$. However, typically this is not the case, and one needs some decay in $k$ so that the sum converges. The key point is the following: one can sacrifice some decay in $n$ to obtain the necessary decay in $k$. This is possible due to the growth of the moments $d_k(p)$ given by Theorem 2.2. Surprisingly,


one can even allow some growth in $k$ in the bounds for $\gamma_k e_k(n)$. Namely, suppose that for some $\alpha > 0$ the moments $d_k(\alpha)$ are finite for any $k$. Consider the function
$$R(n, \alpha) = \sup_k \frac{|\gamma_k e_k(n)|}{d_k(\alpha)}.$$
For $\alpha = 0$ one has $d_k(0) = 1$ (so $d_k(\alpha)$ are always finite) and $R(n, 0)$ coincides with the function $R(n)$ considered above. For any $s > 0$ define the moments
$$L_\alpha(s) = \sum_n R^2(n, \alpha) (|n| + 1)^s,$$
where $L_\alpha(s)$ depend also on $\psi$, $M$ and it is possible that $L_\alpha(s) = +\infty$.

Theorem 4.2. Let $\psi \in \mathcal{H}_M$, $\alpha \ge 0$. Suppose that $d_k(\alpha) < +\infty$ for any $k$. The following statements hold:

1. Let $\delta \in (0, 1)$, $\varepsilon > 0$. Then
$$\sup_t |\psi(t, n)| \le C(\varepsilon, \delta, d)\, L_\alpha^{\delta/2}\Big( \frac{2\alpha + d(2 - \delta)}{\delta} + \varepsilon \Big) R^{1-\delta}(n, \alpha), \tag{4.4}$$
where the constants $C(\varepsilon, \delta, d)$ are universal, i.e. do not depend on $H$, $M$ or $\psi$.
2. Let $p > 0$, $\varepsilon > 0$. There exist universal constants $C(\varepsilon, p, d)$ such that
$$\sup_t \langle |X|^p \rangle_\psi(t) \le \sum_n (|n| + 1)^p \sup_t |\psi(t, n)|^2 \le C(\varepsilon, p, d) L_\alpha(2\alpha + p + 2d + \varepsilon). \tag{4.5}$$
3. If $R(n, \alpha)$ is fast decaying in $n$, then so is $\sup_t |\psi(t, n)|$ and $\mathrm{DL}(\psi)$ holds.

Proof. Let $r = \frac{2\alpha + d(2 - \delta)}{\delta} + \varepsilon$. If $L_\alpha(r) = +\infty$, the bound (4.4) is trivially true with any finite constant $C$. Suppose that $L_\alpha(r) < +\infty$. It follows from the definitions of $d_k(r)$ and $R(n, \alpha)$ that for any $r > 0$, $k \in \mathbb{N}$,
$$a_k d_k(r) = \sum_n |\gamma_k e_k(n)|^2 (|n| + 1)^r \le d_k^2(\alpha) \sum_n R^2(n, \alpha) (|n| + 1)^r \equiv d_k^2(\alpha) L_\alpha(r).$$
As $L_\alpha(r) < +\infty$, $d_k(r) < +\infty$ for any $k$ such that $a_k = |\gamma_k|^2 \ne 0$. Therefore,
$$|\gamma_k| \le \big( L_\alpha(r) d_k^2(\alpha) / d_k(r) \big)^{1/2}. \tag{4.6}$$
At the same time,
$$|\gamma_k e_k(n)| \le d_k(\alpha) R(n, \alpha), \tag{4.7}$$
directly from the definition of $R(n, \alpha)$. Using the bounds (4.6)–(4.7), one can estimate:
$$|\psi(t, n)| \le \sum_k |\gamma_k e_k(n)| \le \sum_k \big( d_k(\alpha) R(n, \alpha) \big)^{1-\delta} |\gamma_k e_k(n)|^\delta \le R^{1-\delta}(n, \alpha) L_\alpha^{\delta/2}(r) \sum_k |e_k(n)|^\delta d_k(\alpha) d_k^{-\delta/2}(r), \tag{4.8}$$


where the summation is carried only over $k$ such that $a_k > 0$, so $d_k(r) < +\infty$. Let $s = 2/(2 - \delta)$, $s' = 2/\delta$. Applying the Hölder inequality, and using the fact that $\sum_k |e_k(n)|^2 \le 1$, one obtains:
$$S \equiv \Big( \sum_k |e_k(n)|^\delta d_k(\alpha) d_k^{-\delta/2}(r) \Big)^{s} \le \sum_k d_k^{2/(2-\delta)}(\alpha)\, d_k^{-\delta/(2-\delta)}(r). \tag{4.9}$$
Let $w < q$. Applying the Hölder inequality with $s = q/w$, $s' = q/(q - w)$, one can estimate:
$$d_k(w) = \sum_n |e_k(n)|^{2/s} (|n| + 1)^w |e_k(n)|^{2/s'} \le \Big( \sum_n |e_k(n)|^2 (|n| + 1)^{ws} \Big)^{1/s} \Big( \sum_n |e_k(n)|^2 \Big)^{1/s'} = (d_k(q))^{w/q}. \tag{4.10}$$
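The interpolation bound (4.10) can be checked numerically for any normalized vector. The following is a minimal sketch in dimension $d = 1$; the profile `raw`, the window `sites` and the helper names are hypothetical choices made only for this illustration.

```python
# Numerical sanity check of the interpolation bound (4.10):
#     d(w) <= d(q) ** (w / q)   for w < q,
# for a normalized vector e on a finite window of Z (dimension d = 1).
# The vector below is an arbitrary illustrative choice, not taken from the paper.


def moment(e, sites, p):
    """d(p) = sum_n |e(n)|^2 (|n| + 1)^p for a vector e supported on `sites`."""
    return sum(abs(x) ** 2 * (abs(n) + 1) ** p for x, n in zip(e, sites))


sites = list(range(-5, 6))
raw = [1.0 / (1 + n * n) for n in sites]   # hypothetical decaying profile
norm = sum(x * x for x in raw) ** 0.5
e = [x / norm for x in raw]                # now sum_n |e(n)|^2 = 1


def check(w, q):
    """Return (d(w), d(q) ** (w / q)); (4.10) says the first is <= the second."""
    return moment(e, sites, w), moment(e, sites, q) ** (w / q)


for w, q in [(0.5, 1.0), (1.0, 3.0), (2.0, 2.5)]:
    lhs, rhs = check(w, q)
    print(f"w={w}, q={q}: d(w)={lhs:.6f}, d(q)^(w/q)={rhs:.6f}")
```

The normalization $\sum_n |e(n)|^2 = 1$ matters here: it is what makes the second Hölder factor in (4.10) equal to one.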

The bound (4.10) with $w = \alpha$, $q = r$ and (4.9) imply
$$S \le \sum_k d_k(r)^{(2\alpha/r - \delta)/(2 - \delta)}. \tag{4.11}$$
Now we shall use the result of Theorem 2.2. One can reorder $d_k(r)$ so that
$$d_k(r) \ge D(r, d) k^{r/d}. \tag{4.12}$$
The choice of $r$ and (4.11)–(4.12) yield
$$S \le C(\varepsilon, \delta, d) < +\infty \tag{4.13}$$
with some universal constants $C$. The first statement of the theorem follows from (4.8) and (4.13).

To prove the second statement, we shall use the inequality (4.6) with $r = 2\alpha + p + 2d + \varepsilon$ and the bound of Lemma 4.1. Again, if $L_\alpha(r) = +\infty$, there is nothing to prove. Suppose that $L_\alpha(r) < +\infty$. One obtains
$$\sum_n (|n| + 1)^p \sup_t |\psi(t, n)|^2 \le L_\alpha(r) \Big( \sum_k \sqrt{\frac{d_k^2(\alpha) d_k(p)}{d_k(r)}} \Big)^2. \tag{4.14}$$
Applying twice the bound (4.10), we obtain
$$\sum_n (|n| + 1)^p \sup_t |\psi(t, n)|^2 \le L_\alpha(r) \Big( \sum_k d_k(r)^{\frac{2\alpha + p - r}{2r}} \Big)^2. \tag{4.15}$$
Again, by Theorem 2.2 and the choice of $r$, the sum converges and the second statement of the theorem follows directly from (4.15). The third statement follows directly from the first and the second.

As a direct application of Theorem 4.2 we obtain necessary and sufficient conditions for $\mathrm{DL}(\psi)$.

Corollary 4.3. Let $\psi \in \mathcal{H}_{pp}$ and $R(n) = R(n, 0)$ be defined with the system $M(\psi)$ as in the previous section. Then $\mathrm{CDL}(\psi) \Leftrightarrow \mathrm{DL}(\psi) \Leftrightarrow R(n)$ is fast decaying.

The result follows from Theorem 3.8 and Theorem 4.2.
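The phrase "the choice of $r$" in the proof of Theorem 4.2 hides a short exponent computation. The sketch below spells it out; the constants $D'$, $D''$ are shorthand introduced here, and the only inputs are (4.11), (4.12), (4.15) and the two values of $r$ used in the proof.

```latex
% Statement 1: r = (2\alpha + d(2-\delta))/\delta + \varepsilon,
% hence \delta r = 2\alpha + d(2-\delta) + \varepsilon\delta and the exponent in (4.11) is negative.
\[
d_k(r)^{\frac{2\alpha/r - \delta}{2 - \delta}}
  \le \big( D k^{r/d} \big)^{\frac{2\alpha/r - \delta}{2 - \delta}}
  = D' \, k^{\frac{2\alpha - \delta r}{d(2 - \delta)}}
  = D' \, k^{-1 - \frac{\varepsilon \delta}{d(2 - \delta)}} .
\]
% Statement 2: r = 2\alpha + p + 2d + \varepsilon, so 2\alpha + p - r = -(2d + \varepsilon) < 0.
\[
d_k(r)^{\frac{2\alpha + p - r}{2r}}
  \le \big( D k^{r/d} \big)^{\frac{2\alpha + p - r}{2r}}
  = D'' \, k^{\frac{2\alpha + p - r}{2d}}
  = D'' \, k^{-1 - \frac{\varepsilon}{2d}} .
\]
% In both cases the exponent of k is strictly below -1, so the series over k converges.
```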


5. Localization on the Interval of Energies

In the previous parts of the paper the vector $\psi \in \mathcal{H}_{pp}$ was fixed and we did not suppose anything about the decay of $\psi$ or about the set of $\lambda_k$ such that $\psi_k \ne 0$. In this section we shall consider the set of functions $\psi$ with some decay properties at infinity and with the energies $\lambda_k$ from some interval $I \subset \mathbb{R}$ (bounded or not). First of all, if $\mathrm{DL}(\psi)$ holds, then, in particular, for any $p > 0$,
$$\langle |X|^p \rangle_\psi(0) = \sum_n |\psi(n)|^2 (|n| + 1)^p = C(p) < +\infty,$$
so $\psi$ is fast decaying. Therefore, the largest set of $\psi$ for which one could prove $\mathrm{DL}(\psi)$ is the set of fast decaying functions: $\mathcal{A} = \{\psi \in \mathcal{H} \mid \psi \text{ fast decaying}\}$. We shall also consider the set of finite functions $\mathcal{B} = \{\psi \in \mathcal{H} \mid \psi \text{ finite}\}$, which is a subset of $\mathcal{A}$. The set of $\psi$ exponentially decaying at infinity is intermediate between $\mathcal{A}$ and $\mathcal{B}$ and will be considered in the next section.

Let $I \subset \mathbb{R}$ be some interval (bounded or not). We shall denote by $\mathcal{H}(I)$ the subspace of $\mathcal{H}_{pp}$ with the energies from $I$,
$$\mathcal{H}(I) = \oplus_{\lambda \in I} \mathcal{H}_\lambda,$$
and by $P_I(H)$ the orthogonal projection on $\mathcal{H}(I)$.

Definition 5.1. The operator $H$ has an $\mathcal{A}$-dynamical localization on $I$ if for any $\psi \in \mathcal{A}$, we have $\mathrm{DL}(P_I(H)\psi)$. $H$ has a $\mathcal{B}$-dynamical localization on $I$ if for any $\psi \in \mathcal{B}$ we have $\mathrm{DL}(P_I(H)\psi)$.

Let $M = \{e_k\}$ be any orthonormal system of eigenfunctions of $H$ complete in $\mathcal{H}(I)$. One can obtain such systems by choosing for all eigenvalues $\lambda \in I$ orthonormal systems $M_\lambda$ complete in $\mathcal{H}_\lambda$ and then taking $M = \cup_{\lambda \in I} M_\lambda$. Clearly, $M$ is unique if the spectrum of $H$ is simple on $I$. For any $\varphi \in \mathcal{H}(I)$ the identity holds:
$$\varphi = \sum_k \gamma_k e_k, \qquad \gamma_k = \langle \varphi, e_k \rangle.$$
Suppose that for some $\alpha \ge 0$,
$$\sum_n |g(n)|^2 (|n| + 1)^\alpha < +\infty$$
for any eigenfunction $g$ of $H$ from $\mathcal{H}(I)$ (if $\alpha = 0$, this is always true). Define as in the previous section
$$R_{\varphi,M}(n, \alpha) = \sup_k \frac{|\gamma_k e_k(n)|}{d_k(\alpha)}.$$


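Define also three functions from $\mathbb{Z}^{2d}$ to $\mathbb{R}_+$:
$$F_\alpha(n, m) = \sup_k \frac{|e_k(n) e_k(m)|}{d_k(\alpha)}, \qquad G_\alpha(n, m) = \sup_{g \in L} \frac{|g(n) g(m)|}{d_g(\alpha)},$$
where
$$L = \{ g \in \mathcal{H} \mid Hg = \lambda g,\ \lambda \in I,\ \|g\| = 1 \}, \qquad d_g(\alpha) = \sum_n |g(n)|^2 (|n| + 1)^\alpha,$$
and
$$U_\alpha(n, m) = \sup_{\tilde g \in K} |\tilde g(n) \tilde g(m)|,$$
where
$$K = \{ \tilde g \mid H \tilde g = \lambda \tilde g,\ \lambda \in I, \text{ and } \forall n \in \mathbb{Z}^d,\ |\tilde g(n)| \le (|n| + 1)^{-\alpha/2} \}.$$
One can see that
$$F_\alpha(n, m) \le G_\alpha(n, m) \le U_\alpha(n, m). \tag{5.1}$$
The first inequality in (5.1) is obvious. To prove the second, for any $g \in L$ define $\tilde g(n) = (d_g(\alpha))^{-1/2} g(n)$, so that
$$\frac{|g(n) g(m)|}{d_g(\alpha)} = |\tilde g(n) \tilde g(m)|.$$
One verifies that
$$\sum_n |\tilde g(n)|^2 (|n| + 1)^\alpha = 1,$$
so $\tilde g \in K$ and the second inequality in (5.1) holds.

Lemma 5.2. Let $\alpha \ge 0$, $\psi \in \mathcal{H}$, $\varphi = P_I(H)\psi$. The inequality holds:
$$R_{\varphi,M}(n, \alpha) \le \sum_{m \in \mathbb{Z}^d} N_\alpha(n, m) |\psi(m)|, \tag{5.2}$$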

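where $N_\alpha$ is one of the three functions $F_\alpha$, $G_\alpha$, $U_\alpha$.

Proof. As $\varphi = P_I(H)\psi$ and the system $M$ is complete in $\mathcal{H}(I)$,
$$\varphi = \sum_k \gamma_k e_k, \qquad \gamma_k = \langle \varphi, e_k \rangle = \langle \psi, e_k \rangle.$$
Therefore,
$$\gamma_k = \sum_m \psi(m) \overline{e_k(m)}$$
and
$$\frac{|\gamma_k e_k(n)|}{d_k(\alpha)} \le \sum_m \frac{|e_k(n) e_k(m)|}{d_k(\alpha)} |\psi(m)|. \tag{5.3}$$
Taking in (5.3) the supremum over $k$, we obtain the statement of the lemma for $F_\alpha$. The inequality (5.1) yields the result for $G_\alpha$ and $U_\alpha$.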
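The bound (5.2) can be tried out on a finite toy model with $\alpha = 0$ and $N_0 = F_0$. This is only a sketch under stated assumptions: the Hadamard vectors standing in for a complete eigenbasis (so that $P_I(H)$ is the identity) and the vector `psi` are illustrative choices, not data from the paper.

```python
# Sanity check of the bound (5.2) with alpha = 0 and N = F_0 on a toy system.
# Assumption (not from the paper): the four normalized Hadamard vectors on the
# sites {0, 1, 2, 3} stand in for a complete orthonormal eigenbasis {e_k}.

basis = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, -0.5, 0.5],
]
sites = range(4)
psi = [1.0, -0.3, 0.2, 0.05]  # arbitrary illustrative vector

# gamma_k = <psi, e_k>; all entries are real here, so no conjugation is needed.
gamma = [sum(psi[m] * e[m] for m in sites) for e in basis]


def R(n):
    """R(n, 0) = sup_k |gamma_k e_k(n)| (recall d_k(0) = 1)."""
    return max(abs(g * e[n]) for g, e in zip(gamma, basis))


def F0(n, m):
    """F_0(n, m) = sup_k |e_k(n) e_k(m)|."""
    return max(abs(e[n] * e[m]) for e in basis)


def rhs(n):
    """Right-hand side of (5.2): sum_m F_0(n, m) |psi(m)|."""
    return sum(F0(n, m) * abs(psi[m]) for m in sites)


for n in sites:
    print(n, R(n), rhs(n))  # the second column should not exceed the third
```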


The bound (5.2) combined with Theorem 4.2 allows us to give sufficient conditions for $\mathcal{B}$- and $\mathcal{A}$-dynamical localization on $I$.

Theorem 5.3. The following statements hold:

1. Let $\alpha \ge 0$. If one of the three functions $F_\alpha(n, m)$, $G_\alpha(n, m)$, $U_\alpha(n, m)$ is fast decaying in $n$ for all fixed $m \in \mathbb{Z}^d$, then $\mathcal{B}$-dynamical localization holds on $I$. In particular, $P_I(H)\psi \in \mathcal{A}$ for any $\psi \in \mathcal{B}$.
2. If the spectrum of $H$ is simple on $I$ and $\mathcal{B}$-dynamical localization holds on $I$, then the function $F_0(n, m) = G_0(n, m)$ is fast decaying in $n$ for all fixed $m \in \mathbb{Z}^d$.

Proof. Let $\psi \in \mathcal{B}$ and $N_\alpha$ be one of the three functions $F_\alpha$, $G_\alpha$, $U_\alpha$. As the function $\psi$ is finite and $N_\alpha(n, m)$ is fast decaying in $n$ for any $m$, (5.2) implies that $R_{\varphi,M}(n, \alpha)$ is fast decaying in $n$. The third statement of Theorem 4.2 implies $\mathrm{DL}(\varphi)$, so $\mathcal{B}$-dynamical localization holds on $I$.

To prove the second statement of the theorem, observe that since the spectrum of $H$ is simple on $I$, the system $M$ is unique and coincides with the set of normalised eigenfunctions of $H$ with eigenvalues from $I$. Therefore, $F_\alpha(n, m) = G_\alpha(n, m)$. Moreover, one sees easily that for any $\varphi \in \mathcal{H}(I)$, $M(\varphi)$ is a subset of $M$, where $M(\varphi)$ was defined in Sect. 3. Namely, $M(\varphi) = \{e_k\}_{k \in J}$, $J = \{k \mid \gamma_k \ne 0\}$. Since $\gamma_k = 0$ for any $k \notin J$,
$$R_{\varphi,M(\varphi)}(n, 0) = \sup_{k \in J} |\gamma_k e_k(n)| = \sup_k |\gamma_k e_k(n)| = R_{\varphi,M}(n, 0).$$
Let $\psi \in \mathcal{B}$, so that $\mathrm{DL}(\varphi) \equiv \mathrm{DL}(P_I(H)\psi)$ holds. By the second statement of Theorem 3.8, the function $R_{\varphi,M}(n, 0)$ is fast decaying in $n$. In particular, if $\psi = \delta_m$, $m \in \mathbb{Z}^d$, then $\gamma_k = \overline{e_k(m)}$ and $R_{\varphi,M}(n, 0) = F_0(n, m)$ is fast decaying in $n$, so the second statement of the theorem holds.

As to $\mathcal{A}$-dynamical localization on $I$, there are many possible sufficient conditions. For example, the following result holds.

Theorem 5.4. Let $\alpha \ge 0$ and let $N_\alpha$ be one of the three functions $F_\alpha$, $G_\alpha$, $U_\alpha$. Assume that one of the two conditions holds:

1. For any $s > 0$ there exist two finite positive constants $k(s)$, $C(s)$ such that
$$N_\alpha(n, m) \le C(s) (|m| + 1)^{k(s)} (|n| + 1)^{-s}. \tag{5.4}$$
2. For any $s > 0$ there exist two finite positive constants $k(s)$, $C(s)$ such that
$$N_\alpha(n, m) \le C(s) (|m| + 1)^{k(s)} (|n - m| + 1)^{-s}. \tag{5.5}$$

Then $\mathcal{A}$-dynamical localization holds on $I$.

Proof. Let $\psi \in \mathcal{A}$, so
$$|\psi(m)| \le C(r) (|m| + 1)^{-r}$$
for any $r > 0$. For any $s > 0$ the bounds (5.2) and (5.4) yield
$$R_{\varphi,M}(n, \alpha) \le C(r) C(s) (|n| + 1)^{-s} \sum_m (|m| + 1)^{k(s) - r}.$$


Taking r = k(s) + 2d, we see that Rϕ,M (n, α) is fast decaying in n. The first statement of the theorem follows from the third statement of Theorem 4.2. In the case of (5.5) the proof is similar. Up to now, the operator H was fixed. Suppose that there is a family of self-adjoint operators H (θ ) depending on some parameter θ ∈ 0,

0 such that
$$\sup_t |\psi(t, n)| \le C \exp(-\gamma |n|).$$
We shall denote this property by $\mathrm{EDL}(\psi)$. Clearly, $\mathrm{EDL}(\psi)$ implies $\mathrm{DL}(\psi)$. To establish necessary and sufficient conditions for $\mathrm{EDL}(\psi)$ we shall need the following version of Theorem 2.2.


Theorem 6.1. Let $\{e_k\}$ be any orthonormal system in $\mathcal{H}$, $\gamma > 0$. For any $k$ define the numbers
$$\eta_k(\gamma) = \sup_n \big( |e_k(n)|^2 \exp(\gamma |n|) \big).$$
One can reorder $\eta_k(\gamma)$ so that
$$\eta_k(\gamma) \ge D \exp(\beta k^{1/d})$$
with some universal positive constants $D(\gamma, d)$, $\beta(\gamma, d)$.

Proof. We shall follow the proof of Lemma 2.1 with $a_{kn} = |e_k(n)|^2$ and $A = B = 1$. Let $N > 0$; then for the set
$$J(N) = \Big\{ k \ \Big|\ \sum_{|n| > N} |e_k(n)|^2 \le 1/2 \Big\}$$
we have
$$\mathrm{Card}(J(N)) \le K (N + 1)^d.$$
Let $L > 0$. Consider the following set in $\mathbb{N}$:
$$I(L) = \{ k \mid \eta_k(\gamma) \le L \}.$$
It follows from the definition of $\eta_k(\gamma)$ that for any $k \in I(L)$, $|e_k(n)|^2 \le L \exp(-\gamma |n|)$. Therefore, for any $\nu > 0$,
$$\sum_{|n| > N} |e_k(n)|^2 \le C(\nu, d) L \exp(-(\gamma - \nu) N).$$
Let $L$ be such that $C(\nu, d) L \exp(-(\gamma - \nu) N) = 1/2$. Then $I(L) \subset J(N)$ and
$$\mathrm{Card}(I(L)) \le \mathrm{Card}(J(N)) \le K (N + 1)^d \le C(\gamma, \nu, d) \log^d L \tag{6.1}$$
for any $L \ge L_0(\gamma, \nu, d)$. The result of the theorem follows directly from (6.1).
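The passage from the counting bound (6.1) to the lower bound of Theorem 6.1 can be spelled out as follows. This is a sketch, not the author's wording: $C_0$ and $L_0$ abbreviate $C(\gamma, \nu, d)$ and $L_0(\gamma, \nu, d)$, "reorder" is read as the increasing rearrangement, and for simplicity all $\eta_k(\gamma)$ are assumed finite.

```latex
Reorder the numbers so that $\eta_1(\gamma) \le \eta_2(\gamma) \le \cdots$.
Then $\{1, \dots, k\} \subset I(\eta_k(\gamma))$, and for every $k$ with
$\eta_k(\gamma) \ge L_0$ the bound (6.1) gives
\[
k \le \operatorname{Card} I(\eta_k(\gamma)) \le C_0 \log^d \eta_k(\gamma)
\quad \Longrightarrow \quad
\eta_k(\gamma) \ge \exp\big( (k / C_0)^{1/d} \big),
\]
which is the claimed bound with $\beta = C_0^{-1/d}$ and $D = 1$.
The remaining indices satisfy $\eta_k(\gamma) < L_0$, hence
$k \le k_0 := C_0 \log^d L_0$. For them one uses the universal lower bound
\[
1 = \sum_n |e_k(n)|^2 \le \eta_k(\gamma) \sum_n e^{-\gamma |n|}
\quad \Longrightarrow \quad
\eta_k(\gamma) \ge c(\gamma, d) := \Big( \sum_n e^{-\gamma |n|} \Big)^{-1} ,
\]
so that $\eta_k(\gamma) \ge c \, e^{-\beta k_0^{1/d}} e^{\beta k^{1/d}}$ for $k \le k_0$.
Taking $D = \min\{ 1, \, c \, e^{-\beta k_0^{1/d}} \}$ covers both cases.
```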

With this result we can obtain a necessary condition for $\mathrm{EDL}(\psi)$ in terms of the projections $\psi_k$ and in terms of the coefficients of the spectral measure of $\psi$. Let $M(\psi)$ be the orthonormal system of eigenfunctions of $H$ defined in Sect. 2 and $R_{\psi,M(\psi)}(n) = \sup_k |\gamma_k e_k(n)|$, where $\gamma_k = \langle \psi, e_k \rangle$, $H e_k = \lambda_k e_k$. The spectral measure of $\psi$ can be written as
$$\mu_\psi = \sum_k a_k \delta_{\lambda_k}.$$

Theorem 6.2. Suppose that
$$\sup_t |\psi(t, n)| \le C \exp(-\alpha |n|) \tag{6.2}$$
for some $\alpha > 0$. Then
$$\sup_k |\gamma_k|^2 \eta_k(2\alpha) < +\infty, \tag{6.3}$$
or, equivalently,
$$R_{\psi,M(\psi)}(n) \le C \exp(-\alpha |n|). \tag{6.4}$$
One can reorder $a_k$ so that
$$a_k \le C \exp(-\beta k^{1/d})$$
with some positive $C$, $\beta$.


Proof. The proof of the first statement is given in [5]. Since
$$\psi(t, n) = \sum_s \exp(-it\lambda_s) \gamma_s e_s(n),$$
then for any $k$, $n$ we have
$$\frac{1}{T} \int_0^T \psi(t, n) \exp(it\lambda_k)\, dt \to \gamma_k e_k(n) \tag{6.5}$$
as $T \to \infty$. The bounds (6.2) and (6.5) yield $|\gamma_k e_k(n)| \le C \exp(-\alpha |n|)$ for any $k$, $n$, which implies (6.4) and (6.3). Next, it follows from (6.3) and Theorem 6.1 that after reordering
$$a_k \equiv |\gamma_k|^2 \le C (\eta_k(2\alpha))^{-1} \le C \exp(-\beta k^{1/d})$$
with some positive $C$, $\beta$.
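The time average (6.5) is what isolates a single term $\gamma_k e_k(n)$ from the sum. A minimal numerical sketch for a finite model is given below; the four Hadamard vectors, the eigenvalues `lam` and the coefficients `gamma` are hypothetical values chosen for the illustration, and the integral is approximated by a midpoint rule.

```python
# Numerical illustration of the time average (6.5) for a finite toy model.
# Assumptions (not from the paper): four sites, the normalized Hadamard vectors
# as eigenfunctions e_s, and hand-picked distinct eigenvalues lam[s].
import cmath

basis = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, -0.5, 0.5],
]
lam = [0.0, 1.0, 2.5, 4.0]       # distinct eigenvalues (hypothetical)
gamma = [0.8, 0.5, -0.3, 0.1]    # coefficients gamma_s = <psi, e_s> (hypothetical)


def psi(t, n):
    """psi(t, n) = sum_s exp(-i t lam_s) gamma_s e_s(n)."""
    return sum(cmath.exp(-1j * t * lam[s]) * gamma[s] * basis[s][n]
               for s in range(4))


def time_average(k, n, T, steps):
    """Midpoint-rule approximation of (1/T) * int_0^T psi(t, n) exp(i t lam_k) dt."""
    dt = T / steps
    total = 0j
    for j in range(steps):
        t = (j + 0.5) * dt
        total += psi(t, n) * cmath.exp(1j * t * lam[k])
    return total * dt / T


k, n = 1, 2
approx = time_average(k, n, T=400.0, steps=40000)
exact = gamma[k] * basis[k][n]
print(abs(approx - exact))  # bounded by a constant times 1/T
```

The off-diagonal terms $s \ne k$ contribute $\gamma_s e_s(n) \big(e^{i(\lambda_k - \lambda_s)T} - 1\big) / \big(i(\lambda_k - \lambda_s)T\big)$, which is why distinct eigenvalues are needed for the limit.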

In the following statement we shall use the same notation as in Theorem 4.2. As usual, $M = \{e_k\}$ is any orthonormal system of eigenfunctions of $H$. Moreover, for $\delta \ge 0$ we define
$$R_{\psi,M}(n, \delta) = \sup_k \frac{|\gamma_k e_k(n)|}{\eta_k(\delta)}, \tag{6.6}$$
supposing that $\eta_k(\delta) < +\infty$ for any $k$ (it is always true for $\delta = 0$ because $\eta_k(0) = 1$).

Theorem 6.3. Let $\psi \in \mathcal{H}_M$. The following statements hold with universal constants $C$:

1. If $R_{\psi,M}(n, 0) \le C \exp(-\alpha |n|)$ for some $\alpha > 0$, then
$$\sup_t |\psi(t, n)| \le C(\alpha, d) (|n|^d + 1) \exp(-\alpha |n|).$$
2. Suppose that $\eta_k(\delta) < +\infty$ for some $\delta > 0$ and any $k$, and $R_{\psi,M}(n, \delta) \le C \exp(-\alpha |n|)$, where $\alpha > \delta$. Then for any $\nu$ with $0 < \nu < \alpha - \delta$,
$$\sup_t |\psi(t, n)| \le C(\nu, \alpha, d) \exp(-\nu |n|).$$

In particular, in both cases $\mathrm{EDL}(\psi)$ holds.


Proof. Since
$$\sup_k |\gamma_k e_k(n)| \le C \exp(-\alpha |n|),$$
we obtain
$$\sup_k \big( |\gamma_k|^2 \eta_k(2\alpha) \big) < +\infty.$$
The result of Theorem 6.1 yields after reordering $|\gamma_k| \le C \exp(-\beta k^{1/d})$ with some $C, \beta > 0$. Now $\psi(t, n)$ can be estimated in the usual way. For any $n \in \mathbb{Z}^d$ and $B > 0$,
$$|\psi(t, n)| \le \sum_k |\gamma_k e_k(n)| \le \sum_{k \le B|n|^d} C \exp(-\alpha |n|) + \sum_{k > B|n|^d} |\gamma_k| \le C B |n|^d \exp(-\alpha |n|) + K \exp\big( -(\beta/2) (B |n|^d)^{1/d} \big). \tag{6.7}$$
Taking in (6.7) $B$ so that $(\beta/2) B^{1/d} = \alpha$, we obtain the first statement of the theorem.

To prove the second statement of the theorem we shall need a bound relating $\eta_k(\alpha)$ and $\eta_k(\nu)$ for $\nu \le \alpha$. It follows from the definition of $\eta_k(\alpha)$ that $|e_k(n)|^2 \le \eta_k(\alpha) \exp(-\alpha |n|)$ for any $k$, $n$. At the same time $|e_k(n)|^2 \le 1$. Therefore,
$$|e_k(n)|^2 \le \min\{ 1, \eta_k(\alpha) \exp(-\alpha |n|) \}. \tag{6.8}$$
We shall use the elementary inequality
$$\min\{1, z\} \le z^s, \qquad z \ge 0,\ 0 < s < 1. \tag{6.9}$$
The bounds (6.8) and (6.9) with $s = \nu/\alpha$ yield $|e_k(n)|^2 \le \eta_k^s(\alpha) \exp(-\nu |n|)$, and finally
$$\eta_k(\nu) \le (\eta_k(\alpha))^{\nu/\alpha}. \tag{6.10}$$
This bound is very similar to the bound (4.10) for the moments $d_k(r)$. Now we can end the proof. For any $k$, $n$, $|\gamma_k e_k(n)| \le C \eta_k(\delta) \exp(-\alpha |n|)$. Therefore,
$$|\gamma_k| \sqrt{\eta_k(2\alpha)} \le C \eta_k(\delta). \tag{6.11}$$
Next, as
$$|\psi(t, n)| \le \sum_k |\gamma_k e_k(n)|,$$


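one has
$$A \equiv \sup_n \exp(\nu |n|) \sup_t |\psi(t, n)| \le \sum_k |\gamma_k| \sqrt{\eta_k(2\nu)}. \tag{6.12}$$
The bounds (6.11) and (6.12) imply
$$A \le C \sum_k \eta_k(\delta)\, \eta_k^{1/2}(2\nu)\, \eta_k^{-1/2}(2\alpha). \tag{6.13}$$
Using twice the bound (6.10), we obtain from (6.13):
$$A \le C \sum_k (\eta_k(2\alpha))^{(\delta + \nu - \alpha)/(2\alpha)}.$$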

As $\delta + \nu < \alpha$, by Theorem 6.1 the sum converges and is bounded by some universal constant, so the second statement of the theorem holds.

The result of the theorem can be used to give sufficient conditions for exponential dynamical localization on the interval of energies $I$. Consider the set of exponentially decaying functions in $\mathcal{H}$:
$$\mathcal{C} = \{ \psi \mid \exists r > 0 : |\psi(n)| \le C \exp(-r |n|) \}.$$
Clearly, $\mathcal{B} \subset \mathcal{C} \subset \mathcal{A}$, where $\mathcal{A}$ and $\mathcal{B}$ were defined in the previous section.

Definition 6.4. The operator $H$ has exponential dynamical localization on $I$ if for any $\psi \in \mathcal{C}$ we have $\mathrm{EDL}(P_I(H)\psi)$.

Using the result of Theorem 6.3, one can give sufficient conditions for EDL on the interval $I$. For the sake of simplicity, we restrict ourselves to the first statement of this theorem. One could, however, if necessary, give also a more general sufficient condition based on the second statement of Theorem 6.3. As in the previous section, $M = \{e_k\}$ is some orthonormal system of eigenfunctions of $H$ complete in $\mathcal{H}(I)$ and
$$F_0(n, m) = \sup_k |e_k(n) e_k(m)|, \qquad G_0(n, m) = \sup_{g \in L} |g(n) g(m)|, \qquad F_0(n, m) \le G_0(n, m).$$

Theorem 6.5. Let $N$ be one of the two functions $F_0$, $G_0$. Let $\psi \in \mathcal{C}$ so that $|\psi(m)| \le C \exp(-r |m|)$ for some $r > 0$. As usual, let $\varphi = P_I(H)\psi$, $\varphi(t) = \exp(-itH)\varphi$. The following statements hold:


1. Suppose that there exist $\alpha > 0$, $\beta > 0$ such that $N(n, m) \le C \exp(-\alpha |n| + \beta |m|)$. Then
$$\sup_t |\varphi(t, n)| \le C(\gamma) \exp(-\gamma |n|),$$
where $0 < \gamma < \min\{\alpha, \alpha r / \beta\}$.
2. Suppose that there exist $\alpha > 0$, $\beta > 0$ such that $N(n, m) \le C \exp(-\alpha |n - m| + \beta |m|)$. Then
$$\sup_t |\varphi(t, n)| \le C(\gamma) \exp(-\gamma |n|), \tag{6.14}$$
where $0 < \gamma < \min\{\alpha, \alpha r / (\alpha + \beta)\}$.

In particular, in both cases EDL holds on $I$.

Proof. We shall give it for the second statement of the theorem; the first is similar. As in the proof of Lemma 5.2, we have the bound
$$R_{\varphi,M}(n, 0) \le \sum_m N(n, m) |\psi(m)|.$$
Since $N(n, m) \le 1$ for any $n$, $m$,
$$N(n, m) \le N^s(n, m) \le C^s \exp(-s\alpha |n - m| + s\beta |m|)$$
for all $s \in [0, 1]$ (the argument is similar to (6.8)–(6.9)). Therefore,
$$R_{\varphi,M}(n, 0) \le C(s) \sum_m \exp(-s\alpha |n - m| - (r - s\beta) |m|).$$
If $r \ge \alpha + \beta$, we take $s = 1$, and for $r < \alpha + \beta$, we take $s = r / (\alpha + \beta)$. In both cases we obtain the bound
$$R_{\varphi,M}(n, 0) \le C(\gamma) \exp(-\gamma |n|) \tag{6.15}$$
for all $0 < \gamma < \min\{\alpha, \alpha r / (\alpha + \beta)\}$. The bound (6.14) follows directly from (6.15) and the first statement of Theorem 6.3.

As an example where this theorem can be directly applied, consider operators with SULE on $I$. Namely, assume that there exists an orthonormal system $M = \{e_k\}$ of eigenfunctions of $H$ complete in $\mathcal{H}(I)$ such that for some $n_k \in \mathbb{Z}^d$ and any $\delta > 0$,
$$|e_k(n)| \le C(\delta) \exp(\delta |n_k| - \alpha |n - n_k|), \tag{6.16}$$

where $\alpha > 0$ and the constants $C(\delta)$ are uniform in $k$, $n$. It follows from (6.16) that
$$|e_k(n) e_k(m)| \le C^2(\delta) \exp\big( 2\delta |n_k| - \alpha (|n - n_k| + |m - n_k|) \big).$$
Using the elementary inequalities
$$|n_k| \le |m| + |m - n_k|, \qquad |n - n_k| + |m - n_k| \ge |m - n|,$$


one can easily show that
$$F_0(n, m) = \sup_k |e_k(n) e_k(m)| \le C^2(\delta) \exp\big( 2\delta |m| - (\alpha - 2\delta) |n - m| \big)$$
for any $\delta > 0$. The second statement of Theorem 6.5 implies EDL on $I$. Moreover, if $|\psi(m)| \le C \exp(-r |m|)$, then
$$\sup_t |\varphi(t, n)| \le C(\gamma) \exp(-\gamma |n|),$$
where $0 < \gamma < \min\{\alpha, r\}$.

7. Adaptation to the Continuous Case

Most of the results of the previous sections remain valid in the case of $L^2(\mathbb{R}^d)$ provided the result of Theorem 2.2 holds. However, one cannot expect Theorem 2.2 to be true in the continuous case in such generality. For example, in the case of $L^2(\mathbb{R})$, define the moments
$$d_k(p) = \int_{\mathbb{R}} |e_k(x)|^2 (|x| + 1)^p\, dx.$$

It is sufficient to take any orthonormal system $\{e_k(x)\}$ in $L^2([-1, 1])$ and to put $e_k(x) = 0$ for $|x| > 1$. For such a system $d_k(p) \le 2^p$ for any $k$. However, if the functions $e_k(x)$ do not oscillate fast, the same phenomenon of "repulsion" of eigenfunctions occurs and one can show a result similar to that of Theorem 2.2. The main result of this section is the following.

Theorem 7.1. Let $\{e_k\}$, $k \in \mathbb{N}$, be an orthonormal system in $L^2(\mathbb{R}^d)$ such that
$$\lim_{R \to +\infty} \sup_k \int_{|u| > R} |\hat e_k(u)|^2\, du = 0, \tag{7.1}$$
where $\hat e_k$ is the Fourier transform of $e_k$. Then for any $p > 0$ one can reorder the moments
$$d_k(p) = \int_{\mathbb{R}^d} (|x| + 1)^p |e_k(x)|^2\, dx$$
so that
$$d_k(p) \ge D(p, d) k^{p/d}$$
with some positive constants $D(p, d)$ depending on the system $\{e_k\}$.

Proof. To prove the theorem, we shall discretize the problem and use the same technical Lemma 2.1 as in the discrete case. For any $n = (n_1, \dots, n_d) \in \mathbb{Z}^d$, $\varepsilon > 0$ define the cube of size $\varepsilon$ in $\mathbb{R}^d$:
$$K_n(\varepsilon) = \{ x = (x_1, \dots, x_d) \in \mathbb{R}^d \mid x_j \in [\varepsilon n_j, \varepsilon (n_j + 1)),\ j = 1, \dots, d \}.$$
It is clear that the cubes are disjoint and $\mathbb{R}^d = \cup_n K_n(\varepsilon)$. Let $x \in K_n(\varepsilon)$. Then
$$C_1 (|n| + 1) \le |x| + 1 \le C_2 (|n| + 1)$$
with some constants $C_1(\varepsilon)$, $C_2(\varepsilon)$. As
$$d_k(p) = \int_{\mathbb{R}^d} (|x| + 1)^p |e_k(x)|^2\, dx = \sum_{n \in \mathbb{Z}^d} \int_{K_n(\varepsilon)} (|x| + 1)^p |e_k(x)|^2\, dx,$$


we obtain that
$$C_1^p(\varepsilon) w_k(p) \le d_k(p) \le C_2^p(\varepsilon) w_k(p), \tag{7.2}$$
where
$$w_k(p) = \sum_n (|n| + 1)^p \int_{K_n(\varepsilon)} |e_k(x)|^2\, dx \equiv \sum_n (|n| + 1)^p |g_k(n)|^2.$$
One could try to apply Lemma 2.1 taking $a_{kn} = |g_k(n)|^2$. It is obvious that $\sum_n |g_k(n)|^2 = \|e_k\|^2 = 1$, so the condition (2.2) is satisfied. However, it is not clear whether $\sum_k |g_k(n)|^2 \le A < +\infty$. To avoid this problem, we shall consider instead the quantities
$$h_k(n) = \int_{K_n(\varepsilon)} e_k(x)\, dx.$$
By the Cauchy–Schwarz inequality,
$$|h_k(n)|^2 \le \varepsilon^d |g_k(n)|^2. \tag{7.3}$$
Therefore, (7.2) implies
$$d_k(p) \ge \varepsilon^{-d} C_1^p(\varepsilon) \sum_n (|n| + 1)^p |h_k(n)|^2, \tag{7.4}$$

and to prove the theorem it is sufficient to show that the numbers $a_{kn} = |h_k(n)|^2$ verify the conditions of Lemma 2.1 for some $\varepsilon > 0$. One can represent $h_k(n)$ as $h_k(n) = \langle e_k, \eta_n \rangle_{L^2(\mathbb{R}^d)}$, where $\eta_n$ is the characteristic function of $K_n(\varepsilon)$. Since the system $\{e_k\}$ is orthonormal,
$$\sum_k |h_k(n)|^2 \le \|\eta_n\|^2 = \varepsilon^d,$$
so (2.1) holds with $A = \varepsilon^d$. To prove (2.2) is more difficult. We shall show that the numbers $\varepsilon^{-d} |h_k(n)|^2$ are close to $|g_k(n)|^2$ for $\varepsilon$ small enough if the condition (7.1) is satisfied. Using $\sum_n |g_k(n)|^2 = 1$, we shall prove (2.2) for some $B(\varepsilon) > 0$ if $\varepsilon$ is small enough. We need the following technical result.

Lemma 7.2. Let $\psi \in L^2(\mathbb{R}^d)$, $\|\psi\| = 1$. For any $n \in \mathbb{Z}^d$, $\varepsilon > 0$ define
$$\Delta_n(\varepsilon) = \int_{K_n(\varepsilon)} |\psi(x)|^2\, dx - \varepsilon^{-d} \Big| \int_{K_n(\varepsilon)} \psi(x)\, dx \Big|^2$$
($\Delta_n(\varepsilon) \ge 0$ by the Cauchy–Schwarz inequality). There exists some universal constant $C(d)$ such that for any $\varepsilon > 0$, $R > 0$,
$$0 \le \sum_{n \in \mathbb{Z}^d} \Delta_n(\varepsilon) \le C(d) \Big( R^2 \varepsilon^2 + \int_{|u| > R} |\hat\psi(u)|^2\, du \Big)^{1/2}.$$


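Proof. One can represent $\Delta_n(\varepsilon)$ as
$$\Delta_n(\varepsilon) = \varepsilon^{-d} \int_{K_n(\varepsilon)} dx\, \overline{\psi(x)} \int_{K_n(\varepsilon)} dy\, (\psi(x) - \psi(y)). \tag{7.5}$$
Applying twice the Cauchy–Schwarz inequality (to the integral over $y$ and to the integral over $x$), we obtain from (7.5):
$$\Delta_n^2(\varepsilon) \le \varepsilon^{-d} \int_{K_n(\varepsilon)} dx\, |\psi(x)|^2 \int_{K_n(\varepsilon)} \int_{K_n(\varepsilon)} dx\, dy\, |\psi(x) - \psi(y)|^2.$$
Therefore,
$$\sum_n \Delta_n(\varepsilon) \le \varepsilon^{-d/2} \Big( \sum_n \int_{K_n(\varepsilon)} dx\, |\psi(x)|^2 \Big)^{1/2} \cdot \Big( \sum_n \int_{K_n(\varepsilon)} \int_{K_n(\varepsilon)} dx\, dy\, |\psi(x) - \psi(y)|^2 \Big)^{1/2} = \varepsilon^{-d/2} \Big( \sum_n \int_{K_n(\varepsilon)} \int_{K_n(\varepsilon)} dx\, dy\, |\psi(x) - \psi(y)|^2 \Big)^{1/2}. \tag{7.6}$$
One can now observe that $|x - y| \le \varepsilon \sqrt{d}$ for any $x, y \in K_n(\varepsilon)$. Therefore,
$$\sum_n \int_{K_n(\varepsilon)} \int_{K_n(\varepsilon)} dx\, dy\, |\psi(x) - \psi(y)|^2 \le \sum_n \int_{K_n(\varepsilon)} dx \int_{\mathbb{R}^d} dy\, |\psi(x) - \psi(y)|^2 F(|x - y| \le \varepsilon \sqrt{d}) = \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} dx\, dy\, |\psi(x) - \psi(y)|^2 F(|x - y| \le \varepsilon \sqrt{d}), \tag{7.7}$$
where $F$ is the characteristic function of the set $\{(x, y) \mid |x - y| \le \varepsilon \sqrt{d}\}$. The bounds (7.6)–(7.7) imply
$$\sum_n \Delta_n(\varepsilon) \le \varepsilon^{-d/2} L^{1/2}(\delta), \tag{7.8}$$
where $L(\delta) = \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} dx\, dy\, |\psi(x) - \psi(y)|^2 F(|x - y| \le \delta)$ and $\delta = \varepsilon \sqrt{d}$. Changing the variable $z = y - x$ in the integral, we obtain in Fourier representation
$$L(\delta) = \int_{|z| \le \delta} dz \int_{\mathbb{R}^d} du\, |\hat\psi(u)|^2 |1 - e^{i \langle z, u \rangle}|^2. \tag{7.9}$$
Let $R > 0$. The integral over $u$ can be written as
$$\int_{\mathbb{R}^d} du\, |\hat\psi(u)|^2 |1 - e^{i \langle z, u \rangle}|^2 = I_1(z) + I_2(z),$$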


where in $I_1(z)$ and $I_2(z)$ one integrates over $u : |u| \le R$ and over $u : |u| > R$ respectively. Using the elementary bound $|e^{iw} - 1| \le C |w|$, $w \in \mathbb{R}$, we estimate
$$I_1(z) \le C |z|^2 \int_{|u| \le R} du\, |u|^2 |\hat\psi(u)|^2 \le C |z|^2 R^2 \|\psi\|^2 = C |z|^2 R^2. \tag{7.10}$$
As to $I_2(z)$, trivially
$$I_2(z) \le 4 \int_{|u| > R} du\, |\hat\psi(u)|^2. \tag{7.11}$$
The bounds (7.9)–(7.11) imply
$$L(\delta) \le C R^2 \delta^{d+2} + C \delta^d \int_{|u| > R} du\, |\hat\psi(u)|^2. \tag{7.12}$$
Finally, (7.8) and (7.12) yield the statement of the lemma.
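The last step of the proof of Lemma 7.2 combines (7.8) and (7.12) with $\delta = \varepsilon \sqrt{d}$. A worked version is sketched below; the explicit value proposed for $C(d)$ is one possible choice and is not stated in the paper.

```latex
\begin{align*}
\sum_n \Delta_n(\varepsilon)
  &\le \varepsilon^{-d/2} L^{1/2}(\delta)
   \le \varepsilon^{-d/2} \Big( C R^2 \delta^{d+2}
        + C \delta^{d} \int_{|u|>R} |\hat\psi(u)|^2 \, du \Big)^{1/2} \\
  &= \varepsilon^{-d/2} \, \delta^{d/2} \, C^{1/2}
     \Big( R^2 \delta^2 + \int_{|u|>R} |\hat\psi(u)|^2 \, du \Big)^{1/2} \\
  &= d^{d/4} \, C^{1/2}
     \Big( d \, R^2 \varepsilon^2 + \int_{|u|>R} |\hat\psi(u)|^2 \, du \Big)^{1/2}
   \le C(d) \Big( R^2 \varepsilon^2 + \int_{|u|>R} |\hat\psi(u)|^2 \, du \Big)^{1/2} ,
\end{align*}
% where one may take, for instance, C(d) = d^{d/4 + 1/2} C^{1/2}, using d >= 1.
```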

Now we can finish the proof of the theorem. Let $\{e_k\}$ be an orthonormal system verifying (7.1). The bound of Lemma 7.2 applied to $e_k$ yields
$$\sum_n \big( |g_k(n)|^2 - \varepsilon^{-d} |h_k(n)|^2 \big) \le C(d) \Big( R^2 \varepsilon^2 + \int_{|u| > R} du\, |\hat e_k(u)|^2 \Big)^{1/2}. \tag{7.13}$$
Using the condition (7.1), it is easy to see that one can choose $R > 0$ big enough and $\varepsilon > 0$ small enough so that the r.h.s. of (7.13) is smaller than $1/2$ for any $k \in \mathbb{N}$. As $\sum_n |g_k(n)|^2 = 1$ for any $k$, (7.13) yields for such $\varepsilon$:
$$\sum_n |h_k(n)|^2 \ge \varepsilon^d / 2$$
and (2.2) holds for $a_{kn} = |h_k(n)|^2$ with $B = \varepsilon^d / 2$. The proof of the theorem is completed.

Remark. One can note that the choice of $\varepsilon$ depends on the system $\{e_k\}$, so, unlike the discrete case, the constants $D(p, d)$ are not necessarily universal in the continuous case.

An important example where the condition (7.1) is satisfied is given by the following

Theorem 7.3. Let $H = -\Delta + V(x)$ be an operator in $L^2(\mathbb{R}^d)$ self-adjoint on $H^2(\mathbb{R}^d)$, where $V(x)$ is a real function bounded from below: $V(x) \ge -M$ for a.e. $x \in \mathbb{R}^d$. Let $K \in \mathbb{R}$ and $\{e_k\}$ be any orthonormal family of eigenfunctions of $H$ with eigenvalues $\lambda_k \le K$ for all $k$. Then for any $p > 0$ one can reorder the moments $d_k(p)$ so that
$$d_k(p) \ge D(p, d, K + M) k^{p/d}$$
with a universal positive constant $D$ depending only on $p$, $d$ and $K + M$.


Proof. For any $k \in \mathbb{N}$ we have $H e_k(x) = -\Delta e_k(x) + V(x) e_k(x) = \lambda_k e_k(x)$. Therefore,
$$\langle -\Delta e_k, e_k \rangle = \int_{\mathbb{R}^d} dx\, (\lambda_k - V(x)) |e_k(x)|^2 \le (K + M) \|e_k\|^2 = K + M. \tag{7.14}$$
On the other hand,
$$\langle -\Delta e_k, e_k \rangle = \int_{\mathbb{R}^d} du\, |u|^2 |\hat e_k(u)|^2 \ge R^2 \int_{|u| > R} du\, |\hat e_k(u)|^2 \tag{7.15}$$
for any $R > 0$. The bounds (7.14)–(7.15) imply
$$\sup_k \int_{|u| > R} du\, |\hat e_k(u)|^2 \le (K + M) / R^2,$$
so (7.1) is satisfied. Moreover, it is clear from the proof of Theorem 7.1 that one can choose $R > 0$ and $\varepsilon > 0$ depending only on $d$ and $K + M$ so that the r.h.s. of (7.13) is smaller than $1/2$ for any $k$. That means that the constants $A = \varepsilon^d$ and $B = \varepsilon^d / 2$ in Lemma 2.1 depend only on $d$ and $K + M$ but not on the choice of the system $\{e_k\}$. This gives us the result of the theorem.

All the results of Sect. 3 hold if the orthonormal system $M(\psi)$ satisfies the condition (7.1). The proof of Theorem 3.3 is essentially the same (one considers $\int_{|x| \le N} dx$ instead of $\sum_{|n| \le N}$ in the proof). The proofs of Theorem 3.4 and Corollary 3.5 do not change. The results of Lemma 3.6 and Theorem 3.8 hold with the function $R_{\psi,M}(n)$ defined as follows:
$$R_{\psi,M}(n) = \sup_k |\gamma_k g_k(n)|, \qquad \gamma_k = \langle \psi, e_k \rangle, \qquad n \in \mathbb{Z}^d,$$
where $|g_k(n)|^2 \equiv \int_{K_n(1)} dx\, |e_k(x)|^2$. The sufficient conditions for $\mathrm{DL}(\psi, p)$ and $\mathrm{DL}(\psi)$ in the continuous case are based on the following version of Lemma 4.1:
$$\langle |X|^p \rangle_\psi(t) \le C \sum_{n \in \mathbb{Z}^d} (|n| + 1)^p \int_{K_n(1)} |\psi(t, x)|^2\, dx \le C \Big( \sum_k \sqrt{a_k w_k(p)} \Big)^2, \tag{7.16}$$
where $w_k(p) = \sum_n (|n| + 1)^p |g_k(n)|^2$. The numbers $w_k(p)$ are equivalent to the moments $d_k(p)$ due to (7.2), so the lower bounds $w_k(p) \ge D k^{p/d}$ hold. A result similar to Theorem 4.2 can be easily obtained. One should only replace $R(n, \alpha)$ by $\sup_k \frac{|\gamma_k g_k(n)|}{w_k(\alpha)}$, $\sup_t |\psi(t, n)|$ by $\sum_k |\gamma_k g_k(n)|$, $d_k(p)$ by $w_k(p)$, and $e_k(n)$ by $g_k(n)$. The only difference is the following: one does not have the bound $\sum_k |g_k(n)|^2 \le 1$ which was valid for $e_k(n)$ in the discrete case. Therefore the bounds in Statements 1 and 2 of the theorem that one can prove are slightly weaker than in the discrete case. Statement 3 of the theorem and the result of Corollary 4.3 remain true. For the sake of completeness, let us give a direct proof of the third statement of Theorem 4.2 in the continuous case (this proof is valid also in the discrete case). For simplicity we shall suppose that $\alpha = 0$.


Theorem 7.4. Let $M = \{e_k\}$ be some orthonormal system of eigenfunctions of $H$ in $L^2(\mathbb{R}^d)$ verifying the conditions of Theorem 7.1. For a given vector $\psi \in \mathcal{H}_M$ consider the function $R(n) = \sup_k |\gamma_k g_k(n)|$, where $|g_k(n)|^2 = \int_{K_n(1)} |e_k(x)|^2\, dx \le 1$, $n \in \mathbb{Z}^d$ and $\gamma_k = \langle \psi, e_k \rangle$. If the function $R(n)$ is fast decaying, then $\mathrm{DL}(\psi)$ holds.

Proof. As the function $R(n)$ is fast decaying, for any $r > 0$,
$$|\gamma_k|^2 w_k(r) = \sum_n (|n| + 1)^r |\gamma_k g_k(n)|^2 \le \sum_n (|n| + 1)^r R^2(n) \le C(r) < +\infty.$$
On the other hand, $w_k(r) \ge D k^{r/d}$ after reordering. Therefore,
$$|\gamma_k| \le C(m) k^{-m} \quad \text{for any } m > 0. \tag{7.17}$$
Next, as
$$|\psi(t, x)| \le \sum_{k=1}^{\infty} |\gamma_k e_k(x)|,$$
for any $n \in \mathbb{Z}^d$ the bound holds:
$$\int_{K_n(1)} |\psi(t, x)|^2\, dx \le \Big( \sum_{k=1}^{\infty} |\gamma_k g_k(n)| \Big)^2 \equiv S^2(n). \tag{7.18}$$
Reorder the terms in the sum so that (7.17) holds. Then
$$S(n) = \Big( \sum_{k=1}^{|n|^2} + \sum_{k=|n|^2+1}^{\infty} \Big) |\gamma_k g_k(n)| \le |n|^2 R(n) + \sum_{k=|n|^2+1}^{\infty} |\gamma_k| \le |n|^2 R(n) + C(m) (|n|^2 + 1)^{1-m} \tag{7.19}$$
for any $m > 0$. The bounds (7.16), (7.18) and (7.19) yield $\mathrm{DL}(\psi, p)$ for all $p > 0$. The proof is completed.

Most of the results of Sect. 5 can be adapted to the continuous case. It is sufficient to take $g_k(n)$ instead of $e_k(n)$ and $\big( \int_{K_n(1)} |\psi(x)|^2\, dx \big)^{1/2}$ instead of $\psi(n)$. The results of Theorem 5.3 and Theorem 5.4 are true if the system $M$ complete in $\mathcal{H}(I)$ satisfies the conditions of Theorem 7.1. In particular, this is the case if $H = -\Delta + V(x)$ with $V(x)$ bounded from below and $I = (-\infty, K]$. A result similar to that of Theorem 5.5 can be proved in the case $H(\theta) = -\Delta + V(x, \theta)$, where $V(x, \theta) \ge -M$ for $\mu$-a.e. $\theta$ and a.e. $x$. The constants in the bounds will depend on $\varepsilon$, $p$, $d$ and $K + M$. The main results of Sect. 6 can also be generalized to the continuous case in a similar way.

Acknowledgements. I thank F. Germinet for stimulating discussions on the subject of the paper.


References

1. Aizenman, M.: Localization at weak disorder: Some elementary bounds. Rev. Math. Phys. 6, 1163–1182 (1994)
2. Aizenman, M., Schenker, J.H., Friedrich, R.M., Hundertmark, D.: Finite-volume fractional-moment criteria for Anderson localization. To appear in Commun. Math. Phys.
3. Damanik, D. and Stollman, P.: Multi-scale analysis implies strong dynamical localization. Preprint (1999)
4. De Bièvre, S. and Germinet, F.: Dynamical localization for the random dimer Schrödinger operator. J. Stat. Phys. 98, 1135–1148 (2000)
5. Del Rio, R., Jitomirskaya, S., Last, Y. and Simon, B.: Operators with singular continuous spectrum IV: Hausdorff dimensions, rank one perturbation and localization. J. d’Analyse Math. 69, 153–200 (1996)
6. Germinet, F. and De Bièvre, S.: Dynamical localization for discrete and continuous random Schrödinger operators. Commun. Math. Phys. 194, 323–341 (1998)
7. Germinet, F.: Dynamical localization II with an application to the almost Mathieu operator. J. Stat. Phys. 95, 273–286 (1999)
8. Germinet, F. and Jitomirskaya, S.: Strong dynamical localization for the almost Mathieu model. Preprint (2000)

Communicated by B. Simon

Commun. Math. Phys. 221, 57 – 76 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Conformal and Quasiconformal Realizations of Exceptional Lie Groups

M. Günaydin$^1$, K. Koepsell$^2$, H. Nicolai$^2$

$^1$ CERN, Theory Division, 1211 Geneva 23, Switzerland. E-mail: [email protected]
$^2$ Max-Planck-Institut für Gravitationsphysik, Albert-Einstein-Institut, Mühlenberg 1, 14476 Potsdam, Germany. E-mail: [email protected]; [email protected]

Received: 12 August 2000 / Accepted: 2 March 2001

Abstract: We present a nonlinear realization of E8(8) on a space of 57 dimensions, which is quasiconformal in the sense that it leaves invariant a suitably defined “light cone” in R 57 . This realization, which is related to the Freudenthal triple system associated with the unique exceptional Jordan algebra over the split octonions, contains previous conformal realizations of the lower rank exceptional Lie groups on generalized space times, and in particular a conformal realization of E7(7) on R 27 which we exhibit explicitly. Possible applications of our results to supergravity and M-Theory are briefly mentioned.

1. Introduction It is an old idea to define generalized space-times by association with Jordan algebras J , in such a way that the space-time is coordinatized by the elements of J , and that its rotation, Lorentz, and conformal group can be identified with the automorphism, reduced structure, and the linear fractional group of J , respectively [11–13]. The aesthetic appeal of this idea rests to a large extent on the fact that key ingredients for formulating relativistic quantum field theories over four dimensional Minkowski space extend naturally to these generalized space times; in particular, the well-known connection between the positive energy unitary representations of the four dimensional conformal group SU (2, 2) and the covariant fields transforming in finite dimensional representations of the Lorentz group SL(2, C) [29, 28] extends to all generalized space-times defined by Jordan algebras [16]. The appearance of exceptional Lie groups and algebras in extended supergravities and their relevance to understanding the non-perturbative regime of string theory have provided new impetus; indeed, possible applications to string and M-Theory constitute the main motivation for the present investigation. This work was supported in part by the NATO collaborative research grant CRG. 960188.

Work supported in part by the National Science Foundation under grant number PHY-9802510.

Permanent address: Physics Department, Penn State University, University Park, PA 16802, USA.


M. Günaydin, K. Koepsell, H. Nicolai

In this paper, we will present a novel construction involving the maximally extended Lie group E8(8). This construction of E8(8), together with the corresponding construction of E8(−24), contains all previous examples of generalized space-times based on exceptional Lie groups, and at the same time goes beyond the framework of Jordan algebras. More precisely, we show that there exists a quasiconformal nonlinear realization of E8(8) on a space of 57 dimensions¹. This space may be viewed as the quotient of E8(8) by its maximal parabolic subgroup [18, 19]; there is no Jordan algebra directly associated with it, but it can be related to a certain Freudenthal triple system which itself is associated with the “split” exceptional Jordan algebra $J_3^{\mathbb{O}_S}$, where $\mathbb{O}_S$ denotes the split real form of the octonions $\mathbb{O}$. It furthermore admits an E7(7) invariant norm form $N_4$, which gets multiplied by a (coordinate dependent) factor under the nonlinearly realized “special conformal” transformations. Therefore the light cone, defined by the condition $N_4 = 0$, is actually invariant under the full E8(8), which thus plays the role of a generalized conformal group. By truncation we obtain quasiconformal realizations of other exceptional Lie groups. Furthermore, we recover previous conformal realizations of the lower rank exceptional groups (some of which correspond to Jordan algebras). In particular, we give a completely explicit conformal Möbius-like nonlinear realization of E7(7) on the 27-dimensional space associated with the exceptional Jordan algebra $J_3^{\mathbb{O}_S}$, with linearly realized subgroups F4(4) (the “rotation group”) and E6(6) (the “Lorentz group”). Although in part this result is implicitly contained in the existing literature on Jordan algebras, the relevant transformations have not been exhibited explicitly so far, and are here presented in the basis that is also used in maximal supergravity theories.
The basic concepts are best illustrated in terms of a simple and familiar example, namely the conformal group in four dimensions [29], and its realization via the Jordan algebra $J_2^{\mathbb{C}}$ of hermitian 2 × 2 matrices with the hermiticity preserving commutative (but non-associative) product

$$ a \circ b := \tfrac{1}{2}(ab + ba) \tag{1} $$

(basic properties of Jordan algebras are summarized in Appendix A). As is well known, these matrices are in one-to-one correspondence with four-vectors $x^\mu$ in Minkowski space via the formula $x \equiv x_\mu \sigma^\mu$, where $\sigma^\mu := (1, \vec{\sigma})$. The “norm form” on this algebra is just the ordinary determinant, i.e.

$$ N_2(x) := \det x = x_\mu x^\mu \tag{2} $$

(it will be a higher order polynomial in the general case). Defining $\bar{x} := x_\mu \bar{\sigma}^\mu$ (where $\bar{\sigma}^\mu := (1, -\vec{\sigma})$) we introduce the Jordan triple product on $J_2^{\mathbb{C}}$:

$$ \begin{aligned} \{a\,b\,c\} &:= (a \circ \bar{b}) \circ c + (c \circ \bar{b}) \circ a - (a \circ c) \circ \bar{b} \\ &= \tfrac{1}{2}(a\bar{b}c + c\bar{b}a) = \langle a, b\rangle\, c + \langle c, b\rangle\, a - \langle a, c\rangle\, b \end{aligned} \tag{3} $$

with the standard Lorentz invariant bilinear form $\langle a, b\rangle := a_\mu b^\mu$. However, it is not generally true that the Jordan triple product can be thus expressed in terms of a bilinear form. The automorphism group of $J_2^{\mathbb{C}}$, which is by definition compatible with the Jordan product, is just the rotation group SU(2); the structure group, defined as the invariance

¹ A nonlinear realization will be referred to as “quasiconformal” if it is based on a five graded decomposition of the underlying Lie algebra (as for E8(8)); it will be called “conformal” if it is based on a three graded decomposition (as e.g. for E7(7)).

Conformal Realizations of Exceptional Lie Groups


of the norm form up to a constant factor, is the product SL(2, C) × D, i.e. the Lorentz group and dilatations. The conformal group associated with $J_2^{\mathbb{C}}$ is the group leaving invariant the light-cone $N_2(x) = 0$. As is well known, the associated Lie algebra is su(2, 2), and possesses a three-graded structure

$$ \mathfrak{g} = \mathfrak{g}^{-1} \oplus \mathfrak{g}^{0} \oplus \mathfrak{g}^{+1}, \tag{4} $$

where the grade −1 and grade +1 spaces correspond to generators of translations $P^\mu$ and special conformal transformations $K^\mu$, respectively, while the grade 0 subspace is spanned by the Lorentz generators $M^{\mu\nu}$ and the dilatation generator D. The subspaces $\mathfrak{g}^{+1}$ and $\mathfrak{g}^{-1}$ can each be associated with the Jordan algebra $J_2^{\mathbb{C}}$, such that their elements are labeled by elements $a = a_\mu \sigma^\mu$ of $J_2^{\mathbb{C}}$. The precise correspondence is

$$ U_a := a_\mu P^\mu \in \mathfrak{g}^{-1} \quad \text{and} \quad \tilde{U}_a := a_\mu K^\mu \in \mathfrak{g}^{+1}. \tag{5} $$

By contrast, the generators in $\mathfrak{g}^{0}$ are labeled by two elements $a, b \in J_2^{\mathbb{C}}$, viz.

$$ S_{ab} := a_\mu b_\nu (M^{\mu\nu} + \eta^{\mu\nu} D). \tag{6} $$
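The two identities used so far, the norm form (2) and the two expressions for the Jordan triple product in (3), can be spot-checked numerically. The following is only a minimal sketch in plain Python: the helper names (`to_matrix`, `bar`, `inner`, and so on) are hypothetical, and the metric signature (+, −, −, −) is an assumption that the text does not state explicitly.

```python
import random

# Basis sigma^mu = (1, sigma_x, sigma_y, sigma_z); all helper names are illustrative.
SIGMA = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ETA = [1.0, -1.0, -1.0, -1.0]  # assumed metric signature (+, -, -, -)


def to_matrix(x):
    """Hermitian 2x2 matrix x_mu sigma^mu built from four real components."""
    return [[sum(x[m] * SIGMA[m][i][j] for m in range(4)) for j in range(2)]
            for i in range(2)]


def bar(x):
    """Components of x-bar = x_mu sigma-bar^mu (spatial components sign-flipped)."""
    return [x[0], -x[1], -x[2], -x[3]]


def inner(a, b):
    """Lorentz invariant bilinear form <a, b> = a_mu b^mu."""
    return sum(e * p * q for e, p, q in zip(ETA, a, b))


def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def triple_from_matrices(a, b, c):
    """(a b-bar c + c b-bar a) / 2, the matrix expression in Eq. (3)."""
    A, Bbar, C = to_matrix(a), to_matrix(bar(b)), to_matrix(c)
    left = matmul(matmul(A, Bbar), C)
    right = matmul(matmul(C, Bbar), A)
    return [[0.5 * (left[i][j] + right[i][j]) for j in range(2)] for i in range(2)]


def triple_from_bilinear(a, b, c):
    """<a,b> c + <c,b> a - <a,c> b, returned as a 2x2 matrix."""
    ab, cb, ac = inner(a, b), inner(c, b), inner(a, c)
    return to_matrix([ab * c[m] + cb * a[m] - ac * b[m] for m in range(4)])


def max_abs_diff(p, q):
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))


rng = random.Random(0)
a, b, c = ([rng.uniform(-1, 1) for _ in range(4)] for _ in range(3))
print("|det x - <x,x>|     :", abs(det(to_matrix(a)) - inner(a, a)))
print("triple product gap  :", max_abs_diff(triple_from_matrices(a, b, c),
                                             triple_from_bilinear(a, b, c)))
```

Both printed numbers should be at the level of floating-point rounding. A check on random vectors is of course not a proof, but it catches sign or index slips when transcribing (2) and (3).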

The conformal group is realized non-linearly on the space of four-vectors $x \in J_2^{\mathbb{C}}$, with a Möbius-like infinitesimal action of the special conformal transformations

$$ \delta x^\mu = 2\langle c, x\rangle\, x^\mu - \langle x, x\rangle\, c^\mu \tag{7} $$

with parameter $c^\mu$. All variations acquire a very simple form when expressed in terms of the above generators: we have

$$ U_a(x) = a, \qquad S_{ab}(x) = \{a\,b\,x\}, \qquad \tilde{U}_c(x) = -\tfrac{1}{2}\{x\,c\,x\}, \tag{8} $$

where {. . .} is the Jordan triple product introduced above. From these transformations it is elementary to deduce the commutation relations

$$ \begin{aligned} [U_a, \tilde{U}_b] &= S_{ab}, \\ [S_{ab}, U_c] &= U_{\{abc\}}, \\ [S_{ab}, \tilde{U}_c] &= \tilde{U}_{\{bac\}}, \\ [S_{ab}, S_{cd}] &= S_{\{abc\}\,d} - S_{\{bad\}\,c} \end{aligned} \tag{9} $$

(of course, these could have been derived directly from those of the conformal group). As one can also see, the Lie algebra $\mathfrak{g}$ admits an involutive automorphism $\iota$ exchanging $\mathfrak{g}^{-1}$ and $\mathfrak{g}^{+1}$ (hence, $\iota(K^\mu) = P^\mu$). The above transformation rules and commutation relations exemplify the structure that we will encounter again in Sect. 3 of this paper: the conformal realization of E7(7) on $\mathbb{R}^{27}$ presented there has the same form, except that $J_2^{\mathbb{C}}$ is replaced by the exceptional Jordan algebra $J_3^{\mathbb{O}_S}$ over the split octonions $\mathbb{O}_S$. While our form of the nonlinear variations appears to be new, the concomitant construction of the Lie algebra itself by means of the Jordan triple product has been known in the literature as the Tits–Kantor–Koecher construction [32, 21, 25], and as such generalizes to other Jordan algebras. The generalized linear fractional (Möbius) groups of Jordan algebras can be abstractly defined in an


analogous manner [26], and shown to leave invariant certain generalized p-angles defined by the norm form of degree p of the underlying Jordan algebra [22, 14]. However, to our knowledge, explicit formulas of the type derived here have not appeared in the literature before. While this construction works for the exceptional Lie algebras E6(6) and E7(7), as well as other Lie algebras admitting a three graded structure, it fails for E8(8), F4(4), and G2(2), for which a three grading does not exist. These algebras possess only a five graded structure

$$ \mathfrak{g} = \mathfrak{g}^{-2} \oplus \mathfrak{g}^{-1} \oplus \mathfrak{g}^{0} \oplus \mathfrak{g}^{+1} \oplus \mathfrak{g}^{+2}. \tag{10} $$
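Before moving on to the five graded case, it may help to see concretely why the four-dimensional light cone is preserved by the special conformal variation (7). A short calculation gives $\delta\langle x, x\rangle = 2\langle x, \delta x\rangle = 2\langle c, x\rangle\langle x, x\rangle$, so the norm form is only rescaled and its zero set is invariant. The sketch below checks this numerically; the function names are hypothetical and the signature (+, −, −, −) is again an assumption.

```python
ETA = (1.0, -1.0, -1.0, -1.0)  # assumed metric signature (+, -, -, -)


def inner(a, b):
    """Lorentz invariant bilinear form <a, b> = a_mu b^mu."""
    return sum(e * p * q for e, p, q in zip(ETA, a, b))


def special_conformal_variation(c, x):
    """delta x^mu = 2 <c,x> x^mu - <x,x> c^mu, as in Eq. (7)."""
    cx, xx = inner(c, x), inner(x, x)
    return [2.0 * cx * x[m] - xx * c[m] for m in range(4)]


def norm_variation(c, x):
    """First-order change of N_2(x) = <x,x>, i.e. 2 <x, delta x>."""
    return 2.0 * inner(x, special_conformal_variation(c, x))


c = [0.3, -1.2, 0.5, 2.0]          # arbitrary transformation parameter
x_generic = [1.0, 0.2, -0.7, 0.4]  # arbitrary point off the light cone
x_null = [5.0, 3.0, 4.0, 0.0]      # 25 - 9 - 16 = 0, a point on the light cone

print(norm_variation(c, x_generic),
      2.0 * inner(c, x_generic) * inner(x_generic, x_generic))
print(norm_variation(c, x_null))
```

The first line prints two numbers that should agree up to rounding, and the second should vanish up to rounding. This is the three graded prototype of the behaviour that the quartic norm $N_4$ exhibits later in the five graded setting.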

Our main result, to be described in Sect. 2, states that a “quasiconformal” realization is still possible on a space of dimension $\dim(\mathfrak{g}^{+1}) + 1$ if the top grade spaces $\mathfrak{g}^{\pm 2}$ are one-dimensional. Five graded Lie algebras with this property are closely related to the so-called Freudenthal Triple Systems [9, 30], which were originally invented to provide alternative constructions of the exceptional Lie groups². This relation will be made very explicit in the present paper. The novel realization of E8(8) which we will arrive at, together with its natural extension to E8(−24), contains various other constructions of exceptional Lie algebras by truncation, including the conformal realizations based on a three graded structure. For this reason, we describe it first in Sect. 2, and then show how the other cases can be obtained from it. Whereas previous attempts to construct generalized space-times mainly focused on generalizing Minkowski space-time and its symmetries, the physical applications that we have in mind here are of a somewhat different nature, and inspired by recent developments in superstring and M-Theory. Namely, the generalized “space-times” presented here could conceivably be identified with certain internal spaces arising in supergravity and superstring theory, which are related to the appearance of central charges in the associated superalgebras. Central charges and their solitonic carriers have been much discussed in the recent literature because it is hoped that they may provide a window on M-Theory and its non-perturbative degrees of freedom. More specifically, it has been argued in [5] that a proper description of the non-perturbative M-Theory degrees of freedom might require supplementing ordinary space-time coordinates by central charge coordinates.
Solitonic charges also play an important role in the microscopic description of black hole entropy: for maximally extended N = 8 supergravity, the latter is conjectured to be given by an E7(7) invariant formula [20, 8]. The corresponding formula for the entropy in maximally extended supergravity in five dimensions is E6(6) invariant and involves a cubic form. In [7] an invariant classification of orbits of E7(7) and E6(6) actions on their fundamental representations that classify BPS states in d = 4 and d = 5 was given. The entropy formula in [20, 8] is identical to the equation for a vector with vanishing norm in 57 dimensions (see Eq. (27)), provided we use the SL(8, R) form of the quartic E7(7) invariant. This suggests that the 57th component of our E8(8) realization should be interpreted as the entropy. However, we should stress that the quartic invariant can assume both positive and negative values, cf. the simple examples given in Appendix B. In order to avoid imaginary entropy, one must therefore restrict oneself to the positive semidefinite values of the quartic invariant, corresponding to the “time-like” and “light-like” orbits of E7(7) in the language of [7]. With the 57th coordinate interpreted as the entropy and the remaining 56 coordinates as the electric and magnetic charges, it is natural from our point of view to define a distance in this “entropy-charge space” between any two

² The more general Kantor triple systems, for which $\mathfrak{g}^{\pm 2}$ have more than one dimension, will not be discussed in this paper.


black hole solutions using our Eqs. (25), (26). If two black hole solutions are light-like separated in this space, they will remain so under the action of E8(8)³. We should also point out that it is not entirely clear from the existing black hole literature whether it is the SU(8) or the SL(8, R) form of the invariant that should be used here (the detailed relation between the two is worked out in Appendix B). The SU(8) basis is relevant for the central charges, which appear in the superalgebra via surface integrals at spatial infinity and determine the structure (and length) of BPS multiplets. By contrast, the 28 electric and 28 magnetic charges carried by the solitons of d = 4, N = 8 supergravity transform separately under SL(8, R) [4], and therefore the SL(8, R) form of the invariant appears more appropriate in this context. For applications to M-Theory it would be important to obtain the exponentiated version of our realization. One might reasonably expect that modular forms with respect to a fractional linear realization of the arithmetic group E8(8)(Z) will have a role to play. We expect that our results will pave the way for the explicit construction of such modular forms. According to [19] these would depend on 28 + 1 = 29 variables, such that the 57-dimensional Heisenberg subalgebra of E8(8) exhibited here would be realized in terms of 28 “coordinates” and 28 “momenta”. Consequently, the 57 dimensions in which E8(8) acts might alternatively be interpreted as a generalized Heisenberg group, in which case the 57th component would play the role of a variable parameter $\hbar$. The action of E8(8)(Z) on the 57 dimensional Heisenberg group would then constitute the invariance group of a generalized Dirac quantization condition. This observation is also in accord with the fact that the term modifying the vector space addition in $\mathbb{R}^{57}$ (cf. Eq. (25)), which is required by E8(8) invariance, is just the cocycle induced by the standard canonical commutation relations on a (28+28)-dimensional phase space.

2. Quasiconformal Realization of E8(8)

2.1. E7(7) decomposition of E8(8). We will start with the maximal case, the exceptional Lie group E8(8), and its quasiconformal realization on $\mathbb{R}^{57}$, because this realization contains all others by truncation. Our results are based on the following five graded decomposition of E8(8) with respect to its E7(7) × D subgroup

$$ \begin{array}{ccccccccc} \mathfrak{g}^{-2} & \oplus & \mathfrak{g}^{-1} & \oplus & \mathfrak{g}^{0} & \oplus & \mathfrak{g}^{+1} & \oplus & \mathfrak{g}^{+2} \\ \mathbf{1} & \oplus & \mathbf{56} & \oplus & (\mathbf{133} \oplus \mathbf{1}) & \oplus & \mathbf{56} & \oplus & \mathbf{1} \end{array} \tag{11} $$

with the one-dimensional group D consisting of dilatations. D itself is part of an SL(2, R) group, and the above decomposition thus corresponds to the decomposition 248 → (133, 1) ⊕ (56, 2) ⊕ (1, 3) of E8(8) under its subgroup E7(7) × SL(2, R). In order to write out the E7(7) generators, it is convenient to further decompose them w.r.t. the subgroup SL(8, R) of E7(7). In this basis, the Lie algebra of E7(7) is spanned by the SL(8, R) generators $G^i{}_j$, and the antisymmetric generators $G^{ijkl}$, transforming in the 63 and 70 representations of SL(8, R), respectively. We also define

$$ G_{ijkl} := \tfrac{1}{24}\,\epsilon_{ijklmnpq}\, G^{mnpq} $$

³ For the exceptional N = 2 Maxwell–Einstein supergravity [17] defined by the exceptional Jordan algebra the U-duality groups in five and four dimensions are E6(−26) and E7(−25), respectively. The quasiconformal symmetry of the exceptional supergravity in four dimensions is hence E8(−24), with the maximal compact subgroup E7 × SU(2).


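with SL(8, R) indices 1 ≤ i, j, . . . ≤ 8. The commutation relations are

$$ \begin{aligned} [G^i{}_j, G^k{}_l] &= \delta^k_j\, G^i{}_l - \delta^i_l\, G^k{}_j, \\ [G^i{}_j, G^{klmn}] &= -4\,\delta^{[k}_j\, G^{lmn]i} - \tfrac{1}{2}\,\delta^i_j\, G^{klmn}, \\ [G^{ijkl}, G^{mnpq}] &= \tfrac{1}{36}\,\epsilon^{ijkls[mnp}\, G^{q]}{}_s. \end{aligned} $$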

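The fundamental 56 representation of E7(7) is spanned by the two antisymmetric real tensors $X^{ij}$ and $X_{ij}$ and the action of E7(7) is given by⁴

$$ \begin{aligned} \delta X^{ij} &= \Lambda^i{}_k X^{kj} - \Lambda^j{}_k X^{ki} + \Sigma^{ijkl} X_{kl}, \\ \delta X_{ij} &= \Lambda^k{}_i X_{jk} - \Lambda^k{}_j X_{ik} + \Sigma_{ijkl} X^{kl}, \end{aligned} \tag{12} $$

where

$$ \Sigma_{ijkl} = \tfrac{1}{24}\,\epsilon_{ijklmnpq}\, \Sigma^{mnpq}. \tag{13} $$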

In order to extend E7(7) × D to the full E8(8), we must enlarge D to an SL(2, R) with generators (E, F, H) in the standard Chevalley basis, together with 2 × 56 further real generators $(E_{ij}, E^{ij})$ and $(F_{ij}, F^{ij})$. Under hermitian conjugation, we have

$$ E^{ij} = F_{ij}^{\dagger}, \qquad F^{ij} = -E_{ij}^{\dagger}, \qquad \text{and} \qquad E = -F^{\dagger}. $$

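The grade −2, −1, 0, 1 and 2 subspaces in the above decomposition correspond to the subspaces $\mathfrak{g}^{-2}$, $\mathfrak{g}^{-1}$, $\mathfrak{g}^{0}$, $\mathfrak{g}^{+1}$, and $\mathfrak{g}^{+2}$ in (11), respectively:

$$ E \;\oplus\; \{E^{ij}, E_{ij}\} \;\oplus\; \{G^{ijkl}, G^i{}_j;\, H\} \;\oplus\; \{F^{ij}, F_{ij}\} \;\oplus\; F. \tag{14} $$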

The grading may be read off from the commutators with H

$$ \begin{aligned} [H, E] &= -2E, & [H, F] &= 2F, \\ [H, E^{ij}] &= -E^{ij}, & [H, F^{ij}] &= F^{ij}, \\ [H, E_{ij}] &= -E_{ij}, & [H, F_{ij}] &= F_{ij}. \end{aligned} $$

The new generators $(E_{ij}, E^{ij})$ and $(F_{ij}, F^{ij})$ form two (maximal) Heisenberg subalgebras of dimension 28

$$ [E^{ij}, E_{kl}] = 2\,\delta^{ij}_{kl}\, E, \qquad [F^{ij}, F_{kl}] = 2\,\delta^{ij}_{kl}\, F, $$

and they transform under SL(8, R) as

$$ \begin{aligned} [G^i{}_j, E^{kl}] &= \delta^k_j E^{il} - \delta^l_j E^{ik} - \tfrac{1}{4}\delta^i_j E^{kl}, \\ [G^i{}_j, E_{kl}] &= \delta^i_k E_{lj} - \delta^i_l E_{kj} + \tfrac{1}{4}\delta^i_j E_{kl}, \\ [G^i{}_j, F^{kl}] &= \delta^k_j F^{il} - \delta^l_j F^{ik} - \tfrac{1}{4}\delta^i_j F^{kl}, \\ [G^i{}_j, F_{kl}] &= \delta^i_k F_{lj} - \delta^i_l F_{kj} + \tfrac{1}{4}\delta^i_j F_{kl}. \end{aligned} $$

⁴ We emphasize that $X^{ij}$ and $X_{ij}$ are independent. This convention differs from the one used for the SU(8) basis in the appendix.


The remaining non-vanishing commutation relations are given by $[E, F] = H$ and

$$ \begin{aligned} [G^{ijkl}, E^{mn}] &= -\tfrac{1}{24}\,\epsilon^{ijklmnpq} E_{pq}, & [G^{ijkl}, F^{mn}] &= -\tfrac{1}{24}\,\epsilon^{ijklmnpq} F_{pq}, \\ [G^{ijkl}, E_{mn}] &= -\delta^{[ij}_{mn} E^{kl]}, & [G^{ijkl}, F_{mn}] &= -\delta^{[ij}_{mn} F^{kl]}, \\ [E^{ij}, F^{kl}] &= 12\, G^{ijkl}, & [E_{ij}, F_{kl}] &= -12\, G_{ijkl}, \\ [E^{ij}, F_{kl}] &= 4\,\delta^{[i}_{[k} G^{j]}{}_{l]} - \delta^{ij}_{kl} H, & [E_{ij}, F^{kl}] &= 4\,\delta^{[k}_{[i} G^{l]}{}_{j]} + \delta^{kl}_{ij} H, \\ [E, F^{ij}] &= -E^{ij}, & [E, F_{ij}] &= -E_{ij}, \\ [F, E^{ij}] &= F^{ij}, & [F, E_{ij}] &= F_{ij}. \end{aligned} $$

To see that we are really dealing with the maximally split form of E8(8), let us count the number of compact generators: the antisymmetric part $(G^i{}_j - G^j{}_i)$ of $G^i{}_j$ and $(G^{ijkl} - G_{ijkl})$ correspond to the 63 generators of the maximal compact subalgebra SU(8) of E7(7) [4]. The remaining compact generators are the 28 + 28 + 1 anti-hermitian generators $(E_{ij} + F^{ij})$, $(E^{ij} - F_{ij})$, and $(E + F)$, giving a total of 120 generators which close into the maximal compact subgroup SO(16) ⊃ SU(8) of E8(8). An important role is played by the symplectic invariant of two 56 representations. It is given by

$$ \langle X, Y\rangle := X^{ij} Y_{ij} - X_{ij} Y^{ij}. \tag{15} $$
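The dimension counts quoted in this subsection can be tallied mechanically. The snippet below is a bookkeeping sketch only (the variable names are hypothetical); it uses nothing beyond the numbers stated in the text: the 63 and 70 of SL(8, R), the two antisymmetric tensors of the 56, the five grading (11), and the count of compact generators.

```python
from math import comb

dim_sl8 = 8 * 8 - 1             # traceless generators G^i_j: the 63
dim_four_form = comb(8, 4)      # antisymmetric generators G^{ijkl}: the 70
dim_e7 = dim_sl8 + dim_four_form
dim_56 = 2 * comb(8, 2)         # independent components of X^{ij} and X_{ij}

# Five grading (11): 1 + 56 + (133 + 1) + 56 + 1
five_grading = [1, dim_56, dim_e7 + 1, dim_56, 1]
dim_e8 = sum(five_grading)

# Same total from 248 -> (133,1) + (56,2) + (1,3) under E7(7) x SL(2,R)
dim_e8_via_sl2 = dim_e7 * 1 + dim_56 * 2 + 1 * 3

# Compact generators: antisymmetric part of G^i_j (28) plus the
# combinations G^{ijkl} - G_{ijkl} (half of the 70), then 28 + 28 + 1 more.
compact_in_e7 = comb(8, 2) + dim_four_form // 2
compact_total = compact_in_e7 + 28 + 28 + 1
dim_so16 = 16 * 15 // 2

print("dim E7(7):", dim_e7, " dim E8(8):", dim_e8, dim_e8_via_sl2)
print("compact generators:", compact_in_e7, compact_total, " dim SO(16):", dim_so16)
```

The agreement of `compact_total` with the dimension of SO(16) is the numerical content of the statement that one is dealing with the maximally split real form.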

The second structure which we need to introduce is the triple product. This is a trilinear map 56 × 56 × 56 → 56, which associates to three elements X, Y and Z another element transforming in the 56 representation, denoted by (X, Y, Z), and defined by

$$ \begin{aligned} (X, Y, Z)^{ij} :={}& -8\, X^{ik} Y_{kl} Z^{lj} - 8\, Y^{ik} X_{kl} Z^{lj} - 8\, Y^{ik} Z_{kl} X^{lj} \\ & - 2\, Y^{ij} X^{kl} Z_{kl} - 2\, X^{ij} Y^{kl} Z_{kl} - 2\, Z^{ij} Y^{kl} X_{kl} \\ & + \tfrac{1}{2}\,\epsilon^{ijklmnpq} X_{kl} Y_{mn} Z_{pq}, \\ (X, Y, Z)_{ij} :={}& 8\, X_{ik} Y^{kl} Z_{lj} + 8\, Y_{ik} X^{kl} Z_{lj} + 8\, Y_{ik} Z^{kl} X_{lj} \\ & + 2\, Y_{ij} Z^{kl} X_{kl} + 2\, X_{ij} Z^{kl} Y_{kl} + 2\, Z_{ij} X^{kl} Y_{kl} \\ & - \tfrac{1}{2}\,\epsilon_{ijklmnpq} X^{kl} Y^{mn} Z^{pq}. \end{aligned} \tag{16} $$

A somewhat tedious calculation⁵ shows that this triple product obeys the relations

$$ \begin{aligned} (X, Y, Z) &= (Y, X, Z) + 2\,\langle X, Y\rangle\, Z, \\ (X, Y, Z) &= (Z, Y, X) - 2\,\langle X, Z\rangle\, Y, \\ \langle (X, Y, Z), W\rangle &= \langle (X, W, Z), Y\rangle - 2\,\langle X, Z\rangle \langle Y, W\rangle, \\ (X, Y, (V, W, Z)) &= (V, W, (X, Y, Z)) + ((X, Y, V), W, Z) + (V, (Y, X, W), Z). \end{aligned} \tag{17} $$

⁵ Which relies heavily on the Schouten identity $\epsilon^{[ijklmnpq} X^{r]s} = 0$.


We note that the triple product (16) could be modified by terms involving the symplectic invariant, such as $\langle X, Y\rangle Z$; the above choice has been made in order to obtain agreement with the formulas of [6]. While there is no (symmetric) quadratic invariant of E7(7) in the 56 representation, a real quartic invariant $I_4$ can be constructed by means of the above triple product and the bilinear form; it reads

$$ \begin{aligned} I_4(X^{ij}, X_{ij}) :={}& \tfrac{1}{48}\,\langle (X, X, X), X\rangle \\ \equiv{}& X^{ij} X_{jk} X^{kl} X_{li} - \tfrac{1}{4}\, X^{ij} X_{ij} X^{kl} X_{kl} \\ & + \tfrac{1}{96}\,\epsilon^{ijklmnpq} X_{ij} X_{kl} X_{mn} X_{pq} + \tfrac{1}{96}\,\epsilon_{ijklmnpq} X^{ij} X^{kl} X^{mn} X^{pq}. \end{aligned} \tag{18} $$

2.2. Quasiconformal nonlinear realization of E8(8). We will now exhibit a nonlinear realization of E8(8) on the 57-dimensional real vector space with coordinates $\mathcal{X} := (X^{ij}, X_{ij}, x)$, where x is also real. While x is an E7(7) singlet, the remaining 56 variables transform linearly under E7(7). Thus $\mathcal{X}$ forms the 56 ⊕ 1 representation of E7(7). In writing the transformation rules we will omit the transformation parameters in order not to make the formulas (and notation) too cumbersome. To recover the infinitesimal variations, one must simply contract the formulas with the appropriate transformation parameters. The E7(7) subalgebra acts linearly by

$$ \begin{aligned} G^i{}_j(X^{kl}) &= 2\,\delta^k_j X^{il} - \tfrac{1}{4}\delta^i_j X^{kl}, & G^{ijkl}(X^{mn}) &= \tfrac{1}{24}\,\epsilon^{ijklmnpq} X_{pq}, \\ G^i{}_j(X_{kl}) &= -2\,\delta^i_k X_{jl} + \tfrac{1}{4}\delta^i_j X_{kl}, & G^{ijkl}(X_{mn}) &= \delta^{[ij}_{mn} X^{kl]}, \\ G^i{}_j(x) &= 0, & G^{ijkl}(x) &= 0, \end{aligned} \tag{19} $$

H generates scale transformations

$$ H(X^{ij}) = X^{ij}, \qquad H(X_{ij}) = X_{ij}, \qquad H(x) = 2x, \tag{20} $$

and the E generators act as translations; we have

$$ E(X^{ij}) = 0, \qquad E(X_{ij}) = 0, \qquad E(x) = 1 \tag{21} $$

and

$$ \begin{aligned} E^{ij}(X^{kl}) &= 0, & E^{ij}(X_{kl}) &= \delta^{ij}_{kl}, & E^{ij}(x) &= -X^{ij}, \\ E_{ij}(X^{kl}) &= \delta^{kl}_{ij}, & E_{ij}(X_{kl}) &= 0, & E_{ij}(x) &= X_{ij}. \end{aligned} \tag{22} $$


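By contrast, the F generators are realized nonlinearly:

$$ \begin{aligned} F(X^{ij}) &= -\tfrac{1}{6}\,(X, X, X)^{ij} + X^{ij} x \\ &\equiv 4\, X^{ik} X_{kl} X^{lj} + X^{ij} X^{kl} X_{kl} - \tfrac{1}{12}\,\epsilon^{ijklmnpq} X_{kl} X_{mn} X_{pq} + X^{ij} x, \\ F(X_{ij}) &= -\tfrac{1}{6}\,(X, X, X)_{ij} + X_{ij} x \\ &\equiv -4\, X_{ik} X^{kl} X_{lj} - X_{ij} X^{kl} X_{kl} + \tfrac{1}{12}\,\epsilon_{ijklmnpq} X^{kl} X^{mn} X^{pq} + X_{ij} x, \\ F(x) &= 4\, I_4(X^{ij}, X_{ij}) + x^2 \\ &\equiv 4\, X^{ij} X_{jk} X^{kl} X_{li} - X^{ij} X_{ij} X^{kl} X_{kl} \\ &\quad + \tfrac{1}{24}\,\epsilon^{ijklmnpq} X_{ij} X_{kl} X_{mn} X_{pq} + \tfrac{1}{24}\,\epsilon_{ijklmnpq} X^{ij} X^{kl} X^{mn} X^{pq} + x^2. \end{aligned} \tag{23} $$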
Observe that the form of the r.h.s. is dictated by the requirement of E7(7) covariance: $(F(X^{ij}), F(X_{ij}))$ and $F(x)$ must still transform as the 56 and 1 of E7(7), respectively. The action of the remaining generators is likewise E7(7) covariant:

$$ \begin{aligned} F^{ij}(X^{kl}) &= -4\, X^{i[k} X^{l]j} + \tfrac{1}{4}\,\epsilon^{ijklmnpq} X_{mn} X_{pq}, \\ F^{ij}(X_{kl}) &= +8\,\delta^{[i}_{k} X^{j]m} X_{ml} + \delta^{ij}_{kl}\, X^{mn} X_{mn} + 2\, X^{ij} X_{kl} - \delta^{ij}_{kl}\, x, \\ F_{ij}(X^{kl}) &= -8\,\delta^{k}_{[i} X_{j]m} X^{ml} + \delta^{kl}_{ij}\, X^{mn} X_{mn} - 2\, X_{ij} X^{kl} - \delta^{kl}_{ij}\, x, \\ F_{ij}(X_{kl}) &= 4\, X_{ki} X_{jl} - \tfrac{1}{4}\,\epsilon_{ijklmnpq} X^{mn} X^{pq}, \\ F^{ij}(x) &= 4\, X^{ik} X_{kl} X^{lj} + X^{ij} X^{kl} X_{kl} - \tfrac{1}{12}\,\epsilon^{ijklmnpq} X_{kl} X_{mn} X_{pq} + X^{ij} x, \\ F_{ij}(x) &= 4\, X_{ik} X^{kl} X_{lj} + X_{ij} X^{kl} X_{kl} - \tfrac{1}{12}\,\epsilon_{ijklmnpq} X^{kl} X^{mn} X^{pq} - X_{ij} x. \end{aligned} \tag{24} $$

Although E7(7) covariance considerably constrains the expressions that can appear on the r.h.s., it does not fix them uniquely: as for the triple product (16) one could add further terms involving the symplectic invariant. However, all ambiguities are removed by imposing closure of the algebra, and we have checked by explicit computation that the above variations do close into the full E8(8) algebra in the basis given in the previous section. This is the crucial consistency check. The term “quasiconformal realization” is motivated by the existence of a norm form that is left invariant up to a (possibly coordinate dependent) factor under all transformations. To write it down we must first define a nonlinear “difference” between two points $\mathcal{X} \equiv (X^{ij}, X_{ij}; x)$ and $\mathcal{Y} \equiv (Y^{ij}, Y_{ij}; y)$; curiously, the standard difference is not invariant under the translations $(E^{ij}, E_{ij})$. Rather, we must choose

$$ \delta(\mathcal{X}, \mathcal{Y}) := (X^{ij} - Y^{ij},\; X_{ij} - Y_{ij};\; x - y + \langle X, Y\rangle). \tag{25} $$


This difference still obeys $\delta(\mathcal{X}, \mathcal{Y}) = -\delta(\mathcal{Y}, \mathcal{X})$ and thus $\delta(\mathcal{X}, \mathcal{X}) = 0$, and is now invariant under $(E^{ij}, E_{ij})$ as well as E; however, it is no longer additive. In fact, with the sum of two vectors being defined as $\delta(\mathcal{X}, -\mathcal{Y})$, the extra term involving $\langle X, Y\rangle$ can be interpreted as the cocycle induced by the standard canonical commutation relations. The relevant invariant is a linear combination of $x^2$ and the quartic E7(7) invariant $I_4$, viz.

$$ N_4(\mathcal{X}) \equiv N_4(X^{ij}, X_{ij}; x) := 4\, I_4(X) - x^2. \tag{26} $$
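The translation invariance of the nonlinear difference (25) is easy to probe numerically. The sketch below rests on two assumptions that are not spelled out in the text: the 56 is stored as two flat lists of 28 independent components (pairs i < j), so the full index sum in (15) becomes twice a sum over pairs; and the finite translation by a parameter A is taken to be X → X + A, x → x + ⟨A, X⟩, which is what integrating the infinitesimal rules (22) would give. All function names are hypothetical.

```python
import random

N_PAIRS = 28  # independent components of an antisymmetric 8x8 tensor


def symplectic(X, Y):
    """<X, Y> = X^{ij} Y_{ij} - X_{ij} Y^{ij}; the factor 2 converts the
    sum over pairs i < j into the sum over all i, j used in Eq. (15)."""
    up_x, dn_x = X
    up_y, dn_y = Y
    return 2.0 * sum(a * b - c * d for a, b, c, d in zip(up_x, dn_y, dn_x, up_y))


def translate(A, point):
    """Assumed finite translation: X -> X + A, x -> x + <A, X>."""
    X, x = point
    shifted = ([u + a for u, a in zip(X[0], A[0])],
               [d + a for d, a in zip(X[1], A[1])])
    return (shifted, x + symplectic(A, X))


def nonlinear_difference(p, q):
    """delta(X, Y) of Eq. (25): componentwise difference plus the cocycle term."""
    (X, x), (Y, y) = p, q
    diff = ([u - v for u, v in zip(X[0], Y[0])],
            [u - v for u, v in zip(X[1], Y[1])])
    return (diff, x - y + symplectic(X, Y))


def random_56(rng):
    return ([rng.uniform(-1, 1) for _ in range(N_PAIRS)],
            [rng.uniform(-1, 1) for _ in range(N_PAIRS)])


rng = random.Random(0)
p = (random_56(rng), rng.uniform(-1, 1))
q = (random_56(rng), rng.uniform(-1, 1))
A = random_56(rng)

before = nonlinear_difference(p, q)
after = nonlinear_difference(translate(A, p), translate(A, q))
naive_before = p[1] - q[1]
naive_after = translate(A, p)[1] - translate(A, q)[1]

print("change of nonlinear difference:", after[1] - before[1])
print("change of naive difference    :", naive_after - naive_before)
```

The first number should vanish up to rounding while the second is generically nonzero, in line with the remark that the standard difference is not translation invariant. The overall normalization of `symplectic` does not affect the check as long as the same function is used in `translate` and in `nonlinear_difference`.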

In order to ensure invariance under the translation generators, we consider the expression $N_4(\delta(\mathcal{X}, \mathcal{Y}))$, which is manifestly invariant under the linearly realized subgroup E7(7). Remarkably, it also transforms into itself up to an overall factor under the action of the nonlinearly realized generators. More specifically, we find

$$ \begin{aligned} F\, N_4(\delta(\mathcal{X}, \mathcal{Y})) &= 2\,(x + y)\, N_4(\delta(\mathcal{X}, \mathcal{Y})), \\ F^{ij}\, N_4(\delta(\mathcal{X}, \mathcal{Y})) &= 2\,(X^{ij} + Y^{ij})\, N_4(\delta(\mathcal{X}, \mathcal{Y})), \\ H\, N_4(\delta(\mathcal{X}, \mathcal{Y})) &= 4\, N_4(\delta(\mathcal{X}, \mathcal{Y})). \end{aligned} $$

Therefore, for every $\mathcal{Y} \in \mathbb{R}^{57}$ the “light cone” with base point $\mathcal{Y}$, defined by the set of $\mathcal{X} \in \mathbb{R}^{57}$ obeying

$$ N_4(\delta(\mathcal{X}, \mathcal{Y})) = 0, \tag{27} $$

is preserved by the full E8(8) group, and in this sense, $N_4$ is a “conformal invariant” of E8(8). We note that the light cones defined by the above equation are not only curved hypersurfaces in $\mathbb{R}^{57}$, but get deformed as one varies the base point $\mathcal{Y}$. As we will show in Appendix B, the quartic invariant $I_4$ can take both positive and negative values, but in the latter case Eq. (27) does not have real solutions. However, we can remedy this problem by extending the representation space to $\mathbb{C}^{57}$ and using the same formulas to get a realization of the complexified Lie algebra $E_8(\mathbb{C})$ on $\mathbb{C}^{57}$. The existence of a fourth order conformal invariant of E8(8) is noteworthy in view of the fact that no irreducible fourth order invariant exists for the linearly realized E8(8) group (the next invariant after the quadratic Casimir being of order eight).

2.3. Relation with Freudenthal Triple Systems. We will now rewrite the nonlinear transformation rules in another form in order to establish contact with mathematical literature. Both the bilinear form (15) and the triple product (16) already appear in [6], albeit in a very different guise. That work starts from 2 × 2 “matrices” of the form

$$ A = \begin{pmatrix} \alpha_1 & x_1 \\ x_2 & \alpha_2 \end{pmatrix}, \tag{28} $$

where $\alpha_1$, $\alpha_2$ are real numbers and $x_1$, $x_2$ are elements of a simple Jordan algebra J of degree three. There are only four simple Jordan algebras J of this type, namely the 3 × 3 hermitian matrices over the four division algebras, R, C, H and O. The associated matrices are then related to non-compact forms of the exceptional Lie algebras F4, E6, E7, and E8, respectively. For simplicity, let us concentrate on the maximal case $J_3^{\mathbb{O}_S}$, when the matrix A carries 1+1+27+27 = 56 degrees of freedom. This counting suggests


an obvious relation with the 56 of E7(7) and its decomposition under E6(6), but more work is required to make the connection precise. To this aim, [6] defines a symplectic invariant $\langle A, B\rangle$, and a trilinear product mapping three such matrices A, B and C to another one, denoted by (A, B, C). This triple system differs from a Jordan triple system in that it is not derivable from a binary product. The formulas for the triple product in terms of the matrices A, B and C given in [6] are somewhat cumbersome, lacking manifest E7(7) covariance. For this reason, instead of directly verifying that our prescription (16) and the one of [6] coincide, we have checked that they satisfy identical relations: a quick glance shows that the relations (T1)–(T4) of [6] are indeed the same as our relations (17), which are manifestly E7(7) covariant. To rewrite the transformation formulas we introduce Lie algebra generators $U_A$ and $\tilde{U}_A$ labeled by the above matrices, as well as generators $S_{AB}$ labeled by a pair of such matrices. For the grade ±2 subspaces we would in general need another set of generators $K_{AB}$ and $\tilde{K}_{AB}$ labeled by two matrices, but since these subspaces are one-dimensional in the present case, we have only two more generators $K_a$ and $\tilde{K}_a$ labeled by one real number a. In the same vein, we reinterpret the 57 coordinates $\mathcal{X}$ as a pair (X, x), where X is a 2 × 2 matrix of the type defined above. The variations then take the simple form

$$ \begin{aligned} K_a(X) &= 0, & K_a(x) &= 2a, \\ U_A(X) &= A, & U_A(x) &= \langle A, X\rangle, \\ S_{AB}(X) &= (A, B, X), & S_{AB}(x) &= 2\,\langle A, B\rangle\, x, \\ \tilde{U}_A(X) &= \tfrac{1}{2}\,(X, A, X) - A x, & \tilde{U}_A(x) &= -\tfrac{1}{6}\,\langle (X, X, X), A\rangle + \langle X, A\rangle\, x, \\ \tilde{K}_a(X) &= -\tfrac{1}{6}\, a\,(X, X, X) + a X x, & \tilde{K}_a(x) &= \tfrac{1}{6}\, a\,\langle (X, X, X), X\rangle + 2 a x^2. \end{aligned} \tag{29} $$

From these formulas it is straightforward to determine the commutation relations of the transformations. To expose the connection with the more general Kantor triple systems we write

$$ K_{AB} \equiv K_{\langle A, B\rangle} \tag{30} $$

in the formulas below. The consistency of this specialization is ensured by the relations (17). By explicit computation one finds

$$ \begin{aligned} [U_A, \tilde{U}_B] &= S_{AB}, \\ [U_A, U_B] &= -K_{AB}, \\ [\tilde{U}_A, \tilde{U}_B] &= -\tilde{K}_{AB}, \\ [S_{AB}, U_C] &= -U_{(A,B,C)}, \\ [S_{AB}, \tilde{U}_C] &= -\tilde{U}_{(B,A,C)}, \\ [K_{AB}, \tilde{U}_C] &= U_{(A,C,B)} - U_{(B,C,A)}, \\ [\tilde{K}_{AB}, U_C] &= \tilde{U}_{(B,C,A)} - \tilde{U}_{(A,C,B)}, \\ [S_{AB}, S_{CD}] &= -S_{(A,B,C)D} - S_{C(B,A,D)}, \\ [S_{AB}, K_{CD}] &= K_{A(C,B,D)} - K_{A(D,B,C)}, \\ [S_{AB}, \tilde{K}_{CD}] &= \tilde{K}_{(D,A,C)B} - \tilde{K}_{(C,A,D)B}, \\ [K_{AB}, \tilde{K}_{CD}] &= S_{(B,C,A)D} - S_{(A,C,B)D} - S_{(B,D,A)C} + S_{(A,D,B)C}. \end{aligned} \tag{31} $$
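The simplest of the relations (31), $[U_A, U_B] = -K_{AB}$ with $K_{AB} \equiv K_{\langle A, B\rangle}$, can be verified directly from the variations (29), since only the linearly realized generators are involved. The sketch below makes two assumptions: the commutator is read as the Lie bracket of the corresponding vector fields on the 57-dimensional space, $[V, W] = (V \cdot \nabla) W - (W \cdot \nabla) V$, and points are stored as flat lists of 56 + 1 floats with the symplectic form normalized as in (15). The function names are hypothetical; the other relations in (31) involve the triple product and are not tested here.

```python
import random

N_PAIRS = 28  # pairs i < j; a point is 28 "upper" + 28 "lower" entries + x


def symplectic(X, Y):
    """<X, Y> of Eq. (15) on flat 56-component lists (factor 2: pairs -> full sum)."""
    return 2.0 * sum(X[k] * Y[N_PAIRS + k] - X[N_PAIRS + k] * Y[k]
                     for k in range(N_PAIRS))


def U(A):
    """Vector field of Eq. (29): U_A(X) = A, U_A(x) = <A, X>."""
    return lambda p: list(A) + [symplectic(A, p[:56])]


def K(a):
    """Vector field of Eq. (29): K_a(X) = 0, K_a(x) = 2a."""
    return lambda p: [0.0] * 56 + [2.0 * a]


def directional_derivative(field, p, v):
    """Central difference with unit step; exact here because U_A and K_a are affine."""
    plus = field([pi + vi for pi, vi in zip(p, v)])
    minus = field([pi - vi for pi, vi in zip(p, v)])
    return [(a - b) / 2.0 for a, b in zip(plus, minus)]


def lie_bracket(V, W):
    """[V, W](p) = (V . grad) W - (W . grad) V, the assumed commutator convention."""
    return lambda p: [dw - dv for dw, dv in
                      zip(directional_derivative(W, p, V(p)),
                          directional_derivative(V, p, W(p)))]


rng = random.Random(0)
A = [rng.uniform(-1, 1) for _ in range(56)]
B = [rng.uniform(-1, 1) for _ in range(56)]
point = [rng.uniform(-1, 1) for _ in range(57)]

bracket = lie_bracket(U(A), U(B))(point)
expected = [-component for component in K(symplectic(A, B))(point)]
print("max deviation:", max(abs(u - v) for u, v in zip(bracket, expected)))
```

With this convention the bracket has vanishing 56-components and x-component $-2\langle A, B\rangle$, which is the Heisenberg structure of the translations. A different sign convention for the commutator of variations would flip the overall sign, so the agreement found here should be read as a consistency check of the stated assumptions rather than as an independent derivation.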


For general $K_{AB}$, these are the defining commutation relations of a Kantor triple system, and, with the further specification (30), those of a Freudenthal triple system (FTS). Freudenthal introduced these triple systems in his study of the metasymplectic geometries associated with exceptional groups [10]; these geometries were further studied in [1, 6, 30, 24]⁶. A classification of FTS’s may be found in [24], where it is also shown that there is a one-to-one correspondence between simple Lie algebras and simple FTS’s with a non-degenerate bilinear form. Hence there is a quasiconformal realization of every Lie group acting on a generalized lightcone.

3. Truncations of E8(8)

For the lower rank exceptional groups contained in E8(8), we can derive similar conformal or quasiconformal realizations by truncation. In this section, we will first give the list of quasiconformal realizations contained in E8(8). In the second part of this section, we consider truncations to a three graded structure, which will yield conformal realizations. In particular, we will work out the conformal realization of E7(7) on a space of 27 dimensions as an example, which is again the maximal example of its kind.

3.1. More quasiconformal realizations. All simple Lie algebras (except for SU(2)) can be given a five graded structure (10) with respect to some subalgebra of maximal rank and one can associate a triple system with the grade +1 subspace [23, 2]. Conversely, one can construct every simple Lie algebra over the corresponding triple system. The realization of E8 over the FTS defined by the exceptional Jordan algebra can be truncated to the realizations of E7, E6, and F4 by restricting oneself to subalgebras defined by quaternionic, complex, and real Hermitian 3 × 3 matrices. Analogously the non-linear realization of E8(8) given in the previous section can be truncated to nonlinear realizations of E7(7), E6(6), and F4(4). These truncations preserve the five grading. More specifically we find that the Lie algebra of E7(7) has a five grading of the form:

$$ E_{7(7)} = \mathbf{1} \oplus \mathbf{32} \oplus (SO(6,6) \oplus D) \oplus \mathbf{32} \oplus \mathbf{1}. \tag{32} $$

Hence this truncation leads to a nonlinear realization of E7(7) on a 33 dimensional space. Note that this is not a minimal realization of E7(7). Further truncation to the E6(6) subgroup preserving the five grading leads to:

$$ E_{6(6)} = \mathbf{1} \oplus \mathbf{20} \oplus (SL(6,\mathbb{R}) \oplus D) \oplus \mathbf{20} \oplus \mathbf{1}. \tag{33} $$

This yields a nonlinear realization of E6(6) on a 21 dimensional space, which again is not the minimal realization. Further reduction to F4(4) preserving the five grading

$$ F_{4(4)} = \mathbf{1} \oplus \mathbf{14} \oplus (Sp(6,\mathbb{R}) \oplus D) \oplus \mathbf{14} \oplus \mathbf{1} \tag{34} $$

leads to a minimal realization of F4(4) on a fifteen dimensional space. One can further truncate F4 to a subalgebra G2(2) while preserving the five grading

$$ G_{2(2)} = \mathbf{1} \oplus \mathbf{4} \oplus (SL(2,\mathbb{R}) \oplus D) \oplus \mathbf{4} \oplus \mathbf{1}, \tag{35} $$

⁶ FTS’s have also been used in [3] to give a classification and a unified realization of non-linear quasi-superconformal algebras and in the realizations of nonlinear N = 4 superconformal algebras in two dimensions [15].


which then yields a nonlinear realization over a five dimensional space. One can go even further and truncate G2 to its subalgebra SL(3, R)

$$ SL(3,\mathbb{R}) = \mathbf{1} \oplus \mathbf{2} \oplus (SO(1,1) \oplus D) \oplus \mathbf{2} \oplus \mathbf{1}, \tag{36} $$

which is the smallest simple Lie algebra admitting a five grading. We should perhaps stress that the nonlinear realizations given above are minimal for G2(2), F4(4), and E8(8), which are the only simple Lie algebras that do not admit a three grading and hence do not have unitary representations of the lowest weight type. The above nonlinear realizations of the exceptional Lie algebras can also be truncated to subalgebras with a three graded structure, in which case our nonlinear realization reduces to the standard nonlinear realization over a JTS. This truncation we will describe in Sect. 3.2 in more detail. With respect to E6(6) the quasiconformal realization of E8(8) (11) decomposes as follows:

[Figure: the five grading 1 ⊕ 56 ⊕ (133 ⊕ 1) ⊕ 56 ⊕ 1 in the first line, with each entry decomposed below it: each 56 into 1 ⊕ 27 ⊕ $\overline{27}$ ⊕ 1, and 133 ⊕ 1 into 27 ⊕ (78 ⊕ 1) ⊕ $\overline{27}$ together with a further singlet. Diagonal lines join 27-dimensional entries of neighbouring grades.]

The numbers in the first line are the dimensions of E7(7), whereas the remaining numbers correspond to representations of USp(8) which is the maximal compact subgroup of E6(6). The 27 of the grade −1 subspace and the $\overline{27}$ of the grade +1 subspace close into the E6(6) ⊕ D subalgebra of the grade zero subspace and generate the Lie algebra of E7(7). Similarly the $\overline{27}$ of the grade −1 subspace together with the 27 of the grade +1 subspace form another E7(7) subalgebra of E8(8). Hence we have four different E7(7) subalgebras of E8(8):

i) E7(7) subalgebra of grade zero subspace which is realized linearly.
ii) E7(7) subalgebra preserving the 5-grading, which is realized nonlinearly over a 33 dimensional space.
iii) E7(7) subalgebra that acts on the 27 dimensional subspace as the generalized conformal generators.
iv) E7(7) subalgebra that acts on the $\overline{27}$ dimensional subspace as the generalized conformal generators.
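The five gradings (11) and (32)–(36) can be cross-checked against the known dimensions of the algebras involved, and the dimension of each quasiconformal realization read off as $\dim(\mathfrak{g}^{+1}) + 1$. The table below is a bookkeeping sketch only; the dictionary layout and function names are hypothetical, and the dimensions of the grade zero subalgebras (133, 66, 35, 21, 3, 1) are standard values supplied here rather than quoted from the text.

```python
# name: (dim of grade +1 subspace, dim of grade-0 subalgebra without D, its name)
GRADINGS = {
    "E8(8)":   (56, 133, "E7(7)"),
    "E7(7)":   (32, 66, "SO(6,6)"),
    "E6(6)":   (20, 35, "SL(6,R)"),
    "F4(4)":   (14, 21, "Sp(6,R)"),
    "G2(2)":   (4, 3, "SL(2,R)"),
    "SL(3,R)": (2, 1, "SO(1,1)"),
}

KNOWN_DIMENSIONS = {
    "E8(8)": 248, "E7(7)": 133, "E6(6)": 78,
    "F4(4)": 52, "G2(2)": 14, "SL(3,R)": 8,
}


def total_dimension(name):
    """1 + dim g^{+1} + (dim g^0 subalgebra + 1 for D) + dim g^{+1} + 1."""
    grade_one, grade_zero, _ = GRADINGS[name]
    return 2 * (1 + grade_one) + grade_zero + 1


def realization_dimension(name):
    """Dimension of the space carrying the quasiconformal realization."""
    return GRADINGS[name][0] + 1


for name in GRADINGS:
    print(f"{name:8s} total {total_dimension(name):3d}  "
          f"(expected {KNOWN_DIMENSIONS[name]:3d})  "
          f"realized on {realization_dimension(name)} dimensions")
```

The realization dimensions come out as 57, 33, 21, 15 and 5 for the exceptional algebras, as quoted in the text. The same kind of tally confirms the E6(6) decomposition shown in the figure, since 1 + 27 + 27 + 1 = 56 and 27 + 78 + 1 + 27 = 133.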


Similarly for E7(7) under the SL(6, R) subalgebra of the grade zero subspace the 32 dimensional grade +1 subspace decomposes as 32 = 1 + 15 + $\overline{15}$ + 1. The 15 from the grade +1 (−1) subspace together with the $\overline{15}$ of the grade −1 (+1) subspace generate a nonlinearly realized SO(6, 6) subalgebra that acts as the generalized conformal algebra on the respective 15 dimensional subspace. For E6(6), F4(4), G2(2), and SL(3, R) the analogous truncations lead to nonlinear conformal subalgebras SL(6, R), Sp(6, R), SO(2, 2), and SL(2, R), respectively.

3.2. Conformal Realization of E7(7). As a special truncation the quasiconformal realization of E8(8) contains a conformal realization of E7(7) on a space of 27 dimensions, on which the E6(6) subgroup of E7(7) acts linearly. The main difference is that the construction is now based on a three-graded decomposition (4) of E7(7) rather than (10); hence the realization is “conformal” rather than “quasiconformal”. The relevant decomposition can be directly read off from the figure: we simply truncate to an E7(7) subalgebra in such a way that the grade ±2 subspace can no longer be reached by commutation. This requirement is met only by the two truncations corresponding to the diagonal lines in the figure; adding a singlet we arrive at the desired three graded decomposition of E7(7)

$$ \mathbf{133} = \mathbf{27} \oplus (\mathbf{78} \oplus \mathbf{1}) \oplus \overline{\mathbf{27}} \tag{37} $$

under its $E_{6(6)} \times D$ subgroup. The Lie algebra $E_{6(6)}$ has USp(8) as its maximal compact subalgebra. It is spanned by a symmetric tensor $\tilde G^{ij}$ in the adjoint representation 36 of USp(8) and a fully antisymmetric symplectic traceless tensor $\tilde G^{ijkl}$ transforming under the 42 of USp(8); indices $1 \le i, j, \ldots \le 8$ are now USp(8) indices and all tensors with a tilde transform under USp(8) rather than SL(8, R). $\tilde G^{ijkl}$ is traceless with respect to the real symplectic metric $\Omega_{ij} = -\Omega_{ji} = -\Omega^{ij}$ (thus $\Omega_{ik}\Omega^{kj} = \delta_i^j$). The symplectic metric also serves to pull up and down indices, with the convention that this is always to be done from the left. The remaining part of $E_{7(7)}$ is spanned by an extra dilatation generator $\tilde H$, translation generators $\tilde E^{ij}$ and the nonlinearly realized generators $\tilde F^{ij}$, transforming as 27 and $\overline{27}$, respectively. Unlike for $E_{8(8)}$, there is no need here to distinguish the generators by the position of their indices, since the corresponding generators are linearly related by means of the symplectic metric. The fundamental 27 of $E_{6(6)}$ (on which we are going to realize a nonlinear action of $E_{7(7)}$) is given by the traceless antisymmetric tensor $\tilde Z^{ij}$ transforming as

$$\tilde G^{i}{}_{j}(\tilde Z^{kl}) = 2\,\delta^{k}_{j}\,\tilde Z^{il}, \qquad \tilde G^{ijkl}(\tilde Z^{mn}) = \tfrac{1}{24}\,\epsilon^{ijklmnpq}\,\tilde Z_{pq}, \tag{38}$$

where $\tilde Z_{ij} := \Omega_{ik}\Omega_{jl}\tilde Z^{kl} = (\tilde Z^{ij})^*$ and $\Omega_{ij}\tilde Z^{ij} = 0$.
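The representation dimensions quoted in this subsection can be cross-checked by elementary tensor counting. The following Python sketch assumes only the standard component counts for tensors over an 8-valued index; the function names are illustrative and do not come from the paper.

```python
from math import comb

N = 8  # range of the USp(8) indices i, j, ...

def dim_symmetric(n):
    """Independent components of a symmetric 2-tensor (the 36 for n = 8)."""
    return n * (n + 1) // 2

def dim_antisym_traceless_2(n):
    """Antisymmetric 2-tensor minus its single symplectic trace (the 27)."""
    return comb(n, 2) - 1

def dim_antisym_traceless_4(n):
    """Fully antisymmetric 4-tensor minus its symplectic traces.

    The trace of an antisymmetric 4-tensor is an antisymmetric 2-tensor,
    so comb(n, 2) conditions are removed (the 42 for n = 8).
    """
    return comb(n, 4) - comb(n, 2)

dim_e6 = dim_symmetric(N) + dim_antisym_traceless_4(N)   # 36 + 42
dim_27 = dim_antisym_traceless_2(N)
dim_e7 = dim_27 + (dim_e6 + 1) + dim_27                  # three-grading (37)

print(dim_e6, dim_27, dim_e7)
```

The counts reproduce the decomposition 78 = 36 + 42 of $E_{6(6)}$ under USp(8) and the total in (37).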

Conformal Realizations of Exceptional Lie Groups


Likewise, the $\overline{27}$ representation transforms as

$$\tilde G^{i}{}_{j}(\bar Z^{kl}) = 2\,\delta^{k}_{j}\,\bar Z^{il}, \qquad \tilde G^{ijkl}(\bar Z^{mn}) = -\tfrac{1}{24}\,\epsilon^{ijklmnpq}\,\bar Z_{pq}. \tag{39}$$

Because the product of two 27’s contains no singlet, there exists no quadratic invariant of $E_{6(6)}$; however, there is a cubic invariant given by

$$N_3(\tilde Z) := \tilde Z^{ij}\tilde Z_{jk}\tilde Z^{kl}\,\Omega_{il}. \tag{40}$$

We are now ready to give the conformal realization of $E_{7(7)}$ on the 27 dimensional space spanned by the $\tilde Z^{ij}$. As the action of the linearly realized $E_{6(6)}$ subgroup has already been given, we list only the remaining variations. As before $\tilde E^{ij}$ acts by translations:

$$\tilde E^{ij}(\tilde Z^{kl}) = -\Omega^{i[k}\Omega^{l]j} - \tfrac{1}{8}\,\Omega^{ij}\Omega^{kl} \tag{41}$$

and $\tilde H$ by dilatations

$$\tilde H(\tilde Z^{ij}) = \tilde Z^{ij}. \tag{42}$$

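The $\overline{27}$ generators $\tilde F^{ij}$ are realized nonlinearly:

$$\tilde F^{ij}(\tilde Z^{kl}) := -2\,\tilde Z^{ij}(\tilde Z^{kl}) + \Omega^{i[k}\Omega^{l]j}(\tilde Z^{mn}\tilde Z_{mn}) + \tfrac{1}{8}\,\Omega^{ij}\Omega^{kl}(\tilde Z^{mn}\tilde Z_{mn}) + 8\,\tilde Z^{km}\tilde Z_{mn}\Omega^{n[i}\Omega^{j]l} - \Omega^{kl}(\tilde Z^{im}\Omega_{mn}\tilde Z^{nj}). \tag{43}$$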

The norm form needed to define the $E_{7(7)}$ invariant “light cones” is now constructed from the cubic invariant of $E_{6(6)}$. Then $N_3(\tilde X - \tilde Y)$ is manifestly invariant under $E_{6(6)}$ and under the translations $\tilde E^{ij}$ (observe that there is no need to introduce a nonlinear difference unlike for $E_{8(8)}$). Under $\tilde H$ it transforms by a constant factor, whereas under the action of $\tilde F^{ij}$ we have

$$\tilde F^{ij}\,N_3(\tilde X - \tilde Y) = (\tilde X^{ij} + \tilde Y^{ij})\,N_3(\tilde X - \tilde Y). \tag{44}$$

Thus the light cones in $\mathbb{R}^{27}$ with base point $\tilde Y$

$$N_3(\tilde X - \tilde Y) = 0 \tag{45}$$

are indeed invariant under $E_{7(7)}$. They are still curved hypersurfaces, but in contrast to the $E_{8(8)}$ light-cones constructed before, they are no longer deformed as one varies the base point $\tilde Y$.

The connection to the Jordan Triple Systems of Appendix A can now be made quite explicit, and the formulas that we arrive at in this way are completely analogous to the ones given in the introduction. We first of all notice that we can again define a triple product in terms of the $E_{6(6)}$ representations; it reads

$$\{\tilde X\,\tilde Y\,\tilde Z\}^{ij} = 16\,\tilde X^{ik}\tilde Z_{kl}\tilde Y^{lj} + 16\,\tilde Z^{ik}\tilde X_{kl}\tilde Y^{lj} + 4\,\Omega^{ij}(\tilde X^{kl}\tilde Y_{lm}\tilde Z^{mn}\Omega_{kn}) + 4\,\tilde X^{ij}\tilde Y^{kl}\tilde Z_{kl} + 4\,\tilde Y^{ij}\tilde X^{kl}\tilde Z_{kl} + 2\,\tilde Z^{ij}\tilde X^{kl}\tilde Y_{kl}. \tag{46}$$


This triple product can be used to rewrite the conformal realization. Recalling that a triple product with identical properties exists for the 27-dimensional Jordan algebra $J_3^{\mathbb{O}_S}$, we now consider $\tilde Z$ as an element of $J_3^{\mathbb{O}_S}$. Next we introduce generators labeled by elements of $J_3^{\mathbb{O}_S}$, and define the variations

$$U_a(\tilde Z) = a, \qquad S_{ab}(\tilde Z) = \{a\,b\,\tilde Z\}, \qquad \tilde U_c(\tilde Z) = -\tfrac{1}{2}\{\tilde Z\,c\,\tilde Z\}, \tag{47}$$

for $a, b, c \in J_3^{\mathbb{O}_S}$. It is straightforward to check that these reproduce the commutation relations listed in the introduction with the only difference that $J_2^{\mathbb{C}}$ has been replaced by $J_3^{\mathbb{O}_S}$.

Acknowledgements. We are very grateful to R. Kallosh for poignant questions and comments on the first version of this paper. We would also like to thank B. de Wit and B. Pioline for enlightening discussions.

Appendix A. Jordan Triple Systems

Let us first recall the defining properties of a Jordan algebra. By definition these are algebras equipped with a commutative (but non-associative) binary product $a \circ b = b \circ a$ satisfying the Jordan identity

$$(a \circ b) \circ a^2 = a \circ (b \circ a^2). \tag{A.1}$$

A Jordan algebra with such a product defines a so-called Jordan triple system (JTS) under the Jordan triple product

$$\{a\,b\,c\} = a \circ (\tilde b \circ c) + (a \circ \tilde b) \circ c - \tilde b \circ (a \circ c),$$

where ˜ denotes a conjugation in J corresponding to the operation † in $\mathfrak{g}$. The triple product satisfies the identities (which can alternatively be taken as the defining identities of the triple system)

$$\{a\,b\,c\} = \{c\,b\,a\}, \qquad \{a\,b\,\{c\,d\,x\}\} - \{c\,d\,\{a\,b\,x\}\} - \{a\,\{d\,c\,b\}\,x\} + \{\{c\,d\,a\}\,b\,x\} = 0. \tag{A.2}$$
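The identities (A.1) and (A.2) can be spot-checked with exact arithmetic in a simple special case. The sketch below is an illustration under stated assumptions, not the octonionic setting of the paper: it takes real symmetric 3 × 3 matrices with the Jordan product $a \circ b = (ab + ba)/2$ and a trivial conjugation $\tilde b = b$. All helper names are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    """Ordinary matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def jordan(a, b):
    """Assumed Jordan product a o b = (ab + ba) / 2."""
    return [[x / 2 for x in row] for row in add(matmul(a, b), matmul(b, a))]

def triple(a, b, c):
    """{a b c} = a o (b o c) + (a o b) o c - b o (a o c), trivial conjugation."""
    t = add(jordan(a, jordan(b, c)), jordan(jordan(a, b), c))
    return add(t, jordan(b, jordan(a, c)), sign=-1)

def sym(entries):
    """Symmetric 3x3 matrix with exact Fraction entries from six numbers."""
    p, q, r, s, t, u = (Fraction(e) for e in entries)
    return [[p, q, r], [q, s, t], [r, t, u]]

a = sym([1, 2, 0, -1, 3, 2])
b = sym([0, 1, -2, 4, 1, 1])
c = sym([2, -1, 1, 0, 5, -3])
d = sym([3, 0, 2, 1, -1, 2])
x = sym([-1, 4, 1, 2, 0, 1])
zero = [[Fraction(0)] * 3 for _ in range(3)]

a2 = jordan(a, a)
jordan_ok = jordan(jordan(a, b), a2) == jordan(a, jordan(b, a2))       # (A.1)
symmetry_ok = triple(a, b, c) == triple(c, b, a)                        # (A.2), first
lhs = add(triple(a, b, triple(c, d, x)), triple(c, d, triple(a, b, x)), sign=-1)
lhs = add(lhs, triple(a, triple(d, c, b), x), sign=-1)
lhs = add(lhs, triple(triple(c, d, a), b, x))
fundamental_ok = lhs == zero                                            # (A.2), second

print(jordan_ok, symmetry_ok, fundamental_ok)
```

With this product the triple product reduces to $(abc + cba)/2$, which is why the identities hold for any choice of the sample matrices.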

The Tits–Kantor–Koecher (TKK) construction [32, 21, 25] associates every JTS with a 3-graded Lie algebra

$$\mathfrak{g} = \mathfrak{g}_{-1} \oplus \mathfrak{g}_0 \oplus \mathfrak{g}_{+1}, \tag{A.3}$$

satisfying the formal commutation relations $[\mathfrak{g}_{+1}, \mathfrak{g}_{-1}] = \mathfrak{g}_0$, $[\mathfrak{g}_{+1}, \mathfrak{g}_{+1}] = 0$, $[\mathfrak{g}_{-1}, \mathfrak{g}_{-1}] = 0$. With the exception of the Lie algebras $G_2$, $F_4$, and $E_8$ every simple Lie algebra $\mathfrak{g}$ can be given a three graded decomposition with respect to a subalgebra $\mathfrak{g}_0$ of maximal rank.


By the TKK construction the elements $U_a$ of the $\mathfrak{g}_{+1}$ subspace of the Lie algebra are labelled by the elements $a \in J$. Furthermore every such Lie algebra $\mathfrak{g}$ admits an involutive automorphism $\iota$, which maps the elements of the grade +1 space onto the elements of the subspace of grade −1:

$$\iota(U_a) =: \tilde U_a \in \mathfrak{g}_{-1}. \tag{A.4}$$

To get a complete set of generators of $\mathfrak{g}$ we define

$$[U_a, \tilde U_b] = S_{ab}, \qquad [S_{ab}, U_c] = U_{\{abc\}}, \tag{A.5}$$

where $S_{ab} \in \mathfrak{g}_0$ and $\{abc\}$ is the Jordan triple product under which the space J is closed. The remaining commutation relations are

$$[S_{ab}, \tilde U_c] = \tilde U_{\{bac\}}, \qquad [S_{ab}, S_{cd}] = S_{\{abc\}d} - S_{c\{bad\}}, \tag{A.6}$$

and the closure of the algebra under commutation follows from the defining identities of a JTS given above. The Lie algebra generated by $S_{ab}$ is called the structure algebra of the JTS J, under which the elements of J transform linearly. The traceless elements of this action of $S_{ab}$ generate the reduced structure algebra of J. There exist four infinite families of hermitian JTS’s and two exceptional ones [31, 27]. The latter are listed in the table below (where $M_{1,2}(\mathbb{O})$ denotes 1 × 2 matrices over the octonions, i.e. the octonionic plane)

    J                          G               H
    $M_{1,2}(\mathbb{O}_S)$    $E_{6(6)}$      SO(5, 5)
    $M_{1,2}(\mathbb{O})$      $E_{6(-14)}$    SO(8, 2)
    $J_3^{\mathbb{O}_S}$       $E_{7(7)}$      $E_{6(6)}$
    $J_3^{\mathbb{O}}$         $E_{7(-25)}$    $E_{6(-26)}$

Here we are mainly interested in the real form $J_3^{\mathbb{O}_S}$, which corresponds to the split octonions $\mathbb{O}_S$ and has $E_{7(7)}$ and $E_{6(6)}$ as its conformal and reduced structure group, respectively.

Appendix B. The Quartic $E_{7(7)}$ Invariant

In the SL(8, R) basis of $E_{7(7)}$ the quartic invariant is given by (18), which we here repeat for convenience

$$I_4^{\mathrm{SL}(8,\mathbb{R})} = X^{ij}X_{jk}X^{kl}X_{li} - \tfrac{1}{4}\,X^{ij}X_{ij}X^{kl}X_{kl} + \tfrac{1}{96}\,\epsilon^{ijklmnpq}X_{ij}X_{kl}X_{mn}X_{pq} + \tfrac{1}{96}\,\epsilon_{ijklmnpq}X^{ij}X^{kl}X^{mn}X^{pq}. \tag{B.1}$$


Another very useful form of $E_{7(7)}$ makes the maximal compact subgroup SU(8) manifest. The fundamental 56 representation then is spanned by the complex tensors $Z_{AB}$ which are related to the SL(8, R) basis by [4]

$$Z^{AB} = (Z_{AB})^* = \frac{1}{4\sqrt{2}}\,(X^{ij} - i\,X_{ij})\,\Gamma^{ij}_{AB}, \tag{B.2}$$

where $\Gamma^{ij}_{AB}$ are the SO(8) gamma matrices. In this basis the quartic invariant takes the form

$$I_4^{\mathrm{SU}(8)} = Z^{AB}Z_{BC}Z^{CD}Z_{DA} - \tfrac{1}{4}\,Z^{AB}Z_{AB}Z^{CD}Z_{CD} + \tfrac{1}{96}\,\epsilon^{ABCDEFGH}Z_{AB}Z_{CD}Z_{EF}Z_{GH} + \tfrac{1}{96}\,\epsilon_{ABCDEFGH}Z^{AB}Z^{CD}Z^{EF}Z^{GH}. \tag{B.3}$$

The precise relation between $I_4^{\mathrm{SU}(8)}$ and $I_4^{\mathrm{SL}(8,\mathbb{R})}$ has never been spelled out in the literature although it is claimed in [4] that they should be proportional. In fact, we have

$$I_4^{\mathrm{SU}(8)} = -I_4^{\mathrm{SL}(8,\mathbb{R})}. \tag{B.4}$$

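To prove this claim, one needs the identities

$$\mathrm{Tr}(\Gamma^{ij}\Gamma^{kl}\Gamma^{mn}\Gamma^{pq}) = -128\,\delta^{ij}_{p[k}\delta^{mn}_{l]q} + 128\,\delta^{ij}_{p[m}\delta^{kl}_{n]q} + 128\,\delta^{ij}_{k[m}\delta^{pq}_{n]l} + 96\,(\delta^{ij}_{kl}\delta^{mn}_{pq})_{\mathrm{sym}} \mp 8\,\epsilon^{ijklmnpq}, \tag{B.5}$$

and

$$\epsilon^{ABCDEFGH}\,\Gamma^{ij}_{AB}\Gamma^{kl}_{CD}\Gamma^{mn}_{EF}\Gamma^{pq}_{GH} = -128\,(12\,\delta^{ij}_{kl}\delta^{mn}_{pq} + 48\,\delta^{ij}_{p[k}\delta^{mn}_{l]q})_{\mathrm{sym}} \mp \epsilon^{ijklmnpq}, \tag{B.6}$$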

where $(\ldots)_{\mathrm{sym}}$ denotes symmetrization w.r.t. the pairs of indices (ij), (kl), (mn), (pq), and the signs ∓ depend on whether the spinor representation or the conjugate spinor representation of the gamma matrices is used: $\Gamma^{ijklmnpq} = \mp\,\epsilon^{ijklmnpq}$.

To see that $I_4$ can assume both positive and negative values it is sufficient to consider configurations in the SU(8) basis of the form [8]

$$Z_{AB} =: \begin{pmatrix} z_1 & & \\ & \ddots & \\ & & z_4 \end{pmatrix} \otimes \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \tag{B.7}$$

with complex parameters $z_1, \ldots, z_4$. For this configuration the quartic invariant becomes

$$I_4^{\mathrm{SU}(8)} = \sum_{\alpha} |z_\alpha|^4 - 2 \sum_{\beta > \alpha} |z_\alpha|^2 |z_\beta|^2 + 4\,z_1 z_2 z_3 z_4 + 4\,z_1^* z_2^* z_3^* z_4^*. \tag{B.8}$$

Using this formula, one can easily see that both positive and negative values are possible for $I_4$:

i) We find positive values for $I_4$ when all but one parameter vanish:

$$I_4^{\mathrm{SU}(8)} = |z_1|^4 > 0 \quad \text{for} \quad z_1 \neq 0,\ z_2 = z_3 = z_4 = 0.$$

ii) $I_4$ vanishes when all parameters take the same real (electric) or imaginary (magnetic) value:

$$I_4^{\mathrm{SU}(8)} = 0 \quad \text{for} \quad z_1 = z_2 = z_3 = z_4 = M \text{ or } iM,\ M \in \mathbb{R}.$$

This is the example considered in [20] corresponding to maximally BPS black hole solutions in d = 4, N = 8 supergravity with vanishing entropy and vanishing area of the horizon.

iii) $I_4$ is negative when all parameters take the same complex “dyonic” value. For instance, $I_4^{\mathrm{SU}(8)}$
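The three cases i) to iii) can be checked numerically from (B.8). The Python sketch below evaluates (B.8) for the block-diagonal configuration (B.7); the function name, the sample value M = 2 and the phase π/4 used for the "dyonic" case are illustrative assumptions, not values taken from the paper.

```python
import cmath
from itertools import combinations

def quartic_invariant_diag(z):
    """Evaluate (B.8) for the configuration (B.7) with parameters z1..z4."""
    z = [complex(w) for w in z]
    if len(z) != 4:
        raise ValueError("expected four parameters z1..z4")
    quartic = sum(abs(w) ** 4 for w in z)
    mixed = sum(abs(u) ** 2 * abs(v) ** 2 for u, v in combinations(z, 2))
    prod = z[0] * z[1] * z[2] * z[3]
    # 4 z1 z2 z3 z4 + 4 (z1 z2 z3 z4)^* = 8 Re(z1 z2 z3 z4)
    return quartic - 2 * mixed + 8 * prod.real

M = 2.0                                        # sample real amplitude (assumption)
case_i = quartic_invariant_diag([M, 0, 0, 0])
case_ii_electric = quartic_invariant_diag([M] * 4)
case_ii_magnetic = quartic_invariant_diag([1j * M] * 4)
dyonic = M * cmath.exp(1j * cmath.pi / 4)      # sample complex value (assumption)
case_iii = quartic_invariant_diag([dyonic] * 4)

print(case_i, case_ii_electric, case_ii_magnetic, case_iii)
```

For equal parameters $z_\alpha = M e^{i\varphi}$ the formula reduces to $8 M^4(\cos 4\varphi - 1)$, so the invariant vanishes for real or imaginary values and is negative for any other common phase.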

x0 }, the map $A\Omega \mapsto A^*\Omega$, $A \in \mathcal{A}(W_1)''$, defines an antilinear operator $S_{W_1} : \mathcal{A}(W_1)''\Omega \to \mathcal{A}(W_1)''\Omega$ which is closable. Its closure is called the Tomita operator of $\Omega$ and $\mathcal{A}(W_1)''$ and admits a unique polar decomposition $S_{W_1} = J_{W_1}\Delta_{W_1}^{1/2}$ into an antiunitary conjugation $J_{W_1}$ (the “phase” of $S_{W_1}$) which is called the modular conjugation of $(\Omega, \mathcal{A}(W_1)'')$, and a positive operator $\Delta_{W_1}^{1/2}$ (the “modulus” of $S_{W_1}$) whose square $\Delta_{W_1}$ is referred to as the modular operator of $(\Omega, \mathcal{A}(W_1)'')$. The main theorem of Tomita–Takesaki theory [46] now implies that the adjoint actions of the operators $\Delta_{W_1}^{it}$ map the algebras $\mathcal{A}(W_1)''$ and $\mathcal{A}(W_1)'$ onto themselves, whereas the adjoint action of the conjugation $J_{W_1}$ maps the two algebras onto one another.

Bisognano and Wichmann showed that for finite-component Wightman fields, the unitary $\Delta_{W_1}^{it}$ coincides with the unitary representing the 0-1-boost by $-2\pi t$ for all $t \in \mathbb{R}$, whereas $J_{W_1}$ implements a charge conjugation together with a time reflection and a spatial reflection in the 1-direction; this combination of discrete transformations will be referred to as a $P_1CT$-symmetry. For the algebraic setting, Borchers proved in [11]² that the spectrum condition (without assuming Lorentz covariance) implies the commutation relations

(i) $J_{W_1} U(a) J_{W_1} = U(j_1 a)$,

(ii) $\Delta_{W_1}^{it} U(a) \Delta_{W_1}^{-it} = U(\Lambda_1(-2\pi t)a)$ for all $t \in \mathbb{R}$,

where $\Lambda_1(-2\pi t)$ denotes the Lorentz boost by $-2\pi t$ in the 1-direction, while $j_1$ is the reflection defined by $j_1 x := (-x_0, -x_1, x_2, \ldots, x_s)$. Wiesbrock noted that Borchers’ relations are not only a necessary, but also a sufficient condition for the spectrum condition ([52], cf. also [25]). For 1+1 dimensions, Borchers’ relations immediately imply [11] that the net of observables may be enlarged to a local net which generates the same wedge algebras (and hence the same corresponding modular operator and conjugation) as the original one and which has the property that $J_{W_1}$ is a $P_1CT$-operator (modular $P_1CT$-symmetry), whereas $\Delta_{W_1}^{it}$ implements the Lorentz boost by $-2\pi t$ for each $t \in \mathbb{R}$ (modular Lorentz symmetry).

The first uniqueness theorem for modular symmetries states that even in higher dimensions, $J_{W_1}$ or $\Delta_{W_1}^{it}$, $t \in \mathbb{R}$, can be shown to be a $P_1CT$-operator or a 0-1-Lorentz boost, respectively, provided that $J_{W_1}$ or $\Delta_{W_1}^{it}$ implement any geometric action on the net. The first step towards it is the following lemma. In this lemma and in what follows, $\mathcal{K}$ will denote the class of all double cones of the form $O := (a + V_+) \cap (b - V_+)$, $a, b \in \mathbb{R}^{1+s}$.

Lemma 2.1. Let K be a unitary or antiunitary operator with the property that for every double cone O there are open sets $M_O$ and $N_O$ such that

$$K\mathcal{A}(O)K^* = \mathcal{A}(M_O), \qquad K^*\mathcal{A}(O)K = \mathcal{A}(N_O),$$

2 For a considerably simpler proof found recently, see [28].

Two Uniqueness Results on Unruh Effect and on PCT-Symmetry


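and let κ be a causal automorphism³ of $\mathbb{R}^{1+s}$ such that $KU(a)K^* = U(\kappa a)$ for all $a \in \mathbb{R}^{1+s}$. Then there is a unique $\xi \in \mathbb{R}^{1+s}$ such that

$$K\mathcal{A}(O)K^* = \mathcal{A}(\kappa O + \xi) \quad \text{for all } O \in \mathcal{K}.$$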

A first proof of Lemma 2.1 was published in [37], but both the statement and the proof given there were more general, which made the formulation somewhat technical. For the reader’s convenience a less general, but more accessible formulation is used here, and a more detailed version of the proof is given below. The following theorem is a consequence of Lemma 2.1 and Borchers’ commutation relations.

Theorem 2.2 (First Uniqueness Theorem).
(i) If for every double cone $O \in \mathcal{K}$ there is an open set $M_O$ such that $J_{W_1}\mathcal{A}(O)J_{W_1} = \mathcal{A}(M_O)$, then

$$J_{W_1}\mathcal{A}(O)J_{W_1} = \mathcal{A}(j_1 O) \quad \text{for all } O \in \mathcal{K}.$$

(ii) If for every $t \in \mathbb{R}$ and for every $O \in \mathcal{K}$ there is an open set $M_O^t$ such that $\Delta_{W_1}^{it}\mathcal{A}(O)\Delta_{W_1}^{-it} = \mathcal{A}(M_O^t)$, then

$$\Delta_{W_1}^{it}\mathcal{A}(O)\Delta_{W_1}^{-it} = \mathcal{A}(\Lambda_1(-2\pi t)O) \quad \text{for all } O \in \mathcal{K}.$$

The statement of part (ii) implies the statement of part (i) [30], i.e., the Unruh effect implies modular $P_1CT$-symmetry. Further results relating the above statements to each other and to similar conditions can be found in [26]. Assuming that $\Omega$ is separating with respect to the algebra $\mathcal{A}(V_+)''$, Borchers also found commutation relations for the corresponding modular conjugation and unitaries: for each $a \in \mathbb{R}^{1+s}$, he found that

$$J_+ U(a) J_+ = U(-a); \qquad \Delta_+^{it} U(a) \Delta_+^{-it} = U(e^{-2\pi t}a) \quad \text{for all } t \in \mathbb{R}.$$

These relations, together with Lemma 2.1, imply the following corollary:

Corollary 2.3 (Uniqueness Theorem “1a”). Assume $\mathcal{A}$ to be Poincaré covariant, and assume that the vacuum vector $\Omega$ is separating with respect to the algebra $\mathcal{A}(V_+)''$, and let $\Delta_{V_+}$ and $J_{V_+}$ be the corresponding modular operator and conjugation, respectively.

³ Recall that a causal automorphism of $\mathbb{R}^{1+s}$ is a bijection $f : \mathbb{R}^{1+s} \to \mathbb{R}^{1+s}$ which preserves the causal structure of $\mathbb{R}^{1+s}$, i.e., f(x) and f(y) are timelike with respect to each other if and only if x and y are timelike with respect to each other. Without assuming linearity or continuity, one can show that the group of all causal automorphisms of $\mathbb{R}^{1+s}$ is generated by the elements of the Poincaré group and the dilatations [1, 3, 2, 54, 15]. Since the transformations implemented on the translations by Borchers’ commutation relations happen to be causal in all applications discussed below, this assumption means no loss of generality.


B. Kuckert

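(i) If for every double cone O there is an open set $M_O$ such that $J_{V_+}\mathcal{A}(O)J_{V_+} = \mathcal{A}(M_O)$, then

$$J_{V_+}\mathcal{A}(O)J_{V_+} = \mathcal{A}(-O) \quad \text{for all } O \in \mathcal{K}.$$

(ii) If for every $t \in \mathbb{R}$ and every double cone O there is an open set $M_O^t$ such that $\Delta_{V_+}^{it}\mathcal{A}(O)\Delta_{V_+}^{-it} = \mathcal{A}(M_O^t)$, then

$$\Delta_{V_+}^{it}\mathcal{A}(O)\Delta_{V_+}^{-it} = \mathcal{A}(e^{-2\pi t}O) \quad \text{for all } O \in \mathcal{K}.$$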

Since massive theories cannot be dilation invariant unless their mass spectrum is dilation invariant (cf., e.g., [42]), the models concerned by part (ii) of this corollary are massless theories. But it follows from the scattering theory for massless fermions and bosons in 1+3 or 1+1 dimensions (see [17–19]) that either of the symmetry properties found in part (i) and part (ii) of the corollary implies a massless theory to be free (i.e., its S-matrix is trivial) (see [18, 20, 23]). Note that for the 1+1-dimensional case, all modular symmetries considered in Thm. 2.2 and Cor. 2.3 have been established in [11].

It is assumed above that the adjoint actions of $J_{W_1}$ and $\Delta_{W_1}^{it}$, $t \in \mathbb{R}$, map each local algebra $\mathcal{A}(O)$, $O \in \mathcal{K}$, onto the algebra $\mathcal{A}(M_O)$ associated with some open region $M_O$ in Minkowski space. This means that, essentially, the net structure has to be preserved. This is the restrictive aspect of the assumption. On the other hand, the shape of the region $M_O$ is left completely arbitrary; the map $\mathcal{K} \ni O \mapsto M_O$ is not even assumed to be induced by a point transformation. In this aspect, the above assumptions are rather weak.

But there are, of course, other ways to specify what a “geometric action” is. Denote by $\mathcal{W}$ the class of all wedges, i.e., all images of the Rindler wedge $W_1$ under Poincaré transformations. For $M \subset \mathbb{R}^{1+s}$, define the causal complement $M^c$ to be the set of all points that are spacelike to M, and let $M'$ denote the interior of $M^c$. It has been shown in [38, 39] that one can define a nonempty localization region for each local observable $A \notin \mathbb{C}\,\mathrm{id}$ by

$$L(A) := \bigcap \{\overline{W} : W \in \mathcal{W},\ A \in \mathcal{A}(W')'\}.$$

This localization prescription will be said to satisfy locality if any two local observables A and B with the property that L(A) and L(B) are spacelike separated commute. This property does not follow from the locality property of the net alone, but with the following additional assumptions one can derive it for the present setting [39]:

(E) Wedge duality. $\mathcal{A}(W')' = \mathcal{A}(W)''$ for each wedge $W \in \mathcal{W}$.

(F) Wedge additivity. For each wedge $W \in \mathcal{W}$ and each double cone $O \in \mathcal{K}$ with $W \subset W + O$ one has

$$\mathcal{A}(W)'' \subset \Big(\bigcup_{a \in W} \mathcal{A}(a + O)\Big)''.$$

Wedge duality is a property of all finite-component Wightman fields by the Bisognano–Wichmann theorem, and wedge additivity is a standard property of Wightman


fields as well. Condition (F) is slightly stronger than the definition of wedge additivity used in [47, 39], where the algebras $\mathcal{A}(a + O)$ in Condition (F) are replaced by the larger algebras $\mathcal{A}(a + O)''$, but as this difference is not expected to be substantial for physics, we use the same term for convenience, which is in harmony with the other existing notions of additivity used in algebraic quantum field theory.

Assume now that the localization region of the observable $A_t := \Delta_{W_1}^{it} A \Delta_{W_1}^{-it}$ depends continuously on t, i.e., that for every sequence $(t_\nu)_{\nu \in \mathbb{N}}$ which converges to some $t_\infty \in \mathbb{R}$, the localization region $L(A_{t_\infty})$ consists precisely of all accumulation points of sequences $(x_\nu)_{\nu \in \mathbb{N}}$ with $x_\nu \in L(A_{t_\nu})$. Then the following lemma establishes a first restriction on how the localization region can depend on t.

Lemma 2.4. With Assumptions (A)–(E), suppose the localization prescription L defined above satisfies locality. Let A be a local observable in $\mathcal{A}(W_1)$, and assume that there exists an ε > 0 such that all $A_t$, $t \in [0, \varepsilon]$, are local observables and such that the function $[0, \varepsilon] \ni t \mapsto L(A_t)$ is continuous in the above sense. Then

(i) $L(A_\varepsilon) \subset \Lambda_1(-2\pi\varepsilon)(L(A) + W_1)^{cc} \cap (L(A) - W_1)^{cc}$;
(ii) $L(A_\varepsilon) \subset L(A) - \overline{V}_+$;
(iii) $L(A) \subset L(A_\varepsilon) + \overline{V}_+$.

It is shown in the Appendix that the continuity assumption made on $t \mapsto L(A_t)$ is equivalent to continuity with respect to a metric first considered by Hausdorff, and that $\bigcup_{t \in [0,\varepsilon]} L(A_t)$ is compact.

Next suppose that $t \mapsto L(A_t)$ is continuous not only for sufficiently small t, but for all $t \in \mathbb{R}$, and assume wedge additivity in addition. With these slightly strengthened assumptions one can now prove the following:

Theorem 2.5 (Second Uniqueness Theorem). With Assumptions (A)–(F), assume that $\Delta_{W_1}^{it} \mathcal{A}_{loc} \Delta_{W_1}^{-it} = \mathcal{A}_{loc}$, and suppose that $L(A_t)$ depends continuously on t for all $t \in \mathbb{R}$ and for all $A \in \mathcal{A}_{loc}$. Then

$$L(\Delta_{W_1}^{it} A \Delta_{W_1}^{-it}) = \Lambda_1(-2\pi t) L(A) \quad \text{for all } A \in \mathcal{A}_{loc}.$$

By the result of Guido and Longo, the conclusion of this theorem also implies modular $P_1CT$-symmetry, but Theorem 2.5 does not provide a proper parallel to the $P_1CT$-part of the first uniqueness theorem, which may also apply if the modular group does not act in any geometric way.

The assumption that every local observable A is mapped onto some other local observable under the adjoint action of the modular group prevents A from being mapped onto an observable localized in an unbounded region. For every bounded open region O there are conformal transformations which map O onto an unbounded region; these transformations are excluded a priori. In contrast, the assumptions of the first uniqueness theorem do not exclude these symmetries explicitly, while it is evident from this theorem that the modular objects under consideration cannot implement these symmetries. Another restrictive assumption of the second uniqueness theorem is that wedge duality is assumed there, whereas the first one can be used to derive wedge duality. On the other hand the assumptions made in the second uniqueness theorem admit the situation that the net structure of $\mathcal{A}$ is destroyed completely under the action of the modular group.


3. Proofs

For every algebra $\mathcal{M} \subset B(H)$, define its localization region $L(\mathcal{M})$ with respect to the net $\mathcal{A}$ by

$$L(\mathcal{M}) := \bigcup \{O \in \mathcal{K} : \mathcal{A}(O) \subset \mathcal{M}\}.$$

The only reason to use the class $\mathcal{K}$ of double cones in this definition is convenience; one could replace $\mathcal{K}$ by the larger class $\mathcal{T}$ of all open sets in $\mathbb{R}^{1+s}$ without affecting the definition. To see this, denote the localization region obtained this way by $L_{\mathcal{T}}(\mathcal{M})$; it is trivial that $L(\mathcal{M}) \subset L_{\mathcal{T}}(\mathcal{M})$ as $\mathcal{K} \subset \mathcal{T}$, while from isotony of the net and the fact that each open region M is the union of all double cones $O \subset M$, one finds

$$L_{\mathcal{T}}(\mathcal{M}) = \bigcup \{M \in \mathcal{T} : \mathcal{A}(M) \subset \mathcal{M}\} = \bigcup \{O \in \mathcal{K} : \exists M \in \mathcal{T} : O \subset M,\ \mathcal{A}(M) \subset \mathcal{M}\} \subset \bigcup \{O \in \mathcal{K} : \mathcal{A}(O) \subset \mathcal{M}\} = L(\mathcal{M}),$$

which is the converse inclusion. It is obvious from the definitions that $L(\mathcal{A}(M)) \supset M$. For causally complete and convex regions one can prove the converse inclusion, which we recall without proof from [39] (Cor. 5.4) for later use. Here a causally complete region is a region R such that $(R^c)^c = R$.

Lemma 3.1. Let $R \subset \mathbb{R}^{1+s}$ be a causally complete convex open region.
(i) For every open region $M \subset \mathbb{R}^{1+s}$, one has $\mathcal{A}(M) \subset \mathcal{A}(R')'$ if and only if $M \subset R$.
(ii) $L(\mathcal{A}(R)) = R$.

One also checks that for any such R, one has $L(\mathcal{A}(R)) = L(\mathcal{A}(R)'') = L(\mathcal{A}(R')')$. We emphasize that the above assumption s ≥ 2 is crucial for this lemma; in 1+1 dimensions, there are chiral theories which do not obey the statement of the lemma. The repeated use of this lemma in the proofs is the main reason why s ≥ 2 is assumed throughout this paper.

Proof of Lemma 2.1. In what follows, K and κ are defined as in Lemma 2.1. As before, $\mathcal{K}$ will denote the class of double cones. For any open region $M \subset \mathbb{R}^{1+s}$, we denote by $\mathcal{K}_M$ the class of all double cones $O \in \mathcal{K}$ with $O \subset M$, and for each subalgebra $\mathcal{M}$ of B(H), we denote by $\mathcal{K}^{\mathcal{M}}$ the class of all double cones O such that $\mathcal{A}(O) \subset \mathcal{M}$. The proof will be subdivided into five lemmas.

The first implies that for every $O \in \mathcal{K}$, the regions $M_O$ and $N_O$ are bounded. It uses the fact that a region M is bounded if and only if its difference region M − M is bounded, and that difference sets can be expressed in terms of translations. Since the behaviour of translations under the action of the symmetry K is known by assumption, one can prove the following lemma.

Lemma 3.2. For every double cone $O \in \mathcal{K}$, one has

$$L(K\mathcal{A}(O)K^*) - L(K\mathcal{A}(O)K^*) = \kappa(O - O).$$


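Proof. Using the assumptions of Lemma 2.1, one obtains

$$\begin{aligned}
L(K\mathcal{A}(O)K^*) - L(K\mathcal{A}(O)K^*) &= L(\mathcal{A}(M_O)) - L(\mathcal{A}(M_O)) \\
&= \{a \in \mathbb{R}^{1+s} : \exists P \in \mathcal{K}^{\mathcal{A}(M_O)} : \mathcal{A}(P + a) \subset \mathcal{A}(M_O)\} \\
&= \{a \in \mathbb{R}^{1+s} : \exists P \in \mathcal{K}^{\mathcal{A}(M_O)} : KU(\kappa^{-1}a)K^*\mathcal{A}(P)KU(-\kappa^{-1}a)K^* \subset \mathcal{A}(M_O)\} \\
&= \kappa\{a \in \mathbb{R}^{1+s} : \exists P \in \mathcal{K}^{\mathcal{A}(M_O)} : U(a)\underbrace{K^*\mathcal{A}(P)K}_{=\mathcal{A}(N_P)}U(-a) \subset \underbrace{K^*\mathcal{A}(M_O)K}_{=\mathcal{A}(O)}\} \\
&\subset \kappa\{a \in \mathbb{R}^{1+s} : \exists P \in \mathcal{K}^{\mathcal{A}(M_O)} : \exists Q \in \mathcal{K}^{\mathcal{A}(N_P)} : \mathcal{A}(Q + a) \subset \mathcal{A}(O)\}.
\end{aligned}$$

Since the definitions and isotony imply

$$\mathcal{K}^{\mathcal{A}(N_P)} = \mathcal{K}^{K^*\mathcal{A}(P)K} \subset \mathcal{K}^{K^*\mathcal{A}(M_O)K} = \mathcal{K}^{\mathcal{A}(O)},$$

and since, as remarked above, $\mathcal{K}^{\mathcal{A}(O)} = \mathcal{K}_O$, one obtains

$$L(K\mathcal{A}(O)K^*) - L(K\mathcal{A}(O)K^*) \subset \kappa\{a \in \mathbb{R}^{1+s} : \exists Q \in \mathcal{K}_O : \mathcal{A}(Q + a) \subset \mathcal{A}(O)\} = \kappa(O - O).$$

Conversely,

$$\begin{aligned}
\kappa(O - O) &= \kappa\{a \in \mathbb{R}^{1+s} : \exists P \in \mathcal{K}_O : \mathcal{A}(P + a) \subset \mathcal{A}(O)\} \\
&= \{a \in \mathbb{R}^{1+s} : \exists P \in \mathcal{K}_O : \mathcal{A}(P + \kappa^{-1}a) \subset \mathcal{A}(O)\} \\
&= \{a \in \mathbb{R}^{1+s} : \exists P \in \mathcal{K}_O : K^*U(a)K\mathcal{A}(P)K^*U(-a)K \subset \mathcal{A}(O)\} \\
&= \{a \in \mathbb{R}^{1+s} : \exists P \in \mathcal{K}_O : \mathcal{A}(M_P + a) \subset \mathcal{A}(M_O)\} \\
&\subset \{a \in \mathbb{R}^{1+s} : \exists P \in \mathcal{K}_O : \exists Q \in \mathcal{K}^{\mathcal{A}(M_P)} : \mathcal{A}(Q + a) \subset \mathcal{A}(M_O)\},
\end{aligned}$$

and since

$$\mathcal{K}^{\mathcal{A}(M_P)} = \mathcal{K}^{K\mathcal{A}(P)K^*} \subset \mathcal{K}^{K\mathcal{A}(O)K^*} = \mathcal{K}^{\mathcal{A}(M_O)},$$

one obtains

$$\kappa(O - O) \subset \{a \in \mathbb{R}^{1+s} : \exists Q \in \mathcal{K}^{\mathcal{A}(M_O)} : \mathcal{A}(Q + a) \subset \mathcal{A}(M_O)\} = L(\mathcal{A}(M_O)) - L(\mathcal{A}(M_O)).$$

The next lemma proves that strict inclusions of double cones are preserved under the adjoint action of the operator K. Again, this boils down to translating local algebras up and down Minkowski space and using the commutation relations between K and the translation operators. One uses the fact that $\overline{O} \subset P$ if and only if O can be translated within P into all directions.

Lemma 3.3. For any two double cones $O, P \in \mathcal{K}$ with $\overline{O} \subset P$, one has

$$\overline{L(K\mathcal{A}(O)K^*)} \subset L(K\mathcal{A}(P)K^*).$$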


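Proof. $\overline{O} \subset P$ if and only if the set $\{a \in \mathbb{R}^{1+s} : O + a \subset P\}$ is a neighbourhood of the origin of $\mathbb{R}^{1+s}$. After using Lemma 3.1, elementary transformations yield

$$\begin{aligned}
\{a \in \mathbb{R}^{1+s} : O + a \subset P\} &= \{a \in \mathbb{R}^{1+s} : \mathcal{A}(O + a) \subset \mathcal{A}(P)\} \\
&= \{a \in \mathbb{R}^{1+s} : K^*U(\kappa a)K\mathcal{A}(O)K^*U(-\kappa a)K \subset \mathcal{A}(P)\} \\
&= \{a \in \mathbb{R}^{1+s} : \mathcal{A}(M_O + \kappa a) \subset \mathcal{A}(M_P)\} \\
&= \kappa^{-1}\{a \in \mathbb{R}^{1+s} : \mathcal{A}(M_O + a) \subset \mathcal{A}(M_P)\} \\
&\subset \kappa^{-1}\{a \in \mathbb{R}^{1+s} : L(\mathcal{A}(M_O)) + a \subset L(\mathcal{A}(M_P))\}.
\end{aligned}$$

Since κ is a linear automorphism of $\mathbb{R}^{1+s}$, it follows that $\overline{O}$ can be a subset of P only if $\{a \in \mathbb{R}^{1+s} : L(\mathcal{A}(M_O)) + a \subset L(\mathcal{A}(M_P))\}$ is a neighbourhood of the origin. This implies the statement.

The next lemma proves that the maps $\mathcal{K} \ni O \mapsto L(K\mathcal{A}(O)K^*)$ and $\mathcal{K} \ni O \mapsto L(K^*\mathcal{A}(O)K)$ are induced by continuous functions $\tilde\kappa : \mathbb{R}^{1+s} \to \mathbb{R}^{1+s}$ and $\hat\kappa : \mathbb{R}^{1+s} \to \mathbb{R}^{1+s}$.

Lemma 3.4. Let $x \in \mathbb{R}^{1+s}$ be arbitrary, and let $(O_\nu)_{\nu \in \mathbb{N}}$ be a neighbourhood base of x consisting of double cones $O_\nu \in \mathcal{K}$. Then $(L(K\mathcal{A}(O_\nu)K^*))_{\nu \in \mathbb{N}}$ is a neighbourhood base of a (naturally, unique) point $\tilde\kappa(x) \in \mathbb{R}^{1+s}$, and $(L(K^*\mathcal{A}(O_\nu)K))_{\nu \in \mathbb{N}}$ is a neighbourhood base of a point $\hat\kappa(x) \in \mathbb{R}^{1+s}$. The functions $x \mapsto \tilde\kappa(x)$ and $x \mapsto \hat\kappa(x)$ are continuous.

Proof. Without loss of generality, one may assume that $\overline{O_{\nu+1}} \subset O_\nu$ for all $\nu \in \mathbb{N}$. It follows from $L(\mathcal{A}(O)) = O$ for all $O \in \mathcal{K}$ and Lemma 3.2 that all $L(K\mathcal{A}(O_\nu)K^*)$, $\nu \in \mathbb{N}$, are bounded sets, and it follows from Lemma 3.3 that $\overline{L(K\mathcal{A}(O_{\nu+1})K^*)} \subset L(K\mathcal{A}(O_\nu)K^*)$. Therefore, the intersection of this family is nonempty, and Lemma 3.2 implies that the diameter of $L(K\mathcal{A}(O_\nu)K^*)$ tends to zero as ν tends to infinity. This implies that the intersection contains precisely one point $\tilde\kappa(x)$, as stated. The corresponding statements for $K^*$ are proved analogously. This proves that $x \mapsto \tilde\kappa(x)$ is a bijective point transformation.

Let $(x_\nu)_{\nu \in \mathbb{N}}$ be a sequence in $\mathbb{R}^{1+s}$ that converges to a point $x_\infty$. Then there is a neighbourhood base $(O_\nu)_{\nu \in \mathbb{N}}$ of $x_\infty$ with $x_\nu \in O_\nu$ for all $\nu \in \mathbb{N}$. But since $\tilde\kappa(x_\nu) \in \tilde\kappa(O_\nu)$ for all $\nu \in \mathbb{N}$, and since $\tilde\kappa(O_\nu)$ is a neighbourhood base of $\tilde\kappa(x_\infty)$, it follows that $\tilde\kappa(x_\nu)$ tends to $\tilde\kappa(x_\infty)$ as $\nu \to \infty$. This line of argument applies to $\hat\kappa$ as well.

The next lemma determines the functions $\tilde\kappa$ and $\hat\kappa$ up to a constant translation.

Lemma 3.5. For every $x \in \mathbb{R}^{1+s}$, one has $\tilde\kappa(x) = \tilde\kappa(0) + \kappa x$, and $\hat\kappa(x) = \hat\kappa(0) + \kappa^{-1}x$.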


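Proof. Let $(O_\nu)_{\nu \in \mathbb{N}}$ be a neighbourhood base of 0. Then $(O_\nu + x)_{\nu \in \mathbb{N}}$ is a neighbourhood base of x, and

$$\bigcap_{\nu \in \mathbb{N}} L(K\mathcal{A}(O_\nu + x)K^*) = \bigcap_{\nu \in \mathbb{N}} \tilde\kappa(O_\nu + x) = \{\tilde\kappa(x)\}.$$

On the other hand,

$$\bigcap_{\nu \in \mathbb{N}} L(K\mathcal{A}(O_\nu + x)K^*) = \bigcap_{\nu \in \mathbb{N}} L(U(\kappa x)K\mathcal{A}(O_\nu)K^*U(-\kappa x)) = \kappa x + \bigcap_{\nu \in \mathbb{N}} \tilde\kappa(O_\nu) = \kappa x + \{\tilde\kappa(0)\}.$$

The corresponding reasoning also leads to the statement made on $\hat\kappa$.

It has been shown now that $L(K\mathcal{A}(O)K^*) = \tilde\kappa(O)$ for each double cone $O \in \mathcal{K}$, and since $K\mathcal{A}(O)K^* = \mathcal{A}(M_O)$ by assumption, one concludes from $M_O \subset L(\mathcal{A}(M_O))$ and isotony that

$$K\mathcal{A}(O)K^* \subset \mathcal{A}(\tilde\kappa(O)) \quad \text{for all } O \in \mathcal{K}$$

and that

$$K^*\mathcal{A}(O)K \subset \mathcal{A}(\hat\kappa(O)) \quad \text{for all } O \in \mathcal{K}.$$

Using this, one can now prove that $\tilde\kappa$ and $\hat\kappa$ are inverse to each other.

Lemma 3.6. $\hat\kappa = \tilde\kappa^{-1}$, and in particular, $\tilde\kappa$ and $\hat\kappa$ are homeomorphisms.

Proof. For every double cone O, it follows from the preceding results that

$$\mathcal{A}(O) = K^*K\mathcal{A}(O)K^*K \subset K^*\mathcal{A}(\tilde\kappa(O))K \subset \mathcal{A}(\hat\kappa(\tilde\kappa(O))),$$

and since $\hat\kappa(\tilde\kappa(O))$ is a double cone by Lemma 3.5, one can use Lemma 3.1 to conclude that $O \subset \hat\kappa(\tilde\kappa(O))$. On the other hand, it follows from Lemma 3.2 that the radii of the double cones O and $\hat\kappa(\tilde\kappa(O))$ are equal, so these double cones coincide, and as this applies for any double cone O, it follows that $\hat\kappa = \tilde\kappa^{-1}$, as stated.

The proof of Lemma 2.1 is now almost complete. For each $O \in \mathcal{K}$, one has $K\mathcal{A}(O)K^* \subset \mathcal{A}(\tilde\kappa(O))$, and conversely,

$$\mathcal{A}(\tilde\kappa(O)) = KK^*\mathcal{A}(\tilde\kappa(O))KK^* \subset K\mathcal{A}(\tilde\kappa^{-1}(\tilde\kappa(O)))K^* = K\mathcal{A}(O)K^*,$$

so

$$K\mathcal{A}(O)K^* = \mathcal{A}(\tilde\kappa(O)),$$

and with $\xi := \tilde\kappa(0)$ it follows from Lemma 3.5 that

$$K\mathcal{A}(O)K^* = \mathcal{A}(\kappa O + \xi) \quad \text{for all } O \in \mathcal{K}.$$

That ξ is unique follows immediately from Lemma 3.1, so the proof of Lemma 2.1 is complete.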


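Proof of Theorem 2.2 (i). It follows from Lemma 2.1 that there is a unique $\iota \in \mathbb{R}^{1+s}$ such that

$$J_{W_1}\mathcal{A}(O)J_{W_1} = \mathcal{A}(j_1 O + \iota) \quad \text{for all } O \in \mathcal{K}.$$

It remains to be shown that ι = 0. Since $J_{W_1}$ is an involution, one has

$$x = j_1(j_1 x + \iota) + \iota = x + j_1\iota + \iota \quad \text{for all } x \in \mathbb{R}^{1+s},$$

which gives $\iota = -j_1\iota$, hence $\iota_2 = \cdots = \iota_s = 0$. Furthermore, one has $\mathcal{A}(W_1' + \iota)'' = J_{W_1}\mathcal{A}(W_1)''J_{W_1} = \mathcal{A}(W_1)'$ from Lemma 2.1 and the Tomita–Takesaki theorem, so on the one hand, it follows from Lemma 3.1 that $W_1' + \iota \subset W_1'$, and on the other hand, locality implies $\mathcal{A}(W_1') \subset \mathcal{A}(W_1)' = \mathcal{A}(W_1' + \iota)'' \subset \mathcal{A}((W_1' + \iota)')'$, so using Lemma 3.1 once more one finds $W_1' \subset W_1' + \iota$, arriving at $W_1' + \iota = W_1'$ and $\iota_0 = \iota_1 = 0$, as stated.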

In what follows, a well-known generalization of Asgeirsson’s Lemma will be used repeatedly. It is called the double cone theorem of Borchers and Vladimirov [50, 9, 51, 12]. Below, it will be applied together with the edge of the wedge theorem due to Bogoliubov (cf., e.g., [45, 51, 12]). For the reader’s convenience, both theorems are recalled here. For ε > 0, $B_\varepsilon$ will denote the open ε-ball centered at the origin of $\mathbb{R}^2$, and n will denote some natural number.

Theorem 3.7 (Edge of the Wedge Theorem). Let C be a nonempty, open and convex cone in $\mathbb{R}^n$. For some ε > 0, assume that $g_+$ is a function analytic in the tube $\mathbb{R}^n + i(C \cap B_\varepsilon)$, and that $g_-$ is a function analytic in the tube $\mathbb{R}^n - i(C \cap B_\varepsilon)$. If there is an open region $\gamma \subset \mathbb{R}^n$ where $g_+$ and $g_-$ have a common boundary value in the sense of distributions, then $g_+$ and $g_-$ are branches of a function g which is analytic in a complex neighbourhood of γ.

Theorem 3.8. Given the assumptions and notation of Theorem 3.7, let c be any smooth curve in γ which has all its tangent vectors in C. Then g is analytic in a complex neighbourhood of the double cone $(c + C) \cap (c - C)$.

Another well known lemma that will be used repeatedly is the following (cf. e.g., part (i) of Lemma 2.4.1 in [39]).

Lemma 3.9. Let $R \subset \mathbb{R}^{1+s}$ be a region that contains an open cone, and let $A \in \mathcal{A}_{loc}$ be a local observable such that $\langle\Omega, AB\Omega\rangle = \langle\Omega, BA\Omega\rangle$ for all $B \in \mathcal{A}(R)$. Then $A \in \mathcal{A}(R)'$.


Proof of Theorem 2.2 (ii). In what follows, e0 and e1 denote the unit vectors pointing into the 0- and the 1-direction, respectively. For every t ∈ R, Theorem 2.1 implies the existence of a unique ξ(t) ∈ R1+s with itW1 A(O)−it W1 = A(ξ(t) + 1 (−2π t)O)

for all O ∈ K.

By Corollary 3.1 it is clear that $\xi(t) + W_1 = W_1$, so for all s ∈ R, one has $\Lambda_1(-2\pi s)\xi(t) = \xi(t)$ and

\[
\begin{aligned}
A(\xi(s+t) + \Lambda_1(-2\pi(t+s))O) &= \Delta_{W_1}^{is}\Delta_{W_1}^{it}\, A(O)\, \Delta_{W_1}^{-it}\Delta_{W_1}^{-is} \\
&= A(\xi(s) + \Lambda_1(-2\pi s)(\xi(t) + \Lambda_1(-2\pi t)O)) \\
&= A(\xi(s) + \Lambda_1(-2\pi s)\xi(t) + \Lambda_1(-2\pi(t+s))O) \\
&= A(\xi(s) + \xi(t) + \Lambda_1(-2\pi(t+s))O),
\end{aligned}
\]

so ξ(s + t) = ξ(s) + ξ(t) follows from Lemma 3.1. One now concludes that ξ(λt) = λξ(t) for λ ∈ Q, so t → ξ(t) is Q-linear.

Next we prove that the function $\mathbb{R} \ni t \mapsto \xi(t)$ is continuous and, hence, R-linear. As ξ is additive, it is sufficient to prove continuity at t = 0. If ξ were not continuous there, there would exist a sequence $(t_\nu)$, ν ∈ N, in R that tends to zero, while $|\xi(t_\nu)| > \varepsilon$ for some ε > 0. Define the double cone $O := (-\tfrac{\varepsilon}{3} e_0 + V_+) \cap (\tfrac{\varepsilon}{3} e_0 - V_+)$. By the above results and locality, there is an $N_\varepsilon \in \mathbb{N}$ such that for any A, B ∈ A(O), one has $[\Delta_{W_1}^{it_\nu} A \Delta_{W_1}^{-it_\nu}, B] = 0$ for all $\nu > N_\varepsilon$. But as $\Delta_{W_1}^{it}$ depends strongly continuously on t, one concludes that A and B commute, and since A and B are arbitrary elements of A(O), it follows that A(O) is abelian. Additivity implies that the algebra A is abelian as well, so H = C by irreducibility, which contradicts the assumption that H is infinite-dimensional. It follows that ξ is continuous and, hence, R-linear, so there is a $\xi \in \mathbb{R}^{1+s}$ with ξ(t) = ξt for all t ∈ R.

It remains to be shown that ξ = 0. To this end, define the double cone $O := (\rho e_1 + V_+) \cap (\rho e_1 + \rho e_0 - V_+) \subset W_1$ for some ρ > 0. If one chooses ρ sufficiently small, there are $a \in \mathbb{R}^{1+s}$ and ε, δ > 0 such that (1) $\Lambda_1(-2\pi t)O + t\xi - \delta t e_0 \subset a + V_+$ for all t ∈ [0, ε]; (2) O ⊂ a + V+ . As an example, choose $a := \rho e_1 + \xi - |\xi| e_0$, where $|\xi| := \sqrt{|\xi^2|}$. Defining

\[ f(t) := (\Lambda_1(-2\pi t)\rho e_1 + t\xi - \delta t e_0 - a)^2, \]

one computes

\[ f'(0) = 2|\xi|\,(-2\pi\rho + |\xi| - \delta). \]

If one chooses $\rho < \frac{|\xi|}{2\pi}$, one can choose δ such that $0 < \delta < -2\pi\rho + |\xi|$. With this choice one has $f'(0) > 0$, and as f is smooth and satisfies f(0) = 0, there is an ε > 0 such that f(t) ≥ 0 for all t ∈ [0, ε], which immediately implies Condition (1), whereas Condition (2) follows from f(0) = 0.
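The derivative of f at t = 0 can be checked numerically. The following is a minimal sketch, not part of the original argument. It assumes the metric signature (+, −, ..., −), the sign convention that the boost with parameter θ maps (x0, x1) to (x0 cosh θ + x1 sinh θ, x0 sinh θ + x1 cosh θ), and arbitrary illustrative values of ρ, δ and ξ (with ξ having no 0- or 1-component, as implied by ξ + W1 = W1). The helper names are hypothetical.

```python
import math

def minkowski_sq(v):
    """Minkowski square v^2 = v0^2 - v1^2 - ... (assumed signature +,-,...,-)."""
    return v[0] ** 2 - sum(c ** 2 for c in v[1:])

def boost1(theta, v):
    """Boost in the 1-direction with parameter theta (assumed sign convention)."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return [ch * v[0] + sh * v[1], sh * v[0] + ch * v[1]] + list(v[2:])

def make_f(rho, delta, xi):
    """Build f(t) = (boost(-2 pi t) rho e1 + t xi - delta t e0 - a)^2
    with a = rho e1 + xi - |xi| e0 and |xi| = sqrt(|xi^2|)."""
    n = len(xi)
    norm_xi = math.sqrt(abs(minkowski_sq(xi)))
    e0 = [1.0] + [0.0] * (n - 1)
    e1 = [0.0, 1.0] + [0.0] * (n - 2)
    a = [rho * e1[i] + xi[i] - norm_xi * e0[i] for i in range(n)]

    def f(t):
        b = boost1(-2 * math.pi * t, [rho * c for c in e1])
        return minkowski_sq(
            [b[i] + t * xi[i] - delta * t * e0[i] - a[i] for i in range(n)]
        )

    return f, norm_xi

# Illustrative values only: |xi| = 0.5, rho < |xi|/(2 pi), 0 < delta < |xi| - 2 pi rho.
xi = [0.0, 0.0, 0.3, 0.4]
rho, delta = 0.05, 0.1
f, norm_xi = make_f(rho, delta, xi)

h = 1e-6
numeric = (f(h) - f(-h)) / (2 * h)                       # central difference for f'(0)
closed = 2 * norm_xi * (-2 * math.pi * rho + norm_xi - delta)  # formula from the text
```

With these values the central difference agrees with the closed-form expression, f(0) vanishes up to rounding, and f stays non-negative for small positive t, which is the behaviour used to obtain Condition (1).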


B. Kuckert

Fig. 1. The double cone P in the proof of Thm. 2.2 (ii)

As the set $\bigcup_{0 \le t \le \varepsilon} (\Lambda_1(-2\pi t)O + t\xi)$ is bounded, there is a $b \in \mathbb{R}^{1+s}$ such that (3) $\Lambda_1(-2\pi t)O + t\xi \subset b - V_+$ for all t ∈ [0, ε]. Now denote P := (a + V+) ∩ (b − V+) (Fig. 1), choose A ∈ A(O) and B ∈ A(P), denote by e0 the unit vector in the time direction, and consider the function $g_{A,B}$ defined by

\[ \mathbb{R}^2 \ni (t, s) \mapsto g_{A,B}(t, s) := \langle\Omega, [B,\, U(se_0)\Delta_{W_1}^{it} A \Delta_{W_1}^{-it} U(-se_0)]\,\Omega\rangle. \]

By Conditions (1) and (3), this function vanishes in the closure of the open triangle γ with corners (0, 0), (ε, 0) and (ε, −δε) (Fig. 2). Clearly, γ contains a smooth curve that joins (0, 0) to (ε, −δε) and that has tangent vectors in the cone $C := \{(t, s) \in \mathbb{R}^2 : t > 0,\ s < 0\}$. It will be shown that by the double cone theorem, $g_{A,B}$ vanishes in the whole open rectangle ]0, ε[ × ]−δε, 0[. Since $g_{A,B}$ is continuous, it follows that it even vanishes in the closed rectangle [0, ε] × [−δε, 0]. Since B ∈ A(P) and A ∈ A(O) are arbitrary, Lemma 3.9 implies that $A(O - \delta\varepsilon e_0) \subset A(P)'$. But since by Condition (2), the double cone O − δεe0 cannot be contained in P no matter how small δε is, this is in conflict with Lemma 3.1, so it follows that ξ = 0, which completes the proof.


Fig. 2. Where gA,B vanishes in the proof of Thm. 2.2 (ii) (axes: t, marked at ε, and s, marked at −εδ)

It remains to be shown that the function $g_{A,B}$ fulfills the assumptions of the double cone theorem. To this end, first note that

\[
\begin{aligned}
g_{A,B}(t, s) &= \langle\Omega, B\,U(se_0)\Delta^{it}A\,\Omega\rangle - \langle\Omega, A\,\Delta^{-it}U(-se_0)B\,\Omega\rangle \\
&= \langle\Omega, B\,U(se_0)\Delta^{it}A\,\Omega\rangle - \overline{\langle\Omega, B^*U(se_0)\Delta^{it}A^*\Omega\rangle} \\
&=: g_+(t, s) - g_-(t, s).
\end{aligned}
\]

Using elementary arguments from spectral theory it can be shown that given any ρ > 0, any vector φ in the domain of $\Delta^\rho$ and any ψ ∈ H, the function $\mathbb{R} \ni t \mapsto \langle\psi, \Delta^{it}\varphi\rangle$ has an extension to a function that is continuous on the strip $\{t \in \mathbb{C} : -\rho \le \operatorname{Im} t \le 0\}$ and analytic on the interior of this strip (cf. [40], Lemma 8.1.10 (p. 351)). As $O \subset W_1$, the vectors $A\Omega$ and $A^*\Omega$ are in the domain of $\Delta^{1/2}$, and it follows that for every ψ ∈ H, the functions $\mathbb{R} \ni t \mapsto \langle\psi, \Delta^{it}A\Omega\rangle$ and $\mathbb{R} \ni t \mapsto \overline{\langle\psi, \Delta^{it}A^*\Omega\rangle}$ have extensions that are continuous in the strips $\{t \in \mathbb{C} : -\tfrac12 \le \operatorname{Im} t \le 0\}$ and $\{t \in \mathbb{C} : 0 \le \operatorname{Im} t \le \tfrac12\}$, respectively, and that are analytic in the interior of these strips. On the other hand, it follows from the spectrum condition that for any two vectors φ, ψ ∈ H, the functions $\mathbb{R} \ni s \mapsto \langle\psi, U(se_0)\varphi\rangle$ and $\mathbb{R} \ni s \mapsto \overline{\langle\psi, U(se_0)\varphi\rangle}$ have extensions that are continuous in the (complex) closed upper and lower half plane, respectively, and analytic in the interior of these half planes.

This proves that the function $g_+$ has a continuous extension to the tube $T_+ := \{(t, s) \in \mathbb{C}^2 : -1/2 \le \operatorname{Im} t \le 0,\ \operatorname{Im} s \ge 0\}$ and that at every interior point of this tube, this extension is analytic separately in t and in s. Using Hartogs’ fundamental theorem stating that a function of several complex variables is holomorphic if and only if it is holomorphic separately in each of these variables [33, 51], it follows that $g_+$, as a function in two complex variables, is analytic in the interior of $T_+$. It follows in the same way that $g_-$ has the corresponding properties for the tube $-T_+ =: T_-$. The tubes $T_+$ and $T_-$ contain the smaller tubes $\mathbb{R}^2 - i(C \cap B_{1/2})$ and $\mathbb{R}^2 + i(C \cap B_{1/2})$.


Since $g_+$ and $g_-$ coincide as continuous functions in the closure of γ, they coincide as distributions in the open region γ, and it follows from the edge of the wedge theorem that they are branches of a function g that is analytic in a complex neighbourhood of γ. But since γ contains a smooth curve joining the points (0, 0) and (ε, −δε) with tangent vectors in C, it follows from the double cone theorem that the function g is analytic in the region ((0, 0) + C) ∩ ((ε, −δε) − C) = ]0, ε[ × ]−δε, 0[. This implies that $g_{A,B}$ vanishes in this region, which is all that remained to be shown, so the proof is complete.

Proof of Corollary 2.3. If $J_+$ or $\Delta_+^{it}$ behave the way assumed in (i) or (ii), respectively, the commutation relations recalled in the remark preceding the corollary, together with Lemma 2.1, imply that its geometrical action can differ from the stated symmetry at most by a translation. Since $V_+$ is Lorentz-invariant, $J_+$ and $\Delta_+^{it}$, t ∈ R, commute with all U(g), $g \in L_+^\uparrow$. However, there are no nontrivial translations that commute with all $g \in L_+^\uparrow$.

Proof of Lemma 2.4. It follows from the Tomita–Takesaki Theorem that the modular group under consideration leaves the algebras $A(W_1)$ and $A(W_1)'$ invariant. By wedge duality, it also leaves the algebra $A(W_1') = A(-W_1)$ invariant. Borchers’ commutation relations now imply

\[ \Delta_{W_1}^{i\varepsilon}\, A(a \pm W_1)\, \Delta_{W_1}^{-i\varepsilon} = A(\Lambda_1(-2\pi\varepsilon)a \pm W_1). \]

L(A) + W1 is a union of translates of W1 , so (L(A) + W1 )c , being an intersection of translates of −W 1 , is a translate of −W 1 . It follows that (L(A) + W1 )cc is a translate of W 1 . In particular, (L(A) + W1 )cc = {a + W 1 : a ∈ R1+s , (L(A) + W1 )cc ⊂ a + W1 }. But if a ∈ R1+s is chosen such that (L(A) + W1 )cc ⊂ a + W1 , Lemma 3.1 above and wedge duality imply A ∈ A(a + W1 ) = A(a + W1 ) , so one finds {a + W 1 : a ∈ R1+s , A ∈ A(a + W1 ) } ⊂ (L(A) + W1 )cc , and one concludes

−iε {a + W 1 : a ∈ R1+s , iε W1 AW1 ∈ A(a + W1 ) } iε = {a + W 1 : a ∈ R1+s , A ∈ −iε W1 A(a + W1 ) W1 } = {a + W 1 : a ∈ R1+s , A ∈ A(1 (2π ε)a + W1 ) } = 1 (−2πε) {a + W 1 : a ∈ R1+s , A ∈ A(a + W1 ) }

L(Aε ) ⊂

⊂ 1 (−2πt)(L(A) + W1 )cc . The proof that L(Aε ) ⊂ 1 (−2π t)(L(A) − W1 )cc is completely analogous, so the proof of (i) is complete.


It remains to prove (ii) and (iii). We prove (iii); (ii) can be established along precisely the same line of argument by replacing itW1 by −it W1 and by exchanging, respectively, V+ and −V+ , A and Aε with one another. Due to Borchers’ commutation relations it suffices to consider A ∈ A(W1 ) , which, as in the proof of Theorem 2.2 (ii), will ensure that A ∈ D(1/2 ) in the following argument. Assume that L(A) ⊂ L(Aε ) + V + . Then one finds an a ∈ R1+s such that (1) L(Aε ) ⊂ a + V+ , while (2) L(A) ⊂ a + V+ . This can be seen as follows. The assumption that L(A) ⊂ L(Aε ) + V + and Statement (i) just proved imply that there is a double cone O ⊂ L(A) such that O and L(Aε ) are spacelike separated, so there is a double cone P ⊃ L(Aε ) such that O and P are spacelike separated (cf., e.g., Prop. 3.8 (b) in [47]); choosing a to be the lower tip of P , one arrives at both Conditions (1) and Condition (2). By Condition (1), L(Aε ) is a compact subset of the open set a + V+ , and as L(At ) depends continuously on t by assumption, there exist σ 7 > 0 and δ > 0 such that (1’) L(At ) − σ 7 e0 ⊂ a + V+

for all t ∈ [ε − δ, ε],

and this condition is, of course, equivalent to Condition (1). Since L(At ) depends continuously on t ∈ [0, ε], the set 0≤t≤ε L(At ) is bounded, so one finds a σ 8 ≥ 0 such that (3) L(At ) + σ 8 e0 ⊂ a + V+ for all t ∈ [0, ε], and for the same reason there is a b ∈ R1+s such that (4) L(At ) + 2σ 8 e0 ⊂ b − V+ for all t ∈ [0, ε]. Now define P := (a + V+ ) ∩ (b − V+ ), and for any B ∈ A(P ), consider – as in the proof of Proposition 2.2 – the function gA,B defined by R2 (t, s) → gA,B (t, s) := , [B, U (se0 )At U (−se0 )] . Locality and Conditions (3) and (4) imply that this function vanishes in the rectangle [0, ε] × [σ 8 , 2σ 8 ], and Condition (1’) implies that it also vanishes in the rectangle [ε −δ, ε]×[−σ 7 , σ 8 ]. By the double cone theorem, gA,B vanishes throughout the whole rectangle [0, ε] × [−σ 7 , 2σ 8 ] (Fig. 3). In particular, one obtains gA,B (0, −σ 7 ) = 0 for all B ∈ A(P ), so one can use Lemma 3.9 to conclude that A ∈ A(σ 7 e0 + P ) . By the definition of L(A), one finds L(A) − σ 7 e0 ⊂ P ⊂ a + V + , and as σ 7 > 0, this implies L(A) ⊂ a + V+ , which is in conflict with Condition (2) above and completes the proof. Proof of Theorem 2.5. Fix any ρ > 0, and define the double cones O1 := (ρ(2e1 + e0 ) + V+ ) ∩ (ρ(2e1 + 2e0 ) − V+ ), O2 := (ρ(2e1 − 2e0 ) + V+ ) ∩ (ρ(2e1 + 2e0 ) − V+ ), and O3 := (ρ(2e1 − 3e0 ) + V+ ) ∩ (ρ(2e1 + 3e0 ) − V+ ), (Fig. 4) and choose A ∈ A(O1 ). As L(A) ⊂ O1 , it follows from Lemma 2.4 (i) and (ii)


Fig. 3. Where gA,B vanishes in the proof of Lemma 2.4 (axes: t, marked at ε − δ and ε, and s, marked at −σ 7 , σ 8 and 2σ 8 )

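that

\[ L(A_t) \subset \left(\Lambda_1(-2\pi t)\rho\left(\tfrac32 e_1 + \tfrac32 e_0\right) + \overline{W_1}\right) \cap \left(\Lambda_1(-2\pi t)\rho\left(\tfrac52 e_1 + \tfrac32 e_0\right) - \overline{W_1}\right) \cap \left(\rho(2e_1 + 2e_0) - \overline{V_+}\right) =: R_t, \]

and there is an ε > 0 such that $R_t \subset O_2$ for all t ∈ [0, ε].

Note that by the linearity of the Lorentz boosts, ε does not depend on ρ. One now has $L(A_t) \subset O_2$ for all $A \in A(O_1)$, and with Corollary 5.4 in [39], it follows that

\[ \Delta_{W_1}^{it}\, A(O_1)\, \Delta_{W_1}^{-it} \subset A(O_3) \quad \text{for all } t \in [0, \varepsilon]. \]

Using Borchers’ commutation relations, one finds

\[ \Delta_{W_1}^{it}\, A(a + O_1)\, \Delta_{W_1}^{-it} \subset A(\Lambda_1(-2\pi t)a + O_3) \]

for all $a \in \mathbb{R}^{1+s}$ and all t ∈ [0, ε]. Defining $x := \rho(2e_1 + e_0)$, $P_1 := O_1 - x$, and $P_3 := O_3 - x$, one obtains

\[ \Delta_{W_1}^{it}\, A(a + P_1)\, \Delta_{W_1}^{-it} \subset A(\Lambda_1(-2\pi t)a + (x - \Lambda_1(-2\pi t)x) + P_3). \]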

Fig. 4. The double cones O1 , O2 , and O3 in the proof of Thm. 2.5

Note that the euclidean length of the vector x − 1 (−2π t)x is ≤ 3ρ for all t ∈ [0, ε], as 1 (−2πt)x ∈ Rt ⊂ O2 by the above choice of ε. Now choose any wedge W ∈ W. As W ⊂ W + P1 , it follows from wedge additivity that

\[ A(W) \subset \Big( \bigcup_{a \in W} A(a + P_1) \Big)''. \]

Define, for δ > 0, the wedges W (δ) := Bδ (W ) , where Bδ (W ) denotes the euclidean δ-ball around W , and W (−δ) := ((W )(δ) ) , then it follows from isotony and wedge duality that

A(a

+ P3 )

⊂ A(W (4ρ) ) ,

a∈W

and as the euclidean length of the vector (1 (−2π t)x − x) is ≤ 3ρ, one arrives at

a∈W

A(a + (x

− 1 (−2π t)x) + P3 )

⊂ A(W (7ρ) ) .


For t ∈ [0, ε], one now obtains

itW1 A(1 (2π t)W ) −it W1 ⊂

a∈1 (2πt)W

⊂

itW1 A(a + P1 )−it W1

A(1 (−2π t)a + (x − 1 (−2π t)x) + P3 )

a∈1 (2πt)W

⊂ A(W (7ρ) ) , and as W = (W (−7ρ) )(7ρ) , this can be rewritten it (−7ρ) ) ). −it W1 A(W ) W1 ⊃ A(1 (2π t)W

Using the fact that the transformations 1 (2π t) are linear and, hence, bounded maps in R1+s , which map the euclidean 7ρ-ball onto some bounded set with radius proportional to ρ, and using the facts that this radius continuously depends on t ∈ [0, ε], that the interval [0, ε] is compact, and that ε does not depend on the choice of ρ, one concludes that there is an M > 0 which is independent from ρ and satisfies 1 (2πt)W (−7ρ) ⊃ (1 (2π t)W )(−Mρ)

for all t ∈ [0, ε],

so with the above specifications of ε and M, one obtains it (−Mρ) ) −it W1 A(W ) W1 ⊃ A((1 (2π t)W )

for all wedges W ∈ W and all ρ > 0. For each A ∈ Aloc , one now concludes {W : W ∈ W, itW1 A−it L(At ) = W1 ∈ A(W ) } it = {W : W ∈ W, A ∈ −it W1 A(W ) W1 } ⊂ {W : W ∈ W, A ∈ A((1 (2π t)W )(−Mρ) ) } ρ>0

=

{1 (−2π t)X : X ∈ W, A ∈ A(X (−Mρ) ) }

ρ>0

= 1 (−2πt)

{X : X ∈ W, A ∈ A(X (−Mρ) ) }

ρ>0

= 1 (−2πt)

{X (Mρ) : X ∈ W, A ∈ A(X) }

ρ>0

= 1 (−2π t)L(A). To prove the converse inclusion, one proves L(At ) ⊂ 1 (−2π t) for t ∈ [−ε, 0] by mimicking the above argument: one defines the double cone O1 := ρ(2e1 − 2e0 ) + V+ ) ∩ (ρ(2e1 − e0 ) − V+ ), keeps O2 and O3 as before, defines x := ρ(2e1 − e0 ) and proceeds like above with t ∈ [−ε, 0], using Lemma 2.4 (iii) instead of Part (ii) of the same lemma. Now having proved L(At ) ⊂ 1 (−2π t)L(A) for all t ∈ [−ε, ε] and for all A ∈ Aloc , one concludes L(At ) = 1 (−2π t)L(A) for all t ∈ [−ε, ε] and for all A ∈ Aloc . As this immediately implies the statement for all t ∈ R and all A ∈ Aloc , the proof is complete.


4. Conclusion By the above results, the modular group of a theory that does not exhibit the Unruh effect acts in a completely “non-geometric” fashion, in the sense that it can neither preserve the net structure nor act on the local observables in such a way that localization regions evolve continuously. In particular, it cannot implement any equilibrium dynamics in this case. The above results imply that the only observer who can possibly experience the vacuum in thermodynamical equilibrium is the uniformly accelerated one (whose acceleration may, of course, be zero). Physically, this result reflects the fact that any nonuniformly accelerated observer would feel nonstationary inertial forces destroying any thermodynamical equilibrium, while the constant acceleration felt by a uniformly accelerated observer does not affect thermodynamical equilibrium provided the theory exhibits the Unruh effect. The first results similar to the above ones have been obtained by Araki and by Keyl [4, 35]. These authors avoid the spectrum condition and assume stronger a priori restrictions on the possible geometric behaviour instead. Recently, more results in this spirit have been found by Buchholz et al. and by Trebels [21, 27, 29, 48]. One aim of these approaches is to obtain new insight on quantum fields on curved spacetimes by avoiding the spectrum condition. So far, results have been obtained for de Sitter, Anti-de Sitter, and certain Robertson–Walker spacetimes [21, 22, 24]. For the vacuum states in Minkowski space considered above, the spectrum condition is a reasonable physical assumption. The assumptions made above on the possible geometric behaviour of the modular objects (in particular those made in the first uniqueness theorem) are less restrictive than those made in any of the other approaches, since a small class of regions, namely, the double cones, is assumed to be mapped into an extremely large class of regions, namely, the open sets. 
In this sense the above results are, at present, the most general uniqueness results in Minkowski space that point towards the Unruh effect and modular $P_1CT$-symmetry. Even more than a uniqueness result can be found if conformal symmetry holds in addition to our above Conditions (A) through (C). In this case, the whole representation of the conformal group arises from the modular objects of the theory, and in particular, the Bisognano–Wichmann symmetries can be established [16].

Appendix. A Remark on the Continuity of t → L(At )

In the discussion of the second uniqueness theorem it was assumed that $L(A_t)$ depends continuously on t for t ∈ [0, ε] in the sense that for each sequence $(t_\nu)_{\nu\in\mathbb{N}}$ tending to a $t_\infty \in [0, \varepsilon]$, the localization region $L(A_{t_\infty})$ consists precisely of all accumulation points of sequences $(x_\nu)_{\nu\in\mathbb{N}}$ with $x_\nu \in L(A_{t_\nu})$. In this appendix we show that this notion of convergence, which we refer to as pointwise convergence, is equivalent to the convergence according to a metric first considered by Hausdorff, which one can introduce on the set C of compact convex subsets of $\mathbb{R}^{1+s}$ by defining, for any two such sets K, L ∈ C, $\delta_H(K, L) := \inf\{\delta > 0 : K \subset B_\delta(L) \text{ and } L \subset B_\delta(K)\}$ (cf. Problem 4D (p. 131) in [34]). It is evident that continuity of $[0, \varepsilon] \ni t \mapsto L(A_t)$ with respect to this metric, which we refer to as uniform continuity, implies the pointwise


continuity for this map. Conversely, one can also show that pointwise continuity implies uniform continuity for $t \mapsto L(A_t)$. To prove this indirectly, assume that $t \mapsto L(A_t)$ is pointwise continuous for t ∈ [0, ε] and that this map is not continuous with respect to Hausdorff’s metric. Then there exists a ρ > 0 and a sequence $(t_\nu)_{\nu\in\mathbb{N}}$ of points in [0, ε] which converges to a point $t_\infty \in [0, \varepsilon]$ and has the property that $\delta_H(L(A_{t_\nu}), L(A_{t_\infty})) \ge \rho$. On the other hand, there is a subsequence $(s_\nu)_{\nu\in\mathbb{N}}$ of $(t_\nu)_{\nu\in\mathbb{N}}$ with the property that all $L(A_{s_\nu})$ have nonempty intersection with $B_\rho(L(A_{t_\infty}))$, as otherwise $L(A_{t_\infty})$ would be empty by the assumption of pointwise continuity. As $\delta_H(L(A_{s_\nu}), L(A_{t_\infty})) \ge \rho$, there exists a sequence $(x_\nu)_{\nu\in\mathbb{N}}$ such that the euclidean distance $\delta(x_\nu, L(A_{t_\infty}))$ between $x_\nu$ and $L(A_{t_\infty})$ is ≥ ρ/2 for all ν ∈ N, and as all $L(A_{s_\nu})$ are convex sets with a nonempty intersection with $B_\rho(L(A_{t_\infty}))$, this sequence can be chosen such that it is bounded and, hence, has an accumulation point $\tilde{x}$. As $\delta(x_\nu, L(A_{t_\infty})) \ge \rho/2$ for all ν ∈ N, one finds $\delta(\tilde{x}, L(A_{t_\infty})) \ge \rho/2$, so $\tilde{x} \notin L(A_{t_\infty})$. But this contradicts the assumption that $t \mapsto L(A_t)$ is pointwise continuous and proves that this map is pointwise continuous if and only if it is uniformly continuous, as stated.

It is now easy to see that $\bigcup_{t\in[0,\varepsilon]} L(A_t)$ is bounded, as stated in the text. Namely, the function $[0, \varepsilon] \ni t \mapsto \delta_H(L(A), L(A_t))$ is continuous and, hence, has a maximum ρ > 0 in the compact interval [0, ε]. It follows that $\bigcup_{t\in[0,\varepsilon]} L(A_t) \subset B_\rho(L(A))$, which is a bounded set.

Acknowledgements. It was an important help that D. Arlt and N. P. Landsman read the manuscript carefully. This research was funded by the Deutsche Forschungsgemeinschaft, a Feodor–Lynen grant of the Alexander von Humboldt foundation, and a Hendrik Casimir–Karl Ziegler award of the Nordrhein-Westfälische Akademie der Wissenschaften. The idea to reinitiate the project originated during a stay in 1997 at the Erwin-Schrödinger Institute for Mathematical Physics at Vienna. Helpful discussions there with S. Trebels and D. Guido are gratefully acknowledged.

References

1. Alexandrov, A. D.: On Lorentz transformations. Uspekhi Mat. Nauk. 5 No. 3 (37), 187 (1950)
2. Alexandrov, A. D.: Mappings of Spaces with Families of Cones and Space-Time Transformations. Annali di matematica 103, 229–257 (1975)
3. Alexandrov, A. D., Ovchinnikova, V. V.: Notes on the foundations of relativity theory. Vestnik Leningrad Univ. 14, 95 (1953)
4. Araki, H.: Symmetries in a Theory of Local Observables and the Choice of the Net of Local Algebras. Rev. Math. Phys. Special Issue, 1–14 (1992)
5. Araki, H.: Mathematical Theory of Quantum Fields. Oxford: Oxford University Press, 1999
6. Baumgärtel, H., Wollenberg, M.: Causal Nets of Operator Algebras. Berlin: Akademie-Verlag, 1992
7. Bisognano, J. J., Wichmann, E. H.: On the Duality Condition for a Hermitian Scalar Field. J. Math. Phys. 16, 985–1007 (1975)
8. Bisognano, J. J., Wichmann, E. H.: On the Duality Condition for Quantum Fields. J. Math. Phys. 17, 303 (1976)
9. Borchers, H.-J.: Über die Vollständigkeit lorentzinvarianter Felder in einer zeitartigen Röhre. Nuovo Cimento 19, 787–796 (1961)
10. Borchers, H.-J.: On the Vacuum State in Quantum Field Theory, II. Commun. Math. Phys. 1, 57 (1965)
11. Borchers, H.-J.: The CPT-Theorem in Two-Dimensional Theories of Local Observables. Commun. Math. Phys. 143, 315–332 (1992)
12. Borchers, H.-J.: Translation Group and Particle Representations in Quantum Field Theory. Berlin–Heidelberg: Springer, 1996
13. Borchers, H.-J.: On Poincaré transformations and the modular group of the algebra associated with a wedge. Lett. Math. Phys. 46, 295–301 (1998)


14. Borchers, H.-J.: On the Revolutionization of Quantum Field Theory by Tomita’s Modular Theory. J. Math. Phys. 41, 3604–3673 (2000)
15. Borchers, H.-J., Hegerfeldt, G. C.: The Structure of Space-Time Transformations. Commun. Math. Phys. 28, 259–266 (1972)
16. Brunetti, R., Guido, D., Longo, R.: Modular Structure and Duality in Conformal Quantum Field Theory. Commun. Math. Phys. 156, 201–219 (1993)
17. Buchholz, D.: Collision Theory for Massless Fermions. Commun. Math. Phys. 42, 269–279 (1975)
18. Buchholz, D.: Collision Theory for Waves in Two Dimensions and a Characterization of Models with Trivial S-Matrix. Commun. Math. Phys. 45, 1–8 (1975)
19. Buchholz, D.: Collision Theory of Massless Bosons. Commun. Math. Phys. 52, 147–173 (1977)
20. Buchholz, D.: On the Structure of Local Quantum Fields with Non-Trivial Interaction. In: Proceedings of the International Conference on Operator Algebras, Ideals and Their Applications in Theoretical Physics, Leipzig, 1977. Stuttgart: Teubner, 1978
21. Buchholz, D., Dreyer, O., Florig, M., Summers, S. J.: Geometric Modular Action and spacetime Symmetry Groups. Rev. Math. Phys. 12, 475–560 (2000)
22. Buchholz, D., Florig, M., Summers, S. J.: Hawking–Unruh Temperature and Einstein Causality in Anti-de Sitter Space-Time. Class. Quant. Grav. 17, L31–L37 (2000)
23. Buchholz, D., Fredenhagen, K.: Dilations and interaction. J. Math. Phys. 18, 1107–1111 (1977)
24. Buchholz, D., Mund, J., Summers, S. J.: Transplantation of Local Nets and Geometric Modular Action on Robertson–Walker Space-Times. Preprint, hep-th/0011237
25. Buchholz, D., Summers, S. J.: An Algebraic Characterization of Vacuum States in Minkowski Space. Commun. Math. Phys. 155, 449–458 (1993)
26. Davidson, D. R.: Modular Covariance and the Algebraic PCT/Spin-Statistics Theorem. Preprint, hep-th/9511216
27. Dreyer, O.: Das Prinzip der geometrischen modularen Wirkung im de Sitter-Raum. Diploma thesis, University of Hamburg, 1996
28. Florig, M.: On Borchers’ Theorem. Lett. Math. Phys. 46, 289–293 (1998)
29. Florig, M.: Geometric Modular Action. PhD-thesis, University of Florida, Gainesville, 1999
30. Guido, D., Longo, R.: An Algebraic Spin and Statistics Theorem. Commun. Math. Phys. 172, 517–534 (1995)
31. Guido, D., Longo, R.: The Conformal Spin and Statistics Theorem. Commun. Math. Phys. 181, 11–36 (1996)
32. Haag, R.: Local Quantum Physics. Berlin: Springer, 1992
33. Hartogs, F.: Zur Theorie der Funktionen mehrerer komplexer Veränderlicher, insbesondere über die Darstellung derselben durch Reihen, welche nach Potenzen einer Veränderlichen fortschreiten. Math. Ann. 62, 1–88 (1906)
34. Kelley, J. L.: General Topology. New York: van Nostrand, 1955
35. Keyl, M.: Remarks on the relation between causality and quantum fields. Class. Quantum Grav. 10, 2353–2362 (1993)
36. Kuckert, B.: A New Approach to Spin & Statistics. Lett. Math. Phys. 35, 319–335 (1995)
37. Kuckert, B.: Borchers’ Commutation Relations and Modular Symmetries in Quantum Field Theory. Lett. Math. Phys. 41, 307–320 (1997)
38. Kuckert, B.: Spin & Statistics, Localization Regions, and Modular Symmetries in Quantum Field Theory. PhD-thesis, Hamburg 1998, DESY-thesis 1998-026
39. Kuckert, B.: Localization Regions of Local Observables. Commun. Math. Phys. 215, 197–216 (2000)
40. Li Bing-Ren: Introduction to Operator Algebras. Singapore: World Scientific, 1992
41. Longo, R.: On the spin-statistics relation for topological charges. In: Doplicher, S., Longo, R., Roberts, J. E., Zsido, L. (eds.): Operator Algebras and Quantum Field Theory. Proceedings of the conference at the Accademia Nazionale dei Lincei, Rome 1996. Cambridge, MA: International Press, 1997
42. Mack, G., Salam, A.: Finite-Component Field Representations of the Conformal Group. Ann. Phys. 53, 174–202 (1969)
43. Mund, J.: Quantum Field Theory of Particles with Braid Group Statistics in 2+1 dimensions. PhD-thesis, Freie Universität Berlin, 1998
44. Reeh, H., Schlieder, S.: Bemerkungen zur Unitäräquivalenz von lorentzinvarianten Feldern. Nuovo Cimento 22, 1051 (1961)
45. Streater, R. F., Wightman, A. S.: PCT, Spin & Statistics, and All That. New York: Benjamin, 1964
46. Takesaki, M.: Tomita’s Theory of Modular Hilbert Algebras and Its Applications. Lecture Notes in Mathematics 128, New York: Springer, 1970
47. Thomas, L. J., Wichmann, E. H.: Standard forms of local nets in quantum field theory. J. Math. Phys. 39, 2643–2681 (1998)
48. Trebels, S.: PhD-thesis. Göttingen 1997, cf. also [14]
49. Unruh, W. G.: Notes on black hole evaporation. Phys. Rev. D 14, 870–892 (1976)


50. Vladimirov, V. S.: The construction of envelopes of holomorphy for domains of a special type. (in Russian) Doklady Akad. Nauk SSSR 134, 251–254 (1960)
51. Vladimirov, V. S.: Methods of the Theory of Functions of Many Complex Variables. Cambridge, MA: M. I. T. Press, 1966
52. Wiesbrock, H.-W.: A Comment on a Recent Work of Borchers. Lett. Math. Phys. 25, 157–159 (1992)
53. Yngvason, J.: A Note on Essential Duality. Lett. Math. Phys. 31, 127–141 (1994)
54. Zeeman, E. C.: Causality Implies the Lorentz Group. J. Math. Phys. 5, 490–493 (1964)

Communicated by H. Araki

Commun. Math. Phys. 221, 101 – 140 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Renormalization Group and the Melnikov Problem for PDE’s

Jean Bricmont¹,⋆, Antti Kupiainen²,⋆⋆, Alain Schenkel²

¹ UCL, FYMA, 2 chemin du Cyclotron, 1348 Louvain-la-Neuve, Belgium
² Department of Mathematics, Helsinki University, P.O. Box 4, 00014 Helsinki, Finland

Received: 29 January 2001 / Accepted: 8 March 2001

Abstract: We give a new proof of persistence of quasi-periodic, low dimensional elliptic tori in infinite dimensional systems. The proof is based on a renormalization group iteration that was developed recently in [BGK] to address the standard KAM problem, namely, persistence of invariant tori of maximal dimension in finite dimensional, near integrable systems. Our result covers situations in which the so called normal frequencies are multiple. In particular, it provides a new proof of the existence of small-amplitude, quasi-periodic solutions of nonlinear wave equations with periodic boundary conditions.

1. Introduction

In this paper, we address the persistence problem of quasi-periodic, low dimensional, elliptic tori in infinite dimensional systems. A typical example that we will consider is the nonlinear wave equation (NLW) on a bounded interval,

\[ \partial_t^2 u = \partial_x^2 u - Vu - f(u), \tag{1.1} \]

with Dirichlet or periodic boundary conditions and $f(u) = O(u^3)$. The first results concerning the Melnikov problem (i.e., the persistence of elliptic invariant tori of dimension lower than the number of degrees of freedom, [M, E]) for infinite dimensional Hamiltonian systems were obtained independently by Kuksin, Pöschel and Wayne, [K2, P1, W]. In particular, existence of quasi-periodic solutions of (1.1) was shown in [K1, W]. Based on the Kolmogorov–Arnold–Moser (KAM) approach, these results were restricted to Dirichlet or Neumann boundary conditions and to specific classes of adjustable potentials V, excluding, in particular, arbitrary constant potentials. This latter case was covered in [BK] by using the sine-Gordon PDE as the unperturbed integrable system, and, following a different approach, in [P2]. In [P2], the existence of a Birkhoff normal form for the Hamiltonian of (1.1) is exploited in order to control the torus frequencies via amplitude-frequency modulation, and therefore to dispense with outer parameters provided by an adjustable potential. This approach was applied in [KP] to the persistence of quasi-periodic solutions for the nonlinear Schrödinger equation (NLS) subject to Dirichlet (or Neumann) boundary conditions.

The case of periodic boundary conditions is more delicate due to the fact that the eigenvalues of the Sturm-Liouville operator $L = -d^2/dx^2 + V$ are degenerate. This leads to resonances between pairs of frequencies corresponding to motion in directions normal to the torus (the so called normal frequencies). These additional resonances prevent one from controlling quadratic terms in the Hamiltonian of the system. (This difficulty also appears in finite-dimensional Melnikov situations.) Developing new techniques based on the Lyapunov-Schmidt method, Craig and Wayne proved in [CW] persistence of periodic solutions of the NLW with periodic boundary conditions. Later, their approach was significantly improved by Bourgain in [B1-2] who constructed quasi-periodic solutions of the NLW and NLS with periodic boundary conditions. Most notably, it is shown in [B2] that solutions of this type can be constructed, in particular, for the NLS on two-dimensional domains.

⋆ Partially supported by ESF/PRODYN.
⋆⋆ Partially supported by EC grant FMRX-CT98-0175.

The usual Melnikov nonresonance condition reads, with $\omega \in \mathbb{R}^d$ and $\mu \in \mathbb{R}^n$ denoting the torus and, respectively, the normal frequencies (n is possibly infinite),

\[ \langle k, \omega\rangle + \langle l, \mu\rangle \neq 0, \qquad k \in \mathbb{Z}^d,\ l \in \mathbb{Z}^n \text{ with } |k| + |l| \neq 0,\ |l| \le 2. \tag{1.2} \]

In Bourgain’s approach and at the price of a considerable technical effort, condition (1.2) is reduced to

\[ \langle k, \omega\rangle + \mu_s \neq 0, \qquad k \in \mathbb{Z}^d,\ s = 1, \dots, n, \]

i.e., all nonresonance conditions on pairs of normal frequencies are absent. More recently, Chierchia and You, see [Y, CY], showed that persistence of quasi-periodic solutions of the NLW with periodic boundary conditions is tractable by KAM techniques. Their nonresonance condition,

\[ \langle k, \omega\rangle + \langle l, \mu\rangle \neq 0, \qquad k \in \mathbb{Z}^d \setminus \{0\},\ l \in \mathbb{Z}^n \text{ with } |l| \le 2, \tag{1.3} \]

is stronger than Bourgain’s condition but weaker than (1.2). However, their result does not cover the case of constant potential.

In the present paper, we give a new proof of Bourgain’s result for the NLW with periodic boundary conditions in the case of constant potential. Our proof is based on a renormalization group procedure recently developed in [BGK] for standard KAM problems. The nonresonance condition that we will impose is the same as Chierchia and You’s condition, but our technique could in principle accommodate Bourgain’s condition.

In order to describe our result further, we start by specifying the infinite dimensional Hamiltonians we will consider. For $d_k$, k ≥ 1, a sequence of strictly positive integers uniformly bounded by some $\bar{d} < \infty$, let $\mathbb{R}^\infty$ denote the set of infinite sequences $x = (x_1, x_2, \dots)$ with $x_k \in \mathbb{R}^{d_k}$. For an integer d ≥ 1, let $P = \mathbb{T}^d \times \mathbb{R}^d \times \mathbb{R}^\infty \times \mathbb{R}^\infty$, where $\mathbb{T}^d$ is the torus $\mathbb{R}^d/(2\pi\mathbb{Z}^d)$. Denoting the coordinates in P by (φ, I, x, y) and endowing P with the symplectic structure $d\phi \wedge dI + dx \wedge dy$, we consider perturbations of integrable Hamiltonians of the form

\[ H(\phi, I, x, y) = \omega \cdot I + \tfrac12\, I \cdot gI + \tfrac12 \sum_{k \ge 1} \left( \mu_k^2 |x_k|^2 + |y_k|^2 \right) + \lambda U(\phi, I, x), \tag{1.4} \]


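where $\mu_k \in \mathbb{R}$, k ≥ 1, $\omega \in \mathbb{R}^d$, and g is a real symmetric, invertible d × d matrix. Above, $|v|^2$ for $v \in \mathbb{R}^m$ denotes $\sum_{i=1}^m v_i^2$. The Hamiltonian flow generated by (1.4) is given by the equations of motion

\[ \dot{I} = -\lambda \partial_\phi U, \qquad \dot{\phi} = \omega + gI + \lambda \partial_I U, \tag{1.5} \]

and

\[ \ddot{x}_k = -\mu_k^2 x_k - \lambda \partial_{x_k} U. \tag{1.6} \]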
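To make the equations of motion (1.5) and (1.6) concrete, here is a minimal sketch for a finite toy truncation with d = 1 and a single normal mode, integrated with a classical fourth-order Runge–Kutta step. The perturbation U = x cos φ, the parameter values, and all function names are hypothetical choices made only for illustration; they are not taken from the paper.

```python
import math

def rhs(state, omega, g, mu, lam, grad_U):
    """Right-hand side of (1.5)-(1.6) for d = 1 and one normal mode.
    state = (phi, I, x, y); grad_U returns (dU/dphi, dU/dI, dU/dx)."""
    phi, I, x, y = state
    U_phi, U_I, U_x = grad_U(phi, I, x)
    return (omega + g * I + lam * U_I,    # phi' = omega + g I + lam dU/dI
            -lam * U_phi,                 # I'   = -lam dU/dphi
            y,                            # x'   = y
            -mu ** 2 * x - lam * U_x)     # y'   = -mu^2 x - lam dU/dx, i.e. (1.6)

def rk4_step(state, dt, *params):
    """One classical Runge-Kutta step for the system above."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = rhs(state, *params)
    k2 = rhs(shifted(state, k1, dt / 2), *params)
    k3 = rhs(shifted(state, k2, dt / 2), *params)
    k4 = rhs(shifted(state, k3, dt), *params)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def flow(state, t_end, steps, *params):
    """Integrate from time 0 to t_end in the given number of steps."""
    dt = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, dt, *params)
    return state

def toy_grad_U(phi, I, x):
    """Gradient of the hypothetical perturbation U = x cos(phi)."""
    return (-x * math.sin(phi), 0.0, math.cos(phi))

def energy(state, omega, g, mu, lam):
    """Toy version of the Hamiltonian (1.4) with U = x cos(phi)."""
    phi, I, x, y = state
    return (omega * I + 0.5 * g * I ** 2
            + 0.5 * (mu ** 2 * x ** 2 + y ** 2) + lam * x * math.cos(phi))

# lam = 0: starting from the origin, the flow is phi(t) = omega t, I = x = y = 0.
unperturbed = flow((0.0, 0.0, 0.0, 0.0), 1.0, 1000, 1.0, 1.0, 2.0, 0.0, toy_grad_U)

# lam = 0.1: the toy Hamiltonian should be conserved up to integration error.
start = (0.0, 0.1, 0.2, 0.0)
perturbed = flow(start, 1.0, 1000, 1.0, 1.0, 2.0, 0.1, toy_grad_U)
drift = abs(energy(perturbed, 1.0, 1.0, 2.0, 0.1) - energy(start, 1.0, 1.0, 2.0, 0.1))
```

In this sketch the first-order form x' = y, y' = −µ²x − λ∂xU is used in place of the second-order equation (1.6); the two are equivalent for the Hamiltonian (1.4).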

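For λ = 0 and the initial condition $I^0 = \phi^0 = x^0 = y^0 = 0$, the flow φ(t) = ωt, I(t) = 0, and x(t) = 0, is quasi-periodic and spans a d-dimensional torus in $\mathbb{T}^d \times \mathbb{R}^d \times \mathbb{R}^\infty \times \mathbb{R}^\infty$. In order to study the case for which the perturbation is turned on, we consider a quasi-periodic solution of the form $(\phi(t), I(t), x(t)) = (\omega t + \Theta(\omega t), J(\omega t), Z(\omega t))$. Then, (1.5) and (1.6) require that $T \equiv (\Theta, J, Z) : \mathbb{T}^d \to \mathbb{R}^d \times \mathbb{R}^d \times \mathbb{R}^\infty$ satisfies the equation

\[ \mathcal{D} T(\varphi) = -\lambda\, \partial U(\varphi + \Theta(\varphi), J(\varphi), Z(\varphi)), \tag{1.7} \]

where $\partial = (\partial_\phi, \partial_I, \partial_x)$ and, setting $\mu \equiv \mathrm{diag}(\mu_1 1_{d_1}, \mu_2 1_{d_2}, \dots)$, together with

\[ D \equiv \omega \cdot \partial_\phi, \tag{1.8} \]

\[ \mathcal{D} = \begin{pmatrix} 0 & D & 0 \\ -D & g & 0 \\ 0 & 0 & D^2 + \mu^2 \end{pmatrix}. \tag{1.9} \]

Note that if T is a solution of Eq. (1.7), then so is $T_\beta$ for $\beta \in \mathbb{R}^d$, where

\[ T_\beta(\varphi) = T(\varphi - \beta) - (\beta, 0, 0). \tag{1.10} \]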

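We now state the two hypotheses under which we shall prove existence of a solution $T$ of Eq. (1.7), first introducing the following family of Banach spaces $\mathbb{R}^\infty_s$, $s \in \mathbb{R}$,
$$\mathbb{R}^\infty_s = \Big\{ Z \in \mathbb{R}^\infty \;\Big|\; |Z|_s \equiv \sum_{k \ge 1} k^s |Z_k|_{\mathbb{R}^{d_k}} < \infty \Big\}. \tag{1.11}$$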

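(H1) Asymptotics of eigenvalues. The sequence $\{\mu_k\}_{k \ge 1}$ satisfies $\mu_k > 0$ and $\mu_k \ne \mu_l$ for all $k \ne l \ge 1$, and there exist $\gamma \ge 1$ and $c > 0$ such that
$$\mu_k \ge c k^\gamma \quad \text{for all } k \ge 1. \tag{1.12}$$
Furthermore, if $\gamma > 1$ then
$$\mu_k - \mu_{k'} \ge c\,(k^\gamma - k'^\gamma) \quad \text{for all } k > k' \ge 1. \tag{1.13}$$
If $\gamma = 1$, then there exist constants $\xi > 0$ and $c_l > 0$ such that
$$\mu_k - \mu_{k'} = c_l\,(1 + O(k^{-\xi})) \quad \text{for all } k - k' = l \ge 1. \tag{1.14}$$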


J. Bricmont, A. Kupiainen, A. Schenkel

(H2) Regularity of the perturbation. The map $(\phi, I, x) \mapsto U(\phi, I, x)$ is assumed to be real analytic in $\phi \in \mathbb{T}^d$ and real analytic in $I$ and $x$ in a neighborhood of the origin of $\mathbb{R}^d$ and $\mathbb{R}^\infty_0$. In addition, we assume that there exist an $s > 0$ and a $\xi > 0$ such that for some neighborhoods of the origin $O_I \subset \mathbb{R}^d$ and $O_x \subset \mathbb{R}^\infty_s$, the gradient $\partial_x U$ is bounded as a map from $\mathbb{T}^d \times O_I \times O_x$ to $\mathbb{R}^\infty_{s + \xi - \gamma}$. In the sequel, we will often use the short notation $\bar s \equiv s + \xi - \gamma$.

Theorem 1.1. Let $\{\mu_k\}$ satisfy (H1) and $U$ satisfy (H2). Then, there exists a set $\Omega^* = \Omega^*(U, \mu) \subset \mathbb{R}^d$ such that for $\omega \in \Omega^*$, Eq. (1.7) has a unique solution (up to translations (1.10)) which is real analytic in $\lambda$ and $\phi$ provided that $|\lambda|$ is small enough. Furthermore, for all bounded $\Omega \subset \mathbb{R}^d$ the set $\Omega^*$ of admissible frequencies satisfies $\mathrm{meas}(\Omega \setminus \Omega^*) \to 0$ as $\lambda \to 0$.

The proof of Theorem 1.1 is based on an inductive procedure developed in [BGK] for standard KAM problems. This renormalization group iteration can be viewed as an iterative resummation of the Lindstedt series, as is explained in more detail in [BGK], and was directly inspired by the quantum field theory analogy with KAM problems forcefully emphasized by Gallavotti et al. [G, GGM]. Melnikov type problems require dealing with the additional resonances arising from the normal frequencies $\mu_k$, and the goal of the present paper is to explain how the procedure of [BGK] can be applied in such cases. In contrast to standard KAM problems, the set $\Omega^*$ of admissible frequencies depends for Melnikov type problems on the perturbation $U$. In our approach, this dependence expresses itself by the fact that under iteration, the normal frequencies are renormalized in a $U$-dependent way and that the set $\Omega^*$ is defined according to the renormalized normal frequencies. As usual, the set $\Omega^*$ is constructed in such a way that nonresonance conditions are fulfilled in order for the inductive scheme to converge.
Our scheme is technically simplified if one imposes the nonresonance condition of the form (1.3), i.e., conditions involving pairs of normal frequencies. Hypothesis (H1) ensures that $\Omega^*$ has large measure under these conditions, and hypothesis (H2) ensures that the asymptotic properties of the normal frequencies stated in (H1) are preserved under renormalization. The requirement $\xi > 0$ is needed both in (H1) when $\gamma = 1$, and, for $\gamma > 1$, in (H2) in order to cover the case of degenerate normal frequencies (more precisely the case where $d_k > 1$ for infinitely many $k$). In Sect. 2, we show how Theorem 1.1 provides a proof of the existence of quasi-periodic solutions of the 1D NLW with periodic boundary conditions. In particular, $\gamma = 1$ in (H1) and we will see that (H2) is satisfied with $\xi = 1$. In contrast, one has for the 1D NLS $\gamma = 2$ and $\xi = 0$. Thus, the scheme presented here only applies to NLS with Dirichlet boundary conditions (namely $d_k = 1$ for all $k$) or to the persistence of periodic solutions of NLS (namely $d = 1$). In order to cover the other situations, one must be able to dispense with nonresonance conditions involving certain pairs of normal frequencies.

The remainder of the paper is organized as follows. Section 2 is devoted to the NLW. In Sect. 3 we explain the renormalization group scheme that will be used to prove Theorem 1.1. Section 4 is devoted to the definition of the spaces we will consider. In Sect. 5, we state some crucial inductive bounds, which will be shown to hold in Sect. 6. Section 7 is concerned with the measure estimate of $\Omega^*$, whereas the proof of Theorem 1.1 is carried out in Sect. 8. Finally, we have collected in the appendix some technical and intermediary results.


2. The 1D Wave Equation

In this section, we show how Theorem 1.1 implies the existence of small amplitude quasi-periodic solutions of nonlinear 1D wave equations of the form
$$\partial_t^2 u = \partial_x^2 u - mu - f(u), \qquad t > 0,\ x \in [0, 2\pi], \tag{2.1}$$
with periodic boundary conditions $u(0, t) = u(2\pi, t)$, $\partial_t u(0, t) = \partial_t u(2\pi, t)$. Here, $m > 0$ is a real parameter and $f$ is a real analytic function of the form $f(u) = u^3 + O(u^4)$. For $f \equiv 0$, Eq. (2.1) becomes
$$\partial_t^2 u = \partial_x^2 u - mu \equiv -Lu. \tag{2.2}$$

The operator $L$ with periodic boundary conditions admits a complete orthonormal basis of eigenfunctions $\psi_n \in L^2([0, 2\pi])$, $n \in \mathbb{Z}$, with corresponding eigenvalues
$$\zeta_n = n^2 + m, \tag{2.3}$$
if one sets $\psi_0 = 1/\sqrt{2\pi}$ and for $n \ge 1$,
$$\psi_n(x) = \frac{1}{\sqrt{\pi}} \cos(nx), \qquad \psi_{-n}(x) = \frac{1}{\sqrt{\pi}} \sin(nx). \tag{2.4}$$
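The orthonormality of the $\psi_n$ and the eigenvalues $n^2 + m$ can be checked numerically. The following is a minimal sketch, assuming a hypothetical value $m = 0.7$ and a uniform grid on $[0, 2\pi)$; the helper names `psi`, `inner` and `apply_L` are introduced here for illustration only. The second derivative is computed spectrally with the FFT, and the rectangle rule is exact for the low-degree trigonometric polynomials involved.

```python
import numpy as np

def psi(n, x):
    """Eigenfunction psi_n on [0, 2*pi]: constant, cosine (n > 0) or sine (n < 0)."""
    if n == 0:
        return np.full_like(x, 1.0 / np.sqrt(2.0 * np.pi))
    if n > 0:
        return np.cos(n * x) / np.sqrt(np.pi)
    return np.sin(-n * x) / np.sqrt(np.pi)

def inner(f, g):
    """L^2([0, 2*pi]) inner product via the periodic rectangle rule."""
    return 2.0 * np.pi * np.mean(f * g)

def apply_L(f, m):
    """Apply L = -d^2/dx^2 + m to grid samples of a periodic function."""
    k = np.fft.fftfreq(f.size, d=1.0 / f.size)  # integer wave numbers
    second_derivative = np.fft.ifft(-(k ** 2) * np.fft.fft(f)).real
    return -second_derivative + m * f

m = 0.7  # hypothetical mass parameter, any m > 0 works
x = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
modes = list(range(-4, 5))

# Gram matrix of the modes: should be the identity.
gram = np.array([[inner(psi(a, x), psi(b, x)) for b in modes] for a in modes])
# Rayleigh quotients (psi_n, L psi_n): should equal n^2 + m.
rayleigh = np.array([inner(psi(n, x), apply_L(psi(n, x), m)) for n in modes])
```

Note that $\psi_n$ and $\psi_{-n}$ share the eigenvalue $n^2 + m$, which is the source of the doubly degenerate normal frequencies discussed later in this section.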

Every solution of the linear wave Eq. (2.2) can be written as a superposition of the basic modes $\psi_n$, namely, for $I$ any subset of $\mathbb{Z}$ and $\mu_n \equiv \sqrt{\zeta_n}$,
$$u(x, t) = \sum_{n \in I} a_n \cos(\mu_n t + \theta_n)\, \psi_n(x), \tag{2.5}$$
with amplitudes $a_n > 0$ and initial phases $\theta_n$. Regarding existence of solutions for the nonlinear wave equation (2.1), we will prove

Theorem 2.1. Let $1 \le d < \infty$ and $I = \{n_1, \dots, n_d\} \subset \mathbb{Z}$ satisfying $|n_i| \ne |n_j|$ for $i \ne j$. Then, for $\lambda > 0$ small enough there is a set $A \subset \{a = (a_1, \dots, a_d) \mid 0 < a_i < \lambda\}$ of positive measure such that for $a \in A$ Eq. (2.1) has a solution
$$u(x, t) = \sum_{i=1}^d a_i \cos(\mu'_{n_i} t + \theta_i)\, \psi_{n_i}(x) + O(|a|^3), \tag{2.6}$$

with frequencies $\mu'_{n_i} = \mu_{n_i} + O(|a|^2)$. Furthermore, the set $A$ is of asymptotically full measure as $|a| \to 0$.

As is well known, the nonlinear wave Eq. (2.1) can be studied as an infinite dimensional Hamiltonian system by taking the phase space to be the product of the Sobolev spaces $H^1_0([0, 2\pi]) \times L^2([0, 2\pi])$ with coordinates $u$ and $v = \partial_t u$. The Hamiltonian for (2.1) is then
$$H = \tfrac12 (v, v) + \tfrac12 (Lu, u) + \int_0^{2\pi} g(u)\, dx, \tag{2.7}$$
where $L = -d^2/dx^2 + m$, $g = \int f\, ds$, and $(\cdot, \cdot)$ denotes the usual scalar product in $L^2([0, 2\pi])$. In order to prove existence of solutions of type (2.6) by means of Theorem 1.1, we would like to write (2.7) in the form (1.4). This turns out to be possible,


through amplitude-frequency modulation, due to the availability of a (partial) normal form theory for (2.7). As we shall see, the requirement for the parameter $m$ to be nonzero is crucial for this part of the argument. In the sequel, we will closely follow the exposition of Pöschel in [P2]. Introducing the coordinates $q = (q_0, q_1, q_{-1}, \dots)$ and $p = (p_0, p_1, p_{-1}, \dots)$ by setting
$$u(x) = \sum_{n \in \mathbb{Z}} q_n \psi_n(x), \qquad v(x) = \sum_{n \in \mathbb{Z}} p_n \psi_n(x), \tag{2.8}$$
one rewrites the Hamiltonian (2.7) in the coordinates $(q, p)$,
$$H = \frac12 \sum_{n \in \mathbb{Z}} \big(\mu_n^2 q_n^2 + p_n^2\big) + G(q), \tag{2.9}$$
where
$$G(q) = \int_0^{2\pi} g\Big(\sum_{n \in \mathbb{Z}} q_n \psi_n(x)\Big)\, dx. \tag{2.10}$$

The Hamiltonian flow generated by (2.9) is given by the equations of motion
$$\ddot q_n = -\mu_n^2 q_n - \partial_{q_n} G(q), \tag{2.11}$$
and one can show that a solution $q$ of (2.11) yields a solution of the nonlinear wave Eq. (2.1) if $q$ has suitable decay properties. More precisely, defining $\ell_b^s$ to be the Banach space of all real valued bi-infinite sequences $w = (w_0, w_1, w_{-1}, \dots)$ with norm
$$\|w\|_s = \sum_{n \in \mathbb{Z}} [n]^s |w_n|,$$
where $[n] = \max(1, |n|)$, one has the

Lemma 2.2. Let $s \ge 2$. If a curve $I \to \ell_b^s$, $t \mapsto q(t)$, is a solution of (2.11), then
$$u(x, t) = \sum_{n \in \mathbb{Z}} q_n(t)\, \psi_n(x)$$
is a classical solution of (2.1).

For the proof of Lemma 2.2, see [CY]. Before turning to the normal form analysis of the Hamiltonian (2.9), we state a result concerning the regularity of the gradient $\partial_q G$.

Lemma 2.3. For all $s > 0$, the gradient $\partial_q G$ is real analytic as a map from some neighborhood of the origin in $\ell_b^s$ into $\ell_b^s$, with
$$\|\partial_q G(q)\|_s = O(\|q\|_s^3). \tag{2.12}$$


Proof. We first note that $\ell_b^s$ is a Banach algebra with respect to convolution of sequences, with
$$\|q * p\|_s \le \sum_{i,j \in \mathbb{Z}} [i]^s |q_{j-i}||p_j| \le \sup_{i,j \in \mathbb{Z}} \Big(\frac{[i]}{[j-i][j]}\Big)^s \|q\|_s \|p\|_s \le 2^s \|q\|_s \|p\|_s. \tag{2.13}$$
Therefore, using the analyticity of $f(u) = u^3 + O(u^4)$, one computes that in a sufficiently small neighborhood of the origin,
$$\|f(u)\|_s \le C \|q\|_s^3. \tag{2.14}$$
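The Banach algebra bound can be tested on finitely supported sequences. The sketch below is an illustration under stated assumptions: sequences are stored as arrays indexed by $n = -N, \dots, N$, the standard convolution $(q * p)_i = \sum_j q_{i-j} p_j$ is used (the weight $[i - j] = [j - i]$ makes the estimate insensitive to this index convention), and the values of `N`, `s` and the random seed are arbitrary choices. The constant $2^s$ comes from $[i] \le [j - i] + [j] \le 2\,[j - i][j]$.

```python
import numpy as np

def weighted_norm(w, s):
    """||w||_s = sum_n [n]^s |w_n| for w indexed by n = -N..N, with [n] = max(1, |n|)."""
    half = (len(w) - 1) // 2
    n = np.arange(-half, half + 1)
    return float(np.sum(np.maximum(1, np.abs(n)) ** s * np.abs(w)))

rng = np.random.default_rng(0)   # arbitrary seed
N, s = 20, 2.5                   # hypothetical truncation and weight exponent
q = rng.normal(size=2 * N + 1)
p = rng.normal(size=2 * N + 1)

conv = np.convolve(q, p)         # entries indexed by -2N..2N
lhs = weighted_norm(conv, s)
rhs = 2.0 ** s * weighted_norm(q, s) * weighted_norm(p, s)
```

Applying the same bound twice to $u^3$ gives a cubic estimate of the form $\|q * q * q\|_s \le 4^s \|q\|_s^3$, which is the mechanism behind the cubic bound on $f(u)$.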

On the other hand, since
$$\partial_{q_n} G(q) = \int_0^{2\pi} f(u)\, \psi_n(x)\, dx,$$
the components of $\partial_q G(q)$ are the Fourier components of $f(u)$ and (2.12) follows from the estimate (2.14). The regularity of $\partial_q G$ follows from the regularity of its components and its local boundedness, cf. [PT], p. 138.

We now turn to the normal form analysis of (2.9). First, since $g(u) = \tfrac14 u^4 + O(u^5)$, we find that
$$G(q) = \frac14 \sum_{i,j,k,l} g_{ijkl}\, q_i q_j q_k q_l + O(|q|^5),$$
where
$$g_{ijkl} = \int_0^{2\pi} \psi_i \psi_j \psi_k \psi_l\, dx. \tag{2.15}$$
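The coefficients $g_{ijkl}$ are easy to evaluate numerically, which gives a quick check of the selection rule $i \pm j \pm k \pm l = 0$ and of the diagonal values used in the normal form computation. The sketch below assumes nonzero cosine modes and a uniform grid; the function names `psi` and `g` are illustrative. For cosine modes with $|i| \ne |j|$ one finds $g_{iijj} = 2/(4\pi)$, and $g_{iiii} = 3/(4\pi)$.

```python
import numpy as np

def psi(n, x):
    """Eigenfunction psi_n on [0, 2*pi]: constant, cosine (n > 0) or sine (n < 0)."""
    if n == 0:
        return np.full_like(x, 1.0 / np.sqrt(2.0 * np.pi))
    if n > 0:
        return np.cos(n * x) / np.sqrt(np.pi)
    return np.sin(-n * x) / np.sqrt(np.pi)

def g(i, j, k, l, num_points=512):
    """Integral of psi_i psi_j psi_k psi_l over [0, 2*pi] by the periodic rectangle rule."""
    x = np.linspace(0.0, 2.0 * np.pi, num_points, endpoint=False)
    return 2.0 * np.pi * np.mean(psi(i, x) * psi(j, x) * psi(k, x) * psi(l, x))

no_resonance = g(1, 2, 4, 8)    # no sign choice gives 1 +- 2 +- 4 +- 8 = 0
one_resonance = g(1, 2, 3, 6)   # 1 + 2 + 3 - 6 = 0
mixed_diagonal = g(3, 3, 5, 5)
pure_diagonal = g(3, 3, 3, 3)
```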

An easy computation shows that $g_{ijkl} = 0$ unless $i \pm j \pm k \pm l = 0$ for at least one combination of plus and minus signs. This will play an important role later on. Next, given a finite subset of indices $I_d = \{n_1, \dots, n_d\} \subset \mathbb{Z}$ with $|n_i| \ne |n_j|$ if $i \ne j$, we decompose the Hamiltonian (2.9) as $H = H_d + H_\infty$, where
$$H_d(q, p) = \frac12 \sum_{n \in I_d} \big(\mu_n^2 q_n^2 + p_n^2\big) + \frac14 \sum_{i,j,k,l \in I_d} g_{ijkl}\, q_i q_j q_k q_l \equiv \Lambda_d(q, p) + G_d(q), \tag{2.16}$$
$$H_\infty(q, p) = \frac12 \sum_{n \notin I_d} \big(\mu_n^2 q_n^2 + p_n^2\big) + G(q) - G_d(q) \equiv \Lambda_\infty(q, p) + G_\infty(q). \tag{2.17}$$


Introducing the complex coordinates $z_j$, $j = 1, \dots, d$, by
$$z_j = \frac{1}{\sqrt{2\mu_{n_j}}} \big(\mu_{n_j} q_{n_j} + i\, p_{n_j}\big),$$
one obtains the Hamiltonian $H_d(z, \bar z) = \sum_j \mu_{n_j} |z_j|^2 + G_d(z, \bar z)$ on $\mathbb{C}^d$ with symplectic structure $i \sum_j dz_j \wedge d\bar z_j$. For the remaining coordinates, one introduces the notation, for $k \ge 1$,
$$x_k = \begin{cases} (q_k, q_{-k}) \in \mathbb{R}^2 & \text{if } k, -k \notin I_d, \\ q_{-\tilde k} \in \mathbb{R} & \text{if } k = |\tilde k| \text{ for some } \tilde k \in I_d, \end{cases}$$
and similarly for $p_n$, $n \notin I_d$, denoted in terms of $y_k \in \mathbb{R}^{d_k}$, $k \ge 1$, with $d_k$ as above, namely, $d_k = 2$ if both $k, -k \notin I_d$ and $d_k = 1$ otherwise. Clearly, for $q, p \in \ell_b^s$ one has $x, y \in \mathbb{R}^\infty_s$, where $\mathbb{R}^\infty_s$ is defined in (1.11), and $H_\infty$ reads in these notations
$$H_\infty(z, \bar z, x, y) = \frac12 \sum_{k \ge 1} \big(\mu_k^2 |x_k|^2 + |y_k|^2\big) + G_\infty(z, \bar z, x),$$
with $|G_\infty| = O\big(\sum_{l=0}^{3} |z|^l \|x\|_s^{4-l} + |z|^5 + \|x\|_s^5\big)$. The next proposition establishes the existence of a symplectic change of coordinates that transforms the Hamiltonian $H_d$ into a Birkhoff normal form. As will be clear from the proof, this normal form is not available for $H = H_d + H_\infty$, since most frequencies in $H_\infty$ are degenerate. This is the main difference with [P2] in the present discussion.

Proposition 2.4. For each $m > 0$ and each subset $I_d$, $d < \infty$, satisfying $|n_i| \ne |n_j|$ when $i \ne j$, there exists a near identity, real analytic, symplectic change of coordinates $\Gamma_d$ in some neighborhood of the origin in $\mathbb{C}^d$ that takes the Hamiltonian (2.16) into
$$H_d \circ \Gamma_d = \Lambda_d + \bar G_d + K_d,$$
where $|K_d| = O(|z|^6)$ and
$$\bar G_d(z, \bar z) = \frac12 \sum_{i,j=1}^{d} \bar g_{ij}\, |z_i|^2 |z_j|^2 \quad \text{with} \quad \bar g_{ij} = \frac{3}{8\pi}\, \frac{4 - \delta_{ij}}{\mu_{n_i} \mu_{n_j}}. \tag{2.18}$$
Furthermore, setting $\Gamma_\infty = \Gamma_d \oplus 1_{\mathbb{R}^\infty_s \times \mathbb{R}^\infty_s}$, one has $H_\infty \circ \Gamma_\infty = \Lambda_\infty + K_\infty$ with $|K_\infty| = O\big(\sum_{l=0}^{3} |z|^l \|x\|_s^{4-l} + |z|^5 + \|x\|_s^5\big)$.

Proof. Modulo straightforward modifications, the proof is carried out in [P2] and we restrict ourselves here to a quick overview. Proceeding as in [P2] and using that $|n| \ne |n'|$ for $n \ne n' \in I_d$, one can show that for integers $i, j, k, l \in I_d$ satisfying $i \pm j \pm k \pm l = 0$ and $\{i, j, k, l\} \ne \{n, n, n', n'\}$, one has for all combinations of plus and minus signs,
$$|\mu_i \pm \mu_j \pm \mu_k \pm \mu_l| \ge c\, \frac{m}{(N^2 + m)^{3/2}} > 0, \tag{2.19}$$


with $c$ some absolute constant and $N = \min\{|i|, \dots, |l|\}$. This allows one to eliminate all terms in $G_d(z, \bar z)$ that are not of the form $|z_i|^2 |z_j|^2$. To see this, it is convenient to adopt the notation $z_j = w_j$ and $\bar z_j = w_{-j}$ in which $G_d$ reads
$$G_d = \frac{1}{16} {\sum_{i,j,k,l}}' \tilde g_{ijkl}\, w_i w_j w_k w_l, \qquad \tilde g_{ijkl} = \frac{g_{n_{|i|} \dots n_{|l|}}}{\sqrt{\mu_{n_{|i|}} \cdots \mu_{n_{|l|}}}},$$
where the prime indicates that the sum runs over all indices $i, j, k, l \in \{1, -1, \dots, d, -d\}$ with $n_{|i|} \pm n_{|j|} \pm n_{|k|} \pm n_{|l|} = 0$ for at least one combination of plus and minus signs. Defining the transformation $\Gamma_d$ as the time-1 map of the flow of the vector field $X_F$ given by a Hamiltonian $F(z, \bar z)$ of order four, namely, $\Gamma_d = X_F^t|_{t=1}$ and $F = \sum_{i,j,k,l} F_{ijkl}\, w_i w_j w_k w_l$, one obtains using Taylor's formula
$$H_d \circ \Gamma_d = \Lambda_d + G_d + \{\Lambda_d, F\} + O(|z|^6)$$
with
$$\{\Lambda_d, F\} = -i \sum_{i,j,k,l} (\hat\mu_i + \hat\mu_j + \hat\mu_k + \hat\mu_l)\, F_{ijkl}\, w_i w_j w_k w_l,$$
where $\hat\mu_i \equiv \mathrm{sign}(i)\, \mu_{n_{|i|}}$. With (2.19), one easily checks that if $\{i, j, k, l\} \ne \{a, -a, b, -b\}$ then $|\hat\mu_i + \hat\mu_j + \hat\mu_k + \hat\mu_l| > 0$. Therefore, choosing $F_{ijkl}$ suitably, one finally obtains, using $g_{iijj} = (2 + \delta_{ij})/4\pi$ and counting multiplicities,
$$G_d + \{\Lambda_d, F\} = \frac{3}{16\pi} \sum_{i,j=1}^{d} \frac{4 - \delta_{ij}}{\mu_{n_i} \mu_{n_j}}\, |z_i|^2 |z_j|^2 \equiv \bar G_d.$$
For the rest of the proof, we refer the reader to [P2].
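The matrix $\bar g$ of (2.18) can be built explicitly, which makes its nondegeneracy easy to confirm. The sketch below is an illustration with a hypothetical choice of modes and of $m$; the function name `g_bar` is not from the source. It uses the factorization $\bar g = \tfrac{3}{8\pi}\, M^{-1} (4\,\mathbf{1}\mathbf{1}^T - 1)\, M^{-1}$ with $M = \mathrm{diag}(\mu_{n_i})$. Since $4\,\mathbf{1}\mathbf{1}^T - 1$ has eigenvalues $4d - 1$ (once) and $-1$ ($d - 1$ times), the determinant never vanishes.

```python
import numpy as np

def g_bar(modes, m):
    """Matrix with entries (3 / 8 pi) (4 - delta_ij) / (mu_i mu_j), mu_i = sqrt(n_i^2 + m)."""
    mu = np.sqrt(np.asarray(modes, dtype=float) ** 2 + m)
    d = mu.size
    return 3.0 / (8.0 * np.pi) * (4.0 - np.eye(d)) / np.outer(mu, mu)

modes, m = [1, 2, 5], 0.5   # hypothetical index set I_d and mass parameter
matrix = g_bar(modes, m)
det = np.linalg.det(matrix)

# Closed form: (3 / 8 pi)^d * (4d - 1) * (-1)^(d - 1) / prod_i (n_i^2 + m).
d = len(modes)
closed_form = ((3.0 / (8.0 * np.pi)) ** d * (4 * d - 1) * (-1) ** (d - 1)
               / np.prod([n * n + m for n in modes]))
```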

The Hamiltonian $\Lambda_d + \bar G_d$ is integrable with integrals $|z_i|^2$, $i = 1, \dots, d$. Furthermore, the matrix $\bar g = (\bar g_{ij})_{i,j}$ is nondegenerate, as can be checked from the explicit formula (2.18). Hence, introducing the standard action-angle variables $(I, \phi) \in \mathbb{R}^d \times \mathbb{T}^d$ and linearizing $H$ around a given value for the action, namely, by setting for some $a = (a_1, \dots, a_d) \in \mathbb{R}^d$, $z_i \bar z_i = I_i + a_i^2$, one finally obtains
$$H_a = \omega \cdot I + \tfrac12 I \cdot \bar g I + \sum_{k \ge 1} \big(\mu_k^2 x_k^2 + y_k^2\big) + U_a(I, \phi, x), \tag{2.20}$$
where $U_a$ is just $K_d + K_\infty$ with the variables $z_i, \bar z_i$, $i = 1, \dots, d$, expressed in terms of $I, \phi$, and where $\omega = (\omega_1, \dots, \omega_d)$ is given by
$$\omega_i = \mu_{n_i} + \sum_{j=1}^{d} \bar g_{ij}\, a_j^2,$$
and covers a cone at $(\mu_{n_1}, \dots, \mu_{n_d})$ as $a$ varies in a neighborhood of the origin of $\mathbb{R}^d$. Furthermore, $U_a$ is real analytic in $\phi \in \mathbb{T}^d$ and real analytic in $I$ in a sufficiently small neighborhood $O_I$ of the origin of $\mathbb{R}^d$. As a function of $x$, $U_a$ is real analytic in a neighborhood $O_x \subset \mathbb{R}^\infty_s$ and by Lemma 2.3, its gradient $\partial_x U_a$ is bounded as a map from $\mathbb{T}^d \times O_I \times O_x$ to $\mathbb{R}^\infty_s$. Therefore, since hypothesis (H1) is satisfied with $\gamma = 1$,


$U_a$ satisfies (H2) with $\xi = 1$. Finally, the small parameter $\lambda$ is given in terms of $|a| = \delta$. In Hamilton's equations for $H_a$, rescaling $a$ by $\delta$, $x$ and $y$ by $\delta^2$, and $I$ by $\delta^4$, one obtains a Hamiltonian system given by the rescaled Hamiltonian
$$\tilde H_a(\phi, I, x, y) = \delta^{-4} H_{\delta a}(\phi, \delta^4 I, \delta^2 x, \delta^2 y) = \omega \cdot I + \frac{\delta^4}{2}\, I \cdot \bar g I + \sum_{k \ge 1} \big(\mu_k^2 x_k^2 + y_k^2\big) + \tilde U_a(I, \phi, x),$$
with $\tilde U_a$ analytic in $\delta$ and, as a function of $I$, $\tilde U_a = O(\delta + \delta^3 |I| + \delta^5 |I|^2)$. Hence, Theorem 1.1 implies the existence of quasi-periodic solutions $I$, $x$ and $y$ with frequencies $\omega$, real analytic in $\phi$ and $\lambda$. Tracing the coordinate transformations back to the original variables $q_n(t)$ in the expression (2.8) for $u(x, t)$ completes the proof of Theorem 2.1 with $u(x, t)$ given by (2.6).

3. The Renormalization Group Scheme

Equation (1.7) consists of a system of equations for the variables $(\Theta, J)$ and $Z$ which are coupled through the perturbation $U$ only. Adopting the notation
$$V(\Theta, J, Z)(\varphi) = \lambda \begin{pmatrix} \partial_\phi U \\ \partial_I U \end{pmatrix} (\varphi + \Theta(\varphi), J(\varphi), Z(\varphi)), \tag{3.1}$$
$$W(\Theta, J, Z)(\varphi) = \lambda\, \partial_x U(\varphi + \Theta(\varphi), J(\varphi), Z(\varphi)), \tag{3.2}$$
one rewrites Eq. (1.7) as
$$\begin{pmatrix} 0 & D \\ -D & g \end{pmatrix} \begin{pmatrix} \Theta \\ J \end{pmatrix} = -V(\Theta, J, Z), \tag{3.3}$$
$$(D^2 + \mu^2) Z = -W(\Theta, J, Z). \tag{3.4}$$

Our strategy will be to consider (3.3) and (3.4) separately, treating the functions $Z$ and $(\Theta, J)$, respectively, as parameters. As we will see in Sect. 8, existence of a (unique) solution of the original equation (1.7) can then be proved by using the implicit function theorem. Note that (3.3) involves only the torus frequencies $\omega$ and is equivalent to a standard KAM problem. Existence of a solution for such equations is well known and has been established by various means. One important feature we will use is the regular dependence of the solution $(\Theta, J)$ on the function $Z$. A precise result about the solution of (3.3) will be stated in Sect. 4, Theorem 4.1, once the required Banach spaces of functions have been introduced.

We now focus our attention on Eq. (3.4), and will suppress from the notation the dependence of the vector field $W$ on the parameters $\Theta$ and $J$. Most of our analysis will be conducted in Fourier space, and we will denote by lower case letters the Fourier transforms of functions of $\varphi$, the latter being denoted by capital letters, namely,
$$F(\varphi) = \sum_{q \in \mathbb{Z}^d} e^{-iq \cdot \varphi} f(q), \quad \text{where} \quad f(q) = \int_{\mathbb{T}^d} e^{iq \cdot \varphi} F(\varphi)\, d\varphi,$$


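where $d\varphi$ stands for the normalized Lebesgue measure on $\mathbb{T}^d$. For $Z(\varphi) \in \mathbb{R}^\infty$, note that $z(q) \in \hat{\mathbb{R}}^\infty$ with $z_{k_i}(q) = \overline{z_{k_i}(-q)}$, where $\hat{\mathbb{R}}^\infty$ stands for $\prod_{k \ge 1} \mathbb{C}^{d_k}$ and $k_i$ refers to the $i$th component of $\mathbb{C}^{d_k}$. Similarly, $\hat{\mathbb{R}}^\infty_s$ will denote the complexification of the Banach space $\mathbb{R}^\infty_s$ defined in (1.11). Finally, we will denote the vector space of functions $z(q) \in \hat{\mathbb{R}}^\infty$ by $h$,
$$h = \{ z = (z(q)) \mid z(q) \in \hat{\mathbb{R}}^\infty,\ q \in \mathbb{Z}^d \}.$$
In terms of the Fourier transform of $W$, namely,
$$w_0(z)(q) \equiv \lambda \int_{\mathbb{T}^d} e^{iq \cdot \varphi}\, \partial_x U(\varphi + \Theta(\varphi), J(\varphi), Z(\varphi))\, d\varphi, \tag{3.5}$$
Eq. (3.4) becomes
$$K_0 z = w_0(z), \tag{3.6}$$
where the operator $K_0$ is given by the diagonal kernel
$$K_0(q, q') = \big(|\omega \cdot q|^2 - \mu^2\big)\, \delta_{qq'}. \tag{3.7}$$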
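The small denominator problem can be made concrete by scanning the diagonal entries $|\omega \cdot q|^2 - \mu^2$ of $K_0$ over a box of lattice vectors. The sketch below is only an illustration for a single scalar normal frequency; the function name `smallest_denominator`, the frequency vectors and the box sizes are hypothetical choices. A resonant $\omega$ produces an exact zero, while for a nonresonant $\omega$ the smallest denominator stays positive but can only shrink as the box grows.

```python
import itertools
import math

def smallest_denominator(omega, mu, q_max):
    """Minimum of | |omega . q|^2 - mu^2 | over integer vectors q with entries in [-q_max, q_max]."""
    best = math.inf
    for q in itertools.product(range(-q_max, q_max + 1), repeat=len(omega)):
        omega_dot_q = sum(w * n for w, n in zip(omega, q))
        best = min(best, abs(omega_dot_q ** 2 - mu ** 2))
    return best

# Resonant case: omega . (1, 1) = 3 = mu, so K_0 has a zero diagonal entry.
resonant = smallest_denominator((1.0, 2.0), 3.0, 2)

# Nonresonant case: the minimum is positive but non-increasing in the box size.
generic = [smallest_denominator((1.0, math.sqrt(2.0)), 0.5, n) for n in (2, 8, 32)]
```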

Solving Eq. (3.6) requires inverting the operator $K_0$. Although the inverse of $K_0$ is unbounded for generic frequencies, restricting $\omega$ to a set of admissible frequencies gives sufficient control on the inverse of $K_0$ to prove existence of a solution. As is well known for Melnikov problems, this set depends on the perturbation $U$. In order to prove existence of a solution to Eq. (3.6), we will follow a strategy developed in [BGK] for standard KAM problems, namely, for equations of the type (3.3). This strategy basically consists in inductively reducing (3.6) to a sequence of effective equations involving denominators of decreasing size. One inductive step, say the $n$th step, consists in splitting the effective equation obtained at the previous step into two equations involving only large and, respectively, small denominators, where large and small are defined with respect to a scale of order $\eta^n$ for some fixed $\eta < 1$. This splitting is done in such a way that the nonlinear operator involved in the large denominators equation is a contraction, and this equation can thus be solved by a simple application of the contraction mapping principle. This, in turn, allows one to map the small denominators equation into a new effective equation of type (3.6), with a new right-hand side $w_n$ and (eventually) a new linear operator $K_n$. In [BGK], it was shown that for equations of type (3.3), the above mentioned contraction property follows naturally from symmetries specific to this case. In contrast, Eq. (3.4) involves in addition the normal frequencies $\mu_k$ and does not possess such symmetry. In order to obtain the required contraction, we must make at every inductive step an additional preparation step. As we shall see below, this amounts to renormalizing the linear operator $K_{n-1}$ obtained at the previous step into a new operator $K_n$, which, in effect, corresponds to renormalizing the normal frequencies.
Furthermore, we will see that the renormalized normal frequencies converge to a $U$-dependent set $\{\mu^*_\alpha\}$, $\alpha \ge 1$, as $n \to \infty$. Therefore, since the Diophantine conditions imposed on $\omega$ will eventually be defined relative to this set, one obtains in a constructive way the dependence of the set of admissible frequencies on the perturbation $U$.

We now describe how the renormalization group approach is implemented in practice for Melnikov type problems. First, we proceed with the above mentioned preparation step by decomposing $w_0$ as
$$w_0(z) = \tilde w_0(z) + A_0 z,$$


where the linear operator $A_0$ is the dominant part of $Dw_0(z)$ evaluated at $z = 0$. With $K_1 \equiv K_0 - A_0$, Eq. (3.6) now reads
$$K_1 z = \tilde w_0(z). \tag{3.8}$$

As explained in more detail below, $A_0$ can be chosen in such a way that $K_1$ is of the same form as $K_0$, cf. (3.7), but now given in terms of a new set of frequencies $\tilde\mu_{k_i} \in \mathbb{R}$ which are perturbations of order $\lambda$ of the original normal frequencies $\mu_k$. The notation $\tilde\mu_{k_i}$ reflects the fact that the perturbation $A_0$ may lift some of the degeneracies. Therefore, when inverting $K_1$, denominators smaller than $O(\eta)$ occur for $q$ such that $||\omega \cdot q| - \tilde\mu_{k_i}| \le O(\eta)$ for some $k_i$. Furthermore, these small denominators only occur, for such $q$, in a specific subspace $h^q_{k_i}$ of $\mathbb{C}^{d_k}$ depending on which $\tilde\mu_{k_j}$, if any, has been separated from $\tilde\mu_{k_i}$ by more than $O(\eta)$. Introducing $P_1$ as the projection of $h$ onto $h^q_{k_i}$ for $q$ such that $||\omega \cdot q| - \tilde\mu_{k_i}| \le O(\eta)$ and defining $Q_1 \equiv 1 - P_1$, one thus expects that the restriction of $K_1$ to $Q_1 h$ is invertible with an inverse of order $O(\eta^{-1})$. Multiplying (3.8) by $Q_1$ and $P_1$ leads to the large and small denominators equations for $\tilde z_1 \equiv Q_1 z$ and $z_1 \equiv P_1 z$, respectively,
$$K_1 \tilde z_1 = Q_1 \tilde w_0(\tilde z_1 + z_1), \tag{3.9}$$
$$K_1 z_1 = P_1 \tilde w_0(\tilde z_1 + z_1), \tag{3.10}$$

and by definition of $Q_1$, the first equation can be rewritten as a fixed point equation for the functional $R_1$ defined as $R_1(z_1) \equiv \tilde z_1$, namely,
$$R_1(z) = K_1^{-1} Q_1 \tilde w_0(z + R_1(z)). \tag{3.11}$$
By choice of $A_0$, the nonlinear operator $K_1^{-1} Q_1 \tilde w_0$ is a contraction and one can solve Eq. (3.11) for $R_1$ using the Banach fixed point theorem. (See point (a) of Theorem 5.1 for this part of the inductive step.) Next, with $w_1$ defined as $w_1(z) \equiv \tilde w_0(z + R_1(z))$, Eq. (3.10) reads
$$K_1 z_1 = P_1 w_1(z_1), \tag{3.12}$$

and the solution $z = z_1 + \tilde z_1$ of the original Eq. (3.6) is now given by $z = z_1 + R_1(z_1) \equiv F_1(z_1)$. Hence, the problem of solving (3.6) is reduced to solving the effective Eq. (3.12). To solve this equation one proceeds similarly, starting with our preparation step. After $n$ steps of this inductive process, the solution of (3.6) is given by
$$z = F_{n-1}(z_n + R_n(z_n)) \equiv F_n(z_n), \tag{3.13}$$
where $R_n$ solves the functional equation
$$R_n(z) = \Gamma_n \tilde w_{n-1}(z + R_n(z)), \tag{3.14}$$
with
$$\Gamma_n \equiv K_n^{-1} Q_n P_{n-1}, \tag{3.15}$$


and, for some linear operator $A_{n-1}$,
$$\tilde w_{n-1}(z) \equiv w_{n-1}(z) - A_{n-1} z, \tag{3.16}$$
$$K_n \equiv K_{n-1} - P_{n-1} A_{n-1}, \tag{3.17}$$
whereas $z_n$ solves the effective equation
$$K_n z_n = P_n w_n(z_n), \tag{3.18}$$
with $w_n$ defined as
$$w_n(z) \equiv \tilde w_{n-1}(z + R_n(z)). \tag{3.19}$$
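One step of this splitting can be imitated in a small finite dimensional toy model. The sketch below is not the scheme of the paper: it assumes a diagonal $4 \times 4$ operator with a single small entry, exact projectors in place of the smooth cutoffs, no preparation step (no operator $A$ is subtracted), and a hypothetical nonlinearity `w` of size $O(\lambda)$. It only illustrates the logic: the large denominator equation is solved by contraction for a map `R`, the remaining effective equation lives on the small denominator subspace, and $z = z_1 + R(z_1)$ then solves the full equation.

```python
import numpy as np

lam = 1e-3                                   # small coupling (hypothetical value)
k_diag = np.array([0.02, 1.0, -1.5, 2.0])    # diagonal of a toy operator K
small = np.abs(k_diag) < 0.1                 # modes with small denominators (range of P)
M = np.array([[0.3, -0.2, 0.1, 0.4],
              [0.1, 0.5, -0.3, 0.2],
              [-0.4, 0.2, 0.1, 0.3],
              [0.2, -0.1, 0.6, -0.5]])
c = np.array([1.0, -0.5, 0.25, 2.0])

def w(z):
    """Toy nonlinearity of size O(lam)."""
    return lam * (c + M @ z + z ** 2)

def R(z1, sweeps=40):
    """Solve the large denominator equation R = K^{-1} Q w(z1 + R) by fixed point iteration."""
    r = np.zeros_like(z1)
    for _ in range(sweeps):
        r = np.where(small, 0.0, w(z1 + r) / k_diag)
    return r

def solve_effective(sweeps=100):
    """Solve the effective equation K z1 = P w(z1 + R(z1)) on the small denominator modes."""
    z1 = np.zeros_like(k_diag)
    for _ in range(sweeps):
        z1 = np.where(small, w(z1 + R(z1)) / k_diag, 0.0)
    return z1

z1 = solve_effective()
z = z1 + R(z1)                   # reconstructed solution of K z = w(z)
residual = k_diag * z - w(z)
```

In this toy model the effective equation can itself be solved by iteration because the single small denominator is still much larger than $\lambda$; the inductive scheme is needed precisely when denominators of all sizes occur.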

Remark 3.1. The point of this inductive procedure is that $P_n w_n(z)$ becomes effectively linear in $z$ for large $n$. More precisely, we will show, cf. Theorem 5.1 below, that the rescaled maps $w_n^r$ defined by $w_n^r(z) = \eta^{-n} r^{-n} w_n(r^n z)$ satisfy for $r < \eta$,
$$P_n w_n^r(z) = P_n Dw_n^r(0) z + O(\lambda r^{2n} \eta^{-n}) \quad \text{with} \quad P_n Dw_n^r(0) = O(\lambda),$$
in some appropriate Banach space. Thus, $z_n = 0$ becomes a better and better approximation to the solution of (3.18), and we shall construct the solution $z$ of the original Eq. (3.6) as the limit of the approximate solutions
$$z = \lim_{n \to \infty} F_n(0). \tag{3.20}$$

We now give a precise description of the operators $P_n$. Note that in order to obtain (3.14) and (3.18), we have tacitly assumed that $P_n P_{n-1} = P_n$. The possibility to define $P_n$ satisfying such a property follows from the convergence of the normal frequencies under renormalization. Recall that renormalization occurs because at every inductive step one turns the nonlinear map $w_n$ of the effective functional equation (3.18) into a contraction by subtracting some linear operator $A_n$. Delaying to subsequent sections the discussion of the appropriate choice for the family $A_m$, $m \ge 0$, it suffices to point here to the properties of $A_m$ that will ensure convergence of the renormalized normal frequencies. As will be shown, cf. point (c) of Theorem 5.1 for a precise statement, $A_m$ is a perturbation of order $\lambda \eta^m$ and is given by a constant kernel $A_m(q, q') = a_m \delta_{qq'}$ with $a_m : \hat{\mathbb{R}}^\infty \to \hat{\mathbb{R}}^\infty$ linear and hermitian. As a consequence, the operator $K_n = K_0 - \sum_{m=0}^{n-1} P_m A_m$ has a kernel of the form (3.7) with $\mu^2$ essentially replaced by the positive definite matrix
$$\tilde\mu_n^2 \equiv \mu^2 + \sum_{m=0}^{n-1} a_m, \tag{3.21}$$
with $\tilde\mu_n$ having a discrete spectrum $\sigma(\tilde\mu_n) \subset \mathbb{R}_+$. One easily checks that the singularities of $K_n^{-1}$ are given by the eigenvalues of $\tilde\mu_n$, which therefore correspond to renormalized normal frequencies. Since $a_m$ is of order $\lambda \eta^m$, one expects the eigenvalues of $\tilde\mu_n$ to converge as $n \to \infty$ with $|\nu_{n+1} - \nu_n| \le O(\lambda \eta^n)$ for $\nu_{n+1} \in \sigma(\tilde\mu_{n+1})$ and $\nu_n \in \sigma(\tilde\mu_n)$. This, in turn, allows us to define scales of denominators in a consistent way by carefully keeping track of the separation properties of $\sigma(\tilde\mu_n)$ as $n$ increases. To this end, one groups the normal frequencies into a hierarchy of clusters satisfying gap conditions that are preserved by the renormalization procedure. We first introduce some notation. For


$x \in \mathbb{R}$ and $C$ a finite collection of points in $\mathbb{R}$, let $d(x, C)$ denote the distance between $x$ and the smallest interval containing all points in $C$, and for two finite collections $C_1, C_2 \subset \mathbb{R}$, let
$$d(C_1, C_2) \equiv \inf_{x \in C_1} d(x, C_2).$$
Then, one can uniquely decompose $\sigma(\tilde\mu_n)$ into a maximal number of disjoint clusters $C^n_{k,i}$, $k \ge 1$, $i = 1, \dots, M^n_k$, satisfying $d(\mu_k, C^n_{k,i}) = O(\lambda)$ and the gap condition
$$d(C^n_{k,i}, C^n_{k,j}) > \eta^n \quad \text{if } i \ne j. \tag{3.22}$$
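For the eigenvalues attached to a single $k$, the maximal decomposition can be computed by sorting and cutting wherever two consecutive eigenvalues are separated by more than $\eta^n$. The sketch below illustrates this for one hypothetical group of four eigenvalues near $\mu_k = 1$ with $\eta = 0.1$; the function name `clusters` and the sample values are assumptions made for the example. Within a cluster all consecutive gaps are at most $\eta^n$, so a cluster with $p$ points has width at most $(p - 1)\eta^n$.

```python
def clusters(eigenvalues, gap):
    """Split eigenvalues into maximal groups of consecutive points separated by more than `gap`."""
    values = sorted(eigenvalues)
    groups = [[values[0]]]
    for v in values[1:]:
        if v - groups[-1][-1] > gap:
            groups.append([v])      # gap condition holds: start a new cluster
        else:
            groups[-1].append(v)    # too close: stays in the current cluster
    return groups

eta = 0.1
spectrum = [1.0, 1.0004, 1.003, 1.00305]   # hypothetical renormalized frequencies, d_k = 4

coarse = clusters(spectrum, eta ** 2)   # scale 1e-2: degeneracy not yet resolved
medium = clusters(spectrum, eta ** 3)   # scale 1e-3: two clusters
fine = clusters(spectrum, eta ** 4)     # scale 1e-4: three clusters
```

As $n$ grows the clusters can only split, never merge, which mirrors the way degeneracies are progressively lifted under renormalization.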

Note that $M^n_k \le d_k$, where $d_k$ denotes the multiplicity of the original normal frequency $\mu_k$, and that by requiring $M^n_k$ to be maximal, the decomposition
$$\sigma(\tilde\mu_n) = \bigcup_{k \ge 1} \bigcup_{i=1}^{M^n_k} C^n_{k,i} \tag{3.23}$$
is unique. The above observation about the rate of convergence of $\sigma(\tilde\mu_n)$ as $n \to \infty$ ensures that eigenvalues belonging to different clusters will remain separated. Generically, one expects all degeneracies to be lifted eventually, so that $M^n_k = d_k$ for $n$ sufficiently large and each cluster $C^n_{k,i}$ contains a single eigenvalue. Next, defining $S_n \subset \mathbb{Z}^d$ as
$$S_n = \bigcup_{k \ge 1} \bigcup_{i=1}^{M^n_k} S^n_{k,i}, \tag{3.24}$$
where
$$S^n_{k,i} = \{ q \in \mathbb{Z}^d \mid d(|\omega \cdot q|, C^n_{k,i}) < \tfrac14 \eta^n \}, \tag{3.25}$$

one is ensured that all $q \in \mathbb{Z}^d \setminus S_n$ satisfy $d(|\omega \cdot q|, \sigma(\tilde\mu_{n'})) \ge O(\eta^n)$ for $n' \ge n$. Hence, such $q$ can be safely "integrated out" in the large denominators equation. Remark that due to (3.22), the sets $S^n_{k,i}$ are pairwise disjoint. In order to achieve the construction of $P_n$, one must isolate for every $q \in S_n$ the subspace of $\mathbb{R}^\infty$ in which small denominators will occur. For $q \in S^n_{k,i}$, the latter is given by the eigenspace of $\tilde\mu_n$ associated with the eigenvalues belonging to $C^n_{k,i}$. This eigenspace will be denoted by $J^n_{k,i}$, whereas the projector onto $J^n_{k,i}$ will be denoted by $P^n_{k,i}$. Thus, one defines $P_n$ to be the diagonal operator acting on $h$ given by the kernel
$$P_n(q) = \sum_{k \ge 1} \sum_{i=1}^{M^n_k} \chi^n_{k,i}(\omega \cdot q)\, P^n_{k,i},$$
where $\chi^n_{k,i}$ denotes a function in $C^1(\mathbb{R})$ which satisfies
$$\chi^n_{k,i}(\kappa) = \begin{cases} 1 & \text{if } d(|\kappa|, C^n_{k,i}) \le \tfrac18 \eta^n, \\ 0 & \text{if } d(|\kappa|, C^n_{k,i}) \ge \tfrac14 \eta^n, \end{cases} \tag{3.26}$$


and interpolates monotonically between 0 and 1 otherwise, with
$$\sup_{\kappa \in \mathbb{R}} |\partial_\kappa \chi^n_{k,i}(\kappa)| \le C \eta^{-n}, \tag{3.27}$$
whereas $Q_n$ is defined as
$$Q_n = 1 - P_n. \tag{3.28}$$
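One concrete cutoff with these properties is a cubic smoothstep in the distance to the cluster. The sketch below is only one possible choice, not necessarily the one used by the authors; the function names `dist_to_hull` and `chi`, the sample cluster and the scale `eta_n` are assumptions. The profile $t^2(3 - 2t)$ has slope at most $3/2$, and the transition happens over a distance $\eta^n/8$, so the derivative is bounded by $12\,\eta^{-n}$.

```python
import numpy as np

def dist_to_hull(x, cluster):
    """Distance from x to the smallest interval containing all points of the cluster."""
    lo, hi = min(cluster), max(cluster)
    return max(lo - x, x - hi, 0.0)

def chi(kappa, cluster, eta_n):
    """C^1 cutoff: 1 if the distance is <= eta_n / 8, 0 if it is >= eta_n / 4, smoothstep between."""
    dist = dist_to_hull(abs(kappa), cluster)
    t = min(max((eta_n / 4 - dist) / (eta_n / 8), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)

eta_n = 0.1 ** 3                 # hypothetical scale eta^n with eta = 0.1, n = 3
cluster = [1.0, 1.0004]          # hypothetical cluster of renormalized frequencies
kappas = np.linspace(0.998, 1.003, 20001)
values = np.array([chi(k, cluster, eta_n) for k in kappas])
max_slope = np.max(np.abs(np.diff(values) / np.diff(kappas)))
```

The cutoff depends on $\kappa$ only through $|\kappa|$, so it takes the same values at $\pm\kappa$.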

Note that $P_n$ and $Q_n$ are not projectors. The smooth functions $\chi^n_{k,i}$ have been introduced in order to ensure the continuity of the diagonal kernels $\Gamma_n(q, q)$, cf. the discussion preceding Lemma 5.3 below. However, we will make use later of the projector
$$\hat P_n(q) = \sum_{k \ge 1} \sum_{i=1}^{M^n_k} I_{S^n_{k,i}}(q)\, P^n_{k,i}, \tag{3.29}$$

where $I_O$ denotes the indicator function of a set $O$. Note that $P_n \hat P_n = P_n$, whereas $Q_n \hat P_n \ne 0$.

We conclude this section by a few remarks related to the convergence of the inductive scheme. First, setting $I^n_{k,i} \subset \mathbb{R}$ to be the smallest interval covering $C^n_{k,i}$, one easily checks that $|I^n_{k,i}| \le (d_k - 1)\eta^n$. Hence, since the multiplicities of the normal frequencies $\mu_k$ were assumed to be uniformly bounded in $k$, i.e., $d_k \le \bar d$ for all $k \ge 1$, one obtains for all $n \ge 1$, $k \ge 1$, and $i = 1, \dots, M^n_k$,
$$|I^n_{k,i}| \le \bar d\, \eta^n. \tag{3.30}$$
Next, it follows from the gap condition (3.22) being preserved that for all $m < n$ the eigenvalues in a given cluster $C^n_{k,i}$ are perturbations of all or some eigenvalues belonging to a single cluster $C^m_{k,j}$, denoted by $C^m_{k,j^n_i}$. Furthermore, $C^n_{k,i}$ remains close to $C^m_{k,j^n_i}$. More precisely, we will show that
$$\sup_{x \in I^n_{k,i}}\ \inf_{y \in I^m_{k,j^n_i}} d(x, y) \le \eta^{m+1} \quad \text{for} \quad 1 \le m < n. \tag{3.31}$$
Finally, we consider the properties of the eigenspaces $J^n_{k,i}$. One has by construction $P^n_{k,i} P^n_{l,j} = \delta_{kl} \delta_{ij} P^n_{k,i}$. However, it will be possible to choose $a_m$ in (3.21) in such a way that each $J^m_{k,i}$ is an invariant subspace for $a_m$. Hence, by definition of $\tilde\mu_n$ and $J^n_{k,i}$, every $J^n_{k,i}$ is a subspace of some $J^{n-1}_{k,j}$, and by recursion, of some $J^m_{k,j}$ for all $m < n$. The (unique) eigenspace $J^m_{k,j}$ containing $J^n_{k,i}$ will be denoted by $J^m_{k,j^n_i}$. Therefore, one has for all $1 \le m \le n$, $k \ge 1$, and $i = 1, \dots, M^n_k$,
$$P^n_{k,i} P^m_{l,j} = \delta_{kl}\, \delta_{j j^n_i}\, P^n_{k,i}, \tag{3.32}$$
which, in particular, implies that
$$P_n P_{n-1} = P_{n-1} P_n = P_n. \tag{3.33}$$


Notations. For most of the subsequent analysis, it will not be necessary to distinguish between indices $(k, i)$ and $(l, j)$ with $k = l$ or $k \ne l$. This intervenes only in the description of the asymptotic behavior of the spectrum $\sigma(\tilde\mu_n)$ and the measure estimate of $\Omega^*$. For notational convenience, we thus introduce the index sets
$$I^n = \{ (k, i) \mid k \ge 1,\ i = 1, \dots, M^n_k \}, \qquad n \ge 1, \tag{3.34}$$
and will reserve bold letters for indices in $I^n$. With this convention, $\{ C^n_{\mathbf{k}} \mid \mathbf{k} \in I^n \}$ denotes for instance the collection of all clusters $C^n_{k,i}$, $k \ge 1$, $i = 1, \dots, M^n_k$.

4. Spaces

For the Fourier transform $z$ of the solution $Z$ of our original Eq. (3.4), we consider the Banach space $h_s$, $s \in \mathbb{R}$, defined by
$$h_s = \Big\{ z = (z(q)) \;\Big|\; z(q) \in \hat{\mathbb{R}}^\infty_s,\ \|z\|_s \equiv \sum_{q \in \mathbb{Z}^d} |z(q)|_s < \infty \Big\}. \tag{4.1}$$
For $s \ge t$, one has the natural embedding $h_s \to h_t$ with $\|\cdot\|_t \le \|\cdot\|_s$. We will denote by $h^n_s$ the subspace $\hat P_n h_s$. In particular, one has for $z \in h^n_s$,
$$\|z\|_s = \sum_{\mathbf{k} \in I^n} \sum_{q \in S^n_{\mathbf{k}}} |P^n_{\mathbf{k}} z(q)|_s. \tag{4.2}$$
The operator norm in $L(h^n_s, h^m_t)$ will be denoted by $\|\cdot\|^{(n,m)}_{s,t}$, and by $\|\cdot\|^{(n)}_s$ when $n = m$ and $s = t$.

Let us now turn to the spaces we will consider for the functions $w_n$. Recall that in our analysis of (3.4), the functions $\Theta$ and $J$ only appear as parameters. In the sequel, we consider $\Theta, J : \mathbb{T}^d \to \mathbb{R}^d$ as (fixed) real analytic maps belonging to a small neighborhood of the origin $O_B$ in the Banach space
$$B = \Big\{ (F, G) : \mathbb{T}^d \to \mathbb{R}^d \times \mathbb{R}^d \;\Big|\; \|(F, G)\|_B \equiv \sum_{q \in \mathbb{Z}^d} \big(|f(q)| + |g(q)|\big) < \infty \Big\}. \tag{4.3}$$

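Next, it follows from assumption (H2) that the gradient $\partial_x U$ is real analytic as a map from $\mathbb{T}^d \times O_I \times O_x$ to $\mathbb{R}^\infty_{\bar s}$, cf. [PT] p. 138. (Recall that $O_I \subset \mathbb{R}^d$ and $O_x \subset \mathbb{R}^\infty_s$ are neighborhoods of the origin and that $\bar s \equiv s + \xi - \gamma$.) This implies that for $(\Theta, J) \in O_B$ small enough, one can write the Taylor expansion of $\partial_x U(\varphi + \Theta(\varphi), J(\varphi), Z) = \partial_x U((\varphi + \Theta(\varphi), J(\varphi), 0) + (0, 0, Z))$ as
$$\partial_x U((\varphi + \Theta(\varphi), J(\varphi), 0) + (0, 0, Z)) = \sum_{m=0}^{\infty} \frac{1}{m!}\, U^{\Theta,J}_{m+1}(\varphi)(Z, \dots, Z), \tag{4.4}$$
where the coefficients $U^{\Theta,J}_{m+1}(\varphi)$ belong to the space of $m$-linear maps $L(\mathbb{R}^\infty_s, \dots, \mathbb{R}^\infty_s; \mathbb{R}^\infty_{\bar s})$, are real analytic in $\varphi \in \mathbb{T}^d$ and analytic in $(\Theta, J) \in O_B$. Hence, there exist $\rho > 0$, $\alpha > 0$ and $b < \infty$ such that the Fourier transforms of $U^{\Theta,J}_{m+1}(\varphi)$ satisfy
$$\sum_{q \in \mathbb{Z}^d} e^{\alpha |q|}\, \|u^{\Theta,J}_{m+1}(q)\|_{L(\hat{\mathbb{R}}^\infty_s, \dots, \hat{\mathbb{R}}^\infty_s; \hat{\mathbb{R}}^\infty_{\bar s})} \le b\, m!\, \rho^{-m}. \tag{4.5}$$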


Inserting the Fourier series for $Z$ into (4.4), one obtains the expansion for $w_0$ as defined in (3.5),
$$w_0(z)(q) = \lambda \sum_{m=0}^{\infty} \frac{1}{m!} \sum_{\mathbf{q}} u^{\Theta,J}_{m+1}\Big(q - \sum_{i=1}^{m} q_i\Big)(z(q_1), \dots, z(q_m)) \equiv \sum_{m=0}^{\infty} \sum_{\mathbf{q}} w_0^{(m)}(q; q_1, \dots, q_m)(z(q_1), \dots, z(q_m)), \tag{4.6}$$
where $\mathbf{q} = (q_1, \dots, q_m) \in \mathbb{Z}^{md}$. This formula suggests considering $w_0$ as an analytic function of $z \in h_s$. Let $B(r_0)$ be the open ball of radius $r_0$ in $h_s$ centered at the origin and let $H^\infty(B(r_0), h_s)$ denote the Banach space of analytic functions $w : B(r_0) \to h_s$ equipped with the supremum norm, which we shall denote by $|||w|||$. Then, bound (4.5) implies that $w_0 \in H^\infty(B(r_0), h_s)$ for $r_0$ small enough.

It will be convenient to encode the decay property of the kernels $w_0^{(m)}$ inherited from the estimate (4.5) as a property of the functional $w_0$. Let $\tau_\beta$ denote the translation by $\beta \in \mathbb{R}^d$, i.e., $(\tau_\beta Z)(\varphi) = Z(\varphi - \beta)$. On $h_s$, $\tau_\beta$ is realized by $(\tau_\beta z)(q) = e^{i\beta \cdot q} z(q)$, and it induces a map $w \mapsto w_\beta$ from $H^\infty(B(r_0), h_s)$ to itself if we define
$$w_\beta(z) = \tau_\beta w(\tau_{-\beta} z). \tag{4.7}$$
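The Fourier space form of the translation can be checked directly in one dimension. The sketch below assumes the convention $Z(\varphi) = \sum_q e^{-iq\varphi} z(q)$ for the Fourier series, a finite range of modes, and arbitrary random coefficients; the names `qs`, `z`, `Z` and `beta` are illustrative. Translating by $\beta$ in $\varphi$ multiplies the coefficient $z(q)$ by $e^{i\beta q}$.

```python
import numpy as np

rng = np.random.default_rng(1)   # arbitrary seed
qs = np.arange(-5, 6)            # finite range of Fourier modes (d = 1)
z = rng.normal(size=qs.size) + 1j * rng.normal(size=qs.size)

def Z(phi, coeffs):
    """Evaluate sum_q exp(-i q phi) coeffs(q) at the points phi."""
    return sum(c * np.exp(-1j * q * phi) for q, c in zip(qs, coeffs))

beta = 0.37
phi = np.linspace(0.0, 2.0 * np.pi, 50)
translated_in_phi = Z(phi - beta, z)                        # (tau_beta Z)(phi) = Z(phi - beta)
translated_in_fourier = Z(phi, np.exp(1j * beta * qs) * z)  # multiply z(q) by exp(i beta q)
```

For complex $\beta$ the factor $e^{i\beta q}$ grows like $e^{|\mathrm{Im}\,\beta||q|}$, which is why boundedness of the translated object in a strip of $\beta$ encodes exponential decay in $q$.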

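On the kernels $w_0^{(m)}$, this is given by
$$w_\beta^{(m)}(q; q_1, \dots, q_m) = e^{i\beta \cdot (q - \sum q_i)}\, w^{(m)}(q; q_1, \dots, q_m),$$
and makes sense also for $\beta \in \mathbb{C}^d$. Since
$$|||w_{0\beta}||| \le \sum_{m=0}^{\infty} r_0^m\, \sup_{\mathbf{q}} \sum_{q \in \mathbb{Z}^d} e^{-\mathrm{Im}\,\beta \cdot (q - \sum q_i)}\, \|w_0^{(m)}(q; q_1, \dots, q_m)\|_{L(\hat{\mathbb{R}}^\infty_s, \dots, \hat{\mathbb{R}}^\infty_s; \hat{\mathbb{R}}^\infty_{\bar s})},$$
it thus follows from (4.5) that there exist $r_0 > 0$, $\alpha > 0$, and $D < \infty$, such that $w_{0\beta}$ belongs to $H^\infty(B(r_0), h_s)$ and extends to an analytic function of $\beta$ in the strip $|\mathrm{Im}\,\beta| < \alpha$ with values in $H^\infty(B(r_0), h_s)$ satisfying the bound
$$|||w_{0\beta}||| \le D|\lambda|. \tag{4.8}$$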

Let us now come back to the existence of a solution for Eq. (3.3), namely for the standard KAM problem. One has the classical result (see for instance [BGK]):

Theorem 4.1. Let $U$ satisfy hypothesis (H2) and let $g$ be an invertible matrix. Then, there is a $\lambda_1 > 0$ small enough such that for $|\lambda| < \lambda_1$ and $\omega$ satisfying a Diophantine condition of the form $|\omega \cdot q| > K|q|^{-\nu}$ for $q \in \mathbb{Z}^d$, $q \neq 0$, (3.3) has a solution $(\Theta, J) \in B$ which is real analytic in $\varphi$, analytic in $\lambda$, and vanishes for $\lambda = 0$. Furthermore, this solution is unique up to translations $(\Theta, J)(\varphi) \to (\Theta - \beta, J)(\varphi - \beta)$ and depends analytically on $Z$, for $Z$ in a small ball centered at the origin of the Banach space $h_s$.
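As a rough numerical illustration of the Diophantine condition in Theorem 4.1, the following Python sketch tests $|\omega \cdot q| > K|q|^{-\nu}$ over a finite box of integer vectors. The frequency vector (1, golden mean), the constants K = 0.3 and ν = 1, the use of the max-norm for |q|, and the function name are all illustrative assumptions that do not come from the paper. A finite scan of this kind can only detect violations; it cannot prove the condition for all q.

```python
import itertools
import math


def diophantine_violations(omega, K, nu, qmax):
    """Return all nonzero integer vectors q with max-norm <= qmax
    that violate |omega . q| > K * |q|**(-nu).

    |q| is taken to be the max-norm here (an assumption; the norm
    used in the paper is not restated in this section).
    """
    d = len(omega)
    bad = []
    for q in itertools.product(range(-qmax, qmax + 1), repeat=d):
        norm = max(abs(c) for c in q)
        if norm == 0:
            continue  # the condition is only imposed for q != 0
        dot = abs(sum(w * c for w, c in zip(omega, q)))
        if dot <= K * norm ** (-nu):
            bad.append(q)
    return bad


# Hypothetical example: d = 2, omega = (1, golden mean).
golden = (1 + math.sqrt(5)) / 2
violations = diophantine_violations((1.0, golden), K=0.3, nu=1, qmax=50)
print(len(violations))

# A resonant frequency vector fails immediately, e.g. at q = (1, -1).
resonant = diophantine_violations((1.0, 1.0), K=0.3, nu=1, qmax=3)
print((1, -1) in resonant)
```

The golden mean is a convenient test case because it is badly approximable by rationals, so small divisors decay no faster than the inverse of the denominator.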


To conclude this section, we list some standard properties of bounded analytic functions defined on open balls in Banach spaces. Let $h, h', h''$ be Banach spaces, $B(r) \subset h$, $B(r') \subset h'$, and $w_i \in H^\infty(B(r), h')$, $w \in H^\infty(B(r'), h'')$. First, one has the composition property: If $|||w_i||| < r'$ then $w \circ w_i \in H^\infty(B(r), h'')$ and

$$|||w \circ w_i||| \le |||w|||. \tag{4.9}$$

Next, one deduces from the Cauchy estimate that for $r_1 < r'$,

$$\sup_{\|x\| < r_1} \|Dw(x)\|_{L(h', h'')} \le (r' - r_1)^{-1}\, |||w|||.$$

[...] $K > 0$ and $\{C^n_k\}_{k \in I^n}$ the clusters described in the previous section,

$$\Omega_n(K) = \big\{\omega \in \mathbb{R}^d \;\big|\; d(|\omega \cdot q|, C^n_k),\ d(|\omega \cdot q|, |C^n_k \pm C^n_{k'}|) > K|q|^{-\nu}\ \ \forall\, |q| < K\eta^{-n/\nu},\ q \neq 0,\ \text{and } k, k' \in I^n\big\}, \tag{5.3}$$

where $|C^n_k \pm C^n_{k'}|$ denotes the set $\{|\nu \pm \nu'| \mid \nu \in C^n_k,\ \nu' \in C^n_{k'}\}$. Note that $\Omega_n(K) \subset \Omega_n(K')$ whenever $K > K'$. Furthermore, one introduces for $\omega \in \mathbb{R}^d$ the subsets of $\mathbb{Z}^d$,

$$Q^+_\omega = \{q \in \mathbb{Z}^d \mid \omega \cdot q > 0\}, \qquad Q^-_\omega = \{q \in \mathbb{Z}^d \mid \omega \cdot q < 0\}. \tag{5.4}$$


Proposition 5.1. There exist positive constants $r$ and $\lambda_0$ small enough such that the following is true for $|\lambda| < \lambda_0$, $n \ge 1$, and $|\mathrm{Im}\,\beta| < \alpha_n$, where $\alpha_1 = \alpha$ and, for $n \ge 2$,

$$\alpha_n = (1 - n^{-2})\,\alpha_{n-1}. \tag{5.5}$$

There exists $K_\lambda > 0$ satisfying $K_\lambda \to 0$ as $\lambda \to 0$ such that one has for $\omega \in \Omega_n(K_\lambda)$ arbitrary but fixed,

(a) Equation (5.1) has a solution $R_{n\beta}$ in $H^\infty(B_n, h^{n-1}_s)$ analytic in $|\lambda| < \lambda_0$ and $(\Theta, J) \in O_B$.

(b) Defining $w_{n\beta}$ according to (5.2), one has $w_{n\beta} \in A_n$ and, writing $w_{n\beta}(z) \equiv w_n(z) = w_n(0) + Dw_n(0)z + \delta_2 w_n(z)$,

$$\|\hat P_n w_n(0)\|_s \le \varepsilon r^{2n}, \tag{5.6}$$

$$|||\hat P_n \delta_2 w_n|||_{A_n} \le \varepsilon r^{2n}, \tag{5.7}$$

where $\varepsilon \to 0$ as $\lambda \to 0$.

(c) There exists $A_n \in L(h_s, h_s)$ such that $\tilde w_n \equiv w_n - A_n$ obeys for all $z \in B_n$,

$$\|\hat P_n D\tilde w_n(z)\|^{(n)}_{s,s} \le \varepsilon\eta^n. \tag{5.8}$$

Furthermore, $A_n(q, q') = 0$ if $q \neq q'$ and

$$\|A_n\|_{s,s} \le 3\varepsilon\eta^{n-1}, \tag{5.9}$$

$$A_n(q, q) = a_n\, I_{Q^+_\omega}(q) + a_n^T\, I_{Q^-_\omega}(q), \tag{5.10}$$

where $a_n \in L(\hat{\mathbb{R}}^\infty_s, \hat{\mathbb{R}}^\infty_s)$ is hermitian, i.e., $a_n^* = a_n$, and satisfies for all $k \in I^n$,

$$a_n J^n_k = J^n_k. \tag{5.11}$$

(d) The matrix $\tilde\mu^2_{n+1} \equiv \mu^2 + \sum_{m=0}^{n} a_m$ is positive definite and the spectrum of $\tilde\mu_{n+1}$ can be uniquely decomposed into a maximal family of pairwise disjoint clusters $C^{n+1}_{k,i}$, $k \ge 1$, $i = 1, \dots, M^{n+1}_k$, with $M^{n+1}_k \ge M^n_k$, satisfying for all $k \ge 1$ the gap condition

$$d(C^{n+1}_{k,i}, C^{n+1}_{k,j}) > \eta^{n+1} \quad \text{if } i \neq j, \tag{5.12}$$

and

$$\nu = \mu_k + O(\varepsilon k^{-\xi}) \quad \text{for all } \nu \in C^{n+1}_{k,i},\ i = 1, \dots, M^{n+1}_k. \tag{5.13}$$

Furthermore, the sets $S^{n+1}_{k,i}$ defined according to (3.25) are pairwise disjoint, and (3.31), (3.32) and (3.33) hold with $n$ replaced by $n + 1$.

Let us briefly comment on Proposition 5.1, whose proof will be carried out in Sect. 6. First, we note that point (d) ensures, in particular, that the new set of clusters $C^{n+1}_{k,i}$ enjoys the properties required for proceeding to the next step of the induction, cf. the discussion at the end of Sect. 3. The asymptotic behavior (5.13) concerns the measure estimate of the set $\Omega^*$ of admissible frequencies in Theorem 1.1. Such an asymptotic behavior is required in order to obtain a set of large measure because one imposes Diophantine conditions with respect to differences of the normal frequencies. We will show in Sect. 7 that (5.13) implies the

Proposition 5.2. For $\nu = \nu(d, \xi)$ sufficiently large, the set

$$\Omega^*(K) \equiv \bigcap_{n \ge 1} \Omega_n(K) \tag{5.14}$$

satisfies for all bounded $\Omega \subset \mathbb{R}^d$, $\mathrm{meas}(\Omega \setminus \Omega^*(K)) \to 0$ as $K \to 0$.

Note that every $\omega \in \Omega^*$ satisfies a Diophantine condition with respect to zero. Therefore, one has for such $\omega$, $\mathbb{Z}^d \setminus \{0\} = Q^+_\omega \cup Q^-_\omega$. Next, we turn to bound (5.8), the most delicate estimate to establish. To treat the off-diagonal part $Dw_n(q, q')$, $q \neq q'$, we will rely on the fact that the exponential decay of the kernel $Dw_0(q, q')$ in the size of $|q - q'|$ is preserved due to the introduction of the parameter $\beta$. We note that imposing Diophantine conditions on $\omega$ with respect to the differences $C^n_k \pm C^n_{k'}$ ensures that $|q - q'|$ is of order $O(\eta^{-n/\nu})$ for $q \neq q' \in S_n$. To treat the diagonal part, we will use that $Dw_n(q, q)$ depends on $q$ through $\omega \cdot q$ only, and is, in some sense, continuous in this variable. More precisely, defining $t_p : L(h_s, h_s) \to L(h_s, h_s)$, $p \in \mathbb{Z}^d$, by

$$(t_p L)(q, q') = L(q + p, q' + p), \tag{5.15}$$

and setting

$$T_p \equiv t_p - 1, \tag{5.16}$$

we will show that $T_p Dw_n$ is of order $O(\varepsilon|\omega \cdot p|)$ on the diagonal. Therefore, since $p = q - q'$ satisfies $|\omega \cdot p| \le \eta^n$ for $q, q' \in S^n_k$ such that $\mathrm{sign}(\omega \cdot q) = \mathrm{sign}(\omega \cdot q')$, one has for $q \in S^n_k$, $\hat P_n Dw_n(q, q)\hat P_n = \hat a_k + O(\varepsilon\eta^n)$, where $\hat a_k : J^n_k \to J^n_k$ depends only on the sign of $\omega \cdot q$. The continuity of $Dw_n(q, q)$ ultimately follows from the fact that $\Gamma_n(q)$ is continuous in $\omega \cdot q$, as stated in the following lemma, whose proof can be found in the Appendix.

Lemma 5.3. Let $\sigma \in \mathbb{R}$ and $p \in \mathbb{Z}^d$. Then the operator $\Gamma_n = K_n^{-1} Q_n P_{n-1}$ obeys

$$\|\Gamma_n\|_{\sigma,\sigma+\gamma} \le C\eta^{-n}, \tag{5.17}$$

$$\|T_p \Gamma_n\|_{\sigma,\sigma+\gamma} \le C\eta^{-2n}|\omega \cdot p|. \tag{5.18}$$

Finally, the perturbation $a_n$ being hermitian will essentially follow from the reality of the original Eq. (3.4). More precisely, the derivative $Dw_n$ satisfies

$$Dw^{ij}_n(q, q') = \overline{Dw^{ij}_n(-q, -q')}, \tag{5.19}$$

$$Dw^{ij}_n(q, q') = Dw^{ji}_n(-q', -q). \tag{5.20}$$

Thus, the diagonal element $Dw_n(q, q) : \hat{\mathbb{R}}^\infty \to \hat{\mathbb{R}}^\infty$ is given by an hermitian matrix for all $q$, and $a_n$ hermitian will follow since, as was mentioned above, $a_n$ will be chosen in such a way that its action on each $J^n_k$ is the constant approximation of $Dw_n(q, q)$ for $q \in S^n_k$. Note that due to (5.19), one expects $Dw_n(-q, -q)$ to be approximated by $a_n^T$, which explains the decomposition in formula (5.10). Identities (5.19) and (5.20) are easily checked to hold for $n = 0$. Indeed, the perturbation $U$ in the Hamiltonian (1.4) being real analytic ensures (5.19), whereas (5.20) follows from the fact that $Dw_0$ is the symmetric second derivative of the functional $Z \to \lambda \int U(\varphi + \Theta(\varphi), J(\varphi), Z(\varphi))\, d\varphi$, cf. (3.5). Using the recursive relations (3.19) and (3.16), one obtains (5.19) and (5.20) for $n \ge 1$ by iteration.


Remark 5.4. The choice of constants is as follows. We first fix $\eta$ small enough according, essentially, to the constants entering the asymptotics of the frequencies $\mu_k$ in (H1), cf. Sect. 6.4. Given $\eta$, $\varepsilon$ and $r$ are chosen small enough, and $\lambda_0$ is chosen in turn according to $\varepsilon$. The latter choice plays a role only in ensuring that the inductive hypotheses of Proposition 5.1 are satisfied for $n = 0$, cf. the introduction in Sect. 6. Finally, $K_\lambda$ is chosen large enough in order for the estimate

$$C e^{-C n^{-2} K_\lambda^{1/\nu} \eta^{-n/\nu}} \le r^{2n}, \tag{5.21}$$

to hold for all $n \ge 1$. This will be needed in order to iterate the bound (5.6) in Sect. 6.2. Note that due to the double exponential, the dependence of $K_\lambda$ on $\eta$ and $r$ is given by the behavior at small $n$ of the expressions entering (5.21). That $K_\lambda$ can be taken smaller as $\lambda$ goes to zero will follow from the fact that $r$ and $\varepsilon$, and thus ultimately $\eta$, can be taken smaller. Finally, we denote by $C$ a generic constant, independent of $n$, $r$, and $\varepsilon$, which may vary from place to place.

6. Proof of Proposition 5.1

We proceed by induction and assume that Proposition 5.1 holds up to $n - 1 \ge 1$. Regarding the inductive hypothesis in the case $n = 1$, we simply choose $A_0 \equiv 0$, so that the bounds for $w_0$ in points (b) and (c) of Proposition 5.1 are a simple consequence of (4.8). Furthermore, $\tilde\mu_1 = \mu$ and point (d) follows immediately from (H1). We note that in Sect. 6.1 below, point (a) is established for $n = 1$ by taking $\varepsilon$, namely $\lambda$, small enough. At some point in the induction, however, one is forced to consider nontrivial $A_n$ in order for the inductive bounds to hold uniformly in $n$ for a given $\lambda$. In the sequel, we adopt the convention, for $B$ a ball of radius $r$ centered at the origin, to denote by $\gamma B$ the ball of radius $\gamma r$ centered at the origin.

6.1. Existence of the functional $R_{n\beta}$. With the notations $R = R_{n\beta}$, $\Gamma = \Gamma_n$ and $\tilde w = \tilde w_{(n-1)\beta}$, Eq. (5.1) reads

$$R(z) = \Gamma \tilde w(z + R(z)). \tag{6.1}$$

To prove existence in $H^\infty(B_n, h^{n-1}_s)$ of a solution $R$ to Eq. (6.1), one starts, using the identities $\tilde w(0) = w(0)$ and $\delta_2 \tilde w = \delta_2 w$, by decomposing $\tilde w$ as

$$\tilde w(z) = w(0) + D\tilde w(0)z + \delta_2 w(z), \tag{6.2}$$

to obtain from (6.1),

$$R(z) = \Gamma w(0) + \Gamma D\tilde w(0)(z + R(z)) + \Gamma\delta_2 w(z + R(z)). \tag{6.3}$$

Defining

$$H = \big(1 - \Gamma D\tilde w(0)\big)^{-1}, \tag{6.4}$$

and using the identity $1 + H\Gamma D\tilde w(0) = H$, one rewrites (6.3) as

$$R(z) = H\Gamma w(0) + H\Gamma D\tilde w(0)z + u(z), \tag{6.5}$$

where

$$u(z) = H\Gamma\delta_2 w(\tilde z) \equiv G(u)(z), \tag{6.6}$$

and

$$\tilde z \equiv z + R(z) = H\big(z + \Gamma w(0)\big) + u(z). \tag{6.7}$$

Since $\Gamma = \Gamma\hat P_{n-1} = \hat P_{n-1}\Gamma$, (5.17) (with $\sigma = s + \xi - \gamma$) and the recursive bound (5.8) (with $n$ replaced by $n - 1$) imply

$$\|\Gamma D\tilde w(0)\|^{(n-1)}_s \le \|\Gamma D\tilde w(0)\|^{(n-1)}_{s,s+\xi} \le C\varepsilon\eta^{-1}. \tag{6.8}$$

Hence,

$$\|H\|^{(n-1)}_s \le 2, \tag{6.9}$$

for $\varepsilon = \varepsilon(\eta)$ small enough. Since $B_n \subset B_{n-1}$, $\tilde w(0) = w(0)$, and since bounds (5.6) (with $n$ replaced by $n - 1$), (5.17) and (6.8) hold, the existence of $R$ in $H^\infty(B_n, h^{n-1}_s)$ follows from the existence of $u$ in $H^\infty(B_n, h^{n-1}_s)$. For reasons that will become clear in the next section, we actually show that (6.6) has a solution $u$ in the ball

$$B = \big\{u \in H^\infty(\tfrac18 B_{n-1}, h^{n-1}_s) \;\big|\; |||u||| \le \sqrt{\varepsilon}\,\eta^{-n} r^{2(n-1)}\big\}. \tag{6.10}$$

This result is stronger, since $B_n \subset \tfrac18 B_{n-1}$ for $r$ small enough. Let us first check that $G$ maps $B$ into itself. From (6.9) and the recursive bound (5.6), it follows that for all $z \in \tfrac18 B_{n-1}$ and $u \in B$, $\tilde z \in h^{n-1}_s$ with

$$\|\tilde z\|_s \le 2\big(\tfrac18 r^n + C\varepsilon\eta^{-n} r^{2(n-1)}\big) + \sqrt{\varepsilon}\,\eta^{-n} r^{2(n-1)} \le \tfrac12 r^n,$$

for $\varepsilon = \varepsilon(r, \eta)$ and $r = r(\eta)$ small enough. Hence,

$$\tilde z \in \tfrac12 B_{n-1} \subset B_{n-1} \quad \text{for all } z \in \tfrac18 B_{n-1}, \tag{6.11}$$

and one uses the bound (5.7) to conclude that for all $u \in B$,

$$|||G(u)||| \le 2C\eta^{-n}\varepsilon r^{2(n-1)} \le \sqrt{\varepsilon}\,\eta^{-n} r^{2(n-1)},$$

for $\varepsilon$ small enough. To show that $G$ is a contraction in $B$, we apply the estimate (4.11) to the functions $\tilde z_i$ given by (6.7) in terms of $u_i \in B$, $i = 1, 2$. Noting that $|||\tilde z_i||| \le \tfrac12 r^n$, which follows from (6.11), and using in addition (5.7), one obtains,

$$\begin{aligned}
|||G(u_1) - G(u_2)||| &\le 2C\eta^{-n} \sup_{z \in \frac18 B_{n-1}} \|\hat P_{n-1}\delta_2 w(\tilde z_1) - \hat P_{n-1}\delta_2 w(\tilde z_2)\|_s \\
&\le 4C\eta^{-n} r^{-n}\, |||\hat P_{n-1}\delta_2 w|||_{A_{n-1}} \sup_{z \in \frac18 B_{n-1}} \|\tilde z_1 - \tilde z_2\|_s \\
&\le 4C\varepsilon\eta^{-n} r^{-n} r^{2(n-1)} \sup_{z \in \frac18 B_{n-1}} \|u_1(z) - u_2(z)\|_s \\
&\le \tfrac12 |||u_1 - u_2|||,
\end{aligned}$$

for $r = r(\eta)$ and $\varepsilon = \varepsilon(r, \eta)$ small enough.
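The contraction argument above can be mimicked in a scalar toy model. The Python sketch below iterates the map R ↦ Γ w(z + R) for real numbers. The quadratic function w, the values chosen for Γ, ε and z, and all names are hypothetical stand-ins picked so that the map is a contraction; nothing here reproduces the Banach-space setting of the proof.

```python
def solve_fixed_point(gamma, w, z, tol=1e-14, max_iter=200):
    """Iterate R -> gamma * w(z + R) starting from R = 0.

    Returns the approximate fixed point and the number of iterations
    used. Convergence is only expected when the map is a contraction
    near the starting point.
    """
    R = 0.0
    for k in range(1, max_iter + 1):
        R_new = gamma * w(z + R)
        if abs(R_new - R) < tol:
            return R_new, k
        R = R_new
    raise RuntimeError("no convergence; the map may not be a contraction")


# Toy data (assumptions): w is of size eps with a quadratic remainder.
eps = 0.05


def w(x):
    return eps * (1.0 + x + x * x)


gamma = 2.0  # plays the role of the large factor eta**(-n)
z = 0.1
R, iterations = solve_fixed_point(gamma, w, z)
residual = abs(R - gamma * w(z + R))
print(R, iterations, residual)
```

In this toy case the derivative of the iterated map is about 0.15 near the fixed point, which mirrors the role of the factor 1/2 in the last inequality of the proof: smallness of ε compensates the large factor Γ.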


Before turning to part (b) of Proposition 5.1, we make some remarks that shall be useful later. First note that (6.11) means

$$z + R_n(z) \in \tfrac12 B_{n-1} \quad \text{for all } z \in \tfrac18 B_{n-1}. \tag{6.12}$$

Therefore, with

$$\tilde R_m(z) \equiv z + R_m(z), \tag{6.13}$$

and

$$F^m_n(z) \equiv \tilde R_m \circ \tilde R_{m+1} \circ \cdots \circ \tilde R_n(z), \tag{6.14}$$

it follows recursively that for $m = 1, \dots, n$,

$$F^m_n(z) \in \tfrac12 B_{m-1} \quad \text{for all } z \in B_n. \tag{6.15}$$

Furthermore, since $F^1_n = F_n$, where $F_n$ is defined in (3.13), one has $F_n \in A_n$, together with the uniform bound

$$|||F_n|||_{A_n} \le |||\tilde R_1|||_{A_1} \le \varepsilon. \tag{6.16}$$

6.2. Bounds on the functional $w_n$. According to (5.2), one defines $w_{n\beta}(z) = \tilde w_{(n-1)\beta}(z + R_{n\beta}(z))$. Since $R_{n\beta} \in H^\infty(B_n, h^{n-1}_s)$, it follows from (6.12) and the inductive bounds that for all $\beta$ with $|\mathrm{Im}\,\beta| < \alpha_{n-1}$, $w_{n\beta}$ is well defined as a map from $B_n$ to $h_s$, with $w_{n\beta} \in A_n$. In the sequel, we adopt the simplified notation $R = R_{n\beta}$, $w = w_{(n-1)\beta}$ and $w' = w_{n\beta}$. We proceed with proving (5.6). Using the decomposition (6.2) at $z = 0$, one may write $w'(0) = w(0) + D\tilde w(0)R(0) + \delta_2 w(R(0))$. Since (6.12) implies that $R(0) \in \tfrac12 B_{n-1}$, one obtains using the bounds (5.6), (5.7) and (5.8),

$$\|\hat P_n w'(0)\|_s \le \varepsilon r^{2(n-1)} + \tfrac12\varepsilon\eta^{n-1} r^n + \varepsilon r^{2(n-1)} \le 3\varepsilon. \tag{6.17}$$

This leads to

$$|P^n_k w'(0)(q)|_s \le 3\varepsilon, \tag{6.18}$$

for all $k \in I^n$ and $q \in S^n_k$. The latter is valid for all $\beta$ with $|\mathrm{Im}\,\beta| < \alpha_{n-1}$. Let now $\beta'$ with $|\mathrm{Im}\,\beta'| < \alpha_n$. Then, shifting $\beta'$ to $\beta = \beta' - i(\alpha_{n-1} - \alpha_n)q/|q|$ and using the recursive relation (5.5) for $\alpha_n$, one obtains

$$w'_{\beta'}(0)(q) = e^{i(\beta' - \beta)\cdot q}\, w'_\beta(0)(q) = e^{-n^{-2}\alpha_{n-1}|q|}\, w'_\beta(0)(q). \tag{6.19}$$

Since for such $\beta$ one has $|\mathrm{Im}\,\beta| < \alpha_{n-1}$, it follows from (6.18) and (6.19) that

$$\|\hat P_n w'(0)\|_s \le 3\varepsilon \sum_{k \in I^n} \sum_{q \in S^n_k} e^{-n^{-2}\alpha_{n-1}|q|}. \tag{6.20}$$

From the Diophantine conditions satisfied by $\omega \in \Omega_n(K)$, one infers for $q \in S^n_k$ that $|q| > \min(K\eta^{-n/\nu}, (4K)^{1/\nu}\eta^{-n/\nu})$, cf. (3.25) and (5.3). Therefore, bound (5.6) finally follows by choosing $K$ appropriately, cf. (5.21). We now iterate bound (5.7). Using again the decomposition (6.2), one has $\delta_2 w'(z) = D\tilde w(0)\delta_2 R(z) + \delta_2\delta_2 w(z + R(z))$. The first term on the right-hand side is estimated by using $\delta_2 R(z) = \delta_2 u(z)$ together with (4.12) applied to $u \in B$ with $\gamma = 8r$, since $B_n \subset \tfrac18 B_{n-1}$, to obtain

$$\begin{aligned}
|||\hat P_n D\tilde w(0)\delta_2 R|||_{A_n} &\le \|\hat P_{n-1} D\tilde w(0)\|^{n-1}_{s,s} \sup_{z \in B_n} \|\delta_2 u(z)\|_s \\
&\le \varepsilon\eta^n\, \frac{(8r)^2}{1 - (8r)^2}\, |||u||| \\
&\le \varepsilon r^{2n}\, \frac{\sqrt{\varepsilon}\, 8^2}{1 - (8r)^2} \\
&\le \tfrac12\varepsilon r^{2n},
\end{aligned}$$

for $\varepsilon$ small enough. In a similar way, one estimates, using (6.12), that

$$\sup_{z \in B_n} \|\hat P_n \delta_2\delta_2 w(z + R(z))\|_s \le \tfrac12\varepsilon r^{2n},$$

which finally leads to (5.7).
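The analyticity widths $\alpha_n$ defined by the recursion (5.5) shrink at every step, but they do not shrink to zero. As a quick check, the Python sketch below iterates (5.5) and compares the result with the closed form $\alpha_n = \alpha (n+1)/(2n)$, which follows from the telescoping product of $1 - m^{-2} = (m-1)(m+1)/m^2$. The closed form and the function name are additions made here for illustration and are not statements taken from the paper.

```python
def alpha_sequence(alpha, n_max):
    """Return [alpha_1, ..., alpha_{n_max}] with alpha_1 = alpha and
    alpha_n = (1 - n**-2) * alpha_{n-1} for n >= 2, as in (5.5)."""
    values = [alpha]
    for n in range(2, n_max + 1):
        values.append((1.0 - n ** -2) * values[-1])
    return values


alphas = alpha_sequence(alpha=1.0, n_max=1000)

# Width given up at step n, i.e. the decay rate gained in (6.19):
# alpha_{n-1} - alpha_n = n**-2 * alpha_{n-1}.
n = 10
gain = alphas[n - 2] - alphas[n - 1]
print(alphas[-1], gain)
```

The sequence stays above α/2 for every n, so each step can give up a width of order $n^{-2}$ while a strip of fixed positive width survives the whole induction.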

6.3. Bounds on the derivative. In this section, we prove the estimates stated in part (c) of Proposition 5.1. The main difficulty consists in controlling the diagonal part of the kernel of the derivative $Dw_n$ evaluated at zero, namely $Dw_n(0)(q, q)$, $q \in \mathbb{Z}^d$. To address this problem, as mentioned in the end of Sect. 5, we will use the fact that $Dw_n(0)(q, q)$ depends on $q$ through $\omega \cdot q$ only, and satisfies some continuity property when viewed as a function of $\omega \cdot q$. We start by deriving an a priori bound on the norm of $Dw_n$. From (3.14), one infers that

$$DR_n(z) = H_n(\tilde z)\Gamma_n D\tilde w_{n-1}(\tilde z), \tag{6.21}$$

where

$$H_n(\tilde z) = \big(1 - \Gamma_n D\tilde w_{n-1}(\tilde z)\big)^{-1}, \tag{6.22}$$

$$\tilde z = z + R_n(z). \tag{6.23}$$

Since by definition, cf. (3.19), one has $Dw_n(z) = D\tilde w_{n-1}(\tilde z)\big(1 + DR_n(z)\big)$, (6.21) and the identity $H_n(\tilde z) = 1 + H_n(\tilde z)\Gamma_n D\tilde w_{n-1}(\tilde z)$, imply the recursive relation

$$Dw_n(z) = D\tilde w_{n-1}(\tilde z)H_n(\tilde z). \tag{6.24}$$

As in the previous section, it follows from (5.17), (6.12), and the inductive bounds, that $\|H_n(\tilde z)\|^{(n-1)}_s \le 2$ for all $\tilde z \in B_{n-1}$. Therefore, one obtains for all $z \in \tfrac18 B_{n-1}$, using again the inductive bound (5.8),

$$\|\hat P_n Dw_n(z)\|^{(n)}_{s,s} \le \|\hat P_{n-1} Dw_n(z)\|^{(n-1)}_{s,s} \le 2\varepsilon\eta^{n-1}. \tag{6.25}$$

In order to iterate bounds (5.8), we decompose $Dw_n(z)$ as follows:

$$Dw_n(z) = \sigma_n + \tau_n + \delta_1 Dw_n(z), \tag{6.26}$$

where $\sigma_n + \tau_n = Dw_n(0)$ and $\sigma_n(q, q') = Dw_n(0)(q, q')\delta_{qq'}$. Let us consider first the last two terms on the right-hand side of (6.26). One has

Lemma 6.1. Let $r$ and $\varepsilon$ be the positive constants of Proposition 5.1. Then, one has for all $n \ge 0$ and all $z \in B_n$,

$$\|\hat P_n \delta_1 Dw_n(z)\|^{(n)}_{s,s} \le \tfrac12\varepsilon r^{\frac{n}{2}}, \tag{6.27}$$

$$\|\hat P_n \tau_n\|^{(n)}_{s,s} \le \varepsilon r^n. \tag{6.28}$$

Proof. Proceeding by induction, we suppose that Proposition 5.1 and Lemma 6.1 are true up to some $n - 1$, $n \ge 1$. We start with (6.27) and compute from $\delta_1 Dw_n(z) = Dw_n(z) - Dw_n(0)$ and the recursive relation (6.24) that

$$\delta_1 Dw_n(z) = \tilde H_n(\tilde z_0)\big(D\tilde w_{n-1}(\tilde z) - D\tilde w_{n-1}(\tilde z_0)\big)H_n(\tilde z),$$

where $\tilde z_0 = R_n(0)$ and $\tilde H_n(\tilde z_0) = 1 + D\tilde w_{n-1}(\tilde z_0)H_n(\tilde z_0)\Gamma_n$. As previously, the inductive bound (5.8) implies $\|\hat P_n \tilde H_n(\tilde z_0)\|^{(n-1)}_s \le 2$. Using (6.12) and $\hat P_n\tilde H_n = \hat P_n\tilde H_n\hat P_{n-1}$, one infers from the identity $D\tilde w_{n-1}(\tilde z) - D\tilde w_{n-1}(\tilde z_0) = \delta_1 D\tilde w_{n-1}(\tilde z) - \delta_1 D\tilde w_{n-1}(\tilde z_0)$ that for all $z \in \tfrac18 B_{n-1}$,

$$\|\hat P_n \delta_1 Dw_n(z)\|^{(n-1,n)}_{s,s} \le C \sup_{z' \in \frac12 B_{n-1}} \|\hat P_{n-1}\delta_1 D\tilde w_{n-1}(z')\|^{(n-1)}_{s,s}.$$

Since $\delta_1 D\tilde w_{n-1} = \delta_1 Dw_{n-1}$, the recursive bound (6.27) leads to

$$\|\hat P_n \delta_1 Dw_n(z)\|^{(n-1,n)}_{s,s} \le C\varepsilon r^{\frac{n-1}{2}},$$

for all $z \in \tfrac18 B_{n-1}$. Finally, iterating bound (6.27) is completed by restricting $z$ to $B_n \subset \tfrac18 B_{n-1}$ and using (4.12) with $\gamma = 8r$. Next, we turn to (6.28), the estimate for the off-diagonal part of $Dw_n(0)$. The norm of $\tau_n$ reads

$$\|\hat P_n \tau_n\|^{(n)}_{s,s} = \sup_{k' \in I^n} \sup_{q' \in S^n_{k'}} \sum_{k \in I^n} \sum_{q \in S^n_k} |P^n_k \tau_n(q, q')P^n_{k'}|_{s,s},$$

and one infers from (6.27) and the a priori bound (6.25) that

$$|P^n_k \tau_n(q, q')P^n_{k'}|_{s,s} \le 2\varepsilon\eta^{n-1} + \tfrac12\varepsilon r^{\frac{n}{2}} \le 3\varepsilon. \tag{6.29}$$

The latter is valid for all $\beta$ with $|\mathrm{Im}\,\beta| < \alpha_{n-1}$. Let now $\beta'$ with $|\mathrm{Im}\,\beta'| < \alpha_n$. Then, shifting $\beta'$ to $\beta = \beta' - i(\alpha_{n-1} - \alpha_n)(q - q')/|q - q'|$, one obtains

$$\tau_{n\beta'}(q, q') = e^{i(\beta' - \beta)\cdot(q - q')}\, \tau_{n\beta}(q, q') = e^{-n^{-2}\alpha_{n-1}|q - q'|}\, \tau_{n\beta}(q, q'). \tag{6.30}$$

Hence, since $|\mathrm{Im}\,\beta| < \alpha_{n-1}$ for such $\beta$, (6.29) and (6.30) lead to

$$\|\hat P_n \tau_n\|^{n}_{s,s} \le 3\varepsilon \sup_{k' \in I^n} \sup_{q' \in S^n_{k'}} \sum_{k \in I^n} \sum_{\substack{q \in S^n_k \\ q \neq q'}} e^{-n^{-2}\alpha_{n-1}|q - q'|}. \tag{6.31}$$

We now show that every term in the previous sum yields a super-exponentially small factor. Let $q \in S^n_k$ and $q' \in S^n_{k'}$ for some $k \in I^n$, $k' \in I^n$. Then, one estimates using (3.25) and (3.30) that if $\mathrm{sign}(\omega \cdot q) \neq \mathrm{sign}(\omega \cdot q')$,

$$d\big(|\omega \cdot (q - q')|,\, C^n_k + C^n_{k'}\big) \le \tfrac12\eta^n + |I^n_k| + |I^n_{k'}| \le 3\bar d\eta^n,$$

and that otherwise

$$d\big(|\omega \cdot (q - q')|,\, |C^n_k - C^n_{k'}|\big) \le \tfrac12\eta^n + |I^n_k| + |I^n_{k'}| \le 3\bar d\eta^n.$$

Therefore, since $q \neq q'$, it follows from (5.3) and $\omega \in \Omega_n(K)$ that

$$|q - q'| \ge \min\Big(\Big(\frac{K}{3\bar d}\Big)^{1/\nu},\, K\Big)\eta^{-n/\nu}.$$

Hence, the contribution of each term in (6.31) is super-exponentially small, and (6.28) follows for some $r \ll \eta < 1$.

Finally, we turn to $\sigma_n$, the diagonal part of $Dw_n(0)$ in the decomposition (6.26). We first state a result about the continuity properties of the kernel $\sigma_n(q, q)$, namely that $T_p\sigma_n = t_p\sigma_n - \sigma_n$ is of order $|\omega \cdot p|$. More precisely, one has the

Proposition 6.2. Suppose that Proposition 5.1 is valid up to $n - 1$ for some $n \ge 1$. Then, the diagonal part $\sigma_n(z)$ of $Dw_n(z)$ satisfies for all $z \in B_n$ and all $p$ such that $|\omega \cdot p| < \tfrac{1}{16}\eta^{n-1}$,

$$\|\hat P_n T_p\sigma_n(z)\|^{n}_{s,s} \le \varepsilon^{\frac32}|\omega \cdot p|. \tag{6.32}$$

Delaying the proof of the above proposition to the end of this section, we now construct a diagonal operator $A_n \in L(h_s, h_s)$ such that $\tilde\sigma_n \equiv \sigma_n - A_n$ obeys

$$\|\hat P_n \tilde\sigma_n\|^{n}_{s,s} = \sup_{k \in I^n} \sup_{q \in S^n_k} |P^n_k \tilde\sigma_n(q, q)P^n_k|_{s,s} \le \tfrac12\varepsilon\eta^n. \tag{6.33}$$

The equality above follows from the sets $S^n_k$ being pairwise disjoint. This will conclude the proof of iterating (5.8), since (6.27), (6.28) and (6.33) imply that the derivative of $\tilde w_n \equiv w_n - A_n$ satisfies the required bound for $r = r(\eta)$ small enough. In order to obtain bound (6.33) by using the continuity property (6.32), we would like to construct $A_n$ as an approximation of $\sigma_n(q, q)$ for $\omega \cdot q$ close to the normal frequencies in $C^n_k$, $k \in I^n$. To this end, we set $\bar\mu_k$ to be the center of the interval $I^n_k$ and, using that $\{\omega \cdot q \mid q \in \mathbb{Z}^d\}$ is dense in $\mathbb{R}$, we choose a sequence $\{q_{l,k}\}_{l \ge 1} \subset S^n_k$ such that $\omega \cdot q_{l,k} > 0$ for all $l \ge 1$ and

$$\lim_{l \to \infty} \omega \cdot q_{l,k} = \bar\mu_k.$$


Next, one defines the matrix $\hat a_{n,k} \in L(J^n_k)$ by

$$\hat a_{n,k} \equiv \lim_{l \to \infty} P^n_k \sigma_n(q_{l,k}, q_{l,k})P^n_k. \tag{6.34}$$

Due to (6.32), the limit in (6.34) exists and does not depend on the particular choice of the sequence $\{q_{l,k}\}_{l \ge 1}$. Finally, setting

$$a_n \equiv \bigoplus_{k \in I^n} \hat a_{n,k}, \tag{6.35}$$

we define the operator $A_n : h \to h$ as given by the diagonal kernel

$$A_n(q, q) = a_n\, I_{Q^+_\omega}(q) + a_n^T\, I_{Q^-_\omega}(q) \tag{6.36}$$

for all $q \in \mathbb{Z}^d$. We note that by construction, (5.11) is clearly satisfied. Furthermore, it follows from (5.19) and (5.20) that $a_n$ is indeed hermitian. Let us check that definition (6.36) leads to the required bound (6.33). By construction, one has for all $k \in I^n$,

$$\lim_{l \to \infty} P^n_k \tilde\sigma_n(q_{l,k}, q_{l,k})P^n_k = 0. \tag{6.37}$$

On the other hand, since $T_p A_n = 0$, bound (6.32) is also satisfied by $\tilde\sigma_n$, which by definition of the norm implies that

$$|P^n_k T_p\tilde\sigma_n(q, q)P^n_k|_{s,s} \le \varepsilon^{\frac32}|\omega \cdot p|, \tag{6.38}$$

for all $q \in S^n_k$, $k \in I^n$, and $p \in \mathbb{Z}^d$ with $|\omega \cdot p| < \tfrac{1}{16}\eta^{n-1}$. The definition of $S^n_k$ together with (3.30) implies that $|\omega \cdot (q - q')| \le 2\bar d\eta^n \le \tfrac{1}{16}\eta^{n-1}$ for all $q, q' \in S^n_k$ with $\mathrm{sign}(\omega \cdot q) = \mathrm{sign}(\omega \cdot q')$ and $\eta$ small enough. Therefore, using

$$\tilde\sigma_n(q, q) = \tilde\sigma_n(q', q') + T_{q - q'}\tilde\sigma_n(q', q'),$$

one infers from (6.38) that for all $q_{l,k}$ and $q \in S^n_k$ with $\omega \cdot q > 0$,

$$|P^n_k \tilde\sigma_n(q, q)P^n_k|_{s,s} \le |P^n_k \tilde\sigma_n(q_{l,k}, q_{l,k})P^n_k|_{s,s} + \varepsilon^{\frac32}|\omega \cdot (q - q_{l,k})|, \tag{6.39}$$

which, with (6.37), leads to

$$|P^n_k \tilde\sigma_n(q, q)P^n_k|_{s,s} \le 2\bar d\varepsilon^{\frac32}\eta^n. \tag{6.40}$$

For $q \in S^n_k$ with $\omega \cdot q < 0$, we note that (6.39) is also valid if one replaces $q_{l,k}$ by $-q_{l,k}$, and, due to (5.19), that the same is true of (6.37). Therefore, (6.40) holds for all $q \in S^n_k$, $k \in I^n$, and bound (6.33) follows by taking $\varepsilon$ small enough. Finally, we check that $A_n$ obeys (5.9). The a priori bound (6.25) together with (6.33) imply that $\|\hat P_n A_n\|^{(n)}_{s,s} \le 3\varepsilon\eta^{n-1}$, which, with (5.11) and definition (6.36), leads to (5.9). To complete the proof of part (c) of Proposition 5.1, we are left with the

Proof of Proposition 6.2. Denoting $Dw_n(z) = \sigma_n(z) + \tau_n(z)$, with $\sigma_n(z)(q, q') = Dw_n(z)(q, q')\delta_{qq'}$, one computes from (6.24) the recursive relation

$$\sigma_n(z) = \big(\tilde\sigma_{n-1}(\tilde z) + T_n(z)\big)H_n(\tilde z), \tag{6.41}$$


where

$$H_n(\tilde z) = \big(1 - \Gamma_n\tilde\sigma_{n-1}(\tilde z)\big)^{-1}, \qquad T_n(z)(q, q') = \big(\tau_n(z)\Gamma_n\tau_{n-1}(\tilde z)\big)(q, q')\delta_{qq'}.$$

Setting

$$\mathcal{R}_n(z) \equiv \tilde\sigma_{n-1}\big(H_n(\tilde z) - 1\big), \qquad \mathcal{T}_n(z) \equiv T_n(z)H_n(\tilde z),$$

and using $T_p\tilde\sigma_{n-1} = T_p\sigma_{n-1}$ together with the identity $T_p\sigma_0 = 0$, one applies (6.41) recursively to obtain

$$T_p\sigma_n(z) = \sum_{m=1}^{n} T_p\big(\mathcal{R}_m(z_m) + \mathcal{T}_m(z_m)\big), \tag{6.42}$$

where $z_m = F^{m+1}_n(z)$, cf. (6.14), with $F^{n+1}_n \equiv 1$. Note that $\mathcal{R}_m(z)$ is diagonal and can be rewritten as

$$\mathcal{R}_m(z) = \tilde\sigma_{m-1}(\tilde z)\Gamma_m\tilde\sigma_{m-1}(\tilde z)H_m(\tilde z). \tag{6.43}$$

As shown below, each term in (6.42) is easily seen to be of order $\varepsilon^2|\omega \cdot p|$. Thus, the main issue in obtaining (6.32) is to ensure that taking the sum will deteriorate the bound only slightly. Let us first consider the terms involving the quantities $T_p\mathcal{T}_m$. They are higher order terms, since $\mathcal{T}_m$ is quadratic in the off-diagonal part $\tau_m$ which, as shown in Lemma 6.1, is bounded by powers of $r$. Indeed, as carried out in the Appendix, one has for all $m = 1, \dots, n$ and $z \in B_m$,

$$\|\hat P_m T_p\mathcal{T}_m(z)\|^{(m)}_{s,s} \le \varepsilon^2\eta^m|\omega \cdot p|, \tag{6.44}$$

so that

$$\Big\|\hat P_n \sum_{m=1}^{n} T_p\mathcal{T}_m(z)\Big\|^{(n)}_{s,s} \le \sum_{m=1}^{n} \|\hat P_m T_p\mathcal{T}_m(z)\|^{(m)}_{s,s} \le \varepsilon^2|\omega \cdot p|. \tag{6.45}$$

On the other hand, the terms involving $T_p\mathcal{R}_m$ are not higher order terms. Since

$$T_p H_m(\tilde z) = t_p H_m(\tilde z)\big(T_p\Gamma_m\, t_p\tilde\sigma_{m-1}(\tilde z) + \Gamma_m T_p\tilde\sigma_{m-1}(\tilde z)\big)H_m(\tilde z),$$

(5.18) with $\sigma = s + \xi - \gamma$ and $n$ replaced by $m$ yields with the recursive bound (6.32)

$$\|T_p H_m(\tilde z)\|^{(m-2)}_s \le \eta^{-m}|\omega \cdot p|. \tag{6.46}$$

Thus, using in addition the recursive bounds (5.8) and (6.32), together with $\|H_m(\tilde z) - 1\|^{(m-1)}_s = \|\Gamma_m\tilde\sigma_{m-1}(\tilde z)H_m(\tilde z)\|^{(m-1)}_s \le C\varepsilon$, one obtains for all $m = 1, \dots, n$ and $z \in B_m$,

$$\|\hat P_n T_p\mathcal{R}_m(z)\|^{(n)}_{s,s} \le \|\hat P_m T_p\mathcal{R}_m(z)\|^{(m)}_{s,s} \le C\varepsilon^2|\omega \cdot p|, \tag{6.47}$$


to be compared with (6.44). However, one can actually show that

$$\Big\|\hat P_n \sum_{m=1}^{n} T_p\mathcal{R}_m(z)\Big\|^{(n)}_{s,s} \le \sup_{k \in I^n} \sup_{q \in S^n_k} \sum_{m=1}^{n} |T_p\mathcal{R}_m(z)(q)|_{s,s} \tag{6.48}$$

$$\le C\varepsilon^2|\omega \cdot p|, \tag{6.49}$$

with another $n$-independent constant $C$. Although (6.47) yields the a priori bound $|T_p\mathcal{R}_m(z)(q)|_{s,s} \le C\varepsilon^2|\omega \cdot p|$ for all $q \in S^n_k$ and $k \in I^n$, (6.49) will follow from the fact that all but a finite number of terms in (6.48) are identically zero. More precisely, there is for all $k \in I^n$ a set $Z^n_k \subset \{1, \dots, n\}$ with $\#Z^n_k$ uniformly bounded in $n$ and $k$ such that for all $q \in S^n_k$,

$$|T_p\mathcal{R}_m(z)(q)|_{s,s} \equiv 0 \quad \text{if } m \notin Z^n_k. \tag{6.50}$$

This leads to (6.49) and concludes the proof of bound (6.32), since (6.42), (6.45) and (6.49) lead to (6.32) by taking $\varepsilon$ small enough and by noting that $z_m \in B_m$ for all $z \in B_n$, cf. (6.15). Identity (6.50) for some finite set $Z^n_k$ follows from the expression (6.43) for $\mathcal{R}_m$ since by localization of scales $\Gamma_m(q) = (K_m^{-1}Q_m P_{m-1})(q) = 0$ for most $m \le n$ if $q \in S^n_k$. More precisely, one computes that

$$Q_m(q)P_{m-1}(q) = \sum_{\tilde k \in I^m} \big(1 - \chi^m_{\tilde k}(q)\big)\chi^{m-1}_{\tilde k_{m-1}}(q)P^m_{\tilde k},$$

where the index $\tilde k_{m-1}$ serves to denote the (unique) subspace $J^{m-1}_{\tilde k_{m-1}}$ containing $J^m_{\tilde k}$. Fix now some $k \in I^n$. Then one has for all $q \in S^n_k$ and all $m < n$,

$$Q_m(q)P_{m-1}(q) = \sum_{\substack{\tilde k \in I^m \\ \tilde k \neq k_m}} \chi^{m-1}_{\tilde k_{m-1}}(q)P^m_{\tilde k} = P_{J^{m-1}_{k_{m-1}} \setminus J^m_{k_m}},$$

since by construction $\chi^m_{k_m}(q) = 1$ for such $m$ and $q$. Therefore, $Q_m(q)P_{m-1}(q) = 0$ for all $q \in S^n_k$ if $m < n$ is such that $J^m_{k_m} = J^{m-1}_{k_{m-1}}$. On the other hand, $J^m_{k_m}$ is a strict subspace of $J^{m-1}_{k_{m-1}}$ only if $\#C^m_{k_m} < \#C^{m-1}_{k_{m-1}}$, i.e., if the eigenvalues contained in $C^{m-1}_{k_{m-1}}$ have been divided after perturbation by $a_{m-1}$ into two (or more) clusters. But this can be true only for finitely many $m$ since the original eigenvalues $\mu_k$ are finitely many times degenerate. Hence, there is an $L < \infty$ such that for all $n \ge 1$ and all $1 \le m \le n$, one has $\hat P_n\mathcal{R}_m(q) = 0$, except for some $m_1, \dots, m_L$. Since the same is true of $\hat P_n t_p\mathcal{R}_m(q)$ provided that $p$ satisfies $|\omega \cdot p| < \eta^{n-1}/16$, (6.50) follows.

6.4. The cluster decomposition. We now check that point (d) of Proposition 5.1 holds. First, (5.9), (5.10) and (5.11) lead to, for $k = (k, \cdot) \in I^n$,

$$|a_n P^n_k|_{L(J^n_k)} \le 3k^{\gamma - \xi}\varepsilon\eta^{n-1}, \tag{6.51}$$

which, since $\mu_k^2 \ge ck^{2\gamma}$ by hypothesis (H1), implies that $\mu^2 + \sum_{m=0}^{n} a_m \equiv \tilde\mu^2_{n+1}$ is positive definite for $\varepsilon = \varepsilon(c, \eta)$ small enough. Next, it follows from $a_n$ being hermitian that $\sigma(\tilde\mu_{n+1}) \subset \mathbb{R}_+$. Furthermore, using (5.11) and the fact that $J^n_k$ is by definition an invariant subspace for $\tilde\mu_n$, one infers from $\mu_k \ge ck^\gamma$, the asymptotic (5.13) for $\tilde\mu_n$, and the estimate (6.51), that

$$|a_n\tilde\mu_n^{-1}P^n_k|_{L(J^n_k)} \le 3c^{-1}k^{-\xi}\varepsilon\eta^{n-1}.$$

Therefore, denoting by $P_k$ the projector onto the $k$-th component of $\hat{\mathbb{R}}^\infty = \bigoplus_{k \ge 1} \mathbb{C}^{d_k}$, one obtains

$$\tilde\mu_{n+1}P_k = \big(\tilde\mu_n^2 + a_n\big)^{\frac12}P_k = \tilde\mu_n P_k + O(k^{-\xi}\varepsilon\eta^{n-1}), \tag{6.52}$$

which, since $\mu P_k = \mu_k 1_{d_k}$, implies by recursion that $\tilde\mu_{n+1}P_k = \mu_k 1_{d_k} + O(\varepsilon k^{-\xi})$. Hence, the asymptotic (5.13) holds, where for each $k \ge 1$ the sequence of clusters $C^{n+1}_{k,i}$, $i = 1, \dots, M^{n+1}_k$, forms a partition of the component $\sigma(\tilde\mu_{n+1}P_k)$ satisfying $d(C^{n+1}_{k,i}, C^{n+1}_{k,j}) > \eta^{n+1}$ for $i \neq j$. This partition is unique if $M^{n+1}_k$ is required to be maximal. Furthermore, it follows from (1.13) and (1.14) in (H1) that for $\varepsilon = \varepsilon(c)$ small enough, the components $\sigma(\tilde\mu_{n+1}P_k)$ are well separated. Therefore, the sets $S^{n+1}_k$, $k \in I^{n+1}$, defined according to (3.25) are pairwise disjoint. Next, (6.52) and the gap condition (5.12) with $n + 1$ replaced by $n$ imply that for $\varepsilon = \varepsilon(c, \eta)$ small enough, every cluster $C^{n+1}_{k,i}$ is composed of perturbed eigenvalues belonging to a unique $C^n_{k,j^{n+1}_i}$. The distance between these two clusters is at most of order $O(k^{-\xi}\varepsilon\eta^{n-1})$, so that (3.31) follows for $n + 1$ by induction. In order to iterate (3.32), we note that by definition, $J^{n+1}_k$ is the eigenspace of $\tilde\mu_{n+1}$ associated with $C^{n+1}_k$, $k \in I^{n+1}$, and that every $J^n_k$, $k \in I^n$, is also an invariant subspace for $\tilde\mu_{n+1}$ by (5.11). Therefore, each $J^{n+1}_{k,i}$ is contained in a unique $J^n_{k,j^{n+1}_i}$, namely, the eigenspace associated with $C^n_{k,j^{n+1}_i}$. Finally, we check that (3.33) iterates. This is a simple consequence of (3.32) and $S^{n+1}_{k,i} \subset S^n_{k,j^{n+1}_i}$, the latter following from (6.52) for $\varepsilon$ small enough.
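The decomposition of a finite piece of spectrum into a maximal family of clusters separated by gaps larger than a threshold can be computed by sorting the values and cutting at every large gap. The following Python sketch does this for a list of real eigenvalues. The function name, the sample values and the threshold are hypothetical, and the sketch ignores multiplicities and the infinite-dimensional setting of the paper.

```python
def cluster_decomposition(eigenvalues, gap):
    """Split real numbers into the maximal family of clusters such that
    any two distinct clusters are at distance > gap.

    Two consecutive sorted values are kept in the same cluster exactly
    when they differ by at most `gap`.
    """
    clusters = []
    for value in sorted(eigenvalues):
        if clusters and value - clusters[-1][-1] <= gap:
            clusters[-1].append(value)
        else:
            clusters.append([value])
    return clusters


# Hypothetical perturbed eigenvalues and a threshold playing the
# role of eta**(n+1).
spectrum = [3.01, 1.0, 1.5, 1.05, 3.0]
clusters = cluster_decomposition(spectrum, gap=0.1)
print(clusters)
```

Any partition whose clusters are pairwise separated by more than the threshold must keep values within the threshold of each other together, so cutting only at the large gaps yields the partition with the largest number of clusters, in line with the uniqueness statement for maximal $M^{n+1}_k$.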

7. Measure Estimate

In this section, we prove Proposition 5.2, namely, that $\Omega^*(K) = \bigcap_{n \ge 1}\Omega_n(K)$ satisfies

$$\lim_{K \to 0} \mathrm{meas}(\Omega \setminus \Omega^*(K)) = 0, \tag{7.1}$$

for all bounded $\Omega \subset \mathbb{R}^d$. The strategy is standard and consists in studying the complementary sets of $\Omega_n(K)$. For $n \ge 1$, $b > 0$, and $q \in \mathbb{Z}^d$, let us define

$$O^n_{q,b} \equiv \Big(\bigcup_{k \in I^n} O^{n;k}_{q,b}\Big) \cup \Big(\bigcup_{k,k' \in I^n} O^{n;k,k'}_{q,b}\Big),$$

where

$$O^{n;k}_{q,b} = \{\omega \in \mathbb{R}^d \mid d(|\omega \cdot q|, C^n_k) \le b\},$$

$$O^{n;k,k'}_{q,b} = \{\omega \in \mathbb{R}^d \mid d(|\omega \cdot q|, |C^n_k \pm C^n_{k'}|) \le b\}.$$

Next, with

$$Z_n \equiv \{q \in \mathbb{Z}^d \mid K^{\frac1\nu}\eta^{-\frac{n-1}{\nu}} \le |q| < K^{\frac1\nu}\eta^{-\frac{n}{\nu}}\},$$

and

$$O^*(K) \equiv \bigcup_{n \ge 1}\bigcup_{q \in Z_n} O^n_{q,2K|q|^{-\nu}},$$

one shows first, that for all bounded $\Omega \subset \mathbb{R}^d$,

$$\mathrm{meas}\big(\Omega \cap O^*(K)\big) \le C_\Omega K^{\frac{\xi}{\xi+1}}, \tag{7.2}$$

for some constant $C_\Omega$ depending on $\Omega$ only, and, second, that

$$\big(O^*(K)\big)^c \subseteq \Omega^*(K). \tag{7.3}$$

Obviously, (7.1) follows from (7.2) and (7.3). Below, $C_\Omega$ will denote a generic constant that may change from place to place but depends on $\Omega$ only. Let us start with the bound (7.2). One has

$$\mathrm{meas}\big(\Omega \cap O^*(K)\big) \le \sum_{n \ge 1}\sum_{q \in Z_n}\Big(T^n_{q,2K|q|^{-\nu}} + \hat T^n_{q,2K|q|^{-\nu}}\Big), \tag{7.4}$$

where

$$T^n_{q,b} = \sum_{k \in I^n} \mathrm{meas}\big(\Omega \cap O^{n;k}_{q,b}\big), \qquad \hat T^n_{q,b} = \sum_{k,k' \in I^n} \mathrm{meas}\big(\Omega \cap O^{n;k,k'}_{q,b}\big). \tag{7.5}$$

To treat the terms on the right-hand side of (7.4) involving the quantities $T^n_{q,b}$, we first use (3.30) to estimate

$$\mathrm{meas}\big(\Omega \cap O^{n;k}_{q,b}\big) \le C_\Omega(b + \bar d\eta^n).$$

Next, we note that the asymptotic behavior of the clusters $C^n_k$, cf. (1.12) and (5.13), implies that $\Omega \cap O^{n;k}_{q,b}$ is empty if $k = (k, \cdot)$ satisfies $k \ge C_\Omega|q|$ for some constant $C_\Omega$. Hence, since the number of indices $k$ of the form $(k, \cdot)$ is uniformly bounded in $k$, the number of terms which are non-zero in the sum defining $T^n_{q,b}$ is proportional to $|q|$, and one obtains the estimate $T^n_{q,b} \le C_\Omega|q|(b + \bar d\eta^n)$. Finally, the fact that $q \in Z_n$ satisfies $\eta^n \le K|q|^{-\nu}$ leads to

$$\sum_{n \ge 1}\sum_{q \in Z_n} T^n_{q,2K|q|^{-\nu}} \le C_\Omega\big(2K + \bar dK\big)\sum_{q \in \mathbb{Z}^d}|q|^{1-\nu} \le C_\Omega K, \tag{7.6}$$

for $\nu = \nu(d)$ large enough. To treat the remaining terms in (7.4), we first note that, as above,

$$\mathrm{meas}\big(\Omega \cap O^{n;k,k'}_{q,b}\big) \le C_\Omega(b + 2\bar d\eta^n). \tag{7.7}$$

Next, one distinguishes the cases $\gamma = 1$ and $\gamma > 1$. If $\gamma > 1$, then for $k > k'$ the inequality $k^\gamma - k'^\gamma > k^{\gamma-1}$ and the asymptotic (1.13) imply that $\Omega \cap O^{n;k,k'}_{q,b}$ is empty


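for $k = (k, \cdot)$ and $k' = (k', \cdot)$ such that $k \ge C_\Omega|q|^{1/(\gamma-1)} \equiv k_q$. Furthermore, it follows from (5.13) that for $k_b = b^{-\frac{1}{\xi+1}}$,

$$\mathrm{meas}\Big(\bigcup_{k > Ck_b}\bigcup_{i,j} O^{n;(k,i),(k,j)}_{q,b}\Big) \le Ck_b^{-\xi}.$$

Therefore, one obtains with (7.7),

$$\hat T^n_{q,b} \le Cb^{\frac{\xi}{\xi+1}} + \sum_{k=1}^{Ck_b}\sum_{i,j} \mathrm{meas}\big(\Omega \cap O^{n;(k,i),(k,j)}_{q,b}\big) + \sum_{k=2}^{k_q}\sum_{k' < k} \cdots$$

[...] $> 0$ and $\nu = \nu(d, \xi)$ large enough. We now consider the case $\gamma = 1$. From (5.13) and the asymptotic behavior (1.14), it follows first that $\Omega \cap O^{n;k,k'}_{q,b}$ is empty for $k = (k, \cdot)$ and $k' = (k', \cdot)$ with $k - k' = l \ge C|q|$, and second that for all $l \ge 0$,

$$\mathrm{meas}\Big(\bigcup_{k > Ck_b}\bigcup_{i,j} O^{n;(k,i),(k+l,j)}_{q,b}\Big) \le Ck_b^{-\xi},$$

where $k_b = b^{-\frac{1}{\xi+1}}$. Therefore, (7.7) leads to

$$\hat T^n_{q,b} \le C|q|\,b^{\frac{\xi}{\xi+1}} + C_\Omega\, b^{-\frac{1}{\xi+1}}|q|(b + 2\bar d\eta^n),$$

and one finally obtains for $\nu = \nu(d, \xi)$ large enough,

$$\sum_{n \ge 1}\sum_{q \in Z_n} \hat T^n_{q,2K|q|^{-\nu}} \le C_\Omega K^{\frac{\xi}{1+\xi}}\sum_{q \in \mathbb{Z}^d}|q|^{1 - \nu\frac{\xi}{1+\xi}} \le C_\Omega K^{\frac{\xi}{1+\xi}}. \tag{7.9}$$

Inserting (7.6) and (7.9) into (7.4) yields (7.2). We now check that (7.3) holds. If $\omega \notin O^*(K)$, then the following is true for all $n \ge 1$, $q \in Z_n$ and $k, k' \in I^n$,

$$d(|\omega \cdot q|, C^n_k) > 2K|q|^{-\nu}, \tag{7.10}$$

$$d(|\omega \cdot q|, |C^n_k \pm C^n_{k'}|) > 2K|q|^{-\nu}. \tag{7.11}$$

Next, we verify that for such $\omega$, this implies that bounds (7.10) and (7.11) hold for all $q \in \bigcup_{m=1}^{n} Z_m$ provided one replaces the constant $2K$ on the right-hand side by $K$. This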


in turn implies that $\omega \in \Omega_n(K)$ for all $n \ge 1$, so that $\omega \in \Omega^*(K)$. Let $m < n$ and fix some $k \in I^n$. Then, recalling (3.31), namely that there is at least one $k' \in I^m$ for which

$$\sup_{x \in I^n_k}\ \inf_{y \in I^m_{k'}} d(x, y) \le \eta^{m+1},$$

and since, on the other hand, $\eta^m < K|q|^{-\nu}$ whenever $q \in Z_m$, one infers from (7.10) with $n$ replaced by $m$ that for $q \in Z_m$ and $\eta < 1$,

$$d(|\omega \cdot q|, C^n_k) \ge d(|\omega \cdot q|, C^m_{k'}) - \eta^{m+1} \ge (2K - \eta K)|q|^{-\nu} > K|q|^{-\nu}. \tag{7.12}$$

Since (7.12) holds for all $q \in Z_m$, $1 \le m \le n$, one concludes that $d(|\omega \cdot q|, C^n_k) > K|q|^{-\nu}$ whenever $0 < |q| < K\eta^{-n/\nu}$. In a similar way, one derives an identical lower bound on $d(|\omega \cdot q|, |C^n_k \pm C^n_{k'}|)$, thus achieving the proof of (7.3) and (7.1).

8. Proof of Theorem 1.1

Defining $z_n \equiv F_n(0)$, we now show that $z_n$ converges in $h_s$, as $n \to \infty$, to a function $z$ whose Fourier transform is real analytic and provides a solution of Eq. (3.4). Using $F_n(0) = F_{n-1}(R_n(0))$, cf. (3.13), one computes that $z_n - z_{n-1} = \delta_1 F_{n-1}\big(R_n(0)\big)$. According to (6.5), $R_n(0) = H_n\Gamma_n w_{n-1}(0) + u(0)$, so that (5.6), (5.17), (6.9), (6.10) and the identity $\Gamma_n = \Gamma_n\hat P_{n-1}$ lead to $\|R_n(0)\|_{h^{n-1}_s} \le \eta^{-n}r^{2(n-1)}$. Therefore, since $F_{n-1} \in A_{n-1} = H^\infty(B_{n-1}, h_s)$, one can apply (4.12) to $\delta_1 F_{n-1}$ with $\gamma = \eta^{-n}r^{n-2}$ to obtain $\|z_n - z_{n-1}\|_s \le C\eta^{-n}r^{n-2}|||F_{n-1}|||_{A_{n-1}}$, and the convergence of $z_n$ in $h_s$ follows from the uniform bound (6.16) by taking $r = r(\eta)$ small enough. Bound (6.16) also implies $\|z_\beta\| \le \varepsilon$ uniformly in the strip $|\mathrm{Im}\,\beta| < \alpha' = \alpha\prod_{n=2}^{\infty}(1 - n^{-2})$. This yields the pointwise estimate

|z(q)| ≤ ε e^{−α′|q|}, and, consequently, ensures the real analyticity of the Fourier transform of z. In order to prove that the limit z solves Eq. (3.6), namely, K_0 z = w_0(z), we will show below that K_0 z_n = Q_n w_0(z_n) + A 1, but with the transformation q → 1/q, α → α*, β → qβ and z → z, we get a sphere which is C*-isomorphic to one for |q| < 1. It is clear that the quotient of the C*-algebra A_q by the ideal generated by z can be identified with the C*-algebra of the compact quantum group SU_q(2). However, in this paper we shall not make any use of additional structures (like coproduct, counit, and antipode) coming from SU_q(2).

In [19] it was shown that for q ∈ (−1, 0) ∪ (0, 1) the spaces SU_q(2) are all homeomorphic in the sense that the corresponding C*-algebras are isomorphic. Then, for q ∈ (−1, 0) ∪ (0, 1), all our C*-algebras A_q are isomorphic as well and all corresponding spheres are homeomorphic. For the generic situation when −1 < q < 0 or 0 < q < 1 any character χ of A_q has to satisfy the equations

\[ \chi(\alpha^*) = \overline{\chi(\alpha)}, \qquad \chi(\beta) = 0, \qquad \chi(\beta^*) = \overline{\chi(\beta)}, \qquad \chi(z^*) = \chi(z), \]

and

\[ |\chi(\alpha)|^2 + (\chi(z))^2 = 1. \tag{4} \]

To show that the space of all characters is homeomorphic to the two dimensional sphere S^2, we take a generic α′ ∈ C and z′ ∈ R such that |α′|² + z′² = 1. Then, from the general considerations presented above, there is a 1-dimensional representation (that is, a character) χ of A_q such that χ(α) = α′, χ(β) = 0 and χ(z) = z′, and this proves the homeomorphism in question. Hence, for −1 < q < 0 or 0 < q < 1 the space of (nonzero) characters of A_q, which can be thought of as the space of “classical points” of S_q^4, is homeomorphic to the classical S^2.

For the particular case q = 1 the algebra of the sphere S_q^4 is commutative. The associated space of characters is homeomorphic to the 4-dimensional sphere S^4. Indeed any character χ of A_{q=1} satisfies the equations

\[ \chi(\alpha^*) = \overline{\chi(\alpha)}, \qquad \chi(\beta^*) = \overline{\chi(\beta)}, \qquad \chi(z^*) = \chi(z), \]

and

\[ |\chi(\alpha)|^2 + |\chi(\beta)|^2 + (\chi(z))^2 = 1. \tag{5} \]

To show that any element of S^4 arises in this way, similarly to what we did before we take generic α′, β′ ∈ C and z′ ∈ R such that |α′|² + |β′|² + z′² = 1. Thus they satisfy relations (5) (or relations (1) for q = 1) and there is a 1-dimensional representation χ of A_q (q = 1) such that χ(α) = α′, χ(β) = β′ and χ(z) = z′. This proves the homeomorphism in question and shows that the algebra A_q for q = 1 can be identified with the algebra of all continuous functions on the 4-dimensional sphere S^4. It is in this sense that S_q^4 provides a deformation of the classical S^4.

Next, we describe irreducible representations of the algebra A_q (for −1 < q < 0 or 0 < q < 1) as bounded operators on an infinite dimensional Hilbert space H with an orthonormal basis {ψ_n, n = 0, 1, 2, ···}. With λ ∈ C, |λ| ≤ 1, we get two families of representations π_{λ,±} : A_q → B(H) given by

\[
\begin{aligned}
\pi_{\lambda,\pm}(z)\psi_n &= \pi_{\lambda,\pm}(z^*)\psi_n = \pm\sqrt{1-|\lambda|^2}\,\psi_n, \\
\pi_{\lambda,\pm}(\alpha)\psi_n &= \lambda\sqrt{1-q^{2(n+1)}}\,\psi_{n+1}, &
\pi_{\lambda,\pm}(\alpha^*)\psi_n &= \bar\lambda\sqrt{1-q^{2n}}\,\psi_{n-1}, \\
\pi_{\lambda,\pm}(\beta)\psi_n &= \lambda q^{n}\psi_n, &
\pi_{\lambda,\pm}(\beta^*)\psi_n &= \bar\lambda q^{n}\psi_n.
\end{aligned}
\tag{6}
\]

To be precise, for λ such that |λ| = 1, the two representations π_{λ,+} and π_{λ,−} are identical so that, in fact, we have a family of representations parametrized by points on a classical sphere S^2, similarly to what happens for one dimensional representations (characters) as described before. As mentioned already, the quotient of the C*-algebra A_q by the ideal generated by z is the C*-algebra of the compact quantum group SU_q(2). Then, with |λ| = 1, the representations π_{λ,+} = π_{λ,−} =: π_λ yield representations of SU_q(2) which are unitarily equivalent to the ones constructed by Woronowicz (see for instance [21]).

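3. The Instanton and Its Classes

Consider now the following element e in the algebra Mat_4(A_q) ≃ Mat_4(C) ⊗ A_q:

\[
e = \frac{1}{2}
\begin{pmatrix}
1+z & 0 & \alpha & \beta \\
0 & 1+z & -q\beta^* & \alpha^* \\
\alpha^* & -q\beta & 1-z & 0 \\
\beta^* & \alpha & 0 & 1-z
\end{pmatrix}.
\tag{7}
\]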

Using the relations (1) it can be verified that e is a selfadjoint idempotent (projection), e² = e = e*. It operates on the right A_q-module A_q^4 = A_q ⊗ C^4 and its range may be thought of as sections of a vector bundle over S_q^4. It is easy to see that eA_q^4 is a deformation of the classical instanton bundle over S^4 in the sense that for q = 1, the module eA_q^4 is the module of sections of the complex rank two instanton bundle over S^4 [1].

Next, we compute the Chern–Connes Character of the idempotent e given in (7). If ⟨ · ⟩ is the projection on the commutant of 4 × 4 matrices, up to normalization the components of the (reduced) Chern–Connes Character are given by

\[ \mathrm{ch}_n(e) = \Bigl\langle \bigl(e - \tfrac{1}{2}\bigr) \otimes \underbrace{e \otimes \cdots \otimes e}_{2n} \Bigr\rangle, \qquad n = 0, 1, 2, \dots, \tag{8} \]

and they are elements of

\[ A_q \otimes \underbrace{\bar A_q \otimes \cdots \otimes \bar A_q}_{2n}, \tag{9} \]

where Ā_q = A_q/CI is the quotient of the algebra A_q by the scalar multiples of the unit.
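The claim that e is a selfadjoint idempotent of rank two can at least be sanity-checked in the commutative case q = 1, where the generators may be replaced by numbers α, β ∈ C and z ∈ R with |α|² + |β|² + z² = 1. The following is a minimal sketch under that assumption only; it does not test the noncommutative relations (1), and the function names are illustrative, not taken from the paper.

```python
import math
import random


def instanton_projection(alpha, beta, z):
    """Matrix (7) at q = 1, with the generators replaced by complex/real numbers."""
    ac, bc = alpha.conjugate(), beta.conjugate()
    rows = [
        [1 + z, 0, alpha, beta],
        [0, 1 + z, -bc, ac],
        [ac, -beta, 1 - z, 0],
        [bc, alpha, 0, 1 - z],
    ]
    return [[entry / 2 for entry in row] for row in rows]


def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def adjoint(a):
    """Conjugate transpose."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def max_diff(a, b):
    """Largest entrywise distance between two matrices."""
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))


def random_point_on_s4(rng):
    """A point (alpha, beta, z) with |alpha|^2 + |beta|^2 + z^2 = 1."""
    v = [rng.gauss(0, 1) for _ in range(5)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    return complex(v[0], v[1]), complex(v[2], v[3]), v[4]


rng = random.Random(0)
alpha, beta, z = random_point_on_s4(rng)
e = instanton_projection(alpha, beta, z)
idempotent_error = max_diff(matmul(e, e), e)      # e^2 - e
selfadjoint_error = max_diff(adjoint(e), e)       # e^* - e
trace = sum(e[i][i] for i in range(4))            # rank of the projection
```

At every point of the classical sphere the trace of e equals 2, which is the numerical counterpart of the rank-two statement for q = 1.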


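The crucial property of the components ch_n(e) is that they define a cycle in the (b, B) bicomplex of cyclic homology [3, 12], that is,

\[ B\,\mathrm{ch}_n(e) = b\,\mathrm{ch}_{n+1}(e). \tag{10} \]

The operator b is defined by

\[ b(a_0 \otimes a_1 \otimes \cdots \otimes a_m) = \sum_{j=0}^{m-1} (-1)^j\, a_0 \otimes \cdots \otimes a_j a_{j+1} \otimes \cdots \otimes a_m + (-1)^m\, a_m a_0 \otimes a_1 \otimes \cdots \otimes a_{m-1}, \tag{11} \]

while the operator B is written as

\[ B = A B_0, \tag{12} \]

where

\[ B_0(a_0 \otimes a_1 \otimes \cdots \otimes a_m) = I \otimes a_0 \otimes a_1 \otimes \cdots \otimes a_m, \tag{13} \]

\[ A(a_0 \otimes a_1 \otimes \cdots \otimes a_m) = \frac{1}{m} \sum_{j=0}^{m} (-1)^{mj}\, a_j \otimes a_{j+1} \otimes \cdots \otimes a_{j-1}, \tag{14} \]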

with the obvious cyclic identification m + 1 = 0. To be precise, in formulæ (11), (13) and (14), all elements in the tensor products but the first one should be taken modulo complex multiples of the unit I, that is, one has to project onto Ā_q = A_q/CI. For the 0th component of the Chern–Connes Character of the idempotent (7) on the spheres S_q^4 we find

\[ \mathrm{ch}_0(e) = \bigl\langle e - \tfrac{1}{2} \bigr\rangle = 0. \tag{15} \]

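This could be interpreted as saying that the idempotent and the corresponding module (the “vector bundle”) has complex rank equal to 2. Next, for the 1st component we have

\[
\begin{aligned}
\mathrm{ch}_1(e) &= \bigl\langle \bigl(e - \tfrac{1}{2}\bigr) \otimes e \otimes e \bigr\rangle \\
&= \frac{1}{8}(1-q^2)\Bigl\{ z \otimes (\beta \otimes \beta^* - \beta^* \otimes \beta) + \beta^* \otimes (z \otimes \beta - \beta \otimes z) + \beta \otimes (\beta^* \otimes z - z \otimes \beta^*) \Bigr\}.
\end{aligned}
\tag{16}
\]

It is straightforward to check that

\[ b\,\mathrm{ch}_1(e) = 0 = B\,\mathrm{ch}_0(e). \tag{17} \]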
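The Hochschild boundary b of (11) is easy to experiment with on a computer. The sketch below is an illustration only: it models algebra elements as words in a free associative algebra (Python strings, with concatenation as the product) rather than elements of A_q, it works in the unreduced complex (no projection onto Ā_q), and it adopts the usual convention that b vanishes on 0-chains. Under these assumptions it checks the standard identity b ∘ b = 0.

```python
from collections import defaultdict


def hochschild_b(chain):
    """Apply the operator b of (11) to a chain {(a_0, ..., a_m): coefficient}.

    Each a_i is a word (string); the product of two words is concatenation.
    """
    out = defaultdict(int)
    for tensor, coeff in chain.items():
        m = len(tensor) - 1
        if m == 0:
            continue  # convention: b is zero on 0-chains
        # terms a_0 (x) ... (x) a_j a_{j+1} (x) ... (x) a_m with sign (-1)^j
        for j in range(m):
            merged = tensor[:j] + (tensor[j] + tensor[j + 1],) + tensor[j + 2:]
            out[merged] += (-1) ** j * coeff
        # cyclic term a_m a_0 (x) a_1 (x) ... (x) a_{m-1} with sign (-1)^m
        wrapped = (tensor[m] + tensor[0],) + tensor[1:m]
        out[wrapped] += (-1) ** m * coeff
    # drop cancelled terms
    return {t: c for t, c in out.items() if c != 0}


chain = {("a", "b", "c", "d"): 1, ("x", "y", "z"): 2}
boundary = hochschild_b(chain)
boundary_of_boundary = hochschild_b(boundary)  # expected to be the empty chain
```

For a 1-chain a ⊗ b the function returns the commutator ab − ba, which is the lowest-degree instance of (11).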


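Finally, the 2nd component

\[ \mathrm{ch}_2(e) = \bigl\langle \bigl(e - \tfrac{1}{2}\bigr) \otimes e \otimes e \otimes e \otimes e \bigr\rangle \tag{18} \]

can be written as a sum of five terms

\[ \mathrm{ch}_2(e) = \frac{1}{32}\Bigl( z \otimes c_z + \alpha \otimes c_\alpha + \alpha^* \otimes c_{\alpha^*} + \beta \otimes c_\beta + \beta^* \otimes c_{\beta^*} \Bigr), \tag{19} \]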

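with

\[
\begin{aligned}
c_z ={}& (1-q^4)(\beta\otimes\beta^*\otimes\beta\otimes\beta^* - \beta^*\otimes\beta\otimes\beta^*\otimes\beta) \\
&+ (1-q^2)\Bigl\{ z\otimes z\otimes(\beta\otimes\beta^* - \beta^*\otimes\beta) + (\beta\otimes z\otimes z\otimes\beta^* - \beta^*\otimes z\otimes z\otimes\beta) \\
&\qquad + (\beta\otimes\beta^* - \beta^*\otimes\beta)\otimes z\otimes z + z\otimes(\beta\otimes\beta^* - \beta^*\otimes\beta)\otimes z \\
&\qquad - z\otimes(\beta\otimes z\otimes\beta^* - \beta^*\otimes z\otimes\beta) - (\beta\otimes z\otimes\beta^* - \beta^*\otimes z\otimes\beta)\otimes z \Bigr\} \\
&+ (\alpha\otimes\alpha^* - q^2\alpha^*\otimes\alpha)\otimes(\beta\otimes\beta^* - \beta^*\otimes\beta) \\
&+ (\beta\otimes\beta^* - \beta^*\otimes\beta)\otimes(\alpha\otimes\alpha^* - q^2\alpha^*\otimes\alpha) \\
&+ (\beta\otimes\alpha - q\alpha\otimes\beta)\otimes(\alpha^*\otimes\beta^* - q\beta^*\otimes\alpha^*) \\
&+ (\alpha^*\otimes\beta^* - q\beta^*\otimes\alpha^*)\otimes(\beta\otimes\alpha - q\alpha\otimes\beta) \\
&+ (\alpha^*\otimes\beta - q\beta\otimes\alpha^*)\otimes(q\alpha\otimes\beta^* - \beta^*\otimes\alpha) \\
&+ (q\alpha\otimes\beta^* - \beta^*\otimes\alpha)\otimes(\alpha^*\otimes\beta - q\beta\otimes\alpha^*);
\end{aligned}
\tag{20}
\]

\[
\begin{aligned}
c_\alpha ={}& (z\otimes\alpha^* - \alpha^*\otimes z)\otimes(\beta^*\otimes\beta - \beta\otimes\beta^*) \\
&+ q^2(\beta^*\otimes\beta - \beta\otimes\beta^*)\otimes(z\otimes\alpha^* - \alpha^*\otimes z) \\
&+ q(z\otimes\beta - \beta\otimes z)\otimes(\alpha^*\otimes\beta^* - q\beta^*\otimes\alpha^*) \\
&+ (\alpha^*\otimes\beta^* - q\beta^*\otimes\alpha^*)\otimes(z\otimes\beta - \beta\otimes z) \\
&+ q(\beta^*\otimes z - z\otimes\beta^*)\otimes(\alpha^*\otimes\beta - q\beta\otimes\alpha^*) \\
&+ (\alpha^*\otimes\beta - q\beta\otimes\alpha^*)\otimes(\beta^*\otimes z - z\otimes\beta^*);
\end{aligned}
\tag{21}
\]

\[
\begin{aligned}
c_{\alpha^*} ={}& q^2(z\otimes\alpha - \alpha\otimes z)\otimes(\beta\otimes\beta^* - \beta^*\otimes\beta) \\
&+ (\beta\otimes\beta^* - \beta^*\otimes\beta)\otimes(z\otimes\alpha - \alpha\otimes z) \\
&+ (\beta^*\otimes z - z\otimes\beta^*)\otimes(\beta\otimes\alpha - q\alpha\otimes\beta) \\
&+ q(\beta\otimes\alpha - q\alpha\otimes\beta)\otimes(\beta^*\otimes z - z\otimes\beta^*) \\
&+ (z\otimes\beta - \beta\otimes z)\otimes(\beta^*\otimes\alpha - q\alpha\otimes\beta^*) \\
&+ q(\beta^*\otimes\alpha - q\alpha\otimes\beta^*)\otimes(z\otimes\beta - \beta\otimes z);
\end{aligned}
\tag{22}
\]

\[
\begin{aligned}
c_\beta ={}& (1-q^4)\Bigl\{ (\beta^*\otimes z - z\otimes\beta^*)\otimes\beta\otimes\beta^* + \beta^*\otimes\beta\otimes(\beta^*\otimes z - z\otimes\beta^*) \Bigr\} \\
&+ (1-q^2)\Bigl\{ \beta^*\otimes z\otimes z\otimes z - z\otimes\beta^*\otimes z\otimes z + z\otimes z\otimes\beta^*\otimes z - z\otimes z\otimes z\otimes\beta^* \Bigr\} \\
&+ (\beta^*\otimes z - z\otimes\beta^*)\otimes(\alpha\otimes\alpha^* - q^2\alpha^*\otimes\alpha) \\
&+ (\alpha\otimes\alpha^* - q^2\alpha^*\otimes\alpha)\otimes(\beta^*\otimes z - z\otimes\beta^*) \\
&+ (\alpha\otimes z - z\otimes\alpha)\otimes(\alpha^*\otimes\beta^* - q\beta^*\otimes\alpha^*) \\
&+ q(\alpha^*\otimes\beta^* - q\beta^*\otimes\alpha^*)\otimes(\alpha\otimes z - z\otimes\alpha) \\
&+ (\beta^*\otimes\alpha - q\alpha\otimes\beta^*)\otimes(\alpha^*\otimes z - z\otimes\alpha^*) \\
&+ q(\alpha^*\otimes z - z\otimes\alpha^*)\otimes(\beta^*\otimes\alpha - q\alpha\otimes\beta^*);
\end{aligned}
\tag{23}
\]

\[
\begin{aligned}
c_{\beta^*} ={}& (1-q^4)\Bigl\{ (z\otimes\beta - \beta\otimes z)\otimes\beta^*\otimes\beta + \beta\otimes\beta^*\otimes(z\otimes\beta - \beta\otimes z) \Bigr\} \\
&+ (1-q^2)\Bigl\{ -\beta\otimes z\otimes z\otimes z + z\otimes\beta\otimes z\otimes z - z\otimes z\otimes\beta\otimes z + z\otimes z\otimes z\otimes\beta \Bigr\} \\
&+ (z\otimes\beta - \beta\otimes z)\otimes(\alpha\otimes\alpha^* - q^2\alpha^*\otimes\alpha) \\
&+ (\alpha\otimes\alpha^* - q^2\alpha^*\otimes\alpha)\otimes(z\otimes\beta - \beta\otimes z) \\
&+ q(z\otimes\alpha^* - \alpha^*\otimes z)\otimes(\beta\otimes\alpha - q\alpha\otimes\beta) \\
&+ (\beta\otimes\alpha - q\alpha\otimes\beta)\otimes(z\otimes\alpha^* - \alpha^*\otimes z) \\
&+ q(\alpha^*\otimes\beta - q\beta\otimes\alpha^*)\otimes(z\otimes\alpha - \alpha\otimes z) \\
&+ (z\otimes\alpha - \alpha\otimes z)\otimes(\alpha^*\otimes\beta - q\beta\otimes\alpha^*).
\end{aligned}
\tag{24}
\]

By using the relations (1) for our algebra, and remembering that we need to project on Ā_q in all terms of the tensor product but the first one, a long (one needs to compute 750 terms) but straightforward computation gives

\[
b\,\mathrm{ch}_2(e) = \frac{1}{16}(1-q^2)\Bigl\{ I\otimes z\otimes(\beta\otimes\beta^* - \beta^*\otimes\beta) + I\otimes\beta\otimes(\beta^*\otimes z - z\otimes\beta^*) + I\otimes\beta^*\otimes(z\otimes\beta - \beta\otimes z) \Bigr\},
\tag{25}
\]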

and this is exactly equal to B ch_1(e).

4. Final Remarks

There are several directions in which one can proceed and we just mention some of them. It would be clearly very interesting to study differential calculi on our quantum 4-sphere and develop Yang–Mills theory. Another natural question is to what extent the sphere S_q^4 could be endowed with a structure of a metric noncommutative manifold which fulfills (some of) the related axioms [5, 6]. In particular one should construct an appropriate Dirac operator. This will probably be possible along the lines of [8] where it was suggested that the true Dirac operator D for the quantum SU_q(2) (and also for the quantum Podleś 2-sphere S_q^2 [16]) should satisfy an equation of the form

\[ \frac{q^{2D} - q^{-2D}}{q^{2} - q^{-2}} = Q, \tag{26} \]

where Q is some q-analogue of the Dirac operator like the ones found in [2, 13]. Once the operator D is found, one could easily “suspend” it to the 4-sphere S_q^4. Finally, we mention that it would be interesting to study whether there is any relation with the sheaf-theoretic construction of a q-deformed instanton in [15].

Acknowledgements. We are grateful to Alain Connes for several enlightening conversations. This work has been partially supported by the Regione Friuli-Venezia-Giulia via the Research Project “Noncommutative geometry: algebraic, analytical and probabilistic aspects and applications to mathematical physics”.

References

1. Atiyah, M.F.: Geometry of Yang–Mills fields. Accad. Naz. Dei Lincei, Scuola Norm. Sup. Pisa, 1979
2. Bibikov, P.N., Kulish, P.P.: Dirac operators on quantum SU(2) group and quantum sphere. q-alg/9608012
3. Connes, A.: Noncommutative differential geometry. Inst. Hautes Etudes Sci. Publ. Math. 62, 257–360 (1985)
4. Connes, A.: Noncommutative geometry. London–New York: Academic Press, 1994
5. Connes, A.: Gravity coupled with matter and foundation of noncommutative geometry. Commun. Math. Phys. 182, 155–176 (1996)
6. Connes, A.: Noncommutative geometry: The spectral aspect. Les Houches Session LXIV, London–New York: Elsevier, 1998, pp. 643–685
7. Connes, A.: A short survey of noncommutative geometry. J. Math. Phys. 41, 3832–3866 (2000)
8. Connes, A., Landi, G.: Noncommutative manifolds, the instanton algebra and isospectral deformations. math.QA/0011194, Commun. Math. Phys. 221, 141–159 (2001)
9. Dąbrowski, L., Landi, G.: Instanton algebras and quantum 4-spheres. math.QA/0101177
10. Furuuchi, K.: Instantons on noncommutative R^4 and projection operators. Prog. Theor. Phys. 103, 1043 (2000)
11. Kapustin, A., Kuznetsov, A., Orlov, D.: Noncommutative instantons and twistor transform. hep-th/0002193
12. Loday, J.L.: Cyclic homology. Berlin–Heidelberg–New York: Springer, 1998
13. Majid, S.: Riemannian geometry of quantum groups and finite groups with nonuniversal differentials. math.QA/0006150
14. Nekrasov, N., Schwarz, A.: Instantons on noncommutative R^4 and (2,0) superconformal six dimensional theory. Commun. Math. Phys. 198, 689–703 (1998)
15. Pflaum, M.J.: Quantum groups on fibre bundles. Commun. Math. Phys. 166, 279–316 (1994)
16. Podleś, P.: Quantum spheres. Lett. Math. Phys. 14, 521–531 (1987)
17. Rieffel, M.: Vector bundles over higher dimensional noncommutative tori. Lect. Notes Math. 1132, Berlin–Heidelberg–New York: Springer-Verlag, 1985, pp. 456–467
18. Rieffel, M., Schwarz, A.: Morita equivalence of multidimensional noncommutative tori. Int. J. Math. 10, 289–299 (1999)
19. Woronowicz, S.L.: Twisted SU(2) group. An example of a non-commutative differential calculus. Publications of RIMS Kyoto University, Vol. 23 No. 1, 117–181 (1987)
20. Woronowicz, S.L.: Compact matrix pseudogroup. Commun. Math. Phys. 111, 613–665 (1987)
21. Woronowicz, S.L., Dąbrowski, L., Nurowski, P.: Compact and non-compact quantum groups. I. Preprint 153/95/FM, SISSA, Trieste, 1995

Communicated by A. Connes

Commun. Math. Phys. 221, 169 – 196 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Hyperelliptic Prym Varieties and Integrable Systems

Rui Loja Fernandes1, Pol Vanhaecke2

1 Departamento de Matemática, Instituto Superior Técnico, 1049-001 Lisboa, Portugal. E-mail: [email protected]
2 Université de Poitiers, Département de Mathématiques, Téléport 2, Boulevard Marie et Pierre Curie, BP 30179, 86962 Futuroscope Chasseneuil Cedex, France. E-mail: [email protected]

Received: 12 December 2000 / Accepted: 26 March 2001

Abstract: We introduce two algebraic completely integrable analogues of the Mumford systems which we call hyperelliptic Prym systems, because every hyperelliptic Prym variety appears as a fiber of their momentum map. As an application we show that the general fiber of the momentum map of the periodic Volterra lattice

\[ \dot a_i = a_i(a_{i-1} - a_{i+1}), \qquad i = 1, \dots, n, \qquad a_{n+1} = a_1, \]

is an affine part of a hyperelliptic Prym variety, obtained by removing n translates of the theta divisor, and we conclude that this integrable system is algebraic completely integrable.

Contents

1. Introduction
2. Hyperelliptic Prym Varieties
3. The Hyperelliptic Prym Systems
4. The Periodic Toda Lattices and KM Systems
5. Painlevé Analysis
6. Example: n = 5

Supported in part by FCT-Portugal through the Research Units Pluriannual Funding Program, European Research Training Network HPRN-CT-2000-00101 and grant POCTI/1999/MAT/33081.

1. Introduction

In this paper we introduce two algebraic completely integrable (a.c.i.) systems, similar to the even and odd Mumford systems (see [12] for the odd systems and [15] for the even systems). By a.c.i. we mean that the general level set of the momentum map is

isomorphic to an affine part of an Abelian variety and that the integrable flows are linearized by this isomorphism ([16]). The phase space of these systems is described by triplets of polynomials (u(x), v(x), w(x)), as in the case of the Mumford system, but now we have the extra constraints that u, w are even and v odd for the first system (the “odd” case), and with the opposite parities for the other system (the “even” case). We show that in the odd case the general fiber of the momentum map is an affine part of a Prym variety, obtained by removing three translates of its theta divisor, while in the even case the general fiber has two affine parts of the above form. We call these systems the odd and the even hyperelliptic Prym system because every hyperelliptic Prym variety (more precisely an affine part of it) appears as the fiber of their momentum map. Thus we find the same universality as in the Mumford system: in the latter every hyperelliptic Jacobian appears as the fiber of its momentum map. To show that the hyperelliptic Prym systems are a.c.i. we exhibit a family of compatible (linear) Poisson structures, making these systems multi-Hamiltonian. These structures are not just restrictions of the Poisson structures on the Mumford system. Rather they can be identified as follows: the hyperelliptic Prym systems are fixed point varieties of a Poisson involution (with respect to certain Poisson structures of the Mumford system) and we prove a general proposition stating that such a subvariety always inherits a Poisson structure (Prop. 3.4). As an application we study the algebraic geometry and the Hamiltonian structure of the periodic Volterra lattice a˙ i = ai (ai−1 − ai+1 )

i = 1, . . . , n;

an+1 = a1 .

(1)
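As a quick numerical illustration of the periodic Volterra lattice (1), the sketch below integrates the system with a classical fourth-order Runge–Kutta step and monitors two quantities that can be checked by hand to be constant along the flow: the sum a_1 + ··· + a_n and the product a_1 ··· a_n. These two invariants are used here only as a sanity check; they are not claimed to be the momentum map of the paper, and the step size, horizon and initial data are arbitrary choices.

```python
def volterra_rhs(a):
    """Right-hand side of (1): a_i' = a_i (a_{i-1} - a_{i+1}), indices mod n."""
    n = len(a)
    return [a[i] * (a[i - 1] - a[(i + 1) % n]) for i in range(n)]


def rk4_step(a, dt):
    """One classical Runge-Kutta step of size dt."""
    def shifted(base, slope, h):
        return [x + h * s for x, s in zip(base, slope)]

    k1 = volterra_rhs(a)
    k2 = volterra_rhs(shifted(a, k1, dt / 2))
    k3 = volterra_rhs(shifted(a, k2, dt / 2))
    k4 = volterra_rhs(shifted(a, k3, dt))
    return [x + dt / 6 * (p + 2 * q + 2 * r + s)
            for x, p, q, r, s in zip(a, k1, k2, k3, k4)]


def integrate(a0, dt=1e-3, steps=2000):
    """Integrate (1) from a0 over steps * dt units of time."""
    a = list(a0)
    for _ in range(steps):
        a = rk4_step(a, dt)
    return a


def invariants(a):
    """Return (sum of the a_i, product of the a_i)."""
    product = 1.0
    for x in a:
        product *= x
    return sum(a), product


a0 = [1.0, 2.0, 0.5, 1.5, 0.8]   # n = 5, arbitrary positive initial data
a1 = integrate(a0)
sum0, prod0 = invariants(a0)
sum1, prod1 = invariants(a1)     # should agree with (sum0, prod0) up to integration error
```

The conservation of the sum follows from the telescoping of a_i a_{i−1} − a_i a_{i+1} over a period, and that of the product from the fact that the logarithmic derivatives a_{i−1} − a_{i+1} sum to zero.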

Although systems of this form go back to Volterra’s work on population dynamics ([20]), they first appear (in an equivalent form) in the modern theory of integrable systems in the pioneering work of Kac and van Moerbeke ([10]), who constructed this system as a discretization of the Korteweg–de Vries equation and who discovered its integrability. Though those authors only considered the non-periodic case, we shall refer to (1) as the n-body KM system. In the second part of the paper we give a precise description of the fibers of the momentum map of the KM systems and we prove their algebraic complete integrability. We can summarize our results as follows:

Theorem. Denote by M, P, T, K the phase spaces of the (even) Mumford system, the hyperelliptic Prym system (odd or even), the (periodic) sl Toda lattice and the (periodic) KM system. Then there exists a commutative diagram of a.c.i. systems

\[
\begin{array}{ccc}
T & \longrightarrow & M \\
\uparrow & & \uparrow \\
K & \longrightarrow & P
\end{array}
\]

where the horizontal maps are morphisms of integrable systems, and the vertical maps correspond to a Dirac type reduction.

We stress that the vertical arrows are natural inclusion maps exhibiting for both spaces the subspace as fixed point varieties, but they are not Poisson maps. On the other hand, the horizontal arrows are injective maps that map every fiber of the momentum map on the left injectively into (but not onto) a fiber of the momentum map on the right. In order

to make these into morphisms of integrable systems, we construct a pencil of quadratic brackets making Toda → Mumford a Poisson map. For one bracket in this pencil the induced map for the KM systems is also Poisson, so it follows that the diagram is also valid in the Poisson category. A description of the general fiber of the momentum map of the KM systems as an affine part of a hyperelliptic Prym variety follows. Since the flows of the KM systems are restrictions of certain linear flows of the Toda lattices this enables us to show that the KM systems are a.c.i.; moreover the map leads to an explicit linearization of the KM systems. In order to determine precisely which divisors are missing from the affine varieties that appear in the momentum map we use Painlevé analysis, since it is difficult to read this off from the map . The result is that n (= the number of KM particles) translates of the theta divisor are missing from these affine parts. We also show that each hyperelliptic Prym variety that we get is canonically isomorphic to the Jacobian of a related hyperelliptic Riemann surface, which can be computed explicitly, thereby providing an alternative, simpler description of the geometry of the KM systems. The plan of this paper is as follows. In Sect. 2 we recall the definition of a Prym variety and specialize it to the case of a hyperelliptic Riemann surface with an involution (different from the hyperelliptic involution). We show that such a Prym variety is canonically isomorphic to a hyperelliptic Jacobian and we use this result to describe the affine parts that show up in Sect. 3, in which the hyperelliptic Prym systems are introduced and in which their algebraic complete integrability is proved. In Sect. 4 we establish the precise relation between the KM systems and the Toda lattices and we construct the injective morphism . 
We use it to give a first description of the general fiber of the momentum map of the KM systems and we derive its algebraic complete integrability. A more precise description of these fibers is given in Sect. 5 by using Painlevé analysis. We finish the paper with a worked out example (n = 5) in which we find a configuration of five genus two curves on an Abelian surface that looks very familiar (Fig. 2).

As a final note we remark that the (periodic) KM systems have received much less attention than the (periodic) Toda lattices, another family of discretizations of the Korteweg–de Vries equation, which besides admitting a Lie algebraic generalization, is also interesting from the point of view of representation theory. It is only recently that the interest in the KM systems has revived (see e.g. [6, 18], and the references therein). We hope that the present work clarifies the connections between these systems and the master systems (Mumford and Prym systems). It was pointed out to us by Vadim Kuznetsov that an embedding of the KM systems in the Heisenberg magnet was constructed by Volkov in [19].

2. Hyperelliptic Prym Varieties

In this section we recall the definition of a Prym variety and specialize it to the case of a hyperelliptic Riemann surface Γ, equipped with an involution σ. We construct an explicit isomorphism between the Prym variety of (Γ, σ) and the Jacobian of a related hyperelliptic Riemann surface. We use this isomorphism to give a precise description of the affine part of the Prym variety that will appear as the fiber of the momentum map of an integrable system related to the KM system.

2.1. The Prym variety of a hyperelliptic Riemann surface. Let Γ be a compact Riemann surface of genus G, equipped with an involution σ with p fixed points. The quotient surface Γ_σ = Γ/σ has genus g′, with G = 2g′ + p/2 − 1, and the quotient map Γ → Γ_σ is a double covering map which is ramified at the p fixed points of σ. We assume that g′ > 0, i.e., σ is not the hyperelliptic involution on a hyperelliptic Riemann surface Γ. The group of divisors of degree 0 on Γ, denoted by Div^0(Γ), carries a natural equivalence relation, which is compatible with the group structure and which is defined by D ∼ 0 iff D is the divisor of zeros and poles of a meromorphic function on Γ. The quotient group Div^0(Γ)/∼ is a compact complex algebraic torus (Abelian variety) of dimension G, called the Jacobian of Γ and denoted by Jac(Γ) ([9], Ch. 2.7); its elements are denoted by [D], where D ∈ Div^0(Γ), and we write ⊗ for the group operation in Jac(Γ). Notice that σ induces an involution on Div^0(Γ) and hence on Jac(Γ); we use the same notation σ for these involutions.

Definition 2.1. The Prym variety of (Γ, σ) is the (G − g′)-dimensional subtorus of Jac(Γ) given by

Prym(Γ/σ) = {[D − σ(D)] | D ∈ Div^0(Γ)}.

We will be interested in the case in which Γ is the Riemann surface of a hyperelliptic curve Γ^(0) : y² = f(x), where f is a monic even polynomial of degree 2n without multiple roots (in particular 0 is not a root of f), so that the curve is non-singular. The Riemann surface Γ has genus G = n − 1 and is obtained from Γ^(0) by adding two points, which are denoted by ∞1 and ∞2. The two points of Γ^(0) for which x = 0 are denoted by O1 and O2. The 2n Weierstrass points of Γ (the points (x, y) of Γ^(0) for which y = 0) come in pairs (X, 0) and (−X, 0); fixing some order we denote them by W_i = (X_i, 0) and −W_i = (−X_i, 0), where i = 1, …, n. The Riemann surface Γ admits a group of order four of involutions, whose action on Γ^(0) and on the Weierstrass points (X_i, 0) and whose fixed point set are described in Table 1 for n odd, n = 2g + 1 and in Table 2 for n even, n = 2g + 2 (ı is the hyperelliptic involution).

Table 1. n odd

      (x, y)      O1    O2    ∞1    ∞2    W_i    −W_i    Fix
ı     (x, −y)     O2    O1    ∞2    ∞1    W_i    −W_i    W_i, −W_i
σ     (−x, y)     O1    O2    ∞2    ∞1    −W_i   W_i     O1, O2
τ     (−x, −y)    O2    O1    ∞1    ∞2    −W_i   W_i     ∞1, ∞2

Table 2. n even

      (x, y)      O1    O2    ∞1    ∞2    W_i    −W_i    Fix
ı     (x, −y)     O2    O1    ∞2    ∞1    W_i    −W_i    W_i, −W_i
σ     (−x, −y)    O2    O1    ∞2    ∞1    −W_i   W_i     –
τ     (−x, y)     O1    O2    ∞1    ∞2    −W_i   W_i     O1, O2, ∞1, ∞2

For future use we also point out that for points P ∈ Γ which are not indicated on these tables, neither σ(P) nor τ(P) coincide with ı(P). The involutions σ and τ lead to two quotient Riemann surfaces Γ_σ := Γ/σ and Γ_τ := Γ/τ, and to two covering maps π_σ : Γ → Γ_σ and π_τ : Γ → Γ_τ. It follows from Tables 1 and 2 that the genus of Γ_τ equals g, while the genus g′ of Γ_σ is g or g + 1 depending on whether n is odd or even. Also, the dimension of Prym(Γ/σ) equals g (whether n is odd or even). If the equation of Γ^(0) is written as y² = g(x²) then for n odd, Γ_σ^(0) has an equation v² = g(u) while Γ_τ^(0) has an equation v² = u g(u); for n even the roles of Γ_σ^(0) and Γ_τ^(0) are interchanged.

In order to describe Prym(Γ/σ), which we will call a hyperelliptic Prym variety, we need the following classical results about hyperelliptic Riemann surfaces and their Jacobians (for proofs, see [12], Ch. IIIa).

Lemma 2.2. Let D be a divisor of degree H > G on Γ, where G is the genus of Γ, and let P be any point on Γ. There exists an effective divisor E of degree G on Γ such that D ∼ E + (H − G)P.

Corollary 2.3. For any fixed divisor D_0 of degree G, Jac(Γ) is given by

\[ \mathrm{Jac}(\Gamma) = \Bigl\{ \Bigl[ \sum_{i=1}^{G} P_i - D_0 \Bigr] \ \Big|\ P_i \in \Gamma \Bigr\}. \]

Lemma 2.4. Let D be a divisor on Γ of the form D = Σ_{i=1}^{H} (P_i − Q_i), where H ≤ G and P_i ≠ Q_j for all i and j. Then [D] = 0 if and only if H is even and D is of the form

\[ D = \sum_{i=1}^{H/2} \bigl( R_i + \imath(R_i) - S_i - \imath(S_i) \bigr), \]

for some points R_i, S_i ∈ Γ.

2.2. Hyperelliptic Prym varieties as Jacobians. In the following theorem we show that for any n the Prym variety Prym(Γ/σ) associated with the hyperelliptic Riemann surface Γ is canonically isomorphic to the Jacobian of Γ_τ. This result was first proven by D. Mumford (see [13]) for the case in which π_σ : Γ → Γ_σ is unramified (n even) and by S. Dalaljan (see [7]) for the case in which π_σ : Γ → Γ_σ has two ramification points (n odd). Our proof, which is valid in both cases, is different and has the advantage of allowing us to describe explicitly the affine parts of the Prym varieties that we will encounter as affine parts of the corresponding Jacobians.

Theorem 2.5. Let π_τ^* denote the homomorphism Div^0(Γ_τ) → Div^0(Γ) which sends every point of Γ_τ to the divisor on Γ which consists of its two antecedents (under τ). The induced map

# : Jac(Γ_τ) → Prym(Γ/σ), [D] ↦ [π_τ^* D]

is an isomorphism.

Proof. It is clear that the homomorphism # is well-defined: if [D] = 0 then D is the divisor of zeros and poles of a meromorphic function f on Γ_τ, hence π_τ^* D is the divisor of zeros and poles of f ◦ τ and [π_τ^* D] = 0. To see that the image of # is contained in Prym(Γ/σ), just notice that π_τ^*(D) can be written as E + τ(E) for some E ∈ Div^0(Γ), so that

[π_τ^*(D)] = [E + τ(E)] = [E − σ(E)] ∈ Prym(Γ/σ).

Since Jac(Γ_τ) and Prym(Γ/σ) both have dimension g it suffices to show that # is injective. Suppose that [π_τ^* D] = 0 for some D ∈ Div^0(Γ_τ). We need to show that this implies [D] = 0. It follows from Corollary 2.3 that we may assume that D is of the form Σ_{i=1}^{g} p_i − g π_τ(∞1), where p_i ∈ Γ_τ. Then π_τ^* D = Σ_{i=1}^{g} (P_i + τ(P_i)) − 2g ∞1 (where π_τ(P_i) = p_i). Since 2g ≤ G and ∞1 ≠ ı(∞1), Lemma 2.4 implies that P_i = ∞1, i.e., p_i = π_τ(∞1) for all i.

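2.3. The theta divisor. We introduce two divisors on Jac(Γ) by

\[ \Theta_1 = \Bigl\{ \Bigl[ \sum_{i=1}^{G-1} P_i - (G-1)\infty_1 \Bigr] \ \Big|\ P_i \in \Gamma \Bigr\}, \tag{2} \]

\[ \Theta_2 = \Bigl\{ \Bigl[ \sum_{i=1}^{G-1} P_i + \infty_2 - G\infty_1 \Bigr] \ \Big|\ P_i \in \Gamma \Bigr\}. \tag{3} \]

These two divisors are both translates of the theta divisor and they differ by a shift over [∞2 − ∞1]. Since ∞2 = ı(∞1) they are tangent along their intersection locus, which is given by

\[ \& = \Bigl\{ \Bigl[ \sum_{i=1}^{G-2} P_i + \infty_2 - (G-1)\infty_1 \Bigr] \ \Big|\ P_i \in \Gamma \Bigr\}. \]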

Proposition 2.6. When n is odd Prym(Γ/σ) ∩ (Θ_1 ∪ Θ_2) consists of three translates of the theta divisor of Jac(Γ_τ), intersecting as in the following figure.

Fig. 1. [Figure not reproduced; surviving labels: 1, 2, [∞1 + ∞2], [∞1 + O], [∞2 + O]]

Proof. We use the isomorphism # to determine which divisors of Jac(Γ_τ) get mapped into Θ_1 and Θ_2. Since O1 and O2 are the only points of Γ on which ı and τ coincide, Lemma 2.4 implies that the only divisors D = Σ_{i=1}^{g} p_i − g π_τ(∞1) ∈ Div(Γ_τ) for which π_τ^* D contains, up to linear equivalence, ∞1 or ∞2 are those for which at least one of them contains π_τ(∞1) or π_τ(∞2) or π_τ(O1) (= π_τ(O2)). Denoting O = π_τ(O1) we find that these points constitute the following three divisors on Jac(Γ_τ):

\[
\begin{aligned}
\theta_1 &= \Bigl\{ \Bigl[ \sum_{j=1}^{g-1} p_j - (g-1)\pi_\tau(\infty_1) \Bigr] \ \Big|\ p_j \in \Gamma_\tau \Bigr\}, \\
\theta_2 &= \Bigl\{ \Bigl[ \sum_{j=1}^{g-1} p_j + \pi_\tau(\infty_2) - g\,\pi_\tau(\infty_1) \Bigr] \ \Big|\ p_j \in \Gamma_\tau \Bigr\}, \\
\theta &= \Bigl\{ \Bigl[ \sum_{j=1}^{g-1} p_j + O - g\,\pi_\tau(\infty_1) \Bigr] \ \Big|\ p_j \in \Gamma_\tau \Bigr\}.
\end{aligned}
\]

They all pass through

\[ \omega = \Bigl\{ \Bigl[ \sum_{j=1}^{g-2} p_j + \pi_\tau(\infty_2) - (g-1)\pi_\tau(\infty_1) \Bigr] \ \Big|\ p_j \in \Gamma_\tau \Bigr\}, \]

which is the tangency locus of θ_1 and θ_2, and θ_i intersects θ in addition in

\[ \omega_i = \Bigl\{ \Bigl[ \sum_{j=1}^{g-2} p_j + \pi_\tau(\infty_i) + O - g\,\pi_\tau(\infty_1) \Bigr] \ \Big|\ p_j \in \Gamma_\tau \Bigr\}, \]

which is a translate of ω.

When n is even then clearly Prym(Γ/σ) is contained in Θ_1, but the following result, similar to Prop. 2.6, holds for an appropriate translate of Prym(Γ/σ). The proof is left to the reader.

Proposition 2.7. When n is even and i ∈ {1, 2} then (Prym(Γ/σ) ⊗ [O1 − ∞i]) ∩ (Θ_1 ∪ Θ_2) consists of three translates of the theta divisor of Jac(Γ_τ), intersecting as in Fig. 1 (in which O should now be replaced by O2).

We will see in the next section how in both cases (n odd/even) the affine variety obtained by removing these three translates of the theta divisor from Prym(Γ/σ) can be described by simple, explicit equations.

3. The Hyperelliptic Prym Systems

In this section we introduce two families of integrable systems, whose members we call the odd and the even hyperelliptic Prym systems, where the adjective “odd/even” refers to the parity of n, as in the previous section, and where “hyperelliptic Prym” refers to the fact that the fibers of the momentum map of these systems are precisely the affine parts of the hyperelliptic Prym varieties that were considered in the previous section. These systems are intimately related to the even Mumford systems, constructed by the second author (see [15]), as even analogs of the (odd) Mumford systems, constructed by Mumford (see [12]).

3.1. The Mumford systems. We first recall the definition of the g-dimensional odd and even Mumford systems and we describe their geometry. Details, generalizations and applications can be found in [16]. The phase space of each of these systems is an affine space C^N, which is most naturally described as an affine space of triples (u(x), v(x), w(x)) of polynomials, often represented as Lax operators

\[ L(x) = \begin{pmatrix} v(x) & w(x) \\ u(x) & -v(x) \end{pmatrix}, \]

where u(x), v(x) and w(x) are subject to certain constraints. Denoting by M_g (resp. M′_g) the phase space of the g-th odd (resp. even) Mumford system these constraints are indicated in the following table:

Table 3.

Phase space    dim       u(x)              v(x)       w(x)
M_g            3g + 1    monic, deg = g    deg < g    monic, deg = g + 1
M′_g           3g + 2    monic, deg = g    deg < g    monic, deg = g + 2

It is natural to use the coefficients of the three polynomials u(x), v(x), w(x) as coordinates on M_g and on M′_g: for M′_g for example, which will be most important for this paper, we write

u(x) = x^g + u_{g−1} x^{g−1} + ··· + u_0,
v(x) = v_{g−1} x^{g−1} + ··· + v_0,
w(x) = x^{g+2} + w_{g+1} x^{g+1} + ··· + w_0,

or, in terms of the Lax operator L(x), as

\[ \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} x^{g+2} + \begin{pmatrix} 0 & w_{g+1} \\ 0 & 0 \end{pmatrix} x^{g+1} + \begin{pmatrix} 0 & w_g \\ 1 & 0 \end{pmatrix} x^{g} + \]

0≤i0

where $\mathrm{su}(A_0)$ denotes the strictly upper triangular part of $A_0$. The vector fields $X_i$ are also Hamiltonian with respect to $\{\cdot,\cdot\}^x_T$ and their flows are linear on the general fiber of the momentum map $K : T_n \to \mathbf{C}[x]$, which is defined by

$$\det(x\,\mathrm{Id} - L(h)) = -h - \frac{1}{h} + K(x)/2;$$

since the general fiber of $K$ is an affine part of a hyperelliptic Jacobian, the n-body Toda lattice is an a.c.i. system (see [3] for details). For higher order brackets for the Toda lattices, see [5].

We now turn to the n-body, periodic, Kac–van Moerbeke system (n-body KM system, for short). Its phase space $K_n$ is the subspace of $T_n$ consisting of all Lax operators (10) with zeros on the diagonal. $K_n$ is not a Poisson subspace of $T_n$. However, $K_n$ is the fixed manifold of the involution $T_n \to T_n$ defined by

$$((a_1, a_2, \dots, a_n), (b_1, b_2, \dots, b_n)) \mapsto ((a_1, a_2, \dots, a_n), (-b_1, -b_2, \dots, -b_n)),$$

which is a Poisson automorphism of $(T_n, \{\cdot,\cdot\}^x_T)$. Therefore, by Theorem 3.4, $K_n$ inherits a Poisson bracket $\{\cdot,\cdot\}_K$ from $\{\cdot,\cdot\}^x_T$, which is given by

$$\{a_i, a_j\}_K = a_i a_j (\delta_{i,j+1} - \delta_{i+1,j}).$$

It follows that the restriction of the momentum map $K$ to $K_n$ is a momentum map for the n-body KM system. Notice that $I_j = 0$ for even $j$, while for $j$ odd the Lax equations (11) lead to Lax equations for the n-body KM system, merely by putting $b_1 = \cdots = b_n = 0$. Taking $j = 1$ we find the vector field

$$\dot a_i = a_i (a_{i-1} - a_{i+1}), \qquad i = 1, \dots, n, \tag{13}$$

which was already mentioned in the introduction. More generally, taking j odd we find a family of commuting Hamiltonian vector fields on Kn which are restrictions of the Toda vector fields, while for j even the Toda vector fields Xj are not tangent to Kn . In order to conclude that the KM systems are a.c.i. we need to describe the fibers of the momentum map K : Kn → C[x]. This will be done in the next paragraph.
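The relation between the bracket $\{\cdot,\cdot\}_K$ and the vector field (13) can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes that the Hamiltonian is the sum $a_1 + \cdots + a_n$ (as in the $n = 5$ example of Sect. 6), and the function names are purely illustrative. It uses exact rational arithmetic to confirm that the bracket reproduces (13) and that the sum and the product of the $a_i$ are constant along the flow.

```python
from fractions import Fraction
import random


def km_bracket(a, i, j):
    """{a_i, a_j}_K = a_i a_j (delta_{i,j+1} - delta_{i+1,j}); 0-based indices mod n."""
    n = len(a)
    d_first = 1 if i == (j + 1) % n else 0
    d_second = 1 if (i + 1) % n == j else 0
    return a[i] * a[j] * (d_first - d_second)


def km_field(a):
    """Right-hand side of (13): a_i (a_{i-1} - a_{i+1}), indices mod n."""
    n = len(a)
    return [a[i] * (a[i - 1] - a[(i + 1) % n]) for i in range(n)]


def field_of_sum(a):
    """{a_i, H} for the assumed Hamiltonian H = a_1 + ... + a_n (every dH/da_j is 1)."""
    n = len(a)
    return [sum(km_bracket(a, i, j) for j in range(n)) for i in range(n)]


def random_point(n, rng):
    """A random point with positive rational coordinates."""
    return [Fraction(rng.randint(1, 9), rng.randint(1, 9)) for _ in range(n)]


rng = random.Random(0)
for n in (3, 4, 5, 6, 7):
    a = random_point(n, rng)
    x = km_field(a)
    assert x == field_of_sum(a)                       # (13) is Hamiltonian for H
    assert sum(x) == 0                                # a_1 + ... + a_n is conserved
    assert sum(xi / ai for xi, ai in zip(x, a)) == 0  # d/dt log(a_1 ... a_n) = 0
```

The check is pointwise at random rational points, so it illustrates the identities rather than proving them.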


R. L. Fernandes, P. Vanhaecke

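4.2. Algebraic integrability of KM. We first define a map $T_n \to M'_{n-1}$ which maps the n-body Toda system to the even Mumford system. The following identity, valid for tridiagonal matrices, will be needed.

Lemma 4.1. Let $M$ be a tridiagonal matrix,

$$M = \begin{pmatrix} \beta_1 & \alpha_1 & 0 & \cdots & \cdots & 0 \\ \gamma_1 & \beta_2 & \alpha_2 & & & \vdots \\ 0 & \gamma_2 & \beta_3 & \ddots & & \vdots \\ \vdots & & \ddots & \ddots & \ddots & 0 \\ \vdots & & & \ddots & \beta_{n-1} & \alpha_{n-1} \\ 0 & \cdots & \cdots & 0 & \gamma_{n-1} & \beta_n \end{pmatrix},$$

and denote by $\Delta_{i_1,\dots,i_k}$ the determinant of the minor of $M$ obtained by removing from $M$ the rows $i_1, \dots, i_k$ and the columns $i_1, \dots, i_k$. Then:

$$\Delta_1 \Delta_n - \Delta\,\Delta_{1,n} = \prod_{i=1}^{n-1} \alpha_i \gamma_i. \tag{14}$$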

Proof. For $n = 2$ this is obvious. For $n > 2$ one proceeds by induction, using the following formula for calculating the determinant $\Delta$ of $M$,

$$\Delta = \beta_n \Delta_n - \alpha_{n-1}\gamma_{n-1}\Delta_{n-1,n}. \tag{15}$$
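The identity (14) and the expansion (15) are easy to test on random tridiagonal matrices. The sketch below is only an illustration under stated assumptions: the helper names (`det`, `tridiagonal`, `minor`) are hypothetical, and the determinant of the empty minor is taken to be 1, which is what makes the case $n = 2$ work.

```python
from fractions import Fraction
import random


def det(m):
    """Exact determinant by Gaussian elimination over the rationals (empty matrix -> 1)."""
    m = [[Fraction(x) for x in row] for row in m]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result


def tridiagonal(alpha, beta, gamma):
    """Matrix of Lemma 4.1: beta on the diagonal, alpha above it, gamma below it."""
    n = len(beta)
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = beta[i]
    for i in range(n - 1):
        m[i][i + 1] = alpha[i]
        m[i + 1][i] = gamma[i]
    return m


def minor(m, removed=()):
    """Determinant after deleting the rows and columns listed in `removed` (1-based)."""
    keep = [i for i in range(len(m)) if i + 1 not in removed]
    return det([[m[r][c] for c in keep] for r in keep])


def identity_14_holds(alpha, beta, gamma):
    n = len(beta)
    m = tridiagonal(alpha, beta, gamma)
    lhs = minor(m, (1,)) * minor(m, (n,)) - minor(m) * minor(m, (1, n))
    rhs = Fraction(1)
    for a, g in zip(alpha, gamma):
        rhs *= a * g
    return lhs == rhs


def expansion_15_holds(alpha, beta, gamma):
    n = len(beta)
    m = tridiagonal(alpha, beta, gamma)
    rhs = beta[-1] * minor(m, (n,)) - alpha[-1] * gamma[-1] * minor(m, (n - 1, n))
    return minor(m) == rhs


rng = random.Random(1)
for n in range(2, 8):
    for _ in range(20):
        alpha = [rng.randint(-5, 5) for _ in range(n - 1)]
        gamma = [rng.randint(-5, 5) for _ in range(n - 1)]
        beta = [rng.randint(-5, 5) for _ in range(n)]
        assert identity_14_holds(alpha, beta, gamma)
        assert expansion_15_holds(alpha, beta, gamma)
```

Integer entries keep every determinant exact, so a failed assertion would indicate a genuine counterexample rather than rounding error.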

In the sequel we use the notation $\Delta_{i_1,\dots,i_k}$ from the above lemma taking as $M$ the tridiagonal matrix obtained from $x\,\mathrm{Id} - L(h)$ in the obvious way, i.e., by removing the two terms that depend on $h$. In this notation the characteristic polynomial of $L(h)$ is given by

$$\det(x\,\mathrm{Id} - L(h)) = -h - h^{-1} + \Delta - a_n \Delta_{1,n}. \tag{16}$$

Proposition 4.2. For any $m = 1, \dots, n$ the map $T_n \to M'_{n-1}$ defined by

$$u(x) = \Delta_m, \qquad v(x) = a_{m-1}\Delta_{m-1,m} - a_m \Delta_{m,m+1}, \tag{17}$$

$$w(x) = (x - b_m)^2 \Delta_m - 2(x - b_m)(a_{m-1}\Delta_{m-1,m} + a_m \Delta_{m,m+1}) + 4 a_m a_{m-1}\Delta_{m-1,m,m+1},$$

maps each fiber of the momentum map $K : T_n \to \mathbf{C}[x]$ into a fiber of the momentum map $H : M'_{n-1} \to \mathbf{C}[x]$. The restriction of this map to $K_n$ takes values in $P_{(n-1)/2}$ when $n$ is odd and in $P'_{n/2-1}$ when $n$ is even, mapping in both cases the fiber of the momentum map $K : K_n \to \mathbf{C}[x]$ into the fiber of the momentum map $H : P_{(n-1)/2} \to \mathbf{C}[x^2]$ (or $H : P'_{n/2-1} \to \mathbf{C}[x^2]$). As a consequence the general fiber of the momentum map of the KM systems is an affine part of a hyperelliptic Jacobian.

Hyperelliptic Prym Varieties and Integrable Systems


Proof. Since the momentum map is equivariant with respect to the $\mathbf{Z}/n$ action on $T_n$ it suffices to prove the proposition for $m = n$. It is easy to see that the triple $(u, v, w)$ defined by (17) satisfies the constraints $u$, $w$ monic, $\deg w = \deg u + 2 = n + 1$ and $\deg v < n - 1$, so that the map takes values in $M'_{n-1}$. Moreover, taking $\beta_1 = \cdots = \beta_n = x$ in (15) implies that when all entries on the diagonal of $L(h)$ are zero then $\Delta_{i_1,\dots,i_p}$ has the same parity as $n - p$, so that the triples $(u, v, w)$ which correspond to points in $K_n$ have the additional property that $v$ has the same parity as $n$ while $u$ and $w$ have the opposite parity. Therefore the restriction of the map to $K_n$ takes values in $P_{(n-1)/2}$ when $n$ is odd and in $P'_{n/2-1}$ when $n$ is even.

For $p(x)$ a monic polynomial of degree $n$, let $L(h) \in K^{-1}(2p(x))$, i.e.,

$$p(x) = (x - b_n)\Delta_n - a_n \Delta_{1,n} - a_{n-1}\Delta_{n-1,n}. \tag{18}$$

Proving that the image of $L(h)$ belongs to $H^{-1}(p^2(x) - 4)$ amounts to showing that $u(x)w(x) + v^2(x) = p^2(x) - 4$, which follows from a direct computation, using (14). The commutativity of the following diagram follows:

$$\begin{array}{ccc} T_n & \longrightarrow & M'_{n-1} \\ {\scriptstyle K}\downarrow & & \downarrow{\scriptstyle H} \\ \mathbf{C}[x] & \xrightarrow{\ \varphi\ } & \mathbf{C}[x] \end{array}$$

where $\varphi$ is defined by $\varphi(q) = (q/2)^2 - 4$, for $q \in \mathbf{C}[x]$.

To show that the map is injective let $(u(x), v(x), w(x))$ be a point in the image of $T_n$. We show that the matrix $L(h) \in T_n$ which is mapped to this point is unique. First observe that the monic polynomial $p(x) = \Delta - a_n \Delta_{1,n}$ can be recovered from $u(x)w(x) + v(x)^2 = p(x)^2 - 4$. We can then determine $b_n$ from the following two formulas:

$$p(x) = x^n - \sum_{i=1}^{n} b_i\, x^{n-1} + \cdots, \qquad u(x) = \Delta_n = x^{n-1} - \sum_{i=1}^{n-1} b_i\, x^{n-2} + \cdots.$$

Next, the second relation in (17) and (18) lead to the system:

$$a_{n-1}\Delta_{n-1,n} - a_n \Delta_{1,n} = v(x), \qquad a_{n-1}\Delta_{n-1,n} + a_n \Delta_{1,n} = (x - b_n)u(x) - p(x).$$

This linear system completely determines the products $a_n \Delta_{1,n}$ and $a_{n-1}\Delta_{n-1,n}$. Because the determinants of the principal minors of $x\,\mathrm{Id} - L(h)$ are monic polynomials, this means that we know $a_n$, $\Delta_{1,n}$ and $\Delta_{n-1,n}$ separately. From $\Delta = p(x) + a_n \Delta_{1,n}$ we also obtain $\Delta$. We have now shown how $b_n$, $a_n$, $\Delta$, $\Delta_n$ and $\Delta_{n-1,n}$ are determined. We proceed by induction, showing how to determine $b_{n-k-1}$, $a_{n-k-1}$, $\Delta_{n-k-1,\dots,n}$ once we know $b_{n-i}$, $a_{n-i}$ and $\Delta_{n-i,\dots,n}$ for $i = 0, \dots, k$. We use (15) to obtain the recursive relation:

$$\Delta_{n-k+1,\dots,n} = (x - b_{n-k})\Delta_{n-k,\dots,n} - a_{n-k-1}\Delta_{n-k-1,\dots,n}.$$


This determines the product $a_{n-k-1}\Delta_{n-k-1,\dots,n}$, but also $a_{n-k-1}$ and $\Delta_{n-k-1,\dots,n}$ separately, again because $\Delta_{n-k-1,\dots,n}$ is monic. Now from $\Delta_{n-k-1,\dots,n}$ and $\Delta_{n-k,\dots,n}$ we know, as above, the sums $\sum_{i=1}^{n-k-2} b_i$ and $\sum_{i=1}^{n-k-1} b_i$. Hence, $b_{n-k-1}$ is determined.

We saw in Prop. 3.3 that the fibers of the momentum map of the even Prym system are reducible (two isomorphic pieces), so there remains the question whether the same is true for the n-body KM system for even $n$. To check that this is so, note that the highest degree coefficient of the characteristic polynomial of $L(h)$ gives, for $n$ even, the first integral $I = a_1 a_3 a_5 \cdots a_{n-1} + a_2 a_4 a_6 \cdots a_n$. Since $a_1 a_2 \cdots a_n = 1$, for generic values of $I$, the variety defined by

$$a_1 a_3 a_5 \cdots a_{n-1} = \text{constant}, \qquad a_2 a_4 a_6 \cdots a_n = \text{constant},$$

is reducible, and the claim follows. Note however that both $a_1 a_3 a_5 \cdots a_{n-1}$ and $a_2 a_4 a_6 \cdots a_n$ are first integrals themselves, so we can construct a momentum map using these integrals (instead of their sum and product) and then the general fiber is irreducible.

The maps $T_n \to M'_{n-1}$ of Proposition 4.2 not only map fibers to fibers of the momentum maps, but they map the whole hierarchy of Toda flows to the Mumford flows defined by (7). To see this, we construct a family of quadratic Poisson brackets $\{\cdot,\cdot\}^{\varphi}_{M,q}$ on $M'_{n-1}$ which make these maps Poisson. First observe that there exist unique polynomials $p(x)$ and $r(x)$, with $p(x)$ monic of degree $n$ and $r(x)$ of degree less than $n$, such that

$$u(x)w(x) + v(x)^2 = p(x)^2 + r(x). \tag{19}$$

The coefficients of $p(x)$ and $r(x)$ are regular functions of $u_i$, $v_i$ and $w_i$. Hence, we can define a skew-symmetric biderivation on the space of regular functions of $M'_{n-1}$ by setting, for any $\varphi \in \mathbf{C}[x]$ of degree at most 1,

$$\begin{aligned}
\{u(x), u(x')\}^{\varphi}_{M,q} &= \{v(x), v(x')\}^{\varphi}_{M,q} = 0, \\
\{u(x), v(x')\}^{\varphi}_{M,q} &= \{u(x), v(x')\}^{p\varphi}_{M} + \alpha^{\varphi}(x + x')\,u(x)u(x'), \\
\{u(x), w(x')\}^{\varphi}_{M,q} &= \{u(x), w(x')\}^{p\varphi}_{M} - 2\alpha^{\varphi}(x + x')\,u(x)v(x'), \\
\{v(x), w(x')\}^{\varphi}_{M,q} &= \{v(x), w(x')\}^{p\varphi}_{M} + \alpha^{\varphi}(x + x')\,u(x)w(x'), \\
\{w(x), w(x')\}^{\varphi}_{M,q} &= \{w(x), w(x')\}^{p\varphi}_{M} + 2\alpha^{\varphi}(x + x')\bigl(w(x)v(x') - w(x')v(x)\bigr),
\end{aligned}$$

where $\alpha^{\varphi}(x) = \varphi(\alpha(2x)/2)$. Notice that the polynomial $p\varphi$, used in the definition of the bracket, depends on the phase variables.

Proposition 4.3. Let $\varphi$ be a polynomial of degree at most 1. Then

(i) $\{\cdot,\cdot\}^{\varphi}_{M,q}$ is a Poisson bracket on $M'_{n-1}$ and the maps $(T_n, \{\cdot,\cdot\}^{\varphi}_T) \to (M'_{n-1}, \{\cdot,\cdot\}^{\varphi}_{M,q})$ of Proposition 4.2 are Poisson and map the Toda flows to the Mumford flows;


(ii) For $\varphi$ odd, the bracket $\{\cdot,\cdot\}^{\varphi}_{M,q}$ induces a Poisson bracket $\{\cdot,\cdot\}_{P,q}$ on $P_{(n-1)/2}$ (resp. on $P'_{n/2-1}$), and the maps

$$(K_n, \{\cdot,\cdot\}_K) \to (P_{(n-1)/2}, \{\cdot,\cdot\}_{P,q}), \qquad (K_n, \{\cdot,\cdot\}_K) \to (P'_{n/2-1}, \{\cdot,\cdot\}_{P,q})$$

are Poisson and map the flows of the n-body KM system to the flows of the hyperelliptic Prym systems.

Proof. We take the bracket of both sides of (19) with $u(x)$ to obtain

$$2p(y)\varphi(y)\,\frac{u(x)v(y) - u(y)v(x)}{x - y} = 2p(y)\,\{u(x), p(y)\}^{\varphi}_{M,q} + \{u(x), r(y)\}^{\varphi}_{M,q}.$$

It follows that $\{u(x), r(y)\}^{\varphi}_{M,q}$ is divisible by $p(y)$. Since $\{u(x), r(y)\}^{\varphi}_{M,q}$ is of degree less than $n$ in $y$ and since $p(y)$ is monic of degree $n$ we must have $\{u(x), r(y)\}^{\varphi}_{M,q} = 0$ and

$$\{u(x), p(y)\}^{\varphi}_{M,q} = \varphi(y)\,\frac{u(x)v(y) - u(y)v(x)}{x - y}.$$

Similarly, we find $\{v(x), r(y)\}^{\varphi}_{M,q} = \{w(x), r(y)\}^{\varphi}_{M,q} = 0$ and also that:

$$\{v(x), p(y)\}^{\varphi}_{M,q} = \frac{\varphi(y)}{2}\left(\frac{w(x)u(y) - u(x)w(y)}{x - y} - \alpha(x + y)\,u(x)u(y)\right),$$

$$\{w(x), p(y)\}^{\varphi}_{M,q} = \varphi(y)\left(\frac{v(x)w(y) - w(x)v(y)}{x - y} + \alpha(x + y)\,v(x)u(y)\right).$$

These expressions also allow one to compute the brackets of $u(x)$, $v(x)$, $w(x)$ and $p(x)$ with $\alpha(y)$, and the check of the Jacobi identity follows easily from it. Therefore, $\{\cdot,\cdot\}^{\varphi}_{M,q}$ is a Poisson bracket for which the coefficients of $r(x)$ are Casimirs. If we compare the expressions above for the brackets with $p(y)$ with expressions (7) for the Mumford vector fields, we conclude that they are Hamiltonian with respect to $\{\cdot,\cdot\}^{1}_{M,q}$ with Hamiltonian function $K$. Checking that the maps of Proposition 4.2 are Poisson can be done by a straightforward (but rather long) computation using the following expressions for the derivatives of $\Delta_{i_1,\dots,i_k}$:

$$\frac{\partial \Delta_{i_1,\dots,i_k}}{\partial a_i} = \begin{cases} -\Delta_{i,i+1,i_1,\dots,i_k}, & i, i+1 \notin \{i_1, \dots, i_k\}, \\ 0 & \text{otherwise}, \end{cases} \qquad \frac{\partial \Delta_{i_1,\dots,i_k}}{\partial b_i} = \begin{cases} -\Delta_{i,i_1,\dots,i_k}, & i \notin \{i_1, \dots, i_k\}, \\ 0 & \text{otherwise}. \end{cases}$$

For the second statement, one easily checks that when $\varphi$ is odd the involution is Poisson, so that there is an induced bracket on $P_{(n-1)/2}$ or on $P'_{n/2-1}$. Explicit formulas for this bracket are computed as in the proof of Proposition 3.5. The other statements in (ii) then follow from (i).

It is easy to check that the Poisson brackets $\{\cdot,\cdot\}^{\varphi}_{M,q}$ and $\{\cdot,\cdot\}^{\psi}_{M}$ on $M'_{n-1}$ are compatible, when $\varphi$ and $\psi$ have degree at most 1. This is however not true when $\psi$ is of higher degree.


5. Painlevé Analysis

The results in the previous section show that the general fiber of the momentum map of the KM systems is an affine part of a hyperelliptic Prym variety (or two copies of it), which can also be described as a hyperelliptic Jacobian. In order to describe precisely which affine part, we determine the divisor which needs to be adjoined to each affine part in order to complete it into an Abelian variety. Since it is difficult to do this by using the maps of Proposition 4.2, we do this by performing Painlevé analysis of the KM systems. The method that we use is based on the bijective correspondence between the principal balances of an integrable vector field (Laurent solutions depending on the maximal number of free parameters) and the irreducible components of the divisor which is missing from the fibers of the momentum map (see [1]). We look for all Laurent solutions

$$a_i(t) = \frac{1}{t^r}\sum_{j=0}^{\infty} a_i^{(j)} t^j, \tag{20}$$

to the vector field (13) of the n-body KM system. The following lemma shows that any such Laurent solution of (13) can have at most simple poles. We may suppose that $r$ in (20) is maximal, i.e., $a_i^{(0)} \neq 0$ for at least one $i$, and we call $r$ the order of the Laurent solution. The order of pole (or zero) of $a_i(t)$ is denoted by $r_i$, so $r = \max_i r_i$.

Lemma 5.1. Let the Laurent series $a_i(t)$, $i = 1, \dots, n$, given by (20) be a solution to the vector field (13) of the n-body KM system. If at least one of the $a_i$ has a pole (for $t = 0$) then it is a Laurent solution of order 1. Moreover the orders of the pole (or zero) of each $a_i(t)$ satisfy

$$r_i = a_{i+1}^{(0)} - a_{i-1}^{(0)}. \tag{21}$$

Proof. For $s \in \mathbf{N}$ we find from (20):

$$\operatorname{Res}_{t=0} \frac{\dot a_i(t)}{a_i(t)}\, t^s = \begin{cases} -r_i, & s = 0, \\ 0, & s > 0. \end{cases}$$

On the other hand, if we use (13) then we find

$$\operatorname{Res}_{t=0} \frac{\dot a_i(t)}{a_i(t)}\, t^s = \operatorname{Res}_{t=0}\,(a_{i-1}(t) - a_{i+1}(t))\, t^s = a_{i-1}^{(r-s-1)} - a_{i+1}^{(r-s-1)}.$$

We conclude that

$$a_{i-1}^{(k)} - a_{i+1}^{(k)} = \begin{cases} -r_i, & k = r - 1, \\ 0, & 0 \le k \le r - 2. \end{cases} \tag{22}$$

Now substituting (20) into (13) and comparing the coefficient of $1/t^{r+1}$ the following equation (the indicial equation) is obtained:

$$-r\,a_i^{(0)} = a_i^{(0)}\bigl(a_{i-1}^{(0)} - a_{i+1}^{(0)}\bigr), \qquad i = 1, \dots, n. \tag{23}$$

If $a_i$ has a pole of order $r > 0$ then $a_i^{(0)} \neq 0$ and (23) implies $a_{i-1}^{(0)} - a_{i+1}^{(0)} = -r$. Comparing with (22) we see that we must have $r = 1$ and that (21) holds.


Notice that in view of the periodicity of the indices ($a_{i+n} = a_i$) the linear system

$$1 = a_{i+1}^{(0)} - a_{i-1}^{(0)}, \qquad i = 1, \dots, n,$$

has no solutions, so that at least one of the $a_i^{(0)}$ vanishes. If, say, $a_0^{(0)} = a_{k+1}^{(0)} = 0$ while $a_i^{(0)} \neq 0$ for $i = 1, \dots, k$ for some $k$ in the range $1, \dots, n - 1$ (this includes the case of a single $i$ for which $a_i^{(0)} = 0$) then the indicial equation specializes to

$$a_2^{(0)} = 1, \qquad a_{i+1}^{(0)} - a_{i-1}^{(0)} = 1 \quad (i = 2, \dots, k - 1), \qquad a_{k-1}^{(0)} = -1,$$

which has no solution for $k$ odd, and which has a unique solution $(a_1^{(0)}, \dots, a_k^{(0)}) = (-l, 1, 1 - l, 2, \dots, -1, l)$ for even $k$, $k = 2l$. The other variables $a_{k+1}^{(0)}, \dots, a_n^{(0)}$ can either be all zero, or they can constitute one or several other solutions of this type (with varying $k = 2l$), separated by zeroes. Using periodicity the other solutions to the indicial equation are obtained by cyclic permutation. Thus we are led to the following combinatorial description of the solutions to the indicial equation of the n-body KM system. For a subset $A$ of $\mathbf{Z}/n$, and for $p \in \mathbf{Z}/n$ let us denote by $A(p) \subset \mathbf{Z}/n$ the largest subset of $A$ that contains $p$ and that consists of consecutive elements (with the understanding that $A(p) = \emptyset$ when $p \notin A$). If we define

$$F_n = \{A \subset \mathbf{Z}/n \mid p \in A \Rightarrow \#A(p) \text{ is even}\},$$

then we see that the solutions to the indicial equation are in one to one correspondence with the elements of $F_n$. In the sequel we freely use this bijection. For $A \in F_n$ we call the integer $\#A/2$ its order, denoted by $\operatorname{ord} A$.

For each solution to the indicial equation (i.e., for each $A \in F_n$) we compute the eigenvalues of the Kowalevski matrix $M$, whose entries are given by

$$M_{ij} = \frac{\partial F_i}{\partial a_j}\bigl(a_1^{(0)}, \dots, a_n^{(0)}\bigr) + \delta_{ij},$$

where $F_i = a_i(a_{i-1} - a_{i+1})$ is the $i$-th component of (13). The number of non-negative integer eigenvalues of this matrix is precisely the number of free parameters of the family of Laurent solutions whose leading term is given by $(a_1^{(0)}, \dots, a_n^{(0)})$ (see [1]), hence we can deduce from it which strata of the Abelian variety, whose affine part appears as a fiber of the momentum map, are parameterized by it.

Proposition 5.2. For a solution of the indicial equation corresponding to $A \in F_n$ the Kowalevski matrix $M$ has $n - \operatorname{ord} A$ non-negative integer eigenvalues.

Proof. In view of (21) the entries of $M$ can be written in the form

$$M_{ij} = \begin{cases} (1 - r_i)\,\delta_{i,j}, & \text{if } a_i^{(0)} = 0, \\ a_i^{(0)}(\delta_{i,j+1} - \delta_{i,j-1}), & \text{if } a_i^{(0)} \neq 0. \end{cases}$$


Note also that, by using the $\mathbf{Z}/n$ action, we can assume that $1 \in A$, $n \notin A$, and that $A$ is a disjoint union of $A(p_1), \dots, A(p_s)$, with $p_1 < p_2 < \cdots < p_s$. Let $l_i = \operatorname{ord} A(p_i)$. Then $M$ has the following form:

$$M = \begin{pmatrix} C_1 & & & & & -l_1 \\ E_1 & D_1 & & & & \\ & 0 & C_2 & & & \\ & & E_2 & D_2 & & \\ & & & \ddots & \ddots & \\ & & & 0 & C_s & \\ & & & & E_s & D_s \end{pmatrix}.$$

On the upper right corner the matrix has entry $-l_1$, and the blocks $C_i$, $D_i$ and $E_i$, $i = 1, \dots, s$, are matrices as follows:

• $C_i$ is a tridiagonal matrix of size $2l_i$ of the form:

$$C_i = \begin{pmatrix} 0 & l_i & & & & & \\ 1 & 0 & -1 & & & & \\ & 1 - l_i & 0 & l_i - 1 & & & \\ & & 2 & 0 & -2 & & \\ & & & \ddots & \ddots & \ddots & \\ & & & l_i - 1 & 0 & 1 - l_i & \\ & & & & -1 & 0 & 1 \\ & & & & & l_i & 0 \end{pmatrix};$$

• $D_i$ is a diagonal matrix of the form $D_i = \operatorname{diag}(1 + l_i, 1, \dots, 1, 1 + l_{i+1})$, with the convention that if $D_i$ is $1 \times 1$ then its only entry is $1 + l_i + l_{i+1}$;

• $E_i$ is a matrix with only one non-zero entry $-l_i$ in the lower left corner.

It is clear that the set of eigenvalues of $M$ is the union of the set of eigenvalues of the $C_i$'s and $D_i$'s. Now we have:

Lemma 5.3. The eigenvalues of the matrix $C_i$ are $\{\pm 1, \pm 2, \dots, \pm l_i\}$.

Assuming the lemma to hold we find that the number of negative eigenvalues of $M$ is equal to $\sum_{i=1}^{s} l_i = \sum_{i=1}^{s} \operatorname{ord} A(p_i) = \operatorname{ord} A$, so the proposition follows. So we are left with the proof of the lemma. We write $l$ for $l_i$ and we denote by $e_j$ the $j$-th vector of the standard basis of $\mathbf{C}^{2l}$. In the basis $e_1, e_3, \dots, e_{2l-3}, e_{2l-1}, e_{2l}, e_{2l-2}, \dots, e_4, e_2$ the matrix $C_i$ takes the form

$$\begin{pmatrix} 0 & A \\ A & 0 \end{pmatrix},$$

where $A$ is the transpose of the matrix

$$I = \begin{pmatrix} 0 & \cdots & \cdots & 0 & 1 \\ \vdots & & 0 & 2 & -1 \\ \vdots & \iddots & 3 & -2 & 0 \\ 0 & \iddots & \iddots & \iddots & \vdots \\ l & 1 - l & 0 & \cdots & 0 \end{pmatrix}.$$

We show that this matrix has eigenvalues $1, -2, 3, \dots, (-1)^{l-1} l$. Then the result follows because the eigenvalues of $C_i$ are $\pm$ the eigenvalues of $A$. For $j = 1, \dots, l$, let $f_j = [1^{j-1}, 2^{j-1}, \dots, l^{j-1}]^T$ and let $V_j$ denote the span of $f_1, \dots, f_j$. For $v = [v_1, \dots, v_l]^T \in \mathbf{C}^l$ we have that $v \in V_j$ if and only if there exists a polynomial $P$ of degree less than $j$ such that $v_k = P(k)$ for $k = 1, \dots, l$. Since the $k$-th component of $I f_j$ is given by

$$k(l - k + 1)^{j-1} + (1 - k)(l - k + 2)^{j-1} = (-1)^{j-1} j\, k^{j-1}\left(1 + O\!\left(\frac{1}{k}\right)\right),$$

we have that $I f_j \in V_j$, more precisely $I f_j \in (-1)^{j-1} j f_j + V_{j-1}$. This means that in terms of the basis $f_j$ the matrix $I$ is upper triangular, with the integers $1, -2, 3, \dots, (-1)^{l-1} l$ on the diagonal.

By the proposition above we can have a Laurent solution depending on $n - 1$ free parameters (a principal balance) only for the $n$ choices of $A$ given by $(a_1^{(0)}, \dots, a_n^{(0)}) = (-1, 1, 0, \dots, 0)$ and their cyclic permutations. Let us check that these lead indeed to asymptotic expansions which formally solve (13). By §2 in [1], these solutions are actually convergent and so they define convergent Laurent solutions. It suffices to do this for the solution $(a_1^{(0)}, \dots, a_n^{(0)}) = (-1, 1, 0, \dots, 0)$ of the indicial equation. By (21) we know that the orders of the singularities of this solution are $(r_1, \dots, r_n) = (1, 1, -1, 0, \dots, 0, -1)$ so we have the following ansatz for the formal expansions:

$$\begin{aligned}
a_1(t) &= -\frac{1}{t} + \alpha_1 + \beta_1 t + O(t^2), \\
a_2(t) &= \frac{1}{t} + \alpha_2 + \beta_2 t + O(t^2), \\
a_3(t) &= \beta_3 t + O(t^2), \\
a_j(t) &= \alpha_j + \beta_j t + O(t^2), \qquad 4 \le j \le n - 1, \\
a_n(t) &= \beta_n t + O(t^2).
\end{aligned}$$
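Lemma 5.3 and the eigenvalue count for the principal balance lend themselves to an exact machine check. The sketch below is an illustration only, with hypothetical helper names: it builds the block $C_i$ from the solution $(-l, 1, 1 - l, 2, \dots, -1, l)$ of the indicial equation, verifies that $\det(C_i - \lambda\,\mathrm{Id})$ vanishes at $\lambda = \pm 1, \dots, \pm l$, and then compares the characteristic polynomial of the Kowalevski matrix at $(-1, 1, 0, 0, 0)$ for $n = 5$ with $(-1 - \lambda)(1 - \lambda)^2(2 - \lambda)^2$.

```python
from fractions import Fraction


def det(m):
    """Exact determinant by Gaussian elimination over the rationals (empty matrix -> 1)."""
    m = [[Fraction(x) for x in row] for row in m]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result


def indicial_block(l):
    """Leading coefficients (a_1, ..., a_{2l}) = (-l, 1, 1-l, 2, ..., -1, l)."""
    a = []
    for m in range(1, l + 1):
        a.extend([m - 1 - l, m])
    return a


def block_c(l):
    """Block C_i of size 2l: row i has a_i in column i-1 and -a_i in column i+1."""
    a = indicial_block(l)
    k = 2 * l
    c = [[0] * k for _ in range(k)]
    for i in range(k):
        if i > 0:
            c[i][i - 1] = a[i]
        if i < k - 1:
            c[i][i + 1] = -a[i]
    return c


def kowalevski(a0):
    """M_ij = dF_i/da_j + delta_ij at a0, F_i = a_i (a_{i-1} - a_{i+1}), indices mod n >= 3."""
    n = len(a0)
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        prev, nxt = (i - 1) % n, (i + 1) % n
        m[i][i] += a0[prev] - a0[nxt] + 1
        m[i][prev] += a0[i]
        m[i][nxt] -= a0[i]
    return m


def char_poly_at(m, lam):
    """Value of det(m - lam * Id)."""
    n = len(m)
    return det([[m[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)])


for l in range(1, 6):
    a = [0] + indicial_block(l) + [0]            # pad with a_0 = a_{k+1} = 0
    assert all(a[i + 1] - a[i - 1] == 1 for i in range(1, 2 * l + 1))
    roots = [s * j for j in range(1, l + 1) for s in (1, -1)]
    assert all(char_poly_at(block_c(l), lam) == 0 for lam in roots)

m5 = kowalevski([-1, 1, 0, 0, 0])
for lam in range(-3, 4):                          # 7 points fix a degree-5 polynomial
    assert char_poly_at(m5, lam) == (-1 - lam) * (1 - lam) ** 2 * (2 - lam) ** 2
```

Since $C_i$ has size $2l$ and the $2l$ tested values are distinct, vanishing of the characteristic polynomial at all of them identifies the full spectrum for the tested values of $l$. For $n = 5$ the eigenvalues found are $-1, 1, 1, 2, 2$, that is, $n - \operatorname{ord} A = 4$ non-negative ones.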


If we substitute these expansions into Eq. (13) defining the n-body KM system we obtain the consistency equations:

$$\begin{aligned}
\alpha_1 - \alpha_2 &= 0, \\
2\beta_1 - \beta_2 &= -\alpha_1\alpha_2 - \beta_n, \\
\beta_1 - 2\beta_2 &= -\alpha_1\alpha_2 + \beta_3, \\
\beta_j &= \alpha_j(\alpha_{j-1} - \alpha_{j+1}), \qquad 4 \le j \le n - 1.
\end{aligned}$$

They give exactly the $n - 1$ free parameters $\alpha_1, \alpha_4, \dots, \alpha_{n-1}, \beta_3, \beta_n$. The coefficients $a^{(k)} = (a_1^{(k)}, \dots, a_n^{(k)})$ for $k > 2$ are then completely determined since they satisfy an equation of the form

$$(M - k\,\mathrm{Id}) \cdot a^{(k)} = \text{some polynomial in the } a_i^{(j)} \text{ with } j < k,$$

and the eigenvalues of the Kowalevski matrix $M$ are $-1, 1, 2$, by the proof above. This leads to the following result.

Theorem 5.4. When $n$ is odd the general fiber of the momentum map of the n-body KM system is an affine part of a hyperelliptic Prym variety, obtained by removing $n$ translates of its theta divisor. When $n$ is even the general fiber consists of two isomorphic components which admit the same description as in the odd case. In both cases the Prym variety admits an alternative description as a hyperelliptic Jacobian.

6. Example: n = 5

In this section we study the 5-body KM system in more detail. Its phase space is four-dimensional and is given by $K_5 = \{(a_1, a_2, a_3, a_4, a_5) \mid a_1 a_2 a_3 a_4 a_5 = 1\}$, with Lax operator

$$L = \begin{pmatrix} 0 & a_1 & 0 & 0 & h^{-1} \\ 1 & 0 & a_2 & 0 & 0 \\ 0 & 1 & 0 & a_3 & 0 \\ 0 & 0 & 1 & 0 & a_4 \\ h a_5 & 0 & 0 & 1 & 0 \end{pmatrix}.$$

The spectral curve $\det(x\,\mathrm{Id} - L) = 0$ is explicitly given by

$$h + \frac{1}{h} = x^5 - Kx^3 + Lx,$$

where $K = a_1 + a_2 + a_3 + a_4 + a_5$, $L = a_1 a_3 + a_2 a_4 + a_3 a_5 + a_4 a_1 + a_5 a_2$. These functions are in involution with respect to the quadratic Poisson structure, given by $\{a_i, a_j\} = (\delta_{i,j+1} - \delta_{i+1,j})\,a_i a_j$. It follows from the previous section that for generic $k$, $l$ the affine surface $P_{kl}$ defined by $K = k$, $L = l$ is an affine part of the Jacobian


of the genus two Riemann surface τ minus five translates of its theta divisor, which is (0) isomorphic to τ . As we have seen, an equation for τ is given by τ(0) : y 2 = (u3 − ku2 + lu)2 − 4u.

(24)

The two commuting Hamiltonian vector fields $X_K$ and $X_L$ are given by

$$\begin{aligned}
\dot a_1 &= a_1(a_5 - a_2), & a_1' &= a_1(a_3 a_5 - a_2 a_4), \\
\dot a_2 &= a_2(a_1 - a_3), & a_2' &= a_2(a_4 a_1 - a_3 a_5), \\
\dot a_3 &= a_3(a_2 - a_4), & a_3' &= a_3(a_5 a_2 - a_4 a_1), \\
\dot a_4 &= a_4(a_3 - a_5), & a_4' &= a_4(a_1 a_3 - a_5 a_2), \\
\dot a_5 &= a_5(a_4 - a_1), & a_5' &= a_5(a_2 a_4 - a_1 a_3).
\end{aligned}$$
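The statements of this section about $K$, $L$ and the spectral curve can be tested with exact arithmetic. The following is a hedged sketch with illustrative function names: it checks at random rational points that $K$, $L$ and the product $a_1 \cdots a_5$ have zero derivative along both vector fields, and that on $a_1 a_2 a_3 a_4 a_5 = 1$ the determinant $\det(x\,\mathrm{Id} - L)$ equals $x^5 - Kx^3 + Lx - h - 1/h$.

```python
from fractions import Fraction
import random


def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in m]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result


def integral_k(a):
    return sum(a)


def integral_l(a):
    """L = a1 a3 + a2 a4 + a3 a5 + a4 a1 + a5 a2, i.e. the sum of a_i a_{i+2} mod 5."""
    return sum(a[i] * a[(i + 2) % 5] for i in range(5))


def grad_l(a):
    return [a[(i + 2) % 5] + a[(i - 2) % 5] for i in range(5)]


def field_k(a):
    """X_K: da_i/dt = a_i (a_{i-1} - a_{i+1})."""
    return [a[i] * (a[i - 1] - a[(i + 1) % 5]) for i in range(5)]


def field_l(a):
    """X_L: a_i' = a_i (a_{i+2} a_{i+4} - a_{i+1} a_{i+3})."""
    return [
        a[i] * (a[(i + 2) % 5] * a[(i + 4) % 5] - a[(i + 1) % 5] * a[(i + 3) % 5])
        for i in range(5)
    ]


def lax(a, h):
    return [
        [0, a[0], 0, 0, 1 / h],
        [1, 0, a[1], 0, 0],
        [0, 1, 0, a[2], 0],
        [0, 0, 1, 0, a[3]],
        [h * a[4], 0, 0, 1, 0],
    ]


def spectral_det(a, h, x):
    m = lax(a, h)
    return det([[(x if i == j else 0) - m[i][j] for j in range(5)] for i in range(5)])


rng = random.Random(2)
rand = lambda: Fraction(rng.randint(1, 9), rng.randint(1, 9))
for _ in range(20):
    a = [rand() for _ in range(4)]
    a.append(1 / (a[0] * a[1] * a[2] * a[3]))     # enforce a1 a2 a3 a4 a5 = 1
    for field in (field_k(a), field_l(a)):
        assert sum(field) == 0                                         # dK/dt = 0
        assert sum(g * v for g, v in zip(grad_l(a), field)) == 0       # dL/dt = 0
        assert sum(v / ai for v, ai in zip(field, a)) == 0             # product fixed
    h, x = rand(), rand()
    rhs = x**5 - integral_k(a) * x**3 + integral_l(a) * x - h - 1 / h
    assert spectral_det(a, h, x) == rhs
```

The vanishing of $X_K(L)$ is equivalent to $\{L, K\} = 0$, so the same loop also illustrates that $K$ and $L$ are in involution.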

The principal balance of $X_K$ for which $a_1$ and $a_2$ have a pole corresponds, according to Sect. 5, to the following solution of the indicial equations:

$$(a_1^{(0)}, a_2^{(0)}, a_3^{(0)}, a_4^{(0)}, a_5^{(0)}) = (-1, 1, 0, 0, 0),$$

and its first few terms are given by

$$\begin{aligned}
a_1 &= -\frac{1}{t} + \alpha - \frac{1}{3}(\alpha^2 + 2\beta + \gamma)\,t + O(t^2), \\
a_2 &= \frac{1}{t} + \alpha + \frac{1}{3}(\alpha^2 - \beta - 2\gamma)\,t + O(t^2), \\
a_3 &= \gamma t + O(t^2), \\
a_4 &= \delta + O(t^2), \\
a_5 &= \beta t + O(t^2).
\end{aligned} \tag{25}$$

Here $\alpha$, $\beta$, $\gamma$ and $\delta$ are the free parameters. If we look for Laurent solutions that correspond to the divisor to be added to $P_{kl}$ we find by substituting the above Laurent solution in $K = k$, $L = l$, $a_1 a_2 a_3 a_4 a_5 = 1$,

$$2\alpha + \delta = k, \qquad 2\alpha\delta + \beta - \gamma = l, \qquad \gamma\beta\delta = -1,$$

which means that the Laurent solution depends on two parameters $\beta$ and $\delta$, bound by the relation

$$(k - \delta)\delta + \beta + \frac{1}{\beta\delta} = l, \tag{26}$$

which is an (affine) equation for the theta divisor, i.e., for τ; it is easy to see that this curve is birational to the curve (24). The other four principal balances are obtained by cyclic permutation from (25).

$P_{kl}$ can be embedded explicitly in projective space by using the functions with a pole of order at most 3 along one of the translates of the theta divisor and no other poles. Since the theta divisor defines a principal polarization on its Jacobian, the vector space of such functions has dimension $3^2 = 9$, giving an embedding in $\mathbf{P}^8$. One checks by direct computation that the following functions $z_0, \dots, z_8$ form a basis for the space of functions with a pole of order at most 3 along the divisor associated with the Laurent solution (25) (the first two functions are obvious choices from the expression (25), while the others can be obtained from them by taking the derivative along the two flows):

$$\begin{aligned}
z_0 &= 1, \\
z_1 &= a_1 a_2, \\
z_2 &= a_1 a_2 a_4, \\
z_3 &= a_1 a_2 (a_1 + a_5), \\
z_4 &= a_1 a_2 a_4 (a_3 + a_4 + a_5), \\
z_5 &= a_1 a_2 a_4 (a_1 - a_2), \\
z_6 &= a_1 a_2 a_4 ((a_3 + a_4) a_1 - (a_4 + a_5) a_2), \\
z_7 &= a_1^2 a_2^2 a_4 a_5, \\
z_8 &= a_1 a_2^2 a_4 ((a_4 + a_5)^2 + a_3 a_4).
\end{aligned}$$

The corresponding embedding of the Jacobian in $\mathbf{P}^8$ is then given explicitly on the affine surface $P_{kl}$ by $(a_1, \dots, a_5) \mapsto (z_0 : \cdots : z_8)$. By substituting the five principal balances in this embedding and letting $t \to 0$ we find an embedding of the five curves 1, ..., 5 (in that order) which constitute the divisor Jac(τ) \ $P_{kl}$; each is given by a map of the form $(\beta, \delta) \mapsto (z_0 : \cdots : z_8)$, with respective images:

$(0 : 0 : 0 : 1 : 0 : 2\delta : 2\delta^2 : \beta\delta : -\delta^3)$

$(\beta\delta^2 : -\beta^2\delta^2 : 0 : -\beta^2\delta^3 : \beta\delta : \beta\delta : \beta\delta^2 : 0 : 1 - \beta\delta^3)$

$(1 : 0 : \beta\delta : 0 : \beta\delta(k - \delta) : \beta\delta^2 : -\beta\delta(\beta + \delta^2 - k\delta) : 0 : \beta^2\delta(k - \delta))$

$(\beta^2\delta : 0 : \beta\delta : -\beta\delta : \beta\delta(k - \delta) : -\beta\delta^2 : 1 + \beta\delta^2(\delta - k) : -\delta : -\beta\delta^2(\beta - (\delta - k)^2))$

2 2 (βδ : −δ : 0 : δ(δ − k) : βδ : −βδ : −βδ : −1 : 1).

The points on the divisor that correspond to the above Laurent solutions are the ones for which $\beta$ and $\delta$ are finite; notice that all these points in $\mathbf{P}^8$ are different. In order to determine the coordinates of the other points and the incidence relations between these points and the curves, we choose a local parameter around each of the three points needed to complete (26) into a compact Riemann surface:

(a) $\delta = 1/u$, $\beta = 1/u^2\,(1 + O(t))$;
(b) $\delta = 1/u$, $\beta = u^3 (1 + O(t))$;
(c) $\beta = 1/u$, $\delta = -u^2 (1 + O(t))$.

Substituting these in the equations of the five embedded curves we find the following 5 points (each one is found 3 times because it belongs to three of the curves):

$$\begin{aligned}
p_1 &= (0 : 0 : 0 : 1 : 0 : 0 : 0 : 0 : 0), \\
p_2 &= (0 : 0 : 0 : 0 : 0 : 0 : 0 : 0 : 1), \\
p_3 &= (1 : 0 : 0 : 0 : 0 : 0 : 1 : 0 : -k), \\
p_4 &= (1 : 0 : 0 : 0 : 0 : 0 : -1 : 0 : 0), \\
p_5 &= (0 : 0 : 0 : 0 : 0 : 0 : 0 : 1 : -1).
\end{aligned}$$
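The three relations obtained from (25), and hence the curve (26), can be reproduced by multiplying truncated Laurent series. The sketch below is illustrative only: series are stored as hypothetical `{exponent: coefficient}` dictionaries and only the terms displayed in (25) are kept. That is enough here, because the dropped remainders contribute only to positive powers of $t$ in $K$ and in the product, and their only contributions to the constant term of $L$ cancel between $a_2 a_4$ and $a_4 a_1$.

```python
from fractions import Fraction


def mul(p, q):
    """Product of two Laurent polynomials stored as {exponent: coefficient}."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return out


def add(*ps):
    """Sum of Laurent polynomials."""
    out = {}
    for p in ps:
        for e, c in p.items():
            out[e] = out.get(e, 0) + c
    return out


def coeff(p, e):
    return p.get(e, 0)


def balance(alpha, beta, gamma, delta):
    """Displayed terms of (25) for a_1, ..., a_5 (remainders dropped)."""
    third = Fraction(1, 3)
    a1 = {-1: -1, 0: alpha, 1: -third * (alpha**2 + 2 * beta + gamma)}
    a2 = {-1: 1, 0: alpha, 1: third * (alpha**2 - beta - 2 * gamma)}
    a3 = {1: gamma}
    a4 = {0: delta}
    a5 = {1: beta}
    return [a1, a2, a3, a4, a5]


def invariants(beta, delta, k):
    """Solve 2*alpha + delta = k and gamma*beta*delta = -1, then expand K, L, product."""
    alpha = (k - delta) / 2
    gamma = -1 / (beta * delta)
    a = balance(alpha, beta, gamma, delta)
    series_k = add(*a)
    series_l = add(*(mul(a[i], a[(i + 2) % 5]) for i in range(5)))
    product = {0: 1}
    for s in a:
        product = mul(product, s)
    return series_k, series_l, product


beta, delta, k = Fraction(2, 3), Fraction(-5, 7), Fraction(3)
l = (k - delta) * delta + beta + 1 / (beta * delta)      # relation (26)
series_k, series_l, product = invariants(beta, delta, k)

assert coeff(series_k, -1) == 0 and coeff(series_k, 0) == k
assert coeff(series_l, -1) == 0 and coeff(series_l, 0) == l
assert all(e >= 0 for e, c in product.items() if c != 0)
assert coeff(product, 0) == 1
```

The parameter values are arbitrary non-zero rationals; any other choice with $\beta\delta \neq 0$ should pass the same assertions.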

[Figure: the five curves, labeled 1, ..., 5, and the points p1, ..., p5; the points p3 and p4 each appear twice.]

Fig. 2.

With this labeling of the points $p_i$ we have that the curve $i$ contains the points $p_{i-1}$, $p_i$ and $p_{i+1}$. As a corollary we find a $5_3$ configuration on the Jacobian, where the incidence pattern of the 5 Painlevé divisors and the 5 points $p_i$ is as in Fig. 2 (to make the picture exact one has to identify the two points labeled $p_3$, as well as the two points labeled $p_4$, in such a way that the curves 2 and 4 are tangent, as well as the curves 3 and 5). Obviously the order 5 automorphism $(a_1, a_2, a_3, a_4, a_5) \mapsto (a_2, a_3, a_4, a_5, a_1)$ preserves the affine surfaces $P_{kl}$ and maps every curve $i$ and every point $p_i$ to its neighbor. Since this automorphism does not have any fixed points it is a translation on Jac(τ), and since its order is 5 it is a translation over 1/5 of a period. Notice also that with the above labeling of points and divisors the intersection point between the curves $i$ and $i + 2$ is $p_{i+1}$ (so they are tangent), while the intersection points between the curves $i$ and $i + 1$ are $p_i$ and $p_{i+1}$. Dually, the divisors that pass through $p_i$ are precisely the curves $i - 1$, $i$ and $i + 1$. The usual Olympic rings are nothing but an asymmetric projection of this most beautiful Platonic configuration!

References

1. Adler, M. and van Moerbeke, P.: The complex geometry of the Kowalewski–Painlevé analysis. Invent. Math. 97, 3–51 (1989)
2. Adler, M. and van Moerbeke, P.: Algebraic completely integrable systems: A systematic approach. Perspectives in Mathematics, Academic Press (to appear)
3. Adler, M. and van Moerbeke, P.: The Toda lattice, Dynkin diagrams, singularities and Abelian varieties. Invent. Math. 103, 223–278 (1991)
4. Courant, T.: Dirac manifolds. Trans. Am. Math. Soc. 319, 631–661 (1990)
5. Fernandes, R.L.: On the master symmetries and bi-Hamiltonian structure of the Toda lattice. J. Phys. A: Math. Gen. 26, 3797–3803 (1993)
6. Fernandes, R.L. and Santos, J.P.: Integrability of the periodic KM system. Rep. Math. Phys. 40, 475–484 (1997)
7. Dalaljan, S.G.: The Prym variety of a two-sheeted covering of a hyperelliptic curve with two branch points. (Russian) Mat. Sb. (N.S.), 98 (140), no. 2 1(10), 255–267, 334 (1975)
8. Griffiths, P.A.: Linearizing flows and a cohomological interpretation of Lax equations. Am. J. Math. 107, 1445–1484 (1985)
9. Griffiths, P.A. and Harris, J.: Principles of algebraic geometry. New York: Wiley-Interscience 1978
10. Kac, M. and van Moerbeke, P.: On an explicitly soluble system of nonlinear differential equations related to certain Toda lattices. Adv. in Math. 3, 160–169 (1975)
11. Kuznetsov, V. and Vanhaecke, P.: Bäcklund transformations for finite-dimensional integrable systems: A geometric approach. nlin.SI/0004003
12. Mumford, D.: Tata lectures on theta. II. Boston: Birkhäuser Boston Inc., 1984
13. Mumford, D.: Prym varieties I. In: Contributions to analysis, Ahlfors L.V., Kra I., Maskit B., Nirenberg L., Eds., New York: Academic Press, 1974, pp. 325–350
14. Pedroni, M. and Vanhaecke, P.: A Lie algebraic generalization of the Mumford system, its symmetries and its multi-Hamiltonian structure. J. Moser at 70, Regul. Chaotic Dyn. 3, 132–160 (1998)
15. Vanhaecke, P.: Linearising two-dimensional integrable systems and the construction of action-angle variables. Math. Z. 211, 265–313 (1992)
16. Vanhaecke, P.: Integrable systems in the realm of algebraic geometry. Berlin–Heidelberg–New York: Springer-Verlag, 1996
17. Vanhaecke, P.: Integrable systems and symmetric products of curves. Math. Z. 227, 93–127 (1998)
18. Veselov, A.P. and Penskoï, A.V.: On algebro-geometric Poisson brackets for the Volterra lattice. Regul. Chaotic Dyn. 3, 3–9 (1998)
19. Volkov, A.: Hamiltonian interpretation of the Volterra model. J. Soviet Math. 46, 1576–1581 (1989)
20. Volterra, V.: Leçons sur la Théorie Mathématique de la Lutte pour la Vie. Paris: Gauthier-Villars et Cie., 1931
21. Weinstein, A.: The local structure of Poisson manifolds. J. Differ. Geom. 18, 523–557 (1983)

Communicated by M. Aizenman

Commun. Math. Phys. 221, 197 – 227 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

The Complex Geometry of Weak Piecewise Smooth Solutions of Integrable Nonlinear PDE’s of Shallow Water and Dym Type

Mark S. Alber1,2, Roberto Camassa3,4, Yuri N. Fedorov5, Darryl D. Holm4,†, Jerrold E. Marsden6,‡

1 Department of Mathematics, Stanford University, Building 380, MC 2125, Stanford, CA 94305, USA. E-mail: [email protected]
2 Department of Mathematics, University of Notre Dame, Notre Dame, IN 46556, USA. E-mail: [email protected]
3 Department of Mathematics, University of North Carolina, Chapel Hill, NC 27599, USA
4 Center for Nonlinear Studies and Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA. E-mail: [email protected]; [email protected]
5 Department of Mathematics and Mechanics, Moscow Lomonosov University, Moscow 119 899, Russia. E-mail: [email protected]
6 Control and Dynamical Systems 107-81, California Institute of Technology, Pasadena, CA 91125, USA. E-mail: [email protected]

Received: 16 February 1999 / Accepted: 10 April 2001

To the 70th birthday of Solomon Alber

Abstract: An extension of the algebraic-geometric method for nonlinear integrable PDE’s is shown to lead to new piecewise smooth weak solutions of a class of N-component systems of nonlinear evolution equations. This class includes, among others, equations from the Dym and shallow water equation hierarchies. The main goal of the paper is to give explicit theta-functional expressions for piecewise smooth weak solutions of these nonlinear PDE’s, which are associated to nonlinear subvarieties of hyperelliptic Jacobians.

The main results of the present paper are twofold. First, we exhibit some of the special features of integrable PDE’s that admit piecewise smooth weak solutions, which make them different from equations whose solutions are globally meromorphic, such as the KdV equation. Second, we blend the techniques of algebraic geometry and weak solutions of PDE’s to gain further insight into, and explicit formulas for, piecewise-smooth finite-gap solutions.

The basic technique used to achieve these aims is rather different from earlier papers dealing with peaked solutions. First, profiles of the finite-gap piecewise smooth solutions are linked to certain finite dimensional billiard dynamical systems and ellipsoidal billiards. Second, after reducing the solution of certain finite dimensional Hamiltonian systems on Riemann surfaces to the solution of a nonstandard Jacobi inversion problem, this is resolved by introducing new parametrizations.

Amongst other natural consequences of the algebraic-geometric approach, we find finite dimensional integrable Hamiltonian dynamical systems describing the motion of peaks in the finite-gap as well as the limiting (soliton) cases, and solve them exactly. The dynamics of the peaks is also obtained by using Jacobi inversion problems. Finally, we relate our method to the shock wave approach for weak solutions of wave equations by determining jump conditions at the peak location.

Research partially supported by NSF grant DMS 9626672 and NATO grant CRG 950897.
Research supported in part by US DOE CCPP and BES programs and NATO grant CRG 950897.
Research supported by INTAS grant 97-10771 and, in part, by the Center for Applied Mathematics, University of Notre Dame.
† Research supported in part by US DOE CCPP and BES programs.
‡ Research partially supported by the California Institute of Technology and NSF grant DMS 9802106.

Contents

1. Introduction
2. Finite-Gap Solutions
3. Flows on n-Dimensional Quadrics and Stationary n-Gap Solutions of the (HD) and (SW) Equations
4. Billiard Dynamical Systems and Piecewise-Smooth Weak Solutions of PDE’s
5. Kinematics of Peaks
6. The Dynamics of Peaks and Weak Solutions

1. Introduction An important feature of many integrable nonlinear evolution equations is the nature of their soliton solutions. There are many examples of such solutions found in a variety of physical applications, such as nonlinear optics and water wave equations. Nonsmooth soliton solutions of integrable equations are now well known, and include solutions of the shallow water equation (SW) with peaks, the points at which their spatial derivative changes sign (see Camassa and Holm [1993] and Camassa, Holm and Hyman [1994]). It was noted in Alber et al. [1994, 1995, 1999] that the spatial structure of these “peakon” and finite-gap piecewise smooth weak solutions are closely related to finite dimensional integrable billiard systems. Some history. Camassa and Holm [1993] described classes of n-peakon solutions for an integrable equation in the context of a model for shallow water theory. This work (see also Camassa, Holm and Hyman [1994]) contains many other facts about these equations as well, such as a Hamiltonian derivation of the equation, the associated linear isospectral eigenvalue problem and its discrete spectrum corresponding to the peakons, a steepening lemma important for understanding how solutions lose regularity, numerical stability, etc. Of particular interest to us is their description of the dynamics of the peakons in terms of a finite-dimensional completely integrable Hamiltonian system. In other words, each peakon solution can be associated with a mechanical system of moving particles. Calogero [1995] and Calogero and Francoise [1996] further extended the class of mechanical systems of this type. It is well-known (see, for example, Ablowitz and Segur [1981]), that solitons and quasi-periodic solutions of most classical integrable equations can be obtained by using the inverse scattering transform (IST) method. 
This is done by establishing a connection with an isospectral eigenvalue problem for an associated operator that is often a Schrödinger operator. In some cases it involves a potential in the form of an entire function of the spectral parameter. Such an operator is called an energy-dependent

Complex Geometry of Piecewise Solutions


Schrödinger operator. The scattering problem for the operators of this type was studied by Jaulent [1972] and Jaulent and Jean [1976]. On the other hand, in connection with certain N-component systems of integrable evolution equations, Antonowicz and Fordy [1989] investigated certain energy dependent scalar Schrödinger operators. Using this formalism, they obtained multi-Hamiltonian structures for this class of systems.

Later, Alber et al. [1994, 1995, 1999] showed that in the case of certain potentials, a limiting procedure can be applied to generic solutions, which results in solutions with peaks. The latter were related to finite dimensional integrable dynamical systems with reflections and were termed piecewise-smooth solutions, a terminology that hereafter we will adopt. This relation provides an efficient route to the study of finite-gap and piecewise soliton solutions of nonlinear PDE’s. The approach is based on studying finite dimensional Hamiltonian systems on certain Riemann surfaces and can be used for a number of equations including the shallow water equation, the Dym type equation, as well as certain N-component systems and equations in their hierarchies.

Finite-gap solutions of the Dym equation were studied in Dmitrieva [1993a] and Novikov [1999] by making use of a connection with the KdV equation and with the aid of additional phase functions. Soliton solutions of Dym type equations were studied in Dmitrieva [1993b]. Periodic solutions of the shallow water equation were discussed in Constantin and McKean [1999]. The papers by Beals et al. [1998, 1999, 2000] used Stieltjes’ theorem on continued fractions and the classical moment problem for studying multi-peakon solutions of the (SW) equation. Multi-peakon solutions have also been derived in Camassa [2000] by Gram–Schmidt orthogonalization.

The main results of this paper.
While our techniques are rather general and can be applied to large classes of N-component systems, we shall illustrate them in detail for two specific integrable PDE’s. One of these equations is a member of the Dym hierarchy that has been studied by, amongst others, Kruskal [1975], Cewen [1990], Hunter and Zheng [1994] and Alber et al. [1995, 1999]. Using subscript notation for partial derivatives, this equation is

$$
U_{xxt} + 2U_x U_{xx} + U U_{xxx} - 2\kappa U_x = 0. \tag{HD}
$$

The other equation, derived from the Euler equations of hydrodynamics in a shallow water framework by Camassa and Holm [1993], is

$$
U_t + 3U U_x = U_{xxt} + 2U_x U_{xx} + U U_{xxx} - 2\kappa U_x. \tag{SW}
$$

In both equations, the dependent variable $U(x, t)$ may be interpreted as a horizontal fluid velocity and $\kappa$ is a parameter. Under appropriate boundary conditions, applying the limit $\kappa \to 0$ to (SW) leads to an equation that has peaked solutions. For equation (HD), such solutions exist also for $\kappa \neq 0$ (for example periodic and finite-gap peaked solutions).

By using the method of generating equations for nonlinear integrable PDE’s, we reduce the equations to a Jacobi inversion problem associated with hyperelliptic curves. The solutions $U(x, t)$ themselves are given by trace formulae, i.e., sums of coordinates of points on such curves. An important feature is that the corresponding Abel–Jacobi mapping is not a standard one. First of all, the holomorphic differentials that are involved do not form a complete set of such differentials on a hyperelliptic curve. Second, it involves a meromorphic differential. As a result, the image of the mapping turns out to be a non-Abelian subvariety


M. S. Alber, R. Camassa, Y.N. Fedorov, D. D. Holm, J. E. Marsden

(a stratum) of a generalized Jacobian. This also implies that the x- and t-flows of (HD) and (SW) are essentially nonlinear, i.e., they are not translationally invariant. Seen from the viewpoint of algebraic geometry, these nonstandard aspects constitute the main difference between shallow water and Dym type equations, and equations of KdV type and more generally equations from the whole KP hierarchy which lead to standard Abel–Jacobi mappings. The basic technique of the present paper is rather different from earlier papers dealing with peaked solutions. First, profiles of the finite-gap piecewise-smooth solutions are linked to certain finite dimensional billiard dynamical systems and ellipsoidal billiards in the field of Hooke potentials. Second, after reducing the solution of the finite dimensional Hamiltonian systems on Riemann surfaces to the solution of a nonstandard Jacobi inversion problem, it is resolved by introducing new parametrizations. The philosophy that “justifies” procedures of this sort is that, in the end, by using the trace formulae, we obtain weak solutions of the PDE’s (HD) and (SW) in the spacetime sense. This is regarded as equivalent to the validity of Hamilton’s principle for these PDE’s and is taken as a fundamental criterion for the definition of their solutions. It is worth emphasizing that Hamilton’s principle naturally leads to weak solutions in the spacetime sense (and not in the spatial sense alone). We might also remark that even for billiards, one has to be careful about the sense in which solutions are interpreted. In the case of a point particle bouncing off a wall, for example, the equations of motion themselves do not rigorously make sense at the collision; what does make sense is the fundamental principle of Hamilton. This point of view of course is not new – see, e.g., Young [1969] and Kane et al. [1999]. The contents of the paper. In Sect. 
2, basic trace formulae and µ-variable representations are used to establish a connection between solutions of the nonlinear equations and finite dimensional Hamiltonian systems on Riemann surfaces. These representations describe finite-gap and soliton type solutions, as well as mixed soliton–finite-gap solutions. Then, solving the Hamiltonian systems is reduced to Jacobi inversion problems with meromorphic differentials. These inversion problems are solved by introducing a new parameterization that yields a Hamiltonian flow on a nonlinear subvariety of the Jacobi variety. The approach of recurrence chains used in this section is demonstrated in detail in the case of Dym-type equations. In Sect. 3 the geodesic motion and motion in the field of a Hooke potential on an ellipsoid are linked, at any fixed time t, to finite-gap solutions of (HD) and (SW) equations respectively through trace formulae. In Sect. 4 it is shown how peaked finite-gap solutions of (HD) and (SW) equations arise in the particular limit of smooth solutions. Based on this, a connection to ellipsoidal and hyperbolic billiards is used to construct the peak solutions of equations (HD) and (SW) in the form of an infinite sequence of pieces, corresponding to the segments between impacts, glued together along peaks. The motion between impacts in the billiard problems is made linear on generalized Jacobians of hyperelliptic curves. By solving the corresponding generalized Jacobi inversion problem, we find theta-function solutions to the billiards, which thereby enables us to describe explicit peak solutions for the above PDE. We then extend the analysis from fixed-time peak solutions to time-dependent ones and show that the latter are described by an infinite number of meromorphic pieces in x and t that are glued along peak lines (surfaces) where the solution has discontinuous derivatives in the dependent variables. We give theta-function expressions for the pieces and the peak surfaces.
These formulae may be useful


for stability analysis as well as for numerical investigations of the perturbed nonlinear PDE’s. In Sect. 5 the Hamiltonian structure for the motion of the peaks of the finite-gap piecewise-solutions is obtained by using algebraic-geometric methods. Lastly, in Sect. 6 we relate our method to the shock wave approach for weak solutions of wave equations by determining jump conditions at the shock location.

2. Finite-Gap Solutions

In this section we will show that even on the level of finite-gap solutions, there are crucial differences between the KdV equation case and equations (HD) or (SW). The same method can be applied to other equations forming the HD and SW hierarchy as well as to N-component systems of nonlinear evolution equations which have associated with them energy dependent Schrödinger operators (see Alber et al. [1997]).

We will start by describing the algebraic geometrical structure of finite-gap solutions of equations (HD) and (SW) related to a hyperelliptic curve of genus n, also called n-gap solutions. For the HD equation such solutions were obtained in terms of theta-functions by Dmitrieva [1993a] (see also Dmitrieva [1993b]) and Novikov [1999]. For equation (SW) on a circle, the problem was discussed in Constantin and McKean [1999].

Lax pairs and recurrence chains. We now use the recurrence chain approach to develop a basic trace formula which establishes a connection between solutions of equation (HD) and finite dimensional Hamiltonian systems on Riemann surfaces, written in the so-called µ-variables representation. This representation describes finite-gap solutions, as well as their limiting forms of soliton-type. This representation also yields the existence of peakons in a special limiting case. For definiteness, we concentrate here on equation (HD). Analogous results are available in the case of equation (SW) (for details see Alber et al. [1994, 1995]).
The hierarchy of Dym equations is obtained from the Lax equations

$$
\frac{\partial L}{\partial t_n} = [L, A_n], \qquad n \in \mathbb{N}, \qquad L = -\frac{\partial^2}{\partial x^2} + V(E, x, t_n),
$$

where the potential $V(E, x, t_n)$ is written in terms of a complex parameter $E$ in the form

$$
V(x, t_n, E) = \frac{M(x, t_n)}{2E}, \tag{2.1}
$$

for a function $M(x, t_n)$ to be determined below. Assuming $[L, A_n]$ to be a scalar operator, we choose $A_n = B_n \partial_x - \tfrac{1}{2} B_n'$ for some function $B_n(E, x, t_n)$ and obtain the following sequence of equations for $V$,

$$
\frac{\partial V}{\partial t_n} = -\frac{1}{2}\frac{\partial^3 B_n}{\partial x^3} + 2\frac{\partial B_n}{\partial x} V + B_n \frac{\partial V}{\partial x}. \tag{2.2}
$$

Now we choose $B_n$ to be a polynomial in $E$ of degree $n$:

$$
B_n(x, t, E) = b_0 \prod_{k=1}^{n} \bigl(E - \mu_k(x, t)\bigr) = \sum_{k=0}^{n} b_{n-k}(x, t)\, E^k. \tag{2.3}
$$
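The two forms of (2.3) can be compared on a small example. The following is a minimal Python sketch, not part of the original derivation: the helper name `coefficients_from_roots` and the sample roots are hypothetical, chosen only to illustrate how the coefficients $b_1, \dots, b_n$ arise from the root variables when $b_0 = 1$.

```python
from fractions import Fraction

def coefficients_from_roots(roots, b0=1):
    """Return [b_0, b_1, ..., b_n] such that
    b0 * prod_k (E - mu_k) = sum_k b_{n-k} * E**k, as in (2.3)."""
    coeffs = [Fraction(b0)]  # coefficients, highest power of E first
    for mu in roots:
        times_E = coeffs + [Fraction(0)]                     # polynomial * E
        times_mu = [Fraction(0)] + [c * mu for c in coeffs]  # polynomial * mu
        coeffs = [p - q for p, q in zip(times_E, times_mu)]  # multiply by (E - mu)
    return coeffs

# Hypothetical sample roots mu_1, mu_2, mu_3 (n = 3), for illustration only.
mus = [Fraction(1), Fraction(3), Fraction(7)]
b = coefficients_from_roots(mus)
```

For these sample roots the coefficients are $(b_0, b_1, b_2, b_3) = (1, -11, 31, -21)$, so that $b_1 = -(\mu_1 + \mu_2 + \mu_3)$ and $b_3 = -\mu_1\mu_2\mu_3$.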


Substituting the expressions (2.1) and (2.3) into the generating equation (2.2) and equating like powers of $E$, we obtain a recurrence chain for coefficients of $B(x, t)$ which yields the $n$th equation of the Dym hierarchy. For example, putting $t_1 = t$ and choosing $n = 1$, $B_1(x, t, E) = b_0(x, t)E + b_1(x, t)$ yields the following chain, where primes denote derivatives with respect to $x$:

$$
\begin{aligned}
E^{1}&: \quad -b_0''' = 0,\\
E^{0}&: \quad -b_1''' + 2b_0' M + b_0 M' = 0,\\
E^{-1}&: \quad 2b_1' M + b_1 M' = \frac{\partial M}{\partial t}.
\end{aligned} \tag{2.4}
$$
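The chain (2.4) can be checked by direct substitution. The following worked expansion is a sketch that uses only (2.1), (2.2) and $B_1 = b_0 E + b_1$, with primes denoting $\partial/\partial x$; the second line is the first one multiplied by 2.

$$
\begin{aligned}
\frac{1}{2E}\frac{\partial M}{\partial t}
&= -\frac{1}{2}\left(b_0''' E + b_1'''\right) + \frac{(b_0' E + b_1')\,M}{E} + \frac{(b_0 E + b_1)\,M'}{2E},\\
\frac{1}{E}\frac{\partial M}{\partial t}
&= -b_0''' E + \left(-b_1''' + 2b_0' M + b_0 M'\right) + \frac{1}{E}\left(2b_1' M + b_1 M'\right).
\end{aligned}
$$

Matching the coefficients of $E^{1}$, $E^{0}$ and $E^{-1}$ on the two sides reproduces the three lines of (2.4).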

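After setting $b_0 = 1$ and using (2.1), we get

$$
\begin{aligned}
M' &= b_1''',\\
2b_1' M + b_1 M' &= \frac{\partial M}{\partial t}.
\end{aligned} \tag{2.5}
$$

The first equation defines $b_1$ in terms of $M$, $M = b_1'' + \kappa$, with $\kappa$ a constant. Renaming $b_1 = -U$, so that

$$
M = -U'' + \kappa, \tag{2.6}
$$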

and putting this into the second equation of the set (2.5) results in equation (HD). (For further details about the hierarchies of (HD) and (SW), see for example Alber et al. [1994, 1995, 1999].) The method of generating equations is due to S. Alber [1979] and another exposition of it can be found in Alber et al. [1985, 1997]. We call (2.2) the “dynamical generating equation”, because it generates a hierarchy of equations governing the dynamics of the dependent variable $M(x, t)$.

Remark. The flows where $B_n$ is a polynomial in $E$, as in the definition (2.3) and in the example above, will in general lead to nonlocal equations, i.e., the evolution equation for $M$ involves terms that depend on nonlocal operators acting on combinations of $M$ and its derivatives. This can be seen, for instance, in Eq. (2.5), where both $b_1$ and $b_1'$ require inverting (2.6) to write $U$ in terms of $M$. Thus, flows generated by polynomials $B_n$ in $E$ should be properly classified as integro-differential evolution equations, rather than PDE’s. In contrast, the choice of polynomials in $1/E$ for $B_n$ leads to flows that are local, i.e., $M_t$ only depends on combinations of $M$ and its (spatial) derivatives, and these flows are proper PDE’s. This feature of equations of Dym (HD) or shallow water (SW) type is somewhat different from other completely integrable PDE’s like the KdV or Sine-Gordon equation. Equations (HD) and (SW) possess “open ended” hierarchies: the recurrence chain can be extended from negative to positive powers of $E$, by choosing $B_n$ in (2.2) to be a rational function of the parameter $E$. The case when the chain includes only negative powers of $E$ is in fact the one most studied in the literature (see, e.g., Dmitrieva [1993a], Novikov [1999] for the case of Dym equation).

Now let us consider the stationary flow for the $n$th equation of the hierarchy, which is obtained by dropping the time derivative of $V$ in the left-hand side of (2.2). By definition


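a stationary equation describes a finite-dimensional system for the coefficients of $B_n$ and is equivalent to the $2 \times 2$ Lax pair

$$
\frac{\partial}{\partial x} W_n(E) = -[W_n(E), L(E)], \quad\text{or}\quad \left[\frac{\partial}{\partial x} + L(E),\; W_n(E)\right] = 0, \tag{2.7}
$$

$$
W_n(E) = \begin{pmatrix} -\tfrac{1}{2} B_n' & B_n \\[4pt] -\tfrac{1}{2} B_n'' + B_n \dfrac{M}{E} & \tfrac{1}{2} B_n' \end{pmatrix}, \qquad
L = \begin{pmatrix} 0 & 1 \\[4pt] \dfrac{M}{E} & 0 \end{pmatrix}.
$$

The matrix $W_n(E)$ undergoes an isospectral deformation. Hence the spectral curve $\Gamma = \{|W_n(E) - zI| = 0\}$ is an invariant of the stationary flow. The curve is hyperelliptic and can be represented in the form

$$
\Gamma = \{w^2 = \mu C(\mu)\}, \tag{2.8}
$$

where $z = w/E$ and

$$
C(E) = E\left(-B_n B_n'' + \tfrac{1}{2} B_n'^{\,2}\right) + B_n^2 M. \tag{2.9}
$$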

Since $B_n$ is a polynomial of degree $n$, $C(E)$ becomes a polynomial of degree (at most) $2n$:

$$
C(E) = \sum_{j=0}^{2n} C_j E^j = C_{2n} \prod_{k=1}^{2n} (E - m_k), \tag{2.10}
$$

for some constants $m_k$, $k = 1, \dots, 2n$. In this case the curve $\Gamma$ has genus $n$ and we set the coefficient $C_{2n}$ to be a negative number: $C_{2n} \equiv -L_0^2$. We shall refer to (2.9) as the stationary generating equation. Equating like powers of $E$ in both sides of the stationary generating equation yields first integrals

$$
\begin{aligned}
E^{2n}&: \quad C_{2n} = -b_1'' + M,\\
E^{2n-1}&: \quad C_{2n-1} = -b_1 b_1'' - b_2'' + \tfrac{1}{2}(b_1')^2 + 2b_1 M,\\
E^{j}&: \quad \cdots\\
E^{0}&: \quad C_0 = b_n^2 M.
\end{aligned} \tag{2.11}
$$

Let us consider the divisor of points $P_1 = (\mu_1, w_1), \dots, P_n = (\mu_n, w_n)$ on $\Gamma$. Substituting (2.3) into (2.9) and setting $E = \mu_1, \dots, \mu_n$ successively, one gets the following system of equations describing evolution of the points under the stationary flow:

$$
\mu_i' \equiv \frac{\partial \mu_i}{\partial x} = \frac{\sqrt{R(\mu_i)}}{\mu_i \prod_{j \neq i}^{n} (\mu_i - \mu_j)}, \tag{2.12}
$$

where

$$
R(\mu) = \mu C(\mu) = -L_0^2\, \mu \prod_{r=1}^{2n} (\mu - m_r). \tag{2.13}
$$

In the case of equation (SW), this should be replaced by

$$
R(\mu) = \mu \prod_{r=1}^{2n+1} (\mu - m_r). \tag{2.14}
$$

We now proceed to describe finite-gap solutions of equation (HD) and the other equations from its hierarchy. According to a general theory (see, e.g., Dubrovin [1981], Belokolos et al. [1994]), for any fixed $t$, the $x$-profile of an $n$-gap solution of an integrable PDE satisfies the $n$th stationary equation of the hierarchy. Hence, $n$-gap solutions $M(x, t_k)$ of the $k$th equation of the HD hierarchy must satisfy the stationary generating equation (2.9) represented by the Lax pair (2.7), as well as the dynamical generating equation

$$
\frac{\partial V}{\partial t_k} = -\frac{1}{2}\frac{\partial^3 B_k}{\partial x^3} + 2\frac{\partial B_k}{\partial x} V + B_k \frac{\partial V}{\partial x}, \qquad V = \frac{M(x, t_k)}{2E}, \tag{2.15}
$$

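where the coefficients of $B_k(E)$ are found recursively. Notice that the latter equation is equivalent to the matrix commutativity relation

$$
\left[\frac{\partial}{\partial x} + L,\; \frac{\partial}{\partial t_k} + W_k\right] = 0, \tag{2.16}
$$

where

$$
W_k(E) = \begin{pmatrix} -\tfrac{1}{2} B_k' & B_k \\[4pt] -\tfrac{1}{2} B_k'' + B_k \dfrac{M}{E} & \tfrac{1}{2} B_k' \end{pmatrix}, \tag{2.17}
$$

and $L$ is defined in (2.7). The compatibility of conditions (2.16) and (2.7) leads to the following Lax pair:

$$
\frac{\partial}{\partial t_k} W_n(E) = -[W_n(E), W_k(E)], \qquad k \in \mathbb{N}, \quad k \neq n. \tag{2.18}
$$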

For $k = n$, we replace (2.18) with the Lax pair (2.7), thus identifying $t_n$ with $x$. The (1,2)-entry of the matrix equation (2.18) implies the following $t_k$-evolution of the polynomial $B_n(E)$:

$$
\frac{\partial B_n}{\partial t_k} = B_k \frac{\partial B_n}{\partial x} - B_n \frac{\partial B_k}{\partial x}, \qquad k \neq n. \tag{2.19}
$$

In case $k = n$ this relation is replaced by

$$
\frac{\partial B}{\partial x} = v\, b_0\, \frac{\partial B}{\partial t_n},
$$

where $v$ is a constant, which can always be eliminated by rescaling $t_n$.
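The (1,2)-entry computation behind the $t_k$-evolution of $B_n$ is short enough to write out. This sketch assumes that $W_n$ and $W_k$ both have the form (2.17), i.e. first row $(-\tfrac{1}{2}B', B)$ and $(2,2)$-entry $\tfrac{1}{2}B'$, with primes denoting $x$-derivatives:

$$
\begin{aligned}
[W_n, W_k]_{12} &= (W_n)_{11}(W_k)_{12} + (W_n)_{12}(W_k)_{22} - (W_k)_{11}(W_n)_{12} - (W_k)_{12}(W_n)_{22}\\
&= -\tfrac{1}{2} B_n' B_k + \tfrac{1}{2} B_n B_k' + \tfrac{1}{2} B_k' B_n - \tfrac{1}{2} B_k B_n'\\
&= B_n B_k' - B_k B_n'.
\end{aligned}
$$

Since the (1,2)-entry of $W_n$ is $B_n$, the Lax equation (2.18) then gives $\partial B_n/\partial t_k = -[W_n, W_k]_{12} = B_k B_n' - B_n B_k'$.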


Expanding the right-hand side of (2.19) in $E$ and using the condition that it must be a polynomial of degree $n - 1$, we find

$$
B_k(E) = \left[\frac{B_n(E)}{E^{n-k}}\right]_+, \tag{2.20}
$$

where $[\;\cdot\;]_+$ denotes the polynomial part of the expression.

As follows from the first equation in (2.11), $M = C_{2n} + b_1''$. On the other hand, according to formula (2.3), $b_1 = -\sum_{i=1}^{n} \mu_i$. Finally, using the definition (2.6) of $M$ in terms of the solution $U$ and integrating twice with respect to $x$, we obtain

$$
U = \sum_{i=1}^{n} \mu_i + \frac{1}{2}(\kappa - C_{2n})\, x^2 + K_1 x + K_2, \tag{2.21}
$$

where $K_1$ and $K_2$ are constants of integration. If we assume that all the variables $\mu_i$ are bounded, which is related to the choice of sign of the leading order coefficient $C_{2n}$, then $b_1$ is a bounded function of $x$. To find bounded solutions $U(x, t)$ of the PDE, we set

$$
C_{2n} = \kappa \qquad\text{and}\qquad K_1 = 0.
$$

Hence, when the above requirements are imposed, we see that the leading order coefficient of the polynomial $C(E)$ must coincide with the parameter $\kappa$ of the PDE. The Dym equation (HD) is invariant under the Galilean transformation

$$
\hat{x} = x + K_2 t, \qquad \hat{t} = t, \qquad \hat{U} = U - K_2,
$$

so that the constant $K_2$ can always be eliminated from expression (2.21). Therefore, under the boundedness conditions above, and up to a Galilean transformation, we assume that the finite-gap and soliton solutions of the Dym equation (HD) are reconstructed in terms of the root variables $\mu$’s by the “trace” formula, which in the case of equations (HD) and (SW) has the form

$$
U(x, t) = \sum_{i=1}^{n} \mu_i - m. \tag{2.22}
$$

Here $m$ is a constant, which equals zero in the case of equation (HD). Through (2.22), a solution of the system (2.12) allows one to construct the instantaneous profile of $U(x, \cdot)$ from a set of initial conditions $\mu_i(x, \cdot)\big|_{x=0} = \mu_i(0, \cdot) \in [m_{2i}, m_{2i+1}]$, $i = 1, \dots, n$. Here the “dot” notation stresses the fact that time $t$ is just a parameter in this system.

On the other hand, substitution of (2.3) into (2.19), setting $E = \mu_1, \dots, \mu_n$ successively, and taking into account expressions (2.12) results in the following $t_k$-evolution equations for $\mu_i$,

$$
\frac{\partial \mu_i}{\partial t_k} = B_k(\mu_i) \frac{\partial \mu_i}{\partial x} = B_k(\mu_i)\, \frac{\sqrt{R(\mu_i)}}{\mu_i \prod_{j \neq i}^{n} (\mu_i - \mu_j)}, \qquad i = 1, \dots, n, \tag{2.23}
$$

where, in view of (2.3) and (2.20), for $k = 1, \dots, n - 1$,

$$
B_k(\mu_i) = \operatorname{Res}_{s=0}\, \frac{1}{s^{n-k}}\, \frac{(s - \mu_1) \cdots (s - \mu_n)}{s - \mu_i},
$$

i.e., up to the sign, the $k$th elementary symmetric function of $\{\mu_1, \dots, \mu_n\} \setminus \mu_i$. In the case $k = 1$,

$$
\dot{\mu}_i \equiv \frac{\partial \mu_i}{\partial t_1} = \frac{(\mu_i - \Sigma)\sqrt{R(\mu_i)}}{\mu_i \prod_{j \neq i}^{n} (\mu_i - \mu_j)}, \qquad i = 1, \dots, n, \qquad \Sigma = \mu_1 + \cdots + \mu_n, \tag{2.24}
$$

the solution of which produces the $\mu$’s, and hence the PDE’s solution $U$, at any (later) time $t$. We notice that for $k > n$, the derivatives $\partial/\partial t_k$ are linear combinations of $\partial/\partial t_1, \dots, \partial/\partial t_n$.

Expressions (2.12), (2.23), and (2.24) provide the so-called µ-variables representation for the finite-gap solutions of an evolution equation. They are the analogs of Drach–Dubrovin equations which describe evolution of points on the spectral hyperelliptic curve in the case of the KdV equation. (For further details see Dubrovin [1975], Drach [1919], Alber et al. [1994, 1995, 1999], Gesztesy et al. [1996], and Alber and Fedorov [2001].) With the initial conditions chosen, the right-hand side of system (2.12) is real, and the derivative of $\mu_i$ changes sign when $\mu_i$ reaches the end points of its gap, $\mu_i = m_{2i}$ or $\mu_i = m_{2i+1}$, corresponding to a change of the sheet of the spectral curve $\Gamma$. Thus each variable $\mu$ undergoes (real) oscillations between the end points of a gap (so that the resulting PDE solution $U(x, t)$ remains real).

Remark. The condition that the root variables $\mu$’s are real (or, equivalently, their initial conditions are chosen as described above), while certainly sufficient to assure reality of the PDE’s solution $U$ resulting from (2.21), is clearly not necessary (namely, some of the $\mu$’s could occur in conjugate pairs). A wider class of real solutions $U$ could be constructed by relaxing the reality assumption on the µ-variables. However, a thorough discussion of the reality condition for $U$ and its implications for the root variables, while certainly desirable, lies beyond the scope of the present paper, and it will be addressed in future work.

By rearranging and summing up Eqs. (2.12), (2.23) and (2.24), one obtains the following nonstandard Abel–Jacobi equations

$$
\sum_{i=1}^{n} \frac{\mu_i^k\, d\mu_i}{\sqrt{R(\mu_i)}} =
\begin{cases}
dt_k, & k = 1, \dots, n - 1,\\
dx, & k = n,
\end{cases} \tag{2.25}
$$

which contain $(n - 1)$ holomorphic differentials and one meromorphic differential on $\Gamma$. Thus, the number of holomorphic differentials is less than the genus of the Riemann surface, which implies that the corresponding inversion problem cannot be solved in terms of meromorphic functions of $x$ and $t_1, \dots, t_{n-1}$ (see e.g., Markushevich [1977]).

Finite-gap stationary flows in x. Let us first consider the $x$-flow by fixing time variables in (2.25): $t_k = t_{k0} = \text{const}$, $k = 1, \dots, n - 1$, so that $dt_k = 0$. Now introduce a new spatial variable $x_1$ defined as follows:

$$
x = \int_0^{x_1} \frac{1}{L_0}\, \mu_1 \cdots \mu_n \, dx_1. \tag{2.26}
$$
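The reparametrization (2.26) is used together with the Jacobi identities for the sums $\sum_i \mu_i^k / \prod_{j \neq i} (\mu_i - \mu_j)$. The following minimal Python sketch checks these identities in exact rational arithmetic for $k = -1, 0, \dots, n$. The helper name `jacobi_sum` and the sample roots are hypothetical, chosen only for illustration. The sketch makes explicit that the $k = -1$ sum equals $(-1)^{n-1}/(\mu_1 \cdots \mu_n)$, which is the term responsible for the factor $\mu_1 \cdots \mu_n$ in (2.26).

```python
from fractions import Fraction
from math import prod

def jacobi_sum(mus, k):
    """Sum over i of mu_i**k / prod_{j != i} (mu_i - mu_j), computed exactly."""
    total = Fraction(0)
    for i, mi in enumerate(mus):
        denom = prod(mi - mj for j, mj in enumerate(mus) if j != i)
        total += Fraction(mi) ** k / denom
    return total

# Hypothetical distinct sample values of the root variables (n = 4).
mus = [Fraction(2), Fraction(3), Fraction(5), Fraction(7)]
n = len(mus)

vanishing = [jacobi_sum(mus, k) for k in range(0, n - 1)]  # k = 0, ..., n-2
top = jacobi_sum(mus, n - 1)                               # k = n-1
trace = jacobi_sum(mus, n)                                 # k = n
inverse = jacobi_sum(mus, -1)                              # k = -1
```

With these sample values the sums for $k = 0, \dots, n-2$ vanish, the sum for $k = n-1$ equals 1, the sum for $k = n$ equals $\mu_1 + \cdots + \mu_n = 17$, and the sum for $k = -1$ equals $-1/210$.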


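In view of the well-known Jacobi identities

$$
\sum_{i=1}^{n} \frac{\mu_i^k}{\prod_{j \neq i} (\mu_i - \mu_j)} =
\begin{cases}
(-1)^{n-1}/(\mu_1 \cdots \mu_n), & k = -1,\\
0, & k = 0, \dots, n - 2,\\
1, & k = n - 1,\\
\Sigma, & k = n,
\end{cases} \tag{2.27}
$$

Eqs. (2.12) give rise to the following system:

$$
\sum_{i=1}^{n} \int_{\mu_0}^{\mu_i} \frac{\mu^{k-1}\, d\mu}{\sqrt{R(\mu)}} =
\begin{cases}
x_1 + \phi_1, & k = 1,\\
\phi_k, & k = 2, \dots, n,
\end{cases} \tag{2.28}
$$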

where $\phi_1, \dots, \phi_n$ are constant phases which depend on $t_{k0}$ as on parameters. Equations (2.28) include $n$ holomorphic differentials on $\Gamma$ and determine the standard Abel–Jacobi map of the symmetric product $\Gamma^{(n)}$ of $n$ copies of $\Gamma$ to the Jacobi variety (Jacobian) $\mathrm{Jac}(\Gamma)$. Thus, the flow generated by the system (2.12) is made linear on $\mathrm{Jac}(\Gamma)$ after introducing the reparametrization (2.26). By using standard methods (see e.g., Dubrovin [1981] or Mumford [1983]), the map can be inverted, resulting in expressions for algebraic symmetric functions of µ-variables in terms of theta-functions of $n$ arguments which depend linearly on $x_1$ and, in a transcendental way, on $t_{k0}$ as parameters. Then, by using the trace formula (2.22), one obtains a theta-functional expression for $U$ as a function of $x_1$, $t_{k0}$, $U = \tilde{U}(x_1, t_{k0})$.

On the other hand, substituting the theta-functional expression for the product $\mu_1 \cdots \mu_n$ into (2.26) yields a quadrature. By solving it, one finds $x$ as a meromorphic function of $x_1$ which depends on $t_0$ as a parameter. However, the inverse function $x_1(x, t_0)$ is no longer meromorphic in $x$. Finally, the composition function $U(x, t_0) = \tilde{U}(x_1(x, t_0), t_{k0})$ gives a profile of the finite-gap solutions of the (HD) or (SW) equation (for explicit theta-functional expressions $\tilde{U}(x_1, t_0)$, $x(x_1, t_0)$ see Alber and Fedorov [2001]). Notice that as seen from (2.26) and (2.28), the original $x$-flow is also made linear on $\mathrm{Jac}(\Gamma)$. However the straight line motion is not uniform. The transformation (2.26) involving $x$ and $x_1$ coincides with a change of variable in the well-known Liouville transformation (see, e.g., Verhulst [1996]).

Finite-gap flows in $t_k$. Now let us fix the coordinate $x = x_0$ as well as all the times $t_1, \dots, t_{n-1}$ but $t_k$. Then introduce a new time variable $\tilde{t}$ defined by

$$
dt_k = \frac{\mu_1 \cdots \mu_n}{L_0\, (\Sigma_{k-1})}\, d\tilde{t}, \tag{2.29}
$$

where $\Sigma_{k-1}$ are the elementary symmetric functions of $\mu_1, \dots, \mu_n$ such that $(s - \mu_1) \cdots (s - \mu_n) = s^n + s^{n-1}\Sigma_1 + \cdots + s^0 \Sigma_n$. Applying again the identities (2.27), from (2.24) and (2.23) we arrive at the following canonical Abel–Jacobi mapping

$$
\sum_{i=1}^{n} \int_{\mu_0}^{\mu_i} \frac{\mu^{s-1}\, d\mu}{2\sqrt{R(\mu)}} =
\begin{cases}
\psi_1 = \tilde{t} + \phi_1, & s = 1,\\
\psi_s = \delta_{s,k}\, t_k + \phi_s, & s = 2, \dots, n,
\end{cases} \tag{2.30}
$$

where $\phi_1, \dots, \phi_n$ are constant phases which depend on $x_0$ and the rest of times $t_l$ as on parameters, and $\delta_{ij}$ is the Kronecker delta. As a result of inversion of (2.30), elementary symmetric functions of $\mu$’s and therefore the solution of equations (HD) and (SW) can be found in terms of theta-functions of $n$ arguments which depend linearly on $\psi_s$. This means that the arguments depend linearly on $\tilde{t}$, as well as on the original time $t_k$. However, $\tilde{t}$ itself depends on $t_k$ in a nonlinear way. Indeed, to describe the relation between $\tilde{t}$ and $t_k$, we substitute the theta-functional expressions for the symmetric functions $\Sigma_n = \mu_1 \cdots \mu_n$ and $\Sigma_{k-1}$ into (2.29). As a result, in contrast to the quadrature (2.26) relating $x$ and $x_1$, we now get a differential equation of the form

$$
\frac{dt_k}{d\tilde{t}} = F(t_k, \tilde{t} \mid x_0),
$$

where $F$ is a transcendental function of $t_k$, $\tilde{t}$ and the parameter $x_0$. It can be shown that the equation involves a transcendental integral.

Remarks.

1. In contrast to the $x_1$- and $x$-flows considered above, the flows generated by (2.23) ($t_k$-flows) including (2.24), are nonlinear flows on the Jacobi variety $\mathrm{Jac}(\Gamma)$. From the point of view of algebraic geometry, this phenomenon constitutes the main difference between solutions of such well known equations as KdV and sine Gordon equations and equations of (HD) or (SW) type.

2. The problem of inversion of the full nonstandard Abel mapping defined by (2.25) can be also studied by using a generalized Jacobian of the curve $\Gamma$. Namely, one has to extend the mapping by including an extra holomorphic differential on $\Gamma$ to get a complete set of such differentials. As a result of this procedure, one gets a flow on nonlinear subvarieties (strata) of generalized Jacobians. The complete algebraic geometrical description and explicit formulae are presented in Alber and Fedorov [2001].

3. Flows on n-Dimensional Quadrics and Stationary n-Gap Solutions of the (HD) and (SW) Equations

Consider a family of confocal quadrics in $\mathbb{R}^{n+1} = (X_1, \dots, X_{n+1})$

$$
\tilde{Q}(s) = \left\{ \frac{X_1^2}{a_1 - s} + \cdots + \frac{X_{n+1}^2}{a_{n+1} - s} = 1 \right\}, \qquad s \in \mathbb{R}, \qquad 0 < a_{n+1} < a_1 < \cdots < a_n. \tag{3.1}
$$

The elliptic coordinates $\mu_1, \dots, \mu_{n+1}$ can be defined in $\mathbb{R}^{n+1}$ in a standard way (see, e.g., Jacobi [1884a]) as follows. The condition $s = c$ determines the quadric $\tilde{Q}(c)$ on which one of the coordinates, say $\mu_{n+1}$, equals $c$, and the other coordinates $\mu_1, \dots, \mu_n$ are elliptic coordinates on $\tilde{Q}(c)$ defined by relations

$$
X_j^2 = (a_j - c)\, \frac{\prod_{l=1}^{n} (a_j - \mu_l)}{\prod_{k=1, k \neq j}^{n+1} (a_j - a_k)}, \qquad j = 1, \dots, n + 1. \tag{3.2}
$$

In the sequel without loss of generality we assume $c = 0$.

It is well-known that the problem of geodesics on the ellipsoid $\tilde{Q} = \tilde{Q}(0)$ is completely integrable (Jacobi [1884a, 1884b]). Moreover, as noticed by Jacobi himself and later


by many other authors (see e.g. Rauch-Wojciechowski [1995]), there exists an infinite sequence of integrable generalizations of the problem describing a motion on $\tilde{Q}$ in the force field of certain polynomial potentials $V_p(X_1, \dots, X_{n+1})$, $p \in \mathbb{N}$, of degree $2p$. The simplest integrable potential is the quadratic Hooke potential or the potential of an elastic string joining the center of the ellipsoid $\tilde{Q}$ to the point mass on it:

$$
V_1 = \frac{\sigma}{2}\left(X_1^2 + \cdots + X_{n+1}^2\right), \qquad \sigma = \text{const}.
$$
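The elliptic coordinates (3.2) can be sanity-checked on a small example. The following minimal Python sketch (the helper name `squared_cartesian` and the sample values are hypothetical, chosen for $n = 2$ and $c = 0$) verifies in exact arithmetic that the resulting point lies on $\tilde{Q}(0)$ and that $X_1^2 + \cdots + X_{n+1}^2 = \sum_j a_j - \sum_l \mu_l$. The second identity is why the Hooke potential becomes, up to an additive constant and the sign convention chosen for $\sigma$, linear in $\mu_1 + \cdots + \mu_n$.

```python
from fractions import Fraction
from math import prod

def squared_cartesian(a, mus, c=0):
    """X_j^2 from (3.2): (a_j - c) * prod_l (a_j - mu_l) / prod_{k != j} (a_j - a_k)."""
    out = []
    for j, aj in enumerate(a):
        num = (aj - c) * prod(aj - m for m in mus)
        den = prod(aj - ak for k, ak in enumerate(a) if k != j)
        out.append(Fraction(num) / den)
    return out

# Hypothetical sample data for n = 2, ordered as in (3.1): 0 < a_3 < a_1 < a_2.
a = [Fraction(2), Fraction(4), Fraction(1)]   # a_1, a_2, a_3
mus = [Fraction(3, 2), Fraction(3)]           # mu_1 in [a_3, a_1], mu_2 in [a_1, a_2]

X2 = squared_cartesian(a, mus)                      # X_1^2, X_2^2, X_3^2
on_quadric = sum(x / aj for x, aj in zip(X2, a))    # left-hand side of Q~(0)
radius_sq = sum(X2)                                 # X_1^2 + X_2^2 + X_3^2
```

For these values $X_1^2 = 1/2$, $X_2^2 = 5/3$, $X_3^2 = 1/3$, the quadric equation gives exactly 1, and the squared radius is $5/2 = (2 + 4 + 1) - (3/2 + 3)$.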

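In this case in terms of the ellipsoidal coordinates, the total energy (Hamiltonian) takes the Stäckel form:

$$
H = \frac{1}{8} \sum_{i=1}^{n} \frac{\prod_{j \neq i} (\mu_i - \mu_j)\, \mu_i}{\Phi(\mu_i)} \left(\frac{d\mu_i}{dx}\right)^2 + \frac{\sigma}{2} \sum_{i=1}^{n} \mu_i + \text{const},
$$

where

$$
\Phi(\mu) = (\mu - a_1) \cdots (\mu - a_{n+1})
$$

and $x$ denotes time. After fixing constants of motion, the system is reduced to the Abel–Jacobi equations

$$
\sum_{k=1}^{n} \int_{\mu_0}^{\mu_k} \frac{\mu^{i}\, d\mu}{2\sqrt{R(\mu)}} = \delta_{in}\, x + \phi_i, \qquad i = 1, \dots, n, \tag{3.3}
$$

$$
R(\mu) = -\mu\, \Phi(\mu) \left[c_0 (\mu - c_1) \cdots (\mu - c_{n-1}) - \sigma \mu^n\right], \qquad c_0, \dots, c_{n-1} = \text{const},
$$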

where $\phi_1, \dots, \phi_n$ are constant phases and $c_1, \dots, c_{n-1}$ are constants of motion. Notice that for $\sigma = 0$ the order of the polynomial $R(\mu)$ is $2n + 1$, whereas for $\sigma \neq 0$ it is $2n + 2$. The case $\sigma = 0$ corresponds to the free (geodesic) motion on $\tilde{Q}$. Here $c_0$ is the constant in the first integral $(\dot{X}, \dot{X})$ and the remaining constants admit a clear geometric interpretation: the tangent line to a geodesic is also tangent to the fixed confocal quadrics $\tilde{Q}(c_1), \dots, \tilde{Q}(c_{n-1})$ (Chasles theorem).

Now notice that Eqs. (3.3) are equivalent to the system (2.25) with $dt_k = 0$ describing stationary (HD) and (SW) equations, provided we identify the roots of the polynomial $R(\mu)$ with those of the odd order polynomial (2.13) (for $\sigma = 0$ and $L_0 = 1$) and of the even order polynomial (2.14) (for $\sigma = 1$) respectively. The equivalence also holds when some of the parameters $a_i$ in (3.3) are negative, which corresponds to the motion on a hyperboloid. For concreteness, we shall consider only the case of ellipsoids. Taking into account the trace formula (2.22), we arrive at the following theorem:

Theorem 3.1. The geodesic motion and motion in the field of a Hooke potential on the ellipsoid $\tilde{Q}$ are linked, at any fixed time $t$, to the $n$-gap solutions of (HD) and (SW) equations respectively through the trace formula (2.22). Namely, if the roots of the polynomials $R(\mu)$ in (2.13) or (2.14) coincide with the roots of $R(\mu)$ in (3.3), the profiles of such solutions are given by the sum of the elliptic coordinates of the moving point on $\tilde{Q}$ with addition of $(-m)$ in case of equation (SW).

For the geodesic flow on $\tilde{Q}$ ($\sigma = 0$) and equation (HD), this result was obtained in Alber and Alber [1985], Cewen [1990], and Alber et al. [1995].

As with Eq. (2.25), under the change of parameter (2.26), Eqs. (3.3) reduce to those containing holomorphic differentials only and having the same structure as (2.28). By


inverting the corresponding Abel–Jacobi mapping, one obtains explicit expressions for elementary symmetric functions of $\mu_i$ and, in view of (3.2), for the Cartesian coordinates $X_1, \dots, X_{n+1}$ in terms of theta-functions of the new parameter $x_1$ (for the case of the geodesic flow, see Weierstrass [1844], Moser [1978], and Knörrer [1982]). In the case $n = 2$, the change of parameter (2.26) was first applied by Weierstrass [1844] to solve the classical Jacobi geodesic problem on a triaxial ellipsoid (Jacobi [1884a, 1884b]).

4. Billiard Dynamical Systems and Piecewise-Smooth Weak Solutions of PDE’s

In this section it is first shown how peaked finite-gap solutions of (HD) and (SW) equations arise in the limit $m_1 \to 0$, where $m_1$ is the smallest root of the polynomial $R(E)$ in Eqs. (2.12)–(2.24). Then a connection to ellipsoidal and hyperbolic billiards is established.

Ellipsoidal billiards and generalized Jacobians. Suppose that one of the semi-axes of the ellipsoid $\tilde{Q}$ tends to zero, namely, $a_{n+1} \to 0$. In the limit, $\tilde{Q}$ passes into the interior of the $(n - 1)$-dimensional ellipsoid

$$
Q = \{X_1^2/a_1 + \cdots + X_n^2/a_n = 1\} \subset \mathbb{R}^n, \qquad \mathbb{R}^n = (X_1, \dots, X_n).
$$

The elliptic coordinates $\mu_1, \dots, \mu_n$ on $\tilde{Q}$ transform to elliptic coordinates in $\mathbb{R}^n$ giving

$$
X_j^2 = \frac{\prod_{l=1}^{n} (a_j - \mu_l)}{\prod_{k=1, k \neq j}^{n} (a_j - a_k)}, \qquad j = 1, \dots, n, \tag{4.1}
$$

which appear as the corresponding limits of (3.2).

Then the motion on $\tilde{Q}$ gets transformed into billiard motion inside the ellipsoid $Q$. Geodesics on $\tilde{Q}$ pass into straight line segments inside $Q$, whereas the points of intersection of the geodesics with the plane $\{X_{n+1} = 0\}$ are mapped into impact points on $Q$ with elastic reflection. Also, the motion on $\tilde{Q}$ under the Hooke force passes to the motion inside $Q$ under the action of the Hooke force with the potential $V = \sigma(X_1^2 + \cdots + X_n^2)/2$. However, in contrast to cases $\sigma = 0$ or $\sigma < 0$, for $\sigma > 0$ (an attracting Hooke potential), for the trajectory to reach $Q$ the total energy $h$ must be sufficiently large. Namely, there ought to exist a positive $\varepsilon$ such that inside $Q$ the following double inequality holds:

$$
h + \sigma(X_1^2 + \cdots + X_n^2)/2 > \varepsilon > 0.
$$

Under this condition, the motion on $\tilde{Q}$ transforms to billiard motion inside the ellipsoid $Q$ again having impacts and elastic reflections along $Q$. Thus, we have “an ellipsoidal billiard with the Hooke potential” which is described by the mapping $B : (x, v) \to (\tilde{x}, \tilde{v})$, where $x, v \in \mathbb{R}^n$ are the Cartesian coordinates of a point on $Q$ and the starting velocity vector respectively, while $(\tilde{x}, \tilde{v})$ are the coordinates and the starting velocity at

Complex Geometry of Piecewise Solutions

211

the next impact point. Following Fedorov [2001], the mapping has the form −1 [(σ − (v, a −1 v))x + 2(x, a −1 v)v], ν −1 v˜ = [(σ − (v, a −1 v))v − 2σ (x, a −1 v)x] + :a −1 x˜ ν −1 = [(σ − (v, a −1 v))(v + :a −1 x) + 2(x, a −1 v)(:a −1 v − σ x)], ν x˜ =

ν=

4σ (x, a −1 v)2 + (σ − (v, a −1 v))2 ,

:=

(4.2)

2(˜v, a −1 x˜ ) . (˜x, a −2 x˜ )

Notice that in the limit σ → 0 this reduces to a standard billiard mapping given in Veselov [1988]

\[ \tilde{x} = x - \frac{2(x, a^{-1}v)}{(v, a^{-1}v)}\,v, \qquad \tilde{v} = v + \frac{2(\tilde{v}, a^{-1}\tilde{x})}{(\tilde{x}, a^{-2}\tilde{x})}\,a^{-1}\tilde{x}. \]
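As a numerical illustration of (4.2), the following minimal Python sketch iterates the mapping for a diagonal matrix a. It can be used to check three algebraic consequences of the formulas: the image point stays on the ellipsoid (x, a^{-1}x) = 1, the quantity |v|^2 + σ|x|^2 is unchanged, and for σ = 0 the product (x, a^{-1}v) is preserved. The function names, the sample matrix entries and the starting data are assumptions made for illustration only. The reflection coefficient is evaluated here from the velocity before reflection, which is equivalent to the expression in (4.2) because an elastic reflection reverses the normal component of the velocity.

```python
import math


def dot(u, w):
    """Euclidean inner product of two equal-length sequences."""
    return sum(ui * wi for ui, wi in zip(u, w))


def hooke_billiard_step(x, v, a, sigma=0.0):
    """One step (x, v) -> (x_next, v_next) of the mapping (4.2).

    x     : impact point on the ellipsoid (x, a^{-1} x) = 1
    v     : velocity leaving x
    a     : diagonal entries a_1..a_n of the matrix a (illustrative input)
    sigma : Hooke constant; sigma = 0 gives the straight-line billiard
    """
    ainv_v = [vi / ai for vi, ai in zip(v, a)]
    A = dot(v, ainv_v)  # (v, a^{-1} v)
    B = dot(x, ainv_v)  # (x, a^{-1} v)
    nu = math.sqrt(4.0 * sigma * B * B + (sigma - A) ** 2)
    x_next = [-((sigma - A) * xi + 2.0 * B * vi) / nu for xi, vi in zip(x, v)]
    # Velocity on arrival at x_next, before the reflection term is added.
    w = [-((sigma - A) * vi - 2.0 * sigma * B * xi) / nu for xi, vi in zip(x, v)]
    # Elastic reflection along the normal direction n = a^{-1} x_next.
    # Since the normal component flips sign, 2(v_next, n)/(n, n) = -2(w, n)/(n, n).
    n = [xi / ai for xi, ai in zip(x_next, a)]
    coeff = -2.0 * dot(w, n) / dot(n, n)
    v_next = [wi + coeff * ni for wi, ni in zip(w, n)]
    return x_next, v_next


def iterate(x, v, a, sigma, steps):
    """Return the list of successive impact points (illustrative helper)."""
    points = [list(x)]
    for _ in range(steps):
        x, v = hooke_billiard_step(x, v, a, sigma)
        points.append(x)
    return points
```

For example, `iterate([2.0 * math.cos(0.3), math.sin(0.3)], [-1.0, 0.2], [4.0, 1.0], 0.0, 10)` produces eleven points on the ellipse X_1^2/4 + X_2^2 = 1.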

˜ with the higher order The mapping (4.2), as well as the billiard limits of the motion on Q potentials Vp (X1 , . . . , Xn , Xn+1 ) (Xn+1 = 0) are completely integrable. In the limit an+1 → 0 and after using the change of variable (2.26), the Abel–Jacobi equations (3.3) are transformed as follows: n µk i−1 µ dµ = φi = const, i = 1, . . . , n − 1, √ 2 ρ(µ) k=1 µ0 (4.3) n µk dµ = x1 + φn , √ 2µ ρ(µ) µ 0 k=1 ρ(µ) = −(µ − a1 ) · · · (µ − an ) [c0 (µ − c1 ) · · · (µ − cn−1 ) − σ µn ]. This system contains n−1 holomorphic differentials on the Riemann surface C = {w 2 = ρ(µ)} of genus g = n − 1 and one differential of the third kind having a pair of simple poles Q− , Q+ on C with µ(Q± ) = 0. Here again φ1 , . . . , φn are constant phases and c0 , . . . , cn−1 are constants of motion. The elliptic coordinates µ1 , . . . , µn represent the divisor of n points Pi = (µi , wi ) on C. Equations (4.3) describe a well defined mapping of the symmetric product C (g+1) to Jac(C, Q− , Q+ ), the (g + 1)-dimensional generalized Jacobian of the curve C with two distinguished points Q± . The later is obtained from the genus n curve w 2 = R(µ) in (3.3) as a result of confluence of two Weierstrass points (an+1 → 0) and regularization: cutting out the double point and gluing Q− , Q+ . The generalized Jacobian is a noncompact algebraic variety which is topologically equivalent to the product of the customary g-dimensional Jacobian variety Jac(C) with complex angle coordinates φ1 , . . . , φg and the cylinder C∗ = C \ {0} (for the definition and description of generalized Jacobians see, among others, Serre [1959], Previato [1985], Gavrilov [1999], and Fedorov [1999]). As follows from (4.3), the geodesic and the potential billiard motion parameterized by x1 is represented by a straight line flow on Jac(C, Q− , Q+ ), which is directed along the real section of C∗ and leaves the coordinates φ on Jac(C) invariant. 
As we shall see below, the solutions to the generalized inversion problem (4.3) have different structures, depending on whether R(µ) is an even or an odd order polynomial.
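The degree count behind this even/odd distinction can be checked directly from the formula for ρ(µ) in (4.3). The sketch below expands ρ(µ) for illustrative values of a_i, c_i and c_0 (these numbers are assumptions, not data from the text). It reports the degree, which should be 2n − 1 for σ = 0 and 2n for σ = 1, and the genus of w^2 = ρ(µ), computed with the standard hyperelliptic formula ⌊(deg − 1)/2⌋ for simple roots. The constant term gives ρ(0).

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def rho_coefficients(a, c, c0=1.0, sigma=0.0):
    """Coefficients (lowest degree first) of
    rho(mu) = -(mu-a_1)...(mu-a_n) [c0 (mu-c_1)...(mu-c_{n-1}) - sigma mu^n]."""
    n = len(a)
    if len(c) != n - 1:
        raise ValueError("expected n - 1 constants c_1..c_{n-1}")
    first = [1.0]
    for ai in a:
        first = poly_mul(first, [-ai, 1.0])
    second = [c0]
    for ci in c:
        second = poly_mul(second, [-ci, 1.0])
    second = second + [0.0]  # pad up to degree n
    second[n] -= sigma       # subtract sigma * mu^n
    return [-coef for coef in poly_mul(first, second)]


def degree(p, tol=1e-12):
    """Degree of a polynomial, ignoring numerically zero leading coefficients."""
    return max(i for i, coef in enumerate(p) if abs(coef) > tol)


def genus(p):
    """Genus of the hyperelliptic curve w^2 = p(mu), assuming simple roots."""
    return (degree(p) - 1) // 2
```

With a = [1, 2, 3] and c = [1.5, 2.5] (so n = 3), both σ = 0 and σ = 1 give genus 2 = n − 1, while the degree changes from 5 to 6.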


Solutions in terms of generalized theta-functions. First we concentrate on straight line billiards corresponding to the case σ = 0 when the curve C has one infinite point ∞. Fix a canonical basis of cycles A1, . . . , Ag, B1, . . . , Bg on C and let ω̄1, . . . , ω̄g be the dual basis of normalized holomorphic differentials on C and z1, . . . , zg be corresponding coordinates on the universal covering of Jac(C). There exists a unique g × g constant normalizing matrix D such that

\[ \bar\omega_k = \sum_{j=1}^{g} D_{kj}\, \frac{\mu^{j-1}\, d\mu}{\sqrt{\rho(\mu)}}, \qquad z_k = \sum_{j=1}^{g} D_{kj}\, \phi_j, \qquad k = 1, \dots, g = n-1. \tag{4.4} \]

Let us also introduce a normalized differential of the third kind Ω0 having simple poles at Q± with residues ±1 respectively:

\[ \Omega_0 = \frac{\sqrt{\rho(0)}\, d\mu}{\mu\sqrt{\rho(\mu)}} + \sum_{k=1}^{g} \beta_k \bar\omega_k, \qquad \sqrt{\rho(0)} = \sqrt{a_1 \cdots a_n \cdot c_1 \cdots c_{n-1}}, \tag{4.5} \]

where βk are unique constants such that Ω0 has zero A-periods on C. Then the last equation in (4.3) can be represented in the following form:

\[ \sum_{k=1}^{n} \int_{\mu_0}^{\mu_k} \Omega_0 = Z, \qquad Z = 2\sqrt{\rho(0)}\, x_1 + \text{const}. \tag{4.6} \]

Notice that in case of the ellipsoidal billiards √R(0) is always real and hence Z is also real. Let us also choose the base point (µ0, w0) of the mapping (4.3) to be an infinite point ∞ ∈ C. According to Fedorov [1999], the solution of the problem of inversion (4.3) together with (4.1) yields the following expressions for the Cartesian coordinates Xi of the point moving inside the ellipsoid Q:

\[
X_i(x_1, z) = \kappa_i\, \frac{e^{-Z/2}\,\theta[D + \eta_{(i)}](z - q/2) + e^{Z/2}\,\theta[D + \eta_{(i)}](z + q/2)}{e^{-Z/2}\,\theta[D](z - q/2) + e^{Z/2}\,\theta[D](z + q/2)}, \qquad i = 1, \dots, n, \tag{4.7}
\]
\[
z = (z_1, \dots, z_{n-1})^T, \quad Z = 2\sqrt{R(0)}\, x_1 + Z_0, \quad z, Z_0 = \text{const}, \quad q = 2\left(\int_{\infty}^{Q_+} \bar\omega_1, \dots, \int_{\infty}^{Q_+} \bar\omega_g\right)^T \in \mathbb{C}^g, \quad \kappa_i = \text{const}.
\]

These expressions involve quotients of generalized theta-functions, where θ[D + η(i)](z) and θ[D](z) are customary theta-functions associated with the Riemann surface C with appropriately chosen half-integer theta-characteristics η(i) (D is the half-integer theta-characteristic corresponding to the vector of Riemann’s constants). The vector q coincides with the vector of B-periods of the meromorphic differential Ω0. The constant factors κi depend on the parameters of the curve C only. (For the definition and properties of the generalized theta-functions see e.g., Belokolos et al. [1994], Gagnon et al. [1992], Ercolani [1987], and Fedorov [1999].)

The expressions (4.7) describe a straight line segment in Rn (Cn) with z playing a role of a constant phase vector which defines the position of the segment. When one of the µ-variables, say µ1, equals zero, the corresponding point P1 = (µ1, √R(µ1)) on


the curve C coincides with one of the poles Q−, Q+ of the differential Ω0. Then, as follows from the mapping (4.3) and (4.6), x1 and Z become infinite. On the other hand, in view of (4.1), at this moment the moving point in Rn meets the ellipsoid Q. It follows that as x1 and Z change from −∞ to ∞ along the real axis, the expressions (4.7) have finite limits, giving the coordinates of two subsequent impact points on Q. Notice that Xi(∞, z) have the same values as Xi(−∞, z + q). Hence the next segment of the billiard trajectory is given by (4.7) with z being replaced by z + q. This yields the following algebraic-geometrical description of the billiard motion (see also Fedorov [1999]).

Theorem 4.1. As the point mass inside Q approaches the ellipsoid, the point P1 on C tends to the pole Q+. At the moment of impact, P1 jumps from Q+ back to Q−, whereas the phase vector z is increased by q defined in formulas (4.7). The process repeats itself for each impact.

Using this property and by applying induction, from (4.7) the coordinates of the whole sequence of impact points are found in the form

\[ x_i(N) = \kappa_i\, \frac{\theta[D + \eta_{(i)}](z_0 + Nq)}{\theta[D](z_0 + Nq)}, \qquad i = 1, \dots, n, \tag{4.8} \]

where N ∈ N is the number of impacts and the phase vector z0 = (z_{10}, . . . , z_{g0})^T is the same for all the segments of the billiard trajectory. These expressions depend on customary theta-functions only and, as functions of z0, are meromorphic on a covering of the Jacobian variety of C. They have also been obtained by Veselov [1988] by using a factorization of matrix polynomials (see also Moser and Veselov [1991]). The work of Veselov is closely related to the discretization of mechanics that preserves the integrable structure. The numerical implementation of Veselov’s procedures was given in Wendlandt and Marsden [1997], a discrete reduction procedure in Marsden, Pekarsky and Shkoller [1999], Bobenko and Suris [1999] and an extension to PDE’s in Marsden, Patrick and Shkoller [1999].

The generalized Abel map (4.3) yields expressions in terms of generalized theta-functions for the elementary symmetric functions of the variables µ. In particular, following Fedorov [1999], one obtains

\[
\mu_1 \cdots \mu_n = \partial_{x_1} \partial_V \log \tilde\theta[D](z, Z)
= 2\sqrt{\rho(0)}\; \frac{\partial}{\partial Z}\, \frac{e^{-Z/2}\, \partial_V \theta[D](z - q/2) + e^{Z/2}\, \partial_V \theta[D](z + q/2)}{e^{-Z/2}\, \theta[D](z - q/2) + e^{Z/2}\, \theta[D](z + q/2)}, \tag{4.9}
\]

where

\[
\tilde\theta[D](z, Z) = e^{-Z/2}\, \theta[D](z - q/2) + e^{Z/2}\, \theta[D](z + q/2), \qquad Z = 2\sqrt{\rho(0)}\, x_1 + Z_0, \qquad
\partial_V = V_1 \frac{\partial}{\partial z_1} + \cdots + V_g \frac{\partial}{\partial z_g}, \tag{4.10}
\]

and where V is the last column of the normalizing matrix D defined in (4.4): V = (D_{1g}, . . . , D_{gg})^T. The phases z and Z0 are the same as in (4.7). As follows from (4.9),


for x1, Z → ±∞, the product µ1 · · · µn tends to zero, as expected. Taking the integral (2.26) with L0 = 1 yields

\[
x(x_1, z) = \int \mu_1 \cdots \mu_n\, dx_1 = \partial_V \log \tilde\theta(z, Z) + \text{const}
= \frac{e^{-Z/2}\, \partial_V \theta[D](z - q/2) + e^{Z/2}\, \partial_V \theta[D](z + q/2)}{e^{-Z/2}\, \theta[D](z - q/2) + e^{Z/2}\, \theta[D](z + q/2)} + \text{const}. \tag{4.11}
\]

It follows from this expression that the original parameter x has finite values as x1 → ±∞ and x(∞, z) has the same value as x(−∞, z + q). Now, substituting in (4.11) Z = −∞, Z = ∞, by induction, we find the length of the N th segment of the billiard trajectory in the form

\[ x(N) - x(N-1) = \frac{\partial_V \theta[D](z_0 + Nq)}{\theta[D](z_0 + Nq)} - \frac{\partial_V \theta[D](z_0 + Nq - q)}{\theta[D](z_0 + Nq - q)}, \qquad N \in \mathbb{N}, \tag{4.12} \]

z0 being the same as in (4.8). As a result, the solution Xi(x), x ∈ R, of the continuous geodesic billiard problem should be viewed as consisting of an infinite number of pieces each parameterized by x1 ∈ (−∞, ∞) and given by (4.7) and (4.11). These pieces are obtained by iteratively adding vector q to the phase z in (4.7) and (4.11) and they are glued together at the impact points corresponding to x1 = ±∞.

Now we turn to the ellipsoidal billiard with the Hooke potential (σ = 1). In this case the curve C appearing in (4.3) has 2 infinite points at ±∞. We again introduce normalized differentials ω̄k, Ω0, and coordinates zk, Z according to (4.4) and (4.6). Let the base point of the mapping (4.3) be one of the Weierstrass points of C, say µ0 = an. Then, instead of (4.7), the inversion of the generalized mapping (4.3) yields the following expressions for the squares of the Cartesian coordinates of the mass point moving inside an ellipsoid Q:

\[
X_i^2(x_1, z) = \kappa_i\, \frac{\tilde\theta^2[D + \eta_{(i)}](z, Z)}{\tilde\theta[D](z - \hat q/2,\, Z - \hat S/2)\; \tilde\theta[D](z + \hat q/2,\, Z + \hat S/2)}, \qquad i = 1, \dots, n, \tag{4.13}
\]
\[
z = (z_1, \dots, z_{n-1})^T, \qquad Z = 2\sqrt{\rho(0)}\, x_1 + Z_0, \qquad z, Z_0 = \text{const},
\]
\[
q = \int_{Q_-}^{Q_+} (\bar\omega_1, \dots, \bar\omega_g)^T, \qquad \hat q = 2\int_{a_n}^{\infty_+} (\bar\omega_1, \dots, \bar\omega_g)^T, \qquad \hat S = \int_{\infty_-}^{\infty_+} \Omega_0,
\]

where θ̃[D](z, Z) is defined in (4.10) and

\[ \tilde\theta[D + \eta_{(i)}](z, Z) = e^{-Z/2}\, \theta[D + \eta_{(i)}](z - q/2) + e^{Z/2}\, \theta[D + \eta_{(i)}](z + q/2). \tag{4.14} \]

Here κi are constants, and √ρ(0) is the same as in (4.5). Similarly to (4.7), as x1 and Z pass from −∞ to ∞, X_i^2(x1, z) tend to finite values resulting in the squares of the coordinates of subsequent impact points on Q. Thus, expressions (4.13) describe a segment of trajectory of the billiard in the field of the Hooke potential between two


impacts. After each impact the phase vector z changes according to Theorem 4.1. Then, by using induction, the sequence of impact points is described as follows:

\[
x_i^2(N) = \kappa_i\, \frac{\theta^2[D + \eta_{(i)}](z_0 + Nq)}{\theta[D](z_0 - \hat q + Nq)\; \theta[D](z_0 + \hat q + Nq)}, \qquad N \in \mathbb{N}, \quad i = 1, \dots, n, \qquad z_0 = (z_{10}, \dots, z_{g0})^T = \text{const}. \tag{4.15}
\]

Apparently, this theta-functional solution for the billiard with the Hooke potential was not previously known. Lastly, we find the following expression for x:

\[
x(x_1, z) = \text{const} + \log \frac{\tilde\theta[D](z - \hat q/2,\, Z - \hat S/2)}{\tilde\theta[D](z + \hat q/2,\, Z + \hat S/2)}, \qquad Z = 2\sqrt{\rho(0)}\, x_1 + \text{const}, \tag{4.16}
\]

which, for x1 → ±∞ and Z → ±∞, has finite limits determining x for two subsequent impacts. Then, using the expression (4.10), by induction, we express an x-interval between the impacts in terms of the customary theta-function:

\[
x(N) - x(N-1) = \log \frac{\theta[D](z_0 - \hat q/2 + Nq)}{\theta[D](z_0 + \hat q/2 + Nq)} - \log \frac{\theta[D](z_0 - \hat q/2 + Nq - q)}{\theta[D](z_0 + \hat q/2 + Nq - q)} - \log \hat S. \tag{4.17}
\]

We emphasize that, in contrast to the geodesic billiard, for the billiard in the potential field the “time” x is not proportional to the length of a trajectory.

Stationary finite-gap peaked solutions. Now we return to the finite-gap solutions of equations (HD) and (SW). Notice that under the limit m1 → 0 the mapping (2.28) takes the form (4.3) with ρ(µ) being a polynomial of degree 2n − 1 and 2n respectively. The trace formula (2.22) and relations (4.1) yield

\[ U = \sum_{j=1}^{n} X_j^2 + \sum_{i=1}^{n} a_i + m. \]

Then solutions to the billiard problems (4.7)–(4.17) provide solutions U(x, t0) for the above equations which consist of infinite sequences of smooth pieces each one corresponding to a segment between two impacts. The impacts themselves give peaks of U(x, t0). This leads to the following theorem.

Theorem 4.2. 1) At any fixed time t = t0, the finite-gap peaked solution of the equation (HD) consists of an infinite number of pieces UN(x, t0), N ∈ Z, glued at peak points. Let ρ(µ) be any polynomial with distinct roots a1, . . . , an. Then, for any N, every piece is given by the following pair of theta-functional expressions parameterized by x1 ∈ R,

\[ U_N = \sum_{j=1}^{n} X_j^2(x_1, z_N) + \sum_{i=1}^{n} a_i, \tag{4.18} \]
\[
x(x_1, z) = \frac{e^{-Z/2}\, \partial_V \theta[D](z_N - q/2) + e^{Z/2}\, \partial_V \theta[D](z_N + q/2)}{e^{-Z/2}\, \theta[D](z_N - q/2) + e^{Z/2}\, \theta[D](z_N + q/2)} + x_0, \qquad z_N = z_0 + Nq \in \mathbb{C}^{n-1}, \quad Z = 2\sqrt{\rho(0)}\, x_1 + Z_0, \tag{4.19}
\]


where X_j^2(x1, z) and q are given by (4.7) and z0, Z0, x0 are constant phases of the solution depending on t0, which are the same for any piece. The length of the N th piece equals

\[ \frac{\partial_V \theta[D](z_0 + Nq)}{\theta[D](z_0 + Nq)} - \frac{\partial_V \theta[D](z_0 + Nq - q)}{\theta[D](z_0 + Nq - q)}. \tag{4.20} \]

2) At any fixed time t = t0, the finite-gap peaked solution to equation (SW) consists of an infinite number of pieces UN(x, t0), N ∈ Z, which are glued at peak points. The pieces are given in the following parametric form

\[ U_N = \sum_{j=1}^{n} X_j^2(x_1, z_N) + \sum_{i=1}^{n} a_i + m, \qquad z_N = z_0 + Nq \in \mathbb{C}^{n-1}, \tag{4.21} \]
\[
x(x_1, t_0) = \log \frac{\tilde\theta[D](z_N - \hat q/2,\, Z - \hat S/2)}{\tilde\theta[D](z_N + \hat q/2,\, Z + \hat S/2)} + x_0, \qquad Z = 2\sqrt{\rho(0)}\, x_1 + Z_0, \tag{4.22}
\]

where X_j^2(x1, z) are given by (4.13) and z0, Z0, x0 are constant phases which depend on t0. The x-length of the N th piece equals

\[
\log \frac{\theta[D](z_0 - \hat q/2 + Nq)}{\theta[D](z_0 + \hat q/2 + Nq)} - \log \frac{\theta[D](z_0 - \hat q/2 + Nq - q)}{\theta[D](z_0 + \hat q/2 + Nq - q)} - \log \hat S. \tag{4.23}
\]
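The piece lengths (4.20) and (4.23) are built from theta-functions evaluated along the arithmetic sequence z0 + Nq. The following genus-one Python sketch shows how such quantities can be evaluated from a truncated series. It is an illustration under stated assumptions, not the computation used in the text: the series convention, the zero characteristic standing in for [D], the derivative direction V = 1, and the sample values of τ, z0 and q are all hypothetical choices, since the normalization of periods is not restated here.

```python
import cmath
import math


def theta_char(z, tau, alpha=0.0, beta=0.0, terms=12):
    """Genus-one theta-function with characteristics [alpha, beta], truncated
    to |m| <= terms. Convention assumed for this sketch:
    sum_m exp(i*pi*tau*(m+alpha)^2 + 2*pi*i*(m+alpha)*(z+beta))."""
    total = 0j
    for m in range(-terms, terms + 1):
        s = m + alpha
        total += cmath.exp(1j * math.pi * tau * s * s
                           + 2j * math.pi * s * (z + beta))
    return total


def dtheta_char(z, tau, alpha=0.0, beta=0.0, terms=12):
    """Termwise derivative of theta_char with respect to z."""
    total = 0j
    for m in range(-terms, terms + 1):
        s = m + alpha
        total += 2j * math.pi * s * cmath.exp(
            1j * math.pi * tau * s * s + 2j * math.pi * s * (z + beta))
    return total


def log_derivative(z, tau):
    """theta'(z) / theta(z) for the zero characteristic (assumed stand-in for [D])."""
    return dtheta_char(z, tau) / theta_char(z, tau)


def piece_length(N, z0, q, tau):
    """Right-hand side of (4.20) for g = 1 with the direction V set to 1."""
    return log_derivative(z0 + N * q, tau) - log_derivative(z0 + (N - 1) * q, tau)
```

Summing `piece_length(N, z0, q, tau)` over N = 1, . . . , M telescopes to the difference of the logarithmic derivative at z0 + Mq and at z0, which mirrors the way successive pieces are glued at the impact points.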

When in the polynomials (2.13) or (2.14) m1 = 0 and m2 tends to zero, the distance between subsequent peaks of a profile tends to zero and in the limit the peaks coalesce. (Notice that this is done for a fixed t.) The solution U(x, t0) for this limiting case is smooth.

Remark. It is known (see, for instance, Fedorov [1999]) that there are special degenerate umbilic billiard solutions of the classical billiard problem (without a potential) that have straight line segments meeting n − 1 fixed focal conics of Q between any subsequent impacts and, as x → ±∞, the billiard motion converges to simple oscillations along the largest axis of the ellipsoid. This corresponds to the confluence of the roots of the polynomial ρ(µ) in (4.3),

\[ c_1 = a_1, \quad \dots, \quad c_{n-1} = a_{n-1}. \]

As a result, the hyperelliptic curve C becomes singular of arithmetic genus zero and the asymptotic billiard motion is described in terms of tau-functions. The corresponding asymptotic peaked solutions of equations (HD) and (SW) are given in Alber and Fedorov [2001].

Time-dependent piecewise-meromorphic solutions. Now we pass to global algebraic geometrical description of the finite-gap peaked solutions. After setting m1 → 0, the system (2.25) is formally reduced to the following Abel–Jacobi mapping:

\[
\int_{\mu_0}^{\mu_1} \frac{\mu^{k-1}\, d\mu}{2\sqrt{\rho(\mu)}} + \cdots + \int_{\mu_0}^{\mu_n} \frac{\mu^{k-1}\, d\mu}{2\sqrt{\rho(\mu)}} =
\begin{cases} t_k + \phi_k, & k = 1, \dots, n-1, \\ x + \phi_n, & k = n, \end{cases} \tag{4.24}
\]

where

\[ \rho(\mu) = -L_0^2 \prod_{r=2}^{2n} (\mu - m_r) \qquad \text{and} \qquad \rho(\mu) = \prod_{r=2}^{2n+1} (\mu - m_r) \]

in the case of equations (HD) or (SW) respectively. Here φ1, . . . , φn are constant phases. This system contains n − 1 independent holomorphic differentials defined on the genus g = n − 1 Riemann surface {w^2 = ρ(µ)}, which can be identified with the curve C described above. However, in contrast to the system (4.3), in the case of a polynomial ρ(µ) of odd order which corresponds to equation (HD), the last equation in (4.24) contains a meromorphic differential of the second kind having a double pole at the infinite point ∞ on C. In case of a polynomial ρ(µ) of even order corresponding to equation (SW), the last equation includes a meromorphic differential of the third kind with a pair of simple poles at the infinite points ∞−, ∞+ on C.

According to Clebsch and Gordon [1866] and Gavrilov [1999], in the odd order case, such a system describes a well defined and invertible mapping of the symmetric product C^{(g+1)} to Jac(C, ∞), the generalized Jacobian of the curve C with one distinguished point at ∞. The set Jac(C, ∞) is a noncompact algebraic variety which is topologically equivalent to the product Jac(C) × C. To describe this case we introduce a normalized differential of second kind having a double pole at ∞,

\[ \Omega_\infty^{(1)} = \frac{\sqrt{-1}\,L_0\, \mu^g\, d\mu}{2\sqrt{\rho(\mu)}} + \sum_{k=1}^{g} d_k \bar\omega_k, \qquad g = n-1, \tag{4.25} \]

where ω̄k are the normalized holomorphic differentials specified in (4.4), and dk are normalizing constants such that all A-periods of Ω_∞^{(1)} on C are zeros. Then the last equation in (4.24) implies that

\[
\sum_{i=1}^{n} \int_{\mu_0}^{\mu_i} \Omega_\infty^{(1)} = Z, \qquad Z = \sqrt{-1}\,L_0\, x + (d, Dt) + \text{const}, \qquad d = (d_1, \dots, d_{n-1})^T, \quad t = (t_n, \dots, t_2)^T, \tag{4.26}
\]

where D is an (n − 1) × (n − 1) normalizing matrix defined in (4.4). Since ∞ now is a pole of Ω_∞^{(1)}, we choose the basepoint P0 = (µ0, w0) to be a finite Weierstrass point on C. For concreteness we choose P0 = (m2n, 0). Applying the residue theorem to the generalized theta-function associated with Jac(C, ∞) we solve the inversion problem (4.24) and find the following expression:

\[
\sum_{i=1}^{n} \mu_i = C_1 - Z^2 + \frac{2Z\,\partial_V \theta[D + \eta_{2n}](z) - \partial_V^2 \theta[D + \eta_{2n}](z)}{\theta[D + \eta_{2n}](z)}, \tag{4.27}
\]
\[
Z = \sqrt{-1}\,L_0\, x + (d, Dt) + Z_0, \qquad z = Dt + z_0 \in \mathbb{C}^{n-1}, \qquad C_1 = \sum_{k=1}^{g} \oint_{A_k} \mu\, \bar\omega_k + m_{2n}, \qquad Z_0, z_0 = \text{const},
\]

where the half-integer characteristic η2n labels the point (m2n, 0), the vector V = (D_{1g}, . . . , D_{gg})^T is specified in (4.4), and the constant C1 contains the sum of integrals along the canonical cycles A1, . . . , Ag on C. Notice that in the above formula


∂V = ∂t2. Expression (4.27) is meromorphic in x and t1, . . . , tn−1 and can be regarded as a generalization of the Matveev–Its formula to the case of the noncompact variety Jac(C, ∞).

In the case of an even order curve C, corresponding to finite-gap peaked solutions of equation (SW), system (4.24) defines a mapping of the symmetric product C^{(g+1)} to the generalized Jacobian Jac(C, ∞±) which is topologically equivalent to the product Jac(C) × C∗. As above, we set P0 to be the last Weierstrass point (m2n+1, 0) and introduce the normalized differential of the third kind having a pair of simple poles at ∞−, ∞+ ∈ C, as well as the corresponding variable Z:

\[
\Omega_{\infty_\pm} = \frac{\mu^g\, d\mu}{2\sqrt{\rho(\mu)}} + \sum_{k=1}^{g} \bar d_k \bar\omega_k, \qquad Z = \sum_{i=1}^{n} \int_{m_{2n+1}}^{\mu_i} \Omega_{\infty_\pm}, \tag{4.28}
\]

where (d̄1, . . . , d̄g) = d̄ are chosen such that all the A-periods of Ω∞± are zeros. Then, applying the residue theorem to the generalized theta-function associated with Jac(C, ∞±) yields

\[
\sum_{i=1}^{n} \mu_i + m = \text{const} - \frac{e^{-Z}\,\theta[D](z - \hat q) + e^{Z}\,\theta[D](z + \hat q)}{\theta[D](z)}, \tag{4.29}
\]

where, in view of (4.28),

\[
Z = x + (\bar d, Dt) + Z_0, \qquad z = Dt + z_0 \in \mathbb{C}^g, \qquad \hat q = \left( \int_{\infty_-}^{\infty_+} \bar\omega_1, \dots, \int_{\infty_-}^{\infty_+} \bar\omega_g \right)^T \in \mathbb{C}^g, \qquad Z_0, z_0 = \text{const}. \tag{4.30}
\]

Remark. According to the formula (2.22), expressions (4.27) and (4.29) describe formal solutions to equations (HD) and (SW) respectively. However, while treating these solutions, one needs to take into account the reflection phenomenon described in Theorem 4.1. Namely, when a certain variable µi passes zero, the point Pi = (µi, √ρ(µi)) jumps from one sheet of the Riemann surface C to another or, in other words, from the pole Q+ of the differential of the third kind Ω0 to another pole Q−. Therefore, the above expressions do not provide global solutions to the equations. Instead, the following theorem holds.

Theorem 4.3. 1) The time-dependent finite-gap peaked solution U(x, t) of (HD) consists of an infinite number of pieces in Rn = (t1, . . . , tn−1, x) described by meromorphic functions

\[
U_N(x, t) = C_1 - Z_N^2 + \frac{2Z_N\,\partial_V \theta[D + \eta_{2n}](z_N) - \partial_V^2 \theta[D + \eta_{2n}](z_N)}{\theta[D + \eta_{2n}](z_N)}, \qquad N \in \mathbb{Z}, \tag{4.31}
\]
\[
z_N = Dt + Nq + z_0, \qquad Z_N = \sqrt{-1}\,L_0\, x + (d, z_N) + Nh + Z_0, \qquad Z_0, z_0 = \text{const}, \qquad t = (t_1, \dots, t_{n-1}),
\]
\[
h = \int_{Q_-}^{Q_+} \Omega_\infty^{(1)}, \qquad q = \left( \int_{Q_-}^{Q_+} \bar\omega_1, \dots, \int_{Q_-}^{Q_+} \bar\omega_g \right)^T,
\]

where C1 is the constant specified in (4.27).


For a fixed N the corresponding piece UN(x, t) is bounded by nonintersecting surfaces SN−1 and SN in Rn given by equations

\[
S_N = \{x = p_N(t)\}, \qquad p_N(t) = \frac{1}{\sqrt{-1}\,L_0} \left( \partial_V \log \theta[D + \eta_{2n}](z_N + q/2) - (d, z_N) - Nh \right). \tag{4.32}
\]

The adjacent pieces UN(x, t) and UN+1(x, t) are thus glued to each other along SN, where

\[ U(p_N(t), t) = C_1 - \partial_V^2 \log \theta[D + \eta_{2n}](z_N + q/2). \tag{4.33} \]

2) The finite-gap peaked solution U(x, t) of (SW) consists of an infinite number of pieces given by meromorphic functions

\[
U_N(x, t) = \text{const} - \frac{e^{-Z_N}\,\theta[D](z_N - \hat q) + e^{Z_N}\,\theta[D](z_N + \hat q)}{\theta[D](z_N)}, \qquad N \in \mathbb{Z}, \tag{4.34}
\]
\[
z_N = Dt + qN + z_0, \qquad Z_N = x + (\bar d, z_N) + N\bar h + Z_0, \qquad \bar h = \int_{Q_-}^{Q_+} \Omega_{\infty_\pm}, \qquad t = (t_n, \dots, t_2), \qquad Z_0, z_0 = \text{const},
\]

where the vector q̂ is described in (4.30). The piece UN(x, t) is bounded by peak surfaces S̄N−1 and S̄N defined as follows:

\[
\bar S_N = \{x = \bar p_N(t)\}, \qquad \bar p_N(t) = \text{const} - \log \frac{\theta[D](z_N - \hat q + q/2)}{\theta[D](z_N + \hat q + q/2)}. \tag{4.35}
\]

The adjacent pieces UN(x, t) and UN+1(x, t) are glued together along S̄N, where

\[ U(p_N(t), t) = \text{const} - \partial_V \log \frac{\theta[D](z_N - \hat q)}{\theta[D](z_N + \hat q)}. \tag{4.36} \]

Notice that along the peak surfaces, the solutions described in 1) and 2) have discontinuous partial derivatives with respect to x and t1, . . . , tn−1.

Remark. By fixing all the times but tk in the above expressions, one obtains 2-dimensional piecewise solutions UN(x, tk), whereas the corresponding sections of SN, S̄N ⊂ (x, t) = Rn describe peak lines in the (x, tk)-plane. As follows from (4.32) and (4.35), the motion of the N th peak pN(tk) along the x-axis is described by a sum of a linear function in tk and a quasi-periodic one. The latter function becomes periodic in the case g = 1. Finally, after fixing all the times without exception, expressions (4.31) and (4.34) provide pieces of the stationary finite-gap peaked solution already described in Theorem 4.2.

Proof of Theorem 4.3. According to Theorem 4.2, the profiles of finite-gap peaked solutions are associated with geodesic ellipsoidal billiards and billiards in the field of a Hooke potential. An impact point on the boundary of a billiard trajectory corresponds to a peak of the profile U(x, t0), and this happens when one of the µi passes zero. Hence, the solution (4.27) is valid until one of the points P1, . . . , Pn on C coincides with Q− or Q+, the poles of the differential Ω0 in (4.5). Putting, for example, Pn ≡ Q+ (µn ≡ 0)


in (4.24), one arrives at the following relations involving the normalized differentials defined in (4.4) and (4.25):

\[ \sum_{i=1}^{g} \int_{P_0}^{\mu_i} \bar\omega_k = z_k - q_k/2, \qquad k = 1, \dots, g, \tag{4.37} \]
\[
\sum_{i=1}^{g} \int_{P_0}^{\mu_i} \Big( \Omega_\infty^{(1)} - \sum_{k=1}^{g} d_k \bar\omega_k \Big) = \sqrt{-1}\,L_0\, x - \int_{P_0}^{Q_+} \Big( \Omega_\infty^{(1)} - \sum_{k=1}^{g} d_k \bar\omega_k \Big), \tag{4.38}
\]

where P0 = (m2n, 0). Notice that Eqs. (4.37) form a closed system for the variables µ1, . . . , µn−1 and describe the standard Abel–Jacobi mapping C^{(g)} → Jac(C). Hence, the first symmetric polynomial has the following standard form in terms of theta-functions in the odd order case:

\[
\mu_1 + \cdots + \mu_{n-1} = c_1 - \partial_V^2 \log \theta[D + \eta_{2n}](z - q/2), \qquad c_1 = \sum_{k=1}^{g} \oint_{A_k} \bar\omega_k. \tag{4.39}
\]

On the other hand, Eq. (4.38) implies that at a peak point the coordinate x becomes a function of z and therefore of t: x = p0(t). In the odd order case, this equation contains a sum of Abelian integrals of the second kind, the so-called Abelian transcendent. By making use of the following standard expression for the normalized transcendent (Clebsch and Gordon [1866])

\[
\sum_{i=1}^{g} \int_{\mu_0}^{\mu_i} \Omega_\infty^{(1)} = -\partial_V \log \theta[D + \eta_{2n}]\Big( \sum_{i=1}^{g} \int_{P_0}^{\mu_i} \bar\omega_k \Big),
\]

from (4.37) and (4.38) we find

\[
p_0(t) = \frac{1}{\sqrt{-1}\,L_0} \left( \partial_V \log \theta[D + \eta_{2n}](z - q/2) - (d, z) + h/2 \right), \qquad h = \int_{Q_-}^{Q_+} \Omega_\infty^{(1)}. \tag{4.40}
\]

Using the trace formula for the solution U(p0(t), t) = µ1 + · · · + µn−1 and expression (4.39) it follows that the equation x = p0(t) determines a surface S0 in Cn along which the solution U has a peak. Now setting in (4.24) Pn ≡ Q− and taking into account (4.4), (4.25) and

\[
\int_{P_0}^{Q_-} \Omega_\infty^{(1)} = -\int_{P_0}^{Q_+} \Omega_\infty^{(1)}, \qquad \int_{P_0}^{Q_-} \bar\omega = -\int_{P_0}^{Q_+} \bar\omega,
\]

we obtain an expression for another peak surface S1 determined by the equation {x = p1(t)} with

\[
p_1(t) = \frac{1}{\sqrt{-1}\,L_0} \left( \partial_V \log \theta[D + \eta_{2n}](z + q/2) - (d, z) - h/2 \right), \tag{4.41}
\]

along which

\[ U(p_1(t), t) = \mu_1 + \cdots + \mu_{n-1} = C_1 - \partial_V^2 \log \theta[D + \eta_{2n}](z + q/2). \tag{4.42} \]


Under the reality condition, the surfaces S0 and S1 do not intersect and therefore determine a connected domain in Cn = (x, t) where the solution (4.27) is applicable. We denote this piece of solution as U1(x, t). As follows from (4.40) and (4.41), S1 is obtained from S0 by changing the phase as follows:

\[
Z \to Z + h, \quad z \to z + q, \qquad \text{that is} \qquad x \to x + \frac{1}{\sqrt{-1}\,L_0}\,(h - (d, Dt)), \quad t \to t + D^{-1} q. \tag{4.43}
\]

In addition, according to (4.42) and (4.39), at any two points on S0 and S1 which are equivalent modulo the shift, U1(x, t) takes the same values:

\[
U_1(p_1(t), t) = U_1\Big( p_0(t) + \frac{1}{\sqrt{-1}\,L_0}\,(h - (d, Dt)),\; t + D^{-1} q \Big). \tag{4.44}
\]

Now let us define the function U2(x, t) = U1(x + (h − (d, Dt))/(√−1 L0), t + D^{-1}q), which is also a local solution to (HD). In view of (4.44), U1 and U2 take the same values along S1, which ensures a correct gluing of two pieces together. By using iteration with respect to both positive and negative N, we construct a complete sequence of peak surfaces and obtain formulae given in part 1) of the theorem.

Similarly, solution (4.29) of (SW) is valid until one of the points P1, . . . , Pn on C coincides with Q− or Q+, the poles of Ω0. Setting Pn ≡ Q+ in (4.24) for the case of an even order curve C, and using (4.4) and (4.28) yields

\[ \sum_{i=1}^{n-1} \int_{P_0}^{\mu_i} \bar\omega_k = z_k - q_k/2, \qquad k = 1, \dots, n-1, \tag{4.45} \]
\[
\sum_{i=1}^{n-1} \int_{P_0}^{\mu_i} \Big( \Omega_{\infty_\pm} - \sum_{k=1}^{n-1} \bar d_k \bar\omega_k \Big) = x - \int_{P_0}^{Q_+} \Big( \Omega_{\infty_\pm} - \sum_{k=1}^{n-1} \bar d_k \bar\omega_k \Big), \tag{4.46}
\]

where P0 = (m2n+1, 0). Inverting (4.45) results in the following expression for a symmetric polynomial (see e.g., Clebsch and Gordon [1866])

\[
\mu_1 + \cdots + \mu_{n-1} = \text{const} - \partial_V \log \frac{\theta[D](z - \hat q - q/2)}{\theta[D](z + \hat q - q/2)}. \tag{4.47}
\]

After applying the theta-functional formula for the normalized transcendent of the third kind (Clebsch and Gordon [1866]),

\[
\sum_{i=1}^{g} \int_{P_0}^{\mu_i} \Omega_{\infty_\pm} = \text{const} - \log \frac{\theta[D](s - \hat q)}{\theta[D](s + \hat q)}, \qquad s = \sum_{i=1}^{g} \int_{P_0}^{\mu_i} \bar\omega, \qquad g = n - 1,
\]

from (4.46) and (4.45) we obtain

\[
x = p_0(t) = \text{const} - \log \frac{\theta[D](z - \hat q + q/2)}{\theta[D](z + \hat q + q/2)} - (z, \bar d) + \bar h/2. \tag{4.48}
\]

By choosing Pn ≡ Q− in (4.24), one arrives at the expressions (4.47) and (4.48) with q/2, h̄/2 replaced by −q/2, −h̄/2. Then, following similar arguments and applying induction, the piecewise solution of part 2) is constructed.


We emphasize that although the different pieces UN(x, t) of the solution are obtained by iteratively shifting the phases z, Z by the same vector, the pieces UN(x, t0) of the solution (t0 being fixed) are all distinct because the shift occurs in both x- and t-directions.

Remark. If we omit the reality condition above, then the hypersurfaces SN, S̄N in Cn intersect, bounding a set of n-dimensional domains adjacent to each other in a rather complicated manner. Then the procedure of gluing different pieces of the functions UN(x, t) meromorphic inside each domain cannot be defined uniquely. As a result, the generic complex solution U(x, t) branches along the peak surfaces.

5. Kinematics of Peaks

Now we obtain expressions for the velocity of the N th peak pN(t) of the piecewise solution of (HD) with respect to time tk. As was shown above, the solution has a peak when one of the µ-variables passes zero implying that Pn = Q− or Pn = Q+.

Theorem 5.1. Let y1, . . . , yn−1 denote the µ-coordinates of the points P1, . . . , Pn−1 at the moment in time when one of the µ-variables passes zero. The following system of equations holds:

\[ \frac{\partial p_N(t)}{\partial t_k} = -\mathcal{S}_{k-1}(y_1, \dots, y_{n-1}), \tag{5.1} \]

where $\mathcal{S}_k$ is the k-th symmetric function of y1, . . . , yn−1. In particular, we have

\[ \frac{\partial p_N(t)}{\partial t_2} = y_1 + \cdots + y_{n-1} = U(p_N(t), t), \tag{5.2} \]

i.e., the t2-velocity of the peak coincides with its height.

Proof. After applying the limit m1 → 0, Eqs. (2.12) and (2.23) for the derivatives of µn take the form

\[ \frac{\partial \mu_n}{\partial x} = \frac{\sqrt{\rho(\mu_n)}}{(\mu_n - \mu_1) \cdots (\mu_n - \mu_{n-1})}, \tag{5.3} \]
\[ \frac{\partial \mu_n}{\partial t_k} = \mathcal{S}_{k-1}(\mu_1, \dots, \mu_{n-1})\, \frac{\sqrt{\rho(\mu_n)}}{(\mu_n - \mu_1) \cdots (\mu_n - \mu_{n-1})}. \tag{5.4} \]

On the other hand, along the peak line {x = pN(tk)}, we have

\[ \frac{d}{dt}\, \mu_n(p_N(t_k), t_k) \equiv \frac{\partial \mu_n}{\partial x}\, \frac{d\, p_N(t_k)}{d\, t_k} + \frac{\partial \mu_n}{\partial t_k} = 0, \]

which, in view of (5.3) and (5.4) and after setting µn ≡ 0, yields

\[ \frac{\sqrt{\rho(0)}}{\mu_1 \cdots \mu_{n-1}} \left[ \frac{\partial p_N}{\partial t_k} + \mathcal{S}_{k-1}(y_1, \dots, y_{n-1}) \right] = 0. \]

Since ρ(0) ≠ 0 and µ1 = y1, . . . , µn−1 = yn−1 are finite, the latter relation gives (5.1).
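The velocities in (5.1) can be evaluated directly once the values y1, . . . , yn−1 are known. The sketch below assumes the sign convention stated in the Remark that follows, namely that the symmetric function of index k − 1 appearing in (5.1) is the coefficient of λ^{n−k} in (λ − y1) · · · (λ − yn−1). The function names and sample values are illustrative assumptions. Under this convention the k = 2 velocity reduces to y1 + · · · + yn−1, in agreement with (5.2).

```python
def monic_poly_from_roots(roots):
    """Coefficients of prod_i (lam - y_i), highest degree first."""
    coeffs = [1.0]
    for y in roots:
        # Multiply the current polynomial by (lam - y).
        new = coeffs + [0.0]
        for i, c in enumerate(coeffs):
            new[i + 1] -= y * c
        coeffs = new
    return coeffs


def signed_symmetric(roots, j):
    """Coefficient of lam^(g-j) in prod_{i=1}^{g} (lam - y_i),
    i.e. (-1)^j times the j-th elementary symmetric polynomial."""
    return monic_poly_from_roots(roots)[j]


def peak_velocity(roots, k):
    """Right-hand side of (5.1): dp_N/dt_k = -S_{k-1}(y_1, ..., y_{n-1})."""
    return -signed_symmetric(roots, k - 1)
```

For instance, with the hypothetical values y = [0.5, 1.5, 2.0], `peak_velocity(y, 2)` returns their sum, 4.0.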


Remark. The relations (5.1) can also be found by using direct differentiation of the expression for the N th peak surface (4.35) with respect to tk. Namely, putting without loss of generality N = 0, and taking into account ∂V = ∂t2, we write

\[
\frac{\partial p_0(t)}{\partial t_k} = \partial_{t_k} \partial_{t_2} \log \theta[D + \eta_{2n}]\Big( \sum_{i=1}^{n-1} \int_{P_0}^{\mu_i} \bar\omega \Big). \tag{5.5}
\]

According to Mumford [1983], in case of odd order hyperelliptic curves, this gives a theta-functional expression for the coefficient in front of λ^{n−k} in the polynomial (λ − µ1) · · · (λ − µg) which coincides with $\mathcal{S}_{k-1}(y_1, \dots, y_{n-1})$.

6. The Dynamics of Peaks and Weak Solutions

Expression (5.2) states that for equations in the hierarchies of (HD) or (SW), every peak in the solution profile moves with velocity determined by the local value of the solution. In this section, we derive this property without recourse to tools related to the complete integrability of the evolution equation. Thus, this property of peak motion can hold in general for equations that admit piecewise-smooth weak solutions, with jumps in the first spatial derivative at isolated points in the solution’s support. In this case, the derivative discontinuity can be viewed as a “shock” in the appropriate weak form of the evolution equation.

We will take the weak form of the equation (HD) or (SW) to be

\[ \int_{\Omega} \nabla\phi(x, t) \cdot \mathbf{V}(x, t)\, dx\, dt = 0, \tag{6.1} \]

where the equality is satisfied for all test functions φ(x, t) that are C^∞ with compact support in a domain Ω in the (x, t) plane. Here ∇φ = (φt, φx), the dot denotes the R2 inner product, and the vector function V(x, t) = (V1, V2) is defined by

\[
V_1 = U_x, \qquad V_2 = \partial_x \left[ \frac{1}{2} U^2 - \frac{1}{4} \int_{-\infty}^{\infty} |x - y|\, (U_y^2 - 2\kappa U)\, dy \right], \tag{6.2}
\]

for equation (HD) and

\[
V_1 = U_x, \qquad V_2 = \partial_x \left[ \frac{1}{2} U^2 + \frac{1}{4} \int_{-\infty}^{\infty} e^{-|x - y|} \left( 2U^2 + U_y^2 - 2\kappa U \right) dy \right], \tag{6.3}
\]

for equation (SW), respectively. We will look for jump conditions satisfied by the solutions of Eq. (6.1). If the jump discontinuities are isolated, by adjusting the support of the test functions φ(x, t) we only need to consider the case of a single discontinuity. Let us suppose that the function U(x, t) is infinitely differentiable almost everywhere in Ω, except along the curve x = q(t) where the first derivative Ux has a discontinuity. If we partition the domain Ω into Ω = Ω1 ∪ Ω2 by cutting along the portion of the


discontinuity curve x − q(t) = 0 in Ω, the divergence theorem and the choice of test functions φ(x, t) vanishing on the boundary ∂Ω allow us to write Eq. (6.1) as

\[
0 = \int_{\Omega} dx\, dt\, \nabla\phi \cdot \mathbf{V} = \int_{\Omega_1} dx\, dt\, \phi\, \nabla \cdot \mathbf{V} + \int_{\Omega_2} dx\, dt\, \phi\, \nabla \cdot \mathbf{V} + \int_{\partial\Omega_1 \cap \partial\Omega_2} dl\, \phi\, \mathbf{n} \cdot [\mathbf{V}]^{+}_{-}. \tag{6.4}
\]

Here the unit vector n is directed along the normal [−q̇, 1] to the discontinuity curve ∂Ω1 ∩ ∂Ω2 in Ω, and [V]^+_− denotes the jump of the vector V across this curve,

\[ [\mathbf{V}]^{+}_{-} \equiv \lim_{x \to q(t)^+} \mathbf{V}(x, t) - \lim_{x \to q(t)^-} \mathbf{V}(x, t). \]

By the arbitrariness of φ(x, t), each integrand term on the right-hand side of (6.4) has to vanish separately. Thus, from the first two terms,

\[ \nabla \cdot \mathbf{V} = 0, \qquad \text{or} \qquad \frac{\partial V_1}{\partial t} + \frac{\partial V_2}{\partial x} = 0, \tag{6.5} \]

in Ω1 or Ω2, where U(x, t) is smooth. This smoothness and zero divergence condition, by the definition (6.2) or (6.3) for (HD) or (SW) respectively, imply that U(x, t) is a solution of these equations in Ω1 or Ω2. For instance, (6.5) becomes

\[ U_{xt} + \partial_{xx} \left[ \frac{1}{2} U^2 - \frac{1}{4} \int_{-\infty}^{\infty} |x - y|\, (U_y^2 - 2\kappa U)\, dy \right] = 0, \]

which is the integrated form of the Harry-Dym equation (HD). The last (jump) condition in (6.4), n · [V]^+_− = 0 along ∂Ω, implies

\[ \dot q\, [V_1]^{+}_{-} = [V_2]^{+}_{-}. \tag{6.6} \]

The left-hand side of this expression is simply ˙ x ]+ q[V ˙ 1 ]+ − = q[U −.

(6.7)

As to the right-hand side, the second (integral) term in the definitions (6.2) or (6.3) of $V_2(x,t)$ is a continuous function of $x$, as the integral wipes out the discontinuity $\mathrm{sgn}(x-y)$ as well as additional ones that $U_y^2$ might have. Hence the integral terms do not contribute to the right-hand side of (6.6). The jump of $V_2(x,t)$ across the discontinuity curve $x = q(t)$ then reduces to

$$[V_2]^+_- = \left[\Bigl(\tfrac{1}{2}U^2\Bigr)_x\right]^+_- = U(q,t)\,[U_x]^+_- . \tag{6.8}$$

If

$$[U_x]^+_- \equiv U_x(q^+,t) - U_x(q^-,t) \neq 0,$$

Eqs. (6.7) and (6.8) yield

$$\dot q = U(q,t), \tag{6.9}$$

i.e., the location of the discontinuity (shock) in the Ux moves at the local speed U (q, t). We have then proved the following


Theorem 6.1. Let $U(x,t)$ be a solution of Eq. (6.1), with the vector $\mathbf{V}(x,t)$ defined in terms of $U(x,t)$ by the nonlinear, nonlocal operators (6.2) and (6.3) respectively for equations (HD) and (SW). Let $U(x,t)$ be a smooth function of $(x,t)$ in the domain $\Omega \subseteq \mathbb{R}^2$, except along the curve $x = q(t)$, where $U$ is continuous, $U(q^+,t) = U(q^-,t)$, while the first derivative $U_x$ has a jump discontinuity (peak). Then $U(x,t)$ is a solution of equations (HD) and (SW) in each domain $\Omega_1$ and $\Omega_2$ in which the curve $x = q(t)$ partitions $\Omega$, and the location of the peak $q(t)$ moves with velocity equal to its height, $\dot q = U(q,t)$.

Conclusions. In this paper, profiles of the weak finite-gap piecewise-smooth solutions of the integrable nonlinear equations of shallow water and Dym type are linked to billiard dynamical systems and geodesic flows with reflections described in terms of finite dimensional Hamiltonian systems on Riemann surfaces. After reducing the solution of these systems to that of a nonstandard Jacobi inversion problem, solutions are found by introducing new parametrizations. The extension of the algebraic-geometric method for nonlinear integrable PDE’s given in this paper leads to a description of piecewise-smooth weak solutions of a class of N-component systems of nonlinear evolution equations and its associated energy dependent Schrödinger operators.

Acknowledgements. Mark Alber and Roberto Camassa would like to thank Francesco Calogero and Al Osborne for helpful discussions. The authors would like to thank R. Beals, D. Sattinger and J. Szmigielski for pointing out their recent work and for making it available.
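Illustration of Theorem 6.1. The statement that a peak travels with velocity equal to its height can be checked numerically on a simple profile. The sketch below is not taken from the paper: it assumes the single-peakon profile $U(x,t) = c\,e^{-|x-ct|}$ of Camassa and Holm [1993] (the case $\kappa = 0$ of (SW)), and the function names `peakon`, `locate_peak` and `slope_jump` as well as the value `c = 1.5` are hypothetical choices made only for this example. It locates the peak on a grid at two times, estimates its speed, and compares that speed with the peak height.

```python
import math


def peakon(x, t, c):
    """Assumed single-peakon profile U(x, t) = c * exp(-|x - c t|)."""
    return c * math.exp(-abs(x - c * t))


def locate_peak(t, c, x_min=-5.0, x_max=5.0, n=10001):
    """Return (position, height) of the grid maximum of U(., t)."""
    step = (x_max - x_min) / (n - 1)
    best_x, best_u = x_min, peakon(x_min, t, c)
    for i in range(n):
        x = x_min + i * step
        u = peakon(x, t, c)
        if u > best_u:
            best_x, best_u = x, u
    return best_x, best_u


def slope_jump(q, t, c, h=1e-6):
    """One-sided difference estimate of the jump [U_x] = U_x(q+) - U_x(q-)."""
    right = (peakon(q + h, t, c) - peakon(q, t, c)) / h
    left = (peakon(q, t, c) - peakon(q - h, t, c)) / h
    return right - left


c = 1.5                          # hypothetical peak height
q0, u0 = locate_peak(0.0, c)     # peak at t = 0
q1, u1 = locate_peak(1.0, c)     # peak at t = 1
speed = (q1 - q0) / 1.0          # finite-difference estimate of dq/dt

print("estimated speed:", speed, "peak height:", u0)
```

For this profile the one-sided slopes at the peak are $+c$ and $-c$, so the jump $[U_x]^+_-$ is nonzero and the hypothesis used to pass from (6.7) and (6.8) to (6.9) is satisfied; the estimated speed agrees with the height up to the grid resolution.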

References Abenda, S., Fedorov, Yu. [2000]: On the weak Kowalevski–Painlevé property for hyperelliptically separable systems. Acta Appl. Math. 60 (2), 137–178 Ablowitz, M.J., Segur, H. [1981]: Solitons and the Inverse Scattering Transform. Philadelphia: SIAM Alber, S.J. [1979]: Investigation of equations of Korteweg de Vries type by the method of recurrence relations. (Russian) J. Lond. Math. Soc. (2) 19, no. 3, 467–480 Alber, M.S., Alber, S.J. [1985]: Hamiltonian formalism for finite-zone solutions of integrable equations. C. R. Acad. Sci. Paris Ser. I Math. 301, 777–781 Alber, M.S., Camassa, R., Holm, D.D. and Marsden, J.E. [1994]: The geometry of peaked solitons and billiard solutions of a class of integrable PDE’s. Lett. Math. Phys. 32, 137–151 Alber, M.S., Camassa, R., Holm, D.D. and Marsden, J.E. [1995]: On the link between umbilic geodesics and soliton solutions of nonlinear PDE’s. Proc. Roy. Soc. 450, 677–692 Alber, M.S., Camassa, R., Fedorov, F., Holm, D.D. and Marsden, J.E. [1999]: On billiard solutions of nonlinear PDE’s. Phys. Lett. A 264, 171–178 Alber, M.S. and Miller, C. [2001]: On peak on solutions of the shallow water equation. Appl. Math. Lett. 14, 93–98 Alber, M.S. and Fedorov, Yu.N. [2000]: Wave solutions of evolution equations and Hamiltonian flows on nonlinear subvarieties of generalized Jacobians. J. Phys. A: Math. Gen. 33, 8409–8425 Alber, M.S. and Fedorov, Yu.N. [2001]: Algebraic geometric solutions for nonlinear evolution equations and flows on the nonlinear subvarieties of Jacobians. Inverse Problems (to appear) Alber, M.S., Luther, G.G., and Marsden, J.E. [1997]: Complex billiard Hamiltonian systems and nonlinear waves. In: Fokas,Y.H., Gelfand, I. M. (eds.) Algebraic Aspects of Integrable Systems, 1–16, Progr. Nonlinear Differential Equations Appl., 26. Boston: Birkhäuser Antonowicz, M. and Fordy, A. P. [1989]: Factorization of energy dependent Schrödinger operators: Miura maps and modified systems. Commun. Math. Phys. 124, no. 
3, 465–486 Beals, W., Sattinger, D., and Szmigielski, J. [1998]: Acoustic scattering and the extended Korteweg de Vries hierarchy. Adv. Math. 140, 190–206


Beals, R., D.H. Sattinger, and J. Szmigielski [1999]: Multi-peakons and a theorem of Stietjes. Inverse Problems 15, L1–L4 Beals, R., Sattinger, D.H. and Szmigielski, J. [2000]: Multipeakons and the classical moment. Adv. in Math. 154, no. 2, 229–257 Belokolos, E.D., Bobenko, A.I., Enol’sii, V.Z., Its, A.R., and Matveev, V.B. [1994]: Algebro-Geometric Approach to Nonlinear Integrable Equations. Springer-Verlag series in Nonlinear Dynamics. New York: Springer-Verlag Bobenko, A. I. and Suris, Y. B. [1999]: Discrete Lagrangian reduction, discrete Euler-Poincaré equations, and semidirect products. Lett. Math. Phys. 49, no. 1, 79–93 Bulla, W., Gesztesy, F., Holden, H. and Teschl, G. [1998]: Algebro-geometric quasi-periodic finite-gap solutions of the Toda and Kac–van Moerbeke hierarchies. Mem. Am. Math. Soc. 135 Camassa, R. [2000]: Characteristic variables for a completely integrable shallow water equation. In: Boiti, M. et al. (eds.) Nonlinearity, Integrability and All That: Twenty Years After NEEDS ’79. Singapore: World Scientific Camassa, R., Holm, D.D. [1993]: An integrable shallow water equation with peaked solitons. Phys. Rev. Lett. 71, 1661–1664 Camassa, R., Holm, D.D. and Hyman, J.M. [1994]: A new integrable shallow water equation. Adv. Appl. Mech. 31, 1–33 Calogero, F. [1995]: An integrable Hamiltonian system. Phys. Lett. A 201, 306–310 Calogero, F. and Francoise, J.-P. [1996]: Solvable quantum version of an integrable Hamiltonian system. J. Math. Phys. 37 (6), 2863–2871 Cewen, C. [1990]: Stationary Harry-Dym’s equation and its relation with geodesics on ellipsoid. Acta Math. Sinica 6, 35–41 Clebsch, A. and Gordon, P. [1866]: Theorie der abelschen Funktionen. Leipzig: Teubner Dmitrieva, L.A. [1993a]: Finite-gap solutions of the Harry Dym equation. Phys. Lett. A 182, 65–70 Dmitrieva, L.A. [1993b]: The higher-times approach to multisoliton solutions of the Harry Dym equation J. Phys. A 26, 6005–6020 Drach, M. 
[1919]: Sur l’integration par quadratures de l’equation d 2 y/dx 2 = [φ(x) + h]y. Comptes rendus 168, 337–340 Dubrovin, B.A. [1975]: Periodic problems for the Korteweg–de Vries equation in the class of finite-band potentials. Funct. Anal. Appl. 9, 215–223 Dubrovin, B.A. [1981]: Theta-functions and nonlinear equations. Russ. Math. Surv. 36 (2), 11–92 Ercolani, N. [1987]: Generalized theta functions and homoclinic varieties. In: Ehrenpreis, L., Gunning, R.C. (eds.) Theta functions. Proceedings, Bowdoin. 87–100, Providence, R.I.: American Mathematical Society Fedorov, Yu. [1999]: Classical integrable systems related to generalized Jacobians. Acta Appl. Math. 55, 3, 151–201 Fedorov, Yu. [2001]: Ellipsoidal billiard with the quadratic potential. Funct. Anal. Appl. (Russian) (to appear) Gavrilov, L. [1999]: Generalized Jacobians of spectral curves and completely integrable systems. Math. Z. 230, 487–508 Gesztesy, F. [1995]: New trace formulas for Schrödinger operators. Evolution equations (Baton Rouge, LA, 1992), Lecture Notes in Pure and Appl. Math., 168, New York: Dekker, pp. 201–221 Gesztesy, F. and Holden, H. [1994]: Trace formulas and conservation laws for nonlinear evolution equations. Rev. Math. Phys. 6, 51–95 and 673 Gesztesy, F. and Holden, H. [2001]: Algebraic-geometric solutions of the Camassa–Holm equation. Preprint Gesztesy, F. and Holden, H. [2001]: Dubrovin equations and integrable systems on hyperelliptic curves. Math. Scand. (to appear) Gesztesy, F., Ratnaseelan, R., and Teschl, G. [1996]: The KdV hierarchy and associated trace formulas. Recent developments in operator theory and its applications (Winnipeg, MB, 1994), 125–163, Oper. Theory Adv. Appl. 87, Basel: Birkhäuser Hunter, J.K. and Zheng, Y.X. [1994]: On a completely integrable nonlinear hyperbolic variational equation. Physica D 79, 361–386 Jacobi, C.G.J. [1884a]: Vorlesungen uber Dynamik, Gesammelte Werke. Berlin: Supplementband Jacobi, C.G.J. 
[1884b]: Solution nouvelle d’un probleme fondamental de geodesie. Gesamelte Werke Bd. 2, Berlin


Jaulent, M. [1972]: On an inverse scattering problem with an energy dependent potential. Ann. Inst. H. Poincare A 17, 363–372 Jaulent, M. and Jean, C. [1976]: The inverse problem for the one-dimensional Schrödinger operator with an energy dependent potential. Ann. Inst. H. Poincare A. I, II 25, 105–118, 119–137 Kane, C., E. A. Repetto, E.A., Ortiz, M., and Marsden, J.E. [1999]: Finite element analysis of nonsmooth contact. Comput. Methods Appl. Mech. and Engrg. 180, 1–26 Klingerberg, W. [1982]: Riemannian Geometry. New York: de Gruyter Knörrer, H. [1982]: Geodesics on quadrics and mechanical problem of C. Neumann. J. Reine Angew. Math. 334, 69–78 Kozlov, V.V. and Treschev, D. V. [1991]: Billiards, a Genetic Introduction to Systems with Impacts. AMS Translations of Math. Monographs 89. New York Kruskal, M.D. [1975]: Nonlinear wave equations. In: Moser, J. (eds.) Dynamical Systems, Theory and Applications. Lecture Notes in Physics 38, New York: Springer Markushevich, A. I. [1977]: Introduction to the Theory of Abelian Functions. English version: Translations of Mathematical Monographs, 96. Providence, RI: American Mathematical Society, 1992 Marsden, J. E., Patrick, G.W., and Shkoller, S. [1998]: Mulltisymplectic geometry, variational integrators and nonlinear PDEs, Commun. Math. Phys. 199, 351–395 Marsden, J. E., Pekarsky, S., and Shkoller, S. [1999]: Discrete Euler-Poincare and Lie-Poisson equations. Nonlinearity 12, 1647–1662 Marsden, J. E. and Ratiu, T.S. [1999]: Introduction to Mechanics and Symmetry. Texts in Applied Mathematics, 17, Berlin–Heidelberg–New York: Springer-Verlag McKean, H.P. and Constantin, A. [1999]: A shallow water equation on the circle. Comm. Pure Appl. Math. Vol LII, 949–982 Moser, J. and Veselov, A.P. [1991]: Discrete versions of some classical integrable systems and factorization of matrix polynomials. Commum. Math. Phys. 139, 217–243 Mumford, D. [1983]: Tata Lectures on Theta I and II. Boston: Birkhäuser-Verlag Novikov D.P. 
[1999]: Algebraic-geometrical solutions of the Harry–Dym equations. Sibirskii Matematicheskii Zhurnal, 40, 159–163, (Russian) English transl. in: Siberian Math. Journal, 40, 136–140 Previato, E. [1995]: Hyperelliptic quasi-periodic and soliton solutions of the nonlinear Schrödinger equation. Duke Math. J. 52, 329–377 Rauch-Wojciechowski, S. [1995]: Mechanical systems related to the Schrödinger spectral problem. Chaos, Solitons & Fractals 5, 2235–2259 Serre, J.-P. [1959]: Groupes algébriques et corps de classes. Paris: Hermann Vanhaecke, P. [1995]: Stratification of hyperelliptic Jacobians and the Sato Grassmannian. Acta Appl. Math. 40, 143–172 Veselov, A.P. [1988]: Integrable discrete-time systems and difference operators. Funct. An. and Appl. 22, 83–94 Verhulst, F. [1996]: Nonlinear Differential Equations and Dynamical Systems. Second Edition, Berlin–Heidelberg–New York: Springer-Verlag Wadati, M., Ichikawa, Y.H., and Shimizu, T. [1980]: Cusp soliton of a new integrable nonlinear evolution equation. Prog. Theor. Phys. 64, 1959–1967 Weierstrass, K. [1884]: Über die geodätischen Linien auf dreiachsigen Ellipsoid, Math. Werke I, 257–266 Wendlandt, J.M. and Marsden, J.E. [1997]: Mechanical integrators derived from a discrete variational principle. Physica D 106, 223–246 Whittaker, E.T. [1937]: A Treatise on the Analytical Dynamics of Particles and Rigid Bodies, Cambridge: Cambridge University Press, 1904; 4th Ed., 1937 (reprinted by Dover 1944, and Cambridge University 1988) Young, L.C. [1969]: Lectures on the Calculus of Variations and Optimal Control Theory. Corrected printing, Chelsea: Saunders, 1980 Communicated by T. Miwa

Commun. Math. Phys. 221, 229 – 254 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

The Absolute Continuity of the Integrated Density of States for Magnetic Schrödinger Operators with Certain Unbounded Random Potentials

Thomas Hupfer¹, Hajo Leschke¹, Peter Müller², Simone Warzel¹

¹ Institut für Theoretische Physik, Universität Erlangen-Nürnberg, Staudtstraße 7, 91058 Erlangen, Germany
² Institut für Theoretische Physik, Georg-August-Universität, 37073 Göttingen, Germany

Received: 20 October 2000 / Accepted: 8 March 2001

Dedicated to the memory of Kurt Broderix (26 April 1962 – 12 May 2000)

Abstract: The object of the present study is the integrated density of states of a quantum particle in multi-dimensional Euclidean space which is characterized by a Schrödinger operator with magnetic field and a random potential which may be unbounded from above and below. In case that the magnetic field is constant and the random potential is ergodic and admits a so-called one-parameter decomposition, we prove the absolute continuity of the integrated density of states and provide explicit upper bounds on its derivative, the density of states. This local Lipschitz continuity of the integrated density of states is derived by establishing a Wegner estimate for finite-volume Schrödinger operators which holds for rather general magnetic fields and different boundary conditions. Examples of random potentials to which the results apply are certain alloy-type and Gaussian random potentials. Besides, we show a diamagnetic inequality for Schrödinger operators with Neumann boundary conditions.

Contents

1. Introduction
2. Random Schrödinger Operators with Magnetic Fields
   2.1 Basic notation
   2.2 Basic assumptions
   2.3 Definition of the operators
   2.4 The integrated density of states
3. Existence of the Density of States for Certain Random Potentials
   3.1 A Wegner estimate
   3.2 Upper bounds on the density of states
4. Examples Illustrating the Results of Section 3
   4.1 Alloy-type random potentials
   4.2 Gaussian random potentials
   4.3 Two space dimensions: random Landau Hamiltonians
A. On Finite-Volume Schrödinger Operators with Magnetic Fields
   A.1 Definition of magnetic Neumann Schrödinger operators
   A.2 Diamagnetic inequality
   A.3 Some consequences
References

1. Introduction

The integrated density of states is a quantity of primary interest in the theory [34, 10, 49] and application [54, 7, 40, 2, 37] of Schrödinger operators for a particle in d-dimensional Euclidean space $\mathbb{R}^d$ (d = 1, 2, 3, . . . ) subject to a random potential. Its knowledge allows one to compute the free energy and hence all basic thermostatic quantities of the corresponding non-interacting many-particle system. It also enters formulae for transport coefficients.

The goal of the present paper is to prove the absolute continuity of the integrated density of states N for certain unbounded random potentials, thereby generalizing a result in [23] for zero magnetic field to the case of a constant magnetic field. Examples of random potentials to which our result applies are certain alloy-type and Gaussian random potentials. In particular, we consider the situation of two space dimensions and a perpendicular constant magnetic field where N is not absolutely continuous without random potential.

For the proof of absolute continuity of N, we use the abstract one-parameter spectral-averaging estimate of [11] to derive what is called a Wegner estimate [65]. Such estimates provide upper bounds on the averaged number of eigenvalues of finite-volume random Schrödinger operators in a given energy regime. They play a major rôle in proofs of Anderson localization for multi-dimensional random Schrödinger operators [10, 49, 11, 24, 61]. In contrast to the Wegner estimates with magnetic fields which are available so far, we are neither restricted to the case of a constant magnetic field [12, 5, 64] nor to the existence of gaps in the spectrum of the magnetic Schrödinger operator without random potential [4]. In fact, the Wegner estimate in the present paper holds for magnetic vector potentials whose components are locally square integrable.
Its proof involves techniques for (non-random) magnetic Neumann Schrödinger operators, among them Dirichlet–Neumann bracketing and a diamagnetic inequality. Appendix A provides the definition of these operators and proofs of the latter techniques in greater generality than actually needed for the main body of the present paper.

2. Random Schrödinger Operators with Magnetic Fields

2.1. Basic notation. As usual, let $\mathbb{N} := \{1, 2, 3, \dots\}$ denote the set of natural numbers. Let $\mathbb{R}$, respectively $\mathbb{C}$, denote the algebraic field of real, respectively complex numbers and let $\mathbb{Z}^d$ be the simple cubic lattice in d dimensions, $d \in \mathbb{N}$. An open cube $\Lambda$ in d-dimensional Euclidean space $\mathbb{R}^d$ is a translate of the d-fold Cartesian product $I \times \dots \times I$ of an open interval $I \subseteq \mathbb{R}$. The open unit cube in $\mathbb{R}^d$ which is centered at site $y \in \mathbb{R}^d$ and whose edges are oriented parallel to the co-ordinate axes is denoted by $\Lambda(y)$. The Euclidean norm of $x \in \mathbb{R}^d$ is $|x| := \bigl(\sum_{j=1}^{d} x_j^2\bigr)^{1/2}$. The volume of a Borel subset $\Lambda \subseteq \mathbb{R}^d$ with respect to the d-dimensional Lebesgue measure is $|\Lambda| := \int_{\Lambda} d^dx = \int_{\mathbb{R}^d} d^dx\, \chi_{\Lambda}(x)$, where $\chi_{\Lambda}$ is the indicator function of $\Lambda$. In particular, if $\Lambda$ is the strictly positive half-line, $\Theta := \chi_{]0,\infty[}$ is the left-continuous Heaviside unit-step function. The Banach space $L^p(\Lambda)$, $p \in [1,\infty]$, consists of the Borel-measurable complex-valued functions $f : \Lambda \to \mathbb{C}$ which are identified if their values differ only on a set of Lebesgue measure zero and which obey $\int_{\Lambda} d^dx\, |f(x)|^p < \infty$ if $p < \infty$ and $\|f\|_\infty := \operatorname{ess\,sup}_{x\in\Lambda} |f(x)| < \infty$ if $p = \infty$. We recall that $L^2(\Lambda)$ is a separable Hilbert space with scalar product $\langle\cdot,\cdot\rangle$ given by $\langle f, g\rangle = \int_{\Lambda} d^dx\, \overline{f(x)}\, g(x)$. Here the overbar denotes complex conjugation. We write $f \in L^p_{\mathrm{loc}}(\mathbb{R}^d)$, if $f\chi_{\Lambda} \in L^p(\mathbb{R}^d)$ for any bounded Borel set $\Lambda \subset \mathbb{R}^d$. Finally, $C_0^\infty(\Lambda)$ is the vector space of functions $f : \Lambda \to \mathbb{C}$ which are arbitrarily often differentiable and have compact supports.

2.2. Basic assumptions. Let $(\Omega, \mathcal{A}, \mathbb{P})$ be a complete probability space and $\mathbb{E}\{\cdot\} := \int_{\Omega} \mathbb{P}(d\omega)\,(\cdot)$ be the expectation induced by the probability measure $\mathbb{P}$. By a random potential we mean a (scalar) random field $V : \Omega \times \mathbb{R}^d \to \mathbb{R}$, $(\omega, x) \mapsto V^{(\omega)}(x)$ which is assumed to be jointly measurable with respect to the product of the sigma-algebra $\mathcal{A}$ of event sets in $\Omega$ and the sigma-algebra $\mathcal{B}(\mathbb{R}^d)$ of Borel sets in $\mathbb{R}^d$. We will always assume $d \geq 2$, because magnetic fields in one space dimension may be “gauged away” and are therefore of no physical relevance. Furthermore, for d = 1 far more is known [10, 49] thanks to methods which only work for one dimension. We list four properties which V may have or not:

(F) There exists some real $p \in\, ]1,\infty[$ with $p > 1$ if $d = 2$ and $p \geq d/2$ if $d \geq 3$ such that for $\mathbb{P}$-almost each $\omega \in \Omega$ the realization $V^{(\omega)} : x \mapsto V^{(\omega)}(x)$ of V belongs to $L^p_{\mathrm{loc}}(\mathbb{R}^d)$.

(S) There exists some pair of reals $p_1 > p(d)$ and $p_2 > p_1 d/[2(p_1 - p(d))]$ such that

$$\sup_{y\in\mathbb{Z}^d} \mathbb{E}\left\{\left(\int_{\Lambda(y)} d^dx\, |V(x)|^{p_1}\right)^{p_2/p_1}\right\} < \infty. \tag{2.1}$$

Here $p(d)$ is defined as follows: $p(d) := 2$ if $d \leq 3$, $p(d) := d/2$ if $d \geq 5$ and $p(4) > 2$, otherwise arbitrary.

(E) V is $\mathbb{Z}^d$-ergodic or $\mathbb{R}^d$-ergodic.

(I) The finiteness condition

$$\sup_{y\in\mathbb{Z}^d} \mathbb{E}\left\{\int_{\Lambda(y)} d^dx\, |V(x)|^{2\vartheta+1}\right\} < \infty \tag{2.2}$$

holds, where $\vartheta \in \mathbb{N}$ is the smallest integer with $\vartheta > d/4$.

Remark 2.1. (i) Property (E) requires the existence of a group $T_x$, $x \in \mathbb{Z}^d$ or $\mathbb{R}^d$, of probability-preserving and ergodic transformations on $\Omega$ such that V is $\mathbb{Z}^d$- or $\mathbb{R}^d$-homogeneous in the sense that $V^{(T_x\omega)}(y) = V^{(\omega)}(y - x)$ for all $x \in \mathbb{Z}^d$ or $\mathbb{R}^d$, all $y \in \mathbb{R}^d$ and all $\omega \in \Omega$.

(ii) Since property (S) assures that the realization $V^{(\omega)}$ belongs to $L^{p(d)}_{\mathrm{loc}}(\mathbb{R}^d)$ for $\mathbb{P}$-almost each $\omega \in \Omega$, property (S) implies property (F). Property (I) also implies property (F).

We proceed by listing two properties either of which a random potential may additionally have or not and which characterize two examples of random potentials, which we will consider in the present paper.


(A) V is an alloy-type random field, that is, a random field with realizations given by

$$V^{(\omega)}(x) = \sum_{j\in\mathbb{Z}^d} \lambda_j^{(\omega)}\, u_0(x - j). \tag{2.3}$$

The coupling strengths $\{\lambda_j\}$ form a family of random variables which are $\mathbb{P}$-independent and identically distributed according to the common probability measure $\mathcal{B}(\mathbb{R}) \ni I \mapsto \mathbb{P}\{\lambda_0 \in I\}$. Moreover, we suppose that the single-site potential $u_0 : \mathbb{R}^d \to \mathbb{R}$ satisfies the Birman–Solomyak condition $\sum_{j\in\mathbb{Z}^d} \bigl(\int_{\Lambda(j)} d^dx\, |u_0(x)|^{p_1}\bigr)^{1/p_1} < \infty$ with some real $p_1 \geq 2\vartheta + 1$ and that $\mathbb{E}(|\lambda_0|^{p_2}) < \infty$ for some real $p_2$ satisfying $p_2 \geq 2\vartheta + 1$ and $p_2 > p_1 d/[2(p_1 - p(d))]$. [The constants $p(d)$ and $\vartheta$ are defined in properties (S) and (I).]

(G) V is a Gaussian random field [1, 41] which is $\mathbb{R}^d$-homogeneous. It has zero mean, $\mathbb{E}[V(0)] = 0$, and its covariance function $x \mapsto C(x) := \mathbb{E}[V(x)V(0)]$ is continuous at the origin where it obeys $0 < C(0) < \infty$.

Remark 2.2. (i) Consider an alloy-type random potential V, that is, a random potential with property (A). Then V has properties (E), (I), (S) and (F), see, for example [29].

(ii) Consider a random field with the Gaussian property (G). Then its covariance function C is bounded and uniformly continuous on $\mathbb{R}^d$. Consequently, [22, Thm. 3.2.2] implies the existence of a separable version V of this field which is jointly measurable. Speaking about a Gaussian random potential, we tacitly assume that only this version will be dealt with. By the Bochner–Khintchine theorem [51, Thm. IX.9] there is a one-to-one correspondence between finite positive (and even) Borel measures on $\mathbb{R}^d$ and Gaussian random potentials. An explicit calculation shows that a Gaussian random potential enjoys properties (I), (S) and (F). A simple sufficient criterion for the ergodicity property (E) is the mixing condition $\lim_{|x|\to\infty} C(x) = 0$.

By a vector potential we mean a (non-random) Borel-measurable vector field $A : \mathbb{R}^d \to \mathbb{R}^d$, $x \mapsto A(x)$ which we assume to possess either the property

(B) $|A|^2$ belongs to $L^1_{\mathrm{loc}}(\mathbb{R}^d)$,

or the property

(C) A has continuous partial derivatives which give rise to a magnetic field (tensor) with constant components given by $B_{jk} := \partial_j A_k - \partial_k A_j$, where $j, k \in \{1, \dots, d\}$.

Remark 2.3. (i) Property (C) implies property (B).

(ii) Given property (C), we may exploit the gauge freedom to choose the vector potential in the symmetric gauge in which the components of A are given by $A_k(x) = \sum_{j=1}^{d} x_j B_{jk}/2$, where $k \in \{1, \dots, d\}$.

2.3. Definition of the operators. We are now prepared to precisely define magnetic Schrödinger operators with random potentials on the Hilbert spaces $L^2(\Lambda)$ and $L^2(\mathbb{R}^d)$. The finite-volume case is treated in

Proposition 2.1. Let $\Lambda \subset \mathbb{R}^d$ be a bounded open cube, A be a vector potential with the property (B) and V be a random potential with the property (F). Then


(i) the sesquilinear form

$$h^{A,0}_{\Lambda,N}(\varphi, \psi) := \frac{1}{2}\sum_{j=1}^{d} \bigl\langle (i\nabla + A)_j\,\varphi\,,\, (i\nabla + A)_j\,\psi \bigr\rangle, \tag{2.4}$$

with $\varphi, \psi$ in the form domain $Q\bigl(h^{A,0}_{\Lambda,N}\bigr) := \bigl\{\phi \in L^2(\Lambda) : (i\nabla + A)\phi \in (L^2(\Lambda))^d\bigr\}$ and $\nabla - iA$ denoting the gauge-covariant gradient in the sense of distributions on $C_0^\infty(\Lambda)$, uniquely defines a self-adjoint positive operator on $L^2(\Lambda)$, which we denote by $H_{\Lambda,N}(A, 0)$. The closure $h^{A,0}_{\Lambda,D}$ of the restriction of $h^{A,0}_{\Lambda,N}$ to the domain $C_0^\infty(\Lambda)$ uniquely defines another self-adjoint positive operator on $L^2(\Lambda)$, which we denote by $H_{\Lambda,D}(A, 0)$.

(ii) The two operators

$$H_{\Lambda,X}(A, V^{(\omega)}) := H_{\Lambda,X}(A, 0) + V^{(\omega)}, \qquad X = D \text{ or } X = N, \tag{2.5}$$

are self-adjoint and bounded below on $L^2(\Lambda)$ as form sums for all $\omega$ in some subset $\Omega_F \in \mathcal{A}$ of $\Omega$ with full probability, in symbols, $\mathbb{P}(\Omega_F) = 1$.

(iii) The mapping $H_{\Lambda,X}(A, V) : \Omega_F \ni \omega \mapsto H_{\Lambda,X}(A, V^{(\omega)})$ is measurable. We call it the finite-volume magnetic Schrödinger operator with random potential V and Dirichlet or Neumann boundary condition if X = D or X = N, respectively.

(iv) The spectrum of $H_{\Lambda,X}(A, V^{(\omega)})$ is purely discrete for all $\omega \in \Omega_F$.

(v) The (random) finite-volume density-of-states measure, defined by the trace

$$\nu^{(\omega)}_{\Lambda,X}(I) := \mathrm{Tr}\Bigl[\chi_I\bigl(H_{\Lambda,X}(A, V^{(\omega)})\bigr)\Bigr], \tag{2.6}$$

is a positive Borel measure on the real line $\mathbb{R}$ for all $\omega \in \Omega_F$. Here $\chi_I\bigl(H_{\Lambda,X}(A, V^{(\omega)})\bigr)$ is the spectral projection operator of $H_{\Lambda,X}(A, V^{(\omega)})$ associated with the energy regime $I \in \mathcal{B}(\mathbb{R})$. Moreover, the (unbounded left-continuous) distribution function

$$N^{(\omega)}_{\Lambda,X}(E) := \nu^{(\omega)}_{\Lambda,X}\bigl(]-\infty, E[\bigr) = \mathrm{Tr}\Bigl[\Theta\bigl(E - H_{\Lambda,X}(A, V^{(\omega)})\bigr)\Bigr] < \infty \tag{2.7}$$

of $\nu^{(\omega)}_{\Lambda,X}$, called the finite-volume integrated density of states, is finite for all energies $E \in \mathbb{R}$.

Proof. The proofs of assertions (i), (ii) and (iv) are contained in Appendix A because (B) and (F) imply (A.1) and (A.2). Assertion (iii) is a consequence of considerations in [35], see also Sect. V.1 in [10], and of a straightforward generalization to non-zero vector potentials. Assertion (v) follows from (ii) and (iv).

Remark 2.4. Counting multiplicity, $\nu^{(\omega)}_{\Lambda,X}(I)$ is just the number of eigenvalues of the operator $H_{\Lambda,X}(A, V^{(\omega)})$ in the Borel set $I \subseteq \mathbb{R}$. Since this number is almost-surely finite if I is bounded, the mapping $\nu_{\Lambda,X} : \omega \mapsto \nu^{(\omega)}_{\Lambda,X}$ is a random Borel measure.

The precise definition of the infinite-volume magnetic Schrödinger operator on $L^2(\mathbb{R}^d)$ and a compilation of its basic properties are given in

Proposition 2.2. Assume that the vector potential A and the random potential V enjoy the properties (C) and (S). Then


(i) the operator $C_0^\infty(\mathbb{R}^d) \ni \psi \mapsto \frac{1}{2}\sum_{j=1}^{d} (i\partial_j + A_j)^2\psi + V^{(\omega)}\psi$ is essentially self-adjoint for all $\omega$ in some subset $\Omega_S \in \mathcal{A}$ of $\Omega$ with full probability, $\mathbb{P}(\Omega_S) = 1$. Its self-adjoint closure on $L^2(\mathbb{R}^d)$ is denoted by $H(A, V^{(\omega)})$.

(ii) The mapping $H(A, V) : \Omega_S \ni \omega \mapsto H(A, V^{(\omega)})$ is measurable. We call it the infinite-volume magnetic Schrödinger operator with random potential V.

Proof. For assertion (i) see [24, Prop. 2.3], which generalizes [10, Prop. V.3.2] to the case of continuously differentiable vector potentials $A \neq 0$. Note that the assumption of a vanishing divergence, $\sum_{j=1}^{d} \partial_j A_j = 0$, in [24, Prop. 2.3] is not needed in the argument. Assertion (ii) is a straightforward generalization of [10, Prop. V.3.1] to continuously differentiable $A \neq 0$, see also [34, Prop. 2 on p. 288].

Remark 2.5. For alternative or weaker criteria instead of (S) guaranteeing the almost-sure self-adjointness of $H(0, V)$, see [49, Thm. 5.8] or [34, Thm. 1 on p. 299].

If A has the property (C), the infinite-volume magnetic Schrödinger operator without scalar potential, $H(A, 0)$, is unitarily invariant under so-called magnetic translations [67]. The latter form a family of unitary operators $\{T_x\}_{x\in\mathbb{R}^d}$ on $L^2(\mathbb{R}^d)$ defined by

$$(T_x\psi)(y) := e^{i\Phi_x(y)}\,\psi(y - x), \qquad \psi \in L^2(\mathbb{R}^d), \tag{2.8}$$

where $\Phi_x(y) := \int_{K(x,y)} dr\cdot\bigl(A(r) - A(r - x)\bigr)$ is an integral along some smooth curve $K(x,y)$ with initial point $x \in \mathbb{R}^d$ and terminal point $y \in \mathbb{R}^d$. Since A and its x-translate $A(\,\cdot - x)$ give rise to the same magnetic field and $\mathbb{R}^d$ is simply connected, the integral $\Phi_x(y)$ is actually independent of the choice of curve.

Remark 2.6. (i) For the vector potential in the symmetric gauge (see Remark 2.3 (ii)) one has $\Phi_x(y) = \sum_{j,k=1}^{d} x_j B_{jk}(y_k - x_k)/2$.

(ii) For a discussion in the case of more general configuration spaces and magnetic fields, see for example [44].

(iii) In the situation of Prop. 2.2 and if the random potential V has property (E), we have

$$T_x\, H(A, V^{(\omega)})\, T_x^{\dagger} = H(A, V^{(T_x\omega)}) \tag{2.9}$$

for all $\omega \in \Omega_S$ and all $x \in \mathbb{Z}^d$ or $x \in \mathbb{R}^d$, depending on whether V is $\mathbb{Z}^d$- or $\mathbb{R}^d$-ergodic. Hence, following standard arguments, $H(A, V)$ is an ergodic operator and its spectral components are non-random, see [62, Thm. 2.1]. Moreover, the discrete spectrum of $H(A, V^{(\omega)})$ is empty for $\mathbb{P}$-almost all $\omega \in \Omega$, see [34, 10, 62].

2.4. The integrated density of states. The quantity of main interest in the present paper is the integrated density of states and its corresponding measure, called the density-of-states measure. The next theorem, which we recall from [29], deals with its definition and its representation as an infinite-volume limit of the suitably scaled finite-volume counterparts (2.7).

Proposition 2.3. Let $\chi_{\Lambda(0)}$ denote the multiplication operator associated with the indicator function of the unit cube $\Lambda(0)$. Assume that the potentials A and V have the properties (C), (S), (I) and (E). Then the (infinite-volume) integrated density of states

$$N(E) := \mathbb{E}\Bigl\{\mathrm{Tr}\bigl[\chi_{\Lambda(0)}\,\Theta\bigl(E - H(A, V)\bigr)\,\chi_{\Lambda(0)}\bigr]\Bigr\} < \infty \tag{2.10}$$


is well defined for all energies $E \in \mathbb{R}$ in terms of the (spatially localized) spectral family of the infinite-volume operator $H(A, V)$. It is the (unbounded left-continuous) distribution function of some positive Borel measure $\nu$ on the real line $\mathbb{R}$. Moreover, let $\Lambda \subset \mathbb{R}^d$ stand for bounded open cubes centered at the origin. Then there is a set $\Omega_0 \in \mathcal{A}$ of full probability, $\mathbb{P}(\Omega_0) = 1$, such that the limit relation

$$N(E) = \lim_{\Lambda\uparrow\mathbb{R}^d} \frac{N^{(\omega)}_{\Lambda,X}(E)}{|\Lambda|} \tag{2.11}$$

holds for both boundary conditions X = D and X = N, all $\omega \in \Omega_0$ and all $E \in \mathbb{R}$ except for the (at most countably many) discontinuity points of N.

Proof. See [29].

Remark 2.7. (i) A proof of the existence of the integrated density of states N under slightly different hypotheses was outlined in [43]. It uses functional-analytic arguments first presented in [36] for the case $A = 0$. A different approach to the existence of the density-of-states measure $\nu$ for $A \neq 0$, using Feynman–Kac(-Itô) functional-integral representations of Schrödinger semigroups [58, 9], can be found in [62, 8]. The latter approach dates back to [47, 46] for the case $A = 0$. To our knowledge, it works straightforwardly in the case $A \neq 0$ for X = D only. For $A \neq 0$ the independence of the infinite-volume limit in (2.11) of the boundary condition X (previously claimed without proof in [43]) follows from [45] if the random potential V is bounded and from [19] if V is bounded from below. So the main new point about Prop. 2.3 is that it also applies to a wide class of V unbounded from below. Even for $A = 0$, Prop. 2.3 is partially new in that the corresponding result [49, Thm. 5.20] only shows vague convergence of the underlying measures, see the next remark.

(ii) An immediate corollary of Prop. 2.3 is the vague convergence [6, Def. 30.1] of the spatial eigenvalue concentrations $|\Lambda|^{-1}\nu^{(\omega)}_{\Lambda,X}$ in the infinite-volume limit $\Lambda \uparrow \mathbb{R}^d$ to the non-random positive Borel measure $\nu$ uniquely corresponding to the integrated density of states (2.10) in the sense that $N(E) = \nu(]-\infty, E[)$ for all $E \in \mathbb{R}$, that is,

$$\nu = \lim_{\Lambda\uparrow\mathbb{R}^d} \frac{\nu^{(\omega)}_{\Lambda,X}}{|\Lambda|} \qquad \text{(vaguely)} \tag{2.12}$$

for both X = D and X = N and $\mathbb{P}$-almost all $\omega \in \Omega$.

One may relate properties of the density-of-states measure $\nu$ to simple spectral properties of the infinite-volume magnetic Schrödinger operator. Examples are the support of $\nu$ and the location of the almost-sure spectrum of $H(A, V^{(\omega)})$ or the absence of a point component in the Lebesgue decomposition of $\nu$ and the absence of “immobile eigenvalues” of $H(A, V^{(\omega)})$. This is the content of

Corollary 2.1. Under the assumptions of Prop. 2.3 and letting $I \in \mathcal{B}(\mathbb{R})$, the following equivalence holds: $\nu(I) = 0$ if and only if $\chi_I\bigl(H(A, V^{(\omega)})\bigr) = 0$ for $\mathbb{P}$-almost all $\omega \in \Omega$. This immediately implies:

(i) $\operatorname{supp}\nu = \operatorname{spec} H(A, V^{(\omega)})$ for $\mathbb{P}$-almost all $\omega \in \Omega$. [Here $\operatorname{spec} H(A, V^{(\omega)})$ denotes the spectrum of $H(A, V^{(\omega)})$ and $\operatorname{supp}\nu := \{E \in \mathbb{R} : \nu(]E - \varepsilon, E + \varepsilon[) > 0 \text{ for all } \varepsilon > 0\}$ is the topological support of $\nu$.]


(ii) 0 = ν({E}) = limε↓0 N (E + ε) − N (E) if and only if E ∈ R is not an eigenvalue of H (A, V (ω) ) for P-almost all ω ∈ . Proof. See [29]. The equivalence (ii) of the above corollary is a continuum analogue of [15, Prop. 1.1], see also [49, Thm. 3.3]. In the one-dimensional case [48] and the multi-dimensional lattice case [18], the equivalence has been exploited to show for A = 0 the (global) continuity of the integrated density of states N under practically no further assumptions on the random potential beyond those ensuring the existence of N . The proof of such a statement in the multi-dimensional continuum case is considered an important open problem [60]. For A = 0 one certainly needs additional assumptions as [20] illustrates, see Remark 4.3(ii) below. Under the additional assumptions of Corollary 3.1 below, we will show that the integrated density of states is not only continuous, but even absolutely continuous in the case of a constant magnetic field of arbitrary strength. 3. Existence of the Density of States for Certain Random Potentials In this section we provide conditions under which the integrated density of states N (or, equivalently, its measure ν) is absolutely continuous with respect to the Lebesgue measure. As a by-product, we get rather explicit upper bounds on the resulting Lebesgue density dN (E)/dE = ν(dE)/dE, called the density of states. Results of this genre date back to [65] and go nowadays under the name Wegner estimates. 3.1. A Wegner estimate. The main aim of this subsection is to extend the Wegner estimate in [23] to the case with magnetic fields. For this purpose we recall from there Definition 3.1. 
A random potential $V : \Omega \times \mathbb{R}^d \to \mathbb{R}$ admits a $(U, \lambda, u, \varrho)$-decomposition if there exists a random potential $U : \Omega \times \mathbb{R}^d \to \mathbb{R}$, a random variable $\lambda : \Omega \to \mathbb{R}$ and a real-valued $u \in L^\infty_{\mathrm{loc}}(\mathbb{R}^d)$ such that
(i) $V^{(\omega)} = U^{(\omega)} + \lambda^{(\omega)} u$ for $\mathbb{P}$-almost all $\omega \in \Omega$,
(ii) the conditional probability distribution of $\lambda$ relative to the sub-sigma-algebra generated by the family of random variables $\{U(x)\}_{x \in \mathbb{R}^d}$ has a jointly measurable density $\varrho : \Omega \times \mathbb{R} \to [0, \infty[$ with respect to the Lebesgue measure on $\mathbb{R}$.

The condition $u \in L^\infty_{\mathrm{loc}}(\mathbb{R}^d)$ was missed out in [23, Def. 2]. We now state the following generalization of [23, Thm. 2] which in its turn relies on a result in [11].

Theorem 3.1. Let $\Lambda \subset \mathbb{R}^d$ be a bounded open cube. Let $\Lambda = \big(\bigcup_{j=1}^{J} \overline{\Lambda_j}\big)^{\mathrm{int}}$ be decomposed into the interior of the closure of finitely many, $J \in \mathbb{N}$, pairwise disjoint bounded open cubes $\Lambda_j \subset \mathbb{R}^d$. Let the potentials A and V be supplied with the properties (B) and (F), respectively. Assume that for each $j \in \{1, \dots, J\}$ the random potential V admits a $(U_j, \lambda_j, u_j, \varrho_j)$-decomposition subject to the following three conditions: there exist five strictly positive constants $v_1, v_2, \beta, R, Z > 0$ such that for all $j \in \{1, \dots, J\}$,

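(i) $v_1\, \chi_{\Lambda_j}(x) \le u_j(x)$ and $u_j(x)\, \chi_{\Lambda_j}(x) \le v_2$ for Lebesgue-almost all $x \in \mathbb{R}^d$,
(ii) $\operatorname*{ess\,sup}_{\xi \in \mathbb{R}} \varrho_j^{(\omega)}(\xi)\, \max\{e^{-\beta v_1 \xi}, e^{-\beta v_2 \xi}\} \le R$ for $\mathbb{P}$-almost all $\omega \in \Omega$,
(iii) $\mathbb{E}\big[\operatorname{Tr} e^{-\beta H_{\Lambda_j,N}(A, U_j)}\big] \le |\Lambda_j|\, Z$.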


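Then the averaged number of eigenvalues of the finite-volume operator $H_{\Lambda,X}(A, V)$ in any non-empty energy regime $I \in \mathcal{B}(\mathbb{R})$ of finite Lebesgue measure $|I|$ is bounded from above according to

\[ \mathbb{E}\big[\nu_{\Lambda,X}(I)\big] \le |\Lambda|\, |I|\, \frac{R Z}{v_1}\, e^{\beta \sup I} \tag{3.1} \]

for both boundary conditions X. [Here $\sup I$ denotes the least upper bound of $I \subset \mathbb{R}$.]

Remark 3.1. The (Chebyshev–Markov) inequality $\chi_{[1,\infty[}(|\xi|) \le |\xi|$ implies

\[ \mathbb{P}\big\{ I \cap \operatorname{spec} H_{\Lambda,X}(A, V) \neq \emptyset \big\} = \mathbb{E}\big[\chi_{[1,\infty[}\big(\nu_{\Lambda,X}(I)\big)\big] \le \mathbb{E}\big[\nu_{\Lambda,X}(I)\big]. \tag{3.2} \]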

Therefore the Wegner estimate (3.1) in particular bounds the probability of finding at least one eigenvalue of $H_{\Lambda,X}(A, V)$ in a given energy regime $I \in \mathcal{B}(\mathbb{R})$. Such bounds are a key ingredient of proofs of Anderson localization for multi-dimensional random Schrödinger operators, see [10, 49, 11, 24, 61] and references therein.

Proof (of Theorem 3.1). Since we follow exactly the strategy of the proof of [23, Thm. 2], we only remark that the two main steps in this proof remain valid in the presence of a vector potential A. The first step, used in inequality (27) of [23], concerns the lowering of the eigenvalues of the operator $H_{\Lambda,X}(A, V^{(\omega)})$ by so-called Dirichlet–Neumann bracketing in case X = D and by the (subsequent) insertion of interfaces in $\Lambda$ with the requirement of Neumann boundary conditions. For $A \neq 0$, supplied with property (B), the validity of these two techniques is established in Appendix A. The second step is an application of a spectral-averaging estimate of [11], which is re-phrased as Lemma 3.1 below. Since there the operator L is only required to be self-adjoint and does not enter the r.h.s. of (3.3), it makes no difference if L is taken as $H_{\Lambda,X}(0, U_j)$ (as is done in [23]) or as $H_{\Lambda,X}(A, U_j)$ for each $j \in \{1, \dots, J\}$.

An essential tool in the preceding proof is the (simple extension of the) abstract one-parameter spectral-averaging estimate of [11]; in this context see also [13].

Lemma 3.1. Let K, L and M be three self-adjoint operators acting on a Hilbert space $\mathcal{H}$ with K and M bounded such that $\kappa := \inf_{\varphi:\, K\varphi \neq 0} \langle \varphi, M \varphi \rangle / \langle \varphi, K^2 \varphi \rangle > 0$ is strictly positive. Moreover, let $g \in L^\infty(\mathbb{R})$. Then the inequality

\[ \int_{\mathbb{R}} d\xi\, |g(\xi)|\, \langle \psi, K\, \chi_I(L + \xi M)\, K \psi \rangle \le |I|\, \frac{\|g\|_\infty}{\kappa}\, \langle \psi, \psi \rangle \tag{3.3} \]

holds for all $\psi \in \mathcal{H}$ and all $I \in \mathcal{B}(\mathbb{R})$.

Proof. Since the assumption $\kappa > 0$ implies the operator inequality $\kappa K^2 \le M$, the lemma is proven as Cor. 4.2 in [11] for any positive bounded function g with compact support. It extends to positive bounded functions with arbitrary support by a monotone-convergence argument.
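The probability bound (3.2) of Remark 3.1 rests only on the pointwise inequality $\chi_{[1,\infty[}(n) \le n$ for eigenvalue counts n, so it already holds sample by sample. The following minimal sketch illustrates this on a toy model. The model (a discrete Laplacian on a short chain with a uniformly distributed random potential), the function name `eigenvalue_counts` and all parameter values are assumptions chosen for illustration; they are not the operators studied in this paper.

```python
import numpy as np

def eigenvalue_counts(n_sites, interval, n_samples, seed=0):
    """Count eigenvalues in `interval` for samples of a toy random matrix.

    Toy model (an assumption, not the operator of the paper): the discrete
    Laplacian on a chain of `n_sites` sites plus an i.i.d. uniform potential.
    """
    rng = np.random.default_rng(seed)
    lo, hi = interval
    # Nearest-neighbour hopping matrix with free ends.
    hopping = -np.eye(n_sites, k=1) - np.eye(n_sites, k=-1)
    counts = np.empty(n_samples, dtype=int)
    for s in range(n_samples):
        potential = rng.uniform(-1.0, 1.0, size=n_sites)
        eigenvalues = np.linalg.eigvalsh(hopping + np.diag(potential))
        counts[s] = np.count_nonzero((eigenvalues >= lo) & (eigenvalues < hi))
    return counts

counts = eigenvalue_counts(n_sites=20, interval=(0.0, 0.05), n_samples=500)
prob_nonempty = np.mean(counts >= 1)  # empirical version of the l.h.s. of (3.2)
mean_count = np.mean(counts)          # empirical version of the r.h.s. of (3.2)
print(prob_nonempty, mean_count)
```

Because the inequality is pointwise, the empirical frequency can never exceed the empirical mean, whatever the sample size or the seed.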


3.2. Upper bounds on the density of states. If the fraction $RZ/v_1$ on the r.h.s. of the Wegner estimate (3.1) is independent of $\Lambda$ for sufficiently large $|\Lambda|$, this estimate enables one to prove the absolute continuity of the infinite-volume density-of-states measure with a magnetic field.

Corollary 3.1. Let A and V have the properties (C), (S), (I) and (E). Suppose furthermore:
(i) there exists a sequence $(\Lambda)$ of bounded open cubes $\Lambda \subset \mathbb{R}^d$ with $\Lambda \uparrow \mathbb{R}^d$ such that infinitely many of them admit a decomposition $\Lambda = \big(\bigcup_{j=1}^{J} \overline{\Lambda_j}\big)^{\mathrm{int}}$ into a finite number J (depending on $\Lambda$) of pairwise disjoint open cubes $\Lambda_1, \dots, \Lambda_J$.
(ii) V obeys the assumptions of Theorem 3.1 for every such decomposition with constants $\beta, v_1, R, Z > 0$, all of them not depending on $\Lambda$.
Then the density-of-states measure $\nu$ is absolutely continuous with respect to the Lebesgue measure. Moreover, its Lebesgue density w, called the density of states, is locally bounded according to

\[ w(E) := \frac{\nu(dE)}{dE} = \frac{dN(E)}{dE} \le \frac{R Z}{v_1}\, e^{\beta E} =: W(E) \tag{3.4} \]

for Lebesgue-almost all energies $E \in \mathbb{R}$.

Proof. Let $I \subset \mathbb{R}$ be bounded and open. Then (2.12) together with [6, Satz 30.2] implies that $\nu(I) \le \liminf_{\Lambda \uparrow \mathbb{R}^d} |\Lambda|^{-1} \nu^{(\omega)}_{\Lambda,X}(I)$ for $\mathbb{P}$-almost all $\omega \in \Omega$. Therefore, by the non-randomness of the density-of-states measure $\nu$ and Fatou's lemma we have

\[ \nu(I) \le \liminf_{\Lambda \uparrow \mathbb{R}^d} \frac{\mathbb{E}\big[\nu_{\Lambda,X}(I)\big]}{|\Lambda|} \le |I|\, \frac{R Z}{v_1}\, e^{\beta \sup I}. \tag{3.5} \]

Here we used (3.1) and the assumption that the constants involved there do not depend on $\Lambda$. Now the Radon–Nikodým theorem yields the claimed absolute continuity of $\nu$.

4. Examples Illustrating the Results of Section 3

Assumption (iii) of Theorem 3.1 may be checked in various ways. For example, by the diamagnetic inequality (A.24) of Appendix A for Neumann partition functions one sees that a possible choice of Z in (3.1) is

\[ Z_1 := \max_{1 \le j \le J} |\Lambda_j|^{-1}\, \mathbb{E}\big[\operatorname{Tr} e^{-\beta H_{\Lambda_j,N}(0, U_j)}\big]. \tag{4.1} \]

This yields an upper bound on $\mathbb{E}[\nu_{\Lambda,X}(I)]$ in (3.1) which is independent of the magnetic field and, in particular, coincides with the one in [23, Thm. 2]. Rather weak conditions on the random potential $U_j$ assuring the finiteness of the expectation value in (4.1) can be found in [21]. Another choice of Z results from applying the following averaged Golden–Thompson inequality.


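Lemma 4.1. Let $\Lambda \subset \mathbb{R}^d$ be a bounded open cube and assume that A and V enjoy properties (B) and (F). Then the averaged partition function of $H_{\Lambda,X}(A, V)$ is bounded for all $\beta > 0$ according to

\[ \mathbb{E}\big[\operatorname{Tr} e^{-\beta H_{\Lambda,X}(A, V)}\big] \le \operatorname{Tr} e^{-\beta H_{\Lambda,X}(A, 0)}\; \operatorname*{ess\,sup}_{x \in \Lambda} \mathbb{E}\big[e^{-\beta V(x)}\big], \tag{4.2} \]

provided that the essential supremum on the r.h.s. is finite.

Proof. We proceed as in the proof of [36, Thm. 3.4(ii)] and define $V_n^{(\omega)}(x) := \max\{-n, V^{(\omega)}(x)\}$ for $n \in \mathbb{N}$ and $\omega \in \Omega_F$. The Golden–Thompson inequality [53] yields

\[ \operatorname{Tr} e^{-\beta H_{\Lambda,X}(A, V_n^{(\omega)})} \le \operatorname{Tr}\big( e^{-\beta H_{\Lambda,X}(A, 0)}\, e^{-\beta V_n^{(\omega)}} \big). \tag{4.3} \]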
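The Golden–Thompson inequality used in (4.3) states that $\operatorname{Tr} e^{X+Y} \le \operatorname{Tr}(e^{X} e^{Y})$ for self-adjoint X and Y. As a minimal sketch, it can be checked numerically for finite Hermitian matrices. The random matrices below are merely stand-ins (an assumption for illustration) for $-\beta H_{\Lambda,X}(A,0)$ and $-\beta V_n^{(\omega)}$, and the helper names are hypothetical.

```python
import numpy as np

def expm_hermitian(h):
    """Matrix exponential of a Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(h)
    # Scale the eigenvector columns by exp(eigenvalue), then undo the rotation.
    return (v * np.exp(w)) @ v.conj().T

def golden_thompson_sides(a, b):
    """Return (Tr e^{a+b}, Tr e^a e^b) for Hermitian matrices a and b."""
    lhs = np.trace(expm_hermitian(a + b)).real
    rhs = np.trace(expm_hermitian(a) @ expm_hermitian(b)).real
    return lhs, rhs

rng = np.random.default_rng(1)
n = 6
x = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
kinetic = (x + x.conj().T) / 2           # Hermitian stand-in for -beta*H(A,0)
potential = np.diag(rng.normal(size=n))  # diagonal stand-in for -beta*V_n
lhs, rhs = golden_thompson_sides(kinetic, potential)
print(lhs, rhs)
```

Equality would require the two matrices to commute, which generic random choices do not.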

We then evaluate the trace on the r.h.s. in an orthonormal eigenbasis of $H_{\Lambda,X}(A, 0)$. Using Fubini's theorem, the probabilistic expectation of the quantum-mechanical expectation of $\exp(-\beta V_n)$ with respect to a normalized eigenfunction of $H_{\Lambda,X}(A, 0)$ is estimated by $\operatorname*{ess\,sup}_{x \in \Lambda} \mathbb{E}[\exp(-\beta V_n(x))]$, which is smaller than the second factor on the r.h.s. of (4.2) since $V \le V_n$. The proof is completed by noting that the l.h.s. of (4.3) converges for $n \to \infty$ to the trace on the l.h.s. of (4.2) by monotone convergence of forms [51, Thm. S.16], similar to the proof of [36, Prop. 2.1(e)].

Using (4.2) one gets

\[ Z_2 := \max_{1 \le j \le J} |\Lambda_j|^{-1}\, \operatorname{Tr} e^{-\beta H_{\Lambda_j,N}(A, 0)}\; \operatorname*{ess\,sup}_{x \in \Lambda_j} \mathbb{E}\big[e^{-\beta U_j(x)}\big] \tag{4.4} \]

as another choice for Z in (3.1). By (A.24) one may further estimate the magnetic Neumann partition function in (4.4) according to

\[ \operatorname{Tr} e^{-\beta H_{\Lambda,N}(A, 0)} \le \operatorname{Tr} e^{-\beta H_{\Lambda,N}(0, 0)} \le |\Lambda| \big( |\Lambda|^{-1/d} + (2\pi\beta)^{-1/2} \big)^d. \tag{4.5} \]

The second inequality follows from the explicitly known [53, p. 266] spectrum of $H_{\Lambda,N}(0, 0)$. Applying (4.5) to (4.4) one weakens $Z_2$ to a rather explicit choice of Z in (3.1) given by

\[ Z_3 := \max_{1 \le j \le J} \big( |\Lambda_j|^{-1/d} + (2\pi\beta)^{-1/2} \big)^d\; \operatorname*{ess\,sup}_{x \in \Lambda_j} \mathbb{E}\big[e^{-\beta U_j(x)}\big]. \tag{4.6} \]
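The second inequality in (4.5) can be checked directly from the spectrum of the Neumann Laplacian on a cube. The sketch below assumes a cube of edge length `side` and the normalization $H_{\Lambda,N}(0,0) = -\Delta/2$, so that the one-dimensional eigenvalues are $(\pi n/\mathrm{side})^2/2$ with $n = 0, 1, 2, \dots$; the function names and the truncation parameter `n_max` are hypothetical choices made for illustration.

```python
import math

def neumann_trace(beta, side, dim, n_max=20000):
    """Tr exp(-beta*H) for H = -Laplacian/2 with Neumann b.c. on a cube.

    The one-dimensional eigenvalues are (pi*n/side)**2 / 2, n = 0, 1, 2, ...;
    the d-dimensional trace factorizes into a d-th power. The series is
    truncated at n_max, which can only lower the computed value.
    """
    one_dim = sum(math.exp(-beta * (math.pi * n / side) ** 2 / 2)
                  for n in range(n_max))
    return one_dim ** dim

def bound_4_5(beta, side, dim):
    """Right-hand side of (4.5): |Lambda| (|Lambda|^(-1/d) + (2 pi beta)^(-1/2))^d."""
    volume = side ** dim
    return volume * (1.0 / side + (2 * math.pi * beta) ** -0.5) ** dim

for beta, side, dim in [(0.1, 1.0, 1), (1.0, 3.0, 2), (5.0, 2.0, 3)]:
    print(beta, side, dim, neumann_trace(beta, side, dim), bound_4_5(beta, side, dim))
```

The bound reflects the comparison of the sum over n ≥ 1 with the Gaussian integral $\int_0^\infty e^{-\beta \pi^2 x^2/(2\,\mathrm{side}^2)}\,dx = \mathrm{side}\,(2\pi\beta)^{-1/2}$, plus the term 1 from the zero mode n = 0.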

4.1. Alloy-type random potentials. The existence of a $(U, \lambda, u, \varrho)$-decomposition of V as required in Theorem 3.1 is immediate for alloy-type random potentials whose coupling strengths are distributed according to a Borel probability measure on the real line with a bounded Lebesgue density. To illustrate the essentials of Theorem 3.1 we first consider the case of positive potentials.


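Corollary 4.1. Let A and V have the properties (B) and (A). Assume that $u_0 \in L^\infty_{\mathrm{loc}}(\mathbb{R}^d)$ and that the probability distribution of $\lambda_0$ has a Lebesgue density $g \in L^\infty(\mathbb{R})$ with support in the positive half-line $[0, \infty[$. Furthermore, suppose that there exist two strictly positive constants $v_1, v_2 > 0$ such that

\[ v_1\, \chi_{\Lambda(0)}(x) \le u_0(x) \quad \text{and} \quad u_0(x)\, \chi_{\Lambda(0)}(x) \le v_2 \tag{4.7} \]

for Lebesgue-almost all $x \in \mathbb{R}^d$. Then for each bounded open cube of the form

\[ \Lambda = \Big( \bigcup_{j \in \Lambda \cap \mathbb{Z}^d} \overline{\Lambda(j)} \Big)^{\mathrm{int}}, \tag{4.8} \]

one has

\[ \mathbb{E}\big[\nu_{\Lambda,X}(I)\big] \le |\Lambda|\, |I|\, W_A(\sup I) \tag{4.9} \]

for both X = D and X = N and all $I \in \mathcal{B}(\mathbb{R})$. Here $W_A$ is the function

\[ \mathbb{R} \ni E \mapsto W_A(E) := \big( 1 + (2\pi\beta)^{-1/2} \big)^d\, \frac{\|g\|_\infty}{v_1}\, e^{\beta E} \tag{4.10} \]

with $\beta \in\, ]0, \infty[$ serving as a variational parameter.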

Proof. For each $j \in \Lambda \cap \mathbb{Z}^d$, the choice $u_j(x) := u_0(x - j)$ and $U_j^{(\omega)}(x) := V^{(\omega)}(x) - \lambda_j^{(\omega)} u_j(x)$ yields a $(U_j, \lambda_j, u_j, g)$-decomposition of V in the sense of Definition 3.1. It remains to verify the three assumptions of Theorem 3.1. Assumption (i) is guaranteed by (4.7). Assumption (ii) is fulfilled with $R = \|g\|_\infty$. To verify assumption (iii), we make use of (4.6) and observe that $U_j^{(\omega)} \ge 0$.

Remark 4.1. (i) The estimates in the proof of Corollary 4.1, when specializing the fraction $RZ/v_1$ of Theorem 3.1 to $W_A$, were unnecessarily rough for the sake of simplicity. In specific examples the upper bound $W_A$ may be improved. Moreover, more general alloy-type random potentials are also covered by Theorem 3.1. In particular, the random potential may be unbounded from below, see the next corollary. Furthermore, one may allow for correlated coupling strengths $\{\lambda_j\}$ as long as the relevant conditional probabilities have bounded Lebesgue densities.

(ii) Apart from the existence of a bounded Lebesgue density for the coupling strength $\lambda_0$, one further restrictive assumption of Corollary 4.1 is that the single-site potential $u_0$ must possess a definite sign. The latter may be slightly weakened such that one may treat certain $u_0$ taking on values of both signs by choosing a more complicated decomposition different from the natural one used in the proof of Corollary 4.1. This basically corresponds to the linear-transformation technique introduced in [63] which turns certain given alloy-type random potentials into ones with positive single-site potentials and correlated coupling strengths, see the previous Remark 4.1(i). In any case, the fact that $u_0$ must possess a sufficiently large support is believed to be important for the absolute continuity of the integrated density of states in the presence of a magnetic field, see Remark 4.3(ii).


(iii) We only know of [12, 4, 5, 64] where Wegner estimates for magnetic Schrödinger operators with alloy-type random potentials have been derived.¹ The Wegner estimate of [4] is proven for energies in pre-supposed gaps of the spectrum of H(A, 0). The other three works consider the case of two space dimensions and a perpendicular constant magnetic field, see Subsect. 4.3, especially Remark 4.3(iii) and 4.3(iv).

We close this subsection by considering the example of an alloy-type random potential that is unbounded from below, with exponentially decaying probability density for its (independent) coupling strengths. This example is marginal in the sense that any such density has to fall off at minus infinity at least as fast as exponentially in order to ensure the applicability of Theorem 3.1.

Corollary 4.2. Let A and V have the properties (B) and (A). Assume a Laplace distribution for $\lambda_0$, that is

\[ \mathbb{P}\{\lambda_0 \in I\} = \frac{1}{2\alpha} \int_I d\xi\, e^{-|\xi|/\alpha}, \qquad I \in \mathcal{B}(\mathbb{R}), \tag{4.11} \]

with some $\alpha > 0$. Furthermore, suppose that $u_0 \in L^\infty_{\mathrm{loc}}(\mathbb{R}^d)$ and that (4.7) holds with some $v_1, v_2 > 0$ and let

\[ K_\beta := - \operatorname*{ess\,inf}_{x \in \Lambda(0)} \sum_{j \in \mathbb{Z}^d} \ln\big( 1 - [\beta \alpha\, u_0(x - j)]^2 \big) < \infty \tag{4.12} \]

be finite for some $\beta \in\, ]0, (\alpha \|u_0\|_\infty)^{-1}[$. Finally, let $\Lambda$ be of the form (4.8). Then (4.9) holds where $W_A$ may be taken as the function

\[ \mathbb{R} \ni E \mapsto W_A(E) := \big( 1 + (2\pi\beta)^{-1/2} \big)^d\, \frac{1 - (\beta \alpha v_1)^2}{2 \alpha v_1}\, e^{\beta E + K_\beta} \tag{4.13} \]

with β ∈ β ∈ ] 0, (α u0 ∞ )−1 [ : Kβ < ∞ serving as a variational parameter. Proof. The proof is analogous to that of Corollary 4.1. To verify the assumptions of Theorem 3.1 we note that assumption (i) is guaranteed by (4.7). Assumption (ii) is fulfilled with R = (2α)−1 if β ∈] 0, (αv2 )−1 ]. As for assumption (iii), we make use of (4.6) and explicitly compute the involved expectation if β ∈] 0, (α u0 ∞ )−1 [. 4.2. Gaussian random potentials. As another application of Theorem 3.1 we note that the Wegner estimate derived previously [23, Thm. 1] for certain Gaussian random potentials and the case without magnetic field remains valid in the present setting. The reason for this is the fact that every Wegner estimate stemming from [23, Thm. 2] is also one in the presence of a magnetic field thanks to the diamagnetic inequality. Corollary 4.3. Let A and V have the properties (B) and (G). Moreover, assume that a finite signed Borel measure µ on Rd , which is normalized in the sense that there exist d x) d y) C(x −y) = C(0), an open subset > ⊂ Rd with volume > > 0 µ(d µ(d Rd Rd and a constant γ > 0 such that the covariance function C of V obeys γ χ > (x) ≤ (C(0))−1 µ(dd y) C(x − y) =: (C(0))−1/2 u(x) (4.14) Rd

1 See, however, note added in proof.


for all x ∈ Rd . Then for each @ > 0, for which there exists a bounded open cube (@) ⊆ > with edges of length @ parallel to the co-ordinate axes, and each bounded open cube ⊂ Rd satisfying the matching condition ||1/d /@ ∈ N, one has E ν,X (I ) ≤ || |I | WG ( sup I ) (4.15) for both X = D and X = N and all I ∈ B(R). Here WG is the function d exp βE + β 2 C@ /2 (4.16) E → WG (E) := 2@ + (2πβ) √ 2π C(0) b@ where we introduced the constants C@ := C(0) 1 + B@2 − b@2 , B@ := (C(0))−1/2 supx∈(@) u(x) and b@ := (C(0))−1/2 inf x∈(@) u(x) ≥ γ . Finally, β ∈ ] 0, ∞[ serves, besides @, as a second variational parameter.

−1

−1/2

Proof. The key input is the fact that every Gaussian random potential V admits a (U, λ, u, ,)-decomposition in the sense of Definition 3.1. More precisely, λ(ω) := −1/2 d (ω) (x) is a standard Gaussian random variable with Lebesgue (C(0)) Rd µ(d x)V density ,(ξ ) := (2π)−1/2 exp −ξ 2 /2 . This random variable and the Gaussian random field U (ω) (x) := V (ω) (x) − λ(ω) u(x), where u is defined in (4.14), are stochastically independent. For details see the proof of [23, Thm. 1]. To obtain the specific form WG , which is independent of the magnetic field, we used (4.6). Remark 4.2. (i) Without loss of generality, every measure µ yielding (4.14) may be normalized in the sense of the assumption in the above corollary. The measure µ allows one to apply Corollary 4.3 to Gaussian random potentials with certain covariance functions taking on also negative values. Examples are given in [23, 30]. (ii) If C(x) ≥ 0 for all x ∈ Rd , we may choose µ equal to Dirac’s point measure at the origin. Due to the continuity of C and since C(0) > 0, condition (4.14) is then fulfilled with some sufficiently small cube > containing the origin and γ = inf x∈> C(x)/C(0). Under stronger conditions on the vector potential A the Wegner estimate for this case has been stated in [24, Prop. 2.14] where it serves as one input for a proof of Anderson localization by certain Gaussian random potentials, see Remark 3.1. (iii) Choosing @ = |E|−1/4 and β = (2C@ )−1 E 2 + 2d C@ − E we obtain the following leading low- and high-energy behaviour: lim

E→−∞

ln WG (E) 1 =− , 2 E 2C(0)

lim

E→∞

WG (E) (e/(π d))d/2 = . √ E d/2 2π u(0)

(4.17)

Since $W_G$ provides an upper bound on the density of states (see Corollary 3.1), its low-energy behaviour is optimal in the sense that it coincides with that of the derivative of the known low-energy behaviour of the integrated density of states [43, 62, 8]. This is not true for the high-energy behaviour. It is known [43, 62] that the high-energy growth of the integrated density of states is neither affected by the random potential nor by the magnetic field and is proportional to $E^{d/2}$ for $E \to \infty$ in analogy to Weyl's celebrated asymptotics for the free particle [66]. Note that the constant on the r.h.s. of the second equation in (4.17) is smaller than the one given by [23, Eq. (14)].
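The $E^{d/2}$ growth mentioned above can be made concrete for the free particle. The sketch below assumes the kinetic-energy normalization $-\Delta/2$, for which the free integrated density of states is $|B_d|\,(2E)^{d/2}/(2\pi)^d$ with $|B_d|$ the volume of the unit ball, and it compares this Weyl term with a direct count of Neumann eigenvalues on a cube. The function names and parameter values are hypothetical choices for illustration.

```python
import itertools
import math

def weyl_ids(energy, dim):
    """Free integrated density of states of -Laplacian/2 on R^d (Weyl term)."""
    if energy <= 0:
        return 0.0
    unit_ball = math.pi ** (dim / 2) / math.gamma(dim / 2 + 1)
    return unit_ball * (2 * energy) ** (dim / 2) / (2 * math.pi) ** dim

def neumann_count_per_volume(energy, side, dim):
    """Number of Neumann eigenvalues below `energy` on a cube, per volume.

    Eigenvalues are (pi/side)**2 * |n|**2 / 2 with n in {0, 1, 2, ...}^dim.
    """
    n_max = int(side * math.sqrt(2 * energy) / math.pi) + 1
    radius_sq = 2 * energy * (side / math.pi) ** 2
    count = sum(1 for n in itertools.product(range(n_max + 1), repeat=dim)
                if sum(k * k for k in n) < radius_sq)
    return count / side ** dim

for side in (5.0, 20.0, 80.0):
    print(side, neumann_count_per_volume(3.0, side, 2), weyl_ids(3.0, 2))
```

In d = 2 the Weyl term reduces to $E/(2\pi)$, whose derivative is the constant $1/(2\pi)$. The Neumann count per volume lies above the Weyl term, because the unit cubes attached to the counted lattice points cover the corresponding part of the ball, and it approaches the Weyl term as the cube grows.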


4.3. Two space dimensions: random Landau Hamiltonians. In this subsection we consider the special case of two space dimensions and a perpendicular constant magnetic field of strength $B := B_{12} > 0$. Accordingly, the vector potential in the symmetric gauge is given by

\[ A(x) = \frac{B}{2} \begin{pmatrix} -x_2 \\ x_1 \end{pmatrix}, \qquad x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \in \mathbb{R}^2. \tag{4.18} \]

This case has received considerable attention during the last three decades [2, 37] in the physics of low-dimensional electronic structures. The magnetic Schrödinger operator on $L^2(\mathbb{R}^2)$ modelling the non-relativistic motion of a particle with unit charge on the Euclidean plane $\mathbb{R}^2$ under the influence of this magnetic field is the Landau Hamiltonian. Its spectral resolution dates back to Fock [25] and Landau [38] and is given by the strong-limit relation

\[ H(A, 0) = \frac{B}{2} \sum_{l=0}^{\infty} (2l + 1)\, P_l. \tag{4.19} \]

The energy eigenvalue $(l + 1/2)B$ is called the l-th Landau level and the corresponding orthogonal eigenprojection $P_l$ is an integral operator with continuous complex-valued kernel

\[ P_l(x, y) := \frac{B}{2\pi} \exp\Big[ i\, \frac{B}{2} (x_2 y_1 - x_1 y_2) - \frac{B}{4} |x - y|^2 \Big]\, L_l\Big( \frac{B}{2} |x - y|^2 \Big), \tag{4.20} \]

given in terms of the l-th Laguerre polynomial $\xi \mapsto L_l(\xi) := \frac{1}{l!}\, e^{\xi}\, \frac{d^l}{d\xi^l} \big( \xi^l e^{-\xi} \big)$, $\xi \ge 0$, [27, Sect. 8.97]. The diagonal $P_l(x, x) = B/(2\pi)$ is naturally interpreted as the degeneracy per area of the l-th Landau level. Using definition (2.10) with V = 0, the integrated density of states of the Landau Hamiltonian (4.19) turns out to be the well-known “staircase” function

\[ N(E) = \frac{B}{2\pi} \sum_{l=0}^{\infty} \Theta\Big( E - \big( l + \tfrac{1}{2} \big) B \Big), \qquad V = 0, \tag{4.21} \]

which is obviously not absolutely continuous with respect to the Lebesgue measure. For the derivation of (4.21) one may apply [51, Thm. VI.23] because the operator $P_l\, \chi_{\Lambda(0)}$ is Hilbert–Schmidt, more precisely $\operatorname{Tr}[\chi_{\Lambda(0)} P_l\, \chi_{\Lambda(0)}] = B/(2\pi) < \infty$. Alternatively one may compute [45, App. B] the infinite-area limit $\lim_{\Lambda \uparrow \mathbb{R}^2} |\Lambda|^{-1} \operatorname{Tr} \Theta\big(E - H_{\Lambda,X}(A, 0)\big)$ for some boundary condition X. The result coincides with (4.21) by Prop. 2.3. Informally, the density of states associated with (4.21) is a series of Dirac delta functions supported at the Landau levels. The corresponding infinities are indicated by vertical lines in Fig. 4.1 and together form what might be called a “Dirac half-comb”. By adding a random potential V to (4.19), the delta peaks are expected to be smeared out. In fact, under the assumptions of Corollary 3.1 they are smeared out completely in the sense that the density of states w of the arising random Landau Hamiltonian H(A, V) = H(A, 0) + V is shown there to be locally bounded. For example, in the presence of a Gaussian random potential with the Gaussian covariance function $C(x) = C(0) \exp\big( -|x|^2/(2\tau^2) \big) > 0$, $\tau > 0$, Fig. 4.1 contains the graph of the upper bound $W_G$ on w given in (4.16) after (numerically) minimizing with


[Figure 4.1: graph of $W_G$ versus the energy E, with marks at 0, B/2, 3B/2 and 5B/2 on the energy axis and at 1/2π on the vertical axis.]

Fig. 4.1. Plot of the upper bound $W_G(E)$ on $w(E)$ as a function of the energy E. Here w is the density of states of the Landau Hamiltonian with a Gaussian random potential with Gaussian covariance function. The dashed line is a plot of the graph of an approximation to w. The exact w is unknown. Vertical lines indicate the delta peaks which reflect the non-existence of the density of states without random potential V. The step function $\Theta(E)/2\pi$ (not shown) is the free density of states characterized by B = 0 and V = 0. (See text)

respect to β, @ and a certain one-parameter subclass of possible decompositions of V . Here we picked a (small) disorder parameter, C(0) = (B/5)2 , and a (large) correlation length, τ = 100B −1/2 . We recall that the function WG is independent of B due to our application of the diamagnetic inequality, but nevertheless provides an upper bound on w for all B ≥ 0. Therefore WG (E) is a rather rough estimate of w(E) already for energies E < B/2 and, in particular, starts increasing significantly at too low energies. Nevertheless, the upper bound shows that the density of states w has no infinities for arbitrarily weak disorder, that is, for arbitrarily small C(0) > 0. In fact, in the above situation we believe the graph of w to look similar to the dashed line in Fig. 4.1. We conclude this subsection with several remarks: Remark 4.3. (i) Unfortunately, our upper bound W in (3.4) is never sharp enough to reflect the expected “magneto-oscillations” of w. Instead, by construction W is always increasing. (ii) The assumptions of Corollary 3.1 guarantee in particular that there occurs no point component in the Lebesgue decomposition of the density-of-states measure ν. Using Corollary 2.1, this implies that any given energy E ∈ R, in particular any Landau-level energy, is P-almost surely no eigenvalue under these assumptions. This stands in contrast to a certain situation with random point impurities, in which case the authors of [20] show that finitely many Landau-level energies remain infinitely degenerate eigenvalues if B is sufficiently large. (iii) Exploiting the existence of spectral gaps of H (A, 0), a Wegner estimate for Landau Hamiltonians with alloy-type random potentials is derived in [12, 4, 5] which proves that ν is absolutely continuous when restricted to intervals between the Landau-level energies. For this result to hold the authors were able to weaken the assumption (4.7) on the size of the support of the single-site potential which our Corollary 4.1 requires. 
On
the other hand, absolute continuity of $\nu$ at all energies is proven in [12] only for bounded random potentials under the present assumptions on the support.

(iv) In [64] a Wegner estimate for alloy-type random potentials is derived without assuming a definite sign of the single-site potential. However, this estimate holds only between the Landau-level energies for sufficiently strong magnetic field and does not enable one to deduce the existence of the density of states.

(v) In [30] the integrated density of states associated with the restricted random Landau Hamiltonian $P_l H(A, V) P_l$ of a single but arbitrary Landau level is shown to be absolutely continuous for Gaussian random potentials satisfying the assumptions of Corollary 4.3 (for d = 2).

A. On Finite-Volume Schrödinger Operators with Magnetic Fields

For convenience of the reader (and the authors), this appendix defines non-random magnetic Schrödinger operators with Neumann boundary conditions and compiles some of their basic properties. In passing, the more familiar basic properties of the corresponding operators with Dirichlet boundary conditions are briefly recalled, see for example [42, 9]. In particular, we prove a diamagnetic inequality for Neumann Schrödinger operators and Dirichlet–Neumann bracketing for a wide class of vector potentials including singular ones. Altogether, this appendix may be understood to extend some of the results in the key papers [31, 32, 3, 57] to the case of Neumann boundary conditions.

Throughout this appendix, $\Lambda \subseteq \mathbb{R}^d$ denotes a non-empty open, not necessarily proper subset of d-dimensional Euclidean space $\mathbb{R}^d$ with $d \in \mathbb{N}$. Moreover, $a : \mathbb{R}^d \to \mathbb{R}^d$ stands for a vector potential and $v : \mathbb{R}^d \to \mathbb{R}$ for a scalar potential with $v_\pm := (|v| \pm v)/2$ denoting its positive respectively negative part. We will assume throughout that

\[ |a|^2,\ v_+ \in L^1_{\mathrm{loc}}(\mathbb{R}^d). \tag{A.1} \]

The negative part $v_-$ is assumed to be a form perturbation either of $H_{\Lambda,N}(a, 0)$ or even of $H_{\Lambda,N}(0, 0)$. By this we mean that $v_-$ is form-bounded [52, Def. p. 168] with form bound strictly smaller than one either relative to $H_{\Lambda,N}(a, 0)$ or even to $H_{\Lambda,N}(0, 0)$. Both operators will be defined in Lemma A.1 below. The operator $H_{\Lambda,N}(0, 0)$ is the usual Neumann Laplacian, up to a factor of −1/2.

Remark A.1. By the diamagnetic inequality, see Prop. A.2 below, we will see that $v_-$ is a form perturbation of $H_{\Lambda,N}(a, 0)$ if it is one of $H_{\Lambda,N}(0, 0)$. If $\Lambda$ is a bounded open cube, an easy-to-check sufficient criterion for $v_-$ to be even infinitesimally form-bounded [52, Def. p. 168] relative to $H_{\Lambda,N}(0, 0)$ can be taken from [36, Lemma 2.1] and reads

\[ v_- \in L^p_{\mathrm{loc}}(\mathbb{R}^d) \tag{A.2} \]

with p = 1 if d = 1, some p > 1 if d = 2 and some p ≥ d/2 if d ≥ 3.

A.1. Definition of magnetic Neumann Schrödinger operators. In a first step, we consider the case v = 0 and $|a|^2 \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ or, equivalently, $a \in \big(L^2_{\mathrm{loc}}(\mathbb{R}^d)\big)^d$, that is, $a_j \in L^2_{\mathrm{loc}}(\mathbb{R}^d)$ for all $j \in \{1, \dots, d\}$. We define the sesquilinear form

\[ h^{a,0}_{\Lambda,N}(\varphi, \psi) := \frac{1}{2} \sum_{j=1}^{d} \big\langle (i\nabla + a)_j\, \varphi,\ (i\nabla + a)_j\, \psi \big\rangle \tag{A.3} \]


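for all $\varphi$ and $\psi$ in its form domain

\[ W_a^{1,2}(\Lambda) := \big\{ \phi \in L^2(\Lambda) : (i\nabla + a)\, \phi \in \big(L^2(\Lambda)\big)^d \big\}, \tag{A.4} \]

which might be called a magnetic Sobolev space, see [39, Sect. 7.20] in case $\Lambda = \mathbb{R}^d$. Here and in the following, $\nabla - ia$ denotes the gauge-covariant gradient in the sense of distributions on $C_0^\infty(\Lambda)$. In particular, this means

\[ W_a^{1,2}(\Lambda) = \bigcap_{j=1}^{d} \big\{ \phi \in L^2(\Lambda) : \text{there is } \phi_j \in L^2(\Lambda) \text{ such that } \langle \phi,\ i\partial_j \eta + a_j \eta \rangle = \langle \phi_j, \eta \rangle \text{ for all } \eta \in C_0^\infty(\Lambda) \big\}. \tag{A.5} \]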

Remark A.2. We emphasize that the condition ψ ∈ Wa1,2 () allows for the case that d neither ∇ψ nor aψ belongs to L2 () . In general, ψ ∈ Wa1,2 () only implies ∇ψ ∈

d d 1 , the usual firstLloc () and | ψ | ∈ W 1,2 () := φ ∈ L2 () : ∇φ ∈ L2 () order Sobolev space of L2 -type. The latter statement is a consequence of the diamagnetic inequality, see Remark A.5(iv) below and [59]. If even |a|2 ∈ L∞ (Rd ), the magnetic Sobolev space coincides with the usual one, Wa1,2 () = W 1,2 (), up to equivalence of norms. Basic facts about ha,0 ,N are summarized in 2 Lemma A.1. The form ha,0 ,N is densely defined on L (), symmetric, positive and closed. It therefore uniquely defines a self-adjoint positive operator H,N (a, 0) on L2 () which, up to a factor of −1/2, is called magnetic Neumann Laplacian.

Proof. Since C0∞ () ⊂ Wa1,2 () ⊂ L2 () and C0∞ () is dense in L2 (), the form ha,0 ,N is densely defined. Its symmetry and positivity are obvious from the definition. To

1,2 prove that ha,0 ,N is also closed we have to show that the space Wa () is complete with respect to the (metric induced by the form-) norm φ, φ + ha,0 (A.6) ,N (φ, φ).

To this end, we proceed along the lines of Sects. 7.20 and 7.3 in [39] and let (φn )n∈N be a sequence in Wa1,2 () which is Cauchy with respect to the norm (A.6). By completeness of L2 (), there exist functions φ, ψj ∈ L2 (), j ∈ {1, . . . , d}, such that φn → φ and (i∇ + a)j φn → ψj strongly in L2 () as n → ∞. Since (i∇ + a)j φn → (i∇ + a)j φ in the sense of distributions on C0∞ () as n → ∞, we have (i∇ + a)j φ = ψj and hence φ ∈ Wa1,2 (). The existence and uniqueness of H,N (a, 0) follow now from the one-to-one correspondence between densely defined, symmetric, bounded below, closed forms and self-adjoint, bounded below operators, see [51, Thm. VIII.15]. Remark A.3. (i) We recall that the operator H,N (a, 0) has the subspace ∈ L2 () such that D H,N (a, 0) := ψ ∈ Wa1,2 () : there is ψ ha,0 ,N (ϕ, ψ) = ϕ , ψ

for all ϕ ∈ Wa1,2 ()

(A.7)


of its underlying form domain as its operator domain and acts according to . H,N (a, 0) ψ = ψ (ii) Let Dj (a) denote the closure of the symmetric operator C0∞ () ψ → (i∇ + a)j ψ ∈ L2 (). Being the closure of a symmetric operator, Dj (a) is symmetric. The domain of its adjoint Dj† (a) is given by D Dj† (a) := ψ ∈ L2 () : (i∇ + a)j ψ ∈ L2 () ,

(A.8)

because the adjoint of C0∞ () ψ → (i∇ + a)j ψ coincides with that of its closure. While for a proper subset = Rd the operator Dj (a) is not self-adjoint, it is so for = Rd [57, Lemma 2.5]. In the latter case it may physically be interpreted, up to a sign, as the j th component of the velocity (operator). By construction the magnetic Neumann Laplacian is a form sum of d operators in accordance with H,N (a, 0) =

d 1

Dj (a) Dj† (a) , 2

(A.9)

j =1

where the self-adjoint positive operator Dj (a) Dj† (a) comes from the closed form † Dj (a) ϕ, Dj† (a) ψ with form domain (A.8). Note that (A.8) is just the j th set of the intersection on the r.h.s. of (A.5). See also Thm. X.25 in [52]. 1,2 ∞ (iii) Restricting the form ha,0 ,N to the domain C0 () ⊂ Wa (), one obtains a form

which is closable in Wa1,2 () with respect to the norm (A.6), see [57, 42, 9]. Its closure ha,0 ,D is uniquely associated with another self-adjoint positive operator H,D (a, 0) on L2 () which, up to a factor of −1/2, is called magnetic Dirichlet Laplacian. For general 2 d a ∈ Lloc (Rd ) the space C0∞ () is not contained in D H,N (a, 0) , see (A.7). As a consequence, H,N (a, 0) in general cannot be restricted to C0∞ (). This stands in contrast to the case a = 0 where H,D (0, 0) is the Friedrichs extension of the restriction of H,N (0, 0) to C0∞ ().As the Dirichlet counterpart of (A.9) we only have the inequality H,D (a, 0) ≤ 21 dj =1 Dj† (a) Dj (a) which is meant in the sense of forms [53, Def. on p. 269]. The operators HRd,N (a, 0) and HRd,D (a, 0) are equal, see [57]. (iv) In the free case, which is characterized by a = 0 and v = 0, the just defined operators H,D (0, 0) and H,N (0, 0) coincide, up to a factor of −1/2, with the usual Dirichlet- and Neumann-Laplacian [53, p. 263], respectively. 1 (Rd ) and assume v to be a form perturbation In a second and final step, we let v+ ∈ Lloc − of H,N (a, 0). As a consequence, the sesquilinear form ! ! 1/2 1/2 1/2 1/2 a,0 (A.10) ha,v ,N (ϕ, ψ) := h,N (ϕ, ψ) + v+ ϕ, v+ ψ − v− ϕ, v− ψ

1,2 is well defined for all ϕ and ψ in its form domain Q ha,v ,N := Wa ()∩Q (v+ ), where 1/2 Q (v+ ) := φ ∈ L2 () : v+ φ ∈ L2 () . (A.11) Basic facts about ha,v ,N are summarized in


Lemma A.2. The form $h^{a,v}_{\Lambda,N}$ is densely defined on $L^2(\Lambda)$, symmetric, bounded below and closed. It therefore uniquely defines a self-adjoint, bounded below operator $H_{\Lambda,N}(a,v)$ on $L^2(\Lambda)$ which is called magnetic Neumann Schrödinger operator.

Proof. The domain $W_a^{1,2}(\Lambda) \cap \mathcal{Q}(v_+)$ of $h^{a,v_+}_{\Lambda,N}$ is dense in $L^2(\Lambda)$, because both $W_a^{1,2}(\Lambda)$ and $\mathcal{Q}(v_+)$ contain $C_0^\infty(\Lambda)$. Hence $H_{\Lambda,N}(a,v_+)$ is well defined as a form sum of $H_{\Lambda,N}(a,0)$ and $v_+$. Moreover, $h^{a,v_+}_{\Lambda,N}$ is symmetric, positive and closed, because it is the sum of two such forms. Since $H_{\Lambda,N}(a,0) \le H_{\Lambda,N}(a,v_+)$, the negative part $v_-$ of $v$ is also a form perturbation of $H_{\Lambda,N}(a,v_+)$. The proof of the lemma is then completed by the KLMN-theorem [52, Thm. X.17].

Remark A.4. Since the form domain of $h^{a,0}_{\Lambda,D}$ is contained in $W_a^{1,2}(\Lambda)$, the negative part $v_-$ of $v$ is also a form perturbation of $H_{\Lambda,D}(a,0) \le H_{\Lambda,D}(a,v_+)$. Hence one may apply the KLMN-theorem to define, similarly to $H_{\Lambda,N}(a,v)$, what is called the magnetic Dirichlet Schrödinger operator and denoted as $H_{\Lambda,D}(a,v)$.

An immediate consequence of the definition of $H_{\Lambda,X}(a,v)$ is the fact that so-called decoupling and Dirichlet–Neumann bracketing continues to hold for $a \neq 0$ as in the case $a = 0$, see Props. 3 and 4 in Sect. XIII.15 of [53], and [14, 45] for smooth $a \neq 0$.

Proposition A.1. Let $|a|^2, v_+ \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and $v_-$ be a form perturbation of $H_{\Lambda,N}(a,0)$. Moreover, let $\Lambda_1, \Lambda_2 \subset \mathbb{R}^d$ be a disjoint pair of non-empty open sets.

(i) Then the orthogonal decomposition

$$H_{\Lambda_1\cup\Lambda_2,X}(a,v) = H_{\Lambda_1,X}(a,v) \oplus H_{\Lambda_2,X}(a,v) \tag{A.12}$$

holds for both $X = D$ and $X = N$ on $L^2(\Lambda_1\cup\Lambda_2) = L^2(\Lambda_1) \oplus L^2(\Lambda_2)$.

(ii) Let $\Lambda := \overline{\Lambda_1\cup\Lambda_2}^{\,\mathrm{int}}$ be defined as the interior of the closure of the union of $\Lambda_1$ and $\Lambda_2$, and suppose that the interface $\Lambda \setminus (\Lambda_1\cup\Lambda_2)$ is of $d$-dimensional Lebesgue measure zero. Then the inequalities

$$H_{\Lambda_1\cup\Lambda_2,N}(a,v) \le H_{\Lambda,N}(a,v) \le H_{\Lambda,D}(a,v) \le H_{\Lambda_1\cup\Lambda_2,D}(a,v) \tag{A.13}$$

hold in the sense of forms.

Proof. The proofs of Props. 3 and 4 in Sect. XIII.15 of [53] for the free case carry over to the case $a \neq 0$ and $v \neq 0$. In particular, the inclusion relations between the various form domains for $a = 0$ and $v = 0$ hold analogously for the form domains in the case $a \neq 0$ and $v \neq 0$.

A.2. Diamagnetic inequality. A useful tool in the study of Schrödinger operators with magnetic fields is

Proposition A.2. Let $\Lambda \subseteq \mathbb{R}^d$ be open, $|a|^2, v_+ \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and $v_-$ be a form perturbation of $H_{\Lambda,N}(0,0)$. Then $v_-$ is a form perturbation of $H_{\Lambda,N}(a,0)$ with form bound not exceeding the one for $a = 0$ and the inequality

$$\bigl| e^{-tH_{\Lambda,X}(a,v)}\psi \bigr| \le e^{-tH_{\Lambda,X}(0,v)} |\psi| \tag{A.14}$$

holds for all $\psi \in L^2(\Lambda)$, all $t \ge 0$ and both $X = D$ and $X = N$.
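As an illustration of how the pointwise bound (A.14) is typically put to use, the following sketch derives the ordering of the ground-state energies from it. The abbreviation $E_0(a) := \inf \operatorname{spec} H_{\Lambda,X}(a,v)$ is notation introduced here for convenience and is not taken from the text; the only facts assumed beyond (A.14) are that $\| |\psi| \| = \|\psi\|$ in $L^2(\Lambda)$ and that $\|e^{-tH}\| = e^{-t \inf \operatorname{spec} H}$ for a self-adjoint $H$ that is bounded below.

```latex
% Assumed shorthand: E_0(a) := \inf\operatorname{spec} H_{\Lambda,X}(a,v).
\begin{align*}
\bigl\| e^{-tH_{\Lambda,X}(a,v)}\psi \bigr\|
  &= \bigl\| \, \bigl| e^{-tH_{\Lambda,X}(a,v)}\psi \bigr| \, \bigr\|
   \le \bigl\| e^{-tH_{\Lambda,X}(0,v)} |\psi| \bigr\|
   && \text{by (A.14)} \\
  &\le e^{-tE_0(0)} \, \bigl\| |\psi| \bigr\|
   = e^{-tE_0(0)} \, \|\psi\| .
\end{align*}
% Taking the supremum over all \psi with \|\psi\| = 1 gives
%   e^{-tE_0(a)} = \| e^{-tH_{\Lambda,X}(a,v)} \| \le e^{-tE_0(0)}
% for every t > 0, that is, E_0(0) \le E_0(a).
```

This is only one possible route to the inequality between the infima of the spectra; it uses nothing but the monotonicity of the $L^2$-norm under pointwise domination.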


Remark A.5. (i) For the Dirichlet version $X = D$ of the diamagnetic inequality (A.14) to hold, it would be sufficient that $v_-$ is a form perturbation of $H_{\Lambda,D}(0,0)$.

(ii) Inequality (A.14) for $\Lambda = \mathbb{R}^d$ dates back to [31, 56, 28, 32, 3, 59, 57]. It is also known to hold for $\Lambda \neq \mathbb{R}^d$ and $X = D$, even under the weaker assumptions $|a|^2, v_+ \in L^1_{\mathrm{loc}}(\Lambda)$, see [50, 42]. These assumptions still guarantee that the operators $H_{\Lambda,D}(a,v)$ and $H_{\Lambda,N}(a,v)$ are definable as self-adjoint operators via forms. However, for arbitrary open $\Lambda \neq \mathbb{R}^d$ the proof of (A.14) for $X = N$ would be more complicated than the one which we will give under the stronger assumptions of Prop. A.2. The reason is that a gauge function more elaborate than that in Lemma A.3 would be needed in order to avoid integration of $a_j$ across the boundary of $\Lambda$. For a "simply shaped" $\Lambda$, like a cube, such complications do not arise, which implies that our proof would go through for cubes under the weaker assumptions.

(iii) If $a = 0$ inequality (A.14) is equivalent to the assertion that $H_{\Lambda,X}(0,v)$ is the (negative of the) generator of a positivity-preserving one-parameter operator semigroup on $L^2(\Lambda)$, see [52, pp. 186]. For general $a \in \bigl(L^2_{\mathrm{loc}}(\mathbb{R}^d)\bigr)^d$ inequality (A.14) asserts that the semigroup generated by $H_{\Lambda,X}(0,v)$ dominates the one generated by $H_{\Lambda,X}(a,v)$.

(iv) It follows from [28, 59] that (A.14) is equivalent to the following pair of statements:
(a) $\psi \in \mathcal{D}\bigl(H_{\Lambda,X}(a,v)\bigr)$ implies $|\psi| \in \mathcal{Q}\bigl(h^{0,v}_{\Lambda,X}\bigr)$,
(b) $h^{0,v}_{\Lambda,X}(\varphi, |\psi|) \le \operatorname{Re} \bigl\langle \varphi \, \operatorname{sgn}\psi, H_{\Lambda,X}(a,v)\psi \bigr\rangle$ for all $\varphi \in \mathcal{Q}\bigl(h^{0,v}_{\Lambda,X}\bigr)$ with $\varphi \ge 0$ and all $\psi \in \mathcal{D}\bigl(H_{\Lambda,X}(a,v)\bigr)$,
where the signum function associated with $\psi$ is defined by $(\operatorname{sgn}\psi)(x) := \psi(x)/|\psi(x)| \in \mathbb{C}$ if $\psi(x) \neq 0$ and zero otherwise. If $a = 0$ these statements boil down to a Beurling–Deny criterion [17, Thm. 1.3.2] for $H_{\Lambda,X}(0,v)$ which guarantees that it generates a positivity-preserving semigroup. Inequality (b) with $X = N$ and $v = 0$ basically corresponds to the germinal distributional inequality of Kato, which he proved [31] for $a \in \bigl(C^1(\mathbb{R}^d)\bigr)^d$.

In case $\Lambda \neq \mathbb{R}^d$ and $X = N$, we are not aware of a reference proving (A.14) or (a) and (b) for singular $a$. Our proof of the diamagnetic inequality (A.14) for $X = N$ will mimic the proof in [57], where the case $\Lambda = \mathbb{R}^d$ is considered, see also Sect. 1.3 in [16]. It relies on the fact that for one dimension the vector potential can be removed by a gauge transformation. More precisely, for each $j \in \{1, \dots, d\}$ the operator $D_j^\dagger(a)$ is unitarily equivalent to $D_j^\dagger(0)$.

Lemma A.3. Let $|a|^2 \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and define a (gauge) function $\lambda_j : \mathbb{R}^d \to \mathbb{R}$ through

$$\lambda_j(x) := \int_0^{x_j} dy_j \, a_j(x_1, \dots, x_{j-1}, y_j, x_{j+1}, \dots, x_d). \tag{A.15}$$

For open $\Lambda \subseteq \mathbb{R}^d$ it induces a densely defined and self-adjoint multiplication operator $\lambda_j$ on $L^2(\Lambda)$. The corresponding unitary operator $e^{-i\lambda_j}$ maps $\mathcal{D}\bigl(D_j^\dagger(a)\bigr)$ onto $\mathcal{D}\bigl(D_j^\dagger(0)\bigr)$, recall (A.8), and one has

$$D_j^\dagger(a)\,\psi = e^{i\lambda_j} D_j^\dagger(0)\, e^{-i\lambda_j}\psi \quad \text{for all } \psi \in \mathcal{D}\bigl(D_j^\dagger(a)\bigr). \tag{A.16}$$


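Proof. Fubini's theorem and the Cauchy–Schwarz inequality show that $\lambda_j \in L^2_{\mathrm{loc}}(\mathbb{R}^d)$. Therefore, the induced multiplication operator on its maximal domain $\mathcal{D}(\lambda_j) := \bigl\{\psi \in L^2(\Lambda) : \lambda_j\psi \in L^2(\Lambda)\bigr\} \supset C_0^\infty(\Lambda)$ is densely defined and self-adjoint. Moreover, since $\psi \in \mathcal{D}\bigl(D_j^\dagger(a)\bigr)$ implies $\nabla_j\psi \in L^1_{\mathrm{loc}}(\Lambda)$, we are allowed to use the product and chain rule for distributional derivatives [26, pp. 150] which yield $\nabla_j\bigl(e^{-i\lambda_j}\psi\bigr) = e^{-i\lambda_j}\nabla_j\psi - e^{-i\lambda_j}\, i a_j \psi$.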
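The last identity of the proof can be turned into the gauge equivalence (A.16) in one line. The sketch below assumes that $D_j^\dagger(a)$ acts on its domain as the distributional operator $i\nabla_j + a_j$ (in line with its definition as the adjoint of the closure of $\psi \mapsto (i\nabla + a)_j\psi$), and that $\nabla_j\lambda_j = a_j$, which is how (A.15) enters.

```latex
% Assumption: D_j^\dagger(a) acts as i\nabla_j + a_j in the sense of distributions.
\begin{align*}
D_j^\dagger(0)\, e^{-i\lambda_j}\psi
  &= i\nabla_j\bigl(e^{-i\lambda_j}\psi\bigr) \\
  &= i\bigl(e^{-i\lambda_j}\nabla_j\psi - i a_j\, e^{-i\lambda_j}\psi\bigr) \\
  &= e^{-i\lambda_j}\bigl(i\nabla_j + a_j\bigr)\psi
   = e^{-i\lambda_j} D_j^\dagger(a)\,\psi .
\end{align*}
% Multiplying from the left by e^{i\lambda_j} gives
%   D_j^\dagger(a)\psi = e^{i\lambda_j} D_j^\dagger(0) e^{-i\lambda_j}\psi ,
% which is (A.16).
```

The same computation read backwards shows that $e^{-i\lambda_j}$ maps the domain of $D_j^\dagger(a)$ into that of $D_j^\dagger(0)$; the mapping is onto because $e^{i\lambda_j}$ provides the inverse.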

Proof of Prop. A.2. For $X = D$ see [50, 42, 9]. The proof for $X = N$ consists of three steps.

In the first step, we assume $v \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ to be bounded from below. In this case $H_{\Lambda,N}(a,v)$ is a form sum of $d+1$ operators each of which is bounded from below, recall Remark A.3(ii) and Lemma A.2. Hence we may employ the strong Lie–Trotter product formula generalized to form sums of several operators [33] and write

$$e^{-tH_{\Lambda,N}(a,v)} = \operatorname*{s-lim}_{n\to\infty} \Bigl( e^{-tD_1(a)D_1^\dagger(a)/2n} \cdots e^{-tD_d(a)D_d^\dagger(a)/2n}\, e^{-tv/n} \Bigr)^n. \tag{A.17}$$

Gauge equivalence (A.16) now shows that

$$e^{-tD_j(a)D_j^\dagger(a)/2n} = e^{i\lambda_j}\, e^{-tD_j(0)D_j^\dagger(0)/2n}\, e^{-i\lambda_j} \tag{A.18}$$

for all $j \in \{1, \dots, d\}$ and all $t \ge 0$. By the distributional inequality $\bigl|\nabla_j|\psi|\bigr| \le |\nabla_j\psi|$, valid for all $\psi \in \mathcal{D}\bigl(D_j^\dagger(0)\bigr)$ [39, Thm. 6.17], the operator $D_j(0)\,D_j^\dagger(0)$ obeys a Beurling–Deny criterion [17, Thm. 1.3.2] and hence is the generator of a positivity-preserving semigroup. It follows that

$$\bigl| e^{-tD_j(a)D_j^\dagger(a)/2n}\psi \bigr| \le e^{-tD_j(0)D_j^\dagger(0)/2n} |\psi| \tag{A.19}$$

for all $\psi \in L^2(\Lambda)$ and all $t \ge 0$. This together with (A.17) implies the assertion (A.14) (with $X = N$) for scalar potentials $v \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ which are bounded from below.

In the second step, we prove that if $v_-$ is a form perturbation of $H_{\Lambda,X}(0,0)$ then it is also one of $H_{\Lambda,X}(a,0)$ with form bound not exceeding the one for $a = 0$ (see [3] or [58, Thm. 15.10] for the case $\Lambda = \mathbb{R}^d$). This follows from (A.23) below with $v = 0$ and $\alpha = 1/2$ together with the fact that the form bound of $v_-$ relative to $H_{\Lambda,X}(a,0)$ can be expressed as

$$\lim_{E\to\infty} \Bigl\| \bigl(H_{\Lambda,X}(a,0) + E\bigr)^{-1/2}\, v_-\, \bigl(H_{\Lambda,X}(a,0) + E\bigr)^{-1/2} \Bigr\|, \tag{A.20}$$

see [16, Prop. 1.3(ii)]. Here $\|\cdot\|$ denotes the (uniform) norm of bounded operators on $L^2(\Lambda)$.

In the third step, we extend the validity of (A.14) (with $X = N$) to scalar potentials $v$ with $v_+ \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and $v_-$ being a form perturbation of $H_{\Lambda,N}(0,0)$. To this end, we approximate $v$ by $v_n$ defined through $v_n(x) := \max\{-n, v(x)\}$, $x \in \mathbb{R}^d$, $n \in \mathbb{N}$. Monotone convergence for forms [51, Thm. S.16] yields the convergence of $H_{\Lambda,N}(a,v_n)$ to $H_{\Lambda,N}(a,v)$ in the strong resolvent sense as $n \to \infty$. It follows that

$$\operatorname*{s-lim}_{n\to\infty} e^{-tH_{\Lambda,N}(a,v_n)} = e^{-tH_{\Lambda,N}(a,v)} \tag{A.21}$$

for all $t \ge 0$. Since (A.14) (with $X = N$) holds for each $v_n$ by the first step, the proof is complete.
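The first step of the proof compresses two short arguments, which may be spelled out as follows. The shorthand $K_j(a) := D_j(a)\,D_j^\dagger(a)$ and $\tau := t/2n$ is introduced here for readability only. The sketch assumes the standard fact that a positivity-preserving bounded operator $T$ on $L^2(\Lambda)$ satisfies $|T\phi| \le T|\phi|$ for complex-valued $\phi$.

```latex
% Shorthand (assumed here): K_j(a) := D_j(a) D_j^\dagger(a), \tau := t/2n.
% (1) From (A.18) to (A.19): the phases e^{\pm i\lambda_j} have modulus one.
\begin{align*}
\bigl| e^{-\tau K_j(a)}\psi \bigr|
  &= \bigl| e^{i\lambda_j}\, e^{-\tau K_j(0)}\, e^{-i\lambda_j}\psi \bigr|
   = \bigl| e^{-\tau K_j(0)} \bigl(e^{-i\lambda_j}\psi\bigr) \bigr| \\
  &\le e^{-\tau K_j(0)} \bigl| e^{-i\lambda_j}\psi \bigr|
   = e^{-\tau K_j(0)} |\psi| .
\end{align*}
% (2) Iterating over the factors in (A.17): every dominating factor is
% positivity preserving, and |e^{-tv/n}\psi| = e^{-tv/n}|\psi|, so
\begin{align*}
\bigl| e^{-\tau K_1(a)} \cdots e^{-\tau K_d(a)}\, e^{-tv/n}\psi \bigr|
  &\le e^{-\tau K_1(0)} \bigl| e^{-\tau K_2(a)} \cdots e^{-tv/n}\psi \bigr| \\
  &\le \cdots
   \le e^{-\tau K_1(0)} \cdots e^{-\tau K_d(0)}\, e^{-tv/n} |\psi| .
\end{align*}
% Applying this n times and passing to the strong limit n \to \infty
% (along a subsequence that converges almost everywhere) gives (A.14).
```

The right-hand side of the last chain is the Trotter approximant of $e^{-tH_{\Lambda,N}(0,v)}|\psi|$, so the pointwise bound survives the limit on both sides.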


A.3. Some consequences. We list some immediate consequences of the diamagnetic inequality. For this purpose, we assume the situation of Prop. A.2.

(i) Powers of the resolvent of the self-adjoint operator $H_{\Lambda,X}(a,v)$ may be expressed in terms of its semigroup by using the functional calculus. This gives the integral representation

$$\bigl(H_{\Lambda,X}(a,v) - z\bigr)^{-\alpha} = \frac{1}{(\alpha-1)!} \int_0^\infty dt\; t^{\alpha-1}\, e^{tz}\, e^{-tH_{\Lambda,X}(a,v)}, \tag{A.22}$$

which is valid for all $\alpha > 0$, all $z \in \mathbb{C}$ with $\operatorname{Re} z < \inf \operatorname{spec} H_{\Lambda,X}(a,v)$ and both $X = D$ and $X = N$. Here $\alpha \mapsto (\alpha-1)!$ denotes Euler's gamma function [27]. Inequality (A.14) then implies the diamagnetic inequality for powers of the resolvent

$$\Bigl| \bigl(H_{\Lambda,X}(a,v) - z\bigr)^{-\alpha}\psi \Bigr| \le \bigl(H_{\Lambda,X}(0,v) - \operatorname{Re} z\bigr)^{-\alpha} |\psi|, \tag{A.23}$$

valid for all $\psi \in L^2(\Lambda)$ and all $z \in \mathbb{C}$ with $\operatorname{Re} z < \inf \operatorname{spec} H_{\Lambda,X}(0,v)$. We recall [55] that the ground-state energy goes up when the magnetic field is turned on, in symbols, $\inf \operatorname{spec} H_{\Lambda,X}(0,v) \le \inf \operatorname{spec} H_{\Lambda,X}(a,v)$. This follows from Remark A.5(iv)(b) or inequality (A.24) below if its r.h.s. is finite.

(ii) If $H_{\Lambda,X}(0,v)$ has purely discrete spectrum or, equivalently [53, Thm. XIII.64], has compact resolvent, the Dodds–Fremlin–Pitt theorem [3, Thm. 2.2] together with (A.23) implies that $H_{\Lambda,X}(a,v)$ also has compact resolvent and hence purely discrete spectrum. In turn, $H_{\Lambda,X}(0,v)$ has purely discrete spectrum if the free operator $H_{\Lambda,X}(0,0)$ has and if $v$ is a form perturbation of $H_{\Lambda,X}(0,0)$ [53, Thm. XIII.68]. While $H_{\Lambda,D}(0,0)$ has purely discrete spectrum for arbitrary bounded open $\Lambda \subset \mathbb{R}^d$, $H_{\Lambda,N}(0,0)$ only does so if $\Lambda$ possesses an additional property, for example the segment property, see [53, pp. 255]. For example, if $\Lambda$ is a bounded open cube the spectra of $H_{\Lambda,D}(a,-v_-)$ and $H_{\Lambda,N}(a,-v_-)$ are both purely discrete. Moreover, by the min-max principle the addition of the positive multiplication operator $v_+$ to $H_{\Lambda,X}(a,-v_-)$ cannot create essential spectrum. As a consequence, $H_{\Lambda,X}(a,v)$ has purely discrete spectrum for both $X = D$ and $X = N$ if $\Lambda$ is a bounded open cube.

(iii) The diamagnetic inequality (A.14) together with Lemma 15.11 in [58] implies the diamagnetic inequality for partition functions

$$\operatorname{Tr} e^{-tH_{\Lambda,X}(a,v)} \le \operatorname{Tr} e^{-tH_{\Lambda,X}(0,v)} \tag{A.24}$$

for all $t > 0$ and both $X = D$ and $X = N$, provided that the r.h.s. is finite. The latter is the case if $\Lambda$ is a bounded open cube, for example. This follows from Dirichlet–Neumann bracketing (see (A.13) with $a = 0$), the facts that $v_+ \ge 0$ and $v_-$ is a form perturbation of $H_{\Lambda,N}(0,0)$, and the finiteness of the free Neumann partition function (see [36, Prop. 2.1(c)] or (4.5)).

Acknowledgement. It is a pleasure to thank Kurt Broderix, Dirk Hundertmark, Thomas Hoffmann-Ostenhof and Georgi D. Raikov for helpful remarks and stimulating discussions. This work was supported by the Deutsche Forschungsgemeinschaft under grant nos. Le 330/10 and Le 330/12. The latter is a project within the Schwerpunktprogramm “Interagierende stochastische Systeme von hoher Komplexität” (DFG Priority Programme SPP 1033).


Note added in proof. After submission of the present paper we learned of the interesting paper The Lp -theory of the spectral shift function, the Wegner estimate, and the integrated density of states for some random operators, Commun. Math. Phys. 218, 113–130 (2001), by J. M. Combes, P. D. Hislop and S. Nakamura. Among other things, their approach yields Wegner estimates for rather general magnetic fields and certain bounded random potentials. While these estimates do not imply absolute continuity of the integrated density of states, they yield Hölder continuity of arbitrary order strictly smaller than one. The recent preprint The integrated density of states for some random operators with nonsign definite potentials, mp_arc 01-139 (2001), by P. D. Hislop and F. Klopp extends part of this result to single-site potentials taking values of both signs.

References 1. Adler, R.J.: The geometry of random fields. Chichester: Wiley, 1981 2. Ando, T., Fowler, A.B., Stern, F.: Electronic properties of two-dimensional systems. Rev. Mod. Phys. 54, 437–672 (1982) 3. Avron, J., Herbst, I., Simon, B.: Schrödinger operators with magnetic fields. I. General interactions. Duke Math. J. 45, 847–883 (1978) 4. Barbaroux, J.-M., Combes, J.M., Hislop, P.D.: Localization near band edges for random Schrödinger operators. Helv. Phys. Acta 70, 16–43 (1997) 5. Barbaroux, J.-M., Combes, J.M., Hislop, P.D.: Landau Hamiltonians with unbounded random potentials. Lett. Math. Phys. 40, 335–369 (1997) 6. Bauer, H.: Maß- und Integrationstheorie. 2. Auflage, Berlin: de Gruyter, 1992 [in German] English translation to appear 7. Bonch-Bruevich,V.L., Enderlein, R., Esser, B., Keiper, R., Mironov,A.G., Zvyagin, I.P.: Elektronentheorie ungeordneter Halbleiter. Berlin: VEB Deutscher Verlag der Wissenschaften, 1984 [in German. Russian original: Moscow: Nauka, 1981] 8. Broderix, K., Hundertmark, D., Leschke, H.: Self-averaging, decomposition and asymptotic properties of the density of states for random Schrödinger operators with constant magnetic field. In: Path integrals from meV to MeV: Tutzing ’92. Grabert, H., Inomata, A., Schulman, L.S., Weiss, U. (eds.), Singapore: World Scientific, 1993, pp. 98–107 9. Broderix, K., Hundertmark, D., Leschke, H.: Continuity properties of Schrödinger semigroups with magnetic fields. Rev. Math. Phys. 12, 181–225 (2000) 10. Carmona, R., Lacroix, J.: Spectral theory of random Schrödinger operators. Boston: Birkhäuser, 1990 11. Combes, J.M., Hislop, P.D.: Localization for some continuous, random Hamiltonians in d-dimensions. J. Funct. Anal. 124, 149–180 (1994) 12. Combes, J.M., Hislop, P.D.: Landau Hamiltonians with random potentials: Localization and the density of states. Commun. Math. Phys. 177, 603–629 (1996) 13. 
Combes, J.M., Hislop, P.D., Mourre, E.: Spectral averaging, perturbation of singular spectra, and localization. Trans. Am. Math. Soc. 348, 4883–4894 (1996) 14. Combes, J.M., Schrader, R., Seiler, R.: Classical bounds and limits for energy distributions of Hamilton operators in electromagnetic fields. Ann. Phys. (N.Y.) 111, 1–18 (1978) 15. Craig, W., Simon, B.: Log Hölder continuity of the integrated density of states for stochastic Jacobi matrices. Commun. Math. Phys. 90, 207–218 (1983) 16. Cycon, H.L., Froese, R.G., Kirsch, W., Simon, B.: Schrödinger operators. Berlin: Springer, 1987 17. Davies, E.B.: Heat kernels and spectral theory. Paperback edition, Cambridge: Cambridge Univ. Press, 1990 18. Delyon, F., Souillard, B.: Remark on the continuity of the density of states of ergodic finite difference operators. Commun. Math. Phys. 94, 289–291 (1984) 19. Doi, S., Iwatsuka, A., Mine, T.: The uniqueness of the integrated density of states for the Schrödinger operators with magnetic fields. Math. Z. 237, 335–371 (2001) 20. Dorlas, T.C., Macris, N., Pulé, J.V.: Characterization of the spectrum of the Landau Hamiltonian with delta impurities. Commun. Math. Phys. 204, 367–396 (1999) 21. Droese, J., Kirsch, W.: The effect of boundary conditions on the density of states for random Schrödinger operators. Stochastic Processes Appl. 23, 169–175 (1986) 22. Fernique, X.M.: Regularité des trajectoires des fonctions aléatoires Gaussiennes. In: Ecole d’Eté de Probabilités de Saint-Flour IV - 1974. Hennequin, P.-L. (ed.), Lecture Notes in Mathematics 480, Berlin: Springer, 1975, pp. 1–96 [in French] 23. Fischer, W., Hupfer, T., Leschke, H., Müller, P.: Existence of the density of states for multi-dimensional continuum Schrödinger operators with Gaussian random potentials. Commun. Math. Phys. 190, 133–141 (1997)


24. Fischer, W., Leschke, H., Müller, P.: Spectral localization by Gaussian random potentials in multidimensional continuous space. J. Stat. Phys. 101, 935–985 (2000) 25. Fock, V.: Bemerkung zur Quantelung des harmonischen Oszillators im Magnetfeld. Z. Physik 47, 446–448 (1928) [in German] 26. Gilbarg, D., Trudinger, N.S.: Elliptic partial differential equations of second order. 2nd edition, Berlin: Springer, 1983 27. Gradshteyn, I.S., Ryzhik, I.M.: Table of integrals, series, and products. Corrected and enlarged edition, San Diego: Academic, 1980 28. Hess, H., Schrader, R., Uhlenbrock, D.A.: Domination of semigroups and generalization of Kato’s inequality. Duke Math. J. 44, 893–904 (1977) 29. Hupfer, T., Leschke, H., Müller, P., Warzel, S.: Existence and uniqueness of the integrated density of states for Schrödinger operators with magnetic fields and unbounded random potentials. e-print mathph/0010013 (2000). 30. Hupfer, T., Leschke, H., Warzel, S.: Upper bounds on the density of states of single Landau levels broadened by Gaussian random potentials. e-print math-ph/0011010 (2000) 31. Kato, T.: Schrödinger operators with singular potentials. Israel J. Math. 13, 135–148 (1972) 32. Kato, T.: Remarks on Schrödinger operators with vector potentials. Integral Equations Oper. Theory 1, 103–113 (1978) 33. Kato, T., Masuda, K.: Trotter’s product formula for nonlinear semigroups generated by the subdifferentials of convex functionals. J. Math. Soc. Japan 30, 169–178 (1978) 34. Kirsch, W.: Random Schrödinger operators: A course. In: Schrödinger operators. Holden, H., Jensen, A. (eds.), Lecture Notes in Physics 345, Berlin: Springer, 1989, pp. 264–370 35. Kirsch, W., Martinelli, F.: On the ergodic properties of the spectrum of general random operators. J. Reine Angew. Math. 334, 141–156 (1982) 36. Kirsch, W., Martinelli, F.: On the density of states of Schrödinger operators with a random potential. J. Phys. A 15, 2139–2156 (1982) 37. 
Kukushkin, I.V., Meshkov, S.V., Timofeev, V.B.: Two-dimensional electron density of states in a transverse magnetic field. Sov. Phys. Usp. 31, 511–534 (1988) [Russian original: Usp. Fiz. Nauk 155, 219–264 (1988)] 38. Landau, L.: Diamagnetismus der Metalle. Z. Physik 64, 629–637 (1930) [in German] 39. Lieb, E.H., Loss, M.: Analysis. Providence, Rhode Island: Am. Math. Soc., 1997 40. Lifshits, I.M., Gredeskul, S.A., Pastur, L.A.: Introduction to the theory of disordered systems. New York: Wiley, 1988 [Russian original: Moscow: Nauka, 1982] 41. Lifshits, M.A.: Gaussian random functions. Dordrecht: Kluwer, 1995 42. Liskevitch, V., Manavi, A.: Dominated semigroups with singular complex potentials. J. Funct. Anal. 151, 281–305 (1997) 43. Matsumoto, H.: On the integrated density of states for the Schrödinger operators with certain random electromagnetic potentials. J. Math. Soc. Japan 45, 197–214 (1993) 44. Mohamed, A., Raikov, G.D.: On the spectral theory of the Schrödinger operator with electromagnetic potential. In: Pseudo-differential calculus and mathematical physics. Demuth, M., Schrohe, E., Schulze, B.-W.(eds.), Berlin: Akademie, 1994, pp. 298–390 45. Nakamura, S.: A remark on the Dirichlet–Neumann decoupling and the integrated density of states. J. Funct. Anal. 179, 136–152 (2001) 46. Nakao, S.: On the spectral distribution of the Schrödinger operator with random potential. Japan. J. Math. 3, 111–139 (1977) 47. Pastur, L.: On the Schrödinger equation with a random potential. Theor. Math. Phys. 6, 299–306 (1971) [Russian original: Teor. Mat. Fiz. 6, 415–424 (1971)] 48. Pastur, L.: Spectral properties of disordered systems in the one-body approximation. Commun. Math. Phys. 75, 179–196 (1980) 49. Pastur, L., Figotin, A.: Spectra of random and almost-periodic operators. Berlin: Springer, 1992 50. Perelmuter, M.A., Semenov, Yu.A.: On decoupling of finite singularities in the scattering theory for the Schrödinger operator with a magnetic field. J. Math. Phys. 
22, 521–533 (1981) 51. Reed, M., Simon, B.: Methods of modern mathematical physics I: Functional analysis. Revised and enlarged edition, San Diego: Academic, 1980 52. Reed, M., Simon, B.: Methods of modern mathematical physics II: Fourier analysis, self-adjointness. New York: Academic, 1975 53. Reed, M., Simon, B.: Methods of modern mathematical physics IV: Analysis of operators. New York: Academic, 1978 54. Shklovskii, B.I., Efros, A.L.: Electronic properties of doped semiconductors. Berlin: Springer, 1984 [Russian original: Moscow: Nauka, 1979] 55. Simon, B.: Universal diamagnetism of spinless Bose systems. Phys. Rev. Lett. 36, 1083–1084 (1976) 56. Simon, B.: An abstract Kato’s inequality for generators of positivity preserving semigroups. Ind. Math. J. 26, 1067–1073 (1977)


57. Simon, B.: Maximal and minimal Schrödinger forms. J. Operator Theory 1, 37–47 (1979)
58. Simon, B.: Functional integration and quantum physics. New York: Academic, 1979
59. Simon, B.: Kato’s inequality and the comparison of semigroups. J. Funct. Anal. 32, 97–101 (1979)
60. Simon, B.: Schrödinger operators in the twenty-first century. In: Mathematical Physics 2000. Fokas, A., Grigoryan, A., Kibble, T., Zegarlinski, B. (eds.), London: Imperial College Press, 2000, pp. 283–288
61. Stollmann, P.: Caught by disorder: Bound states in random media. Boston: Birkhäuser, 2001
62. Ueki, N.: On spectra of random Schrödinger operators with magnetic fields. Osaka J. Math. 31, 177–187 (1994)
63. Veselić, I.: Wegner estimate for some indefinite Anderson-type Schrödinger operators. e-print mp_arc 00-373 (2000)
64. Wang, W.-M.: Microlocalization, percolation, and Anderson localization for the magnetic Schrödinger operator with a random potential. J. Funct. Anal. 146, 1–26 (1997)
65. Wegner, F.: Bounds on the density of states in disordered systems. Z. Phys. B 44, 9–15 (1981)
66. Weyl, H.: Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller Differentialgleichungen (mit einer Anwendung auf die Theorie der Hohlraumstrahlung). Math. Ann. 71, 441–479 (1912) [in German]
67. Zak, J.: Magnetic translation group. Phys. Rev. 134, A1602–A1606 (1964)

Communicated by B. Simon

Commun. Math. Phys. 221, 255 – 265 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Eigenvalues of the Dirac Operator on Manifolds with Boundary

Oussama Hijazi1, Sebastián Montiel2, Xiao Zhang3

1 Institut Élie Cartan, Université Henri Poincaré, Nancy I, B.P. 239, 54506 Vandœuvre-Lès-Nancy Cedex, France. E-mail: [email protected]
2 Departamento de Geometría y Topología, Universidad de Granada, 18071 Granada, Spain. E-mail: [email protected]
3 Institute of Mathematics, Academy of Mathematics and Systems Sciences, Chinese Academy of Sciences, Beijing 100080, P.R. China. E-mail: [email protected]

Received: 22 August 2000 / Accepted: 15 March 2001

Abstract: Under standard local boundary conditions or certain global APS boundary conditions, we get lower bounds for the eigenvalues of the Dirac operator on compact spin manifolds with boundary. For the local boundary conditions, limiting cases are characterized by the existence of real Killing spinors and the minimality of the boundary.

1. Introduction

It is well known that the spectrum of the Dirac operator on closed spin manifolds detects subtle information on the geometry and the topology of such manifolds (see for example [6, 8]). In [31, 33, 27, 30], basic properties of the hypersurface Dirac operator are established. This hypersurface Dirac operator appears as the boundary term in the integral Schrödinger–Lichnerowicz formula (2.3) for compact spin manifolds with compact boundary. In fact, the hypersurface Dirac operator is, up to a zero order operator, the intrinsic Dirac operator of the boundary.

In this paper, we examine the classical local boundary conditions and certain Atiyah–Patodi–Singer boundary conditions for the Dirac operator. Here, the spectral resolution of the intrinsic Dirac operator of the boundary is used to define the APS boundary conditions. We first prove self-adjointness and ellipticity of such conditions. Then, systematic use of the modified Levi–Civita connections, introduced in [10, 24, 33, 11, 27, 30], is made (see also [28, 15] for the Dirac operators on submanifolds). Under appropriate curvature assumptions, these modified connections, combined with formula (2.3), yield the corresponding estimates for compact spin manifolds with boundary. The limiting cases are then studied.

Such estimates are obtained in Sects. 3 and 4. In Sect. 3 we consider both the local and the above-mentioned APS boundary conditions. We first introduce the modified connection (3.1), which allows us to establish a Friedrich type inequality in case the mean curvature of the boundary is nonnegative. Under the local boundary conditions,


the limiting case is then characterized by the existence of a Killing spinor on the compact manifold with minimal boundary (see (3.5)). Then the energy-momentum tensor is used to define the modified connection (3.7), from which one can deduce inequality (3.9). Finally, in Sect. 4, under the local boundary conditions, the conformal aspect is examined. For example, generalizations of the conformal lower bounds in [22, 24] are obtained (see Remark 9).

It might be useful to mention that local and global boundary conditions are introduced in [25] to get optimal extrinsic lower bounds for the first nonnegative eigenvalue of the intrinsic Dirac operator of the boundary. Moreover, in [26], the conformal aspect of this setup is examined where a conformal extrinsic lower bound is given.

2. The Elliptic Boundary Conditions

Let $M$ be an $n$-dimensional Riemannian spin manifold with boundary $\partial M$ endowed with its induced Riemannian and spin structures. Denote by $S$ the spinor bundle of $M$. Let $\nabla$ (resp. $\nabla^{\partial M}$) be the Levi–Civita connection of $M$ (resp. $\partial M$) and denote by the same symbol their corresponding lift to the spinor bundle $S$. Consider the Dirac operator $D$ of $M$ defined by $\nabla$ on $S$. It is known [29] that there exists a positive definite Hermitian metric on $S$ which satisfies, for any covector field $X^* \in \Gamma(T^*M)$, and any spinor fields $\varphi, \psi \in \Gamma(S)$, the relation

$$(X^* \cdot \varphi, X^* \cdot \psi) = |X^*|^2 (\varphi, \psi), \tag{2.1}$$

where “$\cdot$” denotes Clifford multiplication. The connection $\nabla$ is compatible with the metric $(\ ,\ )$. Fix a point $p \in \partial M$ and an orthonormal basis $\{e_\alpha\}$ of $T_pM$ with $e_0$ the outward normal to $\partial M$ and $e_i$ tangent to $\partial M$ such that for $1 \le i, j \le n$, $(\nabla^{\partial M}_i e_j)_p = (\nabla_0 e_j)_p = 0$. Let $\{e^\alpha\}$ be the dual coframe. Then, for $1 \le i, j \le n$, $(\nabla_i e^j)_p = -h_{ij} e^0$, $(\nabla_i e^0)_p = h_{ij} e^j$, where $h_{ij} = (\nabla_i e_0, e_j)$ are the components of the second fundamental form at $p$, and we have

$$\nabla_i = \nabla^{\partial M}_i + \frac{1}{2} h_{ij}\, e^0 \cdot e^j \cdot . \tag{2.2}$$

Let $H = \sum_i h_{ii}$ be the unnormalized mean curvature of $M$. In the above notation, the standard sphere $S^n_r = \partial B^{n+1}_r$ has positive mean curvature $H = \frac{n}{r}$. By (2.1), $(e^0 \cdot e^j \cdot \varphi, \psi) = (\varphi, e^j \cdot e^0 \cdot \psi)$. Therefore (2.2) implies

$$d(\varphi,\psi) * e^i = \bigl((\nabla_i\varphi, \psi) + (\varphi, \nabla_i\psi)\bigr) * 1 = \bigl((\nabla^{\partial M}_i\varphi, \psi) + (\varphi, \nabla^{\partial M}_i\psi)\bigr) * 1.$$

Hence the connection $\nabla^{\partial M}$ is also compatible with the metric $(\ ,\ )$. Denote by $D^{\partial M}$ the Dirac operator of $\partial M$. In the above orthonormal coframe $\{e^i\}$ of $M$, $D^{\partial M} = e^i \cdot \nabla^{\partial M}_i$. Thus $D^{\partial M}$ is self-adjoint with respect to the metric $(\ ,\ )$. The relation (2.2) implies that $\nabla^{\partial M}_i(e^0 \cdot \varphi) = e^0 \cdot \nabla^{\partial M}_i\varphi$.


Hence

$$D^{\partial M}(e^0 \cdot \varphi) = -e^0 \cdot D^{\partial M}\varphi.$$

Consider the integral form of the Schrödinger–Lichnerowicz formula for a compact manifold with compact boundary

$$\int_{\partial M} (\varphi, e^0 \cdot D^{\partial M}\varphi) - \frac{1}{2}\int_{\partial M} H|\varphi|^2 = \int_M \Bigl( |\nabla\varphi|^2 + \frac{R}{4}|\varphi|^2 - |D\varphi|^2 \Bigr). \tag{2.3}$$

It is well-known that there are basically two types of elliptic boundary conditions for the Dirac operator: the local boundary condition and the (global) Atiyah–Patodi–Singer (APS) boundary condition. Such boundary conditions are used in the positive mass theorem for black holes, the Penrose conjecture in general relativity and the index theory in topology [13, 14, 20, 21, 34]. The APS boundary condition exists on any spin manifold with boundary [2–4] (see also [16–19]), while the local boundary condition requires certain additional structures on manifolds such as the existence of a Lorentzian structure or a chirality operator, etc. [12, 13, 21].

Now we shall show that the local boundary condition exists on certain spin manifolds with a “boundary chirality operator”. An operator $\Gamma$ defined on $C^\infty(\partial M, S|_{\partial M})$ is said to be a boundary chirality operator if it satisfies the following conditions:

$$\Gamma^2 = \mathrm{Id}, \tag{2.4}$$
$$\nabla^{\partial M}_{e_i} \Gamma = 0, \tag{2.5}$$
$$e^0 \cdot \Gamma = -\Gamma \cdot e^0, \tag{2.6}$$
$$e^i \cdot \Gamma = \Gamma \cdot e^i, \tag{2.7}$$
$$(\Gamma \cdot \varphi, \Gamma \cdot \psi) = (\varphi, \psi). \tag{2.8}$$

If $M$ is a spacelike hypersurface of a spacetime manifold with timelike covector $T$, then we can let $\Gamma = T \cdot e^0$, where $e^0$ is the normal covector on $\partial M$. Recall that (see [12] for example) an operator $F$ defined on $C^\infty(M, S)$ is called a chirality operator on $M$ if for all $X^* \in \Gamma(T^*M)$, and any spinor fields $\varphi, \psi \in \Gamma(S)$, one has

$$F^2 = \mathrm{Id}, \quad \nabla_X F = 0, \quad X^* \cdot F = -F \cdot X^*, \quad (F \cdot \varphi, F \cdot \psi) = (\varphi, \psi).$$

Note that such an operator exists if the spin manifold $M$ is even dimensional. It is easy to see that if $M$ has a chirality operator $F$, then $\Gamma = F|_{\partial M} \cdot e^0$ is a boundary chirality operator. In this paper, we consider the following boundary conditions:

• The local boundary condition. As the eigenvalues of the chirality operator $\Gamma$ are $\pm 1$, the corresponding eigenspaces

$$\Gamma^{\mathrm{loc}}_+ = \bigl\{ \varphi \in C^\infty(\partial M, S|_{\partial M}),\ \Gamma \cdot \varphi = \varphi \bigr\}, \qquad \Gamma^{\mathrm{loc}}_- = \bigl\{ \varphi \in C^\infty(\partial M, S|_{\partial M}),\ \Gamma \cdot \varphi = -\varphi \bigr\}$$

provide local boundary conditions.


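• The APS type boundary condition. The operator $e^0 \cdot D^{\partial M}$ is self-adjoint with respect to the induced metric $(\ ,\ )$ on $\partial M$. Therefore it has a discrete (real) spectrum. Let $(\varphi_k)_{k\in\mathbb{N}}$ be the spectral resolution of $e^0 \cdot D^{\partial M}$, i.e., $e^0 \cdot D^{\partial M}\varphi_k = \lambda_k\varphi_k$, and consider the corresponding $L^2$-orthogonal subspaces $\Gamma^{\mathrm{APS}}_\pm$ spanned by the positive and negative eigenspaces of $e^0 \cdot D^{\partial M}$, i.e.,

$$\Gamma^{\mathrm{APS}}_+ = \Bigl\{ \varphi \in C^\infty(\partial M, S|_{\partial M}),\ \varphi = \sum_{\lambda_k > 0} c_k\varphi_k \Bigr\},$$

$$\Gamma^{\mathrm{APS}}_- = \Bigl\{ \varphi \in C^\infty(\partial M, S|_{\partial M}),\ \varphi = \sum_{\lambda_k < 0} c_k\varphi_k \Bigr\}.$$

… > 0, there exists $C_{k,\delta}$ such that

$$\|\varphi\|^2_{H^k} \le (1+\delta)\, \|D\varphi\|^2_{L^2} + C_{k,\delta}\, \|\varphi\|^2_{H^{k-1}}. \tag{2.9}$$

Proof. Note that for any $\varphi \in \Gamma^{\mathrm{loc}}_+$ or $\varphi \in \Gamma^{\mathrm{loc}}_-$, $D^{\partial M}(\Gamma \cdot \varphi) = \Gamma \cdot D^{\partial M}\varphi$, thus

$$(\varphi, e^0 \cdot D^{\partial M}\varphi) = \bigl(\Gamma \cdot \varphi, e^0 \cdot D^{\partial M}(\Gamma \cdot \varphi)\bigr) = (\Gamma \cdot \varphi, e^0 \cdot \Gamma \cdot D^{\partial M}\varphi) = -(\varphi, e^0 \cdot D^{\partial M}\varphi).$$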


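Therefore $(\varphi, e^0 \cdot D^{\partial M}\varphi) = 0$. If $\varphi \in \Gamma^{\mathrm{APS}}_-$, then

$$\int_{\partial M} (\varphi, e^0 \cdot D^{\partial M}\varphi) = \sum_{\lambda_k < 0} |c_k|^2 \lambda_k \le 0.$$

… > 0, there exists a constant $C_\varepsilon > 0$ such that

$$\|\varphi\|^2_{L^2(\partial M)} \le \varepsilon\, \|\varphi\|^2_{H^1} + C_\varepsilon\, \|\varphi\|^2_{L^2},$$

thus (2.3) implies

$$\|\varphi\|^2_{H^1} \le (1+\delta)\, \|D\varphi\|^2_{L^2} + C_\delta\, \|\varphi\|^2_{L^2}. \tag{2.10}$$

Then a standard argument gives (2.9).

The following corollary is a direct consequence of the Sobolev embedding theorem

$$\|\varphi\|^2_{C^k} \le C\, \|\varphi\|^2_{H^{k+\frac{n}{2}}}.$$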

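Corollary 2. Any eigenspinor of the Dirac operator which satisfies either the local boundary condition $\varphi \in \Gamma^{\mathrm{loc}}_\pm$ or the (negative) APS boundary condition $\varphi \in \Gamma^{\mathrm{APS}}_-$ is smooth.

3. Lower Bounds for the Eigenvalues

In this section, we adapt the arguments used in [27] to the case of compact spin manifolds with boundary. In particular, we get generalizations of basic inequalities on the eigenvalues of the Dirac operator $D$ under the local boundary conditions $\Gamma^{\mathrm{loc}}_\pm$ or the negative APS boundary condition $\Gamma^{\mathrm{APS}}_-$. For this, we use the integral identity (2.3) together with an appropriate modification of the Levi–Civita connection.

Let $D\varphi = \lambda\varphi$, where $\lambda$ is a real constant or a real function. For any real functions $a$ and $u$, we define

$$\nabla^{a,u}_i = \nabla_i + a\nabla_i u + \frac{a}{n}\nabla_j u\; e^i \cdot e^j \cdot + \frac{\lambda}{n}\, e^i \cdot . \tag{3.1}$$

Then

$$\begin{aligned}
|\nabla^{a,u}\varphi|^2 &= |\nabla\varphi|^2 + \frac{\lambda^2}{n}|\varphi|^2 + a^2\Bigl(1 - \frac{1}{n}\Bigr)|du|^2|\varphi|^2 + 2a\nabla_i u\,(\nabla_i\varphi, \varphi) + \frac{2\lambda}{n}(\nabla_i\varphi, e^i \cdot \varphi) \\
&= |\nabla\varphi|^2 - \frac{\lambda^2}{n}|\varphi|^2 + a^2\Bigl(1 - \frac{1}{n}\Bigr)|du|^2|\varphi|^2 + a\nabla_i u\,\nabla_i|\varphi|^2.
\end{aligned}$$

Define the functions $R_{a,u}$ by

$$R_{a,u} = R - 4a\Delta u + 4\nabla a\nabla u - 4\Bigl(1 - \frac{1}{n}\Bigr)a^2|du|^2, \tag{3.2}$$

where $\Delta$ is the positive scalar Laplacian. Then we have

$$\int_M |\nabla^{a,u}\varphi|^2 = \int_M \Bigl( |\nabla\varphi|^2 - \frac{\lambda^2}{n}|\varphi|^2 - \Bigl(\frac{R_{a,u}}{4} - \frac{R}{4}\Bigr)|\varphi|^2 \Bigr) + \int_{\partial M} a\, du(e_0)\, |\varphi|^2.$$

Therefore (2.3) yields

$$\int_M |\nabla^{a,u}\varphi|^2 = \int_M \Bigl( \Bigl(1 - \frac{1}{n}\Bigr)\lambda^2 - \frac{R_{a,u}}{4} \Bigr)|\varphi|^2 + \int_{\partial M} \Bigl( (\varphi, e^0 \cdot D^{\partial M}\varphi) + \Bigl(a\, du(e_0) - \frac{H}{2}\Bigr)|\varphi|^2 \Bigr). \tag{3.3}$$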

Now we generalize Lemma 2.3 in [11] to the case where a is a real function.

Lemma 3. Suppose there exist a spinor field $\varphi\in\Gamma(S)$, a real number λ and real functions a and u on M such that for all i, 1 ≤ i ≤ n,
\[ \nabla_i\varphi = -\frac{\lambda}{n}\,e^i\cdot\varphi - a\nabla_i u\,\varphi - \frac{a}{n}\nabla_j u\, e^i\cdot e^j\cdot\varphi. \tag{3.4} \]

Then ϕ is a real Killing spinor, i.e., either a = 0 or du = 0. In particular, the manifold is Einstein. Proof. First, observe that (3.4) implies Dϕ = λϕ. By the Ricci identity (see [11]), we have 1 Rij ei · ej · ϕ = ei · D(∇i ϕ) − D 2 ϕ 2 λ a = ei · ej · ∇j − ei · −a∇i u − ∇k uei · ek · ϕ − λ2 ϕ n n λ = ei · (ei · ej · +2δij )∇j ϕ n − λ2 ϕ − du · da · ϕ + auϕ − aλdu · ϕ 1 + ei · (ei · ej · +2δij )ek · ∇j a∇k uϕ n + a∇j ∇k uϕ + a∇k u∇j ϕ 2(1 − n) 2 2a 2(2 − n) = λ + u − ∇a∇u ϕ n n n 4aλ 2 − du · da · ϕ + 2 du · ϕ. n n

This implies either a = 0 or du = 0. By (3.3) and Lemma 3, we obtain

Theorem 4. Let $M^n$ be a compact Riemannian spin manifold of dimension n ≥ 2, with boundary ∂M, and let λ be any eigenvalue of D under either the local boundary condition $\Gamma^{\mathrm{loc}}_\pm$ or the (negative) APS boundary condition $\Gamma^{\mathrm{APS}}_-$. If there exist real functions a, u on M such that $H \ge 2a\,du(e_0)$ on ∂M, where H is the mean curvature of ∂M, then
\[ \lambda^2 \ge \frac{n}{4(n-1)}\,\sup_{a,u}\,\inf_M R_{a,u}, \tag{3.5} \]
where $R_{a,u}$ is given in (3.2). In the limiting case with the local boundary conditions, the associated eigenspinor is a real Killing spinor and ∂M is minimal.

Note that by [25], under the APS boundary conditions equality in (3.5) could not hold. Now we make use of the energy-momentum tensor (see [24]) to get lower bounds for the eigenvalues of D. For any spinor field φ, we define the associated energy-momentum 2-tensor $Q_\varphi$ on the complement of its zero set by
\[ Q_{\varphi,ij} = \frac12\,\Re\big(e^i\cdot\nabla_j\varphi + e^j\cdot\nabla_i\varphi,\ \varphi/|\varphi|^2\big). \tag{3.6} \]

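If φ is an eigenspinor of D, the tensor $Q_\varphi$ is well-defined in the sense of distribution. Let
\[ \nabla^{Q,a,u}_i = \nabla_i + a\nabla_i u + \frac{a}{n}\nabla_j u\, e^i\cdot e^j\cdot + Q_{\varphi,ij}\, e^j\cdot. \tag{3.7} \]
It is easy to prove that (see [27])
\[ |\nabla^{Q,a,u}\varphi|^2 = |\nabla\varphi|^2 - |Q_\varphi|^2|\varphi|^2 + a^2\Big(1-\frac1n\Big)|du|^2|\varphi|^2 + a\nabla_i u\,\nabla_i|\varphi|^2. \]
Therefore
\[ \int_M |\nabla^{Q,a,u}\varphi|^2 = \int_M \Big( \lambda^2 - \Big(\frac{R_{a,u}}{4} + |Q_\varphi|^2\Big) \Big)|\varphi|^2 + \int_{\partial M} \Big( (\varphi, e^0\cdot D^{\partial M}\varphi) + \Big(a\,du(e_0) - \frac{H}{2}\Big)|\varphi|^2 \Big). \tag{3.8} \]
Thus we have

Theorem 5. Let $M^n$ be a compact Riemannian spin manifold of dimension n ≥ 2, with boundary ∂M, and let λ be any eigenvalue of D under either the local boundary condition $\Gamma^{\mathrm{loc}}_\pm$ or the (negative) APS boundary condition $\Gamma^{\mathrm{APS}}_-$. If there exist real functions a, u on M such that $H \ge 2a\,du(e_0)$ on ∂M, where H is the mean curvature of ∂M, then
\[ \lambda^2 \ge \sup_{a,u}\,\inf_M \Big( \frac{R_{a,u}}{4} + |Q_\varphi|^2 \Big). \tag{3.9} \]
In the limiting case, one has $H = 2a\,du(e_0)$ on ∂M.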

Remark 6. Under either the local boundary condition $\Gamma^{\mathrm{loc}}_\pm$ or the (negative) APS boundary condition $\Gamma^{\mathrm{APS}}_-$, assume that H ≥ 0. Take a = 0 or u constant in (3.5) and (3.9), then one gets Friedrich's inequality [10]
\[ \lambda^2 \ge \frac{n}{4(n-1)}\,\inf_M R \tag{3.10} \]
and the following inequality [24]
\[ \lambda^2 \ge \inf_M \Big( \frac{R}{4} + |Q_\varphi|^2 \Big). \tag{3.11} \]
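As a small numerical illustration of the right-hand side of (3.10), the sketch below evaluates the bound for a domain of the round unit sphere, where the scalar curvature is the constant n(n − 1); a hemisphere, whose boundary is totally geodesic (H = 0), satisfies the hypothesis H ≥ 0. The function names are illustrative and not taken from the paper, and the snippet only evaluates the formula; it does not check any boundary condition.

```python
def friedrich_lower_bound(n, inf_scalar_curvature):
    """Right-hand side of (3.10): n / (4 (n - 1)) * inf_M R."""
    if n < 2:
        raise ValueError("the estimate is stated for dimension n >= 2")
    return n * inf_scalar_curvature / (4.0 * (n - 1))


def unit_sphere_scalar_curvature(n):
    """Scalar curvature R = n (n - 1) of the round unit n-sphere."""
    return n * (n - 1)


if __name__ == "__main__":
    for n in (2, 3, 4):
        bound = friedrich_lower_bound(n, unit_sphere_scalar_curvature(n))
        # For constant curvature n (n - 1) the bound reduces to n**2 / 4.
        print(n, bound)
```

For constant scalar curvature n(n − 1) the bound simplifies to n²/4, the square of the Killing number n/2 of the round sphere.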

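4. Conformal Lower Bounds

As in the previous section and under the local boundary conditions $\Gamma^{\mathrm{loc}}_\pm$, we show that the conformal arguments used in [27] combined with the integral formula (2.3) yield generalizations of all known lower bounds for the eigenvalues of the Dirac operator. Let g be the metric of M. For any real function u on M, consider a conformal metric $\bar g = e^{2u}g$. Denote by $\bar D$ the Dirac operator with respect to this conformal metric. If $D\varphi=\lambda\varphi$, then $\bar D\bar\psi = \lambda e^{-u}\bar\psi$, where $\bar\psi = e^{-\frac{n-1}{2}u}\bar\varphi$. Note that
\[ \bar\nabla^{a,u}_{\bar e_i} = \bar\nabla_{\bar e_i} + a\,e^{-u}\nabla_i u + \frac{a}{n}\,e^{-u}\nabla_j u\;\bar e^i\cdot\bar e^j\cdot + \frac{\lambda}{n}\,e^{-u}\,\bar e^i\cdot, \]
\[ \bar\Delta u = -\sum_i e^{-u}\big(\nabla_{e_i}(e^{-u}\nabla_{e_i}u)\big) = e^{-2u}\big(\Delta u + |du|^2\big), \]
\[ \bar R\,e^{2u} = R + 2(n-1)\Delta u - (n-1)(n-2)|du|^2, \]
also, on ∂M,
\[ \bar D^{\partial M}\big(e^{-\frac{n-2}{2}u}\bar\varphi\big) = e^{-\frac{n}{2}u}\,\overline{D^{\partial M}\varphi}, \qquad \bar H = e^{-u}\big(H + (n-1)\,du(e_0)\big). \]
Define the function $\widetilde R_{a,u}$ by
\[ \widetilde R_{a,u} = R + 4\Big(\frac{n-1}{2} - a\Big)\Delta u + 4\nabla a\nabla u - \Big((n-1)(n-2) + 4(2-n)a + 4\Big(1-\frac1n\Big)a^2\Big)|du|^2, \tag{4.1} \]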

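where $\Delta$ is the positive scalar Laplacian. Then apply (3.3) to the conformal metric $\bar g$, to get
\[ \int_M \big|\bar\nabla^{a,u}\bar\psi\big|^2_{\bar g}\,\bar v_g = \int_M e^{-u}\Big(\Big(1-\frac1n\Big)\lambda^2 - \frac{\widetilde R_{a,u}}{4}\Big)|\varphi|^2\,v_g + \int_{\partial M}\Big((\bar\psi, \bar e^0\cdot\bar D^{\partial M}\bar\psi)_{\bar g} + \Big(a\,\bar d u(\bar e_0) - \frac{\bar H}{2}\Big)|\bar\psi|^2_{\bar g}\Big)\bar v_g, \]
hence
\[ \int_M \big|\bar\nabla^{a,u}\bar\psi\big|^2_{\bar g}\,\bar v_g = \int_M e^{-u}\Big(\Big(1-\frac1n\Big)\lambda^2 - \frac{\widetilde R_{a,u}}{4}\Big)|\varphi|^2\,v_g + \int_{\partial M} e^{-u}\Big((\varphi, e^0\cdot D^{\partial M}\varphi) + \Big(\Big(a-\frac{n-1}{2}\Big)du(e_0) - \frac{H}{2}\Big)|\varphi|^2\Big)v_g. \tag{4.2} \]
Note that $\bar\nabla^{a,u}\bar\psi = 0$ implies
\[ \nabla_i\varphi = -\frac{\lambda}{n}\,e^i\cdot\varphi - \Big(a-\frac n2\Big)\nabla_i u\,\varphi - \frac1n\Big(a-\frac n2\Big)\nabla_j u\,e^i\cdot e^j\cdot\varphi \]
(see [11]), we thus have either $a = \frac n2$ or du = 0 by Lemma 3. Thus we obtain: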

Theorem 7. Let $M^n$ be a compact Riemannian spin manifold of dimension n ≥ 2, with boundary ∂M, and let λ be any eigenvalue of D under the local boundary condition $\Gamma^{\mathrm{loc}}_\pm$. If there exist real functions a, u on M such that $H \ge (2a-n+1)\,du(e_0)$ on ∂M, where H is the mean curvature of ∂M, then
\[ \lambda^2 \ge \frac{n}{4(n-1)}\,\sup_{a,u}\,\inf_M \widetilde R_{a,u}, \tag{4.3} \]
where the function $\widetilde R_{a,u}$ is given in (4.1). In the limiting case, the associated eigenspinor is a real Killing spinor and either $H = du(e_0)$ or H = 0 on ∂M.

Since $\bar Q_{\bar\varphi,\bar i\bar j} = e^{-u}Q_{\varphi,ij}$ under the conformal transformation $\bar g = e^{2u}g$, we apply (3.8) to the conformal metric $\bar g$, to get
\[ \int_M \big|\bar\nabla^{Q,a,u}\bar\psi\big|^2_{\bar g}\,\bar v_g = \int_M e^{-u}\Big(\lambda^2 - \Big(\frac{\widetilde R_{a,u}}{4} + |Q_\varphi|^2\Big)\Big)|\varphi|^2\,v_g + \int_{\partial M} e^{-u}\Big((\varphi, e^0\cdot D^{\partial M}\varphi) + \Big(\Big(a-\frac{n-1}{2}\Big)du(e_0) - \frac{H}{2}\Big)|\varphi|^2\Big)v_g. \tag{4.4} \]
Thus we have

Theorem 8. Let $M^n$ be a compact Riemannian spin manifold of dimension n ≥ 2, with boundary ∂M, and let λ be any eigenvalue of D under the local boundary condition $\Gamma^{\mathrm{loc}}_\pm$. If there exist real functions a, u on M such that $H \ge (2a-n+1)\,du(e_0)$ on ∂M, where H is the mean curvature of ∂M, then
\[ \lambda^2 \ge \sup_{a,u}\,\inf_M \Big(\frac{\widetilde R_{a,u}}{4} + |Q_\varphi|^2\Big). \tag{4.5} \]
In the limiting case one has $H = (2a-n+1)\,du(e_0)$ on ∂M.

Remark 9. If n ≥ 3, take a = 0 and $u = -\frac{2}{n-2}\log h$ in (4.3) and (4.5), where h is a positive eigenfunction of the first eigenvalue $\mu_1$ of the conformal Laplacian
\[ L := 4\,\frac{n-1}{n-2}\,\Delta + R \]
under the boundary condition
\[ dh(e_0) - \frac{(n-2)H}{2(n-1)}\,h = 0. \]
Then, one gets the lower bounds [22, 24]
\[ \lambda^2 \ge \frac{n}{4(n-1)}\,\mu_1, \tag{4.6} \]
and
\[ \lambda^2 \ge \inf_M \Big(\frac{\mu_1}{4} + |Q_\varphi|^2\Big) \tag{4.7} \]
under the local boundary condition $\Gamma^{\mathrm{loc}}_\pm$. In the limiting case of (4.6), the associated eigenspinor is a real Killing spinor and ∂M is minimal.

Acknowledgements. Research of S.M. is partially supported by a DGICYT grant No. PB97-0785. Research of X.Z. is partially supported by the Chinese NSF and mathematical physics program of CAS. This work is partially done during the visit of the last two authors to the Institut Élie Cartan, Université Henri Poincaré, Nancy 1. They would like to thank the institute for its hospitality.

References

1. Adams, R.A.: Sobolev spaces. New York: Academic Press, 1978
2. Atiyah, M.F., Patodi, V.K., Singer, I.M.: Spectral asymmetry and Riemannian geometry, I. Math. Proc. Cambr. Phil. Soc. 77, 43–69 (1975)
3. Atiyah, M.F., Patodi, V.K., Singer, I.M.: Spectral asymmetry and Riemannian geometry, II. Math. Proc. Cambr. Phil. Soc. 78, 405–432 (1975)
4. Atiyah, M.F., Patodi, V.K., Singer, I.M.: Spectral asymmetry and Riemannian geometry, III. Math. Proc. Cambr. Phil. Soc. 79, 71–99 (1976)
5. Bär, C.: Lower eigenvalue estimates for Dirac operators. Math. Ann. 293, 39–46 (1992)
6. Baum, H., Friedrich, T., Grunewald, R., Kath, I.: Twistor and Killing Spinors on Riemannian Manifolds. Seminarbericht 108, Humboldt-Universität zu Berlin, 1990
7. Bourguignon, J.P., Gauduchon, P.: Spineurs, Opérateurs de Dirac et Variations de Métriques. Commun. Math. Phys. 144, 581–599 (1992)
8. Bourguignon, J.P., Hijazi, O., Milhorat, J.-L., Moroianu, A.: A Spinorial Approach to Riemannian and Conformal Geometry. Monograph (In preparation)
9. Botvinnik, B., Gilkey, P., Stolz, S.: The Gromov Lawson Rosenberg conjecture for groups with periodic cohomology. J. Diff. Geo. 46, 374–405 (1997)
10. Friedrich, T.: Der erste Eigenwert des Dirac-Operators einer kompakten, Riemannschen Mannigfaltigkeit nicht negativer Skalarkrümmung. Math. Nachr. 97, 117–146 (1980)
11. Friedrich, Th., Kim, E.-C.: Some remarks on the Hijazi inequality and generalizations of the Killing equation for spinors. To appear in J. Geom. Phys.
12. Farinelli, S., Schwarz, G.: On the spectrum of the Dirac operator under boundary conditions. J. Geom. Phys. 28, 67–84 (1998)
13. Gibbons, G., Hawking, S., Horowitz, G., Perry, M.: Positive mass theorems for black holes. Commun. Math. Phys. 88, 295–308 (1983)
14. Gilkey, P.B.: Invariance theory, the heat equation, and the Atiyah–Singer index theorem. 2nd ed., Boca Raton: CRC Press, 1995
15. Ginoux, N., Morel, B.: Eigenvalue Estimates for the Submanifold Dirac Operator. Preprint IÉCN, Nancy, n° 44 (2000)
16. Grubb, G.: Heat operator trace expansions and index for general Atiyah-Patodi-Singer boundary problems. Commun. Part. Diff. Equat. 17, 2031–2077 (1992)
17. Grubb, G., Seeley, R.: Développements asymptotiques pour l'opérateur d'Atiyah–Patodi–Singer. C. R. Acad. Sci. Paris, Ser. I 317, 1123–1126 (1993)
18. Grubb, G., Seeley, R.: Weakly parametric pseudodifferential operators and Atiyah Patodi Singer boundary problems. Invent. Math. 121, 481–529 (1995)
19. Grubb, G., Seeley, R.: Zeta and eta functions for Atiyah-Patodi-Singer operators. J. Geom. Anal. 6, 31–77 (1996)
20. Herzlich, M.: A Penrose-like inequality for the mass of Riemannian asymptotically flat manifolds. Commun. Math. Phys. 188, 121–133 (1998)
21. Herzlich, M.: The positive mass theorem for black holes revisited. J. Geom. Phys. 26, 97–111 (1998)
22. Hijazi, O.: A conformal lower bound for the smallest eigenvalue of the Dirac operator and Killing spinors. Commun. Math. Phys. 104, 151–162 (1986)
23. Hijazi, O.: Première valeur propre de l'opérateur de Dirac et nombre de Yamabe. C. R. Acad. Sci. Paris 313, 865–868 (1991)
24. Hijazi, O.: Lower bounds for the eigenvalues of the Dirac operator. J. Geom. Phys. 16, 27–38 (1995)
25. Hijazi, O., Montiel, S., Zhang, X.: Dirac operator on embedded hypersurfaces. Math. Res. Lett. 8, 195–208 (2001)
26. Hijazi, O., Montiel, S., Zhang, X.: Conformal Lower Bounds for the Dirac Operator of Embedded Hypersurfaces. Asian J. Math., to appear
27. Hijazi, O., Zhang, X.: Lower bounds for the Eigenvalues of the Dirac Operator, Part I. The hypersurface Dirac Operator. To appear in Ann. Glob. Anal. Geom.
28. Hijazi, O., Zhang, X.: Lower bounds for the Eigenvalues of the Dirac Operator, Part II. The Submanifold Dirac Operator. To appear in Ann. Glob. Anal. Geom.
29. Lawson, H., Michelsohn, M.: Spin geometry. Princeton, NJ: Princeton Univ. Press, 1989
30. Morel, B.: Eigenvalue Estimates for the Dirac-Schrödinger Operators. J. Geom. Phys. 38, 1–18 (2001)
31. Trautman, A.: The Dirac Operator on Hypersurfaces. Acta Phys. Polon. B 26, 1283–1310 (1995)
32. Witten, E.: A new proof of the positive energy theorem. Commun. Math. Phys. 80, 381–402 (1981)
33. Zhang, X.: Lower bounds for eigenvalues of hypersurface Dirac operators. Math. Res. Lett. 5, 199–210 (1998); A remark on: Lower bounds for eigenvalues of hypersurface Dirac operators. Math. Res. Lett. 6, 465–466 (1999)
34. Zhang, X.: Angular momentum and positive mass theorem. Commun. Math. Phys. 206, 137–155 (1999)

Communicated by M. Aizenman

Commun. Math. Phys. 221, 267 – 292 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Boundary Layer Stability in Real Vanishing Viscosity Limit

Denis Serre¹, Kevin Zumbrun²

¹ ENS Lyon, UMPA (UMR 5669 CNRS), 46, allée d'Italie, 69364 Lyon Cedex 07, France.

² Department of Mathematics, Indiana University, Rawles Hall, Bloomington, IN 47405, USA.

Received: 27 November 2000 / Accepted: 16 March 2001

Abstract: In the previous paper [20], an Evans function machinery for the study of boundary layer stability was developed. There, the analysis was restricted to strongly parabolic perturbations, that is to an approximation of the form ut + (F (u))x = ν(B(u)ux )x (ν 0.

(1)

Here, F is a given flux, a C 2 -vector field on a convex open subset U of Rn . The diffusion tensor B is of class C 2 ; its eigenvalues need to have non-negative real parts, but it is important not to assume the invertibility of B(u). We assume that the rank of B does not depend on u, and we denote it by r (1 ≤ r ≤ n). The positive constant ν is small. We therefore are interested in the limit as ν → 0+ , expecting that the solutions uν of (1) converge boundedly almost everywhere to solutions of the inviscid system ut + F (u)x = 0,

x, t > 0.

(2)

The local well-posedness of the Cauchy problem (where x ∈ R replaces x > 0) for (1) is a difficult problem, first addressed by Kawashima in his unpublished thesis [15]. A natural hypothesis (see [16]), that we shall adopt here is that there exists a smooth change of variables u → v(u) (with inverse u = g(v)), in which the system rewrites as g(v)t + f (v)x = ν(b(v)vx )x ,

(3)

with the following properties:

(H1) b(v) is block-diagonal:
\[ b(v) = \begin{pmatrix} 0 & 0 \\ 0 & b_1(v) \end{pmatrix}, \]
with b1(v) ∈ GL_r(R).

(H2) dg(v) is lower block-triangular:
\[ dg(v) = \begin{pmatrix} \gamma(v) & 0 \\ \cdot & \delta(v) \end{pmatrix}, \]
with, necessarily, γ(v) ∈ GL_{n−r}, δ(v) ∈ GL_r.

(H3) For each v̄ ∈ 𝒱 := v(𝒰), the linear operator δ(v̄)∂_t − b1(v̄)∂_x² is strongly parabolic.

(H4) In the block decomposition of df,
\[ df(v) = \begin{pmatrix} h(v) & \cdot \\ \cdot & \cdot \end{pmatrix}, \]

the matrix γ(v)⁻¹h(v) is diagonalisable with real eigenvalues.

In this list, (H1) is a little bit restrictive. But it only needs the eigenvalue λ = 0 of B to be semi-simple. We shall see later on that a natural assumption (see (H9)) ensures this property. The last hypothesis means that the system obtained from (3) by removing the second order equations, and freezing the corresponding variables, is an (n − r) × (n − r) hyperbolic system. In view of these hypotheses, we shall denote by (w, z)ᵀ the block decomposition of v. Defining also f =: (f0, f1), g =: (g0, g1), we have h = d_w f0 and γ = d_w g0.

Since we are concerned with initial-boundary value problems (IBVP), we need to distinguish between two cases, the boundary {x = 0} being characteristic or not. We say that it is characteristic at some state u ∈ 𝒰 (corresponding to v ∈ 𝒱) if one of the signal velocities of the system under consideration vanishes at u. For the perturbed system (1), or equivalently (3), this means that h(v) is singular. For the "inviscid" system (2) this means that dF(u) is singular. Characteristic IBVPs may be difficult to attack. But overall, the status of the boundary layer is much different according to the nature of the boundary. The example of a gas flow, where (2) is the Euler equations and (1) is the Navier–Stokes equations, is enlightening. A natural assumption is that the boundary is impermeable; then it is characteristic, for both (2) and (1). The width of the boundary layer is about the square root √ν of the viscosity. Its profile is expected to obey the Prandtl equation. Very little is known at a rigorous level in this case. On the contrary, an inflow (or outflow) boundary condition makes the boundary non-characteristic in most cases¹. In that case, the width of the boundary layer is of order ν and its profile obeys an ODE (see (4) below). A suitable analysis of this case was carried out by Gisclon and one of us [9, 10] in one space dimension, and by Grenier & Guès [12] in several space dimensions. These references deal with boundary layers of moderate amplitude, when (1) is strictly parabolic, that is r = n.

Let us assume that (2) is hyperbolic, that is dF(u) is diagonalisable with real eigenvalues. When dF(u) is invertible, an IBVP needs q independent scalar boundary conditions, where q is the number of incoming characteristic curves, that is the number of positive eigenvalues of dF(u). Similarly, assuming (H1–H4) and that h(v) is invertible, an IBVP for (1) needs r + p independent scalar boundary conditions, where p is the number of incoming characteristics for the reduced system g0(w, z̄)_t + f0(w, z̄)_x = 0 (z̄ constant), that is the number of positive eigenvalues of γ(v)⁻¹h(v). We shall see that p ≤ q ≤ r + p (Corollary 1); a boundary layer occurs when q < p + r. For the sake of simplicity, we shall restrict to a set of Dirichlet-type boundary conditions for (1). In [9, 10, 12], the convergence of (1) towards (2) was proved under natural assumptions, as long as the amplitude of the boundary layer remains smaller than some threshold.

¹ In- or out-flow data are of interest in problems with apertures, such as occur in oil recovery.

For x >> ν, the solution u^ν of the IBVP for (1) behaves like the solution ū of an appropriate IBVP for (2). For x = O(ν), it behaves as a layer U(x/ν; t). Here, the time variable acts just as a parameter and U(·; t) solves an ODE
\[ B(U)U' = F(U) - F(\bar u(0,t)), \tag{4} \]
with U(+∞; t) = ū(0, t) and U(0; t) satisfying the boundary condition of (1). Given ū(0, t), this is an overdetermined problem, which admits a solution if and only if ū(0, t) belongs to some subset C(t) (the reference to t is there because the boundary data might depend on t). The relation ū(0, t) ∈ C(t) then plays the rôle of a boundary condition for (2), called the residual boundary condition. Under suitable assumptions, C(t) is a submanifold of codimension q (see [9, 10, 19]) and gives rise to a locally well-posed IBVP for (2) (see [17] for a theory of such IBVP), which determines ū on some strip R+ × (0, T). The restriction r = n, assumed in [9, 10], is not essential here, as pointed out by H. Freistühler (personal communication). We point out that U(·; t) is nothing but a steady solution of the IBVP for the rescaled problem
\[ u_\tau + F(u)_y = (B(u)u_y)_y. \tag{5} \]
Similarly, U(·/ν; t0) is a steady solution of the IBVP for (1). The restriction of moderate strength in [9, 10, 12] is actually relevant. We do not exclude that some strong layers become linearly unstable, which would forbid the convergence as ν → 0. The instability mechanism may be described as follows. Let us consider the linearized problems about U(·/ν; t0) and U(·; t0):
\[ u_t = L^\nu u + \text{linearized boundary conditions}, \tag{6} \]
\[ u_\tau = L u + \text{linearized boundary conditions}. \tag{7} \]

Clearly, the linearized boundary conditions are the same for both problems; therefore L and L^ν have the same domain D. One easily checks that L^ν is conjugated to ν⁻¹L, through the rescaling u → ũ, ũ(y) = u(νy). Let us now suppose that the spectrum of L contains some complex number λ with real part ω > 0. Then L^ν admits the spectral value λ/ν and the boundary layer is more and more unstable as ν → 0: disturbances are amplified by a factor exp(ωt/ν) and are completely destroyed on a time scale O(ν). In other words, such a boundary layer may not be observed in practice and is irrelevant. As a matter of fact, the analysis in [9, 10, 12] implies that layers of moderate size, with r = n, are linearly stable.

On the contrary, a recent work by Grenier and Rousset [13] shows that spectral stability of the boundary layer implies non-linear stability, under the condition that r = n. Let us give a short description of their result. Being given a Dirichlet boundary data a(t) for (1), let u be a smooth solution of the hyperbolic system (2) with initial data u0(x) and residual boundary condition u(0, t) ∈ C(t) associated to a. Let U(·; t) be the boundary layer, determined by (4) and U(0; t) = a(t) (and therefore U(+∞; t) = u(0, t)). Finally, let u^ν be the solution of the IBVP for (1). Assume that for every t ∈ [0, T[, the boundary layer U(·; t) is spectrally stable. Then u^ν converges strongly towards u.

This motivates our study of the "spectral stability of the boundary layer". By this, we mean that the spectrum of L is included in the left (say, stable) half-space {λ ∈ C; ℜλ ≤ 0}. To decide whether a given boundary layer is spectrally stable or not is a difficult task, which cannot be solved explicitly by quadrature. We shall see that, under reasonable assumptions, the essential spectrum of L lies in the stable half-space.

Therefore, instability can occur only when L admits an eigenvalue with ℜλ > 0. This yields the eigenvalue problem
\[ (L - \lambda)u = 0, \qquad u \in D. \tag{8} \]
The difficulty then comes from the fact that, since L is a differential operator with variable coefficients, we are not able to solve explicitly the ODE (L − λ)u = 0. The information obtained by differentiating (4) is clearly not enough:
\[ LU' = 0. \tag{9} \]
We point out in passing that U′ does not satisfy the linearized boundary condition in general, so that (9) does not mean that zero is an eigenvalue of L, contrary to the case of travelling waves (see for instance [7]).

In the sequel, we first focus on the stability analysis of one single given layer U. We denote by u+ its limit at +∞ and we define V := v ◦ U. We only assume that (H1–H4) hold on a neighbourhood 𝒰 of the range of U. In order to have minimal hypotheses, we complete (H1–H4) by

(H5) The boundary is non-characteristic for (1), that is h(v) ∈ GL_{n−r}(R), ∀v ∈ 𝒱 := v(𝒰).

(H6) Strict hyperbolicity of (2) near u+: for u in some neighbourhood of u+, the matrix dF(u) is diagonalisable with real eigenvalues of constant multiplicities.

(H7) The boundary is non-characteristic for (2) at u+, that is dF(u+) ∈ GL_n(R).

(H8) The state u+ is linearly L²-stable for the Cauchy problem of (1): for all ξ ∈ R*, the eigenvalues of the matrix ξ²B(u+) + iξ dF(u+) have strictly positive real parts.

For the sake of simplicity, we shall denote K+ := K(u+) (for functions of the variable u ∈ 𝒰) or k+ = lim_{x→+∞} k(x) (for functions of the variable x > 0). When (H8) holds, there exists a positive θ such that ℜκ ≥ θξ² holds for every eigenvalue κ of ξ²B+ + iξ dF+ and |ξ| < 1. This estimate is not uniformly valid for ξ ∈ R when r < n. We point out that (H8) implies that dF+ has real eigenvalues (examine the limit as ξ → 0), a slightly weaker property than (H6). Also, (H8) follows from stronger, but rather natural, hypotheses:

(H9) There is a dissipative symmetrizer S+ at u+, that is a positive definite symmetric matrix such that S+dF+ is symmetric and
\[ \forall X \in \mathbb{R}^n, \qquad (S_+B_+X, X) \ge \beta\,\|B_+X\|^2, \]
where (· , ·) denotes the scalar product in Rⁿ and β > 0 is a constant.

(H10) The hyperbolic and parabolic modes do couple: the kernel of B+ does not contain eigenvectors of dF+.

Lemma 1. Hypotheses (H9, H10) imply (H8).
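The statement of Lemma 1 can be sanity-checked numerically on a toy case. The sketch below uses a hypothetical 2 × 2 pair (dF+, B+), chosen only for illustration and not taken from the paper: dF+ is symmetric and B+ = diag(0, 1), so the identity matrix serves as a dissipative symmetrizer with β = 1, and ker B+ contains no eigenvector of dF+. It then checks that the eigenvalues of ξ²B+ + iξ dF+ have positive real part at a few sample frequencies.

```python
import cmath

# Hypothetical example (n = 2, r = 1), for illustration only:
# dF = [[1, 1], [1, 0]] is symmetric and B = diag(0, 1), so S = identity
# satisfies (H9) with beta = 1, and ker B = span{(1, 0)} contains no
# eigenvector of dF, which is (H10).
DF = ((1.0, 1.0), (1.0, 0.0))
B = ((0.0, 0.0), (0.0, 1.0))


def eigenvalues_2x2(matrix):
    """Both eigenvalues of a complex 2x2 matrix via the quadratic formula."""
    (a, b), (c, d) = matrix
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


def symbol(xi):
    """The matrix xi^2 B + i xi dF appearing in (H8)."""
    return tuple(
        tuple(xi * xi * B[i][j] + 1j * xi * DF[i][j] for j in range(2))
        for i in range(2)
    )


def min_real_part(xi):
    """Smallest real part among the eigenvalues of the symbol at frequency xi."""
    return min(kappa.real for kappa in eigenvalues_2x2(symbol(xi)))


if __name__ == "__main__":
    for xi in (-5.0, -1.0, -0.1, 0.1, 1.0, 5.0):
        print(xi, min_real_part(xi))
```

A finite sample of frequencies is of course no substitute for the proof; the check only illustrates what (H8) asserts.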

Proof. Let ξ ∈ R* and let (λ, X) be an eigenpair of ξ²B+ + iξ dF+. Then (ξ²B+ + iξ dF+ − λ)X = 0. Multiplying by X*S+, and taking the real part, we obtain (ℜλ)X*S+X = ξ²ℜ(S+B+X, X) ≥ ξ²β‖B+X‖². Therefore ℜλ is non-negative. It is strictly positive, because otherwise B+X = 0, so that X would be an eigenvector of dF+ lying in ker B+, which (H10) excludes.

Such a symmetrizer usually comes as the Hessian matrix of an entropy for (2), which is strongly convex at u+ and dissipative for (1). The rôle of (H9) in the computation of a "stability index" has been explained in [2]. Let us point out that assumption (H9) immediately implies that the range R(B+) is S+-orthogonal to ker B+. This shows that zero is a semi-simple eigenvalue of B+, a property which was implicit in assumption (H1). We also remark that instability does not occur in scalar problems (n = 1), even at a non-linear level, as shown in [4].

Our paper is organized as follows. In the next section, we study the boundary layer equation in a geometrical setting and we show that the stability analysis reduces to the search of ordinary eigenvalues. In Sect. 3, we build our Evans function, following [20], and focus on its crucial estimate at λ = 0. In Sect. 4, we consider a richer situation, where the boundary layer is parametrized in such a way that it is a piece of a maximal solution of the layer equation. When this solution is a viscous shock profile and the piece is almost the whole, then we show that the stability index is the sign of an algebraic expression. This sign can be computed in several cases. The remaining sections are devoted to full as well as to isentropic gas dynamics. For full gas dynamics (Sect. 5), we show that for an adiabatic constant γ > 2 and when the viscosity coefficient ν dominates the heat diffusion κ (for instance, ν > κ works), then there exist unstable boundary layers with inflow. As explained above, such instability is only shown for layers of large amplitude, which are almost heteroclinic orbits of (4). This result is the main application of our analysis. Finally, an appendix shows that weak boundary layers are spectrally stable, thanks to generalized energy inequalities.

2. Linear and Non-Linear Dynamical Systems

We begin with the non-linear equation (4), that we rewrite as
\[ B(U)U' = F(U) - F(u_+). \tag{10} \]
When r < n, this is not an ODE in the strict sense, but a "differential-algebraic" equation. It may be better to see it under the form
\[ b(V)V' = f(V) - f(v_+), \qquad v_+ = v(u_+). \tag{11} \]
We split this system into two pieces:
\[ f_0(W, Z) = f_0(v_+), \qquad b_1(W, Z)Z' = f_1(W, Z) - f_1(v_+). \]
From (H5), the identity f0(v) = f0(v+) allows us to determine w in terms of (z, v+) in a neighbourhood of the range of V: w = ŵ(z, v+). Therefore, the differential part becomes an ODE in z, well-defined in a neighbourhood of the z-projection of this range. Let us write it as
\[ z' = G(z; v_+). \tag{12} \]
We know ŵ(z+, v+) = w+ and therefore G(z+, v+) = 0.

Lemma 2. Under (H7, H9, H10), the rest point z+ of the dynamical system (12) is hyperbolic. Its stable manifold is of dimension r + p − q.

Proof. One easily computes d_zG(z+, v+) = b1(v+)⁻¹(d_w f1 d_z ŵ + d_z f1)+. Let us consider eigenvalues σ of d_zG(z+, v+). We have det(d_w f1 d_z ŵ + d_z f1 − σb1)+ = 0. However, d_z ŵ = −(d_w f0)⁻¹ d_z f0. Thus, using Schur's formula, we arrive at det(df+ − σb+) = 0, or equivalently det(dF+ − σB+) = 0. Up to a non-zero constant, this determinant is the characteristic polynomial of dG+. According to (H7, H8), σ may not be purely imaginary. Therefore, z+ is a hyperbolic rest point.

We now proceed by homotopy. For m > 0, we define P_m(σ) := det(dF+ − σ(B+ + mI_n)). Since the pair (dF+, B+ + mI_n) satisfies the assumptions (H7, H8), we again see that P_m does not vanish on the imaginary axis. Since its degree n is constant for m > 0, we deduce that the number of roots of negative real part does not depend on m > 0. Letting m → +∞, we find that this number is n − q. Since the degree of P_m drops to r as m reaches 0+ (P_0 is, up to a constant, the characteristic polynomial of the r × r matrix dG+), these roots split into two parts. One set contains those which tend to the roots of P_0 with negative real parts. The cardinality of this set is the dimension of the stable manifold. The other set consists of those roots which tend to infinity as m → 0+. To prove the lemma, we need to show that its cardinality is n − r − p, the number of negative eigenvalues of γ+⁻¹h+. By density, we may assume that this matrix has only simple eigenvalues.

We first show that, if P_m(σ_m) = 0 and σ_m tends to infinity, then mσ_m tends to such an eigenvalue. For P_m = 0 means that there is an X_m, say of unit norm, such that df+X_m = σ_m(B+ + m dg+)X_m. Dividing by σ_m, we first obtain that X_m has a cluster point X̄ in ker B+. Obviously, X̄ has unit norm. Next, retaining only the n − r first rows of the equality, we have h+X̄ ∼ σ_m m γ+X̄, which proves the claim.

Conversely, let µ0 < 0 be a negative eigenvalue of γ+⁻¹h+. This means that there exists a non-zero pair (Y0, Z0) with Y0 ∈ ker B+, Z0 ∈ R(B+) and (dF+ − µ0)Y0 = B+Z0. From (H5), µ0 ≠ 0 and we may redefine Z0 so as to have (dF+ − µ0)Y0 = µ0B+Z0. This means in particular that (dF+ − µ0)(ker B+) ∩ R(B+) ≠ {0}. Since the sum of dimensions of these spaces is n (hypothesis (H10)), this means that their sum has codimension one. Let l0 be a non-trivial linear form vanishing on it. From the simplicity of µ0, we know that l0Y0 ≠ 0; we therefore normalize l0 by l0Y0 = 1. We now define the following non-linear mapping:
\[ N:\ \mathbb{R}^2\times\ker B_+\times R(B_+)\to\mathbb{R}\times\mathbb{R}^n, \qquad (m,\mu,Y,Z)\mapsto N(m,\mu,Y,Z) := \begin{pmatrix} l_0Y-1 \\ (dF_+-\mu)(Y+mZ)-\mu B_+Z \end{pmatrix}. \]
We already have N(0, µ0, Y0, Z0) = 0. We check easily that the differential d_{µ,Y,Z}N, computed at (0, µ0, Y0, Z0), is injective, thus invertible. From the implicit function theorem, we obtain a locally defined smooth function m ↦ (µ, Y, Z), whose graph is the zero set of N near (0, µ0, Y0, Z0). Then X := Y(m) + mZ(m) and σ := µ(m)/m satisfy (dF+ − σ(B+ + m))X = 0, so that P_m(σ) = 0, with σ < 0.
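The root count in the proof can be followed on a hypothetical 2 × 2 example (n = 2, r = 1), not taken from the paper: dF+ = [[a, 1], [1, 0]] with a = ±1 and B+ = diag(0, 1), written in coordinates where g is the identity, so that γ+⁻¹h+ reduces to the scalar a. Expanding the determinant by hand gives P_m(σ) = m(1 + m)σ² − a(1 + m)σ − 1, which the sketch below solves in closed form; all function names are illustrative.

```python
import math

N_SIZE = 2   # n: size of the (hypothetical) system
R_RANK = 1   # r: rank of B = diag(0, 1)


def q_incoming(a):
    """Number of positive eigenvalues of dF = [[a, 1], [1, 0]]."""
    disc = math.sqrt(a * a + 4.0)
    return sum(1 for mu in ((a + disc) / 2.0, (a - disc) / 2.0) if mu > 0)


def p_incoming(a):
    """Number of positive eigenvalues of gamma^{-1} h, here the scalar a."""
    return 1 if a > 0 else 0


def roots_of_P(a, m):
    """Real roots of P_m(sigma) = m (1 + m) sigma^2 - a (1 + m) sigma - 1."""
    if m == 0:
        return [-1.0 / a]                      # the degree drops to r = 1
    lead = m * (1.0 + m)
    mid = -a * (1.0 + m)
    disc = math.sqrt(mid * mid + 4.0 * lead)   # discriminant is always positive
    return [(-mid + disc) / (2.0 * lead), (-mid - disc) / (2.0 * lead)]


def stable_count(a, m):
    """Number of roots of P_m with negative real part (the roots are real here)."""
    return sum(1 for sigma in roots_of_P(a, m) if sigma < 0)


if __name__ == "__main__":
    for a in (1.0, -1.0):
        p, q = p_incoming(a), q_incoming(a)
        print("a =", a, "p =", p, "q =", q,
              "stable roots of P_0:", stable_count(a, 0),
              "stable roots of P_m, m = 0.01:", stable_count(a, 0.01))
```

In this toy case the number of stable roots is n − q = 1 for every m > 0, while P_0 has r + p − q stable roots; for a = −1 the missing stable root escapes to infinity with mσ tending to a, in line with the two sets of roots described in the proof.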

Corollary 1. Under (H7, H9, H10), one has p ≤ q ≤ p + r.

We point out that both inequalities in this corollary are equivalent to each other, in the following sense. Let us for instance assume that p ≤ q is true under (H7, H9, H10). Then (−F, B) satisfy (H7, H9, H10) too, with (p, q) replaced by (n − r − p, n − q). Therefore, n − r − p ≤ n − q, or equivalently q ≤ r + p.

We deduce from the lemma that the profile U tends exponentially fast to its limit u+:
\[ |U(y) - u_+| + |U'(y)| = O(e^{-\alpha_+ y}), \qquad \alpha_+ > 0. \tag{13} \]
This is actually clear for Z, then for W, using the formula W = ŵ(Z, v+).

We now turn to the linear operator
\[ Lu = \big(B(U)u' + (dB(U)u)U' - dF(U)u\big)'. \]
Its boundary conditions are given by r + p linear forms D1, ..., D_{r+p}:
\[ D_1u(0) = \cdots = D_{r+p}u(0) = 0. \tag{14} \]
The linear transform u ↦ dv(U)u shows that L is conjugate to l, where
\[ lv := dg(V)^{-1}\big(b(V)v' + (db(V)v)V' - df(V)v\big)'. \]
The boundary conditions transform accordingly:
\[ d_1v(0) = \cdots = d_{r+p}v(0) = 0, \tag{15} \]
where d_j ◦ dv(U(0)) = D_j. The operator l is a list of r second-order differential operators and n − r first-order ones. Its domain is
\[ D_l = \big\{(w, z) \in H^1(\mathbb{R}^+)^{n-r}\times H^2(\mathbb{R}^+)^r\ ;\ d_j(w(0), z(0)) = 0,\ 1\le j\le r+p\big\}. \]
For instance, D_l = H¹₀(R+)^{n−r} × (H²(R+) ∩ H¹₀(R+))^r, when r + p = n, that is when all the eigenvalues of (γ⁻¹h)(V(0)) are positive.

Let us now introduce the constant coefficient operator on the whole line
\[ l_+v = (dg_+)^{-1}(b_+v'' - df_+v'), \]
with domain H¹(R)^{n−r} × H²(R)^r. It is obtained from l by taking the limit as x → +∞. Its spectrum, computable from the Fourier transform, is given by
\[ \sigma_+ = \{\lambda\ ;\ \det(\mu^2b_+ + i\mu\,df_+ + \lambda\,dg_+) = 0 \text{ for some } \mu\in\mathbb{R}\}. \]
From (H8), we know that σ+ consists of numbers of strictly negative real part, apart from λ = 0. We shall denote by A the connected component of C \ σ+ which contains the right half-plane {ℜλ > 0}. As usual, the following lemma is crucial.

Lemma 3. For all λ ∈ A, the operator λ − l : D_l → L²(R+) is Fredholm with index zero. The eigenvalues of l, in A, are isolated.

Therefore,

Corollary 2. The unstable spectrum of L, or equivalently l, consists only of isolated eigenvalues of finite multiplicities.

Boundary Layer Stability

275

Proof. This follows similarly as in the case of an asymptotically constant-coefficient operator on the whole line, by a now-standard argument of Henry [14]. Specifically, the result for constant coefficient operators can be established by direct computation, similarly as in [14, p. 138]; this can then be extended to the asymptotically constant case by a version of Weyl’s Lemma (Theorem A.1 of [14, p. 136]) stating that, except for isolated eigenvalues of constant multiplicity, the spectrum of an operator is unchanged by relatively compact perturbation. For, it is readily verified that an asymptotically constant-coefficient operator is a relatively compact perturbation of the corresponding constant-coefficient operator with limiting coefficients at x → +∞, see Exercise 2, p. 137 of [14]. Alternatively, following the approach of [21] for operators on the line, one can establish the result directly, by explicit construction of the Green’s function in terms of the Evans function, followed by a direct computation showing that the location and multiplicity of eigenvalues of L correspond exactly to the location and multiplicity of zeroes of the (analytic) Evans function. 3. The Evans Function Following the general theory set up in [1], we construct an holomorphic function B : A → C, whose zeroes are the unstable eigenvalues of L. This extends Serre’s construction [20] to the case of a non-invertible B. We call B the “Evans function” of L. Following Gardner & Zumbrun [7], we show that B extends analytically to a neighbourhood of the origin. Let λ be a complex number with λ > 0, or more generally an element of A. We first rewrite the differential equation (l − λ)v = 0 as a linear first order system of n + r ordinary differential equations: w w z = M(x; λ) z , x > 0. (16) z z The boundary conditions are rewritten as dˆj (w, z, z ) := dj (w, z) = 0. The matrix M+ (λ) = M(+∞; λ) is hyperbolic, that is its eigenvalues have non-zero real parts. 
These are the zeroes of the polynomial $P_\lambda(\mu) := \det(\mu^2 b_+ - \mu\, df_+ - \lambda\, dg_+)$. By a continuation argument, there are $r + p$ eigenvalues of negative real part $\mu_1^+(\lambda), \dots, \mu_{r+p}^+(\lambda)$, counting with multiplicities. The corresponding (generalized) eigenvectors span the “stable subspace” $E_+(\lambda)$ of $M_+(\lambda)$. It follows that the set of bounded solutions of (16) is a vector space of dimension $r + p$, that we denote by $E(\lambda)$. Such solutions actually decay exponentially fast as $x \to +\infty$. The space $E_+(\lambda)$ is the limit of the trace $E(\lambda; x)$, as $x \to +\infty$. The space $E(\lambda)$ depends holomorphically on $\lambda$. If it were possible to select a holomorphic basis $B(\lambda) = \{\phi_1(\cdot; \lambda), \dots, \phi_{p+r}(\cdot; \lambda)\}$ of $E(\lambda)$, then one would define $B(\lambda)$ directly by
$$B(\lambda) := \det\left(\hat d_j(\phi_k(0; \lambda))\right)_{1 \le j,k \le p+r}. \tag{17}$$

The vanishing of such a number is clearly equivalent to the existence of a linear combination of the φk ’s, on which the dˆj ’s vanish simultaneously. This amounts to saying


D. Serre, K. Zumbrun

that there exists a $\phi$ in $E(\lambda)$ such that $\hat d_1 \phi = \cdots = \hat d_{r+p} \phi = 0$. Equivalently, there is a $v \in D_l$ such that $(l - \lambda)v = 0$: $\lambda$ is an eigenvalue of $l$. Conversely, $B$ vanishes at every eigenvalue of $l$ in $A$.

This procedure is possible when $r + p = 1$. It is also possible in every small open ball in $A$. However, in the general case, it raises serious difficulties because of the existence of branching points in $A$, where $M_+(\lambda)$ fails to be diagonalisable. At such points, the natural choice of $B$, given by prescribed asymptotic behaviour of the $\phi_k$’s, is meaningless. To overcome this difficulty, one commonly works in the exterior algebra $\Lambda^{r+p}(\mathbb{C}^{n+r})$. For $1 \le m \le n + r$, there is a unique homomorphism $M^{(m)}(x; \lambda)$ in $\Lambda^m(\mathbb{C}^{n+r})$, such that $m$ solutions $\phi_1, \dots, \phi_m$ of (16) always satisfy
$$\frac{d}{dx}\, \phi_1 \wedge \cdots \wedge \phi_m = M^{(m)}(x; \lambda)\, \phi_1 \wedge \cdots \wedge \phi_m.$$

When $m = r + p$, $M_+^{(p+r)}(\lambda)$ has the nice property that it has only one eigenvalue $\mu_+(\lambda)$ of minimal real part and that it is simple. Actually,
$$\mu_+(\lambda) = \mu_1^+(\lambda) + \cdots + \mu_{r+p}^+(\lambda).$$
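The claim that the eigenvalues of the induced map on $m$-fold wedge products are sums of $m$ eigenvalues of the original matrix can be checked on a small example. The sketch below builds the induced matrix (often called the additive compound) in the basis $e_{i_1} \wedge \cdots \wedge e_{i_m}$, $i_1 < \cdots < i_m$; the function names and the symmetric $4 \times 4$ test matrix are illustrative assumptions, not objects from the paper.

```python
import itertools
import numpy as np

def permutation_sign(perm):
    """Sign of a permutation given as a sequence of distinct integers."""
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1

def additive_compound(M, m):
    """Matrix of phi_1^...^phi_m -> sum_k phi_1^...^(M phi_k)^...^phi_m
    in the basis e_I, I an increasing m-tuple of indices."""
    n = M.shape[0]
    subsets = list(itertools.combinations(range(n), m))
    position = {s: a for a, s in enumerate(subsets)}
    C = np.zeros((len(subsets), len(subsets)), dtype=M.dtype)
    for col, I in enumerate(subsets):
        for k, ik in enumerate(I):
            for j in range(n):
                if j != ik and j in I:
                    continue  # repeated factor: the wedge product vanishes
                factors = list(I)
                factors[k] = j  # M e_{ik} contributes M[j, ik] e_j
                order = sorted(range(m), key=lambda t: factors[t])
                J = tuple(factors[t] for t in order)
                C[position[J], col] += permutation_sign(order) * M[j, ik]
    return C

# Hypothetical symmetric test matrix (real eigenvalues keep the check simple).
M_example = np.array([[2.0, 1.0, 0.0, 0.0],
                      [1.0, 3.0, 1.0, 0.0],
                      [0.0, 1.0, 5.0, 1.0],
                      [0.0, 0.0, 1.0, 9.0]])
C2 = additive_compound(M_example, 2)
print(np.sort(np.linalg.eigvals(C2).real))
```

The printed values should coincide with the pairwise sums of the eigenvalues of `M_example`; in particular the smallest one is the sum of the two smallest eigenvalues, which mirrors the formula for $\mu_+(\lambda)$ above.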

The corresponding eigenvector has the form $y_1 \wedge \cdots \wedge y_{r+p}$, where $\{y_1, \dots, y_{r+p}\}$ is any basis of $E_+(\lambda)$. Since $\lambda \mapsto M_+^{(r+p)}(\lambda)$ is holomorphic and $\mu_+$ is simple, $\mu_+(\lambda)$ is holomorphic too. Therefore, one may select a holomorphic section $\lambda \mapsto Y(\lambda)$ of the eigen-bundle:
$$(M_+^{(r+p)}(\lambda) - \mu_+(\lambda))\, Y(\lambda) = 0, \qquad Y(\lambda) \ne 0.$$
In addition, noticing that $\mu_+(\lambda)$ is real when $\lambda \in\, ]0, +\infty[$, we infer that one may choose $Y(\lambda)$ so that it is real when $\lambda$ is. In particular, $Y(\bar\lambda) = \overline{Y(\lambda)}$. Next, there is a unique solution $y(\cdot; \lambda)$ of
$$y' = M^{(r+p)}(x; \lambda)\, y, \qquad y(x; \lambda) \sim \exp(\mu_+(\lambda) x)\, Y(\lambda) \quad \text{as } x \to +\infty. \tag{18}$$

For every $\lambda \in A$, $y(\cdot; \lambda)$ equals, up to a constant, a wedge product of a basis of $E(\lambda)$. Moreover, it inherits the holomorphy of $Y(\lambda)$. Similarly, $y(\bar\lambda) = \overline{y(\lambda)}$. Defining the $(p + r)$-form $\hat d := \hat d_1 \wedge \cdots \wedge \hat d_{p+r}$, we may define our Evans function as
$$B(\lambda) = \langle \hat d, y(0; \lambda) \rangle.$$
Besides all the above-mentioned properties, we point out that it takes real values on the real positive semi-axis. Given any point $\lambda_0 \in A$, it admits a form (17) in a vicinity of $\lambda_0$. This is the way we compute its local behaviour in practice.

We now point out that $\lambda \mapsto (\mu_+(\lambda), Y(\lambda))$ admits an analytic extension in a neighbourhood of the origin. Then, thanks to the exponentially fast convergence of $M^{(p+r)}(x; \lambda)$ towards $M_+^{(p+r)}(\lambda)$, we deduce (“gap lemma”, see [7]):

Proposition 1. The spaces $E_+(\lambda)$, $E(\lambda)$, the eigenvector $Y(\lambda)$, the eigen-function $y(\cdot; \lambda)$ and the Evans function $B(\lambda)$ extend analytically to a neighbourhood $\hat A$ of the origin.

Let us point out that, however, these extensions no longer obey the same definitions. For instance, $E_+(\lambda)$ is no longer the stable subspace of $M_+(\lambda)$, and so on. Let us describe $E_+(\lambda)$ when $|\lambda| \ll 1$. For $\mathrm{Re}\,\lambda > 0$, the eigenvalues of $M_+(\lambda)$ are found by looking at decaying modes $e^{\mu x} \hat v$ of $l_+ - \lambda$: these are roots of $P_\lambda$. Since $P_\lambda(\mu) = \det(-\mu\, df_+ - \lambda\, dg_+) + O(\mu^2)$,


we see that $q$ roots vanish as $\lambda \to 0$ in $A$. They behave as $-\lambda/a_j$, where $a_{n-q+1}, \dots, a_n$ are the positive eigenvalues of $(dg_+)^{-1} df_+$, or of $dF_+ = df_+ (dg_+)^{-1}$. The corresponding eigenvectors are $(r_j, 0)^T + O(\lambda)$, where $r_j$ is an eigenvector of $(dg_+)^{-1} df_+$ associated to $a_j$. In terms of eigenvectors $R_j$ of $dF_+$, one has $R_j = df_+ r_j$. The fact that $\mu_j$ extends analytically near the origin is clear when $a_j$ is simple, from the implicit function theorem. It is still true when $a_j$ is semi-simple (assumption (H6)).

The other roots tend to non-zero limits $\mu_j(0)$ as $\lambda \to 0$. These limits are roots of $\det(\mu b_+ - df_+) = 0$. They are the eigenvalues of negative real part of the matrix
$$M_1 := b_1^{-1}\left(d_z f_1 - d_w f_1 (d_w f_0)^{-1} d_z f_0\right), \qquad v = v_+.$$
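The two kinds of limiting behaviour (roots behaving like $-\lambda/a_j$, and roots tending to solutions of $\det(\mu b_+ - df_+) = 0$) are easy to see in a scalar toy model. The sketch below assumes $n = 1$ with hypothetical positive numbers `a` (standing for $df_+$) and `b` (standing for $b_+$), and $dg_+ = 1$, so that $P_\lambda(\mu) = b\mu^2 - a\mu - \lambda$; it is only an illustration of the asymptotics, not of the degenerate systems treated in the text.

```python
import numpy as np

def decaying_mode_roots(a, b, lam):
    """Roots mu of the scalar toy polynomial P_lam(mu) = b*mu**2 - a*mu - lam."""
    return np.roots([b, -a, -lam]).real  # both roots are real when lam > 0

a, b = 2.0, 1.0  # hypothetical values with a > 0 and b > 0

for lam in (1e-2, 1e-3, 1e-4):
    roots = decaying_mode_roots(a, b, lam)
    small = roots[np.argmin(np.abs(roots))]  # the root that vanishes with lam
    large = roots[np.argmax(np.abs(roots))]  # the root with a non-zero limit
    print(f"lam={lam:g}  small={small:.3e}  -lam/a={-lam / a:.3e}  "
          f"large={large:.6f}  a/b={a / b:.6f}")
```

As `lam` decreases, the small root approaches `-lam/a` and the other root approaches `a/b`, the root of `mu*b - a = 0`.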

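Given a (generalized) eigenvector $\hat z$ of this matrix, one builds a (generalized) eigenvector of $M_+(0)$ through
$$\varphi := \begin{pmatrix} -(d_w f_0)^{-1} d_z f_0\, \hat z \\ \hat z \\ \mu \hat z \end{pmatrix}. \tag{19}$$
In summary, a basis $\{\varphi_1, \dots, \varphi_{p+r}\}$ of $E_+(0)$ is given by
$$\varphi_j = \begin{pmatrix} r_{n-q+j} \\ 0 \end{pmatrix} \ \text{if } j \le q, \qquad \varphi_j = \begin{pmatrix} -(d_w f_0)^{-1} d_z f_0\, \hat z_j \\ \hat z_j \\ \mu_j(0) \hat z_j \end{pmatrix} \ \text{if } q < j \le p + r.$$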

Above, $\{\hat z_{q+1}, \dots, \hat z_{p+r}\}$ is a basis of the stable subspace of $M_1$, in which $M_1$ has a Jordan form, diagonal if possible. The $\mu_j(0)$ are the corresponding eigenvalues of $M_1$.

We now turn to the elements of $E(0)$. These are solutions $\phi = (v, z')^T$ of (16) with $\lambda = 0$. This amounts to $lv = 0$, or
$$\{b(V)v' + (db(V)v)V' - df(V)v\}' = 0.$$
Integrating once, we obtain a first-order differential-algebraic equation:
$$b(V)v' + (db(V)v)V' - df(V)v = \text{constant} =: q. \tag{20}$$

Though $E(\lambda)$ is made up of functions decaying at $+\infty$ when $\lambda \in A$, this is not true any more for $\lambda = 0$. However, $E(0)$ certainly contains all the exponentially decaying solutions of (16). These correspond to the decaying solutions of the homogeneous equation
$$b(V)v' + (db(V)v)V' - df(V)v = 0. \tag{21}$$
Such solutions form a vector subspace of dimension $p + r - q$, a basis of which is $\{\phi_{q+1}(0), \dots, \phi_{p+r}(0)\}$, where $\phi_j(0)$ solves (16) and
$$\phi_j(x; 0) \sim \exp(\mu_j(0)x)\, \varphi_j, \qquad x \to +\infty, \quad q < j \le p + r.$$
The remaining elements of $E(0)$ actually do not decay, but have finite limits $\varphi \in \mathrm{Span}\{\varphi_1, \dots, \varphi_q\}$ (the case $\varphi = 0$ corresponds to the decaying solutions, already considered). The constant in (20) is computed by letting $x \to +\infty$. We thus complete a basis of $E(0)$ by choosing $\phi_j(0)$, solutions of (16) with $\lambda = 0$, according to
$$\lim_{x \to +\infty} \phi_j(x; 0) = \varphi_j, \qquad 1 \le j \le q.$$


With $\phi_j =: (v_j, z_j')^T$, we obtain from (20),
$$b(V)v_j' + (db(V)v_j)V' - df(V)v_j = -R_{n-q+j}, \qquad 1 \le j \le q.$$

Once a basis $B(0) = \{\phi_1(0), \dots, \phi_{p+r}(0)\}$ of $E(0)$ is chosen according to the above requirements, it is extendable in an analytic way in $\hat A$, as a basis of $E(\lambda)$.

Two remarks. First, we point out that there remains much room in the choice of $B(0)$. Second, as mentioned above, $lV' = 0$ and $V'$ decays at infinity. Therefore, $V' \in E(0)$. Moreover, the asymptotic behaviour of $V'$ is generically
$$V'(x) \sim e^{\mu_j(0)x}\, r,$$
for some index $j > q$, with $(\mu_j(0)b - df)_+ r = 0$. In the case where $\mu_j(0)$ is real, we may choose $\phi_j = (V', 0)^T$.

3.1. Discussion of (20). We now show that (20) may be viewed as a traditional ODE, instead of a differential-algebraic equation. Let us denote the constant right-hand side by $q \in \mathbb{R}^n$. We first split the equation into two parts. With $q = (q_0, q_1)^T$ and $v = (w, z)^T$:
$$df_0(V)v = -q_0, \tag{22}$$
$$b_1(V)z' + (db_1(V)v)Z' - df_1(V)v = q_1. \tag{23}$$
We now differentiate (22) and keep (23) unchanged. This yields
$$\tilde b(x)v' - \tilde a(x)v = \tilde q, \tag{24}$$
with
$$\tilde b = \begin{pmatrix} h & \cdot \\ 0 & b_1 \end{pmatrix}, \qquad \tilde a = \begin{pmatrix} -(df_0)' \\ \cdots \end{pmatrix}, \qquad \tilde q = \begin{pmatrix} 0 \\ q_1 \end{pmatrix}.$$

Thanks to (H5), $\tilde b$ is invertible. Therefore, (24) is a linear ODE in the traditional form. Every solution of (20) solves (24). Conversely, let $v$ be a solution of (24), with a constant vector $\tilde q$ and $\tilde q_0 = 0$, and assume that $(v(x), v'(x)) \to (v_\infty, 0)$ as $x \to +\infty$. Then (20) holds true, with $q_1 := \tilde q_1$ and $q_0 := -df_0(v_+)v_\infty$.

4. Parametrized IBVPs

Let $U : \mathbb{R} \to \mathcal{U}$ be a given solution of the differential-algebraic system (10). We emphasize that it is defined on the whole line $\mathbb{R}$, instead of on the semi-axis $\mathbb{R}^+$. We now consider the initial-boundary value problem for (1) on the space domain $I := \,]x_0, +\infty[$, instead of $\mathbb{R}^+$. For this, we provide (1) with $p + r$ suitable boundary conditions, possibly depending on the choice of $x_0$, and we assume that the restriction $U|_I$ satisfies these conditions.

We are now concerned with the linear stability of $U|_I$ for the corresponding IBVP. For this, we construct the Evans function $B(x_0; \lambda)$. We easily see that it can be built as a continuous function with respect to $x_0$. Here, we focus on the sign of $W(x_0) := B(x_0; 0)$, versus the sign of $B(x_0; \cdot)$ near $+\infty$. By the intermediate value theorem, opposite signs imply the existence of at least one real positive root. In particular, $U|_I$ would be unstable. More precisely, opposite signs mean


that the number of unstable eigenvalues of $L$ is odd, while same signs mean that this number is even, keeping in mind that non-real eigenvalues come in complex conjugate pairs. In the sequel, we shall call this parity the stability index of $L$. Since it can be checked that $B(x_0; \cdot)$ does not vanish near infinity, a consequence of a Gårding estimate (see [2]), its sign does not depend on $x_0$. Therefore, the only ingredient in the computation of the stability index of $L$ is the sign of $W(x_0)$, which may vary with $x_0$. Though the exact computation of $W$ is not easy, we may expect to obtain some results by means of a qualitative study of $W$.

Notice that, in this case, a suitable choice of solutions $\phi_1, \dots, \phi_{p+r}$ of $\phi' = M(x; 0)\phi$ gives a coherent basis of $E(x_0; 0)$, for all $x_0$. That is, $\{\phi_1|_I, \dots, \phi_{p+r}|_I\}$ form a basis $B(x_0; 0)$. As in the previous section, we choose the $\phi_j$’s so that $\phi_j$ decays exponentially fast as $x \to +\infty$ if $j > q$, and $\phi_j$ tends to $\varphi_j$ as $x \to +\infty$ if $j \le q$. We also decompose $\phi_j =: (v_j, z_j')^T$.

A convenient tool for this study is a differential equation that $W$ should satisfy. Let us consider for instance the easiest case where $p + r = n$, that is when the boundary condition for (1) is a pure Dirichlet one (in other cases, it will often depend on $x_0$ when $U$ is not constant). Then, $W(x_0) = \det(v_1(x_0), \dots, v_n(x_0))$. Let us differentiate, using the matrix $K := \tilde b^{-1}\tilde a$:
$$\begin{aligned} W' &= \sum_j \det(\dots, v_{j-1}, K v_j + \tilde b^{-1}\tilde q_j, v_{j+1}, \dots) \\ &= (\mathrm{Tr}\,K)\,W + \sum_{j=1}^{q} \det(\dots, v_{j-1}, \tilde b^{-1}\tilde q_j, v_{j+1}, \dots) \\ &= (\mathrm{Tr}\,K)\,W + \frac{1}{\det \tilde b} \sum_{j=1}^{q} \det(\dots, \tilde b v_{j-1}, \tilde q_j, \tilde b v_{j+1}, \dots). \end{aligned}$$

Lemma 4. The spectrum of $K_+ = K(+\infty)$ is made up of the eigenvalues of $dG_+$ (with the same multiplicities), plus $\mu = 0$, with multiplicity $n - r$.

Proof. Because the first $n - r$ rows of $\tilde a_+$ vanish, we easily see that
$$(K_+ - \mu)\begin{pmatrix} w \\ z \end{pmatrix} = 0$$
is equivalent to either $\mu = 0$ or to $dG_+ z = \mu z$ with appropriate $w$. The case where eigenvalues of $dG_+$ are simple is done. The case of higher multiplicity follows by a density argument, as in, e.g., [5, 6].

Two subcases, $q = p$ or $p + 1$ (recall that we already know that $q \ge p$), are of strong interest for applications. We point out that the $p$ first components of $\tilde b v_k$ consist of the vector $df_0(V)v_k$, that is of $-q_{k,0}$, with $q_k =: (q_{k,0}, q_{k,1})^T$. For $k > q$, this is zero, since $q_k = 0$. Since $\tilde q_{j,0}$ vanishes too, we see that each of the above $n \times n$ determinants contains a null block of size $p \times (n - q + 1)$.

Let us first consider the case $q = p$. From $p + (n - q + 1) = n + 1$, we conclude that each of these determinants vanishes, so that $W' = (\mathrm{Tr}\,K)W$. Therefore, $W$ does not vanish; it keeps a constant sign. Since we can


prove (see Appendix A, below) that weak boundary layers are linearly stable, we already know that the signs of $B(x_0; 0)$ and of $B(x_0; \lambda)$ for $\lambda \gg 1$ agree for $x_0 \gg 1$. We conclude that they do, for all $x_0$:

Theorem 1. Let $U$ be a boundary layer, with $p + r = n$ (so that the boundary condition is a pure Dirichlet one) and $q = p$. Then the stability index of the linearized operator $L$ is even.

From this alone, we cannot conclude anything regarding the linear stability of $U$.

The next subcase comes with $q = p + 1$. The same arguments as above show that each of the determinants is block-triangular, since $p + (n - q + 1) = n$. Therefore, they may be written as products of two determinants, of respective sizes $p \times p$ and $(n - p) \times (n - p)$. However, we may rewrite the differential equation in a simpler form, with the next lemma.

Lemma 5. If $r + p = n$ and $q = p + 1$, then
$$\sum_{j=1}^{q} \det(\dots, \tilde b v_{j-1}, \tilde q_j, \tilde b v_{j+1}, \dots) = (-1)^p \det(q_1, \dots, q_{p+1}, b v_{q+1}, \dots, b v_n).$$

Proof. Let us define $Q_j := q_j - \tilde q_j = (q_{j,0}, 0)^T$. We use $q_j = \tilde q_j + Q_j$ and the linearity of the determinant to rewrite $\det(q_1, \dots, q_{p+1}, b v_{q+1}, \dots, b v_n)$ as a sum of $2^{p+1}$ terms. Those containing two (or more) $\tilde q_j$ vanish, since they contain a null block of size $p \times (n - p + 1)$. The term with only $Q_j$’s vanishes too, since it contains a null block of size $(n - p) \times (p + 1)$. There remain only those terms with exactly one $\tilde q_j$. These are block diagonal and are not changed when replacing the lower null block (of size $(n - p) \times p$) by a non-zero block. Such a change is performed when replacing the corresponding $Q_l$’s by $-b v_l$, since their first $p$ components agree. We finally obtain the expected formula by noticing that $b v_k = \tilde b v_k$ for $k > q$.

We now face the linear ODE
$$W' = (\mathrm{Tr}\,K)W + (-1)^p \det(q_1, \dots, q_{p+1}, b v_{q+1}, \dots, b v_n) =: \tau(x)W + s(x), \tag{25}$$
for which we can write
$$W(x) \exp\left(-\int_0^x \tau(y)\,dy\right) = c + \int_0^x s(y) \exp\left(-\int_0^y \tau(\xi)\,d\xi\right) dy,$$
where $c$ is a constant. Then the sign of $W$ equals that of the right-hand side. This may be evaluated for $x \to -\infty$, independently of $c$, each time the integral
$$\int_{-\infty}^{x} s(y) \exp\left(-\int_0^y \tau(\xi)\,d\xi\right) dy \tag{26}$$
diverges.
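The role of the divergence of (26) can be illustrated on a closed-form toy case. The sketch below assumes, purely for illustration, a constant $\tau = \mu + \sigma$ with $\sigma > 0$ and a source $s(y) = S e^{\mu y}$; the names `S`, `mu`, `sigma` and the numerical values are hypothetical. In that case the integrating-factor formula can be evaluated exactly, and the sign of $W(x)$ for very negative $x$ no longer depends on the constant $c$.

```python
import numpy as np

def W(x, c, S, mu, sigma):
    """Solution of W' = tau*W + s with tau = mu + sigma and s(y) = S*exp(mu*y),
    written with the integrating factor:
    W(x) exp(-tau x) = c + int_0^x S exp(-sigma y) dy."""
    tau = mu + sigma
    integral = S * (1.0 - np.exp(-sigma * x)) / sigma
    return np.exp(tau * x) * (c + integral)

mu, sigma = 0.5, 1.0  # hypothetical positive rates
x_far = -20.0

for S in (1.0, -2.0):
    for c in (-100.0, 0.0, 100.0):
        print(S, c, np.sign(W(x_far, c, S, mu, sigma)), np.sign(-S / sigma))
```

For each choice of `c`, the sign of `W(x_far, ...)` agrees with the sign of `-S/sigma`, because the integral grows like $-(S/\sigma)e^{-\sigma x}$ as $x \to -\infty$ and dominates the constant.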


4.1. Boundary layers from 1-shock profiles. A rather interesting case consists in choosing for $U$ a viscous shock profile of a steady shock wave, that is a solution of (10) which admits a limit $u_-$ as $x \to -\infty$. In order to have some control on the behaviour of $U$ as $x \to -\infty$, we ask that this shock be non-characteristic for (2): $df_- := df(v_-)$ is invertible. Again, $z_-$ is a hyperbolic point for the ODE (12), and $Z$ takes values in its unstable manifold. Then, generically, there exists a pair $(\mu_-, r_-)$ such that
$$V'(x) \sim e^{\mu_- x}\, r_-, \qquad x \to -\infty, \tag{27}$$
and $(\mu b - df)_- r_- = 0$. We shall focus on the most frequent case of a Lax shock, that is an $(n-q)$-Lax shock. Similarly to Lemma 2, we have

Lemma 6. For an $(n-q)$-Lax shock, the unstable manifold of $z_-$ for (12) is of dimension $1 + q - p$.

Example (fundamental). In full gas dynamics, (1) is the Navier–Stokes model, with viscosity and heat conduction. Then, $n = 3$, $r = 2$. Let us assume that $p = 1$ and $q = 2$; that is, $0 < u_+ < c_+$, where $c$ denotes the sound speed. Then the stable manifold of $z_+$ for (12) is a curve, which splits into two trajectories. For reasonable state laws (such as the perfect gas law $p = (\gamma - 1)\rho e$), there is a single other state $u_-$ such that $F(u_-) = F(u_+)$, and the pair $(u_-, u_+)$ is a 1-Lax shock. (Let us remark that 2-shocks do not exist in gas dynamics, since the second characteristic field is linearly degenerate.) From Gilbarg’s study [8], we know that a shock profile exists for every positive choice of the viscosity and heat conductivity. Therefore, one of the two possible trajectories $U$ such that $U(+\infty) = u_+$ is actually such a shock profile. This argument certainly admits generalisations to many systems and states $u_+$ such that $q = n - 1$. Our assumption that $U$ is a piece of a shock profile is thus not too restrictive.

Let us assume that $(u_-, u_+)$ is such a shock and that $U$ is its profile, with as before $r + p = n$, $q = p + 1$. Since we do not know explicitly the $v_{q+1}, \dots, v_n$, except for $v_n = V'$, we cannot in general evaluate $s(x)$. We therefore assume $q = n - 1$, that is $r = 2$, $p = n - 2$. Then, noticing that $f(v_-) = f(v_+)$ because of the Rankine–Hugoniot condition,
$$s = (-1)^n \det(q_1, \dots, q_{n-1}, bV') = (-1)^n \det(q_1, \dots, q_{n-1}, f(V) - f(v_\pm)).$$
Since $(u_-, u_+)$ is a 1-shock, Lemma 6 shows that $z_-$ is a source for (12). Thus, all eigenvalues of $dG_-$, and from Lemma 4, both non-zero eigenvalues of $K_-$, have positive real parts, one of them being $\mu_-$, the other one denoted by $\sigma_-$. Thus we have $\mathrm{Tr}\,K_- = \mu_- + \sigma_- > \mu_-$. Then the integrand in (26) is equivalent to
$$S e^{-\sigma_- x}, \qquad S := (-1)^p \det(q_1, \dots, q_{n-1}, b_- r_-).$$
This proves that (26) diverges.

Then there are two generic pictures. Either $\mu_-$ is not real, and $\sigma_- = \bar\mu_-$. Then $W$ oscillates, as shown in [20], so that the stability index is odd when $x_0$ belongs to a denumerable union of intervals, implying the instability of the corresponding boundary layers. Or $\mu_-$, $\sigma_-$ are real and simple. Then the sign of $W$ as


$x \to -\infty$ is the one of $-S/\sigma_-$, an explicit quantity! This is made even simpler when choosing, as is possible, $q_j = -R_{j+1}$ (recall that $(dF_+ - a_j)R_j = 0$ and $a_j > 0$):
$$S = -\det(R_2, \dots, R_n, b_- r_-) = -\det(R_2, \dots, R_n, B_- R_-),$$
with $R_- = df_- r_-$. Choosing appropriately an eigenform $L_1$ of $dF_+$ (that is, a non-trivial solution of $L_1(dF_+ - a_1) = 0$), we also have $S = L_1 B_- R_- = L_1 b_- r_-$.

At this stage, one may wonder about the consistency of this analysis, since a change in the choice of the basis $B(0)$ could result, for instance, in a flip of the sign of $S$. This is without taking into account the need to define continuously the bases $B(\lambda)$ along $\mathbb{R}^+$. Such a modification would also change the sign of $B(x_0; \lambda)$ for $\lambda \gg 1$, but it would not affect the sign of the product $B(x_0; 0)B(x_0; +\infty)$, since the latter is intrinsic. For the moment, let us say that, considering a one-parameter family of steady shock waves $L \mapsto (u_-, u_+)$, endowed with a smoothly varying family of profiles $L \mapsto U_L$, we obtain a presumably smooth function $L \mapsto S(L)$, instead of a single number. Then, detecting a value, for instance $L = 0$, where $S$ vanishes with $dS/dL \ne 0$, we conclude that for, say, $L > 0$, $S$ and $B(x_0; \lambda)$ have the same sign, for all $x_0$ and all $\lambda \gg 1$. Therefore, there is a point $X(L)$ such that, for $L > 0$ and $x_0 < X(L)$, the corresponding profile is unstable (let us point out that $\lim_{L \to 0} X(L) = -\infty$). To summarize this analysis, we write:

For $r = 2$ and a profile $U$ of a steady 1-shock wave, the vanishing of $S$ detects the instability of some boundary layers associated to nearby steady shocks, with x0 1 is the adiabatic constant. Its value is 5/3 for a mono-atomic gas, 7/5 for the air.


Table 1. Numbers of boundary conditions

v+            v+ < −c+    −c+ < v+ < 0    0 < v+ < c+    v+ > c+
q             0           1               2              3
[p, p + r]    [0, 2]      [0, 2]          [1, 3]         [1, 3]

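From now on, we consider only perfect gases. Without loss of generality, we may assume that $\theta \equiv e$. We therefore have
$$f(v) = \begin{pmatrix} \rho v \\ \rho v^2 + (\gamma - 1)\rho e \\ \left(\tfrac12 v^2 + \gamma e\right)\rho v \end{pmatrix}, \qquad b(v) = \begin{pmatrix} 0 & 0 & 0 \\ 0 & \nu & 0 \\ 0 & \nu v & \kappa \end{pmatrix}.$$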

We notice that $r = 2$. We easily check (H1–H4, H6) for the Navier–Stokes system. Assumption (H5) asks that $v(0) \ne 0$. Since $\rho v$ is a constant along a boundary layer (because of the first line of (11)), this amounts to $v_+ \ne 0$. Next, denoting the sound speed by $c := \sqrt{\gamma(\gamma - 1)e}$, the characteristic speeds in the Euler equations (2) are $v - c$, $v$, $v + c$, so that (H7) asks that $v_+(v_+^2 - c_+^2) \ne 0$. The number of boundary conditions that we need for (1) and (2) is given by Table 1. From the existence of a strictly convex, dissipative entropy, we know that (H9) holds. Since (H10) holds trivially, Lemma 1 shows that (H8) is satisfied. Therefore, the construction of the Evans function and the analysis done in the previous section apply when $v_+ > 0$.

When $v_+ > c_+$, the boundary layer is trivial (that is, constant) and the linearized IBVP has a full Dirichlet boundary condition. An obvious a priori estimate shows that such a layer is stable. The situation is less clear when $v_+ < 0$, since then a choice has to be made concerning the boundary condition (we need only two scalar data). Since the Evans function strongly depends on the linearized ones $d_1$, $d_2$, we anticipate that the stability index of a layer will depend not only on the layer but also on the linearized boundary conditions. We shall illustrate this dependence in the simpler case of an isentropic flow (see Sect. 6).
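The counts in Table 1 can be reproduced from the signs of the characteristic speeds. The following sketch assumes, as the table suggests, that $q$ is the number of positive speeds among $v - c$, $v$, $v + c$, and that $p$ equals 1 when $v_+ > 0$ and 0 otherwise; the function names are illustrative, and the rule for $p$ is read off from the table rather than derived here.

```python
import math

def sound_speed(gamma, e):
    """Perfect-gas sound speed c = sqrt(gamma*(gamma-1)*e)."""
    return math.sqrt(gamma * (gamma - 1.0) * e)

def boundary_condition_counts(v_plus, c_plus, r=2):
    """Return (q, p, p + r) for the end state (v_plus, c_plus).

    q: number of positive Euler characteristic speeds (inviscid problem).
    p: 1 if v_plus > 0, else 0 (assumption read off from Table 1).
    The degenerate values excluded by (H5) and (H7) are rejected.
    """
    if v_plus == 0.0 or abs(v_plus) == c_plus:
        raise ValueError("characteristic end state: v+ (v+^2 - c+^2) must be non-zero")
    speeds = (v_plus - c_plus, v_plus, v_plus + c_plus)
    q = sum(1 for s in speeds if s > 0)
    p = 1 if v_plus > 0 else 0
    return q, p, p + r

c = sound_speed(1.4, 1.0)
for v in (-2.0 * c, -0.5 * c, 0.5 * c, 2.0 * c):
    print(f"v+/c+ = {v / c:+.1f}: (q, p, p+r) = {boundary_condition_counts(v, c)}")
```

The four sample states fall in the four columns of Table 1 and return the corresponding entries.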

5.1. The case $0 < v_+ < c_+$. The case considered in the previous section ($r = 2$, $q = n - 1 = 2$, $p = n - 2 = 1$) corresponds to the choice $0 < v_+ < c_+$. From Gilbarg [8], we know that, given such a state $v^+$, there is a unique state $v^-$ with $f(v^-) = f(v^+)$ and that the pair $(v^-, v^+)$ is a 1-Lax steady shock. In particular, $v_- > c_-$. Also, it is proved that this shock admits a viscous profile $V$, for every choice of the positive functions $\nu$ and $\kappa$.

We first compute the expression $S = L_1 b_- r_-$. The differential form $L_1$ is the eigenform for $dF_+$ associated to $v_+ - c_+$, its first eigenvalue. We have
$$\begin{aligned} L_1 &= (\gamma - 1)(e_+\,d\rho + \rho_+\,de) - \rho_+ c_+\,dv \\ &= \left(\frac{\gamma - 1}{2}v^2 + vc\right)_+ du_1 - (c + (\gamma - 1)v)_+\,du_2 + (\gamma - 1)\,du_3. \end{aligned}$$


Dropping for a moment the minus indices, $r = r_-$ obeys the eigen-equation $(\mu b - df)r = 0$. With $r =: (x, y, z)^T$, this reads
$$\mu \begin{pmatrix} 0 \\ \nu y \\ \nu v y + \kappa z \end{pmatrix} = \begin{pmatrix} vx + \rho y \\ (v^2 + (\gamma - 1)e)x + 2\rho v y + (\gamma - 1)\rho z \\ \left(\tfrac12 v^2 + \gamma e\right)vx + \left(\tfrac32 v^2 + \gamma e\right)\rho y + \gamma\rho v z \end{pmatrix}.$$
From $vx + \rho y = 0$, we eliminate $x$. Making also a linear combination of the two last equalities, we arrive at
$$\mu\nu v y = (\gamma - 1)\rho v z + \rho(v^2 - (\gamma - 1)e)y, \qquad \mu\kappa z = \rho((\gamma - 1)e y + vz).$$
Since $r$ and $v$ are non-zero, we have $(y, z) \ne 0$, which implies
$$\begin{vmatrix} \mu\nu v + \rho((\gamma - 1)e - v^2) & -(\gamma - 1)\rho v \\ -(\gamma - 1)\rho e & \mu\kappa - \rho v \end{vmatrix} = 0. \tag{28}$$
Defining
$$\zeta := \frac{\mu\kappa}{\rho v} - 1,$$
this is rewritten as
$$\frac{\nu}{\kappa}\,\zeta(\zeta + 1) + \zeta\left(\frac{1}{\gamma M^2} - 1\right) + \frac{1 - \gamma}{\gamma M^2} = 0, \tag{29}$$
where $M = v_-/c_- > 1$ is the Mach number. This quadratic equation has two real solutions of opposite signs, the negative one corresponding to the smallest “eigenvalue” $\mu$. That is precisely the one which governs the asymptotic behaviour of $V'$ near $-\infty$ (see [8]). In passing, this shows that these eigenvalues are simple and real, so that the behaviour (27) is correct. From now on, by $\zeta$ we mean the negative root of (29). The corresponding eigenvector is given by the formula
$$r = \begin{pmatrix} -\rho\zeta \\ v\zeta \\ (\gamma - 1)e \end{pmatrix}, \qquad br = \begin{pmatrix} 0 \\ \nu v\zeta \\ \nu v^2\zeta + \kappa(\gamma - 1)e \end{pmatrix},$$
everything being evaluated at $v^-$. Finally,
$$S = L_1 b_- r_- = -(c_+ + (\gamma - 1)(v_+ - v_-))\,\nu_- v_- \zeta + \kappa_-(\gamma - 1)e_-.$$
We now investigate the possible vanishing of $S$. First of all, since $\nu$, $\kappa$, $v$, $e$ are positive and $\zeta$ is negative, $\aleph := c_+ + (\gamma - 1)[v]$ needs to be negative. This is far from true in general. For instance, a weak shock ($v^-$ being close to $v^+$) yields $\aleph \sim c_+ > 0$. Next, it can be shown that, as long as $\gamma \le 2$, all the 1-shocks satisfy $\aleph > 0$. However, when $\gamma > 2$, strong 1-shocks satisfy $\aleph < 0$. For instance, so-called “maximal” shocks (see


[3]), for which $e_-$ vanishes, or equivalently $M = +\infty$, give
$$\aleph = \left(\sqrt{\frac{2\gamma}{\gamma - 1}} - 2\right) v_+,$$
and the parenthesis is negative if and only if $\gamma > 2$. We therefore restrict to the case $\gamma > 2$ and select a strong enough steady 1-shock, that is one for which $\aleph < 0$. Having fixed the state $v^-$ in such a way, $S$ appears to be a homogeneous function of degree one of $(\nu_-, \kappa_-)$. Thus its vanishing depends only on the ratio $\nu/\kappa$. However, we easily obtain the following asymptotics:

$\nu/\kappa \to 0^+$: then $S \sim (\gamma - 1)e_-\kappa_- > 0$,
$\nu/\kappa \to +\infty$: then $S \sim \aleph v_- \nu_- < 0$.

By continuity we see that $S$ vanishes for some value of the ratio $\nu/\kappa$. In order to have a qualitative feeling for this value, let us consider the example of a maximal 1-shock. Using $\beta := \nu/\kappa$ and $L := 1/\gamma M^2 > 0$, L β∗(γ)κ, particularly when $\nu \ge \kappa$, then $S < 0$ for sufficiently strong shocks, that is for sufficiently large $M$. Let us point out that the value of $M > 1$ determines $(v_-^2, v_+^2, e_-, e_+)$, up to a single multiplicative positive constant, which just factorizes in $S$.
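Two of the ingredients above lend themselves to a quick numerical check: the negative root $\zeta$ of the quadratic (29), and the sign of $\aleph$ for maximal shocks. The sketch below is only illustrative. The function names are hypothetical, and the closed form used for $\aleph/v_+$ is the one obtained from the perfect-gas Rankine–Hugoniot relations in the limit $M \to +\infty$, which is an assumption of this sketch.

```python
import math

def zeta_negative(beta, gamma, mach):
    """Negative root of
    beta*z*(z+1) + z*(1/(gamma*M^2) - 1) + (1-gamma)/(gamma*M^2) = 0,
    where beta = nu/kappa > 0, gamma > 1 and M > 1."""
    ell = 1.0 / (gamma * mach**2)
    a, b, c = beta, beta + ell - 1.0, (1.0 - gamma) * ell
    disc = b * b - 4.0 * a * c  # positive because a > 0 and c < 0
    return (-b - math.sqrt(disc)) / (2.0 * a)

def aleph_over_v_plus_maximal(gamma):
    """aleph / v_+ for a maximal 1-shock (e_- = 0), perfect gas."""
    return math.sqrt(2.0 * gamma / (gamma - 1.0)) - 2.0

for gamma in (1.4, 5.0 / 3.0, 2.0, 3.0):
    print(f"gamma = {gamma:.3f}: aleph/v+ = {aleph_over_v_plus_maximal(gamma):+.4f}")

print(zeta_negative(beta=1.0, gamma=3.0, mach=4.0))
```

The first loop shows the change of sign at $\gamma = 2$; the last line evaluates $\zeta$ for one hypothetical choice of $\nu/\kappa$, $\gamma$ and $M$.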


Existence of unstable boundary layers. We now give the conclusion. Because the 1-shock curves are connected, and because of Galilean invariance, the set of steady 1-shocks is connected too. Then we may consider the Evans function as a continuous function of, altogether, $\lambda$, $x_0$, $\nu$, $\kappa$ and the shock itself. Consequently, $W$ is a continuous function of $x_0$, $\nu$, $\kappa$ and the shock. Thanks to the Gårding inequality, we know that the stability index depends only on the sign of $W$. Since the “eigenvalues” $\mu$ are distinct and real, we know that the sign of $W(x)$ becomes constant as $x \to -\infty$, and that it is opposite to the sign of $S$. In turn, $\tilde S := S/\nu_-$ is a continuous function of $\nu_-/\kappa_-$ and the shock. On one hand, we know that for small shocks, $\tilde S$ is positive. On the other hand, we know that the corresponding boundary layers, being small whatever $x_0$ is chosen, are stable (this comes from a direct entropy estimate on the linearized system). By continuity, we therefore conclude that the stability index of a boundary layer $V|_{(x_0, +\infty)}$ is even when $W(x_0)$ is negative and odd when it is positive.

Now, let $\gamma$ be larger than 2, assume that $\nu > \beta^*(\gamma)\kappa$, for instance $\nu > \kappa$, and let the shock $(v^-, v^+)$ be strong enough. Then $S < 0$, which means that $W(x_0)$ is positive for $x_0 < 0$ with $|x_0|$ large enough. Then the corresponding boundary layer is unstable.

6. Isentropic Gas Dynamics

In one-space dimensional isentropic gas dynamics, the flow is described by $v = (\rho, v)$ only. Therefore, $n = 2$ and the conservation laws express the mass and momentum balances. We have
$$g(v) = \begin{pmatrix} \rho \\ \rho v \end{pmatrix}, \qquad f(v) = \begin{pmatrix} \rho v \\ \rho v^2 + p(\rho) \end{pmatrix}, \qquad b(v) = \begin{pmatrix} 0 & 0 \\ 0 & \nu(\rho) \end{pmatrix}.$$
Above, $\nu > 0$ (thus $r = 1$) and the pressure satisfies (hyperbolicity) $p' > 0$. The sound speed is here $c := \sqrt{p'}$. The eigenvalues of $dF$ are $v \pm c(\rho)$ and the one of $dg^{-1}h$ (a $1 \times 1$ matrix!) is $v$. Contrary to the case of full gas dynamics, the two matrices do not share a common eigenvalue.
We summarize the number of boundary conditions for the viscous ($p + r$) and the inviscid ($q$) problems in Table 2.

Table 2. Isentropic case

v+            v+ < −c+    −c+ < v+ < 0    0 < v+ < c+    v+ > c+
q             0           1               1              2
[p, p + r]    [0, 1]      [0, 1]          [1, 2]         [1, 2]

There are four distinct cases:

$v_+ > c_+$: then $q = r + p$, so that every boundary layer is trivial (that is, constant). Since $r + p = n$, the linearized boundary condition is homogeneous Dirichlet. The “layer” is linearly stable, from an obvious a priori estimate.

$0 < v_+ < c_+$: then $q = p$ and $r + p = n$, so that $W$ is a Wronskian. It cannot vanish. Thus the sign of $B(x_0; 0)B(x_0; +\infty)$ is constant, therefore positive. The stability index is even.

$-c_+ < v_+ < 0$: then $q = p + r$, so every boundary layer is constant. Since $r + p = 1$, there is only one boundary condition for the Navier–Stokes system. Let $\alpha\rho + \beta v = 0$ be the linearized boundary condition, with $\beta \ne 0$ for the viscous IBVP being locally well-posed. The subspace $E(0)$ is spanned by the constant $v = (-\rho_+, c_+)$. Then $B(0) = -\alpha\rho_+ + \beta c_+$. From an obvious energy estimate, it is stable when $\alpha = 0$. By continuity, we conclude that the stability index is even (resp. odd) when
$$\frac{\alpha}{\beta}\rho_+ < c_+ \qquad (\text{resp. } > c_+).$$

$v_+ < -c_+$: then $q = p$ and $r + p = 1$. Here, $E(0)$ is spanned by $V'$ (if $V$ is not trivial). With the same notations as above, $B(0) = \alpha\rho'(0) + \beta v'(0)$. Since $\rho v \equiv j$ in the layer, the vanishing of $B(0)$ is equivalent to $\alpha\rho^2 = \beta j$. By continuity, the stability index depends only on the sign of
$$\frac{\alpha}{\beta}\rho(0)^2 - j.$$
For $\alpha = 0$, we find that a constant layer (seen as a limit case) is stable, from an obvious energy estimate. Therefore the stability index is even (resp. odd) when the above quantity is positive (resp. negative).

7. Appendix A

In this appendix, we establish the stability of weak boundary layers in the case of degenerate viscosity, under the simplifying hypothesis $r + p = n$ followed in Sects. 4–5; as discussed in the introduction, this corresponds to the case of full Dirichlet boundary conditions. This extends prior results of [10] and [12] in the one-dimensional and multidimensional case, respectively, obtained for strictly parabolic viscosities. As in [10, 12], a key ingredient in the proof is the following elementary Poincaré estimate.

Lemma 7. Let $w \in C^1[0, +\infty)$ vanish at $x = 0$. Then, for any weighting function $\alpha(\cdot) > 0$, we have
$$\int_0^{+\infty} \alpha(y)|w(y)|^2\,dy \le \left(\int_0^{+\infty} \alpha(y)|y|\,dy\right) \int_0^{+\infty} |w'(y)|^2\,dy. \tag{30}$$

Proof. The Cauchy–Schwarz inequality, applied to $w(x) = \int_0^x w'(y)\,dy$, gives
$$|w(x)|^2 \le \left(\int_0^x 1\,dy\right) \int_0^x |w'(y)|^2\,dy = |x| \int_0^x |w'(y)|^2\,dy,$$
whence
$$\int_0^{+\infty} \alpha(y)|w(y)|^2\,dy \le \int_0^{+\infty} \alpha(y)|y| \int_0^y |w'(z)|^2\,dz\,dy = \int_0^{+\infty} |w'(z)|^2 \left(\int_z^{+\infty} \alpha(y)|y|\,dy\right) dz,$$
proving the claim.
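As a sanity check of the Poincaré estimate (30), one can evaluate both sides for a concrete pair $(w, \alpha)$. The sketch below assumes the illustrative choices $w(x) = x e^{-x}$ (which vanishes at $x = 0$) and $\alpha(y) = e^{-y}$, truncates the half-line at $x = 40$, and uses a hand-written trapezoidal rule; none of these choices comes from the paper.

```python
import numpy as np

def trapezoid(f, x):
    """Composite trapezoidal rule for samples f on the grid x."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))

x = np.linspace(0.0, 40.0, 400001)  # truncated half-line; the tails are negligible
w = x * np.exp(-x)                  # test function with w(0) = 0
dw = (1.0 - x) * np.exp(-x)         # its derivative w'
alpha = np.exp(-x)                  # positive weight

lhs = trapezoid(alpha * w**2, x)
rhs = trapezoid(alpha * np.abs(x), x) * trapezoid(dw**2, x)
print(lhs, rhs, lhs <= rhs)
```

For this pair the exact values are $2/27$ for the left-hand side and $1 \cdot \tfrac14$ for the right-hand side, so the inequality holds with room to spare.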


Proposition 2. Fixing $u_+$, assume that (H1)–(H10) hold in some neighborhood $\mathcal{U}$ of $u_+$. Further, suppose that $r + p = n$, where $r$ and $p$ are defined as in the introduction. Then, boundary layers $U$ with $U(+\infty) = u_+$ that are sufficiently “weak”, in the sense that the entire profile $U(\cdot)$ is contained in a sufficiently small ball about $u_+$, are spectrally stable.

Proof. The result follows by a combination of two energy estimates. The first is the one used to establish stability in the strictly parabolic case. Writing the linearized eigenvalue equation in original coordinates, we have
$$\lambda u + (Au)' = (Bu')', \tag{31}$$

where $B := B(U)$ is singular, and $A\alpha := dF(U)\alpha - dB(U)(\alpha, U')$ for any vector $\alpha$. From (H6) and (H9), we find by continuity that, for sufficiently weak profiles, there exists a symmetric, positive definite symmetrizer $S(x) \ge s_0 > 0$ such that $S(x)\,dF(U(x))$ is symmetric for all $x$ (recall that existence of a symmetrizer is equivalent to semisimple, real spectrum for $dF$), and, moreover (dissipativity):
$$X^* SBX \ge \beta|BX|^2. \tag{32}$$
Taking the real part of the (complex) $L^2$ inner product of $Su$ against Eq. (31), carrying out various integrations by parts, and rearranging, we obtain the basic energy estimate:
$$\mathrm{Re}\,\lambda\,\langle Su, u\rangle = \langle O(|U'|)u, u\rangle + \langle O(|U'|)u, Bu'\rangle - \langle u', SBu'\rangle - \langle Su', (dB \cdot u)U'\rangle,$$
which, through (32), implies
$$\mathrm{Re}\,\lambda\,\langle Su, u\rangle + \beta\|Bu'\|^2 \le \langle O(|U'|)u, u\rangle + \langle O(|U'|)u, Bu'\rangle - \langle Su', (dB \cdot u)U'\rangle. \tag{33}$$

Let us denote by $\pi$ the projection from $\mathbb{C}^n$ to $\mathbb{C}^r$, where we just retain the $r$ last components. Then the last term in (33) is bounded by $\mathrm{cst}\,\|\pi Su'\| \cdot \|u\|$, since the $n - r$ first rows of $dB \cdot u$, like those of $B$, vanish. We now use Lemma 8 below in order to bound this term by $\mathrm{cst}\,\|Bu'\| \cdot \|u\|$. We then derive from (33) and Young’s inequality the following estimate,
$$s_0\,\mathrm{Re}\,\lambda\,\|u\|^2 + \beta\|Bu'\|^2 \le \langle O(|U'|)u, u\rangle, \tag{34}$$
or, in $(w, z)$ coordinates:
$$s_1\,\mathrm{Re}\,\lambda\,(\|w\|^2 + \|z\|^2) + \beta_1\|z'\|^2 \le \langle O(|U'|)w, w\rangle + \langle O(|U'|)z, z\rangle, \tag{35}$$
where $s_1$ and $\beta_1$ denote modified, positive constants. Applying (30) with $\alpha = O(|U'(y)|)$, and observing that
$$\int_0^{+\infty} |U'||y|\,dy$$


can be made as small as desired by enforcing sufficiently weak layer strength, we find that we can absorb the $z$ term in the right-hand side of (35) to obtain, finally, the desired first energy estimate:
$$s_2\,\mathrm{Re}\,\lambda\,(\|w\|^2 + \|z\|^2) + \beta_2\|z'\|^2 \le \langle O(|U'|)w, w\rangle, \tag{36}$$

where s2 and β2 denote further modified, positive constants. Note that, were there no w term, this would already establish a contradiction to the assumption that λ ≥ 0, proving spectral stability. In the case of degenerate viscosity, however, we require also a second energy estimate controlling term 'O(|U |)w, w(. Restrict attention, now, to the (n − r)-dimensional reduced eigenvalue equation λγ w + (dhw) + (dkz) = 0

(37)

arising in the (w, z) coordinates, where γ and h are as defined in (H2)–(H4), and γ := γ (U ), dh := dh(U ), dk := dk(U ). By (H4), the matrix γ −1 dh is diagonalisable, with real eigenvalues, for all x. Moreover, by assumption p = n − r, we have at x = +∞ that all eigenvalues are positive; for sufficiently weak boundary layers, then, this property holds for all x ∈ [0, +∞). It follows that there is a symmetric, positive definite symmetrizer Sw (x) such that P := Sw γ −1 dh ≥ h0 > 0

(38)

is symmetric, positive definite for all x ∈ [0, +∞). That is, (37) features purely upwind propagation. This allows us to apply to the system of equations (37) a weighted energy estimate like that applied by Goodman [11] to individual characteristic fields, in the context of shock stability. Precisely, following Goodman, define the scalar weight α(x) > 0 by ODE α = −C|U |α/ h0 ,

(39)

with initial condition α(0) = 1, where C > 0 is a sufficiently large constant. Then, multiplying (37) by $\alpha\gamma^{-1}$ and taking the real part of the (complex) $L^2$ inner product against $S_w w$, we obtain after rearrangement the basic energy estimate:

$$ \langle \alpha w, w\rangle - (1/2)\langle w, (\alpha P)' w\rangle = \langle O(\alpha|U'|)w, w\rangle + \langle O(\alpha|U'|)z, z\rangle. \qquad (40) $$
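The weight defined by (39) can be written in closed form, which makes its two key properties visible. The following is only a sketch: it reads (39) as the linear ODE $\alpha' = -C|U'|\alpha/h_0$, and it assumes that $U'$ is integrable on $[0,+\infty)$ and that $P' = O(|U'|)$ because $P$ depends on $x$ only through the profile $U$. None of these assumptions is spelled out at this point of the text.

```latex
% Sketch: solving the weight equation (39) with alpha(0) = 1.
\[
  \frac{\alpha'}{\alpha} = -\frac{C}{h_0}\,|U'(x)|
  \quad\Longrightarrow\quad
  \alpha(x) = \exp\!\Big(-\frac{C}{h_0}\int_0^x |U'(y)|\,dy\Big).
\]
% Assuming U' is integrable on [0,+infty), alpha stays in a compact
% subinterval of (0,1]:
\[
  1 \;\ge\; \alpha(x) \;\ge\;
  \exp\!\Big(-\frac{C}{h_0}\,\|U'\|_{L^1(0,+\infty)}\Big) \;>\; 0 .
\]
% Product rule, using P >= h_0 from (38) and the assumption P' = O(|U'|):
\[
  (\alpha P)' \;=\; \alpha' P + \alpha P'
  \;\le\; -\,C\,|U'|\,\alpha \;+\; O(\alpha |U'|),
\]
% so the "good" term -C|U'|alpha dominates once C is taken large.
```

Under these assumptions the lower bound on α is what allows the weighted and unweighted norms to be compared, and the sign of $(\alpha P)'$ is what produces a favourable term in the energy estimate.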

But, from (39), (38) we easily obtain as in [11] that $-(1/2)\langle w, (\alpha P)' w\rangle \ge (C/2)\langle \alpha|U'|w, w\rangle$. Thus, summing (34) and (36), and taking C sufficiently large, we obtain the final estimate

$$ s_2 \lambda(\|w\|^2 + \|z\|^2) + (\beta_2/2)\|z'\|^2 + (C/4)\langle O(|U'|)w, w\rangle \le 0, \qquad (41) $$

where we have as before used (30) to absorb the term $\langle O(|U'|)z, z\rangle$ in $\beta_2\|z'\|^2$. Evidently, (41) implies spectral stability, since λ ≥ 0 gives an immediate contradiction. (Note: above, we have used freely the fact that α is bounded away from zero.) It remains to prove the following

Lemma 8. Under assumption (32), there exists a finite number c(u) such that |πSX| ≤ c|BX|.


Proof. Let Y be a vector such that BY = 0. Then, applying (32) to X + sY, we see that the affine function $s \mapsto (S(X + sY), BX) - \beta|BX|^2$ is non-negative, therefore constant. Thus, BY = 0 implies (SY, BX) = 0 for every X. In other words, πY = 0 implies (πSY, πBX) = 0 for every X. However, πB is onto, therefore πY = 0 implies πSY = 0. This means that there exists a matrix $\tilde S$ such that $\pi S = \tilde S \pi B$. A simple computation gives $\tilde S = S_0 B_0^{-1}$, where $S_0$ and $B_0$ are the lower right blocks of S and B. (Let us recall that, zero being a semi-simple eigenvalue of B, the block $B_0$ is invertible.) Then the result holds, with $c := \|\tilde S\|$.

We remark that essentially the same argument, rephrased as an energy estimate on the time-evolutionary equations, establishes linearized and not only spectral stability. In the case r = n, Gisclon and Serre [10] were able to establish a full nonlinear stability result. It would be interesting to determine whether or not stability of weak boundary layers holds also in the case that p < n − r, when there are fewer than n Dirichlet conditions. In this case, it seems conceivable that the nature of the boundary conditions may strongly affect stability, even in the weak layer limit. For related work, see [18].

References

1. Alexander, J.C., Gardner, R. and Jones, C.K.R.T.: A topological invariant arising in the stability analysis of travelling waves. J. Reine Angew. Math. 410, 167–212 (1990)
2. Benzoni-Gavage, S., Serre, D. and Zumbrun, K.: Alternate Evans functions and viscous shock waves. SIAM J. Math. Anal. 32, 929–962 (2001)
3. Bultelle, M., Grassin, M. and Serre, D.: Unstable Godunov discrete profiles for steady shock waves. SIAM J. Numer. Anal. 35, 2272–2297 (1998)
4. Freistühler, H. and Serre, D.: The L1-stability of boundary layers for scalar viscous conservation laws. J. Dynamics & Diff. Eqns., to appear
5. Gardner, R.A. and Jones, C.K.R.T.: Traveling waves of a perturbed diffusion equation arising in a phase field model. Indiana Univ. Math. J. 39, 4, 1197–1222 (1990)
6. Gardner, R.A. and Jones, C.K.R.T.: A stability index for steady state solutions of boundary value problems for parabolic systems. J. Differential Eqs. 91, 2, 181–203 (1991)
7. Gardner, R.A. and Zumbrun, K.: The gap lemma and geometric criteria for instability of viscous shock profiles. Comm. Pure Appl. Math. 51, 7, 797–855 (1998)
8. Gilbarg, D.: The existence and limit behavior of the one-dimensional shock layer. Am. J. Math. 73, 256–274 (1951)
9. Gisclon, M.: Etude des conditions aux limites pour un système hyperbolique, via l’approximation parabolique. J. Maths. Pures & Appl. 75, 485–508 (1996)
10. Gisclon, M. and Serre, D.: Etude des conditions aux limites pour un système hyperbolique, via l’approximation parabolique. C. R. Acad. Sci. Paris, Série I 319, 377–382 (1994)
11. Goodman, J.: Remarks on the stability of viscous shock waves. In: Viscous profiles and numerical methods for shock waves (Raleigh, NC, 1990), Philadelphia: SIAM, 1991, pp. 66–72
12. Grenier, E. and Guès, O.: Boundary layers for viscous perturbations of noncharacteristic quasilinear hyperbolic problems. J. Diff. Eqs. 143, 110–146 (1998)
13. Grenier, E. and Rousset, F.: Stability of one dimensional boundary layers using Green’s functions. Comm. Pure Applied Math., to appear
14. Henry, D.: Geometric theory of semilinear parabolic equations. Lecture Notes in Maths. 840, Berlin: Springer-Verlag, 1981
15. Kawashima, S.: Systems of a hyperbolic–parabolic composite type, with applications to the equations of magnetohydrodynamics. PhD thesis, Kyoto University (1983)
16. Kawashima, S. and Shizuta, Y.: On the normal form of the symmetric hyperbolic-parabolic systems associated with the conservation laws. Tohoku Math. J. 40, 449–464 (1988)
17. Li, Ta-tsien and Yu, Wen-ci: Boundary value problems for quasilinear hyperbolic systems. Durham: Duke Univ., 1985
18. Matsumura, A. and Mei, M.: Convergence to travelling fronts of solutions of the p-system with viscosity in the presence of a boundary. Arch. Ration. Mech. Anal. 146, 1–22 (1999)
19. Rousset, F.: Inviscid boundary conditions and stability of viscous boundary layers. Asymptotic Analysis, 2001, to appear
20. Serre, D.: Sur la stabilité des couches limites de viscosité. Ann. Inst. Fourier 51, 109–129 (2001)
21. Zumbrun, K. and Howard, P.: Pointwise semigroup methods and stability of viscous shock waves. Indiana Univ. Math. J. 47, 3, 741–871 (1998)

Communicated by P. Constantin

Commun. Math. Phys. 221, 293 – 304 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Symplectic Structures of Moduli Space of Higgs Bundles over a Curve and Hilbert Scheme of Points on the Canonical Bundle

Indranil Biswas1, Avijit Mukherjee2

1 School of Mathematics, Tata Institute of Fundamental Research, Homi Bhabha Road, Bombay 400005, India. E-mail: [email protected]

2 Scuola Internazionale Superiore di Studi Avanzati, via Beirut 4, 34014 Trieste, Italy. E-mail: avijit@sissa.it

Received: 15 January 2000 / Accepted: 25 March 2001

Abstract: The moduli space of triples of the form (E, θ, s) is considered, where (E, θ) is a Higgs bundle on a fixed Riemann surface X, and s is a nonzero holomorphic section of E. Such a moduli space admits a natural map to the moduli space of Higgs bundles simply by forgetting s. If (Y, L) is the spectral data for the Higgs bundle (E, θ), then s defines a section of the line bundle L over Y. The divisor of this section gives a point of a Hilbert scheme, parametrizing 0-dimensional subschemes of the total space of the canonical bundle $K_X$, since Y is a curve on $K_X$. The main result says that the pullback of the symplectic form on the moduli space of Higgs bundles to the moduli space of triples coincides with the pullback of the natural symplectic form on the Hilbert scheme using the map that sends any triple (E, θ, s) to the divisor of the corresponding section of the line bundle on the spectral curve.

1. Introduction

A Higgs bundle over a compact connected hyperbolic Riemann surface X is a pair of the form (E, θ), where E is a holomorphic vector bundle over X and θ is a holomorphic section of $K_X \otimes \mathrm{End}(E)$. Higgs bundles were introduced in [Hi1]. Let $M_H$ denote a moduli space of stable Higgs bundles. We assume that for a Higgs bundle in $M_H$, the degree of the underlying vector bundle E satisfies the inequality degree(E) > rank(E)(g − 1). So the Riemann–Roch theorem for E ensures that E admits a nonzero section. It is known that $M_H$ is a connected complex manifold of dimension $2\,\mathrm{rank}(E)^2(g-1) + 2$ [Hi2].

Here we consider triples of the form (E, θ, s), where (E, θ) is a Higgs bundle in $M_H$ and $s \in H^0(X, E)$ is a nonzero section. We recall that triples of this kind are considered in [BG, Li].


Let $M_T$ denote the moduli space of isomorphism classes of triples. It can be shown that $M_T$ exists as an analytic space. Since for every (E, θ, s) ∈ $M_T$ we have (E, θ) ∈ $M_H$, there is a surjective morphism $F : M_T \to M_H$. The surjectivity of F is a consequence of the earlier observation that $\dim H^0(X, E) > 0$.

A holomorphic symplectic form on the moduli space $M_H$ was constructed in [Hi1]. We will denote this symplectic form on $M_H$ by Ω. Therefore, $F^*\Omega$ is a holomorphic closed two-form on $M_T$.

If E is of rank n, then it is possible to evaluate on θ a $GL(n, \mathbb{C})$-invariant polynomial on the Lie algebra $M(n, \mathbb{C})$, namely polynomials of the form $A \mapsto \mathrm{trace}(A^i)$. This yields an element of the vector space $H^0(X, K_X^{\otimes i})$. The resulting map

$$ P : M_H \to V := \bigoplus_{i=1}^{n} H^0(X, K_X^{\otimes i}), $$

which is known as the Hitchin map, is proper [Hi1]. Any element of V defines a spectral curve, which is a curve in the total space of $K_X$ defined by a polynomial constructed from the given element of V.

Given any point p ∈ $M_H$, a holomorphic line bundle L can be constructed on the spectral curve $Y_p$ corresponding to the point P(p). If p = (E, θ), then the fibers of L are basically the eigenvectors of θ. Let π denote the projection of $Y_p$ to X obtained from the obvious projection of $K_X$ to X. It turns out that $\pi_* L = E$ [Hi2]. Therefore, a section s of E defines a section $\tilde s$ of L over $Y_p$. Since $Y_p$ is embedded in $K_X$, the divisor of $\tilde s$ defines a 0-dimensional subscheme of $K_X$. In other words, we have a map $\Phi : M_T \to \mathrm{Hilb}^l(K_X)$ to a Hilbert scheme of points on $K_X$; the integer l is the degree of the line bundle L.

The natural symplectic structure on $K_X$ induces a holomorphic symplectic structure on $\mathrm{Hilb}^l(K_X)$ [Be]. This symplectic form on $\mathrm{Hilb}^l(K_X)$ will be denoted by $\Omega_C$. In Theorem 3.1 we prove that the 2-form $F^*\Omega$ on $M_T$ coincides with $\Phi^*\Omega_C$.

Since both Ω and $\Omega_C$ are exact, Theorem 3.1 reduces to an equality of two given 1-forms. We show that the difference of these two 1-forms in question descends to V. The properness of the Hitchin map is very useful for this step. Then we show that a twisted version of the descended 1-form on V further descends as a meromorphic 1-form on a projective space with appropriate poles. Finally, the proof of Theorem 3.1 is completed using the result that no nonzero meromorphic form of the given type exists.

2. Higgs Bundles and Triples

Let X be a compact connected Riemann surface of genus g, with g ≥ 2. The holomorphic cotangent bundle of X will be denoted by $K_X$. A Higgs bundle over X is a pair of the form (E, θ), where E is a holomorphic vector bundle over X and $\theta \in H^0(X, K_X \otimes \mathrm{End}(E))$ is known as a Higgs field.
A Higgs bundle (E, θ) is called stable if for every proper subbundle F ⊂ E of positive rank and with $\theta(F) \subseteq K_X \otimes F$, the inequality

$$ \mathrm{degree}(F)/\mathrm{rank}(F) < \mathrm{degree}(E)/\mathrm{rank}(E) $$

is valid [Hi1]. Stability is an open condition. In other words, given any algebraic (respectively, analytic) family of Higgs bundles the points of the parameter space over which the Higgs bundle is stable form a Zariski (respectively, analytic) open subset.

Given a Higgs field $\theta \in H^0(X, K_X \otimes \mathrm{End}(E))$ on a vector bundle E of rank n and an integer i ∈ [1, n], consider

$$ p_i(\theta) := \mathrm{trace}(\theta^i) \in H^0(X, K_X^{\otimes i}), $$

which is defined using the associative algebra structure of the fibers of End(E). The map which sends a Higgs bundle (E, θ) to

$$ \sum_{i=1}^{n} p_i(\theta) = \sum_{i=1}^{n} \mathrm{trace}(\theta^i) \in \bigoplus_{i=1}^{n} H^0(X, K_X^{\otimes i}) $$

is known as the Hitchin map. By $p_0(\theta)$ we will mean the section of $K_X^{\otimes 0} = O_X$ given by the constant function 1. The total space of the line bundle $K_X$ will also be denoted by $K_X$. Let π denote the projection of $K_X$ to X. For a Higgs bundle (E, θ), the subscheme of $K_X$ defined by the solution of the polynomial

$$ P_\theta(t) = \sum_{i=0}^{n} t^{n-i} p_i(\theta) = \sum_{i=0}^{n} t^{n-i}\,\mathrm{trace}(\theta^i) = 0 $$

is called the spectral curve associated to (E, θ) [Hi2, BNR]. We will denote this spectral curve by $Y_\theta$. The natural projection from the spectral curve $Y_\theta$ to X obtained by restricting π, which we will also denote by π, is a finite morphism. Furthermore, there is a torsionfree sheaf L of rank one on $Y_\theta$ such that

$$ \pi_* L \cong E. \qquad (2.1) $$

The fiber $L_y$ of L over a point $y \in Y_\theta$ can be considered as the eigenvector of θ(π(y)) for the eigenvalue y [Hi2]. The pair $(Y_\theta, L)$ is called the spectral data for the Higgs bundle (E, θ). Since $Y_\theta \subset K_X$, there is a natural homomorphism

$$ f_\theta : L \to \pi^* K_X \otimes L, \qquad (2.2) $$

which sends any vector $v \in L_y$, where $y \in Y_\theta$, to the tensor product $y \otimes v \in (K_X)|_{\pi(y)} \otimes L_y$. Its direct image

$$ \pi_* f_\theta : \pi_* L \to \pi_*(\pi^* K_X \otimes L) \cong K_X \otimes \pi_* L $$

coincides with the Higgs field θ [Hi2]; the last isomorphism is obtained from the projection formula. The equivalence between Higgs bundles and spectral data was used in [Si] to give a construction of moduli space of semistable Higgs bundles.

Given a Higgs bundle (E, θ), let $C_\cdot$ denote the following two term complex of sheaves on X:

$$ C_\cdot : \; C_0 = \mathrm{End}(E) \xrightarrow{\;[-,\theta]\;} C_1 = K_X \otimes \mathrm{End}(E), $$

where End(E) is at the 0th position, and if θ = dz ⊗ A in a local coordinate function z and s is a local section of End(E), then [s, θ] = dz ⊗ (sA − As).


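The space of infinitesimal deformations of the Higgs bundle (E, θ) is parametrized by the hypercohomology $\mathbb{H}^1(C_\cdot)$ [BR]. There is a canonical element in the dual vector space $\mathbb{H}^1(C_\cdot)^*$. To construct this element, first consider the diagram

$$ \begin{array}{ccc} \mathrm{End}(E) & \xrightarrow{\;[-,\theta]\;} & K_X \otimes \mathrm{End}(E) \\ \downarrow & & \downarrow \\ \mathrm{End}(E) & \longrightarrow & 0 \end{array} \qquad (2.3) $$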

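This diagram gives a homomorphism $\delta : \mathbb{H}^1(C_\cdot) \to H^1(X, \mathrm{End}(E))$. Now, given any $\alpha \in \mathbb{H}^1(C_\cdot)$, the pairing

$$ \mathrm{trace}(\delta(\alpha) \cup \theta) \in H^1(X, K_X) = \mathbb{C} $$

defines the canonical element in $\mathbb{H}^1(C_\cdot)^*$ [Hi1]; we will denote this canonical element in $\mathbb{H}^1(C_\cdot)^*$ by $\Theta_\theta$.

Let $K_X[1]$ denote the complex of sheaves with only one nonzero term $K_X$ at the first position. We have the following homomorphism from the complex $C_\cdot \otimes C_\cdot$ to $K_X[1]$:

$$ \begin{array}{ccc}
0 & & 0 \\
\downarrow & & \downarrow \\
\mathrm{End}(E) \otimes \mathrm{End}(E) & \longrightarrow & 0 \\
\downarrow {\scriptstyle [-,\theta] \oplus [-,\theta]} & & \downarrow \\
\mathrm{End}(E) \otimes (K_X \otimes \mathrm{End}(E)) \,\oplus\, (K_X \otimes \mathrm{End}(E)) \otimes \mathrm{End}(E) & \xrightarrow{\;\mathrm{trace}\;} & K_X \\
\downarrow {\scriptstyle [-,\theta] + [-,\theta]} & & \downarrow \\
(K_X \otimes \mathrm{End}(E)) \otimes (K_X \otimes \mathrm{End}(E)) & \longrightarrow & 0 \\
\downarrow & & \downarrow \\
0 & & 0
\end{array} $$

where the middle homomorphism, namely trace, is defined using the trace map $\mathrm{End}(E) \otimes \mathrm{End}(E) \to O_X$ of endomorphisms. This homomorphism of complexes gives the following homomorphism of hypercohomologies:

$$ \mathbb{H}^1(C_\cdot) \otimes \mathbb{H}^1(C_\cdot) \to \mathbb{H}^2(C_\cdot \otimes C_\cdot) \to \mathbb{H}^2(K_X[1]) = H^1(X, K_X) = \mathbb{C}. \qquad (2.4) $$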

This bilinear form on $\mathbb{H}^1(C_\cdot)$ is evidently anti-symmetric and nondegenerate. In other words, it defines a symplectic form on $\mathbb{H}^1(C_\cdot)$. This symplectic form on $\mathbb{H}^1(C_\cdot)$ will be denoted by $\Omega_\theta$.

Given a holomorphic family of Higgs bundles parametrized by a complex manifold T, the pointwise construction of $\Theta_\theta$ gives a holomorphic one-form on T, which we will denote by $\Theta_T$. The exterior derivative $d\Theta_T$ coincides with the one obtained from pointwise construction of $\Omega_\theta$ [BR]. In fact, the pairing $\Omega_\theta$ defines a symplectic form on the smooth locus of any moduli space of Higgs bundles over X [Hi2].


Let $M_H$ denote the moduli space of stable Higgs bundles over X of rank n and degree d. See [Si] for the construction of $M_H$. We assume that d > n(g − 1). So from Riemann–Roch,

$$ \dim H^0(X, E) = d - n(g-1) + \dim H^1(X, E) > 0. $$

In other words, E admits nonzero sections.

Definition 2.1. A triple over X is a datum of the form (E, θ, s), where (E, θ) is a stable Higgs bundle of rank n and degree d, and $s \in H^0(X, E) - 0$ is a nonzero holomorphic section.

Let $M_T$ denote the moduli space of triples that parametrizes isomorphism classes of triples. Two triples (E, θ, s) and (E′, θ′, s′) are called isomorphic if there is a holomorphic isomorphism E → E′ that takes θ to θ′ and s to s′. We will show that $M_T$ exists as an analytic space.

Let $(E_T, \theta_T)$ be a family of stable Higgs bundles of rank n and degree d over X parametrized by a complex space T. Let $\psi_T$ denote the projection of X × T to T. If $(E'_T, \theta'_T)$ is another such family such that for every point t ∈ T, the Higgs bundle $(E'_t, \theta'_t)$ is isomorphic to $(E_t, \theta_t)$, then it can be shown that there is a holomorphic line bundle ξ over T such that $E'_T = E_T \otimes \psi_T^*\xi$ and $\theta'_T = \theta_T \otimes \mathrm{Id}_{\psi_T^*\xi}$. Indeed, this is an immediate consequence of the fact that given a stable Higgs bundle (E, θ), the only automorphisms of E that take θ to itself are the scalar multiplications.

Now note that any automorphism of E defined by a scalar multiplication acts trivially on the projective space $PH^0(X, E)$ of lines in $H^0(X, E)$. From this it follows that $M_T$ exists as an analytic space, and the fiber of the forgetful map, that assigns (E, θ) to (E, θ, s), is $PH^0(X, E)$ over the point of $M_H$ represented by (E, θ). In fact, $M_T$ can be constructed locally over $M_H$, that is over sufficiently small analytic open subsets of $M_H$. The earlier remarks ensure that these local constructions patch together to define $M_T$.

It was remarked in (2.1) that $\pi_* L$ is isomorphic to E. Since $\pi^{-1}(\pi_* L)$ is a subsheaf of L, there is a canonical injective homomorphism

$$ \pi_* : H^0(Y_\theta, L) \to H^0(X, \pi_* L) = H^0(X, E). $$

The finiteness of the map π implies that this homomorphism $\pi_*$ is actually an isomorphism. Now, given a section $s \in H^0(X, E)$, let

$$ \tilde s := \pi_*^{-1}(s) \in H^0(Y_\theta, L) \qquad (2.5) $$

be the section of L that corresponds to it by the isomorphism $\pi_*$ defined above. For a nonzero section $s \in H^0(X, E) - 0$, consider the divisor $\mathrm{div}(\tilde s)$ on $Y_\theta$. Using the inclusion map of $Y_\theta$ in the surface $K_X$, the zero-dimensional subscheme $\mathrm{div}(\tilde s)$ of $Y_\theta$ defines a zero-dimensional subscheme of $K_X$. The genus of the spectral curve $Y_\theta$ is $n^2(g-1) + 1$ [Hi2]. Therefore, from the Riemann–Roch theorem it follows that

$$ \mathrm{degree}(\mathrm{div}(\tilde s)) = d + n(n-1)(g-1), \qquad (2.6) $$

where d = degree(E). In the next section we will construct a morphism from a moduli space of triples to a Hilbert scheme parametrizing 0-dimensional subschemes of the total space KX of the canonical bundle.
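The degree formula (2.6) can be checked by comparing Euler characteristics on the spectral curve and on X. The computation below is a sketch under the assumption that the spectral curve is smooth, so that L is a line bundle and Riemann–Roch applies in its usual form; the symbol $g_Y$ for the genus of the spectral curve is introduced here only for the illustration.

```latex
% Sketch: deriving (2.6), assuming the spectral curve Y is smooth.
% Since pi is finite, direct image preserves Euler characteristics:
\[
  \chi(Y, L) \;=\; \chi(X, \pi_* L) \;=\; \chi(X, E).
\]
% Riemann--Roch on Y (line bundle L) and on X (rank n, degree d):
\[
  \deg L + 1 - g_Y \;=\; d + n(1 - g), \qquad g_Y = n^2(g-1) + 1 .
\]
% Solving for deg L:
\[
  \deg L \;=\; d - n(g-1) + n^2(g-1) \;=\; d + n(n-1)(g-1).
\]
% The divisor of a nonzero section of L has degree deg L, which is (2.6).
```

For example, for n = 1 the spectral curve is a copy of X and the formula reduces to deg L = d, as expected.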


3. Morphism from Triples to Hilbert Scheme

For any integer j ≥ 1, let $\mathrm{Hilb}^j(K_X)$ denote the Hilbert scheme, which is the moduli space parametrizing 0-dimensional subschemes of length j of the quasi-projective surface $K_X$. Since the spectral curve $Y_\theta$ is embedded in $K_X$, the divisor $\mathrm{div}(\tilde s)$ (defined in (2.5)) on $Y_\theta$ defines a point of $\mathrm{Hilb}^l(K_X)$, where l = d + n(n − 1)(g − 1) is the degree of $\tilde s$ as obtained in (2.6).

As before, let $M_T$ denote the moduli space of triples of the form (E, θ, s) considered in Definition 2.1. Recall that d = degree(E) > rank(E)(g − 1) = n(g − 1). Associating to any triple (E, θ, s) the element of the Hilbert scheme $\mathrm{Hilb}^l(K_X)$ defined by the divisor $\mathrm{div}(\tilde s)$, we obtain a morphism

$$ \Phi : M_T \to \mathrm{Hilb}^l(K_X). \qquad (3.1) $$

As before, let $M_H$ denote the moduli space of stable Higgs bundles over X of degree d and rank n. Let

$$ F : M_T \to M_H \qquad (3.2) $$

denote the forgetful map which sends any triple (E, θ, s) to the Higgs bundle (E, θ). Recall that we have assumed that d > n(g − 1). Therefore, the map F in (3.2) is surjective. We note that there are different notions of stability of a triple. If notions of moduli spaces of triples different from the one given here are used, then F is usually not everywhere defined. Therefore, $M_T$ is a projective bundle over $M_H$. For any point p = (E, θ) on $M_H$, the inverse image $F^{-1}(p)$ is $P(H^0(X, L))$.

Let Ω denote the symplectic form on $M_H$ whose pointwise construction has been described in (2.4). This symplectic form was introduced in [Hi1].

The surface $K_X$ has a natural symplectic structure. This symplectic form induces a symplectic structure on any Hilbert scheme $\mathrm{Hilb}^j(K_X)$ of 0-dimensional subschemes of $K_X$ of length j [Be, pp. 766–767]. Let $\Omega_C$ denote the canonical symplectic form on $\mathrm{Hilb}^l(K_X)$.

Therefore, on the moduli space $M_T$ we have two holomorphic 2-forms, namely $F^*\Omega$ and $\Phi^*\Omega_C$. The following theorem says that $F^*\Omega = \Phi^*\Omega_C$.

Theorem 3.1. The two holomorphic 2-forms $F^*\Omega$ and $\Phi^*\Omega_C$ on $M_T$ coincide.

The proof of this theorem will be carried out in the next section. In the rest of this section we will reduce the equality $F^*\Omega = \Phi^*\Omega_C$ to an equality of 1-forms.

In Sect. 2 a one-form $\Theta_\theta$ was constructed on the space of infinitesimal deformations of a Higgs bundle (E, θ). Let Θ denote the holomorphic 1-form on $M_H$ such that for any point (E, θ) ∈ $M_H$ the form Θ at the point (E, θ) coincides with the 1-form $\Theta_\theta$. We already noted in Sect. 2 that

$$ d\Theta = \Omega. \qquad (3.3) $$

Take a point $p \in \mathrm{Hilb}^l(K_X)$ representing the collection of points $\{p_1, p_2, \cdots, p_l\}$, where $p_i \in K_X$ and the $p_i$ are distinct. In other words, $p_i \neq p_j$ if $i \neq j$. We have

$$ T_p \mathrm{Hilb}^l(K_X) = \bigoplus_{i=1}^{l} T_{p_i} K_X. \qquad (3.4) $$


This decomposition is immediate from the fact that a neighborhood of p in $\mathrm{Hilb}^l(K_X)$ is identified with a neighborhood in the l-th symmetric product of $K_X$ of the point represented by the collection $\{p_i\}$.

Let ω denote the canonical 1-form on $K_X$. The exterior derivative dω is the canonical symplectic form on $K_X$. Since $\omega(p_i)$ is a 1-form on $T_{p_i} K_X$, using the decomposition (3.4) we have an element $\Theta_C(p) \in T_p^* \mathrm{Hilb}^l(K_X)$. It is easy to check that this defines a 1-form on $\mathrm{Hilb}^l(K_X)$. Let $\Theta_C$ denote the 1-form on $\mathrm{Hilb}^l(K_X)$ whose evaluation at any point p coincides with $\Theta_C(p)$ constructed above.

Proposition 3.2. The exterior derivative $d\Theta_C$ coincides with the symplectic form $\Omega_C$ on $\mathrm{Hilb}^l(K_X)$.

Proof. Let $v := \{v_1, v_2, \cdots, v_l\}$ and $w = \{w_1, w_2, \cdots, w_l\}$ be two tangent vectors in $T_p \mathrm{Hilb}^l(K_X)$, where $v_i, w_i \in T_{p_i} K_X$; the decomposition of $T_p \mathrm{Hilb}^l(K_X)$ used here is the one obtained in (3.4). The evaluation of the symplectic form $\Omega_C$ on the pair {v, w} is described by the following identity ([Be]; the construction in Prop. 5 (pp. 766)):

$$ \Omega_C(p)(v, w) = \sum_{i=1}^{l} d\omega(p_i)(v_i, w_i), \qquad (3.5) $$

where dω, as before, is the canonical symplectic form on $K_X$. The equality (3.5) immediately implies that the decomposition (3.4) of $T_p \mathrm{Hilb}^l(K_X)$ is orthogonal with respect to the symplectic form $\Omega_C(p)$. The decomposition (3.4) is obviously orthogonal with respect to the skew-symmetric form $d\Theta_C(p)$. Consequently, it suffices to check that the restriction of $d\Theta_C(p)$ to the subspace $T_{p_i} K_X \subset T_p \mathrm{Hilb}^l(K_X)$ coincides with the restriction of $\Omega_C(p)$. But clearly both these restrictions coincide with the symplectic form $d\omega(p_i)$ on $T_{p_i} K_X$. This completes the proof of the proposition.

In view of Proposition 3.2 and the equality (3.3), Theorem 3.1 is an immediate consequence of the following lemma.

Lemma 3.3. The two holomorphic 1-forms $F^*\Theta$ and $\Phi^*\Theta_C$ on $M_T$ coincide.

The following section will be devoted to the proof of Lemma 3.3.

4. Proof of the Lemma

Let Y be a connected Riemann surface, and let π : Y → X be a covering map, possibly ramified, of degree n. Fix a holomorphic section $\beta \in H^0(Y, \pi^* K_X)$. Using the natural homomorphism $(d\pi)^* : \pi^* K_X \to K_Y$, the section β gives a section of $K_Y$. This section of $K_Y$ will also be denoted by β.


For any holomorphic line bundle L on Y, the direct image $\pi_* L$ is a holomorphic vector bundle of rank n over X. Considering the infinitesimal deformations, we have a homomorphism

$$ \bar\pi : H^1(Y, O_Y) \to H^1(X, \mathrm{End}(\pi_* L)), \qquad (4.1) $$

as the space of infinitesimal deformations of $\pi_* L$ (respectively, L) are parametrized by $H^1(X, \mathrm{End}(\pi_* L))$ (respectively, $H^1(Y, O_Y)$). Since $\beta \in H^0(Y, K_Y)$, we have

$$ f_L : H^1(Y, O_Y) \xrightarrow{\;\cup\beta\;} H^1(Y, K_Y) = \mathbb{C}. \qquad (4.2) $$

On the other hand, since $\beta \in H^0(Y, \pi^* K_X)$, taking the direct image of the multiplication map

$$ L \xrightarrow{\;\beta\otimes\;} \pi^* K_X \otimes L $$

we have a homomorphism

$$ \varphi_L : \pi_* L \to \pi_*(\pi^* K_X \otimes L) \cong K_X \otimes \pi_* L $$

of vector bundles, where the last isomorphism is obtained from the projection formula. So, $\varphi_L$ defines a holomorphic section $\beta_L \in H^0(X, K_X \otimes \mathrm{End}(\pi_* L))$. Let $\bar f_L$ denote the composition

$$ H^1(X, \mathrm{End}(\pi_* L)) \xrightarrow{\;\beta_L \cup\;} H^1(X, K_X \otimes \mathrm{End}(\pi_* L) \otimes \mathrm{End}(\pi_* L)) \xrightarrow{\;\mathrm{trace}\;} H^1(X, K_X) = \mathbb{C}. $$

Here trace, as before, is the homomorphism $\mathrm{End}(\pi_* L) \otimes \mathrm{End}(\pi_* L) \to O_X$ constructed using the Killing form on $GL(n, \mathbb{C})$ defined by $A \otimes B \mapsto \mathrm{trace}(AB)$.

Proposition 4.1. The following diagram commutes

$$ \begin{array}{ccc} H^1(Y, O_Y) & \xrightarrow{\;f_L\;} & \mathbb{C} \\ \downarrow {\scriptstyle \bar\pi} & & \| \\ H^1(X, \mathrm{End}(\pi_* L)) & \xrightarrow{\;\bar f_L\;} & \mathbb{C} \end{array} $$

where $\bar\pi$ and $f_L$ are defined in (4.1) and (4.2) respectively.

Proof. The proposition follows immediately by unraveling the definitions of the above homomorphisms. Take any cohomology class $v \in H^1(Y, O_Y)$. Let α be a (0, 1)-form on Y which is a Dolbeault representative of v. Let U ⊂ X be the complement of the finite subset of X consisting of points over which π is ramified. Consider the $\pi_* O_{\pi^{-1}(U)}$-valued (0, 1)-form on U defined by α. It is the restriction of a Dolbeault representative of $\bar\pi(\alpha)$ to U. Since the canonical isomorphism $H^1(X, K_X) \cong \mathbb{C}$ is defined using the integration of (1, 1)-forms on X, and similarly for Y, the proposition follows easily.


In Sect. 2 we briefly described the identification of Higgs bundles and the spectral data constructed in [Hi2]. Set Y to be a spectral curve. So, in particular, it is a curve embedded in $K_X$. Set π to be the natural projection of the spectral curve to X. Let $J_Y$ denote the subvariety of $M_H$ consisting of all Higgs bundles such that the corresponding spectral curve coincides with Y. We know that $J_Y$ is identified with the component $\mathrm{Pic}^l(Y)$ of the Picard group of Y, where l, as before, is d + n(n − 1)(g − 1) [Hi2]. The inverse image $F^{-1}(J_Y) \subset M_T$ will be denoted by $A_Y$. So $A_Y$ is a projective bundle over $\mathrm{Pic}^l(Y)$. More precisely, it is the projectivized Picard bundle over $\mathrm{Pic}^l(Y)$. Let $\nu : A_Y \hookrightarrow M_T$ be the inclusion map.

Proposition 4.2. The two 1-forms $(F \circ \nu)^*\Theta$ and $(\Phi \circ \nu)^*\Theta_C$ on $A_Y$ coincide.

Proof. This is actually a consequence of Proposition 4.1. Let f denote the inclusion of Y in $K_X$. As before, the canonical 1-form on $K_X$ is denoted by ω. Set β in Proposition 4.1 to be the 1-form $f^*\omega$ on Y. Note that $f^*\omega$ is indeed a section of $\pi^* K_X$. For any point (L, s) ∈ $A_Y$, the homomorphism $f_L$ constructed in (4.2) gives the one-form $(\Phi \circ \nu)^*\Theta_C$. On the other hand, $(F \circ \nu)^*\Theta$ is constructed from the homomorphism $\bar f_L$. Now, Proposition 4.1 finishes the proof.

We will note a simple observation. Let $g : Z_1 \to Z_2$ be a proper smooth holomorphic map between connected complex manifolds. For any $u \in Z_2$, the inverse image $g^{-1}(u)$ is assumed to be connected. Let τ be a holomorphic 1-form on $Z_1$ such that the contraction of τ with any vertical tangent vector $v \in T_u Z_1$ vanishes. By a vertical vector we mean a vector in the kernel of the differential $dg : TZ_1 \to g^* TZ_2$ of the map g. Then there is a 1-form on $Z_2$, say $\bar\tau$, such that $g^*\bar\tau = \tau$. To see this first note that given any tangent vector $w \in T_z Z_2$, using the above condition, namely the contraction of τ with any vertical vector vanishes, a holomorphic function on $g^{-1}(z)$ is obtained. Now the existence of such a form $\bar\tau$ is an immediate consequence of the fact that there is no nonconstant holomorphic function on a compact connected complex manifold.

As we know from [Hi2], the space of spectral curves is parametrized by the vector space

$$ V := \bigoplus_{i=1}^{n} H^0(X, K_X^{\otimes i}). $$

Let $P : M_H \to V$ denote the Hitchin map defined in Sect. 2 which sends any Higgs bundle to the spectral curve associated to it. The morphism P is proper [Hi2]. Now, from Proposition 4.2 coupled with the above observation it follows that there is a holomorphic 1-form γ on V such that

$$ F^*\Theta - \Phi^*\Theta_C = (P \circ F)^*\gamma. \qquad (4.3) $$

Indeed, taking $Z_1$ in the above observation to be the open subset of $M_T$ over which P ∘ F is smooth and setting τ to be $F^*\Theta - \Phi^*\Theta_C$, the existence of a form γ satisfying (4.3) is ensured by the above observation. Lemma 3.3 will be proved by showing that the form γ in (4.3) vanishes identically.


Consider the action of the group $\mathbb{C}^* = \mathbb{C} \setminus \{0\}$ on V defined by

$$ t \cdot \{v_1, v_2, \cdots, v_n\} = \{t v_1, t^2 v_2, \cdots, t^i v_i, \cdots, t^n v_n\}, \qquad (4.4) $$

where $t \in \mathbb{C}^*$ and $v_i \in H^0(X, K_X^{\otimes i})$. This action will be denoted by ρ. The quotient of V\{0} by this action ρ is a weighted projective space. This weighted projective space will be denoted by $\tilde P$. Let

$$ \tilde q : V \setminus \{0\} \to \tilde P $$

denote the quotient map.

Proposition 4.3. The evaluation of the 1-form γ, obtained in (4.3), on any vertical vector for the projection $\tilde q$ vanishes.

Proof. Take a point v ∈ V\{0}. For any $z \in \mathbb{C}^*$, let $f_z$ denote the automorphism of $K_X$ that sends any vector $\alpha \in K_X$ to zα. It is easy to see that the automorphism $f_z$ of $K_X$ takes the spectral curve $Y_v$ defined by v to the spectral curve $Y_{\rho(z)v}$ defined by ρ(z)v ∈ V, where ρ, as before, is the action defined in (4.4). The isomorphism of $Y_{\rho(z)v}$ with $Y_v$ obtained from $f_z$ will be denoted by $\bar f_z$.

Fix a line bundle L over the spectral curve $Y_v$ and also fix a holomorphic section s of L. Now $\bar f_z^* L$ is a holomorphic line bundle over the spectral curve $Y_{\rho(z)v}$ and $\bar f_z^* s$ is a holomorphic section of $\bar f_z^* L$. Since the following diagram commutes

$$ \begin{array}{ccc} Y_{\rho(z)v} & \xrightarrow{\;\bar f_z\;} & Y_v \\ \downarrow {\scriptstyle \pi} & & \downarrow {\scriptstyle \pi} \\ X & = & X \end{array} $$

the two vector bundles $\pi_* L$ and $\pi_* \bar f_z^* L$ over X are isomorphic. Let $(\pi_* \bar f_z^* L, \theta_z)$ denote the Higgs bundle over X corresponding to the spectral data $(Y_{\rho(z)v}, \bar f_z^* L)$. Since $\pi_* L$ and $\pi_* \bar f_z^* L$ are isomorphic, it follows that the evaluation of the 1-form Θ on $M_H$ on the tangent vector at the point $(\pi_* L, \theta_1)$ defined by the curve $z \mapsto (\pi_* \bar f_z^* L, \theta_z)$ vanishes. Indeed, this is immediate from the construction of $\Theta_\theta$ done following (2.3).

The divisor of the section $\bar f_z^* s$ is simply $\bar f_z^{-1}(\mathrm{div}(s))$, the image of the divisor of s by the isomorphism $\bar f_z^{-1}$ which is the inverse of $\bar f_z$. Furthermore, the evaluation of the canonical 1-form ω on $K_X$ on any vertical vector for the projection π vanishes. Consequently, the evaluation of the 1-form $\Phi^*\Theta_C$ on the tangent vector at the point $(\pi_* L, \theta_1, \pi_* s) \in M_T$ defined by the curve $z \mapsto (\pi_* \bar f_z^* L, \theta_z, \pi_* \bar f_z^* s)$ vanishes.

Combining the above two observations on Θ and $\Phi^*\Theta_C$ respectively, we conclude that γ vanishes on any vertical vector for the projection $\tilde q$. This completes the proof of the proposition.
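The claim in the proof that $f_z$ carries $Y_v$ to $Y_{\rho(z)v}$ can be verified directly from the defining polynomial of the spectral curve. The lines below are a sketch using the polynomial from Sect. 2 with coefficients $v_i$ in place of $p_i(\theta)$ and the convention $v_0 = 1$; the name $P_v$ is introduced here only for the illustration.

```latex
% Sketch: why multiplication by z maps Y_v onto Y_{rho(z)v}.
% Spectral polynomial attached to v = (v_1, ..., v_n), with v_0 = 1:
\[
  P_v(t) \;=\; \sum_{i=0}^{n} t^{\,n-i}\, v_i .
\]
% The point rho(z)v has components z^i v_i, see (4.4). Evaluate its
% polynomial at the rescaled fibre coordinate zt:
\[
  P_{\rho(z)v}(zt)
  \;=\; \sum_{i=0}^{n} (zt)^{\,n-i}\, z^{\,i} v_i
  \;=\; z^{\,n} \sum_{i=0}^{n} t^{\,n-i}\, v_i
  \;=\; z^{\,n}\, P_v(t).
\]
% Since z is nonzero, P_v(t) = 0 if and only if P_{rho(z)v}(zt) = 0.
```

So a point t lies on $Y_v$ exactly when zt lies on $Y_{\rho(z)v}$, which is why the weights in (4.4) are $t, t^2, \ldots, t^n$ rather than a uniform scaling.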


For each integer i ∈ [1, n], fix a basis $\{\beta_{i,1}, \beta_{i,2}, \cdots, \beta_{i,n_i}\}$ of the vector space $H^0(X, K_X^{\otimes i})$. So $n_1 = g$ and $n_i = (2i-1)(g-1)$ if i ≥ 2. This collection of bases gives us a basis for the vector space V.

Let N denote the dimension of V. So we have $N = \sum_{i=1}^{n} n_i$. Let P denote the projective space consisting of all lines in $\mathbb{C}^N$. We will define a morphism from V − {0} to P using the basis of V. Take any nonzero vector

$$ v = \sum_{i=1}^{n} \sum_{j=1}^{n_i} c_{i,j}\,\beta_{i,j} \in V - \{0\}, $$

where $c_{i,j} \in \mathbb{C}$. Let

$$ q : V - \{0\} \to P \qquad (4.5) $$

be the map that sends any such vector v to the point of P represented by $\{c_{i,j}\}$. Let

$$ Q : V - \{0\} \to V - \{0\} \qquad (4.6) $$

be the holomorphic map that sends any vector v as above to the vector

$$ \sum_{i=1}^{n} \sum_{j=1}^{n_i} (c_{i,j})^i\,\beta_{i,j} \in V - \{0\}. $$

Now we have a commutative diagram

$$ \begin{array}{ccc} V - \{0\} & \xrightarrow{\;Q\;} & V - \{0\} \\ \downarrow {\scriptstyle q} & & \downarrow {\scriptstyle \tilde q} \\ P & \longrightarrow & \tilde P \end{array} $$

where the morphism $P \to \tilde P$ is the one induced by the map Q; the map q is defined in (4.5).

In view of the commutativity of the above diagram, Proposition 4.3 implies that the evaluation of the 1-form $Q^*\gamma$ for any vertical vector for the projection q vanishes.

Take a nonzero vector v ∈ V − {0}. Let α be a holomorphic tangent vector to the manifold V − {0} at v. Take a nonzero complex number λ. Let $\hat\lambda$ denote the automorphism of V which sends any vector w to λw. Therefore, $d\hat\lambda(\alpha)$ is a tangent vector at $\hat\lambda(v)$, where $d\hat\lambda$ is the differential of the map $\hat\lambda$. Choose a holomorphic map $\tilde\alpha$ from the unit disk in $\mathbb{C}$ to V such that $\tilde\alpha(0) = v$ and $\tilde\alpha'(0) = \alpha$. Therefore, we have $(\hat\lambda \circ \tilde\alpha)'(0) = d\hat\lambda(\alpha)$.

For any point t in the unit disk, let $Y_t$ denote the spectral curve corresponding to the point $\tilde\alpha(t)$ of V. It is easy to see that the spectral curve corresponding to the point $\hat\lambda \circ \tilde\alpha(t) \in V$ is the image $f_\lambda(Y_t)$, where $f_\lambda$, as before, is the automorphism of $K_X$ defined by multiplication with λ. If $L_t$ is a line bundle over $Y_t$ and $s_t$ is a holomorphic section of $L_t$, then $\bar f_\lambda^* L_t$ is a line bundle on $f_\lambda(Y_t)$ and $\bar f_\lambda^* s_t$ is a holomorphic section of $\bar f_\lambda^* L_t$; here $\bar f_\lambda : f_\lambda(Y_t) \to Y_t$, as before, is the isomorphism defined by $f_\lambda$.
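As a side check on the dimension N of V introduced above, the values $n_1 = g$ and $n_i = (2i-1)(g-1)$ for i ≥ 2 can be summed in closed form. The following is a short sketch; it uses only Riemann–Roch on X with g ≥ 2 and the dimension and genus formulas quoted from [Hi2] earlier in the paper.

```latex
% Sketch: the dimension N of V = \bigoplus_{i=1}^n H^0(X, K_X^{\otimes i}).
% For i >= 2, deg K_X^{\otimes i} = i(2g-2) > 2g-2, so H^1 vanishes and
\[
  n_i \;=\; i(2g-2) - (g-1) \;=\; (2i-1)(g-1), \qquad n_1 = g .
\]
% Summing, with \sum_{i=2}^{n} (2i-1) = n^2 - 1:
\[
  N \;=\; g + (g-1)\sum_{i=2}^{n} (2i-1)
    \;=\; g + (n^2-1)(g-1)
    \;=\; n^2(g-1) + 1 .
\]
```

This agrees with the genus $n^2(g-1)+1$ of the spectral curve and is exactly half of $\dim M_H = 2n^2(g-1)+2$, consistent with the fibers $J_Y$ of the Hitchin map being identified with components of the Picard group of Y.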


From the above observations it readily follows that

$$ \lambda \cdot Q^*\gamma(v)(\alpha) = Q^*\gamma(\lambda v)(d\hat\lambda(\alpha)), \qquad (4.7) $$

where Q is defined in (4.6).

We already noted that the evaluation of the 1-form $Q^*\gamma$ for any vertical vector for the projection q (defined in (4.5)) vanishes. In view of this, the equality (4.7) implies that the one-form $Q^*\gamma$ descends to a holomorphic section of $\Omega^1_P \otimes O_P(1)$ on P. This means that there is a section $\bar\gamma$ of $\Omega^1_P \otimes O_P(1)$ such that

$$ q^*\bar\gamma \in H^0(V \setminus \{0\}, \Omega^1_{V \setminus \{0\}} \otimes q^* O_P(1)), $$

realized as a one-form on V\{0} using the canonical trivialization of $q^* O_P(1)$, coincides with the one-form $Q^*\gamma$.

It is known that $H^0(P, \Omega^1_P \otimes O_P(1)) = 0$ [SS, pp. 71, Theorem 4.3(a)]. Therefore, we obtain that γ = 0. Since γ = 0, the identity (4.3) completes the proof of Lemma 3.3. We already noted in Sect. 3 that Lemma 3.3 implies Theorem 3.1. Therefore, the proof of Theorem 3.1 is complete.

We note that exactly imitating the construction of the 1-form Θ on $M_H$ we may construct a 1-form on $M_T$. Indeed, there is a natural projection from the space of infinitesimal deformations of a triple (E, θ, s) to the space of infinitesimal deformations of the Higgs bundle (E, θ). Similarly, exactly imitating the construction of the 2-form Ω on $M_H$ a 2-form on $M_T$ can be constructed. If the parameter σ in the stability condition of a triple is not sufficiently small, then F is not defined everywhere. Even in this case, these 1-form and 2-form coincide on the domain of F with $F^*\Theta$ and $F^*\Omega$ respectively. Theorem 3.1 remains valid if $F^*\Omega$ is replaced by this 2-form on $M_T$.

Acknowledgements. We thank the referee for clarifying remarks.

References

[Be] Beauville, A.: Variétés Kähleriennes dont la première classe de Chern est nulle. J. Diff. Geom. 18, 755–782 (1983)
[BNR] Beauville, A., Narasimhan, M.S., Ramanan, S.: Spectral curves and the generalised theta divisor. J. Reine Angew. Math. 398, 169–179 (1989)
[BG] Bradlow, S.B., García-Prada, O.: Stable triples, equivariant bundles and dimension reduction. Math. Ann. 304, 225–252 (1996)
[BR] Biswas, I., Ramanan, S.: An infinitesimal study of the moduli of Hitchin pairs. J. Lond. Math. Soc. 49, 219–231 (1994)
[G] García-Prada, O.: Dimensional reduction of stable bundles, vortices and stable pairs. Internat. J. Math. 5, 1–52 (1994)
[Hi1] Hitchin, N.J.: The self-duality equations on a Riemann surface. Proc. Lond. Math. Soc. 55, 59–126 (1987)
[Hi2] Hitchin, N.J.: Stable bundles and integrable systems. Duke Math. J. 54, 91–114 (1987)
[Li] Lin, T.-R.: Hermitian-Yang–Mills–Higgs metrics and stability for holomorphic vector bundles with Higgs fields. Preprint
[Si] Simpson, C.T.: Moduli of representations of the fundamental group of a smooth projective variety. II. Inst. Hautes Études Sci. Publ. Math. 80, 5–79 (1994)
[SS] Shiffman, B., Sommese, A.: Vanishing Theorems on Complex Manifolds. Progress in Math., Vol. 56. Boston: Birkhäuser, 1985

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 221, 305 – 333 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Long Time Behavior of the Continuum Limit of the Toda Lattice, and the Generation of Infinitely Many Gaps from C^∞ Initial Data

A.B.J. Kuijlaars1, K. T.-R. McLaughlin2,3

1 Department of Mathematics, Katholieke Universiteit Leuven, Celestijnenlaan 200 B, 3001 Leuven, Belgium. E-mail: [email protected]
2 Department of Mathematics, University of Arizona, Tucson, Arizona 85721, USA. E-mail: [email protected]
3 Department of Mathematics, University of North Carolina, Chapel Hill, North Carolina 27599, USA. E-mail: [email protected]

Received: 8 May 2000 / Accepted: 27 March 2001

Abstract: We analyze a continuum limit of the finite non-periodic Toda lattice through an associated constrained maximization problem over spectral density functions. The maximization problem was derived by Deift and McLaughlin using the Lax–Levermore approach, initially developed for the zero dispersion limit of the KdV equation. It encodes the evolution of the continuum limit for all times, including evolution through shocks. The formation of gaps in the support of the maximizer is indicative of oscillations in the Toda lattice and the lack of strong convergence of the continuum limit. For large times, the maximizer tends to have zero gaps, which is the continuum analogue of the sorting property of the finite lattice. Using methods from logarithmic potential theory, we show that this behavior depends crucially on the initial data. We exhibit initial data for which the zero gap ansatz holds uniformly in the spatial parameter (at large times), and other initial data for which this uniformity fails at all times. We then construct an example of C ∞ smooth initial data generating, at a later time, infinitely many gaps in the support of the maximizer, while for even larger times, all gaps have closed.

1. Introduction

The finite, non-periodic Toda lattice is a dynamical system that is given in Flaschka coordinates by

\[ \frac{da_n}{dt} = 2\left(b_n^2 - b_{n-1}^2\right), \qquad n = 1, \dots, N, \tag{1.1} \]
\[ \frac{db_n}{dt} = b_n\left(a_{n+1} - a_n\right), \qquad n = 1, \dots, N-1, \tag{1.2} \]

with $b_0 = b_N = 0$. The Toda lattice is completely integrable and is solved explicitly by the inverse spectral transform [12, 23, 24].
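The system (1.1)–(1.2) is easy to integrate numerically. The following is a minimal sketch, not taken from the paper: it assumes a classical fourth-order Runge–Kutta step and hypothetical helper names (`toda_rhs`, `rk4_step`, `evolve`, `invariants`). It checks two conserved quantities that follow directly from (1.1)–(1.2), namely $\sum_n a_n$ and $\sum_n a_n^2 + 2\sum_n b_n^2$ (the traces of $L$ and $L^2$ for the tridiagonal matrix with diagonal $a_n$ and off-diagonal $b_n$).

```python
def toda_rhs(a, b):
    """Right-hand side of (1.1)-(1.2); a has length N, b has length N-1."""
    n = len(a)
    bb = [0.0] + list(b) + [0.0]  # bb[k] = b_k for k = 0..N, with b_0 = b_N = 0
    da = [2.0 * (bb[k + 1] ** 2 - bb[k] ** 2) for k in range(n)]
    db = [bb[k + 1] * (a[k + 1] - a[k]) for k in range(n - 1)]
    return da, db


def rk4_step(a, b, dt):
    """One classical Runge-Kutta step (an assumed, generic integrator)."""
    def shift(u, du, h):
        return [x + h * dx for x, dx in zip(u, du)]

    k1 = toda_rhs(a, b)
    k2 = toda_rhs(shift(a, k1[0], dt / 2), shift(b, k1[1], dt / 2))
    k3 = toda_rhs(shift(a, k2[0], dt / 2), shift(b, k2[1], dt / 2))
    k4 = toda_rhs(shift(a, k3[0], dt), shift(b, k3[1], dt))

    def combine(u, i):
        return [x + dt / 6.0 * (p + 2 * q + 2 * r + s)
                for x, p, q, r, s in zip(u, k1[i], k2[i], k3[i], k4[i])]

    return combine(a, 0), combine(b, 1)


def evolve(a, b, t_final, dt=1e-3):
    """Integrate from time 0 to t_final with fixed step dt."""
    for _ in range(round(t_final / dt)):
        a, b = rk4_step(a, b, dt)
    return a, b


def invariants(a, b):
    """Return (sum a_n, sum a_n^2 + 2 sum b_n^2), both constant along the flow."""
    return sum(a), sum(x * x for x in a) + 2.0 * sum(y * y for y in b)
```

Comparing `invariants` before and after a call to `evolve` gives a quick consistency check on the sign conventions in (1.1)–(1.2); it does not replace the inverse spectral solution cited above.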


The continuum limit of the Toda lattice is studied by Deift and McLaughlin in [7]. For $\epsilon > 0$, they choose $N = [1/\epsilon]$ and initial values

\[ a_n = a_0(\epsilon n), \qquad b_n = b_0(\epsilon n) \tag{1.3} \]

in which $a_0(x)$ and $b_0(x)$ are given continuous functions for $x \in [0,1]$ such that $b_0(x) > 0$ on $(0,1)$. For small $\epsilon > 0$, the initial data (1.3) vary only very little with the index $n$. Since the constant functions $a_n(t) \equiv a \in \mathbb{R}$ and $b_n(t) \equiv b > 0$ clearly satisfy (1.1)–(1.2), one may then reasonably expect that the solutions with initial data (1.3) are approximately constant in time, and vary noticeably only over large time scales. Putting

\[ a_n(t) = a^\epsilon(\epsilon n, \epsilon t), \qquad b_n(t) = b^\epsilon(\epsilon n, \epsilon t), \]

one may then make the ansatz about the asymptotic form of the functions $a^\epsilon$ and $b^\epsilon$:

\[ a^\epsilon(x,t) \sim a(x,t) + \epsilon a_1(x,t) + \cdots, \tag{1.4} \]
\[ b^\epsilon(x,t) \sim b(x,t) + \epsilon b_1(x,t) + \cdots. \tag{1.5} \]

Inserting the asymptotic form (1.4)–(1.5) into (1.1)–(1.2) and equating the leading order terms, one arrives at the formal continuum limit:

\[ \frac{\partial a}{\partial t} = 4b\,\frac{\partial b}{\partial x}, \qquad \frac{\partial b}{\partial t} = b\,\frac{\partial a}{\partial x}, \tag{1.6} \]

for $0 < x < 1$ and $t > 0$, with boundary conditions $b(0) = b(1) = 0$. The system (1.6) is hyperbolic with Riemann invariants

\[ \alpha = a - 2b, \qquad \beta = a + 2b. \]

The Riemann invariant form of (1.6) is

\[ \frac{\partial \alpha}{\partial t} = -\frac{\beta - \alpha}{2}\,\frac{\partial \alpha}{\partial x}, \qquad \frac{\partial \beta}{\partial t} = \frac{\beta - \alpha}{2}\,\frac{\partial \beta}{\partial x}, \tag{1.7} \]

with initial values

\[ \alpha_0(x) := a_0(x) - 2b_0(x), \qquad \beta_0(x) := a_0(x) + 2b_0(x). \tag{1.8} \]
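The Riemann invariant form (1.7) can be checked directly from (1.6). The short computation below is a sketch that uses only the definitions $\alpha = a - 2b$ and $\beta = a + 2b$, so that $2b = (\beta - \alpha)/2$:

```latex
\begin{align*}
\partial_t \alpha &= \partial_t a - 2\,\partial_t b
  = 4b\,\partial_x b - 2b\,\partial_x a
  = -2b\,\partial_x (a - 2b)
  = -\frac{\beta - \alpha}{2}\,\partial_x \alpha, \\
\partial_t \beta &= \partial_t a + 2\,\partial_t b
  = 4b\,\partial_x b + 2b\,\partial_x a
  = 2b\,\partial_x (a + 2b)
  = \frac{\beta - \alpha}{2}\,\partial_x \beta .
\end{align*}
```

The two characteristic speeds $\mp(\beta - \alpha)/2 = \mp 2b$ are real and distinct wherever $b > 0$, which is the hyperbolicity referred to in the text.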

A rigorous justification of this procedure was provided by Deift and McLaughlin in [7] for certain initial data (see below), modified to correspond to WKB data, and for times t up to the shock time of the system (1.6), see also the recent work [2]. For times beyond shock time, a more complicated description was found in certain cases.

The analysis of the continuum limit of the Toda lattice has much in common with the analysis of Lax and Levermore [22] of the zero-dispersion limit of the Korteweg–de Vries (KdV) equation $u_t - 6uu_x + \epsilon^2 u_{xxx} = 0$ as $\epsilon \to 0$. In both cases, essential use is made of the inverse spectral (or scattering) transforms for the KdV equation and the Toda lattice, respectively. A fundamental step in the analysis is the formulation of a quadratic maximization problem in the spectral variable in which the space-time variables x and t appear as parameters. The maximization problem arises from an asymptotic analysis of the spectral transform.


For the continuum limit of the Toda lattice, the spectral data are the eigenvalues $\lambda_k^\epsilon$, $k = 1, 2, \dots, N$, of the tridiagonal matrix

\[ L = \begin{pmatrix} a_1 & b_1 & & & 0 \\ b_1 & a_2 & b_2 & & \\ & b_2 & a_3 & \ddots & \\ & & \ddots & \ddots & b_{N-1} \\ 0 & & & b_{N-1} & a_N \end{pmatrix} \tag{1.9} \]

constructed from the initial values (1.3), together with the “norming constants” $w_k > 0$, $k = 1, \dots, N$, where $w_k$ is the first component of the normalized eigenvector of $L$ corresponding to the eigenvalue $\lambda_k^\epsilon$.

The following restriction is put on the initial Riemann invariants $\alpha_0$ and $\beta_0$ from (1.8). We assume as in [7, 21]

• $\alpha_0$ has exactly one local minimum in [0, 1], and
• $\beta_0$ has exactly one local maximum in [0, 1].

In addition, we assume that $\alpha_0(x) < \beta_0(x)$ for $x \in (0,1)$, $\alpha_0(0) = \beta_0(0)$ and $\alpha_0(1) = \beta_0(1)$. We put

\[ A := \min_{0 \le x \le 1} \alpha_0(x), \qquad B := \max_{0 \le x \le 1} \beta_0(x). \]

It follows from the above assumptions that, for every $\lambda \in [A, B]$, the set $\{x \in [0,1] : \alpha_0(x) \le \lambda \le \beta_0(x)\}$ is an interval. We denote the endpoints of this interval by $x_-(\lambda)$ and $x_+(\lambda)$. Under these hypotheses, Deift and McLaughlin [7] show that the eigenvalues $\lambda_k^\epsilon$ have an asymptotic density $\phi$, i.e., for all $\lambda_1 < \lambda_2$,

\[ \lim_{\epsilon \to 0} \epsilon\, \#\{\lambda_k^\epsilon \in (\lambda_1, \lambda_2)\} = \int_{\lambda_1}^{\lambda_2} \phi(\lambda)\, d\lambda, \]

and $\phi$ is given by

\[ \phi(\lambda) = \frac{1}{\pi} \int_{x_-(\lambda)}^{x_+(\lambda)} \frac{1}{\sqrt{(\beta_0(x) - \lambda)(\lambda - \alpha_0(x))}}\, dx, \qquad \lambda \in [A, B]. \tag{1.10} \]
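The density statement can be illustrated on a concrete example. The sketch below is not from the paper: it assumes the hypothetical "tent" data $a_0 = 0$, $b_0(x) = (1 - |2x - 1|)/2$, so that $\beta_0 = -\alpha_0 = 1 - |2x - 1|$, $A = -1$, $B = 1$. For these data (1.10) evaluates in closed form to $\phi(\lambda) = \pi^{-1}\operatorname{arccosh}(1/|\lambda|)$. Eigenvalues of the matrix (1.9) are counted with the standard Sturm-sequence (negative pivot) recurrence, and all function names are illustrative.

```python
import math


def tent_b0(x):
    """Hypothetical profile b_0(x) = (1 - |2x - 1|)/2, used with a_0 = 0."""
    return 0.5 * (1.0 - abs(2.0 * x - 1.0))


def count_below(diag, off, x):
    """Count eigenvalues of the Jacobi matrix (1.9) that are smaller than x.

    The number of negative pivots in the LDL^T factorization of L - x*I
    equals the number of eigenvalues below x (Sylvester's law of inertia).
    """
    count, pivot = 0, 1.0
    for k, a_k in enumerate(diag):
        pivot = a_k - x - (off[k - 1] ** 2 / pivot if k > 0 else 0.0)
        if pivot == 0.0:
            pivot = 1e-30  # nudge an exactly vanishing pivot
        if pivot < 0.0:
            count += 1
    return count


def scaled_count(n_sites, lo, hi):
    """epsilon * #{eigenvalues in (lo, hi)} for the tent data, epsilon = 1/N."""
    eps = 1.0 / n_sites
    diag = [0.0] * n_sites                               # a_n = a_0(eps n) = 0
    off = [tent_b0(eps * n) for n in range(1, n_sites)]  # b_1, ..., b_{N-1}
    return eps * (count_below(diag, off, hi) - count_below(diag, off, lo))


def tent_phi_mass(lo, hi):
    """Integral over (lo, hi), 0 < lo < hi <= 1, of arccosh(1/lam)/pi."""
    def antiderivative(s):
        return s * math.acosh(1.0 / s) + math.asin(s)

    return (antiderivative(hi) - antiderivative(lo)) / math.pi
```

For moderately large `n_sites`, `scaled_count(n_sites, lo, hi)` should be close to `tent_phi_mass(lo, hi)`; by symmetry the total mass on $(0, 1]$ is $1/2$.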

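Furthermore, for every fixed $\lambda^* \in [A, B]$, and eigenvalues $\lambda^\epsilon_{k(\epsilon)}$ of $L$ such that $\lambda^\epsilon_{k(\epsilon)} \to \lambda^*$ as $\epsilon \to 0$, the limit

\[ \lim_{\epsilon \to 0} \epsilon \log w_{k(\epsilon)} = -V(\lambda^*) \]

exists, and $V$ is given by

\[ V(\lambda) = \int_0^{x_-(\lambda)} \log\left| \frac{\lambda - a_0(x)}{2b_0(x)} + \sqrt{\left(\frac{\lambda - a_0(x)}{2b_0(x)}\right)^2 - 1}\, \right| dx. \tag{1.11} \]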


In (1.11) that branch of the square root is chosen which is positive for $\lambda > \beta_0(x)$. Following [7], we are led to the maximization problem

\[ Q(x,t) := \max_{\psi} \left[ (L\psi, \psi) - 2(V - t\lambda, \psi) \right], \tag{1.12} \]

where the maximum is taken over those functions $\psi$ on $[A, B]$ satisfying

\[ 0 \le \psi \le \phi, \qquad \int \psi(\lambda)\, d\lambda = x. \tag{1.13} \]

Here

\[ L\psi(\lambda) = \int \log|\lambda - \mu|\, \psi(\mu)\, d\mu, \]

and the inner product $(\cdot\,, \cdot)$ is the $L^2$ inner product on $[A, B]$:

\[ (f, g) = \int_A^B f(\lambda) g(\lambda)\, d\lambda. \]

The maximization problem (1.12)–(1.13) is an extremal problem for logarithmic potentials. The function V (λ) − tλ in the right-hand side of (1.12) is known as an external field, see [27]. The external field changes with time, and initially, at time t = 0, it is given by the spectral function V . The other spectral function φ is a constant of the motion and appears in (1.13) as an upper constraint for the maximizer. The spatial coordinate x appears as a normalization in (1.13). It is important to note that (1.12)–(1.13) provides a global description of the continuum limit of the Toda lattice. Indeed, the maximizer ψ(λ) = ψ(λ; x, t) exists and is unique for every x ∈ (0, 1) and every t ∈ R. In case, for some range of the parameters x and t in space-time, the “free part” of ψ, i.e., the set of λ ∈ [A, B] where 0 < ψ(λ) < φ(λ), is an interval (α(x, t), β(x, t)) then the endpoints α(x, t) and β(x, t) satisfy Eqs. (1.7) and the asymptotic form (1.4) and (1.5) is believed to be valid. In such a case one says that a zero gap ansatz holds. In case the set 0 < ψ(λ) < φ(λ) consists of several intervals, separated by a number of gaps, then the continuum limit exists only in a weak, averaged sense. The endpoints of the intervals then evolve according to a system of PDEs, which is recognized as a Whitham-type system of modulation equations, see again [7]. It is generally believed that the lack of strong convergence is due to the development of oscillations in the Toda lattice, and that the oscillations can be modeled by theta functions built out of a Riemann surface with genus equal to the number of gaps (for the analogous case of the KdV equation with small dispersion, see [32] and [13] for genus 1 and arbitrary genus, respectively). The connection between the existence of gaps and the development of oscillations in the continuum limit of the Toda lattice has not been established rigorously. 
However, experience from the interplay between whole line scattering theory and periodic spectral theory indicates that the existence of gaps implies the development of oscillations. For the small dispersion limit of the KdV equation, Deift, Venakides, and Zhou [8] have shown, under real analyticity assumptions on the spectral data, that the existence of gaps implies oscillations. Also, for the so-called Toda shock problem, the connection between a gap and oscillations is well known [15, 31, 17]. As already indicated, the formulation and analysis of a maximization problem like (1.12)–(1.13) lies at the heart of the Lax–Levermore approach to the zero-dispersion limit of the KdV equation [22, 30, 8, 11]. In a similar way, other singular limits of integrable


systems have been treated recently. We mention the semiclassical limit of the defocussing nonlinear Schrödinger equation [16] and the continuum limit of a discrete NLS chain [28]. The long time asymptotic behavior was considered in [22] for the zero-dispersion limit of the KdV equation, and in [1] for the semi-classical limit of the defocussing NLS equation.

In this paper, we consider two questions related to the continuum limit of the Toda lattice. The first concerns the long time behavior. It is well known that the Toda lattice (1.1)–(1.2) has the sorting property, i.e., for fixed $\epsilon > 0$, the tridiagonal matrix (1.9) converges as $t \to \infty$ to a diagonal matrix with the eigenvalues $\lambda_k^\epsilon$ on the diagonal, sorted from large to small. This sorting property was discussed for the continuum limit in [3]. Deift and McLaughlin [7, Chapter 11] study the long time behavior in terms of the so-called right ansatz. The right ansatz holds for x and t if there exist $\alpha(x,t)$ and $\beta(x,t)$ in $[A, B]$ such that the maximizer $\psi$ of the maximization problem (1.12)–(1.13) satisfies

\[ 0 < \psi(\lambda) < \phi(\lambda), \qquad \lambda \in (\alpha(x,t), \beta(x,t)), \tag{1.14} \]
\[ \psi = 0, \qquad \lambda \in [A, \alpha(x,t)], \tag{1.15} \]
\[ \psi = \phi, \qquad \lambda \in [\beta(x,t), B]. \tag{1.16} \]
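The sorting property of the finite lattice can be observed in a small experiment. The sketch below is illustrative only: it assumes a fixed-step Runge–Kutta integrator, a hypothetical initial state with $a_n = 0$ and $b_n = 1$, and the known spectrum $2\cos(k\pi/(N+1))$, $k = 1, \dots, N$, of the corresponding tridiagonal matrix. Function names are not from the paper.

```python
import math


def toda_flow(state, n):
    """Vector field of (1.1)-(1.2) on state = [a_1..a_N, b_1..b_{N-1}]."""
    a, b = state[:n], [0.0] + state[n:] + [0.0]  # b[k] = b_k, b_0 = b_N = 0
    da = [2.0 * (b[k + 1] ** 2 - b[k] ** 2) for k in range(n)]
    db = [b[k + 1] * (a[k + 1] - a[k]) for k in range(n - 1)]
    return da + db


def integrate(state, n, t_final, dt=0.01):
    """Fixed-step classical Runge-Kutta integration up to time t_final."""
    for _ in range(round(t_final / dt)):
        k1 = toda_flow(state, n)
        k2 = toda_flow([s + 0.5 * dt * k for s, k in zip(state, k1)], n)
        k3 = toda_flow([s + 0.5 * dt * k for s, k in zip(state, k2)], n)
        k4 = toda_flow([s + dt * k for s, k in zip(state, k3)], n)
        state = [s + dt / 6.0 * (p + 2 * q + 2 * r + w)
                 for s, p, q, r, w in zip(state, k1, k2, k3, k4)]
    return state


def sorted_limit(n, t_final=25.0):
    """Evolve a_n = 0, b_n = 1 and return (diagonal, off-diagonal) at t_final."""
    final = integrate([0.0] * n + [1.0] * (n - 1), n, t_final)
    return final[:n], final[n:]


def free_spectrum(n):
    """Eigenvalues of the n x n matrix with zero diagonal and unit off-diagonal,
    listed from large to small."""
    return [2.0 * math.cos(k * math.pi / (n + 1)) for k in range(1, n + 1)]
```

At large times the returned off-diagonal entries are negligible and the diagonal approaches `free_spectrum(n)` in decreasing order, which is the finite-lattice behavior whose continuum analogue is studied below.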

Theorem 1.1 (Deift–McLaughlin). For initial data as above, there exists a time $\bar t$ such that for $t > \bar t$ there exist $x_0(t)$ and $x_1(t)$ in [0, 1] satisfying

\[ \lim_{t \to \infty} x_0(t) = 0 \qquad \text{and} \qquad \lim_{t \to \infty} x_1(t) = 1 \]

such that the right ansatz holds for every $t > \bar t$ and every $x \in (x_0(t), x_1(t))$.

Proof. See [7, Theorems 11.1 and 11.2], where this result was proved for the case that $\alpha_0$ is strictly decreasing. [In that case one may take $x_1(t) = 1$.] The more general case follows from similar arguments.

It follows from Theorem 1.1 that, for fixed $x \in (0,1)$, the functions $\alpha(x,t)$ and $\beta(x,t)$ exist for t sufficiently large. The fact that they have a common limit as $t \to \infty$ represents the continuum analogue of the sorting property of the Toda lattice. We complement this long time result in two ways.

• Firstly, we describe a general class of initial data for which the right ansatz holds for large enough times t, and for all $x \in (0,1)$, see Theorem 4.1. For these initial data, one may therefore write $x_0(t) = 0$ and $x_1(t) = 1$ in Theorem 1.1.
• Secondly, we describe a different class of initial data for which the right ansatz does not hold near x = 0 no matter how large t is. See Theorem 5.3 for the precise conditions. For these initial data one therefore necessarily has $x_0(t) > 0$ in Theorem 1.1.

The difference between the two cases lies in the smoothness of $\beta_0$ at its maximum.

The second problem we consider in this paper is the formation of an infinite gap solution, i.e., an infinite number of intervals in the support of the maximizer.

• We present an example in which $C^\infty$ initial data evolve into an infinite gap solution at a certain time $t_0$ and position $x_0$, see Lemma 6.2 and Theorem 6.3. The construction makes essential use of the result on long time behavior.


• At the critical time $t_0$, we show that for $x < x_0$, and for $x \in (x_0, x_0 + \delta)$, the constraint is not active, and the maximizer is supported on finitely many intervals. As x approaches $x_0$, the number of gaps increases without bound. This indicates that the oscillations in the continuum limit of the Toda lattice become more and more complicated. It would be of interest to show that the oscillations are described by theta functions associated to Riemann surfaces whose genera grow as x tends to $x_0$.

We remark that for real analytic spectral data the formation of an infinite gap solution is not possible, see [18].

The outline of the rest of the paper is as follows. Sections 2 and 3 contain preliminary material that is needed for the long time result of Theorem 4.1. In Sect. 2 we introduce an external field $\tilde V$ which is dual to $V$. In Sect. 3 we obtain a uniformly valid long time result from certain smoothness properties of both $V$ and $\tilde V$. Then in Sect. 4 we give a general class of initial data $\alpha_0$ and $\beta_0$ which give rise to external fields $V$ and $\tilde V$ with the required smoothness properties, so that for large enough time, the right ansatz holds uniformly in x. In Sect. 5 we consider a different class of initial data for which the right ansatz does not hold uniformly in x. This class includes $C^2$ initial data. In our final Sect. 6 we describe the generation of an infinite gap solution from $C^\infty$ initial data.

2. The Dual Problem

The Euler–Lagrange relations associated with the maximization problem (1.12)–(1.13) are

\[ L\psi(\lambda) - V(\lambda) + t\lambda = \ell, \qquad \text{if } 0 < \psi(\lambda) < \phi(\lambda), \tag{2.1} \]
\[ L\psi(\lambda) - V(\lambda) + t\lambda \le \ell, \qquad \text{if } \psi(\lambda) = 0, \tag{2.2} \]
\[ L\psi(\lambda) - V(\lambda) + t\lambda \ge \ell, \qquad \text{if } \psi(\lambda) = \phi(\lambda), \tag{2.3} \]

where $\ell$ is a constant, which may depend on x and t. The maximizer is the only function on $[A, B]$ that satisfies (1.13) and the relations (2.1)–(2.3) for some constant $\ell$.

We need some notions from logarithmic potential theory. Good references are [26, 27]. The Green function with pole at infinity of an unbounded domain $\Omega$ in the complex $\lambda$-plane is the unique continuous function of $\lambda$ that is harmonic in $\Omega$, vanishes on $\mathbb{C} \setminus \Omega$ and behaves like $\log|\lambda|$ as $|\lambda| \to \infty$. We use $g_{[\alpha,\beta]}$ to denote the Green function with pole at infinity of $\mathbb{C} \setminus [\alpha, \beta]$. It is known that

\[ g_{[\alpha,\beta]}(\lambda) = \log\left| \frac{\lambda - a}{2b} + \sqrt{\left(\frac{\lambda - a}{2b}\right)^2 - 1}\, \right|, \]

where $a = (\alpha + \beta)/2$ and $b = (\beta - \alpha)/4$. Thus from (1.8) and (1.11) we see that the external field $V$ is given as an integral of Green functions

\[ V(\lambda) = \int_0^{x_-(\lambda)} g_{[\alpha_0(x), \beta_0(x)]}(\lambda)\, dx. \tag{2.4} \]

Also the function $\phi(\lambda)$ of (1.10) has a potential theoretic interpretation. Let $\omega_{[\alpha,\beta]}$ be the density of the equilibrium measure of the interval $[\alpha, \beta]$, that is,

\[ \omega_{[\alpha,\beta]}(\lambda) = \begin{cases} \dfrac{1}{\pi\sqrt{(\beta - \lambda)(\lambda - \alpha)}} & \text{if } \lambda \in [\alpha, \beta], \\[2mm] 0 & \text{elsewhere.} \end{cases} \]
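The Green function and the equilibrium density are linked by a standard identity: the logarithmic potential of $\omega_{[\alpha,\beta]}$ equals $g_{[\alpha,\beta]} + \log((\beta - \alpha)/4)$. The following sketch checks this numerically at points outside the interval. It is an illustration under stated assumptions: the substitution $\mu = a + 2b\cos\theta$ turns $\omega_{[\alpha,\beta]}(\mu)\,d\mu$ into $d\theta/\pi$, a plain midpoint rule is used, and the function names are hypothetical.

```python
import math


def green(lam, alpha, beta):
    """g_[alpha,beta](lam) for real lam; it vanishes on the interval itself."""
    a, b = 0.5 * (alpha + beta), 0.25 * (beta - alpha)
    z = abs(lam - a) / (2.0 * b)
    return math.log(z + math.sqrt(z * z - 1.0)) if z > 1.0 else 0.0


def equilibrium_potential(lam, alpha, beta, nodes=400):
    """Integral of log|lam - mu| * omega_[alpha,beta](mu) over [alpha, beta].

    Uses mu = a + 2b*cos(theta), under which omega(mu) dmu = d(theta)/pi,
    and the midpoint rule in theta. lam is assumed to lie outside [alpha, beta],
    so the integrand is smooth.
    """
    a, b = 0.5 * (alpha + beta), 0.25 * (beta - alpha)
    total = 0.0
    for j in range(nodes):
        theta = math.pi * (j + 0.5) / nodes
        total += math.log(abs(lam - a - 2.0 * b * math.cos(theta)))
    return total / nodes
```

For example, with `alpha, beta = -1.0, 2.0` the difference `equilibrium_potential(lam, alpha, beta) - green(lam, alpha, beta)` should agree with `math.log((beta - alpha) / 4)` for `lam` on either side of the interval.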


Then by (1.10)

\[ \phi(\lambda) = \int_0^1 \omega_{[\alpha_0(x), \beta_0(x)]}(\lambda)\, dx. \tag{2.5} \]

We recall the following relation between the equilibrium measure and the Green function:

\[ L\omega_{[\alpha,\beta]} = g_{[\alpha,\beta]} + \log\frac{\beta - \alpha}{4}. \tag{2.6} \]

We define a second external field

\[ \tilde V(\lambda) = \int_{x_+(\lambda)}^1 g_{[\alpha_0(x), \beta_0(x)]}(\lambda)\, dx, \qquad \lambda \in [A, B]. \tag{2.7} \]

Lemma 2.1. Assume that $C = \int_0^1 \log b_0(x)\, dx > -\infty$. Then

\[ L\phi = V + \tilde V + C. \]

Proof. Using (2.5) and (2.6), we find

\[ L\phi = \int_0^1 g_{[\alpha_0(x), \beta_0(x)]}\, dx + \int_0^1 \log b_0(x)\, dx. \]

The Green function $g_{[\alpha_0(x), \beta_0(x)]}(\lambda)$ vanishes for $\lambda \in [\alpha_0(x), \beta_0(x)]$. Thus for fixed $\lambda$, it vanishes for $x \in [x_-(\lambda), x_+(\lambda)]$. This gives the lemma.

Remark 2.2. From now on we always assume $\int_0^1 \log b_0(x)\, dx > -\infty$. This is related to the assumption that $L\phi$ is a bounded function.

We put $\tilde\psi = \phi - \psi$, so that $\tilde\psi$ satisfies

\[ 0 \le \tilde\psi \le \phi, \qquad \int \tilde\psi(\lambda)\, d\lambda = 1 - x. \]

Then $L\tilde\psi = L\phi - L\psi$, so that in view of (2.1)–(2.3) and Lemma 2.1,

\[ L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda = \tilde\ell, \qquad \text{if } 0 < \tilde\psi(\lambda) < \phi(\lambda), \tag{2.8} \]
\[ L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda \le \tilde\ell, \qquad \text{if } \tilde\psi(\lambda) = 0, \tag{2.9} \]
\[ L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda \ge \tilde\ell, \qquad \text{if } \tilde\psi(\lambda) = \phi(\lambda), \tag{2.10} \]

with $\tilde\ell = C - \ell$. Thus we have proved the following theorem.

Theorem 2.3. If $\psi$ is the maximizer for the maximization problem (1.12)–(1.13), then $\tilde\psi = \phi - \psi$ is the maximizer for the maximization problem with external field $\tilde V(\lambda) + t\lambda$, constraint $\phi$, and normalization $\int \tilde\psi\, d\lambda = 1 - x$.

We will refer to the maximization problem with external field $\tilde V(\lambda) + t\lambda$ as the dual problem. The simultaneous study of a maximization problem and its dual was done earlier by Dragnev and Saff [10] in their work on the zero asymptotics of Krawtchouk polynomials.
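The passage from (2.1)–(2.3) to (2.8)–(2.10) can be written out in one line. The following sketch only substitutes Lemma 2.1 into $L\tilde\psi = L\phi - L\psi$ and introduces no new assumptions:

```latex
\begin{align*}
L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda
  &= \bigl( V(\lambda) + \tilde V(\lambda) + C - L\psi(\lambda) \bigr)
     - \tilde V(\lambda) - t\lambda \\
  &= C - \bigl( L\psi(\lambda) - V(\lambda) + t\lambda \bigr).
\end{align*}
```

Equality in (2.1) therefore gives (2.8) with $\tilde\ell = C - \ell$. Where $\psi = \phi$, that is $\tilde\psi = 0$, the inequality (2.3) changes direction and becomes (2.9); where $\psi = 0$, that is $\tilde\psi = \phi$, the inequality (2.2) becomes (2.10).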


3. The Right Ansatz and the Left Ansatz

The right ansatz was formulated in (1.14)–(1.16). If the right ansatz is valid then the support of $\psi$ is equal to the interval $[\alpha, B]$ and the support of $\tilde\psi = \phi - \psi$, the maximizer for the dual problem, is $[A, \beta]$. It is easy to see that also the converse holds. That is, if the support of $\psi$ is an interval $[\alpha, B]$ and the support of $\tilde\psi$ is an interval $[A, \beta]$, then $\alpha < \beta$ and (1.14)–(1.16) hold.

The following lemma gives sufficient conditions for the supports to be intervals. Part of the lemma is covered by [9, Theorem 2.16 (b)] of Dragnev and Saff. For completeness and convenience of the reader, we give here the full proof.

Lemma 3.1. Let $t \in \mathbb{R}$.
(a) If $(\lambda - A)(\tilde V'(\lambda) + t)$ is increasing for $\lambda \in [A, B]$, then, for every $x \in (0,1)$, there is $\beta = \beta(x,t)$ such that the support of $\tilde\psi$ is $[A, \beta]$. In addition, $\beta(x,t)$ depends continuously on x.
(b) If $(B - \lambda)(V'(\lambda) - t)$ is increasing for $\lambda \in [A, B]$, then, for every $x \in (0,1)$, there is $\alpha = \alpha(x,t)$ such that the support of $\psi$ is $[\alpha, B]$. In addition, $\alpha(x,t)$ depends continuously on x.

Proof. (a) If the support of $\tilde\psi$ is not an interval, then there exist $\lambda_1 < \lambda_2$ in $\operatorname{supp}(\tilde\psi)$ such that $\tilde\psi(\lambda) = 0$ for $\lambda \in (\lambda_1, \lambda_2)$. By (2.9) we then have

\[ L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda \le \tilde\ell, \qquad \lambda \in (\lambda_1, \lambda_2), \]

while by (2.8) there is equality for $\lambda = \lambda_1$, and for $\lambda = \lambda_2$. Thus $L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda$ assumes its minimum on $[\lambda_1, \lambda_2]$ at an internal point, say at $\lambda^* \in (\lambda_1, \lambda_2)$. Then

\[ (L\tilde\psi)'(\lambda^*) - \tilde V'(\lambda^*) - t = 0. \tag{3.1} \]

Next, we note that for $\lambda \in (\lambda_1, \lambda_2)$,

\[ (\lambda - A)(L\tilde\psi)'(\lambda) = (\lambda - A) \int_A^B \frac{1}{\lambda - \mu}\, \tilde\psi(\mu)\, d\mu. \tag{3.2} \]

It is easy to see that for every $\mu \in (A, B] \setminus (\lambda_1, \lambda_2)$, the function $(\lambda - A)/(\lambda - \mu)$ is strictly decreasing on $[\lambda_1, \lambda_2]$. Since $\tilde\psi$ is non-negative on $[A, B]$ and vanishes on $(\lambda_1, \lambda_2)$, we find that the left-hand side of (3.2) is strictly decreasing on $[\lambda_1, \lambda_2]$. Then we get, using the assumption that $(\lambda - A)(\tilde V'(\lambda) + t)$ increases on $[A, B]$, that

\[ (\lambda - A)\left[ (L\tilde\psi)'(\lambda) - \tilde V'(\lambda) - t \right] \text{ is strictly decreasing on } [\lambda_1, \lambda_2]. \tag{3.3} \]

In view of (3.1) this implies that

\[ (L\tilde\psi)'(\lambda) - \tilde V'(\lambda) - t > 0, \qquad \lambda \in (\lambda_1, \lambda^*), \]
\[ (L\tilde\psi)'(\lambda) - \tilde V'(\lambda) - t < 0, \qquad \lambda \in (\lambda^*, \lambda_2), \]

which means that $L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda$ has a local maximum at $\lambda^*$. This contradiction shows that the support of $\tilde\psi$ is an interval. Let $\beta = \beta(x,t)$ be the right endpoint of this interval.


For later reference we note that for $\lambda \in (\beta, B]$, we have strict inequality

\[ L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda < \tilde\ell. \]

Indeed, if equality would hold, then we could repeat the same arguments as above, with $\lambda_1 = \beta$ and $\lambda_2 = \lambda$, and we would obtain a contradiction again.

Next, we show that A is in the support of $\tilde\psi$. Assuming A is not in the support of $\tilde\psi$, we let $\lambda_2 > A$ be the smallest number in the support. Then

\[ L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda \le \tilde\ell, \qquad \lambda \in (A, \lambda_2), \]

with equality for $\lambda = \lambda_2$. Then in the same way we obtained (3.3), we get that

\[ (\lambda - A)\left[ (L\tilde\psi)'(\lambda) - \tilde V'(\lambda) - t \right] \]

is strictly decreasing on $[A, \lambda_2]$. As the above expression vanishes for $\lambda = A$, we see that it is negative for $\lambda \in (A, \lambda_2)$. Thus $L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda$ is strictly decreasing on $[A, \lambda_2]$. This is again a contradiction, since $L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda \le \tilde\ell$ for $\lambda = A$ and equality holds for $\lambda = \lambda_2$. Thus the support of $\tilde\psi$ is equal to the interval $[A, \beta(x,t)]$.

To prove that $\beta(x,t)$ depends continuously on x, we note first that as x increases, the normalization $\int \tilde\psi\, d\lambda = 1 - x$ decreases, and therefore, by Proposition 4.1(a) of [18] (see also Lemma 5.1 below) the maximizer decreases as x increases. In addition, the measures $\tilde\psi\, d\lambda$ depend continuously on x in the weak∗ topology of measures on $[A, B]$, see [18, Proposition 4.1(b)]. This immediately implies that the endpoint $\beta(x,t)$ is right-continuous in x. What remains is to show that it is also continuous from the left. The limit from the left exists since $\beta$ is a decreasing function of x. We denote the left limit by $\beta(x-, t)$. Clearly $\beta(x-, t) \ge \beta(x,t)$. So we certainly have $\beta(x-, t) = \beta(x,t)$ if $\beta(x,t) = B$. Assuming $\beta(x,t) < B$, we recall that we have strict inequality

\[ L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda < \tilde\ell \]

for all $\lambda \in (\beta(x,t), B]$. By [18, Proposition 4.1(b)] $L\tilde\psi(\lambda)$ and $\tilde\ell$ both depend continuously on x. It follows that for any given $\lambda \in (\beta(x,t), B]$ we can find sufficiently small $\delta > 0$ such that also for $x - \delta$ strict inequality holds at $\lambda$. Then $\beta(x - \delta, t) < \lambda$ for all sufficiently small $\delta$, and it follows that $\beta(x-, t) \le \lambda$. Since $\lambda$ can be taken arbitrarily close to $\beta(x,t)$, we arrive at $\beta(x-, t) \le \beta(x,t)$. This proves the left-continuity, since we already know that $\beta(x-, t) \ge \beta(x,t)$. This completes the proof of part (a). The proof of part (b) is similar.

Combining Lemma 3.1 with the remarks immediately preceding it, we arrive at the following result.

Theorem 3.2. Let $t \in \mathbb{R}$. If both $(\lambda - A)(\tilde V'(\lambda) + t)$ and $(B - \lambda)(V'(\lambda) - t)$ are increasing functions on the interval $[A, B]$, then the right ansatz holds for the maximization problem (1.12)–(1.13) for every $x \in (0,1)$. Furthermore, the functions $\alpha(x,t)$ and $\beta(x,t)$ appearing in (1.14)–(1.16) are continuous in x.


It is clear that if the conditions of the theorem hold for some t, then they hold for all later times as well. Thus in that case, the right ansatz continues to hold for all later times. It is important to note that in Theorem 3.2 the right ansatz holds for every $x \in (0,1)$.

Dual to the right ansatz, we have the so-called left ansatz. By this we mean that the maximizer $\psi$ of (1.12)–(1.13) satisfies

\[ 0 < \psi(\lambda) < \phi(\lambda), \qquad \lambda \in (\alpha(x,t), \beta(x,t)), \tag{3.4} \]
\[ \psi = \phi, \qquad \lambda \in [A, \alpha(x,t)], \tag{3.5} \]
\[ \psi = 0, \qquad \lambda \in [\beta(x,t), B] \tag{3.6} \]

for some values $\alpha(x,t)$, $\beta(x,t)$ in $[A, B]$. The following analogue of Theorem 3.2 holds for the left ansatz.

Theorem 3.3. Let $t \in \mathbb{R}$. If both $(\lambda - A)(V'(\lambda) - t)$ and $(B - \lambda)(\tilde V'(\lambda) + t)$ are increasing functions on $[A, B]$, then the left ansatz holds for the maximization problem (1.12)–(1.13) for every $x \in (0,1)$, and the functions $\alpha(x,t)$ and $\beta(x,t)$ appearing in (3.4)–(3.6) are continuous in x.

If the conditions of Theorem 3.3 hold at some time t, then they hold for all earlier times.

Corollary 3.4. Suppose $V'$ and $\tilde V'$ are differentiable functions on $(A, B)$.
(a) If there is a constant T such that for every $\lambda \in (A, B)$,
\[ \frac{d}{d\lambda}\left[ (\lambda - A)\tilde V'(\lambda) \right] \ge -T \qquad \text{and} \qquad \frac{d}{d\lambda}\left[ (B - \lambda)V'(\lambda) \right] \ge -T, \]
then the right ansatz holds for every time $t \ge T$ and every $x \in (0,1)$.
(b) If there is a constant T such that for every $\lambda \in (A, B)$,
\[ \frac{d}{d\lambda}\left[ (\lambda - A)V'(\lambda) \right] \ge T \qquad \text{and} \qquad \frac{d}{d\lambda}\left[ (B - \lambda)\tilde V'(\lambda) \right] \ge T, \]
then the left ansatz holds for every time $t \le T$ and every $x \in (0,1)$.
(c) If $V$ and $\tilde V$ are $C^2$ functions on $[A, B]$, then there exist $T_r$ and $T_l$ such that the right ansatz holds for $t \ge T_r$ and $x \in (0,1)$, and the left ansatz for $t \le T_l$ and $x \in (0,1)$.
In all three cases, the functions $\alpha(x,t)$ and $\beta(x,t)$ are continuous in x.

Proof. This follows immediately from Theorems 3.2 and 3.3.
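The step from Theorem 3.2 to part (a) of Corollary 3.4 is a one-line differentiation. The sketch below spells it out, assuming (as the statement implicitly requires) that the derivatives $V'$ and $\tilde V'$ are themselves differentiable on $(A, B)$:

```latex
\begin{align*}
\frac{d}{d\lambda}\Bigl[ (\lambda - A)\bigl( \tilde V'(\lambda) + t \bigr) \Bigr]
  &= \frac{d}{d\lambda}\Bigl[ (\lambda - A)\tilde V'(\lambda) \Bigr] + t \;\ge\; t - T, \\
\frac{d}{d\lambda}\Bigl[ (B - \lambda)\bigl( V'(\lambda) - t \bigr) \Bigr]
  &= \frac{d}{d\lambda}\Bigl[ (B - \lambda)V'(\lambda) \Bigr] + t \;\ge\; t - T.
\end{align*}
```

Both right-hand sides are non-negative for $t \ge T$, so the two functions in Theorem 3.2 are increasing. Part (b) follows in the same way from Theorem 3.3, where the two derivatives are bounded below by $T - t$. For part (c), the $C^2$ assumption makes both derivatives bounded on $[A, B]$, so suitable constants exist.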

Remark 3.5. Under the conditions of Corollary 3.4, one may also show that the functions α(x, t) and β(x, t) are continuous in t. However, as we will not use this in the rest of the paper, we will not show it here.


4. Long Time Behavior: Right Ansatz Holds Uniformly

In many important cases the functions $V$ and $\tilde V$ are not $C^2$ functions and special attention is required for points where differentiability fails. Our goal in this section is to find conditions on the initial Riemann invariants $\alpha_0$ and $\beta_0$ such that Theorem 3.2 can be applied for t sufficiently large. Recall that $V$ and $\tilde V$ are determined by $\alpha_0$ and $\beta_0$ through Eqs. (2.4) and (2.7), respectively. We study initial data satisfying the following conditions (see Fig. 1 below).

I. The function $\beta_0(x)$ is continuous on [0, 1] and assumes its maximum at $x_B \in (0,1)$. It is a $C^2$ function on $[0,1] \setminus \{x_B\}$ with $\beta_0' > 0$ on $[0, x_B)$ and $\beta_0' < 0$ on $(x_B, 1]$. At the point $x_B$ the left and right limits of $\beta_0'$ and $\beta_0''$ exist with

\[ \beta_0'(x_B-) > 0, \qquad \beta_0'(x_B+) < 0. \tag{4.1} \]

II. The function $\alpha_0(x)$ is continuous on [0, 1] and assumes its minimum at $x_A \in (0,1)$. It is a $C^2$ function on $[0,1] \setminus \{x_A\}$ with $\alpha_0' < 0$ on $[0, x_A)$ and $\alpha_0' > 0$ on $(x_A, 1]$. At the point $x_A$ the left and right limits of $\alpha_0'$ and $\alpha_0''$ exist with

\[ \alpha_0'(x_A-) < 0, \qquad \alpha_0'(x_A+) > 0. \tag{4.2} \]

[Figure: graphs of β0(x) and α0(x) over 0 ≤ x ≤ 1, with the points xB and xA marked on the horizontal axis.]

Fig. 1. Example of initial Riemann invariants satisfying the Conditions I and II

Theorem 4.1. If the initial data $\alpha_0$, $\beta_0$ satisfy Conditions I and II, then for t large enough, the right ansatz holds for all $x \in (0,1)$, and the functions $\alpha(x,t)$ and $\beta(x,t)$ are continuous in x.

Proof. The proof is based on Theorem 3.2, so that we have to show that, for t large enough, both $(B - \lambda)(V'(\lambda) - t)$ and $(\lambda - A)(\tilde V'(\lambda) + t)$ increase on $[A, B]$. We discuss here in detail the behavior of $(B - \lambda)(V'(\lambda) - t)$, the other function being similar.


Without loss of generality we may assume that $\alpha_0(0) = \beta_0(0) = 0$. Then $A < 0 < B$. We start with some remarks on the function $V$. It is a non-negative continuous function on $[A, B]$ with $V(0) = 0$. It is differentiable on $(A, 0)$ and on $(0, B)$ with derivative

\[ V'(\lambda) = \int_0^{x_-(\lambda)} \frac{1}{\sqrt{(\lambda - \alpha_0(x))(\lambda - \beta_0(x))}}\, dx, \qquad \lambda \in (0, B), \tag{4.3} \]
\[ V'(\lambda) = -\int_0^{x_-(\lambda)} \frac{1}{\sqrt{(\lambda - \alpha_0(x))(\lambda - \beta_0(x))}}\, dx, \qquad \lambda \in (A, 0). \tag{4.4} \]

We see that $V'(\lambda)$ is positive for $\lambda > 0$ and negative for $\lambda < 0$. Hence $V$ assumes its minimum only at 0. [We could have taken (4.3) as the formula for $V'$ for all $\lambda$, if we would consider $\sqrt{(\lambda - \alpha_0(x))(\lambda - \beta_0(x))}$ as a complex function of the complex variable $\lambda$, which would take positive values for $\lambda > \beta_0(x)$ and negative values for $\lambda < \alpha_0(x)$. However, we chose here not to take that point of view. Instead all formulas are “real” formulas, and all square roots are positive.]

The values $V'(B)$ and $V'(A)$ exist. Indeed, we have

\[ V'(B) = \int_0^{x_B} \frac{1}{\sqrt{(B - \alpha_0(x))(B - \beta_0(x))}}\, dx \]

and the integral converges because of the assumption that $\beta_0'(x_B-) > 0$. Similarly, $V'(A)$ exists. Hence $V'$ is a continuous function on $[A, B] \setminus \{0\}$. At 0, $V'$ is not continuous, but the left and right limits at 0 exist. This may be seen from the formula

\[ V'(\lambda) = \int_0^1 \frac{x_-'(\lambda u)\sqrt{\lambda}}{\sqrt{\lambda - \alpha_0(x_-(\lambda u))}}\, \frac{du}{\sqrt{1 - u}}, \qquad 0 < \lambda < B, \tag{4.5} \]

which is obtained from (4.3) through the change of variables $x = x_-(\lambda u)$. As $\lambda \to 0+$, it is easy to see that

\[ x_-'(\lambda u) \to \frac{1}{\beta_0'(0)} \tag{4.6} \]

and

\[ \frac{\sqrt{\lambda}}{\sqrt{\lambda - \alpha_0(x_-(\lambda u))}} = \left( 1 - \frac{\alpha_0(x_-(\lambda u))}{\lambda} \right)^{-1/2} \to \left( 1 - \frac{\alpha_0'(0)}{\beta_0'(0)}\, u \right)^{-1/2}. \tag{4.7} \]

Thus

\[ V'(0+) = \frac{1}{\beta_0'(0)} \int_0^1 \left( 1 - \frac{\alpha_0'(0)}{\beta_0'(0)}\, u \right)^{-1/2} \frac{du}{\sqrt{1 - u}} = 2\, \frac{\arctan\sqrt{-\alpha_0'(0)/\beta_0'(0)}}{\sqrt{-\alpha_0'(0)\beta_0'(0)}}. \]

Similarly

\[ V'(0-) = -2\, \frac{\arctan\sqrt{-\beta_0'(0)/\alpha_0'(0)}}{\sqrt{-\alpha_0'(0)\beta_0'(0)}}. \]

Thus $V'$ has a jump at 0 of magnitude

\[ V'(0+) - V'(0-) = \frac{\pi}{\sqrt{-\alpha_0'(0)\beta_0'(0)}}. \tag{4.8} \]
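The closed forms for $V'(0\pm)$ and the jump (4.8) can be checked numerically. The sketch below is an illustration, not part of the proof: it assumes sample slopes `da` $= \alpha_0'(0) < 0$ and `db` $= \beta_0'(0) > 0$, removes the endpoint singularity with the substitution $u = 1 - s^2$, and uses a composite Simpson rule. The function names are hypothetical.

```python
import math


def simpson(f, lo, hi, m=2000):
    """Composite Simpson rule on [lo, hi] with an even number m of subintervals."""
    h = (hi - lo) / m
    total = f(lo) + f(hi)
    for j in range(1, m):
        total += (4.0 if j % 2 else 2.0) * f(lo + j * h)
    return total * h / 3.0


def right_limit_quadrature(da, db):
    """V'(0+) as (1/db) * int_0^1 (1 + c*u)^(-1/2) (1 - u)^(-1/2) du, c = -da/db.

    After u = 1 - s^2 the integrand becomes 2 / sqrt(1 + c*(1 - s^2)) on [0, 1],
    which is smooth.
    """
    c = -da / db
    return simpson(lambda s: 2.0 / math.sqrt(1.0 + c * (1.0 - s * s)), 0.0, 1.0) / db


def right_limit_closed(da, db):
    """Closed form for V'(0+) stated in the text."""
    return 2.0 * math.atan(math.sqrt(-da / db)) / math.sqrt(-da * db)


def left_limit_closed(da, db):
    """Closed form for V'(0-) stated in the text."""
    return -2.0 * math.atan(math.sqrt(-db / da)) / math.sqrt(-da * db)
```

The jump follows from $\arctan s + \arctan(1/s) = \pi/2$ for $s > 0$, so `right_limit_closed(da, db) - left_limit_closed(da, db)` should equal `math.pi / math.sqrt(-da * db)` for any admissible slopes.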

To study the differentiability of $V'$ we restrict ourselves to $\lambda \in (0, B]$, the case $\lambda \in [A, 0)$ being similar. We consider

\[ f(\lambda, u) = \frac{x_-'(\lambda u)\sqrt{\lambda}}{\sqrt{\lambda - \alpha_0(x_-(\lambda u))}}, \qquad 0 \le \lambda \le B, \quad 0 \le u \le 1, \tag{4.9} \]

which for $\lambda = 0$ is interpreted as the limit from above (which exists by (4.6) and (4.7)). In view of (4.5) we have

\[ V'(\lambda) = \int_0^1 f(\lambda, u)\, \frac{du}{\sqrt{1 - u}}. \]

The function f is continuous on the rectangle $R = \{(\lambda, u) : 0 \le \lambda \le B,\ 0 \le u \le 1\}$. From (4.9) we see that f is differentiable with respect to $\lambda$, as all functions in (4.9) are differentiable. Furthermore, at $\lambda = 0$, f is differentiable from the right, and at $\lambda = B$, it is differentiable from the left. However, there is one exception, which has to do with the fact that $\alpha_0$ is not differentiable at $x_A$. Thus $\alpha_0(x_-(\lambda u))$ is not differentiable with respect to $\lambda$ if $x_-(\lambda u) = x_A$. This happens if $x_A < x_B$ and $\lambda u = \beta_0(x_A)$. In that case it follows from the assumptions on $\alpha_0$ that the derivatives from the left and from the right exist. Thus $\partial f/\partial\lambda$ exists on $R \setminus \{\lambda u = \beta_0(x_A)\}$, is continuous there, and on the curve $\lambda u = \beta_0(x_A)$ its left and right limits exist. Then it easily follows that

\[ \frac{\partial f}{\partial \lambda}(\lambda, u)\, \frac{1}{\sqrt{1 - u}} \]

is integrable on R. By Fubini’s theorem, we then have for every $\lambda_0 \in (0, B]$,

\[ \int_0^{\lambda_0} \int_0^1 \frac{\partial f}{\partial \lambda}(\lambda, u)\, \frac{du}{\sqrt{1 - u}}\, d\lambda = \int_0^1 \int_0^{\lambda_0} \frac{\partial f}{\partial \lambda}(\lambda, u)\, d\lambda\, \frac{du}{\sqrt{1 - u}} = \int_0^1 \left[ f(\lambda_0, u) - f(0, u) \right] \frac{du}{\sqrt{1 - u}} = V'(\lambda_0) - V'(0+). \]

Thus $V'$ is differentiable on $(0, B]$ and

\[ V''(\lambda) = \int_0^1 \frac{\partial f}{\partial \lambda}(\lambda, u)\, \frac{du}{\sqrt{1 - u}} \]

is a continuous function on $(0, B]$ and the limit $V''(0+)$ exists. Similarly, we find that $V''$ is a continuous function on $[A, 0)$ and $V''(0-)$ exists.

It follows that $(B - \lambda)(V'(\lambda) - t)$ is differentiable on $[A, B] \setminus \{0\}$ with derivative $(B - \lambda)V''(\lambda) - V'(\lambda) + t$. The functions $V'$ and $V''$ are continuous on $[A, B] \setminus \{0\}$ and have left and right limits at 0. Therefore they are bounded. It follows that for

\[ t \ge T_1 := \sup_{\lambda \in [A,B] \setminus \{0\}} \left[ V'(\lambda) - (B - \lambda)V''(\lambda) \right] \]

the function $(B - \lambda)(V'(\lambda) - t)$ increases on $[A, 0)$ and on $(0, B]$. At $\lambda = 0$, $V'$ has a jump (4.8), from which it follows that

\[ \lim_{\lambda \to 0+} (B - \lambda)(V'(\lambda) - t) \ge \lim_{\lambda \to 0-} (B - \lambda)(V'(\lambda) - t). \]

Thus $(B - \lambda)(V'(\lambda) - t)$ increases on the full interval $[A, B]$ if $t \ge T_1$. In the same way, we obtain that $(\lambda - A)(\tilde V'(\lambda) + t)$ increases on $[A, B]$ for $t \ge T_2$, with

\[ T_2 := \sup_{\lambda \in [A,B] \setminus \{0\}} \left[ -\tilde V'(\lambda) - (\lambda - A)\tilde V''(\lambda) \right]. \]

Thus for $t \ge \max(T_1, T_2)$, both $(B - \lambda)(V'(\lambda) - t)$ and $(\lambda - A)(\tilde V'(\lambda) + t)$ increase on $[A, B]$ and the theorem follows because of Theorem 3.2.

Remark 4.2. The requirements that $\alpha_0$ and $\beta_0$ are $C^2$ functions at 0 imply that $\phi(\lambda)$ becomes infinite at $\lambda = 0$ (as in the proof of Theorem 4.1, we assume that $\alpha_0(0) = \beta_0(0) = 0$). In fact one may check that Conditions I and II imply that $\phi(\lambda)$ behaves like $\log(1/|\lambda|)$ for $\lambda$ near 0. Thus the accumulation of eigenvalues for the original Toda matrix L is greater at zero than at other points of the spectrum. This observation has a noteworthy dynamical consequence. Since $\phi$ is a constant of the motion, the blow-up of $\phi$ at $\lambda = 0$ should be visible in the evolving curves $\alpha(x,t)$, $\beta(x,t)$ at all times. Since the right ansatz holds for large time by Theorem 4.1, it follows that at large times, either $\alpha(\cdot, t)$ or $\beta(\cdot, t)$ has a zero derivative at the value of x where it is 0.

Remark 4.3. The inequalities (4.1) in Condition I imply that the function $\beta_0$ is not $C^2$ at the point $x_B$, but rather possesses a corner there. We will show in the next section that the conclusion of Theorem 4.1 does not hold if $\beta_0$ is a $C^2$ function on the full interval [0, 1]. It is similarly true that the conclusion of the theorem does not hold if $\alpha_0$ is a $C^2$ function on the full interval. The difference between the two cases has an interesting dynamical interpretation. The point $x_B$ where $\beta_0$ has its maximum moves to the left as t increases. If $\beta_0$ has a corner at $x_B$, then the top point hits the boundary x = 0 in finite time. This is a consequence of Theorem 4.1. On the other hand, if $\beta_0$ is a $C^2$ function, then the top point will not hit the boundary in finite time. A similar interpretation applies to $\alpha_0$.


5. Long Time Behavior: Right Ansatz Does Not Hold Uniformly

The goal of this section is to prove Theorem 5.3, from which it follows that for C² initial data α0 and β0, the right ansatz does not hold uniformly in x, no matter how large t is. To establish Theorem 5.3 we will use two results on the dependence of the maximizer ψ on the spatial parameter x, which are discussed first. Lemmas 5.1 and 5.2 hold generally and are not restricted to C² spectral data.

Lemma 5.1. Let V be continuous, and let φ be such that Lφ is continuous. Then for a fixed t, the maximizer ψ of the problem (1.11)–(1.12) increases with x.

Proof. See Proposition 4.1 of [18].

We are also interested in the behavior for x → 0+.

Lemma 5.2. Let V be continuous, and let φ be such that Lφ is continuous. We further assume that supp(φ) = [A, B]. Fix t ∈ R. The following are equivalent for λ0 ∈ [A, B]:
(a) λ0 is in the support of the maximizer ψ for every x ∈ (0, 1);
(b) the function V(λ) − tλ assumes its minimum at λ0.

Proof. Assume (b) holds. Let x > 0 and denote the corresponding maximizer by ψ. The function Lψ is harmonic on C \ supp(ψ). Since it tends to +∞ at infinity, the minimum principle for harmonic functions gives that Lψ assumes its minimum only in supp(ψ). Let λ1 ∈ supp(ψ) be a point where the minimum is assumed. Then, by (2.1) and (2.3),

\[ L\psi(\lambda_1) - V(\lambda_1) + t\lambda_1 \ge \ell. \]

If we assume that λ0 ∉ supp(ψ), then Lψ(λ1) < Lψ(λ0) and by (2.2)

\[ L\psi(\lambda_0) - V(\lambda_0) + t\lambda_0 \le \ell. \]

Combining the last three relations, we find V(λ0) − tλ0 > V(λ1) − tλ1, which contradicts (b). Thus λ0 ∈ supp(ψ) and (a) holds.

Next, assume that (a) holds. For x ∈ (0, 1), we use ψ(·; x) to denote the maximizer corresponding to x and ℓx to denote the constant appearing in (2.1)–(2.3). Then by (2.1) and (2.3) we have for every x ∈ (0, 1),

\[ L\psi(\lambda_0; x) - V(\lambda_0) + t\lambda_0 \ge \ell_x. \]

Letting x → 0, we have that Lψ(·; x) → 0 by the dominated convergence theorem, and thus

\[ -V(\lambda_0) + t\lambda_0 \ge \limsup_{x \to 0} \ell_x. \]

We are done if we can prove that

\[ \lim_{x \to 0} \ell_x = -\min\{ V(\lambda) - t\lambda : \lambda \in [A, B] \}. \tag{5.1} \]

Let λ1 be a point where V(λ) − tλ assumes its minimum. Then λ1 ∈ supp(ψ(·; x)) for every x ∈ (0, 1), by what has been proved before. Thus

\[ L\psi(\lambda_1; x) - V(\lambda_1) + t\lambda_1 \ge \ell_x. \tag{5.2} \]

For every x ∈ (0, 1), the set of λ values such that Lψ(λ; x) − V(λ) + tλ = ℓx is a non-empty closed set, because of the continuity of Lψ(λ; x). We let λx be a point closest to λ1 such that

\[ L\psi(\lambda_x; x) - V(\lambda_x) + t\lambda_x = \ell_x. \tag{5.3} \]

We claim that λx → λ1 as x → 0. Indeed, if this were not the case, there would be a sequence (xn) tending to 0 and an ε > 0 such that |λ_{xn} − λ1| > ε. Then using (5.2) and (5.3) we would find that Lψ(λ; xn) − V(λ) + tλ > ℓ_{xn} for all λ in the interval Δ = (λ1 − ε, λ1 + ε). Then ψ(·; xn) = φ in Δ by (2.3), so that

\[ x_n = \int \psi(\lambda; x_n)\,d\lambda \;\ge\; \int_{\Delta} \psi(\lambda; x_n)\,d\lambda = \int_{\Delta} \varphi(\lambda)\,d\lambda > 0. \]

This contradicts the fact that (xn) tends to 0. Therefore λx → λ1 as x → 0, as claimed. Now letting x → 0 in (5.3), we get

\[ -V(\lambda_1) + t\lambda_1 = \lim_{x \to 0} \ell_x, \]

which proves (5.1) by the definition of λ1. This completes the proof of the lemma.

Theorem 5.3. Suppose the initial upper Riemann invariant β0 is increasing on the interval [0, xB] and decreasing on the interval [xB, 1], and the lower Riemann invariant α0 is decreasing on the interval [0, xA] and increasing on the interval [xA, 1], where xA, xB ∈ (0, 1).
(a) If β0 is a C² function in a neighborhood of xB, then for every t > 0, there exists δ > 0 such that for every x ∈ (0, δ), the maximizer ψ vanishes in a neighborhood of B.
(b) If α0 is a C² function in a neighborhood of xA, then for every t > 0, there exists δ > 0 such that for every x ∈ (1 − δ, 1), the maximizer ψ = φ in a neighborhood of A.
Consequently, in both cases the right ansatz (1.14)–(1.16) is not valid for all x ∈ (0, 1), no matter how large t is. In case (a) it fails for x close to 0, and in case (b) it fails for x close to 1.


Proof. We will restrict our attention to the proof of part (a), as the proof of part (b) is similar. So we assume that β0 is a C² function in a neighborhood of xB. Then as in (4.3), we have

\[ V'(B) = \int_0^{x_B} \frac{1}{\sqrt{(B - \alpha_0(x))(B - \beta_0(x))}}\,dx. \tag{5.4} \]

As β0 is a C² function around xB and xB is the point where β0 has its maximum, we have

\[ B - \beta_0(x) = O\bigl((x_B - x)^2\bigr), \qquad (x \to x_B). \]

Hence the integral in (5.4) diverges and V′(B) = ∞. It follows that V(λ) − tλ does not assume its minimum at λ = B, no matter how large t is. Consequently, by Lemma 5.2, there is δ > 0 such that the maximizer ψ corresponding to normalization δ vanishes in a neighborhood of B. But then by Lemma 5.1, the maximizer corresponding to any smaller normalization also vanishes in this neighborhood, and thus part (a) of the theorem follows.

Example 5.4. The effect described in Theorem 5.3 is clearly visible in the following explicit solution of the continuous Toda equations (1.7):

\[ \alpha(x,t) = \Bigl( \sqrt{(1-x)p} - \sqrt{x(1-p)} \Bigr)^2, \tag{5.5} \]
\[ \beta(x,t) = \Bigl( \sqrt{(1-x)p} + \sqrt{x(1-p)} \Bigr)^2, \tag{5.6} \]

where

\[ p = p(t) = \frac{1}{1 + e^{-2t}}. \tag{5.7} \]

A straightforward calculation shows that (5.5) and (5.6) indeed satisfy (1.7). This example is related to Krawtchouk polynomials [10]. The corresponding initial data are

\[ \alpha_0(x) = \tfrac12 \bigl( \sqrt{1-x} - \sqrt{x} \bigr)^2 = \tfrac12 - \sqrt{x(1-x)}, \qquad \beta_0(x) = \tfrac12 \bigl( \sqrt{1-x} + \sqrt{x} \bigr)^2 = \tfrac12 + \sqrt{x(1-x)}. \]

The upper Riemann invariant β is smooth and has its maximum initially at x = 1/2, and at later times at

\[ x = 1 - p = \frac{1}{1 + e^{2t}}. \]

Similarly, α has its minimum at

\[ x = p = \frac{1}{1 + e^{-2t}}. \]

Thus for t > 0 the right ansatz holds for x in the interval

\[ [1-p,\,p] = \left[ \frac{1}{1 + e^{2t}},\; \frac{1}{1 + e^{-2t}} \right], \]

but not for x in [0, 1 − p) ∪ (p, 1].
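The location of the extrema in Example 5.4 can be checked numerically. The following is a minimal sketch, assuming the explicit formulas α = (√((1−x)p) − √(x(1−p)))² and β = (√((1−x)p) + √(x(1−p)))² with p = 1/(1 + e^{−2t}); the function names and the grid search are illustrative choices, not part of the paper. It does not verify the Toda equations (1.7) themselves, only that β peaks at x = 1 − p and α bottoms out at x = p.

```python
import math

def p_of_t(t):
    """p(t) = 1 / (1 + exp(-2t)), as in (5.7)."""
    return 1.0 / (1.0 + math.exp(-2.0 * t))

def alpha(x, t):
    """Lower Riemann invariant of the explicit solution (5.5)."""
    p = p_of_t(t)
    return (math.sqrt((1.0 - x) * p) - math.sqrt(x * (1.0 - p))) ** 2

def beta(x, t):
    """Upper Riemann invariant of the explicit solution (5.6)."""
    p = p_of_t(t)
    return (math.sqrt((1.0 - x) * p) + math.sqrt(x * (1.0 - p))) ** 2

def grid_argmin(fn, n=20001):
    """Grid point in [0, 1] at which fn is smallest (hypothetical helper)."""
    xs = [i / (n - 1) for i in range(n)]
    return min(xs, key=fn)

def extrema(t):
    """Return (p, argmax of beta, argmin of alpha) at time t."""
    x_beta_max = grid_argmin(lambda x: -beta(x, t))
    x_alpha_min = grid_argmin(lambda x: alpha(x, t))
    return p_of_t(t), x_beta_max, x_alpha_min

if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0):
        p, xb, xa = extrema(t)
        print(f"t={t}: 1-p={1 - p:.4f}, argmax beta={xb:.4f}, p={p:.4f}, argmin alpha={xa:.4f}")
```

As t grows, the two printed locations drift toward 0 and 1 respectively, so the interval [1 − p, p] on which the right ansatz holds widens but never covers all of (0, 1).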


6. Generation of Infinite Gaps from Smooth Initial Data

We show in this section how the previous results can be used to establish the existence of smooth C∞ initial data such that for some time t0 and some position x0, the maximizer is supported on an infinite union of disjoint intervals.

6.1. Construction of the external field. We start with the construction of a C∞ external field V0 such that the equilibrium measure in the presence of V0 (with normalization 1, and without upper constraint) is supported on infinitely many intervals.

Lemma 6.1. Define for k = 0, 1, 2, . . . ,

\[ a_k = \frac13 \left( \frac12 \right)^k, \qquad b_k = \frac12 \left( \frac12 \right)^k, \tag{6.1} \]

and put

\[ \Sigma = \{0\} \cup \bigcup_{k=0}^{\infty} [a_k, b_k]. \]

Then there are functions ψ0 and V0 on R with the following properties:
(a) ψ0 is a non-negative continuous function with support equal to Σ such that

\[ \int \psi_0(\lambda)\,d\lambda = 1. \tag{6.2} \]

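(b) V0 is C∞ on R and real analytic on R \ {0, b0, a0, b1, . . . }.
(c) Lψ0 = V0 on Σ and Lψ0 < V0 on R \ Σ.

Proof. The function ψ0 will be built out of translates and rescalings of the function

\[ f(\lambda) = \begin{cases} \dfrac{2}{\pi} \sqrt{1 - \lambda^2} & \text{for } \lambda \in [-1, 1], \\[4pt] 0 & \text{otherwise.} \end{cases} \tag{6.3} \]

It is well known that

\[ Lf(\lambda) = \lambda^2 - 1/2 - \log 2, \qquad \text{for } \lambda \in [-1, 1], \]
\[ Lf(\lambda) < \lambda^2 - 1/2 - \log 2, \qquad \text{for } \lambda \in \mathbb{R} \setminus [-1, 1] \]

(see, for example, [27, p. 240]). Thus Lf is real analytic on the intervals (−∞, −1), (−1, 1) and (1, ∞). Then there is a function W such that

\[ W \text{ is } C^\infty \text{ on } \mathbb{R} \text{ and real analytic on } \mathbb{R} \setminus \{-2, -1, 1, 3\}, \tag{6.4} \]

and

\[ Lf(\lambda) = W(\lambda), \qquad \text{for } \lambda \in (-\infty, -2] \cup [-1, 1] \cup [3, \infty), \tag{6.5} \]
\[ Lf(\lambda) < W(\lambda), \qquad \text{for } \lambda \in (-2, -1) \cup (1, 3). \tag{6.6} \]

Now for k = 0, 1, 2, . . . , we define

\[ c_k = \frac{a_k + b_k}{2} = \frac{5}{12} \left( \frac12 \right)^k, \qquad r_k = \frac{b_k - a_k}{2} = \frac{1}{12} \left( \frac12 \right)^k, \]

and

\[ f_k(\lambda) = f\!\left( \frac{\lambda - c_k}{r_k} \right), \tag{6.7} \]
\[ W_k(\lambda) = r_k \left[ W\!\left( \frac{\lambda - c_k}{r_k} \right) + \log r_k \right]. \tag{6.8} \]

Then the function fk is supported on the interval [ak, bk], and by a straightforward calculation,

\[ Lf_k(\lambda) = r_k \left[ Lf\!\left( \frac{\lambda - c_k}{r_k} \right) + \log r_k \right]. \tag{6.9} \]

From (6.5), (6.6), and the definitions (6.1) of ak and bk, it then follows that

\[ Lf_k(\lambda) = W_k(\lambda), \qquad \text{for } \lambda \in (-\infty, b_{k+1}] \cup [a_k, b_k] \cup [a_{k-1}, \infty), \tag{6.10} \]
\[ Lf_k(\lambda) < W_k(\lambda), \qquad \text{for } \lambda \in (b_{k+1}, a_k) \cup (b_k, a_{k-1}), \tag{6.11} \]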

where a_{−1} = 2/3. Furthermore, Wk is C∞ on R and real analytic on R \ {b_{k+1}, ak, bk, a_{k−1}} as a result of (6.4). Taking k = 0, we see that W0 is real analytic, except at the points 1/4, 1/3, 1/2 and 2/3. Then there exists a C∞ function Ŵ0 such that Ŵ0 = W0 on [0, 1/2], Ŵ0 > W0 on (−∞, 0) ∪ (1/2, ∞), and such that Ŵ0 is real analytic on R \ {0, 1/4, 1/3, 1/2}. It follows from (6.10)–(6.11) that

\[ Lf_0(\lambda) = \hat W_0(\lambda), \qquad \text{for } \lambda \in [0, 1/4] \cup [1/3, 1/2], \tag{6.12} \]
\[ Lf_0(\lambda) < \hat W_0(\lambda), \qquad \text{for } \lambda \in (-\infty, 0) \cup (1/4, 1/3) \cup (1/2, \infty). \tag{6.13} \]
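The classical identity for Lf quoted at the start of this proof can be sanity-checked numerically. The sketch below assumes that L denotes the logarithmic potential operator, (Lf)(λ) = ∫ log|λ − s| f(s) ds, which is defined earlier in the paper and is taken here as an assumption; the function names and the midpoint rule are illustrative choices. It evaluates the integral for the semicircle density f of (6.3) and compares it with λ² − 1/2 − log 2.

```python
import math

def semicircle_log_potential(lam, n=200_000):
    """Midpoint-rule estimate of the integral of log|lam - s| * (2/pi) * sqrt(1 - s^2) over [-1, 1].

    The substitution s = cos(theta) turns the weight into (2/pi) * sin(theta)^2 d(theta).
    """
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        gap = abs(lam - math.cos(theta))
        if gap == 0.0:
            # Skip a node that lands exactly on the (integrable) log singularity.
            continue
        total += math.log(gap) * math.sin(theta) ** 2
    return (2.0 / math.pi) * total * h

def closed_form(lam):
    """Right-hand side lam^2 - 1/2 - log 2 of the identity."""
    return lam * lam - 0.5 - math.log(2.0)

if __name__ == "__main__":
    for lam in (-0.8, 0.0, 0.3, 2.0):
        print(lam, semicircle_log_potential(lam), closed_form(lam))
```

For λ inside [−1, 1] the two columns should agree up to quadrature error, while for λ = 2 the numerical potential lies strictly below the quadratic, which is the gap that the function W is designed to bridge smoothly.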

Now we form the two infinite series

\[ \psi_0(\lambda) = \frac{12}{\sqrt{e}} \sum_{k=0}^{\infty} \frac{f_k(\lambda)}{k!}, \tag{6.14} \]
\[ V_0(\lambda) = \frac{12}{\sqrt{e}} \left[ \hat W_0(\lambda) + \sum_{k=1}^{\infty} \frac{W_k(\lambda)}{k!} \right]. \tag{6.15} \]

The factor 12/√e was taken in order to guarantee that (6.2) holds. Observe that by construction, the support of ψ0 is Σ, so that property (a) of the lemma holds. We note that V0 is a C∞ function, since Ŵ0 and each Wk is C∞ and the series (6.15) is uniformly convergent on compacts, as are the series with the derivatives of any order. Similarly, V0


is real analytic on each of the open intervals (ak, bk) and (b_{k+1}, ak) for k = 1, 2, . . . . Thus property (b) holds. We see from (6.10)–(6.15) that Lψ0 = V0 on Σ and that Lψ0 < V0 on each of the gaps (b_{k+1}, ak). Because of the modification of W0 to Ŵ0, we also have Lψ0 < V0 on (−∞, 0) and on (b0, ∞). Thus (c) holds.

From properties (a)–(c) of Lemma 6.1 it follows that ψ0 is the equilibrium measure in the presence of the external field V0. That is, it maximizes (Lψ, ψ) − 2(V0, ψ) among all non-negative functions ψ satisfying ∫ψ dλ = 1, see [5, 6, 27].

6.2. Construction of initial data. Let ψ0 and V0 be as in Lemma 6.1. We put

\[ \varphi(\lambda) = \begin{cases} \dfrac{2}{\pi} \sqrt{1 - \lambda^2} & \text{for } \lambda \in [-1, 1], \\[4pt] 0 & \text{otherwise.} \end{cases} \tag{6.16} \]

Since ψ0 is bounded with support Σ ⊂ [0, 1/2], there is an x0 ∈ (0, 1) such that

\[ x_0 \psi_0 < \varphi \qquad \text{on } (-1, 1). \tag{6.17} \]
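The bookkeeping behind the construction of ψ0 and the choice of x0 can be made concrete. The sketch below rests on the following assumptions drawn from Lemma 6.1: a_k = (1/3)2^{−k}, b_k = (1/2)2^{−k}, each rescaled bump f_k has total mass r_k, and ψ0 = (12/√e) Σ f_k/k! with bumps of disjoint support. The helper names and the explicit bound on x0 are illustrative; the bound is merely sufficient for (6.17), not the optimal value.

```python
import math
from fractions import Fraction

def a(k):
    """Left endpoint a_k = (1/3) * (1/2)^k of the k-th band."""
    return Fraction(1, 3) * Fraction(1, 2) ** k

def b(k):
    """Right endpoint b_k = (1/2) * (1/2)^k of the k-th band."""
    return Fraction(1, 2) * Fraction(1, 2) ** k

def c(k):
    """Centre of the k-th band."""
    return (a(k) + b(k)) / 2

def r(k):
    """Half-width of the k-th band; also the mass of f_k, since f integrates to 1."""
    return (b(k) - a(k)) / 2

def mass_psi0(terms=40):
    """Total mass (12 / sqrt(e)) * sum_k r_k / k!, truncated after `terms` terms."""
    partial = sum(float(r(k)) / math.factorial(k) for k in range(terms))
    return 12.0 / math.sqrt(math.e) * partial

def x0_sufficient_bound():
    """A value below which x0 * psi0 < phi certainly holds on (-1, 1).

    sup psi0 is attained at the centre of the k = 0 bump, and the semicircle
    density phi is decreasing on [0, 1/2], which contains the support of psi0.
    """
    sup_psi0 = 24.0 / (math.pi * math.sqrt(math.e))
    min_phi = (2.0 / math.pi) * math.sqrt(1.0 - 0.25)
    return min_phi / sup_psi0

if __name__ == "__main__":
    print("mass of psi0:", mass_psi0())
    print("any x0 below", x0_sufficient_bound(), "satisfies (6.17)")
```

The exact rational arithmetic also confirms that the affine map λ ↦ (λ − c_k)/r_k sends b_{k+1} and a_{k−1} to −2 and 3, the points at which W stops coinciding with Lf.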

We consider the external field x0V0 and the constraint φ on the interval [−1, 1]. The dual external field, see Lemma 2.1, is Lφ − x0V0 − C. Both x0V0 and Lφ are C∞ on [−1, 1] (in fact, Lφ is real analytic). Thus, by Corollary 3.4 (c), there exists a sufficiently negative time Tl such that the left ansatz holds for every t < Tl and every x ∈ (0, 1), with continuous functions α(·, t) and β(·, t). Choose t0 > −Tl and write

\[ V(\lambda) = x_0 V_0(\lambda) + t_0 \lambda, \qquad \lambda \in [-1, 1]. \tag{6.18} \]

Note that we then have

\[ V'(\lambda) > 0 \qquad \text{for all } \lambda \in [-1, 1]. \]

For the external field (6.18) and the constraint φ, the maximizer ψ(·; x, t) for the maximization problem (1.12)–(1.13) at time t ∈ (−∞, t0 + Tl) satisfies the left ansatz (3.4)–(3.6). Thus for every x ∈ (0, 1) and t < t0 + Tl, there exist α(x, t) and β(x, t) in [−1, 1] such that ψ(·; x, t) = φ on [−1, α(x, t)] and 0 < ψ(·; x, t) < φ on (α(x, t), β(x, t)). We take

\[ \alpha_0(x) = \alpha(x, 0), \qquad \beta_0(x) = \beta(x, 0), \tag{6.19} \]

as initial data, whose spectral functions φ and V are given by (6.16) and (6.18), respectively.

Lemma 6.2. Let x0 and Tl be defined as above. Then for every t0 > −Tl the following statements hold for the functions α0(x) and β0(x) from (6.19):
(a) α0 and β0 are continuous increasing functions on (0, 1) with

\[ -1 < \alpha_0(x) < \beta_0(x) < 1, \tag{6.20} \]
\[ \lim_{x \to 0+} \alpha_0(x) = \lim_{x \to 0+} \beta_0(x) = -1, \tag{6.21} \]
\[ \lim_{x \to 1-} \alpha_0(x) = \lim_{x \to 1-} \beta_0(x) = 1. \tag{6.22} \]


(b) The maximizer of the maximization problem (1.12)–(1.13) corresponding to x0 and t0 is equal to x0ψ0, and x0ψ0 is supported on an infinite number of intervals.

Proof. We already noted that the maximizer ψ(·; x, 0) at time t = 0 is equal to the constraint φ precisely on [−1, α0(x)], that it vanishes precisely on [β0(x), 1], and that the functions α0(x) and β0(x) are continuous in x. Since the maximizer increases with x by Lemma 5.1, it follows that both α0 and β0 are increasing functions of x. If α0(x) were equal to −1, then the constraint φ would not be active. In that case, an explicit formula for ψ would be

\[ \psi(\lambda; x) = \frac{1}{\pi \sqrt{(\beta - \lambda)(\lambda + 1)}} \left[ x + \frac{1}{\pi}\, \mathrm{P.V.}\!\int_{-1}^{\beta} \frac{V'(s) \sqrt{(\beta - s)(s + 1)}}{s - \lambda}\,ds \right], \]

where P.V. denotes a Cauchy principal value integral, see e.g. [14, 19]. Since V′(s) > 0, we then see that the maximizer ψ would have a square-root singularity at −1, which is clearly impossible since ψ ≤ φ. Thus α0(x) > −1. Similarly β0(x) < 1.

Now we follow the arguments of Deift and McLaughlin in [7, Chapter 4], where they consider decreasing initial data. If we modify their arguments to the case of increasing initial data, we find that α0(x) and β0(x) satisfy the equations

\[ T(\alpha, \beta) = 0, \qquad X(\alpha, \beta) = x, \tag{6.23} \]

where the functions T and X are defined by

\[ T(\alpha, \beta) = \frac{1}{\pi} \int_{\alpha}^{\beta} \frac{V'(\lambda)}{\sqrt{(\beta - \lambda)(\lambda - \alpha)}}\,d\lambda - \int_{-1}^{\alpha} \frac{\varphi(\lambda)}{\sqrt{(\beta - \lambda)(\alpha - \lambda)}}\,d\lambda, \tag{6.24} \]

and

\[ X(\alpha, \beta) = \frac{1}{\pi} \int_{\alpha}^{\beta} \frac{V'(\lambda) \left( \lambda - \frac{\alpha + \beta}{2} \right)}{\sqrt{(\beta - \lambda)(\lambda - \alpha)}}\,d\lambda - \int_{-1}^{\alpha} \frac{\varphi(\lambda) \left( \lambda - \frac{\alpha + \beta}{2} \right)}{\sqrt{(\beta - \lambda)(\alpha - \lambda)}}\,d\lambda. \tag{6.25} \]

If we let β → α+, then (6.24) becomes

\[ V'(\alpha) - \int_{-1}^{\alpha} \frac{\varphi(\lambda)}{\alpha - \lambda}\,d\lambda = -\infty. \]

Thus α0(x) < β0(x), and we have proved (6.20).

As the maximizer is equal to the constraint φ on the interval [−1, α0(x)], it is clear that α0(x) → −1 as x → 0+. Since the maximizer vanishes on [β0(x), 1], and ∫φ(λ)dλ = 1, we also find that β0(x) → +1 as x → 1−. Suppose that β0(x) → β > −1 as x → 0+. Then taking the limit x → 0+ in the equation T(α0(x), β0(x)) = 0, we find that

\[ \frac{1}{\pi} \int_{-1}^{\beta} \frac{V'(\lambda)}{\sqrt{(\beta - \lambda)(\lambda + 1)}}\,d\lambda = 0, \]

which is clearly impossible, since V′(λ) > 0. Thus β0(x) → −1 as x → 0+. Similar arguments, based on the dual problem, lead to the conclusion that α0(x) → 1 as x → 1−. Hence (6.21) and (6.22) are proved.


Finally, we note that 0 ≤ x0ψ0 ≤ φ by (6.17), and ∫x0ψ0 dλ = x0 by (6.2). By Lemma 6.1 (c) and (6.10) we have

\[ L(x_0 \psi_0)(\lambda) = x_0 L\psi_0(\lambda) \le x_0 V_0(\lambda) = V(\lambda) - t_0 \lambda, \]

with equality on the support of x0ψ0. The support of x0ψ0 is equal to the set Σ of Lemma 6.1 and it consists of an infinite number of intervals. This proves part (b) and completes the proof of Lemma 6.2.

Summarizing, for each choice of t0 > −Tl, we have constructed an external field V(λ) = x0V0(λ) + t0λ out of V0(λ) so that at t = 0, the maximization problem (1.12)–(1.13) is solved by the left ansatz for all x ∈ (0, 1), with α0(x) and β0(x) depending continuously on x.

Next we would like to establish the C∞ smoothness of the functions α0(x) and β0(x) (so far, we only know that they are continuous). For this, we will require that the parameter t0 be taken sufficiently large. We first observe that if we substitute λ = (α+β)/2 + ((β−α)/2)u in the first integral of (6.24) and λ = (α−1)/2 + ((α+1)/2)u in the second, so that T(α, β) takes the form

\[ T(\alpha, \beta) = \frac{1}{\pi} \int_{-1}^{1} \frac{V'\!\left( \frac{\alpha + \beta}{2} + \frac{\beta - \alpha}{2} u \right)}{\sqrt{1 - u^2}}\,du - \int_{-1}^{1} \frac{\sqrt{\frac{\alpha + 1}{2}}\; \varphi\!\left( \frac{\alpha - 1}{2} + \frac{\alpha + 1}{2} u \right)}{\sqrt{\left( \beta - \frac{\alpha - 1}{2} - \frac{\alpha + 1}{2} u \right)(1 - u)}}\,du, \]

and use the fact that V and φ are C∞ functions on [−1, 1], we find that T is a C∞ function for −1 < α < β < 1. Similarly, X is C∞.

Theorem 6.3. Let x0 and Tl be as in Lemma 6.2. Then there is T̂ > −Tl so that t0 > T̂ implies that the functions α0 and β0 corresponding to t0 as in (6.19) are C∞ smooth.

Proof. We recall from the proof of Lemma 6.2 that for each x and t0 > −Tl, the pair (α0(x), β0(x)) solves the pair of equations (6.23). Using (6.24), together with V′(λ) = x0V0′(λ) + t0, we may rewrite the function T as follows,

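\[ T(\alpha, \beta) = t_0 + \int_{\alpha}^{\beta} \frac{x_0 V_0'(\lambda)/\pi}{\sqrt{(\beta - \lambda)(\lambda - \alpha)}}\,d\lambda - \int_{-1}^{\alpha} \frac{\varphi(\lambda)}{\sqrt{(\beta - \lambda)(\alpha - \lambda)}}\,d\lambda. \tag{6.26} \]

Observe that the first integral on the right-hand side of (6.26) is uniformly bounded for all α < β in [−1, 1]:

\[ \min_{\lambda \in [-1,1]} x_0 V_0'(\lambda) \;\le\; \int_{\alpha}^{\beta} \frac{x_0 V_0'(\lambda)/\pi}{\sqrt{(\beta - \lambda)(\lambda - \alpha)}}\,d\lambda \;\le\; \max_{\lambda \in [-1,1]} x_0 V_0'(\lambda). \tag{6.27} \]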
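Both the constant term t0 in (6.26) and the two-sided bound (6.27) come from the fact that dλ/(π√((β − λ)(λ − α))) is a probability density on (α, β). A minimal numerical sketch of this fact follows; the function `arcsine_average` and the stand-in integrand `g` are hypothetical, with `g` playing the role of x0V0′, whose explicit form is not available here.

```python
import math

def arcsine_average(g, alpha, beta, n=2000):
    """Estimate (1/pi) * integral over (alpha, beta) of g(lam) / sqrt((beta - lam)(lam - alpha)).

    With lam = mid + half * cos(theta), the weight cancels exactly and the
    integral becomes (1/pi) * integral over (0, pi) of g(lam(theta)) d(theta),
    which is evaluated by the midpoint rule.
    """
    mid = 0.5 * (alpha + beta)
    half = 0.5 * (beta - alpha)
    h = math.pi / n
    total = sum(g(mid + half * math.cos((i + 0.5) * h)) for i in range(n))
    return total * h / math.pi

if __name__ == "__main__":
    alpha, beta = -0.3, 0.7
    # Constant integrand: the weight has total mass 1, which produces the term t0 in (6.26).
    print(arcsine_average(lambda lam: 1.0, alpha, beta))
    # A smooth stand-in for x0 * V0': its weighted average stays between its min and max.
    g = lambda lam: math.sin(3.0 * lam) + 0.2 * lam
    print(arcsine_average(g, alpha, beta))
```

Because the result is a weighted average, it is bounded by the extreme values of the integrand on [−1, 1] regardless of how close α and β are, which is the uniformity that the proof relies on.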


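Similarly, its partial derivatives are uniformly bounded. Differentiating (6.26) with respect to β, we find

\[
\begin{aligned}
T_\beta &= \frac{\partial}{\partial \beta} \int_{\alpha}^{\beta} \frac{x_0 V_0'(\lambda)/\pi}{\sqrt{(\beta - \lambda)(\lambda - \alpha)}}\,d\lambda + \int_{-1}^{\alpha} \frac{1/2}{\beta - \lambda}\, \frac{\varphi(\lambda)}{\sqrt{(\beta - \lambda)(\alpha - \lambda)}}\,d\lambda \\
&\ge \frac{\partial}{\partial \beta} \int_{\alpha}^{\beta} \frac{x_0 V_0'(\lambda)/\pi}{\sqrt{(\beta - \lambda)(\lambda - \alpha)}}\,d\lambda + \frac{1/2}{\beta + 1} \int_{-1}^{\alpha} \frac{\varphi(\lambda)}{\sqrt{(\beta - \lambda)(\alpha - \lambda)}}\,d\lambda \\
&= \frac{\partial}{\partial \beta} \int_{\alpha}^{\beta} \frac{x_0 V_0'(\lambda)/\pi}{\sqrt{(\beta - \lambda)(\lambda - \alpha)}}\,d\lambda + \frac{1/2}{\beta + 1} \left[ t_0 + \frac{1}{\pi} \int_{\alpha}^{\beta} \frac{x_0 V_0'(\lambda)}{\sqrt{(\beta - \lambda)(\lambda - \alpha)}}\,d\lambda - T(\alpha, \beta) \right].
\end{aligned}
\tag{6.28}
\]

Now assume α and β satisfy T(α, β) = 0. Then, since the integral ∫_α^β (x0V0′(λ)/π)/√((β − λ)(λ − α)) dλ and its derivative with respect to β are uniformly bounded, we have Tβ > 0 for t0 sufficiently big.

Similarly, we now show that for t0 sufficiently large, it follows that if α and β solve the equation T(α, β) = 0, then Tα < 0. We insert the definition (6.16) of φ into the second integral in (6.26), to obtain

\[ \int_{-1}^{\alpha} \frac{\varphi(\lambda)}{\sqrt{(\beta - \lambda)(\alpha - \lambda)}}\,d\lambda = \frac{2}{\pi} \int_{-1}^{\alpha} \frac{\sqrt{1 - \lambda^2}}{\sqrt{(\beta - \lambda)(\alpha - \lambda)}}\,d\lambda. \]

By contour integration this integral may be re-expressed as an integral over the interval [β, 1], which yields the following formula:

\[ \int_{-1}^{\alpha} \frac{\varphi(\lambda)}{\sqrt{(\beta - \lambda)(\alpha - \lambda)}}\,d\lambda = \alpha + \beta + \frac{2}{\pi} \int_{\beta}^{1} \frac{\sqrt{1 - \lambda^2}}{\sqrt{(\lambda - \beta)(\lambda - \alpha)}}\,d\lambda. \]

Now arguments quite similar to those used to prove that Tβ > 0 can be used to prove that Tα < 0, if T(α, β) = 0 and t0 is sufficiently large. We thus have shown that if t0 is sufficiently big, and if α and β solve T(α, β) = 0, then Tα < 0 and Tβ > 0.

For the partial derivatives of X, it follows as in [7, Chapter 4] that

\[ X_\alpha = -\frac{\beta - \alpha}{2}\, T_\alpha, \qquad X_\beta = \frac{\beta - \alpha}{2}\, T_\beta, \]

provided that α and β satisfy T(α, β) = 0. Therefore we learn that

\[ \det \begin{pmatrix} T_\alpha & T_\beta \\ X_\alpha & X_\beta \end{pmatrix} = (\beta - \alpha)\, T_\alpha T_\beta \ne 0 \]

for −1 < α < β < 1 solving T(α, β) = 0. Hence the Jacobian of the mapping (α, β) ↦ (T, X) is non-zero for t0 sufficiently large and T(α, β) = 0. Thus, recalling that α0(x) and β0(x) are continuous functions solving T(α, β) = 0 and X(α, β) = x, we deduce from the implicit function theorem that α0(x) and β0(x) are C∞ functions on (0, 1). This proves the theorem.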


Remark 6.4. Combining Lemma 6.2 and Theorem 6.3, we have constructed an example where an infinite gap solution arises out of C∞ initial data at a certain position x0 and time t0 > 0. We used the global description provided by the maximization problem (1.12)–(1.13), but we were not able to analyse the support of the maximizer in general for every x and t. Hence we do not know whether the infinite gap solution occurs at other values of x and t, or not. What we can say is that the conditions of Corollary 3.4 are satisfied. Thus for large enough time (larger than t0) the right ansatz holds uniformly for x ∈ (0, 1). Therefore, for large enough time, all gaps in the support of the maximizer have disappeared, and we again have C∞ functions α(x, t) and β(x, t). We are also able to analyse the maximizer at the fixed time t0, with varying x ∈ (0, 1). It turns out that for x ≠ x0, we have a finite gap solution provided that the maximizer does not meet the constraint φ. This will be discussed in the next subsection.

6.3. Deformation in the spatial variable x. We further study the external field V0 constructed in the proof of Lemma 6.1, for which the equilibrium measure is supported on infinitely many intervals. We will consider the equilibrium problem with normalization x > 0, and prove that the maximizer is supported on finitely many intervals for every x different from 1. As discussed in the Introduction, the normalization x corresponds to the spatial variable in the continuum limit of the Toda lattice.

In this subsection, we consider V0 as an external field defined on [−1, 1]. For each x > 0, we use ψ(·; x) to denote the maximizer with external field V0 and normalization x, i.e., ∫ψ(λ; x)dλ = x, and no upper constraint. We write Σx for the support of ψ(·; x). Recall that ψ(·; x) is increasing with x (cf. Lemma 5.1), and that ψ(·; 1) is equal to the function ψ0 from Lemma 6.1. Thus

\[ \Sigma_1 = \{0\} \cup \bigcup_{k=0}^{\infty} [a_k, b_k], \tag{6.29} \]

where ak and bk are given by (6.1).

Theorem 6.5. For every x > 0, x ≠ 1, the set Σx consists of a finite number of intervals.

Proof. We consider first the case x < 1. Then Σx ⊂ Σ1. First, we want to show that for all k sufficiently large, the interval [ak, bk] is disjoint from Σx. We use Lemma 5.7 of [29], from which it follows that

\[ \psi_0(\lambda) = \psi(\lambda; 1) \ge (1 - x)\, \frac{d\omega_{\Sigma_1}}{d\lambda}(\lambda), \qquad \text{for } \lambda \in \Sigma_x, \tag{6.30} \]

where ω_{Σ1} denotes the equilibrium measure without external field of the set Σ1, with normalization 1. Enlarging the set Σ1 to the interval [0, 1], we decrease the equilibrium measure on Σ1, and a fortiori on Σx. This property of equilibrium measures follows for example from Theorem IV.4.5 of [27]. So we have

\[ \frac{d\omega_{\Sigma_1}}{d\lambda}(\lambda) \ge \frac{d\omega_{[0,1]}}{d\lambda}(\lambda) = \frac{1}{\pi \sqrt{\lambda(1 - \lambda)}}, \qquad \text{for } \lambda \in \Sigma_x. \tag{6.31} \]


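For λ ∈ [ak, bk], we have

\[ \frac{1}{\pi \sqrt{\lambda(1 - \lambda)}} \ge \frac{1}{\pi \sqrt{b_k(1 - b_k)}} \ge \frac{1}{\pi \sqrt{b_k}} = \frac{2^{(k+1)/2}}{\pi}. \tag{6.32} \]

Now combining inequalities (6.30), (6.31), and (6.32), we find that

\[ \psi_0(\lambda) \ge (1 - x)\, \frac{2^{(k+1)/2}}{\pi}, \qquad \text{for } \lambda \in [a_k, b_k] \cap \Sigma_x. \tag{6.33} \]

On the other hand, from the construction of ψ0 in (6.3), (6.7), and (6.14), it is clear that

\[ \psi_0(\lambda) \le \frac{24}{\pi \sqrt{e}\, k!}, \qquad \text{for } \lambda \in [a_k, b_k]. \tag{6.34} \]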
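The competition between the lower bound (6.33), which grows like 2^{(k+1)/2}, and the upper bound (6.34), which decays like 1/k!, can be made concrete. The following sketch, with hypothetical function names, finds for a given x ∈ (0, 1) the first index k at which the two bounds become incompatible; since the ratio of the upper to the lower bound is decreasing in k, every later band is then excluded from the support as well. The index is only what these particular estimates give, not a sharp threshold.

```python
import math

def lower_bound(k, x):
    """Right-hand side of (6.33): (1 - x) * 2^((k+1)/2) / pi."""
    return (1.0 - x) * 2.0 ** ((k + 1) / 2.0) / math.pi

def upper_bound(k):
    """Right-hand side of (6.34): 24 / (pi * sqrt(e) * k!)."""
    return 24.0 / (math.pi * math.sqrt(math.e) * math.factorial(k))

def first_excluded_band(x):
    """Smallest k with upper_bound(k) < lower_bound(k, x), for 0 < x < 1."""
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    k = 0
    while upper_bound(k) >= lower_bound(k, x):
        k += 1
    return k

if __name__ == "__main__":
    for x in (0.1, 0.5, 0.9, 0.99):
        print(x, first_excluded_band(x))
```

As x approaches 1 the factor (1 − x) shrinks and the returned index grows, consistent with the full infinite family of bands being present only at x = 1.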

From (6.33) and (6.34), we learn that if

\[ \frac{24}{\pi \sqrt{e}\, k!} < (1 - x)\, \frac{2^{(k+1)/2}}{\pi}, \]

then [ak, bk] ∩ Σx is empty. This is clearly satisfied for k large enough. Thus we have shown that

\[ \Sigma_x \subset \bigcup_{k=1}^{k_x} [a_k, b_k], \]

for some finite kx. Next, it also follows from (6.30) that the points ak and bk do not belong to Σx. Indeed, we know that ψ0 vanishes at these points, and the density of ω_{Σ1} is infinite at these points. So we see that ak ∈ Σx or bk ∈ Σx would contradict (6.30). Thus [ak, bk] ∩ Σx is contained in [ak + δ, bk − δ] for some δ > 0. Since V0 is real analytic on (ak, bk), Theorem 1.38 of [6] gives that [ak, bk] ∩ Σx consists of a finite number of intervals (cf. [18]). So it follows that Σx consists of a finite number of intervals for all x ∈ (0, 1).

Now we turn to the case that x is bigger than 1. Fix x > 1, so that Σ1 ⊂ Σx. Our first goal is to show that for k large enough, the gaps (b_{k+1}, ak) of Σ1 are fully contained in Σx. To this end, we introduce external fields, one for each k ∈ N,

\[ Q_k = \begin{cases} V_0 - L\psi_0 & \text{on } [b_{k+1}, a_k], \\ 0 & \text{on } [-1, b_{k+1}] \cup [a_k, 1]. \end{cases} \tag{6.35} \]

This is a C^{1+1/2} external field on [−1, 1]. Let

\[ \eta_k(\lambda) = \frac{1}{\pi \sqrt{1 - \lambda^2}} \left[ x - 1 + \frac{1}{\pi}\, \mathrm{P.V.}\!\int_{b_{k+1}}^{a_k} \frac{Q_k'(s) \sqrt{1 - s^2}}{s - \lambda}\,ds \right], \tag{6.36} \]

where P.V. denotes the Cauchy principal value. Then by standard results on singular integral equations, see e.g. [14, §42.3], we have

\[ L\eta_k = Q_k \qquad \text{on } [-1, 1] \tag{6.37} \]


and

\[ \int_{-1}^{1} \eta_k(\lambda)\,d\lambda = x - 1. \tag{6.38} \]

We are going to show that ηk is non-negative on [−1, 1] if k is sufficiently large. From (6.10)–(6.15) and (6.35) we note that

\[ Q_k(\lambda) = \frac{12}{\sqrt{e}} \sum_{j=k}^{k+1} \frac{1}{j!} \bigl[ W_j(\lambda) - Lf_j(\lambda) \bigr], \qquad \text{for } \lambda \in [b_{k+1}, a_k]. \tag{6.39} \]

From this and (6.7)–(6.8) we compute that for λ ∈ [b_{k+1}, ak],

\[ Q_k'(\lambda) = \frac{12}{\sqrt{e}} \sum_{j=k}^{k+1} \frac{1}{j!} \left[ W'\!\left( \frac{\lambda - c_j}{r_j} \right) - (Lf)'\!\left( \frac{\lambda - c_j}{r_j} \right) \right]. \tag{6.40} \]

Inserting (6.40) into the principal value integral in the right-hand side of (6.36) and making a suitable transformation for each term separately, we arrive at the following principal value integrals:

\[ \frac{12}{\pi \sqrt{e}\, k!}\, \mathrm{P.V.}\!\int_{-2}^{-1} \frac{\bigl[ W'(t) - (Lf)'(t) \bigr] \sqrt{1 - (c_k + r_k t)^2}}{t - \zeta}\,dt, \tag{6.41} \]
\[ \frac{12}{\pi \sqrt{e}\, (k+1)!}\, \mathrm{P.V.}\!\int_{1}^{3} \frac{\bigl[ W'(t) - (Lf)'(t) \bigr] \sqrt{1 - (c_{k+1} + r_{k+1} t)^2}}{t - \zeta}\,dt, \tag{6.42} \]

where in the first integral λ = ck + rkζ, and in the second integral λ = c_{k+1} + r_{k+1}ζ. The functions (6.41) and (6.42) are Hilbert transforms of Hölder continuous functions, and therefore they are also Hölder continuous, and they decay to 0 for |ζ| → ∞, uniformly with respect to k. Using the continuity property of the Hilbert transform on Hölder continuous functions, we easily see that both (6.41) and (6.42) are O(1/k!), as k → ∞, uniformly in ζ. Then it is clear from the definition (6.36) of ηk, that there exists kx ∈ N such that

\[ \eta_k > 0, \qquad \text{for all } k \ge k_x. \tag{6.43} \]

Combining (6.35), (6.37), (6.38), (6.43) with Lemma 6.1, we see that for k ≥ kx,

\[ \psi_0 + \eta_k > 0 \qquad \text{on } [-1, 1], \tag{6.44} \]
\[ \int_{-1}^{1} \bigl( \psi_0(\lambda) + \eta_k(\lambda) \bigr)\,d\lambda = x, \tag{6.45} \]

and

\[ L(\psi_0 + \eta_k) \;\begin{cases} = V_0 & \text{on } [b_{k+1}, a_k], \\ \le V_0 & \text{on } [-1, 1]. \end{cases} \tag{6.46} \]

We also note that ψ0 + ηk ≠ ψ(·; x), since strict inequality occurs in (6.46) in each of the gaps (b_{j+1}, aj) with j ≠ k, and supp(ψ0 + ηk) = [−1, 1]. From (6.44)–(6.46) it then follows by Lemma 2.2 of [4] that

\[ [b_{k+1}, a_k] \subset \operatorname{supp}(\psi(\cdot; x)) = \Sigma_x \qquad \text{for } k \ge k_x. \]


Since Σ1 ⊂ Σx, we conclude that [0, b_{kx}] ⊂ Σx. Thus for each x > 1, a full interval around 0 is contained in Σx.

To conclude that Σx consists of a finite number of intervals we are now left with the intervals [b_{kx}, 1] and [−1, 0]. The bands [ak, bk] remain in the support Σx. It is thus enough to show that for each k < kx the set Σx ∩ [b_{k+1}, ak] consists of a finite number of intervals, and similarly for Σx ∩ [−1, 0]. To this end, we note that ψ(·; x) − ψ0 is a non-negative function with

\[ L(\psi(\cdot; x) - \psi_0) = V_0 - L\psi_0 + \ell_x \qquad \text{on } \Sigma_x, \]

and inequality ≤ on [−1, 1]. Thus ψ(·; x) − ψ0 is the maximizer for the external field V0 − Lψ0 and normalization x − 1. This external field is zero on each interval [ak, bk], and convex in a neighborhood of each ak and bk. It then follows that some interval [ak − δ, bk + δ] is also contained in Σx. In a neighborhood of the remaining gaps [b_{k+1} + δ, ak − δ], the external field V0 is real analytic, and so by Theorem 1.38 of [6], Σx ∩ [b_{k+1} + δ, ak − δ] consists of a finite union of intervals. Similarly, V0 − Lψ0 is convex in a left neighborhood of 0, and a left neighborhood [−δ, 0] is also contained in Σx. The external field V0 is real analytic on [−1, −δ] and again by [6, Theorem 1.38] it follows that Σx ∩ [−1, −δ] consists of a finite union of intervals. Thus we have shown that for x > 1 the support Σx is a finite union of disjoint closed intervals.

Remark 6.6. Theorem 6.5 has the following consequence for the continuum limit of the Toda lattice with the initial data α0 and β0 considered in Subsect. 6.2. We showed in Lemma 6.2 (b) that x0ψ0 is the maximizer at time t0 and position x0. In the same way it follows that xψ(·; x) is the maximizer at time t0 and position x provided that it satisfies the constraint

\[ x \psi(\cdot; x) \le \varphi. \tag{6.47} \]

Using the fact that x0ψ0 < φ on (−1, 1), see (6.17), we may prove as in [18, Lemma 4.10] that there exists δ > 0 such that (6.47) holds with strict inequality on (−1, 1) for every x < x0 + δ. This implies that in the example of Subsect. 6.2 the infinite gap solution holds at time t0 at x0, but not at other x values less than x0 + δ. Going from x0 to x < x0, we have that an infinite number of bands disappear, while going from x0 to x ∈ (x0, x0 + δ), we have that an infinite number of gaps close. For x > x0 + δ, the upper constraint becomes active, and we are not able to analyse this more complicated situation.

Acknowledgements. Arno Kuijlaars was supported in part by FWO research project G.0278.97, and a research grant of the Fund for Scientific Research – Flanders. He is grateful to K. T.-R. McLaughlin for the support and hospitality during a visit to the University of Arizona. Kenneth T.-R. McLaughlin was supported in part by NSF postdoctoral fellowship grant # DMS-9508946 and NSF grant # DMS-9970328. He thanks the faculty and staff of the Princeton University Mathematics Department and MSRI for their support and hospitality, and thanks A. Kuijlaars and W. Van Assche for their hospitality and support during visits to K. U. Leuven.


References

1. Bardos, C., Ghidaglia, J.-M. and Kamvissis, S.: Weak convergence and deterministic approach to turbulent diffusion. In: Nonlinear wave equations (Yan Guo ed.), Contemp. Math. 263, Providence RI: AMS, 2000, pp. 1–15
2. Bloch, A., Golse, F., Paul, T. and Uribe, A.: Dispersionless Toda and Toeplitz operators. Preprint
3. Brockett, R.W. and Bloch, A.: Sorting with the dispersionless limit of the Toda lattice. In: Hamiltonian systems, transformation groups and spectral transform methods (Montreal, 1989), Montreal: Univ. Montréal, 1990, pp. 103–112
4. Damelin, S.B. and Kuijlaars, A.B.J.: The support of the equilibrium measure in the presence of a monomial external field on [−1, 1]. Trans. Am. Math. Soc. 351, 4561–4584 (1999)
5. Deift, P.: Orthogonal polynomials and random matrices: a Riemann–Hilbert approach. Courant Lecture Notes in Mathematics 3, New York: Courant Institute, 1999
6. Deift, P., Kriecherbauer, T. and McLaughlin, K.T.-R.: New results on the equilibrium measure for logarithmic potentials in the presence of an external field. J. Approx. Theory 95, 388–475 (1998)
7. Deift, P. and McLaughlin, K.T.-R.: A continuum limit of the Toda lattice. Mem. Am. Math. Soc. 131, no. 624 (1998)
8. Deift, P., Venakides, S. and Zhou, X.: New results in small dispersion KdV by an extension of the steepest descent method for Riemann–Hilbert problems. Internat. Math. Research Notices 6, 286–299 (1997)
9. Dragnev, P.D. and Saff, E.B.: Constrained energy problems with applications to orthogonal polynomials of a discrete variable. J. Anal. Math. 72, 223–259 (1997)
10. Dragnev, P.D. and Saff, E.B.: A problem in potential theory and zero asymptotics of Krawtchouk polynomials. J. Approx. Theory 102, 120–140 (2000)
11. Ercolani, N., Levermore, C.D. and Zhang, T.: The behavior of the Weyl function in the zero-dispersion KdV limit. Commun. Math. Phys. 183, 119–143 (1997)
12. Flaschka, H.: On the Toda lattice II. Prog. Theor. Phys. 51, 703–716 (1974)
13. Flaschka, H., Forest, M.G. and McLaughlin, D.W.: Multiphase averaging and the inverse spectral solutions of the Korteweg–de Vries equation. Comm. Pure Appl. Math. 33, 739–784 (1980)
14. Gakhov, F.: Boundary value problems. Oxford: Pergamon Press, 1966
15. Holian, B.L., Flaschka, H. and McLaughlin, D.W.: Shock waves in the Toda lattice: analysis. Phys. Rev. A 24, 2595–2623 (1981)
16. Jin, S., Levermore, C.D. and McLaughlin, D.W.: The semiclassical limit of the defocusing NLS hierarchy. Comm. Pure Appl. Math. 52, 613–654 (1999)
17. Kamvissis, S.: On the Toda shock problem. Phys. D 65, 242–266 (1993)
18. Kuijlaars, A.B.J.: On the finite gap ansatz in the continuum limit of the Toda lattice. Duke Math. J. 104, 433–462 (2000)
19. Kuijlaars, A.B.J. and Dragnev, P.D.: Equilibrium problems associated with fast decreasing polynomials. Proc. Am. Math. Soc. 127, 1065–1074 (1999)
20. Kuijlaars, A.B.J. and McLaughlin, K.T.-R.: Generic behavior of the density of states in random matrix theory and equilibrium problems in the presence of real analytic external fields. Comm. Pure Appl. Math. 53, 736–785 (2000)
21. Kuijlaars, A.B.J. and Van Assche, W.: The asymptotic zero distribution of orthogonal polynomials with varying recurrence coefficients. J. Approx. Theory 99, 167–197 (1999)
22. Lax, P.D. and Levermore, C.D.: The small dispersion limit of the Korteweg–de Vries equation I, II, III. Comm. Pure Appl. Math. 36, 253–290, 571–593, 809–829 (1983)
23. Manakov, S.V.: Complete integrability and stochastization of discrete dynamical systems. Zh. Exp. Teor. Fiz. 67, 543–555 (1974)
24. Moser, J.: Finitely many mass points on the line under the influence of an exponential potential – an integrable system. In: Dynamical Systems, Theory and Applications (J. Moser ed.), Lect. Notes in Phys. 38, Berlin: Springer, 1975, pp. 467–497
25. Rakhmanov, E.A.: Equilibrium measure and the distribution of zeros of the extremal polynomials of a discrete variable. Mat. Sb. 187, 109–124 (1996); English transl. in Sb. Math. 187, 1213–1228 (1996)
26. Ransford, T.: Potential theory in the complex plane. Cambridge: Cambridge University Press, 1995
27. Saff, E.B. and Totik, V.: Logarithmic Potentials with External Fields. New York: Springer-Verlag, 1997
28. Shipman, S.P.: Modulated waves in a semiclassical continuum limit of an integrable NLS chain. Comm. Pure Appl. Math. 53, 243–279 (2000)
29. Totik, V.: Weighted Approximation with Varying Weight. Lecture Notes in Math. 1569, Berlin: Springer-Verlag, 1994
30. Venakides, S.: Higher order Lax–Levermore theory. Comm. Pure Appl. Math. 43, 335–362 (1990)
31. Venakides, S., Deift, P. and Oba, R.: The Toda shock problem. Comm. Pure Appl. Math. 44, 1171–1242 (1991)
32. Whitham, G.B.: Linear and Nonlinear Waves. New York: Wiley, 1974

Communicated by M. Aizenman

Commun. Math. Phys. 221, 335 – 349 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

A Generic C1 Expanding Map has a Singular S–R–B Measure James T. Campbell , Anthony N. Quas Department of Mathematical Sciences, University of Memphis, Memphis, TN 38152-3240, USA. E-mail: [email protected]; [email protected] Received: 8 December 2000 / Accepted: 27 March 2001

Abstract: We show that for a generic $C^1$ expanding map $T$ of the unit circle, there is a unique equilibrium state for $-\log T'$ that is an S–R–B measure for $T$, and whose statistical basin of attraction has Lebesgue measure 1. We also present some results related to the question of whether a generic $C^1$ expanding map preserves a σ-finite measure, absolutely continuous with respect to Lebesgue measure.

1. Introduction

Let $\mathcal{E}^k$ denote the set of $C^k$ expanding maps of the unit circle $S^1$ onto itself, $k = 1, 2, \ldots$. Expanding maps have been widely studied in ergodic theory. In particular, various cases with $k \ge 2$ have been studied by a large number of authors including Rényi ([17], 1965), Krzyżewski ([9], 1971), and Krzyżewski and Szlenk ([11], 1969). A typical result says that an expanding map with $C^2$ regularity has a unique absolutely continuous invariant measure with strong ergodic properties. These results have been extended to the case of $C^{1+\alpha}$ expanding maps of the circle (maps with a Hölder continuous derivative) and even to maps satisfying weaker regularity conditions. More recently Góra ([5], 1994) proved results of this type under the Dini condition.

A later result of Krzyżewski ([10], 1979) gave the first indication that the situation for $C^1$ expanding maps differs from that of the smoother maps. Namely, he showed that within the set of expanding $C^1$ self-maps of any manifold, the set of such maps for which there is an absolutely continuous invariant probability measure, with continuous density bounded away from 0, is meager. (That is, its complement is generic, i.e., contains a dense $G_\delta$ set with respect to the $C^1$ topology.) This theme was taken up by Góra and Schmitt ([4], 1989), who showed that there is an example of an expanding $C^1$ map of the circle that has no absolutely continuous invariant probability measure.
(J. Campbell is partially supported by NSF Grant #DMS–9801602.)

In further studies of $C^1$ expanding maps of the circle by Quas ([15, 13, 14], all 1996), maps with, respectively, more than one absolutely continuous invariant measure and a non-weak-mixing invariant measure were constructed; and it was shown that a dense set of $C^1$ expanding maps have a unique absolutely continuous invariant probability with unbounded density. In [2] (1998), Bruin and Hawkins constructed an example of an expanding $C^1$ map of the circle with no σ-finite absolutely continuous invariant measure (finite or infinite). In a more recent paper of Quas ([16], 1999) it was shown that a generic $C^1$ expanding map of the circle has no absolutely continuous invariant probability measure. Our main result shows that despite this result, there is (generically) a singular invariant probability from which properties of Lebesgue almost every orbit can be obtained.

Theorem 1. For a generic $T \in \mathcal{E}^1$, there is a unique equilibrium measure $\mu_T$ for the potential $-\log T'$. This $T$-invariant probability measure has the following properties:
1. For a set of points $S$ of Lebesgue measure 1, for all $f \in C^0(S^1)$, the averages $\frac{1}{n}\sum_{k=0}^{n-1} f(T^k x)$ converge to $\int f \, d\mu_T$ for all $x \in S$.
2. The measure $\mu_T$ is singular with respect to Lebesgue measure.
3. For each non-empty open set $U$, $\mu_T(U) > 0$.

In other words, a generic $T \in \mathcal{E}^1$ possesses a fully supported singular Sinai–Ruelle–Bowen measure whose statistical basin of attraction has Lebesgue measure 1.

A natural question is whether the result from [16] may be extended from probability measures to σ-finite measures; i.e., is it true that generically in $\mathcal{E}^1$, there is no absolutely continuous invariant measure? At the moment, we do not know the answer, but we include the following trio of results that give some information about this situation.

Silva [19] introduced a notion of recurrence for a measure with respect to a non-singular transformation. To define this in our setting, let $h$ be the density of $\lambda \circ T^{-1}$ with respect to Lebesgue measure ($h = d\lambda \circ T^{-1}/d\lambda$), and set $\omega_n(x) = 1/\prod_{j=1}^{n} h \circ T^j(x)$. Then $\omega_n > 0$ on $S^1$ and $\int \omega_n \, d\lambda = 1$, $n = 1, 2, \ldots$. Lebesgue measure is recurrent for $T$ if the quantity $\sum_{n=1}^{\infty} \omega_n(x)$ is infinite for λ-a.e. $x \in S^1$. (We caution the reader that this notion of recurrence is much stronger than Poincaré recurrence. For example there exist $C^2$ expanding maps of $S^1$ for which Lebesgue measure is not recurrent in this sense.) This recurrence property is relevant to the question of the existence of invariant measures as follows. If one can establish that a measure is recurrent for a non-invertible map, then existence or non-existence of absolutely continuous, σ-finite invariant measures for the map can be decided using a version of Krieger's ratio set (see Hawkins and Silva [6] for a proof of this result).

Theorem 2. For a generic subset of $\mathcal{E}^1$, Lebesgue measure is not recurrent.

Recall that a measure µ is locally infinite if $\mu(I) = \infty$ for each open interval $I$.

Theorem 3. For a generic $T \in \mathcal{E}^1$, any absolutely continuous invariant measure is locally infinite.

To describe the next result in this direction, let $h_n(x)$ be the density of $\lambda \circ T^{-n}$ with respect to λ: $h_n(x) = d\lambda \circ T^{-n}/d\lambda(x)$. Set
$$\mathcal{S}_{\epsilon,n,a} = \{T \in \mathcal{E}^1 : \lambda\{x : h_n(x) \in [a, 2a]\} < \epsilon\},$$
and consider the collection
$$\mathcal{S} = \bigcap_{\epsilon > 0} \bigcup_{n \in \mathbb{N}} \bigcap_{a > 0} \mathcal{S}_{\epsilon,n,a}.$$

If $T \in \mathcal{S}$, we say the densities of $\lambda \circ T^{-n}$ have no characteristic scale. This is because for such a $T$ and for any $\epsilon > 0$, there exists an $n$ such that for each $a > 0$, the set $\{x : h_n(x) \in [a, 2a]\}$ has Lebesgue measure less than $\epsilon$. It is known that there exist mappings with an infinite invariant measure so that the above densities $h_n$, when appropriately rescaled, converge in measure to the invariant density (see Aaronson's book [1] for examples). One can see that when $T$ belongs to the class $\mathcal{S}$ defined above, this is impossible. Therefore, when $T$ belongs to $\mathcal{S}$, a natural way of producing an absolutely continuous invariant measure is lost.

Theorem 4. The set $\mathcal{S}$ constructed above is a dense $G_\delta$ subset of $\mathcal{E}^1$.

In the next section we give some notation and definitions, in Sect. 3 we state and prove some preliminary lemmas, in Sect. 4 we prove Theorems 1, 2, 3, and in Sect. 5 we prove Theorem 4.

2. Notation & Definitions

We work on $S^1 = [0, 1]/\sim$, where $\sim$ identifies 0 with 1. The Borel sigma-algebra is denoted by $\mathcal{B}$. The space of Borel measures on $S^1$ is denoted by $\mathcal{M}$, with $\mathcal{M}^1$ denoting the subspace of probabilities. If $T \in \mathcal{E}^1$, $\mathcal{M}^1_T$ denotes the set of Borel probability measures that are invariant under $T$. For $\nu \in \mathcal{M}^1_T$, the measure-theoretic entropy of $T$ with respect to ν is denoted by $h_\nu(T)$, or $h_\nu$ if $T$ is understood. For a continuous function $f : S^1 \to \mathbb{R}$, the pressure of $f$ (with respect to $T$) is given by
$$P_T(f) = \sup_{\nu \in \mathcal{M}^1_T} \left( h_\nu(T) + \int f \, d\nu \right).$$
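As a hedged illustration of the variational expression in the pressure formula above, the following Python sketch evaluates $h_\nu + \int f\,d\nu$ for the linear map $E_k(x) = kx \bmod 1$ and the constant potential $f = -\log k$. It is an assumption of this sketch that ν ranges only over Bernoulli measures (base-$k$ digits chosen independently with weights `p`), for which the entropy is the Shannon entropy of `p`; the supremum in the definition runs over all invariant measures. The function name `bernoulli_free_energy` is hypothetical.

```python
import math

def bernoulli_free_energy(p, k):
    """h_nu + integral of f d(nu) for E_k(x) = k*x mod 1 and f = -log k,
    where nu is the Bernoulli measure with digit weights p (an assumption
    of this sketch; the pressure itself is a sup over all invariant nu)."""
    assert len(p) == k and abs(sum(p) - 1.0) < 1e-12
    # Entropy of a Bernoulli measure is the Shannon entropy of its weights.
    entropy = -sum(pi * math.log(pi) for pi in p if pi > 0)
    # f is constant, so its integral against any probability is -log k.
    return entropy - math.log(k)

k = 3
uniform = [1.0 / k] * k   # corresponds to Lebesgue measure
skewed = [0.7, 0.2, 0.1]
print(bernoulli_free_energy(uniform, k))
print(bernoulli_free_energy(skewed, k))
```

Within this restricted family the value is never positive and vanishes only for uniform weights, which matches the expectation that Lebesgue measure is the equilibrium state of $-\log E_k'$ for the linear map.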

An equilibrium state for $f$ is an element $\mu \in \mathcal{M}^1_T$ satisfying $P_T(f) = h_\mu + \int f \, d\mu$. Recall that a Borel measure µ is called a Sinai–Ruelle–Bowen measure for $T \in \mathcal{E}^1$ if there exists a subset $B$ of $S^1$ of positive Lebesgue measure such that for each $f \in C^0(S^1)$ and all $x \in B$,
$$\frac{1}{n} \sum_{k=0}^{n-1} f(T^k(x)) \to \int f \, d\mu.$$
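The Birkhoff averages in this definition can be explored numerically. The sketch below is only an illustration under stated assumptions: the map `T` is a hypothetical smooth degree-2 expanding map chosen for this example (it is $C^\infty$, so its limiting measure is absolutely continuous, unlike the generic $C^1$ situation of Theorem 1), and floating-point orbits only approximate true orbits.

```python
import math

A = 0.5  # perturbation size (assumed); T'(x) = 2 + A*cos(2*pi*x) >= 1.5 > 1

def T(x):
    """A smooth degree-2 expanding map of the circle [0, 1)."""
    return (2.0 * x + A * math.sin(2.0 * math.pi * x) / (2.0 * math.pi)) % 1.0

def birkhoff_average(f, x, n):
    """(1/n) * sum_{k=0}^{n-1} f(T^k x)."""
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = T(x)
    return total / n

f = lambda x: math.cos(2.0 * math.pi * x)
avg1 = birkhoff_average(f, 0.1234567, 200_000)
avg2 = birkhoff_average(f, 0.7654321, 200_000)
print(avg1, avg2)
```

For typical starting points the two averages should be close to each other, which is the numerical counterpart of both points lying in the statistical basin of the same measure.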

The set $B$ is called the statistical basin of attraction of µ. For each $T \in \mathcal{E}^1$, $T'$ is a continuous function whose absolute value is strictly larger than 1. Since $S^1$ is connected, $\mathcal{E}^1$ decomposes into two disjoint open subsets, the first consisting of those $T$ for which $T' > 1$, the other of those $T$ for which $T' < -1$. Each of these sets has countably many open components, corresponding to the maps of degree $k$ ($k = 2, 3, \ldots$, and $k = -2, -3, \ldots$, respectively). In some of our arguments, we want to prove, say, that a subset of $\mathcal{E}^1$ with a certain property is generic. We proceed by supposing that $T' > 1$ and the degree is a fixed but arbitrary integer $k > 1$, and proving that within the corresponding component, the set is generic. Since a practically identical argument (with only the obvious minor modifications) will hold for $T' < -1$ and $k \le -2$, and the components partition $\mathcal{E}^1$, the general result will follow. With these conventions in place we set $\phi_T = \phi = -\log(T') < 0$. We define the Perron–Frobenius operator, or transfer operator, $\mathcal{L}_T$ by
$$\mathcal{L}_T f(x) = \sum_{Ty = x} \frac{f(y)}{|T'(y)|}.$$
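The transfer operator just defined can be sketched numerically. The following Python example is an illustration under assumptions: it uses a hypothetical smooth degree-2 map (not one from the paper), finds the two preimages of each point by bisection on the lift, and checks numerically the standard identity $\int \mathcal{L}_T f \, d\lambda = \int f \, d\lambda$.

```python
import math

A = 0.5  # assumed perturbation size; the lift has derivative >= 1.5

def lift(y):
    """Lift F of T to [0, 1]: increasing, with F(0) = 0 and F(1) = 2."""
    return 2.0 * y + A * math.sin(2.0 * math.pi * y) / (2.0 * math.pi)

def dT(y):
    """Derivative T'(y)."""
    return 2.0 + A * math.cos(2.0 * math.pi * y)

def solve(target):
    """Bisection for F(y) = target with y in [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if lift(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def transfer(f, x):
    """(L_T f)(x): sum of f(y)/|T'(y)| over the two preimages y of x."""
    return sum(f(y) / abs(dT(y)) for y in (solve(x), solve(x + 1.0)))

def integrate(g, m=2000):
    """Midpoint rule on [0, 1]."""
    return sum(g((i + 0.5) / m) for i in range(m)) / m

f = lambda y: 1.0 + 0.3 * math.cos(2.0 * math.pi * y)
print(integrate(lambda x: transfer(f, x)), integrate(f))
```

Bisection is used here only because the lift is monotone; any root finder would do. With `A = 0` the map is the linear map $E_2$ and `transfer` applied to the constant function 1 returns 1.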

For now we do not specify the space containing $f$ or $\mathcal{L}_T f$. These will depend upon the context in which they are being used, and will be designated as needed in the development. We repeatedly use the fact (proved in [9]) that for each $T \in \mathcal{E}^2$, there exists a unique, absolutely continuous $\mu \in \mathcal{M}^1_T$, whose density is strictly positive and continuous.

3. Preliminary Lemmas

We state and prove some lemmas that lead to the main results. Following [7], for each natural number $k \ge 2$, let $E_k : S^1 \to S^1$ denote the linear expanding map $E_k(x) = kx \bmod 1$. For $T \in \mathcal{E}^1$ of degree $k$, it is well-known that $E_k$ is conjugate to $T$; that is, there exists a homeomorphism γ of $S^1$ such that $T \circ \gamma = \gamma \circ E_k$. In fact, in general there is more than one such homeomorphism (although only finitely many). For a degree $k$ map $T \in \mathcal{E}^1$, we shall write $\mathrm{Conj}(T)$ for the set of conjugacies between $E_k$ and $T$. For our purposes, it will be necessary to study and control the dependence of the conjugacy on the map $T$. To do this, we shall exploit the construction in [7] of such a conjugacy. Specifically, in their construction, they start with a point $p$ that is fixed by $T$ and use the Markov partition of the circle given by the intervals whose endpoints are the points of $T^{-1}\{p\}$. For our modification, we need to control the choice of $p$. For $z \in S^1$, set $U_z = \{T \in \mathcal{E}^1 : T(z) \ne z\}$. Note that $U_z$ is a dense open subset of $\mathcal{E}^1$.

Lemma 1. For each $z \in S^1$, there is a continuous map $\Phi_z : U_z \to \mathrm{Homeo}(S^1)$ such that $\Phi_z(T) \in \mathrm{Conj}(T)$ for each $T$. In particular, given $T \in \mathcal{E}^1$ of degree $k$, there is a neighborhood $U$ of $T$ on which there is a continuous choice of conjugacies to the map $E_k$.

Proof. The proof is essentially that given in the proof of Theorem 2.4.6 in [7]. For a map $T \in U_z$, we choose the fixed point $p$ of $T$ that is the first fixed point on the circle "to the right" of $z$. That is, considering the circle to be the set $[0, 1)$, $p$ is chosen to be the first fixed point to the right of $z$ or, if there is none, the first fixed point to the right of 0. This choice of fixed point determines a conjugacy $\Phi_z(T)$. The fixed point may be seen to depend continuously on the map, and so do its preimages. This allows one to show the required continuity of $\Phi_z$. To show that in a neighborhood of any given map $T \in \mathcal{E}^1$ there is a continuous family of conjugacies, we argue as follows: let $z$ be any point not fixed by $T$; then $U_z$ is the required neighborhood and $\Phi_z(S)$ is the continuous choice of conjugacy for $S \in U_z$.

Note that if $\gamma \in \mathrm{Conj}(T)$ and $f \in C^0(S^1)$, then $P_{E_k}(f \circ \gamma) = P_T(f)$. Indeed, γ induces a bijection between $\mathcal{M}^1_{E_k}$ and $\mathcal{M}^1_T$ by $\nu \mapsto \nu \circ \gamma^{-1}$. Then $\int f \circ \gamma \, d\nu = \int f \, d(\nu \circ \gamma^{-1})$, and since γ is a measure-theoretic isomorphism, $h_\nu(E_k) = h_{\nu \circ \gamma^{-1}}(T)$. The pressure equality follows.

Lemma 2. For all $T \in \mathcal{E}^1$, $P_T(\phi_T) = 0$.

Proof. If $T \in \mathcal{E}^2$, this is well-known as the Ruelle–Ledrappier–Young entropy formula (see [12]). Given a degree $k$ map $T \in \mathcal{E}^1$, by Lemma 1, we may find a neighborhood $V$ of $T$ and a choice of conjugacies $\gamma_S$ for all $S \in V$ so that the map $S \mapsto \gamma_S$ is continuous on $V$. With these choices, if $\{T_i\} \subset \mathcal{E}^2$ and $T_i \to T$ in $\mathcal{E}^1$, then $\phi_{T_i} \circ \gamma_{T_i} \to \phi_T \circ \gamma_T$ in $C^0(S^1)$. Since pressure is continuous on $C^0(S^1)$, and $0 = P_{T_i}(\phi_{T_i}) = P_{E_k}(\phi_{T_i} \circ \gamma_{T_i})$ for all $i$, it follows by taking limits that $0 = P_{E_k}(\phi_T \circ \gamma_T) = P_T(\phi_T)$.

Corollary 1. If µ is any equilibrium state for $\phi_T$, then µ is non-atomic.

Proof. Let µ be an ergodic equilibrium state; then it must be either purely atomic, or continuous. If it is purely atomic, then $h_\mu(T) = 0$ and $\int \phi_T \, d\mu < 0$, contradicting $P(\phi_T) = 0$. The result follows since the equilibrium states form a convex set, of which the extreme points are the ergodic states.

Lemma 3. The set of $T \in \mathcal{E}^1$ for which $\phi_T$ has a unique equilibrium state is generic.

The lemma is a version of the Gibbs Phase Rule for the class of expanding maps of the circle. The original Gibbs Phase Rule for the case of a shift was proved by Ruelle [18] and Gallavotti and Miracle-Sole [3].

Proof. For any expansive $T$, there is at least one equilibrium state for each $h \in C^0(S^1)$ (see Walters [20], p. 224). Since expanding maps are expansive, every $\phi_T$ possesses at least one equilibrium state. To prove uniqueness for a dense $G_\delta$, we work with equilibrium states for the map $E_k : S^1 \to S^1$ given by $E_k(x) = kx \bmod 1$. We now show that the set $B$ of potentials for which there is a unique $E_k$-equilibrium state forms a $G_\delta$ set.
Theorems 4.3.3 and 4.3.5 of [8] characterize those potentials with unique equilibrium states as the set of $f$ such that for all $g \in C^0(S^1)$, $\lim_{t \to 0} (P_{E_k}(f + tg) - P_{E_k}(f))/t$ exists. For fixed $f$ and $g$, define $H(t) = (P_{E_k}(f + tg) - P_{E_k}(f))/t$. Since the map $t \mapsto P_{E_k}(f + tg)$ is convex, $H$ is an increasing function. The above limit then exists if and only if $\liminf_{t \to 0^+} (H(t) - H(-t)) = 0$. Hence $f$ has a unique equilibrium state if and only if
$$\liminf_{t \to 0^+} \frac{P_{E_k}(f + tg) + P_{E_k}(f - tg) - 2P_{E_k}(f)}{t} = 0 \quad \text{for all } g \in C^0(S^1). \tag{1}$$
To show that these $f$ form a $G_\delta$ set, we need to show that it is sufficient to calculate the lim inf for a collection of $g$ belonging only to a countable set. To this end, let $(g_n)_{n \in \mathbb{N}}$ be a countable collection of continuous functions that is dense in $C^0(S^1)$. We note that
$$\left| \frac{P_{E_k}(f + tg) + P_{E_k}(f - tg) - 2P_{E_k}(f)}{t} - \frac{P_{E_k}(f + tg_n) + P_{E_k}(f - tg_n) - 2P_{E_k}(f)}{t} \right| \le 2\|g - g_n\|_\infty.$$
Hence (1) holds if and only if
$$\liminf_{t \to 0^+} \frac{P_{E_k}(f + tg_n) + P_{E_k}(f - tg_n) - 2P_{E_k}(f)}{t} = 0 \quad \text{for all } n \in \mathbb{N}.$$
The set $B$ of functions $f$ satisfying this condition may be written as
$$\bigcap_{n \in \mathbb{N}} \bigcap_{p \in \mathbb{N}} \bigcap_{m \in \mathbb{N}} \bigcup_{t \in (0, 1/m)} \left\{ f : \left| \frac{P_{E_k}(f + tg_n) + P_{E_k}(f - tg_n) - 2P_{E_k}(f)}{t} \right| < \frac{1}{p} \right\},$$

which is easily seen to be a $G_\delta$ subset of $C^0$. From Lemma 1, there is a continuous choice of conjugacies for maps in $U_0$. For a map $T \in U_0$, we shall call this choice of conjugacy $\gamma_T$. Letting Ψ be the map $U_0 \to C^0(S^1)$ defined by $\Psi(T) = \phi_T \circ \gamma_T$, we see that Ψ is continuous. It follows that $\Psi^{-1}(B)$ is a $G_\delta$ subset of $U_0$. We now have $T \in \Psi^{-1}(B)$ if and only if $\phi_T \circ \gamma_T$ has a unique $E_k$-equilibrium state. Since there is a bijection between $E_k$-equilibrium states for $\phi_T \circ \gamma_T$ and $T$-equilibrium states for $\phi_T$, we see that $T \in \Psi^{-1}(B)$ if and only if $\phi_T$ has a unique $T$-equilibrium state. We have established that the set $\mathcal{S} \subset \mathcal{E}^1$ consisting of those $T$ for which $\phi_T$ has a unique equilibrium state contains a $G_\delta$ subset of $U_0$. Since $\mathcal{E}^2 \cap U_0$ is a dense subset of $U_0$ that is contained in $\mathcal{S}$, it follows that $\mathcal{S}$ contains a dense $G_\delta$ subset of $U_0$. Since $U_0$ is a dense open subset of $\mathcal{E}^1$, we conclude that $\mathcal{S}$ contains a dense $G_\delta$ subset of $\mathcal{E}^1$.

Set $\Omega = \{T \in \mathcal{E}^1 : \text{there exists a unique equilibrium state for } \phi_T\}$.

Lemma 4. Equip Ω with the (relative) $C^1$-topology, and $\mathcal{M}^1$ with the (relative) weak∗ topology. Then $M : \Omega \to \mathcal{M}^1$, given by $M(T) = \mu_T$, is continuous.

Proof. Suppose $T_0 \in \Omega$ is of degree $k$, $T_i \in \Omega$ and $T_i \to T_0$ in $C^1$. We shall show that $\mu_{T_0}$ is the limit of the $\mu_{T_i}$. As in Lemma 1, fix a neighborhood $V$ of $T_0$ such that there is a continuous family of conjugacies $\gamma_T$ for $T \in V$. Suppose that µ is any limit point of the $\mu_{T_i}$. We shall show that $\mu = \mu_{T_0}$, and this is sufficient, by weak∗-sequential compactness, to show that the original sequence must converge to $\mu_{T_0}$. Replacing the original sequence with a subsequence if necessary, we suppose that $\mu_{T_i} \to \mu$. Set $\nu_i = \mu_{T_i} \circ \gamma_{T_i}$ and $\nu = \mu \circ \gamma_{T_0}$. Then ν and the $\nu_i$ are all $E_k$-invariant measures on $S^1$. By continuity of the family of conjugacies, we see that $\nu_i \to \nu$ in the weak∗-topology. For each $i$, since $\nu_i$ is an $E_k$-equilibrium state for $\phi_{T_i} \circ \gamma_{T_i}$, which by Lemma 2 has pressure 0, we have $0 = h_{\nu_i} + \int \phi_{T_i} \circ \gamma_{T_i} \, d\nu_i$. Since the entropy map is upper semi-continuous, $\limsup h_{\nu_i} \le h_\nu$. Since $T_i \to T_0$ in $C^1$ and $\nu_i \to \nu$ we have
$$0 = \limsup_{i \to \infty} \left( h_{\nu_i} + \int \phi_{T_i} \circ \gamma_{T_i} \, d\nu_i \right) \le h_\nu + \int \phi_{T_0} \circ \gamma_{T_0} \, d\nu \le 0,$$

where the last inequality is true because the pressure is 0. Thus, all of the inequalities are equalities and ν is an equilibrium state for $\phi_{T_0} \circ \gamma_{T_0}$, so that µ is an equilibrium state for $\phi_{T_0}$. Since $T_0 \in \Omega$, there is only one such state. Thus any limit point of the $\mu_{T_i}$ is $\mu_{T_0}$, the unique equilibrium state for $\phi_{T_0}$, and the lemma is proved.

Lemma 5. $\tilde{\Omega} = \{T \in \Omega : \mu_T \text{ is fully supported}\}$ is a generic subset of Ω (and hence of $\mathcal{E}^1$).

Proof. From Corollary 1, for each $T \in \Omega$, $\mu_T$ must be non-atomic. By Lemma 4, for a non-empty open interval $I \subset S^1$, the map $T \mapsto \mu_T(I)$ is continuous on Ω. Choose any collection $\{I_i\}$ of non-empty open intervals that forms a countable basis for the topology of $S^1$. Then $\tilde{\Omega} = \bigcap_i \{T \in \Omega : \mu_T(I_i) > 0\}$, a $G_\delta$ that contains $\mathcal{E}^2$ (and is therefore dense).

4. Proofs of Theorems 1, 2, and 3

Proof of Theorem 1. Lemma 3 establishes that for $T$ belonging to the residual set Ω, there is a unique equilibrium state $\mu_T$ for the potential $-\log T'$. To prove Statement 1, we use a result of Keller. Any fixed $T \in \mathcal{E}^1$, together with the Markov partition for $T$, forms what Keller [8] calls a continuous $e^{-\psi}$-conformal fibred system. He shows ([8], Theorem 6.1.8)$^1$ that in such a system, for λ-almost every $x$, the weak∗-limit points of the averages $\frac{1}{k}(\delta_x + \ldots + \delta_{T^{k-1}x})$ are contained in the set of measures satisfying $h_\mu + \int (-\log T') \, d\mu \ge 0$. Since $P_T(-\log T') = 0$, these measures are precisely the equilibrium states. Hence for $T \in \Omega$, for λ-almost every $x$, the sequence $\frac{1}{k}(\delta_x + \ldots + \delta_{T^{k-1}x})$ has at most one weak∗ limit point, namely $\mu_T$. By weak∗ sequential compactness, the entire sequence must converge to $\mu_T$. To see that $\mu_T$ must be singular (with respect to λ), we first note that each $T \in \mathcal{E}^1$ is a non-singular transformation (with respect to λ).
Thus, if $\mu_T = \mu_{si} + \mu_{ac}$ is the decomposition of $\mu_T$ into singular and absolutely continuous components, the map $\mu_T \mapsto \mu_T \circ T^{-1}$ preserves $\mu_{si}$ and $\mu_{ac}$, so that $\mu_{ac}$ is a finite, absolutely continuous $T$-invariant measure. But we have seen that a generic $T \in \mathcal{E}^1$ possesses no such invariant measure ([16]); that is, $\mu_{ac} = 0$. This proves Statement 2. Lemma 5 implies that generically, $\mu_T$ is fully supported, showing Statement 3. This completes the proof of Theorem 1.

$^1$ In fact the quoted theorem, as stated in the book, contains a mistake, although an irrelevant one for the present setting. The interested reader may go to http://www.mi.uni-erlangen/, keller/publications/equibook.html, where the needed correction to the proof of the theorem is given.

Before proving Theorem 2, we state and prove a lemma. There is a reference to a similar lemma in [2] although we have been unable to find the proof in the papers cited there. Recall that if $T \in \mathcal{E}^2$, $\mu_T$ is an absolutely continuous probability measure with strictly positive Radon–Nikodym derivative $\rho = d\mu_T/d\lambda$.

Lemma 6. Suppose $T \in \mathcal{E}^2$. Then $\int \log \mathcal{L}_T 1 \, d\mu_T \ge 0$, with equality if and only if ρ is $T^{-1}\mathcal{B}$-measurable.

Proof. Fix $T \in \mathcal{E}^2$. In this case, the equilibrium state $\mu_T$ is absolutely continuous. We write ρ for the density of $\mu_T$ with respect to Lebesgue measure. Let $\mathcal{P}$ denote the Perron–Frobenius operator for $T$ with respect to $\mu_T \in \mathcal{M}^1_T$. Then $\mathcal{P}(f) = \mathcal{L}_T(\rho \cdot f)/\rho$. In particular $\mathcal{L}_T(1) = \rho \, \mathcal{P}(1/\rho)$. Thus,
$$\int \log \mathcal{L}_T(1) \, d\mu_T = \int \log \rho \, d\mu_T + \int \log \mathcal{P}(1/\rho) \, d\mu_T = -\int \log(1/\rho) \, d\mu_T + \int \log \mathcal{P}(1/\rho) \, d\mu_T = -\int \mathcal{P}(\log(1/\rho)) \, d\mu_T + \int \log \mathcal{P}(1/\rho) \, d\mu_T,$$
where the last equality follows because $\mathcal{P}$ preserves $\mu_T$-integrals. It is well-known that $\mathcal{P}(\cdot) \circ T = E_{\mu_T}(\cdot \mid T^{-1}\mathcal{B})$. Since $T$ preserves $\mu_T$, we may continue the above calculations as follows:
$$-\int \mathcal{P}(\log(1/\rho)) \, d\mu_T + \int \log \mathcal{P}(1/\rho) \, d\mu_T = -\int \mathcal{P}(\log(1/\rho)) \circ T \, d\mu_T + \int \log \mathcal{P}(1/\rho) \circ T \, d\mu_T = -\int E_{\mu_T}\big(\log(1/\rho) \mid T^{-1}\mathcal{B}\big) \, d\mu_T + \int \log E_{\mu_T}\big(1/\rho \mid T^{-1}\mathcal{B}\big) \, d\mu_T \ge 0,$$
where the last inequality follows from Jensen's inequality, from which it also follows that equality holds in the last step if and only if $\log(1/\rho)$ is $T^{-1}\mathcal{B}$-measurable, which holds if and only if ρ is $T^{-1}\mathcal{B}$-measurable. This concludes the proof of Lemma 6.

Proof of Theorem 2. Since $\log \omega_n(x) = -\sum_{j=1}^{n} \log \mathcal{L}_T 1 \circ T^j(x)$, by Theorem 1 we have that $\frac{1}{n} \log \omega_n(x) \to -\int \log \mathcal{L}_T 1 \, d\mu_T$ for λ-a.e. $x \in S^1$ and $T \in \Omega$. If $\int \log \mathcal{L}_T 1 \, d\mu_T > 0$, then for large $n$, $\omega_n(x) = O(a^n)$ for λ-a.e. $x$, where $a$ is any number such that $-\int \log \mathcal{L}_T 1 \, d\mu_T < \log a < 0$. That is, the sequence $\omega_n(x)$ is asymptotically comparable to a geometric sequence, and hence summable (for λ-a.e. $x$), so that Lebesgue measure is not recurrent for $T$.

First we observe that $\{T : \int \log \mathcal{L}_T 1 \, d\mu_T > 0\}$ is open in Ω. To see this, if $T \in \Omega$ satisfies $\int \log \mathcal{L}_T 1 \, d\mu_T > 0$ and $S \in \Omega$ is $C^1$-close to $T$, then $\mathcal{L}_S 1$ is $C^0$-close to $\mathcal{L}_T 1$. By Lemma 4, $\mu_S$ is weak∗-close to $\mu_T$, proving the observation. Thus by Lemma 6, it is sufficient to show that for maps $T$ belonging to a dense subset of $\mathcal{E}^2$ (and hence a dense subset of $\mathcal{E}^1$), the invariant density $\rho_T$ is not $T^{-1}\mathcal{B}$-measurable. Choose $T \in \mathcal{E}^2$ for which $\rho_T$ is $T^{-1}\mathcal{B}$-measurable. We shall show that there is an $S \in \mathcal{E}^2$ arbitrarily close to $T$ (in the $C^1$ topology) for which $\rho_S$ is not $S^{-1}\mathcal{B}$-measurable. Since $\rho_T$ is $T^{-1}\mathcal{B}$-measurable, $Tx = Ty$ implies that $\rho(x) = \rho(y)$.
Given a Markov partition for $T$, we call the atoms of the partition the branches of $T$. We shall construct a $C^2$-homeomorphism $\pi : S^1 \to S^1$ in such a way that
1. π is arbitrarily ($C^1$-)close to the identity, and
2. the map $\tilde{T} = \pi \circ T \circ \pi^{-1}$ has the property that $\rho_{\tilde{T}} = \tilde{\rho}$ is not $\tilde{T}^{-1}\mathcal{B}$-measurable.
Establishing Items 1 and 2 will finish the proof.

Suppose for the moment that π is any $C^2$-homeomorphism of the circle, and $\tilde{T}(\tilde{x}) = \pi \circ T \circ \pi^{-1}(\tilde{x})$. Then $\tilde{\rho}(\tilde{x}) = \rho(\pi^{-1}\tilde{x})/\pi'(\pi^{-1}\tilde{x})$, so that $\tilde{\rho}$ will be $\tilde{T}^{-1}\mathcal{B}$-measurable precisely when $\tilde{T}(\tilde{x}) = \tilde{T}(\tilde{y})$ implies that $\tilde{\rho}(\tilde{x}) = \tilde{\rho}(\tilde{y})$. Suppose $\tilde{x} \ne \tilde{y}$ and $\tilde{T}(\tilde{x}) = \tilde{T}(\tilde{y})$. Then, since ρ is $T^{-1}\mathcal{B}$-measurable, $\rho(\pi^{-1}\tilde{y}) = \rho(\pi^{-1}\tilde{x})$. Hence $\tilde{\rho}(\tilde{x})$ will differ from $\tilde{\rho}(\tilde{y})$ precisely when $\pi'(\pi^{-1}\tilde{x}) \ne \pi'(\pi^{-1}\tilde{y})$. Hence, if π is chosen so that $\pi'$ is not $T^{-1}\mathcal{B}$-measurable, these terms will be different. Now we specify that π is a $C^2$-homeomorphism of $S^1$ with the property that $\pi' \equiv 1$ on one branch of $T$, and different from 1, yet arbitrarily close to 1, on the other branches. This completes the proof of Theorem 2.

Proof of Theorem 3. Suppose that $T$ satisfies the conditions of Theorem 1. We show that in this case, any absolutely continuous invariant measure for $T$ is locally infinite. Suppose ν is an absolutely continuous invariant measure for $T$. Then $\nu(S^1) = \infty$. Suppose, for the purpose of obtaining a contradiction, that $I$ is any open interval with $\nu(I) < \infty$. Let $f$ be any non-negative continuous function supported on $I$ that is positive on some subinterval of $I$. Clearly $f \in L^1(\nu)$. By Birkhoff's ergodic theorem for an infinite invariant measure, for ν-almost every $x$, $\frac{1}{n}(f(x) + \ldots + f(T^{n-1}x)) \to 0$. This holds in particular on a set of positive Lebesgue measure. On the other hand, since $\mu_T$ is a Sinai–Ruelle–Bowen measure, we have for λ-almost every $x$, $\frac{1}{n}(f(x) + \ldots + f(T^{n-1}x)) \to \int f \, d\mu_T$. Since $f$ is strictly positive on a subinterval of $I$ and $\mu_T$ is fully supported, this quantity is strictly positive. This contradiction completes the proof of the theorem.

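5. No Characteristic Scale

In this section we prove that if
$$\mathcal{S}_{\epsilon,n,a} = \{T \in \mathcal{E}^1 : \lambda\{x : \mathcal{L}^n 1(x) \in [a, 2a]\} < \epsilon\} \quad \text{and} \quad \mathcal{S} = \bigcap_{\epsilon > 0} \bigcup_{n \in \mathbb{N}} \bigcap_{a > 0} \mathcal{S}_{\epsilon,n,a},$$
then $\mathcal{S}$ is a dense $G_\delta$ subset of $\mathcal{E}^1$.

Proof of Theorem 4. We can replace the uncountable intersections in the definition of $\mathcal{S}$ by countable intersections over the rationals without changing the set. Define
$$F_n(T) = \lambda \times \lambda \left\{ (x, y) : \frac{1}{2} \le \frac{\mathcal{L}^n_T 1(x)}{\mathcal{L}^n_T 1(y)} \le 2 \right\}.$$
Clearly, $F_n(T) < \epsilon^2$ implies that for all positive $a$, the measure of the set of points with $\mathcal{L}^n_T 1(x) \in [a, 2a]$ is less than $\epsilon$. Letting $R_{\epsilon,n} = \{T : F_n(T) < \epsilon^2\}$, it is clear that $R_{\epsilon,n} \subset \bigcap_{a > 0} \mathcal{S}_{\epsilon,n,a}$. Conversely, for fixed $x$, let $a_1 = \mathcal{L}^n_T 1(x)/2$ and $a_2 = 2a_1$. If $T \in \bigcap_{a > 0} \mathcal{S}_{\epsilon^2/2,n,a}$, then for each $x$, by considering $\bigcup_{i=1}^{2} \{y : \mathcal{L}^n_T 1(y) \in [a_i, 2a_i]\}$ we have $\lambda\{y : \mathcal{L}^n_T 1(y) \in [\mathcal{L}^n_T 1(x)/2, 2\mathcal{L}^n_T 1(x)]\} < \epsilon^2$. By Fubini's theorem, we see that $F_n(T) \le \epsilon^2$ so that $T \in R_{\epsilon,n}$. It follows that
$$\mathcal{S} = \bigcap_{\epsilon > 0} \bigcup_{n \in \mathbb{N}} \bigcap_{a > 0} \mathcal{S}_{\epsilon,n,a} = \bigcap_{\epsilon > 0} \bigcup_{n \in \mathbb{N}} R_{\epsilon,n}.$$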

We shall show that $F_n : \mathcal{E}^1 \to \mathbb{R}$ is an upper semi-continuous map so that $\mathcal{S}$ is a $G_\delta$ set. To prove this, suppose that $F_n(T) < \alpha$. We have
$$\lambda \times \lambda \left\{ (x, y) : \frac{\mathcal{L}^n 1(x)}{\mathcal{L}^n 1(y)} \in \left[\tfrac{1}{2}, 2\right] \right\} = \lim_{k \to \infty} \lambda \times \lambda \left\{ (x, y) : \frac{\mathcal{L}^n 1(x)}{\mathcal{L}^n 1(y)} \in \left[\tfrac{1}{2} - \tfrac{1}{k}, 2 + \tfrac{1}{k}\right] \right\}.$$
One can therefore find a $k$ such that $\lambda \times \lambda(\{(x, y) : \mathcal{L}^n_T 1(x)/\mathcal{L}^n_T 1(y) \in [1/2 - 1/k, 2 + 1/k]\}) < \alpha$. Since the map $\Psi : \mathcal{E}^1 \to C^0(S^1 \times S^1)$ given by $\Psi(T)(x, y) = \mathcal{L}^n_T 1(x)/\mathcal{L}^n_T 1(y)$ is continuous (with the $C^1$ and $C^0$-topologies on the respective spaces), there exists a neighborhood $U$ of $T$ such that if $\tilde{T} \in U$, then $\|\Psi(T) - \Psi(\tilde{T})\| < 1/k$. It follows that if $\tilde{T} \in U$, then $F_n(\tilde{T}) < \alpha$, proving the upper semi-continuity of $F_n$.

It then remains to demonstrate the density of $\mathcal{S}$. To do this, we shall establish that for any $\epsilon > 0$, any $T_0 \in \mathcal{E}^2$ and any neighborhood $U$ of $T_0$ (in the $C^1$ topology), there is a $T \in U$ and an $n \in \mathbb{N}$ such that for each $a$, $\lambda\{x : \mathcal{L}^n 1(x) \in [a, 2a]\} < \epsilon$. This will be accomplished by conjugating $T_0$ using a homeomorphism constructed via a cocycle. We shall therefore assume $\epsilon > 0$, $T_0 \in \mathcal{E}^2$ and $\delta > 0$ are given. Let $\eta > 0$ be such that $(1 + \eta)/(1 - \eta) < 1 + \delta$. Then we also have $(1 - \eta)/(1 + \eta) > 1 - \delta$. Since $T_0$ belongs to $\mathcal{E}^2$, $T_0$ preserves an absolutely continuous invariant probability measure, µ, with a strictly positive continuous density, ρ. Let $m$ be such that $\frac{1}{m} \le \rho(x) \le m$ for all $x$. Let $\bar{T}_0 : X \to X$ be a natural extension of $T_0 : S^1 \to S^1$ preserving the measure $\bar{\mu}$. From [21], $\bar{\mu}$ is Bernoulli, so we may find a non-trivial independent partition $P = \{A_0, A_1\}$ of $X$. Write $p$ for $\bar{\mu}(A_0)$ and $q$ for $\bar{\mu}(A_1)$. We then define a function $\bar{G}_0$ on $X$ as follows:
$$\bar{G}_0(x) = \begin{cases} 1 + \eta q & \text{if } x \in A_0, \\ 1 - \eta p & \text{if } x \in A_1. \end{cases}$$
Let $n > 0$ be an integer. We then form the multiplicative cocycle $\bar{G}_0^{(n)}$ defined by
$$\bar{G}_0^{(n)}(x) = \bar{G}_0(x) \, \bar{G}_0(\bar{T}_0 x) \cdots \bar{G}_0(\bar{T}_0^{n-1} x).$$
The function $\bar{G}_0^{(n)}$ takes on the value $v_k = (1 + \eta q)^k (1 - \eta p)^{n-k}$ on a set of measure $\binom{n}{k} p^k q^{n-k}$. Let $K \in \mathbb{N}$ be the least integer so that
$$\left( \frac{1 + \eta q}{1 - \eta p} \right)^K > 2m^2.$$
Since $v_{k+1}/v_k = \frac{1 + \eta q}{1 - \eta p}$, for each $a$ there are at most $K$ values taken by $\bar{G}_0^{(n)}$ in $[a, 2m^2 a]$. We then have the estimate
$$\bar{\mu}\{x : \bar{G}_0^{(n)}(x) \in [a, 2m^2 a]\} \le K \max_{\{k : v_k \in [a, 2m^2 a]\}} \binom{n}{k} p^k q^{n-k}.$$

Since the values of $k$ in the range over which the maximum is taken have the property that $v_k \ge a$, we see
$$a \, \bar{\mu}\{x : \bar{G}_0^{(n)}(x) \in [a, 2m^2 a]\} \le K \max_{\{k : v_k \in [a, 2m^2 a]\}} \binom{n}{k} v_k \, p^k q^{n-k} \le K \max_{0 \le k \le n} \binom{n}{k} (p + \eta pq)^k (q - \eta pq)^{n-k} < \frac{CK}{\sqrt{n}},$$
where $C$ is a constant that depends only on the values of $p$ and $q$.

Now fix an $n$ so that $a \, \bar{\mu}(\{x : \bar{G}_0^{(n)}(x) \in [a, 2m^2 a]\}) < \epsilon/4$ for all $a$. It will turn out that an inequality of this type will be what is needed for the conjugate map to have the desired property. At this point, the function $\bar{G}_0^{(n)}$ is defined not on the circle, but on the natural extension space. We shall apply a conditional expectation and approximation argument to $\bar{G}_0^{(n)}$ to obtain a function on the circle as needed.

Let $Q$ be a Markov partition for $T_0$ consisting of intervals. There exists a $k$ such that $\bigvee_{s=0}^{k-1} T_0^{-s} Q$ consists of intervals of length less than δ. Denote these intervals by $I_j$ and write $\bar{I}_j$ for $\pi^{-1} I_j$, where π denotes the natural projection from the natural extension $(X, \bar{T}_0, \bar{\mu})$ to $(S^1, T_0, \mu)$. Write $\bar{\rho} = \rho \circ \pi$ and define the natural extension of λ, $\bar{\lambda}$, by $\bar{\lambda}(A) = \int_A (1/\bar{\rho}) \, d\bar{\mu}$. We then calculate
$$\int_{\bar{I}_j} \bar{G}_0^{(n)} \circ \bar{T}_0^i \, d\bar{\lambda} = \int \frac{\chi_{\bar{I}_j}}{\bar{\rho}} \cdot \bar{G}_0^{(n)} \circ \bar{T}_0^i \, d\bar{\mu}.$$
Since $\bar{T}_0$ is mixing, we see that
$$\lim_{i \to \infty} \int_{\bar{I}_j} \bar{G}_0^{(n)} \circ \bar{T}_0^i \, d\bar{\lambda} = \int \frac{\chi_{\bar{I}_j}}{\bar{\rho}} \, d\bar{\mu} \int \bar{G}_0^{(n)} \, d\bar{\mu} = \bar{\lambda}(\bar{I}_j) \left( \int \bar{G}_0 \, d\bar{\mu} \right)^n = \lambda(I_j),$$
where we used the fact that $P$ is an independent partition to get the second equality. We recall that $n$ is chosen so that
$$\bar{\mu}(\{x : \bar{G}_0^{(n)}(x) \in [a, 2m^2 a]\}) < \epsilon/(4a), \tag{2}$$
for each $a > 0$. We now choose an $i_0$ such that for $i \ge i_0$,
$$\left| \int_{\bar{I}_j} \bar{G}_0^{(n)} \circ \bar{T}_0^i \, d\bar{\lambda} - \lambda(I_j) \right| < \frac{\delta}{3} \lambda(I_j), \tag{3}$$
for each $j$.
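The $CK/\sqrt{n}$ bound above rests on the fact that $(p + \eta pq) + (q - \eta pq) = 1$, so the maximised quantity is the largest weight of a binomial distribution, which decays like $1/\sqrt{n}$. The following Python sketch checks this numerically; the values of `p` and `eta` are arbitrary illustrative choices and the helper name `max_binomial_weight` is hypothetical.

```python
import math

def max_binomial_weight(n, r):
    """max over 0 <= k <= n of C(n, k) * r^k * (1 - r)^(n - k),
    computed in log space to avoid overflow and underflow."""
    best = -math.inf
    for k in range(n + 1):
        log_w = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                 + k * math.log(r) + (n - k) * math.log(1.0 - r))
        best = max(best, log_w)
    return math.exp(best)

p, eta = 0.3, 0.1          # illustrative values, not taken from the paper
q = 1.0 - p
r = p + eta * p * q        # then 1 - r = q - eta*p*q
for n in (100, 400, 1600, 6400):
    # The rescaled maximum stays bounded, consistent with a C/sqrt(n) bound.
    print(n, max_binomial_weight(n, r) * math.sqrt(n))
```

The rescaled values should stabilise near $1/\sqrt{2\pi r(1-r)}$, as expected from the local central limit approximation of the binomial distribution.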

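We now show that similar inequalities persist for functions $\bar{G}^{(n)}$ if $\bar{G}$ is chosen to be an appropriate perturbation of $\bar{G}_0$. It is useful to note that because the values taken by $\bar{G}_0^{(n)}$ are in the range $[(1 - \eta p)^n, (1 + \eta q)^n]$, the inequality (2) holds trivially for $a$ outside this range.

We define $N$ to be a subset of $L^1(\bar{\mu})$ as follows and equip it with the $L^1$ subspace topology:
$$N = \{\bar{G} : 1 - \eta p \le \bar{G} \le 1 + \eta q; \ \|\bar{G} - \bar{G}_0\|_1 < \zeta\}.$$
Since composition with $\bar{T}$ is an isometry on $L^1(\bar{\mu})$, and because $N$ consists of bounded functions, the map from $N$ to $L^1$ given by $\bar{G} \mapsto \bar{G}^{(n)}$ is continuous. Clearly, for $\bar{G} \in N$, the values taken by $\bar{G}^{(n)}$ are in the range $[(1 - \eta p)^n, (1 + \eta q)^n]$. By choosing ζ appropriately small, we can ensure that $|\bar{G}^{(n)} - \bar{G}_0^{(n)}| < (1 - \eta p)^n/2$ on a set of measure at least $1 - \epsilon/(8(1 + \eta q)^n)$. For a given $a$ in the range $[(1 - \eta p)^n/(2m^2), (1 + \eta q)^n]$, let $a_1 = a/2$ and $a_2 = 2a$. Then
$$\{x : \bar{G}^{(n)}(x) \in [a, 2m^2 a]\} \subset \{x : \bar{G}_0^{(n)}(x) \in [a_1, 2m^2 a_1]\} \cup \{x : \bar{G}_0^{(n)}(x) \in [a, 2m^2 a]\} \cup \{x : \bar{G}_0^{(n)}(x) \in [a_2, 2m^2 a_2]\} \cup \{x : |\bar{G}^{(n)}(x) - \bar{G}_0^{(n)}(x)| > (1 - \eta p)^n/2\}.$$
We shall denote the four sets on the right-hand side by $A_1$, $A_2$, $A_3$ and $A_4$ respectively. By our previous estimates on $\bar{G}_0^{(n)}$ we have $\bar{\mu}(A_1) < \epsilon/(2a)$, $\bar{\mu}(A_2) < \epsilon/(4a)$ and $\bar{\mu}(A_3) < \epsilon/(4a_2) = \epsilon/(8a)$. We chose ζ above to ensure that $\bar{\mu}(A_4) < \epsilon/(8(1 + \eta q)^n) \le \epsilon/(8a)$, so that
$$\bar{\mu}(\{x : \bar{G}^{(n)}(x) \in [a, 2m^2 a]\}) < \epsilon/a$$
for each $a$ in the range $[(1 - \eta p)^n/(2m^2), (1 + \eta q)^n]$. As before, the inequality holds trivially for $a$ outside this range, so we have established that for sufficiently small ζ, a similar inequality to (2) persists for all $a$ and functions $\bar{G}^{(n)}$, if $\bar{G}$ is chosen from $N$. Since
$$\int_{\bar{I}_j} |\bar{G}^{(n)} \circ \bar{T}_0^i - \bar{G}_0^{(n)} \circ \bar{T}_0^i| \, d\bar{\lambda} \le \int |\bar{G}^{(n)} \circ \bar{T}_0^i - \bar{G}_0^{(n)} \circ \bar{T}_0^i| \, d\bar{\lambda} \le m \int |\bar{G}^{(n)} \circ \bar{T}_0^i - \bar{G}_0^{(n)} \circ \bar{T}_0^i| \, d\bar{\mu},$$
we see that provided ζ is sufficiently small, (3) holds for $\bar{G} \in N$. We have therefore shown that there exists a $\zeta > 0$ such that for $\bar{G} \in N$,
$$\bar{\mu}(\{x : \bar{G}^{(n)}(x) \in [a, 2m^2 a]\}) < \epsilon/a \quad \text{for each } a, \text{ and} \tag{4}$$
$$\left| \int_{\bar{I}_j} \bar{G}^{(n)} \circ \bar{T}_0^i \, d\bar{\lambda} - \lambda(I_j) \right| < \frac{\delta}{3} \lambda(I_j) \quad \text{for each } j \text{ and } i \ge i_0. \tag{5}$$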

We note that since $\bar{T}_0 : X \to X$ is a natural extension of $T_0 : S^1 \to S^1$, the σ-algebras $\bar{T}_0^k \pi^{-1} \mathcal{B}_{S^1}$ increase to $\mathcal{B}_X$. It follows that $E_{\bar{\mu}}(\bar{G}_0 \mid \bar{T}_0^i \pi^{-1} \mathcal{B}_{S^1})$ converges to $\bar{G}_0$ in $L^1$. By the monotonicity of conditional expectation, these functions also satisfy the inequality $1 - \eta p \le E_{\bar{\mu}}(\bar{G}_0 \mid \bar{T}_0^i \pi^{-1} \mathcal{B}_{S^1}) \le 1 + \eta q$. It follows that for sufficiently large $i \ge i_0$ ($i_0$ as above), (4) and (5) are satisfied with $E_{\bar{\mu}}(\bar{G}_0 \mid \bar{T}_0^i \pi^{-1} \mathcal{B}_{S^1})$ in place of $\bar{G}$. Fix some such $i$ and write $\bar{G}_1$ for $E_{\bar{\mu}}(\bar{G}_0 \mid \bar{T}_0^i \pi^{-1} \mathcal{B}_{S^1})$.

Now $\bar{G}_1 \circ \bar{T}_0^i = E_{\bar{\mu}}(\bar{G}_0 \circ \bar{T}_0^i \mid \pi^{-1} \mathcal{B}_{S^1})$, so we see that $\bar{G}_1 \circ \bar{T}_0^i$ may be written as $g_1 \circ \pi$ for some $\mathcal{B}$-measurable function $g_1$ on the circle. Since $C^0(S^1)$ is dense in $L^1(S^1, \mathcal{B}, \mu)$, it follows that there exists a continuous function $g_2$ such that $\|g_1 - g_2\|_1$ is arbitrarily small. Since $\|\bar{G}_1 - g_2 \circ \pi \circ \bar{T}_0^{-i}\|_1 = \|g_1 - g_2\|_1$, we see that $g_2$ may be chosen so that $g_2 \circ \pi \circ \bar{T}_0^{-i}$ lies in $N$. Equations (4) and (5) now yield
$$\left| \int_{I_j} g_2^{(n)} \, d\lambda - \lambda(I_j) \right| < \frac{\delta}{3} \lambda(I_j) \quad \text{for each } j; \text{ and}$$
$$\mu(\{x : g_2^{(n)}(x) \in [a, 2m^2 a]\}) < \epsilon/a \quad \text{for each } a > 0.$$
From the first equation, we see that $1 - \frac{\delta}{3} < \int g_2^{(n)} \, d\lambda < 1 + \frac{\delta}{3}$, so finally we rescale $g_2$ (i.e. multiply by a constant that will, by our above estimates, be very close to 1) to obtain a function $g$ that satisfies $\int g^{(n)} \, d\lambda = 1$. We then have the inequalities
$$\left| \int_{I_j} g^{(n)} \, d\lambda - \lambda(I_j) \right| < \delta \lambda(I_j) \quad \text{for each } j; \text{ and} \tag{6}$$
$$\mu(\{x : g^{(n)}(x) \in [a, 2m^2 a]\}) < 2\epsilon/a \quad \text{for each } a > 0. \tag{7}$$
Set $\theta(x) = \int_0^x g^{(n)}(t) \, dt$ and let $T(x) = \theta \circ T_0 \circ \theta^{-1}(x)$. Then from the above, and since each interval $I_j$ has length less than δ, it may be verified that $|\theta(x) - x| < 2\delta$, and $\sup_{x \in S^1} |T(x) - T_0(x)| < (C + 4)\delta$, where $C = \max_{x \in S^1} |T_0'(x)|$. Hence this quantity can be made arbitrarily small by choosing δ sufficiently small. Also, differentiating, we see
$$T'(x) = T_0'(\theta^{-1}x) \, \frac{\theta'(T_0(\theta^{-1}x))}{\theta'(\theta^{-1}x)} = T_0'(\theta^{-1}x) \, \frac{g^{(n)}(T_0(\theta^{-1}x))}{g^{(n)}(\theta^{-1}x)} = T_0'(\theta^{-1}x) \, \frac{g(T_0^n(\theta^{-1}x))}{g(\theta^{-1}x)}.$$
Since $g$ is uniformly close to 1 and $T_0'$ is uniformly continuous, we see that $\sup_{x \in S^1} |T'(x) - T_0'(x)|$ can also be made arbitrarily small by controlling δ and η. This shows that $T$ can be chosen arbitrarily close to $T_0$ in the $C^1$ norm.

It remains to verify that $T$ has the property that there exists an $n$ such that for each $a$, $\lambda\{x : \mathcal{L}^n 1(x) \in [a, 2a]\} < \epsilon$. Since $T$ is conjugate to $T_0$, there is also a conjugacy relation between their Perron–Frobenius operators given by $\mathcal{L}_T = \mathcal{L}_\theta \circ \mathcal{L}_{T_0} \circ \mathcal{L}_{\theta^{-1}}$, where $\mathcal{L}_\theta f(x) = f(\theta^{-1}(x))/\theta'(\theta^{-1}x)$. Since $T_0$ is a $C^2$ expanding map, we have that $\mathcal{L}^n_{T_0} 1$ converges uniformly to ρ. It follows that $\mathcal{L}^n_T 1$ converges uniformly to $\mathcal{L}_\theta \rho(x) = \rho(\theta^{-1}x)/\theta'(\theta^{-1}x)$. We then estimate

ρ(θ −1 x) 1 a , 2ma]}) ∈ [a, 2a]}) ≤ λ({x : (n) −1 ∈ [ m (n) −1 g (θ x) g (θ x) 1 = λ({x : g (n) (θ −1 x) ∈ [ 2ma ,m a ]}).


J. T. Campbell, A. N. Quas

1 1 m (n) But we see that {x : g (n) (θ −1 x) ∈ [ 2ma ,m a ]} = θ({y : g (y) ∈ [ 2ma , a ]}). Using this, we get

λ({x :

ρ(θ −1 x) 1 ,m ∈ [a, 2a]}) ≤ λ ◦ θ({y : g (n) (y) ∈ [ 2ma a ]}) g (n) (θ −1 x) = g (n) (y) dλ 1 m {y : g (n) (y)∈[ 2ma , a ]}

0 there is $C_R > 0$ such that

$$\bigl|P_k f(u) - (\mu, f)\bigr| \le C_R\, e^{-c\sqrt{k}} \Bigl(\sup_H |f| + \mathrm{Lip}(f)\Bigr) \quad \text{for } k \ge 0, \tag{0.4}$$

where $\|u\| \le R$, $f$ is an arbitrary bounded Lipschitz function on $H$, and $c > 0$ is a constant not depending on $u$, $f$, $R$, and $k$.

Example 0.2. Let us consider the 2D Navier–Stokes (NS) equations perturbed by a random kick-force:

$$\dot u - \nu \Delta u + (u, \nabla)u + \nabla p = \eta(t, x) \equiv \sum_{k=-\infty}^{\infty} \eta_k(x)\,\delta(t - k), \qquad \operatorname{div} u = 0, \qquad \langle u \rangle(t, \cdot) = 0, \tag{0.5}$$

where $u = u(t, x)$, $x \in T^2$, and $\langle u \rangle = \int_{T^2} u(x)\,dx$. Let $H$ be the space of divergence-free vector fields $u \in L^2(T^2, R^2)$ such that $\langle u \rangle = 0$ and let $\{e_j\}$ be the normalised trigonometric basis in $H$. Assuming that the kicks $\eta_k \in H$ have the form (0.1) and normalising solutions $u(t)$ for (0.5) to be continuous from the right, we observe that (0.5) can be written in the form (0.2), where $u_k = u(k, \cdot) \in H$ and $S : H \to H$ is the time-one shift along trajectories of the free NS system (i.e., of Eqs. (0.5) with $\eta \equiv 0$). As is shown in [KS1], the operator $S$ satisfies all the required assumptions, and therefore Theorem 0.1 applies to (0.5).

Randomly Forced Nonlinear PDE’s


Theorem 0.1 can also be applied to many other dissipative nonlinear PDE’s perturbed by a random kick-force, in particular, to the complex Ginzburg–Landau equation

$$\dot u - \nu(\Delta - 1)u + i|u|^2 u = \eta(t, x), \qquad x \in T^n,$$

where $u = u(t, x)$ and $\nu > 0$ (see [KS1, KS2]).

Uniqueness of a stationary measure for (0.2) was first established¹ in [KS1]. The proof in [KS1] is based on a Lyapunov–Schmidt type reduction of the system (0.2) to an $N$-dimensional RDS with delay (the integer $N$ is the same as in Theorem 0.1). Due to this reduction, the problem of uniqueness of a stationary measure for (0.2) reduces to a similar question for an abstract 1D Gibbs system with an $N$-dimensional phase space. The uniqueness for the reduced Gibbs system is then established using a version of the Ruelle–Perron–Frobenius theorem. E, Mattingly, Sinai [EMS] and Bricmont, Kupiainen, Lefevere [BKL] later used similar approaches to show that the NS system (0.5) perturbed by a white (in time) force of the form

$$\eta(t, x) = \sum_{j=1}^{N} b_j \dot\beta_j(t)\, e_j(x), \qquad N < \infty,$$

also has a unique stationary measure $\mu \in \mathcal{P}(H)$, provided that $b_j \ne 0$ for $1 \le j \le N' \le N$ with some sufficiently large $N' = N'(\nu)$. Moreover, it is shown in [BKL] that for the case of white noise the convergence in (0.4) is exponentially fast for $\mu$-almost all $u \in H$. In [KS3] the NS equations (0.5) with an unbounded kick-force $\eta(t, x)$ are studied and the scheme of [KS1] is used to prove the uniqueness and ergodicity of a stationary measure.

The approach presented in this work does not use a Lyapunov–Schmidt type reduction and the Gibbs measure technique. Instead it exploits some ideas from [KS2], interpreting them in terms of the coupling. The new approach gives rise to a shorter proof and is more flexible. The coupling is a well-known effective tool for studying finite-dimensional Markov chains (e.g., see [Lin] and the Appendix in [V]) and dynamical systems (e.g., see [Y, BL]). In [EMS] a coupling is used to study the auxiliary finite-dimensional RDS with delay which arises as a result of the Lyapunov–Schmidt reduction. Our work shows that a form of coupling applies directly to infinite-dimensional Markov chains and randomly forced PDE’s.

When a preprint of this paper was sent around, we learned from L.-S. Young that a similar approach to proving Theorem 0.1 is being developed by her and Nader Masmoudi in their work under preparation.

Notation. We abbreviate a pair of random variables $\xi_1, \xi_2$ or points $u_1, u_2$ to $\xi_{1,2}$ and $u_{1,2}$, respectively. Given a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, for any integer $k \ge 1$ we denote by $\Omega^k$ the space $\Omega \times \cdots \times \Omega$ ($k$ times) endowed with the σ-algebra $\mathcal{F} \times \cdots \times \mathcal{F}$ and the measure $\mathbb{P} \times \cdots \times \mathbb{P}$. For a random variable $\xi$, we denote by $D(\xi)$ its distribution. For a Banach space $H$, we shall use the following spaces and sets:

¹ It is shown in [KS1, KS2] that the left-hand side of (0.4) converges to zero as $k \to \infty$ for any $f \in C_b(H)$; however, the rate of convergence is not specified.


S. Kuksin, A. Shirikyan

$C_b(H)$ is the space of bounded continuous functions on $H$ with the supremum norm $\|\cdot\|_\infty$.
$L(H)$ is the space of bounded Lipschitz functions on $H$ endowed with the natural norm $\|\cdot\|_L$ (see Sect. 1).
$\mathcal{M}(H)$ is the space of signed Borel measures on $H$ with bounded variation.
$\mathcal{P}(H)$ is the set of probability measures $\mu \in \mathcal{M}(H)$; this space is endowed with two different metrics described in Sect. 1.
$\mathcal{P}(H, A)$ is the set of measures $\mu \in \mathcal{P}(H)$ with support in a closed set $A$.
$\mu_v(k)$ is the measure $P(k, v, \cdot)$, where $P$ is the Markov transition function for (0.2).
$B_H(R)$ is the closed ball of radius $R > 0$ centred at zero.

1. Measures on Hilbert Spaces

Let $H$ be a separable Hilbert space with the Borel σ-algebra $\mathcal{B}(H)$ and let $\mathcal{M}(H)$ be the space of signed Borel measures with bounded variation. We denote by $\mathcal{P}(H)$ the set of probability measures $\mu \in \mathcal{M}(H)$ and by $\mathcal{P}(H, A)$ the subset in $\mathcal{P}(H)$ consisting of measures supported by a closed set $A \subset H$. For any measure $\mu \in \mathcal{M}(H)$ and any function $f \in C_b(H)$, we write

$$(\mu, f) = \int_H f(u)\,d\mu(u) = \int_H f(u)\,\mu(du).$$

We shall use two different topologies on $\mathcal{P}(H)$. The first of them is given by the variation norm on $\mathcal{M}(H)$:

$$\|\mu\|_{\mathrm{var}} = \sup_{\Gamma \in \mathcal{B}(H)} |\mu(\Gamma)|.$$

The distance defined by this norm on $\mathcal{P}(H)$ can be characterised in terms of densities. Namely, let us assume that $\mu_1, \mu_2 \in \mathcal{P}(H)$ are absolutely continuous with respect to a fixed Borel measure $m$, finite or infinite. (Such a measure always exists; for instance, one can take $m = (\mu_1 + \mu_2)/2$.) In this case, we have

$$\|\mu_1 - \mu_2\|_{\mathrm{var}} = \frac{1}{2} \int_H |p_1(u) - p_2(u)|\,dm(u), \tag{1.1}$$

where $p_i(u)$, $i = 1, 2$, is the density of $\mu_i$ with respect to $m$. The space $\mathcal{P}(H)$ is complete with respect to $\|\cdot\|_{\mathrm{var}}$.

To define a second topology, we denote by $L(H)$ the space of real-valued bounded Lipschitz functions on $H$ with the norm

$$\|f\|_L := \sup_{u \in H} |f(u)| \,\vee\, \sup_{u \ne v} \frac{|f(u) - f(v)|}{\|u - v\|}.$$

Let $\|\cdot\|_L^*$ be the dual norm on $\mathcal{M}(H)$:

$$\|\mu\|_L^* = \sup_{\|f\|_L \le 1} |(\mu, f)|.$$

It is clear that the norm $\|\cdot\|_L^*$ defines a metric on $\mathcal{P}(H)$.

Lemma 1.1. The space $\mathcal{P}(H)$ is complete with respect to the metric $\|\cdot\|_L^*$.
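To illustrate formula (1.1), the following minimal sketch compares the two expressions for the variation distance on a finite state space. The finite setting, the sample vectors, and the names `tv_by_densities` and `tv_by_sets` are assumptions made only for this example; the counting measure plays the role of $m$.

```python
from fractions import Fraction
from itertools import chain, combinations

def tv_by_densities(p1, p2):
    """Half the L1 distance between two probability vectors, as in (1.1)."""
    return sum(abs(a - b) for a, b in zip(p1, p2)) / 2

def tv_by_sets(p1, p2):
    """Supremum of |mu1(G) - mu2(G)| over all subsets G of the index set."""
    idx = range(len(p1))
    subsets = chain.from_iterable(combinations(idx, r) for r in range(len(p1) + 1))
    return max(abs(sum(p1[i] - p2[i] for i in g)) for g in subsets)

# Hypothetical densities with respect to the counting measure on four points.
p1 = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4), Fraction(0)]
p2 = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 8), Fraction(3, 8)]

d_dens = tv_by_densities(p1, p2)
d_sets = tv_by_sets(p1, p2)
```

Exact rational arithmetic is used so that the two quantities can be compared without rounding error.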


Proof. Suppose that $\{\mu_n\} \subset \mathcal{P}(H)$ is a sequence such that $\|\mu_n - \mu_m\|_L^* \to 0$ as $m, n \to \infty$. Let $L^*(H)$ be the space of continuous functionals on $L(H)$. Regarding $\mu_n$ as elements of $L^*(H)$, we conclude that the sequence $\{\mu_n\}$ converges (in the norm $\|\cdot\|_L^*$) to a limit $\ell \in L^*(H)$, and we have

$$\ell(f) = \lim_{n \to \infty} (\mu_n, f), \qquad f \in L(H). \tag{1.2}$$

In view of the corollary² from Theorem 1 in [GS, Chapter VI, §1], there is a measure $\mu \in \mathcal{P}(H)$ such that $\ell(f) = (\mu, f)$. This completes the proof.

Note that, in the case when $H$ is finite-dimensional, the fact that the functional $\ell$ in (1.2) is a measure is implied by the following well-known result (for instance, see [H, Theorem 2.1.7]): any nonnegative distribution is a measure; in particular, any positive functional $\ell \in L^*(H)$ is a measure as well.

Let $P(k, u, \Gamma)$, $k \ge 0$, $u \in H$, $\Gamma \in \mathcal{B}(H)$, be a Markov transition function. A set $A \in \mathcal{B}(H)$ is said to be invariant for $P$ if

$$P(k, u, A) = 1 \quad \text{for all } k \ge 0,\ u \in A.$$

Lemma 1.2. Let $A \in \mathcal{B}(H)$ be an invariant set for $P(k, u, \Gamma)$. Suppose that there is $k_0 \ge 1$ and a sequence $\zeta_k$, $k \ge k_0$, going to zero as $k \to \infty$ such that

$$\|P(k, u, \cdot) - P(k, v, \cdot)\|_L^* \le \zeta_k \quad \text{for } k \ge k_0,\ u, v \in A. \tag{1.3}$$

Then there is a unique measure $\mu \in \mathcal{P}(H, A)$ such that

$$\|P(k, u, \cdot) - \mu\|_L^* \le \zeta_k \quad \text{for } k \ge k_0,\ u \in A. \tag{1.4}$$

Proof. Let $f \in L(H)$, $\|f\|_L \le 1$. Then, by (1.3) and the Chapman–Kolmogorov relation, for $l \ge k \ge k_0$ and $u, v \in A$ we have

$$\bigl|\bigl(P(l, v, \cdot) - P(k, u, \cdot), f\bigr)\bigr| \le \int_H P(l - k, v, dz)\, \Bigl|\int_H P(k, z, dw) f(w) - \int_H P(k, u, dw) f(w)\Bigr| \le \zeta_k \int_H P(l - k, v, dz) = \zeta_k. \tag{1.5}$$

By Lemma 1.1, the space $\mathcal{P}(H)$ is complete with respect to $\|\cdot\|_L^*$. Hence, there is a unique measure $\mu \in \mathcal{P}(H)$ such that $\|P(l, v, \cdot) - \mu\|_L^* \to 0$ as $l \to \infty$. It is clear that $\operatorname{supp} \mu \subset A$ and therefore $\mu \in \mathcal{P}(H, A)$. Passing to the limit in (1.5) as $l \to \infty$, we obtain (1.4).

We now recall that a pair of random variables $(\xi_1, \xi_2)$ defined on the same probability space is called a coupling for given measures $\mu_1, \mu_2 \in \mathcal{P}(H)$ if $D(\xi_j) = \mu_j$, $j = 1, 2$. For some basic results on the coupling, see [Lin, V] and the Appendix (Sect. 4).

² The corollary of Theorem 1 in [GS, Chapter VI, §1] claims, in fact, that if the limit in (1.2) exists for any $f \in C_b(H)$, then the functional $\ell$ can be represented in the form $\ell(f) = (\mu, f)$, where $\mu \in \mathcal{P}(H)$. However, the same proof works also in the case under study.


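Lemma 1.3. If measures $\mu_1, \mu_2 \in \mathcal{P}(H)$ admit a coupling $(\xi_1, \xi_2)$ such that

$$\mathbb{P}\{\|\xi_1 - \xi_2\| > \varepsilon\} \le \theta, \tag{1.6}$$

where $\varepsilon > 0$ and $\theta > 0$ are some constants, then

$$\|\mu_1 - \mu_2\|_L^* \le 2\theta + \varepsilon. \tag{1.7}$$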
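The bound (1.7) can be checked on a toy example. The sketch below is only an illustration under stated assumptions: the coupling is a hypothetical joint distribution of two real-valued random variables with finitely many atoms, and the inequality is tested on a handful of functions with $\|f\|_L \le 1$ rather than on the whole unit ball of $L(H)$.

```python
import math

# Hypothetical coupling of two laws on the real line, given as a joint pmf:
# each entry is ((x1, x2), probability of that pair).
joint = [((0.0, 0.05), 0.5), ((1.0, 0.95), 0.3), ((2.0, 2.0), 0.15), ((0.0, 3.0), 0.05)]

def theta(joint, eps):
    """P{|xi1 - xi2| > eps} for the coupled pair, cf. (1.6)."""
    return sum(p for (x, y), p in joint if abs(x - y) > eps)

def gap(joint, f):
    """|(mu1 - mu2, f)| = |E f(xi1) - E f(xi2)|."""
    return abs(sum(p * (f(x) - f(y)) for (x, y), p in joint))

# Test functions with sup|f| <= 1 and Lipschitz constant <= 1, so ||f||_L <= 1.
test_functions = [
    math.sin,
    math.cos,
    lambda x: min(abs(x), 1.0),
    lambda x: max(-1.0, min(1.0, x - 1.0)),
]

eps = 0.1
bound = 2 * theta(joint, eps) + eps      # right-hand side of (1.7)
gaps = [gap(joint, f) for f in test_functions]
```

In this example only the last atom has $|\xi_1 - \xi_2| > \varepsilon$, so $\theta = 0.05$ and the right-hand side of (1.7) equals $0.2$.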

Proof. Let $f \in L(H)$, $\|f\|_L \le 1$. Then $(\mu_{1,2}, f) = \mathbb{E} f(\xi_{1,2})$ and, therefore,

$$|(\mu_1 - \mu_2, f)| \le \mathbb{E}\,\chi_Q\, |f(\xi_1) - f(\xi_2)| + \mathbb{E}\,\chi_{Q^c}\, |f(\xi_1) - f(\xi_2)|, \tag{1.8}$$

where $\chi_Q$ and $\chi_{Q^c}$ are characteristic functions of the event $\|\xi_1 - \xi_2\| > \varepsilon$ and of its complement, respectively. By (1.6), the first term in the right-hand side of (1.8) is bounded by $2\theta$, while the second does not exceed $\varepsilon \|f\|_L \le \varepsilon$. This completes the proof of (1.7).

2. A Class of Random Dynamical Systems

Let $H$ be a Hilbert space with a norm $\|\cdot\|$ and an orthonormal basis $\{e_j\}$ and let $S : H \to H$ be an operator satisfying Conditions (A)–(C) below:

(A) For any $R > r > 0$ there exist positive constants $a = a(R, r) < 1$ and $C = C(R)$ and an integer $n_0 = n_0(R, r) \ge 1$ such that

$$\|S(u_1) - S(u_2)\| \le C(R)\,\|u_1 - u_2\| \quad \text{for all } u_1, u_2 \in B_H(R), \tag{2.1}$$

$$\|S^n(u)\| \le \max\{a\|u\|, r\} \quad \text{for } u \in B_H(R),\ n \ge n_0. \tag{2.2}$$

Let $\eta_k$, $k \ge 1$, be a sequence of i.i.d. $H$-valued random variables that are defined on a probability space $(\Omega_1, \mathcal{F}_1, \mathbb{P}_1)$ and have the form (0.1), where $b_j \ge 0$ are some constants such that

$$\sum_{j=1}^{\infty} b_j^2 < \infty, \tag{2.3}$$

and $\{\xi_{jk}\}$ is a family of independent real-valued random variables such that $|\xi_{jk}| \le 1$ for all $j$, $k$, and $\omega_1 \in \Omega_1$. We consider the following RDS in $H$:

$$u_k = S(u_{k-1}) + \eta_k =: F^{\omega_1}(u_{k-1}), \qquad k \ge 1. \tag{2.4}$$

It follows from (0.1) and (2.3) that the distribution of $\eta_k$ is supported by the Hilbert cube

$$K = \Bigl\{ u = \sum_{j=1}^{\infty} u_j e_j : |u_j| \le b_j \text{ for all } j \ge 1 \Bigr\}.$$

Therefore, if the initial state $u_0$ of the RDS (2.4) belongs to a set $B \subset H$, then $u_k \in A_k(B)$ for all $k \ge 1$ and $\omega_1 \in \Omega_1$, where $A_0(B) = B$ and $A_k(B) = S(A_{k-1}(B)) + K$ for $k \ge 1$.

The next condition expresses the property of existence of a bounded absorbing set for the system in question.


(B) There exists $\rho > 0$ such that for any bounded set $B \subset H$ there is an integer $k_0 \ge 1$ such that $A_k(B) \subset B_H(\rho)$ for $k \ge k_0$.

Clearly, inequality (2.2) and Condition (B) are satisfied if $\|S(u)\| \le \gamma \|u\|$ for all $u \in H$ and some positive constant $\gamma < 1$. To formulate the last condition, we introduce some notation. For a subspace $E \subset H$, we denote by $E^\perp$ its orthogonal complement in $H$. For an integer $N \ge 1$, let $H_N$ be the finite-dimensional subspace generated by the vectors $e_1, \dots, e_N$ and let $P_N$ and $Q_N$ be the orthogonal projections onto $H_N$ and $H_N^\perp$, respectively.

(C) For any $R > 0$ there is a decreasing sequence $\gamma_N(R) > 0$ tending to zero as $N \to \infty$ such that

$$\|Q_N (S(u_1) - S(u_2))\| \le \gamma_N(R)\, \|u_1 - u_2\| \quad \text{for all } u_1, u_2 \in B_H(R).$$

Finally, we specify the random variables $\{\xi_{jk}\}$:

(D) For any $j$, the random variables $\xi_{jk}$, $k \ge 1$, have the same distribution $\pi_j(dr) = p_j(r)\,dr$, where the densities $p_j(r)$ are functions of bounded variation such that $\operatorname{supp} p_j \subset [-1, 1]$ and $\int_{|r| \le \varepsilon} p_j(r)\,dr > 0$ for all $j \ge 1$ and $\varepsilon > 0$. We normalise the functions $p_j$ to be continuous from the right.

The RDS (2.4) defines a family of Markov chains in $H$ with the transition function $P(k, v, \Gamma) = \mathbb{P}\{u_k \in \Gamma\}$, where $(u_k, k \ge 0)$ is the solution of (2.4) such that $u_0 = v$. Let $P_k$ and $P_k^*$ be the corresponding semigroups (see the Introduction for their definition). Continuity of $S$ (see Condition (A)) and the Lebesgue theorem on dominated convergence imply that the transition function satisfies the Feller condition: if $f \in C_b(H)$, then $P_k f \in C_b(H)$ for all $k \ge 1$.

Let $\rho > 0$ be the constant in Condition (B). We introduce the set

$$A = \bigcup_{k \ge 1} A_k\bigl(B_H(\rho)\bigr). \tag{2.5}$$

It is clear that $A$ is an invariant set for the RDS (2.4): if $u_0 \in A$, then $u_k \in A$ for all $k \ge 1$ and $\omega_1 \in \Omega_1$. Moreover, it follows from Condition (C) that the set $A$ is compact in $H$. (Note that the union in (2.5) is taken over $k \ge 1$ and therefore $B_H(\rho)$ is not a subset of $A$.) Our goal is to prove the following result:

Theorem 2.1. There is an integer $N \ge 1$ such that if (0.3) holds, then the RDS (2.4) has a unique stationary measure $\mu \in \mathcal{P}(H, A)$. Moreover, for any $R > 0$ there is $C_R > 0$ such that

$$\bigl|P_k f(u) - (\mu, f)\bigr| \le C_R\, e^{-c\sqrt{k}}\, \|f\|_L \quad \text{for } k \ge 0,\ \|u\| \le R,$$

where $f \in L(H)$ is an arbitrary function and $c > 0$ is a constant not depending on $f$, $u$, $R$, and $k$.
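The remark that Condition (B) holds when $\|S(u)\| \le \gamma\|u\|$ can be illustrated numerically. The sketch below is a toy model under explicit assumptions: the space is truncated to eight coordinates, $S(u) = \gamma u$ is linear, the $\xi_{jk}$ are uniform on $[-1, 1]$, and the amplitudes $b_j = 2^{-j}$ are hypothetical. It only shows the absorbing-ball estimate $\|u_k\| \le \gamma^k \|u_0\| + (\sum_j b_j^2)^{1/2}/(1 - \gamma)$, not the coupling argument of the paper.

```python
import math
import random

def simulate(u0, b, gamma, steps, rng):
    """Iterate u_k = S(u_{k-1}) + eta_k with S(u) = gamma * u and
    eta_k = sum_j b_j * xi_jk * e_j, where |xi_jk| <= 1 (uniform here).
    Returns the list of norms ||u_0||, ||u_1||, ..., ||u_steps||."""
    u = list(u0)
    norms = [math.sqrt(sum(x * x for x in u))]
    for _ in range(steps):
        u = [gamma * x + bj * rng.uniform(-1.0, 1.0) for x, bj in zip(u, b)]
        norms.append(math.sqrt(sum(x * x for x in u)))
    return norms

b = [2.0 ** (-j) for j in range(1, 9)]      # hypothetical amplitudes b_j
gamma = 0.5                                 # hypothetical contraction rate
u0 = [10.0] * len(b)
norms = simulate(u0, b, gamma, steps=40, rng=random.Random(0))

# Every kick lies in the cube K, hence ||eta_k|| <= (sum_j b_j^2)^(1/2).
kick_bound = math.sqrt(sum(bj * bj for bj in b))
rho = kick_bound / (1.0 - gamma)            # radius of an absorbing ball
```

Any radius slightly larger than `rho` then serves as the constant $\rho$ of Condition (B) for this toy system.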


Condition (B) and the definition of $A$ imply that for any $R > 0$ there is an integer $l \ge 1$ depending on $R$ such that $P(l, u, A) = 1$ for any $u \in B_H(R)$. Hence, we can restrict our consideration to the invariant set $A$. In view of Lemma 1.2, Theorem 2.1 will be established if we show that there are positive constants $C$ and $c$ and an integer $k_0 \ge 1$ such that

$$\|P(k, u, \cdot) - P(k, v, \cdot)\|_L^* \le C\, e^{-c\sqrt{k}} \quad \text{for } k \ge k_0,\ u, v \in A. \tag{2.6}$$

3. Proof of the Main Result We first establish some auxiliary assertions and then use them to prove inequality (2.6), which implies the required result. 3.1. Auxiliary assertions. We begin with a simple observation. Let R > 0 be so large that BH (R) ⊃ A. To simplify notation, we denote B = BH (R). Lemma 3.1. For any d > 0 there is an integer l = l(d) ≥ 0 and a constant : = :(d) > 0 such that P ul (v) ≤ d/2 for all v ∈ B ≥ :. (3.1) Proof. Let a and n0 be the constants in Condition (A) that correspond to the parameters R (the radius of B) and r = d/4 and let l = n0 m, where m is the smallest integer such that a m R ≤ d/4. If ηk = 0 in (2.4) for 1 ≤ k ≤ l, then, in view of (2.2), we have ul (v) ≤ max{a m R, d/4} = d/4

for all

v ∈ B.

By continuity, there is γ > 0 such that if ηk ≤ γ

for

1 ≤ k ≤ l,

(3.2)

then ul (v) ≤ d/2.

(3.3)

It follows from (2.3) and Condition (D) that the event (3.2) has a positive probability :. Inequality (3.1) follows now from (3.3). To simplify notation, for any v ∈ H we denote by µv (k) the measure P(k, v, ·) ∈ P(H ). For any measurable space (X, B(X)) and any integer k ≥ 1, we denote by X k the direct product X × · · · × X endowed with the product σ -algebra B k (X) = B(X) × · · · × B(X). Lemma 3.2. There is a probability space ($, F, P), an integer N ≥ 1, and a constant C > 0 such that if (0.3) holds, then for any u1 , u2 ∈ B the measures µu1,2 (1) admit a coupling V1,2 = V1,2 (u1 , u2 ; ω) that possesses the following properties: (i) The maps V1,2 are measurable with respect to the σ -algebra B 2 (H )×F as functions of (u1 , u2 , ω) ∈ B 2 × $. (ii) Let d = u1 − u2 . Then P V1 − V2 ≥ d/2 ≤ Cd. (3.4)


Let us note that inequality (3.4) is nontrivial only in the case $Cd < 1$.

Proof. Let $(\Omega_1, \mathcal{F}_1, \mathbb{P}_1)$ be the probability space on which the random variables $\{\eta_k\}$ are defined and let $(\Omega_2, \mathcal{F}_2, \mathbb{P}_2)$ be the probability space constructed in Theorem 4.2 for the measures $\nu_{1,2}$ specified below. We shall show that the set $\Omega = \Omega_1 \times \Omega_2$ endowed with the natural σ-algebra and probability of direct product is the required probability space. The random variables $V_{1,2}$ are sought in the form

$$V_1 = S(u_1) + \xi_1, \qquad V_2 = S(u_2) + \xi_2,$$

where $\xi_{1,2}$ are some random variables on $\Omega$ such that $D(\xi_1) = D(\xi_2) = D(\eta_1)$. It is clear that $D(V_{1,2}) = \mu_{u_{1,2}}(1)$ and that (i) holds. To define the random variables $\xi_{1,2}$, we specify their projections $P_N \xi_{1,2}$ and $Q_N \xi_{1,2}$, where $N \ge 1$ is a sufficiently large integer which is chosen below. We set $Q_N \xi_1 = Q_N \xi_2 = Q_N \tilde\eta_1$, where $\tilde\eta_1$ is the natural extension of $\eta_1$ to $\Omega$, i.e., $\tilde\eta_1(\omega) = \eta_1(\omega_1)$ for $\omega = (\omega_1, \omega_2) \in \Omega$. To define $P_N \xi_{1,2}$, let us write $\nu_{1,2} := P_N \mu_{u_{1,2}}(1)$ and assume that we have proved the inequality

$$\|\nu_1 - \nu_2\|_{\mathrm{var}} \le Cd, \tag{3.5}$$

where C > 0 is a constant not depending on u1,2 ∈ B. In view of Theorem 4.2, there is a maximal coupling =1,2 (u1 , u2 ; ω2 ) for the measures ν1,2 that is measurable with respect to (u1 , u2 , ω2 ) ∈ B 2 × $2 : P{=1 = =2 } = ν1 − ν2 var ≤ Cd.

(3.6)

Retaining the same notation for the natural extensions of =1 and =2 to $, we now set PN ξ1,2 = =1,2 − PN S(u1,2 ) and note that PN V1 = PN V2 if and only if =1 = =2 . Let N ≥ 1 be so large that γN (R) ≤ 1/2 (see Condition (C)). In this case, if PN V1 = PN V2 , then V1 − V2 = QN (V1 − V2 ) = QN (S(u1 ) − S(u2 )) ≤ u1 − u2 /2 ≤ d/2. Inequality (3.4) follows now from (3.6). Thus, it remains to establish (3.5). To this end, we set v1,2 = PN S(u1,2 ) and note that, in view of (2.1), v1 − v2 ≤ C(R)d.

(3.7)

Since $b_j \ne 0$ for $1 \le j \le N$, Condition (D) implies that $D(P_N \eta_1) = p(x)\,dx$, where $dx$ is the Lebesgue measure on the finite-dimensional space $H_N$ and

$$p(x) = \prod_{j=1}^{N} q_j(x_j), \qquad q_j(x_j) = b_j^{-1}\, p_j(x_j / b_j), \qquad x = (x_1, \dots, x_N) \in H_N,$$

is a bounded function with support in the set $P_N K$. It follows that

$$\nu_{1,2} = D(v_{1,2} + P_N \eta_1) = p(x - v_{1,2})\,dx.$$


Therefore, by (1.1),

$$\|\nu_1 - \nu_2\|_{\mathrm{var}} = \frac{1}{2} \int_{H_N} |p(x - v_1) - p(x - v_2)|\,dx.$$

We claim that

$$\int_{H_N} |p(x - v_1) - p(x - v_2)|\,dx \le |v_1 - v_2| \sum_{j=1}^{N} b_j^{-1} \operatorname{Var}(p_j), \tag{3.8}$$

where $\operatorname{Var}(p_j)$ stands for the total variation of $p_j$. The required inequality (3.5) follows immediately from (3.7) and (3.8). To prove (3.8), we first assume that $p_j$ are $C^1$-smooth functions. In this case, we have

$$\begin{aligned}
\int_{H_N} |p(x - v_1) - p(x - v_2)|\,dx
&\le |v_1 - v_2| \int_{H_N} \int_0^1 \bigl|(\nabla p)(x - \theta v_1 - (1 - \theta) v_2)\bigr|\,d\theta\,dx \\
&= |v_1 - v_2| \int_{H_N} |(\nabla p)(x)|\,dx
\le |v_1 - v_2| \sum_{j=1}^{N} \int_{R} \bigl|\partial_{x_j} q_j(x_j)\bigr|\,dx_j \\
&= |v_1 - v_2| \sum_{j=1}^{N} \operatorname{Var}(q_j).
\end{aligned}$$

It remains to note that $\operatorname{Var}(q_j) = b_j^{-1} \operatorname{Var}(p_j)$. Inequality (3.8) in the general case can be easily derived by a standard approximation procedure; we omit the corresponding arguments.

We now combine Lemmas 3.1 and 3.2 to obtain a coupling $U^k_{1,2}(u_1, u_2)$ for the measures $\mu_{u_{1,2}}(k)$, $k \ge 1$. Let $l = l(d)$ and $C > 0$ be the constants in Lemmas 3.1 and 3.2 and let $d_0 > 0$ be so small that

$Cd_0 \le 1/8$. We set $d_r = 2^{-r} d_0$, $r \ge 1$. For a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, we shall denote by $(\Omega^k, \mathcal{F}^k, \mathbb{P}^k)$ the direct product of its $k$ independent copies. Points of the latter will be denoted by $\omega^k = (\omega_1, \dots, \omega_k)$.

Lemma 3.3. Suppose that the conditions of Lemma 3.2 are satisfied. Let $u_1, u_2 \in A$ and $d = \|u_1 - u_2\|$. Then for any $k \ge 1$ the measures $\mu_{u_{1,2}}(k)$ admit a coupling $U^k_{1,2} = U^k_{1,2}(u_1, u_2; \omega^k)$, $\omega^k \in \Omega^k$, such that the following assertions hold:

(i) The maps $U^k_{1,2}(u_1, u_2; \omega^k)$ are measurable with respect to $(u_1, u_2, \omega^k) \in A^2 \times \Omega^k$.

(ii) There is a constant $\theta > 0$ not depending on $u_1$, $u_2$, and $k$ such that

$$\mathbb{P}^k\bigl\{\|U_1^k - U_2^k\| \le d_r\bigr\} \ge \theta \quad \text{for all } k \ge r + l(d_0),\ u_1, u_2 \in A. \tag{3.9}$$

(iii) If $\|u_1 - u_2\| \le d_r$, then

$$\mathbb{P}^k\bigl\{\|U_1^k - U_2^k\| \le d_{k+r}\bigr\} \ge 1 - 2^{-r-1} \quad \text{for all } k \ge 1,\ r \ge 0. \tag{3.10}$$

Proof. Let us recall that for any (u1 , u2 ) ∈ B × B a coupling V1,2 (u1 , u2 ; ω) was constructed in Lemma 3.2. We set Vj (u1 , u2 ; ω) if u1 − u2 ≤ d0 , Uj (u1 , u2 ; ω) = F ω (uj ) if u1 − u2 > d0 , k on where j = 1, 2 and F ω (u) is given by (2.4). We define random variables U1,2 ($k , F k ) by the following rule: if u1 − u2 > d0 , then

Ujk (u1 , u2 ; ωk ) = F ωk ◦ · · · ◦ F ω1 (uj ) for k ≤ l(d0 ) and

Ujk (u1 , u2 ; ωk ) = Uj U1k−1 (u1 , u2 ; ωk−1 ), U2k−1 (u1 , u2 ; ωk−1 ); ωk

(3.11)

for k > l(d0 ), where ωk = (ωk−1 , ωk ) = (ω1 , . . . ωk ) and Uj0 (u1 , u2 ) = uj . If u1 − k 0 (u , u ) = u k u2 ≤ d0 , then U1,2 1 2 1,2 and for k ≥ 1 the random variables Uj (u1 , u2 ; ω ) are inductively defined by (3.11). k satisfy assertions (i)–(iii) of the lemma. Indeed, the measurabilWe claim that U1,2 k is obvious since they are compositions of measurable maps. To ity of the maps U1,2 prove (3.9), we first note that it is sufficient to consider the case k = l + r, l = l(d0 ). We introduce the following events in $l+r : Q+ = U1l − U2l ≤ d0 , Q− = U1l − U2l > d0 , Q = U1l+r − U2l+r ≤ dr . By Lemma 3.1, we have Pk (Q) = Pk (Q|Q+ )P(Q+ ) + Pk (Q|Q− )P(Q− ) ≥ : Pk (Q|Q+ ).

(3.12)

If we assume that (3.10) is proved for r = 0, then (3.12) will imply the required estimate (3.9) with θ = :/2. Thus, it remains to establish (iii). For a fixed r ≥ 0, we set k k k k Q+ Q− k = U1 − U2 ≤ dk+r , k = U1 − U2 > dk+r − and denote by pk+ and pk− the probabilities of Q+ k and Qk , respectively. Using (3.4) with d = dk+r−1 , we derive + + − + − + k pk+ = pk−1 Pk (Q+ k |Qk−1 ) + pk−1 P (Qk |Qk−1 ) ≥ (1 − Cdk+r−1 )pk−1 .

Since p0+ = 1, iteration of this estimate results in pk+ ≥ λ :=

k−1 j =0

(1 − Cdj +r ).

(3.13)


Since $d_m = 2^{-m} d_0$ and $Cd_0 \le 1/8$, we have

$$\log \lambda = \sum_{j=0}^{k-1} \log(1 - C d_{j+r}) \ge -2C \sum_{j=0}^{k-1} d_{j+r} \ge -2C d_0 \sum_{j=0}^{\infty} 2^{-(j+r)} = -2^{2-r}\, C d_0 \ge -2^{-r-1}.$$
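The estimate for $\log \lambda$, and the bound $\lambda \ge 1 - 2^{-r-1}$ that it yields, can be checked numerically. This is a minimal sketch: the values $C = 1$ and $d_0 = 1/8$ are hypothetical and chosen so that $Cd_0 = 1/8$, the extreme case allowed by the assumption; the function name `lam` is likewise an assumption of the example.

```python
import math

def lam(C, d0, r, k):
    """lambda = prod_{j=0}^{k-1} (1 - C * d_{j+r}) with d_m = 2**(-m) * d0."""
    out = 1.0
    for j in range(k):
        out *= 1.0 - C * d0 * 2.0 ** (-(j + r))
    return out

C, d0 = 1.0, 1.0 / 8.0      # hypothetical values with C * d0 = 1/8
# Triples (r, k, lambda) for several starting scales r and product lengths k.
checks = [(r, k, lam(C, d0, r, k)) for r in range(6) for k in (1, 5, 50)]
```

The first inequality in the display uses $\log(1 - x) \ge -2x$ for $0 \le x \le 1/2$, which applies here because $C d_{j+r} \le 1/8$.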

Therefore, $\lambda \ge 1 - 2^{-r-1}$.

3.2. Proof of Theorem 2.1. As was mentioned at the end of Sect. 2, it is sufficient to establish inequality (2.6). In what follows, to simplify notation, we shall write $\mathbb{P}$ instead of $\mathbb{P}^k$.

(1) Let us fix arbitrary $u_1, u_2 \in A$ and set $T_0 = 0$ and $T_r = T_{r-1} + r + l$ for $r \ge 1$, i.e., $T_r = r(r + 1)/2 + rl$. We claim that for any integer $r \ge 0$ there is a coupling $y_{1,2}(T_r)$ on $\Omega^{T_r}$ for the measures $\mu_{u_{1,2}}(T_r)$ such that

$$\mathbb{P}\bigl\{\|y_1(T_r) - y_2(T_r)\| > d_r\bigr\} \le C_1 \gamma^r, \tag{3.14}$$

where $C_1$ and $\gamma < 1$ are some positive constants. The construction of $y_{1,2}(T_r) = y_{1,2}(T_r, u_1, u_2; \omega^{T_r})$ and the proof of (3.14) are by induction. For $r = 0$, we set $y_j(0) = u_j$, and inequality (3.14) with $C_1 \ge 1$ is trivial in this case. Assuming that $y_{1,2}(T_i)$ are constructed for $0 \le i \le r$, we set

$$y_j(T_{r+1}, u_1, u_2; \omega^{T_{r+1}}) = U_j^{r+l+1}\bigl(y_1(T_r, u_{1,2}; \omega^{T_r}),\, y_2(T_r, u_{1,2}; \omega^{T_r});\, \omega^{r+l+1}\bigr), \tag{3.15}$$

where $U^k_{1,2}(u_1, u_2; \omega^k)$ are defined in Lemma 3.3 and $\omega^{T_{r+1}} = (\omega^{T_r}, \omega^{r+l+1})$. Let us introduce the events

$$Q_r^+ = \bigl\{\|y_1(T_r) - y_2(T_r)\| \le d_r\bigr\}, \qquad Q_r^- = \bigl\{\|y_1(T_r) - y_2(T_r)\| > d_r\bigr\}$$

and denote by $p_r^+$ and $p_r^-$ their probabilities. Then, in view of (3.9) and (3.10) with $k = r + l$, we have (cf. (3.12))

$$p_{r+1}^- = \mathbb{P}(Q_{r+1}^- \mid Q_r^+)\,\mathbb{P}(Q_r^+) + \mathbb{P}(Q_{r+1}^- \mid Q_r^-)\,\mathbb{P}(Q_r^-) \le 2^{-r-1} p_r^+ + (1 - \theta)\, p_r^- \le 2^{-r-1} + \gamma\, p_r^-, \tag{3.16}$$

where $\gamma = 1 - \theta$. Without loss of generality, we can assume that $0 < \theta < 1/2$, and therefore $1 < 2\gamma < 2$. Iterating (3.16), we obtain

$$p_{r+1}^- \le 2^{-r-1} \sum_{j=0}^{r} (2\gamma)^j + \gamma^{r+1} p_0^- \le 2^{-r-1}\, \frac{(2\gamma)^{r+1} - 1}{2\gamma - 1} + \gamma^{r+1} \le C_1 \gamma^{r+1}.$$

This completes the induction.
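The iteration of (3.16) can be checked by running the recursion in its worst case. In the sketch below, equality is assumed in (3.16) and $p_0^- = 1$; the value $\theta = 0.25$ and the explicit constant $C_1 = 1 + 1/(2\gamma - 1)$ are choices made for this illustration, the latter being one admissible value read off from the last display.

```python
def worst_case(theta, steps):
    """Iterate p_{r+1} = 2**(-r-1) + gamma * p_r with p_0 = 1,
    i.e. equality in (3.16), and return gamma and the list p_0, ..., p_steps."""
    gamma = 1.0 - theta
    p = [1.0]
    for r in range(steps):
        p.append(2.0 ** (-r - 1) + gamma * p[r])
    return gamma, p

theta = 0.25                                  # hypothetical value, 0 < theta < 1/2
gamma, p = worst_case(theta, 60)
C1 = 1.0 + 1.0 / (2.0 * gamma - 1.0)          # one admissible constant in (3.14)
```

With $\theta = 0.25$ one gets $\gamma = 0.75$ and $C_1 = 3$, and the sequence decays geometrically at rate $\gamma$ as claimed in (3.14).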


(2) We can now prove (2.6). Let us fix arbitrary positive integers $r$ and $m \le r + l$ and set $k = T_r + m$, so that $T_r + 1 \le k < T_{r+1}$. We define a coupling $y_{1,2}(k) = y_{1,2}(k, u_1, u_2)$ for the measures $\mu_{u_{1,2}}(k)$ by the formula (cf. (3.15))

$$y_j(k, u_1, u_2; \omega^k) = U_j^m\bigl(y_1(T_r, u_1, u_2; \omega^{T_r}),\, y_2(T_r, u_1, u_2; \omega^{T_r});\, \omega^m\bigr).$$

In view of (3.10) and (3.14), we have (cf. (3.16))

$$\mathbb{P}\bigl\{\|y_1(k) - y_2(k)\| > d_{r+1}\bigr\} \le \mathbb{P}(Q_r^-) + 2^{-r-1}\, \mathbb{P}(Q_r^+) \le C_2 \gamma^r, \tag{3.17}$$

where $C_2 > 0$ is a constant. Now note that $r^2/2 \le T_r \le (l + 1) r^2$ for any $r \ge 0$ and therefore there are positive constants $C$ and $c$ such that

$$d_{r+1} \le C\, e^{-c\sqrt{k}}, \qquad C_2 \gamma^r \le C\, e^{-c\sqrt{k}} \qquad \text{for } T_r \le k < T_{r+1}.$$

Combining this with (3.17), we derive

$$\mathbb{P}\bigl\{\|y_1(k, u_1, u_2) - y_2(k, u_1, u_2)\| \ge C\, e^{-c\sqrt{k}}\bigr\} \le C\, e^{-c\sqrt{k}}. \tag{3.18}$$

By Lemma 1.3, inequality (3.18) implies that

$$\|\mu_{u_1}(k) - \mu_{u_2}(k)\|_L^* \le 3C\, e^{-c\sqrt{k}},$$

which completes the proof of (2.6) with $k_0 = T_1$. Theorem 2.1 is proved.
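The passage from a geometric rate in $r$ to the rate $e^{-c\sqrt{k}}$ in $k$ rests on the quadratic growth of $T_r$. The sketch below checks this bookkeeping; the values $l = 3$ and $\gamma = 0.75$ are hypothetical, and the constants $c = -\log\gamma / \sqrt{l + 1}$ and $C = 1/\gamma$ are one explicit choice that follows from $k < T_{r+1} \le (l + 1)(r + 1)^2$.

```python
import math

def T(r, l):
    """T_r = r(r+1)/2 + r*l, i.e. T_0 = 0 and T_r = T_{r-1} + r + l."""
    return r * (r + 1) // 2 + r * l

def block_index(k, l):
    """The integer r with T_r <= k < T_{r+1}."""
    r = 0
    while T(r + 1, l) <= k:
        r += 1
    return r

l, gamma = 3, 0.75                            # hypothetical values of l and gamma
c = -math.log(gamma) / math.sqrt(l + 1)       # rate in the exponent
C = 1.0 / gamma                               # multiplicative constant
```

For every $k$ in the block $T_r \le k < T_{r+1}$ one then has $\gamma^r \le C e^{-c\sqrt{k}}$; the same argument applies to $d_{r+1} = 2^{-r-1} d_0$ with $\gamma$ replaced by $1/2$.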

4. Appendix: Coupling

In this appendix, we present some results on the coupling in finite-dimensional spaces in the form which we learned from S. Foss. These results are well known (e.g., see [Lin, V] for Lemma 4.1 and [BF] for Lemma 4.3). Let $\nu_1, \nu_2 \in \mathcal{P}(R^N)$ be two measures absolutely continuous with respect to the Lebesgue measure $dx$: $\nu_{1,2}(dx) = p_{1,2}(x)\,dx$. We set

$$\rho := \|\nu_1 - \nu_2\|_{\mathrm{var}} = \frac{1}{2} \int_{R^N} |p_1(x) - p_2(x)|\,dx \tag{4.1}$$

and assume first that $0 < \rho < 1$. Let

$$p := (1 - \rho)^{-1}\, p_1 \wedge p_2, \qquad \hat p_{1,2} := \rho^{-1}\bigl(p_{1,2} - (1 - \rho)\, p\bigr). \tag{4.2}$$

For $\rho = 1$ or $0$, we define $p(x)$ and $\hat p_{1,2}(x)$ as follows:

$$p(x) \equiv 0, \qquad \hat p_{1,2}(x) \equiv p_{1,2}(x) \qquad \text{if } \rho = 1, \tag{4.3}$$

$$p(x) \equiv p_1(x), \qquad \hat p_{1,2}(x) \equiv 0 \qquad \text{if } \rho = 0. \tag{4.4}$$

It is clear that

$$p_{1,2}(x) = (1 - \rho)\, p(x) + \rho\, \hat p_{1,2}(x) \quad \text{almost everywhere.}$$


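If $(\xi_1, \xi_2)$ is a coupling for the measures $(\nu_1, \nu_2)$, then for any $\Gamma \in \mathcal{B}(R^N)$ we have

$$\nu_1(\Gamma) - \nu_2(\Gamma) = \mathbb{E}\bigl(\chi_\Gamma(\xi_1) - \chi_\Gamma(\xi_2)\bigr) = \mathbb{E}\Bigl[\chi_{\{\xi_1 \ne \xi_2\}} \bigl(\chi_\Gamma(\xi_1) - \chi_\Gamma(\xi_2)\bigr)\Bigr] \le \mathbb{P}\{\xi_1 \ne \xi_2\}.$$

Therefore,

$$\mathbb{P}\{\xi_1 \ne \xi_2\} \ge \rho \equiv \|\nu_1 - \nu_2\|_{\mathrm{var}}.$$

A coupling $(\xi_1, \xi_2)$ for $(\nu_1, \nu_2)$ is said to be maximal if $\mathbb{P}\{\xi_1 \ne \xi_2\} = \rho \equiv \|\nu_1 - \nu_2\|_{\mathrm{var}}$.

Lemma 4.1. Let $\xi_{1,2}$, $\xi$, and $\alpha$ be independent random variables such that

$$\mathbb{P}\{\alpha = 1\} = 1 - \rho, \qquad \mathbb{P}\{\alpha = 0\} = \rho, \qquad D(\xi) = p(x)\,dx, \qquad D(\xi_{1,2}) = \hat p_{1,2}(x)\,dx. \tag{4.5}$$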

Then the random variables =1,2 = αξ + (1 − α)ξ1,2

(4.6)

form a maximal coupling for ν1,2 . Proof. Since ξ1 and ξ2 are independent and their distributions possess densities with respect to the Lebesgue measure, we have P{ξ1 = ξ2 } = 0. Taking into account the relation α(1 − α) ≡ 0, we get D(=1,2 ) = p1,2 (x) dx = ν1,2 ,

P{=1 = =2 } = P{α = 0} = ρ,

which completes the proof. Let us now assume that ϕ is a random variable in RN with the distribution D(ϕ) = q(x) dx, where q ∈ L1 (RN ). Consider the following family of measures depending on a parameter v ∈ RN : νv (dx) = D(v + ϕ) = q(x − v) dx. Let ρ(v1 , v2 ) be the variation distance between νv1 and νv2 . It is clear from (4.1) that ρ(v1 , v2 ) is measurable with respect to v1 , v2 ∈ R2N . In the construction above, let us take ν1,2 = νv1,2 . Then p(x) = p(x; v1 , v2 ),

pˆ 1,2 (x) = pˆ 1,2 (x; v1 , v2 ).

Clearly, the functions p(x; v1 , v2 ) and pˆ 1,2 (x; v1 , v2 ) are measurable with respect to (x, v1 , v2 ). Using the above observations, we construct a coupling for (νv1 , νv2 ) that is measurable with respect to (v1 , v2 , ω). Namely, we have the following result: Theorem 4.2. There is a probability space ($, F, P) such that for any pair (v1 , v2 ) ∈ R2N there are random variables =1,2 = =1,2 (v1 , v2 ; ω) satisfying the following properties: (i) The pair (=1 , =2 ) is a maximal coupling for (νv1 , νv2 ). (ii) The map =(v1 , v2 ; ω) : R2N ×$ → RN is measurable with respect to the σ -algebra B(R2N ) × F.
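The construction of Lemma 4.1 has a direct analogue on a finite set, which makes the maximality property easy to verify by hand. The sketch below is an illustration under that assumption (probability vectors in place of densities on $R^N$); the function name `maximal_coupling` and the sample vectors are hypothetical. It builds the joint law of the pair $\alpha\xi + (1 - \alpha)\xi_{1,2}$ and returns it together with $\rho$.

```python
from fractions import Fraction

def maximal_coupling(p1, p2):
    """Joint pmf of the pair alpha*xi + (1 - alpha)*xi_i on a finite set,
    built from p = (p1 ^ p2)/(1 - rho) and p_hat_i = (p_i - p1 ^ p2)/rho."""
    n = len(p1)
    common = [min(a, b) for a, b in zip(p1, p2)]        # p1 ^ p2 = (1 - rho) * p
    rho = 1 - sum(common)                                # variation distance
    joint = [[Fraction(0)] * n for _ in range(n)]
    for x in range(n):
        joint[x][x] += common[x]                         # alpha = 1: both equal xi
    if rho > 0:
        for x in range(n):
            for y in range(n):
                # alpha = 0: independent draws from p_hat_1 and p_hat_2
                joint[x][y] += (p1[x] - common[x]) * (p2[y] - common[y]) / rho
    return rho, joint

# Hypothetical probability vectors on a four-point set.
p1 = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4), Fraction(0)]
p2 = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 8), Fraction(3, 8)]
rho, joint = maximal_coupling(p1, p2)

# Probability that the two components of the coupling differ.
off_diagonal = sum(joint[x][y] for x in range(4) for y in range(4) if x != y)
```

In the discrete case the supports of $\hat p_1$ and $\hat p_2$ are disjoint, which replaces the absolute-continuity argument used in the text to show that the independent components coincide with probability zero.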


To prove the theorem, we shall need the lemma below:

Lemma 4.3. Let $\mu_z \in \mathcal{P}(R^N)$, $z \in R^d$, be a family of probability measures such that $\mu_z(dx) = p_z(x)\,dx$, where $p_z \in L^1(R^N_x)$ for each $z \in R^d$ and $p_z(x)$ is measurable as a function of $(x, z) \in R^N \times R^d$. Then there is a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and a family of random variables $\zeta_z : \Omega \to R^N$ such that $D(\zeta_z) = \mu_z$ for all $z \in R^d$ and $\zeta_z(\omega)$ is measurable with respect to $(z, \omega)$.

Proof. If $N = 1$, then we take $(\Omega, \mathcal{F}, \mathbb{P}) = ([0, 1], \mathcal{B}, dt)$, where $\mathcal{B}$ is the Borel σ-algebra and $dt$ is the Lebesgue measure. Denoting by $F_z(\lambda)$ the distribution function of the measure $\mu_z$, $F_z(\lambda) = \mu_z((-\infty, \lambda])$, we set

$$\zeta_z(t) = \min\{\lambda : F_z(\lambda) \ge t\}.$$

The map $(t, z) \mapsto \zeta_z(t)$ from $[0, 1] \times R^d$ to $R$ is measurable, and the distribution function of $D(\zeta_z)$ is equal to $F_z$. Thus, for $N = 1$ the lemma is proved.

We now assume that the required assertion is established for $N = L$ and prove it for $N = L + 1$. Let us write $x \in R^{L+1}$ as $x = (x', y)$, where $x' \in R^L$ and $y \in R$. Decomposing $\mu_z$ in terms of the conditional density (see [GS]), we write

$$\mu_z(dx) = p_z(x)\,dx = p_z(x' \mid y)\,dx'\; q_z(y)\,dy. \tag{4.7}$$

Here

$$q_z(y) = \int_{R^L} p_z(x', y)\,dx', \qquad p_z(x' \mid y) = \frac{p_z(x', y)}{q_z(y)},$$

where we set $0/0 = \infty/\infty = 0$. Applying the induction hypothesis with $z$ replaced by $(z, y)$, we find a probability space $(\Omega', \mathcal{F}', \mathbb{P}')$ and a measurable map

$$\zeta'_z(\omega', y) : \Omega' \times R^d \times R \to R^L$$

such that $D\bigl(\zeta'_z(\cdot, y)\bigr) = p_z(x' \mid y)\,dx'$ for each $(z, y) \in R^d \times R$. Applying the first step of the proof, we construct a measurable map $\xi_z(t) : [0, 1] \times R^d \to R$ such that $D(\xi_z) = q_z(\lambda)\,d\lambda$. We now set $\Omega = \Omega' \times [0, 1]$ and

$$\zeta_z(\omega', t) = \bigl(\zeta'_z(\omega', \xi_z(t)),\, \xi_z(t)\bigr) \in R^{L+1}.$$

We have constructed a measurable map $\Omega \times R^d \to R^{L+1}$ such that, for any fixed $z \in R^d$, its distribution is given by the right-hand side of (4.7).

Proof of Theorem 4.2. Applying Lemma 4.3 to measures in $R^N$ given by the densities $p$ and $\hat p_{1,2}$, we construct probability spaces $(F_j, S_j, P_j)$, $j = 0, 1, 2$, and random variables $\xi^j_{(v_1, v_2)}$ on $F_j$ such that

$$D\bigl(\xi^0_{(v_1, v_2)}\bigr) = p(x; v_1, v_2)\,dx, \qquad D\bigl(\xi^j_{(v_1, v_2)}\bigr) = \hat p_j(x; v_1, v_2)\,dx, \quad j = 1, 2. \tag{4.8}$$
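The one-dimensional step in the proof of Lemma 4.3, $\zeta_z(t) = \min\{\lambda : F_z(\lambda) \ge t\}$, is the usual quantile transform, and the random variables in (4.8) are obtained by iterating it. A minimal sketch for a fixed parameter $z$ follows; it assumes a law with finitely many atoms instead of a density, and the name `quantile_map` and the sample law are hypothetical.

```python
import bisect

def quantile_map(values, probs):
    """Return zeta(t) = min{lam : F(lam) >= t} for the law that puts
    mass probs[i] on values[i]; `values` must be sorted increasingly."""
    cdf, acc = [], 0.0
    for p in probs:
        acc += p
        cdf.append(acc)

    def zeta(t):
        i = bisect.bisect_left(cdf, t)          # first index with cdf[i] >= t
        return values[min(i, len(values) - 1)]

    return zeta

# Hypothetical law: mass 1/4 at -1, mass 1/2 at 0, mass 1/4 at 2.
zeta = quantile_map([-1.0, 0.0, 2.0], [0.25, 0.5, 0.25])

# Push the midpoints of a uniform grid on [0, 1] through zeta and count hits.
grid = [(i + 0.5) / 1000 for i in range(1000)]
counts = {v: sum(1 for t in grid if zeta(t) == v) for v in (-1.0, 0.0, 2.0)}
```

Pushing Lebesgue measure on $[0, 1]$ through `zeta` reproduces the prescribed masses, which is the statement that the distribution function of $D(\zeta_z)$ equals $F_z$.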

We also define a random variable αρ : [0, 1] → {0, 1}, ρ = ρ(v1 , v2 ), by the formula αρ (t) = χ[0,1−ρ] (t),


where [0, 1] is endowed with the Borel σ -algebra and the Lebesgue measure, and χ[0,r] is the characteristic function of the interval [0, r]. We now define the required probability space as the set $ = F0 × F1 × F2 × [0, 1] with the σ -algebra and the probability of direct product. The natural extensions3 of αρ j and ξ(v1 ,v2 ) , j = 0, 1, 2, to $ (for which we retain the same notations) form a quadruple of independent random variables satisfying (4.8) and also the relations P{αρ = 1} = 1 − ρ(v1 , v2 ),

P{αρ = 0} = ρ(v1 , v2 ).

A maximal coupling (=1 , =2 ) for the measures (νv1 , νv2 ) that satisfies assertion (ii) of 0 the theorem can now be defined by formula (4.6), in which α = αρ , ξ = ξ(v , and 1 ,v2 ) j

ξj = ξ(v1 ,v2 ) , j = 1, 2.

Acknowledgements. The authors thank Roger Tribe and Sergei Foss for fruitful discussions of the coupling approach during the Symposium “Stochastic Fluid Equations” in Warwick on January 19–20, 2001, and at seminars in Heriot-Watt University, respectively. The authors are also grateful to Jan Kristensen for useful remarks on functional analysis. This research was supported by EPSRC, grant GR/N63055/01.

References

[BF] Borovkov, A.A., Foss, S.G.: Stochastically recursive sequences and their generalizations. Siberian Adv. in Math. 2, no. 1, 16–81 (1992)
[BL] Bressaud, X., Liverani, C.: Anosov diffeomorphism and coupling. To appear in Ergodic Theory Dynam. Systems
[BKL] Bricmont, J., Kupiainen, A., Lefevere, R.: Exponential mixing for the 2D stochastic Navier–Stokes dynamics. Preprint
[EMS] E, W., Mattingly, J.C., Sinai, Ya.G.: Gibbsian dynamics and ergodicity for the stochastically forced Navier–Stokes equation. Preprint
[GS] Gihman, I.I., Skorohod, A.V.: The Theory of Stochastic Processes I. Berlin–Heidelberg–New York: Springer-Verlag, 1980
[H] Hörmander, L.: The Analysis of Linear Partial Differential Operators I. Distribution Theory and Fourier Analysis. Berlin: Springer-Verlag, 1983
[KS1] Kuksin, S., Shirikyan, A.: Stochastic dissipative PDE’s and Gibbs measures. Comm. Math. Phys. 213, 291–330 (2000)
[KS2] Kuksin, S., Shirikyan, A.: On dissipative systems perturbed by bounded random kick-forces. Submitted to Ergodic Theory Dynam. Systems (www.ma.hw.ac.uk/kuksin)
[KS3] Kuksin, S., Shirikyan, A.: Ergodicity for the randomly forced 2D Navier–Stokes equations. Preprint. (www.ma.hw.ac.uk/kuksin)
[Lin] Lindvall, T.: Lectures on the Coupling Method. New York: John Wiley & Sons, 1992
[V] Veretennikov, A.Yu.: Parametric and non-parametric estimation of Markov chains. Moscow: Moscow State University Press, 2000 (in Russian)
[Y] Young, L.-S.: Recurrence times and rates of mixing. Israel J. Math. 110, 153–188 (1999)

Communicated by G. Gallavotti

3 For instance, the extension of α is given by α (ω) = α (t), where ω = (ω , ω , ω , t) ∈ $. ρ ρ ρ 0 1 2

Commun. Math. Phys. 221, 367 – 384 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Loop Homotopy Algebras in Closed String Field Theory

Martin Markl

Mathematical Institute of the Academy, Žitná 25, 11567 Prague 1, The Czech Republic. E-mail: [email protected]

Received: 10 November 1999 / Accepted: 29 March 2001

Abstract: Barton Zwiebach constructed [20] “string products” on the Hilbert space of a combined conformal field theory of matter and ghosts, satisfying the “main identity”. It has been well known that the “tree level” of the theory gives an example of a strongly homotopy Lie algebra (though, as we will see later, this is not the whole truth). Strongly homotopy Lie algebras are now well-understood objects. On the one hand, strongly homotopy Lie algebra is given by a square zero coderivation on the cofree cocommutative connected coalgebra [14, 13]; on the other hand, strongly homotopy Lie algebras are algebras over the cobar dual of the operad Com for commutative algebras [9]. As far as we know, no such characterization of the structure of string products for arbitrary genera has been available, though there are two series of papers directly pointing towards the requisite characterization. As far as the characterization in terms of (co)derivations is concerned, we need the concept of higher order (co)derivations, which has been developed, for example, in [2, 3]. These higher order derivations were used in the analysis of the “master identity.” For our characterization we need to understand the behavior of these higher (co)derivations on (co)free (co)algebras. The necessary machinery for the operadic approach is that of modular operads, anticipated in [5] and introduced in [8]. We believe that the modular operad structure on the compactified moduli space of Riemann surfaces of arbitrary genera implies the existence of the structure we are interested in the same manner as was explained for the tree level in [11]. We also indicate how to adapt the loop homotopy structure to the case of open string field theory [19].

1. Introduction

Let $\mathcal{H}$ be the Hilbert space of a combined conformal field theory of matter and ghosts and let $\mathcal{H}_{\mathrm{rel}} \subset \mathcal{H}$ be the subspace of elements annihilated by $b_0^- := b_0 - \bar{b}_0$ and


$L_0^- := L_0 - \bar{L}_0$ (see, for example, [11, Sect. 4]). Barton Zwiebach constructed in [20], for each “genus” g ≥ 0 and for each n ≥ 0, multilinear “string products”

\[ \mathcal{H}_{\mathrm{rel}}^{\otimes n} \ni B_1 \times \cdots \times B_n \longmapsto [B_1, \dots, B_n]_g \in \mathcal{H}_{\mathrm{rel}}. \]

Recall the basic properties of these products. If gh(−) denotes the ghost number, then [20, (4.8)]

\[ \mathrm{gh}([B_1, \dots, B_n]_g) = 3 - 2n + \sum_{i=1}^{n} \mathrm{gh}(B_i). \]
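As a quick consistency check of the ghost number formula, here is a worked instance for the two lowest arities (our own arithmetic, not a formula quoted from [20]):

```latex
\begin{align*}
n = 1:\quad & \mathrm{gh}([B]_g) = 3 - 2 + \mathrm{gh}(B) = \mathrm{gh}(B) + 1, \\
n = 2:\quad & \mathrm{gh}([B_1, B_2]_g) = 3 - 4 + \mathrm{gh}(B_1) + \mathrm{gh}(B_2)
             = \mathrm{gh}(B_1) + \mathrm{gh}(B_2) - 1 .
\end{align*}
```

So every unary product raises the ghost number by one, while every binary product lowers the total ghost number by one, independently of the genus g.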

The string products are graded (super) commutative [20, (4.4)]:

\[ [B_1, \dots, B_i, B_{i+1}, \dots, B_n]_g = (-1)^{B_i B_{i+1}} [B_1, \dots, B_{i+1}, B_i, \dots, B_n]_g. \tag{1} \]

Here we used the notation $(-1)^{B_i B_{i+1}} := (-1)^{\mathrm{gh}(B_i)\mathrm{gh}(B_{i+1})}$. For n = 0 and g ≥ 0, $[\,\cdot\,]_g \in \mathcal{H}_{\mathrm{rel}}$ is just a constant, and the products are constructed in such a way that $[\,\cdot\,]_0 = 0$ [20, (4.6)]. The linear operation $[B]_0 =: QB$ is identified with the BRST differential of the theory. These products satisfy, for all n, g, the main identity [20, (4.13)]:

\[ 0 = \sum_{\substack{g_1 + g_2 = g \\ k + l = n}} \ \sum_{\{i_l\}, \{j_k\}} \sigma(i_l, j_k)\, \bigl[B_{i_1}, \dots, B_{i_l}, [B_{j_1}, \dots, B_{j_k}]_{g_2}\bigr]_{g_1} + \frac{1}{2} \sum_s (-1)^{\Phi_s} [\Phi_s, \Phi^s, B_1, \dots, B_n]_{g-1}. \tag{2} \]

Here the first sum runs over all $g_1 + g_2 = g$, $k + l = n$, and all sequences $i_1 < \cdots < i_l$, $j_1 < \cdots < j_k$ such that $\{i_1, \dots, i_l, j_1, \dots, j_k\} = \{1, \dots, n\}$. Such sequences are called unshuffles (see the terminology introduced at the beginning of Sect. 2). The sign $\sigma(i_l, j_k)$ is picked up by rearranging the sequence $(Q, B_1, \dots, B_n)$ into the order $(B_{i_1}, \dots, B_{i_l}, Q, B_{j_1}, \dots, B_{j_k})$. In the second sum, $\{\Phi_s\}$ is a basis of $\mathcal{H}_{\mathrm{rel}}$ and $\{\Phi^s\}$ its dual basis in the sense that $(-1)^{\Phi_r} \langle \Phi^r, \Phi_s \rangle = \delta^r_s$ (Kronecker delta), where $\langle -, - \rangle$ denotes the bilinear inner product on $\mathcal{H}$ [20, (2.44)]. Let us remark that, in the original formulation of [20], $\{\Phi_s\}$ was a basis of the whole $\mathcal{H}$, but the sum in (2) was restricted to $\mathcal{H}_{\mathrm{rel}}$. The product satisfies [20, (2.62)]:

\[ \langle A, B \rangle = (-1)^{(A+1)(B+1)} \langle B, A \rangle \tag{3} \]

and it is nontrivial only for elements whose ghost numbers add up to five:

\[ \text{if } \langle A, B \rangle \neq 0, \text{ then } \mathrm{gh}(A) + \mathrm{gh}(B) = 5. \tag{4} \]

The above two conditions in fact imply that $\langle A, B \rangle = \langle B, A \rangle$. Moreover, the product $\langle -, - \rangle$ is Q-invariant [20, (2.63)]:

\[ \langle QA, B \rangle = (-1)^{A} \langle A, QB \rangle. \tag{5} \]


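Conditions (3) and (4) also imply that the element $\sum_s (-1)^{\Phi_s} \Phi_s \otimes \Phi^s \in \mathcal{H}_{\mathrm{rel}}^{\otimes 2}$ is symmetric in the sense that

\[ (-1)^{\Phi_s} \Phi_s \otimes \Phi^s = (-1)^{\Phi_s} \Phi^s \otimes \Phi_s = -(-1)^{\Phi^s} \Phi^s \otimes \Phi_s. \tag{6} \]

We use, in the previous formula as well as at many places in the rest of the paper, the Einstein convention of summing over repeated indices. The last important property of string products is that the element

\[ \Phi_s \otimes [\Phi^s, B_1, \dots, B_{n-1}]_g \in \mathcal{H}_{\mathrm{rel}}^{\otimes 2} \tag{7} \]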

is antisymmetric. This property is not explicitly stated in [20], though it is used in the proof of the identity [20, (4.28)]:

\[ \sum_s \bigl[B_1, \dots, B_l, \Phi_s, [\Phi^s, A_1, \dots, A_k]_{g_2}\bigr]_{g_1} = 0, \quad \text{for arbitrary } l \geq 0,\ k \geq 0, \]

which then immediately follows from the antisymmetry (7) by the graded commutativity (1) of string products. Eq. (7) is a consequence of the important fact that the string products are defined with the aid of the multilinear string functions [20, (7.72)]

\[ \mathcal{H}_{\mathrm{rel}}^{\otimes (n+1)} \ni B_0, \dots, B_n \longmapsto \{B_0, \dots, B_n\}_g \in \mathbb{C} \]

by [20, (4.33)]

\[ [B_1, \dots, B_n]_g := \sum_t (-1)^{\Phi_t}\, \Phi_t \cdot \{\Phi^t, B_1, \dots, B_n\}_g. \tag{8} \]

Let us show that the graded commutativity [20, (4.36)]

\[ \{B_0, \dots, B_i, B_{i+1}, \dots, B_n\}_g = (-1)^{B_i B_{i+1}} \{B_0, \dots, B_{i+1}, B_i, \dots, B_n\}_g \]

of the string multilinear functions implies the antisymmetry of the element in (7). Indeed, because of (6), we may write (8) as

\[ [B_1, \dots, B_n]_g = \sum_t (-1)^{\Phi_t}\, \Phi^t \cdot \{\Phi_t, B_1, \dots, B_n\}_g, \]

thus the element in (7) takes the form

\[ \sum_{s,t} (-1)^{\Phi_t} (\Phi_s \otimes \Phi^t) \cdot \{\Phi_t, \Phi^s, B_1, \dots, B_{n-1}\}_g. \]

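The antisymmetry we are proving means that

\[ \sum_{s,t} (-1)^{\Phi_t}\, \Phi_s \otimes \Phi^t \cdot \{\Phi_t, \Phi^s, B_1, \dots, B_{n-1}\}_g = - \sum_{s,t} (-1)^{\Phi_t + \Phi_s \Phi^t}\, \Phi^t \otimes \Phi_s \cdot \{\Phi_t, \Phi^s, B_1, \dots, B_{n-1}\}_g. \]

The replacement t ←→ s in the right-hand side of the above equation gives

\[ - \sum_{s,t} (-1)^{\Phi_s + \Phi_t \Phi^s}\, \Phi^s \otimes \Phi_t \cdot \{\Phi_s, \Phi^t, B_1, \dots, B_{n-1}\}_g \]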


which can be further rewritten, using the graded commutativity of string functions, as

\[ - \sum_{s,t} (-1)^{\Phi_s + \Phi_t \Phi^s + \Phi_s \Phi^t}\, \Phi^s \otimes \Phi_t \cdot \{\Phi^t, \Phi_s, B_1, \dots, B_{n-1}\}_g. \tag{9} \]

Since $\mathrm{gh}(\Phi_s) \equiv \mathrm{gh}(\Phi^s) + 1 \pmod 2$ and $\mathrm{gh}(\Phi_t) \equiv \mathrm{gh}(\Phi^t) + 1 \pmod 2$,

\[ \mathrm{gh}(\Phi_s)\mathrm{gh}(\Phi^t) \equiv \mathrm{gh}(\Phi^s)\mathrm{gh}(\Phi_t) + \mathrm{gh}(\Phi^s) + \mathrm{gh}(\Phi_t) + 1 \pmod 2, \]

therefore the sign factor in (9) is $(-1)^{\Phi_t}$. This proves the claim.

2. Sign Interlude and the Definition

In this brief section we rewrite the axioms of string products into a more usual and convenient formalism. All algebraic objects will be considered over a fixed field k of characteristic zero. This, of course, includes the case k = C of the previous section. We will systematically use the Koszul sign convention meaning that whenever we commute two “things” of degrees p and q, respectively, we multiply by the sign factor $(-1)^{pq}$. Our conventions concerning graded vector spaces, permutations, shuffles, etc., will follow closely those of [15].

For graded indeterminates $x_1, \dots, x_n$ and a permutation $\sigma \in \Sigma_n$ define the Koszul sign $\epsilon(\sigma) = \epsilon(\sigma; x_1, \dots, x_n)$ by

\[ x_1 \wedge \cdots \wedge x_n = \epsilon(\sigma; x_1, \dots, x_n) \cdot x_{\sigma(1)} \wedge \cdots \wedge x_{\sigma(n)}, \]

which is to be satisfied in the free graded commutative algebra $\wedge(x_1, \dots, x_n)$. Define also $\chi(\sigma) := \chi(\sigma; x_1, \dots, x_n) := \mathrm{sgn}(\sigma) \cdot \epsilon(\sigma; x_1, \dots, x_n)$. We say that $\sigma \in \Sigma_n$ is an (i, j)-unshuffle, i + j = n, if σ(1) < · · · < σ(i) and σ(i + 1) < · · · < σ(n). In this case we write σ ∈ unsh(i, j). In the obvious similar manner one may introduce (i, j, k)-unshuffles, etc.

Let us denote, for a graded vector space U, by ↑U (resp. ↓U) the suspension (resp. the desuspension) of U, i.e. the graded vector space defined by $(\uparrow U)_p := U_{p-1}$ (resp. $(\downarrow U)_p := U_{p+1}$). We have the obvious natural maps ↑ : U → ↑U and ↓ : U → ↓U. For a graded vector space U, let its reflection r(U) be the graded vector space defined by $r(U)_p := U_{-p}$. There is an obvious natural map r : U → r(U). Observe that $r^2 = 1$, r ◦ ↑ = ↓ ◦ r and r ◦ ↓ = ↑ ◦ r.

Take now $V := r(\downarrow \mathcal{H}_{\mathrm{rel}})$. Define, for each g ≥ 0 and n ≥ 0, multilinear maps $l_n^g : V^{\otimes n} \to V$ by

\[ l_n^g(v_1, \dots, v_n) := (-1)^{(n-1)v_1 + (n-2)v_2 + \cdots + v_{n-1}} \downarrow [\uparrow r(v_1), \dots, \uparrow r(v_n)]_g, \]

for $v_1, \dots, v_n \in V^{\otimes n}$. Define also the bilinear form B : V ⊗ V → C by

\[ B(u, v) := \langle \uparrow r(u), \uparrow r(v) \rangle \tag{10} \]

and, finally, the element $h = h_s \otimes h^s$ by $h_s := (-1)^{\Phi_s} r(\downarrow \Phi_s)$, $h^s := r(\downarrow \Phi^s)$, which means that $h_s \otimes h^s := (-1)^{\Phi_s} r(\downarrow \Phi_s) \otimes r(\downarrow \Phi^s)$ (Einstein summation convention). A technical, but absolutely straightforward, calculation shows that the above structure is an example of a loop homotopy Lie algebra in the sense of the following definition.


Definition 1. A loop homotopy Lie algebra is a triple $V = (V, B, \{l_n^g\})$ consisting of
(i) a Z-graded vector space V, $V_* = \bigoplus V_i$,
(ii) a graded symmetric nondegenerate bilinear degree +3 form B : V ⊗ V → k, and
(iii) the set $\{l_n^g\}_{n,g \geq 0}$ of degree n − 2 multilinear antisymmetric operations $l_n^g : V^{\otimes n} \to V$.
These data are supposed to satisfy the following two axioms:

(A1) For any n, g ≥ 0 and $v_1, \dots, v_n \in V$, the following “main identity”

\[ 0 = \sum_{\substack{k + l = n + 1 \\ g_1 + g_2 = g}} \ \sum_{\sigma \in \mathrm{unsh}(l, n-l)} \chi(\sigma) (-1)^{l(k-1)}\, l_k^{g_1}\bigl(l_l^{g_2}(v_{\sigma(1)}, \dots, v_{\sigma(l)}), v_{\sigma(l+1)}, \dots, v_{\sigma(n)}\bigr) + \frac{1}{2} \sum_s (-1)^{h_s + n}\, l_{n+2}^{g-1}(h_s, h^s, v_1, \dots, v_n) \tag{11} \]

holds. In the second term, $\{h_s\}$ and $\{h^s\}$ are bases of the vector space V dual to each other in the sense that

\[ B(h_s, h^t) = \delta_s^t. \tag{12} \]

(A2) The element

\[ (-1)^{(n+1)h_s}\, h_s \otimes l_n^g(h^s, v_1, \dots, v_{n-1}) \in V \otimes V \tag{13} \]

is symmetric, for all g ≥ 0, n ≥ 0, and $v_1, \dots, v_{n-1} \in V$.

Remark 1. To give a reasonable meaning to the “basis $\{h_s\}$ of V”, we must suppose either that V is finite dimensional, or that it has a suitable topology, as in the case of string products. We will always tacitly assume that assumptions of this form have been made. In the “main identity” for g = 0 we put, by definition, $l_n^{-1} = 0$. Because $\deg(h_s) + \deg(h^s) = -3$, $\deg(h_s)\deg(h^s)$ is even. The graded symmetry of B then implies that, besides (12), also $B(h^s, h_t) = \delta^s_t$. The element $h = h_s \otimes h^s$ is easily seen to be symmetric, $h_s \otimes h^s = (-1)^{h_s h^s} h^s \otimes h_s = h^s \otimes h_s$. For n = 0 axiom (A1) gives

\[ 0 = \sum_{g_1 + g_2 = g} l_1^{g_1}(l_0^{g_2}(\cdot)) + \frac{1}{2} \sum_s (-1)^{h_s}\, l_2^{g-1}(h_s, h^s), \]

while for n = 1 it gives

\[ 0 = \sum_{g_1 + g_2 = g} \bigl( l_1^{g_1}(l_1^{g_2}(v)) + l_2^{g_1}(l_0^{g_2}(\cdot), v) \bigr) - \frac{1}{2} \sum_s (-1)^{h_s}\, l_3^{g-1}(h_s, h^s, v), \tag{14} \]

for all v ∈ V. From this moment on, we will assume that $l_0^g = 0$, for all g ≥ 0, that is, the theory has “no constants”. This assumption is not really necessary, but it will considerably simplify our exposition.
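The unshuffles and the signs ε(σ) and χ(σ) entering the main identity (11) are easy to enumerate mechanically. The following is a minimal sketch, not taken from the paper; the function names `unshuffles`, `sgn`, `koszul_sign` and `chi` are our own, and permutations are represented as tuples (σ(1), ..., σ(n)) with 1-based entries.

```python
from itertools import combinations


def unshuffles(i, j):
    """Yield all (i, j)-unshuffles of {1, ..., i+j} as tuples (sigma(1), ..., sigma(n))."""
    n = i + j
    for first in combinations(range(1, n + 1), i):
        # The remaining indices, in increasing order, form the second block.
        rest = tuple(k for k in range(1, n + 1) if k not in first)
        yield first + rest


def sgn(sigma):
    """Ordinary signature of a permutation, computed from its inversion count."""
    inversions = sum(
        1
        for a in range(len(sigma))
        for b in range(a + 1, len(sigma))
        if sigma[a] > sigma[b]
    )
    return -1 if inversions % 2 else 1


def koszul_sign(sigma, degrees):
    """Koszul sign epsilon(sigma; x_1, ..., x_n), where degrees[k-1] = deg(x_k).

    Each inversion of sigma swaps two variables and contributes (-1)^(deg * deg).
    """
    sign = 1
    for a in range(len(sigma)):
        for b in range(a + 1, len(sigma)):
            if sigma[a] > sigma[b]:
                if (degrees[sigma[a] - 1] * degrees[sigma[b] - 1]) % 2:
                    sign = -sign
    return sign


def chi(sigma, degrees):
    """chi(sigma) := sgn(sigma) * epsilon(sigma), as defined in Sect. 2."""
    return sgn(sigma) * koszul_sign(sigma, degrees)
```

For instance, `list(unshuffles(1, 2))` lists the three (1, 2)-unshuffles, and for two odd variables the transposition has ε = −1 and χ = +1, which is the sign pattern of a graded antisymmetric operation.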


Exercise 1. Let us denote $\partial := l_1^0$. Equation (14) implies that $\partial^2 = 0$ (recall our assumption $l_0^g = 0$!). Thus ∂ is a degree −1 differential on the space V. The symmetry of $h_s \otimes \partial(h^s)$ (Axiom (A2) with n = 1 and g = 0) is equivalent to the ∂-invariance of the form B, $B(\partial u, v) + (-1)^u B(u, \partial v) = 0$, for u, v ∈ V.

The tree level. Let us discuss the “tree level” (g = 0) specialization of the above structure. The only nontrivial $l_n^g$’s are $l_n := l_n^0$, n ≥ 1. The main identity (11) for g = 0 reduces to

\[ 0 = \sum_{k + l = n + 1} \ \sum_{\sigma \in \mathrm{unsh}(l, n-l)} \chi(\sigma) (-1)^{l(k-1)}\, l_k\bigl(l_l(v_{\sigma(1)}, \dots, v_{\sigma(l)}), v_{\sigma(l+1)}, \dots, v_{\sigma(n)}\bigr) \tag{15} \]

while, for g = 1 it gives (after forgetting the overall factor $(-1)^n/2$)

\[ 0 = \sum_s (-1)^{h_s}\, l_{n+2}(h_s, h^s, v_1, \dots, v_n). \tag{16} \]

Axiom (A2) says that the elements

\[ (-1)^{(n+1)h_s}\, h_s \otimes l_n(h^s, v_1, \dots, v_{n-1}) \tag{17} \]

are symmetric. We immediately recognize (15) as the defining axiom for strongly homotopy Lie algebras [13, Def. 2.1]. Thus the tree level loop homotopy Lie algebra is a strongly homotopy Lie algebra $(V, \{l_n\})$ with an additional structure given by a bilinear form B such that the element $h = h_s \otimes h^s$, uniquely determined by B, satisfies (16) and (17). We see that the “tree-level” specialization is a richer structure than just a strongly homotopy Lie algebra as it is usually understood. A proper name for such a structure would be a cyclic strongly homotopy Lie algebra.

3. Higher Order (Co)derivations

In this section we investigate properties of higher order coderivations of cofree cocommutative coalgebras. Because this paper is meant for humans, not for robots, we derive necessary properties for derivations on free commutative algebras, and then simply dualize the results. This is an absolutely correct procedure, except for one fine point related to the cofreeness, see Remark 3.

The following definitions were taken from [1, 3]. Let A be a graded (super) commutative algebra and ∇ : A → A a homogeneous degree k linear map. We define inductively, for each n ≥ 1, degree k linear deviations $\Phi^n_\nabla : A^{\otimes n} \to A$ by

\begin{align*}
\Phi^1_\nabla(a) &:= \nabla(a), \\
\Phi^2_\nabla(a, b) &:= \nabla(ab) - \nabla(a)b - (-1)^{ka} a\nabla(b), \\
\Phi^3_\nabla(a, b, c) &:= \nabla(abc) - \nabla(ab)c - (-1)^{a(b+c)} \nabla(bc)a - (-1)^{c(a+b)} \nabla(ca)b \\
&\quad + \nabla(a)bc + (-1)^{a(b+c)} \nabla(b)ca + (-1)^{c(a+b)} \nabla(c)ab, \\
&\ \ \vdots \\
\Phi^{n+1}_\nabla(a_1, \dots, a_{n+1}) &:= \Phi^n_\nabla(a_1, \dots, a_n a_{n+1}) - \Phi^n_\nabla(a_1, \dots, a_n)a_{n+1} \\
&\quad - (-1)^{a_n \cdot a_{n+1}}\, \Phi^n_\nabla(a_1, \dots, a_{n-1}, a_{n+1})a_n.
\end{align*}
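The inductive definition of the deviations can be tested on a toy example. The sketch below is our own illustration and makes a simplifying assumption: it works in the ungraded polynomial algebra k[x] (all degrees even), so every Koszul sign is +1. Polynomials are stored as coefficient lists, and the helper names (`trim`, `add`, `scale`, `mul`, `deviation`, ...) are hypothetical, not notation from [1, 3].

```python
def trim(p):
    """Drop trailing zero coefficients so that the zero polynomial is []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    """Sum of two polynomials given as coefficient lists [c0, c1, ...]."""
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])


def scale(c, p):
    """Multiply a polynomial by the scalar c."""
    return trim([c * a for a in p])


def mul(p, q):
    """Product of two polynomials."""
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def first_derivative(p):
    """d/dx, an ordinary (order 1) derivation."""
    return trim([k * c for k, c in enumerate(p)][1:])


def second_derivative(p):
    """d^2/dx^2, expected to be a derivation of order 2."""
    return first_derivative(first_derivative(p))


def deviation(nabla, args):
    """Ungraded deviation of nabla on args, via the inductive formula.

    Phi^{n+1}(a_1..a_{n+1}) = Phi^n(a_1, .., a_n a_{n+1})
                              - Phi^n(a_1, .., a_n) a_{n+1}
                              - Phi^n(a_1, .., a_{n-1}, a_{n+1}) a_n
    """
    args = list(args)
    if len(args) == 1:
        return nabla(args[0])
    *head, a, b = args
    merged = deviation(nabla, head + [mul(a, b)])
    keep_a = mul(deviation(nabla, head + [a]), b)
    keep_b = mul(deviation(nabla, head + [b]), a)
    return add(merged, scale(-1, add(keep_a, keep_b)))
```

With x = [0, 1], the second deviation of d/dx vanishes on any pair of arguments, while for the second derivative the second deviation of (x, x) is the constant 2 and the third deviation of (x, x², x³) is zero, in line with the general formula 2a'b' for the second deviation of d²/dx².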


As a matter of fact, it is possible to give a non-inductive formula for $\Phi^n_\nabla$, namely

\[ \Phi^n_\nabla(a_1, \dots, a_n) = \sum_{1 \leq i \leq n} \ \sum_{\sigma \in \mathrm{unsh}(i, n-i)} (-1)^{n-i} \epsilon(\sigma)\, \nabla(a_{\sigma(1)} \cdots a_{\sigma(i)})\, a_{\sigma(i+1)} \cdots a_{\sigma(n)}. \tag{18} \]

We say that ∇ is a derivation of order r if $\Phi^{r+1}_\nabla$ is identically zero. In this case we write $\nabla \in \mathrm{Der}^r_k(A)$, where k = deg(∇). In the following proposition, which was stated in [1], [−, −] denotes the graded anticommutator of endomorphisms.

Proposition 1. The subspaces $\mathrm{Der}^r_k(A)$ satisfy:
(i) $\mathrm{Der}^1_k(A) \subset \mathrm{Der}^2_k(A) \subset \mathrm{Der}^3_k(A) \subset \cdots$,
(ii) $\mathrm{Der}^r_k(A) \circ \mathrm{Der}^s_l(A) \subset \mathrm{Der}^{r+s}_{k+l}(A)$, and
(iii) $[\mathrm{Der}^r_k(A), \mathrm{Der}^s_l(A)] \subset \mathrm{Der}^{r+s-1}_{k+l}(A)$.

Let now A = ∧X be the free graded commutative algebra on the graded vector space X. Let us prove the following useful proposition.

Proposition 2. Let $\nabla \in \mathrm{Der}^r_k(\wedge X)$. Then ∇ is uniquely determined by its values on the products $x_1 \cdots x_s$, s ≤ r, $x_i \in X$ for 1 ≤ i ≤ s. In particular, ∇ = 0 if and only if $\nabla(x_1 \cdots x_s) = 0$, for $x_1 \cdots x_s$ as above.

Proof. Since $\nabla \in \mathrm{Der}^r_k(\wedge X)$ is linear, it is enough to prove that $\nabla(x_1 \cdots x_s) = 0$ for all s ≤ r implies that $\nabla(x_1 \cdots x_n) = 0$ for each n. This we prove inductively. Suppose we already know $\nabla(x_1 \cdots x_k) = 0$, for each k ≤ n, n ≥ r, and consider $\nabla(x_1 \cdots x_{n+1})$. We compute from (18) that

\[ \Phi^{n+1}_\nabla(x_1, \dots, x_{n+1}) = \nabla(x_1 \cdots x_{n+1}) + \sum_{1 \leq i \leq n} \ \sum_{\sigma \in \mathrm{unsh}(i, n-i+1)} (-1)^{n-i+1} \epsilon(\sigma)\, \nabla(x_{\sigma(1)} \cdots x_{\sigma(i)})\, x_{\sigma(i+1)} \cdots x_{\sigma(n+1)}. \]

Since $\nabla \in \mathrm{Der}^r_k(\wedge X)$ and n ≥ r, $\Phi^{n+1}_\nabla(x_1, \dots, x_{n+1}) = 0$, while the terms in the sum are zero by the inductive assumption. Thus $\nabla(x_1 \cdots x_{n+1}) = 0$ and the induction may go on.

Remark 2. 1-derivations are ordinary derivations, $\mathrm{Der}^1_k(A) = \mathrm{Der}_k(A)$. Proposition 2 then states the standard fact that derivations on free algebras are given by their restrictions to the space of generators.

For a fixed n, we denote by $\wedge^n X$ the subspace of ∧X spanned by the products $x_1 \cdots x_n$, $x_i \in X$, 1 ≤ i ≤ n; we put, by definition, $\wedge^0 X := k$. Let $\iota_n : \wedge^n X \hookrightarrow \wedge X$ be the inclusion. The following proposition says that r-derivations of the free algebra ∧X are in one-to-one correspondence with r-tuples of linear maps, $\{f_s : \wedge^s X \to \wedge X\}_{1 \leq s \leq r}$.

Proposition 3. Suppose we are given homogeneous degree k linear maps $f_s : \wedge^s X \to \wedge X$, for 1 ≤ s ≤ r. Then there exists a unique order r derivation $\nabla \in \mathrm{Der}^r_k(\wedge X)$ such that

\[ \nabla \circ \iota_s = f_s, \quad \text{for } 1 \leq s \leq r. \tag{19} \]


Proof. The uniqueness follows immediately from Proposition 2. To prove the existence, observe first that, given degree k linear maps $g_s : \wedge^s X \to \wedge X$, 1 ≤ s ≤ r, the formula

\[ \nabla(x_1 \cdots x_n) := \sum_{1 \leq s \leq \min(r, n)} \ \sum_{\sigma \in \mathrm{unsh}(s, n-s)} \epsilon(\sigma)\, g_s(x_{\sigma(1)} \cdots x_{\sigma(s)})\, x_{\sigma(s+1)} \cdots x_{\sigma(n)}, \]

defines an order r derivation. Condition (19) then leads to the following system of equations:

\begin{align*}
f_1(x_1) &= g_1(x_1), \\
f_2(x_1 x_2) &= g_2(x_1 x_2) + g_1(x_1)x_2 + (-1)^{x_1 x_2} g_1(x_2)x_1, \\
&\ \ \vdots \\
f_r(x_1 \cdots x_r) &= \sum_{1 \leq s \leq r} \ \sum_{\sigma \in \mathrm{unsh}(s, r-s)} \epsilon(\sigma)\, g_s(x_{\sigma(1)} \cdots x_{\sigma(s)})\, x_{\sigma(s+1)} \cdots x_{\sigma(r)}.
\end{align*}

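This system can obviously be solved for $g_s$, 1 ≤ s ≤ r.

Let us turn our attention to coalgebras. Suppose that C = (C, Δ) is a cocommutative coassociative coalgebra. To define higher-order coderivations of C, we need analogs of the deviations $\Phi^r_\nabla$ introduced above. By duality, we define, for any homogeneous degree k linear endomorphism Ω of C, degree k multilinear maps $\Psi^n_\Omega : C \to C^{\otimes n}$ inductively as

\begin{align*}
\Psi^1_\Omega &:= \Omega, \\
\Psi^2_\Omega &:= \Delta \circ \Omega - (\Omega \otimes 1) \circ \Delta - (1 \otimes \Omega) \circ \Delta, \\
\Psi^3_\Omega &:= \Delta^{[3]} \circ \Omega - (\Delta \otimes 1) \circ (\Omega \otimes 1) \circ \Delta - T_{312} \circ (\Delta \otimes 1) \circ (\Omega \otimes 1) \circ \Delta \\
&\quad - T_{231} \circ (\Delta \otimes 1) \circ (\Omega \otimes 1) \circ \Delta + (\Omega \otimes 1^2) \circ \Delta^{[3]} + T_{312} \circ (\Omega \otimes 1^2) \circ \Delta^{[3]} \\
&\quad + T_{231} \circ (\Omega \otimes 1^2) \circ \Delta^{[3]}, \\
&\ \ \vdots \\
\Psi^{n+1}_\Omega &:= (1^{n-1} \otimes \Delta) \circ \Psi^n_\Omega - (\Psi^n_\Omega \otimes 1) \circ \Delta - T_{1,2,\dots,n-1,n+1,n} \circ (\Psi^n_\Omega \otimes 1) \circ \Delta,
\end{align*}

where $\Delta^{[3]} := (\Delta \otimes 1)\Delta$ (= $(1 \otimes \Delta)\Delta$ by the coassociativity) and, for $\sigma \in \Sigma_n$, $T_{\sigma(1) \cdots \sigma(n)} : C^{\otimes n} \to C^{\otimes n}$ is defined by

\[ T_{\sigma(1) \cdots \sigma(n)}(x_1 \otimes \cdots \otimes x_n) := \epsilon(\sigma)(x_{\sigma(1)} \otimes \cdots \otimes x_{\sigma(n)}). \]

We say that a linear map Ω : C → C is an order r coderivation, if $\Psi^{r+1}_\Omega$ is identically zero. Let $\mathrm{coDer}^r_k(C)$ be the space of all such maps. The following proposition is an exact dual of Proposition 1.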


Proposition 4. The subspaces $\mathrm{coDer}^r_k(C)$ satisfy:
(i) $\mathrm{coDer}^1_k(C) \subset \mathrm{coDer}^2_k(C) \subset \mathrm{coDer}^3_k(C) \subset \cdots$,
(ii) $\mathrm{coDer}^r_k(C) \circ \mathrm{coDer}^s_l(C) \subset \mathrm{coDer}^{r+s}_{k+l}(C)$, and
(iii) $[\mathrm{coDer}^r_k(C), \mathrm{coDer}^s_l(C)] \subset \mathrm{coDer}^{r+s-1}_{k+l}(C)$.

Let W be a graded vector space and consider again the free graded commutative algebra ∧W on W. We introduce on ∧W a cocommutative coassociative comultiplication $\Delta = 1 \otimes \mathbb{1} + \bar{\Delta} + \mathbb{1} \otimes 1$ by defining the reduced diagonal $\bar{\Delta}$ as

\[ \bar{\Delta}(w_1 \cdots w_n) = \sum_{1 \leq i \leq n-1} \ \sum_{\sigma} \epsilon(\sigma)\, (w_{\sigma(1)} \cdots w_{\sigma(i)}) \otimes (w_{\sigma(i+1)} \cdots w_{\sigma(n)}), \]

$w_1 \cdots w_n \in \wedge^n W$, where σ runs through all (i, n − i) unshuffles. We denote the coalgebra (∧W, Δ) by c∧W.

Remark 3. Here it must be pointed out that c∧W is not the cofree cocommutative coassociative coalgebra cogenerated by W, as it is generally supposed to be. It is the cofree coalgebra in the category of connected coalgebras, see the discussion in [13, p. 2150].

Denote by $\pi_n : c\wedge W \to \wedge^n W$ the natural projection of vector spaces. The following proposition is the exact dual of Proposition 3.

Proposition 5. For each r-tuple $u_s : c\wedge W \to \wedge^s W$, 1 ≤ s ≤ r, of homogeneous degree k linear maps there exists a unique order r coderivation $\Omega \in \mathrm{coDer}^r_k(c\wedge W)$ such that

\[ \pi_s \circ \Omega = u_s, \quad \text{for } 1 \leq s \leq r. \tag{20} \]
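To make the reduced diagonal on ∧W concrete, the following sketch lists its terms for a monomial with prescribed degrees. It is an illustration under our own conventions, not code from the paper: generators are referred to by their positions 1, ..., n, and the hypothetical function `reduced_diagonal` returns triples (sign, left block, right block), one for each (i, n − i)-unshuffle with 1 ≤ i ≤ n − 1.

```python
from itertools import combinations


def koszul_sign(sigma, degrees):
    """Koszul sign of the permutation sigma acting on variables of given degrees."""
    sign = 1
    for a in range(len(sigma)):
        for b in range(a + 1, len(sigma)):
            # Every inversion swaps two variables; only odd-odd swaps give a sign.
            if sigma[a] > sigma[b]:
                if (degrees[sigma[a] - 1] * degrees[sigma[b] - 1]) % 2:
                    sign = -sign
    return sign


def reduced_diagonal(degrees):
    """Terms of the reduced diagonal applied to w_1 ... w_n.

    degrees[k-1] is the degree of w_k. Each term is (sign, left, right), meaning
    sign * (product of w_k, k in left) tensor (product of w_k, k in right).
    """
    n = len(degrees)
    terms = []
    for i in range(1, n):  # 1 <= i <= n-1, so both tensor factors are nonempty
        for left in combinations(range(1, n + 1), i):
            right = tuple(k for k in range(1, n + 1) if k not in left)
            terms.append((koszul_sign(left + right, degrees), left, right))
    return terms
```

For a monomial of length n this produces 2^n − 2 terms; in particular a single generator is primitive (no terms), and for two odd generators one gets w₁ ⊗ w₂ − w₂ ⊗ w₁.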

4. Loop Homotopy Lie Algebras – 1st Description

We already observed at the end of Sect. 2 that strongly homotopy Lie algebras are closely related to the “tree level” specializations of loop homotopy Lie algebras. Recall [13, Theorem 2.3] that strongly homotopy Lie algebras have the following characterization.

Proposition 6. There exists a one-to-one correspondence between strongly homotopy Lie algebra structures on a graded vector space V and degree −1 coderivations $\delta \in \mathrm{coDer}_{-1}(c\wedge W)$, W := ↑V, with the property $\delta^2 = 0$.

In this section we give a similar characterization for loop homotopy Lie algebras. Suppose that the vector space V and the bilinear form B are the same as in Def. 1. Let $h = h_s \otimes h^s \in (V \otimes V)_{-3}$ be as in (12) (of course, h is uniquely determined by the nondegenerate form B). Let W := ↑V and $y = y_s \otimes y^s := \uparrow h_s \otimes \uparrow h^s \in (W \otimes W)_{-1}$. Because h is symmetric, y is symmetric as well, thus, in fact, $y = y_s y^s \in (\wedge^2 W)_{-1}$. Let us consider the extension c∧W[t] of c∧W over the polynomial ring k[t], $c\wedge W[t] := c\wedge W \otimes_k k[t]$. By Proposition 5, there exists a unique coderivation $\theta \in \mathrm{coDer}^2_{-1}(c\wedge W[t])$ such that

\[ \pi_1(\theta) = 0 \quad \text{and} \quad \pi_2(\theta)(w) = \begin{cases} 0, & w \in \wedge^n W[t],\ n > 0, \\ \frac{1}{2} t y, & w = 1 \in \wedge^0 W \cdot t^0 \cong k. \end{cases} \tag{21} \]

The rôle of θ is to incorporate the form B into our theory. In the rest of this section we prove the following theorem.


Theorem 1. Under the above notation, there is a one-to-one correspondence between loop homotopy Lie algebra structures on the graded vector space V and degree −1 coderivations $\delta \in \mathrm{coDer}^1_{-1}(c\wedge W[t])$ such that

\[ (\delta + \theta)^2 = 0. \tag{22} \]

Let us analyze Eq. (22). It is, of course, equivalent to

\[ \delta^2 + \theta\delta + \delta\theta + \theta^2 = 0. \tag{23} \]

Sublemma 1. Under the above notation, $\theta^2 = 0$, $\delta^2 \in \mathrm{coDer}^1_{-2}(c\wedge W[t])$, and $(\theta\delta + \delta\theta) \in \mathrm{coDer}^2_{-2}(c\wedge W[t])$.

Proof. For $w_1 \cdots w_n \in \wedge^n W$ obviously

\[ \theta(w_1 \cdots w_n) = \frac{1}{2} t\, y_s y^s w_1 \cdots w_n, \tag{24} \]

thus

\[ \theta^2(w_1 \cdots w_n) = \frac{1}{4} t^2\, y_s y^s y_t y^t w_1 \cdots w_n. \tag{25} \]

The graded commutativity implies that

\[ y_s y^s y_t y^t = (-1)^{(y_s + y^s)(y_t + y^t)}\, y_t y^t y_s y^s = - y_t y^t y_s y^s. \]

On the other hand, the substitution s ↔ t gives $y_s y^s y_t y^t = y_t y^t y_s y^s$, therefore $y_t y^t y_s y^s = 0$, and $\theta^2 = 0$ by (25). The remaining two statements follow from Proposition 4(iii) and the observation that $\delta^2 = \frac{1}{2}[\delta, \delta]$ and $\theta\delta + \delta\theta = [\delta, \theta]$.

By Sublemma 1, (23) reduces to

\[ \delta^2 + \theta\delta + \delta\theta = 0. \tag{26} \]

By the same sublemma and Proposition 1(i), $\delta^2 + \theta\delta + \delta\theta$ is an order 2 coderivation. Thus (26) is, by Proposition 5, equivalent to

\[ \pi_1(\delta^2 + \theta\delta + \delta\theta) = 0, \quad \text{and} \tag{27} \]

\[ \pi_2(\delta^2 + \theta\delta + \delta\theta) = 0. \tag{28} \]

Because, by (21), $\pi_1(\theta) = 0$, Eq. (27) further reduces to

\[ \pi_1(\delta^2 + \delta\theta) = 0. \tag{29} \]

To understand better the meaning of this equation, let us introduce, for any g ≥ 0 and n ≥ 0, linear maps $\delta_n^g : \wedge^n W \to W$ by

\[ \delta_n^g(w_1 \cdots w_n) := \mathrm{Coef}_g(\pi_1 \delta(w_1 \cdots w_n)), \quad w_1 \cdots w_n \in \wedge^n W, \tag{30} \]


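where $\mathrm{Coef}_g(-)$ is the coefficient at $t^g$. By Proposition 5, the set $\{\delta_n^g\}_{n,g \geq 0}$ uniquely determines the coderivation δ. The explicit formula is (compare explicit formulas for coderivations acting on coalgebras in [14]):

\[ \delta(w_1 \cdots w_n) = \sum_{0 \leq i \leq n} \sum \epsilon(\sigma)\, t^g\, \delta_i^g(w_{\sigma(1)} \cdots w_{\sigma(i)})\, w_{\sigma(i+1)} \cdots w_{\sigma(n)}, \tag{31} \]

where the summation is taken over all g ≥ 0 and all σ ∈ unsh(i, n − i). From this and (24) we obtain

\[ \pi_1(\delta^2 + \delta\theta)(w_1 \cdots w_n) = \sum_{\substack{k + l = n + 1 \\ g_1 + g_2 = g}} \ \sum_{\sigma \in \mathrm{unsh}(l, n-l)} \epsilon(\sigma)\, t^g\, \delta_k^{g_1}\bigl(\delta_l^{g_2}(w_{\sigma(1)} \cdots w_{\sigma(l)})\, w_{\sigma(l+1)} \cdots w_{\sigma(n)}\bigr) + \frac{1}{2} \sum_{s,\, g \geq 0} t^{g+1}\, \delta_{n+2}^{g}(y_s, y^s, w_1, \dots, w_n). \tag{32} \]

We formulate the result as:

Sublemma 2. Eq. (29) means that, for all n ≥ 0, $w_1 \cdots w_n \in \wedge^n W$ and g ≥ 0,

\[ 0 = \sum_{\substack{k + l = n + 1 \\ g_1 + g_2 = g}} \ \sum_{\sigma \in \mathrm{unsh}(l, n-l)} \epsilon(\sigma)\, \delta_k^{g_1}\bigl(\delta_l^{g_2}(w_{\sigma(1)} \cdots w_{\sigma(l)})\, w_{\sigma(l+1)} \cdots w_{\sigma(n)}\bigr) + \frac{1}{2} \sum_s \delta_{n+2}^{g-1}(y_s, y^s, w_1, \dots, w_n). \tag{33} \]

We will see that Eq. (33) will correspond to the “main identity” (11). Let us make a similar analysis of Eq. (28). Because clearly $\pi_2(\theta\delta) = 0$, it reduces to

\[ \pi_2(\delta^2 + \delta\theta) = 0. \tag{34} \]

Using similar arguments as above, we obtain, for any g ≥ 0 and $w_1 \cdots w_n \in \wedge^n W$,

\begin{align*}
\mathrm{Coef}_g(\pi_2(\delta^2)(w_1 \cdots w_n)) &= \sum_{\substack{k + l = n + 1 \\ g_1 + g_2 = g}} \ \sum_{\sigma \in \mathrm{unsh}(l, n-l-1, 1)} \epsilon(\sigma)\, \delta_k^{g_1}\bigl(\delta_l^{g_2}(w_{\sigma(1)} \cdots w_{\sigma(l)})\, w_{\sigma(l+1)} \cdots w_{\sigma(n-1)}\bigr)\, w_{\sigma(n)} \tag{35} \\
&\quad + \sum_{\substack{p + q = n \\ g_1 + g_2 = g}} \ \sum_{\sigma \in \mathrm{unsh}(p, q)} (-1)^{w_{\sigma(1)} + \cdots + w_{\sigma(p)}} \epsilon(\sigma)\, \delta_p^{g_1}(w_{\sigma(1)} \cdots w_{\sigma(p)})\, \delta_q^{g_2}(w_{\sigma(p+1)} \cdots w_{\sigma(n)}).
\end{align*}

Similarly, we have

\begin{align*}
\mathrm{Coef}_g(\pi_2(\delta\theta)(w_1 \cdots w_n)) &= \frac{1}{2} \sum_{1 \leq i \leq n} (-1)^{w_i(w_{i+1} + \cdots + w_n)}\, \delta_{n+1}^{g-1}(y_s y^s w_1 \cdots w_{i-1} w_{i+1} \cdots w_n)\, w_i \tag{36} \\
&\quad + \frac{1}{2} \sum_s (-1)^{y_s(y^s + w_1 + \cdots + w_n)}\, \delta_{n+1}^{g-1}(y^s w_1 \cdots w_n)\, y_s \\
&\quad + \frac{1}{2} \sum_s (-1)^{y^s(w_1 + \cdots + w_n)}\, \delta_{n+1}^{g-1}(y_s w_1 \cdots w_n)\, y^s.
\end{align*}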


Now, assuming (33), it is immediate to see that the first term at the right-hand side of (35) is minus the first term at the right-hand side of (36). The symmetry $y_s y^s = (-1)^{y_s y^s} y^s y_s$ implies that the second and third terms at the right-hand side of (36) are the same, both equal to $\frac{1}{2} \sum_s (-1)^{y_s}\, y_s\, \delta_{n+1}^{g-1}(y^s w_1 \cdots w_n)$. We formulate these observations as

Sublemma 3. Assuming (33), Eq. (34) is equivalent to

\[ \frac{1}{2} \sum_s (-1)^{y_s}\, y_s\, \delta_{n+1}^{g-1}(y^s w_1 \cdots w_n) = 0. \tag{37} \]

Since we work in the free commutative algebra, (37) is equivalent to the antisymmetry of

\[ \frac{1}{2} \sum_s (-1)^{y_s}\, y_s \otimes \delta_{n+1}^{g-1}(y^s w_1 \cdots w_n) \in W \otimes W. \tag{38} \]

Proof of Theorem 1. Recall that W = ↑V. The correspondence between the structure operations $\{l_n^g\}_{g,n \geq 0}$ of a loop homotopy Lie algebra and coderivations δ of Theorem 1 is given by

\[ l_n^g(v_1, \dots, v_n) = (-1)^{(n-1)v_1 + \cdots + v_{n-1}} \downarrow \delta_n^g(\uparrow v_1 \cdots \uparrow v_n), \quad v_1, \dots, v_n \in V, \]

with the inverse formula

\[ \delta_n^g(w_1 \cdots w_n) = (-1)^{n(n-1)/2} (-1)^{(n-1)w_1 + \cdots + w_{n-1}} \uparrow l_n^g(\downarrow w_1, \dots, \downarrow w_n), \]

$w_1 \cdots w_n \in \wedge^n W$, where the multilinear maps $\{\delta_n^g\}$ were introduced in (30). Observe the sign $(-1)^{n(n-1)/2}$ in the second formula; it is typical for formulas of this type, see [15, Example 1.6]. A routine calculation shows that the substitution $l_n^g \leftrightarrow \delta_n^g$ converts (33) to (11) and that the symmetry of the element in (13) is equivalent to the antisymmetry of the element of (38).

5. Loop Homotopy Lie Algebras – Operadic Approach

In this section we give an operadic characterization of loop homotopy Lie algebras. We will not repeat here all details of necessary definitions concerning operads, because it would stretch the paper beyond any reasonable limit. Operads are introduced in the classical book [17]. The (co)bar construction over a (co)operad is defined in [9], see also [6]. Cyclic operads are introduced in [7] while modular operads and the corresponding modular (co)bar construction (called the Feynman transform) in [8]. There is also a nice overview [10]. These sources are easily available, we will thus rely on them and indicate only basic ideas.

Recall that a collection is a system $E = \{E(n)\}_{n \geq 1}$ of graded vector spaces such that each E(n) possesses a right action of the symmetric group $\Sigma_n$. Any collection E extends to a functor (denoted by the same symbol) from the category of finite sets to the category of graded vector spaces with the property that E(n) = E({1, . . . , n}) [6, 1.3]. Let $\mathrm{Tr}_n$ denote the set of rooted (= directed) trees with n labelled leaves. For a tree $T \in \mathrm{Tr}_n$ and a collection E, denote ([9, 1.2.13])

\[ E(T) := \bigotimes_{v \in \mathrm{Vert}(T)} E(\mathrm{In}(v)), \]


where Vert(T) is the set of the vertices of T and In(v) the set of incoming edges of v. The free operad on E [9, 2.1.1] is then the collection

\[ F(E)(n) := \bigoplus_{T \in \mathrm{Tr}_n} E(T), \quad n \geq 1, \]

with the operadic structure induced by the grafting of underlying trees. Let P be an operad. Consider the free operad $F(\downarrow sP^*)$ on the collection $(\downarrow sP^*)(n) := \uparrow^{n-2} P^*(n)$, n ≥ 1, where $(-)^*$ is the linear dual. As proved in [9, 3.2], structure operations of the operad P induce a differential $\partial_D$ on $F(\downarrow sP^*)$. The differential operad $D(P) := (F(\downarrow sP^*), \partial_D)$ is called the (operadic) cobar dual of the operad P. It is well-known [9, 4.2.14] that “classical” strongly homotopy Lie algebras are characterized as follows.

Proposition 7. Strongly homotopy Lie algebras are algebras over the cobar dual D(Com) of the operad Com for commutative algebras.

The above proposition means that a strongly homotopy Lie algebra structure on a differential graded vector space V = (V, ∂) is the same as a morphism $a : D(Com) \to \mathrm{End}_V$ from the operad D(Com) to the endomorphism operad $\mathrm{End}_V$ of V [9, 1.2.9]. Our aim is to give a similar characterization of loop homotopy Lie algebras, based on a certain generalization of operads, called modular operads.

An intermediate step between ordinary operads and modular operads are cyclic operads whose definition we briefly recall. A cyclic collection is a system $E = \{E((n))\}_{n \geq 1}$ of graded vector spaces such that each E((n)) has a right $\Sigma_{n+1}$-action. Each cyclic collection E induces a functor from the category of finite sets into the category of graded vector spaces (denoted again by E) such that E(({0, . . . , n})) = E((n)). This notation differs from that of [7] and [5] where E(({0, . . . , n})) = E((n + 1)). Let $\mathrm{Tur}_n$ denote the set of unrooted trees T with leaves indexed by {0, . . . , n}. For a cyclic collection E and a tree $T \in \mathrm{Tur}_n$, let

\[ E((T)) := \bigotimes_{v \in \mathrm{Vert}(T)} E((\mathrm{Leg}(v))), \]

where Leg(v) is the set of all edges of T adjacent to the vertex v. A cyclic operad is then a cyclic collection $C = \{C((n))\}_{n \geq 1}$ together with a “coherent” system of “contractions”

\[ \alpha_T : C((T)) \to C((n)), \quad T \in \mathrm{Tur}_n,\ n \geq 1, \tag{39} \]

see [7, Def. 2.1].

Modular operads, anticipated in [5], were introduced by Getzler and Kapranov [8] for the study of moduli spaces of Riemann surfaces of arbitrary genera. Recall that a modular collection is a cyclic collection E with a second grading by the “genus” g ≥ 0, $E = \{E((g, n))\}_{n \geq 1}$. A modular operad A is then a modular collection which possesses, besides a cyclic operadic structure, also operations A((g, n + 2)) → A((g + 1, n)). These operations are abstractions of the “self-gluing” which produces, from a surface of genus g with (n + 2) punctures, a new surface of genus g + 1 with n punctures, as indicated in Fig. 1.


Fig. 1. An example of “self-gluing”. The surface on the right has 2 punctures and genus 2. It is obtained from the surface on the left with 4 punctures and genus 1 by sewing along the punctures marked by 3 and 4

As cyclic operads are characterized by a system of contractions (39) indexed by unrooted trees, there is a similar characterization of modular operads, but based on labelled (or “modular”) graphs rather than trees. Following [5, 12], by a graph Γ we mean a finite set Flag(Γ) (whose elements are called flags or half-edges) together with an involution σ and a partition λ. The vertices Vert(Γ) of a graph Γ are the blocks of the partition λ. The edges Edg(Γ) are pairs of flags forming a two-cycle of σ relative to the decomposition of a permutation into disjoint cycles. The legs Leg(Γ) are the fixed-points of σ. We also denote by Leg(v) the flags belonging to the block v or, in common speech, half-edges adjacent to the vertex v.

Each graph Γ has its geometric realization, a finite one-dimensional cell complex |Γ|, obtained by taking one copy of $[0, \frac{1}{2}]$ for each flag and imposing the following equivalence relation: the points $0 \in [0, \frac{1}{2}]$ are identified for all flags in a block of the partition λ, and the points $\frac{1}{2} \in [0, \frac{1}{2}]$ are identified for pairs of flags exchanged by the involution σ. We will usually make no distinction between a graph and its geometric realization. A modular or labelled graph is a connected graph Γ together with a map g : Vert(Γ) → {0, 1, 2, . . . }. The genus g(Γ) of a modular graph Γ is the number

\[ g(\Gamma) := \dim H_1(|\Gamma|) + \sum_{v \in \mathrm{Vert}(\Gamma)} g(v). \]

Let Γ((g, S)) be the category whose objects are pairs (|Γ|, ρ) consisting of a modular graph Γ of genus g and an isomorphism ρ : Leg(Γ) → S labeling the legs of Γ by elements of a finite set S. As usual, we write Γ((g, n)) := Γ((g, {0, . . . , n})). For a modular collection $A = \{A((g, n))\}_{n \geq 1}$ and a modular graph Γ, let A((Γ)) be the tensor product

\[ A((\Gamma)) := \bigotimes_{v \in \mathrm{Vert}(\Gamma)} A((g(v), \mathrm{Leg}(v))). \tag{40} \]

A modular operad structure on A is then given by a coherent system of contractions [8, 2.10]

\[ \alpha_\Gamma : A((\Gamma)) \to A((g, S)), \]

for any Γ ∈ Γ((g, S)), g ≥ 0 and a finite set S.
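The combinatorial description of a graph by flags, an involution σ and a partition λ translates directly into a small data structure, and the genus of a modular graph can then be computed from it. The sketch below is our own illustration; the names `first_betti_number` and `modular_genus` and the dictionary encoding are assumptions, not notation from [5, 8, 12]. It uses the standard fact that, for a graph, dim H_1 equals (number of edges) − (number of vertices) + (number of connected components); legs do not contribute.

```python
def first_betti_number(vertices, involution):
    """dim H_1 of the geometric realization of a graph.

    vertices:   dict mapping a vertex name to the set of its flags (the partition lambda)
    involution: dict mapping each flag to its image under sigma
    """
    owner = {f: v for v, block in vertices.items() for f in block}
    # Edges are the two-cycles of sigma; fixed points of sigma are legs.
    edges = {frozenset((f, involution[f])) for f in owner if involution[f] != f}

    # Union-find over vertices to count connected components.
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for edge in edges:
        a, b = tuple(edge)
        parent[find(owner[a])] = find(owner[b])

    components = len({find(v) for v in vertices})
    return len(edges) - len(vertices) + components


def modular_genus(vertices, involution, vertex_genus):
    """g(Gamma) = dim H_1(|Gamma|) + sum of the genera g(v) of the vertices."""
    return first_betti_number(vertices, involution) + sum(vertex_genus.values())
```

As a usage example, the self-gluing of Fig. 1 corresponds to a single vertex of genus 1 with flags 1, 2, 3, 4, where σ exchanges 3 and 4 and fixes the legs 1 and 2; `modular_genus({"v": {1, 2, 3, 4}}, {1: 1, 2: 2, 3: 4, 4: 3}, {"v": 1})` then evaluates to 2, matching the genus stated in the caption.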


Example 1. Let V = (V, B) be a differential graded vector space with a graded symmetric inner product B : V ⊗ V → k. Let us define, for each g ≥ 0 and a finite set S,

\[ \mathrm{End}_V((g, S)) := V^{\otimes S} \]

(the tensor product of copies of V indexed by S). It follows from definition that, for any Γ ∈ Γ((g, S)), $\mathrm{End}_V((\Gamma)) = V^{\otimes \mathrm{Flag}(\Gamma)}$. Let $B^{\otimes \mathrm{Edg}(\Gamma)} : V^{\otimes \mathrm{Flag}(\Gamma)} \to V^{\otimes \mathrm{Leg}(\Gamma)}$ be the multilinear form which contracts the factors of $V^{\otimes \mathrm{Flag}(\Gamma)}$ corresponding to the flags which are paired up as edges of Γ. Then we define $\alpha_\Gamma : \mathrm{End}_V((\Gamma)) \to \mathrm{End}_V((g, S))$ to be the map

\[ \alpha_\Gamma : \mathrm{End}_V((\Gamma)) \cong V^{\otimes \mathrm{Flag}(\Gamma)} \xrightarrow{\ B^{\otimes \mathrm{Edg}(\Gamma)}\ } V^{\otimes \mathrm{Leg}(\Gamma)} \cong V^{\otimes S} = \mathrm{End}_V((g, S)). \tag{41} \]

It is easy to show that the contractions $\{\alpha_\Gamma \mid \Gamma \in \Gamma((g, S))\}$ define on $\mathrm{End}_V$ the structure of a modular operad.

We would like to modify Example 1 to the situation when the degree of the form B is +3, as in the definition of a loop homotopy Lie algebra. Formula (41) does not work, among other things also because $\alpha_\Gamma$ will not be of degree zero. For this modification we need to introduce “twisted” modular operads.

If X is a finite set with card(X) = s, let $\mathrm{Det}(X) := \wedge^s((\downarrow k)^{\oplus X})$, the top dimensional piece of the s-fold exterior power of the direct sum of the copies of ↓k indexed by elements of X. Clearly Det(X) is a one-dimensional vector space concentrated in degree −s. The determinant of a graph Γ ∈ Γ((g, S)) is defined by Det(Γ) := Det(Edg(Γ)). A twisted modular operad ([5, p. 293], also called a K-modular operad in [7]) is then a modular collection A together with a coherent system of contractions

\[ \tilde{\alpha}_\Gamma : A((\Gamma)) \otimes \mathrm{Det}(\Gamma) \to A((g, S)), \]

for any Γ ∈ Γ((g, S)), g ≥ 0 and a finite set S.

Example 2. Let W = (W, H) be a graded vector space with a nondegenerate degree −1 symmetric bilinear form H. Define the modular collection $\widetilde{\mathrm{End}}_W$ by

\[ \widetilde{\mathrm{End}}_W((g, S)) := W^{\otimes S}, \]

for g ≥ 0 and a finite set S. For Γ ∈ Γ((g, S)), the twisted modular contraction

\[ \tilde{\alpha}_\Gamma : \widetilde{\mathrm{End}}_W((\Gamma)) \otimes \mathrm{Det}(\Gamma) \to \widetilde{\mathrm{End}}_W((g, S)) \]

is defined as follows. Let us choose labels $s_e, t_e$ such that $e = \{s_e, t_e\}$ for each edge e ∈ Edg(Γ) and define $\tilde{\alpha}_\Gamma$ to be the composition:

\begin{align*}
\widetilde{\mathrm{End}}_W((\Gamma)) \otimes \mathrm{Det}(\Gamma) &\cong W^{\otimes \mathrm{Flag}(\Gamma)} \otimes \mathrm{Det}(\Gamma) \\
&\cong W^{\otimes S} \otimes \bigotimes_{e \in \mathrm{Edg}(\Gamma)} W^{\otimes \{s_e, t_e\}} \otimes \mathrm{Span}(\downarrow e) \\
&\cong W^{\otimes S} \otimes \bigotimes_{e \in \mathrm{Edg}(\Gamma)} W_{s_e} \otimes W_{t_e} \otimes \mathrm{Span}(\downarrow e) \\
&\xrightarrow{\ 1 \otimes \bigotimes_e H_e\ } W^{\otimes S} \otimes k^{\otimes \mathrm{Edg}(\Gamma)} \cong \widetilde{\mathrm{End}}_W((g, S)),
\end{align*}


where $H_e$ is the map that sends $u \otimes v \otimes \downarrow e \in W_{s_e} \otimes W_{t_e} \otimes \mathrm{Span}(\downarrow e)$ to $H(u, v) \in k$. The symmetry of $H$ assures that the definition of $\tilde\alpha_\Gamma$ does not depend on the choice of labels. The system $\{\tilde\alpha_\Gamma \mid \Gamma \in \Gamma((g, S))\}$ induces on $\mathrm{End}_W$ the structure of a twisted modular operad.

If $V = (V, B)$ is a graded vector space with a nondegenerate degree $+3$ bilinear symmetric form $B$, then $W = (W, H)$ with $W := \uparrow^2 V$ and the form $H$ defined by $H(u, v) := B(\downarrow^2 u, \downarrow^2 v)$, $u, v \in W$, form the data as in Example 2, so we may consider the twisted modular operad $\mathrm{End}_{\uparrow^2 V}$.

Another example of a twisted modular operad is provided by the Feynman transform of a modular operad. Recall [8, 4.2] that the free twisted modular operad $M(E)$ on a modular collection $E$ is given by

\[ M(E)((g, n)) := \operatorname*{colim}_{\Gamma \in \mathrm{Iso}\,\Gamma((g, n))} E((\Gamma)) \otimes \mathrm{Det}(\Gamma), \]

where $\mathrm{Iso}\,\Gamma((g, n))$ is the full subcategory of isomorphisms in $\Gamma((g, n))$. The twisted modular operad structure is induced by the “grafting” of underlying graphs. If $A$ is a modular operad, then $M(A)((g, n))$ carries a natural differential $\partial_F$ [5, Theorem 4.4]. The twisted differential modular operad $F(A) := (M(A), \partial_F)$ is called the Feynman transform of the modular operad $A$.

Let us consider the “forgetful” functor $\mathrm{For} : \mathrm{MOp} \to \mathrm{COp}$ from the category of modular operads to the category of cyclic operads given by $\mathrm{For}(A)((S)) := A((0, S))$, for any finite set $S$. It is not difficult to show [16] that this functor has a left adjoint $\mathrm{Mod} : \mathrm{COp} \to \mathrm{MOp}$.

Definition 2. The modular operad $\mathrm{Mod}(P)$ is called the modular operadic completion of the cyclic operad $P$.

An easy calculation shows that

\[ \mathrm{Mod}(\mathrm{Com})((g, n)) \cong k, \quad \text{for each } g \ge 0,\ n \ge 1, \tag{42} \]

with the trivial action of the symmetric group $\Sigma_{n+1}$. The key role in our characterization is played by the Feynman transform $F(\mathrm{Mod}(\mathrm{Com}))$ of the modular completion of the operad $\mathrm{Com}$. It follows from (42) that, as a nondifferential operad, $F(\mathrm{Mod}(\mathrm{Com}))$ is the free twisted modular operad on the generators $\omega_n^g$,

\[ M(\mathrm{Mod}(\mathrm{Com})) \cong M(\{\omega_n^g ;\ n \ge 1,\ g \ge 0\}), \tag{43} \]

where $\omega_n^g$ corresponds to the dual of $1 \in k \cong \mathrm{Mod}(\mathrm{Com})((g, n))$. The central result of this section reads as follows.

Theorem 2. There exists a natural one-to-one correspondence between twisted modular $F(\mathrm{Mod}(\mathrm{Com}))$-algebra structures on $(\uparrow^2 V, B(\downarrow^2 -, \downarrow^2 -))$, i.e. morphisms

\[ a : (F(\mathrm{Mod}(\mathrm{Com})), \partial_F) \to (\mathrm{End}_{\uparrow^2 V}, \partial = 0) \tag{44} \]

of differential twisted modular operads, and loop homotopy algebra structures on $(V, B)$ in the sense of Def. 1.


Sketch of proof. Description (43) shows that a map $a$ of (44) is determined by its values $\xi_n^g := a(\omega_n^g) \in \mathrm{End}_{\uparrow^2 V}((g, n))$ on the generators. Moreover, the map $a$ ought to commute with the differentials, so the equation

\[ a(\partial_F(\omega_n^g)) = 0 \tag{45} \]

must be satisfied, for each $g \ge 0$ and $n \ge 1$. Observe that $\xi_n^g \in \mathrm{End}_{\uparrow^2 V}((g, n))$ can be interpreted as a degree $-2(n+1)$ element of the graded vector space $V^{\otimes n+1}$. Let us introduce a map $\Phi : V^{\otimes n+1} \to \mathrm{Hom}(V^{\otimes n}, V)$ by

\[ \Phi(x_0 \otimes \cdots \otimes x_n)(v_1, \ldots, v_n) := (-1)^{n x_0 + (n-1) x_1 + \cdots + x_{n-1}}\, x_0\, B(x_1, v_1) B(x_2, v_2) \cdots B(x_n, v_n), \tag{46} \]

for $x_0 \otimes \cdots \otimes x_n \in V^{\otimes n+1}$ and $v_1, \ldots, v_n \in V$. The map $\Phi$ is clearly a degree $3n$ isomorphism of $V^{\otimes n+1}$ and $\mathrm{Hom}(V^{\otimes n}, V)$. Finally, let $l_n^g : V^{\otimes n} \to V$ be a homogeneous degree $n - 2$ map given by

\[ l_n^g(v_1, \ldots, v_n) := (-1)^{\frac{n(n+1)}{2} + n(v_1 + \cdots + v_n)}\, \Phi(\omega_n^g)(v_1, \ldots, v_n), \quad \text{for } v_1, \ldots, v_n \in V. \]

A long but straightforward calculation shows that the $l_n^g$ are antisymmetric operations satisfying (13) and that (45) translates to the main identity (11). On the other hand, all steps above can clearly be reversed, thus a loop homotopy Lie algebra structure induces a map (44).

Remark 4. Observe that Theorem 2 is formulated in such a way that the differential $\partial$ on $V$ is a part of the structure, namely $\partial := a(\omega_1^0)$.

6. Possible Generalizations (Open Strings)

Let $P$ be an operad. It is now well understood what a “strongly homotopy $P$-algebra” is. In the case when $P$ is Koszul, it is an algebra over the cobar construction on the quadratic dual $P^!$ of $P$ [9, Def. 4.2.14]. An alternative characterization is that a homotopy $P$-algebra is a square zero differential on the cofree connected $P^!$-coalgebra. The equivalence of these two characterizations follows for example from [9, Prop. 4.2.15]. The quadratic dual of the operad Lie for Lie algebras is Com, the operad for commutative associative algebras, and the above characterization gives Proposition 6, resp. Proposition 7. Another example is $P$ = Ass, the operad for associative algebras. It is quadratic self-dual, $P^!$ = Ass, and the corresponding strongly homotopy algebras are called strongly homotopy associative or $A_\infty$-algebras [18, 15].

Let us look for possible generalizations to the loop case. If $P$ is a cyclic operad (recall that both Lie and Ass are cyclic), the quadratic dual $P^!$ is again cyclic [7], so it makes sense to consider the modular completion $\mathrm{Mod}(P^!)$ (Def. 2). We suggest the following definition.

Definition 3. Let $P$ be a Koszul cyclic operad. A loop homotopy $P$-algebra is a modular algebra over the twisted differential modular operad $F(\mathrm{Mod}(P^!))$.


For $P$ = Lie we get Theorem 2. It would be interesting to write out explicitly the axioms of loop homotopy associative algebras, because these structures should play an important rôle in the higher-genera open string field theory, as suggested by [19]. While in the Lie case we had, for each $n$ and $g$, only one antisymmetric operation $l_n^g : V^{\otimes n} \to V$, in the loop homotopy associative case we expect to have

\[ \frac{(n+1)!}{2^g \cdot g! \cdot (n + 1 - 2g)!} \]

operations $V^{\otimes n} \to V$, due to the dimension of $\mathrm{Mod}(\mathrm{Ass})((g, n))$.

A seemingly easier approach would be the one based on coderivations. We would like to say that a loop homotopy $P$-algebra is an order 2 coderivation of the cofree connected $P^!$-coalgebra, having properties analogous to (22). This works nicely for $P$ = Lie, because we know what a higher order coderivation of a cocommutative coalgebra is. But we are not sure whether there exists a reasonable concept of higher-order coderivations without the cocommutativity, though the paper [4] seems to suggest this.

Acknowledgement. I would like to express my gratitude to Jim Stasheff for reading the manuscript and for many helpful remarks and suggestions.

References

1. Akman, F.: On some generalizations of Batalin–Vilkovisky algebras. Preprint q-alg/9506027, June 1995
2. Akman, F.: Multibraces on the Hochschild complex. Preprint q-alg/9702010, February 1997
3. Alfaro, J., Bering, K., Damgaard, P.H.: Algebra of higher antibrackets. Preprint hep-th/9604027, April 1996
4. Alfaro, J., Damgaard, P.H.: Non-Abelian antibrackets. Preprint hep-th/9511066, November 1995
5. Behrend, K., Manin, Yu.: Stacks of stable maps and Gromov-Witten invariants. Preprint alg-geom/9506023, June 1995
6. Getzler, E., Jones, J.D.S.: Operads, homotopy algebra, and iterated integrals for double loop spaces. Preprint, 1993
7. Getzler, E., Kapranov, M.M.: Cyclic operads and cyclic homology. In: S.-T. Yau, editor, Geometry, Topology and Physics for Raoul Bott, Volume 4 of Conf. Proc. Lect. Notes. Geom. Topol., Cambridge, MA: International Press, 1995, pp. 167–201
8. Getzler, E., Kapranov, M.M.: Modular operads. Compositio Math. 110 (1), 65–126 (1998)
9. Ginzburg, V., Kapranov, M.M.: Koszul duality for operads. Duke Math. J. 76 (1), 203–272 (1994)
10. Kapranov, M.M.: Operads in algebraic geometry. Documenta Mathematica Extra Volume ICM, pp. 277–286 (1998)
11. Kimura, T., Stasheff, J.D., Voronov, A.A.: On operad structures of moduli spaces and string theory. Commun. Math. Phys. 171, 1–25 (1995)
12. Kontsevich, M.: Graphs, homotopical algebra and low-dimensional topology. Preprint, 1994
13. Lada, T., Markl, M.: Strongly homotopy Lie algebras. Communications in Algebra 23 (6), 2147–2161 (1995)
14. Lada, T., Stasheff, J.D.: Introduction to sh Lie algebras for physicists. International J. Theor. Phys. 32 (7), 1087–1103 (1993)
15. Markl, M.: A cohomology theory for A(m)-algebras and applications. J. Pure Appl. Algebra 83, 141–175 (1992)
16. Markl, M., Shnider, S., Stasheff, J.D.: Operads in algebra, topology and mathematical physics. Book, work in progress
17. May, J.P.: The Geometry of Iterated Loop Spaces. Lecture Notes in Mathematics Vol. 271. Berlin–Heidelberg–New York: Springer-Verlag, 1972
18. Stasheff, J.D.: Homotopy associativity of H-spaces I, II. Trans. Am. Math. Soc. 108, 275–312 (1963)
19. Stasheff, J.D.: Higher homotopy algebras: String field theory and Drinfel’d quasi-Hopf algebras. In: Proceedings of the XXth International Conference on Differential Geometric Methods in Theoretical Physics, Baruch College, CUNY, June 1991, Singapore: World Scientific, 1992, pp. 408–425
20. Zwiebach, B.: Closed string field theory: Quantum action and the Batalin–Vilkovisky master equation. Nucl. Phys. B 390, 33–152 (1993)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 220, 385 – 432 (2001)

Communications in

Mathematical Physics

© Springer-Verlag 2001

Noncommutative Instantons and Twistor Transform

Anton Kapustin1, Alexander Kuznetsov2, Dmitri Orlov3

1 School of Natural Sciences, Institute for Advanced Study, Olden Lane, Princeton, NJ 08540, USA.

E-mail: [email protected]

2 Institute for Problems of Information Transmission, Russian Academy of Sciences, 19 Bolshoi Karetnyi,

Moscow 101447, Russia. E-mail: [email protected]; [email protected]

3 Algebra Section, Steklov Mathematical Institute, Russian Academy of Sciences, 8 Gubkin str., GSP-1,

Moscow 117966, Russia. E-mail: [email protected] Received: 3 May 2000 / Accepted: 3 April 2001

Supported by DOE grant DE-FG02-90ER4054442.
Supported by NSF grant DMS97-29992 and RFFI grants 99-01-01144, 99-01-01204.
Supported by NSF grant DMS97-29992 and RFFI grant 99-01-01144.

Dedicated to A.N. Tyurin on his 60th birthday

Abstract: Recently N. Nekrasov and A. Schwarz proposed a modification of the ADHM construction of instantons which produces instantons on a noncommutative deformation of $\mathbb{R}^4$. In this paper we study the relation between their construction and algebraic bundles on noncommutative projective spaces. We exhibit one-to-one correspondences between three classes of objects: framed bundles on a noncommutative $\mathbb{P}^2$, certain complexes of sheaves on a noncommutative $\mathbb{P}^3$, and the modified ADHM data. The modified ADHM construction itself is interpreted in terms of a noncommutative version of the twistor transform. We also prove that the moduli space of framed bundles on the noncommutative $\mathbb{P}^2$ has a natural hyperkähler metric and is isomorphic as a hyperkähler manifold to the moduli space of framed torsion free sheaves on the commutative $\mathbb{P}^2$. The natural complex structures on the two moduli spaces do not coincide but are related by an SO(3) rotation. Finally, we propose a construction of instantons on a more general noncommutative $\mathbb{R}^4$ than the one considered by Nekrasov and Schwarz (a q-deformed $\mathbb{R}^4$).

1. Physical Motivation

In this section we explain the physical motivation for studying instantons on a noncommutative $\mathbb{R}^4$. Readers uninterested in the motivation may skip most of this section and proceed directly to Subsect. 1.5. Likewise, readers familiar with the way noncommutative instantons arise in string theory may start with Subsect. 1.5.

1.1. Instanton equations. Let $E$ be a vector bundle with structure group $G$ on an oriented Riemannian 4-manifold $X$, and let $A$ be a connection on $E$. The instanton equation is the equation

\[ F_A^+ = 0, \tag{1} \]

where $F_A$ is the curvature of $A$, and $F_A^+$ denotes the self-dual (SD) part of $F_A$. Solutions of this equation are called instantons, or anti-self-dual (ASD) connections. The second Chern class of $E$ is known in the physics literature as the instanton number. Instantons automatically satisfy the Yang–Mills equation $d_A(*F_A) = 0$, where $d_A : \Omega^p \otimes \mathrm{End}(E) \to \Omega^{p+1} \otimes \mathrm{End}(E)$ is the covariant differential, and $* : \Omega^p \to \Omega^{4-p}$ is the Hodge star operator. There are several physical reasons to be interested in instantons. If one is studying quantum gauge theory on a Riemannian 3-manifold $M$ (space), then instantons on $X = M \times \mathbb{R}$ describe quantum-mechanical tunneling between different classical vacua. The possibility of such tunneling has drastic physical effects, some of which can be experimentally observed. If one is studying classical gauge theory on a 5-dimensional space-time $X \times \mathbb{R}$, then instantons on $X$ can be interpreted as solitons, i.e. as static solutions of the Yang–Mills equations of motion. In fact, instantons are the absolute minima of the Yang–Mills energy function of the 5-dimensional theory (with fixed second Chern class). Both interpretations arise in string theory, but to explain this we need to make a digression and discuss D-branes.

1.2. D-branes. It has been discovered in the last few years that string theory describes, besides strings, extended objects (branes) of various dimensions. These extended objects should be regarded as static solutions of (as yet poorly understood) stringy equations of motion. D-branes are a particularly manageable class of branes. Recall that ordinary closed oriented superstrings, known as Type II strings, are described by maps from a Riemann surface Σ (“worldsheet”) to a 10-dimensional manifold Z (“target”). The physical definition of a D-brane is “a submanifold of Z on which strings can end”. This means that if a D-brane is present, then one needs to consider maps from a Riemann surface with boundaries to Z such that the boundaries are mapped to a certain submanifold X ⊂ Z. In this case one says that there is a D-brane wrapped on X. If X is connected and has dimension p + 1, then one says that one is dealing with a Dp-brane. In general, X can have several components with different dimensions, and then each component corresponds to a D-brane. In perturbative string theory, the role of equations of motion is played by the condition that a certain auxiliary quantum field theory on the Riemann surface Σ is conformally invariant. When D-branes are present, Σ has boundaries, and the auxiliary theory must be supplemented with boundary conditions. The requirement that the boundary conditions preserve conformal invariance imposes constraints on the submanifold X. These constraints should be regarded as equations of motion for D-branes. For example, if we consider a D0-brane wrapped on a 1-dimensional submanifold X, then conformal invariance requires that X be a geodesic in Z. This is the usual equation of motion for a relativistic particle moving in Z. An important subtlety is that to specify fully the boundary conditions for the auxiliary theory on Σ it is not sufficient to specify X; one should also specify a unitary vector bundle E on X and a connection on it.
In the simplest case this bundle has rank 1, but one can also have “multiple” D-branes, described by bundles of rank r > 1. Such bundles describe r coincident D-branes wrapped on the same submanifold X. Using the requirement of conformal invariance of the auxiliary two-dimensional quantum field theory, one can derive equations of motion for the Yang–Mills connection on E. In the low-energy approximation, the equations of motion are the usual Yang–Mills equations $d_A(*F_A) = 0$. In particular, instantons are solutions of these equations.

1.3. Instantons and D-branes. Let $Z$ be $\mathbb{R}^{10}$ with a flat metric, and let $X \to Z$ be $\mathbb{R}^5 = \mathbb{R}^4 \times \mathbb{R}$ linearly embedded in $Z$. We regard $\mathbb{R}^4$ as space and $\mathbb{R}$ as time. Consider r D4-branes wrapped on X. This physical system is described by the Yang–Mills action on $\mathbb{R}^5 = \mathbb{R}^4 \times \mathbb{R}$. If one is looking for static solutions of the equations of motion, one needs to consider the minima of the Yang–Mills energy function

\[ W[A] = \int_{\mathbb{R}^4} \|F_A\|^2, \]

where $F_A$ is the curvature of a $U(r)$ connection $A$, and $\|F_A\|^2 = -\mathrm{Tr}(F_A \wedge *F_A)$. The instanton number of $A$ is defined by

\[ c_2 = \frac{1}{8\pi^2} \int_{\mathbb{R}^4} \mathrm{Tr}(F_A \wedge F_A). \tag{2} \]

If the Yang–Mills energy evaluated on $A$ is finite, then the bundle $E$ and the connection $A$ extend to $S^4$, the one-point compactification of $\mathbb{R}^4$ (see [4] for details). In this case $c_2$ is the second Chern class of $E$ and is therefore an integer. Solutions of instanton equations on $\mathbb{R}^4$ are precisely the absolute minima of the Yang–Mills energy function. These solutions should be regarded as composed of identical particle-like objects (instantons) on $X$, their number being $c_2$. Since the energy of the instanton is proportional to $c_2$, all “particles” have the same mass. Since the solution is static, the particles neither repel nor attract. This is actually a consequence of supersymmetry: Type II string theory is supersymmetric, and D4-branes with instantons on them leave part of supersymmetry unbroken.

In string theory one may also consider k D0-branes present simultaneously with r D4-branes. More specifically, we will consider D0-branes which are at rest, i.e. the corresponding one-dimensional manifolds are straight lines parallel to the time axis. Such a configuration of branes is also supersymmetric, and consequently there are no forces between any of the branes. The positions of D0-branes are not constrained by anything, so their moduli space is $(\mathbb{R}^9)^k$. More precisely, since D0-branes are indistinguishable, the moduli space is $\mathrm{Sym}^k(\mathbb{R}^9)$. It turns out that an instanton with instanton number k and k D0-branes are related: they can be deformed into each other without any cost in energy. A convenient point of view is the following. In the presence of D4-branes wrapped on X the moduli space of D0-branes has two branches: a branch where their positions are unconstrained and D0-branes are point-like (this branch is isomorphic to $\mathrm{Sym}^k(\mathbb{R}^9)$), and the branch where they are constrained to lie on X.
The latter branch is isomorphic to the moduli space Mr,k of U (r) instantons on X = R4 with c2 = k. The dimension of Mr,k is known to be 4rk for r > 1 (see for example [4]). For r = 1 instantons do not exist. The translation group of R4 acts freely on Mr,k , and the quotient space describes the relative positions and sizes of instantons. Thus D0-branes are pointlike objects when they are away from D4-branes, but when they bind to D4-branes they can acquire finite size.


The “instanton” branch touches the “point-like” branch at submanifolds where some or all of the instantons shrink to zero size. These are the submanifolds where the instanton moduli space is singular. At these submanifolds the point-like instantons can detach from D4-branes and start a new life as D0-branes. This lowers the second Chern class of the bundle on D4-branes. Thus from the string theory perspective it is natural to glue together the moduli spaces of instantons with different Chern classes along singular submanifolds.

1.4. Noncommutative geometry and D-branes. Instanton equations (and, more generally, Yang–Mills equations) arise in the low-energy limit of string theory, or equivalently for large string tension. Recently, another kind of low-energy limit of string theory was discussed in the literature [32]. Consider a trivial U(r)-bundle on $X = \mathbb{R}^4$ with a connection A whose curvature $F_A$ is of the form $1 \otimes f$ where 1 is the unit section of End(E), and f is a constant nondegenerate 2-form. For small f the D4-branes are described by the ordinary Yang–Mills action, but for large $F_A$ the stringy equations of motion get complicated. It turns out that the equations of motion simplify again in the limit when both $F_A$ and the string tension are taken to infinity, with a certain combination of the two kept fixed (one also has to scale the metric appropriately, see [32]). We will call this limit the Seiberg–Witten limit. In this limit the D4-branes are described by Yang–Mills equations on a certain noncommutative deformation of $\mathbb{R}^4$ (see [32] and references therein). There is another description of the Seiberg–Witten limit, which is gauge-equivalent to the previous one. Type II string theory reduces at low energies to Type II supergravity in 10 dimensions. The bosonic fields of this low-energy theory include a symmetric rank-two tensor (metric) and a 2-form B.
$\mathbb{R}^{10}$ with a flat Lorentzian metric and a constant B is a solution of supergravity equations of motion, as well as full stringy equations of motion. A constant B can be gauged away, so this is not a very interesting solution. Life gets more interesting if there are D-branes present. For example, consider r coincident flat D4-branes embedded in $\mathbb{R}^{10}$ with a constant B-field. It turns out that one can gauge away a constant B-field only at the expense of introducing a constant $F_A$ of the form $1 \otimes f$, where f is equal to the pull-back of B to the worldvolume of the D4-branes. Thus the solution with zero $F_A$ and nonzero B is equivalent to the solution with nonzero $F_A$ and zero B. Therefore the Seiberg–Witten limit can be described as the limit in which both the B-field and the string tension become infinite. The idea that D-branes in a nonzero B-field are described by Yang–Mills theory on a noncommutative space was first put forward in [13] for the case of D-branes wrapped on tori.

1.5. Instanton equations on a noncommutative $\mathbb{R}^4$. The deformed $\mathbb{R}^4$ that one obtains in the Seiberg–Witten limit is completely characterized by its algebra of functions $\mathcal{A}$. It is a noncommutative algebra whose underlying space is a certain subspace of $C^\infty$ functions on $\mathbb{R}^4$. The product is the so-called Wigner–Moyal product formally given by

\[ (f \star g)(x) = \lim_{y \to x} \exp\left( \frac{1}{2}\, \hbar\, \theta_{ij}\, \frac{\partial^2}{\partial x_i\, \partial y_j} \right) f(x)\, g(y). \tag{3} \]

Here $\theta$ is a purely imaginary matrix, and $\hbar$ is a real parameter (“Planck constant”) which is introduced to emphasize that the Wigner–Moyal product is a deformation of the usual product. In the string theory context $\theta$ is proportional to $f^{-1}$.
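To make the formal series in (3) concrete, the following is a minimal sketch of the Wigner–Moyal product restricted to polynomials, where the exponential terminates after finitely many terms. It is an illustration under stated assumptions, not the authors' construction: polynomials are stored as dictionaries from exponent tuples to coefficients, the names `poly_mul`, `poly_add`, `diff` and `moyal` are hypothetical helpers, and the entries of the matrix `theta` are taken to be rational numbers so that the arithmetic is exact (in the text θ is purely imaginary, which does not affect the algebraic identities checked here).

```python
from fractions import Fraction


def poly_mul(f, g):
    """Ordinary commutative product of two polynomials (dict: exponents -> coeff)."""
    out = {}
    for ea, ca in f.items():
        for eb, cb in g.items():
            e = tuple(a + b for a, b in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return {e: c for e, c in out.items() if c != 0}


def poly_add(f, g, scale=1):
    """Return f + scale * g, dropping zero coefficients."""
    out = dict(f)
    for e, c in g.items():
        out[e] = out.get(e, 0) + scale * c
    return {e: c for e, c in out.items() if c != 0}


def diff(f, i):
    """Partial derivative of f with respect to the i-th variable."""
    out = {}
    for e, c in f.items():
        if e[i] > 0:
            e2 = e[:i] + (e[i] - 1,) + e[i + 1:]
            out[e2] = out.get(e2, 0) + c * e[i]
    return out


def moyal(f, g, theta, hbar=1):
    """Moyal product of polynomials f and g.

    Order n of the series contributes
    (1/n!) (hbar/2)^n theta_{i1 j1} ... theta_{in jn} (d_{i1}..d_{in} f)(d_{j1}..d_{jn} g).
    The factor 1/n! is built up as 1/(1*2*...*n), one level at a time.
    """
    dim = len(theta)
    result = {}
    level = [(Fraction(1), f, g)]  # (coefficient, derivative of f, derivative of g)
    n = 0
    while level:
        for c, df, dg in level:
            result = poly_add(result, poly_mul(df, dg), c)
        n += 1
        next_level = []
        for c, df, dg in level:
            for i in range(dim):
                for j in range(dim):
                    if theta[i][j] == 0:
                        continue
                    a, b = diff(df, i), diff(dg, j)
                    if a and b:
                        coeff = c * Fraction(hbar) * theta[i][j] / (2 * n)
                        next_level.append((coeff, a, b))
        level = next_level
    return result
```

With two variables and `theta = [[0, 1], [-1, 0]]`, taking `x = {(1, 0): 1}` and `y = {(0, 1): 1}` and forming `poly_add(moyal(x, y, theta), moyal(y, x, theta), -1)` gives the constant polynomial equal to `hbar * theta[0][1]`, which is the commutation relation of the Weyl algebra in this toy setting.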


Of course, to make sense of this definition we must specify a subspace in the space of $C^\infty$ functions which is closed under the Wigner–Moyal product. Leaving this question aside for a moment,¹ one can define the exterior differential calculus over $\mathcal{A}$. Differential geometry of noncommutative spaces has been developed by A. Connes [12]. In our situation Connes’ general theory is greatly simplified. For example, the sheaf of 1-forms $\Omega^1(\mathcal{A})$ is simply a bimodule $\mathcal{A}^{\oplus 4}$ (the relation of this definition with the general theory is explained in Subsect. 8.11). The elements of $\Omega^1(\mathcal{A})$ will be denoted $\sum_i f^i(x)\,dx_i$, or simply $f^i(x)\,dx_i$. The exterior differential $d$ is a vector space morphism

\[ d : \mathcal{A} \to \Omega^1(\mathcal{A}), \qquad f \mapsto \frac{\partial f}{\partial x_i}\, dx_i. \]

The exterior differential $d$ satisfies the Leibniz rule $d(f_1 \star f_2) = df_1 \star f_2 + f_1 \star df_2$. This makes sense because $\Omega^1(\mathcal{A})$ is a bimodule. The sheaf of 2-forms over $\mathcal{A}$ is a bimodule $\Omega^2(\mathcal{A}) = \mathcal{A}^{\oplus 6}$ (see Subsect. 8.11). The definition of the exterior differential can be extended to $\Omega^1(\mathcal{A})$ in an obvious manner.

Complex conjugation acts as an anti-linear anti-homomorphism of $\mathcal{A}$, i.e. $\overline{f \star g} = \bar g \star \bar f$. Thus $\mathcal{A}$ has a natural structure of a ∗-algebra. We will denote the ∗-conjugate of $f \in \mathcal{A}$ by $f^\dagger$.

A trivial bundle over the noncommutative $\mathbb{R}^4$ is defined as a free $\mathcal{A}$-module $E$. A trivial unitary bundle over the noncommutative $\mathbb{R}^4$ is defined as a free module $V \otimes_{\mathbb{C}} \mathcal{A}$, where $V$ is a Hermitian vector space. A connection on a trivial bundle $E$ is defined as a map $\nabla : E \to E \otimes_{\mathcal{A}} \Omega^1(\mathcal{A})$, which is a vector space morphism satisfying the Leibniz rule $\nabla(m \star f) = \nabla(m) \star f + m \star df$. This formula makes use of the bimodule structure on $\Omega^1(\mathcal{A})$. The curvature $F_\nabla = [\nabla, \nabla]$ is a morphism of $\mathcal{A}$-modules $F_\nabla : E \to E \otimes_{\mathcal{A}} \Omega^2(\mathcal{A})$. As in the commutative case, a connection on a trivial bundle $E$ can be written in terms of a connection 1-form $A \in \mathrm{End}_{\mathcal{A}}(E) \otimes_{\mathcal{A}} \Omega^1(\mathcal{A})$: $\nabla(m) = dm + A \star m$. This formula uses the bimodule structure on m. If $E$ is a unitary bundle, and we have $A^\dagger = -A$, then we say that $A$ is a unitary connection. The curvature is given in terms of $A$ by the usual formula $F_\nabla := F_A = dA + A \wedge A$. Here it is understood that $f^i\,dx_i \wedge g^j\,dx_j = f^i \star g^j\, dx_i \wedge dx_j$.

¹ String theory considerations do not shed light on this problem.


The instanton equation on $\mathcal{A}$ is again given by (1), and the instanton number is defined by (2). The most obvious choice of the space of functions closed under the Wigner–Moyal product is the space of polynomial functions. However, this choice is not suitable for our purposes because it precludes the decrease of $F_A$ at infinity which is necessary for the instanton action to converge. In the commutative case, components of an instanton connection are rational functions [4], so we would like our class of functions to include rational functions on $\mathbb{R}^4$. A possible choice for the underlying set of $\mathcal{A}$ is the set of $C^\infty$ functions on $\mathbb{R}^4$ all of whose derivatives are polynomially bounded. Then we face the question of the convergence of the series (3). To avoid dealing with this issue, we modify our definition of the Wigner–Moyal product (see the Appendix for details). The modified product makes the space of $C^\infty$ functions all of whose derivatives are polynomially bounded into an algebra over $\mathbb{C}$, and agrees with (3) on polynomial functions. Polynomial functions form a subalgebra of $\mathcal{A}$. This subalgebra is isomorphic to the algebra generated by four variables $x_i$, $i = 1, 2, 3, 4$, with relations $[x_i, x_j] = \hbar\theta_{ij}$. This algebra is usually called the Weyl algebra.

To summarize, there is a limit of string theory in which D4-branes are described by Yang–Mills equations on the noncommutative $\mathbb{R}^4$ (= $\mathcal{A}$). D0-branes bound to D4-branes are described in this limit by the instanton equations on the noncommutative $\mathbb{R}^4$. One can show that, unlike in the commutative case, instantons cannot be deformed to point-like D0-branes without a cost in energy. Thus it is natural to suspect that the moduli space of instantons on the noncommutative $\mathbb{R}^4$ is metrically complete.

2. Review of the ADHM Construction and Summary

All instantons on the commutative $\mathbb{R}^4$ arise from the so-called ADHM construction. Recently N. Nekrasov and A.
Schwarz [29] introduced a modification of this construction which produces instantons on the noncommutative $\mathbb{R}^4$.² In the commutative case the completeness of the ADHM construction can be proved using the twistor transform of R. Penrose, so one could hope that the same approach could work in the noncommutative case. In this paper we show that the deformed ADHM data of [29] describe holomorphic bundles on certain noncommutative algebraic varieties and interpret the deformed ADHM construction in terms of noncommutative twistor transform. In this subsection we review both ordinary and deformed ADHM constructions and make a summary of our results.

² As in the commutative case, one may consider different classes of functions on the noncommutative $\mathbb{R}^4$: polynomial, $C^\infty$ functions rapidly decreasing at infinity, $C^\infty$ functions all of whose derivatives are polynomially bounded, etc. Our class of functions differs somewhat from that adopted in [29].

2.1. Review of the ADHM construction of instantons. First let us outline the ADHM construction of $U(r)$ instantons on the commutative $\mathbb{R}^4$ following [15]. We assume that the constant metric $G$ on $\mathbb{R}^4$ has been brought to the standard form $G = \mathrm{diag}(1, 1, 1, 1)$ by a linear change of basis. To construct a $U(r)$ instanton with $c_2 = k$ one starts with two Hermitian vector spaces $V \cong \mathbb{C}^k$ and $W \cong \mathbb{C}^r$. The ADHM data consist of four linear maps $B_1, B_2 \in \mathrm{Hom}(V, V)$, $I \in \mathrm{Hom}(W, V)$, $J \in \mathrm{Hom}(V, W)$ which satisfy the following two conditions:

(i) $\mu_c = [B_1, B_2] + IJ = 0$, $\mu_r = [B_1, B_1^\dagger] + [B_2, B_2^\dagger] + I I^\dagger - J^\dagger J = 0$.
(ii) For any $\xi = (\xi_1, \xi_2) \in \mathbb{C}^2 \cong \mathbb{R}^4$ the linear map $D_\xi \in \mathrm{Hom}(V \oplus V \oplus W, V \oplus V)$ defined by

\[ D_\xi = \begin{pmatrix} B_1 - \xi_1 & -B_2 + \xi_2 & I \\ B_2^\dagger - \bar\xi_2 & B_1^\dagger - \bar\xi_1 & J^\dagger \end{pmatrix} \tag{4} \]

is surjective.

The equations $\mu_c = \mu_r = 0$ are called the ADHM equations. They are invariant with respect to the action of the group of unitary transformations of $V$. Solutions of these equations are called ADHM data. The space of ADHM data modulo $U(V)$ transformations has dimension $4rk$ and carries a natural hyperkähler metric. The ADHM construction identifies this moduli space with the moduli space of $U(r)$ instantons with $c_2 = k$ and fixed trivialization at infinity. The role of the condition (ii) above is to remove submanifolds in this moduli space where the hyperkähler metric becomes singular (these are point-like instanton singularities mentioned in Subsect. 1.3). As a result the moduli space of the ADHM data is metrically incomplete.

The instanton connection can be reconstructed from the ADHM data as follows. The condition (ii) implies that the family $\mathrm{Ker}\, D_\xi$ forms a trivial subbundle of $V \oplus V \oplus W$ of rank $r$. Let $v(\xi)$ be its trivialization, i.e. a linear map $v(\xi) : \mathbb{C}^r \to V \oplus V \oplus W$ smoothly depending on $\xi$ such that $D_\xi v(\xi) = 0$ for all $\xi$, and $\rho(\xi) = v(\xi)^\dagger v(\xi)$ is an isomorphism for all $\xi$. We set $A(\xi) = \rho(\xi)^{-1} v(\xi)^\dagger dv(\xi)$. The matrix-valued one-form $A$ is a connection on a trivial unitary bundle of rank $r$. One can show that its curvature $F_A$ is ASD (see [4]). However, it does not satisfy $A^\dagger = -A$, because we are not using a unitary gauge. Instead $A$ satisfies $A^\dagger(\xi) = -(\rho(\xi) A(\xi) \rho(\xi)^{-1} + \rho(\xi)\, d\rho(\xi)^{-1})$. To go to a unitary gauge, we must make a gauge transformation $A'(\xi) = g(\xi) A(\xi) g(\xi)^{-1} + g(\xi)\, dg(\xi)^{-1}$, where $g(\xi)$ is a function taking values in Hermitian $r \times r$ matrices and satisfying $g(\xi)^2 = \rho(\xi)$.

We now explain, following [29], how to modify the ADHM construction so that it produces rank $r$ instantons on the noncommutative $\mathbb{R}^4$ defined in the previous section. It proves convenient to apply an orthogonal transformation which brings the matrix $\theta$ in (3) to the standard form

\[ \theta = \sqrt{-1} \begin{pmatrix} 0 & a & 0 & 0 \\ -a & 0 & 0 & 0 \\ 0 & 0 & 0 & b \\ 0 & 0 & -b & 0 \end{pmatrix}. \]
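Before turning to the deformed data, the commutative ADHM conditions (i) and (ii) can be checked numerically on a small example. The sketch below is only illustrative and makes several assumptions: matrices are plain nested Python lists, the helper names (`dagger`, `matmul`, `add`, `comm`, `moment_maps`, `adhm_operator`) are hypothetical, and the sample data ($k = 1$, $r = 2$, $I = (2, 0)$, $J = (0, 2)^T$) are chosen by hand rather than taken from the text.

```python
def dagger(m):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[m[i][j].conjugate() for i in range(len(m))] for j in range(len(m[0]))]


def matmul(a, b):
    """Matrix product of a (p x q) and b (q x s)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b, s=1):
    """Entrywise a + s * b."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return add(matmul(a, b), matmul(b, a), -1)


def moment_maps(B1, B2, I, J):
    """Return (mu_c, mu_r) as defined in condition (i)."""
    mu_c = add(comm(B1, B2), matmul(I, J))
    mu_r = add(add(comm(B1, dagger(B1)), comm(B2, dagger(B2))),
               add(matmul(I, dagger(I)), matmul(dagger(J), J), -1))
    return mu_c, mu_r


def adhm_operator(B1, B2, I, J, xi1, xi2):
    """Build the 2k x (2k + r) matrix D_xi of formula (4)."""
    k = len(B1)

    def shifted(B, xi, sign=1):
        # sign * (B - xi * identity)
        return [[sign * (B[i][j] - (xi if i == j else 0)) for j in range(k)]
                for i in range(k)]

    top = [r1 + r2 + r3 for r1, r2, r3 in
           zip(shifted(B1, xi1), shifted(B2, xi2, -1), I)]
    bottom = [r1 + r2 + r3 for r1, r2, r3 in
              zip(shifted(dagger(B2), xi2.conjugate()),
                  shifted(dagger(B1), xi1.conjugate()),
                  dagger(J))]
    return top + bottom
```

For the sample data `B1 = [[1 + 0j]]`, `B2 = [[0j]]`, `I = [[2, 0]]`, `J = [[0], [2]]`, both moment maps vanish, and `matmul(D, dagger(D))` with `D = adhm_operator(B1, B2, I, J, 1 + 0j, 0j)` is a nonzero multiple of the identity, so $D_\xi$ is surjective even at the point $\xi_1 = B_1$, $\xi_2 = B_2$. Replacing the data by `I = [[0]]`, `J = [[2]]` with $k = r = 1$ gives `mu_r == [[-4]]`, a toy instance of a nonzero constant value of $\mu_r$ of the kind that appears in the deformed equations.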


We will assume that $a + b \neq 0$. Since $\theta$ enters only in the combination $\hbar\theta$, we can set $a + b = 1$ without loss of generality. The relation between the affine coordinates $\xi_1, \xi_2$ on $\mathbb{C}^2$ and affine coordinates $x_1, x_2, x_3, x_4$ on $\mathbb{R}^4$ is chosen as follows:

\[ \xi_1 = x_4 - \sqrt{-1}\, x_3, \qquad \xi_2 = -x_2 + \sqrt{-1}\, x_1. \]

Then $\xi_1, \xi_2, \bar\xi_1, \bar\xi_2$ obey the Weyl algebra relations

\[ [\xi_1, \bar\xi_1] = 2\hbar b, \qquad [\xi_2, \bar\xi_2] = 2\hbar a, \qquad [\xi_1, \xi_2] = [\xi_1, \bar\xi_2] = [\bar\xi_1, \xi_2] = [\bar\xi_1, \bar\xi_2] = 0. \]
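As a consistency check, the first two relations can be derived in a few lines. The derivation assumes the Weyl algebra relations $[x_i, x_j] = \hbar\theta_{ij}$ and reads the standard form of $\theta$ as $\theta_{12} = -\theta_{21} = \sqrt{-1}\,a$, $\theta_{34} = -\theta_{43} = \sqrt{-1}\,b$, with all other entries zero; this reading of the matrix is an assumption.

```latex
\begin{align*}
[\xi_1, \bar\xi_1]
  &= [x_4 - \sqrt{-1}\,x_3,\; x_4 + \sqrt{-1}\,x_3]
   = \sqrt{-1}\,[x_4, x_3] - \sqrt{-1}\,[x_3, x_4] \\
  &= 2\sqrt{-1}\,\hbar\,\theta_{43}
   = 2\sqrt{-1}\,\hbar\,(-\sqrt{-1}\,b) = 2\hbar b, \\
[\xi_2, \bar\xi_2]
  &= [-x_2 + \sqrt{-1}\,x_1,\; -x_2 - \sqrt{-1}\,x_1]
   = \sqrt{-1}\,[x_2, x_1] - \sqrt{-1}\,[x_1, x_2] \\
  &= 2\sqrt{-1}\,\hbar\,\theta_{21}
   = 2\sqrt{-1}\,\hbar\,(-\sqrt{-1}\,a) = 2\hbar a.
\end{align*}
```

The mixed commutators vanish because $\xi_1, \bar\xi_1$ involve only $x_3, x_4$ while $\xi_2, \bar\xi_2$ involve only $x_1, x_2$, and $\theta_{ij} = 0$ whenever $i \in \{1, 2\}$ and $j \in \{3, 4\}$.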

The modified ADHM data consist of the same four maps which now satisfy $\mu_c = 0$, $\mu_r = -2\hbar(a + b) \cdot 1_{k \times k}$. The instanton connection is given by essentially the same formulas as in the commutative case. The operator $D$ is given by the same formula as $D_\xi$, but is now regarded as an element of $\mathrm{Hom}_{\mathcal{A}}((V \oplus V \oplus W) \otimes_{\mathbb{C}} \mathcal{A}, (V \oplus V) \otimes_{\mathbb{C}} \mathcal{A})$. The module $\mathrm{Ker}\, D$ is a projective module over $\mathcal{A}$. Following [10], we assume that it is isomorphic to a free module of rank $r$, and $v$ is the corresponding isomorphism $v : \mathcal{A}^{\oplus r} \to \mathrm{Ker}\, D$. We further assume [10] that the morphism $\Delta = D D^\dagger \in \mathrm{End}_{\mathcal{A}}((V \oplus V) \otimes \mathcal{A})$ is an isomorphism.³ Then it is easy to see that $\rho = v^\dagger v \in \mathrm{End}_{\mathcal{A}}(\mathbb{C}^r \otimes \mathcal{A})$ is an isomorphism too. We set

\[ A = \rho^{-1} v^\dagger dv. \tag{5} \]

(The multiplication here and below is understood to be the Wigner–Moyal multiplication.) This formula defines a connection 1-form on a trivial unitary bundle on $\mathcal{A}$ of rank $r$. The curvature of this connection is given by

\[ F_A = \rho^{-1}\, dv^\dagger \wedge (1 - v \rho^{-1} v^\dagger)\, dv. \]

A short computation (essentially the same as in the commutative case) shows that the curvature can be written in the form

\[ F_A = \rho^{-1} v^\dagger\, dD^\dagger\, \Delta^{-1} \wedge dD\, v. \]

Furthermore, since $D$ and $D^\dagger$ are linear in $\xi_i, \bar\xi_i$, their exterior derivatives have a very simple form:

\[ dD = \begin{pmatrix} -d\xi_1 & d\xi_2 & 0 \\ -d\bar\xi_2 & -d\bar\xi_1 & 0 \end{pmatrix}, \qquad dD^\dagger = \begin{pmatrix} -d\bar\xi_1 & -d\xi_2 \\ d\bar\xi_2 & -d\xi_1 \\ 0 & 0 \end{pmatrix}. \]

³ One can show that the latter assumption is always valid provided $\hbar \neq 0$. As for the former one, it is not known what constraints should be imposed on the deformed ADHM data to ensure that $\mathrm{Ker}\, D$ is a free $\mathcal{A}$-module of rank $r$. For $r = 1$ $\mathrm{Ker}\, D$ is never free [16].


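Note also that by virtue of the deformed ADHM equations $\Delta$ has a block-diagonal form:

$$\Delta = \begin{pmatrix} \delta & 0 \\ 0 & \delta \end{pmatrix},$$

where $\delta \in \mathrm{End}_A(V \otimes A)$ is an isomorphism. Using this fact, one can easily see that $F_A$ is proportional to the 2-forms

$$d\xi_1 \wedge d\bar\xi_1 + d\xi_2 \wedge d\bar\xi_2, \qquad d\xi_1 \wedge d\bar\xi_2, \qquad d\xi_2 \wedge d\bar\xi_1,$$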

which are anti-self-dual. As in the commutative case, the connection $A$ does not satisfy $A^\dagger = -A$. To go to a unitary gauge one has to perform a gauge transformation $A \mapsto g A g^{-1} + g\, dg^{-1}$. Here $g \in \mathrm{Aut}_A(\mathbb{C}^r \otimes A)$ should be found from the conditions $g^\dagger = g$, $g\,g = \rho$. The existence of such $g$ is an additional assumption.

2.2. Summary of results. In the commutative case there is a one-to-one correspondence between the following four classes of objects:

A. Rank $r$ holomorphic bundles on $\mathbb{P}^2$ with $c_2 = k$ and a fixed trivialization on the line at infinity.
B. The set of ADHM data modulo the natural action of $U(k)$.
C. Rank $r$ holomorphic bundles on $\mathbb{P}^3$ with $c_2 = k$, a trivialization on a fixed line, vanishing $H^1(E(-2))$, and satisfying a certain reality condition.
D. $U(r)$ instantons on $\mathbb{R}^4$ with $c_2 = k$.

The correspondence between C and D is a particular instance of twistor transform [6]. The correspondence between B and C has been proved by Atiyah, Hitchin, Drinfeld, and Manin [5, 4]. Together these two results imply that all instantons on $\mathbb{R}^4$ arise from the ADHM construction. The correspondence between A and B has been proved by Donaldson [15]. One can also prove the correspondence between A and D directly [7, 11, 18].

The goal of this paper is to extend some of these results to the noncommutative case. We show that there is a natural one-to-one correspondence between the isomorphism classes of the following objects:

A′. Algebraic bundles on a noncommutative deformation of $\mathbb{P}^2$ with $c_2 = k$ and a fixed trivialization on the line at infinity.
B′. Deformed ADHM data of Nekrasov and Schwarz modulo the natural $U(k)$ action.
C′. Certain complexes of sheaves on a noncommutative deformation of $\mathbb{P}^3$ satisfying reality conditions.

The moduli space of the deformed ADHM data has a natural hyperkähler metric, and the other two moduli spaces inherit this metric.
Furthermore, we reinterpret the deformed ADHM construction of Nekrasov and Schwarz in terms of a noncommutative deformation of the twistor transform. It is interesting to note that H. Nakajima [27] studied the same linear algebra data as Nekrasov and Schwarz and showed that their moduli space coincides with the moduli


A. Kapustin, A. Kuznetsov, D. Orlov

space of torsion free sheaves on a commutative $\mathbb{P}^2$ with a trivialization on a fixed line. On the other hand, we show that the same data describe algebraic bundles on a noncommutative $\mathbb{P}^2$. As shown below, the interpretation in terms of complexes of sheaves on a noncommutative $\mathbb{P}^3$ provides a geometric reason for this “coincidence”. We prove that the two moduli spaces are isomorphic as hyperkähler manifolds, but the natural complex structures on them differ by an $SO(3)$ rotation.

The rest of the paper is organized as follows. In Sect. 3 we define noncommutative deformations of certain commutative projective varieties ($\mathbb{P}^2$, $\mathbb{P}^3$, and a quadric in $\mathbb{P}^5$). Section 4 is an algebraic preparation for the study of bundles on noncommutative projective spaces. In Sect. 5 we study the cohomological properties of sheaves on noncommutative $\mathbb{P}^2$ and $\mathbb{P}^3$ and define locally free sheaves (i.e. bundles). In Sect. 6 we show that any bundle on a noncommutative $\mathbb{P}^2$ trivial on the commutative line at infinity arises as a cohomology of a monad. In Sect. 7 we exhibit bijections between A′, B′, and C′ and explain the relation with Nakajima’s results. In Sect. 8 we construct a noncommutative deformation of Grassmannians and flag manifolds and describe a noncommutative version of the twistor transform. We also describe a nice class of noncommutative projective varieties associated with a Yang–Baxter operator and define differential forms on these varieties. In Sect. 9 we consider a more general deformation of $\mathbb{R}^4$ (a $q$-deformed $\mathbb{R}^4$) whose physical significance is obscure at present. We propose an ADHM-like construction of instantons on this space and outline its relation to noncommutative algebraic geometry. In the Appendix we define the Wigner–Moyal product on the space of $C^\infty$ functions on $\mathbb{R}^n$ all of whose derivatives are polynomially bounded, and prove that the Wigner–Moyal product provides this space with a structure of an algebra over $\mathbb{C}$.

Note added in proof. After this paper was submitted to the electronic archive, we learned that coherent sheaves on the noncommutative projective plane and their moduli spaces have been studied by L. Le Bruyn [21].

3. Geometry of Noncommutative Varieties

3.1. Algebraic preliminaries. Let $k$ be a base field (we will be dealing only with $k = \mathbb{C}$ or $k = \mathbb{R}$ in this paper). Let $A$ be an algebra over $k$. It is called right (left) noetherian if every right (left) ideal is finitely generated, and it is called noetherian if it is both right and left noetherian. Let $A = \oplus_{i\ge 0} A_i$ be a graded noetherian algebra. We denote by $\mathrm{mod}(A)$ the category of finitely generated right $A$-modules, by $\mathrm{gr}(A)$ the category of finitely generated graded right $A$-modules, and by $\mathrm{tors}(A)$ the full subcategory of $\mathrm{gr}(A)$ which consists of finite dimensional graded $A$-modules.

An important role will be played by the quotient category $\mathrm{qgr}(A) = \mathrm{gr}(A)/\mathrm{tors}(A)$. It has the following explicit description. The objects of $\mathrm{qgr}(A)$ are the objects of $\mathrm{gr}(A)$ (we denote by $\widetilde{M}$ the object in $\mathrm{qgr}(A)$ which corresponds to a module $M$). The morphisms in $\mathrm{qgr}(A)$ are given by

$$\mathrm{Hom}_{\mathrm{qgr}}(\widetilde{M}, \widetilde{N}) = \varinjlim_{M'} \mathrm{Hom}_{\mathrm{gr}}(M', N),$$

where $M'$ runs over submodules of $M$ such that $M/M'$ is finite dimensional.

On the category $\mathrm{gr}(A)$ there is a shift functor: for a given graded module $M = \oplus_{i\ge 0} M_i$ the shifted module $M(r)$ is defined by $M(r)_i = M_{r+i}$. The induced shift functor on the quotient category $\mathrm{qgr}(A)$ sends $\widetilde{M}$ to $\widetilde{M}(r) = \widetilde{M(r)}$.


Similarly, we can consider the category $\mathrm{Gr}(A)$ of all graded right $A$-modules. It contains the subcategory $\mathrm{Tors}(A)$ of torsion modules. Recall that a module $M$ is called torsion if for any element $x \in M$ one has $xA_{\ge s} = 0$ for some $s$, where $A_{\ge s} = \oplus_{i\ge s} A_i$. We

denote by QGr(A) the quotient category Gr(A)/Tors(A). The category QGr(A) contains qgr(A) as a full subcategory. Sometimes it is convenient to work in QGr(A) instead of qgr(A). Henceforth, all graded algebras will be noetherian algebras generated by the first component A1 with A0 = k. Sometimes we use subscripts R or L for categories gr(A), qgr(A), etc., to specify whether right or left modules are considered. If the subscript is omitted, the modules are taken to be right modules. For the same reason for an A-bimodule M we sometimes write MA or A M to specify whether the right or left module structure is considered. 3.2. Noncommutative varieties. A variety in commutative geometry is a topological space with a sheaf of functions (continuous, smooth, analytic, algebraic, etc.) which is, obviously, a sheaf of algebras. One of the main objects in geometry (algebraic or differential) is a bundle or, more generally, a sheaf. To any variety X we can associate an abelian category of sheaves of modules (maybe with some additional properties) over the sheaf of algebras of functions. Given a sheaf of modules on X, the space of its global sections is a module over the algebra of global functions on X. Thus the functor of global sections associates to every X an algebra and a certain category of modules over it. Under favorable circumstances, much of the information about the geometry of X is contained in this purely algebraic datum. Let us give a few examples. If X is a compact Hausdorff topological space, then the category of vector bundles over X is equivalent to the category of finitely generated projective modules over the algebra of continuous functions on X [34, 36]. The equivalence is given by the functor which maps a vector bundle to the module of its global sections. It is well known that if A is a commutative noetherian algebra, the category of coherent sheaves on the noetherian affine scheme Spec(A) is equivalent to the category of finitely generated modules over A. 
The equivalence is again given by the functor which attaches to a coherent sheaf the module of its global sections. In the case of projective varieties the only global functions are constants, so one has to act somewhat differently. Since a projective variety $X$ is by definition a subvariety of a projective space, it inherits from it the line bundle $O_X(1)$ and its tensor powers $O_X(i)$. We can consider a graded algebra

$$\Gamma(X) = \oplus_{i\ge 0} H^0(X, O_X(i)).$$

This algebra is called the homogeneous coordinate algebra of $X$. Furthermore, for any sheaf $F$ we can define a graded $\Gamma(X)$-module

$$\Gamma(F) = \oplus_{i\ge 0} H^0(X, F(i)).$$

It can be checked that $\Gamma$ is a functor from the category $\mathrm{coh}(X)$ of coherent sheaves on $X$ to $\mathrm{gr}(\Gamma(X))$. In a brilliant paper [33], J.-P. Serre described the category of coherent sheaves on a projective scheme $X$ in terms of graded modules over the graded algebra $\Gamma(X)$. He proved that the category $\mathrm{coh}(X)$ is equivalent to the quotient category $\mathrm{qgr}(\Gamma(X)) = \mathrm{gr}(\Gamma(X))/\mathrm{tors}(\Gamma(X))$. The equivalence is given by the composition of the functor $\Gamma$


with the projection $\pi : \mathrm{gr}(A) \to \mathrm{qgr}(A)$. On the other hand, let $A = \oplus_{i\ge 0} A_i$ be a graded

commutative algebra generated over k by the first component (which is assumed to be finite dimensional). We can associate to A a projective scheme X = Proj(A). Serre proved that the category coh(X) is equivalent to the category qgr(A). The equivalence also holds for the category of quasicoherent sheaves on X and the category QGr(A) = Gr(A)/Tors(A). In all of the above examples it turned out that the natural category of sheaves or bundles on a variety is equivalent to a certain category defined in terms of (graded) modules over some (graded) algebra. On the other hand, “as A. Grothendieck taught us, to do geometry you really don’t need a space, all you need is a category of sheaves on this would-be space” ([25], p. 83). For this reason, in the realm of algebraic geometry it is natural to regard a noncommutative noetherian algebra as a coordinate algebra of a noncommutative affine variety; then the category of finitely generated right modules over this algebra is identified with the category of coherent sheaves on the corresponding variety. Similarly, a noncommutative graded noetherian algebra is regarded as a homogeneous coordinate algebra of a noncommutative projective variety. The category of finitely generated graded right modules over this algebra modulo torsion modules is identified with the category of coherent sheaves on this variety (see [3, 25, 35]). A different approach to noncommutative geometry has been pursued by Connes [12]. 3.3. Noncommutative deformations of commutative varieties. Many important noncommutative varieties arise as deformations of commutative ones. Let X be a commutative variety (affine or projective). Let A be the corresponding commutative (graded) algebra. A noncommutative deformation of X is a deformation of the algebra structure on A, that is, a deformation of the multiplication law. Usually it is not easy to write down an explicit formula for the deformed product. 
There is a more algebraic way to describe noncommutative deformations of commutative varieties. Assume that the algebra $A$ is given in terms of generators and relations. This means that $A$ is given as a quotient $A = T(V)/\langle R\rangle$, where $V$ is the vector space spanned by the generators, $T(V)$ is the tensor algebra of $V$, and $\langle R\rangle$ is a two-sided ideal in $T(V)$ generated by a subspace of relations $R \subset T(V)$. Assume that $R_\hbar \subset T(V)$ is a one-parameter deformation of the subspace $R$. Then $A_\hbar = T(V)/\langle R_\hbar\rangle$ is a one-parameter deformation of $A$. (If $A$ is graded, then we assume that $R$ is a graded subspace, and the deformation preserves the grading.) We denote by $X_\hbar$ the noncommutative variety corresponding to the algebra $A_\hbar$. Thus $X_\hbar$ is a noncommutative one-parameter deformation of $X$. If $X$ is projective and $A$ is a graded algebra, then we denote by $\mathrm{coh}(X_\hbar)$ the category $\mathrm{qgr}(A_\hbar)$. Furthermore, as in the commutative case, we will write $O(r)$ for the object $\widetilde{A_\hbar}(r)$.

Now we define noncommutative varieties which are going to be used in this paper.

3.4. Noncommutative $\mathbb{C}^4$. Denote by $A(\mathbb{C}^4)$ the algebra of polynomial functions on $\mathbb{C}^4$. Let $\theta$ be a skew-symmetric $4 \times 4$ matrix. Let us define the algebra $A(\mathbb{C}^4_\hbar)$ as an algebra over $\mathbb{C}$ generated by $x_i$ ($i = 1, 2, 3, 4$) with relations $[x_i, x_j] = \hbar\theta_{ij}$:

$$A(\mathbb{C}^4_\hbar) = T(x_1, x_2, x_3, x_4)/\langle [x_i, x_j] = \hbar\theta_{ij} \rangle_{1\le i,j\le 4}. \qquad (6)$$


We will regard $A(\mathbb{C}^4_\hbar)$ as the algebra of polynomial functions on a noncommutative affine variety $\mathbb{C}^4_\hbar$.

3.5. Noncommutative 4-dimensional quadric. Let $G$ be a $4 \times 4$ symmetric nondegenerate matrix. Consider a graded algebra $Q_\hbar = \oplus_{i\ge 0} Q_i$ over $\mathbb{C}$ generated by the elements $X_1, X_2, X_3, X_4, D, T$ of degree 1 with the following quadratic relations:

$$[T, D] = [T, X_i] = 0, \qquad [X_i, X_j] = \hbar\theta_{ij}T^2, \qquad [D, X_i] = 2\hbar \sum_{l,k} \theta_{il} G_{lk} X_k T, \qquad \sum_{i,j} G_{ij} X_i X_j = DT. \qquad (7)$$

We denote by $Q^4_\hbar$ the noncommutative projective variety corresponding to the algebra $Q_\hbar$. It is evident that $Q^4_\hbar$ is a deformation of a 4-dimensional commutative quadric $Q^4 = \{\sum_{i,j} G_{ij} X_i X_j = DT\} \subset \mathbb{CP}^5$.

3.6. Embedding $\mathbb{C}^4_\hbar \to Q^4_\hbar$. Let $Q_\hbar[T^{-1}]$ be the localization of the algebra $Q_\hbar$ with respect to $T$. Elements of degree 0 in $Q_\hbar[T^{-1}]$ form a subalgebra which will be denoted by $Q_\hbar[T^{-1}]_0$.

Lemma 3.1. The map $x_i \mapsto T^{-1}X_i$ ($i = 1, 2, 3, 4$) induces an isomorphism of the algebra $A(\mathbb{C}^4_\hbar)$ with the algebra $Q_\hbar[T^{-1}]_0$.

Proof. Obvious. □

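As a hedged illustration of why the lemma is called obvious, the following sketch checks that the images of the generators satisfy the defining relation (6). It uses only the relations (7), in particular that $T$ commutes with every $X_i$. The shorthand $\tilde x_i$ is introduced here for convenience and is not notation from the text; the sketch checks compatibility with the relations only and does not address bijectivity.

```latex
\tilde x_i := T^{-1} X_i, \qquad
[\tilde x_i, \tilde x_j]
  = T^{-2}\,[X_i, X_j]          % T is central by (7)
  = T^{-2}\,\hbar\,\theta_{ij}\,T^{2}
  = \hbar\,\theta_{ij}.
```

Hence $x_i \mapsto \tilde x_i$ respects the relations $[x_i, x_j] = \hbar\theta_{ij}$ and extends to an algebra map into the degree-0 part of the localization.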
This means that $\mathbb{C}^4_\hbar$ can be identified with the open subset $\{T \neq 0\}$ in $Q^4_\hbar$. For this reason, $Q^4_\hbar$ may be regarded as a compactification of $\mathbb{C}^4_\hbar$ which is compatible with the bilinear form $G$. Note also that the complement of $\mathbb{C}^4_\hbar$ in $Q^4_\hbar$ corresponds to the algebra

$$Q_\hbar/\langle T\rangle = T(X_1, X_2, X_3, X_4, D)\Big/\Big\langle [X_i, X_j] = [D, X_i] = 0,\ \sum_{i,j} G_{ij} X_i X_j = 0 \Big\rangle.$$

Since this algebra is commutative, the complement is the usual 3-dimensional commutative quadratic cone. Thus one may say that $Q^4_\hbar$ is obtained from $\mathbb{C}^4_\hbar$ by adding a cone “at infinity”. This is in complete analogy with the commutative case.

3.7. Noncommutative $\mathbb{P}^2_\hbar$ and $\mathbb{P}^3_\hbar$. Noncommutative deformations of the projective plane have been classified in [1, 2, 9]. We will need one of them, namely the one whose homogeneous coordinate algebra is a graded algebra $PP_\hbar = \oplus_{i\ge 0} (PP_\hbar)_i$ over $\mathbb{C}$ generated by the elements $w_1, w_2, w_3$ of degree 1 with the relations:

$$[w_3, w_i] = 0 \ \text{ for any } i = 1, 2, 3, \qquad [w_1, w_2] = 2\hbar w_3^2. \qquad (8)$$
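The relations (8) can be unpacked in the same way as in Lemma 3.1. The sketch below is an illustration under the assumption that one localizes $PP_\hbar$ at the central element $w_3$ and takes the degree-0 part; the names $u_1, u_2$ are hypothetical and do not appear in the text.

```latex
u_1 := w_3^{-1} w_1, \qquad u_2 := w_3^{-1} w_2, \qquad
[u_1, u_2] = w_3^{-2}\,[w_1, w_2]     % w_3 is central by (8)
           = w_3^{-2}\cdot 2\hbar\, w_3^{2}
           = 2\hbar .
```

So, for $\hbar \neq 0$, the degree-0 elements $u_1, u_2$ satisfy a Heisenberg (Weyl algebra) relation, which suggests that the complement of the line $\{w_3 = 0\}$ behaves like a noncommutative affine plane.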


We will also need a noncommutative deformation of the 3-dimensional projective space, whose homogeneous coordinate algebra will be denoted $PS_\hbar = \oplus_{i\ge 0} (PS_\hbar)_i$. It is a graded algebra over $\mathbb{C}$ generated by $(PS_\hbar)_1 = U$, where the vector space $U$ is spanned by elements $z_1, z_2, z_3, z_4$ obeying the relations

$$[z_3, z_i] = [z_4, z_i] = 0 \ \text{ for any } i = 1, 2, 3, 4, \qquad [z_1, z_2] = 2\hbar z_3 z_4. \qquad (9)$$
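For orientation, the relations (9) can be counted explicitly. The following is a bookkeeping sketch only; the labels $r_1, \dots, r_6$ are hypothetical names introduced here, and the elements are written in $U \otimes U$ with the tensor sign suppressed.

```latex
% Six linearly independent quadratic relations contained in (9):
r_1 = [z_3, z_1], \quad r_2 = [z_3, z_2], \quad r_3 = [z_3, z_4], \quad
r_4 = [z_4, z_1], \quad r_5 = [z_4, z_2], \quad
r_6 = [z_1, z_2] - 2\hbar\, z_3 z_4 .
% Dimension count:
\dim (U \otimes U) = 16, \qquad \dim \operatorname{span}\{r_1,\dots,r_6\} = 6 = \binom{4}{2},
\qquad 16 - 6 = 10 .
```

The count of 6 agrees with the number of commutation relations of the commutative polynomial algebra in four variables, and the complementary dimension 10 is the number of quadratic relations one should expect in the dual algebra.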

The noncommutative projective varieties corresponding to $PP_\hbar$ and $PS_\hbar$ will be denoted $\mathbb{P}^2_\hbar$ and $\mathbb{P}^3_\hbar$, respectively. Note that for $\hbar \neq 0$ all algebras $PS_\hbar$ are isomorphic, and therefore the varieties $\mathbb{P}^3_\hbar$ are the same for all $\hbar \neq 0$. The same is true for $\mathbb{P}^2_\hbar$.

3.8. Subvarieties in $\mathbb{P}^3_\hbar$ and $\mathbb{P}^2_\hbar$. If $I \subset PS_\hbar$ is a graded two-sided ideal, then the quotient algebra $PS_\hbar/I$ corresponds to a closed subvariety $X(I) \subset \mathbb{P}^3_\hbar$. Let us describe some of them. Let $J$ be the graded two-sided ideal generated by $z_3$ and $z_4$. Then $PS_\hbar/J = T(z_1, z_2)/\langle [z_1, z_2] = 0\rangle$, hence $X(J)$ is the commutative projective line. For each point $p = (\lambda : \mu) \in \mathbb{P}^1$ let $J_p$ denote the graded two-sided ideal generated by $\lambda z_3 + \mu z_4$. If $p = (0 : 1)$ or $p = (1 : 0)$, then it is easy to see that $X(J_p)$ is the commutative projective plane. For all other $p \in \mathbb{P}^1$ we have

$$PS_\hbar/J_p = T(z_1, z_2, z_3)\Big/\Big\langle [z_1, z_3] = [z_2, z_3] = 0,\ [z_1, z_2] = -2\hbar\frac{\lambda}{\mu} z_3^2 \Big\rangle,$$

hence $X(J_p)$ is a noncommutative projective plane isomorphic to $\mathbb{P}^2_\hbar$. We have $J_p \subset J$ for all $p \in \mathbb{P}^1$, hence all planes $X(J_p)$ pass through the line $X(J)$. Thus we see that $\mathbb{P}^3_\hbar$ is a pencil of noncommutative projective planes passing through a fixed commutative projective line. Similarly, the two-sided ideal generated by $w_3$ in $PP_\hbar$ corresponds to a commutative projective line $l = \{w_3 = 0\} \subset \mathbb{P}^2_\hbar$.

4. Properties of Algebras $PS_\hbar$ and $PP_\hbar$ and the Resolution of the Diagonal

This section is a preparation for the study of sheaves on $\mathbb{P}^3_\hbar$ and $\mathbb{P}^2_\hbar$. We show that the algebras $PS_\hbar$ and $PP_\hbar$ are regular and Koszul and construct the resolution of the diagonal, which will enable us to associate monads to certain bundles on $\mathbb{P}^2_\hbar$.

4.1. Quadratic algebras. A graded algebra $A = \oplus_{i\ge 0} A_i$ over a field $k$ is called quadratic if it is connected (i.e. $A_0 = k$), is generated by the first component $A_1$, and the ideal of relations is generated by the subspace of quadratic relations $R(A) \subset A_1 \otimes A_1$. Therefore the algebra $A$ can be represented as $T(A_1)/\langle R(A)\rangle$, where $T(A_1)$ is a free tensor algebra generated by the space $A_1$. The algebras $PS_\hbar$ and $PP_\hbar$ are quadratic algebras. For example, $PS_\hbar$ can be represented as $T(U)/\langle W\rangle$, where $U = (PS_\hbar)_1$ is a 4-dimensional vector space and $W$ is the 6-dimensional subspace of $U \otimes U$ spanned by the relations (9).


4.2. The dual algebra. For any quadratic algebra $A = T(A_1)/\langle R(A)\rangle$ we can define its dual algebra, which is also quadratic. Let us identify $A_1^* \otimes A_1^*$ with $(A_1 \otimes A_1)^*$ by $(l \otimes m)(a \otimes b) = m(a)l(b)$. Denote by $R(A)^\perp$ the annihilator of $R(A)$ in $A_1^* \otimes A_1^*$, i.e. the subspace which consists of those $q \in (A_1^*)^{\otimes 2}$ such that $q(r) = 0$ for any $r \in R(A)$.

Definition 4.1 ([25]). The algebra $A^! = T(A_1^*)/\langle R(A)^\perp\rangle$ is called the dual algebra of $A$.

Example 4.2. Let $\{\check z_i\}$, $i = 1, 2, 3, 4$, be the basis of $(PS_\hbar^!)_1 = U^*$ which is dual to $\{z_i\}$. By definition, $PS_\hbar^!$ is generated by $\{\check z_i\}$ with defining relations

$$\check z_i^2 = 0 \ \text{ for all } i = 1, \dots, 4; \qquad \check z_i \check z_j + \check z_j \check z_i = 0 \ \text{ for all } i < j,\ (i, j) \neq (3, 4); \qquad \check z_3 \check z_4 + \check z_4 \check z_3 = \hbar[\check z_1, \check z_2] = 2\hbar \check z_1 \check z_2.$$

In the commutative case the dual algebra of the symmetric algebra $S^\cdot(U)$ is isomorphic to the exterior algebra $\Lambda^\cdot(U^*)$. Obviously, the algebras $PS_\hbar^!$ and $PP_\hbar^!$ are deformations of exterior algebras. For example, the vector space $(PS_\hbar^!)_k$ is spanned by the elements $\check z_{i_1} \cdots \check z_{i_k}$ with $i_1 < \cdots < i_k$. In particular, the dimension of the vector space $(PS_\hbar^!)_k$ is equal to $\binom{4}{k}$. Similarly, the dimension of $(PP_\hbar^!)_k$ is equal to $\binom{3}{k}$.

Proposition 4.3. Let $A$ be $PS_\hbar$ or $PP_\hbar$, and let $n$ be 4 or 3, respectively. The multiplication map $A^!_k \otimes A^!_{n-k} \to A^!_n$ is a non-degenerate pairing. Hence the dual algebra $A^!$ is a Frobenius algebra, i.e. $(A^!)_{A^!} \cong ({}_{A^!}A^!)^*$ as right $A^!$-modules.

Proof. The proposition holds for the exterior algebra, and therefore also for the algebra $A^!$, since the latter is a “small” deformation of the exterior algebra. □

4.3. The Koszul complex. Consider right $A$-modules $(A^!_k)^* \otimes A$. The following complex $K_\cdot(A)$ is called the (right) Koszul complex of a quadratic algebra:

$$\cdots \xrightarrow{d} (A^!_3)^* \otimes A(-3) \xrightarrow{d} (A^!_2)^* \otimes A(-2) \xrightarrow{d} (A^!_1)^* \otimes A(-1) \xrightarrow{d} (A^!_0)^* \otimes A \to 0,$$

where the map $d : (A^!_k)^* \otimes A \to (A^!_{k-1})^* \otimes A$ is a composition of two natural maps:

$$(A^!_k)^* \otimes A \to (A^!_k)^* \otimes A^!_1 \otimes A_1 \otimes A \to (A^!_{k-1})^* \otimes A.$$

Here the first arrow sends $\alpha \otimes a$ to $\alpha \otimes e \otimes a$ with $e$ defined as

$$e = \sum_i y_i \otimes x_i \in A^!_1 \otimes A_1,$$

and $\{x_i\}$ and $\{y_i\}$ being the dual bases of $A_1$ and $A^!_1$, respectively. The second map is determined by the algebra structures on $A^!$ and $A$. It is a well-known fact that $d^2 = 0$ (see, for example, [25]).

Let $k_A$ be the trivial right $A$-module. The Koszul complex $K_\cdot(A)$ possesses a natural augmentation $K_\cdot(A) \xrightarrow{\varepsilon} k_A \to 0$.


Definition 4.4 (see [31]). A quadratic algebra $A = \oplus_{i\ge 0} A_i$ is called a Koszul algebra if the augmented Koszul complex $K_\cdot(A) \xrightarrow{\varepsilon} k_A \to 0$ is exact.

In the same manner one can define the left Koszul complex of a quadratic algebra. It is well known that the exactness of the right Koszul complex is equivalent to the exactness of the left Koszul complex (see, for example, [22]).

Proposition 4.5. The algebras $PS_\hbar$ and $PP_\hbar$ are Koszul algebras.

Proof. For $\hbar = 0$ this is a well-known fact about the symmetric algebra $S^\cdot(U)$. Since the augmented Koszul complex is exact for $\hbar = 0$, it is also exact for small $\hbar$, and consequently for all $\hbar$. □

Since the dual algebras $PS_\hbar^!$ and $PP_\hbar^!$ are finite dimensional, the Koszul resolutions for the algebras $PS_\hbar$ and $PP_\hbar$ are finite too and have the same form as the resolutions for ordinary symmetric algebras. For example, the Koszul resolution for $A = PP_\hbar$ is:

$$\{0 \to (A^!_3)^* \otimes A(-3) \to (A^!_2)^* \otimes A(-2) \to (A^!_1)^* \otimes A(-1) \to (A^!_0)^* \otimes A\} \to \mathbb{C}.$$

4.4. Resolution of the diagonal. Consider a bigraded vector space

$$K^2_{\cdot\cdot}(A) = \oplus_{k,l\ge 0} K^2_{k,l}(A) \quad \text{with} \quad K^2_{k,l}(A) = A(k) \otimes (A^!_{l-k})^* \otimes A(-l).$$

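Consider morphisms $d_R : K^2_{k,l} \to K^2_{k,l-1}$ and $d_L : K^2_{k,l} \to K^2_{k+1,l}$ given by the following compositions:

$$d_R : A \otimes (A^!_k)^* \otimes A \to A \otimes (A^!_k)^* \otimes A^!_1 \otimes A_1 \otimes A \to A \otimes (A^!_{k-1})^* \otimes A,$$
$$d_L : A \otimes (A^!_k)^* \otimes A \to A \otimes A_1 \otimes A^!_1 \otimes (A^!_k)^* \otimes A \to A \otimes (A^!_{k-1})^* \otimes A.$$

Here the leftmost maps are given by

$$e_R = \sum_i y_i \otimes x_i \in A^!_1 \otimes A_1 \quad \text{and} \quad e_L = \sum_i x_i \otimes y_i \in A_1 \otimes A^!_1,$$

where $\{x_i\}$ and $\{y_i\}$ are the dual bases of $A_1$ and $A^!_1$, respectively, while the rightmost maps are induced by the algebra structures of $A^!$ and $A$. It is easy to show that

$$d_R^2 = d_L^2 = 0 \quad \text{and} \quad d_R d_L = d_L d_R,$$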

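hence $K^2_{\cdot\cdot}(A)$ is a bicomplex. It is called the double Koszul bicomplex of the quadratic algebra $A$. The topmost part of the bicomplex looks as follows:

$$\begin{array}{ccccccc}
\cdots & \xrightarrow{d_R} & A \otimes (A^!_{l+1})^* \otimes A(-1-l) & \xrightarrow{d_R} & A \otimes (A^!_l)^* \otimes A(-l) & \xrightarrow{d_R} & \cdots \\
 & & \downarrow{\scriptstyle d_L} & & \downarrow{\scriptstyle d_L} & & \\
\cdots & \xrightarrow{d_R} & A(1) \otimes (A^!_l)^* \otimes A(-1-l) & \xrightarrow{d_R} & A(1) \otimes (A^!_{l-1})^* \otimes A(-l) & \xrightarrow{d_R} & \cdots
\end{array}$$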


Each term of the bicomplex $K^2_{\cdot\cdot}(A)$ has an obvious structure of a bigraded $A$-bimodule, and it is clear that the differentials are morphisms of bigraded $A$-bimodules. Let

$$K_l(A) = \mathrm{Ker}\,\big(d_L : K^2_{0,l}(A) \to K^2_{1,l}(A)\big).$$

Then $K_\cdot(A)$ is a complex of bigraded $A$-bimodules (with respect to the differential $d_R$). Consider a bigraded algebra $\Delta = \oplus_{i,j} \Delta_{ij}$ with $\Delta_{ij} = A_{i+j}$ and with the multiplication induced from $A$. The algebra $\Delta$ is called the diagonal bigraded algebra of $A$. Note that the multiplication map induces a surjective morphism of $A$-bimodules $\delta : A \otimes A \to \Delta$.

Lemma 4.6. The map $\delta : K_0(A) = A \otimes A \to \Delta$ gives an augmentation of the complex $K_\cdot(A)$.

Proof. We have to check that $\delta \cdot d_R : K_1(A) \to \Delta$ vanishes. Note that $K^2_{0,1}(A) = A \otimes A_1 \otimes A(-1)$, and that the differentials $d_R$ and $d_L$ restricted to $K^2_{0,1}(A)$ coincide with the multiplication maps $m_{1,2}$ and $m_{2,3}$, respectively. Thus we have the following commutative diagram:

$$\begin{array}{ccccc}
K_1(A) & \xrightarrow{d_R} & K_0(A) & \xrightarrow{\delta} & \Delta \\
\downarrow & & \| & & \| \\
A \otimes A_1 \otimes A(-1) & \xrightarrow{m_{1,2}} & A \otimes A & \xrightarrow{\delta} & \Delta \\
\downarrow{\scriptstyle m_{2,3}} & & & & \\
A(1) \otimes A(-1) & & & &
\end{array}$$

Now the lemma follows because $\delta \cdot m_{1,2} = \delta \cdot m_{2,3}$ (associativity) obviously annihilates $\mathrm{Ker}\, m_{2,3} = K_1(A)$. □

Proposition 4.7. If $A$ is Koszul, then $K_\cdot(A) \xrightarrow{\delta} \Delta$ is exact.

Proof. The $(p, q)$-bigraded component of $K^2_{k,l}(A)$ is equal to $A_{p+k} \otimes (A^!_{l-k})^* \otimes A_{q-l}$, hence the $(p, q)$-bigraded component of the bicomplex $K^2_{\cdot\cdot}(A)$ vanishes for $l < k$ or $l > q$. Thus the $(p, q)$-bigraded component of the bicomplex $K^2_{\cdot\cdot}(A)$ is bounded. Therefore both spectral sequences of the bicomplex $K^2_{\cdot\cdot}(A)$ converge to the cohomology of the total complex $\mathrm{Tot}(K^2_{\cdot\cdot}(A))$. The first term of the first spectral sequence reads

$$E^1_{k,l} = \begin{cases} A(l) \otimes k(-l), & \text{if } k = l \\ 0, & \text{otherwise.} \end{cases}$$

Hence the spectral sequence degenerates in the first term, and we have

$$H^0(\mathrm{Tot}(K^2_{\cdot\cdot}(A))) = \oplus_{l=0}^{\infty} A(l) \otimes k(-l), \qquad H^{\neq 0}(\mathrm{Tot}(K^2_{\cdot\cdot}(A))) = 0.$$


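On the other hand, the first term of the second spectral sequence reads

$$E^1_{k,l} = \begin{cases} k(l) \otimes A(-l), & \text{if } k = l > 0 \\ K_l(A), & \text{if } k = 0 \\ 0, & \text{otherwise.} \end{cases}$$

Hence the spectral sequence degenerates in the second term, and we have

$$H^0(\mathrm{Tot}(K^2_{\cdot\cdot}(A))) = H^0(K_\cdot(A)) \oplus \Big( \oplus_{l=1}^{\infty} k(l) \otimes A(-l) \Big), \qquad H^l(\mathrm{Tot}(K^2_{\cdot\cdot}(A))) = H^l(K_\cdot(A)).$$

Therefore $H^{\neq 0}(K_\cdot(A)) = 0$, and we have an exact sequence

$$0 \to H^0(K_\cdot(A)) \to \oplus_{l=0}^{\infty} A(l) \otimes k(-l) \to \oplus_{l=1}^{\infty} k(l) \otimes A(-l) \to 0.$$

Looking at the $(p, q)$-bigraded component of this sequence we see that

$$(H^0(K_\cdot(A)))_{p,q} = \begin{cases} A_{p+q}, & \text{if } p, q \ge 0 \\ 0, & \text{otherwise.} \end{cases}$$

Thus $H^0(K_\cdot(A)) = \Delta$. □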

Definition 4.8. Define the left $A$-module $\Omega^k$ as the cohomology of the left Koszul complex, truncated in the term $K_k$. In particular, $\Omega^1$ is defined by the so-called Euler sequence

$$0 \to \Omega^1 \to A(-1) \otimes A_1 \xrightarrow{m} A \xrightarrow{\varepsilon} k \to 0. \qquad (10)$$
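As a sanity check on the Euler sequence (10), one can count dimensions degree by degree. The sketch below is for $A = PP_\hbar$ and assumes that $\dim A_i = \binom{i+2}{2}$, i.e. that $PP_\hbar$ has the same graded dimensions as the polynomial algebra in three variables (the growth statement used in Sect. 4.5); it is an illustration, not part of the original argument.

```latex
% Exactness of (10) in degree i gives
\dim \Omega^1_i
  = 3 \dim A_{i-1} - \dim A_i + \delta_{i,0}
  = 3\binom{i+1}{2} - \binom{i+2}{2} + \delta_{i,0}.
% Low degrees:
\dim \Omega^1_0 = 0 - 1 + 1 = 0, \qquad
\dim \Omega^1_1 = 3 - 3 = 0, \qquad
\dim \Omega^1_2 = 9 - 6 = 3,
% and in general
\dim \Omega^1_i = i^2 - 1 \quad (i \ge 1).
```

The first nonzero component sits in degree 2 and has dimension $3 = \binom{3}{2}$, matching the dimension of the degree-2 component of the dual algebra, and the values $i^2 - 1$ coincide with $h^0(\Omega^1_{\mathbb{P}^2}(i))$ on the commutative plane.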

In Sect. 8.11 we will show that for noncommutative projective spaces the sheaves corresponding to the modules $\Omega^k$ can be regarded as sheaves of differential forms.

Proposition 4.9. We have $K_k(A) = \Omega^k(k) \otimes A(-k)$.

Proof. This follows immediately from the definition of $\Omega^k$ and $K_k(A)$. □

Combining Propositions 4.7 and 4.9, we obtain the following resolution of the diagonal:

$$\cdots \to \Omega^2(2) \otimes A(-2) \to \Omega^1(1) \otimes A(-1) \to A \otimes A \to \Delta \to 0. \qquad (11)$$


4.5. Cohomological properties of the algebras $PS_\hbar$ and $PP_\hbar$. First we note that both algebras $PS_\hbar$ and $PP_\hbar$ are noetherian. This follows from the fact that they are Ore extensions of commutative polynomial algebras (see for example, [26]). For the same reason the algebras $PS_\hbar$ and $PP_\hbar$ have finite right (and left) global dimension, which is equal to 4 and 3, respectively (see [26], p. 273). We recall that the global dimension of a ring $A$ is the minimal number $n$ (if it exists) such that for any two modules $M$ and $N$ we have $\mathrm{Ext}^{n+1}_A(M, N) = 0$.

In the paper [1] the notion of a regular algebra has been introduced. Regular algebras have many good properties (see [3, 2, 40], etc.).

Definition 4.10. A graded algebra $A$ is called regular of dimension $d$ if it satisfies the following conditions:
(1) $A$ has global dimension $d$,
(2) $A$ has polynomial growth, i.e. $\dim A_n \le c\,n^\delta$ for some $c, \delta \in \mathbb{R}$,
(3) $A$ is Gorenstein, meaning that $\mathrm{Ext}^i_A(k, A) = 0$ if $i \neq d$, and $\mathrm{Ext}^d_A(k, A) = k(l)$ for some $l$.
Here $\mathrm{Ext}_A$ stands for the Ext functor in the category $\mathrm{mod}(A)$.

It is easy to see that these properties are verified for $PS_\hbar$ and $PP_\hbar$. Property (2) holds because our algebras grow as ordinary polynomial algebras. Property (3) follows from the fact that $PS_\hbar$ and $PP_\hbar$ are Koszul algebras and the dual algebras are Frobenius algebras. In this case the Gorenstein parameter $l$ in (3) is equal to the global dimension $d$. Thus we have

Proposition 4.11. The algebras $PS_\hbar$ and $PP_\hbar$ are noetherian regular algebras of global dimension 4 and 3, respectively. For these algebras the Gorenstein parameter $l$ coincides with the global dimension $d$.

5. Cohomological Properties of Sheaves on $\mathbb{P}^2_\hbar$ and $\mathbb{P}^3_\hbar$

5.1. Ampleness and cohomology of $O(i)$. Let $A$ be a graded algebra and $X$ be the corresponding noncommutative projective variety. Consider the sequence of sheaves $\{O(i)\}_{i\in\mathbb{Z}}$ in the category $\mathrm{coh}(X) \cong \mathrm{qgr}(A)$, where $O(i) = \widetilde{A(i)}$. This sequence is called ample if the following conditions hold:
(a) For every coherent sheaf $F$ there are integers $k_1, \dots, k_s$ and an epimorphism $\oplus_{i=1}^{s} O(-k_i) \to F$.
(b) For every epimorphism $F \to G$ the induced map $\mathrm{Hom}(O(-n), F) \to \mathrm{Hom}(O(-n), G)$ is surjective for $n \gg 0$.

It is proved in [3] that the sequence $\{O(i)\}$ is ample in $\mathrm{qgr}(A)$ for a graded right noetherian $k$-algebra $A$ if it satisfies the extra condition:

$$(\chi_1) : \quad \dim_k \mathrm{Ext}^1_A(k, M) < \infty$$

for any finitely generated graded $A$-module $M$. This condition can be verified for all noetherian regular algebras (see [3], Theorem 8.1). In particular, the categories $\mathrm{coh}(\mathbb{P}^3_\hbar)$, $\mathrm{coh}(\mathbb{P}^2_\hbar)$ have ample sequences.


For any sheaf $F \in \mathrm{qgr}(A)$ we can define a graded module $\Gamma(F)$ by the rule:

$$\Gamma(F) := \oplus_{i\ge 0} \mathrm{Hom}(O(i), F).$$

It is proved in [3] that for any noetherian algebra $A$ that satisfies the condition $\chi_1$ the correspondence $\Gamma$ is a functor from $\mathrm{qgr}(A)$ to $\mathrm{gr}(A)$ and the composition of $\Gamma$ with the natural projection $\pi : \mathrm{gr}(A) \to \mathrm{qgr}(A)$ is isomorphic to the identity functor (see [3, Ch. 3,4]).

Now we formulate a result about the cohomology of sheaves on noncommutative projective spaces. This result is proved in [3] for a general regular algebra and parallels the commutative case.

Proposition 5.1 ([3, Theorem 8.1]). Let $A$ be $PS_\hbar$ or $PP_\hbar$, and $X$ be $\mathbb{P}^3_\hbar$ or $\mathbb{P}^2_\hbar$, respectively. Denote by $n$ the dimension of $X$ (in our case $n = 3$ or $n = 2$, respectively). Then
(1) The cohomological dimension of $\mathrm{coh}(X)$ is equal to $\dim(X)$, i.e. for any two coherent sheaves $F$ and $G$ the group $\mathrm{Ext}^i(F, G)$ vanishes if $i > n$.
(2) There are isomorphisms

$$H^p(X, O(i)) = \begin{cases} A_i & \text{for } p = 0,\ i \ge 0 \\ A^*_{-i-1-n} & \text{for } p = n,\ i \le -n-1 \\ 0 & \text{otherwise.} \end{cases} \qquad (12)$$

This proposition and the ampleness of the sequence $\{O(i)\}$ imply the following corollary:

Corollary 5.2. Let $X$ be either $\mathbb{P}^3_\hbar$ or $\mathbb{P}^2_\hbar$. Then for any sheaf $F \in \mathrm{coh}(X)$ and for all sufficiently large $i \ge 0$ we have $\mathrm{Hom}(F, O(-i)) = 0$.

Proof. By ampleness a sheaf $F$ can be covered by a finite sum of sheaves $O(k_j)$. Now the statement follows from the proposition, because $\mathrm{Hom}(O(k_j), O(i)) = 0$ for all $i < k_j$. □

Corollary 5.3. Let $X$ be either $\mathbb{P}^3_\hbar$ or $\mathbb{P}^2_\hbar$. Then for any sheaf $F \in \mathrm{coh}(X)$ and for all sufficiently large $i \ge 0$ we have $H^k(X, F(i)) = 0$ for all $k \ge 1$.

Proof. The group $H^k(X, F(i))$ coincides with $\mathrm{Ext}^k(O(-i), F)$. Let $k$ be the maximal integer (it exists because the global dimension is finite) such that for some $F$ there exists arbitrarily large $i$ such that $\mathrm{Ext}^k(O(-i), F) \neq 0$. Assume that $k \ge 1$. Choose an epimorphism $\oplus_{j=1}^{s} O(-k_j) \to F$. Let $F_1$ denote its kernel. Then for $i > \max\{k_j\}$ we have $\mathrm{Ext}^{>0}(O(-i), \oplus_{j=1}^{s} O(-k_j)) = 0$, hence $\mathrm{Ext}^k(O(-i), F) \neq 0$ implies $\mathrm{Ext}^{k+1}(O(-i), F_1) \neq 0$. This contradicts the assumption of the maximality of $k$. □


5.2. Serre duality and the dualizing sheaf. A very useful property of commutative smooth projective varieties is the existence of the dualizing sheaf. Recall that a sheaf $\omega$ is called dualizing if for any $F \in \mathrm{coh}(X)$ there are natural isomorphisms of $k$-vector spaces

$$H^i(X, F) \cong \mathrm{Ext}^{n-i}(F, \omega)^*,$$

where $*$ denotes the $k$-dual. The Serre duality theorem asserts the existence of the dualizing sheaf for smooth projective varieties. In this case the dualizing sheaf is a line bundle and coincides with the sheaf of differential forms $\Omega^n_X$ of top degree.

Since the definition of $\omega$ is given in abstract categorical terms, it can be extended to the noncommutative case. More precisely, we will say that $\mathrm{qgr}(A)$ satisfies classical Serre duality if there is an object $\omega \in \mathrm{qgr}(A)$ together with natural isomorphisms

$$\mathrm{Ext}^i(O, -) \cong \mathrm{Ext}^{n-i}(-, \omega)^*.$$

Our noncommutative varieties $\mathbb{P}^3_\hbar$ and $\mathbb{P}^2_\hbar$ satisfy classical Serre duality, with dualizing sheaves being $O_{\mathbb{P}^3_\hbar}(-4)$ and $O_{\mathbb{P}^2_\hbar}(-3)$, respectively. This follows from the paper [40], where the existence of a dualizing sheaf in $\mathrm{qgr}(A)$ has been proved for a general class of algebras which includes all noetherian regular algebras. In addition, the authors of [40] showed that the dualizing sheaf coincides with $\widetilde{A}(-l)$, where $l$ is the Gorenstein parameter for $A$ (see condition (3) of Definition 4.10).

5.3. Bundles on noncommutative projective spaces. To any graded right $A$-module $M$ one can attach a left $A$-module $M^\vee = \mathrm{Hom}_A(M, A)$ which is also graded. Note that under this correspondence the right module $A_A(r)$ goes to the left module ${}_A A(-r)$.

It is known that if $A$ is a noetherian regular algebra, then $\mathrm{Hom}_A(-, A)$ is a functor from the category $\mathrm{gr}(A)_R$ to the category $\mathrm{gr}(A)_L$. Moreover, its derived functor $R\mathrm{Hom}^\cdot_A(-, A)$ gives an anti-equivalence between the derived categories of $\mathrm{gr}(A)_R$ and $\mathrm{gr}(A)_L$ (see [39, 40, 38]).

If we assume that the composition of the functor $\mathrm{Hom}_A(-, A)$ with the projection $\mathrm{gr}(A)_L \to \mathrm{qgr}(A)_L$ factors through the projection $\mathrm{gr}(A)_R \to \mathrm{qgr}(A)_R$, then we obtain a functor from $\mathrm{qgr}(A)_R$ to $\mathrm{qgr}(A)_L$ which is denoted by $\mathcal{H}om(-, O)$. This functor is not right exact and has right derived functors $\mathcal{E}xt^i(-, O)$, $i > 0$, from $\mathrm{qgr}(A)_R$ to $\mathrm{qgr}(A)_L$.

For a noetherian regular algebra the functor $\mathcal{H}om(-, O)$ and its right derived functors exist. This follows from the fact that the functors $\mathrm{Ext}^i_A(-, A)$ send a finite dimensional module to a finite dimensional module (see condition (3) of Definition 4.10). Moreover, in this case the functor $\mathcal{H}om(-, O)$ can be represented as the composition of the functor $\Gamma : \mathrm{qgr}(A)_R \to \mathrm{gr}(A)_R$, the functor $\mathrm{Hom}_A(-, A) : \mathrm{gr}(A)_R \to \mathrm{gr}(A)_L$, and the projection $\pi : \mathrm{gr}(A)_L \to \mathrm{qgr}(A)_L$. This can be illustrated by the following commutative diagram:

$$\begin{array}{ccc}
\mathrm{gr}(A)_R & \xrightarrow{\ \mathrm{Hom}_A(-,A)\ } & \mathrm{gr}(A)_L \\
{\scriptstyle \pi}\downarrow\ \uparrow{\scriptstyle \Gamma} & & \downarrow{\scriptstyle \pi} \\
\mathrm{qgr}(A)_R & \xrightarrow{\ \mathcal{H}om(-,O)\ } & \mathrm{qgr}(A)_L
\end{array} \qquad (13)$$

For a noetherian regular algebra the functor $\mathrm{RHom}^{\cdot}_A(-, A)$ is an anti-equivalence between the derived categories of $\mathrm{gr}(A)_R$ and $\mathrm{gr}(A)_L$ and takes complexes of finite dimensional modules over $\mathrm{gr}(A)_R$ to complexes of finite dimensional modules over $\mathrm{gr}(A)_L$.


A. Kapustin, A. Kuznetsov, D. Orlov

This implies that the functor $\mathrm{RHom}^{\cdot}(-, \mathcal{O})$ gives an anti-equivalence between the derived categories of $\mathrm{qgr}(A)_R$ and $\mathrm{qgr}(A)_L$. (Note that for the derived functors $\mathrm{RHom}_A(-, A)$ and $\mathrm{RHom}(-, \mathcal{O})$ there is also a commutative diagram like (13).) The functors $\mathrm{Ext}^j(-, \mathcal{O})$ can be described more explicitly. Let M be an A-bimodule. Regarding it as a right module, we see that for any $F \in \mathrm{QGr}(A)_R$ the groups $\mathrm{Ext}^i(F, \widetilde{M})$ have the structure of left A-modules. We can project them to $\mathrm{QGr}(A)_L$. Thus each bimodule M defines functors from $\mathrm{QGr}(A)_R$ to $\mathrm{QGr}(A)_L$, which will be denoted by $\pi\mathrm{Ext}^i(-, \widetilde{M})$. Now, using $\pi\Gamma = \mathrm{id}$ and the commutativity of the diagram (13) for the derived functors $\mathrm{Ext}^j_A(-, A)$ and $\mathrm{Ext}^j(-, \mathcal{O})$, we obtain isomorphisms
\[
\mathrm{Ext}^j(F, \mathcal{O}) \cong \pi\,\mathrm{Ext}^j_A(\Gamma(F), A) \cong \pi\,\mathrm{Ext}^j_{\mathrm{gr}(A)}\Bigl(\Gamma(F), \bigoplus_{i \ge 0} A(i)\Bigr) \cong \pi\,\mathrm{Ext}^j\Bigl(F, \bigoplus_{i \ge 0} \mathcal{O}(i)\Bigr)
\tag{14}
\]
for any sheaf $F \in \mathrm{qgr}(A)_R$.

Definition 5.4. We call a coherent sheaf $F \in \mathrm{qgr}(A)_R$ locally free (or a bundle) if $\mathrm{Ext}^j(F, \mathcal{O}) = 0$ for any $j \neq 0$.

Remark. In the commutative case this definition is equivalent to the usual definition of a locally free sheaf.

Definition 5.5. The dual sheaf $\mathrm{Hom}(F, \mathcal{O}) \in \mathrm{qgr}(A)_L$ will be denoted by $F^\vee$.

If $F \in \mathrm{qgr}(A)_R$ is a bundle, then the dual sheaf $F^\vee$ is a bundle in $\mathrm{qgr}(A)_L$, because $\mathrm{RHom}^{\cdot}(F^\vee, \mathcal{O}) = F$ in the derived category, and $\mathrm{Ext}^j(F^\vee, \mathcal{O}) = 0$ for $j \neq 0$. Thus we have a good definition of locally free sheaves on $\mathbb{P}^3_\hbar$ and $\mathbb{P}^2_\hbar$. Since the derived functor $\mathrm{RHom}(-, \mathcal{O})$ gives an anti-equivalence between the derived categories of $\mathrm{qgr}(A)_R$ and $\mathrm{qgr}(A)_L$, there is an isomorphism:
\[ \mathrm{Hom}(F, G) \cong \mathrm{Hom}(G^\vee, F^\vee) \tag{15} \]

for any two bundles F and G on $\mathbb{P}^3_\hbar$ or $\mathbb{P}^2_\hbar$.

6. Bundles on $\mathbb{P}^2_\hbar$

6.1. Bundles on $\mathbb{P}^2_\hbar$ with a trivialization on the commutative line. In this section we study bundles on $\mathbb{P}^2_\hbar$. By definition, a bundle is an object $E \in \mathrm{coh}(\mathbb{P}^2_\hbar)$ satisfying the additional condition $\mathrm{Ext}^i(E, \mathcal{O}) = 0$ for all $i > 0$ (see Definition 5.4). The noncommutative plane $\mathbb{P}^2_\hbar$ contains the commutative projective line $l \cong \mathbb{P}^1$ given by the equation $w_3 = 0$. If M is a $PP_\hbar$-module, then the quotient module $M/Mw_3$ is a $PP_\hbar/w_3$-module. This gives a functor $\mathrm{coh}(\mathbb{P}^2_\hbar) \to \mathrm{coh}(\mathbb{P}^1)$, $F \mapsto F|_l$. The sheaf $F|_l$ is referred to as the restriction of F to the line l.

Lemma 6.1. If F is a bundle, there is an exact sequence:
\[ 0 \to F(-1) \xrightarrow{\ \cdot w_3\ } F \to F|_l \to 0. \tag{16} \]

Noncommutative Instantons and Twistor Transform


Proof. To prove this we only need to show that multiplication by $w_3$ is a monomorphism. If F is a bundle, it can be embedded into a direct sum $\bigoplus_{i=1}^{s} \mathcal{O}(k_i)$, because by ampleness the dual bundle $F^\vee$ is covered by a direct sum of line bundles. Now, since the morphism $\mathcal{O}(k_i - 1) \xrightarrow{\cdot w_3} \mathcal{O}(k_i)$ is mono for any i, the same is true for the morphism $F(-1) \xrightarrow{\cdot w_3} F$. □

Lemma 6.2. Let E be a bundle on $\mathbb{P}^2_\hbar$ such that its restriction $E|_l$ to the commutative line l is isomorphic to a trivial bundle $\mathcal{O}_l^{\oplus r}$. Then
\[ H^0(\mathbb{P}^2_\hbar, E(-1)) = H^0(\mathbb{P}^2_\hbar, E(-2)) = H^2(\mathbb{P}^2_\hbar, E(-1)) = H^2(\mathbb{P}^2_\hbar, E(-2)) = 0. \]

Proof. We have the following exact sequence in the category $\mathrm{coh}(\mathbb{P}^2_\hbar)$:
\[ 0 \to E(-2) \to E(-1) \to E(-1)|_l \to 0. \tag{17} \]

Since $E(-1)|_l \cong \mathcal{O}_l(-1)^{\oplus r}$, we have $H^0(E(-1)|_l) = 0$. Assume that $E(-1)$ has a nontrivial section. Then $E(-2)$ has a nontrivial section too. For the same reason $E(-3)$ has a nontrivial section, and so on. Thus for any $n > 0$ the bundle $E(-n)$ has a nontrivial section. By (15) we have isomorphisms:
\[ H^0(E(-n)) \cong \mathrm{Hom}(\mathcal{O}(n), E) \cong \mathrm{Hom}(E^\vee, \mathcal{O}(-n)). \]
On the other hand, by Corollary 5.2 the last group is trivial for $n \gg 0$. Hence $H^0(E(-n)) = 0$ for all $n \gg 0$, and consequently $H^0(E(-2)) = H^0(E(-1)) = 0$. Further, assume that $H^2(E(-2))$ is nontrivial. Since $H^1(E(i)|_l) = 0$ for all $i \ge -1$, we have from the exact sequence (16) with $F = E(i)$ that $H^2(E(i))$ is nontrivial too for all $i \ge -1$. But this contradicts Corollary 5.3. Therefore $H^2(E(-2)) = H^2(E(-1)) = 0$. This completes the proof. □

6.2. Monads on $\mathbb{P}^2_\hbar$ and $\mathbb{P}^3_\hbar$. As in the commutative case, a non-degenerate monad on $\mathbb{P}^2_\hbar$ or $\mathbb{P}^3_\hbar$ is a complex over $\mathrm{coh}(\mathbb{P}^2_\hbar)$
\[ 0 \to H \otimes \mathcal{O}(-1) \xrightarrow{\ m\ } K \otimes \mathcal{O} \xrightarrow{\ n\ } L \otimes \mathcal{O}(1) \to 0 \]
for which the map n is an epimorphism and m is a monomorphism. (Note that there is another, more restrictive definition of a monad, according to which the dual map $(m)^*$ has to be an epimorphism, see [30].) The coherent sheaf $E = \mathrm{Ker}(n)/\mathrm{Im}(m)$ is called the cohomology of a monad. A morphism between two monads is a morphism of complexes. The following lemma is proved in [30, Lemma 4.1.3] in the commutative case, but the proof is categorical and applies to the noncommutative case as well.

Lemma 6.3. Let X be either $\mathbb{P}^2_\hbar$ or $\mathbb{P}^3_\hbar$, and let E and E′ be the cohomology bundles of two monads
\[ M : 0 \to H \otimes \mathcal{O}(-1) \xrightarrow{\ m\ } K \otimes \mathcal{O} \xrightarrow{\ n\ } L \otimes \mathcal{O}(1) \to 0, \]
\[ M' : 0 \to H' \otimes \mathcal{O}(-1) \xrightarrow{\ m'\ } K' \otimes \mathcal{O} \xrightarrow{\ n'\ } L' \otimes \mathcal{O}(1) \to 0 \]


on X. Then the natural mapping $\mathrm{Hom}(M, M') \to \mathrm{Hom}(E, E')$ is bijective.

The proof is based on the fact that $\mathrm{Ext}^j(\mathcal{O}, \mathcal{O}(-1)) = \mathrm{Ext}^j(\mathcal{O}(1), \mathcal{O}(-1)) = \mathrm{Ext}^j(\mathcal{O}(1), \mathcal{O}) = 0$ for all j (see [30], Lemma 4.1.3).

6.3. Non-degeneracy conditions. In the definition of a monad we require that the map n be an epimorphism. In the commutative case this condition must be verified pointwise. In the noncommutative case the situation is simpler in some sense, because the complement of the commutative line l does not have points.

Lemma 6.4. If the restriction of a sheaf $F \in \mathrm{coh}(\mathbb{P}^2_\hbar)$ to the projective line l is the zero object, then F is also the zero object.

Proof. Let M be a finitely generated graded $PP_\hbar$-module such that $F \cong \widetilde{M}$. Consider an exact sequence:
\[ M \xrightarrow{\ \cdot w_3\ } M(1) \to N \to 0. \]
Since $\widetilde{N} = F(1)|_l = 0$, the module N is finite dimensional. This implies that for $i \gg 0$ the map $\cdot w_3 : M_i \to M_{i+1}$ is surjective. Moreover, these maps are isomorphisms for $i \gg 0$, because all $M_i$ are finite dimensional vector spaces. Let us identify all $M_i$ for $i \gg 0$ with respect to these isomorphisms. Using the A-module structure on M, we obtain a representation of the Weyl algebra $T(X, Y)/\langle [X, Y] = 2\hbar \rangle$ on the vector space $M_i$. But it is well known that the Weyl algebra does not have finite dimensional representations. Thus $M_i = 0$ for all $i \gg 0$, and M is finite dimensional. Therefore $F = 0$. □

The following corollary is an immediate consequence of the lemma.

Corollary 6.5. Let $f : F \to G$ be a morphism in $\mathrm{coh}(\mathbb{P}^2_\hbar)$. Suppose its restriction $\bar f : F|_l \to G|_l$ is an epimorphism. Then f is an epimorphism too.

6.4. From the resolution of the diagonal to a monad. Let M be an A-bimodule. Regarding it as a left module, we see that for any $F \in \mathrm{QGr}(A)_L$ the groups $\mathrm{Ext}^i(F, \widetilde{M})$ have the structure of right A-modules. We can project them to $\mathrm{QGr}(A)_R$. Thus each bimodule M defines functors $\pi\mathrm{Ext}^i(-, \widetilde{M})$ from $\mathrm{QGr}(A)_L$ to $\mathrm{QGr}(A)_R$.

Let E be a bundle on $\mathbb{P}^2_\hbar$ such that its restriction to the line l is a trivial bundle. Let us consider the bundle $E^\vee(1) \in \mathrm{qgr}(PP_\hbar)_L$ and the resolution of the diagonal $K^{\cdot}(PP_\hbar)$, which has only three terms:
\[ \{0 \to PP_\hbar(-1) \otimes PP_\hbar(-2) \to \Omega^1(1) \otimes PP_\hbar(-1) \to PP_\hbar \otimes PP_\hbar\} \to \Delta. \]
The resolution of the diagonal is a complex of bimodules. It induces a complex $\widetilde{K}^{\cdot}$ over $\mathrm{QGr}(PP_\hbar)_L$:
\[ \{0 \to \mathcal{O}(-1) \otimes PP_\hbar(-2) \to \widetilde{\Omega}^1(1) \otimes PP_\hbar(-1) \to \mathcal{O} \otimes PP_\hbar\} \to \widetilde{\Delta}, \tag{18} \]


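where $\widetilde{\Omega}^1$ is a sheaf on $\mathbb{P}^2_\hbar$ corresponding to the $PP_\hbar$-module $\Omega^1$.

As described above, each A-bimodule M gives the functors $\pi\mathrm{Ext}^i(-, \widetilde{M})$ from $\mathrm{QGr}(A)_L$ to $\mathrm{QGr}(A)_R$. In particular, each object of the resolution of the diagonal induces such functors.

First we calculate these functors for the object $\Delta$. Note that the object $\widetilde{\Delta}$ coincides with $\bigoplus_{i \ge 0} \mathcal{O}(i)$. Hence by (14) we have
\[ \pi\mathrm{Ext}^j(E^\vee(1), \widetilde{\Delta}) = 0 \]
if $j > 0$, while $\pi\mathrm{Ext}^0(E^\vee(1), \widetilde{\Delta}) \cong E(-1)$. The resolution of the diagonal (18) gives us a spectral sequence with the $E_1$ term
\[ E_1^{pq} = \pi\mathrm{Ext}^q(E^\vee(1), \widetilde{K}^{-p}) \Longrightarrow \pi\mathrm{Ext}^{p+q}(E^\vee(1), \widetilde{\Delta}), \]
which converges to
\[ E_\infty^i = \begin{cases} E(-1) & \text{if } i = 0, \\ 0 & \text{otherwise.} \end{cases} \]

Now we describe all terms $E_1^{pq}$ of this spectral sequence. First we have
\[ \pi\mathrm{Ext}^j(E^\vee(1), \mathcal{O} \otimes PP_\hbar) \cong \mathrm{Ext}^j(E^\vee(1), \mathcal{O}) \otimes \widetilde{PP}_\hbar \cong \mathrm{Ext}^j(E^\vee(1), \mathcal{O}) \otimes \mathcal{O} \cong H^j(\mathbb{P}^2_\hbar, E(-1)) \otimes \mathcal{O}. \]
By Lemma 6.2, these groups are trivial for $j \neq 1$. For the same reason we have
\[ \pi\mathrm{Ext}^j(E^\vee(1), \mathcal{O}(-1) \otimes PP_\hbar(-2)) = H^j(\mathbb{P}^2_\hbar, E(-2)) \otimes \mathcal{O}(-2) = 0 \]
for $j \neq 1$ and
\[ \pi\mathrm{Ext}^1(E^\vee(1), \mathcal{O}(-1) \otimes PP_\hbar(-2)) \cong H^1(\mathbb{P}^2_\hbar, E(-2)) \otimes \mathcal{O}(-2). \]
Now let us consider the functors which are associated with the object $\widetilde{\Omega}^1(1) \otimes PP_\hbar(-1)$. We have
\[ \pi\mathrm{Ext}^j(E^\vee(1), \widetilde{\Omega}^1(1) \otimes PP_\hbar(-1)) \cong \mathrm{Ext}^j(E^\vee, \widetilde{\Omega}^1) \otimes \mathcal{O}(-1). \]
It follows from the Koszul complex that the sheaf $\widetilde{\Omega}^1$ can be included in two exact sequences:
\[ 0 \to \widetilde{\Omega}^1 \to \mathcal{O}(-1) \otimes (PP_\hbar)_1 \to \mathcal{O} \to 0, \]
\[ 0 \to \mathcal{O}(-3) \to \mathcal{O}(-2) \otimes ((PP_\hbar)_1)^* \to \widetilde{\Omega}^1 \to 0. \]
Applying the functor $\mathrm{Hom}(E^\vee, -)$ to the first sequence and taking into account that $\mathrm{Hom}(E^\vee, \mathcal{O}(-1)) = 0$, we obtain $\mathrm{Hom}(E^\vee, \widetilde{\Omega}^1) = 0$. Similarly, we deduce from the second sequence that $\mathrm{Ext}^2(E^\vee, \widetilde{\Omega}^1) = 0$, because $\mathrm{Ext}^2(E^\vee, \mathcal{O}(-2)) = 0$. This implies that the object $\pi\mathrm{Ext}^j(E^\vee(1), \widetilde{\Omega}^1(1) \otimes PP_\hbar(-1))$ is trivial for all $j \neq 1$. Thus our spectral sequence is nothing more than the complex
\[ \pi\mathrm{Ext}^1(E^\vee(1), \widetilde{K}^{2}) \to \pi\mathrm{Ext}^1(E^\vee(1), \widetilde{K}^{1}) \to \pi\mathrm{Ext}^1(E^\vee(1), \widetilde{K}^{0}), \]
which is isomorphic to the complex
\[ H^1(\mathbb{P}^2_\hbar, E(-2)) \otimes \mathcal{O}(-2) \to \mathrm{Ext}^1(E^\vee, \widetilde{\Omega}^1) \otimes \mathcal{O}(-1) \to H^1(\mathbb{P}^2_\hbar, E(-1)) \otimes \mathcal{O}. \]
It has only one cohomology which coincides with $E(-1)$.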


Theorem 6.6. Let E be a bundle on $\mathbb{P}^2_\hbar$ such that its restriction to the commutative line l is isomorphic to the trivial bundle $\mathcal{O}_l^{\oplus r}$. Then E is the cohomology of a monad
\[ 0 \to H \otimes \mathcal{O}(-1) \xrightarrow{\ m\ } K \otimes \mathcal{O} \xrightarrow{\ n\ } L \otimes \mathcal{O}(1) \to 0 \]
with $H = H^1(\mathbb{P}^2_\hbar, E(-2))$, $L = H^1(\mathbb{P}^2_\hbar, E(-1))$, and such a monad is unique up to an isomorphism. Moreover, in this case the vector spaces H and L have the same dimension.

Proof. The existence of such a monad was proved above. The uniqueness follows from Lemma 6.3. The equality of dimensions of H and L follows immediately from the exact sequence (17). □

6.5. Barth description of monads. Now, following Barth [8], we give a description of the moduli space of vector bundles on $\mathbb{P}^2_\hbar$ trivial on the line l in terms of linear algebra (see also [15]). Denote by $M_\hbar(r, 0, k)$ the moduli space of bundles on the noncommutative $\mathbb{P}^2_\hbar$ trivial on the line l and with a fixed trivialization there (i.e. with a fixed isomorphism $E|_l \cong \mathcal{O}_l^{\oplus r}$). Let $\dim H^1(\mathbb{P}^2_\hbar, E(-1)) = k$. As in the commutative case, the numbers r, 0, k can be regarded as the rank, first Chern class, and second Chern class of E, respectively. The following theorem gives a description of this moduli space which is similar to the description given by Barth in the commutative case.

Theorem 6.7. Let $\{(b_1, b_2; j, i)\}$ be the set of quadruples of matrices $b_1, b_2 \in M_{k \times k}(\mathbb{C})$, $j \in M_{r \times k}(\mathbb{C})$, $i \in M_{k \times r}(\mathbb{C})$, which satisfy the condition
\[ [b_1, b_2] + ij + 2\hbar \cdot 1_{k \times k} = 0. \]
Then the space $M_\hbar(r, 0, k)$ is the quotient of this set with respect to the following free action of $GL(k, \mathbb{C})$:
\[ b_i \mapsto g b_i g^{-1}, \qquad j \mapsto j g^{-1}, \qquad i \mapsto g i, \qquad \text{where } g \in GL(k, \mathbb{C}). \]
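As a quick consistency check on the equation of Theorem 6.7 (a worked remark added for illustration, not part of the original argument), one can take the trace of both sides. The trace of a commutator vanishes, so the framing matrices i and j must carry the whole contribution of the $2\hbar$ term:

```latex
\[
0 = \operatorname{tr}\bigl([b_1,b_2] + ij + 2\hbar\cdot 1_{k\times k}\bigr)
  = \operatorname{tr}(ij) + 2\hbar k
\quad\Longrightarrow\quad
\operatorname{tr}(ji) = \operatorname{tr}(ij) = -2\hbar k .
\]
```

In particular, for $\hbar \neq 0$ and $k > 0$ there are no solutions with $r = 0$. This mirrors the fact used in the proof of Lemma 6.4 that the Weyl algebra has no finite dimensional representations.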

Proof. Let E be a bundle on $\mathbb{P}^2_\hbar$ trivial on the line l. We showed above that any such bundle comes from a monad unique up to an isomorphism. Conversely, suppose we have a monad
\[ 0 \to H \otimes \mathcal{O}(-1) \xrightarrow{\ m\ } K \otimes \mathcal{O} \xrightarrow{\ n\ } L \otimes \mathcal{O}(1) \to 0 \tag{19} \]
with $\dim H = \dim L = k$ such that its restriction to the line l is a monad with the cohomology $\mathcal{O}_l^{\oplus r}$. Then the cohomology of this monad is a bundle on $\mathbb{P}^2_\hbar$ which belongs to $M_\hbar(r, 0, k)$. Indeed, the cohomologies of the dual complex
\[ 0 \to \mathcal{O}(-1) \otimes L^* \xrightarrow{\ n^*\ } \mathcal{O} \otimes K^* \xrightarrow{\ m^*\ } \mathcal{O}(1) \otimes H^* \to 0 \]
coincide with $\mathrm{Hom}(E, \mathcal{O})$ and $\mathrm{Ext}^1(E, \mathcal{O})$. Hence, to prove that E is a bundle, it is sufficient to show that the dual complex is a monad too, i.e. that the map $m^*$ is an epimorphism. The restriction of the dual complex to l is a monad which is dual to the restriction of the monad (19) to l. Hence the restriction of $m^*$ to l is an epimorphism. Then, by Corollary 6.5, $m^*$ is an epimorphism as well. Thus to describe the moduli space


$M_\hbar(r, 0, k)$ we have to describe the space of all monads (19) modulo isomorphisms preserving the trivialization on l. Consider a monad
\[ 0 \to H \otimes \mathcal{O}(-1) \xrightarrow{\ m\ } K \otimes \mathcal{O} \xrightarrow{\ n\ } L \otimes \mathcal{O}(1) \to 0 \]
with $\dim H = \dim L = k$ and $\dim K = 2k + r$. Denote by E its cohomology bundle. The maps m and n can be regarded as elements of $H^* \otimes K \otimes W$ and $K^* \otimes L \otimes W$, respectively, where $W = H^0(\mathbb{P}^2_\hbar, \mathcal{O}(1))$ is the vector space spanned by $w_1, w_2, w_3$. The maps m and n can be written as
\[ m = m_1 w_1 + m_2 w_2 + m_3 w_3, \qquad n = n_1 w_1 + n_2 w_2 + n_3 w_3, \]
where $m_i : H \to K$ and $n_i : K \to L$ are constant linear maps. Let us restrict the monad to the line l. The monadic condition $nm = 0$ implies now:
\[ n_1 m_2 + n_2 m_1 = 0, \qquad n_1 m_1 = 0, \qquad n_2 m_2 = 0. \]
Moreover, since the restriction of E to l is trivial, the composition $n_1 m_2$ is an isomorphism (see [30], Lemma 4.2.3). We can choose bases for H, K, L so that $n_1 m_2 = 1_{k \times k}$ (the identity matrix) and
\[
m_1 = \begin{pmatrix} 1_{k \times k} \\ 0_{k \times k} \\ 0_{r \times k} \end{pmatrix}, \quad
m_2 = \begin{pmatrix} 0_{k \times k} \\ 1_{k \times k} \\ 0_{r \times k} \end{pmatrix}, \quad
n_1 = \begin{pmatrix} 0_{k \times k} & 1_{k \times k} & 0_{k \times r} \end{pmatrix}, \quad
n_2 = \begin{pmatrix} -1_{k \times k} & 0_{k \times k} & 0_{k \times r} \end{pmatrix}.
\]
Using the equations $n_3 m_1 + n_1 m_3 = 0$ and $n_3 m_2 + n_2 m_3 = 0$ we can write:
\[
m_3 = \begin{pmatrix} b_1 \\ b_2 \\ j \end{pmatrix}, \qquad
n_3 = \begin{pmatrix} -b_2 & b_1 & i \end{pmatrix}.
\]
Now the monadic condition $nm = 0$ can be written as:
\[ (n_3 m_3) \cdot w_3^2 + 1_{k \times k} \cdot [w_1, w_2] = 0. \]
Since $n_3 m_3 = [b_1, b_2] + ij$ and $[w_1, w_2] = 2\hbar w_3^2$, we obtain the following matrix equation:
\[ [b_1, b_2] + ij + 2\hbar \cdot 1_{k \times k} = 0. \]
Note that the last r basis vectors of K give us a trivialization of the restriction of E to the line l. It is easy to check that any isomorphism of a monad which preserves the trivialization on l and the choice of the bases of H, K, L made above has the form
\[ b_i \mapsto g b_i g^{-1}, \qquad j \mapsto j g^{-1}, \qquad i \mapsto g i, \qquad \text{where } g \in GL(k, \mathbb{C}). \]
This proves the theorem. □
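The block-matrix bookkeeping in the proof above can be checked mechanically. The following is a minimal pure-Python sketch, assuming the normal forms of $m_1, m_2, m_3, n_1, n_2, n_3$ as reconstructed in the proof; the helper names (`matmul`, `mu`, `check_normal_form`, `act`) and the sample matrices are illustrative and do not come from the paper.

```python
# Exact integer matrices stored as lists of rows; all names here are illustrative.
def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def eye(k):
    return [[int(p == q) for q in range(k)] for p in range(k)]

def zeros(p, q):
    return [[0] * q for _ in range(p)]

def vstack(*blocks):
    return [list(row) for blk in blocks for row in blk]

def hstack(*blocks):
    return [sum((list(blk[p]) for blk in blocks), []) for p in range(len(blocks[0]))]

def mu(b1, b2, i, j, hbar):
    """Left-hand side [b1, b2] + i j + 2*hbar*1 of the equation in Theorem 6.7."""
    comm = add(matmul(b1, b2), scale(-1, matmul(b2, b1)))
    return add(add(comm, matmul(i, j)), scale(2 * hbar, eye(len(b1))))

def check_normal_form(b1, b2, i, j):
    """Check the identities satisfied by the chosen bases (k = len(b1), r = len(j))."""
    k, r = len(b1), len(j)
    m1 = vstack(eye(k), zeros(k, k), zeros(r, k))
    m2 = vstack(zeros(k, k), eye(k), zeros(r, k))
    m3 = vstack(b1, b2, j)
    n1 = hstack(zeros(k, k), eye(k), zeros(k, r))
    n2 = hstack(scale(-1, eye(k)), zeros(k, k), zeros(k, r))
    n3 = hstack(scale(-1, b2), b1, i)
    zero = zeros(k, k)
    return (
        matmul(n1, m1) == zero and matmul(n2, m2) == zero
        and matmul(n1, m2) == eye(k)
        and add(matmul(n1, m2), matmul(n2, m1)) == zero
        and add(matmul(n3, m1), matmul(n1, m3)) == zero
        and add(matmul(n3, m2), matmul(n2, m3)) == zero
        # n3 m3 equals [b1, b2] + i j, i.e. mu evaluated at hbar = 0
        and matmul(n3, m3) == mu(b1, b2, i, j, 0)
    )

def act(g, g_inv, b1, b2, i, j):
    """GL(k) action: b -> g b g^-1, i -> g i, j -> j g^-1."""
    conj = lambda b: matmul(matmul(g, b), g_inv)
    return conj(b1), conj(b2), matmul(g, i), matmul(j, g_inv)

# Arbitrary sample data with k = 2, r = 1; it need not solve the equation.
b1, b2 = [[1, 2], [0, 3]], [[0, 1], [4, -1]]
i, j = [[1], [2]], [[3, -1]]
g, g_inv = [[1, 2], [3, 5]], [[-5, 2], [3, -1]]
```

The sketch also makes the invariance statement concrete: under the action, the left-hand side of the equation is conjugated by g, so its zero set is preserved.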


7. The Noncommutative Variety $\mathbb{P}^3_\hbar$ as a Twistor Space

7.1. Real structures. A ∗-algebra is, by definition, an algebra over $\mathbb{C}$ with an anti-linear anti-homomorphism ∗ satisfying $*^2 = \mathrm{id}$. A ∗-structure on a (graded) algebra is regarded as a real structure on the corresponding (projective) noncommutative variety. Let us introduce real structures on the complex varieties $\mathbb{C}^4_\hbar$ and $Q^4_\hbar$ defined in Sect. 3. Assume that in (6), (7) the skew-symmetric matrix $\theta$ is purely imaginary and $\hbar$ is real. Then there is a unique ∗-structure on the algebra $A(\mathbb{C}^4_\hbar)$ such that $x_i^* = x_i$. We denote the corresponding noncommutative variety by $\mathbb{R}^4_\hbar$. Assume in addition that the symmetric matrix G in (7) is real and positive definite. There is a unique ∗-structure on the algebra $Q_\hbar$ such that $X_i^* = X_i$, $D^* = D$, and $T^* = T$. The corresponding noncommutative real variety will be called the noncommutative sphere and denoted by $S^4_\hbar$. The embedding of $\mathbb{C}^4_\hbar$ into $Q^4_\hbar$ induces an embedding $\mathbb{R}^4_\hbar \to S^4_\hbar$. Recall that the complement of $\mathbb{C}^4_\hbar$ in $Q^4_\hbar$ is a commutative quadratic cone $\sum_{k,l} G_{kl} X_k X_l = 0$ which has only one real point. Thus $S^4_\hbar$ can be regarded as a one-point compactification of $\mathbb{R}^4_\hbar$. By a linear change of basis one can bring the pair $(G, \theta)$ to the standard form
\[
G = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad
\theta = \sqrt{-1} \begin{pmatrix} 0 & a & 0 & 0 \\ -a & 0 & 0 & 0 \\ 0 & 0 & 0 & b \\ 0 & 0 & -b & 0 \end{pmatrix}.
\tag{20}
\]

Furthermore, since $\hbar$ and $\theta$ enter only in the combination $\hbar \cdot \theta$, and we assume that $a + b \neq 0$, we can set $a + b = 1$ without loss of generality.

7.2. Realification of $\mathbb{P}^3_\hbar$. Recall that the noncommutative projective space $\mathbb{P}^3_\hbar$ corresponds to the algebra $PS_\hbar$ with generators $z_i$, $i = 1, 2, 3, 4$, and relations (9). Consider an algebra $\widetilde{PS}_\hbar$ with generators $z_i, \bar z_i$, $i = 1, 2, 3, 4$, and relations
\[
\begin{aligned}
{}[z_1, z_2] &= 2\hbar(a + b) z_3 z_4, & [z_1, \bar z_1] &= 2\hbar b z_3 \bar z_3 - 2\hbar a z_4 \bar z_4, & [z_1, \bar z_2] &= 0, \\
[\bar z_1, \bar z_2] &= -2\hbar(a + b) \bar z_3 \bar z_4, & [z_2, \bar z_2] &= 2\hbar a z_3 \bar z_3 - 2\hbar b z_4 \bar z_4, & [z_2, \bar z_1] &= 0, \\
[z_i, z_j] &= [z_i, \bar z_j] = [\bar z_i, z_j] = [\bar z_i, \bar z_j] = 0 & & \text{for all } i = 3, 4;\ j = 1, 2, 3, 4.
\end{aligned}
\tag{21}
\]
There is a unique ∗-structure on this algebra such that $z_i^* = \bar z_i$, $\bar z_i^* = z_i$. We denote the corresponding real variety by $\mathbb{P}^3_\hbar(\mathbb{R})$. This variety can be considered a realification of $\mathbb{P}^3_\hbar$.

Remark. In contrast to the commutative situation, a noncommutative complex variety in general has many different realifications. We have an ambiguity in the choice of relations involving both $z_i$ and $\bar z_j$. The realification (21) is distinguished by the fact that it is the twistor space of the noncommutative sphere $S^4_\hbar$, as explained below.

In the commutative case there is a map from $\mathbb{P}^3(\mathbb{R})$ to the sphere $S^4$ which is a $\mathbb{P}^1$-fibration. The corresponding $\mathbb{P}^1$-bundle is the projectivization of a spinor bundle on $S^4$. This map is known as the Penrose map. In the noncommutative case we have a


similar picture. The analogue of the Penrose map is a map $N : \mathbb{P}^3_\hbar(\mathbb{R}) \to S^4_\hbar$ which is associated with the homomorphism of ∗-algebras $Q_\hbar \to \widetilde{PS}_\hbar$:
\[
\begin{aligned}
X_1 &\mapsto -\frac{\sqrt{-1}}{2}\,(z_1 \bar z_4 - \bar z_1 z_4 - \bar z_2 z_3 + z_2 \bar z_3), &
D &\mapsto -\frac{1}{2}\,(z_1 \bar z_1 + \bar z_1 z_1 + z_2 \bar z_2 + \bar z_2 z_2), \\
X_2 &\mapsto \frac{1}{2}\,(z_1 \bar z_4 + \bar z_1 z_4 - \bar z_2 z_3 - z_2 \bar z_3), &
T &\mapsto -(z_3 \bar z_3 + z_4 \bar z_4), \\
X_3 &\mapsto -\frac{\sqrt{-1}}{2}\,(\bar z_1 z_3 - z_1 \bar z_3 + z_2 \bar z_4 - \bar z_2 z_4), \\
X_4 &\mapsto \frac{1}{2}\,(z_1 \bar z_3 + \bar z_1 z_3 + \bar z_2 z_4 + z_2 \bar z_4).
\end{aligned}
\]
Note that for $\hbar = 0$ we obtain the homomorphism of commutative algebras which corresponds to the usual Penrose map. This means that $\mathbb{P}^3_\hbar(\mathbb{R})$ is the twistor space of $S^4_\hbar$.

The variety $\mathbb{P}^3_\hbar(\mathbb{R})$ is a twistor space in yet another sense. For the commutative $\mathbb{R}^4$ the complex structures compatible with the symmetric bilinear form G and orientation are parametrized by points of a $\mathbb{P}^1$. This remains true in the noncommutative case. A complex structure (resp. orientation) on $\mathbb{R}^4_\hbar$ is defined as a complex structure (resp. orientation) on the real vector space U spanned by $x_1, \dots, x_4$. We will choose an orientation on U and require that the complex structure be compatible with it. All such complex structures are parametrized by points of a $\mathbb{P}^1$. Recall now that $\mathbb{P}^3_\hbar$ is a pencil of noncommutative projective planes passing through the commutative line. Let us pick any one of them. The realification of $\mathbb{P}^3_\hbar$ defined above induces a realification of the noncommutative projective plane. It is easy to see that the complement of the commutative line $w_3 = \bar w_3 = 0$ in the realified projective plane is isomorphic to $\mathbb{R}^4_\hbar$. Furthermore, the complement carries a natural complex structure defined by
\[ w_3^{-1} w_i \mapsto \sqrt{-1}\, w_3^{-1} w_i, \qquad \bar w_3^{-1} \bar w_i \mapsto -\sqrt{-1}\, \bar w_3^{-1} \bar w_i, \qquad i = 1, 2. \]
The Penrose map induces an identification between the complement and $\mathbb{R}^4_\hbar \subset S^4_\hbar$, and therefore induces a complex structure on the latter. Varying the noncommutative projective plane, one obtains all possible complex structures on $\mathbb{R}^4_\hbar$ compatible with a particular orientation.
This is completely analogous to the commutative case.

7.3. Connection between sheaves on commutative and noncommutative planes. In this subsection we are going to connect the moduli space $M_\hbar(r, 0, k)$ of bundles on $\mathbb{P}^2_\hbar$ with a trivialization on the line l with the moduli space $M(r, 0, k)$ of torsion free sheaves on the commutative $\mathbb{P}^2$ with a trivialization on a fixed line. The bridge between bundles on $\mathbb{P}^2_\hbar$ and torsion free sheaves on $\mathbb{P}^2$ is provided by the twistor variety $\mathbb{P}^3_\hbar$. This gives a geometrical interpretation of Nakajima’s results (the description of the moduli space $M(r, 0, k)$ by the deformed ADHM data [28, 27]). We will construct a hyperkähler manifold M parametrizing certain complexes on $\mathbb{P}^3_\hbar$ which is isomorphic to $M(r, 0, k)$ (which is also a hyperkähler manifold [28]). The isomorphism is given by the restriction of complexes


to one of the commutative $\mathbb{P}^2$’s. On the other hand, the restriction of complexes to a noncommutative plane $\mathbb{P}^2_\hbar$ yields an isomorphism between M with a particular choice of complex structure and the moduli space $M_\hbar(r, 0, k)$. Thus $M_\hbar(r, 0, k)$ can be obtained from $M(r, 0, k)$ by a rotation of complex structure. Consider complexes $C^{\cdot}$ on $\mathbb{P}^3_\hbar$ of the form
\[ 0 \to H \otimes \mathcal{O}(-1) \xrightarrow{\ M\ } K \otimes \mathcal{O} \xrightarrow{\ N\ } L \otimes \mathcal{O}(1) \to 0 \tag{22} \]

with $\dim H = \dim L = k$, $\dim K = 2k + r$, which satisfy the condition that the restriction to the line l has only one cohomology which is a trivial bundle (with a fixed trivialization). This condition implies that M is a monomorphism. Note that N is not an epimorphism in general, so (22) is not a monad. But the restriction of the complex (22) to any noncommutative plane is a monad by Corollary 6.5. Thus N can fail to be surjective only on the commutative planes $z_3 = 0$ and $z_4 = 0$.

Now we introduce a real structure on $\mathbb{P}^3_\hbar$ (this is different from the real structure on the realification of $\mathbb{P}^3_\hbar$ defined above). Assume that $\hbar$ is a real number. Consider an anti-linear anti-homomorphism $\bar J$ of $PS_\hbar$ defined by
\[ \bar J(z_1) = z_2, \qquad \bar J(z_2) = -z_1, \qquad \bar J(z_3) = z_4, \qquad \bar J(z_4) = -z_3, \qquad \bar J(\lambda) = \bar\lambda, \quad \lambda \in \mathbb{C}. \]
Thus $\bar J$ is a homomorphism of $\mathbb{R}$-algebras from $PS_\hbar$ to the opposite algebra $PS_\hbar^{op}$. (The notation $\bar J$ is used by analogy with the commutative case, where this anti-homomorphism is a composition of a complex structure J with complex conjugation [15].) The anti-homomorphism $\bar J$ induces a functor $\bar J^*$ from $\mathrm{qgr}(PS_\hbar)_R$ to $\mathrm{qgr}(PS_\hbar^{op})_R$. The latter category is naturally identified with the category $\mathrm{qgr}(PS_\hbar)_L$. Using this identification we can consider the composition of $\bar J^*$ with the dualization functor $\mathrm{Hom}(-, \mathcal{O})$ as a functor from $\mathrm{qgr}(PS_\hbar)_R$ to itself. For any bundle E we denote by $\bar J^*(E)^\vee$ its image under this functor. The functor can be extended to complexes of bundles. It takes the complex $C^{\cdot}$ (22) to the complex $\bar J^*(C^{\cdot})^\vee$:
\[ 0 \to \bar L^* \otimes \mathcal{O}(-1) \xrightarrow{\ \bar J^*(N)^\vee\ } \bar K^* \otimes \mathcal{O} \xrightarrow{\ \bar J^*(M)^\vee\ } \bar H^* \otimes \mathcal{O}(1) \to 0. \]

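Let us consider complexes $C^{\cdot}$ on $\mathbb{P}^3_\hbar$ with an isomorphism
\[ \bar J^*(C^{\cdot})^\vee \cong C^{\cdot} \tag{23} \]
and trivialization on the line l. Then the space K acquires a hermitian metric and L becomes isomorphic to $\bar H^*$. The reasoning of Sect. 6 shows that we can represent the maps M and N as
\[ M = M_1 z_1 + M_2 z_2 + M_3 z_3 + M_4 z_4, \qquad N = N_1 z_1 + N_2 z_2 + N_3 z_3 + N_4 z_4, \]
where $M_i$ and $N_i$ are constant maps. By a suitable choice of bases we can put these maps into the form
\[
M_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad
M_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad
M_3 = \begin{pmatrix} B_1 \\ B_2 \\ J \end{pmatrix}, \quad
M_4 = \begin{pmatrix} B_1' \\ B_2' \\ J' \end{pmatrix},
\tag{24}
\]
\[
N_1 = \begin{pmatrix} 0 & 1 & 0 \end{pmatrix}, \quad
N_2 = \begin{pmatrix} -1 & 0 & 0 \end{pmatrix}, \quad
N_3 = \begin{pmatrix} -B_2 & B_1 & I \end{pmatrix}, \quad
N_4 = \begin{pmatrix} -B_2' & B_1' & I' \end{pmatrix}.
\]
Using the reality conditions $\bar J^*(N)^\vee = M$ and $\bar J^*(M)^\vee = -N$ we find that
\[ B_1' = -B_2^\dagger, \qquad B_2' = B_1^\dagger, \qquad J' = I^\dagger, \qquad I' = -J^\dagger. \tag{25} \]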

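Finally the condition $NM = 0$ gives
\[
\begin{aligned}
&\text{a)} \quad \mu_c = [B_1, B_2] + IJ = 0, \\
&\text{b)} \quad \mu_r = [B_1, B_1^\dagger] + [B_2, B_2^\dagger] + II^\dagger - J^\dagger J = -2\hbar \cdot 1_{k \times k}.
\end{aligned}
\]
These matrix equations are invariant under the following action of $U(k)$:
\[ B_i \mapsto g B_i g^{-1}, \qquad I \mapsto g I, \qquad J \mapsto J g^{-1}, \qquad \text{where } g \in U(k). \tag{26} \]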
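The invariance under the action (26) can be illustrated numerically. The sketch below is an illustration only: it assumes the expressions for $\mu_c$ and $\mu_r$ given in a) and b), uses small Gaussian-integer matrices chosen arbitrarily, and all function names are hypothetical. It checks that both maps are conjugated by g, which is why the level sets $\mu_c = 0$ and $\mu_r = -2\hbar \cdot 1$ are preserved.

```python
# Complex matrices as lists of rows; helper names are illustrative, not from the paper.
def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def neg(a):
    return [[-x for x in row] for row in a]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(x).conjugate() for x in col] for col in zip(*a)]

def comm(a, b):
    return add(matmul(a, b), neg(matmul(b, a)))

def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def mu_c(B1, B2, I, J):
    """Complex moment map [B1, B2] + I J."""
    return add(comm(B1, B2), matmul(I, J))

def mu_r(B1, B2, I, J):
    """Real moment map [B1, B1^+] + [B2, B2^+] + I I^+ - J^+ J."""
    total = add(comm(B1, dagger(B1)), comm(B2, dagger(B2)))
    total = add(total, matmul(I, dagger(I)))
    return add(total, neg(matmul(dagger(J), J)))

def act_unitary(g, B1, B2, I, J):
    """Action (26); for unitary g the inverse is the conjugate transpose."""
    g_inv = dagger(g)
    conj = lambda B: matmul(matmul(g, B), g_inv)
    return conj(B1), conj(B2), matmul(g, I), matmul(J, g_inv)

# Arbitrary sample data with k = 2, r = 1, and a simple unitary matrix g.
B1 = [[1, 2j], [0, 1 + 1j]]
B2 = [[0, 1], [1j, -1]]
I = [[1], [1j]]
J = [[2, -1j]]
g = [[0, 1j], [1, 0]]
```

Because g is unitary, conjugation commutes with taking adjoints, which is the only property the check relies on; the same computation with a non-unitary g preserves $\mu_c$ up to conjugation but not $\mu_r$.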

Denote by $\mathbf{M}$ the vector space of complex matrices $(B_1, B_2, I, J)$. It has a structure of a quaternionic vector space defined by $(B_1, B_2, I, J) \mapsto (-B_2^\dagger, B_1^\dagger, -J^\dagger, I^\dagger)$, and, moreover, it is a flat hyperkähler manifold (see [28]). The map $\mu = (\mu_r, \mu_c)$ is a hyperkähler moment map for the action of $U(k)$ defined in (26) (see [19]). Since the action of $U(k)$ on $\mu_c^{-1}(0) \cap \mu_r^{-1}(-2\hbar \cdot 1)$ is free, the quotient $M = \mu_c^{-1}(0) \cap \mu_r^{-1}(-2\hbar \cdot 1)/U(k)$ is a smooth hyperkähler manifold. This manifold parametrizes complexes (22) with a real structure (23) and a trivialization on the line l.

On the other hand, it was proved in [28, 27] that the moduli space $M(r, 0, k)$ of torsion free sheaves on the commutative $\mathbb{P}^2$ with a trivialization on a fixed line can be identified with M. This identification can be described geometrically as follows. Let us assume that $\hbar$ is positive. It can be checked that in this case the map N can fail to be surjective only on the plane $z_4 = 0$. We can restrict the complex (22) to the commutative plane $z_3 = 0$. The restriction is a monad and its cohomology sheaf is a torsion free sheaf. It is easy to see that this yields a complex isomorphism from M to $M(r, 0, k)$.

The restriction of the complex (22) to a noncommutative plane is a monad as well. This yields a map from M to the moduli space $M_\hbar(r, 0, k)$ of bundles on the noncommutative plane. Let us show that this map is an isomorphism. To this end we note that on the level of the linear algebra data this map sends a quadruple $(B_1, B_2, I, J)$ to the quadruple $(b_1, b_2, i, j)$ with
\[ b_1 = B_1 - B_2^\dagger, \qquad b_2 = B_2 + B_1^\dagger, \qquad i = I - J^\dagger, \qquad j = J + I^\dagger. \]
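The relation between the two sets of equations can be made explicit by expanding the substitution directly. The following expansion is a worked step added for illustration; it uses only the definitions of $b_1, b_2, i, j$ above and of $\mu_c, \mu_r$, together with the identities $[B_1, B_2]^\dagger = [B_2^\dagger, B_1^\dagger]$ and $(IJ)^\dagger = J^\dagger I^\dagger$.

```latex
\begin{align*}
[b_1, b_2] &= [B_1, B_2] - [B_1, B_2]^\dagger + [B_1, B_1^\dagger] + [B_2, B_2^\dagger], \\
ij &= IJ - (IJ)^\dagger + II^\dagger - J^\dagger J, \\
[b_1, b_2] + ij &= \mu_c - \mu_c^\dagger + \mu_r .
\end{align*}
```

Hence $\mu_c = 0$ and $\mu_r = -2\hbar \cdot 1$ imply $[b_1, b_2] + ij + 2\hbar \cdot 1 = 0$, which is the equation of Theorem 6.7.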

Further, note that the equations $\mu_c = 0$, $\mu_r = -2\hbar \cdot 1$ are equivalent to the equation $[b_1, b_2] + i \cdot j + 2\hbar \cdot 1 = 0$ and the vanishing of the moment map for the action of the group $U(k)$ on the space of quadruples $(b_1, b_2, i, j)$. Now it follows from the theorem of Kempf and Ness ([28, 20]) that the map $M \to M_\hbar(r, 0, k)$ is a diffeomorphism. It becomes a complex isomorphism if we replace the natural complex structure of the space M with another one within the $\mathbb{P}^1$ of complex structures on M. Thus we have


Theorem 7.1. The moduli space $M_\hbar(r, 0, k)$ is a smooth hyperkähler manifold of real dimension 4rk, and as a hyperkähler manifold it is isomorphic to the moduli space $M(r, 0, k)$ of torsion free sheaves on the commutative $\mathbb{P}^2$ with a trivialization on a fixed line. As a complex manifold $M_\hbar(r, 0, k)$ is obtained from $M(r, 0, k)$ by a rotation of the complex structure.

The above discussion shows that there are natural bijections between
A. Bundles on $\mathbb{P}^2_\hbar$ with a trivialization on the commutative line l and $c_2 = k$.
B. Solutions of the equations $\mu_c = 0$, $\mu_r = -2\hbar \cdot 1$ modulo the action of $U(k)$.
C. Complexes of sheaves on $\mathbb{P}^3_\hbar$ of the form (22) with a trivialization on the commutative line l satisfying the reality condition (23).

One can show that for $r > 1$ a generic complex (22) is a monad and its cohomology is a bundle E on $\mathbb{P}^3_\hbar$ such that
\[ H^1(\mathbb{P}^3_\hbar, E(-2)) = 0, \qquad \bar J^*(E)^\vee \cong E. \tag{27} \]
Moreover, it can be shown that any bundle E satisfying the conditions (27) can be represented as a cohomology of a monad of the form (22).
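The real dimension 4rk in Theorem 7.1 agrees with a naive parameter count for the quotient description B. The count below is a heuristic sketch added for illustration; it assumes the level set of the moment map is cut out transversally, which is what freeness of the $U(k)$ action provides.

```latex
\begin{align*}
\dim_{\mathbb{R}} \{(B_1, B_2, I, J)\} &= 2\,(2k^2 + 2kr) = 4k^2 + 4kr, \\
\#\{\text{real equations in } \mu_c = 0,\ \mu_r = -2\hbar \cdot 1\} &= 2k^2 + k^2 = 3k^2, \\
\dim_{\mathbb{R}} U(k) &= k^2, \\
\dim_{\mathbb{R}} M &= 4k^2 + 4kr - 3k^2 - k^2 = 4kr .
\end{align*}
```

Here $\mu_c$ takes values in all complex $k \times k$ matrices, while $\mu_r$ is hermitian and therefore contributes only $k^2$ real conditions.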

8. Noncommutative Twistor Transform

8.1. Review of the twistor transform. In the commutative case the ADHM construction of instantons has the following geometric interpretation. Consider the double fibration
\[ G(2; 4) \xleftarrow{\ p\ } Fl(1, 2; 4) \xrightarrow{\ q\ } \mathbb{P}^3, \tag{28} \]

where $G(2; 4)$ is the Grassmannian and $Fl(1, 2; 4)$ is the partial flag variety. The Grassmannian $G(2; 4)$ has a real structure with $S^4$ as the set of real points. For any bundle E on $\mathbb{P}^3$ its twistor transform is defined as a sheaf $p_* q^* E$ on $G(2; 4)$. Given ADHM data we have a monad on $\mathbb{P}^3$ whose cohomology is a bundle. It can be shown that the restriction of its twistor transform to the sphere $S^4$ coincides with the instanton bundle corresponding to these ADHM data. The instanton connection can also be reconstructed from the bundle on $\mathbb{P}^3$ (see [4, 24] for details).

In this section we show that one can consider the noncommutative quadric introduced in Sect. 3 as a noncommutative Grassmannian $G(2; 4)$. We also construct a noncommutative flag variety $Fl(1, 2; 4)$ and projections p, q giving a noncommutative analogue of the twistor diagram (28). The twistor transform can be defined in the same way as above. It produces a bundle on the noncommutative sphere from the deformed ADHM data. We show that this bundle is precisely the kernel of the map D defined in Sect. 2. It should also be possible to construct the instanton connection on the noncommutative $\mathbb{R}^4$ from the complex of sheaves on $\mathbb{P}^3_\hbar$. To do this, one needs to develop the differential geometry of noncommutative affine and projective varieties. We go some way in this direction by defining differential forms and spinors. Since the goal of this section is mainly illustrative, we limit ourselves to stating the results. An interested reader should be able to fill in the proofs.


8.2. Tensor categories. A good way to construct noncommutative varieties with properties similar to those of commutative varieties is to start with a tensor category (see [25, 23]). Let T be an abelian tensor category. Consider a tensor functor O : T → Vect to the abelian tensor category of vector spaces compatible with the associativity constraint but not compatible with the commutativity constraint. If A is a commutative algebra in the tensor category T, then O(A) is a noncommutative algebra in the tensor category Vect. If M ∈ T is a right A-module, then O(M) is a right O(A)-module. Any right A-module (in the category T) has a natural structure of a left A-module (and hence an A-bimodule). Thus any right O(A)-module of the form O(M) has a natural structure of an O(A)-bimodule. Consider the category Comm_T of all finitely generated (graded) commutative algebras in the tensor category T. Then under O the category Comm_T is mapped to a subcategory of the category of finitely generated (graded) algebras. This subcategory enjoys many properties of the category of commutative (graded) algebras. For example, for all A, B ∈ Comm_T there is a natural algebra structure on O(A) ⊗ O(B) coming from the algebra structure on A ⊗ B. The corresponding subcategory in the category of noncommutative affine (resp. projective) varieties shares a lot of properties with the category of commutative varieties. For example, if X and Y are varieties in this category, then using the tensor product of the corresponding algebras one can define the “Cartesian” product X × Y. More generally, given a pair of morphisms X → Z and Y → Z one can define the fiber product $X \times_Z Y$. Further, starting from the module of differential forms of A one can construct the sheaf of differential forms on the corresponding noncommutative variety. The category qgr(O(A)) has a nice subcategory which consists of modules of the form O(M), where M ∈ T is an A-module.
To any object O(M) of this subcategory one can associate its symmetric and exterior powers. The symmetric powers of O(M) form a noncommutative graded algebra. This enables one to define the projectivization of the sheaf corresponding to the module O(M).

8.3. Yang–Baxter operators. One way to construct an abelian tensor category T with a functor O : T → Vect is to consider a Yang–Baxter operator (see [25, 23]). A Yang–Baxter operator on a vector space V is an operator $R : V \otimes V \to V \otimes V$ such that
\[
R^2 = \mathrm{id}_{V \otimes V}, \qquad
(R \otimes \mathrm{id}_V)(\mathrm{id}_V \otimes R)(R \otimes \mathrm{id}_V) = (\mathrm{id}_V \otimes R)(R \otimes \mathrm{id}_V)(\mathrm{id}_V \otimes R).
\tag{29}
\]
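The two conditions (29) are easy to test on explicit matrices. The sketch below is an illustration under stated assumptions: it builds the flip on a 4-dimensional space with basis $z_1, \dots, z_4$ and deforms it by terms of the shape that appears in the operator (30) of Example 8.1, with sample integer values for the parameters; the function names and the index convention (basis vector $z_p \otimes z_q$ stored at position $4p + q$, zero-based) are choices made here, not notation from the paper.

```python
# Integer matrices as lists of rows; all names and conventions here are illustrative.
def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def kron(a, b):
    """Kronecker product: entry ((p, q), (r, s)) equals a[p][r] * b[q][s]."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def eye(n):
    return [[int(p == q) for q in range(n)] for p in range(n)]

def deformed_flip(hbar, a, b):
    """Flip on C^4 (x) C^4 plus a deformation supported on z1(x)z2 and z2(x)z1.

    Columns index inputs, rows index outputs; z1..z4 correspond to 0..3.
    """
    n = 4
    idx = lambda p, q: n * p + q
    R = [[0] * (n * n) for _ in range(n * n)]
    for p in range(n):
        for q in range(n):
            R[idx(q, p)][idx(p, q)] = 1          # z_p (x) z_q -> z_q (x) z_p
    # R(z1 (x) z2) gains 2*hbar*(a z3 (x) z4 + b z4 (x) z3)
    R[idx(2, 3)][idx(0, 1)] += 2 * hbar * a
    R[idx(3, 2)][idx(0, 1)] += 2 * hbar * b
    # R(z2 (x) z1) gains -2*hbar*(b z3 (x) z4 + a z4 (x) z3)
    R[idx(2, 3)][idx(1, 0)] -= 2 * hbar * b
    R[idx(3, 2)][idx(1, 0)] -= 2 * hbar * a
    return R

def is_yang_baxter(R, n):
    """Check both conditions in (29) for an operator on an n-dimensional V."""
    if matmul(R, R) != eye(n * n):
        return False
    one = eye(n)
    R12, R23 = kron(R, one), kron(one, R)        # R acting on factors (1,2) and (2,3)
    return matmul(matmul(R12, R23), R12) == matmul(matmul(R23, R12), R23)
```

For example, `is_yang_baxter(deformed_flip(1, 2, 3), 4)` checks the deformed operator with $\hbar = 1$, $a = 2$, $b = 3$, and `deformed_flip(0, 0, 0)` is the ordinary transposition. The deformation works because the extra terms land in the span of $z_3, z_4$, on which they vanish, so any product of two of them is zero.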

A Yang–Baxter operator induces an action of the permutation group $S_n$ on the tensor power $V^{\otimes n}$, where the transposition $(i, i + 1) \in S_n$ acts as the operator
\[ R_{i,i+1} = \mathrm{id}_V^{\otimes(i-1)} \otimes R \otimes \mathrm{id}_V^{\otimes(n-i-1)} : V^{\otimes n} \to V^{\otimes n}. \]
Equations (29) ensure that the operators $R_{i,i+1}$ satisfy the relations between the transpositions $(i, i + 1)$ in the group $S_n$. If R is a Yang–Baxter operator on a vector space V, then the dual operator $R^\vee : V^* \otimes V^* \to V^* \otimes V^*$ is also a Yang–Baxter operator. Given a Yang–Baxter operator $R : V \otimes V \to V \otimes V$, one can construct an abelian tensor category $T_R$ and a functor $O_R : T_R \to \mathrm{Vect}$ such that V is the $O_R$-image of some object of $T_R$, and the commutativity morphism in the category $T_R$ is mapped by $O_R$ to R [23]. As mentioned above, given any two objects A, B of the category $\mathrm{Comm}_{T_R}$, one


has a natural algebra structure on the vector space O(A) ⊗ O(B). This algebra will be denoted $O(A) \otimes_R O(B)$ and called the R-tensor product of O(A) and O(B).

It is well known that there is a one-to-one correspondence between irreducible representations of the group $S_n$ and partitions of n (Young diagrams). Under this correspondence the trivial partition (n) corresponds to the sign representation, while the maximal partition $(\underbrace{1, 1, \dots, 1}_{n \text{ times}})$ corresponds to the identity representation. Given a partition $(k_1, \dots, k_r)$ of n ($k_1 \ge k_2 \ge \dots \ge k_r$) we denote by $(k_1, \dots, k_r)$ the corresponding irreducible representation and by $\Sigma_R^{(k_1, \dots, k_r)} V$ (resp. $\Sigma_R^{(k_1, \dots, k_r)} V^*$) the $(k_1, \dots, k_r)$-isotypical component of $V^{\otimes n}$ (resp. $(V^*)^{\otimes n}$), i.e. the sum of all subrepresentations of $V^{\otimes n}$ (resp. $(V^*)^{\otimes n}$) isomorphic to $(k_1, \dots, k_r)$. We also put $\Lambda^n_R V = \Sigma_R^{(n)} V$, $\Lambda^n_R V^* = \Sigma_R^{(n)} V^*$ for brevity.

Remark. The subspaces $\Sigma_R^{\lambda} V \subset V^{\otimes n}$ are the $O_R$-images of some objects of the category $T_R$.

Let $\lambda$, $\mu$ be partitions of n and m respectively. It is clear that the action of the permutation $\sigma_{n,m} \in S_{n+m}$,
\[ \sigma_{n,m}(i) = \begin{cases} i + m, & \text{if } 1 \le i \le n, \\ i - n, & \text{if } n + 1 \le i \le n + m, \end{cases} \]
gives an isomorphism
\[ R_{n,m} : \Sigma_R^{\lambda} V \otimes \Sigma_R^{\mu} V \to \Sigma_R^{\mu} V \otimes \Sigma_R^{\lambda} V. \]

Remark. This isomorphism is the image of an isomorphism in the category $T_R$.

The trivial example of a Yang–Baxter operator is the usual transposition $R_0(v_1 \otimes v_2) = v_2 \otimes v_1$. We will say that R is a deformation-trivial Yang–Baxter operator if R is an algebraic deformation of $R_0$ in the class of Yang–Baxter operators. For a deformation-trivial Yang–Baxter operator R we have $\dim \Sigma_R^{\lambda} V = \dim \Sigma_{R_0}^{\lambda} V$ for any partition $\lambda$.

8.4. The noncommutative projective space. Let R be a deformation-trivial Yang–Baxter operator on the vector space $V^*$. Then the graded algebra
\[ S_R^{\cdot} V^* = T(V^*) / \langle \Lambda^2_R V^* \rangle \]
is a noncommutative deformation of the coordinate algebra of the projective space $\mathbb{P}(V)$. We denote by $\mathbb{P}_R(V)$ the corresponding noncommutative variety. Thus $\mathbb{P}_R(V)$ is a noncommutative deformation of the projective space $\mathbb{P}(V)$.


Example 8.1. The operator

$$\begin{aligned} R(z_i \otimes z_j) &= z_j \otimes z_i, \qquad \text{if } (i, j) \ne (1, 2), (2, 1), \\ R(z_1 \otimes z_2) &= z_2 \otimes z_1 + 2\hbar (a z_3 \otimes z_4 + b z_4 \otimes z_3), \\ R(z_2 \otimes z_1) &= z_1 \otimes z_2 - 2\hbar (b z_3 \otimes z_4 + a z_4 \otimes z_3), \end{aligned} \tag{30}$$

is a deformation-trivial Yang–Baxter operator on the 4-dimensional vector space $Z^*$ with the basis $\{z_1, z_2, z_3, z_4\}$. By definition the homogeneous coordinate algebra of $P_R(Z)$ is generated by $z_1, z_2, z_3, z_4$ with relations (9) (we set $a + b = 1$ as before). Hence $P_R(Z)$ is isomorphic to the noncommutative projective space $P^3_\hbar$ defined in Sect. 3. The space $Z^*$ was denoted $U$ in that section.

The above example shows that part of the data encoded in the Yang–Baxter operator $R$ is lost in the structure of the corresponding noncommutative projective space. We will see below that this data appears in the structure of other noncommutative varieties associated with $R$.

8.5. Noncommutative Grassmannians. It is well known that the homogeneous coordinate algebra of the Grassmann variety $G(k; V)$ is a graded quadratic algebra with $\Lambda^k V^*$ as the space of generators and

$$\mathrm{Ker}\left( \Lambda^k V^* \otimes \Lambda^k V^* \to (V^*)^{\otimes 2k} \to \Sigma^{(k,k)} V^* \right)$$

as the space of relations. This description justifies the following definition.

Definition 8.2. Let $R$ be a Yang–Baxter operator on the space $V^*$. The noncommutative Grassmann variety $G_R(k; V)$ is the noncommutative projective variety corresponding to the quadratic algebra

$$G_R(k; V) = T(\Lambda_R^k V^*) / \left\langle \mathrm{Ker}\left( \Lambda_R^k V^* \otimes \Lambda_R^k V^* \to \Sigma_R^{(k,k)} V^* \right) \right\rangle .$$

The algebra $G_R(k; V)$ is the $O_R$-image of a commutative algebra in the category $T_R$. If $R$ is deformation-trivial, then $G_R(k; V)$ is a noncommutative deformation of $G(k; V)$. Note that $G_R(1; V) = P_R(V)$ by definition.

Example 8.3. Consider the noncommutative Grassmannian $G_R(2; Z)$ corresponding to the Yang–Baxter operator (30). Let

$$z_{ij} = \tfrac{1}{2} \left( (z_i \otimes z_j - z_j \otimes z_i) - R(z_i \otimes z_j - z_j \otimes z_i) \right) \in \Lambda_R^2 Z^* .$$
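The membership $z_{ij} \in \Lambda_R^2 Z^*$ amounts to $R(z_{ij}) = -z_{ij}$, which follows once $R^2 = 1$. The following is a minimal sketch that checks this mechanically for the operator (30), together with the braid relation $R_{12}R_{23}R_{12} = R_{23}R_{12}R_{23}$. It assumes that the Yang–Baxter conditions (29) consist of the involution and braid relations; the sample values of $a$, $b$, $\hbar$ and all function names are illustrative and do not come from the paper.

```python
from fractions import Fraction
from itertools import product

# Sample exact parameters (hypothetical values chosen only for this check).
A, B, HBAR = Fraction(1, 3), Fraction(2, 3), Fraction(5, 7)

def r_on_pair(i, j):
    """R(z_i (x) z_j) from (30), returned as {(k, l): coefficient}."""
    if (i, j) == (1, 2):
        return {(2, 1): 1, (3, 4): 2 * HBAR * A, (4, 3): 2 * HBAR * B}
    if (i, j) == (2, 1):
        return {(1, 2): 1, (3, 4): -2 * HBAR * B, (4, 3): -2 * HBAR * A}
    return {(j, i): 1}  # plain transposition on every other basis pair

def apply_r(vec, pos):
    """Apply R to tensor slots (pos, pos + 1) of a vector {index tuple: coefficient}."""
    out = {}
    for word, c in vec.items():
        for pair, r in r_on_pair(word[pos], word[pos + 1]).items():
            new = word[:pos] + pair + word[pos + 2:]
            out[new] = out.get(new, 0) + c * r
    return {w: c for w, c in out.items() if c != 0}

def z(i, j):
    """z_ij = ((z_i(x)z_j - z_j(x)z_i) - R(z_i(x)z_j - z_j(x)z_i)) / 2, for i != j."""
    anti = {(i, j): Fraction(1), (j, i): Fraction(-1)}
    image = apply_r(anti, 0)
    diff = {k: (anti.get(k, 0) - image.get(k, 0)) / 2 for k in set(anti) | set(image)}
    return {k: c for k, c in diff.items() if c != 0}

def is_involution():
    """Check R^2 = 1 on all 16 basis vectors of Z* (x) Z*."""
    return all(apply_r(apply_r({w: 1}, 0), 0) == {w: 1}
               for w in product(range(1, 5), repeat=2))

def satisfies_braid_relation():
    """Check R12 R23 R12 = R23 R12 R23 on all 64 basis vectors."""
    def left(v):
        return apply_r(apply_r(apply_r(v, 0), 1), 0)
    def right(v):
        return apply_r(apply_r(apply_r(v, 1), 0), 1)
    return all(left({w: 1}) == right({w: 1})
               for w in product(range(1, 5), repeat=3))

def z_is_r_antisymmetric(i, j):
    """Check R(z_ij) = -z_ij."""
    zij = z(i, j)
    return apply_r(zij, 0) == {k: -c for k, c in zij.items()}

print(is_involution(), satisfies_braid_relation())
print(all(z_is_r_antisymmetric(i, j) for i in range(1, 5) for j in range(i + 1, 5)))
```

Exact rational arithmetic is used so that the comparisons are genuine equalities rather than floating-point approximations.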

Then it is easy to check that $G_R(2; Z)$ is generated by the elements

$$Y_1 = z_{13}, \quad Y_2 = -z_{24}, \quad Y_3 = z_{23}, \quad Y_4 = z_{14}, \quad D = -z_{12}, \quad T = z_{34},$$

with relations

$$\begin{gathered} [Y_1, Y_2] = 2\hbar a T^2, \qquad [Y_3, Y_4] = 2\hbar b T^2, \\ [D, Y_1] = -2\hbar a Y_1 T, \qquad [D, Y_2] = 2\hbar a Y_2 T, \\ [D, Y_3] = -2\hbar b Y_3 T, \qquad [D, Y_4] = 2\hbar b Y_4 T, \\ DT = \tfrac{1}{2} (Y_1 Y_2 + Y_2 Y_1 + Y_3 Y_4 + Y_4 Y_3), \\ [Y_i, Y_j] = 0 \ \text{for } i = 3, 4,\ j = 1, 2, \qquad [T, Y_j] = [T, D] = 0 \ \text{for } j = 1, 2, 3, 4. \end{gathered} \tag{31}$$

Comparing with (7) one can see that the algebra $G_R(2; Z)$ is isomorphic to $Q_\hbar$ with $G$ and $\theta$ given by

$$G = \frac{1}{2} \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \qquad \theta = 2\hbar \begin{pmatrix} 0 & a & 0 & 0 \\ -a & 0 & 0 & 0 \\ 0 & 0 & 0 & b \\ 0 & 0 & -b & 0 \end{pmatrix}.$$

Note that the variables $X_i$, $i = 1, 2, 3, 4$, used in Sect. 7 to describe the quadric are related to $Y_i$, $i = 1, 2, 3, 4$, by the following formulas:

$$\begin{aligned} Y_1 &= X_2 + \sqrt{-1}\, X_1, & Y_2 &= -X_2 + \sqrt{-1}\, X_1, \\ Y_3 &= X_4 + \sqrt{-1}\, X_3, & Y_4 &= -X_4 + \sqrt{-1}\, X_3. \end{aligned} \tag{32}$$

8.6. Products of Grassmannians and flag varieties. Let $R$ be a Yang–Baxter operator on the vector space $V^*$. Consider a sequence $k_1, \dots, k_r$ of integers. Let $\mathbb{Z}^r$ be the free abelian group with $r$ generators $e_1, \dots, e_r$. The R-tensor product $G_R(k_1; V) \otimes_R \dots \otimes_R G_R(k_r; V)$

is a $\mathbb{Z}^r$-graded algebra generated by the vector spaces $\Lambda_R^{k_i} V^*$ in degree $e_i$, with relations

$$\mathrm{Ker}\left( \Lambda_R^{k_i} V^* \otimes \Lambda_R^{k_i} V^* \to \Sigma_R^{(k_i, k_i)} V^* \right)$$

in degree $2e_i$ for all $i$ and

$$\mathrm{Ker}\left( (\Lambda_R^{k_i} V^* \otimes \Lambda_R^{k_j} V^*) \oplus (\Lambda_R^{k_j} V^* \otimes \Lambda_R^{k_i} V^*) \xrightarrow{\ (\mathrm{id},\, -R_{k_j, k_i})\ } \Lambda_R^{k_i} V^* \otimes \Lambda_R^{k_j} V^* \right)$$

in degree $e_i + e_j$ for all $i > j$.

For any increasing sequence $k_1, \dots, k_r$ we define also a $\mathbb{Z}^r$-graded algebra $FL_R(k_1, \dots, k_r; V)$. It has the same generators as the algebra $G_R(k_1; V) \otimes_R \dots \otimes_R G_R(k_r; V)$, subject to the same relations in degrees $2e_i$ and to relations

$$\mathrm{Ker}\left( (\Lambda_R^{k_i} V^* \otimes \Lambda_R^{k_j} V^*) \oplus (\Lambda_R^{k_j} V^* \otimes \Lambda_R^{k_i} V^*) \xrightarrow{\ (\mathrm{id},\, -R_{k_j, k_i})\ } \Lambda_R^{k_i} V^* \otimes \Lambda_R^{k_j} V^* \to \Sigma_R^{(k_i, k_j)} V^* \right)$$

in degree $e_i + e_j$ for all $i > j$. This definition is suggested by the Borel–Weil–Bott theorem (see [14]). In particular, for $R = R_0$ we get the algebra corresponding to the commutative flag variety.

We define the R-Cartesian product $G_R(k_1; V) \times_R \dots \times_R G_R(k_r; V)$ and the noncommutative flag variety $Fl_R(k_1, \dots, k_r; V)$ as the noncommutative varieties corresponding to the algebras $G_R(k_1; V) \otimes_R \dots \otimes_R G_R(k_r; V)$ and $FL_R(k_1, \dots, k_r; V)$ respectively.

To make this compatible with our definition of a noncommutative variety, we consider instead of a $\mathbb{Z}^r$-graded algebra its diagonal subalgebra. The diagonal subalgebra is a graded algebra whose $n$th graded component is the $n(e_1 + \dots + e_r)$-graded component of the $\mathbb{Z}^r$-graded algebra. Thus according to Sect. 3 the category of coherent sheaves on the R-Cartesian product of Grassmannians (or the flag variety) is the category qgr of the corresponding diagonal subalgebra.

The algebra $FL_R(k_1, \dots, k_r; V)$ is the $O_R$-image of a commutative algebra in the category $T_R$. Hence one can define the R-Cartesian product of several flag varieties.

If $R$ is deformation-trivial, then $G_R(k_1; V) \times_R \dots \times_R G_R(k_r; V)$ and $Fl_R(k_1, \dots, k_r; V)$ are noncommutative deformations of the corresponding commutative varieties.

Note that we have a canonical embedding of the graded algebra $G_R(k_i; V)$ into the graded algebra $FL_R(k_1, \dots, k_i, \dots, k_r; V)$ inducing the canonical projections

$$p_i : Fl_R(k_1, \dots, k_i, \dots, k_r; V) \to G_R(k_i; V).$$

On the other hand, by definition $FL_R(k_1, \dots, k_r; V)$ is a quotient algebra of the algebra $G_R(k_1; V) \otimes_R \dots \otimes_R G_R(k_r; V)$. Hence $Fl_R(k_1, \dots, k_r; V)$ can be regarded as a closed subvariety in $G_R(k_1; V) \times_R \dots \times_R G_R(k_r; V)$.

Example 8.4. The algebra $G_R(1; Z) \otimes_R G_R(2; Z)$ corresponding to the Yang–Baxter operator (30) is generated by the elements $z_1, z_2, z_3, z_4, Y_1, Y_2, Y_3, Y_4, D, T$ with relations (9), (31), and

$$\begin{gathered} [z_2, Y_1] = 2\hbar a z_3 T, \qquad [z_1, Y_2] = -2\hbar a z_4 T, \\ [z_1, Y_3] = -2\hbar b z_3 T, \qquad [z_2, Y_4] = -2\hbar b z_4 T, \\ [z_1, D] = -2\hbar b z_3 Y_4 - 2\hbar a z_4 Y_1, \qquad [z_2, D] = 2\hbar a z_3 Y_2 - 2\hbar b z_4 Y_3, \\ [z_1, Y_1] = [z_2, Y_2] = 0, \qquad [z_3, Y_i] = [z_3, D] = 0, \qquad [z_4, Y_i] = [z_4, D] = 0, \qquad [z_i, T] = 0 \end{gathered}$$

for all $i = 1, 2, 3, 4$. The algebra $FL_R(1, 2; Z)$ is given by the same generators subject to the same relations, as well as the additional relations

$$\begin{pmatrix} 0 & T & Y_2 & Y_3 \\ T & 0 & -Y_4 & Y_1 \\ Y_2 & Y_4 & 0 & D - \hbar(a + b)T \\ Y_3 & -Y_1 & -D - \hbar(a + b)T & 0 \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \end{pmatrix} = 0. \tag{33}$$

As explained above, we have projections

$$Q_\hbar = G_R(2; Z) \xleftarrow{\ p\ } Fl_R(1, 2; Z) \xrightarrow{\ q\ } P_R(Z) = P^3_\hbar$$

and a closed embedding $Fl_R(1, 2; Z) \subset G_R(2; Z) \times_R P_R(Z) = Q_\hbar \times_R P^3_\hbar$.


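8.7. Tautological bundles. Let $\mathcal{V}$ (resp. $\mathcal{V}^*$, $\Sigma_R^{\lambda} \mathcal{V}$, $\Sigma_R^{\lambda} \mathcal{V}^*$) denote the coherent sheaf on $G_R(k; V)$ corresponding to the free right $G_R(k; V)$-module $V \otimes G_R(k; V)$ (resp. $V^* \otimes G_R(k; V)$, $\Sigma_R^{\lambda} V \otimes G_R(k; V)$, $\Sigma_R^{\lambda} V^* \otimes G_R(k; V)$). Since the space of global sections of the sheaf $O(1)$ on the Grassmannian $G_R(k; V)$ is $\Lambda_R^k V^*$, the maps $\Lambda_R^{k-1} V^* \to V \otimes \Lambda_R^k V^*$ and $\Lambda_R^{k+1} V^* \to V^* \otimes \Lambda_R^k V^*$ induce morphisms of sheaves

$$\varphi : \Lambda_R^{k-1} \mathcal{V}^*(-1) \to \mathcal{V} \qquad \text{and} \qquad \psi : \Lambda_R^{k+1} \mathcal{V}^*(-1) \to \mathcal{V}^* .$$

We put $S = \mathrm{Im}\, \varphi$, $\mathcal{V}/S = \mathrm{Coker}\, \varphi$, $S^{\perp} = \mathrm{Im}\, \psi$, $\mathcal{V}^*/S^{\perp} = \mathrm{Coker}\, \psi$.

Remark. For $k = 1$ we have $S = O(-1)$, $\mathcal{V}^*/S^{\perp} = O(1)$.

One can show that these sheaves are locally free. We refer to them as tautological bundles. The free $G_R(k; V)$-modules corresponding to the sheaves $\Sigma_R^{\lambda} \mathcal{V}$, $\Sigma_R^{\lambda} \mathcal{V}^*$ are the $O_R$-images of free modules over the corresponding algebra in the category $T_R$. Furthermore, the morphisms $\varphi$ and $\psi$ are $O_R$-images. This implies that the $G_R(k; V)$-modules corresponding to the tautological bundles are $O_R$-images as well. Therefore they all have a natural structure of $G_R(k; V)$-bimodules. This allows one to define R-symmetric powers $S_R^k(-)$ (resp. R-exterior powers $\Lambda_R^k(-)$) of the tautological bundles as the corresponding $O_R$-images. One can check that we have canonical isomorphisms of bimodules

$$\mathcal{V}^*/S^{\perp} \cong S^{\vee}, \qquad S^{\perp} \cong (\mathcal{V}/S)^{\vee} .$$

Example 8.5. Let $R$ be the Yang–Baxter operator (30) and $k = 2$. Let $\check{z}_1, \check{z}_2, \check{z}_3, \check{z}_4$ be the dual basis of $Z$. Then the twisted maps $\varphi(1) : Z^* \otimes O_{G_R} \to Z \otimes O_{G_R}(1)$, $\psi(1) : Z \otimes O_{G_R} \cong \Lambda_R^3 Z^* \otimes O_{G_R} \to Z^* \otimes O_{G_R}(1)$ are given by

$$\varphi(1) : \begin{pmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \end{pmatrix} \mapsto \begin{pmatrix} 0 & D + \hbar(a - b)T & -Y_1 & -Y_4 \\ D - \hbar(a - b)T & 0 & -Y_3 & Y_2 \\ -Y_1 & Y_3 & 0 & -T \\ -Y_4 & -Y_2 & T & 0 \end{pmatrix} \begin{pmatrix} \check{z}_1 \\ \check{z}_2 \\ \check{z}_3 \\ \check{z}_4 \end{pmatrix},$$

$$\psi(1) : \begin{pmatrix} \check{z}_1 \\ \check{z}_2 \\ \check{z}_3 \\ \check{z}_4 \end{pmatrix} \mapsto \begin{pmatrix} 0 & T & Y_2 & Y_3 \\ T & 0 & -Y_4 & Y_1 \\ Y_2 & Y_4 & 0 & D - \hbar(a + b)T \\ Y_3 & -Y_1 & -D - \hbar(a + b)T & 0 \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \end{pmatrix}.$$

Note that $\psi(1)\varphi = 0$ and $\varphi(1)\psi = 0$. Hence we have isomorphisms

$$S^{\perp}(1) \cong \mathcal{V}/S, \qquad S(1) \cong S^{\vee} .$$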


Note also that on the open subset $T \ne 0$ the elements $(z_3, z_4)$ give a trivialization of the tautological bundle $S^{\vee}$. More precisely, the restriction of the sections $z_1, z_2$ of $S^{\vee}$ can be expressed as

$$z_1 = y_4 z_3 - y_1 z_4, \qquad z_2 = -y_2 z_3 - y_3 z_4, \tag{34}$$

where $y_i = T^{-1} Y_i$. Similarly, the elements $(\check{z}_1, \check{z}_2)$ give a trivialization of $\mathcal{V}/S$ on $T \ne 0$. Thus the restrictions of all tautological bundles to the open subset $T \ne 0$ correspond to the free rank two bimodule over the Weyl algebra $A(\mathbb{C}^4_\hbar)$.

8.8. Pull-back and push-forward. Recall that we have canonical projections

$$p_i : Fl_R(k_1, k_2; V) \to G_R(k_i; V) \qquad (i = 1, 2).$$

Given a right graded $G_R(k_i; V)$-module $E$ we consider the right bigraded $FL_R(k_1, k_2; V)$-module $E \otimes_{G_R(k_i; V)} FL_R(k_1, k_2; V)$. The diagonal subspace of this module is a graded module over the diagonal subalgebra of $FL_R(k_1, k_2; V)$. This gives the pull-back functor $p_i^* : \mathrm{coh}(G_R(k_i; V)) \to \mathrm{coh}(Fl_R(k_1, k_2; V))$. The pull-back functor is exact and takes an $O_R$-image to an $O_R$-image. In particular, the pull-backs of the tautological bundles have a canonical bimodule structure. The pull-back functor $p_i^*$ admits a right adjoint functor $p_{i*} : \mathrm{coh}(Fl_R(k_1, k_2; V)) \to \mathrm{coh}(G_R(k_i; V))$, called the push-forward functor. It also takes an $O_R$-image to an $O_R$-image.

The line bundles $p_1^* O(i)$ and $p_2^* O(j)$ on the flag variety $Fl_R(k_1, k_2; V)$ are $O_R$-images, hence they have a canonical bimodule structure. Therefore, we have a well-defined tensor product $O(i, j) = p_1^* O(i) \otimes p_2^* O(j)$. The line bundle $O(i, j)$ is also an $O_R$-image and has a canonical bimodule structure. The $n$th graded component of the corresponding module over the diagonal subalgebra of $FL_R(k_1, k_2; V)$ is the $((n + i)e_1 + (n + j)e_2)$-graded component of the algebra $FL_R(k_1, k_2; V)$. One can check that the push-forward of the line bundle $O(j_1, j_2)$ with respect to $p_2$ is given by the formula

$$p_{2*} O(j_1, j_2) = S_R^{j_1}(S^{\vee})(j_2).$$

8.9. $Fl_R(1, 2; Z)$ as the projectivization of the tautological bundle. The R-symmetric powers of the tautological bundle form a sheaf of graded algebras on the Grassmannian $G_R(k; V)$,

$$S_R^{\bullet}(S^{\vee}) = T(S^{\vee}) / \langle \Lambda_R^2 S^{\vee} \rangle .$$

The corresponding $G_R(k; V)$-module

$$\bigoplus_{i,j=0}^{\infty} \Gamma\left( G_R(k; V),\, S_R^{j}(S^{\vee})(i) \right)$$


is a bigraded module with a structure of a bigraded algebra. One can check that this bigraded algebra is isomorphic to the bigraded algebra $FL_R(1, k; V)$. Thus we can regard the flag variety $Fl_R(1, k; V)$ as the projectivization of the tautological bundle $S$ on the Grassmannian $G_R(k; V)$. In particular, $Fl_R(1, 2; Z)$ is the projectivization of the tautological bundle $S$ on the Grassmannian $G_R(2; Z)$.

8.10. Noncommutative twistor transform. If $E$ is a coherent sheaf on the noncommutative projective space $P_R(Z) = P^3_\hbar$, we define its twistor transform as the sheaf $p_* q^* E$ on $G_R(2; Z) = Q_\hbar$, where $q$ is the projection $Fl_R(1, 2; Z) \to P_R(Z) = P^3_\hbar$ and $p$ is the projection $Fl_R(1, 2; Z) \to G_R(2; Z) = Q_\hbar$. Similarly, we can define the twistor transform of a complex of sheaves on $P^3_\hbar$. Actually, it is more natural to consider the derived twistor transform, i.e. the derived functor of the ordinary twistor transform.

Consider a complex $C^{\bullet}$ of the form

$$0 \to H \otimes O(-1) \xrightarrow{\ M\ } K \otimes O \xrightarrow{\ N\ } L \otimes O(1) \to 0$$

on the projective space $P^3_\hbar$. One can check that under the twistor transform one has

$$O_{P^3_\hbar}(-1) \mapsto 0, \qquad O_{P^3_\hbar} \mapsto O_{G_R}, \qquad O_{P^3_\hbar}(1) \mapsto S^{\vee} .$$

In fact, for these sheaves the derived twistor transform coincides with the ordinary one. Thus the (derived) twistor transform takes the complex $C^{\bullet}$ to the complex

$$0 \to K \otimes O \xrightarrow{\ N\ } L \otimes S^{\vee} \to 0.$$

Let $E$ denote the middle cohomology of the complex $C^{\bullet}$. It follows that the twistor transform takes $E$ to the kernel of the map $N : K \otimes O \to L \otimes S^{\vee}$.

One can describe $N$ without reference to the twistor transform. The morphism $N$ is the same thing as a vector space morphism

$$N_1 z_1 + N_2 z_2 + N_3 z_3 + N_4 z_4 : K \to Z^* \otimes L. \tag{35}$$

Here the maps $N_i$ are given in terms of the deformed ADHM data according to (24) and (25). The map $N$ is a composition of two maps $K \otimes O_{G_R} \to L \otimes Z^* \otimes O_{G_R} \to L \otimes S^{\vee}$, where the first map is given by (35), while the second map comes from the canonical projection $Z^* \otimes O_{G_R} \to S^{\vee}$. (Recall that $S^{\vee}$ is the cokernel of the map $\psi : Z \otimes O_{G_R}(-1) \to Z^* \otimes O_{G_R}$.)

Recall that on the open subset $\{T \ne 0\}$ the bundle $S^{\vee}$ is trivial, and the elements $(z_3, z_4)$ give its trivialization (see (34)). Hence the restriction of the twistor transform of the complex (22) to this open subset is isomorphic to the complex

$$0 \to K \otimes O \xrightarrow{\ \begin{pmatrix} N_3 + y_4 N_1 - y_2 N_2 \\ N_4 - y_1 N_1 - y_3 N_2 \end{pmatrix}\ } (L \oplus L) \otimes O \to 0. \tag{36}$$


Assume now that the complex (22) is given by the deformed ADHM data $(B_1, B_2, I, J)$ (see Sect. (7)). Applying the formulas (24) and (25), we see that with respect to the chosen bases of $L$ and $K$ the map $N$ is given by the matrix

$$\begin{pmatrix} -B_2 + y_2 & B_1 + y_4 & I \\ -B_1^{\dagger} + y_3 & -B_2^{\dagger} - y_1 & -J^{\dagger} \end{pmatrix}.$$

It is evident that this operator is related to the operator $\mathcal{D}$ in (4) by a change of basis. In particular, the Nekrasov–Schwarz coordinates $\xi_1, \xi_2, \bar{\xi}_1, \bar{\xi}_2$ (see Sect. 2) can be expressed through $x_i = T^{-1} X_i$ as follows:

$$\begin{aligned} \xi_1 &= -y_4 = x_4 - \sqrt{-1}\, x_3, & \xi_2 &= y_2 = -x_2 + \sqrt{-1}\, x_1, \\ \bar{\xi}_1 &= y_3 = x_4 + \sqrt{-1}\, x_3, & \bar{\xi}_2 &= -y_1 = -x_2 - \sqrt{-1}\, x_1. \end{aligned}$$

Thus the twistor transform of the complex corresponding to the deformed ADHM data coincides with the instanton bundle corresponding to these data (see Sect. 2). This gives a geometric interpretation of the deformed ADHM construction of the noncommutative instanton bundle.

8.11. Differential forms. Let an algebra $A$ be the $O_R$-image of a commutative algebra in the category $T_R$. This means that there exists an operator $R : A^{\otimes 2} \to A^{\otimes 2}$ compatible with the multiplication law of $A$. Above we have defined the R-tensor product $A \otimes_R A$, which is also an algebra with a Yang–Baxter operator. Explicitly, the multiplication law of $A \otimes_R A$ is defined as follows. Let $m$ be the multiplication map from $A \otimes A$ to $A$. Then the multiplication map from $(A \otimes A) \otimes (A \otimes A)$ to $A \otimes A$ is given by $m_{12} m_{34} R_{23}$ in the obvious notation. It is easy to see that the multiplication map $m$ is a homomorphism of algebras. Let $I$ denote the kernel of the map $m : A \otimes_R A \to A$. Then $I$ is a two-sided ideal of the algebra $A \otimes_R A$.

Definition 8.6. We define the bimodule of R-differential forms of the algebra $A$ by $\Omega_A^1 = I / I^2$.

For a motivation of this definition, see [12].

Furthermore, suppose $A$ is a graded algebra. Consider the total grading of the bigraded algebra $A \otimes_R A$. The two-sided ideal $I$ inherits the grading. Therefore the bimodule $\Omega_A^1$ is graded too.

In the graded case, besides $\Omega_A^1$, we can define the module of projective differential forms of $A$ in the following way. Let $\chi : A \otimes_R A \to A \otimes_R A$ be the linear operator which acts on the $(p, q)$th graded component of the algebra $A \otimes_R A$ as a scalar multiplication by $q$. Since $\chi$ is a derivation, we have $\chi(I^2) \subset I$. Therefore $m(\chi(I^2)) = 0$. Furthermore, the induced map $\Omega_A^1 = I / I^2 \xrightarrow{\ m \cdot \chi\ } A$ is a morphism of graded $A$-bimodules.

Definition 8.7. We define the $A$-bimodule of projective differential forms of the algebra $A$ by

$$\widetilde{\Omega}_A^1 = \mathrm{Ker}\left( \Omega_A^1 \xrightarrow{\ m \cdot \chi\ } A \right).$$


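First, let us apply this construction of differential forms to the noncommutative affine variety $\mathbb{C}^4_\hbar$ (Subsect. 3.4). The algebra $A(\mathbb{C}^4_\hbar)$ of polynomial functions on $\mathbb{C}^4_\hbar$ is the Weyl algebra:

$$A(\mathbb{C}^4_\hbar) = T(x_1, x_2, x_3, x_4) / \left\langle [x_i, x_j] = \hbar \theta_{ij} \right\rangle_{1 \le i, j \le 4} .$$

Let us define the Yang–Baxter operator on the tensor square of the subspace of $A(\mathbb{C}^4_\hbar)$ spanned by $1, x_1, x_2, x_3, x_4$ by the formula

$$1 \otimes x_i \mapsto x_i \otimes 1, \qquad x_i \otimes 1 \mapsto 1 \otimes x_i, \qquad x_i \otimes x_j \mapsto x_j \otimes x_i + \hbar \theta_{ij} \cdot 1 \otimes 1 \qquad \text{for all } 1 \le i, j \le 4.$$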
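As a quick consistency check on this formula, here is a short worked computation. It is only a sketch under three assumptions that are not spelled out in the text: that compatibility with the multiplication law includes the R-commutativity condition $m \circ R = m$, that $\theta_{ij}$ is antisymmetric, and that $R(1 \otimes 1) = 1 \otimes 1$.

```latex
\begin{align*}
m\bigl(R(x_i \otimes x_j)\bigr)
  &= x_j x_i + \hbar\theta_{ij}
   = x_i x_j
   = m(x_i \otimes x_j)
  && \text{(using } [x_i, x_j] = \hbar\theta_{ij}\text{)}, \\
R^2(x_i \otimes x_j)
  &= R(x_j \otimes x_i) + \hbar\theta_{ij}\, 1 \otimes 1 \\
  &= x_i \otimes x_j + \hbar(\theta_{ji} + \theta_{ij})\, 1 \otimes 1
   = x_i \otimes x_j
  && \text{(using } \theta_{ji} = -\theta_{ij}\text{)}.
\end{align*}
```

The first line says that the Weyl relation is exactly what makes the product R-commutative; the second that the operator is an involution on this subspace.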

This Yang–Baxter operator has a unique extension to the whole $A(\mathbb{C}^4_\hbar)$ compatible with the multiplication law.

There is another way to look at this Yang–Baxter operator. Recall that $\mathbb{C}^4_\hbar$ is an open subset $T \ne 0$ in the noncommutative Grassmannian $G_R(2; Z)$, where $R$ is defined by (30). The Yang–Baxter operator on the quadratic algebra $G_R(2; Z)$ has the property that $R(T \otimes a) = a \otimes T$ for any $a \in G_R(2; Z)$. Hence it descends to a Yang–Baxter operator on $A(\mathbb{C}^4_\hbar)$. It is easy to see that it acts on the tensor square of the subspace spanned by $1, x_1, x_2, x_3, x_4$ in the above manner.

We define the sheaf of differential forms $\Omega^1_{\mathbb{C}^4_\hbar}$ as the bimodule of R-differential forms of the algebra $A(\mathbb{C}^4_\hbar)$. It is easy to check that $\Omega^1_{\mathbb{C}^4_\hbar}$ is isomorphic to the bimodule $A(\mathbb{C}^4_\hbar)^{\oplus 4}$. Furthermore, we can take any R-exterior power of $\Omega^1_{\mathbb{C}^4_\hbar}$ and thereby define $\Omega^p_{\mathbb{C}^4_\hbar}$. This enables us to define a connection and its curvature on any bundle on the noncommutative affine space. The relevant formulas were written above (see Subsect. 1.5).

Second, we define the sheaf of differential forms $\Omega^1_{G_R}$ on the noncommutative Grassmannian $G_R(k; V)$ as the sheaf corresponding to the module of projective differential forms $\widetilde{\Omega}^1_{G_R}$. It can be shown that as in the commutative case we have an isomorphism of coherent sheaves on the noncommutative Grassmannian $G_R(k; V)$:

$$\Omega^1_{G_R} \cong S \otimes S^{\perp} .$$

It follows that for $k = 1$ we have an exact sequence

$$0 \to \Omega^1_{P_R(V)} \to \mathcal{V}^*(-1) \to O \to 0.$$

Thus this definition of the sheaf of differential forms $\Omega^1_{P_R(V)}$ is consistent with Definition 4.8.

Similarly, one can define the sheaf of differential forms $\Omega^1_{Fl_R}$ on the noncommutative flag variety $Fl_R(k_1, \dots, k_r; V)$. One can check that the projection $p_i : Fl_R(k_1, \dots, k_i, \dots, k_r; V) \to G_R(k_i; V)$ induces a morphism of bundles $p_i^* : \Omega^1_{G_R} \to \Omega^1_{Fl_R}$.

In the commutative case the ADHM construction of the instanton connection can be interpreted in terms of twistor transform (see [4, 24] for details). We believe that this can be done in the noncommutative case as well. It appears that the most convenient definition of connection on a bundle on a noncommutative projective variety is in terms of jet bundles (see, for example, [24]).


9. Instantons on a q-Deformed $\mathbb{R}^4$

In this paper we have focused on a particular noncommutative deformation of $\mathbb{R}^4$ related to the Wigner–Moyal product (3). This is the only deformation of $\mathbb{R}^4$ which is known to arise in string theory. But most of our constructions work for more general deformations which do not have a clear physical interpretation. For example, let us replace $\mathbb{C}^4_\hbar$ with a noncommutative affine variety whose coordinate ring is generated by $z_1, z_2, z_3, z_4$ subject to the following quadratic relations:

$$q z_1 z_2 - q^{-1} z_2 z_1 = \hbar, \qquad q z_3 z_4 - q^{-1} z_4 z_3 = \hbar, \qquad [z_1, z_3] = [z_1, z_4] = [z_2, z_3] = [z_2, z_4] = 0.$$

We will denote this noncommutative affine variety by $\mathbb{C}^4_{q,\hbar}$, and its coordinate algebra by $A_{q,\hbar}$. If $\hbar$ and $q$ are real, we can define a $*$-operation on $A_{q,\hbar}$ by $z_1^* = z_2$, $z_3^* = z_4$. The corresponding real noncommutative affine variety will be denoted by $\mathbb{R}^4_{q,\hbar}$.

Consider now the following deformation of the ADHM equations:

$$\begin{gathered} [B_1, B_2]_{q^{-1}} + I J = 0, \\ [B_1, B_1^{\dagger}]_{q^{-1}} + [B_2, B_2^{\dagger}]_{q} + I I^{\dagger} - J^{\dagger} J = -2\hbar \cdot 1_{k \times k} . \end{gathered} \tag{37}$$

Here $B_1, B_2 \in \mathrm{Hom}(V, V)$, $I \in \mathrm{Hom}(W, V)$, $J \in \mathrm{Hom}(V, W)$, as usual, and by $[A, B]_q$ we mean a q-commutator: $[A, B]_q = qAB - q^{-1}BA$. We claim that solutions of these "q-deformed" ADHM equations can be used as an input for the construction of instantons on $\mathbb{R}^4_{q,\hbar}$ of rank $r = \dim W$ and instanton charge $k = \dim V$.

Let us sketch this construction. Define an operator $\mathcal{D} \in \mathrm{Hom}_{A_{q,\hbar}}\left( (V \oplus V \oplus W) \otimes_{\mathbb{C}} A_{q,\hbar},\ (V \oplus V) \otimes_{\mathbb{C}} A_{q,\hbar} \right)$ by the formula

$$\mathcal{D} = \begin{pmatrix} B_1 - q z_1 & -q B_2 + q z_2 & I \\ B_2^{\dagger} - \bar{z}_2 & q B_1^{\dagger} - \bar{z}_1 & J^{\dagger} \end{pmatrix}.$$

Now we can go through the same manipulations as in Sect. 2: assume that $\mathcal{D}$ is surjective, and its kernel is a free module, and define a connection 1-form by the expression (5). The same formal computation as in Sect. 2 shows that the curvature of this connection is anti-self-dual.

In order to ensure that $\mathcal{D}$ is surjective, it is probably necessary to replace the algebra $A_{q,\hbar}$ with some bigger algebra containing $A_{q,\hbar}$ as a subalgebra. This bigger algebra should play the role of the algebra of smooth functions on our noncommutative $\mathbb{R}^4$. For $\hbar = 0$, $q \ne 1$ there is even a natural candidate for this bigger algebra: it should consist of $C^{\infty}$ functions on $\mathbb{C}^2$ with some suitable growth conditions at infinity and the product defined by

$$(f \star g)(z_1, z_2, \bar{z}_1, \bar{z}_2) = \exp\left[ -\ln(q) \left( z_1 \bar{z}_1' \frac{\partial^2}{\partial z_1 \partial \bar{z}_1'} + z_2 \bar{z}_2' \frac{\partial^2}{\partial z_2 \partial \bar{z}_2'} - z_1' \bar{z}_1 \frac{\partial^2}{\partial z_1' \partial \bar{z}_1} - z_2' \bar{z}_2 \frac{\partial^2}{\partial z_2' \partial \bar{z}_2} \right) \right] f(z_1, z_2, \bar{z}_1, \bar{z}_2)\, g(z_1', z_2', \bar{z}_1', \bar{z}_2') \Big|_{z_1' = z_1,\ z_2' = z_2} . \tag{38}$$


Assuming that this formal expression exists, it is easy to check that the product is associative, that polynomial functions form a subalgebra with respect to it, and that this subalgebra is isomorphic to $A_{q,\hbar}$.

It is natural to conjecture that all instantons on $\mathbb{R}^4_{q,\hbar}$ arise from this deformed ADHM construction. Note that in this case the deformed ADHM equations are not hyperkähler moment map equations, and one cannot use the hyperkähler quotient construction to infer the existence of a hyperkähler metric on the quotient space.

The algebro-geometric part of the story can also be generalized. We did not go through this carefully, but nevertheless would like to indicate one result. It appears that the q-deformed ADHM data can be interpreted in terms of sheaves on a more general noncommutative $P^2$ than the one defined in Sect. 3. The graded algebra corresponding to this noncommutative $P^2$ is generated by degree one elements $z_1, z_2, z_3$ with the quadratic relations

$$q z_1 z_2 - q^{-1} z_2 z_1 = 2\hbar z_3^2, \qquad [z_i, z_3] = 0, \quad i = 1, 2.$$

This algebra is one of the Artin–Schelter regular algebras of dimension three [1, 2]. It is characterized by the fact that the corresponding noncommutative variety $P^2_{q,\hbar}$ contains as subvarieties a commutative quadric and a noncommutative line. The latter is given by the equation $z_3 = 0$. In the limit $q \to 1$ the plane $P^2_{q,\hbar}$ reduces to $P^2_\hbar$, and the union of the quadric and the line turns into the triple commutative line $l$ which played such a prominent role in this paper. If $q \ne 1$, then in the limit $\hbar \to 0$ the quadric turns into a union of two intersecting commutative lines $z_1 = 0$ and $z_2 = 0$. For any $q$ the line $z_3 = 0$ should be regarded as "the line at infinity" (which is noncommutative for $q \ne 1$). It is plausible that the q-deformed ADHM data are in one-to-one correspondence with bundles, or maybe torsion-free sheaves, on $P^2_{q,\hbar}$ with a trivialization on this line.

10. Appendix

In this section we define a $\star$-product on the space of complex-valued $C^{\infty}$ functions on $\mathbb{R}^n$ whose derivatives of arbitrary order are polynomially bounded. The $\star$-product endows this space with a structure of a $\mathbb{C}$-algebra and reduces to the Wigner–Moyal product (3) on polynomial functions.

Definition 10.1. Let $O$ be a topological vector space which is a subspace of the space of $C^{\infty}$ functions on $\mathbb{R}^n$, and let $O'$ be the space of distributions on $O$. Let $f$ be a $\mathbb{C}$-valued function on $\mathbb{R}^n$ which simultaneously is a distribution in $O'$. $f$ is called a multiplier if for any $\varphi \in O$, $f\varphi \in O$. The set of multipliers of $O$ is obviously a subspace of $O'$.

Definition 10.2. Let $f \in O'$. $f$ is called a convolute if for any $\varphi \in O$ we have $(f * \varphi)(x) \equiv (f(\xi), \varphi(x + \xi)) \in O$, and this expression depends continuously on $\varphi$. The above expression is called the convolution of $f$ with $\varphi$. The set of convolutes is obviously a subspace of $O'$.

We will denote the Fourier duals of $O$ and $O'$ by $\widetilde{O}$ and $\widetilde{O}'$, respectively. If $f \in O$, then $\tilde{f} \in \widetilde{O}$ will be the Fourier transform of $f$, etc.


Definition 10.3. The Schwartz space $S(\mathbb{R}^n)$ is the space of $\mathbb{C}$-valued $C^{\infty}$ functions on $\mathbb{R}^n$ such that $\varphi \in S$ if and only if all the norms

$$\sup_x |x|^k\, |D^m \varphi(x)|, \qquad k = 0, 1, 2, \dots, \tag{39}$$

are finite. Here $m = (m_1, \dots, m_n)$ is an arbitrary polyindex. Convergence on $S$ is defined using the family of norms (39). Then $S$ becomes a complete countably normed space [17].

Proposition 10.4. A function $f \in S'$ is a multiplier if and only if it is a $C^{\infty}$ function on $\mathbb{R}^n$ all of whose derivatives are polynomially bounded.

Proof. Obvious. □

The following theorem proved in [37] describes the subspace of convolutes of $S'$:

Theorem 10.5. A distribution $f \in S'$ is a convolute if and only if it has the form f = D α fα (x), |α| 0 u+ (τ, m, η) = j 0, elsewhere.

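Hence

$$\|u_1 \star u_2\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} \lesssim \|u_1^+ \star u_2^+\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} + \|u_1^+ \star u_2^-\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} + \|u_1^- \star u_2^+\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} + \|u_1^- \star u_2^-\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})}$$

and a use of (18) yields

$$\|u_1 \star u_2\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} \lesssim \|u_1^+ \star u_2^+\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} + \|u_1^+ \star j(u_2^-)\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} + \|j(u_1^-) \star u_2^+\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} + \|j(u_1^-) \star j(u_2^-)\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} .$$

Since $\tau - m^5 - \frac{\eta^2}{m}$ is an odd function, one has $0 < m \sim M_j$ and $\left| \tau - m^5 - \frac{\eta^2}{m} \right| \sim K_j$, $j = 1, 2$, on the support of $u_j^+$ and $j(u_j^-)$. Hence we can suppose $m > 0$ on the support of $u_j$, $j = 1, 2$, when proving Lemma 4. We need to bound the expression

$$\sum_{m \ne 0} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \left| \sum_{m_1 > 0,\ m - m_1 > 0} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} u_1(\tau_1, m_1, \eta_1)\, u_2(\tau - \tau_1, m - m_1, \eta - \eta_1)\, d\tau_1\, d\eta_1 \right|^2 d\tau\, d\eta .$$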


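The Cauchy–Schwarz inequality in $(\tau_1, m_1, \eta_1)$, the support properties of $u_1$ and $u_2$ and the Cauchy–Schwarz inequality in $(\tau, m, \eta)$ yield

$$\|u_1 \star u_2\|^2_{L^2(\mathbb{R} \times \mathbb{Z}_0 \times \mathbb{R})} \lesssim \sup_{(\tau,\, m \ne 0,\, \eta)} |A_{\tau m \eta}|\ \|u_1\|^2_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})}\, \|u_2\|^2_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})}, \tag{19}$$

where $A_{\tau m \eta} \subset \mathbb{R} \times \mathbb{Z} \times \mathbb{R}$ is the set

$$A_{\tau m \eta} = \left\{ (\tau_1, m_1, \eta_1) : 0 < m_1 \sim M_1,\ 0 < (m - m_1) \sim M_2,\ \left| \tau_1 - m_1^5 - \frac{\eta_1^2}{m_1} \right| \sim K_1,\ \left| \tau - \tau_1 - (m - m_1)^5 - \frac{(\eta - \eta_1)^2}{m - m_1} \right| \sim K_2 \right\}.$$

Further we obtain via the triangle inequality $|A_{\tau m \eta}| \lesssim (K_1 \wedge K_2)\, |B_{\tau m \eta}|$, where $B_{\tau m \eta} \subset \mathbb{Z} \times \mathbb{R}$ is the set

$$B_{\tau m \eta} = \left\{ (m_1, \eta_1) \in \mathbb{Z} \times \mathbb{R} : 0 < m_1 \sim M_1,\ 0 < (m - m_1) \sim M_2,\ \left| \tau - m_1^5 - (m - m_1)^5 - \frac{\eta_1^2}{m_1} - \frac{(\eta - \eta_1)^2}{m - m_1} \right| \lesssim (K_1 \vee K_2) \right\}.$$

It remains to bound $|B_{\tau m \eta}|$. We shall again use Lemma 1. The projection of $B_{\tau m \eta}$ on the $m_1$ axis is bounded by $c(M_1 \wedge M_2)$. Fix now $m_1$. We need to estimate the Lebesgue measure of $\eta_1$ such that the expression

$$\tau - m_1^5 - (m - m_1)^5 - \frac{\eta_1^2}{m_1} - \frac{(\eta - \eta_1)^2}{m - m_1} \tag{20}$$

ranges in an interval of size $c(K_1 \vee K_2)$. For that purpose we need the following lemma, the proof of which is straightforward.

Lemma 5. Let $a \ne 0$, $b$, $c$ be real numbers and $I$ be an interval on the real line. Then

$$\mathrm{mes}\, \{ x : a x^2 + b x + c \in I \} \lesssim \frac{|I|^{1/2}}{|a|^{1/2}} .$$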
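Lemma 5 can be checked numerically. The sketch below computes the exact measure of the sublevel set by completing the square and compares it with $2|I|^{1/2}/|a|^{1/2}$. The explicit constant 2 is an assumption of this sketch (it comes from the elementary bound $\sqrt{s} - \sqrt{t} \le \sqrt{s - t}$ for $s \ge t \ge 0$) and is not stated in the lemma; the function names are illustrative.

```python
import math

def sublevel_measure(a, b, c, lo, hi):
    """Lebesgue measure of {x : lo <= a*x**2 + b*x + c <= hi}, for a != 0 and lo <= hi."""
    if a < 0:
        # Reflect to the case a > 0: lo <= p(x) <= hi  iff  -hi <= -p(x) <= -lo.
        a, b, c, lo, hi = -a, -b, -c, -hi, -lo
    shift = c - b * b / (4 * a)          # p(x) = a*(x + b/(2a))**2 + shift
    outer = max((hi - shift) / a, 0.0)   # admissible range of y**2, with y = x + b/(2a)
    inner = max((lo - shift) / a, 0.0)
    return 2.0 * (math.sqrt(outer) - math.sqrt(inner))

def lemma5_bound(a, interval_length):
    """Right-hand side of Lemma 5 with the explicit constant 2 (an assumption of this sketch)."""
    return 2.0 * math.sqrt(interval_length / abs(a))

# Illustrative values only.
for a, b, c, lo, hi in [(2.0, -3.0, 1.0, 0.5, 0.75), (-0.1, 4.0, 7.0, -10.0, 30.0)]:
    print(sublevel_measure(a, b, c, lo, hi), "<=", lemma5_bound(a, hi - lo))
```

The bound is attained when the interval starts exactly at the vertex value of the parabola, for instance for $x^2 \in [0, 4]$.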

Write the expression (20) as

$$-\frac{m}{m_1 (m - m_1)}\, \eta_1^2 + \frac{2\eta}{m - m_1}\, \eta_1 + \tau - m_1^5 - (m - m_1)^5 - \frac{\eta^2}{m - m_1} .$$

Since $m_1$ and $m - m_1$ are both positive we have that

$$\frac{1}{M_1 \wedge M_2} \lesssim \frac{m}{m_1 (m - m_1)} .$$

Therefore Lemma 5 implies that the Lebesgue measure of $\eta_1$ such that the expression (20) ranges in an interval of size $c(K_1 \vee K_2)$ is bounded by $c (M_1 \wedge M_2)^{1/2} (K_1 \vee K_2)^{1/2}$. Hence using Lemma 1 we obtain $|B_{\tau m \eta}| \lesssim (M_1 \wedge M_2)^{3/2} (K_1 \vee K_2)^{1/2}$ and moreover

$$|A_{\tau m \eta}| \lesssim (K_1 \wedge K_2) (M_1 \wedge M_2)^{3/2} (K_1 \vee K_2)^{1/2} . \tag{21}$$

Substituting (21) in (19) completes the proof of Lemma 4. Consider the dyadic levels

η2 K1 K2 K 5 DM1 M2 M = (τ, m, η, τ1 , m1 , η1 ) : τ − m − ≈ K, |m| ≈ M, m

η2 τ1 − m51 − 1 ≈ K1 , |m1 | ≈ M1 , m1

(η − η1 )2 5 τ − τ1 − (m − m1 ) − ≈ K2 , m − m1

|m − m1 | ≈ M2 , (m, m1 , η, η1 ) ∈ "2 ,

where $K_1, K_2, K, M_1, M_2, M$ are dyadic integers. Denote by $J_2^{K_1 K_2 K M_1 M_2 M}$ the contribution of $D^{K_1 K_2 K}_{M_1 M_2 M}$ to (6). Then

$$J_2 \lesssim \sum_{K_1, K_2, K, M_1, M_2, M\ \text{dyadic}} J_2^{K_1 K_2 K M_1 M_2 M} .$$

Define fK1 M1 (τ, m, η) and gK2 M2 (τ, m, η) as in (8) and (9) respectively. In the estimate of J2 we shall perform an additional (comparing to the estimate for J1 ) localization of 2 h near the level set of τ − m5 − ηm . So we set 2 h(τ, m, η), when τ − m5 − ηm ≈ K, |m| ≈ M hKM (τ, m, η) = 0, elsewhere. Then clearly J2K1 K2 KM1 M2 M is bounded by M · fK M (τ1 , m1 , η1 )gK M (τ − τ1 , m − m1 , η − η1 )hKM (τ, m, η) 1 1 2 2 K K K 1 2

*M1 M2 M

1

+

1

K 2 − K12 K22 1

"+

+

,

K1 K2 K ⊂ R4 is defined as where *M 1 M2 M

K1 K2 K *M = (τ, τ1 , η, η1 ) ∈ R4 such that there exists (m, m1 , η, η1 ) ∈ "2 1 M2 M K1 K2 K . with (τ, m, η, τ1 , m1 , η1 ) ∈ DM 1 M2 M Using Lemma 3 one obtains that max {K, K1 , K2 } M1 M2 M 3 . We are in a position to state the following lemma.

(22)


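Lemma 6.

$$J_2^{K_1 K_2 K M_1 M_2 M} \lesssim \frac{1}{[\max\{K, K_1, K_2\}]^{0+}}\, \|f_{K_1 M_1}\|_{L^2}\, \|g_{K_2 M_2}\|_{L^2}\, \|h_{KM}\|_{L^2} .$$

Proof. Via a symmetry argument we can assume that $K_1 \ge K_2$. We shall consider separately the cases $K_1 \ge K$ and $K_1 \le K$.

Case $K_1 \ge K$. Then

$$J_2^{K_1 K_2 K M_1 M_2 M} \lesssim \frac{M}{K^{\frac{1}{2}-} K_1^{\frac{1}{2}+} K_2^{\frac{1}{2}+}}\, \langle h_{KM} \star j(g_{K_2 M_2}),\, f_{K_1 M_1} \rangle_{L^2},$$

where $\langle \cdot, \cdot \rangle_{L^2}$ connotes the $L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})$ scalar product. Using the Cauchy–Schwarz inequality and Lemma 4 we obtain

$$J_2^{K_1 K_2 K M_1 M_2 M} \lesssim \frac{M (M \wedge M_2)^{\frac{3}{4}} (K \wedge K_2)^{\frac{1}{2}} (K \vee K_2)^{\frac{1}{4}}}{K^{\frac{1}{2}-} K_1^{\frac{1}{2}+} K_2^{\frac{1}{2}+}} \cdot \|f_{K_1 M_1}\|_{L^2}\, \|g_{K_2 M_2}\|_{L^2}\, \|h_{KM}\|_{L^2} .$$

Now (22) yields

$$K_1^{\frac{1}{2}} \gtrsim (M_1 M_2 M^3)^{\frac{1}{2}} \gtrsim M_2^{\frac{1}{2}} M^{\frac{3}{2}} \gtrsim M (M \wedge M_2)^{\frac{3}{4}} .$$

Hence for $K_1 \ge K$ one has

$$J_2^{K_1 K_2 K M_1 M_2 M} \lesssim \frac{1}{K_1^{0+}}\, \|f_{K_1 M_1}\|_{L^2}\, \|g_{K_2 M_2}\|_{L^2}\, \|h_{KM}\|_{L^2} .$$

Case $K_1 \le K$. Then

$$J_2^{K_1 K_2 K M_1 M_2 M} \lesssim \frac{M}{K^{\frac{1}{2}-} K_1^{\frac{1}{2}+} K_2^{\frac{1}{2}+}}\, \langle f_{K_1 M_1} \star g_{K_2 M_2},\, h_{KM} \rangle_{L^2} .$$

The Cauchy–Schwarz inequality and Lemma 4 yield

$$J_2^{K_1 K_2 K M_1 M_2 M} \lesssim \frac{M (M_1 \wedge M_2)^{\frac{3}{4}} (K_1 \wedge K_2)^{\frac{1}{2}} (K_1 \vee K_2)^{\frac{1}{4}}}{K^{\frac{1}{2}-} K_1^{\frac{1}{2}+} K_2^{\frac{1}{2}+}} \cdot \|f_{K_1 M_1}\|_{L^2}\, \|g_{K_2 M_2}\|_{L^2}\, \|h_{KM}\|_{L^2} .$$

Next using (22) we obtain

$$K^{\frac{1}{2}-} \gtrsim (M_1 M_2 M^3)^{\frac{1}{2}-} \gtrsim M (M_1 \wedge M_2)^{\frac{3}{4}} .$$

Hence for $K_1 \le K$ one has

$$J_2^{K_1 K_2 K M_1 M_2 M} \lesssim \frac{1}{K^{0+}}\, \|f_{K_1 M_1}\|_{L^2}\, \|g_{K_2 M_2}\|_{L^2}\, \|h_{KM}\|_{L^2} .$$

This completes the proof of Lemma 6.

Now using (22) and Lemma 6 we can sum $J_2^{K_1 K_2 K M_1 M_2 M}$ over dyadic $K_1, K_2, K, M_1, M_2, M$ and arrive at

$$J_2 \lesssim \|f\|_{L^2}\, \|g\|_{L^2}\, \|h\|_{L^2} .$$

This completes the proof of Theorem 2.1.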
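The summation step can be spelled out as follows. This is a sketch of one standard way to carry it out, not necessarily the authors' argument. It assumes that (22) reads $\max\{K, K_1, K_2\} \gtrsim M_1 M_2 M^3$, that all dyadic integers are at least 1, that $\varepsilon > 0$ denotes the small exponent written $0+$ in Lemma 6, and that the localized functions satisfy $\|f_{K_1 M_1}\|_{L^2} \le \|f\|_{L^2}$ and similarly for $g$ and $h$.

```latex
\begin{align*}
[\max\{K, K_1, K_2\}]^{-\varepsilon}
  &= [\max\{K, K_1, K_2\}]^{-\varepsilon/2}\,[\max\{K, K_1, K_2\}]^{-\varepsilon/2} \\
  &\lesssim (K K_1 K_2)^{-\varepsilon/6}\,(M_1 M_2 M^3)^{-\varepsilon/2}, \\
J_2
  &\lesssim \sum_{K_1, K_2, K, M_1, M_2, M\ \mathrm{dyadic}}
     (K K_1 K_2)^{-\varepsilon/6}\, M_1^{-\varepsilon/2} M_2^{-\varepsilon/2} M^{-3\varepsilon/2}\,
     \|f\|_{L^2} \|g\|_{L^2} \|h\|_{L^2} \\
  &\lesssim \|f\|_{L^2} \|g\|_{L^2} \|h\|_{L^2},
  \qquad\text{since}\quad
  \sum_{N\ \mathrm{dyadic},\ N \ge 1} N^{-\delta} = \sum_{j \ge 0} 2^{-\delta j}
  = \frac{1}{1 - 2^{-\delta}} < \infty \quad (\delta > 0).
\end{align*}
```

The first inequality uses $\max\{K, K_1, K_2\} \ge (K K_1 K_2)^{1/3}$ for one half of the exponent and (22) for the other half, so that every one of the six dyadic indices carries a negative power.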


3. Local Well-Posedness

The goal of this section is to prove a local well-posedness result in the Fourier transform restriction spaces associated to the energy density of the fifth order KP-I equation posed on $\mathbb{T} \times \mathbb{R}$. This well-posedness result is a consequence of a bilinear estimate in the framework of the above spaces. The gain of smoothness is obtained as in the previous section. Because of the specific structure of the energy density, an additional argument is needed in order to deal with the terms containing antiderivatives. This argument was already given in [12]. Here we perform it again with the needed modifications.

We now define the antiderivative operator $\partial_x^{-k}$, which acts on functions defined on $\mathbb{T} \times \mathbb{R}$ with zero $x$ mean value (or equivalently vanishing of some Fourier modes). Let $\varphi : \mathbb{T} \times \mathbb{R} \to \mathbb{R}$ be such that $\int_{\mathbb{T}} \varphi(x, y)\, dx = 0$ (or equivalently $\hat{\varphi}(0, \eta) = 0$). Define $\partial_x^{-k} \varphi$ through its Fourier transform as

$$\widehat{\partial_x^{-k} \varphi}(m, \eta) = \begin{cases} (-im)^{-k}\, \hat{\varphi}(m, \eta), & \text{when } m \ne 0, \\ 0, & \text{elsewhere.} \end{cases}$$

Note that $\partial_x^{-1}(\partial_x \varphi) = \varphi$ for any $\varphi$ having zero $x$ mean value. Let $\varphi : \mathbb{T} \times \mathbb{R} \to \mathbb{R}$ be such that $\int_{\mathbb{T}} \varphi(x, y)\, dx = 0$. Then an integration by parts yields

$$\int_{\mathbb{T} \times \mathbb{R}} \partial_x^{-2} \varphi \cdot \varphi = -\int_{\mathbb{T} \times \mathbb{R}} |\partial_x^{-1} \varphi|^2 .$$
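The antiderivative operator can be illustrated on finitely many Fourier modes in $x$. This is a minimal sketch, assuming the convention under which $\partial_x$ corresponds to the multiplier $(-im)$, so that it matches the symbol $(-im)^{-k}$ used in the definition. The $y$ variable is suppressed, and the function names and sample coefficients are illustrative.

```python
def d_x(coeffs):
    """Multiplier of d/dx under the convention assumed here: (-i m)."""
    return {m: (-1j * m) * c for m, c in coeffs.items()}

def d_x_inverse(coeffs, k=1):
    """Multiplier (-i m)^(-k) for m != 0; the zero mode is set to 0."""
    return {m: (c / (-1j * m) ** k if m != 0 else 0j) for m, c in coeffs.items()}

def pairing(f, g):
    """Sum over m of f_hat(m) * conj(g_hat(m)).

    For real-valued g this is proportional to the integral of f * g over the torus.
    """
    return sum(f[m] * g[m].conjugate() for m in f if m in g)

# Fourier coefficients of a real, zero-mean function: phi_hat(-m) = conj(phi_hat(m)).
phi = {1: 2 + 1j, -1: 2 - 1j, 3: -0.5j, -3: 0.5j}

# The antiderivative inverts the derivative on zero-mean functions.
recovered = d_x_inverse(d_x(phi))
print(max(abs(recovered[m] - phi[m]) for m in phi))

# Integration by parts identity: pairing of d^-2 phi with phi equals minus the squared norm of d^-1 phi.
lhs = pairing(d_x_inverse(phi, 2), phi)
rhs = -pairing(d_x_inverse(phi), d_x_inverse(phi))
print(lhs, rhs)
```

On the Fourier side the identity is immediate, since $(-im)^{-2} = -1/m^2$ while $|(-im)^{-1}|^2 = 1/m^2$.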

H s,k (T × R)

be the Sobolev-type space (related to the Let s and k be real numbers and energy density of the KP equation for s = 2 and k = 1) of functions having zero x mean value equipped with the norm

φ H s,k =

∞

m=0 −∞

(|m| + |m| 2s

−2

ˆ |η| )|φ(m, η)| dη 2k

2

21

.
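As a minimal illustration of the antiderivative operator defined above, the following Python sketch applies the multiplier $(-im)^{-k}$ to a finite set of $x$-Fourier coefficients at one fixed $\eta$. The dictionary representation, the sample data and the function names `derivative` and `antiderivative` are assumptions made for this example only; $\partial_x$ is assumed to have symbol $-im$, so that it matches the sign convention of the definition.

```python
def derivative(coeffs):
    """Multiply each x-Fourier coefficient by -i*m (assumed symbol of d/dx)."""
    return {m: c * (-1j * m) for m, c in coeffs.items()}


def antiderivative(coeffs, k=1):
    """Apply (-i*m)**(-k) for m != 0 and set the m = 0 mode to zero."""
    return {m: (0j if m == 0 else c * (-1j * m) ** (-k))
            for m, c in coeffs.items()}


# Hypothetical zero-mean data: the coefficient at m = 0 vanishes.
phi_hat = {-2: 1 - 2j, -1: 0.5j, 0: 0j, 1: -0.5j, 2: 1 + 2j}

# Differentiate, then take one antiderivative; this should recover phi_hat.
round_trip = antiderivative(derivative(phi_hat), k=1)
max_error = max(abs(round_trip[m] - phi_hat[m]) for m in phi_hat)
print(max_error)
```

The zero mode is discarded rather than divided by zero, which is why the operator is only an inverse of $\partial_x$ on functions with zero $x$ mean value.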

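Let $b$ and $k$ be real numbers. Since the energy density of the KP equations contains an antiderivative, we introduce the Fourier transform restriction space $Y^{b,k}(\mathbb{R}\times\mathbb{T}\times\mathbb{R})$ as
$$Y^{b,k}(\mathbb{R}\times\mathbb{T}\times\mathbb{R}) = \bigl\{u\in\mathcal{S}'(\mathbb{R}\times\mathbb{T}\times\mathbb{R}) : \hat u(\tau,0,\eta)=0 \text{ and } \|u\|_{Y^{b,k}}<\infty\bigr\},$$
where
$$\|u\|_{Y^{b,k}} = \Bigl(\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\sum_{m\neq 0}|m|^{-2}|\eta|^{2k}\Bigl(1+\Bigl|\tau-m^5-\frac{\eta^2}{m}\Bigr|\Bigr)^{2b}|\hat u(\tau,m,\eta)|^2\,d\tau\,d\eta\Bigr)^{\frac12}.$$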

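Define now the space $Z^{b,s,k}(\mathbb{R}\times\mathbb{T}\times\mathbb{R}) := X^{b,s}(\mathbb{R}\times\mathbb{T}\times\mathbb{R})\cap Y^{b,k}(\mathbb{R}\times\mathbb{T}\times\mathbb{R})$ equipped with the norm
$$\|u\|_{Z^{b,s,k}} = \|u\|_{X^{b,s}} + \|u\|_{Y^{b,k}}.$$
Let $I\subset\mathbb{R}$ be an interval. Then we define a localized Bourgain space $Z^{b,s,k}(I)$ endowed with the norm
$$\|u\|_{Z^{b,s,k}(I)} = \inf_{w\in Z^{b,s,k}}\bigl\{\|w\|_{Z^{b,s,k}},\ w(t)=u(t) \text{ on } I\bigr\}.$$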


We have the following local well-posedness result.

Theorem 3.1. Let $s\ge 1$ and $k\ge 0$. Then for any $\phi\in H^{s,k}(\mathbb{T}\times\mathbb{R})$, there exist a positive $T = T(\|\phi\|_{H^{s,k}})$ ($\lim_{\rho\to 0}T(\rho)=\infty$) and a unique solution $u(t,x,y)$ of the initial value problem associated to the fifth order KP-I equation with data on $\mathbb{T}\times\mathbb{R}$ on the time interval $I=[-T,T]$ such that $u\in C(I,H^{s,k}(\mathbb{T}\times\mathbb{R}))\cap Z^{\frac12+,s,k}(I)$.

The proof of Theorem 3.1 results from the following fundamental estimate:
$$\|\partial_x(uv)\|_{Z^{-\frac12+,s,k}} \lesssim \|u\|_{Z^{\frac12+,s,k}}\|v\|_{Z^{\frac12+,s,k}},\qquad s\ge 1,\ k\ge 0. \tag{23}$$

3.1. Proof of (23). Due to Theorem 2.1 we obtain for $s>1/2$,
$$\|\partial_x(uv)\|_{X^{-\frac12+,s}} \lesssim \|u\|_{X^{\frac12+,s}}\|v\|_{X^{\frac12+,s}} \le \|u\|_{Z^{\frac12+,s,k}}\|v\|_{Z^{\frac12+,s,k}}.$$
Therefore the proof of (23) is reduced to estimating $\|\partial_x(uv)\|_{Y^{-\frac12+,k}}$ by $\|u\|_{Z^{\frac12+,s,k}}\|v\|_{Z^{\frac12+,s,k}}$.

Actually a stronger estimate holds. More precisely we have the following theorem.

Theorem 3.2. Let $s\ge 1$ and $k\ge 0$. Then
$$\|\partial_x(uv)\|_{Y^{-\frac12+,k}} \lesssim \|u\|_{X^{\frac12+,s}}\|v\|_{Y^{\frac12+,k}} + \|u\|_{Y^{\frac12+,k}}\|v\|_{X^{\frac12+,s}}.$$

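Proof of Theorem 3.2. Write
$$\|\partial_x(uv)\|_{Y^{-\frac12+,k}} = \Bigl(\sum_{m\neq 0}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}I^2(\tau,m,\eta)\,d\tau\,d\eta\Bigr)^{\frac12},$$
where
$$\begin{aligned}
I(\tau,m,\eta) &= |\eta|^k\Bigl\langle\tau-m^5-\frac{\eta^2}{m}\Bigr\rangle^{-\frac12+}\sum_{\substack{m_1\neq 0\\ m-m_1\neq 0}}\int_{\mathbb{R}^2}\hat u(\tau_1,m_1,\eta_1)\,\hat v(\tau-\tau_1,m-m_1,\eta-\eta_1)\,d\tau_1\,d\eta_1\\
&= |\eta|^k\Bigl\langle\tau-m^5-\frac{\eta^2}{m}\Bigr\rangle^{-\frac12+}\Bigl(\sum_{\substack{m_1\neq 0\\ m-m_1\neq 0}}\int_{|\eta|\le 2|\eta_1|}\cdots \;+\; \sum_{\substack{m_1\neq 0\\ m-m_1\neq 0}}\int_{|\eta|\ge 2|\eta_1|}\cdots\Bigr)\\
&:= I_1(\tau,m,\eta)+I_2(\tau,m,\eta).
\end{aligned}$$
Theorem 3.2 is a direct consequence of the next lemma.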


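Lemma 7. The following estimates hold:
$$\Bigl(\sum_{m\neq 0}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}I_1^2(\tau,m,\eta)\,d\tau\,d\eta\Bigr)^{\frac12} \lesssim \|u\|_{Y^{\frac12+,k}}\|v\|_{X^{\frac12+,s}}, \tag{24}$$
$$\Bigl(\sum_{m\neq 0}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}I_2^2(\tau,m,\eta)\,d\tau\,d\eta\Bigr)^{\frac12} \lesssim \|u\|_{X^{\frac12+,s}}\|v\|_{Y^{\frac12+,k}}. \tag{25}$$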

Proof of Lemma 7. Since $|\eta|\le 2|\eta_1|$ on the domain of the integral defining $I_1(\tau,m,\eta)$, a duality argument shows that in order to prove (24) we should bound the expression
$$\sum_{"_+}\frac{|m_1|\,|m-m_1|^{-s}\,f(\tau_1,m_1,\eta_1)\,g(\tau-\tau_1,m-m_1,\eta-\eta_1)\,h(\tau,m,\eta)}{\bigl\langle\tau-m^5-\frac{\eta^2}{m}\bigr\rangle^{\frac12-}\bigl\langle\tau_1-m_1^5-\frac{\eta_1^2}{m_1}\bigr\rangle^{\frac12+}\bigl\langle\tau-\tau_1-(m-m_1)^5-\frac{(\eta-\eta_1)^2}{m-m_1}\bigr\rangle^{\frac12+}} \tag{26}$$
by $c\,\|f\|_{L^2(\mathbb{R}\times\mathbb{Z}\times\mathbb{R})}\|g\|_{L^2(\mathbb{R}\times\mathbb{Z}\times\mathbb{R})}\|h\|_{L^2(\mathbb{R}\times\mathbb{Z}\times\mathbb{R})}$, where $f$, $g$, $h$ are positive $L^2$ functions.

Estimate for the contribution of $"_1$ to (26). Denote by $J_1$ the contribution of $"_1$ to the expression (26). Consider the dyadic levels $D^{K_1K_2}_{M_1M_2M}$, where $K_1, K_2, M_1, M_2, M$ are dyadic integers as in the proof of Theorem 2.1. Denote by $J_1^{K_1K_2M_1M_2M}$ the contribution of $D^{K_1K_2}_{M_1M_2M}$ to (26). Then
$$J_1 \lesssim \sum_{K_1,K_2,M_1,M_2,M\text{-dyadic}} J_1^{K_1K_2M_1M_2M}.$$

Define $f_{K_1M_1}(\tau,m,\eta)$, $g_{K_2M_2}(\tau,m,\eta)$ and $h_M(\tau,m,\eta)$ as in (8), (9), (10). Then clearly $J_1^{K_1K_2M_1M_2M}$ is bounded by
$$\sum_{"_1}\int_{*^{K_1K_2}_{M_1M_2M}}\frac{M_1\cdot f_{K_1M_1}(\tau_1,m_1,\eta_1)\,g_{K_2M_2}(\tau-\tau_1,m-m_1,\eta-\eta_1)\,h_M(\tau,m,\eta)}{M_2^{s}\cdot K_1^{\frac12+}K_2^{\frac12+}}\,d\tau\,d\tau_1,$$
where $*^{K_1K_2}_{M_1M_2M}\subset\mathbb{R}^4$ is defined as in the proof of Theorem 2.1. Moreover, similarly to the proof of Theorem 2.1 we obtain that
$$J_1^{K_1K_2M_1M_2M} \lesssim \frac{M_1M_2^{-s}}{K_1^{\frac12+}K_2^{\frac12+}}\sup_{(\tau,|m|\approx M,\eta)}|A_{\tau m\eta}|^{\frac12}\,\|f_{K_1M_1}\|_{L^2}\|g_{K_2M_2}\|_{L^2}\|h_M\|_{L^2},$$

where the set $A_{\tau m\eta}$ is defined as in (12). Again similarly to the proof of Theorem 2.1 we obtain via the triangle inequality that $|A_{\tau m\eta}|\lesssim (K_1\wedge K_2)|B_{\tau m\eta}|$, where $B_{\tau m\eta}\subset\mathbb{Z}\times\mathbb{R}$ is the set defined by (14). We shall estimate $|B_{\tau m\eta}|$ in a slightly different fashion compared to the proof of Theorem 2.1. The projection of $B_{\tau m\eta}$ on the $m_1$ axis is contained in a set


of cardinality at most $c(M_1\wedge M_2)$ since for $(m_1,\eta_1)\in B_{\tau m\eta}$ one has $|m_1|\approx M_1$ and $|m-m_1|\approx M_2$. Fix now $m_1$. Recall that for $(m,m_1,\eta,\eta_1)\in "_1$ one has
$$\Bigl|\frac{\partial}{\partial\eta_1}\Bigl(\tau-m_1^5-(m-m_1)^5-\frac{\eta_1^2}{m_1}-\frac{(\eta-\eta_1)^2}{m-m_1}\Bigr)\Bigr| \sim |m|\,|m^2-mm_1+m_1^2| \gtrsim |mm_1| \sim M_1M.$$
Hence, due to Lemma 2, the maximum cardinality of the sections of $B_{\tau m\eta}$ with lines parallel to the $\eta_1$ axis is bounded by $c\,\frac{K_1\vee K_2}{M_1M}$. Now using Lemma 1 we obtain that the cardinality of $B_{\tau m\eta}$ is bounded by $c(M_1\wedge M_2)(M_1M)^{-1}(K_1\vee K_2)$. Moreover
$$|A_{\tau m\eta}| \lesssim \frac{K_1K_2(M_1\wedge M_2)}{M_1M}.$$
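The lower bound on the $\eta_1$-derivative used above rests on elementary facts that can be checked mechanically. The Python sketch below is only a sanity check under stated assumptions: the closed form of the derivative in `d_phase_d_eta1` is derived for this example (it is not quoted from the paper), the helper names are hypothetical, and the integer ranges are arbitrary. It verifies, in exact rational arithmetic, that the quadratic form $m^2-mm_1+m_1^2$ dominates $\tfrac34 m_1^2$ and that $|m|\,(m^2-mm_1+m_1^2)\ge|mm_1|$ for nonzero integers.

```python
from fractions import Fraction


def phase(tau, m, m1, eta, eta1):
    """The function of eta1 that is differentiated in the text."""
    return (tau - m1**5 - (m - m1)**5
            - Fraction(eta1**2, m1) - Fraction((eta - eta1)**2, m - m1))


def d_phase_d_eta1(m, m1, eta, eta1):
    """Closed form of the eta1-derivative (derived for this sketch)."""
    return -2 * Fraction(eta1, m1) + 2 * Fraction(eta - eta1, m - m1)


checked = 0
for m in range(-6, 7):
    for m1 in range(-6, 7):
        if m == 0 or m1 == 0 or m == m1:
            continue  # the sums in the text exclude these modes
        q = m * m - m * m1 + m1 * m1
        assert 4 * q >= 3 * m1 * m1          # q >= (3/4) * m1**2
        assert abs(m) * q >= abs(m * m1)     # lower bound by |m * m1|
        # The phase is quadratic in eta1, so a central difference with
        # step 1 around eta1 = 1 reproduces the derivative exactly.
        central = (phase(0, m, m1, 3, 2) - phase(0, m, m1, 3, 0)) / 2
        assert central == d_phase_d_eta1(m, m1, 3, 1)
        checked += 1
print(checked)
```

Using `Fraction` avoids rounding, so the equality test between the central difference and the closed form is exact.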

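Hence
$$\begin{aligned}
J_1^{K_1K_2M_1M_2M} &\lesssim \frac{M_1^{\frac12}(M_1\wedge M_2)^{\frac12}}{M^{\frac12}M_2^{s}K_1^{0+}K_2^{0+}}\,\|f_{K_1M_1}\|_{L^2}\|g_{K_2M_2}\|_{L^2}\|h_M\|_{L^2}\\
&\lesssim \frac{M_1^{\frac12}}{M^{\frac12}M_2^{s-\frac12}K_1^{0+}K_2^{0+}}\,\|f_{K_1M_1}\|_{L^2}\|g_{K_2M_2}\|_{L^2}\|h_M\|_{L^2}\\
&\lesssim \frac{M_1^{\frac12}}{M^{\frac12}M_2^{\frac12}K_1^{0+}K_2^{0+}}\,\|f_{K_1M_1}\|_{L^2}\|g_{K_2M_2}\|_{L^2}\|h_M\|_{L^2},
\end{aligned}$$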

since $s\ge 1$. By the triangle inequality we have that $M_1\lesssim\max\{M,M_2\}$. Using a symmetry argument we can suppose that $M\ge M_1$ and therefore $M_1\lesssim M$. Let $M=2^lM_1$, where $l\in\mathbb{Z}$, $l\ge -l_0$ ($l_0$ is fixed, positive and independent of $M_1$). Then we have that
$$J_1^{K_1,K_2,M_1,M_2,2^lM_1} \lesssim \frac{1}{K_1^{0+}K_2^{0+}M_2^{\frac12}2^{\frac l2}}\,\|f_{K_1M_1}\|_{L^2}\|g_{K_2M_2}\|_{L^2}\|h_{2^lM_1}\|_{L^2}. \tag{27}$$

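It remains to sum (27) over $K_1, K_2, M_1, M_2, l$. First we can easily sum (27) over $K_1, K_2, M_2$:
$$\sum_{K_1,K_2,M_2\text{-dyadic}} J_1^{K_1,K_2,M_1,M_2,2^lM_1} \lesssim \frac{1}{2^{\frac l2}}\,\|f_{M_1}\|_{L^2}\|g\|_{L^2}\|h_{2^lM_1}\|_{L^2},$$
where
$$f_{M_1}(\tau,m,\eta) = \begin{cases} f(\tau,m,\eta), & \text{when } |m|\approx M_1,\\ 0, & \text{elsewhere.}\end{cases}$$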


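Next we sum over $M_1$ and $l$ via the Cauchy–Schwarz inequality:
$$\begin{aligned}
J_1 &\lesssim \sum_{l=-l_0}^{\infty}\ \sum_{K_1,K_2,M_1,M_2\text{-dyadic}} J_1^{K_1,K_2,M_1,M_2,2^lM_1}\\
&\lesssim \sum_{l=-l_0}^{\infty}\frac{1}{2^{\frac l2}}\Bigl(\sum_{M_1\text{-dyadic}}\|f_{M_1}\|_{L^2}^2\Bigr)^{1/2}\Bigl(\sum_{M_1\text{-dyadic}}\|h_{2^lM_1}\|_{L^2}^2\Bigr)^{1/2}\|g\|_{L^2}\\
&\lesssim \|f\|_{L^2}\|g\|_{L^2}\|h\|_{L^2}.
\end{aligned}$$

Estimate for the contribution of $"_2$ to (26). For $(m,m_1)\in "_+$ and $s\ge 1$ one has $|m_1|\,|m-m_1|^{-s}\lesssim|m|$. Hence the contribution of $"_2$ to the sum in the expression (26) is bounded by
$$\sum_{"_+}\frac{|m|\,f(\tau_1,m_1,\eta_1)\,g(\tau-\tau_1,m-m_1,\eta-\eta_1)\,h(\tau,m,\eta)}{\bigl\langle\tau-m^5-\frac{\eta^2}{m}\bigr\rangle^{\frac12-}\bigl\langle\tau_1-m_1^5-\frac{\eta_1^2}{m_1}\bigr\rangle^{\frac12+}\bigl\langle\tau-\tau_1-(m-m_1)^5-\frac{(\eta-\eta_1)^2}{m-m_1}\bigr\rangle^{\frac12+}}.$$
Now we remark that the above expression has the same nature as (6) with $s=0$. Hence we can use the arguments implemented above when estimating the contribution of $"_2$ to the expression (6). This completes the proof of (24).

When $|\eta|\ge 2|\eta_1|$ one has $|\eta|\le 2|\eta-\eta_1|$. Hence a duality argument shows that the proof of (25) is reduced to bounding the expression
$$\sum_{"_+}\frac{|m-m_1|\,|m_1|^{-s}\,f(\tau_1,m_1,\eta_1)\,g(\tau-\tau_1,m-m_1,\eta-\eta_1)\,h(\tau,m,\eta)}{\bigl\langle\tau-m^5-\frac{\eta^2}{m}\bigr\rangle^{\frac12-}\bigl\langle\tau_1-m_1^5-\frac{\eta_1^2}{m_1}\bigr\rangle^{\frac12+}\bigl\langle\tau-\tau_1-(m-m_1)^5-\frac{(\eta-\eta_1)^2}{m-m_1}\bigr\rangle^{\frac12+}} \tag{28}$$
by $c\,\|f\|_{L^2(\mathbb{R}\times\mathbb{Z}\times\mathbb{R})}\|g\|_{L^2(\mathbb{R}\times\mathbb{Z}\times\mathbb{R})}\|h\|_{L^2(\mathbb{R}\times\mathbb{Z}\times\mathbb{R})}$, where $f$, $g$, $h$ are positive $L^2$ functions. A symmetry argument ($m_1\to(m-m_1)$) shows that we can bound (28) similarly to (26). This completes the proof of Lemma 7.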
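The summation over $l$ in the estimate of $J_1$ in the proof of Lemma 7 works because the factor $2^{-l/2}$ is summable over $l\ge -l_0$ and because, for each fixed $l$, the pairing of the dyadic pieces is controlled by the Cauchy–Schwarz inequality. The Python sketch below illustrates both points on toy data; the function names and the sample sequences are hypothetical, and the lists `a` and `b` merely stand in for the sequences $\|f_{M_1}\|_{L^2}$ and $\|h_{2^lM_1}\|_{L^2}$ indexed by the dyadic exponent.

```python
import math


def dyadic_tail_sum(l0, terms=200):
    """Partial sum of 2**(-l/2) over l = -l0, ..., -l0 + terms - 1."""
    return sum(2.0 ** (-l / 2) for l in range(-l0, -l0 + terms))


def closed_form(l0):
    """Value of the full geometric series starting at l = -l0."""
    return 2.0 ** (l0 / 2) / (1.0 - 2.0 ** -0.5)


def shifted_pairing(a, b, shift):
    """Sum of a[j] * b[j + shift] over indices where both are defined."""
    return sum(a[j] * b[j + shift]
               for j in range(len(a)) if 0 <= j + shift < len(b))


def l2_norm(seq):
    return math.sqrt(sum(x * x for x in seq))


# Hypothetical nonnegative sequences standing in for the dyadic norms.
a = [1.0, 0.5, 2.0, 0.25, 3.0]
b = [0.3, 1.5, 0.7, 2.2, 0.1]

for shift in range(-3, 4):
    # Cauchy-Schwarz: the bound is uniform in the shift l.
    assert shifted_pairing(a, b, shift) <= l2_norm(a) * l2_norm(b) + 1e-12

print(dyadic_tail_sum(3), closed_form(3))
```

The bound on the shifted pairing does not depend on the shift, so the only dependence on $l$ left to sum is the geometric factor.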

3.2. The fixed point argument. In this section we perform a fixed point argument for the integral equation corresponding to the fifth order KP-I equation. This argument is standard since the linear estimates in the Fourier transform restriction method of J. Bourgain do not depend on the particular equation at hand. Write the fifth order KP-I equation as an integral equation
$$u(t) = U(t)\phi - \frac12\int_0^t U(t-t')\,\partial_x\bigl(u^2(t')\bigr)\,dt', \tag{29}$$


where $U(t)=\exp\bigl(t(\partial_x^5+\partial_x^{-1}\partial_y^2)\bigr)$ is the unitary group generating the solutions of the linear problem. We shall apply the contraction mapping principle to a cut-off version of (29). Let $\psi$ be a bump function such that $\psi\in C_0^\infty(\mathbb{R})$, $\operatorname{supp}\psi\subset[-2,2]$, $\psi=1$ on the interval $[-1,1]$. Consider the integral equation
$$u(t) = \psi(t)U(t)\phi - \frac12\,\psi(t/T)\int_0^t U(t-t')\,\partial_x\bigl(u^2(t')\bigr)\,dt'. \tag{30}$$

We shall solve (30) globally in time in the space $Z^{b,s,k}$. To the solutions of (30) correspond local solutions of the fifth order KP-I equation in the time interval $[-T,T]$ in the space $Z^{b,s,k}(I)$, where $I=[-T,T]$. Consider the nonlinear operator $L$ acting on $Z^{b,s,k}$ as
$$Lu := \psi(t)U(t)\phi - \frac12\,\psi(t/T)\int_0^t U(t-t')\,\partial_x\bigl(u^2(t')\bigr)\,dt'.$$
We claim that for small enough $T$ the operator $L$ is a contraction in the space $Z^{\frac12+,s,k}$ for any $\phi\in H^{s,k}(\mathbb{T}\times\mathbb{R})$. This will follow from the next estimates of the two terms in the right-hand side of (30).

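Lemma 8 (linear estimates). Let $-\frac12<b'\le 0\le b\le b'+1$, $s\ge 0$ and $k\ge 0$. Then the following inequalities hold:
$$\|\psi(t)U(t)\phi\|_{Z^{b,s,k}} \lesssim \|\phi\|_{H^{s,k}}, \tag{31}$$
$$\Bigl\|\psi(t/T)\int_0^t U(t-t')\,\partial_x\bigl(u^2(t')\bigr)\,dt'\Bigr\|_{Z^{b,s,k}} \lesssim T^{1-b+b'}\,\|uu_x\|_{Z^{b',s,k}}. \tag{32}$$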

We refer to [9] for the proof of (31) and (32) (and for a very clear introduction to Bourgain's method). These estimates are essentially one dimensional and do not depend on the unitary group $U(t)$. Now using (31), (32) and (23) we obtain that
$$\|Lu\|_{Z^{\frac12+,s,k}} \lesssim \|\phi\|_{H^{s,k}} + T^{0+}\|u\|^2_{Z^{\frac12+,s,k}},\qquad s\ge 1,\ k\ge 0.$$
Hence $L$ maps $Z^{\frac12+,s,k}$ into itself for $s\ge 1$, $k\ge 0$. In a similar way we obtain that
$$\|Lu-Lv\|_{Z^{\frac12+,s,k}} \lesssim T^{0+}\,\|u-v\|_{Z^{\frac12+,s,k}}\,\|u+v\|_{Z^{\frac12+,s,k}}.$$

Therefore $L$ is a contraction in $Z^{b,s,k}$ for a small $T$ of order $\|\phi\|_{H^{s,k}}^{-a}$ for some positive constant $a$. It remains to use the contraction mapping principle to solve (30) in $Z^{b,s,k}$. This implies the local well-posedness of (29) in $Z^{b,s,k}(I)$. The embedding of $Z^{b,s,k}(I)$ in $C(I,H^{s,k}(\mathbb{T}\times\mathbb{R}))$ follows from a one dimensional Sobolev inequality. This completes the proof of Theorem 3.1.
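The mechanism of the fixed point argument can be seen in a scalar toy model. The Python sketch below is not the operator $L$ of the paper: it replaces the Banach space by the real line and the Duhamel term by $\tfrac12 T u^2$, and the names `toy_map`, `picard_iterate` and `exact_fixed_point` are hypothetical. In this model the map $u\mapsto\phi-\tfrac12Tu^2$ sends the ball $|u|\le 2|\phi|$ into itself and contracts with factor at most $2T|\phi|$, so a time $T$ of order $|\phi|^{-1}$ suffices, which mirrors the smallness condition on $T$ in terms of the size of the data.

```python
import math


def toy_map(u, phi, T):
    """Scalar stand-in for the map u -> phi - (T/2) * u**2 (toy assumption)."""
    return phi - 0.5 * T * u * u


def picard_iterate(phi, T, steps=80):
    """Iterate the toy map starting from u = 0 and return the last iterate."""
    u = 0.0
    for _ in range(steps):
        u = toy_map(u, phi, T)
    return u


def exact_fixed_point(phi, T):
    """Root of u = phi - (T/2) * u**2 lying in the ball |u| <= 2 * |phi|."""
    return (-1.0 + math.sqrt(1.0 + 2.0 * T * phi)) / T


# Two data sizes with the same product T * phi, so the same contraction factor.
for phi, T in [(1.0, 0.25), (10.0, 0.025)]:
    assert 2.0 * T * abs(phi) < 1.0  # contraction condition of the toy model
    u = picard_iterate(phi, T)
    print(phi, T, u, exact_fixed_point(phi, T))
```

Larger data require a proportionally smaller $T$ to keep the same contraction factor, which is the toy analogue of the dependence of the existence time on the norm of the initial datum.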


4. Global Well-Posedness

In this section we extend globally in time the local solutions obtained in Theorem 3.1. This results from the energy conservation. More precisely, applying Theorem 3.1 with $s=2$ and $k=1$ we obtain a local solution $u$ of the fifth order KP-I equation on the time interval $[-T,T]$. The local well-posedness implies that the following alternative holds: either $\lim_{t\to T}\|u(t)\|_{H^{2,1}(\mathbb{T}\times\mathbb{R})}=\infty$ or $T=\infty$. Our goal is to show that the second cla