Communications in Mathematical Physics - Volume 221

Commun. Math. Phys. 221, 1 – 26 (2001) Communications in Mathematical Physics © Springer-Verlag 2001 Evolution of a ...

Author: M. Aizenman (Chief Editor)



Evolution of a Model Quantum System Under Time Periodic Forcing: Conditions for Complete Ionization

O. Costin, R. D. Costin, J. L. Lebowitz, A. Rokhlenko

Department of Mathematics, Rutgers University, Piscataway, NJ 08854-8019, USA

Received: 1 November 2000 / Accepted: 5 February 2001

Abstract: We analyze the time evolution of a one-dimensional quantum system with an attractive delta function potential whose strength is subjected to a time periodic (zero mean) parametric variation $\eta(t)$. We show that for generic $\eta(t)$, which includes the sum of any finite number of harmonics, the system, started in a bound state, will get fully ionized as $t \to \infty$. This is irrespective of the magnitude or frequency (resonant or not) of $\eta(t)$. There are however exceptional, very non-generic $\eta(t)$, that do not lead to full ionization, which include rather simple explicit periodic functions. For these $\eta(t)$ the system evolves to a nontrivial localized stationary state which is related to eigenfunctions of the Floquet operator.

1. Introduction and Results

We are interested in the qualitative long time behavior of a quantum system evolving under a time dependent Hamiltonian $H(t) = H_0 + H_1(t)$, i.e. in the nature of the solutions of the Schrödinger equation

\[ i\hbar\,\partial_t \psi = [H_0 + H_1(t)]\,\psi. \tag{1} \]

Here $\psi$ is the wavefunction of the system, belonging to some Hilbert space $\mathcal{H}$, $H_0$ and $H_1$ are Hermitian operators and Eq. (1) is to be solved subject to some initial condition $\psi_0$. Such questions about the solutions of (1) belong to what Simon [1] calls “second level foundation” problems of quantum mechanics. They are of particular practical interest for the ionization of atoms and/or dissociation of molecules, in the case when $H_0$ has both a discrete and a continuous spectrum corresponding respectively to spatially localized (bound) and scattering (free) states in $\mathbb{R}^d$. Starting at time zero with the system in a bound state and then “switching on” at $t = 0$ an external potential $H_1(t)$, we want to know the “probability of survival”, $P(t)$, of the bound states, at times $t > 0$:

\[ P(t) = \sum_j |\langle \psi(t), u_j \rangle|^2, \]

where the sum is over all the bound states $u_j$ [2–6, 8, 9].


This problem has been investigated both analytically and numerically for the case $H_1(t) = \eta(t)V_1(x)$ with $\eta(t) = r\sin(\omega t + \theta)$ and $V_1$ a time independent potential, $x \in \mathbb{R}^d$. When $\omega$ is sufficiently large for “one photon” ionization to take place, i.e., when $\hbar\omega > -E_0$, with $E_0$ the energy of the bound (e.g. ground) state of $H_0$, and $r$ is “small enough” for $H_1$ to be treated as a perturbation of $H_0$, then this is a problem discussed extensively in the literature ([8, 9]). Starting with the system in its ground state, the long time behavior of $P(t)$ is there asserted to be given by $P(t) \sim \exp[-\Gamma_F t]$. The rate constant $\Gamma_F$ is computed from first order perturbation theory according to Fermi’s golden rule. It is proportional to the square of the matrix element between the bound and free states, multiplied by the appropriate density of continuum states in the vicinity of the final state, which will have energy $\hbar\omega - E_0$ [6, 8–10].

Going from perturbation theory to an exponential decay involves heuristics based on deep physical insights requiring assumptions which seem very hard to prove. It is therefore very gratifying that many features of this scenario have been recently made mathematically rigorous by Soffer and Weinstein [6] (their analysis was generalized by Soffer and Costin [7]). They considered the case when $H_0 = -\nabla^2 + V_0(x)$, $x \in \mathbb{R}^3$, with $V_0$ compactly supported and such that there is exactly one bound state with energy $-\omega_0$ (from now on we use units in which $\hbar = 2m = 1$) and a continuum of quasi-energy states with energies $k^2$ for all $k \in \mathbb{R}^3$. The perturbing potential is $H_1(t) = r\cos(\omega t)V_1(x)$ with $V_1(x)$ also of compact support and satisfying some technical conditions. They then showed that for $\omega > \omega_0$ and $r$ small enough there is indeed an intermediate time regime where $P(t)$ has a dominant exponential form with the Fermi exponent $\Gamma_F$. This regime is followed for longer times by an inverse power law decay.

Some of these restrictions can presumably be relaxed, but the requirement that $r$ be small is crucial to their method, which is essentially perturbative. The behavior of $P(t)$ becomes much more difficult to analyze when the strength of $H_1(t)$ is not small and perturbation theory is no longer a useful guide. This became clear in the seventies with the beautiful experiments by Bayfield and Koch, cf. [11] for a review, on the ionization of highly excited Rydberg atoms (e.g. hydrogen) by intense microwave electric fields. These experiments showed quite unexpected nonlinear behavior of $P(t)$ as a function of the initial state, the field strength $E$ and the frequency $\omega$. These results, as well as other multiphoton ionizations of hydrogen atoms, have been (and continue to be) analyzed by various authors using a variety of methods. Prominent among these are semi-classical phase-space analysis, numerical integration of the Schrödinger equation, Floquet theory, complex dilation, etc. While the results obtained so far are not rigorous, they do give physical insights and quite good agreement with experiments, although many questions still remain open even on the physical level [11–15].

In addition to the above experiments on Rydberg atoms there are also many experiments which use strong laser fields to produce multiphoton ($\omega < -E_0$) ionization of multielectron atoms and/or dissociation of molecules [16, 17]. These systems are more complex than Rydberg atoms and their analysis is correspondingly less developed. One unexpected result of certain studies is that an increase in the intensity of the field may reduce the degree of ionization, i.e., $P(t)$ can be non-monotone in the field strength $E$ at large values of $E$. This phenomenon, which is often called “stabilization”, can be observed in some numerical simulations, analyzed rigorously in some models, and is claimed to have been seen experimentally, cf. [5] and [18–21].
It turns out that many features observed for Rydberg atoms and also stabilization are already present in a simple model system which we have recently begun to investigate analytically [22–24]. This somewhat surprising finding is based on comparisons between


experimental and model results described in detail in [23]. In fact the phenomenon of ionization by periodic fields is very complex indeed once one goes beyond the perturbative regime, even in the most simple model. This will become clear from the new results about this model presented here.

2. The Model

We consider a very simple quantum system where we can analyze rigorously many of the phenomena expected to occur in more realistic systems described by (1). This is a one dimensional system with an attractive delta function potential. The unperturbed Hamiltonian $H_0$ has, in suitable units, the form

\[ H_0 = -\frac{d^2}{dx^2} - 2\,\delta(x), \qquad -\infty < x < \infty. \tag{2} \]

The zero range (delta-function) attractive potential is much used in the literature to model short range attractive potentials [25–28]. It belongs, in one dimension, to the class $K_1$ [2]. $H_0$ has a single bound state $u_b(x) = e^{-|x|}$ with energy $-\omega_0 = -1$. It also has continuous uniform spectrum on the positive real line, with generalized eigenfunctions

\[ u(k, x) = \frac{1}{\sqrt{2\pi}} \left( e^{ikx} - \frac{1}{1 + i|k|}\, e^{i|kx|} \right), \qquad -\infty < k < \infty, \]

and energies $k^2$. Beginning at $t = 0$, we apply a parametric perturbing potential, i.e. for $t > 0$ we have

\[ H(t) = H_0 - 2\,\eta(t)\,\delta(x) \tag{3} \]

and solve the time dependent Schrödinger equation (1) for $\psi(x, t)$, with $\psi(x, 0) = \psi_0(x)$. Expanding $\psi$ in eigenstates of $H_0$ we write

\[ \psi(x, t) = \theta(t)\,u_b(x)\,e^{it} + \int_{-\infty}^{\infty} \Theta(k, t)\,u(k, x)\,e^{-ik^2 t}\,dk \qquad (t \ge 0) \tag{4} \]

with initial values $\theta(0) = \theta_0$, $\Theta(k, 0) = \Theta_0(k)$ suitably normalized,

\[ \langle \psi_0, \psi_0 \rangle = |\theta_0|^2 + \int_{-\infty}^{\infty} |\Theta_0(k)|^2\,dk = 1. \tag{5} \]
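The spectral data quoted for $H_0$ can be sanity-checked numerically. The sketch below is not taken from the paper: the function names, the step size `h` and the integration window are illustrative assumptions. It checks that $u_b$ and $u(k,x)$ satisfy the derivative jump $f'(0^+) - f'(0^-) = -2f(0)$ that a potential $-2\delta(x)$ imposes, and that $u_b$ has unit norm, as the normalization (5) requires when $\psi_0 = u_b$.

```python
import cmath
import math

def u_bound(x):
    """Bound state u_b(x) = exp(-|x|) of H0 = -d^2/dx^2 - 2 delta(x)."""
    return math.exp(-abs(x))

def u_scatt(k, x):
    """Generalized eigenfunction u(k, x) as quoted in the text."""
    return (cmath.exp(1j * k * x)
            - cmath.exp(1j * abs(k * x)) / (1 + 1j * abs(k))) / math.sqrt(2 * math.pi)

def derivative_jump(f, h=1e-6):
    """One-sided finite-difference estimate of f'(0+) - f'(0-)."""
    right = (f(h) - f(0.0)) / h
    left = (f(0.0) - f(-h)) / h
    return right - left

def jump_defect(f):
    """The potential -2*delta(x) requires f'(0+) - f'(0-) = -2 f(0); return the mismatch."""
    return abs(derivative_jump(f) + 2 * f(0.0))

def bound_state_norm(n=200000, half_width=20.0):
    """Midpoint-rule estimate of the integral of |u_b|^2 over [-half_width, half_width]."""
    dx = 2 * half_width / n
    return sum(u_bound(-half_width + (i + 0.5) * dx) ** 2 for i in range(n)) * dx
```

The finite-difference step introduces an error of order `h`, so the defects are expected to be small but not zero.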

We then have that the survival probability of the bound state is $P(t) = |\theta(t)|^2$, while $|\Theta(k, t)|^2\,dk$ gives the “fraction of ejected particles” with (quasi-) momentum in the interval $dk$. This problem can be reduced to the solution of an integral equation in a single variable [22, 23]. Setting

\[ Y(t) = \psi(x = 0, t)\,\eta(t)\,e^{it} \tag{6} \]



we have

\[ \theta(t) = \theta_0 + 2i \int_0^t Y(s)\,ds, \tag{7} \]

\[ \Theta(k, t) = \Theta_0(k) + \frac{2|k|}{\sqrt{2\pi}\,(1 - i|k|)} \int_0^t Y(s)\,e^{i(1 + k^2)s}\,ds. \tag{8} \]

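$Y(t)$ satisfies the integral equation

\[ Y(t) = \eta(t) \left\{ I(t) + \int_0^t [2i + M(t - t')]\,Y(t')\,dt' \right\} = \eta(t) \left\{ I(t) + (2i + M) * Y \right\}, \tag{9} \]

where the inhomogeneous term is

\[ I(t) = \theta_0 + \frac{i}{\sqrt{2\pi}} \int_0^{\infty} \frac{\Theta_0(k) + \Theta_0(-k)}{1 + ik}\, e^{-i(k^2 + 1)t}\,dk, \]

and

\[ M(s) = \frac{2i}{\pi} \int_0^{\infty} \frac{u^2\, e^{-is(1 + u^2)}}{1 + u^2}\,du = \frac{1 + i}{2\sqrt{2\pi}} \int_s^{\infty} \frac{e^{-iu}}{u^{3/2}}\,du, \]

with

\[ f * g = \int_0^t f(s)\,g(t - s)\,ds. \]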
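The equality of the two expressions for $M(s)$ is stated without derivation. One way to check it, sketched here under the assumption that differentiating under the integral sign is justified by a convergence factor (replace $s$ by $s - i\varepsilon$ and let $\varepsilon \to 0^+$), is to compare $s$-derivatives and use $\int_0^\infty u^2 e^{-au^2}\,du = \tfrac{\sqrt{\pi}}{4}\,a^{-3/2}$ with $a = is$ and $(is)^{-3/2} = -\tfrac{1+i}{\sqrt{2}}\,s^{-3/2}$:

```latex
\begin{align*}
\frac{d}{ds}\,\frac{2i}{\pi}\int_0^\infty \frac{u^2 e^{-is(1+u^2)}}{1+u^2}\,du
 &= \frac{2}{\pi}\,e^{-is}\int_0^\infty u^2 e^{-isu^2}\,du
  = \frac{2}{\pi}\,e^{-is}\,\frac{\sqrt{\pi}}{4}\,(is)^{-3/2} \\
 &= -\frac{1+i}{2\sqrt{2\pi}}\,\frac{e^{-is}}{s^{3/2}}
  = \frac{d}{ds}\,\frac{1+i}{2\sqrt{2\pi}}\int_s^\infty \frac{e^{-iu}}{u^{3/2}}\,du .
\end{align*}
```

Both expressions tend to zero as $s \to \infty$, so equal derivatives give equal functions. The second form also shows that $M(s)$ behaves like a multiple of $s^{-1/2}$ near $s = 0$, an integrable singularity.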

In our previous works we considered the case where $\Theta_0(k) = 0$ and $\eta(t)$ is a finite sum of harmonics with period $2\pi\omega^{-1}$. In particular, we showed in [23] how to compute the survival probability $P(t)$ as a function of the strength $r$ and frequency $\omega$ when $\eta(t) = r\sin\omega t$. Here we study the general periodic case and write

\[ \eta = \sum_{j=0}^{\infty} \left( C_j e^{i\omega j t} + C_{-j} e^{-i\omega j t} \right). \]

Our assumptions on the $C_j$ are

(a) $0 \not\equiv \eta \in L^{\infty}(\mathbb{T})$,
(b) $C_0 = 0$,
(c) $C_{-j} = \overline{C_j}$.

Genericity condition (g). Consider the right shift operator $T$ on $l_2(\mathbb{N})$ given by

\[ T(C_1, C_2, \ldots, C_n, \ldots) = (C_2, C_3, \ldots, C_{n+1}, \ldots). \]

We say that $C \in l_2(\mathbb{N})$ is generic with respect to $T$ if the Hilbert space generated by all the translates of $C$ contains the vector $e_1 = (1, 0, 0, \ldots)$ (which is the kernel of $T$):

\[ e_1 \in \bigvee_{n=0}^{\infty} T^n C \tag{10} \]

(where the right side of (10) denotes the closure of the space generated by the $T^n C$ with $n \ge 0$). This condition is generically satisfied, and is obviously weaker than the



“cyclicity” condition $l_2(\mathbb{N}) \ominus \bigvee_{n=0}^{\infty} T^n C = \{0\}$, which is also generic [29] (Appendix B discusses in more detail the rather subtle cyclicity condition). An important case, which satisfies (10) (but fails the cyclicity condition), corresponds to $\eta$ being a trigonometric polynomial, namely $C \not\equiv 0$ but $C_n = 0$ for all large enough $n$. (We can in fact replace $e_1$ in (10) by $e_k$ with any $k \ge 1$.) A simple example which fails (10) is

\[ \eta(t) = 2r\lambda\, \frac{\lambda - \cos(\omega t)}{1 + \lambda^2 - 2\lambda\cos(\omega t)} \tag{11} \]

for some $\lambda \in (0, 1)$, for which $C_n = -r\lambda^n$ for $n \ge 1$. In this case the space generated by $T^n C$ is one-dimensional. We will prove that there are values of $r$ and $\lambda$ for which the ionization is incomplete, i.e. $\theta(t)$ does not go to zero for large $t$.

3. Results and Remarks

Theorem 1. Under assumptions (a) . . . (c) and (g), the survival probability $P(t)$ of the bound state $u_b$, $|\theta(t)|^2$, tends to zero as $t \to \infty$.

Theorem 2. For $\psi_0(x) = u_b(x)$ there exist values of $\lambda$, $\omega$ and $r$ in (11) for which $|\theta(t)| \not\to 0$ as $t \to \infty$.

Remarks.

1. Theorem 1 can be extended to show that $\int_D |\psi(x, t)|^2\,dx \to 0$ for any compact interval $D \subset \mathbb{R}$. This means that the initially localized particle really wanders off to infinity, since by unitarity of the evolution $\int_{\mathbb{R}} |\psi(x, t)|^2\,dx = 1$. Theorem 2 can be extended to show that for some fixed $r$ and $\omega$ in (11) there are infinitely many $\lambda$, accumulating at 1, for which $\theta(t) \not\to 0$. In these cases, it can also be shown that for large $t$, $\theta$ approaches a quasiperiodic function.

2. While Theorem 1 holds for arbitrary $\psi_0$, care has to be taken with the initial conditions for Theorem 2. In particular we cannot have an initial state such that in (9) $I(t) = 0$ for all $t$. This would occur, for example, if $\psi_0(x)$ is an odd function of $x$. In that case the evolution takes place as if the particle was entirely free, never feeling the delta function potential. There may also be other special $\psi_0$ for which $\theta_0 \ne 0$ but for which $\theta(t) \to 0$ as $t \to \infty$. We have therefore stated Theorem 2 for the case $\psi_0 = u_b$. We shall also, for simplicity, use this choice of $\psi_0$ in the proofs of Theorem 1. For this case, which is natural from the physical point of view, $I(t) = 1$ in (9). The extension to general $\psi_0$ is immediate and is given at the end of Sect. 5.

3. In [23] we gave a detailed picture of how the decay of $\theta(t)$ depends on $r$ and $\omega$ when $\eta(t) = r\sin(\omega t)$, $\theta_0 = 1$. For small $r$ and $\omega^{-1}$ not too close to an integer we get an exponential decay with a decay rate $\Gamma(r, \omega) \sim r^{2(1 + \lfloor \omega^{-1} \rfloor)}$, where $\lfloor \omega^{-1} \rfloor$ is the integer part of $\omega^{-1}$. (For $\omega > 1$, this corresponds to $\Gamma \sim \Gamma_F$.) At times large compared to $\Gamma^{-1}$, $|\theta(t)|$ decays as $t^{-3/2}$. The picture becomes much more complicated when $r$ is large and/or $\omega^{-1}$ is an integer. In particular there is no monotonicity in $|\theta(t)|$ as a function of $r$. In [24] we proved complete ionization for the case where $C_n = 0$ for $n > N$, $N \ge 1$.

4. We note here that Pillet [3] proved complete ionization for quite general $H_0$ under the assumption that $H_1(t)$ is “very random”, in fact a Markov process. Our results are not only consistent with this but support the expectation that generic perturbations will lead to complete ionization for general $H_0$. This is what we expect from entropic considerations: there is just too much phase space “out there”. The surprising thing is that even for our simple example one can readily find exceptions to the rule.
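Two facts quoted for the nongeneric example (11) are easy to test numerically: that its Fourier coefficients are $C_n = -r\lambda^n$ for $n \ge 1$ (with $C_0 = 0$), and that the translates $T^n C$ span a one-dimensional space because $TC = \lambda C$. The following sketch is an illustration only; the parameter values, the sample count, and the finite truncations of the sequences are assumptions made here, not choices from the paper. The last lines contrast this with a trigonometric polynomial (hypothetical coefficients), where a suitable translate is a multiple of $e_1$, so condition (10) holds.

```python
import cmath
import math

def eta(t, r, lam, omega):
    """The forcing of Eq. (11)."""
    c = math.cos(omega * t)
    return 2 * r * lam * (lam - c) / (1 + lam * lam - 2 * lam * c)

def fourier_coefficient(n, r, lam, omega, samples=4096):
    """Approximate C_n = (1/T) * integral_0^T eta(t) exp(-i n omega t) dt, T = 2 pi / omega,
    with an equispaced rectangle rule (very accurate for smooth periodic integrands)."""
    period = 2 * math.pi / omega
    total = 0j
    for m in range(samples):
        t = period * m / samples
        total += eta(t, r, lam, omega) * cmath.exp(-1j * n * omega * t)
    return total / samples

def shift(seq):
    """Truncated version of the operator T in condition (g): drop the first entry."""
    return seq[1:]

r, lam, omega = 0.8, 0.5, 1.3
coeffs = [fourier_coefficient(n, r, lam, omega) for n in range(0, 6)]

# Geometric coefficients: T C = lam * C, so every translate is a multiple of C.
geometric = [-r * lam ** n for n in range(1, 40)]
ratio_defect = max(abs(a - lam * b) for a, b in zip(shift(geometric), geometric))

# Trigonometric polynomial with hypothetical coefficients: T^2 C = C_3 * e_1.
poly = [0.3, -0.2, 0.7, 0.0, 0.0, 0.0]
twice_shifted = shift(shift(poly))
```

A finite truncation cannot decide genericity in $l_2(\mathbb{N})$; the code only illustrates the two mechanisms described in the text.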



We should also mention here the work of Martin et al. [31, 32], who consider the case where $H_0$ has an isolated eigenvalue $E_0$ plus an absolutely continuous spectrum in the interval $[0, E_{\max}]$. They show that if the frequency $\omega$ of the periodic, small, perturbation $H_1(t)$ is larger than $E_0$ then the bound state is stable. This can be understood in terms of Fermi’s golden rule by noting that the density of states at the energy $E_0 + \omega > E_{\max}$ is zero, so that $\Gamma_F$ would be zero.

5. There is a direct connection between our results and Floquet theory where, for a time-periodic Hamiltonian $H(t)$ with period $T = 2\pi/\omega$, one constructs a quasienergy operator (QEO) [2, 33, 34]

\[ K = -i\frac{\partial}{\partial\theta} + H(\theta). \]

$K$ acts on functions of $x$ and $\theta$, periodic in $\theta$, i.e. on the extended Hilbert space $\mathcal{H} \otimes L^2(S, T^{-1}d\theta)$. Let now $\phi(x, \theta)$ be an eigenfunction satisfying

\[ K\phi = \mu\phi, \qquad \phi(x, \theta + T) = \phi(x, \theta); \tag{12} \]

then

\[ \psi(x, t) = e^{-i\mu t}\phi(x, t) \]

is a solution of the Schrödinger equation $i\,\partial\psi/\partial t = H(t)\psi$. The existence of a real eigenvalue $\mu$ of the QEO with an associated $\phi(x, \theta) \in L^2(\mathbb{R}^d \otimes S)$ is thus seen to imply the existence of a solution of the time-dependent Schrödinger equation which is, in absolute value, periodic. This shows that for appropriate initial conditions, the particle has a nonvanishing probability of staying in a compact domain and thus, for the case considered here, that ionization is incomplete. We also note that for each such $\mu$ there is actually a whole set $\mu_n = \mu + n\omega$ of eigenvalues of $K$. For the specific model considered here, (12) takes the form

\[ K\phi = -\frac{\partial^2\phi(x, \theta)}{\partial x^2} - 2(1 + \eta(\theta))\,\delta(x)\,\phi - i\frac{\partial\phi}{\partial\theta} = \mu\phi. \tag{13} \]
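The step from the eigenvalue problem (12) to a solution of the Schrödinger equation is a one-line computation. Written out, with $\theta$ set equal to $t$ after differentiating and using only the definition $K = -i\,\partial_\theta + H(\theta)$:

```latex
\begin{align*}
i\,\partial_t \psi(x,t)
 &= e^{-i\mu t}\bigl[\mu\,\phi(x,\theta) + i\,\partial_\theta\phi(x,\theta)\bigr]_{\theta=t} \\
 &= e^{-i\mu t}\bigl[K\phi + i\,\partial_\theta\phi\bigr]_{\theta=t}
  = e^{-i\mu t}\,H(t)\,\phi(x,t) = H(t)\,\psi(x,t).
\end{align*}
```

Since $\mu$ is real, $|\psi(x,t)| = |\phi(x,t)|$, which is periodic in $t$. Replacing $\phi$ by $e^{in\omega\theta}\phi$ shifts the eigenvalue by $n\omega$, which is the family $\mu_n = \mu + n\omega$ mentioned in the text.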

We can now look for solutions of (13) in the form

\[ \phi_\mu(x, \theta) = \sum_{n \in \mathbb{Z}} y_n\, e^{in\omega\theta}\, e^{\alpha_n x} \]

with $\alpha_n^{\pm} = \pm\sqrt{\mu - n\omega}$. Such a solution is in $L^2$ only if $\Re(\alpha_n x) < 0$, a condition which obviously selects different roots $\alpha_n^{\pm}$ depending on whether $x > 0$ or $x < 0$. The requirement that $\phi_\mu$ be in $L^2(\mathbb{R})$ leads to a set of matching conditions which determine whether such eigenvalues $\mu$ can exist. It is easy to see that $\phi_\mu$ has to be continuous at zero and satisfy the condition

\[ \partial_x\phi_\mu(0^-, \theta) - \partial_x\phi_\mu(0^+, \theta) = 2(1 + \eta(\theta))\,\phi_\mu(0, \theta). \]

This implies, after taking the Fourier coefficients of both sides of the above equality, the recurrence relation

\[ y_n\,(2 - \alpha_n^{+} + \alpha_n^{-}) = 2\sum_{j \ne 0} C_j\, y_{n-j} \tag{14} \]



for which a (nontrivial) solution $y_n \in l^2$ is sought. This is effectively the same equation as (20) below, which is at the core of our analysis. Complete ionization thus corresponds to the absence of a discrete spectrum of the QEO operator, and conversely stabilization implies the existence of such a discrete spectrum. In fact, an extension of Theorem 2 shows that for the initial condition $\psi_0 = u_b$, $\psi_t$ approaches such a function with $\mu = -s_0$. More details about Floquet theory and stability can be found in [33, 34].

6. We are currently investigating extensions of our results to the case where $H_0 = -\nabla^2 + V_0(x)$, $x \in \mathbb{R}^d$, has a finite number of bound states and the perturbation is of the form $\eta(t)V_1(x)$ and both $V_0$ and $V_1$ have compact support. Preliminary results indicate that, with much labor, we shall be able to generalize Theorem 1 to generic $V_1(x)$. The definition of genericity will, however, depend strongly on $V_0$. The physically important case of an external electric dipole field, $V_1(x) = -Ex$, can be transformed into the solution of a Schrödinger equation of the form $H(t) = -\nabla^2 + V_0(x - g(t))$, see [2]. This should, in principle, also be amenable to our methods, but so far we have no results for that case.

Outline of the technical strategy. The method of proof relies on the properties of the Laplace transform of $Y$,

\[ y(p) = \mathcal{L}Y(p) = \int_0^{\infty} e^{-pt}\,Y(t)\,dt. \]

Since the time evolution of $\psi$ is unitary, $|\theta(t)| \le 1$. This gives some a priori control on $Y$. For our purposes however it is useful to characterize directly the solution of the convolution equation (9). (We restrict ourselves to $\Theta_0(k) = 0$ and $I(t) = 1$ there.) We show that this equation has a unique solution in suitable norms. This solution is Laplace transformable and the Laplace transform $y$ satisfies a linear functional equation.

The solution of the functional equation satisfied by the transform of $Y$ is unique in the right half plane provided it satisfies the additional property that $y(p_0 + is)$ is square integrable in $s$ for any $p_0 > 0$. Any such solution $y$ transforms back (by the standard properties of the inverse Laplace transform) into a solution of our integral equation with no faster than exponential growth; however there is a unique locally integrable solution of this equation, and this solution is exponentially bounded. This must thus be our $Y$. We can thus use the functional equation to determine the analytic properties of $y(p)$. This is done using (appropriately refined versions of) the Fredholm alternative. After some transformations, the functional equation reduces to a linear inhomogeneous recurrence equation in $l_2$, involving a compact operator depending parametrically on $p$, see e.g. (17). The dependence is analytic except for a finite set of poles and square-root branch-points on the imaginary axis, and we show that the associated homogeneous equation has no nontrivial solution. We then show that the poles in the coefficients do not create poles of $y$, while the branch points are inherited by $y$. The decay of $y(p)$ when $|\Im(p)| \to \infty$, and the degree of regularity on the imaginary axis, give us the needed information about the decay of $Y(t)$ for large $t$.

4. Behavior of $y(p)$ in the Open Right Half Plane $\mathbb{H}$

Lemma 3. (i) Equation (9) has a unique solution $Y \in L^1_{\mathrm{loc}}(\mathbb{R}^+)$ and $|Y(t)| < Ke^{Bt}$ for some $K, B \in \mathbb{R}$. (ii) The function $y(p) = \mathcal{L}Y$ exists and is analytic in $\mathbb{H}_B = \{p : \Re(p) > B\}$. (iii) In $\mathbb{H}_B$, the function $y(p)$ satisfies the functional equation

\[ y = \sum_{j=-\infty}^{\infty} C_j\, T^j (h + by) \tag{15} \]



with

\[ (Tf)(p) = f(p + i\omega), \qquad h(p) = -p^{-1} \qquad \text{and} \qquad b(p) = -\frac{i}{p}\left( 1 + \sqrt{1 - ip} \right). \]

The branch of the square root is such that for $p \in \mathbb{H} = \{p : \Re(p) > 0\}$, the real part of $\sqrt{1 - ip}$ is nonnegative and the imaginary part nonpositive.

The straightforward proofs of this lemma are done in Appendix A. (Some of the results can also be gotten directly from standard results on the Schrödinger operators and on integral equations.)

Remark 4. It is clear that the functional equation (15) only links points on the one dimensional lattice $\{p + i\mathbb{Z}\omega\}$. It is convenient to take $p_0$ such that

\[ p = p_0 + in\omega \quad \text{with } \Re(p_0) = \Re(p) \text{ and } \Im(p_0) \in [0, \omega). \tag{16} \]
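The decomposition (16) is just a floor division of the imaginary part by $\omega$. A minimal sketch (the function name is hypothetical, and floating-point rounding at the edges of the interval is not handled):

```python
import math

def split_on_lattice(p, omega):
    """Return (p0, n) with p = p0 + i*n*omega, Re(p0) = Re(p) and Im(p0) in [0, omega).

    Rounding can push Im(p0) marginally outside the interval when Im(p) is
    extremely close to a multiple of omega; this sketch does not correct for that.
    """
    n = math.floor(p.imag / omega)
    p0 = complex(p.real, p.imag - n * omega)
    return p0, n
```

For example, with $\omega = 1.5$ the point $p = 0.3 + 5i$ sits at $n = 3$ above $p_0 = 0.3 + 0.5i$, and $p = 0.3 - 0.2i$ sits at $n = -1$ above $p_0 = 0.3 + 1.3i$.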

The functions $y$, $h$, $b$ in (15) will now depend parametrically on $p_0$. We set $y = \{y_j\}_{j \in \mathbb{Z}}$, $h = \{h_j\}_{j \in \mathbb{Z}}$, $b = \{b_j\}_{j \in \mathbb{Z}}$ with $y_n = y(p_0 + in\omega) = y(p)$ (and similarly for $h(p)$ and $b(p)$). It is convenient to define the operator $(\hat{H}y)_n = b_n y_n$. Let $(Ty)_n = y_{n+1}$ be the right shift on $l_2(\mathbb{Z})$ (which we denote for simplicity by $l_2$) and rewrite (15) as

\[ y = \sum_{j=-\infty}^{\infty} C_j\, T^j h + \sum_{j=-\infty}^{\infty} C_j\, T^j \hat{H} y \equiv f + Jy. \tag{17} \]

Proposition 5. For $\Re(p_0) > 0$ there exists a unique solution of (17) in $l_2$. This solution is analytic in $p_0$, $\Re(p_0) > 0$. Thus $y(p)$ is analytic in $p \in \mathbb{H}$ and inverse Laplace transformable there with $\mathcal{L}^{-1}(y) = Y$.

Proof. The proof uses the Fredholm alternative. We first prove the following results.

Lemma 6. The operator $J$ is compact on $l_2$ if $p_0 \ne 0$.

Proof. The proof uses standard compact operator results, see e.g. [30]. First note that the operator $\hat{H}$ is compact. This is straightforward: since $b_j \to 0$ as $j \to \infty$, it follows that $\hat{H}$ is the norm limit as $N \to \infty$ of the finite rank operators defined by $(\hat{H}_N y)_j = b_j y_j$ for $|j| \le N$ and $(\hat{H}_N y)_j = 0$ otherwise, and thus is compact. The operator $J$ is the composition between the “convolution” operator $C$ given by $(Cv)_n := (C * v)_n := \sum_{j \in \mathbb{Z}} C_j v_{n+j}$, which is continuous on $l_2$, and the compact operator $\hat{H}$. Thus $J$ is compact. □

Remarks. 1. Note that $f \in l_2$ if $p_0 \ne 0$ (a straightforward consequence of the fact that $C$ and $h$ in (17) are in $l_2$). 2. The operator $J$ is analytic in $p_0$, except for $p_0 = 0$, where the coefficients have poles, and for an additional value on the imaginary axis (possibly also 0), where the coefficients have square root branch points.



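Remark 7. Setting, for $p_0 \ne 0$,

\[ y_l = \left( \sqrt{1 - i(p_0 + il\omega)} - 1 \right) z_l, \tag{18} \]

the homogeneous equation

\[ y = Jy \tag{19} \]

clearly has a (nontrivial) $l_2$ solution $y$ only if

\[ \left( \sqrt{1 - ip_0 + l\omega} - 1 \right) z_l = -\sum_{k=1}^{\infty} \left( C_k\, z_{l+k} + \overline{C_k}\, z_{l-k} \right) \tag{20} \]

has a (nontrivial) $l_2$ solution $z$ with

\[ \left\{ \left( \sqrt{1 - ip_0 + j\omega} - 1 \right) z_j \right\}_{j \in \mathbb{Z}} \in l_2. \tag{21} \]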

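Lemma 8. For any $\eta$ under assumptions (a) to (c), if $p_0 \in \mathbb{H}$ there is no nonzero $l_2$ solution of (20) such that (21) holds.

Proof. To get a contradiction, assume $z \in l_2$, $z \not\equiv 0$, satisfying (21), is a solution of (20). Multiplying (20) by $\overline{z_l}$, and summing with respect to $l$ from $-\infty$ to $+\infty$, we get

\[
\begin{aligned}
\sum_{l=-\infty}^{\infty} \left( \sqrt{1 - ip_0 + l\omega} - 1 \right) |z_l|^2
 &= -\sum_{l=-\infty}^{\infty} \sum_{k=1}^{\infty} \left( C_k\, z_{l+k}\, \overline{z_l} + \overline{C_k}\, z_{l-k}\, \overline{z_l} \right) \\
 &= -\sum_{l=-\infty}^{\infty} \sum_{k=1}^{\infty} \left( C_k\, z_l\, \overline{z_{l-k}} + \overline{C_k}\, z_{l-k}\, \overline{z_l} \right) \\
 &= -\sum_{l=-\infty}^{\infty} \sum_{k=1}^{\infty} 2\,\Re\!\left( C_k\, z_l\, \overline{z_{l-k}} \right).
\end{aligned}
\tag{22}
\]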
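The two structural facts behind this argument can be seen on finitely supported data: the double sum on the right of (22) is real for any $C$ and $z$, while each term on the left has negative imaginary part when $\Re(p_0) > 0$. The sketch below uses random test sequences (an assumption for illustration; they are not solutions of (20), so the two sides are not expected to agree) and relies on Python's principal square root, which has nonnegative real part and therefore matches the branch fixed in Lemma 3 whenever $\Re(p_0) > 0$.

```python
import cmath
import random

def rhs_of_22(C, z):
    """-sum_l sum_{k>=1} (C_k z_{l+k} conj(z_l) + conj(C_k) z_{l-k} conj(z_l)).

    C maps k >= 1 to C_k and z maps l to z_l; missing entries count as zero.
    """
    total = 0j
    for l, zl in z.items():
        for k, Ck in C.items():
            total += Ck * z.get(l + k, 0j) * zl.conjugate()
            total += Ck.conjugate() * z.get(l - k, 0j) * zl.conjugate()
    return -total

def lhs_of_22(z, p0, omega):
    """sum_l (sqrt(1 - i p0 + l omega) - 1) |z_l|^2 with the principal square root."""
    return sum((cmath.sqrt(1 - 1j * p0 + l * omega) - 1) * abs(zl) ** 2
               for l, zl in z.items())

random.seed(0)
C = {k: complex(random.uniform(-1, 1), random.uniform(-1, 1)) for k in range(1, 5)}
z = {l: complex(random.uniform(-1, 1), random.uniform(-1, 1)) for l in range(-8, 9)}
rhs = rhs_of_22(C, z)
lhs = lhs_of_22(z, p0=0.3 + 0.2j, omega=1.1)
```

Here `rhs.imag` vanishes up to rounding and `lhs.imag` is strictly negative, which is the incompatibility the proof exploits.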

If $p_0 \in \mathbb{H}$ the imaginary part of $\sqrt{1 - ip_0 + l\omega}$ is negative (see Remark 24) and thus, if some $z_l$ is nonzero, then the left side of (22) has strictly negative imaginary part, which is impossible since the right side is real. □

Proof of Proposition 5. The existence of the analytic solution follows now immediately from the analytic Fredholm alternative and the analyticity of the coefficients, for $p_0 \in \mathbb{H}$. The fact that $\{y_n\} \in l_2$ together with the stated analyticity imply that the function $\mathcal{L}^{-1}y(p)$ exists and satisfies the integral equation of $Y$, and thus coincides with $Y$. □

5. Behavior of $y(p)$ in the Neighborhood of $\Re(p) = 0$ in the Generic Case

Discussion of methods. We start again from relation (17). This has the form

\[ y_n = i\sum_j \frac{C_j}{-ip_0 + (n + j)\omega} - \sum_j C_j\, q_{n+j}\, y_{n+j}, \qquad C_0 = 0, \tag{23} \]



where

\[ q_n = \frac{1 + \sqrt{1 - ip_0 + n\omega}}{-ip_0 + n\omega}. \tag{24} \]
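Because $(\sqrt{1+s} - 1)(\sqrt{1+s} + 1) = s$, the coefficient (24) has the closed-form inverse $q_n^{-1} = \sqrt{1 - ip_0 + n\omega} - 1$, which is the expression that appears in (18) and (20). A short numerical check follows; the function names and sample values are illustrative assumptions, with `s0` standing for $-ip_0$. Python's `cmath.sqrt` is the principal branch, which agrees with the paper's branch for $\Re(p_0) > 0$; on the boundary $\Re(p_0) = 0$ with negative radicand the two may differ in sign, so the check stays inside the right half plane.

```python
import cmath

def q(n, s0, omega):
    """q_n from Eq. (24), written in terms of s0 = -i*p0."""
    s = s0 + n * omega
    return (1 + cmath.sqrt(1 + s)) / s

def q_inverse(n, s0, omega):
    """Closed form 1/q_n = sqrt(1 + s0 + n*omega) - 1."""
    return cmath.sqrt(1 + s0 + n * omega) - 1

p0 = 0.2 + 0.4j          # a sample point with Re(p0) > 0
s0 = -1j * p0            # equals 0.4 - 0.2j
omega = 1.1
defects = [abs(q(n, s0, omega) * q_inverse(n, s0, omega) - 1) for n in range(-5, 6)]
imag_parts = [q_inverse(n, s0, omega).imag for n in range(-5, 6)]
```

All entries of `defects` are at rounding level, and all entries of `imag_parts` are negative, consistent with the sign used in the proof of Lemma 8.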

As the imaginary axis $\Re(p_0) = 0$ is approached, two types of potential singularities in the coefficients need attention: the poles in the coefficients due to the presence of $p^{-1}$, and the square root singularities. It will turn out that by cancellation effects, the poles play no role, generically. The square root singularities will be manifested in the solution $y$. The study of these questions requires further regularization of the functional Eq. (23). It is convenient to separate out the terms in (23) which are singular at $p_0 = 0$. Using (from now on) the notation $s_0 = -ip_0$ we have

\[
\begin{aligned}
y_n &= i\,\frac{C_{-n}}{s_0} - \frac{C_{-n}\,(1 + \sqrt{1 + s_0})}{s_0}\, y_0 + i\sum_{j \ne -n} \frac{C_j}{s_0 + (n + j)\omega} - \sum_{j \ne -n} C_j\, q_{n+j}\, y_{n+j}, \qquad n \ne 0, \\
y_0 &= i\sum_{j \ne 0} \frac{C_j}{s_0 + j\omega} - \sum_{j \ne 0} C_j\, q_j\, y_j.
\end{aligned}
\tag{25}
\]

We break up the proof into two parts, the non-resonant and resonant case. We start with the former.

5.1. The non-resonant case, $\omega^{-1} \notin \mathbb{N}$.

Proposition 9. If condition (g) is satisfied, and $\omega^{-1} \notin \mathbb{N}$, then the solution $y$ of (25) is analytic in a small neighborhood of $s_0 = 0$.

For the proof we write $y_0 = i/2 + s_0 u_0$, and for $n \ne 0$ we make the substitution $y_n = v_n + d_n u_0$, where we will choose $d_n$ according to (26) in order to eliminate $u_0$ from all equations with $n \ne 0$.

Lemma 10. (i) For $s_0 \in \mathbb{R}$ there exists a unique solution $d \in l_2(\mathbb{Z} \setminus \{0\})$ of the system

\[ d_n = -C_{-n}\,(1 + \sqrt{1 + s_0}) - \sum_{k \ne 0} C_{k-n}\, q_k\, d_k, \qquad n \ne 0. \tag{26} \]

This solution is analytic at $s_0 = 0$.

(ii) With this choice of $d$, the system (25) becomes

\[
\begin{aligned}
v_n &= f_n - \sum_{k \ne 0} C_{k-n}\, q_k\, v_k, \\
\Bigl( s_0 + \sum_{j \ne 0} C_j\, q_j\, d_j \Bigr) u_0 &= f_0 - \sum_{j \ne 0} C_j\, q_j\, v_j,
\end{aligned}
\tag{27}
\]

where

\[ f_0 = -\frac{i}{2} + i\sum_{j \ne 0} \frac{C_j}{s_0 + j\omega}, \qquad f_n = iC_{-n}\,\frac{1 - \sqrt{1 + s_0}}{2s_0} + i\sum_{k \ne 0} \frac{C_{k-n}}{s_0 + k\omega}. \tag{28} \]


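(iii) For small $s_0$ we have $\sum_{j \ne 0} C_j q_j d_j \ne 0$, and the system (27) has a unique solution with $v \in l_2(\mathbb{Z} \setminus \{0\})$, and $v_n$, $u_0$ are analytic at $s_0 = 0$.

Proof. (i) Equation (26) is of the form $(I - J)d = c$ in $l_2(\mathbb{Z} \setminus \{0\})$, where $c_n = -(1 + \sqrt{1 + s_0})\,\overline{C_n}$ and

\[ (Jd)_n = -\sum_{k \ne 0} C_{k-n}\, q_k\, d_k \qquad (n \ne 0). \]

We show first that $\mathrm{Ker}(I - J) = \{0\}$. Indeed, assume $d = Jd$ and set $D_k = q_k d_k$. Then we see that

\[ q_n^{-1} D_n + \sum_{k \ne 0} C_{k-n}\, D_k = 0 \tag{29} \]

and, by multiplying with $\overline{D_n}$ and summing over $n$, we get

\[ \sum_{n \ne 0} q_n^{-1}\, |D_n|^2 + \sum_{n,k \ne 0} C_{k-n}\, D_k\, \overline{D_n} = 0. \tag{30} \]

Note that, because $C_{-n} = \overline{C_n}$, the following quantity is real:

\[ \overline{\sum_{n,k \ne 0} C_{k-n}\, D_k\, \overline{D_n}} = \sum_{n,k \ne 0} C_{n-k}\, \overline{D_k}\, D_n = \sum_{n,k \ne 0} C_{k-n}\, D_k\, \overline{D_n}, \tag{31} \]

implying that

\[ \sum_{n \ne 0} q_n^{-1}\, |D_n|^2 \in \mathbb{R} \]

with (cf. (24))

\[ q_n^{-1} = -1 + \sqrt{1 + s_0 + n\omega}. \]

Let $N_0 = -(1 + s_0)\,\omega^{-1} \in \mathbb{R}$. Obviously $q_n^{-1} \in \mathbb{R}$ for $n \ge N_0$, while for $n < N_0$ we have, by Remark 24,

\[ \Im(q_n^{-1}) < 0. \]

Thus it is necessary that $D_n = 0$ for all $n < N_0$. Assume $D \ne 0$. Let $N \in \mathbb{N}$ be such that $D_n = 0$ for all $n < N$ and $D_N \ne 0$ (thus $N_0 \le N$). Then from (29),

\[ \sum_{k \ge N;\, k \ne 0} C_{k-n}\, D_k = 0 \qquad \text{for any } n < N \]

or, setting $k = N - 1 + j$,

\[ \sum_{j \ge 1,\, j \ne 1-N} C_{j+n}\, D_{N-1+j} = 0 \qquad \text{for } n \ge 0. \tag{32} \]

It is here that we use the genericity condition on $C$. In fact we will show that (32) implies $D = 0$ if condition (g) is satisfied. To see this define $\tilde{D} \in l_2(\mathbb{N})$ as $\tilde{D}_j = D_{N-1+j}$ if $j \ge 1$, $j \ne 1 - N$ and, if $1 - N \ge 1$, $\tilde{D}_{1-N} = 0$. Then by (32) $\tilde{D}$ is orthogonal in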



$l_2(\mathbb{N})$ to all $T^n C$, $n \ge 0$. By the genericity condition (g) then $\langle \tilde{D}, e_1 \rangle = D_N = 0$, which is a contradiction. Thus $D = 0$. Since $J$ is analytic in $s_0$ for small enough $s_0$, and compact by the same simple arguments as in Lemma 6, it follows that $(I - J)^{-1}$ exists and is analytic in $s_0$ at $s_0 = 0$.

(ii) This part is an immediate calculation.

(iii) Note first that $f \in l_2(\mathbb{Z} \setminus \{0\})$, because

\[ \|f\| \le \left| \frac{1 - \sqrt{1 + s_0}}{2s_0} \right| \|c\| + \left( \sum_{n \ne 0} \Bigl| \sum_{k \ne 0} \frac{C_{k-n}}{s_0 + k\omega} \Bigr|^2 \right)^{1/2} < \infty, \]

where the last term is controlled by $\|c\|$ and by $\sum_{k \ne 0} |s_0 + k\omega|^{-2} < \infty$. Also, formula (28) expresses $f$ in terms of a discrete measure integral with respect to $k$ of a function which depends analytically on the (small) parameter $s_0$, and which is uniformly in $l_1$. Therefore $f$ depends analytically on $s_0$. The rest of the proof of (iii) closely follows that of part (i), using the following result.

Lemma 11. For $s_0 = 0$ we have $\sum_{j \ne 0} C_j q_j d_j \ne 0$.

Proof. Assume the contrary was true. At $s_0 = 0$, with $D_n^0 = D_n|_{s_0 = 0}$ and $q_n^0 = q_n|_{s_0 = 0}$, relation (29), using (26), gives

\[ \frac{D_n^0}{q_n^0} = -\sum_{k \ne 0} C_{k-n}\, D_k^0 - 2C_{-n} \qquad (n \ne 0). \tag{33} \]

Multiplying with $\overline{D_n^0}$ and summing over $n \ne 0$ we would get

\[ \sum_{n \ne 0} \left( -1 + \sqrt{1 + n\omega} \right) |D_n^0|^2 = -\sum_{k,n \ne 0} C_{k-n}\, D_k^0\, \overline{D_n^0} - 2\sum_{n \ne 0} C_{-n}\, \overline{D_n^0}, \tag{34} \]

and since we assumed $\sum_n C_n D_n^0 = 0$ then, as in the proof of Lemma 10 (i), it follows that $D_n^0 = 0$ for all $n < N_0 = -\omega^{-1}$. This gives, using (33), that

\[ \sum_{k \ge N_0;\, k \ne 0} C_{k-n}\, D_k^0 + 2C_{-n} = 0. \tag{35} \]

Denote by $D^1 \in l_2$ the sequence $D_k^1 = D_k^0$ if $k \ne 0$ and $D_0^1 = 2$. As in the proof of Lemma 10 (i), using the genericity condition (g), we get $D^1 = 0$, an obvious contradiction. □
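The substitution $y_0 = i/2 + s_0 u_0$, $y_n = v_n + d_n u_0$ is purely algebraic, so it can be tested on a finite truncation of the index set: solving the truncated system (23) directly, or going through (26)-(28), should give the same $y$ up to rounding. The sketch below does this for a hypothetical one-harmonic forcing ($C_{\pm 1} \ne 0$ only); the truncation size `N`, the parameter values and the small dense solver are all assumptions made for illustration, and the truncation says nothing about convergence in $l_2$. A value of $s_0$ with small negative imaginary part is used so that no denominator vanishes and the principal square root matches the branch chosen in the text.

```python
import cmath

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense complex system."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(M[row][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            factor = M[row][col] / M[col][col]
            for c in range(col, n + 1):
                M[row][c] -= factor * M[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        tail = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def reduction_defect(c1=0.3 + 0.1j, omega=1.7, s0=0.05 - 0.02j, N=6):
    """Largest difference between the direct and the reduced solution of the truncated system."""
    C = {1: c1, -1: c1.conjugate()}   # hypothetical trigonometric polynomial, C_0 = 0

    def coef(j):
        return C.get(j, 0j)

    def q(k):                          # Eq. (24) with s0 = -i*p0
        return (1 + cmath.sqrt(1 + s0 + k * omega)) / (s0 + k * omega)

    idx = list(range(-N, N + 1))
    rest = [n for n in idx if n != 0]
    root = cmath.sqrt(1 + s0)

    # Direct route: truncated (23), y_n + sum_k C_{k-n} q_k y_k = i sum_k C_{k-n}/(s0 + k omega).
    A = [[(1 if n == k else 0) + coef(k - n) * q(k) for k in idx] for n in idx]
    b = [1j * sum(coef(k - n) / (s0 + k * omega) for k in idx) for n in idx]
    y_direct = dict(zip(idx, solve(A, b)))

    # Reduced route: d from (26), then v and u0 from (27) with f from (28).
    B = [[(1 if n == k else 0) + coef(k - n) * q(k) for k in rest] for n in rest]
    d = dict(zip(rest, solve(B, [-coef(-n) * (1 + root) for n in rest])))
    f = [1j * coef(-n) * (1 - root) / (2 * s0)
         + 1j * sum(coef(k - n) / (s0 + k * omega) for k in rest) for n in rest]
    v = dict(zip(rest, solve(B, f)))
    f0 = -0.5j + 1j * sum(coef(j) / (s0 + j * omega) for j in rest)
    u0 = ((f0 - sum(coef(j) * q(j) * v[j] for j in rest))
          / (s0 + sum(coef(j) * q(j) * d[j] for j in rest)))
    y_reduced = {n: v[n] + d[n] * u0 for n in rest}
    y_reduced[0] = 0.5j + s0 * u0
    return max(abs(y_direct[n] - y_reduced[n]) for n in idx)
```

The point of the reduced route is that the $1/s_0$ pole has been removed from the coefficient matrix `B`: it no longer contains the $k = 0$ column carrying $q_0 \sim 2/s_0$, which is what makes analyticity at $s_0 = 0$ accessible.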

This concludes the proof of Proposition 9: for generic $\eta$ the solution $y$ of (17) has, for $\omega^{-1} \notin \mathbb{N}$, analytic components $y_n$ when $p = 0$.

Square root singularities. We now study the behavior at the square root singularities of the coefficients of the equation of $y$. Let $k_0$ be the unique integer such that for some $s_r \in [0, \omega)$ we have $1 + s_r + k_0\omega = 0$ (then $s_r$ is a branch point in the coefficient $q$). The following proposition describes the analytic structure of $y(p)$ near the imaginary axis.



Proposition 12. We have the decomposition $y_n = u_n + (\sqrt{s_0 - s_r})\,v_n$, where $u_n$ and $v_n$ are analytic in $s_0$ in a complex neighborhood of the segment $[0, \omega)$.

Proof. The substitution $y_n = u_n + (\sqrt{s_0 - s_r})\,v_n$, and

\[ U_k = q_k u_k; \quad V_k = q_k v_k \qquad (k \ne k_0) \]

and

\[ U_{k_0} = \frac{u_{k_0}}{s_0 + k_0\omega}; \qquad V_{k_0} = \frac{v_{k_0}}{s_0 + k_0\omega} \]

leads to the following system of equations for $U_n$ and $V_n$:

\[
\begin{aligned}
q_n^{-1} U_n &= i\sum_k \frac{C_{k-n}}{s_0 + k\omega} - \sum_k C_{k-n}\, U_k - C_{k_0-n}\,(s_0 - s_r)\,V_{k_0} \qquad (n \ne k_0), \\
q_n^{-1} V_n &= -\sum_k C_{k-n}\, V_k - C_{k_0-n}\,(s_0 - s_r)\,V_{k_0} - C_{k_0-n}\, U_{k_0} \qquad (n \ne k_0), \\
(s_0 + k_0\omega)\,U_{k_0} &= i\sum_k \frac{C_{k-k_0}}{s_0 + k\omega}, \\
(s_0 + k_0\omega)\,V_{k_0} &= -\sum_k C_{k-k_0}\, V_k.
\end{aligned}
\tag{36}
\]

We now let $Q_{k_0} = s_0 + k_0\omega$ and, for $n \ne k_0$, $Q_n = q_n^{-1} = -1 + \sqrt{1 + s_0 + n\omega}$. We use again the Fredholm alternative and, as in the previous proofs, we need only to show the absence of a solution of the homogeneous equation at $s_0 = s_r$. We thus multiply the homogeneous equations associated to (36) in the following manner: the equation for $U_j$ by $\overline{U_j}$ and the equation for $V_j$ by $\overline{V_j}$, then sum over all $j$. As in the previous proofs, from the reality of the r.h.s. and then from the genericity condition (g), $U \equiv 0$. Then, similarly, $V \equiv 0$. The rest is immediate. □

5.2. The resonant case: $\omega^{-1} = M \in \mathbb{N}$.

In this case when $s_0 = 0$ there are poles in the coefficients of (23) when $n + j = 0$ and branch points when $n + j = -M$. The proof is a combination of the two regularization techniques used in the previous case.

Proposition 13. We can set $y(s_0) = A(s_0) + B(s_0)\sqrt{s_0}$ with $A$ and $B$ analytic in a complex neighborhood of the segment $[0, \omega)$.

Proof. Special care is only needed near $s_0 = 0$. The system (26)–(28) now reads

\[
\begin{aligned}
d_n &= -C_{-n}\,(1 + \sqrt{1 + s_0}) - \sum_{k \notin \{0, -M\}} C_{k-n}\, q_k\, d_k - C_{-M-n}\,\frac{1 + \sqrt{s_0}}{s_0 - 1}\, d_{-M}, \\
v_n &= f_n - \sum_{k \notin \{0, -M\}} C_{k-n}\, q_k\, v_k - C_{-M-n}\,\frac{1 + \sqrt{s_0}}{s_0 - 1}\, v_{-M}.
\end{aligned}
\tag{37}
\]



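We take $d_n = \alpha_n + \sqrt{s_0}\,\beta_n$ and $v_n = \gamma_n + \sqrt{s_0}\,\delta_n$. The system becomes

\[
\begin{aligned}
\alpha_n &= -C_{-n}\,(1 + \sqrt{1 + s_0}) - \sum_{k \notin \{0, -M\}} C_{k-n}\, q_k\, \alpha_k - C_{-M-n}\,\frac{1}{s_0 - 1}\,(\alpha_{-M} + s_0\,\beta_{-M}), \\
\beta_n &= -\sum_{k \notin \{0, -M\}} C_{k-n}\, q_k\, \beta_k - C_{-M-n}\,\frac{1}{s_0 - 1}\,(\alpha_{-M} + \beta_{-M}),
\end{aligned}
\tag{38}
\]

\[
\begin{aligned}
\gamma_n &= f_n - \sum_{k \notin \{0, -M\}} C_{k-n}\, q_k\, \gamma_k - C_{-M-n}\,\frac{1}{s_0 - 1}\,(\gamma_{-M} + s_0\,\delta_{-M}), \\
\delta_n &= -\sum_{k \notin \{0, -M\}} C_{k-n}\, q_k\, \delta_k - C_{-M-n}\,\frac{1}{s_0 - 1}\,(\delta_{-M} + \gamma_{-M}).
\end{aligned}
\tag{39}
\]

The system (38) is of the form

\[ \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = S(s_0) \begin{pmatrix} \alpha \\ \beta \end{pmatrix} + \begin{pmatrix} F_1 \\ F_2 \end{pmatrix}, \]

where $\alpha$, $\beta$, $F_1$, $F_2$ are in $l_2$. We prove that the homogeneous equation has no nontrivial solutions:

Lemma 14. $(I - S(0)) \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = 0$ implies $\begin{pmatrix} \alpha \\ \beta \end{pmatrix} = 0$.

Proof. Let $Q_n = q_n$, $A_n = q_n\alpha_n$, $B_n = q_n\beta_n$ for $n \ne 0, -M$ and $Q_{-M} = -1$, $A_{-M} = -\alpha_{-M}$ and $B_{-M} = -\beta_{-M}$. The system (38) becomes

\[
\begin{aligned}
Q_n^{-1} A_n &= -\sum_{k \ne 0} C_{k-n}\, A_k, \\
Q_n^{-1} B_n &= -\sum_{k \ne 0} C_{k-n}\, B_k - C_{-M-n}\, A_{-M}.
\end{aligned}
\tag{40}
\]

As in the proofs in Case I, multiplying the first equation by $\overline{A_n}$ and summing over $n$, we first get from the reality of the r.h.s. that $A_n = 0$ for $n < -M$, and then by the condition (g) we get that $A \equiv 0$. The conclusion $B \equiv 0$ now follows in the same way. □

End of proof of Proposition 13. The operator $S$ is compact on $l_2 \oplus l_2$, and $S$ and $(F_1, F_2)$ are analytic in a complex neighborhood of 0. We saw in Lemma 14 that the kernel of $I - S(0)$ is trivial, and by the analytic Fredholm alternative it follows that $(I - S(0))^{-1}$ exists and is analytic in a small neighborhood of $s_0 = 0$. Hence $(\alpha, \beta)$ are analytic. Similarly, $\gamma$, $\delta$ are analytic in the same region. □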

5.3. Proof of Theorem 1.

Combining the above results we have the following conclusion:

Proposition 15. If condition (g) is fulfilled, then $y(p)$ is analytic in a neighborhood of $i\mathbb{R} \setminus \{is_r + i\omega\mathbb{Z}\}$. For any $j \in \mathbb{Z}$, in a neighborhood of $p = is_r + ij\omega$ ($s_r \in \mathbb{R}$), $y$ has the form $y(p) = A_j(p) + B_j(p)\sqrt{-ip - s_r - j\omega}$, where $A_j$ and $B_j$ are analytic. In particular, $y$ is Lipschitz continuous of exponent 1/2 in the closed right half plane. Thus $Y(t) = O(t^{-3/2})$ for large $t$.


Proof. All but the last claim has already been shown. The last statement is a standard Tauberian theorem (note that $\mathcal{L}^{-1}$ is the Fourier transform along the imaginary line). □

Proposition 16. We have θ(t) → 0 as t → ∞.

Proof. We can write (9) (with I(t) = 1) as

\[
Y = \eta\,(\theta + M * Y).
\tag{41}
\]

It is easy to check, in view of the fact that M and Y are $O(t^{-3/2})$, that M ∗ Y → 0. Furthermore $1 + 2i\int_0^t Y(s)\,ds$ is convergent as t → ∞. Thus θ(t) → const as t → ∞. Since now the l.h.s. of (9) converges to zero and η does not, the equality (41) is only consistent if θ(t) → 0. □

This completes the proof of Theorem 1 for the case $\psi_0 = u_b = e^{-|x|}$. The general case follows by noting that the inhomogeneous term does not affect the main argument, using the Fredholm alternative. Hence we will still have |θ(t)| → 0 but the rate of decay may be different.

6. A Nongeneric Example

Let η be given by (11), for which

\[
C_n = -r\lambda^n \ \text{ for } n \ge 1, \qquad C_n = C_{-n}.
\tag{42}
\]

As in Sect. 5 set $-ip_0 = s_0$ and let $q_n$ be given by (24). Denote

\[
a_n = a_n(s_0) = \frac{1}{r}\,\frac{1}{q_n} = \frac{1}{r}\left(\sqrt{1 + s_0 + n\omega} - 1\right).
\tag{43}
\]

For r ∈ (0, 1), ω > 1, ω−1 ∈ N such that $(1 - r)^2 < \omega - 1$, let $s_r$ and $s_p$ be the unique numbers in (0, ω) so that $1 + s_r \in \omega\mathbb{Z}$ and $1 + a_{-1}(s_p) = 0$. We choose r, ω such that $s_r = s_p$.

6.1. The homogeneous equation.

Lemma 17. Let $s_{0,0}$ be a point in $(0, s_r) \cup (s_r, \omega)$. Consider $s_0$ in a small enough neighborhood of $s_{0,0}$. The linear operator $J = J(s_0)$ of (17) depends analytically on $s_0$, and is compact on $l^2$. For $s_0 \ne s_p$, $(I - J(s_0))^{-1}$ exists and is analytic.

Lemma 18. Denote for short $J_0 = J(s_r)$. There exists a value $\lambda = \lambda_s \in (0, 1)$ such that

\[
\dim \mathrm{Ker}\,(I - J_0) = 1.
\tag{44}
\]

Denote by A the diagonal (unbounded) operator $(Az)_n = a_n z_n$ in $l^2$; $A^{-1}$ is bounded.

Lemma 19. For $\lambda = \lambda_s$ as in Lemma 18 we have

\[
\mathrm{Ker}\,(I - J_0) = A\,\mathrm{Ker}\,(I - J_0^*).
\tag{45}
\]

6.2. Proof of Lemma 17. The operator J is compact by Lemma 6. To show that I − J is invertible we prove this for any points $s_0 \in (0, \omega)$, $s_0 \ne s_p, s_r$; by the analytic Fredholm theorem it will follow that I − J is invertible in a small enough neighborhood of any such point, thus proving the lemma.

Let $s_0 \in (0, \omega)$, $s_0 \ne s_p, s_r$. As in Remark 7 in Sect. 5, the substitution $y_n = a_n z_n$ ($n \in \mathbb{Z}$) transforms the homogeneous equation (19) to

\[
a_n z_n = \sum_{j=1}^{\infty} \lambda^j \left(z_{n+j} + z_{n-j}\right), \qquad n \in \mathbb{Z}.
\tag{46}
\]

Note that $\Im a_n < 0$ for n < −1 for $s_0 \in [\omega - 1, \omega)$ and $\Im a_n < 0$ for n < 0 for $s_0 \in (0, \omega - 1)$. We will discuss only the first case, $s_0 \ge \omega - 1$, since the second one is completely analogous. As in the proof of Lemma 8, it follows that

\[
z_n = 0 \quad \text{for } n < -1.
\tag{47}
\]

Then Eqs. (46) for n < −1 become

\[
\sum_{k=1}^{\infty} \lambda^k z_{k-2} = 0.
\tag{48}
\]

For n = −1 (46) gives

\[
(a_{-1} + 1)\,z_{-1} = 0,
\tag{49}
\]

and for n ≥ 0, using (48), we get

\[
(1 + a_n)\,z_n = \sum_{j=1}^{n+1} \left(\lambda^j - \lambda^{-j}\right) z_{n-j}, \qquad n \ge -1.
\tag{50}
\]

Since s0 = sp , (49) gives z−1 = 0, and it follows by induction, from (50), that zn = 0 for all n. By the Fredholm alternative theorem then I − J (s0 ) is invertible. 6.3. Proof of Lemma 18. In what follows s0 = sr . 6.3.1. An auxiliary lemma. We show that if z ∈ l2 then Eq. (48) is redundant. Lemma 20. If z is an l2 solution of (50) with zn = 0 for n < −1 then z satisfies (48). Proof. Let z ∈ l2 be a solution of (50). Then Z [n+1] ≡

n

λk zk−2

(51)

k=1

is the truncation of a convergent series, since there is a constant M with |zn | < M for all n. Note that n+1 1 + an )zn = λj zn−j − λ−n−2 Z [n+1] , j =1


17

hence Z [n+1] = λn+2

n+1

λj zn−j − λn+2 (1 + an )zn ,

j =1

so that √ Mλ [n+1] + λn+2 M 1 + const n −→ 0 as n → ∞. (52) Z ≤ λn+2 1−λ Since (51) are truncations of the series in the LHS of (48), then (52) implies (48). " # 6.3.2. Behavior of the general solution of (50). A direct calculation shows that the sequence zn satisfying the infinite order recurrence (50) and the initial condition z−1 = 1 satisfies, in fact, the three step recurrence (1 + an+1 )zn+1 + (1 + an−1 )zn−1 = [λ(1 + an ) + λ + an λ−1 ]zn (n ≥ 0)

(53)

with the initial condition z−1 = 1,

z0 =

λ − λ−1 . 1 + a0

(54)
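The passage from the infinite order recurrence (50) to the three step recurrence (53) can be checked numerically. The sketch below is only an illustration: the parameter values (`s0`, `omega`, `r`, `lam`) are arbitrary choices, the function names are hypothetical, and the formula used for $a_n$ is an assumed reading of (43), restricted to n ≥ 0 so that the square root is real. It generates $z_n$ from (50) with $z_{-1} = 1$ and measures the relative mismatch in (53) for n ≥ 1.

```python
import math

def a(n, s0=0.3, omega=1.1, r=0.45):
    # Assumed reading of (43): a_n = (sqrt(1 + s0 + n*omega) - 1) / r.
    # Only n >= 0 is used here, so the square root is real.
    return (math.sqrt(1.0 + s0 + n * omega) - 1.0) / r

def solve_infinite_order(lam, n_max):
    """Return [z_{-1}, z_0, ..., z_{n_max}] computed from (50) with z_{-1} = 1."""
    z = [1.0]  # z[i] stores z_{i-1}
    for n in range(n_max + 1):
        rhs = sum((lam**j - lam**(-j)) * z[n - j + 1] for j in range(1, n + 2))
        z.append(rhs / (1.0 + a(n)))
    return z

def three_term_residuals(lam, n_max):
    """Relative mismatch in the three step recurrence (53) for n = 1..n_max-1."""
    z = solve_infinite_order(lam, n_max)
    zn = lambda n: z[n + 1]
    out = []
    for n in range(1, n_max):
        lhs = (1 + a(n + 1)) * zn(n + 1) + (1 + a(n - 1)) * zn(n - 1)
        rhs = (lam * (1 + a(n)) + lam + a(n) / lam) * zn(n)
        out.append(abs(lhs - rhs) / max(abs(lhs), abs(rhs), 1e-300))
    return out

residuals = three_term_residuals(lam=0.5, n_max=10)
print(max(residuals))  # expected to be at rounding-error level
```

The check starts at n = 1 so that $a_{-1}$ is never needed; at n = 0 the identity additionally uses $(1 + a_{-1})z_{-1} = 0$, which is Eq. (49).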

Denote

\[
z_n = \frac{\lambda - \lambda^{-1}}{1 + a_n}\,V_{n-1},
\tag{55}
\]

then (53) becomes

\[
V_n + V_{n-2} = \left[\lambda + \frac{\lambda^2 + a_n}{\lambda(1 + a_n)}\right] V_{n-1}, \qquad n \ge 1.
\tag{56}
\]

We are looking for $l^2$ solutions. Recent rigorous WKB estimates (see e.g. [35]) would imply there are solutions of the discrete equation (56) behaving like $\lambda^{-n} e^{-\sqrt{n/\omega}}$ and like $\lambda^{n} e^{\sqrt{n/\omega}}$. We will prove this in our context and find special values of λ for which the solution decaying for large n satisfies the initial condition. We will show that this solution is obtained by taking

\[
V_{n-2} = g_{n-1} V_{n-1}
\tag{57}
\]

in (56) and iterating:

\[
g_{n-1} = G_n - \frac{1}{g_n} \quad \text{with} \quad G_n = \lambda + \frac{\lambda^2 + a_n}{\lambda(1 + a_n)},
\tag{58}
\]

i.e., $g_0$ is given by the continued fraction:

\[
g_0 = G_1 - \cfrac{1}{G_2 - \cfrac{1}{G_3 - \cdots}},
\]

which needs to match the initial condition (see (54)):

\[
g_0 = g_0(\lambda) = \frac{1}{\lambda + \lambda^{-1} + (1 + a_0)^{-1}(\lambda - \lambda^{-1})}.
\tag{59}
\]
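The continued fraction for $g_0$ can be evaluated by running (58) downward from a large index. The following is a minimal sketch under stated assumptions, not the rigorous procedure of the paper: the starting guess $g_N = 1/\lambda$, the truncation index, the function names and the sample form of $a_n$ (an assumed reading of (43) with arbitrary parameters) are all illustrative choices. A root of the returned mismatch in λ would correspond to a value $\lambda_s$ for which the decaying solution meets the initial condition (59).

```python
import math

def G(n, lam, a):
    """Coefficient G_n of (58): lam + (lam^2 + a_n) / (lam * (1 + a_n))."""
    a_n = a(n)
    return lam + (lam**2 + a_n) / (lam * (1.0 + a_n))

def g0_backward(lam, a, n_start):
    """Iterate g_{n-1} = G_n - 1/g_n downward from the guess g_{n_start} = 1/lam."""
    g = 1.0 / lam
    for n in range(n_start, 0, -1):
        g = G(n, lam, a) - 1.0 / g
    return g

def g0_initial_condition(lam, a):
    """Right-hand side of the matching condition (59)."""
    return 1.0 / (lam + 1.0 / lam + (lam - 1.0 / lam) / (1.0 + a(0)))

def a_assumed(n, s0=0.3, omega=1.1, r=0.45):
    # Hypothetical parameter values; assumed form of (43), used for n >= 0 only.
    return (math.sqrt(1.0 + s0 + n * omega) - 1.0) / r

lam = 0.5
mismatch = g0_backward(lam, a_assumed, 200) - g0_initial_condition(lam, a_assumed)
print(mismatch)  # a zero of this quantity in lam would give a candidate lambda_s
```

Iterating downward is the natural direction here: near the fixed point $g \approx \lambda^{-1}$ the map $g \mapsto G - 1/g$ has derivative $1/g^2 \approx \lambda^2 < 1$, so errors in the starting guess are damped.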

Lemma 21. (i) Let λ ∈ (0, 1). The recurrence (58) has a solution such that $g_n \to \lambda^{-1}$ as n → ∞.
(ii) $g_0$ is meromorphic in λ on [0, 1) and has poles.
(iii) There exists $\lambda_s \in (0, 1)$ such that $g_0(\lambda_s)$ satisfies (59).
(iv) Let $\lambda = \lambda_s$. To the solution of (i) there corresponds a solution $V^{[s]}$ of the recurrence (56) such that $V_n^{[s]} \sim \lambda_s^{n+o(n)}$ as n → ∞. The corresponding solution $z^{[s]}$ of (50) satisfies $z_n \to 0$ as n → ∞.
(v) Let $\lambda = \lambda_s$. There exists a solution of (56) with the asymptotic behavior $V_n^{[l]} \sim \lambda_s^{-n+o(n)}$.

Thus, for $\lambda = \lambda_s$, there exists a unique (up to a multiplicative constant) “small” solution of (56), with the behavior $V_n^{[s]} \sim \lambda_s^{n+o(n)}$ for large n, while the general solution behaves like $V_n \sim \lambda_s^{-n+o(n)}$. As a consequence, a similar statement holds for the recurrence (53).

Remark. The proof of (iii) can be refined to show that, in fact, there is a countable set of points $\lambda_s$ for which $g_0$ satisfies the initial condition, and that these values accumulate to 1.

Proof. (i) With the substitution

\[
g_n = G_{n+1} - \lambda + \delta_n,
\tag{60}
\]

the recurrence (58) becomes

\[
\delta_n = \lambda - \frac{1}{G_{n+2} - \lambda + \delta_{n+1}} \equiv (S\delta)_n, \qquad n \ge 0.
\tag{61}
\]

For $n_0 \ge 0$ and F small, positive, define $\lambda_{n_0} = a_{n_0+2}\,(2 + a_{n_0+2})^{-1} - F$. Let $N_{n_0}$ be a small neighborhood of the interval $I_{n_0} = [0, \lambda_{n_0}]$. Consider the Banach space $B_{n_0}$ of sequences $\{\delta_n\}_{n \ge n_0}$ with $\delta_n = \delta_n(\lambda)$ analytic on $N_{n_0}$ and continuous up to the boundary, with the norm $\|\delta\| = \sup_{n \ge n_0} \sup_{\lambda \in N_{n_0}} |\delta_n(\lambda)|$. Direct estimates show that the operator S defined by (61) takes the ball of radius $\rho_{n_0} = 2/(2 + a_{n_0+2}) + F$ in $B_{n_0}$ into itself (if F and $N_{n_0}$ are small enough), and is a contraction in this ball. Therefore the equation δ = S(δ) has a unique solution in $B_{n_0}$, of norm less than $\rho_{n_0}$. Then $|\delta_n(\lambda)| < \mathrm{const}\,(n + 2)^{-1/2}$ for all $\lambda \in I_n$ and all n ≥ 0. Since the sequence $\lambda_n$ increases to 1, (i) follows.

(ii) Step I: All $g_n$ are meromorphic on [0, 1). Since $\delta_n$ is analytic on $I_n$, then from (60), $g_n$ is analytic on $I_n \setminus \{0\}$, having a pole at λ = 0: $g_n \sim \lambda^{-1} a_{n+1}(1 + a_{n+1})^{-1}$ (λ → 0). Iterating (58) it follows that $g_{n-1}, g_{n-2}, \ldots, g_0$ are meromorphic on $I_n$. Since the intervals $I_n$ increase toward [0, 1) it follows that $g_0, g_1, \ldots, g_n, \ldots$ are meromorphic on [0, 1).

Step II: There exists $n_1$ and $\lambda_0 \in (0, 1)$ such that $g_{n_1}(\lambda_0) \le 0$. Define $F_n = (1 + a_n)^{-1}$; we have (see (43))

\[
F_{n_0} \sim r\,(n_0\omega)^{-1/2}, \qquad n_0 \to \infty.
\tag{62}
\]

Let $n_0$ be large and denote $\lambda_0 = 1 - F_{n_0}$. Let $N_0$ be large enough so that $\lambda_0$ is in the domain of analyticity of $g_{N_0}$. Iterating (58) starting from $N_0$ (and decreasing indices) we get the value $g_{n_0}(\lambda_0)$. If for some $n \in \{n_0, n_0 + 1, \ldots, N_0\}$ we get $g_n(\lambda_0) \le 0$, Step II is proved. Then assume that $g_{n_0}(\lambda_0) > 0$.


Consider the recurrence

\[
\tilde r_{n-1} = G_{n_0}(\lambda_0) - \frac{1}{\tilde r_n} \quad \text{for } n \le n_0, \qquad \tilde r_{n_0} = g_{n_0}(\lambda_0),
\tag{63}
\]

where, in fact, $G_{n_0}(\lambda_0) = 2 - F_{n_0}^2$. The recurrence (63) can be solved explicitly (it is a discrete Riccati equation and a substitution $\tilde r_{n-1} = x_{n-1}/x_n$ transforms it into a linear recurrence with constant coefficients). It has the solution

\[
\tilde r_n = \frac{\cos\bigl((n - n_0)\phi + \theta\bigr)}{\cos\bigl((n + 1 - n_0)\phi + \theta\bigr)},
\tag{64}
\]

where $\cos\phi = 1 - F_{n_0}^2/2$, $\sin\phi > 0$, and $\tan\theta = (\cos\phi - \lambda)/\sin\phi$ so that $\theta \sim \frac{\pi}{4} - \frac{1}{4}F_{n_0}$ ($F_{n_0} \to 0$). We assume, to get a contradiction, that $g_n(\lambda_0) > 0$ for all $n = 0, 1, \ldots, n_1$. Then

\[
g_n(\lambda_0) \le \tilde r_n \quad \text{for } n \le n_0,
\tag{65}
\]

which follows immediately by induction using (58), (63), noting that $G_n$ is increasing in n. Note that there is an $n_1 \in \{1, 2, \ldots, n_0 - 1\}$ so that

\[
\tilde r_n > 0 \ \text{ for } n \in \{n_1 + 1, \ldots, n_0\} \quad \text{and} \quad \tilde r_{n_1} < 0.
\tag{66}
\]

Indeed (from (62)) when n decreases from $n_0$ the numerator and denominator in (64) increase up to 1, then decrease, until the numerator becomes negative, when n equals $n_1 = n_0 - k_1$, where $k_1$ is the integer with $k_1 - 1 < (\pi/2 + \theta)/\phi \le k_1$. Since $\phi \sim F_{n_0}$ ($F_{n_0} \to 0$) then $k_1 \sim (3\pi)/(4F_{n_0})$, and, using (62), clearly $k_1 \in \{1, \ldots, n_0 - 1\}$ (if $n_0$ is sufficiently large). Then (65) and (66) contradict the assumption that $g_{n_1}(\lambda_0) > 0$, and Step II is proved.

Step III. The function $g_{n_1}$ is meromorphic on [0, 1), with $g_{n_1}(0+) = +\infty$. There is a smallest value of λ in $(0, \lambda_0)$, where $g_{n_1}$ changes sign: this is either a zero, or a pole. Assume it was a pole. Let $p \in (0, \lambda_0)$ be the first pole of $g_{n_1}$. Then $g_{n_1}$ is positive and analytic on (0, p), and $g_{n_1}(p-) = +\infty$, $g_{n_1}(p+) = -\infty$. Since $g_{n+1} = 1/(G_{n+1} - g_n)$ (see (58)) then $g_{n_1+1}(p-) = 0-$, hence $g_{n_1+1}$ changes sign in (0, p). But $g_{n_1+1}$ has no zero in (0, p) (otherwise at that zero $g_{n_1}$ would have had a pole, from (58)). Then $g_{n_1+1}$ has a pole, with a change of sign, from + to −, in (0, p). Now the argument can be repeated. It follows that for any k > 0, $g_{n_1+k}$ has a pole in (0, p), which contradicts the fact that the domain of analyticity of $g_n$ increases to (0, 1) as n → ∞. Therefore, the first change of sign of $g_{n_1}$ is at a zero. Let $\zeta_1$ be the smallest value in (0, 1) such that $g_{n_1}(\zeta_1-) = 0+$, $g_{n_1}(\zeta_1+) = 0-$. Then from (58) we have $g_{n_1-1}(\zeta_1-) = -\infty$ and $g_{n_1-1}$ changes sign in $(0, \zeta_1)$. Now the argument can be repeated. It follows that $g_0$ has a pole at a point $\zeta_{n_1}$ with $g_0(\zeta_{n_1}-) = -\infty$.

(iii) Since $g_0(\lambda)$ takes all the values when $\lambda \in (0, \zeta_{n_1})$ there exists $\lambda = \lambda_s \in (0, 1)$ such that (59) holds.

(iv) For $\lambda = \lambda_s$, since the solution of (i) satisfies $g_n(\lambda) = \lambda^{-1} + O(n^{-1/2})$ we have from (57), with the notation $V^{[s]} = V(\lambda_s)$, that $V_n^{[s]} = \prod_{k=0}^{n} g_k(\lambda_s)^{-1}\,V_0^{[s]} = O(\lambda_s^{n+o(n)})$ and thus $V_n^{[s]} - V_{n-1}^{[s]} = O(\lambda_s^{n+o(n)})$; then from (55) $z_n^{[s]} = O(\lambda_s^{n+o(n)})$.

(v) The substitution (variation of constants) $V_n = V_n^{[s]} v_n$ brings the recurrence (56) to a first order one: with the notation $I_n = v_n - v_{n-1}$ we have $I_n = \bigl(V_{n-2}^{[s]}/V_n^{[s]}\bigr)\,I_{n-1}$ and the rest of the argument consists of straightforward estimates. □
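The closed form (64) for the constant-coefficient Riccati recurrence (63) can be verified directly, since with $G = 2\cos\phi$ the identity $2\cos\phi\cos a - \cos(a + \phi) = \cos(a - \phi)$ gives $G - 1/\tilde r_n = \tilde r_{n-1}$. The sketch below checks this numerically; the values of `F`, `theta`, `n0` and the number of steps are arbitrary illustrative choices (in particular θ is not tied to $\lambda_0$ here), and the function names are hypothetical.

```python
import math

def riccati_closed_form(n, n0, phi, theta):
    """Closed form (64): cos((n-n0)*phi + theta) / cos((n+1-n0)*phi + theta)."""
    return math.cos((n - n0) * phi + theta) / math.cos((n + 1 - n0) * phi + theta)

def riccati_iterate(n0, steps, phi, theta):
    """Iterate r_{n-1} = G - 1/r_n downward from n0, with G = 2*cos(phi)."""
    G = 2.0 * math.cos(phi)
    r = riccati_closed_form(n0, n0, phi, theta)   # starting value r_{n0}
    values = {n0: r}
    for n in range(n0, n0 - steps, -1):
        r = G - 1.0 / r
        values[n - 1] = r
    return values

F = 0.05                              # plays the role of F_{n0}; arbitrary here
phi = math.acos(1.0 - F**2 / 2.0)     # so that G = 2 cos(phi) = 2 - F^2
theta = 0.7                           # arbitrary phase for the illustration
n0 = 100

iterated = riccati_iterate(n0, 20, phi, theta)
worst = max(abs(iterated[n] - riccati_closed_form(n, n0, phi, theta))
            for n in iterated)
print(worst)  # expected to be at rounding-error level
```

Because $\phi \sim F_{n_0}$ is small, the cosine in the numerator needs of the order of $1/F_{n_0}$ steps to change sign, which is the counting used to locate $n_1$ in Step II.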

Fig. 1. Graph of g0 given by (58) (discontinuous graph) and by (59) in a region near λ = 1, as functions of λ

6.3.3. Proof of Lemma 18. Lemma 21(v) shows that Eq. (53) has a unique (up to a multiplicative constant) small solution, $z_n^{[s]} \sim \lambda_s^{n+o(n)}$ (n → ∞), while the general solution behaves like $z_n \sim \lambda_s^{-n+o(n)}$. Since $y_n \sim \sqrt{n}\,z_n$ the uniqueness of the $l^2$ solution is proven.

6.3.4. Examples of solutions. We will show next how concrete values $\lambda_s$ satisfying Lemma 21 (iii) are relatively straightforwardly, and rigorously, found. One method is as follows. Note that the minimum/maximum of the function a − b/x, where x varies in an interval not containing zero, is achieved at the endpoints. We thus take the recurrence (58) with initial conditions $g_{n_0} = \lambda^{-1} \pm \frac{1 - \lambda^2}{\sqrt{n\omega}}$ and compute $g_0$ from these. The actual graph will be between these two, unless the condition mentioned is violated in between $n_0$ and 0. This graph is to be intersected with the graph of the initial condition (59). We take for instance ω = 1.1, r = 0.45, $s_p$ = 0.11, $n_0$ = 10, for which the rigorous control is not too involved. The two graphs are very close to each other (within about $3 \cdot 10^{-6}$ for λ ∈ (0.3, 0.4)) and cannot be distinguished from each other in Fig. 1. A first intersection is seen at λ ≈ 0.327; see Fig. 2.

6.4. Proof of Lemma 19. Denote $B = (I - J_0)A$; we have B = A − S. Hence $B^* = \bar A - S$. Then $\mathrm{Ker}(B) = \mathrm{Ker}(B^*)$ (since Az = Sz implies (47), so $Az = \bar A z$, and similarly, $\bar A z = Sz$ implies $\bar A z = Az$). So $\mathrm{Ker}[(1 - J_0)A] = \mathrm{Ker}[\bar A(1 - J_0^*)]$ so that (since A is one-to-one) $A^{-1}\mathrm{Ker}(1 - J_0) = \mathrm{Ker}(1 - J_0^*)$, which proves the lemma.

6.5. Discussion of the singularities of solutions of (17). Let $\lambda = \lambda_s$. We have that I − J is invertible for $\Re p_0 > 0$, and is not invertible at $p_0 = is_p$ (Lemma 18). By the analytic Fredholm theorem (see e.g. [30]) $(I - J)^{-1}$ is meromorphic on a small neighborhood of $is_p$, therefore there exist m ≥ 1 and operators $S_m, \ldots, S_1, R(p_0)$ so that:

\[
(I - J)^{-1} = \frac{1}{(p_0 - is_p)^m}\,S_m + \cdots + \frac{1}{p_0 - is_p}\,S_1 + R(p_0),
\tag{67}
\]

Fig. 2. Graphs of g0 (steeper graph) and of the initial condition for g0 (59)

where $R(p_0)$ is analytic at $is_p$, and $S_m \ne 0$ (since $I - J_0$ is not invertible). Multiplying (67) by I − J to the left, respectively to the right, and writing $J = J_0 + (p_0 - is_p)J_1(p_0)$ (where $J_1(p_0)$ is analytic at $is_p$) we get that

\[
R_1(p_0) = \frac{1}{(p_0 - is_p)^m}\,(I - J_0)\,S_m + O\bigl((p_0 - is_p)^{-m+1}\bigr),
\]

\[
R_2(p_0) = \frac{1}{(p_0 - is_p)^m}\,S_m\,(I - J_0) + O\bigl((p_0 - is_p)^{-m+1}\bigr),
\]

where $R_{1,2}$ are analytic at $p_0 = is_p$. By the uniqueness of the series of the analytic functions (Banach space valued) $R_{1,2}$ we must then have

\[
(I - J_0)\,S_m = 0 = S_m\,(I - J_0).
\tag{68}
\]

The first equality in (68) implies $\mathrm{Ran}(S_m) \subset \mathrm{Ker}\,(I - J_0) = \{y_{\mathrm{Ker}}\}$ and since $S_m \ne 0$ then $\mathrm{Ran}(S_m) = \{y_{\mathrm{Ker}}\}$, therefore $S_m y = \langle y, u\rangle\,y_{\mathrm{Ker}}$ for some $u \in l^2 \setminus \{0\}$. The second equality in (68) means $u \in \mathrm{Ran}\,(I - J_0)^{\perp} = \mathrm{Ker}\,(I - J_0^*)$. By Lemma 19 then (up to a multiplicative constant) $u = A^{-1}y_{\mathrm{Ker}} = z_{\mathrm{Ker}}$, where $z_{\mathrm{Ker}}$ satisfies (46), hence (53), (54). The solution $y = (I - J)^{-1}f$ of (17) is then singular at $p_0 = is_p$ if $c = \langle f, z_{\mathrm{Ker}}\rangle \ne 0$. For the example of Sect. 6.3.4 this latter condition can be checked by explicit calculation of the truncations to 10 terms and estimation of the remainder based on the contractivity bounds in the previous section. The result is c = −1.953 ± 0.001. Thus the inhomogeneous equation has poles.

Lemma 22. Let Y(t) be analytic on [0, ∞), with $\lim_{t\to\infty} Y(t) = 0$. Let $s \in \mathbb{R}$. Then

\[
\lim_{a \downarrow 0}\; a \int_0^{\infty} e^{-(a + is)t}\,Y(t)\,dt = 0.
\tag{69}
\]

Corollary 23. Let Y(t) be as in Lemma 22. Let $y(p) = \int_0^{\infty} e^{-pt}\,Y(t)\,dt$. Assume that y(p) is analytic on $i\mathbb{R}_+$, except for a set of isolated points. Then y(p) does not have poles on $i\mathbb{R}_+$.
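The mechanism behind Corollary 23 is that a pole of y at p = is would force $a\,y(a + is)$ to tend to a nonzero residue, whereas (69) says this quantity tends to zero when Y decays. A small numerical sketch for s = 0 illustrates the contrast; the test functions $Y(t) = 1/(1 + t)$ (decaying) and $Y(t) = 1$ (not decaying, with $y(p) = 1/p$ having a pole at 0), the quadrature parameters and the function name are illustrative assumptions, not taken from the paper.

```python
import math

def abel_mean(Y, a, u_max=40.0, steps=40000):
    """Approximate a * int_0^inf exp(-a t) Y(t) dt.

    The substitution u = a t turns the integral into int_0^inf exp(-u) Y(u/a) du,
    which is evaluated by the trapezoid rule on [0, u_max].
    """
    h = u_max / steps
    total = 0.5 * (Y(0.0) + math.exp(-u_max) * Y(u_max / a))
    for k in range(1, steps):
        u = k * h
        total += math.exp(-u) * Y(u / a)
    return total * h

decaying = lambda t: 1.0 / (1.0 + t)   # tends to 0, so (69) applies
constant = lambda t: 1.0               # does not tend to 0

for a in (1.0, 0.1, 0.01):
    print(a, abel_mean(decaying, a), abel_mean(constant, a))
```

For the decaying function the printed values shrink as a decreases, while for the constant function they stay close to 1, the residue of 1/p at p = 0.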

Proof. I. We first show (69) for s = 0. Separating the positive and negative parts of $\Re Y(t)$, $\Im Y(t)$ write $Y(t) = Y^{[1]}(t) - Y^{[2]}(t) + iY^{[3]}(t) - iY^{[4]}(t)$ with $Y^{[k]}(t)$ nonnegative, continuous, nonanalytic only on a discrete set, where the left and right derivatives exist, with $\lim_{t\to\infty} Y^{[k]}(t) = 0$. It is enough to show (69) for each $Y^{[k]}$. Let then Y be one of the $Y^{[k]}$'s. Denote $H(t) = \sup_{\tau \ge t} Y(\tau)$. The function H on [0, ∞) has the same properties as Y and in addition is decreasing. Then $H'$ exists a.e. and $H' \in L^1[0, \infty)$, since

\[
\int_0^{\infty} |H'(\tau)|\,d\tau = -\lim_{t\to\infty} \int_0^t H'(\tau)\,d\tau = \lim_{t\to\infty} \bigl(-H(t) + H(0)\bigr) = H(0).
\]

Then

\[
a \int_0^{\infty} e^{-at}\,Y(t)\,dt \le a \int_0^{\infty} e^{-at}\,H(t)\,dt = -\int_0^{\infty} H(t)\,\frac{d}{dt}\,e^{-at}\,dt = H(0) + \int_0^{\infty} e^{-at}\,H'(t)\,dt,
\]

therefore

\[
\lim_{a \downarrow 0}\; a \int_0^{\infty} e^{-at}\,Y(t)\,dt \le H(0) + \lim_{a \downarrow 0} \int_0^{\infty} e^{-at}\,H'(t)\,dt = 0,
\]

which proves the lemma in this case.

II. Let now $s \in \mathbb{R}$ be arbitrary. Then (69) follows from the result for s = 0 applied to the function $\tilde Y(t) = e^{-ist}\,Y(t)$. □

Proof of Theorem 2. In conclusion Y(t) cannot tend to zero as t → ∞ and complete ionization fails. □

Acknowledgements. The authors would like to thank A. Soffer and M. Weinstein for interesting discussions and suggestions. Work of O. C. was supported by NSF Grant 9704968, of R.D.C. by NSF Grant 0074924, and that of O. C., J. L. L. and A. R. by AFOSR Grant F49620-98-1-0207 and NSF Grant DMR-9813268.

Appendix A. Proof of Lemma 3

(i) Consider $L^1_{\mathrm{loc}}[0, A]$ endowed with the norm $\|F\|_\nu := \int_0^A |F(s)|\,e^{-\nu s}\,ds$, where ν > 0. If f is continuous and $F, G \in L^1_{\mathrm{loc}}[0, A]$, a straightforward calculation shows that

\[
\|fF\|_\nu \le \|F\|_\nu \sup_{[0,A]} |f|,
\tag{A1}
\]

\[
\|F * G\|_\nu \le \|F\|_\nu\,\|G\|_\nu,
\tag{A2}
\]

\[
\|F\|_\nu \to 0 \quad \text{as } \nu \to \infty,
\tag{A3}
\]

where the last relation follows from the Riemann–Lebesgue lemma. The integral equation (9) can be written as

\[
Y = \eta + JY \quad \text{where} \quad JF := \eta\,(2i + M) * F.
\tag{A4}
\]

Since M is locally in $L^1$ and bounded for large x it is clear that for large enough $B_2$, (9) is contractive if $\nu > B_2$, for any A.

(ii) This is an immediate consequence of Lemma 3 and of elementary properties of the Laplace transform.

(iii) We have in H,

\[
\mathcal{L}M = \lim_{a \downarrow 0} \frac{2i}{\pi} \int_0^{\infty} dx\,e^{-px} \int_0^{\infty} \frac{u^2\,e^{-i(x - ia)(1 + u^2)}}{1 + u^2}\,du
\tag{A5}
\]

\[
\phantom{\mathcal{L}M} = \frac{i}{\pi} \int_{-\infty}^{\infty} \frac{u^2}{(1 + u^2)\bigl(p + i(1 + u^2)\bigr)}\,du.
\tag{A6}
\]

For $\Re(p) > 0$ we push the integration contour through the upper half plane. At the two poles in the upper half plane $u^2 + 1$ equals 0 and ip respectively, so that

\[
\frac{i}{\pi} \int_{-\infty}^{\infty} \frac{u^2\,du}{(1 + u^2)\bigl(p + i(1 + u^2)\bigr)}
= \frac{i}{\pi} \left[\frac{(-1)}{(2i)(p)} \oint \frac{ds}{s} + \frac{u_0^2}{(ip)(2iu_0)} \oint \frac{ds}{s}\right]
= -\frac{i}{p} + \frac{u_0}{p},
\tag{A7}
\]

where $u_0$ is the root of $p + i(1 + u^2) = 0$ in the upper half plane. Thus

\[
\mathcal{L}M = -\frac{i}{p} + \frac{i\sqrt{1 - ip}}{p}
\tag{A8}
\]

with the branch satisfying $\sqrt{1 - ip} \to 1$ as p → 0 in H. Thus, the analytic continuation of $\sqrt{1 - ip}$ in H ∪ ∂H in our calculations is as follows:

Remark 24. As p varies in H, 1 − ip belongs to the lower half plane −iH so that $\sqrt{1 - ip}$ varies in the fourth quadrant, and in particular $\Im\sqrt{1 - ip} < 0$. If $p \in i\mathbb{R}$ and −ip ≥ −1 then $\sqrt{1 - ip}$ is real and nonnegative, while if −ip < −1 then $\sqrt{1 - ip}$ has zero real part and negative imaginary part.

To show (15) note that for $\Re(p) > 0$, ω > 0 we have

\[
\mathcal{L}\bigl(e^{\pm i\omega t} M\bigr) = -\frac{i}{p \mp i\omega} + \frac{i\sqrt{1 - ip \mp \omega}}{p \mp i\omega}
\qquad \bigl(\text{with } \sqrt{1 - ip - \omega} = -i\sqrt{\omega - 1 + ip} \ \text{ if } \omega > 1\bigr).
\tag{A9}
\]
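The residue evaluation leading from (A6) to (A8) can be cross-checked numerically. In the sketch below, which is an illustration and not part of the original argument, the substitution u = tan θ turns (A6) into $\frac{i}{\pi}\int_{-\pi/2}^{\pi/2} \frac{\sin^2\theta}{p\cos^2\theta + i}\,d\theta$, a smooth periodic integrand for which the midpoint rule converges quickly. The sample values of p and the function names are arbitrary; `cmath.sqrt` returns the principal branch, which has nonnegative real part and so agrees with the branch of Remark 24 for $\Re p > 0$.

```python
import cmath
import math

def lhs_A6(p, steps=2000):
    """Midpoint-rule value of (A6) after the substitution u = tan(theta)."""
    h = math.pi / steps
    total = 0j
    for k in range(steps):
        th = -math.pi / 2 + (k + 0.5) * h
        total += math.sin(th) ** 2 / (p * math.cos(th) ** 2 + 1j)
    return 1j / math.pi * total * h

def rhs_A8(p):
    """Closed form (A8), principal branch of the square root."""
    return -1j / p + 1j * cmath.sqrt(1 - 1j * p) / p

for p in (1.0, 2.0 + 0.5j, 0.3 - 1.0j):
    print(p, abs(lhs_A6(p) - rhs_A8(p)))
```

The printed differences are expected to be at rounding-error level for each sample p in the right half plane.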

The branch of the square root was discussed in Remark 24. This concludes the proof of Lemma 3 (iii). Appendix B. Discussion of the Genericity Condition (g) A thorough analysis of the properties of the shift operator is provided by the treatise [29]. We provide here an independent discussion, meant to give an insight on the interesting analytic properties involved in this condition. Let C = (C0 , C1 , . . . , Cn , . . . ) ∈ l2 (N) and the operator T defined as before by T C = (C1 , C2 , . . . ). We want to see for which such vectors, the system of equations (z, T j C) = 0, j = 0, 1, . . .

(B1)

has nontrivial solutions z in $l^2$. We can associate to z and C analytic functions in the unit disk, Z(x) and C(x) by

\[
C(x) = \sum_{k=0}^{\infty} C_k x^k, \qquad Z(x) = \sum_{k=0}^{\infty} z_k x^k.
\tag{B2}
\]

These functions extend to $L^2$ functions on the unit circle. The system of equations (B1) can be written as

\[
z_0 C(x) + z_1 x^{-1}\bigl(C(x) - C(0)\bigr) + \cdots + z_n \left[x^{-n} C(x) - x^{-n} \sum_{k=0}^{n-1} \frac{x^k}{k!}\,C^{(k)}(0)\right] + \cdots = 0.
\tag{B3}
\]

Using Cauchy’s formula, we can write the difference in square brackets in (B3) as

\[
\frac{1}{2\pi i} \oint_{|s|=1} \frac{C(s)}{s^n (s - x)}\,ds,
\tag{B4}
\]

and thus (B1) becomes

\[
\oint_{|s|=1} \frac{C(s)\,Z(1/s)}{s - x}\,ds = 0.
\tag{B5}
\]
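For the geometric sequence $C_n = \lambda^n$ treated in Example (i) of this appendix, the system (B1) can be checked by hand or by machine: $(z, T^jC) = \sum_k z_k \lambda^{k+j} = \lambda^j Z(\lambda)$, so all the equations hold exactly when Z(λ) = 0. The sketch below assumes real sequences (so no complex conjugation is needed in the inner product) and finitely supported z; the function name and the sample vectors are hypothetical.

```python
def shifted_inner_products(z, lam, j_max):
    """Return [(z, T^j C) for j = 0..j_max] with C_n = lam**n and real, finite z."""
    return [sum(zk * lam ** (k + j) for k, zk in enumerate(z))
            for j in range(j_max + 1)]

lam = 0.5
z_orthogonal = [1.0, -1.0 / lam]   # Z(x) = 1 - x/lam, so Z(lam) = 0
z_generic = [1.0, 0.0]             # Z(x) = 1, so Z(lam) = 1

print(shifted_inner_products(z_orthogonal, lam, 5))  # all entries vanish
print(shifted_inner_products(z_generic, lam, 5))     # entries equal lam**j
```

The single linear condition Z(λ) = 0 is what makes the solution space of codimension one in this nongeneric case.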

The functions C for which this equation has nontrivial solutions Z relate to the Beurling inner functions [29] and are very “rare”.

Examples. (i) Let |λ| < 1 and $C_n = \lambda^n$, i.e. $C(x) = (1 - \lambda x)^{-1}$. This is related to the function (11). Taking advantage of the analyticity of Z outside the unit circle, we can push the contour of integration towards s = ∞, collecting the residue at $s = \lambda^{-1}$; we see that Eq. (B5) holds iff Z(λ) = 0, i.e., for a space of analytic functions of codimension one.

(ii) Let $C_n = 1/n$. Then, up to an immaterial overall sign, C(x) = ln(1 − x), and by taking s = 1/t in (B5) we get

\[
\frac{1}{x} \oint_{|t|=1} \frac{Z(t)\,\ln(t - 1)}{(t - x^{-1})\,t}\,dt - \frac{1}{x} \oint_{|t|=1} \frac{\ln(t)\,Z(t)}{t\,(t - x^{-1})}\,dt = 0.
\tag{B6}
\]

By making a cut on [1, ∞) for the log we see that the integrand in the first integral is analytic in the unit circle and thus the integral vanishes. We decompose the second integral by partial fractions and we get

\[
\oint_{|t|=1} \frac{\ln(t)\,Z(t)}{t}\,dt - \oint_{|t|=1} \frac{\ln(t)\,Z(t)}{t - y}\,dt = 0,
\tag{B7}
\]

where $y = x^{-1}$. The first integral is a constant, C. By pushing the contour of integration inwards, we see that the second integral extends analytically for small y ≠ 0. For such y we thus have

\[
\oint_{|t|=1} \frac{\ln(t)\,\bigl(Z(t) - Z(y)\bigr)}{t - y}\,dt + Z(y) \oint_{|t|=1} \frac{\ln(t)}{t - y}\,dt = -C.
\tag{B8}
\]


Now the contour of integration can be pushed to the sides of the interval [0, 1] collecting the difference between the branches of the log. We get

\[
\int_0^1 \frac{Z(t) - Z(y)}{t - y}\,dt + Z(y) \int_0^1 \frac{1}{t - y}\,dt = 0.
\tag{B9}
\]

Thus φ(y) + Z(y) ln(−y) = C with φ and Z analytic in the unit circle, thus ln(−y) is analytic unless Z = 0. This shows $C_n = 1/n$ is generic.

References

1. Simon, B.: Schrödinger Operators in the Twentieth Century. J. Math. Phys. 41, 3523 (2000)
2. Cycon, H.L., Froese, R.G., Kirsch, W. and Simon, B.: Schrödinger Operators. Berlin–Heidelberg–New York: Springer-Verlag, 1987
3. Pillet, C.-A.: Some results on the quantum dynamics of a particle in a Markovian potential. Commun. Math. Phys. 102, 237–254 (1985) and 105, 259 (1986)
4. Yajima, K.: Existence of Solutions for Schrödinger Evolution Equations. Commun. Math. Phys. 110, 415–426 (1987)
5. Fring, A., Kostrykin, V. and Schrader, R.: Ionization Probabilities through Ultra-Intense Fields in the Extreme Limit. J. Phys. A: Math. Gen. 30, 8599–8610 (1997)
6. Soffer, A. and Weinstein, M.I.: Nonautonomous Hamiltonians. J. Stat. Phys. 93, 359–391 (1998)
7. Soffer, A. and Costin, O.: Resonance Theory for Schrödinger Operators. Submitted to Commun. Math. Phys.
8. Landau, L.D. and Lifschitz, E.M.: Quantum Mechanics – Nonrelativistic Theory. 2nd ed. New York: Pergamon, 1965
9. Cohen-Tannoudji, C., Dupont-Roc, J. and Grynberg, G.: Atom-Photon Interactions. New York: Wiley, 1992
10. Fermi, E.: Notes on Quantum Mechanics. Chicago: The University of Chicago Press, 1961, p. 100
11. Koch, P.M. and van Leeuwen, K.A.H.: The Importance of Resonances in Microwave “Ionization” of Excited Hydrogen Atoms. Phys. Repts. 255, 289–403 (1995)
12. Casati, G., Chirikov, B. and Shepelyansky, D.L.: Relevance of Classical Chaos in Quantum Mechanics: The Hydrogen Atom in a Monochromatic Field. Phys. Repts. 154, 77–123 (1987)
13. Potvliege, R.M. and Shakeshaft, R.: Multiphoton Processes in an Intense Laser Field: Harmonic Generation and Total Ionization Rates for Atomic Hydrogen. Phys. Rev. A 40, 3061–3079 (1990)
14. Buchleitner, A., Delande, D. and Gay, J.-C.: Microwave Ionization of Three Dimensional Hydrogen Atoms in a Realistic Numerical Experiment. J. Opt. Soc. Am. B 12, 505–519 (1995)
15. Benenti, G., Casati, G., Maspero, G. and Shepelyansky, D.L.: Quantum Poincaré Recurrences for Hydrogen Atom in a Microwave Field. Preprint, physics/9911200
16. Schwendner, P., Seyl, F. and Schinke, R.: Photodissociation of Ar2+ in Strong Laser Fields. Chem. Phys. 217, 233–244 (1997)
17. Guerin, S. and Jauslin, H.-R.: Laser-Enhanced Tunneling through Resonant Intermediate Levels. Phys. Rev. A 55, 1262–1275 (1997)
18. Eberly, J.H. and Kulander, K.C.: Atomic Stabilization by Super-Intense Lasers. Science 262, 1233 (1993)
19. Su, Q., Irving, B.P. and Eberly, J.H.: Ionization Modulation in Dynamic Stabilization. Laser Physics 7, 568 (1997)
20. Figueira de Morisson Faria, C., Fring, A. and Schrader, R.: Analytical Treatment of Stabilization. Preprint, physics/9808047, v2
21. Barash, D., Orel, A.E. and Vemuri, V.R.: A Genetic Search in Frequency Space for Stabilizing Atoms by High-Intensity Laser Fields. J. Comp. Info. Tech. CIT 8, 2, 103–113 (2000)
22. Rokhlenko, A. and Lebowitz, J.L.: Ionization of a Model Atom by Perturbation of the Potential. J. Math. Phys. 41, 3511–3523 (2000)
23. Costin, O., Lebowitz, J.L. and Rokhlenko, A.: Exact Results for the Ionization of a Model Quantum System. J. Phys. A 33, 6311–6319 (2000), physics/9905038, and work in preparation
24. Costin, O., Lebowitz, J.L. and Rokhlenko, A.: To appear in: Proceedings of the CRM meeting “Nonlinear Analysis and Renormalization Group”, American Mathematical Society Publications (2000), math-ph/0002003
25. Demkov, Yu.N. and Ostrovskii, V.N.: Zero Range Potentials and Their Application in Atomic Physics. New York: Plenum, 1988; Albeverio, S., Gesztesy, F., Høegh-Krohn, R. and Holden, H.: Solvable Models in Quantum Mechanics. Berlin–Heidelberg–New York: Springer-Verlag, 1988
26. Susskind, R.M., Cowley, S.C. and Valeo, E.J.: Multiphoton Ionization in a Short Range Potential: A Nonperturbative Approach. Phys. Rev. A 42, 3090–3106 (1990)
27. Scharf, G., Sonnemoser, K. and Wreszinski, W.F.: Sensitive Multiphoton Ionization. Phys. Rev. A 44, 3250–3265 (1991)
28. LaGattuta, K.J.: Laser-Assisted Scattering from a One-Dimensional δ-function potential: An Exact Solution. Phys. Rev. A 49, No. 3, 1745–1751 (1994)
29. Nikol’skii, N.K.: Treatise on the Shift Operator. Berlin–Heidelberg–New York: Springer-Verlag, 1986
30. Reed, M. and Simon, B.: Methods of Modern Mathematical Physics, Vol. I. New York: Academic Press, 1979
31. Martin, P.A.: Scattering with Time Periodic Potentials and Cyclic States. Preprint 1999, Texas
32. Kovar, T. and Martin, P.A.: Scattering with a periodically kicked interaction and cyclic states. Preprint (1999)
33. Bellissard, J.: Stability and Instability in Quantum Mechanics. In: Trends and Developments in the Eighties, Albeverio, S. and Blanchard, Ph. eds., Singapore: World Scientific, 1985, pp. 1–106
34. Jauslin, H.R. and Lebowitz, J.L.: Spectral and Stability Aspects of Quantum Chaos. Chaos 1, 114–121 (1991)
35. Costin, O. and Costin, R.D.: Rigorous WKB for Discrete Schemes with Smooth Coefficients. SIAM J. Math. Anal. 27, no. 1, 110–134 (1996)

Communicated by B. Simon

Commun. Math. Phys. 221, 27 – 56 (2001)

Communications in



How to Prove Dynamical Localization

Serguei Tcheremchantsev

MAPMO-CNRS, Département des Mathématiques, Université d’Orléans, BP 6759, 45067 Orléans Cedex 2, France. E-mail: [email protected]

Received: 16 November 2000 / Accepted: 14 February 2001

Abstract: Let H be a self-adjoint operator on $l^2(\mathbb{Z}^d)$ or $L^2(\mathbb{R}^d)$ with pure point spectrum on some interval I. We establish general necessary and sufficient conditions for dynamical localization for a given vector and on the interval of energies I. The sufficient conditions we obtain improve the existing ones such as SULE or WULE and can be useful in applications.

1. Introduction

Let H be a self-adjoint operator on the Hilbert space $\mathcal{H}$ with pure point spectrum on some interval $I \subset \mathbb{R}$. We shall consider the case $\mathcal{H} = l^2(\mathbb{Z}^d)$ as well as $\mathcal{H} = L^2(\mathbb{R}^d)$. Consider the subspace $\mathcal{H}(I)$ of $\mathcal{H}$, $\mathcal{H}(I) = P_I(H)\mathcal{H}$, where $P_I(H)$ is the spectral projector of H onto I. It is natural to say that the operator H has dynamical localization on I if for any p > 0 and well localised $\psi \in \mathcal{H}$,

\[
\sup_t r^p(t) \equiv \sup_t \bigl\langle \exp(-itH)P_I(H)\psi,\; |X|^p \exp(-itH)P_I(H)\psi \bigr\rangle < +\infty,
\tag{1.1}
\]

where X is the usual position operator. The problem of dynamical localization has been intensively studied in recent years [1–8], especially in the case of random discrete and continuous Schrödinger operators (in particular, the Anderson model). The aim of the present paper is not to give a review of the results obtained for concrete models. We are rather interested in general mathematical methods which can be used to prove dynamical localization (1.1) for any self-adjoint operator H. We hope, however, that the obtained results (especially sufficient conditions for dynamical localization) will be useful in applications.

Let $\{e_k\}$ be any orthonormal set of eigenfunctions of H complete in $\mathcal{H}(I)$. For any k we have $He_k = \lambda_k e_k$ with $\lambda_k \in I$. Suppose that $\mathcal{H} = l^2(\mathbb{Z}^d)$ (in the last section of the paper we discuss also the continuous case). Let $\psi = \delta_m$ for some $m \in \mathbb{Z}^d$. Then

\[
\psi_I(t, n) \equiv \bigl(\exp(-itH)P_I(H)\psi\bigr)(n) = \sum_k \exp(-it\lambda_k)\,e_k(n)\,\overline{e_k(m)}
\]

and

\[
\sup_t |\psi_I(t, n)| \le W(n, m), \qquad \sup_t r^p(t) \le \sum_{n \in \mathbb{Z}^d} |n|^p\,W^2(n, m),
\tag{1.2}
\]

where

\[
W(n, m) = \sum_k |e_k(n)\,e_k(m)|.
\]
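The first bound in (1.2) is just the triangle inequality applied to the eigenfunction expansion, and it can be illustrated on a small finite-dimensional model. The sketch below is only an illustration under stated assumptions: it uses the discrete sine basis on N = 8 sites (the eigenvectors of the free hopping operator on a finite chain, which are not localized), and all names are hypothetical. It checks that $|\psi(t, n)| \le W(n, m)$ for several times t, and that the expansion reproduces $\delta_m$ at t = 0.

```python
import cmath
import math

N = 8  # number of lattice sites in this toy model

def e(k, n):
    """Orthonormal sine basis: k-th eigenvector of the hopping operator on N sites."""
    return math.sqrt(2.0 / (N + 1)) * math.sin(math.pi * (k + 1) * (n + 1) / (N + 1))

def lam(k):
    """Eigenvalue associated with e(k, .)."""
    return 2.0 * math.cos(math.pi * (k + 1) / (N + 1))

def psi(t, n, m):
    """Evolution of delta_m: sum_k exp(-i t lam_k) e_k(n) e_k(m) (e_k is real here)."""
    return sum(cmath.exp(-1j * t * lam(k)) * e(k, n) * e(k, m) for k in range(N))

def W(n, m):
    """W(n, m) = sum_k |e_k(n) e_k(m)|, the time-independent bound in (1.2)."""
    return sum(abs(e(k, n) * e(k, m)) for k in range(N))

m = 2
for t in (0.0, 0.3, 1.7, 10.0, 123.4):
    assert all(abs(psi(t, n, m)) <= W(n, m) + 1e-12 for n in range(N))
print([round(W(n, m), 4) for n in range(N)])
```

In this toy model W(n, m) does not decay in n, which is consistent with the absence of localization for the free operator; the point of the paper is to find conditions on the eigenfunctions under which W does decay fast.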

To prove dynamical localization (1.1) for any ψ = δm , it is sufficient to show that the function W (n, m) is fast decaying as |n| → ∞ for all m ∈ Zd . What one usually proves for “concrete” Schrödinger operators is the so-called exponential (or mathematical) localization on the interval I . Namely, for some α > 0 and any eigenfunction ek ∈ H(I ), |ek (n)| ≤ C(k) exp(−α|n|),

(1.3)

where C(k) < +∞. If the sum is finite, it is evident that W (n, m) is exponentially decaying in n for any m and (1.1) holds. However, typically the sum has infinitely many terms. In this case the bounds (1.3) give nothing about the behaviour of the sum W (n, m). The well known example of [5] shows that it is quite possible that (1.3) holds, but there is no dynamical localization for ψ = δ0 and r 2 (tn )/tn2−δ → +∞ for any δ > 0 for some sequence tn → ∞. It is clear that to prove dynamical localization, one should control the decay of |ek (n)ek (m)| both in n and in k. Or, equivalently, one should control the constants C(k) in (1.3). Two approaches have been developed to solve this problem. The first is to estimate rather directly |ek (n)ek (m)| and to prove that the sum W (n, m) is fast decaying as |n| → ∞ [7, 8]. For example, one shows that

where

|ek (n)ek (m)| ≤ C(m)ak exp(−α|n − m|), k ak

(1.4)

= D < +∞. Clearly, (1.4) yields W (n, m) ≤ C(m)D exp(−α|n − m|)

and (1.1) holds. In particular, the condition called WULE was considered [7]. This is 2 and B is the operator of multiplication by (1 + |n|2 )−δ/2 with (1.4), where ak = Bek δ > d/2. Obviously, as k |ek (n)|2 ≤ 1 for any n, ak ≤ (|n|2 + 1)−δ < +∞. k

n

Another possibility is to have some control of constants C(k) in (1.3). Namely, the following condition called SULE was introduced in [5]. The operator H has SULE on I , if H has a complete set {ek } in H(I ) of orthonormal eigenfunctions and there exist α > 0 and nk ∈ Zd such that for any δ > 0, |ek (n)| ≤ C(δ) exp(δ|nk | − α|n − nk |)

(1.5)

with some finite constants C(δ) uniform in k, n. One shows that if (1.5) holds, then ek (m) are fast decaying in k for all m. So, it is easy to see that the function W (n, m) is exponentially decaying in n for all m and (1.1) holds (in fact, one controls also the behaviour in m, so one proves (1.1) for any exponentially decaying ψ). This result


was extended to the continuous case in [6] and applied in [6, 4] to prove dynamical localization for some concrete models. Similar ideas are used in [3] to prove strong dynamical localization.

In the present paper we propose a new approach which considerably improves the existing methods. Surprisingly, one can show that if the function $|e_k(n)e_k(m)|$ decays sufficiently fast in n uniformly in k, then one need not take special care of the decay in k necessary for the convergence of the sum W(n, m). The key point is the following. Due to the fact that the system $\{e_k\}$ is orthonormal in $\mathcal{H}$, the decay in n and the decay in k of $|e_k(n)e_k(m)|$ are related. Therefore, one can “sacrifice” some decay in n to obtain a decay in k sufficient for the convergence of the sum. One can even allow some growth in k in the bounds for $|e_k(n)e_k(m)|$ provided the decay in n is fast enough (see Theorem 5.3 and Theorem 5.4). One should say that there is a deep relation between our approach and that of [5].

The main technical ingredient in our consideration is the following result (Theorem 2.2 and Theorem 6.1). Let $\{e_k\}$ be any orthonormal system in $\mathcal{H}$. For any p > 0 define the positive numbers

\[
d_k(p) = \sum_n |e_k(n)|^2\,(|n| + 1)^p, \qquad \eta_k(p) = \sup_n \bigl(|e_k(n)|^2 \exp(p|n|)\bigr)
\]

(where it is possible that $d_k(p) = +\infty$, $\eta_k(p) = +\infty$). For any p > 0, one can reorder $d_k(p)$, $\eta_k(p)$ so that

\[
d_k(p) \ge D(p, d)\,k^{p/d}, \qquad \eta_k(p) \ge C(p, d) \exp(\beta k^{1/d})
\tag{1.6}
\]

with some positive universal constants D(p, d), C(p, d), β(p, d). Considering the SULE condition, one can easily verify the fact that $|n_k| \ge Ck^{1/d}$ (after reordering) implies $d_k(p) \ge Dk^{p/d}$ and $\eta_k(p) \ge C\exp(\beta k^{1/d})$ for any p > 0. So, the growth of $|n_k|$ for the systems $\{e_k\}$ with SULE, which plays a key role in the proof of [5], can be considered as a manifestation of a more general result (1.6) valid for any orthonormal system. When proving (1.6), we don’t need any assumptions about the form of eigenfunctions $e_k$ or the notion of “center of localization” $n_k$.

The paper is organised as follows. In Sect. 2 we prove our main technical result about the growth of the moments $d_k(p)$. In Sect. 3 we establish necessary conditions for dynamical localization for a given vector ψ. In particular, we show (Theorem 3.4) that if

\[
\sup_t |X|^p_\psi(t) \equiv \sup_t \bigl\langle \exp(-itH)\psi,\; |X|^p \exp(-itH)\psi \bigr\rangle < +\infty
\tag{1.7}
\]

for some p > 0, then the coefficients of the spectral measure of ψ decay sufficiently fast. Namely, if $\mu_\psi = \sum_k a_k \delta_{\lambda_k}$, then

\[
\sum_k a_k^{\frac{1}{1+\beta}} < +\infty
\]

for any 0 < β < p/d. In particular, if (1.7) holds for any p > 0, then $a_k$ are fast decaying:

\[
\sum_k a_k^{\nu} < +\infty \quad \forall \nu > 0.
\]

In Sect. 4 we give some sufficient conditions for dynamical localization for a given vector ψ and p > 0 (Theorem 4.2 is the main result of the section). As a result, we obtain


(Corollary 4.3) general necessary and sufficient conditions for dynamical localization for a given vector ψ for all p > 0. Namely, let $\psi \in \mathcal{H}_{pp}$. One can always represent it as

$$\psi = \sum_k \psi_k, \qquad \psi_k \in \mathcal{H}_{\lambda_k},$$

where $\lambda_k \ne \lambda_s$ for $k \ne s$ and $\mathcal{H}_\lambda$ is an eigenspace of H corresponding to the eigenvalue λ. Consider the function

$$R_\psi(n) = \sup_k |\psi_k(n)|.$$

Then (1.7) holds for any p > 0 iff $R_\psi(n)$ is fast decaying. We show also that the dynamical localization

$$\sup_t |X|^p_\psi(t) < +\infty \quad \forall p > 0$$

is equivalent to the dynamical localization for Cesaro averages:

$$\sup_T \frac{1}{T}\int_0^T |X|^p_\psi(t)\,dt < +\infty \quad \forall p > 0.$$

The results of Sects. 3 and 4 are well adapted to the case of power law or subexponential decay of eigenfunctions of H. In Sect. 5 we discuss the problem of dynamical localization on the interval of energies I (in the sense (1.1)). Theorems 5.3 and 5.4 give sufficient conditions that (1.1) hold for any finite ψ and any fast decaying ψ respectively. We give also some bound (Theorem 5.5) which can be used to prove the strong dynamical localization on I considered in [3, 8] (in the case when there is a family of operators H depending on some parameter). In Sect. 6 we consider exponential dynamical localization:

$$\sup_t |(\exp(-itH)\psi)(t,n)| \le C\exp(-\gamma|n|), \qquad \gamma > 0, \tag{1.8}$$

(denoted as EDL(ψ)) which is a special case of dynamical localization. In particular, we show that if (1.8) holds for some vector ψ, then the coefficients of the spectral measure of ψ decay (after reordering) as follows: $a_k \le C\exp(-\beta k^{1/d})$ with some β > 0. We give also (Theorem 6.5) some sufficient conditions for exponential localization on the interval of energies I. In particular, if

$$|e_k(n)e_k(m)| \le C\exp(-\alpha|n-m| + \beta|m|) \tag{1.9}$$

for some α > 0, β > 0, then $EDL(P_I(H)\psi)$ holds for any exponentially decaying ψ. The condition (1.9) is similar to (1.4), but is much easier to prove in concrete cases because one does not need the decreasing factors $a_k$. In particular, the SULE condition (1.5) immediately implies (1.9). In Sect. 7 we consider the continuous case $\mathcal{H} = L^2(\mathbb{R}^d)$. We show that most of the results proved in the discrete case remain true under some additional assumptions about eigenfunctions of H. In particular, this is the case if $H = -\Delta + V(x)$ with V(x) bounded from below and I = (−∞, K], K ∈ R. One should stress that practically all results of the paper hold regardless of the multiplicity of the spectrum of H.



2. Lower Bounds for the Moments of Eigenfunctions

Let $\{a_{kn}\}$ be a double sequence of nonnegative numbers labelled by $k \in \mathbb{N}$, $n \in \mathbb{Z}^d$. We shall suppose that there exist two positive constants A, B such that

$$\forall n \in \mathbb{Z}^d, \qquad \sum_{k=1}^{\infty} a_{kn} \le A < +\infty, \tag{2.1}$$

$$\forall k \in \mathbb{N}, \qquad \sum_{n \in \mathbb{Z}^d} a_{kn} \ge B > 0. \tag{2.2}$$

For p > 0 define the numbers

$$d_k(p) = \sum_{n \in \mathbb{Z}^d} a_{kn}(|n|+1)^p$$

with the understanding that $d_k(p)$ may be equal to +∞. One can also remark that $d_k(p) \ge B$ for any k, p.

Lemma 2.1. Let p > 0. There exists a positive constant D(A, B, p, d) such that for any $\{a_{kn}\}$ satisfying (2.1), (2.2), one can reorder $d_k(p)$ so that

$$d_k(p) \ge D k^{p/d}. \tag{2.3}$$

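Proof. For any N > 0 consider the following set in $\mathbb{N}$:

$$J(N) = \Big\{ k \in \mathbb{N} \ \Big|\ \sum_{n:\,|n| \ge N} a_{kn} \le B/2 \Big\}$$

and the sum

$$S(N) = \sum_{k \in J(N)} \sum_{|n| < N} a_{kn}.$$

[...] (2.7)

[...] N > 0 such that $L = \frac{B}{2}(N+1)^p$. It follows from the definition of the set I(N, p) and (2.7) that

$$\mathrm{Card}(\{k \in \mathbb{N} \mid d_k(p) \le L\}) \le C(d)\,\frac{A}{B}\left(\frac{2L}{B}\right)^{d/p}. \tag{2.8}$$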

The bound (2.8) implies, in particular, that the set $\{k \in \mathbb{N} \mid d_k(p) \le L\}$ is finite for any L > 0. Reordering $d_k(p)$ in such a way that they increase with k, we obtain the result of the lemma.

As a direct application of this lemma, we obtain the following important result.

Theorem 2.2. Let $\{e_k\}$, $k \in \mathbb{N}$, be any orthonormal system in $l^2(\mathbb{Z}^d)$ (not necessarily complete). For any p > 0 define the moments

$$d_k(p) = \sum_{n \in \mathbb{Z}^d} |e_k(n)|^2 (|n|+1)^p.$$

One can reorder $d_k(p)$ so that

$$d_k(p) \ge D(p,d)\,k^{p/d},$$

where the positive constants D are the same for any system $\{e_k\}$.

Proof. Let $a_{kn} = |e_k(n)|^2$. Since the system is orthonormal,

$$\forall n \in \mathbb{Z}^d, \qquad \sum_{k=1}^{\infty} a_{kn} \le 1, \tag{2.9}$$

$$\forall k \in \mathbb{N}, \qquad \sum_{n \in \mathbb{Z}^d} a_{kn} = 1.$$

(One has the equality in (2.9) if the system is complete.) The result of the theorem follows directly from Lemma 2.1, where A = B = 1.
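As an informal numerical illustration of Theorem 2.2 (not part of the original argument), the following sketch computes the moments of the canonical basis vectors $\delta_m$, for which $d(p) = (|m|+1)^p$, sorts them in increasing order and compares them with $k^{p/d}$. It assumes that |n| is the max norm on $\mathbb{Z}^d$ and truncates the lattice to a finite box; the function names `canonical_moments` and `growth_ratios` are hypothetical.

```python
from itertools import product


def canonical_moments(d, radius, p):
    """Sorted moments (|m|+1)^p of the vectors delta_m with |m| <= radius.

    Assumption: |m| is the max norm of the lattice point m in Z^d.
    """
    points = product(range(-radius, radius + 1), repeat=d)
    return sorted((max(abs(c) for c in m) + 1) ** p for m in points)


def growth_ratios(d, radius, p):
    """Ratios d_k(p) / k^(p/d) for the reordered (increasing) moments."""
    moments = canonical_moments(d, radius, p)
    return [dk / k ** (p / d) for k, dk in enumerate(moments, start=1)]


ratios = growth_ratios(d=2, radius=10, p=3.0)
print(min(ratios), max(ratios))
```

For this particular system the ratios stay between two positive constants, which is the two-sided behaviour $C_1 k^{p/d} \le d_k(p) \le C_2 k^{p/d}$ one expects for well-localised systems.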



Remark 2.3. The result of the theorem is optimal since there exist orthonormal systems such that $C_1 k^{p/d} \le d_k(p) \le C_2 k^{p/d}$. The simplest example is the canonical basis in $l^2(\mathbb{Z}^d)$ or, more generally, complete systems with SULE [5], where the functions $e_k(n)$ are well localised and fast decaying at infinity. One can observe that if the system is not complete, then $d_k(p)$ can grow arbitrarily fast (taking, for example, $e_k = \delta_{m(k)}$ with fast growing |m(k)|). Even if the system is complete but the functions $e_k$ do not decay too fast at infinity, it is possible that $d_k(p)$ are fast growing (in particular, $d_k(p) = +\infty$ for any k). An interesting problem: is it possible that $d_k(p)$ grow faster than $k^{p/d}$ for some complete systems where all the functions $e_k(n)$ decay fast (for example, exponentially) as |n| → +∞?

3. Localization for a Given Vector: Necessary Conditions

Let H be a self-adjoint operator in $\mathcal{H} = l^2(\mathbb{Z}^d)$, $\psi \in \mathcal{H}$, $\|\psi\| = 1$, $\psi(t) = \exp(-itH)\psi$. For any p > 0, t ∈ R, define the moments of the position operator

$$|X|^p_\psi(t) = \sum_{n \in \mathbb{Z}^d} |\psi(t,n)|^2 (|n|+1)^p.$$

We prefer to take $(|n|+1)^p$ rather than $|n|^p$ to avoid some technical problems in the proofs.

Definition 3.1. One has dynamical localization for ψ and the moment of order p, if

$$\sup_t |X|^p_\psi(t) < +\infty.$$

We shall denote it as DL(ψ, p). One has Cesaro dynamical localization CDL(ψ, p) if

$$\sup_T \langle |X|^p_\psi \rangle(T) \equiv \sup_T \frac{1}{T}\int_0^T |X|^p_\psi(t)\,dt < +\infty.$$

Clearly, DL(ψ, p) ⇒ CDL(ψ, p).

Definition 3.2. One has dynamical localization (Cesaro dynamical localization) for ψ if DL(ψ, p) (respectively, CDL(ψ, p)) holds for any p > 0. We shall write DL(ψ) and CDL(ψ) respectively.

Again, DL(ψ) ⇒ CDL(ψ). Let $\mathcal{H}_c$ be the subspace of continuous spectrum of H and $P_c$ be the orthogonal projection on it. It is well known that if $P_c\psi \ne 0$, then $|X|^p_\psi(t) \to +\infty$ as t → ∞ for any p > 0. So dynamical localization is possible only if $\psi \in \mathcal{H}_{pp}$, the subspace of pure point spectrum of H. We shall denote by λ the eigenvalues of H and by $\mathcal{H}_\lambda$ the corresponding eigenspaces: $\mathcal{H}_\lambda = \{\varphi \mid H\varphi = \lambda\varphi\}$.


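Clearly, the subspaces $\mathcal{H}_\lambda$ and $\mathcal{H}_\mu$ are mutually orthogonal for $\lambda \ne \mu$ and $\mathcal{H}_{pp} = \oplus_\lambda \mathcal{H}_\lambda$. We shall denote by $P_\lambda$ the orthogonal projection on $\mathcal{H}_\lambda$. For any $\psi \in \mathcal{H}_{pp}$ consider the set (at most countable) $I(\psi) = \{\lambda \mid \psi_\lambda \equiv P_\lambda\psi \ne 0\}$. Then ψ can be written as

$$\psi = \sum_{k=1}^{+\infty} \psi_k, \qquad \psi_k \ne 0, \quad \psi_k \in \mathcal{H}_{\lambda_k}, \quad \lambda_k \in I(\psi),$$

where $\lambda_k \ne \lambda_s$ for $k \ne s$. (It is possible that the sum is finite; in this case the problem of dynamical localization for the vector ψ is rather trivial.) For any k define $e_k = \|\psi_k\|^{-1}\psi_k$. As $\mathcal{H}_{\lambda_k}$ and $\mathcal{H}_{\lambda_s}$ are mutually orthogonal for $k \ne s$, the system $M(\psi) \equiv \{e_k\}$ is orthonormal in $\mathcal{H}_{pp}$. Finally, any $\psi \in \mathcal{H}_{pp}$ can be written as

$$\psi = \sum_k \gamma_k e_k,$$

where $\gamma_k = \langle \psi, e_k \rangle$, $He_k = \lambda_k e_k$ and $M(\psi) = \{e_k\}$ is some orthonormal system of eigenfunctions of H depending on ψ. It is obvious that

$$\psi(t) = \sum_k \exp(-it\lambda_k)\gamma_k e_k. \tag{3.1}$$

Let $d_k(p)$ be the moments of the functions $e_k$:

$$d_k(p) = \sum_{n \in \mathbb{Z}^d} |e_k(n)|^2 (|n|+1)^p.$$

One can also note that the spectral measure of ψ is equal to

$$\mu_\psi = \sum_k a_k \delta_{\lambda_k}, \tag{3.2}$$

where $a_k = |\gamma_k|^2 > 0$, $\sum_k a_k = \|\psi\|^2 = 1$. The first result we shall prove is a necessary condition for dynamical localization.

Theorem 3.3.
1. For any p > 0,

$$\liminf_{T \to \infty} \langle |X|^p_\psi \rangle(T) \ge \sum_{k=1}^{\infty} a_k d_k(p)$$

(with the convention that if $d_k(p) = +\infty$ for some k, then $\sum_k = +\infty$).
2. DL(ψ, p) ⇒ CDL(ψ, p) ⇒ $d_k(p) < +\infty$ for any k and $\sum_k a_k d_k(p) < +\infty$.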



Proof. It follows from (3.1) that

$$|\psi(t,n)|^2 = \sum_{k,m=1}^{\infty} \exp(-it(\lambda_k - \lambda_m))\,\gamma_k \overline{\gamma_m}\, e_k(n)\overline{e_m(n)}. \tag{3.3}$$

The sum in (3.3) is absolutely converging for any n because

$$\sum_k |\gamma_k|^2 = 1, \qquad \sum_k |e_k(n)|^2 \le 1.$$

Therefore, for any N > 0, T ∈ R,

$$A(N,T) \equiv \frac{1}{T}\int_0^T dt \sum_{|n| \le N} |\psi(t,n)|^2 (|n|+1)^p = \sum_{k,m=1}^{\infty} b_{km}(T)\,\gamma_k \overline{\gamma_m}\, d_{km}(p,N), \tag{3.4}$$

where $d_{km}(p,N) = \sum_{|n| \le N} e_k(n)\overline{e_m(n)}(|n|+1)^p$, $b_{km}(T) = 1$ for k = m and $b_{km}(T) = (\exp(-iT(\lambda_k - \lambda_m)) - 1)/(-iT(\lambda_k - \lambda_m))$ for $k \ne m$. As $A(N,T) \le \langle |X|^p_\psi \rangle(T)$, for any N > 0 we have the inequality

$$\liminf_{T \to \infty} \langle |X|^p_\psi \rangle(T) \ge \liminf_{T \to \infty} A(N,T). \tag{3.5}$$

On the other hand, due to the dominated convergence theorem, one can take the limit in (3.4) for a fixed N as T → ∞:

$$\lim_{T \to \infty} A(N,T) = \sum_k |\gamma_k|^2 \sum_{|n| \le N} |e_k(n)|^2 (|n|+1)^p. \tag{3.6}$$

As $a_k = |\gamma_k|^2$, it follows from (3.5) and (3.6) that

$$\liminf_{T \to \infty} \langle |X|^p_\psi \rangle(T) \ge \sum_k a_k \sum_{|n| \le N} |e_k(n)|^2 (|n|+1)^p$$

for any N > 0. Taking the limit N → +∞, we obtain the first statement of the theorem. The second statement follows directly from the first.

As a corollary of Theorem 3.3, we shall prove a necessary condition for dynamical localization for the vector ψ in terms of its spectral measure $\mu_\psi$ given by (3.2).

Theorem 3.4. The following statements hold:
1. For any p > 0,

$$DL(\psi, p) \Rightarrow CDL(\psi, p) \Rightarrow \sum_k a_k^{\frac{1}{1+\beta}} < +\infty$$

for all 0 < β < p/d.
2.

$$DL(\psi) \Rightarrow CDL(\psi) \Rightarrow \sum_k a_k^{\nu} < +\infty$$

for all ν > 0.


Proof. Suppose that CDL(ψ, p) holds. Theorem 3.3 implies

$$\sum_k a_k d_k(p) < +\infty.$$

One can apply Theorem 2.2 to the orthonormal system $\{e_k\}$ and reorder $d_k(p)$ so that $d_k(p) \ge Dk^{p/d}$. Therefore, after reordering,

$$\sum_k a_k k^{p/d} < +\infty.$$

Let 0 < β < p/d, $r = 1 + \beta$, $r' = 1 + 1/\beta$. Applying the Hölder inequality, one can easily see that

$$\sum_k a_k^{\frac{1}{1+\beta}} \le \Big(\sum_k a_k k^{p/d}\Big)^{\frac{1}{1+\beta}} \Big(\sum_k k^{-p/(\beta d)}\Big)^{\beta/(1+\beta)} < +\infty.$$

The fact that $\sum_k a_k^{\frac{1}{1+\beta}} < +\infty$ does not depend on reordering of $\{a_k\}$. The second part of the theorem is obvious.
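The Hölder step in this proof can be checked numerically on a finite sequence. The sketch below is only an illustration under stated assumptions: the test sequence $a_k \propto k^{-5}$ and the parameter values are hypothetical, and the helper name `holder_sides` is not from the paper. It evaluates both sides of $\sum_k a_k^{1/(1+\beta)} \le (\sum_k a_k k^{p/d})^{1/(1+\beta)} (\sum_k k^{-p/(\beta d)})^{\beta/(1+\beta)}$.

```python
def holder_sides(a, p, d, beta):
    """Return (lhs, rhs) of the Hölder step for a finite sequence a_1..a_K."""
    ks = range(1, len(a) + 1)
    lhs = sum(ak ** (1.0 / (1.0 + beta)) for ak in a)
    weighted = sum(ak * k ** (p / d) for ak, k in zip(a, ks))
    tail = sum(k ** (-p / (beta * d)) for k in ks)
    rhs = weighted ** (1.0 / (1.0 + beta)) * tail ** (beta / (1.0 + beta))
    return lhs, rhs


# Hypothetical test sequence: a_k proportional to k^(-5), normalised to sum 1.
raw = [k ** -5.0 for k in range(1, 2001)]
total = sum(raw)
a = [x / total for x in raw]

# beta = 1 satisfies 0 < beta < p/d for p = 2, d = 1.
lhs, rhs = holder_sides(a, p=2.0, d=1, beta=1.0)
print(lhs, rhs)
```

The condition β < p/d is what makes the second factor on the right-hand side stay bounded as the number of terms grows.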

Corollary 3.5. If CDL(ψ) holds, then the numbers $a_k$ are fast decaying: for any s > 0, one can reorder $\{a_k\}$ so that $a_k \le C(s)k^{-s}$.

One should stress that the statements of Theorem 3.4 and Corollary 3.5 are weaker than that of Theorem 3.3, because they do not depend on the system $\{e_k\}$, and inevitably, one loses some information about the moments $d_k(p)$. If $d_k(p)$ grow very fast as k → ∞, it is even possible that $\sum_k a_k^\nu < +\infty$ for all ν > 0, but $\sum_k a_k d_k(p) = +\infty$ for all p > 0.

Finally, in this section we shall give necessary conditions for DL(ψ, p) in terms of projections of ψ on $\mathcal{H}_{\lambda_k}$.

Lemma 3.6. Let $M = \{e_k\}$ be any orthonormal system in $\mathcal{H}$ (in particular, the system M(ψ)), $\psi \in \mathcal{H}$. Define the following function:

$$R_{\psi,M}(n) = \sup_k |\gamma_k e_k(n)|, \qquad \gamma_k = \langle \psi, e_k \rangle.$$

Then

$$\sum_{n \in \mathbb{Z}^d} R^2_{\psi,M}(n) \le \|\psi\|^2.$$

Proof. As the system $\{e_k\}$ is orthonormal, $\sum_k |\gamma_k|^2 \le \|\psi\|^2$. Therefore,

$$S = \sum_{k,n} |\gamma_k e_k(n)|^2 = \sum_k |\gamma_k|^2 \le \|\psi\|^2. \tag{3.7}$$

On the other hand,

$$S = \sum_n \sum_k |\gamma_k e_k(n)|^2 \ge \sum_n \sup_k |\gamma_k e_k(n)|^2 = \sum_n R^2_{\psi,M}(n). \tag{3.8}$$

The result of the lemma follows from (3.7)–(3.8).
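The bound of Lemma 3.6 is easy to test on a small finite example. The sketch below is an illustration only: it uses a hypothetical real orthonormal system (the normalised rows of a 4×4 Hadamard matrix) on four sites and a hypothetical vector ψ, and compares $\sum_n R^2(n)$ with $\|\psi\|^2$.

```python
def inner(u, v):
    """Real inner product on a finite index set."""
    return sum(x * y for x, y in zip(u, v))


def r_function(psi, system):
    """R(n) = max_k |<psi, e_k> e_k(n)| for a finite real orthonormal system."""
    gammas = [inner(psi, e) for e in system]
    return [max(abs(g * e[n]) for g, e in zip(gammas, system))
            for n in range(len(psi))]


# Hypothetical orthonormal system: normalised rows of a 4x4 Hadamard matrix.
system = [[0.5 * s for s in row] for row in (
    (1, 1, 1, 1), (1, -1, 1, -1), (1, 1, -1, -1), (1, -1, -1, 1))]
psi = [3.0, -1.0, 0.5, 2.0]

sum_r2 = sum(r * r for r in r_function(psi, system))
norm2 = inner(psi, psi)
print(sum_r2, norm2)
```

Here the supremum over k replaces the sum over k at each site, which is why the left-hand side can be strictly smaller than $\|\psi\|^2$.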



The function $R_{\psi,M}(n)$ (especially its decay properties) will play an important role in the next part of the paper. Lemma 3.6 implies that $R_{\psi,M}$ always lies in $l^2(\mathbb{Z}^d)$. We shall see below that if DL(ψ, p) holds, then $R_{\psi,M(\psi)}$ decays faster at infinity. On the other hand, in the next section we shall prove that a sufficiently fast decay of $R_{\psi,M}$ for some M implies DL(ψ, p).

Definition 3.7. We shall say that a function $f : \mathbb{Z}^d \to \mathbb{C}$ ($f : \mathbb{Z}^d \to \mathbb{R}$) is fast decaying if for any s > 0,

$$\sup_n |f(n)|(|n|+1)^s < +\infty.$$

Theorem 3.8. The following statements hold:
1. DL(ψ, p) ⇒ CDL(ψ, p) ⇒ $\sum_n R^2_{\psi,M(\psi)}(n)(|n|+1)^p < +\infty$.
2. DL(ψ) ⇒ CDL(ψ) ⇒ $R_{\psi,M(\psi)}$ is fast decaying.

Proof. Suppose that CDL(ψ, p) holds. Theorem 3.3 implies that $d_k(p) < +\infty$ for any k and

$$S(p) = \sum_k a_k d_k(p) \le C(p) < +\infty.$$

On the other hand, it follows from the definition of $d_k(p)$ that

$$S(p) = \sum_n (|n|+1)^p \sum_k |\gamma_k e_k(n)|^2 \ge \sum_n (|n|+1)^p \sup_k |\gamma_k e_k(n)|^2 = \sum_n (|n|+1)^p R^2_{\psi,M(\psi)}(n),$$

so we obtain the first statement of the theorem. The second statement follows directly from the first.

4. Localization for a Given Vector: Sufficient Conditions

In this section we shall give some sufficient conditions for DL(ψ, p) and DL(ψ). Let $M = \{e_k\}$ be any orthonormal system of eigenfunctions of H and $\mathcal{H}_M$ the subspace of $\mathcal{H}_{pp}$ spanned by M. For any $\psi \in \mathcal{H}_M$, we have the identity

$$\psi = \sum_k \gamma_k e_k, \qquad \gamma_k = \langle \psi, e_k \rangle. \tag{4.1}$$

We shall denote by $\lambda_k$ and $d_k(p)$ the eigenvalue and the moments of $e_k(n)$ respectively. In this section we shall consider any systems M of eigenfunctions of H, not necessarily M(ψ), so it is possible that $\lambda_k = \lambda_m$ for $k \ne m$ if the spectrum of H is not simple. The decomposition

$$\psi = \sum_k \psi_k$$

of the previous section is a special case of (4.1), when M = M(ψ). The simplest sufficient condition for DL(ψ, p) can be given in terms of $d_k(p)$ and $a_k = |\gamma_k|^2$.


Lemma 4.1. Let p > 0. The following statement holds:

$$\sup_t |X|^p_\psi(t) \le \sum_n (|n|+1)^p \sup_t |\psi(t,n)|^2 \le \Big(\sum_k \sqrt{a_k d_k(p)}\Big)^2.$$

So, if the last sum converges, DL(ψ, p) holds.

Proof. Since

$$\sup_t |\psi(t,n)| = \sup_t \Big|\sum_k \exp(-it\lambda_k)\gamma_k e_k(n)\Big| \le \sum_k |\gamma_k e_k(n)|, \tag{4.2}$$

the Cauchy–Schwarz inequality yields

$$\sum_n (|n|+1)^p \sup_t |\psi(t,n)|^2 \le \sum_{k,m} |\gamma_k \gamma_m| \sum_n |e_k(n)e_m(n)|(|n|+1)^p \le \sum_{k,m} |\gamma_k \gamma_m| \sqrt{d_k(p)d_m(p)} = \Big(\sum_k \sqrt{a_k d_k(p)}\Big)^2. \tag{4.3}$$
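The chain of inequalities in the proof of Lemma 4.1 can be illustrated on a finite example. The sketch below makes several assumptions that are not in the paper: a four-site piece of $\mathbb{Z}$ (d = 1), a hypothetical orthonormal system built from two plane rotations, and hypothetical coefficients $\gamma_k$. It uses the upper bound (4.2), $\sum_k |\gamma_k e_k(n)|$, in place of $\sup_t |\psi(t,n)|$, and compares the resulting middle term with $(\sum_k \sqrt{a_k d_k(p)})^2$.

```python
import math

SITES = [-1, 0, 1, 2]          # hypothetical finite piece of Z (d = 1)
THETA, PHI = 0.3, 1.1          # hypothetical rotation angles

# Orthonormal system: two rotated pairs supported on disjoint site pairs.
SYSTEM = [
    [math.cos(THETA), math.sin(THETA), 0.0, 0.0],
    [-math.sin(THETA), math.cos(THETA), 0.0, 0.0],
    [0.0, 0.0, math.cos(PHI), math.sin(PHI)],
    [0.0, 0.0, -math.sin(PHI), math.cos(PHI)],
]


def moment(e, p):
    """d(p) = sum_n |e(n)|^2 (|n|+1)^p over the finite site list."""
    return sum(abs(x) ** 2 * (abs(n) + 1) ** p for x, n in zip(e, SITES))


def lemma_4_1_sides(gammas, p):
    """Return (middle, right): the bound from (4.2) and the final bound of (4.3)."""
    middle = sum(
        (abs(n) + 1) ** p
        * sum(abs(g * e[i]) for g, e in zip(gammas, SYSTEM)) ** 2
        for i, n in enumerate(SITES)
    )
    right = sum(math.sqrt(abs(g) ** 2 * moment(e, p))
                for g, e in zip(gammas, SYSTEM)) ** 2
    return middle, right


middle, right = lemma_4_1_sides([0.7, -0.2, 0.5, 0.1], p=2.0)
print(middle, right)
```

Because the two pairs of vectors have disjoint supports, many cross terms vanish and the Cauchy–Schwarz step is far from sharp in this example.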

If the functions $e_k$ and $e_m$ have only small overlapping for $k \ne m$, one can better estimate the sums

$$d_{km}(p) = \sum_n |e_k(n)e_m(n)|(|n|+1)^p.$$

If $d_{km}(p)$ decay fast when |k − m| → ∞, the sum on the r.h.s. of (4.3) can be majorated by $C\sum_k a_k d_k(p)$ (or by something close to it). In this case one obtains a sufficient condition which is close to (or even identical with) the necessary condition given by Theorem 3.3. In particular, this is the case when M is the canonical basis in $l^2(\mathbb{Z}^d)$.

The sufficient condition of Lemma 4.1 is, however, difficult to apply in concrete cases, because one should control at the same time the growth of $d_k(p)$ and the decay of $a_k$. A more practical sufficient condition can be given in terms of the function $R_{\psi,M}(n)$, defined by

$$R_{\psi,M}(n) = \sup_k |\gamma_k e_k(n)|, \qquad \gamma_k = \langle \psi, e_k \rangle, \qquad \sum_k |\gamma_k|^2 = \|\psi\|^2.$$

Later on in this section, ψ and M are fixed and we omit them in notations. To prove DL(ψ, p), one shall use the trivial bound (4.2):

$$|\psi(t,n)| \le \sum_k |\gamma_k e_k(n)|.$$

Clearly, any term in the sum is majorated by R(n), and if the sum has a finite number of terms, the sufficiently fast decay of R(n) implies DL(ψ, p). However, typically it is not the case, and one needs some decay in k so that the sum converges. The key point is the following: one can sacrifice some decay in n to obtain a necessary decay in k. This is possible due to the growth of the moments $d_k(p)$ given by Theorem 2.2. Surprisingly,



one can even allow some growth in k in the bounds for $\gamma_k e_k(n)$. Namely, suppose that for some α > 0 the moments $d_k(\alpha)$ are finite for any k. Consider the function

$$R(n,\alpha) = \sup_k \frac{|\gamma_k e_k(n)|}{d_k(\alpha)}.$$

For α = 0 one has $d_k(0) = 1$ (so $d_k(\alpha)$ are always finite) and R(n, 0) coincides with the function R(n) considered above. For any s > 0 define the moments

$$L_\alpha(s) = \sum_n R^2(n,\alpha)(|n|+1)^s,$$

where $L_\alpha(s)$ depend also on ψ, M and it is possible that $L_\alpha(s) = +\infty$.

Theorem 4.2. Let $\psi \in \mathcal{H}_M$, α ≥ 0. Suppose that $d_k(\alpha) < +\infty$ for any k. The following statements hold:
1. Let δ ∈ (0, 1), ε > 0. Then

$$\sup_t |\psi(t,n)| \le C(\varepsilon,\delta,d)\, L_\alpha^{\delta/2}\Big(\frac{2\alpha + d(2-\delta)}{\delta} + \varepsilon\Big)\, R^{1-\delta}(n,\alpha), \tag{4.4}$$

where the constants C(ε, δ, d) are universal, i.e. do not depend on H, M or ψ.
2. Let p > 0, ε > 0. There exist universal constants C(ε, p, d) such that

$$\sup_t |X|^p_\psi(t) \le \sum_n (|n|+1)^p \sup_t |\psi(t,n)|^2 \le C(\varepsilon,p,d)\, L_\alpha(2\alpha + p + 2d + \varepsilon). \tag{4.5}$$

3. If R(n, α) is fast decaying in n, then so is $\sup_t |\psi(t,n)|$ and DL(ψ) holds.

Proof. Let $r = \frac{2\alpha + d(2-\delta)}{\delta} + \varepsilon$. If $L_\alpha(r) = +\infty$, the bound (4.4) is trivially true with any finite constant C. Suppose that $L_\alpha(r) < +\infty$. It follows from the definition of $d_k(r)$ and R(n, α) that for any r > 0, $k \in \mathbb{N}$,

$$a_k d_k(r) = \sum_n |\gamma_k e_k(n)|^2 (|n|+1)^r \le d_k^2(\alpha) \sum_n R^2(n,\alpha)(|n|+1)^r \equiv d_k^2(\alpha) L_\alpha(r).$$

As $L_\alpha(r) < +\infty$, $d_k(r) < +\infty$ for any k such that $a_k = |\gamma_k|^2 \ne 0$. Therefore,

$$|\gamma_k| \le \big(L_\alpha(r)\, d_k^2(\alpha)/d_k(r)\big)^{1/2}. \tag{4.6}$$

At the same time,

$$|\gamma_k e_k(n)| \le d_k(\alpha) R(n,\alpha), \tag{4.7}$$

directly from the definition of R(n, α). Using the bounds (4.6)–(4.7), one can estimate:

$$|\psi(t,n)| \le \sum_k |\gamma_k e_k(n)| \le \sum_k (d_k(\alpha)R(n,\alpha))^{1-\delta} |\gamma_k e_k(n)|^{\delta} \le R^{1-\delta}(n,\alpha)\, L_\alpha^{\delta/2}(r) \sum_k |e_k(n)|^{\delta}\, d_k(\alpha)\, d_k^{-\delta/2}(r), \tag{4.8}$$


where the summation is carried only over k such that $a_k > 0$, so $d_k(r) < +\infty$. Let s = 2/(2 − δ), s' = 2/δ. Applying the Hölder inequality, and using the fact that $\sum_k |e_k(n)|^2 \le 1$, one obtains:

$$S^s \equiv \Big(\sum_k |e_k(n)|^{\delta}\, d_k(\alpha)\, d_k^{-\delta/2}(r)\Big)^s \le \sum_k d_k^{2/(2-\delta)}(\alpha)\, d_k^{-\delta/(2-\delta)}(r). \tag{4.9}$$

Let w < q. Applying the Hölder inequality with s = q/w, s' = q/(q − w), one can estimate:

$$d_k(w) = \sum_n |e_k(n)|^{2/s}(|n|+1)^w |e_k(n)|^{2/s'} \le \Big(\sum_n |e_k(n)|^2 (|n|+1)^{ws}\Big)^{1/s} \Big(\sum_n |e_k(n)|^2\Big)^{1/s'} = (d_k(q))^{w/q}. \tag{4.10}$$

The bound (4.10) with w = α, q = r and (4.9) imply

$$S^s \le \sum_k d_k(r)^{(2\alpha/r - \delta)/(2-\delta)}. \tag{4.11}$$

Now we shall use the result of Theorem 2.2. One can reorder $d_k(r)$ so that

$$d_k(r) \ge D(r,d)\,k^{r/d}. \tag{4.12}$$

The choice of r and (4.11)–(4.12) yield

$$S \le C(\varepsilon,\delta,d) < +\infty \tag{4.13}$$

with some universal constants C. The first statement of the theorem follows from (4.8) and (4.13).

To prove the second statement, we shall use the inequality (4.6) with r = 2α + p + 2d + ε and the bound of Lemma 4.1. Again, if $L_\alpha(r) = +\infty$, there is nothing to prove. Suppose that $L_\alpha(r) < +\infty$. One obtains

$$\sum_n (|n|+1)^p \sup_t |\psi(t,n)|^2 \le L_\alpha(r) \Big(\sum_k \sqrt{\frac{d_k^2(\alpha)\, d_k(p)}{d_k(r)}}\Big)^2. \tag{4.14}$$

Applying twice the bound (4.10), we obtain

$$\sum_n (|n|+1)^p \sup_t |\psi(t,n)|^2 \le L_\alpha(r) \Big(\sum_k d_k(r)^{\frac{2\alpha + p - r}{2r}}\Big)^2. \tag{4.15}$$

Again, by Theorem 2.2 and the choice of r, the sum converges and the second statement of the theorem follows directly from (4.15). The third statement follows directly from the first and the second.

As a direct application of Theorem 4.2 we obtain the necessary and sufficient conditions for DL(ψ).

Corollary 4.3. Let $\psi \in \mathcal{H}_{pp}$ and R(n) = R(n, 0) be defined with the system M(ψ) as in the previous section. Then CDL(ψ) ⇔ DL(ψ) ⇔ R(n) is fast decaying.

The result follows from Theorem 3.8 and Theorem 4.2.
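The interpolation bound (4.10), $d_k(w) \le (d_k(q))^{w/q}$ for w < q, is used twice in the proof of Theorem 4.2 and can be checked numerically for a single normalised vector. The sketch below is only an illustration: the power-law profile, the finite window of sites in $\mathbb{Z}$ and the exponents w, q are hypothetical choices.

```python
def moment(e, sites, p):
    """d(p) = sum_n |e(n)|^2 (|n|+1)^p for a vector e given on a finite site list."""
    return sum(abs(x) ** 2 * (abs(n) + 1) ** p for x, n in zip(e, sites))


# Hypothetical normalised profile with power-law decay on a finite window of Z.
sites = list(range(-20, 21))
raw = [1.0 / (abs(n) + 1) ** 3 for n in sites]
norm = sum(x * x for x in raw) ** 0.5
e = [x / norm for x in raw]

w, q = 1.5, 4.0
lower = moment(e, sites, w)               # d(w)
upper = moment(e, sites, q) ** (w / q)    # d(q)^(w/q)
print(lower, upper)
```

The normalisation $\sum_n |e(n)|^2 = 1$ matters here: it is what makes the second Hölder factor in (4.10) equal to one.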



5. Localization on the Interval of Energies

In the previous parts of the paper the vector $\psi \in \mathcal{H}_{pp}$ was fixed and we did not suppose anything about decay of ψ or about the set of $\lambda_k$ such that $\psi_k \ne 0$. In this section we shall consider the set of functions ψ with some decaying properties at infinity and with the energies $\lambda_k$ from some interval I ⊂ R (bounded or not). First of all, if DL(ψ) holds, then, in particular, for any p > 0,

$$|X|^p_\psi(0) = \sum_n |\psi(n)|^2 (|n|+1)^p = C(p) < +\infty,$$

so ψ is fast decaying. Therefore, the largest set of ψ for which one could prove DL(ψ) is the set of fast decaying functions: $A = \{\psi \in \mathcal{H} \mid \psi \text{ fast decaying}\}$. We shall also consider the set of finite functions $B = \{\psi \in \mathcal{H} \mid \psi \text{ finite}\}$, which is a subset of A. The set of ψ exponentially decaying at infinity is intermediate between A and B and will be considered in the next section.

Let I ⊂ R be some interval (bounded or not). We shall denote by $\mathcal{H}(I)$ the subspace of $\mathcal{H}_{pp}$ with the energies from I,

$$\mathcal{H}(I) = \oplus_{\lambda \in I} \mathcal{H}_\lambda,$$

and by $P_I(H)$ the orthogonal projection on $\mathcal{H}(I)$.

Definition 5.1. The operator H has an A-dynamical localization on I if for any ψ ∈ A, we have $DL(P_I(H)\psi)$. H has a B-dynamical localization on I if for any ψ ∈ B we have $DL(P_I(H)\psi)$.

Let $M = \{e_k\}$ be any orthonormal system of eigenfunctions of H complete in $\mathcal{H}(I)$. One can obtain such systems by choosing for all eigenvalues λ ∈ I orthonormal systems $M_\lambda$ complete in $\mathcal{H}_\lambda$ and then taking $M = \cup_{\lambda \in I} M_\lambda$. Clearly, M is unique if the spectrum of H is simple on I. For any $\varphi \in \mathcal{H}(I)$ the identity holds:

$$\varphi = \sum_k \gamma_k e_k, \qquad \gamma_k = \langle \varphi, e_k \rangle.$$

Suppose that for some α ≥ 0,

$$\sum_n |g(n)|^2 (|n|+1)^\alpha < +\infty$$

for any eigenfunction g of H from $\mathcal{H}(I)$ (if α = 0, this is always true). Define as in the previous section

$$R_{\varphi,M}(n,\alpha) = \sup_k \frac{|\gamma_k e_k(n)|}{d_k(\alpha)}.$$


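Define also three functions from $\mathbb{Z}^{2d}$ to $\mathbb{R}_+$:

$$F_\alpha(n,m) = \sup_k \frac{|e_k(n)e_k(m)|}{d_k(\alpha)}, \qquad G_\alpha(n,m) = \sup_{g \in L} \frac{|g(n)g(m)|}{d_g(\alpha)},$$

where

$$L = \{g \in \mathcal{H} \mid Hg = \lambda g,\ \lambda \in I,\ \|g\| = 1\}, \qquad d_g(\alpha) = \sum_n |g(n)|^2 (|n|+1)^\alpha,$$

and

$$U_\alpha(n,m) = \sup_{\tilde g \in K} |\tilde g(n)\tilde g(m)|,$$

where

$$K = \{\tilde g \mid H\tilde g = \lambda \tilde g,\ \lambda \in I, \text{ and } \forall n \in \mathbb{Z}^d,\ |\tilde g(n)| \le (|n|+1)^{-\alpha/2}\}.$$

One can see that

$$F_\alpha(n,m) \le G_\alpha(n,m) \le U_\alpha(n,m). \tag{5.1}$$

The first inequality in (5.1) is obvious. To prove the second, for any g ∈ L define $\tilde g(n) = (d_g(\alpha))^{-1/2} g(n)$, so that

$$\frac{|g(n)g(m)|}{d_g(\alpha)} = |\tilde g(n)\tilde g(m)|.$$

One verifies that

$$\sum_n |\tilde g(n)|^2 (|n|+1)^\alpha = 1,$$

so $\tilde g \in K$ and the second inequality in (5.1) holds.

Lemma 5.2. Let α ≥ 0, $\psi \in \mathcal{H}$, $\varphi = P_I(H)\psi$. The inequality holds:

$$R_{\varphi,M}(n,\alpha) \le \sum_{m \in \mathbb{Z}^d} N_\alpha(n,m)|\psi(m)|, \tag{5.2}$$

where $N_\alpha$ is one of the three functions $F_\alpha$, $G_\alpha$, $U_\alpha$.

Proof. As $\varphi = P_I(H)\psi$ and the system M is complete in $\mathcal{H}(I)$,

$$\varphi = \sum_k \gamma_k e_k, \qquad \gamma_k = \langle \varphi, e_k \rangle = \langle \psi, e_k \rangle.$$

Therefore,

$$\gamma_k = \sum_m \psi(m)\overline{e_k(m)}$$

and

$$\frac{|\gamma_k e_k(n)|}{d_k(\alpha)} \le \sum_m \frac{|e_k(n)e_k(m)|}{d_k(\alpha)}\,|\psi(m)|. \tag{5.3}$$

Taking in (5.3) the supremum over k, we obtain the statement of the lemma for $F_\alpha$. The inequality (5.1) yields the result for $G_\alpha$ and $U_\alpha$.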



The bound (5.2) combined with Theorem 4.2 allows us to give sufficient conditions for B- and A-dynamical localization on I.

Theorem 5.3. The following statements hold:
1. Let α ≥ 0. If one of the three functions $F_\alpha(n,m)$, $G_\alpha(n,m)$, $U_\alpha(n,m)$ is fast decaying in n for all fixed $m \in \mathbb{Z}^d$, then B-dynamical localization holds on I. In particular, $P_I(H)\psi \in A$ for any ψ ∈ B.
2. If the spectrum of H is simple on I and B-dynamical localization holds on I, then the function $F_0(n,m) = G_0(n,m)$ is fast decaying in n for all fixed $m \in \mathbb{Z}^d$.

Proof. Let ψ ∈ B and $N_\alpha$ be one of the three functions $F_\alpha$, $G_\alpha$, $U_\alpha$. As the function ψ is finite and $N_\alpha(n,m)$ is fast decaying in n for any m, (5.2) implies that $R_{\varphi,M}(n,\alpha)$ is fast decaying in n. The third statement of Theorem 4.2 implies DL(ϕ), so B-dynamical localization holds on I.

To prove the second statement of the theorem, observe that since the spectrum of H is simple on I, the system M is unique and coincides with the set of normalised eigenfunctions of H with eigenvalues from I. Therefore, $F_\alpha(n,m) = G_\alpha(n,m)$. Moreover, one sees easily that for any $\varphi \in \mathcal{H}(I)$, M(ϕ) is a subset of M, where M(ϕ) was defined in Sect. 3. Namely, $M(\varphi) = \{e_k\}_{k \in J}$, $J = \{k \mid \gamma_k \ne 0\}$. Since $\gamma_k = 0$ for any $k \notin J$,

$$R_{\varphi,M(\varphi)}(n,0) = \sup_{k \in J} |\gamma_k e_k(n)| = \sup_k |\gamma_k e_k(n)| = R_{\varphi,M}(n,0).$$

Let ψ ∈ B, so that $DL(\varphi) \equiv DL(P_I(H)\psi)$ holds. By the second statement of Theorem 3.8, the function $R_{\varphi,M}(n,0)$ is fast decaying in n. In particular, if $\psi = \delta_m$, $m \in \mathbb{Z}^d$, then $\gamma_k = \overline{e_k(m)}$ and $R_{\varphi,M}(n,0) = F_0(n,m)$ is fast decaying in n, so the second statement of the theorem holds.

As to the A-dynamical localization on I, there are many possible sufficient conditions to propose. For example, the following result holds.

Theorem 5.4. Let α ≥ 0 and let $N_\alpha$ be one of the three functions $F_\alpha$, $G_\alpha$, $U_\alpha$. Assume that one of the two conditions holds:
1. For any s > 0 there exist two finite positive constants k(s), C(s) such that

$$N_\alpha(n,m) \le C(s)(|m|+1)^{k(s)}(|n|+1)^{-s}. \tag{5.4}$$

2. For any s > 0 there exist two finite positive constants k(s), C(s) such that

$$N_\alpha(n,m) \le C(s)(|m|+1)^{k(s)}(|n-m|+1)^{-s}. \tag{5.5}$$

Then A-dynamical localization holds on I.

Proof. Let ψ ∈ A, so

$$|\psi(m)| \le C(r)(|m|+1)^{-r}$$

for any r > 0. For any s > 0 the bounds (5.2) and (5.4) yield

$$R_{\varphi,M}(n,\alpha) \le C(r)C(s)(|n|+1)^{-s} \sum_m (|m|+1)^{k(s)-r}.$$


Taking r = k(s) + 2d, we see that $R_{\varphi,M}(n,\alpha)$ is fast decaying in n. The first statement of the theorem follows from the third statement of Theorem 4.2. In the case of (5.5) the proof is similar.

Up to now, the operator H was fixed. Suppose that there is a family of self-adjoint operators H(θ) depending on some parameter θ ∈ [...]

[...] > 0 such that

$$\sup_t |\psi(t,n)| \le C\exp(-\gamma|n|).$$

We shall denote it as EDL(ψ). Clearly, EDL(ψ) implies DL(ψ). To establish necessary and sufficient conditions for EDL(ψ) we shall need the following version of Theorem 2.2.



Theorem 6.1. Let $\{e_k\}$ be any orthonormal system in $\mathcal{H}$, γ > 0. For any k define the numbers

$$\eta_k(\gamma) = \sup_n \left(|e_k(n)|^2 \exp(\gamma|n|)\right).$$

One can reorder $\eta_k(\gamma)$ so that

$$\eta_k(\gamma) \ge D\exp(\beta k^{1/d})$$

with some universal positive constants D(γ, d), β(γ, d).

Proof. We shall follow the proof of Lemma 2.1 with $a_{kn} = |e_k(n)|^2$ and A = B = 1. Let N > 0, then for the set

$$J(N) = \Big\{k \ \Big|\ \sum_{|n| > N} |e_k(n)|^2 \le 1/2\Big\}$$

we have

$$\mathrm{Card}(J(N)) \le K(N+1)^d.$$

Let L > 0. Consider the following set in $\mathbb{N}$: $I(L) = \{k \mid \eta_k(\gamma) \le L\}$.

It follows from the definition of $\eta_k(\gamma)$ that for any k ∈ I(L), $|e_k(n)|^2 \le L\exp(-\gamma|n|)$. Therefore, for any ν > 0,

$$\sum_{|n| > N} |e_k(n)|^2 \le C(\nu,d)\,L\exp(-(\gamma - \nu)N).$$

Let L be such that $C(\nu,d)L\exp(-(\gamma-\nu)N) = 1/2$. Then I(L) ⊂ J(N) and

$$\mathrm{Card}(I(L)) \le \mathrm{Card}(J(N)) \le K(N+1)^d \le C(\gamma,\nu,d)\log^d L \tag{6.1}$$

for any $L \ge L_0(\gamma,\nu,d)$. The result of the theorem follows directly from (6.1).

With this result we can obtain a necessary condition for EDL(ψ) in terms of projections $\psi_k$ and in terms of coefficients of the spectral measure of ψ. Let M(ψ) be the orthonormal system of eigenfunctions of H defined in Sect. 3 and $R_{\psi,M(\psi)}(n) = \sup_k |\gamma_k e_k(n)|$, where $\gamma_k = \langle \psi, e_k \rangle$, $He_k = \lambda_k e_k$. The spectral measure of ψ can be written as

$$\mu_\psi = \sum_k a_k \delta_{\lambda_k}.$$

Theorem 6.2. Suppose that

$$\sup_t |\psi(t,n)| \le C\exp(-\alpha|n|) \tag{6.2}$$

for some α > 0. Then

$$\sup_k |\gamma_k|^2 \eta_k(2\alpha) < +\infty, \tag{6.3}$$

or, equivalently,

$$R_{\psi,M(\psi)}(n) \le C\exp(-\alpha|n|). \tag{6.4}$$

One can reorder $a_k$ so that

$$a_k \le C\exp(-\beta k^{1/d})$$

with some positive C, β.


Proof. The proof of the first statement is made in [5]. Since

$$\psi(t,n) = \sum_s \exp(-it\lambda_s)\gamma_s e_s(n),$$

then for any k, n we have

$$\frac{1}{T}\int_0^T \psi(t,n)\exp(it\lambda_k)\,dt \to \gamma_k e_k(n) \tag{6.5}$$

as T → ∞. The bound (6.2) and (6.5) yield $|\gamma_k e_k(n)| \le C\exp(-\alpha|n|)$ for any k, n, which implies (6.4) and (6.3). Next, it follows from (6.3) and Theorem 6.1 that after reordering

$$a_k \equiv |\gamma_k|^2 \le C(\eta_k(2\alpha))^{-1} \le C\exp(-\beta k^{1/d})$$

with some positive C, β.

In the following statement we shall use the same notations as in Theorem 4.2. As usual, $M = \{e_k\}$ is any orthonormal system of eigenfunctions of H. Moreover, for δ ≥ 0 we define

$$R_{\psi,M}(n,\delta) = \sup_k \frac{|\gamma_k e_k(n)|}{\eta_k(\delta)}, \tag{6.6}$$

supposing that $\eta_k(\delta) < +\infty$ for any k (it is always true for δ = 0 because $\eta_k(0) = 1$).

Theorem 6.3. Let $\psi \in \mathcal{H}_M$. The following statements hold with universal constants C:
1. If $R_{\psi,M}(n,0) \le C\exp(-\alpha|n|)$ for some α > 0, then

$$\sup_t |\psi(t,n)| \le C(\alpha,d)(|n|^d + 1)\exp(-\alpha|n|).$$

2. Suppose that $\eta_k(\delta) < +\infty$ for some δ > 0 for any k and $R_{\psi,M}(n,\delta) \le C\exp(-\alpha|n|)$, where α > δ. Then for any ν : 0 < ν < α − δ,

$$\sup_t |\psi(t,n)| \le C(\nu,\alpha,d)\exp(-\nu|n|).$$

In particular, in both cases EDL(ψ) holds.



Proof. Since

$$\sup_k |\gamma_k e_k(n)| \le C\exp(-\alpha|n|),$$

we obtain

$$\sup_k \left(|\gamma_k|^2 \eta_k(2\alpha)\right) < +\infty.$$

The result of Theorem 6.1 yields after reordering $|\gamma_k| \le C\exp(-\beta k^{1/d})$ with some C, β > 0. Now ψ(t, n) can be estimated in the usual way. For any $n \in \mathbb{Z}^d$ and B > 0,

$$|\psi(t,n)| \le \sum_k |\gamma_k e_k(n)| \le \sum_{k \le B|n|^d} C\exp(-\alpha|n|) + \sum_{k > B|n|^d} |\gamma_k| \le CB|n|^d\exp(-\alpha|n|) + K\exp\big(-(\beta/2)(B|n|^d)^{1/d}\big). \tag{6.7}$$

Taking in (6.7) B so that $(\beta/2)B^{1/d} = \alpha$, we obtain the first statement of the theorem.

To prove the second statement of the theorem we shall need a bound relating $\eta_k(\alpha)$ and $\eta_k(\nu)$ for ν ≤ α. It follows from the definition of $\eta_k(\alpha)$ that $|e_k(n)|^2 \le \eta_k(\alpha)\exp(-\alpha|n|)$ for any k, n. At the same time $|e_k(n)|^2 \le 1$. Therefore,

$$|e_k(n)|^2 \le \min\{1, \eta_k(\alpha)\exp(-\alpha|n|)\}. \tag{6.8}$$

We shall use the elementary inequality

$$\min\{1, z\} \le z^s, \qquad z \ge 0, \quad 0 < s < 1. \tag{6.9}$$

The bounds (6.8) and (6.9) where s = ν/α yield $|e_k(n)|^2 \le \eta_k^s(\alpha)\exp(-\nu|n|)$, and finally

$$\eta_k(\nu) \le (\eta_k(\alpha))^{\nu/\alpha}. \tag{6.10}$$
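The bound (6.10) can be checked numerically for a single normalised vector. The sketch below is an informal illustration: the exponentially decaying profile, the finite window of sites in $\mathbb{Z}$ and the values of α and ν are hypothetical, and `eta` is an illustrative helper name.

```python
import math


def eta(e, sites, gamma):
    """eta(gamma) = max_n |e(n)|^2 exp(gamma |n|) on a finite site list."""
    return max(abs(x) ** 2 * math.exp(gamma * abs(n)) for x, n in zip(e, sites))


# Hypothetical normalised profile with exponential decay on a finite window of Z.
sites = list(range(-15, 16))
raw = [math.exp(-0.5 * abs(n)) * (1.0 + 0.5 * math.cos(n)) for n in sites]
norm = math.sqrt(sum(x * x for x in raw))
e = [x / norm for x in raw]

alpha, nu = 1.2, 0.5
eta_nu = eta(e, sites, nu)
bound = eta(e, sites, alpha) ** (nu / alpha)
print(eta_nu, bound)
```

As in (6.8), the check relies on $|e(n)|^2 \le 1$, which holds here because the vector is normalised.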

This bound is very similar to the bound (4.10) for the moments $d_k(r)$. Now we can end the proof. For any k, n, $|\gamma_k e_k(n)| \le C\eta_k(\delta)\exp(-\alpha|n|)$. Therefore,

$$|\gamma_k|\sqrt{\eta_k(2\alpha)} \le C\eta_k(\delta). \tag{6.11}$$

Next, as

$$|\psi(t,n)| \le \sum_k |\gamma_k e_k(n)|,$$


one has

$$A \equiv \sup_n \exp(\nu|n|)\sup_t |\psi(t,n)| \le \sum_k |\gamma_k|\sqrt{\eta_k(2\nu)}. \tag{6.12}$$

The bounds (6.11) and (6.12) imply

$$A \le C\sum_k \eta_k(\delta)\,\eta_k^{1/2}(2\nu)\,\eta_k^{-1/2}(2\alpha). \tag{6.13}$$

Using twice the bound (6.10), we obtain from (6.13):

$$A \le C\sum_k (\eta_k(2\alpha))^{(\delta + \nu - \alpha)/(2\alpha)}.$$

As δ + ν < α, by Theorem 6.1 the sum converges and is bounded by some universal constant, so the second statement of the theorem holds.

The result of the theorem can be used to give sufficient conditions for exponential dynamical localization on the interval of energies I. Consider the set of exponentially decaying functions in $\mathcal{H}$: $C = \{\psi \mid \exists r > 0 : |\psi(n)| \le C\exp(-r|n|)\}$. Clearly, B ⊂ C ⊂ A, where A and B were defined in the previous section.

Definition 6.4. The operator H has exponential dynamical localization on I, if for any ψ ∈ C, we have $EDL(P_I(H)\psi)$.

Using the result of Theorem 6.3, one can give sufficient conditions for EDL on the interval I. For the sake of simplicity, we restrict ourselves to the first statement of this theorem. One could, however, if necessary, give also a more general sufficient condition based on the second statement of Theorem 6.3. As in the previous section, $M = \{e_k\}$ is some orthonormal system of eigenfunctions of H complete in $\mathcal{H}(I)$ and

$$F_0(n,m) = \sup_k |e_k(n)e_k(m)|, \qquad G_0(n,m) = \sup_{g \in L} |g(n)g(m)|, \qquad F_0(n,m) \le G_0(n,m).$$

Theorem 6.5. Let N be one of the two functions $F_0$, $G_0$. Let ψ ∈ C so that $|\psi(m)| \le C\exp(-r|m|)$ for some r > 0. As usual, let $\varphi = P_I(H)\psi$, $\varphi(t) = \exp(-itH)\varphi$. The following statements hold:



1. Suppose that there exist α > 0, β > 0 such that $N(n,m) \le C\exp(-\alpha|n| + \beta|m|)$. Then

$$\sup_t |\varphi(t,n)| \le C(\gamma)\exp(-\gamma|n|),$$

where 0 < γ < min{α, αr/β}.
2. Suppose that there exist α > 0, β > 0 such that $N(n,m) \le C\exp(-\alpha|n-m| + \beta|m|)$. Then

$$\sup_t |\varphi(t,n)| \le C(\gamma)\exp(-\gamma|n|), \tag{6.14}$$

where 0 < γ < min{α, αr/(α + β)}.

In particular, in both cases EDL holds on I.

Proof. We shall give it for the second statement of the theorem; for the first the argument is similar. As in the proof of Lemma 5.2, we have the bound

$$R_{\varphi,M}(n,0) \le \sum_m N(n,m)|\psi(m)|.$$

Since N(n, m) ≤ 1 for any n, m,

$$N(n,m) \le N^s(n,m) \le C^s\exp(-s\alpha|n-m| + s\beta|m|)$$

for all s ∈ [0, 1] (the argument is similar to (6.8)–(6.9)). Therefore,

$$R_{\varphi,M}(n,0) \le C(s)\sum_m \exp(-s\alpha|n-m| - (r - s\beta)|m|).$$

If r ≥ α + β, we take s = 1, and for r < α + β, we take s = r/(α + β). In both cases we obtain the bound

$$R_{\varphi,M}(n,0) \le C(\gamma)\exp(-\gamma|n|) \tag{6.15}$$

for all 0 < γ < min{α, αr/(α + β)}. The bound (6.14) follows directly from (6.15) and the first statement of Theorem 6.3.

As an example where this theorem can be directly applied consider operators with SULE on I. Namely, assume that there exists an orthonormal system $M = \{e_k\}$ of eigenfunctions of H complete in $\mathcal{H}(I)$ such that for some $n_k \in \mathbb{Z}^d$ and any δ > 0,

$$|e_k(n)| \le C(\delta)\exp(\delta|n_k| - \alpha|n - n_k|), \tag{6.16}$$

where α > 0 and the constants C(δ) are uniform in k, n. It follows from (6.16) that

$$|e_k(n)e_k(m)| \le C^2(\delta)\exp(2\delta|n_k| - \alpha(|n - n_k| + |m - n_k|)).$$

Using the elementary inequalities

$$|n_k| \le |m| + |m - n_k|, \qquad |n - n_k| + |m - n_k| \ge |m - n|,$$


one can easily show that

$$F_0(n,m) = \sup_k |e_k(n)e_k(m)| \le C^2(\delta)\exp(2\delta|m| - (\alpha - 2\delta)|n-m|)$$

for any δ > 0. The second statement of Theorem 6.5 implies EDL on I. Moreover, if $|\psi(m)| \le C\exp(-r|m|)$, then

$$\sup_t |\varphi(t,n)| \le C(\gamma)\exp(-\gamma|n|),$$

where 0 < γ < min{α, r}.

7. Adaptation to the Continuous Case

Most of the results of the previous sections remain valid in the case of $L^2(\mathbb{R}^d)$ provided the result of Theorem 2.2 holds. However, one cannot expect that Theorem 2.2 is true in the continuous case in such generality. For example, in the case of $L^2(\mathbb{R})$, define the moments

$$d_k(p) = \int_{\mathbb{R}} |e_k(x)|^2 (|x|+1)^p\,dx.$$

It is sufficient to take any orthonormal system $\{e_k(x)\}$ in $L^2([-1,1])$ and to put $e_k(x) = 0$ for |x| > 1. For such a system $d_k(p) \le 2^p$ for any k. However, if the functions $e_k(x)$ do not oscillate fast, the same phenomenon of "repulsion" of eigenfunctions occurs and one can show a result similar to that of Theorem 2.2. The main result of this section is the following.

Theorem 7.1. Let $\{e_k\}$, $k \in \mathbb{N}$, be an orthonormal system in $L^2(\mathbb{R}^d)$ such that

$$\lim_{R \to +\infty} \sup_k \int_{|u| > R} |\hat e_k(u)|^2\,du = 0, \tag{7.1}$$

where $\hat e_k$ is the Fourier transform of $e_k$. Then for any p > 0 one can reorder the moments

$$d_k(p) = \int_{\mathbb{R}^d} (|x|+1)^p |e_k(x)|^2\,dx$$

so that

$$d_k(p) \ge D(p,d)\,k^{p/d}$$

with some positive constants D(p, d) depending on the system $\{e_k\}$.

Proof. To prove the theorem, we shall discretize the problem and use the same technical Lemma 2.1 as in the discrete case. For any $n = (n_1, ..., n_d) \in \mathbb{Z}^d$, ε > 0 define the cube of size ε in $\mathbb{R}^d$:

$$K_n(\varepsilon) = \{x = (x_1, ..., x_d) \in \mathbb{R}^d \mid x_j \in [\varepsilon n_j, \varepsilon(n_j + 1)),\ j = 1, ..., d\}.$$

It is clear that the cubes are disjoint and $\mathbb{R}^d = \cup_n K_n(\varepsilon)$. Let $x \in K_n(\varepsilon)$. Then $C_1(|n|+1) \le |x|+1 \le C_2(|n|+1)$ with some constants $C_1(\varepsilon)$, $C_2(\varepsilon)$. As

$$d_k(p) = \int_{\mathbb{R}^d} (|x|+1)^p |e_k(x)|^2\,dx = \sum_{n \in \mathbb{Z}^d} \int_{K_n(\varepsilon)} (|x|+1)^p |e_k(x)|^2\,dx,$$


51

we obtain that p

p

C1 (ε)wk (p) ≤ dk (p) ≤ C2 (ε)wk (p),

(7.2)

where p wk (p) = (|n| + 1) n

Kn (ε)

|ek (x)|2 dx ≡

(|n| + 1)p |gk (n)|2 . n

One Lemma 2.1 taking akn = |gk (n)|2 . It is obvious that could 2 try to apply 2 = ek = 1, so the condition (2.2) is satisfied. However, it is not clear n |gk (n)| whether k |gk (n)|2 ≤ A < +∞. To avoid this problem, we shall consider rather the quantities hk (n) = ek (x)dx. Kn (ε)

By the Cauchy–Schwarz inequality,

$$|h_k(n)|^2 \le \varepsilon^d |g_k(n)|^2. \qquad (7.3)$$

Therefore, (7.2) implies

$$d_k(p) \ge \varepsilon^{-d} C_1^p(\varepsilon) \sum_n (|n| + 1)^p |h_k(n)|^2, \qquad (7.4)$$

and to prove the theorem it is sufficient to show that the numbers $a_{kn} = |h_k(n)|^2$ verify the conditions of Lemma 2.1 for some $\varepsilon > 0$. One can represent $h_k(n)$ as $h_k(n) = \langle e_k, \eta_n \rangle_{L^2(\mathbb{R}^d)}$, where $\eta_n$ is the characteristic function of $K_n(\varepsilon)$. Since the system $\{e_k\}$ is orthonormal,

$$\sum_k |h_k(n)|^2 \le \|\eta_n\|^2 = \varepsilon^d,$$

so (2.1) holds with $A = \varepsilon^d$. To prove (2.2) is more difficult. We shall show that the numbers $\varepsilon^{-d} |h_k(n)|^2$ are close to $|g_k(n)|^2$ for $\varepsilon$ small enough if the condition (7.1) is satisfied. Using $\sum_n |g_k(n)|^2 = 1$, we shall prove (2.2) for some $B(\varepsilon) > 0$ if $\varepsilon$ is small enough. We need the following technical result.

Lemma 7.2. Let $\psi \in L^2(\mathbb{R}^d)$, $\|\psi\| = 1$. For any $n \in \mathbb{Z}^d$, $\varepsilon > 0$ define

$$\Delta_n(\varepsilon) = \int_{K_n(\varepsilon)} |\psi(x)|^2 \, dx - \varepsilon^{-d} \left| \int_{K_n(\varepsilon)} \psi(x) \, dx \right|^2$$

($\Delta_n(\varepsilon) \ge 0$ by the Cauchy–Schwarz inequality). There exists some universal constant $C(d)$ such that for any $\varepsilon > 0$, $R > 0$,

$$0 \le \sum_{n \in \mathbb{Z}^d} \Delta_n(\varepsilon) \le C(d) \left( R^2 \varepsilon^2 + \int_{|u| > R} |\hat\psi(u)|^2 \, du \right)^{1/2}.$$


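Proof. One can represent $\Delta_n(\varepsilon)$ as

$$\Delta_n(\varepsilon) = \varepsilon^{-d} \int_{K_n(\varepsilon)} dx \, \overline{\psi(x)} \int_{K_n(\varepsilon)} dy \, (\psi(x) - \psi(y)). \qquad (7.5)$$

Applying twice the Cauchy–Schwarz inequality (to the integral over $y$ and to the integral over $x$), we obtain from (7.5):

$$\Delta_n^2(\varepsilon) \le \varepsilon^{-d} \int_{K_n(\varepsilon)} dx \, |\psi(x)|^2 \int_{K_n(\varepsilon)} \int_{K_n(\varepsilon)} dx \, dy \, |\psi(x) - \psi(y)|^2.$$

Therefore,

$$\sum_n \Delta_n(\varepsilon) \le \varepsilon^{-d/2} \left( \sum_n \int_{K_n(\varepsilon)} dx \, |\psi(x)|^2 \right)^{1/2} \cdot \left( \sum_n \int_{K_n(\varepsilon)} \int_{K_n(\varepsilon)} dx \, dy \, |\psi(x) - \psi(y)|^2 \right)^{1/2} = \varepsilon^{-d/2} \left( \sum_n \int_{K_n(\varepsilon)} \int_{K_n(\varepsilon)} dx \, dy \, |\psi(x) - \psi(y)|^2 \right)^{1/2}. \qquad (7.6)$$

One can now observe that $|x - y| \le \varepsilon \sqrt{d}$ for any $x, y \in K_n(\varepsilon)$. Therefore,

$$\sum_n \int_{K_n(\varepsilon)} \int_{K_n(\varepsilon)} dx \, dy \, |\psi(x) - \psi(y)|^2 \le \sum_n \int_{K_n(\varepsilon)} dx \int_{\mathbb{R}^d} dy \, |\psi(x) - \psi(y)|^2 F(|x - y| \le \varepsilon \sqrt{d}) = \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} dx \, dy \, |\psi(x) - \psi(y)|^2 F(|x - y| \le \varepsilon \sqrt{d}), \qquad (7.7)$$

where $F$ is the characteristic function of the set $\{(x, y) \mid |x - y| \le \varepsilon \sqrt{d}\}$. The bounds (7.6)–(7.7) imply

$$\sum_n \Delta_n(\varepsilon) \le \varepsilon^{-d/2} L^{1/2}(\delta), \qquad (7.8)$$

where $L(\delta) = \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} dx \, dy \, |\psi(x) - \psi(y)|^2 F(|x - y| \le \delta)$ and $\delta = \varepsilon \sqrt{d}$. Changing the variable $z = y - x$ in the integral, we obtain in Fourier representation

$$L(\delta) = \int_{|z| \le \delta} dz \int_{\mathbb{R}^d} du \, |\hat\psi(u)|^2 |1 - e^{i \langle z, u \rangle}|^2. \qquad (7.9)$$

Let $R > 0$. The integral over $u$ can be written as

$$\int_{\mathbb{R}^d} du \, |\hat\psi(u)|^2 |1 - e^{i \langle z, u \rangle}|^2 = I_1(z) + I_2(z),$$

where in $I_1(z)$ and $I_2(z)$ one integrates over $u : |u| \le R$ and over $u : |u| > R$ respectively. Using the elementary bound $|e^{iw} - 1| \le C|w|$, $w \in \mathbb{R}$, we estimate

$$I_1(z) \le C |z|^2 \int_{|u| \le R} du \, |u|^2 |\hat\psi(u)|^2 \le C |z|^2 R^2 \|\psi\|^2 = C |z|^2 R^2. \qquad (7.10)$$

As to $I_2(z)$, trivially

$$I_2(z) \le 4 \int_{|u| > R} du \, |\hat\psi(u)|^2. \qquad (7.11)$$

The bounds (7.9)–(7.11) imply

$$L(\delta) \le C R^2 \delta^{d+2} + C \delta^d \int_{|u| > R} du \, |\hat\psi(u)|^2. \qquad (7.12)$$

Finally, (7.8) and (7.12) yield the statement of the lemma.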
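As a purely numerical illustration of Lemma 7.2 (it is not part of the original argument), the following sketch evaluates the cell defects of the lemma in dimension d = 1, that is, the integral of |ψ|² over a cell minus ε⁻¹ times the squared modulus of the integral of ψ over the same cell. The normalized Gaussian ψ, the truncation of the line to a finite window, the midpoint sums used for the integrals, and the names `cell_defects`, `half_width` and `samples_per_cell` are all assumptions made for this example only.

```python
import numpy as np


def cell_defects(psi, eps, half_width=10.0, samples_per_cell=200):
    """Cell defects of Lemma 7.2 for d = 1 on cells of length eps.

    For each cell K the returned entry approximates
        int_K |psi|^2 dx - (1/eps) * |int_K psi dx|^2,
    with both integrals replaced by midpoint sums (an assumption of this
    sketch, not of the lemma).
    """
    n_cells = int(round(2 * half_width / eps))
    dx = eps / samples_per_cell
    # Midpoints of the sub-intervals, grouped cell by cell.
    offsets = (np.arange(samples_per_cell) + 0.5) * dx
    left_edges = -half_width + eps * np.arange(n_cells)
    x = left_edges[:, None] + offsets[None, :]
    values = psi(x)
    g2 = np.sum(np.abs(values) ** 2, axis=1) * dx   # plays the role of |g(n)|^2
    h = np.sum(values, axis=1) * dx                 # plays the role of h(n)
    return g2 - np.abs(h) ** 2 / eps


def gaussian(x):
    # Normalized in L^2(R): the integral of |psi|^2 equals 1.
    return np.pi ** -0.25 * np.exp(-x ** 2 / 2)


total_coarse = cell_defects(gaussian, eps=0.5).sum()
total_fine = cell_defects(gaussian, eps=0.25).sum()
print(total_coarse, total_fine)
```

For a smooth, rapidly decaying ψ such as this one, the summed defect should shrink as ε decreases, in line with the ε-dependence of the bound in the lemma.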

Now we can finish the proof of the theorem. Let $\{e_k\}$ be an orthonormal system verifying (7.1). The bound of Lemma 7.2 applied to $e_k$ yields

$$\sum_n \left( |g_k(n)|^2 - \varepsilon^{-d} |h_k(n)|^2 \right) \le C(d) \left( R^2 \varepsilon^2 + \int_{|u| > R} du \, |\hat e_k(u)|^2 \right)^{1/2}. \qquad (7.13)$$

Using the condition (7.1), it is easy to see that one can choose $R > 0$ big enough and $\varepsilon > 0$ small enough so that the r.h.s. of (7.13) is smaller than $1/2$ for any $k \in \mathbb{N}$. As $\sum_n |g_k(n)|^2 = 1$ for any $k$, (7.13) yields for such $\varepsilon$:

$$\sum_n |h_k(n)|^2 \ge \varepsilon^d / 2$$

and (2.2) holds for $a_{kn} = |h_k(n)|^2$ with $B = \varepsilon^d / 2$. The proof of the theorem is completed.

Remark. One can note that the choice of $\varepsilon$ depends on the system $\{e_k\}$, so, unlike the discrete case, the constants $D(p, d)$ are not necessarily universal in the continuous case.

An important example where the condition (7.1) is satisfied is given by the following

Theorem 7.3. Let $H = -\Delta + V(x)$ be an operator in $L^2(\mathbb{R}^d)$ self-adjoint on $H^2(\mathbb{R}^d)$, where $V(x)$ is a real function bounded from below: $V(x) \ge -M$ for a.e. $x \in \mathbb{R}^d$. Let $K \in \mathbb{R}$ and $\{e_k\}$ be any orthonormal family of eigenfunctions of $H$ with eigenvalues $\lambda_k \le K$ for all $k$. Then for any $p > 0$ one can reorder the moments $d_k(p)$ so that $d_k(p) \ge D(p, d, K + M)\, k^{p/d}$ with universal positive constant $D$ depending only on $p$, $d$ and $K + M$.
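To make the uniform Fourier-tail condition (7.1) concrete, here is a small numerical sketch using the Hermite functions, the eigenfunctions of −d²/dx² + x² on the line with eigenvalues 2k + 1. This example is an assumption-laden illustration, not taken from the paper: the oscillator is chosen only because its eigenfunctions are explicit and are eigenfunctions of the Fourier transform (so |ĥ_k| = |h_k|), and the domain hypothesis of Theorem 7.3 is not checked for it. With V = x² ≥ 0 one may take M = 0, and the sketch compares the tails with the Chebyshev-type quantity (K + M)/R². The function names and grid parameters are hypothetical.

```python
import numpy as np


def hermite_functions(x, k_max):
    """Orthonormal Hermite functions h_0..h_{k_max} on the grid x (three-term recurrence)."""
    h = np.zeros((k_max + 1, x.size))
    h[0] = np.pi ** -0.25 * np.exp(-x ** 2 / 2)
    if k_max >= 1:
        h[1] = np.sqrt(2.0) * x * h[0]
    for k in range(1, k_max):
        h[k + 1] = np.sqrt(2.0 / (k + 1)) * x * h[k] - np.sqrt(k / (k + 1)) * h[k - 1]
    return h


def fourier_tails(k_max, radius, half_width=30.0, n_points=60001):
    """Approximate int_{|u| > radius} |hat h_k(u)|^2 du for k = 0..k_max.

    Uses |hat h_k| = |h_k|, so no explicit Fourier transform is needed;
    the integral is a plain Riemann sum on a uniform grid.
    """
    u = np.linspace(-half_width, half_width, n_points)
    du = u[1] - u[0]
    h = hermite_functions(u, k_max)
    outside = np.abs(u) > radius
    return np.sum(h[:, outside] ** 2, axis=1) * du


k_max = 10
K = 2 * k_max + 1  # largest eigenvalue among h_0..h_{k_max}; M = 0 since V >= 0
for radius in (5.0, 6.0, 8.0):
    tails = fourier_tails(k_max, radius)
    print(radius, tails.max(), K / radius ** 2)
```

The printed pairs let one compare the largest tail over the family with (K + M)/R² for each R.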


Proof. For any $k \in \mathbb{N}$ we have $H e_k(x) = -\Delta e_k(x) + V(x) e_k(x) = \lambda_k e_k(x)$. Therefore,

$$\langle -\Delta e_k, e_k \rangle = \int_{\mathbb{R}^d} dx \, (\lambda_k - V(x)) |e_k(x)|^2 \le (K + M) \|e_k\|^2 = K + M. \qquad (7.14)$$

On the other hand,

$$\langle -\Delta e_k, e_k \rangle = \int_{\mathbb{R}^d} du \, |u|^2 |\hat e_k(u)|^2 \ge R^2 \int_{|u| > R} du \, |\hat e_k(u)|^2 \qquad (7.15)$$

for any $R > 0$. The bounds (7.14)–(7.15) imply

$$\sup_k \int_{|u| > R} du \, |\hat e_k(u)|^2 \le (K + M)/R^2,$$

so (7.1) is satisfied. Moreover, it is clear from the proof of Theorem 7.1 that one can choose $R > 0$ and $\varepsilon > 0$ depending only on $d$ and $K + M$ so that the r.h.s. of (7.13) is smaller than $1/2$ for any $k$. That means that the constants $A = \varepsilon^d$ and $B = \varepsilon^d/2$ in Lemma 2.1 depend only on $d$ and $K + M$ but not on the choice of the system $\{e_k\}$. This gives us the result of the theorem.

All the results of Sect. 3 hold if the orthonormal system $M(\psi)$ satisfies the condition (7.1). The proof of Theorem 3.3 is essentially the same (one considers $\int_{|x| \le N} dx$ instead of $\sum_{|n| \le N}$ in the proof). The proofs of Theorem 3.4 and Corollary 3.5 do not change. The results of Lemma 3.6 and Theorem 3.8 hold with the function $R_{\psi,M}(n)$ defined as follows:

$$R_{\psi,M}(n) = \sup_k |\gamma_k g_k(n)|, \qquad \gamma_k = \langle \psi, e_k \rangle, \quad n \in \mathbb{N},$$

where $|g_k(n)|^2 \equiv \int_{K_n(1)} dx \, |e_k(x)|^2$. The sufficient conditions for DL($\psi$, $p$) and DL($\psi$) in the continuous case are based on the following version of Lemma 4.1:

p |X|ψ (t)

≤C

n∈Zd

(|n| + 1)

p

|ψ(t, x)| ≤ 2

Kn (1)

2 ak wk (p)

,

(7.16)

k

where $w_k(p) = \sum_n (|n| + 1)^p |g_k(n)|^2$. The numbers $w_k(p)$ are equivalent to the moments $d_k(p)$ due to (7.2), so the lower bounds $w_k(p) \ge D k^{p/d}$ hold. A result similar to Theorem 4.2 can be easily obtained. One should only replace $R(n, \alpha)$ by $\sup_k |\gamma_k g_k(n)| / w_k(\alpha)$, $\sup_t |\psi(t, n)|$ by $\sum_k |\gamma_k g_k(n)|$, $d_k(p)$ by $w_k(p)$, and $e_k(n)$ by $g_k(n)$. The only difference is the following: one does not have the bound $\sum_k |g_k(n)|^2 \le 1$ which was valid for $e_k(n)$ in the discrete case. Therefore the bounds in Statements 1 and 2 of the theorem that one can prove are slightly weaker than in the discrete case. Statement 3 of the theorem and the result of Corollary 4.3 remain true. For the sake of completeness, let us give a direct proof of the third statement of Theorem 4.2 in the continuous case (this proof is valid also in the discrete case). For simplicity we shall suppose that $\alpha = 0$.



Theorem 7.4. Let $M = \{e_k\}$ be some orthonormal system of eigenfunctions of $H$ in $L^2(\mathbb{R}^d)$ verifying the conditions of Theorem 7.1. For a given vector $\psi \in \mathcal{H}_M$ consider the function $R(n) = \sup_k |\gamma_k g_k(n)|$, where $|g_k(n)|^2 = \int_{K_n(1)} |e_k(x)|^2 \, dx \le 1$, $n \in \mathbb{Z}^d$ and $\gamma_k = \langle \psi, e_k \rangle$. If the function $R(n)$ is fast decaying then DL($\psi$) holds.

Proof. As the function $R(n)$ is fast decaying, for any $r > 0$,

$$|\gamma_k|^2 w_k(r) = \sum_n (|n| + 1)^r |\gamma_k g_k(n)|^2 \le \sum_n (|n| + 1)^r R^2(n) \le C(r) < +\infty.$$

On the other hand, $w_k(r) \ge D k^{r/d}$ after reordering. Therefore,

$$|\gamma_k| \le C(m)\, k^{-m} \quad \text{for any } m > 0. \qquad (7.17)$$

Next, as

$$|\psi(t, x)| \le \sum_{k=1}^{\infty} |\gamma_k e_k(x)|,$$

for any $n \in \mathbb{Z}^d$ the bound holds:

$$\int_{K_n(1)} |\psi(t, x)|^2 \, dx \le \left( \sum_{k=1}^{\infty} |\gamma_k g_k(n)| \right)^2 \equiv S^2(n). \qquad (7.18)$$

Reorder the terms in the sum so that (7.17) holds. Then

$$S(n) = \left( \sum_{k=1}^{|n|^2} + \sum_{k=|n|^2+1}^{\infty} \right) |\gamma_k g_k(n)| \le |n|^2 R(n) + \sum_{k=|n|^2+1}^{\infty} |\gamma_k| \le |n|^2 R(n) + C(m) (|n|^2 + 1)^{1-m} \qquad (7.19)$$

for any $m > 0$. The bounds (7.16), (7.18) and (7.19) yield DL($\psi$, $p$) for all $p > 0$. The proof is completed.

Most of the results of Sect. 5 can be adapted to the continuous case. It is sufficient to take $g_k(n)$ instead of $e_k(n)$ and $\left( \int_{K_n(1)} |\psi(x)|^2 \, dx \right)^{1/2}$ instead of $\psi(n)$. The results of Theorem 5.3 and Theorem 5.4 are true if the system $M$ complete in $\mathcal{H}(I)$ satisfies the conditions of Theorem 7.1. In particular, this is the case if $H = -\Delta + V(x)$ with $V(x)$ bounded from below and $I = (-\infty, K]$. A result similar to that of Theorem 5.5 can be proved in the case $H(\theta) = -\Delta + V(x, \theta)$, where $V(x, \theta) \ge -M$ for $\mu$-a.e. $\theta$ and a.e. $x$. The constants in the bounds will depend on $\varepsilon$, $p$, $d$ and $K + M$. The main results of Sect. 6 can also be generalized to the continuous case in a similar way.

Acknowledgements. I thank F. Germinet for stimulating discussions on the subject of the paper.


References

1. Aizenman, M.: Localization at weak disorder: Some elementary bounds. Rev. Math. Phys. 6, 1163–1182 (1994)
2. Aizenman, M., Schenker, J.H., Friedrich, R.M., Hundertmark, D.: Finite-volume fractional-moment criteria for Anderson localization. To appear in Commun. Math. Phys.
3. Damanik, D. and Stollmann, P.: Multi-scale analysis implies strong dynamical localization. Preprint (1999)
4. De Bièvre, S. and Germinet, F.: Dynamical localization for the random dimer Schrödinger operator. J. Stat. Phys. 98, 1135–1148 (2000)
5. Del Rio, R., Jitomirskaya, S., Last, Y. and Simon, B.: Operators with singular continuous spectrum IV: Hausdorff dimensions, rank one perturbation and localization. J. d’Analyse Math. 69, 153–200 (1996)
6. Germinet, F. and De Bièvre, S.: Dynamical localization for discrete and continuous random Schrödinger operators. Commun. Math. Phys. 194, 323–341 (1998)
7. Germinet, F.: Dynamical localization II with an application to the almost Mathieu operator. J. Stat. Phys. 95, 273–286 (1999)
8. Germinet, F. and Jitomirskaya, S.: Strong dynamical localization for the almost Mathieu model. Preprint (2000)

Communicated by B. Simon

Commun. Math. Phys. 221, 57 – 76 (2001)

Communications in



Conformal and Quasiconformal Realizations of Exceptional Lie Groups

M. Günaydin1, K. Koepsell2, H. Nicolai2

1 CERN, Theory Division, 1211 Geneva 23, Switzerland. E-mail: [email protected]
2 Max-Planck-Institut für Gravitationsphysik, Albert-Einstein-Institut, Mühlenberg 1, 14476 Potsdam, Germany. E-mail: [email protected]; [email protected]

Received: 12 August 2000 / Accepted: 2 March 2001

Abstract: We present a nonlinear realization of E8(8) on a space of 57 dimensions, which is quasiconformal in the sense that it leaves invariant a suitably defined “light cone” in R 57 . This realization, which is related to the Freudenthal triple system associated with the unique exceptional Jordan algebra over the split octonions, contains previous conformal realizations of the lower rank exceptional Lie groups on generalized space times, and in particular a conformal realization of E7(7) on R 27 which we exhibit explicitly. Possible applications of our results to supergravity and M-Theory are briefly mentioned.

1. Introduction

It is an old idea to define generalized space-times by association with Jordan algebras J, in such a way that the space-time is coordinatized by the elements of J, and that its rotation, Lorentz, and conformal group can be identified with the automorphism, reduced structure, and the linear fractional group of J, respectively [11–13]. The aesthetic appeal of this idea rests to a large extent on the fact that key ingredients for formulating relativistic quantum field theories over four dimensional Minkowski space extend naturally to these generalized space times; in particular, the well-known connection between the positive energy unitary representations of the four dimensional conformal group SU(2, 2) and the covariant fields transforming in finite dimensional representations of the Lorentz group SL(2, C) [29, 28] extends to all generalized space-times defined by Jordan algebras [16]. The appearance of exceptional Lie groups and algebras in extended supergravities and their relevance to understanding the non-perturbative regime of string theory have provided new impetus; indeed, possible applications to string and M-Theory constitute the main motivation for the present investigation.

This work was supported in part by the NATO collaborative research grant CRG. 960188.

Work supported in part by the National Science Foundation under grant number PHY-9802510.

Permanent address: Physics Department, Penn State University, University Park, PA 16802, USA.


In this paper, we will present a novel construction involving the maximally extended Lie group E8(8). This construction of E8(8), together with the corresponding construction of E8(−24), contains all previous examples of generalized space-times based on exceptional Lie groups, and at the same time goes beyond the framework of Jordan algebras. More precisely, we show that there exists a quasiconformal nonlinear realization of E8(8) on a space of 57 dimensions1. This space may be viewed as the quotient of E8(8) by its maximal parabolic subgroup [18, 19]; there is no Jordan algebra directly associated with it, but it can be related to a certain Freudenthal triple system which itself is associated with the “split” exceptional Jordan algebra $J_3^{\mathbb{O}_S}$, where $\mathbb{O}_S$ denotes the split real form of the octonions $\mathbb{O}$. It furthermore admits an E7(7) invariant norm form $N_4$, which gets multiplied by a (coordinate dependent) factor under the nonlinearly realized “special conformal” transformations. Therefore the light cone, defined by the condition $N_4 = 0$, is actually invariant under the full E8(8), which thus plays the role of a generalized conformal group. By truncation we obtain quasiconformal realizations of other exceptional Lie groups. Furthermore, we recover previous conformal realizations of the lower rank exceptional groups (some of which correspond to Jordan algebras). In particular, we give a completely explicit conformal Möbius-like nonlinear realization of E7(7) on the 27-dimensional space associated with the exceptional Jordan algebra $J_3^{\mathbb{O}_S}$, with linearly realized subgroups F4(4) (the “rotation group”) and E6(6) (the “Lorentz group”). Although in part this result is implicitly contained in the existing literature on Jordan algebras, the relevant transformations have not been exhibited explicitly so far, and are here presented in the basis that is also used in maximal supergravity theories.
1 A nonlinear realization will be referred to as “quasiconformal” if it is based on a five graded decomposition of the underlying Lie algebra (as for E8(8)); it will be called “conformal” if it is based on a three graded decomposition (as e.g. for E7(7)).

The basic concepts are best illustrated in terms of a simple and familiar example, namely the conformal group in four dimensions [29], and its realization via the Jordan algebra $J_2^{\mathbb{C}}$ of hermitian $2 \times 2$ matrices with the hermiticity preserving commutative (but non-associative) product

$$a \circ b := \tfrac{1}{2}(ab + ba) \qquad (1)$$

(basic properties of Jordan algebras are summarized in Appendix A). As is well known, these matrices are in one-to-one correspondence with four-vectors $x^\mu$ in Minkowski space via the formula $x \equiv x_\mu \sigma^\mu$, where $\sigma^\mu := (1, \vec\sigma)$. The “norm form” on this algebra is just the ordinary determinant, i.e.

$$N_2(x) := \det x = x_\mu x^\mu \qquad (2)$$

(it will be a higher order polynomial in the general case). Defining $\bar x := x_\mu \bar\sigma^\mu$ (where $\bar\sigma^\mu := (1, -\vec\sigma)$) we introduce the Jordan triple product on $J_2^{\mathbb{C}}$:

$$\{a\, b\, c\} := (a \circ \bar b) \circ c + (c \circ \bar b) \circ a - (a \circ c) \circ \bar b = \langle a, b \rangle c + \langle c, b \rangle a - \langle a, c \rangle b = \tfrac{1}{2}(a \bar b c + c \bar b a) \qquad (3)$$

with the standard Lorentz invariant bilinear form $\langle a, b \rangle := a_\mu b^\mu$. However, it is not generally true that the Jordan triple product can be thus expressed in terms of a bilinear form. The automorphism group of $J_2^{\mathbb{C}}$, which is by definition compatible with the Jordan product, is just the rotation group SU(2); the structure group, defined as the invariance of the norm form up to a constant factor, is the product SL(2, C) × D, i.e. the Lorentz group and dilatations. The conformal group associated with $J_2^{\mathbb{C}}$ is the group leaving invariant the light-cone $N_2(x) = 0$. As is well known, the associated Lie algebra is su(2, 2), and possesses a three-graded structure

$$\mathfrak{g} = \mathfrak{g}^{-1} \oplus \mathfrak{g}^{0} \oplus \mathfrak{g}^{+1}, \qquad (4)$$

where the grade −1 and grade +1 spaces correspond to generators of translations $P^\mu$ and special conformal transformations $K^\mu$, respectively, while the grade 0 subspace is spanned by the Lorentz generators $M^{\mu\nu}$ and the dilatation generator $D$. The subspaces $\mathfrak{g}^{1}$ and $\mathfrak{g}^{-1}$ can each be associated with the Jordan algebra $J_2^{\mathbb{C}}$, such that their elements are labeled by elements $a = a_\mu \sigma^\mu$ of $J_2^{\mathbb{C}}$. The precise correspondence is

$$U_a := a_\mu P^\mu \in \mathfrak{g}^{-1} \quad \text{and} \quad \tilde U_a := a_\mu K^\mu \in \mathfrak{g}^{+1}. \qquad (5)$$

By contrast, the generators in $\mathfrak{g}^{0}$ are labeled by two elements $a, b \in J_2^{\mathbb{C}}$, viz.

$$S_{ab} := a_\mu b_\nu (M^{\mu\nu} + \eta^{\mu\nu} D). \qquad (6)$$
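The identities (1)–(3) for hermitian 2 × 2 matrices can be checked numerically. The sketch below is an illustration only: it represents an element by the four real coefficients of the identity and the Pauli matrices, uses the metric signature (+, −, −, −) for the bilinear form, and the helper names (`to_matrix`, `bar`, `minkowski`, `jordan`, `triple_product`) are made up for this example.

```python
import numpy as np

PAULI = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def to_matrix(v):
    """Hermitian matrix v0*1 + v1*s1 + v2*s2 + v3*s3 for four real coefficients."""
    return sum(c * s for c, s in zip(v, PAULI))


def bar(v):
    """Coefficients of the barred element: the spatial part changes sign."""
    return np.array([v[0], -v[1], -v[2], -v[3]])


def minkowski(a, b):
    """Bilinear form <a, b> with signature (+, -, -, -)."""
    return a[0] * b[0] - np.dot(a[1:], b[1:])


def jordan(p, q):
    """Jordan product (1): p o q = (pq + qp) / 2."""
    return 0.5 * (p @ q + q @ p)


def triple_product(a, b, c):
    """Jordan triple product (3), built from the Jordan product alone."""
    A, Bbar, C = to_matrix(a), to_matrix(bar(b)), to_matrix(c)
    return (jordan(jordan(A, Bbar), C) + jordan(jordan(C, Bbar), A)
            - jordan(jordan(A, C), Bbar))


rng = np.random.default_rng(1)
a, b, c = rng.normal(size=(3, 4))
A, Bbar, C = to_matrix(a), to_matrix(bar(b)), to_matrix(c)

# Norm form (2): the determinant equals the Minkowski square.
print(np.isclose(np.linalg.det(A).real, minkowski(a, a)))
# Triple product (3): Jordan-product form versus the two closed forms.
via_bilinear = to_matrix(minkowski(a, b) * c + minkowski(c, b) * a - minkowski(a, c) * b)
via_matrices = 0.5 * (A @ Bbar @ C + C @ Bbar @ A)
print(np.allclose(triple_product(a, b, c), via_bilinear),
      np.allclose(triple_product(a, b, c), via_matrices))
```

Random real four-vectors are used so that the check does not depend on any special alignment of a, b and c.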

The conformal group is realized non-linearly on the space of four-vectors $x \in J_2^{\mathbb{C}}$, with a Möbius-like infinitesimal action of the special conformal transformations

$$\delta x^\mu = 2 \langle c, x \rangle x^\mu - \langle x, x \rangle c^\mu \qquad (7)$$

with parameter $c^\mu$. All variations acquire a very simple form when expressed in terms of the above generators: we have

$$U_a(x) = a, \qquad S_{ab}(x) = \{a\, b\, x\}, \qquad \tilde U_c(x) = -\tfrac{1}{2} \{x\, c\, x\}, \qquad (8)$$

where $\{\ldots\}$ is the Jordan triple product introduced above. From these transformations it is elementary to deduce the commutation relations

$$[U_a, \tilde U_b] = S_{ab}, \qquad [S_{ab}, U_c] = U_{\{abc\}}, \qquad [S_{ab}, \tilde U_c] = \tilde U_{\{bac\}}, \qquad [S_{ab}, S_{cd}] = S_{\{abc\} d} - S_{\{bad\} c} \qquad (9)$$

(of course, these could have been derived directly from those of the conformal group). As one can also see, the Lie algebra $\mathfrak{g}$ admits an involutive automorphism $\iota$ exchanging $\mathfrak{g}^{-1}$ and $\mathfrak{g}^{+1}$ (hence, $\iota(K^\mu) = P^\mu$).

The above transformation rules and commutation relations exemplify the structure that we will encounter again in Sect. 3 of this paper: the conformal realization of E7(7) on $\mathbb{R}^{27}$ presented there has the same form, except that $J_2^{\mathbb{C}}$ is replaced by the exceptional Jordan algebra $J_3^{\mathbb{O}_S}$ over the split octonions $\mathbb{O}_S$. While our form of the nonlinear variations appears to be new, the concomitant construction of the Lie algebra itself by means of the Jordan triple product has been known in the literature as the Tits–Kantor–Koecher construction [32, 21, 25], and as such generalizes to other Jordan algebras. The generalized linear fractional (Möbius) groups of Jordan algebras can be abstractly defined in an



analogous manner [26], and shown to leave invariant certain generalized p-angles defined by the norm form of degree p of the underlying Jordan algebra [22, 14]. However, to our knowledge, explicit formulas of the type derived here have not appeared in the literature before.

While this construction works for the exceptional Lie algebras E6(6) and E7(7), as well as other Lie algebras admitting a three graded structure, it fails for E8(8), F4(4), and G2(2), for which a three grading does not exist. These algebras possess only a five graded structure

$$\mathfrak{g} = \mathfrak{g}^{-2} \oplus \mathfrak{g}^{-1} \oplus \mathfrak{g}^{0} \oplus \mathfrak{g}^{+1} \oplus \mathfrak{g}^{+2}. \qquad (10)$$

Our main result, to be described in Sect. 2, states that a “quasiconformal” realization is still possible on a space of dimension dim(g1 ) + 1 if the top grade spaces g±2 are one-dimensional. Five graded Lie algebras with this property are closely related to the so-called Freudenthal Triple Systems [9, 30], which were originally invented to provide alternative constructions of the exceptional Lie groups2 . This relation will be made very explicit in the present paper. The novel realization of E8(8) which we will arrive at, together with its natural extension to E8(−24) , contains various other constructions of exceptional Lie algebras by truncation, including the conformal realizations based on a three graded structure. For this reason, we describe it first in Sect. 2, and then show how the other cases can be obtained from it. Whereas previous attempts to construct generalized space-times mainly focused on generalizing Minkowski space-time and its symmetries, the physical applications that we have in mind here are of a somewhat different nature, and inspired by recent developments in superstring and M-Theory. Namely, the generalized “space-times” presented here could conceivably be identified with certain internal spaces arising in supergravity and superstring theory, which are related to the appearance of central charges in the associated superalgebras. Central charges and their solitonic carriers have been much discussed in the recent literature because it is hoped that they may provide a window on M-Theory and its non-perturbative degrees of freedom. More specifically, it has been argued in [5] that a proper description of the non-perturbative M-Theory degrees of freedom might require supplementing ordinary space-time coordinates by central charge coordinates. 
2 The more general Kantor–Triple-Systems for which $\mathfrak{g}^{\pm 2}$ have more than one dimension will not be discussed in this paper.

Solitonic charges also play an important role in the microscopic description of black hole entropy: for maximally extended N = 8 supergravity, the latter is conjectured to be given by an E7(7) invariant formula [20, 8]. The corresponding formula for the entropy in maximally extended supergravity in five dimensions is E6(6) invariant and involves a cubic form. In [7] an invariant classification of orbits of E7(7) and E6(6) actions on their fundamental representations that classify BPS states in d = 4 and d = 5 was given. The entropy formula in [20, 8] is identical to the equation for a vector with vanishing norm in 57 dimensions (see Eq. (27)), provided we use the SL(8, R) form of the quartic E7(7) invariant. This suggests that the 57th component of our E8(8) realization should be interpreted as the entropy. However, we should stress that the quartic invariant can assume both positive and negative values, cf. the simple examples given in Appendix B. In order to avoid imaginary entropy, one must therefore restrict oneself to the positive semidefinite values of the quartic invariant, corresponding to the “time-like” and “light-like” orbits of E7(7) in the language of [7]. With the 57th coordinate interpreted as the entropy and the remaining 56 coordinates as the electric and magnetic charges, it is natural from our point of view to define a distance in this “entropy-charge space” between any two black hole solutions using our Eqs. (25), (26). If two black hole solutions are light-like separated in this space, they will remain so under the action of E8(8).3 We should also point out that it is not entirely clear from the existing black hole literature whether it is the SU(8) or the SL(8, R) form of the invariant that should be used here (the detailed relation between the two is worked out in Appendix B). The SU(8) basis is relevant for the central charges, which appear in the superalgebra via surface integrals at spatial infinity and determine the structure (and length) of BPS multiplets. By contrast, the 28 electric and 28 magnetic charges carried by the solitons of d = 4, N = 8 supergravity transform separately under SL(8, R) [4], and therefore the SL(8, R) form of the invariant appears more appropriate in this context.

For applications to M-Theory it would be important to obtain the exponentiated version of our realization. One might reasonably expect that modular forms with respect to a fractional linear realization of the arithmetic group E8(8)(Z) will have a role to play. We expect that our results will pave the way for the explicit construction of such modular forms. According to [19] these would depend on 28+1 = 29 variables, such that the 57-dimensional Heisenberg subalgebra of E8(8) exhibited here would be realized in terms of 28 “coordinates” and 28 “momenta”. Consequently, the 57 dimensions in which E8(8) acts might alternatively be interpreted as a generalized Heisenberg group, in which case the 57th component would play the role of a variable parameter $\hbar$. The action of E8(8)(Z) on the 57 dimensional Heisenberg group would then constitute the invariance group of a generalized Dirac quantization condition. This observation is also in accord with the fact that the term modifying the vector space addition in $\mathbb{R}^{57}$ (cf.
Eq. (25)), which is required by E8(8) invariance, is just the cocycle induced by the standard canonical commutation relations on a (28+28)-dimensional phase space.

3 For the exceptional N = 2 Maxwell–Einstein supergravity [17] defined by the exceptional Jordan algebra the U-duality groups in five and four dimensions are E6(−26) and E7(−25), respectively. The quasi-conformal symmetry of the exceptional supergravity in four dimensions is hence E8(−24), with the maximal compact subgroup E7 × SU(2).

2. Quasiconformal Realization of E8(8)

2.1. E7(7) decomposition of E8(8). We will start with the maximal case, the exceptional Lie group E8(8), and its quasiconformal realization on $\mathbb{R}^{57}$, because this realization contains all others by truncation. Our results are based on the following five graded decomposition of E8(8) with respect to its E7(7) × D subgroup

$$\begin{array}{ccccccccc} \mathfrak{g}^{-2} & \oplus & \mathfrak{g}^{-1} & \oplus & \mathfrak{g}^{0} & \oplus & \mathfrak{g}^{+1} & \oplus & \mathfrak{g}^{+2} \\ \mathbf{1} & \oplus & \mathbf{56} & \oplus & (\mathbf{133} \oplus \mathbf{1}) & \oplus & \mathbf{56} & \oplus & \mathbf{1} \end{array} \qquad (11)$$

with the one-dimensional group D consisting of dilatations. D itself is part of an SL(2, R) group, and the above decomposition thus corresponds to the decomposition $\mathbf{248} \to (\mathbf{133}, \mathbf{1}) \oplus (\mathbf{56}, \mathbf{2}) \oplus (\mathbf{1}, \mathbf{3})$ of E8(8) under its subgroup E7(7) × SL(2, R). In order to write out the E7(7) generators, it is convenient to further decompose them w.r.t. the subgroup SL(8, R) of E7(7). In this basis, the Lie algebra of E7(7) is spanned by the SL(8, R) generators $G^i{}_j$, and the antisymmetric generators $G^{ijkl}$, transforming in the 63 and 70 representations of SL(8, R), respectively. We also define

$$G_{ijkl} := \tfrac{1}{24} \epsilon_{ijklmnpq} G^{mnpq}$$



with SL(8, R) indices $1 \le i, j, \ldots \le 8$. The commutation relations are

$$[G^i{}_j, G^k{}_l] = \delta^k_j G^i{}_l - \delta^i_l G^k{}_j,$$
$$[G^i{}_j, G^{klmn}] = -4\, \delta^{[k}_j G^{lmn]i} - \tfrac{1}{2} \delta^i_j G^{klmn},$$
$$[G^{ijkl}, G^{mnpq}] = \tfrac{1}{36} \epsilon^{ijkls[mnp} G^{q]}{}_s.$$

The fundamental 56 representation of E7(7) is spanned by the two antisymmetric real tensors $X^{ij}$ and $X_{ij}$ and the action of E7(7) is given by4

$$\delta X^{ij} = \Lambda^i{}_k X^{kj} - \Lambda^j{}_k X^{ki} + \Sigma^{ijkl} X_{kl},$$
$$\delta X_{ij} = \Lambda^k{}_i X_{jk} - \Lambda^k{}_j X_{ik} + \Sigma_{ijkl} X^{kl}, \qquad (12)$$

where

$$\Sigma_{ijkl} = \tfrac{1}{24} \epsilon_{ijklmnpq} \Sigma^{mnpq}. \qquad (13)$$

In order to extend E7(7) × D to the full E8(8), we must enlarge D to an SL(2, R) with generators $(E, F, H)$ in the standard Chevalley basis, together with 2 × 56 further real generators $(E_{ij}, E^{ij})$ and $(F_{ij}, F^{ij})$. Under hermitian conjugation, we have

$$E^{ij} = F_{ij}^\dagger, \qquad F^{ij} = -E_{ij}^\dagger, \qquad \text{and} \qquad E = -F^\dagger.$$

The grade −2, −1, 1 and 2 subspaces in the above decomposition correspond to the subspaces $\mathfrak{g}^{-2}$, $\mathfrak{g}^{-1}$, $\mathfrak{g}^{1}$, and $\mathfrak{g}^{2}$ in (11), respectively:

$$E \oplus \{E^{ij}, E_{ij}\} \oplus \{G^{ijkl}, G^i{}_j; H\} \oplus \{F^{ij}, F_{ij}\} \oplus F. \qquad (14)$$

The grading may be read off from the commutators with $H$

$$[H, E] = -2E, \qquad [H, F] = 2F,$$
$$[H, E^{ij}] = -E^{ij}, \qquad [H, F^{ij}] = F^{ij},$$
$$[H, E_{ij}] = -E_{ij}, \qquad [H, F_{ij}] = F_{ij}.$$

The new generators $(E_{ij}, E^{ij})$ and $(F_{ij}, F^{ij})$ form two (maximal) Heisenberg subalgebras of dimension 28

$$[E^{ij}, E_{kl}] = 2\, \delta^{ij}_{kl} E, \qquad [F^{ij}, F_{kl}] = 2\, \delta^{ij}_{kl} F,$$
and they transform under SL(8, R) as

$$[G^i{}_j, E^{kl}] = \delta^k_j E^{il} - \delta^l_j E^{ik} - \tfrac{1}{4} \delta^i_j E^{kl},$$
$$[G^i{}_j, E_{kl}] = \delta^i_k E_{lj} - \delta^i_l E_{kj} + \tfrac{1}{4} \delta^i_j E_{kl},$$
$$[G^i{}_j, F^{kl}] = \delta^k_j F^{il} - \delta^l_j F^{ik} - \tfrac{1}{4} \delta^i_j F^{kl},$$
$$[G^i{}_j, F_{kl}] = \delta^i_k F_{lj} - \delta^i_l F_{kj} + \tfrac{1}{4} \delta^i_j F_{kl}.$$

4 We emphasize that $X^{ij}$ and $X_{ij}$ are independent. This convention differs from the one used for the SU(8) basis in the appendix.

The remaining non-vanishing commutation relations are given by $[E, F] = H$ and

$$[G^{ijkl}, E^{mn}] = -\tfrac{1}{24} \epsilon^{ijklmnpq} E_{pq}, \qquad [G^{ijkl}, E_{mn}] = -\delta^{[ij}_{mn} E^{kl]},$$
$$[G^{ijkl}, F^{mn}] = -\tfrac{1}{24} \epsilon^{ijklmnpq} F_{pq}, \qquad [G^{ijkl}, F_{mn}] = -\delta^{[ij}_{mn} F^{kl]},$$
$$[E^{ij}, F^{kl}] = 12\, G^{ijkl}, \qquad [E_{ij}, F_{kl}] = -12\, G_{ijkl},$$
$$[E^{ij}, F_{kl}] = 4\, \delta^{[i}_{[k} G^{j]}{}_{l]} - \delta^{ij}_{kl} H, \qquad [E_{ij}, F^{kl}] = 4\, \delta^{[k}_{[i} G^{l]}{}_{j]} + \delta^{kl}_{ij} H,$$
$$[E, F^{ij}] = -E^{ij}, \qquad [E, F_{ij}] = -E_{ij},$$
$$[F, E^{ij}] = F^{ij}, \qquad [F, E_{ij}] = F_{ij}.$$

To see that we are really dealing with the maximally split form of E8(8), let us count the number of compact generators: The antisymmetric part $(G^i{}_j - G^j{}_i)$ of $G^i{}_j$ and $(G^{ijkl} - G_{ijkl})$ correspond to the 63 generators of the maximal compact subalgebra SU(8) of E7(7) [4]. The remaining compact generators are the 28+28+1 anti-hermitian generators $(E_{ij} + F^{ij})$, $(E^{ij} - F_{ij})$, and $(E + F)$, giving a total of 120 generators which close into the maximal compact subgroup SO(16) ⊃ SU(8) of E8(8).

An important role is played by the symplectic invariant of two 56 representations. It is given by

$$\langle X, Y \rangle := X^{ij} Y_{ij} - X_{ij} Y^{ij}. \qquad (15)$$

The second structure which we need to introduce is the triple product. This is a trilinear map 56 × 56 × 56 → 56, which associates to three elements $X$, $Y$ and $Z$ another element transforming in the 56 representation, denoted by $(X, Y, Z)$, and defined by

$$(X, Y, Z)^{ij} := -8 X^{ik} Y_{kl} Z^{lj} - 8 Y^{ik} X_{kl} Z^{lj} - 8 Y^{ik} Z_{kl} X^{lj} - 2 Y^{ij} X^{kl} Z_{kl} - 2 X^{ij} Y^{kl} Z_{kl} - 2 Z^{ij} Y^{kl} X_{kl} + \tfrac{1}{2} \epsilon^{ijklmnpq} X_{kl} Y_{mn} Z_{pq},$$
$$(X, Y, Z)_{ij} := 8 X_{ik} Y^{kl} Z_{lj} + 8 Y_{ik} X^{kl} Z_{lj} + 8 Y_{ik} Z^{kl} X_{lj} + 2 Y_{ij} Z^{kl} X_{kl} + 2 X_{ij} Z^{kl} Y_{kl} + 2 Z_{ij} X^{kl} Y_{kl} - \tfrac{1}{2} \epsilon_{ijklmnpq} X^{kl} Y^{mn} Z^{pq}. \qquad (16)$$

A somewhat tedious calculation5 shows that this triple product obeys the relations

$$(X, Y, Z) = (Y, X, Z) + 2 \langle X, Y \rangle Z,$$
$$(X, Y, Z) = (Z, Y, X) - 2 \langle X, Z \rangle Y,$$
$$\langle (X, Y, Z), W \rangle = \langle (X, W, Z), Y \rangle - 2 \langle X, Z \rangle \langle Y, W \rangle,$$
$$(X, Y, (V, W, Z)) = (V, W, (X, Y, Z)) + ((X, Y, V), W, Z) + (V, (Y, X, W), Z). \qquad (17)$$

5 Which relies heavily on the Schouten identity $\epsilon^{[ijklmnpq} X^{r]s} = 0$.

We note that the triple product (16) could be modified by terms involving the symplectic invariant, such as $\langle X, Y \rangle Z$; the above choice has been made in order to obtain agreement with the formulas of [6]. While there is no (symmetric) quadratic invariant of E7(7) in the 56 representation, a real quartic invariant $I_4$ can be constructed by means of the above triple product and the bilinear form; it reads

$$I_4(X^{ij}, X_{ij}) := \tfrac{1}{48} \langle (X, X, X), X \rangle \equiv X^{ij} X_{jk} X^{kl} X_{li} - \tfrac{1}{4} X^{ij} X_{ij} X^{kl} X_{kl} + \tfrac{1}{96} \epsilon^{ijklmnpq} X_{ij} X_{kl} X_{mn} X_{pq} + \tfrac{1}{96} \epsilon_{ijklmnpq} X^{ij} X^{kl} X^{mn} X^{pq}. \qquad (18)$$
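The symplectic invariant (15) and the quartic invariant (18) can be evaluated numerically for random antisymmetric 8 × 8 arrays, and their invariance under the SL(8, R) subgroup can be tested. The sketch below is an illustration under stated assumptions: index sums run over all i, j; the two epsilon contractions are rewritten through Pfaffians using the standard identity that the full epsilon contraction of four copies of an antisymmetric 8 × 8 matrix equals 384 times its Pfaffian; and the finite SL(8, R) action in `act` (upper indices with g, lower indices with the inverse of g) is an assumed convention. It does not test the remaining 70 generators of E7(7). All function names are hypothetical.

```python
import numpy as np


def pfaffian(a):
    """Pfaffian of an antisymmetric matrix by expansion along the first row."""
    n = a.shape[0]
    if n == 0:
        return 1.0
    if n % 2 == 1:
        return 0.0
    total = 0.0
    for j in range(1, n):
        keep = [k for k in range(n) if k not in (0, j)]
        total += (-1) ** (j + 1) * a[0, j] * pfaffian(a[np.ix_(keep, keep)])
    return total


def symplectic(x_up, x_dn, y_up, y_dn):
    """Symplectic invariant (15): X^{ij} Y_{ij} - X_{ij} Y^{ij}."""
    return np.sum(x_up * y_dn) - np.sum(x_dn * y_up)


def quartic_invariant(x_up, x_dn):
    """Quartic invariant (18), epsilon terms written as 4 * Pfaffian."""
    trace_term = np.trace(x_up @ x_dn @ x_up @ x_dn)   # X^{ij} X_{jk} X^{kl} X_{li}
    contraction = np.sum(x_up * x_dn)                  # X^{ij} X_{ij}
    # (1/96) * 384 * Pf = 4 * Pf for each of the two epsilon terms.
    return (trace_term - 0.25 * contraction ** 2
            + 4.0 * pfaffian(x_dn) + 4.0 * pfaffian(x_up))


def random_antisymmetric(rng, n=8):
    m = rng.normal(size=(n, n))
    return m - m.T


def sl8_element(rng, n=8, scale=0.3):
    """Determinant-one matrix: unit lower triangular times unit upper triangular."""
    lower = np.eye(n) + scale * np.tril(rng.normal(size=(n, n)), k=-1)
    upper = np.eye(n) + scale * np.triu(rng.normal(size=(n, n)), k=1)
    return lower @ upper


def act(g, x_up, x_dn):
    """Assumed finite SL(8, R) action on the pair (X^{ij}, X_{ij})."""
    g_inv = np.linalg.inv(g)
    return g @ x_up @ g.T, g_inv.T @ x_dn @ g_inv


rng = np.random.default_rng(0)
x_up, x_dn = random_antisymmetric(rng), random_antisymmetric(rng)
g = sl8_element(rng)
print(quartic_invariant(x_up, x_dn), quartic_invariant(*act(g, x_up, x_dn)))
```

The two printed numbers should agree up to rounding, since each term of (18) is separately SL(8, R) invariant.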

2.2. Quasiconformal nonlinear realization of E8(8). We will now exhibit a nonlinear realization of E8(8) on the 57-dimensional real vector space with coordinates $\mathcal{X} := (X^{ij}, X_{ij}, x)$, where $x$ is also real. While $x$ is an E7(7) singlet, the remaining 56 variables transform linearly under E7(7). Thus $\mathcal{X}$ forms the 56 ⊕ 1 representation of E7(7). In writing the transformation rules we will omit the transformation parameters in order not to make the formulas (and notation) too cumbersome. To recover the infinitesimal variations, one must simply contract the formulas with the appropriate transformation parameters. The E7(7) subalgebra acts linearly by

$$G^i{}_j(X^{kl}) = 2\, \delta^k_j X^{il} - \tfrac{1}{4} \delta^i_j X^{kl}, \qquad G^{ijkl}(X^{mn}) = \tfrac{1}{24} \epsilon^{ijklmnpq} X_{pq},$$
$$G^i{}_j(X_{kl}) = -2\, \delta^i_k X_{jl} + \tfrac{1}{4} \delta^i_j X_{kl}, \qquad G^{ijkl}(X_{mn}) = \delta^{[ij}_{mn} X^{kl]}, \qquad (19)$$
$$G^i{}_j(x) = 0, \qquad G^{ijkl}(x) = 0,$$

$H$ generates scale transformations

$$H(X^{ij}) = X^{ij}, \qquad H(X_{ij}) = X_{ij}, \qquad H(x) = 2x, \qquad (20)$$

and the $E$ generators act as translations; we have

$$E(X^{ij}) = 0, \qquad E(X_{ij}) = 0, \qquad E(x) = 1 \qquad (21)$$

and

$$E^{ij}(X^{kl}) = 0, \qquad E^{ij}(X_{kl}) = \delta^{ij}_{kl}, \qquad E^{ij}(x) = -X^{ij},$$
$$E_{ij}(X^{kl}) = \delta^{kl}_{ij}, \qquad E_{ij}(X_{kl}) = 0, \qquad E_{ij}(x) = X_{ij}. \qquad (22)$$


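By contrast, the $F$ generators are realized nonlinearly:

\[
\begin{aligned}
F(X^{ij}) &= -\tfrac{1}{6}\,(X, X, X)^{ij} + X^{ij} x \\
&\equiv 4 X^{ik} X_{kl} X^{lj} + X^{ij} X^{kl} X_{kl}
 - \tfrac{1}{12}\,\epsilon^{ijklmnpq} X_{kl} X_{mn} X_{pq} + X^{ij} x, \\
F(X_{ij}) &= -\tfrac{1}{6}\,(X, X, X)_{ij} + X_{ij} x \\
&\equiv -4 X_{ik} X^{kl} X_{lj} - X_{ij} X^{kl} X_{kl}
 + \tfrac{1}{12}\,\epsilon_{ijklmnpq} X^{kl} X^{mn} X^{pq} + X_{ij} x, \\
F(x) &= 4\, I_4(X^{ij}, X_{ij}) + x^2 \\
&\equiv 4 X^{ij} X_{jk} X^{kl} X_{li} - X^{ij} X_{ij} X^{kl} X_{kl}
 + \tfrac{1}{24}\,\epsilon^{ijklmnpq} X_{ij} X_{kl} X_{mn} X_{pq}
 + \tfrac{1}{24}\,\epsilon_{ijklmnpq} X^{ij} X^{kl} X^{mn} X^{pq} + x^2.
\end{aligned}
\tag{23}
\]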

Observe that the form of the r.h.s. is dictated by the requirement of E7(7) covariance: $(F(X^{ij}), F(X_{ij}))$ and $F(x)$ must still transform as the 56 and 1 of E7(7), respectively. The action of the remaining generators is likewise E7(7) covariant:

\[
\begin{aligned}
F^{ij}(X^{kl}) &= -4\, X^{i[k} X^{l]j} + \tfrac{1}{4}\,\epsilon^{ijklmnpq} X_{mn} X_{pq}, \\
F^{ij}(X_{kl}) &= +8\, \delta^{[i}_{k} X^{j]m} X_{ml} + \delta^{ij}_{kl} X^{mn} X_{mn} + 2\, X^{ij} X_{kl} - \delta^{ij}_{kl}\, x, \\
F_{ij}(X^{kl}) &= -8\, \delta^{k}_{[i} X_{j]m} X^{ml} + \delta^{kl}_{ij} X^{mn} X_{mn} - 2\, X_{ij} X^{kl} - \delta^{kl}_{ij}\, x, \\
F_{ij}(X_{kl}) &= 4\, X_{ki} X_{jl} - \tfrac{1}{4}\,\epsilon_{ijklmnpq} X^{mn} X^{pq}, \\
F^{ij}(x) &= 4\, X^{ik} X_{kl} X^{lj} + X^{ij} X^{kl} X_{kl} - \tfrac{1}{12}\,\epsilon^{ijklmnpq} X_{kl} X_{mn} X_{pq} + X^{ij} x, \\
F_{ij}(x) &= 4\, X_{ik} X^{kl} X_{lj} + X_{ij} X^{kl} X_{kl} - \tfrac{1}{12}\,\epsilon_{ijklmnpq} X^{kl} X^{mn} X^{pq} - X_{ij} x.
\end{aligned}
\tag{24}
\]

Although E7(7) covariance considerably constrains the expressions that can appear on the r.h.s., it does not fix them uniquely: as for the triple product (16) one could add further terms involving the symplectic invariant. However, all ambiguities are removed by imposing closure of the algebra, and we have checked by explicit computation that the above variations do close into the full E8(8) algebra in the basis given in the previous section. This is the crucial consistency check.

The term “quasiconformal realization” is motivated by the existence of a norm form that is left invariant up to a (possibly coordinate dependent) factor under all transformations. To write it down we must first define a nonlinear “difference” between two points $\mathcal{X} \equiv (X^{ij}, X_{ij}; x)$ and $\mathcal{Y} \equiv (Y^{ij}, Y_{ij}; y)$; curiously, the standard difference is not invariant under the translations $(E^{ij}, E_{ij})$. Rather, we must choose

\[
\delta(\mathcal{X}, \mathcal{Y}) := \bigl(X^{ij} - Y^{ij},\; X_{ij} - Y_{ij};\; x - y + \langle X, Y\rangle\bigr).
\tag{25}
\]
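The role of the extra term in (25) can be illustrated with a small exact-arithmetic check. The sketch below is only a toy model under stated assumptions: the 28 + 28 components $(X^{ij}, X_{ij})$ are replaced by two short integer vectors `q` and `p`, the symplectic form is assumed to be $\langle X, Y\rangle = q\cdot p' - p\cdot q'$ (the normalization of (15) is not reproduced here), and `translate` is a hypothetical finite shift modeled on the variations (22), not a formula taken from the text.

```python
import random

def dot(u, v):
    """Plain contraction of two equally long vectors."""
    return sum(s * t for s, t in zip(u, v))

def pairing(X, Y):
    """Assumed symplectic form <X, Y> = q.p' - p.q' for X = (q, p, x), Y = (q', p', y)."""
    (q, p, _), (q2, p2, _) = X, Y
    return dot(q, p2) - dot(p, q2)

def delta(X, Y):
    """Nonlinear difference modeled on Eq. (25)."""
    (q, p, x), (q2, p2, y) = X, Y
    return (tuple(s - t for s, t in zip(q, q2)),
            tuple(s - t for s, t in zip(p, p2)),
            x - y + pairing(X, Y))

def translate(X, a, b):
    """Hypothetical finite shift: q -> q + a, p -> p + b, x -> x + a.p - b.q (cf. Eq. (22))."""
    q, p, x = X
    return (tuple(s + t for s, t in zip(q, a)),
            tuple(s + t for s, t in zip(p, b)),
            x + dot(a, p) - dot(b, q))

def random_point(n, rng):
    vec = lambda: tuple(rng.randint(-5, 5) for _ in range(n))
    return (vec(), vec(), rng.randint(-5, 5))

def check_difference(seed=0, n=4):
    rng = random.Random(seed)
    X, Y = random_point(n, rng), random_point(n, rng)
    a = tuple(rng.randint(-5, 5) for _ in range(n))
    b = tuple(rng.randint(-5, 5) for _ in range(n))
    d = delta(X, Y)
    # delta(X, Y) = -delta(Y, X), componentwise
    negated = delta(Y, X)
    antisymmetric = d == (tuple(-s for s in negated[0]),
                          tuple(-s for s in negated[1]),
                          -negated[2])
    # delta is unchanged when both points are shifted by the same translation
    invariant = delta(translate(X, a, b), translate(Y, a, b)) == d
    # the naive difference x - y changes by a.(p - p') - b.(q - q')
    naive_shift = (translate(X, a, b)[2] - translate(Y, a, b)[2]) - (X[2] - Y[2])
    return antisymmetric, invariant, naive_shift
```

In this toy setting the first two flags come out true for any integer input, while the third value is generically nonzero, which is the sense in which the naive difference fails to be translation invariant.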



This difference still obeys $\delta(\mathcal{X}, \mathcal{Y}) = -\delta(\mathcal{Y}, \mathcal{X})$ and thus $\delta(\mathcal{X}, \mathcal{X}) = 0$, and is now invariant under $(E^{ij}, E_{ij})$ as well as $E$; however, it is no longer additive. In fact, with the sum of two vectors being defined as $\delta(\mathcal{X}, -\mathcal{Y})$, the extra term involving $\langle X, Y\rangle$ can be interpreted as the cocycle induced by the standard canonical commutation relations. The relevant invariant is a linear combination of $x^2$ and the quartic E7(7) invariant $I_4$, viz.

\[
N_4(\mathcal{X}) \equiv N_4(X^{ij}, X_{ij}; x) := 4 I_4(X) - x^2.
\tag{26}
\]

In order to ensure invariance under the translation generators, we consider the expression $N_4(\delta(\mathcal{X}, \mathcal{Y}))$, which is manifestly invariant under the linearly realized subgroup E7(7). Remarkably, it also transforms into itself up to an overall factor under the action of the nonlinearly realized generators. More specifically, we find

\[
\begin{aligned}
F\, N_4(\delta(\mathcal{X}, \mathcal{Y})) &= 2\,(x + y)\, N_4(\delta(\mathcal{X}, \mathcal{Y})), \\
F^{ij}\, N_4(\delta(\mathcal{X}, \mathcal{Y})) &= 2\,(X^{ij} + Y^{ij})\, N_4(\delta(\mathcal{X}, \mathcal{Y})), \\
H\, N_4(\delta(\mathcal{X}, \mathcal{Y})) &= 4\, N_4(\delta(\mathcal{X}, \mathcal{Y})).
\end{aligned}
\]

Therefore, for every $\mathcal{Y} \in \mathbb{R}^{57}$ the “light cone” with base point $\mathcal{Y}$, defined by the set of $\mathcal{X} \in \mathbb{R}^{57}$ obeying

\[
N_4(\delta(\mathcal{X}, \mathcal{Y})) = 0,
\tag{27}
\]

is preserved by the full E8(8) group, and in this sense, $N_4$ is a “conformal invariant” of E8(8). We note that the light cones defined by the above equation are not only curved hypersurfaces in $\mathbb{R}^{57}$, but get deformed as one varies the base point $\mathcal{Y}$. As we will show in Appendix B, the quartic invariant $I_4$ can take both positive and negative values, but in the latter case Eq. (27) does not have real solutions. However, we can remedy this problem by extending the representation space to $\mathbb{C}^{57}$ and using the same formulas to get a realization of the complexified Lie algebra $E_8(\mathbb{C})$ on $\mathbb{C}^{57}$. The existence of a fourth order conformal invariant of E8(8) is noteworthy in view of the fact that no irreducible fourth order invariant exists for the linearly realized E8(8) group (the next invariant after the quadratic Casimir being of order eight).

2.3. Relation with Freudenthal Triple Systems. We will now rewrite the nonlinear transformation rules in another form in order to establish contact with mathematical literature. Both the bilinear form (15) and the triple product (16) already appear in [6], albeit in a very different guise. That work starts from 2 × 2 “matrices” of the form

\[
A = \begin{pmatrix} \alpha_1 & x_1 \\ x_2 & \alpha_2 \end{pmatrix},
\tag{28}
\]

where $\alpha_1, \alpha_2$ are real numbers and $x_1, x_2$ are elements of a simple Jordan algebra $J$ of degree three. There are only four simple Jordan algebras $J$ of this type, namely the 3 × 3 hermitian matrices over the four division algebras, R, C, H and O. The associated matrices are then related to non-compact forms of the exceptional Lie algebras F4, E6, E7, and E8, respectively. For simplicity, let us concentrate on the maximal case $J_3^{\mathbb{O}_S}$, when the matrix $A$ carries 1 + 1 + 27 + 27 = 56 degrees of freedom. This counting suggests



an obvious relation with the 56 of E7(7) and its decomposition under E6(6), but more work is required to make the connection precise. To this aim, [6] defines a symplectic invariant $\langle A, B\rangle$, and a trilinear product mapping three such matrices $A$, $B$ and $C$ to another one, denoted by $(A, B, C)$. This triple system differs from a Jordan triple system in that it is not derivable from a binary product. The formulas for the triple product in terms of the matrices $A$, $B$ and $C$ given in [6] are somewhat cumbersome, lacking manifest E7(7) covariance. For this reason, instead of directly verifying that our prescription (16) and the one of [6] coincide, we have checked that they satisfy identical relations: a quick glance shows that the relations (T1)–(T4) of [6] are indeed the same as our relations (17), which are manifestly E7(7) covariant.

To rewrite the transformation formulas we introduce Lie algebra generators $U_A$ and $\tilde U_A$ labeled by the above matrices, as well as generators $S_{AB}$ labeled by a pair of such matrices. For the grade ±2 subspaces we would in general need another set of generators $K_{AB}$ and $\tilde K_{AB}$ labeled by two matrices, but since these subspaces are one-dimensional in the present case, we have only two more generators $K_a$ and $\tilde K_a$ labeled by one real number $a$. In the same vein, we reinterpret the 57 coordinates $\mathcal{X}$ as a pair $(X, x)$, where $X$ is a 2 × 2 matrix of the type defined above. The variations then take the simple form

\[
\begin{aligned}
K_a(X) &= 0, & K_a(x) &= 2a, \\
U_A(X) &= A, & U_A(x) &= \langle A, X\rangle, \\
S_{AB}(X) &= (A, B, X), & S_{AB}(x) &= 2\,\langle A, B\rangle\, x, \\
\tilde U_A(X) &= \tfrac{1}{2}\,(X, A, X) - A\,x, & \tilde U_A(x) &= -\tfrac{1}{6}\,\langle (X, X, X), A\rangle + \langle X, A\rangle\, x, \\
\tilde K_a(X) &= -\tfrac{1}{6}\,a\,(X, X, X) + a\,X x, & \tilde K_a(x) &= \tfrac{1}{6}\,a\,\langle (X, X, X), X\rangle + 2\,a\,x^2.
\end{aligned}
\tag{29}
\]

From these formulas it is straightforward to determine the commutation relations of the transformations. To expose the connection with the more general Kantor triple systems we write

\[
K_{AB} \equiv K_{\langle A, B\rangle}
\tag{30}
\]

in the formulas below. The consistency of this specialization is ensured by the relations (17). By explicit computation one finds

\[
\begin{aligned}
[U_A, \tilde U_B] &= S_{AB}, \\
[U_A, U_B] &= -K_{AB}, \\
[\tilde U_A, \tilde U_B] &= -\tilde K_{AB}, \\
[S_{AB}, U_C] &= -U_{(A,B,C)}, \\
[S_{AB}, \tilde U_C] &= -\tilde U_{(B,A,C)}, \\
[K_{AB}, \tilde U_C] &= U_{(A,C,B)} - U_{(B,C,A)}, \\
[\tilde K_{AB}, U_C] &= \tilde U_{(B,C,A)} - \tilde U_{(A,C,B)}, \\
[S_{AB}, S_{CD}] &= -S_{(A,B,C)D} - S_{C(B,A,D)}, \\
[S_{AB}, K_{CD}] &= K_{A(C,B,D)} - K_{A(D,B,C)}, \\
[S_{AB}, \tilde K_{CD}] &= \tilde K_{(D,A,C)B} - \tilde K_{(C,A,D)B}, \\
[K_{AB}, \tilde K_{CD}] &= S_{(B,C,A)D} - S_{(A,C,B)D} - S_{(B,D,A)C} + S_{(A,D,B)C}.
\end{aligned}
\tag{31}
\]



For general KAB , these are the defining commutation relations of a Kantor triple system, and, with the further specification (30), those of a Freudenthal triple system (FTS). Freudenthal introduced these triple systems in his study of the metasymplectic geometries associated with exceptional groups [10]; these geometries were further studied in [1, 6, 30, 24]6 . A classification of FTS’s may be found in [24], where it is also shown that there is a one-to-one correspondence between simple Lie algebras and simple FTS’s with a non-degenerate bilinear form. Hence there is a quasiconformal realization of every Lie group acting on a generalized lightcone. 3. Truncations of E8(8) For the lower rank exceptional groups contained in E8(8) , we can derive similar conformal or quasiconformal realizations by truncation. In this section, we will first give the list of quasiconformal realizations contained in E8(8) . In the second part of this section, we consider truncations to a three graded structure, which will yield conformal realizations. In particular, we will work out the conformal realization of E7(7) on a space of 27 dimensions as an example, which is again the maximal example of its kind. 3.1. More quasiconformal realizations. All simple Lie algebras (except for SU (2)) can be given a five graded structure (10) with respect to some subalgebra of maximal rank and one can associate a triple system with the grade +1 subspace [23, 2]. Conversely, one can construct every simple Lie algebra over the corresponding triple system. The realization of E8 over the FTS defined by the exceptional Jordan algebra can be truncated to the realizations of E7 , E6 , and F4 by restricting oneself to subalgebras defined by quaternionic, complex, and real Hermitian 3 × 3 matrices. Analogously the non-linear realization of E8(8) given in the previous section can be truncated to nonlinear realizations of E7(7) , E6(6) , and F4(4) . These truncations preserve the five grading. 
More specifically we find that the Lie algebra of E7(7) has a five grading of the form: E7(7) = 1 ⊕ 32 ⊕ (SO(6, 6) ⊕ D) ⊕ 32 ⊕ 1.

(32)

Hence this truncation leads to a nonlinear realization of E7(7) on a 33 dimensional space. Note that this is not a minimal realization of E7(7) . Further truncation to the E6(6) subgroup preserving the five grading leads to: E6(6) = 1 ⊕ 20 ⊕ (SL(6, R) ⊕ D) ⊕ 20 ⊕ 1.

(33)

This yields a nonlinear realization of E6(6) on a 21 dimensional space, which again is not the minimal realization. Further reduction to F4(4) preserving the five grading F4(4) = 1 ⊕ 14 ⊕ (Sp(6, R) ⊕ D) ⊕ 14 ⊕ 1

(34)

leads to a minimal realization of F4(4) on a fifteen dimensional space. One can further truncate F4 to a subalgebra G2(2) while preserving the five grading G2(2) = 1 ⊕ 4 ⊕ (SL(2, R) ⊕ D) ⊕ 4 ⊕ 1,

(35)

6 FTS’s have also been used in [3] to give a classification and a unified realization of non-linear quasisuperconformal algebras and in the realizations of nonlinear N = 4 superconformal algebras in two dimensions [15].



which then yields a nonlinear realization over a five dimensional space. One can go even further and truncate G2 to its subalgebra SL(3, R): SL(3, R) = 1 ⊕ 2 ⊕ (SO(1, 1) ⊕ D) ⊕ 2 ⊕ 1,

(36)

which is the smallest simple Lie algebra admitting a five grading. We should perhaps stress that the nonlinear realizations given above are minimal for G2(2), F4(4), and E8(8), which are the only simple Lie algebras that do not admit a three grading and hence do not have unitary representations of the lowest weight type. The above nonlinear realizations of the exceptional Lie algebras can also be truncated to subalgebras with a three graded structure, in which case our nonlinear realization reduces to the standard nonlinear realization over a JTS. This truncation is described in more detail in Sect. 3.2. With respect to E6(6) the quasiconformal realization of E8(8) (11) decomposes as follows:

[Figure: decomposition of the five grading of E8(8) under E6(6). First line: 1 ⊕ 56 ⊕ (133 ⊕ 1) ⊕ 56 ⊕ 1. Below it, the grade ±2 singlets stay singlets, each 56 splits as 1 ⊕ 27 ⊕ $\overline{27}$ ⊕ 1, and the grade zero subspace 133 ⊕ 1 splits as 27 ⊕ (78 ⊕ 1) ⊕ $\overline{27}$ ⊕ 1. Two diagonal lines connect the 27 and $\overline{27}$ entries across grades −1, 0 and +1.]

The numbers in the first line are the dimensions of E7(7), whereas the remaining numbers correspond to representations of USp(8), which is the maximal compact subgroup of E6(6). The 27 of the grade −1 subspace and the $\overline{27}$ of the grade +1 subspace close into the E6(6) ⊕ D subalgebra of the grade zero subspace and generate the Lie algebra of E7(7). Similarly, the $\overline{27}$ of the grade −1 subspace together with the 27 of the grade +1 subspace form another E7(7) subalgebra of E8(8). Hence we have four different E7(7) subalgebras of E8(8):
i) the E7(7) subalgebra of the grade zero subspace, which is realized linearly;
ii) the E7(7) subalgebra preserving the 5-grading, which is realized nonlinearly over a 33 dimensional space;
iii) the E7(7) subalgebra that acts on the 27 dimensional subspace as the generalized conformal generators;
iv) the E7(7) subalgebra that acts on the $\overline{27}$ dimensional subspace as the generalized conformal generators.
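The five gradings (32)–(36), together with the grading of E8(8) itself, can be cross-checked by simple dimension counting. The sketch below is a bookkeeping aid rather than part of the construction; the dictionary name `FIVE_GRADINGS` and the helper functions are illustrative, and the grade zero entries use the standard dimensions dim E7 = 133, dim SO(6,6) = 66, dim SL(6,R) = 35, dim Sp(6,R) = 21, dim SL(2,R) = 3 and dim SO(1,1) = 1, each plus one for the dilatation D.

```python
# Each entry: total dimension of the algebra and the dimensions of grades -2, -1, 0, +1, +2.
FIVE_GRADINGS = {
    "E8(8)":   (248, (1, 56, 133 + 1, 56, 1)),  # grade 0: E7(7) + D
    "E7(7)":   (133, (1, 32, 66 + 1, 32, 1)),   # grade 0: SO(6,6) + D
    "E6(6)":   (78,  (1, 20, 35 + 1, 20, 1)),   # grade 0: SL(6,R) + D
    "F4(4)":   (52,  (1, 14, 21 + 1, 14, 1)),   # grade 0: Sp(6,R) + D
    "G2(2)":   (14,  (1, 4, 3 + 1, 4, 1)),      # grade 0: SL(2,R) + D
    "SL(3,R)": (8,   (1, 2, 1 + 1, 2, 1)),      # grade 0: SO(1,1) + D
}

def grading_is_consistent(name):
    """True if the five graded pieces add up to the dimension of the algebra."""
    total, grades = FIVE_GRADINGS[name]
    return sum(grades) == total

def realization_dimension(name):
    """Dimension of the space carrying the nonlinear realization: grade +1 plus grade +2."""
    _, grades = FIVE_GRADINGS[name]
    return grades[3] + grades[4]

if __name__ == "__main__":
    for name in FIVE_GRADINGS:
        print(name, grading_is_consistent(name), realization_dimension(name))
```

The realization dimensions obtained this way (57, 33, 21, 15 and 5 for the exceptional cases) agree with the ones quoted in the text.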



Similarly for E7(7) under the SL(6, R) subalgebra of the grade zero subspace the 32 dimensional grade +1 subspace decomposes as 32 = 1 + 15 + 15 + 1. The 15 from grade +1 (−1) subspace together with 15 (15) of grade −1 (+1) subspace generate a nonlinearly realized SO(6, 6) subalgebra that acts as the generalized conformal algebra on the 15 (15) dimensional subspace. For E6(6) , F4(4) , G2(2) , and SL(3, R) the analogous truncations lead to nonlinear conformal subalgebras SL(6, R), Sp(6, R), SO(2, 2), and SL(2, R), respectively. 3.2. Conformal Realization of E7(7) . As a special truncation the quasiconformal realization of E8(8) contains a conformal realization of E7(7) on a space of 27 dimensions, on which the E6(6) subgroup of E7(7) acts linearly. The main difference is that the construction is now based on a three-graded decomposition (4) of E7(7) rather than (10) – hence the realization is “conformal” rather than “quasiconformal”. The relevant decomposition can be directly read off from the figure: we simply truncate to an E7(7) subalgebra in such a way that the grade ±2 subspace can no longer be reached by commutation. This requirement is met only by the two truncations corresponding to the diagonal lines in the figure; adding a singlet we arrive at the desired three graded decomposition of E7(7) 133 = 27 ⊕ (78 ⊕ 1) ⊕ 27

(37)

under its E6(6) × D subgroup. The Lie algebra E6(6) has USp(8) as its maximal compact subalgebra. It is spanned by a symmetric tensor $\tilde G^{ij}$ in the adjoint representation 36 of USp(8) and a fully antisymmetric symplectic traceless tensor $\tilde G^{ijkl}$ transforming under the 42 of USp(8); indices $1 \le i, j, \ldots \le 8$ are now USp(8) indices and all tensors with a tilde transform under USp(8) rather than SL(8, R). $\tilde G^{ijkl}$ is traceless with respect to the real symplectic metric $\Omega_{ij} = -\Omega_{ji} = -\Omega^{ij}$ (thus $\Omega_{ik}\Omega^{kj} = \delta_i^j$). The symplectic metric also serves to pull up and down indices, with the convention that this is always to be done from the left. The remaining part of E7(7) is spanned by an extra dilatation generator $\tilde H$, translation generators $\tilde E^{ij}$ and the nonlinearly realized generators $\tilde F^{ij}$, transforming as 27 and $\overline{27}$, respectively. Unlike for E8(8), there is no need here to distinguish the generators by the position of their indices, since the corresponding generators are linearly related by means of the symplectic metric. The fundamental 27 of E6(6) (on which we are going to realize a nonlinear action of E7(7)) is given by the traceless antisymmetric tensor $\tilde Z^{ij}$ transforming as

\[
\tilde G^i{}_j(\tilde Z^{kl}) = 2\,\delta^k_j \tilde Z^{il}, \qquad
\tilde G^{ijkl}(\tilde Z^{mn}) = \tfrac{1}{24}\,\epsilon^{ijklmnpq} \tilde Z_{pq},
\tag{38}
\]

where $\tilde Z_{ij} := \Omega_{ik}\Omega_{jl}\tilde Z^{kl} = (\tilde Z^{ij})^*$ and $\Omega_{ij}\tilde Z^{ij} = 0$.



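Likewise, the $\overline{27}$ representation transforms as

\[
\tilde G^i{}_j(\bar Z^{kl}) = 2\,\delta^k_j \bar Z^{il}, \qquad
\tilde G^{ijkl}(\bar Z^{mn}) = -\tfrac{1}{24}\,\epsilon^{ijklmnpq} \bar Z_{pq}.
\tag{39}
\]

Because the product of two 27’s contains no singlet, there exists no quadratic invariant of E6(6); however, there is a cubic invariant given by

\[
N_3(\tilde Z) := \tilde Z^{ij} \tilde Z_{jk} \tilde Z^{kl} \Omega_{il}.
\tag{40}
\]

We are now ready to give the conformal realization of E7(7) on the 27 dimensional space spanned by the $\tilde Z^{ij}$. As the action of the linearly realized E6(6) subgroup has already been given, we list only the remaining variations. As before $\tilde E^{ij}$ acts by translations:

\[
\tilde E^{ij}(\tilde Z^{kl}) = -\Omega^{i[k}\Omega^{l]j} - \tfrac{1}{8}\,\Omega^{ij}\Omega^{kl}
\tag{41}
\]

and $\tilde H$ by dilatations

\[
\tilde H(\tilde Z^{ij}) = \tilde Z^{ij}.
\tag{42}
\]

The 27 generators $\tilde F^{ij}$ are realized nonlinearly:

\[
\begin{aligned}
\tilde F^{ij}(\tilde Z^{kl}) := {} & -2\,\tilde Z^{ij}(\tilde Z^{kl}) + \Omega^{i[k}\Omega^{l]j}\,(\tilde Z^{mn}\tilde Z_{mn}) + \tfrac{1}{8}\,\Omega^{ij}\Omega^{kl}\,(\tilde Z^{mn}\tilde Z_{mn}) \\
& + 8\,\tilde Z^{km}\tilde Z_{mn}\,\Omega^{n[i}\Omega^{j]l} - \Omega^{kl}\,(\tilde Z^{im}\Omega_{mn}\tilde Z^{nj}).
\end{aligned}
\tag{43}
\]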

The norm form needed to define the E7(7) invariant “light cones” is now constructed from the cubic invariant of E6(6). Then $N_3(\tilde X - \tilde Y)$ is manifestly invariant under E6(6) and under the translations $\tilde E^{ij}$ (observe that there is no need to introduce a nonlinear difference, unlike for E8(8)). Under $\tilde H$ it transforms by a constant factor, whereas under the action of $\tilde F^{ij}$ we have

\[
\tilde F^{ij}\, N_3(\tilde X - \tilde Y) = (\tilde X^{ij} + \tilde Y^{ij})\, N_3(\tilde X - \tilde Y).
\tag{44}
\]

Thus the light cones in $\mathbb{R}^{27}$ with base point $\tilde Y$,

\[
N_3(\tilde X - \tilde Y) = 0,
\tag{45}
\]

are indeed invariant under E7(7). They are still curved hypersurfaces, but in contrast to the E8(8) light-cones constructed before, they are no longer deformed as one varies the base point $\tilde Y$. The connection to the Jordan Triple Systems of Appendix A can now be made quite explicit, and the formulas that we arrive at in this way are completely analogous to the ones given in the introduction. We first of all notice that we can again define a triple product in terms of the E6(6) representations; it reads

\[
\begin{aligned}
\{\tilde X\, \tilde Y\, \tilde Z\}^{ij} = {} & 16\,\tilde X^{ik}\tilde Z_{kl}\tilde Y^{lj} + 16\,\tilde Z^{ik}\tilde X_{kl}\tilde Y^{lj} + 4\,\Omega^{ij}\,(\tilde X^{kl}\tilde Y_{lm}\tilde Z^{mn}\Omega_{kn}) \\
& + 4\,\tilde X^{ij}\tilde Y^{kl}\tilde Z_{kl} + 4\,\tilde Y^{ij}\tilde X^{kl}\tilde Z_{kl} + 2\,\tilde Z^{ij}\tilde X^{kl}\tilde Y_{kl}.
\end{aligned}
\tag{46}
\]



This triple product can be used to rewrite the conformal realization. Recalling that a triple product with identical properties exists for the 27-dimensional Jordan algebra $J_3^{\mathbb{O}_S}$, we now consider $\tilde Z$ as an element of $J_3^{\mathbb{O}_S}$. Next we introduce generators labeled by elements of $J_3^{\mathbb{O}_S}$, and define the variations

\[
U_a(\tilde Z) = a, \qquad
S_{ab}(\tilde Z) = \{a\, b\, \tilde Z\}, \qquad
\tilde U_c(\tilde Z) = -\tfrac{1}{2}\,\{\tilde Z\, c\, \tilde Z\},
\tag{47}
\]

for $a, b, c \in J_3^{\mathbb{O}_S}$. It is straightforward to check that these reproduce the commutation relations listed in the introduction with the only difference that $J_2^{\mathbb{C}}$ has been replaced by $J_3^{\mathbb{O}_S}$.

Acknowledgements. We are very grateful to R. Kallosh for poignant questions and comments on the first version of this paper. We would also like to thank B. de Wit and B. Pioline for enlightening discussions.

Appendix A. Jordan Triple Systems

Let us first recall the defining properties of a Jordan algebra. By definition these are algebras equipped with a commutative (but non-associative) binary product $a \circ b = b \circ a$ satisfying the Jordan identity

\[
(a \circ b) \circ a^2 = a \circ (b \circ a^2).
\tag{A.1}
\]

A Jordan algebra with such a product defines a so-called Jordan triple system (JTS) under the Jordan triple product

\[
\{a\, b\, c\} = a \circ (\tilde b \circ c) + (a \circ \tilde b) \circ c - \tilde b \circ (a \circ c),
\]

where $\tilde{\ }$ denotes a conjugation in $J$ corresponding to the operation † in g. The triple product satisfies the identities (which can alternatively be taken as the defining identities of the triple system)

\[
\begin{aligned}
\{a\, b\, c\} &= \{c\, b\, a\}, \\
\{a\, b\, \{c\, d\, x\}\} - \{c\, d\, \{a\, b\, x\}\} - \{a\, \{d\, c\, b\}\, x\} + \{\{c\, d\, a\}\, b\, x\} &= 0.
\end{aligned}
\tag{A.2}
\]

The Tits–Kantor–Koecher (TKK) construction [32, 21, 25] associates every JTS with a 3-graded Lie algebra

\[
g = g_{-1} \oplus g_0 \oplus g_{+1},
\tag{A.3}
\]

satisfying the formal commutation relations: $[g_{+1}, g_{-1}] = g_0$, $[g_{+1}, g_{+1}] = 0$, $[g_{-1}, g_{-1}] = 0$. With the exception of the Lie algebras G2, F4, and E8 every simple Lie algebra g can be given a three graded decomposition with respect to a subalgebra $g_0$ of maximal rank.
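The identities (A.1) and (A.2) can be verified in exact arithmetic for a simple special Jordan algebra. The sketch below is an illustration under stated assumptions, not the exceptional case treated in the text: it uses real symmetric 3 × 3 matrices with the product $a \circ b = (ab + ba)/2$ and takes the conjugation in the triple product to be trivial. All function names are illustrative.

```python
from fractions import Fraction
import random

def matmul(a, b):
    """Ordinary matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def jordan(a, b):
    """Jordan product a o b = (ab + ba) / 2."""
    s = combine(matmul(a, b), matmul(b, a))
    return [[x / 2 for x in row] for row in s]

def triple(a, b, c):
    """{a b c} = a o (b o c) + (a o b) o c - b o (a o c), with trivial conjugation."""
    t1 = jordan(a, jordan(b, c))
    t2 = jordan(jordan(a, b), c)
    t3 = jordan(b, jordan(a, c))
    return combine(combine(t1, t2), t3, sign=-1)

def random_symmetric(n, rng):
    """Random symmetric matrix with small rational (integer-valued) entries."""
    m = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            m[i][j] = m[j][i] = Fraction(rng.randint(-3, 3))
    return m

def check_identities(seed=0, n=3):
    rng = random.Random(seed)
    a, b, c, d, x = (random_symmetric(n, rng) for _ in range(5))
    a2 = jordan(a, a)
    # Jordan identity (A.1)
    jordan_ok = jordan(jordan(a, b), a2) == jordan(a, jordan(b, a2))
    # First identity of (A.2): symmetry in the outer arguments
    symmetric_ok = triple(a, b, c) == triple(c, b, a)
    # Second identity of (A.2)
    lhs = combine(
        combine(
            combine(triple(a, b, triple(c, d, x)), triple(c, d, triple(a, b, x)), sign=-1),
            triple(a, triple(d, c, b), x), sign=-1),
        triple(triple(c, d, a), b, x))
    zero = [[Fraction(0)] * n for _ in range(n)]
    return jordan_ok, symmetric_ok, lhs == zero
```

For this product the triple product reduces to $(abc + cba)/2$, which is why the second identity of (A.2) holds term by term; the exceptional algebra $J_3^{\mathbb{O}_S}$ is not of this matrix form, so the check only illustrates the structure of the identities.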



By the TKK construction the elements $U_a$ of the $g_{+1}$ subspace of the Lie algebra are labelled by the elements $a \in J$. Furthermore every such Lie algebra g admits an involutive automorphism ι, which maps the elements of the grade +1 space onto the elements of the subspace of grade −1: $\iota(U_a) =: \tilde U_a \in g_{-1}$.

(A.4)

To get a complete set of generators of g we define [Ua , U˜ b ] = Sab , [Sab , Uc ] = U{abc}

(A.5)

where Sab ∈ g0 and {abc} is the Jordan triple product under which the space J is closed. The remaining commutation relations are [Sab , U˜ c ] = U˜ {bac} , [Sab , Scd ] = S{abc}d − Sc{bad} ,

(A.6)

and the closure of the algebra under commutation follows from the defining identities of a JTS given above. The Lie algebra generated by $S_{ab}$ is called the structure algebra of the JTS $J$, under which the elements of $J$ transform linearly. The traceless elements of this action of $S_{ab}$ generate the reduced structure algebra of $J$. There exist four infinite families of hermitian JTS’s and two exceptional ones [31, 27]. The latter are listed in the table below (where $M_{1,2}(\mathbb{O})$ denotes 1 × 2 matrices over the octonions, i.e. the octonionic plane):

J                              G           H
$M_{1,2}(\mathbb{O}_S)$        E6(6)       SO(5, 5)
$M_{1,2}(\mathbb{O})$          E6(−14)     SO(8, 2)
$J_3^{\mathbb{O}_S}$           E7(7)       E6(6)
$J_3^{\mathbb{O}}$             E7(−25)     E6(−26)

Here we are mainly interested in the real form $J_3^{\mathbb{O}_S}$, which corresponds to the split octonions $\mathbb{O}_S$ and has E7(7) and E6(6) as its conformal and reduced structure group, respectively.

Appendix B. The Quartic E7(7) Invariant

In the SL(8, R) basis of E7(7) the quartic invariant is given by (18), which we here repeat for convenience:

\[
I_4^{SL(8,\mathbb{R})} = X^{ij} X_{jk} X^{kl} X_{li} - \tfrac{1}{4}\, X^{ij} X_{ij} X^{kl} X_{kl}
 + \tfrac{1}{96}\,\epsilon^{ijklmnpq} X_{ij} X_{kl} X_{mn} X_{pq}
 + \tfrac{1}{96}\,\epsilon_{ijklmnpq} X^{ij} X^{kl} X^{mn} X^{pq}.
\tag{B.1}
\]



Another very useful form of E7(7) makes the maximal compact subgroup SU(8) manifest. The fundamental 56 representation then is spanned by the complex tensors $Z_{AB}$ which are related to the SL(8, R) basis by [4]

\[
Z^{AB} = (Z_{AB})^* = \frac{1}{4\sqrt{2}}\,(X^{ij} - i X_{ij})\,\Gamma^{ij}_{AB},
\tag{B.2}
\]

where $\Gamma^{ij}_{AB}$ are the SO(8) gamma matrices. In this basis the quartic invariant takes the form

\[
I_4^{SU(8)} = Z^{AB} Z_{BC} Z^{CD} Z_{DA} - \tfrac{1}{4}\, Z^{AB} Z_{AB} Z^{CD} Z_{CD}
 + \tfrac{1}{96}\,\epsilon^{ABCDEFGH} Z_{AB} Z_{CD} Z_{EF} Z_{GH}
 + \tfrac{1}{96}\,\epsilon_{ABCDEFGH} Z^{AB} Z^{CD} Z^{EF} Z^{GH}.
\tag{B.3}
\]

The precise relation between $I_4^{SU(8)}$ and $I_4^{SL(8,\mathbb{R})}$ has never been spelled out in the literature, although it is claimed in [4] that they should be proportional. In fact, we have

\[
I_4^{SU(8)} = -I_4^{SL(8,\mathbb{R})}.
\tag{B.4}
\]

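To prove this claim, one needs the identities

\[
\begin{aligned}
\mathrm{Tr}(\Gamma^{ij}\Gamma^{kl}\Gamma^{mn}\Gamma^{pq}) = {} & -128\,\delta^{ij}_{p[k}\delta^{mn}_{l]q} + 128\,\delta^{ij}_{p[m}\delta^{kl}_{n]q} + 128\,\delta^{ij}_{k[m}\delta^{pq}_{n]l} \\
& + 96\,(\delta^{ij}_{kl}\delta^{mn}_{pq})_{\rm sym} \mp 8\,\epsilon^{ijklmnpq},
\end{aligned}
\tag{B.5}
\]

and

\[
\epsilon^{ABCDEFGH}\,\Gamma^{ij}_{AB}\Gamma^{kl}_{CD}\Gamma^{mn}_{EF}\Gamma^{pq}_{GH}
 = -128\,\bigl(12\,\delta^{ij}_{kl}\delta^{mn}_{pq} + 48\,\delta^{ij}_{p[k}\delta^{mn}_{l]q}\bigr)_{\rm sym} \mp \epsilon^{ijklmnpq},
\tag{B.6}
\]

where $(\ldots)_{\rm sym}$ denotes symmetrization w.r.t. the pairs of indices (ij), (kl), (mn), (pq), and the signs ∓ depend on whether the spinor representation or the conjugate spinor representation of the gamma matrices is used: $\Gamma^{ijklmnpq} = \mp\,\epsilon^{ijklmnpq}$. To see that $I_4$ can assume both positive and negative values it is sufficient to consider configurations in the SU(8) basis of the form [8]

\[
Z_{AB} =: \begin{pmatrix} z_1 & & \\ & \ddots & \\ & & z_4 \end{pmatrix} \otimes \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},
\tag{B.7}
\]

with complex parameters $z_1, \ldots, z_4$. For this configuration the quartic invariant becomes

\[
I_4^{SU(8)} = \sum_{\alpha} |z_\alpha|^4 - 2 \sum_{\beta > \alpha} |z_\alpha|^2 |z_\beta|^2 + 4\, z_1 z_2 z_3 z_4 + 4\, z_1^* z_2^* z_3^* z_4^*.
\tag{B.8}
\]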

Using this formula, one can easily see that both positive and negative values are possible for $I_4$:

i) We find positive values for $I_4$ when all but one parameter vanish: $I_4^{SU(8)} = |z_1|^4 > 0$ for $z_1 \neq 0$, $z_2 = z_3 = z_4 = 0$.

ii) $I_4$ vanishes when all parameters take the same real (electric) or imaginary (magnetic) value: $I_4^{SU(8)} = 0$ for $z_1 = z_2 = z_3 = z_4 = M$ or $iM$, $M \in \mathbb{R}$. This is the example considered in [20] corresponding to maximally BPS black hole solutions in d = 4, N = 8 supergravity with vanishing entropy and vanishing area of the horizon.

iii) $I_4$ is negative when all parameters take the same complex “dyonic” value. For instance, $I_4^{SU(8)}$
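The three cases above follow directly from (B.8) and can be checked with a few lines of code. The sketch below simply evaluates (B.8) for given $z_1, \ldots, z_4$; the function name `quartic_invariant` and the sample values are illustrative choices, not taken from the text. For equal parameters $z_\alpha = M e^{i\varphi}$ the formula reduces to $8M^4(\cos 4\varphi - 1)$, which is never positive and vanishes exactly for real or purely imaginary values.

```python
def quartic_invariant(z):
    """Evaluate Eq. (B.8) for the four complex parameters z = (z1, z2, z3, z4)."""
    z1, z2, z3, z4 = z
    mod2 = [w.real ** 2 + w.imag ** 2 for w in z]          # |z_alpha|^2
    quartic = sum(m * m for m in mod2)                     # sum of |z_alpha|^4
    mixed = sum(mod2[a] * mod2[b]                          # sum over pairs beta > alpha
                for a in range(4) for b in range(a + 1, 4))
    product = z1 * z2 * z3 * z4
    # 4 z1 z2 z3 z4 + 4 (z1 z2 z3 z4)^* = 8 Re(z1 z2 z3 z4)
    return quartic - 2 * mixed + 8 * product.real

if __name__ == "__main__":
    print(quartic_invariant((2, 0, 0, 0)))        # case i): single nonvanishing parameter
    print(quartic_invariant((3, 3, 3, 3)))        # case ii): equal real values
    print(quartic_invariant((3j, 3j, 3j, 3j)))    # case ii): equal imaginary values
    print(quartic_invariant((1 + 1j,) * 4))       # case iii): equal dyonic values
```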

x0 }, the map $A\Omega \mapsto A^*\Omega$, $A \in \mathcal{A}(W_1)''$, defines an antilinear operator $S_{W_1} : \mathcal{A}(W_1)''\Omega \to \mathcal{A}(W_1)''\Omega$ which is closable. Its closure is called the Tomita operator of $\Omega$ and $\mathcal{A}(W_1)''$ and admits a unique polar decomposition $S_{W_1} = J_{W_1}\Delta_{W_1}^{1/2}$ into an antiunitary conjugation $J_{W_1}$ (the “phase” of $S_{W_1}$), which is called the modular conjugation of $(\Omega, \mathcal{A}(W_1)'')$, and a positive operator $\Delta_{W_1}^{1/2}$ (the “modulus” of $S_{W_1}$) whose square $\Delta_{W_1}$ is referred to as the modular operator of $(\Omega, \mathcal{A}(W_1)'')$. The main theorem of Tomita–Takesaki theory [46] now implies that the adjoint actions of the operators $\Delta_{W_1}^{it}$ map the algebras $\mathcal{A}(W_1)''$ and $\mathcal{A}(W_1)'$ onto themselves, whereas the adjoint action of the conjugation $J_{W_1}$ maps the two algebras onto one another. Bisognano and Wichmann showed that for finite-component Wightman fields, the unitary $\Delta_{W_1}^{it}$ coincides with the unitary representing the 0-1-boost by $-2\pi t$ for all $t \in \mathbb{R}$, whereas $J_{W_1}$ implements a charge conjugation together with a time reflection and a spatial reflection in the 1-direction; this combination of discrete transformations will be referred to as a P1CT-symmetry. For the algebraic setting, Borchers proved in [11]2 that the spectrum condition (without assuming Lorentz covariance) implies the commutation relations

(i) $J_{W_1} U(a) J_{W_1} = U(j_1 a)$,

(ii) $\Delta_{W_1}^{it}\, U(a)\, \Delta_{W_1}^{-it} = U(\Lambda_1(-2\pi t)\, a)$ for all $t \in \mathbb{R}$,

where $\Lambda_1(-2\pi t)$ denotes the Lorentz boost by $-2\pi t$ in the 1-direction, while $j_1$ is the reflection defined by $j_1 x := (-x_0, -x_1, x_2, \ldots, x_s)$. Wiesbrock noted that Borchers’ relations are not only a necessary, but also a sufficient condition for the spectrum condition ([52], cf. also [25]). For 1+1 dimensions, Borchers’ relations immediately imply [11] that the net of observables may be enlarged to a local net which generates the same wedge algebras (and hence the same corresponding modular operator and conjugation) as the original one and which has the property that $J_{W_1}$ is a P1CT-operator (modular P1CT-symmetry), whereas $\Delta_{W_1}^{it}$ implements the Lorentz boost by $-2\pi t$ for each $t \in \mathbb{R}$ (modular Lorentz symmetry). The first uniqueness theorem for modular symmetries states that even in higher dimensions, $J_{W_1}$ or $\Delta_{W_1}^{it}$, $t \in \mathbb{R}$, can be shown to be a P1CT-operator or a 0-1-Lorentz boost, respectively, provided that $J_{W_1}$ or $\Delta_{W_1}^{it}$ implement any geometric action on the net. The first step towards it is the following lemma. In this lemma and in what follows, $\mathcal{K}$ will denote the class of all double cones of the form $O := (a + V_+) \cap (b - V_+)$, $a, b \in \mathbb{R}^{1+s}$.

Lemma 2.1. Let $K$ be a unitary or antiunitary operator with the property that for every double cone $O$ there are open sets $M_O$ and $N_O$ such that

$K \mathcal{A}(O) K^* = \mathcal{A}(M_O)$, $\qquad K^* \mathcal{A}(O) K = \mathcal{A}(N_O)$,

2 For a considerably simpler proof found recently, see [28].

Two Uniqueness Results on Unruh Effect and on PCT-Symmetry


and let κ be a causal automorphism3 of $\mathbb{R}^{1+s}$ such that $K U(a) K^* = U(\kappa a)$ for all $a \in \mathbb{R}^{1+s}$. Then there is a unique $\xi \in \mathbb{R}^{1+s}$ such that

$K \mathcal{A}(O) K^* = \mathcal{A}(\kappa O + \xi)$ for all $O \in \mathcal{K}$.
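The geometric maps that enter Borchers’ relations, the boost in the 1-direction and the reflection $j_1$, can be checked numerically against the Rindler wedge. The sketch below assumes the standard hyperbolic parametrization of the boost (the sign convention attached to the parameter is an assumption here) and takes the wedge to be $W_1 = \{x : x_1 > |x_0|\}$; the function names are illustrative.

```python
import math
import random

def boost_1(s, x):
    """Lorentz boost with rapidity s in the 0-1 plane; other coordinates are untouched."""
    x0, x1, *rest = x
    return (math.cosh(s) * x0 + math.sinh(s) * x1,
            math.sinh(s) * x0 + math.cosh(s) * x1,
            *rest)

def j1(x):
    """Reflection j1 x = (-x0, -x1, x2, ..., xs)."""
    x0, x1, *rest = x
    return (-x0, -x1, *rest)

def in_wedge(x):
    """Membership in the Rindler wedge W1 = {x1 > |x0|}."""
    return x[1] > abs(x[0])

def in_opposite_wedge(x):
    """Membership in the opposite wedge -W1 = {x1 < -|x0|}."""
    return -x[1] > abs(x[0])

def sample_wedge_point(rng, s_dim=3):
    """Random point of W1 in 1+s dimensions, kept a fixed margin away from the wedge boundary."""
    x0 = rng.uniform(-5.0, 5.0)
    x1 = abs(x0) + rng.uniform(0.1, 5.0)
    return (x0, x1, *[rng.uniform(-1.0, 1.0) for _ in range(s_dim - 1)])

def check_wedge_maps(trials=200, seed=1):
    rng = random.Random(seed)
    for _ in range(trials):
        x = sample_wedge_point(rng)
        t = rng.uniform(-0.5, 0.5)
        y = boost_1(-2.0 * math.pi * t, x)
        # the boost maps W1 into itself, the reflection maps it onto the opposite wedge
        if not (in_wedge(y) and in_opposite_wedge(j1(x))):
            return False
    return True
```

The boost preserves $W_1$ because it rescales the light-cone coordinates $x_1 \pm x_0$ by $e^{\pm s}$, while $j_1$ exchanges $W_1$ with the opposite wedge, mirroring the way the modular unitaries preserve the wedge algebra and the modular conjugation maps it onto its commutant.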

A first proof of Lemma 2.1 was published in [37], but both the statement and the proof given there were more general, which made the formulation somewhat technical. For the reader’s convenience a less general, but more accessible formulation is used here, and a more detailed version of the proof is given below. The following theorem is a consequence of Lemma 2.1 and Borchers’ commutation relations.

Theorem 2.2 (First Uniqueness Theorem).
(i) If for every double cone $O \in \mathcal{K}$ there is an open set $M_O$ such that $J_{W_1} \mathcal{A}(O) J_{W_1} = \mathcal{A}(M_O)$, then $J_{W_1} \mathcal{A}(O) J_{W_1} = \mathcal{A}(j_1 O)$ for all $O \in \mathcal{K}$.
(ii) If for every $t \in \mathbb{R}$ and for every $O \in \mathcal{K}$ there is an open set $M_O^t$ such that $\Delta_{W_1}^{it} \mathcal{A}(O) \Delta_{W_1}^{-it} = \mathcal{A}(M_O^t)$, then $\Delta_{W_1}^{it} \mathcal{A}(O) \Delta_{W_1}^{-it} = \mathcal{A}(\Lambda_1(-2\pi t) O)$ for all $O \in \mathcal{K}$.

The statement of part (ii) implies the statement of part (i) [30], i.e., the Unruh effect implies modular P1CT-symmetry. Further results relating the above statements to each other and to similar conditions can be found in [26]. Assuming that $\Omega$ is separating with respect to the algebra $\mathcal{A}(V_+)''$, Borchers also found commutation relations for the corresponding modular conjugation and unitaries: for each $a \in \mathbb{R}^{1+s}$, he found that

$J_+ U(a) J_+ = U(-a)$; $\qquad \Delta_+^{it}\, U(a)\, \Delta_+^{-it} = U(e^{-2\pi t} a)$ for all $t \in \mathbb{R}$.

These relations, together with Lemma 2.1, imply the following corollary:

Corollary 2.3 (Uniqueness Theorem “1a”). Assume $\mathcal{A}$ to be Poincaré covariant, and assume that the vacuum vector $\Omega$ is separating with respect to the algebra $\mathcal{A}(V_+)''$, and let $\Delta_{V_+}^{it}$ and $J_{V_+}$ be the corresponding modular operator and conjugation, respectively.

3 Recall that a causal automorphism of $\mathbb{R}^{1+s}$ is a bijection $f : \mathbb{R}^{1+s} \to \mathbb{R}^{1+s}$ which preserves the causal structure of $\mathbb{R}^{1+s}$, i.e., $f(x)$ and $f(y)$ are timelike with respect to each other if and only if $x$ and $y$ are timelike with respect to each other. Without assuming linearity or continuity, one can show that the group of all causal automorphisms of $\mathbb{R}^{1+s}$ is generated by the elements of the Poincaré group and the dilatations [1, 3, 2, 54, 15]. Since the transformations implemented on the translations by Borchers’ commutation relations happen to be causal in all applications discussed below, this assumption means no loss of generality.


B. Kuckert

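(i) If for every double cone $O$ there is an open set $M_O$ such that $J_{V_+} \mathcal{A}(O) J_{V_+} = \mathcal{A}(M_O)$, then $J_{V_+} \mathcal{A}(O) J_{V_+} = \mathcal{A}(-O)$ for all $O \in \mathcal{K}$.

(ii) If for every $t \in \mathbb{R}$ and every double cone $O$ there is an open set $M_O^t$ such that $\Delta_{V_+}^{it} \mathcal{A}(O) \Delta_{V_+}^{-it} = \mathcal{A}(M_O^t)$, then $\Delta_{V_+}^{it} \mathcal{A}(O) \Delta_{V_+}^{-it} = \mathcal{A}(e^{-2\pi t} O)$ for all $O \in \mathcal{K}$.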

Since massive theories cannot be dilation invariant unless their mass spectrum is dilation invariant (cf., e.g., [42]), the models concerned by part (ii) of this corollary are massless theories. But it follows from the scattering theory for massless fermions and bosons in 1+3 or 1+1 dimensions (see [17–19]) that either of the symmetry properties found in part (i) and part (ii) of the corollary implies a massless theory to be free (i.e., its S-matrix is trivial) (see [18, 20, 23]). Note that for the 1+1-dimensional case, all modular symmetries considered in Thm. 2.2 and Cor. 2.3 have been established in [11].

It is assumed above that the adjoint actions of $J_{W_1}$ and $\Delta_{W_1}^{it}$, $t \in \mathbb{R}$, map each local algebra $\mathcal{A}(O)$, $O \in \mathcal{K}$, onto the algebra $\mathcal{A}(M_O)$ associated with some open region $M_O$ in Minkowski space. This means that, essentially, the net structure has to be preserved. This is the restrictive aspect of the assumption. On the other hand, the shape of the region $M_O$ is left completely arbitrary; the map $\mathcal{K} \ni O \mapsto M_O$ is not even assumed to be induced by a point transformation. In this aspect, the above assumptions are rather weak.

But there are, of course, other ways to specify what a “geometric action” is. Denote by $\mathcal{W}$ the class of all wedges, i.e., all images of the Rindler wedge $W_1$ under Poincaré transformations. For $M \subset \mathbb{R}^{1+s}$, define the causal complement $M^c$ to be the set of all points that are spacelike to $M$, and let $M'$ denote the interior of $M^c$. It has been shown in [38, 39] that one can define a nonempty localization region for each local observable $A \notin \mathbb{C}\,\mathrm{id}$ by

$L(A) := \bigcap \{\overline{W} : W \in \mathcal{W},\ A \in \mathcal{A}(W')'\}$.

This localization prescription will be said to satisfy locality if any two local observables $A$ and $B$ with the property that $L(A)$ and $L(B)$ are spacelike separated commute. This property does not follow from the locality property of the net alone, but with the following additional assumptions one can derive it for the present setting [39]:

(E) Wedge duality. $\mathcal{A}(W')'' = \mathcal{A}(W)'$ for each wedge $W \in \mathcal{W}$.

(F) Wedge additivity. For each wedge $W \in \mathcal{W}$ and each double cone $O \in \mathcal{K}$ with $W \subset W + O$ one has

$\mathcal{A}(W)'' \subset \Bigl(\bigcup_{a \in W} \mathcal{A}(a + O)''\Bigr)''$.
Wedge duality is a property of all finite-component Wightman fields by the Bisognano–Wichmann theorem, and wedge additivity is a standard property of Wightman



fields as well. Condition (F) is slightly stronger than the definition of wedge additivity used in [47, 39], where the algebras A(a + O) in Condition (F) are replaced by the larger algebras A(a + O ) , but as this difference is not expected to be substantial for physics, we use the same term for convenience, which is in harmony with the other existing notions of additivity used in algebraic quantum field theory. Assume now that the localization region of the observable At := itW1 A−it W1 depends continuously on t, i.e., that for every sequence (tν )ν∈N which converges to some t∞ ∈ R, the localization region L(At∞ ) consists precisely of all accumulation points of sequences (xν )ν∈N with xν ∈ L(Atν ). Then the following lemma establishes a first restriction on how the localization region can depend on t. Lemma 2.4. With Assumptions (A)–(E), suppose the localization prescription L defined above satisfies locality. Let A be a local observable in A(W1 ), and assume that there exists an ε > 0 such that all At , t ∈ [0, ε], are local observables and such that the function [0, ε] t → L(At ) is continuous in the above sense. Then

(i) $L(A_\varepsilon) \subset \Lambda_1(-2\pi\varepsilon)\bigl((L(A) + W_1)^{cc} \cap (L(A) - W_1)^{cc}\bigr)$;
(ii) $L(A_\varepsilon) \subset L(A) - \overline{V}_+$;
(iii) $L(A) \subset L(A_\varepsilon) + \overline{V}_+$.

It is shown in the Appendix that the continuity assumption made on $t \mapsto L(A_t)$ is equivalent to continuity with respect to a metric first considered by Hausdorff, and that $\bigcup_{t\in[0,\varepsilon]} L(A_t)$ is compact.

Next suppose that $t \mapsto L(A_t)$ is continuous not only for sufficiently small $t$, but for all $t \in \mathbb{R}$, and assume wedge additivity in addition. With these slightly strengthened assumptions one can now prove the following:

Theorem 2.5 (Second Uniqueness Theorem). With Assumptions (A)–(F), assume that $\Delta^{it}_{W_1} \mathcal{A}_{\mathrm{loc}} \Delta^{-it}_{W_1} = \mathcal{A}_{\mathrm{loc}}$, and suppose that $L(A_t)$ depends continuously on $t$ for all $t \in \mathbb{R}$ and for all $A \in \mathcal{A}_{\mathrm{loc}}$. Then
\[ L(\Delta^{it}_{W_1} A \Delta^{-it}_{W_1}) = \Lambda_1(-2\pi t) L(A) \quad \text{for all } A \in \mathcal{A}_{\mathrm{loc}}. \]

By the result of Guido and Longo, the conclusion of this theorem also implies modular P1CT-symmetry, but Theorem 2.5 does not provide a proper parallel to the P1CT-part of the first uniqueness theorem, which may also apply if the modular group does not act in any geometric way.

The assumption that every local observable $A$ is mapped onto some other local observable under the adjoint action of the modular group prevents $A$ from being mapped onto an observable localized in an unbounded region. For every bounded open region $O$ there are conformal transformations which map $O$ onto an unbounded region; these transformations are excluded a priori. In contrast, the assumptions of the first uniqueness theorem do not exclude these symmetries explicitly, while it is evident from this theorem that the modular objects under consideration cannot implement these symmetries. Another restrictive assumption of the second uniqueness theorem is that wedge duality is assumed there, whereas the first one can be used to derive wedge duality. On the other hand, the assumptions made in the second uniqueness theorem admit the situation that the net structure of $\mathcal{A}$ is destroyed completely under the action of the modular group.


B. Kuckert

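3. Proofs

For every algebra $\mathcal{M} \subset \mathcal{B}(\mathcal{H})$, define its localization region $L(\mathcal{M})$ with respect to the net $\mathcal{A}$ by
\[ L(\mathcal{M}) := \bigcup\{O \in \mathcal{K} : \mathcal{A}(O) \subset \mathcal{M}\}. \]
The only reason to use the class $\mathcal{K}$ of double cones in this definition is convenience; one could replace $\mathcal{K}$ by the larger class $\mathcal{T}$ of all open sets in $\mathbb{R}^{1+s}$ without affecting the definition. To see this, denote the localization region obtained this way by $L_{\mathcal{T}}(\mathcal{M})$; it is trivial that $L(\mathcal{M}) \subset L_{\mathcal{T}}(\mathcal{M})$ as $\mathcal{K} \subset \mathcal{T}$, while from isotony of the net and the fact that each open region $M$ is the union of all double cones $O \subset M$, one finds
\[
\begin{aligned}
L_{\mathcal{T}}(\mathcal{M}) &= \bigcup\{M \in \mathcal{T} : \mathcal{A}(M) \subset \mathcal{M}\}\\
&= \bigcup\{O \in \mathcal{K} : \exists M \in \mathcal{T} : O \subset M,\ \mathcal{A}(M) \subset \mathcal{M}\}\\
&\subset \bigcup\{O \in \mathcal{K} : \mathcal{A}(O) \subset \mathcal{M}\} = L(\mathcal{M}),
\end{aligned}
\]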

which is the converse inclusion. It is obvious from the definitions that $L(\mathcal{A}(M)) \supset M$. For causally complete and convex regions one can prove the converse inclusion, which we recall without proof from [39] (Cor. 5.4) for later use. Here a causally complete region is a region $R$ such that $(R^c)^c = R$.

Lemma 3.1. Let $R \subset \mathbb{R}^{1+s}$ be a causally complete convex open region. (i) For every open region $M \subset \mathbb{R}^{1+s}$, one has $\mathcal{A}(M) \subset \mathcal{A}(R')'$ if and only if $M \subset R$. (ii) $L(\mathcal{A}(R)) = R$.

One also checks that for any such $R$, one has $L(\mathcal{A}(R)) = L(\mathcal{A}(R)'') = L(\mathcal{A}(R')')$. We emphasize that the above assumption $s \geq 2$ is crucial for this lemma; in 1+1 dimensions, there are chiral theories which do not obey the statement of the lemma. The repeated use of this lemma in the proofs is the main reason why $s \geq 2$ is assumed throughout this paper.

Proof of Lemma 2.1. In what follows, $K$ and $\kappa$ are defined as in Lemma 2.1. As before, $\mathcal{K}$ will denote the class of double cones. For any open region $M \subset \mathbb{R}^{1+s}$, we denote by $\mathcal{K}_M$ the class of all double cones $O \in \mathcal{K}$ with $O \subset M$, and for each subalgebra $\mathcal{M}$ of $\mathcal{B}(\mathcal{H})$, we denote by $\mathcal{K}_{\mathcal{M}}$ the class of all double cones $O$ such that $\mathcal{A}(O) \subset \mathcal{M}$. The proof will be subdivided into five lemmas. The first implies that for every $O \in \mathcal{K}$, the regions $M_O$ and $N_O$ are bounded. It uses the fact that a region $M$ is bounded if and only if its difference region $M - M$ is bounded, and that difference sets can be expressed in terms of translations. Since the behaviour of translations under the action of the symmetry $K$ is known by assumption, one can prove the following lemma.

Lemma 3.2. For every double cone $O \in \mathcal{K}$, one has
\[ L(K\mathcal{A}(O)K^*) - L(K\mathcal{A}(O)K^*) = \kappa(O - O). \]



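Proof. Using the assumptions of Theorem 2.1, one obtains
\[
\begin{aligned}
L(K\mathcal{A}(O)K^*) - L(K\mathcal{A}(O)K^*)
&= L(\mathcal{A}(M_O)) - L(\mathcal{A}(M_O))\\
&= \{a \in \mathbb{R}^{1+s} : \exists P \in \mathcal{K}_{\mathcal{A}(M_O)} : \mathcal{A}(P + a) \subset \mathcal{A}(M_O)\}\\
&= \{a \in \mathbb{R}^{1+s} : \exists P \in \mathcal{K}_{\mathcal{A}(M_O)} : KU(\kappa^{-1}a)K^*\,\mathcal{A}(P)\,KU(-\kappa^{-1}a)K^* \subset \mathcal{A}(M_O)\}\\
&= \kappa\{a \in \mathbb{R}^{1+s} : \exists P \in \mathcal{K}_{\mathcal{A}(M_O)} : U(a)\underbrace{K^*\mathcal{A}(P)K}_{=\mathcal{A}(N_P)}U(-a) \subset \underbrace{K^*\mathcal{A}(M_O)K}_{=\mathcal{A}(O)}\}\\
&\subset \kappa\{a \in \mathbb{R}^{1+s} : \exists P \in \mathcal{K}_{\mathcal{A}(M_O)} : \exists Q \in \mathcal{K}_{\mathcal{A}(N_P)} : \mathcal{A}(Q + a) \subset \mathcal{A}(O)\}.
\end{aligned}
\]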

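Since the definitions and isotony imply
\[ \mathcal{K}_{\mathcal{A}(N_P)} = \mathcal{K}_{K^*\mathcal{A}(P)K} \subset \mathcal{K}_{K^*\mathcal{A}(M_O)K} = \mathcal{K}_{\mathcal{A}(O)}, \]
and since, as remarked above, $\mathcal{K}_{\mathcal{A}(O)} = \mathcal{K}_O$, one obtains
\[ L(K\mathcal{A}(O)K^*) - L(K\mathcal{A}(O)K^*) \subset \kappa\{a \in \mathbb{R}^{1+s} : \exists Q \in \mathcal{K}_O : \mathcal{A}(Q + a) \subset \mathcal{A}(O)\} = \kappa(O - O). \]
Conversely,
\[
\begin{aligned}
\kappa(O - O) &= \kappa\{a \in \mathbb{R}^{1+s} : \exists P \in \mathcal{K}_O : \mathcal{A}(P + a) \subset \mathcal{A}(O)\}\\
&= \{a \in \mathbb{R}^{1+s} : \exists P \in \mathcal{K}_O : \mathcal{A}(P + \kappa^{-1}a) \subset \mathcal{A}(O)\}\\
&= \{a \in \mathbb{R}^{1+s} : \exists P \in \mathcal{K}_O : K^*U(a)K\,\mathcal{A}(P)\,K^*U(-a)K \subset \mathcal{A}(O)\}\\
&= \{a \in \mathbb{R}^{1+s} : \exists P \in \mathcal{K}_O : \mathcal{A}(M_P + a) \subset \mathcal{A}(M_O)\}\\
&\subset \{a \in \mathbb{R}^{1+s} : \exists P \in \mathcal{K}_O : \exists Q \in \mathcal{K}_{\mathcal{A}(M_P)} : \mathcal{A}(Q + a) \subset \mathcal{A}(M_O)\},
\end{aligned}
\]
and since
\[ \mathcal{K}_{\mathcal{A}(M_P)} = \mathcal{K}_{K\mathcal{A}(P)K^*} \subset \mathcal{K}_{K\mathcal{A}(O)K^*} = \mathcal{K}_{\mathcal{A}(M_O)}, \]
one obtains
\[ \kappa(O - O) \subset \{a \in \mathbb{R}^{1+s} : \exists Q \in \mathcal{K}_{\mathcal{A}(M_O)} : \mathcal{A}(Q + a) \subset \mathcal{A}(M_O)\} = L(\mathcal{A}(M_O)) - L(\mathcal{A}(M_O)). \]

The next lemma proves that strict inclusions of double cones are preserved under the adjoint action of the operator $K$. Again, this boils down to translating local algebras up and down Minkowski space and using the commutation relations between $K$ and the translation operators. One uses the fact that $\overline{O} \subset P$ if and only if $O$ can be translated within $P$ into all directions.

Lemma 3.3. For any two double cones $O, P \in \mathcal{K}$ with $\overline{O} \subset P$, one has $\overline{L(K\mathcal{A}(O)K^*)} \subset L(K\mathcal{A}(P)K^*)$.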


Proof. O ⊂ P if and only if the set {a ∈ R1+s : O + a ⊂ P } is a neighbourhood of the origin of R1+s . After using Lemma 3.1, elementary transformations yield {a ∈ R1+s : O + a ⊂P } = {a ∈ R1+s : A(O + a) ⊂ A(P )} = {a ∈ R1+s : K ∗ U (κa)KA(O)K ∗ U (−κa)K ⊂ A(P )} = {a ∈ R1+s : A(MO + κa) ⊂ A(MP )} = κ −1 {a ∈ R1+s : A(MO + a) ⊂ A(MP )} ⊂ κ −1 {a ∈ R1+s : L(A(MO )) + a ⊂ L(A(MP ))}. Since κ is a linear automorphism of R1+s , it follows that O can be a subset of P only if {a ∈ R1+s : L(A(MO )) + a ⊂ L(A(MP ))} is a neighbourhood of the origin. This implies the statement.

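The next lemma proves that the maps $\mathcal{K} \ni O \mapsto L(K\mathcal{A}(O)K^*)$ and $\mathcal{K} \ni O \mapsto L(K^*\mathcal{A}(O)K)$ are induced by continuous functions $\tilde\kappa : \mathbb{R}^{1+s} \to \mathbb{R}^{1+s}$ and $\hat\kappa : \mathbb{R}^{1+s} \to \mathbb{R}^{1+s}$.

Lemma 3.4. Let $x \in \mathbb{R}^{1+s}$ be arbitrary, and let $(O_\nu)_{\nu\in\mathbb{N}}$ be a neighbourhood base of $x$ consisting of double cones $O_\nu \in \mathcal{K}$. Then $(L(K\mathcal{A}(O_\nu)K^*))_{\nu\in\mathbb{N}}$ is a neighbourhood base of a (naturally, unique) point $\tilde\kappa(x) \in \mathbb{R}^{1+s}$, and $(L(K^*\mathcal{A}(O_\nu)K))_{\nu\in\mathbb{N}}$ is a neighbourhood base of a point $\hat\kappa(x) \in \mathbb{R}^{1+s}$. The functions $x \mapsto \tilde\kappa(x)$ and $x \mapsto \hat\kappa(x)$ are continuous.

Proof. Without loss of generality, one may assume that $\overline{O_{\nu+1}} \subset O_\nu$ for all $\nu \in \mathbb{N}$. It follows from $L(\mathcal{A}(O)) = O$ for all $O \in \mathcal{K}$ and Lemma 3.2 that all $L(K\mathcal{A}(O_\nu)K^*)$, $\nu \in \mathbb{N}$, are bounded sets, and it follows from Lemma 3.3 that $\overline{L(K\mathcal{A}(O_{\nu+1})K^*)} \subset L(K\mathcal{A}(O_\nu)K^*)$. Therefore, the intersection of this family is nonempty, and Lemma 3.2 implies that the diameter of $L(K\mathcal{A}(O_\nu)K^*)$ tends to zero as $\nu$ tends to infinity. This implies that the intersection contains precisely one point $\tilde\kappa(x)$, as stated. The corresponding statements for $K^*$ are proved analogously. This proves that $x \mapsto \tilde\kappa(x)$ is a bijective point transformation.

Let $(x_\nu)_{\nu\in\mathbb{N}}$ be a sequence in $\mathbb{R}^{1+s}$ that converges to a point $x_\infty$. Then there is a neighbourhood base $(O_\nu)_{\nu\in\mathbb{N}}$ of $x_\infty$ with $x_\nu \in O_\nu$ for all $\nu \in \mathbb{N}$. But since $\tilde\kappa(x_\nu) \in \tilde\kappa(O_\nu)$ for all $\nu \in \mathbb{N}$, and since $(\tilde\kappa(O_\nu))_{\nu\in\mathbb{N}}$ is a neighbourhood base of $\tilde\kappa(x_\infty)$, it follows that $\tilde\kappa(x_\nu)$ tends to $\tilde\kappa(x_\infty)$ as $\nu \to \infty$. This line of argument applies to $\hat\kappa$ as well.

The next lemma determines the functions $\tilde\kappa$ and $\hat\kappa$ up to a constant translation.

Lemma 3.5. For every $x \in \mathbb{R}^{1+s}$, one has $\tilde\kappa(x) = \tilde\kappa(0) + \kappa x$, and $\hat\kappa(x) = \hat\kappa(0) + \kappa^{-1}x$.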



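Proof. Let $(O_\nu)_{\nu\in\mathbb{N}}$ be a neighbourhood base of the origin. Then $(O_\nu + x)_{\nu\in\mathbb{N}}$ is a neighbourhood base of $x$, and
\[ \bigcap_{\nu\in\mathbb{N}} L(K\mathcal{A}(O_\nu + x)K^*) = \bigcap_{\nu\in\mathbb{N}} \tilde\kappa(O_\nu + x) = \{\tilde\kappa(x)\}. \]
On the other hand,
\[
\begin{aligned}
\bigcap_{\nu\in\mathbb{N}} L(K\mathcal{A}(O_\nu + x)K^*) &= \bigcap_{\nu\in\mathbb{N}} L(U(\kappa x)K\mathcal{A}(O_\nu)K^*U(-\kappa x))\\
&= \kappa x + \bigcap_{\nu\in\mathbb{N}} \tilde\kappa(O_\nu)\\
&= \kappa x + \{\tilde\kappa(0)\}.
\end{aligned}
\]
The corresponding reasoning also leads to the statement made on $\hat\kappa$.

It has been shown now that $L(K\mathcal{A}(O)K^*) = \tilde\kappa(O)$ for each double cone $O \in \mathcal{K}$, and since $K\mathcal{A}(O)K^* = \mathcal{A}(M_O)$ by assumption, one concludes from $M_O \subset L(\mathcal{A}(M_O))$ and isotony that
\[ K\mathcal{A}(O)K^* \subset \mathcal{A}(\tilde\kappa(O)) \quad \text{for all } O \in \mathcal{K} \]
and that
\[ K^*\mathcal{A}(O)K \subset \mathcal{A}(\hat\kappa(O)) \quad \text{for all } O \in \mathcal{K}. \]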

Using this, one can now prove that $\tilde\kappa$ and $\hat\kappa$ are inverse to each other.

Lemma 3.6. $\hat\kappa = \tilde\kappa^{-1}$, and in particular, $\tilde\kappa$ and $\hat\kappa$ are homeomorphisms.

Proof. For every double cone $O$, it follows from the preceding results that
\[ \mathcal{A}(O) = K^*K\mathcal{A}(O)K^*K \subset K^*\mathcal{A}(\tilde\kappa(O))K \subset \mathcal{A}(\hat\kappa(\tilde\kappa(O))), \]
and since $\hat\kappa(\tilde\kappa(O))$ is a double cone by Lemma 3.5, one can use Lemma 3.1 to conclude that $O \subset \hat\kappa(\tilde\kappa(O))$. On the other hand, it follows from Lemma 3.2 that the radii of the double cones $O$ and $\hat\kappa(\tilde\kappa(O))$ are equal, so these double cones coincide, and as this applies for any double cone $O$, it follows that $\hat\kappa = \tilde\kappa^{-1}$, as stated.

The proof of Lemma 2.1 is now almost complete. For each $O \in \mathcal{K}$, one has $K\mathcal{A}(O)K^* \subset \mathcal{A}(\tilde\kappa(O))$, and conversely,
\[ \mathcal{A}(\tilde\kappa(O)) = KK^*\mathcal{A}(\tilde\kappa(O))KK^* \subset K\mathcal{A}(\tilde\kappa^{-1}(\tilde\kappa(O)))K^* = K\mathcal{A}(O)K^*, \]
so
\[ K\mathcal{A}(O)K^* = \mathcal{A}(\tilde\kappa(O)), \]
and with $\xi := \tilde\kappa(0)$ it follows from Lemma 3.5 that
\[ K\mathcal{A}(O)K^* = \mathcal{A}(\kappa O + \xi) \quad \text{for all } O \in \mathcal{K}. \]
That $\xi$ is unique follows immediately from Lemma 3.1, so the proof of Lemma 2.1 is complete.


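Proof of Theorem 2.2 (i). It follows from Lemma 2.1 that there is a unique $\iota \in \mathbb{R}^{1+s}$ such that
\[ J_{W_1}\mathcal{A}(O)J_{W_1} = \mathcal{A}(j_1 O + \iota) \quad \text{for all } O \in \mathcal{K}. \]
It remains to be shown that $\iota = 0$. Since $J$ is an involution, one has
\[ x = j_1(j_1 x + \iota) + \iota = x + j_1\iota + \iota \quad \text{for all } x \in \mathbb{R}^{1+s}, \]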

which gives $\iota = -j_1\iota$, hence $\iota_2 = \dots = \iota_s = 0$. Furthermore, one has $\mathcal{A}(W_1' + \iota)'' = J_{W_1}\mathcal{A}(W_1)''J_{W_1} = \mathcal{A}(W_1)'$ from Lemma 2.1 and the Tomita–Takesaki theorem, so on the one hand, it follows from Lemma 3.1 that $W_1' + \iota \subset W_1'$, and on the other hand, locality implies $\mathcal{A}(W_1') \subset \mathcal{A}(W_1)' = \mathcal{A}(W_1' + \iota)'' \subset \mathcal{A}(W_1 + \iota)'$, so using Lemma 3.1 once more one finds $W_1' \subset W_1' + \iota$, arriving at $W_1' + \iota = W_1'$ and $\iota_0 = \iota_1 = 0$, as stated.
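The componentwise step behind $\iota = -j_1\iota$ can be written out as follows. This is only a sketch, assuming (as the stated conclusion $\iota_2 = \dots = \iota_s = 0$ suggests) that $j_1$ is the linear reflection that flips the 0- and 1-components and fixes all others; that explicit form is not spelled out in this section.

```latex
% Assumption: j_1 flips the 0- and 1-components and fixes the rest.
\begin{align*}
j_1\iota &= (-\iota_0,\,-\iota_1,\,\iota_2,\,\dots,\,\iota_s),\\
\iota = -j_1\iota
 &\iff (\iota_0,\iota_1,\iota_2,\dots,\iota_s)
      = (\iota_0,\,\iota_1,\,-\iota_2,\,\dots,\,-\iota_s)\\
 &\iff \iota_2 = \dots = \iota_s = 0 .
\end{align*}
```

Under this assumption the involution property alone leaves $\iota_0$ and $\iota_1$ unconstrained, which is why the proof needs the wedge inclusions to remove them.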

In what follows, a well-known generalization of Asgeirsson's Lemma will be used repeatedly. It is called the double cone theorem of Borchers and Vladimirov [50, 9, 51, 12]. Below, it will be applied together with the edge of the wedge theorem due to Bogoliubov (cf., e.g., [45, 51, 12]). For the reader's convenience, both theorems are recalled here. For $\varepsilon > 0$, $B_\varepsilon$ will denote the open $\varepsilon$-ball centered at the origin of $\mathbb{R}^2$, and $n$ will denote some natural number.

Theorem 3.7 (Edge of the Wedge Theorem). Let $C$ be a nonempty, open and convex cone in $\mathbb{R}^n$. For some $\varepsilon > 0$, assume that $g_+$ is a function analytic in the tube $\mathbb{R}^n + i(C \cap B_\varepsilon)$, and that $g_-$ is a function analytic in the tube $\mathbb{R}^n - i(C \cap B_\varepsilon)$. If there is an open region $\gamma \subset \mathbb{R}^n$ where $g_+$ and $g_-$ have a common boundary value in the sense of distributions, then $g_+$ and $g_-$ are branches of a function $g$ which is analytic in a complex neighbourhood of $\gamma$.

Theorem 3.8. Given the assumptions and notation of Theorem 3.7, let $c$ be any smooth curve in $\gamma$ which has all its tangent vectors in $C$. Then $g$ is analytic in a complex neighbourhood of the double cone $(c + C) \cap (c - C)$.

Another well-known lemma that will be used repeatedly is the following (cf., e.g., part (i) of Lemma 2.4.1 in [39]).

Lemma 3.9. Let $R \subset \mathbb{R}^{1+s}$ be a region that contains an open cone, and let $A \in \mathcal{A}_{\mathrm{loc}}$ be a local observable such that $\langle\Omega, AB\Omega\rangle = \langle\Omega, BA\Omega\rangle$ for all $B \in \mathcal{A}(R)$. Then $A \in \mathcal{A}(R)'$.



Proof of Theorem 2.2 (ii). In what follows, $e_0$ and $e_1$ denote the unit vectors pointing into the 0- and the 1-direction, respectively. For every $t \in \mathbb{R}$, Theorem 2.1 implies the existence of a unique $\xi(t) \in \mathbb{R}^{1+s}$ with
\[ \Delta^{it}_{W_1}\mathcal{A}(O)\Delta^{-it}_{W_1} = \mathcal{A}(\xi(t) + \Lambda_1(-2\pi t)O) \quad \text{for all } O \in \mathcal{K}. \]
By Corollary 3.1 it is clear that $\xi(t) + W_1 = W_1$, so for all $s \in \mathbb{R}$, one has $\Lambda_1(-2\pi s)\xi(t) = \xi(t)$ and
\[
\begin{aligned}
\mathcal{A}(\xi(s + t) + \Lambda_1(-2\pi(t + s))O) &= \Delta^{is}_{W_1}\Delta^{it}_{W_1}\mathcal{A}(O)\Delta^{-it}_{W_1}\Delta^{-is}_{W_1}\\
&= \mathcal{A}(\xi(s) + \Lambda_1(-2\pi s)(\xi(t) + \Lambda_1(-2\pi t)O))\\
&= \mathcal{A}(\xi(s) + \Lambda_1(-2\pi s)\xi(t) + \Lambda_1(-2\pi(t + s))O)\\
&= \mathcal{A}(\xi(s) + \xi(t) + \Lambda_1(-2\pi(t + s))O),
\end{aligned}
\]
so $\xi(s + t) = \xi(s) + \xi(t)$ follows from Lemma 3.1. One now concludes that $\xi(\lambda t) = \lambda\xi(t)$ for $\lambda \in \mathbb{Q}$, so $t \mapsto \xi(t)$ is $\mathbb{Q}$-linear.

Next we prove that the function $\mathbb{R} \ni t \mapsto \xi(t)$ is continuous and, hence, $\mathbb{R}$-linear. As $\xi$ is additive, it is sufficient to prove continuity at $t = 0$. Assume $\xi$ were not continuous there; then there would exist a sequence $(t_\nu)_{\nu\in\mathbb{N}}$ in $\mathbb{R}$ that tends to zero, while $|\xi(t_\nu)| > \varepsilon$ for some $\varepsilon > 0$. Define the double cone $O := (-\tfrac{\varepsilon}{3}e_0 + V_+) \cap (\tfrac{\varepsilon}{3}e_0 - V_+)$. By the above results and locality, there is an $N_\varepsilon \in \mathbb{N}$ such that for any $A, B \in \mathcal{A}(O)$, one has $[\Delta^{it_\nu}_{W_1}A\Delta^{-it_\nu}_{W_1}, B] = 0$ for all $\nu > N_\varepsilon$. But as $\Delta^{it}_{W_1}$ depends strongly continuously on $t$, one concludes that $A$ and $B$ commute, and since $A$ and $B$ are arbitrary elements of $\mathcal{A}(O)$, it follows that $\mathcal{A}(O)$ is abelian. Additivity implies that $\mathcal{A}$ is abelian as well, so $\mathcal{H} = \mathbb{C}$ by irreducibility, which contradicts the assumption that $\mathcal{H}$ is infinite-dimensional. It follows that $\xi$ is continuous and, hence, $\mathbb{R}$-linear, so there is a $\xi \in \mathbb{R}^{1+s}$ with $\xi(t) = \xi t$ for all $t \in \mathbb{R}$.

It remains to be shown that $\xi = 0$. To this end, define the double cone $O := (\rho e_1 + V_+) \cap (\rho e_1 + \rho e_0 - V_+) \subset W_1$ for some $\rho > 0$. If one chooses $\rho$ sufficiently small, there are $a \in \mathbb{R}^{1+s}$ and $\varepsilon, \delta > 0$ such that
(1) $\Lambda_1(-2\pi t)O + t\xi - \delta t e_0 \subset a + V_+$ for all $t \in [0, \varepsilon]$;
(2) $O \subset a + V_+$.
As an example, choose $a := \rho e_1 + \xi - |\xi|e_0$, where $|\xi| :=$

$\sqrt{|\xi^2|}$. Defining
\[ f(t) := (\Lambda_1(-2\pi t)\rho e_1 + t\xi - \delta t e_0 - a)^2, \]
one computes
\[ f'(0) = 2|\xi|(-2\pi\rho + |\xi| - \delta). \]
If one chooses $\rho < \frac{|\xi|}{2\pi}$, one can choose $\delta$ such that $0 < \delta < -2\pi\rho + |\xi|$. With this choice one has $f'(0) > 0$, and as $f$ is smooth and satisfies $f(0) = 0$, there is an $\varepsilon > 0$ such that $f(t) \geq 0$ for all $t \in [0, \varepsilon]$, which immediately implies Condition (1), whereas Condition (2) follows from $f(0) = 0$.
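The value of the derivative at $t = 0$ can be checked by a direct computation. The sketch below rests on conventions that are assumptions here rather than statements of the text: metric signature $(+,-,\dots,-)$, the boost acting as $\Lambda_1(\tau)e_1 = \sinh\tau\, e_0 + \cosh\tau\, e_1$, and $\xi_0 = \xi_1 = 0$ (suggested by $\xi(t) + W_1 = W_1$), so that $\xi\cdot e_0 = 0$ and $\xi^2 = -|\xi|^2$. The auxiliary vector $v(t)$ is introduced only for this illustration.

```latex
% Assumptions: signature (+,-,...,-),
% \Lambda_1(\tau) e_1 = \sinh\tau\, e_0 + \cosh\tau\, e_1, and \xi_0 = \xi_1 = 0.
% v(t) is a helper name used only in this sketch.
\begin{align*}
v(t) &:= \Lambda_1(-2\pi t)\rho e_1 + t\xi - \delta t e_0 - a,
      \qquad f(t) = v(t)^2,\\
v(0) &= \rho e_1 - a = |\xi|\, e_0 - \xi,
      \qquad f(0) = |\xi|^2 + \xi^2 = 0,\\
\dot v(0) &= -(2\pi\rho + \delta)\, e_0 + \xi,\\
f'(0) &= 2\, v(0)\cdot\dot v(0)
       = 2\bigl(-|\xi|(2\pi\rho + \delta) - \xi^2\bigr)
       = 2|\xi|\bigl(-2\pi\rho + |\xi| - \delta\bigr).
\end{align*}
```

With these conventions, $f'(0) > 0$ holds exactly when $\delta < |\xi| - 2\pi\rho$, which is the range of $\delta$ chosen in the proof.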

Fig. 1. The double cone P in the proof of Thm. 2.2 (ii) (labels in the figure: a, b, P, O, and the boosted and shifted copy of O)

As the set $\bigcup_{0\leq t\leq\varepsilon}(\Lambda_1(-2\pi t)O + t\xi)$ is bounded, there is a $b \in \mathbb{R}^{1+s}$ such that

(3) $\Lambda_1(-2\pi t)O + t\xi \subset b - V_+$ for all $t \in [0, \varepsilon]$.

Now denote $P := (a + V_+) \cap (b - V_+)$ (Fig. 1), choose $A \in \mathcal{A}(O)$ and $B \in \mathcal{A}(P')$, and consider the function $g_{A,B}$ defined by
\[ \mathbb{R}^2 \ni (t, s) \mapsto g_{A,B}(t, s) := \langle\Omega, [B, U(se_0)\Delta^{it}_{W_1}A\Delta^{-it}_{W_1}U(-se_0)]\Omega\rangle. \]
By Conditions (1) and (3), this function vanishes in the closure of the open triangle $\gamma$ with corners $(0, 0)$, $(\varepsilon, 0)$ and $(\varepsilon, -\delta\varepsilon)$ (Fig. 2). Clearly, $\gamma$ contains a smooth curve that joins $(0, 0)$ to $(\varepsilon, -\delta\varepsilon)$ and that has tangent vectors in the cone $C := \{(t, s) \in \mathbb{R}^2 : t > 0, s < 0\}$. It will be shown that by the double cone theorem, $g_{A,B}$ vanishes in the whole open rectangle $]0, \varepsilon[\ \times\ ]-\delta\varepsilon, 0[$. Since $g_{A,B}$ is continuous, it follows that it even vanishes in the closed rectangle $[0, \varepsilon] \times [-\delta\varepsilon, 0]$. Since $B \in \mathcal{A}(P')$ and $A \in \mathcal{A}(O)$ are arbitrary, Lemma 3.9 implies that $\mathcal{A}(O - \delta\varepsilon e_0) \subset \mathcal{A}(P')'$. But since by Condition (2), the double cone $O - \delta\varepsilon e_0$ cannot be contained in $P$ no matter how small $\delta\varepsilon$ is, this is in conflict with Lemma 3.1, so it follows that $\xi = 0$, which completes the proof.
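The identification of the double cone spanned by the endpoints of the curve with the open rectangle can be checked directly from the definition of $C$. The following lines only unpack the set notation used above; the primed variables are dummy names introduced for the illustration.

```latex
% C = {(t,s) : t > 0, s < 0}; t', s' are dummy variables.
\begin{align*}
(0,0) + C &= \{(t,s) : t > 0,\ s < 0\},\\
(\varepsilon,-\delta\varepsilon) - C
  &= \{(\varepsilon - t',\, -\delta\varepsilon - s') : t' > 0,\ s' < 0\}
   = \{(t,s) : t < \varepsilon,\ s > -\delta\varepsilon\},\\
\bigl((0,0) + C\bigr) \cap \bigl((\varepsilon,-\delta\varepsilon) - C\bigr)
  &= \,]0,\varepsilon[\ \times\ ]{-\delta\varepsilon}, 0[ .
\end{align*}
```

So the region on which the double cone theorem gives analyticity is exactly the rectangle in which $g_{A,B}$ is claimed to vanish.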


Fig. 2. Where gA,B vanishes in the proof of Thm. 2.2 (ii) (axes t and s, with ε marked on the t-axis and −εδ on the s-axis)

It remains to be shown that the function $g_{A,B}$ fulfills the assumptions of the double cone theorem. To this end, first note that
\[
\begin{aligned}
g_{A,B}(t, s) &= \langle\Omega, BU(se_0)\Delta^{it}A\Omega\rangle - \langle\Omega, A\Delta^{-it}U(-se_0)B\Omega\rangle\\
&= \langle\Omega, BU(se_0)\Delta^{it}A\Omega\rangle - \overline{\langle\Omega, B^*U(se_0)\Delta^{it}A^*\Omega\rangle}\\
&=: g_+(t, s) - g_-(t, s).
\end{aligned}
\]
Using elementary arguments from spectral theory it can be shown that given any $\rho > 0$, any vector $\varphi$ in the domain of $\Delta^\rho$ and any $\psi \in \mathcal{H}$, the function $\mathbb{R} \ni t \mapsto \langle\psi, \Delta^{it}\varphi\rangle$ has an extension to a function that is continuous on the strip $\{t \in \mathbb{C} : -\rho \leq \operatorname{Im} t \leq 0\}$ and analytic on the interior of this strip (cf. [40], Lemma 8.1.10 (p. 351)). As $O \subset W_1$, the vectors $A\Omega$ and $A^*\Omega$ are in the domain of $\Delta^{1/2}$, and it follows that for every $\psi \in \mathcal{H}$, the functions $\mathbb{R} \ni t \mapsto \langle\psi, \Delta^{it}A\Omega\rangle$ and $\mathbb{R} \ni t \mapsto \overline{\langle\psi, \Delta^{it}A^*\Omega\rangle}$ have extensions that are continuous in the strips $\{t \in \mathbb{C} : -\tfrac12 \leq \operatorname{Im} t \leq 0\}$ and $\{t \in \mathbb{C} : 0 \leq \operatorname{Im} t \leq \tfrac12\}$, respectively, and that are analytic in the interior of these strips.

On the other hand, it follows from the spectrum condition that for any two vectors $\varphi, \psi \in \mathcal{H}$, the functions $\mathbb{R} \ni s \mapsto \langle\psi, U(se_0)\varphi\rangle$ and $\mathbb{R} \ni s \mapsto \overline{\langle\psi, U(se_0)\varphi\rangle}$ have extensions that are continuous in the (complex) closed upper and lower half plane, respectively, and analytic in the interior of these half planes.

This proves that the function $g_+$ has a continuous extension to the tube
\[ T_+ := \{(t, s) \in \mathbb{C}^2 : -1/2 \leq \operatorname{Im} t \leq 0,\ \operatorname{Im} s \geq 0\} \]
and that at every interior point of this tube, this extension is analytic separately in $t$ and in $s$. Using Hartogs' fundamental theorem stating that a function of several complex variables is holomorphic if and only if it is holomorphic separately in each of these variables [33, 51], it follows that $g_+$, as a function in two complex variables, is analytic in the interior of $T_+$. It follows in the same way that $g_-$ has the corresponding properties for the tube $-T_+ =: T_-$. The tubes $T_+$ and $T_-$ contain the smaller tubes $\mathbb{R}^2 - i(C \cap B_{1/2})$ and $\mathbb{R}^2 + i(C \cap B_{1/2})$.
92

B. Kuckert

Since $g_+$ and $g_-$ coincide as continuous functions in the closure of $\gamma$, they coincide as distributions in the open region $\gamma$, and it follows from the edge of the wedge theorem that they are branches of a function $g$ that is analytic in a complex neighbourhood of $\gamma$. But since $\gamma$ contains a smooth curve joining the points $(0, 0)$ and $(\varepsilon, -\delta\varepsilon)$ with tangent vectors in $C$, it follows from the double cone theorem that the function $g$ is analytic in the region
\[ ((0, 0) + C) \cap ((\varepsilon, -\delta\varepsilon) - C) = \,]0, \varepsilon[\ \times\ ]-\delta\varepsilon, 0[. \]
This implies that $g_{A,B}$ vanishes in this region, which is all that remained to be shown, so the proof is complete.

Proof of Corollary 2.3. If $J_+$ or $\Delta^{it}_+$ behave the way assumed in (i) or (ii), respectively, the commutation relations recalled in the remark preceding the corollary, together with Lemma 2.1, imply that its geometrical action can differ from the stated symmetry at most by a translation. Since $V_+$ is Lorentz-invariant, $J_+$ and $\Delta^{it}_+$, $t \in \mathbb{R}$, commute with all $U(g)$, $g \in L_+^\uparrow$. However, there are no nontrivial translations that commute with all $g \in L_+^\uparrow$.

Proof of Lemma 2.4. It follows from the Tomita–Takesaki Theorem that the modular group under consideration leaves the algebras $\mathcal{A}(W_1)''$ and $\mathcal{A}(W_1)'$ invariant. By wedge duality, it also leaves the algebra $\mathcal{A}(W_1')'' = \mathcal{A}(-W_1)''$ invariant. Borchers' commutation relations now imply
\[ \Delta^{i\varepsilon}_{W_1}\mathcal{A}(a \pm W_1)''\Delta^{-i\varepsilon}_{W_1} = \mathcal{A}(\Lambda_1(-2\pi\varepsilon)a \pm W_1)''. \]

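$L(A) + W_1$ is a union of translates of $W_1$, so $(L(A) + W_1)^c$, being an intersection of translates of $-\overline{W}_1$, is a translate of $-\overline{W}_1$. It follows that $(L(A) + W_1)^{cc}$ is a translate of $\overline{W}_1$. In particular,
\[ (L(A) + W_1)^{cc} = \bigcap\{a + \overline{W}_1 : a \in \mathbb{R}^{1+s},\ (L(A) + W_1)^{cc} \subset a + \overline{W}_1\}. \]
But if $a \in \mathbb{R}^{1+s}$ is chosen such that $(L(A) + W_1)^{cc} \subset a + \overline{W}_1$, Lemma 3.1 above and wedge duality imply $A \in \mathcal{A}(a + W_1)'' = \mathcal{A}(a + W_1')'$, so one finds
\[ \bigcap\{a + \overline{W}_1 : a \in \mathbb{R}^{1+s},\ A \in \mathcal{A}(a + W_1')'\} \subset (L(A) + W_1)^{cc}, \]
and one concludes
\[
\begin{aligned}
L(A_\varepsilon) &\subset \bigcap\{a + \overline{W}_1 : a \in \mathbb{R}^{1+s},\ \Delta^{i\varepsilon}_{W_1}A\Delta^{-i\varepsilon}_{W_1} \in \mathcal{A}(a + W_1')'\}\\
&= \bigcap\{a + \overline{W}_1 : a \in \mathbb{R}^{1+s},\ A \in \Delta^{-i\varepsilon}_{W_1}\mathcal{A}(a + W_1')'\Delta^{i\varepsilon}_{W_1}\}\\
&= \bigcap\{a + \overline{W}_1 : a \in \mathbb{R}^{1+s},\ A \in \mathcal{A}(\Lambda_1(2\pi\varepsilon)a + W_1')'\}\\
&= \Lambda_1(-2\pi\varepsilon)\bigcap\{a + \overline{W}_1 : a \in \mathbb{R}^{1+s},\ A \in \mathcal{A}(a + W_1')'\}\\
&\subset \Lambda_1(-2\pi\varepsilon)(L(A) + W_1)^{cc}.
\end{aligned}
\]
The proof that $L(A_\varepsilon) \subset \Lambda_1(-2\pi\varepsilon)(L(A) - W_1)^{cc}$ is completely analogous, so the proof of (i) is complete.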
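The way boosts act on translated wedges, which underlies the commutation relations used in this part of the proof, can be verified in coordinates. The sketch assumes the standard descriptions $W_1 = \{x : x_1 > |x_0|\}$ and $\Lambda_1(\tau)$ as the boost in the 0-1 plane; neither is restated in this section, so both are labelled as assumptions.

```latex
% Assumptions: W_1 = \{x : x_1 > |x_0|\};
% \Lambda_1(\tau) is the boost in the 0-1 plane and fixes x_2, ..., x_s.
\begin{align*}
(\Lambda_1(\tau)x)_0 &= \cosh\tau\, x_0 + \sinh\tau\, x_1, &
(\Lambda_1(\tau)x)_1 &= \sinh\tau\, x_0 + \cosh\tau\, x_1,\\
(\Lambda_1(\tau)x)_1 + (\Lambda_1(\tau)x)_0 &= e^{\tau}(x_1 + x_0), &
(\Lambda_1(\tau)x)_1 - (\Lambda_1(\tau)x)_0 &= e^{-\tau}(x_1 - x_0).
\end{align*}
% x \in W_1 iff x_1 + x_0 > 0 and x_1 - x_0 > 0; both signs are preserved, hence
\[
\Lambda_1(\tau)W_1 = W_1
\qquad\text{and}\qquad
\Lambda_1(\tau)(a \pm W_1) = \Lambda_1(\tau)a \pm W_1 .
\]
```

This is why only the apex $a$ of the wedge is moved by the boost in the relations above, while the wedge shape itself is unchanged.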



It remains to prove (ii) and (iii). We prove (iii); (ii) can be established along precisely the same line of argument by replacing itW1 by −it W1 and by exchanging, respectively, V+ and −V+ , A and Aε with one another. Due to Borchers’ commutation relations it suffices to consider A ∈ A(W1 ) , which, as in the proof of Theorem 2.2 (ii), will ensure that A ∈ D(1/2 ) in the following argument. Assume that L(A) ⊂ L(Aε ) + V + . Then one finds an a ∈ R1+s such that (1) L(Aε ) ⊂ a + V+ , while (2) L(A) ⊂ a + V+ . This can be seen as follows. The assumption that L(A) ⊂ L(Aε ) + V + and Statement (i) just proved imply that there is a double cone O ⊂ L(A) such that O and L(Aε ) are spacelike separated, so there is a double cone P ⊃ L(Aε ) such that O and P are spacelike separated (cf., e.g., Prop. 3.8 (b) in [47]); choosing a to be the lower tip of P , one arrives at both Conditions (1) and Condition (2). By Condition (1), L(Aε ) is a compact subset of the open set a + V+ , and as L(At ) depends continuously on t by assumption, there exist σ 7 > 0 and δ > 0 such that (1’) L(At ) − σ 7 e0 ⊂ a + V+

for all t ∈ [ε − δ, ε],

and this condition is, of course, equivalent to Condition (1). Since L(At ) depends continuously on t ∈ [0, ε], the set 0≤t≤ε L(At ) is bounded, so one finds a σ 8 ≥ 0 such that (3) L(At ) + σ 8 e0 ⊂ a + V+ for all t ∈ [0, ε], and for the same reason there is a b ∈ R1+s such that (4) L(At ) + 2σ 8 e0 ⊂ b − V+ for all t ∈ [0, ε]. Now define P := (a + V+ ) ∩ (b − V+ ), and for any B ∈ A(P ), consider – as in the proof of Proposition 2.2 – the function gA,B defined by R2 (t, s) → gA,B (t, s) := , [B, U (se0 )At U (−se0 )] . Locality and Conditions (3) and (4) imply that this function vanishes in the rectangle [0, ε] × [σ 8 , 2σ 8 ], and Condition (1’) implies that it also vanishes in the rectangle [ε −δ, ε]×[−σ 7 , σ 8 ]. By the double cone theorem, gA,B vanishes throughout the whole rectangle [0, ε] × [−σ 7 , 2σ 8 ] (Fig. 3). In particular, one obtains gA,B (0, −σ 7 ) = 0 for all B ∈ A(P ), so one can use Lemma 3.9 to conclude that A ∈ A(σ 7 e0 + P ) . By the definition of L(A), one finds L(A) − σ 7 e0 ⊂ P ⊂ a + V + , and as σ 7 > 0, this implies L(A) ⊂ a + V+ , which is in conflict with Condition (2) above and completes the proof. Proof of Theorem 2.5. Fix any ρ > 0, and define the double cones O1 := (ρ(2e1 + e0 ) + V+ ) ∩ (ρ(2e1 + 2e0 ) − V+ ), O2 := (ρ(2e1 − 2e0 ) + V+ ) ∩ (ρ(2e1 + 2e0 ) − V+ ), and O3 := (ρ(2e1 − 3e0 ) + V+ ) ∩ (ρ(2e1 + 3e0 ) − V+ ), (Fig. 4) and choose A ∈ A(O1 ). As L(A) ⊂ O1 , it follows from Lemma 2.4 (i) and (ii)

Fig. 3. Where gA,B vanishes in the proof of Lemma 2.4 (axes t and s)

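that
\[ L(A_t) \subset \bigl(\Lambda_1(-2\pi t)\rho(\tfrac32 e_1 + \tfrac32 e_0) + \overline{W}_1\bigr) \cap \bigl(\Lambda_1(-2\pi t)\rho(\tfrac52 e_1 + \tfrac32 e_0) - \overline{W}_1\bigr) \cap \bigl(\rho(2e_1 + 2e_0) - \overline{V}_+\bigr) =: R_t, \]
and there is an $\varepsilon > 0$ such that
\[ R_t \subset O_2 \quad \text{for all } t \in [0, \varepsilon]. \]
Note that by the linearity of the Lorentz boosts, $\varepsilon$ does not depend on $\rho$. One now has $L(A_t) \subset O_2$ for all $A \in \mathcal{A}(O_1)$, and with Corollary 5.4 in [39], it follows that
\[ \Delta^{it}_{W_1}\mathcal{A}(O_1)\Delta^{-it}_{W_1} \subset \mathcal{A}(O_3')' \quad \text{for all } t \in [0, \varepsilon]. \]
Using Borchers' commutation relations, one finds
\[ \Delta^{it}_{W_1}\mathcal{A}(a + O_1)\Delta^{-it}_{W_1} \subset \mathcal{A}(\Lambda_1(-2\pi t)a + O_3')' \]
for all $a \in \mathbb{R}^{1+s}$ and all $t \in [0, \varepsilon]$. Defining $x := \rho(2e_1 + e_0)$, $P_1 := O_1 - x$, and $P_3 := O_3 - x$, one obtains
\[ \Delta^{it}_{W_1}\mathcal{A}(a + P_1)\Delta^{-it}_{W_1} \subset \mathcal{A}(\Lambda_1(-2\pi t)a + (x - \Lambda_1(-2\pi t)x) + P_3')'. \]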

Two Uniqueness Results on Unruh Effect and on PCT-Symmetry

Fig. 4. The double cones O1, O2, and O3 in the proof of Thm. 2.5 (axes x0 and x1, with 2ρ marked on both; the wedge W1 is drawn with dashed lines)

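Note that the euclidean length of the vector $x - \Lambda_1(-2\pi t)x$ is $\leq 3\rho$ for all $t \in [0, \varepsilon]$, as $\Lambda_1(-2\pi t)x \in R_t \subset O_2$ by the above choice of $\varepsilon$. Now choose any wedge $W \in \mathcal{W}$. As $W \subset W + P_1$, it follows from wedge additivity that
\[ \mathcal{A}(W)'' \subset \Bigl(\bigcup_{a\in W}\mathcal{A}(a + P_1)\Bigr)''. \]
Define, for $\delta > 0$, the wedges $W^{(\delta)} := B_\delta(W)''$, where $B_\delta(W)$ denotes the euclidean $\delta$-ball around $W$, and $W^{(-\delta)} := ((W')^{(\delta)})'$; then it follows from isotony and wedge duality that
\[ \bigcup_{a\in W}\mathcal{A}(a + P_3')' \subset \mathcal{A}(W^{(4\rho)})'', \]
and as the euclidean length of the vector $(\Lambda_1(-2\pi t)x - x)$ is $\leq 3\rho$, one arrives at
\[ \bigcup_{a\in W}\mathcal{A}(a + (x - \Lambda_1(-2\pi t)x) + P_3')' \subset \mathcal{A}(W^{(7\rho)})''. \]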


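For $t \in [0, \varepsilon]$, one now obtains
\[
\begin{aligned}
\Delta^{it}_{W_1}\mathcal{A}(\Lambda_1(2\pi t)W)''\Delta^{-it}_{W_1}
&\subset \Bigl(\bigcup_{a\in\Lambda_1(2\pi t)W}\Delta^{it}_{W_1}\mathcal{A}(a + P_1)\Delta^{-it}_{W_1}\Bigr)''\\
&\subset \Bigl(\bigcup_{a\in\Lambda_1(2\pi t)W}\mathcal{A}(\Lambda_1(-2\pi t)a + (x - \Lambda_1(-2\pi t)x) + P_3')'\Bigr)''\\
&\subset \mathcal{A}(W^{(7\rho)})'',
\end{aligned}
\]
and as $W = (W^{(-7\rho)})^{(7\rho)}$, this can be rewritten
\[ \Delta^{-it}_{W_1}\mathcal{A}(W)''\Delta^{it}_{W_1} \supset \mathcal{A}(\Lambda_1(2\pi t)W^{(-7\rho)})''. \]
Using the fact that the transformations $\Lambda_1(2\pi t)$ are linear and, hence, bounded maps in $\mathbb{R}^{1+s}$, which map the euclidean $7\rho$-ball onto some bounded set with radius proportional to $\rho$, and using the facts that this radius continuously depends on $t \in [0, \varepsilon]$, that the interval $[0, \varepsilon]$ is compact, and that $\varepsilon$ does not depend on the choice of $\rho$, one concludes that there is an $M > 0$ which is independent of $\rho$ and satisfies
\[ \Lambda_1(2\pi t)W^{(-7\rho)} \supset (\Lambda_1(2\pi t)W)^{(-M\rho)} \quad \text{for all } t \in [0, \varepsilon], \]
so with the above specifications of $\varepsilon$ and $M$, one obtains
\[ \Delta^{-it}_{W_1}\mathcal{A}(W)''\Delta^{it}_{W_1} \supset \mathcal{A}((\Lambda_1(2\pi t)W)^{(-M\rho)})'' \]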

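for all wedges $W \in \mathcal{W}$ and all $\rho > 0$. For each $A \in \mathcal{A}_{\mathrm{loc}}$, one now concludes, using wedge duality in the form $\mathcal{A}(W')' = \mathcal{A}(W)''$,
\[
\begin{aligned}
L(A_t) &= \bigcap\{\overline{W} : W \in \mathcal{W},\ \Delta^{it}_{W_1}A\Delta^{-it}_{W_1} \in \mathcal{A}(W)''\}\\
&= \bigcap\{\overline{W} : W \in \mathcal{W},\ A \in \Delta^{-it}_{W_1}\mathcal{A}(W)''\Delta^{it}_{W_1}\}\\
&\subset \bigcap_{\rho>0}\bigcap\{\overline{W} : W \in \mathcal{W},\ A \in \mathcal{A}((\Lambda_1(2\pi t)W)^{(-M\rho)})''\}\\
&= \bigcap_{\rho>0}\bigcap\{\Lambda_1(-2\pi t)\overline{X} : X \in \mathcal{W},\ A \in \mathcal{A}(X^{(-M\rho)})''\}\\
&= \Lambda_1(-2\pi t)\bigcap_{\rho>0}\bigcap\{\overline{X} : X \in \mathcal{W},\ A \in \mathcal{A}(X^{(-M\rho)})''\}\\
&= \Lambda_1(-2\pi t)\bigcap_{\rho>0}\bigcap\{\overline{X^{(M\rho)}} : X \in \mathcal{W},\ A \in \mathcal{A}(X)''\}\\
&= \Lambda_1(-2\pi t)L(A).
\end{aligned}
\]
To prove the converse inclusion, one proves $L(A_t) \subset \Lambda_1(-2\pi t)L(A)$ for $t \in [-\varepsilon, 0]$ by mimicking the above argument: one defines the double cone $O_1 := (\rho(2e_1 - 2e_0) + V_+) \cap (\rho(2e_1 - e_0) - V_+)$, keeps $O_2$ and $O_3$ as before, defines $x := \rho(2e_1 - e_0)$ and proceeds as above with $t \in [-\varepsilon, 0]$, using Lemma 2.4 (iii) instead of Part (ii) of the same lemma. Now having proved $L(A_t) \subset \Lambda_1(-2\pi t)L(A)$ for all $t \in [-\varepsilon, \varepsilon]$ and for all $A \in \mathcal{A}_{\mathrm{loc}}$, one concludes $L(A_t) = \Lambda_1(-2\pi t)L(A)$ for all $t \in [-\varepsilon, \varepsilon]$ and for all $A \in \mathcal{A}_{\mathrm{loc}}$. As this immediately implies the statement for all $t \in \mathbb{R}$ and all $A \in \mathcal{A}_{\mathrm{loc}}$, the proof is complete.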
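The last two steps of the proof are stated without detail. A possible way to spell them out is sketched below; it uses only the invariance of the algebra of local observables under the modular group (an assumption of Theorem 2.5) and the group law of the boosts, and writes $(A_t)_s$ for the observable $A_t$ evolved by a further parameter $s$.

```latex
% Equality on [-\varepsilon, \varepsilon]: apply the inclusion to A_t with parameter -t.
\begin{align*}
L(A) = L\bigl((A_t)_{-t}\bigr) \subset \Lambda_1(2\pi t)\,L(A_t)
 \quad&\Longrightarrow\quad
 \Lambda_1(-2\pi t)\,L(A) \subset L(A_t).
\intertext{Extension to all real parameters: for $s, t \in [-\varepsilon, \varepsilon]$,
since $A_t$ is again a local observable,}
L(A_{s+t}) = L\bigl((A_t)_s\bigr)
 &= \Lambda_1(-2\pi s)\,L(A_t)
  = \Lambda_1(-2\pi s)\Lambda_1(-2\pi t)\,L(A)
  = \Lambda_1(-2\pi(s+t))\,L(A).
\end{align*}
```

Iterating the second line $n$ times covers every parameter of modulus at most $n\varepsilon$, and hence every real $t$.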



4. Conclusion By the above results, the modular group of a theory that does not exhibit the Unruh effect acts in a completely “non-geometric” fashion, in the sense that it can neither preserve the net structure nor act on the local observables in such a way that localization regions evolve continuously. In particular, it cannot implement any equilibrium dynamics in this case. The above results imply that the only observer who can possibly experience the vacuum in thermodynamical equilibrium is the uniformly accelerated one (whose acceleration may, of course, be zero). Physically, this result reflects the fact that any nonuniformly accelerated observer would feel nonstationary inertial forces destroying any thermodynamical equilibrium, while the constant acceleration felt by a uniformly accelerated observer does not affect thermodynamical equilibrium provided the theory exhibits the Unruh effect. The first results similar to the above ones have been obtained by Araki and by Keyl [4, 35]. These authors avoid the spectrum condition and assume stronger a priori restrictions on the possible geometric behaviour instead. Recently, more results in this spirit have been found by Buchholz et al. and by Trebels [21, 27, 29, 48]. One aim of these approaches is to obtain new insight on quantum fields on curved spacetimes by avoiding the spectrum condition. So far, results have been obtained for de Sitter, Anti-de Sitter, and certain Robertson–Walker spacetimes [21, 22, 24]. For the vacuum states in Minkowski space considered above, the spectrum condition is a reasonable physical assumption. The assumptions made above on the possible geometric behaviour of the modular objects (in particular those made in the first uniqueness theorem) are less restrictive than those made in any of the other approaches, since a small class of regions, namely, the double cones, is assumed to be mapped into an extremely large class of regions, namely, the open sets. 
In this sense the above results are, at present, the most general uniqueness results in Minkowski space that point towards the Unruh effect and modular P1 CT-symmetry. Even more than a uniqueness result can be found if conformal symmetry holds in addition to our above Conditions (A) through (C). In this case, the whole representation of the conformal group arises from the modular objects of the theory, and in particular, the Bisognano–Wichmann symmetries can be established [16]. Appendix. A Remark on the Continuity of t → L(At ) In the discussion of the second uniqueness theorem it was assumed that L(At ) depends continuously on t for t ∈ [0, ε] in the sense that for each sequence (tν )ν∈N tending to a t∞ ∈ [0, ε], the localization region L(At∞ ) consists precisely of all accumulation points of sequences (xν )ν∈N with xν ∈ L(Atν ). In this appendix we show that this notion of convergence, which we refer to as pointwise convergence, is equivalent to the convergence according to a metric first considered by Hausdorff, which one can introduce on the set C of compact convex subsets of R1+s by defining, for any two such sets K, L ∈ C, δH (K, L) := inf{δ > 0 : K ⊂ Bδ (L) and L ⊂ Bδ (K)} (cf. Problem 4D (p. 131) in [34]). It is evident that continuity of [0, ε] t → L(At ) with respect to this metric, which we refer to as uniform continuity, implies the pointwise


continuity for this map. Conversely, one can also show that pointwise continuity implies uniform continuity for $t \mapsto L(A_t)$. To prove this indirectly, assume that $t \mapsto L(A_t)$ is pointwise continuous for $t \in [0, \varepsilon]$ and that this map is not continuous with respect to Hausdorff's metric. Then there exists a $\rho > 0$ and a sequence $(t_\nu)_{\nu\in\mathbb{N}}$ of points in $[0, \varepsilon]$ which converges to a point $t_\infty \in [0, \varepsilon]$ and has the property that $\delta_H(L(A_{t_\nu}), L(A_{t_\infty})) \geq \rho$. On the other hand, there is a subsequence $(s_\nu)_{\nu\in\mathbb{N}}$ of $(t_\nu)_{\nu\in\mathbb{N}}$ with the property that all $L(A_{s_\nu})$ have nonempty intersection with $B_\rho(L(A_{t_\infty}))$, as otherwise $L(A_{t_\infty})$ would be empty by the assumption of pointwise continuity. As $\delta_H(L(A_{s_\nu}), L(A_{t_\infty})) \geq \rho$, there exists a sequence $(x_\nu)_{\nu\in\mathbb{N}}$ such that the euclidean distance $\delta(x_\nu, L(A_{t_\infty}))$ between $x_\nu$ and $L(A_{t_\infty})$ is $\geq \rho/2$ for all $\nu \in \mathbb{N}$, and as all $L(A_{s_\nu})$ are convex sets with a nonempty intersection with $B_\rho(L(A_{t_\infty}))$, this sequence can be chosen such that it is bounded and, hence, has an accumulation point $\tilde{x}$. As $\delta(x_\nu, L(A_{t_\infty})) \geq \rho/2$ for all $\nu \in \mathbb{N}$, one finds $\delta(\tilde{x}, L(A_{t_\infty})) \geq \rho/2$, so $\tilde{x} \notin L(A_{t_\infty})$. But this contradicts the assumption that $t \mapsto L(A_t)$ is pointwise continuous and proves that this map is pointwise continuous if and only if it is uniformly continuous, as stated.

It is now easy to see that $\bigcup_{t\in[0,\varepsilon]} L(A_t)$ is bounded, as stated in the text. Namely, the function $[0, \varepsilon] \ni t \mapsto \delta_H(L(A), L(A_t))$ is continuous and, hence, has a maximum $\rho > 0$ in the compact interval $[0, \varepsilon]$. It follows that $\bigcup_{t\in[0,\varepsilon]} L(A_t) \subset B_\rho(L(A))$, which is a bounded set.

Acknowledgements. It was an important help that D. Arlt and N. P. Landsman read the manuscript carefully. This research was funded by the Deutsche Forschungsgemeinschaft, a Feodor–Lynen grant of the Alexander von Humboldt foundation, and a Hendrik Casimir–Karl Ziegler award of the Nordrhein-Westfälische Akademie der Wissenschaften.
The idea to reinitiate the project originated during a stay in 1997 at the Erwin-Schrödinger Institute for Mathematical Physics at Vienna. Helpful discussions there with S. Trebels and D. Guido are gratefully acknowledged.

References 1. Alexandrov, A. D.: On Lorentz transformations. Uspekhi Mat. Nauk. 5 No. 3 (37), 187 (1950) 2. Alexandrov, A. D.: Mappings of Spaces with Families of Cones and Space-Time Transformations. Annali di matematica 103, 229–257 (1975) 3. Alexandrov, A. D., Ovchinnikova, V. V.: Notes on the foundations of relativity theory. Vestnik Leningrad Univ. 14, 95 (1953) 4. Araki, H.: Symmetries in a Theory of Local Observables and the Choice of the Net of Local Algebras. Rev. Math. Phys. Special Issue, 1–14 (1992) 5. Araki, H.: Mathematical Theory of Quantum Fields. Oxford: Oxford University Press, 1999 6. Baumgärtel, H., Wollenberg, M.: Causal Nets of Operator Algebras. Berlin: Akademie-Verlag, 1992 7. Bisognano, J. J., Wichmann, E. H.: On the Duality Condition for a Hermitian Scalar Field. J. Math. Phys. 16, 985–1007 (1975) 8. Bisognano, J. J., Wichmann, E. H.: On the Duality Condition for Quantum Fields. J. Math. Phys. 17, 303 (1976) 9. Borchers, H.-J.: Über die Vollständigkeit lorentzinvarianter Felder in einer zeitartigen Röhre. Nuovo Cimento 19, 787–796 (1961) 10. Borchers, H.-J.: On the Vacuum State in Quantum Field Theory, II. Commun. Math. Phys. 1, 57 (1965) 11. Borchers, H.-J.: The CPT-Theorem in Two-Dimensional Theories of Local Observables. Commun. Math. Phys. 143, 315–332 (1992) 12. Borchers, H.-J.: Translation Group and Particle Representations in Quantum Field Theory. Berlin– Heidelberg: Springer, 1996 13. Borchers, H.-J.: On Poincaré transformations and the modular group of the algebra associated with a wedge. Lett. Math. Phys. 46, 295–301 (1998)



14. Borchers, H.-J.: On the Revolutionization of Quantum Field Theory by Tomita’s Modular Theory. J. Math. Phys. 41, 3604–3673 (2000) 15. Borchers, H.-J., Hegerfeldt, G. C.: The Structure of Space-Time Transformations. Commun. Math. Phys. 28, 259–266 (1972) 16. Brunetti, R., Guido, D., Longo, R.: Modular Structure and Duality in Conformal Quantum Field Theory. Commun. Math. Phys. 156, 201–219 (1993) 17. Buchholz, D.: Collision Theory for Massless Fermions. Commun. Math. Phys. 42, 269–279 (1975) 18. Buchholz, D.: Collision Theory for Waves in Two Dimensions and a Characterization of Models with Trivial S-Matrix. Commun. Math. Phys. 45, 1–8 (1975) 19. Buchholz, D.: Collision Theory of Massless Bosons. Commun. Math. Phys. 52, 147–173 (1977) 20. Buchholz, D.: On the Structure of Local Quantum Fields with Non-Trivial Interaction. In: Proceedings of the International Conference on Operator Algebras, Ideals and Their Applications in Theoretical Physics, Leipzig, 1977. Stuttgart: Teubner, 1978 21. Buchholz, D., Dreyer, O., Florig, M., Summers, S. J.: Geometric Modular Action and spacetime Symmetry Groups. Rev. Math. Phys. 12, 475–560 (2000) 22. Buchholz, D. Florig, M., Summers, S. J.: Hawking–Unruh Temperature and Einstein Causality in Anti-de Sitter Space-Time. Class. Quant. Grav. 17, L31–L37 (2000) 23. Buchholz, D., Fredenhagen, K.: Dilations and interaction. J. Math. Phys. 18, 1107–1111 (1977) 24. Buchholz, D., Mund, J., Summers, S. J.: Transplantation of Local Nets and Geometric Modular Action on Robertson–Walker Space-Times. Preprint, hep-th/0011237 25. Buchholz, D., Summers, S. J.: An Algebraic Characterization of Vacuum States in Minkowski Space. Commun. Math. Phys. 155, 449–458 (1993) 26. Davidson, D. R.: Modular Covariance and the Algebraic PCT/Spin-Statistics Theorem. Preprint, hep-th/9511216 27. Dreyer, O.: Das Prinzip der geometrischen modularen Wirkung im de Sitter-Raum. diploma thesis, University of Hamburg, 1996 28. Florig, M.: On Borchers’ Theorem. 
Lett Math. Phys. 46, 289–293 (1998) 29. Florig, M.: Geometric Modular Action. PhD-thesis, University of Florida, Gainesville, 1999 30. Guido, D., Longo, R.: An Algebraic Spin and Statistics Theorem. Commun. Math. Phys. 172, 517–534 (1995) 31. Guido, D., Longo, R.: The Conformal Spin and Statistics Theorem. Commun. Math. Phys. 181, 11–36 (1996) 32. Haag, R.: Local Quantum Physics. Berlin: Springer, 1992 33. Hartogs, F.: Zur Theorie der Funktionen mehrerer komplexer Veränderlicher, insbesondere über die Darstellung derselben durch Reihen, welche nach Potenzen einer Veränderlichen fortschreiten. Math. Ann. 62, 1–88 (1906) 34. Kelley, J. L.: General Topology. New York: van Nostrand, 1955 35. Keyl, M.: Remarks on the relation between causality and quantum fields. Class. Quantum Grav. 10, 2353–2362 (1993) 36. Kuckert, B.: A New Approach to Spin & Statistics. Lett. Math. Phys. 35, 319–335 (1995) 37. Kuckert, B.: Borchers’ Commutation Relations and Modular Symmetries in Quantum Field Theory. Lett. Math. Phys. 41, 307–320 (1997) 38. Kuckert, B.: Spin & Statistics, Localization Regions, and Modular Symmetries in Quantum Field Theory. PhD-thesis, Hamburg 1998, DESY-thesis 1998-026 39. Kuckert, B.: Localization Regions of Local Observables. Commun. Math. Phys. 215, 197–216 (2000) 40. Li Bing-Ren: Introduction to Operator Algebras. Singapore: World Scientific, 1992 41. Longo, R.: On the spin-statistics relation for topological charges. In: Doplicher, S., Longo, R., Roberts, J. E., Zsido, L. (eds.): Operator Algebras and Quantum Field Theory. Proceedings of the conference at the Accedemia Nazionale dei Lincei, Rome 1996. Cambridge, MA: International Press, 1997 42. Mack, G., Salam, A.: Finite-Component Field Representations of the Conformal Group. Ann. Phys. 53, 174–202 (1969) 43. Mund, J.: Quantum Field Theory of Particles with Braid Group Statistics in 2+1 dimensions. PhD-thesis, Freie Universität Berlin, 1998 44. 
Reeh, H., Schlieder, S.: Bemerkungen zur Unitäräquivalenz von lorentzinvarianten Feldern. Nuovo Cimento 22, 1051 (1961) 45. Streater, R. F., Wightman, A. S.: PCT, Spin & Statistics, and All That. New York: Benjamin, 1964 46. Takesaki, M.: Tomita’s Theory of Modular Hilbert Algebras and Its Applications. Lecture Notes in Mathematics 128, New York: Springer, 1970 47. Thomas, L. J., Wichmann, E. H.: Standard forms of local nets in quantum field theory. J. Math. Phys. 39, 2643–2681 (1998) 48. Trebels, S.: PhD-thesis. Göttingen 1997, cf. also [14] 49. Unruh, W. G.: Notes on black hole evaporation. Phys. Rev. D 14, 870–892 (1976)



50. Vladimirov, V. S.: The construction of envelopes of holomorphy for domains of a special type. (in Russian) Doklady Akad. Nauk SSSR 134, 251–254 (1960) 51. Vladimirov, V. S.: Methods of the Theory of Functions of Many Complex Variables. Cambridge, MA: M. I. T. Press, 1966 52. Wiesbrock, H.-W.: A Comment on a Recent Work of Borchers. Lett. Math. Phys. 25, 157–159 (1992) 53. Yngvason, J.: A Note on Essential Duality. Lett. Math. Phys. 31, 127–141 (1994) 54. Zeeman, E. C.: Causality Implies the Lorentz Group. J. Math. Phys. 5, 490–493 (1964) Communicated by H. Araki

Commun. Math. Phys. 221, 101 – 140 (2001)

Communications in



Renormalization Group and the Melnikov Problem for PDE’s

Jean Bricmont¹, Antti Kupiainen², Alain Schenkel²

¹ UCL, FYMA, 2 chemin du Cyclotron, 1348 Louvain-la-Neuve, Belgium
² Department of Mathematics, Helsinki University, P.O. Box 4, 00014 Helsinki, Finland

Received: 29 January 2001 / Accepted: 8 March 2001

Abstract: We give a new proof of persistence of quasi-periodic, low dimensional elliptic tori in infinite dimensional systems. The proof is based on a renormalization group iteration that was developed recently in [BGK] to address the standard KAM problem, namely, persistence of invariant tori of maximal dimension in finite dimensional, near integrable systems. Our result covers situations in which the so called normal frequencies are multiple. In particular, it provides a new proof of the existence of small-amplitude, quasi-periodic solutions of nonlinear wave equations with periodic boundary conditions.

Partially supported by ESF/PRODYN.
Partially supported by EC grant FMRX-CT98-0175.

1. Introduction

In this paper, we address the persistence problem of quasi-periodic, low dimensional, elliptic tori in infinite dimensional systems. A typical example that we will consider is the nonlinear wave equation (NLW) on a bounded interval,

$$\partial_t^2 u = \partial_x^2 u - V u - f(u), \tag{1.1}$$

with Dirichlet or periodic boundary conditions and $f(u) = O(u^3)$. The first results concerning the Melnikov problem (i.e., the persistence of elliptic invariant tori of dimension lower than the number of degrees of freedom, [M, E]) for infinite dimensional Hamiltonian systems were obtained independently by Kuksin, Pöschel and Wayne, [K2, P1, W]. In particular, existence of quasi-periodic solutions of (1.1) was shown in [K1, W]. Based on the Kolmogorov–Arnold–Moser (KAM) approach, these results were restricted to Dirichlet or Neumann boundary conditions and to specific classes of adjustable potentials $V$, excluding, in particular, arbitrary constant potentials. This latter case was covered in [BK] by using the sine-Gordon PDE as the unperturbed integrable system, and, following a different approach, in [P2]. In [P2], the existence of a Birkhoff normal form for the Hamiltonian of (1.1) is exploited in order to control the torus frequencies via amplitude-frequency modulation, and therefore to dispense with outer parameters provided by an adjustable potential. This approach was applied in [KP] to the persistence of quasi-periodic solutions for the nonlinear Schrödinger equation (NLS) subject to Dirichlet (or Neumann) boundary conditions.

The case of periodic boundary conditions is more delicate due to the fact that the eigenvalues of the Sturm-Liouville operator $L = -d^2/dx^2 + V$ are degenerate. This leads to resonances between pairs of frequencies corresponding to motion in directions normal to the torus (the so called normal frequencies). These additional resonances prevent one from controlling quadratic terms in the Hamiltonian of the system. (This difficulty also appears in finite-dimensional Melnikov situations.) Developing new techniques based on the Lyapunov-Schmidt method, Craig and Wayne proved in [CW] persistence of periodic solutions of the NLW with periodic boundary conditions. Later, their approach was significantly improved by Bourgain in [B1-2] who constructed quasi-periodic solutions of the NLW and NLS with periodic boundary conditions. Most notably, it is shown in [B2] that solutions of this type can be constructed, in particular, for the NLS on two-dimensional domains.

The usual Melnikov nonresonance condition reads, with $\omega \in \mathbb{R}^d$ and $\mu \in \mathbb{R}^n$ denoting the torus and, respectively, the normal frequencies ($n$ is possibly infinite),

$$\langle k, \omega\rangle + \langle l, \mu\rangle \neq 0, \qquad k \in \mathbb{Z}^d,\ l \in \mathbb{Z}^n \ \text{with}\ |k| + |l| \neq 0,\ |l| \le 2. \tag{1.2}$$

In Bourgain’s approach and at the price of a considerable technical effort, condition (1.2) is reduced to

$$\langle k, \omega\rangle + \mu_s \neq 0, \qquad k \in \mathbb{Z}^d,\ s = 1, \dots, n,$$

i.e., all nonresonance conditions on pairs of normal frequencies are absent. More recently, Chierchia and You, see [Y, CY], showed that persistence of quasi-periodic solutions of the NLW with periodic boundary conditions is tractable by KAM techniques. Their nonresonance condition,

$$\langle k, \omega\rangle + \langle l, \mu\rangle \neq 0, \qquad k \in \mathbb{Z}^d \setminus \{0\},\ l \in \mathbb{Z}^n \ \text{with}\ |l| \le 2, \tag{1.3}$$
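The difference between conditions (1.2) and (1.3) can be illustrated numerically on a finite truncation. The following is a minimal sketch and not part of the argument of the paper: the function name `smallest_divisor`, the cutoff `max_k`, and the sample frequencies are assumptions made only for illustration. It computes the smallest value of $|\langle k,\omega\rangle + \langle l,\mu\rangle|$ over a finite box of $k$ and all $l$ with $|l| \le 2$, either allowing $k = 0$ with $l \neq 0$, as in (1.2), or excluding $k = 0$, as in (1.3).

```python
import itertools
import math


def smallest_divisor(omega, mu, max_k, allow_k_zero):
    """Smallest |<k,omega> + <l,mu>| over |k_i| <= max_k and |l|_1 <= 2.

    allow_k_zero=True mimics condition (1.2): only k = l = 0 is excluded.
    allow_k_zero=False mimics condition (1.3): every k = 0 is excluded.
    This is a finite truncation, chosen here for illustration only.
    """
    d, n = len(omega), len(mu)
    # All integer vectors l with l1-norm at most 2.
    ls = [l for l in itertools.product(range(-2, 3), repeat=n)
          if sum(abs(li) for li in l) <= 2]
    best = math.inf
    for k in itertools.product(range(-max_k, max_k + 1), repeat=d):
        k_is_zero = not any(k)
        for l in ls:
            if k_is_zero and (not allow_k_zero or not any(l)):
                continue
            value = abs(sum(ki * wi for ki, wi in zip(k, omega))
                        + sum(li * mi for li, mi in zip(l, mu)))
            best = min(best, value)
    return best


# Hypothetical sample data: two torus frequencies and a double normal frequency.
OMEGA = (math.sqrt(2.0), math.sqrt(3.0))
MU_DEGENERATE = (1.5, 1.5)

divisor_12 = smallest_divisor(OMEGA, MU_DEGENERATE, max_k=3, allow_k_zero=True)
divisor_13 = smallest_divisor(OMEGA, MU_DEGENERATE, max_k=3, allow_k_zero=False)
print(divisor_12, divisor_13)
```

With a double normal frequency, the choice $k = 0$, $l = (1, -1)$ gives a vanishing divisor, so (1.2) cannot hold, whereas (1.3) imposes no condition at $k = 0$ and the truncated minimum stays positive for these sample values.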

is stronger than Bourgain’s condition but weaker than (1.2). However, their result does not cover the case of constant potential. In the present paper, we give a new proof of Bourgain’s result for the NLW with periodic boundary conditions in the case of constant potential. Our proof is based on a renormalization group procedure recently developed in [BGK] for standard KAM problems. The nonresonance condition that we will impose is the same as Chierchia and You’s condition, but our technique could in principle accommodate Bourgain’s condition.

In order to describe our result further, we start by specifying the infinite dimensional Hamiltonians we will consider. For $d_k$, $k \ge 1$, a sequence of strictly positive integers uniformly bounded by some $\bar d < \infty$, let $\mathbb{R}^\infty$ denote the set of infinite sequences $x = (x_1, x_2, \dots)$ with $x_k \in \mathbb{R}^{d_k}$. For an integer $d \ge 1$, let $P = \mathbb{T}^d \times \mathbb{R}^d \times \mathbb{R}^\infty \times \mathbb{R}^\infty$, where $\mathbb{T}^d$ is the torus $\mathbb{R}^d/(2\pi\mathbb{Z}^d)$. Denoting the coordinates in $P$ by $(\phi, I, x, y)$ and endowing $P$ with the symplectic structure $d\phi \wedge dI + dx \wedge dy$, we consider perturbations of integrable Hamiltonians of the form

$$H(\phi, I, x, y) = \omega \cdot I + \tfrac12 I \cdot gI + \tfrac12 \sum_{k \ge 1} \big(\mu_k^2 |x_k|^2 + |y_k|^2\big) + \lambda U(\phi, I, x), \tag{1.4}$$

where $\mu_k \in \mathbb{R}$, $k \ge 1$, $\omega \in \mathbb{R}^d$, and $g$ is a real symmetric, invertible $d \times d$ matrix. Above, $|v|^2$ for $v \in \mathbb{R}^m$ denotes $\sum_{i=1}^m v_i^2$. The Hamiltonian flow generated by (1.4) is given by the equations of motion

$$\dot I = -\lambda \partial_\phi U, \qquad \dot\phi = \omega + gI + \lambda \partial_I U, \tag{1.5}$$

and

$$\ddot x_k = -\mu_k^2 x_k - \lambda \partial_{x_k} U. \tag{1.6}$$

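For $\lambda = 0$ and the initial condition $I^0 = \phi^0 = x^0 = y^0 = 0$, the flow $\phi(t) = \omega t$, $I(t) = 0$, and $x(t) = 0$, is quasi-periodic and spans a $d$-dimensional torus in $\mathbb{T}^d \times \mathbb{R}^d \times \mathbb{R}^\infty \times \mathbb{R}^\infty$. In order to study the case for which the perturbation is turned on, we consider a quasi-periodic solution of the form $(\phi(t), I(t), x(t)) = (\omega t + \Theta(\omega t), J(\omega t), Z(\omega t))$. Then, (1.5) and (1.6) require that $T \equiv (\Theta, J, Z) : \mathbb{T}^d \to \mathbb{R}^d \times \mathbb{R}^d \times \mathbb{R}^\infty$ satisfies the equation

$$\mathcal{D} T(\varphi) = -\lambda\, \partial U(\varphi + \Theta(\varphi), J(\varphi), Z(\varphi)), \tag{1.7}$$

where $\partial = (\partial_\phi, \partial_I, \partial_x)$ and, setting $\mu \equiv \mathrm{diag}(\mu_1 1_{d_1}, \mu_2 1_{d_2}, \dots)$, together with

$$D \equiv \omega \cdot \partial_\phi, \tag{1.8}$$

$$\mathcal{D} = \begin{pmatrix} 0 & D & 0 \\ -D & g & 0 \\ 0 & 0 & D^2 + \mu^2 \end{pmatrix}. \tag{1.9}$$

Note that if $T$ is a solution of Eq. (1.7), then so is $T_\beta$ for $\beta \in \mathbb{R}^d$, where

$$T_\beta(\varphi) = T(\varphi - \beta) - (\beta, 0, 0). \tag{1.10}$$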

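We now state the two hypotheses under which we shall prove existence of a solution $T$ of Eq. (1.7), first introducing the following family of Banach spaces $\mathbb{R}^\infty_s$, $s \in \mathbb{R}$,

$$\mathbb{R}^\infty_s = \Big\{ Z \in \mathbb{R}^\infty \ \Big|\ |Z|_s \equiv \sum_{k \ge 1} k^s |Z_k|_{\mathbb{R}^{d_k}} < \infty \Big\}. \tag{1.11}$$

(H1) Asymptotics of eigenvalues. The sequence $\{\mu_k\}_{k \ge 1}$ satisfies $\mu_k > 0$ and $\mu_k \neq \mu_l$ for all $k \neq l \ge 1$, and there exist $\gamma \ge 1$ and $c > 0$ such that

$$\mu_k \ge c k^\gamma \quad \text{for all } k \ge 1. \tag{1.12}$$

Furthermore, if $\gamma > 1$ then

$$\mu_{k'} - \mu_k \ge c\,(k'^{\gamma} - k^\gamma) \quad \text{for all } k' > k \ge 1. \tag{1.13}$$

If $\gamma = 1$, then there exist constants $\xi > 0$ and $c_l > 0$ such that

$$\mu_{k'} - \mu_k = c_l\,(1 + O(k^{-\xi})) \quad \text{for all } k' - k = l \ge 1. \tag{1.14}$$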
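As a plausibility check of (1.12) and (1.14) in the case $\gamma = 1$, one can look at the sequence $\mu_k = \sqrt{k^2 + m}$ that appears for the wave equation in Sect. 2. The sketch below is only an illustration under that assumption; the helper names `mu` and `relative_gap_error` and the value `M = 1.0` are hypothetical choices. It evaluates the deviation of $(\mu_{k+l} - \mu_k)/l$ from $c_l/l = 1$ and shows that it decays with $k$.

```python
import math

M = 1.0  # hypothetical mass parameter m > 0


def mu(k, m=M):
    """Sample normal frequency mu_k = sqrt(k^2 + m) (assumed for illustration)."""
    return math.sqrt(k * k + m)


def relative_gap_error(k, l, m=M):
    """Return |(mu_{k+l} - mu_k) / l - 1|, the deviation from c_l = l in (1.14)."""
    return abs((mu(k + l, m) - mu(k, m)) / l - 1.0)


for k in (10, 100, 1000):
    # The deviation multiplied by k still tends to zero, consistent with xi = 1.
    print(k, relative_gap_error(k, 1), k * relative_gap_error(k, 1))
```

For this sample sequence the lower bound (1.12) holds with $c = 1$, since $\sqrt{k^2 + m} \ge k$.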



(H2) Regularity of the perturbation. The map $(\phi, I, x) \mapsto U(\phi, I, x)$ is assumed to be real analytic in $\phi \in \mathbb{T}^d$ and real analytic in $I$ and $x$ in a neighborhood of the origin of $\mathbb{R}^d$ and $\mathbb{R}^\infty_0$. In addition, we assume that there exist an $s > 0$ and a $\xi > 0$ such that for some neighborhoods of the origin $O_I \subset \mathbb{R}^d$ and $O_x \subset \mathbb{R}^\infty_s$, the gradient $\partial_x U$ is bounded as a map from $\mathbb{T}^d \times O_I \times O_x$ to $\mathbb{R}^\infty_{s+\xi-\gamma}$. In the sequel, we will often use the short notation $s' \equiv s + \xi - \gamma$.

Theorem 1.1. Let $\{\mu_k\}$ satisfy (H1) and $U$ satisfy (H2). Then, there exists a set $\Omega^* = \Omega^*(U, \mu) \subset \mathbb{R}^d$ such that for $\omega \in \Omega^*$, Eq. (1.7) has a unique solution (up to translations (1.10)) which is real analytic in $\lambda$ and $\phi$ provided that $|\lambda|$ is small enough. Furthermore, for all bounded $\Omega \subset \mathbb{R}^d$ the set $\Omega^*$ of admissible frequencies satisfies $\mathrm{meas}(\Omega \setminus \Omega^*) \to 0$ as $\lambda \to 0$.

The proof of Theorem 1.1 is based on an inductive procedure developed in [BGK] for standard KAM problems. This renormalization group iteration can be viewed as an iterative resummation of the Lindstedt series, as is explained in more detail in [BGK], and was directly inspired by the quantum field theory analogy with KAM problems forcefully emphasized by Gallavotti et al. [G, GGM]. Melnikov type problems require dealing with the additional resonances arising from the normal frequencies $\mu_k$, and the goal of the present paper is to explain how the procedure of [BGK] can be applied in such cases. In contrast to standard KAM problems, the set $\Omega^*$ of admissible frequencies depends for Melnikov type problems on the perturbation $U$. In our approach, this dependence expresses itself by the fact that under iteration, the normal frequencies are renormalized in a $U$-dependent way and that the set $\Omega^*$ is defined according to the renormalized normal frequencies. As usual, the set $\Omega^*$ is constructed in such a way that nonresonance conditions are fulfilled in order for the inductive scheme to converge.

Our scheme is technically simplified if one imposes the nonresonance condition of the form (1.3), i.e., conditions involving pairs of normal frequencies. Hypothesis (H1) ensures that $\Omega^*$ has large measure under these conditions, and hypothesis (H2) ensures that the asymptotic properties of the normal frequencies stated in (H1) are preserved under renormalization. The requirement $\xi > 0$ is needed both in (H1) when $\gamma = 1$, and, for $\gamma > 1$, in (H2) in order to cover the case of degenerate normal frequencies (more precisely the case where $d_k > 1$ for infinitely many $k$). In Sect. 2, we show how Theorem 1.1 provides a proof of the existence of quasi-periodic solutions of the 1D NLW with periodic boundary conditions. In particular, $\gamma = 1$ in (H1) and we will see that (H2) is satisfied with $\xi = 1$. In contrast, one has for the 1D NLS $\gamma = 2$ and $\xi = 0$. Thus, the scheme presented here only applies to NLS with Dirichlet boundary conditions (namely $d_k = 1$ for all $k$) or to the persistence of periodic solutions of NLS (namely $d = 1$). In order to cover the other situations, one must be able to dispense with nonresonance conditions involving certain pairs of normal frequencies.

The remainder of the paper is organized as follows. Section 2 is devoted to the NLW. In Sect. 3 we explain the renormalization group scheme that will be used to prove Theorem 1.1. Section 4 is devoted to the definition of the spaces we will consider. In Sect. 5, we state some crucial inductive bounds, which will be shown to hold in Sect. 6. Section 7 is concerned with the measure estimate of $\Omega^*$, whereas the proof of Theorem 1.1 is carried out in Sect. 8. Finally, we have collected in the appendix some technical and intermediary results.



2. The 1D Wave Equation

In this section, we show how Theorem 1.1 implies the existence of small amplitude quasi-periodic solutions of nonlinear 1D wave equations of the form

$$\partial_t^2 u = \partial_x^2 u - m u - f(u), \qquad t > 0,\ x \in [0, 2\pi], \tag{2.1}$$

with periodic boundary conditions $u(0, t) = u(2\pi, t)$, $\partial_t u(0, t) = \partial_t u(2\pi, t)$. Here, $m > 0$ is a real parameter and $f$ is a real analytic function of the form $f(u) = u^3 + O(u^4)$. For $f \equiv 0$, Eq. (2.1) becomes

$$\partial_t^2 u = \partial_x^2 u - m u \equiv -Lu. \tag{2.2}$$

The operator $L$ with periodic boundary conditions admits a complete orthonormal basis of eigenfunctions $\psi_n \in L^2([0, 2\pi])$, $n \in \mathbb{Z}$, with corresponding eigenvalues

$$\zeta_n = n^2 + m, \tag{2.3}$$

if one sets $\psi_0 = 1/\sqrt{2\pi}$ and for $n \ge 1$,

$$\psi_n(x) = \frac{1}{\sqrt{\pi}} \cos(nx), \qquad \psi_{-n}(x) = \frac{1}{\sqrt{\pi}} \sin(nx). \tag{2.4}$$
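The orthonormality of the basis (2.4) can be checked by a simple quadrature. The following is a minimal sketch for illustration only; the function names `psi` and `inner` and the number of sample points are assumptions. It uses the equispaced rectangle rule on $[0, 2\pi]$, which integrates trigonometric polynomials of degree below the number of samples without discretization error, so the results agree with the exact integrals up to rounding.

```python
import math


def psi(n, x):
    """Basis function psi_n of (2.4): constant for n = 0, cosine for n > 0, sine for n < 0."""
    if n == 0:
        return 1.0 / math.sqrt(2.0 * math.pi)
    if n > 0:
        return math.cos(n * x) / math.sqrt(math.pi)
    return math.sin(-n * x) / math.sqrt(math.pi)


def inner(a, b, samples=256):
    """Approximate the L^2([0, 2*pi]) scalar product (psi_a, psi_b) by the rectangle rule."""
    h = 2.0 * math.pi / samples
    return h * sum(psi(a, j * h) * psi(b, j * h) for j in range(samples))


for a, b in [(0, 0), (2, 2), (-2, -2), (2, -2), (1, 3)]:
    print(a, b, round(inner(a, b), 12))
```

The pair $(2, -2)$ illustrates the double eigenvalue $\zeta_2 = \zeta_{-2}$: the two eigenfunctions are orthogonal but share the same frequency.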

Every solution of the linear wave Eq. (2.2) can be written as a superposition of the basic modes $\psi_n$, namely, for $\mathcal{I}$ any subset of $\mathbb{Z}$ and $\mu_n \equiv \sqrt{\zeta_n}$,

$$u(x, t) = \sum_{n \in \mathcal{I}} a_n \cos(\mu_n t + \theta_n)\, \psi_n(x), \tag{2.5}$$

with amplitudes $a_n > 0$ and initial phases $\theta_n$. Regarding existence of solutions for the nonlinear wave equation (2.1), we will prove

Theorem 2.1. Let $1 \le d < \infty$ and $\mathcal{I} = \{n_1, \dots, n_d\} \subset \mathbb{Z}$ satisfying $|n_i| \neq |n_j|$ for $i \neq j$. Then, for $\lambda > 0$ small enough there is a set $A \subset \{a = (a_1, \dots, a_d) \mid 0 < a_i < \lambda\}$ of positive measure such that for $a \in A$ Eq. (2.1) has a solution

$$u(x, t) = \sum_{i=1}^{d} a_i \cos(\mu'_{n_i} t + \theta_i)\, \psi_{n_i}(x) + O(|a|^3), \tag{2.6}$$

with frequencies $\mu'_{n_i} = \mu_{n_i} + O(|a|^2)$. Furthermore, the set $A$ is of asymptotically full measure as $|a| \to 0$.

As is well known, the nonlinear wave Eq. (2.1) can be studied as an infinite dimensional Hamiltonian system by taking the phase space to be the product of the Sobolev spaces $H^1_0([0, 2\pi]) \times L^2([0, 2\pi])$ with coordinates $u$ and $v = \partial_t u$. The Hamiltonian for (2.1) is then

$$H = \tfrac12 (v, v) + \tfrac12 (Lu, u) + \int_0^{2\pi} g(u)\, dx, \tag{2.7}$$

where $L = -d^2/dx^2 + m$, $g = \int f\, ds$, and $(\cdot, \cdot)$ denotes the usual scalar product in $L^2([0, 2\pi])$. In order to prove existence of solutions of type (2.6) by means of Theorem 1.1, we would like to write (2.7) in the form (1.4). This turns out to be possible,



through amplitude-frequency modulation, due to the availability of a (partial) normal form theory for (2.7). As we shall see, the requirement for the parameter $m$ to be non zero is crucial for this part of the argument. In the sequel, we will closely follow the exposition of Pöschel in [P2]. Introducing the coordinates $q = (q_0, q_1, q_{-1}, \dots)$ and $p = (p_0, p_1, p_{-1}, \dots)$ by setting

$$u(x) = \sum_{n \in \mathbb{Z}} q_n \psi_n(x), \qquad v(x) = \sum_{n \in \mathbb{Z}} p_n \psi_n(x), \tag{2.8}$$

one rewrites the Hamiltonian (2.7) in the coordinates $(q, p)$,

$$H = \frac12 \sum_{n \in \mathbb{Z}} \big(\mu_n^2 q_n^2 + p_n^2\big) + G(q), \tag{2.9}$$

where

$$G(q) = \int_0^{2\pi} g\Big(\sum_{n \in \mathbb{Z}} q_n \psi_n(x)\Big)\, dx. \tag{2.10}$$

The Hamiltonian flow generated by (2.9) is given by the equations of motion

$$\ddot q_n = -\mu_n^2 q_n - \partial_{q_n} G(q), \tag{2.11}$$

and one can show that a solution $q$ of (2.11) yields a solution of the nonlinear wave Eq. (2.1) if $q$ has some decay properties. More precisely, defining $l^s_b$ to be the Banach space of all real valued bi-infinite sequences $w = (w_0, w_1, w_{-1}, \dots)$ with norm

$$\|w\|_s = \sum_{n \in \mathbb{Z}} [n]^s |w_n|,$$

where $[n] = \max(1, |n|)$, one has the

Lemma 2.2. Let $s \ge 2$. If a curve $I \to l^s_b$, $t \mapsto q(t)$, is a solution of (2.11), then

$$u(x, t) = \sum_{n \in \mathbb{Z}} q_n(t)\, \psi_n(x)$$

is a classical solution of (2.1).

For the proof of Lemma 2.2, see [CY]. Before turning to the normal form analysis of the Hamiltonian (2.9), we state a result concerning the regularity of the gradient $\partial_q G$.

Lemma 2.3. For all $s > 0$, the gradient $\partial_q G$ is real analytic as a map from some neighborhood of the origin in $l^s_b$ into $l^s_b$, with

$$\|\partial_q G(q)\|_s = O(\|q\|_s^3). \tag{2.12}$$



Proof. We first note that $l^s_b$ is a Banach algebra with respect to convolution of sequences, with

$$\|q * p\|_s \le \sum_{i,j \in \mathbb{Z}} [i]^s |q_{j-i}|\,|p_j| \le \sup_{i,j \in \mathbb{Z}} \Big(\frac{[i]}{[j-i][j]}\Big)^{s} \|q\|_s \|p\|_s \le 2^s \|q\|_s \|p\|_s. \tag{2.13}$$

Therefore, using the analyticity of $f(u) = u^3 + O(u^4)$, one computes that in a sufficiently small neighborhood of the origin,

$$\|f(u)\|_s \le C \|q\|_s^3. \tag{2.14}$$
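The Banach algebra bound (2.13) is easy to test on finitely supported sequences. The sketch below is an illustration under stated assumptions: sequences are stored as dictionaries from index to value, convolution is taken in the standard form $(q * p)_i = \sum_j q_{i-j} p_j$, and the names `bracket`, `weighted_norm`, `convolve` as well as the sample sequences are hypothetical.

```python
def bracket(n):
    """The weight [n] = max(1, |n|) used in the norm of l_b^s."""
    return max(1, abs(n))


def weighted_norm(w, s):
    """Norm ||w||_s = sum_n [n]^s |w_n| of a finitely supported sequence {n: w_n}."""
    return sum(bracket(n) ** s * abs(value) for n, value in w.items())


def convolve(q, p):
    """Convolution (q * p)_i = sum_j q_{i-j} p_j of finitely supported sequences."""
    out = {}
    for i, qi in q.items():
        for j, pj in p.items():
            out[i + j] = out.get(i + j, 0.0) + qi * pj
    return out


# Hypothetical sample sequences with decaying entries.
q = {n: 1.0 / (1 + n * n) for n in range(-5, 6)}
p = {n: (-1.0) ** n / (1 + abs(n)) for n in range(-4, 5)}

for s in (0.5, 1.0, 2.0):
    lhs = weighted_norm(convolve(q, p), s)
    rhs = 2.0 ** s * weighted_norm(q, s) * weighted_norm(p, s)
    print(s, lhs, rhs, lhs <= rhs)
```

The inequality rests on $[i + j] \le 2\,[i]\,[j]$, which is the origin of the factor $2^s$.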

On the other hand, since

$$\partial_{q_n} G(q) = \int_0^{2\pi} f(u)\, \psi_n(x)\, dx,$$

the components of $\partial_q G(q)$ are the Fourier components of $f(u)$ and (2.12) follows from the estimate (2.14). The regularity of $\partial_q G$ follows from the regularity of its components and its local boundedness, cf. [PT], p. 138.

We now turn to the normal form analysis of (2.9). First, since $g(u) = \tfrac14 u^4 + O(u^5)$, we find that

$$G(q) = \frac14 \sum_{i,j,k,l} g_{ijkl}\, q_i q_j q_k q_l + O(|q|^5),$$

where

$$g_{ijkl} = \int_0^{2\pi} \psi_i \psi_j \psi_k \psi_l\, dx. \tag{2.15}$$
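The coefficients (2.15) can be evaluated numerically to get a feeling for their structure. The following sketch is for illustration only; the names `psi` and `g_coeff` and the sample indices are assumptions. It uses the equispaced rectangle rule, which has no discretization error for the trigonometric polynomials involved, and checks a vanishing coefficient, a nonvanishing one with $i + j + k - l = 0$, and the values of $g_{iijj}$ for indices $i, j \ge 1$ with $i \neq j$ or $i = j$.

```python
import math


def psi(n, x):
    """Basis function psi_n of (2.4)."""
    if n == 0:
        return 1.0 / math.sqrt(2.0 * math.pi)
    if n > 0:
        return math.cos(n * x) / math.sqrt(math.pi)
    return math.sin(-n * x) / math.sqrt(math.pi)


def g_coeff(i, j, k, l, samples=512):
    """Rectangle-rule approximation of the integral of psi_i psi_j psi_k psi_l over [0, 2*pi]."""
    h = 2.0 * math.pi / samples
    total = 0.0
    for step in range(samples):
        x = step * h
        total += psi(i, x) * psi(j, x) * psi(k, x) * psi(l, x)
    return h * total


print(g_coeff(1, 2, 4, 8))   # no sign combination of 1, 2, 4, 8 sums to zero
print(g_coeff(1, 2, 3, 6))   # 1 + 2 + 3 - 6 = 0
print(g_coeff(1, 1, 2, 2), 2.0 / (4.0 * math.pi))
print(g_coeff(3, 3, 3, 3), 3.0 / (4.0 * math.pi))
```

For positive indices with distinct absolute values, the last two lines reproduce $g_{iijj} = (2 + \delta_{ij})/4\pi$.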

An easy computation shows that $g_{ijkl} = 0$ unless $i \pm j \pm k \pm l = 0$ for at least one combination of plus and minus signs. This will play an important role later on. Next, given a finite subset of indices $\mathcal{I}_d = \{n_1, \dots, n_d\} \subset \mathbb{Z}$ with $|n_i| \neq |n_j|$ if $i \neq j$, we decompose the Hamiltonian (2.9) as $H = H_d + H_\infty$, where

$$H_d(q, p) = \frac12 \sum_{n \in \mathcal{I}_d} (\mu_n^2 q_n^2 + p_n^2) + \frac14 \sum_{i,j,k,l \in \mathcal{I}_d} g_{ijkl}\, q_i q_j q_k q_l \equiv \Lambda_d(q, p) + G_d(q), \tag{2.16}$$

$$H_\infty(q, p) = \frac12 \sum_{n \notin \mathcal{I}_d} (\mu_n^2 q_n^2 + p_n^2) + G(q) - G_d(q) \equiv \Lambda_\infty(q, p) + G_\infty(q). \tag{2.17}$$



Introducing the complex coordinates $z_j$, $j = 1, \dots, d$, by

$$z_j = \frac{1}{\sqrt{2\mu_{n_j}}}\,(\mu_{n_j} q_{n_j} + i\, p_{n_j}),$$

one obtains the Hamiltonian $H_d(z, \bar z) = \sum_j \mu_{n_j} |z_j|^2 + G_d(z, \bar z)$ on $\mathbb{C}^d$ with symplectic structure $i \sum_j dz_j \wedge d\bar z_j$. For the remaining coordinates, one introduces the notation, for $k \ge 1$,

$$x_k = \begin{cases} (q_k, q_{-k}) \in \mathbb{R}^2 & \text{if } k, -k \notin \mathcal{I}_d, \\ q_{-\tilde k} \in \mathbb{R} & \text{if } k = |\tilde k| \text{ for some } \tilde k \in \mathcal{I}_d, \end{cases}$$

and similarly for $p_n$, $n \notin \mathcal{I}_d$, denoted in terms of $y_k \in \mathbb{R}^{d_k}$, $k \ge 1$, with $d_k$ as above, namely, $d_k = 2$ if both $k, -k \notin \mathcal{I}_d$ and $d_k = 1$ otherwise. Clearly, for $q, p \in l^s_b$ one has $x, y \in \mathbb{R}^\infty_s$, where $\mathbb{R}^\infty_s$ is defined in (1.11), and $H_\infty$ reads in these notations

$$H_\infty(z, \bar z, x, y) = \frac12 \sum_{k \ge 1} (\mu_k^2 |x_k|^2 + |y_k|^2) + G_\infty(z, \bar z, x),$$

with $|G_\infty| = O\big(\sum_{l=0}^{3} |z|^l \|x\|_s^{4-l} + |z|^5 + \|x\|_s^5\big)$. The next proposition establishes the existence of a symplectic change of coordinates that transforms the Hamiltonian $H_d$ into a Birkhoff normal form. As will be clear from the proof, this normal form is not available for $H = H_d + H_\infty$, since most frequencies in $H_\infty$ are degenerate. This is the main difference with [P2] in the present discussion.

Proposition 2.4. For each $m > 0$ and each subset $\mathcal{I}_d$, $d < \infty$, satisfying $|n_i| \neq |n_j|$ when $i \neq j$, there exists a near identity, real analytic, symplectic change of coordinates $\Gamma_d$ in some neighborhood of the origin in $\mathbb{C}^d$ that takes the Hamiltonian (2.16) into

$$H_d \circ \Gamma_d = \Lambda_d + \bar G_d + K_d,$$

where $|K_d| = O(|z|^6)$ and

$$\bar G_d(z, \bar z) = \frac12 \sum_{i,j=1}^{d} \bar g_{ij}\, |z_i|^2 |z_j|^2 \quad \text{with} \quad \bar g_{ij} = \frac{3}{8\pi}\, \frac{4 - \delta_{ij}}{\mu_{n_i} \mu_{n_j}}. \tag{2.18}$$

Furthermore, setting $\Gamma_\infty = \Gamma_d \oplus 1_{\mathbb{R}^\infty_s \times \mathbb{R}^\infty_s}$, one has $H_\infty \circ \Gamma_\infty = \Lambda_\infty + K_\infty$ with $|K_\infty| = O\big(\sum_{l=0}^{3} |z|^l \|x\|_s^{4-l} + |z|^5 + \|x\|_s^5\big)$.
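The matrix $\bar g$ of (2.18) and the associated amplitude-frequency map $\omega_i = \mu_{n_i} + \sum_j \bar g_{ij} a_j^2$ can be evaluated directly. The sketch below is an illustration under stated assumptions: the index set `INDICES`, the value `M = 1.0`, and the helper names are hypothetical, and the closed-form determinant used for comparison follows from writing $\bar g$ as a diagonal scaling of the matrix with entries $4 - \delta_{ij}$, whose eigenvalues are $4d - 1$ and $-1$.

```python
import math

M = 1.0                 # hypothetical mass parameter m > 0
INDICES = (1, 2, 3)     # hypothetical index set with distinct absolute values


def mu(n, m=M):
    """Linear frequency mu_n = sqrt(n^2 + m)."""
    return math.sqrt(n * n + m)


def g_bar(indices, m=M):
    """Matrix with entries (3 / (8 pi)) (4 - delta_ij) / (mu_{n_i} mu_{n_j}), as in (2.18)."""
    d = len(indices)
    return [[3.0 / (8.0 * math.pi) * (4 - (1 if i == j else 0))
             / (mu(indices[i], m) * mu(indices[j], m))
             for j in range(d)] for i in range(d)]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n, det = len(a), 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return det


def torus_frequencies(indices, amplitudes, m=M):
    """Frequencies omega_i = mu_{n_i} + sum_j g_bar_ij a_j^2."""
    gb = g_bar(indices, m)
    return [mu(n, m) + sum(gb[i][j] * amplitudes[j] ** 2 for j in range(len(indices)))
            for i, n in enumerate(indices)]


d = len(INDICES)
det_numeric = determinant(g_bar(INDICES))
det_closed = ((3.0 / (8.0 * math.pi)) ** d * (4 * d - 1) * (-1) ** (d - 1)
              / math.prod(mu(n) ** 2 for n in INDICES))
print(det_numeric, det_closed)
print(torus_frequencies(INDICES, (0.1, 0.1, 0.1)))
```

A nonzero determinant means that the map from squared amplitudes to frequency shifts is invertible, which is what allows the amplitudes to play the role of parameters.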

Proof. Modulo straightforward modifications, the proof is carried out in [P2] and we restrict ourselves here to a quick overview. Proceeding as in [P2] and using that $|n| \neq |n'|$ for $n \neq n' \in \mathcal{I}_d$, one can show that for integers $i, j, k, l \in \mathcal{I}_d$ satisfying $i \pm j \pm k \pm l = 0$ and $\{i, j, k, l\} \neq \{n, n, n', n'\}$, one has for all combinations of plus and minus signs,

$$|\mu_i \pm \mu_j \pm \mu_k \pm \mu_l| \ge \frac{c\, m}{(N^2 + m)^{3/2}} > 0, \tag{2.19}$$

with $c$ some absolute constant and $N = \min\{|i|, \dots, |l|\}$. This allows one to eliminate all terms in $G_d(z, \bar z)$ that are not of the form $|z_i|^2 |z_j|^2$. To see this, it is convenient to adopt the notation $z_j = w_j$ and $\bar z_j = w_{-j}$ in which $G_d$ reads

$$G_d = \frac{1}{16} \sum_{i,j,k,l}{}^{\!\prime}\ \tilde g_{ijkl}\, w_i w_j w_k w_l, \qquad \tilde g_{ijkl} = \frac{g_{n_{|i|} \dots n_{|l|}}}{\sqrt{\mu_{n_{|i|}} \cdots \mu_{n_{|l|}}}},$$

where the prime indicates that the sum runs over all indices $i, j, k, l \in \{1, -1, \dots, d, -d\}$ with $n_{|i|} \pm n_{|j|} \pm n_{|k|} \pm n_{|l|} = 0$ for at least one combination of plus and minus signs. Defining the transformation $\Gamma_d$ as the time-1 map of the flow of the vector field $X_F$ given by a Hamiltonian $F(z, \bar z)$ of order four, namely, $\Gamma_d = X_F^t|_{t=1}$ and $F = \sum'_{i,j,k,l} F_{ijkl}\, w_i w_j w_k w_l$, one obtains using Taylor’s formula $H_d \circ \Gamma_d = \Lambda_d + G_d + \{\Lambda_d, F\} + O(|z|^6)$ with

$$\{\Lambda_d, F\} = -i \sum_{i,j,k,l}{}^{\!\prime}\ (\hat\mu_i + \hat\mu_j + \hat\mu_k + \hat\mu_l)\, F_{ijkl}\, w_i w_j w_k w_l,$$

where $\hat\mu_i \equiv \mathrm{sign}(i)\, \mu_{n_{|i|}}$. With (2.19), one easily checks that if $\{i, j, k, l\} \neq \{a, -a, b, -b\}$ then $|\hat\mu_i + \hat\mu_j + \hat\mu_k + \hat\mu_l| > 0$. Therefore, choosing $F_{ijkl}$ suitably, one finally obtains, using $g_{iijj} = (2 + \delta_{ij})/4\pi$ and counting multiplicities,

$$G_d + \{\Lambda_d, F\} = \frac{3}{16\pi} \sum_{i,j=1}^{d} \frac{4 - \delta_{ij}}{\mu_{n_i} \mu_{n_j}}\, |z_i|^2 |z_j|^2 \equiv \bar G_d.$$

For the rest of the proof, we refer the reader to [P2].

The Hamiltonian $\Lambda_d + \bar G_d$ is integrable with integrals $|z_i|^2$, $i = 1, \dots, d$. Furthermore, the matrix $\bar g = (\bar g_{ij})_{i,j}$ is non degenerate, as can be checked from the explicit formula (2.18). Hence, introducing the standard action-angle variables $(I, \phi) \in \mathbb{R}^d \times \mathbb{T}^d$ and linearizing $H$ around a given value for the action, namely, by setting for some $a = (a_1, \dots, a_d) \in \mathbb{R}^d$, $z_i \bar z_i = I_i + a_i^2$, one finally obtains

$$H_a = \omega \cdot I + \tfrac12 I \cdot \bar g I + \sum_{k \ge 1} (\mu_k^2 x_k^2 + y_k^2) + U_a(I, \phi, x), \tag{2.20}$$

where $U_a$ is just $K_d + K_\infty$ with the variables $z_i, \bar z_i$, $i = 1, \dots, d$, expressed in terms of $I, \phi$, and where $\omega = (\omega_1, \dots, \omega_d)$ is given by

$$\omega_i = \mu_{n_i} + \sum_{j=1}^{d} \bar g_{ij}\, a_j^2,$$

and covers a cone at $(\mu_{n_1}, \dots, \mu_{n_d})$ as $a$ varies in a neighborhood of the origin of $\mathbb{R}^d$. Furthermore, $U_a$ is real analytic in $\phi \in \mathbb{T}^d$ and real analytic in $I$ in a sufficiently small neighborhood $O_I$ of the origin of $\mathbb{R}^d$. As a function of $x$, $U_a$ is real analytic in a neighborhood $O_x \subset \mathbb{R}^\infty_s$ and by Lemma 2.3, its gradient $\partial_x U_a$ is bounded as a map from $\mathbb{T}^d \times O_I \times O_x$ to $\mathbb{R}^\infty_s$. Therefore, since hypothesis (H1) is satisfied with $\gamma = 1$,



$U_a$ satisfies (H2) with $\xi = 1$. Finally, the small parameter $\lambda$ is given in terms of $|a| = \delta$. In Hamilton’s equations for $H_a$, rescaling $a$ by $\delta$, $x$ and $y$ by $\delta^2$, and $I$ by $\delta^4$, one obtains a Hamiltonian system given by the rescaled Hamiltonian

$$\tilde H_a(\phi, I, x, y) = \delta^{-4} H_{\delta a}(\phi, \delta^4 I, \delta^2 x, \delta^2 y) = \omega \cdot I + \frac{\delta^4}{2}\, I \cdot \bar g I + \sum_{k \ge 1} (\mu_k^2 x_k^2 + y_k^2) + \tilde U_a(I, \phi, x),$$

with $\tilde U_a$ analytic in $\delta$ and, as a function of $I$, $\tilde U_a = O(\delta + \delta^3 |I| + \delta^5 |I|^2)$. Hence, Theorem 1.1 implies the existence of quasi-periodic solutions $I$, $x$ and $y$ of period $\omega$, real analytic in $\phi$ and $\lambda$. Tracing the coordinate transformations back to the original variables $q_n(t)$ in the expression (2.8) for $u(x, t)$ completes the proof of Theorem 2.1 with $u(x, t)$ given by (2.6).

3. The Renormalization Group Scheme

Equation (1.7) consists in a system of equations for the variables $(\Theta, J)$ and $Z$ which are coupled through the perturbation $U$ only. Adopting the notation

$$V(\Theta, J, Z)(\varphi) = \lambda \begin{pmatrix} \partial_\phi U \\ \partial_I U \end{pmatrix} (\varphi + \Theta(\varphi), J(\varphi), Z(\varphi)), \tag{3.1}$$

$$W(\Theta, J, Z)(\varphi) = \lambda\, \partial_x U(\varphi + \Theta(\varphi), J(\varphi), Z(\varphi)), \tag{3.2}$$

one rewrites Eq. (1.7) as

$$\begin{pmatrix} 0 & D \\ -D & g \end{pmatrix} \begin{pmatrix} \Theta \\ J \end{pmatrix} = -V(\Theta, J, Z), \tag{3.3}$$

$$(D^2 + \mu^2) Z = -W(\Theta, J, Z). \tag{3.4}$$

Our strategy will be to consider (3.3) and (3.4) separately, treating the functions $Z$ and $(\Theta, J)$, respectively, as parameters. As we will see in Sect. 8, existence of a (unique) solution of the original equation (1.7) can then be proved by using the implicit function theorem. Note that (3.3) involves only the torus frequencies $\omega$ and is equivalent to a standard KAM problem. Existence of a solution for such equations is well known and has been established by various means. One important feature we will use is the regular dependence of the solution $(\Theta, J)$ on the function $Z$. A precise result about the solution of (3.3) will be stated in Sect. 4, Theorem 4.1, once the required Banach spaces of functions have been introduced.

We now focus our attention on Eq. (3.4), and will suppress from the notation the dependence of the vector field $W$ on the parameters $\Theta$ and $J$. Most of our analysis will be conducted in Fourier space, and we will denote by lower case letters the Fourier transforms of functions of $\varphi$, the latter being denoted by capital letters, namely,

$$F(\varphi) = \sum_{q \in \mathbb{Z}^d} e^{-i q \cdot \varphi} f(q), \quad \text{where} \quad f(q) = \int_{\mathbb{T}^d} e^{i q \cdot \varphi} F(\varphi)\, d\varphi,$$

where $d\varphi$ stands for the normalized Lebesgue measure on $\mathbb{T}^d$. For $Z(\varphi) \in \mathbb{R}^\infty$, note that $z(q) \in \hat{\mathbb{R}}^\infty$ with $\overline{z_{k_i}(q)} = z_{k_i}(-q)$, where $\hat{\mathbb{R}}^\infty$ stands for $\prod_{k \ge 1} \mathbb{C}^{d_k}$ and $k_i$ refers to the $i$-th component of $\mathbb{C}^{d_k}$. Similarly, $\hat{\mathbb{R}}^\infty_s$ will denote the complexification of the Banach space $\mathbb{R}^\infty_s$ defined in (1.11). Finally, we will denote the vector space of functions $\mathbb{Z}^d \to \hat{\mathbb{R}}^\infty$ by $h$,

$$h = \{ z = (z(q)) \mid z(q) \in \hat{\mathbb{R}}^\infty,\ q \in \mathbb{Z}^d \}.$$

In terms of the Fourier transform of $W$, namely,

$$w_0(z)(q) \equiv \lambda \int_{\mathbb{T}^d} e^{i q \cdot \varphi}\, \partial_x U(\varphi + \Theta(\varphi), J(\varphi), Z(\varphi))\, d\varphi, \tag{3.5}$$

Eq. (3.4) becomes

$$K_0 z = w_0(z), \tag{3.6}$$

where the operator $K_0$ is given by the diagonal kernel

$$K_0(q, q') = \big(|\omega \cdot q|^2 - \mu^2\big)\, \delta_{qq'}. \tag{3.7}$$
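To see where the inverse of $K_0$ becomes large, one can enumerate the diagonal entries of the kernel (3.7) on a finite set of Fourier modes. The sketch below is for illustration only; the function name `small_denominators`, the cutoff `q_max`, the threshold `eta`, and the sample frequencies are assumptions and not quantities fixed by the paper.

```python
import itertools
import math


def small_denominators(omega, mus, q_max, eta):
    """List (q, k, denominator) with | |omega.q|^2 - mu_k^2 | < eta for |q_i| <= q_max."""
    hits = []
    for q in itertools.product(range(-q_max, q_max + 1), repeat=len(omega)):
        wq = sum(w * qi for w, qi in zip(omega, q))
        for k, mu_k in enumerate(mus, start=1):
            denominator = wq * wq - mu_k * mu_k
            if abs(denominator) < eta:
                hits.append((q, k, denominator))
    return hits


# A resonant choice: omega . q = mu_1 exactly for q = +2 and q = -2.
resonant = small_denominators((1.0,), (2.0,), q_max=3, eta=0.5)
# A nonresonant choice on the same finite range of q.
nonresonant = small_denominators((math.sqrt(2.0),), (2.0,), q_max=3, eta=0.5)
print(resonant)
print(nonresonant)
```

In the resonant example the diagonal entry of $K_0$ vanishes, so $K_0$ is not invertible on those modes, which is why $\omega$ has to be restricted to a set of admissible frequencies.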

Solving Eq. (3.6) requires inverting the operator $K_0$. Although the inverse of $K_0$ is unbounded for generic frequencies, restricting $\omega$ to a set of admissible frequencies gives sufficient control on the inverse of $K_0$ to prove existence of a solution. As is well known for Melnikov problems, this set depends on the perturbation $U$. In order to prove existence of a solution to Eq. (3.6), we will follow a strategy developed in [BGK] for standard KAM problems, namely, for equations of the type (3.3). This strategy basically consists in inductively reducing (3.6) to a sequence of effective equations involving denominators of decreasing size. One inductive step, say the $n$th step, consists in splitting the effective equation obtained at the previous step into two equations involving only large and, respectively, small denominators, where large and small are defined with respect to a scale of order $\eta^n$ for some fixed $\eta < 1$. This splitting is done in such a way that the nonlinear operator involved in the large denominators equation is a contraction, and this equation can thus be solved by a simple application of the contraction mapping principle. This, in turn, allows one to map the small denominators equation into a new effective equation of type (3.6), with a new right-hand side $w_n$ and (eventually) a new linear operator $K_n$.

In [BGK], it was shown that for equations of type (3.3), the above mentioned contraction property follows naturally from symmetries specific to this case. In contrast, Eq. (3.4) involves in addition the normal frequencies $\mu_k$ and does not possess such symmetry. In order to obtain the required contraction, we must make at every inductive step an additional preparation step. As we shall see below, this amounts to renormalizing the linear operator $K_{n-1}$ obtained at the previous step into a new operator $K_n$, which, in effect, corresponds to renormalizing the normal frequencies. Furthermore, we will see that the renormalized normal frequencies converge to a $U$-dependent set $\{\mu^*_\alpha\}$, $\alpha \ge 1$, as $n \to \infty$. Therefore, since the Diophantine conditions imposed on $\omega$ will eventually be defined relative to this set, one obtains in a constructive way the dependence of the set of admissible frequencies on the perturbation $U$.

We now describe how the renormalization group approach is implemented in practice for Melnikov type problems. First, we proceed with the above mentioned preparation step by decomposing $w_0$ as

$$w_0(z) = \tilde w_0(z) + A_0 z,$$



where the linear operator $A_0$ is the dominant part of $Dw_0(z)$ evaluated at $z = 0$. With $K_1 \equiv K_0 - A_0$, Eq. (3.6) now reads

$$K_1 z = \tilde w_0(z). \tag{3.8}$$

As explained in more detail below, $A_0$ can be chosen in such a way that $K_1$ is of the same form as $K_0$, cf. (3.7), but now given in terms of a new set of frequencies $\tilde\mu_{k_i} \in \mathbb{R}$ which are perturbations of order $\lambda$ of the original normal frequencies $\mu_k$. The notation $\tilde\mu_{k_i}$ reflects the fact that the perturbation $A_0$ may lift some of the degeneracies. Therefore, when inverting $K_1$, denominators smaller than $O(\eta)$ occur for $q$ such that $\big||\omega \cdot q| - \tilde\mu_{k_i}\big| \le O(\eta)$ for some $k_i$. Furthermore, these small denominators only occur, for such $q$, in a specific subspace $h^q_{k_i}$ of $\mathbb{C}^{d_k}$ depending on which $\tilde\mu_{k_j}$, if any, has been separated from $\tilde\mu_{k_i}$ by more than $O(\eta)$. Introducing $P_1$ as the projection of $h$ onto $h^q_{k_i}$ for $q$ such that $\big||\omega \cdot q| - \tilde\mu_{k_i}\big| \le O(\eta)$ and defining $Q_1 \equiv 1 - P_1$, one thus expects that the restriction of $K_1$ to $Q_1 h$ is invertible with an inverse of order $O(\eta^{-1})$. Multiplying (3.8) by $Q_1$ and $P_1$ leads to the large and small denominators equations for $\tilde z_1 \equiv Q_1 z$ and $z_1 \equiv P_1 z$, respectively,

$$K_1 \tilde z_1 = Q_1 \tilde w_0(\tilde z_1 + z_1), \tag{3.9}$$

$$K_1 z_1 = P_1 \tilde w_0(\tilde z_1 + z_1), \tag{3.10}$$

and by definition of $Q_1$, the first equation can be rewritten as a fixed point equation for the functional $R_1$ defined as $R_1(z_1) \equiv \tilde z_1$, namely,

$$R_1(z) = K_1^{-1} Q_1 \tilde w_0(z + R_1(z)). \tag{3.11}$$

By choice of A0 , the nonlinear operator K1−1 Q1 w˜ 0 is a contraction and one can solve Eq. (3.11) for R1 using the Banach fixed point theorem. (See point (a) of Theorem 5.1 for this part of the inductive step.) Next, with w1 defined as w1 (z) ≡ w˜ 0 (z + R1 (z)), Eq. (3.10) reads K1 z1 = P1 w1 (z1 ),

(3.12)

and the solution z = z1 + z˜ 1 of the original Eq. (3.6) is now given by z = z1 + R1 (z1 ) ≡ F1 (z1 ). Hence, the problem of solving (3.6) is reduced to solving the effective Eq. (3.12). To solve this equation one proceeds similarly, starting with our preparation step. After n steps of this inductive process, the solution of (3.6) is given by z = Fn−1 (zn + Rn (zn )) ≡ Fn (zn ),

(3.13)

where $R_n$ solves the functional equation

$$R_n(z) = \Gamma_n \tilde w_{n-1}(z + R_n(z)), \tag{3.14}$$

with

$$\Gamma_n \equiv K_n^{-1} Q_n P_{n-1}, \tag{3.15}$$

and, for some linear operator $A_{n-1}$,

$$\tilde w_{n-1}(z) \equiv w_{n-1}(z) - A_{n-1} z, \tag{3.16}$$

$$K_n \equiv K_{n-1} - P_{n-1} A_{n-1}, \tag{3.17}$$

whereas $z_n$ solves the effective equation

$$K_n z_n = P_n w_n(z_n), \tag{3.18}$$

with $w_n$ defined as

$$w_n(z) \equiv \tilde w_{n-1}(z + R_n(z)). \tag{3.19}$$

Remark 3.1. The point of this inductive procedure is that $P_n w_n(z)$ becomes effectively linear in $z$ for large $n$. More precisely, we will show, cf. Theorem 5.1 below, that the rescaled maps $w_n^r$ defined by $w_n^r(z) = \eta^{-n} r^{-n} w_n(r^n z)$ satisfy for $r < \eta$,

$$P_n w_n^r(z) = P_n Dw_n^r(0) z + O(\lambda r^{2n} \eta^{-n}) \quad\text{with}\quad P_n Dw_n^r(0) = O(\lambda),$$

in some appropriate Banach space. Thus, $z_n = 0$ becomes a better and better approximation to the solution of (3.18), and we shall construct the solution $z$ of the original Eq. (3.6) as the limit of the approximate solutions

$$z = \lim_{n\to\infty} F_n(0). \tag{3.20}$$
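One step of the splitting (3.9)–(3.12) can be illustrated on a toy problem. The sketch below is only an illustration under stated assumptions: the two-dimensional diagonal operator (`K_LARGE`, `K_SMALL`), the quadratic nonlinearity `w`, and all function names are hypothetical and are not taken from the paper. The large-denominator component is eliminated by a contraction, and the resulting scalar effective equation is then solved for the small-denominator component.

```python
# Toy analogue of K z = w(z) with z = (za, zb):
# za is the "large denominator" component (Q-part), zb the "small" one (P-part).
K_LARGE, K_SMALL = 1.0, 0.05  # hypothetical diagonal entries of K


def w(za, zb, lam):
    """Hypothetical smooth nonlinearity of size lam; returns (Q-part, P-part)."""
    return (lam * (1.0 + za * zb), lam * (0.5 + za * za - zb * zb))


def solve_R(zb, lam, tol=1e-14, max_iter=200):
    """Solve za = K_LARGE^{-1} (Q w)(za, zb) by fixed-point iteration, cf. (3.11)."""
    za = 0.0
    for _ in range(max_iter):
        new = w(za, zb, lam)[0] / K_LARGE
        if abs(new - za) < tol:
            return new
        za = new
    raise RuntimeError("large-denominator iteration did not converge")


def effective_rhs(zb, lam):
    """Right-hand side of the reduced equation, cf. (3.12): (P w)(R(zb), zb)."""
    return w(solve_R(zb, lam), zb, lam)[1]


def solve(lam, tol=1e-13, max_iter=500):
    """Solve the effective equation K_SMALL zb = effective_rhs(zb), then recover za."""
    zb = 0.0
    for _ in range(max_iter):
        new = effective_rhs(zb, lam) / K_SMALL
        if abs(new - zb) < tol:
            return solve_R(new, lam), new
        zb = new
    raise RuntimeError("effective equation iteration did not converge")


za, zb = solve(0.01)
print(za, zb)
```

In this toy setting the effective equation is itself solved by iteration because the chosen `lam` is small compared with `K_SMALL`; in the text this is precisely what fails in general, which is why the procedure is repeated on a sequence of scales.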

We now give a precise description of the operators $P_n$. Note that in order to obtain (3.14) and (3.18), we have tacitly assumed that $P_n P_{n-1} = P_n$. The possibility to define $P_n$ satisfying such a property follows from the convergence of the normal frequencies under renormalization. Recall that renormalization occurs because at every inductive step one turns the nonlinear map $w_n$ of the effective functional equation (3.18) into a contraction by subtracting some linear operator $A_n$. Delaying to subsequent sections the discussion of the appropriate choice for the family $A_m$, $m \ge 0$, it suffices to point here to the properties of $A_m$ that will ensure convergence of the renormalized normal frequencies. As will be shown, cf. point (c) of Theorem 5.1 for a precise statement, $A_m$ is a perturbation of order $\lambda\eta^m$ and is given by a constant kernel $A_m(q, q') = a_m \delta_{qq'}$ with $a_m : \hat{\mathbb{R}}^\infty \to \hat{\mathbb{R}}^\infty$ linear and hermitian. As a consequence, the operator $K_n = K_0 - \sum_{m=0}^{n-1} P_m A_m$ has a kernel of the form (3.7) with $\mu^2$ essentially replaced by the positive definite matrix

$$\tilde\mu_n^2 \equiv \mu^2 + \sum_{m=0}^{n-1} a_m, \tag{3.21}$$

with $\tilde\mu_n$ having a discrete spectrum $\sigma(\tilde\mu_n) \subset \mathbb{R}_+$. One easily checks that the singularities of $K_n^{-1}$ are given by the eigenvalues of $\tilde\mu_n$, which therefore correspond to renormalized normal frequencies. Since $a_m$ is of order $\lambda\eta^m$, one expects the eigenvalues of $\tilde\mu_n$ to converge as $n \to \infty$ with $|\nu_{n+1} - \nu_n| \le O(\lambda\eta^n)$ for $\nu_{n+1} \in \sigma(\tilde\mu_{n+1})$ and $\nu_n \in \sigma(\tilde\mu_n)$. This, in turn, allows us to define scales of denominators in a consistent way by carefully keeping track of the separation properties of $\sigma(\tilde\mu_n)$ as $n$ increases. To this end, one groups the normal frequencies into a hierarchy of clusters satisfying gap conditions that are preserved by the renormalization procedure. We first introduce some notation. For



$x \in \mathbb{R}$ and $C$ a finite collection of points in $\mathbb{R}$, let $d(x, C)$ denote the distance between $x$ and the smallest interval containing all points in $C$, and for two finite collections $C_1, C_2 \subset \mathbb{R}$, let

$$d(C_1, C_2) \equiv \inf_{x \in C_1} d(x, C_2).$$

Then, one can uniquely decompose $\sigma(\tilde\mu_n)$ into a maximal number of disjoint clusters $C^n_{k,i}$, $k \ge 1$, $i = 1, \dots, M^n_k$, satisfying $d(\mu_k, C^n_{k,i}) = O(\lambda)$ and the gap condition

$$d(C^n_{k,i}, C^n_{k,j}) > \eta^n \quad\text{if } i \ne j. \tag{3.22}$$
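The decomposition into clusters separated by gaps larger than $\eta^n$ can be sketched in a few lines. This is a minimal illustration, not the paper's construction: it treats a single finite list of real eigenvalues, ignores the grouping by the index $k$ and the condition $d(\mu_k, C^n_{k,i}) = O(\lambda)$, and all function names are hypothetical. For points on the real line, cutting the sorted list at every consecutive gap exceeding the threshold yields the largest possible number of clusters obeying the gap condition.

```python
def dist_point_to_cluster(x, cluster):
    """d(x, C): distance from x to the smallest interval containing C."""
    lo, hi = min(cluster), max(cluster)
    return max(lo - x, x - hi, 0.0)


def dist_clusters(c1, c2):
    """d(C1, C2) = inf over x in C1 of d(x, C2), as defined in the text."""
    return min(dist_point_to_cluster(x, c2) for x in c1)


def decompose_into_clusters(eigenvalues, gap):
    """Cut the sorted eigenvalues wherever two neighbours differ by more than gap."""
    points = sorted(eigenvalues)
    if not points:
        return []
    clusters = [[points[0]]]
    for x in points[1:]:
        if x - clusters[-1][-1] > gap:
            clusters.append([x])       # gap large enough: start a new cluster
        else:
            clusters[-1].append(x)     # too close: stays in the current cluster
    return clusters


spectrum = [1.0, 1.004, 1.3, 2.0, 2.002, 2.2]
print(decompose_into_clusters(spectrum, gap=0.01))
print(decompose_into_clusters(spectrum, gap=0.001))
```

Lowering the threshold (that is, increasing $n$) can only split clusters further, which mirrors the statement that near-degenerate eigenvalues are eventually separated.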

Note that $M^n_k \le d_k$, where $d_k$ denotes the multiplicity of the original normal frequency $\mu_k$, and that by requiring $M^n_k$ to be maximal, the decomposition

$$\sigma(\tilde\mu_n) = \bigcup_{k \ge 1} \bigcup_{i=1}^{M^n_k} C^n_{k,i} \tag{3.23}$$

is unique. The above observation about the rate of convergence of $\sigma(\tilde\mu_n)$ as $n \to \infty$ ensures that eigenvalues belonging to different clusters will remain separated. Generically, one expects all degeneracies to be lifted eventually, so that $M^n_k = d_k$ for $n$ sufficiently large and each cluster $C^n_{k,i}$ contains a single eigenvalue. Next, defining $S_n \subset \mathbb{Z}^d$ as

$$S_n = \bigcup_{k \ge 1} \bigcup_{i=1}^{M^n_k} S^n_{k,i}, \tag{3.24}$$

where

$$S^n_{k,i} = \{ q \in \mathbb{Z}^d \mid d(|\omega\cdot q|, C^n_{k,i}) < \tfrac14 \eta^n \}, \tag{3.25}$$

one is ensured that all $q \in \mathbb{Z}^d \setminus S_n$ satisfy $d(|\omega\cdot q|, \sigma(\tilde\mu_{n'})) \ge O(\eta^n)$ for $n' \ge n$. Hence, such $q$ can be safely “integrated out” in the large denominators equation. Note that due to (3.22), the sets $S^n_{k,i}$ are pairwise disjoint. In order to achieve the construction of $P_n$, one must isolate for every $q \in S_n$ the subspace of $\mathbb{R}^\infty$ in which small denominators will occur. For $q \in S^n_{k,i}$, the latter is given by the eigenspace of $\tilde\mu_n$ associated with the eigenvalues belonging to $C^n_{k,i}$. This eigenspace will be denoted by $J^n_{k,i}$, whereas the projector onto $J^n_{k,i}$ will be denoted by $P^n_{k,i}$. Thus, one defines $P_n$ to be the diagonal operator acting on $h$ given by the kernel

$$P_n(q) = \sum_{k \ge 1} \sum_{i=1}^{M^n_k} \chi^n_{k,i}(\omega\cdot q)\, P^n_{k,i},$$

where $\chi^n_{k,i}$ denotes a function in $C^1(\mathbb{R})$ which satisfies

$$\chi^n_{k,i}(\kappa) = \begin{cases} 1 & \text{if } d(|\kappa|, C^n_{k,i}) \le \tfrac18 \eta^n, \\ 0 & \text{if } d(|\kappa|, C^n_{k,i}) \ge \tfrac14 \eta^n, \end{cases} \tag{3.26}$$

and interpolates monotonically between 0 and 1 otherwise, with

$$\sup_{\kappa \in \mathbb{R}} |{\chi^n_{k,i}}'(\kappa)| \le C\eta^{-n}, \tag{3.27}$$

whereas $Q_n$ is defined as

$$Q_n = 1 - P_n. \tag{3.28}$$

Note that $P_n$ and $Q_n$ are not projectors. The smooth functions $\chi^n_{k,i}$ have been introduced in order to ensure the continuity of the diagonal kernels $\Gamma_n(q, q)$, cf. the discussion preceding Lemma 5.3 below. However, we will make use later of the projector

$$\hat P_n(q) = \sum_{k \ge 1} \sum_{i=1}^{M^n_k} I_{S^n_{k,i}}(q)\, P^n_{k,i}, \tag{3.29}$$

where $I_O$ denotes the indicator function of a set $O$. Note that $P_n \hat P_n = P_n$, whereas $Q_n \hat P_n \ne 0$.

We conclude this section by a few remarks related to the convergence of the inductive scheme. First, setting $I^n_{k,i} \subset \mathbb{R}$ to be the smallest interval covering $C^n_{k,i}$, one easily checks that $|I^n_{k,i}| \le (d_k - 1)\eta^n$. Hence, since the multiplicities of the normal frequencies $\mu_k$ were assumed to be uniformly bounded in $k$, i.e., $d_k \le \bar d$ for all $k \ge 1$, one obtains for all $n \ge 1$, $k \ge 1$, and $i = 1, \dots, M^n_k$,

$$|I^n_{k,i}| \le \bar d\, \eta^n. \tag{3.30}$$
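A cutoff with the properties required in (3.26) and (3.27) is easy to write down explicitly. The sketch below is one possible choice, assumed here for illustration only (the paper does not specify the interpolation): a cubic smoothstep in the distance variable $d(|\kappa|, C^n_{k,i})$, whose slope is bounded by $12\,\eta^{-n}$. The function name `chi` and its arguments are hypothetical.

```python
def chi(dist, eta_n):
    """Cutoff as a function of dist = d(|kappa|, C) at scale eta_n = eta**n.

    Equals 1 for dist <= eta_n/8, 0 for dist >= eta_n/4, and interpolates
    monotonically in between with a cubic smoothstep (continuous derivative).
    """
    lo, hi = eta_n / 8.0, eta_n / 4.0
    if dist <= lo:
        return 1.0
    if dist >= hi:
        return 0.0
    t = (hi - dist) / (hi - lo)          # t runs from 0 (at hi) to 1 (at lo)
    return t * t * (3.0 - 2.0 * t)       # smoothstep: zero slope at both ends


# The slope of the smoothstep is at most 1.5 in t, and dt/d(dist) = -8/eta_n,
# so |chi'| <= 12/eta_n, i.e. a bound of the form C * eta**(-n) as in (3.27).
eta_n = 0.8
print(chi(0.05, eta_n), chi(0.15, eta_n), chi(0.25, eta_n))
```

Because the transition region has width $\eta^n/8$, any such interpolation necessarily has a derivative of order $\eta^{-n}$, which is the origin of the factor $\eta^{-2n}$ in the continuity estimate (5.18).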

Next, it follows from the gap condition (3.22) being preserved that for all $m < n$ the eigenvalues in a given cluster $C^n_{k,i}$ are perturbations of all or some eigenvalues belonging to a single cluster $C^m_{k,j}$, denoted by $C^m_{k,j^n_i}$. Furthermore, $C^n_{k,i}$ remains close to $C^m_{k,j^n_i}$. More precisely, we will show that

$$\sup_{x \in I^n_{k,i}} \inf_{y \in I^m_{k,j^n_i}} d(x, y) \le \eta^{m+1} \quad\text{for } 1 \le m < n. \tag{3.31}$$

Finally, we consider the properties of the eigenspaces $J^n_{k,i}$. One has by construction $P^n_{k,i} P^n_{l,j} = \delta_{kl}\delta_{ij} P^n_{k,i}$. However, it will be possible to choose $a_m$ in (3.21) in such a way that each $J^m_{k,i}$ is an invariant subspace for $a_m$. Hence, by definition of $\tilde\mu_n$ and $J^n_{k,i}$, every $J^n_{k,i}$ is a subspace of some $J^{n-1}_{k,j}$, and by recursion, of some $J^m_{k,j}$ for all $m < n$. The (unique) eigenspace $J^m_{k,j}$ containing $J^n_{k,i}$ will be denoted by $J^m_{k,j^n_i}$. Therefore, one has for all $1 \le m \le n$, $k \ge 1$, and $i = 1, \dots, M^n_k$,

$$P^n_{k,i} P^m_{l,j} = \delta_{kl}\, \delta_{j j^n_i}\, P^n_{k,i}, \tag{3.32}$$

which, in particular, implies that

$$P_n P_{n-1} = P_{n-1} P_n = P_n. \tag{3.33}$$



Notations. For most of the subsequent analysis, it will not be necessary to distinguish between indices $(k, i)$ and $(l, j)$ with $k = l$ or $k \ne l$. This intervenes only in the description of the asymptotic behavior of the spectrum $\sigma(\tilde\mu_n)$ and the measure estimate of $\Omega^*$. For notational convenience, we thus introduce the index sets

$$I^n = \{ (k, i) \mid k \ge 1,\ i = 1, \dots, M^n_k \}, \quad n \ge 1, \tag{3.34}$$

and will reserve bold letters for indices in $I^n$. With this convention, $\{ C^n_{\mathbf{k}} \mid \mathbf{k} \in I^n \}$ denotes for instance the collection of all clusters $C^n_{k,i}$, $k \ge 1$, $i = 1, \dots, M^n_k$.

4. Spaces

For the Fourier transform $z$ of the solution $Z$ of our original Eq. (3.4), we consider the Banach space $h_s$, $s \in \mathbb{R}$, defined by

$$h_s = \Big\{ z = (z(q)) \ \Big|\ z(q) \in \hat{\mathbb{R}}^\infty_s,\ \|z\|_s \equiv \sum_{q \in \mathbb{Z}^d} |z(q)|_s < \infty \Big\}. \tag{4.1}$$

For $s \ge t$, one has the natural embedding $h_s \to h_t$ with $\|\cdot\|_t \le \|\cdot\|_s$. We will denote by $h^n_s$ the subspace $\hat P_n h_s$. In particular, one has for $z \in h^n_s$,

$$\|z\|_s = \sum_{\mathbf{k} \in I^n} \sum_{q \in S^n_{\mathbf{k}}} |P^n_{\mathbf{k}} z(q)|_s. \tag{4.2}$$

The operator norm in $L(h^n_s, h^m_t)$ will be denoted by $\|\cdot\|^{(n,m)}_{s,t}$, and by $\|\cdot\|^{(n)}_s$ when $n = m$ and $s = t$. Let us now turn to the spaces we will consider for the functions $w_n$. Recall that in our analysis of (3.4), the functions $\Theta$ and $J$ only appear as parameters. In the sequel, we consider $\Theta, J : \mathbb{T}^d \to \mathbb{R}^d$ as (fixed) real analytic maps belonging to a small neighborhood of the origin $O_B$ in the Banach space

$$B = \Big\{ (F, G) : \mathbb{T}^d \to \mathbb{R}^d \times \mathbb{R}^d \ \Big|\ \|(F, G)\|_B \equiv \sum_{q \in \mathbb{Z}^d} \big( |f(q)| + |g(q)| \big) < \infty \Big\}. \tag{4.3}$$

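Next, it follows from assumption (H2) that the gradient $\partial_x U$ is real analytic as a map from $\mathbb{T}^d \times O_I \times O_x$ to $\mathbb{R}^\infty_{s'}$, cf. [PT] p. 138. (Recall that $O_I \subset \mathbb{R}^d$ and $O_x \subset \mathbb{R}^\infty_s$ are neighborhoods of the origin and that $s' \equiv s + \xi - a$.) This implies that for $(\Theta, J) \in O_B$ small enough, one can write the Taylor expansion of $\partial_x U(\varphi + \Theta(\varphi), J(\varphi), Z) = \partial_x U((\varphi + \Theta(\varphi), J(\varphi), 0) + (0, 0, Z))$ as

$$\partial_x U((\varphi + \Theta(\varphi), J(\varphi), 0) + (0, 0, Z)) = \sum_{m=0}^{\infty} \frac{1}{m!}\, U^{\Theta,J}_{m+1}(\varphi)(Z, \dots, Z), \tag{4.4}$$

where the coefficients $U^{\Theta,J}_{m+1}(\varphi)$ belong to the space of $m$-linear maps $L(\mathbb{R}^\infty_s, \dots, \mathbb{R}^\infty_s; \mathbb{R}^\infty_{s'})$, are real analytic in $\varphi \in \mathbb{T}^d$ and analytic in $(\Theta, J) \in O_B$. Hence, there exist $\rho > 0$, $\alpha > 0$ and $b < \infty$ such that the Fourier transforms of $U^{\Theta,J}_{m+1}(\varphi)$ satisfy

$$\sum_{q \in \mathbb{Z}^d} e^{\alpha|q|}\, \|u^{\Theta,J}_{m+1}(q)\|_{L(\hat{\mathbb{R}}^\infty_s, \dots, \hat{\mathbb{R}}^\infty_s; \hat{\mathbb{R}}^\infty_{s'})} \le b\, m!\, \rho^{-m}. \tag{4.5}$$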


Inserting the Fourier series for $Z$ into (4.4), one obtains the expansion for $w_0$ as defined in (3.5),

$$w_0(z)(q) = \lambda \sum_{m=0}^{\infty} \frac{1}{m!} \sum_{\mathbf{q}} u^{\Theta,J}_{m+1}\Big( q - \sum_{i=1}^{m} q_i \Big)(z(q_1), \dots, z(q_m)) \equiv \sum_{m=0}^{\infty} \sum_{\mathbf{q}} w_0^{(m)}(q; q_1, \dots, q_m)(z(q_1), \dots, z(q_m)), \tag{4.6}$$

where $\mathbf{q} = (q_1, \dots, q_m) \in \mathbb{Z}^{md}$. This formula suggests considering $w_0$ as an analytic function of $z \in h_s$. Let $B(r_0)$ be the open ball of radius $r_0$ in $h_s$ centered at the origin and let $H^\infty(B(r_0), h_s)$ denote the Banach space of analytic functions $w : B(r_0) \to h_s$ equipped with the supremum norm, which we shall denote by $|||w|||$. Then, bound (4.5) implies that $w_0 \in H^\infty(B(r_0), h_s)$ for $r_0$ small enough.

It will be convenient to encode the decay property of the kernels $w_0^{(m)}$ inherited from the estimate (4.5) as a property of the functional $w_0$. Let $\tau_\beta$ denote the translation by $\beta \in \mathbb{R}^d$, i.e., $(\tau_\beta Z)(\varphi) = Z(\varphi - \beta)$. On $h_s$, $\tau_\beta$ is realized by $(\tau_\beta z)(q) = e^{i\beta\cdot q} z(q)$, and it induces a map $w \to w_\beta$ from $H^\infty(B(r_0), h_s)$ to itself if we define

$$w_\beta(z) = \tau_\beta w(\tau_{-\beta} z). \tag{4.7}$$

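On the kernels $w_0^{(m)}$, this is given by

$$w_\beta^{(m)}(q; q_1, \dots, q_m) = e^{i\beta\cdot(q - \sum_i q_i)}\, w^{(m)}(q; q_1, \dots, q_m),$$

and makes sense also for $\beta \in \mathbb{C}^d$. Since

$$|||w_{0\beta}||| \le \sum_{m=0}^{\infty} r_0^m \sup_{\mathbf{q}} \sum_{q \in \mathbb{Z}^d} e^{-\operatorname{Im}\beta\cdot(q - \sum_i q_i)}\, \|w_0^{(m)}(q; q_1, \dots, q_m)\|_{L(\hat{\mathbb{R}}^\infty_s, \dots, \hat{\mathbb{R}}^\infty_s; \hat{\mathbb{R}}^\infty_s)},$$

it thus follows from (4.5) that there exist $r_0 > 0$, $\alpha > 0$, and $D < \infty$, such that $w_{0\beta}$ belongs to $H^\infty(B(r_0), h_s)$ and extends to an analytic function of $\beta$ in the strip $|\operatorname{Im}\beta| < \alpha$ with values in $H^\infty(B(r_0), h_s)$ satisfying the bound

$$|||w_{0\beta}||| \le D|\lambda|. \tag{4.8}$$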

Let us now come back to the existence of a solution for Eq. (3.3), namely for the standard KAM problem. One has the classical result (see for instance [BGK]):

Theorem 4.1. Let $U$ satisfy hypothesis (H2) and let $g$ be an invertible matrix. Then, there is a $\lambda_1 > 0$ small enough such that for $|\lambda| < \lambda_1$ and $\omega$ satisfying a Diophantine condition of the form $|\omega\cdot q| > K|q|^{-\nu}$ for $q \in \mathbb{Z}^d$, $q \ne 0$, (3.3) has a solution $(\Theta, J) \in B$ which is real analytic in $\varphi$, analytic in $\lambda$, and vanishes for $\lambda = 0$. Furthermore, this solution is unique up to translations $(\Theta, J)(\varphi) \to (\Theta - \beta, J)(\varphi - \beta)$ and depends analytically on $Z$, for $Z$ in a small ball centered at the origin of the Banach space $h_s$.



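To conclude this section, we list some standard properties of bounded analytic functions defined on open balls in Banach spaces. Let $h, h', h''$ be Banach spaces, $B(r) \subset h$, $B(r') \subset h'$, and $w_i \in H^\infty(B(r), h')$, $w \in H^\infty(B(r'), h'')$. First, one has the composition property: if $|||w_i||| < r'$ then $w \circ w_i \in H^\infty(B(r), h'')$ and

$$|||w \circ w_i||| \le |||w|||. \tag{4.9}$$

Next, one deduces from the Cauchy estimate that for $r_1 < r'$,

$$\sup_{\|x\| < r_1} \|Dw(x)\|_{L(h', h'')} \le (r' - r_1)^{-1}\, |||w|||.$$

[…] $> 0$ and $\{C^n_{\mathbf{k}}\}_{\mathbf{k} \in I^n}$ the clusters described in the previous section,

$$\Omega_n(K) = \big\{ \omega \in \mathbb{R}^d \ \big|\ d(|\omega\cdot q|, C^n_{\mathbf{k}}),\ d(|\omega\cdot q|, |C^n_{\mathbf{k}} \pm C^n_{\mathbf{k}'}|) > K|q|^{-\nu}\ \ \forall\, |q| < K\eta^{-n/\nu},\ q \ne 0,\ \text{and } \mathbf{k}, \mathbf{k}' \in I^n \big\}, \tag{5.3}$$

where $|C^n_{\mathbf{k}} \pm C^n_{\mathbf{k}'}|$ denotes the set $\{ |\nu \pm \nu'| \mid \nu \in C^n_{\mathbf{k}},\ \nu' \in C^n_{\mathbf{k}'} \}$. Note that $\Omega_n(K) \subset \Omega_n(K')$ whenever $K > K'$. Furthermore, one introduces for $\omega \in \mathbb{R}^d$ the subsets of $\mathbb{Z}^d$,

$$Q^+_\omega = \{ q \in \mathbb{Z}^d \mid \omega\cdot q > 0 \}, \qquad Q^-_\omega = \{ q \in \mathbb{Z}^d \mid \omega\cdot q < 0 \}. \tag{5.4}$$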
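The membership condition in (5.3) involves only finitely many $q$ for each $n$, so it can be tested by brute force in small examples. The following sketch is an illustration under assumptions that are not fixed by this excerpt: $|q|$ is taken to be the Euclidean norm, the clusters are given as plain lists of numbers, and the function names are hypothetical.

```python
from itertools import product
from math import ceil, sqrt


def dist_to_hull(x, points):
    """Distance from x to the smallest interval containing the given points."""
    lo, hi = min(points), max(points)
    return max(lo - x, x - hi, 0.0)


def in_omega_n(omega, clusters, K, nu, eta, n):
    """Brute-force test of the conditions defining Omega_n(K) in (5.3)."""
    # Targets: each cluster C_k, and the sets |C_k + C_k'| and |C_k - C_k'|.
    targets = [list(c) for c in clusters]
    for c1 in clusters:
        for c2 in clusters:
            targets.append([abs(a + b) for a in c1 for b in c2])
            targets.append([abs(a - b) for a in c1 for b in c2])
    bound = K * eta ** (-n / nu)          # only 0 < |q| < K eta^{-n/nu} matter
    m = ceil(bound)
    for q in product(range(-m, m + 1), repeat=len(omega)):
        size = sqrt(sum(c * c for c in q))    # assumed Euclidean norm |q|
        if size == 0 or size >= bound:
            continue
        x = abs(sum(w * c for w, c in zip(omega, q)))
        if any(dist_to_hull(x, t) <= K * size ** (-nu) for t in targets):
            return False
    return True


omega = (1.0, sqrt(2.0))
clusters = [[0.25]]
print(in_omega_n(omega, clusters, K=0.1, nu=2, eta=0.5, n=10))
print(in_omega_n(omega, clusters, K=0.5, nu=2, eta=0.5, n=10))
```

Taking $\mathbf{k} = \mathbf{k}'$ in the difference sets includes the point $0$ among the targets, which is how a Diophantine condition with respect to zero enters. Increasing $K$ both strengthens the inequality and enlarges the range of $q$, consistent with $\Omega_n(K) \subset \Omega_n(K')$ for $K > K'$.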



Proposition 5.1. There exist positive constants $r$ and $\lambda_0$ small enough such that the following is true for $|\lambda| < \lambda_0$, $n \ge 1$, and $|\operatorname{Im}\beta| < \alpha_n$, where $\alpha_1 = \alpha$ and, for $n \ge 2$,

$$\alpha_n = (1 - n^{-2})\,\alpha_{n-1}. \tag{5.5}$$

There exists $K_\lambda > 0$ satisfying $K_\lambda \to 0$ as $\lambda \to 0$ such that one has for $\omega \in \Omega_n(K_\lambda)$ arbitrary but fixed,

(a) Equation (5.1) has a solution $R_{n\beta}$ in $H^\infty(B_n, h^{n-1}_s)$ analytic in $|\lambda| < \lambda_0$ and $(\Theta, J) \in O_B$.

(b) Defining $w_{n\beta}$ according to (5.2), one has $w_{n\beta} \in \mathcal{A}_n$ and, writing $w_{n\beta}(z) \equiv w_n(z) = w_n(0) + Dw_n(0)z + \delta_2 w_n(z)$,

$$\|\hat P_n w_n(0)\|_s \le \varepsilon r^{2n}, \tag{5.6}$$

$$|||\hat P_n \delta_2 w_n|||_{\mathcal{A}_n} \le \varepsilon r^{2n}, \tag{5.7}$$

where $\varepsilon \to 0$ as $\lambda \to 0$.

(c) There exists $A_n \in L(h_s, h_s)$ such that $\tilde w_n \equiv w_n - A_n$ obeys for all $z \in B_n$,

$$\|\hat P_n D\tilde w_n(z)\|^{(n)}_{s,s} \le \varepsilon\eta^n. \tag{5.8}$$

Furthermore,

$$\|A_n\|_{s,s} \le 3\varepsilon\eta^{n-1}, \tag{5.9}$$

$A_n(q, q') = 0$ if $q \ne q'$ and

$$A_n(q, q) = a_n I_{Q^+_\omega}(q) + \bar a_n I_{Q^-_\omega}(q), \tag{5.10}$$

where $a_n \in L(\hat{\mathbb{R}}^\infty_s, \hat{\mathbb{R}}^\infty_s)$ is hermitian, i.e., $a_n = \bar a_n^T$, and satisfies for all $\mathbf{k} \in I^n$,

$$a_n J^n_{\mathbf{k}} \subset J^n_{\mathbf{k}}. \tag{5.11}$$

(d) The matrix $\tilde\mu^2_{n+1} \equiv \mu^2 + \sum_{m=0}^{n} a_m$ is positive definite and the spectrum of $\tilde\mu_{n+1}$ can be uniquely decomposed into a maximal family of pairwise disjoint clusters $C^{n+1}_{k,i}$, $k \ge 1$, $i = 1, \dots, M^{n+1}_k$, with $M^{n+1}_k \ge M^n_k$, satisfying for all $k \ge 1$ the gap condition

$$d(C^{n+1}_{k,i}, C^{n+1}_{k,j}) > \eta^{n+1} \quad\text{if } i \ne j, \tag{5.12}$$

and

$$\nu = \mu_k + O(\varepsilon k^{-\xi}) \quad\text{for all } \nu \in C^{n+1}_{k,i},\ i = 1, \dots, M^{n+1}_k. \tag{5.13}$$

Furthermore, the sets $S^{n+1}_{k,i}$ defined according to (3.25) are pairwise disjoint, and (3.31), (3.32) and (3.33) hold with $n$ replaced by $n + 1$.
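The recursion (5.5) for the widths $\alpha_n$ of the analyticity strips can be evaluated exactly. The short sketch below (the function name `alpha_ratio` is hypothetical) computes $\alpha_n/\alpha$ in rational arithmetic; the product telescopes to $(n+1)/(2n)$, so the strips shrink but their width stays above $\alpha/2$ for all $n$.

```python
from fractions import Fraction


def alpha_ratio(n):
    """Return alpha_n / alpha for alpha_1 = alpha, alpha_n = (1 - n**-2) * alpha_{n-1}."""
    ratio = Fraction(1)
    for m in range(2, n + 1):
        ratio *= 1 - Fraction(1, m * m)
    return ratio


for n in (1, 2, 10, 1000):
    print(n, alpha_ratio(n), float(alpha_ratio(n)))
```

This is why a fixed fraction of the initial exponential decay in $|q - q'|$ survives the whole induction, even though a small amount of analyticity is given up at every step.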

Let us briefly comment on Proposition 5.1, whose proof will be carried out in Sect. 6. First, we note that point (d) ensures, in particular, that the new set of clusters $C^{n+1}_{k,i}$ enjoys the properties required for proceeding to the next step of the induction, cf. the discussion at the end of Sect. 3. The asymptotic behavior (5.13) concerns the measure estimate of the set $\Omega^*$ of admissible frequencies in Theorem 1.1. Such an asymptotic behavior is required in order to obtain a set of large measure because one imposes Diophantine conditions with respect to differences of the normal frequencies. We will show in Sect. 7 that (5.13) implies the



Proposition 5.2. For $\nu = \nu(d, \xi)$ sufficiently large, the set

$$\Omega^*(K) \equiv \bigcap_{n \ge 1} \Omega_n(K) \tag{5.14}$$

satisfies for all bounded $\Omega \subset \mathbb{R}^d$, $\operatorname{meas}(\Omega \setminus \Omega^*(K)) \to 0$ as $K \to 0$.

Note that $\omega \in \Omega^*$ satisfies a Diophantine condition with respect to zero. Therefore, one has for such $\omega$, $\mathbb{Z}^d \setminus \{0\} = Q^+_\omega \cup Q^-_\omega$. Next, we turn to bound (5.8), the most delicate estimate to establish. To treat the off-diagonal part $Dw_n(q, q')$, $q \ne q'$, we will rely on the fact that the exponential decay of the kernel $Dw_0(q, q')$ in the size of $|q - q'|$ is preserved due to the introduction of the parameter $\beta$. We note that imposing Diophantine conditions on $\omega$ with respect to the differences $C^n_{\mathbf{k}} \pm C^n_{\mathbf{k}'}$ ensures that $|q - q'|$ is of order $O(\eta^{-n/\nu})$ for $q \ne q' \in S_n$. To treat the diagonal part, we will use that $Dw_n(q, q)$ depends on $q$ through $\omega\cdot q$ only, and is, in some sense, continuous in this variable. More precisely, defining $t_p : L(h_s, h_s) \to L(h_s, h_s)$, $p \in \mathbb{Z}^d$, by

$$(t_p L)(q, q') = L(q + p, q' + p), \tag{5.15}$$

and setting

$$T_p \equiv t_p - 1, \tag{5.16}$$

we will show that $T_p Dw_n$ is of order $O(\varepsilon|\omega\cdot p|)$ on the diagonal. Therefore, since $p = q - q'$ satisfies $|\omega\cdot p| \le \eta^n$ for $q, q' \in S^n_{\mathbf{k}}$ such that $\operatorname{sign}(\omega\cdot q) = \operatorname{sign}(\omega\cdot q')$, one has for $q \in S^n_{\mathbf{k}}$, $\hat P_n Dw_n(q, q) \hat P_n = \hat a_{\mathbf{k}} + O(\varepsilon\eta^n)$, where $\hat a_{\mathbf{k}} : J^n_{\mathbf{k}} \to J^n_{\mathbf{k}}$ depends only on the sign of $\omega\cdot q$. The continuity of $Dw_n(q, q)$ ultimately follows from the fact that $\Gamma_n(q)$ is continuous in $\omega\cdot q$, as stated in the following lemma, whose proof can be found in the Appendix.

Lemma 5.3. Let $\sigma \in \mathbb{R}$ and $p \in \mathbb{Z}^d$. Then the operator $\Gamma_n = K_n^{-1} Q_n P_{n-1}$ obeys

$$\|\Gamma_n\|_{\sigma,\sigma+\gamma} \le C\eta^{-n}, \tag{5.17}$$

$$\|T_p \Gamma_n\|_{\sigma,\sigma+\gamma} \le C\eta^{-2n}|\omega\cdot p|. \tag{5.18}$$

Finally, the perturbation $a_n$ being hermitian will essentially follow from the reality of the original Eq. (3.4). More precisely, the derivative $Dw_n$ satisfies

$$Dw_n^{ij}(q, q') = \overline{Dw_n^{ij}(-q, -q')}, \tag{5.19}$$

$$Dw_n^{ij}(q, q') = Dw_n^{ji}(-q', -q). \tag{5.20}$$

Thus, the diagonal element $Dw_n(q, q) : \hat{\mathbb{R}}^\infty \to \hat{\mathbb{R}}^\infty$ is given by a hermitian matrix for all $q$, and $a_n$ being hermitian will follow since, as was mentioned above, $a_n$ will be chosen in such a way that its action on each $J^n_{\mathbf{k}}$ is the constant approximation of $Dw_n(q, q)$ for $q \in S^n_{\mathbf{k}}$. Note that due to (5.19), one expects $Dw_n(-q, -q)$ to be approximated by $\bar a_n$, which explains the decomposition in formula (5.10). Identities (5.19) and (5.20) are easily checked to hold for $n = 0$. Indeed, the perturbation $U$ in the Hamiltonian (1.4) being real analytic ensures (5.19), whereas (5.20) follows from the fact that $Dw_0$ is the symmetric second derivative of the functional $Z \to \lambda \int U(\varphi + \Theta(\varphi), J(\varphi), Z(\varphi))\, d\varphi$, cf. (3.5). Using the recursive relations (3.19) and (3.16), one obtains (5.19) and (5.20) for $n \ge 1$ by iteration.



Remark 5.4. The choice of constants is as follows. We first fix $\eta$ small enough according, essentially, to the constants entering the asymptotics of the frequencies $\mu_k$ in (H1), cf. Sect. 6.4. Given $\eta$, $\varepsilon$ and $r$ are chosen small enough, and $\lambda_0$ is chosen in turn according to $\varepsilon$. The latter choice plays a role only in ensuring that the inductive hypotheses of Proposition 5.1 are satisfied for $n = 0$, cf. the introduction in Sect. 6. Finally, $K_\lambda$ is chosen large enough in order for the estimate

$$C e^{-C n^{-2} K_\lambda^{1/\nu} \eta^{-n/\nu}} \le r^{2n} \tag{5.21}$$

to hold for all $n \ge 1$. This will be needed in order to iterate the bound (5.6) in Sect. 6.2. Note that due to the double exponential, the dependence of $K_\lambda$ on $\eta$ and $r$ is given by the behavior at small $n$ of the expressions entering (5.21). That $K_\lambda$ can be taken smaller as $\lambda$ goes to zero will follow from the fact that $r$ and $\varepsilon$, and thus ultimately $\eta$, can be taken smaller. Finally, we denote by $C$ a generic constant, independent of $n$, $r$, and $\varepsilon$, which may vary from place to place.

6. Proof of Proposition 5.1

We proceed by induction and assume that Proposition 5.1 holds up to $n - 1 \ge 1$. Regarding the inductive hypothesis in the case $n = 1$, we simply choose $A_0 \equiv 0$, so that the bounds for $w_0$ in points (b) and (c) of Proposition 5.1 are a simple consequence of (4.8). Furthermore, $\tilde\mu_1 = \mu$ and point (d) follows immediately from (H1). We note that in Sect. 6.1 below, point (a) is established for $n = 1$ by taking $\varepsilon$, namely $\lambda$, small enough. At some point in the induction, however, one is forced to consider nontrivial $A_n$ in order for the inductive bounds to hold uniformly in $n$ for a given $\lambda$. In the sequel, we adopt the convention, for $B$ a ball of radius $r$ centered at the origin, to denote by $\gamma B$ the ball of radius $\gamma r$ centered at the origin.

6.1. Existence of the functional $R_{n\beta}$. With the notations $R = R_{n\beta}$, $\Gamma = \Gamma_n$ and $\tilde w = \tilde w_{(n-1)\beta}$, Eq. (5.1) reads

$$R(z) = \Gamma \tilde w(z + R(z)). \tag{6.1}$$

To prove existence in $H^\infty(B_n, h^{n-1}_s)$ of a solution $R$ to Eq. (6.1), one starts, using the identities $\tilde w(0) = w(0)$ and $\delta_2 \tilde w = \delta_2 w$, by decomposing $\tilde w$ as

$$\tilde w(z) = w(0) + D\tilde w(0) z + \delta_2 w(z), \tag{6.2}$$

to obtain from (6.1),

$$R(z) = \Gamma w(0) + \Gamma D\tilde w(0)(z + R(z)) + \Gamma \delta_2 w(z + R(z)). \tag{6.3}$$

Defining

$$H = \big( 1 - \Gamma D\tilde w(0) \big)^{-1}, \tag{6.4}$$

and using the identity $1 + H\Gamma D\tilde w(0) = H$, one rewrites (6.3) as

$$R(z) = H\Gamma w(0) + H\Gamma D\tilde w(0) z + u(z), \tag{6.5}$$

where

$$u(z) = H\Gamma \delta_2 w(\tilde z) \equiv G(u)(z), \tag{6.6}$$

and

$$\tilde z \equiv z + R(z) = H\big( z + \Gamma w(0) \big) + u(z). \tag{6.7}$$

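Since $\Gamma = \Gamma \hat P_{n-1} = \hat P_{n-1} \Gamma$, (5.17) (with $\sigma = s + \xi - \gamma$) and the recursive bound (5.8) (with $n$ replaced by $n - 1$) imply

$$\|\Gamma D\tilde w(0)\|^{(n-1)}_s \le \|\Gamma D\tilde w(0)\|^{(n-1)}_{s,s+\xi} \le C\varepsilon\eta^{-1}. \tag{6.8}$$

Hence,

$$\|H\|^{(n-1)}_s \le 2, \tag{6.9}$$

for $\varepsilon = \varepsilon(\eta)$ small enough. Since $B_n \subset B_{n-1}$, $\tilde w(0) = w(0)$, and since bounds (5.6) (with $n$ replaced by $n - 1$), (5.17) and (6.8) hold, the existence of $R$ in $H^\infty(B_n, h^{n-1}_s)$ follows from the existence of $u$ in $H^\infty(B_n, h^{n-1}_s)$. For reasons that will become clear in the next section, we actually show that (6.6) has a solution $u$ in the ball

$$\mathcal{B} = \big\{ u \in H^\infty(\tfrac18 B_{n-1}, h^{n-1}_s) \ \big|\ |||u||| \le \sqrt{\varepsilon}\, \eta^{-n} r^{2(n-1)} \big\}. \tag{6.10}$$

This result is stronger, since $B_n \subset \tfrac18 B_{n-1}$ for $r$ small enough. Let us first check that $G$ maps $\mathcal{B}$ into itself. From (6.9) and the recursive bound (5.6), it follows that for all $z \in \tfrac18 B_{n-1}$ and $u \in \mathcal{B}$, $\tilde z \in h^{n-1}_s$ with

$$\|\tilde z\|_s \le 2\big( \tfrac18 r^n + C\varepsilon\eta^{-n} r^{2(n-1)} \big) + \sqrt{\varepsilon}\, \eta^{-n} r^{2(n-1)} \le \tfrac12 r^n,$$

for $\varepsilon = \varepsilon(r, \eta)$ and $r = r(\eta)$ small enough. Hence,

$$\tilde z \in \tfrac12 B_{n-1} \subset B_{n-1} \quad\text{for all } z \in \tfrac18 B_{n-1}, \tag{6.11}$$

and one uses the bound (5.7) to conclude that for all $u \in \mathcal{B}$,

$$|||G(u)||| \le 2C\eta^{-n} \varepsilon r^{2(n-1)} \le \sqrt{\varepsilon}\, \eta^{-n} r^{2(n-1)},$$

for $\varepsilon$ small enough. To show that $G$ is a contraction in $\mathcal{B}$, we apply the estimate (4.11) to the functions $\tilde z_i$ given by (6.7) in terms of $u_i \in \mathcal{B}$, $i = 1, 2$. Noting that $|||\tilde z_i||| \le \tfrac12 r^n$, which follows from (6.11), and using in addition (5.7), one obtains,

$$\begin{aligned} |||G(u_1) - G(u_2)||| &\le 2C\eta^{-n} \sup_{z \in \frac18 B_{n-1}} \|\hat P_{n-1} \delta_2 w(\tilde z_1) - \hat P_{n-1} \delta_2 w(\tilde z_2)\|_s \\ &\le 4C\eta^{-n} r^{-n}\, |||\hat P_{n-1} \delta_2 w|||_{\mathcal{A}_{n-1}} \sup_{z \in \frac18 B_{n-1}} \|\tilde z_1 - \tilde z_2\|_s \\ &\le 4C\varepsilon\eta^{-n} r^{-n} r^{2(n-1)} \sup_{z \in \frac18 B_{n-1}} \|u_1(z) - u_2(z)\|_s \\ &\le \tfrac12 |||u_1 - u_2|||, \end{aligned}$$

for $r = r(\eta)$ and $\varepsilon = \varepsilon(r, \eta)$ small enough.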
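The step from (6.8) to (6.9) is the standard Neumann series bound: if $\|A\| \le \tfrac12$ then $H = (1 - A)^{-1} = \sum_{k \ge 0} A^k$ satisfies $\|H\| \le 2$ and $1 + HA = H$. The following sketch checks both facts numerically for a small matrix. The matrix `A`, the helper names, and the use of the maximal column sum as operator norm are illustrative assumptions and not objects from the paper.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def norm1(X):
    """Operator norm induced by the l^1 vector norm (maximal absolute column sum)."""
    return max(sum(abs(X[i][j]) for i in range(len(X))) for j in range(len(X[0])))


def neumann_inverse(A, terms=80):
    """Approximate H = (1 - A)^{-1} by the truncated series sum_{k < terms} A^k."""
    n = len(A)
    H, power = identity(n), identity(n)
    for _ in range(terms - 1):
        power = matmul(power, A)
        H = [[H[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return H


A = [[0.2, -0.1], [0.3, 0.25]]      # hypothetical stand-in for Gamma * D w~(0); norm1(A) = 0.5
H = neumann_inverse(A)
print(norm1(A), norm1(H))
```

With `norm1(A)` equal to one half the geometric series gives `norm1(H) <= 2`, which is the role played by the smallness condition $C\varepsilon\eta^{-1} \le \tfrac12$ in the text.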



Before turning to part (b) of Proposition 5.1, we make some remarks that shall be useful later. First note that (6.11) means

$$z + R_n(z) \in \tfrac12 B_{n-1} \quad\text{for all } z \in \tfrac18 B_{n-1}. \tag{6.12}$$

Therefore, with

$$\tilde R_m(z) \equiv z + R_m(z), \tag{6.13}$$

and

$$F^m_n(z) \equiv \tilde R_m \circ \tilde R_{m+1} \circ \cdots \circ \tilde R_n(z), \tag{6.14}$$

it follows recursively that for $m = 1, \dots, n$,

$$F^m_n(z) \in \tfrac12 B_{m-1} \quad\text{for all } z \in B_n. \tag{6.15}$$

Furthermore, since $F^1_n = F_n$, where $F_n$ is defined in (3.13), one has $F_n \in \mathcal{A}_n$, together with the uniform bound

$$|||F_n|||_{\mathcal{A}_n} \le |||\tilde R_1|||_{\mathcal{A}_1} \le \varepsilon. \tag{6.16}$$

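6.2. Bounds on the functional $w_n$. According to (5.2), one defines $w_{n\beta}(z) = \tilde w_{(n-1)\beta}(z + R_{n\beta}(z))$. Since $R_{n\beta} \in H^\infty(B_n, h^{n-1}_s)$, it follows from (6.12) and the inductive bounds that for all $\beta$ with $|\operatorname{Im}\beta| < \alpha_{n-1}$, $w_{n\beta}$ is well defined as a map from $B_n$ to $h_s$, with $w_{n\beta} \in \mathcal{A}_n$. In the sequel, we adopt the simplified notation $R = R_{n\beta}$, $w = w_{(n-1)\beta}$ and $w' = w_{n\beta}$. We proceed with proving (5.6). Using the decomposition (6.2) at $z = 0$, one may write $w'(0) = w(0) + D\tilde w(0) R(0) + \delta_2 w(R(0))$. Since (6.12) implies that $R(0) \in \tfrac12 B_{n-1}$, one obtains using the bounds (5.6), (5.7) and (5.8),

$$\|\hat P_n w'(0)\|_s \le \varepsilon r^{2(n-1)} + \tfrac12 \varepsilon\eta^{n-1} r^n + \varepsilon r^{2(n-1)} \le 3\varepsilon. \tag{6.17}$$

This leads to

$$|P^n_{\mathbf{k}} w'(0)(q)|_s \le 3\varepsilon, \tag{6.18}$$

for all $\mathbf{k} \in I^n$ and $q \in S^n_{\mathbf{k}}$. The latter is valid for all $\beta$ with $|\operatorname{Im}\beta| < \alpha_{n-1}$. Let now $\beta'$ with $|\operatorname{Im}\beta'| < \alpha_n$. Then, shifting $\beta'$ to $\beta = \beta' - i(\alpha_{n-1} - \alpha_n)q/|q|$ and using the recursive relation (5.5) for $\alpha_n$, one obtains

$$w'_{\beta'}(0)(q) = e^{i(\beta' - \beta)\cdot q}\, w'_\beta(0)(q) = e^{-n^{-2}\alpha_{n-1}|q|}\, w'_\beta(0)(q). \tag{6.19}$$

Since for such $\beta$ one has $|\operatorname{Im}\beta| < \alpha_{n-1}$, it follows from (6.18) and (6.19) that

$$\|\hat P_n w'(0)\|_s \le 3\varepsilon \sum_{\mathbf{k} \in I^n} \sum_{q \in S^n_{\mathbf{k}}} e^{-n^{-2}\alpha_{n-1}|q|}. \tag{6.20}$$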
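The exponent gained by the shift of $\beta$ can be checked directly from the recursion (5.5). The computation below is a worked verification under one assumption not stated explicitly in this excerpt: $|q|$ is taken to be the Euclidean norm, so that $q\cdot q/|q| = |q|$.

```latex
\begin{align*}
\alpha_{n-1}-\alpha_n
  &= \alpha_{n-1}-(1-n^{-2})\,\alpha_{n-1} = n^{-2}\alpha_{n-1}, \\
i(\beta'-\beta)\cdot q
  &= i\cdot i\,(\alpha_{n-1}-\alpha_n)\,\frac{q\cdot q}{|q|}
   = -(\alpha_{n-1}-\alpha_n)\,|q|
   = -n^{-2}\alpha_{n-1}|q|, \\
|\operatorname{Im}\beta|
  &\le |\operatorname{Im}\beta'| + (\alpha_{n-1}-\alpha_n)
   < \alpha_n + (\alpha_{n-1}-\alpha_n) = \alpha_{n-1}.
\end{align*}
```

The last line shows that the shifted parameter $\beta$ still lies in the strip where the bound (6.18) is available, so the factor $e^{-n^{-2}\alpha_{n-1}|q|}$ is obtained at the cost of reducing the strip width from $\alpha_{n-1}$ to $\alpha_n$.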



From the Diophantine conditions satisfied by $\omega \in \Omega_n(K)$, one infers for $q \in S^n_{\mathbf{k}}$ that $|q| > \min(K\eta^{-n/\nu}, (4K)^{1/\nu}\eta^{-n/\nu})$, cf. (3.25) and (5.3). Therefore, bound (5.6) finally follows by choosing $K$ appropriately, cf. (5.21). We now iterate bound (5.7). Using again the decomposition (6.2), one has $\delta_2 w'(z) = D\tilde w(0)\delta_2 R(z) + \delta_2 \delta_2 w(z + R(z))$. The first term on the right-hand side is estimated by using $\delta_2 R(z) = \delta_2 u(z)$ together with (4.12) applied to $u \in \mathcal{B}$ with $\gamma = 8r$, since $B_n \subset \tfrac18 B_{n-1}$, to obtain

$$\begin{aligned} |||\hat P_n D\tilde w(0)\delta_2 R|||_{\mathcal{A}_n} &\le \|\hat P_{n-1} D\tilde w(0)\|^{(n-1)}_{s,s} \sup_{z \in B_n} \|\delta_2 u(z)\|_s \\ &\le \varepsilon\eta^n\, \frac{(8r)^2}{1 - (8r)^2}\, |||u||| \\ &\le \varepsilon r^{2n}\, \frac{\sqrt{\varepsilon}\, 8^2}{1 - (8r)^2} \le \tfrac12 \varepsilon r^{2n}, \end{aligned}$$

for $\varepsilon$ small enough. In a similar way, one estimates, using (6.12), that

$$\sup_{z \in B_n} \|\hat P_n \delta_2 \delta_2 w(z + R(z))\|_s \le \tfrac12 \varepsilon r^{2n},$$

which finally leads to (5.7).

6.3. Bounds on the derivative. In this section, we prove the estimates stated in part (c) of Proposition 5.1. The main difficulty consists in controlling the diagonal part of the kernel of the derivative $Dw_n$ evaluated at zero, namely $Dw_n(0)(q, q)$, $q \in \mathbb{Z}^d$. To address this problem, as mentioned at the end of Sect. 5, we will use the fact that $Dw_n(0)(q, q)$ depends on $q$ through $\omega\cdot q$ only, and satisfies some continuity property when viewed as a function of $\omega\cdot q$. We start by deriving an a priori bound on the norm of $Dw_n$. From (3.14), one infers that

$$DR_n(z) = H_n(\tilde z)\, \Gamma_n D\tilde w_{n-1}(\tilde z), \tag{6.21}$$

where

$$H_n(\tilde z) = \big( 1 - \Gamma_n D\tilde w_{n-1}(\tilde z) \big)^{-1}, \tag{6.22}$$

and

$$\tilde z = z + R_n(z). \tag{6.23}$$

Since by definition, cf. (3.19), one has $Dw_n(z) = D\tilde w_{n-1}(\tilde z)\big( 1 + DR_n(z) \big)$, (6.21) and the identity $H_n(\tilde z) = 1 + H_n(\tilde z)\Gamma_n D\tilde w_{n-1}(\tilde z)$ imply the recursive relation

$$Dw_n(z) = D\tilde w_{n-1}(\tilde z)\, H_n(\tilde z). \tag{6.24}$$


As in the previous section, it follows from (5.17), (6.12), and the inductive bounds, that $\|H_n(\tilde z)\|^{(n-1)}_s \le 2$ for all $\tilde z \in B_{n-1}$. Therefore, one obtains for all $z \in \tfrac18 B_{n-1}$, using again the inductive bound (5.8),

$$\|\hat P_n Dw_n(z)\|^{(n)}_{s,s} \le \|\hat P_{n-1} Dw_n(z)\|^{(n-1)}_{s,s} \le 2\varepsilon\eta^{n-1}. \tag{6.25}$$

In order to iterate bounds (5.8), we decompose $Dw_n(z)$ as follows:

$$Dw_n(z) = \sigma_n + \tau_n + \delta_1 Dw_n(z), \tag{6.26}$$

where $\sigma_n + \tau_n = Dw_n(0)$ and $\sigma_n(q, q') = Dw_n(0)(q, q')\delta_{qq'}$. Let us consider first the last two terms on the right-hand side of (6.26). One has

Lemma 6.1. Let $r$ and $\varepsilon$ be the positive constants of Proposition 5.1. Then, one has for all $n \ge 0$ and all $z \in B_n$,

$$\|\hat P_n \delta_1 Dw_n(z)\|^{(n)}_{s,s} \le \tfrac12 \varepsilon r^{n/2}, \tag{6.27}$$

$$\|\hat P_n \tau_n\|^{(n)}_{s,s} \le \varepsilon r^n. \tag{6.28}$$

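Proof. Proceeding by induction, we suppose that Proposition 5.1 and Lemma 6.1 are true up to some $n - 1$, $n \ge 1$. We start with (6.27) and compute from $\delta_1 Dw_n(z) = Dw_n(z) - Dw_n(0)$ and the recursive relation (6.24) that

$$\delta_1 Dw_n(z) = \tilde H_n(\tilde z_0) \big( D\tilde w_{n-1}(\tilde z) - D\tilde w_{n-1}(\tilde z_0) \big) H_n(\tilde z),$$

where $\tilde z_0 = R_n(0)$ and $\tilde H_n(\tilde z_0) = 1 + D\tilde w_{n-1}(\tilde z_0) H_n(\tilde z_0) \Gamma_n$. As previously, the inductive bound (5.8) implies $\|\hat P_n \tilde H_n(\tilde z_0)\|^{(n-1)}_s \le 2$. Using (6.12) and $\hat P_n \tilde H_n = \hat P_n \tilde H_n \hat P_{n-1}$, one infers from the identity $D\tilde w_{n-1}(\tilde z) - D\tilde w_{n-1}(\tilde z_0) = \delta_1 D\tilde w_{n-1}(\tilde z) - \delta_1 D\tilde w_{n-1}(\tilde z_0)$ that for all $z \in \tfrac18 B_{n-1}$,

$$\|\hat P_n \delta_1 Dw_n(z)\|^{(n-1,n)}_{s,s} \le C \sup_{z' \in \frac12 B_{n-1}} \|\hat P_{n-1} \delta_1 D\tilde w_{n-1}(z')\|^{(n-1)}_{s,s}.$$

Since $\delta_1 D\tilde w_{n-1} = \delta_1 Dw_{n-1}$, the recursive bound (6.27) leads to

$$\|\hat P_n \delta_1 Dw_n(z)\|^{(n-1,n)}_{s,s} \le C\varepsilon r^{\frac{n-1}{2}},$$

for all $z \in \tfrac18 B_{n-1}$. Finally, iterating bound (6.27) is completed by restricting $z$ to $B_n \subset \tfrac18 B_{n-1}$ and using (4.12) with $\gamma = 8r$. Next, we turn to (6.28), the estimate for the off-diagonal part of $Dw_n(0)$. The norm of $\tau_n$ reads

$$\|\hat P_n \tau_n\|^{(n)}_{s,s} = \sup_{\mathbf{k}' \in I^n} \sup_{q' \in S^n_{\mathbf{k}'}} \sum_{\mathbf{k} \in I^n} \sum_{q \in S^n_{\mathbf{k}}} |P^n_{\mathbf{k}} \tau_n(q, q') P^n_{\mathbf{k}'}|_{s,s},$$

and one infers from (6.27) and the a priori bound (6.25) that

$$|P^n_{\mathbf{k}} \tau_n(q, q') P^n_{\mathbf{k}'}|_{s,s} \le 2\varepsilon\eta^{n-1} + \tfrac12 \varepsilon r^{n/2} \le 3\varepsilon. \tag{6.29}$$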



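The latter is valid for all $\beta$ with $|\operatorname{Im}\beta| < \alpha_{n-1}$. Let now $\beta'$ with $|\operatorname{Im}\beta'| < \alpha_n$. Then, shifting $\beta'$ to $\beta = \beta' - i(\alpha_{n-1} - \alpha_n)(q - q')/|q - q'|$, one obtains

$$\tau_{n\beta'}(q, q') = e^{i(\beta' - \beta)\cdot(q - q')}\, \tau_{n\beta}(q, q') = e^{-n^{-2}\alpha_{n-1}|q - q'|}\, \tau_{n\beta}(q, q'). \tag{6.30}$$

Hence, since $|\operatorname{Im}\beta| < \alpha_{n-1}$ for such $\beta$, (6.29) and (6.30) lead to

$$\|\hat P_n \tau_n\|^{(n)}_{s,s} \le 3\varepsilon \sup_{\mathbf{k}' \in I^n} \sup_{q' \in S^n_{\mathbf{k}'}} \sum_{\mathbf{k} \in I^n} \sum_{\substack{q \in S^n_{\mathbf{k}} \\ q \ne q'}} e^{-n^{-2}\alpha_{n-1}|q - q'|}. \tag{6.31}$$

We now show that every term in the previous sum yields a super-exponentially small factor. Let $q \in S^n_{\mathbf{k}}$ and $q' \in S^n_{\mathbf{k}'}$ for some $\mathbf{k} \in I^n$, $\mathbf{k}' \in I^n$. Then, one estimates using (3.25) and (3.30) that if $\operatorname{sign}(\omega\cdot q) \ne \operatorname{sign}(\omega\cdot q')$,

$$d\big( |\omega\cdot(q - q')|, C^n_{\mathbf{k}} + C^n_{\mathbf{k}'} \big) \le \tfrac12 \eta^n + |I^n_{\mathbf{k}}| + |I^n_{\mathbf{k}'}| \le 3\bar d\eta^n,$$

and that otherwise

$$d\big( |\omega\cdot(q - q')|, |C^n_{\mathbf{k}} - C^n_{\mathbf{k}'}| \big) \le \tfrac12 \eta^n + |I^n_{\mathbf{k}}| + |I^n_{\mathbf{k}'}| \le 3\bar d\eta^n.$$

Therefore, since $q \ne q'$, it follows from (5.3) and $\omega \in \Omega_n(K)$ that

$$|q - q'| \ge \min\Big( \Big( \frac{K}{3\bar d} \Big)^{1/\nu}, K \Big)\, \eta^{-n/\nu}.$$

Hence, the contribution of each term in (6.31) is super-exponentially small, and (6.28) follows for some $r \ll \eta < 1$.

Finally, we turn to $\sigma_n$, the diagonal part of $Dw_n(0)$ in the decomposition (6.26). We first state a result about the continuity properties of the kernel $\sigma_n(q, q)$, namely that $T_p \sigma_n = t_p \sigma_n - \sigma_n$ is of order $|\omega\cdot p|$. More precisely, one has the

Proposition 6.2. Suppose that Proposition 5.1 is valid up to $n - 1$ for some $n \ge 1$. Then, the diagonal part $\sigma_n(z)$ of $Dw_n(z)$ satisfies for all $z \in B_n$ and all $p$ such that $|\omega\cdot p| < \tfrac{1}{16} \eta^{n-1}$,

$$\|\hat P_n T_p \sigma_n(z)\|^{(n)}_{s,s} \le \varepsilon^{3/2} |\omega\cdot p|. \tag{6.32}$$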

Delaying the proof of the above proposition to the end of this section, we now construct a diagonal operator $A_n \in L(h_s, h_s)$ such that $\tilde\sigma_n \equiv \sigma_n - A_n$ obeys

$$\|\hat P_n \tilde\sigma_n\|^{(n)}_{s,s} = \sup_{\mathbf{k} \in I^n} \sup_{q \in S^n_{\mathbf{k}}} |P^n_{\mathbf{k}} \tilde\sigma_n(q, q) P^n_{\mathbf{k}}|_{s,s} \le \tfrac12 \varepsilon\eta^n. \tag{6.33}$$

The equality above follows from the sets $S^n_{\mathbf{k}}$ being pairwise disjoint. This will conclude the proof of iterating (5.8), since (6.27), (6.28) and (6.33) imply that the derivative of $\tilde w_n \equiv w_n - A_n$ satisfies the required bound for $r = r(\eta)$ small enough. In order to obtain bound (6.33) by using the continuity property (6.32), we would like to construct $A_n$ as an approximation of $\sigma_n(q, q)$ for $\omega\cdot q$ close to the normal frequencies in $C^n_{\mathbf{k}}$, $\mathbf{k} \in I^n$. To this end, we set $\bar\mu_{\mathbf{k}}$ to be the center of the interval $I^n_{\mathbf{k}}$ and, using that $\{ \omega\cdot q \mid q \in \mathbb{Z}^d \}$ is dense in $\mathbb{R}$, we choose a sequence $\{q_{l,\mathbf{k}}\}_{l \ge 1} \subset S^n_{\mathbf{k}}$ such that $\omega\cdot q_{l,\mathbf{k}} > 0$ for all $l \ge 1$ and

$$\lim_{l \to \infty} \omega\cdot q_{l,\mathbf{k}} = \bar\mu_{\mathbf{k}}.$$


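Next, one defines the matrix $\hat a_{n,\mathbf{k}} \in L(J^n_{\mathbf{k}})$ by

$$\hat a_{n,\mathbf{k}} \equiv \lim_{l \to \infty} P^n_{\mathbf{k}} \sigma_n(q_{l,\mathbf{k}}, q_{l,\mathbf{k}}) P^n_{\mathbf{k}}. \tag{6.34}$$

Due to (6.32), the limit in (6.34) exists and does not depend on the particular choice of the sequence $\{q_{l,\mathbf{k}}\}_{l \ge 1}$. Finally, setting

$$a_n \equiv \bigoplus_{\mathbf{k} \in I^n} \hat a_{n,\mathbf{k}}, \tag{6.35}$$

we define the operator $A_n : h \to h$ as given by the diagonal kernel

$$A_n(q, q) = a_n I_{Q^+_\omega}(q) + \bar a_n I_{Q^-_\omega}(q) \tag{6.36}$$

for all $q \in \mathbb{Z}^d$. We note that by construction, (5.11) is clearly satisfied. Furthermore, it follows from (5.19) and (5.20) that $a_n$ is indeed hermitian. Let us check that definition (6.36) leads to the required bound (6.33). By construction, one has for all $\mathbf{k} \in I^n$,

$$\lim_{l \to \infty} P^n_{\mathbf{k}} \tilde\sigma_n(q_{l,\mathbf{k}}, q_{l,\mathbf{k}}) P^n_{\mathbf{k}} = 0. \tag{6.37}$$

On the other hand, since $T_p A_n = 0$, bound (6.32) is also satisfied by $\tilde\sigma_n$, which by definition of the norm implies that

$$|P^n_{\mathbf{k}} T_p \tilde\sigma_n(q, q) P^n_{\mathbf{k}}|_{s,s} \le \varepsilon^{3/2} |\omega\cdot p|, \tag{6.38}$$

for all $q \in S^n_{\mathbf{k}}$, $\mathbf{k} \in I^n$, and $p \in \mathbb{Z}^d$ with $|\omega\cdot p| < \tfrac{1}{16} \eta^{n-1}$. The definition of $S^n_{\mathbf{k}}$ together with (3.30) implies that $|\omega\cdot(q - q')| \le 2\bar d\eta^n \le \tfrac{1}{16} \eta^{n-1}$ for all $q, q' \in S^n_{\mathbf{k}}$ with $\operatorname{sign}(\omega\cdot q) = \operatorname{sign}(\omega\cdot q')$ and $\eta$ small enough. Therefore, using

$$\tilde\sigma_n(q, q) = \tilde\sigma_n(q', q') + T_{q - q'} \tilde\sigma_n(q', q'),$$

one infers from (6.38) that for all $q_{l,\mathbf{k}}$ and $q \in S^n_{\mathbf{k}}$ with $\omega\cdot q > 0$,

$$|P^n_{\mathbf{k}} \tilde\sigma_n(q, q) P^n_{\mathbf{k}}|_{s,s} \le |P^n_{\mathbf{k}} \tilde\sigma_n(q_{l,\mathbf{k}}, q_{l,\mathbf{k}}) P^n_{\mathbf{k}}|_{s,s} + \varepsilon^{3/2} |\omega\cdot(q - q_{l,\mathbf{k}})|, \tag{6.39}$$

which, with (6.37), leads to

$$|P^n_{\mathbf{k}} \tilde\sigma_n(q, q) P^n_{\mathbf{k}}|_{s,s} \le 2\bar d\varepsilon^{3/2} \eta^n. \tag{6.40}$$

For $q \in S^n_{\mathbf{k}}$ with $\omega\cdot q < 0$, we note that (6.39) is also valid if one replaces $q_{l,\mathbf{k}}$ by $-q_{l,\mathbf{k}}$, and, due to (5.19), that the same is true of (6.37). Therefore, (6.40) holds for all $q \in S^n_{\mathbf{k}}$, $\mathbf{k} \in I^n$, and bound (6.33) follows by taking $\varepsilon$ small enough. Finally, we check that $A_n$ obeys (5.9). The a priori bound (6.25) together with (6.33) imply that $\|\hat P_n A_n\|^{(n)}_{s,s} \le 3\varepsilon\eta^{n-1}$, which, with (5.11) and definition (6.36), leads to (5.9). To complete the proof of part (c) of Proposition 5.1, we are left with the

Proof of Proposition 6.2. Denoting $Dw_n(z) = \sigma_n(z) + \tau_n(z)$, with $\sigma_n(z)(q, q') = Dw_n(z)(q, q')\delta_{qq'}$, one computes from (6.24) the recursive relation

$$\sigma_n(z) = \big( \tilde\sigma_{n-1}(\tilde z) + T_n(z) \big) H_n(\tilde z), \tag{6.41}$$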



where −1 Hn (˜z) = 1 − 9n σ˜ n−1 (˜z) , Tn (z)(q, q ) = τn (z)9n τn−1 (˜z) (q, q )δqq . Setting Rn (z) ≡ σ˜ n−1 Hn (˜z) − 1 ,

Tn (z) ≡ Tn (z)Hn (˜z),

and using Tp σ˜ n−1 = Tp σn−1 together with the identity Tp σ0 = 0, one applies (6.41) recursively to obtain Tp σn (z) =

n

Tp Rm (zm ) + Tm (zm ) ,

(6.42)

m=1

where zm = Fnm+1 (z), cf (6.14), with Fnn+1 ≡ 1. Note that Rm (z) is diagonal and can be rewritten as Rm (z) = σ˜ m−1 (˜z)9m σ˜ m−1 (˜z)Hm (˜z).

(6.43)

As shown below, each term in (6.42) is easily seen to be of order ε 2 |ω · p|. Thus, the main issue in obtaining (6.32) is to ensure that taking the sum will deteriorate the bound only slightly. Let us first consider the terms involving the quantities Tp Tm . They are higher order terms, since Tm is quadratic in the off-diagonal part τm which, as shown in Lemma 6.1, are bounded by powers of r. Indeed, as carried out in the Appendix, one has for all m = 1, . . . , n and z ∈ Bm , (m) ||Pˆm Tp Tm (z)||s,s ≤ ε2 ηm |ω · p|,

(6.44)

so that n (n) n (m) ˆn Tp Tm (z) ≤ ||Pˆm Tp Tm (z)||s,s ≤ ε2 |ω · p|. P m=1

s,s

(6.45)

m=1

On the other hand, the terms involving Tp Rm are not higher order terms. Since

Tp Hm (˜z) = tp Hm (˜z) Tp 9m tp σ˜ m−1 (˜z) + 9m Tp σ˜ m−1 (˜z) Hm (˜z), (5.18) with σ = s + ξ − γ and n replaced by m yields with the recursive bound (6.32) ≤ η−m |ω · p|. ||Tp Hm (˜z)||(m−2) s

(6.46)

Thus, using in addition the recursive bounds (5.8) and (6.32), together with ||Hm (˜z) − 1||(m−1) = ||9m σ˜ m−1 (˜z)Hm (˜z)||(m−1) ≤ Cε, s s one obtains for all m = 1, . . . , n and z ∈ Bm , (n)

(m)

||Pˆn Tp Rm (z)||s,s ≤ ||Pˆm Tp Rm (z)||s,s ≤ Cε 2 |ω · p|,

(6.47)



to be compared with (6.44). However, one can actually show that n n (n) ˆ Pn Tp Rm (z) ≤ sup sup |Tp Rm (z)(q)|s,s m=1

s,s

k∈I n q∈Skn m=1

≤ Cε2 |ω · p|,

(6.48) (6.49)

with another n-independent constant C. Although (6.47) yields the a priori bound $|T_p R_m(z)(q)|_{s,s} \le C\varepsilon^2 |\omega\cdot p|$ for all $q \in S_k^n$ and $k \in I^n$, (6.49) will follow from the fact that all but a finite number of terms in (6.48) are identically zero. More precisely, there is for all $k \in I^n$ a set $Z_k^n \subset \{1, \dots, n\}$ with $\# Z_k^n$ uniformly bounded in n and k such that for all $q \in S_k^n$,
\[ |T_p R_m(z)(q)|_{s,s} \equiv 0 \quad \text{if} \quad m \notin Z_k^n. \tag{6.50} \]

This leads to (6.49) and concludes the proof of bound (6.32), since (6.42), (6.45) and (6.49) lead to (6.32) by taking ε small enough and by noting that zm ∈ Bm for all z ∈ Bn , cf. (6.15). Identity (6.50) for some finite set Zkn follows from the expression (6.43) for −1 Q P Rm since by localization of scales 9m (q) = (Km m m−1 )(q) = 0 for most m ≤ n if n q ∈ Sk . More precisely, one computes that 1 − χk˜m (q) χ ˜m−1 (q)Pkm Qm (q)Pm−1 (q) = ˜ , km−1

˜ Im k∈

where the index k˜ m−1 serves to denote the (unique) subspace J ˜m−1 containing J ˜m . Fix km−1

now some k ∈ I n . Then one has for all q ∈ Skn and all m < n, χ ˜m−1 (q)Pkm Qm (q)Pm−1 (q) = ˜ = PJ m−1 \J m , m ˜ k∈I k˜ =km

km−1

km−1

k

km

since by construction χkmm (q) = 1 for such m and q. Therefore, Qm (q)Pm−1 (q) = 0 . On the other hand, Jkmm is a strict for all q ∈ Skn if m < n is such that Jkmm = Jkm−1 m−1

only if #Ckmm < #Ckm−1 , i.e., if the eigenvalues contained in Ckm−1 subspace of Jkm−1 m−1 m−1 m−1 have been divided after perturbation by am−1 into two (or more) clusters. But this can be true only for finitely many m since the original eigenvalues µk are finitely many times degenerate. Hence, there is an L < ∞ such that for all n ≥ 1 and all 1 ≤ m ≤ n, one has Pˆn Rm (q) = 0, except for some m1 , . . . , mL . Since the same is true of Pˆn tp Rm (q) provided that p satisfies |ω · p| < ηn−1 /16, (6.50) follows.

6.4. The cluster decomposition. We now check that point (d) of Proposition 5.1 holds. First, (5.9), (5.10) and (5.11) lead to, for k = (k, ·) ∈ I n , |an Pkn |L(Jkn ) ≤ 3k γ −ξ εηn−1 ,

(6.51)

which, since µ2k ≥ ck 2γ by hypothesis (H1), implies that µ2 + nm=0 am ≡ µ˜ 2n+1 is positive definite for ε = ε(c, η) small enough. Next, it follows from an being hermitian that σ (µ˜ n+1 ) ⊂ R+ . Furthermore, using (5.11) and the fact that Jkn is by definition an



invariant subspace for µ˜ n , one infers from µk ≥ ck γ , the asymptotic (5.13) for µ˜ n , and the estimate (6.51), that n −1 −ξ n−1 |an µ˜ −1 . n Pk |L(Jkn ) ≤ 3c k εη

ˆ∞ = Therefore, denoting by Pk the projector onto the k th component of R one obtains

1 µ˜ n+1 Pk = µ˜ 2n + an 2 Pk = µ˜ n Pk + O(k −ξ εηn−1 ),

k≥1 C

dk ,

(6.52)

which, since µPk = µk 1dk , implies by recursion that µ˜ n+1 Pk = µk 1dk + O(εk −ξ ). Hence, the asymptotic (5.13) holds, where for each k ≥ 1 the sequence of clusters n+1 Ck,i , i = 1, . . . , Mkn+1 , forms a partition of the component σ (µ˜ n+1 Pk ) satisfying n+1 n+1 d(Ck,i , Ck,j ) > ηn+1 for i = j . This partition is unique if Mkn+1 is required to be maximal. Furthermore, it follows from (1.13) and (1.14) in (H1) that for ε = ε(c) small enough, the components σ (µ˜ n+1 Pk ) are well separated. Therefore, the sets Skn+1 , k ∈ I n+1 , defined according to (3.25) are pairwise disjoint. Next, (6.52) and the gap condition (5.12) with n + 1 replaced by n imply that for ε = ε(c, η) small enough, n+1 every cluster Ck,i is composed of perturbed eigenvalues belonging to a unique C n n+1 . k,ji

The distance between these two clusters is at most of order O(k −ξ εηn−1 ), so that (3.31) follows for n + 1 by induction. In order to iterate (3.32), we note that by definition, Jkn+1 is the eigenspace of µ˜ n+1 associated with Ckn+1 , k ∈ I n+1 , and that every Jkn , k ∈ I n , n+1 is also an invariant subspace for µ˜ n+1 by (5.11). Therefore, each Jk,i is contained in n n a unique J n+1 , namely, the eigenspace associated with C n+1 . Finally, we check that k,ji

k,ji

n+1 (3.33) iterates. This is a simple consequence of (3.32) and Sk,i ⊂ Sn

k,jin+1

following from (6.52) for ε small enough.

, the latter

7. Measure Estimate In this section, we prove Proposition 5.2, namely, that +∗ (K) =

n≥1 +n (K)

lim meas(+ \ +∗ (K)) = 0,

K→0

satisfies (7.1)

for all bounded + ⊂ Rd . The strategy is standard and consists in studying the complementary sets of +n (K). For n ≥ 1, b > 0, and q ∈ Zd , let us define

n;k n;k,k n Oq,b ∪ , ≡ Oq,b Oq,b k∈I n

k,k ∈I n

where n;k Oq,b = {ω ∈ Rd | d(|ω · q|, Ckn ) ≤ b},

n;k,k = {ω ∈ Rd | d(|ω · q|, |Ckn ± Ckn |) ≤ b}. Oq,b


Next, with
\[ Z_n \equiv \{ q \in \mathbb{Z}^d \mid K^{1/\nu}\eta^{-(n-1)/\nu} \le |q| < K^{1/\nu}\eta^{-n/\nu} \} \]
and
\[ O^*(K) \equiv \bigcup_{n\ge 1}\, \bigcup_{q\in Z_n} O^n_{q,\,2K|q|^{-\nu}}, \]

one shows first, that for all bounded + ⊂ Rd , ξ meas + ∩ O ∗ (K) ≤ C+ K ξ +1 ,

(7.2)

for some constant C+ depending on + only, and, second, that ∗ c O (K) ⊆ +∗ (K).

(7.3)

Obviously, (7.1) follows from (7.2) and (7.3). Below, C+ will denote a generic constant that may change from place to place but depends on + only. Let us start with the bound (7.2). One has n n meas + ∩ O ∗ (K) ≤ Tq,2K|q|−ν + Tˆq,2K|q| (7.4) −ν , n≥1 q∈Zn

where n Tq,b =

k∈I n

n;k ˆ n n;k,k . , Tq,b = meas + ∩ meas + ∩ Oq,b Oq,b

(7.5)

k,k ∈I n

n , we first To treat the terms on the right-hand side of (7.4) involving the quantities Tq,b use (3.30) to estimate, n;k ¯ n ). meas + ∩ Oq,b ≤ C+ (b + dη

Next, we note that the asymptotic behavior of the clusters Ckn , cf. (1.12) and (5.13), n;k is empty if k = (k, ·) satisfies k ≥ C+ |q| for some constant C+ . implies that + ∩ Oq,b Hence, since the number of indices k of the form (k, ·) is uniformly bounded in k, the n is proportional to |q|, and number of terms which are non-zero in the sum defining Tq,b n n ¯ ). Finally, the fact that q ∈ Zn satisfies one obtains the estimate Tq,b ≤ C+ |q|(b + dη n −ν η ≤ K|q| leads to n ¯ Tq,2K|q| |q|1−ν ≤ C+ K, (7.6) −ν ≤ C+ 2K + dK n≥1 q∈Zn

q∈Zd

for ν = ν(d) large enough. To treat the remaining terms in (7.4), we first note that, as above, n;k,k ¯ n ). (7.7) ≤ C+ (b + 2dη meas + ∩ Oq,b Next, one distinguishes the cases γ = 1 and γ > 1. If γ > 1, then for k > k the n;k,k is empty inequality k γ − k γ > k γ −1 and the asymptotic (1.13) imply that + ∩ Oq,b



for k = (k, ·) and k = (k , ·) such that k ≥ C+ |q|1/(γ −1) ≡ kq . Furthermore, it follows from (5.13) that for kb = b− ξ +1 ,

n;(k,i),(k,j ) −ξ meas Oq,b ≤ Ckb . 1

k>Ckb i,j

Therefore, one obtains with (7.7), ξ

n ≤ Cb ξ +1 + Tˆq,b

Ckb k=1

n;(k,i),(k,j )

meas(+ ∩ Oq,b

)+

kq k =2 k 0 and ν = ν(d, ξ ) large enough. We now consider the case γ = 1. From n;k,k is empty for (5.13) and the asymptotic behavior (1.14), it follows first that + ∩ Oq,b k = (k, ·) and k = (k , ·) with k − k = l ≥ C|q|, and second that for all l ≥ 0,

n;(k,i),(k+l,j ) −ξ meas Oq,b ≤ Ckb , k>Ckb i,j

where kb = b− ξ +1 . Therefore, (7.7) leads to 1

ξ

n ¯ n ), Tˆq,b ≤ C|q|b ξ +1 + C+ b− ξ +1 |q|(b + 2dη 1

and one finally obtains for ν = ν(d, ξ ) large enough, n≥1 q∈Zn

ξ

n 1+ξ Tˆq,2K|q| −ν ≤ C+ K

ξ

ξ

|q|1−ν 1+ξ ≤ C+ K 1+ξ .

(7.9)

q∈Zd

Inserting (7.6) and (7.9) into (7.4) yields (7.2). We now check that (7.3) holds. If $\omega \notin O^*(K)$, then the following is true for all $n \ge 1$, $q \in Z_n$ and $k, k' \in I^n$:
\[ d(|\omega\cdot q|, C_k^n) > 2K|q|^{-\nu}, \tag{7.10} \]
\[ d(|\omega\cdot q|, |C_k^n \pm C_{k'}^n|) > 2K|q|^{-\nu}. \tag{7.11} \]
Next, we verify that for such $\omega$, this implies that bounds (7.10) and (7.11) hold for all $q \in \bigcup_{m=1}^{n} Z_m$ provided one replaces the constant 2K on the right-hand side by K. This



in turn implies that ω ∈ +n (K) for all n ≥ 1, so that ω ∈ +∗ (K). Let m < n and fix some k ∈ I n . Then, recalling (3.31), namely that there is at least one k ∈ I m for which sup infm d(x, y) ≤ ηm+1 ,

x∈Ikn y∈Ik

and since, on the other hand, ηm < K|q|−ν whenever q ∈ Zm , one infers from (7.10) with n replaced by m that for q ∈ Zm and η < 1, d(|ω · q|, Ckn ) ≥ d(|ω · q|, Ckm ) − ηm+1 ≥ (2K − ηK)|q|−ν

> K|q|

−ν

(7.12)

.

Since (7.12) holds for all q ∈ Zm , 1 ≤ m ≤ n, one concludes that d(|ω · q|, Ckn ) > K|q|−ν whenever 0 < |q| < Kη−n/ν . In a similar way, one derives an identical lower bound on d(|ω · q|, |Ckn ± Ckn |), thus achieving the proof of (7.3) and (7.1). 8. Proof of Theorem 1.1 Defining zn ≡ Fn (0), we now show that zn converges in hs , as n → ∞, to a function z whose Fourier transform is real analytic and provides a solution of Eq. (3.4). Using Fn (0) = Fn−1 (Rn (0)), cf. (3.13), one computes that zn − zn−1 = δ1 Fn−1 Rn (0) . According to (6.5), Rn (0) = Hn 9n wn−1 (0) + u(0), so that (5.6), (5.17), (6.9), (6.10) and the identity 9n = 9n Pˆn−1 lead to ||Rn (0)||hn−1 ≤ η−n r 2(n−1) . s Therefore, since, Fn−1 ∈ An−1 = H ∞ (Bn−1 , hs ), one can apply (4.12) to δ1 Fn−1 with γ = η−n r n−2 to obtain ||zn − zn−1 ||s ≤ Cη−n r n−2 |||Fn−1 |||An−1 , and the convergence of zn in hs follows from the uniform bound (6.16) by taking r = r(η) small enough. Bound (6.16) also implies ||zβ || ≤ ε uniformly in the strip | Im β| < α = −2 α ∞ n=2 (1 − n ). This yields the pointwise estimate

|z(q)| ≤ εe−α |q| , and, consequently, ensures the real analyticity of the Fourier transform of z. In order to prove that the limit z solves Eq. (3.6), namely, K0 z = w0 (z), we will show below that K0 zn = Qn w0 (zn ) + A 1, but with the transformation q → 1/q, α → α ∗ , β → qβ and z → z, we get a sphere which is C ∗ -isomorphic to one for |q| < 1. It is clear that the quotient of the C ∗ -algebra Aq by the ideal generated by z can be identified with the C ∗ -algebra of the compact quantum group SUq (2). However, in this paper we shall not make any use of additional structures (like coproduct, counit, and antipode) coming from SUq (2). In [19] it was shown that for q ∈ (−1, 0) ∪ (0, 1) the spaces SUq (2) are all homeomorphic in the sense that the corresponding C ∗ -algebras are isomorphic. Then, for q ∈ (−1, 0) ∪ (0, 1), all our C ∗ -algebras Aq are isomorphic as well and all corresponding spheres are homeomorphic. For the generic situation when −1 < q < 0 or 0 < q < 1 any character χ of Aq has to satisfy the equations χ (α ∗ ) = χ (α), χ (β) = 0

and

χ (β ∗ ) = χ (β),

χ (z∗ ) = χ (z),

|χ (α)|2 + (χ (z))2 = 1.

(4)

To show that the space of all characters is homeomorphic to the two dimensional sphere S 2 , we take a generic α ∈ C and z ∈ R such that |α|2 + z2 = 1. Then, from the general considerations presented above, there is a 1-dimensional representation (that is a character) χ of Aq such that χ (α) = α , χ (β) = 0 and χ (z) = z and this proves the homeomorphism in question. Hence, for −1 < q < 0 or 0 < q < 1 the space of (nonzero) characters of Aq , which can be thought of as the space of “classical points” of Sq4 , is homeomorphic to the classical S 2 . For the particular case q = 1 the algebra of the sphere Sq4 is commutative. The associated space of characters is homeomorphic to the 4-dimensional sphere S 4 . Indeed any character χ of Aq=1 satisfies the equations χ (α ∗ ) = χ (α), and

χ (β ∗ ) = χ (β),

χ (z∗ ) = χ (z),

|χ (α)|2 + |χ (β)|2 + (χ (z))2 = 1.

(5)

To show that any element of S 4 arises in this way, similarly to what we did before we take generic α , β ∈ C and z ∈ R such that |α |2 + |β |2 + z2 = 1. Thus they satisfy relations (5) (or relations (1) for q = 1) and there is a 1-dimensional representation χ of Aq (q = 1) such that χ (α) = α , χ (β) = β and χ (z) = z . This proves the homeomorphism in question and shows that the algebra Aq for q = 1 can be identified with the algebra of all continuous functions on the 4-dimensional sphere S 4 . It is in this sense that Sq4 provides a deformation of the classical S 4 . Next, we describe irreducible representations of the algebra Aq (for −1 < q < 0 or 0 < q < 1) as bounded operators on an infinite dimensional Hilbert space H with an orthonormal basis {ψn , n = 0, 1, 2, · · · }. With λ ∈ C, |λ| ≤ 1, we get two families of


L. Dąbrowski, G. Landi, T. Masuda

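representations $\pi_{\lambda,\pm} : A_q \to B(H)$ given by
\[
\begin{aligned}
\pi_{\lambda,\pm}(z)\,\psi_n &= \pi_{\lambda,\pm}(z^*)\,\psi_n = \pm\sqrt{1-|\lambda|^2}\;\psi_n, \\
\pi_{\lambda,\pm}(\alpha)\,\psi_n &= \lambda\sqrt{1-q^{2(n+1)}}\;\psi_{n+1}, \\
\pi_{\lambda,\pm}(\alpha^*)\,\psi_n &= \bar\lambda\sqrt{1-q^{2n}}\;\psi_{n-1}, \\
\pi_{\lambda,\pm}(\beta)\,\psi_n &= \lambda q^n\,\psi_n, \\
\pi_{\lambda,\pm}(\beta^*)\,\psi_n &= \bar\lambda q^n\,\psi_n.
\end{aligned}
\tag{6}
\]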
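As a numerical sanity check (not part of the original text), the formulas (6) can be realized as finite matrices by truncating the basis to $\psi_0, \dots, \psi_{N-1}$. The sketch below assumes that the sphere relation takes the form $\alpha\alpha^* + \beta\beta^* + z^2 = 1$, an ordering chosen here only because it is compatible with (6) and with the character equations. The function name `representation`, the truncation size and the sample values of $\lambda$ and q are illustrative choices.

```python
import numpy as np

def representation(lam, q, sign=1, size=8):
    """Truncated matrices of pi_{lam,+-} on span(psi_0, ..., psi_{size-1}), following (6)."""
    n = np.arange(size)
    alpha = np.zeros((size, size), dtype=complex)
    alpha_star = np.zeros((size, size), dtype=complex)
    # alpha psi_n = lam * sqrt(1 - q^(2(n+1))) psi_{n+1}; the image of the last vector is cut off
    alpha[n[1:], n[:-1]] = lam * np.sqrt(1 - q ** (2 * n[1:]))
    # alpha^* psi_n = conj(lam) * sqrt(1 - q^(2n)) psi_{n-1}
    alpha_star[n[:-1], n[1:]] = np.conj(lam) * np.sqrt(1 - q ** (2 * n[1:]))
    beta = np.diag(lam * q ** n)
    beta_star = np.diag(np.conj(lam) * q ** n)
    z = sign * np.sqrt(1 - abs(lam) ** 2) * np.eye(size)
    return alpha, alpha_star, beta, beta_star, z

alpha, alpha_star, beta, beta_star, z = representation(lam=0.6 + 0.3j, q=0.5)

# The operators written for alpha^* and beta^* are the matrix adjoints of alpha and beta.
adjoint_ok = (np.allclose(alpha_star, alpha.conj().T)
              and np.allclose(beta_star, beta.conj().T))

# Assumed sphere relation: alpha alpha^* + beta beta^* + z^2 = 1.
sphere = alpha @ alpha_star + beta @ beta_star + z @ z
sphere_ok = np.allclose(sphere, np.eye(len(z)))
```

With this ordering the truncation introduces no boundary error, since $\alpha\alpha^*$ is diagonal with entries $|\lambda|^2(1-q^{2n})$ on every retained basis vector.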

To be precise, for λ such that |λ| = 1, the two representations πλ,+ and πλ,− are identical so that, in fact, we have a family of representations parametrized by points on a classical sphere S 2 , similarly to what happens for one dimensional representations (characters) as described before. As mentioned already, the quotient of the C ∗ -algebra Aq by the ideal generated by z is the C ∗ -algebra of the compact quantum group SUq (2). Then, with |λ| = 1, the representations πλ,+ = πλ,− =: πλ yield representations of SUq (2) which are unitary equivalent to the ones constructed by Woronowicz (see for instance ([21])).

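3. The Instanton and Its Classes

Consider now the following element e in the algebra $\mathrm{Mat}_4(A_q) \simeq \mathrm{Mat}_4(\mathbb{C}) \otimes A_q$:
\[
e = \frac{1}{2}
\begin{pmatrix}
1+z & 0 & \alpha & \beta \\
0 & 1+z & -q\beta^* & \alpha^* \\
\alpha^* & -q\beta & 1-z & 0 \\
\beta^* & \alpha & 0 & 1-z
\end{pmatrix}.
\tag{7}
\]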
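In the commutative case q = 1 the entries of (7) are ordinary complex numbers, so the projection property can be tested numerically. The following sketch (an illustration added here, with hypothetical helper names) evaluates the matrix at a random point of the classical sphere $|\alpha|^2 + |\beta|^2 + z^2 = 1$ and checks that it is a selfadjoint idempotent of trace 2. It says nothing about the noncommutative case, where the relations (1) are needed.

```python
import numpy as np

def instanton_projector(alpha, beta, z, q=1.0):
    """Matrix (7) with alpha, beta replaced by complex numbers and z by a real number."""
    a, b = complex(alpha), complex(beta)
    return 0.5 * np.array([
        [1 + z, 0, a, b],
        [0, 1 + z, -q * np.conj(b), np.conj(a)],
        [np.conj(a), -q * b, 1 - z, 0],
        [np.conj(b), a, 0, 1 - z],
    ], dtype=complex)

def random_point_on_s4(rng):
    """Random (alpha, beta, z) with |alpha|^2 + |beta|^2 + z^2 = 1."""
    v = rng.normal(size=5)
    v /= np.linalg.norm(v)
    return complex(v[0], v[1]), complex(v[2], v[3]), float(v[4])

rng = np.random.default_rng(0)
alpha, beta, z = random_point_on_s4(rng)
e = instanton_projector(alpha, beta, z)

is_idempotent = np.allclose(e @ e, e)
is_selfadjoint = np.allclose(e, e.conj().T)
rank = int(round(np.trace(e).real))  # the trace of a projection equals its rank
```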

Using the relations (1) it can be verified that e is a selfadjoint idempotent (projection), $e^2 = e = e^*$. It operates on the right $A_q$-module $A_q^4 = A_q \otimes \mathbb{C}^4$ and its range may be thought of as sections of a vector bundle over $S_q^4$. It is easy to see that $eA_q^4$ is a deformation of the classical instanton bundle over $S^4$ in the sense that for q = 1, the module $eA_q^4$ is the module of sections of the complex rank two instanton bundle over $S^4$ [1]. Next, we compute the Chern–Connes Character of the idempotent e given in (7). If $\langle\,\cdot\,\rangle$ is the projection on the commutant of 4 × 4 matrices, up to normalization the components of the (reduced) Chern–Connes Character are given by
\[ \mathrm{ch}_n(e) = \Big\langle \big(e - \tfrac{1}{2}\big) \otimes \underbrace{e \otimes \cdots \otimes e}_{2n} \Big\rangle, \qquad n = 0, 1, 2, \dots, \tag{8} \]
and they are elements of
\[ A_q \otimes \underbrace{\bar A_q \otimes \cdots \otimes \bar A_q}_{2n}, \tag{9} \]
where $\bar A_q = A_q/\mathbb{C}I$ is the quotient of the algebra $A_q$ by the scalar multiples of the unit.

Instantons on the Quantum 4-Spheres Sq4


The crucial property of the components $\mathrm{ch}_n(e)$ is that they define a cycle in the (b, B) bicomplex of cyclic homology [3, 12], that is,
\[ B\,\mathrm{ch}_n(e) = b\,\mathrm{ch}_{n+1}(e). \tag{10} \]
The operator b is defined by
\[ b(a_0 \otimes a_1 \otimes \cdots \otimes a_m) = \sum_{j=0}^{m-1} (-1)^j a_0 \otimes \cdots \otimes a_j a_{j+1} \otimes \cdots \otimes a_m + (-1)^m a_m a_0 \otimes a_1 \otimes \cdots \otimes a_{m-1}, \tag{11} \]
while the operator B is written as
\[ B = A B_0, \tag{12} \]

where B0 (a0 ⊗ a1 ⊗ · · · ⊗ am ) = I ⊗ a0 ⊗ a1 ⊗ · · · ⊗ am , m 1 A(a0 ⊗ a1 ⊗ · · · ⊗ am ) = (−1)mj aj ⊗ aj +1 ⊗ · · · ⊗ aj −1 , m

(13) (14)

j =0

with the obvious cyclic identification m + 1 = 0. To be precise, in formulæ (11), (13) and (14), all elements in the tensor products but the first one should be taken modulo complex multiples of the unit I, that is one has to project onto A¯q = Aq /CI. For the 0th component of the Chern–Connes Character of the idempotent (7) on the spheres Sq4 we find,

\[ \mathrm{ch}_0(e) = \big\langle e - \tfrac{1}{2} \big\rangle = 0. \tag{15} \]

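This could be interpreted as saying that the idempotent and the corresponding module (the “vector bundle”) has complex rank equal to 2. Next for the 1st component we have,
\[
\begin{aligned}
\mathrm{ch}_1(e) &= \big\langle \big(e - \tfrac{1}{2}\big) \otimes e \otimes e \big\rangle \\
&= \tfrac{1}{8}(1 - q^2) \big\{ z \otimes (\beta \otimes \beta^* - \beta^* \otimes \beta) + \beta^* \otimes (z \otimes \beta - \beta \otimes z) + \beta \otimes (\beta^* \otimes z - z \otimes \beta^*) \big\}.
\end{aligned}
\tag{16}
\]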

It is straightforward to check that bch1 (e) = 0 = Bch0 (e).

(17)
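The boundary operator b of formula (11) is easy to experiment with. The sketch below is an illustration only: it works in a free algebra, where a generator word is a Python string and the product is concatenation, and in the unreduced complex, so it does not use the relations (1) of $A_q$ and does not reproduce (17). It checks the general identity $b \circ b = 0$, which holds for any associative algebra. The chain encoding and the name `hochschild_b` are assumptions of this example.

```python
from collections import defaultdict

def hochschild_b(chain):
    """Apply b of formula (11) to a chain given as {tuple_of_words: integer coefficient}."""
    out = defaultdict(int)
    for tensor, coeff in chain.items():
        m = len(tensor) - 1
        # Terms (-1)^j a_0 x ... x a_j a_{j+1} x ... x a_m for j = 0, ..., m-1.
        for j in range(m):
            merged = tensor[:j] + (tensor[j] + tensor[j + 1],) + tensor[j + 2:]
            out[merged] += (-1) ** j * coeff
        # Cyclic term (-1)^m a_m a_0 x a_1 x ... x a_{m-1}; b vanishes on 0-chains.
        if m >= 1:
            wrapped = (tensor[m] + tensor[0],) + tensor[1:m]
            out[wrapped] += (-1) ** m * coeff
    return {t: c for t, c in out.items() if c != 0}

one_chain = {("a", "b"): 1}
commutator = hochschild_b(one_chain)  # b(a x b) = ab - ba

three_chain = {("a", "b", "c", "d"): 1}
b_squared = hochschild_b(hochschild_b(three_chain))  # every term cancels
```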



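Finally, the 2nd component
\[ \mathrm{ch}_2(e) = \big\langle \big(e - \tfrac{1}{2}\big) \otimes e \otimes e \otimes e \otimes e \big\rangle \tag{18} \]
can be written as a sum of five terms
\[ \mathrm{ch}_2(e) = \tfrac{1}{32} \big( z \otimes c_z + \alpha \otimes c_\alpha + \alpha^* \otimes c_{\alpha^*} + \beta \otimes c_\beta + \beta^* \otimes c_{\beta^*} \big), \tag{19} \]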

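with
\[
\begin{aligned}
c_z ={}& (1 - q^4)(\beta \otimes \beta^* \otimes \beta \otimes \beta^* - \beta^* \otimes \beta \otimes \beta^* \otimes \beta) \\
&+ (1 - q^2) \big\{ z \otimes z \otimes (\beta \otimes \beta^* - \beta^* \otimes \beta) + (\beta \otimes z \otimes z \otimes \beta^* - \beta^* \otimes z \otimes z \otimes \beta) \\
&\qquad + (\beta \otimes \beta^* - \beta^* \otimes \beta) \otimes z \otimes z + z \otimes (\beta \otimes \beta^* - \beta^* \otimes \beta) \otimes z \\
&\qquad - z \otimes (\beta \otimes z \otimes \beta^* - \beta^* \otimes z \otimes \beta) - (\beta \otimes z \otimes \beta^* - \beta^* \otimes z \otimes \beta) \otimes z \big\} \\
&+ (\alpha \otimes \alpha^* - q^2 \alpha^* \otimes \alpha) \otimes (\beta \otimes \beta^* - \beta^* \otimes \beta) \\
&+ (\beta \otimes \beta^* - \beta^* \otimes \beta) \otimes (\alpha \otimes \alpha^* - q^2 \alpha^* \otimes \alpha) \\
&+ (\beta \otimes \alpha - q\alpha \otimes \beta) \otimes (\alpha^* \otimes \beta^* - q\beta^* \otimes \alpha^*) \\
&+ (\alpha^* \otimes \beta^* - q\beta^* \otimes \alpha^*) \otimes (\beta \otimes \alpha - q\alpha \otimes \beta) \\
&+ (\alpha^* \otimes \beta - q\beta \otimes \alpha^*) \otimes (q\alpha \otimes \beta^* - \beta^* \otimes \alpha) \\
&+ (q\alpha \otimes \beta^* - \beta^* \otimes \alpha) \otimes (\alpha^* \otimes \beta - q\beta \otimes \alpha^*);
\end{aligned}
\tag{20}
\]
\[
\begin{aligned}
c_\alpha ={}& (z \otimes \alpha^* - \alpha^* \otimes z) \otimes (\beta^* \otimes \beta - \beta \otimes \beta^*) \\
&+ q^2 (\beta^* \otimes \beta - \beta \otimes \beta^*) \otimes (z \otimes \alpha^* - \alpha^* \otimes z) \\
&+ q (z \otimes \beta - \beta \otimes z) \otimes (\alpha^* \otimes \beta^* - q\beta^* \otimes \alpha^*) \\
&+ (\alpha^* \otimes \beta^* - q\beta^* \otimes \alpha^*) \otimes (z \otimes \beta - \beta \otimes z) \\
&+ q (\beta^* \otimes z - z \otimes \beta^*) \otimes (\alpha^* \otimes \beta - q\beta \otimes \alpha^*) \\
&+ (\alpha^* \otimes \beta - q\beta \otimes \alpha^*) \otimes (\beta^* \otimes z - z \otimes \beta^*);
\end{aligned}
\tag{21}
\]
\[
\begin{aligned}
c_{\alpha^*} ={}& q^2 (z \otimes \alpha - \alpha \otimes z) \otimes (\beta \otimes \beta^* - \beta^* \otimes \beta) \\
&+ (\beta \otimes \beta^* - \beta^* \otimes \beta) \otimes (z \otimes \alpha - \alpha \otimes z) \\
&+ (\beta^* \otimes z - z \otimes \beta^*) \otimes (\beta \otimes \alpha - q\alpha \otimes \beta) \\
&+ q (\beta \otimes \alpha - q\alpha \otimes \beta) \otimes (\beta^* \otimes z - z \otimes \beta^*) \\
&+ (z \otimes \beta - \beta \otimes z) \otimes (\beta^* \otimes \alpha - q\alpha \otimes \beta^*) \\
&+ q (\beta^* \otimes \alpha - q\alpha \otimes \beta^*) \otimes (z \otimes \beta - \beta \otimes z);
\end{aligned}
\tag{22}
\]
\[
\begin{aligned}
c_\beta ={}& (1 - q^4) \big\{ (\beta^* \otimes z - z \otimes \beta^*) \otimes \beta \otimes \beta^* + \beta^* \otimes \beta \otimes (\beta^* \otimes z - z \otimes \beta^*) \big\} \\
&+ (1 - q^2) \big\{ \beta^* \otimes z \otimes z \otimes z - z \otimes \beta^* \otimes z \otimes z + z \otimes z \otimes \beta^* \otimes z - z \otimes z \otimes z \otimes \beta^* \big\} \\
&+ (\beta^* \otimes z - z \otimes \beta^*) \otimes (\alpha \otimes \alpha^* - q^2 \alpha^* \otimes \alpha) \\
&+ (\alpha \otimes \alpha^* - q^2 \alpha^* \otimes \alpha) \otimes (\beta^* \otimes z - z \otimes \beta^*) \\
&+ (\alpha \otimes z - z \otimes \alpha) \otimes (\alpha^* \otimes \beta^* - q\beta^* \otimes \alpha^*) \\
&+ q (\alpha^* \otimes \beta^* - q\beta^* \otimes \alpha^*) \otimes (\alpha \otimes z - z \otimes \alpha) \\
&+ (\beta^* \otimes \alpha - q\alpha \otimes \beta^*) \otimes (\alpha^* \otimes z - z \otimes \alpha^*) \\
&+ q (\alpha^* \otimes z - z \otimes \alpha^*) \otimes (\beta^* \otimes \alpha - q\alpha \otimes \beta^*);
\end{aligned}
\tag{23}
\]
\[
\begin{aligned}
c_{\beta^*} ={}& (1 - q^4) \big\{ (z \otimes \beta - \beta \otimes z) \otimes \beta^* \otimes \beta + \beta \otimes \beta^* \otimes (z \otimes \beta - \beta \otimes z) \big\} \\
&+ (1 - q^2) \big\{ - \beta \otimes z \otimes z \otimes z + z \otimes \beta \otimes z \otimes z - z \otimes z \otimes \beta \otimes z + z \otimes z \otimes z \otimes \beta \big\} \\
&+ (z \otimes \beta - \beta \otimes z) \otimes (\alpha \otimes \alpha^* - q^2 \alpha^* \otimes \alpha) \\
&+ (\alpha \otimes \alpha^* - q^2 \alpha^* \otimes \alpha) \otimes (z \otimes \beta - \beta \otimes z) \\
&+ q (z \otimes \alpha^* - \alpha^* \otimes z) \otimes (\beta \otimes \alpha - q\alpha \otimes \beta) \\
&+ (\beta \otimes \alpha - q\alpha \otimes \beta) \otimes (z \otimes \alpha^* - \alpha^* \otimes z) \\
&+ q (\alpha^* \otimes \beta - q\beta \otimes \alpha^*) \otimes (z \otimes \alpha - \alpha \otimes z) \\
&+ (z \otimes \alpha - \alpha \otimes z) \otimes (\alpha^* \otimes \beta - q\beta \otimes \alpha^*).
\end{aligned}
\tag{24}
\]
By using the relations (1) for our algebra, and remembering that we need to project on $\bar A_q$ in all terms of the tensor product but the first one, a long (one needs to compute 750 terms) but straightforward computation gives
\[
b\,\mathrm{ch}_2(e) = \tfrac{1}{16}(1 - q^2) \big\{ I \otimes z \otimes (\beta \otimes \beta^* - \beta^* \otimes \beta) + I \otimes \beta \otimes (\beta^* \otimes z - z \otimes \beta^*) + I \otimes \beta^* \otimes (z \otimes \beta - \beta \otimes z) \big\},
\tag{25}
\]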

and this is exactly equal to $B\,\mathrm{ch}_1(e)$.

4. Final Remarks

There are several directions in which one can proceed and we just mention some of them. It would clearly be very interesting to study differential calculi on our quantum 4-sphere and develop Yang–Mills theory. Another natural question is to what extent the sphere $S_q^4$ could be endowed with a structure of a metric noncommutative manifold which fulfills (some of) the related axioms [5, 6]. In particular one should construct an appropriate Dirac operator. This will probably be possible along the lines of [8] where it was suggested that the true Dirac operator D for the quantum $SU_q(2)$ (and also for the quantum Podleś 2-sphere $S_q^2$ [16]) should satisfy an equation of the form
\[ \frac{q^{2D} - q^{-2D}}{q^2 - q^{-2}} = Q, \tag{26} \]

where Q is some q-analogue of the Dirac operator like the ones found in [2, 13]. Once the operator D is found, one would easily “suspend” it to the 4-sphere $S_q^4$. Finally, we mention that it would be interesting to study whether there is any relation with the sheaf-theoretic construction of a q-deformed instanton in [15].

Acknowledgements. We are grateful to Alain Connes for several enlightening conversations. This work has been partially supported by the Regione Friuli-Venezia-Giulia via the Research Project “Noncommutative geometry: algebraic, analytical and probabilistic aspects and applications to mathematical physics”.

References
1. Atiyah, M.F.: Geometry of Yang–Mills fields. Accad. Naz. Dei Lincei, Scuola Norm. Sup. Pisa, 1979
2. Bibikov, P.N., Kulish, P.P.: Dirac operators on quantum SU(2) group and quantum sphere. q-alg/9608012
3. Connes, A.: Noncommutative differential geometry. Inst. Hautes Etudes Sci. Publ. Math. 62, 257–360 (1985)
4. Connes, A.: Noncommutative geometry. London–New York: Academic Press, 1994
5. Connes, A.: Gravity coupled with matter and foundation of noncommutative geometry. Commun. Math. Phys. 182, 155–176 (1996)
6. Connes, A.: Noncommutative geometry: The spectral aspect. Les Houches Session LXIV, London–New York: Elsevier, 1998, pp. 643–685
7. Connes, A.: A short survey of noncommutative geometry. J. Math. Phys. 41, 3832–3866 (2000)
8. Connes, A., Landi, G.: Noncommutative manifolds, the instanton algebra and isospectral deformations. math.QA/0011194, Commun. Math. Phys. 221, 141–159 (2001)
9. Dabrowski, L., Landi, G.: Instanton algebras and quantum 4-spheres. math.QA/0101177
10. Furuuchi, K.: Instantons on noncommutative R^4 and projection operators. Prog. Theor. Phys. 103, 1043 (2000)
11. Kapustin, A., Kuznetsov, A., Orlov, D.: Noncommutative instantons and twistor transform. hep-th/0002193
12. Loday, J.L.: Cyclic homology. Berlin–Heidelberg–New York: Springer, 1998
13. Majid, S.: Riemannian geometry of quantum groups and finite groups with nonuniversal differentials. math.QA/0006150
14. Nekrasov, N., Schwarz, A.: Instantons on noncommutative R^4 and (2,0) superconformal six dimensional theory. Commun. Math. Phys. 198, 689–703 (1998)
15. Pflaum, M.J.: Quantum groups on fibre bundles. Commun. Math. Phys. 166, 279–316 (1994)
16. Podleś, P.: Quantum spheres. Lett. Math. Phys. 14, 521–531 (1987)
17. Rieffel, M.: Vector bundles over higher dimensional noncommutative tori. Lect. Notes. Math. 1132, Berlin–Heidelberg–New York: Springer-Verlag, 1985, pp. 456–467
18. Rieffel, M., Schwarz, A.: Morita equivalence of multidimensional noncommutative tori. Int. J. Math. 10, 289–299 (1999)
19. Woronowicz, S.L.: Twisted SU(2) group. An example of a non-commutative differential calculus. Publications of RIMS Kyoto University, Vol. 23 No. 1, 117–181 (1987)
20. Woronowicz, S.L.: Compact matrix pseudogroup. Commun. Math. Phys. 111, 613–665 (1987)
21. Woronowicz, S.L., Dąbrowski, L., Nurowski, P.: Compact and non-compact quantum groups. I. Preprint 153/95/FM, SISSA, Trieste, 1995

Communicated by A. Connes

Commun. Math. Phys. 221, 169 – 196 (2001)




Hyperelliptic Prym Varieties and Integrable Systems

Rui Loja Fernandes1, Pol Vanhaecke2

1 Departamento de Matemática, Instituto Superior Técnico, 1049-001 Lisboa, Portugal.

E-mail: [email protected]

2 Université de Poitiers, Département de Mathématiques, Téléport 2, Boulevard Marie et Pierre Curie,

BP 30179, 86962 Futuroscope Chasseneuil Cedex, France. E-mail: [email protected] Received: 12 December 2000 / Accepted: 26 March 2001

Abstract: We introduce two algebraic completely integrable analogues of the Mumford systems which we call hyperelliptic Prym systems, because every hyperelliptic Prym variety appears as a fiber of their momentum map. As an application we show that the general fiber of the momentum map of the periodic Volterra lattice
\[ \dot a_i = a_i (a_{i-1} - a_{i+1}), \qquad i = 1, \dots, n, \qquad a_{n+1} = a_1, \]

is an affine part of a hyperelliptic Prym variety, obtained by removing n translates of the theta divisor, and we conclude that this integrable system is algebraic completely integrable.

Contents
1. Introduction
2. Hyperelliptic Prym Varieties
3. The Hyperelliptic Prym Systems
4. The Periodic Toda Lattices and KM Systems
5. Painlevé Analysis
6. Example: n = 5

Supported in part by FCT-Portugal through the Research Units Pluriannual Funding Program, European Research Training Network HPRN-CT-2000-00101 and grant POCTI/1999/MAT/33081.

1. Introduction

In this paper we introduce two algebraic completely integrable (a.c.i.) systems, similar to the even and odd Mumford systems (see [12] for the odd systems and [15] for the even systems). By a.c.i. we mean that the general level set of the momentum map is isomorphic to an affine part of an Abelian variety and that the integrable flows are linearized by this isomorphism ([16]). The phase space of these systems is described by triplets of polynomials (u(x), v(x), w(x)), as in the case of the Mumford system, but now we have the extra constraints that u, w are even and v odd for the first system (the “odd” case), and with the opposite parities for the other system (the “even” case). We show that in the odd case the general fiber of the momentum map is an affine part of a Prym variety, obtained by removing three translates of its theta divisor, while in the even case the general fiber has two affine parts of the above form. We call these systems the odd and the even hyperelliptic Prym system because every hyperelliptic Prym variety (more precisely an affine part of it) appears as the fiber of their momentum map. Thus we find the same universality as in the Mumford system: in the latter every hyperelliptic Jacobian appears as the fiber of its momentum map.

To show that the hyperelliptic Prym systems are a.c.i. we exhibit a family of compatible (linear) Poisson structures, making these systems multi-Hamiltonian. These structures are not just restrictions of the Poisson structures on the Mumford system. Rather they can be identified as follows: the hyperelliptic Prym systems are fixed point varieties of a Poisson involution (with respect to certain Poisson structures of the Mumford system) and we prove a general proposition stating that such a subvariety always inherits a Poisson structure (Prop. 3.4).

As an application we study the algebraic geometry and the Hamiltonian structure of the periodic Volterra lattice
\[ \dot a_i = a_i (a_{i-1} - a_{i+1}), \qquad i = 1, \dots, n; \qquad a_{n+1} = a_1. \tag{1} \]
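The lattice (1) is simple to explore numerically. The sketch below is an illustration added here and not the authors' method: it integrates (1) with a classical fourth-order Runge–Kutta step and monitors two first integrals that follow directly from (1) and the periodicity of the indices, namely the sum $a_1 + \cdots + a_n$ and the product $a_1 \cdots a_n$. The initial data, step size and function names are arbitrary choices.

```python
import numpy as np

def volterra_rhs(a):
    """Right-hand side of (1): da_i/dt = a_i (a_{i-1} - a_{i+1}), indices taken mod n."""
    return a * (np.roll(a, 1) - np.roll(a, -1))

def rk4_step(a, dt):
    """One classical Runge-Kutta step for the periodic Volterra lattice."""
    k1 = volterra_rhs(a)
    k2 = volterra_rhs(a + 0.5 * dt * k1)
    k3 = volterra_rhs(a + 0.5 * dt * k2)
    k4 = volterra_rhs(a + dt * k3)
    return a + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def integrate(a0, dt=1e-3, steps=2000):
    a = np.array(a0, dtype=float)
    for _ in range(steps):
        a = rk4_step(a, dt)
    return a

a0 = np.array([1.0, 2.0, 0.5, 1.5, 0.8])  # n = 5 particles, positive initial data
a1 = integrate(a0)

sum_drift = abs(a1.sum() - a0.sum())     # the sum of the a_i is conserved
prod_drift = abs(a1.prod() - a0.prod())  # the product of the a_i is conserved
```

The two quantities checked here are only the most elementary invariants; the momentum map discussed in the text contains further integrals that this sketch does not compute.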

Although systems of this form go back to Volterra’s work on population dynamics ([20]), they first appear (in an equivalent form) in the modern theory of integrable systems in the pioneering work of Kac and van Moerbeke ([10]), who constructed this system as a discretization of the Korteweg–de Vries equation and who discovered its integrability. Though those authors only considered the non-periodic case, we shall refer to (1) as the n-body KM system. In the second part of the paper we give a precise description of the fibers of the momentum map of the KM systems and we prove their algebraic complete integrability. We can summarize our results as follows:

Theorem. Denote by M, P, T, K the phase spaces of the (even) Mumford system, the hyperelliptic Prym system (odd or even), the (periodic) sl Toda lattice and the (periodic) KM system. Then there exists a commutative diagram of a.c.i. systems
\[
\begin{array}{ccc}
T & \longrightarrow & M \\
\uparrow & & \uparrow \\
K & \longrightarrow & P
\end{array}
\]
where the horizontal maps are morphisms of integrable systems, and the vertical maps correspond to a Dirac type reduction.

We stress that the vertical arrows are natural inclusion maps exhibiting for both spaces the subspace as fixed point varieties, but they are not Poisson maps. On the other hand, the horizontal arrows are injective maps that map every fiber of the momentum map on the left injectively into (but not onto) a fiber of the momentum map on the right. In order


to make these into morphisms of integrable systems, we construct a pencil of quadratic brackets making Toda → Mumford a Poisson map. For one bracket in this pencil the induced map for the KM systems is also Poisson, so it follows that the diagram is also valid in the Poisson category. A description of the general fiber of the momentum map of the KM systems as an affine part of a hyperelliptic Prym variety follows. Since the flows of the KM systems are restrictions of certain linear flows of the Toda lattices this enables us to show that the KM systems are a.c.i.; moreover the map leads to an explicit linearization of the KM systems. In order to determine precisely which divisors are missing from the affine varieties that appear in the momentum map we use Painlevé analysis, since it is difficult to read this off from the map . The result is that n (= the number of KM particles) translates of the theta divisor are missing from these affine parts. We also show that each hyperelliptic Prym variety that we get is canonically isomorphic to the Jacobian of a related hyperelliptic Riemann surface, which can be computed explicitly, thereby providing an alternative, simpler description of the geometry of the KM systems. The plan of this paper is as follows. In Sect. 2 we recall the definition of a Prym variety and specialize it to the case of a hyperelliptic Riemann surface with an involution (different from the hyperelliptic involution). We show that such a Prym variety is canonically isomorphic to a hyperelliptic Jacobian and we use this result to describe the affine parts that show up in Sect. 3, in which the hyperelliptic Prym systems are introduced and in which their algebraic complete integrability is proved. In Sect. 4 we establish the precise relation between the KM systems and the Toda lattices and we construct the injective morphism . 
We use it to give a first description of the general fiber of the momentum map of the KM systems and we derive its algebraic complete integrability. A more precise description of these fibers is given in Sect. 5 by using Painlevé analysis. We finish the paper with a worked out example (n = 5) in which we find a configuration of five genus two curves on an Abelian surface that looks very familiar (Fig. 2). As a final note we remark that the (periodic) KM systems have received much less attention than the (periodic) Toda lattices, another family of discretizations of the Korteweg–de Vries equation, which besides admitting a Lie algebraic generalization, is also interesting from the point of view of representation theory. It is only recently that the interest in the KM systems has revived (see e.g. [6, 18], and the references therein). We hope that the present work clarifies the connections between these systems and the master systems (Mumford and Prym systems). It was pointed out to us by Vadim Kuznetsov, that an embedding of the KM systems in the Heisenberg magnet was constructed by Volkov in [19]. 2. Hyperelliptic Prym Varieties In this section we recall the definition of a Prym variety and specialize it to the case of a hyperelliptic Riemann surface , equipped with an involution σ . We construct an explicit isomorphism between the Prym variety of (, σ ) and the Jacobian of a related hyperelliptic Riemann surface. We use this isomorphism to give a precise description of the affine part of the Prym variety that will appear as the fiber of the momentum map of an integrable system related to KM system. 2.1. The Prym variety of a hyperelliptic Riemann surface. Let be a compact Riemann surface of genus G, equipped with an involution σ with p fixed points. The quotient



surface Γσ = Γ/σ has genus g′, with G = 2g′ + p/2 − 1, and the quotient map Γ → Γσ is a double covering map which is ramified at the p fixed points of σ. We assume that g′ > 0, i.e., σ is not the hyperelliptic involution on a hyperelliptic Riemann surface Γ. The group of divisors of degree 0 on Γ, denoted by Div⁰(Γ), carries a natural equivalence relation, which is compatible with the group structure and which is defined by D ∼ 0 iff D is the divisor of zeros and poles of a meromorphic function on Γ. The quotient group Div⁰(Γ)/∼ is a compact complex algebraic torus (Abelian variety) of dimension G, called the Jacobian of Γ and denoted by Jac(Γ) ([9], Ch. 2.7); its elements are denoted by [D], where D ∈ Div⁰(Γ), and we write ⊗ for the group operation in Jac(Γ). Notice that σ induces an involution on Div⁰(Γ) and hence on Jac(Γ); we use the same notation σ for these involutions.

Definition 2.1. The Prym variety of (Γ, σ) is the (G − g′)-dimensional subtorus of Jac(Γ) given by Prym(Γ/σ) = {[D − σ(D)] | D ∈ Div⁰(Γ)}.

We will be interested in the case in which Γ is the Riemann surface of a hyperelliptic curve Γ⁽⁰⁾ : y² = f(x), where f is a monic even polynomial of degree 2n without multiple roots (in particular 0 is not a root of f), so that the curve is non-singular. The Riemann surface Γ has genus G = n − 1 and is obtained from Γ⁽⁰⁾ by adding two points, which are denoted by ∞1 and ∞2. The two points of Γ⁽⁰⁾ for which x = 0 are denoted by O1 and O2. The 2n Weierstrass points of Γ (the points (x, y) of Γ⁽⁰⁾ for which y = 0) come in pairs (X, 0) and (−X, 0); fixing some order we denote them by Wi = (Xi, 0) and −Wi = (−Xi, 0), where i = 1, . . . , n. The Riemann surface Γ admits a group of order four of involutions, whose action on Γ⁽⁰⁾ and on the Weierstrass points (Xi, 0) and whose fixed point set are described in Table 1 for n odd, n = 2g + 1 and in Table 2 for n even, n = 2g + 2 (ı is the hyperelliptic involution).

Table 1. n odd

      (x, y)      O1   O2   ∞1   ∞2   Wi    −Wi   Fix
  ı   (x, −y)     O2   O1   ∞2   ∞1   Wi    −Wi   Wi, −Wi
  σ   (−x, y)     O1   O2   ∞2   ∞1   −Wi   Wi    O1, O2
  τ   (−x, −y)    O2   O1   ∞1   ∞2   −Wi   Wi    ∞1, ∞2

Table 2. n even

      (x, y)      O1   O2   ∞1   ∞2   Wi    −Wi   Fix
  ı   (x, −y)     O2   O1   ∞2   ∞1   Wi    −Wi   Wi, −Wi
  σ   (−x, −y)    O2   O1   ∞2   ∞1   −Wi   Wi    –
  τ   (−x, y)     O1   O2   ∞1   ∞2   −Wi   Wi    O1, O2, ∞1, ∞2
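The genus bookkeeping implied by the two tables can be sketched in a few lines. This is only an arithmetic illustration: the function names are hypothetical, the fixed-point counts are read off the "Fix" column of Tables 1 and 2, and the formula used is the relation G = 2g′ + p/2 − 1 quoted above.

```python
from fractions import Fraction

def quotient_genus(G, p):
    """Genus of the quotient of a genus-G surface by an involution with p
    fixed points, solved from G = 2*g' + p/2 - 1."""
    g = Fraction(G + 1, 2) - Fraction(p, 4)
    assert g.denominator == 1, "G and p are incompatible"
    return int(g)

def prym_data(n):
    """Genera of the two quotients and dimension of the Prym variety for
    the curve y^2 = f(x) with deg f = 2n (so G = n - 1)."""
    G = n - 1
    if n % 2 == 1:
        fixed = {"sigma": 2, "tau": 2}   # Table 1: O1, O2 and the two points at infinity
    else:
        fixed = {"sigma": 0, "tau": 4}   # Table 2: none, and O1, O2 plus both points at infinity
    g_sigma = quotient_genus(G, fixed["sigma"])
    g_tau = quotient_genus(G, fixed["tau"])
    return {"G": G, "g_sigma": g_sigma, "g_tau": g_tau, "dim_prym": G - g_sigma}

if __name__ == "__main__":
    for n in (5, 6):
        print(n, prym_data(n))
```

For n = 2g + 1 and n = 2g + 2 the quotient by τ has genus g in both cases, and G − g′ = g, which is the dimension count used for the Prym variety.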

For future use we also point out that for points P ∈ Γ which are not indicated on these tables, neither σ(P) nor τ(P) coincides with ı(P). The involutions σ and τ lead to two quotient Riemann surfaces Γσ := Γ/σ and Γτ := Γ/τ, and to two covering maps πσ : Γ → Γσ and πτ : Γ → Γτ. It follows from Tables 1 and 2 that the genus of Γτ equals g, while the genus g′ of Γσ is g or g + 1 depending on whether n is odd or even. Also, the dimension of Prym(Γ/σ) = g



(whether n is odd or even). If the equation of Γ⁽⁰⁾ is written as y² = g(x²) then for n odd, $\Gamma_\sigma^{(0)}$ has an equation v² = g(u) while $\Gamma_\tau^{(0)}$ has an equation v² = ug(u); for n even the roles of $\Gamma_\sigma^{(0)}$ and $\Gamma_\tau^{(0)}$ are interchanged. In order to describe Prym(Γ/σ), which we will call a hyperelliptic Prym variety, we need the following classical results about hyperelliptic Riemann surfaces and their Jacobians (for proofs, see [12], Ch. IIIa).

Lemma 2.2. Let D be a divisor of degree H > G on Γ, where G is the genus of Γ, and let P be any point on Γ. There exists an effective divisor E of degree G on Γ such that D ∼ E + (H − G)P.

Corollary 2.3. For any fixed divisor D0 of degree G, Jac(Γ) is given by
\[ \operatorname{Jac}(\Gamma) = \Bigl\{ \Bigl[ \sum_{i=1}^{G} P_i - D_0 \Bigr] \Bigm| P_i \in \Gamma \Bigr\}. \]

Lemma 2.4. Let D be a divisor on Γ of the form $D = \sum_{i=1}^{H} (P_i - Q_i)$, where H ≤ G and $P_i \neq Q_j$ for all i and j. Then [D] = 0 if and only if H is even and D is of the form
\[ D = \sum_{i=1}^{H/2} \bigl( R_i + \imath(R_i) - S_i - \imath(S_i) \bigr), \]
for some points $R_i, S_i \in \Gamma$.

2.2. Hyperelliptic Prym varieties as Jacobians. In the following theorem we show that for any n the Prym variety Prym(Γ/σ) associated with the hyperelliptic Riemann surface Γ is canonically isomorphic to the Jacobian of Γτ. This result was first proven by D. Mumford (see [13]) for the case in which πσ : Γ → Γσ is unramified (n even) and by S. Dalaljan (see [7]) for the case in which πσ : Γ → Γσ has two ramification points (n odd). Our proof, which is valid in both cases, is different and has the advantage of allowing us to describe explicitly the affine parts of the Prym varieties that we will encounter as affine parts of the corresponding Jacobians.

Theorem 2.5. Let $\pi_\tau^*$ denote the homomorphism Div⁰(Γτ) → Div⁰(Γ) which sends every point of Γτ to the divisor on Γ which consists of its two antecedents (under πτ). The induced map
\[ \Phi : \operatorname{Jac}(\Gamma_\tau) \to \operatorname{Prym}(\Gamma/\sigma), \qquad [D] \mapsto [\pi_\tau^* D] \]
is an isomorphism.

Proof. It is clear that the homomorphism Φ is well-defined: if [D] = 0 then D is the divisor of zeros and poles of a meromorphic function f on Γτ, hence $\pi_\tau^* D$ is the divisor of zeros and poles of f ∘ πτ and $[\pi_\tau^* D] = 0$. To see that the image of Φ is contained in Prym(Γ/σ), just notice that $\pi_\tau^*(D)$ can be written as E + τ(E) for some E ∈ Div⁰(Γ), so that $[\pi_\tau^*(D)] = [E + \tau(E)] = [E - \sigma(E)] \in \operatorname{Prym}(\Gamma/\sigma)$.



Since Jac(Γτ) and Prym(Γ/σ) both have dimension g it suffices to show that Φ is injective. Suppose that $[\pi_\tau^* D] = 0$ for some D ∈ Div⁰(Γτ). We need to show that this implies [D] = 0. It follows from Corollary 2.3 that we may assume that D is of the form $\sum_{i=1}^{g} p_i - g\,\pi_\tau(\infty_1)$, where $p_i \in \Gamma_\tau$. Then $\pi_\tau^* D = \sum_{i=1}^{g} (P_i + \tau(P_i)) - 2g\,\infty_1$ (with $\pi_\tau(P_i) = p_i$). Since 2g ≤ G and $\infty_1 \neq \imath(\infty_1)$, Lemma 2.4 implies that $P_i = \infty_1$, i.e., $p_i = \pi_\tau(\infty_1)$ for all i.

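2.3. The theta divisor. We introduce two divisors on Jac(Γ) by
\[ \Theta_1 = \Bigl\{ \Bigl[ \sum_{i=1}^{G-1} P_i - (G-1)\infty_1 \Bigr] \Bigm| P_i \in \Gamma \Bigr\}, \tag{2} \]
\[ \Theta_2 = \Bigl\{ \Bigl[ \sum_{i=1}^{G-1} P_i + \infty_2 - G\,\infty_1 \Bigr] \Bigm| P_i \in \Gamma \Bigr\}. \tag{3} \]
These two divisors are both translates of the theta divisor and they differ by a shift over [∞2 − ∞1]. Since ∞2 = ı(∞1) they are tangent along their intersection locus, which is given by
\[ \Delta = \Bigl\{ \Bigl[ \sum_{i=1}^{G-2} P_i + \infty_2 - (G-1)\infty_1 \Bigr] \Bigm| P_i \in \Gamma \Bigr\}. \]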

Proposition 2.6. When n is odd Prym(Γ/σ) ∩ (Θ1 ∪ Θ2) consists of three translates of the theta divisor of Jac(Γτ), intersecting as in the following figure.

Fig. 1.

Proof. We use the isomorphism Φ to determine which divisors of Jac(Γτ) get mapped into Θ1 and Θ2. Since O1 and O2 are the only points of Γ on which ı and τ coincide, Lemma 2.4 implies that the only divisors $D = \sum_{i=1}^{g} p_i - g\,\pi_\tau(\infty_1) \in \operatorname{Div}(\Gamma_\tau)$ for which $\pi_\tau^* D$ contains, up to linear equivalence, ∞1 or ∞2 are those that contain πτ(∞1) or πτ(∞2) or πτ(O1) (= πτ(O2)). Denoting O = πτ(O1) we find that these points constitute the following three divisors on Jac(Γτ):
\[ \theta_1 = \Bigl\{ \Bigl[ \sum_{i=1}^{g-1} p_i - (g-1)\pi_\tau(\infty_1) \Bigr] \Bigm| p_i \in \Gamma_\tau \Bigr\}, \]
\[ \theta_2 = \Bigl\{ \Bigl[ \sum_{i=1}^{g-1} p_i + \pi_\tau(\infty_2) - g\,\pi_\tau(\infty_1) \Bigr] \Bigm| p_i \in \Gamma_\tau \Bigr\}, \]
\[ \theta = \Bigl\{ \Bigl[ \sum_{i=1}^{g-1} p_i + O - g\,\pi_\tau(\infty_1) \Bigr] \Bigm| p_i \in \Gamma_\tau \Bigr\}. \]
They all pass through
\[ \omega = \Bigl\{ \Bigl[ \sum_{i=1}^{g-2} p_i + \pi_\tau(\infty_2) - (g-1)\pi_\tau(\infty_1) \Bigr] \Bigm| p_i \in \Gamma_\tau \Bigr\}, \]
which is the tangency locus of θ1 and θ2, and θi intersects θ in addition in
\[ \omega_i = \Bigl\{ \Bigl[ \sum_{i=1}^{g-2} p_i + \pi_\tau(\infty_i) + O - g\,\pi_\tau(\infty_1) \Bigr] \Bigm| p_i \in \Gamma_\tau \Bigr\}, \]
which is a translate of ω.

When n is even then clearly Prym(Γ/σ) is contained in Θ1, but the following result, similar to Prop. 2.6, holds for an appropriate translate of Prym(Γ/σ). The proof is left to the reader.

Proposition 2.7. When n is even and i ∈ {1, 2} then (Prym(Γ/σ) ⊗ [O1 − ∞i]) ∩ (Θ1 ∪ Θ2) consists of three translates of the theta divisor of Jac(Γτ), intersecting as in Fig. 1 (in which O should now be replaced by O2).

We will see in the next section how in both cases (n odd/even) the affine variety obtained by removing these three translates of the theta divisor from Prym(Γ/σ) can be described by simple, explicit equations.

3. The Hyperelliptic Prym Systems

In this section we introduce two families of integrable systems, whose members we call the odd and the even hyperelliptic Prym systems, where the adjective “odd/even” refers to the parity of n, as in the previous section, and where “hyperelliptic Prym” refers to the fact that the fibers of the momentum map of these systems are precisely the affine parts of the hyperelliptic Prym varieties that were considered in the previous section. These systems are intimately related to the even Mumford systems, constructed by the second author (see [15]), as even analogs of the (odd) Mumford systems, constructed by Mumford (see [12]).



3.1. The Mumford systems. We first recall the definition of the g-dimensional odd and even Mumford systems and we describe their geometry. Details, generalizations and applications can be found in [16]. The phase space of each of these systems is an affine space C^N, which is most naturally described as an affine space of triples (u(x), v(x), w(x)) of polynomials, often represented as Lax operators
\[ L(x) = \begin{pmatrix} v(x) & w(x) \\ u(x) & -v(x) \end{pmatrix}, \]
where u(x), v(x) and w(x) are subject to certain constraints. Denoting by Mg (resp. Mg) the phase space of the g-th odd (resp. even) Mumford system these constraints are indicated in the following table:

Table 3.

  Phase space   dim      u(x)             v(x)      w(x)
  Mg (odd)      3g + 1   monic, deg = g   deg < g   monic, deg = g + 1
  Mg (even)     3g + 2   monic, deg = g   deg < g   monic, deg = g + 2

It is natural to use the coefficients of the three polynomials u(x), v(x), w(x) as coordinates on Mg and on Mg: for the even phase space Mg for example, which will be most important for this paper, we write
\[ u(x) = x^{g} + u_{g-1}x^{g-1} + \cdots + u_0, \quad v(x) = v_{g-1}x^{g-1} + \cdots + v_0, \quad w(x) = x^{g+2} + w_{g+1}x^{g+1} + \cdots + w_0, \]
or, in terms of the Lax operator L(x), as
\[ L(x) = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} x^{g+2} + \begin{pmatrix} 0 & w_{g+1} \\ 0 & 0 \end{pmatrix} x^{g+1} + \begin{pmatrix} 0 & w_{g} \\ 1 & 0 \end{pmatrix} x^{g} + \cdots \]

where su(A0) denotes the strictly upper triangular part of A0. The vector fields Xi are also Hamiltonian with respect to $\{\cdot,\cdot\}^{x}_{T}$ and their flows are linear on the general fiber of the momentum map K : Tn → C[x], which is defined by
\[ \det(x\,\mathrm{Id} - L(h)) = -h - \frac{1}{h} + \frac{K(x)}{2}; \]
since the general fiber of K is an affine part of a hyperelliptic Jacobian, the n-body Toda lattice is an a.c.i. system (see [3] for details). For higher order brackets for the Toda lattices, see [5].

We now turn to the n-body, periodic, Kac–van Moerbeke system (n-body KM system, for short). Its phase space Kn is the subspace of Tn consisting of all Lax operators (10) with zeros on the diagonal. Kn is not a Poisson subspace of Tn. However, Kn is the fixed manifold of the involution of Tn defined by
\[ ((a_1, a_2, \dots, a_n), (b_1, b_2, \dots, b_n)) \mapsto ((a_1, a_2, \dots, a_n), (-b_1, -b_2, \dots, -b_n)), \]
which is a Poisson automorphism of $(T_n, \{\cdot,\cdot\}^{x}_{T})$. Therefore, by Theorem 3.4, Kn inherits a Poisson bracket $\{\cdot,\cdot\}_K$ from $\{\cdot,\cdot\}^{x}_{T}$, which is given by
\[ \{a_i, a_j\}_K = a_i a_j (\delta_{i,j+1} - \delta_{i+1,j}). \]

It follows that the restriction of the momentum map K to Kn is a momentum map for the n-body KM system. Notice that Ij = 0 for even j, while for j odd the Lax equations (11) lead to Lax equations for the n-body KM system, merely by putting b1 = · · · = bn = 0. Taking j = 1 we find the vector field
\[ \dot a_i = a_i (a_{i-1} - a_{i+1}), \qquad i = 1, \dots, n, \tag{13} \]
which was already mentioned in the introduction. More generally, taking j odd we find a family of commuting Hamiltonian vector fields on Kn which are restrictions of the Toda vector fields, while for j even the Toda vector fields Xj are not tangent to Kn. In order to conclude that the KM systems are a.c.i. we need to describe the fibers of the momentum map K : Kn → C[x]. This will be done in the next paragraph.



4.2. Algebraic integrability of KM. We first define a map Ψ : Tn → Mn−1 which maps the n-body Toda system to the even Mumford system. The following identity, valid for tridiagonal matrices, will be needed.

Lemma 4.1. Let M be a tridiagonal matrix,
\[ M = \begin{pmatrix}
\beta_1 & \alpha_1 & 0 & \cdots & \cdots & 0 \\
\gamma_1 & \beta_2 & \alpha_2 & & & \vdots \\
0 & \gamma_2 & \beta_3 & \ddots & & \vdots \\
\vdots & & \ddots & \ddots & \ddots & 0 \\
\vdots & & & \ddots & \beta_{n-1} & \alpha_{n-1} \\
0 & \cdots & \cdots & 0 & \gamma_{n-1} & \beta_n
\end{pmatrix}, \]
and denote by $\Delta_{i_1,\dots,i_k}$ the determinant of the minor of M obtained by removing from M the rows $i_1, \dots, i_k$ and the columns $i_1, \dots, i_k$. Then:
\[ \Delta_1 \Delta_n - \Delta\,\Delta_{1,n} = \prod_{i=1}^{n-1} \alpha_i \gamma_i. \tag{14} \]

Proof. For n = 2 this is obvious. For n > 2 one proceeds by induction, using the following formula for calculating the determinant Δ of M,
\[ \Delta = \beta_n \Delta_n - \alpha_{n-1}\gamma_{n-1}\Delta_{n-1,n}. \tag{15} \]
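The identity (14) and the expansion (15) are easy to test on random integer tridiagonal matrices. The following is a minimal sketch in exact rational arithmetic, not part of the original argument; the helper names (`det`, `minor_det`, `check_lemma`) are illustrative, and indices are 0-based in the code.

```python
from fractions import Fraction
import random

def det(m):
    """Exact determinant by Gaussian elimination; the empty matrix has determinant 1."""
    m = [row[:] for row in m]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result

def tridiagonal(alpha, beta, gamma):
    """Matrix with beta on the diagonal, alpha above it and gamma below it."""
    n = len(beta)
    m = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = Fraction(beta[i])
    for i in range(n - 1):
        m[i][i + 1] = Fraction(alpha[i])
        m[i + 1][i] = Fraction(gamma[i])
    return m

def minor_det(m, removed):
    """Determinant after deleting the rows and columns listed in `removed`."""
    keep = [i for i in range(len(m)) if i not in removed]
    return det([[m[i][j] for j in keep] for i in keep])

def check_lemma(n, rng):
    """Test (14) and (15) on one random n x n tridiagonal matrix."""
    alpha = [rng.randint(-5, 5) for _ in range(n - 1)]
    gamma = [rng.randint(-5, 5) for _ in range(n - 1)]
    beta = [rng.randint(-5, 5) for _ in range(n)]
    m = tridiagonal(alpha, beta, gamma)
    d = det(m)
    d1, dn = minor_det(m, {0}), minor_det(m, {n - 1})
    d1n = minor_det(m, {0, n - 1})
    dn1n = minor_det(m, {n - 2, n - 1})
    prod = Fraction(1)
    for a, c in zip(alpha, gamma):
        prod *= a * c
    ok14 = d1 * dn - d * d1n == prod
    ok15 = d == beta[-1] * dn - alpha[-1] * gamma[-1] * dn1n
    return ok14 and ok15

if __name__ == "__main__":
    rng = random.Random(0)
    print(all(check_lemma(n, rng) for n in range(2, 8) for _ in range(20)))
```

Exact fractions are used so that the comparison is an equality test rather than a floating-point tolerance check.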

In the sequel we use the notation $\Delta_{i_1,\dots,i_k}$ from the above lemma taking as M the tridiagonal matrix obtained from x Id − L(h) in the obvious way, i.e., by removing the two terms that depend on h. In this notation the characteristic polynomial of L(h) is given by
\[ \det(x\,\mathrm{Id} - L(h)) = -h - h^{-1} + \Delta - a_n \Delta_{1,n}. \tag{16} \]

Proposition 4.2. For any m = 1, . . . , n the map Ψm : Tn → Mn−1 defined by
\[ \begin{aligned}
u(x) &= \Delta_m, \\
v(x) &= a_{m-1}\Delta_{m-1,m} - a_m \Delta_{m,m+1}, \\
w(x) &= (x - b_m)^2 \Delta_m - 2(x - b_m)\bigl(a_{m-1}\Delta_{m-1,m} + a_m\Delta_{m,m+1}\bigr) + 4 a_m a_{m-1}\Delta_{m-1,m,m+1},
\end{aligned} \tag{17} \]
maps each fiber of the momentum map K : Tn → C[x] into a fiber of the momentum map H : Mn−1 → C[x]. The restriction of Ψm to Kn takes values in $P_{(n-1)/2}$ when n is odd and in $P_{n/2-1}$ when n is even, mapping in both cases the fiber of the momentum map K : Kn → C[x] into the fiber of the momentum map $H : P_{(n-1)/2} \to \mathbf{C}[x^2]$ (or $H : P_{n/2-1} \to \mathbf{C}[x^2]$). As a consequence the general fiber of the momentum map of the KM systems is an affine part of a hyperelliptic Jacobian.
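The relation behind the fiber-to-fiber statement, u(x)w(x) + v(x)² = p(x)² − 4 with p = Δ − a_nΔ_{1,n} as in (16), can be sanity-checked numerically for m = n. The sketch below rests on assumptions that are not spelled out in this section: the Lax operator is taken to have the b_i on the diagonal, the a_i above it, 1 below it (the shape of the n = 5 example in Sect. 6), the constraint a_1 · · · a_n = 1 is imposed, and the middle term of w is taken with a minus sign, which is the sign for which the relation holds with p as in (16). All function names are illustrative.

```python
from fractions import Fraction
import random

def det(m):
    """Exact determinant by Gaussian elimination; the empty matrix has determinant 1."""
    m = [row[:] for row in m]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result

def minor_det(m, removed):
    """Determinant after deleting the rows and columns listed in `removed` (0-based)."""
    keep = [i for i in range(len(m)) if i not in removed]
    return det([[m[i][j] for j in keep] for i in keep])

def triple_identity_holds(a, b, x):
    """Evaluate u, v, w, p at the rational point x (case m = n) and test u*w + v^2 == p^2 - 4."""
    n = len(a)
    t = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        t[i][i] = x - b[i]
    for i in range(n - 1):
        t[i][i + 1] = -a[i]          # superdiagonal of x*Id - L with the h-terms dropped
        t[i + 1][i] = Fraction(-1)   # subdiagonal of x*Id - L
    delta = det(t)
    u = minor_det(t, {n - 1})                        # Delta_n
    A = a[n - 2] * minor_det(t, {n - 2, n - 1})      # a_{n-1} * Delta_{n-1,n}
    B = a[n - 1] * minor_det(t, {0, n - 1})          # a_n * Delta_{1,n}
    d3 = minor_det(t, {0, n - 2, n - 1})             # Delta_{n-1,n,1}
    X = x - b[n - 1]
    v = A - B
    w = X * X * u - 2 * X * (A + B) + 4 * a[n - 1] * a[n - 2] * d3
    p = delta - B
    return u * w + v * v == p * p - 4

def random_point(n, rng):
    """Random rational (a, b) with a_1 * ... * a_n = 1."""
    a = [Fraction(rng.randint(1, 9), rng.randint(1, 9)) for _ in range(n - 1)]
    prod = Fraction(1)
    for ai in a:
        prod *= ai
    a.append(1 / prod)
    b = [Fraction(rng.randint(-5, 5)) for _ in range(n)]
    return a, b

if __name__ == "__main__":
    rng = random.Random(1)
    a, b = random_point(5, rng)
    print(all(triple_identity_holds(a, b, Fraction(k, 3)) for k in range(-6, 7)))
```

Checking the relation at more than deg(p²) = 2n rational values of x for a fixed point (a, b) is enough to establish it as a polynomial identity at that point.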



Proof. Since the momentum map is equivariant with respect to the Z/n action on Tn it suffices to prove the proposition for m = n. It is easy to see that the triple (u, v, w) defined by (17) satisfies the constraints u, w monic, deg w = deg u + 2 = n + 1 and deg v < n − 1, so that Ψn takes values in Mn−1. Moreover, taking β1 = · · · = βn = x in (15) implies that when all entries on the diagonal of L(h) are zero then $\Delta_{i_1,\dots,i_p}$ has the same parity as n − p, so that the triples (u, v, w) which correspond to points in Kn have the additional property that v has the same parity as n while u and w have the opposite parity. Therefore the restriction of Ψn to Kn takes values in $P_{(n-1)/2}$ when n is odd and in $P_{n/2-1}$ when n is even.

For p(x) a monic polynomial of degree n, let $L(h) \in K^{-1}(2p(x))$, i.e.,
\[ p(x) = (x - b_n)\Delta_n - a_n\Delta_{1,n} - a_{n-1}\Delta_{n-1,n}. \tag{18} \]
Proving that Ψn(L(h)) belongs to $H^{-1}(p^2(x) - 4)$ amounts to showing that u(x)w(x) + v²(x) = p²(x) − 4, which follows from a direct computation, using (14). The commutativity of the following diagram follows:
\[ \begin{array}{ccc}
T_n & \xrightarrow{\ \Psi_n\ } & M_{n-1} \\
{\scriptstyle K}\big\downarrow & & \big\downarrow{\scriptstyle H} \\
\mathbf{C}[x] & \xrightarrow{\ \varphi\ } & \mathbf{C}[x]
\end{array} \]
where φ is defined by φ(q) = (q/2)² − 4, for q ∈ C[x].

To show that the map Ψn is injective let (u(x), v(x), w(x)) ∈ Ψn(Tn). We show that the matrix L(h) ∈ Tn which is mapped to this point is unique. First observe that the monic polynomial $p(x) = \Delta - a_n\Delta_{1,n}$ can be recovered from u(x)w(x) + v(x)² = p(x)² − 4. We can then determine bn from the following two formulas:
\[ p(x) = x^{n} - \sum_{i=1}^{n} b_i\, x^{n-1} + \cdots, \qquad u(x) = \Delta_n = x^{n-1} - \sum_{i=1}^{n-1} b_i\, x^{n-2} + \cdots. \]
Next, the second relation in (17) and (18) lead to the system:
\[ a_{n-1}\Delta_{n-1,n} - a_n\Delta_{1,n} = v(x), \qquad a_{n-1}\Delta_{n-1,n} + a_n\Delta_{1,n} = (x - b_n)u(x) - p(x). \]
This linear system completely determines the products $a_n\Delta_{1,n}$ and $a_{n-1}\Delta_{n-1,n}$. Because the determinants of the principal minors of x Id − L(h) are monic polynomials, this means that we know $a_n$, $\Delta_{1,n}$ and $\Delta_{n-1,n}$ separately. From $\Delta = p(x) + a_n\Delta_{1,n}$ we also obtain Δ. We have now shown how $b_n$, $a_n$, Δ, $\Delta_n$ and $\Delta_{n-1,n}$ are determined. We proceed by induction, showing how to determine $b_{n-k-1}$, $a_{n-k-1}$, $\Delta_{n-k-1,\dots,n}$ once we know $b_{n-i}$, $a_{n-i}$ and $\Delta_{n-i,\dots,n}$ for i = 0, . . . , k. We use (15) to obtain the recursive relation:
\[ \Delta_{n-k+1,\dots,n} = (x - b_{n-k})\Delta_{n-k,\dots,n} - a_{n-k-1}\Delta_{n-k-1,\dots,n}. \]

This determines the product $a_{n-k-1}\Delta_{n-k-1,\dots,n}$, but also $a_{n-k-1}$ and $\Delta_{n-k-1,\dots,n}$ separately, again because $\Delta_{n-k-1,\dots,n}$ is monic. Now from $\Delta_{n-k-1,\dots,n}$ and $\Delta_{n-k,\dots,n}$ we know, as above, the sums $\sum_{i=1}^{n-k-2} b_i$ and $\sum_{i=1}^{n-k-1} b_i$. Hence, $b_{n-k-1}$ is determined.

We saw in Prop. 3.3 that the fibers of the momentum map of the even Prym system are reducible (two isomorphic pieces), so there remains the question whether the same is true for the n-body KM system for even n. To check that this is so, note that the highest degree coefficient of the characteristic polynomial of L(h) gives, for n even, the first integral $I = a_1 a_3 a_5 \cdots a_{n-1} + a_2 a_4 a_6 \cdots a_n$. Since $a_1 a_2 \cdots a_n = 1$, for generic values of I, the variety defined by
\[ a_1 a_3 a_5 \cdots a_{n-1} = \text{constant}, \qquad a_2 a_4 a_6 \cdots a_n = \text{constant}, \]
is reducible, and the claim follows. Note however that both $a_1 a_3 a_5 \cdots a_{n-1}$ and $a_2 a_4 a_6 \cdots a_n$ are first integrals themselves, so we can construct a momentum map using these integrals (instead of their sum and product) and then the general fiber is irreducible.

The map Ψm : Tn → Mn−1 not only maps fibers to fibers of the momentum maps, but it maps the whole hierarchy of Toda flows to the Mumford flows defined by (7). To see this, we construct a family of quadratic Poisson brackets $\{\cdot,\cdot\}^{\phi}_{M,q}$ on Mn−1 which make this map Poisson. First observe that there exist unique polynomials p(x) and r(x), with p(x) monic of degree n and r(x) of degree less than n, such that
\[ u(x)w(x) + v(x)^2 = p(x)^2 + r(x). \tag{19} \]

The coefficients of p(x) and r(x) are regular functions of $u_i$, $v_i$ and $w_i$. Hence, we can define a skew-symmetric biderivation on the space of regular functions of Mn−1 by setting, for any ϕ ∈ C[x] of degree at most 1,
\[ \begin{aligned}
\{u(x), u(x')\}^{\phi}_{M,q} &= \{v(x), v(x')\}^{\phi}_{M,q} = 0, \\
\{u(x), v(x')\}^{\phi}_{M,q} &= \{u(x), v(x')\}^{p\phi}_{M} + \alpha^{\phi}(x + x')\,u(x)u(x'), \\
\{u(x), w(x')\}^{\phi}_{M,q} &= \{u(x), w(x')\}^{p\phi}_{M} - 2\alpha^{\phi}(x + x')\,u(x)v(x'), \\
\{v(x), w(x')\}^{\phi}_{M,q} &= \{v(x), w(x')\}^{p\phi}_{M} + \alpha^{\phi}(x + x')\,u(x)w(x'), \\
\{w(x), w(x')\}^{\phi}_{M,q} &= \{w(x), w(x')\}^{p\phi}_{M} + 2\alpha^{\phi}(x + x')\bigl(w(x)v(x') - w(x')v(x)\bigr),
\end{aligned} \]
where $\alpha^{\phi}(x) = \phi(\alpha(2x)/2)$. Notice that the polynomial pϕ, used in the definition of the bracket, depends on the phase variables.

Proposition 4.3. Let ϕ be a polynomial of degree at most 1. Then

(i) $\{\cdot,\cdot\}^{\phi}_{M,q}$ is a Poisson bracket on Mn−1 and the maps
\[ \Psi_m : (T_n, \{\cdot,\cdot\}^{\phi}_{T}) \to (M_{n-1}, \{\cdot,\cdot\}^{\phi}_{M,q}) \]
are Poisson and map the Toda flows to the Mumford flows;

(ii) For ϕ odd, the bracket $\{\cdot,\cdot\}^{\phi}_{M,q}$ induces a Poisson bracket $\{\cdot,\cdot\}_{P,q}$ on $P_{(n-1)/2}$ (resp. on $P_{n/2-1}$), and the maps
\[ \Psi_m : (K_n, \{\cdot,\cdot\}_K) \to (P_{(n-1)/2}, \{\cdot,\cdot\}_{P,q}), \qquad \Psi_m : (K_n, \{\cdot,\cdot\}_K) \to (P_{n/2-1}, \{\cdot,\cdot\}_{P,q}) \]
are Poisson and map the flows of the n-body KM system to the flows of the hyperelliptic Prym systems.

Proof. We take the bracket of both sides of (19) with u(x) to obtain
\[ 2p(y)\phi(y)\,\frac{u(x)v(y) - u(y)v(x)}{x - y} = 2p(y)\{u(x), p(y)\}^{\phi}_{M,q} + \{u(x), r(y)\}^{\phi}_{M,q}. \]
It follows that $\{u(x), r(y)\}^{\phi}_{M,q}$ is divisible by p(y). Since $\{u(x), r(y)\}^{\phi}_{M,q}$ is of degree less than n in y and since p(y) is monic of degree n we must have $\{u(x), r(y)\}^{\phi}_{M,q} = 0$ and
\[ \{u(x), p(y)\}^{\phi}_{M,q} = \frac{u(x)v(y) - u(y)v(x)}{x - y}\,\phi(y). \]
Similarly, we find $\{v(x), r(y)\}^{\phi}_{M,q} = \{w(x), r(y)\}^{\phi}_{M,q} = 0$ and also that:
\[ \begin{aligned}
\{v(x), p(y)\}_{M,q} &= \frac{\phi(y)}{2}\Bigl( \frac{w(x)u(y) - u(x)w(y)}{x - y} - \alpha(x + y)\,u(x)u(y) \Bigr), \\
\{w(x), p(y)\}_{M,q} &= \phi(y)\Bigl( \frac{v(x)w(y) - w(x)v(y)}{x - y} + \alpha(x + y)\,v(x)u(y) \Bigr).
\end{aligned} \]
These expressions also allow one to compute the brackets of u(x), v(x), w(x) and p(x) with α(y), and the check of the Jacobi identity follows easily from it. Therefore, $\{\cdot,\cdot\}^{\phi}_{M,q}$ is a Poisson bracket for which the coefficients of r(x) are Casimirs. If we compare the expressions above for the brackets with p(y) with expressions (7) for the Mumford vector fields, we conclude that they are Hamiltonian with respect to $\{\cdot,\cdot\}^{1}_{M,q}$ with Hamiltonian function K. Checking that Ψm is Poisson can be done by a straightforward (but rather long) computation using the following expressions for the derivatives of $\Delta_{i_1,\dots,i_k}$:
\[ \frac{\partial \Delta_{i_1,\dots,i_k}}{\partial a_i} = \begin{cases} -\Delta_{i,i+1,i_1,\dots,i_k}, & i, i+1 \notin \{i_1,\dots,i_k\}, \\ 0, & \text{otherwise}, \end{cases} \qquad
\frac{\partial \Delta_{i_1,\dots,i_k}}{\partial b_i} = \begin{cases} -\Delta_{i,i_1,\dots,i_k}, & i \notin \{i_1,\dots,i_k\}, \\ 0, & \text{otherwise}. \end{cases} \]
For the second statement, one easily checks that when ϕ is odd then the involution is a Poisson involution, so that there is an induced bracket on $P_{(n-1)/2}$ or on $P_{n/2-1}$. Explicit formulas for this bracket are computed as in the proof of Proposition 3.5. The other statements in (ii) then follow from (i).

It is easy to check that the Poisson brackets $\{\cdot,\cdot\}^{\phi}_{M,q}$ and $\{\cdot,\cdot\}^{\psi}_{M}$ on Mn−1 are compatible, when ϕ and ψ have degree at most 1. This is however not true when ψ is of higher degree.



5. Painlevé Analysis

The results in the previous section show that the general fiber of the momentum map of the KM systems is an affine part of a hyperelliptic Prym variety (or two copies of it), which can also be described as a hyperelliptic Jacobian. In order to describe precisely which affine part, we determine the divisor which needs to be adjoined to each affine part in order to complete it into an Abelian variety. Since it is difficult to do this by using the maps Ψm we do this by performing Painlevé analysis of the KM systems. The method that we use is based on the bijective correspondence between the principal balances of an integrable vector field (Laurent solutions depending on the maximal number of free parameters) and the irreducible components of the divisor which is missing from the fibers of the momentum map (see [1]). We look for all Laurent solutions
\[ a_i(t) = \frac{1}{t^{r}} \sum_{j=0}^{\infty} a_i^{(j)} t^{j}, \tag{20} \]
to the vector field (13) of the n-body KM system. The following lemma shows that any such Laurent solution of (13) can have at most simple poles. We may suppose that r in (20) is maximal, i.e., $a_i^{(0)} \neq 0$ for at least one i, and we call r the order of the Laurent solution. The order of pole (or zero) of $a_i(t)$ is denoted by $r_i$, so $r = \max_i r_i$.

Lemma 5.1. Let the Laurent series $a_i(t)$, i = 1, . . . , n, given by (20) be a solution to the vector field (13) of the n-body KM system. If at least one of the $a_i$ has a pole (for t = 0) then it is a Laurent solution of order 1. Moreover the orders of the pole (or zero) of each $a_i(t)$ satisfy
\[ r_i = a_{i+1}^{(0)} - a_{i-1}^{(0)}. \tag{21} \]

Proof. For s ∈ N we find from (20):
\[ \operatorname*{Res}_{t=0} \frac{\dot a_i(t)}{a_i(t)}\, t^{s} = \begin{cases} -r_i, & s = 0, \\ 0, & s > 0. \end{cases} \]
On the other hand, if we use (13) then we find
\[ \operatorname*{Res}_{t=0} \frac{\dot a_i(t)}{a_i(t)}\, t^{s} = \operatorname*{Res}_{t=0}\, \bigl(a_{i-1}(t) - a_{i+1}(t)\bigr) t^{s} = a_{i-1}^{(r-s-1)} - a_{i+1}^{(r-s-1)}. \]
We conclude that
\[ a_{i-1}^{(k)} - a_{i+1}^{(k)} = \begin{cases} -r_i, & k = r - 1, \\ 0, & 0 \le k \le r - 2. \end{cases} \tag{22} \]
Now substituting (20) into (13) and comparing the coefficient of $1/t^{r+1}$ the following equation (the indicial equation) is obtained:
\[ -r\,a_i^{(0)} = a_i^{(0)} \bigl( a_{i-1}^{(0)} - a_{i+1}^{(0)} \bigr), \qquad i = 1, \dots, n. \tag{23} \]
If $a_i$ has a pole of order r > 0 then $a_i^{(0)} \neq 0$ and (23) implies $a_{i-1}^{(0)} - a_{i+1}^{(0)} = -r$. Comparing with (22) we see that we must have r = 1 and that (21) holds.
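With r = 1 the indicial equation (23) reads $-a_i^{(0)} = a_i^{(0)}(a_{i-1}^{(0)} - a_{i+1}^{(0)})$ with cyclic indices, and for small n its solutions can simply be enumerated. The following brute-force sketch for n = 5 is an illustration only: the search is restricted to integer leading coefficients in a small range (an assumption made for the enumeration, not a statement from the lemma), and the function names are hypothetical.

```python
from itertools import product

def is_indicial_solution(a):
    """Test -a_i = a_i * (a_{i-1} - a_{i+1}) for all i, indices taken cyclically."""
    n = len(a)
    return all(-a[i] == a[i] * (a[i - 1] - a[(i + 1) % n]) for i in range(n))

def indicial_solutions(n, bound):
    """All non-zero integer solutions with entries in [-bound, bound]."""
    return [a for a in product(range(-bound, bound + 1), repeat=n)
            if any(a) and is_indicial_solution(a)]

def pole_orders(a):
    """Orders r_i = a_{i+1} - a_{i-1} from (21), indices taken cyclically."""
    n = len(a)
    return tuple(a[(i + 1) % n] - a[i - 1] for i in range(n))

if __name__ == "__main__":
    for sol in indicial_solutions(5, 2):
        print(sol, pole_orders(sol))
```

For n = 5 the search returns the five cyclic shifts of (−1, 1, 0, 0, 0) and the five cyclic shifts of (−2, 1, −1, 2, 0); in each of them the non-zero entries come in runs of even length.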



Notice that in view of the periodicity of the indices ($a_{i+n} = a_i$) the linear system
\[ 1 = a_{i+1}^{(0)} - a_{i-1}^{(0)}, \qquad i = 1, \dots, n, \]
has no solutions, so that at least one of the $a_i^{(0)}$ vanishes. If, say, $a_0^{(0)} = a_{k+1}^{(0)} = 0$ while $a_i^{(0)} \neq 0$ for i = 1, . . . , k for some k in the range 1, . . . , n − 1 (this includes the case of a single i for which $a_i^{(0)} \neq 0$) then the indicial equation specializes to
\[ a_2^{(0)} = 1, \qquad a_{i+1}^{(0)} - a_{i-1}^{(0)} = 1 \quad (i = 2, \dots, k - 1), \qquad a_{k-1}^{(0)} = -1, \]
which has no solution for k odd, and which has a unique solution $(a_1^{(0)}, \dots, a_k^{(0)}) = (-l, 1, 1 - l, 2, \dots, -1, l)$ for even k, k = 2l. The other variables $a_{k+1}^{(0)}, \dots, a_n^{(0)}$ can either be all zero, or they can constitute one or several other solutions of this type (with varying k = 2l), separated by zeroes. Using periodicity the other solutions to the indicial equation are obtained by cyclic permutation.

Thus we are led to the following combinatorial description of the solutions to the indicial equation of the n-body KM system. For a subset A of Z/n, and for p ∈ Z/n let us denote by A(p) ⊂ Z/n the largest subset of A that contains p and that consists of consecutive elements (with the understanding that A(p) = ∅ when p ∉ A). If we define
\[ F_n = \{ A \subset \mathbf{Z}/n \mid p \in A \Rightarrow \#A(p) \text{ is even} \}, \]
then we see that the solutions to the indicial equation are in one to one correspondence with the elements of $F_n$. In the sequel we freely use this bijection. For A ∈ $F_n$ we call the integer #A/2 its order, denoted by ord A. For each solution to the indicial equation (i.e., for each A ∈ $F_n$) we compute the eigenvalues of the Kowalevski matrix M, whose entries are given by
\[ M_{ij} = \frac{\partial F_i}{\partial a_j}\bigl(a_1^{(0)}, \dots, a_n^{(0)}\bigr) + \delta_{ij}, \]
where $F_i = a_i(a_{i-1} - a_{i+1})$, the i-th component of (13). The number of non-negative integer eigenvalues of this matrix is precisely the number of free parameters of the family of Laurent solutions whose leading term is given by $(a_1^{(0)}, \dots, a_n^{(0)})$ (see [1]), hence we can deduce from it which strata of the Abelian variety, whose affine part appears as a fiber of the momentum map, are parameterized by it.

Proposition 5.2. For a solution of the indicial equation corresponding to A ∈ $F_n$ the Kowalevski matrix M has n − ord A non-negative integer eigenvalues.

Proof. In view of (21) the entries of M can be written in the form
\[ M_{ij} = \begin{cases} (1 - r_i)\,\delta_{i,j}, & \text{if } a_i^{(0)} = 0, \\ a_i^{(0)} (\delta_{i,j+1} - \delta_{i,j-1}), & \text{if } a_i^{(0)} \neq 0. \end{cases} \]



Note also that, by using the Z/n action, we can assume that 1 ∈ A, n ∉ A, and that A is a disjoint union of A(p1), . . . , A(ps), with p1 < p2 < · · · < ps. Let $l_i = \operatorname{ord} A(p_i)$. Then M has the following form:
\[ M = \begin{pmatrix}
C_1 & E_1 & & & & & -l_1 \\
 & D_1 & & & & & \\
 & & C_2 & E_2 & & & \\
 & & & D_2 & & & \\
 & & & & \ddots & & \\
 & & & & & C_s & E_s \\
 & & & & & & D_s
\end{pmatrix}. \]
In the upper right corner the matrix has entry $-l_1$, and the blocks $C_i$, $D_i$ and $E_i$, i = 1, . . . , s, are matrices as follows:

• $C_i$ is a tridiagonal matrix of size $2l_i$ of the form:
\[ C_i = \begin{pmatrix}
0 & l_i & & & & & & \\
1 & 0 & -1 & & & & & \\
 & 1 - l_i & 0 & l_i - 1 & & & & \\
 & & 2 & 0 & -2 & & & \\
 & & & \ddots & \ddots & \ddots & & \\
 & & & & l_i - 1 & 0 & 1 - l_i & \\
 & & & & & -1 & 0 & 1 \\
 & & & & & & l_i & 0
\end{pmatrix}; \]
• $D_i$ is a diagonal matrix of the form $D_i = \operatorname{diag}(1 + l_i, 1, \dots, 1, 1 + l_{i+1})$, with the convention that if $D_i$ is 1 × 1 then its only entry is $1 + l_i + l_{i+1}$;
• $E_i$ is a matrix with only one non-zero entry $-l_i$ in the lower left corner.

It is clear that the set of eigenvalues of M is the union of the sets of eigenvalues of the $C_i$'s and $D_i$'s. Now we have:

Lemma 5.3. The eigenvalues of the matrix $C_i$ are $\{\pm 1, \pm 2, \dots, \pm l_i\}$.

Assuming the lemma to hold we find that the number of negative eigenvalues of M is equal to $\sum_{i=1}^{s} l_i = \sum_{i=1}^{s} \operatorname{ord} A(p_i) = \operatorname{ord} A$, so the proposition follows. So we are left with the proof of the lemma. We write l for $l_i$ and we denote by $e_j$ the j-th vector of the standard basis of $\mathbf{C}^{2l}$. In the basis $e_1, e_3, \dots, e_{2l-3}, e_{2l-1}, e_{2l}, e_{2l-2}, \dots, e_4, e_2$ the matrix $C_i$ takes the form
\[ \begin{pmatrix} 0 & A \\ A & 0 \end{pmatrix}, \]
where A is the transpose of the matrix
\[ I = \begin{pmatrix}
0 & \cdots & \cdots & 0 & 1 \\
\vdots & & 0 & 2 & -1 \\
\vdots & & 3 & -2 & 0 \\
0 & & & & \vdots \\
l & 1 - l & 0 & \cdots & 0
\end{pmatrix}. \]
We show that this matrix has eigenvalues $1, -2, 3, \dots, (-1)^{l-1} l$. Then the result follows because the eigenvalues of C are ± the eigenvalues of A. For j = 1, . . . , l, let $f_j = [1^{j-1}, 2^{j-1}, \dots, l^{j-1}]^T$ and let $V_j$ denote the span of $f_1, \dots, f_j$. For $v = [v_1, \dots, v_l]^T \in \mathbf{C}^l$ we have that $v \in V_j$ if and only if there exists a polynomial P of degree less than j such that $v_k = P(k)$ for k = 1, . . . , l. Since the k-th component of $I f_j$ is given by
\[ k(l - k + 1)^{j-1} + (1 - k)(l - k + 2)^{j-1} = (-1)^{j-1} j\, k^{j-1} \Bigl( 1 + O\Bigl(\frac{1}{k}\Bigr) \Bigr), \]
we have that $I f_j \in V_j$, more precisely $I f_j \in (-1)^{j-1} j f_j + V_{j-1}$. This means that in terms of the basis $f_j$ the matrix I is upper triangular, with the integers $1, -2, 3, \dots, (-1)^{l-1} l$ on the diagonal.

By the proposition above we can have a Laurent solution depending on n − 1 free parameters (a principal balance) only for the n choices of A given by $(a_1^{(0)}, \dots, a_n^{(0)}) = (-1, 1, 0, \dots, 0)$ and their cyclic permutations. Let us check that these lead indeed to asymptotic expansions which formally solve (13). By §2 in [1], these solutions are actually convergent and so they define convergent Laurent solutions. It suffices to do this for the solution $(a_1^{(0)}, \dots, a_n^{(0)}) = (-1, 1, 0, \dots, 0)$ of the indicial equation. By (21) we know that the orders of the singularities of this solution are $(r_1, \dots, r_n) = (1, 1, -1, 0, \dots, 0, -1)$ so we have the following ansatz for the formal expansions:
\[ \begin{aligned}
a_1(t) &= -\frac{1}{t} + \alpha_1 + \beta_1 t + O(t^2), \\
a_2(t) &= \frac{1}{t} + \alpha_2 + \beta_2 t + O(t^2), \\
a_3(t) &= \beta_3 t + O(t^2), \\
a_j(t) &= \alpha_j + \beta_j t + O(t^2), \qquad 4 \le j \le n - 1, \\
a_n(t) &= \beta_n t + O(t^2).
\end{aligned} \]



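If we substitute these expansions in Eq. (13) defining the n-body KM system we obtain the consistency equations:
\[ \begin{aligned}
\alpha_1 - \alpha_2 &= 0, \\
2\beta_1 - \beta_2 &= -\alpha_1\alpha_2 - \beta_n, \\
\beta_1 - 2\beta_2 &= -\alpha_1\alpha_2 + \beta_3, \\
\beta_j &= \alpha_j(\alpha_{j-1} - \alpha_{j+1}), \qquad 4 \le j \le n - 1.
\end{aligned} \]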
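The consistency equations can be checked mechanically by inserting the truncated ansatz into (13) and inspecting the coefficients of t⁻², t⁻¹ and t⁰. The sketch below does this for n = 5 in exact arithmetic. The parameter names (alpha for α1 = α2, delta for α4, gamma for β3, beta for β5) and the dictionary representation of Laurent polynomials are choices made for this illustration, and β1, β2 are obtained by solving the two linear consistency equations.

```python
from fractions import Fraction

def mul(p, q):
    """Product of two Laurent polynomials stored as {exponent: coefficient}."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return out

def sub(p, q):
    """Difference p - q of two Laurent polynomials."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) - c
    return out

def deriv(p):
    """Termwise derivative with respect to t."""
    return {e - 1: e * c for e, c in p.items() if e != 0}

def ansatz_n5(alpha, delta, gamma, beta):
    """Truncated expansions a_1..a_5 with beta1, beta2 solved from
    2*b1 - b2 = -alpha^2 - beta and b1 - 2*b2 = -alpha^2 + gamma."""
    beta1 = -(alpha**2 + 2 * beta + gamma) / Fraction(3)
    beta2 = (alpha**2 - beta - 2 * gamma) / Fraction(3)
    return [
        {-1: Fraction(-1), 0: alpha, 1: beta1},
        {-1: Fraction(1), 0: alpha, 1: beta2},
        {1: gamma},
        {0: delta},          # beta4 = alpha4 * (alpha3 - alpha5) = 0
        {1: beta},
    ]

def residual_vanishes(a, i, max_order=0):
    """True if da_i/dt - a_i*(a_{i-1} - a_{i+1}) has no terms of order <= max_order."""
    n = len(a)
    rhs = mul(a[i], sub(a[i - 1], a[(i + 1) % n]))
    res = sub(deriv(a[i]), rhs)
    return all(c == 0 for e, c in res.items() if e <= max_order)

if __name__ == "__main__":
    a = ansatz_n5(Fraction(2, 3), Fraction(5), Fraction(-1, 2), Fraction(7, 4))
    print([residual_vanishes(a, i) for i in range(5)])
```

Only coefficients up to order t⁰ are tested, because the neglected O(t²) terms of the ansatz first contribute at order t¹.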

They give exactly the n − 1 free parameters $\alpha_1, \alpha_4, \dots, \alpha_{n-1}, \beta_3, \beta_n$. The coefficients $a^{(k)} = (a_1^{(k)}, \dots, a_n^{(k)})$ for k > 2 are then completely determined since they satisfy an equation of the form
\[ (M - kI) \cdot a^{(k)} = \text{some polynomial in the } a_i^{(j)} \text{ with } j < k, \]
and the eigenvalues of the Kowalevski matrix M are −1, 1, 2, by the proof above. This leads to the following result.

Theorem 5.4. When n is odd the general fiber of the momentum map of the n-body KM system is an affine part of a hyperelliptic Prym variety, obtained by removing n translates of its theta divisor. When n is even the general fiber consists of two isomorphic components which admit the same description as in the odd case. In both cases the Prym variety admits an alternative description as a hyperelliptic Jacobian.

6. Example: n = 5

In this section we study the 5-body KM system in more detail. Its phase space is four-dimensional and is given by $K_5 = \{(a_1, a_2, a_3, a_4, a_5) \mid a_1 a_2 a_3 a_4 a_5 = 1\}$, with Lax operator
\[ L = \begin{pmatrix}
0 & a_1 & 0 & 0 & h^{-1} \\
1 & 0 & a_2 & 0 & 0 \\
0 & 1 & 0 & a_3 & 0 \\
0 & 0 & 1 & 0 & a_4 \\
h a_5 & 0 & 0 & 1 & 0
\end{pmatrix}. \]
The spectral curve det(x Id − L) = 0 is explicitly given by
\[ h + \frac{1}{h} = x^5 - Kx^3 + Lx, \]
where $K = a_1 + a_2 + a_3 + a_4 + a_5$, $L = a_1 a_3 + a_2 a_4 + a_3 a_5 + a_4 a_1 + a_5 a_2$. These functions are in involution with respect to the quadratic Poisson structure, given by $\{a_i, a_j\} = (\delta_{i,j+1} - \delta_{i+1,j}) a_i a_j$. It follows from the previous section that for generic k, l the affine surface $P_{kl}$ defined by K = k, L = l is an affine part of the Jacobian of the genus two Riemann surface Γτ minus five translates of its theta divisor, which is isomorphic to Γτ. As we have seen, an equation for $\Gamma_\tau^{(0)}$ is given by
\[ \Gamma_\tau^{(0)} : y^2 = (u^3 - ku^2 + lu)^2 - 4u. \tag{24} \]
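The form of the spectral curve stated above for n = 5 can be confirmed by evaluating det(x Id − L) at random rational points. This is a small numerical sketch, not a derivation; it uses the 5 × 5 Lax matrix of this section (a_i above the diagonal, 1 below it, h⁻¹ and h a_5 in the corners) with a_1 · · · a_5 = 1, and the helper names are illustrative.

```python
from fractions import Fraction
import random

def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in m]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result

def lax_matrix(a, h):
    """KM Lax operator with zero diagonal, as displayed for n = 5."""
    n = len(a)
    L = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n - 1):
        L[i][i + 1] = a[i]
        L[i + 1][i] = Fraction(1)
    L[0][n - 1] = 1 / h
    L[n - 1][0] = h * a[n - 1]
    return L

def spectral_identity_holds(a, x, h):
    """Test det(x*Id - L) == -h - 1/h + x^5 - K*x^3 + L2*x for n = 5."""
    L = lax_matrix(a, h)
    m = [[(x if i == j else 0) - L[i][j] for j in range(5)] for i in range(5)]
    K = sum(a)
    L2 = sum(a[i] * a[(i + 2) % 5] for i in range(5))
    return det(m) == -h - 1 / h + x**5 - K * x**3 + L2 * x

def random_a(rng):
    """Random rational point with a_1*a_2*a_3*a_4*a_5 = 1."""
    a = [Fraction(rng.randint(1, 9), rng.randint(1, 9)) for _ in range(4)]
    prod = Fraction(1)
    for ai in a:
        prod *= ai
    return a + [1 / prod]

if __name__ == "__main__":
    rng = random.Random(2)
    print(spectral_identity_holds(random_a(rng), Fraction(3, 2), Fraction(5, 7)))
```

The sum `a[i] * a[(i + 2) % 5]` reproduces the five terms a1a3 + a2a4 + a3a5 + a4a1 + a5a2 of the second integral.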

The two commuting Hamiltonian vector fields $X_K$ and $X_L$ are given by
\[ \begin{aligned}
\dot a_1 &= a_1(a_5 - a_2), & a_1' &= a_1(a_3 a_5 - a_2 a_4), \\
\dot a_2 &= a_2(a_1 - a_3), & a_2' &= a_2(a_4 a_1 - a_3 a_5), \\
\dot a_3 &= a_3(a_2 - a_4), & a_3' &= a_3(a_5 a_2 - a_4 a_1), \\
\dot a_4 &= a_4(a_3 - a_5), & a_4' &= a_4(a_1 a_3 - a_5 a_2), \\
\dot a_5 &= a_5(a_4 - a_1), & a_5' &= a_5(a_2 a_4 - a_1 a_3).
\end{aligned} \]
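That K, L and the product a1a2a3a4a5 are constant along both vector fields can be checked by computing their directional derivatives at a random rational point. The following is a minimal sketch with illustrative function names; it tests only conservation of the three functions, not commutativity of the flows.

```python
from fractions import Fraction
import random

def x_K(a):
    """First vector field: da_i/dt = a_i * (a_{i-1} - a_{i+1}), cyclic indices."""
    return [a[i] * (a[i - 1] - a[(i + 1) % 5]) for i in range(5)]

def x_L(a):
    """Second vector field: a_i' = a_i * (a_{i+2} a_{i+4} - a_{i+1} a_{i+3})."""
    return [a[i] * (a[(i + 2) % 5] * a[(i + 4) % 5] - a[(i + 1) % 5] * a[(i + 3) % 5])
            for i in range(5)]

def d_K(a, v):
    """Derivative of K = sum of the a_i in the direction v."""
    return sum(v)

def d_L(a, v):
    """Derivative of L = sum of a_i * a_{i+2} in the direction v."""
    return sum(v[i] * a[(i + 2) % 5] + a[i] * v[(i + 2) % 5] for i in range(5))

def d_product(a, v):
    """Derivative of a_1 * ... * a_5 in the direction v."""
    total = Fraction(0)
    for i in range(5):
        term = v[i]
        for j in range(5):
            if j != i:
                term *= a[j]
        total += term
    return total

def conserved_at(a):
    """True if K, L and the product have zero derivative along both fields at a."""
    return all(d(a, field(a)) == 0
               for field in (x_K, x_L)
               for d in (d_K, d_L, d_product))

if __name__ == "__main__":
    rng = random.Random(3)
    a = [Fraction(rng.randint(-9, 9), rng.randint(1, 9)) for _ in range(5)]
    print(conserved_at(a))
```

Conservation of the product is what makes the constraint a1a2a3a4a5 = 1 compatible with both flows.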

The principal balance of XK for which a1 and a2 have a pole corresponds, according to Sect. 5, to the following solution of the indicial equations: (0)

(0)

(0)

(0)

(0)

(a1 , a2 , a3 , a4 , a5 ) = (−1, 1, 0, 0, 0), and its first few terms are given by 1 1 a1 = − + α − (α 2 + 2β + γ )t + O(t 2 ), t 3 1 1 2 a2 = + α + (α − β − 2γ )t + O(t 2 ), t 3 a3 = γ t + O(t 2 ),

(25)

a4 = δ + O(t ), 2

a5 = βt + O(t 2 ). Here α, β, γ and δ are the free parameters. If we look for Laurent solutions that correspond to the divisor to be added to Pkl we find by substituting the above Laurent solution in K = k, L = l, a1 a2 a3 a4 a5 = 1,   2α + δ = k, 2αδ + β − γ = l,  γβδ = −1, which means that the Laurent solution depends on two parameters β and δ, bound by the relation (k − δ)δ + β +

1 = l, βδ

(26)

which is an (affine) equation for the theta divisor, i.e., for τ ; it is easy to see that this curve is birational to the curve (24). The other four principal balances are obtained by cyclic permutation from (25). Pkl can be embedded explicitly in projective space by using the functions with a pole of order at most 3 along one of the translates of the theta divisor and no other poles. Since the theta divisor defines a principal polarization on its Jacobian, the vector space of such functions has dimension 32 = 9, giving an embedding in P8 . One checks by

194


direct computation that the following functions $z_0, \dots, z_8$ form a basis for the space of functions with a pole of order at most 3 along the divisor associated with the Laurent solution (25) (the first two functions are obvious choices from the expression (25), while the others can be obtained from them by taking the derivative along the two flows):

$$
\begin{aligned}
z_0 &= 1,\\
z_1 &= a_1a_2,\\
z_2 &= a_1a_2a_4,\\
z_3 &= a_1a_2(a_1 + a_5),\\
z_4 &= a_1a_2a_4(a_3 + a_4 + a_5),\\
z_5 &= a_1a_2a_4(a_1 - a_2),\\
z_6 &= a_1a_2a_4\bigl((a_3 + a_4)a_1 - (a_4 + a_5)a_2\bigr),\\
z_7 &= a_1^2a_2^2a_4a_5,\\
z_8 &= a_1a_2^2a_4\bigl((a_4 + a_5)^2 + a_3a_4\bigr).
\end{aligned}
$$

The corresponding embedding of the Jacobian in $\mathbf{P}^8$ is then given explicitly on the affine surface $P_{kl}$ by $(a_1, \dots, a_5) \mapsto (z_0 : \cdots : z_8)$. By substituting the five principal balances in this embedding and letting $t \to 0$ we find an embedding of the five curves (numbered 1, . . . , 5, in that order) which constitute the divisor Jac(τ) \ $P_{kl}$:

$$
(\beta, \delta) \mapsto
\begin{cases}
(0 : 0 : 0 : 1 : 0 : 2\delta : 2\delta^2 : \beta\delta : -\delta^3)\\
(\beta\delta^2 : -\beta^2\delta^2 : 0 : -\beta^2\delta^3 : \beta\delta : \beta\delta : \beta\delta^2 : 0 : 1 - \beta\delta^3)\\
(1 : 0 : \beta\delta : 0 : \beta\delta(k - \delta) : \beta\delta^2 : -\beta\delta(\beta + \delta^2 - k\delta) : 0 : \beta^2\delta(k - \delta))\\
(\beta^2\delta : 0 : \beta\delta : -\beta\delta : \beta\delta(k - \delta) : -\beta\delta^2 : 1 + \beta\delta^2(\delta - k) : -\delta : -\beta\delta^2(\beta - (\delta - k)^2))\\
(\beta\delta^2 : -\delta : 0 : \delta(\delta - k) : \beta\delta : -\beta\delta : -\beta\delta^2 : -1 : 1).
\end{cases}
$$

The points on the divisor that correspond to the above Laurent solutions are the ones for which $\beta$ and $\delta$ are finite; notice that all these points in $\mathbf{P}^8$ are different. In order to determine the coordinates of the other points and the incidence relations between these points and the curves we choose a local parameter $u$ around each of the three points needed to complete (26) into a compact Riemann surface:

(a) $\delta = 1/u$, $\beta = (1/u^2)(1 + O(t))$;
(b) $\delta = 1/u$, $\beta = u^3(1 + O(t))$;
(c) $\beta = 1/u$, $\delta = -u^2(1 + O(t))$.

Substituting these in the equations of the five embedded curves we find the following 5 points (each one is found 3 times because it belongs to three of the curves):

$$
\begin{aligned}
p_1 &= (0 : 0 : 0 : 1 : 0 : 0 : 0 : 0 : 0),\\
p_2 &= (0 : 0 : 0 : 0 : 0 : 0 : 0 : 0 : 1),\\
p_3 &= (1 : 0 : 0 : 0 : 0 : 0 : 1 : 0 : -k),\\
p_4 &= (1 : 0 : 0 : 0 : 0 : 0 : -1 : 0 : 0),\\
p_5 &= (0 : 0 : 0 : 0 : 0 : 0 : 0 : 1 : -1).
\end{aligned}
$$
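The limiting procedure can be illustrated on the first embedded curve alone. The sketch below is a minimal numerical check, not the authors' computation: it substitutes only the leading terms of the local parametrizations (a), (b), (c) into the first curve, rescales the projective coordinates by the entry of largest modulus, and takes $u$ small. Under these assumptions the three branches approach $p_5$, $p_2$ and $p_1$ respectively. All function names are hypothetical.

```python
from fractions import Fraction


def curve_1(beta, delta):
    """First embedded curve: (0:0:0:1:0:2d:2d^2:b d:-d^3)."""
    zero = Fraction(0)
    return [zero, zero, zero, Fraction(1), zero,
            2 * delta, 2 * delta**2, beta * delta, -delta**3]


def local_parameter(branch, u):
    """Leading terms of (beta, delta) for the local parametrizations (a), (b), (c)."""
    if branch == "a":
        return 1 / u**2, 1 / u
    if branch == "b":
        return u**3, 1 / u
    if branch == "c":
        return 1 / u, -u**2
    raise ValueError(branch)


def normalize(point):
    """Rescale projective coordinates so the entry of largest modulus equals 1."""
    pivot = max(point, key=abs)
    return [c / pivot for c in point]


def close(p, q, tol):
    """Entrywise comparison of two normalized coordinate vectors."""
    return all(abs(x - y) <= tol for x, y in zip(p, q))


def limit_point(branch, u):
    beta, delta = local_parameter(branch, u)
    return normalize(curve_1(beta, delta))


P1 = [0, 0, 0, 1, 0, 0, 0, 0, 0]
P2 = [0, 0, 0, 0, 0, 0, 0, 0, 1]
P5 = [0, 0, 0, 0, 0, 0, 0, 1, -1]

u = Fraction(1, 10**6)
for branch, target in (("a", P5), ("b", P2), ("c", P1)):
    print(branch, close(limit_point(branch, u), target, Fraction(1, 1000)))
```

The same three substitutions can be repeated on the other four curves to recover each point three times, as stated above.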


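[Fig. 2. Incidence diagram of the five curves (labeled 1, . . . , 5) and the five points p1, . . . , p5; the points p3 and p4 each appear twice.]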

With this labeling of the points $p_i$, the $i$-th curve contains the points $p_{i-1}$, $p_i$ and $p_{i+1}$. As a corollary we find a $5_3$ configuration on the Jacobian, where the incidence pattern of the 5 Painlevé divisors and the 5 points $p_i$ is as in Fig. 2 (to make the picture exact one has to identify the two points labeled $p_3$, as well as the two points labeled $p_4$, in such a way that curves 2 and 4 are tangent, as well as curves 3 and 5). Obviously the order 5 automorphism $(a_1, a_2, a_3, a_4, a_5) \mapsto (a_2, a_3, a_4, a_5, a_1)$ preserves the affine surfaces $P_{kl}$ and maps every curve and every point $p_i$ to its neighbor. Since this automorphism does not have any fixed points it is a translation on Jac(τ), and since its order is 5 it is a translation over 1/5 of a period. Notice also that with the above labeling of points and divisors the intersection point between curves $i$ and $i+2$ is $p_{i+1}$ (so they are tangent), while the intersection points between curves $i$ and $i+1$ are $p_i$ and $p_{i+1}$. Dually, the divisors that pass through $p_i$ are precisely those labeled $i-1$, $i$ and $i+1$. The usual Olympic rings are nothing but an asymmetric projection of this most beautiful Platonic configuration!

Communicated by M. Aizenman

Commun. Math. Phys. 221, 197 – 227 (2001)




The Complex Geometry of Weak Piecewise Smooth Solutions of Integrable Nonlinear PDE’s of Shallow Water and Dym Type

Mark S. Alber$^{1,2}$, Roberto Camassa$^{3,4}$, Yuri N. Fedorov$^{5}$, Darryl D. Holm$^{4,†}$, Jerrold E. Marsden$^{6,‡}$

1 Department of Mathematics, Stanford University, Building 380, MC 2125, Stanford, CA 94305, USA.
2 Department of Mathematics, University of Notre Dame, Notre Dame, IN 46556, USA.
3 Department of Mathematics, University of North Carolina, Chapel Hill, NC 27599, USA
4 Center for Nonlinear Studies and Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA. E-mail: [email protected]; [email protected]
5 Department of Mathematics and Mechanics, Moscow Lomonosov University, Moscow 119 899, Russia.
6 Control and Dynamical Systems 107-81, California Institute of Technology, Pasadena, CA 91125, USA. E-mail: [email protected]

Received: 16 February 1999 / Accepted: 10 April 2001

To the 70th birthday of Solomon Alber

Abstract: An extension of the algebraic-geometric method for nonlinear integrable PDE’s is shown to lead to new piecewise smooth weak solutions of a class of N-component systems of nonlinear evolution equations. This class includes, among others, equations from the Dym and shallow water equation hierarchies. The main goal of the paper is to give explicit theta-functional expressions for piecewise smooth weak solutions of these nonlinear PDE’s, which are associated to nonlinear subvarieties of hyperelliptic Jacobians. The main results of the present paper are twofold. First, we exhibit some of the special features of integrable PDE’s that admit piecewise smooth weak solutions, which make them different from equations whose solutions are globally meromorphic, such as the KdV equation. Second, we blend the techniques of algebraic geometry and weak solutions of PDE’s to gain further insight into, and explicit formulas for, piecewise-smooth finite-gap solutions. The basic technique used to achieve these aims is rather different from earlier papers dealing with peaked solutions. First, profiles of the finite-gap piecewise smooth solutions are linked to certain finite dimensional billiard dynamical systems and ellipsoidal billiards. Second, after reducing the solution of certain finite dimensional Hamiltonian systems on Riemann surfaces to the solution of a nonstandard Jacobi inversion problem, this is resolved by introducing new parametrizations. Amongst other natural consequences of the algebraic-geometric approach, we find finite dimensional integrable Hamiltonian dynamical systems describing the motion of peaks in the finite-gap as well as the limiting (soliton) cases, and solve them exactly. The dynamics of the peaks is also obtained by using Jacobi inversion problems. Finally, we relate our method to the shock wave approach for weak solutions of wave equations by determining jump conditions at the peak location.

Research partially supported by NSF grant DMS 9626672 and NATO grant CRG 950897.
Research supported in part by US DOE CCPP and BES programs and NATO grant CRG 950897.
Research supported by INTAS grant 97-10771 and, in part, by the Center for Applied Mathematics, University of Notre Dame.
† Research supported in part by US DOE CCPP and BES programs.
‡ Research partially supported by the California Institute of Technology and NSF grant DMS 9802106.

Contents
1. Introduction
2. Finite-Gap Solutions
3. Flows on n-Dimensional Quadrics and Stationary n-Gap Solutions of the (HD) and (SW) Equations
4. Billiard Dynamical Systems and Piecewise-Smooth Weak Solutions of PDE’s
5. Kinematics of Peaks
6. The Dynamics of Peaks and Weak Solutions

1. Introduction

An important feature of many integrable nonlinear evolution equations is the nature of their soliton solutions. There are many examples of such solutions found in a variety of physical applications, such as nonlinear optics and water wave equations. Nonsmooth soliton solutions of integrable equations are now well known, and include solutions of the shallow water equation (SW) with peaks, the points at which their spatial derivative changes sign (see Camassa and Holm [1993] and Camassa, Holm and Hyman [1994]). It was noted in Alber et al. [1994, 1995, 1999] that the spatial structure of these “peakon” and finite-gap piecewise smooth weak solutions is closely related to finite dimensional integrable billiard systems.

Some history. Camassa and Holm [1993] described classes of n-peakon solutions for an integrable equation in the context of a model for shallow water theory. This work (see also Camassa, Holm and Hyman [1994]) contains many other facts about these equations as well, such as a Hamiltonian derivation of the equation, the associated linear isospectral eigenvalue problem and its discrete spectrum corresponding to the peakons, a steepening lemma important for understanding how solutions lose regularity, numerical stability, etc. Of particular interest to us is their description of the dynamics of the peakons in terms of a finite-dimensional completely integrable Hamiltonian system. In other words, each peakon solution can be associated with a mechanical system of moving particles. Calogero [1995] and Calogero and Francoise [1996] further extended the class of mechanical systems of this type.

It is well known (see, for example, Ablowitz and Segur [1981]) that solitons and quasi-periodic solutions of most classical integrable equations can be obtained by using the inverse scattering transform (IST) method.
This is done by establishing a connection with an isospectral eigenvalue problem for an associated operator that is often a Schrödinger operator. In some cases it involves a potential in the form of an entire function of the spectral parameter. Such an operator is called an energy-dependent Schrödinger operator. The scattering problem for operators of this type was studied by Jaulent [1972] and Jaulent and Jean [1976]. On the other hand, in connection with certain N-component systems of integrable evolution equations, Antonowicz and Fordy [1989] investigated certain energy-dependent scalar Schrödinger operators. Using this formalism, they obtained multi-Hamiltonian structures for this class of systems.

Later, Alber et al. [1994, 1995, 1999] showed that in the case of certain potentials, a limiting procedure can be applied to generic solutions, which results in solutions with peaks. The latter were related to finite dimensional integrable dynamical systems with reflections and were termed piecewise-smooth solutions, a terminology that hereafter we will adopt. This relation provides an efficient route to the study of finite-gap and piecewise soliton solutions of nonlinear PDE’s. The approach is based on studying finite dimensional Hamiltonian systems on certain Riemann surfaces and can be used for a number of equations including the shallow water equation, the Dym type equation, as well as certain N-component systems and equations in their hierarchies.

Finite-gap solutions of the Dym equation were studied in Dmitrieva [1993a] and Novikov [1999] by making use of a connection with the KdV equation and with the aid of additional phase functions. Soliton solutions of Dym type equations were studied in Dmitrieva [1993b]. Periodic solutions of the shallow water equation were discussed in McKean and Constantin [1999]. The papers by Beals et al. [1998, 1999, 2000] used Stieltjes’ theorem on continued fractions and the classical moment problem for studying multi-peakon solutions of the (SW) equation. Multi-peakon solutions have also been derived in Camassa [2000] by Gram–Schmidt orthogonalization.

The main results of this paper.
While our techniques are rather general and can be applied to large classes of N-component systems, we shall illustrate them in detail for two specific integrable PDE’s. One of these equations is a member of the Dym hierarchy that has been studied by, amongst others, Kruskal [1975], Cewen [1990], Hunter and Zheng [1994] and Alber et al. [1995, 1999]. Using subscript notation for partial derivatives, this equation is

$$
U_{xxt} + 2U_xU_{xx} + UU_{xxx} - 2\kappa U_x = 0.
\tag{HD}
$$

The other equation, derived from the Euler equations of hydrodynamics in a shallow water framework by Camassa and Holm [1993], is

$$
U_t + 3UU_x = U_{xxt} + 2U_xU_{xx} + UU_{xxx} - 2\kappa U_x.
\tag{SW}
$$

In both equations, the dependent variable $U(x, t)$ may be interpreted as a horizontal fluid velocity and $\kappa$ is a parameter. Under appropriate boundary conditions, applying the limit $\kappa \to 0$ to (SW) leads to an equation that has peaked solutions. For equation (HD), such solutions exist also for $\kappa \neq 0$ (for example periodic and finite-gap peaked solutions).

By using the method of generating equations for nonlinear integrable PDE’s, we reduce the equations to a Jacobi inversion problem associated with hyperelliptic curves. The solutions $U(x, t)$ themselves are given by trace formulae, i.e., sums of coordinates of points on such curves. An important feature is that the corresponding Abel–Jacobi mapping is not a standard one. First of all, the holomorphic differentials that are involved do not form a complete set of such differentials on a hyperelliptic curve. Second, it involves a meromorphic differential. As a result, the image of the mapping turns out to be a non-Abelian subvariety

(a stratum) of a generalized Jacobian. This also implies that the x- and t-flows of (HD) and (SW) are essentially nonlinear, i.e., they are not translationally invariant. Seen from the viewpoint of algebraic geometry, these nonstandard aspects constitute the main difference between shallow water and Dym type equations, and equations of KdV type and more generally equations from the whole KP hierarchy which lead to standard Abel–Jacobi mappings. The basic technique of the present paper is rather different from earlier papers dealing with peaked solutions. First, profiles of the finite-gap piecewise-smooth solutions are linked to certain finite dimensional billiard dynamical systems and ellipsoidal billiards in the field of Hooke potentials. Second, after reducing the solution of the finite dimensional Hamiltonian systems on Riemann surfaces to the solution of a nonstandard Jacobi inversion problem, it is resolved by introducing new parametrizations. The philosophy that “justifies” procedures of this sort is that, in the end, by using the trace formulae, we obtain weak solutions of the PDE’s (HD) and (SW) in the spacetime sense. This is regarded as equivalent to the validity of Hamilton’s principle for these PDE’s and is taken as a fundamental criterion for the definition of their solutions. It is worth emphasizing that Hamilton’s principle naturally leads to weak solutions in the spacetime sense (and not in the spatial sense alone). We might also remark that even for billiards, one has to be careful about the sense in which solutions are interpreted. In the case of a point particle bouncing off a wall, for example, the equations of motion themselves do not rigorously make sense at the collision; what does make sense is the fundamental principle of Hamilton. This point of view of course is not new – see, e.g., Young [1969] and Kane et al. [1999]. The contents of the paper. In Sect. 
2, basic trace formulae and µ-variable representations are used to establish a connection between solutions of the nonlinear equations and finite dimensional Hamiltonian systems on Riemann surfaces. These representations describe finite-gap and soliton type solutions, as well as mixed soliton–finite-gap solutions. Then, solving the Hamiltonian systems is reduced to Jacobi inversion problems with meromorphic differentials. These inversion problems are solved by introducing a new parameterization that yields a Hamiltonian flow on a nonlinear subvariety of the Jacobi variety. The approach of recurrence chains used in this section is demonstrated in detail in the case of Dym-type equations. In Sect. 3 the geodesic motion and motion in the field of a Hooke potential on an ellipsoid are linked, at any fixed time t, to finite-gap solutions of (HD) and (SW) equations respectively through trace formulae. In Sect. 4 it is shown how peaked finite-gap solutions of (HD) and (SW) equations arise in the particular limit of smooth solutions. Based on this, a connection to ellipsoidal and hyperbolic billiards is used to construct the peak solutions of equations (HD) and (SW) in the form of an infinite sequence of pieces, corresponding to the segments between impacts, glued together along peaks. The motion between impacts in the billiard problems is made linear on generalized Jacobians of hyperelliptic curves. By solving the corresponding generalized Jacobi inversion problem, we find theta-function solutions to the billiards, which thereby enables us to describe explicit peak solutions for the above PDE. We then extend the analysis from fixed-time peak solutions to time-dependent ones and show that the latter are described by an infinite number of meromorphic pieces in x and t that are glued along peak lines (surfaces) where the solution has discontinuous derivatives in the dependent variables. We give theta-function expressions for the pieces and the peak surfaces.
These formulae may be useful


for stability analysis as well as for numerical investigations of the perturbed nonlinear PDE’s. In Sect. 5 the Hamiltonian structure for the motion of the peaks of the finite-gap piecewise solutions is obtained by using algebraic-geometric methods. Lastly, in Sect. 6 we relate our method to the shock wave approach for weak solutions of wave equations by determining jump conditions at the shock location.

2. Finite-Gap Solutions

In this section we will show that even on the level of finite-gap solutions, there are crucial differences between the KdV equation case and equations (HD) or (SW). The same method can be applied to other equations forming the HD and SW hierarchy as well as to N-component systems of nonlinear evolution equations which have associated with them energy-dependent Schrödinger operators (see Alber et al. [1997]).

We will start by describing the algebraic-geometrical structure of finite-gap solutions of equations (HD) and (SW) related to a hyperelliptic curve of genus n, also called n-gap solutions. For the HD equation such solutions were obtained in terms of theta-functions by Dmitrieva [1993a] (see also Dmitrieva [1993b]) and Novikov [1999]. For equation (SW) on a circle, the problem was discussed in Constantin and McKean [1999].

Lax pairs and recurrence chains. We now use the recurrence chain approach to develop a basic trace formula which establishes a connection between solutions of equation (HD) and finite dimensional Hamiltonian systems on Riemann surfaces, written in the so-called µ-variables representation. This representation describes finite-gap solutions, as well as their limiting forms of soliton type. This representation also yields the existence of peakons in a special limiting case. For definiteness, we concentrate here on equation (HD). Analogous results are available in the case of equation (SW) (for details see Alber et al. [1994, 1995]).

The hierarchy of Dym equations is obtained from the Lax equations

$$
\frac{\partial L}{\partial t_n} = [L, A_n], \qquad n \in \mathbb{N}, \qquad L = -\frac{\partial^2}{\partial x^2} + V(E, x, t_n),
$$

where the potential $V(E, x, t_n)$ is written in terms of a complex parameter $E$ in the form

$$
V(x, t_n, E) = \frac{M(x, t_n)}{2E},
\tag{2.1}
$$

for a function $M(x, t_n)$ to be determined below. Assuming $[L, A_n]$ to be a scalar operator, we choose $A_n = B_n\partial_x - \tfrac{1}{2}B_n'$ (the prime denotes the derivative with respect to $x$) for some function $B_n(E, x, t_n)$ and obtain the following sequence of equations for $V$,

$$
\frac{\partial V}{\partial t_n} = -\frac{1}{2}\frac{\partial^3 B_n}{\partial x^3} + 2\frac{\partial B_n}{\partial x}V + B_n\frac{\partial V}{\partial x}.
\tag{2.2}
$$

Now we choose $B_n$ to be a polynomial in $E$ of degree $n$:

$$
B_n(x, t, E) = b_0\prod_{k=1}^{n}\bigl(E - \mu_k(x, t)\bigr) = \sum_{k=0}^{n} b_{n-k}(x, t)\,E^k.
\tag{2.3}
$$
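The two representations of $B_n$ in (2.3) are related by expanding the product. As a small illustration (a sketch with hypothetical names, taking $b_0 = 1$ and treating the roots as plain numbers at a fixed point $(x, t)$), the coefficients $b_1, \dots, b_n$ can be computed from the roots $\mu_k$ as follows.

```python
def coefficients_from_roots(mu):
    """Return [b_0, b_1, ..., b_n] with prod_k (E - mu_k) = sum_k b_{n-k} E^k.

    Coefficients are listed from the highest power of E down, and b_0 = 1.
    """
    coeffs = [1]
    for root in mu:
        # Multiply the current polynomial by (E - root).
        expanded = coeffs + [0]
        for i, c in enumerate(coeffs):
            expanded[i + 1] -= root * c
        coeffs = expanded
    return coeffs


roots = [1, 2, 3]
b = coefficients_from_roots(roots)
print(b)                     # coefficients b_0, ..., b_3
print(b[1] == -sum(roots))   # b_1 is minus the sum of the roots
```

In particular $b_1$ is minus the sum of the roots and $b_n$ is, up to sign, their product.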

Substituting the expressions (2.1) and (2.3) into the generating equation (2.2) and equating like powers of $E$, we obtain a recurrence chain for the coefficients of $B(x, t)$ which yields the nth equation of the Dym hierarchy. For example, putting $t_1 = t$ and choosing $n = 1$, $B_1(x, t, E) = b_0(x, t)E + b_1(x, t)$ yields the following chain (primes denote derivatives with respect to $x$):

$$
\begin{aligned}
E^{1}&: \quad -b_0''' = 0,\\
E^{0}&: \quad -b_1''' + 2b_0'M + b_0M' = 0,\\
E^{-1}&: \quad 2b_1'M + b_1M' = \frac{\partial M}{\partial t}.
\end{aligned}
\tag{2.4}
$$

After setting $b_0 = 1$ and using (2.1), we get

$$
\begin{aligned}
M' &= b_1''',\\
2b_1'M + b_1M' &= \frac{\partial M}{\partial t}.
\end{aligned}
\tag{2.5}
$$

The first equation defines $b_1$ in terms of $M$, $M = b_1'' + \kappa$, with $\kappa$ a constant. Renaming $b_1 = -U$, so that

$$
M = -U'' + \kappa,
\tag{2.6}
$$

and putting this into the second equation of the set (2.5) results in equation (HD): the second equation becomes $-U_{xxt} = 2U_xU_{xx} + UU_{xxx} - 2\kappa U_x$. (For further details about the hierarchies of (HD) and (SW), see for example Alber et al. [1994, 1995, 1999].) The method of generating equations is due to S. Alber [1979] and another exposition of it can be found in Alber et al. [1985, 1997]. We call (2.2) the “dynamical generating equation”, because it generates a hierarchy of equations governing the dynamics of the dependent variable $M(x, t)$.

Remark. The flows where $B_n$ is a polynomial in $E$, as in the definition (2.3) and in the example above, will in general lead to nonlocal equations, i.e., the evolution equation for $M$ involves terms that depend on nonlocal operators acting on combinations of $M$ and its derivatives. This can be seen, for instance, in Eq. (2.5) where both $b_1$ and $b_1'$ require inverting (2.6) to write $U$ in terms of $M$. Thus, flows generated by polynomials $B_n$ in $E$ should be properly classified as integro-differential evolution equations, rather than PDE’s. In contrast, the choice of polynomials in $1/E$ for $B_n$ leads to flows that are local, i.e., $M_t$ only depends on combinations of $M$ and its (spatial) derivatives, and these flows are proper PDE’s. This feature of equations of Dym (HD) or shallow water (SW) type is somewhat different from other completely integrable PDE’s like the KdV or Sine-Gordon equation. Equations (HD) and (SW) possess “open ended” hierarchies: the recurrence chain can be extended from negative to positive powers of $E$, by choosing $B_n$ in (2.2) to be a rational function of the parameter $E$. The case when the chain includes only negative powers of $E$ is in fact the one most studied in the literature (see, e.g., Dmitrieva [1993a], Novikov [1999] for the case of the Dym equation).

Now let us consider the stationary flow for the nth equation of the hierarchy, which is obtained by dropping the time derivative of $V$ in the left-hand side of (2.2). By definition


a stationary equation describes a finite-dimensional system for the coefficients of $B_n$ and is equivalent to the $2 \times 2$ Lax pair

$$
\frac{\partial}{\partial x}W_n(E) = -[W_n(E), L(E)], \quad\text{or}\quad \left[\frac{\partial}{\partial x} + L(E),\; W_n(E)\right] = 0,
$$

$$
W_n(E) = \begin{pmatrix} -\tfrac{1}{2}B_n' & B_n \\[4pt] -\tfrac{1}{2}B_n'' + B_n\dfrac{M}{E} & \tfrac{1}{2}B_n' \end{pmatrix}, \qquad
L = \begin{pmatrix} 0 & 1 \\[4pt] \dfrac{M}{E} & 0 \end{pmatrix}.
\tag{2.7}
$$

The matrix $W_n(E)$ undergoes an isospectral deformation. Hence the spectral curve $\Gamma = \{|W_n(E) - zI| = 0\}$ is an invariant of the stationary flow. The curve is hyperelliptic and can be represented in the form

$$
\Gamma = \{w^2 = \mu\,C(\mu)\},
\tag{2.8}
$$

where $z = w/E$ and

$$
C(E) = E\left(-B_nB_n'' + \tfrac{1}{2}B_n'^{\,2}\right) + B_n^2M.
\tag{2.9}
$$

Since $B_n$ is a polynomial of degree $n$, $C(E)$ becomes a polynomial of degree (at most) $2n$:

$$
C(E) = \sum_{j=0}^{2n} C_jE^j = C_{2n}\prod_{k=1}^{2n}(E - m_k),
\tag{2.10}
$$

for some constants $m_k$, $k = 1, \dots, 2n$. In this case the curve $\Gamma$ has genus $n$ and we set the coefficient $C_{2n}$ to be a negative number: $C_{2n} \equiv -L_0^2$. We shall refer to (2.9) as the stationary generating equation. Equating like powers of $E$ in both sides of the stationary generating equation yields first integrals

$$
\begin{aligned}
E^{2n}&: \quad C_{2n} = -b_1'' + M,\\
E^{2n-1}&: \quad C_{2n-1} = -b_1b_1'' - b_2'' + \tfrac{1}{2}(b_1')^2 + 2b_1M,\\
E^{j}&: \quad \cdots\\
E^{0}&: \quad C_0 = b_n^2M.
\end{aligned}
\tag{2.11}
$$
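The structure of these first integrals can be checked directly in the simplest case. The sketch below (hypothetical names, exact rational arithmetic) takes $n = 1$, $b_0 = 1$, so that $B_1 = E + b_1(x)$ and $b_2 = 0$, and treats the values of $b_1$, $b_1'$, $b_1''$ and $M$ at a fixed point $x$ as independent numbers. It compares the stationary generating expression $C(E) = E(-BB'' + \tfrac{1}{2}B'^2) + B^2M$ with the quadratic $C_2E^2 + C_1E + C_0$ built from the coefficients read off from the expansion; in this case the $E^0$ coefficient comes out as $b_1^2M$.

```python
from fractions import Fraction


def c_of_e(E, b1, db1, d2b1, M):
    """Stationary generating expression for n = 1, with B = E + b1 at a fixed x."""
    B, dB, d2B = E + b1, db1, d2b1
    return E * (-B * d2B + dB**2 / 2) + B**2 * M


def first_integrals(b1, db1, d2b1, M):
    """Coefficients (C_2, C_1, C_0) of C(E) for n = 1 (so b_2 = 0)."""
    C2 = -d2b1 + M
    C1 = -b1 * d2b1 + db1**2 / 2 + 2 * b1 * M
    C0 = b1**2 * M
    return C2, C1, C0


# Arbitrary sample values of b1, b1', b1'' and M at one point x.
b1, db1, d2b1, M = Fraction(3, 2), Fraction(-1, 3), Fraction(5), Fraction(2, 7)
C2, C1, C0 = first_integrals(b1, db1, d2b1, M)
for E in (Fraction(-2), Fraction(1, 5), Fraction(4)):
    print(E, c_of_e(E, b1, db1, d2b1, M) == C2 * E**2 + C1 * E + C0)
```

This only confirms the algebra of the expansion at one point; the statement that the $C_j$ are constant in $x$ is the content of the stationary equation itself.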

Let us consider the divisor of points $P_1 = (\mu_1, w_1), \dots, P_n = (\mu_n, w_n)$ on $\Gamma$. Substituting (2.3) into (2.9) and setting $E = \mu_1, \dots, \mu_n$ successively, one gets the following system of equations describing the evolution of the points under the stationary flow:

$$
\frac{\partial\mu_i}{\partial x} = \mu_i' \equiv \frac{\sqrt{R(\mu_i)}}{\mu_i\prod_{j \neq i}^{n}(\mu_i - \mu_j)},
\tag{2.12}
$$

where

$$
R(\mu) = \mu\,C(\mu) = -L_0^2\,\mu\prod_{r=1}^{2n}(\mu - m_r).
\tag{2.13}
$$

In the case of equation (SW), this should be replaced by

$$
R(\mu) = \mu\prod_{r=1}^{2n+1}(\mu - m_r).
\tag{2.14}
$$
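For real motion in (2.12) the radicand $R(\mu)$ must be nonnegative, which singles out the intervals in which a root variable can oscillate. The following sketch locates those intervals for the (HD) form (2.13) by a simple sign analysis between consecutive zeros of $R$. It assumes real, distinct, positive branch points $m_r$ and a real $L_0$, which is an illustrative choice and not a condition stated in the text; the function names are hypothetical.

```python
def r_hd(mu, branch_points, l0=1.0):
    """R(mu) = -L0^2 * mu * prod_r (mu - m_r), the (HD) form (2.13)."""
    value = -(l0 ** 2) * mu
    for m in branch_points:
        value *= (mu - m)
    return value


def real_gaps(branch_points, l0=1.0):
    """Bounded intervals between consecutive zeros of R on which R > 0."""
    zeros = sorted(set([0.0] + [float(m) for m in branch_points]))
    gaps = []
    for left, right in zip(zeros, zeros[1:]):
        midpoint = 0.5 * (left + right)
        # The sign of R is constant between consecutive zeros.
        if r_hd(midpoint, branch_points, l0) > 0:
            gaps.append((left, right))
    return gaps


print(real_gaps([1, 2, 3, 4]))  # n = 2: two candidate intervals
```

With four branch points (genus $n = 2$) this yields two intervals, one for each root variable, in line with the count of $\mu$-variables in (2.12).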

We now proceed to describe finite-gap solutions of equation (HD) and the other equations from its hierarchy. According to a general theory (see, e.g., Dubrovin [1981], Belokolos et al. [1994]), for any fixed $t$, the $x$-profile of an n-gap solution of an integrable PDE satisfies the nth stationary equation of the hierarchy. Hence, n-gap solutions $M(x, t_k)$ of the kth equation of the HD hierarchy must satisfy the stationary generating equation (2.9) represented by the Lax pair (2.7), as well as the dynamical generating equation

$$
\frac{\partial V}{\partial t_k} = -\frac{1}{2}\frac{\partial^3 B_k}{\partial x^3} + 2\frac{\partial B_k}{\partial x}V + B_k\frac{\partial V}{\partial x}, \qquad V = \frac{M(x, t_k)}{2E},
\tag{2.15}
$$

where the coefficients of $B_k(E)$ are found recursively. Notice that the latter equation is equivalent to the matrix commutativity relation

$$
\left[\frac{\partial}{\partial x} + L,\; \frac{\partial}{\partial t_k} + W_k\right] = 0,
\tag{2.16}
$$

where

$$
W_k(E) = \begin{pmatrix} -\tfrac{1}{2}B_k' & B_k \\[4pt] -\tfrac{1}{2}B_k'' + B_k\dfrac{M}{E} & \tfrac{1}{2}B_k' \end{pmatrix},
\tag{2.17}
$$

and $L$ is defined in (2.7). The compatibility of conditions (2.16) and (2.7) leads to the following Lax pair:

$$
\frac{\partial}{\partial t_k}W_n(E) = -[W_n(E), W_k(E)], \qquad k \in \mathbb{N}, \quad k \neq n.
\tag{2.18}
$$

For $k = n$, we replace (2.18) with the Lax pair (2.7), thus identifying $t_n$ with $x$. The (1,2)-entry of the matrix equation (2.18) implies the following $t_k$-evolution of the polynomial $B_n(E)$:

$$
\frac{\partial B_n}{\partial t_k} = B_k\frac{\partial B_n}{\partial x} - B_n\frac{\partial B_k}{\partial x}, \qquad k \neq n.
\tag{2.19}
$$

In case $k = n$ this relation is replaced by

$$
\frac{\partial B}{\partial x} = v\,b_0\,\frac{\partial B}{\partial t_n},
$$

where $v$ is a constant, which can always be eliminated by rescaling $t_n$.
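The passage from the matrix equation (2.18) to the scalar relation for $B_n$ is a short commutator computation. The sketch below (hypothetical names, exact rational arithmetic) builds two matrices of the form (2.17) from arbitrary sample values of $B$, $B'$, $B''$ and a common value $q = M/E$ at a fixed $(x, E)$, and checks that the (1,2) entry of $-[W_n, W_k]$ equals $B_kB_n' - B_nB_k'$.

```python
from fractions import Fraction


def w_matrix(B, dB, d2B, q):
    """W(E) = [[-B'/2, B], [-B''/2 + B q, B'/2]] with q = M/E at a fixed (x, E)."""
    return [[-dB / 2, B], [-d2B / 2 + B * q, dB / 2]]


def matmul(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator(X, Y):
    """[X, Y] = XY - YX for 2x2 matrices."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]


# Arbitrary sample values of B, B', B'' for the two polynomials and q = M/E.
Bn, dBn, d2Bn = Fraction(3), Fraction(-2), Fraction(5, 7)
Bk, dBk, d2Bk = Fraction(1, 2), Fraction(4), Fraction(-3)
q = Fraction(2, 3)

entry_12 = -commutator(w_matrix(Bn, dBn, d2Bn, q), w_matrix(Bk, dBk, d2Bk, q))[0][1]
print(entry_12 == Bk * dBn - Bn * dBk)
```

The (1,2) entry does not involve $q$ or the second derivatives, which is why the resulting evolution equation for $B_n$ is first order in $x$.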


Expanding the right-hand side of (2.19) in $E$ and using the condition that it must be a polynomial of degree $n - 1$, we find

$$
B_k(E) = \left[B_n(E)\,\frac{1}{E^{n-k}}\right]_+,
\tag{2.20}
$$

where $[\;]_+$ denotes the polynomial part of the expression. As follows from the first equation in (2.11), $M = C_{2n} + b_1''$. On the other hand, according to formula (2.3), $b_1 = -\sum_{i=1}^{n}\mu_i$. Finally, using the definition (2.6) of $M$ in terms of the solution $U$ and integrating twice with respect to $x$, we obtain

$$
U = \sum_{i=1}^{n}\mu_i + \frac{1}{2}(\kappa - C_{2n})\,x^2 + K_1x + K_2,
\tag{2.21}
$$

where $K_1$ and $K_2$ are constants of integration. If we assume that all the variables $\mu_i$ are bounded, which is related to the choice of sign of the leading order coefficient $C_{2n}$, then $b_1$ is a bounded function of $x$. To find bounded solutions $U(x, t)$ of the PDE, we set

$$
C_{2n} = \kappa \quad\text{and}\quad K_1 = 0.
$$

Hence, when the above requirements are imposed, we see that the leading order coefficient of the polynomial $C(E)$ must coincide with the parameter $\kappa$ of the PDE. The Dym equation (HD) is invariant under the Galilean transformation

$$
\hat x = x + K_2t, \qquad \hat t = t, \qquad \hat U = U - K_2,
$$

so that the constant $K_2$ can always be eliminated from expression (2.21). Therefore, under the boundedness conditions above, and up to a Galilean transformation, we assume that the finite-gap and soliton solutions of the Dym equation (HD) are reconstructed in terms of the root variables $\mu$'s by the “trace” formula, which in the case of equations (HD) and (SW) has the form

$$
U(x, t) = \sum_{i=1}^{n}\mu_i - m.
\tag{2.22}
$$

Here $m$ is a constant, which equals zero in the case of equation (HD). Through (2.22) a solution of the system (2.12) allows one to construct the instantaneous profile of $U(x, \cdot)$ from a set of initial conditions $\mu_i(x, \cdot) = \mu_i(0, \cdot) \in [m_{2i}, m_{2i+1}]$, $i = 1, \dots, n$. Here the “dot” notation stresses the fact that time $t$ is just a parameter in this system.

On the other hand, substitution of (2.3) into (2.19), setting $E = \mu_1, \dots, \mu_n$ successively, and taking into account expressions (2.12) results in the following $t_k$-evolution equations for $\mu_i$,

$$
\frac{\partial\mu_i}{\partial t_k} = B_k(\mu_i)\frac{\partial\mu_i}{\partial x} = B_k(\mu_i)\frac{\sqrt{R(\mu_i)}}{\mu_i\prod_{j \neq i}^{n}(\mu_i - \mu_j)}, \qquad i = 1, \dots, n,
\tag{2.23}
$$

where, in view of (2.3) and (2.20), for $k = 1, \dots, n - 1$,

$$
B_k(\mu_i) = \operatorname{Res}_{s=0}\left[\frac{1}{s^{n-k}}\,\frac{(s - \mu_1)\cdots(s - \mu_n)}{s - \mu_i}\right],
$$

i.e., up to the sign, the kth elementary symmetric function of $\{\mu_1, \dots, \mu_n\} \setminus \mu_i$. In the case $k = 1$,

$$
\dot\mu_i \equiv \frac{\partial\mu_i}{\partial t_1} = \frac{\bigl(\mu_i - \sum_{j=1}^{n}\mu_j\bigr)\sqrt{R(\mu_i)}}{\mu_i\prod_{j \neq i}^{n}(\mu_i - \mu_j)}, \qquad i = 1, \dots, n,
\tag{2.24}
$$

the solution of which produces the $\mu$'s, and hence the PDE’s solution $U$, at any (later) time $t$. We notice that for $k > n$, the derivatives $\partial/\partial t_k$ are linear combinations of $\partial/\partial t_1, \dots, \partial/\partial t_n$.

Expressions (2.12), (2.23), and (2.24) provide the so-called µ-variables representation for the finite-gap solutions of an evolution equation. They are the analogs of the Drach–Dubrovin equations which describe the evolution of points on the spectral hyperelliptic curve in the case of the KdV equation. (For further details see Dubrovin [1975], Drach [1919], Alber et al. [1994, 1995, 1999], Gesztesy et al. [1996], and Alber and Fedorov [2001].) With the initial conditions chosen, the right-hand side of system (2.12) is real, and the derivative of $\mu_i$ changes sign when $\mu_i$ reaches the end points of its gap, $\mu_i = m_{2i}$ or $\mu_i = m_{2i+1}$, corresponding to a change of the sheet of the spectral curve $\Gamma$. Thus each variable $\mu$ undergoes (real) oscillations between the end points of a gap (so that the resulting PDE solution $U(x, t)$ remains real).

Remark. The condition that the root variables $\mu$'s are real (or, equivalently, their initial conditions are chosen as described above), while certainly sufficient to assure reality of the PDE’s solution $U$ resulting from (2.21), is clearly not necessary (namely, some of the $\mu$'s could occur in conjugate pairs). A wider class of real solutions $U$ could be constructed by relaxing the reality assumption on the µ-variables. However, a thorough discussion of the reality condition for $U$ and its implications for the root variables, while certainly desirable, lies beyond the scope of the present paper, and it will be addressed in future work.

By rearranging and summing up Eqs. (2.12), (2.23) and (2.24), one obtains the following nonstandard Abel–Jacobi equations

$$
\sum_{i=1}^{n}\frac{\mu_i^k\,d\mu_i}{\sqrt{R(\mu_i)}} =
\begin{cases}
dt_k, & k = 1, \dots, n - 1,\\
dx, & k = n,
\end{cases}
\tag{2.25}
$$

which contain $(n-1)$ holomorphic differentials and one meromorphic differential on $\Gamma$. Thus, the number of holomorphic differentials is less than the genus of the Riemann surface, which implies that the corresponding inversion problem cannot be solved in terms of meromorphic functions of $x$ and $t_1,\dots,t_{n-1}$ (see e.g., Markushevich [1977]).

Finite-gap stationary flows in x. Let us first consider the $x$-flow by fixing time variables in (2.25): $t_k=t_{k0}=\text{const}$, $k=1,\dots,n-1$, so that $dt_k=0$. Now introduce a new spatial variable $x_1$ defined as follows:
\[
x=\int_0^{x_1}\frac{1}{L_0}\,\mu_1\cdots\mu_n\,dx_1. \tag{2.26}
\]
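The reparametrization (2.26) is used together with Jacobi's partial-fraction identities for the sums $\sum_i \mu_i^k/\prod_{j\neq i}(\mu_i-\mu_j)$. The following minimal Python sketch checks these sums in exact rational arithmetic. The function name `jacobi_sum` and the sample values are illustrative assumptions, not taken from the paper. Note that a direct evaluation of the case $k=-1$ gives $(-1)^{n-1}/(\mu_1\cdots\mu_n)$, so for even $n$ that case holds only up to sign.

```python
from fractions import Fraction
from math import prod


def jacobi_sum(mu, k):
    """Return sum_i mu_i**k / prod_{j != i} (mu_i - mu_j) as an exact fraction."""
    total = Fraction(0)
    for i, mi in enumerate(mu):
        denom = prod(mi - mj for j, mj in enumerate(mu) if j != i)
        total += Fraction(mi) ** k / denom
    return total


# Arbitrary distinct sample values (hypothetical, chosen only for the check).
mu = [Fraction(2), Fraction(3), Fraction(5), Fraction(7)]
n = len(mu)

for k in range(-1, n + 1):
    print(k, jacobi_sum(mu, k))
```

For $k=0,\dots,n-2$ the sum vanishes, for $k=n-1$ it equals 1, and for $k=n$ it equals $\mu_1+\cdots+\mu_n$.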



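In view of the well-known Jacobi identities
\[
\sum_{i=1}^{n}\frac{\mu_i^{k}}{\prod_{j\neq i}(\mu_i-\mu_j)} =
\begin{cases}
1/(\mu_1\cdots\mu_n), & k=-1,\\
0, & k=0,\dots,n-2,\\
1, & k=n-1,\\
\Sigma, & k=n,
\end{cases} \tag{2.27}
\]
Eqs. (2.12) give rise to the following system:
\[
\sum_{i=1}^{n}\int_{\mu_0}^{\mu_i}\frac{\mu^{k-1}\,d\mu}{\sqrt{R(\mu)}} =
\begin{cases}
x_1+\phi_1, & k=1,\\
\phi_k, & k=2,\dots,n,
\end{cases} \tag{2.28}
\]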

where $\phi_1,\dots,\phi_n$ are constant phases which depend on $t_{k0}$ as parameters. Equations (2.28) include $n$ holomorphic differentials on $\Gamma$ and determine the standard Abel–Jacobi map of the symmetric product $\Gamma^{(n)}$ of $n$ copies of $\Gamma$ to the Jacobi variety (Jacobian) $\mathrm{Jac}(\Gamma)$. Thus, the flow generated by the system (2.12) is made linear on $\mathrm{Jac}(\Gamma)$ after introducing the reparametrization (2.26).

By using standard methods (see e.g., Dubrovin [1981] or Mumford [1983]), the map can be inverted, resulting in expressions for algebraic symmetric functions of the $\mu$-variables in terms of theta-functions of $n$ arguments which depend linearly on $x_1$ and, in a transcendental way, on $t_{k0}$ as parameters. Then, by using the trace formula (2.22), one obtains a theta-functional expression for $U$ as a function of $x_1$, $t_{k0}$: $U=\tilde U(x_1,t_{k0})$. On the other hand, substituting the theta-functional expression for the product $\mu_1\cdots\mu_n$ into (2.26) yields a quadrature. By solving it, one finds $x$ as a meromorphic function of $x_1$ which depends on $t_0$ as a parameter. However, the inverse function $x_1(x,t_0)$ is no longer meromorphic in $x$. Finally, the composition $U(x,t_0)=\tilde U(x_1(x,t_0),t_{k0})$ gives a profile of the finite-gap solutions of the (HD) or (SW) equation (for explicit theta-functional expressions $\tilde U(x_1,t_0)$, $x(x_1,t_0)$ see Alber and Fedorov [2001]).

Notice that, as seen from (2.26) and (2.28), the original $x$-flow is also made linear on $\mathrm{Jac}(\Gamma)$. However, the straight line motion is not uniform. The transformation (2.26) involving $x$ and $x_1$ coincides with a change of variable in the well-known Liouville transformation (see, e.g., Verhulst [1996]).

Finite-gap flows in $t_k$. Now let us fix the coordinate $x=x_0$ as well as all the times $t_1,\dots,t_{n-1}$ but $t_k$. Then introduce a new time variable $\tilde t$ defined by
\[
dt_k=\frac{\mu_1\cdots\mu_n}{L_0\,\Sigma_{k-1}}\,d\tilde t, \tag{2.29}
\]

where $\Sigma_{k-1}$ are the elementary symmetric functions of $\mu_1,\dots,\mu_n$ such that $(s-\mu_1)\cdots(s-\mu_n)=s^{n}+s^{n-1}\Sigma_1+\cdots+s^{0}\Sigma_n$. Applying again the identities (2.27), from (2.24) and (2.23) we arrive at the following canonical Abel–Jacobi mapping
\[
\sum_{i=1}^{n}\int_{\mu_0}^{\mu_i}\frac{\mu^{s-1}\,d\mu}{2\sqrt{R(\mu)}}=\psi_s=
\begin{cases}
\tilde t+\phi_1, & s=1,\\
\delta_{s,k}\,t_k+\phi_s, & s=2,\dots,n,
\end{cases} \tag{2.30}
\]



where $\phi_1,\dots,\phi_n$ are constant phases which depend on $x_0$ and the rest of the times $t_l$ as parameters, and $\delta_{ij}$ is the Kronecker delta. As a result of inversion of (2.30), elementary symmetric functions of the $\mu$'s, and therefore the solution of equations (HD) and (SW), can be found in terms of theta-functions of $n$ arguments which depend linearly on $\psi_s$. This means that the arguments depend linearly on $\tilde t$, as well as on the original time $t_k$. However, $\tilde t$ itself depends on $t_k$ in a nonlinear way. Indeed, to describe the relation between $\tilde t$ and $t_k$, we substitute the theta-functional expressions for the symmetric functions $\Sigma_n=\mu_1\cdots\mu_n$ and $\Sigma_{k-1}$ into (2.29). As a result, in contrast to the quadrature (2.26) relating $x$ and $x_1$, we now get a differential equation of the form
\[
\frac{dt_k}{d\tilde t}=F(t_k,\tilde t\,|\,x_0),
\]
where $F$ is a transcendental function of $t_k$, $\tilde t$ and the parameter $x_0$. It can be shown that the equation involves a transcendental integral.

Remarks.

1. In contrast to the $x_1$- and $x$-flows considered above, the flows generated by (2.23) ($t_k$-flows), including (2.24), are nonlinear flows on the Jacobi variety $\mathrm{Jac}(\Gamma)$. From the point of view of algebraic geometry, this phenomenon constitutes the main difference between solutions of such well known equations as the KdV and sine-Gordon equations and equations of (HD) or (SW) type.

2. The problem of inversion of the full nonstandard Abel mapping defined by (2.25) can also be studied by using a generalized Jacobian of the curve $\Gamma$. Namely, one has to extend the mapping by including an extra holomorphic differential on $\Gamma$ to get a complete set of such differentials. As a result of this procedure, one gets a flow on nonlinear subvarieties (strata) of generalized Jacobians. The complete algebraic geometrical description and explicit formulae are presented in Alber and Fedorov [2001].

3. Flows on n-Dimensional Quadrics and Stationary n-Gap Solutions of the (HD) and (SW) Equations

Consider a family of confocal quadrics in $\mathbb{R}^{n+1}=(X_1,\dots,X_{n+1})$,
\[
\tilde Q(s)=\left\{\frac{X_1^2}{a_1-s}+\cdots+\frac{X_{n+1}^2}{a_{n+1}-s}=1\right\},\qquad s\in\mathbb{R},\qquad 0<a_{n+1}<a_1<\cdots<a_n. \tag{3.1}
\]
The elliptic coordinates $\mu_1,\dots,\mu_{n+1}$ can be defined in $\mathbb{R}^{n+1}$ in a standard way (see, e.g., Jacobi [1884a]) as follows. The condition $s=c$ determines the quadric $\tilde Q(c)$ on which one of the coordinates, say $\mu_{n+1}$, equals $c$, and the other coordinates $\mu_1,\dots,\mu_n$ are elliptic coordinates on $\tilde Q(c)$ defined by the relations
\[
X_j^2=(a_j-c)\,\frac{\prod_{l=1}^{n}(a_j-\mu_l)}{\prod_{k=1,\,k\neq j}^{n+1}(a_j-a_k)},\qquad j=1,\dots,n+1. \tag{3.2}
\]
In the sequel, without loss of generality, we assume $c=0$.

It is well known that the problem of geodesics on the ellipsoid $\tilde Q=\tilde Q(0)$ is completely integrable (Jacobi [1884a,b]). Moreover, as noticed by Jacobi himself and later



by many other authors (see e.g. Rauch-Wojciechowski [1995]), there exists an infinite sequence of integrable generalizations of the problem describing a motion on $\tilde Q$ in the force field of certain polynomial potentials $V_p(X_1,\dots,X_{n+1})$, $p\in\mathbb{N}$, of degree $2p$. The simplest integrable potential is the quadratic Hooke potential, or the potential of an elastic string joining the center of the ellipsoid $\tilde Q$ to the point mass on it:
\[
V_1=\frac{\sigma}{2}\left(X_1^2+\cdots+X_{n+1}^2\right),\qquad \sigma=\text{const}.
\]

In this case, in terms of the ellipsoidal coordinates, the total energy (Hamiltonian) takes the Stäckel form:
\[
H=\frac{1}{8}\sum_{i=1}^{n}\frac{\prod_{j\neq i}(\mu_i-\mu_j)\,\mu_i}{\Phi(\mu_i)}\left(\frac{d\mu_i}{dx}\right)^{2}+\frac{\sigma}{2}\sum_{i=1}^{n}\mu_i+\text{const},
\]
where
\[
\Phi(\mu)=(\mu-a_1)\cdots(\mu-a_{n+1})
\]
and $x$ denotes time. After fixing constants of motion, the system is reduced to the Abel–Jacobi equations
\[
\sum_{k=1}^{n}\int_{\mu_0}^{\mu_k}\frac{\mu^{i}\,d\mu}{2\sqrt{R(\mu)}}=\delta_{in}\,x+\phi_i,\qquad i=1,\dots,n, \tag{3.3}
\]
\[
R(\mu)=-\mu\,\Phi(\mu)\left[c_0(\mu-c_1)\cdots(\mu-c_{n-1})-\sigma\mu^{n}\right],\qquad c_0,\dots,c_{n-1}=\text{const},
\]

where $\phi_1,\dots,\phi_n$ are constant phases and $c_1,\dots,c_{n-1}$ are constants of motion. Notice that for $\sigma=0$ the order of the polynomial $R(\mu)$ is $2n+1$, whereas for $\sigma\neq 0$ it is $2n+2$. The case $\sigma=0$ corresponds to the free (geodesic) motion on $\tilde Q$. Here $c_0$ is the constant in the first integral $(\dot X,\dot X)$, and the remaining constants admit a clear geometric interpretation: the tangent line to a geodesic is also tangent to the fixed confocal quadrics $\tilde Q(c_1),\dots,\tilde Q(c_{n-1})$ (Chasles theorem).

Now notice that Eqs. (3.3) are equivalent to the system (2.25) with $dt=0$ describing stationary (HD) and (SW) equations, provided we identify the roots of the polynomial $R(\mu)$ with those of the odd order polynomial (2.13) (for $\sigma=0$ and $L_0=1$) and of the even order polynomial (2.14) (for $\sigma=1$) respectively. The equivalence also holds when some of the parameters $a_i$ in (3.3) are negative, which corresponds to the motion on a hyperboloid. For concreteness, we shall consider only the case of ellipsoids. Taking into account the trace formula (2.22), we arrive at the following theorem:

Theorem 3.1. The geodesic motion and the motion in the field of a Hooke potential on the ellipsoid $\tilde Q$ are linked, at any fixed time $t$, to the $n$-gap solutions of the (HD) and (SW) equations respectively through the trace formula (2.22). Namely, if the roots of the polynomials $R(\mu)$ in (2.13) or (2.14) coincide with the roots of $R(\mu)$ in (3.3), the profiles of such solutions are given by the sum of the elliptic coordinates of the moving point on $\tilde Q$, with addition of $(-m)$ in the case of equation (SW).

For the geodesic flow on $\tilde Q$ ($\sigma=0$) and equation (HD), this result was obtained in Alber and Alber [1985], Cewen [1990], and Alber et al. [1995]. As with Eq. (2.25), under the change of parameter (2.26), Eqs. (3.3) reduce to those containing holomorphic differentials only and having the same structure as (2.28). By



inverting the corresponding Abel–Jacobi mapping, one obtains explicit expressions for elementary symmetric functions of $\mu_i$ and, in view of (3.2), for the Cartesian coordinates $X_1,\dots,X_{n+1}$ in terms of theta-functions of the new parameter $x_1$ (for the case of the geodesic flow, see Weierstrass [1844], Moser [1978], and Knörrer [1982]). In the case $n=2$, the change of parameter (2.26) was first applied by Weierstrass [1844] to solve the classical Jacobi geodesic problem on a triaxial ellipsoid (Jacobi [1884a, 1884b]).
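As a concrete illustration of the elliptic coordinates (3.2) with $c=0$, the following Python sketch converts a pair $(\mu_1,\mu_2)$ into the squares $X_j^2$ for $n=2$ and checks numerically that the point lies on $\tilde Q(0)$ and on the confocal quadrics $\tilde Q(\mu_l)$. The function name `cartesian_squares` and the numerical values are hypothetical, chosen only so that the $\mu$'s interlace the $a$'s and all $X_j^2$ are positive.

```python
from math import prod


def cartesian_squares(a, mu, c=0.0):
    """Squares X_j^2 from (3.2): (a_j - c) * prod_l (a_j - mu_l) / prod_{k != j} (a_j - a_k)."""
    return [
        (aj - c) * prod(aj - ml for ml in mu)
        / prod(aj - ak for k, ak in enumerate(a) if k != j)
        for j, aj in enumerate(a)
    ]


# Hypothetical sample with n = 2, ordered as in (3.1): a_3 < a_1 < a_2.
a = [2.0, 4.0, 1.0]   # (a_1, a_2, a_3)
mu = [1.5, 3.0]       # a_3 < mu_1 < a_1 < mu_2 < a_2

X2 = cartesian_squares(a, mu)

# The point lies on the ellipsoid Q~(0): sum_j X_j^2 / a_j = 1.
on_ellipsoid = sum(x2 / aj for x2, aj in zip(X2, a))

# It also lies on each confocal quadric Q~(mu_l): sum_j X_j^2 / (a_j - mu_l) = 1.
on_confocal = [sum(x2 / (aj - ml) for x2, aj in zip(X2, a)) for ml in mu]

print(X2, on_ellipsoid, on_confocal)
```

A by-product of the same partial-fraction identity is $\sum_j X_j^2=\sum_j a_j-\sum_l\mu_l-c$, which relates the sum of the elliptic coordinates to the squared distance from the origin.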

4. Billiard Dynamical Systems and Piecewise-Smooth Weak Solutions of PDE's

In this section it is first shown how peaked finite-gap solutions of the (HD) and (SW) equations arise in the limit $m_1\to 0$, where $m_1$ is the smallest root of the polynomial $R(E)$ in Eqs. (2.12)–(2.24). Then a connection to ellipsoidal and hyperbolic billiards is established.

Ellipsoidal billiards and generalized Jacobians. Suppose that one of the semi-axes of the ellipsoid $\tilde Q$ tends to zero, namely, $a_{n+1}\to 0$. In the limit, $\tilde Q$ passes into the interior of the $(n-1)$-dimensional ellipsoid
\[
Q=\{X_1^2/a_1+\cdots+X_n^2/a_n=1\}\subset\mathbb{R}^{n},\qquad \mathbb{R}^{n}=(X_1,\dots,X_n).
\]
The elliptic coordinates $\mu_1,\dots,\mu_n$ on $\tilde Q$ transform to elliptic coordinates in $\mathbb{R}^{n}$, giving
\[
X_j^2=\frac{\prod_{l=1}^{n}(a_j-\mu_l)}{\prod_{k=1,\,k\neq j}^{n}(a_j-a_k)},\qquad j=1,\dots,n, \tag{4.1}
\]
which appear as the corresponding limits of (3.2).

Then the motion on $\tilde Q$ gets transformed into billiard motion inside the ellipsoid $Q$. Geodesics on $\tilde Q$ pass into straight line segments inside $Q$, whereas the points of intersection of the geodesics with the plane $\{X_{n+1}=0\}$ are mapped into impact points on $Q$ with elastic reflection. Also, the motion on $\tilde Q$ under the Hooke force passes to the motion inside $Q$ under the action of the Hooke force with the potential $V=\sigma(X_1^2+\cdots+X_n^2)/2$. However, in contrast to the cases $\sigma=0$ or $\sigma<0$, for $\sigma>0$ (an attracting Hooke potential), for the trajectory to reach $Q$ the total energy $h$ must be sufficiently large. Namely, there ought to exist a positive $\varepsilon$ such that inside $Q$ the following double inequality holds: $h+\sigma(X_1^2+\cdots+X_n^2)/2>\varepsilon>0$.

Under this condition, the motion on $\tilde Q$ transforms to billiard motion inside the ellipsoid $Q$, again having impacts and elastic reflections along $Q$. Thus, we have "an ellipsoidal billiard with the Hooke potential" which is described by the mapping $B:(x,v)\to(\tilde x,\tilde v)$, where $x,v\in\mathbb{R}^{n}$ are the Cartesian coordinates of a point on $Q$ and the starting velocity vector respectively, while $(\tilde x,\tilde v)$ are the coordinates and the starting velocity at



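the next impact point. Following Fedorov [2001], the mapping has the form
\[
\tilde x=-\frac{1}{\nu}\left[(\sigma-(v,a^{-1}v))\,x+2(x,a^{-1}v)\,v\right],
\]
\[
\tilde v=-\frac{1}{\nu}\left[(\sigma-(v,a^{-1}v))\,v-2\sigma(x,a^{-1}v)\,x\right]+\Lambda a^{-1}\tilde x
=-\frac{1}{\nu}\left[(\sigma-(v,a^{-1}v))(v+\Lambda a^{-1}x)+2(x,a^{-1}v)(\Lambda a^{-1}v-\sigma x)\right], \tag{4.2}
\]
\[
\nu=\sqrt{4\sigma(x,a^{-1}v)^{2}+(\sigma-(v,a^{-1}v))^{2}},\qquad
\Lambda=\frac{2(\tilde v,a^{-1}\tilde x)}{(\tilde x,a^{-2}\tilde x)}.
\]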

Notice that in the limit $\sigma\to 0$ this reduces to a standard billiard mapping given in Veselov [1988]:
\[
\tilde x=x-\frac{2(x,a^{-1}v)}{(v,a^{-1}v)}\,v,\qquad
\tilde v=v+\frac{2(\tilde v,a^{-1}\tilde x)}{(\tilde x,a^{-2}\tilde x)}\,a^{-1}\tilde x.
\]
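The $\sigma\to 0$ mapping can be iterated directly. The Python sketch below is a minimal illustration under stated assumptions: the helper names `dot` and `billiard_step` and the sample ellipse are hypothetical, and the reflection is written in the explicit form $\tilde v=v-2(v,a^{-1}\tilde x)\,a^{-1}\tilde x/(\tilde x,a^{-2}\tilde x)$, which agrees with the implicit formula above because an elastic reflection reverses the normal component, $(\tilde v,a^{-1}\tilde x)=-(v,a^{-1}\tilde x)$.

```python
def dot(u, w):
    """Euclidean scalar product of two vectors given as lists."""
    return sum(ui * wi for ui, wi in zip(u, w))


def billiard_step(x, v, a):
    """One step (x, v) -> (x~, v~) of the billiard inside {sum X_i^2 / a_i = 1}, sigma = 0."""
    ainv_v = [vi / ai for vi, ai in zip(v, a)]
    # Chord to the next impact point: x~ = x - 2 (x, a^-1 v) / (v, a^-1 v) * v.
    t = -2.0 * dot(x, ainv_v) / dot(v, ainv_v)
    x_new = [xi + t * vi for xi, vi in zip(x, v)]
    # Elastic reflection about the normal direction a^-1 x~ at the impact point.
    normal = [xi / ai for xi, ai in zip(x_new, a)]
    lam = -2.0 * dot(v, normal) / dot(normal, normal)
    v_new = [vi + lam * ni for vi, ni in zip(v, normal)]
    return x_new, v_new


# Hypothetical example: the ellipse X_1^2 / 4 + X_2^2 = 1, starting at (2, 0)
# with a velocity pointing inward, i.e. (x, a^-1 v) < 0.
a = [4.0, 1.0]
x, v = [2.0, 0.0], [-1.0, 0.5]
impacts = [x]
for _ in range(5):
    x, v = billiard_step(x, v, a)
    impacts.append(x)
print(impacts)
```

Along the iteration the speed $(v,v)$ and the quantity $(x,a^{-1}v)$ stay constant, which gives a simple numerical check that the impact points remain on $Q$.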

The mapping (4.2), as well as the billiard limits of the motion on $\tilde Q$ with the higher order potentials $V_p(X_1,\dots,X_n,X_{n+1})$ ($X_{n+1}=0$), are completely integrable. In the limit $a_{n+1}\to 0$ and after using the change of variable (2.26), the Abel–Jacobi equations (3.3) are transformed as follows:
\[
\sum_{k=1}^{n}\int_{\mu_0}^{\mu_k}\frac{\mu^{i-1}\,d\mu}{2\sqrt{\rho(\mu)}}=\phi_i=\text{const},\qquad i=1,\dots,n-1,
\]
\[
\sum_{k=1}^{n}\int_{\mu_0}^{\mu_k}\frac{d\mu}{2\mu\sqrt{\rho(\mu)}}=x_1+\phi_n, \tag{4.3}
\]
\[
\rho(\mu)=-(\mu-a_1)\cdots(\mu-a_n)\left[c_0(\mu-c_1)\cdots(\mu-c_{n-1})-\sigma\mu^{n}\right].
\]
This system contains $n-1$ holomorphic differentials on the Riemann surface $C=\{w^{2}=\rho(\mu)\}$ of genus $g=n-1$ and one differential of the third kind having a pair of simple poles $Q_-,Q_+$ on $C$ with $\mu(Q_\pm)=0$. Here again $\phi_1,\dots,\phi_n$ are constant phases and $c_0,\dots,c_{n-1}$ are constants of motion. The elliptic coordinates $\mu_1,\dots,\mu_n$ represent the divisor of $n$ points $P_i=(\mu_i,w_i)$ on $C$.

Equations (4.3) describe a well defined mapping of the symmetric product $C^{(g+1)}$ to $\mathrm{Jac}(C,Q_-,Q_+)$, the $(g+1)$-dimensional generalized Jacobian of the curve $C$ with two distinguished points $Q_\pm$. The latter is obtained from the genus $n$ curve $w^{2}=R(\mu)$ in (3.3) as a result of confluence of two Weierstrass points ($a_{n+1}\to 0$) and regularization: cutting out the double point and gluing $Q_-,Q_+$. The generalized Jacobian is a noncompact algebraic variety which is topologically equivalent to the product of the customary $g$-dimensional Jacobian variety $\mathrm{Jac}(C)$ with complex angle coordinates $\phi_1,\dots,\phi_g$ and the cylinder $\mathbb{C}^{*}=\mathbb{C}\setminus\{0\}$ (for the definition and description of generalized Jacobians see, among others, Serre [1959], Previato [1985], Gavrilov [1999], and Fedorov [1999]). As follows from (4.3), the geodesic and the potential billiard motion parameterized by $x_1$ is represented by a straight line flow on $\mathrm{Jac}(C,Q_-,Q_+)$, which is directed along the real section of $\mathbb{C}^{*}$ and leaves the coordinates $\phi$ on $\mathrm{Jac}(C)$ invariant.
As we shall see below, the solutions to the generalized inversion problem (4.3) have different structures, depending on whether R(µ) is an even or an odd order polynomial.
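The odd/even distinction can be made concrete by expanding $\rho(\mu)$ from (4.3). The Python sketch below (helper names and sample roots are hypothetical) builds the coefficients of $\rho$ and reports its degree together with the genus $\lfloor(\deg\rho-1)/2\rfloor$ of the hyperelliptic curve $w^{2}=\rho(\mu)$. For $\sigma=0$ the degree is $2n-1$, for $\sigma\neq 0$ it is $2n$, and in both cases the genus is $n-1$.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def rho_coefficients(a, c0, c, sigma):
    """Coefficients of rho(mu) = -(mu-a_1)...(mu-a_n) [c0 (mu-c_1)...(mu-c_{n-1}) - sigma mu^n]."""
    n = len(a)
    first = [1]
    for ai in a:
        first = poly_mul(first, [-ai, 1])
    bracket = [c0]
    for ci in c:
        bracket = poly_mul(bracket, [-ci, 1])
    bracket = bracket + [0] * (n + 1 - len(bracket))  # make room for the mu^n term
    bracket[n] -= sigma
    rho = [-coef for coef in poly_mul(first, bracket)]
    while len(rho) > 1 and rho[-1] == 0:  # strip vanishing leading coefficients
        rho.pop()
    return rho


def degree_and_genus(rho):
    """Degree of rho and genus of the hyperelliptic curve w^2 = rho(mu)."""
    deg = len(rho) - 1
    return deg, (deg - 1) // 2


# Hypothetical sample with n = 3.
a, c0, c = [1, 2, 3], 1, [4, 5]
for sigma in (0, 1):
    rho = rho_coefficients(a, c0, c, sigma)
    print(sigma, degree_and_genus(rho), rho[0])
```

The constant term printed last is $\rho(0)=c_0\,a_1\cdots a_n\,c_1\cdots c_{n-1}$, the quantity that enters the normalization of the third-kind differential.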



Solutions in terms of generalized theta-functions. First we concentrate on straight line billiards corresponding to the case $\sigma=0$, when the curve $C$ has one infinite point $\infty$. Fix a canonical basis of cycles $A_1,\dots,A_g,B_1,\dots,B_g$ on $C$ and let $\bar\omega_1,\dots,\bar\omega_g$ be the dual basis of normalized holomorphic differentials on $C$ and $z_1,\dots,z_g$ be the corresponding coordinates on the universal covering of $\mathrm{Jac}(C)$. There exists a unique $g\times g$ constant normalizing matrix $D$ such that
\[
\bar\omega_k=\sum_{j=1}^{g}\frac{D_{kj}\,\mu^{j-1}\,d\mu}{\sqrt{\rho(\mu)}},\qquad
z_k=\sum_{j=1}^{g}D_{kj}\,\phi_j,\qquad k=1,\dots,g=n-1. \tag{4.4}
\]

Let us also introduce a normalized differential of the third kind $\Omega_0$ having simple poles at $Q_\pm$ with residues $\pm 1$ respectively:
\[
\Omega_0=\frac{\sqrt{\rho(0)}\,d\mu}{\mu\sqrt{\rho(\mu)}}+\sum_{k=1}^{g}\beta_k\bar\omega_k,\qquad
\sqrt{\rho(0)}=\sqrt{a_1\cdots a_n\cdot c_1\cdots c_{n-1}}, \tag{4.5}
\]
where $\beta_k$ are unique constants such that $\Omega_0$ has zero $A$-periods on $C$. Then the last equation in (4.3) can be represented in the following form:
\[
\sum_{k=1}^{n}\int_{\mu_0}^{\mu_k}\Omega_0=Z,\qquad Z=2\sqrt{\rho(0)}\,x_1+\text{const}. \tag{4.6}
\]

Notice that in the case of ellipsoidal billiards $\sqrt{\rho(0)}$ is always real and hence $Z$ is also real. Let us also choose the base point $(\mu_0,w_0)$ of the mapping (4.3) to be the infinite point $\infty\in C$. According to Fedorov [1999], the solution of the problem of inversion (4.3) together with (4.1) yields the following expressions for the Cartesian coordinates $X_i$ of the point moving inside the ellipsoid $Q$:
\[
X_i(x_1,z)=\kappa_i\,\frac{e^{-Z/2}\,\theta[D+\eta_{(i)}](z-q/2)+e^{Z/2}\,\theta[D+\eta_{(i)}](z+q/2)}{e^{-Z/2}\,\theta[D](z-q/2)+e^{Z/2}\,\theta[D](z+q/2)},\qquad i=1,\dots,n, \tag{4.7}
\]
\[
z=(z_1,\dots,z_{n-1})^{T},\qquad Z=2\sqrt{\rho(0)}\,x_1+Z_0,\qquad z,Z_0=\text{const},
\]
\[
q=2\left(\int_{\infty}^{Q_+}\bar\omega_1,\dots,\int_{\infty}^{Q_+}\bar\omega_g\right)^{T}\in\mathbb{C}^{g},\qquad \kappa_i=\text{const}.
\]
These expressions involve quotients of generalized theta-functions, where $\theta[D+\eta_{(i)}](z)$ and $\theta[D](z)$ are customary theta-functions associated with the Riemann surface $C$ with appropriately chosen half-integer theta-characteristics $\eta_{(i)}$ ($D$ is the half-integer theta-characteristic corresponding to the vector of Riemann's constants). The vector $q$ coincides with the vector of $B$-periods of the meromorphic differential $\Omega_0$. The constant factors $\kappa_i$ depend on the parameters of the curve $C$ only. (For the definition and properties of the generalized theta-functions see e.g., Belokolos et al. [1994], Gagnon et al. [1992], Ercolani [1987], and Fedorov [1999].)

The expressions (4.7) describe a straight line segment in $\mathbb{R}^{n}$ ($\mathbb{C}^{n}$), with $z$ playing the role of a constant phase vector which defines the position of the segment. When one of the $\mu$-variables, say $\mu_1$, equals zero, the corresponding point $P_1=(\mu_1,\sqrt{\rho(\mu_1)})$ on



the curve $C$ coincides with one of the poles $Q_-,Q_+$ of the differential $\Omega_0$. Then, as follows from the mapping (4.3) and (4.6), $x_1$ and $Z$ become infinite. On the other hand, in view of (4.1), at this moment the moving point in $\mathbb{R}^{n}$ meets the ellipsoid $Q$. It follows that as $x_1$ and $Z$ change from $-\infty$ to $\infty$ along the real axis, the expressions (4.7) have finite limits, giving the coordinates of two subsequent impact points on $Q$. Notice that $X_i(\infty,z)$ have the same values as $X_i(-\infty,z+q)$. Hence the next segment of the billiard trajectory is given by (4.7) with $z$ being replaced by $z+q$. This yields the following algebraic-geometrical description of the billiard motion (see also Fedorov [1999]).

Theorem 4.1. As the point mass inside $Q$ approaches the ellipsoid, the point $P_1$ on $C$ tends to the pole $Q_+$. At the moment of impact, $P_1$ jumps from $Q_+$ back to $Q_-$, whereas the phase vector $z$ is increased by $q$ defined in formulas (4.7). The process repeats itself for each impact.

Using this property and by applying induction, from (4.7) the coordinates of the whole sequence of impact points are found in the form
\[
x_i(N)=\kappa_i\,\frac{\theta[D+\eta_{(i)}](z_0+Nq)}{\theta[D](z_0+Nq)},\qquad i=1,\dots,n, \tag{4.8}
\]

where $N\in\mathbb{N}$ is the number of impacts and the phase vector $z_0=(z_{10},\dots,z_{g0})^{T}$ is the same for all the segments of the billiard trajectory. These expressions depend on customary theta-functions only and, as functions of $z_0$, are meromorphic on a covering of the Jacobian variety of $C$. They have also been obtained by Veselov [1988] by using a factorization of matrix polynomials (see also Moser and Veselov [1991]). The work of Veselov is closely related to the discretization of mechanics that preserves the integrable structure. The numerical implementation of Veselov's procedures was given in Wendlandt and Marsden [1997], a discrete reduction procedure in Marsden, Pekarsky and Shkoller [1999] and Bobenko and Suris [1999], and an extension to PDE's in Marsden, Patrick and Shkoller [1999].

The generalized Abel map (4.3) yields expressions in terms of generalized theta-functions for the elementary symmetric functions of the variables $\mu$. In particular, following Fedorov [1999], one obtains
\[
\mu_1\cdots\mu_n=\partial_{x_1}\partial_V\log\tilde\theta[D](z,Z)
=2\sqrt{\rho(0)}\,\frac{\partial}{\partial Z}\,\frac{e^{-Z/2}\,\partial_V\theta[D](z-q/2)+e^{Z/2}\,\partial_V\theta[D](z+q/2)}{e^{-Z/2}\,\theta[D](z-q/2)+e^{Z/2}\,\theta[D](z+q/2)}, \tag{4.9}
\]
where
\[
\tilde\theta[D](z,Z)=e^{-Z/2}\,\theta[D](z-q/2)+e^{Z/2}\,\theta[D](z+q/2),\qquad
Z=2\sqrt{\rho(0)}\,x_1+Z_0,\qquad
\partial_V=V_1\frac{\partial}{\partial z_1}+\cdots+V_g\frac{\partial}{\partial z_g}, \tag{4.10}
\]
and where $V$ is the last column of the normalizing matrix $D$ defined in (4.4): $V=(D_{1g},\dots,D_{gg})^{T}$. The phases $z$ and $Z_0$ are the same as in (4.7). As follows from (4.9),



for $x_1,Z\to\pm\infty$, the product $\mu_1\cdots\mu_n$ tends to zero, as expected. Taking the integral (2.26) with $L_0=1$ yields
\[
x(x_1,z)=\int\mu_1\cdots\mu_n\,dx_1=\partial_V\log\tilde\theta(z,Z)+\text{const}
=\frac{e^{-Z/2}\,\partial_V\theta[D](z-q/2)+e^{Z/2}\,\partial_V\theta[D](z+q/2)}{e^{-Z/2}\,\theta[D](z-q/2)+e^{Z/2}\,\theta[D](z+q/2)}+\text{const}. \tag{4.11}
\]
It follows from this expression that the original parameter $x$ has finite values as $x_1\to\pm\infty$, and $x(\infty,z)$ has the same value as $x(-\infty,z+q)$. Now, substituting $Z=-\infty$, $Z=\infty$ in (4.11), by induction, we find the length of the $N$th segment of the billiard trajectory in the form
\[
x(N)-x(N-1)=\frac{\partial_V\theta[D](z_0+Nq)}{\theta[D](z_0+Nq)}-\frac{\partial_V\theta[D](z_0+Nq-q)}{\theta[D](z_0+Nq-q)},\qquad N\in\mathbb{N}, \tag{4.12}
\]

$z_0$ being the same as in (4.8). As a result, the solution $X_i(x)$, $x\in\mathbb{R}$, of the continuous geodesic billiard problem should be viewed as consisting of an infinite number of pieces, each parameterized by $x_1\in(-\infty,\infty)$ and given by (4.7) and (4.11). These pieces are obtained by iteratively adding the vector $q$ to the phase $z$ in (4.7) and (4.11), and they are glued together at the impact points corresponding to $x_1=\pm\infty$.

Now we turn to the ellipsoidal billiard with the Hooke potential ($\sigma=1$). In this case the curve $C$ appearing in (4.3) has two infinite points $\infty_\pm$. We again introduce normalized differentials $\bar\omega_k$, $\Omega_0$, and coordinates $z_k$, $Z$ according to (4.4) and (4.6). Let the base point of the mapping (4.3) be one of the Weierstrass points of $C$, say $\mu_0=a_n$. Then, instead of (4.7), the inversion of the generalized mapping (4.3) yields the following expressions for the squares of the Cartesian coordinates of the mass point moving inside the ellipsoid $Q$:
\[
X_i^{2}(x_1,z)=\kappa_i\,\frac{\tilde\theta^{2}[D+\eta_{(i)}](z,Z)}{\tilde\theta[D](z-\hat q/2,\,Z-\hat S/2)\;\tilde\theta[D](z+\hat q/2,\,Z+\hat S/2)},\qquad i=1,\dots,n, \tag{4.13}
\]
\[
z=(z_1,\dots,z_{n-1})^{T},\qquad Z=2\sqrt{\rho(0)}\,x_1+Z_0,\qquad z,Z_0=\text{const},
\]
\[
q=\int_{Q_-}^{Q_+}(\bar\omega_1,\dots,\bar\omega_g)^{T},\qquad
\hat q=2\int_{a_n}^{\infty_+}(\bar\omega_1,\dots,\bar\omega_g)^{T},\qquad
\hat S=\int_{\infty_-}^{\infty_+}\Omega_0,
\]

where $\tilde\theta[D](z,Z)$ is defined in (4.10) and
\[
\tilde\theta[D+\eta_{(i)}](z,Z)=e^{-Z/2}\,\theta[D+\eta_{(i)}](z-q/2)+e^{Z/2}\,\theta[D+\eta_{(i)}](z+q/2). \tag{4.14}
\]
Here $\kappa_i$ are constants, and $\sqrt{\rho(0)}$ is the same as in (4.5). Similarly to (4.7), as $x_1$ and $Z$ pass from $-\infty$ to $\infty$, $X_i^{2}(x_1,z)$ tend to finite values resulting in the squares of the coordinates of subsequent impact points on $Q$. Thus, expressions (4.13) describe a segment of trajectory of the billiard in the field of the Hooke potential between two



impacts. After each impact the phase vector $z$ changes according to Theorem 4.1. Then, by using induction, the sequence of impact points is described as follows:
\[
x_i^{2}(N)=\kappa_i\,\frac{\theta^{2}[D+\eta_{(i)}](z_0+Nq)}{\theta[D](z_0-\hat q+Nq)\;\theta[D](z_0+\hat q+Nq)},\qquad N\in\mathbb{N},\quad i=1,\dots,n, \tag{4.15}
\]
\[
z_0=(z_{10},\dots,z_{g0})^{T}=\text{const}.
\]
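For $n=2$ (a billiard inside an ellipse) the curve $C$ has genus $g=n-1=1$, so the customary theta-functions in (4.8) and (4.15) become one-variable series. The following Python sketch is only an illustration under stated assumptions: it uses the common normalization $\theta[a,b](z\,|\,\tau)=\sum_{m\in\mathbb{Z}}\exp\!\big(i\pi(m+a)^{2}\tau+2\pi i(m+a)(z+b)\big)$, which may differ from the paper's convention by a rescaling of the argument, and the period $\tau$, the phase $z_0$, the shift $q$ and the characteristic are hypothetical sample values rather than data computed from a curve.

```python
import cmath


def theta_char(z, tau, a=0.0, b=0.0, terms=30):
    """Truncated genus-one theta series with characteristics [a, b]:
    sum over m of exp(i*pi*(m+a)^2*tau + 2*i*pi*(m+a)*(z+b))."""
    total = 0j
    for m_int in range(-terms, terms):
        m = m_int + a
        total += cmath.exp(1j * cmath.pi * m * m * tau + 2j * cmath.pi * m * (z + b))
    return total


# Hypothetical sample data: period tau with Im(tau) > 0, real phase z0 and shift q.
tau = 1j
z0, q = 0.1, 0.37


def impact_ratio(N, a=0.5, b=0.0):
    """Ratio theta[a, b](z0 + N q) / theta[0, 0](z0 + N q), of the shape appearing in (4.8)."""
    zN = z0 + N * q
    return theta_char(zN, tau, a, b) / theta_char(zN, tau)


print([impact_ratio(N) for N in range(4)])
```

In this normalization the series satisfies $\theta[a,b](z+1)=e^{2\pi i a}\,\theta[a,b](z)$ and $\theta[a,b](z+\tau)=e^{-i\pi\tau-2\pi i(z+b)}\,\theta[a,b](z)$, and the odd characteristic $[1/2,1/2]$ gives a function vanishing at $z=0$.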

Apparently, this theta-functional solution for the billiard with the Hooke potential was not previously known. Lastly, we find the following expression for $x$:
\[
x(x_1,z)=\text{const}+\log\frac{\tilde\theta[D](z-\hat q/2,\,Z-\hat S/2)}{\tilde\theta[D](z+\hat q/2,\,Z+\hat S/2)},\qquad Z=2\sqrt{\rho(0)}\,x_1+\text{const}, \tag{4.16}
\]
which, for $x_1\to\pm\infty$ and $Z\to\pm\infty$, has finite limits determining $x$ for two subsequent impacts. Then, using the expression (4.10), by induction, we express an $x$-interval between the impacts in terms of the customary theta-function:
\[
x(N)-x(N-1)=\log\frac{\theta[D](z_0-\hat q/2+Nq)}{\theta[D](z_0+\hat q/2+Nq)}-\log\frac{\theta[D](z_0-\hat q/2+Nq-q)}{\theta[D](z_0+\hat q/2+Nq-q)}-\log\hat S. \tag{4.17}
\]

We emphasize that, in contrast to the geodesic billiard, for the billiard in the potential field the "time" $x$ is not proportional to the length of a trajectory.

Stationary finite-gap peaked solutions. Now we return to the finite-gap solutions of equations (HD) and (SW). Notice that under the limit $m_1\to 0$ the mapping (2.28) takes the form (4.3) with $\rho(\mu)$ being a polynomial of degree $2n-1$ and $2n$ respectively. The trace formula (2.22) and relations (4.1) yield
\[
U=\sum_{j=1}^{n}X_j^{2}+\sum_{i=1}^{n}a_i+m.
\]

Then the solutions of the billiard problems (4.7)–(4.17) provide solutions $U(x,t_0)$ of the above equations which consist of infinite sequences of smooth pieces, each one corresponding to a segment between two impacts. The impacts themselves give peaks of $U(x,t_0)$. This leads to the following theorem.

Theorem 4.2. 1) At any fixed time $t=t_0$, the finite-gap peaked solution of the equation (HD) consists of an infinite number of pieces $U_N(x,t_0)$, $N\in\mathbb{Z}$, glued at peak points. Let $\rho(\mu)$ be any polynomial with distinct roots $a_1,\dots,a_n$. Then, for any $N$, every piece is given by the following pair of theta-functional expressions parameterized by $x_1\in\mathbb{R}$:
\[
U_N=\sum_{j=1}^{n}X_j^{2}(x_1,z_N)+\sum_{i=1}^{n}a_i, \tag{4.18}
\]
\[
x(x_1,z)=\frac{e^{-Z/2}\,\partial_V\theta[D](z_N-q/2)+e^{Z/2}\,\partial_V\theta[D](z_N+q/2)}{e^{-Z/2}\,\theta[D](z_N-q/2)+e^{Z/2}\,\theta[D](z_N+q/2)}+x_0, \tag{4.19}
\]
\[
z_N=z_0+Nq\in\mathbb{C}^{n-1},\qquad Z=2\sqrt{\rho(0)}\,x_1+Z_0,
\]



where $X_j^{2}(x_1,z)$ and $q$ are given by (4.7) and $z_0$, $Z_0$, $x_0$ are constant phases of the solution depending on $t_0$, which are the same for any piece. The length of the $N$th piece equals
\[
\frac{\partial_V\theta[D](z_0+Nq)}{\theta[D](z_0+Nq)}-\frac{\partial_V\theta[D](z_0+Nq-q)}{\theta[D](z_0+Nq-q)}. \tag{4.20}
\]

2) At any fixed time $t=t_0$, the finite-gap peaked solution of equation (SW) consists of an infinite number of pieces $U_N(x,t_0)$, $N\in\mathbb{Z}$, which are glued at peak points. The pieces are given in the following parametric form:
\[
U_N=\sum_{j=1}^{n}X_j^{2}(x_1,z_N)+\sum_{i=1}^{n}a_i+m,\qquad z_N=z_0+Nq\in\mathbb{C}^{n-1}, \tag{4.21}
\]
\[
x(x_1,t_0)=\log\frac{\tilde\theta[D](z_N-\hat q/2,\,Z-\hat S/2)}{\tilde\theta[D](z_N+\hat q/2,\,Z+\hat S/2)}+x_0,\qquad Z=2\sqrt{\rho(0)}\,x_1+Z_0, \tag{4.22}
\]
where $X_j^{2}(x_1,z)$ are given by (4.13) and $z_0$, $Z_0$, $x_0$ are constant phases which depend on $t_0$. The $x$-length of the $N$th piece equals
\[
\log\frac{\theta[D](z_0-\hat q/2+Nq)}{\theta[D](z_0+\hat q/2+Nq)}-\log\frac{\theta[D](z_0-\hat q/2+Nq-q)}{\theta[D](z_0+\hat q/2+Nq-q)}-\log\hat S. \tag{4.23}
\]

When in the polynomials (2.13) or (2.14) $m_1=0$ and $m_2$ tends to zero, the distance between subsequent peaks of a profile tends to zero and in the limit the peaks coalesce. (Notice that this is done for a fixed $t$.) The solution $U(x,t_0)$ for this limiting case is smooth.

Remark. It is known (see, for instance, Fedorov [1999]) that there are special degenerate umbilic billiard solutions of the classical billiard problem (without a potential) that have straight line segments meeting $n-1$ fixed focal conics of $Q$ between any subsequent impacts and, as $x\to\pm\infty$, the billiard motion converges to simple oscillations along the largest axis of the ellipsoid. This corresponds to the confluence of the roots of the polynomial $\rho(\mu)$ in (4.3),
\[
c_1=a_1,\quad\dots,\quad c_{n-1}=a_{n-1}.
\]

As a result, the hyperelliptic curve $C$ becomes singular of arithmetic genus zero and the asymptotic billiard motion is described in terms of tau-functions. The corresponding asymptotic peaked solutions of equations (HD) and (SW) are given in Alber and Fedorov [2001].

Time-dependent piecewise-meromorphic solutions. Now we pass to the global algebraic geometrical description of the finite-gap peaked solutions. After setting $m_1\to 0$, the system (2.25) is formally reduced to the following Abel–Jacobi mapping:
\[
\int_{\mu_0}^{\mu_1}\frac{\mu^{k-1}\,d\mu}{2\sqrt{\rho(\mu)}}+\cdots+\int_{\mu_0}^{\mu_n}\frac{\mu^{k-1}\,d\mu}{2\sqrt{\rho(\mu)}}=
\begin{cases}
t_k+\phi_k, & k=1,\dots,n-1,\\
x+\phi_n, & k=n,
\end{cases} \tag{4.24}
\]
where
\[
\rho(\mu)=-L_0^{2}\prod_{r=2}^{2n}(\mu-m_r)\qquad\text{and}\qquad\rho(\mu)=\prod_{r=2}^{2n+1}(\mu-m_r)
\]

in the case of equations (HD) or (SW) respectively. Here $\phi_1,\dots,\phi_n$ are constant phases. This system contains $n-1$ independent holomorphic differentials defined on the genus $g=n-1$ Riemann surface $\{w^{2}=\rho(\mu)\}$, which can be identified with the curve $C$ described above. However, in contrast to the system (4.3), in the case of a polynomial $\rho(\mu)$ of odd order, which corresponds to equation (HD), the last equation in (4.24) contains a meromorphic differential of the second kind having a double pole at the infinite point $\infty$ on $C$. In the case of a polynomial $\rho(\mu)$ of even order, corresponding to equation (SW), the last equation includes a meromorphic differential of the third kind with a pair of simple poles at the infinite points $\infty_-,\infty_+$ on $C$.

According to Clebsch and Gordan [1866] and Gavrilov [1999], in the odd order case, such a system describes a well defined and invertible mapping of the symmetric product $C^{(g+1)}$ to $\mathrm{Jac}(C,\infty)$, the generalized Jacobian of the curve $C$ with one distinguished point at $\infty$. The set $\mathrm{Jac}(C,\infty)$ is a noncompact algebraic variety which is topologically equivalent to the product $\mathrm{Jac}(C)\times\mathbb{C}$. To describe this case we introduce a normalized differential of the second kind having a double pole at $\infty$,
\[
\Omega_\infty^{(1)}=\frac{\sqrt{-1}\,L_0\,\mu^{g}\,d\mu}{2\sqrt{\rho(\mu)}}+\sum_{k=1}^{g}d_k\bar\omega_k,\qquad g=n-1, \tag{4.25}
\]
where $\bar\omega_k$ are the normalized holomorphic differentials specified in (4.4), and $d_k$ are normalizing constants such that all $A$-periods of $\Omega_\infty^{(1)}$ on $C$ are zero. Then the last equation in (4.24) implies that
\[
\sum_{i=1}^{n}\int_{\mu_0}^{\mu_i}\Omega_\infty^{(1)}=Z,\qquad Z=\sqrt{-1}\,L_0\,x+(d,Dt)+\text{const}, \tag{4.26}
\]
\[
d=(d_1,\dots,d_{n-1})^{T},\qquad t=(t_n,\dots,t_2)^{T},
\]

where $D$ is the $(n-1)\times(n-1)$ normalizing matrix defined in (4.4). Since $\infty$ now is a pole of $\Omega_\infty^{(1)}$, we choose the basepoint $P_0=(\mu_0,w_0)$ to be a finite Weierstrass point on $C$. For concreteness we choose $P_0=(m_{2n},0)$. Applying the residue theorem to the generalized theta-function associated with $\mathrm{Jac}(C,\infty)$, we solve the inversion problem (4.24) and find the following expression:
\[
\sum_{i=1}^{n}\mu_i=C_1-Z^{2}+\frac{2Z\,\partial_V\theta[D+\eta_{2n}](z)-\partial_V^{2}\theta[D+\eta_{2n}](z)}{\theta[D+\eta_{2n}](z)}, \tag{4.27}
\]
\[
Z=\sqrt{-1}\,L_0\,x+(d,Dt)+Z_0,\qquad z=Dt+z_0\in\mathbb{C}^{n-1},\qquad
C_1=\sum_{k=1}^{g}\oint_{A_k}\mu\,\bar\omega_k+m_{2n},\qquad Z_0,z_0=\text{const},
\]
where the half-integer characteristic $\eta_{2n}$ labels the point $(m_{2n},0)$, the vector $V=(D_{1g},\dots,D_{gg})^{T}$ is specified in (4.4), and the constant $C_1$ contains the sum of integrals along the canonical cycles $A_1,\dots,A_g$ on $C$. Notice that in the above formula



$\partial_V=\partial_{t_2}$. Expression (4.27) is meromorphic in $x$ and $t_1,\dots,t_{n-1}$ and can be regarded as a generalization of the Matveev–Its formula to the case of the noncompact variety $\mathrm{Jac}(C,\infty)$.

In the case of an even order curve $C$, corresponding to finite-gap peaked solutions of equation (SW), system (4.24) defines a mapping of the symmetric product $C^{(g+1)}$ to the generalized Jacobian $\mathrm{Jac}(C,\infty_\pm)$, which is topologically equivalent to the product $\mathrm{Jac}(C)\times\mathbb{C}^{*}$. As above, we set $P_0$ to be the last Weierstrass point $(m_{2n+1},0)$ and introduce the normalized differential of the third kind having a pair of simple poles at $\infty_-,\infty_+\in C$, as well as the corresponding variable $Z$:
\[
\Omega_{\infty_\pm}=\frac{\mu^{g}\,d\mu}{2\sqrt{\rho(\mu)}}+\sum_{k=1}^{g}\bar d_k\bar\omega_k,\qquad
Z=\sum_{i=1}^{n}\int_{m_{2n+1}}^{\mu_i}\Omega_{\infty_\pm}, \tag{4.28}
\]

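where $(\bar d_1,\dots,\bar d_g)=\bar d$ are chosen such that all the $A$-periods of $\Omega_{\infty_\pm}$ are zero. Then, applying the residue theorem to the generalized theta-function associated with $\mathrm{Jac}(C,\infty_\pm)$ yields
\[
\sum_{i=1}^{n}\mu_i+m=\text{const}-\frac{e^{-Z}\,\theta[D](z-\hat q)+e^{Z}\,\theta[D](z+\hat q)}{\theta[D](z)}, \tag{4.29}
\]
where, in view of (4.28),
\[
Z=x+(\bar d,Dt)+Z_0,\qquad z=Dt+z_0\in\mathbb{C}^{g},\qquad
\hat q=\left(\int_{\infty_-}^{\infty_+}\bar\omega_1,\dots,\int_{\infty_-}^{\infty_+}\bar\omega_g\right)^{T}\in\mathbb{C}^{g},\qquad Z_0,z_0=\text{const}. \tag{4.30}
\]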

Remark. According to the formula (2.22), expressions (4.27) and (4.29) describe formal solutions to equations (HD) and (SW) respectively. However, while treating these solutions, one needs to take into account the reflection phenomenon described in Theorem 4.1. Namely, when a certain variable $\mu_i$ passes zero, the point $P_i=(\mu_i,\sqrt{\rho(\mu_i)})$ jumps from one sheet of the Riemann surface $C$ to another or, in other words, from the pole $Q_+$ of the differential of the third kind $\Omega_0$ to the other pole $Q_-$. Therefore, the above expressions do not provide global solutions to the equations. Instead, the following theorem holds.

Theorem 4.3. 1) The time-dependent finite-gap peaked solution $U(x,t)$ of (HD) consists of an infinite number of pieces in $\mathbb{R}^{n}=(t_1,\dots,t_{n-1},x)$ described by meromorphic functions
\[
U_N(x,t)=C_1-Z_N^{2}+\frac{2Z_N\,\partial_V\theta[D+\eta_{2n}](z_N)-\partial_V^{2}\theta[D+\eta_{2n}](z_N)}{\theta[D+\eta_{2n}](z_N)},\qquad N\in\mathbb{Z}, \tag{4.31}
\]
\[
z_N=Dt+Nq+z_0,\qquad Z_N=\sqrt{-1}\,L_0\,x+(d,z_N)+Nh+Z_0,\qquad Z_0,z_0=\text{const},\qquad t=(t_1,\dots,t_{n-1}),
\]
\[
h=\int_{Q_-}^{Q_+}\Omega_\infty^{(1)},\qquad
q=\left(\int_{Q_-}^{Q_+}\bar\omega_1,\dots,\int_{Q_-}^{Q_+}\bar\omega_g\right)^{T},
\]
where $C_1$ is the constant specified in (4.27).


For a fixed $N$ the corresponding piece $U_N(x,t)$ is bounded by nonintersecting surfaces $S_{N-1}$ and $S_N$ in $\mathbb{R}^{n}$ given by the equations
\[
S_N=\{x=p_N(t)\},\qquad
p_N(t)=\frac{1}{\sqrt{-1}\,L_0}\left(\partial_V\log\theta[D+\eta_{2n}](z_N+q/2)-(d,z_N)-Nh\right). \tag{4.32}
\]
The adjacent pieces $U_N(x,t)$ and $U_{N+1}(x,t)$ are thus glued to each other along $S_N$, where
\[
U(p_N(t),t)=C_1-\partial_V^{2}\log\theta[D+\eta_{2n}](z_N+q/2). \tag{4.33}
\]

2) The finite-gap peaked solution $U(x,t)$ of (SW) consists of an infinite number of pieces given by meromorphic functions
\[
U_N(x,t)=\text{const}-\frac{e^{-Z_N}\,\theta[D](z_N-\hat q)+e^{Z_N}\,\theta[D](z_N+\hat q)}{\theta[D](z_N)},\qquad N\in\mathbb{Z}, \tag{4.34}
\]
\[
z_N=Dt+qN+z_0,\qquad Z_N=x+(\bar d,z_N)+N\bar h+Z_0,\qquad
\bar h=\int_{Q_-}^{Q_+}\Omega_{\infty_\pm},\qquad t=(t_n,\dots,t_2),\qquad Z_0,z_0=\text{const},
\]
where the vector $\hat q$ is described in (4.30). The piece $U_N(x,t)$ is bounded by peak surfaces $\bar S_{N-1}$ and $\bar S_N$ defined as follows:
\[
\bar S_N=\{x=\bar p_N(t)\},\qquad
\bar p_N(t)=\text{const}-\log\frac{\theta[D](z_N-\hat q+q/2)}{\theta[D](z_N+\hat q+q/2)}. \tag{4.35}
\]
The adjacent pieces $U_N(x,t)$ and $U_{N+1}(x,t)$ are glued together along $\bar S_N$, where
\[
U(\bar p_N(t),t)=\text{const}-\partial_V\log\frac{\theta[D](z_N-\hat q)}{\theta[D](z_N+\hat q)}. \tag{4.36}
\]

Notice that along the peak surfaces, the solutions described in 1) and 2) have discontinuous partial derivatives with respect to $x$ and $t_1,\dots,t_{n-1}$.

Remark. By fixing all the times but $t_k$ in the above expressions, one obtains 2-dimensional piecewise solutions $U_N(x,t_k)$, whereas the corresponding sections of $S_N, \bar S_N \subset (x,t) = \mathbb{R}^n$ describe peak lines in the $(x,t_k)$-plane. As follows from (4.32) and (4.35), the motion of the $N$th peak $p_N(t_k)$ along the $x$-axis is described by a sum of a linear function in $t_k$ and a quasi-periodic one. The latter function becomes periodic in the case $g = 1$. Finally, after fixing all the times without exception, expressions (4.31) and (4.34) provide pieces of the stationary finite-gap peaked solution already described in Theorem 4.2.

Proof of Theorem 4.3. According to Theorem 4.2, the profiles of finite-gap peaked solutions are associated with geodesic ellipsoidal billiards and billiards in the field of a Hooke potential. An impact point on the boundary of a billiard trajectory corresponds to a peak of the profile $U(x,t_0)$, and this happens when one of the $\mu_i$ passes zero. Hence, the solution (4.27) is valid until one of the points $P_1,\dots,P_n$ on $C$ coincides with $Q_-$ or $Q_+$, the poles of the differential $\Omega_0$ in (4.5). Putting, for example, $P_n \equiv Q_+$ ($\mu_n \equiv 0$)


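in (4.24), one arrives at the following relations involving the normalized differentials defined in (4.4) and (4.25):

$$\sum_{i=1}^{g}\int_{P_0}^{\mu_i}\bar\omega_k = z_k - q_k/2, \qquad k = 1,\dots,g, \tag{4.37}$$

$$\sum_{i=1}^{g}\int_{P_0}^{\mu_i}\Big(\Omega^{(1)}_\infty - \sum_{k=1}^{g} d_k\bar\omega_k\Big) = \sqrt{-1}\,L_0\,x - \int_{P_0}^{Q_+}\Big(\Omega^{(1)}_\infty - \sum_{k=1}^{g} d_k\bar\omega_k\Big), \tag{4.38}$$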

where $P_0 = (m_{2n}, 0)$. Notice that Eqs. (4.37) form a closed system for the variables $\mu_1,\dots,\mu_{n-1}$ and describe the standard Abel–Jacobi mapping $C^{(g)} \to \mathrm{Jac}(C)$. Hence, the first symmetric polynomial has the following standard form in terms of theta-functions in the odd order case:

$$\mu_1+\dots+\mu_{n-1} = c_1 - \partial_V^2\log\theta[D+\eta_{2n}](z-q/2), \qquad c_1 = \sum_{k=1}^{g}\oint_{A_k}\bar\omega_k. \tag{4.39}$$

On the other hand, Eq. (4.38) implies that at a peak point the coordinate $x$ becomes a function of $z$ and therefore of $t$: $x = p_0(t)$. In the odd order case, this equation contains a sum of Abelian integrals of the second kind, the so-called Abelian transcendent. By making use of the following standard expression for the normalized transcendent (Clebsch and Gordon [1866])

$$\sum_{i=1}^{g}\int_{P_0}^{\mu_i}\Omega^{(1)}_\infty = -\partial_V\log\theta[D+\eta_{2n}]\Big(\sum_{i=1}^{g}\int_{P_0}^{\mu_i}\bar\omega\Big),$$

from (4.37) and (4.38) we find

$$p_0(t) = \frac{1}{\sqrt{-1}\,L_0}\Big(\partial_V\log\theta[D+\eta_{2n}](z-q/2) - (d,z) + h/2\Big), \qquad h = \int_{Q_-}^{Q_+}\Omega^{(1)}_\infty. \tag{4.40}$$

Using the trace formula for the solution $U(p_0(t),t) = \mu_1+\dots+\mu_{n-1}$ and expression (4.39), it follows that the equation $x = p_0(t)$ determines a surface $S_0$ in $\mathbb{C}^n$ along which the solution $U$ has a peak. Now setting in (4.24) $P_n \equiv Q_-$ and taking into account (4.4), (4.25) and

$$\int_{P_0}^{Q_-}\Omega^{(1)}_\infty = -\int_{P_0}^{Q_+}\Omega^{(1)}_\infty, \qquad \int_{P_0}^{Q_-}\bar\omega = -\int_{P_0}^{Q_+}\bar\omega,$$

we obtain an expression for another peak surface $S_1$ determined by the equation $\{x = p_1(t)\}$ with

$$p_1(t) = \frac{1}{\sqrt{-1}\,L_0}\Big(\partial_V\log\theta[D+\eta_{2n}](z+q/2) - (d,z) - h/2\Big), \tag{4.41}$$

along which

$$U(p_1(t),t) = \mu_1+\dots+\mu_{n-1} = C_1 - \partial_V^2\log\theta[D+\eta_{2n}](z+q/2). \tag{4.42}$$



Under the reality condition, the surfaces $S_0$ and $S_1$ do not intersect and therefore determine a connected domain in $\mathbb{C}^n = (x,t)$ where the solution (4.27) is applicable. We denote this piece of solution as $U_1(x,t)$. As follows from (4.40) and (4.41), $S_1$ is obtained from $S_0$ by changing the phase as follows:

$$Z \to Z + h, \quad z \to z + q, \qquad\text{that is}\qquad x \to x + \frac{1}{\sqrt{-1}\,L_0}\big(h - (d,Dt)\big), \quad t \to t + D^{-1}q. \tag{4.43}$$

In addition, according to (4.42) and (4.39), at any two points on $S_0$ and $S_1$ which are equivalent modulo the shift, $U_1(x,t)$ takes the same values:

$$U_1(p_1(t),t) = U_1\Big(p_0(t) + \frac{1}{\sqrt{-1}\,L_0}\big(h - (d,Dt)\big),\; t + D^{-1}q\Big). \tag{4.44}$$

Now let us define the function $U_2(x,t) = U_1\big(x + (h-(d,Dt))/(\sqrt{-1}\,L_0),\; t + D^{-1}q\big)$, which is also a local solution to (HD). In view of (4.44), $U_1$ and $U_2$ take the same values along $S_1$, which ensures a correct gluing of the two pieces together. Iterating over both positive and negative $N$, we construct a complete sequence of peak surfaces and obtain the formulae given in part 1) of the theorem.

Similarly, solution (4.29) of (SW) is valid until one of the points $P_1,\dots,P_n$ on $C$ coincides with $Q_-$ or $Q_+$, the poles of $\Omega_0$. Setting $P_n \equiv Q_+$ in (4.24) for the case of an even order curve $C$, and using (4.4) and (4.28) yields

$$\sum_{i=1}^{n-1}\int_{P_0}^{\mu_i}\bar\omega_k = z_k - q_k/2, \qquad k = 1,\dots,n-1, \tag{4.45}$$

$$\sum_{i=1}^{n-1}\int_{P_0}^{\mu_i}\Big(\Omega_{\infty\pm} - \sum_{k=1}^{n-1}\bar d_k\bar\omega_k\Big) = x - \int_{P_0}^{Q_+}\Big(\Omega_{\infty\pm} - \sum_{k=1}^{n-1}\bar d_k\bar\omega_k\Big), \tag{4.46}$$

where $P_0 = (m_{2n+1}, 0)$. Inverting (4.45) results in the following expression for a symmetric polynomial (see e.g., Clebsch and Gordon [1866])

$$\mu_1+\dots+\mu_{n-1} = \text{const} - \partial_V\log\frac{\theta[D](z-\hat q-q/2)}{\theta[D](z+\hat q-q/2)}. \tag{4.47}$$

After applying the theta-functional formula for the normalized transcendent of the third kind (Clebsch and Gordon [1866]),

$$\sum_{i=1}^{g}\int_{P_0}^{\mu_i}\Omega_{\infty\pm} = \text{const} - \log\frac{\theta[D](s-\hat q)}{\theta[D](s+\hat q)}, \qquad s = \sum_{i=1}^{g}\int_{P_0}^{\mu_i}\bar\omega, \qquad g = n-1,$$

from (4.46) and (4.45) we obtain

$$x = p_0(t) = \text{const} - \log\frac{\theta[D](z-\hat q+q/2)}{\theta[D](z+\hat q+q/2)} - (z,\bar d) + \bar h/2. \tag{4.48}$$

By choosing $P_n \equiv Q_-$ in (4.24), one arrives at the expressions (4.47) and (4.48) with $q/2$, $\bar h/2$ replaced by $-q/2$, $-\bar h/2$. Then, following similar arguments and applying induction, the piecewise solution of part 2) is constructed.


We emphasize that although the different pieces $U_N(x,t)$ of the solution are obtained by iteratively shifting the phases $z, Z$ by the same vector, the pieces $U_N(x,t_0)$ of the solution ($t_0$ being fixed) are all distinct because the shift occurs in both the $x$- and $t$-directions.

Remark. If we omit the reality condition above, then the hypersurfaces $S_N, \bar S_N$ in $\mathbb{C}^n$ intersect, bounding a set of $n$-dimensional domains adjacent to each other in a rather complicated manner. Then the procedure of gluing different pieces of the functions $U_N(x,t)$ meromorphic inside each domain cannot be defined uniquely. As a result, the generic complex solution $U(x,t)$ branches along the peak surfaces.

5. Kinematics of Peaks

Now we obtain expressions for the velocity of the $N$th peak $p_N(t)$ of the piecewise solution of (HD) with respect to time $t_k$. As was shown above, the solution has a peak when one of the $\mu$-variables passes zero, implying that $P_n = Q_-$ or $P_n = Q_+$.

Theorem 5.1. Let $y_1,\dots,y_{n-1}$ denote the $\mu$-coordinates of the points $P_1,\dots,P_{n-1}$ at the moment in time when one of the $\mu$-variables passes zero. The following system of equations holds:

$$\frac{\partial p_N(t)}{\partial t_k} = -\sigma_{k-1}(y_1,\dots,y_{n-1}), \tag{5.1}$$

where $\sigma_k$ is the $k$th symmetric function of $y_1,\dots,y_{n-1}$. In particular, we have

$$\frac{\partial p_N(t)}{\partial t_2} = y_1+\dots+y_{n-1} = U(p_N(t),t), \tag{5.2}$$

i.e., the $t_2$-velocity of the peak coincides with its height.

Proof. After applying the limit $m_1 \to 0$, Eqs. (2.12) and (2.23) for the derivatives of $\mu_n$ take the form

$$\frac{\partial\mu_n}{\partial x} = \frac{\sqrt{\rho(\mu_n)}}{(\mu_n-\mu_1)\cdots(\mu_n-\mu_{n-1})}, \tag{5.3}$$

$$\frac{\partial\mu_n}{\partial t_k} = \sigma_{k-1}(\mu_1,\dots,\mu_{n-1})\,\frac{\sqrt{\rho(\mu_n)}}{(\mu_n-\mu_1)\cdots(\mu_n-\mu_{n-1})}. \tag{5.4}$$

On the other hand, along the peak line $\{x = p_N(t_k)\}$, we have

$$\frac{d}{dt_k}\,\mu_n(p_N(t_k),t_k) \equiv \frac{\partial\mu_n}{\partial x}\,\frac{d p_N(t_k)}{dt_k} + \frac{\partial\mu_n}{\partial t_k} = 0,$$

which, in view of (5.3) and (5.4) and after setting $\mu_n \equiv 0$, yields

$$\frac{\sqrt{\rho(0)}}{\mu_1\cdots\mu_{n-1}}\left(\frac{\partial p_N}{\partial t_k} + \sigma_{k-1}(y_1,\dots,y_{n-1})\right) = 0.$$

Since $\rho(0) \neq 0$ and $\mu_1 = y_1,\dots,\mu_{n-1} = y_{n-1}$ are finite, the latter relation gives (5.1).
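Relation (5.2), that a peak travels with a speed equal to its height, can be checked numerically on the simplest peaked profile. The sketch below is only an illustration under stated assumptions: it uses the single-peakon profile U(x, t) = c e^{-|x - ct|} of the Camassa–Holm equation with κ = 0 in its standard normalization (which may differ from the scaling of (SW) used here), not the finite-gap solutions of Theorem 4.3. The names `peakon` and `peak_position` and the value of `c` are hypothetical choices made for the example.

```python
import numpy as np

def peakon(x, t, c):
    """Assumed example profile: single peakon U(x, t) = c * exp(-|x - c t|)."""
    return c * np.exp(-np.abs(x - c * t))

def peak_position(t, c, x):
    """Grid point at which the profile attains its maximum at time t."""
    return x[np.argmax(peakon(x, t, c))]

c = 1.5                                  # peak height (hypothetical value)
x = np.linspace(-10.0, 10.0, 200001)     # uniform grid, spacing 1e-4
t0, t1 = 0.0, 2.0

q0 = peak_position(t0, c, x)
q1 = peak_position(t1, c, x)
velocity = (q1 - q0) / (t1 - t0)         # finite-difference peak velocity
height = peakon(q1, t1, c)               # value of U at the located peak

# One-sided slopes at the exact peak location q = c * t1: their difference
# approximates the jump [U_x] across the peak, which must be nonzero.
h = 1e-3
q = c * t1
slope_right = (peakon(q + h, t1, c) - peakon(q, t1, c)) / h
slope_left = (peakon(q, t1, c) - peakon(q - h, t1, c)) / h
jump = slope_right - slope_left

print("peak velocity:", velocity)
print("peak height:  ", height)
print("jump in U_x:  ", jump)
```

For this assumed profile the finite-difference velocity and the height agree with c up to the grid spacing, and the one-sided slopes differ by approximately -2c, so the derivative U_x has the nonzero jump at the peak that the weak formulation of Section 6 relies on.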



Remark. The relations (5.1) can also be found by direct differentiation of the expression for the $N$th peak surface (4.35) with respect to $t_k$. Namely, putting without loss of generality $N = 0$, and taking into account $\partial_V = \partial_{t_2}$, we write

$$\frac{\partial p_0(t)}{\partial t_k} = \partial_{t_k}\partial_{t_2}\log\theta[D+\eta_{2n}]\Big(\sum_{i=1}^{n-1}\int_{P_0}^{\mu_i}\bar\omega\Big). \tag{5.5}$$

According to Mumford [1983], in the case of odd order hyperelliptic curves, this gives a theta-functional expression for the coefficient in front of $\lambda^{n-k}$ in the polynomial $(\lambda-\mu_1)\cdots(\lambda-\mu_g)$, which coincides with $\sigma_{k-1}(y_1,\dots,y_{n-1})$.

6. The Dynamics of Peaks and Weak Solutions

Expression (5.2) states that for equations in the hierarchies of (HD) or (SW), every peak in the solution profile moves with velocity determined by the local value of the solution. In this section, we derive this property without recourse to tools related to the complete integrability of the evolution equation. Thus, this property of peak motion can hold in general for equations that admit piecewise-smooth weak solutions, with jumps in the first spatial derivative at isolated points in the solution's support. In this case, the derivative discontinuity can be viewed as a "shock" in the appropriate weak form of the evolution equation. We will take the weak form of the equation (HD) or (SW) to be

$$\iint_{\Omega}\nabla\phi(x,t)\cdot\mathbf{V}(x,t)\,dx\,dt = 0, \tag{6.1}$$

where the equality is satisfied for all test functions $\phi(x,t)$ that are $C^\infty$ with compact support in a domain $\Omega$ in the $(x,t)$ plane. Here $\nabla\phi = (\phi_t,\phi_x)$, the dot denotes the $\mathbb{R}^2$ inner product, and the vector function $\mathbf{V}(x,t) = (V_1,V_2)$ is defined by

$$V_1 = U_x, \qquad V_2 = \partial_x\left[\frac12 U^2 - \frac14\int_{-\infty}^{\infty}|x-y|\,\big(U_y^2 - 2\kappa U\big)\,dy\right], \tag{6.2}$$

for equation (HD) and

$$V_1 = U_x, \qquad V_2 = \partial_x\left[\frac12 U^2 + \frac14\int_{-\infty}^{\infty}e^{-|x-y|}\,\big(2U^2 + U_y^2 - 2\kappa U\big)\,dy\right], \tag{6.3}$$

for equation (SW), respectively. We will look for jump conditions satisfied by the solutions of Eq. (6.1). If the jump discontinuities are isolated, by adjusting the support of the test functions $\phi(x,t)$ we only need to consider the case of a single discontinuity. Let us suppose that the function $U(x,t)$ is infinitely differentiable almost everywhere in $\Omega$, except along the curve $x = q(t)$ where the first derivative $U_x$ has a discontinuity. If we partition the domain $\Omega$ into $\Omega = \Omega_1\cup\Omega_2$ by cutting along the portion of the



discontinuity curve $x - q(t) = 0$ in $\Omega$, the divergence theorem and the choice of test functions $\phi(x,t)$ vanishing on the boundary $\partial\Omega$ allow us to write Eq. (6.1) as

$$0 = \iint_{\Omega}dx\,dt\,\nabla\phi\cdot\mathbf{V} = \iint_{\Omega_1}dx\,dt\,\phi\,\nabla\cdot\mathbf{V} + \iint_{\Omega_2}dx\,dt\,\phi\,\nabla\cdot\mathbf{V} + \int_{\partial\Omega_1\cap\partial\Omega_2}dl\,\phi\,\mathbf{n}\cdot[\mathbf{V}]^+_-. \tag{6.4}$$

Here the unit vector $\mathbf{n}$ is directed along the normal $[-\dot q, 1]$ to the discontinuity curve $\partial\Omega_1\cap\partial\Omega_2$ in $\Omega$, and $[\mathbf{V}]^+_-$ denotes the jump of the vector $\mathbf{V}$ across this curve,

$$[\mathbf{V}]^+_- \equiv \lim_{x\to q(t)^+}\mathbf{V}(x,t) - \lim_{x\to q(t)^-}\mathbf{V}(x,t).$$

By the arbitrariness of $\phi(x,t)$, each integrand term on the right-hand side of (6.4) has to vanish separately. Thus, from the first two terms,

$$\nabla\cdot\mathbf{V} = 0, \qquad\text{or}\qquad \frac{\partial V_1}{\partial t} + \frac{\partial V_2}{\partial x} = 0, \tag{6.5}$$

in $\Omega_1$ or $\Omega_2$, where $U(x,t)$ is smooth. This smoothness and zero divergence condition, by the definition (6.2) or (6.3) for (HD) or (SW) respectively, imply that $U(x,t)$ is a solution of these equations in $\Omega_1$ or $\Omega_2$. For instance, (6.5) becomes

$$U_{xt} + \partial_{xx}\left[\frac12 U^2 - \frac14\int_{-\infty}^{\infty}|x-y|\,\big(U_y^2 - 2\kappa U\big)\,dy\right] = 0,$$

which is the integrated form of the Harry Dym equation (HD). The last (jump) condition in (6.4), $\mathbf{n}\cdot[\mathbf{V}]^+_- = 0$ along the discontinuity curve $\partial\Omega_1\cap\partial\Omega_2$, implies

$$\dot q\,[V_1]^+_- = [V_2]^+_-. \tag{6.6}$$

The left-hand side of this expression is simply

$$\dot q\,[V_1]^+_- = \dot q\,[U_x]^+_-. \tag{6.7}$$

As to the right-hand side, the second (integral) term in the definitions (6.2) or (6.3) of $V_2(x,t)$ is a continuous function of $x$, as the integral wipes out the discontinuity $\mathrm{sgn}(x-y)$ as well as additional ones that $U_y^2$ might have. Hence the integral terms do not contribute to the right-hand side of (6.6). The jump of $V_2(x,t)$ across the discontinuity curve $x = q(t)$ then reduces to

$$[V_2]^+_- = \Big[\big(\tfrac12 U^2\big)_x\Big]^+_- = U(q,t)\,[U_x]^+_-. \tag{6.8}$$

If

$$[U_x]^+_- \equiv U_x(q^+,t) - U_x(q^-,t) \neq 0,$$

Eqs. (6.7) and (6.8) yield

$$\dot q = U(q,t), \tag{6.9}$$

i.e., the location of the discontinuity (shock) in $U_x$ moves at the local speed $U(q,t)$. We have then proved the following



Theorem 6.1. Let $U(x,t)$ be a solution of Eq. (6.1), with the vector $\mathbf{V}(x,t)$ defined in terms of $U(x,t)$ by the nonlinear, nonlocal operators (6.2) and (6.3) respectively for equations (HD) and (SW). Let $U(x,t)$ be a smooth function of $(x,t)$ in the domain $\Omega \subseteq \mathbb{R}^2$, except along the curve $x = q(t)$, where $U$ is continuous while the first derivative $U_x$ has a jump discontinuity (peak) $U(q^+,t) = U(q^-,t)$. Then $U(x,t)$ is a solution of equations (HD) and (SW) in each domain $\Omega_1$ and $\Omega_2$ in which the curve $x = q(t)$ partitions $\Omega$, and the location of the peak $q(t)$ moves with velocity equal to its height, $\dot q = U(q,t)$.

Conclusions. In this paper, profiles of the weak finite-gap piecewise-smooth solutions of the integrable nonlinear equations of shallow water and Dym type are linked to billiard dynamical systems and geodesic flows with reflections described in terms of finite dimensional Hamiltonian systems on Riemann surfaces. After reducing the solution of these systems to that of a nonstandard Jacobi inversion problem, solutions are found by introducing new parametrizations. The extension of the algebraic-geometric method for nonlinear integrable PDE's given in this paper leads to a description of piecewise-smooth weak solutions of a class of $N$-component systems of nonlinear evolution equations and its associated energy dependent Schrödinger operators.

Acknowledgements. Mark Alber and Roberto Camassa would like to thank Francesco Calogero and Al Osborne for helpful discussions. The authors would like to thank R. Beals, D. Sattinger and J. Szmigielski for pointing out their recent work and for making it available.

References Abenda, S., Fedorov, Yu. [2000]: On the weak Kowalevski–Painlevé property for hyperelliptically separable systems. Acta Appl. Math. 60 (2), 137–178 Ablowitz, M.J., Segur, H. [1981]: Solitons and the Inverse Scattering Transform. Philadelphia: SIAM Alber, S.J. [1979]: Investigation of equations of Korteweg de Vries type by the method of recurrence relations. (Russian) J. Lond. Math. Soc. (2) 19, no. 3, 467–480 Alber, M.S., Alber, S.J. [1985]: Hamiltonian formalism for finite-zone solutions of integrable equations. C. R. Acad. Sci. Paris Ser. I Math. 301, 777–781 Alber, M.S., Camassa, R., Holm, D.D. and Marsden, J.E. [1994]: The geometry of peaked solitons and billiard solutions of a class of integrable PDE’s. Lett. Math. Phys. 32, 137–151 Alber, M.S., Camassa, R., Holm, D.D. and Marsden, J.E. [1995]: On the link between umbilic geodesics and soliton solutions of nonlinear PDE’s. Proc. Roy. Soc. 450, 677–692 Alber, M.S., Camassa, R., Fedorov, F., Holm, D.D. and Marsden, J.E. [1999]: On billiard solutions of nonlinear PDE’s. Phys. Lett. A 264, 171–178 Alber, M.S. and Miller, C. [2001]: On peak on solutions of the shallow water equation. Appl. Math. Lett. 14, 93–98 Alber, M.S. and Fedorov, Yu.N. [2000]: Wave solutions of evolution equations and Hamiltonian flows on nonlinear subvarieties of generalized Jacobians. J. Phys. A: Math. Gen. 33, 8409–8425 Alber, M.S. and Fedorov, Yu.N. [2001]: Algebraic geometric solutions for nonlinear evolution equations and flows on the nonlinear subvarieties of Jacobians. Inverse Problems (to appear) Alber, M.S., Luther, G.G., and Marsden, J.E. [1997]: Complex billiard Hamiltonian systems and nonlinear waves. In: Fokas,Y.H., Gelfand, I. M. (eds.) Algebraic Aspects of Integrable Systems, 1–16, Progr. Nonlinear Differential Equations Appl., 26. Boston: Birkhäuser Antonowicz, M. and Fordy, A. P. [1989]: Factorization of energy dependent Schrödinger operators: Miura maps and modified systems. Commun. Math. Phys. 124, no. 
3, 465–486 Beals, W., Sattinger, D., and Szmigielski, J. [1998]: Acoustic scattering and the extended Korteweg de Vries hierarchy. Adv. Math. 140, 190–206



Beals, R., D.H. Sattinger, and J. Szmigielski [1999]: Multi-peakons and a theorem of Stietjes. Inverse Problems 15, L1–L4 Beals, R., Sattinger, D.H. and Szmigielski, J. [2000]: Multipeakons and the classical moment. Adv. in Math. 154, no. 2, 229–257 Belokolos, E.D., Bobenko, A.I., Enol’sii, V.Z., Its, A.R., and Matveev, V.B. [1994]: Algebro-Geometric Approach to Nonlinear Integrable Equations. Springer-Verlag series in Nonlinear Dynamics. New York: Springer-Verlag Bobenko, A. I. and Suris, Y. B. [1999]: Discrete Lagrangian reduction, discrete Euler-Poincaré equations, and semidirect products. Lett. Math. Phys. 49, no. 1, 79–93 Bulla, W., Gesztesy, F., Holden, H. and Teschl, G. [1998]: Algebro-geometric quasi-periodic finite-gap solutions of the Toda and Kac–van Moerbeke hierarchies. Mem. Am. Math. Soc. 135 Camassa, R. [2000]: Characteristic variables for a completely integrable shallow water equation. In: Boiti, M. et al. (eds.) Nonlinearity, Integrability and All That: Twenty Years After NEEDS ’79. Singapore: World Scientific Camassa, R., Holm, D.D. [1993]: An integrable shallow water equation with peaked solitons. Phys. Rev. Lett. 71, 1661–1664 Camassa, R., Holm, D.D. and Hyman, J.M. [1994]: A new integrable shallow water equation. Adv. Appl. Mech. 31, 1–33 Calogero, F. [1995]: An integrable Hamiltonian system. Phys. Lett. A 201, 306–310 Calogero, F. and Francoise, J.-P. [1996]: Solvable quantum version of an integrable Hamiltonian system. J. Math. Phys. 37 (6), 2863–2871 Cewen, C. [1990]: Stationary Harry-Dym’s equation and its relation with geodesics on ellipsoid. Acta Math. Sinica 6, 35–41 Clebsch, A. and Gordon, P. [1866]: Theorie der abelschen Funktionen. Leipzig: Teubner Dmitrieva, L.A. [1993a]: Finite-gap solutions of the Harry Dym equation. Phys. Lett. A 182, 65–70 Dmitrieva, L.A. [1993b]: The higher-times approach to multisoliton solutions of the Harry Dym equation J. Phys. A 26, 6005–6020 Drach, M. 
[1919]: Sur l’integration par quadratures de l’equation d 2 y/dx 2 = [φ(x) + h]y. Comptes rendus 168, 337–340 Dubrovin, B.A. [1975]: Periodic problems for the Korteweg–de Vries equation in the class of finite-band potentials. Funct. Anal. Appl. 9, 215–223 Dubrovin, B.A. [1981]: Theta-functions and nonlinear equations. Russ. Math. Surv. 36 (2), 11–92 Ercolani, N. [1987]: Generalized theta functions and homoclinic varieties. In: Ehrenpreis, L., Gunning, R.C. (eds.) Theta functions. Proceedings, Bowdoin. 87–100, Providence, R.I.: American Mathematical Society Fedorov, Yu. [1999]: Classical integrable systems related to generalized Jacobians. Acta Appl. Math. 55, 3, 151–201 Fedorov, Yu. [2001]: Ellipsoidal billiard with the quadratic potential. Funct. Anal. Appl. (Russian) (to appear) Gavrilov, L. [1999]: Generalized Jacobians of spectral curves and completely integrable systems. Math. Z. 230, 487–508 Gesztesy, F. [1995]: New trace formulas for Schrödinger operators. Evolution equations (Baton Rouge, LA, 1992), Lecture Notes in Pure and Appl. Math., 168, New York: Dekker, pp. 201–221 Gesztesy, F. and Holden, H. [1994]: Trace formulas and conservation laws for nonlinear evolution equations. Rev. Math. Phys. 6, 51–95 and 673 Gesztesy, F. and Holden, H. [2001]: Algebraic-geometric solutions of the Camassa–Holm equation. Preprint Gesztesy, F. and Holden, H. [2001]: Dubrovin equations and integrable systems on hyperelliptic curves. Math. Scand. (to appear) Gesztesy, F., Ratnaseelan, R., and Teschl, G. [1996]: The KdV hierarchy and associated trace formulas. Recent developments in operator theory and its applications (Winnipeg, MB, 1994), 125–163, Oper. Theory Adv. Appl. 87, Basel: Birkhäuser Hunter, J.K. and Zheng, Y.X. [1994]: On a completely integrable nonlinear hyperbolic variational equation. Physica D 79, 361–386 Jacobi, C.G.J. [1884a]: Vorlesungen uber Dynamik, Gesammelte Werke. Berlin: Supplementband Jacobi, C.G.J. 
[1884b]: Solution nouvelle d’un probleme fondamental de geodesie. Gesamelte Werke Bd. 2, Berlin



Jaulent, M. [1972]: On an inverse scattering problem with an energy dependent potential. Ann. Inst. H. Poincare A 17, 363–372 Jaulent, M. and Jean, C. [1976]: The inverse problem for the one-dimensional Schrödinger operator with an energy dependent potential. Ann. Inst. H. Poincare A. I, II 25, 105–118, 119–137 Kane, C., E. A. Repetto, E.A., Ortiz, M., and Marsden, J.E. [1999]: Finite element analysis of nonsmooth contact. Comput. Methods Appl. Mech. and Engrg. 180, 1–26 Klingerberg, W. [1982]: Riemannian Geometry. New York: de Gruyter Knörrer, H. [1982]: Geodesics on quadrics and mechanical problem of C. Neumann. J. Reine Angew. Math. 334, 69–78 Kozlov, V.V. and Treschev, D. V. [1991]: Billiards, a Genetic Introduction to Systems with Impacts. AMS Translations of Math. Monographs 89. New York Kruskal, M.D. [1975]: Nonlinear wave equations. In: Moser, J. (eds.) Dynamical Systems, Theory and Applications. Lecture Notes in Physics 38, New York: Springer Markushevich, A. I. [1977]: Introduction to the Theory of Abelian Functions. English version: Translations of Mathematical Monographs, 96. Providence, RI: American Mathematical Society, 1992 Marsden, J. E., Patrick, G.W., and Shkoller, S. [1998]: Mulltisymplectic geometry, variational integrators and nonlinear PDEs, Commun. Math. Phys. 199, 351–395 Marsden, J. E., Pekarsky, S., and Shkoller, S. [1999]: Discrete Euler-Poincare and Lie-Poisson equations. Nonlinearity 12, 1647–1662 Marsden, J. E. and Ratiu, T.S. [1999]: Introduction to Mechanics and Symmetry. Texts in Applied Mathematics, 17, Berlin–Heidelberg–New York: Springer-Verlag McKean, H.P. and Constantin, A. [1999]: A shallow water equation on the circle. Comm. Pure Appl. Math. Vol LII, 949–982 Moser, J. and Veselov, A.P. [1991]: Discrete versions of some classical integrable systems and factorization of matrix polynomials. Commum. Math. Phys. 139, 217–243 Mumford, D. [1983]: Tata Lectures on Theta I and II. Boston: Birkhäuser-Verlag Novikov D.P. 
[1999]: Algebraic-geometrical solutions of the Harry–Dym equations. Sibirskii Matematicheskii Zhurnal, 40, 159–163, (Russian) English transl. in: Siberian Math. Journal, 40, 136–140 Previato, E. [1995]: Hyperelliptic quasi-periodic and soliton solutions of the nonlinear Schrödinger equation. Duke Math. J. 52, 329–377 Rauch-Wojciechowski, S. [1995]: Mechanical systems related to the Schrödinger spectral problem. Chaos, Solitons & Fractals 5, 2235–2259 Serre, J.-P. [1959]: Groupes algébriques et corps de classes. Paris: Hermann Vanhaecke, P. [1995]: Stratification of hyperelliptic Jacobians and the Sato Grassmannian. Acta Appl. Math. 40, 143–172 Veselov, A.P. [1988]: Integrable discrete-time systems and difference operators. Funct. An. and Appl. 22, 83–94 Verhulst, F. [1996]: Nonlinear Differential Equations and Dynamical Systems. Second Edition, Berlin–Heidelberg–New York: Springer-Verlag Wadati, M., Ichikawa, Y.H., and Shimizu, T. [1980]: Cusp soliton of a new integrable nonlinear evolution equation. Prog. Theor. Phys. 64, 1959–1967 Weierstrass, K. [1884]: Über die geodätischen Linien auf dreiachsigen Ellipsoid, Math. Werke I, 257–266 Wendlandt, J.M. and Marsden, J.E. [1997]: Mechanical integrators derived from a discrete variational principle. Physica D 106, 223–246 Whittaker, E.T. [1937]: A Treatise on the Analytical Dynamics of Particles and Rigid Bodies, Cambridge: Cambridge University Press, 1904; 4th Ed., 1937 (reprinted by Dover 1944, and Cambridge University 1988) Young, L.C. [1969]: Lectures on the Calculus of Variations and Optimal Control Theory. Corrected printing, Chelsea: Saunders, 1980 Communicated by T. Miwa

Commun. Math. Phys. 221, 229 – 254 (2001)




The Absolute Continuity of the Integrated Density of States for Magnetic Schrödinger Operators with Certain Unbounded Random Potentials

Thomas Hupfer¹, Hajo Leschke¹, Peter Müller², Simone Warzel¹

¹ Institut für Theoretische Physik, Universität Erlangen-Nürnberg, Staudtstraße 7, 91058 Erlangen, Germany
² Institut für Theoretische Physik, Georg-August-Universität, 37073 Göttingen, Germany

Received: 20 October 2000 / Accepted: 8 March 2001

Dedicated to the memory of Kurt Broderix (26 April 1962 – 12 May 2000)

Abstract: The object of the present study is the integrated density of states of a quantum particle in multi-dimensional Euclidean space which is characterized by a Schrödinger operator with magnetic field and a random potential which may be unbounded from above and below. In case that the magnetic field is constant and the random potential is ergodic and admits a so-called one-parameter decomposition, we prove the absolute continuity of the integrated density of states and provide explicit upper bounds on its derivative, the density of states. This local Lipschitz continuity of the integrated density of states is derived by establishing a Wegner estimate for finite-volume Schrödinger operators which holds for rather general magnetic fields and different boundary conditions. Examples of random potentials to which the results apply are certain alloy-type and Gaussian random potentials. Besides we show a diamagnetic inequality for Schrödinger operators with Neumann boundary conditions.

Contents

1. Introduction
2. Random Schrödinger Operators with Magnetic Fields
   2.1 Basic notation
   2.2 Basic assumptions
   2.3 Definition of the operators
   2.4 The integrated density of states
3. Existence of the Density of States for Certain Random Potentials
   3.1 A Wegner estimate
   3.2 Upper bounds on the density of states
4. Examples Illustrating the Results of Section 3
   4.1 Alloy-type random potentials
   4.2 Gaussian random potentials
   4.3 Two space dimensions: random Landau Hamiltonians
A. On Finite-Volume Schrödinger Operators with Magnetic Fields
   A.1 Definition of magnetic Neumann Schrödinger operators
   A.2 Diamagnetic inequality
   A.3 Some consequences
References

1. Introduction

The integrated density of states is a quantity of primary interest in the theory [34, 10, 49] and application [54, 7, 40, 2, 37] of Schrödinger operators for a particle in $d$-dimensional Euclidean space $\mathbb{R}^d$ ($d = 1, 2, 3, \dots$) subject to a random potential. Its knowledge allows one to compute the free energy and hence all basic thermostatic quantities of the corresponding non-interacting many-particle system. It also enters formulae for transport coefficients.

The goal of the present paper is to prove the absolute continuity of the integrated density of states $N$ for certain unbounded random potentials, thereby generalizing a result in [23] for zero magnetic field to the case of a constant magnetic field. Examples of random potentials to which our result applies are certain alloy-type and Gaussian random potentials. In particular, we consider the situation of two space dimensions and a perpendicular constant magnetic field where $N$ is not absolutely continuous without random potential.

For the proof of absolute continuity of $N$, we use the abstract one-parameter spectral-averaging estimate of [11] to derive what is called a Wegner estimate [65]. Such estimates provide upper bounds on the averaged number of eigenvalues of finite-volume random Schrödinger operators in a given energy regime. They play a major rôle in proofs of Anderson localization for multi-dimensional random Schrödinger operators [10, 49, 11, 24, 61]. In contrast to the Wegner estimates with magnetic fields which are available so far, we are neither restricted to the case of a constant magnetic field [12, 5, 64] nor to the existence of gaps in the spectrum of the magnetic Schrödinger operator without random potential [4]. In fact, the Wegner estimate in the present paper holds for magnetic vector potentials whose components are locally square integrable.
Its proof involves techniques for (non-random) magnetic Neumann Schrödinger operators, among them Dirichlet–Neumann bracketing and a diamagnetic inequality. Appendix A provides the definition of these operators and proofs of the latter techniques in greater generality than actually needed for the main body of the present paper.

2. Random Schrödinger Operators with Magnetic Fields

2.1. Basic notation. As usual, let $\mathbb{N} := \{1, 2, 3, \dots\}$ denote the set of natural numbers. Let $\mathbb{R}$, respectively $\mathbb{C}$, denote the algebraic field of real, respectively complex numbers and let $\mathbb{Z}^d$ be the simple cubic lattice in $d$ dimensions, $d \in \mathbb{N}$. An open cube $\Lambda$ in $d$-dimensional Euclidean space $\mathbb{R}^d$ is a translate of the $d$-fold Cartesian product $I \times \dots \times I$ of an open interval $I \subseteq \mathbb{R}$. The open unit cube in $\mathbb{R}^d$ which is centered at site $y \in \mathbb{R}^d$ and whose edges are oriented parallel to the co-ordinate axes is denoted by $\Lambda(y)$. The Euclidean norm of $x \in \mathbb{R}^d$ is $|x| := \big(\sum_{j=1}^{d} x_j^2\big)^{1/2}$.

The volume of a Borel subset $\Lambda \subseteq \mathbb{R}^d$ with respect to the $d$-dimensional Lebesgue measure is $|\Lambda| := \int_\Lambda d^dx = \int_{\mathbb{R}^d} d^dx\,\chi_\Lambda(x)$, where $\chi_\Lambda$ is the indicator function of $\Lambda$. In particular, if $\Lambda$ is the strictly positive half-line, $\Theta := \chi_{]0,\infty[}$ is the left-continuous Heaviside unit-step function. The Banach space $L^p(\Lambda)$, $p \in [1,\infty]$, consists of the Borel-measurable complex-valued functions $f : \Lambda \to \mathbb{C}$ which are identified if their values differ only on a set of Lebesgue measure zero and which obey $\int_\Lambda d^dx\,|f(x)|^p < \infty$ if $p < \infty$ and $\|f\|_\infty := \operatorname{ess\,sup}_{x\in\Lambda}|f(x)| < \infty$ if $p = \infty$. We recall that $L^2(\Lambda)$ is a separable Hilbert space with scalar product $\langle\cdot,\cdot\rangle$ given by $\langle f, g\rangle = \int_\Lambda d^dx\,\overline{f(x)}\,g(x)$. Here the overbar denotes complex conjugation. We write $f \in L^p_{\mathrm{loc}}(\mathbb{R}^d)$, if $f\chi_\Lambda \in L^p(\mathbb{R}^d)$ for any bounded Borel set $\Lambda \subset \mathbb{R}^d$. Finally, $C_0^\infty(\Lambda)$ is the vector space of functions $f : \Lambda \to \mathbb{C}$ which are arbitrarily often differentiable and have compact supports.

2.2. Basic assumptions. Let $(\Omega, \mathcal{A}, \mathbb{P})$ be a complete probability space and $\mathbb{E}\{\cdot\} := \int_\Omega \mathbb{P}(d\omega)(\cdot)$ be the expectation induced by the probability measure $\mathbb{P}$. By a random potential we mean a (scalar) random field $V : \Omega \times \mathbb{R}^d \to \mathbb{R}$, $(\omega, x) \mapsto V^{(\omega)}(x)$ which is assumed to be jointly measurable with respect to the product of the sigma-algebra $\mathcal{A}$ of event sets in $\Omega$ and the sigma-algebra $\mathcal{B}(\mathbb{R}^d)$ of Borel sets in $\mathbb{R}^d$. We will always assume $d \geq 2$, because magnetic fields in one space dimension may be "gauged away" and are therefore of no physical relevance. Furthermore, for $d = 1$ far more is known [10, 49] thanks to methods which only work for one dimension. We list four properties which $V$ may have or not:

(F) There exists some real $p \in\, ]1,\infty[$ with $p > 1$ if $d = 2$ and $p \geq d/2$ if $d \geq 3$ such that for $\mathbb{P}$-almost each $\omega \in \Omega$ the realization $V^{(\omega)} : x \mapsto V^{(\omega)}(x)$ of $V$ belongs to $L^p_{\mathrm{loc}}(\mathbb{R}^d)$.

(S) There exists some pair of reals $p_1 > p(d)$ and $p_2 > p_1 d/[2(p_1 - p(d))]$ such that

$$\sup_{y\in\mathbb{Z}^d}\mathbb{E}\left\{\Big(\int_{\Lambda(y)} d^dx\,|V(x)|^{p_1}\Big)^{p_2/p_1}\right\} < \infty. \tag{2.1}$$

Here $p(d)$ is defined as follows: $p(d) := 2$ if $d \leq 3$, $p(d) := d/2$ if $d \geq 5$ and $p(4) > 2$, otherwise arbitrary.

(E) $V$ is $\mathbb{Z}^d$-ergodic or $\mathbb{R}^d$-ergodic.

(I) The finiteness condition

$$\sup_{y\in\mathbb{Z}^d}\mathbb{E}\left\{\int_{\Lambda(y)} d^dx\,|V(x)|^{2\vartheta+1}\right\} < \infty \tag{2.2}$$

holds, where $\vartheta \in \mathbb{N}$ is the smallest integer with $\vartheta > d/4$.

Remark 2.1. (i) Property (E) requires the existence of a group $T_x$, $x \in \mathbb{Z}^d$ or $\mathbb{R}^d$, of probability-preserving and ergodic transformations on $\Omega$ such that $V$ is $\mathbb{Z}^d$- or $\mathbb{R}^d$-homogeneous in the sense that $V^{(T_x\omega)}(y) = V^{(\omega)}(y - x)$ for all $x \in \mathbb{Z}^d$ or $\mathbb{R}^d$, all $y \in \mathbb{R}^d$ and all $\omega \in \Omega$.

(ii) Since property (S) assures that the realization $V^{(\omega)}$ belongs to $L^{p(d)}_{\mathrm{loc}}(\mathbb{R}^d)$ for $\mathbb{P}$-almost each $\omega \in \Omega$, property (S) implies property (F). Property (I) also implies property (F).

We proceed by listing two properties either of which a random potential may additionally have or not and which characterize two examples of random potentials, which we will consider in the present paper.



(A) $V$ is an alloy-type random field, that is, a random field with realizations given by
$$ V^{(\omega)}(x) = \sum_{j \in \mathbb{Z}^d} \lambda_j^{(\omega)}\, u_0(x - j). \tag{2.3} $$
The coupling strengths $\{\lambda_j\}$ form a family of random variables which are $\mathbb{P}$-independent and identically distributed according to the common probability measure $\mathcal{B}(\mathbb{R}) \ni I \mapsto \mathbb{P}\{\lambda_0 \in I\}$. Moreover, we suppose that the single-site potential $u_0 : \mathbb{R}^d \to \mathbb{R}$ satisfies the Birman–Solomyak condition $\sum_{j \in \mathbb{Z}^d} \big( \int_{\Lambda(j)} d^dx\, |u_0(x)|^{p_1} \big)^{1/p_1} < \infty$ with some real $p_1 \ge 2\vartheta + 1$ and that $\mathbb{E}(|\lambda_0|^{p_2}) < \infty$ for some real $p_2$ satisfying $p_2 \ge 2\vartheta + 1$ and $p_2 > p_1 d/[2(p_1 - p(d))]$. [The constants $p(d)$ and $\vartheta$ are defined in properties (S) and (I).]

(G) $V$ is a Gaussian random field [1, 41] which is $\mathbb{R}^d$-homogeneous. It has zero mean, $\mathbb{E}[V(0)] = 0$, and its covariance function $x \mapsto C(x) := \mathbb{E}[V(x)V(0)]$ is continuous at the origin where it obeys $0 < C(0) < \infty$.

Remark 2.2. (i) Consider an alloy-type random potential $V$, that is, a random potential with property (A). Then $V$ has properties (E), (I), (S) and (F), see, for example [29].

(ii) Consider a random field with the Gaussian property (G). Then its covariance function $C$ is bounded and uniformly continuous on $\mathbb{R}^d$. Consequently, [22, Thm. 3.2.2] implies the existence of a separable version $V$ of this field which is jointly measurable. Speaking about a Gaussian random potential, we tacitly assume that only this version will be dealt with. By the Bochner–Khintchine theorem [51, Thm. IX.9] there is a one-to-one correspondence between finite positive (and even) Borel measures on $\mathbb{R}^d$ and Gaussian random potentials. An explicit calculation shows that a Gaussian random potential enjoys properties (I), (S) and (F). A simple sufficient criterion for the ergodicity property (E) is the mixing condition $\lim_{|x| \to \infty} C(x) = 0$.

By a vector potential we mean a (non-random) Borel-measurable vector field $A : \mathbb{R}^d \to \mathbb{R}^d$, $x \mapsto A(x)$ which we assume to possess either the property

(B) $|A|^2$ belongs to $L^1_{\mathrm{loc}}(\mathbb{R}^d)$,

or the property

(C) $A$ has continuous partial derivatives which give rise to a magnetic field (tensor) with constant components given by $B_{jk} := \partial_j A_k - \partial_k A_j$, where $j, k \in \{1, \dots, d\}$.

Remark 2.3. (i) Property (C) implies property (B).

(ii) Given property (C), we may exploit the gauge freedom to choose the vector potential in the symmetric gauge in which the components of $A$ are given by $A_k(x) = \sum_{j=1}^d x_j B_{jk}/2$, where $k \in \{1, \dots, d\}$.

2.3. Definition of the operators. We are now prepared to precisely define magnetic Schrödinger operators with random potentials on the Hilbert spaces $L^2(\Lambda)$ and $L^2(\mathbb{R}^d)$. The finite-volume case is treated in

Proposition 2.1. Let $\Lambda \subset \mathbb{R}^d$ be a bounded open cube, $A$ be a vector potential with the property (B) and $V$ be a random potential with the property (F). Then



(i) the sesquilinear form
$$ h^{A,0}_{\Lambda,N}(\varphi, \psi) := \frac{1}{2} \sum_{j=1}^{d} \big\langle (i\nabla + A)_j \varphi \,,\, (i\nabla + A)_j \psi \big\rangle, \tag{2.4} $$
with $\varphi, \psi$ in the form domain $Q\big(h^{A,0}_{\Lambda,N}\big) := \big\{ \phi \in L^2(\Lambda) : (i\nabla + A)\phi \in (L^2(\Lambda))^d \big\}$ and $\nabla - iA$ denoting the gauge-covariant gradient in the sense of distributions on $C_0^\infty(\Lambda)$, uniquely defines a self-adjoint positive operator on $L^2(\Lambda)$, which we denote by $H_{\Lambda,N}(A, 0)$. The closure $h^{A,0}_{\Lambda,D}$ of the restriction of $h^{A,0}_{\Lambda,N}$ to the domain $C_0^\infty(\Lambda)$ uniquely defines another self-adjoint positive operator on $L^2(\Lambda)$, which we denote by $H_{\Lambda,D}(A, 0)$.

(ii) The two operators
$$ H_{\Lambda,X}(A, V^{(\omega)}) := H_{\Lambda,X}(A, 0) + V^{(\omega)}, \qquad X = D \ \text{or}\ X = N, \tag{2.5} $$
are self-adjoint and bounded below on $L^2(\Lambda)$ as form sums for all $\omega$ in some subset $\Omega_F \in \mathcal{A}$ of $\Omega$ with full probability, in symbols, $\mathbb{P}(\Omega_F) = 1$.

(iii) The mapping $H_{\Lambda,X}(A, V) : \Omega_F \ni \omega \mapsto H_{\Lambda,X}(A, V^{(\omega)})$ is measurable. We call it the finite-volume magnetic Schrödinger operator with random potential $V$ and Dirichlet or Neumann boundary condition if $X = D$ or $X = N$, respectively.

(iv) The spectrum of $H_{\Lambda,X}(A, V^{(\omega)})$ is purely discrete for all $\omega \in \Omega_F$.

(v) The (random) finite-volume density-of-states measure, defined by the trace
$$ \nu^{(\omega)}_{\Lambda,X}(I) := \mathrm{Tr}\big[ \chi_I\big( H_{\Lambda,X}(A, V^{(\omega)}) \big) \big], \tag{2.6} $$
is a positive Borel measure on the real line $\mathbb{R}$ for all $\omega \in \Omega_F$. Here $\chi_I\big( H_{\Lambda,X}(A, V^{(\omega)}) \big)$ is the spectral projection operator of $H_{\Lambda,X}(A, V^{(\omega)})$ associated with the energy regime $I \in \mathcal{B}(\mathbb{R})$. Moreover, the (unbounded left-continuous) distribution function
$$ N^{(\omega)}_{\Lambda,X}(E) := \nu^{(\omega)}_{\Lambda,X}\big( ]-\infty, E[ \big) = \mathrm{Tr}\big[ \Theta\big( E - H_{\Lambda,X}(A, V^{(\omega)}) \big) \big] < \infty \tag{2.7} $$
of $\nu^{(\omega)}_{\Lambda,X}$, called the finite-volume integrated density of states, is finite for all energies $E \in \mathbb{R}$.

Proof. The proofs of assertions (i), (ii) and (iv) are contained in Appendix A because (B) and (F) imply (A.1) and (A.2). Assertion (iii) is a consequence of considerations in [35], see also Sect. V.1 in [10], and of a straightforward generalization to non-zero vector potentials. Assertion (v) follows from (ii) and (iv).

Remark 2.4. Counting multiplicity, $\nu^{(\omega)}_{\Lambda,X}(I)$ is just the number of eigenvalues of the operator $H_{\Lambda,X}(A, V^{(\omega)})$ in the Borel set $I \subseteq \mathbb{R}$. Since this number is almost-surely finite if $I$ is bounded, the mapping $\nu_{\Lambda,X} : \omega \mapsto \nu^{(\omega)}_{\Lambda,X}$ is a random Borel measure.

The precise definition of the infinite-volume magnetic Schrödinger operator on $L^2(\mathbb{R}^d)$ and a compilation of its basic properties are given in

Proposition 2.2. Assume that the vector potential $A$ and the random potential $V$ enjoy the properties (C) and (S). Then



(i) the operator $C_0^\infty(\mathbb{R}^d) \ni \psi \mapsto \frac{1}{2} \sum_{j=1}^d (i\partial_j + A_j)^2 \psi + V^{(\omega)} \psi$ is essentially self-adjoint for all $\omega$ in some subset $\Omega_S \in \mathcal{A}$ of $\Omega$ with full probability, $\mathbb{P}(\Omega_S) = 1$. Its self-adjoint closure on $L^2(\mathbb{R}^d)$ is denoted by $H(A, V^{(\omega)})$.

(ii) The mapping $H(A, V) : \Omega_S \ni \omega \mapsto H(A, V^{(\omega)})$ is measurable. We call it the infinite-volume magnetic Schrödinger operator with random potential $V$.

Proof. For assertion (i) see [24, Prop. 2.3], which generalizes [10, Prop. V.3.2] to the case of continuously differentiable vector potentials $A \neq 0$. Note that the assumption of a vanishing divergence, $\sum_{j=1}^d \partial_j A_j = 0$, in [24, Prop. 2.3] is not needed in the argument. Assertion (ii) is a straightforward generalization of [10, Prop. V.3.1] to continuously differentiable $A \neq 0$, see also [34, Prop. 2 on p. 288].

Remark 2.5. For alternative or weaker criteria instead of (S) guaranteeing the almost-sure self-adjointness of $H(0, V)$, see [49, Thm. 5.8] or [34, Thm. 1 on p. 299].

If $A$ has the property (C), the infinite-volume magnetic Schrödinger operator without scalar potential, $H(A, 0)$, is unitarily invariant under so-called magnetic translations [67]. The latter form a family of unitary operators $\{T_x\}_{x \in \mathbb{R}^d}$ on $L^2(\mathbb{R}^d)$ defined by
$$ (T_x \psi)(y) := e^{i \Phi_x(y)}\, \psi(y - x), \qquad \psi \in L^2(\mathbb{R}^d), \tag{2.8} $$
where $\Phi_x(y) := \int_{K(x,y)} dr \cdot \big( A(r) - A(r - x) \big)$ is an integral along some smooth curve $K$ with initial point $x \in \mathbb{R}^d$ and terminal point $y \in \mathbb{R}^d$. Since $A$ and its $x$-translate $A(\,\cdot - x)$ give rise to the same magnetic field and $\mathbb{R}^d$ is simply connected, the integral $\Phi_x(y)$ is actually independent of $K$.

Remark 2.6. (i) For the vector potential in the symmetric gauge (see Remark 2.3 (ii)) one has $\Phi_x(y) = \sum_{j,k=1}^d x_j B_{jk} (y_k - x_k)/2$.

(ii) For a discussion in the case of more general configuration spaces and magnetic fields, see for example [44].

(iii) In the situation of Prop. 2.2 and if the random potential $V$ has property (E), we have
$$ T_x\, H(A, V^{(\omega)})\, T_x^\dagger = H(A, V^{(T_x \omega)}) \tag{2.9} $$
for all $\omega \in \Omega_S$ and all $x \in \mathbb{Z}^d$ or $x \in \mathbb{R}^d$, depending on whether $V$ is $\mathbb{Z}^d$- or $\mathbb{R}^d$-ergodic. Hence, following standard arguments, $H(A, V)$ is an ergodic operator and its spectral components are non-random, see [62, Thm. 2.1]. Moreover, the discrete spectrum of $H(A, V^{(\omega)})$ is empty for $\mathbb{P}$-almost all $\omega \in \Omega$, see [34, 10, 62].

2.4. The integrated density of states. The quantity of main interest in the present paper is the integrated density of states and its corresponding measure, called the density-of-states measure. The next theorem, which we recall from [29], deals with its definition and its representation as an infinite-volume limit of the suitably scaled finite-volume counterparts (2.7).

Proposition 2.3. Let $\chi_{\Lambda(0)}$ denote the multiplication operator associated with the indicator function of the unit cube $\Lambda(0)$. Assume that the potentials $A$ and $V$ have the properties (C), (S), (I) and (E). Then the (infinite-volume) integrated density of states
$$ N(E) := \mathbb{E}\big\{ \mathrm{Tr}\big[ \chi_{\Lambda(0)}\, \Theta\big( E - H(A, V) \big)\, \chi_{\Lambda(0)} \big] \big\} < \infty \tag{2.10} $$



is well defined for all energies $E \in \mathbb{R}$ in terms of the (spatially localized) spectral family of the infinite-volume operator $H(A, V)$. It is the (unbounded left-continuous) distribution function of some positive Borel measure $\nu$ on the real line $\mathbb{R}$. Moreover, let $\Lambda \subset \mathbb{R}^d$ stand for bounded open cubes centered at the origin. Then there is a set $\Omega_0 \in \mathcal{A}$ of full probability, $\mathbb{P}(\Omega_0) = 1$, such that the limit relation
$$ N(E) = \lim_{\Lambda \uparrow \mathbb{R}^d} \frac{N^{(\omega)}_{\Lambda,X}(E)}{|\Lambda|} \tag{2.11} $$

holds for both boundary conditions $X = D$ and $X = N$, all $\omega \in \Omega_0$ and all $E \in \mathbb{R}$ except for the (at most countably many) discontinuity points of $N$.

Proof. See [29].

Remark 2.7. (i) A proof of the existence of the integrated density of states $N$ under slightly different hypotheses was outlined in [43]. It uses functional-analytic arguments first presented in [36] for the case $A = 0$. A different approach to the existence of the density-of-states measure $\nu$ for $A \neq 0$, using Feynman–Kac(-Itô) functional-integral representations of Schrödinger semigroups [58, 9], can be found in [62, 8]. The latter approach dates back to [47, 46] for the case $A = 0$. To our knowledge, it works straightforwardly in the case $A \neq 0$ for $X = D$ only. For $A \neq 0$ the independence of the infinite-volume limit in (2.11) of the boundary condition $X$ (previously claimed without proof in [43]) follows from [45] if the random potential $V$ is bounded and from [19] if $V$ is bounded from below. So the main new point about Prop. 2.3 is that it also applies to a wide class of $V$ unbounded from below. Even for $A = 0$, Prop. 2.3 is partially new in that the corresponding result [49, Thm. 5.20] only shows vague convergence of the underlying measures, see the next remark.

(ii) An immediate corollary of Prop. 2.3 is the vague convergence [6, Def. 30.1] of the spatial eigenvalue concentrations $|\Lambda|^{-1} \nu^{(\omega)}_{\Lambda,X}$ in the infinite-volume limit $\Lambda \uparrow \mathbb{R}^d$ to the non-random positive Borel measure $\nu$ uniquely corresponding to the integrated density of states (2.10) in the sense that $N(E) = \nu(]-\infty, E[)$ for all $E \in \mathbb{R}$, that is,
$$ \nu = \lim_{\Lambda \uparrow \mathbb{R}^d} \frac{\nu^{(\omega)}_{\Lambda,X}}{|\Lambda|} \qquad \text{(vaguely)} \tag{2.12} $$
for both $X = D$ and $X = N$ and $\mathbb{P}$-almost all $\omega \in \Omega$.

One may relate properties of the density-of-states measure $\nu$ to simple spectral properties of the infinite-volume magnetic Schrödinger operator. Examples are the support of $\nu$ and the location of the almost-sure spectrum of $H(A, V^{(\omega)})$ or the absence of a point component in the Lebesgue decomposition of $\nu$ and the absence of “immobile eigenvalues” of $H(A, V^{(\omega)})$. This is the content of

Corollary 2.1. Under the assumptions of Prop. 2.3 and letting $I \in \mathcal{B}(\mathbb{R})$, the following equivalence holds: $\nu(I) = 0$ if and only if $\chi_I\big( H(A, V^{(\omega)}) \big) = 0$ for $\mathbb{P}$-almost all $\omega \in \Omega$. This immediately implies:

(i) $\operatorname{supp} \nu = \operatorname{spec} H(A, V^{(\omega)})$ for $\mathbb{P}$-almost all $\omega \in \Omega$. [Here $\operatorname{spec} H(A, V^{(\omega)})$ denotes the spectrum of $H(A, V^{(\omega)})$ and $\operatorname{supp} \nu := \{ E \in \mathbb{R} : \nu(]E - \varepsilon, E + \varepsilon[) > 0 \text{ for all } \varepsilon > 0 \}$ is the topological support of $\nu$.]



(ii) $0 = \nu(\{E\}) = \lim_{\varepsilon \downarrow 0} \big[ N(E + \varepsilon) - N(E) \big]$ if and only if $E \in \mathbb{R}$ is not an eigenvalue of $H(A, V^{(\omega)})$ for $\mathbb{P}$-almost all $\omega \in \Omega$.

Proof. See [29].

The equivalence (ii) of the above corollary is a continuum analogue of [15, Prop. 1.1], see also [49, Thm. 3.3]. In the one-dimensional case [48] and the multi-dimensional lattice case [18], the equivalence has been exploited to show for $A = 0$ the (global) continuity of the integrated density of states $N$ under practically no further assumptions on the random potential beyond those ensuring the existence of $N$. The proof of such a statement in the multi-dimensional continuum case is considered an important open problem [60]. For $A \neq 0$ one certainly needs additional assumptions as [20] illustrates, see Remark 4.3(ii) below. Under the additional assumptions of Corollary 3.1 below, we will show that the integrated density of states is not only continuous, but even absolutely continuous in the case of a constant magnetic field of arbitrary strength.

3. Existence of the Density of States for Certain Random Potentials

In this section we provide conditions under which the integrated density of states $N$ (or, equivalently, its measure $\nu$) is absolutely continuous with respect to the Lebesgue measure. As a by-product, we get rather explicit upper bounds on the resulting Lebesgue density $dN(E)/dE = \nu(dE)/dE$, called the density of states. Results of this genre date back to [65] and go nowadays under the name Wegner estimates.

3.1. A Wegner estimate. The main aim of this subsection is to extend the Wegner estimate in [23] to the case with magnetic fields. For this purpose we recall from there

Definition 3.1. A random potential $V : \Omega \times \mathbb{R}^d \to \mathbb{R}$ admits a $(U, \lambda, u, \rho)$-decomposition if there exists a random potential $U : \Omega \times \mathbb{R}^d \to \mathbb{R}$, a random variable $\lambda : \Omega \to \mathbb{R}$ and a real-valued $u \in L^\infty_{\mathrm{loc}}(\mathbb{R}^d)$ such that

(i) $V^{(\omega)} = U^{(\omega)} + \lambda^{(\omega)} u$ for $\mathbb{P}$-almost all $\omega \in \Omega$,

(ii) the conditional probability distribution of $\lambda$ relative to the sub-sigma-algebra generated by the family of random variables $\{U(x)\}_{x \in \mathbb{R}^d}$ has a jointly measurable density $\rho : \Omega \times \mathbb{R} \to [0, \infty[$ with respect to the Lebesgue measure on $\mathbb{R}$.

The condition $u \in L^\infty_{\mathrm{loc}}(\mathbb{R}^d)$ was missed out in [23, Def. 2]. We now state the following generalization of [23, Thm. 2] which in its turn relies on a result in [11].

Theorem 3.1. Let $\Lambda \subset \mathbb{R}^d$ be a bounded open cube. Let $\Lambda = \big( \bigcup_{j=1}^{J} \overline{\Lambda_j} \big)^{\mathrm{int}}$ be decomposed into the interior of the closure of finitely many, $J \in \mathbb{N}$, pairwise disjoint bounded open cubes $\Lambda_j \subset \mathbb{R}^d$. Let the potentials $A$ and $V$ be supplied with the properties (B) and (F), respectively. Assume that for each $j \in \{1, \dots, J\}$ the random potential $V$ admits a $(U_j, \lambda_j, u_j, \rho_j)$-decomposition subject to the following three conditions: there exist five strictly positive constants $v_1, v_2, \beta, R, Z > 0$ such that for all $j \in \{1, \dots, J\}$,

(i) $v_1\, \chi_{\Lambda_j}(x) \le u_j(x)$ and $u_j(x)\, \chi_{\Lambda_j}(x) \le v_2$ for Lebesgue-almost all $x \in \mathbb{R}^d$,

(ii) $\operatorname{ess\,sup}_{\xi \in \mathbb{R}} \rho_j^{(\omega)}(\xi)\, \max\{ e^{-\beta v_1 \xi}, e^{-\beta v_2 \xi} \} \le R$ for $\mathbb{P}$-almost all $\omega \in \Omega$,

(iii) $\mathbb{E}\big\{ \mathrm{Tr}\, e^{-\beta H_{\Lambda_j,N}(A, U_j)} \big\} \le |\Lambda_j|\, Z$.


Then the averaged number of eigenvalues of the finite-volume operator $H_{\Lambda,X}(A, V)$ in any non-empty energy regime $I \in \mathcal{B}(\mathbb{R})$ of finite Lebesgue measure $|I|$ is bounded from above according to
$$ \mathbb{E}\big\{ \nu_{\Lambda,X}(I) \big\} \le |\Lambda|\, |I|\, \frac{R Z}{v_1}\, e^{\beta \sup I} \tag{3.1} $$
for both boundary conditions $X$. [Here $\sup I$ denotes the least upper bound of $I \subset \mathbb{R}$.]

Remark 3.1. The (Chebyshev–Markov) inequality $\chi_{[1,\infty[}(|\xi|) \le |\xi|$ implies
$$ \mathbb{P}\big\{ I \cap \operatorname{spec} H_{\Lambda,X}(A, V) \neq \emptyset \big\} = \mathbb{E}\big\{ \chi_{[1,\infty[}\big( \nu_{\Lambda,X}(I) \big) \big\} \le \mathbb{E}\big\{ \nu_{\Lambda,X}(I) \big\}. \tag{3.2} $$
Therefore the Wegner estimate (3.1) in particular bounds the probability of finding at least one eigenvalue of $H_{\Lambda,X}(A, V)$ in a given energy regime $I \in \mathcal{B}(\mathbb{R})$. Such bounds are a key ingredient of proofs of Anderson localization for multi-dimensional random Schrödinger operators, see [10, 49, 11, 24, 61] and references therein.

Proof (of Theorem 3.1). Since we follow exactly the strategy of the proof of [23, Thm. 2], we only remark that the two main steps in this proof remain valid in the presence of a vector potential $A$. The first step, used in inequality (27) of [23], concerns the lowering of the eigenvalues of the operator $H_{\Lambda,X}(A, V^{(\omega)})$ by so-called Dirichlet–Neumann bracketing in case $X = D$ and by the (subsequent) insertion of interfaces in $\Lambda$ with the requirement of Neumann boundary conditions. For $A \neq 0$, supplied with property (B), the validity of these two techniques is established in Appendix A. The second step is an application of a spectral-averaging estimate of [11], which is re-phrased as Lemma 3.1 below. Since there the operator $L$ is only required to be self-adjoint and does not enter the r.h.s. of (3.3), it makes no difference if $L$ is taken as $H_{\Lambda,X}(0, U_j)$ (as is done in [23]) or as $H_{\Lambda,X}(A, U_j)$ for each $j \in \{1, \dots, J\}$.

An essential tool in the preceding proof is the (simple extension of the) abstract one-parameter spectral-averaging estimate of [11]; in this context see also [13].

Lemma 3.1. Let $K$, $L$ and $M$ be three self-adjoint operators acting on a Hilbert space $\mathcal{H}$ with $K$ and $M$ bounded such that $\kappa := \inf_{K\varphi \neq 0} \langle \varphi, M \varphi \rangle / \langle \varphi, K^2 \varphi \rangle > 0$ is strictly positive. Moreover, let $g \in L^\infty(\mathbb{R})$. Then the inequality
$$ \int_{\mathbb{R}} d\xi\, |g(\xi)|\, \big\langle \psi, K\, \chi_I(L + \xi M)\, K \psi \big\rangle \le |I|\, \frac{\|g\|_\infty}{\kappa}\, \langle \psi, \psi \rangle \tag{3.3} $$
holds for all $\psi \in \mathcal{H}$ and all $I \in \mathcal{B}(\mathbb{R})$.

Proof. Since the assumption $\kappa > 0$ implies the operator inequality $\kappa K^2 \le M$, the lemma is proven as Cor. 4.2 in [11] for any positive bounded function $g$ with compact support. It extends to positive bounded functions with arbitrary support by a monotone-convergence argument.
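The mechanism behind (3.3) can be made explicit in a commuting toy case. The following Python sketch is an illustration only and not the argument of [11]: it assumes a finite-dimensional Hilbert space, diagonal $L$ and $M$ (with strictly positive diagonal entries of $M$), $K$ equal to the identity, and $g$ the indicator function of a window $[a, b]$, so that $\|g\|_\infty = 1$ and $\kappa$ is the smallest diagonal entry of $M$. All function names and numerical values are hypothetical.

```python
def overlap(lo1, hi1, lo2, hi2):
    """Length of the intersection of the intervals [lo1, hi1] and [lo2, hi2]."""
    return max(0.0, min(hi1, hi2) - max(lo1, lo2))


def averaged_projection(levels, slopes, psi, interval, window):
    """Left-hand side of (3.3) for diagonal L, M and K = identity.

    The i-th eigenvalue of L + xi*M is levels[i] + xi*slopes[i]; it lies in
    `interval` exactly when xi lies in an interval of length |I|/slopes[i].
    g is taken to be the indicator function of `window`.
    """
    e_lo, e_hi = interval
    a, b = window
    total = 0.0
    for e, m, c in zip(levels, slopes, psi):
        total += abs(c) ** 2 * overlap((e_lo - e) / m, (e_hi - e) / m, a, b)
    return total


def lemma_bound(slopes, psi, interval):
    """Right-hand side of (3.3) with sup-norm of g equal to 1, kappa = min(slopes)."""
    e_lo, e_hi = interval
    kappa = min(slopes)
    norm_sq = sum(abs(c) ** 2 for c in psi)
    return (e_hi - e_lo) * norm_sq / kappa


levels = [-1.0, 0.3, 2.0]      # eigenvalues of L (illustrative values)
slopes = [0.5, 1.0, 4.0]       # eigenvalues of M, all strictly positive
psi = [0.6, 0.8j, 1.0]         # components of psi in the common eigenbasis
interval = (0.0, 0.25)         # energy regime I
window = (-10.0, 10.0)         # support of g

lhs = averaged_projection(levels, slopes, psi, interval, window)
rhs = lemma_bound(slopes, psi, interval)
print(lhs, rhs)
```

In this special case each eigenvalue moves through $I$ with speed at least $\kappa$ as $\xi$ varies, which is why the averaged spectral projection is controlled by $|I|/\kappa$; the general lemma does not require $L$ and $M$ to commute.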



3.2. Upper bounds on the density of states. If the fraction $RZ/v_1$ on the r.h.s. of the Wegner estimate (3.1) is independent of $\Lambda$ for sufficiently large $|\Lambda|$, this estimate enables one to prove the absolute continuity of the infinite-volume density-of-states measure with a magnetic field.

Corollary 3.1. Let $A$ and $V$ have the properties (C), (S), (I) and (E). Suppose furthermore:

(i) there exists a sequence $(\Lambda)$ of bounded open cubes $\Lambda \subset \mathbb{R}^d$ with $\Lambda \uparrow \mathbb{R}^d$ such that infinitely many of them admit a decomposition $\Lambda = \big( \bigcup_{j=1}^{J} \overline{\Lambda_j} \big)^{\mathrm{int}}$ into a finite number $J$ (depending on $\Lambda$) of pairwise disjoint open cubes $\Lambda_1, \dots, \Lambda_J$.

(ii) $V$ obeys the assumptions of Theorem 3.1 for every such decomposition with constants $\beta, v_1, R, Z > 0$, all of them not depending on $\Lambda$.

Then the density-of-states measure $\nu$ is absolutely continuous with respect to the Lebesgue measure. Moreover, its Lebesgue density $w$, called the density of states, is locally bounded according to
$$ w(E) := \frac{\nu(dE)}{dE} = \frac{dN(E)}{dE} \le \frac{R Z}{v_1}\, e^{\beta E} =: W(E) \tag{3.4} $$
for Lebesgue-almost all energies $E \in \mathbb{R}$.

Proof. Let $I \subset \mathbb{R}$ be bounded and open. Then (2.12) together with [6, Satz 30.2] implies that $\nu(I) \le \liminf_{\Lambda \uparrow \mathbb{R}^d} |\Lambda|^{-1} \nu^{(\omega)}_{\Lambda,X}(I)$ for $\mathbb{P}$-almost all $\omega \in \Omega$. Therefore, by the non-randomness of the density-of-states measure $\nu$ and Fatou’s lemma we have
$$ \nu(I) \le \liminf_{\Lambda \uparrow \mathbb{R}^d} \frac{\mathbb{E}\{ \nu_{\Lambda,X}(I) \}}{|\Lambda|} \le |I|\, \frac{R Z}{v_1}\, e^{\beta \sup I}. \tag{3.5} $$
Here we used (3.1) and the assumption that the constants involved there do not depend on $\Lambda$. Now the Radon–Nikodým theorem yields the claimed absolute continuity of $\nu$.

4. Examples Illustrating the Results of Section 3

Assumption (iii) of Theorem 3.1 may be checked in various ways. For example, by the diamagnetic inequality (A.24) of Appendix A for Neumann partition functions one sees that a possible choice of $Z$ in (3.1) is
$$ Z_1 := \max_{1 \le j \le J} |\Lambda_j|^{-1}\, \mathbb{E}\big\{ \mathrm{Tr}\, e^{-\beta H_{\Lambda_j,N}(0, U_j)} \big\}. \tag{4.1} $$
This yields an upper bound on $\mathbb{E}\{ \nu_{\Lambda,X}(I) \}$ in (3.1) which is independent of the magnetic field and, in particular, coincides with the one in [23, Thm. 2]. Rather weak conditions on the random potential $U_j$ assuring the finiteness of the expectation value in (4.1) can be found in [21]. Another choice of $Z$ results from applying the following averaged Golden–Thompson inequality.



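Lemma 4.1. Let $\Lambda \subset \mathbb{R}^d$ be a bounded open cube and assume that $A$ and $V$ enjoy properties (B) and (F). Then the averaged partition function of $H_{\Lambda,X}(A, V)$ is bounded for all $\beta > 0$ according to
$$ \mathbb{E}\big\{ \mathrm{Tr}\, e^{-\beta H_{\Lambda,X}(A, V)} \big\} \le \mathrm{Tr}\, e^{-\beta H_{\Lambda,X}(A, 0)}\; \operatorname{ess\,sup}_{x \in \Lambda} \mathbb{E}\big\{ e^{-\beta V(x)} \big\}, \tag{4.2} $$
provided that the essential supremum on the r.h.s. is finite.

Proof. We proceed as in the proof of [36, Thm. 3.4(ii)] and define $V_n^{(\omega)}(x) := \max\{ -n, V^{(\omega)}(x) \}$ for $n \in \mathbb{N}$ and $\omega \in \Omega_F$. The Golden–Thompson inequality [53] yields
$$ \mathrm{Tr}\, e^{-\beta H_{\Lambda,X}(A, V_n^{(\omega)})} \le \mathrm{Tr}\big[ e^{-\beta H_{\Lambda,X}(A, 0)}\, e^{-\beta V_n^{(\omega)}} \big]. \tag{4.3} $$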

We then evaluate the trace on the r.h.s. in an orthonormal eigenbasis of $H_{\Lambda,X}(A, 0)$. Using Fubini’s theorem, the probabilistic expectation of the quantum-mechanical expectation of $\exp(-\beta V_n)$ with respect to a normalized eigenfunction of $H_{\Lambda,X}(A, 0)$ is estimated by $\operatorname{ess\,sup}_{x \in \Lambda} \mathbb{E}\{ \exp(-\beta V_n(x)) \}$, which is smaller than the second factor on the r.h.s. of (4.2) since $V \le V_n$. The proof is completed by noting that the l.h.s. of (4.3) converges for $n \to \infty$ to the trace on the l.h.s. of (4.2) by monotone convergence of forms [51, Thm. S.16], similar to the proof of [36, Prop. 2.1(e)].

Using (4.2) one gets
$$ Z_2 := \max_{1 \le j \le J} |\Lambda_j|^{-1}\, \mathrm{Tr}\, e^{-\beta H_{\Lambda_j,N}(A, 0)}\; \operatorname{ess\,sup}_{x \in \Lambda_j} \mathbb{E}\big\{ e^{-\beta U_j(x)} \big\} \tag{4.4} $$
as another choice for $Z$ in (3.1). By (A.24) one may further estimate the magnetic Neumann partition function in (4.4) according to
$$ \mathrm{Tr}\, e^{-\beta H_{\Lambda,N}(A, 0)} \le \mathrm{Tr}\, e^{-\beta H_{\Lambda,N}(0, 0)} \le |\Lambda|\, \big( |\Lambda|^{-1/d} + (2\pi\beta)^{-1/2} \big)^{d}. \tag{4.5} $$
The second inequality follows from the explicitly known [53, p. 266] spectrum of $H_{\Lambda,N}(0, 0)$. Applying (4.5) to (4.4) one weakens $Z_2$ to a rather explicit choice of $Z$ in (3.1) given by
$$ Z_3 := \max_{1 \le j \le J} \big( |\Lambda_j|^{-1/d} + (2\pi\beta)^{-1/2} \big)^{d}\; \operatorname{ess\,sup}_{x \in \Lambda_j} \mathbb{E}\big\{ e^{-\beta U_j(x)} \big\}. \tag{4.6} $$
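The second inequality in (4.5) can be checked numerically. The following Python sketch is an illustration under stated assumptions, not part of the paper: it uses the standard fact that the Neumann realization of $-\frac{1}{2}\Delta$ (the factor $1/2$ as in (2.4)) on a cube of edge length $\ell$ has the eigenvalues $\frac{\pi^2}{2\ell^2}(n_1^2 + \dots + n_d^2)$ with $n_j \in \{0, 1, 2, \dots\}$, and it truncates the resulting series once the terms fall below a tolerance. The function names and the sample parameters are hypothetical; the first (diamagnetic) inequality in (4.5) is not tested here.

```python
import math


def neumann_partition_function(beta, ell, d, tol=1e-16):
    """Tr exp(-beta * H) for H = -(1/2) * Laplacian with Neumann boundary
    conditions on a cube of edge length ell in d dimensions.

    The eigenvalues are (pi^2 / (2 ell^2)) * (n_1^2 + ... + n_d^2) with
    n_j = 0, 1, 2, ..., so the trace factorizes into d identical sums.
    """
    a = beta * math.pi ** 2 / (2.0 * ell ** 2)
    one_dim, n = 0.0, 0
    while True:
        term = math.exp(-a * n * n)
        one_dim += term
        if term < tol:  # remaining terms are negligible
            break
        n += 1
    return one_dim ** d


def neumann_bound(beta, ell, d):
    """Right-hand side of (4.5): |Lambda| * (|Lambda|^(-1/d) + (2 pi beta)^(-1/2))^d."""
    volume = ell ** d
    return volume * (1.0 / ell + 1.0 / math.sqrt(2.0 * math.pi * beta)) ** d


samples = [(0.1, 1.0, 2), (1.0, 1.0, 3), (0.01, 5.0, 2), (10.0, 2.0, 4)]
for beta, ell, d in samples:
    print(beta, ell, d,
          neumann_partition_function(beta, ell, d),
          neumann_bound(beta, ell, d))
```

The bound reflects the elementary estimate $\sum_{n \ge 0} e^{-a n^2} \le 1 + \int_0^\infty dx\, e^{-a x^2}$ applied in each coordinate direction.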

4.1. Alloy-type random potentials. The existence of a $(U, \lambda, u, \rho)$-decomposition of $V$ as required in Theorem 3.1 is immediate for alloy-type random potentials whose coupling strengths are distributed according to a Borel probability measure on the real line with a bounded Lebesgue density. To illustrate the essentials of Theorem 3.1 we first consider the case of positive potentials.



Corollary 4.1. Let $A$ and $V$ have the properties (B) and (A). Assume that $u_0 \in L^\infty_{\mathrm{loc}}(\mathbb{R}^d)$ and that the probability distribution of $\lambda_0$ has a Lebesgue density $g \in L^\infty(\mathbb{R})$ with support in the positive half-line $[0, \infty[$. Furthermore, suppose that there exist two strictly positive constants $v_1, v_2 > 0$ such that
$$ v_1\, \chi_{\Lambda(0)}(x) \le u_0(x) \quad \text{and} \quad u_0(x)\, \chi_{\Lambda(0)}(x) \le v_2 \tag{4.7} $$
for Lebesgue-almost all $x \in \mathbb{R}^d$. Then for each bounded open cube of the form
$$ \Lambda = \Big( \bigcup_{j \in \Lambda \cap \mathbb{Z}^d} \overline{\Lambda(j)} \Big)^{\mathrm{int}}, \tag{4.8} $$
one has
$$ \mathbb{E}\big\{ \nu_{\Lambda,X}(I) \big\} \le |\Lambda|\, |I|\, W_A(\sup I) \tag{4.9} $$
for both $X = D$ and $X = N$ and all $I \in \mathcal{B}(\mathbb{R})$. Here $W_A$ is the function
$$ \mathbb{R} \ni E \mapsto W_A(E) := \big( 1 + (2\pi\beta)^{-1/2} \big)^{d}\, \frac{\|g\|_\infty}{v_1}\, e^{\beta E} \tag{4.10} $$
with $\beta \in\, ]0, \infty[$ serving as a variational parameter.

Proof. For each $j \in \Lambda \cap \mathbb{Z}^d$, the choice $u_j(x) := u_0(x - j)$ and $U_j^{(\omega)}(x) := V^{(\omega)}(x) - \lambda_j^{(\omega)} u_j(x)$ yields a $(U_j, \lambda_j, u_j, g)$-decomposition of $V$ in the sense of Definition 3.1. It remains to verify the three assumptions of Theorem 3.1. Assumption (i) is guaranteed by (4.7). Assumption (ii) is fulfilled with $R = \|g\|_\infty$. To verify assumption (iii), we make use of (4.6) and observe that $U_j^{(\omega)} \ge 0$.

Remark 4.1. (i) The estimates in the proof of Corollary 4.1, when specializing the fraction $RZ/v_1$ of Theorem 3.1 to $W_A$, were unnecessarily rough for the sake of simplicity. In specific examples the upper bound $W_A$ may be improved. Moreover, more general alloy-type random potentials are also covered by Theorem 3.1. In particular, the random potential may be unbounded from below, see the next corollary. Furthermore, one may allow for correlated coupling strengths $\{\lambda_j\}$ as long as the relevant conditional probabilities have bounded Lebesgue densities.

(ii) Apart from the existence of a bounded Lebesgue density for the coupling strength $\lambda_0$ one further restrictive assumption of Corollary 4.1 is the fact that the single-site potential $u_0$ must possess a definite sign. The latter may be slightly weakened such that one may treat certain $u_0$ taking on values of both signs by choosing a more complicated decomposition different from the natural one used in the proof of Corollary 4.1. This basically corresponds to the linear-transformation technique introduced in [63] which turns certain given alloy-type random potentials into ones with positive single-site potentials and correlated coupling strengths, see the previous Remark 4.1(i). In any case, the fact that $u_0$ must possess a sufficiently large support is believed to be important for the absolute continuity of the integrated density of states in the presence of a magnetic field, see Remark 4.3(ii).



(iii) We only know of [12, 4, 5, 64] where Wegner estimates for magnetic Schrödinger operators with alloy-type random potentials have been derived.$^1$ The Wegner estimate of [4] is proven for energies in pre-supposed gaps of the spectrum of $H(A, 0)$. The other three works consider the case of two space dimensions and a perpendicular constant magnetic field, see Subsect. 4.3, especially Remark 4.3(iii) and 4.3(iv).

$^1$ See, however, note added in proof.

We close this subsection by considering the example of an unbounded below alloy-type random potential with exponentially decaying probability density for its (independent) coupling strengths. This example is marginal in the sense that any such density has to fall off at minus infinity at least as fast as exponentially in order to ensure the applicability of Theorem 3.1.

Corollary 4.2. Let $A$ and $V$ have the properties (B) and (A). Assume a Laplace distribution for $\lambda_0$, that is
$$ \mathbb{P}\{ \lambda_0 \in I \} = \frac{1}{2\alpha} \int_I d\xi\, e^{-|\xi|/\alpha}, \qquad I \in \mathcal{B}(\mathbb{R}), \tag{4.11} $$
with some $\alpha > 0$. Furthermore, suppose that $u_0 \in L^\infty_{\mathrm{loc}}(\mathbb{R}^d)$ and that (4.7) holds with some $v_1, v_2 > 0$ and let
$$ K_\beta := - \operatorname{ess\,inf}_{x \in \Lambda(0)} \sum_{j \in \mathbb{Z}^d} \ln\big( 1 - [\beta \alpha\, u_0(x - j)]^2 \big) < \infty \tag{4.12} $$
be finite for some $\beta \in\, ]0, (\alpha \|u_0\|_\infty)^{-1}[$. Finally, let $\Lambda$ be of the form (4.8). Then (4.9) holds where $W_A$ may be taken as the function
$$ \mathbb{R} \ni E \mapsto W_A(E) := \big( 1 + (2\pi\beta)^{-1/2} \big)^{d}\, \frac{1 - (\beta \alpha v_1)^2}{2 \alpha v_1}\, e^{\beta E + K_\beta} \tag{4.13} $$
with $\beta \in \big\{ \beta' \in\, ]0, (\alpha \|u_0\|_\infty)^{-1}[ \; : \; K_{\beta'} < \infty \big\}$ serving as a variational parameter.

Proof. The proof is analogous to that of Corollary 4.1. To verify the assumptions of Theorem 3.1 we note that assumption (i) is guaranteed by (4.7). Assumption (ii) is fulfilled with $R = (2\alpha)^{-1}$ if $\beta \in\, ]0, (\alpha v_2)^{-1}]$. As for assumption (iii), we make use of (4.6) and explicitly compute the involved expectation if $\beta \in\, ]0, (\alpha \|u_0\|_\infty)^{-1}[$.

4.2. Gaussian random potentials. As another application of Theorem 3.1 we note that the Wegner estimate derived previously [23, Thm. 1] for certain Gaussian random potentials and the case without magnetic field remains valid in the present setting. The reason for this is the fact that every Wegner estimate stemming from [23, Thm. 2] is also one in the presence of a magnetic field thanks to the diamagnetic inequality.

Corollary 4.3. Let $A$ and $V$ have the properties (B) and (G). Moreover, assume that there exist a finite signed Borel measure $\mu$ on $\mathbb{R}^d$, which is normalized in the sense that $\int_{\mathbb{R}^d} \mu(d^dx) \int_{\mathbb{R}^d} \mu(d^dy)\, C(x - y) = C(0)$, an open subset $\Gamma \subset \mathbb{R}^d$ with volume $|\Gamma| > 0$ and a constant $\gamma > 0$ such that the covariance function $C$ of $V$ obeys
$$ \gamma\, \chi_\Gamma(x) \le (C(0))^{-1} \int_{\mathbb{R}^d} \mu(d^dy)\, C(x - y) =: (C(0))^{-1/2}\, u(x) \tag{4.14} $$
for all $x \in \mathbb{R}^d$. Then for each $\ell > 0$, for which there exists a bounded open cube $\Lambda(\ell) \subseteq \Gamma$ with edges of length $\ell$ parallel to the co-ordinate axes, and each bounded open cube $\Lambda \subset \mathbb{R}^d$ satisfying the matching condition $|\Lambda|^{1/d}/\ell \in \mathbb{N}$, one has
$$ \mathbb{E}\big\{ \nu_{\Lambda,X}(I) \big\} \le |\Lambda|\, |I|\, W_G(\sup I) \tag{4.15} $$
for both $X = D$ and $X = N$ and all $I \in \mathcal{B}(\mathbb{R})$. Here $W_G$ is the function
$$ \mathbb{R} \ni E \mapsto W_G(E) := \big( 2\ell^{-1} + (2\pi\beta)^{-1/2} \big)^{d}\, \frac{\exp\big( \beta E + \beta^2 C_\ell/2 \big)}{\sqrt{2\pi C(0)}\; b_\ell} \tag{4.16} $$
where we introduced the constants $C_\ell := C(0)\big( 1 + B_\ell^2 - b_\ell^2 \big)$, $B_\ell := (C(0))^{-1/2} \sup_{x \in \Lambda(\ell)} u(x)$ and $b_\ell := (C(0))^{-1/2} \inf_{x \in \Lambda(\ell)} u(x) \ge \gamma$. Finally, $\beta \in\, ]0, \infty[$ serves, besides $\ell$, as a second variational parameter.

Proof. The key input is the fact that every Gaussian random potential $V$ admits a $(U, \lambda, u, \rho)$-decomposition in the sense of Definition 3.1. More precisely, $\lambda^{(\omega)} := (C(0))^{-1/2} \int_{\mathbb{R}^d} \mu(d^dx)\, V^{(\omega)}(x)$ is a standard Gaussian random variable with Lebesgue density $\rho(\xi) := (2\pi)^{-1/2} \exp\big( -\xi^2/2 \big)$. This random variable and the Gaussian random field $U^{(\omega)}(x) := V^{(\omega)}(x) - \lambda^{(\omega)} u(x)$, where $u$ is defined in (4.14), are stochastically independent. For details see the proof of [23, Thm. 1]. To obtain the specific form $W_G$, which is independent of the magnetic field, we used (4.6).

Remark 4.2. (i) Without loss of generality, every measure $\mu$ yielding (4.14) may be normalized in the sense of the assumption in the above corollary. The measure $\mu$ allows one to apply Corollary 4.3 to Gaussian random potentials with certain covariance functions taking on also negative values. Examples are given in [23, 30].

(ii) If $C(x) \ge 0$ for all $x \in \mathbb{R}^d$, we may choose $\mu$ equal to Dirac’s point measure at the origin. Due to the continuity of $C$ and since $C(0) > 0$, condition (4.14) is then fulfilled with some sufficiently small cube $\Gamma$ containing the origin and $\gamma = \inf_{x \in \Gamma} C(x)/C(0)$. Under stronger conditions on the vector potential $A$ the Wegner estimate for this case has been stated in [24, Prop. 2.14] where it serves as one input for a proof of Anderson localization by certain Gaussian random potentials, see Remark 3.1.

(iii) Choosing $\ell = |E|^{-1/4}$ and $\beta = (2C_\ell)^{-1} \big( \sqrt{E^2 + 2d\, C_\ell} - E \big)$ we obtain the following leading low- and high-energy behaviour:
$$ \lim_{E \to -\infty} \frac{\ln W_G(E)}{E^2} = -\frac{1}{2C(0)}, \qquad \lim_{E \to \infty} \frac{W_G(E)}{E^{d/2}} = \frac{(e/(\pi d))^{d/2}}{\sqrt{2\pi}\; u(0)}. \tag{4.17} $$
Since $W_G$ provides an upper bound on the density of states (see Corollary 3.1), its low-energy behaviour is optimal in the sense that it coincides with that of the derivative of the known low-energy behaviour of the integrated density of states [43, 62, 8]. This is not true for the high-energy behaviour. It is known [43, 62] that the high-energy growth of the integrated density of states is neither affected by the random potential nor by the magnetic field and proportional to $E^{d/2}$ for $E \to \infty$ in analogy to Weyl’s celebrated asymptotics for the free particle [66]. Note that the constant on the r.h.s. of the second equation in (4.17) is smaller than the one given by [23, Eq. (14)].
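The choice of $\beta$ in Remark 4.2(iii) can be traced to a one-variable minimization. The following Python sketch is a simplification and not the computation of the paper: it drops the $\ell$-dependent term and the constant prefactor in (4.16), keeps only the factor $(2\pi\beta)^{-d/2} \exp(\beta E + \beta^2 C/2)$ with a fixed constant $C$ standing in for $C_\ell$, and checks numerically that the stated $\beta$ is its stationary point and reproduces the leading behaviour in (4.17). The function names and the numerical values are hypothetical.

```python
import math


def beta_star(E, C, d):
    """Positive root of C*beta^2 + E*beta - d/2 = 0, the beta of Remark 4.2(iii).

    For E >= 0 the algebraically equivalent form d / (sqrt(E^2 + 2dC) + E)
    avoids cancellation between two nearly equal numbers.
    """
    root = math.sqrt(E * E + 2.0 * d * C)
    if E >= 0:
        return d / (root + E)
    return (root - E) / (2.0 * C)


def log_leading_factor(E, beta, C, d):
    """Logarithm of (2 pi beta)^(-d/2) * exp(beta*E + beta^2*C/2)."""
    return beta * E + 0.5 * C * beta * beta - 0.5 * d * math.log(2.0 * math.pi * beta)


C, d = 1.0, 2

# High energies: the minimized factor grows like (e/(pi d))^(d/2) * E^(d/2).
E_high = 1.0e4
b_high = beta_star(E_high, C, d)
ratio_high = math.exp(log_leading_factor(E_high, b_high, C, d)) / E_high ** (d / 2)
target_high = (math.e / (math.pi * d)) ** (d / 2)

# Low energies: the logarithm behaves like -E^2 / (2C).
E_low = -1.0e3
b_low = beta_star(E_low, C, d)
ratio_low = log_leading_factor(E_low, b_low, C, d) / E_low ** 2

print(ratio_high, target_high, ratio_low, -1.0 / (2.0 * C))
```

Since $\beta \mapsto \beta E + \beta^2 C/2 - (d/2) \ln(2\pi\beta)$ is convex on $]0, \infty[$, the stationary point is the unique minimizer of this simplified expression.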



4.3. Two space dimensions: random Landau Hamiltonians. In this subsection we consider the special case of two space dimensions and a perpendicular constant magnetic field of strength $B := B_{12} > 0$. Accordingly, the vector potential in the symmetric gauge is given by
$$ A(x) = \frac{B}{2} \begin{pmatrix} -x_2 \\ x_1 \end{pmatrix}, \qquad x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \in \mathbb{R}^2. \tag{4.18} $$
This case has received considerable attention during the last three decades [2, 37] in the physics of low-dimensional electronic structures. The magnetic Schrödinger operator on $L^2(\mathbb{R}^2)$ modelling the non-relativistic motion of a particle with unit charge on the Euclidean plane $\mathbb{R}^2$ under the influence of this magnetic field is the Landau Hamiltonian. Its spectral resolution dates back to Fock [25] and Landau [38] and is given by the strong-limit relation
$$ H(A, 0) = \frac{B}{2} \sum_{l=0}^{\infty} (2l + 1)\, P_l. \tag{4.19} $$
The energy eigenvalue $(l + 1/2)B$ is called the $l$-th Landau level and the corresponding orthogonal eigenprojection $P_l$ is an integral operator with continuous complex-valued kernel
$$ P_l(x, y) := \frac{B}{2\pi} \exp\Big[ i\, \frac{B}{2} (x_2 y_1 - x_1 y_2) - \frac{B}{4} |x - y|^2 \Big]\, L_l\Big( \frac{B}{2} |x - y|^2 \Big), \tag{4.20} $$
given in terms of the $l$-th Laguerre polynomial $\xi \mapsto L_l(\xi) := \frac{1}{l!}\, e^{\xi}\, \frac{d^l}{d\xi^l} \big( \xi^l e^{-\xi} \big)$, $\xi \ge 0$, [27, Sect. 8.97]. The diagonal $P_l(x, x) = B/(2\pi)$ is naturally interpreted as the degeneracy per area of the $l$-th Landau level. Using definition (2.10) with $V = 0$, the integrated density of states of the Landau Hamiltonian (4.19) turns out to be the well-known “staircase” function
$$ N(E) = \frac{B}{2\pi} \sum_{l=0}^{\infty} \Theta\Big( E - \Big( l + \frac{1}{2} \Big) B \Big), \qquad V = 0, \tag{4.21} $$
which is obviously not absolutely continuous with respect to the Lebesgue measure. For the derivation of (4.21) one may apply [51, Thm. VI.23] because the operator $P_l\, \chi_{\Lambda(0)}$ is Hilbert-Schmidt, more precisely $\mathrm{Tr}[\chi_{\Lambda(0)} P_l\, \chi_{\Lambda(0)}] = B/(2\pi) < \infty$. Alternatively one may compute [45, App. B] the infinite-area limit $\lim_{\Lambda \uparrow \mathbb{R}^2} |\Lambda|^{-1}\, \mathrm{Tr}\, \Theta\big( E - H_{\Lambda,X}(A, 0) \big)$ for some boundary condition $X$. The result coincides with (4.21) by Prop. 2.3.

Informally, the density of states associated with (4.21) is a series of Dirac delta functions supported at the Landau levels. The corresponding infinities are indicated by vertical lines in Fig. 4.1 and together form what might be called a “Dirac half-comb”. By adding a random potential $V$ to (4.19), the delta peaks are expected to be smeared out. In fact, under the assumptions of Corollary 3.1 they are smeared out completely in the sense that the density of states $w$ of the arising random Landau Hamiltonian $H(A, V) = H(A, 0) + V$ is shown there to be locally bounded. For example, in the presence of a Gaussian random potential with the Gaussian covariance function $C(x) = C(0) \exp\big( -|x|^2/(2\tau^2) \big) > 0$, $\tau > 0$, Fig. 4.1 contains the graph of the upper bound $W_G$ on $w$ given in (4.16) after (numerically) minimizing with



Fig. 4.1. Plot of the upper bound $W_G(E)$ on $w(E)$ as a function of the energy $E$ (energy axis marked at $0$, $B/2$, $3B/2$ and $5B/2$; vertical axis marked at $1/2\pi$). Here $w$ is the density of states of the Landau Hamiltonian with a Gaussian random potential with Gaussian covariance function. The dashed line is a plot of the graph of an approximation to $w$. The exact $w$ is unknown. Vertical lines indicate the delta peaks which reflect the non-existence of the density of states without random potential $V$. The step function $\Theta(E)/2\pi$ (not shown) is the free density of states characterized by $B = 0$ and $V = 0$. (See text)

respect to β, @ and a certain one-parameter subclass of possible decompositions of V. Here we picked a (small) disorder parameter, $C(0) = (B/5)^2$, and a (large) correlation length, $\tau = 100\,B^{-1/2}$. We recall that the function $W_G$ is independent of B due to our application of the diamagnetic inequality, but nevertheless provides an upper bound on w for all $B \ge 0$. Therefore $W_G(E)$ is a rather rough estimate of $w(E)$ already for energies $E < B/2$ and, in particular, starts increasing significantly at too low energies. Nevertheless, the upper bound shows that the density of states w has no infinities for arbitrarily weak disorder, that is, for arbitrarily small $C(0) > 0$. In fact, in the above situation we believe the graph of w to look similar to the dashed line in Fig. 4.1. We conclude this subsection with several remarks:

Remark 4.3.
(i) Unfortunately, our upper bound W in (3.4) is never sharp enough to reflect the expected "magneto-oscillations" of w. Instead, by construction W is always increasing.
(ii) The assumptions of Corollary 3.1 guarantee in particular that there occurs no point component in the Lebesgue decomposition of the density-of-states measure ν. Using Corollary 2.1, this implies that any given energy $E \in \mathbb{R}$, in particular any Landau-level energy, is $\mathbb{P}$-almost surely no eigenvalue under these assumptions. This stands in contrast to a certain situation with random point impurities, in which case the authors of [20] show that finitely many Landau-level energies remain infinitely degenerate eigenvalues if B is sufficiently large.
(iii) Exploiting the existence of spectral gaps of $H(A, 0)$, a Wegner estimate for Landau Hamiltonians with alloy-type random potentials is derived in [12, 4, 5] which proves that ν is absolutely continuous when restricted to intervals between the Landau-level energies. For this result to hold the authors were able to weaken the assumption (4.7) on the size of the support of the single-site potential which our Corollary 4.1 requires.
On

the other hand, absolute continuity of ν at all energies is proven in [12] only for bounded random potentials under the present assumptions on the support.
(iv) In [64] a Wegner estimate for alloy-type random potentials is derived without assuming a definite sign of the single-site potential. However, this estimate holds only between the Landau-level energies for sufficiently strong magnetic field and does not enable one to deduce the existence of the density of states.
(v) In [30] the integrated density of states associated with the restricted random Landau Hamiltonian $P_l H(A, V) P_l$ of a single but arbitrary Landau level is shown to be absolutely continuous for Gaussian random potentials satisfying the assumptions of Corollary 4.3 (for d = 2).

A. On Finite-Volume Schrödinger Operators with Magnetic Fields

For the convenience of the reader (and the authors), this appendix defines non-random magnetic Schrödinger operators with Neumann boundary conditions and compiles some of their basic properties. In passing, the more familiar basic properties of the corresponding operators with Dirichlet boundary conditions are briefly recalled, see for example [42, 9]. In particular, we prove a diamagnetic inequality for Neumann Schrödinger operators and Dirichlet–Neumann bracketing for a wide class of vector potentials including singular ones. Altogether, this appendix may be understood to extend some of the results in the key papers [31, 32, 3, 57] to the case of Neumann boundary conditions.

Throughout this appendix, $\Lambda \subseteq \mathbb{R}^d$ denotes a non-empty open, not necessarily proper subset of d-dimensional Euclidean space $\mathbb{R}^d$ with $d \in \mathbb{N}$. Moreover, $a : \mathbb{R}^d \to \mathbb{R}^d$ stands for a vector potential and $v : \mathbb{R}^d \to \mathbb{R}$ for a scalar potential with $v_\pm := (|v| \pm v)/2$ denoting its positive respectively negative part. We will assume throughout that
\[
|a|^2, \; v_+ \in L^1_{\mathrm{loc}}(\mathbb{R}^d). \tag{A.1}
\]

The negative part $v_-$ is assumed to be a form perturbation either of $H_{\Lambda,N}(a, 0)$ or even of $H_{\Lambda,N}(0, 0)$. By this we mean that $v_-$ is form-bounded [52, Def. p. 168] with form bound strictly smaller than one either relative to $H_{\Lambda,N}(a, 0)$ or even to $H_{\Lambda,N}(0, 0)$. Both operators will be defined in Lemma A.1 below. The operator $H_{\Lambda,N}(0, 0)$ is the usual Neumann Laplacian, up to a factor of −1/2.

Remark A.1. By the diamagnetic inequality, see Prop. A.2 below, we will see that $v_-$ is a form perturbation of $H_{\Lambda,N}(a, 0)$ if it is one of $H_{\Lambda,N}(0, 0)$. If $\Lambda$ is a bounded open cube, an easy-to-check sufficient criterion for $v_-$ to be even infinitesimally form-bounded [52, Def. p. 168] relative to $H_{\Lambda,N}(0, 0)$ can be taken from [36, Lemma 2.1] and reads
\[
v_- \in L^p_{\mathrm{loc}}(\mathbb{R}^d) \tag{A.2}
\]

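with p = 1 if d = 1, some p > 1 if d = 2 and some p ≥ d/2 if d ≥ 3.

A.1. Definition of magnetic Neumann Schrödinger operators. In a first step, we consider the case v = 0 and $|a|^2 \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ or, equivalently, $a \in \big( L^2_{\mathrm{loc}}(\mathbb{R}^d) \big)^d$, that is, $a_j \in L^2_{\mathrm{loc}}(\mathbb{R}^d)$ for all $j \in \{1, \dots, d\}$. We define the sesquilinear form
\[
h^{a,0}_{\Lambda,N}(\varphi, \psi) := \frac{1}{2} \sum_{j=1}^{d} \big\langle (i\nabla + a)_j \varphi , \, (i\nabla + a)_j \psi \big\rangle \tag{A.3}
\]
for all $\varphi$ and $\psi$ in its form domain
\[
W_a^{1,2}(\Lambda) := \big\{ \phi \in L^2(\Lambda) \, : \, (i\nabla + a) \phi \in \big( L^2(\Lambda) \big)^d \big\}, \tag{A.4}
\]
which might be called a magnetic Sobolev space, see [39, Sect. 7.20] in case $\Lambda = \mathbb{R}^d$. Here and in the following, $\nabla - ia$ denotes the gauge-covariant gradient in the sense of distributions on $C_0^\infty(\Lambda)$. In particular, this means
\[
W_a^{1,2}(\Lambda) = \bigcap_{j=1}^{d} \big\{ \phi \in L^2(\Lambda) \, : \, \text{there is } \phi_j \in L^2(\Lambda) \text{ such that } \langle \phi , \, i\partial_j \eta + a_j \eta \rangle = \langle \phi_j , \eta \rangle \text{ for all } \eta \in C_0^\infty(\Lambda) \big\}. \tag{A.5}
\]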

Remark A.2. We emphasize that the condition $\psi \in W_a^{1,2}(\Lambda)$ allows for the case that neither $\nabla\psi$ nor $a\psi$ belongs to $\big( L^2(\Lambda) \big)^d$. In general, $\psi \in W_a^{1,2}(\Lambda)$ only implies $\nabla\psi \in \big( L^1_{\mathrm{loc}}(\Lambda) \big)^d$ and $|\psi| \in W^{1,2}(\Lambda) := \big\{ \phi \in L^2(\Lambda) : \nabla\phi \in \big( L^2(\Lambda) \big)^d \big\}$, the usual first-order Sobolev space of $L^2$-type. The latter statement is a consequence of the diamagnetic inequality, see Remark A.5(iv) below and [59]. If even $|a|^2 \in L^\infty(\mathbb{R}^d)$, the magnetic Sobolev space coincides with the usual one, $W_a^{1,2}(\Lambda) = W^{1,2}(\Lambda)$, up to equivalence of norms.

Basic facts about $h^{a,0}_{\Lambda,N}$ are summarized in

Lemma A.1. The form $h^{a,0}_{\Lambda,N}$ is densely defined on $L^2(\Lambda)$, symmetric, positive and closed. It therefore uniquely defines a self-adjoint positive operator $H_{\Lambda,N}(a, 0)$ on $L^2(\Lambda)$ which, up to a factor of −1/2, is called magnetic Neumann Laplacian.

Proof. Since $C_0^\infty(\Lambda) \subset W_a^{1,2}(\Lambda) \subset L^2(\Lambda)$ and $C_0^\infty(\Lambda)$ is dense in $L^2(\Lambda)$, the form $h^{a,0}_{\Lambda,N}$ is densely defined. Its symmetry and positivity are obvious from the definition. To prove that $h^{a,0}_{\Lambda,N}$ is also closed we have to show that the space $W_a^{1,2}(\Lambda)$ is complete with respect to the (metric induced by the form-) norm
\[
\sqrt{ \langle \phi , \phi \rangle + h^{a,0}_{\Lambda,N}(\phi, \phi) }. \tag{A.6}
\]

To this end, we proceed along the lines of Sects. 7.20 and 7.3 in [39] and let $(\phi_n)_{n \in \mathbb{N}}$ be a sequence in $W_a^{1,2}(\Lambda)$ which is Cauchy with respect to the norm (A.6). By completeness of $L^2(\Lambda)$, there exist functions $\phi, \psi_j \in L^2(\Lambda)$, $j \in \{1, \dots, d\}$, such that $\phi_n \to \phi$ and $(i\nabla + a)_j \phi_n \to \psi_j$ strongly in $L^2(\Lambda)$ as $n \to \infty$. Since $(i\nabla + a)_j \phi_n \to (i\nabla + a)_j \phi$ in the sense of distributions on $C_0^\infty(\Lambda)$ as $n \to \infty$, we have $(i\nabla + a)_j \phi = \psi_j$ and hence $\phi \in W_a^{1,2}(\Lambda)$. The existence and uniqueness of $H_{\Lambda,N}(a, 0)$ follow now from the one-to-one correspondence between densely defined, symmetric, bounded below, closed forms and self-adjoint, bounded below operators, see [51, Thm. VIII.15].

Remark A.3. (i) We recall that the operator $H_{\Lambda,N}(a, 0)$ has the subspace
\[
\mathcal{D}\big( H_{\Lambda,N}(a, 0) \big) := \big\{ \psi \in W_a^{1,2}(\Lambda) \, : \, \text{there is } \widetilde{\psi} \in L^2(\Lambda) \text{ such that } h^{a,0}_{\Lambda,N}(\varphi, \psi) = \langle \varphi , \widetilde{\psi} \rangle \text{ for all } \varphi \in W_a^{1,2}(\Lambda) \big\} \tag{A.7}
\]
of its underlying form domain as its operator domain and acts according to $H_{\Lambda,N}(a, 0) \, \psi = \widetilde{\psi}$.

(ii) Let $D_j(a)$ denote the closure of the symmetric operator $C_0^\infty(\Lambda) \ni \psi \mapsto (i\nabla + a)_j \psi \in L^2(\Lambda)$. Being the closure of a symmetric operator, $D_j(a)$ is symmetric. The domain of its adjoint $D_j^\dagger(a)$ is given by
\[
\mathcal{D}\big( D_j^\dagger(a) \big) := \big\{ \psi \in L^2(\Lambda) \, : \, (i\nabla + a)_j \psi \in L^2(\Lambda) \big\}, \tag{A.8}
\]

because the adjoint of $C_0^\infty(\Lambda) \ni \psi \mapsto (i\nabla + a)_j \psi$ coincides with that of its closure. While for a proper subset $\Lambda \neq \mathbb{R}^d$ the operator $D_j(a)$ is not self-adjoint, it is so for $\Lambda = \mathbb{R}^d$ [57, Lemma 2.5]. In the latter case it may physically be interpreted, up to a sign, as the $j$th component of the velocity (operator). By construction the magnetic Neumann Laplacian is a form sum of d operators in accordance with
\[
H_{\Lambda,N}(a, 0) = \frac{1}{2} \sum_{j=1}^{d} D_j(a) \, D_j^\dagger(a), \tag{A.9}
\]

where the self-adjoint positive operator $D_j(a) D_j^\dagger(a)$ comes from the closed form $\langle D_j^\dagger(a) \varphi , D_j^\dagger(a) \psi \rangle$ with form domain (A.8). Note that (A.8) is just the $j$th set of the intersection on the r.h.s. of (A.5). See also Thm. X.25 in [52].

(iii) Restricting the form $h^{a,0}_{\Lambda,N}$ to the domain $C_0^\infty(\Lambda) \subset W_a^{1,2}(\Lambda)$, one obtains a form which is closable in $W_a^{1,2}(\Lambda)$ with respect to the norm (A.6), see [57, 42, 9]. Its closure $h^{a,0}_{\Lambda,D}$ is uniquely associated with another self-adjoint positive operator $H_{\Lambda,D}(a, 0)$ on $L^2(\Lambda)$ which, up to a factor of −1/2, is called magnetic Dirichlet Laplacian. For general $a \in \big( L^2_{\mathrm{loc}}(\mathbb{R}^d) \big)^d$ the space $C_0^\infty(\Lambda)$ is not contained in $\mathcal{D}\big( H_{\Lambda,N}(a, 0) \big)$, see (A.7). As a consequence, $H_{\Lambda,N}(a, 0)$ in general cannot be restricted to $C_0^\infty(\Lambda)$. This stands in contrast to the case a = 0 where $H_{\Lambda,D}(0, 0)$ is the Friedrichs extension of the restriction of $H_{\Lambda,N}(0, 0)$ to $C_0^\infty(\Lambda)$. As the Dirichlet counterpart of (A.9) we only have the inequality $H_{\Lambda,D}(a, 0) \le \frac{1}{2} \sum_{j=1}^{d} D_j^\dagger(a) D_j(a)$ which is meant in the sense of forms [53, Def. on p. 269]. The operators $H_{\mathbb{R}^d,N}(a, 0)$ and $H_{\mathbb{R}^d,D}(a, 0)$ are equal, see [57].

(iv) In the free case, which is characterized by a = 0 and v = 0, the just defined operators $H_{\Lambda,D}(0, 0)$ and $H_{\Lambda,N}(0, 0)$ coincide, up to a factor of −1/2, with the usual Dirichlet- and Neumann-Laplacian [53, p. 263], respectively.

In a second and final step, we let $v_+ \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and assume $v_-$ to be a form perturbation of $H_{\Lambda,N}(a, 0)$. As a consequence, the sesquilinear form
\[
h^{a,v}_{\Lambda,N}(\varphi, \psi) := h^{a,0}_{\Lambda,N}(\varphi, \psi) + \big\langle v_+^{1/2} \varphi , v_+^{1/2} \psi \big\rangle - \big\langle v_-^{1/2} \varphi , v_-^{1/2} \psi \big\rangle \tag{A.10}
\]
is well defined for all $\varphi$ and $\psi$ in its form domain $\mathcal{Q}\big( h^{a,v}_{\Lambda,N} \big) := W_a^{1,2}(\Lambda) \cap \mathcal{Q}(v_+)$, where
\[
\mathcal{Q}(v_+) := \big\{ \phi \in L^2(\Lambda) \, : \, v_+^{1/2} \phi \in L^2(\Lambda) \big\}. \tag{A.11}
\]
Basic facts about $h^{a,v}_{\Lambda,N}$ are summarized in

Lemma A.2. The form $h^{a,v}_{\Lambda,N}$ is densely defined on $L^2(\Lambda)$, symmetric, bounded below and closed. It therefore uniquely defines a self-adjoint, bounded below operator $H_{\Lambda,N}(a, v)$ on $L^2(\Lambda)$ which is called magnetic Neumann Schrödinger operator.

Proof. The domain $W_a^{1,2}(\Lambda) \cap \mathcal{Q}(v_+)$ of $h^{a,v_+}_{\Lambda,N}$ is dense in $L^2(\Lambda)$, because both $W_a^{1,2}(\Lambda)$ and $\mathcal{Q}(v_+)$ contain $C_0^\infty(\Lambda)$. Hence $H_{\Lambda,N}(a, v_+)$ is well defined as a form sum of $H_{\Lambda,N}(a, 0)$ and $v_+$. Moreover, $h^{a,v_+}_{\Lambda,N}$ is symmetric, positive and closed, because it is the sum of two such forms. Since $H_{\Lambda,N}(a, 0) \le H_{\Lambda,N}(a, v_+)$, the negative part $v_-$ of v is also a form perturbation of $H_{\Lambda,N}(a, v_+)$. The proof of the lemma is then completed by the KLMN-theorem [52, Thm. X.17].

Remark A.4. Since the form domain of $h^{a,0}_{\Lambda,D}$ is contained in $W_a^{1,2}(\Lambda)$, the negative part $v_-$ of v is also a form perturbation of $H_{\Lambda,D}(a, 0) \le H_{\Lambda,D}(a, v_+)$. Hence one may apply the KLMN-theorem to define, similarly to $H_{\Lambda,N}(a, v)$, what is called the magnetic Dirichlet Schrödinger operator and denoted as $H_{\Lambda,D}(a, v)$.

An immediate consequence of the definition of $H_{\Lambda,X}(a, v)$ is the fact that so-called decoupling and Dirichlet–Neumann bracketing continue to hold for $a \neq 0$ as in the case a = 0, see Props. 3 and 4 in Sect. XIII.15 of [53], and [14, 45] for smooth $a \neq 0$.

Proposition A.1. Let $|a|^2, v_+ \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and $v_-$ be a form perturbation of $H_{\Lambda,N}(a, 0)$. Moreover, let $\Lambda_1, \Lambda_2 \subset \mathbb{R}^d$ be a disjoint pair of non-empty open sets.

(i) Then the orthogonal decomposition
\[
H_{\Lambda_1 \cup \Lambda_2, X}(a, v) = H_{\Lambda_1, X}(a, v) \oplus H_{\Lambda_2, X}(a, v) \tag{A.12}
\]
holds for both X = D and X = N on $L^2(\Lambda_1 \cup \Lambda_2) = L^2(\Lambda_1) \oplus L^2(\Lambda_2)$.

(ii) Let $\Lambda := \big( \overline{\Lambda_1 \cup \Lambda_2} \big)^{\mathrm{int}}$ be defined as the interior of the closure of the union of $\Lambda_1$ and $\Lambda_2$, and suppose that the interface $\Lambda \setminus (\Lambda_1 \cup \Lambda_2)$ is of d-dimensional Lebesgue measure zero. Then the inequalities
\[
H_{\Lambda_1 \cup \Lambda_2, N}(a, v) \le H_{\Lambda,N}(a, v) \le H_{\Lambda,D}(a, v) \le H_{\Lambda_1 \cup \Lambda_2, D}(a, v) \tag{A.13}
\]
hold in the sense of forms.

Proof. The proofs of Props. 3 and 4 in Sect. XIII.15 of [53] for the free case carry over to the case $a \neq 0$ and $v \neq 0$. In particular, the inclusion relations between the various form domains for a = 0 and v = 0 hold analogously for the form domains in the case $a \neq 0$ and $v \neq 0$.

A.2. Diamagnetic inequality. A useful tool in the study of Schrödinger operators with magnetic fields is

Proposition A.2. Let $\Lambda \subseteq \mathbb{R}^d$ be open, $|a|^2, v_+ \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and $v_-$ be a form perturbation of $H_{\Lambda,N}(0, 0)$. Then $v_-$ is a form perturbation of $H_{\Lambda,N}(a, 0)$ with form bound not exceeding the one for a = 0 and the inequality
\[
\big| e^{-t H_{\Lambda,X}(a,v)} \psi \big| \le e^{-t H_{\Lambda,X}(0,v)} |\psi| \tag{A.14}
\]
holds for all $\psi \in L^2(\Lambda)$, all $t \ge 0$ and both X = D and X = N.

Remark A.5. (i) For the Dirichlet version X = D of the diamagnetic inequality (A.14) to hold, it would be sufficient that $v_-$ is a form perturbation of $H_{\Lambda,D}(0, 0)$.

(ii) Inequality (A.14) for $\Lambda = \mathbb{R}^d$ dates back to [31, 56, 28, 32, 3, 59, 57]. It is also known to hold for $\Lambda \neq \mathbb{R}^d$ and X = D, even under the weaker assumptions $|a|^2, v_+ \in L^1_{\mathrm{loc}}(\Lambda)$, see [50, 42]. These assumptions still guarantee that the operators $H_{\Lambda,D}(a, v)$ and $H_{\Lambda,N}(a, v)$ are definable as self-adjoint operators via forms. However, for arbitrary open $\Lambda \neq \mathbb{R}^d$ the proof of (A.14) for X = N would be more complicated than the one which we will give under the stronger assumptions of Prop. A.2. The reason is that a gauge function more elaborate than that in Lemma A.3 would be needed in order to avoid integration of $a_j$ across the boundary of $\Lambda$. For a "simply shaped" $\Lambda$, like a cube, such complications do not arise, which implies that our proof would go through for cubes under the weaker assumptions.

(iii) If a = 0 inequality (A.14) is equivalent to the assertion that $H_{\Lambda,X}(0, v)$ is the (negative of the) generator of a positivity-preserving one-parameter operator semigroup on $L^2(\Lambda)$, see [52, pp. 186]. For general $a \in \big( L^2_{\mathrm{loc}}(\mathbb{R}^d) \big)^d$ inequality (A.14) asserts that the semigroup generated by $H_{\Lambda,X}(0, v)$ dominates the one generated by $H_{\Lambda,X}(a, v)$.

(iv) It follows from [28, 59] that (A.14) is equivalent to the following pair of statements:
(a) $\psi \in \mathcal{D}\big( H_{\Lambda,X}(a, v) \big)$ implies $|\psi| \in \mathcal{Q}\big( h^{0,v}_{\Lambda,X} \big)$,
(b) $h^{0,v}_{\Lambda,X}(\varphi, |\psi|) \le \operatorname{Re} \big\langle \varphi \, \operatorname{sgn} \psi , \, H_{\Lambda,X}(a, v) \, \psi \big\rangle$ for all $\varphi \in \mathcal{Q}\big( h^{0,v}_{\Lambda,X} \big)$ with $\varphi \ge 0$ and all $\psi \in \mathcal{D}\big( H_{\Lambda,X}(a, v) \big)$,
where the signum function associated with $\psi$ is defined by $(\operatorname{sgn} \psi)(x) := \psi(x)/|\psi(x)| \in \mathbb{C}$ if $\psi(x) \neq 0$ and zero otherwise. If a = 0 these statements boil down to a Beurling–Deny criterion [17, Thm. 1.3.2] for $H_{\Lambda,X}(0, v)$ which guarantees that it generates a positivity-preserving semigroup. Inequality (b) with X = N and v = 0 basically corresponds to the germinal distributional inequality of Kato, which he proved [31] for $a \in \big( C^1(\mathbb{R}^d) \big)^d$.

In case $\Lambda \neq \mathbb{R}^d$ and X = N, we are not aware of a reference proving (A.14) or (a) and (b) for singular a. Our proof of the diamagnetic inequality (A.14) for X = N will mimic the proof in [57], where the case $\Lambda = \mathbb{R}^d$ is considered, see also Sect. 1.3 in [16]. It relies on the fact that in one dimension the vector potential can be removed by a gauge transformation. More precisely, for each $j \in \{1, \dots, d\}$ the operator $D_j^\dagger(a)$ is unitarily equivalent to $D_j^\dagger(0)$.

Lemma A.3. Let $|a|^2 \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and define a (gauge) function $\lambda_j : \mathbb{R}^d \to \mathbb{R}$ through
\[
\lambda_j(x) := \int_0^{x_j} dy_j \; a_j(x_1, \dots, x_{j-1}, y_j, x_{j+1}, \dots, x_d). \tag{A.15}
\]
For open $\Lambda \subseteq \mathbb{R}^d$ it induces a densely defined and self-adjoint multiplication operator $\lambda_j$ on $L^2(\Lambda)$. The corresponding unitary operator $e^{-i\lambda_j}$ maps $\mathcal{D}\big( D_j^\dagger(a) \big)$ onto $\mathcal{D}\big( D_j^\dagger(0) \big)$, recall (A.8), and one has
\[
D_j^\dagger(a) \, \psi = e^{i\lambda_j} D_j^\dagger(0) \, e^{-i\lambda_j} \psi \quad \text{for all } \psi \in \mathcal{D}\big( D_j^\dagger(a) \big). \tag{A.16}
\]

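Proof. Fubini's theorem and the Cauchy–Schwarz inequality show that $\lambda_j \in L^2_{\mathrm{loc}}(\mathbb{R}^d)$. Therefore, the induced multiplication operator on its maximal domain $\mathcal{D}(\lambda_j) := \big\{ \psi \in L^2(\Lambda) : \lambda_j \psi \in L^2(\Lambda) \big\} \supset C_0^\infty(\Lambda)$ is densely defined and self-adjoint. Moreover, since $\psi \in \mathcal{D}\big( D_j^\dagger(a) \big)$ implies $\nabla_j \psi \in L^1_{\mathrm{loc}}(\Lambda)$, we are allowed to use the product and chain rule for distributional derivatives [26, pp. 150] which yield $\nabla_j \big( e^{-i\lambda_j} \psi \big) = e^{-i\lambda_j} \nabla_j \psi - e^{-i\lambda_j} \, i a_j \psi$.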
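The gauge identity (A.16) can be checked numerically in one dimension for smooth data. The sketch below is an illustration only: the choices $a(x) = \cos x$, $\lambda(x) = \sin x$ and the test function `psi` are hypothetical, and derivatives are approximated by central differences, so the two sides agree only up to discretization error.

```python
import cmath
import math


def a(x):
    """Hypothetical smooth vector-potential component."""
    return math.cos(x)


def lam(x):
    """Gauge function lambda(x) = integral of a from 0 to x, here sin(x)."""
    return math.sin(x)


def psi(x):
    """Hypothetical smooth complex test function."""
    return cmath.exp(-x * x) * (1 + 1j * x)


def ddx(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def with_potential(x):
    """(i d/dx + a) psi evaluated at x."""
    return 1j * ddx(psi, x) + a(x) * psi(x)


def gauged_free(x):
    """e^{i lambda} (i d/dx) (e^{-i lambda} psi) evaluated at x."""
    def gauged(y):
        return cmath.exp(-1j * lam(y)) * psi(y)
    return cmath.exp(1j * lam(x)) * 1j * ddx(gauged, x)


def max_gauge_defect(points=(-1.0, -0.3, 0.0, 0.5, 1.2)):
    """Largest deviation between the two sides of the 1D gauge identity."""
    return max(abs(with_potential(x) - gauged_free(x)) for x in points)
```

A small value of `max_gauge_defect()` reflects the product-rule computation at the end of the proof above: the term $-e^{-i\lambda} i a \psi$, multiplied by $i e^{i\lambda}$, reproduces $a\psi$.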

Proof of Prop. A.2. For X = D see [50, 42, 9]. The proof for X = N consists of three steps.

In the first step, we assume $v \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ to be bounded from below. In this case $H_{\Lambda,N}(a, v)$ is a form sum of d + 1 operators each of which is bounded from below, recall Remark A.3(ii) and Lemma A.2. Hence we may employ the strong Lie–Trotter product formula generalized to form sums of several operators [33] and write
\[
e^{-t H_{\Lambda,N}(a,v)} = \operatorname*{s-lim}_{n \to \infty} \Big( e^{-t D_1(a) D_1^\dagger(a)/2n} \cdots e^{-t D_d(a) D_d^\dagger(a)/2n} \, e^{-t v/n} \Big)^n. \tag{A.17}
\]

Gauge equivalence (A.16) now shows that
\[
e^{-t D_j(a) D_j^\dagger(a)/2n} = e^{i\lambda_j} \, e^{-t D_j(0) D_j^\dagger(0)/2n} \, e^{-i\lambda_j} \tag{A.18}
\]
for all $j \in \{1, \dots, d\}$ and all $t \ge 0$. By the distributional inequality $\big| \nabla_j |\psi| \big| \le \big| \nabla_j \psi \big|$, valid for all $\psi \in \mathcal{D}\big( D_j^\dagger(0) \big)$ [39, Thm. 6.17], the operator $D_j(0) D_j^\dagger(0)$ obeys a Beurling–Deny criterion [17, Thm. 1.3.2] and hence is the generator of a positivity-preserving semigroup. It follows that
\[
\big| e^{-t D_j(a) D_j^\dagger(a)/2n} \psi \big| \le e^{-t D_j(0) D_j^\dagger(0)/2n} |\psi| \tag{A.19}
\]
for all $\psi \in L^2(\Lambda)$ and all $t \ge 0$. This together with (A.17) implies the assertion (A.14) (with X = N) for scalar potentials $v \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ which are bounded from below.

In the second step, we prove that if $v_-$ is a form perturbation of $H_{\Lambda,X}(0, 0)$ then it is also one of $H_{\Lambda,X}(a, 0)$ with form bound not exceeding the one for a = 0 (see [3] or [58, Thm. 15.10] for the case $\Lambda = \mathbb{R}^d$). This follows from (A.23) below with v = 0 and α = 1/2 together with the fact that the form bound of $v_-$ relative to $H_{\Lambda,X}(a, 0)$ can be expressed as
\[
\lim_{E \to \infty} \Big\| \big( H_{\Lambda,X}(a, 0) + E \big)^{-1/2} \, v_- \, \big( H_{\Lambda,X}(a, 0) + E \big)^{-1/2} \Big\|, \tag{A.20}
\]
see [16, Prop. 1.3(ii)]. Here $\| \cdot \|$ denotes the (uniform) norm of bounded operators on $L^2(\Lambda)$.

In the third step, we extend the validity of (A.14) (with X = N) to scalar potentials v with $v_+ \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and $v_-$ being a form perturbation of $H_{\Lambda,N}(0, 0)$. To this end, we approximate v by $v_n$ defined through $v_n(x) := \max\{-n, v(x)\}$, $x \in \mathbb{R}^d$, $n \in \mathbb{N}$. Monotone convergence for forms [51, Thm. S.16] yields the convergence of $H_{\Lambda,N}(a, v_n)$ to $H_{\Lambda,N}(a, v)$ in the strong resolvent sense as $n \to \infty$. It follows that
\[
\operatorname*{s-lim}_{n \to \infty} e^{-t H_{\Lambda,N}(a, v_n)} = e^{-t H_{\Lambda,N}(a, v)} \tag{A.21}
\]
for all $t \ge 0$. Since (A.14) (with X = N) holds for each $v_n$ by the first step, the proof is complete.
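The first step of the proof rests on the Lie–Trotter product formula (A.17). A finite-dimensional sketch of the mechanism is given below, assuming two non-commuting positive semi-definite 2 x 2 matrices `A` and `B` as hypothetical stand-ins for the summands; the unbounded-operator and form-sum aspects that [33] addresses are not captured by this toy version.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def expm(A, squarings=10, terms=12):
    """Matrix exponential by a scaled Taylor series followed by repeated squaring."""
    n = len(A)
    S = [[x / 2.0 ** squarings for x in row] for row in A]
    result = identity(n)
    term = identity(n)
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, S)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


def trotter_error(n, t=1.0):
    """Max entrywise distance between (e^{-tA/n} e^{-tB/n})^n and e^{-t(A+B)}."""
    A = [[1.0, 0.0], [0.0, 0.0]]        # hypothetical positive summand
    B = [[0.5, -0.5], [-0.5, 0.5]]      # hypothetical positive summand, not commuting with A
    exact = expm([[-t * (A[i][j] + B[i][j]) for j in range(2)] for i in range(2)])
    step = matmul(expm([[-t * x / n for x in row] for row in A]),
                  expm([[-t * x / n for x in row] for row in B]))
    approx = identity(2)
    for _ in range(n):
        approx = matmul(approx, step)
    return max(abs(approx[i][j] - exact[i][j]) for i in range(2) for j in range(2))
```

Evaluating `trotter_error(n)` for increasing `n` shows the product approaching the exponential of the sum, which is the convergence used in (A.17) before the factorwise bound (A.19) is applied.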


A.3. Some consequences. We list some immediate consequences of the diamagnetic inequality. For this purpose, we assume the situation of Prop. A.2.

(i) Powers of the resolvent of the self-adjoint operator $H_{\Lambda,X}(a, v)$ may be expressed in terms of its semigroup by using the functional calculus. This gives the integral representation
\[
\big( H_{\Lambda,X}(a, v) - z \big)^{-\alpha} = \frac{1}{(\alpha - 1)!} \int_0^\infty dt \; t^{\alpha - 1} \, e^{tz} \, e^{-t H_{\Lambda,X}(a,v)}, \tag{A.22}
\]
which is valid for all α > 0, all $z \in \mathbb{C}$ with $\operatorname{Re} z < \inf \operatorname{spec} H_{\Lambda,X}(a, v)$ and both X = D and X = N. Here $\alpha \mapsto (\alpha - 1)!$ denotes Euler's gamma function [27]. Inequality (A.14) then implies the diamagnetic inequality for powers of the resolvent
\[
\Big| \big( H_{\Lambda,X}(a, v) - z \big)^{-\alpha} \psi \Big| \le \big( H_{\Lambda,X}(0, v) - \operatorname{Re} z \big)^{-\alpha} |\psi|, \tag{A.23}
\]
valid for all $\psi \in L^2(\Lambda)$ and all $z \in \mathbb{C}$ with $\operatorname{Re} z < \inf \operatorname{spec} H_{\Lambda,X}(0, v)$. We recall [55] that the ground-state energy goes up when the magnetic field is turned on, in symbols, $\inf \operatorname{spec} H_{\Lambda,X}(0, v) \le \inf \operatorname{spec} H_{\Lambda,X}(a, v)$. This follows from Remark A.5(iv)(b) or inequality (A.24) below if its r.h.s. is finite.

(ii) If $H_{\Lambda,X}(0, v)$ has purely discrete spectrum or, equivalently [53, Thm. XIII.64], has compact resolvent, the Dodds-Fremlin-Pitt theorem [3, Thm. 2.2] together with (A.23) implies that $H_{\Lambda,X}(a, v)$ also has compact resolvent and hence purely discrete spectrum. In turn, $H_{\Lambda,X}(0, v)$ has purely discrete spectrum if the free operator $H_{\Lambda,X}(0, 0)$ has and if v is a form perturbation of $H_{\Lambda,X}(0, 0)$ [53, Thm. XIII.68]. While $H_{\Lambda,D}(0, 0)$ has purely discrete spectrum for arbitrary bounded open $\Lambda \subset \mathbb{R}^d$, $H_{\Lambda,N}(0, 0)$ only has if $\Lambda$ possesses an additional property, for example the segment property, see [53, pp. 255]. For example, if $\Lambda$ is a bounded open cube the spectra of $H_{\Lambda,D}(a, -v_-)$ and $H_{\Lambda,N}(a, -v_-)$ are both purely discrete. Moreover, by the min-max principle the addition of the positive multiplication operator $v_+$ to $H_{\Lambda,X}(a, -v_-)$ cannot create essential spectrum. As a consequence, $H_{\Lambda,X}(a, v)$ has purely discrete spectrum for both X = D and X = N if $\Lambda$ is a bounded open cube.

(iii) The diamagnetic inequality (A.14) together with Lemma 15.11 in [58] implies the diamagnetic inequality for partition functions
\[
\operatorname{Tr} e^{-t H_{\Lambda,X}(a,v)} \le \operatorname{Tr} e^{-t H_{\Lambda,X}(0,v)} \tag{A.24}
\]
for all t > 0 and both X = D and X = N, provided that the r.h.s. is finite. The latter is the case if $\Lambda$ is a bounded open cube, for example. This follows from Dirichlet–Neumann bracketing (see (A.13) with a = 0), the facts that $v_+ \ge 0$ and $v_-$ is a form perturbation of $H_{\Lambda,N}(0, 0)$, and the finiteness of the free Neumann partition function (see [36, Prop. 2.1(c)] or (4.5)).

Acknowledgement. It is a pleasure to thank Kurt Broderix, Dirk Hundertmark, Thomas Hoffmann-Ostenhof and Georgi D. Raikov for helpful remarks and stimulating discussions. This work was supported by the Deutsche Forschungsgemeinschaft under grant nos. Le 330/10 and Le 330/12. The latter is a project within the Schwerpunktprogramm "Interagierende stochastische Systeme von hoher Komplexität" (DFG Priority Programme SPP 1033).

Note added in proof. After submission of the present paper we learned of the interesting paper "The $L^p$-theory of the spectral shift function, the Wegner estimate, and the integrated density of states for some random operators", Commun. Math. Phys. 218, 113–130 (2001), by J. M. Combes, P. D. Hislop and S. Nakamura. Among other things, their approach yields Wegner estimates for rather general magnetic fields and certain bounded random potentials. While these estimates do not imply absolute continuity of the integrated density of states, they yield Hölder continuity of arbitrary order strictly smaller than one. The recent preprint "The integrated density of states for some random operators with nonsign definite potentials", mp_arc 01-139 (2001), by P. D. Hislop and F. Klopp extends part of this result to single-site potentials taking values of both signs.

References 1. Adler, R.J.: The geometry of random fields. Chichester: Wiley, 1981 2. Ando, T., Fowler, A.B., Stern, F.: Electronic properties of two-dimensional systems. Rev. Mod. Phys. 54, 437–672 (1982) 3. Avron, J., Herbst, I., Simon, B.: Schrödinger operators with magnetic fields. I. General interactions. Duke Math. J. 45, 847–883 (1978) 4. Barbaroux, J.-M., Combes, J.M., Hislop, P.D.: Localization near band edges for random Schrödinger operators. Helv. Phys. Acta 70, 16–43 (1997) 5. Barbaroux, J.-M., Combes, J.M., Hislop, P.D.: Landau Hamiltonians with unbounded random potentials. Lett. Math. Phys. 40, 335–369 (1997) 6. Bauer, H.: Maß- und Integrationstheorie. 2. Auflage, Berlin: de Gruyter, 1992 [in German] English translation to appear 7. Bonch-Bruevich,V.L., Enderlein, R., Esser, B., Keiper, R., Mironov,A.G., Zvyagin, I.P.: Elektronentheorie ungeordneter Halbleiter. Berlin: VEB Deutscher Verlag der Wissenschaften, 1984 [in German. Russian original: Moscow: Nauka, 1981] 8. Broderix, K., Hundertmark, D., Leschke, H.: Self-averaging, decomposition and asymptotic properties of the density of states for random Schrödinger operators with constant magnetic field. In: Path integrals from meV to MeV: Tutzing ’92. Grabert, H., Inomata, A., Schulman, L.S., Weiss, U. (eds.), Singapore: World Scientific, 1993, pp. 98–107 9. Broderix, K., Hundertmark, D., Leschke, H.: Continuity properties of Schrödinger semigroups with magnetic fields. Rev. Math. Phys. 12, 181–225 (2000) 10. Carmona, R., Lacroix, J.: Spectral theory of random Schrödinger operators. Boston: Birkhäuser, 1990 11. Combes, J.M., Hislop, P.D.: Localization for some continuous, random Hamiltonians in d-dimensions. J. Funct. Anal. 124, 149–180 (1994) 12. Combes, J.M., Hislop, P.D.: Landau Hamiltonians with random potentials: Localization and the density of states. Commun. Math. Phys. 177, 603–629 (1996) 13. 
Combes, J.M., Hislop, P.D., Mourre, E.: Spectral averaging, perturbation of singular spectra, and localization. Trans. Am. Math. Soc. 348, 4883–4894 (1996) 14. Combes, J.M., Schrader, R., Seiler, R.: Classical bounds and limits for energy distributions of Hamilton operators in electromagnetic fields. Ann. Phys. (N.Y.) 111, 1–18 (1978) 15. Craig, W., Simon, B.: Log Hölder continuity of the integrated density of states for stochastic Jacobi matrices. Commun. Math. Phys. 90, 207–218 (1983) 16. Cycon, H.L., Froese, R.G., Kirsch, W., Simon, B.: Schrödinger operators. Berlin: Springer, 1987 17. Davies, E.B.: Heat kernels and spectral theory. Paperback edition, Cambridge: Cambridge Univ. Press, 1990 18. Delyon, F., Souillard, B.: Remark on the continuity of the density of states of ergodic finite difference operators. Commun. Math. Phys. 94, 289–291 (1984) 19. Doi, S., Iwatsuka, A., Mine, T.: The uniqueness of the integrated density of states for the Schrödinger operators with magnetic fields. Math. Z. 237, 335–371 (2001) 20. Dorlas, T.C., Macris, N., Pulé, J.V.: Characterization of the spectrum of the Landau Hamiltonian with delta impurities. Commun. Math. Phys. 204, 367–396 (1999) 21. Droese, J., Kirsch, W.: The effect of boundary conditions on the density of states for random Schrödinger operators. Stochastic Processes Appl. 23, 169–175 (1986) 22. Fernique, X.M.: Regularité des trajectoires des fonctions aléatoires Gaussiennes. In: Ecole d’Eté de Probabilités de Saint-Flour IV - 1974. Hennequin, P.-L. (ed.), Lecture Notes in Mathematics 480, Berlin: Springer, 1975, pp. 1–96 [in French] 23. Fischer, W., Hupfer, T., Leschke, H., Müller, P.: Existence of the density of states for multi-dimensional continuum Schrödinger operators with Gaussian random potentials. Commun. Math. Phys. 190, 133–141 (1997)


24. Fischer, W., Leschke, H., Müller, P.: Spectral localization by Gaussian random potentials in multidimensional continuous space. J. Stat. Phys. 101, 935–985 (2000) 25. Fock, V.: Bemerkung zur Quantelung des harmonischen Oszillators im Magnetfeld. Z. Physik 47, 446–448 (1928) [in German] 26. Gilbarg, D., Trudinger, N.S.: Elliptic partial differential equations of second order. 2nd edition, Berlin: Springer, 1983 27. Gradshteyn, I.S., Ryzhik, I.M.: Table of integrals, series, and products. Corrected and enlarged edition, San Diego: Academic, 1980 28. Hess, H., Schrader, R., Uhlenbrock, D.A.: Domination of semigroups and generalization of Kato’s inequality. Duke Math. J. 44, 893–904 (1977) 29. Hupfer, T., Leschke, H., Müller, P., Warzel, S.: Existence and uniqueness of the integrated density of states for Schrödinger operators with magnetic fields and unbounded random potentials. e-print mathph/0010013 (2000). 30. Hupfer, T., Leschke, H., Warzel, S.: Upper bounds on the density of states of single Landau levels broadened by Gaussian random potentials. e-print math-ph/0011010 (2000) 31. Kato, T.: Schrödinger operators with singular potentials. Israel J. Math. 13, 135–148 (1972) 32. Kato, T.: Remarks on Schrödinger operators with vector potentials. Integral Equations Oper. Theory 1, 103–113 (1978) 33. Kato, T., Masuda, K.: Trotter’s product formula for nonlinear semigroups generated by the subdifferentials of convex functionals. J. Math. Soc. Japan 30, 169–178 (1978) 34. Kirsch, W.: Random Schrödinger operators: A course. In: Schrödinger operators. Holden, H., Jensen, A. (eds.), Lecture Notes in Physics 345, Berlin: Springer, 1989, pp. 264–370 35. Kirsch, W., Martinelli, F.: On the ergodic properties of the spectrum of general random operators. J. Reine Angew. Math. 334, 141–156 (1982) 36. Kirsch, W., Martinelli, F.: On the density of states of Schrödinger operators with a random potential. J. Phys. A 15, 2139–2156 (1982) 37. 
Kukushkin, I.V., Meshkov, S.V., Timofeev, V.B.: Two-dimensional electron density of states in a transverse magnetic field. Sov. Phys. Usp. 31, 511–534 (1988) [Russian original: Usp. Fiz. Nauk 155, 219–264 (1988)] 38. Landau, L.: Diamagnetismus der Metalle. Z. Physik 64, 629–637 (1930) [in German] 39. Lieb, E.H., Loss, M.: Analysis. Providence, Rhode Island: Am. Math. Soc., 1997 40. Lifshits, I.M., Gredeskul, S.A., Pastur, L.A.: Introduction to the theory of disordered systems. New York: Wiley, 1988 [Russian original: Moscow: Nauka, 1982] 41. Lifshits, M.A.: Gaussian random functions. Dordrecht: Kluwer, 1995 42. Liskevitch, V., Manavi, A.: Dominated semigroups with singular complex potentials. J. Funct. Anal. 151, 281–305 (1997) 43. Matsumoto, H.: On the integrated density of states for the Schrödinger operators with certain random electromagnetic potentials. J. Math. Soc. Japan 45, 197–214 (1993) 44. Mohamed, A., Raikov, G.D.: On the spectral theory of the Schrödinger operator with electromagnetic potential. In: Pseudo-differential calculus and mathematical physics. Demuth, M., Schrohe, E., Schulze, B.-W.(eds.), Berlin: Akademie, 1994, pp. 298–390 45. Nakamura, S.: A remark on the Dirichlet–Neumann decoupling and the integrated density of states. J. Funct. Anal. 179, 136–152 (2001) 46. Nakao, S.: On the spectral distribution of the Schrödinger operator with random potential. Japan. J. Math. 3, 111–139 (1977) 47. Pastur, L.: On the Schrödinger equation with a random potential. Theor. Math. Phys. 6, 299–306 (1971) [Russian original: Teor. Mat. Fiz. 6, 415–424 (1971)] 48. Pastur, L.: Spectral properties of disordered systems in the one-body approximation. Commun. Math. Phys. 75, 179–196 (1980) 49. Pastur, L., Figotin, A.: Spectra of random and almost-periodic operators. Berlin: Springer, 1992 50. Perelmuter, M.A., Semenov, Yu.A.: On decoupling of finite singularities in the scattering theory for the Schrödinger operator with a magnetic field. J. Math. Phys. 
22, 521–533 (1981) 51. Reed, M., Simon, B.: Methods of modern mathematical physics I: Functional analysis. Revised and enlarged edition, San Diego: Academic, 1980 52. Reed, M., Simon, B.: Methods of modern mathematical physics II: Fourier analysis, self-adjointness. New York: Academic, 1975 53. Reed, M., Simon, B.: Methods of modern mathematical physics IV: Analysis of operators. New York: Academic, 1978 54. Shklovskii, B.I., Efros, A.L.: Electronic properties of doped semiconductors. Berlin: Springer, 1984 [Russian original: Moscow: Nauka, 1979] 55. Simon, B.: Universal diamagnetism of spinless Bose systems. Phys. Rev. Lett. 36, 1083–1084 (1976) 56. Simon, B.: An abstract Kato’s inequality for generators of positivity preserving semigroups. Ind. Math. J. 26, 1067–1073 (1977)

57. Simon, B.: Maximal and minimal Schrödinger forms. J. Operator Theory 1, 37–47 (1979) 58. Simon, B.: Functional integration and quantum physics. New York: Academic, 1979 59. Simon, B.: Kato's inequality and the comparison of semigroups. J. Funct. Anal. 32, 97–101 (1979) 60. Simon, B.: Schrödinger operators in the twenty-first century. In: Mathematical Physics 2000. Fokas, A., Grigoryan, A., Kibble, T., Zegarlinski, B. (eds.), London: Imperial College Press, 2000, pp. 283–288 61. Stollmann, P.: Caught by disorder: Bound states in random media. Boston: Birkhäuser, 2001 62. Ueki, N.: On spectra of random Schrödinger operators with magnetic fields. Osaka J. Math. 31, 177–187 (1994) 63. Veselić, I.: Wegner estimate for some indefinite Anderson-type Schrödinger operators. e-print mp_arc 00-373 (2000) 64. Wang, W.-M.: Microlocalization, percolation, and Anderson localization for the magnetic Schrödinger operator with a random potential. J. Funct. Anal. 146, 1–26 (1997) 65. Wegner, F.: Bounds on the density of states in disordered systems. Z. Phys. B 44, 9–15 (1981) 66. Weyl, H.: Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller Differentialgleichungen (mit einer Anwendung auf die Theorie der Hohlraumstrahlung). Math. Ann. 71, 441–479 (1912) [in German] 67. Zak, J.: Magnetic translation group. Phys. Rev. 134, A1602–A1606 (1964)

Communicated by B. Simon

Commun. Math. Phys. 221, 255 – 265 (2001)

Eigenvalues of the Dirac Operator on Manifolds with Boundary

Oussama Hijazi1, Sebastián Montiel2, Xiao Zhang3

1 Institut Élie Cartan, Université Henri Poincaré, Nancy I, B.P. 239, 54506 Vandœuvre-Lès-Nancy Cedex, France. E-mail: [email protected]
2 Departamento de Geometría y Topología, Universidad de Granada, 18071 Granada, Spain.
3 Institute of Mathematics, Academy of Mathematics and Systems Sciences, Chinese Academy of Sciences, Beijing 100080, P.R. China. E-mail: [email protected]

Received: 22 August 2000 / Accepted: 15 March 2001

Abstract: Under standard local boundary conditions or certain global APS boundary conditions, we get lower bounds for the eigenvalues of the Dirac operator on compact spin manifolds with boundary. For the local boundary conditions, limiting cases are characterized by the existence of real Killing spinors and the minimality of the boundary.

1. Introduction

It is well known that the spectrum of the Dirac operator on closed spin manifolds detects subtle information on the geometry and the topology of such manifolds (see for example [6, 8]). In [31, 33, 27, 30], basic properties of the hypersurface Dirac operator are established. This hypersurface Dirac operator appears as the boundary term in the integral Schrödinger–Lichnerowicz formula (2.3) for compact spin manifolds with compact boundary. In fact, the hypersurface Dirac operator is, up to a zero order operator, the intrinsic Dirac operator of the boundary.

In this paper, we examine the classical local boundary conditions and certain Atiyah–Patodi–Singer boundary conditions for the Dirac operator. Here, the spectral resolution of the intrinsic Dirac operator of the boundary is used to define the APS boundary conditions. We first prove self-adjointness and ellipticity of such conditions. Then, systematic use of the modified Levi–Civita connections, introduced in [10, 24, 33, 11, 27, 30], is made (see also [28, 15] for the Dirac operators on submanifolds). Under appropriate curvature assumptions, these modified connections, combined with formula (2.3), yield the corresponding estimates for compact spin manifolds with boundary. The limiting cases are then studied.

Such estimates are obtained in Sects. 3 and 4. In Sect. 3 we consider both the local and the above mentioned APS boundary conditions. We first introduce the modified connection (3.1), which allows one to establish a Friedrich-type inequality in case the mean curvature of the boundary is nonnegative. Under the local boundary conditions,


the limiting case is then characterized by the existence of a Killing spinor on the compact manifold with minimal boundary (see (3.5)). Then the energy-momentum tensor is used to define the modified connection (3.7), from which one can deduce inequality (3.9). Finally, in Sect. 4, under the local boundary conditions, the conformal aspect is examined. For example, generalizations of the conformal lower bounds in [22, 24] are obtained (see Remark 9).

It might be useful to mention that local and global boundary conditions are introduced in [25] to get optimal extrinsic lower bounds for the first nonnegative eigenvalue of the intrinsic Dirac operator of the boundary. Moreover, in [26], the conformal aspect of this setup is examined, where a conformal extrinsic lower bound is given.

2. The Elliptic Boundary Conditions

Let $M$ be an $n$-dimensional Riemannian spin manifold with boundary $\partial M$ endowed with its induced Riemannian and spin structures. Denote by $S$ the spinor bundle of $M$. Let $\nabla$ (resp. $\nabla^{\partial M}$) be the Levi–Civita connection of $M$ (resp. $\partial M$) and denote by the same symbol their corresponding lift to the spinor bundle $S$. Consider the Dirac operator $D$ of $M$ defined by $\nabla$ on $S$. It is known [29] that there exists a positive definite Hermitian metric $(\ ,\ )$ on $S$ which satisfies, for any covector field $X^* \in \Gamma(T^*M)$ and any spinor fields $\varphi, \psi \in \Gamma(S)$, the relation

$$ (X^* \cdot \varphi, X^* \cdot \psi) = |X^*|^2 (\varphi, \psi), \tag{2.1} $$

where “·” denotes Clifford multiplication. The connection $\nabla$ is compatible with the metric $(\ ,\ )$. Fix a point $p \in \partial M$ and an orthonormal basis $\{e_\alpha\}$ of $T_pM$ with $e_0$ the outward normal to $\partial M$ and $e_i$ tangent to $\partial M$ such that for $1 \le i, j \le n$,

$$ (\nabla^{\partial M}_i e_j)_p = (\nabla_0 e_j)_p = 0. $$

Let $\{e^\alpha\}$ be the dual coframe. Then, for $1 \le i, j \le n$,

$$ (\nabla_i e^j)_p = -h_{ij}\, e^0, \qquad (\nabla_i e^0)_p = h_{ij}\, e^j, $$

where $h_{ij} = (\nabla_i e_0, e_j)$ are the components of the second fundamental form at $p$, and we have

$$ \nabla_i = \nabla^{\partial M}_i + \frac{1}{2} h_{ij}\, e^0 \cdot e^j \cdot . \tag{2.2} $$

Let $H = h_{ii}$ be the unnormalized mean curvature of $\partial M$. In the above notation, the standard sphere $S^n_r = \partial B^{n+1}_r$ has positive mean curvature $H = \frac{n}{r}$. By (2.1), $(e^0 \cdot e^j \cdot \varphi, \psi) = (\varphi, e^j \cdot e^0 \cdot \psi)$. Therefore (2.2) implies

$$ d(\varphi, \psi) \wedge * e^i = \big( (\nabla_i \varphi, \psi) + (\varphi, \nabla_i \psi) \big) * 1 = \big( (\nabla^{\partial M}_i \varphi, \psi) + (\varphi, \nabla^{\partial M}_i \psi) \big) * 1. $$

Hence the connection $\nabla^{\partial M}$ is also compatible with the metric $(\ ,\ )$. Denote by $D^{\partial M}$ the Dirac operator of $\partial M$. In the above orthonormal coframe $\{e^i\}$, $D^{\partial M} = e^i \cdot \nabla^{\partial M}_i$. Thus $D^{\partial M}$ is self-adjoint with respect to the metric $(\ ,\ )$. The relation (2.2) implies that

$$ \nabla^{\partial M}_i (e^0 \cdot \varphi) = e^0 \cdot \nabla^{\partial M}_i \varphi. $$


Hence

$$ D^{\partial M}(e^0 \cdot \varphi) = -e^0 \cdot D^{\partial M} \varphi. $$

Consider the integral form of the Schrödinger–Lichnerowicz formula for a compact manifold with compact boundary

$$ \int_{\partial M} (\varphi, e^0 \cdot D^{\partial M} \varphi) - \frac{1}{2} \int_{\partial M} H |\varphi|^2 = \int_M \Big( |\nabla \varphi|^2 + \frac{R}{4} |\varphi|^2 - |D\varphi|^2 \Big). \tag{2.3} $$

It is well-known that there are basically two types of elliptic boundary conditions for the Dirac operator: the local boundary condition and the (global) Atiyah–Patodi–Singer (APS) boundary condition. Such boundary conditions are used in the positive mass theorem for black holes, the Penrose conjecture in general relativity and index theory in topology [13, 14, 20, 21, 34]. The APS boundary condition exists on any spin manifold with boundary [2–4] (see also [16–19]), while the local boundary condition requires certain additional structures on manifolds, such as the existence of a Lorentzian structure or a chirality operator, etc. [12, 13, 21]. Now we shall show that the local boundary condition exists on certain spin manifolds with a “boundary chirality operator”. An operator $\Gamma$ defined on $C^\infty(\partial M, S|_{\partial M})$ is said to be a boundary chirality operator if it satisfies the following conditions:

$$ \Gamma^2 = \mathrm{Id}, \tag{2.4} $$
$$ \nabla^{\partial M}_{e_i} \Gamma = 0, \tag{2.5} $$
$$ e^0 \cdot \Gamma = -\Gamma \cdot e^0, \tag{2.6} $$
$$ e^i \cdot \Gamma = \Gamma \cdot e^i, \tag{2.7} $$
$$ (\Gamma \cdot \varphi, \Gamma \cdot \psi) = (\varphi, \psi). \tag{2.8} $$

If $M$ is a spacelike hypersurface of a spacetime manifold with timelike covector $T$, then we can let $\Gamma = T \cdot e^0$, where $e^0$ is the normal covector on $\partial M$. Recall that (see [12] for example) an operator $F$ defined on $C^\infty(M, S)$ is called a chirality operator on $M$ if for all $X^* \in \Gamma(T^*M)$ and any spinor fields $\varphi, \psi \in \Gamma(S)$, one has

$$ F^2 = \mathrm{Id}, \qquad \nabla_X F = 0, \qquad X^* \cdot F = -F \cdot X^*, \qquad (F \cdot \varphi, F \cdot \psi) = (\varphi, \psi). $$

Note that such an operator exists if the spin manifold $M$ is even dimensional. It is easy to see that if $M$ has a chirality operator $F$, then $\Gamma = F|_{\partial M} \cdot e^0$ is a boundary chirality operator. In this paper, we consider the following boundary conditions:

• The local boundary condition. As the eigenvalues of the chirality operator $\Gamma$ are $\pm 1$, the corresponding eigenspaces

$$ \Gamma^{loc}_+ = \{ \varphi \in C^\infty(\partial M, S|_{\partial M}),\ \Gamma \cdot \varphi = \varphi \}, \qquad \Gamma^{loc}_- = \{ \varphi \in C^\infty(\partial M, S|_{\partial M}),\ \Gamma \cdot \varphi = -\varphi \} $$

provide local boundary conditions.



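• The APS type boundary condition. The operator $e^0 \cdot D^{\partial M}$ is self-adjoint with respect to the induced metric $(\ ,\ )$ on $\partial M$. Therefore it has a discrete (real) spectrum. Let $(\varphi_k)_{k \in \mathbb{N}}$ be the spectral resolution of $e^0 \cdot D^{\partial M}$, i.e., $e^0 \cdot D^{\partial M} \varphi_k = \lambda_k \varphi_k$, and consider the corresponding $L^2$-orthogonal subspaces $\Gamma^{APS}_\pm$ spanned by the positive and negative eigenspaces of $e^0 \cdot D^{\partial M}$, i.e.,

$$ \Gamma^{APS}_+ = \Big\{ \varphi \in C^\infty(\partial M, S|_{\partial M}),\ \varphi = \sum_{\lambda_k > 0} c_k \varphi_k \Big\}, \qquad \Gamma^{APS}_- = \Big\{ \varphi \in C^\infty(\partial M, S|_{\partial M}),\ \varphi = \sum_{\lambda_k < 0} c_k \varphi_k \Big\}. $$

[…] $\delta > 0$, there exists $C_{k,\delta}$ such that

$$ \|\varphi\|^2_{H^k} \le (1 + \delta) \|D\varphi\|^2_{L^2} + C_{k,\delta} \|\varphi\|^2_{H^{k-1}}. \tag{2.9} $$

Proof. Note that for any $\varphi \in \Gamma^{loc}_+$ or $\varphi \in \Gamma^{loc}_-$, $D^{\partial M}(\Gamma \cdot \varphi) = \Gamma \cdot D^{\partial M} \varphi$, thus

$$ (\varphi, e^0 \cdot D^{\partial M} \varphi) = \big( \Gamma \cdot \varphi, e^0 \cdot D^{\partial M}(\Gamma \cdot \varphi) \big) = (\Gamma \cdot \varphi, e^0 \cdot \Gamma \cdot D^{\partial M} \varphi) = -(\varphi, e^0 \cdot D^{\partial M} \varphi). $$

Therefore $(\varphi, e^0 \cdot D^{\partial M} \varphi) = 0$. If $\varphi \in \Gamma^{APS}_-$, then

$$ \int_{\partial M} (\varphi, e^0 \cdot D^{\partial M} \varphi) = \sum_{\lambda_k < 0} |c_k|^2 \lambda_k \le 0. $$

[…] $\varepsilon > 0$, there exists a constant $C_\varepsilon > 0$ such that

$$ \|\varphi\|^2_{L^2(\partial M)} \le \varepsilon \|\varphi\|^2_{H^1} + C_\varepsilon \|\varphi\|^2_{L^2}, $$

thus (2.3) implies

$$ \|\varphi\|^2_{H^1} \le (1 + \delta) \|D\varphi\|^2_{L^2} + C_\delta \|\varphi\|^2_{L^2}. \tag{2.10} $$

Then a standard argument gives (2.9).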

The following corollary is a direct consequence of the Sobolev embedding theorem

$$ \|\varphi\|^2_{C^k} \le C \|\varphi\|^2_{H^{k + \frac{n}{2}}}. $$

Corollary 2. Any eigenspinor of the Dirac operator which satisfies either the local boundary condition $\varphi \in \Gamma^{loc}_\pm$ or the (negative) APS boundary condition $\varphi \in \Gamma^{APS}_-$ is smooth.

3. Lower Bounds for the Eigenvalues

In this section, we adapt the arguments used in [27] to the case of compact spin manifolds with boundary. In particular, we get generalizations of basic inequalities on the eigenvalues of the Dirac operator $D$ under the local boundary conditions $\Gamma^{loc}_\pm$ or the negative APS boundary condition $\Gamma^{APS}_-$. For this, we use the integral identity (2.3) together with an appropriate modification of the Levi–Civita connection. Let $D\varphi = \lambda \varphi$, where $\lambda$ is a real constant or a real function. For any real functions $a$ and $u$, we define

$$ \nabla^{a,u}_i = \nabla_i + a \nabla_i u + \frac{a}{n} \nabla_j u\, e^i \cdot e^j \cdot + \frac{\lambda}{n} e^i \cdot . \tag{3.1} $$

Then

$$ |\nabla^{a,u} \varphi|^2 = |\nabla \varphi|^2 + \frac{\lambda^2}{n} |\varphi|^2 + a^2 \Big(1 - \frac{1}{n}\Big) |du|^2 |\varphi|^2 + 2a \nabla_i u\, \Re(\nabla_i \varphi, \varphi) + \frac{2\lambda}{n} \Re(\nabla_i \varphi, e^i \cdot \varphi) $$
$$ \phantom{|\nabla^{a,u} \varphi|^2} = |\nabla \varphi|^2 - \frac{\lambda^2}{n} |\varphi|^2 + a^2 \Big(1 - \frac{1}{n}\Big) |du|^2 |\varphi|^2 + a \nabla_i u \nabla_i |\varphi|^2. $$

Define the functions $R_{a,u}$ by

$$ R_{a,u} = R - 4a \Delta u + 4 \nabla a \nabla u - 4 \Big(1 - \frac{1}{n}\Big) a^2 |du|^2, \tag{3.2} $$

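where $\Delta$ is the positive scalar Laplacian. Then we have

$$ \int_M |\nabla^{a,u} \varphi|^2 = \int_M \Big[ |\nabla \varphi|^2 - \frac{\lambda^2}{n} |\varphi|^2 - \Big( \frac{R_{a,u}}{4} - \frac{R}{4} \Big) |\varphi|^2 \Big] + \int_{\partial M} a\, du(e_0) |\varphi|^2. $$

Therefore (2.3) yields

$$ \int_M |\nabla^{a,u} \varphi|^2 = \int_M \Big[ \Big(1 - \frac{1}{n}\Big) \lambda^2 - \frac{R_{a,u}}{4} \Big] |\varphi|^2 + \int_{\partial M} \Big[ (\varphi, e^0 \cdot D^{\partial M} \varphi) + \Big( a\, du(e_0) - \frac{H}{2} \Big) |\varphi|^2 \Big]. \tag{3.3} $$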

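Now we generalize Lemma 2.3 in [11] to the case where $a$ is a real function.

Lemma 3. Suppose there exist a spinor field $\varphi \in \Gamma(S)$, a real number $\lambda$ and real functions $a$ and $u$ on $M$ such that for all $i$, $1 \le i \le n$,

$$ \nabla_i \varphi = -\frac{\lambda}{n} e^i \cdot \varphi - a \nabla_i u\, \varphi - \frac{a}{n} \nabla_j u\, e^i \cdot e^j \cdot \varphi. \tag{3.4} $$

Then $\varphi$ is a real Killing spinor, i.e., either $a = 0$ or $du = 0$. In particular, the manifold is Einstein.

Proof. First, observe that (3.4) implies $D\varphi = \lambda \varphi$. By the Ricci identity (see [11]), we have

$$ \frac{1}{2} R_{ij}\, e^i \cdot e^j \cdot \varphi = e^i \cdot D(\nabla_i \varphi) - D^2 \varphi $$
$$ = e^i \cdot e^j \cdot \nabla_j \Big( -\frac{\lambda}{n} e^i \cdot - a \nabla_i u - \frac{a}{n} \nabla_k u\, e^i \cdot e^k \cdot \Big) \varphi - \lambda^2 \varphi $$
$$ = \frac{\lambda}{n} e^i \cdot (e^i \cdot e^j \cdot + 2\delta_{ij}) \nabla_j \varphi - \lambda^2 \varphi - du \cdot da \cdot \varphi + a \Delta u\, \varphi - a \lambda\, du \cdot \varphi + \frac{1}{n} e^i \cdot (e^i \cdot e^j \cdot + 2\delta_{ij})\, e^k \cdot \big( \nabla_j a \nabla_k u\, \varphi + a \nabla_j \nabla_k u\, \varphi + a \nabla_k u \nabla_j \varphi \big) $$
$$ = \Big( \frac{2(1 - n)}{n} \lambda^2 + \frac{2a}{n} \Delta u - \frac{2(2 - n)}{n} \nabla a \nabla u \Big) \varphi - \frac{2}{n}\, du \cdot da \cdot \varphi + \frac{4 a \lambda}{n^2}\, du \cdot \varphi. $$

This implies either $a = 0$ or $du = 0$.

By (3.3) and Lemma 3, we obtain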



Theorem 4. Let $M^n$ be a compact Riemannian spin manifold of dimension $n \ge 2$, with boundary $\partial M$, and let $\lambda$ be any eigenvalue of $D$ under either the local boundary condition $\Gamma^{loc}_\pm$ or the (negative) APS boundary condition $\Gamma^{APS}_-$. If there exist real functions $a$, $u$ on $M$ such that $H \ge 2a\, du(e_0)$ on $\partial M$, where $H$ is the mean curvature of $\partial M$, then

$$ \lambda^2 \ge \frac{n}{4(n - 1)} \sup_{a,u} \inf_M R_{a,u}, \tag{3.5} $$

where $R_{a,u}$ is given in (3.2). In the limiting case with the local boundary conditions, the associated eigenspinor is a real Killing spinor and $\partial M$ is minimal.

Note that by [25], under the APS boundary conditions equality in (3.5) could not hold. Now we make use of the energy-momentum tensor (see [24]) to get lower bounds for the eigenvalues of $D$. For any spinor field $\varphi$, we define the associated energy-momentum 2-tensor $Q_\varphi$ on the complement of its zero set by

$$ Q_{\varphi,ij} = \frac{1}{2} \Re \big( e^i \cdot \nabla_j \varphi + e^j \cdot \nabla_i \varphi,\ \varphi / |\varphi|^2 \big). \tag{3.6} $$

If $\varphi$ is an eigenspinor of $D$, the tensor $Q_\varphi$ is well-defined in the sense of distributions. Let

$$ \nabla^{Q,a,u}_i = \nabla_i + a \nabla_i u + \frac{a}{n} \nabla_j u\, e^i \cdot e^j \cdot + Q_{\varphi,ij}\, e^j \cdot . \tag{3.7} $$

It is easy to prove that (see [27])

$$ |\nabla^{Q,a,u} \varphi|^2 = |\nabla \varphi|^2 - |Q_\varphi|^2 |\varphi|^2 + a^2 \Big(1 - \frac{1}{n}\Big) |du|^2 |\varphi|^2 + a \nabla_i u \nabla_i |\varphi|^2. $$

Therefore

$$ \int_M |\nabla^{Q,a,u} \varphi|^2 = \int_M \Big[ \lambda^2 - \Big( \frac{R_{a,u}}{4} + |Q_\varphi|^2 \Big) \Big] |\varphi|^2 + \int_{\partial M} \Big[ (\varphi, e^0 \cdot D^{\partial M} \varphi) + \Big( a\, du(e_0) - \frac{H}{2} \Big) |\varphi|^2 \Big]. \tag{3.8} $$

Thus we have

Theorem 5. Let $M^n$ be a compact Riemannian spin manifold of dimension $n \ge 2$, with boundary $\partial M$, and let $\lambda$ be any eigenvalue of $D$ under either the local boundary condition $\Gamma^{loc}_\pm$ or the (negative) APS boundary condition $\Gamma^{APS}_-$. If there exist real functions $a$, $u$ on $M$ such that $H \ge 2a\, du(e_0)$ on $\partial M$, where $H$ is the mean curvature of $\partial M$, then

$$ \lambda^2 \ge \sup_{a,u} \inf_M \Big( \frac{R_{a,u}}{4} + |Q_\varphi|^2 \Big). \tag{3.9} $$

In the limiting case, one has $H = 2a\, du(e_0)$ on $\partial M$.
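The passage from the integral identity (3.8) to the bound (3.9) can be spelled out as a short chain of inequalities. The following is only a sketch, under the assumptions that $\lambda$ is a real constant, that $H \ge 2a\, du(e_0)$ on $\partial M$, and that the eigenspinor $\varphi$ satisfies one of the two boundary conditions, so that the boundary integral of $(\varphi, e^0 \cdot D^{\partial M} \varphi)$ is nonpositive, as in the proof of (2.9).

```latex
% Sketch: from (3.8) to (3.9). Both boundary integrals are nonpositive:
% the first by the boundary condition, the second because H >= 2a du(e_0).
\begin{align*}
0 \;\le\; \int_M |\nabla^{Q,a,u}\varphi|^2
  &= \int_M \Big(\lambda^2 - \frac{R_{a,u}}{4} - |Q_\varphi|^2\Big)|\varphi|^2
   + \int_{\partial M} (\varphi, e^0\cdot D^{\partial M}\varphi)
   + \int_{\partial M} \Big(a\,du(e_0) - \frac{H}{2}\Big)|\varphi|^2 \\
  &\le \int_M \Big(\lambda^2 - \frac{R_{a,u}}{4} - |Q_\varphi|^2\Big)|\varphi|^2
   \;\le\; \Big[\lambda^2 - \inf_M\Big(\frac{R_{a,u}}{4} + |Q_\varphi|^2\Big)\Big]\int_M |\varphi|^2 .
\end{align*}
```

Since $\int_M |\varphi|^2 > 0$ for an eigenspinor, the bracket must be nonnegative, and taking the supremum over the admissible pairs $(a, u)$ gives (3.9).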

Remark 6. Under either the local boundary condition $\Gamma^{loc}_\pm$ or the (negative) APS boundary condition $\Gamma^{APS}_-$, assume that $H \ge 0$. Take $a = 0$ or $u$ constant in (3.5) and (3.9), then one gets Friedrich’s inequality [10]

$$ \lambda^2 \ge \frac{n}{4(n - 1)} \inf_M R \tag{3.10} $$

and the following inequality [24]

$$ \lambda^2 \ge \inf_M \Big( \frac{R}{4} + |Q_\varphi|^2 \Big). \tag{3.11} $$
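As a quick sanity check of (3.10), here is a worked instance. The choice of the closed unit hemisphere as $M$ is an illustrative assumption and does not come from the text; it is used only because its scalar curvature and boundary mean curvature are explicit.

```latex
% Illustrative assumption: M = closed hemisphere of the unit round sphere S^n, n >= 2.
% Scalar curvature: R = n(n-1). The boundary is a totally geodesic equator, so H = 0 >= 0.
\[
\lambda^2 \;\ge\; \frac{n}{4(n-1)}\,\inf_M R
        \;=\; \frac{n}{4(n-1)}\cdot n(n-1)
        \;=\; \frac{n^2}{4},
\qquad\text{i.e.}\qquad |\lambda| \;\ge\; \frac{n}{2}.
\]
```

This is the same number as Friedrich's bound on the closed unit sphere. Whether it is attained on the hemisphere depends on the boundary condition imposed, which this sketch does not address.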

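4. Conformal Lower Bounds

As in the previous section and under the local boundary conditions $\Gamma^{loc}_\pm$, we show that the conformal arguments used in [27], combined with the integral formula (2.3), yield generalizations of all known lower bounds for the eigenvalues of the Dirac operator. Let $g$ be the metric of $M$. For any real function $u$ on $M$, consider a conformal metric $\bar g = e^{2u} g$. Denote by $\overline{D}$ the Dirac operator with respect to this conformal metric. If $D\varphi = \lambda \varphi$, then $\overline{D}\, \overline{\psi} = \lambda e^{-u} \overline{\psi}$, where $\overline{\psi} = e^{-\frac{n-1}{2} u} \varphi$. Note that

$$ \overline{\nabla}^{a,u}_{\bar e_i} = \overline{\nabla}_{\bar e_i} + a e^{-u} \nabla_i u + \frac{a}{n} e^{-u} \nabla_j u\, \bar e^i \cdot \bar e^j \cdot + \frac{\lambda}{n} e^{-u} \bar e^i \cdot, $$
$$ \overline{\Delta} u = -e^{-u} \big( \nabla_{e_i} (e^{-u} \nabla_{e_i} u) \big) = e^{-2u} (\Delta u + |du|^2), $$
$$ \overline{R}\, e^{2u} = R + 2(n - 1) \Delta u - (n - 1)(n - 2) |du|^2, $$

also, on $\partial M$,

$$ \overline{D}^{\partial M} \big( e^{-\frac{n-2}{2} u} \varphi \big) = e^{-\frac{n}{2} u} D^{\partial M} \varphi, \qquad \overline{H} = e^{-u} \big( H + (n - 1)\, du(e_0) \big). $$

Define the function $\overline{R}_{a,u}$ by

$$ \overline{R}_{a,u} = R + 4 \Big( \frac{n - 1}{2} - a \Big) \Delta u + 4 \nabla a \nabla u - \Big[ (n - 1)(n - 2) + 4(2 - n) a + 4 \Big(1 - \frac{1}{n}\Big) a^2 \Big] |du|^2, \tag{4.1} $$

where $\Delta$ is the positive scalar Laplacian. Then apply (3.3) to the conformal metric $\bar g$, to get

$$ \int_M |\overline{\nabla}^{a,u} \overline{\psi}|^2_{\bar g}\, \bar v_g = \int_M e^{-u} \Big[ \Big(1 - \frac{1}{n}\Big) \lambda^2 - \frac{\overline{R}_{a,u}}{4} \Big] |\varphi|^2 v_g + \int_{\partial M} \Big[ (\overline{\psi}, e^0 \cdot \overline{D}^{\partial M} \overline{\psi})_g + \Big( a\, du(e_0) - \frac{\overline{H}}{2} \Big) |\overline{\psi}|^2_g \Big] v_g, $$

hence

$$ \int_M |\overline{\nabla}^{a,u} \overline{\psi}|^2_{\bar g}\, \bar v_g = \int_M e^{-u} \Big[ \Big(1 - \frac{1}{n}\Big) \lambda^2 - \frac{\overline{R}_{a,u}}{4} \Big] |\varphi|^2 v_g + \int_{\partial M} e^{-u} \Big[ (\varphi, e^0 \cdot D^{\partial M} \varphi) + \Big( \Big(a - \frac{n - 1}{2}\Big) du(e_0) - \frac{H}{2} \Big) |\varphi|^2 \Big] v_g. \tag{4.2} $$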

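Note that $\overline{\nabla}^{a,u} \overline{\psi} = 0$ implies

$$ \nabla_i \varphi = -\frac{\lambda}{n} e^i \cdot \varphi - \Big(a - \frac{n}{2}\Big) \nabla_i u\, \varphi - \frac{1}{n} \Big(a - \frac{n}{2}\Big) \nabla_j u\, e^i \cdot e^j \cdot \varphi $$

(see [11]), we thus have either $a = \frac{n}{2}$ or $du = 0$ by Lemma 3. Thus we obtain:

Theorem 7. Let $M^n$ be a compact Riemannian spin manifold of dimension $n \ge 2$, with boundary $\partial M$, and let $\lambda$ be any eigenvalue of $D$ under the local boundary condition $\Gamma^{loc}_\pm$. If there exist real functions $a$, $u$ on $M$ such that $H \ge (2a - n + 1)\, du(e_0)$ on $\partial M$, where $H$ is the mean curvature of $\partial M$, then

$$ \lambda^2 \ge \frac{n}{4(n - 1)} \sup_{a,u} \inf_M \overline{R}_{a,u}, \tag{4.3} $$

where the function $\overline{R}_{a,u}$ is given in (4.1). In the limiting case, the associated eigenspinor is a real Killing spinor and either $H = du(e_0)$ or $H = 0$ on $\partial M$.

Since $\overline{Q}_{\overline{\psi}, \bar i \bar j} = e^{-u} Q_{\varphi,ij}$ under the conformal transformation $\bar g = e^{2u} g$, we apply (3.8) to the conformal metric $\bar g$, to get

$$ \int_M |\overline{\nabla}^{Q,a,u} \overline{\psi}|^2_{\bar g}\, \bar v_g = \int_M e^{-u} \Big[ \lambda^2 - \Big( \frac{\overline{R}_{a,u}}{4} + |Q_\varphi|^2 \Big) \Big] |\varphi|^2 v_g + \int_{\partial M} e^{-u} \Big[ (\varphi, e^0 \cdot D^{\partial M} \varphi) + \Big( \Big(a - \frac{n - 1}{2}\Big) du(e_0) - \frac{H}{2} \Big) |\varphi|^2 \Big] v_g. \tag{4.4} $$

Thus we have

Theorem 8. Let $M^n$ be a compact Riemannian spin manifold of dimension $n \ge 2$, with boundary $\partial M$, and let $\lambda$ be any eigenvalue of $D$ under the local boundary condition $\Gamma^{loc}_\pm$. If there exist real functions $a$, $u$ on $M$ such that $H \ge (2a - n + 1)\, du(e_0)$ on $\partial M$, where $H$ is the mean curvature of $\partial M$, then

$$ \lambda^2 \ge \sup_{a,u} \inf_M \Big( \frac{\overline{R}_{a,u}}{4} + |Q_\varphi|^2 \Big). \tag{4.5} $$

In the limiting case one has $H = (2a - n + 1)\, du(e_0)$ on $\partial M$.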

Remark 9. If $n \ge 3$, take $a = 0$ and $u = -\frac{2}{n - 2} \log h$ in (4.3) and (4.5), where $h$ is a positive eigenfunction of the first eigenvalue $\mu_1$ of the conformal Laplacian

$$ L := 4\, \frac{n - 1}{n - 2}\, \Delta + R $$

under the boundary condition

$$ dh(e_0) - \frac{(n - 2) H}{2(n - 1)}\, h = 0. $$

Then, one gets the lower bounds [22, 24]

$$ \lambda^2 \ge \frac{n}{4(n - 1)}\, \mu_1 \tag{4.6} $$

and

$$ \lambda^2 \ge \inf_M \Big( \frac{\mu_1}{4} + |Q_\varphi|^2 \Big) \tag{4.7} $$

under the local boundary condition $\Gamma^{loc}_\pm$. In the limiting case of (4.6), the associated eigenspinor is a real Killing spinor and $\partial M$ is minimal.

Acknowledgements. Research of S.M. is partially supported by a DGICYT grant No. PB97-0785. Research of X.Z. is partially supported by the Chinese NSF and mathematical physics program of CAS. This work is partially done during the visit of the last two authors to the Institut Élie Cartan, Université Henri Poincaré, Nancy 1. They would like to thank the institute for its hospitality.

References

1. Adams, R.A.: Sobolev spaces. New York: Academic Press, 1978
2. Atiyah, M.F., Patodi, V.K., Singer, I.M.: Spectral asymmetry and Riemannian geometry, I. Math. Proc. Cambr. Phil. Soc. 77, 43–69 (1975)
3. Atiyah, M.F., Patodi, V.K., Singer, I.M.: Spectral asymmetry and Riemannian geometry, II. Math. Proc. Cambr. Phil. Soc. 78, 405–432 (1975)
4. Atiyah, M.F., Patodi, V.K., Singer, I.M.: Spectral asymmetry and Riemannian geometry, III. Math. Proc. Cambr. Phil. Soc. 79, 71–99 (1976)
5. Bär, C.: Lower eigenvalue estimates for Dirac operators. Math. Ann. 293, 39–46 (1992)
6. Baum, H., Friedrich, T., Grunewald, R., Kath, I.: Twistor and Killing Spinors on Riemannian Manifolds. Seminarbericht 108, Humboldt-Universität zu Berlin, 1990
7. Bourguignon, J.P., Gauduchon, P.: Spineurs, Opérateurs de Dirac et Variations de Métriques. Commun. Math. Phys. 144, 581–599 (1992)
8. Bourguignon, J.P., Hijazi, O., Milhorat, J.-L., Moroianu, A.: A Spinorial Approach to Riemannian and Conformal Geometry. Monograph (In preparation)
9. Botvinnik, B., Gilkey, P., Stolz, S.: The Gromov Lawson Rosenberg conjecture for groups with periodic cohomology. J. Diff. Geo. 46, 374–405 (1997)
10. Friedrich, T.: Der erste Eigenwert des Dirac-Operators einer kompakten, Riemannschen Mannigfaltigkeit nicht negativer Skalarkrümmung. Math. Nachr. 97, 117–146 (1980)
11. Friedrich, Th., Kim, E.-C.: Some remarks on the Hijazi inequality and generalizations of the Killing equation for spinors. To appear in J. Geom. Phys.
12. Farinelli, S., Schwarz, G.: On the spectrum of the Dirac operator under boundary conditions. J. Geom. Phys. 28, 67–84 (1998)
13. Gibbons, G., Hawking, S., Horowitz, G., Perry, M.: Positive mass theorems for black holes. Commun. Math. Phys. 88, 295–308 (1983)
14. Gilkey, P.B.: Invariance theory, the heat equation, and the Atiyah–Singer index theorem. 2nd ed., Boca Raton: CRC Press, 1995
15. Ginoux, N., Morel, B.: Eigenvalue Estimates for the Submanifold Dirac Operator. Preprint IÉCN, Nancy, n° 44 (2000)
16. Grubb, G.: Heat operator trace expansions and index for general Atiyah-Patodi-Singer boundary problems. Commun. Part. Diff. Equat. 17, 2031–2077 (1992)
17. Grubb, G., Seeley, R.: Développements asymptotiques pour l’opérateur d’Atiyah–Patodi–Singer. C. R. Acad. Sci. Paris, Ser. I 317, 1123–1126 (1993)
18. Grubb, G., Seeley, R.: Weakly parametric pseudodifferential operators and Atiyah Patodi Singer boundary problems. Invent. Math. 121, 481–529 (1995)
19. Grubb, G., Seeley, R.: Zeta and eta functions for Atiyah-Patodi-Singer operators. J. Geom. Anal. 6, 31–77 (1996)
20. Herzlich, M.: A Penrose-like inequality for the mass of Riemannian asymptotically flat manifolds. Commun. Math. Phys. 188, 121–133 (1998)
21. Herzlich, M.: The positive mass theorem for black holes revisited. J. Geom. Phys. 26, 97–111 (1998)
22. Hijazi, O.: A conformal lower bound for the smallest eigenvalue of the Dirac operator and Killing spinors. Commun. Math. Phys. 104, 151–162 (1986)
23. Hijazi, O.: Première valeur propre de l’opérateur de Dirac et nombre de Yamabe. C. R. Acad. Sci. Paris 313, 865–868 (1991)
24. Hijazi, O.: Lower bounds for the eigenvalues of the Dirac operator. J. Geom. Phys. 16, 27–38 (1995)
25. Hijazi, O., Montiel, S., Zhang, X.: Dirac operator on embedded hypersurfaces. Math. Res. Lett. 8, 195–208 (2001)
26. Hijazi, O., Montiel, S., Zhang, X.: Conformal Lower Bounds for the Dirac Operator of Embedded Hypersurfaces. Asian J. Math., to appear
27. Hijazi, O., Zhang, X.: Lower bounds for the Eigenvalues of the Dirac Operator, Part I. The hypersurface Dirac Operator. To appear in Ann. Glob. Anal. Geom.
28. Hijazi, O., Zhang, X.: Lower bounds for the Eigenvalues of the Dirac Operator, Part II. The Submanifold Dirac Operator. To appear in Ann. Glob. Anal. Geom.
29. Lawson, H., Michelsohn, M.: Spin geometry. Princeton, NJ: Princeton Univ. Press, 1989
30. Morel, B.: Eigenvalue Estimates for the Dirac-Schrödinger Operators. J. Geom. Phys. 38, 1–18 (2001)
31. Trautman, A.: The Dirac Operator on Hypersurfaces. Acta Phys. Polon. B 26, 1283–1310 (1995)
32. Witten, E.: A new proof of the positive energy theorem. Commun. Math. Phys. 80, 381–402 (1981)
33. Zhang, X.: Lower bounds for eigenvalues of hypersurface Dirac operators. Math. Res. Lett. 5, 199–210 (1998); A remark on: Lower bounds for eigenvalues of hypersurface Dirac operators. Math. Res. Lett. 6, 465–466 (1999)
34. Zhang, X.: Angular momentum and positive mass theorem. Commun. Math. Phys. 206, 137–155 (1999)

Communicated by M. Aizenman

Commun. Math. Phys. 221, 267 – 292 (2001)




Boundary Layer Stability in Real Vanishing Viscosity Limit

Denis Serre1, Kevin Zumbrun2

1 ENS Lyon, UMPA (UMR 5669 CNRS), 46, allée d’Italie, 69364 Lyon Cedex 07, France.
2 Department of Mathematics, Indiana University, Rawles Hall, Bloomington, IN 47405, USA. E-mail: [email protected]

Received: 27 November 2000 / Accepted: 16 March 2001

Abstract: In the previous paper [20], an Evans function machinery for the study of boundary layer stability was developed. There, the analysis was restricted to strongly parabolic perturbations, that is to an approximation of the form ut + (F (u))x = ν(B(u)ux )x (ν 0.

(1)

Here, $F$ is a given flux, a $C^2$-vector field on a convex open subset $\mathcal{U}$ of $\mathbb{R}^n$. The diffusion tensor $B$ is of class $C^2$; its eigenvalues need to have non-negative real parts, but it is important not to assume the invertibility of $B(u)$. We assume that the rank of $B$ does not depend on $u$, and we denote it by $r$ ($1 \le r \le n$). The positive constant $\nu$ is small. We therefore are interested in the limit as $\nu \to 0^+$, expecting that the solutions $u^\nu$ of (1) converge boundedly almost everywhere to solutions of the inviscid system

$$ u_t + F(u)_x = 0, \qquad x, t > 0. \tag{2} $$

The local well-posedness of the Cauchy problem (where $x \in \mathbb{R}$ replaces $x > 0$) for (1) is a difficult problem, first addressed by Kawashima in his unpublished thesis [15]. A natural hypothesis (see [16]), that we shall adopt here, is that there exists a smooth change of variables $u \mapsto v(u)$ (with inverse $u = g(v)$), in which the system rewrites as

$$ g(v)_t + f(v)_x = \nu (b(v) v_x)_x, \tag{3} $$

with the following properties:

(H1) $b(v)$ is block-diagonal:
$$ b(v) = \begin{pmatrix} 0 & 0 \\ 0 & b_1(v) \end{pmatrix}, $$
with $b_1(v) \in GL_r(\mathbb{R})$.

(H2) $dg(v)$ is lower block-triangular:
$$ dg(v) = \begin{pmatrix} \gamma(v) & 0 \\ \cdot & \delta(v) \end{pmatrix}, $$
with, necessarily, $\gamma(v) \in GL_{n-r}$, $\delta(v) \in GL_r$.


(H3) For each $\bar v \in \mathcal{V} := v(\mathcal{U})$, the linear operator $\delta(\bar v) \partial_t - b_1(\bar v) \partial_x^2$ is strongly parabolic.

(H4) In the block decomposition of $df$,
$$ df(v) = \begin{pmatrix} h(v) & \cdot \\ \cdot & \cdot \end{pmatrix}, $$
the matrix $\gamma(v)^{-1} h(v)$ is diagonalisable with real eigenvalues.

In this list, (H1) is a little bit restrictive. But it only needs the eigenvalue $\lambda = 0$ of $B$ to be semi-simple. We shall see later on that a natural assumption (see (H9)) ensures this property. The last hypothesis means that the system obtained from (3) by removing the second order equations, and freezing the corresponding variables, is an $(n - r) \times (n - r)$ hyperbolic system. In view of these hypotheses, we shall denote by $(w, z)^T$ the block decomposition of $v$. Defining also $f =: (f_0, f_1)$, $g =: (g_0, g_1)$, we have $h = d_w f_0$ and $\gamma = d_w g_0$.

Since we are concerned with initial-boundary value problems (IBVP), we need to distinguish between two cases, the boundary $\{x = 0\}$ being characteristic or not. We say that it is characteristic at some state $u \in \mathcal{U}$ (corresponding to $v \in \mathcal{V}$) if one of the signal velocities of the system under consideration vanishes at $u$. For the perturbed system (1), or equivalently (3), this means that $h(v)$ is singular. For the “inviscid” system (2) this means that $dF(u)$ is singular. Characteristic IBVPs may be difficult to attack. But overall, the status of the boundary layer is much different according to the nature of the boundary. The example of a gas flow, where (2) is the Euler equations and (1) is the Navier–Stokes equations, is enlightening. A natural assumption is that the boundary is impermeable; then it is characteristic, for both (2) and (1). The width of the boundary layer is about the square root $\sqrt{\nu}$ of the viscosity. Its profile is expected to obey the Prandtl equation. Very little is known at a rigorous level in this case. On the contrary, an inflow (or outflow) boundary condition makes the boundary non-characteristic in most cases1. In that case, the width of the boundary layer is of order $\nu$ and its profile obeys an ODE (see (4) below).

A suitable analysis of this case was carried out by Gisclon and one of us [9, 10] in one space dimension, and by Grenier & Guès [12] in several space dimensions. These references deal with boundary layers of moderate amplitude, when (1) is strictly parabolic, that is $r = n$. Let us assume that (2) is hyperbolic, that is $dF(u)$ is diagonalisable with real eigenvalues. When $dF(u)$ is invertible, an IBVP needs $q$ independent scalar boundary conditions, where $q$ is the number of incoming characteristic curves, that is the number of positive eigenvalues of $dF(u)$. Similarly, assuming (H1–H4) and that $h(v)$ is invertible, an IBVP for (1) needs $r + p$ independent scalar boundary conditions, where $p$ is the number of incoming characteristics for the reduced system $g_0(w, \bar z)_t + f_0(w, \bar z)_x = 0$ ($\bar z$ constant), that is the number of positive eigenvalues of $\gamma(v)^{-1} h(v)$. We shall see that $p \le q \le r + p$ (Corollary 1); a boundary layer occurs when $q < p + r$. For the sake of simplicity, we shall restrict to a set of Dirichlet-type boundary conditions for (1). In [9, 10, 12], the convergence of (1) towards (2) was proved under natural assumptions, as long as the amplitude of the boundary layer remains smaller than some threshold.

1 In- or out-flow data are of interest in problems with apertures, such as occur in oil recovery.


For $x \gg \nu$, the solution $u^\nu$ of the IBVP for (1) behaves like the solution $\bar u$ of an appropriate IBVP for (2). For $x = O(\nu)$, it behaves as a layer $U(x/\nu; t)$. Here, the time variable acts just as a parameter and $U(\cdot; t)$ solves an ODE

$$ B(U) U' = F(U) - F(\bar u(0, t)), \tag{4} $$

with $U(+\infty; t) = \bar u(0, t)$ and $U(0; t)$ satisfying the boundary condition of (1). Given $\bar u(0, t)$, this is an overdetermined problem, which admits a solution if and only if $\bar u(0, t)$ belongs to some subset $C(t)$ (the reference to $t$ is there because the boundary data might depend on $t$). The relation $\bar u(0, t) \in C(t)$ then plays the rôle of a boundary condition for (2), called the residual boundary condition. Under suitable assumptions, $C(t)$ is a submanifold of codimension $q$ (see [9, 10, 19]) and gives rise to a locally well-posed IBVP for (2) (see [17] for a theory of such IBVP), which determines $\bar u$ on some strip $\mathbb{R}^+ \times (0, T)$. The restriction $r = n$, assumed in [9, 10], is not essential here, as pointed out by H. Freistühler (personal communication). We point out that $U(\cdot; t)$ is nothing but a steady solution of the IBVP for the rescaled problem

$$ u_\tau + F(u)_y = (B(u) u_y)_y. \tag{5} $$

Similarly, $U(\cdot/\nu; t_0)$ is a steady solution of the IBVP for (1). The restriction of moderate strength in [9, 10, 12] is actually relevant. We do not exclude that some strong layers become linearly unstable, which would forbid the convergence as $\nu \to 0$. The instability mechanism may be described as follows. Let us consider the linearized problems about $U(\cdot/\nu; t_0)$ and $U(\cdot; t_0)$:

$$ u_t = L^\nu u + \text{linearized boundary conditions}, \tag{6} $$
$$ u_\tau = L u + \text{linearized boundary conditions}. \tag{7} $$

Clearly, the linearized boundary conditions are the same for both problems; therefore $L$ and $L^\nu$ have the same domain $D$. One easily checks that $L^\nu$ is conjugated to $\nu^{-1} L$, through the rescaling $u \mapsto \tilde u$, $\tilde u(y) = u(\nu y)$. Let us now suppose that the spectrum of $L$ contains some complex number $\lambda$ with real part $\omega > 0$. Then $L^\nu$ admits the spectral value $\lambda/\nu$ and the boundary layer is more and more unstable as $\nu \to 0$: disturbances are amplified by a factor $\exp(\omega t/\nu)$ and are completely destroyed on a time scale $O(\nu)$. In other words, such a boundary layer may not be observed in practice and is irrelevant. As a matter of fact, the analysis in [9, 10, 12] implies that layers of moderate size, with $r = n$, are linearly stable.

Conversely, a recent work by Grenier and Rousset [13] shows that spectral stability of the boundary layer implies non-linear stability, under the condition that $r = n$. Let us give a short description of their result. Given Dirichlet boundary data $a(t)$ for (1), let $u$ be a smooth solution of the hyperbolic system (2) with initial data $u_0(x)$ and residual boundary condition $u(0, t) \in C(t)$ associated to $a$. Let $U(\cdot; t)$ be the boundary layer, determined by (4) and $U(0; t) = a(t)$ (and therefore $U(+\infty; t) = u(0, t)$). Finally, let $u^\nu$ be the solution of the IBVP for (1). Assume that for every $t \in [0, T)$, the boundary layer $U(\cdot; t)$ is spectrally stable. Then $u^\nu$ converges strongly towards $u$.

This motivates our study of the “spectral stability of the boundary layer”. By this, we mean that the spectrum of $L$ is included in the left (say, stable) half-space $\{\lambda \in \mathbb{C};\ \Re \lambda \le 0\}$. To decide whether a given boundary layer is spectrally stable or not is a difficult task, which cannot be solved explicitly by quadrature. We shall see that, under reasonable assumptions, the essential spectrum of $L$ lies in the stable half-space.
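The conjugation between $L^\nu$ and $\nu^{-1} L$ mentioned above can be checked by a one-line change of variables. The sketch below assumes the usual explicit form of the linearization of (5) about a steady profile $U$; this form, and the shorthand $U^\nu(x) = U(x/\nu)$, are not written out in the text and are stated here as assumptions.

```latex
% Assumed form of the linearized operators (hypothetical notation, U^\nu(x) = U(x/\nu)):
%   L u     = (B(U)u')' + ((dB(U)u)\,U')' - (dF(U)u)'                                   in the variable y,
%   L^\nu u = \nu (B(U^\nu)u_x)_x + \nu ((dB(U^\nu)u)\,U^\nu_x)_x - (dF(U^\nu)u)_x      in the variable x.
\begin{align*}
u(x) = \tilde u(x/\nu),\quad y = x/\nu
 &\;\Longrightarrow\; \partial_x = \nu^{-1}\partial_y, \qquad U^\nu_x(x) = \nu^{-1}U'(y),\\
(L^\nu u)(x)
 &= \nu^{-1}\Big[(B(U)\tilde u')' + ((dB(U)\tilde u)\,U')' - (dF(U)\tilde u)'\Big](y)
  = \nu^{-1}(L\tilde u)(y),\\
L\tilde u = \lambda\tilde u
 &\;\Longrightarrow\; L^\nu u = \frac{\lambda}{\nu}\,u,
 \qquad \big|e^{(\lambda/\nu)t}\big| = e^{\omega t/\nu}, \quad \omega = \Re\lambda .
\end{align*}
```

Each of the three terms of $L^\nu$ carries exactly one net factor $\nu^{-1}$ after the substitution, which is why the whole spectrum is simply dilated by $1/\nu$.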



Therefore, instability can occur only when $L$ admits an eigenvalue with $\Re \lambda > 0$. This yields the eigenvalue problem

$$ (L - \lambda) u = 0, \qquad u \in D. \tag{8} $$

The difficulty then comes from the fact that, since $L$ is a differential operator with variable coefficients, we are not able to solve explicitly the ODE $(L - \lambda) u = 0$. The information obtained by differentiating (4) is clearly not enough:

$$ L U' = 0. \tag{9} $$

We point out in passing that $U'$ does not satisfy the linearized boundary condition in general, so that (9) does not mean that zero is an eigenvalue of $L$, contrary to the case of travelling waves (see for instance [7]).

In the sequel, we first focus on the stability analysis of one single given layer $U$. We denote by $u_+$ its limit at $+\infty$ and we define $V := v \circ U$. We only assume that (H1–H4) hold on a neighbourhood $\mathcal{U}$ of the range of $U$. In order to have minimal hypotheses, we complete (H1–H4) by

(H5) The boundary is non-characteristic for (1), that is
$$ h(v) \in GL_{n-r}(\mathbb{R}), \qquad \forall v \in \mathcal{V} := v(\mathcal{U}). $$

(H6) Strict hyperbolicity of (2) near $u_+$: for $u$ in some neighbourhood of $u_+$, the matrix $dF(u)$ is diagonalisable with real eigenvalues of constant multiplicities.

(H7) The boundary is non-characteristic for (2) at $u_+$, that is $dF(u_+) \in GL_n(\mathbb{R})$.

(H8) The state $u_+$ is linearly $L^2$-stable for the Cauchy problem of (1): for all $\xi \in \mathbb{R}^*$, the eigenvalues of the matrix $\xi^2 B(u_+) + i \xi\, dF(u_+)$ have strictly positive real parts.

For the sake of simplicity, we shall denote $K_+ := K(u_+)$ (for functions of the variable $u \in \mathcal{U}$) or $k_+ = \lim_{x \to +\infty} k(x)$ (for functions of the variable $x > 0$). When (H8) holds, there exists a positive $\theta$ such that $\Re \kappa \ge \theta \xi^2$ holds for every eigenvalue $\kappa$ of $\xi^2 B_+ + i \xi\, dF_+$ and $|\xi| < 1$. This estimate is not uniformly valid for $\xi \in \mathbb{R}$ when $r < n$. We point out that (H8) implies that $dF_+$ has real eigenvalues (examine the limit as $\xi \to 0$), a slightly weaker property than (H6). Also, (H8) follows from stronger, but rather natural, hypotheses:

(H9) There is a dissipative symmetrizer $S_+$ at $u_+$, that is a positive definite symmetric matrix such that $S_+ dF_+$ is symmetric and
$$ \forall X \in \mathbb{R}^n, \qquad (S_+ B_+ X, X) \ge \beta \|B_+ X\|^2, $$
where $(\cdot\,, \cdot)$ denotes the scalar product in $\mathbb{R}^n$ and $\beta > 0$ is a constant.

(H10) The hyperbolic and parabolic modes do couple: the kernel of $B_+$ does not contain eigenvectors of $dF_+$.

Lemma 1. Hypotheses (H9, H10) imply (H8).
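Before the proof, a small numerical illustration of what Lemma 1 asserts may help. The sketch below uses a toy pair $dF_+ = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $B_+ = \mathrm{diag}(0, 1)$, which is an assumption made purely for illustration and is not taken from the paper. For this pair $S_+ = I$ is a dissipative symmetrizer with $\beta = 1$, and $\ker B_+$ contains no eigenvector of $dF_+$, so (H9) and (H10) hold. The function names are hypothetical.

```python
import cmath


def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of the matrix [[a, b], [c, d]] via its characteristic polynomial."""
    tr = a + d
    det = a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2


def symbol_eigenvalues(xi):
    """Eigenvalues of xi^2 * B + i * xi * dF for the toy pair
    dF = [[0, 1], [1, 0]], B = diag(0, 1) (illustrative assumption)."""
    return eigenvalues_2x2(0, 1j * xi, 1j * xi, xi * xi)


def min_real_part(xi):
    """Smallest real part among the eigenvalues; (H8) asks that it be > 0 for xi != 0."""
    return min(k.real for k in symbol_eigenvalues(xi))


def ratio(xi):
    """min Re(kappa) / xi^2, the quantity that would bound theta from above."""
    return min_real_part(xi) / (xi * xi)


if __name__ == "__main__":
    for xi in (0.1, 0.5, 1.0, 2.0, 10.0, 100.0):
        print(xi, min_real_part(xi), ratio(xi))
```

In this toy case the real parts stay positive for every nonzero $\xi$ tested, while the ratio to $\xi^2$ decays for large $|\xi|$, in line with the remark that the estimate $\Re \kappa \ge \theta \xi^2$ is not uniform in $\xi$ when $r < n$.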



Proof. Let ξ ∈ R∗ and let (λ, X) be an eigenpair of ξ 2 B+ + iξ dF+ . Then (ξ 2 B+ + iξ dF+ − λ)X = 0. Multiplying by X ∗ S+ , and taking the real part, we obtain (λ)X∗ S+ X = ξ 2 (S+ B+ X, X∗ ) ≥ ξ 2 βB+ X2 . Therefore λ is positive. It is strictly so, because otherwise B+ X = 0 so that X would be an eigenvector of dF+ . Such a symmetrizer usually comes as the Hessian matrix of an entropy for (2), which is strongly convex at u+ and dissipative for (1). The rôle of (H9) in the computation of a “stability index” has been explained in [2]. Let us point out that assumption (H9) immediately implies that the range R(B+ ) is S+ -orthogonal to ker B+ . This shows that zero is a semi-simple eigenvalue of B+ , a property which was implicit in assumption (H1). We also remark that instability does not occur in scalar problems (n = 1), even at a non-linear level, as shown in [4]. Our paper is organized as follows. In the next section, we study the boundary layer equation in a geometrical setting and we show that the stability analysis reduces to the search of ordinary eigenvalues. In Sect. 3, we built our Evans function, following [20] and focus on its crucial estimate at λ = 0. In Sect. 4, we consider a richer situation, where the boundary layer is parametrized in such a way that it is a piece of a maximal solution of the layer equation. When this solution is a viscous shock profile and the piece is almost the whole, then we show that the stability index is the sign of an algebraic expression. This sign can be computed in several cases. The remaining sections are devoted to full as well as to isentropic gas dynamics. For full gas dynamics (Sect. 5), we show that for an adiabatic constant γ > 2 and when the viscosity coefficient ν dominates the heat diffusion κ (for instance, ν > κ works), then there exist unstable boundary layers with inflow. As explained above, such instability is only shown for layers of large amplitude, which are almost heteroclinic orbits of (4). 
This result is the main application of our analysis. Finally, an appendix shows that weak boundary layers are spectrally stable, thanks to generalized energy inequalities.

2. Linear and Non-Linear Dynamical Systems

We begin with the non-linear equation (4), that we rewrite as

$$B(U)U' = F(U) - F(u_+). \tag{10}$$

When $r < n$, this is not an ODE in the strict sense, but a “differential-algebraic” equation. It may be better to see it in the form

$$b(V)V' = f(V) - f(v_+), \qquad v_+ = v(u_+). \tag{11}$$

We split this system into two pieces:

$$f_0(W, Z) = f_0(v_+), \qquad b_1(W, Z)Z' = f_1(W, Z) - f_1(v_+).$$

From (H5), the identity $f_0(v) = f_0(v_+)$ allows us to determine $w$ in terms of $(z, v_+)$ in a neighbourhood of the range of $V$: $w = \hat{w}(z, v_+)$. Therefore, the differential part becomes an ODE in $z$, well-defined in a neighbourhood of the $z$-projection of this range. Let us write it as

$$z' = G(z; v_+). \tag{12}$$

We know $\hat{w}(z_+, v_+) = w_+$ and therefore $G(z_+, v_+) = 0$.



Lemma 2. Under (H7, H9, H10), the rest point $z_+$ of the dynamical system (12) is hyperbolic. Its stable manifold is of dimension $r + p - q$.

Proof. One easily computes $d_z G(z_+, v_+) = b_1(v_+)^{-1}(d_w f_1\, d_z\hat{w} + d_z f_1)_+$. Let us consider eigenvalues $\sigma$ of $d_z G(z_+, v_+)$. We have $\det(d_w f_1\, d_z\hat{w} + d_z f_1 - \sigma b_1)_+ = 0$. However, $d_z\hat{w} = -(d_w f_0)^{-1} d_z f_0$. Thus, using Schur's formula, we arrive at $\det(df_+ - \sigma b_+) = 0$, or equivalently $\det(dF_+ - \sigma B_+) = 0$. Up to a non-zero constant, this determinant is the characteristic polynomial of $dG_+$. According to (H7, H8), $\sigma$ may not be purely imaginary. Therefore, $z_+$ is a hyperbolic rest point.

We now proceed by homotopy. For $m > 0$, we define $P_m(\sigma) := \det(dF_+ - \sigma(B_+ + mI_n))$. Since the pair $(dF_+, B_+ + mI_n)$ satisfies the assumptions (H7, H8), we again see that $P_m$ does not vanish on the imaginary axis. Since its degree $n$ is constant for $m > 0$, we deduce that the number of roots of negative real part does not depend on $m > 0$. Letting $m \to +\infty$, we find that this number is $n - q$. Since the degree of $P_m$ drops to p as $m$ reaches $0+$, its roots split into two parts. One set contains those which tend to the roots of $P_0$ with negative real parts. The cardinality of this set is the dimension of the stable manifold. The other set consists of those roots which tend to infinity as $m \to 0+$. To prove the lemma, we need to show that its cardinality is $n - r - p$, the number of negative eigenvalues of $\gamma_+^{-1}h_+$. By density, we may assume that this matrix has only simple eigenvalues.

We first show that, if $P_m(\sigma_m) = 0$ and $\sigma_m$ tends to infinity, then $m\sigma_m$ tends to such an eigenvalue. Indeed, $P_m(\sigma_m) = 0$ means that there is an $X_m$, say of unit norm, such that $df_+ X_m = \sigma_m(B_+ + m\,dg_+)X_m$. Dividing by $\sigma_m$, we first obtain that $X_m$ has a cluster point $\bar{X}$ in $\ker B_+$. Obviously, $\bar{X}$ has unit norm. Next, retaining only the p first rows of the equality, we have $h_+\bar{X} \sim \sigma_m m\,\gamma_+\bar{X}$, which proves the claim.

Conversely, let $\mu_0 < 0$ be a negative eigenvalue of $\gamma_+^{-1}h_+$.
This means that there exists a non-zero pair $(Y_0, Z_0)$ with $Y_0 \in \ker B_+$, $Z_0 \in R(B_+)$ and $(dF_+ - \mu_0)Y_0 = B_+ Z_0$. From (H5), $\mu_0 \neq 0$ and we may redefine $Z_0$ so as to have $(dF_+ - \mu_0)Y_0 = \mu_0 B_+ Z_0$. This means in particular that $(dF_+ - \mu_0)(\ker B_+) \cap R(B_+) \neq \{0\}$. Since the sum of the dimensions of these spaces is $n$ (hypothesis (H10)), this means that their sum has codimension one. Let $l_0$ be a non-trivial linear form vanishing on it. From the simplicity of $\mu_0$, we know that $l_0 Y_0 \neq 0$; we therefore normalize $l_0$ by $l_0 Y_0 = 1$. We now define the following non-linear mapping:

$$N : \mathbb{R}^2 \times \ker B_+ \times R(B_+) \to \mathbb{R} \times \mathbb{R}^n, \qquad (m, \mu, Y, Z) \mapsto N(m, \mu, Y, Z) := \begin{pmatrix} l_0 Y - 1 \\ (dF_+ - \mu)(Y + mZ) - \mu B_+ Z \end{pmatrix}.$$

We already have $N(0, \mu_0, Y_0, Z_0) = 0$. We check easily that the differential $d_{\mu,Y,Z}N$, computed at $(0, \mu_0, Y_0, Z_0)$, is injective, thus invertible. From the implicit function theorem, we obtain a locally defined smooth function $m \mapsto (\mu, Y, Z)$, whose graph is the zero set of $N$ near $(0, \mu_0, Y_0, Z_0)$. Then $X := Y(m) + mZ(m)$ and $\sigma := \mu(m)/m$ satisfy $(dF_+ - \sigma(B_+ + m))X = 0$, so that $P_m(\sigma) = 0$, with $\sigma < 0$.
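The root count used in this proof can be checked by hand on a 2 × 2 example. The Python sketch below is an illustration only and not part of the original argument. It assumes the isentropic model of Sect. 6 ($n = 2$, $r = 1$) written in conservative variables, for which a short computation gives $dF = \begin{pmatrix} 0 & 1 \\ c^2 - v^2 & 2v \end{pmatrix}$ and $B = \begin{pmatrix} 0 & 0 \\ -\nu v/\rho & \nu/\rho \end{pmatrix}$; the function names are hypothetical. It counts the roots of $P_m$ with negative real part for $m > 0$ and at $m = 0$, to be compared with $n - q$ and $r + p - q$ respectively.

```python
import cmath


def homotopy_roots(v, c, nu, rho, m):
    """Roots of P_m(sigma) = det(dF - sigma (B + m I)) for the assumed 2x2 model.

    Expanding the determinant gives
        m (nu/rho + m) sigma^2 - (2 v m + nu v / rho) sigma - (c^2 - v^2).
    """
    a2 = m * (nu / rho + m)            # vanishes at m = 0: the degree drops
    a1 = -(2.0 * v * m + nu * v / rho)
    a0 = -(c * c - v * v)
    if a2 == 0.0:
        return [complex(-a0 / a1)]     # single finite root of P_0
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2 * a0)
    return [(-a1 + disc) / (2.0 * a2), (-a1 - disc) / (2.0 * a2)]


def count_stable(roots):
    """Number of roots with negative real part."""
    return sum(1 for s in roots if s.real < 0)


def expected_counts(v, c):
    """Return (n - q, r + p - q) for the isentropic model (n = 2, r = 1)."""
    n, r = 2, 1
    q = sum(1 for speed in (v - c, v + c) if speed > 0)  # incoming characteristics
    p = 1 if v > 0 else 0                                # sign of the 1x1 matrix v
    return n - q, r + p - q
```

For a subsonic inflow such as `v = 0.5`, `c = 1`, one root of $P_m$ is stable for every $m > 0$ and the single root of $P_0$ is stable as well, in agreement with $n - q = r + p - q = 1$.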



Corollary 1. Under (H7, H9, H10), one has $p \le q \le p + r$.

We point out that both inequalities in this corollary are equivalent to each other, in the following sense. Let us for instance assume that $p \le q$ is true under (H7, H9, H10). Then $(-F, B)$ satisfy (H7, H9, H10) too, with $(p, q)$ replaced by $(n - r - p, n - q)$. Therefore, $n - r - p \le n - q$, or equivalently $q \le r + p$.

We deduce from the lemma that the profile $U$ tends exponentially fast to its limit $u_+$:

$$|U(y) - u_+| + |U'(y)| = O(e^{-\alpha_+ y}), \qquad \alpha_+ > 0. \tag{13}$$

This is actually clear for $Z$, then for $W$, using the formula $W = \hat{w}(Z, v_+)$. We now turn to the linear operator

$$Lu = \{B(U)u' + (dB(U)u)U' - dF(U)u\}'.$$

Its boundary conditions are given by $r + p$ linear forms $D_1, \dots, D_{r+p}$:

$$D_1 u(0) = \dots = D_{r+p}u(0) = 0. \tag{14}$$

The linear transform $u \mapsto dv(U)u$ shows that $L$ is conjugate to $l$, where

$$lv := dg(V)^{-1}\{b(V)v' + (db(V)v)V' - df(V)v\}'.$$

The boundary conditions transform accordingly:

$$d_1 v(0) = \dots = d_{r+p}v(0) = 0, \tag{15}$$

where $d_j \circ dv(U(0)) = D_j$. The operator $l$ is a list of $r$ second-order differential operators and $n - r$ first-order ones. Its domain is

$$D_l = \{(w, z) \in H^1(\mathbb{R}_+)^{n-r} \times H^2(\mathbb{R}_+)^r \; ; \; d_j(w(0), z(0)) = 0, \; 1 \le j \le r + p\}.$$

For instance, $D_l = H_0^1(\mathbb{R}_+)^{n-r} \times (H^2(\mathbb{R}_+) \cap H_0^1(\mathbb{R}_+))^r$ when $r + p = n$, that is when all the eigenvalues of $(\gamma^{-1}h)(V(0))$ are positive.

Let us now introduce the constant coefficient operator on the whole line

$$l_+ v = (dg_+)^{-1}(b_+ v'' - df_+ v'),$$

with domain $H^1(\mathbb{R})^{n-r} \times H^2(\mathbb{R})^r$. It is obtained from $l$ by taking the limit as $x \to +\infty$. Its spectrum, computable from the Fourier transform, is given by

$$\sigma_+ = \{\lambda \; ; \; \det(\mu^2 b_+ + i\mu\, df_+ + \lambda\, dg_+) = 0 \text{ for some } \mu \in \mathbb{R}\}.$$

From (H8), we know that $\sigma_+$ consists of numbers of strictly negative real part, apart from $\lambda = 0$. We shall denote by A the connected component of $\mathbb{C} \setminus \sigma_+$ which contains the right half-plane $\{\Re\lambda > 0\}$. As usual, the following lemma is crucial.

Lemma 3. For all $\lambda \in$ A, the operator $\lambda - l : D_l \to L^2(\mathbb{R}_+)$ is Fredholm with index zero. The eigenvalues of $l$, in A, are isolated.

Therefore,

Corollary 2. The unstable spectrum of $L$, or equivalently $l$, consists only of isolated eigenvalues of finite multiplicities.



Proof. This follows similarly as in the case of an asymptotically constant-coefficient operator on the whole line, by a now-standard argument of Henry [14]. Specifically, the result for constant coefficient operators can be established by direct computation, similarly as in [14, p. 138]; this can then be extended to the asymptotically constant case by a version of Weyl's Lemma (Theorem A.1 of [14, p. 136]) stating that, except for isolated eigenvalues of constant multiplicity, the spectrum of an operator is unchanged by relatively compact perturbation. Indeed, it is readily verified that an asymptotically constant-coefficient operator is a relatively compact perturbation of the corresponding constant-coefficient operator with limiting coefficients at $x \to +\infty$, see Exercise 2, p. 137 of [14]. Alternatively, following the approach of [21] for operators on the line, one can establish the result directly, by explicit construction of the Green's function in terms of the Evans function, followed by a direct computation showing that the location and multiplicity of eigenvalues of $L$ correspond exactly to the location and multiplicity of zeroes of the (analytic) Evans function.

3. The Evans Function

Following the general theory set up in [1], we construct a holomorphic function B : A → C, whose zeroes are the unstable eigenvalues of $L$. This extends Serre's construction [20] to the case of a non-invertible $B$. We call B the “Evans function” of $L$. Following Gardner & Zumbrun [7], we show that B extends analytically to a neighbourhood of the origin.

Let $\lambda$ be a complex number with $\Re\lambda > 0$, or more generally an element of A. We first rewrite the differential equation $(l - \lambda)v = 0$ as a linear first-order system of $n + r$ ordinary differential equations:

$$\begin{pmatrix} w \\ z \\ z' \end{pmatrix}' = M(x; \lambda)\begin{pmatrix} w \\ z \\ z' \end{pmatrix}, \qquad x > 0. \tag{16}$$

The boundary conditions are rewritten as $\hat{d}_j(w, z, z') := d_j(w, z) = 0$. The matrix $M_+(\lambda) = M(+\infty; \lambda)$ is hyperbolic, that is its eigenvalues have non-zero real parts. These are the zeroes of the polynomial $P_\lambda(\mu) := \det(\mu^2 b_+ - \mu\, df_+ - \lambda\, dg_+)$. By a continuation argument, there are $r + p$ eigenvalues of negative real part $\mu_1^+(\lambda), \dots, \mu_{r+p}^+(\lambda)$, counting with multiplicities. The corresponding (generalized) eigenvectors span the “stable subspace” $E_+(\lambda)$ of $M_+(\lambda)$. It follows that the set of bounded solutions of (16) is a vector space of dimension $r + p$, that we denote by $E(\lambda)$. Such solutions actually decay exponentially fast as $x \to +\infty$. The space $E_+(\lambda)$ is the limit of the trace $E(\lambda; x)$, as $x \to +\infty$. The space $E(\lambda)$ depends holomorphically on $\lambda$. If it were possible to select a holomorphic basis $B(\lambda) = \{\phi_1(\cdot; \lambda), \dots, \phi_{p+r}(\cdot; \lambda)\}$ of $E(\lambda)$, then one should define B(λ) directly by

$$B(\lambda) := \det\left(\hat{d}_j(\phi_k(0; \lambda))\right)_{1 \le j,k \le p+r}. \tag{17}$$

The vanishing of such a number is clearly equivalent to the existence of a linear combination of the φk ’s, on which the dˆj ’s vanish simultaneously. This amounts to saying



that there exists a $\phi$ in $E(\lambda)$, such that $\hat{d}_1\phi = \dots = \hat{d}_{r+p}\phi = 0$. Equivalently, there is a $v \in D_l$ such that $(l - \lambda)v = 0$: $\lambda$ is an eigenvalue of $l$. Reciprocally, B vanishes at every eigenvalue of $l$ in A.

This procedure is possible when $r + p = 1$. It is also possible in every small open ball in A. However, in the general case, it raises serious difficulties because of the existence of branching points in A, where $M_+(\lambda)$ fails to be diagonalisable. At such points, the natural choice of B, given by prescribed asymptotic behaviour of the $\phi_k$'s, is meaningless. To overcome this difficulty, one commonly works in the exterior algebra $\Lambda^{r+p}(\mathbb{C}^{n+r})$. For $1 \le m \le n + r$, there is a unique homomorphism $M^{(m)}(x; \lambda)$ in $\Lambda^m(\mathbb{C}^{n+r})$, such that $m$ solutions $\phi_1, \dots, \phi_m$ of (16) always satisfy

$$\frac{d}{dx}\left(\phi_1 \wedge \dots \wedge \phi_m\right) = M^{(m)}(x; \lambda)\,\phi_1 \wedge \dots \wedge \phi_m.$$

When $m = r + p$, $M_+^{(p+r)}(\lambda)$ has the nice property that it has only one eigenvalue $\mu_+(\lambda)$ of minimal real part and that it is simple. Actually,

$$\mu_+(\lambda) = \mu_1^+(\lambda) + \dots + \mu_{r+p}^+(\lambda).$$

The corresponding eigenvector has the form $y_1 \wedge \dots \wedge y_{r+p}$, where $\{y_1, \dots, y_{r+p}\}$ is any basis of $E_+(\lambda)$. Since $\lambda \mapsto M_+^{(r+p)}(\lambda)$ is holomorphic and $\mu_+$ is simple, $\mu_+(\lambda)$ is holomorphic too. Therefore, one may select a holomorphic section $\lambda \mapsto Y(\lambda)$ of the eigen-bundle:

$$(M_+^{(r+p)}(\lambda) - \mu_+(\lambda))Y(\lambda) = 0, \qquad Y(\lambda) \neq 0.$$

In addition, noticing that $\mu_+(\lambda)$ is real when $\lambda \in\, ]0, +\infty[$, we infer that one may choose $Y(\lambda)$ so that it is real when $\lambda$ is. In particular, $Y(\bar{\lambda}) = \overline{Y(\lambda)}$. Next, there is a unique solution $y(\cdot; \lambda)$ of

$$y' = M^{(r+p)}(x; \lambda)y, \qquad y(x; \lambda) \sim \exp(\mu_+(\lambda)x)\,Y(\lambda) \text{ as } x \to +\infty. \tag{18}$$

For every $\lambda \in$ A, $y(\cdot; \lambda)$ equals, up to a constant, a wedge product of a basis of $E(\lambda)$. Moreover, it inherits the holomorphy of $Y(\lambda)$. Similarly, $y(\bar{\lambda}) = \overline{y(\lambda)}$. Defining the $(p + r)$-form $\hat{d} := \hat{d}_1 \wedge \dots \wedge \hat{d}_{p+r}$, we may define our Evans function as

$$B(\lambda) = \langle \hat{d}, y(0; \lambda) \rangle.$$

Besides all the above-mentioned properties, we point out that it takes real values on the real positive semi-axis. Given any point $\lambda_0 \in$ A, it admits a form (17) in a vicinity of $\lambda_0$. This is the way we compute its local behaviour in practice.

We now point out that $\lambda \mapsto (\mu_+(\lambda), Y(\lambda))$ admits an analytic extension in a neighbourhood of the origin. Then, thanks to the exponentially fast convergence of $M^{(p+r)}(x; \lambda)$ towards $M_+^{(p+r)}(\lambda)$, we deduce (“gap lemma”, see [7]):

Proposition 1. The spaces $E_+(\lambda)$, $E(\lambda)$, the eigenvector $Y(\lambda)$, the eigenfunction $y(\cdot; \lambda)$ and the Evans function B(λ) extend analytically to a neighbourhood $\hat{A}$ of the origin.

Let us point out that, however, these extensions no longer obey the same definitions. For instance, $E_+(\lambda)$ is no longer the stable subspace of $M_+(\lambda)$, and so on.

Let us describe $E_+(\lambda)$ when $|\lambda| \ll 1$. For $\Re\lambda > 0$, the eigenvalues of $M_+(\lambda)$ are found by looking at decaying modes $e^{\mu x}\hat{v}$ of $l_+ - \lambda$: these are roots of $P_\lambda$. Since

$$P_\lambda(\mu) = \det(-\mu\, df_+ - \lambda\, dg_+) + O(\mu^2),$$



we see that $q$ roots vanish as $\lambda \to 0$ in A. They behave as $-\lambda/a_j$, where $a_{n-q+1}, \dots, a_n$ are the positive eigenvalues of $(dg_+)^{-1}df_+$, or of $dF_+ = df_+(dg_+)^{-1}$. The corresponding eigenvectors are $(r_j, 0)^T + O(\lambda)$, where $r_j$ is an eigenvector of $(dg_+)^{-1}df_+$ associated to $a_j$. In terms of eigenvectors $R_j$ of $dF_+$, one has $R_j = df_+ r_j$. The fact that $\mu_j$ extends analytically near the origin is clear when $a_j$ is simple, from the implicit function theorem. It is still true when $a_j$ is semi-simple (assumption (H6)).

The other roots tend to non-zero limits $\mu_j(0)$ as $\lambda \to 0$. These limits are roots of $\det(\mu b_+ - df_+) = 0$. These are the eigenvalues of negative real part of the matrix

$$M_1 := b_1^{-1}\left(d_z f_1 - d_w f_1 (d_w f_0)^{-1} d_z f_0\right), \qquad v = v_+.$$

Given a (generalized) eigenvector $\hat{z}$ of this matrix, one builds a (generalized) eigenvector of $M_+(0)$ through

$$\varphi := \begin{pmatrix} -(d_w f_0)^{-1} d_z f_0\, \hat{z} \\ \hat{z} \\ \mu\hat{z} \end{pmatrix}. \tag{19}$$

In summary, a basis $\{\varphi_1, \dots, \varphi_{p+r}\}$ of $E_+(0)$ is given by

$$\varphi_j = \begin{pmatrix} r_{n-q+j} \\ 0 \end{pmatrix} \text{ if } j \le q, \qquad \varphi_j = \begin{pmatrix} -(d_w f_0)^{-1} d_z f_0\, \hat{z}_j \\ \hat{z}_j \\ \mu_j(0)\hat{z}_j \end{pmatrix} \text{ if } q < j \le p + r.$$

Hereabove, $\{\hat{z}_{q+1}, \dots, \hat{z}_{p+r}\}$ is a basis of the stable subspace of $M_1$, in which $M_1$ has a Jordan form, diagonal if possible. The $\mu_j(0)$ are the corresponding eigenvalues of $M_1$.

We now turn to the elements of $E(0)$. These are solutions $\phi = (v, z')^T$ of (16) with $\lambda = 0$. This amounts to $lv = 0$, or $\{b(V)v' + (db(V)v)V' - df(V)v\}' = 0$. Integrating once, we obtain a first-order differential-algebraic equation:

$$b(V)v' + (db(V)v)V' - df(V)v = \text{constant} =: q. \tag{20}$$

Though $E(\lambda)$ is made up of functions decaying at $+\infty$ when $\lambda \in$ A, this is not true any more for $\lambda = 0$. However, $E(0)$ certainly contains all the exponentially decaying solutions of (16). These correspond to the decaying solutions of the homogeneous equation

$$b(V)v' + (db(V)v)V' - df(V)v = 0. \tag{21}$$

Such solutions form a vector subspace of dimension $p + r - q$, a basis of which is $\{\phi_{q+1}(0), \dots, \phi_{p+r}(0)\}$, where $\phi_j(0)$ solves (16) and

$$\phi_j(x; 0) \sim \exp(\mu_j(0)x)\,\varphi_j, \qquad x \to +\infty, \qquad q < j \le p + r.$$

The remaining elements of $E(0)$ actually do not decay, but have finite limits $\varphi \in \mathrm{Span}\{\varphi_1, \dots, \varphi_q\}$ (the case $\varphi = 0$ corresponds to the decaying solutions, already considered). The constant in (20) is computed by letting $x \to +\infty$. We thus complete a basis of $E(0)$ by choosing $\phi_j(0)$, solutions of (16) with $\lambda = 0$, according to

$$\lim_{x \to +\infty} \phi_j(x; 0) = \varphi_j, \qquad 1 \le j \le q.$$



With $\phi_j =: (v_j, z_j')^T$, we obtain from (20),

$$b(V)v_j' + (db(V)v_j)V' - df(V)v_j = -R_{n-q+j}, \qquad 1 \le j \le q.$$

Once a basis $B(0) = \{\phi_1(0), \dots, \phi_{p+r}(0)\}$ of $E(0)$ is chosen according to the above requirements, it is extendable in an analytic way in $\hat{A}$, as a basis of $E(\lambda)$.

Two remarks. First we point out that there remains much room in the choice of $B(0)$. Second, as mentioned above, $lV' = 0$ and $V'$ decays at infinity. Therefore, $V' \in E(0)$. Moreover, the asymptotic behaviour of $V'$ is generically

$$V'(x) \sim e^{\mu_j(0)x}\, r,$$

for some index $j > q$, with $(\mu_j(0)b - df)_+ r = 0$. In the case where $\mu_j(0)$ is real, we may choose $\phi_j = (V', 0)^T$.

3.1. Discussion of (20). We now show that (20) may be viewed as a traditional ODE, instead of a differential-algebraic equation. Let us denote the constant right-hand side by $q \in \mathbb{R}^n$. We first split the equation into two parts. With $q = (q_0, q_1)^T$ and $v = (w, z)^T$:

$$df_0(V)v = -q_0, \tag{22}$$

$$b_1(V)z' + (db_1(V)v)Z' - df_1(V)v = q_1. \tag{23}$$

We now differentiate (22) and keep (23) unchanged. This yields

$$\tilde{b}(x)v' - \tilde{a}(x)v = \tilde{q}, \tag{24}$$

with

$$\tilde{b} = \begin{pmatrix} h & \cdot \\ 0 & b_1 \end{pmatrix}, \qquad \tilde{a} = \begin{pmatrix} (df_0)' \\ \cdots \end{pmatrix}, \qquad \tilde{q} = \begin{pmatrix} 0 \\ q_1 \end{pmatrix}.$$

Thanks to (H5), $\tilde{b}$ is invertible. Therefore, (24) is a linear ODE in the traditional form. Every solution of (20) solves (24). Conversely, let $v$ be a solution of (24), with a constant vector $\tilde{q}$ and $\tilde{q}_0 = 0$, and assume that $(v(x), v'(x)) \to (v_\infty, 0)$ as $x \to +\infty$. Then (20) holds true, with $q_1 := \tilde{q}_1$ and $q_0 := -df_0(v_+)v_\infty$.

4. Parametrized IBVPs

Let $U : \mathbb{R} \to \mathcal{U}$ be a given solution of the differential-algebraic system (10). We emphasize that it is defined on the whole line $\mathbb{R}$, instead of on the semi-axis $\mathbb{R}_+$. We now consider the initial-boundary value problem for (1), on the space domain $I := \,]x_0, +\infty[$, instead of $\mathbb{R}_+$. For this, we provide (1) with $p + r$ suitable boundary conditions, possibly depending on the choice of $x_0$, and we assume that the restriction $U|_I$ satisfies these conditions. We are now concerned with the linear stability of $U|_I$ for the corresponding IBVP. For this, we construct the Evans function $B(x_0; \lambda)$. We easily see that it can be built as a continuous function with respect to $x_0$. Here, we focus on the sign of $W(x_0) := B(x_0; 0)$, versus the sign of $B(x_0; \cdot)$ near $+\infty$. By the intermediate value theorem, opposite signs imply the existence of at least one real positive root. In particular, $U|_I$ would be unstable. More precisely, opposite signs mean



that the number of unstable eigenvalues of $L$ is odd, while same signs mean that this number is even, keeping in mind that non-real eigenvalues come in complex conjugate pairs. In the sequel, we shall denote this parity by the stability index of $L$. Since it can be checked that $B(x_0; \cdot)$ does not vanish near infinity, a consequence of a Gårding estimate (see [2]), its sign does not depend on $x_0$. Therefore, the only ingredient in the computation of the stability index of $L$ is the sign of $W(x_0)$, which may vary with $x_0$. Though the exact computation of $W$ is not easy, we may expect to obtain some results by means of a qualitative study of $W$.

Notice that, in this case, a suitable choice of solutions $\phi_1, \dots, \phi_{p+r}$ of $\phi' = M(x; 0)\phi$ gives a coherent basis of $E(x_0; 0)$, for all $x_0$. That is, $\{\phi_1|_I, \dots, \phi_{p+r}|_I\}$ form a basis $B(x_0; 0)$. As in the previous section, we choose the $\phi_j$'s so that $\phi_j$ decays exponentially fast as $x \to +\infty$ if $j > q$, and $\phi_j$ tends to $\varphi_j$ as $x \to +\infty$ if $j \le q$. We also decompose $\phi_j =: (v_j, z_j')^T$.

A convenient tool for this study is a differential equation that $W$ should satisfy. Let us consider for instance the easiest case where $p + r = n$, that is when the boundary condition for (1) is a pure Dirichlet one (in other cases, it will often depend on $x_0$ when $U$ is not constant). Then $W(x_0) = \det(v_1(x_0), \dots, v_n(x_0))$. Let us differentiate, using the matrix $K := \tilde{b}^{-1}\tilde{a}$:

$$\begin{aligned} W' &= \sum_j \det(\dots, v_{j-1}, Kv_j + \tilde{b}^{-1}\tilde{q}_j, v_{j+1}, \dots) \\ &= (\mathrm{Tr}\,K)W + \sum_{j=1}^{q} \det(\dots, v_{j-1}, \tilde{b}^{-1}\tilde{q}_j, v_{j+1}, \dots) \\ &= (\mathrm{Tr}\,K)W + \frac{1}{\det\tilde{b}} \sum_{j=1}^{q} \det(\dots, \tilde{b}v_{j-1}, \tilde{q}_j, \tilde{b}v_{j+1}, \dots). \end{aligned}$$

Lemma 4. The spectrum of $K_+ = K(+\infty)$ is made up of the eigenvalues of $dG_+$ (with the same multiplicities), plus $\mu = 0$, with multiplicity $n - r$.

Proof. Because the first $n - r$ rows of $\tilde{a}_+$ vanish, we easily see that

$$(K_+ - \mu)\begin{pmatrix} w \\ z \end{pmatrix} = 0$$

is equivalent to either $\mu = 0$ or to $dG_+ z = \mu z$ with appropriate $w$. The case where eigenvalues of $dG_+$ are simple is done. The case of higher multiplicity follows by a density argument, as in, e.g. [6, 5].

Two subcases, $q = p$ or $p + 1$ (recall that we already know that $q \ge p$), are of strong interest for applications. We point out that the $p$ first components of $\tilde{b}v_k$ consist of the vector $df_0(V)v_k$, that is of $-q_{k,0}$, with $q_k =: (q_{k,0}, q_{k,1})^T$. For $k > q$, this is zero, since $q_k = 0$. Since $\tilde{q}_{j,0}$ vanishes too, we see that each of the above $n \times n$ determinants contains a null block of size $p \times (n - q + 1)$.

Let us first consider the case $q = p$. From $p + (n - q + 1) = n + 1$, we conclude that each of these determinants vanishes, so that $W' = (\mathrm{Tr}\,K)W$. Therefore, $W$ does not vanish; it keeps a constant sign. Since we can



prove (see Appendix A, below) that weak boundary layers are linearly stable, we already know that the signs of $B(x_0, 0)$ and $B(x_0, \lambda \gg 1)$ agree for $x_0 \gg 1$. We conclude that they do, for all $x_0$:

Theorem 1. Let $U$ be a boundary layer, with $p + r = n$ (so that the boundary condition is a pure Dirichlet one) and $q = p$. Then the stability index of the linearized operator $L$ is even.

From this, we cannot conclude regarding the linear stability of $U$.

The next subcase comes with $q = p + 1$. The same arguments as above show that each of the determinants is block-triangular, since $p + (n - q + 1) = n$. Therefore, they may be written as products of two determinants, of respective sizes $p \times p$ and $(n - p) \times (n - p)$. However, we may rewrite the differential equation in a simpler form, with the next lemma.

Lemma 5. If $r + p = n$ and $q = p + 1$, then

$$\sum_{j=1}^{q} \det(\dots, \tilde{b}v_{j-1}, \tilde{q}_j, \tilde{b}v_{j+1}, \dots) = (-1)^p \det(q_1, \dots, q_{p+1}, bv_{q+1}, \dots, bv_n).$$

Proof. Let us define $Q_j := q_j - \tilde{q}_j = (q_{j,0}, 0)^T$. We use $q_j = \tilde{q}_j + Q_j$ and the linearity of the determinant to rewrite $\det(q_1, \dots, q_{p+1}, bv_{q+1}, \dots, bv_n)$ as a sum of $2^{p+1}$ terms. Those containing two (or more) $\tilde{q}_j$ vanish, since they contain a null block of size $p \times (n - p + 1)$. The term with only $Q_j$'s vanishes too, since it contains a null block of size $(n - p) \times (p + 1)$. There remain only those terms with exactly one $\tilde{q}_j$. These are block diagonal and are not changed when replacing the lower null block (of size $(n - p) \times p$) by a non-zero block. Such a change is performed when replacing the corresponding $Q_l$'s by $-\tilde{b}v_l$, since their first $p$ components agree. We finally obtain the expected formula by noticing that $bv_k = \tilde{b}v_k$ for $k > q$.

We now face the linear ODE

$$W' = (\mathrm{Tr}\,K)W + (-1)^p \det(q_1, \dots, q_{p+1}, bv_{q+1}, \dots, bv_n) =: \tau(x)W + s(x), \tag{25}$$

for which we can write

$$W(x)\exp\left(-\int_0^x \tau(y)\,dy\right) = c + \int_0^x s(y)\exp\left(-\int_0^y \tau(\xi)\,d\xi\right)dy,$$

where $c$ is a constant. Then the sign of $W$ equals the one of the right-hand side. This may be evaluated for $x \to -\infty$, independently of $c$, each time the integral

$$\int_{-\infty}^{x} s(y)\exp\left(-\int_0^y \tau(\xi)\,d\xi\right)dy \tag{26}$$

diverges.
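The integrating-factor representation of the solution of (25) is easy to test numerically. The following Python sketch is only an illustration under our own assumptions: the function name is hypothetical, $\tau$ and $s$ are passed as plain callables, and the integrals are approximated by the composite trapezoid rule on a uniform grid between $0$ and $x$ (with $x$ possibly negative).

```python
import math


def solve_by_integrating_factor(tau, s, c, x, steps=4000):
    """Evaluate W(x) from
        W(x) exp(-int_0^x tau) = c + int_0^x s(y) exp(-int_0^y tau) dy
    with the composite trapezoid rule; x may be negative (signed step)."""
    h = x / steps
    y = 0.0
    big_t = 0.0                 # running value of int_0^y tau
    acc = 0.0                   # running value of int_0^y s exp(-int_0^. tau)
    tau_prev = tau(0.0)
    f_prev = s(0.0)             # integrand at y = 0, where exp(-0) = 1
    for _ in range(steps):
        y_next = y + h
        tau_next = tau(y_next)
        big_t += 0.5 * h * (tau_prev + tau_next)
        f_next = s(y_next) * math.exp(-big_t)
        acc += 0.5 * h * (f_prev + f_next)
        y, tau_prev, f_prev = y_next, tau_next, f_next
    return math.exp(big_t) * (c + acc)
```

With a constant $\tau = \mu + \sigma$ and $s(y) = S e^{\mu y}$, $\sigma > 0$, the integrand equals $S e^{-\sigma y}$, the integral diverges as $x \to -\infty$, and the computed $W(x)$ takes the sign of $-S/\sigma$ for $x$ negative enough, whatever the value of $c$.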



4.1. Boundary layers from 1-shock profiles. A rather interesting case consists in choosing for $U$ a viscous shock profile of a steady shock wave, that is a solution of (10) which admits a limit $u_-$ as $x \to -\infty$. In order to have some control on the behaviour of $U$ as $x \to -\infty$, we ask that this shock be non-characteristic for (2): $df_- := df(v_-)$ is invertible. Again, $z_-$ is a hyperbolic point for the ODE (12), and $Z$ takes values in its unstable manifold. Then, generically, there exists a pair $(\mu_-, r_-)$, such that

$$V'(x) \sim e^{\mu_- x}\, r_-, \qquad x \to -\infty, \tag{27}$$

and $(\mu b - df)_- r_- = 0$. We shall focus on the most frequent case of a Lax shock, that is an $(n - q)$-Lax shock. Similar to Lemma 2, we have

Lemma 6. For an $(n - q)$-Lax shock, the unstable manifold of $z_-$ for (12) is of dimension $1 + q - p$.

Example (fundamental). In full gas dynamics, (1) is the Navier–Stokes model, with viscosity and heat conduction. Then $n = 3$, $r = 2$. Let us assume that $p = 1$ and $q = 2$; that is $0 < u_+ < c_+$, where $c$ denotes the sound speed. Then the stable manifold of $z_+$ for (12) is a curve, which splits into two trajectories. For reasonable state laws (as the perfect gas law $p = (\gamma - 1)\rho e$), there is a single other state $u_-$ such that $F(u_-) = F(u_+)$, and the pair $(u_-, u_+)$ is a 1-Lax shock. (Let us remark that 2-shocks do not exist in gas dynamics, since the second characteristic field is linearly degenerate.) From Gilbarg's study [8], we know that a shock profile exists for every positive choice of the viscosity and heat conductivity. Therefore, one of the two possible trajectories $U$ such that $U(+\infty) = u_+$ is actually such a shock profile. This argument certainly admits generalisations to many systems and states $u_+$ such that $q = n - 1$. Our assumption that $U$ is a piece of a shock profile is thus not too restrictive.

Let us assume that $(u_-, u_+)$ is such a shock and that $U$ is its profile, with as before $r + p = n$, $q = p + 1$. Since we do not know explicitly the $v_{q+1}, \dots, v_n$, except for $v_n = V'$, we cannot in general evaluate $s(x)$. We therefore assume $q = n - 1$, that is $r = 2$, $p = n - 2$. Then, noticing that $f(v_-) = f(v_+)$ because of the Rankine–Hugoniot condition,

$$s = (-1)^n \det(q_1, \dots, q_{n-1}, bV') = (-1)^n \det(q_1, \dots, q_{n-1}, f(V) - f(v_\pm)).$$

Since $(u_-, u_+)$ is a 1-shock, Lemma 6 shows that $z_-$ is a source for (12). Thus, all eigenvalues of $dG_-$, and from Lemma 4, both non-zero eigenvalues of $K_-$ have positive real parts, one of them being $\mu_-$, the other one denoted by $\sigma_-$. Thus we have $\mathrm{Tr}\,K_- = \mu_- + \sigma_- > \mu_-$. Then the integrand in (26) is equivalent to

$$S e^{-\sigma_- x}, \qquad S := (-1)^p \det(q_1, \dots, q_{n-1}, b_- r_-).$$

This proves that (26) diverges. Then there are two generic pictures. Either $\mu_-$ is not real, and $\sigma_- = \bar{\mu}_-$. Then $W$ oscillates, as shown in [20], so that the stability index is odd when $x_0$ belongs to a denumerable union of intervals, implying the instability of the corresponding boundary layers. Or $\mu_-$, $\sigma_-$ are real and simple. Then the sign of $W$ as



$x \to -\infty$ is the one of $-S/\sigma_-$, an explicit quantity! This is made even simpler when choosing, as it is possible, $q_j = -R_{j+1}$ (recall that $(dF_+ - a_j)R_j = 0$ and $a_j > 0$):

$$S = -\det(R_2, \dots, R_n, b_- r_-) = -\det(R_2, \dots, R_n, B_- R_-),$$

with $R_- = df_- r_-$. Choosing appropriately an eigenform $L_1$ of $dF_+$ (that is a non-trivial solution of $L_1(dF_+ - a_1) = 0$), we also have

$$S = L_1 B_- R_- = L_1 b_- r_-.$$

At this stage, one may wonder about the consistency of this analysis, since a change in the choice of the basis $B(0)$ could result, for instance, in a flip of the sign of $S$. This is without taking into account the need to define continuously the bases $B(\lambda)$ along $\mathbb{R}_+$. Such a modification would also change the sign of $B(x_0; \lambda)$ for $\lambda \gg 1$, but it would of course not affect the sign of the product $B(x_0; 0)B(x_0; +\infty)$, since it is intrinsic. For the moment, let us say that, considering a one-parameter family of steady shock waves $L \mapsto (u_-, u_+)$, endowed with a smoothly varying family of profiles $L \mapsto U_L$, we obtain a presumably smooth function $L \mapsto S(L)$, instead of a single number. Then, detecting a value, for instance $L = 0$, where $S$ vanishes with $dS/dL \neq 0$, we conclude that for, say, $L > 0$, $S$ and $B(x_0; \lambda)$ have the same sign, for all $x_0$ and all $\lambda \gg 1$. Therefore, there is a point $X(L)$ such that, for $L > 0$ and $x_0 < X(L)$, the corresponding profile is unstable (let us point out that $\lim_{L \to 0} X(L) = -\infty$). To summarize this analysis, we write:

For $r = 2$ and a profile $U$ of a steady 1-shock wave, the vanishing of $S$ detects the instability of some boundary layers associated to nearby steady shocks, with $x_0$ […] 1 is the adiabatic constant. Its value is 5/3 for a mono-atomic gas, 7/5 for the air.



Table 1. Numbers of boundary conditions

v+          | v+ < −c+ | −c+ < v+ < 0 | 0 < v+ < c+ | v+ > c+
q           | 0        | 1            | 2           | 3
[p, p + r]  | [0, 2]   | [0, 2]       | [1, 3]      | [1, 3]

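From now on, we consider only perfect gases. Without loss of generality, we may assume that $\theta \equiv e$. We therefore have

$$f(v) = \begin{pmatrix} \rho v \\ \rho v^2 + (\gamma - 1)\rho e \\ \left(\tfrac{1}{2}v^2 + \gamma e\right)\rho v \end{pmatrix}, \qquad b(v) = \begin{pmatrix} 0 & 0 & 0 \\ 0 & \nu & 0 \\ 0 & \nu v & \kappa \end{pmatrix}.$$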

We notice that $r = 2$. We easily check (H1–H4, H6) for the Navier–Stokes system. Assumption (H5) asks that $v(0) \neq 0$. Since $\rho v$ is a constant along a boundary layer (because of the first line of (11)), this amounts to $v_+ \neq 0$. Next, denoting the sound speed by $c := \sqrt{\gamma(\gamma - 1)e}$, the characteristic speeds in the Euler equations (2) are $v - c$, $v$, $v + c$, so that (H7) asks that $v_+(v_+^2 - c_+^2) \neq 0$. The number of boundary conditions that we need for (1) and (2) is given by Table 1. From the existence of a strictly convex, dissipative entropy, we know that (H9) holds. Since (H10) holds trivially, Lemma 1 shows that (H8) is satisfied. Therefore, the construction of the Evans function and the analysis done in the previous section apply when $v_+ > 0$. When $v_+ > c_+$, the boundary layer is trivial (that is, constant) and the linearized IBVP has a full Dirichlet boundary condition. An obvious a priori estimate shows that such a layer is stable. The situation is less clear when $v_+ < 0$, since then a choice has to be made concerning the boundary condition (we need only two scalar data). Since the Evans function strongly depends on the linearized ones $d_1$, $d_2$, we anticipate that the stability index of a layer will depend not only on the layer but also on the linearized boundary conditions. We shall illustrate this dependence in the simpler case of an isentropic flow (see Sect. 6).
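The entries of Table 1 follow from simple sign counts, which the short Python sketch below reproduces. It is an illustration only: the function name is hypothetical, and it assumes, as the text indicates, that $q$ counts the positive characteristic speeds among $v_+ - c_+$, $v_+$, $v_+ + c_+$, and that $p$ is 1 or 0 according to the sign of $v_+$, with $r = 2$.

```python
def boundary_condition_counts(v_plus, c_plus, r=2):
    """Return (q, (p, p + r)) for full gas dynamics at the end state.

    q: number of positive characteristic speeds of the Euler equations (2);
    p: number of positive eigenvalues of the hyperbolic block, here the sign of v_plus.
    """
    speeds = (v_plus - c_plus, v_plus, v_plus + c_plus)
    if any(speed == 0 for speed in speeds):
        # (H7) excludes the characteristic cases v_plus in {-c_plus, 0, c_plus}
        raise ValueError("characteristic end state: (H7) is violated")
    q = sum(1 for speed in speeds if speed > 0)
    p = 1 if v_plus > 0 else 0
    return q, (p, p + r)
```

For a subsonic inflow such as `boundary_condition_counts(0.5, 1.0)`, the sketch gives $q = 2$ and $[p, p + r] = [1, 3]$, which is the case $q = n - 1$, $p = n - 2$ studied in Sect. 5.1.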

5.1. The case $0 < v_+ < c_+$. The case considered in the previous section ($r = 2$, $q = n - 1 = 2$, $p = n - 2 = 1$) corresponds to the choice $0 < v_+ < c_+$. From Gilbarg [8], we know that, given such a state $v_+$, there is a unique state $v_-$ with $f(v_-) = f(v_+)$ and that the pair $(v_-, v_+)$ is a 1-Lax steady shock. In particular, $v_- > c_-$. Also, it is proved that this shock admits a viscous profile $V$, for every choice of the positive functions $\nu$ and $\kappa$.

We first compute the expression $S = L_1 b_- r_-$. The differential form $L_1$ is the eigenform for $dF_+$, associated to $v_+ - c_+$, its first eigenvalue. We have

$$L_1 = (\gamma - 1)(e_+\, d\rho + \rho_+\, de) - \rho_+ c_+\, dv = \left(\frac{\gamma - 1}{2}v^2 + vc\right)_+ du_1 - (c + (\gamma - 1)v)_+\, du_2 + (\gamma - 1)\, du_3.$$



Dropping for a moment the minus indices, $r = r_-$ obeys the eigen-equation $(\mu b - df)r = 0$. With $r =: (x, y, z)^T$, this reads

$$\mu\begin{pmatrix} 0 \\ \nu y \\ \nu v y + \kappa z \end{pmatrix} = \begin{pmatrix} vx + \rho y \\ (v^2 + (\gamma - 1)e)x + 2\rho v y + (\gamma - 1)\rho z \\ \left(\tfrac{1}{2}v^2 + \gamma e\right)vx + \left(\tfrac{3}{2}v^2 + \gamma e\right)\rho y + \gamma\rho v z \end{pmatrix}.$$

From $vx + \rho y = 0$, we eliminate $x$. Making also a linear combination of the two last equalities, we arrive at

$$\mu\nu v y = (\gamma - 1)\rho v z + \rho(v^2 - (\gamma - 1)e)y, \qquad \mu\kappa z = \rho((\gamma - 1)ey + vz).$$

Since $r$ and $v$ are non-zero, we have $(y, z) \neq 0$, which implies

$$\begin{vmatrix} \mu\nu v + \rho((\gamma - 1)e - v^2) & -(\gamma - 1)\rho v \\ -(\gamma - 1)\rho e & \mu\kappa - \rho v \end{vmatrix} = 0. \tag{28}$$

Defining

$$\zeta := \frac{\mu\kappa}{\rho v} - 1,$$

this is rewritten as

$$\frac{\nu}{\kappa}\zeta(\zeta + 1) + \zeta\left(\frac{1}{\gamma M^2} - 1\right) + \frac{1 - \gamma}{\gamma M^2} = 0, \tag{29}$$

where $M = v_-/c_- > 1$ is the Mach number. This quadratic equation has two real solutions of opposite signs, the negative one corresponding to the smallest “eigenvalue” $\mu$. That is precisely the one which governs the asymptotic behaviour of $V'$ near $-\infty$ (see [8]). In passing, this shows that these eigenvalues are simple and real, so that the behaviour (27) is correct. From now on, by $\zeta$ we mean the negative root of (29). The corresponding eigenvector is given by the formula

$$r = \begin{pmatrix} -\rho\zeta \\ v\zeta \\ (\gamma - 1)e \end{pmatrix}, \qquad br = \begin{pmatrix} 0 \\ \nu v\zeta \\ \nu v^2\zeta + \kappa(\gamma - 1)e \end{pmatrix},$$

everything being evaluated at $v_-$. Finally,

$$S = L_1 b_- r_- = -(c_+ + (\gamma - 1)(v_+ - v_-))\,\nu_- v_-\zeta + \kappa_-(\gamma - 1)e_-.$$

We now investigate the possible vanishing of $S$. First of all, since $\nu$, $\kappa$, $v$, $e$ are positive and $\zeta$ is negative, $\aleph := c_+ + (\gamma - 1)[v]$ needs to be negative. This is far from true in general. For instance, a weak shock ($v_-$ being close to $v_+$) yields $\aleph \sim c_+ > 0$. Next, it can be shown that, as long as $\gamma \le 2$, all the 1-shocks satisfy $\aleph > 0$. However, when $\gamma > 2$, strong 1-shocks satisfy $\aleph < 0$. For instance, so-called “maximal” shocks (see



[3]), for which $e_-$ vanishes, or equivalently $M = +\infty$, give

$$\aleph = \left(\sqrt{\frac{2\gamma}{\gamma - 1}} - 2\right)v_+,$$

and the parenthesis is negative if and only if $\gamma > 2$. We therefore restrict to the case $\gamma > 2$ and select a strong enough steady 1-shock, that is one for which $\aleph < 0$. Having fixed the state $v_-$ in such a way, $S$ appears to be a homogeneous function of degree one of $(\nu_-, \kappa_-)$. Thus its vanishing depends only on the ratio $\nu/\kappa$. However, we easily obtain the following asymptotics:

$\nu/\kappa \to 0+$: then $S \sim (\gamma - 1)e_-\kappa_- > 0$,
$\nu/\kappa \to +\infty$: then $S \sim \aleph v_-\nu_- < 0$.

By continuity we see that $S$ vanishes for some value of the ratio $\nu/\kappa$. In order to have a qualitative feeling for this value, let us consider the example of a maximal 1-shock. Using $\beta := \nu/\kappa$ and $L := 1/(\gamma M^2) > 0$, $L$ […] $\beta^*(\gamma)\kappa$, particularly when $\nu \ge \kappa$, then $S < 0$ for sufficiently strong shocks, that is for sufficiently large $M$. Let us point out that the value of $M > 1$ determines $(v_-^2, v_+^2, e_-, e_+)$, up to a single multiplicative positive constant, which just factorizes in $S$.
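The sign discussion above can be explored numerically. The Python sketch below is an illustration under stated assumptions and not the authors' computation: the function names are hypothetical, the states are normalised by $v_- = 1$, the downstream state is obtained from the classical Rankine–Hugoniot relations for a perfect gas (which we assume here), and $\nu$, $\kappa$ are treated as the constant values $\nu_-$, $\kappa_-$. It computes the negative root $\zeta$ of (29), the quantity $\aleph$, and $S$ as given in the formula for $S$ above.

```python
import math


def negative_root_zeta(beta, gamma, mach):
    """Negative root of beta*z*(z+1) + z*(1/(gamma M^2) - 1) + (1-gamma)/(gamma M^2) = 0."""
    eps = 1.0 / (gamma * mach * mach)
    a, b, c = beta, beta + eps - 1.0, (1.0 - gamma) * eps
    disc = math.sqrt(b * b - 4.0 * a * c)   # a > 0 > c, so the roots are real
    if b > 0:
        return (-b - disc) / (2.0 * a)
    return 2.0 * c / (-b + disc)            # same root, written to avoid cancellation


def shock_states(gamma, mach):
    """(v_-, e_-, v_+, e_+) for a steady 1-shock, normalised by v_- = 1."""
    v_minus = 1.0
    e_minus = 1.0 / (gamma * (gamma - 1.0) * mach * mach)   # from c^2 = gamma (gamma-1) e
    v_plus = ((gamma - 1.0) * mach * mach + 2.0) / ((gamma + 1.0) * mach * mach)
    e_plus = e_minus + (v_minus ** 2 - v_plus ** 2) / (2.0 * gamma)  # energy balance
    return v_minus, e_minus, v_plus, e_plus


def aleph_and_S(gamma, mach, nu, kappa):
    """Return (aleph, S) with S = -aleph*nu*v_-*zeta + kappa*(gamma-1)*e_-."""
    v_minus, e_minus, v_plus, e_plus = shock_states(gamma, mach)
    c_plus = math.sqrt(gamma * (gamma - 1.0) * e_plus)
    aleph = c_plus + (gamma - 1.0) * (v_plus - v_minus)
    zeta = negative_root_zeta(nu / kappa, gamma, mach)
    return aleph, -aleph * nu * v_minus * zeta + kappa * (gamma - 1.0) * e_minus
```

With $\gamma = 3$ and a large Mach number, the sketch returns $\aleph < 0$, a positive $S$ for very small $\nu/\kappa$ and a negative $S$ for $\nu \ge \kappa$, in line with the two asymptotic regimes listed above.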



Existence of unstable boundary layers. We now give the conclusion. Because the 1-shock curves are connected, and because of Galilean invariance, the set of steady 1-shocks is connected too. Then we may consider the Evans function as a continuous function of, altogether, $\lambda$, $x_0$, $\nu$, $\kappa$ and the shock itself. Consequently, $W$ is a continuous function of $x_0$, $\nu$, $\kappa$ and the shock. Thanks to the Gårding inequality, we know that the stability index depends only on the sign of $W$. Since the “eigenvalues” $\mu$ are distinct and real, we know that the sign of $W(x)$ becomes constant as $x \to -\infty$, and that it is opposite to the sign of $S$. In turn, $\tilde{S} := S/\nu_-$ is a continuous function of $\nu_-/\kappa_-$ and the shock. On one hand, we know that for small shocks, $\tilde{S}$ is positive. On the other hand, we know that the corresponding boundary layers, being small whatever $x_0$ is chosen, are stable (this comes from a direct entropy estimate on the linearized system). By continuity, we therefore conclude that the stability index of a boundary layer $V|_{(x_0, +\infty)}$ is even when $W(x_0)$ is negative and odd when it is positive.

Now, let $\gamma$ be larger than 2, assume that $\nu > \beta^*(\gamma)\kappa$, for instance $\nu > \kappa$, and let the shock $(v_-, v_+)$ be strong enough. Then $S < 0$, which means that $W(x_0)$ is positive for $x_0 < 0$, large enough. Then the corresponding boundary layer is unstable.

6. Isentropic Gas Dynamics

In one-space dimensional isentropic gas dynamics, the flow is described by $v = (\rho, v)$ only. Therefore, $n = 2$ and the conservation laws express the mass and momentum balances. We have

$$g(v) = \begin{pmatrix} \rho \\ \rho v \end{pmatrix}, \qquad f(v) = \begin{pmatrix} \rho v \\ \rho v^2 + p(\rho) \end{pmatrix}, \qquad b(v) = \begin{pmatrix} 0 & 0 \\ 0 & \nu(\rho) \end{pmatrix}.$$

Hereabove, $\nu > 0$ (thus $r = 1$) and the pressure satisfies (hyperbolicity) $p' > 0$. The sound speed is here $c := \sqrt{p'}$. The eigenvalues of $dF$ are $v \pm c(\rho)$ and the one of $dg^{-1}h$ (a 1 × 1 matrix!) is $v$. Contrary to the case of full gas dynamics, both matrices do not share a common eigenvalue.
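For this two-dimensional model, the spectrum $\sigma_+$ of Sect. 2 can be computed in closed form, which gives a quick numerical sanity check of the dissipativity property (H8). The Python sketch below is an illustration only, with a hypothetical function name. It assumes $dg = \begin{pmatrix} 1 & 0 \\ v & \rho \end{pmatrix}$, $df = \begin{pmatrix} v & \rho \\ v^2 + c^2 & 2\rho v \end{pmatrix}$ and $b = \mathrm{diag}(0, \nu)$, obtained by differentiating $g$, $f$ above, so that $\det(\mu^2 b + i\mu\, df + \lambda\, dg) = \rho\lambda^2 + (\mu^2\nu + 2i\mu\rho v)\lambda + i\mu^3\nu v + \mu^2\rho(c^2 - v^2)$.

```python
import cmath


def linearized_spectrum(mu, v, c, nu, rho):
    """Both roots lambda of det(mu^2 b + i mu df + lambda dg) = 0 (isentropic model)."""
    a2 = rho
    a1 = mu * mu * nu + 2j * mu * rho * v
    a0 = 1j * mu ** 3 * nu * v + mu * mu * rho * (c * c - v * v)
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2 * a0)
    return (-a1 + disc) / (2.0 * a2), (-a1 - disc) / (2.0 * a2)
```

Writing $\lambda = -i\mu v + \Lambda$ reduces the equation to $\rho\Lambda^2 + \mu^2\nu\Lambda + \mu^2\rho c^2 = 0$, whose roots have negative real part for every real $\mu \neq 0$; the sketch lets one confirm this for sample values of $\mu$ and $v$.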
We summarize the number of boundary conditions for the viscous (p + r) and the inviscid (q) problems in Table 2.

Table 2. Isentropic case

v+           v+ < −c+    −c+ < v+ < 0    0 < v+ < c+    v+ > c+
q            0           1               1              2
[p, p + r]   [0, 1]      [0, 1]          [1, 2]         [1, 2]

There are four distinct cases:

v+ > c+: then q = r + p, so that every boundary layer is trivial (that is, constant). Since r + p = n, the linearized boundary layer is homogeneous Dirichlet. The “layer” is linearly stable, from an obvious a priori estimate.

0 < v+ < c+: then q = p and r + p = n, so that W is a Wronskian. It cannot vanish. Thus the sign of B(x0; 0)B(x0; +∞) is constant, therefore positive. The stability index is even.

−c+ < v+ < 0: then q = p + r, so every boundary layer is constant. Since r + p = 1, there is only one boundary condition for the Navier–Stokes system. Let αρ + βv = 0 be the linearized boundary condition, with β ≠ 0 for the viscous IBVP being locally well-posed. The subspace E(0) is spanned by the constant v = (−ρ+, c+). Then B(0) = −αρ+ + βc+. From an obvious energy estimate, it is stable when α = 0. By continuity, we conclude that the stability index is even (resp. odd) when
\[
\frac{\alpha}{\beta}\,\rho_+ < c_+ \qquad (\text{resp. } > c_+).
\]

v+ < −c+: then q = p and r + p = 1. Here, E(0) is spanned by V′ (if V is not trivial). With the same notations as above, B(0) = αρ′(0) + βv′(0). Since ρv ≡ j in the layer, we have v′ = −jρ′/ρ², so the vanishing of B(0) is equivalent to αρ(0)² = βj. By continuity, the stability index depends only on the sign of
\[
\frac{\alpha}{\beta}\,\rho(0)^2 - j.
\]
For α = 0, we find that a constant layer (seen as a limit case) is stable, from an obvious energy estimate. Therefore the stability index is even (resp. odd) when the above quantity is positive (resp. negative).

7. Appendix A

In this appendix, we establish the stability of weak boundary layers in the case of degenerate viscosity, under the simplifying hypothesis r + p = n followed in Sects. 4–5; as discussed in the introduction, this corresponds to the case of full Dirichlet boundary conditions. This extends prior results of [10] and [12] in the one-dimensional and multidimensional case, respectively, obtained for strictly parabolic viscosities. As in [10, 12], a key ingredient in the proof is the following elementary Poincaré estimate.

Lemma 7. Let w ∈ C¹[0, +∞) vanish at x = 0. Then, for any weighting function α(·) > 0, we have
\[
\int_0^{+\infty} \alpha(y)\,|w(y)|^2\,dy \;\le\; \left(\int_0^{+\infty} \alpha(y)\,|y|\,dy\right)\left(\int_0^{+\infty} |w'(y)|^2\,dy\right). \tag{30}
\]

Proof. The Cauchy–Schwarz inequality, applied to w(x) = ∫₀ˣ w′(y) dy, gives
\[
|w(x)|^2 \;\le\; \left(\int_0^{x} 1\,dy\right)\left(\int_0^{x} |w'(y)|^2\,dy\right) \;=\; |x| \int_0^{x} |w'(y)|^2\,dy,
\]
whence
\[
\int_0^{+\infty} \alpha(y)\,|w(y)|^2\,dy \;\le\; \int_0^{+\infty} \alpha(y)\,|y| \int_0^{y} |w'(z)|^2\,dz\,dy
\;=\; \int_0^{+\infty} |w'(z)|^2 \left(\int_z^{+\infty} \alpha(y)\,|y|\,dy\right) dz,
\]
proving the claim.
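As a quick illustration of how (30) is used, the following sketch takes an exponentially decaying weight. The specific form α(y) = εe^{−θy}, with ε, θ > 0, is an assumption made here for concreteness and is not taken from the text.

```latex
% Illustrative assumption: \alpha(y) = \varepsilon e^{-\theta y}, with \varepsilon, \theta > 0.
\[
\int_0^{+\infty} \alpha(y)\,|y|\,dy
  = \varepsilon \int_0^{+\infty} y\,e^{-\theta y}\,dy
  = \frac{\varepsilon}{\theta^{2}},
\]
% so that the Poincare estimate (30) reads
\[
\varepsilon \int_0^{+\infty} e^{-\theta y}\,|w(y)|^{2}\,dy
  \;\le\; \frac{\varepsilon}{\theta^{2}} \int_0^{+\infty} |w'(y)|^{2}\,dy .
\]
```

The constant ε/θ² is small when the weight has small amplitude, which is the mechanism by which a weighted zeroth-order term can be absorbed into a derivative term.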



Proposition 2. Fixing u+, assume that (H1)–(H10) hold in some neighborhood U of u+. Further, suppose that r + p = n, where r and p are defined as in the introduction. Then, boundary layers U : U(+∞) = u+ that are sufficiently “weak”, in the sense that the entire profile U(·) is contained in a sufficiently small ball about u+, are spectrally stable.

Proof. The result follows by a combination of two energy estimates. The first is the one used to establish stability in the strictly parabolic case. Writing the linearized eigenvalue equation in original coordinates, we have
\[
\lambda u + (Au)' = (Bu')', \tag{31}
\]
where B := B(U) is singular, and Aα := dF(U)α − dB(U)(α, U′) for any vector α. From (H6) and (H9), we find by continuity that, for sufficiently weak profiles, there exists a symmetric, positive definite symmetrizer S(x) ≥ s0 > 0 such that S(x)dF(U(x)) is symmetric for all x (recall that existence of a symmetrizer is equivalent to semisimple, real spectrum for dF), and, moreover (dissipativity):
\[
X^{*} S B X \;\ge\; \beta\,|BX|^2. \tag{32}
\]
Taking the real part of the (complex) L² inner product of Su against Eq. (31), carrying out various integrations by parts, and rearranging, we obtain the basic energy estimate:
\[
\Re\lambda\,\langle Su, u\rangle = \langle O(|U'|)u, u\rangle + \langle O(|U'|)u, Bu'\rangle - \langle u', SBu'\rangle - \langle Su', (dB\cdot u)U'\rangle,
\]
which, through (32), implies
\[
\Re\lambda\,\langle Su, u\rangle + \beta\,\|Bu'\|^2 \;\le\; \langle O(|U'|)u, u\rangle + \langle O(|U'|)u, Bu'\rangle - \langle Su', (dB\cdot u)U'\rangle. \tag{33}
\]
Let us denote by π the projection from Cⁿ to Cʳ, where we just retain the r last components. Then the last term in (33) is bounded by cst ‖πSu′‖ · ‖u‖, since the n − r first rows of dB · u, like those of B, vanish. We now use Lemma 8 below in order to bound this term by cst ‖Bu′‖ · ‖u‖. We then derive from (33) and Young’s inequality the following estimate,
\[
s_0\,\Re\lambda\,\|u\|^2 + \beta\,\|Bu'\|^2 \;\le\; \langle O(|U'|)u, u\rangle, \tag{34}
\]
or, in (w, z) coordinates:
\[
s_1\,\Re\lambda\,(\|w\|^2 + \|z\|^2) + \beta_1\,\|z'\|^2 \;\le\; \langle O(|U'|)w, w\rangle + \langle O(|U'|)z, z\rangle, \tag{35}
\]
where s1 and β1 denote modified, positive constants. Applying (30) with α = O(|U′(y)|), and observing that
\[
\int_0^{+\infty} |U'|\,|y|\,dy
\]



can be made as small as desired by enforcing sufficiently weak layer strength, we find that we can absorb the ‖z‖² term in the right-hand side of (35) to obtain, finally, the desired first energy estimate:
\[
s_2\,\Re\lambda\,(\|w\|^2 + \|z\|^2) + \beta_2\,\|z'\|^2 \;\le\; \langle O(|U'|)w, w\rangle, \tag{36}
\]
where s2 and β2 denote further modified, positive constants. Note that, were there no w term, this would already establish a contradiction to the assumption that ℜλ ≥ 0, proving spectral stability. In the case of degenerate viscosity, however, we require also a second energy estimate controlling the term ⟨O(|U′|)w, w⟩.

Restrict attention, now, to the (n − r)-dimensional reduced eigenvalue equation
\[
\lambda\gamma w + (dh\,w)' + (dk\,z)' = 0 \tag{37}
\]
arising in the (w, z) coordinates, where γ and h are as defined in (H2)–(H4), and γ := γ(U), dh := dh(U), dk := dk(U). By (H4), the matrix γ⁻¹dh is diagonalisable, with real eigenvalues, for all x. Moreover, by the assumption p = n − r, we have at x = +∞ that all eigenvalues are positive; for sufficiently weak boundary layers, then, this property holds for all x ∈ [0, +∞). It follows that there is a symmetric, positive definite symmetrizer S_w(x) such that
\[
P := S_w\,\gamma^{-1}\,dh \;\ge\; h_0 > 0 \tag{38}
\]
is symmetric, positive definite for all x ∈ [0, +∞). That is, (37) features purely upwind propagation. This allows us to apply to the system of equations (37) a weighted energy estimate like that applied by Goodman [11] to individual characteristic fields, in the context of shock stability. Precisely, following Goodman, define the scalar weight α(x) > 0 by the ODE
\[
\alpha' = -C\,|U'|\,\alpha/h_0, \tag{39}
\]
with initial condition α(0) = 1, where C > 0 is a sufficiently large constant. Then, multiplying (37) by αγ⁻¹ and taking the real part of the (complex) L² inner product against S_w w, we obtain after rearrangement the basic energy estimate:
\[
\Re\lambda\,\langle \alpha w, w\rangle - (1/2)\,\langle w, (\alpha P)'w\rangle = \langle O(\alpha|U'|)w, w\rangle + \langle O(\alpha|U'|)z, z\rangle. \tag{40}
\]
But, from (39), (38) we easily obtain as in [11] that −(1/2)⟨w, (αP)′w⟩ ≥ (C/2)⟨α|U′|w, w⟩. Thus, summing (36) and (40), and taking C sufficiently large, we obtain the final estimate
\[
s_2\,\Re\lambda\,(\|w\|^2 + \|z\|^2) + (\beta_2/2)\,\|z'\|^2 + (C/4)\,\langle O(|U'|)w, w\rangle \;\le\; 0, \tag{41}
\]
where we have as before used (30) to absorb the term ⟨O(|U′|)z, z⟩ in β2‖z′‖². Evidently, (41) implies spectral stability, since ℜλ ≥ 0 gives an immediate contradiction. (Note: above, we have used freely the fact that α is bounded away from zero; indeed, by (39), α(x) = exp(−(C/h0)∫₀ˣ |U′(y)| dy), and U′ is integrable.) It remains to prove the following

Lemma 8. Under assumption (32), there exists a finite number c(u) such that |πSX| ≤ c|BX|.



Proof. Let Y be a vector such that BY = 0. Then, applying (32) to X + sY, we see that the affine function s ↦ (S(X + sY), BX) − β|BX|² is non-negative, therefore constant. Thus, BY = 0 implies (SY, BX) = 0 for every X. In other words, πY = 0 implies (πSY, πBX) = 0 for every X. However, πB is onto, therefore πY = 0 implies πSY = 0. This means that there exists a matrix Ŝ such that πS = ŜπB. A simple computation gives Ŝ = S0 B0⁻¹, where S0 and B0 are the lower right blocks of S and B. (Let us recall that, zero being a semi-simple eigenvalue of B, the block B0 is invertible.) Then the result holds, with c := ‖Ŝ‖.

We remark that essentially the same argument, rephrased as an energy estimate on the time-evolutionary equations, establishes linearized and not only spectral stability. In the case r = n, Gisclon and Serre [10] were able to establish a full nonlinear stability result. It would be interesting to determine whether or not stability of weak boundary layers holds also in the case that p < n − r, when there are fewer than n Dirichlet conditions. In this case, it seems conceivable that the nature of the boundary conditions may strongly affect stability, even in the weak layer limit. For related work, see [18].

References
1. Alexander, J.C., Gardner, R. and Jones, C.K.R.T.: A topological invariant arising in the stability analysis of travelling waves. J. Reine Angew. Math. 410, 167–212 (1990)
2. Benzoni-Gavage, S., Serre, D. and Zumbrun, K.: Alternate Evans functions and viscous shock waves. SIAM J. Math. Anal. 32, 929–962 (2001)
3. Bultelle, M., Grassin, M. and Serre, D.: Unstable Godunov discrete profiles for steady shock waves. SIAM J. Numer. Anal. 35, 2272–2297 (1998)
4. Freistühler, H. and Serre, D.: The L1-stability of boundary layers for scalar viscous conservation laws. J. Dynamics & Diff. Eqns., to appear
5. Gardner, R.A. and Jones, C.K.R.T.: Traveling waves of a perturbed diffusion equation arising in a phase field model. Indiana Univ. Math. J. 39, 4, 1197–1222 (1990)
6. Gardner, R.A. and Jones, C.K.R.T.: A stability index for steady state solutions of boundary value problems for parabolic systems. J. Differential Eqs. 91, 2, 181–203 (1991)
7. Gardner, R.A. and Zumbrun, K.: The gap lemma and geometric criteria for instability of viscous shock profiles. Comm. Pure Appl. Math. 51, 7, 797–855 (1998)
8. Gilbarg, D.: The existence and limit behavior of the one-dimensional shock layer. Am. J. Math. 73, 256–274 (1951)
9. Gisclon, M.: Etude des conditions aux limites pour un système hyperbolique, via l’approximation parabolique. J. Maths. Pures & Appl. 75, 485–508 (1996)
10. Gisclon, M. and Serre, D.: Etude des conditions aux limites pour un système hyperbolique, via l’approximation parabolique. C. R. Acad. Sci. Paris, Série I 319, 377–382 (1994)
11. Goodman, J.: Remarks on the stability of viscous shock waves. In Viscous profiles and numerical methods for shock waves (Raleigh, NC, 1990), Philadelphia: SIAM, 1991, pp. 66–72
12. Grenier, E. and Guès, O.: Boundary layers for viscous perturbations of noncharacteristic quasilinear hyperbolic problems. J. Diff. Eqs. 143, 110–146 (1998)
13. Grenier, E. and Rousset, F.: Stability of one dimensional boundary layers using Green’s functions. Comm. Pure Applied Math., to appear
14. Henry, D.: Geometric theory of semilinear parabolic equations. Lecture Notes in Maths. 840, Berlin: Springer-Verlag, 1981
15. Kawashima, S.: Systems of a hyperbolic–parabolic composite type, with applications to the equations of magnetohydrodynamics. PhD thesis, Kyoto University (1983)
16. Kawashima, S. and Shizuta, Y.: On the normal form of the symmetric hyperbolic-parabolic systems associated with the conservation laws. Tohoku Math. J. 40, 449–464 (1988)



17. Li, Ta-tsien and Yu, Wen-ci: Boundary value problems for quasilinear hyperbolic systems. Durham: Duke Univ., 1985
18. Matsumura, A. and Mei, M.: Convergence to travelling fronts of solutions of the p-system with viscosity in the presence of a boundary. Arch. Ration. Mech. Anal. 146, 1–22 (1999)
19. Rousset, F.: Inviscid boundary conditions and stability of viscous boundary layers. Asymptotic Analysis, 2001, to appear
20. Serre, D.: Sur la stabilité des couches limites de viscosité. Ann. Inst. Fourier 51, 109–129 (2001)
21. Zumbrun, K. and Howard, P.: Pointwise semigroup methods and stability of viscous shock waves. Indiana Univ. Math. J. 47, 3, 741–871 (1998)

Communicated by P. Constantin

Commun. Math. Phys. 221, 293 – 304 (2001)

Communications in



Symplectic Structures of Moduli Space of Higgs Bundles over a Curve and Hilbert Scheme of Points on the Canonical Bundle

Indranil Biswas¹, Avijit Mukherjee²

¹ School of Mathematics, Tata Institute of Fundamental Research, Homi Bhabha Road, Bombay 400005, India. E-mail: [email protected]
² Scuola Internazionale Superiore di Studi Avanzati, via Beirut 4, 34014 Trieste, Italy. E-mail: avijit@sissa.it

Received: 15 January 2000 / Accepted: 25 March 2001

Abstract: The moduli space of triples of the form (E, θ, s) is considered, where (E, θ) is a Higgs bundle on a fixed Riemann surface X, and s is a nonzero holomorphic section of E. Such a moduli space admits a natural map to the moduli space of Higgs bundles simply by forgetting s. If (Y, L) is the spectral data for the Higgs bundle (E, θ), then s defines a section of the line bundle L over Y. The divisor of this section gives a point of a Hilbert scheme, parametrizing 0-dimensional subschemes of the total space of the canonical bundle K_X, since Y is a curve on K_X. The main result says that the pullback of the symplectic form on the moduli space of Higgs bundles to the moduli space of triples coincides with the pullback of the natural symplectic form on the Hilbert scheme using the map that sends any triple (E, θ, s) to the divisor of the corresponding section of the line bundle on the spectral curve.

1. Introduction

A Higgs bundle over a compact connected hyperbolic Riemann surface X is a pair of the form (E, θ), where E is a holomorphic vector bundle over X and θ is a holomorphic section of K_X ⊗ End(E). Higgs bundles were introduced in [Hi1]. Let M_H denote a moduli space of stable Higgs bundles. We assume that for a Higgs bundle in M_H, the degree of the underlying vector bundle E satisfies the inequality degree(E) > rank(E)(g − 1). So the Riemann–Roch theorem for E ensures that E admits a nonzero section. It is known that M_H is a connected complex manifold of dimension 2 rank(E)²(g − 1) + 2 [Hi2]. Here we consider triples of the form (E, θ, s), where (E, θ) is a Higgs bundle in M_H and s ∈ H⁰(X, E) is a nonzero section. We recall that triples of this kind are considered in [BG, Li].


Let M_T denote the moduli space of isomorphism classes of triples. It can be shown that M_T exists as an analytic space. Since for every (E, θ, s) ∈ M_T we have (E, θ) ∈ M_H, there is a surjective morphism F : M_T → M_H. The surjectivity of F is a consequence of the earlier observation that dim H⁰(X, E) > 0. A holomorphic symplectic form on the moduli space M_H was constructed in [Hi1]. We will denote this symplectic form on M_H by Ω. Therefore, F*Ω is a holomorphic closed two-form on M_T. If E is of rank n, then it is possible to evaluate on θ a GL(n, C)-invariant polynomial on the Lie algebra M(n, C), namely polynomials of the form A ↦ trace(A^i). This yields an element of the vector space H⁰(X, K_X^{⊗i}). The resulting map
\[
P : M_H \longrightarrow V := \bigoplus_{i=1}^{n} H^0(X, K_X^{\otimes i}),
\]
which is known as the Hitchin map, is proper [Hi1]. Any element of V defines a spectral curve, which is a curve in the total space of K_X defined by a polynomial constructed from the given element of V. Given any point p ∈ M_H, a holomorphic line bundle L can be constructed on the spectral curve Y_p corresponding to the point P(p). If p = (E, θ), then the fibers of L are basically the eigenvectors of θ. Let π denote the projection of Y_p to X obtained from the obvious projection of K_X to X. It turns out that π_*L = E [Hi2]. Therefore, a section s of E defines a section ŝ of L over Y_p. Since Y_p is embedded in K_X, the divisor of ŝ defines a 0-dimensional subscheme of K_X. In other words, we have a map Φ : M_T → Hilb^l(K_X) to a Hilbert scheme of points on K_X; the integer l is the degree of the line bundle L. The natural symplectic structure on K_X induces a holomorphic symplectic structure on Hilb^l(K_X) [Be]. This symplectic form on Hilb^l(K_X) will be denoted by Ω_C. In Theorem 3.1 we prove that the 2-form F*Ω on M_T coincides with Φ*Ω_C. Since both Ω and Ω_C are exact, Theorem 3.1 reduces to an equality of two given 1-forms. We show that the difference of these two 1-forms descends to V. The properness of the Hitchin map is very useful for this step. Then we show that a twisted version of the descended 1-form on V further descends as a meromorphic 1-form on a projective space with appropriate poles. Finally, the proof of Theorem 3.1 is completed using the result that no nonzero meromorphic form of the given type exists.

2. Higgs Bundles and Triples

Let X be a compact connected Riemann surface of genus g, with g ≥ 2. The holomorphic cotangent bundle of X will be denoted by K_X. A Higgs bundle over X is a pair of the form (E, θ), where E is a holomorphic vector bundle over X and θ ∈ H⁰(X, K_X ⊗ End(E)) is known as a Higgs field.
A Higgs bundle (E, θ) is called stable if for every proper subbundle F ⊂ E of positive rank and with θ(F) ⊆ K_X ⊗ F, the inequality degree(F)/rank(F) < degree(E)/rank(E)


is valid [Hi1]. Stability is an open condition. In other words, given any algebraic (respectively, analytic) family of Higgs bundles, the points of the parameter space over which the Higgs bundle is stable form a Zariski (respectively, analytic) open subset. Given a Higgs field θ ∈ H⁰(X, K_X ⊗ End(E)) on a vector bundle E of rank n and an integer i ∈ [1, n], consider p_i(θ) := trace(θ^i) ∈ H⁰(X, K_X^{⊗i}), which is defined using the associative algebra structure of the fibers of End(E). The map which sends a Higgs bundle (E, θ) to
\[
\sum_{i=1}^{n} p_i(\theta) = \sum_{i=1}^{n} \operatorname{trace}(\theta^i) \in \bigoplus_{i=1}^{n} H^0(X, K_X^{\otimes i})
\]
is known as the Hitchin map. By p_0(θ) we will mean the section of K_X^{⊗0} = O_X given by the constant function 1. The total space of the line bundle K_X will also be denoted by K_X. Let π denote the projection of K_X to X. For a Higgs bundle (E, θ), the subscheme of K_X defined by the solution of the polynomial
\[
P_\theta(t) = \sum_{i=0}^{n} t^{n-i}\,p_i(\theta) = \sum_{i=0}^{n} t^{n-i}\operatorname{trace}(\theta^i) = 0
\]
is called the spectral curve associated to (E, θ) [Hi2, BNR]. We will denote this spectral curve by Y_θ. The natural projection from the spectral curve Y_θ to X obtained by restricting π, which we will also denote by π, is a finite morphism. Furthermore, there is a torsionfree sheaf L of rank one on Y_θ such that
\[
\pi_* L \cong E. \tag{2.1}
\]
The fiber L_y of L over a point y ∈ Y_θ can be considered as the eigenvector of θ(π(y)) for the eigenvalue y [Hi2]. The pair (Y_θ, L) is called the spectral data for the Higgs bundle (E, θ). Since Y_θ ⊂ K_X, there is a natural homomorphism
\[
f_\theta : L \longrightarrow \pi^* K_X \otimes L, \tag{2.2}
\]
which sends any vector v ∈ L_y, where y ∈ Y_θ, to the tensor product y ⊗ v ∈ (K_X)|_{π(y)} ⊗ L_y. Its direct image
\[
\pi_* f_\theta : \pi_* L \longrightarrow \pi_*(\pi^* K_X \otimes L) \cong K_X \otimes \pi_* L
\]
coincides with the Higgs field θ [Hi2]; the last isomorphism is obtained from the projection formula. The equivalence between Higgs bundles and spectral data was used in [Si] to give a construction of the moduli space of semistable Higgs bundles. Given a Higgs bundle (E, θ), let C. denote the following two-term complex of sheaves on X:
\[
C_{\cdot} : \; C_0 = \operatorname{End}(E) \xrightarrow{\;[-,\theta]\;} C_1 = K_X \otimes \operatorname{End}(E),
\]
where End(E) is at the 0th position, and if θ = dz ⊗ A in a local coordinate function z and s is a local section of End(E), then [s, θ] = dz ⊗ (sA − As).



The space of infinitesimal deformations of the Higgs bundle (E, θ) is parametrized by the hypercohomology ℍ¹(C.) [BR]. There is a canonical element in the dual vector space ℍ¹(C.)*. To construct this element, first consider the diagram
\[
\begin{array}{ccc}
\operatorname{End}(E) & \xrightarrow{\;[-,\theta]\;} & K_X \otimes \operatorname{End}(E) \\
\downarrow & & \downarrow \\
\operatorname{End}(E) & \longrightarrow & 0
\end{array} \tag{2.3}
\]
This diagram gives a homomorphism δ : ℍ¹(C.) → H¹(X, End(E)). Now, given any α ∈ ℍ¹(C.), the pairing trace(δ(α) ∪ θ) ∈ H¹(X, K_X) = C defines the canonical element in ℍ¹(C.)* [Hi1]; we will denote this canonical element in ℍ¹(C.)* by Θ_θ. Let K_X[1] denote the complex of sheaves with only one nonzero term K_X at the first position. We have the following homomorphism from the complex C. ⊗ C. to K_X[1]:
\[
\begin{array}{ccc}
0 & & 0 \\
\downarrow & & \downarrow \\
\operatorname{End}(E) \otimes \operatorname{End}(E) & \longrightarrow & 0 \\
\downarrow\,{\scriptstyle [-,\theta]\,\oplus\,[-,\theta]} & & \downarrow \\
\bigl(\operatorname{End}(E) \otimes (K_X \otimes \operatorname{End}(E))\bigr) \oplus \bigl((K_X \otimes \operatorname{End}(E)) \otimes \operatorname{End}(E)\bigr) & \xrightarrow{\;\operatorname{trace}\;} & K_X \\
\downarrow\,{\scriptstyle [-,\theta]\,+\,[-,\theta]} & & \downarrow \\
(K_X \otimes \operatorname{End}(E)) \otimes (K_X \otimes \operatorname{End}(E)) & \longrightarrow & 0 \\
\downarrow & & \downarrow \\
0 & & 0
\end{array}
\]
where the middle homomorphism, namely trace, is defined using the trace map End(E) ⊗ End(E) → O_X of endomorphisms. This homomorphism of complexes gives the following homomorphism of hypercohomologies:
\[
\mathbb{H}^1(C_{\cdot}) \otimes \mathbb{H}^1(C_{\cdot}) \longrightarrow \mathbb{H}^2(C_{\cdot} \otimes C_{\cdot}) \longrightarrow \mathbb{H}^2(K_X[1]) = H^1(X, K_X) = \mathbb{C}. \tag{2.4}
\]
This bilinear form on ℍ¹(C.) is evidently anti-symmetric and nondegenerate. In other words, it defines a symplectic form on ℍ¹(C.). This symplectic form on ℍ¹(C.) will be denoted by Ω_θ. Given a holomorphic family of Higgs bundles parametrized by a complex manifold T, the pointwise construction of Θ_θ gives a holomorphic one-form on T, which we will denote by Θ_T. The exterior derivative dΘ_T coincides with the two-form obtained from the pointwise construction of Ω_θ [BR]. In fact, the pairing Ω_θ defines a symplectic form on the smooth locus of any moduli space of Higgs bundles over X [Hi2].



Let M_H denote the moduli space of stable Higgs bundles over X of rank n and degree d. See [Si] for the construction of M_H. We assume that d > n(g − 1). So from Riemann–Roch, dim H⁰(X, E) = d − n(g − 1) + dim H¹(X, E) > 0. In other words, E admits nonzero sections.

Definition 2.1. A triple over X is a datum of the form (E, θ, s), where (E, θ) is a stable Higgs bundle of rank n and degree d, and s ∈ H⁰(X, E) \ {0} is a nonzero holomorphic section.

Let M_T denote the moduli space of triples that parametrizes isomorphism classes of triples. Two triples (E, θ, s) and (E′, θ′, s′) are called isomorphic if there is a holomorphic isomorphism E → E′ that takes θ to θ′ and s to s′. We will show that M_T exists as an analytic space. Let (E_T, θ_T) be a family of stable Higgs bundles of rank n and degree d over X parametrized by a complex space T. Let ψ_T denote the projection of X × T to T. If (E′_T, θ′_T) is another such family such that for every point t ∈ T, the Higgs bundle (E_t, θ_t) is isomorphic to (E′_t, θ′_t), then it can be shown that there is a holomorphic line bundle ξ over T such that E′_T = E_T ⊗ ψ_T*ξ and θ′_T = θ_T ⊗ Id_{ψ_T*ξ}. Indeed, this is an immediate consequence of the fact that given a stable Higgs bundle (E, θ), the only automorphisms of E that take θ to itself are the scalar multiplications. Now note that any automorphism of E defined by a scalar multiplication acts trivially on the projective space PH⁰(X, E) of lines in H⁰(X, E). From this it follows that M_T exists as an analytic space, and the fiber of the forgetful map, that assigns (E, θ) to (E, θ, s), is PH⁰(X, E) over the point of M_H represented by (E, θ). In fact, M_T can be constructed locally over M_H, that is, over sufficiently small analytic open subsets of M_H. The earlier remarks ensure that these local constructions patch together to define M_T.

It was remarked in (2.1) that π_*L is isomorphic to E. Since π⁻¹(π_*L) is a subsheaf of L, there is a canonical injective homomorphism π_* : H⁰(Y_θ, L) → H⁰(X, π_*L) = H⁰(X, E). The finiteness of the map π implies that this homomorphism π_* is actually an isomorphism. Now, given a section s ∈ H⁰(X, E), let
\[
\hat{s} := \pi_*^{-1}(s) \in H^0(Y_\theta, L) \tag{2.5}
\]
be the section of L that corresponds to it by the isomorphism π_* defined above. For a nonzero section s ∈ H⁰(X, E) \ {0}, consider the divisor div(ŝ) on Y_θ. Using the inclusion map of Y_θ in the surface K_X, the zero-dimensional subscheme div(ŝ) of Y_θ defines a zero-dimensional subscheme of K_X. The genus of the spectral curve Y_θ is n²(g − 1) + 1 [Hi2]. Therefore, from the Riemann–Roch theorem it follows that
\[
\operatorname{degree}(\operatorname{div}(\hat{s})) = d + n(n-1)(g-1), \tag{2.6}
\]
where d = degree(E). In the next section we will construct a morphism from a moduli space of triples to a Hilbert scheme parametrizing 0-dimensional subschemes of the total space K_X of the canonical bundle.
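The degree count (2.6) can be checked by a short Riemann–Roch computation. The sketch below assumes that L is a line bundle on a smooth spectral curve Y_θ and uses that the Euler characteristic is preserved under the finite map π; these hypotheses are stated here only for the illustration.

```latex
% Assumptions (for illustration): Y_\theta smooth, L a line bundle on Y_\theta,
% and \chi(Y_\theta, L) = \chi(X, \pi_* L) because \pi is finite.
\begin{align*}
\chi(X, E) &= d + n(1 - g) = d - n(g-1), \\
\chi(Y_\theta, L) &= \deg L + 1 - g(Y_\theta)
                   = \deg L - n^{2}(g-1), \\
\intertext{and equating the two, since $\pi_* L \cong E$ by (2.1),}
\deg L &= d - n(g-1) + n^{2}(g-1) = d + n(n-1)(g-1).
\end{align*}
```

Since the divisor of a nonzero section of L has degree deg L, this recovers the right-hand side of (2.6).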



3. Morphism from Triples to Hilbert Scheme

For any integer j ≥ 1, let Hilb^j(K_X) denote the Hilbert scheme, which is the moduli space parametrizing 0-dimensional subschemes of length j of the quasi-projective surface K_X. Since the spectral curve Y_θ is embedded in K_X, the divisor div(ŝ) on Y_θ, with ŝ defined in (2.5), defines a point of Hilb^l(K_X), where l = d + n(n − 1)(g − 1) is the degree of div(ŝ) as obtained in (2.6). As before, let M_T denote the moduli space of triples of the form (E, θ, s) considered in Definition 2.1. Recall that d = degree(E) > rank(E)(g − 1) = n(g − 1). Associating to any triple (E, θ, s) the element of the Hilbert scheme Hilb^l(K_X) defined by the divisor div(ŝ), we obtain a morphism
\[
\Phi : M_T \longrightarrow \mathrm{Hilb}^{l}(K_X). \tag{3.1}
\]
As before, let M_H denote the moduli space of stable Higgs bundles over X of degree d and rank n. Let
\[
F : M_T \longrightarrow M_H \tag{3.2}
\]
denote the forgetful map which sends any triple (E, θ, s) to the Higgs bundle (E, θ). Recall that we have assumed that d > n(g − 1). Therefore, the map F in (3.2) is surjective. We note that there are different notions of stability of a triple. If a notion of moduli space of triples different from the one given here is used, then F is usually not everywhere defined. Here, M_T is a projective bundle over M_H: for any point p = (E, θ) on M_H, the inverse image F⁻¹(p) is PH⁰(X, E).

Let Ω denote the symplectic form on M_H whose pointwise construction has been described in (2.4). This symplectic form was introduced in [Hi1]. The surface K_X has a natural symplectic structure. This symplectic form induces a symplectic structure on any Hilbert scheme Hilb^j(K_X) of 0-dimensional subschemes of K_X of length j [Be, pp. 766–767]. Let Ω_C denote the canonical symplectic form on Hilb^l(K_X). Therefore, on the moduli space M_T we have two holomorphic 2-forms, namely F*Ω and Φ*Ω_C. The following theorem says that F*Ω = Φ*Ω_C.

Theorem 3.1. The two holomorphic 2-forms F*Ω and Φ*Ω_C on M_T coincide.

The proof of this theorem will be carried out in the next section. In the rest of this section we will reduce the equality F*Ω = Φ*Ω_C to an equality of 1-forms. In Sect. 2 a one-form Θ_θ was constructed on the space of infinitesimal deformations of a Higgs bundle (E, θ). Let Θ denote the holomorphic 1-form on M_H such that for any point (E, θ) ∈ M_H the form Θ at the point (E, θ) coincides with the 1-form Θ_θ. We already noted in Sect. 2 that
\[
d\Theta = \Omega. \tag{3.3}
\]
Take a point p ∈ Hilb^l(K_X) representing the collection of points {p_1, p_2, ..., p_l}, where p_i ∈ K_X and the p_i are distinct. In other words, p_i ≠ p_j if i ≠ j. We have
\[
T_p\,\mathrm{Hilb}^{l}(K_X) = \bigoplus_{i=1}^{l} T_{p_i} K_X. \tag{3.4}
\]



This decomposition is immediate from the fact that a neighborhood of p in Hilb^l(K_X) is identified with a neighborhood in the l-th symmetric product of K_X of the point represented by the collection {p_i}. Let ω denote the canonical 1-form on K_X. The exterior derivative dω is the canonical symplectic form on K_X. Since ω(p_i) is a 1-form on T_{p_i}K_X, using the decomposition (3.4) we have an element Θ_C(p) ∈ T_p*Hilb^l(K_X). It is easy to check that this defines a 1-form on Hilb^l(K_X). Let Θ_C denote the 1-form on Hilb^l(K_X) whose evaluation at any point p coincides with Θ_C(p) constructed above.

Proposition 3.2. The exterior derivative dΘ_C coincides with the symplectic form Ω_C on Hilb^l(K_X).

Proof. Let v := {v_1, v_2, ..., v_l} and w := {w_1, w_2, ..., w_l} be two tangent vectors in T_pHilb^l(K_X), where v_i, w_i ∈ T_{p_i}K_X; the decomposition of T_pHilb^l(K_X) used here is the one obtained in (3.4). The evaluation of the symplectic form Ω_C on the pair {v, w} is described by the following identity ([Be]; the construction in Prop. 5 (pp. 766)):
\[
\Omega_C(p)(v, w) = \sum_{i=1}^{l} d\omega(p_i)(v_i, w_i), \tag{3.5}
\]
where dω, as before, is the canonical symplectic form on K_X. The equality (3.5) immediately implies that the decomposition (3.4) of T_pHilb^l(K_X) is orthogonal with respect to the symplectic form Ω_C(p). The decomposition (3.4) is obviously orthogonal with respect to the skew-symmetric form dΘ_C(p). Consequently, it suffices to check that the restriction of dΘ_C(p) to the subspace T_{p_i}K_X ⊂ T_pHilb^l(K_X) coincides with the restriction of Ω_C(p). But clearly both these restrictions coincide with the symplectic form dω(p_i) on T_{p_i}K_X. This completes the proof of the proposition.

In view of Proposition 3.2 and the equality (3.3), Theorem 3.1 is an immediate consequence of the following lemma.

Lemma 3.3. The two holomorphic 1-forms F*Θ and Φ*Θ_C on M_T coincide.

The following section will be devoted to the proof of Lemma 3.3.

4. Proof of the Lemma

Let Y be a connected Riemann surface, and let π : Y → X be a covering map, possibly ramified, of degree n. Fix a holomorphic section β ∈ H⁰(Y, π*K_X). Using the natural homomorphism (dπ)* : π*K_X → K_Y, the section β gives a section of K_Y. This section of K_Y will also be denoted by β.



For any holomorphic line bundle L on Y, the direct image π_*L is a holomorphic vector bundle of rank n over X. Considering the infinitesimal deformations, we have a homomorphism
\[
\hat{\pi} : H^1(Y, O_Y) \longrightarrow H^1(X, \operatorname{End}(\pi_* L)), \tag{4.1}
\]
as the spaces of infinitesimal deformations of π_*L (respectively, L) are parametrized by H¹(X, End(π_*L)) (respectively, H¹(Y, O_Y)). Since β ∈ H⁰(Y, K_Y), we have
\[
f_L : H^1(Y, O_Y) \xrightarrow{\;\cup\beta\;} H^1(Y, K_Y) = \mathbb{C}. \tag{4.2}
\]
On the other hand, since β ∈ H⁰(Y, π*K_X), taking the direct image of the multiplication map
\[
L \xrightarrow{\;\beta\otimes\;} \pi^* K_X \otimes L
\]
we have a homomorphism φ_L : π_*L → π_*(π*K_X ⊗ L) ≅ K_X ⊗ π_*L of vector bundles, where the last isomorphism is obtained from the projection formula. So, φ_L defines a holomorphic section β_L ∈ H⁰(X, K_X ⊗ End(π_*L)). Let f̂_L denote the composition
\[
H^1(X, \operatorname{End}(\pi_* L)) \xrightarrow{\;\beta_L\cup\;} H^1(X, K_X \otimes \operatorname{End}(\pi_* L) \otimes \operatorname{End}(\pi_* L)) \xrightarrow{\;\operatorname{trace}\;} H^1(X, K_X) = \mathbb{C}.
\]
Here trace, as before, is the homomorphism End(π_*L) ⊗ End(π_*L) → O_X constructed using the Killing form on GL(n, C) defined by A ⊗ B ↦ trace(AB).

Proposition 4.1. The following diagram commutes
\[
\begin{array}{ccc}
H^1(Y, O_Y) & \xrightarrow{\;f_L\;} & \mathbb{C} \\
\downarrow\,{\scriptstyle \hat{\pi}} & & \| \\
H^1(X, \operatorname{End}(\pi_* L)) & \xrightarrow{\;\hat{f}_L\;} & \mathbb{C}
\end{array}
\]
where π̂ and f_L are defined in (4.1) and (4.2) respectively.

Proof. The proposition follows immediately by unraveling the definitions of the above homomorphisms. Take any cohomology class v ∈ H¹(Y, O_Y). Let α be a (0, 1)-form on Y which is a Dolbeault representative of v. Let U ⊂ X be the complement of the finite subset of X consisting of points over which π is ramified. Consider the π_*O_{π⁻¹(U)}-valued (0, 1)-form on U defined by α. It is the restriction of a Dolbeault representative of π̂(α) to U. Since the canonical isomorphism H¹(X, K_X) ≅ C is defined using the integration of (1, 1)-forms on X, and similarly for Y, the proposition follows easily.



In Sect. 2 we briefly described the identification of Higgs bundles and the spectral data constructed in [Hi2]. Set Y to be a spectral curve. So, in particular, it is a curve embedded in KX . Set π to be the natural projection of the spectral curve to X. Let JY denote the subvariety of MH consisting of all Higgs bundles such that the corresponding spectral curve coincides with Y . We know that JY is identified with the component Picl (Y ) of the Picard group of Y , where l, as before, is d + n(n − 1)(g − 1) [Hi2]. The inverse image F −1 (JY ) ⊂ MT will be denoted by AY . So AY is a projective bundle over Picl (Y ). More precisely, it is the projectivized Picard bundle over Picl (Y ). Let ν : AY 5→ MT be the inclusion map. Proposition 4.2. The two 1-forms (F ◦ ν)∗ & and ( ◦ ν)∗ &C on AY coincide. Proof. This is actually a consequence of Proposition 4.1. Let f denote the inclusion of Y in KX . As before, the canonical 1-form on KX is denoted by ω. Set β in Proposition 4.1 to be the 1-form f ∗ ω on Y . Note that f ∗ ω is indeed a section of π ∗ KX . For any point (L, s) ∈ AY , the homomorphism fL constructed in (4.2) gives the oneform ( ◦ ν)∗ &C . On the other hand, (F ◦ ν)∗ & is constructed from the homomorphism f L . Now, Proposition 4.1 finishes the proof. We will note a simple observation. Let g : Z1 −→ Z2 be a proper smooth holomorphic map between connected complex manifolds. For any u ∈ Z2 , the inverse image g −1 (u) is assumed to be connected. Let τ be a holomorphic 1-form on Z1 such that the contraction of τ with any vertical tangent vector v ∈ Tu Z1 vanishes. By a vertical vector we mean a vector in the kernel of the differential dg : T Z1 −→ g ∗ T Z2 of the map g. Then there is a 1-form on Z2 , say τ , such that g ∗ τ = τ . To see this first note that given any tangent vector w ∈ Tz Z2 , using the above condition, namely the contraction of τ with any vertical vector vanishes, a holomorphic function on g −1 (z) is obtained. 
Now the existence of such a form τ is an immediate consequence of the fact that there is no nonconstant holomorphic function on a compact connected complex manifold. As we know from [Hi2], the space of spectral curves is parametrized by the vector space

$$V := \bigoplus_{i=1}^{n} H^0(X, K_X^{\otimes i}).$$

Let P : MH −→ V denote the Hitchin map defined in Sect. 2 which sends any Higgs bundle to the spectral curve associated to it. The morphism P is proper [Hi2]. Now, from Proposition 4.2 coupled with the above observation it follows that there is a holomorphic 1-form γ on V such that F ∗ & − ∗ &C = (P ◦ F )∗ γ .

(4.3)

Indeed, taking Z1 in the above observation to be the open subset of MT over which P ◦ F is smooth and setting τ to be F ∗ & − ∗ &C , the existence of a form γ satisfying (4.3) is ensured by the above observation. Lemma 3.3 will be proved by showing that the form γ in (4.3) vanishes identically.



The action of the group C∗ = C\{0} on V is defined by

$$t \cdot \{v_1, v_2, \cdots, v_n\} = \{t v_1, t^2 v_2, \cdots, t^i v_i, \cdots, t^n v_n\}, \tag{4.4}$$

where t ∈ C∗ and vi ∈ H 0 (X, KX⊗i ). This action will be denoted by ρ. The quotient of V\{0} by this action ρ is a weighted projective space. This weighted projective space . Let will be denoted by P q : V\{0} −→ P denote the quotient map. Proposition 4.3. The evaluation of the 1-form γ , obtained in (4.3), on any vertical vector for the projection q vanishes. Proof. Take a point v ∈ V\{0}. For any z ∈ C∗ , let fz denote the automorphism of KX that sends any vector α ∈ KX to zα. It is easy to see that the automorphism fz of KX takes the spectral curve Yv defined by v to the spectral curve Yρ(z)v defined by ρ(z)v ∈ V, where ρ, as before, is the action defined in (4.4). The isomorphism of Yρ(z)v with Yv obtained from fz will be denoted by f z . Fix a line bundle L over the spectral curve Yv and also fix a holomorphic section s ∗ ∗ of L. Now f z L is a holomorphic line bundle over the spectral curve Yρ(z)v and f z s is ∗ a holomorphic section of f z L. Since the following diagram commutes fz

Yρ(z)v  −→ Yv   π π X = X ∗

∗

the two vector bundles π∗ L and π∗ f z L over X are isomorphic. Let (π∗ f z L, θz ) denote ∗ the Higgs bundle over X corresponding to the spectral data (Yρ(z)v , f z L). Since π∗ L ∗ and π∗ f z L are isomorphic, it follows that the evaluation of the 1-form & on MH to the ∗ tangent vector at the point (π∗ L, θ1 ) defined by the curve z −→ (π∗ f z L, θz ) vanishes. Indeed, this is immediate from the construction of &θ done following (2.3). ∗ −1 The divisor of the section f z s is simply f z (div(s)), the image of the divisor of −1

s by the isomorphism f z which is the inverse of f z . Furthermore, the evaluation of the canonical 1-form ω on KX on any vertical vector for the projection π vanishes. Consequently, the evaluation of the 1-form ∗ &C on the tangent vector at the point (π∗ L, θ1 , π∗ s) ∈ MT ∗

∗

defined by the curve z −→ (π∗ f z L, θz , π∗ f z s) vanishes. Combining the above two observations on & and ∗ &C respectively, we conclude that γ vanishes on any vertical vector for the projection q. This completes the proof of the proposition.



For each integer i ∈ [1, n], fix a basis {βi,1 , βi,2 , · · · , βi,mi } of the vector space So n1 = g and ni = (2i − 1)(g − 1) if i ≥ 2. This collection of basis gives us a basis for the vector space V. n Let N denote the dimension of V. So we have N = i=1 ni . Let P denote the N projective space consisting of all lines in C . We will define a morphism from V − {0} to P using the basis of V. Take any nonzero vector H 0 (X, KX⊗i ).

v=

ni n

ci,j βi,j ∈ V − {0},

i=1 j =1

where ci,j ∈ C. Let q : V − {0} −→ P be the map that sends any such vector v to the point of P represented by {ci,j }. Let Q : V − {0} −→ V − {0}

(4.5)

(4.6)

be the holomorphic map that sends any vector v as above to the vector ni n

(ci,j )i βi,j ∈ V − {0}.

i=1 j =1

Now we have a commutative diagram Q

V − {0} −→ V − {0} q  q P −→ P is the one induced by the map Q; the map q is defined in where the morphism P −→ P (4.5). In view of the commutativity of the above diagram, Proposition 4.3 implies that the evaluation of the 1-form Q∗ γ for any vertical vector for the projection q vanishes. Take a nonzero vector v ∈ V − {0}. Let α be a holomorphic tangent vector to the manifold V −{0} at v. Take a nonzero complex number λ. Let λ denote the automorphism of V which sends any vector w to λw. Therefore, d λ(α) is a tangent vector at λ(v), where d λ is the differential of the map λ. Choose a holomorphic map α from the unit disk in C to V such that α (0) = v and λ ◦ α ) (0) = d λ(α). α (0) = α. Therefore, we have ( For any point t in the unit disk, let Yt denote the spectral curve corresponding to the point α (t) of V. It is easy to see that the spectral curve corresponding to the point λ ◦ α (t) ∈ V is the image fλ (Yt ), where fλ , as before, is the automorphism of KX defined by multiplication with λ. If Lt is a line bundle over Yt and st is a holomorphic section of Lt , ∗ ∗ ∗ then f λ Lt is a line bundle on fλ (Yt ) and f λ s is a holomorphic section of f λ Lt ; here f λ : fλ (Yt ) −→ Yt , as before, is the isomorphism defined by fλ .



From the above observations it readily follows that λ · Q∗ γ (v)(α) = Q∗ γ (λv)(d λ(α)),

(4.7)

where Q is defined in (4.6). We already noted that the evaluation of the 1-form Q∗ γ for any vertical vector for the projection q (defined in (4.5)) vanishes. In view of this, the equality (4.7) implies OP (1) on P. This that the one-form Q∗ γ descends to a holomorphic section of 1P OP (1) such that means that there is a section γ of 1P q ∗ γ ∈ H 0 (V\{0}, 1V \{0} ⊗ q ∗ OP (1)) realized as a one-form on V\{0} using the canonical trivialization of q ∗ OP (1) coincides with the one-form Q∗ γ . It is known that H 0 (P, 1P ⊗ OP (1)) = 0 [SS, pp. 71, Theorem 4.3(a)]. Therefore, we obtain that γ = 0. Since γ = 0, the identity (4.3) completes the proof of Lemma 3.3. We already noted in Sect. 3 that Lemma 3.3 implies Theorem 3.1. Therefore, the proof of Theorem 3.1 is complete. We note that exactly imitating the construction of the 1-form & on MH we may construct a 1-form on MT . Indeed, there is a natural projection from the space of infinitesimal deformations of a triple (E, θ, s) to the space of infinitesimal deformations of the Higgs bundle (E, θ ). Similarly, exactly imitating the construction of the 2-form on MH a 2-form on MT can be constructed. If the parameter σ in the stability condition of a triple is not sufficiently small, then F is not defined everywhere. Even in this case, these 1-form and 2-form coincide on the domain of F with F ∗ & and F ∗ respectively. Theorem 3.1 remains valid if F ∗ is replaced by this 2-form on MT . Acknowledgements. We thank the referee for clarifying remarks.

References

[Be] Beauville, A.: Variétés Kähleriennes dont la première classe de Chern est nulle. J. Diff. Geom. 18, 755–782 (1983)
[BNR] Beauville, A., Narasimhan, M.S., Ramanan, S.: Spectral curves and the generalised theta divisor. J. Reine Angew. Math. 398, 169–179 (1989)
[BG] Bradlow, S.B., García-Prada, O.: Stable triples, equivariant bundles and dimension reduction. Math. Ann. 304, 225–252 (1996)
[BR] Biswas, I., Ramanan, S.: An infinitesimal study of the moduli of Hitchin pairs. J. Lond. Math. Soc. 49, 219–231 (1994)
[G] García-Prada, O.: Dimensional reduction of stable bundles, vortices and stable pairs. Internat. J. Math. 5, 1–52 (1994)
[Hi1] Hitchin, N.J.: The self-duality equations on a Riemann surface. Proc. Lond. Math. Soc. 55, 59–126 (1987)
[Hi2] Hitchin, N.J.: Stable bundles and integrable systems. Duke Math. J. 54, 91–114 (1987)
[Li] Lin, T.-R.: Hermitian-Yang–Mills–Higgs metrics and stability for holomorphic vector bundles with Higgs fields. Preprint
[Si] Simpson, C.T.: Moduli of representations of the fundamental group of a smooth projective variety. II. Inst. Hautes Études Sci. Publ. Math. 80, 5–79 (1994)
[SS] Shiffman, B., Sommese, A.: Vanishing Theorems on Complex Manifolds. Progress in Math., Vol. 56. Boston: Birkhäuser, 1985

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 221, 305 – 333 (2001)




Long Time Behavior of the Continuum Limit of the Toda Lattice, and the Generation of Infinitely Many Gaps from C∞ Initial Data

A.B.J. Kuijlaars1, K. T.-R. McLaughlin2,3

1 Department of Mathematics, Katholieke Universiteit Leuven, Celestijnenlaan 200 B, 3001 Leuven, Belgium.


2 Department of Mathematics, University of Arizona, Tucson, Arizona 85721, USA.


3 Department of Mathematics, University of North Carolina, Chapel Hill, North Carolina 27599, USA.

E-mail: [email protected] Received: 8 May 2000 / Accepted: 27 March 2001

Abstract: We analyze a continuum limit of the finite non-periodic Toda lattice through an associated constrained maximization problem over spectral density functions. The maximization problem was derived by Deift and McLaughlin using the Lax–Levermore approach, initially developed for the zero dispersion limit of the KdV equation. It encodes the evolution of the continuum limit for all times, including evolution through shocks. The formation of gaps in the support of the maximizer is indicative of oscillations in the Toda lattice and the lack of strong convergence of the continuum limit. For large times, the maximizer tends to have zero gaps, which is the continuum analogue of the sorting property of the finite lattice. Using methods from logarithmic potential theory, we show that this behavior depends crucially on the initial data. We exhibit initial data for which the zero gap ansatz holds uniformly in the spatial parameter (at large times), and other initial data for which this uniformity fails at all times. We then construct an example of C ∞ smooth initial data generating, at a later time, infinitely many gaps in the support of the maximizer, while for even larger times, all gaps have closed.

1. Introduction

The finite, non-periodic Toda lattice is a dynamical system that is given in Flaschka coordinates by

$$\frac{da_n}{dt} = 2\left(b_n^2 - b_{n-1}^2\right), \qquad n = 1, \ldots, N, \tag{1.1}$$

$$\frac{db_n}{dt} = b_n\left(a_{n+1} - a_n\right), \qquad n = 1, \ldots, N-1, \tag{1.2}$$

with b0 = bN = 0. The Toda lattice is completely integrable and is solved explicitly by the inverse spectral transform [12, 23, 24].
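The system (1.1)–(1.2) is easy to experiment with numerically. The following is a minimal sketch, not the inverse spectral method of [12, 23, 24]: it integrates the equations with a classical Runge–Kutta scheme and monitors tr L and tr L², where L is the tridiagonal matrix with diagonal entries a_n and off-diagonal entries b_n; both traces are conserved by the flow. The initial data, step size, final time and all function names are illustrative assumptions.

```python
# Minimal sketch: integrate the Toda lattice (1.1)-(1.2) with classical RK4.
# Initial data, step size and final time below are illustrative assumptions.


def toda_rhs(a, b):
    """Right-hand side of (1.1)-(1.2).

    a = [a_1, ..., a_N], b = [b_1, ..., b_{N-1}]; b_0 = b_N = 0 is imposed here.
    """
    n = len(a)
    bb = [0.0] + list(b) + [0.0]  # bb[k] = b_k for k = 0, ..., N
    da = [2.0 * (bb[k + 1] ** 2 - bb[k] ** 2) for k in range(n)]
    db = [b[k] * (a[k + 1] - a[k]) for k in range(n - 1)]
    return da, db


def _shift(u, du, s):
    """Return u + s * du componentwise."""
    return [x + s * dx for x, dx in zip(u, du)]


def _combine(u, k1, k2, k3, k4, h):
    """RK4 update u + h/6 * (k1 + 2 k2 + 2 k3 + k4), componentwise."""
    return [x + h / 6.0 * (p + 2.0 * q + 2.0 * r + s)
            for x, p, q, r, s in zip(u, k1, k2, k3, k4)]


def rk4_step(a, b, h):
    """One classical Runge-Kutta step of size h."""
    k1a, k1b = toda_rhs(a, b)
    k2a, k2b = toda_rhs(_shift(a, k1a, h / 2), _shift(b, k1b, h / 2))
    k3a, k3b = toda_rhs(_shift(a, k2a, h / 2), _shift(b, k2b, h / 2))
    k4a, k4b = toda_rhs(_shift(a, k3a, h), _shift(b, k3b, h))
    return (_combine(a, k1a, k2a, k3a, k4a, h),
            _combine(b, k1b, k2b, k3b, k4b, h))


def integrate(a, b, t_final, h=0.005):
    """Integrate from time 0 to t_final with fixed step h."""
    for _ in range(int(round(t_final / h))):
        a, b = rk4_step(a, b, h)
    return a, b


def trace_L(a):
    """tr L = sum of a_n."""
    return sum(a)


def trace_L2(a, b):
    """tr L^2 = sum of a_n^2 + 2 * sum of b_n^2."""
    return sum(x * x for x in a) + 2.0 * sum(y * y for y in b)


# N = 3 with a = (0, 0, 0), b = (1, 1): the eigenvalues of L are sqrt(2), 0, -sqrt(2).
a0, b0 = [0.0, 0.0, 0.0], [1.0, 1.0]
a_T, b_T = integrate(a0, b0, t_final=10.0)

print("a(T) =", a_T)
print("b(T) =", b_T)
print("tr L  :", trace_L(a0), "->", trace_L(a_T))
print("tr L^2:", trace_L2(a0, b0), "->", trace_L2(a_T, b_T))
```

For this particular choice the b_n decay and the a_n approach the eigenvalues √2, 0, −√2 in decreasing order, which illustrates the sorting property mentioned in the abstract.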


The continuum limit of the Toda lattice is studied by Deift and McLaughlin in [7]. For $\varepsilon > 0$, they choose $N = [1/\varepsilon]$ and initial values

$$a_n = a_0(\varepsilon n), \qquad b_n = b_0(\varepsilon n), \tag{1.3}$$

in which $a_0(x)$ and $b_0(x)$ are given continuous functions for $x \in [0, 1]$ such that $b_0(x) > 0$ on $(0, 1)$. For small $\varepsilon > 0$, the initial data (1.3) vary only very little with the index $n$. Since the constant functions $a_n(t) \equiv a \in \mathbb{R}$ and $b_n(t) \equiv b > 0$ clearly satisfy (1.1)–(1.2), one may then reasonably expect that the solutions with initial data (1.3) are approximately constant in time, and vary noticeably only over large time scales. Putting

$$a_n(t) = a^\varepsilon(\varepsilon n, \varepsilon t), \qquad b_n(t) = b^\varepsilon(\varepsilon n, \varepsilon t),$$

one may then make the ansatz about the asymptotic form of the functions $a^\varepsilon$ and $b^\varepsilon$:

$$a^\varepsilon(x, t) \sim a(x, t) + \varepsilon a_1(x, t) + \cdots, \tag{1.4}$$

$$b^\varepsilon(x, t) \sim b(x, t) + \varepsilon b_1(x, t) + \cdots. \tag{1.5}$$

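Inserting the asymptotic form (1.4)–(1.5) into (1.1)–(1.2) and equating the leading order terms, one arrives at the formal continuum limit:

$$\frac{\partial a}{\partial t} = 4b\,\frac{\partial b}{\partial x}, \qquad \frac{\partial b}{\partial t} = b\,\frac{\partial a}{\partial x}, \tag{1.6}$$

for $0 < x < 1$ and $t > 0$, with boundary conditions $b(0) = b(1) = 0$. The system (1.6) is hyperbolic with Riemann invariants

$$\alpha = a - 2b, \qquad \beta = a + 2b.$$

The Riemann invariant form of (1.6) is

$$\frac{\partial \alpha}{\partial t} = -\frac{\beta - \alpha}{2}\,\frac{\partial \alpha}{\partial x}, \qquad \frac{\partial \beta}{\partial t} = \frac{\beta - \alpha}{2}\,\frac{\partial \beta}{\partial x}, \tag{1.7}$$

with initial values

$$\alpha_0(x) := a_0(x) - 2b_0(x), \qquad \beta_0(x) := a_0(x) + 2b_0(x). \tag{1.8}$$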
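As a check on (1.7), the following short computation (a worked step added here, not quoted from [7]) substitutes α = a − 2b and β = a + 2b into (1.6) and uses β − α = 4b:

```latex
\begin{aligned}
\frac{\partial \alpha}{\partial t}
  &= \frac{\partial a}{\partial t} - 2\,\frac{\partial b}{\partial t}
   = 4b\,\frac{\partial b}{\partial x} - 2b\,\frac{\partial a}{\partial x}
   = -2b\,\frac{\partial}{\partial x}(a - 2b)
   = -\frac{\beta - \alpha}{2}\,\frac{\partial \alpha}{\partial x}, \\
\frac{\partial \beta}{\partial t}
  &= \frac{\partial a}{\partial t} + 2\,\frac{\partial b}{\partial t}
   = 4b\,\frac{\partial b}{\partial x} + 2b\,\frac{\partial a}{\partial x}
   = 2b\,\frac{\partial}{\partial x}(a + 2b)
   = \frac{\beta - \alpha}{2}\,\frac{\partial \beta}{\partial x}.
\end{aligned}
```

The characteristic speeds are therefore ∓2b = ∓(β − α)/2, and both vanish at points where b = 0.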

A rigorous justification of this procedure was provided by Deift and McLaughlin in [7] for certain initial data (see below), modified to correspond to WKB data, and for times $t$ up to the shock time of the system (1.6); see also the recent work [2]. For times beyond shock time, a more complicated description was found in certain cases.

The analysis of the continuum limit of the Toda lattice has much in common with the analysis of Lax and Levermore [22] of the zero-dispersion limit of the Korteweg–de Vries (KdV) equation $u_t - 6uu_x + \varepsilon^2 u_{xxx} = 0$ as $\varepsilon \to 0$. In both cases, essential use is made of the inverse spectral (or scattering) transforms for the KdV equation and the Toda lattice, respectively. A fundamental step in the analysis is the formulation of a quadratic maximization problem in the spectral variable in which the space-time variables $x$ and $t$ appear as parameters. The maximization problem arises from an asymptotic analysis of the spectral transform.


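For the continuum limit of the Toda lattice, the spectral data are the eigenvalues $\lambda_k^\varepsilon$, $k = 1, 2, \ldots, N$, of the tridiagonal matrix

$$L = \begin{pmatrix} a_1 & b_1 & & & 0 \\ b_1 & a_2 & b_2 & & \\ & b_2 & a_3 & \ddots & \\ & & \ddots & \ddots & b_{N-1} \\ 0 & & & b_{N-1} & a_N \end{pmatrix} \tag{1.9}$$

constructed from the initial values (1.3), together with the “norming constants” $w_k^\varepsilon > 0$, $k = 1, \ldots, N$, where $w_k^\varepsilon$ is the first component of the normalized eigenvector of $L$ corresponding to the eigenvalue $\lambda_k^\varepsilon$.

The following restriction is put on the initial Riemann invariants $\alpha_0$ and $\beta_0$ from (1.8). We assume as in [7, 21]

• $\alpha_0$ has exactly one local minimum in $[0, 1]$, and
• $\beta_0$ has exactly one local maximum in $[0, 1]$.

In addition, we assume that $\alpha_0(x) < \beta_0(x)$ for $x \in (0, 1)$, $\alpha_0(0) = \beta_0(0)$ and $\alpha_0(1) = \beta_0(1)$. We put

$$A := \min_{0 \le x \le 1} \alpha_0(x), \qquad B := \max_{0 \le x \le 1} \beta_0(x).$$

It follows from the above assumptions that, for every $\lambda \in [A, B]$, the set $\{x \in [0, 1] : \alpha_0(x) \le \lambda \le \beta_0(x)\}$ is an interval. We denote the endpoints of this interval by $x_-(\lambda)$ and $x_+(\lambda)$. Under these hypotheses, Deift and McLaughlin [7] show that the eigenvalues $\lambda_k^\varepsilon$ have an asymptotic density $\phi$, i.e., for all $\lambda_1 < \lambda_2$,

$$\lim_{\varepsilon \to 0} \varepsilon\, \#\{\lambda_k^\varepsilon \in (\lambda_1, \lambda_2)\} = \int_{\lambda_1}^{\lambda_2} \phi(\lambda)\, d\lambda,$$

and $\phi$ is given by

$$\phi(\lambda) = \frac{1}{\pi} \int_{x_-(\lambda)}^{x_+(\lambda)} \frac{1}{\sqrt{(\beta_0(x) - \lambda)(\lambda - \alpha_0(x))}}\, dx, \qquad \lambda \in [A, B]. \tag{1.10}$$

Furthermore, for every fixed $\lambda^* \in [A, B]$, and eigenvalues $\lambda_{k(\varepsilon)}^\varepsilon$ of $L$ such that $\lambda_{k(\varepsilon)}^\varepsilon \to \lambda^*$ as $\varepsilon \to 0$, the limit

$$\lim_{\varepsilon \to 0} \varepsilon \log w_{k(\varepsilon)}^\varepsilon = -V(\lambda^*)$$

exists, and $V$ is given by

$$V(\lambda) = \int_0^{x_-(\lambda)} \log \left| \frac{\lambda - a_0(x)}{2b_0(x)} + \sqrt{\left(\frac{\lambda - a_0(x)}{2b_0(x)}\right)^2 - 1}\, \right| dx. \tag{1.11}$$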



In (1.11) that branch of the square root is chosen which is positive for $\lambda > \beta_0(x)$. Following [7], we are led to the maximization problem

$$Q(x, t) := \max \left[ (L\psi, \psi) - 2(V - t\lambda, \psi) \right], \tag{1.12}$$

where the maximum is taken for those functions $\psi$ on $[A, B]$ satisfying

$$0 \le \psi \le \phi, \qquad \int \psi(\lambda)\, d\lambda = x. \tag{1.13}$$

Here

$$L\psi(\lambda) = \int \log|\lambda - \mu|\, \psi(\mu)\, d\mu,$$

and the inner product $(\cdot\,, \cdot)$ is the $L^2$ inner product on $[A, B]$:

$$(f, g) = \int_A^B f(\lambda) g(\lambda)\, d\lambda.$$

The maximization problem (1.12)–(1.13) is an extremal problem for logarithmic potentials. The function V (λ) − tλ in the right-hand side of (1.12) is known as an external field, see [27]. The external field changes with time, and initially, at time t = 0, it is given by the spectral function V . The other spectral function φ is a constant of the motion and appears in (1.13) as an upper constraint for the maximizer. The spatial coordinate x appears as a normalization in (1.13). It is important to note that (1.12)–(1.13) provides a global description of the continuum limit of the Toda lattice. Indeed, the maximizer ψ(λ) = ψ(λ; x, t) exists and is unique for every x ∈ (0, 1) and every t ∈ R. In case, for some range of the parameters x and t in space-time, the “free part” of ψ, i.e., the set of λ ∈ [A, B] where 0 < ψ(λ) < φ(λ), is an interval (α(x, t), β(x, t)) then the endpoints α(x, t) and β(x, t) satisfy Eqs. (1.7) and the asymptotic form (1.4) and (1.5) is believed to be valid. In such a case one says that a zero gap ansatz holds. In case the set 0 < ψ(λ) < φ(λ) consists of several intervals, separated by a number of gaps, then the continuum limit exists only in a weak, averaged sense. The endpoints of the intervals then evolve according to a system of PDEs, which is recognized as a Whitham-type system of modulation equations, see again [7]. It is generally believed that the lack of strong convergence is due to the development of oscillations in the Toda lattice, and that the oscillations can be modeled by theta functions built out of a Riemann surface with genus equal to the number of gaps (for the analogous case of the KdV equation with small dispersion, see [32] and [13] for genus 1 and arbitrary genus, respectively). The connection between the existence of gaps and the development of oscillations in the continuum limit of the Toda lattice has not been established rigorously. 
However, experience from the interplay between whole line scattering theory and periodic spectral theory indicates that the existence of gaps implies the development of oscillations. For the small dispersion limit of the KdV equation, Deift, Venakides, and Zhou [8] have shown, under real analyticity assumptions on the spectral data, that the existence of gaps implies oscillations. Also, for the so-called Toda shock problem, the connection between a gap and oscillations is well known [15, 31, 17]. As already indicated, the formulation and analysis of a maximization problem like (1.12)–(1.13) lies at the heart of the Lax–Levermore approach to the zero-dispersion limit of the KdV equation [22, 30, 8, 11]. In a similar way, other singular limits of integrable



systems have been treated recently. We mention the semiclassical limit of the defocussing nonlinear Schrödinger equation [16] and the continuum limit of a discrete NLS chain [28]. The long time asymptotic behavior was considered in [22] for the zero-dispersion limit of the KdV equation, and in [1] for the semi-classical limit of the defocussing NLS equation.

In this paper, we consider two questions related to the continuum limit of the Toda lattice. The first concerns the long time behavior. It is well known that the Toda lattice (1.1)–(1.2) has the sorting property, i.e., for fixed $\varepsilon > 0$, the tridiagonal matrix (1.9) converges as $t \to \infty$ to a diagonal matrix with the eigenvalues $\lambda_k^\varepsilon$ on the diagonal, sorted from large to small. This sorting property was discussed for the continuum limit in [3]. Deift and McLaughlin [7, Chapter 11] study the long time behavior in terms of the so-called right ansatz. The right ansatz holds for $x$ and $t$ if there exist $\alpha(x, t)$ and $\beta(x, t)$ in $[A, B]$ such that the maximizer $\psi$ of the maximization problem (1.12)–(1.13) satisfies

$$0 < \psi(\lambda) < \phi(\lambda), \qquad \lambda \in (\alpha(x, t), \beta(x, t)), \tag{1.14}$$

$$\psi = 0, \qquad \lambda \in [A, \alpha(x, t)], \tag{1.15}$$

$$\psi = \phi, \qquad \lambda \in [\beta(x, t), B]. \tag{1.16}$$

Theorem 1.1 (Deift–McLaughlin). For initial data as above, there exists a time $\bar t$ such that for $t > \bar t$ there exist $x_0(t)$ and $x_1(t)$ in $[0, 1]$ satisfying

$$\lim_{t \to \infty} x_0(t) = 0 \qquad \text{and} \qquad \lim_{t \to \infty} x_1(t) = 1$$

such that the right ansatz holds for every $t > \bar t$ and every $x \in (x_0(t), x_1(t))$.

Proof. See [7, Theorems 11.1 and 11.2], where this result was proved for the case that $\alpha_0$ is strictly decreasing. [In that case one may take $x_1(t) = 1$.] The more general case follows from similar arguments.

It follows from Theorem 1.1 that, for fixed $x \in (0, 1)$, the functions $\alpha(x, t)$ and $\beta(x, t)$ exist for $t$ sufficiently large. The fact that they have a common limit as $t \to \infty$ represents the continuum analogue of the sorting property of the Toda lattice. We complement this long time result in two ways.

• Firstly, we describe a general class of initial data for which the right ansatz holds for large enough times $t$, and for all $x \in (0, 1)$, see Theorem 4.1. For these initial data, one may therefore write $x_0(t) = 0$ and $x_1(t) = 1$ in Theorem 1.1.
• Secondly, we describe a different class of initial data for which the right ansatz does not hold near $x = 0$ no matter how large $t$ is. See Theorem 5.3 for the precise conditions. For these initial data one therefore necessarily has $x_0(t) > 0$ in Theorem 1.1.

The difference between the two cases lies in the smoothness of $\beta_0$ at its maximum.

The second problem we consider in this paper is the formation of an infinite gap solution, i.e., an infinite number of intervals in the support of the maximizer.

• We present an example in which $C^\infty$ initial data evolve into an infinite gap solution at a certain time $t_0$ and position $x_0$, see Lemma 6.2 and Theorem 6.3. The construction makes essential use of the result on long time behavior.



• At the critical time $t_0$, we show that for $x < x_0$, and for $x \in (x_0, x_0 + \delta)$, the constraint is not active, and the maximizer is supported on finitely many intervals. As $x$ approaches $x_0$, the number of gaps increases without bound.

This indicates that the oscillations in the continuum limit of the Toda lattice become more and more complicated. It would be of interest to show that the oscillations are described by theta functions associated to Riemann surfaces whose genera grow as $x$ tends to $x_0$. We remark that for real analytic spectral data the formation of an infinite gap solution is not possible, see [18].

The outline of the rest of the paper is as follows. Sections 2 and 3 contain preliminary material that is needed for the long time result of Theorem 4.1. In Sect. 2 we introduce an external field $\tilde V$ which is dual to $V$. In Sect. 3 we obtain a uniformly valid long time result from certain smoothness properties of both $V$ and $\tilde V$. Then in Sect. 4 we give a general class of initial data $\alpha_0$ and $\beta_0$ which give rise to external fields $V$ and $\tilde V$ with the required smoothness properties, so that for large enough time, the right ansatz holds uniformly in $x$. In Sect. 5 we consider a different class of initial data for which the right ansatz does not hold uniformly in $x$. This class includes $C^2$ initial data. In our final Sect. 6 we describe the generation of an infinite gap solution from $C^\infty$ initial data.

2. The Dual Problem

The Euler–Lagrange relations associated with the maximization problem (1.12)–(1.13) are

$$L\psi(\lambda) - V(\lambda) + t\lambda = \ell, \qquad \text{if } 0 < \psi(\lambda) < \phi(\lambda), \tag{2.1}$$

$$L\psi(\lambda) - V(\lambda) + t\lambda \le \ell, \qquad \text{if } \psi(\lambda) = 0, \tag{2.2}$$

$$L\psi(\lambda) - V(\lambda) + t\lambda \ge \ell, \qquad \text{if } \psi(\lambda) = \phi(\lambda), \tag{2.3}$$

where $\ell$ is a constant, which may depend on $x$ and $t$. The maximizer is the only function on $[A, B]$ that satisfies (1.13) and the relations (2.1)–(2.3) for some constant $\ell$.

We need some notions from logarithmic potential theory. Good references are [26, 27]. The Green function with pole at infinity of an unbounded domain $\Omega$ in the complex $\lambda$-plane is the unique continuous function of $\lambda$ that is harmonic in $\Omega$, vanishes on $\mathbb{C} \setminus \Omega$ and behaves like $\log|\lambda|$ as $|\lambda| \to \infty$. We use $g_{[\alpha,\beta]}$ to denote the Green function with pole at infinity of $\mathbb{C} \setminus [\alpha, \beta]$. It is known that

$$g_{[\alpha,\beta]}(\lambda) = \log \left| \frac{\lambda - a}{2b} + \sqrt{\left(\frac{\lambda - a}{2b}\right)^2 - 1}\, \right|,$$

where $a = (\alpha + \beta)/2$ and $b = (\beta - \alpha)/4$. Thus from (1.8) and (1.11) we see that the external field $V$ is given as an integral of Green functions

$$V(\lambda) = \int_0^{x_-(\lambda)} g_{[\alpha_0(x), \beta_0(x)]}(\lambda)\, dx. \tag{2.4}$$

Also the function $\phi(\lambda)$ of (1.10) has a potential theoretic interpretation. Let $\omega_{[\alpha,\beta]}$ be the density of the equilibrium measure of the interval $[\alpha, \beta]$, that is,

$$\omega_{[\alpha,\beta]}(\lambda) = \begin{cases} \dfrac{1}{\pi \sqrt{(\beta - \lambda)(\lambda - \alpha)}} & \text{if } \lambda \in [\alpha, \beta], \\[2mm] 0 & \text{elsewhere.} \end{cases}$$



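Then by (1.10)

$$\phi(\lambda) = \int_0^1 \omega_{[\alpha_0(x), \beta_0(x)]}(\lambda)\, dx. \tag{2.5}$$

We recall the following relation between the equilibrium measure and the Green function:

$$L\omega_{[\alpha,\beta]} = g_{[\alpha,\beta]} + \log \frac{\beta - \alpha}{4}. \tag{2.6}$$

We define a second external field

$$\tilde V(\lambda) = \int_{x_+(\lambda)}^1 g_{[\alpha_0(x), \beta_0(x)]}(\lambda)\, dx, \qquad \lambda \in [A, B]. \tag{2.7}$$

Lemma 2.1. Assume that $C = \int_0^1 \log b_0(x)\, dx > -\infty$. Then

$$L\phi = V + \tilde V + C.$$

Proof. Using (2.5) and (2.6), and noting that $(\beta_0(x) - \alpha_0(x))/4 = b_0(x)$ by (1.8), we find

$$L\phi = \int_0^1 g_{[\alpha_0(x), \beta_0(x)]}\, dx + \int_0^1 \log b_0(x)\, dx.$$

The Green function $g_{[\alpha_0(x), \beta_0(x)]}(\lambda)$ vanishes for $\lambda \in [\alpha_0(x), \beta_0(x)]$. Thus for fixed $\lambda$, it vanishes for $x \in [x_-(\lambda), x_+(\lambda)]$. This gives the lemma.

Remark 2.2. From now on we always assume $\int_0^1 \log b_0(x)\, dx > -\infty$. This is related to the assumption that $L\phi$ is a bounded function.

We put $\tilde\psi = \phi - \psi$, so that $\tilde\psi$ satisfies

$$0 \le \tilde\psi \le \phi, \qquad \int \tilde\psi(\lambda)\, d\lambda = 1 - x.$$

Then $L\tilde\psi = L\phi - L\psi$, so that in view of (2.1)–(2.3) and Lemma 2.1,

$$L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda = \tilde\ell, \qquad \text{if } 0 < \tilde\psi(\lambda) < \phi(\lambda), \tag{2.8}$$

$$L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda \le \tilde\ell, \qquad \text{if } \tilde\psi(\lambda) = 0, \tag{2.9}$$

$$L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda \ge \tilde\ell, \qquad \text{if } \tilde\psi(\lambda) = \phi(\lambda), \tag{2.10}$$

with $\tilde\ell = C - \ell$. Thus we have proved the following theorem.

Theorem 2.3. If $\psi$ is the maximizer for the maximization problem (1.12)–(1.13), then $\tilde\psi = \phi - \psi$ is the maximizer for the maximization problem with external field $\tilde V(\lambda) + t\lambda$, constraint $\phi$, and normalization $\int \tilde\psi\, d\lambda = 1 - x$.

We will refer to the maximization problem with external field $\tilde V(\lambda) + t\lambda$ as the dual problem. The simultaneous study of a maximization problem and its dual was done earlier by Dragnev and Saff [10] in their work on the zero asymptotics of Krawtchouk polynomials.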
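Relation (2.6), on which the proof of Lemma 2.1 rests, can be checked numerically for real λ. The sketch below is an illustration under stated assumptions, not part of the original argument: it uses the substitution μ = a + 2b cos θ, under which the equilibrium measure ω dμ becomes dθ/π, and a midpoint rule in θ. The function names and test values are hypothetical.

```python
import math


def green(lam, alpha, beta):
    """Green function of C \\ [alpha, beta] with pole at infinity, for real lam."""
    a = 0.5 * (alpha + beta)
    b = 0.25 * (beta - alpha)
    z = abs(lam - a) / (2.0 * b)
    if z <= 1.0:
        return 0.0  # lam lies in [alpha, beta], where the Green function vanishes
    return math.log(z + math.sqrt(z * z - 1.0))


def log_potential_equilibrium(lam, alpha, beta, n=2000):
    """Approximate (L omega_[alpha,beta])(lam) = int log|lam - mu| omega(mu) dmu.

    With mu = a + 2b cos(theta), omega(mu) dmu = dtheta / pi on (0, pi);
    the theta-integral is approximated by the midpoint rule with n nodes.
    """
    a = 0.5 * (alpha + beta)
    b = 0.25 * (beta - alpha)
    h = math.pi / n
    total = 0.0
    for j in range(n):
        theta = (j + 0.5) * h
        total += math.log(abs(lam - a - 2.0 * b * math.cos(theta)))
    return total / n  # equals (h / pi) * sum


def rhs_2_6(lam, alpha, beta):
    """Right-hand side of (2.6): g + log((beta - alpha) / 4)."""
    return green(lam, alpha, beta) + math.log((beta - alpha) / 4.0)


for lam, alpha, beta in [(4.0, -1.0, 3.0), (-1.0, 0.0, 2.0), (1.0, -1.0, 3.0)]:
    lhs = log_potential_equilibrium(lam, alpha, beta)
    print(lam, (alpha, beta), lhs, rhs_2_6(lam, alpha, beta))
```

For λ outside [α, β] the integrand is smooth and the two sides agree to high accuracy; for λ inside the interval the integrand has a logarithmic singularity, so the midpoint rule converges more slowly there.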



3. The Right Ansatz and the Left Ansatz

The right ansatz was formulated in (1.14)–(1.16). If the right ansatz is valid then the support of $\psi$ is equal to the interval $[\alpha, B]$ and the support of $\tilde\psi = \phi - \psi$, the maximizer for the dual problem, is $[A, \beta]$. It is easy to see that also the converse holds. That is, if the support of $\psi$ is an interval $[\alpha, B]$ and the support of $\tilde\psi$ is an interval $[A, \beta]$, then $\alpha < \beta$ and (1.14)–(1.16) hold. The following lemma gives sufficient conditions for the supports to be intervals. Part of the lemma is covered by [9, Theorem 2.16 (b)] of Dragnev and Saff. For completeness and convenience of the reader, we give here the full proof.

Lemma 3.1. Let $t \in \mathbb{R}$.
(a) If $(\lambda - A)(\tilde V'(\lambda) + t)$ is increasing for $\lambda \in [A, B]$, then, for every $x \in (0, 1)$, there is $\beta = \beta(x, t)$ such that the support of $\tilde\psi$ is $[A, \beta]$. In addition, $\beta(x, t)$ depends continuously on $x$.
(b) If $(B - \lambda)(V'(\lambda) - t)$ is increasing for $\lambda \in [A, B]$, then, for every $x \in (0, 1)$, there is $\alpha = \alpha(x, t)$ such that the support of $\psi$ is $[\alpha, B]$. In addition, $\alpha(x, t)$ depends continuously on $x$.

Proof. (a) If the support of $\tilde\psi$ is not an interval, then there exist $\lambda_1 < \lambda_2$ in $\mathrm{supp}(\tilde\psi)$ such that $\tilde\psi(\lambda) = 0$ for $\lambda \in (\lambda_1, \lambda_2)$. By (2.9) we then have

$$L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda \le \tilde\ell, \qquad \lambda \in (\lambda_1, \lambda_2),$$

while by (2.8) there is equality for $\lambda = \lambda_1$, and for $\lambda = \lambda_2$. Thus $L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda$ assumes its minimum on $[\lambda_1, \lambda_2]$ at an internal point, say at $\lambda^* \in (\lambda_1, \lambda_2)$. Then

$$(L\tilde\psi)'(\lambda^*) - \tilde V'(\lambda^*) - t = 0. \tag{3.1}$$

Next, we note that for $\lambda \in (\lambda_1, \lambda_2)$,

$$(\lambda - A)(L\tilde\psi)'(\lambda) = (\lambda - A) \int_A^B \frac{1}{\lambda - \mu}\, \tilde\psi(\mu)\, d\mu. \tag{3.2}$$

It is easy to see that for every $\mu \in (A, B] \setminus (\lambda_1, \lambda_2)$, the function $(\lambda - A)/(\lambda - \mu)$ is strictly decreasing on $[\lambda_1, \lambda_2]$. Since $\tilde\psi$ is non-negative on $[A, B]$ and vanishes on $(\lambda_1, \lambda_2)$, we find that the left-hand side of (3.2) is strictly decreasing on $[\lambda_1, \lambda_2]$. Then we get, using the assumption that $(\lambda - A)(\tilde V'(\lambda) + t)$ increases on $[A, B]$,

$$(\lambda - A)\left[ (L\tilde\psi)'(\lambda) - \tilde V'(\lambda) - t \right] \ \text{is strictly decreasing on } [\lambda_1, \lambda_2]. \tag{3.3}$$

In view of (3.1) this implies that

$$(L\tilde\psi)'(\lambda) - \tilde V'(\lambda) - t > 0, \qquad \lambda \in (\lambda_1, \lambda^*),$$

$$(L\tilde\psi)'(\lambda) - \tilde V'(\lambda) - t < 0, \qquad \lambda \in (\lambda^*, \lambda_2),$$

which means that $L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda$ has a local maximum at $\lambda^*$. This contradiction shows that the support of $\tilde\psi$ is an interval. Let $\beta = \beta(x, t)$ be the right endpoint of this interval.


For later reference we note that for $\lambda \in (\beta, B]$, we have strict inequality

$$L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda < \tilde\ell.$$

Indeed, if equality held, then we could repeat the same arguments as above, with $\lambda_1 = \beta$ and $\lambda_2 = \lambda$, and we would obtain a contradiction again.

Next, we show that $A$ is in the support of $\tilde\psi$. Assuming $A$ is not in the support of $\tilde\psi$, we let $\lambda_2 > A$ be the smallest number in the support. Then

$$L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda \le \tilde\ell, \qquad \lambda \in (A, \lambda_2),$$

with equality for $\lambda = \lambda_2$. Then in the same way we obtained (3.3), we get that

$$(\lambda - A)\left[ (L\tilde\psi)'(\lambda) - \tilde V'(\lambda) - t \right] \ \text{is strictly decreasing on } [A, \lambda_2].$$

As the above expression vanishes for $\lambda = A$, we see that it is negative for $\lambda \in (A, \lambda_2)$. Thus $L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda$ is strictly decreasing on $[A, \lambda_2]$. This is again a contradiction, since $L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda \le \tilde\ell$ for $\lambda = A$ and equality holds for $\lambda = \lambda_2$. Thus the support of $\tilde\psi$ is equal to the interval $[A, \beta(x, t)]$.

To prove that $\beta(x, t)$ depends continuously on $x$, we note first that as $x$ increases, the normalization $\int \tilde\psi\, d\lambda = 1 - x$ decreases, and therefore, by Proposition 4.1(a) of [18] (see also Lemma 5.1 below) the maximizer decreases as $x$ increases. In addition, the measures $\tilde\psi\, d\lambda$ depend continuously on $x$ in the weak∗ topology of measures on $[A, B]$, see [18, Proposition 4.1(b)]. This immediately implies that the endpoint $\beta(x, t)$ is right-continuous in $x$. What remains is to show that it is also continuous from the left. The limit from the left exists since $\beta$ is a decreasing function of $x$. We denote the left limit by $\beta(x-, t)$. Clearly $\beta(x-, t) \ge \beta(x, t)$. So we certainly have $\beta(x-, t) = \beta(x, t)$ if $\beta(x, t) = B$. Assuming $\beta(x, t) < B$, we recall that we have strict inequality

$$L\tilde\psi(\lambda) - \tilde V(\lambda) - t\lambda < \tilde\ell$$

for all $\lambda \in (\beta(x, t), B]$. By [18, Proposition 4.1(b)] $L\tilde\psi(\lambda)$ and $\tilde\ell$ both depend continuously on $x$. It follows that for any given $\lambda \in (\beta(x, t), B]$ we can find sufficiently small $\delta > 0$ such that also for $x - \delta$ strict inequality holds at $\lambda$. Then $\beta(x - \delta, t) < \lambda$ for all sufficiently small $\delta$, and it follows that $\beta(x-, t) < \lambda$. Since $\lambda$ can be taken arbitrarily close to $\beta(x, t)$, we arrive at $\beta(x-, t) \le \beta(x, t)$. This proves the left-continuity, since we already know that $\beta(x-, t) \ge \beta(x, t)$. This completes the proof of part (a). The proof of part (b) is similar.

Combining Lemma 3.1 with the remarks immediately preceding it, we arrive at the following result.

Theorem 3.2. Let $t \in \mathbb{R}$. If both $(\lambda - A)(\tilde V'(\lambda) + t)$ and $(B - \lambda)(V'(\lambda) - t)$ are increasing functions on the interval $[A, B]$, then the right ansatz holds for the maximization problem (1.12)–(1.13) for every $x \in (0, 1)$. Furthermore, the functions $\alpha(x, t)$ and $\beta(x, t)$ appearing in (1.14)–(1.16) are continuous in $x$.



It is clear that if the conditions of the theorem hold for some $t$, then they hold for all later times as well. Thus in that case, the right ansatz continues to hold for all later times. It is important to note that in Theorem 3.2 the right ansatz holds for every $x \in (0, 1)$.

Dual to the right ansatz, we have the so-called left ansatz. By this we mean that the maximizer $\psi$ of (1.12)–(1.13) satisfies

$$0 < \psi(\lambda) < \phi(\lambda), \qquad \lambda \in (\alpha(x, t), \beta(x, t)), \tag{3.4}$$

$$\psi = \phi, \qquad \lambda \in [A, \alpha(x, t)], \tag{3.5}$$

$$\psi = 0, \qquad \lambda \in [\beta(x, t), B] \tag{3.6}$$

for some values $\alpha(x, t)$, $\beta(x, t)$ in $[A, B]$. The following analogue of Theorem 3.2 holds for the left ansatz.

Theorem 3.3. Let $t \in \mathbb{R}$. If both $(\lambda - A)(V'(\lambda) - t)$ and $(B - \lambda)(\tilde V'(\lambda) + t)$ are increasing functions on $[A, B]$, then the left ansatz holds for the maximization problem (1.12)–(1.13) for every $x \in (0, 1)$, and the functions $\alpha(x, t)$ and $\beta(x, t)$ appearing in (3.4)–(3.6) are continuous in $x$.

If the conditions of Theorem 3.3 hold at some time $t$, then they hold for all earlier times.

Corollary 3.4. Suppose $V$ and $\tilde V$ are differentiable functions on $(A, B)$.
(a) If there is a constant $T$ such that for every $\lambda \in (A, B)$,

$$\frac{d}{d\lambda}\left[ (\lambda - A)\tilde V'(\lambda) \right] \ge -T \qquad \text{and} \qquad \frac{d}{d\lambda}\left[ (B - \lambda) V'(\lambda) \right] \ge -T,$$

then the right ansatz holds for every time $t \ge T$ and every $x \in (0, 1)$.
(b) If there is a constant $T$ such that for every $\lambda \in (A, B)$,

$$\frac{d}{d\lambda}\left[ (\lambda - A) V'(\lambda) \right] \ge T \qquad \text{and} \qquad \frac{d}{d\lambda}\left[ (B - \lambda) \tilde V'(\lambda) \right] \ge T,$$

then the left ansatz holds for every time $t \le T$ and every $x \in (0, 1)$.
(c) If $V$ and $\tilde V$ are $C^2$ functions on $[A, B]$, then there exist $T_r$ and $T_l$ such that the right ansatz holds for $t \ge T_r$ and $x \in (0, 1)$, and the left ansatz for $t \le T_l$ and $x \in (0, 1)$.
In all three cases, the functions $\alpha(x, t)$ and $\beta(x, t)$ are continuous in $x$.

Proof. This follows immediately from Theorems 3.2 and 3.3.
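The step from the hypotheses of Corollary 3.4(a) to those of Theorem 3.2 can be spelled out as follows. This is a sketch added for the reader, writing $V'$ and $\tilde V'$ for the derivatives of the two external fields and assuming $t \ge T$:

```latex
\begin{aligned}
\frac{d}{d\lambda}\Big[(\lambda - A)\big(\tilde V'(\lambda) + t\big)\Big]
  &= \frac{d}{d\lambda}\Big[(\lambda - A)\,\tilde V'(\lambda)\Big] + t
   \;\ge\; t - T \;\ge\; 0, \\
\frac{d}{d\lambda}\Big[(B - \lambda)\big(V'(\lambda) - t\big)\Big]
  &= \frac{d}{d\lambda}\Big[(B - \lambda)\,V'(\lambda)\Big] + t
   \;\ge\; t - T \;\ge\; 0 .
\end{aligned}
```

Both products are therefore increasing on [A, B], which is the hypothesis of Theorem 3.2. Part (b) follows in the same way with −t in place of +t and t ≤ T, and in part (c) the C² assumption makes the two derivatives bounded on [A, B], so that suitable constants exist.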

Remark 3.5. Under the conditions of Corollary 3.4, one may also show that the functions α(x, t) and β(x, t) are continuous in t. However, as we will not use this in the rest of the paper, we will not show it here.



4. Long Time Behavior: Right Ansatz Holds Uniformly

In many important cases the functions $V$ and $\tilde V$ are not $C^2$ functions and special attention is required for points where differentiability fails. Our goal in this section is to find conditions on the initial Riemann invariants $\alpha_0$ and $\beta_0$ such that Theorem 3.2 can be applied for $t$ sufficiently large. Recall that $V$ and $\tilde V$ are determined by $\alpha_0$ and $\beta_0$ through Eqs. (2.4) and (2.7), respectively. We study initial data satisfying the following conditions (see Fig. 1 below).

I. The function $\beta_0(x)$ is continuous on $[0, 1]$ and assumes its maximum at $x_B \in (0, 1)$. It is a $C^2$ function on $[0, 1] \setminus \{x_B\}$ with $\beta_0' > 0$ on $[0, x_B)$ and $\beta_0' < 0$ on $(x_B, 1]$. At the point $x_B$ the left and right limits of $\beta_0'$ and $\beta_0''$ exist with

$$\beta_0'(x_B-) > 0, \qquad \beta_0'(x_B+) < 0. \tag{4.1}$$

II. The function $\alpha_0(x)$ is continuous on $[0, 1]$ and assumes its minimum at $x_A \in (0, 1)$. It is a $C^2$ function on $[0, 1] \setminus \{x_A\}$ with $\alpha_0' < 0$ on $[0, x_A)$ and $\alpha_0' > 0$ on $(x_A, 1]$. At the point $x_A$ the left and right limits of $\alpha_0'$ and $\alpha_0''$ exist with

$$\alpha_0'(x_A-) < 0, \qquad \alpha_0'(x_A+) > 0. \tag{4.2}$$

[Figure 1: graphs of β0(x) (upper curve) and α0(x) (lower curve) on [0, 1]; the points xB and xA are marked on the horizontal axis.]

Fig. 1. Example of initial Riemann invariants satisfying the Conditions I and II

Theorem 4.1. If the initial data α0, β0 satisfy Conditions I and II, then for t large enough, the right ansatz holds for all x ∈ (0, 1), and the functions α(x, t) and β(x, t) are continuous in x.

Proof. The proof is based on Theorem 3.2, so that we have to show that, for t large enough, both (B − λ)(V′(λ) − t) and (λ − A)(Ṽ′(λ) + t) increase on [A, B]. We discuss here in detail the behavior of (B − λ)(V′(λ) − t), the other function being similar.



Without loss of generality we may assume that α0 (0) = β0 (0) = 0. Then A < 0 < B. We start with some remarks on the function V . It is a non-negative continuous function on [A, B] with V (0) = 0. It is differentiable on (A, 0) and on (0, B) with derivative

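$$V'(\lambda) = \int_0^{x_-(\lambda)} \frac{1}{\sqrt{(\lambda-\alpha_0(x))(\lambda-\beta_0(x))}}\,dx, \qquad \lambda \in (0, B), \tag{4.3}$$
$$V'(\lambda) = -\int_0^{x_-(\lambda)} \frac{1}{\sqrt{(\lambda-\alpha_0(x))(\lambda-\beta_0(x))}}\,dx, \qquad \lambda \in (A, 0). \tag{4.4}$$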

We see that V′(λ) is positive for λ > 0 and negative for λ < 0. Hence V assumes its minimum only at 0. [We could have taken (4.3) as the formula for V′ for all λ, if we would consider √((λ − α0(x))(λ − β0(x))) as a complex function of the complex variable λ, which would take positive values for λ > β0(x) and negative values for λ < α0(x). However, we chose here not to take that point of view. Instead all formulas are “real” formulas, and all square roots are positive.]

The values V′(B) and V′(A) exist. Indeed, we have
$$V'(B) = \int_0^{x_B} \frac{1}{\sqrt{(B-\alpha_0(x))(B-\beta_0(x))}}\,dx$$
and the integral converges because of the assumption that β0′(xB−) > 0. Similarly, V′(A) exists. Hence V′ is a continuous function on [A, B] \ {0}. At 0, V′ is not continuous, but the left and right limits at 0 exist. This may be seen from the formula
$$V'(\lambda) = \int_0^1 \frac{\sqrt{\lambda}\,x_-'(\lambda u)}{\sqrt{\lambda-\alpha_0(x_-(\lambda u))}}\,\frac{du}{\sqrt{1-u}}, \qquad 0 < \lambda < B, \tag{4.5}$$
which is obtained from (4.3) through the change of variables x = x−(λu). As λ → 0+, it is easy to see that
$$x_-'(\lambda u) \to \frac{1}{\beta_0'(0)} \tag{4.6}$$
and
$$\frac{\sqrt{\lambda}}{\sqrt{\lambda-\alpha_0(x_-(\lambda u))}} = \Bigl(1 - \frac{\alpha_0(x_-(\lambda u))}{\lambda}\Bigr)^{-1/2} \to \Bigl(1 - \frac{\alpha_0'(0)}{\beta_0'(0)}\,u\Bigr)^{-1/2}.$$
Thus
$$V'(0+) = \frac{1}{\beta_0'(0)} \int_0^1 \Bigl(1 - \frac{\alpha_0'(0)}{\beta_0'(0)}\,u\Bigr)^{-1/2} \frac{du}{\sqrt{1-u}} = 2\,\frac{\arctan\sqrt{-\alpha_0'(0)/\beta_0'(0)}}{\sqrt{-\alpha_0'(0)\beta_0'(0)}}. \tag{4.7}$$
Similarly
$$V'(0-) = -2\,\frac{\arctan\sqrt{-\beta_0'(0)/\alpha_0'(0)}}{\sqrt{-\alpha_0'(0)\beta_0'(0)}}.$$
Thus V′ has a jump at 0 of magnitude
$$V'(0+) - V'(0-) = \frac{\pi}{\sqrt{-\alpha_0'(0)\beta_0'(0)}}. \tag{4.8}$$
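The closed forms in (4.7) and (4.8) can be checked numerically. The sketch below is only an illustration: the slopes α0′(0) = −2 and β0′(0) = 3 are hypothetical values chosen to satisfy the sign requirements of Conditions I and II, and the function names are not from the paper. The endpoint singularity of du/√(1 − u) is removed by the substitution u = 1 − v².

```python
import math

def weighted_integral(g, n=20_000):
    """Approximate the integral of g(u) du / sqrt(1 - u) over [0, 1].

    With u = 1 - v**2 one has du / sqrt(1 - u) = 2 dv on v in [0, 1],
    so the integrand becomes smooth and the midpoint rule is accurate.
    """
    h = 1.0 / n
    total = 0.0
    for j in range(n):
        v = (j + 0.5) * h
        total += 2.0 * g(1.0 - v * v)
    return total * h

def v_prime_right_numeric(alpha_slope, beta_slope):
    """Integral form of V'(0+) in (4.7); the slopes are hypothetical inputs."""
    c = -alpha_slope / beta_slope  # positive when alpha_slope < 0 < beta_slope
    return weighted_integral(lambda u: (1.0 + c * u) ** -0.5) / beta_slope

def v_prime_right_closed(alpha_slope, beta_slope):
    """Closed form of V'(0+) in (4.7)."""
    root = math.sqrt(-alpha_slope * beta_slope)
    return 2.0 * math.atan(math.sqrt(-alpha_slope / beta_slope)) / root

def v_prime_left_closed(alpha_slope, beta_slope):
    """Closed form of V'(0-) stated after (4.7)."""
    root = math.sqrt(-alpha_slope * beta_slope)
    return -2.0 * math.atan(math.sqrt(-beta_slope / alpha_slope)) / root

if __name__ == "__main__":
    a, b = -2.0, 3.0  # hypothetical values of alpha_0'(0) and beta_0'(0)
    print(v_prime_right_numeric(a, b), v_prime_right_closed(a, b))
    jump = v_prime_right_closed(a, b) - v_prime_left_closed(a, b)
    print(jump, math.pi / math.sqrt(-a * b))  # the two sides of (4.8)
```

The jump formula (4.8) reduces to the identity arctan s + arctan(1/s) = π/2 for s > 0, which is why the two printed numbers in the last line agree.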

To study the differentiability of V′ we restrict ourselves to λ ∈ (0, B], the case λ ∈ [A, 0) being similar. We consider
$$f(\lambda, u) = \frac{\sqrt{\lambda}\,x_-'(\lambda u)}{\sqrt{\lambda-\alpha_0(x_-(\lambda u))}}, \qquad 0 \le \lambda \le B, \quad 0 \le u \le 1, \tag{4.9}$$
which for λ = 0 is interpreted as the limit from above (which exists by (4.6) and (4.7)). In view of (4.5) we have
$$V'(\lambda) = \int_0^1 f(\lambda, u)\,\frac{du}{\sqrt{1-u}}.$$
The function f is continuous on the rectangle R = {(λ, u) : 0 ≤ λ ≤ B, 0 ≤ u ≤ 1}. From (4.9) we see that f is differentiable with respect to λ, as all functions in (4.9) are differentiable. Furthermore, at λ = 0, f is differentiable from the right, and at λ = B, it is differentiable from the left. However, there is one exception, which has to do with the fact that α0 is not differentiable at xA. Thus α0(x−(λu)) is not differentiable with respect to λ if x−(λu) = xA. This happens if xA < xB and λu = β0(xA). In that case it follows from the assumptions on α0 that the derivatives from the left and from the right exist. Thus ∂f/∂λ exists on R \ {λu = β0(xA)}, is continuous there, and on the curve λu = β0(xA) its left and right limits exist. Then it easily follows that
$$\frac{\partial f}{\partial\lambda}(\lambda, u)\,\frac{1}{\sqrt{1-u}}$$
is integrable on R. By Fubini’s theorem, we then have for every λ0 ∈ (0, B],
$$\int_0^{\lambda_0}\!\!\int_0^1 \frac{\partial f}{\partial\lambda}(\lambda, u)\,\frac{du}{\sqrt{1-u}}\,d\lambda = \int_0^1\!\!\int_0^{\lambda_0} \frac{\partial f}{\partial\lambda}(\lambda, u)\,d\lambda\,\frac{du}{\sqrt{1-u}} = \int_0^1 \bigl[f(\lambda_0, u) - f(0, u)\bigr]\frac{du}{\sqrt{1-u}} = V'(\lambda_0) - V'(0+).$$
Thus V′ is differentiable on (0, B] and
$$V''(\lambda) = \int_0^1 \frac{\partial f}{\partial\lambda}(\lambda, u)\,\frac{du}{\sqrt{1-u}}$$
is a continuous function on (0, B] and the limit V″(0+) exists. Similarly, we find that V″ is a continuous function on [A, 0) and V″(0−) exists.

It follows that (B − λ)(V′(λ) − t) is differentiable on [A, B] \ {0} with derivative (B − λ)V″(λ) − V′(λ) + t. The functions V′ and V″ are continuous on [A, B] \ {0} and have left and right limits at 0. Therefore they are bounded. It follows that for
$$t \ge T_1 := \sup_{\lambda\in[A,B]\setminus\{0\}} \bigl[V'(\lambda) - (B-\lambda)V''(\lambda)\bigr]$$
the function (B − λ)(V′(λ) − t) increases on [A, 0) and on (0, B]. At λ = 0, V′ has a jump (4.8), from which it follows that
$$\lim_{\lambda\to0+}(B-\lambda)(V'(\lambda)-t) \ge \lim_{\lambda\to0-}(B-\lambda)(V'(\lambda)-t).$$
Thus (B − λ)(V′(λ) − t) increases on the full interval [A, B] if t ≥ T1. In the same way, we obtain that (λ − A)(Ṽ′(λ) + t) increases on [A, B] for t ≥ T2, with
$$T_2 := \sup_{\lambda\in[A,B]\setminus\{0\}} \bigl[-\tilde V'(\lambda) - (\lambda-A)\tilde V''(\lambda)\bigr].$$
Thus for t ≥ max(T1, T2), both (B − λ)(V′(λ) − t) and (λ − A)(Ṽ′(λ) + t) increase on [A, B] and the theorem follows because of Theorem 3.2.

Remark 4.2. The requirements that α0 and β0 are C² functions at 0 imply that φ(λ) becomes infinite at λ = 0 (as in the proof of Theorem 4.1, we assume that α0(0) = β0(0) = 0). In fact one may check that Conditions I and II imply that φ(λ) behaves like log(1/|λ|) for λ near 0. Thus the accumulation of eigenvalues for the original Toda matrix L is greater at zero than at other points of the spectrum. This observation has a noteworthy dynamical consequence. Since φ is a constant of the motion, the blow-up of φ at λ = 0 should be visible in the evolving curves α(x, t), β(x, t) at all times. Since the right ansatz holds for large time by Theorem 4.1, it follows that at large times, either α(·, t) or β(·, t) has a zero derivative at the value of x where it is 0.

Remark 4.3. The inequalities (4.1) in Condition I imply that the function β0 is not C² at the point xB, but rather possesses a corner there. We will show in the next section that the conclusion of Theorem 4.1 does not hold if β0 is a C² function on the full interval [0, 1]. It is similarly true that the conclusion of the theorem does not hold if α0 is a C² function on the full interval. The difference between the two cases has an interesting dynamical interpretation. The point xB where β0 has its maximum moves to the left as t increases. If β0 has a corner at xB, then the top point hits the boundary x = 0 in finite time. This is a consequence of Theorem 4.1. On the other hand, if β0 is a C² function, then the top point will not hit the boundary in finite time. A similar interpretation applies to α0.



5. Long Time Behavior: Right Ansatz Does Not Hold Uniformly

The goal of this section is to prove Theorem 5.3 from which it follows that for C² initial data α0 and β0, the right ansatz does not hold uniformly in x, no matter how large t is. To establish Theorem 5.3 we will use two results on the x-dependence of the maximizer, which will be discussed first. We study the dependence of the maximizer ψ on the spatial parameter x. Lemmas 5.1 and 5.2 hold generally and are not restricted to C² spectral data.

Lemma 5.1. Let V be continuous, and let φ be such that Lφ is continuous. Then for a fixed t, the maximizer ψ of the problem (1.11)–(1.12) increases with x.

Proof. See Proposition 4.1 of [18].

We are also interested in the behavior for x → 0+.

Lemma 5.2. Let V be continuous, and let φ be such that Lφ is continuous. We further assume that supp(φ) = [A, B]. Fix t ∈ R. The following are equivalent for λ0 ∈ [A, B]:
(a) λ0 is in the support of the maximizer ψ for every x ∈ (0, 1);
(b) the function V(λ) − tλ assumes its minimum at λ0.

Proof. Assume (b) holds. Let x > 0 and denote the corresponding maximizer by ψ. The function Lψ is harmonic on C \ supp(ψ). Since it tends to +∞ at infinity, the minimum principle for harmonic functions gives that Lψ assumes its minimum only in supp(ψ). Let λ1 ∈ supp(ψ) be a point where the minimum is assumed. Then, by (2.1) and (2.3),
$$L\psi(\lambda_1) - V(\lambda_1) + t\lambda_1 \ge \ell.$$
If we assume that λ0 ∉ supp(ψ), then Lψ(λ1) < Lψ(λ0) and by (2.2)
$$L\psi(\lambda_0) - V(\lambda_0) + t\lambda_0 \le \ell.$$
Combining the last three relations, we find V(λ0) − tλ0 > V(λ1) − tλ1, which contradicts (b). Thus λ0 ∈ supp(ψ) and (a) holds.

Next, assume that (a) holds. For x ∈ (0, 1), we use ψ(·; x) to denote the maximizer corresponding to x and ℓx to denote the constant appearing in (2.1)–(2.3). Then by (2.1) and (2.3) we have for every x ∈ (0, 1),
$$L\psi(\lambda_0; x) - V(\lambda_0) + t\lambda_0 \ge \ell_x.$$
Letting x → 0, we have that Lψ(·; x) → 0 by the dominated convergence theorem, and thus
$$-V(\lambda_0) + t\lambda_0 \ge \limsup_{x\to0} \ell_x.$$
We are done, if we can prove that
$$\lim_{x\to0} \ell_x = -\min\{V(\lambda) - t\lambda : \lambda \in [A, B]\}. \tag{5.1}$$
Let λ1 be a point where V(λ) − tλ assumes its minimum. Then λ1 ∈ supp(ψ(·; x)) for every x ∈ (0, 1), by what has been proved before. Thus
$$L\psi(\lambda_1; x) - V(\lambda_1) + t\lambda_1 \ge \ell_x. \tag{5.2}$$
For every x ∈ (0, 1), the set of λ values such that Lψ(λ; x) − V(λ) + tλ = ℓx is a non-empty closed set, because of the continuity of Lψ(λ; x). We let λx be a point closest to λ1 such that
$$L\psi(\lambda_x; x) - V(\lambda_x) + t\lambda_x = \ell_x. \tag{5.3}$$
We claim that λx → λ1 as x → 0. Indeed, if this were not the case, there would be a sequence (xn) tending to 0 and an ε > 0 such that |λ_{xn} − λ1| > ε. Then using (5.2) and (5.3) we would find that Lψ(λ; xn) − V(λ) + tλ > ℓ_{xn} for all λ in the interval Δ = (λ1 − ε, λ1 + ε). Then ψ(·; xn) = φ in Δ by (2.3), so that
$$x_n = \int \psi(\lambda; x_n)\,d\lambda \ge \int_\Delta \psi(\lambda; x_n)\,d\lambda = \int_\Delta \phi(\lambda)\,d\lambda > 0.$$
This contradicts the fact that (xn) tends to 0. Therefore λx → λ1 as x → 0, as claimed. Now letting x → 0 in (5.3), we get
$$-V(\lambda_1) + t\lambda_1 = \lim_{x\to0} \ell_x,$$
which proves (5.1) by the definition of λ1. This completes the proof of the lemma.

Theorem 5.3. Suppose the initial upper Riemann invariant β0 is increasing on the interval [0, xB], and decreasing on the interval [xB, 1], and the lower Riemann invariant α0 is decreasing on the interval [0, xA], and increasing on the interval [xA, 1], where xA, xB ∈ (0, 1).
(a) If β0 is a C² function in a neighborhood of xB, then for every t > 0, there exists δ > 0 such that for every x ∈ (0, δ), the maximizer ψ vanishes in a neighborhood of B.
(b) If α0 is a C² function in a neighborhood of xA, then for every t > 0, there exists δ > 0 such that for every x ∈ (1 − δ, 1), the maximizer ψ = φ in a neighborhood of A.
Consequently, in both cases the right ansatz (1.14)–(1.16) is not valid for all x ∈ (0, 1), no matter how large t is. In case (a) it fails for x close to 0, and in case (b) it fails for x close to 1.



Proof. We will restrict our attention to the proof of part (a), as the proof of part (b) is similar. So we assume that β0 is a C² function in a neighborhood of xB. Then as in (4.3), we have
$$V'(B) = \int_0^{x_B} \frac{1}{\sqrt{(B-\alpha_0(x))(B-\beta_0(x))}}\,dx. \tag{5.4}$$
As β0 is a C² function around xB and xB is the point where β0 has its maximum, we have
$$B - \beta_0(x) = O((x_B - x)^2), \qquad (x \to x_B).$$
Hence the integral in (5.4) diverges and V′(B) = ∞. It follows that V(λ) − tλ does not assume its minimum at λ = B, no matter how large t is. Consequently, by Lemma 5.2, there is δ > 0 such that the maximizer ψ corresponding to normalization δ vanishes in a neighborhood of B. But then by Lemma 5.1, the maximizer corresponding to any smaller normalization also vanishes in this neighborhood, and thus part (a) of the theorem follows.

Example 5.4. The effect described in Theorem 5.3 is clearly visible in the following explicit solution of the continuous Toda equations (1.7):
$$\alpha(x, t) = \Bigl(\sqrt{(1-x)p} - \sqrt{x(1-p)}\Bigr)^2, \tag{5.5}$$
$$\beta(x, t) = \Bigl(\sqrt{(1-x)p} + \sqrt{x(1-p)}\Bigr)^2, \tag{5.6}$$
where
$$p = p(t) = \frac{1}{1+e^{-2t}}. \tag{5.7}$$
A straightforward calculation shows that (5.5) and (5.6) satisfy (1.7) indeed. This example is related to Krawtchouk polynomials [10]. The corresponding initial data are
$$\alpha_0(x) = \frac12\bigl(\sqrt{1-x} - \sqrt{x}\bigr)^2 = \frac12 - \sqrt{x(1-x)},$$
$$\beta_0(x) = \frac12\bigl(\sqrt{1-x} + \sqrt{x}\bigr)^2 = \frac12 + \sqrt{x(1-x)}.$$
The upper Riemann invariant β is smooth and has its maximum initially at x = 1/2, and at later times at
$$x = 1 - p = \frac{1}{1+e^{2t}}.$$
Similarly, α has its minimum at
$$x = p = \frac{1}{1+e^{-2t}}.$$
Thus for t > 0 the right ansatz holds for x in the interval
$$[1-p,\,p] = \Bigl[\frac{1}{1+e^{2t}},\ \frac{1}{1+e^{-2t}}\Bigr],$$
but not for x in [0, 1 − p) ∪ (p, 1].
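The claims of Example 5.4 lend themselves to a quick numerical check. The following sketch assumes the reading of (5.5)–(5.7) as squared sums and differences of square roots, evaluates the extremum locations on a grid, and illustrates the divergence used in the proof of Theorem 5.3: for this initial data (1 − α0(x))(1 − β0(x)) = (x − 1/2)², so an integral of the form (5.4) with B = 1, truncated at 1/2 − ε, equals log(1/(2ε)). The helper names and grid sizes are illustrative choices, not part of the paper.

```python
import math

def p_of_t(t):
    """The function p(t) of (5.7)."""
    return 1.0 / (1.0 + math.exp(-2.0 * t))

def alpha(x, t):
    """Lower Riemann invariant, read from (5.5)."""
    p = p_of_t(t)
    return (math.sqrt((1.0 - x) * p) - math.sqrt(x * (1.0 - p))) ** 2

def beta(x, t):
    """Upper Riemann invariant, read from (5.6)."""
    p = p_of_t(t)
    return (math.sqrt((1.0 - x) * p) + math.sqrt(x * (1.0 - p))) ** 2

def grid(n=2000):
    """Uniform grid on [0, 1]; the resolution is an arbitrary choice."""
    return [j / n for j in range(n + 1)]

def truncated_integral(eps, n=100_000):
    """Midpoint rule for the integral over [0, 1/2 - eps] of
    dx / sqrt((1 - alpha0(x)) (1 - beta0(x))), i.e. (5.4) with B = 1."""
    upper = 0.5 - eps
    h = upper / n
    total = 0.0
    for j in range(n):
        x = (j + 0.5) * h
        total += 1.0 / math.sqrt((1.0 - alpha(x, 0.0)) * (1.0 - beta(x, 0.0)))
    return total * h

if __name__ == "__main__":
    t = 0.7
    p = p_of_t(t)
    x_top = max(grid(), key=lambda x: beta(x, t))
    x_bottom = min(grid(), key=lambda x: alpha(x, t))
    print(x_top, 1.0 - p)   # maximum of beta sits near x = 1 - p
    print(x_bottom, p)      # minimum of alpha sits near x = p
    for eps in (1e-2, 1e-3):
        # grows like log(1 / (2 eps)): the integral in (5.4) diverges
        print(eps, truncated_integral(eps), math.log(1.0 / (2.0 * eps)))
```

The logarithmic growth in the last loop is the smooth-maximum mechanism of part (a) of Theorem 5.3; with a corner at xB, as in Condition I, the corresponding integrand would have only an inverse square-root singularity and the integral would stay bounded.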



6. Generation of Infinite Gaps from Smooth Initial Data

We show in this section how the previous results can be used to establish the existence of smooth C∞ initial data such that for some time t0 and some position x0, the maximizer is supported on an infinite union of disjoint intervals.

6.1. Construction of the external field. We start with the construction of a C∞ external field V0 such that the equilibrium measure in the presence of V0 (with normalization 1, and without upper constraint) is supported on infinitely many intervals.

Lemma 6.1. Define for k = 0, 1, 2, . . . ,
$$a_k = \frac13\Bigl(\frac12\Bigr)^k, \qquad b_k = \frac12\Bigl(\frac12\Bigr)^k, \tag{6.1}$$
and put
$$\Sigma = \{0\} \cup \bigcup_{k=0}^{\infty} [a_k, b_k].$$
Then there are functions ψ0 and V0 on R with the following properties:
(a) ψ0 is a non-negative continuous function with support equal to Σ such that
$$\int \psi_0(\lambda)\,d\lambda = 1. \tag{6.2}$$
(b) V0 is C∞ on R and real analytic on R \ {0, b0, a0, b1, . . . }.
(c) Lψ0 = V0 on Σ and Lψ0 < V0 on R \ Σ.

Proof. The function ψ0 will be built out of translates and rescalings of the function
$$f(\lambda) = \begin{cases} \dfrac{2}{\pi}\sqrt{1-\lambda^2} & \text{for } \lambda \in [-1, 1], \\ 0 & \text{otherwise.} \end{cases} \tag{6.3}$$
It is well known that
$$Lf(\lambda) = \lambda^2 - 1/2 - \log 2, \qquad \text{for } \lambda \in [-1, 1],$$
$$Lf(\lambda) < \lambda^2 - 1/2 - \log 2, \qquad \text{for } \lambda \in \mathbb{R} \setminus [-1, 1],$$
(see, for example, [27, p. 240]). Thus Lf is real analytic on the intervals (−∞, −1), (−1, 1) and (1, ∞). Then there is a function W such that
$$W \text{ is } C^\infty \text{ on } \mathbb{R} \text{ and real analytic on } \mathbb{R} \setminus \{-2, -1, 1, 3\}, \tag{6.4}$$
and
$$Lf(\lambda) = W(\lambda), \qquad \text{for } \lambda \in (-\infty, -2] \cup [-1, 1] \cup [3, \infty), \tag{6.5}$$
$$Lf(\lambda) < W(\lambda), \qquad \text{for } \lambda \in (-2, -1) \cup (1, 3). \tag{6.6}$$
Now for k = 0, 1, 2, . . . , we define
$$c_k = \frac{a_k + b_k}{2} = \frac{5}{12}\Bigl(\frac12\Bigr)^k, \qquad r_k = \frac{b_k - a_k}{2} = \frac{1}{12}\Bigl(\frac12\Bigr)^k,$$
and
$$f_k(\lambda) = f\Bigl(\frac{\lambda - c_k}{r_k}\Bigr), \tag{6.7}$$
$$W_k(\lambda) = r_k\Bigl[W\Bigl(\frac{\lambda - c_k}{r_k}\Bigr) + \log r_k\Bigr]. \tag{6.8}$$
Then the function fk is supported on the interval [ak, bk], and by a straightforward calculation,
$$Lf_k(\lambda) = r_k\Bigl[Lf\Bigl(\frac{\lambda - c_k}{r_k}\Bigr) + \log r_k\Bigr]. \tag{6.9}$$
From (6.5), (6.6), and the definitions (6.1) of ak and bk, it then follows that
$$Lf_k(\lambda) = W_k(\lambda), \qquad \text{for } \lambda \in (-\infty, b_{k+1}] \cup [a_k, b_k] \cup [a_{k-1}, \infty), \tag{6.10}$$
$$Lf_k(\lambda) < W_k(\lambda), \qquad \text{for } \lambda \in (b_{k+1}, a_k) \cup (b_k, a_{k-1}), \tag{6.11}$$
where a−1 = 2/3. Furthermore, Wk is C∞ on R and real analytic on R \ {bk+1, ak, bk, ak−1} as a result of (6.4).

Taking k = 0, we see that W0 is real analytic, except at the points 1/4, 1/3, 1/2 and 2/3. Then there exists a C∞ function Ŵ0 such that Ŵ0 = W0 on [0, 1/2], Ŵ0 > W0 on (−∞, 0) ∪ (1/2, ∞), and such that Ŵ0 is real analytic on R \ {0, 1/4, 1/3, 1/2}. It follows from (6.10)–(6.11) that
$$Lf_0(\lambda) = \hat W_0(\lambda), \qquad \text{for } \lambda \in [0, 1/4] \cup [1/3, 1/2], \tag{6.12}$$
$$Lf_0(\lambda) < \hat W_0(\lambda), \qquad \text{for } \lambda \in (-\infty, 0) \cup (1/4, 1/3) \cup (1/2, \infty). \tag{6.13}$$
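The passage from (6.5)–(6.6) to (6.10)–(6.11) rests on how the affine map t ↦ ck + rk t moves the four exceptional points −2, −1, 1, 3 of W. A short check in exact rational arithmetic, with function names chosen here only for illustration, confirms that they land on bk+1, ak, bk and ak−1 respectively, so that consecutive gaps fit together.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def a(k):
    """Left endpoint a_k from (6.1); a(-1) gives the value 2/3 used in (6.11)."""
    return Fraction(1, 3) * HALF ** k

def b(k):
    """Right endpoint b_k from (6.1)."""
    return Fraction(1, 2) * HALF ** k

def c(k):
    """Midpoint c_k of [a_k, b_k]."""
    return (a(k) + b(k)) / 2

def r(k):
    """Half-length r_k of [a_k, b_k]."""
    return (b(k) - a(k)) / 2

def rescale(t, k):
    """Image of t under the affine map used in (6.7)-(6.8)."""
    return c(k) + r(k) * t

if __name__ == "__main__":
    for k in range(4):
        images = [rescale(t, k) for t in (-2, -1, 1, 3)]
        expected = [b(k + 1), a(k), b(k), a(k - 1)]
        print(k, images == expected, [str(v) for v in images])
```

For k = 0 the images are 1/4, 1/3, 1/2 and 2/3, which are the four points at which W0 fails to be real analytic.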

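Now we form the two infinite series
$$\psi_0(\lambda) = \frac{12}{\sqrt e} \sum_{k=0}^{\infty} \frac{f_k(\lambda)}{k!}, \tag{6.14}$$
$$V_0(\lambda) = \frac{12}{\sqrt e} \Bigl[\hat W_0(\lambda) + \sum_{k=1}^{\infty} \frac{W_k(\lambda)}{k!}\Bigr]. \tag{6.15}$$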
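The constant 12/√e in (6.14) can be traced to a one-line computation: since f integrates to 1, each fk integrates to rk, and the sum of rk/k! over k ≥ 0 equals (1/12)·e^{1/2}. The sketch below evaluates the truncated series; the truncation at 30 terms is an arbitrary choice that is far more than double precision needs.

```python
import math

def band_mass(k):
    """Integral of f_k, which equals r_k = (1/12) (1/2)^k because f has mass 1."""
    return (1.0 / 12.0) * 0.5 ** k

def psi0_mass(terms=30):
    """Total mass of psi_0 from (6.14), truncated after `terms` bands."""
    series = sum(band_mass(k) / math.factorial(k) for k in range(terms))
    return 12.0 / math.sqrt(math.e) * series

if __name__ == "__main__":
    print(psi0_mass())  # agrees with the normalization (6.2) up to rounding
```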

The factor 12/√e was taken in order to guarantee that (6.2) holds. Observe that by construction, the support of ψ0 is Σ, so that property (a) of the lemma holds. We note that V0 is a C∞ function, since Ŵ0 and each Wk is C∞ and the series (6.15) is uniformly convergent on compacts, as are the series with the derivatives of any order. Similarly, V0 is real analytic on each of the open intervals (ak, bk) and (bk+1, ak) for k = 1, 2, . . . . Thus property (b) holds. We see from (6.10)–(6.15) that Lψ0 = V0 on Σ and that Lψ0 < V0 on each of the gaps (bk+1, ak). Because of the modification of W0 to Ŵ0, we also have Lψ0 < V0 on (−∞, 0) and on (b0, ∞). Thus (c) holds.

From properties (a)–(c) of Lemma 6.1 it follows that ψ0 is the equilibrium measure in the presence of the external field V0. That is, it maximizes (Lψ, ψ) − 2(V0, ψ) among all non-negative functions ψ satisfying ∫ψ dλ = 1, see [5, 6, 27].

6.2. Construction of initial data. Let ψ0 and V0 be as in Lemma 6.1. We put
$$\phi(\lambda) = \begin{cases} \dfrac{2}{\pi}\sqrt{1-\lambda^2} & \text{for } \lambda \in [-1, 1], \\ 0 & \text{otherwise.} \end{cases} \tag{6.16}$$
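The constraint φ in (6.16) is the same function as f in (6.3), so the formula Lf(λ) = λ² − 1/2 − log 2 quoted in the proof of Lemma 6.1 applies to Lφ on [−1, 1]. A rough numerical confirmation is sketched below, under the assumption (consistent with the statement that Lψ tends to +∞ at infinity) that L denotes the logarithmic potential Lψ(λ) = ∫ log|λ − s| ψ(s) ds. The substitution s = cos θ removes the square-root weight; the plain midpoint rule handles the remaining logarithmic singularity only to a few digits, which is sufficient here.

```python
import math

def log_potential_of_phi(lam, n=100_000):
    """Approximate L(phi)(lam) = integral of log|lam - s| * phi(s) ds.

    With s = cos(theta), phi(s) ds becomes (2/pi) sin(theta)**2 d(theta)
    on [0, pi]. An even n keeps theta = pi/2 off the midpoint grid.
    """
    h = math.pi / n
    total = 0.0
    for j in range(n):
        theta = (j + 0.5) * h
        gap = abs(lam - math.cos(theta))
        if gap == 0.0:
            continue  # skip an exact hit on the integrable log singularity
        total += math.log(gap) * math.sin(theta) ** 2
    return (2.0 / math.pi) * total * h

def quadratic_model(lam):
    """Right-hand side lam**2 - 1/2 - log 2 of the quoted formula."""
    return lam * lam - 0.5 - math.log(2.0)

if __name__ == "__main__":
    for lam in (0.0, 0.3, -0.8):
        print(lam, log_potential_of_phi(lam), quadratic_model(lam))
    # outside [-1, 1] the potential lies strictly below the quadratic
    print(1.5, log_potential_of_phi(1.5), quadratic_model(1.5))
```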

Since ψ0 is bounded with support Σ ⊂ [0, 1/2], there is an x0 ∈ (0, 1) such that
$$x_0\psi_0 < \phi \qquad \text{on } (-1, 1). \tag{6.17}$$
We consider the external field x0V0 and the constraint φ on the interval [−1, 1]. The dual external field, see Lemma 2.1, is Lφ − x0V0 − C. Both x0V0 and Lφ are C∞ on [−1, 1] (in fact, Lφ is real analytic). Thus, by Corollary 3.4 (c), there exists a sufficiently negative time Tl such that the left ansatz holds for every t < Tl and every x ∈ (0, 1), with continuous functions α(·, t) and β(·, t). Choose t0 > −Tl and write
$$V(\lambda) = x_0V_0(\lambda) + t_0\lambda, \qquad \lambda \in [-1, 1]. \tag{6.18}$$
Note that we then have
$$V'(\lambda) > 0 \qquad \text{for all } \lambda \in [-1, 1].$$
For the external field (6.18) and the constraint φ, the maximizer ψ(·; x, t) for the maximization problem (1.12)–(1.13) at time t ∈ (−∞, t0 + Tl) satisfies the left ansatz (3.4)–(3.6). Thus for every x ∈ (0, 1) and t < t0 + Tl, there exist α(x, t) and β(x, t) in [−1, 1] such that ψ(·; x, t) = φ on [−1, α(x, t)] and 0 < ψ(·; x, t) < φ on (α(x, t), β(x, t)). We take
$$\alpha_0(x) = \alpha(x, 0), \qquad \beta_0(x) = \beta(x, 0), \tag{6.19}$$
as initial data, whose spectral functions φ and V are given by (6.16) and (6.18), respectively.

Lemma 6.2. Let x0 and Tl be defined as above. Then for every t0 > −Tl the following statements hold for the functions α0(x) and β0(x) from (6.19):
(a) α0 and β0 are continuous increasing functions on (0, 1) with
$$-1 < \alpha_0(x) < \beta_0(x) < 1, \tag{6.20}$$
$$\lim_{x\to0+} \alpha_0(x) = \lim_{x\to0+} \beta_0(x) = -1, \tag{6.21}$$
$$\lim_{x\to1-} \alpha_0(x) = \lim_{x\to1-} \beta_0(x) = 1. \tag{6.22}$$


(b) The maximizer of the maximization problem (1.12)–(1.13) corresponding to x0 and t0 is equal to x0ψ0, and x0ψ0 is supported on an infinite number of intervals.

Proof. We already noted that the maximizer ψ(·; x, 0) at time t = 0 is equal to the constraint φ precisely on [−1, α0(x)], that it vanishes precisely on [β0(x), 1], and that the functions α0(x) and β0(x) are continuous in x. Since the maximizer increases with x by Lemma 5.1, it follows that both α0 and β0 are increasing functions of x.

If α0(x) would be −1, then the constraint φ would not be active. In that case, an explicit formula for ψ would be
$$\psi(\lambda; x) = \frac{1}{\pi\sqrt{(\beta-\lambda)(\lambda+1)}} \Bigl[x + \frac{1}{\pi}\,\mathrm{P.V.}\!\int_{-1}^{\beta} \frac{V'(s)}{s-\lambda}\sqrt{(\beta-s)(s+1)}\,ds\Bigr],$$
where P.V. denotes a Cauchy principal value integral, see e.g. [14, 19]. Since V′(s) > 0, we then see that the maximizer ψ would have a square-root singularity at −1, which is clearly impossible since ψ ≤ φ. Thus α0(x) > −1. Similarly β0(x) < 1.

Now we follow the arguments of Deift and McLaughlin in [7, Chapter 4], where they consider decreasing initial data. If we modify their arguments to the case of increasing initial data, we find that α0(x) and β0(x) satisfy the equations
$$T(\alpha, \beta) = 0, \qquad X(\alpha, \beta) = x, \tag{6.23}$$
where the functions T and X are defined by
$$T(\alpha, \beta) = \frac{1}{\pi}\int_\alpha^\beta \frac{V'(\lambda)}{\sqrt{(\beta-\lambda)(\lambda-\alpha)}}\,d\lambda - \int_{-1}^{\alpha} \frac{\phi(\lambda)}{\sqrt{(\beta-\lambda)(\alpha-\lambda)}}\,d\lambda, \tag{6.24}$$
and
$$X(\alpha, \beta) = \frac{1}{\pi}\int_\alpha^\beta \frac{V'(\lambda)\bigl(\lambda - \frac{\alpha+\beta}{2}\bigr)}{\sqrt{(\beta-\lambda)(\lambda-\alpha)}}\,d\lambda - \int_{-1}^{\alpha} \frac{\phi(\lambda)\bigl(\lambda - \frac{\alpha+\beta}{2}\bigr)}{\sqrt{(\beta-\lambda)(\alpha-\lambda)}}\,d\lambda. \tag{6.25}$$
If we let β → α+, then (6.24) becomes
$$V'(\alpha) - \int_{-1}^{\alpha} \frac{\phi(\lambda)}{\alpha-\lambda}\,d\lambda = -\infty.$$
Thus α0(x) < β0(x), and we proved (6.20).

As the maximizer is equal to the constraint φ on the interval [−1, α0(x)], it is clear that α0(x) → −1 as x → 0+. Since the maximizer vanishes on [β0(x), 1], and ∫φ(λ)dλ = 1, we also find that β0(x) → +1 as x → 1−. Suppose that β0(x) → β > −1 as x → 0+. Then taking the limit x → 0+ in the equation T(α0(x), β0(x)) = 0, we find that
$$\frac{1}{\pi}\int_{-1}^{\beta} \frac{V'(\lambda)}{\sqrt{(\beta-\lambda)(\lambda+1)}}\,d\lambda = 0,$$
which is clearly impossible, since V′(λ) > 0. Thus β0(x) → −1 as x → 0+. Similar arguments, based on the dual problem, lead to the conclusion that α0(x) → 1 as x → 1−. Hence (6.21) and (6.22) are proved.



Finally, we note that 0 ≤ x0ψ0 ≤ φ by (6.17), and ∫x0ψ0 dλ = x0 by (6.2). By Lemma 6.1 (c) and (6.18) we have
$$L(x_0\psi_0)(\lambda) = x_0 L\psi_0(\lambda) \le x_0V_0(\lambda) = V(\lambda) - t_0\lambda$$
with equality on the support of x0ψ0. The support of x0ψ0 is equal to the set Σ of Lemma 6.1 and it consists of an infinite number of intervals. This proves part (b) and completes the proof of Lemma 6.2.

Summarizing, for each choice of t0 > −Tl, we have constructed an external field V(λ) = x0V0(λ) + t0λ out of V0(λ) so that at t = 0, the maximization problem (1.12)–(1.13) is solved by the left ansatz for all x ∈ (0, 1), with α0(x) and β0(x) depending continuously on x.

Next we would like to establish the C∞ smoothness of the functions α0(x) and β0(x) (so far, we only know that they are continuous). For this, we will require that the parameter t0 be taken sufficiently large. We first observe that if we write T(α, β) from Eq. (6.24) in the form
$$T(\alpha, \beta) = \frac{1}{\pi}\int_{-1}^{1} V'\Bigl(\frac{\alpha+\beta}{2} + \frac{\beta-\alpha}{2}u\Bigr)\frac{du}{\sqrt{1-u^2}} - \sqrt{\frac{\alpha+1}{2}}\int_{-1}^{1} \frac{\phi\bigl(\frac{\alpha-1}{2} + \frac{\alpha+1}{2}u\bigr)}{\sqrt{\bigl(\beta - \frac{\alpha-1}{2} - \frac{\alpha+1}{2}u\bigr)(1-u)}}\,du,$$
and use the fact that V and φ are C∞ functions on [−1, 1], we find that T is a C∞ function for −1 < α < β < 1. Similarly, X is C∞.

Theorem 6.3. Let x0 and Tl be as in Lemma 6.2. Then there is T̂ > −Tl so that t0 > T̂ implies that the functions α0 and β0 corresponding to t0 as in (6.19) are C∞ smooth.

Proof. We recall from the proof of Lemma 6.2 that for each x and t0 > −Tl, the pair (α0(x), β0(x)) solves the pair of equations (6.23). Using (6.24), together with V′(λ) = x0V0′(λ) + t0, we may rewrite the function T as follows,
$$T(\alpha, \beta) = t_0 + \int_\alpha^\beta \frac{x_0V_0'(\lambda)/\pi}{\sqrt{(\beta-\lambda)(\lambda-\alpha)}}\,d\lambda - \int_{-1}^{\alpha} \frac{\phi(\lambda)}{\sqrt{(\beta-\lambda)(\alpha-\lambda)}}\,d\lambda. \tag{6.26}$$
Observe that the first integral on the right-hand side of (6.26) is uniformly bounded for all α < β in [−1, 1]:
$$\min_{\lambda\in[-1,1]} x_0V_0'(\lambda) \le \int_\alpha^\beta \frac{x_0V_0'(\lambda)/\pi}{\sqrt{(\beta-\lambda)(\lambda-\alpha)}}\,d\lambda \le \max_{\lambda\in[-1,1]} x_0V_0'(\lambda). \tag{6.27}$$
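The bound (6.27) holds because dλ/(π√((β − λ)(λ − α))) is a probability measure on [α, β], so the integral is a weighted average of x0V0′. This can be made concrete with the Gauss–Chebyshev rule: after the substitution λ = (α + β)/2 + ((β − α)/2)·cos θ the weighted integral becomes a plain average over equally spaced angles. In the sketch below the integrand `sample_field_derivative` is a hypothetical stand-in for x0V0′, which is not available in closed form.

```python
import math

def arcsine_average(h, alpha, beta, n=64):
    """Gauss-Chebyshev approximation of
    (1/pi) * integral over [alpha, beta] of h(lam) / sqrt((beta-lam)(lam-alpha)).
    The rule is the arithmetic mean of h over n Chebyshev nodes."""
    mid = 0.5 * (alpha + beta)
    half = 0.5 * (beta - alpha)
    nodes = [mid + half * math.cos((2 * j + 1) * math.pi / (2 * n))
             for j in range(n)]
    return sum(h(lam) for lam in nodes) / n

def sample_field_derivative(lam):
    """Hypothetical smooth stand-in for x0 * V0'(lam); not from the paper."""
    return 1.0 + 0.5 * math.sin(3.0 * lam) + lam * lam

if __name__ == "__main__":
    alpha, beta = -0.4, 0.7
    print(arcsine_average(lambda lam: 1.0, alpha, beta))  # total mass of the weight
    avg = arcsine_average(sample_field_derivative, alpha, beta)
    grid = [-1.0 + 2.0 * j / 1000 for j in range(1001)]
    values = [sample_field_derivative(lam) for lam in grid]
    print(min(values) <= avg <= max(values))  # the two-sided bound of (6.27)
```

The same observation explains the term t0 in (6.26): the constant part of V′ is reproduced exactly by the average.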



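Similarly, its partial derivatives are uniformly bounded. Differentiating (6.26) with respect to β, we find
$$\begin{aligned} T_\beta &= \frac{\partial}{\partial\beta}\int_\alpha^\beta \frac{x_0V_0'(\lambda)/\pi}{\sqrt{(\beta-\lambda)(\lambda-\alpha)}}\,d\lambda + \int_{-1}^{\alpha} \frac{1/2}{\beta-\lambda}\,\frac{\phi(\lambda)}{\sqrt{(\beta-\lambda)(\alpha-\lambda)}}\,d\lambda \\ &\ge \frac{\partial}{\partial\beta}\int_\alpha^\beta \frac{x_0V_0'(\lambda)/\pi}{\sqrt{(\beta-\lambda)(\lambda-\alpha)}}\,d\lambda + \frac{1/2}{\beta+1}\int_{-1}^{\alpha} \frac{\phi(\lambda)}{\sqrt{(\beta-\lambda)(\alpha-\lambda)}}\,d\lambda \\ &= \frac{\partial}{\partial\beta}\int_\alpha^\beta \frac{x_0V_0'(\lambda)/\pi}{\sqrt{(\beta-\lambda)(\lambda-\alpha)}}\,d\lambda + \frac{1/2}{\beta+1}\Bigl[t_0 + \frac{1}{\pi}\int_\alpha^\beta \frac{x_0V_0'(\lambda)}{\sqrt{(\beta-\lambda)(\lambda-\alpha)}}\,d\lambda - T(\alpha, \beta)\Bigr]. \end{aligned} \tag{6.28}$$
Now assume α and β satisfy T(α, β) = 0. Then, since
$$\int_\alpha^\beta \frac{x_0V_0'(\lambda)/\pi}{\sqrt{(\beta-\lambda)(\lambda-\alpha)}}\,d\lambda$$
and its derivative with respect to β are uniformly bounded, we have Tβ > 0 for t0 sufficiently big.

Similarly, we now show that for t0 sufficiently large, it follows that if α and β solve the equation T(α, β) = 0, then Tα < 0. We insert the definition (6.16) of φ into the second integral in (6.26), to obtain
$$\int_{-1}^{\alpha} \frac{\phi(\lambda)}{\sqrt{(\beta-\lambda)(\alpha-\lambda)}}\,d\lambda = \frac{2}{\pi}\int_{-1}^{\alpha} \frac{\sqrt{1-\lambda^2}}{\sqrt{(\beta-\lambda)(\alpha-\lambda)}}\,d\lambda.$$
By contour integration this integral may be re-expressed as an integral over the interval [β, 1], which yields the following formula:
$$\int_{-1}^{\alpha} \frac{\phi(\lambda)}{\sqrt{(\beta-\lambda)(\alpha-\lambda)}}\,d\lambda = \alpha + \beta + \frac{2}{\pi}\int_\beta^1 \frac{\sqrt{1-\lambda^2}}{\sqrt{(\lambda-\beta)(\lambda-\alpha)}}\,d\lambda.$$
Now arguments quite similar to those used to prove that Tβ > 0 can be used to prove that Tα < 0, if T(α, β) = 0 and t0 is sufficiently large. We thus have shown that if t0 is sufficiently big, and if α and β solve T(α, β) = 0, then Tα < 0 and Tβ > 0.

For the partial derivatives of X, it follows as in [7, Chapter 4] that
$$X_\alpha = -\frac{\beta-\alpha}{2}\,T_\alpha, \qquad X_\beta = \frac{\beta-\alpha}{2}\,T_\beta,$$
provided that α and β satisfy T(α, β) = 0. Therefore we learn that
$$\det\begin{pmatrix} T_\alpha & T_\beta \\ X_\alpha & X_\beta \end{pmatrix} = (\beta-\alpha)\,T_\alpha T_\beta \ne 0$$
for −1 < α < β < 1 solving T(α, β) = 0. Hence the Jacobian of the mapping (α, β) ↦ (T, X) is non-zero for t0 sufficiently large and T(α, β) = 0. Thus, recalling that α0(x) and β0(x) are continuous functions solving T(α, β) = 0 and X(α, β) = x, we deduce from the implicit function theorem that α0(x) and β0(x) are C∞ functions on (0, 1). This proves the theorem.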
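The contour-integration identity used in the proof of Theorem 6.3, which expresses the integral of φ over [−1, α] as α + β plus an integral over [β, 1], can be tested numerically. The sketch below assumes the reading of that identity with φ(λ) = (2/π)√(1 − λ²) from (6.16); the substitutions λ = α − v² and λ = β + v² remove the inverse square-root singularities at α and β, and the sample values of α and β are arbitrary.

```python
import math

def midpoint(g, upper, n=20_000):
    """Midpoint rule for the integral of g over [0, upper]."""
    h = upper / n
    return h * sum(g((j + 0.5) * h) for j in range(n))

def semicircle(lam):
    """sqrt(1 - lam**2), clipped at zero to guard against rounding."""
    return math.sqrt(max(0.0, 1.0 - lam * lam))

def left_integral(alpha, beta):
    """(2/pi) * integral over [-1, alpha] of
    sqrt(1 - lam^2) / sqrt((beta - lam)(alpha - lam)), via lam = alpha - v^2."""
    g = lambda v: 2.0 * semicircle(alpha - v * v) / math.sqrt(beta - alpha + v * v)
    return (2.0 / math.pi) * midpoint(g, math.sqrt(alpha + 1.0))

def right_integral(alpha, beta):
    """(2/pi) * integral over [beta, 1] of
    sqrt(1 - lam^2) / sqrt((lam - beta)(lam - alpha)), via lam = beta + v^2."""
    g = lambda v: 2.0 * semicircle(beta + v * v) / math.sqrt(beta - alpha + v * v)
    return (2.0 / math.pi) * midpoint(g, math.sqrt(1.0 - beta))

if __name__ == "__main__":
    for alpha, beta in ((-0.3, 0.5), (-0.9, -0.2), (0.1, 0.6)):
        lhs = left_integral(alpha, beta)
        rhs = alpha + beta + right_integral(alpha, beta)
        print(alpha, beta, lhs, rhs)
```

In the symmetric case α = −β the two integrals coincide, in agreement with α + β = 0.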



Remark 6.4. Combining Lemma 6.2 and Theorem 6.3, we have constructed an example where an infinite gap solution arises out of C∞ initial data at a certain position x0 and time t0 > 0. We used the global description provided by the maximization problem (1.12)–(1.13), but we were not able to analyse the support of the maximizer in general for every x and t. Hence we do not know whether the infinite gap solution occurs at other values of x and t, or not. What we can say is that the conditions of Corollary 3.4 are satisfied. Thus for large enough time (larger than t0) the right ansatz holds uniformly for x ∈ (0, 1). Therefore, for large enough time, all gaps in the support of the maximizer have disappeared, and we again have C∞ functions α(x, t) and β(x, t). We are also able to analyse the maximizer at the fixed time t0, with varying x ∈ (0, 1). It turns out that for x ≠ x0, we have a finite gap solution provided that the maximizer does not meet the constraint φ. This will be discussed in the next subsection.

6.3. Deformation in the spatial variable x. We further study the external field V0 constructed in the proof of Lemma 6.1, for which the equilibrium measure is supported on infinitely many intervals. We will consider the equilibrium problem with normalization x > 0, and prove that the maximizer is supported on finitely many intervals for every x different from 1. As discussed in the Introduction, the normalization x corresponds to the spatial variable in the continuum limit of the Toda lattice.

In this subsection, we consider V0 as an external field defined on [−1, 1]. For each x > 0, we use ψ(·; x) to denote the maximizer with external field V0 and normalization x, i.e., ∫ψ(λ; x)dλ = x, and no upper constraint. We write Σx for the support of ψ(·; x). Recall that ψ(·; x) is increasing with x (cf. Lemma 5.1), and that ψ(·; 1) is equal to the function ψ0 from Lemma 6.1. Thus
$$\Sigma_1 = \{0\} \cup \bigcup_{k=0}^{\infty} [a_k, b_k], \tag{6.29}$$
where ak and bk are given by (6.1).

Theorem 6.5. For every x > 0, x ≠ 1, the set Σx consists of a finite number of intervals.

Proof. We consider first the case x < 1. Then Σx ⊂ Σ1. First, we want to show that for all k sufficiently large, the interval [ak, bk] is disjoint from Σx. We use Lemma 5.7 of [29], from which it follows that
$$\psi_0(\lambda) = \psi(\lambda; 1) \ge (1-x)\,\frac{d\omega_{\Sigma_1}}{d\lambda}(\lambda), \qquad \text{for } \lambda \in \Sigma_x, \tag{6.30}$$
where ω_{Σ1} denotes the equilibrium measure without external field of the set Σ1, and normalization 1. Enlarging the set Σ1 to the interval [0, 1], we decrease the equilibrium measure on Σ1, and a fortiori on Σx. This property of equilibrium measures follows for example from Theorem IV.4.5 of [27]. So we have
$$\frac{d\omega_{\Sigma_1}}{d\lambda}(\lambda) \ge \frac{d\omega_{[0,1]}}{d\lambda}(\lambda) = \frac{1}{\pi\sqrt{\lambda(1-\lambda)}}, \qquad \text{for } \lambda \in \Sigma_x. \tag{6.31}$$


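For λ ∈ [ak, bk], we have
$$\frac{1}{\pi\sqrt{\lambda(1-\lambda)}} \ge \frac{1}{\pi\sqrt{b_k(1-b_k)}} \ge \frac{1}{\pi\sqrt{b_k}} = \frac{2^{(k+1)/2}}{\pi}. \tag{6.32}$$
Now combining inequalities (6.30), (6.31), and (6.32), we find that
$$\psi_0(\lambda) \ge (1-x)\,\frac{2^{(k+1)/2}}{\pi}, \qquad \text{for } \lambda \in [a_k, b_k] \cap \Sigma_x. \tag{6.33}$$
On the other hand, from the construction of ψ0 in (6.3), (6.7), and (6.14), it is clear that
$$\psi_0(\lambda) \le \frac{24}{\pi\sqrt e\,k!}, \qquad \text{for } \lambda \in [a_k, b_k]. \tag{6.34}$$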
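The bounds (6.33) and (6.34) are incompatible as soon as the factorial in (6.34) outweighs the geometric growth in (6.33), and the index at which this happens is easy to compute for a given x < 1. The sketch below is an illustration only; the function names and the search cap `k_max` are choices made here. Since the upper bound is non-increasing in k and the lower bound is increasing, the first index at which the comparison succeeds works for all larger k as well.

```python
import math

def lower_bound(k, x):
    """Right-hand side of (6.33): (1 - x) * 2**((k + 1) / 2) / pi."""
    return (1.0 - x) * 2.0 ** ((k + 1) / 2.0) / math.pi

def upper_bound(k):
    """Right-hand side of (6.34): 24 / (pi * sqrt(e) * k!)."""
    return 24.0 / (math.pi * math.sqrt(math.e) * math.factorial(k))

def first_excluded_band(x, k_max=100):
    """Smallest k with upper_bound(k) < lower_bound(k, x), for 0 < x < 1.

    For this k and every larger one, (6.33) and (6.34) cannot both hold on
    [a_k, b_k], so these bands do not meet the support for normalization x.
    """
    for k in range(k_max):
        if upper_bound(k) < lower_bound(k, x):
            return k
    raise ValueError("no index found below k_max; x may be too close to 1")

if __name__ == "__main__":
    for x in (0.1, 0.5, 0.9, 0.99):
        print(x, first_excluded_band(x))
```

Because of the factorial, the threshold grows very slowly as x approaches 1.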

From (6.33) and (6.34), we learn that if
$$\frac{24}{\pi\sqrt e\,k!} < (1-x)\,\frac{2^{(k+1)/2}}{\pi}$$
then [ak, bk] ∩ Σx is empty. This is clearly satisfied for k large enough. Thus we have shown that
$$\Sigma_x \subset \bigcup_{k=1}^{k_x} [a_k, b_k],$$
for some finite kx. Next, it also follows from (6.30) that the points ak and bk do not belong to Σx. Indeed, we know that ψ0 vanishes at these points, and the density of ω_{Σ1} is infinite at these points. So we see that ak ∈ Σx or bk ∈ Σx would contradict (6.30). Thus [ak, bk] ∩ Σx is contained in [ak + δ, bk − δ] for some δ > 0. Since V0 is real analytic on (ak, bk), Theorem 1.38 of [6] gives that [ak, bk] ∩ Σx consists of a finite number of intervals (cf. [18]). So it follows that Σx consists of a finite number of intervals for all x ∈ (0, 1).

Now we turn to the case that x is bigger than 1. Fix x > 1, so that Σ1 ⊂ Σx. Our first goal is to show that for k large enough, the gaps (bk+1, ak) of Σ1 are fully contained in Σx. To this end, we introduce external fields, one for each k ∈ N,
$$Q_k = \begin{cases} V_0 - L\psi_0 & \text{on } [b_{k+1}, a_k], \\ 0 & \text{on } [-1, b_{k+1}] \cup [a_k, 1]. \end{cases} \tag{6.35}$$
This is a C^{1+1/2} external field on [−1, 1]. Let
$$\eta_k(\lambda) = \frac{1}{\pi\sqrt{1-\lambda^2}} \Bigl[x - 1 + \frac{1}{\pi}\,\mathrm{P.V.}\!\int_{b_{k+1}}^{a_k} \frac{Q_k'(s)}{s-\lambda}\sqrt{1-s^2}\,ds\Bigr], \tag{6.36}$$
where P.V. denotes the Cauchy principal value. Then by standard results on singular integral equations, see e.g. [14, §42.3], we have
$$L\eta_k = Q_k \qquad \text{on } [-1, 1] \tag{6.37}$$



and
$$\int_{-1}^{1} \eta_k(\lambda)\,d\lambda = x - 1. \tag{6.38}$$
We are going to show that ηk is non-negative on [−1, 1] if k is sufficiently large. From (6.10)–(6.15) and (6.35) we note that
$$Q_k(\lambda) = \frac{12}{\sqrt e} \sum_{j=k}^{k+1} \frac{1}{j!}\bigl[W_j(\lambda) - Lf_j(\lambda)\bigr], \qquad \text{for } \lambda \in [b_{k+1}, a_k]. \tag{6.39}$$
From this and (6.7)–(6.8) we compute that for λ ∈ [bk+1, ak],
$$Q_k'(\lambda) = \frac{12}{\sqrt e} \sum_{j=k}^{k+1} \frac{1}{j!}\Bigl[W'\Bigl(\frac{\lambda - c_j}{r_j}\Bigr) - (Lf)'\Bigl(\frac{\lambda - c_j}{r_j}\Bigr)\Bigr]. \tag{6.40}$$
Inserting (6.40) into the principal value integral in the right-hand side of (6.36) and making a suitable transformation for each term separately, we arrive at the following principal value integrals:
$$\frac{12}{\pi\sqrt e\,k!}\,\mathrm{P.V.}\!\int_{-2}^{-1} \frac{W'(t) - (Lf)'(t)}{t - \zeta}\sqrt{1 - (c_k + r_k t)^2}\,dt, \tag{6.41}$$
$$\frac{12}{\pi\sqrt e\,(k+1)!}\,\mathrm{P.V.}\!\int_{1}^{3} \frac{W'(t) - (Lf)'(t)}{t - \zeta}\sqrt{1 - (c_{k+1} + r_{k+1} t)^2}\,dt, \tag{6.42}$$
where in the first integral λ = ck + rk ζ, and in the second integral λ = ck+1 + rk+1 ζ. The ranges of integration are the images of the gap [bk+1, ak] under the two changes of variable. The functions (6.41) and (6.42) are Hilbert transforms of Hölder continuous functions, and therefore they are also Hölder continuous, and they decay to 0 for |ζ| → ∞, uniformly with respect to k. Using the continuity property of the Hilbert transform on Hölder continuous functions, we easily see that both (6.41) and (6.42) are O(1/k!), as k → ∞, uniformly in ζ. Then it is clear from the definition (6.36) of ηk, that there exists kx ∈ N such that
$$\eta_k > 0, \qquad \text{for all } k \ge k_x. \tag{6.43}$$
Combining (6.35), (6.37), (6.38), (6.43) with Lemma 6.1, we see that for k ≥ kx,
$$\psi_0 + \eta_k > 0 \qquad \text{on } [-1, 1], \tag{6.44}$$
$$\int_{-1}^{1} \bigl(\psi_0(\lambda) + \eta_k(\lambda)\bigr)\,d\lambda = x, \tag{6.45}$$
and
$$L(\psi_0 + \eta_k) \begin{cases} = V_0 & \text{on } [b_{k+1}, a_k], \\ \le V_0 & \text{on } [-1, 1]. \end{cases} \tag{6.46}$$
We also note that ψ0 + ηk ≠ ψ(·; x), since strict inequality occurs in (6.46) in each of the gaps (bj+1, aj) with j ≠ k, and supp(ψ0 + ηk) = [−1, 1]. From (6.44)–(6.46) it then follows by Lemma 2.2 of [4] that
$$[b_{k+1}, a_k] \subset \operatorname{supp}(\psi(\cdot; x)) = \Sigma_x \qquad \text{for } k \ge k_x.$$


Since Σ1 ⊂ Σx, we conclude that [0, b_{kx}] ⊂ Σx. Thus for each x > 1, a full interval around 0 is contained in the support Σx. To conclude that Σx consists of a finite number of intervals we are now left with the intervals [b_{kx}, 1] and [−1, 0]. The bands [ak, bk] remain in the support Σx. It is thus enough to show that for each k < kx the set Σx ∩ [bk+1, ak] consists of a finite number of intervals, and similarly for Σx ∩ [−1, 0]. To this end, we note that ψ(·; x) − ψ0 is a non-negative function with
$$L(\psi(\cdot; x) - \psi_0) = V_0 - L\psi_0 + \ell_x \qquad \text{on } \Sigma_x,$$
and inequality ≤ on [−1, 1]. Thus ψ(·; x) − ψ0 is the maximizer for the external field V0 − Lψ0 and normalization x − 1. This external field is zero on each interval [ak, bk], and convex in a neighborhood of each ak and bk. It then follows that some interval [ak − δ, bk + δ] is also contained in Σx. In a neighborhood of the remaining gaps [bk+1 + δ, ak − δ], the external field V0 is real analytic, and so by Theorem 1.38 of [6], Σx ∩ [bk+1 + δ, ak − δ] consists of a finite union of intervals. Similarly, V0 − Lψ0 is convex in a left neighborhood of 0, and a left neighborhood [−δ, 0] is also contained in Σx. The external field V0 is real analytic on [−1, −δ] and again by [6, Theorem 1.38] it follows that Σx ∩ [−1, −δ] consists of a finite union of intervals. Thus we have shown that for x > 1 the support Σx is a finite union of disjoint closed intervals.

Remark 6.6. Theorem 6.5 has the following consequence for the continuum limit of the Toda lattice with the initial data α0 and β0 considered in Subsect. 6.2. We showed in Lemma 6.2 (b) that x0ψ0 is the maximizer at time t0, and position x0. In the same way it follows that xψ(·; x) is the maximizer at time t0 and position x provided that it satisfies the constraint
$$x\psi(\cdot; x) \le \phi. \tag{6.47}$$
Using the fact that x0ψ0 < φ on (−1, 1), see (6.17), we may prove as in [18, Lemma 4.10] that there exists δ > 0 such that (6.47) holds with strict inequality on (−1, 1) for every x < x0 + δ. This implies that in the example of Subsect. 6.2 the infinite gap solution holds at time t0 at x0, but not at other x values less than x0 + δ. Going from x0 to x < x0, we have that an infinite number of bands disappear, while going from x0 to x ∈ (x0, x0 + δ), we have that an infinite number of gaps close. For x > x0 + δ, the upper constraint becomes active, and we are not able to analyse this more complicated situation.

Acknowledgements. Arno Kuijlaars was supported in part by FWO research project G.0278.97, and a research grant of the Fund for Scientific Research – Flanders. He is grateful to K. T.-R. McLaughlin for the support and hospitality during a visit to the University of Arizona. Kenneth T.-R. McLaughlin was supported in part by NSF postdoctoral fellowship grant # DMS-9508946 and NSF grant # DMS-9970328. He thanks the faculty and staff of the Princeton University Mathematics Department and MSRI for their support and hospitality, and thanks A. Kuijlaars and W. Van Assche for their hospitality and support during visits to K. U. Leuven.



References

1. Bardos, C., Ghidaglia, J.-M. and Kamvissis, S.: Weak convergence and deterministic approach to turbulent diffusion. In: Nonlinear wave equations (Yan Guo ed.), Contemp. Math. 263, Providence, RI: AMS, 2000, pp. 1–15
2. Bloch, A., Golse, F., Paul, T. and Uribe, A.: Dispersionless Toda and Toeplitz operators. Preprint
3. Brockett, R.W. and Bloch, A.: Sorting with the dispersionless limit of the Toda lattice. In: Hamiltonian systems, transformation groups and spectral transform methods (Montreal, 1989), Montreal: Univ. Montréal, 1990, pp. 103–112
4. Damelin, S.B. and Kuijlaars, A.B.J.: The support of the equilibrium measure in the presence of a monomial external field on [−1, 1]. Trans. Am. Math. Soc. 351, 4561–4584 (1999)
5. Deift, P.: Orthogonal polynomials and random matrices: a Riemann–Hilbert approach. Courant Lecture Notes in Mathematics 3, New York: Courant Institute, 1999
6. Deift, P., Kriecherbauer, T. and McLaughlin, K.T.-R.: New results on the equilibrium measure for logarithmic potentials in the presence of an external field. J. Approx. Theory 95, 388–475 (1998)
7. Deift, P. and McLaughlin, K.T.-R.: A continuum limit of the Toda lattice. Mem. Am. Math. Soc. 131, no. 624 (1998)
8. Deift, P., Venakides, S. and Zhou, X.: New results in small dispersion KdV by an extension of the steepest descent method for Riemann–Hilbert problems. Internat. Math. Research Notices 6, 286–299 (1997)
9. Dragnev, P.D. and Saff, E.B.: Constrained energy problems with applications to orthogonal polynomials of a discrete variable. J. Anal. Math. 72, 223–259 (1997)
10. Dragnev, P.D. and Saff, E.B.: A problem in potential theory and zero asymptotics of Krawtchouk polynomials. J. Approx. Theory 102, 120–140 (2000)
11. Ercolani, N., Levermore, C.D. and Zhang, T.: The behavior of the Weyl function in the zero-dispersion KdV limit. Commun. Math. Phys. 183, 119–143 (1997)
12. Flaschka, H.: On the Toda lattice II. Prog. Theor. Phys. 51, 703–716 (1974)
13. Flaschka, H., Forest, M.G. and McLaughlin, D.W.: Multiphase averaging and the inverse spectral solutions of the Korteweg–de Vries equation. Comm. Pure Appl. Math. 33, 739–784 (1980)
14. Gakhov, F.: Boundary value problems. Oxford: Pergamon Press, 1966
15. Holian, B.L., Flaschka, H. and McLaughlin, D.W.: Shock waves in the Toda lattice: analysis. Phys. Rev. A 24, 2595–2623 (1981)
16. Jin, S., Levermore, C.D. and McLaughlin, D.W.: The semiclassical limit of the defocusing NLS hierarchy. Comm. Pure Appl. Math. 52, 613–654 (1999)
17. Kamvissis, S.: On the Toda shock problem. Phys. D 65, 242–266 (1993)
18. Kuijlaars, A.B.J.: On the finite gap ansatz in the continuum limit of the Toda lattice. Duke Math. J. 104, 433–462 (2000)
19. Kuijlaars, A.B.J. and Dragnev, P.D.: Equilibrium problems associated with fast decreasing polynomials. Proc. Am. Math. Soc. 127, 1065–1074 (1999)
20. Kuijlaars, A.B.J. and McLaughlin, K.T.-R.: Generic behavior of the density of states in random matrix theory and equilibrium problems in the presence of real analytic external fields. Comm. Pure Appl. Math. 53, 736–785 (2000)
21. Kuijlaars, A.B.J. and Van Assche, W.: The asymptotic zero distribution of orthogonal polynomials with varying recurrence coefficients. J. Approx. Theory 99, 167–197 (1999)
22. Lax, P.D. and Levermore, C.D.: The small dispersion limit of the Korteweg–de Vries equation I, II, III. Comm. Pure Appl. Math. 36, 253–290, 571–593, 809–829 (1983)
23. Manakov, S.V.: Complete integrability and stochastization of discrete dynamical systems. Zh. Exp. Teor. Fiz. 67, 543–555 (1974)
24. Moser, J.: Finitely many mass points on the line under the influence of an exponential potential – an integrable system. In: Dynamical Systems, Theory and Applications (J. Moser ed.), Lect. Notes in Phys. 38, Berlin: Springer, 1975, pp. 467–497
25. Rakhmanov, E.A.: Equilibrium measure and the distribution of zeros of the extremal polynomials of a discrete variable. Mat. Sb. 187, 109–124 (1996); English transl. in Sb. Math. 187, 1213–1228 (1996)
26. Ransford, T.: Potential theory in the complex plane. Cambridge: Cambridge University Press, 1995
27. Saff, E.B. and Totik, V.: Logarithmic Potentials with External Fields. New York: Springer-Verlag, 1997
28. Shipman, S.P.: Modulated waves in a semiclassical continuum limit of an integrable NLS chain. Comm. Pure Appl. Math. 53, 243–279 (2000)
29. Totik, V.: Weighted Approximation with Varying Weight. Lecture Notes in Math. 1569, Berlin: Springer-Verlag, 1994
30. Venakides, S.: Higher order Lax–Levermore theory. Comm. Pure Appl. Math. 43, 335–362 (1990)
31. Venakides, S., Deift, P. and Oba, R.: The Toda shock problem. Comm. Pure Appl. Math. 44, 1171–1242 (1991)
32. Whitham, G.B.: Linear and Nonlinear Waves. New York: Wiley, 1974

Communicated by M. Aizenman

Commun. Math. Phys. 221, 335 – 349 (2001)




A Generic C^1 Expanding Map has a Singular S–R–B Measure

James T. Campbell, Anthony N. Quas

Department of Mathematical Sciences, University of Memphis, Memphis, TN 38152-3240, USA. E-mail: [email protected]; [email protected]

Received: 8 December 2000 / Accepted: 27 March 2001

Abstract: We show that for a generic C^1 expanding map T of the unit circle, there is a unique equilibrium state for − log T′ that is an S–R–B measure for T, and whose statistical basin of attraction has Lebesgue measure 1. We also present some results related to the question of whether a generic C^1 expanding map preserves a σ-finite measure, absolutely continuous with respect to Lebesgue measure.

1. Introduction

Let E^k denote the set of C^k expanding maps of the unit circle S^1 onto itself, k = 1, 2, . . . . Expanding maps have been widely studied in ergodic theory. In particular, various cases with k ≥ 2 have been studied by a large number of authors including Rényi ([17], 1965), Krzyżewski ([9], 1971), Krzyżewski and Szlenk ([11], 1969). A typical result says that an expanding map with C^2 regularity has a unique absolutely continuous invariant measure with strong ergodic properties. These results have been extended to the case of C^{1+α} expanding maps of the circle (maps with a Hölder continuous derivative) and even to maps satisfying weaker regularity conditions. More recently Góra ([5], 1994) proved results of this type under the Dini condition.

A later result of Krzyżewski ([10], 1979) gave the first indication that the situation for C^1 expanding maps differs from that of the smoother maps. Namely, he showed that within the set of expanding C^1 self-maps of any manifold, the set of such maps for which there is an absolutely continuous invariant probability measure, with continuous density bounded away from 0, is meager. (That is, its complement is generic, i.e., contains a dense Gδ set with respect to the C^1 topology.) This theme was taken up by Góra and Schmitt ([4], 1989) who showed that there is an example of an expanding C^1 map of the circle that has no absolutely continuous invariant probability measure.

In further studies of C^1 expanding maps of the circle by Quas ([15, 13, 14], all 1996), maps with respectively more than one absolutely continuous invariant measure and a non-weak-mixing invariant measure were constructed; and it was shown that a dense set of C^1 expanding maps have a unique absolutely continuous invariant probability with unbounded density. In [2] (1998), Bruin and Hawkins constructed an example of an expanding C^1 map of the circle with no σ-finite absolutely continuous invariant measure (finite or infinite). In a more recent paper of Quas ([16], 1999) it was shown that a generic C^1 expanding map of the circle has no absolutely continuous invariant probability measure. Our main result shows that despite this result, there is (generically) a singular invariant probability from which properties of Lebesgue almost every orbit can be obtained.

J. Campbell is partially supported by NSF Grant #DMS–9801602

Theorem 1. For a generic T ∈ E^1, there is a unique equilibrium measure µT for the potential − log T′. This T-invariant probability measure has the following properties:
1. For a set of points S of Lebesgue measure 1, for all f ∈ C^0(S^1), the averages (1/n) Σ_{k=0}^{n−1} f(T^k x) converge to ∫ f dµT for all x ∈ S.
2. The measure µT is singular with respect to Lebesgue measure.
3. For each non-empty open set U, µT(U) > 0.

In other words, a generic T ∈ E^1 possesses a fully supported singular Sinai–Ruelle–Bowen measure whose statistical basin of attraction has Lebesgue measure 1.

A natural question is whether the result from [16] may be extended from probability measures to σ-finite measures; i.e., is it true that generically in E^1, there is no absolutely continuous invariant measure? At the moment, we do not know the answer, but we include the following trio of results that give some information about this situation.

Silva [19] introduced a notion of recurrence for a measure with respect to a non-singular transformation. To define this in our setting, let h be the density of λ ◦ T^{−1} with respect to Lebesgue measure (h = dλ ◦ T^{−1}/dλ), and set ωn(x) = 1/∏_{j=1}^{n} h ◦ T^j(x). Then ωn > 0 on S^1 and ∫ ωn dλ = 1, n = 1, 2, . . . . Lebesgue measure is recurrent for T if the quantity Σ_{n=1}^{∞} ωn(x) is infinite for λ-a.e. x ∈ S^1. (We caution the reader that this notion of recurrence is much stronger than Poincaré recurrence. For example there exist C^2 expanding maps of S^1 for which Lebesgue measure is not recurrent in this sense.) This recurrence property is relevant to the question of the existence of invariant measures as follows. If one can establish that a measure is recurrent for a non-invertible map, then existence or non-existence of absolutely continuous, σ-finite invariant measures for the map can be decided using a version of Krieger’s ratio set (see Hawkins and Silva [6] for a proof of this result).

Theorem 2. For T in a generic subset of E^1, Lebesgue measure is not recurrent.

Recall that a measure µ is locally infinite if µ(I) = ∞ for each open interval I.

Theorem 3. For a generic T ∈ E^1, any absolutely continuous invariant measure is locally infinite.

To describe the next result in this direction, let hn(x) be the density of λ ◦ T^{−n} with respect to λ: hn(x) = dλ ◦ T^{−n}/dλ(x). Set

$$S_{\varepsilon,n,a} = \{T \in E^1 : \lambda\{x : h_n(x) \in [a, 2a]\} < \varepsilon\},$$

and consider the collection

$$S = \bigcap_{\varepsilon>0}\ \bigcup_{n\in\mathbb{N}}\ \bigcap_{a>0} S_{\varepsilon,n,a}.$$

If T ∈ S, we say the densities of λ ◦ T^{−n} have no characteristic scale. This is because for such a T and for any ε > 0, there exists an n such that for each a > 0, the set {x : hn(x) ∈ [a, 2a]} has Lebesgue measure less than ε. It is known that there exist mappings with an infinite invariant measure so that the above densities hn, when appropriately rescaled, converge in measure to the invariant density (see Aaronson’s book [1] for examples). One can see that when T belongs to the class S defined above, this is impossible. Therefore, when T belongs to S, a natural way of producing an absolutely continuous invariant measure is lost.

Theorem 4. The set S constructed above is a dense Gδ subset of E^1.

In the next section we give some notation and definitions, in Sect. 3 we state and prove some preliminary lemmas, in Sect. 4 we prove Theorems 1, 2, 3, and in Sect. 5 we prove Theorem 4.

2. Notation & Definitions

We work on S^1 = [0, 1]/∼, where ∼ identifies 0 with 1. The Borel sigma-algebra is denoted by B. The space of Borel measures on S^1 is denoted by M, with M^1 denoting the subspace of probabilities. If T ∈ E^1, M_T^1 denotes the set of Borel probability measures that are invariant under T. For ν ∈ M_T^1, the measure-theoretic entropy of T with respect to ν is denoted by hν(T), or hν if T is understood. For a continuous function f : S^1 → R, the pressure of f (with respect to T) is given by

$$P_T(f) = \sup_{\nu \in M_T^1} \Big( h_\nu(T) + \int f \, d\nu \Big).$$

An equilibrium state for f is an element µ ∈ M_T^1 satisfying PT(f) = hµ + ∫ f dµ. Recall that a Borel measure µ is called a Sinai–Ruelle–Bowen measure for T ∈ E^1 if there exists a subset B of S^1 of positive Lebesgue measure such that for each f ∈ C^0(S^1) and all x ∈ B,

$$\frac{1}{n} \sum_{k=0}^{n-1} f(T^k(x)) \to \int f \, d\mu.$$

The set B is called the statistical basin of attraction of µ.

For each T ∈ E^1, T′ is a continuous function whose absolute value is strictly larger than 1. Since S^1 is connected, E^1 decomposes into two disjoint open subsets, the first consisting of those T’s for which T′ > 1, the other, those T’s for which T′ < −1. Each of these sets has countably many open components, corresponding to the maps of degree k (k = 2, 3, . . . , and k = −2, −3, . . . , respectively). In some of our arguments, we want to prove, say, that a subset of E^1 with a certain property is generic. We proceed by supposing that T′ > 1 and the degree is a fixed but arbitrary integer k > 1, and proving that within the corresponding component, the set is generic. Since a practically identical argument (with only the obvious minor modifications) will hold for T′ < −1 and k ≤ −2, and the components partition E^1, the general result will follow.

With these conventions in place we set !T = ! = − log(T′) < 0. We define the Perron–Frobenius operator, or transfer operator, LT by

$$L_T f(x) = \sum_{Ty = x} \frac{f(y)}{|T'(y)|}.$$
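As a purely illustrative numerical sketch of this definition (not part of the original argument), the following Python code evaluates the transfer operator for one hypothetical degree-2 expanding map, T(y) = 2y + (EPS/2π) sin(2πy) mod 1. The map, the parameter EPS, the bisection solver and all function names are assumptions made for the example only.

```python
import math

EPS = 0.5  # illustrative perturbation size; any |EPS| < 1 keeps T' > 1


def lift(y):
    """Lift F of the hypothetical degree-2 example map T(y) = F(y) mod 1."""
    return 2.0 * y + EPS / (2.0 * math.pi) * math.sin(2.0 * math.pi * y)


def dT(y):
    """Derivative T'(y) = 2 + EPS*cos(2*pi*y), bounded below by 2 - EPS > 1."""
    return 2.0 + EPS * math.cos(2.0 * math.pi * y)


def preimages(x):
    """The two solutions y in [0, 1] of T(y) = x, by bisection on the lift."""
    sols = []
    for branch in (0, 1):
        target = x + branch  # F maps [0, 1] monotonically onto [0, 2]
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if lift(mid) < target:
                lo = mid
            else:
                hi = mid
        sols.append(0.5 * (lo + hi))
    return sols


def transfer(f, x):
    """(L_T f)(x): sum of f(y) / |T'(y)| over the preimages y of x."""
    return sum(f(y) / abs(dT(y)) for y in preimages(x))


def integrate(g, n=400):
    """Midpoint rule on [0, 1]."""
    return sum(g((i + 0.5) / n) for i in range(n)) / n


def one(y):
    return 1.0


total_mass = integrate(lambda x: transfer(one, x))
value_at_zero = transfer(one, 0.0)
```

Here total_mass approximates ∫ LT 1 dλ, which should be close to 1 because the transfer operator preserves Lebesgue integrals, and value_at_zero can be compared with 1/(2 + EPS) + 1/(2 − EPS), the sum over the two preimages 0 and 1/2 of the point 0.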

For now we do not specify the space containing f or LT f. These will depend upon the context in which they are being used, and will be designated as needed in the development. We repeatedly use the fact (proved in [9]) that for each T ∈ E^2, there exists a unique, absolutely continuous µ ∈ M_T^1, whose density is strictly positive and continuous.

3. Preliminary Lemmas

We state and prove some lemmas that lead to the main results. Following [7], for each natural number k ≥ 2, let Ek : S^1 → S^1 denote the linear expanding map Ek(x) = kx mod 1. For T ∈ E^1 of degree k, it is well-known that Ek is conjugate to T; that is, there exists a homeomorphism γ of S^1 such that T ◦ γ = γ ◦ Ek. In fact, in general there is more than one such homeomorphism (although only finitely many). For a degree k map T ∈ E^1, we shall write Conj(T) for the set of conjugacies between Ek and T.

For our purposes, it will be necessary to study and control the dependence of the conjugacy on the map T. To do this, we shall exploit the construction in [7] of such a conjugacy. Specifically, in their construction, they start with a point p that is fixed by T and use the Markov partition of the circle given by the intervals whose endpoints are the points of T^{−1}{p}. For our modification, we need to control the choice of p. For z ∈ S^1, set Uz = {T ∈ E^1 : T(z) ≠ z}. Note that Uz is a dense open subset of E^1.

Lemma 1. For each z ∈ S^1, there is a continuous map 'z : Uz → Homeo(S^1) such that 'z(T) ∈ Conj(T) for each T. In particular, given T ∈ E^1 of degree k, there is a neighborhood U of T on which there is a continuous choice of conjugacies to the map Ek.

Proof. The proof is essentially that given in the proof of Theorem 2.4.6 in [7]. For a map T ∈ Uz, we choose the fixed point p of T that is the first fixed point on the circle “to the right” of z. That is, considering the circle to be the set [0, 1), p is chosen to be the first fixed point to the right of z or, if there is none, the first fixed point to the right of 0. This choice of fixed point determines a conjugacy 'z(T). The fixed point may be seen to depend continuously on the map, and so do its preimages. This allows one to show the required continuity of 'z. To show that in a neighborhood of any given map T ∈ E^1, there is a continuous family of conjugacies, we argue as follows: let z be any point not fixed by T; then Uz is the required neighborhood and 'z(S) is the continuous choice of conjugacy for S ∈ Uz.
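The following Python sketch illustrates, under stated assumptions, how a conjugacy can be built from a fixed point and the associated Markov partition. It is not the construction of [7] itself: the example map T(y) = 2y + (EPS/2π) sin(2πy) mod 1, the choice of fixed point p = 0, the truncation to 60 binary digits and all names are hypothetical choices made for illustration. The point with binary expansion 0.d1 d2 d3 . . . is sent to the limit of the inverse branches of T, indexed by d1, d2, . . ., applied to p.

```python
import math

EPS = 0.5  # illustrative; |EPS| < 1 keeps the example map expanding


def lift(y):
    """Lift of the hypothetical degree-2 example map."""
    return 2.0 * y + EPS / (2.0 * math.pi) * math.sin(2.0 * math.pi * y)


def T(y):
    return lift(y) % 1.0


def inverse_branch(d, x):
    """Solve lift(y) = x + d on [0, 1]; d in {0, 1} selects the Markov interval."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if lift(mid) < x + d:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def binary_digits(x, n):
    """First n binary digits of x in [0, 1)."""
    digits = []
    for _ in range(n):
        x *= 2.0
        d = int(x)
        digits.append(d)
        x -= d
    return digits


def conjugacy(x, n=60):
    """Approximate gamma(x), where T(gamma(x)) = gamma(2x mod 1)."""
    y = 0.0  # the fixed point p = 0 of the example map
    for d in reversed(binary_digits(x, n)):
        y = inverse_branch(d, y)
    return y


def circle_dist(a, b):
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


samples = [0.1, 0.3, 0.77, 0.9]
errors = [circle_dist(T(conjugacy(x)), conjugacy((2.0 * x) % 1.0))
          for x in samples]
```

The list errors measures how far T ◦ γ and γ ◦ E2 are from each other at a few sample points; since each inverse branch contracts by at least the factor 1/(2 − EPS), the truncation error decays geometrically in the number of digits used.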


Note that if γ ∈ Conj(T) and f ∈ C^0(S^1), then PEk(f ◦ γ) = PT(f). Indeed, γ induces a bijection between M_{Ek}^1 and M_T^1 by ν → ν ◦ γ^{−1}. Then ∫ f ◦ γ dν = ∫ f d(ν ◦ γ^{−1}), and since γ is a measure-theoretic isomorphism, hν(Ek) = h_{ν◦γ^{−1}}(T). The pressure equality follows.

Lemma 2. For all T ∈ E^1, PT(!T) = 0.

Proof. If T ∈ E^2, this is well-known as the Ruelle-Ledrappier-Young entropy formula (see [12]). Given a degree k map T ∈ E^1, by Lemma 1, we may find a neighborhood V of T and a choice of conjugacies γS for all S ∈ V so that the map S → γS is continuous on V. With these choices, if {Ti} ⊂ E^2 and Ti → T in E^1, then !Ti ◦ γTi → !T ◦ γT in C^0(S^1). Since pressure is continuous on C^0(S^1), and 0 = PTi(!Ti) = PEk(!Ti ◦ γTi) for all i, it follows by taking limits that 0 = PEk(!T ◦ γT) = PT(!T).

Corollary 1. If µ is any equilibrium state for !T, then µ is non-atomic.

Proof. Let µ be an ergodic equilibrium state; then it must be either purely atomic, or continuous. If it is purely atomic, then hµ(T) = 0 and ∫ !T dµ < 0, contradicting P(!T) = 0. The result follows since the equilibrium states form a convex set, of which the extreme points are the ergodic states.

Lemma 3. The set of T ∈ E^1 for which !T has a unique equilibrium state is generic.

The lemma is a version of the Gibbs Phase Rule for the class of expanding maps of the circle. The original Gibbs Phase Rule for the case of a shift was proved by Ruelle [18] and Gallavotti and Miracle-Sole [3].

Proof. For any expansive T, there is at least one equilibrium state for each h ∈ C^0(S^1) (see Walters [20], p. 224). Since expanding maps are expansive, every !T possesses at least one equilibrium state. To prove uniqueness for a dense Gδ, we work with equilibrium states for the map Ek : S^1 → S^1 given by Ek(x) = kx mod 1. We now show that the set B of potentials for which there is a unique Ek-equilibrium state forms a Gδ set. Theorems 4.3.3 and 4.3.5 of [8] characterize those potentials with unique equilibrium states as the set of f such that for all g ∈ C^0(S^1), lim_{t→0} (PEk(f + tg) − PEk(f))/t exists. For fixed f and g, define H(t) = (PEk(f + tg) − PEk(f))/t. Since the map t → PEk(f + tg) is convex, H is an increasing function. The above limit then exists if and only if lim inf_{t→0+} (H(t) − H(−t)) = 0. Hence f has a unique equilibrium state if and only if

$$\liminf_{t \to 0^+} \frac{P_{E_k}(f + tg) + P_{E_k}(f - tg) - 2P_{E_k}(f)}{t} = 0 \quad \text{for all } g \in C^0(S^1). \tag{1}$$
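To make criterion (1) concrete, here is a small Python sketch in a toy symbolic setting, offered only as an illustration. It assumes a potential on the full k-shift (the symbolic model of Ek) that depends on the first symbol alone, for which the pressure is log Σ_i exp(f_i); such potentials are not continuous functions on the circle, so this is a stand-in for, not an instance of, the C^0(S^1) setting of the proof. The vectors f and g and the function names are hypothetical.

```python
import math


def pressure(f):
    """Pressure log(sum_i exp(f_i)) of a first-symbol potential on the full shift."""
    top = max(f)
    return top + math.log(sum(math.exp(v - top) for v in f))


def second_difference(f, g, t):
    """The quotient (P(f + t g) + P(f - t g) - 2 P(f)) / t appearing in (1)."""
    plus = [a + t * b for a, b in zip(f, g)]
    minus = [a - t * b for a, b in zip(f, g)]
    return (pressure(plus) + pressure(minus) - 2.0 * pressure(f)) / t


f = [0.0, -1.0, 0.5]  # hypothetical potential values on three cylinders
g = [1.0, -2.0, 0.3]  # hypothetical perturbation direction
quotients = [second_difference(f, g, t) for t in (1e-1, 1e-2, 1e-3)]
```

Convexity of the pressure makes every quotient non-negative, and in this smooth toy case the quotients shrink roughly in proportion to t, which is the behaviour that (1) asks for.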

To show that these f form a Gδ set, we need to show that it is sufficient to calculate the lim inf for a collection of g belonging only to a countable set. To this end, let (gn)n∈N be a countable collection of continuous functions that is dense in C^0(S^1). We note that

$$\left| \frac{P_{E_k}(f + tg) + P_{E_k}(f - tg) - 2P_{E_k}(f)}{t} - \frac{P_{E_k}(f + tg_n) + P_{E_k}(f - tg_n) - 2P_{E_k}(f)}{t} \right| \le 2\|g - g_n\|_\infty.$$

Hence (1) holds if and only if

$$\liminf_{t \to 0^+} \frac{P_{E_k}(f + tg_n) + P_{E_k}(f - tg_n) - 2P_{E_k}(f)}{t} = 0 \quad \text{for all } n \in \mathbb{N}.$$

The set B of functions f satisfying this condition may be written as

$$\bigcap_{n\in\mathbb{N}}\ \bigcap_{p\in\mathbb{N}}\ \bigcap_{m\in\mathbb{N}}\ \bigcup_{t\in(0,\frac{1}{m})} \Big\{ f : \big| (P_{E_k}(f + tg_n) + P_{E_k}(f - tg_n) - 2P_{E_k}(f))/t \big| < 1/p \Big\},$$

which is easily seen to be a Gδ subset of C^0.

From Lemma 1, there is a continuous choice of conjugacies for maps in U0. For a map T ∈ U0, we shall call this choice of conjugacy γT. Letting . be the map U0 → C^0(S^1) defined by .(T) = !T ◦ γT, we see that . is continuous. It follows that .^{−1}(B) is a Gδ subset of U0. We now have T ∈ .^{−1}(B) if and only if !T ◦ γT has a unique Ek-equilibrium state. Since there is a bijection between Ek-equilibrium states for !T ◦ γT and T-equilibrium states for !T, we see that T ∈ .^{−1}(B) if and only if !T has a unique T-equilibrium state. We have established that the set S ⊂ E^1 consisting of those T for which !T has a unique equilibrium state, contains a Gδ subset of U0. Since E^2 ∩ U0 is a dense subset of U0 that is contained in S, it follows that S contains a dense Gδ subset of U0. Since U0 is a dense open subset of E^1, we conclude that S contains a dense Gδ subset of E^1.

Set / = {T ∈ E^1 : there exists a unique equilibrium state for !T}.

Lemma 4. Equip / with the (relative) C^1-topology, and M^1 with the (relative) weak∗ topology. Then M : / → M^1, given by M(T) = µT, is continuous.

Proof. Suppose T0 ∈ / is of degree k, Ti ∈ / and Ti → T0 in C^1. We shall show that µT0 is the limit of the µTi. As in Lemma 1, fix a neighborhood V of T0 such that there is a continuous family of conjugacies γT for T ∈ V. Suppose that µ is any limit point of the µTi. We shall show that µ = µT0, and this is sufficient, by weak∗-sequential compactness, to show that the original sequence must converge to µT0. Replacing the original sequence with a subsequence if necessary, we suppose that µTi → µ. Set νi = µTi ◦ γTi and ν = µ ◦ γT0. Then ν and the νi are all Ek-invariant measures on S^1. By continuity of the family of conjugacies, we see that νi → ν in the weak∗-topology. For each i, since νi is an Ek-equilibrium state for !Ti ◦ γTi which by Lemma 2 has pressure 0, we have 0 = hνi + ∫ !Ti ◦ γTi dνi. Since the entropy map is upper semi-continuous, lim sup hνi ≤ hν. Since Ti → T0 in C^1 and νi → ν we have

$$0 = \limsup_{i\to\infty} \Big( h_{\nu_i} + \int !_{T_i} \circ \gamma_{T_i} \, d\nu_i \Big) \le h_\nu + \int !_{T_0} \circ \gamma_{T_0} \, d\nu \le 0,$$

where the last inequality is true because the pressure is 0. Thus, all of the inequalities are equalities and ν is an equilibrium state for !T0 ◦ γT0, so that µ is an equilibrium state for !T0. Since T0 ∈ /, there is only one such state. Thus any limit point of the µTi is µT0, the unique equilibrium state for !T0, and the lemma is proved.

Lemma 5. /˜ = {T ∈ / : µT is fully supported} is a generic subset of / (and hence of E^1).

Proof. From Corollary 1, for each T ∈ /, µT must be non-atomic. By Lemma 4, for a non-empty open interval I ⊂ S^1, the map T → µT(I) is continuous on /. Choose any collection {Ii} of non-empty open intervals that forms a countable basis for the topology of S^1. Then /˜ = ⋂_i {T ∈ / : µT(Ii) > 0}, a Gδ that contains E^2 (and is therefore dense).

4. Proofs of Theorems 1, 2, and 3

Proof of Theorem 1. Lemma 3 establishes that for T belonging to the residual set /, there is a unique equilibrium state µT for the potential − log T′. To prove Statement 1, we use a result of Keller. Any fixed T ∈ E^1, together with the Markov partition for T, forms what Keller [8] calls a continuous e^{−ψ}-conformal fibred system. He shows ([8], Theorem 6.1.8)1 that in such a system, for λ-almost every x, the weak∗-limit points of the averages (1/k)(δx + . . . + δ_{T^{k−1}x}) are contained in the set of measures satisfying hµ + ∫ (− log T′) dµ ≥ 0. Since PT(− log T′) = 0, these measures are precisely the equilibrium states. Hence for T ∈ /, for λ-almost every x, the sequence (1/k)(δx + . . . + δ_{T^{k−1}x}) has at most one weak∗ limit point, namely µT. By weak∗ sequential compactness, the entire sequence must converge to µT. To see that µT must be singular (with respect to λ), we first note that each T ∈ E^1 is a non-singular transformation (with respect to λ).
Thus, if µT = µsi + µac is the decomposition of µT into singular and absolutely continuous components, the map µT → µT ◦ T^{−1} preserves µsi and µac, so that µac is a finite, absolutely continuous T-invariant measure. But we have seen that a generic T ∈ E^1 possesses no such invariant measure ([16]); that is, µac = 0. This proves Statement 2. Lemma 5 implies that generically, µT is fully supported, showing Statement 3. This completes the proof of Theorem 1.

1 In fact the quoted theorem, as stated in the book, contains a mistake, although an irrelevant one for the present setting. The interested reader may go to http://www.mi.uni-erlangen/, keller/publications/equibook.html, where the needed correction to the proof of the theorem is given.

Before proving Theorem 2, we state and prove a lemma. There is a reference to a similar lemma in [2] although we have been unable to find the proof in the papers cited there. Recall that if T ∈ E^2, µT is an absolutely continuous probability measure with strictly positive Radon–Nikodym derivative ρ = dµT/dλ.

Lemma 6. Suppose T ∈ E^2. Then ∫ log LT 1 dµT ≥ 0, with equality if and only if ρ is T^{−1}B-measurable.

Proof. Fix T ∈ E^2. In this case, the equilibrium state µT is absolutely continuous. We write ρ for the density of µT with respect to Lebesgue measure. Let P denote the Perron-Frobenius operator for T with respect to µT ∈ M_T^1. Then P(f) = LT(ρ·f)/ρ. In particular LT(1) = ρ P(1/ρ). Thus,

$$\begin{aligned}
\int \log L_T(1) \, d\mu_T &= \int \log \rho \, d\mu_T + \int \log P(1/\rho) \, d\mu_T \\
&= -\int \log(1/\rho) \, d\mu_T + \int \log P(1/\rho) \, d\mu_T \\
&= -\int P(\log(1/\rho)) \, d\mu_T + \int \log P(1/\rho) \, d\mu_T,
\end{aligned}$$

where the last equality follows because P preserves µT-integrals. It is well-known that P(·) ◦ T = EµT(·|T^{−1}B). Since T preserves µT, we may continue the above calculations as follows:

$$\begin{aligned}
-\int P(\log(1/\rho)) \, d\mu_T + \int \log P(1/\rho) \, d\mu_T
&= -\int P(\log(1/\rho)) \circ T \, d\mu_T + \int \log P(1/\rho) \circ T \, d\mu_T \\
&= -\int E_{\mu_T}\big(\log(1/\rho) \,\big|\, T^{-1}B\big) \, d\mu_T + \int \log E_{\mu_T}\big(1/\rho \,\big|\, T^{-1}B\big) \, d\mu_T \ge 0,
\end{aligned}$$

where the last inequality follows from Jensen’s inequality, from which it also follows that equality holds in the last step if and only if log(1/ρ) is T^{−1}B-measurable, which holds if and only if ρ is T^{−1}B-measurable. This concludes the proof of Lemma 6.

Proof of Theorem 2. Since log ωn(x) = − Σ_{j=1}^{n} log LT 1 ◦ T^j(x), by Theorem 1 we have that (1/n) log ωn(x) → − ∫ log LT 1 dµT for λ-a.e. x ∈ S^1 and T ∈ /. If ∫ log LT 1 dµT > 0, then for large n, ωn(x) = O(a^n) for λ-a.e. x, where a is any number such that − ∫ log LT 1 dµT < log a < 0. That is, the sequence ωn(x) is asymptotically comparable to a geometric sequence, and hence summable (for λ-a.e. x), so that Lebesgue measure is not recurrent for T.

First we observe that {T : ∫ log LT 1 dµT > 0} is open in /. To see this, if T ∈ / satisfies ∫ log LT 1 dµT > 0 and S ∈ / is C^1-close to T, then LS 1 is C^0-close to LT 1. By Lemma 4, µS is weak∗-close to µT, proving the observation. Thus by Lemma 6, it is sufficient to show that for maps T belonging to a dense subset of E^2 (and hence a dense subset of E^1), the invariant density ρT is not T^{−1}B-measurable.

Choose T ∈ E^2 for which ρT is T^{−1}B-measurable. We shall show that there is an S ∈ E^2 arbitrarily close to T (in the C^1 topology) for which ρS is not S^{−1}B-measurable. Since ρT is T^{−1}B-measurable, T x = T y implies that ρ(x) = ρ(y).
Given a Markov partition for T, we call the atoms of the partition the branches of T. We shall construct a C^2-homeomorphism π : S^1 → S^1 in such a way that

1. π is arbitrarily (C^1-) close to the identity, and
2. the map T̃ = π ◦ T ◦ π^{−1} has the property that ρ_T̃ = ρ̃ is not T̃^{−1}B-measurable.

Establishing Items 1 and 2 will finish the proof.

Suppose for the moment that π is any C^2-homeomorphism of the circle, and T̃(x̃) = π ◦ T ◦ π^{−1}(x̃). Then ρ̃(x̃) = ρ(π^{−1}x̃)/π′(π^{−1}x̃), so that ρ̃ will be T̃^{−1}B-measurable precisely when T̃(x̃) = T̃(ỹ) implies that ρ̃(x̃) = ρ̃(ỹ). Suppose x̃ ≠ ỹ and T̃(x̃) = T̃(ỹ). Then, since ρ is T^{−1}B-measurable, ρ(π^{−1}ỹ) = ρ(π^{−1}x̃). Hence ρ̃(x̃) will differ from ρ̃(ỹ) precisely when π′(π^{−1}x̃) ≠ π′(π^{−1}ỹ). Hence, if π is chosen so that π′ is not T^{−1}B-measurable, these terms will be different. Now we specify that π is a C^2-homeomorphism of S^1 with the property that π′ ≡ 1 on one branch of T, and different from 1, yet arbitrarily close to 1, on the other branches. This completes the proof of Theorem 2.

Proof of Theorem 3. Suppose that T satisfies the conditions of Theorem 1. We show that in this case, any absolutely continuous invariant measure for T is locally infinite. Suppose ν is an absolutely continuous invariant measure for T. Then ν(S^1) = ∞. Suppose, for the purpose of obtaining a contradiction, that I is any open interval with ν(I) < ∞. Let f be any non-negative continuous function supported on I that is positive on some subinterval of I. Clearly f ∈ L^1(ν). By Birkhoff’s ergodic theorem for an infinite invariant measure, for ν-almost every x, (1/n)(f(x) + . . . + f(T^{n−1}x)) → 0. This holds in particular on a set of positive Lebesgue measure. On the other hand, since µT is a Sinai–Ruelle–Bowen measure, we have for λ-almost every x, (1/n)(f(x) + . . . + f(T^{n−1}x)) → ∫ f dµT. Since f is strictly positive on a subinterval of I and µT is fully supported, this quantity is strictly positive. This contradiction completes the proof of the theorem.

5. No Characteristic Scale

In this section we prove that if

$$S_{\varepsilon,n,a} = \{T \in E^1 : \lambda\{x : L^n 1(x) \in [a, 2a]\} < \varepsilon\}$$

and

$$S = \bigcap_{\varepsilon>0}\ \bigcup_{n\in\mathbb{N}}\ \bigcap_{a>0} S_{\varepsilon,n,a},$$

then S is a dense Gδ subset of E^1.

Proof of Theorem 4. We can replace the uncountable intersections in the definition of S by countable intersections over the rationals without changing the set. Define

$$F_n(T) = \lambda \times \lambda \Big\{ (x, y) : \frac{1}{2} \le \frac{L_T^n 1(x)}{L_T^n 1(y)} \le 2 \Big\}.$$

Clearly, Fn(T) < ε^2 implies that for all positive a, the measure of the set of points with L_T^n 1(x) ∈ [a, 2a] is less than ε. Letting R_{ε,n} = {T : Fn(T) < ε^2}, it is clear that R_{ε,n} ⊂ ⋂_{a>0} S_{ε,n,a}. Conversely, for fixed x, let a1 = L_T^n 1(x)/2 and a2 = 2a1. If T ∈ ⋂_{a>0} S_{ε^2/2,n,a}, then for each x, by considering ∪_{i=1}^{2} {y : L_T^n 1(y) ∈ [ai, 2ai]} we have λ{y : L_T^n 1(y) ∈ [L_T^n 1(x)/2, 2 L_T^n 1(x)]} < ε^2. By Fubini’s theorem, we see that Fn(T) < ε^2 so that T ∈ R_{ε,n}. It follows that

$$S = \bigcap_{\varepsilon>0}\ \bigcup_{n\in\mathbb{N}}\ \bigcap_{a>0} S_{\varepsilon,n,a} = \bigcap_{\varepsilon>0}\ \bigcup_{n\in\mathbb{N}} R_{\varepsilon,n}.$$
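The two implications used above amount to the sandwich c^2 ≤ Fn ≤ 2c, where c denotes the supremum over a of the measure of the set on which the density lies in [a, 2a]. The Python sketch below checks this sandwich on sampled data, as an illustration only. It assumes that a density is represented by its values on an equally spaced grid, so that counting samples stands in for Lebesgue measure; the sample lists and function names are hypothetical.

```python
from bisect import bisect_left, bisect_right


def scale_concentration(values):
    """Sup over a > 0 of the fraction of samples lying in [a, 2a]."""
    vs = sorted(values)
    best = 0
    for v in vs:
        # A best window can be slid right until its left end is a sample value.
        best = max(best, bisect_right(vs, 2.0 * v) - bisect_left(vs, v))
    return best / len(vs)


def pair_fraction(values):
    """Fraction of ordered pairs (x, y) with 1/2 <= v(x)/v(y) <= 2."""
    vs = sorted(values)
    count = 0
    for v in vs:
        count += bisect_right(vs, 2.0 * v) - bisect_left(vs, 0.5 * v)
    return count / len(vs) ** 2


flat = [1.0] * 8                         # one scale carries all the mass
spread = [4.0 ** j for j in range(8)]    # mass spread over many scales

c_flat, f_flat = scale_concentration(flat), pair_fraction(flat)
c_spread, f_spread = scale_concentration(spread), pair_fraction(spread)
```

For the constant samples both quantities equal 1, while for the samples spread over powers of 4 no window [a, 2a] holds more than one value, so both quantities drop to 1/8; a density with no characteristic scale corresponds to the second kind of behaviour, with the bound becoming arbitrarily small.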

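We shall show that Fn : E^1 → R is an upper semi-continuous map so that S is a Gδ set. To prove this, suppose that Fn(T) < α. We have

$$\lambda \times \lambda \Big\{ (x, y) : \frac{L^n 1(x)}{L^n 1(y)} \in \big[\tfrac{1}{2}, 2\big] \Big\} = \lim_{k\to\infty} \lambda \times \lambda \Big\{ (x, y) : \frac{L^n 1(x)}{L^n 1(y)} \in \big[\tfrac{1}{2} - \tfrac{1}{k},\, 2 + \tfrac{1}{k}\big] \Big\}.$$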

One can therefore find a k such that λ × λ({(x, y) : LnT 1(x)/LnT 1(y) ∈ [1/2 − 1/k, 2 + 1/k]}) < α. Since the map . : E 1 → C 0 (S 1 × S 1 ) given by .(T )(x, y) = LnT 1(x)/ LnT 1(y) is continuous (with the C1 and C0 -topologies on the respective spaces), there exists a neighborhood U of T such that if T˜ ∈ U , then .(T ) − .(T˜ ) < 1/k. It follows that if T˜ ∈ U , then Fn (T˜ ) < α, proving the upper semi-continuity of Fn . It then remains to demonstrate the density of S. To do this, we shall establish that for any > 0, any T0 ∈ E 2 and any neighborhood U of T0 (in the C1 topology), there is a T ∈ U and an n ∈ N such that for each a, λ{x : Ln 1(x) ∈ [a, 2a]} < . This will be accomplished by conjugating T0 using a homeomorphism constructed via a cocycle. We shall therefore assume > 0, T0 ∈ E 2 and δ > 0 are given. Let η > 0 be such that (1 + η)/(1 − η) < 1 + δ. Then we also have (1 − η)/(1 + η) > 1 − δ. Since T0 belongs to E 2 , T0 preserves an absolutely continuous invariant probability measure, µ, with a strictly positive continuous density, ρ. Let m be such that m1 ≤ ρ(x) ≤ m for all x. Let T¯0 : X → X be a natural extension of T0 : S 1 → S 1 preserving the measure µ. ¯ From [21], µ¯ is Bernoulli, so we may find a non-trivial independent partition P = {A0 , A1 } of X. Write p for µ(A ¯ 0 ) and q for µ(A ¯ 1 ). We then define a ¯ 0 on X as follows: function G 1 + ηq if x ∈ A0 ¯ G0 (x) = 1 − ηp if x ∈ A1 . ¯ (n) defined by Let n > 0 be an integer. We then form the multiplicative cocycle G 0 ¯ 0 (x)G ¯ 0 (T¯0 x) . . . G0 (T¯ n−1 x). ¯ (n) (x) = G G 0 0 (n)

The function Ḡ_0^{(n)} takes on the value v_k = (1 + ηq)^k (1 − ηp)^{n−k} on a set of measure \binom{n}{k} p^k q^{n−k}. Let K ∈ N be the least integer so that
\[
\Bigl(\frac{1+\eta q}{1-\eta p}\Bigr)^{K} > 2m^2 .
\]
Since v_{k+1}/v_k = (1 + ηq)/(1 − ηp), for each a there are at most K values taken by Ḡ_0^{(n)} in [a, 2m^2 a]. We then have the estimate
\[
\bar\mu\{x : \bar G_0^{(n)}(x) \in [a, 2m^2 a]\} \le K \max_{\{k \,:\, v_k \in [a,\,2m^2 a]\}} \binom{n}{k} p^k q^{n-k} .
\]



Since for the values of k in the range over which the maximum is taken have the property that vk ≥ a, we see n ¯ (n) (x) ∈ [a, 2m2 a]} ≤ K a µ{x ¯ :G vk p k q n−k max 0 {k:vk ∈[a,2m2 a]} k n (p + ηpq)k (q − ηpq)n−k = K max 0≤k≤n k CK < √ , n where C is a constant that depends only on the values of p and q. ¯ (n) (x) ∈ [a, 2m2 a]}) < /4 for all a. It will turn out Now fix an n so that a µ({x ¯ :G 0 that an inequality of this type will be what is needed for the conjugate map to have the ¯ (n) is defined not on the circle, but on the desired property. At this point, the function G 0 natural extension space. We shall apply a conditional expectation and approximation ¯ (n) to obtain a function on the circle as needed. argument to G 0 Let Q be a Markov partition for T0 consisting of intervals. There exists a k such that k−1 −s s=0 T0 Q consists of intervals of length less than δ. Denote these intervals by Ij and write I¯j for π −1 Ij , where π denotes the natural projection from the natural extension (X, T¯0 , µ) ¯ to (S 1 , T0 , µ). ¯ Write ρ¯ = ρ ◦ π and define the natural extension of λ, λ¯ by λ(A) = A (1/ρ) ¯ d µ. ¯ We then calculate χ I¯j (n) i ¯ ¯ ¯ ¯ (n) ◦ T¯0i d µ. ¯ G0 ◦ T 0 d λ = ·G 0 ρ¯ I¯j Since T¯0 is mixing, we see that χ I¯j ¯ (n) ◦ T¯0i d λ¯ = ¯ (n) d µ¯ lim G d µ ¯ G 0 i→∞ I¯j 0 ρ¯ n ¯ ¯ ¯ G0 d µ¯ = λ(Ij ) = λ(Ij ), where we used the fact that P is an independent partition to get the second equality. We recall that n is chosen so that ¯ (n) (x) ∈ [a, 2m2 a]}) < /(4a), µ({x ¯ :G 0 for each a > 0. We now choose an i0 such that for i ≥ i0 , ¯ (n) ◦ T¯0i d λ¯ − λ(Ij ) < δ λ(Ij ), G I¯j 0 3

(2)

(3)

for each j . ¯ (n) if G ¯ is chosen to be We now show that similar inequalities persist for functions G ¯ 0. an appropriate perturbation of G ¯ (n) are in the range [(1 − It is useful to note that because the values taken by G 0 ηp)n , (1 + ηq)n ], the inequality (2) holds trivially for a outside this range.



We define N to be a subset of L1 (µ) ¯ as follows and equip it with the L1 subspace topology: ¯ : 1 − ηp ≤ G ¯ ≤ 1 + ηq; G ¯ −G ¯ 0 1 < ζ }. N = {G ¯ and because N consists of bounded Since composition with T¯ is an isometry on L1 (µ), ¯ → G ¯ (n) is continuous. Clearly, for G ¯ ∈ functions, the map from N to L1 given by G (n) ¯ N, the values taken by G are in the range [(1 − ηp)n , (1 + ηq)n ]. By choosing ζ ¯ (n) | < (1 − ηp)n /2 on a set of measure ¯ (n) − G appropriately small, we can ensure that |G 0 at least 1 − /(8(1 + ηq)n ). For a given a in the range [(1 − ηp)n /2m2 , (1 + ηq)n ], let a1 = a/2 and a2 = 2a. Then ¯ (n) (x) ∈ [a, 2m2 a]} ⊂ {x : G ¯ (n) (x) ∈ [a1 , 2m2 a1 ]} {x : G 0 ¯ (n) (x) ∈ [a, 2m2 a]} ∪ {x : G 0 ¯ (n) (x) ∈ [a2 , 2m2 a2 ]} ∪ {x : G 0 (n)

¯ (x) − G ¯ (n) (x)| > (1 − ηp)n /2}. ∪ {x : |G 0 We shall denote the four sets on the right-hand side by A1 , A2 , A3 and A4 respectively. ¯ (n) we have µ(A ¯ 1 ) < /(2a), µ(A ¯ 2 ) < /(4a) and By our previous estimates on G 0 µ(A ¯ 3 ) < /(4a2 ) < /(8a). We chose ζ above to ensure that µ(A ¯ 4 ) < /(8(1+ηq)n ) < /(8a), so that ¯ (n) (x) ∈ [a, 2m2 a]}) < /a µ({x ¯ :G for each a in the range [(1 − ηp)n /(2m2 ), (1 + ηq)n ]. As before, the inequality holds trivially for a outside this range, so we have established that for sufficiently small ζ , a ¯ (n) , if G ¯ is chosen from N . similar inequality to (2) persists for all a and functions G Since ¯ (n) ◦ T¯0i − G ¯ (n) ◦ T¯0i | d λ¯ ≤ |G ¯ (n) ◦ T¯0i − G ¯ (n) ◦ T¯0i | d λ¯ |G I¯j

0

0

≤m

¯ (n) ◦ T¯0i − G ¯ (n) ◦ T¯0i | d µ, |G ¯ 0

¯ ∈ N. we see that provided ζ is sufficiently small, (3) holds for G ¯ ∈ N, We have therefore shown that there exists a ζ > 0 such that for G ¯ (n) (x) ∈ [a, 2m2 a]}) < /a for each a, and µ({x ¯ :G ¯ (n) (x) ◦ T¯0i d λ¯ − λ(Ij ) < δ λ(Ij ) for each j , and i ≥ i0 . G 3 I¯j

(4) (5)

We note that since T¯0 : X → X is a natural extension of T0 : S 1 → S 1 , the σ -algebras ¯ 0 | T¯ i π −1 BS 1 ) converges to G ¯ 0 in L1 . By T¯0k π −1 BS 1 increase to BX . It follows that Eµ¯ (G 0 the monotonicity of conditional expectation, these functions also satisfy the inequality ¯ 0 | T¯ i π −1 BS 1 ) ≤ 1 + ηq. It follows that for sufficiently large i ≥ i0 , (i0 1 − ηp ≤ Eµ¯ (G 0 ¯ 0 | T¯ i π −1 BS 1 ) in place of G ¯ . Fix some as above), (4) and (5) are satisfied with Eµ¯ (G 0 ¯ 1 for Eµ¯ (G ¯ 0 | T¯ i π −1 BS 1 ). such i and write G 0



¯ 1 ◦ T¯ i = Eµ¯ (G ¯ 0 ◦ T¯ i |π −1 BS1 ) so we see that G ¯ 1 ◦ T¯ i may be written as g1 ◦ π Now G 0 0 0 for some B-measurable function g1 on the circle. Since C 0 (S 1 ) is dense in L1 (S 1 , B, µ), it follows that there exists a continuous function g2 such that g1 − g2 1 is arbitrarily ¯ 1 − g2 ◦ π ◦ T¯ −i 1 = g1 − g2 1 , we see that g2 may be chosen so that small. Since G 0 g2 ◦ π ◦ T¯0−i lies in N . Equations (4) and (5) now yield (n) g2 dλ − λ(Ij ) < 3δ λ(Ij ) for each j ; and Ij (n)

µ({x : g2 (x) ∈ [a, 2m2 a]}) < /a for each a > 0. (n) From the first equation, we see that 1 − 3δ < g2 dλ < 1 + 3δ , so finally we rescale g2 (i.e. multiply by a constant, that will, by our above estimates, be very close to 1) to obtain a function g that satisfies g (n) dλ = 1. We then have the inequalities (n) (6) g dλ − λ(Ij ) < δλ(Ij ) for each j ; and Ij µ({x : g (n) (x) ∈ [a, 2m2 a]}) < 2/a for each a > 0. (7) x (n) Set θ (x) = 0 g (t) dt and let T (x) = θ ◦ T0 ◦ θ −1 (x). Then from the above, and since each interval Ij has length less than δ, it may be verified that |θ(x) − x| < 2δ, and supx∈S 1 |T (x) − T0 (x)| < (C + 4)δ, where C = maxx∈S1 |T0 (x)|. Hence this quantity can be made arbitrarily small by choosing δ sufficiently small. Also, differentiating, we see θ (T0 (θ −1 x)) θ (θ −1 x) g (n) (T0 (θ −1 x)) = T0 (θ −1 x) g (n) (θ −1 x) g(T0n (θ −1 x)) = T0 (θ −1 x) . g(θ −1 x)

T (x) = T0 (θ −1 x)

Since g is uniformly close to 1 and T0 is uniformly continuous, we see that supx∈S 1 |T (x)− T0 (x)| can also be made arbitrarily small by controlling δ and η. This shows that T can be chosen arbitrarily close to T0 in the C1 norm. It remains to verify that T has the property that there exists an n such that for each a, λ{x : Ln 1(x) ∈ [a, 2a]} < . Since T is conjugate to T0 , there is also a conjugacy relation between their Perron-Frobenius operators given by LT = Lθ ◦ LT0 ◦ Lθ −1 , where Lθ f (x) = f (θ −1 (x))/θ (θ −1 x). Since T0 is a C2 expanding map, we have that LnT0 1 converges uniformly to ρ. It follows that LnT 1 converges uniformly to Lθ ρ(x) = ρ(θ −1 x)/θ (θ −1 x). We then estimate λ({x :

ρ(θ −1 x) 1 a , 2ma]}) ∈ [a, 2a]}) ≤ λ({x : (n) −1 ∈ [ m (n) −1 g (θ x) g (θ x) 1 = λ({x : g (n) (θ −1 x) ∈ [ 2ma ,m a ]}).



1 1 m (n) But we see that {x : g (n) (θ −1 x) ∈ [ 2ma ,m a ]} = θ({y : g (y) ∈ [ 2ma , a ]}). Using this, we get

λ({x :

ρ(θ −1 x) 1 ,m ∈ [a, 2a]}) ≤ λ ◦ θ({y : g (n) (y) ∈ [ 2ma a ]}) g (n) (θ −1 x) = g (n) (y) dλ 1 m {y : g (n) (y)∈[ 2ma , a ]}
0 there is C_R > 0 such that
\[
|P_k f(u) - (\mu, f)| \le C_R\, e^{-c\sqrt{k}} \Bigl(\sup_H |f| + \operatorname{Lip}(f)\Bigr) \quad\text{for } k \ge 0, \tag{0.4}
\]

where ‖u‖ ≤ R, f is an arbitrary bounded Lipschitz function on H, and c > 0 is a constant not depending on u, f, R, and k.

Example 0.2. Let us consider the 2D Navier–Stokes (NS) equations perturbed by a random kick-force:
\[
\dot u - \nu\Delta u + (u,\nabla)u + \nabla p = \eta(t,x) \equiv \sum_{k=-\infty}^{\infty} \eta_k(x)\,\delta(t-k), \qquad \operatorname{div} u = 0, \qquad \langle u(t,\cdot)\rangle = 0, \tag{0.5}
\]
where u = u(t, x), x ∈ T^2, and ⟨u⟩ = ∫_{T^2} u(x) dx. Let H be the space of divergence-free vector fields u ∈ L^2(T^2, R^2) such that ⟨u⟩ = 0 and let {e_j} be the normalised trigonometric basis in H. Assuming that the kicks η_k ∈ H have the form (0.1) and normalising solutions u(t) for (0.5) to be continuous from the right, we observe that (0.5) can be written in the form (0.2), where u_k = u(k, ·) ∈ H and S : H → H is the time-one shift along trajectories of the free NS system (i.e., of Eqs. (0.5) with η ≡ 0). As is shown in [KS1], the operator S satisfies all the required assumptions, and therefore Theorem 0.1 applies to (0.5).

Randomly Forced Nonlinear PDE’s


Theorem 0.1 can also be applied to many other dissipative nonlinear PDE's perturbed by a random kick-force, in particular, to the complex Ginzburg–Landau equation
\[
\dot u - \nu(\Delta - 1)u + i|u|^2 u = \eta(t,x), \qquad x \in \mathbb{T}^n,
\]
where u = u(t, x) and ν > 0 (see [KS1, KS2]).

Uniqueness of a stationary measure for (0.2) was first established¹ in [KS1]. The proof in [KS1] is based on a Lyapunov–Schmidt type reduction of the system (0.2) to an N-dimensional RDS with delay (the integer N is the same as in Theorem 0.1). Due to this reduction, the problem of uniqueness of a stationary measure for (0.2) reduces to a similar question for an abstract 1D Gibbs system with an N-dimensional phase space. The uniqueness for the reduced Gibbs system is then established using a version of the Ruelle–Perron–Frobenius theorem. E, Mattingly, Sinai [EMS] and Bricmont, Kupiainen, Lefevere [BKL] later used similar approaches to show that the NS system (0.5) perturbed by a white (in time) force of the form
\[
\eta(t,x) = \sum_{j=1}^{N} b_j \dot\beta_j(t)\, e_j(x), \qquad N < \infty,
\]

also has a unique stationary measure µ ∈ P(H), provided that b_j ≠ 0 for 1 ≤ j ≤ N′ ≤ N with some sufficiently large N′ = N′(ν). Moreover, it is shown in [BKL] that for the case of white noise the convergence in (0.4) is exponentially fast for µ-almost all u ∈ H. In [KS3] the NS equations (0.5) with an unbounded kick-force η(t, x) are studied and the scheme of [KS1] is used to prove the uniqueness and ergodicity of a stationary measure.

The approach presented in this work does not use a Lyapunov–Schmidt type reduction and the Gibbs measure technique. Instead it exploits some ideas from [KS2], interpreting them in terms of the coupling. The new approach gives rise to a shorter proof and is more flexible. The coupling is a well-known effective tool for studying finite-dimensional Markov chains (e.g., see [Lin] and the Appendix in [V]) and dynamical systems (e.g., see [Y, BL]). In [EMS] a coupling is used to study the auxiliary finite-dimensional RDS with delay which arises as a result of the Lyapunov–Schmidt reduction. Our work shows that a form of coupling applies directly to infinite-dimensional Markov chains and randomly forced PDE's.

When a preprint of this paper was sent around, we learned from L.-S. Young that a similar approach to prove Theorem 0.1 is developed by her and Nader Masmoudi in their work under preparation.

Notation. We abbreviate a pair of random variables ξ_1, ξ_2 or points u_1, u_2 to ξ_{1,2} and u_{1,2}, respectively. Given a probability space (Ω, F, P), for any integer k ≥ 1 we denote by Ω^k the space Ω × · · · × Ω (k times) endowed with the σ-algebra F × · · · × F and the measure P × · · · × P. For a random variable ξ, we denote by D(ξ) its distribution. For a Banach space H, we shall use the following spaces and sets:

¹ It is shown in [KS1, KS2] that the left-hand side of (0.4) converges to zero as k → ∞ for any f ∈ C_b(H); however, the rate of convergence is not specified.


S. Kuksin, A. Shirikyan

C_b(H) is the space of bounded continuous functions on H with the supremum norm ‖·‖_∞.
L(H) is the space of bounded Lipschitz functions on H endowed with the natural norm ‖·‖_L (see Sect. 1).
M(H) is the space of signed Borel measures on H with bounded variation.
P(H) is the set of probability measures µ ∈ M(H); this space is endowed with two different metrics described in Sect. 1.
P(H, A) is the set of measures µ ∈ P(H) with support in a closed set A.
µ_v(k) is the measure P(k, v, ·), where P is the Markov transition function for (0.2).
B_H(R) is the closed ball of radius R > 0 centred at zero.

1. Measures on Hilbert Spaces

Let H be a separable Hilbert space with the Borel σ-algebra B(H) and let M(H) be the space of signed Borel measures with bounded variation. We denote by P(H) the set of probability measures µ ∈ M(H) and by P(H, A) the subset in P(H) consisting of measures supported by a closed set A ⊂ H. For any measure µ ∈ M(H) and any function f ∈ C_b(H), we write
\[
(\mu, f) = \int_H f(u)\,d\mu(u) = \int_H f(u)\,\mu(du).
\]

We shall use two different topologies on P(H). The first of them is given by the variation norm on M(H):
\[
\|\mu\|_{\mathrm{var}} = \sup_{\Gamma\in B(H)} |\mu(\Gamma)|.
\]
The distance defined by this norm on P(H) can be characterised in terms of densities. Namely, let us assume that µ_1, µ_2 ∈ P(H) are absolutely continuous with respect to a fixed Borel measure m, finite or infinite. (Such a measure always exists; for instance, one can take m = (µ_1 + µ_2)/2.) In this case, we have
\[
\|\mu_1 - \mu_2\|_{\mathrm{var}} = \frac12 \int_H |p_1(u) - p_2(u)|\,dm(u), \tag{1.1}
\]
where p_i(u), i = 1, 2, is the density of µ_i with respect to m. The space P(H) is complete with respect to ‖·‖_var.

To define a second topology, we denote by L(H) the space of real-valued bounded Lipschitz functions on H with the norm
\[
\|f\|_L := \sup_{u\in H} |f(u)| \,\vee\, \sup_{u\ne v} \frac{|f(u) - f(v)|}{\|u - v\|}.
\]
Let ‖·‖*_L be the dual norm on M(H):
\[
\|\mu\|_L^* = \sup_{\|f\|_L \le 1} |(\mu, f)|.
\]
It is clear that the norm ‖·‖*_L defines a metric on P(H).

Lemma 1.1. The space P(H) is complete with respect to the metric ‖·‖*_L.



Proof. Suppose that {µ_n} ⊂ P(H) is a sequence such that ‖µ_n − µ_m‖*_L → 0 as m, n → ∞. Let L*(H) be the space of continuous functionals on L(H). Regarding µ_n as elements of L*(H), we conclude that the sequence {µ_n} converges (in the norm ‖·‖*_L) to a limit Λ ∈ L*(H), and we have
\[
\Lambda(f) = \lim_{n\to\infty} (\mu_n, f), \qquad f \in L(H). \tag{1.2}
\]
In view of the corollary² from Theorem 1 in [GS, Chapter VI, §1], there is a measure µ ∈ P(H) such that Λ(f) = (µ, f). This completes the proof.

Note that, in the case when H is finite-dimensional, the fact that the functional Λ in (1.2) is a measure is implied by the following well-known result (for instance, see [H, Theorem 2.1.7]): any nonnegative distribution is a measure; in particular, any positive functional Λ ∈ L*(H) is a measure as well.

Let P(k, u, Γ), k ≥ 0, u ∈ H, Γ ∈ B(H), be a Markov transition function. A set A ∈ B(H) is said to be invariant for P if
\[
P(k, u, A) = 1 \quad\text{for all } k \ge 0,\ u \in A.
\]

Lemma 1.2. Let A ∈ B(H) be an invariant set for P(k, u, Γ). Suppose that there is k_0 ≥ 1 and a sequence ζ_k, k ≥ k_0, going to zero as k → ∞ such that
\[
\|P(k,u,\cdot) - P(k,v,\cdot)\|_L^* \le \zeta_k \quad\text{for } k \ge k_0,\ u, v \in A. \tag{1.3}
\]
Then there is a unique measure µ ∈ P(H, A) such that
\[
\|P(k,u,\cdot) - \mu\|_L^* \le \zeta_k \quad\text{for } k \ge k_0,\ u \in A. \tag{1.4}
\]
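The mechanism of Lemma 1.2 can be watched on a toy example. The sketch below is only an illustration under assumptions that are not in the paper: the state space is finite (two states), the transition matrix `P` is a hypothetical choice, and the variation distance of (1.1) is used in place of the dual-Lipschitz norm. It tabulates ζ_k, the largest distance between two rows of the k-step matrix, together with the largest distance from a row to the stationary vector, and the second quantity never exceeds the first, as in (1.4).

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def tv(p, q):
    """Variation distance between two probability vectors, cf. (1.1)."""
    return 0.5 * sum(abs(x - y) for x, y in zip(p, q))


def contraction_table(P, mu, steps):
    """Return (k, zeta_k, worst_k) for k = 1..steps.

    zeta_k  : largest distance between two rows of P^k (analogue of (1.3))
    worst_k : largest distance from a row of P^k to mu (analogue of (1.4))
    """
    table = []
    Pk = P
    for k in range(1, steps + 1):
        zeta_k = max(tv(r1, r2) for r1 in Pk for r2 in Pk)
        worst_k = max(tv(row, mu) for row in Pk)
        table.append((k, zeta_k, worst_k))
        Pk = mat_mul(Pk, P)
    return table


# Hypothetical two-state chain; mu is its stationary vector.
P = [[0.9, 0.1], [0.2, 0.8]]
mu = [2.0 / 3.0, 1.0 / 3.0]

for k, zeta_k, worst_k in contraction_table(P, mu, 5):
    print(k, round(zeta_k, 6), round(worst_k, 6))
```

For this particular matrix the second eigenvalue is 0.7, so ζ_k decays geometrically; the lemma itself only needs ζ_k → 0.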

Proof. Let f ∈ L(H), ‖f‖_L ≤ 1. Then, by (1.3) and the Chapman–Kolmogorov relation, for l ≥ k ≥ k_0 and u, v ∈ A we have
\[
\bigl|\bigl(P(l,v,\cdot) - P(k,u,\cdot),\, f\bigr)\bigr|
\le \int_H P(l-k,v,dz)\,\Bigl|\int_H P(k,z,dw) f(w) - \int_H P(k,u,dw) f(w)\Bigr|
\le \zeta_k \int_H P(l-k,v,dz) = \zeta_k. \tag{1.5}
\]
By Lemma 1.1, the space P(H) is complete with respect to ‖·‖*_L. Hence, there is a unique measure µ ∈ P(H) such that ‖P(l, v, ·) − µ‖*_L → 0 as l → ∞. It is clear that supp µ ⊂ A and therefore µ ∈ P(H, A). Passing to the limit in (1.5) as l → ∞, we obtain (1.4).

We now recall that a pair of random variables (ξ_1, ξ_2) defined on the same probability space is called a coupling for given measures µ_1, µ_2 ∈ P(H) if D(ξ_j) = µ_j, j = 1, 2. For some basic results on the coupling, see [Lin, V] and the Appendix (Sect. 4).

² The corollary of Theorem 1 in [GS, Chapter VI, §1] claims, in fact, that if the limit in (1.2) exists for any f ∈ C_b(H), then the functional Λ can be represented in the form Λ(f) = (µ, f), where µ ∈ P(H). However, the same proof works also in the case under study.



Lemma 1.3. If measures µ_1, µ_2 ∈ P(H) admit a coupling (ξ_1, ξ_2) such that
\[
P\{\|\xi_1 - \xi_2\| > \varepsilon\} \le \theta, \tag{1.6}
\]
where ε > 0 and θ > 0 are some constants, then
\[
\|\mu_1 - \mu_2\|_L^* \le 2\theta + \varepsilon. \tag{1.7}
\]

Proof. Let f ∈ L(H), ‖f‖_L ≤ 1. Then (µ_{1,2}, f) = E f(ξ_{1,2}) and, therefore,
\[
|(\mu_1 - \mu_2, f)| \le E\,\chi_Q\,|f(\xi_1) - f(\xi_2)| + E\,\chi_{Q^c}\,|f(\xi_1) - f(\xi_2)|, \tag{1.8}
\]
where χ_Q and χ_{Q^c} are characteristic functions of the event ‖ξ_1 − ξ_2‖ > ε and of its complement, respectively. By (1.6), the first term in the right-hand side of (1.8) is bounded by 2θ, while the second does not exceed ε‖f‖_L ≤ ε. This completes the proof of (1.7).

2. A Class of Random Dynamical Systems

Let H be a Hilbert space with a norm ‖·‖ and an orthonormal basis {e_j} and let S : H → H be an operator satisfying Conditions (A)–(C) below:

(A) For any R > r > 0 there exist positive constants a = a(R, r) < 1 and C = C(R) and an integer n_0 = n_0(R, r) ≥ 1 such that
\[
\|S(u_1) - S(u_2)\| \le C(R)\,\|u_1 - u_2\| \quad\text{for all } u_1, u_2 \in B_H(R), \tag{2.1}
\]
\[
\|S^n(u)\| \le \max\{a\|u\|,\, r\} \quad\text{for } u \in B_H(R),\ n \ge n_0. \tag{2.2}
\]

Let η_k, k ≥ 1, be a sequence of i.i.d. H-valued random variables that are defined on a probability space (Ω_1, F_1, P_1) and have the form (0.1), where b_j ≥ 0 are some constants such that
\[
\sum_{j=1}^{\infty} b_j^2 < \infty, \tag{2.3}
\]

and {ξ_{jk}} is a family of independent real-valued random variables such that |ξ_{jk}| ≤ 1 for all j, k, and ω_1 ∈ Ω_1. We consider the following RDS in H:
\[
u_k = S(u_{k-1}) + \eta_k =: F^{\omega_1}(u_{k-1}), \qquad k \ge 1. \tag{2.4}
\]
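To make the recursion (2.4) concrete, here is a minimal simulation sketch. Everything specific in it is an assumption chosen for illustration and is not taken from the paper: the space is truncated to eight modes, S is the hypothetical linear contraction S(u) = γu, the amplitudes are b_j = 2^{-j}, and the ξ_{jk} are uniform on [−1, 1]. Two trajectories are driven by the same kicks, so their distance contracts by the factor γ at every step, and each trajectory stays within the radius obtained by iterating ‖u_k‖ ≤ γ‖u_{k−1}‖ + ‖b‖.

```python
import math
import random


def kick(b, rng):
    """One kick eta_k = sum_j b_j xi_jk e_j with xi_jk uniform on [-1, 1]."""
    return [bj * rng.uniform(-1.0, 1.0) for bj in b]


def step(u, eta, gamma):
    """One step of (2.4) for the toy operator S(u) = gamma * u."""
    return [gamma * x + e for x, e in zip(u, eta)]


def norm(u):
    """Euclidean norm of a coordinate vector."""
    return math.sqrt(sum(x * x for x in u))


def run_pair(u1, u2, b, gamma, steps, seed=0):
    """Drive two initial states by the same kicks; record their distance."""
    rng = random.Random(seed)
    dists = [norm([x - y for x, y in zip(u1, u2)])]
    for _ in range(steps):
        eta = kick(b, rng)
        u1 = step(u1, eta, gamma)
        u2 = step(u2, eta, gamma)
        dists.append(norm([x - y for x, y in zip(u1, u2)]))
    return u1, u2, dists


b = [2.0 ** (-j) for j in range(1, 9)]   # hypothetical amplitudes b_j
gamma = 0.5                              # hypothetical contraction rate
u1, u2, dists = run_pair([3.0] * 8, [-1.0] * 8, b, gamma, steps=20)
print(dists[0], dists[-1], norm(u1))
```

With a linear S the contraction is automatic; in the setting of the paper S is nonlinear and the two trajectories must instead be driven by suitably correlated kicks, which is what the coupling construction provides.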

It follows from (0.1) and (2.3) that the distribution of η_k is supported by the Hilbert cube K,
\[
K = \Bigl\{ u = \sum_{j=1}^{\infty} u_j e_j \,:\, |u_j| \le b_j \text{ for all } j \ge 1 \Bigr\}.
\]
Therefore, if the initial state u_0 of the RDS (2.4) belongs to a set B ⊂ H, then u_k ∈ A_k(B) for all k ≥ 1 and ω_1 ∈ Ω_1, where A_0(B) = B and
\[
A_k(B) = S\bigl(A_{k-1}(B)\bigr) + K \quad\text{for } k \ge 1.
\]
The next condition expresses the existence of a bounded absorbing set for the system in question.



(B) There exists ρ > 0 such that for any bounded set B ⊂ H there is an integer k_0 ≥ 1 such that A_k(B) ⊂ B_H(ρ) for k ≥ k_0.

Clearly, inequality (2.2) and Condition (B) are satisfied if ‖S(u)‖ ≤ γ‖u‖ for all u ∈ H and some positive constant γ < 1. To formulate the last condition, we introduce some notation. For a subspace E ⊂ H, we denote by E^⊥ its orthogonal complement in H. For an integer N ≥ 1, let H_N be the finite-dimensional subspace generated by the vectors e_1, . . . , e_N and let P_N and Q_N be the orthogonal projections onto H_N and H_N^⊥, respectively.

(C) For any R > 0 there is a decreasing sequence γ_N(R) > 0 tending to zero as N → ∞ such that
\[
\bigl\|Q_N\bigl(S(u_1) - S(u_2)\bigr)\bigr\| \le \gamma_N(R)\,\|u_1 - u_2\| \quad\text{for all } u_1, u_2 \in B_H(R).
\]
Finally, we specify the random variables {ξ_{jk}}:

(D) For any j, the random variables ξ_{jk}, k ≥ 1, have the same distribution π_j(dr) = p_j(r) dr, where the densities p_j(r) are functions of bounded variation such that supp p_j ⊂ [−1, 1] and ∫_{|r|≤ε} p_j(r) dr > 0 for all j ≥ 1 and ε > 0. We normalise the functions p_j to be continuous from the right.

The RDS (2.4) defines a family of Markov chains in H with the transition function P(k, v, Γ) = P{u_k ∈ Γ}, where (u_k, k ≥ 0) is the solution of (2.4) such that u_0 = v. Let P_k and P_k^* be the corresponding semigroups (see the Introduction for their definition). Continuity of S (see Condition (A)) and the Lebesgue theorem on dominated convergence imply that the transition function satisfies the Feller condition: if f ∈ C_b(H), then P_k f ∈ C_b(H) for all k ≥ 1. Let ρ > 0 be the constant in Condition (B). We introduce the set
\[
A = \bigcup_{k\ge 1} A_k\bigl(B_H(\rho)\bigr). \tag{2.5}
\]
It is clear that A is an invariant set for the RDS (2.4): if u_0 ∈ A, then u_k ∈ A for all k ≥ 1 and ω_1 ∈ Ω_1. Moreover, it follows from Condition (C) that the set A is compact in H. (Note that the union in (2.5) is taken over k ≥ 1 and therefore B_H(ρ) is not a subset of A.) Our goal is to prove the following result:

Theorem 2.1. There is an integer N ≥ 1 such that if (0.3) holds, then the RDS (2.4) has a unique stationary measure µ ∈ P(H, A). Moreover, for any R > 0 there is C_R > 0 such that
\[
|P_k f(u) - (\mu, f)| \le C_R\, e^{-c\sqrt{k}}\, \|f\|_L \quad\text{for } k \ge 0,\ \|u\| \le R,
\]
where f ∈ L(H) is an arbitrary function and c > 0 is a constant not depending on f, u, R, and k.



Condition (B) and the definition of A imply that for any R > 0 there is an integer l ≥ 1 depending on R such that P(l, u, A) = 1 for any u ∈ B_H(R). Hence, we can restrict our consideration to the invariant set A. In view of Lemma 1.2, Theorem 2.1 will be established if we show that there are positive constants C and c and an integer k_0 ≥ 1 such that
\[
\|P(k,u,\cdot) - P(k,v,\cdot)\|_L^* \le C\, e^{-c\sqrt{k}} \quad\text{for } k \ge k_0,\ u, v \in A. \tag{2.6}
\]

3. Proof of the Main Result

We first establish some auxiliary assertions and then use them to prove inequality (2.6), which implies the required result.

3.1. Auxiliary assertions. We begin with a simple observation. Let R > 0 be so large that B_H(R) ⊃ A. To simplify notation, we denote B = B_H(R).

Lemma 3.1. For any d > 0 there is an integer l = l(d) ≥ 0 and a constant ϰ = ϰ(d) > 0 such that
\[
P\{\|u_l(v)\| \le d/2 \text{ for all } v \in B\} \ge \varkappa. \tag{3.1}
\]
Proof. Let a and n_0 be the constants in Condition (A) that correspond to the parameters R (the radius of B) and r = d/4 and let l = n_0 m, where m is the smallest integer such that a^m R ≤ d/4. If η_k = 0 in (2.4) for 1 ≤ k ≤ l, then, in view of (2.2), we have
\[
\|u_l(v)\| \le \max\{a^m R,\, d/4\} = d/4 \quad\text{for all } v \in B.
\]
By continuity, there is γ > 0 such that if
\[
\|\eta_k\| \le \gamma \quad\text{for } 1 \le k \le l, \tag{3.2}
\]
then
\[
\|u_l(v)\| \le d/2. \tag{3.3}
\]
It follows from (2.3) and Condition (D) that the event (3.2) has a positive probability ϰ. Inequality (3.1) follows now from (3.3).

To simplify notation, for any v ∈ H we denote by µ_v(k) the measure P(k, v, ·) ∈ P(H). For any measurable space (X, B(X)) and any integer k ≥ 1, we denote by X^k the direct product X × · · · × X endowed with the product σ-algebra B^k(X) = B(X) × · · · × B(X).

Lemma 3.2. There is a probability space (Ω, F, P), an integer N ≥ 1, and a constant C > 0 such that if (0.3) holds, then for any u_1, u_2 ∈ B the measures µ_{u_{1,2}}(1) admit a coupling V_{1,2} = V_{1,2}(u_1, u_2; ω) that possesses the following properties:
(i) The maps V_{1,2} are measurable with respect to the σ-algebra B^2(H) × F as functions of (u_1, u_2, ω) ∈ B^2 × Ω.
(ii) Let d = ‖u_1 − u_2‖. Then
\[
P\{\|V_1 - V_2\| \ge d/2\} \le Cd. \tag{3.4}
\]



Let us note that inequality (3.4) is nontrivial only in the case Cd < 1.

Proof. Let (Ω_1, F_1, P_1) be the probability space on which the random variables {η_k} are defined and let (Ω_2, F_2, P_2) be the probability space constructed in Theorem 4.2 for the measures ν_{1,2} specified below. We shall show that the set Ω = Ω_1 × Ω_2 endowed with the natural σ-algebra and probability of direct product is the required probability space. The random variables V_{1,2} are sought in the form
\[
V_1 = S(u_1) + \xi_1, \qquad V_2 = S(u_2) + \xi_2,
\]
where ξ_{1,2} are some random variables on Ω such that D(ξ_1) = D(ξ_2) = D(η_1). It is clear that D(V_{1,2}) = µ_{u_{1,2}}(1) and that (i) holds. To define the random variables ξ_{1,2}, we specify their projections P_N ξ_{1,2} and Q_N ξ_{1,2}, where N ≥ 1 is a sufficiently large integer which is chosen below. We set Q_N ξ_1 = Q_N ξ_2 = Q_N η̃_1, where η̃_1 is the natural extension of η_1 to Ω, i.e., η̃_1(ω) = η_1(ω_1) for ω = (ω_1, ω_2) ∈ Ω. To define P_N ξ_{1,2}, let us write ν_{1,2} := P_N µ_{u_{1,2}}(1) and assume that we have proved the inequality
\[
\|\nu_1 - \nu_2\|_{\mathrm{var}} \le Cd, \tag{3.5}
\]
where C > 0 is a constant not depending on u_{1,2} ∈ B. In view of Theorem 4.2, there is a maximal coupling Ξ_{1,2}(u_1, u_2; ω_2) for the measures ν_{1,2} that is measurable with respect to (u_1, u_2, ω_2) ∈ B^2 × Ω_2:
\[
P\{\Xi_1 \ne \Xi_2\} = \|\nu_1 - \nu_2\|_{\mathrm{var}} \le Cd. \tag{3.6}
\]
Retaining the same notation for the natural extensions of Ξ_1 and Ξ_2 to Ω, we now set P_N ξ_{1,2} = Ξ_{1,2} − P_N S(u_{1,2}) and note that P_N V_1 = P_N V_2 if and only if Ξ_1 = Ξ_2. Let N ≥ 1 be so large that γ_N(R) ≤ 1/2 (see Condition (C)). In this case, if P_N V_1 = P_N V_2, then
\[
\|V_1 - V_2\| = \|Q_N(V_1 - V_2)\| = \bigl\|Q_N\bigl(S(u_1) - S(u_2)\bigr)\bigr\| \le \|u_1 - u_2\|/2 \le d/2.
\]
Inequality (3.4) follows now from (3.6). Thus, it remains to establish (3.5). To this end, we set v_{1,2} = P_N S(u_{1,2}) and note that, in view of (2.1),
\[
\|v_1 - v_2\| \le C(R)\,d. \tag{3.7}
\]
Since b_j ≠ 0 for 1 ≤ j ≤ N, Condition (D) implies that D(P_N η_1) = p(x) dx, where dx is the Lebesgue measure on the finite-dimensional space H_N and
\[
p(x) = \prod_{j=1}^{N} q_j(x_j), \qquad q_j(x_j) = b_j^{-1} p_j(x_j/b_j), \qquad x = (x_1, \dots, x_N) \in H_N,
\]
is a bounded function with support in the set P_N K. It follows that ν_{1,2} = D(v_{1,2} + P_N η_1) = p(x − v_{1,2}) dx.



Therefore, by (1.1),
\[
\|\nu_1 - \nu_2\|_{\mathrm{var}} = \frac12 \int_{H_N} |p(x - v_1) - p(x - v_2)|\,dx.
\]
We claim that
\[
\int_{H_N} |p(x - v_1) - p(x - v_2)|\,dx \le |v_1 - v_2| \sum_{j=1}^{N} b_j^{-1} \operatorname{Var}(p_j), \tag{3.8}
\]

where Var(p_j) stands for the total variation of p_j. The required inequality (3.5) follows immediately from (3.7) and (3.8). To prove (3.8), we first assume that p_j are C^1-smooth functions. In this case, we have
\[
\begin{aligned}
\int_{H_N} |p(x - v_1) - p(x - v_2)|\,dx
&\le |v_1 - v_2| \int_{H_N}\!\int_0^1 \bigl|(\nabla p)(x - \theta v_1 - (1-\theta)v_2)\bigr|\,d\theta\,dx \\
&= |v_1 - v_2| \int_{H_N} |(\nabla p)(x)|\,dx
\le |v_1 - v_2| \sum_{j=1}^{N} \int_{\mathbb{R}} |\partial_{x_j} q_j(x_j)|\,dx_j \\
&= |v_1 - v_2| \sum_{j=1}^{N} \operatorname{Var}(q_j).
\end{aligned}
\]
It remains to note that Var(q_j) = b_j^{-1} Var(p_j). Inequality (3.8) in the general case can be easily derived by a standard approximation procedure; we omit the corresponding arguments.

We now combine Lemmas 3.1 and 3.2 to obtain a coupling U^k_{1,2}(u_1, u_2) for the measures µ_{u_{1,2}}(k), k ≥ 1. Let l = l(d) and C > 0 be the constants in Lemmas 3.1 and 3.2 and let d_0 > 0 be so small that Cd_0 ≤ 1/8. We set d_r = 2^{−r} d_0, r ≥ 1. For a probability space (Ω, F, P), we shall denote by (Ω^k, F^k, P^k) the direct product of its k independent copies. Points of the latter will be denoted by ω^k = (ω_1, . . . , ω_k).

Lemma 3.3. Suppose that the conditions of Lemma 3.2 are satisfied. Let u_1, u_2 ∈ A and d = ‖u_1 − u_2‖. Then for any k ≥ 1 the measures µ_{u_{1,2}}(k) admit a coupling U^k_{1,2} = U^k_{1,2}(u_1, u_2; ω^k), ω^k ∈ Ω^k, such that the following assertions hold:
(i) The maps U^k_{1,2}(u_1, u_2; ω^k) are measurable with respect to (u_1, u_2, ω^k) ∈ A^2 × Ω^k.
(ii) There is a constant θ > 0 not depending on u_1, u_2, and k such that
\[
P^k\{\|U_1^k - U_2^k\| \le d_r\} \ge \theta \quad\text{for all } k \ge r + l(d_0),\ u_1, u_2 \in A. \tag{3.9}
\]



(iii) If ‖u_1 − u_2‖ ≤ d_r, then
\[
P^k\{\|U_1^k - U_2^k\| \le d_{k+r}\} \ge 1 - 2^{-r-1} \quad\text{for all } k \ge 1,\ r \ge 0. \tag{3.10}
\]

Proof. Let us recall that for any (u_1, u_2) ∈ B × B a coupling V_{1,2}(u_1, u_2; ω) was constructed in Lemma 3.2. We set
\[
U_j(u_1, u_2; \omega) =
\begin{cases}
V_j(u_1, u_2; \omega) & \text{if } \|u_1 - u_2\| \le d_0, \\
F^{\omega}(u_j) & \text{if } \|u_1 - u_2\| > d_0,
\end{cases}
\]
where j = 1, 2 and F^ω(u) is given by (2.4). We define random variables U^k_{1,2} on (Ω^k, F^k) by the following rule: if ‖u_1 − u_2‖ > d_0, then
\[
U_j^k(u_1, u_2; \omega^k) = F^{\omega_k} \circ \cdots \circ F^{\omega_1}(u_j) \quad\text{for } k \le l(d_0)
\]
and
\[
U_j^k(u_1, u_2; \omega^k) = U_j\bigl(U_1^{k-1}(u_1, u_2; \omega^{k-1}),\, U_2^{k-1}(u_1, u_2; \omega^{k-1});\, \omega_k\bigr) \tag{3.11}
\]
for k > l(d_0), where ω^k = (ω^{k−1}, ω_k) = (ω_1, . . . , ω_k) and U_j^0(u_1, u_2) = u_j. If ‖u_1 − u_2‖ ≤ d_0, then U^0_{1,2}(u_1, u_2) = u_{1,2} and for k ≥ 1 the random variables U_j^k(u_1, u_2; ω^k) are inductively defined by (3.11).

We claim that U^k_{1,2} satisfy assertions (i)–(iii) of the lemma. Indeed, the measurability of the maps U^k_{1,2} is obvious since they are compositions of measurable maps. To prove (3.9), we first note that it is sufficient to consider the case k = l + r, l = l(d_0). We introduce the following events in Ω^{l+r}:
\[
Q^+ = \{\|U_1^l - U_2^l\| \le d_0\}, \qquad Q^- = \{\|U_1^l - U_2^l\| > d_0\}, \qquad Q = \{\|U_1^{l+r} - U_2^{l+r}\| \le d_r\}.
\]
By Lemma 3.1, we have
\[
P^k(Q) = P^k(Q \mid Q^+)\,P(Q^+) + P^k(Q \mid Q^-)\,P(Q^-) \ge \varkappa\, P^k(Q \mid Q^+). \tag{3.12}
\]
If we assume that (3.10) is proved for r = 0, then (3.12) will imply the required estimate (3.9) with θ = ϰ/2. Thus, it remains to establish (iii). For a fixed r ≥ 0, we set
\[
Q_k^+ = \{\|U_1^k - U_2^k\| \le d_{k+r}\}, \qquad Q_k^- = \{\|U_1^k - U_2^k\| > d_{k+r}\}
\]
and denote by p_k^+ and p_k^- the probabilities of Q_k^+ and Q_k^-, respectively. Using (3.4) with d = d_{k+r−1}, we derive
\[
p_k^+ = p_{k-1}^+\, P^k(Q_k^+ \mid Q_{k-1}^+) + p_{k-1}^-\, P^k(Q_k^+ \mid Q_{k-1}^-) \ge (1 - Cd_{k+r-1})\, p_{k-1}^+ .
\]
Since p_0^+ = 1, iteration of this estimate results in
\[
p_k^+ \ge \lambda := \prod_{j=0}^{k-1} (1 - Cd_{j+r}). \tag{3.13}
\]



Since d_m = 2^{−m} d_0 and Cd_0 ≤ 1/8, we have
\[
\log\lambda = \sum_{j=0}^{k-1} \log(1 - Cd_{j+r}) \ge -2C \sum_{j=0}^{k-1} d_{j+r} \ge -2Cd_0 \sum_{j=0}^{\infty} 2^{-(j+r)} = -2^{2-r} Cd_0 \ge -2^{-r-1}.
\]

Therefore, λ ≥ 1 − 2^{−r−1}.

3.2. Proof of Theorem 2.1. As was mentioned at the end of Sect. 2, it is sufficient to establish inequality (2.6). In what follows, to simplify notation, we shall write P instead of P^k.

(1) Let us fix arbitrary u_1, u_2 ∈ A and set T_0 = 0 and T_r = T_{r−1} + r + l for r ≥ 1, i.e., T_r = r(r + 1)/2 + rl. We claim that for any integer r ≥ 0 there is a coupling y_{1,2}(T_r) on Ω^{T_r} for the measures µ_{u_{1,2}}(T_r) such that
\[
P\{\|y_1(T_r) - y_2(T_r)\| > d_r\} \le C_1 \gamma^r, \tag{3.14}
\]
where C_1 and γ < 1 are some positive constants. The construction of y_{1,2}(T_r) = y_{1,2}(T_r, u_1, u_2; ω^{T_r}) and the proof of (3.14) are by induction. For r = 0, we set y_j(0) = u_j, and inequality (3.14) with C_1 ≥ 1 is trivial in this case. Assuming that y_{1,2}(T_i) are constructed for 0 ≤ i ≤ r, we set
\[
y_j(T_{r+1}, u_1, u_2; \omega^{T_{r+1}}) = U_j^{r+l+1}\bigl(y_1(T_r, u_{1,2}; \omega^{T_r}),\, y_2(T_r, u_{1,2}; \omega^{T_r});\, \omega^{r+l+1}\bigr), \tag{3.15}
\]
where U^k_{1,2}(u_1, u_2; ω^k) are defined in Lemma 3.3 and ω^{T_{r+1}} = (ω^{T_r}, ω^{r+l+1}). Let us introduce the events
\[
Q_r^+ = \{\|y_1(T_r) - y_2(T_r)\| \le d_r\}, \qquad Q_r^- = \{\|y_1(T_r) - y_2(T_r)\| > d_r\}
\]
and denote by p_r^+ and p_r^- their probabilities. Then, in view of (3.9) and (3.10) with k = r + l, we have (cf. (3.12))
\[
p_{r+1}^- = P(Q_{r+1}^- \mid Q_r^+)\,P(Q_r^+) + P(Q_{r+1}^- \mid Q_r^-)\,P(Q_r^-) \le 2^{-r-1} p_r^+ + (1 - \theta)\,p_r^- \le 2^{-r-1} + \gamma\, p_r^-, \tag{3.16}
\]
where γ = 1 − θ. Without loss of generality, we can assume that 0 < θ < 1/2, and therefore 1 < 2γ < 2. Iterating (3.16), we obtain
\[
p_{r+1}^- \le 2^{-r-1} \sum_{j=0}^{r} (2\gamma)^j + \gamma^{r+1} p_0^- \le 2^{-r-1}\, \frac{(2\gamma)^{r+1} - 1}{2\gamma - 1} + \gamma^{r+1} \le C_1 \gamma^{r+1}.
\]
This completes the induction.



(2) We can now prove (2.6). Let us fix arbitrary positive integers r and m ≤ r + l and set k = T_r + m, so that T_r + 1 ≤ k < T_{r+1}. We define a coupling y_{1,2}(k) = y_{1,2}(k, u_1, u_2) for the measures µ_{u_{1,2}}(k) by the formula (cf. (3.15))
\[
y_j(k, u_1, u_2; \omega^k) = U_j^m\bigl(y_1(T_r, u_1, u_2; \omega^{T_r}),\, y_2(T_r, u_1, u_2; \omega^{T_r});\, \omega^m\bigr).
\]
In view of (3.10) and (3.14), we have (cf. (3.16))
\[
P\{\|y_1(k) - y_2(k)\| > d_{r+1}\} \le P(Q_r^-) + 2^{-r-1} P(Q_r^+) \le C_2 \gamma^r, \tag{3.17}
\]
where C_2 > 0 is a constant. Now note that r^2/2 ≤ T_r ≤ (l + 1)r^2 for any r ≥ 0 and therefore there are positive constants C and c such that
\[
d_{r+1} \le C\, e^{-c\sqrt{k}}, \qquad C_2 \gamma^r \le C\, e^{-c\sqrt{k}} \quad\text{for } T_r \le k < T_{r+1}.
\]
Combining this with (3.17), we derive
\[
P\bigl\{\|y_1(k, u_1, u_2) - y_2(k, u_1, u_2)\| \ge C\, e^{-c\sqrt{k}}\bigr\} \le C\, e^{-c\sqrt{k}}. \tag{3.18}
\]
By Lemma 1.3, inequality (3.18) implies that
\[
\|\mu_{u_1}(k) - \mu_{u_2}(k)\|_L^* \le 3C\, e^{-c\sqrt{k}},
\]
which completes the proof of (2.6) with k_0 = T_1. Theorem 2.1 is proved.
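The passage from the geometric rate γ^r at the times T_r to the rate e^{−c√k} in k is pure arithmetic, and it can be checked numerically. The sketch below is an illustration only; the helper names `T`, `level` and `rate_bound`, and the explicit constant c = −log γ / √(l + 1), are choices made here and are not taken from the paper. It uses T_r = r(r + 1)/2 + rl, finds the index r with T_r ≤ k < T_{r+1}, and compares γ^r with γ^{−1} e^{−c√k}.

```python
import math


def T(r, l):
    """T_r = r(r + 1)/2 + r*l, i.e. T_0 = 0 and T_r = T_{r-1} + r + l."""
    return r * (r + 1) // 2 + r * l


def level(k, l):
    """The integer r >= 0 with T_r <= k < T_{r+1}."""
    r = 0
    while T(r + 1, l) <= k:
        r += 1
    return r


def rate_bound(k, l, gamma):
    """Return (gamma**r, exp(-c*sqrt(k)) / gamma) with c = -log(gamma)/sqrt(l+1).

    Since k < T_{r+1} <= (l + 1)(r + 1)^2, one has r > sqrt(k/(l+1)) - 1,
    so the first number does not exceed the second.
    """
    r = level(k, l)
    c = -math.log(gamma) / math.sqrt(l + 1)
    return gamma ** r, math.exp(-c * math.sqrt(k)) / gamma


for k in (1, 10, 100, 1000):
    print(k, level(k, 3), rate_bound(k, 3, 0.8))
```

The square root appears because the block lengths r + l grow linearly, so reaching level r costs about r^2 steps.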

4. Appendix: Coupling

In this appendix, we present some results on the coupling in finite-dimensional spaces in the form which we learned from S. Foss. These results are well known (e.g., see [Lin, V] for Lemma 4.1 and [BF] for Lemma 4.3).

Let ν_1, ν_2 ∈ P(R^N) be two measures absolutely continuous with respect to the Lebesgue measure dx: ν_{1,2}(dx) = p_{1,2}(x) dx. We set
\[
\rho := \|\nu_1 - \nu_2\|_{\mathrm{var}} = \frac12 \int_{\mathbb{R}^N} |p_1(x) - p_2(x)|\,dx \tag{4.1}
\]
and assume first that 0 < ρ < 1. Let
\[
p := (1-\rho)^{-1}\, p_1 \wedge p_2, \qquad \hat p_{1,2} := \rho^{-1}\,(p_{1,2} - p_1 \wedge p_2). \tag{4.2}
\]
For ρ = 1 or 0, we define p(x) and p̂_{1,2}(x) as follows:
\[
\begin{aligned}
p(x) &\equiv 0, & \hat p_{1,2}(x) &\equiv p_{1,2}(x) && \text{if } \rho = 1, \\
p(x) &\equiv p_1(x), & \hat p_{1,2}(x) &\equiv 0 && \text{if } \rho = 0.
\end{aligned} \tag{4.3}
\]
It is clear that
\[
p_{1,2}(x) = (1-\rho)\,p(x) + \rho\,\hat p_{1,2}(x) \quad\text{almost everywhere.} \tag{4.4}
\]



If $(\xi_1, \xi_2)$ is a coupling for the measures $(\nu_1, \nu_2)$, then for any $\Gamma \in \mathcal B(\mathbb R^N)$ we have

$$\nu_1(\Gamma) - \nu_2(\Gamma) = E\bigl(\chi_\Gamma(\xi_1) - \chi_\Gamma(\xi_2)\bigr) = E\bigl[\chi_{\{\xi_1 \ne \xi_2\}}\bigl(\chi_\Gamma(\xi_1) - \chi_\Gamma(\xi_2)\bigr)\bigr] \le P\{\xi_1 \ne \xi_2\}.$$

Therefore,

$$P\{\xi_1 \ne \xi_2\} \ge \rho \equiv \|\nu_1 - \nu_2\|_{\mathrm{var}}.$$

A coupling $(\xi_1, \xi_2)$ for $(\nu_1, \nu_2)$ is said to be maximal if $P\{\xi_1 \ne \xi_2\} = \rho \equiv \|\nu_1 - \nu_2\|_{\mathrm{var}}$.

Lemma 4.1. Let $\xi_{1,2}$, $\xi$, and $\alpha$ be independent random variables such that

$$P\{\alpha = 1\} = 1 - \rho, \quad P\{\alpha = 0\} = \rho, \quad \mathcal D(\xi) = p(x)\,dx, \quad \mathcal D(\xi_{1,2}) = \hat p_{1,2}(x)\,dx. \tag{4.5}$$

Then the random variables

$$\Xi_{1,2} = \alpha\xi + (1-\alpha)\xi_{1,2} \tag{4.6}$$

form a maximal coupling for $\nu_{1,2}$.

Proof. Since $\xi_1$ and $\xi_2$ are independent and their distributions possess densities with respect to the Lebesgue measure, we have $P\{\xi_1 = \xi_2\} = 0$. Taking into account the relation $\alpha(1-\alpha) \equiv 0$, we get

$$\mathcal D(\Xi_{1,2}) = p_{1,2}(x)\,dx = \nu_{1,2}, \qquad P\{\Xi_1 \ne \Xi_2\} = P\{\alpha = 0\} = \rho,$$

which completes the proof.

Let us now assume that $\varphi$ is a random variable in $\mathbb R^N$ with the distribution $\mathcal D(\varphi) = q(x)\,dx$, where $q \in L^1(\mathbb R^N)$. Consider the following family of measures depending on a parameter $v \in \mathbb R^N$:

$$\nu_v(dx) = \mathcal D(v + \varphi) = q(x - v)\,dx.$$

Let $\rho(v_1, v_2)$ be the variation distance between $\nu_{v_1}$ and $\nu_{v_2}$. It is clear from (4.1) that $\rho(v_1, v_2)$ is measurable with respect to $(v_1, v_2) \in \mathbb R^{2N}$. In the construction above, let us take $\nu_{1,2} = \nu_{v_{1,2}}$. Then

$$p(x) = p(x; v_1, v_2), \qquad \hat p_{1,2}(x) = \hat p_{1,2}(x; v_1, v_2).$$

Clearly, the functions $p(x; v_1, v_2)$ and $\hat p_{1,2}(x; v_1, v_2)$ are measurable with respect to $(x, v_1, v_2)$. Using the above observations, we construct a coupling for $(\nu_{v_1}, \nu_{v_2})$ that is measurable with respect to $(v_1, v_2, \omega)$. Namely, we have the following result:

Theorem 4.2. There is a probability space $(\Omega, \mathcal F, P)$ such that for any pair $(v_1, v_2) \in \mathbb R^{2N}$ there are random variables $\Xi_{1,2} = \Xi_{1,2}(v_1, v_2; \omega)$ satisfying the following properties:
(i) The pair $(\Xi_1, \Xi_2)$ is a maximal coupling for $(\nu_{v_1}, \nu_{v_2})$.
(ii) The map $\Xi_{1,2}(v_1, v_2; \omega) : \mathbb R^{2N} \times \Omega \to \mathbb R^N$ is measurable with respect to the σ-algebra $\mathcal B(\mathbb R^{2N}) \times \mathcal F$.
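The construction of Lemma 4.1 is easy to check numerically. The following Python sketch is an illustration only: it replaces the densities on R^N by two probability vectors on a finite set (an assumption made here so that everything can be computed exactly), forms the common part and the two remainders as in (4.2), and verifies that the resulting pair has the right marginals and differs with probability exactly ρ. The function name `maximal_coupling` and the sample vectors are hypothetical.

```python
from fractions import Fraction as F

def maximal_coupling(p1, p2):
    """Discrete analogue of Lemma 4.1 (illustrative; assumes 0 < rho < 1).

    p1, p2: lists of Fractions summing to 1 on a common finite set.
    Returns (rho, joint), where joint[(i, j)] is the law of the pair
    (Xi_1, Xi_2) = alpha*xi + (1 - alpha)*xi_{1,2}.
    """
    n = len(p1)
    common = [min(a, b) for a, b in zip(p1, p2)]          # p1 ∧ p2
    rho = sum(abs(a - b) for a, b in zip(p1, p2)) / 2      # variation distance, cf. (4.1)
    assert 0 < rho < 1
    p = [c / (1 - rho) for c in common]                    # law of xi
    p1_hat = [(a - c) / rho for a, c in zip(p1, common)]   # law of xi_1
    p2_hat = [(b - c) / rho for b, c in zip(p2, common)]   # law of xi_2
    joint = {}
    for i in range(n):
        # alpha = 1 (probability 1 - rho): both components equal xi
        joint[(i, i)] = joint.get((i, i), 0) + (1 - rho) * p[i]
        for j in range(n):
            # alpha = 0 (probability rho): independent xi_1 and xi_2
            joint[(i, j)] = joint.get((i, j), 0) + rho * p1_hat[i] * p2_hat[j]
    return rho, joint

p1 = [F(1, 2), F(1, 3), F(1, 6)]
p2 = [F(1, 4), F(1, 4), F(1, 2)]
rho, joint = maximal_coupling(p1, p2)

# Marginals of the coupling and the probability that the two components differ.
marg1 = [sum(joint[(i, j)] for j in range(3)) for i in range(3)]
marg2 = [sum(joint[(i, j)] for i in range(3)) for j in range(3)]
p_differ = sum(w for (i, j), w in joint.items() if i != j)
```

In the discrete setting the remainders have disjoint supports, which plays the role of the identity P{ξ1 = ξ2} = 0 used in the proof.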



To prove the theorem, we shall need the lemma below:

Lemma 4.3. Let $\mu_z \in \mathcal P(\mathbb R^N)$, $z \in \mathbb R^d$, be a family of probability measures such that $\mu_z(dx) = p_z(x)\,dx$, where $p_z \in L^1(\mathbb R^N_x)$ for each $z \in \mathbb R^d$ and $p_z(x)$ is measurable as a function of $(x, z) \in \mathbb R^N \times \mathbb R^d$. Then there is a probability space $(\Omega, \mathcal F, P)$ and a family of random variables $\zeta_z : \Omega \to \mathbb R^N$ such that $\mathcal D(\zeta_z) = \mu_z$ for all $z \in \mathbb R^d$ and $\zeta_z(\omega)$ is measurable with respect to $(z, \omega)$.

Proof. If $N = 1$, then we take $(\Omega, \mathcal F, P) = ([0,1], \mathcal B, dt)$, where $\mathcal B$ is the Borel σ-algebra and $dt$ is the Lebesgue measure. Denoting by $F_z(\lambda)$ the distribution function of the measure $\mu_z$, $F_z(\lambda) = \mu_z((-\infty, \lambda])$, we set

$$\zeta_z(t) = \min\{\lambda : F_z(\lambda) \ge t\}.$$

The map $(t, z) \mapsto \zeta_z(t)$ from $[0,1] \times \mathbb R^d$ to $\mathbb R$ is measurable, and the distribution function of $\mathcal D(\zeta_z)$ is equal to $F_z$. Thus, for $N = 1$ the lemma is proved.

We now assume that the required assertion is established for $N = L$ and prove it for $N = L + 1$. Let us write $x \in \mathbb R^{L+1}$ as $x = (x', y)$, where $x' \in \mathbb R^L$ and $y \in \mathbb R$. Decomposing $\mu_z$ in terms of the conditional density (see [GS]), we write

$$\mu_z(dx) = p_z(x)\,dx = p_z(x' \mid y)\,dx'\; q_z(y)\,dy. \tag{4.7}$$

Here

$$q_z(y) = \int_{\mathbb R^L} p_z(x', y)\,dx', \qquad p_z(x' \mid y) = \frac{p_z(x', y)}{q_z(y)},$$

where we set $0/0 = \infty/\infty = 0$. Applying the induction hypothesis with $z$ replaced by $(z, y)$, we find a probability space $(\Omega', \mathcal F', P')$ and a measurable map

$$\zeta'_z(\omega', y) : \Omega' \times \mathbb R^d \times \mathbb R \to \mathbb R^L$$

such that $\mathcal D\bigl(\zeta'_z(\cdot, y)\bigr) = p_z(x' \mid y)\,dx'$ for each $(z, y) \in \mathbb R^d \times \mathbb R$. Applying the first step of the proof, we construct a measurable map $\xi_z(t) : [0,1] \times \mathbb R^d \to \mathbb R$ such that $\mathcal D(\xi_z) = q_z(\lambda)\,d\lambda$. We now set $\Omega = \Omega' \times [0,1]$ and

$$\zeta_z(\omega', t) = \bigl(\zeta'_z(\omega', \xi_z(t)),\ \xi_z(t)\bigr) \in \mathbb R^{L+1}.$$

We have constructed a measurable map $\Omega \times \mathbb R^d \to \mathbb R^{L+1}$ such that, for any fixed $z \in \mathbb R^d$, its distribution is given by the right-hand side of (4.7).

Proof of Theorem 4.2. Applying Lemma 4.3 to measures in $\mathbb R^N$ given by the densities $p$ and $\hat p_{1,2}$, we construct probability spaces $(F_j, S_j, P_j)$, $j = 0, 1, 2$, and random variables $\xi^j_{(v_1,v_2)}$ on $F_j$ such that

$$\mathcal D\bigl(\xi^0_{(v_1,v_2)}\bigr) = p(x; v_1, v_2)\,dx, \qquad \mathcal D\bigl(\xi^j_{(v_1,v_2)}\bigr) = \hat p_j(x; v_1, v_2)\,dx, \quad j = 1, 2. \tag{4.8}$$

We also define a random variable $\alpha_\rho : [0,1] \to \{0, 1\}$, $\rho = \rho(v_1, v_2)$, by the formula

$$\alpha_\rho(t) = \chi_{[0,1-\rho]}(t),$$

where $[0,1]$ is endowed with the Borel σ-algebra and the Lebesgue measure, and $\chi_{[0,r]}$ is the characteristic function of the interval $[0, r]$. We now define the required probability space as the set $\Omega = F_0 \times F_1 \times F_2 \times [0,1]$ with the σ-algebra and the probability of direct product. The natural extensions³ of $\alpha_\rho$ and $\xi^j_{(v_1,v_2)}$, $j = 0, 1, 2$, to $\Omega$ (for which we retain the same notations) form a quadruple of independent random variables satisfying (4.8) and also the relations

$$P\{\alpha_\rho = 1\} = 1 - \rho(v_1, v_2), \qquad P\{\alpha_\rho = 0\} = \rho(v_1, v_2).$$

A maximal coupling $(\Xi_1, \Xi_2)$ for the measures $(\nu_{v_1}, \nu_{v_2})$ that satisfies assertion (ii) of the theorem can now be defined by formula (4.6), in which $\alpha = \alpha_\rho$, $\xi = \xi^0_{(v_1,v_2)}$, and $\xi_j = \xi^j_{(v_1,v_2)}$, $j = 1, 2$.
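The one-dimensional step in the proof of Lemma 4.3 is the classical quantile transform: a uniform variable on [0, 1] is pushed through the generalized inverse of the distribution function. A minimal Python sketch follows, assuming for illustration a measure with finitely many atoms (the lemma itself deals with densities); the helper name `quantile` and the sample data are hypothetical.

```python
from fractions import Fraction as F

def quantile(atoms, weights, t):
    """Return min{lam : F(lam) >= t} for the measure sum_k weights[k] * delta_{atoms[k]}.

    atoms must be sorted increasingly, weights are positive and sum to 1,
    and t must lie in (0, 1].
    """
    cumulative = 0
    for lam, w in zip(atoms, weights):
        cumulative += w              # F(lam) = mass of (-inf, lam]
        if cumulative >= t:
            return lam
    raise ValueError("t must lie in (0, 1]")

atoms = [-1, 0, 2]
weights = [F(1, 5), F(1, 2), F(3, 10)]

# Push the uniform grid t = k/1000 through the quantile map: the set of t
# mapped to a given atom is an interval whose length is the mass of that atom.
counts = {a: 0 for a in atoms}
for k in range(1, 1001):
    counts[quantile(atoms, weights, F(k, 1000))] += 1
```

The grid counts are proportional to the weights, which is the discrete counterpart of the statement that the distribution function of the constructed variable equals the prescribed one.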

Acknowledgements. The authors thank Roger Tribe and Sergei Foss for fruitful discussions of the coupling approach during the Symposium “Stochastic Fluid Equations” in Warwick on January 19–20, 2001, and at seminars in Heriot-Watt University, respectively. The authors are also grateful to Jan Kristensen for useful remarks on functional analysis. This research was supported by EPSRC, grant GR/N63055/01.

References

[BF] Borovkov, A.A., Foss, S.G.: Stochastically recursive sequences and their generalizations. Siberian Adv. in Math. 2, no. 1, 16–81 (1992)
[BL] Bressaud, X., Liverani, C.: Anosov diffeomorphism and coupling. To appear in Ergodic Theory Dynam. Systems
[BKL] Bricmont, J., Kupiainen, A., Lefevere, R.: Exponential mixing for the 2D stochastic Navier–Stokes dynamics. Preprint
[EMS] E, W., Mattingly, J.C., Sinai, Ya.G.: Gibbsian dynamics and ergodicity for the stochastically forced Navier–Stokes equation. Preprint
[GS] Gihman, I.I., Skorohod, A.V.: The Theory of Stochastic Processes I. Berlin–Heidelberg–New York: Springer-Verlag, 1980
[H] Hörmander, L.: The Analysis of Linear Partial Differential Operators I. Distribution Theory and Fourier Analysis. Berlin: Springer-Verlag, 1983
[KS1] Kuksin, S., Shirikyan, A.: Stochastic dissipative PDE’s and Gibbs measures. Comm. Math. Phys. 213, 291–330 (2000)
[KS2] Kuksin, S., Shirikyan, A.: On dissipative systems perturbed by bounded random kick-forces. Submitted to Ergodic Theory Dynam. Systems (www.ma.hw.ac.uk/kuksin)
[KS3] Kuksin, S., Shirikyan, A.: Ergodicity for the randomly forced 2D Navier–Stokes equations. Preprint. (www.ma.hw.ac.uk/kuksin)
[Lin] Lindvall, T.: Lectures on the Coupling Method. New York: John Wiley & Sons, 1992
[V] Veretennikov, A.Yu.: Parametric and non-parametric estimation of Markov chains. Moscow: Moscow State University Press, 2000 (in Russian)
[Y] Young, L.-S.: Recurrence times and rates of mixing. Israel J. Math. 110, 153–188 (1999)

Communicated by G. Gallavotti

3 For instance, the extension of $\alpha_\rho$ is given by $\alpha_\rho(\omega) = \alpha_\rho(t)$, where $\omega = (\omega_0, \omega_1, \omega_2, t) \in \Omega$.

Commun. Math. Phys. 221, 367 – 384 (2001)




Loop Homotopy Algebras in Closed String Field Theory

Martin Markl

Mathematical Institute of the Academy, Žitná 25, 11567 Prague 1, The Czech Republic. E-mail: [email protected]

Received: 10 November 1999 / Accepted: 29 March 2001

Abstract: Barton Zwiebach constructed [20] “string products” on the Hilbert space of a combined conformal field theory of matter and ghosts, satisfying the “main identity”. It has been well known that the “tree level” of the theory gives an example of a strongly homotopy Lie algebra (though, as we will see later, this is not the whole truth). Strongly homotopy Lie algebras are now well-understood objects. On the one hand, a strongly homotopy Lie algebra is given by a square zero coderivation on the cofree cocommutative connected coalgebra [14, 13]; on the other hand, strongly homotopy Lie algebras are algebras over the cobar dual of the operad Com for commutative algebras [9]. As far as we know, no such characterization of the structure of string products for arbitrary genera has been available, though there are two series of papers directly pointing towards the requisite characterization. As far as the characterization in terms of (co)derivations is concerned, we need the concept of higher order (co)derivations, which has been developed, for example, in [2, 3]. These higher order derivations were used in the analysis of the “master identity.” For our characterization we need to understand the behavior of these higher (co)derivations on (co)free (co)algebras. The necessary machinery for the operadic approach is that of modular operads, anticipated in [5] and introduced in [8]. We believe that the modular operad structure on the compactified moduli space of Riemann surfaces of arbitrary genera implies the existence of the structure we are interested in, in the same manner as was explained for the tree level in [11]. We also indicate how to adapt the loop homotopy structure to the case of open string field theory [19].

1. Introduction

Let $\mathcal H$ be the Hilbert space of a combined conformal field theory of matter and ghosts and let $\mathcal H_{rel} \subset \mathcal H$ be the subspace of elements annihilated by $b_0^- := b_0 - \bar b_0$ and $L_0^- := L_0 - \bar L_0$ (see, for example, [11, Sect. 4]). Barton Zwiebach constructed in [20], for each “genus” $g \ge 0$ and for each $n \ge 0$, multilinear “string products”

$$\mathcal H_{rel}^{\otimes n} \ni B_1 \times \cdots \times B_n \longmapsto [B_1, \dots, B_n]_g \in \mathcal H_{rel}.$$

Recall the basic properties of these products. If gh(−) denotes the ghost number, then [20, (4.8)]

$$\mathrm{gh}([B_1, \dots, B_n]_g) = 3 - 2n + \sum_{i=1}^{n} \mathrm{gh}(B_i).$$

The string products are graded (super) commutative [20, (4.4)]:

$$[B_1, \dots, B_i, B_{i+1}, \dots, B_n]_g = (-1)^{B_i B_{i+1}} [B_1, \dots, B_{i+1}, B_i, \dots, B_n]_g. \tag{1}$$

Here we used the notation $(-1)^{B_i B_{i+1}} := (-1)^{\mathrm{gh}(B_i)\mathrm{gh}(B_{i+1})}$. For $n = 0$ and $g \ge 0$, $[\,.\,]_g \in \mathcal H_{rel}$ is just a constant, and the products are constructed in such a way that $[\,.\,]_0 = 0$ [20, (4.6)]. The linear operation $[B]_0 =: QB$ is identified with the BRST differential of the theory. These products satisfy, for all $n$, $g$, the main identity [20, (4.13)]:

$$0 = \sum \sigma(i_l, j_k)\, \bigl[B_{i_1}, \dots, B_{i_l}, [B_{j_1}, \dots, B_{j_k}]_{g_2}\bigr]_{g_1} + \frac12 \sum_s (-1)^{\Phi_s} [\Phi_s, \Phi^s, B_1, \dots, B_n]_{g-1}. \tag{2}$$

Here the first sum runs over all $g_1 + g_2 = g$, $k + l = n$, and all sequences $i_1 < \cdots < i_l$, $j_1 < \cdots < j_k$ such that $\{i_1, \dots, i_l, j_1, \dots, j_k\} = \{1, \dots, n\}$. Such sequences are called unshuffles (see the terminology introduced at the beginning of Sect. 2). The sign $\sigma(i_l, j_k)$ is picked up by rearranging the sequence $(Q, B_1, \dots, B_n)$ into the order $(B_{i_1}, \dots, B_{i_l}, Q, B_{j_1}, \dots, B_{j_k})$. In the second sum, $\{\Phi_s\}$ is a basis of $\mathcal H_{rel}$ and $\{\Phi^s\}$ its dual basis in the sense that $(-1)^{\Phi^r} \langle \Phi^r, \Phi_s \rangle = \delta^r_s$ (Kronecker delta), where $\langle -, - \rangle$ denotes the bilinear inner product on $\mathcal H$ [20, (2.44)]. Let us remark that, in the original formulation of [20], $\{\Phi_s\}$ was a basis of the whole $\mathcal H$, but the sum in (2) was restricted to $\mathcal H_{rel}$. The product satisfies [20, (2.62)]:

$$\langle A, B \rangle = (-1)^{(A+1)(B+1)} \langle B, A \rangle \tag{3}$$

and it is nontrivial only for elements whose ghost numbers add up to five:

$$\text{if } \langle A, B \rangle \ne 0, \text{ then } \mathrm{gh}(A) + \mathrm{gh}(B) = 5. \tag{4}$$

The above two conditions in fact imply that $\langle A, B \rangle = \langle B, A \rangle$. Moreover, the product $\langle -, - \rangle$ is $Q$-invariant [20, (2.63)]:

$$\langle QA, B \rangle = (-1)^{A} \langle A, QB \rangle. \tag{5}$$

Conditions (3) and (4) also imply that the element $\sum_s (-1)^{\Phi_s} \Phi_s \otimes \Phi^s \in \mathcal H_{rel}^{\otimes 2}$ is symmetric in the sense that

$$(-1)^{\Phi_s} \Phi_s \otimes \Phi^s = (-1)^{\Phi_s} \Phi^s \otimes \Phi_s = -(-1)^{\Phi^s} \Phi^s \otimes \Phi_s. \tag{6}$$

We use, in the previous formula as well as at many places in the rest of the paper, the Einstein convention of summing over repeated indices. The last important property of string products is that the element

$$\Phi^s \otimes [\Phi_s, B_1, \dots, B_{n-1}]_g \in \mathcal H_{rel}^{\otimes 2} \tag{7}$$

is antisymmetric. This property is not explicitly stated in [20], though it is used in the proof of the identity [20, (4.28)]:

$$\sum_s \bigl[B_1, \dots, B_l, \Phi^s, [\Phi_s, A_1, \dots, A_k]_{g_2}\bigr]_{g_1} = 0, \quad \text{for arbitrary } l \ge 0,\ k \ge 0,$$

which then immediately follows from the antisymmetry (7) by the graded commutativity (1) of string products. Eq. (7) is a consequence of the important fact that the string products are defined with the aid of the multilinear string functions [20, (7.72)]

$$\mathcal H_{rel}^{\otimes (n+1)} \ni B_0, \dots, B_n \longmapsto \{B_0, \dots, B_n\}_g \in \mathbb C$$

by [20, (4.33)]

$$[B_1, \dots, B_n]_g := \sum_t (-1)^{\Phi_t}\, \Phi_t \cdot \{\Phi^t, B_1, \dots, B_n\}_g. \tag{8}$$

Let us show that the graded commutativity [20, (4.36)]

$$\{B_0, \dots, B_i, B_{i+1}, \dots, B_n\}_g = (-1)^{B_i B_{i+1}} \{B_0, \dots, B_{i+1}, B_i, \dots, B_n\}_g$$

of the string multilinear functions implies the antisymmetry of the element in (7). Indeed, because of (6), we may write (8) as

$$[B_1, \dots, B_n]_g = \sum_t (-1)^{\Phi_t}\, \Phi^t \cdot \{\Phi_t, B_1, \dots, B_n\}_g,$$

thus the element in (7) takes the form

$$\sum_{s,t} (-1)^{\Phi_t} (\Phi^s \otimes \Phi^t) \cdot \{\Phi_t, \Phi_s, B_1, \dots, B_{n-1}\}_g.$$

The antisymmetry we are proving means that

$$\sum_{s,t} (-1)^{\Phi_t}\, \Phi^s \otimes \Phi^t \cdot \{\Phi_t, \Phi_s, B_1, \dots, B_{n-1}\}_g = -\sum_{s,t} (-1)^{\Phi_t + \Phi^s \Phi^t}\, \Phi^t \otimes \Phi^s \cdot \{\Phi_t, \Phi_s, B_1, \dots, B_{n-1}\}_g.$$

The replacement $t \leftrightarrow s$ in the right-hand side of the above equation gives

$$-\sum_{s,t} (-1)^{\Phi_s + \Phi^t \Phi^s}\, \Phi^s \otimes \Phi^t \cdot \{\Phi_s, \Phi_t, B_1, \dots, B_{n-1}\}_g,$$

which can be further rewritten, using the graded commutativity of string functions, as

$$-\sum_{s,t} (-1)^{\Phi_s + \Phi^t \Phi^s + \Phi_s \Phi_t}\, \Phi^s \otimes \Phi^t \cdot \{\Phi_t, \Phi_s, B_1, \dots, B_{n-1}\}_g. \tag{9}$$

Since $\mathrm{gh}(\Phi_s) \equiv \mathrm{gh}(\Phi^s) + 1 \pmod 2$ and $\mathrm{gh}(\Phi_t) \equiv \mathrm{gh}(\Phi^t) + 1 \pmod 2$,

$$\mathrm{gh}(\Phi^s)\mathrm{gh}(\Phi^t) \equiv \mathrm{gh}(\Phi_s)\mathrm{gh}(\Phi_t) + \mathrm{gh}(\Phi_s) + \mathrm{gh}(\Phi_t) + 1 \pmod 2,$$

therefore the sign factor in (9) is $(-1)^{\Phi_t}$. This proves the claim.

2. Sign Interlude and the Definition

In this brief section we rewrite the axioms of string products into a more usual and convenient formalism. All algebraic objects will be considered over a fixed field $\mathbf k$ of characteristic zero. This, of course, includes the case $\mathbf k = \mathbb C$ of the previous section. We will systematically use the Koszul sign convention meaning that whenever we commute two “things” of degrees $p$ and $q$, respectively, we multiply by the sign factor $(-1)^{pq}$. Our conventions concerning graded vector spaces, permutations, shuffles, etc., will follow closely those of [15].

For graded indeterminates $x_1, \dots, x_n$ and a permutation $\sigma \in \Sigma_n$ define the Koszul sign $\epsilon(\sigma) = \epsilon(\sigma; x_1, \dots, x_n)$ by

$$x_1 \wedge \cdots \wedge x_n = \epsilon(\sigma; x_1, \dots, x_n) \cdot x_{\sigma(1)} \wedge \cdots \wedge x_{\sigma(n)},$$

which is to be satisfied in the free graded commutative algebra $\wedge(x_1, \dots, x_n)$. Define also

$$\chi(\sigma) := \chi(\sigma; x_1, \dots, x_n) := \mathrm{sgn}(\sigma) \cdot \epsilon(\sigma; x_1, \dots, x_n).$$

We say that $\sigma \in \Sigma_n$ is an $(i, j)$-unshuffle, $i + j = n$, if $\sigma(1) < \cdots < \sigma(i)$ and $\sigma(i+1) < \cdots < \sigma(n)$. In this case we write $\sigma \in \mathrm{unsh}(i, j)$. In the obvious similar manner one may introduce $(i, j, k)$-unshuffles, etc.

Let us denote, for a graded vector space $U$, by $\uparrow U$ (resp. $\downarrow U$) the suspension (resp. the desuspension) of $U$, i.e. the graded vector space defined by $(\uparrow U)_p := U_{p-1}$ (resp. $(\downarrow U)_p := U_{p+1}$). We have the obvious natural maps $\uparrow : U \to\ \uparrow U$ and $\downarrow : U \to\ \downarrow U$. For a graded vector space $U$, let its reflection $r(U)$ be the graded vector space defined by $r(U)_p := U_{-p}$. There is an obvious natural map $r : U \to r(U)$. Observe that $r^2 = 1$, $r \circ \uparrow\ =\ \downarrow \circ\, r$ and $r \circ \downarrow\ =\ \uparrow \circ\, r$.

Take now $V := r(\downarrow \mathcal H_{rel})$. Define, for each $g \ge 0$ and $n \ge 0$, multilinear maps $l_n^g : V^{\otimes n} \to V$ by

$$l_n^g(v_1, \dots, v_n) := (-1)^{(n-1)v_1 + (n-2)v_2 + \cdots + v_{n-1}} \downarrow [\uparrow r(v_1), \dots, \uparrow r(v_n)]_g,$$

for $v_1, \dots, v_n \in V$. Define also the bilinear form $B : V \otimes V \to \mathbb C$ by

$$B(u, v) := \langle \uparrow r(u), \uparrow r(v) \rangle \tag{10}$$

and, finally, the element $h = h_s \otimes h^s$ by $h_s := (-1)^{\Phi_s} r(\downarrow \Phi_s)$, $h^s := r(\downarrow \Phi^s)$, which means that $h_s \otimes h^s := (-1)^{\Phi_s} r(\downarrow \Phi_s) \otimes r(\downarrow \Phi^s)$ (Einstein summation convention). A technical, but absolutely straightforward, calculation shows that the above structure is an example of a loop homotopy Lie algebra in the sense of the following definition.


Definition 1. A loop homotopy Lie algebra is a triple $V = (V, B, \{l_n^g\})$ consisting of
(i) a $\mathbb Z$-graded vector space $V$, $V_* = \bigoplus_i V_i$,
(ii) a graded symmetric nondegenerate bilinear degree $+3$ form $B : V \otimes V \to \mathbf k$, and
(iii) the set $\{l_n^g\}_{n, g \ge 0}$ of degree $n - 2$ multilinear antisymmetric operations $l_n^g : V^{\otimes n} \to V$.

These data are supposed to satisfy the following two axioms:

(A1) For any $n, g \ge 0$ and $v_1, \dots, v_n \in V$, the following “main identity”

$$0 = \sum_{\substack{k+l=n+1 \\ g_1+g_2=g}} \ \sum_{\sigma \in \mathrm{unsh}(l, n-l)} \chi(\sigma) (-1)^{l(k-1)}\, l_k^{g_1}\bigl(l_l^{g_2}(v_{\sigma(1)}, \dots, v_{\sigma(l)}), v_{\sigma(l+1)}, \dots, v_{\sigma(n)}\bigr) + \frac12 \sum_s (-1)^{h_s + n}\, l_{n+2}^{g-1}(h_s, h^s, v_1, \dots, v_n) \tag{11}$$

holds. In the second term, $\{h_s\}$ and $\{h^s\}$ are bases of the vector space $V$ dual to each other in the sense that

$$B(h_s, h^t) = \delta^t_s. \tag{12}$$

(A2) The element

$$(-1)^{(n+1)h_s}\, h_s \otimes l_n^g(h^s, v_1, \dots, v_{n-1}) \in V \otimes V \tag{13}$$

is symmetric, for all $g \ge 0$, $n \ge 0$, and $v_1, \dots, v_{n-1} \in V$.

Remark 1. To give a reasonable meaning to the “basis $\{h_s\}$ of $V$”, we must suppose either that $V$ is finite dimensional, or that it has a suitable topology, as in the case of string products. We will always tacitly assume that assumptions of this form have been made. In the “main identity” for $g = 0$ we put, by definition, $l_n^{-1} = 0$. Because $\deg(h_s) + \deg(h^s) = -3$, $\deg(h_s)\deg(h^s)$ is even. The graded symmetry of $B$ then implies that, besides (12), also $B(h^s, h_t) = \delta^s_t$. The element $h = h_s \otimes h^s$ is easily seen to be symmetric, $h_s \otimes h^s = (-1)^{h_s h^s} h^s \otimes h_s = h^s \otimes h_s$. For $n = 0$ axiom (A1) gives

$$0 = \sum_{g_1+g_2=g} l_1^{g_1}\bigl(l_0^{g_2}(.)\bigr) + \frac12 \sum_s (-1)^{h_s}\, l_2^{g-1}(h_s, h^s),$$

while for $n = 1$ it gives

$$0 = \sum_{g_1+g_2=g} \Bigl(l_1^{g_1}\bigl(l_1^{g_2}(v)\bigr) + l_2^{g_1}\bigl(l_0^{g_2}(.), v\bigr)\Bigr) - \frac12 \sum_s (-1)^{h_s}\, l_3^{g-1}(h_s, h^s, v), \tag{14}$$

for all $v \in V$. From this moment on, we will assume that $l_0^g = 0$, for all $g \ge 0$, that is, the theory has “no constants”. This assumption is not really necessary, but it will considerably simplify our exposition.
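The sums over unshuffles in the main identity and the signs ε(σ), χ(σ) of this section are purely combinatorial and can be enumerated mechanically. The Python sketch below is an illustration under the conventions stated above; the function names are hypothetical, permutations are written as 0-based tuples, and the degrees of the indeterminates are passed as a list of integers.

```python
from itertools import combinations

def unshuffles(i, j):
    """All (i, j)-unshuffles of {0, ..., i+j-1}: sigma is increasing on the
    first i slots and on the last j slots (0-based form of the definition)."""
    n = i + j
    for first in combinations(range(n), i):
        rest = tuple(k for k in range(n) if k not in first)
        yield first + rest

def koszul_sign(sigma, degrees):
    """epsilon(sigma; x_1..x_n): one factor (-1)^(|x_a||x_b|) per inverted pair."""
    sign = 1
    for a in range(len(sigma)):
        for b in range(a + 1, len(sigma)):
            if sigma[a] > sigma[b] and (degrees[sigma[a]] * degrees[sigma[b]]) % 2:
                sign = -sign
    return sign

def sgn(sigma):
    """Ordinary signature of the permutation, computed from its inversions."""
    inversions = sum(
        1
        for a in range(len(sigma))
        for b in range(a + 1, len(sigma))
        if sigma[a] > sigma[b]
    )
    return -1 if inversions % 2 else 1

def chi(sigma, degrees):
    """chi(sigma) = sgn(sigma) * epsilon(sigma)."""
    return sgn(sigma) * koszul_sign(sigma, degrees)

two_one = list(unshuffles(2, 1))
chi_even = [chi(s, [0, 0, 0]) for s in two_one]   # even generators: chi = sgn
chi_odd = [chi(s, [1, 1, 1]) for s in two_one]    # odd generators: chi = +1
```

For generators of even degree χ reduces to the signature, while for generators of odd degree the Koszul sign cancels the signature; the number of (i, j)-unshuffles is the binomial coefficient C(i + j, i).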

Exercise 1. Let us denote $\partial := l_1^0$. Equation (14) implies that $\partial^2 = 0$ (recall our assumption $l_0^g = 0$!). Thus $\partial$ is a degree $-1$ differential on the space $V$. The symmetry of $h_s \otimes \partial(h^s)$ (Axiom (A2) with $n = 1$ and $g = 0$) is equivalent to the $\partial$-invariance of the form $B$,

$$B(\partial u, v) + (-1)^{u} B(u, \partial v) = 0, \quad \text{for } u, v \in V.$$

The tree level. Let us discuss the “tree level” ($g = 0$) specialization of the above structure. The only nontrivial $l_n^g$'s are $l_n := l_n^0$, $n \ge 1$. The main identity (11) for $g = 0$ reduces to

$$0 = \sum_{k+l=n+1} \ \sum_{\sigma \in \mathrm{unsh}(l, n-l)} \chi(\sigma) (-1)^{l(k-1)}\, l_k\bigl(l_l(v_{\sigma(1)}, \dots, v_{\sigma(l)}), v_{\sigma(l+1)}, \dots, v_{\sigma(n)}\bigr) \tag{15}$$

while, for $g = 1$ it gives (after forgetting the overall factor $(-1)^n/2$)

$$0 = \sum_s (-1)^{h_s}\, l_{n+2}(h_s, h^s, v_1, \dots, v_n). \tag{16}$$

Axiom (A2) says that the elements

$$(-1)^{(n+1)h_s}\, h_s \otimes l_n(h^s, v_1, \dots, v_{n-1}) \tag{17}$$

are symmetric. We immediately recognize (15) as the defining axiom for strongly homotopy Lie algebras [13, Def. 2.1]. Thus the tree level loop homotopy Lie algebra is a strongly homotopy Lie algebra $(V, \{l_n\})$ with an additional structure given by a bilinear form $B$ such that the element $h = h_s \otimes h^s$, uniquely determined by $B$, satisfies (16) and (17). We see that the “tree-level” specialization is a richer structure than just a strongly homotopy Lie algebra as it is usually understood. A proper name for such a structure would be a cyclic strongly homotopy Lie algebra.

3. Higher Order (Co)derivations

In this section we investigate properties of higher order coderivations of cofree cocommutative coalgebras. Because this paper is meant for humans, not for robots, we derive necessary properties for derivations on free commutative algebras, and then simply dualize the results. This is an absolutely correct procedure, except for one fine point related to the cofreeness, see Remark 3. The following definitions were taken from [1, 3].

Let $A$ be a graded (super) commutative algebra and $\nabla : A \to A$ a homogeneous degree $k$ linear map. We define inductively, for each $n \ge 1$, degree $k$ linear deviations $\Phi^n_\nabla : A^{\otimes n} \to A$ by

$$\Phi^1_\nabla(a) := \nabla(a),$$

$$\Phi^2_\nabla(a, b) := \nabla(ab) - \nabla(a)b - (-1)^{ka} a\nabla(b),$$

$$\Phi^3_\nabla(a, b, c) := \nabla(abc) - \nabla(ab)c - (-1)^{a(b+c)}\nabla(bc)a - (-1)^{c(a+b)}\nabla(ca)b + \nabla(a)bc + (-1)^{a(b+c)}\nabla(b)ca + (-1)^{c(a+b)}\nabla(c)ab,$$

$$\vdots$$

$$\Phi^{n+1}_\nabla(a_1, \dots, a_{n+1}) := \Phi^n_\nabla(a_1, \dots, a_n a_{n+1}) - \Phi^n_\nabla(a_1, \dots, a_n)\,a_{n+1} - (-1)^{a_n \cdot a_{n+1}}\, \Phi^n_\nabla(a_1, \dots, a_{n-1}, a_{n+1})\,a_n.$$

As a matter of fact, it is possible to give a non-inductive formula for $\Phi^n_\nabla$, namely

$$\Phi^n_\nabla(a_1, \dots, a_n) = \sum_{1 \le i \le n} \ \sum_{\sigma \in \mathrm{unsh}(i, n-i)} (-1)^{n-i} \epsilon(\sigma)\, \nabla(a_{\sigma(1)} \cdots a_{\sigma(i)})\, a_{\sigma(i+1)} \cdots a_{\sigma(n)}. \tag{18}$$

We say that $\nabla$ is a derivation of order $r$ if $\Phi^{r+1}_\nabla$ is identically zero. In this case we write $\nabla \in \mathrm{Der}^r_k(A)$, where $k = \deg(\nabla)$. In the following proposition, which was stated in [1], $[-, -]$ denotes the graded anticommutator of endomorphisms.

Proposition 1. The subspaces $\mathrm{Der}^r_k(A)$ satisfy:
(i) $\mathrm{Der}^1_k(A) \subset \mathrm{Der}^2_k(A) \subset \mathrm{Der}^3_k(A) \subset \cdots$,
(ii) $\mathrm{Der}^r_k(A) \circ \mathrm{Der}^s_l(A) \subset \mathrm{Der}^{r+s}_{k+l}(A)$, and
(iii) $[\mathrm{Der}^r_k(A), \mathrm{Der}^s_l(A)] \subset \mathrm{Der}^{r+s-1}_{k+l}(A)$.

Let now $A = \wedge X$ be the free graded commutative algebra on the graded vector space $X$. Let us prove the following useful proposition.

Proposition 2. Let $\nabla \in \mathrm{Der}^r_k(\wedge X)$. Then $\nabla$ is uniquely determined by its values on the products $x_1 \cdots x_s$, $s \le r$, $x_i \in X$ for $1 \le i \le s$. In particular, $\nabla = 0$ if and only if $\nabla(x_1 \cdots x_s) = 0$, for $x_1 \cdots x_s$ as above.

Proof. Since $\nabla \in \mathrm{Der}^r_k(\wedge X)$ is linear, it is enough to prove that $\nabla(x_1 \cdots x_s) = 0$ for all $s \le r$ implies that $\nabla(x_1 \cdots x_n) = 0$ for each $n$. This we prove inductively. Suppose we already know $\nabla(x_1 \cdots x_k) = 0$, for each $k \le n$, $n \ge r$, and consider $\nabla(x_1 \cdots x_{n+1})$. We compute from (18) that

$$\Phi^{n+1}_\nabla(x_1, \dots, x_{n+1}) = \nabla(x_1 \cdots x_{n+1}) + \sum_{1 \le i \le n} \ \sum_{\sigma \in \mathrm{unsh}(i, n-i+1)} (-1)^{n-i+1} \epsilon(\sigma)\, \nabla(x_{\sigma(1)} \cdots x_{\sigma(i)})\, x_{\sigma(i+1)} \cdots x_{\sigma(n+1)}.$$

Since $\nabla \in \mathrm{Der}^r_k(\wedge X)$ and $n \ge r$, $\Phi^{n+1}_\nabla(x_1, \dots, x_{n+1}) = 0$, while the terms in the sum are zero by the inductive assumption. Thus $\nabla(x_1 \cdots x_{n+1}) = 0$ and the induction may go on.

Remark 2. 1-derivations are ordinary derivations, $\mathrm{Der}^1_k(A) = \mathrm{Der}_k(A)$. Proposition 2 then states the standard fact that derivations on free algebras are given by their restrictions to the space of generators.

For a fixed $n$, we denote by $\wedge^n X$ the subspace of $\wedge X$ spanned by the products $x_1 \cdots x_n$, $x_i \in X$, $1 \le i \le n$; we put, by definition, $\wedge^0 X := \mathbf k$. Let $\iota_n : \wedge^n X \hookrightarrow \wedge X$ be the inclusion. The following proposition says that $r$-derivations of the free algebra $\wedge X$ are in one-to-one correspondence with $r$-tuples of linear maps, $\{f_s : \wedge^s X \to \wedge X\}_{1 \le s \le r}$.

Proposition 3. Suppose we are given homogeneous degree $k$ linear maps $f_s : \wedge^s X \to \wedge X$, for $1 \le s \le r$. Then there exists a unique order $r$ derivation $\nabla \in \mathrm{Der}^r_k(\wedge X)$ such that

$$\nabla \circ \iota_s = f_s, \quad \text{for } 1 \le s \le r. \tag{19}$$

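Proof. The uniqueness follows immediately from Proposition 2. To prove the existence, observe first that, given degree $k$ linear maps $g_s : \wedge^s X \to \wedge X$, $1 \le s \le r$, the formula

$$\nabla(x_1 \cdots x_n) := \sum_{1 \le s \le \min(r, n)} \ \sum_{\sigma \in \mathrm{unsh}(s, n-s)} \epsilon(\sigma)\, g_s(x_{\sigma(1)} \cdots x_{\sigma(s)})\, x_{\sigma(s+1)} \cdots x_{\sigma(n)}$$

defines an order $r$ derivation of degree $k$. Condition (19) then leads to the following system of equations:

$$f_1(x_1) = g_1(x_1),$$

$$f_2(x_1 x_2) = g_2(x_1 x_2) + g_1(x_1)x_2 + (-1)^{x_1 x_2} g_1(x_2)x_1,$$

$$\vdots$$

$$f_r(x_1 \cdots x_r) = \sum_{1 \le s \le r} \ \sum_{\sigma \in \mathrm{unsh}(s, r-s)} \epsilon(\sigma)\, g_s(x_{\sigma(1)} \cdots x_{\sigma(s)})\, x_{\sigma(s+1)} \cdots x_{\sigma(r)}.$$

This system can be solved recursively for $g_s$, $1 \le s \le r$, since the $s$-th equation expresses $g_s$ through $f_s$ and $g_1, \dots, g_{s-1}$.

Let us turn our attention to coalgebras. Suppose that $C = (C, \Delta)$ is a cocommutative coassociative coalgebra. To define higher-order coderivations of $C$, we need analogs of the deviations $\Phi^r_\nabla$ introduced above. By duality, we define, for any homogeneous degree $k$ linear endomorphism $\Omega$ of $C$, degree $k$ multilinear maps $\Psi^n_\Omega : C \to C^{\otimes n}$ inductively as

$$\Psi^1_\Omega := \Omega,$$

$$\Psi^2_\Omega := \Delta \circ \Omega - (\Omega \otimes 1) \circ \Delta - (1 \otimes \Omega) \circ \Delta,$$

$$\Psi^3_\Omega := \Delta^{[3]} \circ \Omega - (\Delta \otimes 1) \circ (\Omega \otimes 1) \circ \Delta - T_{312} \circ (\Delta \otimes 1) \circ (\Omega \otimes 1) \circ \Delta - T_{231} \circ (\Delta \otimes 1) \circ (\Omega \otimes 1) \circ \Delta + (\Omega \otimes 1^2) \circ \Delta^{[3]} + T_{312} \circ (\Omega \otimes 1^2) \circ \Delta^{[3]} + T_{231} \circ (\Omega \otimes 1^2) \circ \Delta^{[3]},$$

$$\vdots$$

$$\Psi^{n+1}_\Omega := (1^{n-1} \otimes \Delta) \circ \Psi^n_\Omega - (\Psi^n_\Omega \otimes 1) \circ \Delta - T_{1,2,\dots,n-1,n+1,n} \circ (\Psi^n_\Omega \otimes 1) \circ \Delta,$$

where $\Delta^{[3]} := (\Delta \otimes 1)\Delta$ ($= (1 \otimes \Delta)\Delta$ by the coassociativity) and, for $\sigma \in \Sigma_n$, $T_{\sigma(1) \cdots \sigma(n)} : C^{\otimes n} \to C^{\otimes n}$ is defined by

$$T_{\sigma(1) \cdots \sigma(n)}(x_1 \otimes \cdots \otimes x_n) := \epsilon(\sigma)\,(x_{\sigma(1)} \otimes \cdots \otimes x_{\sigma(n)}).$$

We say that a linear map $\Omega : C \to C$ is an order $r$ coderivation, if $\Psi^{r+1}_\Omega$ is identically zero. Let $\mathrm{coDer}^r_k(C)$ be the space of all such maps. The following proposition is an exact dual of Proposition 1.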


Proposition 4. The subspaces $\mathrm{coDer}^r_k(C)$ satisfy:
(i) $\mathrm{coDer}^1_k(C) \subset \mathrm{coDer}^2_k(C) \subset \mathrm{coDer}^3_k(C) \subset \cdots$,
(ii) $\mathrm{coDer}^r_k(C) \circ \mathrm{coDer}^s_l(C) \subset \mathrm{coDer}^{r+s}_{k+l}(C)$, and
(iii) $[\mathrm{coDer}^r_k(C), \mathrm{coDer}^s_l(C)] \subset \mathrm{coDer}^{r+s-1}_{k+l}(C)$.

Let $W$ be a graded vector space and consider again the free graded commutative algebra $\wedge W$ on $W$. We introduce on $\wedge W$ a cocommutative coassociative comultiplication $\Delta$, $\Delta(w) = 1 \otimes w + \bar\Delta(w) + w \otimes 1$, by defining the reduced diagonal $\bar\Delta$ as

$$\bar\Delta(w_1 \cdots w_n) = \sum_{1 \le i \le n-1} \sum_{\sigma} \epsilon(\sigma)\, (w_{\sigma(1)} \cdots w_{\sigma(i)}) \otimes (w_{\sigma(i+1)} \cdots w_{\sigma(n)}),$$

$w_1 \cdots w_n \in \wedge^n W$, where $\sigma$ runs through all $(i, n-i)$ unshuffles. We denote the coalgebra $(\wedge W, \Delta)$ by ${}^c\!\wedge W$.

Remark 3. Here it must be pointed out that ${}^c\!\wedge W$ is not the cofree cocommutative coassociative coalgebra cogenerated by $W$, as it is generally supposed to be. It is the cofree coalgebra in the category of connected coalgebras, see the discussion in [13, p. 2150].

Denote by $\pi_n : {}^c\!\wedge W \to \wedge^n W$ the natural projection of vector spaces. The following proposition is the exact dual of Proposition 3.

Proposition 5. For each $r$-tuple $u_s : {}^c\!\wedge W \to \wedge^s W$, $1 \le s \le r$, of homogeneous degree $k$ linear maps there exists a unique order $r$ coderivation $\Omega \in \mathrm{coDer}^r_k({}^c\!\wedge W)$ such that

$$\pi_s \circ \Omega = u_s, \quad \text{for } 1 \le s \le r. \tag{20}$$

4. Loop Homotopy Lie Algebras – 1st Description

We already observed at the end of Sect. 2 that strongly homotopy Lie algebras are closely related to the “tree level” specializations of loop homotopy Lie algebras. Recall [13, Theorem 2.3] that strongly homotopy Lie algebras have the following characterization.

Proposition 6. There exists a one-to-one correspondence between strongly homotopy Lie algebra structures on a graded vector space $V$ and degree $-1$ coderivations $\delta \in \mathrm{coDer}_{-1}({}^c\!\wedge W)$, $W := \uparrow V$, with the property $\delta^2 = 0$.

In this section we give a similar characterization for loop homotopy Lie algebras. Suppose that the vector space $V$ and the bilinear form $B$ are the same as in Def. 1. Let $h = h_s \otimes h^s \in (V \otimes V)_{-3}$ be as in (12) (of course, $h$ is uniquely determined by the nondegenerate form $B$). Let $W := \uparrow V$ and $y = y_s \otimes y^s := \uparrow h_s \otimes \uparrow h^s \in (W \otimes W)_{-1}$. Because $h$ is symmetric, $y$ is symmetric as well, thus, in fact, $y = y_s y^s \in (\wedge^2 W)_{-1}$. Let us consider the extension ${}^c\!\wedge W[t]$ of ${}^c\!\wedge W$ over the polynomial ring $\mathbf k[t]$, ${}^c\!\wedge W[t] := {}^c\!\wedge W \otimes_{\mathbf k} \mathbf k[t]$. By Proposition 5, there exists a unique coderivation $\theta \in \mathrm{coDer}^2_{-1}({}^c\!\wedge W[t])$ such that

$$\pi_1(\theta) = 0 \quad \text{and} \quad \pi_2(\theta)(w) = \begin{cases} 0, & w \in \wedge^n W[t],\ n > 0, \\ \frac12\, t y, & w = 1 \in \wedge^0 W \cdot t^0 \cong \mathbf k. \end{cases} \tag{21}$$

The rôle of $\theta$ is to incorporate the form $B$ into our theory. In the rest of this section we prove the following theorem.

Theorem 1. Under the above notation, there is a one-to-one correspondence between loop homotopy Lie algebra structures on the graded vector space $V$ and degree $-1$ coderivations $\delta \in \mathrm{coDer}^1_{-1}({}^c\!\wedge W[t])$ such that

$$(\delta + \theta)^2 = 0. \tag{22}$$

Let us analyze Eq. (22). It is, of course, equivalent to

$$\delta^2 + \theta\delta + \delta\theta + \theta^2 = 0. \tag{23}$$

Sublemma 1. Under the above notation, $\theta^2 = 0$, $\delta^2 \in \mathrm{coDer}^1_{-2}({}^c\!\wedge W[t])$, and $(\theta\delta + \delta\theta) \in \mathrm{coDer}^2_{-2}({}^c\!\wedge W[t])$.

Proof. For $w_1 \cdots w_n \in \wedge^n W$ obviously

$$\theta(w_1 \cdots w_n) = \frac12\, t\, y_s y^s w_1 \cdots w_n, \tag{24}$$

thus

$$\theta^2(w_1 \cdots w_n) = \frac14\, t^2\, y_s y^s y_t y^t w_1 \cdots w_n. \tag{25}$$

The graded commutativity implies that

$$y_s y^s y_t y^t = (-1)^{(y_s + y^s)(y_t + y^t)}\, y_t y^t y_s y^s = -y_t y^t y_s y^s.$$

On the other hand, the substitution $s \leftrightarrow t$ gives $y_s y^s y_t y^t = y_t y^t y_s y^s$, therefore $y_t y^t y_s y^s = 0$, and $\theta^2 = 0$ by (25). The remaining two statements follow from Proposition 4(iii) and the observation that $\delta^2 = \frac12[\delta, \delta]$ and $\theta\delta + \delta\theta = [\delta, \theta]$.

By Sublemma 1, (23) reduces to

$$\delta^2 + \theta\delta + \delta\theta = 0. \tag{26}$$

By the same sublemma and Proposition 4(i), $\delta^2 + \theta\delta + \delta\theta$ is an order 2 coderivation. Thus (26) is, by Proposition 5, equivalent to

$$\pi_1(\delta^2 + \theta\delta + \delta\theta) = 0, \tag{27}$$

and

$$\pi_2(\delta^2 + \theta\delta + \delta\theta) = 0. \tag{28}$$

Because, by (21), $\pi_1(\theta) = 0$, Eq. (27) further reduces to

$$\pi_1(\delta^2 + \delta\theta) = 0. \tag{29}$$

To understand better the meaning of this equation, let us introduce, for any $g \ge 0$ and $n \ge 0$, linear maps $\delta_n^g : \wedge^n W \to W$ by

$$\delta_n^g(w_1 \cdots w_n) := \mathrm{Coef}_g\bigl(\pi_1 \delta(w_1 \cdots w_n)\bigr), \quad w_1 \cdots w_n \in \wedge^n W, \tag{30}$$

where $\mathrm{Coef}_g(-)$ is the coefficient at $t^g$. By Proposition 5, the set $\{\delta_n^g\}_{n, g \ge 0}$ uniquely determines the coderivation $\delta$. The explicit formula is (compare explicit formulas for coderivations acting on coalgebras in [14]):

$$\delta(w_1 \cdots w_n) = \sum_{0 \le i \le n} \epsilon(\sigma)\, t^g\, \delta_i^g(w_{\sigma(1)} \cdots w_{\sigma(i)})\, w_{\sigma(i+1)} \cdots w_{\sigma(n)}, \tag{31}$$

where the summation is taken over all $g \ge 0$ and all $\sigma \in \mathrm{unsh}(i, n-i)$. From this and (24) we obtain

$$\pi_1(\delta^2 + \delta\theta)(w_1 \cdots w_n) = \sum_{\substack{k+l=n+1 \\ g_1+g_2=g}} \ \sum_{\sigma \in \mathrm{unsh}(l, n-l)} \epsilon(\sigma)\, t^g\, \delta_k^{g_1}\bigl(\delta_l^{g_2}(w_{\sigma(1)} \cdots w_{\sigma(l)})\, w_{\sigma(l+1)} \cdots w_{\sigma(n)}\bigr) + \frac12 \sum_{s,\, g \ge 0} t^{g+1}\, \delta_{n+2}^{g}(y_s, y^s, w_1, \dots, w_n). \tag{32}$$

We formulate the result as:

Sublemma 2. Eq. (29) means that, for all $n \ge 0$, $w_1 \cdots w_n \in \wedge^n W$ and $g \ge 0$,

$$0 = \sum_{\substack{k+l=n+1 \\ g_1+g_2=g}} \ \sum_{\sigma \in \mathrm{unsh}(l, n-l)} \epsilon(\sigma)\, \delta_k^{g_1}\bigl(\delta_l^{g_2}(w_{\sigma(1)} \cdots w_{\sigma(l)})\, w_{\sigma(l+1)} \cdots w_{\sigma(n)}\bigr) + \frac12 \sum_s \delta_{n+2}^{g-1}(y_s, y^s, w_1, \dots, w_n). \tag{33}$$

We will see that Eq. (33) corresponds to the “main identity” (11). Let us make a similar analysis of Eq. (28). Because clearly $\pi_2(\theta\delta) = 0$, it reduces to

$$\pi_2(\delta^2 + \delta\theta) = 0. \tag{34}$$

Using similar arguments as above, we obtain, for any $g \ge 0$ and $w_1 \cdots w_n \in \wedge^n W$,

$$\mathrm{Coef}_g\bigl(\pi_2(\delta^2)(w_1 \cdots w_n)\bigr) = \sum_{\substack{k+l=n \\ g_1+g_2=g}} \ \sum_{\sigma \in \mathrm{unsh}(l, n-l-1, 1)} \epsilon(\sigma)\, \delta_k^{g_1}\bigl(\delta_l^{g_2}(w_{\sigma(1)} \cdots w_{\sigma(l)})\, w_{\sigma(l+1)} \cdots w_{\sigma(n-1)}\bigr)\, w_{\sigma(n)} + \sum_{\substack{p+q=n \\ g_1+g_2=g}} \ \sum_{\sigma \in \mathrm{unsh}(p, q)} (-1)^{w_{\sigma(1)} + \cdots + w_{\sigma(p)}} \epsilon(\sigma)\, \delta_p^{g_1}(w_{\sigma(1)} \cdots w_{\sigma(p)})\, \delta_q^{g_2}(w_{\sigma(p+1)} \cdots w_{\sigma(n)}). \tag{35}$$

Similarly, we have

$$\mathrm{Coef}_g\bigl(\pi_2(\delta\theta)(w_1 \cdots w_n)\bigr) = \frac12 \sum_{1 \le i \le n} (-1)^{w_i(w_{i+1} + \cdots + w_n)}\, \delta_{n+1}^{g-1}(y_s y^s w_1 \cdots w_{i-1} w_{i+1} \cdots w_n)\, w_i + \frac12 \sum_s (-1)^{y_s(y^s + w_1 + \cdots + w_n)}\, \delta_{n+1}^{g-1}(y^s w_1 \cdots w_n)\, y_s + \frac12 \sum_s (-1)^{y^s(w_1 + \cdots + w_n)}\, \delta_{n+1}^{g-1}(y_s w_1 \cdots w_n)\, y^s. \tag{36}$$

Now, assuming (33), it is immediate to see that the first term at the right-hand side of (35) is minus the first term at the right-hand side of (36). The symmetry $y_s y^s = (-1)^{y_s y^s} y^s y_s$ implies that the second and third terms at the right-hand side of (36) are the same, both equal to $\frac12 \sum_s (-1)^{y_s}\, y_s\, \delta_{n+1}^{g-1}(y^s w_1 \cdots w_n)$. We formulate these observations as

Sublemma 3. Assuming (33), Eq. (34) is equivalent to

$$\frac12 \sum_s (-1)^{y_s}\, y_s\, \delta_{n+1}^{g-1}(y^s w_1 \cdots w_n) = 0. \tag{37}$$

Since we work in the free commutative algebra, (37) is equivalent to the antisymmetry of

$$\frac12 \sum_s (-1)^{y_s}\, y_s \otimes \delta_{n+1}^{g-1}(y^s w_1 \cdots w_n) \in W \otimes W. \tag{38}$$

Proof of Theorem 1. Recall that $W = \uparrow V$. The correspondence between the structure operations $\{l_n^g\}_{g, n \ge 0}$ of a loop homotopy Lie algebra and coderivations $\delta$ of Theorem 1 is given by

$$l_n^g(v_1, \dots, v_n) = (-1)^{(n-1)v_1 + \cdots + v_{n-1}} \downarrow \delta_n^g(\uparrow v_1 \cdots \uparrow v_n), \quad v_1, \dots, v_n \in V,$$

with the inverse formula

$$\delta_n^g(w_1 \cdots w_n) = (-1)^{n(n-1)/2} (-1)^{(n-1)w_1 + \cdots + w_{n-1}} \uparrow l_n^g(\downarrow w_1, \dots, \downarrow w_n),$$

$w_1 \cdots w_n \in \wedge^n W$, where the multilinear maps $\{\delta_n^g\}$ were introduced in (30). Observe the sign $(-1)^{n(n-1)/2}$ in the second formula; it is typical for formulas of this type, see [15, Example 1.6]. A routine calculation shows that the substitution $l_n^g \leftrightarrow \delta_n^g$ converts (33) to (11) and that the symmetry of the element in (13) is equivalent to the antisymmetry of the element of (38).

5. Loop Homotopy Lie Algebras – Operadic Approach

In this section we give an operadic characterization of loop homotopy Lie algebras. We will not repeat here all details of necessary definitions concerning operads, because it would stretch the paper beyond any reasonable limit. Operads are introduced in the classical book [17]. The (co)bar construction over a (co)operad is defined in [9], see also [6]. Cyclic operads are introduced in [7] while modular operads and the corresponding modular (co)bar construction (called the Feynman transform) in [8]. There is also a nice overview [10]. These sources are easily available, we will thus rely on them and indicate only basic ideas.

Recall that a collection is a system $E = \{E(n)\}_{n \ge 1}$ of graded vector spaces such that each $E(n)$ possesses a right action of the symmetric group $\Sigma_n$. Any collection $E$ extends to a functor (denoted by the same symbol) from the category of finite sets to the category of graded vector spaces with the property that $E(n) = E(\{1, \dots, n\})$ [6, 1.3]. Let $\mathrm{Tr}_n$ denote the set of rooted (= directed) trees with $n$ labelled leaves. For a tree $T \in \mathrm{Tr}_n$ and a collection $E$, denote ([9, 1.2.13])

$$E(T) := \bigotimes_{v \in \mathrm{Vert}(T)} E(\mathrm{In}(v)),$$

where $\mathrm{Vert}(T)$ is the set of the vertices of $T$ and $\mathrm{In}(v)$ the set of incoming edges of $v$. The free operad on $E$ [9, 2.1.1] is then the collection

$$\mathcal F(E)(n) := \bigoplus_{T \in \mathrm{Tr}_n} E(T), \quad n \ge 1,$$

with the operadic structure induced by the grafting of underlying trees. Let $\mathcal P$ be an operad. Consider the free operad $\mathcal F(\downarrow s\mathcal P^*)$ on the collection $\downarrow s\mathcal P^*(n) :=\ \uparrow^{n-2} \mathcal P^*(n)$, $n \ge 1$, where $(-)^*$ is the linear dual. As proved in [9, 3.2], structure operations of the operad $\mathcal P$ induce a differential $\partial_D$ on $\mathcal F(\downarrow s\mathcal P^*)$. The differential operad $D(\mathcal P) := (\mathcal F(\downarrow s\mathcal P^*), \partial_D)$ is called the (operadic) cobar dual of the operad $\mathcal P$. It is well-known [9, 4.2.14] that “classical” strongly homotopy Lie algebras are characterized as follows.

Proposition 7. Strongly homotopy Lie algebras are algebras over the cobar dual $D(\mathrm{Com})$ of the operad Com for commutative algebras.

The above proposition means that a strongly homotopy Lie algebra structure on a differential graded vector space $V = (V, \partial)$ is the same as a morphism $a : D(\mathrm{Com}) \to \mathrm{End}_V$ from the operad $D(\mathrm{Com})$ to the endomorphism operad $\mathrm{End}_V$ of $V$ [9, 1.2.9]. Our aim is to give a similar characterization of loop homotopy Lie algebras, based on a certain generalization of operads, called modular operads.

An intermediate step between ordinary operads and modular operads are cyclic operads whose definition we briefly recall. A cyclic collection is a system $E = \{E((n))\}_{n \ge 1}$ of graded vector spaces such that each $E((n))$ has a right $\Sigma_{n+1}$-action. Each cyclic collection $E$ induces a functor from the category of finite sets into the category of graded vector spaces (denoted again by $E$) such that $E((\{0, \dots, n\})) = E((n))$. This notation differs from that of [7] and [5] where $E((\{0, \dots, n\})) = E((n+1))$. Let $\mathrm{Tur}_n$ denote the set of unrooted trees $T$ with leaves indexed by $\{0, \dots, n\}$. For a cyclic collection $E$ and a tree $T \in \mathrm{Tur}_n$, let

$$E((T)) := \bigotimes_{v \in \mathrm{Vert}(T)} E((\mathrm{Leg}(v))),$$

where $\mathrm{Leg}(v)$ is the set of all edges of $T$ adjacent to the vertex $v$. A cyclic operad is then a cyclic collection $\mathcal C = \{\mathcal C((n))\}_{n \ge 1}$ together with a “coherent” system of “contractions”

$$\alpha_T : \mathcal C((T)) \to \mathcal C((n)), \quad T \in \mathrm{Tur}_n,\ n \ge 1, \tag{39}$$

see [7, Def. 2.1].

Modular operads, anticipated in [5], were introduced by Getzler and Kapranov [8] for the study of moduli spaces of Riemann surfaces of arbitrary genera. Recall that a modular collection is a cyclic collection $E$ with a second grading by the “genus” $g \ge 0$, $E = \{E((g, n))\}_{n \ge 1}$. A modular operad $\mathcal A$ is then a modular collection which possesses, besides a cyclic operadic structure, also operations

$$\mathcal A((g, n+2)) \to \mathcal A((g+1, n)).$$

These operations are abstractions of the “self-gluing” which produces, from a surface of genus $g$ with $(n+2)$ punctures, a new surface of genus $g+1$ with $n$ punctures, as indicated in Fig. 1.


M. Markl

Fig. 1. An example of “self-gluing”. The surface on the right has 2 punctures and genus 2. It is obtained from the surface on the left, which has 4 punctures (marked 1, 2, 3, 4) and genus 1, by sewing along the punctures marked 3 and 4.

As cyclic operads are characterized by a system of contractions (39) indexed by unrooted trees, there is a similar characterization of modular operads, but based on labelled (or “modular”) graphs rather than trees. Following [5, 12], by a graph Γ we mean a finite set Flag(Γ) (whose elements are called flags or half-edges) together with an involution σ and a partition λ. The vertices Vert(Γ) of a graph Γ are the blocks of the partition λ. The edges Edg(Γ) are pairs of flags forming a two-cycle of σ relative to the decomposition of a permutation into disjoint cycles. The legs Leg(Γ) are the fixed points of σ. We also denote by Leg(v) the flags belonging to the block v or, in common speech, the half-edges adjacent to the vertex v.

Each graph Γ has its geometric realization, a finite one-dimensional cell complex |Γ|, obtained by taking one copy of [0, 1/2] for each flag and imposing the following equivalence relation: the points 0 ∈ [0, 1/2] are identified for all flags in a block of the partition λ, and the points 1/2 ∈ [0, 1/2] are identified for pairs of flags exchanged by the involution σ. We will usually make no distinction between a graph and its geometric realization.

A modular or labelled graph is a connected graph Γ together with a map g : Vert(Γ) → {0, 1, 2, . . . }. The genus g(Γ) of a modular graph Γ is the number

\[ g(\Gamma) := \dim H_1(|\Gamma|) + \sum_{v \in \mathrm{Vert}(\Gamma)} g(v). \]
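The combinatorial definition above (flags, an involution σ, a partition λ, and a genus label on each vertex) can be made concrete with a small sketch. The function name `analyze_graph` and the dictionary encoding are illustrative assumptions, not notation from the paper. The sketch uses the standard fact that a connected graph with E edges and V vertices has first Betti number E − V + 1 (legs are contractible whiskers and do not contribute).

```python
from collections import deque

def analyze_graph(sigma, blocks, vertex_genus):
    """Edges, legs and genus of a modular graph (hypothetical encoding).

    sigma:        dict flag -> flag, the involution
    blocks:       dict vertex -> set of flags, the partition lambda
    vertex_genus: dict vertex -> non-negative integer g(v)
    """
    flags = set(sigma)
    # sigma must be an involution and the blocks must partition the flags.
    assert all(sigma[sigma[f]] == f for f in flags)
    assert sorted(f for b in blocks.values() for f in b) == sorted(flags)
    vertex_of = {f: v for v, b in blocks.items() for f in b}

    legs = sorted(f for f in flags if sigma[f] == f)          # fixed points
    edges = sorted({tuple(sorted((f, sigma[f])))              # two-cycles
                    for f in flags if sigma[f] != f})

    # Modular graphs are connected: check by breadth-first search.
    start = next(iter(blocks))
    seen, queue = {start}, deque([start])
    while queue:
        v = queue.popleft()
        for f in blocks[v]:
            w = vertex_of[sigma[f]]
            if w not in seen:
                seen.add(w)
                queue.append(w)
    assert seen == set(blocks), "a modular graph must be connected"

    betti1 = len(edges) - len(blocks) + 1                     # dim H_1
    genus = betti1 + sum(vertex_genus[v] for v in blocks)
    return edges, legs, genus

# The self-gluing of Fig. 1: one vertex of genus 1 with flags 1..4.
left = analyze_graph({1: 1, 2: 2, 3: 3, 4: 4}, {"v": {1, 2, 3, 4}}, {"v": 1})
right = analyze_graph({1: 1, 2: 2, 3: 4, 4: 3}, {"v": {1, 2, 3, 4}}, {"v": 1})
```

Pairing flags 3 and 4 turns two legs into a loop edge, so the genus rises by one while the number of legs drops by two, in agreement with the caption of Fig. 1.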

Let Γ((g, S)) be the category whose objects are pairs (|Γ|, ρ) consisting of a modular graph Γ of genus g and an isomorphism ρ : Leg(Γ) → S labeling the legs of Γ by elements of a finite set S. As usual, we write Γ((g, n)) := Γ((g, {0, . . . , n})). For a modular collection A = {A((g, n))}_{n≥1} and a modular graph Γ, let A((Γ)) be the tensor product

\[ A((\Gamma)) := \bigotimes_{v \in \mathrm{Vert}(\Gamma)} A((g(v), \mathrm{Leg}(v))). \tag{40} \]

A modular operad structure on A is then given by a coherent system of contractions [8, 2.10]

\[ \alpha_\Gamma : A((\Gamma)) \to A((g, S)), \]

for any Γ ∈ Γ((g, S)), g ≥ 0 and a finite set S.

Example 1. Let V = (V, B) be a differential graded vector space with a graded symmetric inner product B : V ⊗ V → k. Let us define, for each g ≥ 0 and a finite set S, End_V((g, S)) := V^{⊗S} (the tensor product of copies of V indexed by S). It follows from the definition that, for any Γ ∈ Γ((g, S)), End_V((Γ)) = V^{⊗Flag(Γ)}. Let B^{⊗Edg(Γ)} : V^{⊗Flag(Γ)} → V^{⊗Leg(Γ)} be the multilinear form which contracts the factors of V^{⊗Flag(Γ)} corresponding to the flags which are paired up as edges of Γ. Then we define α_Γ : End_V((Γ)) → End_V((g, S)) to be the map

\[ \alpha_\Gamma : \mathrm{End}_V((\Gamma)) \cong V^{\otimes \mathrm{Flag}(\Gamma)} \xrightarrow{\;B^{\otimes \mathrm{Edg}(\Gamma)}\;} V^{\otimes \mathrm{Leg}(\Gamma)} \cong V^{\otimes S} = \mathrm{End}_V((g, S)). \tag{41} \]
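To illustrate the contraction in (41), here is a minimal sketch for the simplest situation: V is ungraded (so no Koszul signs appear), a basis of V is fixed, and B is given by its matrix in that basis. The function name `contract` and the dictionary encoding of tensors are assumptions made for the illustration only; the graded case treated in the text needs additional sign bookkeeping that is not modelled here.

```python
def contract(tensor, flags, edges, B):
    """Contract a tensor in V^{(x) Flag} along the edges, landing in V^{(x) Leg}.

    tensor: dict mapping a tuple of basis indices (one per flag, in the
            order of `flags`) to a coefficient
    flags:  list of flag labels
    edges:  list of pairs (s, t) of flags that are glued together
    B:      matrix of the symmetric bilinear form, B[i][j] = B(e_i, e_j)
    """
    paired = {f for e in edges for f in e}
    legs = [f for f in flags if f not in paired]      # uncontracted factors
    pos = {f: n for n, f in enumerate(flags)}
    out = {}
    for idx, coeff in tensor.items():
        weight = coeff
        for s, t in edges:                            # one factor of B per edge
            weight *= B[idx[pos[s]]][idx[pos[t]]]
        if weight:
            key = tuple(idx[pos[f]] for f in legs)
            out[key] = out.get(key, 0) + weight
    return legs, out

# One vertex with flags 1..4, flags 3 and 4 glued; V = k^2 with B the identity.
B = [[1, 0], [0, 1]]
t = {(0, 1, 0, 0): 2, (0, 1, 1, 1): 3, (1, 0, 0, 1): 5}
legs, result = contract(t, [1, 2, 3, 4], [(3, 4)], B)
```

Because B is symmetric, swapping the two flags of an edge does not change the result, which mirrors the independence of the choice of labels in the ungraded setting.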

It is easy to show that the contractions {α_Γ | Γ ∈ Γ((g, S))} define on End_V the structure of a modular operad.

We would like to modify Example 1 to the situation when the degree of the form B is +3, as in the definition of a loop homotopy Lie algebra. Formula (41) does not work, among other reasons because α_Γ would not be of degree zero. For this modification we need to introduce “twisted” modular operads.

If X is a finite set with card(X) = s, let Det(X) := ∧^s((↓k)^{⊕X}), the top-dimensional piece of the s-fold exterior power of the direct sum of the copies of ↓k indexed by elements of X. Clearly Det(X) is a one-dimensional vector space concentrated in degree −s. The determinant of a graph Γ ∈ Γ((g, S)) is defined by Det(Γ) := Det(Edg(Γ)). A twisted modular operad ([5, p. 293], also called a K-modular operad in [7]) is then a modular collection A together with a coherent system of contractions

\[ \tilde\alpha_\Gamma : A((\Gamma)) \otimes \mathrm{Det}(\Gamma) \to A((g, S)), \]

for any Γ ∈ Γ((g, S)), g ≥ 0 and a finite set S.

Example 2. Let W = (W, H) be a graded vector space with a nondegenerate degree −1 symmetric bilinear form H. Define the modular collection $\widetilde{\mathrm{End}}_W$ by $\widetilde{\mathrm{End}}_W((g, S)) := W^{\otimes S}$, for g ≥ 0 and a finite set S. For Γ ∈ Γ((g, S)), the twisted modular contraction

\[ \tilde\alpha_\Gamma : \widetilde{\mathrm{End}}_W((\Gamma)) \otimes \mathrm{Det}(\Gamma) \to \widetilde{\mathrm{End}}_W((g, S)) \]

is defined as follows. Let us choose labels s_e, t_e such that e = {s_e, t_e} for each edge e ∈ Edg(Γ) and define α̃_Γ to be the composition:

\[
\begin{aligned}
\widetilde{\mathrm{End}}_W((\Gamma)) \otimes \mathrm{Det}(\Gamma)
&\cong W^{\otimes \mathrm{Flag}(\Gamma)} \otimes \mathrm{Det}(\Gamma) \\
&\cong W^{\otimes S} \otimes \bigotimes_{e \in \mathrm{Edg}(\Gamma)} W^{\otimes \{s_e, t_e\}} \otimes \mathrm{Span}(\downarrow e) \\
&\cong W^{\otimes S} \otimes \bigotimes_{e \in \mathrm{Edg}(\Gamma)} W_{s_e} \otimes W_{t_e} \otimes \mathrm{Span}(\downarrow e) \\
&\xrightarrow{\;1 \otimes \bigotimes_e H_e\;} W^{\otimes S} \otimes k^{\otimes \mathrm{Edg}(\Gamma)} \cong \widetilde{\mathrm{End}}_W((g, S)),
\end{aligned}
\]


where H_e is the map that sends u ⊗ v ⊗ ↓e ∈ W_{s_e} ⊗ W_{t_e} ⊗ Span(↓e) to H(u, v) ∈ k. The symmetry of H assures that the definition of α̃_Γ does not depend on the choice of labels. The system {α̃_Γ | Γ ∈ Γ((g, S))} induces on $\widetilde{\mathrm{End}}_W$ the structure of a twisted modular operad.

If V = (V, B) is a graded vector space with a nondegenerate degree +3 bilinear symmetric form B, then W = (W, H) with W := ↑²V and the form H defined by H(u, v) := B(↓²u, ↓²v), u, v ∈ W, form the data as in Example 2, so we may consider the twisted modular operad $\widetilde{\mathrm{End}}_{\uparrow^2 V}$.

Another example of a twisted modular operad is provided by the Feynman transform of a modular operad. Recall [8, 4.2] that the free twisted modular operad M(E) on a modular collection E is given by

\[ M(E)((g, n)) := \operatorname*{colim}_{\Gamma \in \mathrm{Iso}\,\Gamma((g, n))} E((\Gamma)) \otimes \mathrm{Det}(\Gamma), \]

where Iso Γ((g, n)) is the full subcategory of isomorphisms in Γ((g, n)). The twisted modular operad structure is induced by the “grafting” of underlying graphs. If A is a modular operad, then M(A)((g, n)) carries a natural differential ∂_F [5, Theorem 4.4]. The twisted differential modular operad F(A) := (M(A), ∂_F) is called the Feynman transform of the modular operad A.

Let us consider the “forgetful” functor For : MOp → COp from the category of modular operads to the category of cyclic operads given by For(A)((S)) := A((0, S)), for any finite set S. It is not difficult to show [16] that this functor has a left adjoint Mod : COp → MOp.

Definition 2. The modular operad Mod(P) is called the modular operadic completion of the cyclic operad P.

An easy calculation shows that

\[ \mathrm{Mod}(Com)((g, n)) \cong k, \quad \text{for each } g \ge 0,\ n \ge 1, \tag{42} \]

with the trivial action of the symmetric group Σ_{n+1}. The key role in our characterization is played by the Feynman transform F(Mod(Com)) of the modular completion of the operad Com. It follows from (42) that, as a nondifferential operad, F(Mod(Com)) is the free twisted modular operad on the generators ω_n^g,

\[ M(\mathrm{Mod}(Com)) \cong M(\{\omega_n^g;\ n \ge 1,\ g \ge 0\}), \tag{43} \]

where ω_n^g corresponds to the dual of 1 ∈ k ≅ Mod(Com)((g, n)). The central result of this section reads as follows.

Theorem 2. There exists a natural one-to-one correspondence between twisted modular F(Mod(Com))-algebra structures on (↑²V, B(↓²−, ↓²−)), i.e. morphisms

\[ a : (F(\mathrm{Mod}(Com)), \partial_F) \to (\widetilde{\mathrm{End}}_{\uparrow^2 V}, \partial = 0) \tag{44} \]

of differential twisted modular operads, and loop homotopy algebra structures on (V, B) in the sense of Def. 1.



Sketch of proof. Description (43) shows that a map a of (44) is determined by its values $\xi_n^g := a(\omega_n^g) \in \widetilde{\mathrm{End}}_{\uparrow^2 V}((g, n))$ on the generators. Moreover, the map a ought to commute with the differentials, so the equation

\[ a(\partial_F(\omega_n^g)) = 0 \tag{45} \]

must be satisfied, for each g ≥ 0 and n ≥ 1. Observe that $\xi_n^g \in \widetilde{\mathrm{End}}_{\uparrow^2 V}((g, n))$ can be interpreted as a degree −2(n + 1) element of the graded vector space V^{⊗n+1}. Let us introduce a map Φ : V^{⊗n+1} → Hom(V^{⊗n}, V) by

\[ \Phi(x_0 \otimes \cdots \otimes x_n)(v_1, \ldots, v_n) := (-1)^{n x_0 + (n-1) x_1 + \cdots + x_{n-1}}\, x_0\, B(x_1, v_1) B(x_2, v_2) \cdots B(x_n, v_n), \tag{46} \]

for x_0 ⊗ · · · ⊗ x_n ∈ V^{⊗n+1} and v_1, . . . , v_n ∈ V. The map Φ is clearly a degree 3n isomorphism of V^{⊗n+1} and Hom(V^{⊗n}, V). Finally, let l_n^g : V^{⊗n} → V be a homogeneous degree n − 2 map given by

\[ l_n^g(v_1, \ldots, v_n) := (-1)^{\frac{n(n+1)}{2} + n(v_1 + \cdots + v_n)}\, \Phi(\omega_n^g)(v_1, \ldots, v_n), \quad \text{for } v_1, \ldots, v_n \in V. \]

A long but straightforward calculation shows that the l_n^g are antisymmetric operations satisfying (13) and that (45) translates to the main identity (11). On the other hand, all steps above can clearly be reversed, thus a loop homotopy Lie algebra structure induces a map (44).

Remark 4. Observe that Theorem 2 is formulated in such a way that the differential ∂ on V is a part of the structure, namely ∂ := a(ω_1^0).

6. Possible Generalizations (Open Strings)

Let P be an operad. It is now well understood what a “strongly homotopy P-algebra” is. In the case when P is Koszul, it is an algebra over the cobar construction on the quadratic dual P^! of P [9, Def. 4.2.14]. An alternative characterization is that a homotopy P-algebra is a square-zero differential on the cofree connected P^!-coalgebra. The equivalence of these two characterizations follows for example from [9, Prop. 4.2.15]. The quadratic dual of the operad Lie for Lie algebras is Com, the operad for commutative associative algebras, and the above characterization gives Proposition 6, resp. Proposition 7. Another example is P = Ass, the operad for associative algebras. It is quadratic self-dual, P^! = Ass, and the corresponding strongly homotopy algebras are called strongly homotopy associative or A∞-algebras [18, 15].

Let us look for possible generalizations to the loop case. If P is a cyclic operad (recall that both Lie and Ass are cyclic), the quadratic dual P^! is again cyclic [7], so it makes sense to consider the modular completion Mod(P^!) (Def. 2). We suggest the following definition.

Definition 3. Let P be a Koszul cyclic operad. A loop homotopy P-algebra is a modular algebra over the twisted differential modular operad F(Mod(P^!)).


For P = Lie we get Theorem 2. It would be interesting to write out explicitly the axioms of loop homotopy associative algebras, because these structures should play an important rôle in the higher-genera open string field theory, as suggested by [19]. While in the Lie case we had, for each n and g, only one antisymmetric operation l_n^g : V^{⊗n} → V, in the loop homotopy associative case we expect to have

\[ \frac{(n + 1)!}{2^g \cdot g! \cdot (n + 1 - 2g)!} \]

operations V^{⊗n} → V, due to the dimension of Mod(Ass)((g, n)).

A seemingly easier approach would be the one based on coderivations. We would like to say that a loop homotopy P-algebra is an order 2 coderivation of the cofree connected P^!-coalgebra, having properties analogous to (22). This works nicely for P = Lie, because we know what a higher-order coderivation of a cocommutative coalgebra is. But we are not sure whether there exists a reasonable concept of higher-order coderivations without the cocommutativity, though the paper [4] seems to suggest this.

Acknowledgement. I would like to express my gratitude to Jim Stasheff for reading the manuscript and for many helpful remarks and suggestions.

References
1. Akman, F.: On some generalizations of Batalin–Vilkovisky algebras. Preprint q-alg/9506027, June 1995
2. Akman, F.: Multibraces on the Hochschild complex. Preprint q-alg/9702010, February 1997
3. Alfaro, J., Bering, K., Damgaard, P.H.: Algebra of higher antibrackets. Preprint hep-th/9604027, April 1996
4. Alfaro, J., Damgaard, P.H.: Non-Abelian antibrackets. Preprint hep-th/9511066, November 1995
5. Behrend, K., Manin, Yu.: Stacks of stable maps and Gromov-Witten invariants. Preprint alg-geom/9506023, June 1995
6. Getzler, E., Jones, J.D.S.: Operads, homotopy algebra, and iterated integrals for double loop spaces. Preprint, 1993
7. Getzler, E., Kapranov, M.M.: Cyclic operads and cyclic homology. In: S.-T. Yau, editor, Geometry, Topology and Physics for Raoul Bott, Volume 4 of Conf. Proc. Lect. Notes. Geom. Topol., Cambridge, MA: International Press, 1995, pp. 167–201
8. Getzler, E., Kapranov, M.M.: Modular operads. Compositio Math. 110 (1), 65–126 (1998)
9. Ginzburg, V., Kapranov, M.M.: Koszul duality for operads. Duke Math. J. 76 (1), 203–272 (1994)
10. Kapranov, M.M.: Operads in algebraic geometry. Documenta Mathematica Extra Volume ICM, pp. 277–286 (1998)
11. Kimura, T., Stasheff, J.D., Voronov, A.A.: On operad structures of moduli spaces and string theory. Commun. Math. Phys. 171, 1–25 (1995)
12. Kontsevich, M.: Graphs, homotopical algebra and low-dimensional topology. Preprint, 1994
13. Lada, T., Markl, M.: Strongly homotopy Lie algebras. Communications in Algebra 23 (6), 2147–2161 (1995)
14. Lada, T., Stasheff, J.D.: Introduction to sh Lie algebras for physicists. International J. Theor. Phys. 32 (7), 1087–1103 (1993)
15. Markl, M.: A cohomology theory for A(m)-algebras and applications. J. Pure Appl. Algebra 83, 141–175 (1992)
16. Markl, M., Shnider, S., Stasheff, J.D.: Operads in algebra, topology and mathematical physics. Book, work in progress
17. May, J.P.: The Geometry of Iterated Loop Spaces. Lecture Notes in Mathematics Vol 271. Berlin–Heidelberg–New York: Springer-Verlag, 1972
18. Stasheff, J.D.: Homotopy associativity of H-spaces I, II. Trans. Am. Math. Soc. 108, 275–312 (1963)
19. Stasheff, J.D.: Higher homotopy algebras: String field theory and Drinfel’d quasi-Hopf algebras. In: Proceedings of the XXth International Conference on Differential Geometric Methods in Theoretical Physics, Baruch College, CUNY, June 1991, Singapore: World Scientific, 1992, pp. 408–425
20. Zwiebach, B.: Closed string field theory: Quantum action and the Batalin–Vilkovisky master equation. Nucl. Phys. B 390, 33–152 (1993)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 220, 385 – 432 (2001)

Communications in Mathematical Physics



Noncommutative Instantons and Twistor Transform

Anton Kapustin1,*, Alexander Kuznetsov2,**, Dmitri Orlov3,***

1 School of Natural Sciences, Institute for Advanced Study, Olden Lane, Princeton, NJ 08540, USA.
2 Institute for Problems of Information Transmission, Russian Academy of Sciences, 19 Bolshoi Karetnyi, Moscow 101447, Russia. E-mail: [email protected]; [email protected]
3 Algebra Section, Steklov Mathematical Institute, Russian Academy of Sciences, 8 Gubkin str., GSP-1, Moscow 117966, Russia. E-mail: [email protected]

Received: 3 May 2000 / Accepted: 3 April 2001

* Supported by DOE grant DE-FG02-90ER4054442.
** Supported by NSF grant DMS97-29992 and RFFI grants 99-01-01144, 99-01-01204.
*** Supported by NSF grant DMS97-29992 and RFFI grant 99-01-01144.

Dedicated to A.N. Tyurin on his 60th birthday

Abstract: Recently N. Nekrasov and A. Schwarz proposed a modification of the ADHM construction of instantons which produces instantons on a noncommutative deformation of R^4. In this paper we study the relation between their construction and algebraic bundles on noncommutative projective spaces. We exhibit one-to-one correspondences between three classes of objects: framed bundles on a noncommutative P^2, certain complexes of sheaves on a noncommutative P^3, and the modified ADHM data. The modified ADHM construction itself is interpreted in terms of a noncommutative version of the twistor transform. We also prove that the moduli space of framed bundles on the noncommutative P^2 has a natural hyperkähler metric and is isomorphic as a hyperkähler manifold to the moduli space of framed torsion free sheaves on the commutative P^2. The natural complex structures on the two moduli spaces do not coincide but are related by an SO(3) rotation. Finally, we propose a construction of instantons on a more general noncommutative R^4 than the one considered by Nekrasov and Schwarz (a q-deformed R^4).

1. Physical Motivation

In this section we explain the physical motivation for studying instantons on a noncommutative R^4. Readers uninterested in the motivation may skip most of this section and proceed directly to Subsect. 1.5. Likewise, readers familiar with the way noncommutative instantons arise in string theory may start with Subsect. 1.5.

1.1. Instanton equations. Let E be a vector bundle with structure group G on an oriented Riemannian 4-manifold X, and let A be a connection on E. The instanton equation is the equation

\[ F_A^+ = 0, \tag{1} \]

where F_A is the curvature of A, and F_A^+ denotes the self-dual (SD) part of F_A. Solutions of this equation are called instantons, or anti-self-dual (ASD) connections. The second Chern class of E is known in the physics literature as the instanton number. Instantons automatically satisfy the Yang–Mills equation d_A(∗F_A) = 0, where d_A : Ω^p ⊗ End(E) → Ω^{p+1} ⊗ End(E) is the covariant differential, and ∗ : Ω^p → Ω^{4−p} is the Hodge star operator.

There are several physical reasons to be interested in instantons. If one is studying quantum gauge theory on a Riemannian 3-manifold M (space), then instantons on X = M × R describe quantum-mechanical tunneling between different classical vacua. The possibility of such tunneling has drastic physical effects, some of which can be experimentally observed. If one is studying classical gauge theory on a 5-dimensional space-time X × R, then instantons on X can be interpreted as solitons, i.e. as static solutions of the Yang–Mills equations of motion. In fact, instantons are the absolute minima of the Yang–Mills energy function of the 5-dimensional theory (with fixed second Chern class). Both interpretations arise in string theory, but to explain this we need to make a digression and discuss D-branes.

1.2. D-branes. It has been discovered in the last few years that string theory describes, besides strings, extended objects (branes) of various dimensions. These extended objects should be regarded as static solutions of (as yet poorly understood) stringy equations of motion. D-branes are a particularly manageable class of branes. Recall that ordinary closed oriented superstrings, known as Type II strings, are described by maps from a Riemann surface Σ (“worldsheet”) to a 10-dimensional manifold Z (“target”). The physical definition of a D-brane is “a submanifold of Z on which strings can end”. This means that if a D-brane is present, then one needs to consider maps from a Riemann surface with boundaries to Z such that the boundaries are mapped to a certain submanifold X ⊂ Z. In this case one says that there is a D-brane wrapped on X. If X is connected and has dimension p + 1, then one says that one is dealing with a Dp-brane. In general, X can have several components with different dimensions, and then each component corresponds to a D-brane.

In perturbative string theory, the role of equations of motion is played by the condition that a certain auxiliary quantum field theory on the Riemann surface Σ is conformally invariant. When D-branes are present, Σ has boundaries, and the auxiliary theory must be supplemented with boundary conditions. The requirement that the boundary conditions preserve conformal invariance imposes constraints on the submanifold X. These constraints should be regarded as equations of motion for D-branes. For example, if we consider a D0-brane wrapped on a 1-dimensional submanifold X, then conformal invariance requires that X be a geodesic in Z. This is the usual equation of motion for a relativistic particle moving in Z.

An important subtlety is that to specify fully the boundary conditions for the auxiliary theory on Σ it is not sufficient to specify X; one should also specify a unitary vector bundle E on X and a connection on it. In the simplest case this bundle has rank 1, but one can also have “multiple” D-branes, described by bundles of rank r > 1. Such bundles describe r coincident D-branes wrapped on the same submanifold X. Using the requirement of conformal invariance of the auxiliary two-dimensional quantum field theory, one can derive equations of motion for the Yang–Mills connection on E. In the low-energy approximation, the equations of motion are the usual Yang–Mills equations d_A(∗F_A) = 0. In particular, instantons are solutions of these equations.

1.3. Instantons and D-branes. Let Z be R^10 with a flat metric, and let X → Z be R^5 = R^4 × R linearly embedded in Z. We regard R^4 as space and R as time. Consider r D4-branes wrapped on X. This physical system is described by the Yang–Mills action on R^5 = R^4 × R. If one is looking for static solutions of the equations of motion, one needs to consider the minima of the Yang–Mills energy function

\[ W[A] = \int_{\mathbb{R}^4} \|F_A\|^2, \]

where F_A is the curvature of a U(r) connection A, and ||F_A||^2 = −Tr(F_A ∧ ∗F_A). The instanton number of A is defined by

\[ c_2 = \frac{1}{8\pi^2} \int_{\mathbb{R}^4} \mathrm{Tr}(F_A \wedge F_A). \tag{2} \]

If the Yang–Mills energy evaluated on A is finite, then the bundle E and the connection A extend to S^4, the one-point compactification of R^4 (see [4] for details). In this case c_2 is the second Chern class of E and is therefore an integer. Solutions of instanton equations on R^4 are precisely the absolute minima of the Yang–Mills energy function. These solutions should be regarded as composed of identical particle-like objects (instantons) on X, their number being c_2. Since the energy of the instanton is proportional to c_2, all “particles” have the same mass. Since the solution is static, the particles neither repel nor attract. This is actually a consequence of supersymmetry: Type II string theory is supersymmetric, and D4-branes with instantons on them leave part of supersymmetry unbroken.

In string theory one may also consider k D0-branes present simultaneously with r D4-branes. More specifically, we will consider D0-branes which are at rest, i.e. the corresponding one-dimensional manifolds are straight lines parallel to the time axis. Such a configuration of branes is also supersymmetric, and consequently there are no forces between any of the branes. The positions of D0-branes are not constrained by anything, so their moduli space is (R^9)^k. More precisely, since D0-branes are indistinguishable, the moduli space is Sym^k(R^9).

It turns out that an instanton with instanton number k and k D0-branes are related: they can be deformed into each other without any cost in energy. A convenient point of view is the following. In the presence of D4-branes wrapped on X the moduli space of D0-branes has two branches: a branch where their positions are unconstrained and D0-branes are point-like (this branch is isomorphic to Sym^k(R^9)), and the branch where they are constrained to lie on X. The latter branch is isomorphic to the moduli space M_{r,k} of U(r) instantons on X = R^4 with c_2 = k. The dimension of M_{r,k} is known to be 4rk for r > 1 (see for example [4]). For r = 1 instantons do not exist. The translation group of R^4 acts freely on M_{r,k}, and the quotient space describes the relative positions and sizes of instantons. Thus D0-branes are point-like objects when they are away from D4-branes, but when they bind to D4-branes they can acquire finite size.
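The statement in this subsection that solutions of the instanton equation (1) are the absolute minima of the Yang–Mills energy at fixed c_2 follows from a short standard computation. The sketch below uses only the conventions stated in the text (the pointwise norm ||F_A||^2 = −Tr(F_A ∧ ∗F_A) and the normalization (2)); it assumes enough decay at infinity for all integrals to converge, and the symbol F_A^- for the anti-self-dual part is introduced here for the illustration.

```latex
\begin{align*}
F_A &= F_A^+ + F_A^-, \qquad *F_A^\pm = \pm F_A^\pm, \qquad
      \mathrm{Tr}\,(F_A^+ \wedge F_A^-) = 0, \\
\|F_A\|^2 &= \|F_A^+\|^2 + \|F_A^-\|^2, \\
\mathrm{Tr}\,(F_A \wedge F_A)
  &= \mathrm{Tr}\,(F_A^+ \wedge *F_A^+) - \mathrm{Tr}\,(F_A^- \wedge *F_A^-)
   = \|F_A^-\|^2 - \|F_A^+\|^2, \\
W[A] &= \int_{\mathbb{R}^4} \bigl( \|F_A^+\|^2 + \|F_A^-\|^2 \bigr)
   = 8\pi^2 c_2 + 2 \int_{\mathbb{R}^4} \|F_A^+\|^2 \;\ge\; 8\pi^2 c_2 .
\end{align*}
```

Equality holds exactly when F_A^+ = 0, i.e. when A satisfies (1), and in that case W[A] = 8π^2 c_2, which is the proportionality between energy and instanton number used above.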



The “instanton” branch touches the “point-like” branch at submanifolds where some or all of the instantons shrink to zero size. These are the submanifolds where the instanton moduli space is singular. At these submanifolds the point-like instantons can detach from D4-branes and start a new life as D0-branes. This lowers the second Chern class of the bundle on D4-branes. Thus from the string theory perspective it is natural to glue together the moduli spaces of instantons with different Chern classes along singular submanifolds.

1.4. Noncommutative geometry and D-branes. Instanton equations (and, more generally, Yang–Mills equations) arise in the low-energy limit of string theory, or equivalently for large string tension. Recently, another kind of low-energy limit of string theory was discussed in the literature [32]. Consider a trivial U(r)-bundle on X = R^4 with a connection A whose curvature F_A is of the form 1 ⊗ f, where 1 is the unit section of End(E), and f is a constant nondegenerate 2-form. For small f the D4-branes are described by the ordinary Yang–Mills action, but for large F_A the stringy equations of motion get complicated. It turns out that the equations of motion simplify again in the limit when both F_A and the string tension are taken to infinity, with a certain combination of the two kept fixed (one also has to scale the metric appropriately, see [32]). We will call this limit the Seiberg–Witten limit. In this limit the D4-branes are described by Yang–Mills equations on a certain noncommutative deformation of R^4 (see [32] and references therein).

There is another description of the Seiberg–Witten limit, which is gauge-equivalent to the previous one. Type II string theory reduces at low energies to Type II supergravity in 10 dimensions. The bosonic fields of this low-energy theory include a symmetric rank-two tensor (metric) and a 2-form B. R^10 with a flat Lorentzian metric and a constant B is a solution of the supergravity equations of motion, as well as of the full stringy equations of motion. A constant B can be gauged away, so this is not a very interesting solution. Life gets more interesting if there are D-branes present. For example, consider r coincident flat D4-branes embedded in R^10 with a constant B-field. It turns out that one can gauge away a constant B-field only at the expense of introducing a constant F_A of the form 1 ⊗ f, where f is equal to the pull-back of B to the worldvolume of the D4-branes. Thus the solution with zero F_A and nonzero B is equivalent to the solution with nonzero F_A and zero B. Therefore the Seiberg–Witten limit can be described as the limit in which both the B-field and the string tension become infinite. The idea that D-branes in a nonzero B-field are described by Yang–Mills theory on a noncommutative space was first put forward in [13] for the case of D-branes wrapped on tori.

1.5. Instanton equations on a noncommutative R^4. The deformed R^4 that one obtains in the Seiberg–Witten limit is completely characterized by its algebra of functions A. It is a noncommutative algebra whose underlying space is a certain subspace of C^∞ functions on R^4. The product is the so-called Wigner–Moyal product formally given by

\[ (f \star g)(x) = \lim_{y \to x} \exp\left( \frac{1}{2} \hbar \theta_{ij} \frac{\partial^2}{\partial x_i \partial y_j} \right) f(x) g(y). \tag{3} \]

Here θ is a purely imaginary matrix, and ħ is a real parameter (“Planck constant”) which is introduced to emphasize that the Wigner–Moyal product is a deformation of the usual product. In the string theory context θ is proportional to f^{−1}.
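On polynomials the exponential series in (3) terminates, so the product can be evaluated exactly. The following sketch does this with polynomials stored as dictionaries from exponent tuples to coefficients. All names (`var`, `star`, the matrix `T`) are illustrative assumptions; `T` stands for the combined matrix ħθ, and rational antisymmetric entries are used in place of purely imaginary ones only to keep the arithmetic exact, which does not affect the algebraic identities being checked.

```python
from fractions import Fraction
from itertools import product

DIM = 4  # coordinates x_0, ..., x_3 on R^4

def var(i):
    """The coordinate function x_i as a polynomial."""
    e = [0] * DIM
    e[i] = 1
    return {tuple(e): Fraction(1)}

def add(p, q):
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + c
        if out[e] == 0:
            del out[e]
    return out

def scale(p, c):
    return {e: c * a for e, a in p.items()} if c != 0 else {}

def mul(p, q):
    """Ordinary commutative product of polynomials."""
    out = {}
    for (e1, c1), (e2, c2) in product(p.items(), q.items()):
        e = tuple(a + b for a, b in zip(e1, e2))
        out[e] = out.get(e, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def diff(p, i):
    """Partial derivative with respect to x_i."""
    out = {}
    for e, c in p.items():
        if e[i] > 0:
            e2 = list(e)
            e2[i] -= 1
            out[tuple(e2)] = c * e[i]
    return out

def star(f, g, T):
    """Moyal product: sum over n of (1/n!) (1/2 T_ij d_i (x) d_j)^n applied to f (x) g."""
    result = mul(f, g)          # n = 0 term
    terms = [(f, g)]            # pairs (a, b) representing a (x) b
    n = 0
    while terms:                # stops because derivatives lower the degree
        n += 1
        new_terms = []
        for a, b in terms:
            for i in range(DIM):
                da = diff(a, i)
                if not da:
                    continue
                for j in range(DIM):
                    if T[i][j] == 0:
                        continue
                    db = diff(b, j)
                    if db:
                        # the factor 1/(2n) accumulates to (1/2)^n / n!
                        new_terms.append((scale(da, Fraction(T[i][j]) / (2 * n)), db))
        terms = new_terms
        for a, b in terms:
            result = add(result, mul(a, b))
    return result

# An antisymmetric sample matrix playing the role of hbar * theta.
T = [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 3], [0, 0, -3, 0]]
x = [var(i) for i in range(DIM)]
commutator_01 = add(star(x[0], x[1], T), scale(star(x[1], x[0], T), -1))
```

With this sample matrix the commutator of x_0 and x_1 comes out as the constant T[0][1], which is the relation [x_i, x_j] = ħθ_ij satisfied by the coordinate functions under (3).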



Of course, to make sense of this definition we must specify a subspace in the space of C^∞ functions which is closed under the Wigner–Moyal product. Leaving this question aside for a moment,¹ one can define the exterior differential calculus over A. Differential geometry of noncommutative spaces has been developed by A. Connes [12]. In our situation Connes’ general theory is greatly simplified. For example, the sheaf of 1-forms Ω^1(A) is simply a bimodule A^{⊕4} (the relation of this definition with the general theory is explained in Subsect. 8.11). The elements of Ω^1(A) will be denoted Σ_i f^i(x) dx_i, or simply f^i(x) dx_i. The exterior differential d is a vector space morphism

\[ d : A \to \Omega^1(A), \qquad f \mapsto \frac{\partial f}{\partial x_i}\, dx_i. \]

The exterior differential d satisfies the Leibniz rule

\[ d(f_1 \star f_2) = df_1 \star f_2 + f_1 \star df_2. \]

This makes sense because Ω^1(A) is a bimodule. The sheaf of 2-forms over A is a bimodule Ω^2(A) = A^{⊕6} (see Subsect. 8.11). The definition of the exterior differential can be extended to Ω^1(A) in an obvious manner.

Complex conjugation acts as an anti-linear anti-homomorphism of A, i.e. $\overline{f \star g} = \bar g \star \bar f$. Thus A has a natural structure of a ∗-algebra. We will denote the ∗-conjugate of f ∈ A by f†.

A trivial bundle over the noncommutative R^4 is defined as a free A-module E. A trivial unitary bundle over the noncommutative R^4 is defined as a free module V ⊗_C A, where V is a Hermitian vector space. A connection on a trivial bundle E is defined as a map ∇ : E → E ⊗_A Ω^1(A), which is a vector space morphism satisfying the Leibniz rule

\[ \nabla(m \star f) = \nabla(m) \star f + m \star df. \]

This formula makes use of the bimodule structure on Ω^1(A). The curvature F_∇ = [∇, ∇] is a morphism of A-modules F_∇ : E → E ⊗_A Ω^2(A). As in the commutative case, a connection on a trivial bundle E can be written in terms of a connection 1-form A ∈ End_A(E) ⊗_A Ω^1(A):

\[ \nabla(m) = dm + A \star m. \]

This formula uses the bimodule structure on m. If E is a unitary bundle, and we have A† = −A, then we say that A is a unitary connection. The curvature is given in terms of A by the usual formula

\[ F_\nabla := F_A = dA + A \wedge A. \]

Here it is understood that f^i dx_i ∧ g^j dx_j = (f^i ⋆ g^j) dx_i ∧ dx_j.

¹ String theory considerations do not shed light on this problem.



The instanton equation on A is again given by (1), and the instanton number is defined by (2).

The most obvious choice of the space of functions closed under the Wigner–Moyal product is the space of polynomial functions. However, this choice is not suitable for our purposes because it precludes the decrease of F_A at infinity which is necessary for the instanton action to converge. In the commutative case, components of an instanton connection are rational functions [4], so we would like our class of functions to include rational functions on R^4. A possible choice for the underlying set of A is the set of C^∞ functions on R^4 all of whose derivatives are polynomially bounded. Then we face the question of the convergence of the series (3). To avoid dealing with this issue, we modify our definition of the Wigner–Moyal product (see the Appendix for details). The modified product makes the space of C^∞ functions all of whose derivatives are polynomially bounded into an algebra over C, and agrees with (3) on polynomial functions. Polynomial functions form a subalgebra of A. This subalgebra is isomorphic to the algebra generated by four variables x_i, i = 1, 2, 3, 4, with relations

\[ [x_i, x_j] = \hbar \theta_{ij}. \]

This algebra is usually called the Weyl algebra.

To summarize, there is a limit of string theory in which D4-branes are described by Yang–Mills equations on the noncommutative R^4 (= A). D0-branes bound to D4-branes are described in this limit by the instanton equations on the noncommutative R^4. One can show that, unlike in the commutative case, instantons cannot be deformed to point-like D0-branes without a cost in energy. Thus it is natural to suspect that the moduli space of instantons on the noncommutative R^4 is metrically complete.

2. Review of the ADHM Construction and Summary

All instantons on the commutative R^4 arise from the so-called ADHM construction. Recently N. Nekrasov and A. Schwarz [29] introduced a modification of this construction which produces instantons on the noncommutative R^4.² In the commutative case the completeness of the ADHM construction can be proved using the twistor transform of R. Penrose, so one could hope that the same approach could work in the noncommutative case. In this paper we show that the deformed ADHM data of [29] describe holomorphic bundles on certain noncommutative algebraic varieties and interpret the deformed ADHM construction in terms of a noncommutative twistor transform. In this section we review both the ordinary and the deformed ADHM constructions and make a summary of our results.

² As in the commutative case, one may consider different classes of functions on the noncommutative R^4: polynomial, C^∞ functions rapidly decreasing at infinity, C^∞ functions all of whose derivatives are polynomially bounded, etc. Our class of functions differs somewhat from that adopted in [29].

2.1. Review of the ADHM construction of instantons. First let us outline the ADHM construction of U(r) instantons on the commutative R^4 following [15]. We assume that the constant metric G on R^4 has been brought to the standard form G = diag(1, 1, 1, 1) by a linear change of basis. To construct a U(r) instanton with c_2 = k one starts with two Hermitian vector spaces V ≅ C^k and W ≅ C^r. The ADHM data consist of four linear maps B_1, B_2 ∈ Hom(V, V), I ∈ Hom(W, V), J ∈ Hom(V, W) which satisfy the following two conditions:



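(i) $\mu_c = [B_1, B_2] + IJ = 0$, $\mu_r = [B_1, B_1^\dagger] + [B_2, B_2^\dagger] + II^\dagger - J^\dagger J = 0$.
(ii) For any $\xi = (\xi_1, \xi_2) \in \mathbb{C}^2 \cong \mathbb{R}^4$ the linear map $D_\xi \in \mathrm{Hom}(V \oplus V \oplus W, V \oplus V)$ defined by

$$D_\xi = \begin{pmatrix} B_1 - \xi_1 & -B_2 + \xi_2 & I \\ B_2^\dagger - \bar\xi_2 & B_1^\dagger - \bar\xi_1 & J^\dagger \end{pmatrix} \qquad (4)$$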

is surjective.

The equations $\mu_c = \mu_r = 0$ are called the ADHM equations. They are invariant with respect to the action of the group of unitary transformations of V. Solutions of these equations are called ADHM data. The space of ADHM data modulo U(V) transformations has dimension 4rk and carries a natural hyperkähler metric. The ADHM construction identifies this moduli space with the moduli space of U(r) instantons with $c_2 = k$ and fixed trivialization at infinity. The role of the condition (ii) above is to remove submanifolds in this moduli space where the hyperkähler metric becomes singular (these are point-like instanton singularities mentioned in Subsect. 1.3). As a result the moduli space of the ADHM data is metrically incomplete.

The instanton connection can be reconstructed from the ADHM data as follows. The condition (ii) implies that the family $\mathrm{Ker}\, D_\xi$ forms a trivial subbundle of $V \oplus V \oplus W$ of rank r. Let $v(\xi)$ be its trivialization, i.e. a linear map $v(\xi) : \mathbb{C}^r \to V \oplus V \oplus W$ smoothly depending on $\xi$ such that $D_\xi v(\xi) = 0$ for all $\xi$, and $\rho(\xi) = v(\xi)^\dagger v(\xi)$ is an isomorphism for all $\xi$. We set $A(\xi) = \rho(\xi)^{-1} v(\xi)^\dagger dv(\xi)$. The matrix-valued one-form A is a connection on a trivial unitary bundle of rank r. One can show that its curvature $F_A$ is ASD (see [4]). However, it does not satisfy $A^\dagger = -A$, because we are not using a unitary gauge. Instead A satisfies

$$A^\dagger(\xi) = -\bigl(\rho(\xi) A(\xi) \rho(\xi)^{-1} + \rho(\xi)\, d\rho(\xi)^{-1}\bigr).$$

To go to a unitary gauge, we must make a gauge transformation

$$A'(\xi) = g(\xi) A(\xi) g(\xi)^{-1} + g(\xi)\, dg(\xi)^{-1},$$

where $g(\xi)$ is a function taking values in Hermitian $r \times r$ matrices and satisfying $g(\xi)^2 = \rho(\xi)$.

We now explain, following [29], how to modify the ADHM construction so that it produces rank r instantons on the noncommutative $\mathbb{R}^4$ defined in the previous section. It proves convenient to apply an orthogonal transformation which brings the matrix $\theta$ in (3) to the standard form

$$\theta = \sqrt{-1}\begin{pmatrix} 0 & a & 0 & 0 \\ -a & 0 & 0 & 0 \\ 0 & 0 & 0 & b \\ 0 & 0 & -b & 0 \end{pmatrix}.$$



We will assume that $a + b \neq 0$. Since $\theta$ enters only in the combination $\hbar\theta$, we can set $a + b = 1$ without loss of generality. The relation between the affine coordinates $\xi_1, \xi_2$ on $\mathbb{C}^2$ and affine coordinates $x_1, x_2, x_3, x_4$ on $\mathbb{R}^4$ is chosen as follows:

$$\xi_1 = x_4 - \sqrt{-1}\, x_3, \qquad \xi_2 = -x_2 + \sqrt{-1}\, x_1.$$

Then $\xi_1, \xi_2, \bar\xi_1, \bar\xi_2$ obey the Weyl algebra relations

$$[\xi_1, \bar\xi_1] = 2\hbar b, \qquad [\xi_2, \bar\xi_2] = 2\hbar a, \qquad [\xi_1, \xi_2] = [\xi_1, \bar\xi_2] = [\bar\xi_1, \xi_2] = [\bar\xi_1, \bar\xi_2] = 0.$$
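As a check, the first of these relations can be derived directly from the definition of $\xi_1$. The sketch below assumes that the standard form of $\theta$ amounts to $[x_1, x_2] = \sqrt{-1}\,\hbar a$ and $[x_3, x_4] = \sqrt{-1}\,\hbar b$, with all other commutators among the $x_i$ vanishing, and that $\bar\xi_1 = x_4 + \sqrt{-1}\, x_3$ (the $x_i$ being treated as self-adjoint).

```latex
\begin{align*}
[\xi_1,\bar\xi_1]
  &= [\,x_4-\sqrt{-1}\,x_3,\; x_4+\sqrt{-1}\,x_3\,] \\
  &= \sqrt{-1}\,[x_4,x_3] - \sqrt{-1}\,[x_3,x_4] \\
  &= -2\sqrt{-1}\,[x_3,x_4]
   = -2\sqrt{-1}\cdot\sqrt{-1}\,\hbar b
   = 2\hbar b .
\end{align*}
```

The same computation with $\xi_2 = -x_2 + \sqrt{-1}\,x_1$ gives $2\hbar a$, and the mixed commutators vanish because they only pair coordinates from different $2 \times 2$ blocks of $\theta$.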

The modified ADHM data consist of the same four maps which now satisfy $\mu_c = 0$, $\mu_r = -2\hbar(a + b) \cdot 1_{k\times k}$. The instanton connection is given by essentially the same formulas as in the commutative case. The operator D is given by the same formula as $D_\xi$, but is now regarded as an element of $\mathrm{Hom}_A((V \oplus V \oplus W) \otimes_{\mathbb{C}} A, (V \oplus V) \otimes_{\mathbb{C}} A)$. The module Ker D is a projective module over A. Following [10], we assume that it is isomorphic to a free module of rank r, and v is the corresponding isomorphism $v : A^{\oplus r} \to \mathrm{Ker}\, D$. We further assume [10] that the morphism $\Delta = DD^\dagger \in \mathrm{End}_A((V \oplus V) \otimes A)$ is an isomorphism.³ Then it is easy to see that $\rho = v^\dagger v \in \mathrm{End}_A(\mathbb{C}^r \otimes A)$ is an isomorphism too. We set

$$A = \rho^{-1} v^\dagger dv. \qquad (5)$$
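Formula (5) can be used to see explicitly how far A is from being anti-Hermitian. The following is a sketch, assuming only that d obeys the Leibniz rule for the product in use, that $\rho^\dagger = \rho$, and that the dagger reverses the order of factors.

```latex
\begin{align*}
d\rho &= dv^\dagger\, v + v^\dagger\, dv,
\qquad \rho\, d\rho^{-1} = -\,d\rho\;\rho^{-1}, \\
A^\dagger &= (\rho^{-1} v^\dagger dv)^\dagger
           = dv^\dagger\, v\,\rho^{-1}
           = (d\rho - v^\dagger dv)\,\rho^{-1} \\
          &= d\rho\;\rho^{-1} - \rho\,(\rho^{-1} v^\dagger dv)\,\rho^{-1}
           = -\bigl(\rho A \rho^{-1} + \rho\, d\rho^{-1}\bigr).
\end{align*}
```

This reproduces the relation quoted for the commutative case, which suggests that the same gauge-fixing step is needed here.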

(The multiplication here and below is understood to be the Wigner–Moyal multiplication.) This formula defines a connection 1-form on a trivial unitary bundle on A of rank r. The curvature of this connection is given by

$$F_A = \rho^{-1}\, dv^\dagger \wedge (1 - v\rho^{-1} v^\dagger)\, dv.$$

A short computation (essentially the same as in the commutative case) shows that the curvature can be written in the form

$$F_A = \rho^{-1} v^\dagger\, dD^\dagger\, \Delta^{-1} \wedge dD\, v.$$

Furthermore, since D and $D^\dagger$ are linear in $\xi_i, \bar\xi_i$, their exterior derivatives have a very simple form:

$$dD = \begin{pmatrix} -d\xi_1 & d\xi_2 & 0 \\ -d\bar\xi_2 & -d\bar\xi_1 & 0 \end{pmatrix}, \qquad dD^\dagger = \begin{pmatrix} -d\bar\xi_1 & -d\xi_2 \\ d\bar\xi_2 & -d\xi_1 \\ 0 & 0 \end{pmatrix}.$$

³ One can show that the latter assumption is always valid provided $\hbar \neq 0$. As for the former one, it is not known what constraints should be imposed on the deformed ADHM data to ensure that Ker D is a free A-module of rank r. For r = 1, Ker D is never free [16].
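To illustrate the deformed ADHM equations in the simplest situation, take k = r = 1, so that $B_1, B_2, I, J$ are complex numbers. This is only a sketch of the equations themselves: by footnote 3 the freeness assumption on Ker D fails for r = 1, so formula (5) does not apply as stated. The normalization $a + b = 1$ and the sign $\hbar > 0$ are assumptions made here for concreteness.

```latex
\begin{align*}
\mu_c &= [B_1,B_2] + IJ = IJ = 0, \\
\mu_r &= |I|^2 - |J|^2 = -2\hbar(a+b) = -2\hbar
\quad\Longrightarrow\quad
I = 0,\qquad |J|^2 = 2\hbar .
\end{align*}
```

For $\hbar = 0$ the same two equations force $I = J = 0$, and then $D_\xi$ vanishes at $\xi = (B_1, B_2)$, so condition (ii) of the commutative construction fails. With $\hbar \neq 0$ the map J is kept away from zero.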



Note also that by virtue of the deformed ADHM equations $\Delta$ has a block-diagonal form:

$$\Delta = \begin{pmatrix} \delta & 0 \\ 0 & \delta \end{pmatrix},$$

where $\delta \in \mathrm{End}_A(V \otimes A)$ is an isomorphism. Using this fact, one can easily see that $F_A$ is proportional to the 2-forms

$$d\xi_1 \wedge d\bar\xi_1 + d\xi_2 \wedge d\bar\xi_2, \qquad d\xi_1 \wedge d\bar\xi_2, \qquad d\xi_2 \wedge d\bar\xi_1,$$

which are anti-self-dual.

As in the commutative case, the connection A does not satisfy $A^\dagger = -A$. To go to a unitary gauge one has to perform a gauge transformation $A' = g A g^{-1} + g\, dg^{-1}$. Here $g \in \mathrm{Aut}_A(\mathbb{C}^r \otimes A)$ should be found from the conditions $g^\dagger = g$, $g^2 = \rho$. The existence of such g is an additional assumption.

2.2. Summary of results. In the commutative case there is a one-to-one correspondence between the following four classes of objects:
A. Rank r holomorphic bundles on $\mathbb{P}^2$ with $c_2 = k$ and a fixed trivialization on the line at infinity.
B. The set of ADHM data modulo the natural action of U(k).
C. Rank r holomorphic bundles on $\mathbb{P}^3$ with $c_2 = k$, a trivialization on a fixed line, vanishing $H^1(E(-2))$, and satisfying a certain reality condition.
D. U(r) instantons on $\mathbb{R}^4$ with $c_2 = k$.

The correspondence between C and D is a particular instance of twistor transform [6]. The correspondence between B and C has been proved by Atiyah, Hitchin, Drinfeld, and Manin [5, 4]. Together these two results imply that all instantons on $\mathbb{R}^4$ arise from the ADHM construction. The correspondence between A and B has been proved by Donaldson [15]. One can also prove the correspondence between A and D directly [7, 11, 18].

The goal of this paper is to extend some of these results to the noncommutative case. We show that there is a natural one-to-one correspondence between the isomorphism classes of the following objects:
A′. Algebraic bundles on a noncommutative deformation of $\mathbb{P}^2$ with $c_2 = k$ and a fixed trivialization on the line at infinity.
B′. Deformed ADHM data of Nekrasov and Schwarz modulo the natural U(k) action.
C′. Certain complexes of sheaves on a noncommutative deformation of $\mathbb{P}^3$ satisfying reality conditions.

The moduli space of the deformed ADHM data has a natural hyperkähler metric, and the other two moduli spaces inherit this metric.
Furthermore, we reinterpret the deformed ADHM construction of Nekrasov and Schwarz in terms of a noncommutative deformation of the twistor transform.

It is interesting to note that H. Nakajima [27] studied the same linear algebra data as Nekrasov and Schwarz and showed that their moduli space coincides with the moduli space of torsion free sheaves on a commutative $\mathbb{P}^2$ with a trivialization on a fixed line. On the other hand, we show that the same data describe algebraic bundles on a noncommutative $\mathbb{P}^2$. As shown below, the interpretation in terms of complexes of sheaves on a noncommutative $\mathbb{P}^3$ provides a geometric reason for this “coincidence”. We prove that the two moduli spaces are isomorphic as hyperkähler manifolds, but the natural complex structures on them differ by an SO(3) rotation.

The rest of the paper is organized as follows. In Sect. 3 we define noncommutative deformations of certain commutative projective varieties ($\mathbb{P}^2$, $\mathbb{P}^3$, and a quadric in $\mathbb{P}^5$). Section 4 is an algebraic preparation for the study of bundles on noncommutative projective spaces. In Sect. 5 we study the cohomological properties of sheaves on noncommutative $\mathbb{P}^2$ and $\mathbb{P}^3$ and define locally free sheaves (i.e. bundles). In Sect. 6 we show that any bundle on a noncommutative $\mathbb{P}^2$ trivial on the commutative line at infinity arises as a cohomology of a monad. In Sect. 7 we exhibit bijections between A′, B′, and C′ and explain the relation with Nakajima’s results. In Sect. 8 we construct a noncommutative deformation of Grassmannians and flag manifolds and describe a noncommutative version of the twistor transform. We also describe a nice class of noncommutative projective varieties associated with a Yang–Baxter operator and define differential forms on these varieties. In Sect. 9 we consider a more general deformation of $\mathbb{R}^4$ (a q-deformed $\mathbb{R}^4$) whose physical significance is obscure at present. We propose an ADHM-like construction of instantons on this space and outline its relation to noncommutative algebraic geometry. In the Appendix we define the Wigner–Moyal product on the space of $C^\infty$ functions on $\mathbb{R}^n$ all of whose derivatives are polynomially bounded, and prove that the Wigner–Moyal product provides this space with a structure of an algebra over $\mathbb{C}$.

Note added in proof.
After this paper was submitted to the electronic archive, we learned that coherent sheaves on the noncommutative projective plane and their moduli spaces have been studied by L. Le Bruyn [21].

3. Geometry of Noncommutative Varieties

3.1. Algebraic preliminaries. Let k be a base field (we will be dealing only with $k = \mathbb{C}$ or $k = \mathbb{R}$ in this paper). Let A be an algebra over k. It is called right (left) noetherian if every right (left) ideal is finitely generated, and it is called noetherian if it is both right and left noetherian. Let $A = \bigoplus_{i \ge 0} A_i$ be a graded noetherian algebra. We denote by mod(A) the category of finitely generated right A-modules, by gr(A) the category of finitely generated graded right A-modules, and by tors(A) the full subcategory of gr(A) which consists of finite dimensional graded A-modules. An important role will be played by the quotient category qgr(A) = gr(A)/tors(A). It has the following explicit description. The objects of qgr(A) are the objects of gr(A) (we denote by $\tilde{M}$ the object in qgr(A) which corresponds to a module M). The morphisms in qgr(A) are given by

$$\mathrm{Hom}_{\mathrm{qgr}}(\tilde{M}, \tilde{N}) = \varinjlim_{M'} \mathrm{Hom}_{\mathrm{gr}}(M', N),$$

where $M'$ runs over submodules of M such that $M/M'$ is finite dimensional.

On the category gr(A) there is a shift functor: for a given graded module $M = \bigoplus_{i \ge 0} M_i$ the shifted module M(r) is defined by $M(r)_i = M_{r+i}$. The induced shift functor on the quotient category qgr(A) sends $\tilde{M}$ to $\tilde{M}(r) = \widetilde{M(r)}$.


Similarly, we can consider the category Gr(A) of all graded right A-modules. It contains the subcategory Tors(A) of torsion modules. Recall that a module M is called torsion if for any element $x \in M$ one has $x A_{\ge s} = 0$ for some s, where $A_{\ge s} = \bigoplus_{i \ge s} A_i$. We
denote by QGr(A) the quotient category Gr(A)/Tors(A). The category QGr(A) contains qgr(A) as a full subcategory. Sometimes it is convenient to work in QGr(A) instead of qgr(A). Henceforth, all graded algebras will be noetherian algebras generated by the first component A1 with A0 = k. Sometimes we use subscripts R or L for categories gr(A), qgr(A), etc., to specify whether right or left modules are considered. If the subscript is omitted, the modules are taken to be right modules. For the same reason for an A-bimodule M we sometimes write MA or A M to specify whether the right or left module structure is considered. 3.2. Noncommutative varieties. A variety in commutative geometry is a topological space with a sheaf of functions (continuous, smooth, analytic, algebraic, etc.) which is, obviously, a sheaf of algebras. One of the main objects in geometry (algebraic or differential) is a bundle or, more generally, a sheaf. To any variety X we can associate an abelian category of sheaves of modules (maybe with some additional properties) over the sheaf of algebras of functions. Given a sheaf of modules on X, the space of its global sections is a module over the algebra of global functions on X. Thus the functor of global sections associates to every X an algebra and a certain category of modules over it. Under favorable circumstances, much of the information about the geometry of X is contained in this purely algebraic datum. Let us give a few examples. If X is a compact Hausdorff topological space, then the category of vector bundles over X is equivalent to the category of finitely generated projective modules over the algebra of continuous functions on X [34, 36]. The equivalence is given by the functor which maps a vector bundle to the module of its global sections. It is well known that if A is a commutative noetherian algebra, the category of coherent sheaves on the noetherian affine scheme Spec(A) is equivalent to the category of finitely generated modules over A. 
The equivalence is again given by the functor which attaches to a coherent sheaf the module of its global sections.

In the case of projective varieties the only global functions are constants, so one has to act somewhat differently. Since a projective variety X is by definition a subvariety of a projective space, it inherits from it the line bundle $O_X(1)$ and its tensor powers $O_X(i)$. We can consider a graded algebra

$$\Gamma(X) = \bigoplus_{i \ge 0} H^0(X, O_X(i)).$$

This algebra is called the homogeneous coordinate algebra of X. Furthermore, for any sheaf F we can define a graded A-module

$$\Gamma(F) = \bigoplus_{i \ge 0} H^0(X, F(i)).$$

It can be checked that $\Gamma$ is a functor from the category coh(X) of coherent sheaves on X to $\mathrm{gr}(\Gamma(X))$. In a brilliant paper [33], J.-P. Serre described the category of coherent sheaves on a projective scheme X in terms of graded modules over the graded algebra $\Gamma(X)$. He proved that the category coh(X) is equivalent to the quotient category $\mathrm{qgr}(\Gamma(X)) = \mathrm{gr}(\Gamma(X))/\mathrm{tors}(\Gamma(X))$. The equivalence is given by the composition of the functor $\Gamma$ with the projection $\pi : \mathrm{gr}(A) \to \mathrm{qgr}(A)$. On the other hand, let $A = \bigoplus_{i \ge 0} A_i$ be a graded

commutative algebra generated over k by the first component (which is assumed to be finite dimensional). We can associate to A a projective scheme X = Proj(A). Serre proved that the category coh(X) is equivalent to the category qgr(A). The equivalence also holds for the category of quasicoherent sheaves on X and the category QGr(A) = Gr(A)/Tors(A). In all of the above examples it turned out that the natural category of sheaves or bundles on a variety is equivalent to a certain category defined in terms of (graded) modules over some (graded) algebra. On the other hand, “as A. Grothendieck taught us, to do geometry you really don’t need a space, all you need is a category of sheaves on this would-be space” ([25], p. 83). For this reason, in the realm of algebraic geometry it is natural to regard a noncommutative noetherian algebra as a coordinate algebra of a noncommutative affine variety; then the category of finitely generated right modules over this algebra is identified with the category of coherent sheaves on the corresponding variety. Similarly, a noncommutative graded noetherian algebra is regarded as a homogeneous coordinate algebra of a noncommutative projective variety. The category of finitely generated graded right modules over this algebra modulo torsion modules is identified with the category of coherent sheaves on this variety (see [3, 25, 35]). A different approach to noncommutative geometry has been pursued by Connes [12]. 3.3. Noncommutative deformations of commutative varieties. Many important noncommutative varieties arise as deformations of commutative ones. Let X be a commutative variety (affine or projective). Let A be the corresponding commutative (graded) algebra. A noncommutative deformation of X is a deformation of the algebra structure on A, that is, a deformation of the multiplication law. Usually it is not easy to write down an explicit formula for the deformed product. 
There is a more algebraic way to describe noncommutative deformations of commutative varieties. Assume that the algebra A is given in terms of generators and relations. This means that A is given as a quotient $A = T(V)/\langle R \rangle$, where V is the vector space spanned by the generators, T(V) is the tensor algebra of V, and $\langle R \rangle$ is a two-sided ideal in T(V) generated by a subspace of relations $R \subset T(V)$. Assume that $R_\hbar \subset T(V)$ is a one-parameter deformation of the subspace R. Then $A_\hbar = T(V)/\langle R_\hbar \rangle$ is a one-parameter deformation of A. (If A is graded, then we assume that R is a graded subspace, and the deformation preserves the grading.) We denote by $X_\hbar$ the noncommutative variety corresponding to the algebra $A_\hbar$. Thus $X_\hbar$ is a noncommutative one-parameter deformation of X. If X is projective and A is a graded algebra, then we denote by $\mathrm{coh}(X_\hbar)$ the category $\mathrm{qgr}(A_\hbar)$. Furthermore, as in the commutative case, we will write O(r) for the object $\widetilde{A_\hbar}(r)$.

Now we define noncommutative varieties which are going to be used in this paper.

3.4. Noncommutative $\mathbb{C}^4$. Denote by $A(\mathbb{C}^4)$ the algebra of polynomial functions on $\mathbb{C}^4$. Let $\theta$ be a skew-symmetric $4 \times 4$ matrix. Let us define the algebra $A(\mathbb{C}^4_\hbar)$ as an algebra over $\mathbb{C}$ generated by $x_i$ (i = 1, 2, 3, 4) with relations $[x_i, x_j] = \hbar\theta_{ij}$:

$$A(\mathbb{C}^4_\hbar) = T(x_1, x_2, x_3, x_4)/\langle [x_i, x_j] = \hbar\theta_{ij} \rangle_{1 \le i,j \le 4}. \qquad (6)$$



We will regard $A(\mathbb{C}^4_\hbar)$ as the algebra of polynomial functions on a noncommutative affine variety $\mathbb{C}^4_\hbar$.

3.5. Noncommutative 4-dimensional quadric. Let G be a $4 \times 4$ symmetric nondegenerate matrix. Consider a graded algebra $Q_\hbar = \bigoplus_{i \ge 0} Q_i$ over $\mathbb{C}$ generated by the elements $X_1, X_2, X_3, X_4, D, T$ of degree 1 with the following quadratic relations:

$$[T, D] = [T, X_i] = 0, \qquad [X_i, X_j] = \hbar\theta_{ij} T^2, \qquad [D, X_i] = 2\hbar \sum_{l,k} \theta_{il} G_{lk} X_k T, \qquad \sum_{i,j} G_{ij} X_i X_j = DT. \qquad (7)$$
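To see what the relations (7) encode, one may formally invert the central element T (this localization is made precise in Sect. 3.6 below). The sketch uses only that T commutes with all generators, that $[X_i, X_j] = \hbar\theta_{ij} T^2$, and that $\sum_{i,j} G_{ij} X_i X_j = DT$, as read off from (7).

```latex
\begin{align*}
[\,T^{-1}X_i,\; T^{-1}X_j\,]
  &= T^{-2}\,[X_i,X_j]
   = T^{-2}\,\hbar\theta_{ij}\,T^{2}
   = \hbar\theta_{ij}, \\
T^{-1}D
  &= T^{-2}\sum_{i,j} G_{ij} X_i X_j
   = \sum_{i,j} G_{ij}\,(T^{-1}X_i)(T^{-1}X_j).
\end{align*}
```

The first line is the defining relation of $A(\mathbb{C}^4_\hbar)$ for the elements $T^{-1}X_i$. The second shows that the degree-0 element $T^{-1}D$ is already a polynomial in them, so it contributes no new generator.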

We denote by $Q^4_\hbar$ the noncommutative projective variety corresponding to the algebra $Q_\hbar$. It is evident that $Q^4_\hbar$ is a deformation of a 4-dimensional commutative quadric $Q^4 = \{\sum_{i,j} G_{ij} X_i X_j = DT\} \subset \mathbb{CP}^5$.

3.6. Embedding $\mathbb{C}^4_\hbar \to Q^4_\hbar$. Let $Q_\hbar[T^{-1}]$ be the localization of the algebra $Q_\hbar$ with respect to T. Elements of degree 0 in $Q_\hbar[T^{-1}]$ form a subalgebra which will be denoted by $Q_\hbar[T^{-1}]_0$.

Lemma 3.1. The map $x_i \mapsto T^{-1} X_i$ (i = 1, 2, 3, 4) induces an isomorphism of the algebra $A(\mathbb{C}^4_\hbar)$ with the algebra $Q_\hbar[T^{-1}]_0$.

Proof. Obvious. □

This means that $\mathbb{C}^4_\hbar$ can be identified with the open subset $\{T \neq 0\}$ in $Q^4_\hbar$. For this reason, $Q^4_\hbar$ may be regarded as a compactification of $\mathbb{C}^4_\hbar$ which is compatible with the bilinear form G. Note also that the complement of $\mathbb{C}^4_\hbar$ in $Q^4_\hbar$ corresponds to the algebra

$$Q_\hbar/\langle T \rangle = T(X_1, X_2, X_3, X_4, D)\Big/\Big\langle [X_i, X_j] = [D, X_i] = 0,\ \sum_{i,j} G_{ij} X_i X_j = 0 \Big\rangle.$$

Since this algebra is commutative, the complement is the usual 3-dimensional commutative quadratic cone. Thus one may say that $Q^4_\hbar$ is obtained from $\mathbb{C}^4_\hbar$ by adding a cone “at infinity”. This is in complete analogy with the commutative case.

3.7. Noncommutative $\mathbb{P}^2_\hbar$ and $\mathbb{P}^3_\hbar$. Noncommutative deformations of the projective plane have been classified in [1, 2, 9]. We will need one of them, namely the one whose homogeneous coordinate algebra is a graded algebra $PP_\hbar = \bigoplus_{i \ge 0} (PP_\hbar)_i$ over $\mathbb{C}$ generated by the elements $w_1, w_2, w_3$ of degree 1 with the relations:

$$[w_3, w_i] = 0 \ \text{ for any } i = 1, 2, 3, \qquad [w_1, w_2] = 2\hbar w_3^2. \qquad (8)$$
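By analogy with Lemma 3.1, one can look at the degree-0 part of the localization of $PP_\hbar$ at the central element $w_3$. The elements $u_1, u_2$ below are notation introduced here for illustration; the computation uses only the relations (8).

```latex
\[
u_i := w_3^{-1} w_i \quad (i = 1,2), \qquad
[u_1,u_2] = w_3^{-2}\,[w_1,w_2]
          = w_3^{-2}\cdot 2\hbar\, w_3^{2}
          = 2\hbar .
\]
```

So the region where $w_3$ is invertible behaves like a noncommutative affine plane with a Heisenberg-type relation, while setting $w_3 = 0$ leaves a commutative algebra in $w_1, w_2$.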



We will also need a noncommutative deformation of the 3-dimensional projective space, whose homogeneous coordinate algebra will be denoted $PS_\hbar = \bigoplus_{i \ge 0} (PS_\hbar)_i$. It is a graded algebra over $\mathbb{C}$ generated by $(PS_\hbar)_1 = U$, where the vector space U is spanned by elements $z_1, z_2, z_3, z_4$ obeying the relations

$$[z_3, z_i] = [z_4, z_i] = 0 \ \text{ for any } i = 1, 2, 3, 4, \qquad [z_1, z_2] = 2\hbar z_3 z_4. \qquad (9)$$

The noncommutative projective varieties corresponding to $PP_\hbar$ and $PS_\hbar$ will be denoted $\mathbb{P}^2_\hbar$ and $\mathbb{P}^3_\hbar$, respectively. Note that for $\hbar \neq 0$ all algebras $PS_\hbar$ are isomorphic, and therefore the varieties $\mathbb{P}^3_\hbar$ are the same for all $\hbar \neq 0$. The same is true for $\mathbb{P}^2_\hbar$.

3.8. Subvarieties in $\mathbb{P}^3_\hbar$ and $\mathbb{P}^2_\hbar$. If $I \subset PS_\hbar$ is a graded two-sided ideal, then the quotient algebra $PS_\hbar/I$ corresponds to a closed subvariety $X(I) \subset \mathbb{P}^3_\hbar$. Let us describe some of them. Let J be the graded two-sided ideal generated by $z_3$ and $z_4$. Then $PS_\hbar/J = T(z_1, z_2)/\langle [z_1, z_2] = 0 \rangle$, hence X(J) is the commutative projective line. For each point $p = (\lambda : \mu) \in \mathbb{P}^1$ let $J_p$ denote the graded two-sided ideal generated by $\lambda z_3 + \mu z_4$. If p = (0 : 1) or p = (1 : 0), then it is easy to see that $X(J_p)$ is the commutative projective plane. For all other $p \in \mathbb{P}^1$ we have

$$PS_\hbar/J_p = T(z_1, z_2, z_3)\Big/\Big\langle [z_1, z_3] = [z_2, z_3] = 0,\ [z_1, z_2] = -2\hbar \frac{\lambda}{\mu} z_3^2 \Big\rangle,$$

hence $X(J_p)$ is a noncommutative projective plane isomorphic to $\mathbb{P}^2_\hbar$. We have $J_p \subset J$ for all $p \in \mathbb{P}^1$, hence all planes $X(J_p)$ pass through the line X(J). Thus we see that $\mathbb{P}^3_\hbar$ is a pencil of noncommutative projective planes passing through a fixed commutative projective line. Similarly, the two-sided ideal generated by $w_3$ in $PP_\hbar$ corresponds to a commutative projective line $l = \{w_3 = 0\} \subset \mathbb{P}^2_\hbar$.

4. Properties of Algebras $PS_\hbar$ and $PP_\hbar$ and the Resolution of the Diagonal

This section is a preparation for the study of sheaves on $\mathbb{P}^3_\hbar$ and $\mathbb{P}^2_\hbar$. We show that the algebras $PS_\hbar$ and $PP_\hbar$ are regular and Koszul and construct the resolution of the diagonal, which will enable us to associate monads to certain bundles on $\mathbb{P}^2_\hbar$.

4.1. Quadratic algebras. A graded algebra $A = \bigoplus_{i \ge 0} A_i$ over a field k is called quadratic if it is connected (i.e. $A_0 = k$), is generated by the first component $A_1$, and the ideal of relations is generated by the subspace of quadratic relations $R(A) \subset A_1 \otimes A_1$. Therefore the algebra A can be represented as $T(A_1)/\langle R(A) \rangle$, where $T(A_1)$ is a free tensor algebra generated by the space $A_1$. The algebras $PS_\hbar$ and $PP_\hbar$ are quadratic algebras. For example, $PS_\hbar$ can be represented as $T(U)/\langle W \rangle$, where $U = (PS_\hbar)_1$ is a 4-dimensional vector space and W is the 6-dimensional subspace of $U \otimes U$ spanned by the relations (9).
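The dimension count for W can be made explicit by listing the independent relations contained in (9). This is a sketch; the relations $[z_3, z_3] = 0$ and $[z_4, z_3] = -[z_3, z_4]$ are omitted because they add nothing new.

```latex
\begin{align*}
W = \mathrm{span}\bigl\{\;
   & z_3\otimes z_i - z_i\otimes z_3 \ \ (i = 1,2,4), \qquad
     z_4\otimes z_i - z_i\otimes z_4 \ \ (i = 1,2), \\
   & z_1\otimes z_2 - z_2\otimes z_1 - 2\hbar\, z_3\otimes z_4
   \;\bigr\},
\qquad
\dim W = 3 + 2 + 1 = 6 = \tbinom{4}{2}.
\end{align*}
```

Since $\dim U \otimes U = 16$, the annihilator of W in $U^* \otimes U^*$ has dimension 10, the same count as for the symmetric algebra at $\hbar = 0$.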



4.2. The dual algebra. For any quadratic algebra $A = T(A_1)/\langle R(A) \rangle$ we can define its dual algebra which is also quadratic. Let us identify $A_1^* \otimes A_1^*$ with $(A_1 \otimes A_1)^*$ by $(l \otimes m)(a \otimes b) = m(a)\, l(b)$. Denote by $R(A)^\perp$ the annihilator of R(A) in $A_1^* \otimes A_1^*$, i.e. the subspace which consists of such $q \in (A_1^*)^{\otimes 2}$ that $q(r) = 0$ for any $r \in R(A)$.

Definition 4.1 ([25]). The algebra $A^! = T(A_1^*)/\langle R(A)^\perp \rangle$ is called the dual algebra of A.

Example 4.2. Let $\{\check{z}_i\}$, i = 1, 2, 3, 4, be the basis of $(PS_\hbar^!)_1 = U^*$ which is dual to $\{z_i\}$. By definition, $PS_\hbar^!$ is generated by $\{\check{z}_i\}$ with defining relations

$$\check{z}_i^2 = 0 \ \text{ for all } i = 1, \dots, 4; \qquad \check{z}_i \check{z}_j + \check{z}_j \check{z}_i = 0 \ \text{ for all } i < j,\ (i, j) \neq (3, 4); \qquad \check{z}_3 \check{z}_4 + \check{z}_4 \check{z}_3 = \hbar[\check{z}_1, \check{z}_2] = 2\hbar\, \check{z}_1 \check{z}_2.$$

In the commutative case the dual algebra of the symmetric algebra $S^\cdot(U)$ is isomorphic to the exterior algebra $\Lambda^\cdot(U^*)$. Obviously, the algebras $PS_\hbar^!$ and $PP_\hbar^!$ are deformations of exterior algebras. For example, the vector space $(PS_\hbar^!)_k$ is spanned by the elements $\check{z}_{i_1} \cdots \check{z}_{i_k}$ with $i_1 < \cdots < i_k$. In particular, the dimension of the vector space $(PS_\hbar^!)_k$ is equal to $\binom{4}{k}$. Similarly, the dimension of $(PP_\hbar^!)_k$ is equal to $\binom{3}{k}$.

Proposition 4.3. Let A be $PS_\hbar$ or $PP_\hbar$, and let n be 4 or 3, respectively. The multiplication map $A^!_k \otimes A^!_{n-k} \to A^!_n$ is a non-degenerate pairing. Hence the dual algebra $A^!$ is a Frobenius algebra, i.e. $(A^!)_{A^!} \cong ({}_{A^!}A^!)^*$ as right $A^!$-modules.

Proof. The proposition holds for the exterior algebra, and therefore also for the algebra $A^!$, since the latter is a “small” deformation of the exterior algebra. □

4.3. The Koszul complex. Consider right A-modules $(A^!_k)^* \otimes A$. The following complex $K_\cdot(A)$ is called the (right) Koszul complex of a quadratic algebra:

$$\cdots \xrightarrow{d} (A^!_3)^* \otimes A(-3) \xrightarrow{d} (A^!_2)^* \otimes A(-2) \xrightarrow{d} (A^!_1)^* \otimes A(-1) \xrightarrow{d} (A^!_0)^* \otimes A \to 0,$$

where the map $d : (A^!_k)^* \otimes A \to (A^!_{k-1})^* \otimes A$ is a composition of two natural maps:

$$(A^!_k)^* \otimes A \to (A^!_k)^* \otimes A^!_1 \otimes A_1 \otimes A \to (A^!_{k-1})^* \otimes A.$$

Here the first arrow sends $\alpha \otimes a$ to $\alpha \otimes e \otimes a$ with e defined as

$$e = \sum_i y_i \otimes x_i \in A^!_1 \otimes A_1,$$

and $\{x_i\}$ and $\{y_i\}$ being the dual bases of $A_1$ and $A^!_1$, respectively. The second map is determined by the algebra structures on $A^!$ and A. It is a well-known fact that $d^2 = 0$ (see, for example, [25]). Let $k_A$ be the trivial right A-module. The Koszul complex $K_\cdot(A)$ possesses a natural augmentation $K_\cdot \xrightarrow{\varepsilon} k_A \to 0$.
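The smallest example may be useful. Take $A = k[x]$ with one generator of degree 1 and no relations; this is an illustrative choice, not one of the algebras studied in this paper. Then $R(A) = 0$, so $R(A)^\perp$ is all of $A_1^* \otimes A_1^*$, and the dual algebra and the augmented Koszul complex take the following form.

```latex
\begin{gather*}
A^! = k[y]/(y^2), \qquad
A^!_0 = k, \quad A^!_1 = k\,y, \quad A^!_j = 0 \ \ (j \ge 2), \\
0 \longrightarrow A(-1) \xrightarrow{\ \cdot x\ } A
  \xrightarrow{\ \varepsilon\ } k \longrightarrow 0 .
\end{gather*}
```

Multiplication by x is injective with cokernel k, so in this example the augmented complex is exact.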



Definition 4.4 (see [31]). A quadratic algebra $A = \bigoplus_{i \ge 0} A_i$ is called a Koszul algebra if the augmented Koszul complex $K_\cdot(A) \xrightarrow{\varepsilon} k_A \to 0$ is exact.

In the same manner one can define the left Koszul complex of a quadratic algebra. It is well known that the exactness of the right Koszul complex is equivalent to the exactness of the left Koszul complex (see, for example, [22]).

Proposition 4.5. The algebras $PS_\hbar$ and $PP_\hbar$ are Koszul algebras.

Proof. For $\hbar = 0$ this is a well-known fact about the symmetric algebra $S^\cdot(U)$. Since the augmented Koszul complex is exact for $\hbar = 0$, it is also exact for small $\hbar$, and consequently for all $\hbar$. □

Since the dual algebras $PS_\hbar^!$ and $PP_\hbar^!$ are finite, the Koszul resolutions for the algebras $PS_\hbar$ and $PP_\hbar$ are finite too and have the same form as the resolutions for ordinary symmetric algebras. For example, the Koszul resolution for $A = PP_\hbar$ is:

$$\{0 \to (A^!_3)^* \otimes A(-3) \to (A^!_2)^* \otimes A(-2) \to (A^!_1)^* \otimes A(-1) \to (A^!_0)^* \otimes A\} \to \mathbb{C}.$$

4.4. Resolution of the diagonal. Consider a bigraded vector space

$$K^2_{\cdot\cdot}(A) = \bigoplus_{k,l \ge 0} K^2_{k,l}(A) \quad \text{with} \quad K^2_{k,l}(A) = A(k) \otimes (A^!_{l-k})^* \otimes A(-l).$$

Consider morphisms $d_R : K^2_{k,l} \to K^2_{k,l-1}$ and $d_L : K^2_{k,l} \to K^2_{k+1,l}$ given by the following compositions:

$$d_R : A \otimes (A^!_k)^* \otimes A \to A \otimes (A^!_k)^* \otimes A^!_1 \otimes A_1 \otimes A \to A \otimes (A^!_{k-1})^* \otimes A,$$
$$d_L : A \otimes (A^!_k)^* \otimes A \to A \otimes A_1 \otimes A^!_1 \otimes (A^!_k)^* \otimes A \to A \otimes (A^!_{k-1})^* \otimes A.$$

Here the leftmost maps are given by

$$e_R = \sum_i y_i \otimes x_i \in A^!_1 \otimes A_1 \quad \text{and} \quad e_L = \sum_i x_i \otimes y_i \in A_1 \otimes A^!_1,$$

where $\{x_i\}$ and $\{y_i\}$ are the dual bases of $A_1$ and $A^!_1$, respectively, while the rightmost maps are induced by the algebra structures of $A^!$ and A. It is easy to show that

$$d_R^2 = d_L^2 = 0 \quad \text{and} \quad d_R d_L = d_L d_R,$$

hence $K^2_{\cdot\cdot}(A)$ is a bicomplex. It is called the double Koszul bicomplex of the quadratic algebra A. The topmost part of the bicomplex looks as follows:

$$\begin{array}{ccccccc}
\cdots & \xrightarrow{d_R} & A \otimes (A^!_{l+1})^* \otimes A(-1-l) & \xrightarrow{d_R} & A \otimes (A^!_l)^* \otimes A(-l) & \xrightarrow{d_R} & \cdots \\
 & & \downarrow {\scriptstyle d_L} & & \downarrow {\scriptstyle d_L} & & \\
\cdots & \xrightarrow{d_R} & A(1) \otimes (A^!_l)^* \otimes A(-1-l) & \xrightarrow{d_R} & A(1) \otimes (A^!_{l-1})^* \otimes A(-l) & \xrightarrow{d_R} & \cdots
\end{array}$$


Each term of the bicomplex $K^2_{\cdot\cdot}(A)$ has an obvious structure of a bigraded A-bimodule, and it is clear that the differentials are morphisms of bigraded A-bimodules. Let

$$K_l(A) = \mathrm{Ker}\bigl(d_L : K^2_{0,l}(A) \to K^2_{1,l}(A)\bigr).$$

Then $K_\cdot(A)$ is a complex of bigraded A-bimodules (with respect to the differential $d_R$). Consider a bigraded algebra $\Delta = \bigoplus_{i,j} \Delta_{ij}$ with $\Delta_{ij} = A_{i+j}$ and with the multiplication induced from A. The algebra $\Delta$ is called the diagonal bigraded algebra of A. Note that the multiplication map induces a surjective morphism of A-bimodules $\delta : A \otimes A \to \Delta$.

Lemma 4.6. The map $\delta : K_0(A) = A \otimes A \to \Delta$ gives an augmentation of the complex $K_\cdot(A)$.

Proof. We have to check that $\delta \cdot d_R : K_1(A) \to \Delta$ vanishes. Note that $K^2_{0,1}(A) = A \otimes A_1 \otimes A(-1)$, and that the differentials $d_R$ and $d_L$ restricted to $K^2_{0,1}(A)$ coincide with the multiplication maps $m_{1,2}$ and $m_{2,3}$, respectively. Thus we have the following commutative diagram:

$$\begin{array}{ccccc}
K_1(A) & \xrightarrow{d_R} & K_0(A) & \xrightarrow{\delta} & \Delta \\
\downarrow & & \| & & \| \\
A \otimes A_1 \otimes A(-1) & \xrightarrow{m_{1,2}} & A \otimes A & \xrightarrow{\delta} & \Delta \\
\downarrow {\scriptstyle m_{2,3}} & & & & \\
A(1) \otimes A(-1) & & & &
\end{array}$$

Now the lemma follows because $\delta \cdot m_{1,2} = \delta \cdot m_{2,3}$ (associativity) obviously annihilates $\mathrm{Ker}\, m_{2,3} = K_1(A)$. □

Proposition 4.7. If A is Koszul, then $K_\cdot(A) \xrightarrow{\delta} \Delta$ is exact.

Proof. The (p, q)-bigraded component of $K^2_{k,l}(A)$ is equal to $A_{p+k} \otimes (A^!_{l-k})^* \otimes A_{q-l}$, hence the (p, q)-bigraded component of the bicomplex $K^2_{\cdot\cdot}(A)$ vanishes for l < k or l > q. Thus the (p, q)-bigraded component of the bicomplex $K^2_{\cdot\cdot}(A)$ is bounded. Therefore both spectral sequences of the bicomplex $K^2_{\cdot\cdot}(A)$ converge to the cohomology of the total complex $\mathrm{Tot}(K^2_{\cdot\cdot}(A))$. The first term of the first spectral sequence reads

$$E^1_{k,l} = \begin{cases} A(l) \otimes k(-l), & \text{if } k = l \\ 0, & \text{otherwise.} \end{cases}$$

Hence the spectral sequence degenerates in the first term, and we have

$$H^0(\mathrm{Tot}(K^2_{\cdot\cdot}(A))) = \bigoplus_{l=0}^{\infty} A(l) \otimes k(-l), \qquad H^{\neq 0}(\mathrm{Tot}(K^2_{\cdot\cdot}(A))) = 0.$$

On the other hand, the first term of the second spectral sequence reads

$$E^1_{k,l} = \begin{cases} k(l) \otimes A(-l), & \text{if } k = l > 0 \\ K_l(A), & \text{if } k = 0 \\ 0, & \text{otherwise.} \end{cases}$$

Hence the spectral sequence degenerates in the second term, and we have

$$H^0(\mathrm{Tot}(K^2_{\cdot\cdot}(A))) = H^0(K_\cdot(A)) \oplus \Bigl(\bigoplus_{l=1}^{\infty} k(l) \otimes A(-l)\Bigr), \qquad H^l(\mathrm{Tot}(K^2_{\cdot\cdot}(A))) = H^l(K_\cdot(A)).$$

Therefore $H^{\neq 0}(K_\cdot(A)) = 0$, and we have an exact sequence

$$0 \to H^0(K_\cdot(A)) \to \bigoplus_{l=0}^{\infty} A(l) \otimes k(-l) \to \bigoplus_{l=1}^{\infty} k(l) \otimes A(-l) \to 0.$$

Looking at the (p, q)-bigraded component of this sequence we see that

$$\bigl(H^0(K_\cdot(A))\bigr)_{p,q} = \begin{cases} A_{p+q}, & \text{if } p, q \ge 0 \\ 0, & \text{otherwise.} \end{cases}$$

Thus $H^0(K_\cdot(A)) = \Delta$. □

Definition 4.8. Define the left A-module $\Omega^k$ as the cohomology of the left Koszul complex, truncated in the term $K_k$. In particular, $\Omega^1$ is defined by the so-called Euler sequence

$$0 \to \Omega^1 \to A(-1) \otimes A_1 \xrightarrow{m} A \xrightarrow{\varepsilon} k \to 0. \qquad (10)$$

In Sect. 8.11 we will show that for noncommutative projective spaces the sheaves corresponding to the modules $\Omega^k$ can be regarded as sheaves of differential forms.

Proposition 4.9. We have $K_k(A) = \Omega^k(k) \otimes A(-k)$.

Proof. This follows immediately from the definition of $\Omega^k$ and $K_k(A)$. □

Combining Propositions 4.7 and 4.9, we obtain the following resolution of the diagonal:

$$\cdots \to \Omega^2(2) \otimes A(-2) \to \Omega^1(1) \otimes A(-1) \to A \otimes A \to \Delta \to 0. \qquad (11)$$



4.5. Cohomological properties of the algebras $PS_\hbar$ and $PP_\hbar$. First we note that both algebras $PS_\hbar$ and $PP_\hbar$ are noetherian. This follows from the fact that they are Ore extensions of commutative polynomial algebras (see for example, [26]). For the same reason the algebras $PS_\hbar$ and $PP_\hbar$ have finite right (and left) global dimension, which is equal to 4 and 3, respectively (see [26], p. 273). We recall that the global dimension of a ring A is the minimal number n (if it exists) such that for any two modules M and N we have $\mathrm{Ext}^{n+1}_A(M, N) = 0$.

In the paper [1] the notion of a regular algebra has been introduced. Regular algebras have many good properties (see [3, 2, 40], etc.).

Definition 4.10. A graded algebra A is called regular of dimension d if it satisfies the following conditions:
(1) A has global dimension d,
(2) A has polynomial growth, i.e. $\dim A_n \le c n^\delta$ for some $c, \delta \in \mathbb{R}$,
(3) A is Gorenstein, meaning that $\mathrm{Ext}^i_A(k, A) = 0$ if $i \neq d$, and $\mathrm{Ext}^d_A(k, A) = k(l)$ for some l.
Here $\mathrm{Ext}_A$ stands for the Ext functor in the category mod(A).

It is easy to see that these properties are verified for $PS_\hbar$ and $PP_\hbar$. Property (2) holds because our algebras grow as ordinary polynomial algebras. Property (3) follows from the fact that $PS_\hbar$ and $PP_\hbar$ are Koszul algebras and the dual algebras are Frobenius algebras. In this case the Gorenstein parameter l in (3) is equal to the global dimension d. Thus we have

Proposition 4.11. The algebras $PS_\hbar$ and $PP_\hbar$ are noetherian regular algebras of global dimension 4 and 3, respectively. For these algebras the Gorenstein parameter l coincides with the global dimension d.

5. Cohomological Properties of Sheaves on $\mathbb{P}^2_\hbar$ and $\mathbb{P}^3_\hbar$

5.1. Ampleness and cohomology of O(i). Let A be a graded algebra and X be the corresponding noncommutative projective variety. Consider the sequence of sheaves $\{O(i)\}_{i \in \mathbb{Z}}$ in the category $\mathrm{coh}(X) \cong \mathrm{qgr}(A)$, where $O(i) = \tilde{A}(i)$. This sequence is called ample if the following conditions hold:
(a) For every coherent sheaf F there are integers $k_1, \dots, k_s$ and an epimorphism $\bigoplus_{i=1}^{s} O(-k_i) \to F$.
(b) For every epimorphism $F \to G$ the induced map $\mathrm{Hom}(O(-n), F) \to \mathrm{Hom}(O(-n), G)$ is surjective for $n \gg 0$.

It is proved in [3] that the sequence {O(i)} is ample in qgr(A) for a graded right noetherian k-algebra A if it satisfies the extra condition:

$$(\chi_1): \quad \dim_k \mathrm{Ext}^1_A(k, M) < \infty$$

for any finitely generated graded A-module M. This condition can be verified for all noetherian regular algebras (see [3], Theorem 8.1). In particular, the categories $\mathrm{coh}(\mathbb{P}^3_\hbar)$, $\mathrm{coh}(\mathbb{P}^2_\hbar)$ have ample sequences.



For any sheaf F ∈ qgr(A) we can define a graded module 9(F) by the rule: 9(F) := ⊕ Hom(O(i), F). i≥0

It is proved in [3] that for any noetherian algebra $A$ that satisfies the condition $\chi_1$ the correspondence $\Gamma$ is a functor from $\mathrm{qgr}(A)$ to $\mathrm{gr}(A)$, and the composition of $\Gamma$ with the natural projection $\pi : \mathrm{gr}(A) \to \mathrm{qgr}(A)$ is isomorphic to the identity functor (see [3, Ch. 3, 4]).

Now we formulate a result about the cohomology of sheaves on noncommutative projective spaces. This result is proved in [3] for a general regular algebra and parallels the commutative case.

Proposition 5.1 ([3, Theorem 8.1]). Let $A$ be $PS_\hbar$ or $PP_\hbar$, and let $X$ be $\mathbb{P}^3_\hbar$ or $\mathbb{P}^2_\hbar$, respectively. Denote by $n$ the dimension of $X$ (in our case $n = 3$ or $n = 2$, respectively). Then
(1) The cohomological dimension of $\mathrm{coh}(X)$ is equal to $\dim(X)$, i.e. for any two coherent sheaves $\mathcal{F}$ and $\mathcal{G}$ the group $\mathrm{Ext}^i(\mathcal{F}, \mathcal{G})$ vanishes if $i > n$.
(2) There are isomorphisms
\[ H^p(X, \mathcal{O}(i)) = \begin{cases} A_i & \text{for } p = 0,\ i \ge 0, \\ A^*_{-i-1-n} & \text{for } p = n,\ i \le -n-1, \\ 0 & \text{otherwise.} \end{cases} \tag{12} \]

This proposition and the ampleness of the sequence $\{\mathcal{O}(i)\}$ imply the following corollary.

Corollary 5.2. Let $X$ be either $\mathbb{P}^3_\hbar$ or $\mathbb{P}^2_\hbar$. Then for any sheaf $\mathcal{F} \in \mathrm{coh}(X)$ and for all sufficiently large $i \ge 0$ we have $\mathrm{Hom}(\mathcal{F}, \mathcal{O}(-i)) = 0$.

Proof. By ampleness a sheaf $\mathcal{F}$ can be covered by a finite sum of sheaves $\mathcal{O}(k_j)$, so $\mathrm{Hom}(\mathcal{F}, \mathcal{O}(-i))$ embeds into the sum of the groups $\mathrm{Hom}(\mathcal{O}(k_j), \mathcal{O}(-i))$. Now the statement follows from the proposition, because $\mathrm{Hom}(\mathcal{O}(k_j), \mathcal{O}(i)) = 0$ for all $i < k_j$. □

Corollary 5.3. Let $X$ be either $\mathbb{P}^3_\hbar$ or $\mathbb{P}^2_\hbar$. Then for any sheaf $\mathcal{F} \in \mathrm{coh}(X)$ and for all sufficiently large $i \ge 0$ we have $H^k(X, \mathcal{F}(i)) = 0$ for all $k \ge 1$.

Proof. The group $H^k(X, \mathcal{F}(i))$ coincides with $\mathrm{Ext}^k(\mathcal{O}(-i), \mathcal{F})$. Let $k$ be the maximal integer (it exists because the global dimension is finite) such that for some $\mathcal{F}$ there exist arbitrarily large $i$ with $\mathrm{Ext}^k(\mathcal{O}(-i), \mathcal{F}) \ne 0$. Assume that $k \ge 1$. Choose an epimorphism $\bigoplus_{j=1}^{s} \mathcal{O}(-k_j) \to \mathcal{F}$, and let $\mathcal{F}_1$ denote its kernel. Then for $i > \max\{k_j\}$ we have $\mathrm{Ext}^{>0}(\mathcal{O}(-i), \bigoplus_{j=1}^{s} \mathcal{O}(-k_j)) = 0$, hence $\mathrm{Ext}^k(\mathcal{O}(-i), \mathcal{F}) \ne 0$ implies $\mathrm{Ext}^{k+1}(\mathcal{O}(-i), \mathcal{F}_1) \ne 0$. This contradicts the assumption of the maximality of $k$. □

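As an illustration of formula (12), the following sketch specializes it to the twists that occur in the monads of Sect. 6. It assumes only (12), read with $A_i$ in the $p = 0$ case, and the identification $H^p(X, \mathcal{F}(i)) = \mathrm{Ext}^p(\mathcal{O}(-i), \mathcal{F})$ used in the proof of Corollary 5.3; it is a worked example, not part of the original argument.

```latex
% Specialization of (12) to X = P^2_hbar (n = 2, A = PP_hbar).
\[
H^p(\mathbb{P}^2_\hbar, \mathcal{O}(i)) =
\begin{cases}
A_i, & p = 0,\ i \ge 0,\\
A^*_{-i-3}, & p = 2,\ i \le -3,\\
0, & \text{otherwise.}
\end{cases}
\]
% For i = -1 and i = -2 neither nonzero case applies, hence for all p:
\[
\mathrm{Ext}^p(\mathcal{O}, \mathcal{O}(-1)) = \mathrm{Ext}^p(\mathcal{O}(1), \mathcal{O})
  = H^p(\mathbb{P}^2_\hbar, \mathcal{O}(-1)) = 0,
\qquad
\mathrm{Ext}^p(\mathcal{O}(1), \mathcal{O}(-1)) = H^p(\mathbb{P}^2_\hbar, \mathcal{O}(-2)) = 0 .
\]
% The same vanishing holds on P^3_hbar (n = 3), where the top cohomology
% H^3(O(i)) is nonzero only for i <= -4.
```

Only the twists $-1$ and $-2$ are needed here, and both lie strictly between the two nonvanishing ranges of (12).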


5.2. Serre duality and the dualizing sheaf. A very useful property of commutative smooth projective varieties is the existence of the dualizing sheaf. Recall that a sheaf $\omega$ is called dualizing if for any $\mathcal{F} \in \mathrm{coh}(X)$ there are natural isomorphisms of $k$-vector spaces
\[ H^i(X, \mathcal{F}) \cong \mathrm{Ext}^{n-i}(\mathcal{F}, \omega)^*, \]
where $*$ denotes the $k$-dual. The Serre duality theorem asserts the existence of the dualizing sheaf for smooth projective varieties. In this case the dualizing sheaf is a line bundle and coincides with the sheaf of differential forms $\Omega^n_X$ of top degree.

Since the definition of $\omega$ is given in abstract categorical terms, it can be extended to the noncommutative case. More precisely, we will say that $\mathrm{qgr}(A)$ satisfies classical Serre duality if there is an object $\omega \in \mathrm{qgr}(A)$ together with natural isomorphisms
\[ \mathrm{Ext}^i(\mathcal{O}, -) \cong \mathrm{Ext}^{n-i}(-, \omega)^*. \]
Our noncommutative varieties $\mathbb{P}^3_\hbar$ and $\mathbb{P}^2_\hbar$ satisfy classical Serre duality, with dualizing sheaves being $\mathcal{O}_{\mathbb{P}^3_\hbar}(-4)$ and $\mathcal{O}_{\mathbb{P}^2_\hbar}(-3)$, respectively. This follows from the paper [40], where the existence of a dualizing sheaf in $\mathrm{qgr}(A)$ has been proved for a general class of algebras which includes all noetherian regular algebras. In addition, the authors of [40] showed that the dualizing sheaf coincides with $A(-l)$, where $l$ is the Gorenstein parameter for $A$ (see condition (3) of Definition 4.10).

5.3. Bundles on noncommutative projective spaces. To any graded right $A$-module $M$ one can attach a left $A$-module $M^\vee = \mathrm{Hom}_A(M, A)$ which is also graded. Note that under this correspondence the right module $A_A(r)$ goes to the left module ${}_A A(-r)$. It is known that if $A$ is a noetherian regular algebra, then $\mathrm{Hom}_A(-, A)$ is a functor from the category $\mathrm{gr}(A)_R$ to the category $\mathrm{gr}(A)_L$. Moreover, its derived functor $R\mathrm{Hom}^{\cdot}_A(-, A)$ gives an anti-equivalence between the derived categories of $\mathrm{gr}(A)_R$ and $\mathrm{gr}(A)_L$ (see [39, 40, 38]).

If we assume that the composition of the functor $\mathrm{Hom}_A(-, A)$ with the projection $\mathrm{gr}(A)_L \to \mathrm{qgr}(A)_L$ factors through the projection $\mathrm{gr}(A)_R \to \mathrm{qgr}(A)_R$, then we obtain a functor from $\mathrm{qgr}(A)_R$ to $\mathrm{qgr}(A)_L$ which is denoted by $\mathcal{H}om(-, \mathcal{O})$. This functor is not right exact and has right derived functors $\mathcal{E}xt^i(-, \mathcal{O})$, $i > 0$, from $\mathrm{qgr}(A)_R$ to $\mathrm{qgr}(A)_L$.

For a noetherian regular algebra the functor $\mathcal{H}om(-, \mathcal{O})$ and its right derived functors exist. This follows from the fact that the functors $\mathrm{Ext}^i_A(-, A)$ send a finite dimensional module to a finite dimensional module (see condition (3) of Definition 4.10). Moreover, in this case the functor $\mathcal{H}om(-, \mathcal{O})$ can be represented as the composition of the functor $\Gamma : \mathrm{qgr}(A)_R \to \mathrm{gr}(A)_R$, the functor $\mathrm{Hom}_A(-, A) : \mathrm{gr}(A)_R \to \mathrm{gr}(A)_L$, and the projection $\pi : \mathrm{gr}(A)_L \to \mathrm{qgr}(A)_L$. This can be illustrated by the following commutative diagram:
\[
\begin{array}{ccc}
\mathrm{gr}(A)_R & \xrightarrow{\ \mathrm{Hom}_A(-,A)\ } & \mathrm{gr}(A)_L \\
{\scriptstyle \pi}\downarrow\ \uparrow{\scriptstyle \Gamma} & & \downarrow{\scriptstyle \pi} \\
\mathrm{qgr}(A)_R & \xrightarrow{\ \mathcal{H}om(-,\mathcal{O})\ } & \mathrm{qgr}(A)_L
\end{array}
\tag{13}
\]

For a noetherian regular algebra the functor $R\mathrm{Hom}^{\cdot}_A(-, A)$ is an anti-equivalence between the derived categories of $\mathrm{gr}(A)_R$ and $\mathrm{gr}(A)_L$ and takes complexes of finite dimensional modules over $\mathrm{gr}(A)_R$ to complexes of finite dimensional modules over $\mathrm{gr}(A)_L$.



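This implies that the functor $R\mathcal{H}om^{\cdot}(-, \mathcal{O})$ gives an anti-equivalence between the derived categories of $\mathrm{qgr}(A)_R$ and $\mathrm{qgr}(A)_L$. (Note that for the derived functors $R\mathrm{Hom}_A(-, A)$ and $R\mathcal{H}om(-, \mathcal{O})$ there is also a commutative diagram like (13).)

The functors $\mathcal{E}xt^j(-, \mathcal{O})$ can be described more explicitly. Let $M$ be an $A$-bimodule. Regarding it as a right module, we see that for any $\mathcal{F} \in \mathrm{QGr}(A)_R$ the groups $\mathrm{Ext}^i(\mathcal{F}, \widetilde{M})$ have the structure of left $A$-modules. We can project them to $\mathrm{QGr}(A)_L$. Thus each bimodule $M$ defines functors from $\mathrm{QGr}(A)_R$ to $\mathrm{QGr}(A)_L$, which will be denoted by $\pi\mathrm{Ext}^i(-, \widetilde{M})$.

Now, using $\pi\Gamma = \mathrm{id}$ and the commutativity of the diagram (13) for the derived functors $\mathrm{Ext}^j_A(-, A)$ and $\mathcal{E}xt^j(-, \mathcal{O})$, we obtain isomorphisms
\[ \mathcal{E}xt^j(\mathcal{F}, \mathcal{O}) \cong \pi\,\mathrm{Ext}^j_A(\Gamma(\mathcal{F}), A) \cong \pi\,\mathrm{Ext}^j_{\mathrm{gr}(A)}\Big(\Gamma(\mathcal{F}), \bigoplus_{i \ge 0} A(i)\Big) \cong \pi\,\mathrm{Ext}^j\Big(\mathcal{F}, \bigoplus_{i \ge 0} \mathcal{O}(i)\Big) \tag{14} \]
for any sheaf $\mathcal{F} \in \mathrm{qgr}(A)_R$.

Definition 5.4. We call a coherent sheaf $\mathcal{F} \in \mathrm{qgr}(A)_R$ locally free (or a bundle) if $\mathcal{E}xt^j(\mathcal{F}, \mathcal{O}) = 0$ for any $j \ne 0$.

Remark. In the commutative case this definition is equivalent to the usual definition of a locally free sheaf.

Definition 5.5. The dual sheaf $\mathcal{H}om(\mathcal{F}, \mathcal{O}) \in \mathrm{qgr}(A)_L$ will be denoted by $\mathcal{F}^\vee$.

If $\mathcal{F} \in \mathrm{qgr}(A)_R$ is a bundle, then the dual sheaf $\mathcal{F}^\vee$ is a bundle in $\mathrm{qgr}(A)_L$, because $R\mathcal{H}om^{\cdot}(\mathcal{F}^\vee, \mathcal{O}) = \mathcal{F}$ in the derived category, and $\mathcal{E}xt^j(\mathcal{F}^\vee, \mathcal{O}) = 0$ for $j \ne 0$. Thus we have a good definition of locally free sheaves on $\mathbb{P}^3_\hbar$ and $\mathbb{P}^2_\hbar$. Since the derived functor $R\mathcal{H}om(-, \mathcal{O})$ gives an anti-equivalence between the derived categories of $\mathrm{qgr}(A)_R$ and $\mathrm{qgr}(A)_L$, there is an isomorphism
\[ \mathrm{Hom}(\mathcal{F}, \mathcal{G}) \cong \mathrm{Hom}(\mathcal{G}^\vee, \mathcal{F}^\vee) \tag{15} \]
for any two bundles $\mathcal{F}$ and $\mathcal{G}$ on $\mathbb{P}^3_\hbar$ or $\mathbb{P}^2_\hbar$.

6. Bundles on $\mathbb{P}^2_\hbar$

6.1. Bundles on $\mathbb{P}^2_\hbar$ with a trivialization on the commutative line. In this section we study bundles on $\mathbb{P}^2_\hbar$. By definition, a bundle is an object $\mathcal{E} \in \mathrm{coh}(\mathbb{P}^2_\hbar)$ satisfying the additional condition $\mathcal{E}xt^i(\mathcal{E}, \mathcal{O}) = 0$ for all $i > 0$ (see Definition 5.4). The noncommutative plane $\mathbb{P}^2_\hbar$ contains the commutative projective line $l \cong \mathbb{P}^1$ given by the equation $w_3 = 0$. If $M$ is a $PP_\hbar$-module, then the quotient module $M/Mw_3$ is a $PP_\hbar/w_3$-module. This gives a functor $\mathrm{coh}(\mathbb{P}^2_\hbar) \to \mathrm{coh}(\mathbb{P}^1)$, $\mathcal{F} \mapsto \mathcal{F}|_l$. The sheaf $\mathcal{F}|_l$ is referred to as the restriction of $\mathcal{F}$ to the line $l$.

Lemma 6.1. If $\mathcal{F}$ is a bundle, there is an exact sequence:
\[ 0 \to \mathcal{F}(-1) \xrightarrow{\ \cdot w_3\ } \mathcal{F} \to \mathcal{F}|_l \to 0. \tag{16} \]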


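Proof. To prove this we only need to show that multiplication by $w_3$ is a monomorphism. If $\mathcal{F}$ is a bundle, it can be embedded into a direct sum $\bigoplus_{i=1}^{s} \mathcal{O}(k_i)$, because by ampleness the dual bundle $\mathcal{F}^\vee$ is covered by a direct sum of line bundles. Now, since the morphism $\mathcal{O}(k_i - 1) \xrightarrow{\cdot w_3} \mathcal{O}(k_i)$ is mono for any $i$, the same is true for the morphism $\mathcal{F}(-1) \xrightarrow{\cdot w_3} \mathcal{F}$. □

Lemma 6.2. Let $\mathcal{E}$ be a bundle on $\mathbb{P}^2_\hbar$ such that its restriction $\mathcal{E}|_l$ to the commutative line $l$ is isomorphic to a trivial bundle $\mathcal{O}_l^{\oplus r}$. Then
\[ H^0(\mathbb{P}^2_\hbar, \mathcal{E}(-1)) = H^0(\mathbb{P}^2_\hbar, \mathcal{E}(-2)) = H^2(\mathbb{P}^2_\hbar, \mathcal{E}(-1)) = H^2(\mathbb{P}^2_\hbar, \mathcal{E}(-2)) = 0. \]

Proof. We have the following exact sequence in the category $\mathrm{coh}(\mathbb{P}^2_\hbar)$:
\[ 0 \to \mathcal{E}(-2) \to \mathcal{E}(-1) \to \mathcal{E}(-1)|_l \to 0. \tag{17} \]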

Since $\mathcal{E}(-1)|_l \cong \mathcal{O}_l(-1)^{\oplus r}$, we have $H^0(\mathcal{E}(-1)|_l) = 0$. Assume that $\mathcal{E}(-1)$ has a nontrivial section. Then $\mathcal{E}(-2)$ has a nontrivial section too. For the same reason $\mathcal{E}(-3)$ has a nontrivial section, and so on. Thus for any $n > 0$ the bundle $\mathcal{E}(-n)$ has a nontrivial section. By (15) we have isomorphisms:
\[ H^0(\mathcal{E}(-n)) \cong \mathrm{Hom}(\mathcal{O}(n), \mathcal{E}) \cong \mathrm{Hom}(\mathcal{E}^\vee, \mathcal{O}(-n)). \]
On the other hand, by Corollary 5.2 the last group is trivial for $n \gg 0$. Hence $H^0(\mathcal{E}(-n)) = 0$ for all $n \gg 0$, and consequently $H^0(\mathcal{E}(-2)) = H^0(\mathcal{E}(-1)) = 0$.

Further, assume that $H^2(\mathcal{E}(-2))$ is nontrivial. Since $H^1(\mathcal{E}(i)|_l) = 0$ for all $i \ge -1$, we have from the exact sequence (16) with $\mathcal{F} = \mathcal{E}(i)$ that $H^2(\mathcal{E}(i))$ is nontrivial too for all $i \ge -1$. But this contradicts Corollary 5.3. Therefore $H^2(\mathcal{E}(-2)) = H^2(\mathcal{E}(-1)) = 0$. This completes the proof. □

6.2. Monads on $\mathbb{P}^2_\hbar$ and $\mathbb{P}^3_\hbar$. As in the commutative case, a non-degenerate monad on $\mathbb{P}^2_\hbar$ or $\mathbb{P}^3_\hbar$ is a complex over $\mathrm{coh}(\mathbb{P}^2_\hbar)$ (respectively, $\mathrm{coh}(\mathbb{P}^3_\hbar)$)
\[ 0 \to H \otimes \mathcal{O}(-1) \xrightarrow{\ m\ } K \otimes \mathcal{O} \xrightarrow{\ n\ } L \otimes \mathcal{O}(1) \to 0 \]
for which the map $n$ is an epimorphism and $m$ is a monomorphism. (Note that there is another, more restrictive definition of a monad, according to which the dual map $(m)^*$ has to be an epimorphism, see [30].) The coherent sheaf $\mathcal{E} = \mathrm{Ker}(n)/\mathrm{Im}(m)$ is called the cohomology of a monad. A morphism between two monads is a morphism of complexes. The following lemma is proved in [30, Lemma 4.1.3] in the commutative case, but the proof is categorical and applies to the noncommutative case as well.

Lemma 6.3. Let $X$ be either $\mathbb{P}^2_\hbar$ or $\mathbb{P}^3_\hbar$, and let $\mathcal{E}$ and $\mathcal{E}'$ be the cohomology bundles of two monads
\[ M : 0 \to H \otimes \mathcal{O}(-1) \xrightarrow{\ m\ } K \otimes \mathcal{O} \xrightarrow{\ n\ } L \otimes \mathcal{O}(1) \to 0, \]
\[ M' : 0 \to H' \otimes \mathcal{O}(-1) \xrightarrow{\ m'\ } K' \otimes \mathcal{O} \xrightarrow{\ n'\ } L' \otimes \mathcal{O}(1) \to 0 \]



on $X$. Then the natural mapping $\mathrm{Hom}(M, M') \to \mathrm{Hom}(\mathcal{E}, \mathcal{E}')$ is bijective.

The proof is based on the fact that $\mathrm{Ext}^j(\mathcal{O}, \mathcal{O}(-1)) = \mathrm{Ext}^j(\mathcal{O}(1), \mathcal{O}(-1)) = \mathrm{Ext}^j(\mathcal{O}(1), \mathcal{O}) = 0$ for all $j$ (see [30], Lemma 4.1.3).

6.3. Non-degeneracy conditions. In the definition of a monad we require that the map $n$ be an epimorphism. In the commutative case this condition must be verified pointwise. In the noncommutative case the situation is simpler in some sense, because the complement of the commutative line $l$ does not have points.

Lemma 6.4. If the restriction of a sheaf $\mathcal{F} \in \mathrm{coh}(\mathbb{P}^2_\hbar)$ to the projective line $l$ is the zero object, then $\mathcal{F}$ is also the zero object.

Proof. Let $M$ be a finitely generated graded $PP_\hbar$-module such that $\mathcal{F} \cong \widetilde{M}$. Consider an exact sequence:
\[ M \xrightarrow{\ \cdot w_3\ } M(1) \to N \to 0. \]
Since $\widetilde{N} = \mathcal{F}(1)|_l = 0$, the module $N$ is finite dimensional. This implies that for $i \gg 0$ the map $\cdot w_3 : M_i \to M_{i+1}$ is surjective. Moreover, these maps are isomorphisms for $i \gg 0$, because all $M_i$ are finite dimensional vector spaces. Let us identify all $M_i$ for $i \gg 0$ with respect to these isomorphisms. Using the $A$-module structure on $M$, we obtain a representation of the Weyl algebra $T(X, Y)/\langle [X, Y] = 2\hbar \rangle$ on the vector space $M_i$. But it is well known that the Weyl algebra does not have nonzero finite dimensional representations (taking the trace of $[X, Y] = 2\hbar \cdot 1$ on a space of dimension $d$ gives $0 = 2\hbar d$, so $d = 0$ for $\hbar \ne 0$). Thus $M_i = 0$ for all $i \gg 0$, and $M$ is finite dimensional. Therefore $\mathcal{F} = 0$. □

The following corollary is an immediate consequence of the lemma.

Corollary 6.5. Let $f : \mathcal{F} \to \mathcal{G}$ be a morphism in $\mathrm{coh}(\mathbb{P}^2_\hbar)$. Suppose its restriction $\bar{f} : \mathcal{F}|_l \to \mathcal{G}|_l$ is an epimorphism. Then $f$ is an epimorphism too.

6.4. From the resolution of the diagonal to a monad. Let $M$ be an $A$-bimodule. Regarding it as a left module, we see that for any $\mathcal{F} \in \mathrm{QGr}(A)_L$ the groups $\mathrm{Ext}^i(\mathcal{F}, \widetilde{M})$ have the structure of right $A$-modules. We can project them to $\mathrm{QGr}(A)_R$. Thus each bimodule $M$ defines functors $\pi\mathrm{Ext}^i(-, \widetilde{M})$ from $\mathrm{QGr}(A)_L$ to $\mathrm{QGr}(A)_R$.

Let $\mathcal{E}$ be a bundle on $\mathbb{P}^2_\hbar$ such that its restriction to the line $l$ is a trivial bundle. Let us consider the bundle $\mathcal{E}^\vee(1) \in \mathrm{qgr}(PP_\hbar)_L$ and the resolution of the diagonal $K^{\cdot}(PP_\hbar)$, which has only three terms:
\[ \{0 \to PP_\hbar(-1) \otimes PP_\hbar(-2) \to \Omega^1(1) \otimes PP_\hbar(-1) \to PP_\hbar \otimes PP_\hbar\} \to \Delta. \]
The resolution of the diagonal is a complex of bimodules. It induces a complex $\widetilde{K}^{\cdot}$ over $\mathrm{QGr}(PP_\hbar)_L$:
\[ \{0 \to \mathcal{O}(-1) \otimes PP_\hbar(-2) \to \widetilde{\Omega}^1(1) \otimes PP_\hbar(-1) \to \mathcal{O} \otimes PP_\hbar\} \to \widetilde{\Delta}, \tag{18} \]



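where $\widetilde{\Omega}^1$ is a sheaf on $\mathbb{P}^2_\hbar$ corresponding to the $PP_\hbar$-module $\Omega^1$.

As described above, each $A$-bimodule $M$ gives the functors $\pi\mathrm{Ext}^i(-, \widetilde{M})$ from $\mathrm{QGr}(A)_L$ to $\mathrm{QGr}(A)_R$. In particular, each object of the resolution of the diagonal induces such functors.

First we calculate these functors for the object $\widetilde{\Delta}$. Note that the object $\widetilde{\Delta}$ coincides with $\bigoplus_{i \ge 0} \mathcal{O}(i)$. Hence by (14) we have
\[ \pi\mathrm{Ext}^j(\mathcal{E}^\vee(1), \widetilde{\Delta}) = 0 \]
if $j > 0$, while $\pi\mathrm{Ext}^0(\mathcal{E}^\vee(1), \widetilde{\Delta}) \cong \mathcal{E}(-1)$. The resolution of the diagonal (18) gives us a spectral sequence with the $E_1$ term
\[ E_1^{pq} = \pi\mathrm{Ext}^q(\mathcal{E}^\vee(1), \widetilde{K}^{-p}) \Longrightarrow \pi\mathrm{Ext}^{p+q}(\mathcal{E}^\vee(1), \widetilde{\Delta}), \]
which converges to
\[ E_\infty^i = \begin{cases} \mathcal{E}(-1) & \text{if } i = 0, \\ 0 & \text{otherwise.} \end{cases} \]

Now we describe all terms $E_1^{pq}$ of this spectral sequence. First we have
\[ \pi\mathrm{Ext}^j(\mathcal{E}^\vee(1), \mathcal{O} \otimes PP_\hbar) \cong \mathrm{Ext}^j(\mathcal{E}^\vee(1), \mathcal{O}) \otimes \widetilde{PP_\hbar} \cong \mathrm{Ext}^j(\mathcal{E}^\vee(1), \mathcal{O}) \otimes \mathcal{O} \cong H^j(\mathbb{P}^2_\hbar, \mathcal{E}(-1)) \otimes \mathcal{O}. \]
By Lemma 6.2, these groups are trivial for $j \ne 1$. For the same reason we have
\[ \pi\mathrm{Ext}^j(\mathcal{E}^\vee(1), \mathcal{O}(-1) \otimes PP_\hbar(-2)) = H^j(\mathbb{P}^2_\hbar, \mathcal{E}(-2)) \otimes \mathcal{O}(-2) = 0 \]
for $j \ne 1$, and
\[ \pi\mathrm{Ext}^1(\mathcal{E}^\vee(1), \mathcal{O}(-1) \otimes PP_\hbar(-2)) \cong H^1(\mathbb{P}^2_\hbar, \mathcal{E}(-2)) \otimes \mathcal{O}(-2). \]

Now let us consider the functors which are associated with the object $\widetilde{\Omega}^1(1) \otimes PP_\hbar(-1)$. We have
\[ \pi\mathrm{Ext}^j(\mathcal{E}^\vee(1), \widetilde{\Omega}^1(1) \otimes PP_\hbar(-1)) \cong \mathrm{Ext}^j(\mathcal{E}^\vee, \widetilde{\Omega}^1) \otimes \mathcal{O}(-1). \]
It follows from the Koszul complex that the sheaf $\widetilde{\Omega}^1$ can be included in two exact sequences:
\[ 0 \to \widetilde{\Omega}^1 \to \mathcal{O}(-1) \otimes (PP_\hbar)_1 \to \mathcal{O} \to 0, \]
\[ 0 \to \mathcal{O}(-3) \to \mathcal{O}(-2) \otimes ((PP_\hbar)_1)^* \to \widetilde{\Omega}^1 \to 0. \]
Applying the functor $\mathrm{Hom}(\mathcal{E}^\vee, -)$ to the first sequence and taking into account that $\mathrm{Hom}(\mathcal{E}^\vee, \mathcal{O}(-1)) = 0$, we obtain $\mathrm{Hom}(\mathcal{E}^\vee, \widetilde{\Omega}^1) = 0$. Similarly, we deduce from the second sequence that $\mathrm{Ext}^2(\mathcal{E}^\vee, \widetilde{\Omega}^1) = 0$, because $\mathrm{Ext}^2(\mathcal{E}^\vee, \mathcal{O}(-2)) = 0$. This implies that the object $\pi\mathrm{Ext}^j(\mathcal{E}^\vee(1), \widetilde{\Omega}^1(1) \otimes PP_\hbar(-1))$ is trivial for all $j \ne 1$.

Thus our spectral sequence is nothing more than the complex
\[ \pi\mathrm{Ext}^1(\mathcal{E}^\vee(1), \widetilde{K}^2) \to \pi\mathrm{Ext}^1(\mathcal{E}^\vee(1), \widetilde{K}^1) \to \pi\mathrm{Ext}^1(\mathcal{E}^\vee(1), \widetilde{K}^0), \]
which is isomorphic to the complex
\[ H^1(\mathbb{P}^2_\hbar, \mathcal{E}(-2)) \otimes \mathcal{O}(-2) \to \mathrm{Ext}^1(\mathcal{E}^\vee, \widetilde{\Omega}^1) \otimes \mathcal{O}(-1) \to H^1(\mathbb{P}^2_\hbar, \mathcal{E}(-1)) \otimes \mathcal{O}. \]
It has only one cohomology which coincides with $\mathcal{E}(-1)$.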


Theorem 6.6. Let $\mathcal{E}$ be a bundle on $\mathbb{P}^2_\hbar$ such that its restriction to the commutative line $l$ is isomorphic to the trivial bundle $\mathcal{O}_l^{\oplus r}$. Then $\mathcal{E}$ is the cohomology of a monad
\[ 0 \to H \otimes \mathcal{O}(-1) \xrightarrow{\ m\ } K \otimes \mathcal{O} \xrightarrow{\ n\ } L \otimes \mathcal{O}(1) \to 0 \]
with $H = H^1(\mathbb{P}^2_\hbar, \mathcal{E}(-2))$, $L = H^1(\mathbb{P}^2_\hbar, \mathcal{E}(-1))$, and such a monad is unique up to an isomorphism. Moreover, in this case the vector spaces $H$ and $L$ have the same dimension.

Proof. The existence of such a monad was proved above (twist the three-term complex constructed above by $\mathcal{O}(1)$). The uniqueness follows from Lemma 6.3. The equality of dimensions of $H$ and $L$ follows immediately from the exact sequence (17). □

6.5. Barth description of monads. Now, following Barth [8], we give a description of the moduli space of vector bundles on $\mathbb{P}^2_\hbar$ trivial on the line $l$ in terms of linear algebra (see also [15]). Denote by $\mathcal{M}_\hbar(r, 0, k)$ the moduli space of bundles on the noncommutative $\mathbb{P}^2_\hbar$ trivial on the line $l$ and with a fixed trivialization there (i.e. with a fixed isomorphism $\mathcal{E}|_l \cong \mathcal{O}_l^{\oplus r}$). Let $\dim H^1(\mathbb{P}^2_\hbar, \mathcal{E}(-1)) = k$. As in the commutative case, the numbers $r, 0, k$ can be regarded as the rank, first Chern class, and second Chern class of $\mathcal{E}$, respectively. The following theorem gives a description of this moduli space which is similar to the description given by Barth in the commutative case.

Theorem 6.7. Let $\{(b_1, b_2; j, i)\}$ be the set of quadruples of matrices $b_1, b_2 \in M_{k \times k}(\mathbb{C})$, $j \in M_{r \times k}(\mathbb{C})$, $i \in M_{k \times r}(\mathbb{C})$, which satisfy the condition
\[ [b_1, b_2] + ij + 2\hbar \cdot 1_{k \times k} = 0. \]
Then the space $\mathcal{M}_\hbar(r, 0, k)$ is the quotient of this set with respect to the following free action of $GL(k, \mathbb{C})$:
\[ b_a \mapsto g b_a g^{-1} \ (a = 1, 2), \qquad j \mapsto j g^{-1}, \qquad i \mapsto g i, \]
where $g \in GL(k, \mathbb{C})$.

Proof. Let $\mathcal{E}$ be a bundle on $\mathbb{P}^2_\hbar$ trivial on the line $l$. We showed above that any such bundle comes from a monad unique up to an isomorphism. Conversely, suppose we have a monad
\[ 0 \to H \otimes \mathcal{O}(-1) \xrightarrow{\ m\ } K \otimes \mathcal{O} \xrightarrow{\ n\ } L \otimes \mathcal{O}(1) \to 0 \tag{19} \]
with $\dim H = \dim L = k$ such that its restriction to the line $l$ is a monad with the cohomology $\mathcal{O}_l^{\oplus r}$. Then the cohomology of this monad is a bundle on $\mathbb{P}^2_\hbar$ which belongs to $\mathcal{M}_\hbar(r, 0, k)$. Indeed, the cohomologies of the dual complex
\[ 0 \to \mathcal{O}(-1) \otimes L^* \xrightarrow{\ n^*\ } \mathcal{O} \otimes K^* \xrightarrow{\ m^*\ } \mathcal{O}(1) \otimes H^* \to 0 \]
coincide with $\mathcal{H}om(\mathcal{E}, \mathcal{O})$ and $\mathcal{E}xt^1(\mathcal{E}, \mathcal{O})$. Hence, to prove that $\mathcal{E}$ is a bundle, it is sufficient to show that the dual complex is a monad too, i.e. that the map $m^*$ is an epimorphism. The restriction of the dual complex to $l$ is a monad which is dual to the restriction of the monad (19) to $l$. Hence the restriction of $m^*$ to $l$ is an epimorphism. Then, by Corollary 6.5, $m^*$ is an epimorphism as well. Thus to describe the moduli space $\mathcal{M}_\hbar(r, 0, k)$ we have to describe the space of all monads (19) modulo isomorphisms preserving the trivialization on $l$.

Consider a monad
\[ 0 \to H \otimes \mathcal{O}(-1) \xrightarrow{\ m\ } K \otimes \mathcal{O} \xrightarrow{\ n\ } L \otimes \mathcal{O}(1) \to 0 \]
with $\dim H = \dim L = k$ and $\dim K = 2k + r$. Denote by $\mathcal{E}$ its cohomology bundle. The maps $m$ and $n$ can be regarded as elements of $H^* \otimes K \otimes W$ and $K^* \otimes L \otimes W$, respectively, where $W = H^0(\mathbb{P}^2_\hbar, \mathcal{O}(1))$ is the vector space spanned by $w_1, w_2, w_3$. The maps $m$ and $n$ can be written as
\[ m = m_1 w_1 + m_2 w_2 + m_3 w_3, \qquad n = n_1 w_1 + n_2 w_2 + n_3 w_3, \]
where $m_i : H \to K$ and $n_i : K \to L$ are constant linear maps. Let us restrict the monad to the line $l$. The monadic condition $nm = 0$ implies now:
\[ n_1 m_2 + n_2 m_1 = 0, \qquad n_1 m_1 = 0, \qquad n_2 m_2 = 0. \]
Moreover, since the restriction of $\mathcal{E}$ to $l$ is trivial, the composition $n_1 m_2$ is an isomorphism (see [30], Lemma 4.2.3). We can choose bases for $H, K, L$ so that $n_1 m_2 = 1_{k \times k}$ (the identity matrix) and
\[ m_1 = \begin{pmatrix} 1_{k \times k} \\ 0_{k \times k} \\ 0_{r \times k} \end{pmatrix}, \quad m_2 = \begin{pmatrix} 0_{k \times k} \\ 1_{k \times k} \\ 0_{r \times k} \end{pmatrix}, \quad n_1 = \begin{pmatrix} 0_{k \times k} & 1_{k \times k} & 0_{k \times r} \end{pmatrix}, \quad n_2 = \begin{pmatrix} -1_{k \times k} & 0_{k \times k} & 0_{k \times r} \end{pmatrix}. \]
Using the equations $n_3 m_1 + n_1 m_3 = 0$ and $n_3 m_2 + n_2 m_3 = 0$ we can write:
\[ m_3 = \begin{pmatrix} b_1 \\ b_2 \\ j \end{pmatrix}, \qquad n_3 = \begin{pmatrix} -b_2 & b_1 & i \end{pmatrix}. \]
Now the monadic condition $nm = 0$ can be written as:
\[ (n_3 m_3) \cdot w_3^2 + 1_{k \times k} \cdot [w_1, w_2] = 0. \]
Therefore we obtain the following matrix equation:
\[ [b_1, b_2] + ij + 2\hbar \cdot 1_{k \times k} = 0. \]
Note that the last $r$ basis vectors of $K$ give us a trivialization of the restriction of $\mathcal{E}$ to the line $l$. It is easy to check that any isomorphism of a monad which preserves the trivialization on $l$ and the choice of the bases of $H, K, L$ made above has the form
\[ b_a \mapsto g b_a g^{-1} \ (a = 1, 2), \qquad j \mapsto j g^{-1}, \qquad i \mapsto g i, \]
where $g \in GL(k, \mathbb{C})$. This proves the theorem. □
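The last step of the proof can be unpacked as follows. The sketch below uses the block matrices chosen in the proof; the relation $[w_1, w_2] = 2\hbar w_3^2$ and the centrality of $w_3$ in $PP_\hbar$ are assumptions here, inferred from the final matrix equation and from the earlier definition of $PP_\hbar$, which lies outside this section.

```latex
% Expansion of nm with m = m_1 w_1 + m_2 w_2 + m_3 w_3, n = n_1 w_1 + n_2 w_2 + n_3 w_3.
% Assumed relations in PP_hbar: w_3 central, [w_1, w_2] = 2\hbar w_3^2.
\begin{align*}
nm &= n_1 m_1\, w_1^2 + n_2 m_2\, w_2^2 + n_1 m_2\, w_1 w_2 + n_2 m_1\, w_2 w_1 \\
   &\quad + (n_1 m_3 + n_3 m_1)\, w_1 w_3 + (n_2 m_3 + n_3 m_2)\, w_2 w_3 + n_3 m_3\, w_3^2 .
\end{align*}
% With the chosen bases: n_1 m_1 = n_2 m_2 = 0, n_1 m_2 = 1, n_2 m_1 = -1,
% n_1 m_3 = b_2 = -n_3 m_1, n_2 m_3 = -b_1 = -n_3 m_2, so the mixed terms cancel and
\begin{align*}
nm &= 1_{k\times k}\,[w_1, w_2] + n_3 m_3\, w_3^2, \\
n_3 m_3 &= \begin{pmatrix} -b_2 & b_1 & i \end{pmatrix}
           \begin{pmatrix} b_1 \\ b_2 \\ j \end{pmatrix}
         = -b_2 b_1 + b_1 b_2 + ij = [b_1, b_2] + ij, \\
nm &= \big([b_1, b_2] + ij + 2\hbar \cdot 1_{k\times k}\big)\, w_3^2 .
\end{align*}
% Equivariance: under b_a -> g b_a g^{-1}, i -> g i, j -> j g^{-1},
\[
[b_1, b_2] + ij + 2\hbar\cdot 1 \;\longmapsto\; g\,\big([b_1, b_2] + ij + 2\hbar\cdot 1\big)\, g^{-1},
\]
% so the solution set is GL(k, C)-invariant.  Taking traces of the equation
% gives tr(ij) = -2\hbar k, hence there are no solutions with r = 0 when \hbar \neq 0.
```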



7. The Noncommutative Variety $\mathbb{P}^3_\hbar$ as a Twistor Space

7.1. Real structures. A $*$-algebra is, by definition, an algebra over $\mathbb{C}$ with an anti-linear anti-homomorphism $*$ satisfying $*^2 = \mathrm{id}$. A $*$-structure on a (graded) algebra is regarded as a real structure on the corresponding (projective) noncommutative variety.

Let us introduce real structures on the complex varieties $\mathbb{C}^4_\hbar$ and $\mathbb{Q}^4_\hbar$ defined in Sect. 3. Assume that in (6), (7) the skew-symmetric matrix $\theta$ is purely imaginary and $\hbar$ is real. Then there is a unique $*$-structure on the algebra $A(\mathbb{C}^4_\hbar)$ such that $x_i^* = x_i$. We denote the corresponding noncommutative variety by $\mathbb{R}^4_\hbar$. Assume in addition that the symmetric matrix $G$ in (7) is real and positive definite. There is a unique $*$-structure on the algebra $Q_\hbar$ such that $X_i^* = X_i$, $D^* = D$, and $T^* = T$. The corresponding noncommutative real variety will be called the noncommutative sphere and denoted by $\mathbb{S}^4_\hbar$. The embedding of $\mathbb{C}^4_\hbar$ into $\mathbb{Q}^4_\hbar$ induces an embedding $\mathbb{R}^4_\hbar \to \mathbb{S}^4_\hbar$. Recall that the complement of $\mathbb{C}^4_\hbar$ in $\mathbb{Q}^4_\hbar$ is a commutative quadratic cone $\sum_{kl} G^{kl} X_k X_l = 0$ which has only one real point. Thus $\mathbb{S}^4_\hbar$ can be regarded as a one-point compactification of $\mathbb{R}^4_\hbar$.

By a linear change of basis one can bring the pair $(G, \theta)$ to the standard form
\[ G = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad \theta = \sqrt{-1} \begin{pmatrix} 0 & a & 0 & 0 \\ -a & 0 & 0 & 0 \\ 0 & 0 & 0 & b \\ 0 & 0 & -b & 0 \end{pmatrix}. \tag{20} \]
Furthermore, since $\hbar$ and $\theta$ enter only in the combination $\hbar \cdot \theta$, and we assume that $a + b \ne 0$, we can set $a + b = 1$ without loss of generality.

7.2. Realification of $\mathbb{P}^3_\hbar$. Recall that the noncommutative projective space $\mathbb{P}^3_\hbar$ corresponds to the algebra $PS_\hbar$ with generators $z_i$, $i = 1, 2, 3, 4$, and relations (9). Consider an algebra $\widetilde{PS}_\hbar$ with generators $z_i, \bar{z}_i$, $i = 1, 2, 3, 4$, and relations
\[
\begin{aligned}
&[z_1, z_2] = 2\hbar(a + b) z_3 z_4, && [z_1, \bar{z}_1] = 2\hbar b\, z_3 \bar{z}_3 - 2\hbar a\, z_4 \bar{z}_4, && [z_1, \bar{z}_2] = 0, \\
&[\bar{z}_1, \bar{z}_2] = -2\hbar(a + b) \bar{z}_3 \bar{z}_4, && [z_2, \bar{z}_2] = 2\hbar a\, z_3 \bar{z}_3 - 2\hbar b\, z_4 \bar{z}_4, && [z_2, \bar{z}_1] = 0, \\
&[z_i, z_j] = [z_i, \bar{z}_j] = [\bar{z}_i, z_j] = [\bar{z}_i, \bar{z}_j] = 0 && \text{for all } i = 3, 4;\ j = 1, 2, 3, 4.
\end{aligned}
\tag{21}
\]
There is a unique $*$-structure on this algebra such that $z_i^* = \bar{z}_i$, $\bar{z}_i^* = z_i$. We denote the corresponding real variety $\mathbb{P}^3_\hbar(\mathbb{R})$. This variety can be considered a realification of $\mathbb{P}^3_\hbar$.

Remark. In contrast to the commutative situation, a noncommutative complex variety in general has many different realifications. We have an ambiguity in the choice of relations involving both $z_i$ and $\bar{z}_j$. The realification (21) is distinguished by the fact that it is the twistor space of the noncommutative sphere $\mathbb{S}^4_\hbar$, as explained below.

In the commutative case there is a map from $\mathbb{P}^3(\mathbb{R})$ to the sphere $S^4$ which is a $\mathbb{P}^1$ fibration. The corresponding $\mathbb{P}^1$ bundle is the projectivization of a spinor bundle on $S^4$. This map is known as the Penrose map. In the noncommutative case we have a similar picture. The analogue of the Penrose map is a map $N : \mathbb{P}^3_\hbar(\mathbb{R}) \to \mathbb{S}^4_\hbar$ which is associated with the homomorphism of $*$-algebras $Q_\hbar \to \widetilde{PS}_\hbar$:
\[
\begin{aligned}
X_1 &\mapsto -\tfrac{\sqrt{-1}}{2}\,(z_1 \bar{z}_4 - \bar{z}_1 z_4 - \bar{z}_2 z_3 + z_2 \bar{z}_3), & D &\mapsto -\tfrac{1}{2}\,(z_1 \bar{z}_1 + \bar{z}_1 z_1 + z_2 \bar{z}_2 + \bar{z}_2 z_2), \\
X_2 &\mapsto \tfrac{1}{2}\,(z_1 \bar{z}_4 + \bar{z}_1 z_4 - \bar{z}_2 z_3 - z_2 \bar{z}_3), & T &\mapsto -(z_3 \bar{z}_3 + z_4 \bar{z}_4), \\
X_3 &\mapsto -\tfrac{\sqrt{-1}}{2}\,(\bar{z}_1 z_3 - z_1 \bar{z}_3 + z_2 \bar{z}_4 - \bar{z}_2 z_4), & & \\
X_4 &\mapsto \tfrac{1}{2}\,(z_1 \bar{z}_3 + \bar{z}_1 z_3 + \bar{z}_2 z_4 + z_2 \bar{z}_4). & &
\end{aligned}
\]
Note that for $\hbar = 0$ we obtain the homomorphism of commutative algebras which corresponds to the usual Penrose map. This means that $\mathbb{P}^3_\hbar(\mathbb{R})$ is the twistor space of $\mathbb{S}^4_\hbar$.

The variety $\mathbb{P}^3_\hbar(\mathbb{R})$ is a twistor space in yet another sense. For the commutative $\mathbb{R}^4$ the complex structures compatible with the symmetric bilinear form $G$ and orientation are parametrized by points of a $\mathbb{P}^1$. This remains true in the noncommutative case. A complex structure (resp. orientation) on $\mathbb{R}^4_\hbar$ is defined as a complex structure (resp. orientation) on the real vector space $U$ spanned by $x_1, \ldots, x_4$. We will choose an orientation on $U$ and require that the complex structure be compatible with it. All such complex structures are parametrized by points of a $\mathbb{P}^1$.

Recall now that $\mathbb{P}^3_\hbar$ is a pencil of noncommutative projective planes passing through the commutative line. Let us pick any one of them. The realification of $\mathbb{P}^3_\hbar$ defined above induces a realification of the noncommutative projective plane. It is easy to see that the complement of the commutative line $w_3 = \bar{w}_3 = 0$ in the realified projective plane is isomorphic to $\mathbb{R}^4_\hbar$. Furthermore, the complement carries a natural complex structure defined by
\[ w_3^{-1} w_i \mapsto \sqrt{-1}\, w_3^{-1} w_i, \qquad \bar{w}_3^{-1} \bar{w}_i \mapsto -\sqrt{-1}\, \bar{w}_3^{-1} \bar{w}_i, \qquad i = 1, 2. \]
The Penrose map induces an identification between the complement and $\mathbb{R}^4_\hbar \subset \mathbb{S}^4_\hbar$, and therefore induces a complex structure on the latter. Varying the noncommutative projective plane, one obtains all possible complex structures on $\mathbb{R}^4_\hbar$ compatible with a particular orientation. This is completely analogous to the commutative case.

7.3. Connection between sheaves on commutative and noncommutative planes. In this subsection we are going to connect the moduli space $\mathcal{M}_\hbar(r, 0, k)$ of bundles on $\mathbb{P}^2_\hbar$ with a trivialization on the line $l$ with the moduli space $\mathcal{M}(r, 0, k)$ of torsion free sheaves on the commutative $\mathbb{P}^2$ with a trivialization on a fixed line. The bridge between bundles on $\mathbb{P}^2_\hbar$ and torsion free sheaves on $\mathbb{P}^2$ is provided by the twistor variety $\mathbb{P}^3_\hbar$. This gives a geometrical interpretation of Nakajima's results (the description of the moduli space $\mathcal{M}(r, 0, k)$ by the deformed ADHM data [28, 27]).

We will construct a hyperkähler manifold $\mathfrak{M}$ parametrizing certain complexes on $\mathbb{P}^3_\hbar$ which is isomorphic to $\mathcal{M}(r, 0, k)$ (which is also a hyperkähler manifold [28]). The isomorphism is given by the restriction of complexes to one of the commutative $\mathbb{P}^2$'s. On the other hand, the restriction of complexes to a noncommutative plane $\mathbb{P}^2_\hbar$ yields an isomorphism between $\mathfrak{M}$ with a particular choice of complex structure and the moduli space $\mathcal{M}_\hbar(r, 0, k)$. Thus $\mathcal{M}_\hbar(r, 0, k)$ can be obtained from $\mathcal{M}(r, 0, k)$ by a rotation of complex structure.

Consider complexes $C^{\cdot}$ on $\mathbb{P}^3_\hbar$ of the form
\[ 0 \to H \otimes \mathcal{O}(-1) \xrightarrow{\ M\ } K \otimes \mathcal{O} \xrightarrow{\ N\ } L \otimes \mathcal{O}(1) \to 0 \tag{22} \]

with $\dim H = \dim L = k$, $\dim K = 2k + r$, which satisfy the condition that the restriction to the line $l$ has only one cohomology which is a trivial bundle (with a fixed trivialization). This condition implies that $M$ is a monomorphism. Note that $N$ is not an epimorphism in general, so (22) is not a monad. But the restriction of the complex (22) to any noncommutative plane is a monad by Corollary 6.5. Thus $N$ can fail to be surjective only on the commutative planes $z_3 = 0$ and $z_4 = 0$.

Now we introduce a real structure on $\mathbb{P}^3_\hbar$ (this is different from the real structure on the realification of $\mathbb{P}^3_\hbar$ defined above). Assume that $\hbar$ is a real number. Consider an anti-linear anti-homomorphism $\bar{J}$ of $PS_\hbar$ defined by
\[ \bar{J}(z_1) = z_2, \quad \bar{J}(z_2) = -z_1, \quad \bar{J}(z_3) = z_4, \quad \bar{J}(z_4) = -z_3, \quad \bar{J}(\lambda) = \bar{\lambda}, \quad \lambda \in \mathbb{C}. \]
Thus $\bar{J}$ is a homomorphism of $\mathbb{R}$-algebras from $PS_\hbar$ to the opposite algebra $PS_\hbar^{\mathrm{op}}$. (The notation $\bar{J}$ is used by analogy with the commutative case, where this anti-homomorphism is a composition of a complex structure $J$ with complex conjugation [15].) The anti-homomorphism $\bar{J}$ induces a functor $\bar{J}^*$ from $\mathrm{qgr}(PS_\hbar)_R$ to $\mathrm{qgr}(PS_\hbar^{\mathrm{op}})_R$. The latter category is naturally identified with the category $\mathrm{qgr}(PS_\hbar)_L$. Using this identification we can consider the composition of $\bar{J}^*$ with the dualization functor $\mathcal{H}om(-, \mathcal{O})$ as a functor from $\mathrm{qgr}(PS_\hbar)_R$ to itself. For any bundle $\mathcal{E}$ we denote by $\bar{J}^*(\mathcal{E})^\vee$ its image under this functor. The functor can be extended to complexes of bundles. It takes the complex $C^{\cdot}$ (22) to the complex $\bar{J}^*(C^{\cdot})^\vee$:
\[ 0 \to \bar{L}^* \otimes \mathcal{O}(-1) \xrightarrow{\ \bar{J}^*(N)^\vee\ } \bar{K}^* \otimes \mathcal{O} \xrightarrow{\ \bar{J}^*(M)^\vee\ } \bar{H}^* \otimes \mathcal{O}(1) \to 0. \]
Let us consider complexes $C^{\cdot}$ on $\mathbb{P}^3_\hbar$ with an isomorphism
\[ \bar{J}^*(C^{\cdot})^\vee \cong C^{\cdot} \tag{23} \]

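and a trivialization on the line $l$. Then the space $K$ acquires a hermitian metric and $L$ becomes isomorphic to $\bar{H}^*$. The reasoning of Sect. 6 shows that we can represent the maps $M$ and $N$ as
\[ M = M_1 z_1 + M_2 z_2 + M_3 z_3 + M_4 z_4, \qquad N = N_1 z_1 + N_2 z_2 + N_3 z_3 + N_4 z_4, \]
where $M_i$ and $N_i$ are constant maps. By a suitable choice of bases we can put these maps into the form
\[
M_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad
M_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad
M_3 = \begin{pmatrix} B_1 \\ B_2 \\ J \end{pmatrix}, \quad
M_4 = \begin{pmatrix} B_1' \\ B_2' \\ J' \end{pmatrix},
\tag{24}
\]
\[
N_1 = \begin{pmatrix} 0 & 1 & 0 \end{pmatrix}, \quad
N_2 = \begin{pmatrix} -1 & 0 & 0 \end{pmatrix}, \quad
N_3 = \begin{pmatrix} -B_2 & B_1 & I \end{pmatrix}, \quad
N_4 = \begin{pmatrix} -B_2' & B_1' & I' \end{pmatrix}.
\]
Using the reality conditions $\bar{J}^*(N)^\vee = M$ and $\bar{J}^*(M)^\vee = -N$ we find that
\[ B_1' = -B_2^\dagger, \qquad B_2' = B_1^\dagger, \qquad J' = I^\dagger, \qquad I' = -J^\dagger. \tag{25} \]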

Finally the condition N M = 0 gives a) b)

µc = [B1 , B2 ] + IJ = 0, µr = [B1 , B1 † ] + [B2 , B2 † ] + II † − J † J = −2h¯ · 1k×k .

These matrix equations are invariant under the following action of U (k): Bi → gBi g −1 ,

I → gI,

J → Jg −1 ,

where g ∈ U (k).

(26)
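The invariance claimed in (26) can be checked directly. The following short verification uses only the definitions of $\mu_c$ and $\mu_r$ and the unitarity $g^\dagger = g^{-1}$; it is a sketch added for the reader's convenience.

```latex
% Transformed data: B_a' = g B_a g^{-1}, I' = g I, J' = J g^{-1}, with g^\dagger = g^{-1}.
\begin{align*}
\mu_c' &= [g B_1 g^{-1}, g B_2 g^{-1}] + (gI)(J g^{-1})
        = g\big([B_1, B_2] + IJ\big) g^{-1} = g\,\mu_c\, g^{-1}, \\
(g B_a g^{-1})^\dagger &= (g^{-1})^\dagger B_a^\dagger g^\dagger = g B_a^\dagger g^{-1}, \\
\mu_r' &= g\big([B_1, B_1^\dagger] + [B_2, B_2^\dagger]\big) g^{-1}
          + g\, I I^\dagger g^{-1} - g\, J^\dagger J\, g^{-1} = g\,\mu_r\, g^{-1}.
\end{align*}
% Since 0 and -2\hbar \cdot 1 commute with g, both level sets are preserved:
\[
\mu_c = 0 \iff \mu_c' = 0, \qquad \mu_r = -2\hbar\cdot 1 \iff \mu_r' = -2\hbar \cdot 1 .
\]
% For a general g in GL(k, C) only the first line survives, since
% (g B_a g^{-1})^\dagger = g B_a^\dagger g^{-1} requires g^\dagger = g^{-1}.
```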

Denote by $\mathbf{M}$ the vector space of complex matrices $(B_1, B_2, I, J)$. It has a structure of a quaternionic vector space defined by $(B_1, B_2, I, J) \mapsto (-B_2^\dagger, B_1^\dagger, -J^\dagger, I^\dagger)$, and, moreover, it is a flat hyperkähler manifold (see [28]). The map $\mu = (\mu_r, \mu_c)$ is a hyperkähler moment map for the action of $U(k)$ defined in (26) (see [19]). Since the action of $U(k)$ on $\mu_c^{-1}(0) \cap \mu_r^{-1}(-2\hbar \cdot 1)$ is free, the quotient $\mathfrak{M} = \mu_c^{-1}(0) \cap \mu_r^{-1}(-2\hbar \cdot 1)/U(k)$ is a smooth hyperkähler manifold. This manifold parametrizes complexes (22) with a real structure (23) and a trivialization on the line $l$.

On the other hand, it was proved in [28, 27] that the moduli space $\mathcal{M}(r, 0, k)$ of torsion free sheaves on the commutative $\mathbb{P}^2$ with a trivialization on a fixed line can be identified with $\mathfrak{M}$. This identification can be described geometrically as follows. Let us assume that $\hbar$ is positive. It can be checked that in this case the map $N$ can fail to be surjective only on the plane $z_4 = 0$. We can restrict the complex (22) to the commutative plane $z_3 = 0$. The restriction is a monad and its cohomology sheaf is a torsion free sheaf. It is easy to see that this yields a complex isomorphism from $\mathfrak{M}$ to $\mathcal{M}(r, 0, k)$.

The restriction of the complex (22) to a noncommutative plane is a monad as well. This yields a map from $\mathfrak{M}$ to the moduli space $\mathcal{M}_\hbar(r, 0, k)$ of bundles on the noncommutative plane. Let us show that this map is an isomorphism. To this end we note that on the level of the linear algebra data this map sends a quadruple $(B_1, B_2, I, J)$ to the quadruple $(b_1, b_2, i, j)$ with
\[ b_1 = B_1 - B_2^\dagger, \qquad b_2 = B_2 + B_1^\dagger, \qquad i = I - J^\dagger, \qquad j = J + I^\dagger. \]
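The effect of this substitution on the equations can be made explicit by direct expansion. In the sketch below, $\nu$ denotes the real moment map on quadruples $(b_1, b_2, i, j)$, written in the same form as $\mu_r$; this choice of normalization and the name $\nu$ are assumptions of the illustration, not notation from the original.

```latex
% Substitution: b_1 = B_1 - B_2^\dagger, b_2 = B_2 + B_1^\dagger, i = I - J^\dagger, j = J + I^\dagger.
% Recall \mu_c = [B_1,B_2] + IJ and
%        \mu_r = [B_1,B_1^\dagger] + [B_2,B_2^\dagger] + II^\dagger - J^\dagger J.
\begin{align*}
[b_1, b_2] &= [B_1, B_2] - [B_1, B_2]^\dagger + [B_1, B_1^\dagger] + [B_2, B_2^\dagger], \\
ij &= IJ - (IJ)^\dagger + II^\dagger - J^\dagger J, \\
[b_1, b_2] + ij &= \mu_r + \mu_c - \mu_c^\dagger, \\
\nu := [b_1, b_1^\dagger] + [b_2, b_2^\dagger] + ii^\dagger - j^\dagger j
   &= -2\,(\mu_c + \mu_c^\dagger).
\end{align*}
% \mu_r is Hermitian and \mu_c - \mu_c^\dagger is anti-Hermitian, so for real \hbar
\[
[b_1, b_2] + ij + 2\hbar\cdot 1 = 0 \ \text{and}\ \nu = 0
\iff
\mu_r = -2\hbar \cdot 1,\ \ \mu_c - \mu_c^\dagger = 0,\ \ \mu_c + \mu_c^\dagger = 0
\iff
\mu_r = -2\hbar\cdot 1,\ \ \mu_c = 0 .
\]
```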

Further, note that the equations $\mu_c = 0$, $\mu_r = -2\hbar \cdot 1$ are equivalent to the equation $[b_1, b_2] + i \cdot j + 2\hbar \cdot 1 = 0$ together with the vanishing of the moment map for the action of the group $U(k)$ on the space of quadruples $(b_1, b_2, i, j)$. Now it follows from the theorem of Kempf and Ness ([28, 20]) that the map $\mathfrak{M} \to \mathcal{M}_\hbar(r, 0, k)$ is a diffeomorphism. It becomes a complex isomorphism if we replace the natural complex structure of the space $\mathfrak{M}$ with another one within the $\mathbb{P}^1$ of complex structures on $\mathfrak{M}$. Thus we have



Theorem 7.1. The moduli space $\mathcal{M}_\hbar(r, 0, k)$ is a smooth hyperkähler manifold of real dimension $4rk$, and as a hyperkähler manifold it is isomorphic to the moduli space $\mathcal{M}(r, 0, k)$ of torsion free sheaves on the commutative $\mathbb{P}^2$ with a trivialization on a fixed line. As a complex manifold $\mathcal{M}_\hbar(r, 0, k)$ is obtained from $\mathcal{M}(r, 0, k)$ by a rotation of the complex structure.

The above discussion shows that there are natural bijections between
A. Bundles on $\mathbb{P}^2_\hbar$ with a trivialization on the commutative line $l$ and $c_2 = k$.
B. Solutions of the equations $\mu_c = 0$, $\mu_r = -2\hbar \cdot 1$ modulo the action of $U(k)$.
C. Complexes of sheaves on $\mathbb{P}^3_\hbar$ of the form (22) with a trivialization on the commutative line $l$ satisfying the reality condition (23).

One can show that for $r > 1$ a generic complex (22) is a monad and its cohomology is a bundle $\mathcal{E}$ on $\mathbb{P}^3_\hbar$ such that
\[ H^1(\mathbb{P}^3_\hbar, \mathcal{E}(-2)) = 0, \qquad \bar{J}^*(\mathcal{E})^\vee \cong \mathcal{E}. \tag{27} \]
Moreover, it can be shown that any bundle $\mathcal{E}$ satisfying the conditions (27) can be represented as a cohomology of a monad of the form (22).
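The real dimension $4rk$ in Theorem 7.1 agrees with a naive parameter count for the hyperkähler quotient. The count below is a heuristic sketch: it assumes that the level set of the moment map is a smooth submanifold of the expected codimension, which is consistent with the freeness of the $U(k)$-action stated above.

```latex
% Real dimension of the space of quadruples (B_1, B_2, I, J):
%   B_1, B_2 : k x k complex,  I : k x r complex,  J : r x k complex.
\begin{align*}
\dim_{\mathbb{R}} \{(B_1, B_2, I, J)\} &= 2\,(2k^2 + 2kr) = 4k^2 + 4kr, \\
\text{equations } \mu_c = 0 &: \quad 2k^2 \ \text{real conditions (a complex } k \times k \text{ matrix)}, \\
\text{equations } \mu_r = -2\hbar \cdot 1 &: \quad k^2 \ \text{real conditions (a Hermitian } k \times k \text{ matrix)}, \\
\dim_{\mathbb{R}} U(k) &= k^2, \\
\dim_{\mathbb{R}} \big(\mu_c^{-1}(0) \cap \mu_r^{-1}(-2\hbar \cdot 1)\big)/U(k)
  &= (4k^2 + 4kr) - 2k^2 - k^2 - k^2 = 4kr .
\end{align*}
```

This matches the general rule that a hyperkähler quotient by a group $G$ acting freely lowers the real dimension by $4 \dim G$.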

8. Noncommutative Twistor Transform

8.1. Review of the twistor transform. In the commutative case the ADHM construction of instantons has the following geometric interpretation. Consider the double fibration
\[ G(2; 4) \xleftarrow{\ p\ } Fl(1, 2; 4) \xrightarrow{\ q\ } \mathbb{P}^3, \tag{28} \]
where $G(2; 4)$ is the Grassmannian and $Fl(1, 2; 4)$ is the partial flag variety. The Grassmannian $G(2; 4)$ has a real structure with $S^4$ as the set of real points. For any bundle $\mathcal{E}$ on $\mathbb{P}^3$ its twistor transform is defined as a sheaf $p_* q^* \mathcal{E}$ on $G(2; 4)$. Given ADHM data we have a monad on $\mathbb{P}^3$ whose cohomology is a bundle. It can be shown that the restriction of its twistor transform to the sphere $S^4$ coincides with the instanton bundle corresponding to these ADHM data. The instanton connection can also be reconstructed from the bundle on $\mathbb{P}^3$ (see [4, 24] for details).

In this section we show that one can consider the noncommutative quadric introduced in Sect. 3 as a noncommutative Grassmannian $G(2; 4)$. We also construct a noncommutative flag variety $Fl(1, 2; 4)$ and projections $p, q$ giving a noncommutative analogue of the twistor diagram (28). The twistor transform can be defined in the same way as above. It produces a bundle on the noncommutative sphere from the deformed ADHM data. We show that this bundle is precisely the kernel of the map $D$ defined in Sect. 2. It should also be possible to construct the instanton connection on the noncommutative $\mathbb{R}^4$ from the complex of sheaves on $\mathbb{P}^3_\hbar$. To do this, one needs to develop the differential geometry of noncommutative affine and projective varieties. We go some way in this direction by defining differential forms and spinors. Since the goal of this section is mainly illustrative, we limit ourselves to stating the results. An interested reader should be able to fill in the proofs.



8.2. Tensor categories. A good way to construct noncommutative varieties with properties similar to those of commutative varieties is to start with a tensor category (see [25, 23]). Let T be an abelian tensor category. Consider a tensor functor O : T → Vect to the abelian tensor category of vector spaces compatible with the associativity constraint but not compatible with the commutativity constraint. If A is a commutative algebra in the tensor category T , then O(A) is a noncommutative algebra in the tensor category Vect. If M ∈ T is a right A-module, then O(M) is a right O(A)-module. Any right A-module (in the category T ) has a natural structure of a left A-module (and hence an A-bimodule). Thus any right O(A)-module of the form O(M) has a natural structure of a O(A)-bimodule. Consider the category CommT of all finitely generated (graded) commutative algebras in the tensor category T . Then under O the category CommT is mapped to a subcategory of the category of finitely generated (graded) algebras. This subcategory enjoys many properties of the category of commutative (graded) algebras. For example, for all A, B ∈ CommT there is a natural algebra structure on O(A) ⊗ O(B) coming from the algebra structure on A ⊗ B. The corresponding subcategory in the category of noncommutative affine (resp. projective) varieties shares a lot of properties with the category of commutative varieties. For example, if X and Y are varieties in this category, then using the tensor product of the corresponding algebras one can define the “Carthesian” product X × Y . More generally, given a pair of morphisms X → Z and Y → Z one can define the fiber product X ×Z Y . Further, starting from the module of differential forms of A one can construct the sheaf of differential forms on the corresponding noncommutative variety. The category qgr(O(A)) has a nice subcategory which consists of modules of the form O(M), where M ∈ T is an A-module. 
To any object $O(M)$ of this subcategory one can associate its symmetric and exterior powers. The symmetric powers of $O(M)$ form a noncommutative graded algebra. This enables one to define the projectivization of the sheaf corresponding to the module $O(M)$.

8.3. Yang–Baxter operators. One way to construct an abelian tensor category $T$ with a functor $O : T \to \mathrm{Vect}$ is to consider a Yang–Baxter operator (see [25, 23]). A Yang–Baxter operator on a vector space $V$ is an operator $R : V \otimes V \to V \otimes V$ such that
\[
R^2 = \mathrm{id}_{V \otimes V}, \qquad (R \otimes \mathrm{id}_V)(\mathrm{id}_V \otimes R)(R \otimes \mathrm{id}_V) = (\mathrm{id}_V \otimes R)(R \otimes \mathrm{id}_V)(\mathrm{id}_V \otimes R). \tag{29}
\]
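The two conditions in (29) are easy to test numerically once a basis is fixed. The following is a minimal sketch, assuming $V$ is finite-dimensional so that $R$ becomes an $n^2 \times n^2$ matrix in the basis $e_i \otimes e_j$. The helper names `flip`, `example_operator` and `is_yang_baxter` are ours, and `example_operator` encodes our reading of formula (30) from Example 8.1 with illustrative integer values of $\hbar$, $a$, $b$; it is not taken from the paper.

```python
import numpy as np

def flip(n):
    """Matrix of R0(v1 ⊗ v2) = v2 ⊗ v1 in the basis e_i ⊗ e_j (index i*n + j)."""
    R = np.zeros((n * n, n * n), dtype=np.int64)
    for i in range(n):
        for j in range(n):
            R[j * n + i, i * n + j] = 1   # column = input, row = output
    return R

def example_operator(hbar, a, b):
    """Our reading of operator (30) on the basis z1..z4 (0-based indices 0..3)."""
    n = 4
    R = flip(n)
    col12, col21 = 0 * n + 1, 1 * n + 0   # inputs z1⊗z2 and z2⊗z1
    row34, row43 = 2 * n + 3, 3 * n + 2   # outputs z3⊗z4 and z4⊗z3
    R[row34, col12] += 2 * hbar * a       # R(z1⊗z2) gains 2ħ(a z3⊗z4 + b z4⊗z3)
    R[row43, col12] += 2 * hbar * b
    R[row34, col21] -= 2 * hbar * b       # R(z2⊗z1) gains -2ħ(b z3⊗z4 + a z4⊗z3)
    R[row43, col21] -= 2 * hbar * a
    return R

def is_yang_baxter(R, n):
    """Check R^2 = id and the braid relation of (29) on V ⊗ V ⊗ V."""
    I = np.eye(n, dtype=R.dtype)
    R12 = np.kron(R, I)                   # R acting on factors 1 and 2
    R23 = np.kron(I, R)                   # R acting on factors 2 and 3
    involutive = np.array_equal(R @ R, np.eye(n * n, dtype=R.dtype))
    braid = np.array_equal(R12 @ R23 @ R12, R23 @ R12 @ R23)
    return involutive and braid
```

Integer entries keep the comparison exact, so no floating-point tolerance is needed; for instance `is_yang_baxter(example_operator(3, 2, -1), 4)` checks the case $\hbar = 3$, $a = 2$, $b = -1$ (so that $a + b = 1$).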

A Yang–Baxter operator induces an action of the permutation group $S_n$ on the tensor power $V^{\otimes n}$, where the transposition $(i, i+1) \in S_n$ acts as the operator
\[
R_{i,i+1} = \mathrm{id}_{V^{\otimes (i-1)}} \otimes R \otimes \mathrm{id}_{V^{\otimes (n-i-1)}} : V^{\otimes n} \to V^{\otimes n}.
\]
Equations (29) ensure that the operators $R_{i,i+1}$ satisfy the relations between the transpositions $(i, i+1)$ in the group $S_n$. If $R$ is a Yang–Baxter operator on a vector space $V$, then the dual operator $R^\vee : V^* \otimes V^* \to V^* \otimes V^*$ is also a Yang–Baxter operator.

Given a Yang–Baxter operator $R : V \otimes V \to V \otimes V$, one can construct an abelian tensor category $T_R$ and a functor $O_R : T_R \to \mathrm{Vect}$ such that $V$ is an $O_R$-image of some object of $T_R$, and the commutativity morphism in the category $T_R$ is mapped by $O_R$ to $R$ [23]. As mentioned above, given any two objects $A$, $B$ of the category $\mathrm{Comm}_{T_R}$, one has a natural algebra structure on the vector space $O(A) \otimes O(B)$. This algebra will be denoted $O(A) \otimes_R O(B)$ and called the R-tensor product of $O(A)$ and $O(B)$.

It is well known that there is a one-to-one correspondence between irreducible representations of the group $S_n$ and partitions of $n$ (Young diagrams). Under this correspondence the trivial partition $(n)$ corresponds to the sign representation, while the maximal partition $(\underbrace{1, 1, \dots, 1}_{n\ \text{times}})$ corresponds to the identity representation. Given a partition $(k_1, \dots, k_r)$ of $n$ ($k_1 \ge k_2 \ge \dots \ge k_r$) we denote by $\Sigma^{(k_1,\dots,k_r)}$ the corresponding irreducible representation and by $\Sigma_R^{(k_1,\dots,k_r)} V$ (resp. $\Sigma_R^{(k_1,\dots,k_r)} V^*$) the $(k_1, \dots, k_r)$-isotypical component of $V^{\otimes n}$ (resp. $(V^*)^{\otimes n}$), i.e. the sum of all subrepresentations of $V^{\otimes n}$ (resp. $(V^*)^{\otimes n}$) isomorphic to $\Sigma^{(k_1,\dots,k_r)}$. We also put $\Lambda^n_R V = \Sigma_R^{(n)} V$, $\Lambda^n_R V^* = \Sigma_R^{(n)} V^*$ for brevity.

Remark. The subspaces $\Sigma_R^\lambda V \subset V^{\otimes n}$ are the $O_R$-images of some objects of the category $T_R$.

Let $\lambda$, $\mu$ be partitions of $n$ and $m$ respectively. It is clear that the action of the permutation $\sigma_{n,m} \in S_{n+m}$,
\[
\sigma_{n,m}(i) = \begin{cases} i + m, & \text{if } 1 \le i \le n, \\ i - n, & \text{if } n+1 \le i \le n+m, \end{cases}
\]
gives an isomorphism
\[
R_{n,m} : \Sigma_R^\lambda V \otimes \Sigma_R^\mu V \to \Sigma_R^\mu V \otimes \Sigma_R^\lambda V.
\]

Remark. This isomorphism is the image of an isomorphism in the category $T_R$.

The trivial example of a Yang–Baxter operator is the usual transposition $R_0(v_1 \otimes v_2) = v_2 \otimes v_1$. We will say that $R$ is a deformation-trivial Yang–Baxter operator if $R$ is an algebraic deformation of $R_0$ in the class of Yang–Baxter operators. For a deformation-trivial Yang–Baxter operator $R$ we have $\dim \Sigma_R^\lambda V = \dim \Sigma_{R_0}^\lambda V$ for any partition $\lambda$.

8.4. The noncommutative projective space. Let $R$ be a deformation-trivial Yang–Baxter operator on the vector space $V^*$. Then the graded algebra
\[
S_R^{\bullet} V^* = T(V^*) / \langle \Lambda_R^2 V^* \rangle
\]
is a noncommutative deformation of the coordinate algebra of the projective space $\mathbb{P}(V)$. We denote by $\mathbb{P}_R(V)$ the corresponding noncommutative variety. Thus $\mathbb{P}_R(V)$ is a noncommutative deformation of the projective space $\mathbb{P}(V)$.



Example 8.1. The operator
\[
\begin{aligned}
R(z_i \otimes z_j) &= z_j \otimes z_i, \quad \text{if } (i,j) \ne (1,2), (2,1), \\
R(z_1 \otimes z_2) &= z_2 \otimes z_1 + 2\hbar\,(a\, z_3 \otimes z_4 + b\, z_4 \otimes z_3), \\
R(z_2 \otimes z_1) &= z_1 \otimes z_2 - 2\hbar\,(b\, z_3 \otimes z_4 + a\, z_4 \otimes z_3),
\end{aligned} \tag{30}
\]
is a deformation-trivial Yang–Baxter operator on the 4-dimensional vector space $Z^*$ with the basis $\{z_1, z_2, z_3, z_4\}$. By definition the homogeneous coordinate algebra of $\mathbb{P}_R(Z)$ is generated by $z_1, z_2, z_3, z_4$ with relations (9) (we set $a + b = 1$ as before). Hence $\mathbb{P}_R(Z)$ is isomorphic to the noncommutative projective space $\mathbb{P}^3_\hbar$ defined in Sect. 3. The space $Z^*$ was denoted $U$ in that section.

The above example shows that part of the data encoded in the Yang–Baxter operator $R$ is lost in the structure of the corresponding noncommutative projective space. We will see below that this data appears in the structure of other noncommutative varieties associated with $R$.

8.5. Noncommutative Grassmannians. It is well known that the homogeneous coordinate algebra of the Grassmann variety $G(k; V)$ is a graded quadratic algebra with $\Lambda^k V^*$ as the space of generators and
\[
\mathrm{Ker}\bigl(\Lambda^k V^* \otimes \Lambda^k V^* \to (V^*)^{\otimes 2k} \to \Sigma^{(k,k)} V^*\bigr)
\]
as the space of relations. This description justifies the following definition.

Definition 8.2. Let $R$ be a Yang–Baxter operator on the space $V^*$. The noncommutative Grassmann variety $G_R(k; V)$ is the noncommutative projective variety corresponding to the quadratic algebra
\[
G_R(k; V) = T(\Lambda_R^k V^*) \big/ \bigl\langle \mathrm{Ker}(\Lambda_R^k V^* \otimes \Lambda_R^k V^* \to \Sigma_R^{(k,k)} V^*) \bigr\rangle .
\]
The algebra $G_R(k; V)$ is the $O_R$-image of a commutative algebra in the category $T_R$. If $R$ is deformation-trivial, then $G_R(k; V)$ is a noncommutative deformation of $G(k; V)$. Note that $G_R(1; V) = \mathbb{P}_R(V)$ by definition.

Example 8.3. Consider the noncommutative Grassmannian $G_R(2; Z)$ corresponding to the Yang–Baxter operator (30). Let
\[
z_{ij} = \tfrac{1}{2}\bigl((z_i \otimes z_j - z_j \otimes z_i) - R(z_i \otimes z_j - z_j \otimes z_i)\bigr) \in \Lambda_R^2 Z^* .
\]

Then it is easy to check that $G_R(2; Z)$ is generated by the elements
\[
Y_1 = z_{13}, \quad Y_2 = -z_{24}, \quad Y_3 = z_{23}, \quad Y_4 = z_{14}, \quad D = -z_{12}, \quad T = z_{34},
\]
with relations
\[
\begin{gathered}
[Y_1, Y_2] = 2\hbar a T^2, \qquad [Y_3, Y_4] = 2\hbar b T^2, \\
[D, Y_1] = -2\hbar a Y_1 T, \qquad [D, Y_2] = 2\hbar a Y_2 T, \\
[D, Y_3] = -2\hbar b Y_3 T, \qquad [D, Y_4] = 2\hbar b Y_4 T, \\
DT = \tfrac{1}{2}\,(Y_1 Y_2 + Y_2 Y_1 + Y_3 Y_4 + Y_4 Y_3), \\
[Y_i, Y_j] = 0 \ \ (i = 1, 2,\ j = 3, 4), \qquad [T, Y_j] = [T, D] = 0 \ \ (j = 1, 2, 3, 4).
\end{gathered} \tag{31}
\]
Comparing with (7) one can see that the algebra $G_R(2; Z)$ is isomorphic to $Q_\hbar$ with $G$ and $\theta$ given by
\[
G = \frac{1}{2}\begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \qquad
\theta = 2\hbar \begin{pmatrix} 0 & a & 0 & 0 \\ -a & 0 & 0 & 0 \\ 0 & 0 & 0 & b \\ 0 & 0 & -b & 0 \end{pmatrix}.
\]
Note that the variables $X_i$, $i = 1, 2, 3, 4$, used in Sect. 7 to describe the quadric are related to $Y_i$, $i = 1, 2, 3, 4$, by the following formulas:
\[
\begin{aligned}
Y_1 &= X_2 + \sqrt{-1}\, X_1, & Y_2 &= -X_2 + \sqrt{-1}\, X_1, \\
Y_3 &= X_4 + \sqrt{-1}\, X_3, & Y_4 &= -X_4 + \sqrt{-1}\, X_3.
\end{aligned} \tag{32}
\]

8.6. Products of Grassmannians and flag varieties. Let $R$ be a Yang–Baxter operator on the vector space $V^*$. Consider a sequence $k_1, \dots, k_r$ of integers. Let $\mathbb{Z}^r$ be the free abelian group with $r$ generators $e_1, \dots, e_r$. The R-tensor product $G_R(k_1; V) \otimes_R \dots \otimes_R G_R(k_r; V)$

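is a $\mathbb{Z}^r$-graded algebra generated by the vector spaces $\Lambda_R^{k_i} V^*$ in degree $e_i$, with relations
\[
\mathrm{Ker}\bigl(\Lambda_R^{k_i} V^* \otimes \Lambda_R^{k_i} V^* \to \Sigma_R^{(k_i,k_i)} V^*\bigr)
\]
in degree $2e_i$ for all $i$ and
\[
\mathrm{Ker}\Bigl( (\Lambda_R^{k_i} V^* \otimes \Lambda_R^{k_j} V^*) \oplus (\Lambda_R^{k_j} V^* \otimes \Lambda_R^{k_i} V^*) \xrightarrow{\ (\mathrm{id},\, -R_{k_j,k_i})\ } \Lambda_R^{k_i} V^* \otimes \Lambda_R^{k_j} V^* \Bigr)
\]
in degree $e_i + e_j$ for all $i > j$. For any increasing sequence $k_1, \dots, k_r$ we also define a $\mathbb{Z}^r$-graded algebra $FL_R(k_1, \dots, k_r; V)$. It has the same generators as the algebra $G_R(k_1; V) \otimes_R \dots \otimes_R G_R(k_r; V)$, subject to the same relations in degrees $2e_i$ and to relations
\[
\mathrm{Ker}\Bigl( (\Lambda_R^{k_i} V^* \otimes \Lambda_R^{k_j} V^*) \oplus (\Lambda_R^{k_j} V^* \otimes \Lambda_R^{k_i} V^*) \xrightarrow{\ (\mathrm{id},\, -R_{k_j,k_i})\ } \Lambda_R^{k_i} V^* \otimes \Lambda_R^{k_j} V^* \longrightarrow \Sigma_R^{(k_i,k_j)} V^* \Bigr)
\]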

in degree $e_i + e_j$ for all $i > j$. This definition is suggested by the Borel–Weil–Bott theorem (see [14]). In particular, for $R = R_0$ we get the algebra corresponding to the commutative flag variety.

We define the R-Cartesian product $G_R(k_1; V) \times_R \dots \times_R G_R(k_r; V)$ and the noncommutative flag variety $\mathrm{Fl}_R(k_1, \dots, k_r; V)$ as the noncommutative varieties corresponding to the algebras $G_R(k_1; V) \otimes_R \dots \otimes_R G_R(k_r; V)$ and $FL_R(k_1, \dots, k_r; V)$ respectively. To make this compatible with our definition of a noncommutative variety, we consider instead of a $\mathbb{Z}^r$-graded algebra its diagonal subalgebra. The diagonal subalgebra is a graded algebra whose $n$th graded component is the $n(e_1 + \dots + e_r)$-graded component of the $\mathbb{Z}^r$-graded algebra. Thus according to Sect. 3 the category of coherent sheaves on the R-Cartesian product of Grassmannians (or the flag variety) is the category qgr of the corresponding diagonal subalgebra.

The algebra $FL_R(k_1, \dots, k_r; V)$ is the $O_R$-image of a commutative algebra in the category $T_R$. Hence one can define the R-Cartesian product of several flag varieties. If $R$ is deformation-trivial, then $G_R(k_1; V) \times_R \dots \times_R G_R(k_r; V)$ and $\mathrm{Fl}_R(k_1, \dots, k_r; V)$ are noncommutative deformations of the corresponding commutative varieties.

Note that we have a canonical embedding of the graded algebra $G_R(k_i; V)$ into the graded algebra $FL_R(k_1, \dots, k_i, \dots, k_r; V)$ inducing the canonical projections $p_i : \mathrm{Fl}_R(k_1, \dots, k_i, \dots, k_r; V) \to G_R(k_i; V)$. On the other hand, by definition $FL_R(k_1, \dots, k_r; V)$ is a quotient algebra of the algebra $G_R(k_1; V) \otimes_R \dots \otimes_R G_R(k_r; V)$. Hence $\mathrm{Fl}_R(k_1, \dots, k_r; V)$ can be regarded as a closed subvariety in $G_R(k_1; V) \times_R \dots \times_R G_R(k_r; V)$.

Example 8.4. The algebra $G_R(1; Z) \otimes_R G_R(2; Z)$ corresponding to the Yang–Baxter operator (30) is generated by the elements $z_1, z_2, z_3, z_4, Y_1, Y_2, Y_3, Y_4, D, T$ with relations (9), (31), and
\[
\begin{gathered}
[z_2, Y_1] = 2\hbar a z_3 T, \qquad [z_1, Y_2] = -2\hbar a z_4 T, \\
[z_1, Y_3] = -2\hbar b z_3 T, \qquad [z_2, Y_4] = -2\hbar b z_4 T, \\
[z_1, D] = -2\hbar b z_3 Y_4 - 2\hbar a z_4 Y_1, \qquad [z_2, D] = 2\hbar a z_3 Y_2 - 2\hbar b z_4 Y_3, \\
[z_1, Y_1] = [z_2, Y_2] = 0, \qquad [z_3, Y_i] = [z_3, D] = 0, \qquad [z_4, Y_i] = [z_4, D] = 0, \qquad [z_i, T] = 0
\end{gathered}
\]
for all $i = 1, 2, 3, 4$. The algebra $FL_R(1, 2; Z)$ is given by the same generators subject to the same relations, as well as the additional relations
\[
\begin{pmatrix}
0 & T & Y_2 & Y_3 \\
T & 0 & -Y_4 & Y_1 \\
Y_2 & Y_4 & 0 & D - \hbar(a+b)T \\
Y_3 & -Y_1 & -D - \hbar(a+b)T & 0
\end{pmatrix}
\begin{pmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}. \tag{33}
\]

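As explained above, we have projections
\[
Q_\hbar = G_R(2; Z) \xleftarrow{\ p\ } \mathrm{Fl}_R(1, 2; Z) \xrightarrow{\ q\ } \mathbb{P}_R(Z) = \mathbb{P}^3_\hbar
\]
and a closed embedding $\mathrm{Fl}_R(1, 2; Z) \subset G_R(2; Z) \times_R \mathbb{P}_R(Z) = Q_\hbar \times_R \mathbb{P}^3_\hbar$.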



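8.7. Tautological bundles. Let $\mathcal{V}$ (resp. $\mathcal{V}^*$, $\Sigma_R^\lambda \mathcal{V}$, $\Sigma_R^\lambda \mathcal{V}^*$) denote the coherent sheaf on $G_R(k; V)$ corresponding to the free right $G_R(k; V)$-module $V \otimes G_R(k; V)$ (resp. $V^* \otimes G_R(k; V)$, $\Sigma_R^\lambda V \otimes G_R(k; V)$, $\Sigma_R^\lambda V^* \otimes G_R(k; V)$). Since the space of global sections of the sheaf $O(1)$ on the Grassmannian $G_R(k; V)$ is $\Lambda_R^k V^*$, the maps $\Lambda_R^{k-1} V^* \to V \otimes \Lambda_R^k V^*$ and $\Lambda_R^{k+1} V^* \to V^* \otimes \Lambda_R^k V^*$ induce morphisms of sheaves
\[
\Lambda_R^{k-1} \mathcal{V}^*(-1) \xrightarrow{\ \varphi\ } \mathcal{V} \qquad \text{and} \qquad \Lambda_R^{k+1} \mathcal{V}^*(-1) \xrightarrow{\ \psi\ } \mathcal{V}^* .
\]
We put $S = \mathrm{Im}\, \varphi$, $\mathcal{V}/S = \mathrm{Coker}\, \varphi$, $S^\perp = \mathrm{Im}\, \psi$, $\mathcal{V}^*/S^\perp = \mathrm{Coker}\, \psi$.

Remark. For $k = 1$ we have $S = O(-1)$, $\mathcal{V}^*/S^\perp = O(1)$.

One can show that these sheaves are locally free. We refer to them as tautological bundles. The free $G_R(k; V)$-modules corresponding to the sheaves $\Sigma_R^\lambda \mathcal{V}$, $\Sigma_R^\lambda \mathcal{V}^*$ are the $O_R$-images of free modules over the corresponding algebra in the category $T_R$. Furthermore, the morphisms $\varphi$ and $\psi$ are $O_R$-images. This implies that the $G_R(k; V)$-modules corresponding to the tautological bundles are $O_R$-images as well. Therefore they all have a natural structure of $G_R(k; V)$-bimodules. This allows one to define R-symmetric powers $S_R^k(-)$ (resp. R-exterior powers $\Lambda_R^k(-)$) of the tautological bundles as the corresponding $O_R$-images. One can check that we have canonical isomorphisms of bimodules
\[
\mathcal{V}^*/S^\perp \cong S^\vee, \qquad S^\perp \cong (\mathcal{V}/S)^\vee .
\]

Example 8.5. Let $R$ be the Yang–Baxter operator (30) and $k = 2$. Let $\check{z}_1, \check{z}_2, \check{z}_3, \check{z}_4$ be the dual basis of $Z$. Then the twisted maps $\varphi(1) : Z^* \otimes O_{G_R} \to Z \otimes O_{G_R}(1)$, $\psi(1) : Z \otimes O_{G_R} \cong \Lambda_R^3 Z^* \otimes O_{G_R} \to Z^* \otimes O_{G_R}(1)$ are given by
\[
\varphi(1) : \begin{pmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \end{pmatrix} \mapsto
\begin{pmatrix}
0 & D + \hbar(a-b)T & -Y_1 & -Y_4 \\
D - \hbar(a-b)T & 0 & -Y_3 & Y_2 \\
-Y_1 & Y_3 & 0 & -T \\
-Y_4 & -Y_2 & T & 0
\end{pmatrix}
\begin{pmatrix} \check{z}_1 \\ \check{z}_2 \\ \check{z}_3 \\ \check{z}_4 \end{pmatrix},
\]
\[
\psi(1) : \begin{pmatrix} \check{z}_1 \\ \check{z}_2 \\ \check{z}_3 \\ \check{z}_4 \end{pmatrix} \mapsto
\begin{pmatrix}
0 & T & Y_2 & Y_3 \\
T & 0 & -Y_4 & Y_1 \\
Y_2 & Y_4 & 0 & D - \hbar(a+b)T \\
Y_3 & -Y_1 & -D - \hbar(a+b)T & 0
\end{pmatrix}
\begin{pmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \end{pmatrix}.
\]
Note that $\psi(1)\varphi = 0$ and $\varphi(1)\psi = 0$. Hence we have isomorphisms
\[
S^\perp(1) \cong \mathcal{V}/S, \qquad S(1) \cong S^\vee .
\]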



Note also that on the open subset $T \ne 0$ the elements $(z_3, z_4)$ give a trivialization of the tautological bundle $S^\vee$. More precisely, the restrictions of the sections $z_1$, $z_2$ of $S^\vee$ can be expressed as
\[
z_1 = y_4 z_3 - y_1 z_4, \qquad z_2 = -y_2 z_3 - y_3 z_4, \tag{34}
\]
where $y_i = T^{-1} Y_i$. Similarly, the elements $(\check{z}_1, \check{z}_2)$ give a trivialization of $\mathcal{V}/S$ on $T \ne 0$. Thus the restrictions of all tautological bundles to the open subset $T \ne 0$ correspond to the free rank two bimodule over the Weyl algebra $A(\mathbb{C}^4_\hbar)$.

8.8. Pull-back and push-forward. Recall that we have canonical projections
\[
p_i : \mathrm{Fl}_R(k_1, k_2; V) \to G_R(k_i; V) \qquad (i = 1, 2).
\]

Given a right graded $G_R(k_i; V)$-module $E$ we consider the right bigraded $FL_R(k_1, k_2; V)$-module $E \otimes_{G_R(k_i; V)} FL_R(k_1, k_2; V)$. The diagonal subspace of this module is a graded module over the diagonal subalgebra of $FL_R(k_1, k_2; V)$. This gives the pull-back functor $p_i^* : \mathrm{coh}(G_R(k_i; V)) \to \mathrm{coh}(\mathrm{Fl}_R(k_1, k_2; V))$. The pull-back functor is exact and takes an $O_R$-image to an $O_R$-image. In particular, the pull-backs of the tautological bundles have a canonical bimodule structure. The pull-back functor $p_i^*$ admits a right adjoint functor $p_{i*} : \mathrm{coh}(\mathrm{Fl}_R(k_1, k_2; V)) \to \mathrm{coh}(G_R(k_i; V))$, called the push-forward functor. It also takes an $O_R$-image to an $O_R$-image.

The line bundles $p_1^* O(i)$ and $p_2^* O(j)$ on the flag variety $\mathrm{Fl}_R(k_1, k_2; V)$ are $O_R$-images, hence they have a canonical bimodule structure. Therefore, we have a well-defined tensor product $O(i, j) = p_1^* O(i) \otimes p_2^* O(j)$. The line bundle $O(i, j)$ is also an $O_R$-image and has a canonical bimodule structure. The $n$th graded component of the corresponding module over the diagonal subalgebra of $FL_R(k_1, k_2; V)$ is the $((n + i)e_1 + (n + j)e_2)$-graded component of the algebra $FL_R(k_1, k_2; V)$. One can check that the push-forward of the line bundle $O(j_1, j_2)$ with respect to $p_2$ is given by the formula
\[
p_{2*} O(j_1, j_2) = S_R^{j_1}(S^\vee)(j_2).
\]

8.9. $\mathrm{Fl}_R(1, 2; Z)$ as the projectivization of the tautological bundle. The R-symmetric powers of the tautological bundle form a sheaf of graded algebras on the Grassmannian $G_R(k; V)$,
\[
S_R^{\bullet}(S^\vee) = T(S^\vee) / \langle \Lambda_R^2 S^\vee \rangle .
\]
The corresponding $G_R(k; V)$-module
\[
\bigoplus_{i,j=0}^{\infty} \Gamma\bigl(G_R(k; V),\, S_R^{j}(S^\vee)(i)\bigr)
\]



is a bigraded module with a structure of a bigraded algebra. One can check that this bigraded algebra is isomorphic to the bigraded algebra $FL_R(1, k; V)$. Thus we can regard the flag variety $\mathrm{Fl}_R(1, k; V)$ as the projectivization of the tautological bundle $S$ on the Grassmannian $G_R(k; V)$. In particular, $\mathrm{Fl}_R(1, 2; Z)$ is the projectivization of the tautological bundle $S$ on the Grassmannian $G_R(2; Z)$.

8.10. Noncommutative twistor transform. If $E$ is a coherent sheaf on the noncommutative projective space $\mathbb{P}_R(Z) = \mathbb{P}^3_\hbar$, we define its twistor transform as the sheaf $p_* q^* E$ on $G_R(2; Z) = Q_\hbar$, where $q$ is the projection $\mathrm{Fl}_R(1, 2; Z) \to \mathbb{P}_R(Z) = \mathbb{P}^3_\hbar$ and $p$ is the projection $\mathrm{Fl}_R(1, 2; Z) \to G_R(2; Z) = Q_\hbar$. Similarly, we can define the twistor transform of a complex of sheaves on $\mathbb{P}^3_\hbar$. Actually, it is more natural to consider the derived twistor transform, i.e. the derived functor of the ordinary twistor transform.

Consider a complex $C^\bullet$ of the form
\[
0 \longrightarrow H \otimes O(-1) \xrightarrow{\ M\ } K \otimes O \xrightarrow{\ N\ } L \otimes O(1) \longrightarrow 0
\]
on the projective space $\mathbb{P}^3_\hbar$. One can check that under the twistor transform one has
\[
O_{\mathbb{P}^3_\hbar}(-1) \mapsto 0, \qquad O_{\mathbb{P}^3_\hbar} \mapsto O_{G_R}, \qquad O_{\mathbb{P}^3_\hbar}(1) \mapsto S^\vee .
\]
In fact, for these sheaves the derived twistor transform coincides with the ordinary one. Thus the (derived) twistor transform takes the complex $C^\bullet$ to the complex
\[
0 \longrightarrow K \otimes O \xrightarrow{\ N\ } L \otimes S^\vee \longrightarrow 0 .
\]
Let $E$ denote the middle cohomology of the complex $C^\bullet$. It follows that the twistor transform takes $E$ to the kernel of the map $N : K \otimes O \to L \otimes S^\vee$. One can describe $N$ without reference to the twistor transform. The morphism $N$ is the same thing as a vector space morphism
\[
N_1 z_1 + N_2 z_2 + N_3 z_3 + N_4 z_4 : K \longrightarrow Z^* \otimes L . \tag{35}
\]
Here the maps $N_i$ are given in terms of the deformed ADHM data according to (24) and (25). The map $N$ is a composition of two maps $K \otimes O_{G_R} \to L \otimes Z^* \otimes O_{G_R} \to L \otimes S^\vee$, where the first map is given by (35), while the second map comes from the canonical projection $Z^* \otimes O_{G_R} \to S^\vee$. (Recall that $S^\vee$ is the cokernel of the map $\psi : Z \otimes O_{G_R}(-1) \to Z^* \otimes O_{G_R}$.)

Recall that on the open subset $\{T \ne 0\}$ the bundle $S^\vee$ is trivial, and the elements $(z_3, z_4)$ give its trivialization (see (34)). Hence the restriction of the twistor transform of the complex (22) to this open subset is isomorphic to the complex
\[
0 \longrightarrow K \otimes O \xrightarrow{\ \begin{pmatrix} N_3 + y_4 N_1 - y_2 N_2 \\ N_4 - y_1 N_1 - y_3 N_2 \end{pmatrix}\ } (L \oplus L) \otimes O \longrightarrow 0 . \tag{36}
\]


Assume now that the complex (22) is given by the deformed ADHM data $(B_1, B_2, I, J)$ (see Sect. 7). Applying the formulas (24) and (25), we see that with respect to the chosen bases of $L$ and $K$ the map $N$ is given by the matrix
\[
\begin{pmatrix}
-B_2 + y_2 & B_1 + y_4 & I \\
-B_1^\dagger + y_3 & -B_2^\dagger - y_1 & -J^\dagger
\end{pmatrix}.
\]
It is evident that this operator is related to the operator $D$ in (4) by a change of basis. In particular, the Nekrasov–Schwarz coordinates $\xi_1, \xi_2, \bar\xi_1, \bar\xi_2$ (see Sect. 2) can be expressed through $x_i = T^{-1} X_i$ as follows:
\[
\begin{aligned}
\xi_1 &= -y_4 = x_4 - \sqrt{-1}\, x_3, & \xi_2 &= y_2 = -x_2 + \sqrt{-1}\, x_1, \\
\bar\xi_1 &= y_3 = x_4 + \sqrt{-1}\, x_3, & \bar\xi_2 &= -y_1 = -x_2 - \sqrt{-1}\, x_1 .
\end{aligned}
\]
Thus the twistor transform of the complex corresponding to the deformed ADHM data coincides with the instanton bundle corresponding to these data (see Sect. 2). This gives a geometric interpretation of the deformed ADHM construction of the noncommutative instanton bundle.

8.11. Differential forms. Let an algebra $A$ be the $O_R$-image of a commutative algebra in the category $T_R$. This means that there exists an operator $R : A^{\otimes 2} \to A^{\otimes 2}$ compatible with the multiplication law of $A$. Above we have defined the R-tensor product $A \otimes_R A$, which is also an algebra with a Yang–Baxter operator. Explicitly, the multiplication law of $A \otimes_R A$ is defined as follows. Let $m$ be the multiplication map from $A \otimes A$ to $A$. Then the multiplication map from $(A \otimes_R A) \otimes (A \otimes_R A)$ to $A \otimes_R A$ is given by $m_{12} m_{34} R_{23}$ in the obvious notation. It is easy to see that the multiplication map $m$ is a homomorphism of algebras. Let $I$ denote the kernel of the map $m : A \otimes_R A \to A$. Then $I$ is a two-sided ideal of the algebra $A \otimes_R A$.

Definition 8.6. We define the bimodule of R-differential forms of the algebra $A$ by $\Omega_A^1 = I/I^2$.

For a motivation of this definition, see [12]. Furthermore, suppose $A$ is a graded algebra. Consider the total grading of the bigraded algebra $A \otimes_R A$. The two-sided ideal $I$ inherits the grading. Therefore the bimodule $\Omega_A^1$ is graded too.

In the graded case, besides $\Omega_A^1$, we can define the module of projective differential forms of $A$ in the following way. Let $\chi : A \otimes_R A \to A \otimes_R A$ be the linear operator which acts on the $(p, q)$th graded component of the algebra $A \otimes_R A$ as a scalar multiplication by $q$. Since $\chi$ is a derivation, we have $\chi(I^2) \subset I$. Therefore $m(\chi(I^2)) = 0$. Furthermore, the induced map $\Omega_A^1 = I/I^2 \xrightarrow{\ m \cdot \chi\ } A$ is a morphism of graded $A$-bimodules.

Definition 8.7. We define the $A$-bimodule of projective differential forms of the algebra $A$ by
\[
\widetilde{\Omega}_A^1 = \mathrm{Ker}\bigl(\Omega_A^1 \xrightarrow{\ m \cdot \chi\ } A\bigr).
\]



First, let us apply this construction of differential forms to the noncommutative affine variety $\mathbb{C}^4_\hbar$ (Subsect. 3.4). The algebra $A(\mathbb{C}^4_\hbar)$ of polynomial functions on $\mathbb{C}^4_\hbar$ is the Weyl algebra:
\[
A(\mathbb{C}^4_\hbar) = T(x_1, x_2, x_3, x_4) \big/ \bigl\langle [x_i, x_j] = \hbar\, \theta_{ij} \bigr\rangle_{1 \le i, j \le 4} .
\]
Let us define the Yang–Baxter operator on the tensor square of the subspace of $A(\mathbb{C}^4_\hbar)$ spanned by $1, x_1, x_2, x_3, x_4$ by the formula
\[
1 \otimes x_i \mapsto x_i \otimes 1, \qquad x_i \otimes 1 \mapsto 1 \otimes x_i, \qquad x_i \otimes x_j \mapsto x_j \otimes x_i + \hbar\, \theta_{ij} \cdot 1 \otimes 1 \qquad \text{for all } 1 \le i, j \le 4 .
\]
This Yang–Baxter operator has a unique extension to the whole of $A(\mathbb{C}^4_\hbar)$ compatible with the multiplication law.

There is another way to look at this Yang–Baxter operator. Recall that $\mathbb{C}^4_\hbar$ is the open subset $T \ne 0$ in the noncommutative Grassmannian $G_R(2; Z)$, where $R$ is defined by (30). The Yang–Baxter operator on the quadratic algebra $G_R(2; Z)$ has the property that $R(T \otimes a) = a \otimes T$ for any $a \in G_R(2; Z)$. Hence it descends to a Yang–Baxter operator on $A(\mathbb{C}^4_\hbar)$. It is easy to see that it acts on the tensor square of the subspace spanned by $1, x_1, x_2, x_3, x_4$ in the above manner.

We define the sheaf of differential forms $\Omega^1_{\mathbb{C}^4_\hbar}$ as the bimodule of R-differential forms of the algebra $A(\mathbb{C}^4_\hbar)$. It is easy to check that $\Omega^1_{\mathbb{C}^4_\hbar}$ is isomorphic to the bimodule $A(\mathbb{C}^4_\hbar)^{\oplus 4}$. Furthermore, we can take any R-exterior power of $\Omega^1_{\mathbb{C}^4_\hbar}$ and thereby define $\Omega^p_{\mathbb{C}^4_\hbar}$. This enables us to define a connection and its curvature on any bundle on the noncommutative affine space. The relevant formulas were written above (see Subsect. 1.5).

Second, we define the sheaf of differential forms $\Omega^1_{G_R}$ on the noncommutative Grassmannian $G_R(k; V)$ as the sheaf corresponding to the module of projective differential forms $\widetilde{\Omega}^1_{G_R}$. It can be shown that, as in the commutative case, we have an isomorphism of coherent sheaves on the noncommutative Grassmannian $G_R(k; V)$:
\[
\Omega^1_{G_R} \cong S \otimes S^\perp .
\]
It follows that for $k = 1$ we have an exact sequence
\[
0 \longrightarrow \Omega^1_{\mathbb{P}_R(V)} \longrightarrow \mathcal{V}^*(-1) \longrightarrow O \longrightarrow 0 .
\]
Thus this definition of the sheaf of differential forms $\Omega^1_{\mathbb{P}_R(V)}$ is consistent with Definition 4.8.

Similarly, one can define the sheaf of differential forms $\Omega^1_{\mathrm{Fl}_R}$ on the noncommutative flag variety $\mathrm{Fl}_R(k_1, \dots, k_r; V)$. One can check that the projection $p_i : \mathrm{Fl}_R(k_1, \dots, k_i, \dots, k_r; V) \to G_R(k_i; V)$ induces a morphism of bundles $p_i^* : \Omega^1_{G_R} \to \Omega^1_{\mathrm{Fl}_R}$.

In the commutative case the ADHM construction of the instanton connection can be interpreted in terms of the twistor transform (see [4, 24] for details). We believe that this can be done in the noncommutative case as well. It appears that the most convenient definition of a connection on a bundle on a noncommutative projective variety is in terms of jet bundles (see, for example, [24]).



9. Instantons on a q-Deformed $\mathbb{R}^4$

In this paper we have focused on a particular noncommutative deformation of $\mathbb{R}^4$ related to the Wigner–Moyal product (3). This is the only deformation of $\mathbb{R}^4$ which is known to arise in string theory. But most of our constructions work for more general deformations which do not have a clear physical interpretation. For example, let us replace $\mathbb{C}^4_\hbar$ with a noncommutative affine variety whose coordinate ring is generated by $z_1, z_2, z_3, z_4$ subject to the following quadratic relations:
\[
q z_1 z_2 - q^{-1} z_2 z_1 = \hbar, \qquad q z_3 z_4 - q^{-1} z_4 z_3 = \hbar, \qquad [z_1, z_3] = [z_1, z_4] = [z_2, z_3] = [z_2, z_4] = 0 .
\]
We will denote this noncommutative affine variety by $\mathbb{C}^4_{q,\hbar}$, and its coordinate algebra by $A_{q,\hbar}$. If $\hbar$ and $q$ are real, we can define a $*$-operation on $A_{q,\hbar}$ by $z_1^* = z_2$, $z_3^* = z_4$. The corresponding real noncommutative affine variety will be denoted by $\mathbb{R}^4_{q,\hbar}$.

Consider now the following deformation of the ADHM equations:
\[
\begin{aligned}
{}[B_1, B_2]_{q^{-1}} + I J &= 0, \\
[B_1, B_1^\dagger]_{q^{-1}} + [B_2, B_2^\dagger]_{q} + I I^\dagger - J^\dagger J &= -2\hbar \cdot 1_{k \times k} .
\end{aligned} \tag{37}
\]
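The deformed equations (37) are straightforward to evaluate for concrete matrices. The following is a minimal sketch, assuming the q-commutator $[A, B]_q = qAB - q^{-1}BA$ used in (37), with $B_1, B_2$ given as $k \times k$ arrays, $I$ as a $k \times r$ array and $J$ as an $r \times k$ array. The function names `q_commutator` and `adhm_residuals` are ours; the code only evaluates the left-hand sides and says nothing about existence of solutions.

```python
import numpy as np

def q_commutator(A, B, q):
    """[A, B]_q = q*A*B - (1/q)*B*A; reduces to the usual commutator at q = 1."""
    return q * (A @ B) - (B @ A) / q

def adhm_residuals(B1, B2, I, J, q, hbar):
    """Return (complex equation, real equation) of (37), each moved to the form '... = 0'."""
    k = B1.shape[0]
    dag = lambda M: M.conj().T            # Hermitian conjugate
    complex_eq = q_commutator(B1, B2, 1.0 / q) + I @ J
    real_eq = (q_commutator(B1, dag(B1), 1.0 / q)
               + q_commutator(B2, dag(B2), q)
               + I @ dag(I) - dag(J) @ J
               + 2.0 * hbar * np.eye(k))
    return complex_eq, real_eq
```

As a sanity check under these assumptions, at $q = 1$ and $k = r = 1$ the choice $I = 0$, $J = \sqrt{2\hbar}$ with arbitrary $B_1, B_2$ makes both residuals vanish, which is the familiar undeformed-commutator case.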

Here $B_1, B_2 \in \mathrm{Hom}(V, V)$, $I \in \mathrm{Hom}(W, V)$, $J \in \mathrm{Hom}(V, W)$, as usual, and by $[A, B]_q$ we mean a q-commutator: $[A, B]_q = qAB - q^{-1}BA$. We claim that solutions of these “q-deformed” ADHM equations can be used as an input for the construction of instantons on $\mathbb{R}^4_{q,\hbar}$ of rank $r = \dim W$ and instanton charge $k = \dim V$. Let us sketch this construction. Define an operator $D \in \mathrm{Hom}_{A_{q,\hbar}}\bigl((V \oplus V \oplus W) \otimes_{\mathbb{C}} A_{q,\hbar},\ (V \oplus V) \otimes_{\mathbb{C}} A_{q,\hbar}\bigr)$ by the formula
\[
D = \begin{pmatrix}
B_1 - q z_1 & -q B_2 + q z_2 & I \\
B_2^\dagger - \bar z_2 & q B_1^\dagger - \bar z_1 & J^\dagger
\end{pmatrix}.
\]

Now we can go through the same manipulations as in Sect. 2: assume that $D$ is surjective and that its kernel is a free module, and define a connection 1-form by the expression (5). The same formal computation as in Sect. 2 shows that the curvature of this connection is anti-self-dual. In order to ensure that $D$ is surjective, it is probably necessary to replace the algebra $A_{q,\hbar}$ with some bigger algebra containing $A_{q,\hbar}$ as a subalgebra. This bigger algebra should play the role of the algebra of smooth functions on our noncommutative $\mathbb{R}^4$. For $\hbar = 0$, $q \ne 1$ there is even a natural candidate for this bigger algebra: it should consist of $C^\infty$ functions on $\mathbb{C}^2$ with some suitable growth conditions at infinity and the product defined by (f g)(z1 , z2 , z¯ 1 , z¯ 2 ) = exp − ln(q) z1 z¯ 1

∂2 ∂2 ∂2 ∂2 + z2 z¯ 2 − z1 z¯ 1 − z2 z¯ 2 ∂z1 ∂ z¯ 1 ∂z2 ∂ z¯ 2 ∂z ∂ z¯ 1 ∂z2 ∂ z¯ 2 1 f (z1 , z2 , z¯ 1 , z¯ 2 ) g z1 , z2 , z¯ 1 , z¯ 2 |z1 =z1 ,z2 =z2 . (38)



Assuming that this formal expression exists, it is easy to check that the product is associative, that polynomial functions form a subalgebra with respect to it, and that this subalgebra is isomorphic to $A_{q,\hbar}$.

It is natural to conjecture that all instantons on $\mathbb{R}^4_{q,\hbar}$ arise from this deformed ADHM construction. Note that in this case the deformed ADHM equations are not hyperkähler moment map equations, and one cannot use the hyperkähler quotient construction to infer the existence of a hyperkähler metric on the quotient space.

The algebro-geometric part of the story can also be generalized. We did not go through this carefully, but nevertheless would like to indicate one result. It appears that the q-deformed ADHM data can be interpreted in terms of sheaves on a more general noncommutative $\mathbb{P}^2$ than the one defined in Sect. 3. The graded algebra corresponding to this noncommutative $\mathbb{P}^2$ is generated by degree one elements $z_1, z_2, z_3$ with the quadratic relations
\[
q z_1 z_2 - q^{-1} z_2 z_1 = 2\hbar z_3^2, \qquad [z_i, z_3] = 0, \quad i = 1, 2 .
\]
This algebra is one of the Artin–Schelter regular algebras of dimension three [1, 2]. It is characterized by the fact that the corresponding noncommutative variety $\mathbb{P}^2_{q,\hbar}$ contains as subvarieties a commutative quadric and a noncommutative line. The latter is given by the equation $z_3 = 0$. In the limit $q \to 1$ the plane $\mathbb{P}^2_{q,\hbar}$ reduces to $\mathbb{P}^2_\hbar$, and the union of the quadric and the line turns into the triple commutative line $l$ which played such a prominent role in this paper. If $q \ne 1$, then in the limit $\hbar \to 0$ the quadric turns into a union of two intersecting commutative lines $z_1 = 0$ and $z_2 = 0$. For any $q$ the line $z_3 = 0$ should be regarded as “the line at infinity” (which is noncommutative for $q \ne 1$). It is plausible that the q-deformed ADHM data are in one-to-one correspondence with bundles, or maybe torsion-free sheaves, on $\mathbb{P}^2_{q,\hbar}$ with a trivialization on this line.

10. Appendix

In this section we define a $\star$-product on the space of complex-valued $C^\infty$ functions on $\mathbb{R}^n$ whose derivatives of arbitrary order are polynomially bounded. The $\star$-product endows this space with a structure of a $\mathbb{C}$-algebra and reduces to the Wigner–Moyal product (3) on polynomial functions.

Definition 10.1. Let $O$ be a topological vector space which is a subspace of the space of $C^\infty$ functions on $\mathbb{R}^n$, and let $O'$ be the space of distributions on $O$. Let $f$ be a $\mathbb{C}$-valued function on $\mathbb{R}^n$ which simultaneously is a distribution in $O'$. $f$ is called a multiplier if for any $\varphi \in O$, $f\varphi \in O$. The set of multipliers of $O$ is obviously a subspace of $O'$.

Definition 10.2. Let $f \in O'$. $f$ is called a convolute if for any $\varphi \in O$ we have $(f * \varphi)(x) \equiv (f(\xi), \varphi(x + \xi)) \in O$, and this expression depends continuously on $\varphi$. The above expression is called the convolution of $f$ with $\varphi$. The set of convolutes is obviously a subspace of $O'$.

We will denote the Fourier duals of $O$ and $O'$ by $\widetilde{O}$ and $\widetilde{O}'$, respectively. If $f \in O$, then $\widetilde{f} \in \widetilde{O}$ will be the Fourier transform of $f$, etc.


Definition 10.3. The Schwartz space $S(\mathbb{R}^n)$ is the space of $\mathbb{C}$-valued $C^\infty$ functions on $\mathbb{R}^n$ such that $\varphi \in S$ if and only if all the norms
\[
\sup_x \bigl| x^k D^m \varphi(x) \bigr|, \qquad k = 0, 1, 2, \dots, \tag{39}
\]
are finite. Here $m = (m_1, \dots, m_n)$ is an arbitrary polyindex. Convergence on $S$ is defined using the family of norms (39). Then $S$ becomes a complete countably normed space [17].

Proposition 10.4. A function $f \in S'$ is a multiplier if and only if it is a $C^\infty$ function on $\mathbb{R}^n$ all of whose derivatives are polynomially bounded.

Proof. Obvious. □

The following theorem proved in [37] describes the subspace of convolutes of $S'$:

Theorem 10.5. A distribution $f \in S'$ is a convolute if and only if it has the form f = D α fα (x), |α| 0 u+ (τ, m, η) = j 0, elsewhere.

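Hence
\[
\|u_1 \star u_2\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} \lesssim \|u_1^+ \star u_2^+\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} + \|u_1^+ \star u_2^-\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} + \|u_1^- \star u_2^+\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} + \|u_1^- \star u_2^-\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})}
\]
and a use of (18) yields
\[
\|u_1 \star u_2\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} \lesssim \|u_1^+ \star u_2^+\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} + \|u_1^+ \star j(u_2^-)\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} + \|j(u_1^-) \star u_2^+\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} + \|j(u_1^-) \star j(u_2^-)\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} .
\]
Since $\tau - m^5 - \frac{\eta^2}{m}$ is an odd function, one has $0 < m \sim M_j$ and $\bigl|\tau - m^5 - \frac{\eta^2}{m}\bigr| \sim K_j$, $j = 1, 2$, on the support of $u_j^+$ and $j(u_j^-)$. Hence we can suppose $m > 0$ on the support of $u_j$, $j = 1, 2$, when proving Lemma 4. We need to bound the expression
\[
\sum_{m \ne 0} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \Bigl| \sum_{m_1 > 0,\ m - m_1 > 0} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} u_1(\tau_1, m_1, \eta_1)\, u_2(\tau - \tau_1, m - m_1, \eta - \eta_1)\, d\tau_1\, d\eta_1 \Bigr|^2 d\tau\, d\eta .
\]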

Periodic KP-I Type Equations


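The Cauchy–Schwarz inequality in $(\tau_1, m_1, \eta_1)$, the support properties of $u_1$ and $u_2$ and the Cauchy–Schwarz inequality in $(\tau, m, \eta)$ yield
\[
\|u_1 \star u_2\|^2_{L^2(\mathbb{R} \times \mathbb{Z}_0 \times \mathbb{R})} \lesssim \sup_{(\tau,\, m \ne 0,\, \eta)} |A_{\tau m \eta}| \; \|u_1\|^2_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} \|u_2\|^2_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})}, \tag{19}
\]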

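where $A_{\tau m \eta} \subset \mathbb{R} \times \mathbb{Z} \times \mathbb{R}$ is the set
\[
A_{\tau m \eta} = \Bigl\{ (\tau_1, m_1, \eta_1) : 0 < m_1 \sim M_1,\ 0 < (m - m_1) \sim M_2,\ \Bigl|\tau_1 - m_1^5 - \frac{\eta_1^2}{m_1}\Bigr| \sim K_1,\ \Bigl|\tau - \tau_1 - (m - m_1)^5 - \frac{(\eta - \eta_1)^2}{m - m_1}\Bigr| \sim K_2 \Bigr\}.
\]
Further we obtain via the triangle inequality $|A_{\tau m \eta}| \lesssim (K_1 \wedge K_2)\, |B_{\tau m \eta}|$, where $B_{\tau m \eta} \subset \mathbb{Z} \times \mathbb{R}$ is the set
\[
B_{\tau m \eta} = \Bigl\{ (m_1, \eta_1) \in \mathbb{Z} \times \mathbb{R} : 0 < m_1 \sim M_1,\ 0 < (m - m_1) \sim M_2,\ \Bigl|\tau - m_1^5 - (m - m_1)^5 - \frac{\eta_1^2}{m_1} - \frac{(\eta - \eta_1)^2}{m - m_1}\Bigr| \lesssim (K_1 \vee K_2) \Bigr\}.
\]
It remains to bound $|B_{\tau m \eta}|$. We shall again use Lemma 1. The projection of $B_{\tau m \eta}$ on the $m_1$ axis is bounded by $c(M_1 \wedge M_2)$. Fix now $m_1$. We need to estimate the Lebesgue measure of $\eta_1$ such that the expression
\[
\tau - m_1^5 - (m - m_1)^5 - \frac{\eta_1^2}{m_1} - \frac{(\eta - \eta_1)^2}{m - m_1} \tag{20}
\]
ranges in an interval of size $c(K_1 \vee K_2)$. For that purpose we need the following lemma, the proof of which is straightforward.

Lemma 5. Let $a \ne 0$, $b$, $c$ be real numbers and $I$ be an interval on the real line. Then
\[
\mathrm{mes}\,\{x : ax^2 + bx + c \in I\} \lesssim \frac{|I|^{\frac12}}{|a|^{\frac12}} .
\]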
Write the expression (20) as −

2m 2η η2 η1 + τ − m51 − (m − m1 )5 − . η12 + m1 (m − m1 ) m − m1 m − m1

Since m1 and m − m1 are both positive we have that 1 m . M1 ∧ M 2 m1 (m − m1 ) Therefore Lemma 5 implies that the Lebesgue measure of η1 such that the expression (20) 1 1 ranges in an interval of size c(K1 ∨ K2 ) is bounded by c(M1 ∧ M2 ) 2 (K1 ∨ K2 ) 2 . Hence

462

J.-C. Saut, N. Tzvetkov 3

1

using Lemma 1 we obtain |Bτ mη | (M1 ∧ M2 ) 2 (K1 ∨ K2 ) 2 and moreover 3

1

|Aτ mη | (K1 ∧ K2 )(M1 ∧ M2 ) 2 (K1 ∨ K2 ) 2 .

(21)

Substituting (21) in (19) completes the proof of Lemma 4. Consider the dyadic levels

η2 K1 K2 K 5 DM1 M2 M = (τ, m, η, τ1 , m1 , η1 ) : τ − m − ≈ K, |m| ≈ M, m

η2 τ1 − m51 − 1 ≈ K1 , |m1 | ≈ M1 , m1

(η − η1 )2 5 τ − τ1 − (m − m1 ) − ≈ K2 , m − m1

|m − m1 | ≈ M2 , (m, m1 , η, η1 ) ∈ "2 ,

where K1 , K2 , K, M1 , M2 , M are dyadic integers. Denote by J2K1 K2 KM1 M2 M the conK1 K2 K tribution of DM to (6). Then 1 M2 M

J2

J2K1 K2 KM1 M2 M .

K1 ,K2 ,K,M1 ,M2 ,M-dyadic

Define fK1 M1 (τ, m, η) and gK2 M2 (τ, m, η) as in (8) and (9) respectively. In the estimate of J2 we shall perform an additional (comparing to the estimate for J1 ) localization of 2 h near the level set of τ − m5 − ηm . So we set 2 h(τ, m, η), when τ − m5 − ηm ≈ K, |m| ≈ M hKM (τ, m, η) = 0, elsewhere. Then clearly J2K1 K2 KM1 M2 M is bounded by M · fK M (τ1 , m1 , η1 )gK M (τ − τ1 , m − m1 , η − η1 )hKM (τ, m, η) 1 1 2 2 K K K 1 2

*M1 M2 M

1

+

1

K 2 − K12 K22 1

"+

+

,

K1 K2 K ⊂ R4 is defined as where *M 1 M2 M

K1 K2 K *M = (τ, τ1 , η, η1 ) ∈ R4 such that there exists (m, m1 , η, η1 ) ∈ "2 1 M2 M K1 K2 K . with (τ, m, η, τ1 , m1 , η1 ) ∈ DM 1 M2 M Using Lemma 3 one obtains that max {K, K1 , K2 } M1 M2 M 3 . We are in a position to state the following lemma.

(22)
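A lower bound of the type (22), read here as $\max\{K, K_1, K_2\} \gtrsim M_1 M_2 M^3$, typically rests on an algebraic identity for the three modulation functions. The following is a minimal sketch, assuming $\sigma(\tau, m, \eta) = \tau - m^5 - \eta^2/m$ as in the dyadic levels above. The factorised right-hand side is our own computation and is not quoted from the text, and the way Lemma 3 uses it on the region under consideration is not reproduced here.

```python
from fractions import Fraction

def sigma(tau, m, eta):
    """Modulation tau - m^5 - eta^2/m, in exact rational arithmetic (m != 0)."""
    return tau - m ** 5 - Fraction(eta * eta) / m

def resonance(tau, tau1, m, m1, eta, eta1):
    """sigma(tau, m, eta) - sigma(tau1, m1, eta1) - sigma(tau - tau1, m - m1, eta - eta1)."""
    m2, eta2 = m - m1, eta - eta1
    return sigma(tau, m, eta) - sigma(tau1, m1, eta1) - sigma(tau - tau1, m2, eta2)

def resonance_closed_form(m, m1, eta, eta1):
    """Same quantity, factorised: it does not depend on tau or tau1."""
    m2, eta2 = m - m1, eta - eta1
    dispersive = -5 * m1 * m2 * m * (m1 * m1 + m1 * m2 + m2 * m2)
    transverse = Fraction((m2 * eta1 - m1 * eta2) ** 2) / (m1 * m2 * m)
    return dispersive + transverse
```

Since the largest of the three modulations is at least one third of the absolute value of this quantity, a bound like (22) follows whenever the two terms of the closed form do not nearly cancel; for $m_1, m_2 > 0$ the first term has size comparable to $m_1 m_2 m \max(m_1, m_2)^2$.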



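Lemma 6.
\[
J_2^{K_1 K_2 K M_1 M_2 M} \lesssim \frac{1}{[\max\{K, K_1, K_2\}]^{0+}}\, \|f_{K_1 M_1}\|_{L^2} \|g_{K_2 M_2}\|_{L^2} \|h_{KM}\|_{L^2} .
\]

Proof. Via a symmetry argument we can assume that $K_1 \ge K_2$. We shall consider separately the cases $K_1 \ge K$ and $K_1 \le K$.

Case $K_1 \ge K$. Then
\[
J_2^{K_1 K_2 K M_1 M_2 M} \lesssim \frac{M}{K^{\frac12 -} K_1^{\frac12 +} K_2^{\frac12 +}}\, \bigl\langle h_{KM} \star j(g_{K_2 M_2}),\, f_{K_1 M_1} \bigr\rangle_{L^2},
\]
where $\langle \cdot, \cdot \rangle_{L^2}$ connotes the $L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})$ scalar product. Using the Cauchy–Schwarz inequality and Lemma 4 we obtain
\[
J_2^{K_1 K_2 K M_1 M_2 M} \lesssim \frac{M (M \wedge M_2)^{\frac34} (K \wedge K_2)^{\frac12} (K \vee K_2)^{\frac14}}{K^{\frac12 -} K_1^{\frac12 +} K_2^{\frac12 +}}\, \|f_{K_1 M_1}\|_{L^2} \|g_{K_2 M_2}\|_{L^2} \|h_{KM}\|_{L^2} .
\]
Now (22) yields
\[
K_1^{\frac12} \gtrsim (M_1 M_2 M^3)^{\frac12} \gtrsim M_2^{\frac12} M^{\frac32} \gtrsim M (M \wedge M_2)^{\frac34} .
\]
Hence for $K_1 \ge K$ one has
\[
J_2^{K_1 K_2 K M_1 M_2 M} \lesssim \frac{1}{K_1^{0+}}\, \|f_{K_1 M_1}\|_{L^2} \|g_{K_2 M_2}\|_{L^2} \|h_{KM}\|_{L^2} .
\]

Case $K_1 \le K$. Then
\[
J_2^{K_1 K_2 K M_1 M_2 M} \lesssim \frac{M}{K^{\frac12 -} K_1^{\frac12 +} K_2^{\frac12 +}}\, \bigl\langle f_{K_1 M_1} \star g_{K_2 M_2},\, h_{KM} \bigr\rangle_{L^2} .
\]
The Cauchy–Schwarz inequality and Lemma 4 yield
\[
J_2^{K_1 K_2 K M_1 M_2 M} \lesssim \frac{M (M_1 \wedge M_2)^{\frac34} (K_1 \wedge K_2)^{\frac12} (K_1 \vee K_2)^{\frac14}}{K^{\frac12 -} K_1^{\frac12 +} K_2^{\frac12 +}}\, \|f_{K_1 M_1}\|_{L^2} \|g_{K_2 M_2}\|_{L^2} \|h_{KM}\|_{L^2} .
\]
Next using (22) we obtain
\[
K^{\frac12 -} \gtrsim (M_1 M_2 M^3)^{\frac12 -} \gtrsim M (M_1 \wedge M_2)^{\frac34} .
\]
Hence for $K_1 \le K$ one has
\[
J_2^{K_1 K_2 K M_1 M_2 M} \lesssim \frac{1}{K^{0+}}\, \|f_{K_1 M_1}\|_{L^2} \|g_{K_2 M_2}\|_{L^2} \|h_{KM}\|_{L^2} .
\]
This completes the proof of Lemma 6.

Now using (22) and Lemma 6 we can sum $J_2^{K_1 K_2 K M_1 M_2 M}$ over dyadic $K_1, K_2, K, M_1, M_2, M$ and arrive at
\[
J_2 \lesssim \|f\|_{L^2} \|g\|_{L^2} \|h\|_{L^2} .
\]
This completes the proof of Theorem 2.1.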


3. Local Well-Posedness

The goal of this section is to prove a local well-posedness result in the Fourier transform restriction spaces associated to the energy density of the fifth order KP-I equation posed on $\mathbb{T} \times \mathbb{R}$. This well-posedness result is a consequence of a bilinear estimate in the framework of the above spaces. The gain of smoothness is obtained as in the previous section. Because of the specific structure of the energy density, an additional argument is needed in order to deal with the terms containing antiderivatives. This argument was already given in [12]. Here we perform it again with the needed modifications.

We now define the antiderivative operator $\partial_x^{-k}$, which acts on functions defined on $\mathbb{T} \times \mathbb{R}$ with zero $x$ mean value (or equivalently with vanishing zero Fourier modes in $x$). Let $\phi : \mathbb{T} \times \mathbb{R} \to \mathbb{R}$ be such that $\int_{\mathbb{T}} \phi(x, y)\,dx = 0$ (or equivalently $\hat{\phi}(0, \eta) = 0$). Define $\partial_x^{-k}\phi$ through its Fourier transform as
\[ \widehat{\partial_x^{-k}\phi}(m, \eta) = \begin{cases} (-im)^{-k}\hat{\phi}(m, \eta), & \text{when } m \ne 0, \\ 0, & \text{elsewhere.} \end{cases} \]
Note that $\partial_x^{-1}(\partial_x \phi) = \phi$ for any $\phi$ having zero $x$ mean value. Let $\phi : \mathbb{T} \times \mathbb{R} \to \mathbb{R}$ be such that $\int_{\mathbb{T}} \phi(x, y)\,dx = 0$. Then an integration by parts yields
\[ \int_{\mathbb{T} \times \mathbb{R}} \partial_x^{-2}\phi \cdot \phi = -\int_{\mathbb{T} \times \mathbb{R}} |\partial_x^{-1}\phi|^2. \]

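Let $s$ and $k$ be real numbers and $H^{s,k}(\mathbb{T} \times \mathbb{R})$ be the Sobolev-type space (related to the energy density of the KP equation for $s = 2$ and $k = 1$) of functions having zero $x$ mean value equipped with the norm
\[ \|\phi\|_{H^{s,k}} = \Bigl( \sum_{m \ne 0} \int_{-\infty}^{\infty} \bigl( |m|^{2s} + |m|^{-2}|\eta|^{2k} \bigr) |\hat{\phi}(m, \eta)|^2 \, d\eta \Bigr)^{1/2}. \]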

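Let $b$ and $k$ be real numbers. Since the energy density of the KP equations contains an antiderivative we introduce the Fourier transform restriction space $Y^{b,k}(\mathbb{R} \times \mathbb{T} \times \mathbb{R})$ as
\[ Y^{b,k}(\mathbb{R} \times \mathbb{T} \times \mathbb{R}) = \bigl\{ u \in \mathcal{S}'(\mathbb{R} \times \mathbb{T} \times \mathbb{R}) : \hat{u}(\tau, 0, \eta) = 0 \text{ and } \|u\|_{Y^{b,k}} < \infty \bigr\}, \]
where
\[ \|u\|_{Y^{b,k}} = \Bigl( \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \sum_{m \ne 0} |m|^{-2} |\eta|^{2k} \Bigl( 1 + \Bigl| \tau - m^5 - \frac{\eta^2}{m} \Bigr| \Bigr)^{2b} |\hat{u}(\tau, m, \eta)|^2 \, d\tau \, d\eta \Bigr)^{1/2}. \]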

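Define now the space $Z^{b,s,k}(\mathbb{R} \times \mathbb{T} \times \mathbb{R}) := X^{b,s}(\mathbb{R} \times \mathbb{T} \times \mathbb{R}) \cap Y^{b,k}(\mathbb{R} \times \mathbb{T} \times \mathbb{R})$ equipped with the norm
\[ \|u\|_{Z^{b,s,k}} = \|u\|_{X^{b,s}} + \|u\|_{Y^{b,k}}. \]
Let $I \subset \mathbb{R}$ be an interval. Then we define a localized Bourgain space $Z^{b,s,k}(I)$ endowed with the norm
\[ \|u\|_{Z^{b,s,k}(I)} = \inf_{w \in Z^{b,s,k}} \bigl\{ \|w\|_{Z^{b,s,k}},\ w(t) = u(t) \text{ on } I \bigr\}. \]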


We have the following local well-posedness result.

Theorem 3.1. Let $s \ge 1$ and $k \ge 0$. Then for any $\phi \in H^{s,k}(\mathbb{T} \times \mathbb{R})$, there exist a positive $T = T(\|\phi\|_{H^{s,k}})$ ($\lim_{\rho \to 0} T(\rho) = \infty$) and a unique solution $u(t, x, y)$ of the initial value problem associated to the fifth order KP-I equation with data on $\mathbb{T} \times \mathbb{R}$ on the time interval $I = [-T, T]$ such that $u \in C(I, H^{s,k}(\mathbb{T} \times \mathbb{R})) \cap Z^{1/2+,s,k}(I)$.

The proof of Theorem 3.1 results from the following fundamental estimate:
\[ \|\partial_x(uv)\|_{Z^{-1/2+,s,k}} \lesssim \|u\|_{Z^{1/2+,s,k}} \|v\|_{Z^{1/2+,s,k}}, \qquad s \ge 1,\ k \ge 0. \tag{23} \]

3.1. Proof of (23). Due to Theorem 2.1 we obtain for $s > 1/2$,
\[ \|\partial_x(uv)\|_{X^{-1/2+,s}} \lesssim \|u\|_{X^{1/2+,s}} \|v\|_{X^{1/2+,s}} \le \|u\|_{Z^{1/2+,s,k}} \|v\|_{Z^{1/2+,s,k}}. \]
Therefore the proof of (23) is reduced to estimating $\|\partial_x(uv)\|_{Y^{-1/2+,k}}$ by $\|u\|_{Z^{1/2+,s,k}} \|v\|_{Z^{1/2+,s,k}}$.

Actually a stronger estimate holds. More precisely we have the following theorem.

Theorem 3.2. Let $s \ge 1$ and $k \ge 0$. Then
\[ \|\partial_x(uv)\|_{Y^{-1/2+,k}} \lesssim \|u\|_{X^{1/2+,s}} \|v\|_{Y^{1/2+,k}} + \|u\|_{Y^{1/2+,k}} \|v\|_{X^{1/2+,s}}. \]

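Proof of Theorem 3.2. Write
\[ \|\partial_x(uv)\|_{Y^{-1/2+,k}} = \Bigl( \sum_{m \ne 0} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} I^2(\tau, m, \eta) \, d\tau \, d\eta \Bigr)^{1/2}, \]
where
\[ I(\tau, m, \eta) = \frac{|\eta|^k}{\langle \tau - m^5 - \eta^2/m \rangle^{1/2-}} \sum_{\substack{m_1 \ne 0 \\ m - m_1 \ne 0}} \int_{\mathbb{R}^2} \hat{u}(\tau_1, m_1, \eta_1)\, \hat{v}(\tau - \tau_1, m - m_1, \eta - \eta_1) \, d\tau_1 \, d\eta_1 \]
\[ = \frac{|\eta|^k}{\langle \tau - m^5 - \eta^2/m \rangle^{1/2-}} \Bigl( \sum_{\substack{m_1 \ne 0 \\ m - m_1 \ne 0}} \int_{|\eta| \le 2|\eta_1|} \cdots \; + \sum_{\substack{m_1 \ne 0 \\ m - m_1 \ne 0}} \int_{|\eta| \ge 2|\eta_1|} \cdots \Bigr) := I_1(\tau, m, \eta) + I_2(\tau, m, \eta). \]
Theorem 3.2 is a direct consequence of the next lemma.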


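Lemma 7. The following estimates hold:
\[ \Bigl( \sum_{m \ne 0} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} I_1^2(\tau, m, \eta) \, d\tau \, d\eta \Bigr)^{1/2} \lesssim \|u\|_{Y^{1/2+,k}} \|v\|_{X^{1/2+,s}}, \tag{24} \]
\[ \Bigl( \sum_{m \ne 0} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} I_2^2(\tau, m, \eta) \, d\tau \, d\eta \Bigr)^{1/2} \lesssim \|u\|_{X^{1/2+,s}} \|v\|_{Y^{1/2+,k}}. \tag{25} \]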

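Proof of Lemma 7. Since $|\eta| \le 2|\eta_1|$ on the domain of the integral defining $I_1(\tau, m, \eta)$, a duality argument shows that in order to prove (24) we should bound the expression
\[ \sum_{\Lambda_+} \frac{|m_1|\,|m - m_1|^{-s} f(\tau_1, m_1, \eta_1)\, g(\tau - \tau_1, m - m_1, \eta - \eta_1)\, h(\tau, m, \eta)}{\bigl\langle \tau - m^5 - \frac{\eta^2}{m} \bigr\rangle^{1/2-} \bigl\langle \tau_1 - m_1^5 - \frac{\eta_1^2}{m_1} \bigr\rangle^{1/2+} \bigl\langle \tau - \tau_1 - (m - m_1)^5 - \frac{(\eta - \eta_1)^2}{m - m_1} \bigr\rangle^{1/2+}} \tag{26} \]
by $c \|f\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} \|g\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} \|h\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})}$, where $f$, $g$, $h$ are positive $L^2$ functions.

Estimate for the contribution of $\Lambda_1$ to (26). Denote by $J_1$ the contribution of $\Lambda_1$ to the expression (26). Consider the dyadic levels $D^{K_1 K_2}_{M_1 M_2 M}$, where $K_1, K_2, M_1, M_2, M$ are dyadic integers as in the proof of Theorem 2.1. Denote by $J_1^{K_1 K_2 M_1 M_2 M}$ the contribution of $D^{K_1 K_2}_{M_1 M_2 M}$ to (26). Then
\[ J_1 \lesssim \sum_{K_1, K_2, M_1, M_2, M\text{-dyadic}} J_1^{K_1 K_2 M_1 M_2 M}. \]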

Define $f_{K_1 M_1}(\tau, m, \eta)$, $g_{K_2 M_2}(\tau, m, \eta)$ and $h_M(\tau, m, \eta)$ as in (8), (9), (10). Then clearly $J_1^{K_1 K_2 M_1 M_2 M}$ is bounded by
\[ \frac{M_1}{M_2^{s} K_1^{1/2+} K_2^{1/2+}} \int_{\Omega^{K_1 K_2}_{M_1 M_2 M}} \sum_{\Lambda_1} f_{K_1 M_1}(\tau_1, m_1, \eta_1)\, g_{K_2 M_2}(\tau - \tau_1, m - m_1, \eta - \eta_1)\, h_M(\tau, m, \eta) \, d\tau \, d\tau_1, \]
where $\Omega^{K_1 K_2}_{M_1 M_2 M} \subset \mathbb{R}^4$ is defined as in the proof of Theorem 2.1. Moreover, similarly to the proof of Theorem 2.1 we obtain that
\[ J_1^{K_1 K_2 M_1 M_2 M} \lesssim \frac{M_1 M_2^{-s}}{K_1^{1/2+} K_2^{1/2+}} \sup_{(\tau, |m| \approx M, \eta)} |A_{\tau m \eta}|^{1/2} \, \|f_{K_1 M_1}\|_{L^2} \|g_{K_2 M_2}\|_{L^2} \|h_M\|_{L^2}, \]
where the set $A_{\tau m \eta}$ is defined as in (12). Again similarly to the proof of Theorem 2.1 we obtain via the triangle inequality that $|A_{\tau m \eta}| \lesssim (K_1 \wedge K_2) |B_{\tau m \eta}|$, where $B_{\tau m \eta} \subset \mathbb{Z} \times \mathbb{R}$ is the set defined by (14). We shall estimate $|B_{\tau m \eta}|$ in a slightly different fashion compared to the proof of Theorem 2.1. The projection of $B_{\tau m \eta}$ on the $m_1$ axis is contained in a set of cardinality at most $c(M_1 \wedge M_2)$ since for $(m_1, \eta_1) \in B_{\tau m \eta}$ one has $|m_1| \approx M_1$ and $|m - m_1| \approx M_2$. Fix now $m_1$. Recall that for $(m, m_1, \eta, \eta_1) \in \Lambda_1$ one has
\[ \Bigl| \frac{\partial}{\partial \eta_1} \Bigl( \tau - m_1^5 - (m - m_1)^5 - \frac{\eta_1^2}{m_1} - \frac{(\eta - \eta_1)^2}{m - m_1} \Bigr) \Bigr| \sim |m| \sqrt{m^2 - m m_1 + m_1^2} \gtrsim |m m_1| \sim M_1 M. \]
Hence, due to Lemma 2, the maximum cardinality of the sections of $B_{\tau m \eta}$ with lines parallel to the $\eta_1$ axis is bounded by $c(K_1 \vee K_2)/(M_1 M)$. Now using Lemma 1 we obtain that the cardinality of $B_{\tau m \eta}$ is bounded by $c(M_1 \wedge M_2)(M_1 M)^{-1}(K_1 \vee K_2)$. Moreover
\[ |A_{\tau m \eta}| \lesssim \frac{K_1 K_2 (M_1 \wedge M_2)}{M_1 M}. \]

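Hence
\[ J_1^{K_1 K_2 M_1 M_2 M} \lesssim \frac{M_1^{1/2} (M_1 \wedge M_2)^{1/2}}{M^{1/2} M_2^{s} K_1^{0+} K_2^{0+}} \|f_{K_1 M_1}\|_{L^2} \|g_{K_2 M_2}\|_{L^2} \|h_M\|_{L^2} \]
\[ \lesssim \frac{M_1^{1/2}}{M^{1/2} M_2^{s - 1/2} K_1^{0+} K_2^{0+}} \|f_{K_1 M_1}\|_{L^2} \|g_{K_2 M_2}\|_{L^2} \|h_M\|_{L^2} \]
\[ \lesssim \frac{M_1^{1/2}}{M^{1/2} M_2^{1/2} K_1^{0+} K_2^{0+}} \|f_{K_1 M_1}\|_{L^2} \|g_{K_2 M_2}\|_{L^2} \|h_M\|_{L^2}, \]
since $s \ge 1$. By the triangle inequality we have that $M_1 \lesssim \max\{M, M_2\}$. Using a symmetry argument we can suppose that $M \ge M_1$ and therefore $M_1 \lesssim M$. Let $M = 2^l M_1$, where $l \in \mathbb{Z}$, $l \ge -l_0$ ($l_0$ is fixed, positive and independent of $M_1$). Then we have that
\[ J_1^{K_1, K_2, M_1, M_2, 2^l M_1} \lesssim \frac{1}{K_1^{0+} K_2^{0+} M_2^{1/2} 2^{l/2}} \|f_{K_1 M_1}\|_{L^2} \|g_{K_2 M_2}\|_{L^2} \|h_{2^l M_1}\|_{L^2}. \tag{27} \]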

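It remains to sum (27) over $K_1, K_2, M_1, M_2, l$. First we can easily sum (27) over $K_1, K_2, M_2$,
\[ \sum_{K_1, K_2, M_2\text{-dyadic}} J_1^{K_1, K_2, M_1, M_2, 2^l M_1} \lesssim \frac{1}{2^{l/2}} \|f_{M_1}\|_{L^2} \|g\|_{L^2} \|h_{2^l M_1}\|_{L^2}, \]
where
\[ f_{M_1}(\tau, m, \eta) = \begin{cases} f(\tau, m, \eta), & \text{when } |m| \approx M_1, \\ 0, & \text{elsewhere.} \end{cases} \]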

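Next we sum over $M_1$ and $l$ via the Cauchy–Schwarz inequality:
\[ J_1 \lesssim \sum_{l = -l_0}^{\infty} \sum_{K_1, K_2, M_1, M_2\text{-dyadic}} J_1^{K_1, K_2, M_1, M_2, 2^l M_1} \lesssim \sum_{l = -l_0}^{\infty} \frac{1}{2^{l/2}} \Bigl( \sum_{M_1\text{-dyadic}} \|f_{M_1}\|_{L^2}^2 \Bigr)^{1/2} \Bigl( \sum_{M_1\text{-dyadic}} \|h_{2^l M_1}\|_{L^2}^2 \Bigr)^{1/2} \|g\|_{L^2} \lesssim \|f\|_{L^2} \|g\|_{L^2} \|h\|_{L^2}. \]

Estimate for the contribution of $\Lambda_2$ to (26). For $(m, m_1) \in \Lambda_+$ and $s \ge 1$ one has $|m_1|\,|m - m_1|^{-s} \lesssim |m|$. Hence the contribution of $\Lambda_2$ to the sum in the expression (26) is bounded by
\[ \sum_{\Lambda_+} \frac{|m|\, f(\tau_1, m_1, \eta_1)\, g(\tau - \tau_1, m - m_1, \eta - \eta_1)\, h(\tau, m, \eta)}{\bigl\langle \tau - m^5 - \frac{\eta^2}{m} \bigr\rangle^{1/2-} \bigl\langle \tau_1 - m_1^5 - \frac{\eta_1^2}{m_1} \bigr\rangle^{1/2+} \bigl\langle \tau - \tau_1 - (m - m_1)^5 - \frac{(\eta - \eta_1)^2}{m - m_1} \bigr\rangle^{1/2+}}. \]
Now we remark that the above expression has the same nature as (6) with $s = 0$. Hence we can use the arguments implemented above when estimating the contribution of $\Lambda_2$ to the expression (6). This completes the proof of (24).

When $|\eta| \ge 2|\eta_1|$ one has $|\eta| \le 2|\eta - \eta_1|$. Hence a duality argument shows that the proof of (25) is reduced to bounding the expression
\[ \sum_{\Lambda_+} \frac{|m - m_1|\,|m_1|^{-s} f(\tau_1, m_1, \eta_1)\, g(\tau - \tau_1, m - m_1, \eta - \eta_1)\, h(\tau, m, \eta)}{\bigl\langle \tau - m^5 - \frac{\eta^2}{m} \bigr\rangle^{1/2-} \bigl\langle \tau_1 - m_1^5 - \frac{\eta_1^2}{m_1} \bigr\rangle^{1/2+} \bigl\langle \tau - \tau_1 - (m - m_1)^5 - \frac{(\eta - \eta_1)^2}{m - m_1} \bigr\rangle^{1/2+}} \tag{28} \]
by $c \|f\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} \|g\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})} \|h\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{R})}$, where $f$, $g$, $h$ are positive $L^2$ functions. A symmetry argument ($m_1 \to m - m_1$) shows that we can bound (28) similarly to (26). This completes the proof of Lemma 7.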

3.2. The fixed point argument. In this section we perform a fixed point argument for the integral equation corresponding to the fifth order KP-I equation. This argument is standard since the linear estimates in the Fourier transform restriction method of J. Bourgain do not depend on the particular equation at hand. Write the fifth order KP-I equation as an integral equation
\[ u(t) = U(t)\phi - \frac{1}{2} \int_0^t U(t - t')\, \partial_x(u^2(t')) \, dt', \tag{29} \]
where $U(t) = \exp(t(\partial_x^5 + \partial_x^{-1}\partial_y^2))$ is the unitary group generating the solutions of the linear problem. We shall apply the contraction mapping principle to a cut-off version of (29). Let $\psi$ be a bump function such that $\psi \in C_0^\infty(\mathbb{R})$, $\operatorname{supp} \psi \subset [-2, 2]$, $\psi = 1$ on the interval $[-1, 1]$. Consider the integral equation
\[ u(t) = \psi(t) U(t)\phi - \frac{1}{2} \psi(t/T) \int_0^t U(t - t')\, \partial_x(u^2(t')) \, dt'. \tag{30} \]

We shall solve (30) globally in time in the space $Z^{b,s,k}$. To the solutions of (30) correspond local solutions of the fifth order KP-I equation in the time interval $[-T, T]$ in the space $Z^{b,s,k}(I)$, where $I = [-T, T]$. Consider the nonlinear operator $L$ acting on $Z^{b,s,k}$ as
\[ Lu := \psi(t) U(t)\phi - \frac{1}{2} \psi(t/T) \int_0^t U(t - t')\, \partial_x(u^2(t')) \, dt'. \]
We claim that for small enough $T$ the operator $L$ is a contraction in the space $Z^{1/2+,s,k}$ for any $\phi \in H^{s,k}(\mathbb{T} \times \mathbb{R})$. This will follow from the next estimates of the two terms in the right-hand side of (30).

Lemma 8 (linear estimates). Let $-\frac{1}{2} < b' \le 0 \le b \le b' + 1$, $s \ge 0$ and $k \ge 0$. Then the following inequalities hold:
\[ \|\psi(t) U(t)\phi\|_{Z^{b,s,k}} \lesssim \|\phi\|_{H^{s,k}}, \tag{31} \]
\[ \Bigl\| \psi(t/T) \int_0^t U(t - t')\, \partial_x(u^2(t')) \, dt' \Bigr\|_{Z^{b,s,k}} \lesssim T^{1 - b + b'} \|u u_x\|_{Z^{b',s,k}}. \tag{32} \]

We refer to [9] for the proof of (31) and (32) (and for a very clear introduction to Bourgain’s method). These estimates are essentially one dimensional and do not depend on the unitary group $U(t)$. Now using (31), (32) and (23) we obtain that
\[ \|Lu\|_{Z^{1/2+,s,k}} \lesssim \|\phi\|_{H^{s,k}} + T^{0+} \|u\|^2_{Z^{1/2+,s,k}}, \qquad s \ge 1,\ k \ge 0. \]
Hence $L$ maps $Z^{1/2+,s,k}$ into itself for $s \ge 1$, $k \ge 0$. In a similar way we obtain that
\[ \|Lu - Lv\|_{Z^{1/2+,s,k}} \lesssim T^{0+} \|u - v\|_{Z^{1/2+,s,k}} \|u + v\|_{Z^{1/2+,s,k}}. \]
Therefore $L$ is a contraction in $Z^{b,s,k}$ for a small $T$ of order $\|\phi\|^{-a}_{H^{s,k}}$ for some positive constant $a$. It remains to use the contraction mapping principle to solve (30) in $Z^{b,s,k}$. This implies the local well-posedness of (29) in $Z^{b,s,k}(I)$. The embedding of $Z^{b,s,k}(I)$ in $C(I, H^{s,k}(\mathbb{T} \times \mathbb{R}))$ follows from a one dimensional Sobolev inequality. This completes the proof of Theorem 3.1.



4. Global Well-Posedness

In this section we extend globally in time the local solutions obtained in Theorem 3.1. This results from the energy conservation. More precisely, applying Theorem 3.1 with $s = 2$ and $k = 1$ we obtain a local solution $u$ of the fifth order KP-I equation on the time interval $[-T, T]$. The local well-posedness implies that the following alternative holds: either $\lim_{t \to T} \|u(t)\|_{H^{2,1}(\mathbb{T} \times \mathbb{R})} = \infty$ or $T = \infty$. Our goal is to show that the second claim holds. It suffices to show that $\|u(t)\|_{H^{2,1}(\mathbb{T} \times \mathbb{R})}$ remains bounded along the trajectories. For that purpose we shall use the conservation of the energy as stated in the next lemma.

Lemma 9. Let $u$ be the local solution obtained in Theorem 3.1 for $s = 2$ and $k = 1$. Then the conservation of the energy holds:
\[ \frac{1}{2} \int_{\mathbb{T} \times \mathbb{R}} \Bigl( |\partial_x^2 u(t)|^2 + |\partial_x^{-1}\partial_y u(t)|^2 - \frac{1}{3} u^3(t) \Bigr) = H(\phi). \tag{33} \]

Proof. Writing the fifth order KP-I equation as
\[ \partial_t u - \partial_x \Bigl( \partial_x^4 u - \frac{u^2}{2} + \partial_x^{-2}\partial_y^2 u \Bigr) = 0, \tag{34} \]
it suffices to multiply (34) by $\partial_x^4 u - \frac{u^2}{2} + \partial_x^{-2}\partial_y^2 u$ and to integrate by parts.

Set
\[ Q(t) = \frac{1}{2} \int_{\mathbb{T} \times \mathbb{R}} \bigl( |\partial_x^2 u(t)|^2 + |\partial_x^{-1}\partial_y u(t)|^2 \bigr). \]
We shall prove that $Q(t)$ is bounded. An anisotropic Sobolev inequality (cf. [2]) yields
\[ \Bigl| \int_{\mathbb{T} \times \mathbb{R}} u^3(t) \Bigr| \le 2 \|u(t)\|^2_{L^2} \|\partial_x^2 u(t)\|^{1/2}_{L^2} \|\partial_x^{-1}\partial_y u(t)\|^{1/2}_{L^2}. \]
Using the $L^2$ conservation law, (33) and twice the elementary inequality $2ab \le a^2 + b^2$, we obtain
\[ \Bigl| \int_{\mathbb{T} \times \mathbb{R}} u^3(t) \Bigr| \le 4 \|\phi\|^4_{L^2} + \frac{1}{4} Q(t). \tag{35} \]
Thanks to the energy conservation we have that
\[ Q(t) = H(\phi) + \frac{1}{6} \int_{\mathbb{T} \times \mathbb{R}} u^3(t). \]
Now we use (35) to obtain the following bound for $Q(t)$:
\[ Q(t) \le \frac{24}{23} H(\phi) + \frac{16}{23} \|\phi\|^4_{L^2}. \]
Hence $Q(t)$ is bounded by a quantity which remains constant along the trajectories. Therefore $T = \infty$ and the solutions are global. This completes the proof of Theorem 1.⁴

⁴ One can show higher order Sobolev regularity persistence properties of the flow similar to [6, Sect. 8].
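The constants $24/23$ and $16/23$ in the bound for $Q(t)$ come from absorbing the $\frac{1}{4}Q(t)$ term of (35) into the left-hand side. The following minimal sketch redoes that arithmetic with exact rationals; the function name `energy_bound_coefficients` is hypothetical and only serves this illustration.

```python
from fractions import Fraction


def energy_bound_coefficients():
    """Solve Q <= H + (1/6) * (4 * P + Q / 4) for Q, with P = ||phi||_{L^2}^4.

    Returns the pair (coefficient of H, coefficient of P) in the final bound.
    """
    cubic_weight = Fraction(1, 6)            # Q = H + (1/6) * int u^3
    absorbed = cubic_weight * Fraction(1, 4)  # part of Q moved to the left side
    left_factor = 1 - absorbed                # (1 - 1/24) * Q <= H + (4/6) * P
    return 1 / left_factor, cubic_weight * 4 / left_factor


coeff_H, coeff_P = energy_bound_coefficients()
print(coeff_H, coeff_P)
```

Both returned values are exact fractions, so they can be compared directly with the constants stated in the text.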



5. Counterexamples in the Case of Purely Periodic Data

5.1. The fifth order KP-I equation. In this section we give an example showing that Theorem 2.1 cannot be extended to the purely periodic case. This is an additional motivation of our choice of initial data defined on $\mathbb{T}_x \times \mathbb{R}_y$ in the considerations of the previous sections. Our example pertains only to KP-I equations since it uses in an essential way the failure of the smoothing relation obtained in the KP-II context. Let $b$ and $s$ be real numbers. We define the Fourier transform restriction space $X^{b,s}(\mathbb{R} \times \mathbb{T}^2)$ associated to the fifth order KP-I equation with data on $\mathbb{T}^2$,
\[ X^{b,s}(\mathbb{R} \times \mathbb{T}^2) = \bigl\{ u \in \mathcal{S}'(\mathbb{R} \times \mathbb{T}^2) : \hat{u}(\tau, 0, n) = 0,\ \|u\|_{X^{b,s}} < \infty \bigr\}, \]
where
\[ \|u\|_{X^{b,s}} = \Bigl( \sum_{\substack{(m, n) \in \mathbb{Z}^2 \\ m \ne 0}} |m|^{2s} \int_{-\infty}^{\infty} \Bigl( 1 + \Bigl| \tau - m^5 - \frac{n^2}{m} \Bigr| \Bigr)^{2b} |\hat{u}(\tau, m, n)|^2 \, d\tau \Bigr)^{1/2} \]
and $\hat{u}(\tau, m, n)$ stands for the Fourier transform of a function defined on $\mathbb{R} \times \mathbb{T}^2$:
\[ \hat{u}(\tau, m, n) = \int_{\mathbb{R} \times \mathbb{T}^2} e^{-it\tau - imx - iny} u(t, x, y) \, dt \, dx \, dy. \]

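We shall show that the statement of Theorem 2.1 is invalid in the framework of the spaces $X^{b,s}(\mathbb{R} \times \mathbb{T}^2)$. As far as we know counterexamples showing the failure of bilinear estimates in Bourgain spaces first appeared in [10] for the KdV equation.

Theorem 5.1. The estimate
\[ \|\partial_x(uv)\|_{X^{b-1,s}(\mathbb{R} \times \mathbb{T}^2)} \lesssim \|u\|_{X^{b,s}(\mathbb{R} \times \mathbb{T}^2)} \|v\|_{X^{b,s}(\mathbb{R} \times \mathbb{T}^2)} \tag{36} \]
fails for any $0 < b \le 1$ and $s \in \mathbb{R}$.

Proof of Theorem 5.1. A duality argument shows that if (36) holds then one can bound the modulus of the expression
\[ \int_{\mathbb{R}^2} \sum_{\Lambda_+} \frac{|m|^{1+s} |m_1|^{-s} |m - m_1|^{-s} f(\tau_1, m_1, n_1)\, g(\tau - \tau_1, m - m_1, n - n_1)\, h(\tau, m, n)}{\bigl\langle \tau - m^5 - \frac{n^2}{m} \bigr\rangle^{1-b} \bigl\langle \tau_1 - m_1^5 - \frac{n_1^2}{m_1} \bigr\rangle^{b} \bigl\langle \tau - \tau_1 - (m - m_1)^5 - \frac{(n - n_1)^2}{m - m_1} \bigr\rangle^{b}} \, d\tau \, d\tau_1 \tag{37} \]
by $c \|f\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{Z})} \|g\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{Z})} \|h\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{Z})}$, where
\[ \Lambda_+ = \{ (m, n, m_1, n_1) \in \mathbb{Z}^4 : m_1 \ne 0,\ m - m_1 \ne 0,\ m \ne 0 \}. \]
Let $N$ be a large integer. Set⁵ $\alpha(N) := [N(N - 1)\sqrt{5N^2 - 5N + 5}]$, where $[\cdot]$ denotes the integer part of a real number. Define $f$, $g$ and $h$ as follows:
\[ f(\tau, m, n) = \begin{cases} 1, & \text{when } m = 1,\ n = 0,\ 0 \le \tau - 1 \le 1, \\ 0, & \text{elsewhere,} \end{cases} \]
\[ g(\tau, m, n) = \begin{cases} 1, & \text{when } m = N - 1,\ n = \alpha(N),\ \bigl| \tau - (N - 1)^5 - \frac{\alpha(N)^2}{N - 1} \bigr| \le 1, \\ 0, & \text{elsewhere,} \end{cases} \]
\[ h(\tau, m, n) = \begin{cases} 1, & \text{when } m = N,\ n = \alpha(N),\ 0 \le \tau - 1 - (N - 1)^5 - \frac{\alpha(N)^2}{N - 1} \le 1, \\ 0, & \text{elsewhere.} \end{cases} \]

⁵ Note that for $m - m_1 = N - 1$, $m = N$, $n_1 = 0$, $n = \alpha(N)$ the expression $5 m m_1 (m - m_1)(m^2 - m m_1 + m_1^2)$ is “close to” $\frac{(m_1 n - m n_1)^2}{m m_1 (m - m_1)}$, i.e. $(N - 1, N, 0, \alpha(N))$ is “close to” the set of lattice points where the smoothing relation fails.

Note that
\[ \|f\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{Z})} \sim \|g\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{Z})} \sim \|h\|_{L^2(\mathbb{R} \times \mathbb{Z} \times \mathbb{Z})} \sim 1. \]
Hence if (36) holds, the above choice of $f$, $g$, $h$ would imply that (37) is bounded independently of $N$. We have $(f * g)(\tau, m, n) \ge h(\tau, m, n)$ and therefore (37) is bounded from below by
\[ c \int_{\Omega_N} \frac{N \, d\tau}{\bigl\langle \tau - N^5 - \frac{\alpha^2(N)}{N} \bigr\rangle^{1-b}}, \tag{38} \]
where
\[ \Omega_N = \Bigl\{ \tau : 0 \le \tau - 1 - (N - 1)^5 - \frac{\alpha^2(N)}{N - 1} \le 1 \Bigr\}. \]
In order to bound (38) from below we need to get an upper bound of $\bigl| \tau - N^5 - \frac{\alpha^2(N)}{N} \bigr|$ for $\tau \in \Omega_N$. We can write via the triangle inequality for $\tau \in \Omega_N$,
\[ \Bigl| \tau - N^5 - \frac{\alpha^2(N)}{N} \Bigr| \le \Bigl| \tau - 1 - (N - 1)^5 - \frac{\alpha^2(N)}{N - 1} \Bigr| + \Bigl| N^5 - (N - 1)^5 - 1 - \frac{\alpha^2(N)}{N(N - 1)} \Bigr| \le 1 + \Bigl| N^5 - (N - 1)^5 - 1 - \frac{\alpha^2(N)}{N(N - 1)} \Bigr|. \]
Since $N(N - 1)\sqrt{5N^2 - 5N + 5} - 1 \le \alpha(N) \le N(N - 1)\sqrt{5N^2 - 5N + 5}$ one can easily obtain that
\[ 0 \le N^5 - (N - 1)^5 - 1 - \frac{\alpha^2(N)}{N(N - 1)} \lesssim N. \]
Hence (38) is minorized by
\[ \frac{cN}{N^{1-b}} \sim N^b, \]
which is clearly not bounded since $N$ can be chosen arbitrarily large. This completes the proof of Theorem 5.1.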
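The two-sided bound on $N^5 - (N-1)^5 - 1 - \alpha^2(N)/(N(N-1))$ can be checked numerically. The sketch below is only an illustration under the definitions above: it uses the identity $N^5 - (N-1)^5 - 1 = 5N(N-1)(N^2 - N + 1)$ and computes the integer part exactly with `math.isqrt`. The helper names `alpha5` and `gap5` are hypothetical.

```python
from fractions import Fraction
from math import isqrt


def alpha5(N):
    """alpha(N) = [N (N-1) sqrt(5 N^2 - 5 N + 5)], via an exact integer square root."""
    return isqrt((N * (N - 1)) ** 2 * (5 * N * N - 5 * N + 5))


def gap5(N):
    """N^5 - (N-1)^5 - 1 - alpha(N)^2 / (N (N-1)) as an exact rational number."""
    return N**5 - (N - 1) ** 5 - 1 - Fraction(alpha5(N) ** 2, N * (N - 1))


for N in (2, 10, 100, 1000):
    g = gap5(N)
    # The ratio g / N should stay bounded (below 2 * sqrt(5)) as N grows.
    print(N, float(g), float(g) / N)
```

Writing $\beta = N(N-1)\sqrt{5N^2 - 5N + 5}$, the gap equals $(\beta - \alpha)(\beta + \alpha)/(N(N-1)) \le 2\sqrt{5N^2 - 5N + 5}$, which is the linear growth in $N$ used in the proof.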



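5.2. The “usual” KP-I equation. We shall use the idea of the previous section in order to provide counterexamples in the case of the “usual” KP-I equation. The examples are easier to construct in this context because of the lower order of the dispersion. Let $b, b_1, b_2, s$ be real numbers. We define the Fourier transform restriction spaces $X_{\pm}^{b,b_1,b_2,s}(\mathbb{R} \times \mathbb{T}^2)$ as
\[ X_{\pm}^{b,b_1,b_2,s}(\mathbb{R} \times \mathbb{T}^2) = \bigl\{ u \in \mathcal{S}'(\mathbb{R} \times \mathbb{T}^2) : \hat{u}(\tau, 0, n) = 0,\ \|u\|_{X_{\pm}^{b,b_1,b_2,s}} < \infty \bigr\}, \]
where
\[ \|u\|_{X_{\pm}^{b,b_1,b_2,s}} = \Bigl\| \, |m|^s \Bigl\langle \tau - m^3 \mp \frac{n^2}{m} \Bigr\rangle^{b} \Bigl( 1 + \frac{\bigl\langle \tau - m^3 \mp \frac{n^2}{m} \bigr\rangle^{b_1}}{|m|^{b_2}} \Bigr) \hat{u}(\tau, m, n) \Bigr\|_{L^2(\mathbb{R} \times \mathbb{Z}^* \times \mathbb{Z})}. \]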

The space $X_{+}^{b,b_1,b_2,s}(\mathbb{R} \times \mathbb{T}^2)$ corresponds to the KP-I equation, while the space $X_{-}^{b,b_1,b_2,s}(\mathbb{R} \times \mathbb{T}^2)$ corresponds to the KP-II equation. The next estimate is the main ingredient in the proof of the $L^2(\mathbb{T}^2)$ well-posedness theorem obtained in [6].

Theorem 5.2 ([6]). The estimate
\[ \|\partial_x(uv)\|_{X_{-}^{b-1,b_1,b_2,s}} \lesssim \|u\|_{X_{-}^{b,b_1,b_2,s}} \|v\|_{X_{-}^{b,b_1,b_2,s}} \tag{39} \]
holds for $b = \frac{1}{2}$, $b_1 = \frac{1}{4}$, $b_2 = \frac{3}{4}$ and $s \ge 0$.

Once we prove (39) the rest of the well-posedness result for the KP-II equation follows the general lines of Bourgain’s method⁶. The goal of this section is to prove the following result.

Theorem 5.3. The estimate
\[ \|\partial_x(uv)\|_{X_{+}^{b-1,b_1,b_2,s}} \lesssim \|u\|_{X_{+}^{b,b_1,b_2,s}} \|v\|_{X_{+}^{b,b_1,b_2,s}} \tag{40} \]
fails for any $b, b_1, b_2, s$.

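Proof. The proof is similar to that of Theorem 5.1. Set $\alpha(N) = [\sqrt{3} N(N - 1)]$. The estimate (40) fails for $N \gg 1$ and the following choice of $u$ and $v$:
\[ u(t, x, y) = \frac{e^{it}(e^{it} - 1)}{8 i \pi^3 t} \cdot e^{ix}, \]
\[ v(t, x, y) = \frac{e^{i\bigl((N - 1)^3 + \frac{\alpha^2(N)}{N - 1} - 1\bigr)t} (e^{2it} - 1)}{8 i \pi^3 t} \cdot e^{i(N - 1)x} \cdot e^{i\alpha(N)y}. \]
The choice of $u$ and $v$ becomes transparent when computing their Fourier transform. More precisely $\hat{u}$ and $\hat{v}$ are characteristic functions:
\[ \hat{u}(\tau, m, n) = \begin{cases} 1, & \text{when } m = 1,\ n = 0,\ 0 \le \tau - 1 \le 1, \\ 0, & \text{elsewhere,} \end{cases} \]
\[ \hat{v}(\tau, m, n) = \begin{cases} 1, & \text{when } m = N - 1,\ n = \alpha(N),\ \bigl| \tau - (N - 1)^3 - \frac{\alpha(N)^2}{N - 1} \bigr| \le 1, \\ 0, & \text{elsewhere.} \end{cases} \]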
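The time-dependent factors of $u$ and $v$ are, up to the constant $1/(2\pi)^3 = 1/(8\pi^3)$, the integrals of $e^{it\tau}$ over the $\tau$-intervals carrying $\hat{u}$ and $\hat{v}$. The sketch below compares the closed form with a midpoint-rule quadrature. It assumes the inversion formula with the factor $(2\pi)^{-3}$, which is not stated explicitly in the text, and the function names are hypothetical.

```python
import cmath


def interval_transform_closed(t, a, b):
    """Closed form of int_a^b e^{i t tau} d tau for t != 0:
    e^{i a t} (e^{i (b - a) t} - 1) / (i t)."""
    return cmath.exp(1j * a * t) * (cmath.exp(1j * (b - a) * t) - 1) / (1j * t)


def interval_transform_numeric(t, a, b, steps=20000):
    """Midpoint-rule approximation of int_a^b e^{i t tau} d tau."""
    h = (b - a) / steps
    return h * sum(cmath.exp(1j * t * (a + (k + 0.5) * h)) for k in range(steps))


# u: tau in [1, 2] gives e^{it}(e^{it} - 1)/(it).
# v: tau in [c - 1, c + 1] gives e^{i(c - 1)t}(e^{2it} - 1)/(it),
#    with c = (N - 1)^3 + alpha(N)^2 / (N - 1); here N = 3, alpha(3) = 10, c = 58.
c = 58.0
for t in (0.3, -1.7, 4.0):
    err_u = abs(interval_transform_closed(t, 1.0, 2.0) - interval_transform_numeric(t, 1.0, 2.0))
    err_v = abs(interval_transform_closed(t, c - 1, c + 1) - interval_transform_numeric(t, c - 1, c + 1))
    print(t, err_u, err_v)
```

The printed errors are the quadrature errors only; the two expressions agree exactly as functions of $t \ne 0$.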

⁶ That is, the linear estimates and an additional bilinear estimate in the spirit of (39) because of the assumption $b = \frac{1}{2}$.


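We first suppose that (40) holds with $b_2$ positive. Then we have
\[ \|u\|_{X_{+}^{b,b_1,b_2,s}} \lesssim 1 \quad \text{and} \quad \|v\|_{X_{+}^{b,b_1,b_2,s}} \lesssim N^s. \tag{41} \]
On the other hand, similarly to the proof of Theorem 5.1 we can minorize $\hat{u} * \hat{v}$ and obtain
\[ \|\partial_x(uv)\|_{X_{+}^{b-1,b_1,b_2,s}} \ge c \int_{\Omega_N} \frac{N^{1+s} \, d\tau}{\bigl\langle \tau - N^3 - \frac{\alpha^2(N)}{N} \bigr\rangle^{1-b}}, \tag{42} \]
where
\[ \Omega_N = \Bigl\{ \tau : 0 \le \tau - 1 - (N - 1)^3 - \frac{\alpha^2(N)}{N - 1} \le 1 \Bigr\}. \]
The difference with the proof of Theorem 5.1 is that now, because of the lower order dispersion, one can bound, for $\tau \in \Omega_N$, $\bigl| \tau - N^3 - \frac{\alpha^2(N)}{N} \bigr|$ by a constant independent of $N$. More precisely we write, using the triangle inequality, when $\tau \in \Omega_N$,
\[ \Bigl| \tau - N^3 - \frac{\alpha^2(N)}{N} \Bigr| \le \Bigl| \tau - 1 - (N - 1)^3 - \frac{\alpha^2(N)}{N - 1} \Bigr| + \Bigl| N^3 - (N - 1)^3 - 1 - \frac{\alpha^2(N)}{N(N - 1)} \Bigr| \le 1 + \Bigl| 3N(N - 1) - \frac{\alpha^2(N)}{N(N - 1)} \Bigr|. \]
Since $\alpha(N) = [\sqrt{3} N(N - 1)]$ the expression $3N(N - 1) - \frac{\alpha^2(N)}{N(N - 1)}$ is easily seen to range in an interval independent of $N$. More precisely using that $\sqrt{3} N(N - 1) - 1 < \alpha(N) < \sqrt{3} N(N - 1)$, one can obtain for $N > 1$,
\[ 0 < 3N(N - 1) - \frac{\alpha^2(N)}{N(N - 1)} < 2\sqrt{3}. \]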
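The uniform bound $0 < 3N(N-1) - \alpha^2(N)/(N(N-1)) < 2\sqrt{3}$ can be confirmed with exact arithmetic. In this illustrative sketch the integer part $[\sqrt{3}N(N-1)]$ is computed as `isqrt(3 * (N(N-1))**2)`, and the upper bound is tested in squared form to avoid floating point. The names `alpha3` and `gap3` are hypothetical.

```python
from fractions import Fraction
from math import isqrt


def alpha3(N):
    """alpha(N) = [sqrt(3) N (N-1)], via an exact integer square root."""
    return isqrt(3 * (N * (N - 1)) ** 2)


def gap3(N):
    """3 N (N-1) - alpha(N)^2 / (N (N-1)) as an exact rational number."""
    return 3 * N * (N - 1) - Fraction(alpha3(N) ** 2, N * (N - 1))


for N in (2, 3, 10, 1000):
    g = gap3(N)
    # g > 0 and g^2 < 12 together mean 0 < g < 2 * sqrt(3).
    print(N, float(g), g > 0 and g * g < 12)
```

The gap never vanishes because $\sqrt{3}N(N-1)$ is irrational, and it stays below $2\sqrt{3}$ because the integer part differs from $\sqrt{3}N(N-1)$ by less than 1.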

Since $\Omega_N$ has measure 1, we use (42) to prove that $\|\partial_x(uv)\|_{X_{+}^{b-1,b_1,b_2,s}}$ is bounded from below by $N^{1+s}$. Taking into account (41) we obtain the failure of (40) for $N \gg 1$ and $b_2$ positive. When $b_2$ is negative, a slight modification of the above argument is needed. In this case
\[ \|u\|_{X_{+}^{b,b_1,b_2,s}} \lesssim 1 \quad \text{and} \quad \|v\|_{X_{+}^{b,b_1,b_2,s}} \lesssim N^{s - b_2}. \]
The lower bound for $\|\partial_x(uv)\|_{X_{+}^{b-1,b_1,b_2,s}}$ for a negative $b_2$ is $N^{1 + s - b_2}$. This is a contradiction for $N \gg 1$. This completes the proof of Theorem 5.3.

It is worth mentioning that similar obstructions on the KP-I dynamics appear when applying the Poincaré–Birkhoff normal form theory to the KP-I equation posed on the two dimensional torus (cf. [18] and the references therein). On the other hand this method works for the KP-II equation. In the analysis of the KP equations in [18] an important role is played by the structure of the following manifolds:
\[ \mathcal{M} = \Bigl\{ (m_1, m, n_1, n) \in \mathbb{Z}^4 : m_1^3 \pm \frac{n_1^2}{m_1} + (m - m_1)^3 \pm \frac{(n - n_1)^2}{m - m_1} - \Bigl( m^3 \pm \frac{n^2}{m} \Bigr) = 0 \Bigr\}, \]
where the sign $+$ corresponds to the KP-I case and the sign $-$ to the KP-II one. The trivial structure of the manifold $\mathcal{M}$ in the KP-II case implies a trivial structure of the classical scattering matrix (the identity operator) and hence one is able to solve the Cauchy problem for the KP-II equation for small periodic initial data. On the other hand, the nontrivial structure of the manifold $\mathcal{M}$ in the KP-I case leads to some small denominators problems and seems to be an obstruction to applying the methods of [18] in the KP-I context.

The KP equations possess an infinite number of conserved quantities (cf. [19]). In the case of the KP-I equation the conserved quantities have the form
\[ F_{2n+1}(\phi) = \frac{1}{2} \int_{\mathbb{T}^2} |L\phi|^2 + \text{higher order terms}, \]
where $L$ is defined as
\[ L = \Bigl[ \partial_x^{-1} \Bigl\{ \Bigl( \partial_x + \frac{1}{\sqrt{3}} \partial_x^{-1}\partial_y \Bigr)^{2n+1} - \Bigl( -\partial_x + \frac{1}{\sqrt{3}} \partial_x^{-1}\partial_y \Bigr)^{2n+1} \Bigr\} \Bigr]^{1/2}. \]

The functional F1 corresponds to the L2 conservation law, F3 to the energy conservation, etc. Similar expressions can be written for the KP-II conservation laws. A very important difference between the KP-I and KP-II conservation laws is that Sobolev type norms are formally controlled by the conserved quantities only in the context of the KP-I equation. This fact is used in [13] to study the global well-posedness of the KP-I equation with periodic sufficiently smooth small initial data (as noticed in [6] the smallness condition can be easily removed). The argument performed in [13] uses that the antiderivative operators act continuously on L2 (T2 ). The failure of this fact in the continuous case is one of the obstructions to adapt the method of [13] for initial data defined on R2 . On the other hand, the conservation of F3 , is used in [16] to prove the existence of global weak solutions to the KP-I equation by energy methods (the uniqueness is unknown so far). Note added in proof. We have recently proven the failure of any iterative method for solving the Cauchy problem for the usual KP-I equation in R2 , and the global wellposedness of the KP-I equation in R2 by a compactness method (cf. [20, 21]). Acknowledgements. Part of this work has been done at the University of L’Aquila on an invitation of Professor V. Georgiev.

References

1. Ben-Artzi, M., Saut, J.-C.: Uniform decay estimates for a class of oscillatory integrals and applications. Diff. Int. Eq. 12, 137–145 (1999)
2. Besov, O., Ilin, V., Nikolski, S.: Integral representation of functions and embedding theorems. New York: J. Wiley, 1978
3. de Bouard, A., Saut, J.-C.: Solitary waves of generalized Kadomtsev–Petviashvili equations. Annales I.H.P., Analyse non linéaire 14, 211–236 (1997)
4. Bourgain, J.: Fourier transform restriction phenomena for certain lattice subsets and application to nonlinear evolution equations I. Schrödinger equations. GAFA 3, 107–156 (1993)
5. Bourgain, J.: Fourier transform restriction phenomena for certain lattice subsets and application to nonlinear evolution equations II. The KdV equation. GAFA 3, 209–262 (1993)
6. Bourgain, J.: On the Cauchy problem for the Kadomtsev–Petviashvili equation. GAFA 3, 315–341 (1993)
7. Colliander, J.: Personal communication
8. Colliander, J., Delort, J.M., Kenig, C., Staffilani, G.: Bilinear estimates and applications to 2D NLS. Trans. Am. Math. Soc. (to appear)
9. Ginibre, J.: Le problème de Cauchy pour des EDP semi-linéaires périodiques en variables d’espace (d’après Bourgain). Séminaire Bourbaki 796, Astérisque 237, 163–187 (1995)
10. Kenig, C., Ponce, G., Vega, L.: A bilinear estimate with applications to the KdV equations. J. AMS 9, 573–603 (1996)
11. Saut, J.-C., Tzvetkov, N.: The Cauchy problem for higher order KP equations. J. Diff. Eq. 153, 196–222 (1999)
12. Saut, J.-C., Tzvetkov, N.: The Cauchy problem for the fifth order KP equations. J. Math. Pures Appl. 307–338 (2000)
13. Schwarz, Jr., M.: Periodic solutions of Kadomtsev–Petviashvili. Adv. Math. 66, 217–233 (1987)
14. Takaoka, H., Tzvetkov, N.: On the local regularity of Kadomtsev–Petviashvili-II equation. IMRN 8, 77–114 (2001)
15. Tao, T.: Multilinear weighted convolution of L2 functions, and applications to non-linear dispersive equations. Amer. J. Math. (to appear)
16. Tom, M. M.: On a generalized Kadomtsev–Petviashvili equation. Contemp. Math. AMS 200, 193–210 (1996)
17. Tzvetkov, N.: Bilinear estimates related to the KP equations. Journées Equations aux Dérivées Partielles (La Chapelle sur Erdre, 2000), Exp. No. XIX, 12 pp., Univ. Nantes, Nantes, 2000
18. Zakharov, V.: Weakly nonlinear waves on the surface of an ideal finite depth fluid. Am. Math. Soc. Transl. 182, 167–197 (1998)
19. Zakharov, V., Schulman, E.: Degenerative dispersion laws, motion invariants and kinetic equations. Physica D 1, 192–202 (1980)
20. Molinet, L., Saut, J.-C., Tzvetkov, N.: Well-posedness and ill-posedness results for the Kadomtsev–Petviashvili-I equation. Preprint 2001
21. Molinet, L., Saut, J.-C., Tzvetkov, N.: Global well-posedness for the KP-I equation. Preprint 2001

Communicated by P. Constantin

Commun. Math. Phys. 221, 477 – 497 (2001)




$A_N$-Type Dunkl Operators and New Spin Calogero–Sutherland Models

F. Finkel, D. Gómez-Ullate, A. González-López, M. A. Rodríguez, R. Zhdanov

Departamento de Física Teórica II, Facultad de Ciencias Físicas, Universidad Complutense, 28040 Madrid, Spain

Received: 17 February 2001 / Accepted: 8 March 2001

Abstract: A new family of $A_N$-type Dunkl operators preserving a polynomial subspace of finite dimension is constructed. Using a general quadratic combination of these operators and the usual Dunkl operators, several new families of exactly and quasi-exactly solvable quantum spin Calogero–Sutherland models are obtained. These include, in particular, three families of quasi-exactly solvable elliptic spin Hamiltonians.

1. Introduction

In the early seventies, Calogero [6] and Sutherland [36, 37] introduced the celebrated exactly solvable (ES) and integrable quantum many-body problems in one dimension that bear their names. These papers had a profound impact on the whole physics community, as reflected by the vast amount of literature devoted to the study of the mathematical properties and applications of these models. Among the most recent ones we could mention soliton theory [26, 32], orthogonal polynomials [27, 1, 13], fractional statistics and anyons [8], random matrix theory [39], and Yang–Mills theories [20, 11], to name only a few. Later on, Olshanetsky and Perelomov [29] explained the integrability of the original Calogero–Sutherland (CS) models by relating them to the root system of $A_N$ type. These authors then constructed new families of integrable many-body Hamiltonians associated with all the other root systems. Furthermore, they showed that the most general interaction potential for these models is proportional to the Weierstrass $\wp$ function.

A considerable effort has been devoted over the last decade to the extension of CS models to particles with spin. These models are a step forward towards the unification of the CS scalar models and integrable spin chains, like the Haldane–Shastry model [22, 34]. Several different techniques have been used to construct spin counterparts of the scalar CS models, including the exchange operator method [31], the Dunkl operators formalism [3, 9], the supersymmetric approach [5], and reduction by discrete symmetries [33].

Corresponding author.

On leave of absence from Institute of Mathematics, 3 Tereshchenkivska St., 01601 Kyiv-4, Ukraine


In the quantum case, only the rational and trigonometric (or hyperbolic) spin CS models have been constructed, both in their $A_N$ [31, 23, 3, 4, 2, 25, 38, 42] and $BC_N$ [44] versions. In the $A_N$ case, the integrability and the exact-solvability of these models both follow from the fact that the Hamiltonian is related to a quadratic combination of some family of Dunkl operators. The Dunkl operators
\[ T_i = \frac{\partial}{\partial z_i} + a \sum_{j \ne i} \frac{1}{z_i - z_j} (1 - K_{ij}), \qquad i = 1, \dots, N, \tag{1} \]
were originally introduced in [12] in connection with the theory of orthogonal polynomials associated with finite reflection groups. In the latter expression, $a$ is an arbitrary real parameter and the sum runs over $j = 1, \dots, i - 1, i + 1, \dots, N$. The permutation operators $K_{ij} = K_{ji}$ act on an arbitrary function $f(z)$, with $z = (z_1, \dots, z_N) \in \mathbb{R}^N$, as
\[ (K_{ij} f)(z_1, \dots, z_i, \dots, z_j, \dots, z_N) = f(z_1, \dots, z_j, \dots, z_i, \dots, z_N). \tag{2} \]
Using the relations
\[ K_{ij}^2 = 1, \qquad K_{ij} K_{jk} = K_{ik} K_{ij} = K_{jk} K_{ik}, \qquad K_{ij} K_{kl} = K_{kl} K_{ij}, \tag{3} \]

where $i, j, k, l$ take different values in the range $1, \dots, N$, one can establish the commutativity of the operators (1) and prove that $T_i$, $K_{jk}$, $i, j, k = 1, \dots, N$, span a realization of a degenerate Hecke algebra (see [9] for more details). Since the rational spin CS Hamiltonian is related to a polynomial in the Dunkl operators (1), these operators yield a complete set of commuting integrals of motion. In addition, the spectrum of the Hamiltonian follows immediately from that of the Dunkl operators, which can be easily computed [3, 2]. The previous considerations also apply to the operators
\[ \tilde{T}_i = z_i \frac{\partial}{\partial z_i} + a \sum_{j < i} \frac{z_i}{z_i - z_j} (1 - K_{ij}) + a \sum_{j > i} \frac{z_j}{z_i - z_j} (1 - K_{ij}) + 1 - i, \qquad i = 1, \dots, N, \tag{4} \]
introduced by Cherednik [9] in connection with the trigonometric spin CS model. In other words, the operators $\tilde{T}_i$ commute, have an easily computable spectrum, and can be used to obtain a complete set of integrals of motion and the spectrum of the Hamiltonian. It has become customary in the literature to refer to both families of operators $T_i$ and $\tilde{T}_i$ as Dunkl operators.

Recently, some partially solvable deformations of the scalar CS models with an external potential have been proposed [16, 24, 28]. For these models, in contrast to the CS models listed in [29], only a finite-dimensional subset of the spectrum can be computed algebraically. Following Turbiner and Ushveridze [40, 41], we shall use the term quasi-exactly solvable (QES) to refer to this type of models; see also the reviews [19, 35, 43]. In all these models, the Hamiltonian can be expressed as a quadratic combination of the generators of a realization of $sl(N + 1)$ by first-order differential operators preserving a finite-dimensional space of smooth functions. The action of the Hamiltonian in this space can thus be represented by a finite-dimensional constant matrix. The eigenvalues of this matrix belong to the spectrum of the Hamiltonian provided the corresponding eigenfunctions satisfy some appropriate boundary conditions. The Lie algebra $sl(N + 1)$ is usually referred to as a hidden symmetry algebra of the Hamiltonian in these models.
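The commutativity of the operators (1) can be tested directly on polynomials. The sketch below is an illustration, not the authors' construction: it stores a polynomial as a dictionary from exponent tuples to coefficients and uses the exact monomial formula for $(f - K_{ij}f)/(z_i - z_j)$. The choices $N = 3$, $a = 2/7$, the test polynomial and all function names are assumptions made for the example.

```python
from fractions import Fraction

N = 3               # number of variables (kept small for illustration)
A = Fraction(2, 7)  # sample value of the real parameter a


def add_term(poly, expo, coeff):
    """Add coeff * z^expo to poly in place, dropping zero coefficients."""
    total = poly.get(expo, 0) + coeff
    if total == 0:
        poly.pop(expo, None)
    else:
        poly[expo] = total


def divided_difference(poly, i, j):
    """Return (f - K_ij f) / (z_i - z_j), computed monomial by monomial."""
    out = {}
    for expo, coeff in poly.items():
        p, q = expo[i], expo[j]
        if p == q:
            continue  # the monomial is symmetric in z_i, z_j
        lo, hi = min(p, q), max(p, q)
        sign = 1 if p > q else -1
        # (z_i^p z_j^q - z_i^q z_j^p) / (z_i - z_j)
        #   = sign * sum_{k=0}^{hi-lo-1} z_i^(hi-1-k) z_j^(lo+k)
        for k in range(hi - lo):
            e = list(expo)
            e[i] = hi - 1 - k
            e[j] = lo + k
            add_term(out, tuple(e), sign * coeff)
    return out


def dunkl(poly, i):
    """Apply T_i = d/dz_i + a * sum_{j != i} (1 - K_ij) / (z_i - z_j)."""
    out = {}
    for expo, coeff in poly.items():
        if expo[i] > 0:
            e = list(expo)
            e[i] -= 1
            add_term(out, tuple(e), expo[i] * coeff)
    for j in range(N):
        if j != i:
            for expo, coeff in divided_difference(poly, i, j).items():
                add_term(out, expo, A * coeff)
    return out


def max_exponent(poly):
    """Largest single-variable exponent, i.e. the smallest m with poly in R_m."""
    return max((max(e) for e in poly), default=0)


def total_degree(poly):
    """Largest total degree, i.e. the smallest n with poly in P_n."""
    return max((sum(e) for e in poly), default=0)


f = {(3, 1, 0): 1, (0, 2, 2): 2, (1, 1, 3): -5, (4, 0, 1): 7}
commutators_vanish = all(
    dunkl(dunkl(f, i), j) == dunkl(dunkl(f, j), i)
    for i in range(N) for j in range(N)
)
print(commutators_vanish)
```

The helpers `max_exponent` and `total_degree` make it easy to observe on examples that applying $T_i$ does not raise either quantity.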


In this paper, we propose a general procedure for constructing (Q)ES spin CS models, close in spirit to the hidden symmetry algebra approach to scalar QES models. The starting point of our construction is the well-known fact that the two standard families of Dunkl operators (1) and (4) admit an infinite sequence of invariant polynomial subspaces of finite dimension. One of the main novelties in our approach consists in the introduction of a new family of commuting Dunkl operators which, together with the other two families (1) and (4), is shown to preserve a single polynomial subspace of finite dimension. We then prove that certain quadratic combinations involving all three families of Dunkl operators always yield a spin CS Hamiltonian. The QES character of these Hamiltonians follows immediately from the fact that the Dunkl operators admit a finite-dimensional invariant subspace. Moreover, if the original quadratic combination does not involve the new family of Dunkl operators, the resulting Hamiltonian preserves an infinite sequence of finite-dimensional subspaces of smooth functions, which we shall take as the definition of exact solvability. The linear space spanned by all three types of Dunkl operators is then shown to be invariant under the projective action of the group GL(2, R). We make use of this fact to perform a complete classification of the resulting (Q)ES spin CS models. All the previously known exactly-solvable spin CS models of AN type appear as particular cases, arising from a quadratic combination of a single type (either (1) or (4)) of Dunkl operators. In addition, we obtain many new spin CS models, both exactly and quasi-exactly solvable. These include, in particular, several elliptic QES spin CS models. To the best of our knowledge, these are the first examples of solvable quantum spin CS models involving elliptic functions. 2. 
A New Family of $A_N$-Type Dunkl Operators

In this section we shall define a third family of Dunkl operators that preserve certain finite-dimensional polynomial subspaces. These three families shall be used in the following sections to construct exactly and quasi-exactly solvable spin many-body Hamiltonians. Let us begin by introducing the polynomial subspaces

\[ R_m(z) = \operatorname{span}\Big\{ \prod_{i=1}^N z_i^{l_i} \;:\; l_i \le m,\quad i = 1,\dots,N \Big\}, \tag{5} \]

\[ P_n(z) = \operatorname{span}\Big\{ \prod_{i=1}^N z_i^{l_i} \;:\; \sum_{i=1}^N l_i \le n \Big\}, \tag{6} \]

which shall be referred to as the rectangular and triangular modules, respectively, by analogy with the two-particle case [15]. A well-known property of the Dunkl operators (1), (4) is the fact that they preserve the triangular module $P_n(z)$ for arbitrary $n$. Seemingly less known, but central to our construction, is the fact that they preserve the rectangular module $R_m(z)$ for arbitrary $m$ as well. On the other hand, the differential parts of the Dunkl operators (1), (4), together with the differential operator $z_i^2\,\partial_{z_i}$, span a realization of sl(2). Inspired by this fact, it is natural to suggest the following ansatz for a third set of Dunkl operators:

\[ J_i = z_i^2\,\frac{\partial}{\partial z_i} - m z_i + \sum_{j\ne i} f_{ij}(z)\,(1 - K_{ij}), \qquad i = 1,\dots,N, \]

where $m$ is an arbitrary non-negative integer and $f_{ij}(z)$ is a function anti-symmetric in $i, j$. This new family does not preserve the module $P_n(z)$, but for a suitable choice of the functions $f_{ij}(z)$ it will be shown to preserve the module $R_m(z)$. To this end, let us define the operators

\[
\begin{aligned}
Q_i^- &= a \sum_{j\ne i} \frac{1}{z_i - z_j}\,(1 - K_{ij}), \\
Q_i^0 &= \frac{a}{2} \sum_{j\ne i} \frac{z_i + z_j}{z_i - z_j}\,(1 - K_{ij}), \\
Q_i^+ &= a \sum_{j\ne i} \frac{z_i z_j}{z_i - z_j}\,(1 - K_{ij}),
\end{aligned} \tag{7}
\]

where $a$ is a real parameter and the sum runs over $j = 1, \dots, i-1, i+1, \dots, N$. The following lemma will be important in the sequel:

Lemma 1. For any non-negative integer $n$, the rectangular module $R_n(z)$ is invariant under the action of the operators $Q_i^-$, $Q_i^0$ and $Q_i^+$. The triangular module $P_n(z)$ is invariant only under the action of the operators $Q_i^-$ and $Q_i^0$.

Proof. It suffices to prove that the inclusions

\[
\begin{aligned}
\frac{h_{ij}^\epsilon}{z_i - z_j}\,(1 - K_{ij})\, R_n(z) &\subset R_n(z), \qquad \epsilon = \pm, 0, \\
\frac{h_{ij}^\epsilon}{z_i - z_j}\,(1 - K_{ij})\, P_n(z) &\subset P_n(z), \qquad \epsilon = -, 0,
\end{aligned}
\]

hold for any pair of indices $i \ne j$, where

\[ h_{ij}^- = 1, \qquad h_{ij}^0 = z_i + z_j, \qquad h_{ij}^+ = z_i z_j. \]

The action of these operators on an arbitrary monomial,

\[ \frac{h_{ij}^\epsilon}{z_i - z_j}\,(1 - K_{ij}) \prod_{k=1}^N z_k^{l_k} = h_{ij}^\epsilon\, \frac{z_i^{l_i} z_j^{l_j} - z_i^{l_j} z_j^{l_i}}{z_i - z_j} \prod_{k \ne i,j} z_k^{l_k}, \]

yields a polynomial, since $z_i^{l_i} z_j^{l_j} - z_i^{l_j} z_j^{l_i}$ is a multiple of $z_i - z_j$. The homogeneous degree of this polynomial is either 0 (if $l_i = l_j$) or $\epsilon + \sum_i l_i$ (if $l_i \ne l_j$). Therefore, if the original monomial belongs to $P_n(z)$ the resulting polynomial is also in $P_n(z)$ for $\epsilon = -, 0$, but lies outside $P_n(z)$ for $\epsilon = +$. On the other hand, the degrees of the variables $z_k$ in the resulting polynomial remain equal to $l_k$ if $k \ne i, j$, while the degrees of $z_i$ and $z_j$ satisfy

\[ \deg(z_k) \le \max(l_i, l_j) - 1 + d_\epsilon, \qquad k = i, j, \]

where

\[ d_\epsilon = \begin{cases} 0 & \text{if } \epsilon = -, \\ 1 & \text{if } \epsilon = 0, +. \end{cases} \]

Therefore, if the original monomial belongs to the space $R_n(z)$, so does the resulting polynomial in all three cases $\epsilon = \pm, 0$.
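The degree counting in this proof can be replayed mechanically on two-variable monomials. The following is only an illustrative sketch: the helper names (`divided_difference`, `q_term`, and so on) are hypothetical and not part of the paper, and only the exponents of $z_i$ and $z_j$ are tracked, since the other variables are untouched.

```python
def divided_difference(li, lj):
    """Monomials of (zi**li * zj**lj - zi**lj * zj**li) / (zi - zj).

    Returns {(p, q): coefficient}, meaning coefficient * zi**p * zj**q.
    """
    lo, hi = min(li, lj), max(li, lj)
    sign = 1 if li > lj else -1
    # (zi*zj)**lo * (zi**d - zj**d)/(zi - zj), d = hi - lo, is a sum of d monomials
    return {(hi - 1 - r, lo + r): sign for r in range(hi - lo)}


# multiplication by h^- = 1, h^0 = zi + zj, h^+ = zi*zj, written as exponent shifts
SHIFTS = {"-": [(0, 0)], "0": [(1, 0), (0, 1)], "+": [(1, 1)]}


def q_term(li, lj, eps):
    """h^eps / (zi - zj) * (1 - K_ij) applied to zi**li * zj**lj."""
    out = {}
    for (p, q), c in divided_difference(li, lj).items():
        for dp, dq in SHIFTS[eps]:
            key = (p + dp, q + dq)
            out[key] = out.get(key, 0) + c
    return out


def max_partial_degree(poly):
    return max((max(p, q) for p, q in poly), default=0)


def total_degree(poly):
    return max((p + q for p, q in poly), default=0)


n = 4
rect_ok = all(max_partial_degree(q_term(li, lj, eps)) <= n
              for li in range(n + 1) for lj in range(n + 1) for eps in "-0+")
print("rectangular module preserved on all test monomials:", rect_ok)
print("total degrees reached from zi^3 zj:",
      {eps: total_degree(q_term(3, 1, eps)) for eps in "-0+"})
```

For the monomial $z_i^3 z_j$ of total degree 4, the case $\epsilon = +$ raises the total degree by one, which is exactly why the triangular module is not preserved by $Q_i^+$.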


The following three sets of Dunkl operators shall be the building blocks for the construction of several new (quasi-)exactly solvable spin CS models:

\[
\begin{aligned}
J_i^- &= \frac{\partial}{\partial z_i} + a \sum_{j\ne i} \frac{1}{z_i - z_j}\,(1 - K_{ij}), \\
J_i^0 &= z_i \frac{\partial}{\partial z_i} - \frac{m}{2} + \frac{a}{2} \sum_{j\ne i} \frac{z_i + z_j}{z_i - z_j}\,(1 - K_{ij}), \\
J_i^+ &= z_i^2 \frac{\partial}{\partial z_i} - m z_i + a \sum_{j\ne i} \frac{z_i z_j}{z_i - z_j}\,(1 - K_{ij}),
\end{aligned} \tag{8}
\]

where $i = 1, \dots, N$, $a$ is a real parameter, and $m$ is a non-negative integer. Note that the operators $J_i^-$ coincide exactly with the Dunkl operators (1), while the operators $J_i^0$ differ from the Cherednik operators (4) by a linear combination with constant coefficients of the permutation operators $K_{ij}$, namely

\[ \tilde T_i = J_i^0 + \frac{a}{2} \sum_{j<i} (1 - K_{ij}) - \frac{a}{2} \sum_{j>i} (1 - K_{ij}) + \frac{m}{2} + 1 - i. \tag{9} \]

The operators $J_i^+$ are, to the best of our knowledge, new. The operators $J_i^\epsilon$ and $K_{ij}$ obey the following commutation relations:

\[ [J_i^\pm, J_j^\pm] = 0, \qquad [J_i^0, J_j^0] = \frac{a^2}{4} \sum_{k \ne i,j} K_{ij}\,(K_{jk} - K_{ik}), \tag{10} \]

\[ [K_{ij}, J_k^\epsilon] = 0, \qquad K_{ij} J_i^\epsilon = J_j^\epsilon K_{ij}, \tag{11} \]

where $\epsilon = \pm, 0$ and the indices $i, j, k$ are all different. The set of operators $J_i^\epsilon$, $K_{ij}$ : $i, j = 1, \dots, N$, $\epsilon = \pm, 0$, spans a degenerate affine Hecke algebra, see [9]. This is clear for $\epsilon = \pm$, while for $\epsilon = 0$ it follows from (9) and the commutativity of the Cherednik operators (4). The key property in our construction of (quasi-)exactly solvable spin CS models is the fact that the operators (8) possess invariant polynomial subspaces.

Theorem 1. The operators $J_i^-$ and $J_i^0$ preserve the modules $P_n(z)$ and $R_n(z)$ for an arbitrary non-negative integer $n$. The operators $J_i^+$ preserve the module $R_m(z)$, but do not preserve the modules $P_n(z)$ and $R_k(z)$ for $k \ne m$.

Proof. The statement follows from Lemma 1 and the fact that the differential parts of $J_i^-$ and $J_i^0$ preserve the modules $P_n(z)$ and $R_n(z)$ for any non-negative integer $n$, whereas the differential part of $J_i^+$ preserves the module $R_k(z)$ only for $k = m$.

The following corollary is an immediate consequence of Theorem 1:

Corollary 1. Any polynomial in the operators $J_i^\epsilon$ leaves invariant the rectangular module $R_m(z)$. In addition, if the polynomial does not depend on $J_i^+$, it preserves the modules $R_n(z)$ and $P_n(z)$ for all $n$.
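As a sanity check on Eqs. (8) and (10) and on Theorem 1, the operators can be applied to explicit polynomials with exact rational arithmetic. This is a minimal sketch under stated assumptions: the values $N = 3$, $m = 2$, $a = 1/3$ are arbitrary illustrative choices, polynomials are stored as dictionaries from exponent tuples to coefficients, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

N, m, a = 3, 2, Fraction(1, 3)  # illustrative values, not taken from the paper


def add(f, g, c=1):
    """Return f + c*g for polynomials stored as {exponent tuple: coefficient}."""
    out = dict(f)
    for e, v in g.items():
        out[e] = out.get(e, 0) + c * v
    return {e: v for e, v in out.items() if v != 0}


def shift(f, i, k=1):
    """Multiply f by z_i**k."""
    return {e[:i] + (e[i] + k,) + e[i + 1:]: v for e, v in f.items()}


def deriv(f, i):
    """Partial derivative with respect to z_i."""
    return {e[:i] + (e[i] - 1,) + e[i + 1:]: e[i] * v
            for e, v in f.items() if e[i] > 0}


def div_diff(f, i, j):
    """(1 - K_ij) f / (z_i - z_j), computed monomial by monomial."""
    out = {}
    for e, v in f.items():
        lo, hi = min(e[i], e[j]), max(e[i], e[j])
        sign = 1 if e[i] > e[j] else -1
        for r in range(hi - lo):
            new = list(e)
            new[i], new[j] = hi - 1 - r, lo + r
            out[tuple(new)] = out.get(tuple(new), 0) + sign * v
    return {e: v for e, v in out.items() if v != 0}


def J(eps, i, f):
    """Apply J_i^eps of Eq. (8); eps is '-', '0' or '+'."""
    if eps == "-":
        out = deriv(f, i)
    elif eps == "0":
        out = add(shift(deriv(f, i), i), f, -Fraction(m, 2))
    else:
        out = add(shift(deriv(f, i), i, 2), shift(f, i), -m)
    for j in range(N):
        if j == i:
            continue
        d = div_diff(f, i, j)
        if eps == "-":
            out = add(out, d, a)
        elif eps == "0":
            out = add(out, add(shift(d, i), shift(d, j)), a / 2)
        else:
            out = add(out, shift(shift(d, i), j), a)
    return out


def commutator(eps, i, j, f):
    return add(J(eps, i, J(eps, j, f)), J(eps, j, J(eps, i, f)), -1)


def in_rect(f, k):
    """True if every monomial of f has all partial degrees <= k."""
    return all(max(e) <= k for e in f)


f = {(2, 1, 0): Fraction(1), (0, 3, 1): Fraction(2)}
print("[J0+, J1+] f == 0:", commutator("+", 0, 1, f) == {})
print("[J0-, J2-] f == 0:", commutator("-", 0, 2, f) == {})
print("R_m invariant:", all(in_rect(J(eps, i, {e: Fraction(1)}), m)
                            for e in product(range(m + 1), repeat=N)
                            for eps in "-0+" for i in range(N)))
```

Applying `J("+", 0, ...)` to $z_1^3$ produces a $z_1^4$ term, which illustrates the statement that $R_k(z)$ with $k \ne m$ is not preserved by $J_i^+$.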


3. Construction of Spin Calogero–Sutherland Models

In the previous section we have introduced a new set of Dunkl operators preserving the space of polynomials $R_m$. Here we shall make use of all three sets of Dunkl operators (8) to construct some multi-parameter families of spin CS models. Consider the spin permutation operators $S_{ij}$, $i, j = 1, \dots, N$, whose action on a spin state $|s_1, \dots, s_N\rangle$, $-M \le s_i \le M$, with $M \in \tfrac12\mathbb{N}$, is given by

\[ S_{ij}\,|s_1, \dots, s_i, \dots, s_j, \dots, s_N\rangle = |s_1, \dots, s_j, \dots, s_i, \dots, s_N\rangle. \tag{12} \]

Note that the operators $S_{ij}$ obey the identities (3) with $K_{ij}$ replaced by $S_{ij}$. Let $\mathcal{S}$ denote the linear space $\operatorname{span}\{|s_1, \dots, s_N\rangle\}_{-M \le s_i \le M}$. The action of the operators $S_{ij}$ in $\mathcal{S}$ is thus represented by $(2M+1)^N$-dimensional symmetric matrices. The starting point of our procedure is the following quadratic combination of the Dunkl operators (8):

\[
\begin{aligned}
-H^* = \sum_i \Big[\, & c_{++} (J_i^+)^2 + c_{00} (J_i^0)^2 + c_{--} (J_i^-)^2 + \frac{c_{0+}}{2}\,\{J_i^0, J_i^+\} \\
& + \frac{c_{0-}}{2}\,\{J_i^0, J_i^-\} + c_+ J_i^+ + c_0 J_i^0 + c_- J_i^- \Big],
\end{aligned} \tag{13}
\]

where $c_{\epsilon\epsilon'}$, $c_\epsilon$, $\epsilon, \epsilon' = \pm, 0$, are arbitrary real constants. The term $\tfrac12 \sum_i \{J_i^-, J_i^+\}$ differs from $\sum_i (J_i^0)^2$ by a constant operator (see Appendix), and for this reason it has not been included in (13). We emphasize that only the particular cases of (13)

\[ -H^* = c_{00} \sum_i (J_i^0)^2, \qquad -H^* = c_{--} \sum_i (J_i^-)^2 \]


have been previously discussed in the literature in connection with CS models; see [3, 9, 13, 31, 44] and references therein. As is customary, we shall identify $K_{ij}$, $S_{ij}$, and $H^*$ with their natural extensions $K_{ij} \otimes 1\!\!\mathrm{I}$, $1\!\!\mathrm{I} \otimes S_{ij}$, and $H^* \otimes 1\!\!\mathrm{I}$ to the tensor product $\mathbb{C}[z_1, \dots, z_N] \otimes \mathcal{S}$. The following lemma is an immediate consequence of Eq. (11).

Lemma 2. The (Q)ES differential-difference operator $H^*$ commutes with $K_{ij}$ and $S_{ij}$ for all $i, j = 1, \dots, N$.

This property plays a crucial role in the construction of spin CS models; see, for instance, Ref. [2]. Let $\Lambda$ be the projection operator on states antisymmetric under the simultaneous interchange of any two particles' coordinates and spins. In terms of the total permutation operators $\Pi_{ij} = K_{ij} S_{ij}$, the operator $\Lambda$ can be alternatively defined by the relations $\Pi_{ij} \Lambda = -\Lambda$, $j > i = 1, \dots, N$. Since $K_{ij}^2 = 1$, these relations are equivalent to

\[ K_{ij} \Lambda = -S_{ij} \Lambda, \qquad j > i = 1, \dots, N. \tag{14} \]

For the lowest values of $N$ the antisymmetrizer $\Lambda$ is given by

\[
\begin{aligned}
N = 2: \quad & \Lambda = 1 - \Pi_{12}, \\
N = 3: \quad & \Lambda = 1 - \Pi_{12} - \Pi_{13} - \Pi_{23} + \Pi_{12}\Pi_{13} + \Pi_{12}\Pi_{23}.
\end{aligned}
\]
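The $N = 3$ expression can be checked in the group algebra of the symmetric group, treating each $\Pi_{ij}$ abstractly as the transposition of particles $i$ and $j$. The sketch below is an assumption-laden illustration (permutations stored as tuples of images, zero-based particle labels, hypothetical helper names); it confirms that the formula is the full signed sum over permutations and that $\Pi_{ij}\Lambda = -\Lambda$.

```python
from itertools import permutations


def compose(p, q):
    """Permutation p∘q (apply q first, then p); permutations are tuples of images."""
    return tuple(p[q[k]] for k in range(len(p)))


def transposition(n, i, j):
    t = list(range(n))
    t[i], t[j] = t[j], t[i]
    return tuple(t)


def sign(p):
    inversions = sum(1 for x in range(len(p)) for y in range(x + 1, len(p)) if p[x] > p[y])
    return -1 if inversions % 2 else 1


def mul(A, B):
    """Product in the group algebra; elements are {permutation: coefficient}."""
    out = {}
    for p, cp in A.items():
        for q, cq in B.items():
            r = compose(p, q)
            out[r] = out.get(r, 0) + cp * cq
    return {k: v for k, v in out.items() if v != 0}


identity = (0, 1, 2)
t12, t13, t23 = transposition(3, 0, 1), transposition(3, 0, 2), transposition(3, 1, 2)

# Lambda for N = 3, copied term by term from the displayed formula
lam3 = {identity: 1, t12: -1, t13: -1, t23: -1,
        compose(t12, t13): 1, compose(t12, t23): 1}

signed_sum = {p: sign(p) for p in permutations(range(3))}
print("matches signed sum over S_3:", lam3 == signed_sum)
print("Pi_ij Lambda = -Lambda:",
      all(mul({t: 1}, lam3) == {p: -c for p, c in lam3.items()} for t in (t12, t13, t23)))
```

With this normalization one finds $\Lambda^2 = N!\,\Lambda$, so a factor $1/N!$ would be needed for $\Lambda$ to be a projector in the strict sense; this does not affect the eigenfunction argument, which only uses $\Pi_{ij}\Lambda = -\Lambda$.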


In general, $\Lambda$ is an $(N-1)$-th degree polynomial in the total permutation operators $\Pi_{ij}$. It thus follows from Lemma 2 that $H^*$ commutes with $\Lambda$. Suppose that $f(z)$ is an eigenfunction of $H^*$ with eigenvalue $\lambda$. For instance, $f$ could be one of the polynomial eigenfunctions that $H^*$ is guaranteed to possess in $R_m$. Given any (constant) spin state $|\sigma\rangle \in \mathcal{S}$, the spin function $\varphi = \Lambda[f(z)|\sigma\rangle]$ is also an eigenfunction of $H^*$ with the same eigenvalue $\lambda$. Next, we introduce the matrix differential operator $\overline{H}$ obtained from $H^*$ by the formal substitutions $K_{ij} \to -S_{ij}$, $i, j = 1, \dots, N$. The relations (14) imply that $\varphi$ is a spin eigenfunction of $\overline{H}$ with eigenvalue $\lambda$. Using the formulae (A1)–(A6) for the sums of the squares and the anticommutators of the Dunkl operators (8) given in the Appendix, we obtain the following explicit expression for $\overline{H}$:

\[
\begin{aligned}
-\overline{H} ={}& \sum_i \Big[ P(z_i)\,\partial_{z_i}^2 + \tilde Q(z_i)\,\partial_{z_i} + R(z_i) \Big] + 2a\,c_{++}(1 - m) \sum_{i<j} z_i z_j \\
& + 2a \sum_{i<j} \frac{1}{z_i - z_j} \Big[ P(z_i)\,\partial_{z_i} - P(z_j)\,\partial_{z_j} \Big] - a \sum_{i<j} \frac{P(z_i) + P(z_j)}{(z_i - z_j)^2}\,(1 + S_{ij}) \\
& + a \sum_{i<j} \Big[ c_{++}(z_i + z_j)^2 + c_{0+}(z_i + z_j) + c_{00} \Big] S_{ij} + \frac{a^2}{12}\,c_{00} \sum_{i,j,k}{}' \big(1 - S_{ij} S_{ik}\big),
\end{aligned} \tag{15}
\]

where

\[
\begin{aligned}
P(z) &= c_{++} z^4 + c_{0+} z^3 + c_{00} z^2 + c_{0-} z + c_{--}, \\
\tilde Q(z) &= Q(z) + \Big(1 - \frac{b}{2}\Big) P'(z), \qquad Q(z) = c_+ z^2 + c_0 z + c_-, \qquad b = 1 + m + a(N - 1), \\
R(z) &= c_{++}\big[b + m(m-2) - 1\big] z^2 + \Big\{ c_{0+}\Big[\Big(1 - \frac{m}{2}\Big) b + m(m-1) - 1\Big] - m c_+ \Big\} z \\
&\quad + \frac{c_{00}}{4}\big[2(b-1) + m(m-2)\big] - \frac{m}{2}\,c_0,
\end{aligned} \tag{16}
\]

and $\sum'_{i,j,k}$ denotes summation in $i, j, k$ with $i \ne j \ne k \ne i$.

In the final step, one performs a gauge transformation with a suitable scalar function $\mu(z)$, followed by a change of variables $z = \zeta(x)$, $x = (x_1, \dots, x_N)$,

\[ H = \mu\, \overline{H}\, \mu^{-1} \Big|_{z = \zeta(x)}, \tag{17} \]

in order to reduce the gauge spin Hamiltonian $\overline{H}$ to the Schrödinger form

\[ H = -\sum_i \partial_{x_i}^2 + V(x), \tag{18} \]

where $V(x)$ is a Hermitian matrix-valued function.

Proposition 1. The operator $\overline{H}$ in Eq. (15) can be reduced to a matrix Schrödinger operator (18) by a change of variables $z = \zeta(x)$ and conjugation by a scalar gauge factor $\mu(z)$.


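Proof. The gauge transformation (17) with gauge factor

\[ \mu(z) = \prod_{i<j} (z_i - z_j)^a \prod_{i=1}^N P(z_i)^{-1/4} \exp \int^{z_i} \frac{\tilde Q(y)}{2 P(y)}\, dy, \tag{19} \]

together with the change of variables

\[ x_i = \zeta^{-1}(z_i) = \int^{z_i} \frac{dy}{\sqrt{P(y)}}, \qquad i = 1, \dots, N, \tag{20} \]

map the gauge spin Hamiltonian $\overline{H}$ to a matrix Schrödinger operator (18), with potential

\[
\begin{aligned}
V(x) = \bigg[\, & a^2 \sum_{i,j,k}{}' \frac{P(z_i)}{(z_i - z_j)(z_i - z_k)} + a \sum_{i \ne j} \frac{\tilde Q(z_i)}{z_i - z_j} + a \sum_{i<j} \frac{P(z_i) + P(z_j)}{(z_i - z_j)^2}\,(a + S_{ij}) \\
& - a \sum_{i<j} \Big[ c_{++}(z_i + z_j)^2 + c_{0+}(z_i + z_j) + c_{00} \Big] S_{ij} - 2a\,c_{++}(1 - m) \sum_{i<j} z_i z_j \\
& + \frac14 \sum_i \Big\{ \frac{1}{P(z_i)} \Big[ \frac34 P'(z_i)^2 + \tilde Q(z_i)^2 - 2 \tilde Q(z_i) P'(z_i) \Big] - P''(z_i) + 2 \tilde Q'(z_i) - 4 R(z_i) \Big\} \\
& - \frac{a^2}{12}\, c_{00} \sum_{i,j,k}{}' \big(1 - S_{ij} S_{ik}\big) \bigg]_{z = \zeta(x)}.
\end{aligned} \tag{21}
\]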
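For orientation, the single-particle terms in (21) can be recovered by a short direct computation. The block below is a sketch for $N = 1$ only (so the many-body sums are absent); the auxiliary function $A$ is introduced here purely for the calculation and does not appear in the paper.

```latex
% One particle: -\overline{H} = P\,\partial_z^2 + \tilde Q\,\partial_z + R, with dz/dx = \sqrt{P}.
\begin{aligned}
\partial_x^2 &= P\,\partial_z^2 + \tfrac12 P'\,\partial_z
\quad\Longrightarrow\quad
-\overline{H} = \partial_x^2 + 2A\,\partial_x + R,
\qquad A = \frac{\tilde Q - \tfrac12 P'}{2\sqrt{P}}, \\
\mu &= \exp\!\int\! A\,dx = P^{-1/4}\exp\!\int^{z}\!\frac{\tilde Q(y)}{2P(y)}\,dy,
\qquad \partial_x + A = \mu^{-1}\,\partial_x\,\mu, \\
\mu\,\overline{H}\,\mu^{-1} &= -\partial_x^2 + A^2 + \frac{dA}{dx} - R, \\
A^2 + \frac{dA}{dx} - R
&= \frac14\Big\{\frac1P\Big[\tfrac34 P'^2 + \tilde Q^2 - 2\tilde Q P'\Big] - P'' + 2\tilde Q' - 4R\Big\}.
\end{aligned}
```

The last line uses $dA/dx = \sqrt{P}\,dA/dz$ and reproduces the summand of the single-particle sum in (21); the gauge factor found in the second line is the $N = 1$ case of (19).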

Remark 1. The gauge factor $\mu(z)$ in (19) was introduced in [16] in connection with a generalization of the theory of QES models with one degree of freedom to many-body problems. The existence of a (matrix or scalar) gauge factor and change of coordinates reducing a given matrix second-order differential operator in $N > 1$ variables to a matrix Schrödinger operator (18) is not guaranteed a priori. In the scalar case this problem was first addressed by Cotton [10], while the matrix case has been studied recently in Ref. [14]. In fact, the quadratic combination $H^*$ has been chosen so that a scalar gauge factor and change of variables can be easily found for $\overline{H}$. For instance, the term $\sum_i [J_i^+, J_i^-]$ has been discarded because it involves first-order derivatives with matrix-valued coefficients, which are usually very difficult to gauge away.

Remark 2. The change of variables (20), and hence the potential $V(x)$, are defined up to an arbitrary translation for each coordinate $x_i$, $i = 1, \dots, N$. We shall see that this arbitrariness can be removed in some cases by requiring the potential to be invariant under sign reversals of any coordinate $x_i$.

If $\varphi(z)$ is one of the eigenfunctions of $\overline{H}$ with eigenvalue $\lambda$ constructed above, the spin function $\psi(x) = \mu\big(\zeta(x)\big)\, \varphi\big(\zeta(x)\big)$ is clearly an eigenfunction of $H$ with the same eigenvalue. Note, however, that we have not imposed so far any boundary conditions on the eigenfunctions $\psi$. In general, the parameters $a$, $c_{\epsilon\epsilon'}$, and $c_\epsilon$ defining $H^*$ should satisfy certain constraints in order to ensure that the appropriate boundary conditions are satisfied. The following proposition is an immediate consequence of the previous considerations:

Proposition 2. The spin Schrödinger operator (18) with potential (21) leaves invariant the module

\[ \mathcal{M}_m = \mu\big(\zeta(x)\big)\, \Lambda \big( R_m\big(\zeta(x)\big) \otimes \mathcal{S} \big). \tag{22} \]

In addition, if $c_{++} = c_{0+} = c_+ = 0$, it preserves the modules $\mathcal{M}_n$ and

\[ \mathcal{N}_n = \mu\big(\zeta(x)\big)\, \Lambda \big( P_n\big(\zeta(x)\big) \otimes \mathcal{S} \big), \tag{23} \]

for any non-negative integer $n$.

If we add a constant term

\[ V_0 = \gamma_0 + \gamma_1 \sum_{i<j} S_{ij} + \gamma_2 \sum_{i,j,k}{}' S_{ij} S_{ik}, \qquad \gamma_i \in \mathbb{R}, \tag{24} \]

to the potential (21), the previous procedure for constructing eigenfunctions of $H$ still applies to $H + V_0$. Indeed, the associated operator $(H + V_0)^*$ is obtained by adding the term

\[ V_0^* = \gamma_0 - \gamma_1 \sum_{i<j} K_{ij} + \gamma_2 \sum_{i,j,k}{}' K_{ij} K_{ik} \]

to the initial operator $H^*$. Our assertion follows from the fact that $V_0^*$ preserves the modules $P_n$ and $R_n$ for all $n$, commutes with all the permutation operators $K_{ij}$, and acts trivially on $\mathcal{S}$. This observation will be used in what follows to simplify the formula for $V(x)$ by dropping terms of the form (24).

4. Classification of Spin Calogero–Sutherland Models

We have seen in Sect. 3 that any quadratic combination of the form (13) yields a spin CS model for which a number of eigenvalues and their corresponding eigenfunctions can be computed in an algebraic fashion. In this section we shall obtain a complete classification of the potentials constructed in this way. The form of the potential $V(x)$ in Eq. (21) depends on the choice of parameters $c_{\epsilon\epsilon'}$ and $c_\epsilon$, $\epsilon, \epsilon' = \pm, 0$. The parameters $c_{\epsilon\epsilon'}$ which define the polynomial $P$ are of particular significance, since they determine the form of the change of variables (20). However, different sets of parameters $c_{\epsilon\epsilon'}$, $c_\epsilon$ defining the operator $H^*$ may give rise to the same potential. Indeed, there is a group of residual transformations preserving the vector spaces $\operatorname{span}\{J_i^-, J_i^0, J_i^+\}$, $i = 1, \dots, N$. The image of the operator $H^*$ under these transformations is still of the form (13), albeit with different coefficients $\hat c_{\epsilon\epsilon'}$ and $\hat c_\epsilon$. We shall make use of this fact to classify the starting multi-parameter family of operators (13) into conjugacy classes. This will provide a complete classification of the spin CS models obtainable within our framework. The ideas used for this classification are similar in spirit to those applied in Refs. [18, 19] to classify one-particle Lie-algebraic QES Schrödinger operators. Consider the mapping

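\[ J_i^\epsilon(w) \mapsto \hat J_i^\epsilon(z) = \mu_m(z)\, J_i^\epsilon\big(w(z)\big)\, \mu_m^{-1}(z), \tag{25} \]

where $w = (w_1, \dots, w_N)$ is given by the projective action of GL(2, R) on $\mathbb{RP}^1$ (Möbius transformation)

\[ w_i = \frac{\alpha z_i + \beta}{\gamma z_i + \delta}, \qquad i = 1, \dots, N, \qquad \Delta = \alpha\delta - \beta\gamma \ne 0, \tag{26} \]

and the gauge factor $\mu_m(z)$ is defined by

\[ \mu_m(z) = \prod_{i=1}^N (\gamma z_i + \delta)^m. \tag{27} \]

The following lemma is an immediate consequence of the definition of the Dunkl operators (8):

Lemma 3. The mapping (25) acts linearly on the vector spaces $\operatorname{span}\{J_i^-, J_i^0, J_i^+\}$, $i = 1, \dots, N$, as

\[ \begin{pmatrix} \hat J_i^+(z) \\ \hat J_i^0(z) \\ \hat J_i^-(z) \end{pmatrix} = \frac{1}{\Delta} \begin{pmatrix} \alpha^2 & 2\alpha\beta & \beta^2 \\ \alpha\gamma & \alpha\delta + \beta\gamma & \beta\delta \\ \gamma^2 & 2\gamma\delta & \delta^2 \end{pmatrix} \begin{pmatrix} J_i^+(z) \\ J_i^0(z) \\ J_i^-(z) \end{pmatrix}. \tag{28} \]

It follows from the previous lemma that the operator $\hat H^*$ defined by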

\[ \hat H^*(z) = \mu_m(z)\, H^*\big(w(z)\big)\, \mu_m^{-1}(z) \]

is still a second-degree polynomial in the Dunkl operators $J_i^\epsilon(z)$, whose coefficients $\hat c_{\epsilon\epsilon'}$, $\hat c_\epsilon$ can be easily computed using Eq. (28). The corresponding gauge spin Hamiltonian can be obtained from Eq. (15) by replacing the coefficients $c_{\epsilon\epsilon'}$ and $c_\epsilon$ by their counterparts $\hat c_{\epsilon\epsilon'}$ and $\hat c_\epsilon$. In particular, the polynomials $P(z)$ and $Q(z)$ in Eq. (16) are replaced by the polynomials

\[ \hat P(z) = \hat c_{++} z^4 + \hat c_{0+} z^3 + \hat c_{00} z^2 + \hat c_{0-} z + \hat c_{--}, \qquad \hat Q(z) = \hat c_+ z^2 + \hat c_0 z + \hat c_-. \tag{29} \]
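The way the coefficients of $P$ transform can be checked numerically from the matrix in Lemma 3 alone. In the sketch below (hypothetical function names, arbitrary test values) the quadratic part of (13) is treated as a symmetric form in the commuting symbols $J^+ \sim z^2$, $J^0 \sim z$, $J^- \sim 1$, which is all that enters $P$; the result is compared with the closed form $\hat P(z) = \Delta^{-2}(\gamma z + \delta)^4 P\big((\alpha z + \beta)/(\gamma z + \delta)\big)$.

```python
from fractions import Fraction as F


def p_hat_coeffs(c, alpha, beta, gamma, delta):
    """Transform (c++, c0+, c00, c0-, c--) using the 3x3 matrix of Lemma 3."""
    cpp, c0p, c00, c0m, cmm = c
    Delta = alpha * delta - beta * gamma
    # symmetric coefficient matrix of the quadratic form, ordering (+, 0, -)
    C = [[cpp, c0p / 2, 0], [c0p / 2, c00, c0m / 2], [0, c0m / 2, cmm]]
    M = [[alpha**2, 2 * alpha * beta, beta**2],
         [alpha * gamma, alpha * delta + beta * gamma, beta * delta],
         [gamma**2, 2 * gamma * delta, delta**2]]
    CM = [[sum(C[r][k] * M[k][s] for k in range(3)) for s in range(3)] for r in range(3)]
    Ch = [[sum(M[k][r] * CM[k][s] for k in range(3)) / Delta**2 for s in range(3)]
          for r in range(3)]
    # collapse onto powers of z: J+J- and (J0)^2 both have symbol z^2
    return (Ch[0][0], 2 * Ch[0][1], Ch[1][1] + 2 * Ch[0][2], 2 * Ch[1][2], Ch[2][2])


def quartic(c, z):
    cpp, c0p, c00, c0m, cmm = c
    return cpp * z**4 + c0p * z**3 + c00 * z**2 + c0m * z + cmm


c = (F(3), F(-1), F(2), F(5), F(-7))                 # arbitrary test coefficients
alpha, beta, gamma, delta = F(2), F(1), F(1), F(3)   # arbitrary matrix, Delta = 5
c_hat = p_hat_coeffs(c, alpha, beta, gamma, delta)
Delta = alpha * delta - beta * gamma

for z in (F(0), F(1), F(-2), F(1, 2), F(3)):
    w = (alpha * z + beta) / (gamma * z + delta)
    closed_form = (gamma * z + delta)**4 / Delta**2 * quartic(c, w)
    print(z, quartic(c_hat, z) == closed_form)
```

Agreement at five distinct points is enough to identify two quartic polynomials, so the check is exact for the chosen test values.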

Expressing the coefficients $\hat c_{\epsilon\epsilon'}$ and $\hat c_\epsilon$ in terms of the original coefficients $c_{\epsilon\epsilon'}$ and $c_\epsilon$, one easily arrives at the explicit formulas

\[ \hat P(z) = \frac{(\gamma z + \delta)^4}{\Delta^2}\, P\Big( \frac{\alpha z + \beta}{\gamma z + \delta} \Big), \qquad \hat Q(z) = \frac{(\gamma z + \delta)^2}{\Delta}\, Q\Big( \frac{\alpha z + \beta}{\gamma z + \delta} \Big). \tag{30} \]

Recall [30] that the (irreducible) multiplier representation $\rho_{n,i}$ of GL(2, R) on the space of univariate polynomials of degree at most $n$ is defined by the linear transformations

\[ p(z) \mapsto \hat p(z) = \Delta^i\, (\gamma z + \delta)^n\, p\Big( \frac{\alpha z + \beta}{\gamma z + \delta} \Big). \]

By Eq. (30), the polynomials $P$ and $Q$ defining $\overline{H}$ transform according to the representations $\rho_{4,-2}$ and $\rho_{2,-1}$, respectively. Note also that the Dunkl operators (8) transform according to the representation $\rho_{2,-1}$; see Eq. (28). The (nonzero) orbits of the representation $\rho_{4,-2}$ can be parametrized by the following canonical forms [19, 21]:

\[
\begin{gathered}
\pm 1, \qquad z, \qquad \pm\nu(1 - z^2), \qquad \pm\nu(1 + z^2), \qquad \pm\nu z^2, \qquad \pm\nu(1 + z^2)^2, \\
\pm\nu(1 - z^2)(1 - k^2 z^2), \qquad \pm\nu(1 + z^2)(1 + k^2 z^2), \qquad \nu(1 - z^2)(1 - k^2 + k^2 z^2),
\end{gathered}
\]

where $\nu > 0$ and $0 < k < 1$. The above list of canonical forms can be further reduced without any loss of generality using the complex projective (linear) transformation $w = iz$. Since the projective transformation $w = iz$ also induces the mapping $c_\epsilon \mapsto \hat c_\epsilon = i^{\epsilon} c_\epsilon$ for the coefficients of the polynomial $Q$, the resulting canonical form leads to a real potential provided that the initial coefficients $c_\pm$ are purely imaginary. The reduced list of canonical forms will be conveniently taken as

\[
\begin{aligned}
&1)\ \ 1, & &6)\ \ \nu(1 + z^2)^2, \\
&2)\ \ z, & &7)\ \ \nu(1 - z^2)(1 - k^2 z^2), \\
&3)\ \ \nu(z^2 - 1), & &8)\ \ \nu(z^2 - 1)(1 - k'^2 z^2), \\
&4)\ \ \nu(1 - z^2), & &9)\ \ \nu(1 - z^2)(k'^2 + k^2 z^2), \\
&5)\ \ \nu z^2, & &
\end{aligned} \tag{31}
\]

where $k'^2 = 1 - k^2$. By choosing $P(z)$ in Eqs. (15), (16) in each of the canonical forms (31), one obtains a complete classification of the spin CS models with potential (21), which are (Q)ES by construction. Recall that in the one-particle case the canonical forms 1), 2), 3), 5) give rise to rational or hyperbolic potentials, while the remaining five yield periodic (trigonometric or elliptic) potentials [19].

Remark 3. The operator $H^*$ in (13) is easily seen to preserve the module $S_m(z)$ of symmetric polynomials in $z_1, \dots, z_N$ of degree at most $m$, on which the permutation operators $K_{ij}$ act as the identity. Hence the antisymmetrizer $\Lambda$ acts on the space $S_m \otimes \mathcal{S}$ as the tensor product $1\!\!\mathrm{I} \otimes \Lambda_0$, where $\Lambda_0$ is the spin antisymmetrization operator, and the $H$-invariant module $\mu\Lambda(S_m \otimes \mathcal{S})$ factors as the tensor product $(\mu S_m) \otimes (\Lambda_0 \mathcal{S})$. The spin permutation operators $S_{ij}$ on this module reduce to $-1\!\!\mathrm{I}$. Therefore, the restriction of the Hamiltonian $H$ to this space is simply $H|_{S_{ij} \to -1} \otimes 1\!\!\mathrm{I}$. Thus the scalar Schrödinger operator $H|_{S_{ij} \to -1}$ leaves invariant the module $\mu S_m$. It follows that replacing the spin permutation operators $S_{ij}$ by $-1$ in one of the (Q)ES spin potentials listed below, one obtains a corresponding (Q)ES scalar potential. The scalar potentials so constructed include as particular cases all the potentials presented in [16].

For each of the canonical forms (31), the potential turns out to decompose as

\[ V(x) = \sum_i U(x_i) + V_{\mathrm{int}}(x), \tag{32} \]

where $U$ plays the role of an external field potential, and the interaction potential $V_{\mathrm{int}}$ is of the form

\[ V_{\mathrm{int}}(x) = \sum_{i<j} \Big[ V^-(x_i - x_j) + V^+(x_i + x_j) \Big]\, a(a + S_{ij}), \tag{33} \]

with either $V^+ = 0$ or $V^+ = V^-$. Indeed, making use of the identities (A7)–(A15) in the Appendix and recalling that $c_{0+} = 0$ for all the canonical forms, the potential (21) reduces after some algebra to the form (32) with

\[
\begin{aligned}
U &= c_{++}(1 - b^2)\, z^2 + b\, c_+ z + \frac{1}{16 P(z)} \Big[ 3 P'(z)^2 + 4 \tilde Q(z)^2 - 8 \tilde Q(z) P'(z) \Big], \\
V_{\mathrm{int}} &= 2 \sum_{i<j} (z_i - z_j)^{-2} \Big[ c_{++} z_i^2 z_j^2 + c_{00} z_i z_j + \frac{c_{0-}}{2}(z_i + z_j) + c_{--} \Big]\, a(a + S_{ij}).
\end{aligned}
\]

We have discarded here a constant term of the form (24), in accordance with the observation at the end of Sect. 3. The interaction potential takes the form (33) after performing the change of variables (20) and using some identities for the corresponding function $\zeta(x)$. Moreover, if the function $V^+$ is nonzero, it can be reduced to $V^-$ by a suitable coordinate translation (see Remark 2).

We shall now present the list of (Q)ES spin many-body potentials obtained from the canonical forms (31). Note that the scaling $(c_{\epsilon\epsilon'}, c_\epsilon) \mapsto (\lambda c_{\epsilon\epsilon'}, \lambda c_\epsilon)$ induces the mapping

\[
\begin{aligned}
V(x;\, c_{\epsilon\epsilon'}, c_\epsilon) &\mapsto V(x;\, \lambda c_{\epsilon\epsilon'}, \lambda c_\epsilon) = \lambda\, V(\sqrt{\lambda}\, x;\, c_{\epsilon\epsilon'}, c_\epsilon), \\
\mu(x;\, c_{\epsilon\epsilon'}, c_\epsilon) &\mapsto \mu(x;\, \lambda c_{\epsilon\epsilon'}, \lambda c_\epsilon) \propto \mu(\sqrt{\lambda}\, x;\, c_{\epsilon\epsilon'}, c_\epsilon)
\end{aligned}
\]

of the corresponding potentials and gauge factors. For this reason we shall list the potentials for a suitably chosen value of the parameter $\nu$ in Cases 3–9, or a suitable multiple of $P(z)$ in Cases 1, 2. The notation

\[ x_{ij}^\pm = x_i \pm x_j; \qquad \alpha_\epsilon = \frac{c_\epsilon}{4}, \quad \epsilon = \pm, 0; \qquad \alpha = \alpha_+ + \alpha_0 + \alpha_- \]

shall be employed in what follows.

Case 1. $P(z) = \tfrac14$.

Change of variables: $z = \dfrac{x}{2}$.

Gauge factor:
\[ \mu(x) = \prod_{i<j} (x_{ij}^-)^a \prod_i \exp\Big( \frac13 \alpha_+ x_i^3 + \alpha_0 x_i^2 + 4 \alpha_- x_i \Big). \]

External potential:
\[ U(x) = \alpha_+^2 x^4 + 4 \alpha_0 \alpha_+ x^3 + 4(\alpha_0^2 + 2 \alpha_- \alpha_+)\, x^2 + 2(8 \alpha_- \alpha_0 + b\,\alpha_+)\, x. \]

Interaction potential:
\[ V_{\mathrm{int}}(x) = 2 \sum_{i<j} (x_{ij}^-)^{-2}\, a(a + S_{ij}). \]


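Case 2. $P(z) = 4z$.

Change of variables: $z = x^2$.

Gauge factor:
\[ \mu(x) = \prod_{i<j} \big( x_{ij}^- x_{ij}^+ \big)^a \prod_i x_i^{\frac12(1 - b) + \alpha_-} \exp\Big( \frac14 \alpha_+ x_i^4 + \frac12 \alpha_0 x_i^2 \Big). \]

External potential:
\[ U(x) = \alpha_+^2 x^6 + 2 \alpha_0 \alpha_+ x^4 + \big[ \alpha_0^2 + \alpha_+ (3b + 2\alpha_-) \big] x^2 + \frac14 \big[ (2\alpha_- - b)^2 - 1 \big] \frac{1}{x^2}. \]

Interaction potential:
\[ V_{\mathrm{int}}(x) = 2 \sum_{i<j} \Big[ (x_{ij}^-)^{-2} + (x_{ij}^+)^{-2} \Big]\, a(a + S_{ij}). \]

Case 3. $P(z) = 4(z^2 - 1)$.

Change of variables: $z = \cosh 2x$.

Gauge factor:
\[ \mu(x) = \prod_{i<j} \big( \sinh x_{ij}^- \sinh x_{ij}^+ \big)^a \prod_i (\sinh x_i)^{\frac12(1 + \alpha - b)} (\cosh x_i)^{\frac12(1 + 2\alpha_0 - \alpha - b)} \exp\Big( \frac{\alpha_+}{2} \cosh 2x_i \Big). \]

External potential:
\[
\begin{aligned}
U(x) ={}& \alpha_+^2 \cosh^2 2x + 2\alpha_+ (\alpha_0 + b) \cosh 2x + 2(\alpha_+ + \alpha_-)(\alpha_0 - b) \cosh 2x\, \sinh^{-2} 2x \\
& + \big[ (\alpha_+ + \alpha_-)^2 + (\alpha_0 - b)^2 - 1 \big] \sinh^{-2} 2x.
\end{aligned}
\]

Interaction potential:
\[ V_{\mathrm{int}}(x) = 2 \sum_{i<j} \Big[ \sinh^{-2} x_{ij}^- + \sinh^{-2} x_{ij}^+ \Big]\, a(a + S_{ij}). \]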
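Case 4. $P(z) = 4(1 - z^2)$.

Change of variables: $z = \cos 2x$.

Gauge factor:
\[ \mu(x) = \prod_{i<j} \big( \sin x_{ij}^- \sin x_{ij}^+ \big)^a \prod_i (\sin x_i)^{\frac12(1 - \alpha - b)} (\cos x_i)^{\frac12(1 + \alpha - 2\alpha_0 - b)} \exp\Big( -\frac{\alpha_+}{2} \cos 2x_i \Big). \]

External potential:
\[
\begin{aligned}
U(x) ={}& -\alpha_+^2 \cos^2 2x + 2\alpha_+ (b - \alpha_0) \cos 2x + 2(\alpha_+ + \alpha_-)(b + \alpha_0) \cos 2x\, \sin^{-2} 2x \\
& + \big[ (\alpha_+ + \alpha_-)^2 + (b + \alpha_0)^2 - 1 \big] \sin^{-2} 2x.
\end{aligned}
\]

Interaction potential:
\[ V_{\mathrm{int}}(x) = 2 \sum_{i<j} \Big[ \sin^{-2} x_{ij}^- + \sin^{-2} x_{ij}^+ \Big]\, a(a + S_{ij}). \]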

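Case 5. $P(z) = 4z^2$.

Change of variables: $z = e^{2x}$.

Gauge factor:
\[ \mu(x) = \prod_{i<j} \sinh^a x_{ij}^- \prod_i \exp\Big[ \frac12 \big( \alpha_+ e^{2x_i} - \alpha_- e^{-2x_i} \big) + (\alpha_0 - m)\, x_i \Big]. \]

External potential:
\[ U(x) = \alpha_+^2 e^{4x} + 2\alpha_+ (\alpha_0 + b)\, e^{2x} + 2\alpha_- (\alpha_0 - b)\, e^{-2x} + \alpha_-^2 e^{-4x}. \]

Interaction potential:
\[ V_{\mathrm{int}}(x) = 2 \sum_{i<j} \sinh^{-2} x_{ij}^-\; a(a + S_{ij}). \]

Case 6. $P(z) = (1 + z^2)^2$.

Change of variables: $z = \tan x$.

Gauge factor:
\[ \mu(x) = \prod_{i<j} \sin^a x_{ij}^- \prod_i \cos^m x_i\, \exp\Big[ (\alpha_+ + \alpha_-)\, x_i + \frac12 (\alpha_- - \alpha_+) \sin 2x_i - \frac12 \alpha_0 \cos 2x_i \Big]. \]

External potential:
\[
\begin{aligned}
U(x) ={}& \frac12 \big[ (\alpha_+ - \alpha_-)^2 - \alpha_0^2 \big] \cos 4x + 2 \big( \alpha_-^2 - \alpha_+^2 + b\,\alpha_0 \big) \cos 2x \\
& + \alpha_0 (\alpha_- - \alpha_+) \sin 4x + 2 \big[ \alpha_0 (\alpha_+ + \alpha_-) + b(\alpha_+ - \alpha_-) \big] \sin 2x.
\end{aligned}
\]

Interaction potential:
\[ V_{\mathrm{int}}(x) = 2 \sum_{i<j} \sin^{-2} x_{ij}^-\; a(a + S_{ij}). \]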
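Equation (20) implies $(dz/dx)^2 = P(z)$, so the changes of variables quoted in Cases 1–6 can be checked against their canonical forms with a few lines of code. The following is a rough numerical sketch using central differences; the dictionary `CASES`, the step `h` and the sample points are illustrative choices, not part of the paper.

```python
import math

# case number -> (change of variables z = zeta(x), canonical form P(z))
CASES = {
    1: (lambda x: x / 2,            lambda z: 0.25),
    2: (lambda x: x * x,            lambda z: 4 * z),
    3: (lambda x: math.cosh(2 * x), lambda z: 4 * (z * z - 1)),
    4: (lambda x: math.cos(2 * x),  lambda z: 4 * (1 - z * z)),
    5: (lambda x: math.exp(2 * x),  lambda z: 4 * z * z),
    6: (lambda x: math.tan(x),      lambda z: (1 + z * z) ** 2),
}


def residual(case, x, h=1e-5):
    """(dz/dx)^2 - P(z) at the point x, with dz/dx from a central difference."""
    zeta, P = CASES[case]
    dz = (zeta(x + h) - zeta(x - h)) / (2 * h)
    return dz * dz - P(zeta(x))


for case in CASES:
    worst = max(abs(residual(case, x)) for x in (0.3, 0.7, 1.1))
    print(f"Case {case}: largest residual {worst:.2e}")
```

The residuals are at the level of the finite-difference error, consistent with each $\zeta(x)$ being an inverse of the integral in (20) up to the translation freedom noted in Remark 2.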

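Case 7. $P(z) = 4(1 - z^2)(1 - k^2 z^2)$.

Change of variables: $z = \dfrac{\operatorname{cn} 2x}{\operatorname{dn} 2x}$.

Gauge factor:
\[
\begin{aligned}
\mu(x) ={}& \prod_{i<j} \bigg[ \frac{\operatorname{sn} x_{ij}^- \operatorname{sn} x_{ij}^+}{1 - k^2 \operatorname{sn}^2 x_{ij}^- \operatorname{sn}^2 x_{ij}^+} \bigg]^a \prod_i (\operatorname{dn} 2x_i)^m\, (\operatorname{sn} 2x_i)^{\frac12 \left( 1 - b - \frac{\alpha}{k'^2} \right)} \\
& \times \big( \operatorname{dn} 2x_i + \operatorname{cn} 2x_i \big)^{\frac{\alpha_+ + \alpha_-}{2k'^2}} \big( \operatorname{dn} 2x_i + k \operatorname{cn} 2x_i \big)^{-\frac{\alpha_+ + k^2 \alpha_-}{2kk'^2}}.
\end{aligned}
\]

Here (and also in Cases 8 and 9) the functions $\operatorname{sn} x \equiv \operatorname{sn}(x|k)$, $\operatorname{cn} x \equiv \operatorname{cn}(x|k)$, and $\operatorname{dn} x \equiv \operatorname{dn}(x|k)$ are the usual Jacobian elliptic functions of modulus $k$, and $k' = \sqrt{1 - k^2}$ is the complementary modulus.

External potential:
\[ U(x) = A_7 \operatorname{sn}^2 2x + B_7 \operatorname{cn} 2x \operatorname{dn} 2x + \operatorname{sn}^{-2} 2x\, \big( C_7 + D_7 \operatorname{cn} 2x \operatorname{dn} 2x \big), \]
where
\[
\begin{aligned}
A_7 &= k^2 (b^2 - 1) + \frac{k^2 \alpha_0}{k'^2} \Big( \frac{\alpha_0}{k'^2} - 2b \Big) + \frac{1}{k'^4} (\alpha_+ + k^2 \alpha_-)^2, \\
B_7 &= \frac{2}{k'^2} (\alpha_+ + k^2 \alpha_-) \Big( b - \frac{\alpha_0}{k'^2} \Big), \\
C_7 &= b^2 - 1 + \frac{\alpha_0}{k'^2} \Big( \frac{\alpha_0}{k'^2} + 2b \Big) + \frac{1}{k'^4} (\alpha_+ + \alpha_-)^2, \\
D_7 &= \frac{2}{k'^2} (\alpha_+ + \alpha_-) \Big( b + \frac{\alpha_0}{k'^2} \Big).
\end{aligned}
\]

Interaction potential:
\[ V_{\mathrm{int}}(x) = 2 \sum_{i<j} \bigg[ \frac{\operatorname{cn}^2 x_{ij}^- \operatorname{dn}^2 x_{ij}^-}{\operatorname{sn}^2 x_{ij}^-} + \frac{\operatorname{cn}^2 x_{ij}^+ \operatorname{dn}^2 x_{ij}^+}{\operatorname{sn}^2 x_{ij}^+} \bigg]\, a(a + S_{ij}). \]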

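Case 8. $P(z) = 4(z^2 - 1)(1 - k'^2 z^2)$.

Change of variables: $z = \dfrac{1}{\operatorname{dn} 2x}$.

Gauge factor:
\[
\begin{aligned}
\mu(x) ={}& \prod_{i<j} \bigg[ \frac{\operatorname{sn} x_{ij}^- \operatorname{sn} x_{ij}^+ \operatorname{cn} x_{ij}^- \operatorname{cn} x_{ij}^+}{1 - k^2 \operatorname{sn}^2 x_{ij}^- \operatorname{sn}^2 x_{ij}^+} \bigg]^a \prod_i (\operatorname{cn} 2x_i)^{\frac12 \left[ 1 - b - \frac{1}{k' k^2} (\alpha_+ + k' \alpha_0 + k'^2 \alpha_-) \right]} \\
& \times (\operatorname{sn} 2x_i)^{\frac12 \left( 1 - b + \frac{\alpha}{k^2} \right)} (\operatorname{dn} 2x_i)^m\, \big( 1 + \operatorname{dn} 2x_i \big)^{-\frac{\alpha_+ + \alpha_-}{2k^2}} \big( k' + \operatorname{dn} 2x_i \big)^{\frac{\alpha_+ + k'^2 \alpha_-}{2k' k^2}}.
\end{aligned}
\]

External potential:
\[ U(x) = \operatorname{sn}^{-2} 2x\, \big( A_8 + B_8 \operatorname{dn} 2x \big) + \operatorname{cn}^{-2} 2x\, \big( C_8 + D_8 \operatorname{dn} 2x \big), \]
where
\[
\begin{aligned}
A_8 &= b^2 - 1 + \frac{\alpha_0}{k^2} \Big( \frac{\alpha_0}{k^2} - 2b \Big) + \frac{1}{k^4} (\alpha_+ + \alpha_-)^2, \\
B_8 &= \frac{2}{k^2} (\alpha_+ + \alpha_-) \Big( \frac{\alpha_0}{k^2} - b \Big), \\
C_8 &= k'^2 (b^2 - 1) + \frac{k'^2 \alpha_0}{k^2} \Big( \frac{\alpha_0}{k^2} + 2b \Big) + \frac{1}{k^4} (\alpha_+ + k'^2 \alpha_-)^2, \\
D_8 &= \frac{2}{k^2} (\alpha_+ + k'^2 \alpha_-) \Big( \frac{\alpha_0}{k^2} + b \Big).
\end{aligned}
\]

Interaction potential:
\[ V_{\mathrm{int}}(x) = 2 \sum_{i<j} \bigg[ \frac{\operatorname{dn}^2 x_{ij}^-}{\operatorname{sn}^2 x_{ij}^- \operatorname{cn}^2 x_{ij}^-} + \frac{\operatorname{dn}^2 x_{ij}^+}{\operatorname{sn}^2 x_{ij}^+ \operatorname{cn}^2 x_{ij}^+} \bigg]\, a(a + S_{ij}). \]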

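Case 9. $P(z) = 4(1 - z^2)(k'^2 + k^2 z^2)$.

Change of variables: $z = \operatorname{cn} 2x$.

Gauge factor:
\[
\begin{aligned}
\mu(x) ={}& \prod_{i<j} \bigg[ \frac{\operatorname{sn} x_{ij}^- \operatorname{sn} x_{ij}^+ \operatorname{dn} x_{ij}^- \operatorname{dn} x_{ij}^+}{1 - k^2 \operatorname{sn}^2 x_{ij}^- \operatorname{sn}^2 x_{ij}^+} \bigg]^a \prod_i (\operatorname{sn} 2x_i)^{\frac12 (1 - \alpha - b)} (\operatorname{dn} 2x_i)^{\frac12 (1 + \alpha_0 - b)} \\
& \times \big( 1 + \operatorname{cn} 2x_i \big)^{\frac12 (\alpha_+ + \alpha_-)} \exp\bigg[ \frac{k^2 \alpha_- - k'^2 \alpha_+}{2kk'} \tan^{-1}\Big( \frac{k}{k'} \operatorname{cn} 2x_i \Big) \bigg].
\end{aligned}
\]

External potential:
\[ U(x) = \operatorname{dn}^{-2} 2x\, \big( A_9 + B_9 \operatorname{cn} 2x \big) + \operatorname{sn}^{-2} 2x\, \big( C_9 + D_9 \operatorname{cn} 2x \big), \]
where
\[
\begin{aligned}
A_9 &= k'^2 (1 - b^2) + k'^2 \alpha_0 (2b - \alpha_0) + \frac{1}{k^2} \big( k'^2 \alpha_+ - k^2 \alpha_- \big)^2, \\
B_9 &= 2 (b - \alpha_0) \big( k'^2 \alpha_+ - k^2 \alpha_- \big), \\
C_9 &= (b + \alpha_0)^2 + (\alpha_+ + \alpha_-)^2 - 1, \\
D_9 &= 2 (b + \alpha_0)(\alpha_+ + \alpha_-).
\end{aligned}
\]

Interaction potential:
\[ V_{\mathrm{int}}(x) = 2 \sum_{i<j} \bigg[ \frac{\operatorname{cn}^2 x_{ij}^-}{\operatorname{sn}^2 x_{ij}^- \operatorname{dn}^2 x_{ij}^-} + \frac{\operatorname{cn}^2 x_{ij}^+}{\operatorname{sn}^2 x_{ij}^+ \operatorname{dn}^2 x_{ij}^+} \bigg]\, a(a + S_{ij}). \]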

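Remark 4. In Cases 7 and 8, the alternative canonical form

\[ P(z) = 4z^3 - g_2 z - g_3, \qquad g_2^3 > 27 g_3^2 > 0, \]

leads to a spin generalization of the QES potential involving Weierstrass functions studied in [17]. The corresponding change of variables is $z = \wp(x + \omega_3)$, where $\wp(x) = \wp(x|g_2, g_3)$ is the Weierstrass function with invariants $g_2$, $g_3$, and $2\omega_3$ is its purely imaginary fundamental period. The gauge factor reads

\[
\begin{aligned}
\mu(x) ={}& \prod_{i<j} \big[ \wp(x_i + \omega_3) - \wp(x_j + \omega_3) \big]^a \prod_i \big[ \wp'(x_i + \omega_3) \big]^\beta \\
& \times \big[ \wp(x_i + \omega_3) - e_1 \big]^{\gamma_1} \big[ \wp(x_i + \omega_3) - e_2 \big]^{\gamma_2} \big[ \wp(x_i + \omega_3) - e_3 \big]^{\gamma_3},
\end{aligned}
\]

where

\[ \beta = \frac12 (1 - b) + \frac{\alpha_+}{3}, \qquad \gamma_j = \frac{g_2 \alpha_+ + 12 (e_j \alpha_0 + \alpha_-)}{24 (e_j - e_k)(e_j - e_l)}, \qquad (j, k, l) = \text{cyclic permutation of } (1, 2, 3), \]

and $e_j$ are the real (different) roots of $P(z)$. The external and interaction potentials are given by

\[ U(x) = 4\beta(\beta - 1)\, \wp(2x) + A\, \wp(x + \omega_3) + \big[ B\, \wp^2(x + \omega_3) + C\, \wp(x + \omega_3) + D \big]\, \wp'(x + \omega_3)^{-2}, \]

where

\[
\begin{aligned}
A &= 2\alpha_0 + \frac49\, \alpha_+ (2\alpha_+ + 3b), \\
B &= 4\alpha_0^2 + \frac13 (2\alpha_+ - 3b)(12\alpha_- + g_2 \alpha_+), \\
C &= 2\alpha_0 \big[ 4\alpha_- + g_2 (\alpha_+ - b) \big], \\
D &= \Big( g_2 b + 4\alpha_- - \frac13 g_2 \alpha_+ \Big) \Big( \alpha_- + \frac{1}{12} g_2 \alpha_+ \Big) + g_3 \alpha_0 (3b - 2\alpha_+),
\end{aligned}
\]

and

\[ V_{\mathrm{int}}(x) = 2 \sum_{i<j} \Big[ \wp(x_{ij}^-) + \wp(x_{ij}^+) \Big]\, a(a + S_{ij}). \]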

The restriction of this spin model to the polynomial space Sm (see Remark 3) yields the scalar elliptic CS model in [17] provided α0 = 0 and α− = −g2 α+ /12. Remark 5. Case 1 with α+ = 0 yields the rational Calogero AN spin model. Case 5 with α+ α− = 0 is the model studied by Inozemtsev [25], while for α+ = α− = 0 the hyperbolic Sutherland AN spin model is obtained. Case 6 with α+ − α− = α0 = 0 is the trigonometric Sutherland AN spin model. The remaining potentials are new. In Cases 1–5, the potential is ES if α+ = 0. In Case 5, the potential is also ES for α− = 0. The only ES potential in Case 6 is the trigonometric Sutherland potential (α+ − α− = α0 = 0). The remaining potentials, including all the elliptic potentials in Cases 7–9, are QES. Remark 6. In order to qualify as physical wavefunctions, the spin functions ψ(x) constructed in Sect. 3 must vanish at all the hyperplanes in which the corresponding potential V (x) is singular faster than the square root of the distance to the hyperplane. In addition, in Cases 1, 2, 3, 5 the function ψ is required to be square-integrable over a suitable domain of RN . Both requirements impose certain constraints on the parameters α and a defining the potential. A detailed analysis of the necessary and sufficient conditions on these parameters lies beyond the scope of this paper; see [18] for a complete solution of the analogous one-particle problem. However, it is not difficult in each case to provide sufficient conditions for the above requirements to hold. For example, in Case 5 the spin functions ψ vanish at the singularities xij− = 0 of the potential faster than |xij− |1/2 if and only if a > 1/2, and are square-integrable over the domain {x ∈ RN : x1 > · · · > xN } provided α+ < 0 and α− > 0.

Remark 7. In Cases 1–6, the gauge factor is of the form

\[ \mu(x) = \prod_{i<j} \Big[ f(x_{ij}^-)\, g(x_{ij}^+) \Big]^a \prod_i h(x_i), \tag{34} \]

with g = 1 or g = f . On the other hand, in the elliptic Cases 7–9 the gauge factor does not factorize as (34). This is consistent with the analogous result proved by Calogero for elliptic potentials in the scalar case [7]. 5. Conclusions We have developed in this paper a systematic method for constructing new families of exactly and quasi-exactly solvable spin Calogero–Sutherland models. The key idea consists in relating the physical spin Hamiltonian to a general quadratic combination involving the usual AN -type Dunkl operators (1), (4), and the new family of Dunkl operators Ji+ introduced in Sect. 2 (Eq. (8)). Our approach goes beyond the Lie algebraic method extensively used in the scalar case, since the Dunkl operators (8) do not span a Lie algebra. However, they are invariant under the projective action of GL(2, R), a fact that is exploited in Sect. 4 to classify all the potentials obtained by this method up to translations. The potentials constructed in this paper are all invariant under the AN group consisting of permutations of the particles’ coordinates and spins. A remarkable feature of the potentials in Cases 2–4 and 7–9 is their additional invariance under a change of sign of the spatial coordinate of any particle. Therefore, although these potentials are not invariant under the full BN group of permutations and sign reversal of the particles’ coordinates and spins, they are invariant under the restriction of the action of this group to the spatial coordinates. These models thus occupy an intermediate position between the usual spin CS models of AN type and the fully BN -invariant rational and trigonometric spin CS models introduced by Yamamoto [44]. In fact, while Dunkl has recently proved the exact solvability of the rational Yamamoto model [13], there are no exact results for the eigenfunctions of its trigonometric counterpart. 
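It is to be expected that a suitable extension of the method developed in this paper to the $B_N$ case will yield new families of (Q)ES spin CS models of $B_N$ type, including the trigonometric Yamamoto model.

6. Appendix

The formulae for the sums of the squares and the anticommutators of the Dunkl operators (8) are given by

\[ \sum_i (J_i^-)^2 = \sum_i \partial_{z_i}^2 + 2a \sum_{i<j} \frac{1}{z_i - z_j} \big( \partial_{z_i} - \partial_{z_j} \big) - 2a \sum_{i<j} \frac{1}{(z_i - z_j)^2}\,(1 - K_{ij}), \tag{A1} \]

\[
\begin{aligned}
\sum_i (J_i^0)^2 ={}& \sum_i z_i^2 \partial_{z_i}^2 + (2 - b) \sum_i z_i \partial_{z_i} + 2a \sum_{i<j} \frac{1}{z_i - z_j} \big( z_i^2 \partial_{z_i} - z_j^2 \partial_{z_j} \big) \\
& - a \sum_{i<j} \frac{z_i^2 + z_j^2}{(z_i - z_j)^2}\,(1 - K_{ij}) + a \sum_{i<j} (1 - K_{ij}) + \frac{a^2}{12} \sum_{i,j,k}{}' (1 - K_{ij} K_{jk}) + \frac{N m^2}{4},
\end{aligned} \tag{A2}
\]

\[
\begin{aligned}
\sum_i (J_i^+)^2 ={}& \sum_i z_i^4 \partial_{z_i}^2 + 2(2 - b) \sum_i z_i^3 \partial_{z_i} + 2a \sum_{i<j} \frac{1}{z_i - z_j} \big( z_i^4 \partial_{z_i} - z_j^4 \partial_{z_j} \big) \\
& - a \sum_{i<j} \frac{z_i^4 + z_j^4}{(z_i - z_j)^2}\,(1 - K_{ij}) + a \sum_{i<j} (z_i + z_j)^2 (1 - K_{ij}) \\
& - 2am \sum_{i<j} z_i z_j + m(m - 1) \sum_i z_i^2,
\end{aligned} \tag{A3}
\]

\[
\begin{aligned}
\frac12 \sum_i \{J_i^0, J_i^-\} ={}& \sum_i z_i \partial_{z_i}^2 + \frac12 (2 - b) \sum_i \partial_{z_i} + 2a \sum_{i<j} \frac{1}{z_i - z_j} \big( z_i \partial_{z_i} - z_j \partial_{z_j} \big) \\
& - a \sum_{i<j} \frac{z_i + z_j}{(z_i - z_j)^2}\,(1 - K_{ij}),
\end{aligned} \tag{A4}
\]

\[
\begin{aligned}
\frac12 \sum_i \{J_i^+, J_i^-\} ={}& \sum_i z_i^2 \partial_{z_i}^2 + (2 - b) \sum_i z_i \partial_{z_i} + 2a \sum_{i<j} \frac{1}{z_i - z_j} \big( z_i^2 \partial_{z_i} - z_j^2 \partial_{z_j} \big) \\
& - a \sum_{i<j} \frac{z_i^2 + z_j^2}{(z_i - z_j)^2}\,(1 - K_{ij}) - \frac{a^2}{6} \sum_{i,j,k}{}' (1 - K_{ij} K_{jk}) - \frac{mN}{2} \big[ 1 + a(N - 1) \big],
\end{aligned} \tag{A5}
\]

\[
\begin{aligned}
\frac12 \sum_i \{J_i^0, J_i^+\} ={}& \sum_i z_i^3 \partial_{z_i}^2 + \frac32 (2 - b) \sum_i z_i^2 \partial_{z_i} + 2a \sum_{i<j} \frac{1}{z_i - z_j} \big( z_i^3 \partial_{z_i} - z_j^3 \partial_{z_j} \big) \\
& - a \sum_{i<j} \frac{z_i^3 + z_j^3}{(z_i - z_j)^2}\,(1 - K_{ij}) + a \sum_{i<j} (z_i + z_j)(1 - K_{ij}) + \frac{m(2m - b)}{2} \sum_i z_i,
\end{aligned} \tag{A6}
\]

where $b = 1 + m + a(N - 1)$. Here the symbol $\sum'_{i,j,k}$ stands for summation in $i, j, k$ with $i \ne j \ne k \ne i$. The following identities are needed for the computation of the potential:

\[ \sum_{i,j,k}{}' \frac{1}{(z_i - z_j)(z_i - z_k)} = 0, \tag{A7} \]

\[ \sum_{i,j,k}{}' \frac{z_i}{(z_i - z_j)(z_i - z_k)} = 0, \tag{A8} \]

\[ \sum_{i,j,k}{}' \frac{z_i^2}{(z_i - z_j)(z_i - z_k)} = \frac13 N(N - 1)(N - 2), \tag{A9} \]

\[ \sum_{i,j,k}{}' \frac{z_i^3}{(z_i - z_j)(z_i - z_k)} = (N - 1)(N - 2) \sum_i z_i, \tag{A10} \]

\[ \sum_{i,j,k}{}' \frac{z_i^4}{(z_i - z_j)(z_i - z_k)} = (N - 2) \sum_{i<j} (z_i + z_j)^2, \tag{A11} \]

\[ \sum_{i \ne j} \frac{1}{z_i - z_j} = 0, \tag{A12} \]

\[ \sum_{i \ne j} \frac{z_i}{z_i - z_j} = \frac12 N(N - 1), \tag{A13} \]

\[ \sum_{i \ne j} \frac{z_i^2}{z_i - z_j} = (N - 1) \sum_i z_i, \tag{A14} \]

\[ \sum_{i \ne j} \frac{z_i^3}{z_i - z_j} = (N - 1) \sum_i z_i^2 + \sum_{i<j} z_i z_j. \tag{A15} \]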
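The identities (A7)–(A15) are rational-function identities, so they can be spot-checked exactly at any set of distinct sample points. The sketch below does this with Python's `Fraction` type for $N = 4$; the sample values and the helper names `triple` and `pair` are arbitrary illustrative choices.

```python
from fractions import Fraction
from itertools import permutations

z = [Fraction(v) for v in (2, -3, 5, 7)]  # arbitrary distinct sample points
N = len(z)


def triple(k):
    """Sum over distinct i, j, l of z_i^k / ((z_i - z_j)(z_i - z_l))."""
    return sum(z[i]**k / ((z[i] - z[j]) * (z[i] - z[l]))
               for i, j, l in permutations(range(N), 3))


def pair(k):
    """Sum over i != j of z_i^k / (z_i - z_j)."""
    return sum(z[i]**k / (z[i] - z[j]) for i, j in permutations(range(N), 2))


s1 = sum(z)
s2 = sum(v * v for v in z)
e2 = sum(z[i] * z[j] for i in range(N) for j in range(i + 1, N))
pair_squares = sum((z[i] + z[j])**2 for i in range(N) for j in range(i + 1, N))

checks = {
    "A7": triple(0) == 0,
    "A8": triple(1) == 0,
    "A9": triple(2) == Fraction(N * (N - 1) * (N - 2), 3),
    "A10": triple(3) == (N - 1) * (N - 2) * s1,
    "A11": triple(4) == (N - 2) * pair_squares,
    "A12": pair(0) == 0,
    "A13": pair(1) == Fraction(N * (N - 1), 2),
    "A14": pair(2) == (N - 1) * s1,
    "A15": pair(3) == (N - 1) * s2 + e2,
}
print(checks)
```

A check at one configuration is of course not a proof, but it is a quick way to catch sign or index slips when using these sums to reduce (21) to the form (32).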

Acknowledgements. This work was partially supported by the DGES under grant PB98-0821. R. Zhdanov would like to acknowledge the financial support of the Spanish Ministry of Education and Culture during his stay at the Universidad Complutense de Madrid.

References 1. Baker, T.H. and Forrester, P.J.: The Calogero–Sutherland model and generalized classical polynomials. Commun. Math. Phys. 188, 175–216 (1997) 2. Basu-Mallick, B.: Spin-dependent extension of Calogero–Sutherland model through anyon-like representations of permutation operators. Nucl. Phys. B 482, 713–730 (1996) 3. Bernard, D., Gaudin, M., Haldane, F.D.M. and Pasquier, V.: Yang–Baxter equation in long-range interacting systems. J. Phys. A: Math. Gen. 26, 5219–5236 (1993) 4. Brink, L., Hansson, T.H., Konstein, S. and Vasiliev, M.A.: The Calogero model – anyonic representation, fermionic extension and supersymmetry. Nucl. Phys. B 401, 591–612 (1993) 5. Brink, L. Turbiner, A. and Wyllard, N.: Hidden algebras of the (super) Calogero and Sutherland models. J. Math. Phys. 39, 1285–1315 (1998) 6. Calogero, F.: Solution of the one-dimensional N-body problems with quadratic and/or inversely quadratic pair potentials. J. Math. Phys. 12, 419–436 (1971) 7. Calogero, F.: One-dimensional many-body problems with pair interactions whose exact ground-state wave function is of product type. Lett. Nuovo Cim. 13, 507–511 (1975) 8. Carey, A.L. and Langmann, E.: Loop groups, anyons and the Calogero–Sutherland model. Commun. Math. Phys. 201, 1–34 (1999) 9. Cherednik, I.: Integration of quantum many-body problems by affine Knizhnik–Zamolodchikov equations. Adv. Math. 106, 65–95 (1994) 10. Cotton, É.: Sur les invariants différentiels de quelques équations linéaires aux dérivées partielles du second ordre. Ann. École Norm. 17, 211–244 (1900) 11. D’Hoker, E. and Phong, D.H.: Calogero–Moser systems in SU(N ) Seiberg–Witten theory. Nucl. Phys. B 513, 405–444 (1998) 12. Dunkl, C.F.: Differential-difference operators associated to reflection groups. Trans. Am. Math. Soc. 311, 167–183 (1989) 13. Dunkl, C.F.: Orthogonal polynomials of types A and B and related Calogero models. Commun. Math. Phys. 197, 451–487 (1998) 14. Finkel, F. 
and Kamran, N.: On the equivalence of matrix differential operators to Schrödinger form. Nonlin. Math. Phys. 4, N 3–4, 278–286 (1997) 15. Finkel, F. and Kamran, N.: The Lie algebraic structure of differential operators admitting invariant spaces of polynomials. Adv. Appl. Math. 20, 300–322 (1998)



16. Gómez-Ullate, D., Gónzalez-López, A. and Rodríguez, M.A.: New algebraic quantum many body problems. J. Phys. A: Math. Gen. 33, 7305–7335 (2000) 17. Gómez-Ullate, D., Gónzalez-López, A. and Rodríguez, M.A.: Exact solutions of an elliptic Calogero– Sutherland model. Phys. Lett. B (2001), in press (hep-th/0006039) 18. González-López, A., Kamran, N. and Olver, P.J.: Normalizability of one-dimensional quasi-exactly solvable Schrödinger operators. Commun. Math. Phys. 153, 117–146 (1993) 19. González-López, A., Kamran, N. and Olver, P.J.: Quasi-exact solvability. Contemp. Math. 160, 113–140 (1994) 20. Gorsky, A. and Nekrasov, N.: Hamiltonian systems of Calogero type, and 2-dimensional Yang–Mills theory. Nucl. Phys. B 414, 213–238 (1994) 21. Gurevich, G.B.: Foundations of the Theory of Algebraic Invariants. Groningen: P. Noordhoff Ltd., 1964 22. Haldane, F.D.M.: Exact Jastrow–Gutzwiller resonating-valence-bond ground state of the spin-1/2 antiferromagnetic Heisenberg chain with 1/r 2 exchange. Phys. Rev. Lett. 60, 635–638 (1988) 23. Hikami, K. and Wadati, M.: Integrability of Calogero–Moser spin system. J. Phys. Soc. Jap. 62, 469–472 (1993) 24. Hou, X. and Shifman, M.: A quasi-exactly solvable N -body problem with the sl(N +1) algebraic structure. Int. J. Mod. Phys. A 14, 2993–3003 (1999) 25. Inozemtsev, V.I.: Integrable model of interacting fermions confined by the Morse potential. Int. J. Mod. Phys. A 12, 195–200 (1997) 26. Kasman, A.: Bispectral KP solutions and linearization of Calogero–Moser particle systems. Commun. Math. Phys. 172, 427–448 (1995) 27. Lapointe, L. and Vinet, L.: Exact operator solution of the Calogero–Sutherland model. Commun. Math. Phys. 178, 425–452 (1996) 28. Minzoni, A., Rosenbaum, M. and Turbiner, A.: Quasi-exactly solvable many-body problems. Mod. Phys. Lett. A 11, 1977–1984 (1996) 29. Olshanetsky, M.A. and Perelomov A.M.: Quantum integrable systems related to Lie algebras. Phys. Rep. 94, 313–414 (1983) 30. 
Olver, P.J.: Classical Invariant Theory. Cambridge: Cambridge University Press, 1999 31. Polychronakos, A.P.: Exchange operator formalism for integrable systems of particles. Phys. Rev. Lett. 69, 703–705 (1992) 32. Polychronakos, A.P.: Waves and solitons in the continuum limit of the Calogero–Sutherland model. Phys. Rev. Lett. 74, 5153–5157 (1995) 33. Polychronakos, A.P.: Generalized Calogero models through reductions by discrete symmetries. Nucl. Phys. B 543, 485–498 (1999) 34. Shastry, B.S.: Exact solution of an S = 1/2 Heisenberg antiferromagnetic chain with long-ranged interactions. Phys. Rev. Lett. 60, 639–642 (1988) 35. Shifman, M.A.: New findings in quantum mechanics (partial algebraization of the spectral problem). Int. J. Mod. Phys. A 4, 2897–2952 (1989) 36. Sutherland, B.: Exact results for a quantum many-body problem in one dimension. Phys. Rev. A 4, 2019–2021 (1971) 37. Sutherland, B.: Exact results for a quantum many-body problem in one dimension. II. Phys. Rev. A 5, 1372–1376 (1972) 38. Takemura, K. and Uglov, D.: The orthogonal eigenbasis and norms of eigenvectors in the spin Calogero– Sutherland model. J. Phys. A: Math. Gen. 30, 3685–3717 (1997) 39. Taniguchi, N., Shastry, B.S. and Altshuler, B.L.: Random matrix model and the Calogero–Sutherland model: A novel current-density mapping. Phys. Rev. Lett. 75, 3724–3727 (1995) 40. Turbiner, A.V. and Ushveridze, A.G.: Spectral singularities and quasi-exactly solvable quantal problem. Phys. Lett. A 126, 181–183 (1987) 41. Turbiner, A.V.: Quasi-exactly solvable problems and sl(2) algebra. Commun. Math. Phys. 118, 467–474 (1988) 42. Ujino, H., Nishino, A. and Wadati, M.: New nonsymmetric orthogonal basis for the Calogero model with distinguishable particles. Phys. Lett. A 249, 459–464 (1998) 43. Ushveridze, A.G.: Quasi-Exactly Solvable Models in Quantum Mechanics. Bristol: Institute of Physics Publishing, 1994 44. Yamamoto, T.: Multicomponent Calogero model of BN -type confined in a harmonic potential. Phys. 
Lett. A 208, 293–302 (1995) Communicated by G. Gallavotti

Commun. Math. Phys. 221, 499 – 510 (2001)




Note on the Paper “The Norm Convergence of the Trotter–Kato Product Formula with Error Bound” by Ichinose and Tamura

Takashi Ichinose¹, Hideo Tamura², Hiroshi Tamura¹, Valentin A. Zagrebnov³

¹ Department of Mathematics, Faculty of Science, Kanazawa University, Kanazawa, 920-1192, Japan. E-mail: [email protected]; [email protected]

² Department of Mathematics, Faculty of Science, Okayama University, Okayama, 700-8530, Japan.

³ Département de Physique, Université de la Méditerranée (Aix-Marseille II) and Centre de Physique Théorique, CNRS-Luminy-Case 907, 13288 Marseille Cedex 9, France. E-mail: [email protected]

Received: 5 October 2000 / Accepted: 12 March 2001

Abstract: The norm convergence of the Trotter–Kato product formula is established with ultimate optimal error bound for the selfadjoint semigroup generated by the operator sum of two selfadjoint operators. A generalization is also given to the operator sum of several selfadjoint operators.

1. Introduction

The present note is an addendum to the recent paper [1] by the first two authors. The aim is to prove the norm convergence of the Trotter–Kato product formula for the selfadjoint semigroup with ultimate error bound. To refer to some items in that paper we shall write, for instance, Lemma 2.1 in [1] as Lemma I.2.1, Eq. (3.2) in [1] as (I.3.2), Ref. [4] in [1] as [I 4], and so on. To formulate our new theorems, again consider real-valued, Borel measurable functions $f$ on $[0, \infty)$ satisfying

\[ 0 \le f(s) \le 1, \qquad f(0) = 1, \qquad f'(0) = -1. \tag{1.1} \]

Some examples of functions satisfying (1.1) are

\[ f(s) = e^{-s}, \qquad f(s) = (1 + k^{-1}s)^{-k}, \quad k > 0. \tag{1.2} \]

We are interested in those functions $f$ which satisfy not only (1.1) but also that for every small $\varepsilon > 0$ there exists a positive constant $\delta = \delta(\varepsilon) < 1$ such that

\[ f(s) \le 1 - \delta(\varepsilon), \qquad s \ge \varepsilon, \tag{1.3} \]

and that for some fixed constant $\kappa$ with $1 < \kappa \le 2$,

\[ [f]_\kappa := \sup_{s>0} \frac{|f(s) - 1 + s|}{s^\kappa} < \infty. \tag{1.4} \]
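The constant $[f]_\kappa$ in (1.4) can be estimated numerically. The following is a minimal sketch, not part of the original argument: the helper names `seminorm`, `f_exp` and `f_power`, the grid bounds and the choice $k = 2$ are all assumptions made for illustration. For $f(s) = e^{-s}$ and $\kappa = 2$ the quotient in (1.4) increases to $1/2$ as $s \downarrow 0$, and the same Taylor argument gives $(k+1)/(2k)$ for $f(s) = (1 + k^{-1}s)^{-k}$, so the grid values should approach these numbers from below.

```python
import math

def seminorm(f, kappa=2.0, s_min=1e-3, s_max=1e3, points=20001):
    """Grid approximation of [f]_kappa = sup_{s>0} |f(s) - 1 + s| / s**kappa."""
    best = 0.0
    for i in range(points):
        # log-spaced grid between s_min and s_max (both are illustrative choices)
        s = s_min * (s_max / s_min) ** (i / (points - 1))
        best = max(best, abs(f(s) - 1.0 + s) / s ** kappa)
    return best

def f_exp(s):
    """First example in (1.2)."""
    return math.exp(-s)

def f_power(s, k=2.0):
    """Second example in (1.2), with the hypothetical choice k = 2."""
    return (1.0 + s / k) ** (-k)

sup_exp = seminorm(f_exp)
sup_power = seminorm(f_power)
print("[e^{-s}]_2 on the grid:", sup_exp)
print("[(1 + s/2)^{-2}]_2 on the grid:", sup_power)
```

A finite grid only bounds the supremum from below; here that is harmless because the quotient is monotone in $s$ for both examples.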


A function $f(s)$ satisfying (1.1) has property (1.3), if it is non-increasing. Of course, the functions in (1.2) have properties (1.3) and (1.4). Condition (1.3) is necessary. For this account and some further remarks on conditions (1.3) and (1.4) we refer to [1]. Then we can show the following theorem.

Theorem 1. Let $f$ and $g$ be functions having properties (1.3) and (1.4) with $\kappa = 2$ as well as (1.1). If $A$ and $B$ are nonnegative selfadjoint operators in a Hilbert space $\mathcal{H}$ with domains $D[A]$ and $D[B]$ such that the operator sum $C := A + B$ is selfadjoint on $D[C] = D[A] \cap D[B]$, then it holds in operator norm that

\[ \begin{aligned}
\|[g(tB/2n)f(tA/n)g(tB/2n)]^n - e^{-tC}\| &= O(n^{-1}), \\
\|[f(tA/n)g(tB/n)]^n - e^{-tC}\| &= O(n^{-1}), \qquad n \to \infty.
\end{aligned} \tag{1.5} \]

The convergence is uniform on each compact $t$-interval in the closed half line $[0, \infty)$, and further, if $C$ is strictly positive, i.e. $C \ge \eta$ for some constant $\eta > 0$, uniform on the whole closed half line $[0, \infty)$.

Taking, for instance, $f(s) = g(s) = e^{-s}$ in (1.2), we have the following

Corollary 1. For nonnegative selfadjoint operators $A$ and $B$ whose operator sum $C := A + B$ is selfadjoint on $D[C] = D[A] \cap D[B]$ it holds in operator norm that

\[ \begin{aligned}
\|(e^{-tB/2n} e^{-tA/n} e^{-tB/2n})^n - e^{-tC}\| &= O(n^{-1}), \\
\|(e^{-tA/n} e^{-tB/n})^n - e^{-tC}\| &= O(n^{-1}), \qquad n \to \infty,
\end{aligned} \tag{1.6} \]

uniformly on each compact $t$-interval in $[0, \infty)$, and further, if $C$ is strictly positive, uniformly on $[0, \infty)$.

Theorem 1 was first proved in [1] with the weaker error bound $O(n^{-1/2})$, for $f$ and $g$ satisfying (1.4) with $3/2 \le \kappa \le 2$ as well as (1.1) and (1.3), and with convergence uniform only on each compact $t$-interval in $(0, \infty)$ and, for $C$ strictly positive, on the half line $[T, \infty)$ for every $T > 0$. The error bound $O(n^{-1})$ obtained in Theorem 1 turns out to be optimal and ultimate, though with $\kappa = 2$. Therefore Theorem 1 does properly extend and contain, now with optimal error bound, all the known related results, not only in the abstract case such as in Rogava [I 21], Ichinose–Tamura [I 11, I 13] and Neidhardt–Zagrebnov [I 16, I 17, I 18], but also for the Schrödinger operators such as in Helffer [I 6], Dia–Schatzman [I 3], Ichinose–Takanobu [I 7, I 8], Doumeki–Ichinose–Tamura [I 4], Ichinose–Tamura [I 12] and Ichinose–Takanobu [I 9, I 10]. Indeed, in all these cases, the operator sum of the two selfadjoint operators concerned there is selfadjoint. For a more detailed account of these facts we refer to the Introduction of [1].

Next, we want to give a generalization to the case of the sum of $m$ selfadjoint operators $A_1, A_2, \dots, A_m$ in $\mathcal{H}$. Then the product formula

\[ \lim_{n\to\infty} (e^{-tA_1/n} e^{-tA_2/n} \cdots e^{-tA_m/n})^n = e^{-tC} \]

in strong operator topology was already shown by Kato–Masuda [2], when $C$ is even the form sum of $A_1, A_2, \cdots, A_m$ which is selfadjoint. In this note we content ourselves with showing the following theorem, though it only deals with the symmetric product case.


Theorem 2. Let $f_1, \cdots, f_m$ be functions having properties (1.3) and (1.4) with $\kappa = 2$ as well as (1.1). If $A_1, \cdots, A_m$ are $m$ nonnegative selfadjoint operators in a Hilbert space $\mathcal{H}$ with domains $D[A_1], \cdots, D[A_m]$ such that the operator sum $C := A_1 + \cdots + A_m$ is selfadjoint on $D[C] = D[A_1] \cap \cdots \cap D[A_m]$, then it holds in operator norm that

\[ \|[f_m(tA_m/2n) \cdots f_2(tA_2/2n) f_1(tA_1/n) f_2(tA_2/2n) \cdots f_m(tA_m/2n)]^n - e^{-tC}\| = O(n^{-1}), \qquad n \to \infty. \tag{1.7} \]

The convergence is uniform on each compact $t$-interval in the closed half line $[0, \infty)$, and further, if $C$ is strictly positive, i.e. $C \ge \eta$ for some constant $\eta > 0$, uniform on the whole closed half line $[0, \infty)$.

To prove our theorems we make essential use of Lemma I.2.1, the operator-norm version of Chernoff’s theorem with error bound, proved in [1]. Theorems 1 and 2 are shown in Sects. 2 and 3. Section 4 remarks on the optimality of the new error bound $O(n^{-1})$.

2. Proof of Theorem 1

(a) The symmetric product case. We jump directly to the circumstances around (I.3.6) in the proof of this case of the theorem in [1]. So recall the notation $S_t = t^{-1}(1 - F(t))$ with $F(t) = g(tB/2)f(tA)g(tB/2)$ as well as (I.3.2). By Lemma I.2.1 with $\alpha = 1$, it suffices to show that

\[ \|(1 + S_t)^{-1} - (1 + C)^{-1}\| = O(t), \qquad t \downarrow 0. \tag{2.1} \]

We have

\[ (1 + S_t)^{-1} - (1 + C)^{-1} = (1 + S_t)^{-1}(C - S_t)(1 + C)^{-1}. \tag{2.2} \]

The new idea is to iterate this formula (2.2) with help of its adjoint form. Then we get

\[ \begin{aligned}
(1 + S_t)^{-1} - (1 + C)^{-1}
&= \big((1 + C)^{-1} + [(1 + S_t)^{-1} - (1 + C)^{-1}]\big)(C - S_t)(1 + C)^{-1} \\
&= (1 + C)^{-1}(C - S_t)(1 + C)^{-1} + [(C - S_t)(1 + C)^{-1}]^{*}(1 + S_t)^{-1}(C - S_t)(1 + C)^{-1} \\
&\equiv R_1(t) + R_2(t).
\end{aligned} \tag{2.3} \]

First, for the second term in the last member of (2.3), we have, using (I.3.8),

\[ R_2(t) = [K_t^{-1/2}(C - S_t)(1 + C)^{-1}]^{*}(1 + Q_t)^{-1}[K_t^{-1/2}(C - S_t)(1 + C)^{-1}]. \]

What is actually proved in Lemma I.3.2 is

\[ \|K_t^{-1/2}(C - S_t)(1 + C)^{-1}\| = O(t^{1/2}). \]

By this bound and Lemma I.3.1, we have the bound

\[ \|R_2(t)\| = O(t). \tag{2.4} \]
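The iterated resolvent identity (2.3) is purely algebraic, so it can be sanity-checked on matrices. The sketch below is illustrative only: the $2 \times 2$ matrices `A` and `B`, the choice $f = g = e^{-s}$, and all helper names are assumptions, and the Frobenius norm stands in for the operator norm. It prints the residual of (2.3), which should sit at rounding level, and the ratio $\|(1+S_t)^{-1} - (1+C)^{-1}\|/t$, which should stay bounded as $t \downarrow 0$ in line with (2.1).

```python
import math

I2 = [[1.0, 0.0], [0.0, 1.0]]

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def lin(X, Y, c=1.0):
    """Return X + c * Y."""
    return [[X[i][j] + c * Y[i][j] for j in range(2)] for i in range(2)]

def inv(X):
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det], [-X[1][0] / det, X[0][0] / det]]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def expm(X, terms=25, squarings=8):
    """exp(X): Taylor series of X / 2**squarings followed by repeated squaring."""
    Y = [[x / 2.0 ** squarings for x in row] for row in X]
    total, term = I2, I2
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mul(term, Y)]
        total = lin(total, term)
    for _ in range(squarings):
        total = mul(total, total)
    return total

def norm(X):
    """Frobenius norm (equivalent to the operator norm in finite dimension)."""
    return math.sqrt(sum(x * x for row in X for x in row))

# Hypothetical nonnegative symmetric test matrices.
A = [[2.0, 0.0], [0.0, 0.0]]
B = [[1.0, 1.0], [1.0, 1.0]]
C = lin(A, B)

def pieces(t):
    """Return (left side of (2.3), R1, R2) for F(t) = exp(-tB/2) exp(-tA) exp(-tB/2)."""
    half = expm([[-t * b / 2.0 for b in row] for row in B])
    F = mul(mul(half, expm([[-t * a for a in row] for row in A])), half)
    S = [[(I2[i][j] - F[i][j]) / t for j in range(2)] for i in range(2)]
    rS, rC = inv(lin(I2, S)), inv(lin(I2, C))
    DC = mul(lin(C, S, -1.0), rC)            # (C - S_t)(1 + C)^{-1}
    R1 = mul(rC, DC)
    R2 = mul(mul(transpose(DC), rS), DC)     # adjoint = transpose for real matrices
    return lin(rS, rC, -1.0), R1, R2

for t in (0.1, 0.01, 0.001):
    lhs, R1, R2 = pieces(t)
    residual = norm(lin(lhs, lin(R1, R2), -1.0))
    print(t, residual, norm(lhs) / t)
```

For bounded matrices this only illustrates the algebra; the substance of the proof is that the same bounds survive for unbounded $A$ and $B$.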



Next, the first term can be represented (cf. (I.3.12)) as

\[ \begin{aligned}
R_1(t) &= (1 + C)^{-1}(A - A_t)(1 + C)^{-1} + (1 + C)^{-1}(B - B_{t/2})(1 + C)^{-1} \\
&\quad + (1 + C)^{-1}\Big[\frac{t}{4} B_{t/2}(1 - tA_t)B_{t/2} + \frac{t}{2}(A_t B_{t/2} + B_{t/2} A_t)\Big](1 + C)^{-1} \\
&\equiv R_{11}(t) + R_{12}(t) + R_{13}(t).
\end{aligned} \tag{2.5} \]

In the next lemma we prove that all three $R_{1j}(t)$ in the last member of (2.5) have norm of order $O(t)$.

Lemma 1.

\[ \|R_{11}(t)\| \le a^2 [f]_2\, t, \qquad \|R_{12}(t)\| \le 2^{-1} a^2 [g]_2\, t, \qquad \|R_{13}(t)\| \le d a^2 t, \tag{2.6} \]

with a constant $d$ independent of $t > 0$.

Proof. I. Just in the same way as in (I.3.1), since $C$ is a selfadjoint and so closed operator, by the closed graph theorem there is a positive constant $a$ such that

\[ \begin{aligned}
\|(1 + A)(1 + C)^{-1}\| &= \|(1 + C)^{-1}(1 + A)\| \le a, \\
\|(1 + B)(1 + C)^{-1}\| &= \|(1 + C)^{-1}(1 + B)\| \le a.
\end{aligned} \tag{2.7} \]

Therefore $R_{11}(t)$ is rewritten as

\[ R_{11}(t) = [(1 + C)^{-1}(1 + A)][(1 + A)^{-1} - (1 + A)^{-2}(1 + A_t)][(1 + A)(1 + C)^{-1}], \]

so that by (2.7)

\[ \|R_{11}(t)\| \le a^2 \|(1 + A)^{-1} - (1 + A)^{-2}(1 + A_t)\|. \]

On the other hand, we have by our assumption on $f$

\[ \begin{aligned}
\|(1 + A)^{-1} - (1 + A)^{-2}(1 + A_t)\|
&= \sup_{\lambda \ge 0} \Big| \frac{1}{1 + \lambda} - \Big(1 + \frac{1 - f(t\lambda)}{t}\Big)\Big/(1 + \lambda)^2 \Big| \\
&= t \sup_{\lambda \ge 0} \Big(\frac{\lambda}{1 + \lambda}\Big)^2 \frac{|f(t\lambda) - 1 + t\lambda|}{t^2\lambda^2} \\
&\le [f]_2\, t.
\end{aligned} \]

Thus we get the bound $\|R_{11}(t)\| \le a^2 [f]_2\, t$.

II. For $R_{12}(t)$, the proof is the same as for $R_{11}(t)$. We have only to note that

\[ R_{12}(t) = [(1 + C)^{-1}(1 + B)][(1 + B)^{-1} - (1 + B)^{-2}(1 + B_{t/2})][(1 + B)(1 + C)^{-1}]. \]

III. For $R_{13}(t)$ we have

\[ \begin{aligned}
R_{13}(t) &= \frac{t}{4}[(1 + C)^{-1}(1 + B)][(1 + B)^{-1}B_{t/2}] \cdot f(tA)[B_{t/2}(1 + B)^{-1}][(1 + B)(1 + C)^{-1}] \\
&\quad + \frac{t}{2}[(1 + C)^{-1}(1 + A)][(1 + A)^{-1}A_t][B_{t/2}(1 + B)^{-1}][(1 + B)(1 + C)^{-1}] \\
&\quad + \frac{t}{2}[(1 + C)^{-1}(1 + B)][(1 + B)^{-1}B_{t/2}] \cdot [A_t(1 + A)^{-1}][(1 + A)(1 + C)^{-1}].
\end{aligned} \]


Then by (2.7) with the constants $a_0$, $b_0$ introduced in (I.3.14), we get the bound

\[ \|R_{13}(t)\| \le (a_0 b_0 + b_0^2/4)\, a^2 t. \]

This completes the proof of Lemma 1, ending the proof of the symmetric product case.

(b) The non-symmetric product case. The proof in this case in [1] also is valid, as mentioned at the beginning of the proof of this case there, because Lemma I.2.1 holds for every $\alpha$ with $0 < \alpha \le 1$. This ends the proof of Theorem 1.

3. Proof of Theorem 2

Let

\[ C_j = A_1 + A_2 + \cdots + A_j, \qquad j = 1, 2, \cdots, m. \tag{3.1} \]

Here $C_j$ may be understood simply as the operator sum of $j$ selfadjoint operators $A_1, A_2, \cdots, A_j$ with domain $D[C_j] := D[A_1] \cap D[A_2] \cap \cdots \cap D[A_j]$, which may not be selfadjoint if $1 < j < m$, or as the form sum of these $j$ operators which is selfadjoint. Note that $C_1 = A_1$ and $C_m = C$. Put, with the notations (I.3.2),

\[ \begin{aligned}
A_{j,t} &= t^{-1}[1 - f_j(tA_j)], \\
C_{j,t} &= t^{-1}[1 - f_j(tA_j/2) \cdots f_2(tA_2/2) f_1(tA_1) f_2(tA_2/2) \cdots f_j(tA_j/2)], \\
K_{j,t} &= 1 + C_{j-1,t} + A_{j,t/2} - \frac{t}{4} A_{j,t/2}^2
\end{aligned} \tag{3.2} \]

for $j = 2, 3, \cdots, m$. There will be below no $C_t$, which $C_{m,t}$ differs from. Moreover, we put

\[ Q_{j,t} = \frac{t^2}{4} K_{j,t}^{-1/2} A_{j,t/2} C_{j-1,t} A_{j,t/2} K_{j,t}^{-1/2} - \frac{t}{2} K_{j,t}^{-1/2}(C_{j-1,t} A_{j,t/2} + A_{j,t/2} C_{j-1,t}) K_{j,t}^{-1/2}. \]

Then we have the identity

\[ 1 + C_{j,t} = K_{j,t}^{1/2}(1 + Q_{j,t}) K_{j,t}^{1/2}, \tag{3.3} \]

and the following estimate may be proved by the same reasoning as in the proof of Lemma I.3.1:

\[ \|(1 + Q_{j,t})^{-1}\| \le 2/(3 - \sqrt{5}). \tag{3.4} \]

Similar to the proof of our Theorem 1, all we have to do now is to show the following two estimates:

\[ \|(1 + C)^{-1}(C - C_{m,t})(1 + C)^{-1}\| = O(t), \]

and

\[ \|K_{m,t}^{-1/2}(C - C_{m,t})(1 + C)^{-1}\| = O(t^{1/2}). \]



To this end, we prove for each $j = 1, 2, \cdots, m$ the following estimates:

(a)$_j$ $\|(1 + C)^{-1}(C_j - C_{j,t})(1 + C)^{-1}\| = O(t)$,

(b)$_j$ $\|K_{j,t}^{-1/2}(C_j - C_{j,t})(1 + C)^{-1}\| = O(t^{1/2})$,

(c)$_j$ $\|C_{j,t}(1 + C)^{-1}\| = O(1)$.

The proof is done by induction on $j$. Notice that for each inductive step $j$, we have, similarly to (I.3.8) and (I.3.12), the following identity:

\[ \begin{aligned}
C_{j+1} - C_{j+1,t} &= (C_j - C_{j,t}) + (A_{j+1} - A_{j+1,t/2}) \\
&\quad + \frac{t}{4} A_{j+1,t/2}(1 - tC_{j,t}) A_{j+1,t/2} + \frac{t}{2}(C_{j,t} A_{j+1,t/2} + A_{j+1,t/2} C_{j,t})
\end{aligned} \tag{3.5} \]

as operators on $D[C]$ and the estimate

\[ \|K_{j,t}^{-1/2} K_{j-1,t}^{1/2}\| \le 2/(3 - \sqrt{5}). \tag{3.6} \]

Here we can see (3.6) from (3.3), (3.4) with definitions (3.1) and (3.2) for $j = 2, \cdots, m$, noting $K_{j,t} \ge 1 + C_{j-1,t}$ and setting $K_{1,t} = 1 + C_{1,t} = 1 + A_{1,t}$. As in the proof of Theorem 1, since $C$ is a selfadjoint and so closed operator, we have again by the closed graph theorem

\[ \|(1 + A_j)(1 + C)^{-1}\| \le a, \qquad j = 1, 2, \cdots, m, \tag{3.7} \]

with some positive constant $a$. For $j = 1$, the estimates (a)$_1$ and (b)$_1$ are trivial. The estimate (c)$_1$ is also obvious. Assume that the estimates (a)$_j$, (b)$_j$ and (c)$_j$ are valid. Then we use

\[ C_{j+1,t}(1 + C)^{-1} = C_{j+1}(1 + C)^{-1} + (C_{j+1,t} - C_{j+1})(1 + C)^{-1} \tag{3.8} \]

to show the estimate (c)$_{j+1}$. The first term on the right-hand side of (3.8) is bounded in view of (3.7) and by induction hypothesis (c)$_j$. We use (3.5) and (c)$_j$ to estimate the second term. By analogous arguments used to prove our Theorem 1, the identity (3.5) gives us the estimate (a)$_{j+1}$ and, together with (3.6), the estimate (b)$_{j+1}$. Details of these estimates are now some routine calculations. Thus we have proved the estimates (a)$_j$, (b)$_j$ and (c)$_j$ for all $j = 1, 2, \cdots, m$, ending the proof of Theorem 2.

4. Optimality of the Error Bound

In this section, we want to note that the new error bound $O(n^{-1})$ in Theorem 1 is optimal. We consider with $f(s) = g(s) = e^{-s}$ first the non-symmetric and next symmetric product case.

(a) The non-symmetric product case. For the time being, let $A$ and $B$ be simply bounded operators. Then by the Baker–Campbell–Hausdorff formula (e.g. [5, 3]) we have for small $|t|$

\[ e^{-tA} e^{-tB} = \exp\Big(-t(A + B) + \frac{t^2}{2}[A, B] - \frac{t^3}{6}\big[A - B, \tfrac{1}{2}[A, B]\big] + O_p(|t|^4)\Big), \tag{4.1} \]



with $[A, B] = AB - BA$, where here and below $O_p(|t|^k)$, for $k > 0$, means some bounded operator with norm of order $O(|t|^k)$. Then

\[ N(t) := (e^{-tA} e^{-tB})^{1/t} = \exp\Big(-(A + B) + \frac{t}{2}[A, B] + O_p(|t|^2)\Big). \]

We understand $N(0) = e^{-(A+B)}$ and have

\[ E_1(A; B) := \frac{d}{dt} N(t)\Big|_{t=0} = \frac{1}{2} \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k!} \sum_{j=1}^{k} (A + B)^{j-1}[A, B](A + B)^{k-j}, \tag{4.2} \]

of which the right-hand side is norm-convergent and can be a non-zero operator with bound $2^{-1}\|[A, B]\| e^{\|A+B\|}$, if $A$ and $B$ do not commute with each other. It follows that $N(t) = e^{-(A+B)} + tE_1(A; B) + O_p(t^2)$, so that with $t = 1/n$,

\[ (e^{-A/n} e^{-B/n})^n = N(1/n) = e^{-(A+B)} + n^{-1}E_1(A; B) + O_p(n^{-2}). \tag{4.3} \]
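The expansion (4.3) can be observed numerically. The following sketch is an illustration under stated assumptions, not the authors' computation: the bounded test matrices `A` and `B`, the truncation of the series (4.2) at 40 terms, and all helper names are hypothetical choices. It compares $n[(e^{-A/n}e^{-B/n})^n - e^{-(A+B)}]$ with the truncated series for $E_1(A;B)$; the difference should shrink roughly like $1/n$.

```python
import math

I2 = [[1.0, 0.0], [0.0, 1.0]]

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def lin(X, Y, c=1.0):
    """Return X + c * Y."""
    return [[X[i][j] + c * Y[i][j] for j in range(2)] for i in range(2)]

def scale(X, c):
    return [[c * x for x in row] for row in X]

def expm(X, terms=25, squarings=8):
    """exp(X): Taylor series of X / 2**squarings followed by repeated squaring."""
    Y = scale(X, 1.0 / 2.0 ** squarings)
    total, term = I2, I2
    for k in range(1, terms):
        term = scale(mul(term, Y), 1.0 / k)
        total = lin(total, term)
    for _ in range(squarings):
        total = mul(total, total)
    return total

def mpow(X, n):
    """X**n by binary exponentiation."""
    result, base = I2, X
    while n > 0:
        if n & 1:
            result = mul(result, base)
        base = mul(base, base)
        n >>= 1
    return result

def norm(X):
    """Frobenius norm (equivalent to the operator norm in finite dimension)."""
    return math.sqrt(sum(x * x for row in X for x in row))

# Hypothetical bounded, non-commuting test matrices.
A = [[2.0, 0.0], [0.0, 0.0]]
B = [[1.0, 1.0], [1.0, 1.0]]
C = lin(A, B)
comm = lin(mul(A, B), mul(B, A), -1.0)        # [A, B]

def first_order_term(terms=40):
    """Truncation of the double series (4.2) for E_1(A; B)."""
    powers = [I2]
    for _ in range(terms):
        powers.append(mul(powers[-1], C))
    total = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(1, terms + 1):
        coeff = 0.5 * (-1.0) ** (k - 1) / math.factorial(k)
        for j in range(1, k + 1):
            total = lin(total, mul(mul(powers[j - 1], comm), powers[k - j]), coeff)
    return total

def trotter(n):
    """Non-symmetric product (e^{-A/n} e^{-B/n})^n."""
    return mpow(mul(expm(scale(A, -1.0 / n)), expm(scale(B, -1.0 / n))), n)

E1 = first_order_term()
limit = expm(scale(C, -1.0))
for n in (10, 100, 1000):
    scaled_error = scale(lin(trotter(n), limit, -1.0), float(n))
    print(n, norm(lin(scaled_error, E1, -1.0)), norm(E1))
```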

Thus we have seen that, in the non-symmetric case, the error bound $O(n^{-1})$ is optimal.

(b) The symmetric product case. We have from (4.1),

\[ e^{-tB/2} e^{-tA} e^{-tB/2} = \exp\Big(-t(A + B) - \frac{t^3}{24}[2A + B, [A, B]] + O_p(|t|^4)\Big). \tag{4.4} \]

Similarly it follows that

\[ S(t) := (e^{-tB/2} e^{-tA} e^{-tB/2})^{1/t} = \exp\Big(-(A + B) - \frac{t^2}{24}[2A + B, [A, B]] + O_p(|t|^3)\Big). \]

Here we understand $S(0) = e^{-(A+B)}$ and have $dS(t)/dt|_{t=0} = 0$, so that with $t = 1/n$,

\[ (e^{-B/2n} e^{-A/n} e^{-B/2n})^n = S(1/n) = e^{-(A+B)} + O_p(n^{-2}). \tag{4.5} \]
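For bounded operators the $O(n^{-2})$ behaviour in (4.5) is easy to see numerically. The sketch below is illustrative only: the $2 \times 2$ test matrices and helper names are assumptions, and the Frobenius norm replaces the operator norm. Doubling $n$ should divide the error of the symmetric product by roughly four, so $n^2 \cdot \text{error}$ stays roughly constant.

```python
import math

I2 = [[1.0, 0.0], [0.0, 1.0]]

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def lin(X, Y, c=1.0):
    """Return X + c * Y."""
    return [[X[i][j] + c * Y[i][j] for j in range(2)] for i in range(2)]

def scale(X, c):
    return [[c * x for x in row] for row in X]

def expm(X, terms=25, squarings=8):
    """exp(X): Taylor series of X / 2**squarings followed by repeated squaring."""
    Y = scale(X, 1.0 / 2.0 ** squarings)
    total, term = I2, I2
    for k in range(1, terms):
        term = scale(mul(term, Y), 1.0 / k)
        total = lin(total, term)
    for _ in range(squarings):
        total = mul(total, total)
    return total

def mpow(X, n):
    """X**n by binary exponentiation."""
    result, base = I2, X
    while n > 0:
        if n & 1:
            result = mul(result, base)
        base = mul(base, base)
        n >>= 1
    return result

def norm(X):
    """Frobenius norm (equivalent to the operator norm in finite dimension)."""
    return math.sqrt(sum(x * x for row in X for x in row))

# Hypothetical bounded, non-commuting test matrices.
A = [[2.0, 0.0], [0.0, 0.0]]
B = [[1.0, 1.0], [1.0, 1.0]]
limit = expm(scale(lin(A, B), -1.0))

def symmetric_product(n):
    """(e^{-B/2n} e^{-A/n} e^{-B/2n})^n as in (4.5)."""
    half = expm(scale(B, -0.5 / n))
    return mpow(mul(mul(half, expm(scale(A, -1.0 / n))), half), n)

errors = {n: norm(lin(symmetric_product(n), limit, -1.0)) for n in (50, 100, 200)}
for n, err in errors.items():
    print(n, err, n * n * err)
```

This faster rate relies on $A$ and $B$ being bounded; it does not carry over to the unbounded case.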

Hence, in the symmetric case, the optimal error bound would appear to be of order $O(n^{-2})$. But it is not. In fact, in this case also the optimal error bound is just of order $O(n^{-1})$. In the following example we shall see that there exist unbounded nonnegative selfadjoint operators $A$ and $B$ in a Hilbert space $\mathcal{H}$ such that the operator sum $A + B$ is selfadjoint on $D[A] \cap D[B]$ and the following lower estimate holds for $n$ large:

\[ \|e^{-t(A+B)} - (e^{-tB/2n} e^{-tA/n} e^{-tB/2n})^n\| \ge L(t)\, n^{-1}, \]

where $L(t)$ is a positive continuous function of $t > 0$, independent of $n$. We are using the same idea as in [4].



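Example. Let $\mathcal{H} = \bigoplus_{k=1}^{\infty} \mathcal{H}_k$ be the direct sum of a countable family of Hilbert spaces $\mathcal{H}_k := \mathbb{R}^2$ with inner product $(\cdot, \cdot)_k$. It has the inner product $(z, w) = \sum_{k=1}^{\infty} (z_k, w_k)_k$, for $z = (z_k)_{k \in \mathbb{N}}$ and $w = (w_k)_{k \in \mathbb{N}}$ in $\mathcal{H}$. Let $S$, $T$ and $E$ be the matrices

\[ S = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad T = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad E = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{4.6} \]

Note that

\[ ST + TS = O, \qquad S^2 = T^2 = E. \tag{4.7} \]

For each $k$, define two bounded nonnegative selfadjoint operators

\[ A_k = k(S + E), \qquad B_k = k(S\cos\theta_k + T\sin\theta_k + E) \tag{4.8} \]

on $\mathcal{H}_k$, where the parameters $\theta_k \in (0, \pi/2]$ are so chosen that

\[ \cos\theta_k = 1 - \varepsilon_k, \qquad \varepsilon_k = 1/2k^2. \tag{4.9} \]

Then consider two unbounded nonnegative selfadjoint operators

\[ A = (A_k)_{k \in \mathbb{N}}, \qquad B = (B_k)_{k \in \mathbb{N}} \tag{4.10a} \]

in $\mathcal{H}$ with domains

\[ D[A] = \Big\{ z = (z_k)_{k \in \mathbb{N}};\ \sum_k \|A_k z_k\|_k^2 < \infty \Big\}, \qquad D[B] = \Big\{ z = (z_k)_{k \in \mathbb{N}};\ \sum_k \|B_k z_k\|_k^2 < \infty \Big\}, \tag{4.10b} \]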

and their operator sum $A + B = (A_k + B_k)_{k \in \mathbb{N}}$ with domain $D[A] \cap D[B]$, which is symmetric and nonnegative. In the following two propositions we shall see that these operators constitute an example where the lower error bound in the symmetric product case is also just $L(t)n^{-1}$ with a positive continuous function $L(t)$ of $t > 0$.

Proposition 1. $A$ and $B$ have the same domain, and the operator sum $A + B$ is selfadjoint on $D[A] \cap D[B] = D[A] = D[B]$.

Proof. For each $k$ we have

\[ A_k = 2k \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad B_k = 2k \begin{pmatrix} \cos^2(\theta_k/2) & \cos(\theta_k/2)\sin(\theta_k/2) \\ \cos(\theta_k/2)\sin(\theta_k/2) & \sin^2(\theta_k/2) \end{pmatrix}, \]

so that

\[ A_k + B_k = 2k[E + \cos(\theta_k/2)(S\cos(\theta_k/2) + T\sin(\theta_k/2))]. \tag{4.11} \]

$A_k + B_k$ has eigenvalues $2k(1 \pm \cos(\theta_k/2))$, and can be diagonalized with the orthogonal matrix

\[ P_k = \begin{pmatrix} \cos(\theta_k/4) & \sin(\theta_k/4) \\ -\sin(\theta_k/4) & \cos(\theta_k/4) \end{pmatrix} \]

as

\[ P_k(A_k + B_k)P_k^{-1} = 2k \begin{pmatrix} 1 + \cos(\theta_k/2) & 0 \\ 0 & 1 - \cos(\theta_k/2) \end{pmatrix}, \]

so that

\[ P_k(A_k + B_k)^2 P_k^{-1} = (4k)^2 \begin{pmatrix} \cos^4(\theta_k/4) & 0 \\ 0 & \sin^4(\theta_k/4) \end{pmatrix}. \]
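The block structure of the Example can be checked directly. In the following sketch (helper names are hypothetical, and only a handful of values of $k$ are sampled) the matrix $A_k + B_k$ is built from (4.8) and (4.9), conjugated with the rotation $P_k$, and compared with the diagonal matrix with entries $2k(1 \pm \cos(\theta_k/2))$ stated above.

```python
import math

def theta(k):
    """theta_k defined through cos(theta_k) = 1 - 1/(2 k^2), see (4.9)."""
    return math.acos(1.0 - 1.0 / (2.0 * k * k))

def sum_block(k):
    """A_k + B_k with A_k = k(S + E) and B_k = k(S cos(theta_k) + T sin(theta_k) + E)."""
    c, s = math.cos(theta(k)), math.sin(theta(k))
    A = [[2.0 * k, 0.0], [0.0, 0.0]]
    B = [[k * (1.0 + c), k * s], [k * s, k * (1.0 - c)]]
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def rotation(k):
    """The orthogonal matrix P_k (rotation by theta_k / 4)."""
    q = theta(k) / 4.0
    return [[math.cos(q), math.sin(q)], [-math.sin(q), math.cos(q)]]

def mul(X, Y):
    return [[sum(X[i][m] * Y[m][j] for m in range(2)) for j in range(2)] for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def diagonalized(k):
    """P_k (A_k + B_k) P_k^{-1}; the inverse of a rotation is its transpose."""
    P = rotation(k)
    return mul(mul(P, sum_block(k)), transpose(P))

def expected_eigenvalues(k):
    h = math.cos(theta(k) / 2.0)
    return 2.0 * k * (1.0 + h), 2.0 * k * (1.0 - h)

for k in (1, 2, 5, 10):
    print(k, diagonalized(k), expected_eigenvalues(k))
```

The smaller eigenvalue $2k(1 - \cos(\theta_k/2))$ tends to zero as $k$ grows while the larger one grows like $4k$, which is what makes the family unbounded yet delicately non-commuting.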

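To show Proposition 1, we have only to show that

\[ \|Bz\|^2 \le 2\|Az\|^2 + 2\|z\|^2, \qquad z \in D[A], \tag{4.12a} \]

\[ \|Az\|^2 \le \tfrac{1}{2}\|(A + B)z\|^2 + \tfrac{1}{2}\|z\|^2, \qquad z \in D[A] \cap D[B], \tag{4.12b} \]

for it also follows from (4.12ab) that

\[ \|Az\| \le (\sqrt{2} + 1)(\|Bz\| + \|z\|), \qquad z \in D[B]. \]

To do so it suffices to show that for $z_k = {}^t(x_k, y_k) \in \mathbb{R}^2$,

\[ \|B_k z_k\|_k^2 \le 2\|A_k z_k\|_k^2 + 2\|z_k\|_k^2, \tag{4.13a} \]

\[ \|A_k z_k\|_k^2 \le \tfrac{1}{2}\|(A_k + B_k)z_k\|_k^2 + \tfrac{1}{2}\|z_k\|_k^2. \tag{4.13b} \]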

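We get (4.13a) for $z_k = {}^t(x_k, y_k) \in \mathbb{R}^2$ as

\[ \begin{aligned}
\|B_k z_k\|_k^2 = (z_k, B_k^2 z_k)_k &= (2k)^2 \big[x_k\cos(\theta_k/2) + y_k\sin(\theta_k/2)\big]^2 \\
&\le 2(2k)^2 (x_k)^2 + \varepsilon_k (2k)^2 (y_k)^2 \\
&\le 2\|A_k z_k\|_k^2 + 2\|z_k\|_k^2.
\end{aligned} \]

We get (4.13b) with $w_k = {}^t(u_k, v_k) = P_k z_k$ as

\[ \begin{aligned}
\|A_k z_k\|_k^2 = (w_k, P_k A_k^2 P_k^{-1} w_k)_k
&\le 2(2k)^2 \big[(u_k)^2\cos^2(\theta_k/4) + (v_k)^2\sin^2(\theta_k/4)\big] \\
&\le \tfrac{1}{2}\big[(4k)^2\cos^4(\theta_k/4) + 1\big](u_k)^2 + \tfrac{1}{2}\big[(4k)^2\sin^4(\theta_k/4) + 1\big](v_k)^2 \\
&= \tfrac{1}{2}(w_k, P_k(A_k + B_k)^2 P_k^{-1} w_k)_k + \tfrac{1}{2}(w_k, w_k)_k \\
&= \tfrac{1}{2}\|(A_k + B_k)z_k\|_k^2 + \tfrac{1}{2}\|z_k\|_k^2.
\end{aligned} \]

Proposition 2. There is a positive bounded continuous function $L(t)$ of $t > 0$ independent of $n$ such that the lower estimate

\[ \|e^{-t(A+B)} - (e^{-tB/2n} e^{-tA/n} e^{-tB/2n})^n\| \ge L(t)\, n^{-1} \tag{4.14} \]

holds for every $t > 0$ and $n \ge 1$.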



Proof. Note that the inequalities

\[ \begin{aligned}
\|e^{-t(A+B)} - (e^{-tB/2n} e^{-tA/n} e^{-tB/2n})^n\|
&\ge \|e^{-t(A_n+B_n)} - (e^{-tB_n/2n} e^{-tA_n/n} e^{-tB_n/2n})^n\|_n \\
&\ge \tfrac{1}{2}\big|\mathrm{Tr}\big[e^{-t(A_n+B_n)} - (e^{-tB_n/2n} e^{-tA_n/n} e^{-tB_n/2n})^n\big]\big|
\end{aligned} \tag{4.15} \]

hold, where the norm in the first member means the operator norm of bounded operators on $\mathcal{H}$, that in the second member the operator norm on $\mathcal{H}_n = \mathbb{R}^2$ and Tr in the third member the trace of $2 \times 2$ matrices. For later use, let us note for $\Theta = S\cos\theta + T\sin\theta$ the following formulas:

\[ e^{-s\Theta} = E\cosh s - \Theta\sinh s, \qquad \mathrm{Tr}\, e^{-s\Theta} = 2\cosh s. \tag{4.16} \]
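As a worked check of (4.16), write $\Theta := S\cos\theta + T\sin\theta$ and use only the relations (4.7). The steps below are a routine expansion, included as an illustration; they use no assumption beyond (4.6) and (4.7).

```latex
\begin{aligned}
\Theta^2 &= S^2\cos^2\theta + T^2\sin^2\theta + (ST + TS)\cos\theta\sin\theta = E, \\
e^{-s\Theta} &= \sum_{k=0}^{\infty} \frac{(-s)^k}{k!}\,\Theta^k
  = E\sum_{j=0}^{\infty} \frac{s^{2j}}{(2j)!} - \Theta\sum_{j=0}^{\infty} \frac{s^{2j+1}}{(2j+1)!}
  = E\cosh s - \Theta\sinh s, \\
\mathrm{Tr}\, e^{-s\Theta} &= \cosh s\,\mathrm{Tr}\,E - \sinh s\,(\cos\theta\,\mathrm{Tr}\,S + \sin\theta\,\mathrm{Tr}\,T) = 2\cosh s .
\end{aligned}
```

The even powers of $\Theta$ collapse to $E$ and the odd powers to $\Theta$, which is why the exponential splits into a $\cosh$ and a $\sinh$ part.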

To get the first formula in (4.16), expand the exponential and use $\Theta^2 = E$, a consequence of (4.7). The second formula follows from the first one and $\mathrm{Tr}\,\Theta = 0$. Thanks to (4.11) and the above formulas, we get

\[ \mathrm{Tr}\, e^{-t(A_n+B_n)} = 2e^{-2nt}\cosh\big(2nt\cos(\theta_n/2)\big) \tag{4.17} \]

for $n$ large. On the other hand, the second trace in the last member of (4.15) is, by (4.16), equal to

\[ \begin{aligned}
\mathrm{Tr}\big[(e^{-tA_n/2n} e^{-tB_n/n} e^{-tA_n/2n})^n\big]
&= e^{-2nt}\,\mathrm{Tr}\big[(e^{-tS/2} e^{-t(S\cos\theta_n + T\sin\theta_n)} e^{-tS/2})^n\big] \\
&= e^{-2nt}\,\mathrm{Tr}\big[(a_n E - b_n S - c_n T)^n\big],
\end{aligned} \]

where $a_n$, $b_n$ and $c_n$ are positive numbers defined by

\[ \begin{aligned}
a_n &= \cosh^2 t + \sinh^2 t\cos\theta_n = \cosh 2t - \varepsilon_n\sinh^2 t, \\
b_n &= \sinh t\cosh t\,(1 + \cos\theta_n), \\
c_n &= \sinh t\sin\theta_n.
\end{aligned} \tag{4.18} \]

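Since they satisfy the identity $a_n^2 - b_n^2 - c_n^2 = 1$, there exist positive numbers $K_n$ and $\Xi_n$ such that

\[ a_n = \cosh K_n, \qquad b_n = \sinh K_n\cos\Xi_n, \qquad c_n = \sinh K_n\sin\Xi_n. \tag{4.19} \]

Setting $s = K_n$ and $\theta = \Xi_n$ in (4.16), we have

\[ \mathrm{Tr}\big[(e^{-tB_n/2n} e^{-tA_n/n} e^{-tB_n/2n})^n\big] = e^{-2nt}\,\mathrm{Tr}\big[(e^{-K_n(S\cos\Xi_n + T\sin\Xi_n)})^n\big] = 2e^{-2nt}\cosh nK_n. \tag{4.20} \]

Now, with $\theta_n$ or $\varepsilon_n$ in (4.9) let us introduce positive numbers $\delta_n$ such that

\[ \delta_n = 2 - 2\cos(\theta_n/2), \qquad\text{or}\qquad \varepsilon_n = 2\delta_n - \tfrac{1}{2}\delta_n^2. \tag{4.21} \]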
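The identity $a_n^2 - b_n^2 - c_n^2 = 1$ and the trace formula (4.20) can be confirmed numerically. This is a minimal sketch with hypothetical helper names and a few sample pairs $(n, t)$: it builds $a_n E - b_n S - c_n T$ from (4.18), raises it to the $n$-th power, and compares the trace with $2\cosh nK_n$, where $K_n = \operatorname{arccosh} a_n$. The common prefactor $e^{-2nt}$ is omitted on both sides.

```python
import math

I2 = [[1.0, 0.0], [0.0, 1.0]]

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mpow(X, n):
    """X**n by binary exponentiation."""
    result, base = I2, X
    while n > 0:
        if n & 1:
            result = mul(result, base)
        base = mul(base, base)
        n >>= 1
    return result

def coefficients(n, t):
    """The numbers a_n, b_n, c_n of (4.18), with cos(theta_n) = 1 - 1/(2 n^2)."""
    cos_th = 1.0 - 1.0 / (2.0 * n * n)
    sin_th = math.sqrt(1.0 - cos_th * cos_th)
    a = math.cosh(t) ** 2 + math.sinh(t) ** 2 * cos_th
    b = math.sinh(t) * math.cosh(t) * (1.0 + cos_th)
    c = math.sinh(t) * sin_th
    return a, b, c

def trace_of_power(n, t):
    """Tr[(a_n E - b_n S - c_n T)^n] computed by direct matrix multiplication."""
    a, b, c = coefficients(n, t)
    M = [[a - b, -c], [-c, a + b]]
    P = mpow(M, n)
    return P[0][0] + P[1][1]

def trace_formula(n, t):
    """2 cosh(n K_n) with K_n = arccosh(a_n), as in (4.19) and (4.20)."""
    a, _, _ = coefficients(n, t)
    return 2.0 * math.cosh(n * math.acosh(a))

for n, t in ((1, 0.5), (5, 0.5), (10, 0.5), (10, 1.0)):
    a, b, c = coefficients(n, t)
    print(n, t, a * a - b * b - c * c, trace_of_power(n, t), trace_formula(n, t))
```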



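Note that $\varepsilon_n/2 \le \delta_n \le \varepsilon_n$. By (4.18) and (4.19) we have

\[ \begin{aligned}
\cosh K_n - \cosh(2 - \delta_n)t
&= (1 - \tfrac{1}{2}\varepsilon_n)\cosh 2t + \tfrac{1}{2}\varepsilon_n - \cosh(2 - \delta_n)t \\
&= \int_0^t (t - s)\,\frac{d^2}{ds^2}\big[(1 - \tfrac{1}{2}\varepsilon_n)\cosh 2s + \tfrac{1}{2}\varepsilon_n - \cosh(2 - \delta_n)s\big]\,ds \\
&= \int_0^t (t - s)\big[4(1 - \tfrac{1}{2}\varepsilon_n)\cosh 2s - (2 - \delta_n)^2\cosh(2 - \delta_n)s\big]\,ds \\
&\ge (2\delta_n - 2\delta_n^2 + \tfrac{1}{2}\delta_n^3)\int_0^t (t - s)(\cosh 2s - 1)\,ds.
\end{aligned} \tag{4.22} \]

Here the last step is due to the convexity of the function $\cosh s$:

\[ \cosh\big((1 - \tfrac{1}{2}\delta_n)2s + \tfrac{1}{2}\delta_n \cdot 0s\big) \le (1 - \tfrac{1}{2}\delta_n)\cosh 2s + \tfrac{1}{2}\delta_n\cosh 0s. \]

We see from (4.9) and (4.21),

\[ 2\delta_n - 2\delta_n^2 + \tfrac{1}{2}\delta_n^3 = \delta_n(2 - \varepsilon_n) \ge \varepsilon_n(1 - \tfrac{1}{2}\varepsilon_n) \ge 3/8n^2, \]

and

\[ \int_0^t (t - s)(\cosh 2s - 1)\,ds = \tfrac{1}{4}(\cosh 2t - 1 - 2t^2). \]

We are about to use the mean value theorem: Let $b > a$. Then for real-valued smooth functions $\varphi(s)$ and $\psi(s)$ there exists $\xi$ with $a < \xi < b$ such that

\[ \frac{\varphi(b) - \varphi(a)}{\psi(b) - \psi(a)} = \frac{\varphi'(\xi)}{\psi'(\xi)}. \]

Note (4.22) implies that $K_n > (2 - \delta_n)t$ for $t > 0$. Then we get with (4.17) and (4.20) that, for some $M_n$ with $(2 - \delta_n)t < M_n < K_n$,

\[ \begin{aligned}
\tfrac{1}{2}\mathrm{Tr}\big[(e^{-tB_n/2n} e^{-tA_n/n} e^{-tB_n/2n})^n - e^{-t(A_n+B_n)}\big]
&= e^{-2nt}\big[\cosh nK_n - \cosh n(2 - \delta_n)t\big] \\
&= e^{-2nt}\,\frac{n\sinh nM_n}{\sinh M_n}\big[\cosh K_n - \cosh(2 - \delta_n)t\big] \\
&\ge e^{-2nt} e^{(n-1)M_n}\,\frac{3}{32n}(\cosh 2t - 1 - 2t^2),
\end{aligned} \]

where we have used (4.22) and the inequality $\sinh ns/\sinh s \ge e^{(n-1)s}$. Since

\[ (n - 1)M_n \ge (n - 1)(2 - \delta_n)t \ge (n - 1)(2 - 1/2n^2)t \ge 2(n - 1)t - t/8, \]

we have proved (4.14) with $L(t) = \frac{3}{32} e^{-17t/8}(\cosh 2t - 1 - 2t^2)$. This ends the proof of Proposition 2.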
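The lower bound just proved can be tested on the $n$-th block of the Example. The sketch below is an illustration with hypothetical helper names and a small sample of $(n, t)$ values: it evaluates the half-trace gap appearing in the last member of (4.15) by direct matrix exponentiation and compares it with $L(t)/n$, where $L(t) = \frac{3}{32} e^{-17t/8}(\cosh 2t - 1 - 2t^2)$.

```python
import math

I2 = [[1.0, 0.0], [0.0, 1.0]]

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def scale(X, c):
    return [[c * x for x in row] for row in X]

def expm(X, terms=20, squarings=12):
    """exp(X): Taylor series of X / 2**squarings followed by repeated squaring."""
    Y = scale(X, 1.0 / 2.0 ** squarings)
    total, term = I2, I2
    for k in range(1, terms):
        term = scale(mul(term, Y), 1.0 / k)
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    for _ in range(squarings):
        total = mul(total, total)
    return total

def mpow(X, n):
    """X**n by binary exponentiation."""
    result, base = I2, X
    while n > 0:
        if n & 1:
            result = mul(result, base)
        base = mul(base, base)
        n >>= 1
    return result

def blocks(n):
    """A_n and B_n from (4.8) with cos(theta_n) = 1 - 1/(2 n^2)."""
    c = 1.0 - 1.0 / (2.0 * n * n)
    s = math.sqrt(1.0 - c * c)
    A = [[2.0 * n, 0.0], [0.0, 0.0]]
    B = [[n * (1.0 + c), n * s], [n * s, n * (1.0 - c)]]
    return A, B

def half_trace_gap(n, t):
    """(1/2)|Tr[e^{-t(A_n+B_n)} - (e^{-tB_n/2n} e^{-tA_n/n} e^{-tB_n/2n})^n]|."""
    A, B = blocks(n)
    total = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]
    exact = expm(scale(total, -t))
    half = expm(scale(B, -t / (2.0 * n)))
    product = mpow(mul(mul(half, expm(scale(A, -t / n))), half), n)
    return 0.5 * abs((product[0][0] + product[1][1]) - (exact[0][0] + exact[1][1]))

def lower_bound(t):
    """L(t) from the proof of Proposition 2."""
    return 3.0 / 32.0 * math.exp(-17.0 * t / 8.0) * (math.cosh(2.0 * t) - 1.0 - 2.0 * t * t)

for n in (1, 2, 5, 20):
    for t in (0.5, 1.0):
        print(n, t, half_trace_gap(n, t), lower_bound(t) / n)
```

Only the diagonal block with index equal to $n$ is used, exactly as in the first inequality of (4.15); the full operator on $\mathcal{H}$ never needs to be formed.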

Acknowledgements. The first and second authors are grateful to the Japan Society of the Promotion of Science (Grant-in-Aid for Scientific Researches (B) No. 11440040 and No. 11440056, respectively) for supporting this work. The third author is grateful to Centre de Physique Théorique, CNRS-Luminy, Marseille and Université de Toulon et du Var, France, for their support and hospitality.



References

1. Ichinose, T. and Tamura, Hideo: The norm convergence of the Trotter–Kato product formula with error bound. Commun. Math. Phys. 217, 489–502 (2001)
2. Kato, T. and Masuda, K.: Trotter’s product formula for nonlinear semigroups generated by the subdifferentials of convex functionals. J. Math. Soc. Japan 30, 169–178 (1978)
3. Suzuki, M.: On the convergence of exponential operators – the Zassenhaus formula, BCH formula and systematic approximations. Commun. Math. Phys. 57, 193–200 (1977)
4. Tamura, Hiroshi: A remark on operator-norm convergence of Trotter–Kato product formula. Integr. Equ. Oper. Theory 37, 350–356 (2000)
5. Varadarajan, S.: Lie Groups, Lie Algebras, and Their Representations. Berlin–Heidelberg–New York–Tokyo: Springer Verlag, 1974, 1984

Communicated by H. Araki

Commun. Math. Phys. 221, 511 – 523 (2001)




Quantum Symmetry Groups of Noncommutative Spheres

Joseph C. Várilly

Departamento de Matemática, Universidad de Costa Rica, 2060 San José, Costa Rica. E-mail: [email protected]

Received: 28 February 2001 / Accepted: 12 March 2001

Abstract: We show that the noncommutative spheres of Connes and Landi are quantum homogeneous spaces for certain compact quantum groups. We give a general construction of homogeneous spaces which support noncommutative spin geometries.

1. Introduction

Noncommutative geometry [6] has established itself as a theory which goes beyond the realm of differentiable manifolds and deals in a unified fashion with many singular geometric spaces, too. A fundamental feature of NCG is that it fully incorporates all compact, boundaryless spin manifolds under the heading of “noncommutative spin geometries”: see [7] and Chap. 11 of [17]. Outstanding examples of singular geometric spaces are the noncommutative tori [5, 9, 25], orbit spaces of discrete group actions, and leaf spaces of foliations. Recently, a new class of examples has appeared, the “noncommutative spheres” of Connes and Landi [10], from a purely cohomological construction.

The Moyal-like nature of the twisted products introduced in [10] suggests that the underlying noncommutative spaces of these spin geometries may be obtained, as $C^*$-algebras, by the general deformation construction of Rieffel [27]. The question arises as to whether these are in fact noncommutative homogeneous spaces, that is, subalgebras of invariants of certain Hopf algebras which may be regarded as “quantized symmetry groups”. This question is more delicate than it might seem, because it must be answered at the $C^*$-algebra level: these “symmetry groups” must be found in the category of “compact quantum groups” in the sense of Woronowicz [37] or perhaps in the wider category of “locally compact quantum groups” [20]. As it happens, the compact noncommutative spaces which we discuss below have compact (quantum) symmetry groups, so we shall restrict ourselves here to Woronowicz’ version.

Regular Associate of the Abdus Salam ICTP, Trieste.


In Sects. 2 and 3 we review the construction of noncommutative spheres and Rieffel’s $C^*$-deformation theory. Section 4 treats compact quantum groups built by such deformations. In Sect. 5, we explain how both constructions mesh to yield the desired quantum homogeneous spaces. In the final section, we briefly discuss noncommutative spin geometries on these homogeneous spaces.

2. Quantized 4-Spheres

The construction of noncommutative spin geometries by Connes and Landi proceeds in two stages. First, the data $(A, \mathcal{H}, D, C, \chi)$ of an even real spectral triple [6, 17] are sought as possible solutions to a system of equations for the Chern character in cyclic homology:

\[ \mathrm{ch}_k(p) \equiv \langle (p - \tfrac{1}{2})\, dp^{2k} \rangle = 0 \quad\text{for}\quad k = 0, 1, \dots, m - 1, \tag{1a} \]

\[ \pi_D(\mathrm{ch}_m(p)) = \chi, \tag{1b} \]

where $p = p^2 = p^*$ is an orthogonal projector in a matrix algebra $M_r(A)$, $\langle\cdot\rangle$ denotes the conditional expectation (or partial trace) onto $A$, $\chi$ is the grading operator on the $\mathbb{Z}_2$-graded Hilbert space $\mathcal{H}$, and $\pi_D(a_0\, da_1 \dots da_n) := a_0 [D, a_1] \dots [D, a_n]$ represents elements of the universal graded differential algebra over $A$ as operators on $\mathcal{H}$.

These equations impose restrictions, first of all, on the algebra $A$ itself. In dimension two, i.e., when $m = 1$ and $r = 2$, only commutative solutions are found; in fact, Connes showed by an elementary argument [8] (see also [17, Sect. 11.A] and [22]) that (1a) alone forces $A$ to be a commutative algebra whose Gelfand spectrum is a closed subset of the 2-sphere $S^2$. This equation also makes $\mathrm{ch}_1(p)$ a Hochschild 2-cycle, whose associated volume form is the standard volume form on the sphere, so the Gelfand spectrum must be the whole $S^2$, and thus $A \simeq C^\infty(S^2)$ on the basis of (1) alone! Even in commutative cases such as this, where $D$ may be taken as the Dirac operator given by some metric and spin structure on the spectrum of $A$, the final condition (1b) does not determine the metric, but only its volume form; thus the cohomological conditions (1) allow for volume-preserving variations of the metric, as befits a theory which aspires to incorporate gravity.

In dimension four, with $m = 2$ and $r = 4$, there is also a commutative solution given in [8], namely the smooth function algebra $C^\infty(S^4)$. Later, Connes and Landi [10] found a family of noncommutative solutions, parametrized by a complex number of modulus one $\lambda = e^{2\pi i\theta}$: these are the algebras $C^\infty(S^4_\theta)$ (together with their corresponding Dirac operators), which may be called “smooth function algebras for noncommutative 4-spheres $S^4_\theta$”, in the standard parlance of quantum group theorists. Their representations are uniformly bounded and in each case a $C^*$-norm is quickly found, allowing one to complete them to “continuous function algebras”, denoted $C(S^4_\theta)$.
This procedure extends directly to higher dimensions, yielding noncommutative spheres in any even dimension greater than 2 from the corresponding “instanton algebras” (so called because the finite projective modules $pA^r$ may be regarded as vector bundles over $A$). Starting from the odd Chern character in cyclic homology, one can also search for odd-dimensional noncommutative spaces with this method (in the odd case, $\mathcal{H}$ is ungraded and $\chi$ in (1b) is replaced by 1).

A striking feature of this construction is that these noncommutative manifolds are parametrized by numbers of modulus one, in contrast to the real numbers $q \ne \pm 1$ which label the well-known 2-spheres $S^2_{qc}$ of Podleś [24], which were originally constructed as homogeneous spaces of the compact quantum groups $SU_q(2)$. By combining features of both constructions, Dąbrowski, Landi and Masuda [12] built a family of quantized 4-spheres $S^4_q$; on computing the Chern characters of the instantons, they found that (1a) is violated, inasmuch as $\mathrm{ch}_1(p) = (1 - q^2)$ times a nonvanishing term. In any case, it is clear that the Connes–Landi spheres $S^4_\theta$ lie outside the realm of $q$-spheres of the Podleś type. Indeed, several other variants on the $S^4_q$ spheres have since appeared [2, 3, 31], which, however, do not incorporate the $S^4_\theta$ family [11]. Of particular note is the construction by Hong and Szymański [18] of a large family of quantized $n$-spheres $S^n_q$, for $n \ge 2$ and $q > 0$, by deforming $C(S^n)$ to Cuntz–Krieger $C^*$-algebras based on certain directed graphs; but again, the $S^4_\theta$ family is not included. Therefore, it behooves us to ask whether that family may be realized as “quantum homogeneous spaces”.

3. Deformations of Homogeneous Spaces

The second stage of the Connes–Landi construction is the provision of spin geometries on the spheres $S^4_\theta$. This is accomplished by a deformation of the commutative spectral triple $(C^\infty(S^4), \mathcal{H}, \slashed{D})$, where $\slashed{D}$ denotes a Dirac operator on the Hilbert space $\mathcal{H}$ of square-integrable spinors over $S^4$. In the deformation, $\slashed{D}$ is kept fixed, so that all spectral data, including the classical dimension (four!) of the geometry are unchanged: only the algebra and its representation on $\mathcal{H}$ are modified.

One declares a kind of Moyal product on $C^\infty(S^4)$ by the following recipe: first, note that there is an isometric action of the 2-torus $\mathbb{T}^2$ on $S^4$, allowing us to decompose any smooth function on $S^4$ as a series $f = \sum_r f_r$ indexed by $r \in \mathbb{Z}^2$, where $f_r$ lies in the $r$th spectral subspace: $(e^{2\pi i\phi_1}, e^{2\pi i\phi_2}) \cdot f_r = e^{2\pi i(r_1\phi_1 + r_2\phi_2)} f_r$. The series converges rapidly in the Fréchet topology of $C^\infty(S^4)$. By introducing the following star-product of homogeneous elements:

\[ f_r \times g_s := e^{2\pi i\theta r_1 s_2} f_r g_s, \tag{2} \]

Connes and Landi constructed a representation of C(S4θ ) on the spinor space H (having bounded commutators with D / ); in essence, the representation is explicit only on the smooth subalgebra, which is just the vector space C ∞ (S4 ) with the commutative product replaced by the star-product (2). More generally, if M is a compact Riemannian manifold admitting a Lie group of isometries of rank l ≥ 2, so that M carries an isometric action of the torus Tl , one can decompose C ∞ (M) into spectral subspaces indexed by Zl . The Moyal product of two homogeneous functions fr and gs is then given by fr × gs := ρ(r, s) fr gs ,

(3)

where ρ : Zl × Zl → T is a 2-cocycle on the additive group Zl . The cocycle relation ρ(r, s + t)ρ(s, t) = ρ(r, s)ρ(r + s, t) guarantees associativity of the new product. For instance [17, 34], one may take ρ(r, s) := exp −2π i j rc in the extended solution. A proof of Fact 8 can be found in [6]. The proof is valid with minor modification also when = 0. In Sect. 2.2 we will sharpen Fact 7. In the context of Fact 8, Fact 6 and Fact 7 indicate that noncompact solutions should be sought among those solutions that satisfy w2 (r) ≤ 1 for all r ∈ (0, rc ). 2. Existence of Noncompact Solutions In this section, we establish the existence of noncompact solutions. We first define such solutions rigorously as follows: A solution of Eqs. (1.1) and (1.2) is noncompact if (1) there exists a finite rc > 0 that satisfies limrrc A(r) = 0, (2) the solution is smooth for all r ∈ [0, rc ), and (3) limrrc w 2 (r) < ∞. Properties (1) and (2) require a noncompact solution to be a member of the one parameter family described by Fact 5 that is not particlelike. Property (3) is the significant feature that, because of Fact 8, distinguishes noncompact solutions from other solutions.


A. N. Linden

2.1. Outline of existence proof. The existence proof is based on three theorems relating to particlelike solutions which hold only in the case = 0. PL 1. For each n > 0 there exists a solution (An (r), wn (r)) of Eqs. (1.1) and (1.2) that satisfies the following conditions: (I) (An (0), wn (0), wn (0)) = (1, 1, 0), (II) the solution is regular (i.e.,A(r) > 0) for all r > 0, (III) limr∞ (An (r), wn2 (r), wn (r)) = (1, 1, 0), and (IV) wn has exactly n zeros in the interval r ∈ [0, ∞). Solutions that satisfy PL 1 are the particlelike solutions. PL 2. Let P¯n = (¯rn , A¯ n , w ¯ n, w ¯ n ) be a sequence of points in such that w ¯ n2 → 1, r¯n → ∞ and r¯n (1 − A¯ n ) < M for some M > 0. Let Pn (r) = (r, A(r), w(r), w (r), w (r)) be the orbit through P¯n , defined for r > rn and suppose that 0 ≤ w (rn )/w(rn ) ≤ 1. Then, for sufficiently large n, Pn (r) exits through w2 = 1, at r = ren and %(w, ¯ w ¯ ) − n n %(w(re ), w (re )) < 5π/4, where %(w, w ) = arctan(w /w). ¯ A > 0 as long as the PL 3. There exists a λ¯ , 1 < λ¯ < 2 such that whenever λ < λ, orbit is in . Reference [4] contains proofs of PL 1 and PL 2. A proof of PL 3 can be found in reference [5]. Fact 5 gives a continuous one parameter family of solutions that are smooth in a neighborhood of r = 0 and satisfy A(0) = 1, w(0) = 1, w (0) = 0 and w (0) = −λ < 0. Throughout the remainder of this paper, unless otherwise stated, all solutions (A, w) are members of this family. We treat and λ as parameters and, when necessary to avoid ambiguity, we write solutions as (A(, λ, r), w(, λ, r)). Noncompact solutions will be found by a perturbation argument that can be described as follows: PL 2 gives solutions (A(0, λn , r), w(0, λn , r)) that satisfy the following conditions: (I) w(0, λn , r) has n zeros, (II) the orbit of (A(0, λn , r), w(0, λn , r)) leaves through w = −1 if n is odd, the orbit of (A(0, λn , r), w(0, λn , r)) leaves through w = 1 if n is even, and (III) A(r) > 0 at the exit point re . (See [4]). 
Perturbing these solutions by changing will give, provided that is small, similar solutions (A(, λn , r), w(, λn , r)) and (A(, λn+1 , r), w(, λn+1 , r)). We will fix a small and consider all solutions (A(, λ, r), w(, λ, r)), where λ is between λn and λn+1 . One of the perturbed solutions will turn out to be noncompact.

Noncompact Static Spherically Symmetric Solutions of Einstein SU(2)-Yang–Mills Equations


For reasons that will become apparent, the perturbation argument requires that A , w, and w be well behaved near rc and that rc be a continuous function of λ. This may not be the case when rc2 ∈ J , where J is the interval defined by √ 1 − 1 − 4 1 J = (0, 2] ∪ , . (2.1) 2 Consequently, we will need to eliminate this possibility. 2.2. Existence proof. The next lemma shows that the cosmological constant precludes globally regular solutions in the sense of PL 1. √ Lemma 2.1. For any > 0 and λ, there exists an rc (, λ) ≤ 3/ such that one or both of the following holds: (1) limrrc A(r) = 0, (2) limrrc w 2 (r) = ∞. Proof. We suppose a solution (A, w) to be valid up to rc in the sense that for any r˜ ∈ (0, rc ), limr˜r A(r) > 0 and limr˜r w2 (r) < ∞. In other words, Eqs. (1.1) and (1.2) are assumed to be nonsingular for all r ∈ (0, rc ). Next, we consider the following function: µ(r) = r(1 − A − r 2 /3).

(2.2)

A simple calculation using Eq. (1.1) yields

\[ \mu' = 2Aw'^2 + \frac{(1 - w^2)^2}{r^2}. \tag{2.3} \]

Now,√obviously, µ ≥ 0 whenever√ A > 0. Moreover, A > 0 whenever µ < 0 and r < 3/. If, for any rˆ < min{rc , 3/}, µ(ˆr ) < 0, then µ(r) < 0 for all r ∈ [0, rˆ ]. In particular, µ(0) < 0 and, thus, A(0) > 1. Because we are considering only solutions in the family of Fact 5, we conclude that 0 < A(r) ≤ 1 − r 2 /3 whenever r ∈ (0, min{rc , 3/}). (2.4) √ If for some rˆ ∈ (0, min{rc , 3/}), w2 (ˆr ) = 1 and w (ˆr ) = 0, then Fact 3 implies Eqs. (1.1) and (1.2) have a solution of the form r 2 ˆr 2 rˆ A(r) = 1 − + A(ˆr ) + −1 . 3 3 r The only such solution that satisfies A(0) = 1 is A(r) = 1 − r 2 /3. But this solution clearly satisfies the theorem. To complete the proof, we need √ consider only the remaining situation; namely that in which for all r ∈ (0, min{rc , 3/}), either w2 (r)√ = 1 or w (r) = 0; i.e., µ > 0. In this case, there exist ) > 0 and δ ∈ (0, min{rc , 3/}), that√satisfy µ(r) > ) √ whenever√r ∈ (δ, min{rc , 3/}). On the other √ hand, if rc ≥ 3/ , then Eq. (2.2) gives µ( 3/) < 0. It must be then that rc < 3/.


Theorem I does not exclude the possibility that limrrc w2 (r) = ∞ while limrrc A(r) > 0. However, because all such solutions have orbits that exit at some re < rc , we need not consider them separately. Nevertheless, limrrc A(r) = 0 for these solutions too. Before proving this, we preclude the possibility of a solution becoming singular while its orbit is still in solely because of a blow up of w . Lemma 2.2. Suppose that limrrc w 2 (r) = ∞ and limrrc w2 (r) < ∞. Then limrrc A(r) = 0. Proof. limrrc A(r) exists because of Fact 1. If limrrc A(r) > 0 and there exists a sequence {rn } rc that satisfies limn∞ w (rn ) = ∞, then limn∞ Aw (rn ) = ∞ and the sequence can be chosen so that (Aw ) (rn ) > 0 also. However, because w is bounded, Eq. (1.9) precludes the possibility of any such sequence. Lemma 2.3. If, for some finite rc , limrrc w2 (r) = ∞, then limrrc A(r) = 0. Proof. Because of Fact 2, we may assume that limrrc w(r) = +∞. Fact 1 again implies that A must have a limit Ac as r rc . Consequently, as in the proof of Fact 7, w must have a limit as r rc and clearly this limit must equal +∞. If Ac > 0, then there exists a δ > 0 such that 2Aw 2 /r > Ac w for all r ∈ (rc −δ, rc ). Moreover, δ can be chosen so that, in this same interval, (r) < 0. Dividing Eq. (1.1) by r and integrating yields, for any r ∈ (rc − δ, rc ), A(r) = A(rc − δ) +

$\int_{r_c-\delta}^{r} \frac{\Phi - 2Aw'^2}{\rho}\, d\rho \;<\; A(r_c-\delta) - A_c \int_{r_c-\delta}^{r} w'(\rho)\, d\rho$ (2.5)

= A(rc − δ) + Ac w(rc − δ) − Ac w(r). Clearly, because limrrc w(r) = ∞, Eq. (2.5) gives some rˆ ∈ (rc − δ, rc ) that satisfies A(ˆr ) = 0. However, this contradicts the definition of rc . It follows that Ac = 0. Fact 1, Lemma 2.1, and Lemma 2.3 can be summarized as follows: Theorem I. Suppose > 0. Then, √ for any solution of Eqs. (1.1) and (1.2) that is smooth at r = 0, there exists some rc ≤ 3/ that satisfies limrrc A(r) = 0. We now make precise the set of solutions subject to perturbation by defining K(0 , λ0 ) = {(, λ) : 0 ≤ ≤ 0 and 0 ≤ λ ≤ λ0 }. In Sect. 3 we will establish, for each noncompact solution sought, the existence of 0 and λ0 such that the set K = K(0 , λ0 )

(2.6)

satisfies the following conditions: (I) for any (, λ) ∈ Ko (the interior of K), limrrc (A (r), w(r), w (r)) = (Ac , wc , wc ) exists. Moreover, both Ac and wc are finite, and (II) for any (, λ) ∈ Ko , rc (, λ) is a continuous function of λ.



We will then restrict perturbations by requiring (, λ) to be in Ko . For each solution (A(, λ, r), w(, λ, r)) with (, λ) ∈ Ko , there exists a smallest rc > 0 where the solution leaves . Given a K, we define the subsets Kc = {(, λ) ∈ Ko : A(rc ) = 0} and ¯ c = {(, λ) ∈ Ko : A(rc ) > 0}. K

(2.7) (2.8)

¯ c whenWe will refer to a solution (A(, λ, r), w(, λ, r)) as a solution in K, Kc or K ever (, λ) is in any of these respective sets. To distinguish solutions in Kc from those ¯ c , we denote rc by re for solutions in K ¯ c . In other words, re is used instead of rc in K 2 for those solutions for which rc satisfies w (rc ) = 1 and A(rc ) > 0; i.e., for solutions whose orbits exit with positive A. Because of Lemma 2.2, an orbit can leave only if w2 > 1 or if A = 0, regardless ¯ c are regular at re . The continuity of rc as a of the value of w ; that is, all solutions in K function of λ will be proved in the general case in Sect. 3. Here, we use the regularity ¯ c is open and that rc (λ) restricted to K ¯ c is of solutions at re to prove that the set K continuous. ¯ c and w(, ¯ λ) ¯ ∈K ¯ λ, ¯ re (, ¯ λ)) ¯ = ±1. Then for every ) > 0 Lemma 2.4. Suppose (, ¯ |λ− λ|} ¯ < δ, |re (, λ)−re (, ¯ λ)| ¯ < there exist δ > 0 such that whenever max{|− |, ), A(re (, λ)) > 0 and w(, λ, rc (, λ)) = ±1. Proof. Lemma 2.2 implies that any solution (A, w) that satisfies A(re ) > 0, also satisfies limrre w 2 (r) < ∞. From standard theorems it follows that the solution ¯ λ), ¯ w(, ¯ λ)) ¯ can be extended to some r˜ > re (, ¯ λ) ¯ with A > 0 whenever (A(, ¯ λ, ¯ re (, ¯ λ)) ¯ =1 r < r˜ . Also, invoking Fact 2 and Fact 3, we may assume that w(, ¯ λ, ¯ re (, ¯ λ)) ¯ > 0. There then exist η > 0 and rη ∈ (re , r˜ ) that satisfy and w (, ¯ λ, ¯ rη ) = 1 + 2η. w(, Continuous dependence on parameters and standard theorems guarantee the existence of ¯ − δ, ¯ + δ) × (λ¯ − δ, λ¯ + δ) such that whenever (, λ) ∈ V , the a neighborhood V = ( solution A(, λ, r) exists beyond rη and w(, λ, rη ) > 1 + η > 1; i.e., for all solutions in V , (A, w) also exits at re (, λ) through w = 1. It remains to prove that re (, λ) is continuous. Now, w(, λ, r) and w (, λ, r) are ¯ λ, ¯ re (λ, ¯ λ)) ¯ and w (, ¯ λ, ¯ re (, ¯ λ)) ¯ = 0. 
The Implicit Function continuous in r at (, Theorem gives the continuity of rc locally. Fact 6 implies that there is no other r > 0 satisfying w2 (r) = 1. This completes the proof. In the process of proving Lemma 2.4 we also proved the following: ¯ c , A and w have finite limits as r re (, λ). Lemma 2.5. For any (, λ) ∈ K The existence of noncompact solutions is a corollary of the following lemma. This lemma assumes results to be proved in Sect. 3: Lemma 2.6. Suppose w (, λ0 , rc (, λ0 )) = ±∞. Then there exists a neighborhood Uλ0 of λ0 such that for all λ ∈ Uλ0 , one of the following holds: (1) w (, λ, rc (, λ)) = ±∞, (2) w(, λ, rc (, λ)) = ±1, or (3) (A(, λ, r), w(, λ, r)) is noncompact.


Proof. From Fact 2, we may assume, without loss of generality, that w (, λ0 , rc (, λ0 )) = −∞. Assuming that, for λ near λ0 , limrrc (,λ) w (r) exists, the only alternatives to Cases (1), (2), and (3) are the following: (a) w (, λ, rc (, λ) = +∞ or (b) w(, λ, rc (, λ)) = 1. We show that the assumption that either (a) or (b) holds for λ sufficiently close to λ0 leads to a contradiction. Assuming that rc (, λ) is a continuous function of λ, we prove that (i) for λ sufficiently close to λ0 , rc (, λ) > rc (, λ0 ) and (ii) for λ sufficiently close to λ0 , rc (, λ) < rc (, λ0 ). Both (i) and (ii) cannot hold, so we will have the desired contradiction. To prove (i) we choose arbitrary M > 0. There exist ) such that, if r is within ) of rc (, λ0 ), then w (, λ0 , r) < −2M. The continuity of rc implies the existence of δ > 0 such that, if λ is within δ of λ0 , then rc (, λ) > rc (, λ0 ) − ). Continuous dependence on parameters ensures (choosing δ smaller if necessary) that, if λ is within δ of λ0 and r ∈ (rc (, λ0 ) − ), min{rc (, λ0 ), rc (λ, λ)}), then w (, λ, r) < −M; i.e., rc (, λ) > rc (, λ0 ) when either (a) or (b) holds; for when either holds, w (rc (, λ)) > 0. This proves (i). √ To prove (ii) we note that Lemma 3.4 states that rc (, λ0 ) > 1/ . Eq. (1.3) then implies that there exist c > 0 and r˜ < rc (, λ0 ) such that any λ that satisfies rc (, λ) > r˜ also satisfies (, λ, r) < −c whenever r ∈ (˜r , rc (, λ)) . Also, by continuous dependence on parameters and the fact w (, λ0 , rc (, λ0 )) = −∞, for such λ, there exist rˆ ∈ (˜r , rc (, λ)) such that, provided that λ is also sufficiently close to λ0 , w (, λ, rˆ ) < −2/(ˆr c). Now, if for any of these λ either (a) or (b) holds, then there exist s ∈ (ˆr , rc (, λ)) that satisfy w2 (s) < 1, w (s) < −1/(ˆr c) and w (s) > 0. Thus, [r 2 Aw + r w + w(1 − w2 )]r=s > 0. But this contradicts Eq. (1.2). We conclude that, for λ sufficiently close to λ0 , rc (λ, λ) < rc (, λ0 ). Theorem II. 
For each positive integer N , there exist N such that for each fixed ∈ (0, N ) there exist {λn ()}N n=1 such that the solution (A(, λn (), r), w(, λn (), r)) is noncompact and w(, λn (), r) has at least n zeros in the interval (0, rc (, λn )). Proof. From our definition of a noncompact solution and previous results, it suffices to find solutions in K for which limrrc w 2 (r) < ∞. To this end, using PL 2 and continuous dependence on parameters, we can find (for sufficiently small fixed ) λ˜ n and µ˜ n such that (I) (, max{λ˜ n , µ˜ n }) ∈ K (II) (A(, λ˜ n , rc (, λ˜ n )), w(, λ˜ n , rc (, λ˜ n ))) = (A− c , −1) and ˜ (A(, µ˜ n , rc (, λn )), w(λ, µ˜ n , rc (, µ˜ n ))) = (A+ c , 1), + with A− c and Ac both strictly positive, and



(III) w(, λ˜ n , r) and w(, µ˜ n , r) both have at least n zeros before their respective crash points (points rc where A(rc ) = 0). Condition (III) follows from the Implicit Function Theorem and Fact 3. Without loss of generality, we assume that λ˜ n < µ˜ n . If this is not the case, we simply interchange their roles in what follows. We fix µ˜ n and define λˆ = sup{λ˜ n < µ˜ n that satisfy (II) and (III)}. Next, we define µˆ = inf{µ˜ n > λˆ that satisfy (II) and (III)}. Clearly λˆ ≤ µ. ˆ It follows from Lemma 2.6, Lemma 2.4 and the definitions of λˆ and µˆ ˆ µ] that the inequality is strict; i.e., λˆ < µˆ and that {}×[λ, ˆ ∈ Kc . From Lemma 3.3 it also 2 ˆ µ], follows that, for all λ ∈ [λ, ˆ either w (rc (, λ)) = ∞ or (A(, λ, r), w(, λ, r)) is noncompact. We now define ˆ µ] ˆ : w (, λ, rc (, λ)) = ∞} and E + = {λ ∈ [λ, ˆ µ] ˆ : w (, λ, rc (, λ)) = −∞}. E − = {λ ∈ [λ, Lemma 2.6 implies for each λ ∈ E + the existence of an open set Uλ containing λ and such that Uλ ∩E − is empty. Similarly, for each λ ∈ E − , there exists an open set Uλ containing λ such that Uλ ∩ E + is empty. Clearly, U + = λ∈E + Uλ and U − = λ∈E − Uλ are open sets. Also, U + ∩ E − and U − ∩ E + are both empty. Now, either U + and U − are ˆ µ] disjoint or they have nonempty intersection. If they are disjoint, then because [λ, ˆ is − ˆ connected, there exists at least one λp such that λp ∈ [λ, µ] ˆ but λp ∈ / E ∪ E + . If they are not disjoint, then there exists a λp ∈ U + ∩ U − and again, λp ∈ E + ∪ E − . (A(, λp , r), w(, λp , r)) is, therefore, noncompact.

3. Proofs of Technical Lemmas

In this section we establish the claims made in Sect. 2 that were used to prove Theorem 2.2. The main goals are to establish the continuity of $r_c$ as a function of λ for fixed Λ and to establish limits on $A'$, $w$, and $w'$ as r approaches $r_c$. For technical reasons, the possibilities are broken down as follows:

(1) $r_c = r_e$; i.e., an orbit leaves with $A > 0$ and $w^2 = 1$,
(2) $A(r_c) = 0$ and $r_c \le \sqrt{2}$,
(3) $A(r_c) = 0$ and $\sqrt{2} < r_c \le 1/\sqrt{\Lambda}$, and
(4) $A(r_c) = 0$ and $1/\sqrt{\Lambda} < r_c \le \sqrt{3/\Lambda}$.

We have already proved the continuity of $r_c$ in Case 1. (See Lemma 2.4.) We will exclude Case 2. The reason for choosing $R = \sqrt{2}$ is that with this choice, in either Case 3 or Case 4, we can establish limits on $w$ and $A'$ as $r \nearrow r_c$. Case 3 will then be shown to be impossible. Finally, we will prove the continuity of $r_c$ in Case 4.


3.1. Limits of w and A . We begin by eliminating Case 2 and establishing the limits on A , w, and w if rc is in the remaining region. Lemma 3.1. Let R > 1 be arbitrary and define R = ∩ {(r, A, w, w ) : 0 < r < R}. ¯ R) > 0 and ( ¯ λ, ¯ R) There exists a λ∗ (0) such that for all λ¯ ∈ [0, λ∗ (0)), there exist )(λ, ¯ ¯ such that all solutions in K(, λ) exit R at some re ≤ R and satisfy A(, λ, r) > ) throughout the interval r ∈ [0, re ]. Proof. We define λ∗ (0) = limn∞ λn (0), where λn (0) is the value of λ that produces the nth particlelike solution. For any λ < λ∗ (0), necessarily A(0, λ, r) > 0 for all ¯ there r ∈ [0, re (0, λ)] ([4, Theorem 3.1]). Therefore, for any λ¯ < λ∗ (0) and any λ < λ, are two possibilities: (A) A(0, λ, r) is a particlelike solution, or (B) w2 (0, λ, re ) = 1 for some re > 0. In Case A, A > 0 for all r > 0. Continuous dependence on parameters ensures that for any R > 0, there exists a (, λ)-neighborhood Uλ of (0, λ) such that, whenever (, λ) ∈ Uλ and 0 ≤ r ≤ R, A(, λ, r) > 0. In Case B, Lemma 2.4 also implies whenever (, λ) ∈ Uλ , the existence of a (, λ)neighborhood Uλ such that w2 (, λ, re (, λ)) = 1 and A > 0 throughout the interval [0, re (, λ)). (re is the point at which the solution exits .) Thus, we have for each λ, a neighborhood Uλ such that whenever (, λ) ∈ Uλ , the solution (A(, λ, r), w(, λ, r)) exits R at some reR (, λ) ≤ R and A(, λ, r)) > 0 ¯ is throughout the interval [0, reR ]. The result now follows because the interval [0, λ] compact. √ Throughout the rest of this paper, we √ fix R = 2 and unless stated otherwise, assume ¯ λ¯ , 2), λ) ¯ for same λ¯ that satisfies Lemma 3.1. As an solutions lie in the set K = K(( obvious consequence of Lemma 3.1 we have, for such solutions, the following: √ Lemma 3.2. Suppose A(rc ) = 0. Then rc > 2. The next lemma is crucial to establishing the continuity of rc . Lemma 3.3. Suppose limrrc w2 (r) < ∞. Then Aw 2 , , A , and w all have finite limits as r rc . Proof. 
We first note that from Eq. (1.1) it is clear that the existence of a finite limit of any two of the Aw 2 , A and implies the existence of a finite limit of the third. Also limrrc (r) exists if and only if limrrc w(r) exists. In Lemma 2.5 we already proved the result in the case where A(rc ) > 0. Thus, we may assume A(rc ) = 0. We define z(r) =

$\frac{\Phi}{r} + \frac{2Aw'^2}{r}$. (3.1)

A simple calculation using Eqs. (1.1) and (1.2) yields

\[ r^2 z' + 2 r w'^2 z + 2\left(1 - A - \frac{2(1 - w^2)^2}{r^2}\right) = 0. \tag{3.2} \]



Because rc2 > 2, whenever r is sufficiently close to rc , the last term on the left side of Eq. (3.2) is strictly positive. Thus, near rc , we cannot have z(r) = z (r) = 0. Also, z < 0 whenever z = 0. Therefore, z has only one sign near rc . There are now two cases to consider: (A) z > 0 near rc and (B) z < 0 near rc . In either case, since the last term on the left side of Eq. (3.2) is bounded, z is bounded from one side or the other; i.e., limrrc z(r) exists. Case A. Equation (3.2) implies z < 0 near rc . Therefore limrrc z(r) is finite. We consider the two subcases: (Ai) limrrc z > 0 and (Aii) limrrc z = 0. Case Ai. We prove that the assumption that Aw 2 has no limit leads to a contradiction. For under this assumption, there exists a sequence {rn } rc such that (Aw 2 ) (rn ) = 0 and Aw 2 (rn ) > ); i.e., limn∞ w 2 (rn ) = ∞. Evaluating Eq. (1.10) at rn gives w [r 2 w z + 2w(1 − w2 )]r=rn = 0. As n ∞, the first term in parentheses dominates the second term since the second term remains bounded while the first term is unbounded; i.e., for sufficiently large n, the expression on the left cannot equal 0. This proves that Aw 2 has a limit. Since z also has a limit, and w must also have limits. Furthermore, limrrc (r) is finite because w is bounded. Thus limrrc Aw 2 is also finite. Case Aii and B. By hypothesis, is bounded near rc . Thus, since z is bounded from above and Aw 2 ≥ 0, Aw 2 is also bounded. It follows that −∞ < lim z(r) ≤ 0. rrc

We now define

\[ y = w w' (1 - w^2). \tag{3.3} \]

A straightforward calculation using Eqs. (1.1) and (1.2) yields

\[ r^2 A y' + r\Phi y + w^2 (1 - w^2)^2 - r^2 A w'^2 (1 - 3w^2) = 0. \tag{3.4} \]

limrrc z(r) ≤ 0 and Aw 2 ≥ 0 imply that limrrc (r) ≤ 0. We prove that the assumption limrrc (r) does not exist leads to a contradiction. Indeed, under this assumption, for sufficiently small ) > 0 and any M > 0, there exist r0M and r1M , r0M < r1M such that (r ) ≤ −) in the interval [r0M , r1M ], (r )(r0M ) = M the point on (r M , r M ) (r )(r1M ), and (r ) (r0M ) < −M (see Fig. 2). We denote by rm 0 1 where r is minimized.

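[Fig. 2. Figure not reproduced: sketch of $r\Phi$ against $r$ near $r = r_c$, with $r_0^M$, $r_m^M$ and $r_1^M$ marked on the $r$-axis and the level $-\epsilon$ marked on the vertical axis.]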

Since the last two terms on the left side of Eq. (3.4) are bounded, there exists a positive $B_1$ such that

\[ r^2 A y' < -(r\Phi)\, y + B_1 \tag{3.5} \]

throughout the intervals $[r_0^M, r_1^M]$. Now, a simple calculation using Eqs. (1.1) and (1.3) yields

\[ (r\Phi)' - 2Aw'^2 - \frac{2(1 - w^2)^2}{r^2} + 2\Lambda r^2 - \frac{4y}{r} = 0. \tag{3.6} \]

The middle three terms on the left side of Eq. (3.6) are also bounded; i.e., there exists $B_2 > 0$ such that

\[ \frac{r (r\Phi)'}{4} - B_2 < y < \frac{r (r\Phi)'}{4} + B_2 \tag{3.7} \]

throughout these same intervals. Inequality (3.7) allows us to choose M sufficiently large so that y(r0M )

B1 B1 −B1 > M > y(r0M ). ≥ M s (s) ) r0 (r0 )

Thus, there can be no such s. Inequality (3.10) follows. Inequality (3.7) now yields, for r0M sufficiently close to rc and all r ∈ (r0M , r1M ], 4(y(r0M ) + B2 ) 4(y(r) + B2 ) < r r M 4(y(r0 ) − B2 ) M < + 2 r0M M < (r ) (r0M ) + < 0. 2

(r ) (r)
0, independent of K, such that whenever (, λ) ∈ K, ∈ (0, ∗ ), limrrc (,λ) w2 (r) ≤ 1, and limrrc (,λ) w 2 (r) < ∞ all hold, then √ rc (, λ) > 1/ . Proof. Without loss generality, we assume there is a sequence {rn } rc such that √ of√ w (rn ) > (1 − 1/ 3)/ 6 and w (rn ) > 0. Equation (1.2) and Lemma 3.3 now yield lim (r) ≤ 0.

rrc

Multiplying Eq. (1.3) by r 2 , evaluating at rc , and solving as a quadratic in rc2 yields either √ 1 − 1 − 4 0 < rc2 ≤ or (3.12) 2 √ 1 + 1 − 4 ≤ rc2 . (3.13) 2 √ √ Since 1 − 4 = 1 − 2 + ◦(2 ), any rc that satisfies Eq. (3.12) is less than 2 provided is sufficiently small. Therefore, by assumption (i.e., (, λ) ∈ K), and because of Lemma 3.2, Eq. (3.12) can be ignored.


We also consider only sufficiently small so that √ √ (1 + 1 − 4)/(2) > 1/ − 2 2 and prove √ that, choosing √ smaller if necessary, whenever A(rc ) = 0, w (rc ) 2≤ 1, and rc ∈ [1/ − 2, 1/ ], there exists an r¯ > 0 such that either A(¯r ) = 0 or w (¯r ) = 1. Thus, such a solution cannot be in the family of Fact 5. This will complete the proof. We simplify notation by setting

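\[ a_\Lambda = \frac{2}{3\sqrt{\Lambda}} - 3\sqrt{3}, \qquad b_\Lambda = \frac{2}{3\sqrt{\Lambda}}, \qquad \text{and} \qquad c_\Lambda = \frac{1}{\sqrt{\Lambda}} - 3. \]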

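Lemma 3.6 states that for any of the one parameter solutions of Fact 5, provided Λ is sufficiently small, $Aw'^2(r) < A$ throughout the interval $[b_\Lambda, c_\Lambda]$. Lemma 3.7 states that there exists $K > 0$ such that, under the same hypotheses, $A < K/r$ throughout the same interval when also $\sqrt{2} < r_c < 1/\sqrt{\Lambda}$. These lemmas, Eqs. (2.2) and (2.3) yield

\[ \mu' < 2\tilde{K}/r \tag{3.14} \]

throughout the interval $[b_\Lambda, c_\Lambda]$ for some $\tilde{K} > K$. Also, Lemma 3.5 gives, for some positive constant $\tilde{M}$,

\[ \mu' < 2\tilde{M} \tag{3.15} \]

throughout the interval $[c_\Lambda, r_c]$. On one hand, integrating inequalities (3.14) and (3.15) gives

\[
\begin{aligned}
\mu(r_c) - \mu(b_\Lambda) &= \int_{b_\Lambda}^{c_\Lambda} \mu'(s)\, ds + \int_{c_\Lambda}^{r_c} \mu'(s)\, ds \\
&< \int_{b_\Lambda}^{c_\Lambda} \frac{2\tilde{K}}{s}\, ds + \int_{c_\Lambda}^{r_c} 2\tilde{M}\, ds \\
&\le 2\tilde{K} \ln\!\left(\frac{c_\Lambda}{b_\Lambda}\right) + 6\tilde{M} \\
&= 2\tilde{K} \ln\!\left(\frac{3}{2} - \frac{9\sqrt{\Lambda}}{2}\right) + 6\tilde{M}.
\end{aligned}
\tag{3.16}
\]

It is clear that for sufficiently small Λ,

\[ \mu(r_c) - \mu(b_\Lambda) < L, \tag{3.17} \]

where $L$ is any number satisfying $L > 2\tilde{K} \ln(3/2) + 6\tilde{M}$. On the other hand, we consider also

\[ h(r) = \mu(r) + rA(r) = r\left(1 - \frac{\Lambda r^2}{3}\right), \tag{3.18} \]

for which $h'(r) = 1 - \Lambda r^2 > 0$ for all $r \in (0, r_c)$, if $r_c \le 1/\sqrt{\Lambda}$. In this case,

\[ h(r_c) - h(b_\Lambda) > h(c_\Lambda) - h(b_\Lambda). \tag{3.19} \]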



A simple calculation yields

\[ h(c_\Lambda) - h(b_\Lambda) = \frac{8}{81\sqrt{\Lambda}} + O(\Lambda^0). \tag{3.20} \]
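The expansion in (3.20) can be checked numerically. The following is only a sketch under our reading of the garbled definitions, namely $h(r) = r(1 - \Lambda r^2/3)$, $b_\Lambda = 2/(3\sqrt{\Lambda})$ and $c_\Lambda = 1/\sqrt{\Lambda} - 3$; the function names are illustrative and not part of the paper. Under these assumptions the remainder works out to $-9\sqrt{\Lambda} + 9\Lambda$, which stays bounded as Λ decreases.

```python
import math

def h(r, lam):
    """h(r) = r * (1 - lam * r^2 / 3), our reading of Eq. (3.18)."""
    return r * (1.0 - lam * r * r / 3.0)

def b_lam(lam):
    """Assumed definition b = 2 / (3 sqrt(lam))."""
    return 2.0 / (3.0 * math.sqrt(lam))

def c_lam(lam):
    """Assumed definition c = 1 / sqrt(lam) - 3."""
    return 1.0 / math.sqrt(lam) - 3.0

def h_gap(lam):
    """Left side of (3.20): h(c) - h(b)."""
    return h(c_lam(lam), lam) - h(b_lam(lam), lam)

def leading_term(lam):
    """Leading term 8 / (81 sqrt(lam)) on the right side of (3.20)."""
    return 8.0 / (81.0 * math.sqrt(lam))

if __name__ == "__main__":
    for lam in (1e-2, 1e-4, 1e-6):
        gap = h_gap(lam)
        lead = leading_term(lam)
        # The difference gap - lead should remain bounded as lam -> 0.
        print(lam, gap, lead, gap - lead)
```

The leading term grows like $1/\sqrt{\Lambda}$ while the remainder does not, which is what allows $h(r_c) - h(b_\Lambda)$ to exceed any fixed $L$ for small Λ.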

Inequality (3.19) and Eq. (3.20) together imply that, for Λ sufficiently small,

\[ h(r_c) - h(b_\Lambda) > L. \tag{3.21} \]

Now, $\mu(r_c) = h(r_c)$. So comparing inequality (3.21) to inequality (3.17) gives

\[ \mu(b_\Lambda) > h(b_\Lambda) \tag{3.22} \]

whenever is sufficiently small. Also because µ(rc ) = h(rc ), either µ(r) > h(r) for all r ∈ (b , rc ) or there exists an r¯ ∈ (b , rc ) such that µ(¯r ) = h(¯r ). In the former case, Eq. (3.18) gives A < 0 in (b , rc ). We therefore rule this case out. In the latter case, Eq. (3.18) gives some r¯ < rc that satisfies A(¯r ) = 0. This completes the proof assuming Lemmas 3.5, 3.6 and 3.7. Lemma 3.5. There exist M > 0 such that for all (, λ) ∈ Kc and for all r ∈ [0, rc (, λ)], Aw 2 (, λ, r) < M. √ Proof. Lemma 3.1 gives, for all solutions in K, an re ≤ 2 such that (A, w) exits √2 at re and A(re ) > 0. We define ρ(, λ) = min{1, re (, λ)} and ; = {(, λ, r) : (, λ) ∈ K and 0 ≤ r ≤ ρ(, λ)}. Standard results and Lemma 2.2 imply that any solution (A(, λ, r), w(, λ, r)) in K can be extended beyond ρ(, λ). It follows from continuous dependence of solutions on parameters that ; is a closed subset of R3 . Being bounded, it is also compact. Therefore, there exists an M1 such that Aw (r) < M1 2

for all (, λ, r) ∈ ;.

(3.23)

We have Aw 2 (r) < M1 for r ∈ √ (0, rc ) whenever ρ < 1. To find a bound when ρ = 1 we define M = max{M1 , 2 + 2/ 27} and recall Eq. (1.10), r 2 (Aw ) + w [rw ( + 2Aw ) + 2w(1 − w2 )] = 0. √ In the interval [1, rc ] 0 < A < 1, |w(1 − w2 )| < 2/(3 3), and, because of Lemma 2.1,

> −4. Also, for all r˜ in this interval, whenever Aw 2 (˜r ) > M, w 2 (˜r ) > 1. This and Eq. (1.10) imply that in the interval (1, rc ), 2

2

(Aw ) (˜r ) < 0 2

whenever Aw (˜r ) > M. 2

(3.24)

Inequalities (3.23) and (3.24) imply that Aw 2 cannot exceed M in the interval (1, rc ]. Equation (3.23) also implies that Aw 2 cannot exceed M (> M1 ) in the interval [0, 1]. The result follows. We now improve on this bound in the interval [b , c ].


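Lemma 3.6. For Λ sufficiently small, any solution $(A, w)$ in $K_c$ that satisfies $r_c > c_\Lambda + 1$ also satisfies $w'^2 < 1$ throughout the interval $[b_\Lambda, c_\Lambda]$.

Proof. We first prove that

\[ \Phi + A > \frac{1}{r} \quad \text{for all } r \in [a_\Lambda, c_\Lambda] \tag{3.25} \]

whenever Λ is sufficiently small, $(\Lambda, \lambda) \in K_c$, and $r_c > c_\Lambda + 1$. To this end, we consider Λ sufficiently small so that $a_\Lambda > 0$ and examine the function

\[ \psi(r) = 1 - \frac{1}{r^2} - \Lambda r^2 - \frac{1}{r}. \tag{3.26} \]

We will prove that $\psi(r) > 0$ for all $r \in (a_\Lambda, c_\Lambda)$. The result will follow. Now,

\[ (r\psi)\!\left(\frac{1}{\sqrt{\Lambda}}\right) = (r\psi)(c_\Lambda + 3) = -1 - \sqrt{\Lambda}. \tag{3.27} \]

Also,

\[ (r\psi)'(r) = 1 + \frac{1}{r^2} - 3\Lambda r^2. \tag{3.28} \]

From Eq. (3.28), it is clear that

\[ (r\psi)'(r) \to -2 \quad \text{uniformly in } [c_\Lambda, r_c] \text{ as } \Lambda \searrow 0. \tag{3.29} \]

Equations (3.27) and (3.28) imply that, for sufficiently small Λ,

\[ \psi(c_\Lambda) > 0. \tag{3.30} \]

Simple calculations give

\[ \psi'(r) = \frac{2}{r^3} - 2\Lambda r + \frac{1}{r^2} \tag{3.31} \]

and

\[ \psi''(r) = -\frac{6}{r^4} - 2\Lambda - \frac{2}{r^3}. \tag{3.32} \]

From Eqs. (3.31) and (3.32), it follows readily that

\[ a_\Lambda^{3}\, \psi'(a_\Lambda) = -\frac{32}{81\Lambda} + O\!\left(\frac{1}{\sqrt{\Lambda}}\right) \tag{3.33} \]

and

\[ \psi''(r) < 0 \quad \text{for all } r > 0. \tag{3.34} \]

For Λ sufficiently small, the right side of Eq. (3.33) is negative. This and Eq. (3.34) imply that

\[ \psi'(r) < 0 \quad \text{for all } r \in (a_\Lambda, c_\Lambda). \tag{3.35} \]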
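The sign claims for ψ and ψ' can be illustrated numerically. This is a minimal sketch, assuming our reading $\psi(r) = 1 - 1/r^2 - \Lambda r^2 - 1/r$, $a_\Lambda = 2/(3\sqrt{\Lambda}) - 3\sqrt{3}$ and $c_\Lambda = 1/\sqrt{\Lambda} - 3$; the helper names are hypothetical, and a grid check is of course no substitute for the proof.

```python
import math

def a_lam(lam):
    """Assumed definition a = 2 / (3 sqrt(lam)) - 3 sqrt(3)."""
    return 2.0 / (3.0 * math.sqrt(lam)) - 3.0 * math.sqrt(3.0)

def c_lam(lam):
    """Assumed definition c = 1 / sqrt(lam) - 3."""
    return 1.0 / math.sqrt(lam) - 3.0

def psi(r, lam):
    """psi(r) = 1 - 1/r^2 - lam r^2 - 1/r, our reading of Eq. (3.26)."""
    return 1.0 - 1.0 / r**2 - lam * r**2 - 1.0 / r

def dpsi(r, lam):
    """psi'(r) = 2/r^3 - 2 lam r + 1/r^2, our reading of Eq. (3.31)."""
    return 2.0 / r**3 - 2.0 * lam * r + 1.0 / r**2

def r_psi_at_inv_sqrt(lam):
    """(r psi)(1/sqrt(lam)); Eq. (3.27) predicts -1 - sqrt(lam)."""
    r = 1.0 / math.sqrt(lam)
    return r * psi(r, lam)

def check_psi(lam, n=2000):
    """Return (min of psi, max of psi') over a uniform grid on [a, c]."""
    a, c = a_lam(lam), c_lam(lam)
    grid = [a + (c - a) * k / n for k in range(n + 1)]
    return (min(psi(r, lam) for r in grid),
            max(dpsi(r, lam) for r in grid))

if __name__ == "__main__":
    for lam in (1e-3, 1e-4, 1e-5):
        # Expect a positive minimum of psi and a negative maximum of psi'.
        print(lam, check_psi(lam), r_psi_at_inv_sqrt(lam))
```

Because ψ'' is negative everywhere, ψ' is largest at the left endpoint $a_\Lambda$ and ψ is smallest at the right endpoint $c_\Lambda$, so the grid extremes land on the endpoints.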



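Equations (3.30) and (3.35) establish that

\[ \psi(r) > 0 \quad \text{for all } r \in [a_\Lambda, c_\Lambda]. \tag{3.36} \]

Finally,

\[ \psi = \Phi + A + \frac{(1 - w^2)^2}{r^2} - \frac{1}{r^2} - \frac{1}{r} \le \Phi + A - \frac{1}{r}. \]

Inequality (3.25) follows. We now define the set

\[ W = \left\{ r \in (a_\Lambda, c_\Lambda) : |w'(r)| \le \frac{2}{3\sqrt{3}} \right\}. \tag{3.37} \]

$W$ is not empty. In fact,

\[ W \cap (a_\Lambda, b_\Lambda) \ \text{is not empty.} \tag{3.38} \]

Indeed, otherwise, without loss of generality (Fact 2), we may assume that $w'(r) > 2/(3\sqrt{3})$ for all $r \in (a_\Lambda, b_\Lambda)$. Integrating this yields $w(b_\Lambda) - w(a_\Lambda) > 2$, contradicting the assumption that $|w| \le 1$ in $[0, r_c]$. This establishes (3.38). Next, we define

\[ \hat{r} = \sup\{ r \in W \}. \tag{3.39} \]

Again using Fact 2, we assume that $w' > 2/(3\sqrt{3})$ throughout the interval $(\hat{r}, c_\Lambda)$. Equations (1.2) and (3.25) yield

\[ w'' = \frac{1}{r^2 A}\left[ -r(\Phi + A) w' - w(1 - w^2) \right] + \frac{w'}{r} < \frac{w'}{r} \quad \text{in } (\hat{r}, c_\Lambda). \tag{3.40} \]

Integrating inequality (3.40) from $\hat{r}$ to $r$ with the condition $w'(\hat{r}) = 2/(3\sqrt{3})$ yields

\[ w'(r) < \frac{2r}{3\sqrt{3}\,\hat{r}} \quad \text{for all } r \in (\hat{r}, c_\Lambda). \tag{3.41} \]

In particular,

\[ w'(r) < 1 \quad \text{for all } r \in \left(\hat{r}, \min\{c_\Lambda, 3\sqrt{3}\,\hat{r}/2\}\right). \tag{3.42} \]

Now,

\[ \frac{3\sqrt{3}\,\hat{r}}{2} > \frac{3\sqrt{3}\,a_\Lambda}{2} = \frac{3\sqrt{3}}{2}\left( \frac{2}{3\sqrt{\Lambda}} - 3\sqrt{3} \right) = \frac{\sqrt{3}}{\sqrt{\Lambda}} - \frac{27}{2}. \tag{3.43} \]

It follows easily from inequality (3.43) that for small Λ, $3\sqrt{3}\,\hat{r}/2$ exceeds $c_\Lambda$. Substituting this fact into (3.42) yields $w'(r) < 1$ for all $r \in [b_\Lambda, c_\Lambda]$.

Lemma 3.7. There exists a $K$ such that for sufficiently small Λ and any solution in $K_c$ that satisfies $r_c \in (c_\Lambda - 1, 1/\sqrt{\Lambda})$, the solution also satisfies $A < K/r$ in the interval $[b_\Lambda, c_\Lambda]$.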

544

A. N. Linden

Proof. Invoking Lemma 3.6 gives from Eq. (1.1) 1 (r 3 A) > r 2 1 − 2 − r 2 = r 2 − 1 − r 4 > −1 r

(3.44)

throughout the interval [b , c ]. Integrating inequality (3.44) from any r ∈ [b , c ] to c yields A(r)
0 such that, for sufficiently small, (rA) > −2M

for all r ∈ [b , rc ].

(3.47)

Integrating inequality (3.47) from c to rc yields A(r) < 6M/r

for all r ∈ [rc − 3, rc ].

(3.48)

From the hypotheses, for sufficiently small , rc − 3 < c . The result now follows upon substituting (3.46) and (3.48) into (3.45). 3.3. Limit of w . Lemma 3.8. If 0 < < ∗ (∗ as in Lemma 3.4), limrrc w (r) exists. ¯ c . Thus, we need only consider the Proof. We have already proved this for (, λ) ∈ K case (, λ) ∈ Kc . In view of Lemma 3.3, there are only two subcases to consider: (1) limrrc (r) = 0, and (2) limrrc (r) = 0. Case 1. We suppose that there exists a sequence {rn } rc such that w (rn ) = 0 for each n. Then Eq. (1.2) implies w (rn ) =

$\dfrac{w(r_n)\,(w^2(r_n) - 1)}{r_n\,\Phi(r_n)}.$

(3.49)

Consequently (again making use of Lemma 3.3) the right side of Eq. (3.49) has a limit as n ∞. Since limrrc (r) = 0, the result follows. Case 2. There are two subcases to consider: (2a) limrrc w = 0 and (2b) limrrc w = 0.



Case 2a. Equation (1.2) implies that w has one sign near rc . This is because w has only one sign near rc and whenever w (r) = 0, w w(r) < 0. Now, as in Case 1, for any sequence {rn } rc such that w (rn ) = 0 for all n, w (rn ) =

$\dfrac{w(r_n)\,(w^2(r_n) - 1)}{r_n\,\Phi(r_n)}.$

Clearly the right side of this equation goes to ±∞ as n → ∞. Because w has only one sign, this must go to one or the other of ±∞. The result follows. Case 2b. $A(r_c) = w(r_c) = \Phi(r_c) = 0$ implies $r_c^2 = \dfrac{1 \pm \sqrt{1 - 4\Lambda}}{2\Lambda}$. But Lemmas 3.2 and 3.4 preclude this for small Λ.
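Case 2b can be made concrete with a short numerical sketch. It assumes that $\Phi = 1 - A - (1 - w^2)^2/r^2 - \Lambda r^2$ (our reading of Eq. (1.3), which is not reproduced in this section), so that $A = w = \Phi = 0$ reduces to $\Lambda x^2 - x + 1 = 0$ with $x = r_c^2$. The function name is illustrative only.

```python
import math

def rc_squared_roots(lam):
    """Roots x = r_c^2 of lam * x^2 - x + 1 = 0, for 0 < lam < 1/4.

    The small root is written as 2 / (1 + disc), which equals
    (1 - disc) / (2 lam) algebraically but avoids cancellation.
    """
    disc = math.sqrt(1.0 - 4.0 * lam)
    small = 2.0 / (1.0 + disc)
    large = (1.0 + disc) / (2.0 * lam)
    return small, large

if __name__ == "__main__":
    for lam in (1e-1, 1e-2, 1e-4):
        small, large = rc_squared_roots(lam)
        # small < 2 conflicts with r_c > sqrt(2) (Lemma 3.2);
        # large < 1/lam conflicts with r_c > 1/sqrt(lam) (Lemma 3.4).
        print(lam, small, large, 1.0 / lam)
```

Under this reading, the small root stays below 2 and the large root stays below $1/\Lambda$, so neither is compatible with both $r_c > \sqrt{2}$ and $r_c > 1/\sqrt{\Lambda}$.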

3.4. Continuity of rc.

Lemma 3.9. For sufficiently small fixed Λ, rc(Λ, λ) is a continuous function of λ.

Proof. We have already proved the continuity of rc as a function of λ for solutions in K̄c (Lemma 2.4 and Lemma 2.2). It remains to prove the result in the case (Λ, λ) ∈ Kc. From Lemma 3.4, we may assume that rc(Λ, λ) > 1/√Λ. Since Λ is fixed, we drop the dependence on it in what follows. We recall the function µ(λ, r) = r(1 − A(λ, r) − Λr²/3) which, for each λ, is a nondecreasing function of r (see Eq. (2.3)). Also, to simplify notation, we define h(r) = µ + rA = r(1 − Λr²/3) and δ(λ, ε) = h(rc(λ)) − h(rc(λ) + ε). Obviously, µ(λ, r) = h(r) if and only if A(r) = 0; i.e., only at rc(λ). Furthermore, h′(r) = 1 − Λr² < 0 whenever r is sufficiently close to rc(λ). Therefore, δ(λ, ε) > 0 whenever ε > 0. Moreover, both h and µ are continuous, h is decreasing, and µ is increasing. These facts enable us to find, for any ε > 0, r̃ < rc(λ) such that

\[ \mu(\lambda, \tilde r) > h(\tilde r) - \frac{\delta(\lambda, \varepsilon)}{2}. \]

r̃ can be taken to be within ε of rc(λ). (See Fig. 3.) Continuous dependence on parameters now gives η > 0 such that whenever λ̃ is within η of λ, µ(λ̃, r̃) > h(r̃) − δ. Clearly, for all such λ̃, rc(λ̃) > r̃.

We now prove by contradiction that for all such λ̃, rc(λ̃) < rc(λ) + ε. If for any such λ̃, rc(λ̃) ≥ rc(λ) + ε, then µ(λ̃, r) is well defined up to rc(λ) + ε and is a continuous function of r in the interval (r̃, rc(λ) + ε). Now, on one hand,

\[ \mu(\tilde\lambda, r_c(\lambda) + \varepsilon) \ge \mu(\tilde\lambda, \tilde r) > h(\tilde r) - \delta > h(r_c(\lambda)) - \delta = h(r_c(\lambda) + \varepsilon). \tag{3.50} \]
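The monotonicity facts used in the chain (3.50) can be checked numerically. The sketch below is only an illustration under assumed sample values: `lam` stands for Λ, and the radius `rc` is a hypothetical number chosen beyond 1/√Λ rather than the zero of A for an actual solution.

```python
import math


def h(r, lam):
    """h(r) = r (1 - lam r^2 / 3), as defined in the proof."""
    return r * (1.0 - lam * r * r / 3.0)


def h_prime(r, lam):
    """Derivative h'(r) = 1 - lam r^2."""
    return 1.0 - lam * r * r


def delta(rc, eps, lam):
    """delta = h(rc) - h(rc + eps)."""
    return h(rc, lam) - h(rc + eps, lam)


if __name__ == "__main__":
    lam = 0.01   # hypothetical value of the parameter
    rc = 12.0    # hypothetical radius, larger than 1 / sqrt(lam) = 10
    for eps in (0.1, 0.5, 1.0):
        d = delta(rc, eps, lam)
        # h is decreasing beyond 1 / sqrt(lam), so d should come out positive,
        # and h(rc) - d reproduces h(rc + eps), the last equality in (3.50).
        print(eps, h_prime(rc, lam), d, h(rc, lam) - d, h(rc + eps, lam))
```

The sign of h′ is the only place where the assumption rc > 1/√Λ from Lemma 3.4 enters this part of the argument.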

Fig. 3. µ = µ(λ̃, r), rc = rc(λ), r̃c = rc(λ̃). [Figure: graphs of h(r) and µ(λ̃, r) against r, with 1/√Λ, r̃, rc and r̃c marked on the r-axis and the gaps δ and ε indicated.]

On the other hand, because A(λ̃, r̃) > 0,

\[ \mu(\tilde\lambda, \tilde r) = h(\tilde r) - \tilde r\,A(\tilde\lambda, \tilde r) < h(\tilde r). \tag{3.51} \]

Equations (3.50), (3.51), and the Intermediate Value Theorem applied to h(r) − µ(λ̃, r) (with fixed λ̃) guarantee an rc(λ̃) ∈ (r̃, rc(λ) + ε) such that µ(λ̃, rc(λ̃)) = h(rc(λ̃)); i.e., A(λ̃, rc(λ̃)) = 0. This contradicts the assumption that rc(λ̃) ≥ rc(λ) + ε and completes the proof.

References

1. Breitenlohner, P., Forgács, P., and Maison, D.: Static spherically symmetric solutions of the Einstein–Yang–Mills equations. Commun. Math. Phys. 163, 141–172 (1994)
2. Linden, A.: Far field behavior of globally smooth static spherically symmetric solutions of Einstein SU(2)–Yang–Mills equations. J. Math. Phys. 42 (3), 1196–1202 (March 2001)
3. Rindler, W.: Essential Relativity. Berlin–Heidelberg–New York: Springer-Verlag, 1977
4. Smoller, J. and Wasserman, A.: Existence of infinitely many smooth static solutions of the Einstein/Yang–Mills equations. Commun. Math. Phys. 151 (2), 303–325 (1993)
5. Smoller, J. and Wasserman, A.: An investigation of the limiting behavior of particle-like solutions to the Einstein–Yang/Mills equations and a new black hole solution. Commun. Math. Phys. 161, 365–369 (1994)
6. Smoller, J. and Wasserman, A.: Regular solutions of the Einstein–Yang–Mills equations. J. Math. Phys. 36 (8), 4301–4323 (August 1995)
7. Smoller, J. and Wasserman, A.: Reissner–Nordström-like solutions of the SU(2) Einstein–Yang/Mills equations. J. Math. Phys. 38 (12), 6522–6559 (December 1997)
8. Smoller, J. and Wasserman, A.: Extendability of solutions of the Einstein–Yang/Mills equations. Commun. Math. Phys. 194, 707–732 (1998)
9. Smoller, J., Wasserman, A., Yau, S.-T. and McLeod, J.B.: Smooth static solutions of the Einstein/Yang–Mills equations. Commun. Math. Phys. 143, 115–147 (1991)
10. Volkov, M.S., Straumann, N., Lavrelashvili, G., Heusler, M. and Brodbeck, O.: Cosmological analogues of the Bartnik–McKinnon solutions. Phys. Rev. D 54, 7243–7251 (1996)

Communicated by H. Nicolai

Commun. Math. Phys. 221, 549 – 571 (2001)




The q-Deformed Knizhnik–Zamolodchikov–Bernard Heat Equation

Giovanni Felder¹, Alexander Varchenko²

¹ Departement Mathematik, ETH-Zentrum, 8092 Zürich, Switzerland
² Department of Mathematics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-3250, USA

Received: 4 October 2000 / Accepted: 25 March 2001

Abstract: We introduce a q-deformation of the genus one sl2 Knizhnik–Zamolodchikov–Bernard heat equation. We show that this equation for the dependence on the moduli of elliptic curves is compatible with the qKZB equations, which give the dependence on the marked points.

1. Introduction

The Knizhnik–Zamolodchikov–Bernard equations are a system of differential equations arising in conformal field theory on Riemann surfaces. For each g, n ∈ Z≥0, a simple complex Lie algebra 𝔤, n highest weight 𝔤-modules Vi, and a complex parameter κ, we have such a system of equations. In the case of genus g = 1, they have the form

\[ \kappa\,\partial_{z_j} v = -\sum_{\nu} h_\nu^{(j)}\,\partial_{\lambda_\nu} v + \sum_{l:\,l\neq j} r(z_j - z_l, \lambda)^{(j,l)}\, v. \tag{1} \]

The unknown function v takes values in the zero weight space V[0] = ∩_{x∈𝔥} Ker(x) of the tensor product V = V1 ⊗ ··· ⊗ Vn with respect to a Cartan subalgebra 𝔥 of 𝔤. It depends on variables z1, ..., zn ∈ C and λ = Σ_ν λ_ν h_ν ∈ 𝔥, where (h_ν) is an orthonormal basis of 𝔥 with respect to a fixed invariant bilinear form. The notation x^{(j)} for x ∈ End(Vj) or x ∈ 𝔤 means 1 ⊗ ··· ⊗ x ⊗ ··· ⊗ 1, with x in the j-th factor. Similarly x^{(i,j)} denotes the action on the i-th and j-th factor of x ∈ End(Vi ⊗ Vj). The “r-matrix” r ∈ 𝔤 ⊗ 𝔤 obeys

\[ r(z, \lambda) + r(-z, \lambda)^{(2,1)} = 0, \qquad [\,r(z, \lambda),\; h \otimes 1 + 1 \otimes h\,] = 0 \quad \forall h \in \mathfrak{h}, \]

Supported in part by NSF grant DMS-9801582.


with (Σ_i x_i ⊗ y_i)^{(21)} = Σ_i y_i ⊗ x_i, and is a solution of the classical dynamical Yang–Baxter equation [FW] (r^{(1,2)} = r(z1 − z2, λ) ⊗ 1 ∈ U𝔤^{⊗3}, etc.)

\[ \sum_\nu \partial_{\lambda_\nu} r^{(1,2)}\, h_\nu^{(3)} + \sum_\nu \partial_{\lambda_\nu} r^{(2,3)}\, h_\nu^{(1)} + \sum_\nu \partial_{\lambda_\nu} r^{(3,1)}\, h_\nu^{(2)} - [r^{(1,2)}, r^{(1,3)}] - [r^{(1,2)}, r^{(2,3)}] - [r^{(1,3)}, r^{(2,3)}] = 0. \]

As a consequence, the KZB equations (1) are compatible, meaning that if the equations are written as ∇_j v = 0, then the differential operators ∇_j commute with each other. The solutions of the classical dynamical Yang–Baxter equation arising in conformal field theory are parametrized by the modulus τ in the upper half plane and can be expressed in terms of theta functions, see [FW, FV1].

A difference version of this story was proposed in [F]: Suppose that for an Abelian complex Lie algebra 𝔥 we have 𝔥-modules Vi, i = 1, ..., n, with a weight decomposition Vi = ⊕_{µ∈𝔥*} Vi[µ] into finite dimensional weight spaces Vi[µ]. Then we say that meromorphic functions R_{ij}(z, λ) of z ∈ C and λ ∈ 𝔥* with values in End_𝔥(Vi ⊗ Vj), (1 ≤ i ≠ j ≤ n), form a system of dynamical R-matrices if they obey the (quantum) dynamical Yang–Baxter equation

\[ R_{ij}(z_1 - z_2, \lambda - 2\eta h^{(3)})^{(12)}\, R_{ik}(z_1 - z_3, \lambda)^{(13)}\, R_{jk}(z_2 - z_3, \lambda - 2\eta h^{(1)})^{(23)} = R_{jk}(z_2 - z_3, \lambda)^{(23)}\, R_{ik}(z_1 - z_3, \lambda - 2\eta h^{(2)})^{(13)}\, R_{ij}(z_1 - z_2, \lambda)^{(12)} \]

in End(Vi ⊗ Vj ⊗ Vk) for all i < j < k and are “unitary”:

\[ R_{ij}(z, \lambda)\, R_{ji}(-z, \lambda)^{(21)} = \mathrm{Id}_{V_i \otimes V_j}. \]

We adopt a standard notation: for instance, R(z, λ − 2ηh^{(3)})^{(12)} acts on a tensor v1 ⊗ v2 ⊗ v3 as R(z, λ − 2ηµ3) ⊗ Id if v3 has weight µ3. The deformation parameter (“Planck’s constant”) is here η. If we have a family of dynamical R-matrices depending on η such that R_{ij} = Id_{Vi⊗Vj} + 2ηr_{ij} + O(η²) as η → 0, we recover the classical dynamical Yang–Baxter equation and the unitarity condition for r_{ij}, i.e. the properties that r obeys, viewed as an element of End(Vi ⊗ Vj).

If we have a system of dynamical R-matrices R_{ij} we can then construct a compatible system of difference equations, the qKZB equations for a function v(z1, ..., zn, λ) ∈ V[0]. They are a dynamical version of the I. Frenkel–Reshetikhin qKZ equations [FR], and their semiclassical limit is the KZB equations.
Their construction is reviewed in 2.1 below.

The main examples of solutions of the classical and of the quantum dynamical Yang–Baxter equations are associated with elliptic curves. In the quantum case, they can be viewed as intertwining operators between tensor products of representations of elliptic quantum groups taken in different orders [FV2]. In the rank one case (one-dimensional 𝔥) explicit expressions for R-matrices R_{Λ,M} depending on two “highest weights” Λ, M ∈ C are known. They are associated to pairs of evaluation Verma modules [FV2] for the elliptic quantum group E_{τ,η}(sl2) and were computed using the functional realization of these modules [FTV1]. If n highest weights Λ1, ..., Λn ∈ C are given, then R_{ij} = R_{Λi,Λj} form a system of dynamical R-matrices as described above. Hypergeometric solutions of the corresponding qKZB equations were introduced and studied in [FTV1, FTV2, FV4]. See also [T] where similar equations are studied and solved. Special cases of these equations appear in the statistical mechanics of RSOS


models. Form factors and correlation functions in the infinite volume limit are conjectured to obey qKZB equations. In these cases explicit formulae were proposed by Lukyanov and Pugai [LP].

The subject of this paper is a deformation of the KZB heat equation: in the classical case, in addition to the KZB equations above, which are associated to the variation of the marked points on the elliptic curve, one also has an equation associated to the variation of the modulus τ of the elliptic curve. The function v also depends on τ and one has an additional equation, compatible with the KZB equations, the KZB heat equation

\[ 4\pi i \kappa\, \partial_\tau v = \Delta_\lambda v + \frac{1}{2} \sum_{i,j} s(z, \lambda, \tau)^{(i,j)}\, v \]

for some s ∈ 𝔤 ⊗ 𝔤. Here Δ_λ denotes the Laplacian of 𝔥 corresponding to the invariant bilinear form. For example, if n = 1 then this equation reduces to

\[ 4\pi i \kappa\, \partial_\tau v = \Bigl( \Delta_\lambda - \sum_{\alpha \in \Delta} \wp(\alpha(\lambda), \tau)\, e_\alpha e_{-\alpha} \Bigr) v, \]

where e_α are properly normalized root vectors and ℘ is the Weierstrass function.

In this paper we propose a discrete version of the KZB heat equation in the rank one case. The heat operator is an integral operator, whose kernel is given in terms of hypergeometric integral solutions of the qKZB equations of [FTV2]. In Sect. 2 we review the qKZB equations and their hypergeometric solutions. Then we introduce the elliptic Shapovalov form, which is an ingredient in the integral operator, and the qKZB heat equation in Sect. 3. We prove that it is compatible with the qKZB equations, discuss its properties and show in Sect. 4, in an illustrative example, that its semiclassical limit coincides with the KZB heat equation. Finally, in Sect. 5 we study the special case where the step of the difference equation is a negative integer multiple of the deformation parameter. In this case, the semiclassical limit gives the KZB equations with positive integer κ, the situation arising in conformal field theory. In this case the KZB equations are defined on sections of the finite dimensional vector bundle of conformal blocks, of which we describe a difference analogue in simple cases.

It is very likely that the hypergeometric solutions of the qKZB equations are also solutions of the qKZB heat equation. However, we were able to prove this only in the case where the sum of the highest weights is two. In this case the hypergeometric integrals are one-dimensional.

In a sequel to this paper, we show that integral operators of the kind introduced in this paper can also be used to describe the transformation properties of hypergeometric solutions under the modular group. In fact it turns out that the hypergeometric solutions, at least if Σ Λ_i = 2, obey remarkable identities under transformations of the modulus τ and the step p by SL(3, Z) acting on CP² with affine coordinates τ, p. These identities give both the solutions and the monodromy of the solutions.

The whole picture results in a non-Abelian version of the properties of the elliptic gamma function, which is a generalized Jacobi modular form for SL(3, Z) ⋉ Z³ in the sense of [FV5].

2. Hypergeometric Solutions of the qKZB Equations

2.1. The qKZB equations. Fix = ( 1 , . . . , n ) ∈ Rn such that m = ni=1 i /2 is a nonnegative integer, and a complex number η. Unless stated otherwise, we will assume that these parameters are generic.



Let τ and p be generic complex numbers in the upper half plane. Let V_{Λj} be the vector space with basis e0, e1, ... equipped with the action of an operator h given by h e_k = (Λj − 2k) e_k. We view V_{Λj} as a representation of the Abelian Lie algebra 𝔥 = Ch. To these data is associated a system of dynamical R-matrices and thus a system of qKZB difference equations. The R-matrices R_{Λj,Λk}(z, λ, τ) [FV2] of E_{τ,η}(sl2) are endomorphisms of V_{Λj} ⊗ V_{Λk}. Let V_Λ = V_{Λ1} ⊗ ··· ⊗ V_{Λn}. The qKZB equations are equations for a meromorphic function v(z, λ) of z ∈ Cⁿ and λ ∈ C taking its values in the zero weight subspace V_Λ[0] = Ker(Σ_{i=1}^{n} h^{(i)}) of V_Λ (this subspace is nontrivial since Σ_i Λi/2 is assumed to be a nonnegative integer). It will be more convenient to view v as a function v(z) taking values in the space F(V_Λ[0]) of meromorphic functions of λ ∈ C with values in V_Λ[0]. Let δj, j = 1, ..., n, be the standard basis of Cⁿ. Then the qKZB equations have the form

\[ v(z + p\delta_i) = K_i(z, \tau, p)\, v(z), \qquad i = 1, \dots, n. \]
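Since h acts on e_k in the j-th factor with eigenvalue Λj − 2k, a tensor product e_{i1} ⊗ ··· ⊗ e_{in} has weight zero exactly when i1 + ··· + in = m = Σ Λj/2. The sketch below enumerates these index tuples; the function names and the sample weights are assumptions made for illustration and do not come from the paper.

```python
def compositions(m, n):
    """All tuples of n nonnegative integers summing to m."""
    if n == 1:
        return [(m,)]
    out = []
    for first in range(m + 1):
        for rest in compositions(m - first, n - 1):
            out.append((first,) + rest)
    return out


def zero_weight_indices(weights, tol=1e-9):
    """Index tuples I labelling the basis vectors e_I of the zero weight space."""
    half = sum(weights) / 2.0
    m = round(half)
    if m < 0 or abs(half - m) > tol:
        raise ValueError("sum(weights) / 2 must be a nonnegative integer")
    return compositions(m, len(weights))


def total_weight(weights, index):
    """Eigenvalue of h^(1) + ... + h^(n) on e_I, i.e. sum of (weight_k - 2 i_k)."""
    return sum(w - 2 * i for w, i in zip(weights, index))


if __name__ == "__main__":
    weights = (1.5, 2.5)  # hypothetical highest weights with m = 2
    for index in zero_weight_indices(weights):
        print(index, total_weight(weights, index))
```

For n factors there are C(m + n − 1, n − 1) such tuples, so the zero weight subspace of the Verma-type tensor product is finite dimensional even though each factor is infinite dimensional.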

The qKZB operators K_i(z, τ, p) act on the space F(V_Λ[0]) and are given by

\[ K_j(z, \tau, p) = R_{j,j-1}(z_j - z_{j-1} + p, \tau) \cdots R_{j,1}(z_j - z_1 + p, \tau)\; \Gamma_j\; R_{j,n}(z_j - z_n, \tau) \cdots R_{j,j+1}(z_j - z_{j+1}, \tau). \]

The operators R_{j,k}(z, τ) are defined by the formula

\[ R_{j,k}(z, \tau)\, v(\lambda) = R_{\Lambda_j, \Lambda_k}\Bigl( z,\; \lambda - 2\eta \sum_{l=1,\, l \neq j}^{k-1} h^{(l)},\; \tau \Bigr)\, v(\lambda), \]

and (Γ_j v)(λ) = v(λ − 2ηµ) if h^{(j)} v(λ) = µ v(λ); Γ_j is extended by linearity to F(V_Λ[0]). The qKZB system of difference equations is compatible, i.e., we have

\[ K_j(z + p\delta_l, \tau, p)\, K_l(z, \tau, p) = K_l(z + p\delta_j, \tau, p)\, K_j(z, \tau, p), \tag{2} \]

for all j, l, as a consequence of the dynamical Yang–Baxter equations satisfied by the R-matrices. We also consider the “mirror” qKZB operators

\[ K_j^\vee(z, p, \tau) = R^\vee_{j,j+1}(z_j - z_{j+1} + \tau, p) \cdots R^\vee_{j,n}(z_j - z_n + \tau, p)\; \Gamma_j\; R^\vee_{j,1}(z_j - z_1, p) \cdots R^\vee_{j,j-1}(z_j - z_{j-1}, p), \]

with

\[ R^\vee_{j,k}(z, p)\, v(\lambda) = R_{\Lambda_j, \Lambda_k}\Bigl( z,\; \lambda - 2\eta \sum_{l=k+1,\, l \neq j}^{n} h^{(l)},\; p \Bigr)\, v(\lambda). \]

The corresponding system of qKZB equations

\[ v(z + \tau\delta_j) = K_j^\vee(z, p, \tau)\, v(z), \qquad j = 1, \dots, n, \]

is also compatible. In fact, if we write x^∨ = (xn, ..., x1) for any x = (x1, ..., xn) ∈ Cⁿ and let P : V_Λ → V_{Λ^∨} be the linear map sending v1 ⊗ ··· ⊗ vn to vn ⊗ ··· ⊗ v1, then we have, adding the dependence on Λ in the notation,

\[ K_i^\vee(z, p, \tau; \Lambda) = P^{-1} K_{n+1-i}(z^\vee, p, \tau; \Lambda^\vee)\, P. \]



2.2. Hypergeometric solutions. In [FTV2] we constructed a “universal hypergeometric function”, which is a projective solution of the qKZB equations: it is a function u(z, λ, µ, τ, p), defined for generic values of the parameters η, Λ, taking values in V_Λ[0] ⊗ V_Λ[0] and obeying the equations

\[ \begin{aligned} u(z + \delta_j p, \tau, p) &= K_j(z, \tau, p) \otimes D_j\; u(z, \tau, p), \\ u(z + \delta_j \tau, \tau, p) &= D_j^\vee \otimes K_j^\vee(z, p, \tau)\; u(z, \tau, p), \\ u(z + \delta_j, \tau, p) &= u(z, \tau, p). \end{aligned} \tag{3} \]

Here we view u as taking values in the space of functions of λ and µ with values in V_Λ[0] ⊗ V_Λ[0]. K_j acts on the variable λ and K_j^∨ on the variable µ. The operators D_j, D_j^∨ act by multiplication by diagonal matrices D_j(µ), D_j^∨(λ), respectively. For our purpose, the most convenient description of these matrices is in terms of the function α(λ) = exp(−πiλ²/4η). We have, for j = 1, ..., n,

\[ \begin{aligned} D_j(\mu) &= \frac{\alpha\bigl(\mu - 2\eta(h^{(j+1)} + \cdots + h^{(n)})\bigr)}{\alpha\bigl(\mu - 2\eta(h^{(j)} + \cdots + h^{(n)})\bigr)}\; e^{\pi i \eta \Lambda_j \left( \sum_{l=1}^{j-1} \Lambda_l - \sum_{l=j+1}^{n} \Lambda_l \right)}, \\ D_j^\vee(\lambda) &= \frac{\alpha\bigl(\lambda - 2\eta(h^{(1)} + \cdots + h^{(j-1)})\bigr)}{\alpha\bigl(\lambda - 2\eta(h^{(1)} + \cdots + h^{(j)})\bigr)}\; e^{-\pi i \eta \Lambda_j \left( \sum_{l=1}^{j-1} \Lambda_l - \sum_{l=j+1}^{n} \Lambda_l \right)}. \end{aligned} \]

These operators are diagonal in the basis of V_Λ[0] formed by tensor products e_I = e_{i1} ⊗ ··· ⊗ e_{in} of basis vectors of the V_{Λk} so that Σ_k (Λk − 2i_k) = 0. From u one can construct projective solutions (eigenfunctions of the corresponding difference operators) by taking coefficients of the basis vectors e_I. If D_i(µ)e_I = d_{i,I}(µ)e_I, d_{i,I}(µ) ∈ C, and u = Σ_I u_I ⊗ e_I, then for any fixed I and µ, the function ṽ(z, λ) = u_I(z, λ, µ, τ, p) obeys ṽ(z + pδ_i) = d_{i,I}(µ)K_i(z, τ, p)ṽ(z). It follows that

\[ v(z, \lambda) = \prod_{i=1}^{n} d_{i,I}(\mu)^{-z_i/p}\; \tilde v(z, \lambda) \]

is a true solution to the qKZB equations. The parameters I and µ determine the multipliers, as is easily seen from the explicit expression for u below:

\[ v(z + \delta_i, \lambda) = d_{i,I}(\mu)^{-1/p}\, v(z, \lambda), \qquad v(z, \lambda + 1) = e^{-\frac{\pi i (\mu + 2\eta m)}{2\eta}}\, v(z, \lambda). \tag{4} \]

The second system of equations in (3) gives the monodromy of these solutions, see [FTV2]. The explicit expression for u is given by the following formulas.

\[ u(z, \lambda, \mu, \tau, p) = e^{-\frac{\pi i \lambda \mu}{2\eta}} \sum_{I,J} \int \prod_{i,k} \Omega_{\eta\Lambda_k}(t_i - z_k, \tau, p) \prod_{i<j} \Omega_{-2\eta}(t_i - t_j, \tau, p)\; \omega_I(t, z, \lambda, \tau)\, \omega_J^\vee(t, z, \mu, p)\; dt_1 \cdots dt_m\; e_I \otimes e_J. \tag{5} \]



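The phase function Ω has the product formula

\[ \Omega_a(z, \tau, p) = \prod_{j,k=0}^{\infty} \frac{\bigl(1 - e^{2\pi i (z - a + j\tau + kp)}\bigr)\bigl(1 - e^{2\pi i (-z - a + (j+1)\tau + (k+1)p)}\bigr)}{\bigl(1 - e^{2\pi i (z + a + j\tau + kp)}\bigr)\bigl(1 - e^{2\pi i (-z + a + (j+1)\tau + (k+1)p)}\bigr)}. \]

It is symmetric under exchanging τ and p and obeys the functional equation

\[ \Omega_a(z + p, \tau, p) = e^{2\pi i a}\, \frac{\theta(z + a, \tau)}{\theta(z - a, \tau)}\, \Omega_a(z, \tau, p). \tag{6} \]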

The weight functions ωI are given by ω(i1 ,...,in ) (t1 , . . . , tm , z, λ, τ ) =

i<j

× ×

n l−1 θ (ti − tj , τ ) θ(ti − zk + η θ (ti − tj + 2η, τ ) θ(ti − zk − η

k 0 for which the integral is over the torus (R/Z)m and defines the integral in general by analytic continuation. Remark 2.1. In [FTV2] we used only weight functions and no mirror weight functions. Then the qKZB equations (3) only involve qKZB operator and no mirror qKZB operators. The choices of this paper make the qKZB heat equation more transparent. The proof that u obeys the relations (3) is the same as the proof of Theorem 31 in [FTV2]. Note however that the conventions in the definitions of Dj are different there. 3. The qKZB Heat Equation In this section we define a q-analogue of the KZB heat equation, prove that it is compatible with the other qKZB equations and show that the differential KZB heat equation arises in the semiclassical limit in the simplest non-trivial case. The qKZB heat equation is an integral equation. The integration kernel is a contraction with the fundamental hypergeometric solution. The contraction is defined using the elliptic Shapovalov form.



3.1. The elliptic Shapovalov form. For j = 1, ..., n, µ, τ ∈ C, Im τ > 0, let Q_{Λj}(µ, τ) : V_{Λj} ⊗ V_{Λj} → C be the bilinear form on V_{Λj} with matrix elements

\[ Q_{\Lambda_j}(\mu, \tau)(e_k \otimes e_l) = \delta_{k,l}\, Q_k^{\Lambda_j}(\mu, \tau), \]

\[ Q_k^{\Lambda_j}(\mu, \tau) = \left( \frac{\theta'(0, \tau)}{\theta(2\eta, \tau)} \right)^{k} \prod_{l=1}^{k} \frac{\theta(2\eta(\Lambda_j + 1 - l), \tau)\, \theta(2\eta l, \tau)}{\theta(\mu + 2\eta(\Lambda_j + 1 - k - l), \tau)\, \theta(\mu - 2\eta l, \tau)}. \tag{7} \]

Out of these bilinear forms we define a bilinear form Q(µ, τ) : V_Λ ⊗ V_Λ → C on the tensor product V_Λ:

\[ Q(\mu, \tau) = Q_{\Lambda_1}(\mu, \tau)^{(1, n+1)}\; Q_{\Lambda_2}\bigl(\mu + 2\eta h^{(1)}, \tau\bigr)^{(2, n+2)} \cdots Q_{\Lambda_n}\Bigl(\mu + 2\eta \sum_{j=1}^{n-1} h^{(j)}, \tau\Bigr)^{(n, 2n)}, \]

and a bilinear form on the space of functions of λ with values in V_Λ[0]: if f and g are holomorphic functions from C to V_Λ[0], we set (if the integral converges)

\[ Q_\tau(f \otimes g) = \int Q(\mu, \tau)\, f(\mu) \otimes g(-\mu)\, \alpha(\mu)\, d\mu. \]

The integration is on the path t → 2ηt + ε, −∞ < t < ∞. This bilinear form is called the elliptic Shapovalov form. The main property of the elliptic Shapovalov form is that the R-matrix is in a certain sense symmetric with respect to it, see Lemma 3.9. In particular it is a sort of contravariant form for the action of the elliptic quantum group, see Eq. (12) below.
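On the integration path the weight α(µ) = exp(−πiµ²/4η) is a Gaussian in t: for µ = 2ηt one gets α(µ) = exp(−πiηt²), whose modulus is exp(π Im(η) t²), so it decays when Im η < 0. The sketch below checks this numerically; the function names and the sample values of η and of the shift (written `eps`) are illustrative assumptions.

```python
import cmath
import math


def alpha(mu, eta):
    """alpha(mu) = exp(-pi i mu^2 / (4 eta))."""
    return cmath.exp(-1j * math.pi * mu * mu / (4 * eta))


def contour_point(t, eta, eps):
    """Point of the integration path t -> 2 eta t + eps."""
    return 2 * eta * t + eps


if __name__ == "__main__":
    eta = 0.3 - 0.2j    # hypothetical value with Im(eta) < 0
    eps = 0.1 + 0.05j   # hypothetical generic shift
    for t in (0.0, 1.0, 2.0, 5.0):
        # The modulus should fall off like a Gaussian in t.
        print(t, abs(alpha(contour_point(t, eta, eps), eta)))
```

With a nonzero shift the exponent only gains a term linear in t and a constant, so the quadratic term still controls the behavior at infinity.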

3.2. Notation. To write the following formulae in the most transparent form we will use the following notational conventions. For k ∈ Z≥1 and a complex vector space V, let F_k(V) denote the space of meromorphic functions of k complex variables with values in V^{⊗k}. For example u(z, τ, p) ∈ F_2(V_Λ[0]). We also set F_0(V) = C and F(V) = F_1(V). If f ∈ F_j(V) and g ∈ F_k(V), we define f ⊗ g ∈ F_{j+k}(V) by (f ⊗ g)(λ1, ..., λ_{j+k}) = f(λ1, ..., λ_j) ⊗ g(λ_{j+1}, ..., λ_{j+k}). If A_i ∈ End(F(V)) are difference operators with meromorphic coefficients we write A_1 ⊗ ··· ⊗ A_r ∈ End(F_r(V)) to denote the composition of the operators A_i, each acting on the i-th variable and the i-th factor. We also use this notation if one of the A_i is an integral operator Q : D ⊂ F_2(V) → C of the form f → ∫ Q(µ)f(µ, −µ)dµ, Q(µ) ∈ (V ⊗ V)*, defined on some subset D. Then A_1 ⊗ ··· ⊗ A_r is defined on some subset of F_r(V) and maps to F_{r−2}(V).

3.3. The qKZB heat equation. Let T(z, τ, p) be the integral operator on V_Λ[0]-valued functions of one complex variable λ

\[ T(z, \tau, p)\, v = (\alpha \otimes Q_{\tau+p})\, u(z, \tau, \tau + p) \otimes v, \tag{8} \]

where α is the operator of multiplication by the function α(λ). More explicitly, if {e_I} is a basis of V_Λ[0] consisting of tensor products of basis vectors and u = Σ_{I,J} u_{I,J} e_I ⊗ e_J, v = Σ_I v_I e_I, Q(µ, τ)(e_I ⊗ e_J) = Q_I(µ, τ)δ_{I,J}, we have

\[ T(z, \tau, p)\, v(\lambda) = \alpha(\lambda) \sum_I \int \sum_J u_{I,J}(z, \lambda, \mu, \tau, \tau + p)\, Q_J(\mu, \tau + p)\, v_J(-\mu)\, \alpha(\mu)\, d\mu\; e_I. \tag{9} \]

Theorem 3.1. The equations

\[ \begin{aligned} v(z + p\delta_j, \tau) &= K_j(z, \tau, p)\, v(z, \tau), \qquad j = 1, \dots, n, \\ v(z, \tau) &= T(z, \tau, p)\, v(z, \tau + p), \end{aligned} \tag{10} \]

are compatible, i.e., we have, in addition to (2),

\[ T(z + p\delta_j, \tau, p)\, K_j(z, \tau + p, p) = K_j(z, \tau, p)\, T(z, \tau, p). \]

The proof of this theorem is given in Sect. 3.6. A similar statement holds for the qKZB difference operators K_j^∨: they obey the compatibility identities

\[ K_j^\vee(z + \tau\delta_l, p, \tau)\, K_l^\vee(z, p, \tau) = K_l^\vee(z + \tau\delta_j, p, \tau)\, K_j^\vee(z, p, \tau), \]

for all j, l, and the operator

\[ T^\vee(z, p, \tau)\, v = (Q_{\tau+p} \otimes \alpha)\, v \otimes u(z, \tau + p, p) \]

obeys

\[ T^\vee(z + \tau\delta_j, p, \tau)\, K_j^\vee(z, p + \tau, \tau) = K_j^\vee(z, p, \tau)\, T^\vee(z, p, \tau), \]

implying the compatibility of the mirror qKZB equations

\[ \begin{aligned} v(z + \tau\delta_j, p) &= K_j^\vee(z, p, \tau)\, v(z, p), \qquad j = 1, \dots, n, \\ v(z, p) &= T^\vee(z, p, \tau)\, v(z, p + \tau). \end{aligned} \]

Corollary 3.2. Let U(z, λ, µ, τ, p) be the function

\[ U(z, \tau, p) = (\alpha \otimes Q_{\tau+p} \otimes \alpha)\bigl( u(z, \tau, \tau + p) \otimes u(z, \tau + p, p) \bigr). \]

Then U obeys the system of equations (3).

Conjecture 3.3. Let U(z, τ, p) be the function defined in Corollary 3.2. Then U(z, τ, p) = C u(z, τ, p), for some constant C.

This conjecture is proved in the case where Σ Λ_i = 2 (with C = −e^{4πiη}/(2π√(4iη))). This proof will be published elsewhere, [FV3]. Assuming the conjecture correct, we can obtain solutions to the full system (10) as in 3.3: for arbitrary I = (i1, ..., in) ∈ Z^n_{≥0} with Σ_k (Λk − 2i_k) = 0 and µ ∈ C, let

\[ v(z, \lambda, \tau) = e^{-\frac{i\pi \mu^2 \tau}{4\eta p}} \prod_{i=1}^{n} d_{i,I}(\mu)^{-z_i/p}\; u_I(z, \lambda, \mu, \tau, p). \]

Then the function v(z, τ) : λ → v(z, λ, τ) is a solution of (10). It also obeys (4) and

\[ v(z, \lambda, \tau + 1) = e^{-\frac{i\pi \mu^2}{4\eta p}}\, v(z, \lambda, \tau). \]


Remark. The Shapovalov pairing Q_τ contains an integration ∫dµ which we chose to be the integral on the path t → 2ηt + ε. This choice makes the integral convergent if Im(η) < 0 for a large class of functions, thanks to the strong decay at infinity of the Gaussian function α(µ) on this path. This class contains in particular our hypergeometric solutions. It should however be emphasized that the only properties of ∫dµ that are needed for this construction are that it be a linear form on functions of µ invariant under translations by 2η times weights of vectors in V_{Λi}, and that it be well defined on a suitable class of functions. In particular, if the highest weights are rational with greatest common denominator d, we may replace the integral over µ by the sum over the set {λ + 2ηk/d, k ∈ Z}, so that the heat equation may be viewed as a difference equation of infinite order.

3.4. The case of integer highest weights. If Λ ∈ Z≥0, let S_Λ be the 𝔥-submodule of V_Λ generated by e_{Λ+1}, e_{Λ+2}, .... The (Λ + 1)-dimensional quotient V_Λ/S_Λ is denoted by L_Λ. The classes ē_0, ..., ē_Λ of e_0, ..., e_Λ build a basis of L_Λ. The space L_Λ carries a one-dimensional family of representations of the elliptic quantum group E_{τ,η}(sl2), see [FV2]. For integer highest weights Λi, Λj, the R-matrix R_{Λi,Λj}(z, λ, τ) preserves S_{Λi} ⊗ V_{Λj} and V_{Λi} ⊗ S_{Λj} [FV2, FTV1]. Therefore it induces an endomorphism, still denoted R_{Λi,Λj}(z, λ, τ), of L_{Λi} ⊗ L_{Λj}. If Λ1, ..., Λn ∈ Z≥0 we then have a system of dynamical R-matrices and thus a system of qKZB equations defined on L_Λ[0] = (L_{Λ1} ⊗ ··· ⊗ L_{Λn})[0]. A universal hypergeometric function û(z, λ, µ, τ, η) taking values in the tensor product of finite dimensional modules L_Λ[0] ⊗ L_Λ[0] and obeying (3) was found in [MV]. It is defined by û(z, λ, µ, τ, η) = Σ_{I,J} u_{I,J}(z, λ, µ, τ, p, η) ē_I ⊗ ē_J, where the u_{I,J} are the analytic continuation of the components of the universal hypergeometric function for V_Λ[0] ⊗ V_Λ[0], which are shown to exist for these values of I, J. Then we can introduce a heat operator T̂(z, τ, p) acting on functions with values in L_Λ[0] by the same formula (8).

Theorem 3.4. Suppose that Λ1, ..., Λn are non-negative integers. Then the equations

\[ \begin{aligned} v(z + p\delta_j, \tau) &= K_j(z, \tau, p)\, v(z, \tau), \qquad j = 1, \dots, n, \\ v(z, \tau) &= \hat T(z, \tau, p)\, v(z, \tau + p), \end{aligned} \tag{11} \]

for a function v taking values in L_Λ[0] are compatible, i.e., we have, in addition to (2),

\[ \hat T(z + p\delta_j, \tau, p)\, K_j(z, \tau + p, p) = K_j(z, \tau, p)\, \hat T(z, \tau, p). \]

The proof of this theorem is contained in 3.7 below.

3.5. Rational η. A particularly interesting case is the case of integer highest weights and rational η. Let us for instance assume that 2Nη = 1 for some positive integer N and suppose that the highest weights Λ1, ..., Λn are positive integers. If N is large enough, then the qKZB operators may still be defined. Indeed we have:



Lemma 3.5. Let Λ1, Λ2 be positive integers, and N be a large enough integer. Then the matrix elements of the R-matrix R_{Λ1,Λ2}(z, λ, τ; η) ∈ End(L_{Λ1} ⊗ L_{Λ2}) with respect to the basis {ē_i ⊗ ē_j} are regular functions of η at 2η = 1/N for fixed generic values of z, λ, τ.

Proof. The R-matrix R_{Λ1,Λ2}(z1 − z2, λ, τ; η), for generic η, is uniquely determined up to normalization by an intertwining condition for tensor products of E_{τ,η}(sl2)-modules L_{Λi}(zi), see [FV2]. The E_{τ,η}(sl2)-module L_Λ(z) may be realized for integer Λ as a symmetric tensor product of two-dimensional modules [FV2], so that the matrix elements of R_{Λi,Λj}, for a basis consisting of symmetrized tensor products of basis vectors, can be expressed as polynomials in the matrix elements of R_{1,1}. The latter matrix elements are known explicitly and are regular as functions of η. For i = 1, ..., Λ, the basis vector ē_i of L_Λ is proportional to the symmetrized tensor products of basis vectors of the two-dimensional modules. The proportionality constant is an elliptic factorial Π_{j=1}^{i} θ(2ηj)/θ(2η), which is regular and non-zero at 2ηN = 1 as long as N > Λ. Thus if N > max(Λ1, ..., Λn), the matrix elements of the R-matrix are regular at 2ηN = 1. □

Fix some generic complex number ε. Then we may consider the qKZB equations as equations for functions v(z, λ), where the dynamical parameter λ runs over the finite set {k/N + ε | k ∈ Z/2NZ}. Indeed, the coefficients of the qKZB operators are 1-periodic functions of λ and the shifts of λ in the difference operators Γ_i are integer multiples of 2η = 1/N. The shift by the generic number ε ensures that on this finite set no poles of the qKZB operators are encountered. Thus, we get:

Proposition 3.6. Suppose that N = (2η)⁻¹ and Λ1, ..., Λn are positive integers, with N large enough. Fix a generic complex number ε. Let F_N(ε) be the space of functions f : ε + (1/N)Z → V_Λ[0] so that f(λ + 2) = f(λ). Then the qKZB operators K_i(z, τ, p), K_i^∨(z, p, τ) are well-defined endomorphisms of F_N(ε).

In this situation we thus have a truly holonomic system, i.e., a compatible system of difference equations for functions taking values in a finite dimensional vector space F_N(ε). In order to define the heat equation we have to worry about the fact that the universal hypergeometric function û is not defined for all values of the parameters. Recall that û(z, λ, µ, τ, p) is also a meromorphic function of η. Let us say that η is a regular point for û if û is regular at this point for all λ, µ and all generic z, τ, p.

Theorem 3.7. Let η, Λ = (Λ1, ..., Λn), ε be as in Proposition 3.6 and assume that η is a regular point for û. Then the heat operator

\[ \hat T_N(z, \tau, p)\, v(\lambda) = e^{-\frac{N i \pi \lambda^2}{2}} \sum_{k=0}^{2N-1} \bigl(1 \otimes Q(\mu_k, \tau + p)\bigr)\, \hat u(z, \lambda, \mu_k, \tau, \tau + p) \otimes v(-\mu_k)\; e^{-\frac{N i \pi \mu_k^2}{2}}, \]

with µ_k = −ε + k/N, maps F_N(ε) to itself. Moreover, the equations for v(z, τ) ∈ F_N(ε),

\[ \begin{aligned} v(z + p\delta_j, \tau) &= K_j(z, \tau, p)\, v(z, \tau), \qquad j = 1, \dots, n, \\ v(z, \tau) &= \hat T_N(z, \tau, p)\, v(z, \tau + p), \end{aligned} \]

are compatible, i.e., we have, in addition to (2),

\[ \hat T_N(z + p\delta_j, \tau, p)\, K_j(z, \tau + p, p) = K_j(z, \tau, p)\, \hat T_N(z, \tau, p) \]

on F_N(ε).

The proof of this theorem is contained in Sect. 3.7 below. The description of the set of regular points for û will be studied elsewhere. Here we only remark that in the case n = 1, Λ1 = 2, the point η = 1/2N is a regular point for all N ≥ 3, as can easily be checked since û is given by a one-dimensional integral. In this case, Conjecture 3.3 holds, see [FV3], namely, we have

\[ \hat u(z, \lambda, \nu, \tau, p) = C_N\, e^{-\frac{N i \pi (\lambda^2 + \nu^2)}{2}} \sum_{k=0}^{2N-1} \bigl(1 \otimes Q(\mu_k, \tau + p) \otimes 1\bigr)\, \hat u(z, \lambda, \mu_k, \tau, \tau + p) \otimes \hat u(z, -\mu_k, \nu, \tau + p, p)\; e^{-\frac{N i \pi \mu_k^2}{2}}, \]

for all λ ∈ ε + (1/N)Z, ν ∈ (1/N)Z. The constant is

\[ C_N = \frac{i\, e^{2\pi i/N}}{S(N)}, \]

with the Gauss sum

\[ S(N) = \sum_{k=0}^{2N-1} e^{-\frac{\pi i k^2}{2N}} = (1 - i)\sqrt{N}. \]
3.6. Proof of Theorem 3.1. The proof is based on some identities involving R-matrices, Q and D_j. As above, we set α(λ) = exp(−πiλ²/4η).

Lemma 3.8. For any Λ, M,

\[ \frac{\alpha\bigl(\lambda - 2\eta(h^{(1)} + h^{(2)})\bigr)}{\alpha\bigl(\lambda - 2\eta h^{(2)}\bigr)}\; R_{\Lambda,M}(z + \tau, \lambda, \tau) = e^{-2\pi i \eta \Lambda M}\; R_{\Lambda,M}(z, \lambda, \tau)\; \frac{\alpha\bigl(\lambda - 2\eta h^{(1)}\bigr)}{\alpha(\lambda)}. \]

Proof. One way to prove this lemma is to use the functional realization (see [FTV1]): the matrix R_{Λ,M} relates two bases of the same space of functions. The basis elements are products of ratios of theta functions and have therefore well-behaved transformation properties under shifts of z by τ. The computation is straightforward and will not be reproduced here. □

Lemma 3.9. Let Λ, M ∈ C and v, w ∈ V_Λ ⊗ V_M. Then

\[ \bigl\langle Q_\Lambda(\mu + 2\eta h^{(2)}, \tau) \otimes Q_M(\mu, \tau)\, v,\; R_{\Lambda,M}(z, -\mu, \tau)\, w \bigr\rangle = \bigl\langle Q_\Lambda(\mu, \tau) \otimes Q_M(\mu + 2\eta h^{(1)}, \tau)\, R_{\Lambda,M}\bigl(z, \mu + 2\eta(h^{(1)} + h^{(2)}), \tau\bigr)\, v,\; w \bigr\rangle. \]

Proof. We first prove a version of this identity for L-operators. Let L(ζ, λ) ∈ End(C² ⊗ V_Λ(z)) be the L-operator of the evaluation Verma module V_Λ(z). We claim that, for any v1, v2 ∈ C² ⊗ V_Λ(z),

\[ \bigl\langle Q_1(\mu + 2\eta h^{(2)}, \tau) \otimes Q_\Lambda(\mu, \tau)\, v_1,\; L(\zeta, -\mu, \tau)\, v_2 \bigr\rangle = \bigl\langle Q_1(\mu, \tau) \otimes Q_\Lambda(\mu + 2\eta h^{(1)}, \tau)\, L\bigl(\zeta, \mu + 2\eta(h^{(1)} + h^{(2)}), \tau\bigr)\, v_1,\; v_2 \bigr\rangle. \tag{12} \]


We have Q^1_0(µ, τ) = 1 and Q^1_1(µ, τ) = θ(2η)θ(µ − 2η)⁻¹θ(µ)⁻¹. Define the matrix elements of L by L(ζ, µ) e_j ⊗ v = Σ_{k=0,1} e_k ⊗ L_{kj}(ζ, µ)v. Then the claim is equivalent to

\[ \begin{aligned} Q_k^\Lambda(\mu, \tau)\, \langle e_k, L_{00}(\zeta, -\mu) e_k \rangle &= Q_k^\Lambda(\mu + 2\eta, \tau)\, \langle L_{00}(\zeta, \mu + 2\eta(\Lambda - 2k + 1)) e_k, e_k \rangle, \\ Q_k^\Lambda(\mu, \tau)\, \langle e_k, L_{01}(\zeta, -\mu) e_{k-1} \rangle &= Q_1^1(\mu, \tau)\, Q_{k-1}^\Lambda(\mu - 2\eta, \tau)\, \langle L_{10}(\zeta, \mu + 2\eta(\Lambda - 2k + 1)) e_k, e_{k-1} \rangle, \\ Q_1^1(\mu + 2\eta(\Lambda - 2k), \tau)\, Q_k^\Lambda(\mu, \tau)\, \langle e_k, L_{10}(\zeta, -\mu) e_{k+1} \rangle &= Q_{k+1}^\Lambda(\mu + 2\eta, \tau)\, \langle L_{01}(\zeta, \mu + 2\eta(\Lambda - 2k - 1)) e_k, e_{k+1} \rangle, \\ Q_1^1(\mu + 2\eta(\Lambda - 2k), \tau)\, Q_k^\Lambda(\mu, \tau)\, \langle e_k, L_{11}(\zeta, -\mu) e_k \rangle &= Q_1^1(\mu, \tau)\, Q_k^\Lambda(\mu - 2\eta, \tau)\, \langle L_{11}(\zeta, \mu + 2\eta(\Lambda - 2k - 1)) e_k, e_k \rangle. \end{aligned} \]

These identities follow immediately from the explicit expressions given in [FV2] for the operators L_{kj} (called a, b, c, d in [FV2]).

We now extend this result to the general case. We use the intertwining property of the R-matrix: let L_Λ, L_M be the L-operators of V_Λ(z1), V_M(z2), respectively. Then¹ R_{Λ,M}(z1 − z2, µ) ∈ End(V_Λ(z1) ⊗ V_M(z2)) is uniquely determined up to a factor by the relation

\[ L(\zeta, \mu)\, R_{\Lambda,M}(z_1 - z_2, \mu - 2\eta h^{(1)})^{(23)} = R_{\Lambda,M}(z_1 - z_2, \mu)^{(23)}\, L'(\zeta, \mu). \]

The operators L and L′ (giving the action of the quantum group on the tensor product by using the coproduct and the opposite coproduct, respectively) are defined by

\[ L(\zeta, \mu) = L_\Lambda(\zeta, \mu - 2\eta h^{(3)})^{(12)}\, L_M(\zeta, \mu)^{(13)}, \qquad L'(\zeta, \mu) = L_M(\zeta, \mu - 2\eta h^{(2)})^{(13)}\, L_\Lambda(\zeta, \mu)^{(12)}. \]

The R-matrix is normalized by the condition R_{Λ,M}(z1 − z2, µ) e0 ⊗ e0 = e0 ⊗ e0. In particular, if v = w = e0 ⊗ e0, the claim of the lemma is correct for trivial reasons. We prove the general case by induction: let us suppose that the lemma is proved for v, w of weight Λ + M − 2j, j = 0, ..., k − 1, k ≥ 1. Now it is known, see [FV2], that, for generic parameters, the weight space V_Λ(z1) ⊗ V_M(z2)[Λ + M − 2k] is spanned by vectors of the form L_{01}(ζ, λ)x (or L′_{01}(ζ, λ)x), ζ ∈ C, x of weight Λ + M − 2(k − 1), and any fixed generic λ. Indeed, if these vectors did not span the weight space, they would be part of a proper submodule, contradicting the irreducibility of the tensor product. By iterating (12), we obtain

\[ \bigl\langle Q_1(\mu + 2\eta(h^{(2)} + h^{(3)})) \otimes Q_\Lambda(\mu + 2\eta h^{(3)}) \otimes Q_M(\mu)\, v_1,\; L(\zeta, -\mu)\, v_2 \bigr\rangle = \bigl\langle Q_1(\mu) \otimes Q_\Lambda(\mu + 2\eta(h^{(1)} + h^{(3)})) \otimes Q_M(\mu + 2\eta h^{(1)})\, L'\bigl(\zeta, \mu + 2\eta(h^{(1)} + h^{(2)} + h^{(3)})\bigr)\, v_1,\; v_2 \bigr\rangle. \]

¹ Here we omit the argument τ to shorten the notation.



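In particular, if v1 = e0 ⊗ v, v2 = e1 ⊗ w, one has

\[ \bigl\langle Q_\Lambda(\mu + 2\eta h^{(2)}) \otimes Q_M(\mu)\, v,\; L_{01}(\zeta, -\mu)\, w \bigr\rangle = Q_1^1(\mu)\, \bigl\langle Q_\Lambda(\mu + 2\eta(-1 + h^{(2)})) \otimes Q_M(\mu - 2\eta)\, L_{10}\bigl(\zeta, \mu + 2\eta(1 + h^{(1)} + h^{(2)})\bigr)\, v,\; w \bigr\rangle. \]

We turn to the proof of the induction step. Suppose that v, w have weight Λ + M − 2k, and write w = L_{01}(ζ, µ)x. Let us set z = z1 − z2. Then

\[ \begin{aligned} &\bigl\langle Q_\Lambda(\mu + 2\eta h^{(2)}) \otimes Q_M(\mu)\, v,\; R_{\Lambda,M}(z, -\mu)\, w \bigr\rangle \\ &\quad = \bigl\langle Q_\Lambda(\mu + 2\eta h^{(2)}) \otimes Q_M(\mu)\, v,\; R_{\Lambda,M}(z, -\mu)\, L'_{01}(\zeta, -\mu)\, x \bigr\rangle \\ &\quad = \bigl\langle Q_\Lambda(\mu + 2\eta h^{(2)}) \otimes Q_M(\mu)\, v,\; L_{01}(\zeta, -\mu)\, R_{\Lambda,M}(z, -\mu + 2\eta)\, x \bigr\rangle \\ &\quad = Q_1^1(\mu)\, \bigl\langle Q_\Lambda(\mu + 2\eta(-1 + h^{(2)})) \otimes Q_M(\mu - 2\eta)\, L_{10}\bigl(\zeta, \mu + 2\eta(1 + h^{(1)} + h^{(2)})\bigr)\, v,\; R_{\Lambda,M}(z, -\mu + 2\eta)\, x \bigr\rangle \\ &\quad = Q_1^1(\mu)\, \bigl\langle Q_\Lambda(\mu - 2\eta) \otimes Q_M(\mu + 2\eta(-1 + h^{(1)}))\, R_{\Lambda,M}\bigl(z, \mu + 2\eta(-1 + h^{(1)} + h^{(2)})\bigr)\, L_{10}\bigl(\zeta, \mu + 2\eta(1 + \Lambda + M - 2k)\bigr)\, v,\; x \bigr\rangle. \end{aligned} \]

In the last step, we used the induction hypothesis. The calculation continues by commuting R with L′, and then by bringing L to the right. This last part is similar to the above calculation read backwards, and will not be reproduced in detail. One finally obtains, as desired,

\[ \bigl\langle Q_\Lambda(\mu) \otimes Q_M(\mu + 2\eta h^{(1)})\, R_{\Lambda,M}\bigl(z, \mu + 2\eta(h^{(1)} + h^{(2)})\bigr)\, v,\; w \bigr\rangle. \]

This completes the induction step and thus the proof of the lemma. □

Lemma 3.10.

\[ \begin{aligned} \alpha\, (D_j^\vee)^{-1} K_j(z, \tau, p + \tau) &= K_j(z, \tau, p)\; \alpha\; e^{-\pi i \eta \Lambda_j \left( \sum_{l \neq j} \Lambda_l \right)}, \\ \alpha\, D_j^{-1} K_j^\vee(z, p, \tau + p) &= K_j^\vee(z, p, \tau)\; \alpha\; e^{-\pi i \eta \Lambda_j \left( \sum_{l \neq j} \Lambda_l \right)}. \end{aligned} \]

Proof. This is a straightforward consequence of the definition of the difference operators K_j, K_j^∨ and of Lemma 3.8. □

Lemma 3.11. Let f, g be holomorphic functions from C to V_Λ. Then

\[ \begin{aligned} Q_{\tau+p}\bigl(f, K_j(z, \tau + p, p)\, g\bigr) &= Q_{\tau+p}\bigl(D_j^{-1} K_j^\vee(z + p\delta_j, \tau + p, \tau)\, f, g\bigr)\; e^{\pi i \eta \Lambda_j \left( \sum_{l \neq j} \Lambda_l \right)}, \\ Q_{\tau+p}\bigl(K_j^\vee(z, \tau + p, \tau)\, f, g\bigr) &= Q_{\tau+p}\bigl(f, (D_j^\vee)^{-1} K_j(z + \tau\delta_j, \tau + p, p)\, g\bigr)\; e^{\pi i \eta \Lambda_j \left( \sum_{l \neq j} \Lambda_l \right)}. \end{aligned} \]

Proof. The proof of the first identity is given by using Lemma 3.9 to bring the R-matrices in K_j to the left, the translation invariance of the integral to bring Γ_j to the left, and Lemma 3.10 to commute the resulting K_j^∨(z + pδ_j, τ + p, −p) with α. The proof of the second identity is similar. □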



We can now complete the proof of Theorem 3.1. Let Cj = eπiη function from C to V [0],

j(

l=j

l)

and v be a

T (z + pδj , τ, p)Kj (z, τ + p, p) v = (α ⊗ Qτ +p )u(z + pδj , τ, p + τ ) ⊗ Kj (z, τ + p, p) v = Cj (α ⊗ Qτ +p )(1 ⊗ Dj−1 Kj∨ (z + pδj , τ + p, τ )u(z + pδj , τ, p + τ )) ⊗ v = Cj (α ⊗ Qτ +p )((Dj∨ )−1 ⊗ Dj−1 u(z + (p + τ )δj , τ, τ + p)) ⊗ v = Cj (α ⊗ Qτ +p )((Dj∨ )−1 Kj (z, τ, τ + p) ⊗ 1 u(z, τ, τ + p)) ⊗ v = Kj (z, τ, p)T (z, τ, p) v.

3.7. Proof of Theorem 3.4 and Theorem 3.7. Theorem 3.4 can be proven in the same way as Theorem 3.1. There is however an apparent difficulty: the heat operator involves the Shapovalov form which contains a sum over all components of u, including those that a priori do not have a limit for integer highest weights. The solution is provided by Theorem 2 of [MV]: let us say that I = (i1 , . . . , in ) ∈ Zn≥0 is admissible for if ia ≤ a for all a = 1, . . . , n. Then (a special case of) Theorem 2 states that the components uI,J (z, λ, µ, τ, p) such that I or J is admissible, are regular functions of the highest weights at for generic values of the other variables. Moreover, Qk (µ, τ ) vanishes if k ≥ , cf. (7), so that the sum appearing in the compatibility condition is effectively restricted to admissible indices. Theorem 3.7 is proven in the same way as Theorem 3.1 and 3.4. In fact the only property of the integral over µ that is used in the proof is the translation invariance. So the same proof gives the compatibility relation in this case provided the function of µk on the right-hand side is periodic in k with period 2N . Now u(z, λ, µ, τ, p) is exp(−πiN λµ) times a 2-periodic function of λ and µ. So the exponential factors combine into the expression e−

iπ N (λ+µk )2 2

= e−

iπ N (λ−=+k/N )2 2

.

If λ ∈ = + N1 Z, this expression is periodic in k with period 2N . The same argument shows that TN (z, τ, p)v(λ) is 2-periodic in λ for λ ∈ = + N1 Z. ! " 4. Semiclassical Limit We consider here the semiclassical limit of our quantum heat equation in the simplest non-trivial case and show that we do recover the KZB heat equation in this limit. The case we consider is n = 1, with 1 = 2. The qKZB equations for the dependence of z1 are trivial in this case and we can assume that z1 = 0. The zero weight space is onedimensional, and we identify it with C using the basis e1 . Suppose that vη (λ, τ ) is a family of solutions of the qKZB equations with parameters τ, p = −2κη, τ, η, parametrized by η around zero. Assume that vη has an asymptotic expansion vη (λ, τ ) = v0 (λ, τ ) + O(η)


Fig. 1. The integration cycle γ. The points ±2η are the singularities of the integrand.

at η = 0. We want to find the equations satisfied by $v_0$. For this we expand the qKZB heat equation

\[
v_\eta(\lambda,\tau) = \frac{-1}{4\pi\sqrt{i\eta}}\; e^{-\frac{i\pi}{4\eta}\lambda^2} \int u(\lambda,\mu,\tau,p,\eta)\; e^{-\frac{i\pi}{4\eta}\mu^2}\; \frac{\theta(4\eta,\tau+p)\,\theta'(0,\tau+p)}{\theta(\mu+2\eta,\tau+p)\,\theta(\mu-2\eta,\tau+p)}\; v_\eta(-\mu,\tau+p)\, d\mu, \tag{13}
\]

around η = 0, setting p = −2κη and keeping τ, κ, λ fixed. The dependence on η of the constant in front of the integral was chosen in such a way that the semiclassical limit exists. The integration path is t → µ = ηt (t ∈ R). The hypergeometric solution u is independent of z in this case and is given by the formula

\[
u(\lambda,\mu,\tau,p,\eta) = e^{-\frac{i\pi}{2\eta}\lambda\mu} \int_\gamma \Omega_{2\eta}(t,\tau,\tau+p)\; \frac{\theta(\lambda+t,\tau)\,\theta(\mu+t,\tau+p)}{\theta(t-2\eta,\tau)\,\theta(t-2\eta,\tau+p)}\; dt.
\]

The integration cycle γ is depicted in Fig. 1.

Theorem 4.1. Suppose that $v_\eta(\lambda,\tau)$ is a family of solutions of (13) with an asymptotic expansion $v_\eta(\lambda,\tau) = v_0(\lambda,\tau) + \eta v_1(\lambda,\tau) + \cdots$. Then $v(\lambda,\tau) = v_0(\lambda,\tau)/\theta(\lambda,\tau)$ obeys the KZB heat equation

\[
2\pi i\kappa\,\frac{\partial v}{\partial\tau} = \frac{\partial^2 v}{\partial\lambda^2} - 2\wp(\lambda,\tau)\,v + c(\tau)\,v,
\]

for some c(τ) independent of λ.
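The proof of this theorem rests on a stationary-phase expansion of a Gaussian integral along the path µ = ηt. As a numerical sanity check (not part of the original argument), the sketch below evaluates such an integral for the test function g(µ) = cos µ, which is an assumption chosen because the integral is then known in closed form, cos(λ) e^{iη/π}. It compares the result with the first two terms g(λ) + (η/iπ) g''(λ) of the expansion. The function name `gaussian_average` and all parameter values are illustrative.

```python
import numpy as np

def gaussian_average(g, lam, eta, t_max=60.0, n=200001):
    """Approximate i/sqrt(4 i eta) * int exp(-i pi (lam+mu)^2 / (4 eta)) g(-mu) dmu
    along the path mu = eta * t, t real, with the trapezoidal rule.
    Requires Im(eta) < 0 so that the Gaussian factor decays."""
    t = np.linspace(-t_max, t_max, n)
    mu = eta * t
    integrand = np.exp(-1j * np.pi * (lam + mu) ** 2 / (4 * eta)) * g(-mu)
    dt = t[1] - t[0]
    # d(mu) = eta dt; trapezoidal weights halve the two end points
    integral = eta * dt * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return 1j / np.sqrt(4j * eta) * integral

eta = 0.01 - 0.02j          # small parameter in the lower half plane
lam = 0.3
value = gaussian_average(np.cos, lam, eta)
closed_form = np.cos(lam) * np.exp(1j * eta / np.pi)
# g(lam) + (eta / (i pi)) g''(lam) with g = cos, g'' = -cos
two_terms = np.cos(lam) * (1 + 1j * eta / np.pi)
print(abs(value - closed_form), abs(value - two_terms))
```

The second printed difference should be of order |η|², in line with the O(η²) remainder of the expansion used in the proof.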



Proof. The integral on the right-hand side of (13) has the form

\[
I_\eta = \frac{i}{\sqrt{4i\eta}} \int_{-\eta\infty}^{\eta\infty} e^{-\frac{i\pi}{4\eta}(\lambda+\mu)^2}\, g(\lambda,-\mu,\eta)\, d\mu.
\]

This integral has the asymptotic expansion as η → 0,

\[
I_\eta = g(\lambda,\lambda,0) + \eta\left( \frac{1}{i\pi}\,\frac{\partial^2}{\partial\mu^2}\bigg|_{\mu=\lambda} g(\lambda,\mu,0) + \frac{\partial}{\partial\eta}\bigg|_{\eta=0} g(\lambda,\lambda,\eta) \right) + O(\eta^2).
\]

To compute the various terms of this expression, we first notice that the integration cycle in u is pinched by the singularities as η → 0. The integral defining u can then be expressed as a divergent (as η → 0) part given by 2πi times the residue at t = 2η plus the integral on a cycle $\bar\gamma$ which stays away from the singularities. To compute the residue we introduce $\tilde\Omega$ by

\[
\Omega_{2\eta}(t,\tau,\tau+p) = \frac{1-e^{2\pi i(t-2\eta)}}{1-e^{2\pi i(t+2\eta)}}\;\tilde\Omega_{2\eta}(t,\tau,\tau+p) = (t-2\eta)\,\frac{2\pi i}{e^{8\pi i\eta}-1}\;\tilde\Omega_{2\eta}(2\eta,\tau,\tau+p) + O((t-2\eta)^2).
\]

As η → 0, $\tilde\Omega_{2\eta}(2\eta,\tau,\tau-2\kappa\eta)$ is regular and converges to 1. We then have g(λ, µ, η) =

2πi ˜ 2η (2η, τ, τ + p) 4 −1 θ (λ + 2η, τ + p)θ(4η, τ + p)vη (µ, τ + p) × θ (0, τ )θ (µ + 2η, τ + p) 1 θ(λ + t, τ ) θ(µ + t, τ + p) − dt 42η (t, τ, τ + p) 2πi γ¯ θ(t − 2η, τ ) θ(t − 2η, τ + p)

e8πiη

×

θ (4η, τ + p)θ (0, τ + p) vη (µ, τ + p). θ (µ + 2η, τ + p)θ(µ − 2η, τ + p)

From these formulae we can compute the various terms: g(λ, λ, 0) = v0 (λ, τ ), 2 v0 (λ, τ ) g(λ, µ, 0) = θ(λ, τ )∂λ . θ(λ, τ ) µ=λ

2 ∂ µ ∂2

Finally ∂ ∂ g(λ, λ, η) = C (τ ) v (λ, τ ) + η vη (λ, τ ) 1 0 ∂η η=0 ∂η η=0 ∂ v0 (λ, τ ) 2 θ(t + λ, τ )θ(t − λ, τ ) − 2κ θ (λ, τ ) − dt v0 (λ, τ ). π i γ¯ θ(t, τ )2 θ(λ, τ )2 ∂τ θ (λ, τ ) Here C1 (τ ) is some scalar function independent of λ. Using the identity θ (t + λ, τ )θ (t − λ, τ ) 1 = (℘ (λ, τ ) − ℘ (t, τ )) , θ (0, τ )2 θ (t, τ )2 θ (λ, τ )2



we see that the right-hand side of (13) is ∂ v0 (λ, τ ) + η vη (λ, τ ) ∂η η=0 1 ∂2 ∂ 2 v0 (λ, τ ) + ηθ (λ, τ ) − 2κ − ℘ (λ, τ ) + c(τ ) + O(η2 ), 2 iπ ∂λ ∂τ πi θ(λ, τ ) for some function c(τ ) independent of λ. Since the first two terms also appear on the left-hand side, the proof is complete. 5. Conformal Blocks In this section, we introduce, in the simplest case of one marked point, a difference analogue of the vector bundle of conformal blocks. We begin by reviewing the differential case. The vector bundle of conformal blocks is, in this case, a vector bundle on the moduli space M1,1 of genus one curves with one marked point. The projectivization of this vector bundle carries a connection given by the KZB differential operator. We then give a difference analogue of this vector bundle. It has a (discrete) connection, which is now given by the qKZB heat operator T . 5.1. The differential case. Let g be a simple complex Lie algebra with Cartan subalgebra h and root space decomposition g = h ⊕ ⊕α∈$ gα . Let the non-degenerate invariant bilinear form ( , ) on g ' g∗ be normalized so that the highest root θ obeys (θ, θ ) = 2. Let κ be an integer larger than or equal to the dual Coxeter number h∨ of g, and ∈ h∗ be a dominant integral weight, so that (θ, ) ≤ κ − h∨ . Denote by L the irreducible g-module of highest weight . To these data one associates a holomorphic vector bundle of conformal blocks on the moduli space M1,1 of genus one complex curves with one marked point [TUY]. Its projectivization carries a canonical flat connection. The fiber over a point may be defined as a space of coinvariants for the Lie algebra of g-valued rational functions on the curve whose poles are at the marked point, acting on the irreducible affine Kac–Moody Lie algebra module of highest weight and level κ − h∨ . 
An explicit description [FW] of this bundle, which for our purposes can be taken as a definition, may be obtained by viewing $M_{1,1}$ as the quotient of the upper half plane $H_+$ by SL(2, Z). We may then regard the vector bundle $E_{\kappa,\Lambda}$ of conformal blocks as an SL(2, Z)-equivariant vector bundle over $H_+$. Let $L_\Lambda[0] = \{v \in L_\Lambda \mid \mathfrak{h}v = 0\}$ be the zero weight space of $L_\Lambda$. It carries a natural linear action of the Weyl group W of g. The fiber of $E_{\kappa,\Lambda}$ over τ ∈ $H_+$ is then the space of holomorphic maps $v : \mathfrak{h} \to L_\Lambda[0]$ such that

(i) $v(\lambda + q_1 + q_2\tau) = \exp(-\pi i\kappa(q_2,q_2)\tau - 2\pi i\kappa(q_2,\lambda))\,v(\lambda)$ for all λ ∈ h and $q_1, q_2$ in the coroot lattice $Q^\vee$.
(ii) $v(w\cdot\lambda) = \varepsilon(w)\,w\cdot v(\lambda)$ for all w ∈ W, where ε : W → {±1} is the homomorphism sending reflections to −1.
(iii) For all roots α, $x \in \mathfrak{g}_\alpha$, and integers l ≥ 0, r, s, the map v obeys the vanishing condition $x^l v(\lambda) = O\big((\alpha(\lambda) - r - s\tau)^{l+1}\big)$ as α(λ) → r + sτ.



The action of SL(2, Z) on the base may be lifted to an action on the bundle: let ab g= ∈ SL(2, Z) act on H+ by τ → g · τ = (aτ + b)/(cτ + d). Then we have cd isomorphisms ψg (τ ) : Eκ,m (τ ) → Eκ,m (g · τ ) given by ψg (τ )v(λ) = e

π iκ 2 2 c(cλ+d)λ

v((cτ + d)λ),

obeying the cocycle condition ψgh (τ ) = ψg (h · τ )ψh (τ ). Denote by π : H+ → M1,1 the canonical projection. Local holomorphic sections of the vector bundle of conformal blocks on an open set U ⊂ M1,1 are then the same as holomorphic sections v of Eκ,m on π −1 (U ) so that v(g · τ ) = ψg (τ )−1 v(τ ). In other words, they are holomorphic functions v(λ, τ ) on C × p −1 (U ) obeying (i)-(iii) for each fixed τ and such that 2 λ aτ + b − πiκcλ v , = e 2(cτ +d) v(λ, τ ). cτ + d cτ + d The projectivization of this vector bundle carries a holomorphic connection, and horizontal sections may be constructed by an elliptic version of hypergeometric integrals [FV1]. We describe here the connection in the case of sl(2, C). If g = sl(2, C) and = mα, m = 0, 1, . . . , then L [0] is one dimensional. Let us choose a basis of L [0] and identify h ' h∗ with C via the basis α/2. Then Eκ, (τ ) = Eκ,2m (τ ) consists of holomorphic functions v(λ) on the complex plane so that (i) v(λ + 2r + 2sτ ) = exp(−2π iκ(s 2 τ + sλ))v(λ), (ii) v(−λ) = (−1)m+1 v(λ), (iii) v is divisible by θ(λ, τ )m+1 in the ring of holomorphic functions. If κ ≥ 2m + 2, we have Eκ,2m (τ ) = θ(λ, τ )m+1 Kκ−2m−2 (τ )W , where Kκ (τ )W is the κ + 1-dimensional space of holomorphic even functions obeying (i). Otherwise Eκ,2m (τ ) is trivial. It follows that κ − 2m − 1, if κ ≥ 2m + 2, dim(Eκ,2m (τ )) = 0, otherwise. The connection on Eκ,2m is defined by its covariant derivative /(U, Eκ,2m ) → /(U, Eκ,2m ) ⊗ 41 (U ) on local holomorphic sections: 1 2 −1 ∇v(λ, τ ) = ∂τ − ∂ − m(m + 1)℘ (λ, τ ) − η(τ ) ∂τ η(τ ) v(λ, τ ) dτ. 2πiκ λ Here ℘ is the Weierstrass elliptic function with periods 1 and τ and η(τ ) = eπiτ/12

$\prod_{j=1}^{\infty} (1 - e^{2\pi i j\tau})$

is the Dedekind η-function.² In spite of the poles of the ℘ function, this connection is well-defined on $E_{\kappa,2m}$ as can be seen by noticing that the poles cancel in the expression of the induced connection $\theta^{-m-1}\circ\nabla\circ\theta^{m+1}$ on $K^W_{\kappa-2m-2}$. The fact that ∇ preserves (i) and (ii) is easily checked.

² The connection, being on the projectivization, is really defined up to adding a multiple of the identity. We have chosen it here so that it defines a connection on the vector bundle over $M_{1,1}$.



The connection ∇ is SL(2, Z)-equivariant, in the following sense: if U ⊂ $H_+$ is an SL(2, Z)-invariant open set, and g ∈ SL(2, Z), we have the pull-back $g^* : \Gamma(U, E_{\kappa,2m}) \to \Gamma(U, E_{\kappa,2m})$, sending a section v(τ) to $\psi_g(\tau)^{-1} v(g\cdot\tau)$. We may extend $g^*$ to $\Gamma(U, E_{\kappa,2m}) \otimes \Omega^1(U)$ by tensoring with the pull-back of differential forms. Then $g^*\circ\nabla = \nabla\circ g^*$. Therefore the connection is well-defined on the vector bundle of conformal blocks on $M_{1,1}$.

Example. If m = 0, ∇ is essentially the differential operator of the heat equation. The theta functions

\[
\theta_{j,\kappa}(\lambda,\tau) = \sum_{r\in\mathbb{Z}+j/2\kappa} e^{2\pi i\kappa(r^2\tau + r\lambda)}, \qquad j \in \mathbb{Z}/2\kappa\mathbb{Z},
\]

form a basis of $K_\kappa(\tau)$ for fixed τ, and obey the heat equation $2\pi i\kappa\,\partial_\tau\theta_{j,\kappa} = \partial_\lambda^2\theta_{j,\kappa}$. Moreover, we have $\theta_{j,\kappa}(-\lambda,\tau) = \theta_{-j,\kappa}(\lambda,\tau)$. It follows that the functions

\[
v_j(\lambda,\tau) = \eta(\tau)^{-1}\big(\theta_{j+1,\kappa}(\lambda,\tau) - \theta_{-j-1,\kappa}(\lambda,\tau)\big), \qquad j = 0, 1, \dots, \kappa-2, \tag{14}
\]

form a basis of the space of horizontal sections. See [FV1] for the case of arbitrary m.

5.2. The difference case. Let us turn to the difference case (for sl(2, C)). We describe a difference analogue of $E_{\kappa,2m}$, a holomorphic vector bundle $E_{\kappa,2m,\eta}$ on $H_+$ which is preserved by the qKZB heat operator. We fix a generic η in the lower half plane. Guided by the semiclassical analysis of Sect. 4, we suppose that −p/2η = κ is an integer ≥ 2 and consider the qKZB heat operator (8) for n = 1, $z_1 = 0$, $\Lambda_1 = 2m$. We start with the somewhat trivial but instructive case m = 0, and write $T_{\kappa,0}(\tau) = T(z = 0, \tau, p = -2\eta\kappa)$. Here the qKZB heat operator is

\[
T_{\kappa,0}(\tau)v(\lambda) = \int_{2\eta\mathbb{R}} e^{-\frac{\pi i}{4\eta}(\lambda+\mu)^2}\, v(-\mu)\, d\mu.
\]

The integral is over the path t → 2ηt, −∞ < t < ∞. We define $E_{\kappa,2m=0,\eta} = E_{\kappa,0}$ to be the holomorphic vector bundle of odd theta functions, as in the differential case: the fiber over τ ∈ $H_+$ is

\[
E_{\kappa,0,\eta}(\tau) = \{f \in K_\kappa(\tau) \mid f(-\lambda) = -f(\lambda)\}.
\]

Theorem 5.1. Let κ ≥ 2 and suppose that Im η < 0, Im τ > 0. Then $T_{\kappa,0}(\tau)$ maps $E_{\kappa,0,\eta}(\tau - 2\eta\kappa)$ to $E_{\kappa,0,\eta}(\tau)$.

This theorem is based on the identity

\[
\theta_{j,\kappa}(\lambda,\tau) = \frac{i}{\sqrt{4i\eta}} \int_{2\eta\mathbb{R}} e^{-\frac{i\pi}{4\eta}(\lambda+\mu)^2}\, \theta_{j,\kappa}(-\mu,\tau-2\eta\kappa)\, d\mu, \qquad j \in \mathbb{Z}/2\kappa\mathbb{Z},
\]

which gives the action of $T_{\kappa,0}(\tau)$ on the basis $\theta_j - \theta_{-j}$, j = 1, . . . , κ − 1, of $K_\kappa(\tau - 2\eta\kappa)$.

Let us now turn to the case of general m. To compare with the classical limit we consider the qKZB operator for the quotient v of the dependent function by $\prod_{j=1}^m \theta(\lambda + 2\eta j, \tau)$, i.e., we set

\[
T_{\kappa,m}(\tau) = \varphi_m(\tau)^{-1} \circ T(z = 0, \tau, p = \tau - 2\eta\kappa) \circ \varphi_m(\tau - 2\eta\kappa),
\]

where $\varphi_m(\tau)$ is the operator of multiplication by the function $\lambda \mapsto \prod_{j=1}^m \theta(\lambda + 2\eta j, \tau)$.
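The Gaussian identity underlying Theorem 5.1 can be checked numerically. The following sketch is an illustration only: it truncates the theta series, integrates along µ = 2ηt with the trapezoidal rule, and compares both sides for every j in Z/2κZ. The names `theta_level` and `gaussian_transform`, the cutoffs, and the sample values of κ, τ, η, λ are assumptions made for this example.

```python
import numpy as np

def theta_level(idx, kappa, lam, tau, cutoff=30):
    """Truncated theta_{j,kappa}(lam, tau) = sum over r in Z + j/(2 kappa)
    of exp(2 pi i kappa (r^2 tau + r lam)); lam may be a scalar or an array."""
    r = np.arange(-cutoff, cutoff + 1) + idx / (2 * kappa)
    lam = np.asarray(lam, dtype=complex)
    phase = 2j * np.pi * kappa * (r ** 2 * tau + r * lam[..., None])
    return np.exp(phase).sum(axis=-1)

def gaussian_transform(idx, kappa, lam, tau, eta, t_max=12.0, n=24001):
    """i/sqrt(4 i eta) * integral over mu = 2 eta t of
    exp(-i pi (lam+mu)^2 / (4 eta)) * theta_{j,kappa}(-mu, tau - 2 eta kappa)."""
    t = np.linspace(-t_max, t_max, n)
    mu = 2 * eta * t
    vals = (np.exp(-1j * np.pi * (lam + mu) ** 2 / (4 * eta))
            * theta_level(idx, kappa, -mu, tau - 2 * eta * kappa))
    dt = t[1] - t[0]
    # d(mu) = 2 eta dt; trapezoidal weights halve the two end points
    integral = 2 * eta * dt * (vals.sum() - 0.5 * (vals[0] + vals[-1]))
    return 1j / np.sqrt(4j * eta) * integral

kappa, tau, eta, lam = 3, 0.2 + 1.0j, 0.1 - 0.3j, 0.37   # Im eta < 0, Im tau > 0
errors = [abs(theta_level(j, kappa, lam, tau)
              - gaussian_transform(j, kappa, lam, tau, eta))
          for j in range(2 * kappa)]
print(max(errors))
```

The same helper also makes the parity relation $\theta_{j,\kappa}(-\lambda,\tau) = \theta_{-j,\kappa}(\lambda,\tau)$ easy to test, since the truncated sum is symmetric in r.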



Example. If m = 1, the qKZB operator for v is v → $T_{\kappa,1}(\tau)v$, where

\[
T_{\kappa,1}(\tau)v(\lambda) = \alpha(\lambda)\int_{2\eta\mathbb{R}} V(\lambda,\mu,\tau,\tau-2\eta\kappa)\,\alpha(\mu)\,v(-\mu)\,d\mu,
\]

with kernel

\[
V(\lambda,\mu,\tau,\sigma) = c\,e^{-\frac{\pi i\lambda\mu}{2\eta}} \int_\gamma \Omega_{2\eta}(t,\tau,\sigma)\; \frac{\theta(\lambda+t,\tau)\,\theta(\mu+t,\sigma)}{\theta(t-2\eta,\tau)\,\theta(\lambda+2\eta,\tau)\,\theta(t-2\eta,\sigma)\,\theta(\mu+2\eta,\sigma)}\; dt,
\]

for some c = c(τ, σ) independent of λ, µ. The integration cycle is depicted in Fig. 1.

Let $E_{\kappa,2m,\eta}(\tau)$ be the space of holomorphic functions so that

(i) $v(\lambda + 2r + 2s\tau) = \exp\big(4\pi i\eta m(m+1)s - 2\pi i\kappa(s^2\tau + s\lambda)\big)\,v(\lambda)$,
(ii) $v(-\lambda) = (-1)^{m+1} \prod_{j=1}^{m} \dfrac{\theta(\lambda+2\eta j,\tau)}{\theta(\lambda-2\eta j,\tau)}\; v(\lambda)$,
(iii) v is divisible by $\prod_{j=0}^{m}\theta(\lambda - 2\eta j,\tau)$ in the ring of holomorphic functions.

Alternatively (and more simply), $E_{\kappa,2m,\eta}(\tau)$ is the space of functions of the form $\prod_{j=0}^{m}\theta(\lambda-2\eta j,\tau)\,\varphi(\lambda)$, with $\varphi \in K_{\kappa-2m-2}(\tau)^W$. In particular, $E_{\kappa,2m,\eta}(\tau)$ has the same dimension as the space $E_{\kappa,2m}(\tau)$ appearing in the differential case. Let $E_{\kappa,2m,\eta} = \cup_{\tau\in H_+} E_{\kappa,2m,\eta}(\tau)$. It is naturally a holomorphic vector bundle over $H_+$.

Theorem 5.2. Let m, κ ∈ $\mathbb{Z}_{\ge 0}$, κ ≥ 2m + 2 and suppose that Im η < 0, Im τ > 0. Then $T_{\kappa,m}(\tau)$ maps $E_{\kappa,2m,\eta}(\tau - 2\eta\kappa)$ to $E_{\kappa,2m,\eta}(\tau)$.

Proof. This theorem is a corollary of the results of [FV4]. We give here the proof in the simplest case m = 1. The proof of the general case is similar. Let $v \in E_{\kappa,2,\eta}(\tau - 2\eta\kappa)$, and set $\tilde v = T_{\kappa,1}(\tau)v$. Properties (i), (ii) for $\tilde v$ can be checked straightforwardly, by using the identities

\[
\theta(\lambda+2,\tau) = \theta(\lambda,\tau), \qquad \theta(\lambda+2\tau,\tau) = e^{-4\pi i(\lambda+\tau)}\,\theta(\lambda,\tau), \qquad \theta(-\lambda,\tau) = -\theta(\lambda,\tau),
\]

obeyed by θ and translating the integration variable in the integral over µ. The latter involves moving the integration contour, which presents no problem as the vanishing condition (iii) for v guarantees that the integrand has no poles.

Let us check that $\tilde v$ is holomorphic and obeys (iii). As the zeros of θ(λ, τ) are simple and on the lattice Z + τZ, $\tilde v$ is regular except possibly for simple poles at −2η + Z + τZ. We claim that $\tilde v$ vanishes at λ = r + sτ and at λ = 2η + r + sτ for all r, s ∈ Z. Then (ii) implies that $\tilde v$ is regular at the points −2η + Z + τZ (and thus everywhere), and that $\tilde v$ is divisible by θ(λ, τ)θ(λ − 2η, τ). Since $\tilde v$ obeys (i), it is sufficient to prove the claim for r, s ∈ {0, 1}. It follows from (ii) that $\tilde v(0) = 0$ and, in conjunction with (i), also $\tilde v(r + s\tau) = 0$, r, s ∈ {0, ±1}. For example, we have

\[
\tilde v(-\tau) = \frac{\theta(\tau+2\eta,\tau)}{\theta(\tau-2\eta,\tau)}\,\tilde v(\tau) = \frac{e^{-8\pi i\eta}\,\theta(-\tau+2\eta,\tau)}{\theta(\tau-2\eta,\tau)}\,\tilde v(\tau) = -e^{-8\pi i\eta}\,\tilde v(\tau).
\]

On the other hand, (i) implies $\tilde v(-\tau) = e^{-8\pi i\eta}\,\tilde v(\tau)$, so $\tilde v(\tau) = 0$.


Let us check that v(2η) ˜ vanishes. By using the functional equation (6) for 42η , we obtain V (2η, µ, τ, σ )

θ(t + 2η, τ )θ (µ + t, σ ) 42η (t, τ, σ ) dt θ(t − 2η, τ )θ (4η, τ )θ (t − 2η, σ )θ(µ + 2η, σ ) θ(µ + t, σ ) = c e−πiµ−4πiη 42η (t + σ, τ, σ ) dt θ(4η, τ )θ(t − 2η, σ )θ (µ + 2η, σ ) θ(µ + t − σ, σ ) = c e−πiµ−4πiη 42η (t, τ, σ ) dt θ(4η, τ )θ (t − 2η − σ, σ )θ (µ + 2η, σ ) θ(µ + t, σ ) = c eπiµ 42η (t, τ, σ ) dt θ(4η, τ )θ(t − 2η, σ )θ (µ + 2η, σ ) θ (0, τ ) = resλ=−2η V (λ, µ, τ, σ ). θ (4η, τ ) = c e−πiµ

In this calculation the change of variable t → t − σ was used. For this our choice of t-integration contour is essential, since it implies that one does not encounter poles when one deformes it back to the original position. For general m this identity is Part III of Theorem 26 in [FV4]. Thus v(2η) ˜ =

θ (0, τ ) ˜ resλ=−2η v(λ). θ(4η, τ )

But it follows from (ii) that v(2η) ˜ =−

θ (0, τ ) resλ=−2η v(λ), ˜ θ(4η, τ )

so v(2η) ˜ = 0. The same argument may be applied to 2η + r + sτ with r, s ∈ {0, ±1} (or even for general r, s). We have V (2η + r + sτ, τ, σ ) =

θ (0, τ ) 2πisσ e resλ=−2η+r+sτ V (λ, µ, τ, σ ). θ(4η, τ )

This implies that v(2η ˜ + r + sτ ) = e−4πiηκs

θ (0, τ ) ˜ resλ=−2η+r+sτ v(λ). θ(4η, τ )

On the other hand, using (ii) and (i) we obtain the same equation but with the opposite sign, so that both sides vanish. Thus v(2η ˜ + r + sτ ) = 0 and v˜ is regular at the potential singularities λ = −2η + r + sτ ,r, s ∈ Z. ! " A more direct reformulation of this theorem is the following. Corollary 5.3. Let m, κ ∈ Z≥0 , κ ≥ 2m + 2 and suppose that Im η < 0, Im τ > 0. Let, for t ∈ Cm , ωm (t, λ, τ ) =

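\[
\prod_{1\le i<j\le m} \frac{\theta(t_i-t_j,\tau)}{\theta(t_i-t_j+2\eta,\tau)}\; \prod_{j=1}^{m} \frac{\theta(\lambda+t_j,\tau)}{\theta(t_j-2\eta m,\tau)}.
\]

Introduce the integral kernel

\[
M(\lambda,\mu,\tau,p) = \frac{e^{-\frac{\pi i}{4\eta}(\lambda+\mu)^2}\, u_0(\lambda,\mu,\tau,p)\,\theta(\mu,p)}{\prod_{j=-m}^{m}\theta(\lambda-2\eta j,\tau)},
\]

where

\[
u_0(\lambda,\mu,\tau,p) = \int \prod_{i=1}^{m}\Omega_{2\eta m}(t_i,\tau,p) \prod_{1\le i<j\le m}\Omega_{-2\eta}(t_i-t_j,\tau,p)\; \omega_m(t,\lambda,\tau)\,\omega_m(t,\mu,p)\; dt_1\cdots dt_m.
\]

The integration is over a torus as in 2.2. Then the integral operator

\[
M(\tau)\varphi(\lambda) = \int_{2\eta\mathbb{R}} M(\lambda,\mu,\tau,\tau-2\eta\kappa)\,\varphi(-\mu)\,d\mu
\]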

maps Kκ−2m−2 (τ − 2ηκ)W to Kκ−2m−2 (τ )W . Remark. A section v of Eκ,2m,η is called projectively horizontal if it obeys the qKZB equation Tκ,m (τ )v(τ − 2ηκ) = C(τ ) v(τ ) up to a scalar factor C(τ ). For m = 0 projectively horizontal sections are given by odd theta functions as in the differential case, see (14). In a sequel [FV3] to this paper, we show that for m = 1 (and conjecturally for higher m as well), projectively horizontal sections are again given by elliptic hypergeometric integrals. Remark. The compatibility of the difference operator Tκ,m (τ ) with the SL(2, Z) action can be better understood in terms of a discrete connection on a space with an SL(3, Z)action. This will be discussed in [FV6] and in a paper in preparation. Acknowledgement. We thank R. Ferretti for explanations on Gauss sums.

References

[B] Bernard, D.: On the Wess–Zumino–Witten model on the torus. Nucl. Phys. B 303, 77–93 (1988); On the Wess–Zumino–Witten model on Riemann surfaces. Nucl. Phys. B 309, 145–174 (1988)
[F] Felder, G.: Conformal field theory and integrable systems associated to elliptic curves. In: Proceedings of the International Congress of Mathematicians, Zürich 1994, Basel–Boston: Birkhäuser, 1994, pp. 1247–1255; Elliptic quantum groups. In: Proceedings of the International Congress of Mathematical Physics, Paris 1994, Cambridge, MA: International Press 1995, pp. 211–218
[FV1] Felder, G. and Varchenko, A.: Integral representation of solutions of the elliptic Knizhnik–Zamolodchikov–Bernard equation. Int. Math. Res. Notices 5, 221–233 (1995)
[FV2] Felder, G. and Varchenko, A.: On representations of the elliptic quantum group $E_{\tau,\eta}(sl_2)$. Commun. Math. Phys. 181, 741–761 (1996)
[FV3] Felder, G. and Varchenko, A.: q-deformed KZB equation: completeness and modular properties. Preprint 2001
[FV4] Felder, G. and Varchenko, A.: Resonance relations for solutions of the elliptic QKZB equations, fusion rules, and eigenvectors of transfer matrices of restricted interaction-round-a-face models. Commun. Contemp. Math. 1, no. 3, 335–403 (1999)
[FV5] Felder, G. and Varchenko, A.: The elliptic gamma function and $SL(3,\mathbb{Z}) \ltimes \mathbb{Z}^3$. math.QA/9907061, Adv. Math. 156, 44–76 (2000)
[FV6] Felder, G. and Varchenko, A.: Special functions, conformal blocks, Bethe ansatz, and SL(3, Z). math.QA/0101136, to appear in Phil. Trans.
[FTV1] Felder, G., Tarasov, V. and Varchenko, A.: Solutions of the elliptic qKZB equations and Bethe ansatz I. Am. Math. Soc. Transl. 180, 45–75 (1997)
[FTV2] Felder, G., Tarasov, V. and Varchenko, A.: Monodromy of solutions of the elliptic Knizhnik–Zamolodchikov–Bernard difference equations. q-alg/9705017, Int. J. Math. 10, 943–975 (1999)
[FW] Felder, G. and Wieczerkowski, C.: Conformal blocks on elliptic curves and the Knizhnik–Zamolodchikov–Bernard equation. Commun. Math. Phys. 176, 133–162 (1996)
[FR] Frenkel, I. and Reshetikhin, N.: Quantum affine algebras and holonomic difference equations. Commun. Math. Phys. 146, 1–60 (1992)
[LP] Lukyanov, S. and Pugai, Ya.: Multi-point Local Height Probabilities in the Integrable RSOS Model. Nucl. Phys. B 473, 631–658 (1996)
[MV] Mukhin, E. and Varchenko, A.: Solutions of the qKZB equation in tensor products of finite dimensional modules over the elliptic quantum group $E_{\tau,\eta}sl_2$. Fields Institute Communications 24, 385–396 (1999)
[T] Takebe, T.: A system of difference equations with elliptic coefficients and Bethe vectors. Commun. Math. Phys. 183, 161–182 (1997)
[TUY] Tsuchiya, A., Ueno, K. and Yamada, Y.: Conformal field theory on universal family of stable curves with gauge symmetries. Adv. Stud. Pure Math. 19, 459–566 (1989)

Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 221, 573 – 590 (2001)




Smoothing Property for Schrödinger Equations with Potential Superquadratic at Infinity

Kenji Yajima, Guoping Zhang
Department of Mathematical Sciences, University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo 153-8914, Japan
Received: 10 October 2000 / Accepted: 29 March 2001

Dedicated to Jean-Michel Combes on the occasion of his Sixtieth Birthday

Abstract: We prove a smoothing property for one dimensional time dependent Schrödinger equations with potentials which satisfy $V(x) \sim C|x|^k$ at infinity, k > 2. As an application, we show that the initial value problem for certain nonlinear Schrödinger equations with such potentials is $L^2$ well-posed. We also prove a sharp asymptotic estimate of the $L^p$-norm of the normalized eigenfunctions of $H = -\Delta + V$ for large energy.

1. Introduction

We consider the initial value problem for a Schrödinger equation on the line R:

\[
\begin{cases}
i\,\dfrac{\partial u}{\partial t} = (D^2 + V(x))u, & x \in \mathbb{R}^1,\ t \in \mathbb{R},\\[4pt]
u(0,x) = u_0(x), & x \in \mathbb{R}^1,
\end{cases} \tag{1.1}
\]

where D = −i∂/∂x. We assume that V(x) satisfies the following assumption. Let $V^{(j)}(x)$ be the jth derivative of V(x) and $\langle A\rangle = (1 + |A|^2)^{1/2}$ for a self-adjoint operator A.

Assumption 1.1. The potential V(x) is real valued and of $C^3$-class. There exists a constant R > 0 such that the following conditions are satisfied for |x| ≥ R:
(1) V(x) is convex.
(2) For j = 1, 2, 3, $|V^{(j)}(x)| \le C_j \langle x\rangle^{-1} |V^{(j-1)}(x)|$ for some constants $C_j$.
(3) For k > 2, $D_1\langle x\rangle^k \le V(x) \le D_2\langle x\rangle^k$, where $0 < D_1 \le D_2 < \infty$.

Partly supported by the Grant-in-Aid for Scientific Research, The Ministry of Education, Science, Sports and Culture, Japan Grant Nr. 11304006.
Partly supported by the TonenGeneral International Scholarship Foundation.


We say that V is superquadratic (at infinity) if it satisfies (1), (2) and (3). Under Assumption 1.1 the operator $D^2 + V(x)$ defined on $C_0^\infty(\mathbb{R})$ is essentially self-adjoint in $L^2(\mathbb{R})$ and we denote its closure by H. Thus, H is self-adjoint with the domain $D(H) = \{u \in L^2(\mathbb{R}) : D^2u + Vu \in L^2(\mathbb{R})\}$ and the solution in $L^2(\mathbb{R})$ of (1.1) is given by $u(t,x) = e^{-itH}u_0(x)$ in terms of the exponential function of H. In this paper, we prove a smoothing property for Eq. (1.1). We then apply it to prove that the initial value problem for nonlinear Schrödinger equations with superquadratic potentials is time locally $L^2$ well-posed, if the nonlinearities are sufficiently mild and spatially localized. We define θ(k, p) as follows, for 2 ≤ p ≤ ∞ and 2 < k < ∞:

\[
\theta(k,p) =
\begin{cases}
\dfrac{1}{k}\left(\dfrac{1}{2} - \dfrac{1}{p}\right), & \text{if } 2 \le p < 4;\\[8pt]
\left(\dfrac{1}{4k}\right)_{-}, & \text{if } p = 4;\\[8pt]
\dfrac{1}{4} - \dfrac{1}{3}\left(1 - \dfrac{1}{p}\right)\left(1 - \dfrac{1}{k}\right), & \text{if } 4 < p \le \infty,
\end{cases}
\]

where $a_-$ denotes any number < a. We write $B^s(\mathbb{R})$ for the Besov space $B^s_{2,1}(\mathbb{R})$.
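To make the three branches of θ(k, p) concrete, the following small sketch tabulates the exponent. It is only an illustration: the function name `smoothing_exponent` is hypothetical, and at p = 4 it returns the supremum 1/(4k) together with a flag recording that this value itself is excluded (the definition allows any number strictly below it).

```python
import math

def smoothing_exponent(k, p):
    """Return (theta, attained) for theta(k, p), 2 <= p <= inf, k > 2.
    At p = 4 the value 1/(4k) is a supremum that is not attained."""
    if k <= 2:
        raise ValueError("superquadratic case requires k > 2")
    if p < 2:
        raise ValueError("theta(k, p) is defined for p >= 2")
    if p < 4:
        return (0.5 - 1.0 / p) / k, True
    if p == 4:
        return 1.0 / (4 * k), False
    inv_p = 0.0 if math.isinf(p) else 1.0 / p
    return 0.25 - (1.0 - inv_p) * (1.0 - 1.0 / k) / 3.0, True

for p in (2, 3, 4, 6, math.inf):
    print(p, smoothing_exponent(4, p))
```

The two outer branches both tend to 1/(4k) as p → 4, and θ(k, 2) = 0, which matches the fact that the eigenfunctions are normalized in $L^2$.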

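Theorem 1.2. Let V satisfy Assumption 1.1. Let 2 ≤ p ≤ ∞ and let α, β ∈ R be such that α + β ≤ θ(k, p). Then, there exists a constant C > 0 such that

\[
\big\| g(t)\langle i\partial/\partial t\rangle^{\alpha}\langle H\rangle^{\beta} e^{-itH}u_0(x) \big\|_{L^p(\mathbb{R}_x, L^2(\mathbb{R}_t))} \le C\,\|g\|_{B^{\frac14+\frac{1}{2k}}(\mathbb{R})}\,\|u_0\|_{L^2(\mathbb{R}_x)}, \tag{1.2}
\]

for any $g \in B^{\frac14+\frac{1}{2k}}(\mathbb{R})$ and $u_0 \in L^2(\mathbb{R})$.

The next theorem shows that the order θ(k, p) of Theorem 1.2 may be replaced by $\frac{1}{2k}$ for all 2 ≤ p ≤ ∞ if the spatial variable x is restricted to a compact interval of R.

Theorem 1.3. Let V satisfy Assumption 1.1. Let K ⊂ R be compact and let α, β ∈ R be such that $\alpha + \beta \le \frac{1}{2k}$. Then, there exists a constant C > 0 such that

\[
\sup_{x\in K} \big\| g(t)\langle i\partial/\partial t\rangle^{\alpha}\langle H\rangle^{\beta} e^{-itH}u_0(x) \big\|_{L^2(\mathbb{R}_t)} \le C\,\|g\|_{B^{\frac14+\frac{1}{2k}}(\mathbb{R}_t)}\,\|u_0\|_{L^2(\mathbb{R}_x)} \tag{1.3}
\]

for any $g \in B^{\frac14+\frac{1}{2k}}(\mathbb{R})$ and $u_0 \in L^2(\mathbb{R})$.

Note that $\langle i\partial/\partial t\rangle^{\alpha}\langle H\rangle^{\beta} e^{-itH} = \langle i\partial/\partial t\rangle^{\alpha+\beta} e^{-itH} = \langle H\rangle^{\alpha+\beta} e^{-itH}$ in (1.2) and (1.3). The following corollary is readily obtained from Theorem 1.2 and Theorem 1.3 with the help of elliptic estimates and interpolation theory.

Corollary 1.4. Suppose V satisfies Assumption 1.1. Let 2 ≤ p < ∞ and K ⊂ R be a compact interval. Then, there exists a constant C > 0 such that the following estimates are satisfied:

\[
\big\|\langle D_x\rangle^{2\theta(k,p)} e^{-itH}u_0\big\|_{L^p(\mathbb{R}_x, L^2((-T,T)_t))} + \big\|\langle x\rangle^{k\theta(k,p)} e^{-itH}u_0\big\|_{L^p(\mathbb{R}_x, L^2((-T,T)_t))} \le C\,\|u_0\|_{L^2(\mathbb{R}_x)}, \tag{1.4}
\]

\[
\big\|\langle D_x\rangle^{1/k} e^{-itH}u_0\big\|_{L^2((-T,T)_t, L^2(K_x))} \le C\,\|u_0\|_{L^2(\mathbb{R})}. \tag{1.5}
\]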


One consequence of (1.5) is that $e^{-itH}u_0(\cdot) \in H^{1/k}_{\mathrm{loc}}(\mathbb{R})$, a.e. t, for $u_0 \in L^2(\mathbb{R})$, and the solution u(t, x) of (1.1) is smoother than the initial function $u_0$ by the order 1/k for almost all t. This is a manifestation of the smoothing property of Eq. (1.1). We remark that we may obtain a series of estimates of the form $\|g(t)e^{-itH}u_0(x)\|_{L^p(\mathbb{R}_x, L^q(\mathbb{R}_t))} \le C\|u_0\|_2$ from (1.2) with the help of the Sobolev embedding theorem and elliptic estimates. In this case we always need q < p.

Since Kato's remarkable discovery ([K1] and [K2]), the smoothing property of linear and nonlinear dispersive equations has been intensively studied by many authors in conjunction with applications mainly to the convergence problem and to the initial value problem for nonlinear equations. There is a large number of references, e.g. [St, P, Br, GV1, Y1, V, CS, KY, Sj, KPV, BAD, GV2, BT, HK, Su, H]. Most of these papers are concerned with equations with coefficients which are either constant or asymptotically constant at spatial infinity. For Schrödinger equations, the smoothing property has been extended to the case when potentials increase at most quadratically at infinity ([K3, Y2]), viz. $|D^\beta V(x)| \le C_\beta$ for |β| ≥ 2, and the following estimates:

\[
\big\|e^{-itH}u_0\big\|_{L^\theta((-T,T)_t, L^p(\mathbb{R}^n_x))} \le C\,\|u_0\|_{L^2(\mathbb{R}^n_x)}, \tag{1.6}
\]

\[
\big\|\varrho(x)(1-\Delta)^{\alpha/2} e^{-itH}u(x)\big\|_{L^\theta((-T,T)_t, L^p(\mathbb{R}^n_x))} \le C\,\|u\|_{L^2(\mathbb{R}^n_x)} \tag{1.7}
\]

have long been known ([Y2]) (see also [Y3] for Schrödinger equations with magnetic potentials which increase at most linearly at infinity). Here T > 0 is any finite number, 2 1 1 − 2. Estimates of the type (1.7) are called the differentiability improving property for obvious reasons. Note that (1.7) with p = θ = 2 and α = 1/2 is equivalent to (1.5) with k = 2.

When potentials are superquadratic at infinity, however, no estimates of this kind can be found in the literature to the best of the authors' knowledge. This situation may be related to the fact that the smoothness and boundedness properties of the distribution kernel E(t, x, y) of $e^{-itH}$, the fundamental solution or FDS for short, have a sharp transition when the growth rate at infinity of the potential passes that of $C|x|^2$ ([Y4]): E(t, x, y) is smooth and spatially bounded for all t ≠ 0 if $V(x) = o(|x|^2)$. If $V(x) = O(\langle x\rangle^2)$ these results hold for small |t| > 0. However, if $V(x) \ge C\langle x\rangle^{2+\varepsilon}$, ε > 0, E(t, x, y) is nowhere $C^1$ and can be unbounded at spatial infinity ([MY]). Recall that (1.6) is a consequence of the bound $|E(t,x,y)| \le C|t|^{-n/2}$ for small |t|, and (1.7) of the fact that $\int_{-T}^{T}\varrho^2(x(t))\,dt$ is a pseudo-differential operator of order −1, where $x(t) = e^{itH}xe^{-itH}$ is the Heisenberg position operator. These two properties hold for potentials with $|V(x)| \le C|x|^2$ but not for superquadratic potentials.

One of our motivations for this work was to examine whether or not this transition is inherited by the smoothing property of Eq. (1.1). Recall that E(t, x, y) under Assumption 1.1 satisfies, for arbitrary $\rho \in C_0^\infty(\mathbb{R}^3)$,

\[
|\widehat{\rho E}(\tau,\xi,\eta)| \le C\,(|\tau| + |\xi|^2 + |\eta|^2)^{-1/k}, \tag{1.8}
\]

where $\hat{\ }$ stands for the Fourier transform ([Y4], Remark 1.2). We should remember here a celebrated theorem of Zygmund ([Z], see also [B]) that $\|e^{-itH}u_0\|_{L^4(\mathbb{T}\times\mathbb{T})} \le C\|u_0\|_{L^2(\mathbb{T})}$ for $H = D^2$ on the torus $\mathbb{T} = \mathbb{R}/2\pi\mathbb{Z}$. Notice that the FDS for this H, $E_0(t,x,y) = \sum_{n=-\infty}^{\infty} e^{-in^2t + in(x-y)}$, is nowhere locally integrable with respect to (t, x, y), and functions which satisfy (1.8) are smoother than $E_0$. This indicates, therefore, that Schrödinger equations with superquadratic potentials satisfy a certain smoothing property. Our result shows that this is indeed the case and, moreover, such a transition as in the smoothness of E(t, x, y) does not appear in the smoothing property.

Note that estimates (1.2) and (1.3) differ from (1.6) or (1.7) by the change of the order of integrations by x and t, in particular. Nonetheless, we continue to refer to such estimates as (1.2) and (1.3) as the smoothing property. We mention that the estimate of the form (1.2) appears already in [K1] in a slightly disguised form: for $M \in L^{n+\varepsilon}(\mathbb{R}^n) \cap L^{n-\varepsilon}(\mathbb{R}^n)$, ε > 0,

\[
\big\|M e^{it\Delta}u_0\big\|_{L^2(\mathbb{R}^{n+1}_{t,x})} \le C\big(\|M\|_{L^{n-\varepsilon}(\mathbb{R}^n)} + \|M\|_{L^{n+\varepsilon}(\mathbb{R}^n)}\big)\|u_0\|_{L^2(\mathbb{R}^n_x)},
\]

(see also [KY] where the right side is replaced by $C\|M\|_{L^n(\mathbb{R}^n)}\|u_0\|_{L^2(\mathbb{R}^n_x)}$) and that [KPV] elaborated and applied it to nonlinear Schrödinger equations. We also remark that there is a micro-local version of (1.7) and the following is known: when H is a Schrödinger operator on certain Riemannian manifolds, (1.7) holds with α = 1/2 for $u_0 \in L^2$ supported by U if all bicharacteristics starting from U are non-trapping for all t < 0 ([CKS]) and it does not hold if they are trapping ([D1, D2]).

It is well-known that the operator H is bounded from below and its spectrum consists of simple eigenvalues $\lambda_1 < \lambda_2 < \cdots \to \infty$. We denote the corresponding normalized eigenfunctions by $\psi_1, \psi_2, \dots$. The proof of Theorem 1.2 and Theorem 1.3 heavily depends upon the following theorem on the asymptotic behavior as $\lambda_n \to \infty$ of the $L^p$ norm of $\psi_n$, which we think is of interest in its own right. For the quantities A and B, we write A ∼ B if there exist two positive constants $c_1$ and $c_2$ such that $c_1A \le B \le c_2A$.

Theorem 1.5. Let Assumption 1.1 be satisfied. Let ψ(x, E) be the normalized eigenfunction of $H = -\Delta + V(x)$ with the eigenvalue E. Then:
(1) For 1 ≤ p ≤ ∞, we have

\[
\|\psi(x,E)\|_{L^p} \sim
\begin{cases}
C_p\,E^{-\theta(k,p)}, & \text{if } p \ne 4;\\[4pt]
C\,E^{-\frac{1}{4k}}(\log E)^{\frac14}, & \text{if } p = 4,
\end{cases} \tag{1.9}
\]

for large E, where $C_p$ can be taken independent of p, p ∉ (4 − ε, 4 + ε), ε > 0.
(2) For a compact interval K ⊂ R, $\sup_{x\in K}|\psi(x,E)| \sim E^{-\frac{1}{2k}}$ for large E.

Remark 1.1. If we set $u_0(x) = \psi_n(x)$ in (1.2), we have

\[
\big\|g(t)\langle i\partial/\partial t\rangle^{\alpha}\langle H\rangle^{\beta} e^{-itH}u_0(x)\big\|_{L^p(\mathbb{R}_x, L^2(\mathbb{R}_t))} = \|g\|_{L^2}\,\langle\lambda_n\rangle^{\theta(k,p)}\,\|\psi_n\|_{L^p(\mathbb{R})}.
\]

Hence, Theorem 1.5 (1) implies that the condition α + β ≤ θ(k, p) in (1.2) cannot be relaxed. Likewise Theorem 1.5 (2) implies that the exponent 1/2k of Theorem 1.3 is sharp. The exponents θ(k, p) and 1/2k are decreasing functions of k and this matches the fact that the FDS is more singular for larger k ([Y4]). With respect to p on the other hand, θ(k, p) is increasing for 2 ≤ p < 4 and decreasing for 4 < p. The proof of Theorem 1.5 will show that the p dependence of θ(k, p) is related to the behavior of $\psi_n(x)$ near the turning point S.


As an application of Theorem 1.2 and Theorem 1.3, we show that the initial value problem for nonlinear Schrödinger equations with superquadratic potentials and with spatially localized mild nonlinearities
$$\begin{cases} i\,\dfrac{\partial u}{\partial t} = -\Delta u + V(x)u + f(x,u), & x \in \mathbb{R},\ t \in \mathbb{R}, \\ u(0,x) = u_0(x), & x \in \mathbb{R} \end{cases} \tag{1.10}$$

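is $L^2$ well-posed. Define, for $r \ge 1$ and $\delta > 0$,
$$X = L^4(\mathbb{R}_x; L^{2r}_{loc}(\mathbb{R}_t)) \cap C(\mathbb{R}_t, L^2(\mathbb{R}_x)), \qquad X_\delta = L^4(\mathbb{R}_x; L^{2r}((-\delta,\delta)_t)) \cap C((-\delta,\delta)_t, L^2(\mathbb{R}_x));$$
$$Y = L^{2r}_{loc}(\mathbb{R}_t \times \mathbb{R}_x) \cap C(\mathbb{R}_t, L^2(\mathbb{R}_x)), \qquad Y_\delta = L^{2r}((-\delta,\delta)_t, L^{2r}_{loc}(\mathbb{R}_x)) \cap C((-\delta,\delta)_t, L^2(\mathbb{R}_x)).$$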

Theorem 1.6. Let $V$ satisfy Assumption 1.1. Let $1 \le r < \frac{2k}{2k-1}$ and let $\phi(x) \in L^{\frac{4}{2-r}}(\mathbb{R})$. Suppose that $f(x,u)$ satisfies
$$|f(x,u)| \le C|\phi(x)||u|^r, \qquad x \in \mathbb{R},\ u \in \mathbb{C}, \tag{1.11}$$
$$|f(x,u) - f(x,v)| \le C|\phi(x)||u - v|\,(|u|^{r-1} + |v|^{r-1}), \qquad x \in \mathbb{R},\ u, v \in \mathbb{C}. \tag{1.12}$$

Then, the problem (1.10) is locally well-posed in $X$ for any $u_0 \in L^2(\mathbb{R})$, viz. there exists $\delta > 0$ such that (1.10) admits a unique solution $u(t,x)$ in $X_\delta$ and $L^2(\mathbb{R}) \ni u_0 \mapsto u \in X_\delta$ is continuous. If $f$ further satisfies
$$f(x,u)\bar{u} \ \text{is real for } x \in \mathbb{R},\ u \in \mathbb{C}, \tag{1.13}$$
then (1.10) is globally well-posed in $X$, viz. the solution $u(t,x)$ uniquely extends to the whole real line $\mathbb{R}$ and $L^2(\mathbb{R}) \ni u_0 \mapsto u \in X_T$ is continuous for all $T > 0$.

Theorem 1.7. Let $V$ satisfy Assumption 1.1. Let $1 \le r \le \frac{k}{k-1}$ and let $K \subset \mathbb{R}$ be a compact interval. Suppose $f$ satisfies $f(x,u) = 0$ for $x \notin K$ and
$$|f(x,u)| \le C|u|^r, \qquad x \in K,\ u \in \mathbb{C}, \tag{1.14}$$
$$|f(x,u) - f(x,v)| \le C|u - v|\,(|u|^{r-1} + |v|^{r-1}), \qquad x \in K,\ u, v \in \mathbb{C}. \tag{1.15}$$

Then, (1.10) is locally well-posed in $Y$ for any $u_0 \in L^2(\mathbb{R})$. If $f$ further satisfies (1.13), then (1.10) is globally well-posed in $Y$.

We outline here the plan of the paper, briefly explaining how Theorem 1.2 may be derived from Theorem 1.5. In Sect. 2, we prove Theorem 1.5 by applying Langer's turning point theory as presented in Titchmarsh's monograph [T1]. Theorem 1.2 will be proved in Sect. 3. We expand $u(t,x) = e^{-itH}u_0$ in terms of the eigenfunctions $\psi_1, \psi_2, \ldots$ of $H$ in the form $u(t,x) = \sum_{n=1}^{\infty} \hat{u}_0(n) e^{-it\lambda_n}\psi_n(x)$, where $\hat{u}_0(n) = (u_0, \psi_n)$ is the $n$th generalized Fourier coefficient. Then, the Plancherel formula implies
$$\int_{\mathbb{R}} |g(t)u(t,x)|^2\,dt = \int_{\mathbb{R}} \Big|\sum_{n=1}^{\infty} \hat{u}_0(n)\psi_n(x)\hat{g}(\lambda - \lambda_n)\Big|^2 d\lambda.$$


K. Yajima, G. Zhang

If $\hat{g}$ is supported by a sufficiently small interval, then, for any fixed $\lambda \in \mathbb{R}$, there is at most one eigenvalue such that $\hat{g}(\lambda - \lambda_n) \neq 0$, because $\lambda_{n+1} - \lambda_n \to \infty$ as $n \to \infty$. Hence the right-hand side becomes $\sum_{n=1}^{\infty} |\hat{u}_0(n)|^2 |\psi_n(x)|^2 \|g\|_{L^2}^2$, and Minkowski's inequality implies
$$\|g(t)e^{-itH}u_0(x)\|_{L^p(\mathbb{R}_x, L^2(\mathbb{R}_t))} \le \|g\|_{L^2}\Big(\sum_{n=1}^{\infty} |\hat{u}_0(n)|^2 \|\psi_n(x)\|_{L^p}^2\Big)^{1/2}.$$
The right-hand side is bounded by $C\big(\sum_{n=1}^{\infty} |\hat{u}_0(n)|^2 \lambda_n^{-2\theta(k,p)}\big)^{1/2} = C\|H^{-\theta(k,p)}u_0\|$ by virtue of Theorem 1.5, and (1.2) follows for such $g$. For general $g$ we use the standard "cutting and pasting" by the dyadic decomposition of unity. Theorem 1.3 is also proved in Sect. 3 using a similar idea. We prove Theorem 1.6 and Theorem 1.7 in Sect. 4 via the standard contraction mapping theorem by applying Theorem 1.2 and Theorem 1.3, respectively.

2. $L^p$ Estimate of Eigenfunctions

In this section we prove Theorem 1.5. We denote by $\psi(x,E)$ the eigenfunction of
$$-\psi''(x) + V(x)\psi(x) = E\psi(x) \tag{2.1}$$
such that $\|\psi(\cdot,E)\|_{L^2(\mathbb{R})} = 1$. We use the following estimates (2.3) and (2.4) due to Titchmarsh ([T1, T2]). For large $E > 0$, we write $X$ for the positive root of $V(X) = E$. We have $V(x) > E$ for $x > X$ and $V(x) < E$ for $0 \le x < X$. We set
$$\zeta(x) = \int_X^x \sqrt{E - V(t)}\,dt, \tag{2.2}$$
where the branch of the square root is chosen in such a way that $\arg\zeta(x) = \pi/2$ for $x > X$, and $\arg\zeta(x) = -\pi$ for $x < X$.

Lemma 2.1. Let the notation be as above. Then, there exists a constant $C_E^+$ such that
$$\psi(x,E) = (C_E^+)^{-1}[E - V(x)]^{-\frac{1}{4}}\left\{(\pi\zeta/2)^{\frac{1}{2}} H^{(1)}_{1/3}(\zeta) + O\!\left(\frac{E^{-\frac{1}{2}}X^{-1}e^{-\mathrm{Im}\,\zeta}|\zeta|^{1/6}}{1 + |\zeta|^{1/6}}\right)\right\} \tag{2.3}$$
as $E \to \infty$ uniformly with respect to $x > 0$. We have the estimate
$$C_E^+ \sim (XE^{-\frac{1}{2}})^{\frac{1}{2}}. \tag{2.4}$$
A similar statement holds for $x < 0$.

Outline of the proof. For the readers' convenience, we outline the proof here. It is based upon Langer's turning point theory as presented in Chapter 22.27 of [T2]. We make a change of independent variable $x \to \zeta(x)$ and dependent variable $\psi \to G$ in (2.1), where
$$\psi(x) = [E - V(x)]^{-\frac{1}{4}} G(\zeta). \tag{2.5}$$

579

We sometimes write $G(x)$ for $G(\zeta(x))$. Then $G(\zeta)$ satisfies
$$\frac{d^2G}{d\zeta^2} + \left(1 + \frac{5}{36\zeta^2}\right)G = f(x)G, \tag{2.6}$$
where $f(x)$ is defined by
$$f(x) = \frac{5}{36\zeta^2} - \frac{V''(x)}{4(E - V(x))^2} - \frac{5V'(x)^2}{16(E - V(x))^3}.$$
We then transform Eq. (2.6) into the integral equation of the form
$$G(x) = \left(\frac{\pi\zeta}{2}\right)^{\frac{1}{2}} H^{(1)}_{1/3}(\zeta) + \frac{\pi i}{2}\int_x^{\infty}\left\{H^{(1)}_{1/3}(\zeta)J_{1/3}(\theta) - J_{1/3}(\zeta)H^{(1)}_{1/3}(\theta)\right\}\zeta^{1/2}\theta^{1/2} f(t)(E - V(t))^{1/2}G(t)\,dt. \tag{2.7}$$
Here $J_\nu(\zeta)$ and $H^{(j)}_\nu(\zeta)$ are the Bessel and Hankel functions, respectively, and we wrote $\zeta = \zeta(x)$ and $\theta = \zeta(t)$. The functions $(\frac{\pi\zeta}{2})^{\frac{1}{2}} J_{1/3}(\zeta)$ and $(\frac{\pi\zeta}{2})^{\frac{1}{2}} H^{(1)}_{1/3}(\zeta)$ are linearly independent solutions of the associated homogeneous equation
$$\frac{d^2G}{d\zeta^2} + \left(1 + \frac{5}{36\zeta^2}\right)G = 0,$$
and the inhomogeneous term is chosen in such a way that the solution of (2.7) decays as $x \to \infty$. The functions $\zeta^{\frac{1}{2}} H^{(1)}_{1/3}(\zeta)e^{\mathrm{Im}\,\zeta}$ and $\zeta^{\frac{1}{2}} J_{1/3}(\zeta)e^{-\mathrm{Im}\,\zeta}$ are bounded for $x \in (0,\infty)$, and $\mathrm{Im}(\zeta - \theta) > 0$ in the integrand of (2.7). It can be proven ([T2, Lemma 22.27]) that
$$\int_0^{\infty} |f(x)||E - V(x)|^{1/2}\,dx = O\!\left(\frac{1}{XE^{1/2}}\right), \quad E \to \infty,$$
$$\int_x^{\infty} |f(x)||E - V(x)|^{1/2}\,dx = O\!\left(\frac{1}{xV(x)^{1/2}}\right), \quad x \to \infty.$$
It follows that (2.7) can be uniquely solved by iteration in the function space $\mathcal{G} = \{G : e^{\mathrm{Im}\,\zeta(x)}G(x) \text{ is bounded and continuous}\}$ and the solution $G(x,E)$ satisfies, as $E \to \infty$,
$$G(x,E) = (\pi\zeta/2)^{\frac{1}{2}} H^{(1)}_{1/3}(\zeta) + O\!\left(E^{-\frac{1}{2}}X^{-1}e^{-\mathrm{Im}\,\zeta}|\zeta|^{1/6}/(1 + |\zeta|^{1/6})\right) \tag{2.8}$$
uniformly with respect to $x \in (0,\infty)$ and that, for fixed $E$, as $x \to \infty$,
$$G(x,E) = (\pi\zeta/2)^{\frac{1}{2}} H^{(1)}_{1/3}(\zeta)\left(1 + O(x^{-1}V(x)^{-1/2})\right).$$
Since the linear space of solutions of (2.1) which decay as $x \to \infty$ is one dimensional, we have $\psi(x,E) = (C_E^+)^{-1}[E - V(x)]^{-\frac{1}{4}}G(x,E)$ for a constant $C_E^+$. Titchmarsh ([T1, pp. 170–171]) shows $C_E^+ \sim (XE^{-\frac{1}{2}})^{\frac{1}{2}}$.


We write the right side of (2.3) in the form $(C_E^+)^{-1}\psi^+(x,E)$ and we let $(C_E^-)^{-1}\psi^-(x,E)$ be the corresponding expression for $x \in (-\infty, 0)$. It follows from Lemma 2.1 that
$$\|\psi(x,E)\|_{L^p(\mathbb{R})} \sim \|\psi(x,E)\|_{L^p(\mathbb{R}^+)} + \|\psi(x,E)\|_{L^p(\mathbb{R}^-)} \sim X^{-\frac{1}{2}}E^{\frac{1}{4}}\left(\|\psi^+(x,E)\|_{L^p(\mathbb{R}^+)} + \|\psi^-(x,E)\|_{L^p(\mathbb{R}^-)}\right). \tag{2.9}$$
We estimate the $L^p$-norm of $\psi^+(x,E)$. The estimate for $\psi^-(x,E)$ is similar. We define $q(y)$ and $Q(y)$ by
$$q(y) = \frac{V(yX)}{V(X)}, \qquad Q(y) = \begin{cases} -\displaystyle\int_y^1 \sqrt{1 - q(s)}\,ds, & \text{if } y < 1; \\ i\displaystyle\int_1^y \sqrt{q(s) - 1}\,ds, & \text{if } y > 1. \end{cases} \tag{2.10}$$
We have
$$\zeta(x) = E^{\frac{1}{2}} X Q(x/X).$$
Under the assumptions, we have $V(x) \sim xV'(x) \sim |x|^k$ for $|x| \ge R$.

Lemma 2.2. Let $V$ satisfy Assumption 1.1 and $K > 1$. Then there exists a constant $L$ such that the following estimates are satisfied uniformly with respect to $|X| \ge L$:
$$1 - q(y) \sim 1 - y \ \text{ for } 0 \le y \le 1, \qquad q(y) - 1 \sim y - 1 \ \text{ for } 1 \le y \le K, \qquad q(y) - 1 \sim y^k \ \text{ for } y \ge K, \tag{2.11}$$
and
$$Q(y) \sim -(1 - y)^{3/2} \ \text{ for } 0 \le y \le 1, \qquad -iQ(y) \sim (y - 1)^{3/2} \ \text{ for } 1 \le y \le K, \qquad -iQ(y) \sim y^{1 + k/2} \ \text{ for } y \ge K. \tag{2.12}$$
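The scaling identity $\zeta(x) = E^{1/2}XQ(x/X)$ and the first estimate of (2.12) can be checked numerically on a model case. The sketch below is only an illustration: it assumes the pure power potential $V(x) = |x|^k$ (so that $q(y) = |y|^k$ independently of $X$), and the helper names `Q_below`, `zeta_direct` and `ratio_to_power_law` are hypothetical, not taken from the paper.

```python
import math

def q(y, k):
    # q(y) = V(yX)/V(X); for the assumed model V(x) = |x|**k this is |y|**k.
    return abs(y) ** k

def Q_below(y, k, n=20000):
    # Q(y) = -∫_y^1 sqrt(1 - q(s)) ds for 0 <= y < 1, by the midpoint rule.
    h = (1.0 - y) / n
    return -h * sum(math.sqrt(1.0 - q(y + (j + 0.5) * h, k)) for j in range(n))

def zeta_direct(x, X, k, n=20000):
    # ζ(x) = ∫_X^x sqrt(E - V(t)) dt for 0 <= x < X, with E = V(X) = X**k.
    E = X ** k
    h = (X - x) / n
    return -h * sum(math.sqrt(E - (x + (j + 0.5) * h) ** k) for j in range(n))

def ratio_to_power_law(y, k):
    # Q(y) / (-(1 - y)**1.5); (2.12) asserts this stays between two
    # positive constants for 0 <= y < 1.
    return Q_below(y, k) / (-(1.0 - y) ** 1.5)

if __name__ == "__main__":
    X, k, x = 3.0, 4, 1.2
    print(zeta_direct(x, X, k), math.sqrt(X ** k) * X * Q_below(x / X, k))
    print([ratio_to_power_law(y, k) for y in (0.0, 0.5, 0.9, 0.99)])
```

For the quartic model the ratio stays in a fixed band over the whole interval, in line with the two-sided bound that the symbol $\sim$ encodes; for a general $V$ satisfying Assumption 1.1 the lemma additionally requires $|X| \ge L$.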

Proof. Take sufficiently large $L > 2R$, $R$ being the constant of Assumption 1.1. Then, we have for $1/2 \le y \le 1$, uniformly with respect to $|X| \ge L$,
$$1 - q(y) = \frac{V(X) - V(yX)}{V(X)} = (1 - y)\frac{XV'(\theta X)}{V(X)} \sim 1 - y, \qquad y \le \theta \le 1.$$
Let $0 \le y \le 1/2$ and $R \le yX$. We have $0 < V(yX) \le V(R) + y(V(X) - V(R)) \le yV(X)$ since $V(x)$ is convex for $|x| \ge R$, and $1 - q(y) \ge 1 - y$. If $yX \le R$, then $|V(yX)| \le \sup_{|x| \le R}|V(x)| \le 10^{-1}V(X)$ and $1 - q(y) \sim 1 - y$ is obvious for $|X| \ge L$ and large $L$. This proves the first estimate. Estimates for $q(y) - 1$, $y > 1$, may be obtained similarly. Estimates (2.12) for $Q(y)$ may be obtained by integrating (2.11).



Hereafter we take $E$ large enough that the corresponding $X$ satisfies the condition $|X| \ge L$ of Lemma 2.2. Writing $\psi^+(x,E)$ in the form
$$\psi^+(x,E) = E^{-\frac{1}{4}}[1 - q(x/X)]^{-\frac{1}{4}} G(E^{\frac{1}{2}}XQ(x/X), E)$$
and changing variable, we have
$$\int_0^{\infty} |\psi^+(x,E)|^p\,dx = XE^{-\frac{p}{4}}\int_0^{\infty} |1 - q(y)|^{-\frac{p}{4}}|G(E^{\frac{1}{2}}XQ(y), E)|^p\,dy.$$
We insert (2.8) for $G(x,E)$. This produces two integrals, the one with $(\pi\zeta/2)^{\frac{1}{2}}H^{(1)}_{1/3}(\zeta)$ and the other with the remainder term $O(\ldots)$ in place of $G(\zeta, E)$. We estimate the latter first as it is simpler. We define
$$\delta(p) = \begin{cases} (4 - p)^{-1}, & \text{if } p < 4; \\ \log(E^{\frac{1}{2}}X), & \text{if } p = 4; \\ (p - 4)^{-1}(E^{\frac{1}{2}}X)^{\frac{p-4}{6}}, & \text{if } p > 4. \end{cases}$$

Lemma 2.3. There exists a constant $C > 0$ such that for large $E \ge E_0$,
$$\int_0^{\infty} |1 - q(y)|^{-\frac{p}{4}}\left(\frac{E^{-\frac{1}{2}}X^{-1}|E^{1/2}XQ(y)|^{\frac{1}{6}}e^{-E^{1/2}X\,\mathrm{Im}\,Q(y)}}{(1 + |E^{1/2}XQ(y)|)^{\frac{1}{6}}}\right)^{p} dy \le C^p(E^{\frac{1}{2}}X)^{-p}\delta(p). \tag{2.13}$$

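Proof. We split the integral into three parts by using the constant $K$ of Lemma 2.2,
$$\left(\int_0^1 + \int_1^K + \int_K^{\infty}\right)\ldots dy \equiv I_1 + I_2 + I_3.$$
By virtue of (2.11) and (2.12), we have
$$I_1 \le C^p\int_0^1 (1 - y)^{-\frac{p}{4}}(E^{\frac{1}{2}}X)^{-p}\left(\frac{E^{1/2}X(1 - y)^{3/2}}{1 + E^{1/2}X(1 - y)^{3/2}}\right)^{p/6} dy = C^p(E^{\frac{1}{2}}X)^{-p}(E^{\frac{1}{2}}X)^{\frac{p-4}{6}}\int_0^{(E^{1/2}X)^{2/3}} \frac{1}{(1 + y^{3/2})^{p/6}}\,dy \le C^p(E^{\frac{1}{2}}X)^{-p}\delta(p).$$
Since $|e^{-XE^{1/2}\,\mathrm{Im}\,Q(y)}| \le 1$ for $1 \le y \le K$, we likewise have
$$I_2 \le C^p\int_1^K |y - 1|^{-\frac{p}{4}}\left(\frac{E^{-\frac{1}{2}}X^{-1}|E^{1/2}XQ(y)|^{\frac{1}{6}}}{(1 + |E^{1/2}XQ(y)|)^{\frac{1}{6}}}\right)^{p} dy \le C^p(E^{\frac{1}{2}}X)^{-p}\delta(p). \tag{2.14}$$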

For $K \le y < \infty$, we have $|1 - q(y)|^{-\frac{p}{4}} \sim y^{-\frac{kp}{4}} \le C^p$, $-iQ(y) \sim y^{1 + \frac{k}{2}} \ge cy$ and
$$I_3 \le C^p\int_K^{\infty} e^{-cpXE^{1/2}y}\,dy \le C^p e^{-cpE^{1/2}X} \le C^p\delta(p). \tag{2.15}$$

Combining estimates (2.14) and (2.15), we obtain (2.13). (1)
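The change of variable behind the bound on $I_1$ can be verified numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: it writes $M$ for $E^{1/2}X$, drops the common factor $C^p(E^{1/2}X)^{-p}$, uses a plain midpoint rule, and compares the integral before and after the substitution $s = M^{2/3}(1 - y)$. The function names are hypothetical.

```python
import math

def midpoint(f, a, b, n=20000):
    # Composite midpoint rule on [a, b].
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

def delta(p, M):
    # δ(p) as defined before Lemma 2.3, with M standing for E^{1/2} X.
    if p < 4:
        return 1.0 / (4.0 - p)
    if p == 4:
        return math.log(M)
    return M ** ((p - 4.0) / 6.0) / (p - 4.0)

def i1_before(M, p):
    # ∫_0^1 (1-y)^(-p/4) * ( M(1-y)^(3/2) / (1 + M(1-y)^(3/2)) )^(p/6) dy
    def f(y):
        t = 1.0 - y
        a = M * t ** 1.5
        return t ** (-p / 4.0) * (a / (1.0 + a)) ** (p / 6.0)
    return midpoint(f, 0.0, 1.0)

def i1_after(M, p):
    # M^((p-4)/6) * ∫_0^{M^(2/3)} (1 + s^(3/2))^(-p/6) ds
    upper = M ** (2.0 / 3.0)
    return M ** ((p - 4.0) / 6.0) * midpoint(
        lambda s: (1.0 + s ** 1.5) ** (-p / 6.0), 0.0, upper)

if __name__ == "__main__":
    for p in (3.0, 4.0, 6.0):
        print(p, i1_before(50.0, p), i1_after(50.0, p), delta(p, 50.0))
```

The three regimes of $\delta(p)$ correspond to the behavior of $\int_0^{M^{2/3}}(1 + s^{3/2})^{-p/6}\,ds$: the integrand decays like $s^{-p/4}$, so the integral grows like a power of $M$ for $p < 4$, logarithmically for $p = 4$, and stays bounded for $p > 4$.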

Recall that H 1 (ζ ) satisfies the following (cf. [T1, (7.1.8), (7.8.5) and (7.8.7))]: 3 1 2 (1) (1) When ζ = −z < 0, H 1 (ζ ) = √ e− 6 πi {J 1 (z) + J− 1 (z)} and 3 3 3 3  3 1 1   2 2 π − 2 e 3 πi {cos(z − (π/4)) + O(z−1 )} (z → ∞), 1 (1) ζ 2 H 1 (ζ ) = 2 23 e 13 πi 1  3 (z → 0). z 6 (1 + O(z)) √ 3 F(2/3)

2 − 2 πi e 3 K 1 (w) and 3 π

(1)

(2) When ζ = iw and w ≥ 0, H 1 (ζ ) = 3 1

(1)

ζ 2 H 1 (ζ ) =

O(e−w ) 1 3

2 e

3

− 16 π

π

−1

(2.16)

1 6

3 2

F(1/3)w + O(w )

(w → ∞), (w → 0).

(2.17)

Lemma 2.4. There exists a constant $C > 0$ such that for large $E \ge E_0$,
$$\int_0^{\infty} |1 - q(y)|^{-\frac{p}{4}}|\zeta^{\frac{1}{2}}H^{(1)}_{1/3}(\zeta)|^p\,dy \le C^p\delta(p), \qquad \zeta = E^{1/2}XQ(y). \tag{2.18}$$

Proof. We split the integral into three parts
$$\left(\int_0^1 + \int_1^K + \int_K^{\infty}\right)\cdots dy = II_1 + II_2 + II_3$$
and estimate them separately. When $0 \le y \le 1$, $\zeta = E^{1/2}XQ(y) \sim -E^{1/2}X(1 - y)^{3/2} < 0$. We take large $N > 0$ and split the integral $II_1$ into two parts $II_1 = II_{11} + II_{12}$. $II_{11}$ is the integral over the part of the interval $(0,1)$ where $N < E^{1/2}X(1 - y)^{3/2}$ and $II_{12}$ over the complement. Applying the first relation of (2.16) to $II_{11}$ and the second to $II_{12}$, we obtain
$$II_{11} \le C^p\int_0^{1 - N^{2/3}(E^{1/2}X)^{-2/3}} (1 - y)^{-\frac{p}{4}}\,dy \le C^p\delta(p), \tag{2.19}$$
$$II_{12} \le C^p(E^{1/2}X)^{\frac{p}{6}}\int_0^{N^{2/3}(E^{1/2}X)^{-2/3}} y^{-\frac{p}{4}}y^{\frac{p}{4}}\,dy = C^pN^{\frac{2}{3}}(E^{1/2}X)^{\frac{p-4}{6}} \le C^p\delta(p). \tag{2.20}$$


When $1 \le y \le K$, we have $q(y) - 1 \sim y - 1$ and $w = -i\zeta = -iE^{1/2}XQ(y) \sim E^{1/2}X(y - 1)^{3/2} > 0$. We split the integral
$$II_2 = \int_1^K |1 - q(y)|^{-\frac{p}{4}}\left|\frac{2}{\pi}w^{\frac{1}{2}}K_{1/3}(w)\right|^p dy = II_{21} + II_{22}$$
into the part $II_{21}$ over $w \ge 1$ and $II_{22}$ over $0 \le w \le 1$. We apply the first of (2.17) to $II_{21}$ and the second to $II_{22}$ and obtain
$$II_{21} \le C^p\int_{1 + C(E^{1/2}X)^{-2/3}}^{K} (y - 1)^{-\frac{p}{4}}\,dy \le C^p\delta(p), \tag{2.21}$$
$$II_{22} \le C^p\int_0^{C(E^{1/2}X)^{-2/3}} y^{-\frac{p}{4}}(E^{1/2}Xy^{3/2})^{\frac{p}{6}}\,dy \le C^p(E^{1/2}X)^{\frac{p-4}{6}} \le C^p\delta(p). \tag{2.22}$$
For $K \le y < \infty$, $q(y) - 1 \sim y^k$, $w \sim E^{\frac{1}{2}}Xy^{1 + \frac{k}{2}}$ and (2.17) yields
$$II_3 \le C^p\int_K^{\infty} y^{-\frac{kp}{4}}e^{-cpE^{1/2}Xy^{1 + \frac{k}{2}}}\,dy \le C^pe^{-cpE^{1/2}X} \le C^p\delta(p). \tag{2.23}$$
Combining estimates (2.19), (2.20), (2.21), (2.22) and (2.23), we obtain (2.18).

Lemma 2.5. There exists a constant $C > 0$ such that we have the following lower bound
$$\int_0^1 |1 - q(y)|^{-\frac{p}{4}}|\zeta^{\frac{1}{2}}H^{(1)}_{1/3}(\zeta)|^p\,dy \ge C^p\delta(p), \qquad \zeta = E^{1/2}XQ(y),$$

for sufficiently large E ≥ E0 . Proof. Denote the integral on the left by II11 as in the proof of the previous lemma. We take N large enough so that |O(1/z)| ≤ 1/10 in the first of (2.16) for z ≥ N . Take a large 1 3 C > 0 such that z = −ζ ∼ E 2 X(1 − y) 2 ≥ N when CN 2/3 (E 1/2 X)−2/3 < 1 − y < 1. Then, by virtue of (2.16), we have, for E ≥ E0 , p π 1 (1 − y)−p/4 cos ζ − dy +O II11 ≥ C p 4 ζ N 2/3 (E 1/2 X)−2/3 0. An entirely similar argument produces the corresponding estimate for ψ(x, E)Lp (R− ) and we obtain the upper bound of (1.9). The lower bound readily follows from Lemma 2.5. For proving the second statement, we remark that the estimate (2.3) remains to hold for x ∈ K uniformly. It is obvious from (2.4) that

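$$\left|(C_E^+)^{-1}(E - V(x))^{-\frac{1}{4}}\,O\!\left(\frac{E^{-\frac{1}{2}}X^{-1}e^{-\mathrm{Im}\,\zeta}|\zeta|^{1/6}}{1 + |\zeta|^{1/6}}\right)\right| \le CX^{-\frac{1}{2}}(E^{-\frac{1}{2}}X^{-1}). \tag{2.25}$$
Since $\zeta = -z \sim -E^{\frac{1}{2}}X$ for large $E$ uniformly for $x \in K$, we have from the first relation of (2.16) that
$$(C_E^+)^{-1}[E - V(x)]^{-\frac{1}{4}}(\pi\zeta/2)^{\frac{1}{2}}H^{(1)}_{1/3}(\zeta) \sim X^{-\frac{1}{2}}\left\{\cos\left(z - \frac{\pi}{4}\right) + O(E^{-\frac{1}{2}}X^{-1})\right\}. \tag{2.26}$$
The second statement follows by combining (2.25) and (2.26) because $X \sim E^{1/k}$.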
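The exponent bookkeeping of this section can be collected in a few lines of exact arithmetic. The sketch below is a reading aid, not the paper's definition: it combines (2.9), the change of variable $\int_0^\infty|\psi^+|^p\,dx = XE^{-p/4}\int\ldots$, the size $\delta(p)$ of the remaining integral, and $X \sim E^{1/k}$, to obtain the power $e$ in $\|\psi(\cdot,E)\|_{L^p} \sim E^{-e}$ for finite $p$ (ignoring the logarithm at $p = 4$). That this $e$ coincides with $\theta(k,p)$, which is defined earlier in the paper and not reproduced here, is an assumption; the function name is hypothetical.

```python
from fractions import Fraction

def lp_decay_exponent(k, p):
    """Power e such that X^(1/p - 1/2) * δ(p)^(1/p) ~ E^(-e), with X ~ E^(1/k).

    Derivation (assumed reading of Sect. 2):
      ||ψ||_p ~ X^(-1/2) E^(1/4) * (X E^(-p/4) δ(p))^(1/p) = X^(1/p - 1/2) δ(p)^(1/p),
      and δ(p) contributes (E^(1/2) X)^((p-4)/6) only when p > 4.
    """
    k, p = Fraction(k), Fraction(p)
    if k <= 0 or p <= 0:
        raise ValueError("k and p must be positive")
    exponent = (Fraction(1, 2) - 1 / p) / k
    if p > 4:
        exponent -= (p - 4) / (6 * p) * (Fraction(1, 2) + 1 / k)
    return exponent

if __name__ == "__main__":
    for p in (2, 3, 4, 6, 8):
        print(p, lp_decay_exponent(4, p))
```

At $p = 4$ the function returns $1/(4k)$, matching the power of $E$ in the second line of (1.9), and at $p = 2$ it returns $0$, as it must for $L^2$-normalized eigenfunctions. It is increasing in $p$ below $4$ and decreasing above, which is the behavior Remark 1.1 attributes to $\theta(k,p)$.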

3. Smoothing Properties

In this section we prove Theorem 1.2 and Theorem 1.3 by using estimates obtained in Sect. 2. We write $\hat{g}$ for the Fourier transform of $g$. In terms of the eigenvalues $\lambda_1 < \lambda_2 < \ldots$ of $H$ and the corresponding normalized eigenfunctions $\psi_1(x), \psi_2(x), \ldots$, we may write
$$e^{-itH}u_0(x) = \sum_{n=1}^{\infty} e^{-it\lambda_n}\hat{u}_0(n)\psi_n(x), \tag{3.1}$$
where $\hat{u}_0(n) = \int_{\mathbb{R}} u_0(x)\psi_n(x)\,dx$, $n = 1, 2, \ldots$ are the generalized Fourier coefficients. Under Assumption 1.1 we know that there exists a constant $C > 0$ such that
$$\Delta\lambda_n \equiv \lambda_{n+1} - \lambda_n \ge C\lambda_n^{\frac{k-2}{2k}}, \tag{3.2}$$
hence $\lambda_n \ge Cn^{\frac{2k}{k+2}}$ for $n = 1, 2, \ldots$ (cf. e.g. [Y4]).

Lemma 3.1. Suppose $u_0 \in D(H^{\ell})$ for sufficiently large $\ell$, then
$$\|g(t)e^{-itH}u_0(x)\|^2_{L^2(\mathbb{R}_t)} \le C\|g\|^2_{B^{\frac{1}{4} + \frac{1}{2k}}(\mathbb{R})}\sum_{n=1}^{\infty} |\hat{u}_0(n)\psi_n(x)|^2, \qquad \forall x \in \mathbb{R}. \tag{3.3}$$
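The gap estimate (3.2) can be illustrated with the Bohr–Sommerfeld approximation for a model potential. This is a heuristic sketch, not the argument of [Y4]: it assumes $V(x) = |x|^k$ and the semiclassical rule $\int_{-X}^{X}\sqrt{E - V(x)}\,dx = (n - \tfrac{1}{2})\pi$, which gives a closed form through the Gamma function. The names `wkb_eigenvalue` and `gap_ratios` are hypothetical.

```python
import math

def wkb_eigenvalue(n, k):
    # Bohr-Sommerfeld level for -d^2/dx^2 + |x|^k (assumed model):
    #   2 E^(1/2 + 1/k) ∫_0^1 sqrt(1 - s^k) ds = (n - 1/2) π,
    # with ∫_0^1 sqrt(1 - s^k) ds = sqrt(π) Γ(1 + 1/k) / (2 Γ(3/2 + 1/k)).
    c = (math.sqrt(math.pi) * (n - 0.5)
         * math.gamma(1.5 + 1.0 / k) / math.gamma(1.0 + 1.0 / k))
    return c ** (2.0 * k / (k + 2.0))

def gap_ratios(k, n_max):
    # (λ_{n+1} - λ_n) / λ_n^((k-2)/(2k)) for n = 1..n_max; (3.2) says this
    # ratio is bounded below by a positive constant.
    lam = [wkb_eigenvalue(n, k) for n in range(1, n_max + 2)]
    power = (k - 2.0) / (2.0 * k)
    return [(lam[i + 1] - lam[i]) / lam[i] ** power for i in range(n_max)]

if __name__ == "__main__":
    print([wkb_eigenvalue(n, 2) for n in (1, 2, 3)])  # harmonic oscillator check
    r = gap_ratios(4, 50)
    print(min(r), max(r))
```

For $k = 2$ the formula reproduces the harmonic oscillator levels $2n - 1$, where the gaps are constant; for $k > 2$ the approximate levels grow like $n^{2k/(k+2)}$ and the gaps grow, which is the property used in the counting argument of the proof below.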



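Proof. By virtue of Theorem 1.5, (3.1) converges uniformly with respect to $(t,x)$. If the support of $\hat{g}$ has a diameter $< 2^j$, then, by virtue of (3.2), there exist at most $C2^{j(\frac{1}{2} + \frac{1}{k})}$ values of $\lambda_n$ such that $\hat{g}(\lambda + \lambda_n) \neq 0$ for every fixed $\lambda$. It follows by the Plancherel theorem that for such $g$,
$$\begin{aligned} \int_{-\infty}^{\infty} |g(t)e^{-itH}u_0(x)|^2\,dt &= \int_{-\infty}^{\infty} \Big|\sum_{n=1}^{\infty} \hat{g}(\lambda + \lambda_n)\hat{u}_0(n)\psi_n(x)\Big|^2 d\lambda \\ &\le C2^{j(\frac{1}{2} + \frac{1}{k})}\sum_{n=1}^{\infty}\int_{-\infty}^{\infty} |\hat{g}(\lambda + \lambda_n)\hat{u}_0(n)\psi_n(x)|^2\,d\lambda \\ &\le C2^{j(\frac{1}{2} + \frac{1}{k})}\|\hat{g}\|^2_{L^2}\sum_{n=1}^{\infty} |\hat{u}_0(n)\psi_n(x)|^2, \end{aligned} \tag{3.4}$$
where in the second step we used Schwarz' inequality. If $\hat{g}$ is not compactly supported, we decompose $g$ by using a dyadic decomposition of unity $\sum_{j=-\infty}^{\infty} \hat{h}_j(\lambda) = 1$ such that
$$\mathrm{supp}\,\hat{h}_0 \subset \{\lambda : |\lambda| < 1\}, \qquad \mathrm{supp}\,\hat{h}_{\pm j} \subset \{\lambda : \pm 2^{|j|-2} < \lambda < \pm 2^{|j|}\}, \quad j = 1, 2, \ldots,$$
in the form $g = \sum_{j=-\infty}^{\infty} g_j$, so that $\hat{g}_j = \hat{g}\hat{h}_j$ has a support whose diameter is less than $2^{|j|}$. Then, (3.4) implies
$$\|g(t)e^{-itH}u_0(x)\|^2_{L^2(\mathbb{R}_t)} \le C\Big(\sum_{j=-\infty}^{\infty} \|\hat{g}_j\|_{L^2(\mathbb{R})}2^{\frac{|j|}{2}(\frac{1}{2} + \frac{1}{k})}\Big)^2\sum_{n=1}^{\infty} |\hat{u}_0(n)\psi_n(x)|^2 \le C\|g\|^2_{B^{\frac{1}{4} + \frac{1}{2k}}(\mathbb{R})}\sum_{n=1}^{\infty} |\hat{u}_0(n)\psi_n(x)|^2.$$

By virtue of the Minkowski inequality we have
$$\Big\|\Big(\sum_{n=1}^{\infty} |\hat{u}_0(n)\psi_n(x)|^2\Big)^{1/2}\Big\|_{L^p} = \Big\|\sum_{n=1}^{\infty} |\hat{u}_0(n)\psi_n(x)|^2\Big\|^{1/2}_{L^{p/2}} \le \Big(\sum_{n=1}^{\infty} |\hat{u}_0(n)|^2\|\psi_n(x)\|^2_{L^p}\Big)^{1/2}.$$
The right-hand side may be estimated by using Theorem 1.5 by
$$C_p\Big(\sum_{n=1}^{\infty} |\hat{u}_0(n)|^2\lambda_n^{-2\theta(k,p)}\Big)^{1/2} = C_p\|H^{-\theta(k,p)}u_0\|_{L^2}. \tag{3.5}$$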


Combination of (3.3) and (3.5) yields
$$\|g(t)e^{-itH}u_0(x)\|_{L^p(\mathbb{R}_x, L^2(\mathbb{R}_t))} \le C_p\|g\|_{B^{\frac{1}{4} + \frac{1}{2k}}(\mathbb{R})}\|H^{-\theta(k,p)}u_0\|_{L^2(\mathbb{R})}, \tag{3.6}$$
where the constant $C_p$ is taken uniformly with respect to $p$ outside $(4 - \varepsilon, 4 + \varepsilon)$. Since $D(H^{\ell})$ is dense in $L^2(\mathbb{R})$, (3.6) holds for all $u \in L^2(\mathbb{R})$. Theorem 1.2 follows from (3.6).

Proof of Theorem 1.3. Theorem 1.5 (2) implies that
$$\sup_{x \in K}\sum_{n=1}^{\infty} |\hat{u}_0(n)|^2|\psi_n(x)|^2 \le C\sum_{n=1}^{\infty} |\lambda_n^{-\frac{1}{2k}}\hat{u}_0(n)|^2 = C\|H^{-\frac{1}{2k}}u_0\|^2_{L^2(\mathbb{R})}. \tag{3.7}$$
Thus, Theorem 1.3 follows by combining (3.3) with (3.7).

4. Applications to Nonlinear Equations

In this section we prove Theorem 1.6 and Theorem 1.7. Since the proofs are quite similar, we prove Theorem 1.7, and only indicate the modifications necessary for the proof of Theorem 1.6. Hereafter, we often omit some of the variables of the function $u(t,x)$ and write $u(t)$ or simply $u$ for $u(t,x)$ when there is no risk of confusion. By taking $g$ such that $g(t) = 1$ for $|t| \le \delta$ in Theorem 1.2 and Theorem 1.3, we have
$$\|\langle i\partial/\partial t\rangle^{\alpha}\langle H\rangle^{\beta}e^{-itH}u_0\|_{L^p(\mathbb{R}_x, L^2([-\delta,\delta]_t))} \le C_\delta\|u_0\|_{L^2}, \qquad \alpha + \beta = \theta(k,p),\ p \ge 2; \tag{4.1}$$
$$\sup_{x \in K}\|\langle i\partial/\partial t\rangle^{1/2k}e^{-itH}u_0\|_{L^2([-\delta,\delta]_t)} \le C_\delta\|u_0\|_{L^2}. \tag{4.2}$$

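Proof of Theorem 1.7. We prove Theorem 1.7 for $t \ge 0$ only. The argument for $t \le 0$ is similar. We consider the equivalent integral equation
$$u(t) = e^{-itH}u_0 - i\int_0^t e^{-i(t-s)H}f(x, u(s))\,ds. \tag{4.3}$$
For $\delta > 0$, we write $K_\delta = [0,\delta] \times K$ and define the Banach space $Y_\delta(K)$ by
$$Y_\delta(K) = C([0,\delta], L^2(\mathbb{R})) \cap L^{2r}(K_\delta), \qquad \|u\|_{Y_\delta(K)} \equiv \|u\|_{L^\infty([0,\delta], L^2(\mathbb{R}))} + \|u\|_{L^{2r}(K_\delta)}.$$
We define a nonlinear map $\mathcal{K} : Y_\delta(K) \to Y_\delta(K)$ by
$$\mathcal{K}(u) = e^{-itH}u_0 - i\Phi(u), \qquad \Phi(u) = \int_0^t e^{-i(t-s)H}f(x, u(s))\,ds. \tag{4.4}$$
Write $B_M = \{u \in Y_\delta(K) : \|u\|_{Y_\delta(K)} \le M\}$.

Lemma 4.1. The map $\mathcal{K}$ is well defined on $Y_\delta(K)$. There exist $M > 0$ and $\delta > 0$ depending only on $\|u_0\|_{L^2(\mathbb{R})}$ such that $\mathcal{K}$ maps $B_M$ into itself and
$$\|\mathcal{K}(u) - \mathcal{K}(v)\|_{Y_\delta(K)} \le \frac{1}{2}\|u - v\|_{Y_\delta(K)}, \qquad u, v \in B_M. \tag{4.5}$$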



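Proof. For $u_0 \in L^2(\mathbb{R})$, we have $e^{-itH}u_0 \in C(\mathbb{R}, L^2(\mathbb{R}))$. By virtue of (4.2) and the Sobolev embedding theorem, $e^{-itH}u_0 \in L^\infty(K_x, L^{2r}([0,\delta]_t))$. Hence, $e^{-itH}u_0 \in Y_\delta(K)$ and
$$\|e^{-itH}u_0\|_{Y_\delta(K)} \le c_1\|u_0\|_{L^2}. \tag{4.6}$$
Let $\chi(s < t)$ be such that $\chi(s < t) = 1$ if $0 < s < t$, and $\chi(s < t) = 0$ otherwise. If $u \in Y_\delta(K)$, then the assumptions that $f(x,u) = 0$ for $x \notin K$ and (1.14) imply $f(x, u(t,x)) \in L^2([0,\delta]_t \times \mathbb{R}_x)$ and
$$\|f(x, u(t,x))\|_{L^2(K_\delta)} \le C\|u\|^r_{L^{2r}(K_\delta)}. \tag{4.7}$$
It then easily follows that $\Phi(u) \in C([0,\delta], L^2(\mathbb{R}))$ and, by Schwarz' inequality,
$$\|\Phi(u)\|_{L^\infty([0,\delta]; L^2(\mathbb{R}))} \le C\delta^{\frac{1}{2}}\|u\|^r_{L^{2r}(K_\delta)}. \tag{4.8}$$
By Minkowski's inequality, (4.2) and (4.7), we have
$$\begin{aligned} \|\Phi(u)\|_{L^{2r}(K_\delta)} &\le \int_0^\delta \|\chi(s < t)e^{-itH}\{e^{isH}f(x, u(s,x))\}\|_{L^{2r}(K_\delta)}\,ds \\ &\le C\int_0^\delta \|f(x, u(s,x))\|_{L^2(K)}\,ds \le C\delta^{\frac{1}{2}}\|f(x,u)\|_{L^2(K_\delta)} \\ &\le C\delta^{\frac{1}{2}}\|u\|^r_{L^{2r}(K_\delta)}, \end{aligned} \tag{4.9}$$
which with (4.6) and (4.8) implies that $\mathcal{K}$ is well-defined on $Y_\delta(K)$. It follows also from (4.6), (4.8) and (4.9) that, with constants $c_1$ and $c_2$ which can be taken independent of small $\delta$,
$$\|\mathcal{K}u\|_{Y_\delta(K)} \le \|e^{-itH}u_0\|_{Y_\delta(K)} + \|\Phi(u)\|_{Y_\delta(K)} \le c_1\|u_0\|_{L^2} + c_2\delta^{\frac{1}{2}}\|u\|^r_{Y_\delta(K)}. \tag{4.10}$$
Thus, if we take $M$ such that $M > 2c_1\|u_0\|_{L^2}$, $\delta < (2c_2M^{r-1})^{-2}$, then $\|\mathcal{K}u\|_{Y_\delta(K)} \le 2c_1\|u_0\|_{L^2} < M$ whenever $\|u\|_{Y_\delta(K)} \le M$ and $\mathcal{K}$ maps $B_M$ into itself. To show that $\mathcal{K}$ satisfies (4.5), we estimate
$$\mathcal{K}(u_1) - \mathcal{K}(u_2) = -i\int_0^t e^{-i(t-s)H}[f(x, u_1(s)) - f(x, u_2(s))]\,ds.$$
We have by Minkowski's inequality and Hölder's inequality that
$$\begin{aligned} \|\mathcal{K}(u_1) - \mathcal{K}(u_2)\|_{L^\infty([0,\delta]_t; L^2(\mathbb{R}_x))} &\le \int_0^\delta \|f(x, u_1(s)) - f(x, u_2(s))\|_{L^2(K)}\,ds \\ &\le C\int_0^\delta \||u_1 - u_2|(|u_1|^{r-1} + |u_2|^{r-1})\|_{L^2(K)}\,ds \\ &\le C\int_0^\delta \|u_1(s) - u_2(s)\|_{L^{2r}(K)}\big(\|u_1\|^{r-1}_{L^{2r}(K)} + \|u_2\|^{r-1}_{L^{2r}(K)}\big)\,ds \\ &\le C\delta^{\frac{1}{2}}\big(\|u_1\|^{r-1}_{L^{2r}(K_\delta)} + \|u_2\|^{r-1}_{L^{2r}(K_\delta)}\big)\|u_1 - u_2\|_{L^{2r}(K_\delta)}. \end{aligned} \tag{4.11}$$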


Likewise, by virtue of (4.6), we have by Minkowski's inequality and Hölder's inequality
$$\begin{aligned} \|\mathcal{K}(u_1) - \mathcal{K}(u_2)\|_{L^{2r}(K_\delta)} &\le \int_0^\delta \|\chi(s < t)e^{-itH}e^{isH}[f(x, u_1) - f(x, u_2)]\|_{L^{2r}(K_\delta)}\,ds \\ &\le C\int_0^\delta \|f(x, u_1(s,x)) - f(x, u_2(s,x))\|_{L^2(\mathbb{R}_x)}\,ds \\ &\le C\delta^{\frac{1}{2}}\big(\|u_1\|^{r-1}_{L^{2r}(K_\delta)} + \|u_2\|^{r-1}_{L^{2r}(K_\delta)}\big)\|u_1 - u_2\|_{L^{2r}(K_\delta)}. \end{aligned} \tag{4.12}$$
Combining (4.11) with (4.12), we obtain
$$\|\mathcal{K}(u_1) - \mathcal{K}(u_2)\|_{Y_\delta(K)} \le c_3\delta^{\frac{1}{2}}\big(\|u_1\|^{r-1}_{Y_\delta(K)} + \|u_2\|^{r-1}_{Y_\delta(K)}\big)\|u_1 - u_2\|_{Y_\delta(K)}, \tag{4.13}$$
and (4.5) follows if we choose $\delta$ such that $\delta < \min\{(2c_2M^{r-1})^{-2}, (4c_3M^{r-1})^{-2}\}$.

Continuation of Proof of Theorem 1.7. By virtue of Lemma 4.1, the contraction mapping theorem implies that $\mathcal{K}$ has a unique fixed point $u \in B_M$ and (4.3) has a unique solution $u$ in $Y_\delta(K)$. To prove that the solution depends on the initial data $u_0$ continuously as described in the theorem, we take $u_0, \tilde{u}_0 \in L^2(\mathbb{R})$ and let $u$ and $\tilde{u}$ be the corresponding solutions. Then, the preceding estimates (4.6) and (4.13) show
$$\|u - \tilde{u}\|_{Y_\delta(K)} \le c_1\|u_0 - \tilde{u}_0\|_{L^2} + c_3\delta^{\frac{1}{2}}\big(\|u\|^{r-1}_{Y_\delta(K)} + \|\tilde{u}\|^{r-1}_{Y_\delta(K)}\big)\|u - \tilde{u}\|_{Y_\delta(K)}$$
and $\|u - \tilde{u}\|_{Y_\delta(K)} \le c\|u_0 - \tilde{u}_0\|_{L^2}$ for small $\delta > 0$. This shows the desired continuous dependence.

When $f$ satisfies the additional assumption (1.13), we will show $\|u(t)\|_{L^2} = \|u_0\|_{L^2}$. Once this is shown, the solution $u(t)$ extends uniquely to $[0,\infty)$, since the length $\delta$ of the interval on which the solution exists depends only on $\|u_0\|_{L^2(\mathbb{R}_x)}$, as has been shown above. Also, the map $L^2(\mathbb{R}) \ni u_0 \mapsto u \in C([0,T], L^2(\mathbb{R})) \cap L^{2r}([0,T]_t \times K)$ is continuous for any $T > 0$ because $u(t,\cdot)$ is $L^2(\mathbb{R}_x)$ valued continuous, and we will be done. To show $\|u(t)\|_{L^2} = \|u_0\|_{L^2}$, we compute $\|\cdot\|^2_{L^2(\mathbb{R}_x)}$ of both sides of (4.3). Denoting the inner product and the norm of $L^2(\mathbb{R}_x)$ by $(\cdot,\cdot)$ and $\|\cdot\|$, respectively, and writing $f(t,x) = f(x, u(t,x))$, we have
$$\begin{aligned} \|u(t)\|^2 &= \Big\|e^{-itH}u_0 - i\int_0^t e^{-i(t-s)H}f(s,x)\,ds\Big\|^2 \\ &= \|u_0\|^2_{L^2} - 2\mathrm{Re}\Big(u_0, i\int_0^t e^{isH}f(s,x)\,ds\Big) + \int_0^t\!\!\int_0^t (e^{isH}f(s,x), e^{irH}f(r,x))\,ds\,dr. \end{aligned}$$
The last two terms on the right cancel each other because the last integral is equal to
$$\begin{aligned} &\int_0^t \Big(f(s,x), \int_0^s e^{-i(s-r)H}f(r,x)\,dr\Big)ds + \int_0^t \Big(\int_0^r e^{-i(r-s)H}f(s,x)\,ds, f(r,x)\Big)dr \\ &\quad = \int_0^t (f(s,x), iu(s) - ie^{-isH}u_0)\,ds + \int_0^t (iu(r) - ie^{-irH}u_0, f(r,x))\,dr \\ &\quad = 2\mathrm{Re}\Big(u_0, i\int_0^t e^{isH}f(s,x)\,ds\Big), \end{aligned}$$



where we used the fact that $u$ is a solution in the first step and (1.13) in the second. This completes the proof.

Proof of Theorem 1.6. The proof is very similar to that of Theorem 1.7 and we only indicate the necessary modifications. Instead of $Y_\delta(K)$, we now use the Banach space $X_\delta = C([0,\delta]_t; L^2(\mathbb{R}_x)) \cap L^4(\mathbb{R}_x; L^{2r}([0,\delta]_t))$ with the norm $\|u\|_{X_\delta} = \|u\|_{L^\infty([0,\delta]_t; L^2(\mathbb{R}_x))} + \|u\|_{L^4(\mathbb{R}_x; L^{2r}([0,\delta]_t))}$. (This notation is slightly different from that in the theorem, but no confusion should occur.) We define the nonlinear operators $\Phi$ and $\mathcal{K}$ by (4.4) as previously and set $B_M = \{u \in X_\delta : \|u\|_{X_\delta} \le M\}$. We show that, for any $u_0 \in L^2(\mathbb{R})$, $\mathcal{K}$ is a contraction map from $B_M$ into $B_M$ if the parameters $\delta > 0$ and $M$ are chosen suitably. To show $e^{-itH}u_0 \in X_\delta$ and $\|e^{-itH}u_0\|_{X_\delta} \le C\|u_0\|_{L^2}$, we use (4.1) instead of (4.2) and the Sobolev embedding theorem, which implies $e^{-itH}u_0 \in L^4(\mathbb{R}_x; L^{2r}([0,\delta]_t))$. By the assumption on $f$, we have
$$\begin{aligned} \|\Phi(u)\|_{L^\infty([0,\delta]_t; L^2(\mathbb{R}_x))} &\le \int_0^\delta \|f(x, u(s))\|_{L^2}\,ds \le C\int_0^\delta \||\phi(x)||u(s)|^r\|_{L^2}\,ds \\ &\le C\delta^{\frac{1}{2}}\Big\{\int_{[0,\delta] \times \mathbb{R}} |\phi(x)|^2|u(t,x)|^{2r}\,dt\,dx\Big\}^{\frac{1}{2}} \\ &= C\delta^{\frac{1}{2}}\Big\{\int_{\mathbb{R}} |\phi(x)|^2\|u(t,x)\|^{2r}_{L^{2r}([0,\delta]_t)}\,dx\Big\}^{\frac{1}{2}} \\ &\le C\delta^{\frac{1}{2}}\|\phi\|_{L^{\frac{4}{2-r}}(\mathbb{R})}\|u\|^r_{L^4(\mathbb{R}_x; L^{2r}([0,\delta]_t))} \\ &\le C\delta^{\frac{1}{2}}\|u\|^r_{X_\delta}. \end{aligned} \tag{4.14}$$
As in the proof of Theorem 1.7, (4.1) and (4.14) imply
$$\|\Phi(u)\|_{L^4(\mathbb{R}_x; L^{2r}([0,\delta]_t))} \le C\delta^{\frac{1}{2}}\|u\|^r_{X_\delta}. \tag{4.15}$$
It follows that $\mathcal{K}$ maps $B_M$ into $B_M$ for suitable $M$ and $\delta$ which depend only on $\|u_0\|_{L^2}$. The rest of the proof may be done by repeating the argument of the proof of Theorem 1.7 by using these estimates. We omit the details.
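The choice of the radius $M$ and the time $\delta$ in the contraction argument can be made concrete. The sketch below is an illustration only: the constants $c_1, c_2, c_3$ stand for the (unspecified) constants of (4.6), (4.10) and (4.13), the numerical values fed to it are arbitrary, and the safety margins (1% on $M$, a factor $1/2$ on $\delta$) are assumptions made here to turn the strict inequalities of the proof into a definite choice.

```python
def choose_parameters(u0_norm, c1, c2, c3, r):
    """Pick M > 2*c1*||u0|| and delta < min{(2 c2 M^(r-1))^-2, (4 c3 M^(r-1))^-2}."""
    if u0_norm <= 0 or min(c1, c2, c3) <= 0 or r < 1:
        raise ValueError("need u0_norm > 0, positive constants and r >= 1")
    M = 2.0 * c1 * u0_norm * 1.01          # any M strictly above 2*c1*||u0|| works
    cap = min((2.0 * c2 * M ** (r - 1.0)) ** -2,
              (4.0 * c3 * M ** (r - 1.0)) ** -2)
    delta = 0.5 * cap                       # strictly below the cap
    return M, delta

def check(u0_norm, c1, c2, c3, r):
    """Return (maps_ball_into_itself, is_half_contraction) for the chosen M, delta."""
    M, delta = choose_parameters(u0_norm, c1, c2, c3, r)
    # (4.10) with ||u|| <= M: ||K u|| <= c1 ||u0|| + c2 delta^(1/2) M^r, required < M.
    maps_ball = c1 * u0_norm + c2 * delta ** 0.5 * M ** r < M
    # (4.13) with ||u1||, ||u2|| <= M: Lipschitz constant c3 delta^(1/2) 2 M^(r-1) <= 1/2.
    contraction = c3 * delta ** 0.5 * (2.0 * M ** (r - 1.0)) <= 0.5
    return maps_ball, contraction

if __name__ == "__main__":
    print(choose_parameters(2.0, 1.5, 3.0, 0.7, 1.5))
    print(check(2.0, 1.5, 3.0, 0.7, 1.5))
```

Since $M$ is proportional to $\|u_0\|_{L^2}$ and $\delta$ depends only on $M$, the existence time depends on the data only through its $L^2$ norm, which is the fact used to extend the solution globally once the norm is conserved.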

References

[BAD] Ben-Artzi, M. and Devinatz, A.: Local smoothing and convergence properties of Schrödinger type equations. J. Funct. Anal. 101, 231–254 (1991)
[BT] Ben-Artzi, M. and Trèves, A.: Uniform estimates for a class of evolution equations. J. Funct. Anal. 120, 264–299 (1994)
[B] Bourgain, J.: Fourier transform restriction phenomena for certain lattice subsets and application to non-linear evolution equations I, Schrödinger equation. GAFA 3, 107–156 (1993)
[Br] Brenner, Ph.: On scattering and everywhere defined scattering operator for nonlinear Klein-Gordon equation. J. Diff. Eq. 56, 310–344 (1985)
[CKS] Craig, W., Kappeler, T. and Strauss, W.: Microlocal dispersive smoothing for the Schrödinger equation. Comm. Pure Appl. Math. 48, 769–860 (1995)
[CS] Constantin, P. and Saut, J.C.: Local smoothing properties of Schrödinger equations. Indiana Univ. Math. J. 38, 791–810 (1989)
[D1] Doi, S.: Smoothing effects for Schrödinger evolution groups on Riemannian manifolds. Duke Math. J. 82, 679–706 (1996)
[D2] Doi, S.: Commutator algebra and abstract smoothing effect. J. Funct. Anal. 168, 428–469 (1999)
[F] Fujiwara, D.: Remarks on convergence of the Feynman path integrals. Duke Math. J. 47, 41–96 (1980)
[GV1] Ginibre, J. and Velo, G.: Scattering theory in the energy space for a class of nonlinear Schrödinger equations. J. Math. Pures et Appl. 64, 363–401 (1985)
[GV2] Ginibre, J. and Velo, G.: Smoothing properties and retarded estimates for some dispersive evolution equations. Commun. Math. Phys. 144, 163–188 (1992)
[H] Hoshiro, T.: Mourre's method and smoothing properties of dispersive equations. Commun. Math. Phys. 202, 255–265 (1999)
[HK] Hayashi, N. and Kato, K.: Analyticity in time and smoothing effect of solutions to nonlinear Schrödinger equations. Commun. Math. Phys. 184, 273–300 (1997)
[K1] Kato, T.: Wave operators and similarity for some non-selfadjoint operators. Math. Ann. 162, 258–279 (1966)
[K2] Kato, T.: On the Cauchy problem for the (generalized) Korteweg-de Vries equation. Studies in Appl. Math., Adv. Math. Suppl. Studies 8, 93–128 (1983)
[K3] Kato, T.: Nonlinear Schrödinger equations. In: Lect. Notes for Physics 345 Schrödinger Operators, 1988
[KY] Kato, T. and Yajima, K.: Some examples of smooth operators and the associated smoothing effect. Rev. Math. Phys. 1, 481–496 (1989)
[KPV] Kenig, C.E., Ponce, G. and Vega, L.: Oscillatory integrals and regularity of dispersive equations. Indiana Univ. Math. J. 40, 33–69 (1991)
[MY] Martinez, A. and Yajima, K.: On the Fundamental Solution of Semiclassical Schrödinger Equations at Resonant Times. Commun. Math. Phys. 216, 357–373 (2001)
[P] Pecher, H.: Nonlinear small data scattering for wave and Klein–Gordon equation. Math. Z. 185, 261–270 (1984)
[Sj] Sjölin, P.: Local regularity of solutions to nonlinear Schrödinger equations. Ark. Mat. 28, 145–157 (1990)
[St] Strichartz, R.S.: Restrictions of Fourier transforms to a quadratic surface and decay of solutions of wave equations. Duke Math. J. 44, 704–714 (1977)
[Su] Sugimoto, M.: Global smoothing properties of generalized Schrödinger equations. J. Anal. Math. 76, 191–204 (1998)
[T1] Titchmarsh, E.C.: Eigenfunction expansions associated with second-order differential equations, Part 1, 2nd edition. Oxford: Oxford University Press, 1962
[T2] Titchmarsh, E.C.: Eigenfunction expansions associated with second-order differential equations, Part 2. Oxford: Oxford University Press, 1958
[V] Vega, L.: Schrödinger equations: Pointwise convergence to the initial data. Proc. A. M. S. 120, 874–878 (1988)
[Y1] Yajima, K.: Existence of evolution for time dependent Schrödinger equations. Commun. Math. Phys. 110, 415–426 (1987)
[Y2] Yajima, K.: On smoothing property of Schrödinger propagators. Lecture Notes in Mathematics, 1450, pp. 20–35
[Y3] Yajima, K.: Schrödinger evolution equation with magnetic fields. J. d'Analyse Math. 56, 29–76 (1991)
[Y4] Yajima, K.: Smoothness and non-smoothness of the fundamental solution of time dependent Schrödinger equations. Commun. Math. Phys. 181, 605–629 (1996)
[Z] Zygmund, A.: On the Fourier coefficients and transforms of functions of two variables. Studia Math. 50, 189–201 (1974)

Communicated by B. Simon

Commun. Math. Phys. 221, 591 – 657 (2001)




Higher-Dimensional BF Theories in the Batalin–Vilkovisky Formalism: The BV Action and Generalized Wilson Loops

Alberto S. Cattaneo, Carlo A. Rossi

Mathematisches Institut, Universität Zürich–Irchel, Winterthurerstrasse 190, 8057 Zürich, Switzerland

Received: 25 October 2000 / Accepted: 30 March 2001

Abstract: This paper analyzes in detail the Batalin–Vilkovisky quantization procedure for BF theories on an n-dimensional manifold and describes a suitable superformalism to deal with the master equation and the search of observables. In particular, generalized Wilson loops for BF theories with additional polynomial B-interactions are discussed in any dimensions. The paper also contains the explicit proofs to the theorems stated in [16].

Contents

1. Introduction
2. Preliminaries
   2.1 A brief discussion of Assumption 1
3. BF Theories
   3.1 The BRST procedure
   3.2 Classical observables
4. The Batalin–Vilkovisky Quantization Procedure for BF Theories
   4.1 Functional derivatives
   4.2 The BV antibracket
   4.3 Properties of the BV antibracket
   4.4 The BV Laplacian
   4.5 BV cohomology and observables
5. The BV Superformalism for BF Theories
   5.1 The space of functionals SA,B
   5.2 Main properties of the super BV antibracket
   5.3 The super BV Laplacian
   5.4 Twists
6. The BV Action for BF Theories
   6.1 The master equation
   6.2 The BV -closedness of the BV action
   6.3 Canonical BF theories
   6.4 Sigma-model interpretation
   6.5 Gauge fixing
   6.6 Superpropagator
7. Generalized Wilson Loops in Odd Dimensions
   7.1 The "cosmological term"
   7.2 The generalized Wilson loop in the BV superformalism
8. Other Loop Observables
   8.1 The odd-dimensional case
   8.2 The even-dimensional case
   8.3 The BV -exactness of the polynomial observables
9. Generalizations
   9.1 The case M = Rm
   9.2 Nontrivial bundles
Appendix A. Definition and Main Properties of the Pushforward
Appendix B. Sign Rules
   B.1 Dot products
   B.2 Superderivations
   B.3 Pullbacks and push-forwards
Appendix C. The Universal Global Angular Form
Appendix D. Parallel Transport as a Function on LM
References

A. S. C. acknowledges partial support of SNF Grant No. 2100-055536.98/1

1. Introduction

Topological BF theories [32, 26, 4] are in three dimensions just another way of writing Chern–Simons theory [36] (at least at the perturbative level and disregarding anomalies) [11]. In particular, they produce 3-manifold and knot invariants. The 2-dimensional version, at least in its canonical version (see Remark 3.1), is a particular case of the Poisson sigma model [27, 31] and describes the deformation quantization of the dual of the corresponding Lie algebra [17]. Higher-dimensional generalizations did not have interesting topological interpretations, apart from the partition functions which describe torsion invariants, due to the lack of interesting observables. These were recently introduced in [16] thanks to a superformalism that greatly simplifies the combinatorics of the associated Batalin–Vilkovisky (BV) cohomology. The meaning of these observables is that their expectation values are cohomology classes on the space of framed imbeddings of S 1 (as those described in [13]). The superformalism for BF theories in the BV framework is not entirely new as it was proposed in [34, 28], but unfortunately the sign rules were not spelt out. So the first aim of this paper, after reviewing BF theories in Sect. 3 and the BV formalism in Sect. 4, is to give a complete description in Sect. 5 of the superformalism and of its properties, including explicit sign rules. This leads to a straightforward proof of the master equation for BF theories in Sect. 6, both for ordinary BF theories and for their "canonical" version (see Subsect. 6.3). If the resulting BV action is interpreted as a supersymmetric sigma model, it falls in the general framework discussed in [1], see Subsect. 6.4. However, the superformalism used in this paper is better suited for the


introduction (see Sects. 7 and 8) of generalized Wilson loops, which are constructed in terms of Chen’s iterated integrals [18] involving the superfields. We refer to [16] for an overview of the above topics and for a discussion of the main theorems, which we prove here. In Sect. 9 we finally discuss the generalization of the above construction to the case of nontrivial bundles.

Now we briefly comment on the observables defined in this paper and in [16]. First, they are defined on loops (observables related to higher-dimensional submanifolds are of course of great interest and will be discussed in a forthcoming paper; we refer to [20, 11, 15] for previous attempts in this direction). Second, the quantum BV formalism requires considering the so-called BV Laplacian (see Subsect. 4.4) and this forces one to restrict to imbeddings (more precisely, to framed imbeddings). Third, one needs restrictions on the Lie algebra underlying the definition of the BF theory; a semisimple Lie algebra (or more generally a Lie algebra as in Assumption 3 in Sect. 2) will do only in the odd-dimensional case and for a specific observable whose expectation value involves Feynman diagrams that, apart from an obvious dependence of the propagators on the dimension, are exactly the same as in the computation of knot invariants from Chern–Simons theory (see [13] and references therein); the main characteristic of these diagrams is that they involve completely antisymmetric trivalent vertices which satisfy a diagrammatic version of the Jacobi identity (see [5]): namely, each vertex represents a binary operation with the same properties as a Lie bracket. More general observables (and, in particular, any observable in the even-dimensional case) require stronger restrictions on the Lie algebra involved; in particular, our construction works when it comes from an associative algebra (see Assumption 4 in Sect. 8).
The Feynman diagrams for the expectation values of these generalized Wilson loops contain (k + 1)-valent vertices (for any k ≥ 2) which can be interpreted as k-ary operations satisfying certain conditions. The way BF theories generate vertices (i.e., k + 1-ary operations) is through one single binary operation (i.e., an associative product) plus a trace; but in principle there might exist more general algebraic structures generating graph cocycles (in the sense of [29] and made precise in [13]) which yield cohomology classes of imbedded circles. Observe that, both in the odd and the even dimensional cases, one can define observables involving only 3-valent vertices (in the odd-dimensional case, they are the first observables pointed out above which do not require the extra assumption on the Lie algebra). The theory itself formally looks like a 3-dimensional BF theory with cosmological term (so essentially a Chern–Simons theory) “rigidly” transported to higher dimensions and this might provide an interpretation of the Chern–Simons theory for strings proposed in [35].

2. Preliminaries

In this section we introduce the main objects that we will use throughout the paper. We begin by considering a principal G-bundle $P \to M$, where M is a connected orientable manifold of dimension $m \ge 2$. We will denote by $\mathfrak{g}$ the Lie algebra of G. We consider the associated bundle $P \times_G V$ for a G-module V. In particular, we will be interested in $\mathrm{ad}\,P := P \times_G \mathfrak{g}$ and $\mathrm{ad}^* P := P \times_G \mathfrak{g}^*$. We denote by $\Omega^*(N; V)$ the space of V-valued forms on a manifold N. By $\Omega^*_{\mathrm{bas}}(P; V)$ we denote the invariant, horizontal forms on P taking values in V; then

$$\Omega^*_{\mathrm{bas}}(P; V) \cong \Omega^*(M, P \times_G V),$$

where $\Omega^*(M, W) = \Gamma(M, \Lambda^* T^*M \otimes W)$ for a vector bundle $W \to M$. For P trivial, one also has

$$\Omega^*(M, P \times_G V) \cong \Omega^*(M; V).$$

The gauge group $\mathcal{G}$ of P is defined as the set of all equivariant automorphisms of P; it can be identified with the set $\Gamma(M, \mathrm{Ad}\,P)$ of all sections of the bundle $\mathrm{Ad}\,P := P \times_G G$, where G acts on itself by conjugation. For P trivial, it can be identified with the group $\Gamma(M, G)$ of maps from M to G. Another important ingredient that we need is the Universal Enveloping Algebra (UEA) of a Lie algebra. We denote by $U(\mathfrak{g})$ the UEA of $\mathfrak{g}$. We recall that $U(\mathfrak{g})$ is an associative algebra. Further, we denote by $\iota : \mathfrak{g} \to U(\mathfrak{g})$ the canonical inclusion of $\mathfrak{g}$ into $U(\mathfrak{g})$ (which is a Lie algebra morphism). Throughout the paper we will be confronted with the problem of integrating along fibers forms with values in some vector space V; for the main properties of the push-forward of real forms and for its generalizations to the case of forms with values in some algebra, we refer to Appendix A. We end this section with some simplifying assumptions that we consider throughout the paper unless explicitly stated otherwise.

Assumption 1. The manifold M is compact and there is a flat connection $A_0$ on P, such that all the cohomology groups $H^*_{d_{A_0}}(M, \mathrm{ad}\,P)$ are trivial.

Assumption 2. The principal bundle P is trivial.

Assumption 3. The Lie algebra $\mathfrak{g}$ is endowed with a symmetric, Ad-invariant, nondegenerate bilinear form $\langle\ ,\ \rangle$ (e.g., if $\mathfrak{g}$ is semisimple, we may take the Killing form). In the following, we will extend this form to $\Omega^*(M, \mathrm{ad}\,P)$ in the usual way.

2.1. A brief discussion of Assumption 1. Assumption 1 is very strong; we want to briefly discuss it in view of future applications (definitions of loop observables). Let us suppose for a moment that we consider a Lie group G, whose Lie algebra $\mathfrak{g}$ satisfies the third assumption, and a compact, oriented manifold M of dimension m. In particular, the 0th and the mth De Rham cohomology groups are nontrivial. Let us suppose additionally that the Lie algebra $\mathfrak{g}$ possesses some invariant elements under the adjoint action of G (i.e., the 0th cohomology group $H^0(G, \mathfrak{g})$ is nontrivial). Then it can be shown that the first of the above assumptions cannot hold true. More generally, if the 0th cohomology group $H^0(G, \mathfrak{g})$ is nontrivial, and the manifold M is compact and oriented, then there exists no flat connection $A_0$ on P such that $H^*_{d_{A_0}}(M, \mathrm{ad}\,P)$ is trivial. This implies that, e.g., Assumption 1 is not compatible with the case of a compact, oriented manifold M and the Lie algebra $\mathfrak{g} = \mathfrak{gl}(N)$. However, we may assume that Assumption 1 holds true for odd-dimensional compact, oriented manifolds; this is in analogy with the assumption made by Axelrod and Singer in the introduction of [2]. This forces us to exclude principal bundles P whose structure group G has a nontrivial 0th cohomology group with coefficients in $\mathfrak{g}$. For the even-dimensional case, we may consider special even-dimensional manifolds arising as products of two odd-dimensional manifolds $M_1$ and $M_2$, one of which (say $M_1$) is the base space of a principal bundle P with Lie group G, satisfying Assumption 1, with flat acyclic connection $A_0$. We consider then on $M_1$ the complex $(\Lambda^* T^*M_1 \otimes \mathrm{ad}\,P,\ d_{A_0})$ and on $M_2$ the complex $(\Lambda^* T^*M_2,\ d)$; both are elliptic complexes, and the first one is acyclic by assumption. We take then the exterior tensor product of the two complexes defined on $M_1 \times M_2$, with induced differential; this is again an elliptic complex, and, by the Künneth Theorem and the acyclicity of the complex on $M_1$, it is acyclic. So, the existence of odd-dimensional manifolds for which Assumption 1 holds true implies the existence of even-dimensional manifolds for which Assumption 1 is valid. So, we have found some algebraic-topological obstructions to the existence of odd-dimensional compact, oriented manifolds for which Assumption 1 is valid, but we are still not able to produce a definitive criterion for the existence of such manifolds. We work therefore under the hope that there are Lie groups G and odd-dimensional compact, oriented manifolds for which Assumption 1 holds. In the case G = GL(N), as we have seen before, there are no such manifolds. So, in this case, i.e. in Sect. 8, we choose implicitly $M = \mathbb{R}^n$ with the flat trivial connection.

3. BF Theories

The fundamental ingredients that we need are a connection 1-form A on P and an (m−2)-form B of the adjoint type. We then construct the curvature $F_A$ of the connection and define the classical BF action as the functional

$$S^{\mathrm{cl}} = \int_M \langle B , F_A \rangle. \tag{3.1}$$

Remark 3.1. A more natural setting would be to consider B as a form of the coadjoint type. In this case, one would not have to introduce an invariant bilinear form on $\mathfrak{g}$, and Assumption 3 could be discarded. Instead one would use the canonical pairing between $\mathfrak{g}^*$ and $\mathfrak{g}$. We will call these theories canonical BF theories and will comment more on them in Subsect. 6.3. However, for the main purposes of this paper (namely, to define loop observables), one needs anyway to consider B of the adjoint type (or to introduce an isomorphism between $\mathfrak{g}$ and its dual). So we will stick for most of the paper to the setting described in this section.

Let us first compute the Euler–Lagrange equations of motion for the BF action; they are given by the pair of equations

$$F_A = 0, \qquad d_A B = 0. \tag{3.2}$$
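The equations (3.2) can be recovered by a short variational computation. The following is only a sketch: it assumes that M has no boundary (so that the integral of an exact form vanishes), and the symbols $\alpha \in \Omega^1(M, \mathrm{ad}\,P)$ and $\beta \in \Omega^{m-2}(M, \mathrm{ad}\,P)$, denoting arbitrary variations of A and B, are introduced here purely for illustration.

```latex
% First-order variation of the classical BF action (3.1).
% Assumptions: \partial M = \emptyset; \alpha = \delta A, \beta = \delta B.
\begin{align*}
\delta S^{\mathrm{cl}}
  &= \int_M \langle \beta , F_A \rangle
   + \int_M \langle B , d_A \alpha \rangle
   && (\text{since } \delta F_A = d_A \alpha) \\
  &= \int_M \langle \beta , F_A \rangle
   + (-1)^{m-1} \int_M \langle d_A B , \alpha \rangle .
\end{align*}
% The second line uses the Ad-invariance of the bilinear form,
%   d\langle B , \alpha \rangle
%     = \langle d_A B , \alpha \rangle + (-1)^{m-2} \langle B , d_A \alpha \rangle ,
% together with \int_M d\langle B , \alpha \rangle = 0.
```

Since $\alpha$ and $\beta$ are arbitrary and the bilinear form is nondegenerate, the vanishing of the variation is equivalent to $F_A = 0$ and $d_A B = 0$.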

In the following, by “on shell” we will refer to the space of solutions with the extra condition that the connection 1-form is as in Assumption 1. Next we turn to the symmetries of this action:

$$A \to A^g, \qquad B \to \mathrm{Ad}(g^{-1})B + d_{A^g}\tau_1,$$

where by $A^g$ we denote the right action of the gauge group element g on the connection A, and $\tau_1$ is an element of $\Omega^{m-3}(M, \mathrm{ad}\,P)$. The symmetries under which the BF action is invariant can be interpreted as the action of the semidirect product $\mathcal{G} \ltimes_{\mathrm{Ad}} \Omega^{m-3}(M, \mathrm{ad}\,P)$ on $\mathcal{A} \times \Omega^{m-2}(M, \mathrm{ad}\,P)$, where $\mathcal{A}$ denotes the space of connections on P. In infinitesimal form we obtain

$$A \to A + d_A c, \qquad B \to B + ([B, c] + d_A \tau_1), \tag{3.3}$$

where c is in $\Omega^0(M, \mathrm{ad}\,P)$ (the Lie algebra of $\mathcal{G}$).
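One can check directly that the infinitesimal transformations (3.3) leave the action (3.1) unchanged to first order. The sketch below assumes that M has no boundary, treats c and $\tau_1$ as ordinary (commuting) parameters, and uses two standard facts: the curvature varies as $\delta F_A = d_A d_A c = [F_A, c]$, and the Bianchi identity $d_A F_A = 0$ holds.

```latex
% First-order variation of (3.1) under the infinitesimal symmetries (3.3).
\begin{align*}
\delta S^{\mathrm{cl}}
 &= \int_M \langle [B, c] + d_A \tau_1 , F_A \rangle
  + \int_M \langle B , [F_A, c] \rangle \\
 &= \underbrace{\int_M \Bigl( \langle [B, c] , F_A \rangle
      + \langle B , [F_A, c] \rangle \Bigr)}_{=\,0 \text{ by Ad-invariance}}
  + \underbrace{\int_M d \langle \tau_1 , F_A \rangle}_{=\,0 \text{ by Stokes}} .
\end{align*}
% The last term uses
%   d\langle \tau_1 , F_A \rangle
%     = \langle d_A \tau_1 , F_A \rangle \pm \langle \tau_1 , d_A F_A \rangle
% and the Bianchi identity d_A F_A = 0.
```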



These symmetries are reducible on shell, i.e. each solution $(A_0; B_0)$ with $A_0$ as in Assumption 1 has as isotropy group the semidirect product $\{e\} \ltimes_{\mathrm{Ad}} \{\tau_1 \in \Omega^{m-3}(M, \mathrm{ad}\,P) : d_{A_0}\tau_1 = 0\}$. This isotropy group is isomorphic to $\Omega^{m-4}(M, \mathrm{ad}\,P)/d_{A_0}\Omega^{m-5}(M, \mathrm{ad}\,P)$, because of Assumption 1; there are in this quotient nontrivial isotropy groups isomorphic to $\Omega^{m-5}(M, \mathrm{ad}\,P)/d_{A_0}\Omega^{m-6}(M, \mathrm{ad}\,P)$, and so on until we arrive at $\Omega^0(M, \mathrm{ad}\,P)$ which acts freely on $\Omega^1(M, \mathrm{ad}\,P)$. Therefore, we have to adopt the extended BRST procedure to consistently fix all the symmetries, by introducing a hierarchy of ghosts for ghosts. Unfortunately the isotropy groups off shell are different from the above groups; so we have to resort to the BV formalism which generalizes BRST and works also in this case; see the next subsections for more details on both procedures.

3.1. The BRST procedure. For the sake of simplicity, let us restrict ourselves for the moment to the special case m = 4. We first promote the 0-form c and the 1-form $\tau_1$ appearing in the infinitesimal gauge transformations (3.3) to anticommuting fields of ghost number 1; A (and every variation of A which is a 1-form) and B will be given ghost number 0. We then define the BRST operator $\delta_{\mathrm{BRST}}$ for the 4-dimensional BF theory by the rules

$$\delta_{\mathrm{BRST}} A = d_A c, \qquad \delta_{\mathrm{BRST}} B = [B, c] + d_A \tau_1, \qquad \delta_{\mathrm{BRST}} c = -\tfrac{1}{2}[c, c], \tag{3.4}$$

and

$$\delta_{\mathrm{BRST}} \tau_1 = -[\tau_1, c] + d_A \tau_2, \qquad \delta_{\mathrm{BRST}} \tau_2 = [\tau_2, c],$$

where $\tau_2$ is a form in $\Omega^0(M, \mathrm{ad}\,P)$ to which we assign ghost number two. Then $\delta_{\mathrm{BRST}}$ is an odd operator of ghost number 1 and a differential for the Lie bracket. By the graded Leibnitz rule w.r.t. the ghost number, it follows that

$$\delta^2_{\mathrm{BRST}} B = [F_A, \tau_2] \neq 0,$$

while for the other fields, $\delta^2_{\mathrm{BRST}} = 0$. We notice that a sufficient condition for $\delta_{\mathrm{BRST}}$ to be a differential is $F_A = 0$; this is exactly the first equation in (3.2). Otherwise the BRST quantization procedure fails, but the BRST operator closes on shell; we can therefore apply to this situation another formalism to quantize the BF theory, namely the BV quantization procedure which works well for such a theory. A similar problem arises for any m ≥ 4. In general, however, because of the on-shell reducibility discussed in the last subsection, we have to introduce more ghosts for ghosts $\tau_k$ with values in $\Omega^{m-2-k}(M, \mathrm{ad}\,P)$, k = 1, . . . , m − 2, and ghost number k. The BRST operator is defined by (3.4) and by the rules:

$$\delta_{\mathrm{BRST}} \tau_k = (-1)^k [\tau_k, c] + d_A \tau_{k+1}, \qquad \delta_{\mathrm{BRST}} \tau_{m-2} = (-1)^m [\tau_{m-2}, c]. \tag{3.5}$$

It is then easy to see that $\delta^2_{\mathrm{BRST}} = 0$ mod $F_A$. The cases m = 2 and m = 3 are the only ones in which the BRST formalism works, but one can apply the BV formalism there as well, obtaining equivalent results.
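To make the hierarchy of ghosts for ghosts concrete, the following table lists form degree and ghost number of the fields in the illustrative case m = 5, read off from the rule that $\tau_k$ takes values in $\Omega^{m-2-k}(M, \mathrm{ad}\,P)$ and has ghost number k. The choice m = 5 is ours and serves only as an example.

```latex
% Field content of the BRST complex for m = 5 (illustrative choice of m).
\[
\begin{array}{l|cccccc}
\text{field}      & A & B & c & \tau_1 & \tau_2 & \tau_3 \\ \hline
\text{form degree} & 1 & 3 & 0 & 2      & 1      & 0      \\
\text{ghost number} & 0 & 0 & 1 & 1      & 2      & 3
\end{array}
\]
% For each \tau_k the sum (form degree) + (ghost number) equals m - 2 = 3,
% the form degree of B.
```

The tower stops at $\tau_{m-2}$, which is a 0-form, in agreement with the last rule in (3.5).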



3.2. Classical observables. We start by considering $\mathrm{Tr}_\rho\, H(A)|_0^1$, where by $H(A)|_0^1$ we denote the inverse of the usual holonomy w.r.t. the connection A viewed as a G-valued function on LM (see Remark D.1 for technical details). By considering a representation (ρ, V) we then obtain an Aut V-valued function, which under the trace then yields an ordinary function. This function depends also on the choice of a connection A, but its very definition implies that it is invariant w.r.t. the action of the gauge group $\mathcal{G}$ on the space $\mathcal{A}$ of connections on P, so it defines a function on $\mathcal{A}/\mathcal{G} \times LM$ ($\mathcal{A}/\mathcal{G}$ is the moduli space of G-connections). We notice that in a local trivialization, the inverse of the holonomy possesses a representation in terms of a formal series of iterated integrals. In the case P trivial, the holonomy becomes a function on LM with values in G. Next we define

$$h_{n,\rho}(A, B) := \mathrm{Tr}_\rho \Bigl[ \pi_{n*}\bigl( \widehat{B}_{1,n} \wedge \cdots \wedge \widehat{B}_{n,n} \bigr)\, H(A)|_0^1 \Bigr], \tag{3.6}$$

where the notations are as in Appendix D. From now on, we will omit the wedge product between forms. Notice that we have already omitted to write ρ before all forms in the definition of $\widehat{B}_{i,n}$; for all i, $\widehat{B}_{i,n}$ is a form on $LM \times \triangle_n$ with values in End(V). It follows from the definition that $h_{n,\rho}(A, B)$ for all n is a differential form of degree (m − 3)n on LM.

Proposition 3.2. Let A and B be on shell; then $h_{n,\rho}(A, B)$ is a closed form for m odd and for all n, while for m even and greater than 4, the $h_{2k+1,\rho}(A, B)$’s are closed.

Proof (sketch). Since $F_A = 0$ and $d_A B = 0$, as a consequence of Theorem D.3 the following identities hold:

$$d_{\pi_1^* \mathrm{ev}(0)^* A}\, \widehat{B}_{i,n} = \widehat{(d_A B)}_{i,n} = 0, \quad \forall i;$$

$d_{\mathrm{ev}(0)^* A}\, H(A)|_0^1 = 0$ as a consequence of D.2. The cyclicity of $\mathrm{Tr}_\rho$ implies $\mathrm{Tr}_\rho\, d_{\mathrm{ev}(0)^* A} = d\, \mathrm{Tr}_\rho$. We recall now that in the Definition (3.6) before the products of the $\widehat{B}_{i,n}$’s there is a push-forward; thus, in order to compute $d h_{n,\rho}$, we will have to apply the generalized Stokes Theorem (A.2). The boundary of the n-simplex can be written as the union of other (n − 1)-simplices (the faces of the simplex), corresponding to the collapsing of successive points, plus two other faces, where the first point tends to 0 or the last tends to 1. The faces of the first type give 0, because they yield terms containing $\widehat{B}_{i,n}^2$, which vanish for dimensional reasons. The remaining two faces give

$$-(-1)^{m(n-1)}\, \mathrm{Tr}_\rho \Bigl[ \mathrm{ev}(0)^* B\ \pi_{n-1*}\bigl( \widehat{B}_{1,n-1} \cdots \widehat{B}_{n-1,n-1} \bigr)\, H(A)|_0^1 \Bigr] + (-1)^{n+1}\, \mathrm{Tr}_\rho \Bigl[ \pi_{n-1*}\bigl( \widehat{B}_{1,n-1} \cdots \widehat{B}_{n-1,n-1} \bigr)\, H(A)|_0^1\ \mathrm{ev}(0)^* B \Bigr];$$

again, for m = dim M odd, the cyclicity of $\mathrm{Tr}_\rho$ implies that these terms cancel each other. This also works for m even, in case n is odd. On the other hand, when both m and n are even, these two terms have the same sign, and therefore they do not cancel each other.

Similar computations show that the $h_{n,\rho}(A, B)$’s are observables on shell and modulo exact terms, either if m is odd and greater than 5 or if m is even and greater than 4 but n is odd. Another advantage of the BV formalism that we must introduce for the reasons explained before is that it allows one to deal with observables which are BRST closed on shell, upon extending them suitably. This will be explained in the next sections.
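The degree (m − 3)n claimed for the forms defined in (3.6) can be read off by a simple count. The sketch below assumes that each of the n forms appearing under the push-forward in (3.6) has degree m − 2 (the degree of B), and that the push-forward integrates over the n-dimensional simplex and therefore lowers the degree by n; the precise definitions are in Appendix D, which is not reproduced here.

```latex
% Degree count for h_{n,\rho}(A,B) on LM, under the assumptions stated above.
\[
\deg h_{n,\rho}(A, B)
  = \underbrace{n\,(m-2)}_{\text{product of } n \text{ forms of degree } m-2}
  \;-\; \underbrace{n}_{\text{dimension of the } n\text{-simplex}}
  = (m-3)\,n .
\]
% Examples: for m = 3 every h_{n,\rho} is a function on LM;
% for m = 5 and n = 2 one obtains a 4-form on LM.
```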



4. The Batalin–Vilkovisky Quantization Procedure for BF Theories

We now briefly review the BV formalism [3], though in a form already adapted to BF theories. For a general account on the formalism, see e.g. [33] and references therein. Let us consider all the fields of the theory, i.e. the connection one-form A (which we write $A_0 + a$, where $A_0$ is a given flat connection on P, and a is an element of $\Omega^1(M, \mathrm{ad}\,P)$), the tensorial (m − 2)-form B of adjoint type, the ghost c with values in $\Omega^0(M, \mathrm{ad}\,P)$ and the ghosts $\tau_j$, j ∈ {1, . . . , m − 2}, where $\tau_j$ takes values in $\Omega^{m-2-j}(M, \mathrm{ad}\,P)$ and has ghost number $\mathrm{gh}\, \tau_j = j$. We then associate to each field $\phi^\alpha$ a canonical “antifield”, denoted by $\phi^+_\alpha$, as follows: suppose that the field $\phi^\alpha$ has degree $\deg \phi^\alpha$ and ghost number $\mathrm{gh}\, \phi^\alpha$; then the antifield $\phi^+_\alpha$ is a form on M with values in ad P, whose degree is set to be equal to $m - \deg \phi^\alpha$ and whose ghost number is set to be $-1 - \mathrm{gh}\, \phi^\alpha$. The fundamental ingredients of the BV antibracket are the left and right partial derivatives of a functional F, which we are going to define precisely in the following subsection. To simplify the notations from now on we will denote all the fields and antifields collectively as “fields” and will use the symbols $\varphi_\alpha$, where α runs from 1 to (2m + 2); $\mathcal{M} := \{\varphi_\alpha\}_\alpha$.

4.1. Functional derivatives. We pick a commutative algebra A (usually, we take A = R or A = C, but see Remark 4.1). We are going to consider (formal) power series of local functionals in the fields taking values in A. We introduce a grading, which on monomials is defined as the sum of the ghost numbers of all the fields appearing there and which is then extended by linearity. We finally consider the graded commutative algebra S(A) generated by such objects. We then define the left and right functional derivatives of an element F in S(A) w.r.t. the field $\varphi_\alpha$ by

$$\frac{d}{dt}\Big|_{t=0} F(\varphi_\alpha + t\rho_\alpha) = \int_M \Bigl\langle \rho_\alpha , \frac{\overrightarrow{\partial} F}{\partial \varphi_\alpha} \Bigr\rangle = \int_M \Bigl\langle \frac{F \overleftarrow{\partial}}{\partial \varphi_\alpha} , \rho_\alpha \Bigr\rangle .$$

It follows from these definitions that the functional derivatives are in general distributional forms. For convenience of notations, however, we will denote the space of distributional forms with the same symbol $\Omega^*$ used for smooth forms since this causes no harm. So the functional derivatives of F in S(A) w.r.t. $\varphi_\alpha$ are elements of $\Omega^{p_\alpha}(M, \mathrm{ad}\,P \otimes A)$, with the property that

$$p_\alpha := \deg \frac{\overrightarrow{\partial} F}{\partial \varphi_\alpha} = \deg \frac{F \overleftarrow{\partial}}{\partial \varphi_\alpha} = m - \deg \varphi_\alpha. \tag{4.1}$$

As for the ghost number one has

$$g_\alpha := \mathrm{gh}\, \frac{\overrightarrow{\partial} F}{\partial \varphi_\alpha} = \mathrm{gh}\, \frac{F \overleftarrow{\partial}}{\partial \varphi_\alpha} = \mathrm{gh}\, F - \mathrm{gh}\, \varphi_\alpha. \tag{4.2}$$

From the definitions and the above introduced notations, we also obtain the following useful identities:

$$\frac{F \overleftarrow{\partial}}{\partial \varphi_\alpha} = (-1)^{p_\alpha \deg \varphi_\alpha + g_\alpha\, \mathrm{gh}\, \varphi_\alpha}\, \frac{\overrightarrow{\partial} F}{\partial \varphi_\alpha}. \tag{4.3}$$
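As a quick consistency check of (4.1) and (4.2), one may take F to be the classical action (3.1), which has ghost number 0 and depends only on B and on the connection. The sketch below leaves the overall sign of the derivative w.r.t. the connection unspecified, since it depends on conventions not fixed here; only degrees and ghost numbers are compared.

```latex
% Degrees and ghost numbers of the functional derivatives of S^{cl}.
\begin{align*}
\frac{\overrightarrow{\partial} S^{\mathrm{cl}}}{\partial B} &= F_A ,
  & \deg F_A &= 2 = m - \deg B ,
  & \mathrm{gh}\, F_A &= 0 = \mathrm{gh}\, S^{\mathrm{cl}} - \mathrm{gh}\, B , \\
\frac{\overrightarrow{\partial} S^{\mathrm{cl}}}{\partial a} &= \pm\, d_A B ,
  & \deg d_A B &= m - 1 = m - \deg a ,
  & \mathrm{gh}\, d_A B &= 0 = \mathrm{gh}\, S^{\mathrm{cl}} - \mathrm{gh}\, a .
\end{align*}
% Here \deg B = m - 2 and \deg a = 1, and B and a both have ghost number 0.
```

The two derivatives are exactly the left-hand sides of the equations of motion (3.2), as expected.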



Beside the manifold M, let us consider another (possibly infinite dimensional) manifold N (e.g., LM). In the following we will also consider (formal) power series of functionals taking values in $\Omega^*(N; E)$, for some associative algebra E (e.g., R, C, $U(\mathfrak{g})$ or End(V), for some $\mathfrak{g}$-module V). On this space we can introduce two gradings: the first is the ghost number which is defined as in the case of S(A); the second is simply the form degree on N. We denote by $S^*(N; E)$ the bigraded superalgebra generated by such functionals (this superalgebra is supercommutative iff E is). Let us notice, at last, that for E an A-module, S(A) is a subalgebra of $S^*(N; E)$. For the left (resp. right) derivative of a functional in $S^*(N; E)$, we use the canonical identification of $\Omega^p(M, \mathrm{ad}\,P \otimes \Omega^q(N; E))$ with $\Omega^{p,q}(M \times N, \mathrm{ad}\,P \boxtimes E)$ (respectively with $\Omega^{q,p}(N \times M, E \boxtimes \mathrm{ad}\,P)$). We next introduce the following notations:

$$\pi_1 : N \times M \to N, \quad (\tilde{x}, x) \mapsto \tilde{x}; \qquad \pi_2 : N \times M \to M, \quad (\tilde{x}, x) \mapsto x,$$

and

$$\tilde{\pi}_1 : M \times N \to M, \quad (x; \tilde{x}) \mapsto x; \qquad \tilde{\pi}_2 : M \times N \to N, \quad (x; \tilde{x}) \mapsto \tilde{x}.$$

We have used the following useful notation: Let $E \to M$ and $F \to N$ be vector bundles over M, resp. N. Then we define

$$E \boxtimes F := \tilde{\pi}_1^*(E) \otimes \tilde{\pi}_2^*(F), \quad \text{resp.} \quad F \boxtimes E := \pi_1^*(F) \otimes \pi_2^*(E);$$

it follows that they are vector bundles over M × N, resp. N × M. With these notations we can finally define the functional derivatives of F in $S^*(N; E)$:

$$\frac{d}{dt}\Big|_{t=0} F(\varphi_\alpha + t\rho_\alpha) = \tilde{\pi}_{2*} \Bigl\langle \tilde{\pi}_1^* \rho_\alpha , \frac{\overrightarrow{\partial} F}{\partial \varphi_\alpha} \Bigr\rangle = \pi_{1*} \Bigl\langle \frac{F \overleftarrow{\partial}}{\partial \varphi_\alpha} , \pi_2^* \rho_\alpha \Bigr\rangle .$$

The functional derivatives have now two different form degrees: one is the form degree w.r.t. M and is still given by (4.1); the other is the form degree w.r.t. N and remains equal to deg F. The ghost number is given by (4.2) as before.

4.2. The BV antibracket. We define the BV antibracket for two elements F, G in S(A) as the functional:

$$( F , G ) := \int_M \Bigl[ \Bigl\langle \frac{F \overleftarrow{\partial}}{\partial \phi^\alpha} , \frac{\overrightarrow{\partial} G}{\partial \phi^+_\alpha} \Bigr\rangle - (-1)^{(m+1) \deg \phi^\alpha} \Bigl\langle \frac{F \overleftarrow{\partial}}{\partial \phi^+_\alpha} , \frac{\overrightarrow{\partial} G}{\partial \phi^\alpha} \Bigr\rangle \Bigr] .$$
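Using (4.1) and (4.2), together with the degree and ghost number assigned to the antifields in Sect. 4, one can check that each term in the antibracket on S(A) is a top-degree form on M (so that it can be integrated) and carries ghost number gh F + gh G + 1. The count below is a sketch for the first term; the second term gives the same totals with the roles of $\phi^\alpha$ and $\phi^+_\alpha$ exchanged.

```latex
% Form degree and ghost number of the first term of the BV antibracket.
% Inputs: deg \phi^+_\alpha = m - deg \phi^\alpha,
%         gh  \phi^+_\alpha = -1 - gh \phi^\alpha.
\begin{align*}
\deg \frac{F \overleftarrow{\partial}}{\partial \phi^\alpha}
 + \deg \frac{\overrightarrow{\partial} G}{\partial \phi^+_\alpha}
 &= (m - \deg \phi^\alpha) + \bigl( m - (m - \deg \phi^\alpha) \bigr) = m , \\
\mathrm{gh}\, \frac{F \overleftarrow{\partial}}{\partial \phi^\alpha}
 + \mathrm{gh}\, \frac{\overrightarrow{\partial} G}{\partial \phi^+_\alpha}
 &= (\mathrm{gh}\, F - \mathrm{gh}\, \phi^\alpha)
  + \bigl( \mathrm{gh}\, G + 1 + \mathrm{gh}\, \phi^\alpha \bigr)
  = \mathrm{gh}\, F + \mathrm{gh}\, G + 1 .
\end{align*}
```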

We note that this functional is again in S(A), since we integrate over M and since the functional derivatives of an element of S(A) are once again power series; it is not difficult to see that the ghost number of the BV antibracket of two homogeneous elements F and G in S(A), with ghost numbers gh F and gh G, is given by gh F + gh G + 1. Next, we



define the BV antibracket for two functionals F and G in $S^*(N; E)$ by the formula:

$$( F , G ) := \pi_{13*} \Bigl\langle \pi_{12}^* \frac{F \overleftarrow{\partial}}{\partial \phi^\alpha} , \pi_{23}^* \frac{\overrightarrow{\partial} G}{\partial \phi^+_\alpha} \Bigr\rangle - (-1)^{\deg \phi^\alpha (m+1)}\, \pi_{13*} \Bigl\langle \pi_{12}^* \frac{F \overleftarrow{\partial}}{\partial \phi^+_\alpha} , \pi_{23}^* \frac{\overrightarrow{\partial} G}{\partial \phi^\alpha} \Bigr\rangle , \tag{4.4}$$

where we use the projections

$$\pi_{12} : N \times M \times N \to N \times M, \quad (n_1; m; n_2) \mapsto (n_1; m);$$
$$\pi_{23} : N \times M \times N \to M \times N, \quad (n_1; m; n_2) \mapsto (m; n_2);$$
$$\pi_{13} : N \times M \times N \to N \times N, \quad (n_1; m; n_2) \mapsto (n_1; n_2).$$

This formula needs some explanations. Let us suppose that F and G are homogeneous as elements of $\Omega^*(N; E)$, with degrees deg F, resp. deg G. We consider the case of a trivial algebra bundle E = N × E over N; in this case, the left functional derivatives are elements of $\Omega^{\deg F, p}(N \times M, E \boxtimes \mathrm{ad}\,P)$, while the right ones are elements of $\Omega^{q, \deg F}(M \times N, \mathrm{ad}\,P \boxtimes E)$. The product that we write in this case denotes two operations: the first consists in the usual wedge multiplication of the form parts, while the second is the multiplication in E of the algebra part. (We refer to the beginning of Appendix B for further details.) Therefore, in this special case, the BV antibracket of two homogeneous functionals F, G in $S^*(N; E)$ gives as a result a homogeneous element of $S^*(N; E)$, with degree in N equal to deg F + deg G and ghost number gh F + gh G + 1. We last define the BV antibracket for two special functionals, for we will often consider this case in the following: namely, we pick a functional F in S(A) and a functional G in $S^*(N; E)$, where E is an A-module:

$$( F , G ) := \tilde{\pi}_{2*} \Bigl\langle \tilde{\pi}_1^* \frac{F \overleftarrow{\partial}}{\partial \phi^\alpha} , \frac{\overrightarrow{\partial} G}{\partial \phi^+_\alpha} \Bigr\rangle - (-1)^{\deg \phi^\alpha (m+1)}\, \tilde{\pi}_{2*} \Bigl\langle \tilde{\pi}_1^* \frac{F \overleftarrow{\partial}}{\partial \phi^+_\alpha} , \frac{\overrightarrow{\partial} G}{\partial \phi^\alpha} \Bigr\rangle .$$

It is clear that in this case the BV antibracket of F and G is an element of $S^*(N; E)$. For homogeneous elements, the degree of the antibracket is equal to the degree of G, while gh ( F , G ) = gh F + gh G + 1.

4.3. Properties of the BV antibracket. We recall first, in a unified way, the ghost and degree properties of the antibracket. We denote by S the algebra of functionals (which according to the case may be S(A) or $S^*(N; E)$) and by $S^{p,g}$ the subspace of homogeneous functionals of form degree p and ghost number g (in the case of S(A), p is necessarily zero). Then the antibracket is a bilinear operator

$$( \ ,\ ) : S^{p,g} \otimes S^{p',g'} \to S^{p+p',\, g+g'+1}.$$

We list (without proofs) some useful identities for the BV antibracket. We begin with the graded commutativity

$$( F , G ) = -(-1)^{\deg F \deg G + (\mathrm{gh}\, F + 1)(\mathrm{gh}\, G + 1)}\, ( G , F ),$$



which holds whenever one of the two functionals is central. The next property is the graded Jacobi identity

$$(-1)^{\deg F \deg H + (\mathrm{gh}\, F + 1)(\mathrm{gh}\, H + 1)}\, ( F , ( G , H ) ) + \text{cyclic permutations} = 0,$$

which holds whenever two of the three functionals are central. The last property is the graded Leibnitz rule

$$( F , GH ) = ( F , G )\, H + (-1)^{\deg F \deg G + (\mathrm{gh}\, F + 1)\, \mathrm{gh}\, G}\, G\, ( F , H ),$$

which holds whenever F or G or both are central. In particular this holds in the following important cases: (i) all functionals are in S(A); (ii) all functionals are in $S^*(N; E)$ with E a commutative algebra; (iii) F or G or both are in S(A) and the remaining functional(s) are in $S^*(N; E)$ for E an A-module.

Remark 4.1. If we restrict ourselves to S(A), then the above properties hold on the whole algebra. An algebra with a bracket satisfying the above properties is known as a Gerstenhaber algebra [23].

The Leibnitz rule will play a key role in the following section, where we define the BV operator via the BV antibracket; the functional F will there be the BV action for the BF theory. Let us in fact suppose that we have a homogeneous local functional S in S(A) with even ghost number (usually, A = R and gh S = 0). We can then define the following operator on the superalgebra $S^*(N; E)$, with E an A-module:

$$\delta_S F := ( S , F ).$$

It follows easily from the A-linearity of the BV antibracket that $\delta_S$ is an A-linear operator on the algebra $S^*(N; E)$. The most important property of such an operator is an immediate consequence of the Leibnitz rule written above; namely,

$$\delta_S(FG) = (\delta_S F)\, G + (-1)^{\mathrm{gh}\, F}\, F\, (\delta_S G);$$

i.e. the operator $\delta_S$ is a graded (0, gh S + 1)-derivation on S. (If moreover ( S , S ) = 0, then the Jacobi identity implies $\delta_S^2 = 0$.) We now list some other useful properties of the derivation $\delta_S$.

Lemma 4.2. Suppose that the functional F lies in $S^*(N; E)$, and that we have a map h : H → N from some manifold H to the manifold N; then the following identity holds:

$$\delta_S [h^*(F)] = h^*(\delta_S F).$$
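Returning to the derivation property of the operator defined by the antibracket with S, stated just before Lemma 4.2: it can be obtained from the graded Leibnitz rule by a direct substitution. The sketch below uses only that S lies in S(A), so that its form degree is 0, and that gh S is even.

```latex
% Graded Leibnitz rule with (F, G, H) replaced by (S, F, G).
\begin{align*}
\delta_S(FG) = ( S , FG )
 &= ( S , F )\, G
  + (-1)^{\deg S \deg F + (\mathrm{gh}\, S + 1)\, \mathrm{gh}\, F}\, F\, ( S , G ) \\
 &= (\delta_S F)\, G + (-1)^{\mathrm{gh}\, F}\, F\, (\delta_S G) .
\end{align*}
% deg S = 0 removes the first term in the exponent; gh S even gives
% (gh S + 1) gh F \equiv gh F (mod 2).
```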
Lemma 4.3. Suppose that we have a functional F in S ∗ (H ; E), where H is the total space of a fiber bundle over N with typical fiber some manifold B (possibly with boundaries or corners) and projection π. The integration along the fiber of the functional F yields a functional in S ∗ (N; E) with degree deg π∗ (F ) = deg F − dim B, if we suppose additionally that F is homogeneous in the degree. Then we obtain the following identity: δS [π∗ (F )] = π∗ (δS F ). Lemma 4.4. Let us suppose that we have a functional F in S ∗ (N ; E), for some manifold N and some algebra E. Let us denote by d the exterior derivative on N . Then the following identity holds: δS (dF ) = d(δS F ).



Lemma 4.5. Let us suppose that the functional F belongs to the superalgebra $S^*(N; \mathfrak{g})$; let us suppose additionally that we have a $\mathfrak{g}$-module (V, ρ). The application of $\mathrm{Tr}_\rho$ to F gives an element of $S^*(N; \mathbb{R})$. Then we obtain the following identity:

$$\delta_S [\mathrm{Tr}_\rho(F)] = \mathrm{Tr}_\rho(\delta_S F).$$

We will only sketch a few ideas of the proofs of the above lemmata. For Lemma 4.2 we only have to write down explicitly the expressions for the two BV antibrackets, which in this special case involve the push-forward w.r.t. the projection $\bar{\pi}_1 : H \times M \to H$, resp. $\pi_1 : N \times M \to N$, and the pullbacks w.r.t. the maps $\bar{\pi}_2 : M \times H \to H$, resp. $\pi_2 : M \times N \to N$; these maps do appear in the definition of the partial functional derivatives of F. Then we have to consider the following commutative diagram:

$$\begin{array}{ccc}
M \times H & \xrightarrow{\ \mathrm{id} \times h\ } & M \times N \\
\downarrow {\scriptstyle \bar{\pi}_2} & & \downarrow {\scriptstyle \pi_2} \\
H & \xrightarrow{\ \ h\ \ } & N
\end{array}$$

It is easy to see that id × h induces an orientation preserving map (namely, the identity map) between the fibers $(\bar{\pi}_2)^{-1}(\{e\})$ ($\cong \{e\} \times M$) and $(\pi_2)^{-1}(h(e))$ ($\cong \{h(e)\} \times M$), for e ∈ H. From Lemma A.1 the claim follows.

For Lemma 4.3, we have to write down again explicitly the BV antibrackets on the two sides of the identity. In this case we use the following commutative diagram:

$$\begin{array}{ccc}
M \times H & \xrightarrow{\ \mathrm{id} \times \pi\ } & M \times N \\
\downarrow {\scriptstyle \tilde{\pi}_2} & & \downarrow {\scriptstyle \pi_2} \\
H & \xrightarrow{\ \ \pi\ \ } & N
\end{array}$$

The commutativity of this diagram implies that the composite bundles $H_{B \times M} = (M \times H;\ \pi \circ \tilde{\pi}_2;\ N;\ B \times M)$ and $H_{M \times B} = (M \times H;\ \pi_2 \circ (\mathrm{id} \times \pi);\ N;\ M \times B)$ possess the same total space and the same base space, but have different fibers; in fact, the fiber of the first is isomorphic to B × M, while the fiber of the second one is M × B. We can go from one bundle to the other via a bundle morphism which is the identity on the total and on the base space, but which reverses the orientation of the fibers, and we know that the orientation of a fiber bundle is induced by the orientations of the base space and of the fiber; this will imply the following identity:

$$\pi_* \circ \tilde{\pi}_{2*} = (-1)^{m \dim B}\, \pi_{2*} \circ (\mathrm{id} \times \pi)_* ,$$

and the coefficient $(-1)^{m \dim B}$ comes from the orientation reversal of the fibers of the two bundles (for the property of the push-forward, see Lemma A.2). This identity will imply the claim.

For Lemma 4.4 we simply apply the generalized Stokes’ theorem for the push-forward w.r.t. $\tilde{\pi}_2 : M \times N \to N$; notice that in this case the fiber, i.e. M, has no boundary. Then we have to remember that the exterior derivative $d_{M \times N}$ on M × N splits as $d_N + \sigma\, d_M$, where the sign σ is given by $\sigma = (-1)^{\deg_N(\omega)}$, for a form ω on M × N with degree over N equal to $\deg_N(\omega)$. We have to remember that, in the defining equation for the right functional derivative, the test form is independent of N, therefore the exterior derivative



on N applied to (the pullback w.r.t. $\tilde{\pi}_1$ of) the test form gives 0 as result; next, we know that the integrand form has maximal degree w.r.t. M, so that the exterior derivative w.r.t. M of the integrand gives 0. Then the result follows from all the above considerations.

Lemma 4.5 follows easily from the definition of the partial functional derivatives and from the fact that the trace $\mathrm{Tr}_\rho$ acts only on the End V-part of the tensor product (remember that the functionals we are considering take their values in End(V) for some $\mathfrak{g}$-module V).

4.4. The BV Laplacian. Let us temporarily choose a Riemannian metric on M and let us denote by $*$ the induced Hodge star operator. Let us pick a field $\phi^\alpha$; we denote by $\phi^*_\alpha$ the field (called sometimes the Hodge dual antifield of $\phi^\alpha$) defined by the formula

$$\phi^*_\alpha := *\, \phi^+_\alpha ,$$

where $\phi^+_\alpha$ is the antifield of $\phi^\alpha$. It follows easily from the definition that the degree of $\phi^*_\alpha$ is given by the degree of the field it is associated to, while its ghost number is given by $-1 - \mathrm{gh}\, \phi^\alpha$. Let α, β be two elements in $\Omega^p(M, \mathrm{ad}\,P)$. We define

$$\langle \alpha , \beta \rangle_{\mathrm{Hodge}} := \int_M \langle \alpha , * \beta \rangle .$$

Now we define a new type of functional derivatives. We begin with functionals in the space S(A). Let us once again denote collectively by $\varphi_\alpha$ the fields $\phi^\alpha$ and their newly defined antifields $\phi^*_\alpha$. Let $\rho_\alpha$ be a form with the same degree and ghost number as $\varphi_\alpha$. Let F be an element in S(A); we then define the Hodge functional derivatives of F by the formula

$$\frac{d}{dt}\Big|_{t=0} F(\varphi_\alpha + t\rho_\alpha) = \Bigl\langle \rho_\alpha , \frac{\overrightarrow{\delta} F}{\delta \varphi_\alpha} \Bigr\rangle_{\mathrm{Hodge}} = \Bigl\langle \frac{F \overleftarrow{\delta}}{\delta \varphi_\alpha} , \rho_\alpha \Bigr\rangle_{\mathrm{Hodge}} .$$

It follows from the definition that, for a homogeneous functional F, the Hodge functional derivatives w.r.t. $\varphi_\alpha$ lie in $\Omega^{\deg \varphi_\alpha}(M, \mathrm{ad}\,P)$ and possess ghost number equal to $\mathrm{gh}\, F - \mathrm{gh}\, \varphi_\alpha$. We have now at our disposal the essential elements to construct the BV Laplacian. We start defining the BV Laplacian of an element of S(A) by the formula

$$\Delta_{BV} F := \sum_\alpha (-1)^{\mathrm{gh}\, \phi^\alpha} \Bigl\langle \frac{\overrightarrow{\delta}}{\delta \phi^\alpha} , \frac{\overrightarrow{\delta} F}{\delta \phi^*_\alpha} \Bigr\rangle_{\mathrm{Hodge}} . \tag{4.5}$$

The result is again a functional in S(A), and, if F is homogeneous, then $\Delta_{BV} F$ is homogeneous of ghost number gh F + 1.

Remark 4.6. This definition can also be extended to functionals in the space $S^*(N; E)$ in analogy with the construction presented in the preceding subsection. For a homogeneous functional G in $S^*(N; E)$, $\Delta_{BV} G$ is again a functional in $S^*(N; E)$, whose ghost number is given by gh G + 1 and whose degree is unchanged.
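The shift of the ghost number by +1 under the BV Laplacian of (4.5), written here as $\Delta_{BV}$, can be traced term by term. The count below is a sketch that uses only the ghost number of the Hodge dual antifield and the rule that a Hodge functional derivative w.r.t. a field lowers the ghost number by the ghost number of that field.

```latex
% Ghost number bookkeeping for each term of the sum in (4.5).
\begin{align*}
\mathrm{gh}\, \phi^*_\alpha &= -1 - \mathrm{gh}\, \phi^\alpha , \\
\mathrm{gh}\, \frac{\overrightarrow{\delta} F}{\delta \phi^*_\alpha}
  &= \mathrm{gh}\, F - \mathrm{gh}\, \phi^*_\alpha
   = \mathrm{gh}\, F + 1 + \mathrm{gh}\, \phi^\alpha , \\
\mathrm{gh}\, \Delta_{BV} F
  &= \bigl( \mathrm{gh}\, F + 1 + \mathrm{gh}\, \phi^\alpha \bigr) - \mathrm{gh}\, \phi^\alpha
   = \mathrm{gh}\, F + 1 .
\end{align*}
% The second derivative, w.r.t. \phi^\alpha, removes the \phi^\alpha-dependent shift,
% so every term in the sum over \alpha has the same ghost number.
```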



Remark 4.7. Turning to a unified notation S, we have in general BV : S p,g → S p,g+1 . Notice however that BV is not well-defined for all functionals in S. It is particularly illdefined on local functionals. The correct definition would involve some regularization. We assume however that, independently of the regularization, BV F = 0 whenever F depends only on one element in each pair field–antifield, as the formal definition of BV suggests. The properties of the BV Laplacian BV are: • the BV Laplacian is a coboundary operator, i.e. BV 2 = 0; • the BV antibracket measures the failure of the BV Laplacian to be a derivation, i.e. BV (F G) = (BV F )G + (−1)gh F ( F , G ) + (−1)gh F F (BV G),

(4.6)

where one of the functionals must be central. The latter property in particular implies that the BV Laplacian is well-defined on the subalgebra generated by those local functionals which are killed by Δ_BV (e.g., those described in the previous remark).

Remark 4.8. If we restrict ourselves to S(A), then the above properties hold on the whole algebra. A Gerstenhaber algebra with an operator satisfying the above properties is known as a BV algebra.

Remark 4.9. We note that we can define (independently of the dimension) the BV antibracket by

\[
(F\,,G) := \left\langle \frac{F\overleftarrow{\delta}}{\delta\phi^{\alpha}} \,,\, \frac{\overrightarrow{\delta}G}{\delta\phi^{*}_{\alpha}} \right\rangle_{\mathrm{Hodge}}
- \left\langle \frac{F\overleftarrow{\delta}}{\delta\phi^{*}_{\alpha}} \,,\, \frac{\overrightarrow{\delta}G}{\delta\phi^{\alpha}} \right\rangle_{\mathrm{Hodge}}.
\]

This is the definition of the BV antibracket in its original setting [3]. This expression depends in general on the Riemannian metric on M, but in the case of BF theories the antibracket is actually independent thereof since it is equal to the one defined in Subsect. 4.2.

4.5. BV cohomology and observables. We have introduced the BV Laplacian in order to deal with the quantum version of the BV formalism, which is needed when considering functional integrals with weight exp((i/ħ)S), where S should be a solution of the quantum master equation (S , S) − 2iħ Δ_BV S = 0. The main consequence of the quantum master equation is that the operator

\[
\Omega_{BV} := \delta_{BV} - i\hbar\,\Delta_{BV}
\tag{4.7}
\]



is a coboundary operator of ghost number 1; it is not a differential, because of (4.6). This operator is fundamental in the BV formalism; namely, all the meaningful observables in the BV formalism lie in the 0-ghost number cohomology of Ω_BV. This means (at least formally) that the vacuum expectation values of Ω_BV-cohomology classes, weighted by exp((i/ħ)S), are independent of the choice of gauge fixing. In turn, the v.e.v.s of trivial Ω_BV-cohomology classes or of classes of ghost number different from zero vanish. We will show that the BV action S of BF theories, to be introduced in (6.1), satisfies separately the equations

\[
\Delta_{BV} S = 0 \qquad\text{and}\qquad (S\,,S) = 0,
\]

which imply that S satisfies the quantum master equation.

5. The BV Superformalism for BF Theories

The aim of this section is to define a new type of BV antibracket, which will allow us to obtain the BV action for BF theories in a simple way and to write it in a compact form. From now on we consider a new grading on the space of functionals S called the total degree, which is defined as the sum of the form degree and the ghost number; we will denote the total degree of a form α with degree deg α and ghost number gh α by |α| := deg α + gh α; by homogeneous we will mean homogeneous w.r.t. the total degree.

We note now that all the fields c^+, a^+, B, τ_1, …, τ_{m−2} have total degree m − 2, while all the remaining fields τ^+_{m−2}, …, τ^+_1, B^+, a, c have total degree equal to 1. Here a is the difference between A and a given background connection A0 as in Assumption 1; for notational consistency, we denote by a^+ the associated antifield. We can cast all the fields in two homogeneous superforms, which we will denote by B and A (set in sans serif on the left-hand sides of the two displays below, to distinguish them from their components):

\[
\mathsf{B} := \sum_{k=1}^{m-2} (-1)^{\frac{k(k-1)}{2}}\,\tau_k + B + (-1)^{m} a^{+} + c^{+},
\tag{5.1}
\]

\[
\mathsf{A} := (-1)^{m+1} c + A + (-1)^{m} B^{+} + \sum_{k=1}^{m-2} (-1)^{\frac{k(k-1)}{2}+m(k+1)}\,\tau_k^{+}.
\tag{5.2}
\]
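The degree bookkeeping behind (5.1) and (5.2) can be checked mechanically. The following is a minimal sketch, assuming the conventions stated in the text: c is a 0-form of ghost number 1, a a 1-form and B an (m − 2)-form of ghost number 0, τ_k an (m − 2 − k)-form of ghost number k, and the antifield of a p-form of ghost number g is an (m − p)-form of ghost number −1 − g. The function names are illustrative only and are not taken from the paper.

```python
def field_degrees(m):
    """Map each field of m-dimensional BF theory to (form degree, ghost number)."""
    fields = {"c": (0, 1), "a": (1, 0), "B": (m - 2, 0)}
    for k in range(1, m - 1):          # ghosts for ghosts tau_1 .. tau_{m-2}
        fields[f"tau_{k}"] = (m - 2 - k, k)
    return fields


def with_antifields(m):
    """Add the antifield of each field: form degree m - p, ghost number -1 - gh."""
    table = dict(field_degrees(m))
    for name, (deg, gh) in field_degrees(m).items():
        table[name + "^+"] = (m - deg, -1 - gh)
    return table


def superfield_total_degrees(m):
    """Return the sets of total degrees found among the components of B and of A."""
    table = with_antifields(m)
    taus = [f"tau_{k}" for k in range(1, m - 1)]
    in_B = ["c^+", "a^+", "B"] + taus
    in_A = ["c", "a", "B^+"] + [t + "^+" for t in taus]
    total = lambda name: sum(table[name])   # form degree + ghost number
    return {total(n) for n in in_B}, {total(n) for n in in_A}


if __name__ == "__main__":
    for m in range(2, 9):
        print(m, superfield_total_degrees(m))
```

For every m the first set should reduce to the single value m − 2 and the second to the single value 1, which is what makes the two superforms homogeneous.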

Further, we define a := A − A0. We refer to Appendix B for the definitions of the dot product · and of the dot Lie bracket [[ ; ]]. We only recall that the dot structures make the algebra S into a superalgebra w.r.t. the total degree. Analogously, we define the dot version ⟨ ; ⟩ of the bilinear form ⟨ , ⟩ on Ω∗(M, ad P) by ⟨α ; β⟩ := (−1)^{gh α deg β} ⟨α , β⟩. It satisfies ⟨β ; α⟩ = (−1)^{|α||β|} ⟨α ; β⟩.
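The symmetry rule for the dot pairing is a pure parity statement and can be verified by brute force. The sketch below assumes, as a labeled hypothesis, that the undotted pairing obeys ⟨β , α⟩ = (−1)^{deg α deg β + gh α gh β} ⟨α , β⟩ (the bigraded sign convention that also appears in the bracket rules of Subsect. 6.3); the helper names are hypothetical.

```python
from itertools import product


def sgn(n):
    """Return (-1)**n as an integer, also for negative n."""
    return -1 if n % 2 else 1


def dot_swap_sign(deg_a, gh_a, deg_b, gh_b):
    """Sign s such that <b ; a> = s <a ; b>, built from the plain pairing."""
    to_plain = sgn(gh_b * deg_a)                 # <b ; a> = (+-) <b , a>
    swap = sgn(deg_a * deg_b + gh_a * gh_b)      # ASSUMED rule for <b , a> -> <a , b>
    to_dot = sgn(gh_a * deg_b)                   # <a , b> = (+-) <a ; b>
    return to_plain * swap * to_dot


def total_degree_sign(deg_a, gh_a, deg_b, gh_b):
    """The sign (-1)^{|a||b|} claimed in the text, with |x| = deg x + gh x."""
    return sgn((deg_a + gh_a) * (deg_b + gh_b))


def all_agree(bound=4):
    """Compare both signs for form degrees 0..bound and ghost numbers -bound..bound."""
    degs = range(bound + 1)
    ghosts = range(-bound, bound + 1)
    return all(
        dot_swap_sign(da, ga, db, gb) == total_degree_sign(da, ga, db, gb)
        for da, ga, db, gb in product(degs, ghosts, degs, ghosts)
    )


if __name__ == "__main__":
    print(all_agree())
```

The point of the dot pairing is visible here: the two separate gradings of the plain pairing collapse into a single sign governed by the total degree.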



5.1. The space of functionals S_{A,B}. As in the previous section, we consider the algebra generated by local functionals in the fields taking values in a commutative algebra A or in a de Rham complex Ω∗(N; E). However, from now on we will restrict ourselves only to those functionals which depend on the linear combinations A and B (and not on the component fields). We will denote these algebras by S_{A,B}(A), resp. S_{A,B}(N; E), or generically by S_{A,B}. We give S_{A,B} a unique grading, by defining the degree of a monomial in the superfields A and B to be the sum of the total degrees of its factors. Since the superform a has total degree 1 and lies in Ω∗(M, ad P), we can consider A as a superconnection in the sense of Mathai and Quillen [30]. With the help of the dot Lie bracket (see Appendix B), we can then define the covariant derivative of B w.r.t. the superconnection A and the supercurvature F_A by:

\[
d_A B := d_{A_0} B + [[a\,;B]], \qquad F_A := d_{A_0} a + \tfrac{1}{2}\,[[a\,;a]].
\]

Notice that the supercurvature would contain the extra term F_{A0} if the background connection A0 were not chosen to be flat. Note that in this new context the exterior and covariant derivatives are operators of total degree 1.

5.1.1. The super functional derivatives. We begin by introducing the super test forms ρ_a and ρ_B: the super test form ρ_a is defined to be the sum of the test forms corresponding to the fields that appear in the superform a, with the same sign convention as in (5.2); analogously we define the super test form ρ_B. By definition, the super test forms then have total degree 1, resp. m − 2. We then define the super functional derivatives of an element F in S_{A,B}(A) by:

\[
\frac{d}{dt}\Big|_{t=0} F(a + t\rho_a; B) = \int_M \Big\langle \rho_a \,;\, \frac{\overrightarrow{\partial}F}{\partial a} \Big\rangle = \int_M \Big\langle \frac{F\overleftarrow{\partial}}{\partial a} \,;\, \rho_a \Big\rangle,
\]

\[
\frac{d}{dt}\Big|_{t=0} F(a; B + t\rho_B) = \int_M \Big\langle \rho_B \,;\, \frac{\overrightarrow{\partial}F}{\partial B} \Big\rangle = \int_M \Big\langle \frac{F\overleftarrow{\partial}}{\partial B} \,;\, \rho_B \Big\rangle.
\]

It is then easy to determine the total degree of the super functional derivatives of F; in fact, the following identities hold:

\[
\Big|\frac{\overrightarrow{\partial}F}{\partial a}\Big| = \Big|\frac{F\overleftarrow{\partial}}{\partial a}\Big| = |F| + m - 1, \qquad
\Big|\frac{\overrightarrow{\partial}F}{\partial B}\Big| = \Big|\frac{F\overleftarrow{\partial}}{\partial B}\Big| = |F| + 2.
\tag{5.3}
\]

It will also be useful to express the right derivative of the functional F in terms of the left one, and vice versa. The result of this computation is given by:

\[
\frac{\overrightarrow{\partial}F}{\partial a} = (-1)^{|F|+m-1}\,\frac{F\overleftarrow{\partial}}{\partial a}, \qquad
\frac{\overrightarrow{\partial}F}{\partial B} = (-1)^{|F|m}\,\frac{F\overleftarrow{\partial}}{\partial B}.
\]
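Both (5.3) and the left/right conversion signs follow from two counting rules, which the sketch below makes explicit. It assumes that the total degree is additive under the dot pairing, that integration over M lowers the total degree by m, and that the dot pairing is graded symmetric with sign (−1)^{|α||β|}; the function names are illustrative and not part of the paper.

```python
def sgn(n):
    """Return (-1)**n as an integer, also for negative n."""
    return -1 if n % 2 else 1


def derivative_degree(deg_F, deg_rho, m):
    """Total degree of dF/d(field), from |rho| + |dF| - m = |F|."""
    return deg_F + m - deg_rho


def left_from_right_sign(deg_F, deg_rho, m):
    """Sign relating left and right derivatives: swap rho and the derivative in < ; >."""
    return sgn(deg_rho * derivative_degree(deg_F, deg_rho, m))


def check_5_3(max_m=8, max_deg=6):
    """Check (5.3) and the conversion signs for |rho_a| = 1 and |rho_B| = m - 2."""
    for m in range(2, max_m + 1):
        for deg_F in range(-max_deg, max_deg + 1):
            if derivative_degree(deg_F, 1, m) != deg_F + m - 1:
                return False
            if derivative_degree(deg_F, m - 2, m) != deg_F + 2:
                return False
            if left_from_right_sign(deg_F, 1, m) != sgn(deg_F + m - 1):
                return False
            if left_from_right_sign(deg_F, m - 2, m) != sgn(deg_F * m):
                return False
    return True


if __name__ == "__main__":
    print(check_5_3())
```

The second conversion sign uses the fact that (m − 2)(|F| + 2) and |F| m have the same parity.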



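We next define the super functional derivatives of an element F of S_{A,B}(N; E) by:

\[
\frac{d}{dt}\Big|_{t=0} F(a + t\rho_a; B) = \tilde\pi_{2*}\Big\langle \tilde\pi_1^{*}\rho_a \,;\, \frac{\overrightarrow{\partial}F}{\partial a} \Big\rangle = \pi_{1*}\Big\langle \frac{F\overleftarrow{\partial}}{\partial a} \,;\, \pi_2^{*}\rho_a \Big\rangle,
\]

\[
\frac{d}{dt}\Big|_{t=0} F(a; B + t\rho_B) = \tilde\pi_{2*}\Big\langle \tilde\pi_1^{*}\rho_B \,;\, \frac{\overrightarrow{\partial}F}{\partial B} \Big\rangle = \pi_{1*}\Big\langle \frac{F\overleftarrow{\partial}}{\partial B} \,;\, \pi_2^{*}\rho_B \Big\rangle.
\]

Their total degrees are still given by (5.3).

5.1.2. The super BV antibracket. Let us pick two functionals F and G in S_{A,B}(A); then the super BV antibracket is defined by:

\[
((\,F\,;G\,)) := \int_M \left[ \Big\langle \frac{F\overleftarrow{\partial}}{\partial B} \,;\, \frac{\overrightarrow{\partial}G}{\partial a} \Big\rangle - (-1)^{m} \Big\langle \frac{F\overleftarrow{\partial}}{\partial a} \,;\, \frac{\overrightarrow{\partial}G}{\partial B} \Big\rangle \right].
\tag{5.4}
\]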

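Note that the BV antibracket of F and G is again a functional in S_{A,B}(A). Next we consider a functional F in S_{A,B}(A) and a functional G in S_{A,B}(N; E), with E an A-module; we define the BV antibracket of F and G by:

\[
((\,F\,;G\,)) = \tilde\pi_{2*}\Big\langle \tilde\pi_1^{*}\frac{F\overleftarrow{\partial}}{\partial B} \,;\, \frac{\overrightarrow{\partial}G}{\partial a} \Big\rangle - (-1)^{\dim M}\, \tilde\pi_{2*}\Big\langle \tilde\pi_1^{*}\frac{F\overleftarrow{\partial}}{\partial a} \,;\, \frac{\overrightarrow{\partial}G}{\partial B} \Big\rangle.
\tag{5.5}
\]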

In this case the BV antibracket of F and G is a functional in S_{A,B}(N; E). We finally define the BV antibracket of two functionals F and G in S_{A,B}(N; E) by:

\[
((\,F\,;G\,)) := \pi_{13*}\Big\langle \pi_{12}^{*}\frac{F\overleftarrow{\partial}}{\partial B} \,;\, \pi_{23}^{*}\frac{\overrightarrow{\partial}G}{\partial a} \Big\rangle - (-1)^{m}\, \pi_{13*}\Big\langle \pi_{12}^{*}\frac{F\overleftarrow{\partial}}{\partial a} \,;\, \pi_{23}^{*}\frac{\overrightarrow{\partial}G}{\partial B} \Big\rangle.
\]

In this case we obtain that ((F ; G)) is a functional in S_{A,B}(N; E). The antibracket, in all the above cases, has total degree 1; i.e., if we denote generically by S_{A,B} the space of functionals and by S^k_{A,B} the subspace of homogeneous elements of total degree k, then

\[
((\ ;\ )) : S^{k}_{A,B} \otimes S^{l}_{A,B} \to S^{k+l+1}_{A,B}.
\]
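The degree count and the sign structure of (5.4) can be checked with a few lines of parity arithmetic. The sketch below assumes (5.3), the left/right conversion rules of Subsect. 5.1.1 and the graded symmetry ⟨β ; α⟩ = (−1)^{|α||β|}⟨α ; β⟩; under these assumptions it rewrites each term of ((F ; G)) as a multiple of a term of ((G ; F)) and compares the result with the graded antisymmetry −(−1)^{(|F|+1)(|G|+1)}. All names are hypothetical.

```python
def sgn(n):
    """Return (-1)**n as an integer, also for negative n."""
    return -1 if n % 2 else 1


def bracket_degree(F, G, m):
    """Total degree of ((F;G)): |F d/dB| + |dG/da| - m."""
    return (F + 2) + (G + m - 1) - m


def swapped_coefficients(F, G, m):
    """Coefficients of <G d/da ; dF/dB> and <G d/dB ; dF/da> obtained from ((F;G))."""
    dFa, dFB = F + m - 1, F + 2
    dGa, dGB = G + m - 1, G + 2
    # term 1 of (5.4): + <F d/dB ; dG/da>
    t1 = sgn(dFB * dGa)          # swap the two entries of the dot pairing
    t1 *= sgn(G + m - 1)         # left derivative of G w.r.t. a -> right derivative
    t1 *= sgn(F * m)             # right derivative of F w.r.t. B -> left derivative
    # term 2 of (5.4): -(-1)^m <F d/da ; dG/dB>
    t2 = -sgn(m) * sgn(dFa * dGB)
    t2 *= sgn(G * m)             # left derivative of G w.r.t. B -> right derivative
    t2 *= sgn(F + m - 1)         # right derivative of F w.r.t. a -> left derivative
    return t1, t2


def expected_coefficients(F, G, m):
    """Same coefficients read off from -(-1)^{(|F|+1)(|G|+1)} ((G;F))."""
    prefactor = -sgn((F + 1) * (G + 1))
    return prefactor * (-sgn(m)), prefactor


def check(max_m=8, max_deg=5):
    """Compare degree and signs for all total degrees in a small range."""
    degs = range(-max_deg, max_deg + 1)
    return all(
        bracket_degree(F, G, m) == F + G + 1
        and swapped_coefficients(F, G, m) == expected_coefficients(F, G, m)
        for m in range(2, max_m + 1) for F in degs for G in degs
    )


if __name__ == "__main__":
    print(check())
```

This only tests the sign bookkeeping; it says nothing about centrality conditions or about the analytic meaning of the functional derivatives.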

From now on we will use the short notation given in (5.4) for all types of functionals that we have discussed until now, and we omit in all cases the specific notation, leaving to the reader the understanding of the real meaning of the formula.



5.2. Main properties of the super BV antibracket. One could now wonder if there is a relationship between the super BV antibracket defined in the previous subsection and the BV antibracket defined in Subsect. 4.2. We begin by explaining this relationship for the case of functionals in S_{A,B}(A).

Lemma 5.1. Suppose that we have two functionals F and G in S_{A,B}(A); then the following identity holds:

\[
((\,F\,;G\,)) = (F\,,G).
\tag{5.6}
\]

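Proof. We prove the identity for homogeneous functionals; the general case follows by linearity. We begin by computing the functional derivatives of F and G (on the left the arguments are the superforms, in the middle the component fields):

\[
\frac{d}{dt}\Big|_{t=0} F(a + t\rho_a; B) = \frac{d}{dt}\Big|_{t=0} F(a + t\rho_a; c + t\rho_c; \dots; B) = \int_M \Big\langle \frac{F\overleftarrow{\partial}}{\partial a} \,;\, \rho_a \Big\rangle.
\]

Next, we note that the integral selects the part of the integrand whose form degree in M is equal to m, and that the super test form ρ_a is the sum of the usual test forms (with some signs to be considered). We write ρ_a as

\[
\rho_a = \sum_{i=0}^{m} \sigma_a^{i}\,\rho_{a_i},
\]

where by ρ_{a_i} we denote the degree i part of ρ_a; i.e., ρ_{a_0} = ρ_c, ρ_{a_1} = ρ_a and so on. The signs σ_a^i are the same as in the definition (5.2) of A; namely, a = Σ_i σ_a^i a_i. Similarly we introduce signs σ_B^j as in B = Σ_j σ_B^j B_j. We can then write:

\[
\begin{aligned}
\int_M \Big\langle \frac{F\overleftarrow{\partial}}{\partial a} \,;\, \rho_a \Big\rangle
&= \int_M \Big\langle \frac{F\overleftarrow{\partial}}{\partial c} \,,\, \rho_c \Big\rangle + \int_M \Big\langle \frac{F\overleftarrow{\partial}}{\partial a} \,,\, \rho_a \Big\rangle + \dots \\
&= \sum_{j=0}^{m} \sigma_a^{j} \int_M \Big\langle \Big(\frac{F\overleftarrow{\partial}}{\partial a}\Big)_{m-j} \,;\, \rho_{a_j} \Big\rangle,
\end{aligned}
\tag{5.7}
\]

where the subscript denotes the restriction to the term of the indicated form degree. We recall that gh ρ_{a_j} = 1 − j; then we obtain e.g. for the j-th term in the last expression of the above identity (recalling the definition of the total degree of the functional derivative of F w.r.t. a):

\[
\int_M \Big\langle \Big(\frac{F\overleftarrow{\partial}}{\partial a}\Big)_{m-j} \,;\, \rho_{a_j} \Big\rangle = (-1)^{(|F|+j-1)j} \int_M \Big\langle \Big(\frac{F\overleftarrow{\partial}}{\partial a}\Big)_{m-j} \,,\, \rho_{a_j} \Big\rangle.
\]

By comparing the two expressions in (5.7), and doing similar computations in the other cases, we obtain for j = 0, …, m:

\[
\begin{aligned}
\Big(\frac{F\overleftarrow{\partial}}{\partial a}\Big)_{m-j} &= \sigma_a^{j}\,(-1)^{(|F|+j-1)j}\,\frac{F\overleftarrow{\partial}}{\partial a_j}, &
\Big(\frac{F\overleftarrow{\partial}}{\partial B}\Big)_{m-j} &= \sigma_B^{j}\,(-1)^{(|F|-m+2+j)j}\,\frac{F\overleftarrow{\partial}}{\partial B_j}, \\
\Big(\frac{\overrightarrow{\partial}F}{\partial a}\Big)_{m-j} &= \sigma_a^{j}\,(-1)^{(1-j)(m-j)}\,\frac{\overrightarrow{\partial}F}{\partial a_j}, &
\Big(\frac{\overrightarrow{\partial}F}{\partial B}\Big)_{m-j} &= \sigma_B^{j}\,(-1)^{(m-2-j)(m-j)}\,\frac{\overrightarrow{\partial}F}{\partial B_j}.
\end{aligned}
\]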

We then insert all these expressions into the definition of the super BV antibracket

\[
\Big\langle \frac{F\overleftarrow{\partial}}{\partial B} \,;\, \frac{\overrightarrow{\partial}G}{\partial a} \Big\rangle.
\]

Then we use the above expressions, and, after rewriting ⟨ ; ⟩ as ⟨ , ⟩, we compute the products σ_B^{m−j} σ_a^j, separately for the case m even and m odd. In order for the superbracket to coincide with the ordinary bracket, these products must be

\[
\sigma_B^{m-i}\sigma_a^{i} = \begin{cases} -1 & \text{if } i = 0, \\ 1 & \text{otherwise} \end{cases}
\]

for m even, and

\[
\sigma_B^{m-i}\sigma_a^{i} = \begin{cases} (-1)^{i} & \text{for } i = 0, 1, \\ (-1)^{i+1} & \text{otherwise} \end{cases}
\]

for m odd. It can be readily computed that the choice of signs made in (5.1) and in (5.2) is consistent with the above rules; the proof then follows.

For the general case of elements of S_{A,B}(N; E), the above rule must be slightly modified. We begin by noting that any homogeneous F of total degree |F| in this algebra can be written in the form

\[
F = \sum_{l} F_l,
\]

where F_l is an element of S^{l,|F|−l}(N; E). This is obtained by expanding the superfields in their components. We are now ready to state the following

Lemma 5.2. Let F and G be homogeneous elements of S_{A,B}(N; E). If we expand them according to the above rule

\[
F = \sum_{k} F_k \qquad\text{and}\qquad G = \sum_{l} G_l,
\]

then the following identity holds:

\[
((\,F\,;G\,)) = \sum_{k,l} (-1)^{(|F|-k+1)l}\,(F_k\,,G_l).
\tag{5.8}
\]

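Proof. The proof of this identity is similar to the proof of Lemma 5.1; in fact, we have to compute the functional derivatives of F and G w.r.t. a and B, and express them via the functional derivatives w.r.t. the usual fields of the theory. We therefore recall the formulae for the functional derivatives, and we apply them to F, obtaining:

\[
\begin{aligned}
\frac{d}{dt}\Big|_{t=0} F(a + t\rho_a; B)
&= \pi_{1*}\Big\langle \frac{F\overleftarrow{\partial}}{\partial a} \,;\, \pi_2^{*}\rho_a \Big\rangle
= \sum_{j=0}^{m} \sigma_a^{j}\, \pi_{1*}\Big\langle \Big(\frac{F\overleftarrow{\partial}}{\partial a}\Big)_{m-j} \,;\, \pi_2^{*}\rho_{a_j} \Big\rangle \\
&= \sum_{l} \frac{d}{dt}\Big|_{t=0} F_l(a + t\rho_a; c + t\rho_c; \dots)
= \sum_{l} \sum_{j=0}^{m} \frac{d}{dt}\Big|_{t=0} F_l(a_j + t\rho_{a_j}) \\
&= \sum_{l} \sum_{j=0}^{m} \pi_{1*}\Big\langle \frac{F_l\overleftarrow{\partial}}{\partial a_j} \,,\, \pi_2^{*}\rho_{a_j} \Big\rangle
= \sum_{j=0}^{m} \pi_{1*}\Big\langle \sum_{l} \frac{F_l\overleftarrow{\partial}}{\partial a_j} \,,\, \pi_2^{*}\rho_{a_j} \Big\rangle.
\end{aligned}
\tag{5.9}
\]

Then the following holds, if we go from the dot product to the ordinary product:

\[
(-1)^{(|F|-l-1+j)j}\, \Big\langle \frac{F_l\overleftarrow{\partial}}{\partial a_j} \,,\, \pi_2^{*}\rho_{a_j} \Big\rangle = \Big\langle \frac{F_l\overleftarrow{\partial}}{\partial a_j} \,;\, \pi_2^{*}\rho_{a_j} \Big\rangle.
\]

By comparing the terms in (5.9), and operating similarly for the other cases, we obtain the following identities for j = 0, …, m:

\[
\begin{aligned}
\Big(\frac{F\overleftarrow{\partial}}{\partial a}\Big)_{m-j} &= \sigma_a^{j} \sum_{l} (-1)^{(|F|-l-1+j)j}\, \frac{F_l\overleftarrow{\partial}}{\partial a_j}, &
\Big(\frac{F\overleftarrow{\partial}}{\partial B}\Big)_{m-j} &= \sigma_B^{j} \sum_{l} (-1)^{(|F|-l-m+j)j}\, \frac{F_l\overleftarrow{\partial}}{\partial B_j}, \\
\Big(\frac{\overrightarrow{\partial}F}{\partial a}\Big)_{m-j} &= \sigma_a^{j} \sum_{l} (-1)^{(1-j)(l-m+j)}\, \frac{\overrightarrow{\partial}F_l}{\partial a_j}, &
\Big(\frac{\overrightarrow{\partial}F}{\partial B}\Big)_{m-j} &= \sigma_B^{j} \sum_{l} (-1)^{(m-j)(l-m+j)}\, \frac{\overrightarrow{\partial}F_l}{\partial B_j}.
\end{aligned}
\]

We can finally insert all these expressions into the explicit formula for the super BV antibracket, and, by recalling the explicit values of the chosen signs σ_a^j and σ_B^j, we obtain the desired identity (recall the form degree selection rule imposed by the pushforwards).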
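The consistency of the sign choices in (5.1) and (5.2) with the product rules stated in the proof of Lemma 5.1 can be confirmed by direct enumeration. The sketch below assumes that the form-degree-i component of A is c for i = 0, the connection for i = 1, B^+ for i = 2 and τ^+_k for i = k + 2, and that the form-degree-j component of B is c^+ for j = m, a^+ for j = m − 1, B for j = m − 2 and τ_k for j = m − 2 − k; the function names are illustrative only.

```python
def sgn(n):
    """Return (-1)**n as an integer, also for negative n."""
    return -1 if n % 2 else 1


def sigma_a(i, m):
    """Sign of the form-degree-i component of the superfield A, read off from (5.2)."""
    if i == 0:
        return sgn(m + 1)             # ghost c
    if i == 1:
        return 1                      # connection
    if i == 2:
        return sgn(m)                 # antifield B^+
    k = i - 2                         # tau_k^+ has form degree k + 2
    return sgn(k * (k - 1) // 2 + m * (k + 1))


def sigma_B(j, m):
    """Sign of the form-degree-j component of the superfield B, read off from (5.1)."""
    if j == m:
        return 1                      # antifield c^+
    if j == m - 1:
        return sgn(m)                 # antifield a^+
    if j == m - 2:
        return 1                      # field B
    k = m - 2 - j                     # tau_k has form degree m - 2 - k
    return sgn(k * (k - 1) // 2)


def required_product(i, m):
    """Value that sigma_B^{m-i} sigma_a^i must take according to the proof of Lemma 5.1."""
    if m % 2 == 0:
        return -1 if i == 0 else 1
    return sgn(i) if i in (0, 1) else sgn(i + 1)


def signs_consistent(max_m=12):
    """Check the rule for all dimensions 2..max_m and all form degrees 0..m."""
    return all(
        sigma_B(m - i, m) * sigma_a(i, m) == required_product(i, m)
        for m in range(2, max_m + 1) for i in range(m + 1)
    )


if __name__ == "__main__":
    print(signs_consistent())
```

For the ghost-for-ghost components the factor (−1)^{k(k−1)/2} cancels in the product, so only (−1)^{m(k+1)} survives, which is why the rule depends only on the parity of m and of i.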



We note that in the case in which F is in S_{A,B}(A) and G is in S_{A,B}(N; E) for an A-module E, the following identity holds:

\[
((\,F\,;G\,)) = \sum_{l} (-1)^{(|F|+1)l}\,(F\,,G_l);
\tag{5.10}
\]

this formula will play a special rôle in some later computations (we skip the proof of this identity, because it is in principle the same as for the two previous lemmata). Let us now extend the super BV antibracket (( ; )) to the whole of S by the following rule

\[
((\,\alpha\,;\beta\,)) := (-1)^{(\operatorname{gh}\alpha+1)\deg\beta}\,(\alpha\,,\beta),
\]

with α and β homogeneous elements of S. Recalling the graded commutativity rule, the graded Leibniz rule and the graded Jacobi rule for ( , ), one can then show the following properties of the super BV antibracket (( ; )):
• ((α ; β)) = −(−1)^{(|α|+1)(|β|+1)} ((β ; α)), whenever one of the two elements is central in S.
• ((α ; βγ)) = ((α ; β)) γ + (−1)^{(|α|+1)|β|} β ((α ; γ)), whenever α or β or both are central in S.
• (−1)^{(|α|+1)(|γ|+1)} ((α ; ((β ; γ)))) + cyclic permutations = 0, whenever two of the three elements are central in S.

Here we have used the previous notational convention for the total degree. In particular if we restrict to S_{A,B}, by linearity the previous identities hold if we replace α, β and γ with elements F, G and H of S_{A,B}. As central elements in S_{A,B} we can take e.g. any functional F in S_{A,B}(A), while the more general elements are taken in S_{A,B}(N; E), for an A-module E. (We have omitted the products between elements in S_{A,B}, but it is understood that we are dealing with the shifted dot product.)

Let us now pick a central element S of S_{A,B} with even total degree; we then define an operator δ on the superspace S_{A,B} by δ := ((S ; )); since S has even total degree, δ is an odd derivation by the above identities. From Lemma 5.2, 4.3, 4.2, 4.4 and 4.5 we can derive the following useful properties of δ:

Corollary 5.3. Suppose that the functional F lies in S_{A,B}(N; E), and that we have a map h from some manifold H to the manifold N; then the following identity holds: δ[h∗(F)] = h∗(δF).

Corollary 5.4. Suppose that we have a homogeneous functional F in S_{A,B}(H; E), where E is a real or complex algebra and H is the total space of a bundle over N with typical fiber B. The integration along the fiber of the functional F gives a functional of the same type, defined on the manifold N and with total degree |π∗(F)| = |F| − dim B. Then we obtain the following identity: δ[π∗(F)] = (−1)^{dim B} π∗(δF).



Corollary 5.5. Let us pick a functional F in S_{A,B}(N; E), for N and E as in the preceding lemma. Let us denote by d the exterior derivative on the manifold N. Then the following identity holds: δ(dF) = −d(δF).

Corollary 5.6. Let us suppose that the functional F belongs to the superalgebra S_{A,B}(N, g); let us suppose additionally that we have a g-module V. The application of the trace to F gives an element of S_{A,B}(N; R) (or of S_{A,B}(N; C), depending on whether V is a real or complex module). Then we obtain the following identity: δ[Tr_ρ(F)] = Tr_ρ(δF).

5.3. The super BV Laplacian. In analogy with what we have done for the BV antibracket, let us introduce a “twisted” version of the BV Laplacian on the superalgebra S, endowed with the two usual gradings (the form degree and the ghost number). We define the super BV Laplacian by Δα := (−1)^{deg α} Δ_BV α, for α ∈ S. Since the BV Laplacian is nilpotent, it follows immediately that the super BV Laplacian is nilpotent, too. Let us take α and β in S, and let us suppose that at least one of the two elements is central in S. It follows then from (4.6) that

\[
\Delta(\alpha\cdot\beta) = (\Delta\alpha)\cdot\beta + (-1)^{|\alpha|}\,((\,\alpha\,;\beta\,)) + (-1)^{|\alpha|}\,\alpha\cdot(\Delta\beta),
\tag{5.11}
\]

where α or β must be central. Restricting to the superalgebra S_{A,B}, it follows easily that the super BV Laplacian is a coboundary operator

\[
\Delta : S^{k}_{A,B} \to S^{k+1}_{A,B}
\]

which satisfies (5.11) with α and β in S_{A,B}. The BV operator Ω_BV defined in (4.7) is replaced in the superformalism by the operator Ω = δ − iħΔ. As a consequence of the general case, Ω is a coboundary operator.

5.4. Twists. Let O be an even element of S_{A,B}. We define the twisted BV coboundary operator by

\[
\Omega_O = \exp\Big(-\frac{i}{\hbar}O\Big)\,\Omega\,\exp\Big(\frac{i}{\hbar}O\Big) = \Omega + \partial_O + \frac{i}{\hbar}\,\Sigma_O,
\]

with ∂_O := ((O ; )), and

\[
\Sigma_O = \Omega O + \frac{1}{2}\,((\,O\,;O\,))
\]

as a multiplication operator.

Definition 5.7. We call flat an even functional O with Σ_O = 0, flat observable an Ω-closed flat functional, and flat invariant observable a δ-closed flat observable.

A basic fact that we will need in Sects. 7 and 8 is expressed by the following

Lemma 5.8. If O is a flat observable, then so is λO for any constant λ; moreover, ∂_O is a superdifferential (of degree |O| + 1) which anticommutes with Ω. If we also assume that O is invariant, then ∂_O anticommutes with δ, so δ_λ := δ + λ∂_O is an odd differential for all λ.

Proof. By definition, a flat observable O satisfies separately

\[
\Omega O = 0 \qquad\text{and}\qquad ((\,O\,;O\,)) = 0.
\]

This implies that Σ_{λO} = 0 for all λ. The second equation above together with the Jacobi identity implies that ∂_O squares to zero, i.e., that it is a coboundary operator. Since Σ_O = 0, we obtain Ω∂_O + ∂_OΩ + ∂_O² = 0. The second claim then follows since ∂_O squares to zero. For O invariant, we also have ((S ; O)) = 0. So by Jacobi we obtain δ∂_O + ∂_Oδ = 0.

6. The BV Action for BF Theories

We have now at our disposal all the tools needed to write down the correct BV action for BF theories. Namely, we claim that it is given by

\[
S := \int_M \langle B \,;\, F_A \rangle.
\tag{6.1}
\]

Remark 6.1. Earlier versions of this form for the BV action of BF theories can be found in [34, 28], where however proofs were not given and, in particular, there was no explicit treatment of the sign conventions (i.e., our “dot” structures). Special cases (with explicit signs) were also considered in [10, 12]. In particular, the structure of the BV action in terms of superfields is in agreement with the general pattern described in [21]. See also [37] and [1] for the case of Chern–Simons theory.

This form of the BV action holds not only for the BF theories described in the previous sections but also for the “canonical BF theories” pointed out in Remark 3.1 (observe that the two-dimensional case has already been considered in [17]). We divide the proof, for the ordinary case, in two steps: (i) we show that the above functional is a solution of the master equation corresponding to the BF action (3.1) with infinitesimal symmetries (3.4) and (3.5) (Subsect. 6.1); (ii) we show that it is Δ_BV-closed (Subsect. 6.2). In Subsect. 6.3 we will then give the proof in the case of canonical BF theories.



6.1. The master equation. We begin with the statement of the main theorem, and we devote the rest of the section to its proof and to some important consequences.

Theorem 6.2. The following identity holds: ((S ; S)) = 0.

Remark 6.3. By Lemma 5.1, the above result is equivalent to the statement that the action S satisfies the usual master equation (ME) w.r.t. the usual BV antibracket.

Proof. We begin by computing the left and right partial derivatives w.r.t. a and B; e.g. the left partial derivative of S w.r.t. a is given by

\[
\frac{d}{dt}\Big|_{t=0} S(a + t\rho_a; B) = \int_M \langle B \,;\, d_A\rho_a \rangle = (-1)^{m-1} \int_M \langle d_A B \,;\, \rho_a \rangle = \int_M \langle \rho_a \,;\, d_A B \rangle.
\]

It follows that:

\[
\frac{S\overleftarrow{\partial}}{\partial a} = (-1)^{m-1}\,\frac{\overrightarrow{\partial}S}{\partial a} = (-1)^{m-1}\, d_A B;
\tag{6.2}
\]

similarly

\[
\frac{S\overleftarrow{\partial}}{\partial B} = \frac{\overrightarrow{\partial}S}{\partial B} = F_A.
\tag{6.3}
\]

If we now insert the above functional derivatives in the formula for the BV antibracket, we obtain

\[
((\,S\,;S\,)) = 2\int_M \langle F_A \,;\, d_A B \rangle = 2\int_M d\,\langle F_A \,;\, B \rangle - 2\int_M \langle d_A F_A \,;\, B \rangle,
\]

by the invariance of ⟨ , ⟩ (A is a superconnection). The first term vanishes by Stokes’ Theorem, and the second by the super Bianchi identity d_A F_A = 0. So the claim follows.

Since S satisfies the ME, the Leibniz rule and the Jacobi identity for the super BV antibracket imply that the operator δ := ((S ; )), defined on S_{A,B}(N; E), is an odd differential. In many of the forthcoming computations we need the following

Proposition 6.4. The action of δ on the superfields a and B is given by:

\[
\delta a = (-1)^{m} F_A
\tag{6.4}
\]

and

\[
\delta B = (-1)^{m}\, d_A B.
\tag{6.5}
\]



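Proof. The above formulae follow from (5.5). Let us begin with δa:

\[
\delta a(x) = \int_M \Big\langle \frac{S\overleftarrow{\partial}}{\partial B} \,;\, \frac{\overrightarrow{\partial}a}{\partial a}(x) \Big\rangle.
\]

By definition however

\[
\int_M \Big\langle \rho \,;\, \frac{\overrightarrow{\partial}a}{\partial a}(x) \Big\rangle = \frac{d}{dt}\Big|_{t=0} [a(x) + t\rho(x)] = \rho(x),
\]

provided ρ is a test form of total degree 1. Since S←∂/∂B has total degree 2, we cannot apply the above formula directly. We then use the following trick. Let ε be a scalar of total degree −1. Then

\[
\varepsilon\cdot\delta a(x) = (-1)^{m} \int_M \Big\langle \varepsilon\cdot\frac{S\overleftarrow{\partial}}{\partial B} \,;\, \frac{\overrightarrow{\partial}a}{\partial a}(x) \Big\rangle = (-1)^{m}\, \varepsilon\cdot\frac{S\overleftarrow{\partial}}{\partial B}(x).
\]

Thus,

\[
\delta a(x) = (-1)^{m}\, \frac{S\overleftarrow{\partial}}{\partial B}(x) = (-1)^{m} F_A(x),
\]

where we have used (6.3). Similarly, we have

\[
\delta B(x) = -\frac{S\overleftarrow{\partial}}{\partial a}(x) = (-1)^{m}\, d_A B,
\]

by (6.2).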

Recalling the formula (5.10) that expresses the super BV antibracket in terms of the usual BV antibracket, we can now recover the action of the usual δ_BV operator defined by (S , ). Namely,

\[
\delta a = \sum_{j=0}^{m} \sigma_a^{j}\,(-1)^{j}\,\delta_{BV}\, a_j.
\]

By decomposing the expression for δa in its homogeneous components and by comparing the two expressions, we get the action of the BV operator δ_BV on the fields. Similarly, we can recover the action of δ_BV on the components of B. By setting the antifields to zero, one then obtains that δ_BV on the fields {A, B, c, τ_1, …, τ_{m−2}} coincides with the BRST operator given in (3.4) and (3.5). Moreover, it follows easily from the definition of S and of the superforms a and B that the action reduces to the classical BF action, if we set all antifields to 0. Thus, we have proved the following

Theorem 6.5. S is a solution of the master equation for the BF theory.



6.2. The Δ_BV-closedness of the BV action. We now turn to the proof of the identity

\[
\Delta_{BV} S = \Delta S = 0.
\tag{6.6}
\]

First, we recall that g is endowed with a nondegenerate, invariant, symmetric bilinear form ⟨ , ⟩. We now choose a basis {X_k} of g such that ⟨X_i , X_j⟩ = s_i δ_ij, s_i = ±1; in this basis we have the structure constants f_ij^k given by the relation

\[
[X_i\,,X_j] = \sum_{k=1}^{\dim\mathfrak{g}} f_{ij}^{k} X_k.
\]

We then introduce the symbols f̃_ijk as s_k f_ij^k. Thus,

\[
\tilde f_{ijk} = \langle X_i \,,\, [X_j\,,X_k] \rangle.
\]

From the non-degeneracy of ⟨ , ⟩ one then gets the useful relation

\[
\tilde f_{ijk} = -\tilde f_{ikj} = -\tilde f_{kji}.
\tag{6.7}
\]
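The antisymmetry (6.7) can be illustrated on a concrete Lie algebra with an indefinite invariant form. The sketch below is only an example under stated assumptions: it takes g = sl(2, R) with the invariant form ⟨X , Y⟩ = ½ tr(XY) and a hand-picked orthogonal basis with signs s = (1, 1, −1); neither this algebra nor the helper names come from the paper.

```python
from fractions import Fraction
from itertools import product


def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[r][k] * y[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def bracket(x, y):
    """Matrix commutator [x, y]."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[r][c] - yx[r][c] for c in range(2)] for r in range(2)]


def pairing(x, y):
    """Invariant form <x, y> = tr(xy)/2 (an ASSUMED choice for this example)."""
    xy = matmul(x, y)
    return Fraction(xy[0][0] + xy[1][1], 2)


# Orthogonal basis of sl(2, R) with <X_i, X_j> = s_i delta_ij, s = (1, 1, -1).
BASIS = [
    [[1, 0], [0, -1]],
    [[0, 1], [1, 0]],
    [[0, 1], [-1, 0]],
]


def f_tilde(i, j, k):
    """Lowered structure constants <X_i, [X_j, X_k]>."""
    return pairing(BASIS[i], bracket(BASIS[j], BASIS[k]))


def is_totally_antisymmetric():
    """Check f_ijk = -f_ikj = -f_kji for all index triples."""
    return all(
        f_tilde(i, j, k) == -f_tilde(i, k, j) == -f_tilde(k, j, i)
        for i, j, k in product(range(3), repeat=3)
    )


if __name__ == "__main__":
    print([pairing(x, x) for x in BASIS], is_totally_antisymmetric())
```

In particular every f̃ with a repeated index vanishes, which is the property used below to conclude that no field component is contracted with its own antifield component.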

If we write the BV action as a sum of local terms in the fields, we see from the very definition of the BV Laplacian Δ_BV (see Remark 4.7) that the only terms in this sum which are not automatically 0 have the form ⟨φ^∗_α , [φ^α , c]⟩_Hodge, for all α in the index set of the fields (this is the only way to pair a field and its antifield that is allowed by the integration over M); we can rewrite it in the form (up to some sign)

\[
\big\langle \phi^{*}_{\alpha} \,,\, [\phi^{\alpha}\,,c] \big\rangle_{\mathrm{Hodge}} = \tilde f_{jik}\, \big( \phi^{*,i}_{\alpha} \,,\, \phi^{\alpha,j} c^{k} \big)_{\mathrm{Hodge}},
\tag{6.8}
\]

with

\[
(\alpha\,,\beta)_{\mathrm{Hodge}} := \int_M \alpha \wedge \star\beta, \qquad \alpha, \beta \in \Omega^{*}(M),
\tag{6.9}
\]

and φ^α = φ^{α,i} X_i and similarly for φ^∗_α and c. Now, by (6.7), one sees that in the above formula no field component is paired to the corresponding antifield component. So, by Remark 4.7, it is annihilated by the BV Laplacian.

6.3. Canonical BF theories. We start here a digression about the version of BF theories mentioned in Remark 3.1. The material covered in this subsection is not essential for the rest of the paper and can be safely skipped. However, this kind of BF theory is interesting by itself (and appears in two dimensions as a particular case of the Poisson sigma model [27, 31]). We recall now the basic idea: since the curvature F_A is a tensorial form of the adjoint type, the most natural way to define a BF theory is to choose B of the coadjoint type and to use the canonical pairing between g∗ and g. We consider then B as a form in Ω^{m−2}(M, ad∗ P). Observe that since we do not introduce a bilinear form on g anymore, Assumption 3 is in this case meaningless. For simplicity we will retain in this case as well Assumptions 1 and 2. We begin with some notations:



• By ⟨ , ⟩ we will denote in this subsection the canonical pairing between g∗ and g; it can be naturally extended to a pairing between forms in Ω^p(M, ad∗ P) and forms in Ω^q(M, ad P), and we will denote this pairing by the same symbol.
• By {X_i} we denote a basis of g, while by {X^j} we denote its dual basis: ⟨X^i , X_j⟩ = δ^i_j.
• By f_ij^k we denote the structure constants w.r.t. the basis {X_l}, i.e. f_ij^k = ⟨X^k , [X_i , X_j]⟩.
• By Ad∗ we denote the coadjoint action of G on g∗; i.e. ⟨Ad∗(g)ξ , X⟩ := ⟨ξ , Ad(g^{−1})X⟩.
• By ad∗ we denote the coadjoint action of g on g∗; i.e., ⟨ad∗(X)ξ , Y⟩ = −⟨ξ , ad(X)Y⟩. The coadjoint action can be extended to an action of forms in Ω^p(M, ad P) on Ω^q(M, ad∗ P) in the usual way. We only notice the sign rules for this extended coadjoint action

\[
\begin{aligned}
\operatorname{ad}^{*}([\alpha\,,\beta])\gamma &= \operatorname{ad}^{*}(\alpha)\operatorname{ad}^{*}(\beta)\gamma - (-1)^{\deg\alpha\deg\beta+\operatorname{gh}\alpha\operatorname{gh}\beta}\, \operatorname{ad}^{*}(\beta)\operatorname{ad}^{*}(\alpha)\gamma, \\
\langle \operatorname{ad}^{*}(\alpha)\gamma \,,\, \beta \rangle &= -(-1)^{\deg\alpha\deg\gamma+\operatorname{gh}\alpha\operatorname{gh}\gamma}\, \langle \gamma \,,\, [\alpha\,,\beta] \rangle,
\end{aligned}
\]

for α, β ∈ Ω∗(M, ad P) and γ ∈ Ω∗(M, ad∗ P), where we implicitly consider forms with an additional ghost number gradation.
• Finally, we denote (improperly) by d_A the covariant derivative acting on Ω∗(M, ad∗ P); it satisfies d⟨α , β⟩ = ⟨d_A α , β⟩ + (−1)^{deg α} ⟨α , d_A β⟩, where α ∈ Ω∗(M, ad∗ P) and β ∈ Ω∗(M, ad P), and d_A(d_A α) = ad∗(F_A)α.

With these conventions in mind, we define the canonical BF action functional by

\[
S^{\mathrm{can}} := \int_M \langle B \,,\, F_A \rangle.
\]

The Euler–Lagrange equations are still given by (3.2), where now the covariant derivative is understood to operate on Ω^{m−2}(M, ad∗ P). We let the group Ω^0(M, G) ⋉ Ω^{m−3}(M, ad∗ P) operate (from the right) on A × Ω^{m−2}(M, ad∗ P) by the rule (A, B)(g, τ) := (A^g, Ad∗(g^{−1})B + d_{A^g} τ). It is then easy to verify that S^can is invariant under this action. The infinitesimal transformations then read

\[
\delta A = d_A c; \qquad \delta B = -\operatorname{ad}^{*}(c)B + d_A\tau.
\]

These symmetries present the same reducibility problems as in Sect. 3; therefore, we have to resort to the BV formalism here as well.



6.3.1. The BRST and the BV formalism. The BRST transformations corresponding to the reducible infinitesimal symmetries in this case read

\[
\begin{aligned}
\delta_{BRST} A &= d_A c; & \delta_{BRST} B &= -\operatorname{ad}^{*}(c)B + d_A\tau_1; \\
\delta_{BRST} c &= -\tfrac{1}{2}[c\,,c]; & \delta_{BRST} \tau_k &= -\operatorname{ad}^{*}(c)\tau_k + d_A\tau_{k+1}, \quad k = 1, \dots, m-3; \\
& & \delta_{BRST} \tau_{m-2} &= -\operatorname{ad}^{*}(c)\tau_{m-2}.
\end{aligned}
\]

Here c denotes the Faddeev–Popov ghost, i.e. a form on the space of fields with values in Ω^0(M, ad P) with ghost number 1, and by τ_k we denote the ghosts for ghosts taking values in Ω^{m−2−k}(M, ad∗ P) and with ghost number k. These BRST transformations present the same problems as in Sect. 3, namely δ_BRST is a differential only modulo terms containing the curvature of A. We have therefore to switch to the BV formalism. We associate to each field φ^α ∈ {A, B, c, τ_1, …, τ_{m−2}} a canonical antifield φ^+_α following the rules
• if φ^α takes values in Ω^p(M, ad P), resp. Ω^p(M, ad∗ P), then its canonical antifield takes values in Ω^{m−p}(M, ad∗ P), resp. Ω^{m−p}(M, ad P);
• the ghost number of φ^+_α is set to be equal to −1 − gh φ^α.

We define the total degree of a form α with degree deg α and ghost number gh α by |α| := deg α + gh α. In accordance with what we have done in Sect. 5, we define the dot duality by the rule ⟨α ; β⟩ := (−1)^{gh α deg β} ⟨α , β⟩, for α an element of Ω∗(M, ad∗ P) with ghost number gh α and β in Ω∗(M, ad P) with form degree deg β. The dot Lie bracket [[ ; ]] is defined analogously as in Appendix B, and it enjoys the same sign rule. We define additionally the super coadjoint action of Ω∗(M, ad P) on Ω∗(M, ad∗ P) by the rule ad∗(α)β := (−1)^{gh α deg β} ad∗(α)β. Without proof we write down some useful formulae, which are analogous to the formulae displayed in Appendix B:

\[
\begin{aligned}
\operatorname{ad}^{*}([[\alpha\,;\beta]])\gamma &= \operatorname{ad}^{*}(\alpha)\operatorname{ad}^{*}(\beta)\gamma - (-1)^{|\alpha||\beta|}\, \operatorname{ad}^{*}(\beta)\operatorname{ad}^{*}(\alpha)\gamma, \\
\langle \operatorname{ad}^{*}(\alpha)\gamma \,;\, \beta \rangle &= -(-1)^{|\alpha||\gamma|}\, \langle \gamma \,;\, [[\alpha\,;\beta]] \rangle,
\end{aligned}
\]

for α, β ∈ Ω∗(M, ad P) and γ ∈ Ω∗(M, ad∗ P). If A is a connection on P, we also have d⟨γ ; α⟩ = ⟨d_A γ ; α⟩ + (−1)^{|γ|} ⟨γ ; d_A α⟩. Finally, it is useful to write the duality pairing also in the opposite order; as usual, one defines ⟨X , ξ⟩ = ⟨ξ , X⟩ for X ∈ g and ξ ∈ g∗. When we extend the pairing to forms and then consider the dot version, we obtain the rule ⟨γ ; α⟩ = (−1)^{|γ||α|} ⟨α ; γ⟩.



We then define the functional derivatives w.r.t. all the fields of the theory and the BV antibracket ( , ) by the same formulae as in Subsect. 4.1 (where the invariant, nondegenerate bilinear form ⟨ , ⟩ is replaced by the duality pairing). This antibracket enjoys all usual properties of a BV antibracket. Analogously, provided we have a solution S of the master equation (S , S) = 0, we define the BV differential δ by the rule δ := (S , ). This operator has the same properties as the previously introduced BV differential (see Subsect. 4.3). Finally, we define the Hodge duals of the fields, the Hodge functional derivatives and the BV Laplacian in this case by the same formulae as in Subsect. 4.4 (with the only difference that by ⟨ , ⟩ we mean here the duality pairing between g and g∗). We will denote all these objects by the same symbols as in the previous sections.

6.3.2. The BV superformalism and the BV action. We choose a background flat connection A0, and we write a general connection A as A = A0 + a, with a in Ω^1(M, ad P), with ghost number 0. We are now ready to define in this case the analogues of the superforms introduced in Sect. 5, namely

\[
\mathsf{B} := \sum_{k=1}^{m-2} (-1)^{\frac{k(k-1)}{2}}\,\tau_k + B + (-1)^{m} a^{+} + c^{+},
\]

\[
\mathsf{A} := (-1)^{m+1} c + A + (-1)^{m} B^{+} + \sum_{k=1}^{m-2} (-1)^{\frac{k(k-1)}{2}+m(k+1)}\,\tau_k^{+}.
\]

We also define a := A − A0. We notice that B is a superform of total degree m − 2 with values in ad∗ P, while A can be interpreted once again as a superconnection on M. It is not difficult to see that the curvature of the superconnection A is given by the formula

\[
F_A = d_{A_0} a + \tfrac{1}{2}\,[[a\,;a]].
\]

We go on, as in Subsect. 5.1, to define the functional derivatives w.r.t. B and a and the super BV antibracket; they enjoy the same properties as the previously introduced ones, and we will denote them by the same symbols. Finally, we claim that the BV action for the canonical BF theory on M is given by the formula

\[
S := \int_M \langle B \,;\, F_A \rangle.
\]

In order to prove the claim, we show once again separately that S satisfies the master equation and that it is (at least formally) Δ_BV-closed.

The master equation. The proof that S satisfies the master equation is analogous to the proof of the corresponding claim in Sect. 6.1; we therefore omit it. We will only write down the action of the super BV differential δ := ((S ; )) on the superfields a and B

\[
\delta a = (-1)^{m} F_A,
\tag{6.10}
\]

\[
\delta B = (-1)^{m} \big( d_{A_0} B + \operatorname{ad}^{*}(a)B \big);
\tag{6.11}
\]

the action of the usual BV differential on all the fields (fields and antifields) is encoded in the two previous equations and, upon switching off the antifields, gives back the BRST



operator defined at the beginning of Subsubsect. 6.3.1. It is also easy to verify that S reduces to S^can if we set the antifields to zero.

The Δ_BV-closedness of the BV action. The proof that S satisfies the equation Δ_BV S = 0 is a little bit different from the proof of the same identity in Sect. 6.2; it relies on a formal argument similar to that used in [17]. As noted before, the main property of the BV Laplacian lies in the fact that it contracts each field with the corresponding antifield at the same point (see Remark 4.7); therefore, the only terms in the BV action that are not trivially annihilated by the BV Laplacian are of the form

\[
\int_M \big\langle \phi^{+}_{\alpha} \,,\, [c\,,\phi^{\alpha}] \big\rangle,
\]

for some field φ^α. More precisely, they are (independently of the dimension of M) given by the combination

\[
I = \frac{1}{2}\,\big\langle c^{*} \,,\, [c\,,c] \big\rangle_{\mathrm{Hodge}} - \big\langle a^{*} \,,\, [c\,,a] \big\rangle_{\mathrm{Hodge}} - \big\langle B^{*} \,,\, \operatorname{ad}^{*}(c)B \big\rangle_{\mathrm{Hodge}} + \sum_{l=1}^{m-2} (-1)^{l+1} \big\langle \tau_l^{*} \,,\, \operatorname{ad}^{*}(c)\tau_l \big\rangle_{\mathrm{Hodge}}.
\]

This is obtained from the formula for the BV action after rewriting the dot duality, the super coadjoint action and the dot Lie bracket in terms of the usual ones, and recalling that the integral selects only the top form degree part of the integrand. W.r.t. the bases {X_i} and {X^j}, we can write a field φ^α with values in Ω∗(M, ad P), resp. in Ω∗(M, ad∗ P), as φ^α = φ^{α i} X_i, resp. φ^α = φ^α_j X^j. For any two real-valued forms on M with the same degree we define

\[
(\alpha\,,\beta)_{\mathrm{Hodge}} = \int_M \alpha \wedge \star\beta,
\]

where ⋆ is the Hodge star operator w.r.t. some chosen metric on M. We therefore obtain

\[
I = -\frac{1}{2}\, f_{jk}^{i}\, \big( c^{*}_{i} \,,\, c^{k} c^{j} \big)_{\mathrm{Hodge}} - f_{jk}^{i}\, \big( a^{*}_{i} \,,\, a^{k} c^{j} \big)_{\mathrm{Hodge}} + f_{ji}^{k}\, \big( (B^{*})^{i} \,,\, B_{k} c^{j} \big)_{\mathrm{Hodge}} + \sum_{l=1}^{m-2} f_{ji}^{k}\, \big( (\tau_l^{*})^{i} \,,\, (\tau_l)_{k} c^{j} \big)_{\mathrm{Hodge}};
\]

we have used here the identity ⟨X_i , ad∗(X_j)X^k⟩ = −⟨X^k , [X_j , X_i]⟩ = −f_ji^k. Finally, we apply the BV Laplacian to the above expression and get (independently of the dimension of M)



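\[
\begin{aligned}
\Delta_{BV} S = \Delta_{BV} I
&= C \int_M \mathrm{dvol}\; f_{ji}^{i}\, c^{j} \left[ \binom{m}{m} - \binom{m}{m-1} + \binom{m}{m-2} - \dots + (-1)^{l}\binom{m}{m-l} + \dots \right] \\
&= C \int_M \mathrm{dvol}\; f_{ji}^{i}\, c^{j}\, (1 - 1)^{m} = 0,
\end{aligned}
\]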

where dvol is the Riemannian volume element and C is an infinite constant (explicitly, a Dirac distribution evaluated in 0). The binomial coefficients appear as the number of components of the forms $\phi^\alpha_j$; e.g., $B_k$ is an (m − 2)-form on the m-dimensional manifold M, so it has $\binom{m}{m-2}$ components in local coordinates. The signs before the binomial coefficients are determined by the ghost numbers of the fields $\phi^\alpha$ (recall the explicit definition (4.5) of the BV Laplacian). Of course the previous computation should be performed with a regularization in order to avoid the infinite constant C. If the regularization is such that the above formal manipulations still hold, then S is BV harmonic.

6.4. Sigma-model interpretation. The BV action (6.1) can be seen as the action functional for a supersymmetric sigma model with source $\Sigma := \Pi TM$, where M is our original m-dimensional manifold and $\Pi$ indicates that the fiber has to be taken with reversed Grassmann parity; the target N has to be chosen among the following possibilities:

            ordinary BF     canonical BF
  m odd     Πg × Πg         Πg × Πg*
  m even    Πg × g          Πg × g*

where $\Pi$ again reverses the Grassmann parity. To encompass all cases, we will write $N = V_1 \times V_2$ with $V_1$ and $V_2$ as in the above table. The superfields a and B are then related to the 1 and 2 components of a map $f : \Sigma \to N$. Also recall that there is a pairing $\langle\ ,\ \rangle$ of $V_2$ with $V_1$ which is induced from the bilinear form of Assumption 3 resp. from the canonical pairing in the case of ordinary resp. canonical BF theories.

Following [1] one can give the BV bracket and the BV action for BF theories (to begin with in the case when the background connection $A_0$ is trivial) a beautiful interpretation in terms of a natural QP-structure on the space $\mathcal{E}$ of maps $\Sigma \to N$. Let us recall that a P-manifold is a supermanifold endowed with an odd non-degenerate closed 2-form (shortly, an odd symplectic form); a Q-manifold is a supermanifold endowed with an odd vector field Q that has vanishing Lie bracket with itself; finally, a QP-manifold is a supermanifold that has both structures in a compatible way, i.e., such that the odd symplectic form is Q-invariant. Notice that an odd symplectic form defines a BV bracket; moreover, an even solution of the master equation defines a compatible Q-structure and vice versa.

The P-structure on $\mathcal{E}$ is defined in terms of the following constant symplectic form on N:

$$\omega(v_1 \oplus v_2, w_1 \oplus w_2) := \langle v_2, w_1 \rangle - (-1)^m \langle v_1, w_2 \rangle, \qquad v_1 \oplus v_2,\ w_1 \oplus w_2 \in T_{(\xi_1,\xi_2)} N \cong V_1 \oplus V_2, \quad \forall (\xi_1, \xi_2) \in N.$$



Observe that ω is odd (even) when m is even (odd); i.e., ω defines an ordinary symplectic structure (though on an odd vector space) when m is odd and a P-structure when m is even. This induces the following constant odd symplectic form on $\mathcal{E}$:

$$\tilde\omega(\phi, \phi') := \int_\Sigma \omega(\phi, \phi')\, d\mu, \qquad \phi, \phi' \in T_f \mathcal{E} \cong \Gamma(\Sigma, f^* TN), \quad \forall f \in \mathcal{E}.$$

Here we have denoted by $\int_\Sigma d\mu$ the canonical density associated to the supermanifold $\Pi TM$. It is defined as follows: since $\Pi TM = (M, \Omega^*(M))$, every function on $\Pi TM$ can be identified with a sum of forms on M of all degrees, so there is a canonical density given by the usual integral of forms (which selects the top degree part of the integrand). Locally,

$$\int_{\Sigma|_U} d\mu = \int_U dx^1 \cdots dx^m,$$

where the x's are local coordinates on M.

Next we come to the Q-structure. Observe first that any flow on $\Sigma$ or on N defines (by composition on the right resp. on the left) a flow on $\mathcal{E}$ and that flows of the two types commute. Infinitesimally, any vector field on $\Sigma$ or on N defines a vector field on $\mathcal{E}$ and vector fields of the two types commute. Moreover, nilpotency is preserved. In conclusion, any Q-structures $Q_\Sigma$ on $\Sigma$ and $Q_N$ on N determine Q-structures $\tilde Q_\Sigma$ and $\tilde Q_N$ on $\mathcal{E}$; moreover, any linear combination of the two is still a Q-structure. On N we consider the Q-structure given by the Hamiltonian vector field associated to the function

$$\sigma(\xi_1, \xi_2) = \frac{1}{2} \langle \xi_2, [\xi_1, \xi_1] \rangle.$$

Observe that this function is odd (even) for m odd (even), so the corresponding vector field is always odd. The corresponding Q-structure on $\mathcal{E}$ yields the following action on the superfields a and B:

$$\delta_N a = \frac{1}{2} [[a ; a]], \qquad \delta_N B = \begin{cases} [[a ; B]] & \text{ordinary BF}, \\ \mathrm{ad}^*(a) B & \text{canonical BF}. \end{cases}$$

This Q-structure is automatically compatible with the P-structure defined above. On $\Sigma$ we consider instead the canonical Q-structure which in local coordinates reads

$$Q_\Sigma = \theta^i \frac{\partial}{\partial x^i}.$$

The induced Q-structure on $\mathcal{E}$ acts on the superfields by

$$\delta_\Sigma a = da, \qquad \delta_\Sigma B = dB.$$

This Q-structure is also compatible with the P-structure defined by $\tilde\omega$, as follows by an explicit computation: in fact, it is not difficult to check that the odd vector field $Q_\Sigma$ has zero divergence w.r.t. the density specified above. Since M has no boundary, this guarantees automatically that the P-structure on $\mathcal{E}$ is compatible with the Q-structure defined by $Q_\Sigma$.
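As a quick consistency check of the canonical Q-structure on the source, the following short computation may be helpful. It is only a sketch, under the assumption that the $\theta^i$ are the odd fiber coordinates of the parity-reversed tangent bundle (so they anticommute among themselves and are independent of the even coordinates $x^i$); the coefficient functions $f_{i_1 \dots i_p}$ are introduced here purely for illustration.

```latex
\begin{align*}
Q^2 &= \theta^i \frac{\partial}{\partial x^i}\Bigl(\theta^j \frac{\partial}{\partial x^j}\Bigr)
     = \theta^i \theta^j \frac{\partial^2}{\partial x^i\,\partial x^j} = 0
     && \text{($\theta^i\theta^j$ is antisymmetric, $\partial_i\partial_j$ is symmetric)},\\[4pt]
Q\bigl( f_{i_1 \dots i_p}(x)\, \theta^{i_1} \cdots \theta^{i_p} \bigr)
    &= \partial_j f_{i_1 \dots i_p}(x)\; \theta^{j}\, \theta^{i_1} \cdots \theta^{i_p}
     && \text{(under $\theta^i \leftrightarrow dx^i$ this is the de Rham differential)}.
\end{align*}
```

The first line is the nilpotency required of a Q-structure; the second is why the induced action on the superfields is the exterior derivative.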



Finally, one considers a linear combination with nonvanishing coefficients of the above vector fields. This yields an entire family of QP-structures on $\mathcal{E}$. Notice however that rescaling a with a parameter λ and B with 1/λ (λ ≠ 0) is a canonical transformation. So, up to equivalence, one can always set the coefficients to have the same ratio as in (6.4) and (6.5) (or (6.10) and (6.11) for canonical BF theories). Given the P-structure, there is a unique (up to an additive constant) action functional generating the given Q-structure. Choosing the additive constant appropriately, the action functional is then a multiple of our S in (6.1). Finally, the remaining multiplicative constant can be absorbed in $\hbar$ (or taken as a definition thereof).

In order to take into account nontrivial background connections (or even nontrivial bundles P → M), one has to modify the above construction slightly. First one has to introduce a vector bundle $E \to \Sigma$ with fiber N, with $\Sigma$ and N as above. If the original bundle P is trivial, so will be E (otherwise it will be constructed by using the transition functions of ad P and ad* P). The space $\mathcal{E}$ will now be the space of sections of E. The P- and $Q_N$-structures are introduced as above. The $Q_\Sigma$-structure instead requires the choice of a connection $A_0$ in order to lift to E the vector field on $\Sigma$; this connection has moreover to be flat to ensure the nilpotency of $\tilde Q_\Sigma$. The rest of the construction is the same as above.

6.5. Gauge fixing. We conclude this section by giving a brief account of the gauge fixing necessary to start a perturbative expansion of the theory. (For simplicity we restrict ourselves to ordinary BF theories, the modifications needed for the canonical ones being obvious.) The first step is to extend the space of fields by introducing antighosts and Lagrange multipliers. Along with the usual ghost c one introduces an antighost $\bar c$ (of ghost number −1) and a Lagrange multiplier λ (of ghost number 0); both are chosen to take values in the sections of ad P. Similarly, along with the ghost $\tau_1$ one introduces an antighost $\bar\tau_1$ and a Lagrange multiplier $\lambda_1$ with values in $\Omega^{m-3}(M, \mathrm{ad}\,P)$. As for the ghosts-for-ghosts $\tau_k$, one needs an entire family of k antighosts and k Lagrange multipliers ([4]). Namely, we denote by $\bar\tau_{k,i}$ and $\lambda_{k,i}$ (i = 1, ..., k) the antighosts and the Lagrange multipliers corresponding to $\tau_k$, all of which take values in $\Omega^{m-2-k}(M, \mathrm{ad}\,P)$. As for the ghost number, one sets

$$\mathrm{gh}(\bar\tau_{k,i}) = 2i - k - 2, \qquad \mathrm{gh}(\lambda_{k,i}) = 2i - k - 1.$$

We will denote by $\Phi$ the collection of the fields including the newly introduced ones. Next one has to consider antifields for the antighosts and the Lagrange multipliers. They will be denoted by $\bar c^+$, $\lambda^+$, $\bar\tau_1^+$, $\lambda_1^+$, $\bar\tau_{k,i}^+$ and $\lambda_{k,i}^+$ (k = 2, ..., m − 3; i = 1, ..., k) with the usual rule; i.e., each antifield takes values in the space of forms of complementary degree of the corresponding field and its ghost number is minus the ghost number of the corresponding field, minus one. We will denote by $\Phi^+$ the collection of all the antifields including the new ones. Finally, we extend the BV structure by declaring that each of the new antifields is BV-canonically conjugated to its field. The newly introduced fields are there only to write down a gauge-fixing fermion (see later). From the point of view of BV cohomology their complex must be trivial; i.e., one sets

$$\delta \bar\tau_{k,i} = \lambda_{k,i}, \qquad \delta \lambda_{k,i} = 0, \qquad k = 2, \dots, m-3;\ i = 1, \dots, k.$$
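The ghost-number bookkeeping above is easy to tabulate. The sketch below is illustrative only (the function names and the dictionary layout are not from the paper): it evaluates the two formulas for the antighosts and Lagrange multipliers, applies the stated rule gh(antifield) = −gh(field) − 1, and lets one check two consequences, namely that each multiplier sits exactly one ghost number above its antighost (as needed for δ to map one to the other) and that every product of an antighost antifield with its multiplier has ghost number zero.

```python
def gh_antighost(k: int, i: int) -> int:
    """Ghost number of the antighost tau-bar_{k,i}, i.e. 2i - k - 2."""
    return 2 * i - k - 2


def gh_multiplier(k: int, i: int) -> int:
    """Ghost number of the Lagrange multiplier lambda_{k,i}, i.e. 2i - k - 1."""
    return 2 * i - k - 1


def gh_antifield(gh_field: int) -> int:
    """Rule quoted in the text: minus the ghost number of the field, minus one."""
    return -gh_field - 1


def ghost_table(k_max: int) -> dict:
    """Tabulate ghost numbers for k = 1..k_max and i = 1..k (hypothetical layout)."""
    table = {}
    for k in range(1, k_max + 1):
        for i in range(1, k + 1):
            antighost = gh_antighost(k, i)
            multiplier = gh_multiplier(k, i)
            table[(k, i)] = {
                "antighost": antighost,
                "multiplier": multiplier,
                "antighost_antifield": gh_antifield(antighost),
                "multiplier_antifield": gh_antifield(multiplier),
            }
    return table


if __name__ == "__main__":
    for (k, i), row in ghost_table(3).items():
        print(k, i, row)
```

Evaluated at k = i = 1, the two formulas give the same pair of values that the text assigns to the antighost and multiplier accompanying c.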



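This is achieved by defining the extended BV action: $S^{\mathrm{ext}} = S + \Sigma$, with S given in (6.1) and

$$\Sigma = \int_M \left( -\bar c^+ \lambda - \bar\tau_1^+ \lambda_1 + \sum_{k=2}^{m-3} \sum_{i=1}^{k} (-1)^k\, \bar\tau_{k,i}^+ \lambda_{k,i} \right). \qquad (6.12)$$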

The gauge-fixed action, which is a function of $\Phi$ only, is then defined by

$$S^{\mathrm{g.f.}} = S^{\mathrm{ext}} \Big|_{\Phi^+ = \frac{\overrightarrow{\partial} G}{\partial \Phi}},$$

where G (the gauge-fixing fermion) is a function of $\Phi$ of ghost number −1 and has to be chosen so that the Hessian of $S^{\mathrm{g.f.}}$ at a critical point is not degenerate. In case one wants to expand around a given flat connection $A_0$, a suitable gauge-fixing fermion (in accordance with Assumption 1) is

$$G = \int_M \left( \bar c\, d_{A_0} {*a} + \bar\tau_1\, d_{A_0} {*B} + \sum_{k=1}^{m-4} \bar\tau_{k+1,1}\, d_{A_0} {*\tau_k} + \sum_{k=1}^{m-4} \sum_{i=2}^{k+1} \bar\tau_{k+1,i}\, d_{A_0} {*\bar\tau_{k,k+2-i}} \right), \qquad (6.13)$$

where $*$ is the Hodge star operator induced from a Riemannian metric on M. The BV formalism ensures in particular that the partition function and the expectation values of BV closed observables do not depend on the chosen metric.

6.6. Superpropagator. The perturbative expansion of the partition function or of the expectation values of observables is obtained in terms of propagators, i.e., expectation values of the fields w.r.t. the quadratic part of the action $S^{\mathrm{ext}}$. We will briefly describe this computation in the case of ordinary BF theories. Since the interaction terms and the observables that we will introduce in the next sections depend only on the superfields a and B, it is enough to compute the "superpropagator"

$$\eta = \langle \pi_1^* a\; \pi_2^* B \rangle_0 := \frac{1}{Z} \int_{\Phi^+ = \frac{\overrightarrow{\partial} G}{\partial \Phi}} \exp\left( \frac{i}{\hbar} \left[ \int_M \langle B ; d_{A_0} a \rangle + \Sigma \right] \right) \pi_1^* a\; \pi_2^* B,$$

where Z is the partition function, $A_0$ is the chosen background flat connection, $\Sigma$ is the extension defined in (6.12), and $\pi_1$ and $\pi_2$ are the projections from M × M to M. So η is a distributional (m − 1)-form on M × M with values in $\mathrm{ad}\,P \boxtimes \mathrm{ad}\,P$. This superpropagator with the gauge fixing (6.13) can be computed by generalizing Axelrod and Singer's construction [2] to higher dimensions. Another possibility is to compute the main properties of the superpropagator and then construct a form that satisfies them, generalizing the construction of [8]. The first property relies on the symmetry a ↔ B of the quadratic part of the action, $\int_M \langle B ; d_{A_0} a \rangle$. This implies

$$T^* \eta = (-1)^m \eta, \qquad (6.14)$$


where T is the automorphism of $\mathrm{ad}\,P \boxtimes \mathrm{ad}\,P$ that acts on the base by exchanging the points and at the same time exchanges the corresponding fibers (in a local trivialization, $T(x, x'; \xi, \xi') = (x', x; \xi', \xi)$, with $x, x' \in U \subset M$ and $\xi, \xi' \in \mathfrak{g}$). A subsequent computation shows that $i\hbar\, d_{A_0} \eta = (-1)^m \langle \delta_0 (\pi_1^* a\; \pi_2^* B) \rangle_0$, where $\delta_0$ is the linear part of δ. By the main properties of the BV formalism, one then gets the Ward identity $(-1)^m d_{A_0} \eta = \langle \Delta (\pi_1^* a\; \pi_2^* B) \rangle_0$. The right-hand side is a delta form localized on the diagonal Diag of M × M tensorized with the section φ of ad P ⊗ ad P → Diag determined by the invariant form $\langle\ ,\ \rangle$; that is, φ is the section corresponding to the constant equivariant map $\tilde\phi : P \to \mathfrak{g} \times \mathfrak{g}$, $p \mapsto \sum_i \sigma_i\, e_i \otimes e_i$, where $\{e_i\}$ is a pseudo-orthonormal basis of $\mathfrak{g}$: $\langle e_i, e_j \rangle = \sigma_i \delta_{ij}$, $\sigma_i = \pm 1$. Thus, if we define $\langle\ ,\ \rangle_{13}$ on $\mathfrak{g} \otimes \mathfrak{g} \otimes \mathfrak{g}$ as acting on the first and third components and define consequently $\langle\ ;\ \rangle_{13}$, we may interpret η as a distributional form such that the linear operator

$$P : \Omega^*(M, \mathrm{ad}\,P) \to \Omega^{*-1}(M, \mathrm{ad}\,P), \qquad P\gamma := \pi_{2*} \langle \eta ; \pi_1^* \gamma \rangle_{13}, \quad \gamma \in \Omega^*(M, \mathrm{ad}\,P),$$

satisfies the equation

$$d_{A_0} P + P\, d_{A_0} = \mathrm{id}. \qquad (6.15)$$
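Equation (6.15) says that P is a contracting homotopy for the covariant differential. A one-line consequence, spelled out here as a sketch (γ denotes any covariantly closed form with values in ad P), is that closed forms are exact:

```latex
\begin{equation*}
d_{A_0}\gamma = 0
\quad\Longrightarrow\quad
\gamma = \bigl(d_{A_0} P + P\, d_{A_0}\bigr)\gamma = d_{A_0}(P\gamma).
\end{equation*}
```

So (6.15) can hold on all forms only when the covariant cohomology of M with values in ad P vanishes, which is consistent with the way Assumption 1 is invoked in the proof of Theorem 6.6 to produce the form α.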

A regularized version of η consists in finding a smooth form (which we will continue to denote by η) on the configuration space $C_2(M) := M \times M \setminus \mathrm{Diag}$ so that P defined as above (with the obvious understanding that the projections $\pi_1$ and $\pi_2$ are now from $C_2(M)$ to M) satisfies the same equation. Notice however that $C_2(M)$ is not compact; so one has to replace it by its compactification $\overline{C}_2(M)$, which is obtained from M × M by replacing the diagonal with its differential-geometric blowup. Observe that $\overline{C}_2(M)$ is a manifold with boundary the spherical normal bundle SN Diag of Diag in M × M. Since we have removed the diagonal, we require now that the superpropagator should be an $A_0$-covariantly closed form $\eta \in \Omega^{m-1}(\overline{C}_2(M), \mathrm{ad}\,P \boxtimes \mathrm{ad}\,P)$, where, by abuse of notation, we have denoted by $\mathrm{ad}\,P \boxtimes \mathrm{ad}\,P$ the pulled-back bundle of $\mathrm{ad}\,P \boxtimes \mathrm{ad}\,P$ w.r.t. the projection $\overline{C}_2(M) \to M \times M$. Observe that the generalized Stokes formula implies that the left-hand side of (6.15) applied to a form γ is $\pi^\partial_* \langle \iota^* \eta ; \pi^{\partial *} \gamma \rangle_{13}$, with ι the inclusion of SN Diag in $\overline{C}_2(M)$ and $\pi^\partial$ the projection SN Diag → Diag. Thus, for (6.15) to hold, one has to require that the restriction of η to the boundary should be

$$\iota^* \eta = \vartheta \otimes \pi^{\partial *} \phi + \pi^{\partial *} \beta, \qquad (6.16)$$

where ϑ is the global angular form of SN Diag and $\beta \in \Omega^{m-1}(\mathrm{Diag}, \mathrm{ad}\,P \otimes \mathrm{ad}\,P)$ is so far undetermined. Recall that a global angular form ϑ on a sphere bundle $S \xrightarrow{\pi^\partial} B$ is a form satisfying $\pi^\partial_* \vartheta = 1$ and $d\vartheta = -\pi^{\partial *} e$, where e is a representative of the Euler class of the bundle. In our case, since N Diag ≅ TM, e will be a representative of the Euler class of M. The first property of ϑ is what we need for (6.15) to hold; the second property, of which one cannot dispose, implies $d_{A_0} \beta = e \otimes \phi$.



This is a very strong constraint in even dimensions, as it requires the Euler class of M to be trivial. In fact, multiply both sides of the equation by φ and contract the first $\mathfrak{g}$-component with the third and the second with the fourth; this yields $d \langle \phi ; \beta \rangle_{13,24} = e \dim \mathfrak{g}$. Notice finally that we also want η to satisfy (6.14), with T now the corresponding involution on $\mathrm{ad}\,P \boxtimes \mathrm{ad}\,P \to \overline{C}_2(M)$. In particular, this implies that one has to choose ϑ to be even (odd) under the antipodal map on the fibers if m is even (odd); in odd dimensions this also implies that one must choose e = 0. Moreover, β has to be an element of $\Omega^{m-1}(\mathrm{Diag}, S^2 \mathrm{ad}\,P)$ in even dimensions and of $\Omega^{m-1}(\mathrm{Diag}, \Lambda^2 \mathrm{ad}\,P)$ in odd dimensions. Such a form η can be obtained by generalizing the construction of [8]:

Theorem 6.6. Under Assumptions 1 and 3, there exists a covariantly closed element η of $\Omega^{m-1}(\overline{C}_2(M), \mathrm{ad}\,P \boxtimes \mathrm{ad}\,P)$ satisfying (6.14) and (6.16). Moreover, in odd dimensions β will be automatically covariantly closed, while in even dimensions (where the above Assumptions imply [e] = 0) this will be true only if one chooses e = 0. Finally, β may be chosen to vanish if $H^{m-1}_{d_{A_0}}(M, \Lambda^2 \mathrm{ad}\,P)$ is trivial in odd dimensions and if $H^{m-1}_{d_{A_0}}(M, S^2 \mathrm{ad}\,P)$ is trivial in even dimensions.

Proof. One first builds a global angular form ϑ on SN Diag with the correct behavior under the antipodal map on the fibers: one may construct it as in Appendix C using the Levi–Civita connection for a given Riemannian metric, which also allows one to identify SN Diag with the unit sphere bundle $SO\,\mathrm{Diag} \times_{SO(m)} S^{m-1}$. Next one extends ϑ to the complement of the zero section of N Diag and multiplies it by a function ρ that is identically one in a neighborhood $U_1$ of the zero section and identically zero outside a second neighborhood $U_2 \supset U_1$. One then defines $\eta_0 \in \Omega^{m-1}(\overline{C}_2(M), \mathrm{ad}\,P \boxtimes \mathrm{ad}\,P)$ as the extension by zero of $\rho\, \vartheta \otimes \pi^{\partial *} \phi$. Since $d_{A_0} \phi = 0$, $d_{A_0} \eta_0$ is the extension by zero of $d\rho\, \vartheta \otimes \pi^{\partial *} \phi - \rho\, \pi^{\partial *}(e \otimes \phi)$. The last form may be extended to the zero section of N Diag; hence, the extension by zero of $d_{A_0} \eta_0$ can be seen as a covariantly closed element of $\Omega^m(M \times M, \mathrm{ad}\,P \boxtimes \mathrm{ad}\,P)$. The general Künneth theorem implies $H^*_{d_{A_0}}(M \times M, \mathrm{ad}\,P \boxtimes \mathrm{ad}\,P) \cong H^*_{d_{A_0}}(M, \mathrm{ad}\,P)^{\otimes 2}$. So Assumption 1 implies that there is a form $\alpha \in \Omega^{m-1}(M \times M, \mathrm{ad}\,P \boxtimes \mathrm{ad}\,P)$ such that $d_{A_0} \pi^* \alpha = d_{A_0} \eta_0$, with π the projection $\overline{C}_2(M) \to M \times M$. Also observe that one may choose α to satisfy $T^* \alpha = (-1)^m \alpha$. Finally, define $\eta := \eta_0 - \pi^* \alpha$. An easy check shows that it satisfies all properties above (with β determined by the restriction of α to the diagonal).

Remark 6.7. There are a couple of interesting cases when M does not satisfy Assumption 1, but one can define the superpropagator anyway. First, when $M = \mathbb{R}^m$ (see Subsect. 9.1) it all boils down to looking for (the higher-dimensional generalization of) Bott and Taubes's [9] tautological forms, as described in [16]. Second, when M is a rational homology sphere, one can generalize the construction of [7] (which does not yield a closed η, so that extra diagrams must be introduced to correct for it) or alternatively remove one point, as suggested in [29], and essentially reduce to the previous case.

7. Generalized Wilson Loops in Odd Dimensions

In this section we display some observables for odd-dimensional BF theories which in some sense generalize the classical observables (3.6), i.e. the iterated-integral expansions of Wilson loops. In the first subsection we construct a flat invariant observable



(see Definition 5.7 in Subsect. 5.4) $S_3$ which represents a sort of "cosmological term" (although it does not have the correct ghost number, except for the case dim M = 3). We next define in Subsect. 7.2 a "generalized holonomy" constructed via iterated integrals by means of A and B, and we show that it defines a cohomology class w.r.t. the super BV coboundary operator twisted with $S_3$ which takes values in $H^*(\mathrm{Imb}_f(S^1, M))$. From this we then derive a true BV observable.

7.1. The "cosmological term". We define the local functional

$$S_3 := \frac{1}{6} \int_M \langle B ; [[B ; B]] \rangle,$$

which is an element of $S_{A,B}(\mathbb{R})$ of total degree 2m − 6. We want to show that $S_3$ is a flat invariant observable in the sense of Definition 5.7. This is expressed by the following

Lemma 7.1.

$$\delta S_3 = 0, \qquad (7.1)$$
$$\Delta S_3 = 0, \qquad (7.2)$$
$$(( S_3 ; S_3 )) = 0. \qquad (7.3)$$

Proof. First of all, we write down the left partial derivatives of $S_3$:

$$\frac{\overrightarrow{\partial} S_3}{\partial a} = 0, \qquad \frac{\overrightarrow{\partial} S_3}{\partial B} = \frac{1}{2} [[B ; B]].$$

With the help of (6.2) and by the definition of the super BV antibracket, we get

$$\delta S_3 = (( S ; S_3 )) = \frac{1}{2} \int_M \langle d_A B ; [[B ; B]] \rangle.$$

By the invariance of $\langle\ ,\ \rangle$ it follows that

$$\frac{1}{2} \int_M \langle d_A B ; [[B ; B]] \rangle = \frac{1}{6} \int_M d \langle B ; [[B ; B]] \rangle = 0$$

by Stokes' theorem. So we have proved (7.1). Equations (7.2) and (7.3) follow from the definitions of the super BV antibracket and of the super BV Laplacian and from the fact that $S_3$ depends only on B.

It follows from Lemma 5.8 that not only $S_3$ but any of its multiples is a flat observable. So we introduce the "cosmological constant" κ and consider a twisting by $\kappa^2 S_3$ (the reason for putting $\kappa^2$ instead of κ will be clear in the next subsection). We then define $\delta_{\kappa^2} := \delta + \kappa^2 (( S_3 ;\ ))$ and, again by Lemma 5.8, $\delta_{\kappa^2}$ is an odd differential for any κ. Its action on the fundamental superfields is easily computed:

$$\delta_{\kappa^2} a = -F_A - \frac{\kappa^2}{2} [[B ; B]], \qquad \delta_{\kappa^2} B = -d_A B. \qquad (7.4)$$



7.2. The generalized Wilson loop in the BV superformalism. We want to define an object that generalizes the observable introduced in [11] for the 3-dimensional BF theory with cosmological term. We shall realize this proposal by introducing the new superform $C_\kappa := a + \kappa B$. Observe that $C_\kappa$ is not a homogeneous element in $S_{A,B}(M, \mathrm{ad}\,P)$ w.r.t. the total degree, but it is homogeneous of degree one with respect to its reduction modulo 2. By recalling (7.4), it is easy to see that

$$\delta_{\kappa^2} C_\kappa = -d_{A_0} C_\kappa - \frac{1}{2} [[C_\kappa ; C_\kappa]].$$

The previous equation suggests that we may interpret the superform $C_\kappa$ as a "variation" of the flat connection $A_0$, and therefore $\delta_{\kappa^2} C_\kappa$ can be interpreted as its curvature. Observe that, since $C_\kappa$ is of odd degree, all the formulae of Appendix D are basically the same as if $C_\kappa$ were an ordinary variation of $A_0$. We then exploit this analogy to define the nth iterated integral of $C_\kappa$ as

$$\pi_{n*} \left( \widehat{C}_{\kappa\, 1,n} \cdots \widehat{C}_{\kappa\, n,n}\; H(A_0)|_0^1 \right).$$

We refer from now on to Appendix D for the main notations (simplices, evaluation maps, etc.). We recall the definition of $\widehat{C}_\kappa$: we have written

$$\widehat{C}_\kappa := H(A)|_0^\bullet\; \mathrm{ev}_1^* C_\kappa\; \left( H(A)|_0^\bullet \right)^{-1}, \qquad \widehat{C}_{\kappa\, i,n} := \pi_{i,n}^* \widehat{C}_\kappa.$$

We have suppressed ρ before all the $C_\kappa$'s in the above product; the forms considered in the nth iterated integral take values in the associative algebra End(V). We then define the generalized holonomy of $C_\kappa$ from 0 to 1 via the path-ordered exponential

$$\mathrm{Hol}(C_\kappa) := H(A_0)|_0^1 + \sum_{n \ge 1} \pi_{n*} \left( \widehat{C}_{\kappa\, 1,n} \cdots \widehat{C}_{\kappa\, n,n}\; H(A_0)|_0^1 \right);$$

it defines an element in $S_{A,B}(LM, \mathrm{End}(V))$, and since $\dim \triangle_n = n$, it follows that it has even total degree. We now pick a finite-dimensional representation ρ and define the generalized Wilson loop

$$H_\rho(\kappa; A, B) = \mathrm{Tr}_\rho\, \mathrm{Hol}(C_\kappa). \qquad (7.5)$$
From the previous considerations, it is an element of $S_{A,B}(LM, \mathbb{R})$, with even total degree. We are now ready to state the main theorem of this section.

Theorem 7.2. The generalized Wilson loop is $(\delta_{\kappa^2} + d)$-closed: $(\delta_{\kappa^2} + d) H_\rho(\kappa; A, B) = 0$.

Proof. By the above reasoning, we can consider $C_\kappa$ as a variation of the (flat) connection $A_0$. The cyclicity of the trace allows us to replace the exterior derivative by the covariant derivative $d_{\mathrm{ev}(0)^* A_0}$. $\mathrm{Hol}(C_\kappa)$ has the same form as $H(A + a)|_0^1$ of Appendix D, where we have set $A_0 = A$, and we have replaced a by $C_\kappa$ and the wedge product by the dot



product. According to the sign rules for the dot product and repeating almost verbatim the arguments used in the proof of Theorem D.3, we get

$$d H_\rho(\kappa; A, B) = \sum_{m \ge 1} \sum_{i=1}^{m} (-1)^{m+i}\; \mathrm{Tr}_\rho\, \pi_{m*} \left( \widehat{C}_{\kappa\, 1,m} \cdots \widehat{(\delta_{\kappa^2} C_\kappa)}_{i,m} \cdots \widehat{C}_{\kappa\, m,m}\; H(A_0)|_0^1 \right).$$

Recalling Lemma 5.4, 5.3 and 5.6 and the Leibnitz rule, it is then not difficult to verify that

$$\delta_{\kappa^2} H_\rho(\kappa; A, B) = \sum_{m \ge 1} \sum_{i=1}^{m} (-1)^{m+i+1}\; \mathrm{Tr}_\rho\, \pi_{m*} \left( \widehat{C}_{\kappa\, 1,m} \cdots \widehat{(\delta_{\kappa^2} C_\kappa)}_{i,m} \cdots \widehat{C}_{\kappa\, m,m}\; H(A_0)|_0^1 \right),$$

which yields the desired identity.

We would like a stronger assertion than what we proved in the above theorem; namely, that $H_\rho$ is $(-i\hbar\Delta + \delta_{\kappa^2} + d)$-closed. So we need $\Delta H_\rho(\kappa; A, B) = 0$. If a loop has transversal self-intersections, the above identity is certainly false since on the two intersecting strands complementary components of a field and its antifield appear. If the loop has non-transversal intersections or cusps, it is not even clear what the action of the BV Laplacian should be. However, even restricting to imbeddings might not be enough since in the computation of the BV Laplacian there are ill-defined terms coming from subsequent fields in the iterated integrals as the evaluation points come together. To establish the validity of the above identity, we can choose the following

Regularization procedure. We only consider elements of $\mathrm{Imb}_f(S^1, M)$, the space of framed imbeddings of $S^1$ into M. For each element we then consider a tubular neighborhood of the imbedding and use the framing to select a companion imbedding on the boundary. Finally we put each component of A appearing in the iterated integrals on the imbedding and each component of B on its companion (following a procedure introduced in [14]).

Since the cosmological term is a flat invariant observable we then obtain, under the above assumption, the following

Corollary 7.3.

$$(\Omega + d) \left[ \exp\left( \frac{i}{\hbar} \kappa^2 S_3 \right) H_\rho(\kappa; A, B) \right] = 0.$$

As a consequence, the d-cohomology class of the above functional is a BV observable. This implies Theorem 2 of [16], which states that the above functional defines an $H^*(\mathrm{Imb}_f(S^1, M))$-valued BV observable.



Remark 7.4. We notice finally that the v.e.v.s of the generalized Wilson loops together with the cubic cosmological term do not depend on the representative of the flat connection $A_0$. Let in fact $g \in G$ be a gauge transformation viewed as a section of Ad P. Then one can verify that $H(A_0)|_{t_1}^{t_2}$ is sent to $g^{-1}(\gamma(t_1))\, H(A_0)|_{t_1}^{t_2}\, g(\gamma(t_2))$ (see Remark D.1 for the precise definitions in the general case). This implies that the superfields a and B in the generalized Wilson loops are acted upon by $\mathrm{Ad}_g$ (this is a consequence of the definition of the generalized Wilson loops and of the cyclicity of the trace). This can be compensated by a change of variables in the functional integral, whose formal measure is constructed using the bilinear form $\langle\ ,\ \rangle$ and is hence formally Ad-invariant. Therefore, the v.e.v.s of the generalized Wilson loops are functions on the moduli space of flat connections.

8. Other Loop Observables

We now generalize the ideas of the previous section along two directions: (i) consider variations of the connection which are not necessarily of odd degree; (ii) introduce interaction terms with higher powers of B. Both generalizations require the following

Assumption 4. Throughout this section we work with a Lie algebra $\mathfrak{g}$ coming from an associative algebra endowed with a trace Tr (e.g., we may take $\mathfrak{g} = \mathfrak{gl}(N)$ with the usual trace of matrices). Furthermore, we define the ad-invariant symmetric bilinear form $\langle \eta, \xi \rangle$ on $\mathfrak{g}$ by $\mathrm{Tr}\, \eta\xi$ and assume that it is nondegenerate (as required by Assumption 3). Finally, we will only consider representations ρ of $\mathfrak{g}$ as an associative algebra.

8.1. The odd-dimensional case.

8.1.1. Higher-order B-interactions. We define, for $k \in \mathbb{N}$, the following even element of $S_{A,B}$:

$$O_{2k+1} = \frac{1}{2k+1} \int_M \mathrm{Tr}\, B^{2k+1}.$$

Observe that even powers of B would vanish by the cyclicity of the trace.

Lemma 8.1. The following identities hold for the functional $O_{2k+1}$:

$$\delta O_{2k+1} = 0, \quad \forall k \in \mathbb{N}, \qquad (8.1)$$
$$\Delta O_{2k+1} = 0, \quad \forall k \in \mathbb{N}, \qquad (8.2)$$
$$(( O_{2k+1} ; O_{2l+1} )) = 0, \quad \forall k, l \in \mathbb{N}. \qquad (8.3)$$

It follows in particular that, $\forall k \in \mathbb{N}$, the functional $O_{2k+1}$ is a flat invariant observable (see Subsect. 5.4).

Proof. From the definition of the super BV antibracket, we get

$$(( S ; O_{2k+1} )) = \int_M \mathrm{Tr} \left( d_A B \cdot B^{2k} \right) = \frac{1}{2k+1} \int_M d\, \mathrm{Tr}\, B^{2k+1} = 0.$$

Equations (8.2) and (8.3) follow respectively from the definition of the BV Laplacian and of the super BV antibracket, and from the fact that the functionals $O_{2k+1}$ do not depend on a.
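The remark before Lemma 8.1, that even powers of B drop out of the trace, can be made explicit in one line. This is a sketch assuming that the trace is graded cyclic for the dot product, $\mathrm{Tr}(x \cdot y) = (-1)^{|x||y|}\, \mathrm{Tr}(y \cdot x)$ with $|\cdot|$ the total degree, and that B has odd total degree m − 2 for odd m (consistent with the total degree 2m − 6 quoted for the cubic term in Subsect. 7.1).

```latex
\begin{equation*}
\operatorname{Tr} B^{2k}
  = \operatorname{Tr}\bigl(B \cdot B^{2k-1}\bigr)
  = (-1)^{|B|\,|B^{2k-1}|}\, \operatorname{Tr}\bigl(B^{2k-1} \cdot B\bigr)
  = -\operatorname{Tr} B^{2k}
  \quad\Longrightarrow\quad
  \operatorname{Tr} B^{2k} = 0 .
\end{equation*}
```

Both $B$ and $B^{2k-1}$ are odd here, so the sign is −1; for odd powers the second factor is even and no constraint arises.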



Let us now choose $n \in \mathbb{N}$ and a sequence of real numbers $\mu = \mu(\lambda) = \{\mu_2, \mu_4, \dots, \mu_{4n+2}\}$. Then we define the following even element of $S_{A,B}(\mathbb{R})$:

$$O_\mu := \sum_{k=1}^{2n+1} \mu_{2k}\, O_{2k+1}.$$

From the lemma it follows that $O_\mu$ is a flat invariant observable for any µ. So, as in Subsect. 5.4, we can introduce the following odd differential:

$$\delta_\mu = \delta + \sum_{k=1}^{2n+1} \mu_{2k}\, \partial_{O_{2k+1}}.$$

Its action on the fundamental superfields is easily computed:

$$\delta_\mu a = -F_A - \sum_{k=1}^{2n+1} \mu_{2k}\, B^{2k}, \qquad \delta_\mu B = -d_A B. \qquad (8.4)$$

8.1.2. Extended generalized Wilson loops. Let $\lambda := \{\lambda_1, \lambda_3, \dots, \lambda_{2n+1}\}$ be another sequence of real numbers with the same n as above. We then define the odd superform

$$C_\lambda := a + \sum_{k=0}^{n} \lambda_{2k+1}\, B^{2k+1}.$$

If the sequences µ and λ are related by

$$\mu_{2k} := \sum_{\substack{0 \le i,j \le n \\ i+j = k-1}} \lambda_{2i+1}\, \lambda_{2j+1}, \qquad (8.5)$$

then (8.4) implies

$$\delta_\mu C_\lambda = -d_{A_0} C_\lambda - \frac{1}{2} [[C_\lambda ; C_\lambda]].$$

The above expression has again the form of a curvature; we can therefore view the superform $C_\lambda$ as a variation of the connection $A_0$. So, analogously to what we did in Subsect. 7.2, we define the path-ordered integral

$$\mathrm{Hol}(C_\lambda) = H(A_0)|_0^1 + \sum_{m \ge 1} \pi_{m*} \left( \widehat{C}_{\lambda\, 1,m} \cdots \widehat{C}_{\lambda\, m,m}\; H(A_0)|_0^1 \right).$$

We next define accordingly $H_\rho(\lambda; A, B) := \mathrm{Tr}_\rho\, \mathrm{Hol}(C_\lambda)$. Repeating the arguments used in the proof of Theorem 7.2, we can state the following



Theorem 8.2. If µ and λ are related by (8.5), then $(\delta_\mu + d) H_\rho(\lambda; A, B) = 0$.

Since $O_\mu$ is a flat invariant observable, this implies the following

Corollary 8.3. With the same hypothesis as above and with the regularization procedure defined before Cor. 7.3, we obtain

$$(\Omega + d) \left[ \exp\left( \frac{i}{\hbar} O_\mu \right) H_\rho(\lambda; A, B) \right] = 0.$$

Again this implies that the d-cohomology class of the above functional is a BV observable; from this Theorem 4 of [16] follows.

Remark 8.4. The reasoning sketched in Remark 7.4 holds in this case as well; therefore, we may conclude that the v.e.v.s of the generalized Wilson loops with higher-order B-interactions depend only on the class $[A_0]$ in $\{A \in \mathcal{A} : F_A = 0\} / \mathcal{G}$.
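Relation (8.5) between the two sequences of couplings says that the $\mu_{2k}$ are the coefficients of the square of the odd polynomial $\sum_i \lambda_{2i+1} x^{2i+1}$. The sketch below checks this numerically; the function names and the list encoding `lam_odd[i] = lambda_{2i+1}` are illustrative choices, not notation from the paper.

```python
def mu_from_lambda_odd(lam_odd):
    """Evaluate (8.5): mu_{2k} = sum over i + j = k - 1 of lambda_{2i+1} * lambda_{2j+1}.

    lam_odd[i] holds lambda_{2i+1} for i = 0..n; the result maps 2k -> mu_{2k}
    for k = 1..2n+1.
    """
    n = len(lam_odd) - 1
    mu = {}
    for k in range(1, 2 * n + 2):
        mu[2 * k] = sum(
            lam_odd[i] * lam_odd[k - 1 - i]
            for i in range(n + 1)
            if 0 <= k - 1 - i <= n
        )
    return mu


def square_of_odd_polynomial(lam_odd):
    """Coefficients of (sum_i lambda_{2i+1} x^(2i+1))^2 as a map degree -> coefficient."""
    coeffs = {}
    for i, li in enumerate(lam_odd):
        for j, lj in enumerate(lam_odd):
            degree = (2 * i + 1) + (2 * j + 1)
            coeffs[degree] = coeffs.get(degree, 0) + li * lj
    return coeffs


if __name__ == "__main__":
    lam = [1, 2, 3]  # lambda_1, lambda_3, lambda_5, so n = 2
    print(mu_from_lambda_odd(lam))
    print(square_of_odd_polynomial(lam))
```

With a single coupling (n = 0) the relation reduces to $\mu_2 = \lambda_1^2$, the same quadratic dependence as the $\kappa^2$ twisting of Sect. 7.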

8.2. The even-dimensional case. We now turn to the problem of defining generalized Wilson loop observables for the case dim M even. Observe that in even-dimensional BF theories B has even total degree; so [[B ; B]] = 0. This implies that it is not possible to define a generalized cosmological term as in Sect. 7, because we can no longer rely on the dot Lie bracket to construct this functional. Therefore, in order to define products of B with itself (either cubic or not) we must do as in the preceding subsection and, in particular, we need Assumption 4.

8.2.1. B-interactions. For a given k > 1 we define the following even element of $S_{A,B}$:

$$O_k = \frac{1}{k} \int_M \mathrm{Tr}\, B^k.$$

We now state the following

Lemma 8.5. The functionals $O_k$ satisfy the identities

$$\delta O_k = 0, \quad \forall k > 1, \qquad (8.6)$$
$$\Delta O_k = 0, \quad \forall k > 1, \qquad (8.7)$$
$$(( O_k ; O_l )) = 0, \quad \forall k, l > 1. \qquad (8.8)$$

Proof. By definition of the super BV antibracket, we can write

$$\delta\, \frac{1}{k} \int_M \mathrm{Tr}\, B^k = \int_M \mathrm{Tr} \left( d_A B \cdot B^{k-1} \right) = \frac{1}{k} \int_M d\, \mathrm{Tr}\, B^k = 0.$$

Equations (8.7) and (8.8) are consequences of the fact that the $O_k$s do not depend on a and of the definitions of the super BV antibracket and of the super BV Laplacian.



Again it follows that each linear combination of $O_k$s is a flat invariant observable (see Subsect. 5.4). So, for a given positive integer n, we take a sequence of real numbers $\mu := \{\mu_2, \mu_3, \dots, \mu_{2n}\}$ and define

$$O_\mu = \sum_{i=2}^{2n} \mu_i\, O_{i+1}.$$

Therefore, Lemma 5.8 implies that

$$\delta_\mu := \delta + \partial_\mu := \delta + \sum_{i=2}^{2n} \mu_i\, (( O_{i+1} ;\ ))$$

is an odd differential for any sequence µ. Moreover we have

$$\delta_\mu a = F_A + \sum_{i=2}^{2n} \mu_i\, B^i, \qquad \delta_\mu B = d_A B, \qquad (8.9)$$

using once again arguments similar to those introduced in the proof of (6.4) and (6.5).

8.2.2. The generalized Wilson loop. For the same n as above, we consider a sequence $\lambda = \{\lambda_1, \dots, \lambda_n\}$ so that

$$\mu_i := \sum_{\substack{1 \le k,l \le n \\ k+l = i}} \lambda_k\, \lambda_l \qquad (8.10)$$

for the previously introduced sequence µ. Then we define

$$B_\lambda = \sum_{i=1}^{n} \lambda_i\, B^i.$$

From (8.9) it follows that

$$\partial_\mu a = B_\lambda \cdot B_\lambda, \qquad \partial_\mu B_\lambda = 0. \qquad (8.11)$$
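The step from (8.9) and (8.10) to (8.11) is a short resummation. As a sketch: setting µ = 0 in (8.9) gives the action of δ alone, so the twisting part of the differential acts on a by the polynomial in B and annihilates B; inserting (8.10) then factorizes the polynomial.

```latex
\begin{align*}
\partial_\mu a
  &= \sum_{i=2}^{2n} \mu_i\, B^{i}
   = \sum_{i=2}^{2n} \;\sum_{\substack{1 \le k,l \le n \\ k+l=i}} \lambda_k \lambda_l\, B^{k+l}
   = \Bigl(\sum_{k=1}^{n} \lambda_k B^{k}\Bigr) \cdot \Bigl(\sum_{l=1}^{n} \lambda_l B^{l}\Bigr)
   = B_\lambda \cdot B_\lambda, \\[4pt]
\partial_\mu B &= 0
  \quad\Longrightarrow\quad
  \partial_\mu B_\lambda = \sum_{i=1}^{n} \lambda_i\, \partial_\mu\bigl(B^{i}\bigr) = 0
  \qquad \text{(by the Leibnitz rule)}.
\end{align*}
```

For instance, with n = 2 one gets $\mu_2 = \lambda_1^2$, $\mu_3 = 2\lambda_1\lambda_2$ and $\mu_4 = \lambda_2^2$, the coefficients of $(\lambda_1 B + \lambda_2 B^2)^2$.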

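Then, in analogy with the superform $\widehat{C}_\lambda$ of the previous subsection, we define

$$\widehat{B}_\lambda := H(A_0)|_0^\bullet\; \mathrm{ev}_1^* B_\lambda\; \left( H(A_0)|_0^\bullet \right)^{-1}, \qquad \widehat{a} := H(A_0)|_0^\bullet\; \mathrm{ev}_1^* a\; \left( H(A_0)|_0^\bullet \right)^{-1},$$

and, in accordance with the notations of Appendix B.3, $\widehat{B}_{\lambda\, i,n}$ and $\widehat{a}_{i,n}$, which we will write as $\widehat{B}_{\lambda\, t_i}$ and $\widehat{a}_{t_i}$. We then define

$$h_{m,\rho}(\lambda; A, B) := \mathrm{Tr}_\rho\, \pi_{m*} \left( H(\widehat{a})|_0^{t_1} \cdot \widehat{B}_{\lambda\, t_1} \cdot H(\widehat{a})|_{t_1}^{t_2} \cdots \widehat{B}_{\lambda\, t_m} \cdot H(\widehat{a})|_{t_m}^{1}\; H(A_0)|_0^1 \right),$$

where we have written

$$H(\widehat{a})|_0^{t_1} := \pi_{1,m}^* H(A_0 + \widehat{a})|_0^\bullet; \qquad H(\widehat{a})|_{t_m}^{1} := \pi_{m,m}^* H(A_0 + \widehat{a})|_\bullet^1; \qquad H(\widehat{a})|_{t_i}^{t_{i+1}} := \pi_{i,i+1,m}^* H(A_0 + \widehat{a})|_\bullet^\bullet,$$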



using the notations of Remark D.2, where we have set again $A_0 = A$, and we have replaced a by $\widehat{a}$ and wedge products by dot products; $\pi_{i,i+1,m}(\gamma; t_1, \dots, t_m) := (\gamma; t_i, t_{i+1})$, for $i \in \{1, \dots, m-1\}$. We finally define

$$H^o_\rho(\lambda; A, B) = \sum_{m=0}^{\infty} h_{2m+1,\rho}(\lambda; A, B).$$

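We can now state the main theorem of this subsection.

Theorem 8.6. The following identity holds: $(d - \delta_\mu) H^o_\rho(\lambda; A, B) = 0$.

Proof. We begin by computing the exterior derivative of one of the terms of the above sum. With the help of the generalized Stokes Theorem we obtain

$$d h_{2m+1,\rho}(\lambda; A, B) = \mathrm{Tr}_\rho \left[ (-1)^{2m+1}\, \pi_{2m+1\,*}\, d_{\pi_{2m+1}^* \mathrm{ev}(0)^* A_0} \left( H(\widehat{a})|_0^{t_1} \cdot \widehat{B}_{\lambda\, t_1} \cdots \right) \right] + \mathrm{Tr}_\rho \left[ (-1)^{2m}\, \pi^{\partial}_{2m+1\,*} \left( H(\widehat{a})|_0^{t_1} \cdot \widehat{B}_{\lambda\, t_1} \cdots \right) \right]. \qquad (8.12)$$

We consider the first term on the right-hand side of (8.12); the Leibnitz rule for the dot product implies

$$d_{\pi_{2m+1}^* \mathrm{ev}(0)^* A_0} \left( H(\widehat{a})|_0^{t_1} \cdot \widehat{B}_{\lambda\, t_1} \cdots \right) = \sum_{i=1}^{2m+1} H(\widehat{a})|_0^{t_1} \cdot \widehat{B}_{\lambda\, t_1} \cdots d_{\pi_{2m+1}^* \mathrm{ev}(0)^* A_0} \left( H(\widehat{a})|_{t_{i-1}}^{t_i} \cdot \widehat{B}_{\lambda\, t_i} \right) \cdots + H(\widehat{a})|_0^{t_1} \cdots \widehat{B}_{\lambda\, t_{2m+1}} \cdot d_{\pi_{2m+1}^* \mathrm{ev}(0)^* A_0} H(\widehat{a})|_{t_{2m+1}}^{1}.$$

We recall that $d_{\mathrm{ev}(0)^* A_0} H(A_0)|_0^1 = 0$ by (D.2). We compute the following expression:

$$d_{\pi_{2m+1}^* \mathrm{ev}(0)^* A_0} \left( H(\widehat{a})|_{t_{i-1}}^{t_i} \cdot \widehat{B}_{\lambda\, t_i} \right) = d_{\pi_{2m+1}^* \mathrm{ev}(0)^* A_0} H(\widehat{a})|_{t_{i-1}}^{t_i} \cdot \widehat{B}_{\lambda\, t_i} + H(\widehat{a})|_{t_{i-1}}^{t_i} \cdot d_{\pi_{2m+1}^* \mathrm{ev}(0)^* A_0} \widehat{B}_{\lambda\, t_i}.$$

For the second term on the right-hand side of the above equation, we obtain, repeating (almost) the same arguments used in the proof of Theorem D.3, $H(\widehat{a})|_{t_{i-1}}^{t_i} \cdot \widehat{(d_{A_0} B_\lambda)}_{t_i}$; for the first term, we obtain analogously

$$\left( \delta H(\widehat{a})|_{t_{i-1}}^{t_i} - \widehat{a}_{t_{i-1}} \cdot H(\widehat{a})|_{t_{i-1}}^{t_i} + H(\widehat{a})|_{t_{i-1}}^{t_i} \cdot \widehat{a}_{t_i} \right) \cdot \widehat{B}_{\lambda\, t_i}.$$

Summing up all these contributions with the right signs and using repeatedly (A.2), we obtain, for the first term on the right-hand side of (8.12), the result

$$-\mathrm{ev}(0)^* a \cdot \pi_{2m+1\,*} \left( H(\widehat{a})|_0^{t_1} \cdot \widehat{B}_{\lambda\, t_1} \cdots \right) - \pi_{2m+1\,*} \left( H(\widehat{a})|_0^{t_1} \cdot \widehat{B}_{\lambda\, t_1} \cdots \right) \cdot \mathrm{ev}(0)^* a + \delta\, \pi_{2m+1\,*} \left( H(\widehat{a})|_0^{t_1} \cdot \widehat{B}_{\lambda\, t_1} \cdots \right).$$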


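By the invariance of $\mathrm{Tr}_\rho$, and since a and $\pi_{2m+1\,*} \left( H(\widehat{a})|_0^{t_1} \cdot \widehat{B}_{\lambda\, t_1} \cdots \right)$ have odd total degree, we get

$$(-1)^{2m+1}\, \mathrm{Tr}_\rho\, \pi_{2m+1\,*}\, d_{\pi_{2m+1}^* \mathrm{ev}(0)^* A_0} \left( H(\widehat{a})|_0^{t_1} \cdot \widehat{B}_{\lambda\, t_1} \cdots \right) = \delta\, \mathrm{Tr}_\rho \left( H(\widehat{a})|_0^{t_1} \cdot \widehat{B}_{\lambda\, t_1} \cdots \right).$$

We now consider the second term on the right-hand side of (8.12). We recall the orientation choices for the m-simplex made in Appendix B.3; with these in mind we obtain (once again with the same arguments of the proof of Theorem D.3)

$$-\mathrm{Tr}_\rho \left[ \mathrm{ev}(0)^* B_\lambda \cdot \pi_{2m\,*} \left( H(\widehat{a})|_0^{t_1} \cdot \widehat{B}_{\lambda\, t_1} \cdots \right) \right] + \mathrm{Tr}_\rho \left[ \pi_{2m\,*} \left( H(\widehat{a})|_0^{t_1} \cdot \widehat{B}_{\lambda\, t_1} \cdots \right) \cdot \mathrm{ev}(0)^* B_\lambda \right] + \sum_{j=1}^{2m} (-1)^{2m+j+1}\, \mathrm{Tr}_\rho\, \pi_{2m\,*} \left( H(\widehat{a})|_0^{t_1} \cdot \widehat{B}_{\lambda\, t_1} \cdots \widehat{(B_\lambda \cdot B_\lambda)}_{t_j} \cdots \right). \qquad (8.13)$$

Since the trace is cyclic in the arguments and $B_\lambda$ has even total degree, the first two terms in the above expression cancel each other. In summary, we have obtained

$$(-1)^{2m}\, \mathrm{Tr}_\rho\, \pi^{\partial}_{2m+1\,*} \left( H(\widehat{a})|_0^{t_1} \cdot \widehat{B}_{\lambda\, t_1} \cdots \right) = \sum_{j=1}^{2m} (-1)^{2m+j+1}\, \mathrm{Tr}_\rho\, \pi_{2m\,*} \left( H(\widehat{a})|_0^{t_1} \cdots \widehat{(B_\lambda \cdot B_\lambda)}_{t_j} \cdots \right).$$

Recalling formulae (8.11), we now apply $\partial_\mu$ to $H(\widehat{a})|_{t_i}^{t_{i+1}}$. We obtain by repeating (almost) the same arguments as in the proof of Theorem D.3

$$\partial_\mu H(\widehat{a})|_{t_i}^{t_{i+1}} = -\int_{t_{i+1} \ge t \ge t_i} H(\widehat{a})|_{t_i}^{t} \cdot \widehat{(B_\lambda \cdot B_\lambda)}_{t} \cdot H(\widehat{a})|_{t}^{t_{i+1}},$$

with the same unifying notation of Remark D.5. After repeated application of Lemma A.1 and A.2, we get the following result:

$$\partial_\mu h_{2m-1,\rho}(\lambda; A, B) = \sum_{k=1}^{2m} (-1)^{2m+k+1}\, \mathrm{Tr}_\rho\, \pi_{2m\,*} \left( H(\widehat{a})|_0^{t_1} \cdot \widehat{B}_{\lambda\, t_1} \cdots \widehat{(B_\lambda \cdot B_\lambda)}_{t_k} \cdots \right) = (-1)^{2m}\, \mathrm{Tr}_\rho\, \pi^{\partial}_{2m+1\,*} \left( H(\widehat{a})|_0^{t_1} \cdot \widehat{B}_{\lambda\, t_1} \cdots \right);$$

so the claim follows.

Remark 8.7. Observe that the statement of Theorem 8.6 does not extend to $h_{2i,\rho}$. The problem in this case arises in (8.13), in which the first two terms sum up instead of canceling each other. This reflects what was already noted in Subsect. 3.2 about the classical versions of these observables in even dimensions.

Since $O_\mu$ is a flat invariant observable, the results of Subsect. 5.4 together with Theorem 8.6 imply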
$ % λt · · · ; a)|t01 · B = (−1)2m Tr ρ π∂2m+1 ∗ H ( 1 so the claim follows. Remark 8.7. Observe that the statement of Theorem 8.6 does not extend to h2i,ρ . The problem in this case arises in (8.13) in which the first two terms sum up instead of canceling each other. This reflects what was already noted in Subsect. 3.2 about the classical versions of these observables in even dimensions. Since Oµ is a flat invariant observable, the results of Subsect. 5.4 together with Theorem 8.6 imply



Corollary 8.8. If \(\mu\) and \(\lambda\) are related by (8.10), then
\[
(\Delta - d)\,\Big[ \exp\big( i\, O_\mu \big)\; \mathcal{H}o_\rho(\lambda; A, B) \Big] = 0, \tag{8.14}
\]

under the assumptions of the regularization procedure before Cor. 7.3. We notice that this implies Theorem 3 and Theorem 4 (for M even-dimensional) of [16].

Remark 8.9. Let us finally note that, following the same arguments sketched in Remark 7.4, we may prove that the v.e.v.s of \(\mathcal{H}o_\rho(\lambda; A, B)\) together with (the exponential of) the polynomial B-terms depend only on the G-equivalence class of the flat connection \(A_0\).

8.3. The \(\Delta_{BV}\)-exactness of the polynomial observables. We end with a digression devoted to proving the identity
\[
O_n = \frac{1}{n}\, \Delta_{BV}\big( O_n\, s \big), \tag{8.15}
\]
where we have used the following notation:
\[
s := \int_M \langle\langle\, a \,;\, B \,\rangle\rangle .
\]

Of course, the functional s depends implicitly on a chosen background flat connection \(A_0\), because the superfield a is seen as a supervariation of the superconnection A, constructed via \(A_0\); we do not indicate the dependence on \(A_0\) in order to avoid cumbersome notation. It is immediate to verify that s is an element of S with ghost number −1. The validity of (8.15) relies on the important identity satisfied by the BV antibracket and by the BV Laplacian, namely the failure of the BV Laplacian \(\Delta_{BV}\) to satisfy the Leibnitz rule (4.6). We already know that, for all n, \(\int_M \mathrm{Tr}\, B^n\) is \(\Delta_{BV}\)-closed (since it depends only on B). We want to prove separately the following identities:
\[
(\, O_n , s \,) = n\, O_n, \qquad \Delta_{BV}\, s = 0. \tag{8.16}
\]

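If we assume the validity of the two previous identities, we can then derive (8.15) from (4.6). We begin with the first identity:

Theorem 8.10. The following identity holds:
\[
(\, O_n , s \,) = n\, O_n
\]
for all \(n \in \mathbb{N}\).

Proof. Since \(O_n\) does not depend on a, we have
\[
(\, O_n , s \,) = ((\, O_n \,;\, s \,)) = \int_M \Big\langle\Big\langle \frac{O_n \overleftarrow{\partial}}{\partial B} \,;\, \frac{\overrightarrow{\partial} s}{\partial a} \Big\rangle\Big\rangle . \tag{8.17}
\]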



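We compute the right super functional derivative of s w.r.t. a, getting
\[
\frac{\overrightarrow{\partial} s}{\partial a} = B.
\]
The left super functional derivative of \(O_n\) w.r.t. B reads
\[
\frac{O_n \overleftarrow{\partial}}{\partial B} = B^{n-1}.
\]
So it follows from
\[
\int_M \Big\langle\Big\langle \frac{O_n \overleftarrow{\partial}}{\partial B} \,;\, \frac{\overrightarrow{\partial} s}{\partial a} \Big\rangle\Big\rangle
= \int_M \langle\langle\, B^{n-1} \,;\, B \,\rangle\rangle
= \int_M \mathrm{Tr}\, B^{n-1} \cdot B
= \int_M \mathrm{Tr}\, B^{n}
\]
that the claim is true.

We now want to prove the second identity in (8.16). Since \(\langle\ ,\ \rangle\) is nondegenerate by assumption, we can find a basis \(X_i\) of g, \(i = 1, \ldots, \dim g\), which satisfies \(\langle X_i , X_j \rangle = \delta_{ij}\, \sigma_i\), where \(\sigma_i = \pm 1\). We can then write
\[
\phi^\alpha = \phi^\alpha_i X_i, \qquad \phi^*_\alpha = \phi^*_{\alpha\, j} X_j,
\]
where the coefficients \(\phi^\alpha_i\) and \(\phi^*_{\alpha\, i}\) are forms on M (of course, sum over repeated indices is understood here). By recalling the formulae defining the Hodge dual antifields and the definition of \(\langle\langle\ ;\ \rangle\rangle\) for forms with ghost number, we may write, regardless of the dimension of M,
\[
\begin{aligned}
s &= -\,( c^* , c )_{\mathrm{Hodge}} - ( a^* , a )_{\mathrm{Hodge}} + ( B^* , B )_{\mathrm{Hodge}} + \sum_{k=1}^{m-2} ( \tau_k^* , \tau_k )_{\mathrm{Hodge}} \\
  &= \sum_{i=1}^{\dim g} \sigma_i \Big\{ -\,( c_i^* , c_i )_{\mathrm{Hodge}} - ( a_i^* , a_i )_{\mathrm{Hodge}} + ( B_i^* , B_i )_{\mathrm{Hodge}} + \sum_{k=1}^{m-2} ( \tau_{k,i}^* , \tau_{k,i} )_{\mathrm{Hodge}} \Big\},
\end{aligned}
\]
where \((\ ,\ )_{\mathrm{Hodge}}\) is defined in (6.9). We now apply the BV Laplacian to the above expression, and we get the following result:
\[
\Delta_{BV}\, s = \sum_{i=1}^{\dim g} \sigma_i\, C \left[ \binom{m}{m} - \binom{m}{m-1} + \binom{m}{m-2} - \cdots + (-1)^l \binom{m}{m-l} + \cdots \right]
= \sum_{i=1}^{\dim g} \sigma_i\, C\, (1-1)^m = 0,
\]
where C is an infinite constant (in fact, it is given by the Dirac δ distribution evaluated at 0, multiplied by the volume of the manifold M). This argument is very similar to that used in the proof of the \(\Delta_{BV}\)-closedness of the BV action for canonical BF theories (see Subsect. 6.3). The binomial coefficients take into account the number of components of \(\phi^\alpha_i\) (recall that they are forms on M), while the signs come from the ghost numbers of the fields. With an appropriate regularization procedure, C becomes finite and the above expression vanishes already before the regularization is removed. So the claim follows.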
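The counting argument above rests on the identity \(\sum_l (-1)^l \binom{m}{m-l} = (1-1)^m\). The following minimal Python sketch checks it for a few values of m. The function name is illustrative (not from the paper), and the infinite constant C and the signs \(\sigma_i\) are deliberately left out, since only the bracketed alternating sum is being tested.

```python
from math import comb


def alternating_binomial_sum(m: int) -> int:
    """Return sum_{l=0}^{m} (-1)^l * binom(m, m - l), which equals (1 - 1)^m."""
    return sum((-1) ** l * comb(m, m - l) for l in range(m + 1))


# For every m >= 1 the bracket multiplying sigma_i * C vanishes.
for m in range(1, 9):
    print(m, alternating_binomial_sum(m))
```

For m = 0 the sum equals 1, consistent with the convention \(0^0 = 1\); the cancellation needs \(m \ge 1\).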



9. Generalizations

At the beginning of this paper (see the beginning of Sect. 2) we made three assumptions. As we have seen, Assumption 3 is necessary for the construction of loop observables in odd dimensions (one can get rid of it if one just considers the action, see Subsect. 6.3), and actually one needs the stronger Assumption 4 before Cor. 7.3 for the even-dimensional case (or for more general loop observables). Assumption 1, on the other hand, is needed if one wants to avoid symmetries beyond those displayed in (3.3) or reducibility beyond that described thereafter. Some modifications are needed when M is not compact, as we briefly describe in Subsect. 9.1 for the case \(M = \mathbb{R}^m\). Finally, Assumption 2 is there only for the sake of simplicity. We sketch in Subsect. 9.2 some ideas for the generalization of the constructions in the paper if we get rid of it.

9.1. The case \(M = \mathbb{R}^m\). Here one has to require that the superfields decay sufficiently fast at infinity; the only flat connection is the trivial one, and the infinitesimal gauge transformations (3.3) have no reducibility beyond that considered in the paper. Therefore, all the conclusions automatically apply to this case as well. Let us only remark that in this case one may also consider observables associated to paths with endpoints at infinity. This simplifies all proofs in Sects. 7 and 8, as one can disregard the extremal boundary terms in the iterated integrals (which before had to be proved to cancel each other by the cyclicity of the trace). In particular, for m even one no longer needs the restriction on the definition of the functional appearing in Theorem 8.6, as this holds even if one sums over all the h's and not only over the odd ones.

9.2. Nontrivial bundles. The main features of the paper that we have to generalize are: (i) the BV formalism and (ii) the Wilson loops.

9.2.1. The BV formalism. The main problem in generalizing the BV formalism to the case of nontrivial P arises because the fields of the BF theories (classical fields, Faddeev–Popov ghost, ghosts for ghosts and associated BV antighosts) are no longer forms on M with values in g, but rather forms on M taking values in the nontrivial bundle ad P. We therefore have to generalize to such forms the notion of functional derivatives of elements of S(A) (for an \(\mathbb{R}\)-algebra A). These are easy to define and have to be considered as “distributional forms” on M taking values in the bundle \(\mathrm{ad}\,P \otimes A\). The main difficulty lies in the generalization of functional derivatives for elements of \(S^*(N; E)\) (where now E → N is a bundle over the manifold N). We consider them as elements of \(\Omega^{*,*}(M \times N, \mathrm{ad}\,P \boxtimes E)\) where, by abuse of notation, we denote by \(\Omega^{*,*}\) smooth forms as well as distributional forms (which is the main case) on M × N. The external tensor product of two bundles on two different manifolds was already defined in Subsect. 4.1. Then, the pairing induced by \(\langle\ ,\ \rangle\) between the (pull-back of) forms on M with values in ad P and the so defined functional derivatives gives forms on M × N with values in the pull-back of the bundle E. It is a well-known fact (see for example [24] and [25]) that, given a (smooth) map \(f : N_1 \to N_2\) between two manifolds and a bundle \(\mathcal{N}_2 \to N_2\), forms on \(N_1\) taking values in the pulled-back bundle \(f^*\mathcal{N}_2 \to N_1\) are generated by pull-backs of sections of \(\mathcal{N}_2\) as an \(\Omega^*(N_1)\)-module. Usually, we consider the map f to be a fibration, in order to perform a push-forward w.r.t. it. The generalization of the notion of functional derivatives (as sketched above) leads to the generalization of the BV antibracket as well. Analogously one can generalize the BV



Laplacian, the super BV antibracket and the super BV Laplacian. Then all constructions described in Sects. 4, 5 and 6 hold in the general case.

9.2.2. The generalized Wilson loops. We face the following problem: we have to construct iterated integrals consistently with the fact that ad P is no longer trivial. (We refer to Appendix B.3 for the main notations that we use in the next paragraphs.) We now sketch this generalization, which naturally works only when one restricts oneself to representations of g coming from representations of the corresponding Lie group G.

The representation ρ. The Wilson loops defined in the paper depend explicitly, by construction, on some finite-dimensional representation \(\rho : g \to \mathrm{End}(V)\): in fact, we apply to the fields (which, in the case when P is trivial, are forms on M with values in g) the representation ρ, and we obtain forms on M with values in the algebra End(V). We can no longer do this when P is nontrivial; the forms are in fact elements of \(\Omega^*(M, \mathrm{ad}\,P)\). We shall therefore construct a bundle morphism from ad P to some associated bundle, which must be related to the representation ρ. A natural way to do this consists in taking a representation \(\rho : G \to \mathrm{Aut}(V)\), for some finite-dimensional vector space V; this induces a representation ρ on End(V) by conjugation. Therefore, we can define the associated bundle \(\mathrm{End}_P(V) := P \times_\rho \mathrm{End}(V)\). The derivative at the identity of ρ is an equivariant morphism ρ from g to End(V) and induces a morphism from ad P to \(\mathrm{End}_P(V)\) (which, by abuse of notation, we still denote by ρ).

The iterated integrals for the Wilson loops. The generalized Wilson loops (in the case P trivial) are constructed via iterated integrals that involve pull-backs of forms on M with values in End(V) w.r.t. evaluation maps from \(LM \times \triangle_n\) to M. The construction of the generalized Wilson loops is based on the “holonomy” \(H(A + a)|_0^1\) of the connection A + a defined in Appendix B.3. The main object in the definition of \(H(A + a)|_0^1\) is \(\tilde a\); this was constructed by means of the evaluation map \(\mathrm{ev}_1\) and by conjugation with \(H(A)|_0^\bullet\); by pulling back \(\tilde a\) w.r.t. \(\pi_{i,n}\) (i = 1, . . . , n), we obtain a form on \(LM \times \triangle_n\) with values in End(V).

In the nontrivial case, we work as follows: first, we take the image under ρ of the form a, obtaining a form on M with values in \(\mathrm{End}_P(V)\). Next we take its pull-back w.r.t. the map \(\mathrm{ev}_1\) and obtain a form on LM × [0, 1] taking values in \(\mathrm{End}_{\mathrm{ev}_1^* P}(V)\) (\(\mathrm{ev}_1^* P\) inherits the structure of a principal bundle over LM × [0, 1] as a pull-back of P). We then take a flat background connection \(A_0\); by means of it, we can construct a G-equivariant isomorphism from \(\mathrm{ev}_1^* P\) to \((\mathrm{ev}(0) \circ \pi_1)^* P\) that induces in turn an isomorphism between \(\Omega^*(LM \times [0,1], \mathrm{End}_{\mathrm{ev}_1^* P}(V))\) and \(\Omega^*(LM \times [0,1], \mathrm{End}_{(\mathrm{ev}(0)\circ\pi_1)^* P}(V))\). We still denote by \(\tilde a \in \Omega^1(LM \times [0,1], \mathrm{End}_{(\mathrm{ev}(0)\circ\pi_1)^* P}(V))\) the result of these operations on the form a. Finally, we define \(\tilde a_{i,n} \in \Omega^1(LM \times \triangle_n, \mathrm{End}_{(\mathrm{ev}(0)\circ\pi_n)^* P}(V))\) as \(\pi_{i,n}^* \tilde a\). We can now multiply \(\tilde a_{i,n}\) by \(\tilde a_{j,n}\), for different i, j ≤ n, since all these forms now take values in the same algebra bundle \(\mathrm{End}_{(\mathrm{ev}(0)\circ\pi_n)^* P}(V) \to LM \times \triangle_n\). In this way we may define the generalization of all the functionals appearing in Sects. 7 and 8, and the related theorems still hold.

The isomorphism described above depends explicitly on the connection \(A_0\). The gauge group G operates (not necessarily freely) by “conjugation” on the set of equivariant bundle morphisms from \(\mathrm{ev}_1^* P\) to \((\mathrm{ev}(0) \circ \pi_1)^* P\), which we denote by \(\mathrm{Hom}_G(\mathrm{ev}_1^* P ; (\mathrm{ev}(0) \circ \pi_1)^* P)\). It is well known that G operates on the space A(P) of connections on P, making it into a G-principal bundle (modulo reducible connections and analytical technicalities, which we have skipped here). It can be proved that the isomorphism described above defines an equivariant map from A(P) to \(\mathrm{Hom}_G(\mathrm{ev}_1^* P ; (\mathrm{ev}(0)\circ\pi_1)^* P)\).



This implies by construction that the v.e.v.s of the generalized Wilson loops do not depend on the flat connection \(A_0\), but rather on its equivalence class in \(\{A \in A(P) : F_A = 0\}/G\) only. (This is in analogy with Remark 7.4.)

Appendix A. Definition and Main Properties of the Pushforward

Let M be a manifold and \(\pi : E \to M\) a smooth fiber bundle with typical fiber F, where F is an oriented compact manifold, possibly with boundaries and corners. Let m, resp. e, resp. f, denote the dimensions of M, resp. E, resp. F (so e = f + m). We pick a form ω in \(\Omega^p(E)\), where p ≥ f; we then define the pushforward \(\pi_*\omega\) of the form ω w.r.t. π as the form in \(\Omega^{p-f}(M)\) which satisfies the following identity:
\[
\int_M \pi_*\omega \wedge \eta = \int_E \omega \wedge \pi^*\eta, \qquad \forall \eta \in \Omega^{m+f-p}(M). \tag{A.1}
\]
In the case p < f we define \(\pi_*\omega = 0\). We now list without proof the main properties of the push-forward:
\[
\begin{aligned}
\pi_*(\pi^*\alpha \wedge \beta) &= (-1)^{f \deg\alpha}\, \alpha \wedge \pi_*\beta, && \forall \alpha \in \Omega^*(M),\ \forall \beta \in \Omega^*(E), \\
\pi_*(\alpha \wedge \pi^*\beta) &= \pi_*\alpha \wedge \beta, && \forall \alpha \in \Omega^*(E),\ \forall \beta \in \Omega^*(M), \\
d\pi_*\alpha &= (-1)^f\, \pi_* d\alpha - (-1)^f\, \pi_{\partial *}\, \iota^*\alpha, && \forall \alpha \in \Omega^*(E),
\end{aligned} \tag{A.2}
\]
where \(\iota : E_\partial \to E\) is the canonical injection of the fiber bundle with typical fiber ∂F, and \(\pi_\partial : \iota(E_\partial) \to M\) is the corresponding projection.

Another important property which we use throughout the paper is given by the following lemma (without proof). We consider two manifolds M and N and suppose that \(\pi : E \to M\), resp. \(\tilde\pi : F \to N\), is a fiber bundle over the manifold M, resp. N. Let \(\varphi : E \to F\) be a bundle morphism with base map \(\psi : M \to N\). We cast all these maps in the following commutative square:
\[
\begin{array}{ccc}
E & \xrightarrow{\ \varphi\ } & F \\
\big\downarrow{\scriptstyle \pi} & & \big\downarrow{\scriptstyle \tilde\pi} \\
M & \xrightarrow{\ \psi\ } & N
\end{array}
\]
We suppose additionally that φ induces orientation-preserving diffeomorphisms of the fibers.

Lemma A.1. Under the above assumptions, the following identity holds:
\[
(\pi_* \circ \varphi^*)\,\alpha = (\psi^* \circ \tilde\pi_*)\,\alpha, \qquad \forall \alpha \in \Omega^p(F). \tag{A.3}
\]
For the proof, see [24].

Let us suppose that we have a fiber bundle E → F, and let us additionally suppose that F → M is a fiber bundle, too; let us denote the projections by \(\pi_1\), resp. \(\pi_2\):
\[
E \xrightarrow{\ \pi_1\ } F \xrightarrow{\ \pi_2\ } M .
\]



If we compose the two projections we obtain a fiber bundle E → M, with projection \(\pi = \pi_2 \circ \pi_1\), whose orientation is determined by the orientation of the resulting fiber, the product manifold of the fibers of the two bundles. Then we obtain the following

Lemma A.2. With the above hypotheses, the following identity holds:
\[
\pi_*\alpha = \pi_{2*}(\pi_{1*}\alpha), \qquad \forall \alpha \in \Omega^p(E). \tag{A.4}
\]
This is just Fubini’s Theorem for repeated integration, and the definition of the pushforward is consistent with the orientation choices. We end this Appendix by defining the push-forward of forms on E → M with values in some finite dimensional vector space W. This is simply given by \(\pi_*(\alpha \otimes v) := \pi_*\alpha \otimes v\) on generators and extended by linearity.

Appendix B. Sign Rules

To introduce the dot product, let us for a moment suppose that we have a \(\mathbb{Z}\)-graded superalgebra E, and let us consider \(\Omega^*(M; E)\) with differential \(d(\omega \otimes e) := d\omega \otimes e\). Let us pick an element ω ⊗ e in \(\Omega^*(M; E)\); we can assign to it two gradings, namely its degree as a form on M and the degree of its E-part; from now on, we will call the degree in E “ghost number”. By “homogeneous” in \(\Omega^*(M; E)\) we mean from now on any element α of given degree and ghost number. We then define the product of homogeneous elements in \(\Omega^*(M; E)\) by the rule
\[
(\omega \otimes e)\,(\omega' \otimes e') := \omega \wedge \omega' \otimes e e' .
\]
The graded Leibnitz rule reads
\[
d(\alpha\,\beta) = d\alpha\,\beta + (-1)^{\deg\alpha}\, \alpha\, d\beta, \qquad \forall \alpha, \beta \in \Omega^*(M; E).
\]
In the case when E is supercommutative, it also follows that
\[
\alpha\,\beta = (-1)^{\deg\alpha \deg\beta + \mathrm{gh}\,\alpha\, \mathrm{gh}\,\beta}\, \beta\,\alpha .
\]
In case E is associative, we define the super Lie bracket of two homogeneous elements a, b by
\[
[\, a , b \,] := a\,b - (-1)^{\mathrm{gh}\,a\, \mathrm{gh}\,b}\, b\,a, \qquad \forall a, b \in E;
\]
it satisfies the graded antisymmetry
\[
[\, a , b \,] = -(-1)^{\mathrm{gh}\,a\, \mathrm{gh}\,b}\, [\, b , a \,]
\]
and the graded Jacobi identity
\[
[\, a , [\, b , c \,] \,] = [\, [\, a , b \,] , c \,] + (-1)^{\mathrm{gh}\,a\, \mathrm{gh}\,b}\, [\, b , [\, a , c \,] \,],
\]
for all homogeneous a, b, c ∈ E. The super Lie bracket on E can be extended to \(\Omega^*(M; E)\) with the help of the wedge product by the rule
\[
[\, \alpha \otimes a , \beta \otimes b \,] := \alpha \wedge \beta \otimes [\, a , b \,].
\]
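The graded antisymmetry and the graded Jacobi identity of the super Lie bracket can be checked by brute force on a small example. The sketch below uses the associative superalgebra of 2 × 2 matrices with diagonal entries even and off-diagonal entries odd (often called gl(1|1)); this choice of algebra, the parity playing the role of the ghost number modulo 2, and all function names are assumptions made for illustration and do not appear in the paper.

```python
from itertools import product


def matmul(a, b):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def lin(x, a, y, b):
    """Linear combination x*a + y*b of two 2x2 matrices."""
    return tuple(
        tuple(x * a[i][j] + y * b[i][j] for j in range(2)) for i in range(2)
    )


# Homogeneous basis as (matrix, parity): diagonal units are even, off-diagonal odd.
BASIS = [
    (((1, 0), (0, 0)), 0),
    (((0, 0), (0, 1)), 0),
    (((0, 1), (0, 0)), 1),
    (((0, 0), (1, 0)), 1),
]


def bracket(x, y):
    """Super Lie bracket [a, b] = ab - (-1)^(|a||b|) ba, with the parity of the result."""
    (a, pa), (b, pb) = x, y
    sign = (-1) ** (pa * pb)
    return lin(1, matmul(a, b), -sign, matmul(b, a)), (pa + pb) % 2


def identities_hold():
    """Check graded antisymmetry and graded Jacobi on all basis elements."""
    zero = ((0, 0), (0, 0))
    for x, y in product(BASIS, repeat=2):
        sign = (-1) ** (x[1] * y[1])
        if bracket(x, y)[0] != lin(-sign, bracket(y, x)[0], 0, zero):
            return False
    for x, y, z in product(BASIS, repeat=3):
        sign = (-1) ** (x[1] * y[1])
        lhs = bracket(x, bracket(y, z))[0]
        rhs = lin(1, bracket(bracket(x, y), z)[0],
                  sign, bracket(y, bracket(x, z))[0])
        if lhs != rhs:
            return False
    return True


print(identities_hold())
```

Since both identities are multilinear, verifying them on a homogeneous basis is enough for this particular algebra.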



The graded antisymmetry and the graded Jacobi identity imply
• \([\, \alpha , \beta \,] = -(-1)^{\deg\alpha \deg\beta + \mathrm{gh}\,\alpha\, \mathrm{gh}\,\beta}\, [\, \beta , \alpha \,]\);
• \([\, \alpha , [\, \beta , \gamma \,] \,] = [\, [\, \alpha , \beta \,] , \gamma \,] + (-1)^{\deg\alpha \deg\beta + \mathrm{gh}\,\alpha\, \mathrm{gh}\,\beta}\, [\, \beta , [\, \alpha , \gamma \,] \,]\),
for all homogeneous forms \(\alpha, \beta, \gamma \in \Omega^*(M; E)\).

Remark B.1. It is possible to start directly with a super Lie algebra H instead of E. The graded antisymmetry and the graded Jacobi identity in \(\Omega^*(M; H)\) hold as in the previous formulae.

B.1. Dot products. Since \(\Omega^*(M; E)\) has two gradings, each element α that is homogeneous in the degree and in the ghost number inherits a new grading, the total degree, which is defined by
\[
|\alpha| := \deg\alpha + \mathrm{gh}\,\alpha .
\]
We can then define the dot product of two homogeneous forms α, β in \(\Omega^*(M; E)\) by the rule
\[
\alpha \cdot \beta := (-1)^{\mathrm{gh}\,\alpha \deg\beta}\, \alpha\,\beta,
\]
and accordingly the dot Lie bracket
\[
[[\alpha \,;\, \beta]] := (-1)^{\mathrm{gh}\,\alpha \deg\beta}\, [\, \alpha , \beta \,].
\]
We now list some obvious properties. Let us suppose that E is supercommutative; then
\[
\alpha \cdot \beta = (-1)^{|\alpha||\beta|}\, \beta \cdot \alpha \qquad \text{(graded commutativity)}.
\]
For the dot Lie bracket, the following hold in general:
\[
\begin{aligned}
[[\alpha \,;\, \beta]] &= -(-1)^{|\alpha||\beta|}\, [[\beta \,;\, \alpha]] && \text{(graded antisymmetry)}, \\
[[\alpha \,;\, [[\beta \,;\, \gamma]]]] &= [[[[\alpha \,;\, \beta]] \,;\, \gamma]] + (-1)^{|\alpha||\beta|}\, [[\beta \,;\, [[\alpha \,;\, \gamma]]]] && \text{(graded Jacobi identity)},
\end{aligned}
\]
for all homogeneous forms α, β, γ in \(\Omega^*(M; E)\). Next, we notice that the exterior derivative satisfies the following graded Leibnitz rule:
\[
d(\alpha \cdot \beta) = d\alpha \cdot \beta + (-1)^{|\alpha|}\, \alpha \cdot d\beta .
\]
If we consider an (ungraded) algebra bundle (or more generally, a Lie algebra bundle) B → M, we can consider the space \(\Omega^*(M, B) \otimes E\) instead of \(\Omega^*(M; E)\); we define accordingly the total degree (B is ungraded and each fiber is an algebra) and the dot product (and the dot Lie bracket). We next consider a covariant derivative \(d_A\), coming from a connection A on B, and define its action on \(\Omega^*(M, B) \otimes E\) by the rule \(d_A(\alpha \otimes a) := d_A\alpha \otimes a\). Then, the Leibnitz rule for \(d_A\) w.r.t. the dot product and the dot Lie bracket follows easily.
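The "obvious properties" of the dot product are pure sign bookkeeping, so they can be confirmed by enumerating degrees and ghost numbers. The Python sketch below does this for graded commutativity and for the Leibnitz rule; the helper names are illustrative, and the only inputs are the sign rules quoted in this appendix (the supercommutativity sign of the plain product, the Leibnitz rule for the plain product, and the definition of the dot product).

```python
from itertools import product


def sgn(n: int) -> int:
    """Return (-1)^n for any integer n (negative ghost numbers included)."""
    return -1 if n % 2 else 1


def commutativity_ok(deg_a, gh_a, deg_b, gh_b):
    """Check a.b = (-1)^(|a||b|) b.a, given a b = (-1)^(deg a deg b + gh a gh b) b a."""
    # a.b = sgn(gh_a*deg_b) a b, and b a = sgn(gh_b*deg_a) b.a
    derived = sgn(gh_a * deg_b) * sgn(deg_a * deg_b + gh_a * gh_b) * sgn(gh_b * deg_a)
    return derived == sgn((deg_a + gh_a) * (deg_b + gh_b))


def leibniz_ok(deg_a, gh_a, deg_b, gh_b):
    """Check d(a.b) = da.b + (-1)^|a| a.db, given d(a b) = da b + (-1)^(deg a) a db."""
    # Coefficient of (a db) in d(a.b), expressed through the plain product:
    needed = sgn(gh_a * deg_b) * sgn(deg_a)
    # Coefficient of (a db) in (-1)^|a| a.db, where db has degree deg_b + 1:
    claimed = sgn(deg_a + gh_a) * sgn(gh_a * (deg_b + 1))
    return needed == claimed


def all_signs_consistent():
    degrees = range(0, 4)
    ghosts = range(-2, 3)
    return all(
        commutativity_ok(da, ga, db, gb) and leibniz_ok(da, ga, db, gb)
        for da, ga, db, gb in product(degrees, ghosts, degrees, ghosts)
    )


print(all_signs_consistent())
```

The coefficient of \(d\alpha\,\beta\) needs no check: \(d\alpha\) has the same ghost number as α, so \(d\alpha \cdot \beta\) carries the same sign as \(\alpha \cdot \beta\).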



B.2. Superderivations. We can also consider in this setting the BV operator δ defined by the BV action as a graded derivation on the superalgebra E, which we extend to \(\Omega^*(M; E)\) by the rule \(\delta(\alpha \otimes a) := \alpha \otimes \delta a\). It follows:
• \(\delta(\alpha\,\beta) = \delta\alpha\,\beta + (-1)^{\mathrm{gh}\,\alpha}\, \alpha\, \delta\beta\) for homogeneous α, β in \(\Omega^*(M; E)\);
• \(\delta \circ d = d \circ \delta\) on \(\Omega^*(M; E)\).

Let us next define \(\boldsymbol{\delta} := (-1)^{\deg} \otimes \delta\), where \((-1)^{\deg}\) is the operator which multiplies each homogeneous form on M by the parity of its degree. From its very definition, it follows
• \(\boldsymbol{\delta}(\alpha \cdot \beta) = \boldsymbol{\delta}\alpha \cdot \beta + (-1)^{|\alpha|}\, \alpha \cdot \boldsymbol{\delta}\beta\) for homogeneous α, β in \(\Omega^*(M; E)\);
• \(\boldsymbol{\delta} \circ d = -d \circ \boldsymbol{\delta}\) on \(\Omega^*(M; E)\).

Remark B.2. The same identities can be proved even when we substitute \(\Omega^*(M; E)\) with \(\Omega^*(M, B) \otimes E\), for an ungraded algebra bundle B → M, and the exterior derivative with a covariant one, or if we replace E by a super Lie algebra H.

We can then define the operator
\[
D := d \otimes 1 + (-1)^{m+1}\, \boldsymbol{\delta}, \qquad m = \dim M;
\]
it follows easily from all the above results that it is a superderivation w.r.t. the total degree. Moreover, if δ is nilpotent, then so is D, which is consequently a differential on \(\Omega^*(M; E)\). If we are dealing with \(\Omega^*(M, B) \otimes E\), we can replace d by a covariant derivative \(d_A\) and define \(D_A := d_A \otimes 1 + \boldsymbol{\delta}\), which is then a superderivation. Moreover, if A is flat, \(D_A\) is a superdifferential, too. (Of course any linear combination of d ⊗ 1 and \(\boldsymbol{\delta}\) has these properties. The conventional choice of the factor \((-1)^{m+1}\) is consistent with the choices made in the rest of the paper.)

In the paper, we also consider a flat background connection \(A_0\) and its relative covariant derivative, along with a sum of forms, which we denote by a, of total degree 1. Then \(d_A = d_{A_0} + [[a \,;\, \ ]]\) defines a superconnection on \(\Omega^*(M, \mathrm{ad}\,P)\). In the setting of this Appendix, this is tantamount to choosing forms in \(\Omega^*(M, \mathrm{ad}\,P) \otimes E\) of total degree 1; we sum all these forms and obtain a variation of the superconnection \(A_0\). We define accordingly \(D_A := d_A + (-1)^{m+1}\, \boldsymbol{\delta}\); it is clear that \(D_A\) is a derivation, and its curvature is given by
\[
D_A^2 = [[\, (-1)^{m+1}\, \boldsymbol{\delta} a + F_A \,;\, \ ]] =: [[\, \mathcal{F}_A \,;\, \ ]];
\]
so (6.4) can be interpreted as the vanishing of the curvature \(\mathcal{F}_A\) of A on \(\Omega^*(M, \mathrm{ad}\,P) \otimes E\); thus, A is formally “superflat”. Similarly, (6.5) implies that the superform B (seen as an element of \(\Omega^*(M, \mathrm{ad}\,P) \otimes E\) of total degree m − 2) is \(D_A\)-closed.

B.3. Pullbacks and push-forwards. Finally, let π : E → M be a fiber bundle. We then define the pullback, resp. push-forward, w.r.t. π by the rules
• \(\pi^*(\omega \otimes e) := \pi^*\omega \otimes e\), for ω ⊗ e a form on M with values in the superalgebra E;
• \(\pi_*(\eta \otimes e) := \pi_*\eta \otimes e\), for η ⊗ e a form on the total space E with values in the superalgebra E.
It follows immediately that
• \(\delta \circ \pi^* = \pi^* \circ \delta\);
• \(\delta \circ \pi_* = \pi_* \circ \delta\).
It is then not difficult to verify that
• \(\boldsymbol{\delta} \circ \pi^* = \pi^* \circ \boldsymbol{\delta}\);
• \(\boldsymbol{\delta} \circ \pi_* = (-1)^{\mathrm{rk}\, E}\, \pi_* \circ \boldsymbol{\delta}\).
By definition of the dot product, it follows (in analogy with the first two equations in (A.2))
• \(\pi_*(\pi^*\alpha \cdot \beta) = (-1)^{\mathrm{rk}\, E\, |\alpha|}\, \alpha \cdot \pi_*\beta\);
• \(\pi_*(\alpha \cdot \pi^*\beta) = \pi_*\alpha \cdot \beta\).

Appendix C. The Universal Global Angular Form

In this appendix we construct the universal global angular form by using a fermionic integral representation. This is analogous to the construction of the Mathai–Quillen representative [30] of the Thom class (see [6] and [19] and references therein). Recall that a global angular form on a sphere bundle \(p : S \to M\) is a form ϑ on S satisfying \(p_*\vartheta = 1\) and \(d\vartheta = -p^* e\), where e is a representative of the Euler class of the bundle.

Let Q → M be an SO(n)-principal bundle (not necessarily SO(M)). Let E be the associated vector bundle \(Q \times_{SO(n)} E^n\), with \(E^n\) the n-dimensional Euclidean vector space. We denote by \(\langle\ ,\ \rangle\) the corresponding scalar product. Consider then the associated unit sphere bundle \(S = Q \times_{SO(n)} S^{n-1}\) as the base manifold of \(\bar S = Q \times S^{n-1}\). Let us summarize all these bundles and the respective projections in the following commutative square:
\[
\begin{array}{ccc}
Q & \xleftarrow{\ \bar p\ } & Q \times S^{n-1} = \bar S \\
\big\downarrow{\scriptstyle \pi} & & \big\downarrow{\scriptstyle \bar\pi} \\
M & \xleftarrow{\ p\ } & Q \times_{SO(n)} S^{n-1} = S .
\end{array}
\]
We denote by θ a connection form on Q. By abuse of notation, we denote again by the same symbol its pull-back w.r.t. \(\bar p\) (which is again a connection on \(\bar S\)), and by F its curvature. Last, let us denote by x the canonical euclidean coordinates on \(\mathbb{R}^n\) (with \(S^{n-1}\) defined as the locus of \(\langle x , x \rangle = 1\)). We may consider x as an equivariant function on \(\bar S\) with values in \(\mathbb{R}^n\) (which inherits the canonical representation of SO(n)), and we denote by ∇x its corresponding covariant derivative, yielding a basic 1-form on \(\bar S\) with values in \(\mathbb{R}^n\). (Here, the right action of SO(n) on \(\bar S\) is defined by \((q, x)O := (qO, O^{-1}x)\).) From now on, by basic we will mean every form on \(\bar S\) which is horizontal and invariant w.r.t. the action of SO(n).

Our aim is to construct a global angular form on the trivial sphere bundle \(\bar S\) in terms of the monomials
\[
[x; F, k; \nabla x, l] := \epsilon_{a_1 \ldots a_{2k+l+1}}\, x^{a_1}\, F^{a_2 a_3} \cdots F^{a_{2k} a_{2k+1}}\, (\nabla x)^{a_{2k+2}} \cdots (\nabla x)^{a_{2k+l+1}}, \tag{C.1}
\]


where 2k + l + 1 = n, \(\epsilon_{ij \ldots n}\) is the totally antisymmetric tensor and sums over repeated indices are understood. Observe that these monomials are basic in \(\bar S \to S\) since x, ∇x and F are horizontal and equivariant.

Our first task is to write a generating function for these monomials. To do so, we consider \(\Pi T \mathbb{R}^n\). We go on denoting by x the (even) coordinates on the base and denote by \(\rho_i\), collectively ρ, the n Grassmann coordinates on the fiber. We then introduce the Berezin integration \(\int [D\rho]\) by the rules:
• \(\int [D\rho]\, P(\rho) = 0\), for any polynomial P in the odd variables \(\rho_i\) of degree strictly less than n;
• \(\int [D\rho]\, \rho_1 \cdots \rho_n = 1\).
These two rules determine a unique Berezin integral on any polynomial in the Grassmann variables ρ (any smooth function in the variables ρ has the form of a polynomial of maximal degree n). Then the generating function we are looking for reads
\[
G = \int [D\rho]\, \langle \rho , x \rangle\, \exp S, \tag{C.2}
\]
where
\[
S = \langle \rho , \nabla x \rangle + \frac{\lambda}{2}\, \langle \rho , F\rho \rangle, \tag{C.3}
\]
and λ is a parameter. For the next discussion we need to introduce also the following generating function of basic n-forms:
\[
\Phi = \int [D\rho]\, \exp S. \tag{C.4}
\]
To prove that the forms generated by Φ and G are actually basic, just observe that the action of SO(n) on x, ∇x and F can be compensated for by a change of variables corresponding to the fundamental representation of SO(n) on the vector space generated by \(\{\rho_i\}\).

Remark C.1. The Thom class on \(P \times_{SO(n)} \mathbb{R}^n\) can be written as a basic form on \(P \times \mathbb{R}^n\) as
\[
U = \frac{(-1)^{\lfloor n/2 \rfloor}}{(2\pi t)^{n/2}} \int [D\rho]\, \exp\Big( -\frac{1}{2t}\, \langle x , x \rangle + \langle \rho , \nabla x \rangle - \frac{t}{2}\, \langle \rho , F\rho \rangle \Big),
\]
for any t > 0 [6]. So, apart from a multiplicative constant, Φ is the restriction of \(U|_{t=-\lambda}\) to \(P \times S^{n-1}\), while G is the restriction of the form obtained by contracting \(U|_{t=-\lambda}\) with the radial vector field \(r\, \partial/\partial r\).

Now we have the following

Lemma C.2. Φ and G obey the equation:
\[
dG = (-1)^{n+1} \Big[ n - 2\lambda\, \frac{\partial}{\partial\lambda} + \frac{1}{\lambda} \Big]\, \Phi. \tag{C.5}
\]
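The two Berezin rules used in this appendix are easy to experiment with numerically. The following Python sketch implements a tiny Grassmann algebra and the Berezin integral normalized by \(\int [D\rho]\, \rho_1 \cdots \rho_n = 1\), and checks on one 4 × 4 example the classical identity \(\int [D\rho] \exp(\tfrac12 \langle \rho , F\rho \rangle) = \mathrm{Pfaff}\, F\) invoked later in the proof of Lemma C.3. It is an illustration under simplifying assumptions: the entries of F are ordinary numbers rather than curvature 2-forms, and all names (`gmul`, `gexp`, `berezin`, `quadratic`, `pfaffian4`) are hypothetical helpers, not notation from the paper.

```python
from fractions import Fraction


def gmul(a, b):
    """Multiply Grassmann elements stored as {sorted index tuple: coefficient}."""
    out = {}
    for ia, ca in a.items():
        for ib, cb in b.items():
            if set(ia) & set(ib):
                continue  # rho_i * rho_i = 0
            merged = ia + ib
            # Sign of the permutation that sorts the generators.
            inversions = sum(
                1
                for p in range(len(merged))
                for q in range(p + 1, len(merged))
                if merged[p] > merged[q]
            )
            key = tuple(sorted(merged))
            out[key] = out.get(key, 0) + (-1) ** inversions * ca * cb
    return {k: c for k, c in out.items() if c != 0}


def gexp(a, n):
    """Exponential of an even element with no constant term (the series terminates)."""
    result = {(): Fraction(1)}
    term = {(): Fraction(1)}
    for k in range(1, n + 1):
        term = {key: c / k for key, c in gmul(term, a).items()}  # a^k / k!
        for key, c in term.items():
            result[key] = result.get(key, 0) + c
    return result


def berezin(f, n):
    """Berezin integral: the coefficient of rho_1 ... rho_n."""
    return f.get(tuple(range(1, n + 1)), 0)


def quadratic(F):
    """(1/2) <rho, F rho> = sum_{i<j} F_ij rho_i rho_j for an antisymmetric matrix F."""
    n = len(F)
    return {
        (i + 1, j + 1): Fraction(F[i][j])
        for i in range(n)
        for j in range(i + 1, n)
        if F[i][j] != 0
    }


def pfaffian4(F):
    """Pfaffian of a 4x4 antisymmetric matrix."""
    return F[0][1] * F[2][3] - F[0][2] * F[1][3] + F[0][3] * F[1][2]


F = [[0, 1, 2, 3], [-1, 0, 4, 5], [-2, -4, 0, 6], [-3, -5, -6, 0]]
print(berezin(gexp(quadratic(F), 4), 4), pfaffian4(F))
```

Any polynomial of degree below n has no top-degree coefficient, so `berezin` returns 0 on it, which is the first rule.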



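Proof. When differentiating a form given as in (C.4) or (C.2), we apply the following rules:
1. ρ is odd with respect to exterior derivative;
2. ρ behaves “as if” it were covariantly closed.
To justify the second rule, we first notice that, given any n × n matrix X, integration by parts shows that
\[
\int [D\rho]\, \Big\langle X\rho , \frac{\partial}{\partial\rho} \Big\rangle f = \mathrm{Tr}\, X \int [D\rho]\, f. \tag{C.6}
\]
(With commuting variables we would have the same relation with a minus sign on the r.h.s.) As a consequence,
\[
\int [D\rho]\, \delta f = 0, \qquad \text{with} \qquad \delta f = -\Big\langle \theta\rho , \frac{\partial}{\partial\rho} \Big\rangle f, \tag{C.7}
\]
because θ takes values in so(n). Therefore,
\[
d \int [D\rho]\, f = (-1)^n \int [D\rho]\, \tilde d f,
\]
where the new exterior derivative \(\tilde d\) is defined by d ± δ. Introducing the covariant derivative \(\tilde\nabla = \tilde d + \theta\,\cdot\), we get from (C.7) that \(\tilde\nabla\rho = 0\), that is, rule 2. In particular, we have
\[
\begin{aligned}
\tilde d\, \langle \rho , x \rangle &= -\langle \rho , \tilde\nabla x \rangle = -\langle \rho , \nabla x \rangle, \\
\tilde d\, \langle \rho , \nabla x \rangle &= -\langle \rho , \nabla\nabla x \rangle = -\langle \rho , F x \rangle, \\
\tilde d\, \langle \rho , F\rho \rangle &= 0,
\end{aligned}
\]
since on x-variables \(\tilde\nabla = \nabla\), and by the Bianchi identity. Therefore,
\[
dG = (-1)^{n+1} A + (-1)^n B,
\]
with
\[
A = \int [D\rho]\, \langle \rho , \nabla x \rangle\, \exp S, \qquad
B = \int [D\rho]\, \langle \rho , x \rangle\, \langle \rho , F x \rangle\, \exp S,
\]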


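and S defined in (C.3). Now some simple manipulations and the use of (C.6) show that
\[
A = \int [D\rho]\, \Big\langle \rho , \frac{\partial}{\partial\rho} \Big\rangle \exp S - \lambda \int [D\rho]\, \langle \rho , F\rho \rangle\, \exp S
= n\, \Phi - 2\lambda\, \frac{\partial}{\partial\lambda}\, \Phi. \tag{C.8}
\]
Similarly, we get
\[
\begin{aligned}
B &= -\frac{1}{\lambda} \int [D\rho]\, \langle \rho , x \rangle\, \Big\langle x , \frac{\partial}{\partial\rho} \Big\rangle \exp S + \frac{1}{\lambda} \int [D\rho]\, \langle \rho , x \rangle\, \langle x , \nabla x \rangle\, \exp S \\
  &= -\frac{1}{\lambda} \int [D\rho]\, \Big\langle x , \frac{\partial}{\partial\rho}\, \langle \rho , x \rangle \Big\rangle \exp S = -\frac{1}{\lambda}\, \Phi,
\end{aligned} \tag{C.9}
\]
where we have used the constraint \(\langle x , x \rangle = 1\) and the ensuing identity \(0 = d\, \langle x , x \rangle = 2\, \langle x , \nabla x \rangle\).

To exploit (C.5), it is convenient to expand Φ and G in powers of λ:
\[
\Phi = \sum_{k=0}^{\infty} \lambda^k\, \Phi_k, \qquad G = \sum_{k=0}^{\infty} \lambda^k\, G_k .
\]
Notice that these are actually finite sums. By performing the integrations we get
\[
\Phi_k = \frac{(-1)^{k + \lfloor n/2 \rfloor}}{2^k\, k!\, (n-2k)!}\, [F, k; \nabla x, n-2k], \tag{C.10}
\]
for \(k = 0, 1, \ldots, \lfloor n/2 \rfloor\), and
\[
G_k = \frac{(-1)^{k + \lceil (n-2)/2 \rceil}}{2^k\, k!\, (n-2k-1)!}\, [x; F, k; \nabla x, n-2k-1], \tag{C.11}
\]
for \(k = 0, 1, \ldots, \lceil (n-2)/2 \rceil\). Applying (C.5) to the power expansions, we get
\[
\Phi_0 = 0, \tag{C.12}
\]
\[
dG_k = (-1)^{n+1} \big[ (n-2k)\, \Phi_k + \Phi_{k+1} \big]. \tag{C.13}
\]
Then we have the following

Lemma C.3. The form
\[
\bar\vartheta := \sum_{k=0}^{s} C_k\, G_k \in \Omega^{n-1}_{\mathrm{basic}}(Q \times S^{n-1}), \tag{C.14}
\]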



with \(s = \lceil (n-2)/2 \rceil\), induces a global angular form ϑ on S if and only if the coefficients \(C_k\) are defined by
\[
C_k =
\begin{cases}
(-1)^{k+s}\, \dfrac{(s-k)!}{2^{k+1}\, \pi^{s+1}} & \text{for } n = 2s+2, \\[2ex]
(-1)^{k+s}\, \dfrac{(2s-2k)!}{2^{s-k+1}\, (2\pi)^s\, (s-k)!} & \text{for } n = 2s+1.
\end{cases} \tag{C.15}
\]
Proof. The forms \(\bar\vartheta\) and ϑ are related by the formula
\[
\bar\vartheta = \bar\pi^* \vartheta. \tag{C.16}
\]
The first property a global angular form has to satisfy is \(p_*\vartheta = 1\). By the surjectivity of \(\bar\pi\) and by (C.16), it suffices to show that \(\bar p_* \bar\vartheta = 1\). Since \(\bar p_*\) selects the θ-independent part in
\[
G_0 = \frac{(-1)^s}{(n-1)!}\, \epsilon_{i_1 \ldots i_n}\, x^{i_1} (\nabla x)^{i_2} \cdots (\nabla x)^{i_n},
\]
this property is satisfied if and only if we set the correct normalization:
\[
C_0 = \frac{(-1)^s}{\Omega_{n-1}}, \tag{C.17}
\]
where \(\Omega_{n-1}\) is the volume of the unit (n − 1)-sphere; that is,
\[
\Omega_{2s+1} = \frac{2\, \pi^{s+1}}{s!}, \qquad \Omega_{2s} = \frac{2\, (2\pi)^s}{(2s-1)!!} .
\]
Next we use (C.12) and (C.13) to get
\[
d\bar\vartheta = (-1)^{n+1} \sum_{k=0}^{s} \big[ (n-2k)\, C_k + C_{k-1} \big]\, \Phi_k + (-1)^{n+1}\, C_s\, \Phi_{s+1} .
\]
Now recall that the differential of a global angular form must be basic on \(\bar S \to Q\) (in particular it has to be the pullback w.r.t. \(\bar p\) of a representative of the Euler class). By (C.16), together with the surjectivity of \(\bar\pi\), it is sufficient to show the identity \(d\bar\vartheta = -\bar p^* \pi^* e\), where e is a representative of the Euler class. All the \(\Phi_k\) with k ≤ s contain a form on \(S^{n-1}\), so they cannot be \(\bar p\)-basic (i.e., \(S^{n-1}\)-independent). Therefore, we must choose the coefficients \(C_k\) so that the terms in square brackets vanish. This yields a recursion rule that, once the initial condition is fixed by (C.17), has the unique solution (C.15). Now observe that the last term \(\Phi_{s+1}\) vanishes when n is odd. Therefore, \(\bar\vartheta\) is closed in this case, and this is enough to prove that it is a global angular form. If n is even, however,
\[
\Phi_{s+1} = \int [D\rho]\, \exp\Big( \frac{1}{2}\, \langle \rho , F\rho \rangle \Big) = \mathrm{Pfaff}\, F,
\]
with Pfaff denoting the Pfaffian, and the recursion fixes
\[
C_s = \frac{1}{(2\pi)^{s+1}} .
\]


As a consequence, in this case we get
\[
d\bar\vartheta = \frac{-1}{(2\pi)^{n/2}}\, \mathrm{Pfaff}\, F.
\]
Since the r.h.s. is minus (a representative of the pullback to \(Q \times S^{n-1}\) of) the Euler class, the lemma is proved.

We can rewrite the results of the lemma and (C.11) as follows. In the odd-dimensional case, n = 2s + 1 (cf. [22]), one has
\[
\bar\eta = \frac{1}{2\, (4\pi)^s} \sum_{k=0}^{s} \frac{1}{k!\, (s-k)!}\, [x; F, k; \nabla x, 2s-2k]. \tag{C.18}
\]
In the even-dimensional case, n = 2s + 2, we get instead
\[
\bar\eta = \frac{1}{2\, \pi^{s+1}} \sum_{k=0}^{s} \frac{1}{4^k}\, \frac{(s-k)!}{k!\, (2s-2k+1)!}\, [x; F, k; \nabla x, 2s-2k+1]. \tag{C.19}
\]
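The coefficient bookkeeping behind (C.15), (C.17), (C.18) and (C.19) can be checked mechanically. The sketch below, written under stated assumptions, stores only the rational part of each \(C_k\) (the common power of π, namely \(\pi^{-\lfloor n/2 \rfloor}\), is factored out), verifies the recursion \((n-2k)\, C_k + C_{k-1} = 0\), compares \(C_0\) with the sphere volume computed from the standard formula \(2\pi^{n/2}/\Gamma(n/2)\), and multiplies \(C_k\) by the numerical prefactor of \(G_k\) from (C.11) to recover the prefactors in (C.18) and (C.19). Function names are illustrative.

```python
from fractions import Fraction
from math import factorial, gamma, isclose, pi


def s_of(n: int) -> int:
    """s = ceil((n - 2) / 2), i.e. n = 2s + 2 (n even) or n = 2s + 1 (n odd)."""
    return (n - 1) // 2


def c_coeff(n: int, k: int) -> Fraction:
    """Rational part of C_k in (C.15); the omitted factor is pi**(-(n // 2))."""
    s = s_of(n)
    sign = (-1) ** (k + s)
    if n % 2 == 0:
        return Fraction(sign * factorial(s - k), 2 ** (k + 1))
    return Fraction(sign * factorial(2 * s - 2 * k),
                    2 ** (s - k + 1) * 2 ** s * factorial(s - k))


def g_prefactor(n: int, k: int) -> Fraction:
    """Numerical prefactor of the monomial in G_k, from (C.11)."""
    s = s_of(n)
    return Fraction((-1) ** (k + s), 2 ** k * factorial(k) * factorial(n - 2 * k - 1))


def check(n: int) -> bool:
    s = s_of(n)
    # Recursion forced by the vanishing of the square brackets.
    recursion = all((n - 2 * k) * c_coeff(n, k) + c_coeff(n, k - 1) == 0
                    for k in range(1, s + 1))
    # Normalization (C.17): C_0 = (-1)^s / Omega_{n-1}.
    volume = 2 * pi ** (n / 2) / gamma(n / 2)
    normalization = isclose(float(c_coeff(n, 0)) * pi ** (-(n // 2)),
                            (-1) ** s / volume, rel_tol=1e-12)
    # Products C_k * (prefactor of G_k) against (C.18) / (C.19).
    if n % 2:
        expected = [Fraction(1, 2 * 4 ** s * factorial(k) * factorial(s - k))
                    for k in range(s + 1)]
    else:
        expected = [Fraction(factorial(s - k),
                             2 * 4 ** k * factorial(k) * factorial(2 * s - 2 * k + 1))
                    for k in range(s + 1)]
    products = [c_coeff(n, k) * g_prefactor(n, k) for k in range(s + 1)]
    return recursion and normalization and products == expected


print([check(n) for n in range(2, 10)])
```

For even n the last coefficient is \(C_s = 2^{-(s+1)}\, \pi^{-(s+1)} = (2\pi)^{-(s+1)}\), in agreement with the value fixed by the recursion in the proof.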

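Also observe that if one denotes by T the antipodal map on the fiber crossed with identity on the base, one has \(T^*\vartheta = (-1)^n\, \vartheta\).

Remark C.4. From (C.18), we see that, in the odd-dimensional case, ϑ can also be given the following expression:
\[
\vartheta = \frac{1}{2}\, \frac{1}{s!\, (4\pi)^s} \int [D\rho]\, \langle \rho , x \rangle\, S^s = \frac{1}{2} \int [D\rho]\, \langle \rho , x \rangle\, \exp\Big( \frac{1}{4\pi}\, S \Big),
\]
where now
\[
S = \langle \rho , F\rho \rangle - \langle \rho , \nabla x \rangle^2 = \langle \rho , (F + \nabla x\, \nabla x)\, \rho \rangle .
\]
This is in accordance with the interpretation given in [7] of ϑ as one half of the Euler class of the tangent bundle along the fiber \(T_{S^{n-1}} S\).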

Appendix D. Parallel Transport as a Function on LM

Let us consider a trivial principal bundle P → M; so there exists a global section σ : M → P. Let us now pick a connection A on P; we define the covariant derivative on \(\Omega^*(M, \mathrm{ad}\,P)\) by the formula
\[
d_A \mu := d\mu + [\sigma^* A, \mu], \tag{D.1}
\]
where µ is some section of ad P, and \(\sigma^* A\) is a 1-form on M with values in g. Since \(d_A : \Gamma(M, \mathrm{ad}\,P) \to \Omega^1(M, \mathrm{ad}\,P)\) has all the properties of a covariant derivative, it can be easily extended to forms on M with values in ad P. We pick an element a in \(\Omega^1(M, \mathrm{ad}\,P)\), and we may define another connection starting from A, namely \(\sigma^* A + a\)



(which we write A + a). For the sake of simplicity, let us suppose that A is flat. Then the curvature of A + a is given by 1 FA+a = dA a + [ a , a ]. 2 We apply to A + a the canonical injection ι from g to U(g), so as to obtain a 1-form on M with values in U(g); we omit writing ι before A + a. Let us then define −1 a := H (A)|•0 ev∗1 (a) H (A)|•0 , where by ev1 we have denoted the evaluation map ev1 (γ ; t) := γ (t), a map from LM × [0, 1] to M, and by H (A)|•0 the (inverse of the) parallel transport w.r.t. the connection A, viewed as a function on LM × [0, 1] with values in Aut U(g). Remark D.1. We want to comment here on the definition of H (A)|•0 (even in the case when P is nontrivial). We first consider the product of the pulled-back bundles (ev(0) ◦ π1 )∗ P → LM × [0, 1] and ev∗1 P → LM × [0, 1] which is then a G × G-bundle over LM × [0, 1] × LM × [0, 1]. Then we denote by P = (ev(0) ◦ π1 )∗ P ×π ev∗1 P → LM × [0, 1] the restriction of the product bundle to the diagonal of the base manifold. We then consider the G-valued function H (A)|•0 defined implicitly by the equation p1 , p2 , γ )|t0 , γp1 (t) = p2 H (A;

p1 , p2 ∈ P : π(p1 ) = γ (0), π(p2 ) = γ (t),

where γp1 is the unique A-horizontal lift of γ starting at p1 . It is then clear that this function is G×G-equivariant if we define the action φ : (g, h; k) → hkg −1 of G×G on G. So we can identify H (A)|•0 with a section H (A)|•0 of the associated bundle P ×φ G. (In the case when P is trivial, H (A)|•0 can eventually be identified with a map from LM × [0, 1] to G.) Consider next a finite-dimensional representation ρ : G → Aut V . This induces the action ρ : (g, h; ψ) → ρ(h)ψρ(g −1 ) of G × G on Aut V . So we can • also define H (A)|0ρ as the corresponding section of the associated bundle P ×ρ Aut V . (In the case P trivial, this can then be identified with a map from LM × [0, 1] to Aut V .) In particular, since G operates on g by the adjoint action, it operates on U(g) as well; so the above construction in this case yields the Aut U(g)-parallel transport with free final point H (A)|•0Ad . For the sake of simplicity, throughout the paper we always omit the index referring to the representation as it is clear from the context. Since A is flat, H (A)|•0 enjoys the following useful property: dH (A)|•0 = −π1∗ ev(0)∗ A H (A)|•0 + H (A)|•0 ev∗1 A,

(D.2)

where ev(0) : LM → M is defined by ev(0)(γ) := γ(0); we define further, for n ∈ N, the maps $\pi_n : LM \times \triangle_n \to LM$ by $\pi_n(\gamma; t_1, \dots, t_n) := \gamma$, and by $\triangle_n$ we denote the n-simplex

\[ \triangle_n := \{ (t_1, \dots, t_n) \in [0,1]^n : 0 \le t_1 \le \cdots \le t_n \le 1 \}, \tag{D.3} \]

with orientation given by $dt_n \wedge \cdots \wedge dt_1$. By $H(A)|_0^1$ we denote the (inverse of the) holonomy along the loop γ, considered as a function on LM, taking values in G. It



follows from its definition that a is a 1-form on LM × [0, 1] with values in U(g). We can now define the parallel transport w.r.t. A + a from 1 to 0 as the formal series in U(g) H (A + a)|10 := H (A)|10 +

a1,n ∧ · · · ∧ πn∗ an,n H (A)|10 ,

(D.4)

n≥1 ∗ where ai,n := πi,n a and πi,n (γ ; t1 , . . . , tn ) := (γ ; ti ). It follows from its very definition that the parallel transport is an element of 0 (LM; U(g)).
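As a rough illustration of how a series of integrals over the simplices of (D.3) behaves, the following sketch treats an abelian toy case. The assumptions are ours, not the paper's: the connection A is taken to be trivial, the integrand on each simplex is a constant scalar c, and the sign and orientation conventions of the text are ignored. The n-th iterated integral then equals c^n/n!, because the n-simplex has volume 1/n!, and the series sums to exp(c). The function name `iterated_integrals` is hypothetical.

```python
import math


def iterated_integrals(c, n_max, steps=2000):
    """Return [I_0(1), ..., I_{n_max}(1)] for a constant integrand c.

    I_0(t) = 1 and I_n(t) = integral from 0 to t of c * I_{n-1}(s) ds,
    which is the integral of c^n over {0 <= t_1 <= ... <= t_n <= t}.
    The integrals are approximated with the cumulative trapezoidal rule.
    """
    h = 1.0 / steps
    current = [1.0] * (steps + 1)      # samples of I_0 on the grid
    values = [1.0]
    for _ in range(n_max):
        nxt = [0.0]
        for k in range(1, steps + 1):
            nxt.append(nxt[-1] + 0.5 * h * c * (current[k - 1] + current[k]))
        current = nxt
        values.append(current[-1])     # value at t = 1
    return values


if __name__ == "__main__":
    c = 1.5
    terms = iterated_integrals(c, 12)
    print("third term:", terms[3], "expected:", c ** 3 / math.factorial(3))
    print("series sum:", sum(terms), "exp(c):", math.exp(c))
```

In the non-abelian setting of the text the factors do not commute, so the series no longer collapses to an ordinary exponential; the sketch only shows the role played by the simplex volume.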

Remark D.2. We can define the parallel transport with free final point w.r.t. the connection A + a by H (A + a)|•0 := 1 +

πn+1,n+1∗ an,n+1 , a1,n+1 ∧ · · · ∧

n≥1

with the same notations as above; it follows from its very definition that this particular parallel transport is a map LM × [0, 1] → U(g). The parallel transport as a function on LM × [0, 1] with free initial point is defined analogously by the formula H (A + a)|1• := 1 +

a2,n+1 ∧ · · · ∧ π1,n+1∗ an+1,n+1 .

n≥1

Further, we can define the parallel transport with free end-points as a function on LM × 2 : H (A + a)|•• := 1 +

π1,n,1∗ an+1,n+2 , a2,n+2 ∧ · · · ∧

n≥1

where π1,n,1 (γ ; s1 , s2 , . . . , sn+1 , sn+2 ) := (γ ; s1 , sn+2 ). Theorem D.3. If we denote by dev(0)∗ A the covariant derivative w.r.t. the connection ev(0)∗ A on forms on LM with values in U(g), then the following identity holds, for any flat connection A on P and any a ∈ 1 (M, ad P ): dev(0)∗ A H (A + a)|10 = − ev(0)∗ a ∧ H (A + a)|10 + H (A + a)|10 ev(0)∗ a +    1 1 H (A + a)|s0 ∧ F − A+a s ∧ H (A + a)|s  H (A)|0 . 1≥s≥0

Remark D.4. We have written 1 H (A + a)|s0 ∧ F A+a s ∧ H (A + a)|s 1≥s≥0

$ % 1 := π1∗ H (A + a)|•0 ∧ F A+a ∧ H (A + a)|• ,

where we have used again the notations in Remark D.2.



Proof. We shall apply Stokes' Theorem to the push-forward w.r.t. the maps $\pi_n$; we note that the n-simplex $\triangle_n$ has a boundary, and that this boundary can be written as

\[ \partial\triangle_n = \bigcup_{\alpha=0}^{n} (\partial\triangle_n)_\alpha, \]

where each (∂n )α ∼ = n−1 . With our choice of orientation of the simplices – see after (D.3) – the first face of the boundary comes with opposite orientation, while the second has the right one, the third has opposite orientation again, and so forth: or((∂n )α ) = (−1)α+1 or(n−1 ). We apply the covariant derivative w.r.t A0 to the nth term of the series, and we obtain: $ % dev(0)∗ A πn∗ a1,n ∧ · · · ∧ an,n H (A)|10 $ % a1,n ∧ · · · ∧ = (−1)n πn∗ dπn∗ ev(0)∗ A an,n H (A)|10 (D.5) $ % − (−1)n π∂n ∗ ι∗∂n a1,n ∧ · · · ∧ an,n H (A)|10 , where π∂n : LM × ∂n → LM denotes the projection onto the first factor, while ι∂n : LM × ∂n → LM × n is the canonical injection of the boundary of the simplex into the simplex itself. We have used implicitly the identity dev(0)∗ A H (A)|10 = 0, which follows from (D.2). We now consider the two terms on the right-hand side of (D.5) separately, and we th begin with 7nthe second term, which we call “the n boundary term” from now on. Since ∂n = α=0 ∂n,α , we can write n $ % ι∗∂n a1,n ∧ · · · ∧ an,n = ι∗∂n,α an,n , a1,n ∧ · · · ∧ α=0

and ι∂n,α : LM × (∂n )α → LM × n is the canonical injection of the α th face of the boundary. Considering the orientations of the faces, we obtain for the nth boundary term the following expression: n α=0

$ % (−1)α+1 πn−1∗ ι∗∂n,α an,n . a1,n ∧ · · · ∧

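We begin with the first face α = 0; it is not difficult to prove the following identities:

\[ \pi_{j,n} \circ \iota_{\partial\triangle_n,0} = \begin{cases} \iota(0) \circ \pi_{n-1}, & j = 1, \\ \pi_{j-1,n-1}, & j \neq 1; \end{cases} \]

similarly, one shows for α = n,

\[ \pi_{j,n} \circ \iota_{\partial\triangle_n,n} = \begin{cases} \pi_{j,n-1}, & j \neq n, \\ \iota(1) \circ \pi_{n-1}, & j = n. \end{cases} \]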



We have denoted by ι(0), resp. ι(1), the injection of LM into LM × [0, 1] given by ι(0)(γ ) := (γ ; 0), resp. ι(1)(γ ) := (γ ; 1). For α = 0, n,   j < α, πj,n−1 πj,n ◦ ι∂n,α = πα,n−1 j = α, α + 1,  π j −1,n−1 j > α + 1 holds. It follows therefore (ι(0)H (A)|•0 = 1 by its very definition) ∗ ι∗∂n,0 a1,n ∧ · · · ∧ an,n = πn−1 ev(0)∗ a ∧ a1,n−1 ∧ · · · ∧ an−1,n−1 ; ι∗∂n,α an,n = ∧ a)α,n−1 ∧ · · · ∧ an−1,n−1 ; a1,n ∧ · · · ∧ a1,n−1 ∧ · · · ∧ (a ∗ ∗ ι∂n,n a1,n ∧ · · · ∧ a1,n−1 ∧ · · · ∧ an,n = an−1,n−1 ∧ πn−1 ev(0)∗ a. We consider now the first term under the action of the push-forward w.r.t. πn−1 : $ % a1,n ∧ · · · ∧ an,n H (A)|10 πn−1∗ ι∗∂n,0 $ % ∗ = πn−1∗ πn−1 ev(0)∗ a ∧ a1,n−1 ∧ · · · ∧ an−1,n−1 H (A)|10 $ % a1,n−1 ∧ · · · ∧ = (−1)n−1 ev(0)∗ a ∧ πn−1∗ an−1,n−1 H (A)|10 . Similarly, we obtain for α = n, $ % an,n H (A)|10 a1,n ∧ · · · ∧ πn−1∗ ι∗∂n,n $ % a1,n−1 ∧ · · · ∧ = πn−1∗ an−1,n−1 H (A)|10 ∧ ev(0)∗ a, and for α = 0, n we obtain $ % an,n H (A)|10 a1,n ∧ · · · ∧ πn−1∗ ι∗∂n,α $ % = πn−1∗ ∧ a)α,n−1 ∧ · · · ∧ an−1,n−1 H (A)|10 . a1,n−1 ∧ · · · ∧ (a Finally, we obtain the following expression for the nth boundary term of (D.5): n−1 α=1

$ a1,n−1 ∧ · · · (−1)α+1 πn−1∗

% ∧ (a ∧ a)α,n−1 ∧ · · · ∧ an−1,n−1 H (A)|10 − (−1)n−1 ev(0)∗ a $ % a1,n−1 ∧ · · · ∧ ∧ πn−1∗ an−1,n−1 H (A)|10 + (−1)n+1 % $ · a1,n−1 ∧ · · · ∧ an−1,n−1 H (A)|10 ∧ ev(0)∗ a.



We then consider the first term of (D.5), and by the Leibnitz rule we obtain $ % a1,n ∧ · · · ∧ an,n πn∗ dπn∗ ev(0)∗ A =

n

$ % a1,n ∧ · · · ∧ d

(−1)i+1 πn∗ an,n ; A a i,n ∧ · · · ∧

i=1

here we have used dπ1∗ ev(0)∗ A a = d

A a, which is a consequence of (D.2). Summing up all the two contributions to (D.5) with the correct signs, we obtain for the left-hand side of (D.5), n

$ % a1,n ∧ · · · ∧ d

(−1)n+i+1 πn∗ an,n H (A)|10 A a i,n ∧ · · · ∧

i=1 n−1

+

$ % (−1)n+α πn−1∗ ∧ a)α,n−1 ∧ · · · ∧ an−1,n−1 H (A)|10 a1,n−1 ∧ · · · ∧ (a

α=1

$ % − ev(0)∗ a ∧ πn−1∗ an−1,n−1 H (A)|10 a1,n−1 ∧ · · · ∧ % $ + a1,n−1 ∧ · · · ∧ an−1,n−1 H (A)|10 ∧ ev(0)∗ a. We begin by summing up all the terms which contain before them ev(0)∗ a, and we obtain − ev(0)∗ a ∧ H (A + a)|10 ; similarly, by summing up all the terms which have ev(0)∗ a on the right, we obtain H (A + a)|10 ∧ ev(0)∗ a. By recalling the definition of the curvature of the connection A + a, the sum of the remaining terms will give us n n≥1 i=1

$ % 1 (−1)n+i+1 πn∗ ∧ · · · ∧ a a1,n ∧ · · · ∧ F A+a i,n n,n H (A)|0 .

For 1 ≤ i ≤ n, n ≥ 1, we shall now write the projection πn as the composition of three projections, i.e. πn = πi,n ◦ π1,i−1,n ◦ πi+1,n,n , where the projections are defined as follows: πi+1,n,n : LM × n → LM × i ; (γ ; s1 , . . . , sn ) → (γ , s1 , . . . , si ); π1,i−1,n : LM × i → LM × [0; t]; (γ ; s1 , . . . , si ) → (γ , si ); πi+1,n,n : LM × [0; 1] → LM; (γ ; si ) → γ . We notice for j ≤ i the useful identity πj,n = πj,i ◦ πi+1,n,n , and for j > i holds πj,n = πj −i,n−i ◦ π¯ 1,i,n , for π¯ 1,i,n (γ ; s1 , . . . , sn ) = (γ ; si , . . . , sn ). We then use the following identity (which follows from πn = πi,n ◦ π1,i−1,n ◦ πi+1,n,n and Lemma A.2): πn∗ = (−1)(n−i)i πi,n∗ ◦ π1,i−1,n∗ ◦ πi+1,n,n∗ ;



note the appearance of signs in the above identity: this is due to the fact that the three projections above do reverse the product orientation of the fiber of the trivial bundle over LM given by the projection πn . It is finally useful to introduce the commutative diagram π¯ 1,i,n

LM × n −−−−→ LM × 1+n−i   π πi+1,n,n  1+n−i π1,i−1,n

LM × i −−−−→ LM × [0, 1]; ∗ , in order to get the pullback this diagram allows us to apply LemmaA.1 to πi+1,n,n∗ ◦π¯ 1,i,n w.r.t. π1,i−1,n before the pushforward w.r.t. π1+n−i . We can then apply the first identity of (A.2), when we integrate w.r.t. π1,i−1,n . We shall use once again such a commutative diagram, after integration w.r.t. π1,i−1,n , along with (A.2) and Lemma A.1, in order to obtain the desired identity, accordingly to the notation introduced in Remark D.2.

Remark D.5. Similar identities can be proved for the two other cases in which we consider parallel transports as functions on LM × [0, 1], resp. on LM × 2 . We obtain for the first case the result dπ ∗ ev(0)∗ A H (A + a)|•0 = −π1∗ ev(0)∗ a ∧ H (A + a)|•0 + H (A + a)|•0 ∧ a− $ % ∗ • H (A + a)|•0 ∧ F − π2,2∗ π1,2 A+a ∧ H (A + a)|• . An analogous identity holds for the holonomy as a function depending on the final point: dπ ∗ ev(0)∗ A H (A + a)|1• =

− a ∧ H (A + a)|1• $

+ H (A + a)|1•

∗ − π1,2∗ H (A + a)|•• ∧ π2,2

∧ π1∗

H (A)|10

∗

ev(0) a % 1 F . A+a ∧ H (A + a)|•

H (A)|10

−1

For the second case, we get ∗ ∗ dπ2∗ ev(0)∗ A H (A + a)|•• = − π1,2 a ∧ H (A + a)|•• + H (A + a)|•• ∧ π2,2 a $ % ∗ ∗ ∗ − π2,3∗ FA+a ∧ π3,3 H (A + a)|•• ∧ π2,3 π1,3 H (A + a)|•• ,

where πj,3 : LM × 3 → LM × 2 forgets the j th point of the 3-simplex. We have preferred to adopt the notation t2 H (A + a)|st1 ∧ F A+a s ∧ H (A + a)|s t2 ≥s≥t1

for the third term in the three above expressions, where t1 ≤ t2 can be fixed or can be understood as variables, given the case in the specific context.

Acknowledgement. We thank P. Cotta-Ramusino for stimulating discussions and R. Longoni for carefully reading the manuscript and for useful comments.



Communicated by R. H. Dijkgraaf

Commun. Math. Phys. 221, 659 – 676 (2001)




On the Stability of the Kerr Metric

Horst R. Beyer

Max Planck Institute for Gravitational Physics, Albert Einstein Institute, 14476 Golm, Germany

Received: 28 August 2000 / Accepted: 4 April 2001

Abstract: The reduced (in the angular coordinate ϕ) wave equation and Klein–Gordon equation are considered on a Kerr background and in the framework of $C_0$-semigroup theory. Each equation is shown to have a well-posed initial value problem, i.e., to have a unique solution depending continuously on the data. Further, it is shown that the spectrum of the semigroup’s generator coincides with the spectrum of an operator polynomial whose coefficients can be read off from the equation. In this way the problem of deciding stability is reduced to a spectral problem and a mathematical basis is provided for mode considerations. For the wave equation it is shown that the resolvent of the semigroup’s generator and the corresponding Green’s functions can be computed using spheroidal functions. It is to be expected that, analogous to the case of a Schwarzschild background, the quasinormal frequencies of the Kerr black hole appear as resonances, i.e., poles of the analytic continuation of this resolvent. Finally, stability of the solutions of the reduced Klein–Gordon equation is proven for large enough masses.

1. Introduction

Linear stability of the Schwarzschild black hole was demonstrated by Kay and Wald [14] who showed the boundedness of all solutions of the wave equation corresponding to $C^\infty$ data of compact support. Their proof rests on the positivity of the conserved energy. The problem is more subtle for Kerr spacetime. A conserved energy exists, but the energy density is negative inside the ergosphere. Hence the total energy could be finite while the field still might grow exponentially in parts of the spacetime. Papers by Press and Teukolsky [22], Hartle and Wilkins [8], and Stewart [28] make the absence of exponentially growing normal modes very plausible. Whiting [31] has proven that there are no such modes, and in his proof he showed that normal modes grow at most linearly in time. Recent numerical evolution calculations [16, 17] for slowly and fast rotating Kerr black holes show no sign of exponential growth. In the case of massive scalar perturbations of Kerr, results of Damour, Deruelle and Ruffini [3], Zouros and


Eardley [34], and Detweiler [3] point to the existence of unstable modes. These modes are very slowly growing with growth times similar to the age of the universe. This fact complicates the numerical detection of such modes.

Here we consider the reduced (in the angular coordinate ϕ) wave equation and Klein–Gordon equation on a Kerr background and in the framework of $C_0$-semigroup theory. For this the mathematical framework from [2] is used. For each equation it is shown that the initial value problem is well-posed, i.e., has a unique solution which depends continuously on the data. Further, it is shown that the spectrum of the semigroup’s generator coincides with the spectrum of an operator polynomial whose coefficients can be read off from the equation. In this way the problem of deciding stability is reduced to a spectral problem. For the wave equation it is shown that the resolvent of the semigroup’s generator and the corresponding Green’s functions can be computed using spheroidal functions. It is to be expected that, analogous to the case of a Schwarzschild background, the quasinormal frequencies of the Kerr black hole appear as poles of the analytic continuation of this resolvent. Finally, the stability of the background with respect to reduced massive perturbations is proven for large enough masses. This is done by applying an abstract stability criterion from [2].

The Kerr metric in Boyer-Lindquist coordinates t, r, θ, ϕ is given by

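\[ g = \Bigl(1 - \frac{2Mr}{\Sigma}\Bigr)\, dt^2 + \frac{4Mar\sin^2\theta}{\Sigma}\, dt\, d\varphi - \frac{\Sigma}{\Delta}\, dr^2 - \Sigma\, d\theta^2 - \Bigl(r^2 + a^2 + \frac{2Ma^2 r\sin^2\theta}{\Sigma}\Bigr)\sin^2\theta\, d\varphi^2, \tag{1} \]

where M is the mass, a ∈ [0, M] is the rotational parameter,

\[ \Delta := r^2 - 2Mr + a^2, \qquad \Sigma := r^2 + a^2\cos^2\theta. \tag{2} \]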

The coordinates are constrained by −∞ < t < +∞, $r_+$ < r < +∞, −π < ϕ < π and 0 < θ < π, where

\[ r_+ := M + \sqrt{M^2 - a^2}. \tag{3} \]

As a brief reminder on the Kerr geometry we give the following basic facts relevant for the discussion of the wave equation. The coordinate vector field ∂/∂r becomes singular at r = $r_+$. This value of the radial coordinate marks the event horizon for Kerr spacetime. The coordinate vector field ∂/∂t is null on the ergosphere

\[ r = M + \sqrt{M^2 - a^2\cos^2\theta}, \tag{4} \]

is spacelike inside and timelike outside. So t is not a time coordinate inside the ergosphere and therefore one might think that Boyer-Lindquist coordinates are unsuitable for a stability discussion. It turns out that this is not the case for the methods (from semigroup theory) of this paper. Finally, the Kerr metric is globally hyperbolic outside the horizon and hence the Cauchy problem for the scalar wave equation is well posed for data on any Cauchy surface. This result is not used in this paper. Existence, uniqueness and continuous dependence of the solutions on the initial data are proved here, too.
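The geometric facts just listed can be checked numerically. The sketch below evaluates the horizon radius (3), the ergosphere radius (4) and the metric coefficient of dt² in (1); the function names and the sample values M = 1, a = 0.8 are illustrative choices of ours. With the signature used in (1), ∂/∂t is timelike where this coefficient is positive and spacelike where it is negative.

```python
import math


def delta(r, M, a):
    """Delta = r^2 - 2 M r + a^2, cf. (2)."""
    return r * r - 2.0 * M * r + a * a


def sigma(r, theta, a):
    """Sigma = r^2 + a^2 cos^2(theta), cf. (2)."""
    return r * r + (a * math.cos(theta)) ** 2


def horizon_radius(M, a):
    """Outer horizon r_+ = M + sqrt(M^2 - a^2), cf. (3)."""
    return M + math.sqrt(M * M - a * a)


def ergosphere_radius(theta, M, a):
    """Ergosphere r = M + sqrt(M^2 - a^2 cos^2(theta)), cf. (4)."""
    return M + math.sqrt(M * M - (a * math.cos(theta)) ** 2)


def g_tt(r, theta, M, a):
    """Coefficient of dt^2 in (1): 1 - 2 M r / Sigma."""
    return 1.0 - 2.0 * M * r / sigma(r, theta, a)


if __name__ == "__main__":
    M, a, theta = 1.0, 0.8, 1.0
    r_plus = horizon_radius(M, a)
    r_ergo = ergosphere_radius(theta, M, a)
    print("Delta at r_+          :", delta(r_plus, M, a))
    print("g_tt on the ergosphere:", g_tt(r_ergo, theta, M, a))
    print("g_tt inside           :", g_tt(0.5 * (r_plus + r_ergo), theta, M, a))
    print("g_tt outside          :", g_tt(r_ergo + 1.0, theta, M, a))
```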


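The reduced wave equation governing solutions of the form ψ(t, r, θ, ϕ) = exp(imϕ) u(t, r, θ), where m runs through all integers, is given by

\[ \frac{\partial^2 u}{\partial t^2} + \Bigl[\frac{(r^2+a^2)^2}{\Delta} - a^2\sin^2\theta\Bigr]^{-1} \cdot \Bigl( \frac{4mMar}{\Delta}\, i\,\frac{\partial u}{\partial t} - \frac{\partial}{\partial r}\Delta\frac{\partial u}{\partial r} - \frac{m^2a^2}{\Delta}\, u - \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\sin\theta\frac{\partial u}{\partial\theta} + \frac{m^2}{\sin^2\theta}\, u \Bigr) = 0. \tag{5} \]

A first inspection shows that 0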
0 is assumed to have the dimension $l^{-2}$. The exact value of ε does not influence the results in any essential way. Finally, we define A as the Friedrichs extension of $A_0$. As a consequence A is a densely-defined, linear, selfadjoint and semibounded operator having the same lower bound as $A_0$, i.e., ε. The objects X, A, B and C are easily seen to satisfy Assumptions 1 and 4 of [2]. Applying the results of that paper gives

Theorem 1. (i) By

\[ Y := D(A^{1/2}) \times X \tag{14} \]

and

\[ (\xi|\eta) := \langle A^{1/2}\xi_1 | A^{1/2}\eta_1 \rangle + \langle \xi_2 | \eta_2 \rangle \tag{15} \]

for all $\xi = (\xi_1, \xi_2)$, $\eta = (\eta_1, \eta_2) \in Y$ there is defined a complex Hilbert space (Y, ( | )).

(ii) The operators G and −G defined by

\[ G(\xi, \eta) := (-\eta, (A + C)\xi + iB\eta) \tag{16} \]

for all ξ ∈ D(A) and η ∈ $D(A^{1/2})$ are infinitesimal generators of strongly continuous semigroups $T_+ : [0, \infty) \to L(Y, Y)$ and $T_- : [0, \infty) \to L(Y, Y)$, respectively.

(iii) For all t ∈ [0, ∞):

\[ |T_\pm(t)| \leq \exp(\|C\|\, \|A^{-1/2}\|\, t), \tag{17} \]

where | |, ‖ ‖ denote the operator norm for L(Y, Y) and L(X, X), respectively.


(iv) For every $t_0 \in \mathbb{R}$ and every ξ ∈ D(A) × $D(A^{1/2})$ there is a uniquely determined differentiable map u : ℝ → Y such that

\[ u(t_0) = \xi \tag{18} \]

and

\[ u'(t) = -Gu(t) \tag{19} \]

for all t ∈ ℝ. Here ′ denotes differentiation of functions assuming values in Y. Moreover this u is given by

\[ u(t) := \begin{cases} T_+(t)\xi & \text{for } t \geq 0, \\ T_-(-t)\xi & \text{for } t < 0 \end{cases} \tag{20} \]

for all t ∈ ℝ.

(v) λ ∈ ℂ is a spectral value, eigenvalue of iG if and only if

\[ A + C - \lambda B - \lambda^2 \tag{21} \]

is not bijective and not injective, respectively.

(vi) For any λ from the resolvent set of iG and any $\eta = (\eta_1, \eta_2) \in Y$ one has:

\[ (iG - \lambda)^{-1}\eta = (\xi, i(\lambda\xi + \eta_1)), \tag{22} \]

where

\[ \xi := (A + C - \lambda B - \lambda^2)^{-1}[(B + \lambda)\eta_1 - i\eta_2]. \tag{23} \]

Equation (19) is the interpretation of (5) used in this paper. In this sense (iv) shows the well-posedness of the initial value problem for (5), i.e., the existence and uniqueness of the solution and its continuous dependence on the initial data. Moreover (20) gives a representation of the solution and (iii) gives a rough bound for its growth in time. In general, this bound is not strong enough to imply stability of the solutions to (5). Part (v) reduces the determination of the generator’s spectrum to the determination of the spectrum of the operator polynomial A + C − λB − λ², λ ∈ ℂ [19, 26]. Moreover (vi) does the same for the resolvents. Further, [2] gives the following stability criteria:

Theorem 2. (i) If

\[ \langle \xi | (A + C)\xi \rangle + \tfrac{1}{4} \langle \xi | B\xi \rangle^2 \geq 0 \]

for all ξ ∈ D(A) with ‖ξ‖ = 1, then the spectrum of iG is real. (ii) If A + C − (b/2)B − (b²/4) is positive for some b ∈ ℝ, then the spectrum of iG is real and there are K ≥ 0 and $t_0 \geq 0$ such that |u(t)| ≤ Kt for all t ≥ $t_0$.
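The algebra behind Theorem 1 (v), (vi) and Theorem 2 (i) can be tried out in a one-dimensional toy model, sketched below. As an assumption made only for illustration, the operators A, B, C are replaced by real numbers a, b, c, so that X = ℂ and G acts on pairs of complex numbers; all function names are hypothetical. In this model the spectrum of iG consists of the roots of a + c − λb − λ², and these are real exactly when (a + c) + b²/4 ≥ 0, which is the scalar form of the condition in Theorem 2 (i).

```python
import cmath


def polynomial(lam, a, b, c):
    """Scalar version of the operator polynomial A + C - lam*B - lam^2 in (21)."""
    return a + c - lam * b - lam * lam


def spectrum_iG(a, b, c):
    """Roots of lam^2 + b*lam - (a + c) = 0, i.e. zeros of the polynomial (21)."""
    disc = cmath.sqrt(b * b + 4.0 * (a + c))
    return ((-b + disc) / 2.0, (-b - disc) / 2.0)


def apply_iG_minus_lambda(lam, a, b, c, xi, zeta):
    """Apply (iG - lam) to the pair (xi, zeta), with G taken from (16)."""
    g1 = -zeta
    g2 = (a + c) * xi + 1j * b * zeta
    return (1j * g1 - lam * xi, 1j * g2 - lam * zeta)


def resolvent(lam, a, b, c, eta1, eta2):
    """Scalar version of the resolvent formulas (22) and (23)."""
    xi = ((b + lam) * eta1 - 1j * eta2) / polynomial(lam, a, b, c)
    return (xi, 1j * (lam * xi + eta1))


if __name__ == "__main__":
    a, b, c = 2.0, 0.5, -0.3
    lam = 0.7 + 1.1j
    eta = (1.0 - 2.0j, 0.3 + 0.4j)
    xi, zeta = resolvent(lam, a, b, c, *eta)
    # Applying (iG - lam) to the resolvent output should give back eta.
    print(apply_iG_minus_lambda(lam, a, b, c, xi, zeta), "vs", eta)
    print("spectrum:", spectrum_iG(a, b, c))
```

The toy model says nothing about domains or unboundedness, which is where the actual content of the theorems lies; it only makes the bookkeeping of (16), (22) and (23) concrete.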



Here ‖ ‖, | | denote the induced norms on (X, ⟨ | ⟩) and (Y, ( | )), respectively. Note that the reality of the generator’s spectrum would exclude the existence of exponentially growing mode solutions of (5). It seems that these criteria are not strong enough to prove stability of the solutions of (5).¹ But later on (ii) will be used to conclude stability for the corresponding Klein–Gordon equation for cases where the mass exceeds some given bound depending on m. Note that the positivity of $A_0 + C$ would imply stability via (ii). At first sight positivity of $A_0 + C$ seems unlikely because of the negative potential term $-m^2a^2/\Delta$ in (13). On the other hand it is well known that the occurrence of such a term can be due to the chosen representation space for $A_0 + C$. In addition the domain of this operator is very much restricted by the condition that its elements have compact support in Ω. Since Ω is open it follows that the support of such a function has a strictly positive distance from the boundary. In the theory of Schrödinger operators it is well known from so-called “Poincaré–Sobolev inequalities” that the kinetic energy associated with such a state can exceed a negative potential energy. See, e.g., [33] or for a simple example [23] Vol. II, Example 1 in Chapter X.3. Indeed such inequalities were found, but only ones leading to a positive potential term with asymptotic behaviour ∼ $\Delta^{-\beta}$ for r → $r_+$, where 0 ≤ β < 1. So none of them was found to be strong enough to show positivity of A + C. Indeed the apparent absence of better estimates led to the impression that A + C is indeed negative. If this is really true it should be easy to prove using the results on the domain of A + C from the next section. This point has not been investigated further, because the negativity alone would not give any further information on the stability of the solutions of (5).

4. Investigation of the Domain of A + C

In this section the domain of A + C will be further investigated.
This is done for two reasons. Firstly, to make sure that it contains functions having a reasonable behaviour, both on the axis of symmetry and on the horizon. It turns out that this is indeed the case. In particular, as it should be the case, functions of the form $f(r)P^m_l(\cos\theta)$, where $f \in C_0^2(I_r, \mathbb{C})$ and $P^m_l$, l = |m|, |m| + 1, . . . , are the usual generalized Legendre polynomials, are found to be in the domain of A + C. Secondly, such information is needed as a basis for the construction of the resolvent of G in the next section. We do not give a full characterization of D(A + C) here. Instead, more modestly, sufficient conditions are given on functions f(r) and g(θ) which guarantee that the product f(r)g(θ) is in D(A + C). These conditions will turn out to be sufficient as a basis for the next section. They are as follows:

Theorem 3. For this denote $I_r := (r_+, \infty)$ and $I_\theta := (0, \pi)$ and define

\[ X_r := L^2_{\mathbb{C}}(I_r, r^4/\Delta), \qquad X_\theta := L^2_{\mathbb{C}}(I_\theta, \sin\theta), \tag{24} \]

and for every $f \in C^2(I_r, \mathbb{C})$ and $g \in C^2(I_\theta, \mathbb{C})$,

\[ D_r^2 f := \frac{\Delta}{r^4}\Bigl[ -(\Delta f')' - \frac{m^2a^2}{\Delta} f \Bigr], \qquad D_\theta^2 g := -\frac{1}{\sin\theta}(\sin\theta\, g')' + \frac{m^2}{\sin^2\theta}\, g. \tag{25} \]

1 In the following discussion the trivial cases a = 0, i.e., the case of a Schwarzschild background, and m = 0 corresponding to purely axial perturbations, are excluded. Of course, for these stability of the solutions can be concluded from Theorem 2(ii).


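Let $f \in C^2(I_r, \mathbb{C}) \cap X_r$ and $g \in C^2(I_\theta, \mathbb{C}) \cap X_\theta$ be such that

\[ D_r^2 f \in X_r \quad \text{and} \quad D_\theta^2 g \in X_\theta \tag{26} \]

and for m = 0 in addition such that

\[ \lim_{\theta \to 0} \sin\theta\, g'(\theta) = \lim_{\theta \to \pi} \sin\theta\, g'(\theta) = 0. \tag{27} \]

Then f(r)g(θ) ∈ D(A + C) and

\[ (A + C) f(r)g(\theta) = D_{r\theta}^2 f(r)g(\theta). \tag{28} \]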
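The angular operator in (25) can be probed numerically on a few associated Legendre functions, which the text names as admissible angular factors. The sketch below approximates $D_\theta^2$ by a second-order finite-difference scheme and compares the result with the expected eigenvalue l(l + 1). The three closed-form samples (cos θ, sin θ and sin θ cos θ, each equal to $P^m_l(\cos\theta)$ up to a constant factor), the step size and the function names are our own illustrative choices.

```python
import math


def d_theta_sq(g, theta, m, h=1e-3):
    """Finite-difference value of -(1/sin)(sin g')' + (m^2/sin^2) g at theta."""
    flux_plus = math.sin(theta + h / 2) * (g(theta + h) - g(theta)) / h
    flux_minus = math.sin(theta - h / 2) * (g(theta) - g(theta - h)) / h
    s = math.sin(theta)
    return -(flux_plus - flux_minus) / (h * s) + m * m * g(theta) / (s * s)


# Keys are (m, l); values are P_l^m(cos(theta)) up to a constant factor.
LEGENDRE_SAMPLES = {
    (0, 1): lambda t: math.cos(t),
    (1, 1): lambda t: math.sin(t),
    (1, 2): lambda t: math.sin(t) * math.cos(t),
}


def max_eigen_residual(m, l, thetas):
    """Largest deviation of D_theta^2 g from l(l+1) g over the sample angles."""
    g = LEGENDRE_SAMPLES[(m, l)]
    return max(abs(d_theta_sq(g, t, m) - l * (l + 1) * g(t)) for t in thetas)


if __name__ == "__main__":
    thetas = [0.3 + 0.25 * k for k in range(11)]
    for (m, l) in LEGENDRE_SAMPLES:
        print("m =", m, "l =", l, "residual:", max_eigen_residual(m, l, thetas))
```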

Proof. First it follows from the obvious inequalities

\[ \frac{r^4}{\Delta} \leq \frac{(r^2 + a^2)^2}{\Delta} - a^2\sin^2\theta \leq \frac{4M^2}{r_+^2}\, \frac{r^4}{\Delta}, \tag{29} \]

that $L^2_{\mathbb{C}}(\Omega, r^4\sin\theta/\Delta)$ and X are identical as sets and that the associated norms on that set are equivalent. A further consequence of (29) along with partial integration is the fact that f(r)g(θ) is in the domain $D((A_0 + C)^*)$ of the adjoint $(A_0 + C)^*$ of $A_0 + C$ and in particular that

\[ (A_0 + C)^* f(r)g(\theta) = D_{r\theta}^2 f(r)g(\theta). \tag{30} \]
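The two-sided estimate (29) is elementary but easy to mistype, so a numerical spot check may be useful. The sketch below tests it on sample points outside the horizon; the helper name `weight_bounds_hold` and the small relative tolerance, which absorbs rounding where a bound is attained with equality (for example at a = 0), are our own choices.

```python
import math


def weight_bounds_hold(M, a, r, theta, rel_tol=1e-12):
    """Check r^4/Delta <= (r^2+a^2)^2/Delta - a^2 sin^2(theta)
    <= (4 M^2 / r_+^2) r^4/Delta, cf. (29), for r > r_+."""
    r_plus = M + math.sqrt(M * M - a * a)
    delta = r * r - 2.0 * M * r + a * a
    lower = (r * r) ** 2 / delta
    middle = (r * r + a * a) ** 2 / delta - (a * math.sin(theta)) ** 2
    upper = (4.0 * M * M / r_plus ** 2) * lower
    return lower * (1.0 - rel_tol) <= middle <= upper * (1.0 + rel_tol)


if __name__ == "__main__":
    M = 1.0
    for a in (0.0, 0.5, 0.99):
        r_plus = M + math.sqrt(M * M - a * a)
        ok = all(
            weight_bounds_hold(M, a, r_plus * (1.0 + 0.05 * k), j * math.pi / 6)
            for k in range(1, 40)
            for j in range(7)
        )
        print("a =", a, "bounds hold on the sample grid:", ok)
```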

Hence f(r)g(θ) ∈ $D((A_0 + C)^*)$ if and only if there is a sequence $h_0, h_1, \dots$ of elements of $C_0^2(\Omega, \mathbb{C})$ converging to f(r)g(θ) and such that for every given ε > 0 there is $\nu_0 \in \mathbb{N}$ such that for all µ, ν ∈ ℕ satisfying µ ≥ $\nu_0$ and ν ≥ $\nu_0$:

\[ |\langle h_\mu - h_\nu | (A_0 + C + \alpha)(h_\mu - h_\nu) \rangle| < \varepsilon. \tag{31} \]

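In the following the existence of such a sequence will be shown. Basic for this is the following inequality valid for all $u \in C_0^2(I_r, \mathbb{C})$ and $v \in C_0^2(I_\theta, \mathbb{C})$:

\[ \begin{aligned} \langle u(r)v(\theta) | (A_0 + C + \alpha) u(r)v(\theta) \rangle &\leq \int_{r_+}^{\infty} |u|^2\, dr \int_0^{\pi} \sin\theta\, v^* D_\theta^2 v\, d\theta + \int_{r_+}^{\infty} r^4 \Delta^{-1} u^* (D_r^2 + m^2a^2/r_+^4) u\, dr \int_0^{\pi} \sin\theta\, |v(\theta)|^2\, d\theta \\ &\leq \int_{r_+}^{\infty} r^4 \Delta^{-1} u^* (D_r^2 + m^2a^2/r_+^4 + r_+^{-2}) u\, dr \int_0^{\pi} \sin\theta\, v^* (D_\theta^2 + 1) v\, d\theta. \end{aligned} \tag{32} \]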

Here some elementary estimates have been used along with the positivity of $D_r^2 + m^2a^2/r_+^4$ on $C_0^2(I_r, \mathbb{C}) \subset X_r$. Since A + C + α is in particular positive, also the following inequality is valid for all $u_1, u_2 \in C_0^2(I_r, \mathbb{C})$ and $v_1, v_2 \in C_0^2(I_\theta, \mathbb{C})$:

\[ \begin{aligned} &|\langle u_1(r)v_1(\theta) - u_2(r)v_2(\theta) | (A_0 + C + \alpha)[u_1(r)v_1(\theta) - u_2(r)v_2(\theta)] \rangle| \\ &\quad = \| (A + C + \alpha)^{1/2}[u_1(r) - u_2(r)]v_1(\theta) + (A + C + \alpha)^{1/2} u_2(r)[v_1(\theta) - v_2(\theta)] \|^2 \\ &\quad \leq 2\langle [u_1(r) - u_2(r)]v_1(\theta) | (A_0 + C + \alpha)[u_1(r) - u_2(r)]v_1(\theta) \rangle + 2\langle u_2(r)[v_1(\theta) - v_2(\theta)] | (A_0 + C + \alpha) u_2(r)[v_1(\theta) - v_2(\theta)] \rangle, \end{aligned} \tag{33} \]



where $\alpha' := m^2a^2/r_+^4 + r_+^{-2}$. Since f is in the domain of the Friedrichs extension of $D_r^2$ on $C_0^2(I_r, \mathbb{C}) \subset X_r$ there is a sequence $f_0, f_1, \dots$ of elements of $C_0^2(I_r, \mathbb{C})$ converging to f(r) and such that for every given ε > 0 there is $\nu_0 \in \mathbb{N}$ such that for all µ, ν ∈ ℕ satisfying µ ≥ $\nu_0$ and ν ≥ $\nu_0$:

\[ \int_{r_+}^{\infty} r^4 \Delta^{-1} (f_\mu - f_\nu)^* (D_r^2 + \alpha')(f_\mu - f_\nu)\, dr < \varepsilon. \tag{34} \]

Obviously, by an argument analogous to (33) this implies that the sequence

\[ \int_{r_+}^{\infty} r^4 \Delta^{-1} f_\nu^* (D_r^2 + \alpha') f_\nu\, dr, \quad \nu \in \mathbb{N} \tag{35} \]

is bounded. Moreover since g is in the domain of the Friedrichs extension of $D_\theta^2$ on $C_0^2(I_\theta, \mathbb{C}) \subset X_\theta$ there is a sequence $g_0, g_1, \dots$ of elements of $C_0^2(I_\theta, \mathbb{C})$ converging to g(θ) and such that for every given ε > 0 there is $\nu_0 \in \mathbb{N}$ such that for all µ, ν ∈ ℕ satisfying µ ≥ $\nu_0$ and ν ≥ $\nu_0$:

\[ \int_0^{\pi} \sin\theta\, (g_\mu - g_\nu)^* (D_\theta^2 + 1)(g_\mu - g_\nu)\, d\theta < \varepsilon. \tag{36} \]

Here too, by an argument analogous to (33) this implies that the sequence

\[ \int_0^{\pi} \sin\theta\, g_\nu^* (D_\theta^2 + 1) g_\nu\, d\theta, \quad \nu \in \mathbb{N} \tag{37} \]

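is bounded. Finally, because of

\[ \begin{aligned} &|\langle f_\mu(r)g_\mu(\theta) - f_\nu(r)g_\nu(\theta) | (A_0 + C + \alpha)[f_\mu(r)g_\mu(\theta) - f_\nu(r)g_\nu(\theta)] \rangle| \\ &\quad \leq 2 \int_{r_+}^{\infty} r^4 \Delta^{-1} (f_\mu - f_\nu)^* (D_r^2 + \alpha')(f_\mu - f_\nu)\, dr \cdot \int_0^{\pi} \sin\theta\, g_\mu^* (D_\theta^2 + 1) g_\mu\, d\theta \\ &\qquad + 2 \int_{r_+}^{\infty} r^4 \Delta^{-1} f_\nu^* (D_r^2 + \alpha') f_\nu\, dr \cdot \int_0^{\pi} \sin\theta\, (g_\mu - g_\nu)^* (D_\theta^2 + 1)(g_\mu - g_\nu)\, d\theta, \end{aligned} \tag{38} \]

the sequence $h_0, h_1, \dots$ defined by

\[ h_\nu := f_\nu(r) g_\nu(\theta), \quad \nu \in \mathbb{N} \tag{39} \]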

has the required properties.

In the proof we have used facts on the Sturm–Liouville operators $D_r^2$ and $D_\theta^2$. Now, for the reader’s convenience these will be given. For this define the (obviously) linear, symmetric and semibounded operators $A_{r0}$, $A_{\theta 0}$ in $X_r$ and $X_\theta$, respectively, by

\[ A_{r0} f := D_r^2 f, \qquad A_{\theta 0}\, g := D_\theta^2 g, \tag{40} \]

for every $f \in C_0^2(I_r, \mathbb{C})$ and every $g \in C_0^2(I_\theta, \mathbb{C})$. Then one has the following


Lemma 4. (i) $A_{r0}$ is essentially self-adjoint. (ii) $A_{\theta 0}$ is essentially self-adjoint for m > 0. For m = 0, the Friedrichs extension of $A_{\theta 0}$ is given by the closure of the operator $A_{\theta F}$ defined by $A_{\theta F}\, g := D_\theta^2 g$ for every $g \in C^2(I_\theta, \mathbb{C}) \cap X_\theta$ satisfying (27) together with the condition that $D_\theta^2 g \in X_\theta$. For all m the spectrum of the Friedrichs extension of $A_{\theta 0}$ is given by {|m|(|m| + 1), (|m| + 1)(|m| + 2), . . . }.

Proof. (i) For this define the auxiliary Sturm–Liouville operator $\hat{A}_{r0}$ in $X_r$ by

\[ \hat{A}_{r0} f := -\frac{\Delta}{r^4} (\Delta f')' \tag{41} \]

for every $f \in C_0^2(I_r, \mathbb{C})$. Obviously, $\hat{A}_{r0}$ is densely-defined, linear, symmetric and positive. Moreover since $-m^2a^2/r^4$ is bounded on $I_r$, it follows by the Rellich–Kato theorem (see, e.g., Theorem X.12 in Volume II of [23]) that $A_{r0}$ is essentially self-adjoint if and only if $\hat{A}_{r0}$ is essentially self-adjoint. Now, the equation $(\Delta f')' = 0$ has nonvanishing constants as solutions. Since these are not in $X_r$ at both ends of $I_r$, it follows that $\hat{A}_{r0}$ is in the limit point case, both at $r_+$ and at +∞. Hence $\hat{A}_{r0}$ is essentially self-adjoint (see, e.g., [30]). Finally, from this follows (i). (ii) This statement is, of course, well known.

5. Computation of the Generator’s Resolvent

In the following the resolvent of G will be determined for spectral parameters λ which are non-real and at the same time such that iaλ is not an exceptional value.² Note that because of Theorem 1 (vi), the resolvent of G can be derived from the inverses of the operator polynomial A + C − λB − λ², which are given in (ii) of the following theorem on a dense subset of X.

Theorem 5. Let λ be a non-real element of the resolvent set of iG which moreover is such that iaλ is not an exceptional value. For each m ∈ ℤ let

\[ ps_l^m(\cos\theta, -a^2\lambda^2), \qquad l = |m|, |m| + 1, |m| + 2, \dots \tag{42} \]

be the basis³ of $X_\theta$ consisting of spheroidal eigenfunctions of $D_\theta^2 + \lambda^2 a^2 \sin^2\theta$ corresponding to the eigenvalues

\[ \lambda^m_{|m|}(-a^2\lambda^2),\ \lambda^m_{|m|+1}(-a^2\lambda^2),\ \dots, \tag{43} \]

respectively.⁴ Finally, let $g \in C_0(I_r, \mathbb{C})$, m ∈ ℤ and l ∈ {|m|, |m| + 1, . . . }. Then (i) the subset of X consisting of all finite linear combinations of elements of the form

\[ h(r, \theta)\, g(r)\, p_l^m(\cos\theta, -a^2\lambda^2), \tag{44} \]

2 For the definition of these values see [20].
3 In the sense that the span of these functions is dense in $X_\theta$. Note that these functions are not orthogonal in general. Instead this sequence and the sequence consisting of its complex conjugates form a biorthogonal basis of $X_\theta$. See Theorem 4 in Sect. 3.23 of [20].
4 For the definition of the functions $ps_l^m$ see [20].

669

where h :=

r4

(r 2 + a 2 )2 − a 2 sin2 θ

−1 ,

(45)

and g, l run through the elments of C0 (Ir , C) and {|m|, |m| + 1, |m| + 2, . . . }, respectively, is dense in X; (ii) (A + C − λB − λ2 )−1 h(r, θ )g(r)plm (cos θ, −a 2 λ2 ) = fr (r)plm (cos θ, −a 2 λ2 ),

(46)

where fr ∈ C 2 (Ir , C) ∩ Xr is such that Dr2 fr ∈ Xr and moreover satisfies 2 2 2 4 fr + λ m Drλ l (−a λ )(/r )fr = g.

(47)

Here for every φ ∈ C 2 (Ir , C), 2 φ := − Drλ

1 2 2 (ma + 2λMr) φ − + λ ( + 4Mr) φ. r4 r4

(48)

Proof. First we notice that $h$ is $C^\infty$ on $\Omega$ and satisfies, as a consequence of (29),

\[ r_+^2/(4M^2) \le h \le 1. \tag{49} \]
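The two-sided bound (49) can be spot-checked numerically. The following Python sketch is only an illustration under stated assumptions: it reads (45) as $h = r^4/[(r^2+a^2)^2 - a^2\Delta\sin^2\theta]$ with the usual Kerr conventions $\Delta = r^2 - 2Mr + a^2$ and $r_+ = M + \sqrt{M^2 - a^2}$, and the function names, parameter values and sampling grid are hypothetical choices, not part of the paper.

```python
import math


def horizon_radius(M, a):
    """Outer horizon r_+ = M + sqrt(M^2 - a^2), assuming 0 <= a < M."""
    return M + math.sqrt(M * M - a * a)


def h(r, theta, M, a):
    """Assumed reading of (45): r^4 / ((r^2 + a^2)^2 - a^2 * Delta * sin^2(theta))."""
    delta = r * r - 2.0 * M * r + a * a
    return r**4 / ((r * r + a * a) ** 2 - a * a * delta * math.sin(theta) ** 2)


def check_bounds(M=1.0, a=0.9, n_r=200, n_theta=50):
    """Return (min h, max h, r_+^2/(4 M^2)) over a sample grid with r > r_+."""
    rp = horizon_radius(M, a)
    values = []
    for i in range(1, n_r + 1):
        r = rp * (1.0 + 0.05 * i)          # radii strictly outside the horizon
        for j in range(1, n_theta):
            theta = math.pi * j / n_theta  # angles strictly inside (0, pi)
            values.append(h(r, theta, M, a))
    return min(values), max(values), rp * rp / (4.0 * M * M)
```

On such a grid the smallest sampled value of $h$ should stay above $r_+^2/(4M^2)$ and the largest below 1; a sampled check of this kind is of course no substitute for the argument via (29).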

Hence the maximal multiplication operator $T_h$ by $h$ in $X$ is defined on the whole of $X$, is bijective, and its inverse is given by the maximal multiplication operator $T_{1/h}$ by the function $1/h$, which is defined on the whole of $X$, too. Obviously, the subset of $X$ consisting of all finite linear combinations of elements of the form

\[ g(r)\, p_l^m(\cos\theta, -a^2\lambda^2), \tag{50} \]

where $g \in C_0(I_r, \mathbb{C})$ and $l = |m|, |m|+1, |m|+2, \dots$, is dense in $X$. Hence this is true for the subset of $X$ consisting of all finite linear combinations of elements of the form

\[ h(r,\theta)\, g(r)\, p_l^m(\cos\theta, -a^2\lambda^2), \tag{51} \]

where $g \in C_0(I_r, \mathbb{C})$ and $l = |m|, |m|+1, |m|+2, \dots$, too. In the following let $g$ be some element of $C_0(I_r, \mathbb{C})$ and $l$ be some element of $\{|m|, |m|+1, |m|+2, \dots\}$. We will compute the element $f \in X$ satisfying

\[ (A + C - \lambda B - \lambda^2)\, f(r,\theta) = h(r,\theta)\, g(r)\, p_l^m(\cos\theta, -a^2\lambda^2). \tag{52} \]

We note that by Theorem 2

\[ (A + C - \lambda B - \lambda^2)\, f_r(r)\, p_l^m(\cos\theta, -a^2\lambda^2) = h(r,\theta) \left[ D_{r\lambda}^2 f_r + \lambda_l^m(-a^2\lambda^2)\,(\Delta/r^4)\, f_r \right](r)\; p_l^m(\cos\theta, -a^2\lambda^2) \tag{53} \]

for every $f_r \in C^2(I_r, \mathbb{C}) \cap X_r$ such that

\[ D_r^2 f_r \in X_r. \tag{54} \]


In the following we construct such an $f_r$ which satisfies in particular (47). Then by the bijectivity of $A + C - \lambda B - \lambda^2$ we conclude that

\[ f(r,\theta) = f_r(r)\, p_l^m(\cos\theta, -a^2\lambda^2). \tag{55} \]

For this construction we need some auxiliary solutions $f_{r1}$, $f_{r2}$, $f_{r3}$ and $f_{r4}$ of the homogeneous equation associated with (47), i.e.,

\[ f_r'' + \frac{2(r-M)}{\Delta}\, f_r' + \left[ \frac{(ma + 2\lambda M r)^2}{\Delta^2} + \lambda^2 \left( 1 + \frac{4Mr}{\Delta} \right) - \frac{s}{\Delta} \right] f_r = 0, \tag{56} \]

where $s := \lambda_l^m(-a^2\lambda^2)$, having special asymptotic behaviour at the singular point $r = r_+$ and at $+\infty$. First, by defining $\bar{f}_r := \Delta^{1/2} f_r$ and by introducing the new independent variable $r_*$,

\[ r_* := \sqrt{r(r+4M)} + 2M \ln\!\left( (\sqrt{r+4M} + \sqrt{r})^2/(4M) \right), \tag{57} \]

one gets a homogeneous first order system for $\bar{f}_r$ and $d\bar{f}_r/dr_*$ which is equivalent to (56) and which satisfies the assumptions of Theorem 8 in the Appendix. From this theorem follows the existence of linearly independent continuously differentiable solutions $(\bar{f}_{r1}, d\bar{f}_{r1}/dr_*)$ and $(\bar{f}_{r2}, d\bar{f}_{r2}/dr_*)$ of the system along with continuously differentiable functions $R_1$ and $R_2$ such that

\begin{align*}
\bar{f}_{r1}(r_*) &= e^{i\lambda r_*}\,(1 + R_{11}(r_*)), & \frac{d\bar{f}_{r1}}{dr_*}(r_*) &= e^{i\lambda r_*}\,(i\lambda + R_{12}(r_*)), \\
\bar{f}_{r2}(r_*) &= e^{-i\lambda r_*}\,(1 + R_{21}(r_*)), & \frac{d\bar{f}_{r2}}{dr_*}(r_*) &= e^{-i\lambda r_*}\,(-i\lambda + R_{22}(r_*)), \\
\lim_{r_* \to \infty} |R_1(r_*)| &= \lim_{r_* \to \infty} |R_2(r_*)| = 0. \tag{58}
\end{align*}
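Differentiating (57) by hand gives $dr_*/dr = \sqrt{1 + 4M/r}$, which is the large-$r$ form of the factor $1 + 4Mr/\Delta$ multiplying $\lambda^2$ in (56). The short Python sketch below compares this closed form with a central finite difference of (57). It is an illustrative check only; the function names, the step size `eps` and the sample radii are hypothetical choices.

```python
import math


def r_star(r, M):
    """The coordinate r_* of (57)."""
    return math.sqrt(r * (r + 4.0 * M)) + 2.0 * M * math.log(
        (math.sqrt(r + 4.0 * M) + math.sqrt(r)) ** 2 / (4.0 * M)
    )


def dr_star_dr_numeric(r, M, eps=1e-6):
    """Central finite-difference approximation of dr_*/dr."""
    return (r_star(r + eps, M) - r_star(r - eps, M)) / (2.0 * eps)


def dr_star_dr_closed(r, M):
    """Closed form sqrt(1 + 4M/r) obtained by differentiating (57)."""
    return math.sqrt(1.0 + 4.0 * M / r)
```

For instance, with $M = 1$ the two functions should agree to well within $10^{-6}$ at $r = 2$, $5$ and $50$.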

In the following denote by $f_{r1}$, $f_{r2}$ the solutions of (56) corresponding to $(\bar{f}_{r1}, d\bar{f}_{r1}/dr_*)$ and $(\bar{f}_{r2}, d\bar{f}_{r2}/dr_*)$, respectively. Moreover define

\[ f_{rR} := \begin{cases} f_{r1} & \text{for } \mathrm{Im}(\lambda) > 0, \\ f_{r2} & \text{for } \mathrm{Im}(\lambda) < 0. \end{cases} \tag{59} \]

Then it follows by (58) that $\phi f_{rR} \in C^2(I_r, \mathbb{C}) \cap X_r$ and $D_r^2(\phi f_{rR}) \in X_r$ for every $\phi \in C^2(I_r, \mathbb{R})$ which is identically 0 for $r < r_0$ and identically 1 for $r > r_1$, where $r_0, r_1 \in I_r$ are such that $r_0 < r_1$, but otherwise arbitrary. For the second step, by defining $g_1 := f_r/\Delta$ and $g_2 := f_r'$ one gets a homogeneous first order system for $(g_1, g_2)$ which is equivalent to (56) and which satisfies the assumptions of Corollary 9 in the Appendix. From this corollary follows the existence of linearly independent continuously differentiable solutions $(g_{11}, g_{12})$ and $(g_{21}, g_{22})$ of the system along with continuously differentiable functions $R_3$ and $R_4$ such that

\begin{align*}
g_{11}(r) &= (r - r_+)^{-\sigma_1}\,[1 + R_{31}(r)], & g_{12}(r) &= (r - r_+)^{-\sigma_1}\,[-i(ma + 2\lambda M r_+) + R_{32}(r)], \\
g_{21}(r) &= (r - r_+)^{-\sigma_2}\,[1 + R_{41}(r)], & g_{22}(r) &= (r - r_+)^{-\sigma_2}\,[i(ma + 2\lambda M r_+) + R_{42}(r)], \\
\lim_{r \to r_+} |R_3(r)| &= \lim_{r \to r_+} |R_4(r)| = 0, \tag{60}
\end{align*}

where

\[ \sigma_1 := \frac{\sqrt{M^2 - a^2} + i\,((ma/2) + \lambda M r_+)}{\sqrt{M^2 - a^2}}, \qquad \sigma_2 := \frac{\sqrt{M^2 - a^2} - i\,((ma/2) + \lambda M r_+)}{\sqrt{M^2 - a^2}}. \tag{61} \]

In the following denote by $f_{r3}$, $f_{r4}$ the solutions of (56) corresponding to $(g_{11}, g_{12})$ and $(g_{21}, g_{22})$, respectively. Moreover define

\[ f_{rL} := \begin{cases} f_{r3} & \text{for } \mathrm{Im}(\lambda) > 0, \\ f_{r4} & \text{for } \mathrm{Im}(\lambda) < 0. \end{cases} \tag{62} \]

Then it follows by (60) that $\phi f_{rL} \in C^2(I_r, \mathbb{C}) \cap X_r$ and $D_r^2(\phi f_{rL}) \in X_r$ for every $\phi \in C^2(I_r, \mathbb{R})$ which is identically 1 for $r < r_0$ and identically 0 for $r > r_1$, where $r_0, r_1 \in I_r$ are such that $r_0 < r_1$, but otherwise arbitrary. In the next step we notice that $f_{rR}$ and $f_{rL}$ are linearly independent, because otherwise we would get a contradiction to the assumed bijectivity of $A + C - \lambda B - \lambda^2$. Hence the Wronski determinant $W$ of $f_{rR}$ and $f_{rL}$,

\[ W := \Delta\,\left( f_{rL}\, f_{rR}' - f_{rL}'\, f_{rR} \right), \tag{63} \]

is constant and different from 0. Therefore we can define

\[ f_r(r) := -\frac{f_{rR}(r)}{W} \int_{r_+}^{r} f_{rL}(r')\, g(r')\, dr' - \frac{f_{rL}(r)}{W} \int_{r}^{\infty} f_{rR}(r')\, g(r')\, dr' \tag{64} \]

for all $r \in I_r$. It follows from the foregoing results on $f_{rL}$ and $f_{rR}$ and from a simple computation that $f_r \in C^2(I_r, \mathbb{C}) \cap X_r$, $D_r^2 f_r \in X_r$ and that $f_r$ satisfies (47). Finally, from the bijectivity of $A + C - \lambda B - \lambda^2$ we conclude (46).

6. The Case of the Klein–Gordon Equation

Compared to the wave equation considered in the previous sections, the only change in this case is that the operator $C$ has to be replaced by $C' := C + (m_0^2/g^{00})$, where $m_0$ denotes the mass of the field and $m_0^2/g^{00}$ is the maximal multiplication operator in $X$, which is defined on the whole of $X$ as well as bounded, since this function is easily seen to be bounded on $\Omega$. The other objects $X$, $A$ and $B$ stay the same. Again it is easy to verify that $X$, $A$, $B$ and $C'$ satisfy Assumptions 1 and 4 of [2]. As a consequence one has theorems analogous to Theorem 1 and Theorem 2. They imply the well-posedness of the initial value problem, i.e., the existence and uniqueness of the solution and its continuous dependence on the initial data. Further, via the analogue of Theorem 2 (ii), Theorem 7 below implies for masses satisfying (69) that the spectrum of the corresponding generator is real and that the norm of the solutions grows at most linearly in time. In particular there are no exponentially growing modes in these cases.

Lemma 6. Let $B'$ be a bounded linear and self-adjoint operator on $X$. Then $A + B'$ is identical to the Friedrichs extension of $A_0 + B'$.


Proof. First, since $B'$ is bounded linear and self-adjoint on $X$, it follows that

\[ (A_0 + B')^* = A_0^* + B'. \tag{65} \]

Hence the domain of the Friedrichs extension $(A_0 + B')_F$ of $A_0 + B'$ is given by those elements $f$ from $D(A_0^*)$ for which there is a sequence $f_0, f_1, \dots$ in $D(A_0)$ converging to $f$ and such that for every $\delta > 0$ there is a corresponding $\nu_0 \in \mathbb{N}$ such that for all $\mu, \nu \in \mathbb{N}$,

\[ |\langle f_\mu - f_\nu \,|\, (A_0 + B' + \|B'\|)(f_\mu - f_\nu) \rangle| < \delta \tag{66} \]

if both $\mu > \nu_0$ and $\nu > \nu_0$. Since (66) implies

\[ |\langle f_\mu - f_\nu \,|\, A_0 (f_\mu - f_\nu) \rangle| < \delta, \tag{67} \]

it follows that $f$ is an element of $D(A)$, too. Further, (65) implies

\[ (A_0 + B')_F\, f = (A_0 + B')^* f = A f + B' f. \tag{68} \]

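Hence $A + B'$ is a linear self-adjoint (by the Rellich–Kato theorem, see, e.g., Theorem X.12 in Volume II of [23]) extension of $(A_0 + B')_F$ and, finally, since $(A_0 + B')_F$ is self-adjoint, $(A_0 + B')_F = A + B'$.

Theorem 7. Define $b := ma/(M r_+)$ and let

\[ m_0 \ge \frac{|m|\,a}{2 M r_+} \sqrt{1 + \frac{2M}{r_+} + \frac{a^2}{r_+^2}}. \tag{69} \]

Then

\[ A + C + m_0^2/g^{00} + (b/2) B - b^2/4 \tag{70} \]

is positive.

Proof. Because of the preceding lemma it is enough to prove that

\[ \langle f \,|\, (A_0 + C + m_0^2/g^{00} + (b/2) B - b^2/4) f \rangle \ge 0 \tag{71} \]

for all $f \in C_0^2(\Omega, \mathbb{C})$. Now let $f$ be such an element. Since its support $\mathrm{supp}(f)$ is a compact subset of $\Omega$, there are $r_0 > r_+$ and $r_1 > r_0$ such that $\mathrm{supp}(f) \subset J \times (0, \pi) \subset \Omega$, where $J := (r_0, r_1)$. In a first step one gets by partial integration, Fubini's theorem and Lemma 4 (ii),

\begin{align*}
\langle f \,|\, (A_0 + C) f \rangle
&= \int_\Omega \sin\theta\; f^* \left[ -\frac{\partial}{\partial r} \Delta \frac{\partial}{\partial r} - \frac{m^2 a^2}{\Delta} - \frac{1}{\sin\theta} \frac{\partial}{\partial\theta} \sin\theta \frac{\partial}{\partial\theta} + \frac{m^2}{\sin^2\theta} \right] f \; dr\, d\theta \\
&= \int_0^\pi \left( \int_{r_0}^{r_1} f_\theta^* \left[ -\frac{d}{dr} \Delta \frac{d}{dr} - \frac{m^2 a^2}{\Delta} \right] f_\theta \; dr \right) \sin\theta\, d\theta \\
&\quad + \int_{r_0}^{r_1} \left( \int_0^\pi f_r^* \left[ -\frac{1}{\sin\theta} \frac{d}{d\theta} \sin\theta \frac{d}{d\theta} + \frac{m^2}{\sin^2\theta} \right] f_r \; \sin\theta\, d\theta \right) dr \\
&\ge \int_0^\pi \int_{r_0}^{r_1} f_\theta^* \left[ |m|(|m|+1) - \frac{m^2 a^2}{\Delta} \right] f_\theta \; dr\, \sin\theta\, d\theta.
\end{align*}

Further using

\[ \langle f \,|\, f/g^{00} \rangle \ge \int_0^\pi \int_{r_0}^{r_1} r^2\, |f_\theta|^2 \; dr\, \sin\theta\, d\theta, \tag{72} \]

\[ \langle f \,|\, B f \rangle = 4 m M a \int_0^\pi \int_{r_0}^{r_1} \frac{r}{\Delta}\, |f_\theta|^2 \; dr\, \sin\theta\, d\theta, \tag{73} \]

\[ \langle f \,|\, f \rangle \le \int_\Omega \frac{(r^2 + a^2)^2}{\Delta}\, \sin\theta\, |f|^2 \; dr\, d\theta, \tag{74} \]

we get

\begin{align*}
\langle f \,|\, (A_0 + C + m_0^2/g^{00} + (b/2) B - b^2/4) f \rangle
&\ge \int_0^\pi \int_{r_0}^{r_1} f_\theta^* \left[ |m|(|m|+1) - \frac{m^2 a^2}{r_+^2}\, \frac{r - r_+}{r - r_-} + m_0^2 r^2 - \frac{m^2 a^2}{4 M^2 r_+^2}\, (r^2 + 2Mr + a^2) \right] f_\theta \; dr\, \sin\theta\, d\theta \\
&\ge \int_0^\pi \int_{r_0}^{r_1} f_\theta^* \left[ |m| + m^2 \left( 1 - \frac{a^2}{r_+^2} \right) + \frac{m^2 a^2}{4 M^2 r_+^4} \cdot \left( (r_+^2 + 2 M r_+ + a^2)\, r^2 - r_+^2\, (r^2 + 2Mr + a^2) \right) \right] f_\theta \; dr\, \sin\theta\, d\theta \ge 0. \tag{75}
\end{align*}

Hence the positivity of $A_0 + C + m_0^2/g^{00} + (b/2) B - b^2/4$ follows.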
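The algebra behind the first inequality in (75) can be checked numerically. The following Python sketch is an illustration under stated assumptions, not part of the proof: it takes the usual Kerr conventions $\Delta = r^2 - 2Mr + a^2$ and $r_\pm = M \pm \sqrt{M^2 - a^2}$, and it assumes that the three radial coefficients multiplying $m^2a^2$ are $-1/\Delta$ (from $A_0 + C$), $2r/(r_+\Delta)$ (from $(b/2)B$ via (73)) and $-(r^2+a^2)^2/(4M^2r_+^2\Delta)$ (from $-b^2/4$ via (74)). All function names and sample values are hypothetical.

```python
import math


def kerr_radii(M, a):
    """Return (r_plus, r_minus) = M +/- sqrt(M^2 - a^2), assuming 0 <= a < M."""
    root = math.sqrt(M * M - a * a)
    return M + root, M - root


def coeff_before(r, M, a):
    """Coefficient of m^2 a^2 before simplification (three terms over Delta)."""
    rp, _ = kerr_radii(M, a)
    delta = r * r - 2.0 * M * r + a * a
    return (-1.0 + 2.0 * r / rp
            - (r * r + a * a) ** 2 / (4.0 * M * M * rp * rp)) / delta


def coeff_after(r, M, a):
    """Coefficient of m^2 a^2 in the form appearing in the first line of (75)."""
    rp, rm = kerr_radii(M, a)
    return (-(r - rp) / (rp * rp * (r - rm))
            - (r * r + 2.0 * M * r + a * a) / (4.0 * M * M * rp * rp))


def mass_bound(m, M, a):
    """Right-hand side of (69)."""
    rp, _ = kerr_radii(M, a)
    return abs(m) * a / (2.0 * M * rp) * math.sqrt(
        1.0 + 2.0 * M / rp + a * a / (rp * rp))


def integrand(r, m, m0, M, a):
    """Bracket in the first line of (75) for given m and field mass m0."""
    return (abs(m) * (abs(m) + 1)
            + m * m * a * a * coeff_after(r, M, a)
            + m0 * m0 * r * r)
```

Evaluating `coeff_before` and `coeff_after` at radii outside the horizon should give the same number up to rounding, and `integrand` should be non-negative there when `m0` is taken equal to `mass_bound(m, M, a)`, in line with (69) and (75).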

7. Discussion

The reduced (in the angular coordinate $\varphi$) wave equation and Klein–Gordon equation were considered on a Kerr background and in the framework of $C^0$-semigroup theory. Each equation was shown to have a well-posed initial value problem, i.e., to have a unique solution depending continuously on the data. Further, it was proved that the spectrum of the semigroup's generator coincides with the spectrum of an operator polynomial whose coefficients can be read off from the equation. In this way the problem of deciding stability is reduced to a spectral problem. In addition a mathematical basis is provided for mode considerations.$^5$ For the wave equation it was shown that the resolvent of the semigroup's generator and the corresponding Green's functions can be computed using spheroidal functions. It is to be expected that, analogous to the case of a Schwarzschild background, the quasinormal frequencies of the Kerr black hole appear as resonances, i.e., poles of the analytic continuation of this resolvent. Finally, stability of the background with respect to reduced massive perturbations was proved for masses exceeding a given bound (see (69)).

$^5$ To give an example for this claim: suppose we were able to show that the unstable spectrum of $G$ consists of discrete eigenvalues and that the corresponding eigenstates separate in the way assumed by Whiting. Then, via the results of this paper, Whiting's result [31] on the absence of exponentially growing modes would imply the stability of the solutions of the wave equation.


It is interesting to compare the last result to earlier results of Detweiler [4], Damour, Deruelle and Ruffini [3], and Zouros and Eardley [34]. These make the existence of exponentially growing modes for the massive Klein–Gordon equation very plausible. They found approximate unstable modes in the superradiant regime, i.e., with frequencies $\omega$ satisfying $\mathrm{Re}(\omega) < ma/(2Mr_+)$. These modes become stable when this condition is violated. The approximations made in these papers lead to further restrictions. It turns out that assuming both these restrictions and the bound (69) derived here is incompatible with the assumption of superradiance. Hence the stability result here does not contradict the results in these papers, but is complementary instead.$^6$ Moreover it suggests that the negation of an inequality of the form of (69) (or some equivalent form) is the superradiant condition. For this it should be noted that with some effort and along the lines of this paper it may be possible to improve (69), i.e., to decrease the bound. For this the Poincaré–Sobolev inequalities mentioned at the end of Section 3 should be helpful.

Acknowledgements. The author is grateful to B. G. Schmidt for drawing his attention to the problem of defining the quasi-normal frequencies of the Kerr black hole as resonances and to J. L. Friedman, B. G. Schmidt and B. F. Whiting for valuable discussions.

$^6$ The author is very grateful to J. L. Friedman for directing his attention to this fact.

8. Appendix

The following theorem used in Sect. 5 was first proved by Dunkel in [5] (compare also [18, 1, 9]).

Theorem 8. Let $n \in \mathbb{N} \setminus \{0\}$, $a \in \mathbb{R}$, $I := [a, \infty)$ and $I_0 := (a, \infty)$. In addition let $A_0$ be a diagonalizable complex $n \times n$ matrix and $e_1, \dots, e_n$ be a basis of $\mathbb{C}^n$ consisting of eigenvectors of $A_0$. Further, for each $j \in \{1, \dots, n\}$ let $\lambda_j$ be the eigenvalue corresponding to $e_j$ and $P_j$ be the matrix representing the projection of $\mathbb{C}^n$ onto $\mathbb{C}.e_j$ with respect to the canonical basis of $\mathbb{C}^n$. Finally, let $A_1$ be a continuous map from $I$ into the complex $n \times n$ matrices $M(n \times n, \mathbb{C})$ such that $A_{1jk}$ is Lebesgue integrable for each $j, k \in \{1, \dots, n\}$. Then there is a $C^1$ map $R : I_0 \to M(n \times n, \mathbb{C})$ with $\lim_{t \to \infty} R_{jk}(t) = 0$ for each $j, k \in \{1, \dots, n\}$ and such that $u : I_0 \to M(n \times n, \mathbb{C})$ defined by

\[ u(t) := \sum_{j=1}^{n} \exp(\lambda_j t) \cdot (E + R(t)) \cdot P_j \tag{76} \]

for all $t \in I_0$ (where $E$ is the $n \times n$ unit matrix), maps into the invertible $n \times n$ matrices and satisfies

\[ u'(t) = (A_0 + A_1(t)) \cdot u(t) \tag{77} \]

for all $t \in I_0$.

This theorem has the following

Corollary 9. Let $n \in \mathbb{N} \setminus \{0\}$; $a, t_0 \in \mathbb{R}$ with $a < t_0$; $\mu \in \mathbb{N}$; $\alpha_\mu := 1$ for $\mu = 0$ and $\alpha_\mu := \mu$ for $\mu \ne 0$. In addition let $A_0$ be a diagonalizable complex $n \times n$ matrix and $e_1, \dots, e_n$ be a basis of $\mathbb{C}^n$ consisting of eigenvectors of $A_0$. Further, for each $j \in \{1, \dots, n\}$ let $\lambda_j$ be the eigenvalue corresponding to $e_j$ and $P_j$ be the matrix representing the projection of $\mathbb{C}^n$ onto $\mathbb{C}.e_j$ with respect to the canonical basis of $\mathbb{C}^n$. Finally, let $A_1$ be a continuous map from $(a, t_0)$ into the complex $n \times n$ matrices $M(n \times n, \mathbb{C})$ for which there is a number $c \in (a, t_0)$ such that the restriction of $A_{1jk}$ to $[c, t_0)$ is Lebesgue integrable for each $j, k \in \{1, \dots, n\}$. Then there is a $C^1$ map $R : (a, t_0) \to M(n \times n, \mathbb{C})$ with $\lim_{t \to t_0} R_{jk}(t) = 0$ for each $j, k \in \{1, \dots, n\}$ and such that $u : (a, t_0) \to M(n \times n, \mathbb{C})$ defined by

\[ u(t) := \begin{cases} \sum_{j=1}^{n} (t_0 - t)^{-\lambda_j} \cdot (E + R(t)) \cdot P_j & \text{for } \mu = 0, \\ \sum_{j=1}^{n} \exp\!\left( \lambda_j (t_0 - t)^{-\mu} \right) \cdot (E + R(t)) \cdot P_j & \text{for } \mu \ne 0 \end{cases} \tag{78} \]

for all $t \in (a, t_0)$ (where $E$ is the $n \times n$ unit matrix), maps into the invertible $n \times n$ matrices and satisfies

\[ u'(t) = \left( \frac{\alpha_\mu}{(t_0 - t)^{\mu+1}}\, A_0 + A_1(t) \right) \cdot u(t) \tag{79} \]

for each $t \in (a, t_0)$.

References

1. Bellman, R.: A survey of the theory of the boundedness, stability, and asymptotic behaviour of solutions of linear and nonlinear differential and difference equations, NAVEXOS P-596. Washington, DC: Office of Naval Research, 1949
2. Beyer, H.R.: A framework for perturbations and stability of differentially rotating stars. Preprint, submitted to Proc. R. Soc. Lond. A. Online Los Alamos Archive Preprint: http://xxx.lanl.gov/abs/astro-ph/0007342, 2000
3. Damour, T., Deruelle, N. and Ruffini, R.: On quantum resonances in stationary geometries. Lett. Nuovo Cimento 15, 257 (1976)
4. Detweiler, S.L.: Klein–Gordon equation and rotating black holes. Phys. Rev. D 22, 2323–2326 (1980)
5. Dunkel, O.: Regular singular points of a system of homogeneous linear differential equations of the first order. Am. Acad. Arts Sci. Proc. 38, 341–370 (1912–1913)
6. Erdelyi, A. (ed.): Higher Transcendental Functions Volume II. Florida: Robert Krieger, 1981
7. Goldberg, S.: Unbounded Linear Operators. New York: Dover, 1985
8. Hartle, J.B. and Wilkins, D.C.: Analytic properties of the Teukolsky equation. Commun. Math. Phys. 38, 47–63 (1974)
9. Hille, E.: Lectures on ordinary differential equations. Reading: Addison-Wesley, 1969
10. Hille, E. and Phillips, R.S.: Functional Analysis and Semi-Groups. Providence: AMS, 1957
11. Hirzebruch, F. and Scharlau, W.: Einführung in die Funktionalanalysis. Mannheim: BI, 1971
12. Kato, T.: Perturbation Theory for Linear Operators. Berlin: Springer, 1980
13. Kay, B.S.: The double-wedge algebra for quantum fields on Schwarzschild and Minkowski spacetimes. Commun. Math. Phys. 100, 57–81 (1985)
14. Kay, B.S. and Wald, R.M.: Linear stability of Schwarzschild under perturbations which are non-vanishing on the bifurcation 2-sphere. Class. Quantum Grav. 4, 893–898 (1987)
15. Kokkotas, K. and Schmidt, B.G.: Quasi-Normal Modes of Stars and Black Holes. Living Reviews in Relativity, 1999-2, 1999
16. Krivan, W., Laguna, P. and Papadopoulos, P.: Dynamics of scalar fields in the background of rotating black holes. Phys. Rev. D 54, 4728–4734 (1996). For a related online version see: Krivan, W. et al.: Dynamics of Scalar Fields in the Background of Rotating Black Holes. Online Los Alamos Archive Preprint: http://xxx.lanl.gov/abs/gr-qc/9606003, 1996
17. Krivan, W., Laguna, P., Papadopoulos, P. and Andersson, N.: Dynamics of perturbations of rotating black holes. Phys. Rev. D 56, 3395–3404 (1997). For a related online version see: Krivan, W. et al.: Dynamics of perturbations of rotating black holes. Online Los Alamos Archive Preprint: http://xxx.lanl.gov/abs/gr-qc/9702048, 1997
18. Levinson, N.: The asymptotic nature of the solutions of linear systems of differential equations. Duke Math. J. 15, 111–126 (1948)


19. Markus, A.S.: Introduction to the Spectral Theory of Operator Pencils. Providence: AMS, 1988
20. Meixner, J. and Schaefke, F.W.: Mathieusche Funktionen und Sphaeroidfunktionen. Berlin: Springer, 1954
21. Pazy, A.: Semigroups of Linear Operators and Applications to Partial Differential Equations. New York: Springer, 1983
22. Press, W.H. and Teukolsky, S.: Perturbations of a rotating black hole. II. Dynamical stability of the Kerr metric. Astrophys. J. 185, 649–673 (1973)
23. Reed, M. and Simon, B.: Methods of Mathematical Physics Volume I, II, III, IV. New York: Academic, 1975, 1978, 1979, 1980
24. Riesz, F. and Sz-Nagy, B.: Functional Analysis. New York: Ungar, 1955
25. Renardy, M. and Rogers, R.: An introduction to partial differential equations, corr. 2nd print. New York: Springer, 1996
26. Rodman, L.: An Introduction to Operator Polynomials. Basel: Birkhäuser, 1989
27. Teukolsky, S.A.: Perturbations of a rotating black hole. I. Fundamental equations for gravitational, electromagnetic, and neutrino-field perturbations. Astrophys. J. 185, 635–647 (1973)
28. Stewart, J.M.: On the stability of Kerr's space-time. Proc. R. Soc. London, Ser. A 344, 65–79 (1975)
29. Wald, R.M.: Note on the stability of the Schwarzschild metric. J. Math. Phys. 20, 1056–1058 (1979)
30. Weidmann, J.: Lineare Operatoren in Hilberträumen. Stuttgart: Teubner, 1976
31. Whiting, B.F.: Mode stability of the Kerr black hole. J. Math. Phys. 30, 1301–1305 (1989)
32. Yosida, K.: Functional Analysis. Berlin: Springer, 1980
33. Ziemer, W.P.: Weakly differentiable functions. New York: Springer, 1989
34. Zouros, T.J.M. and Eardley, D.M.: Instabilities of massive scalar perturbations of a rotating black hole. Ann. Phys. (N.Y.) 118, 139–155 (1979)

Communicated by H. Nicolai
